Snapshot

August 5, 2009 at 5:42 am 5 comments

| Gabriel |

For a very long time I’ve been in the habit of keeping multiple versions of both manuscripts and scripts. Every day that I work on a file I save a new version as basenameYYMMDD.extension. The reason I do this is to facilitate buyer’s remorse on changes (and conversely, let myself be free to experiment). Although Time Machine does this for me to an extent, I still like to do manual snapshots in part because I’m a creature of habit and in part because Time Machine is more of a restore tool than a “how did I do this last month” kind of tool. Worse, Time Machine deletes old backups (which happens really often if you have any disk image files).

Anyway, in the course of working with LyX I realized that a problem with my approach is that the file always has a new name, which makes it hard to link files to one another. For instance, LyX lets you embed child documents, graphics, etc. within master documents, but this only works if the target files keep stable names.

What I realized was that the simple solution was that instead of having all versions called basenameYYMMDD.extension, where the current version is just the one with the most recent YYMMDD, I could make the current version basename.extension and the snapshots basenameYYMMDD.extension. That way I can have another file point to the file and always have it point to the current version, even as I also have access to the snapshots.

I wrote a shell script that does this. On a Mac you can use the “Run Shell Script” action in Automator, which lets you treat it as a Finder plug-in (right-click) or as an app (drag-and-drop).

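#!/bin/bash
# Copy each file passed as an argument to a date-stamped snapshot beside it.
# '+%Y%m%d' gives a four-digit year (YYYYMMDD); use '+%y%m%d' for YYMMDD.
TIMESTAMP=$(date '+%Y%m%d')
for f in "$@"
do
	# Text after the last dot, e.g. "lyx"
	EXTENSION="${f##*.}"
	# File name without directory or extension; quoting "$f" keeps paths with spaces intact
	FILENAME=$(basename "$f" | sed 's/\(.*\)\..*/\1/')
	DIRPATH=$(dirname "$f")
	cp "$f" "$DIRPATH/$FILENAME$TIMESTAMP.$EXTENSION"
done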

My version assumes that it’s taking arguments (in my case from Automator), but to make it self-contained you can define a variable as a list of paths and have it read off of that variable instead of “$@”. Here’s an example of the syntax for the variable:

X="/Users/rossman/bigdeal.txt /Users/rossman/biggerdeal.lyx /Users/rossman/biggestdeal.do "
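A space-separated string like this splits in the wrong place as soon as a path contains a space, which is common on a Mac. One alternative, offered as a sketch rather than as the method used above, is to load the paths into the script's own arguments with `set --`, so the loop over “$@” stays as it is. The function name `snapshot` is made up for illustration, and the paths are the same placeholders as in the example.

```shell
#!/bin/bash
# Sketch: feed a fixed list of paths to the same loop, without Automator.
# The paths are the placeholders from the example above; substitute your own.

snapshot() {
	# Copy each argument to basenameYYYYMMDD.extension in its own directory.
	timestamp=$(date '+%Y%m%d')
	for f in "$@"
	do
		[ -f "$f" ] || continue   # quietly skip anything that is not a file
		extension="${f##*.}"
		filename=$(basename "$f" ".$extension")
		dirpath=$(dirname "$f")
		cp "$f" "$dirpath/$filename$timestamp.$extension"
	done
}

# Replace the script's arguments with a hard-coded, individually quoted list,
# so a path containing spaces still counts as one file.
set -- \
	"/Users/rossman/bigdeal.txt" \
	"/Users/rossman/biggerdeal.lyx" \
	"/Users/rossman/biggestdeal.do"

snapshot "$@"
```

Because each path is quoted separately, something like "/Users/rossman/my paper.lyx" would be treated as a single file rather than two.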

Two possible modifications you might wish to use are to a) only snapshot files that have been changed recently or b) put the snapshots in an “archive” or “oldversions” subdirectory. To do the latter, change the penultimate line to the following (the archive folder must already exist, or cp will fail):

  cp "$f" "$DIRPATH/archive/$FILENAME$TIMESTAMP.$EXTENSION"
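Both modifications can be combined. The following is only a sketch under a couple of assumptions: “changed recently” is taken to mean modified within the last 24 hours, and the function name `snapshot_recent` is invented for the example. It creates the archive folder if it is missing.

```shell
#!/bin/bash
# Sketch: snapshot only recently modified files, into an "archive" subfolder.
# Assumption: "recently" means modified less than 24 hours ago.

snapshot_recent() {
	timestamp=$(date '+%Y%m%d')
	for f in "$@"
	do
		[ -f "$f" ] || continue   # skip anything that is not a regular file
		# find prints the path only if it was modified less than a day ago
		[ -n "$(find "$f" -mtime -1)" ] || continue
		extension="${f##*.}"
		filename=$(basename "$f" ".$extension")
		dirpath=$(dirname "$f")
		mkdir -p "$dirpath/archive"   # create the folder if it is not there yet
		cp "$f" "$dirpath/archive/$filename$timestamp.$extension"
	done
}

snapshot_recent "$@"
```

To use a different window, change `-mtime -1` to, say, `-mtime -7` for the last week.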

(Thanks to Haynes and Ganbustein in the MacOSXHints forums for some help debugging the code).


5 Comments

  • 1. Alan  |  August 5, 2009 at 9:19 pm

    Nice solution. And though you’re using LyX, the script itself could easily be plugged into other editors like TextMate.

    Did you consider any more full-featured version control systems to handle this?

  • 2. gabrielrossman  |  August 5, 2009 at 11:28 pm

    Perhaps I should have looked into actual version control but just having dated snapshots works pretty well for me, especially since Spotlight lets me find the last version of a now changed/deleted passage very quickly.

    Do you have any suggestions for version control?

  • 3. Gabriel Rossman  |  August 26, 2009 at 6:23 pm

    I saw an interesting review of several version control programs today, linked at /.

    After reading it I think I remain happy with my simpler approach. My hunch is that formal version control would mostly be a good idea for collaborations, especially if the team is reasonably large, the dependencies are complex, and the division of labor is fuzzy.

  • 4. AD  |  May 16, 2013 at 6:23 pm

    Old post, I know. But I have a question. Say you are on your fourth version of your .tex file and your 7th version of your .do file. And then you submit to a journal. Then you start making changes to both sets of files. Can you go back to the fourth version of the .tex file and reproduce the paper you submitted, or would it have the most updated tables? I ask, because this seems to run counter to the replicability ideal for research. Almost like going into the .tex file and changing the linked graph file is what keeps the ties together at the time of the writing?

    • 5. gabrielrossman  |  May 16, 2013 at 10:41 pm

      You’d have to go into the archive folders and run the last time-stamped scripts from when you submitted the paper.

      So basically, it’s the date timestamps that allow you to reconstruct replicability, although the workflow’s default assumption is to use the most current version of everything.

