Which of my cites is missing?

June 1, 2011 at 4:37 am 2 comments

| Gabriel |

I was working on my book (in LyX) and it drove me crazy that a missing citation kept showing up at the top of the bibliography. Tracking down by hand which key in the manuscript produced it was easier said than done, so I ultimately gave up and had the computer do it. These suggestions are provided rather inelegantly as a “log” spread across two languages (shell with a little Perl, then Stata). However, you could pretty easily work them into an argument-passing script written in just one language. Likewise, it should be easy to modify them for use with plain vanilla LaTeX if need be.

First, I pulled all the citation keys out of the book manuscript and all the entry keys out of my BibTeX files.

grep '^key ' book.lyx | sort -u | perl -pe 's/^key "([^"]+)"/$1/' > cites.txt
grep '^@' ~/Documents/latexfiles/ghrcites_manual.bib | perl -pe 's/\@.+\{(.+),/$1/' > bibclean.txt
grep '^@' ~/Documents/latexfiles/ghrcites_zotero.bib | perl -pe 's/\@.+\{(.+),/$1/' >> bibclean.txt
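For plain LaTeX rather than LyX, the keys sit inside \cite-style commands instead of on key lines, so only the extraction step needs to change. The following is a rough sketch, not a tested replacement: the function name extract_tex_cites is made up for illustration, and it assumes that each citation command fits on one line, that optional arguments in square brackets contain no closing bracket, and that your grep supports -o (GNU and BSD grep both do).

```shell
# Hypothetical helper: print the unique citation keys used in a LaTeX file.
# Assumes every \cite, \citep, \citet, ... command fits on a single line.
extract_tex_cites() {
    # Match \cite-like commands, with optional star and [..] arguments.
    grep -oE '\\cite[A-Za-z]*\*?(\[[^]]*\])*\{[^}]*\}' "$1" |
        # Keep only the contents of the final {...} group.
        sed 's/.*{\([^}]*\)}/\1/' |
        # One key per line, since \cite{a,b} holds several keys.
        tr ',' '\n' |
        # Trim surrounding whitespace and drop empty lines.
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |
        grep -v '^$' |
        sort -u
}
```

Something like extract_tex_cites book.tex > cites.txt would then stand in for the first grep line above.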

Then in Stata I merged the two files and looked for BibTeX keys that appear in the manuscript but not in the BibTeX files. [Update: see the comments for a better way to do this.] From that point it was easy to add the missing citations to the BibTeX files (or correct the spelling of the keys in the manuscript).

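* load the BibTeX keys and stash them in a temporary file
insheet using bibclean.txt, clear
tempfile x
save `x'
* load the cited keys and match them against the BibTeX keys
insheet using cites.txt, clear
merge 1:1 v1 using `x'
* _merge==1 means the key is cited in the manuscript but absent from the BibTeX files
list if _merge==1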
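The whole log can also be folded into a single argument-passing shell function, with comm (suggested in the comments) taking the place of the Stata merge. This is a minimal sketch under stated assumptions rather than a drop-in tool: the name missing_cites is hypothetical, mktemp is assumed to be available, a LyX key line is assumed to be able to hold several comma-separated keys, and each BibTeX entry is assumed to start at the beginning of a line with its key followed by a comma on that same line.

```shell
# Hypothetical helper: list keys cited in a LyX file that no BibTeX file defines.
# Usage: missing_cites book.lyx refs1.bib [refs2.bib ...]
missing_cites() {
    lyx_file=$1
    shift
    cites=$(mktemp)
    bibkeys=$(mktemp)

    # Keys cited in the manuscript, split on commas, trimmed, sorted, deduplicated.
    grep '^key ' "$lyx_file" |
        sed 's/^key "\([^"]*\)".*/\1/' |
        tr ',' '\n' |
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |
        grep -v '^$' |
        sort -u > "$cites"

    # Keys defined in the .bib files: the text between "@type{" and the first comma.
    cat "$@" |
        sed -n 's/^@[A-Za-z]*{[[:space:]]*\([^,[:space:]]*\),.*/\1/p' |
        sort -u > "$bibkeys"

    # comm -23 prints the lines that appear only in the first sorted file.
    comm -23 "$cites" "$bibkeys"
    rm -f "$cites" "$bibkeys"
}
```

With the files from this post, the call would be missing_cites book.lyx ~/Documents/latexfiles/ghrcites_manual.bib ~/Documents/latexfiles/ghrcites_zotero.bib. Both key lists go through sort -u because comm expects sorted input.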



  • 1. Scott Golder  |  June 1, 2011 at 11:41 am

    Do you know the Unix “comm” command? It’ll do the final step more easily. Given two sorted text files, it tells you which lines are missing. The command “comm -23 cites.txt bibclean.txt” will list only those lines that exist in file 1 (your manuscript) but not in file 2 (your reference managers).

    • 2. gabrielrossman  |  June 1, 2011 at 12:56 pm

      didn’t know that, thanks. this is useful as having it entirely in Unix would make it easier to script and more portable.
