Posts tagged ‘text editor’

Executing do-files from text editors

| Gabriel |

Stata now defaults to opening a do-file in the integrated do-file editor rather than just running it. The integrated do-file editor is now pretty good, but I’m a creature of habit and I prefer to use an external text editor (usually TextMate) and then pipe the code to Stata. The current default behavior makes this somewhat inconvenient.

Fortunately, you can change this pretty easily in the preferences. Open Stata’s preferences, go to the “Do-File” tab and then the “advanced” sub-tab. Now uncheck the box that says “Edit do-files opened from the Finder in Do-file Editor.” Even though it says “from the Finder” this also applies to do-files launched pretty much any way you can think of: after-market file managers, text editors, etc.

Alternatively, you could rewrite your text editor’s Stata support to use the Stata console, but that’s probably overkill.

September 28, 2011 at 5:18 am 3 comments

Misc Links

| Gabriel |

  • Useful detailed overview of Lion. The user interface stuff doesn’t interest me nearly as much as the tight integration of version control and “resume.” Also, it’s worth checking whether your apps are compatible. (Stata and Lyx are supposed to work fine. TextMate is supposed to run OK with some minor bugs. No word on R. Fink doesn’t work yet.) It sounds good but I’m once again sitting it out for a few months until the compatibility bugs get worked out. Also, as with Snow Leopard, many of the features won’t really do anything until developers implement them in their applications.
  • I absolutely loved the NPR Planet Money story on the making of Rihanna’s “Man Down.” (Not so fond of the song itself, which reminds me of Bing Crosby and David Bowie singing “Little Drummer Boy” in matching cardigans). If you have any interest at all in production of culture read the blog post and listen to the long form podcast (the ATC version linked from the blog post is the short version).
  • Good explanation of e, which comes up surprisingly often in sociology (logit regression, diffusion models, etc.). I like this a lot as in my own pedagogy I really try to emphasize the intuitive meaning of mathematical concepts rather than just the plug and chug formulae on the one hand or the proofs on the other.
  • People are using “bimbots” to scrape Facebook. And to think that I have ethical misgivings about forging a user-agent string so wget looks like Firefox.

July 20, 2011 at 3:46 pm

Escaped quotes and syntax highlighting

| Gabriel |

Quotes usually delimit where a string with embedded spaces begins and ends, but sometimes you want the quote to be literal and this requires escaping it. To recycle an example I’ve used before, suppose you wanted to display:

Beavis said "Fire! Fire!"

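To get Stata to display this, you would escape the quotes by wrapping the whole string in compound double quotes, that is, a left apostrophe before the opening quote and a right apostrophe after the closing quote (just like calling a local), so the command would be: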

disp `"Beavis said "Fire! Fire!""'

This is a trivial example, but a more realistic application is you might want to put some things that involve quotes inside a local and since the content of the local is itself delimited by quotes you’ll need to escape them.

OK, easy enough, but the problem is that most external text editors don’t appreciate this nuance of Stata syntax and end up showing the rest of the document as quoted text, effectively making the syntax highlighting useless. (Stata’s internal editor doesn’t suffer this problem, but I’m in the habit of using TextMate since prior to Stata 11 the editor didn’t have highlighting and it still doesn’t do code-folding). The solution is to let two syntax-parsing wrongs make a right by putting a lone, unmatched quote mark in a comment, which Stata will ignore but which the text editor will parse as closing the previous hanging quote. It works like this:

disp `"Beavis said "Fire! Fire!""'
* " this line exists only to let the text editor's parser know that everything is back to normal
disp "see, it works. this quoted text should show up as quoted whereas the word 'disp' appears as a command"
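The more realistic case mentioned above, quotes stored inside a local, might look something like the following minimal sketch. The macro name myquote is made up for illustration and is not from the original post.

```stata
* Minimal sketch; the macro name myquote is hypothetical.
* Compound double quotes let the local hold literal double quotes.
local myquote `"Beavis said "Fire! Fire!""'

* Wrap the macro reference in compound quotes again so the
* embedded quotes survive when the local is expanded.
display `"`myquote'"'
```

This should display the same line as the one-line version above. The design point is that the compound quotes are needed twice: once when the local is defined and again wherever it is expanded.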

February 7, 2011 at 5:05 am 5 comments

Lyx and UltraEdit

| Gabriel |

I’ve been using the beta of Lyx 2.0 for a few weeks now. The first beta was unstable but the second beta has yet to crash or otherwise give me problems so I’ve gone ahead and committed to the new file format (which is still a dialect of TeX, just a slightly different one). I generally find it to be a big improvement in all sorts of subtle ways, particularly how it resolves subtle dependency issues. For instance, I could never get Lyx 1.6.x to recognize the Aspell dictionary on my Mac so I’d have to run an Ubuntu VM to check my spelling. Lyx 2.0 automatically reads the Mac OS X dictionary. There’s also an amazing “compare documents” feature that lets you diff any two files but instead of standard diff output it gives you something that looks a lot like “track changes” in Word. The full list of features is here.

You can download the beta here. Note that this is an ftp not web link and some browsers don’t do FTP so either use an FTP compatible browser like Chrome, an FTP enabled file manager, or a dedicated FTP client. The “dmg” link is for Macs and the “exe” link is for Windows. Note that the Lyx 2.0.x file format is not backwards compatible with Lyx 1.6.x software although Lyx 2.0 beta can export to the old format.

The other software I’ve been playing with is UltraEdit for Mac, for which I was a beta tester. Overall it strikes me as a very good editor and they’ve made admirable efforts to make it Mac native but it still looks like Windows software because it has one big window with internal demarcation rather than lots of floating palettes, etc. Anyway I’m going to stick with TextMate (which has better language support for the languages I care about) and TextWrangler (which I find more intuitive for batch cleaning files) but I think people transitioning from Windows to Mac might be well served by UltraEdit, especially if they used it (or similar software like Notepad++) on Windows.

December 21, 2010 at 4:13 am

Editra

| Gabriel |

I hadn’t been paying much attention to Editra since my last comparison shopping of text editors, but recently the project has made some really big strides and is shaping up to be a great cross-platform text editor. Most notably for me, it has both syntax highlighting and code-folding support for Stata (in addition to R, perl, LaTeX, bash, html, and plenty of languages I don’t use). Furthermore, it now has a plug-in framework for language syntax so adding support for additional languages is easy if you have a Scintilla file. (The old method was to recompile from source; yes, really). There’s also a great “Generate” feature which will let you preserve your syntax highlighting in html, rtf, or tex, though in my experience the tex filter is buggy. (Note that there is a similar “copy as RTF” plug-in for TextMate). Finally, the Mac version comes as a binary and actually looks like a Quartz-native Mac program, with no Fink / X11 hassle.

Editra is still considered an alpha release and I remain happy with TextMate for my own use, but if you need cross-platform and/or free, I’d recommend considering it. Note that these features could be especially valuable for teaching stats, since students have little money and use a variety of platforms.

Also, another free cross-platform editor worth checking out is Komodo. It has code-folding and syntax highlighting but as far as I can tell, the Stata syntax only supports highlighting (no folding) and there’s no R support at all, though it has a well-documented plug-in system so it should be feasible for someone to write or port an R syntax file to it.

March 3, 2010 at 4:54 am 2 comments

R and TextMate

| Gabriel |

Now that I’ve started dabbling in R, I figured I needed to get my text editor to highlight the Klingon-esque syntax. TextWrangler and Smultron already support R, but getting it for TextMate requires the Terminal:

# keep the tilde outside the quotes, otherwise the shell will not expand it
cd ~/"Library/Application Support/TextMate/Bundles"
svn co http://svn.textmate.org/trunk/Bundles/R.tmbundle/

Note that 64-bit R is buggy so if you have trouble piping scripts from TextMate to Rdaemon (i.e., the command line R running in the background), you can use the bundle editor to redirect it to “R32” instead of just “R” which will force it to use the slightly slower but more reliable 32-bit R. Or if that’s too hard, just stick to piping to R.app instead of Rdaemon.

Also, as long as you’re playing with the TextMate library, you might as well install “GetBundles,” a GUI frontend for browsing the TextMate bundle server:

svn co http://svn.textmate.org/trunk/Review/Bundles/GetBundles.tmbundle/

Note that GetBundles (with “s”) supersedes the now defunct GetBundle (without “s”) that you might see mentioned if you google things like “TextMate Bundle” or “TextMate R syntax”.

December 1, 2009 at 4:54 am 6 comments

TextWrangler 3

| Gabriel |

Version 3.0 of TextWrangler came out yesterday. TextWrangler is an excellent text editor, especially for data cleaning, although I prefer TextMate for writing code.

August 25, 2009 at 1:27 pm

Merging Pajek vertices into Stata

| Gabriel |

Sometimes I use Pajek (or something that behaves similarly, like Mathematica or Network Workbench) to generate a variable which I then want to merge back into Stata. The problem is that the output requires a little cleaning: it’s not as if the first column is your “id” variable as it exists in Stata and the second column is the metric, so that you can just merge on “id.” Instead these programs tend to encode your Stata id variable, which means you have to merge twice: first to associate the Stata id variable with the Pajek id variable, and second to associate the new data with your main dataset.

So the first step is to create a merge file to associate the encoded key with the Stata id variable. You get this from the Pajek “.net” file (ie, the data file). The first part of this file is the encoding of the nodes, the rest (which you don’t care about for these purposes) is the connections between these nodes. In other words you want to go from this:

*Vertices 3
1 "tom"
2 "dick"
3 "harry"
*Edges
1 2
2 3

to this:

pajek_id	stata_id
1	tom
2	dick
3	harry

The thing that makes this a pain is that “.net” files are usually really big so if you try to just select the “vertices” part of the file you may be holding down the mouse button for a really long time. My solution is to open the file in a text editor (I prefer TextWrangler for this) and put the cursor at the end of what I want. I then enter the regular expression search pattern “^.+$\r” (or “^.+$\n”) to be replaced with nothing, which has the effect of erasing everything after the cursor. Note that the search should start at the cursor and not wrap, so don’t check “start at top” or “wrap around.” You’ll then be left with just the labels, the edge list having been deleted. Another way to do it is to search the whole file and tell it to delete lines that do not include quotation marks.

Having eliminated the edge list and kept only the encoding key, at this point you still need to get the vertex labels into a nice tab-delimited format, which is easily accomplished with this search pattern and replacement:

 "(.+)"$
\t\1

Note the leading space in the search regular expression. Also note that if the labels have embedded spaces there should be quotes around \1 in the replacement regular expression.

Manually name the first column “pajek_id” and the second column “stata_id” (or better yet, whatever you call your id variable in Stata) and save the file as something like “pajekmerge.txt”. Now go to Stata and use “insheet,” “sort,” and “merge” to add the “pajek_id” variable into Stata. You’re now ready to import the foreign data. Use “insheet” to get it into Stata. Some of these programs include an id variable; if so, name it “pajek_id.” Others (eg Mathematica) don’t and just rely on ordering. If so, enter the command “gen pajek_id=_n” so that the row order becomes the merge key. You’re now ready to merge the foreign data into Stata.
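As a rough sketch of the two merges just described, something like the following should work. Every file name and variable name other than “pajekmerge.txt,” “pajek_id,” and “stata_id” is invented for illustration, the metric file is assumed to be two tab-delimited columns with no header row, and the “merge m:1” syntax assumes Stata 11 or later (older versions would need “sort” and the old-style “merge”).

```stata
* Sketch only: main.dta, pajek_metric.txt, and centrality are hypothetical names.
* Assumes Stata 11 or later for the merge m:1 syntax.

* 1. Crosswalk built from the cleaned .net file: pajek_id and stata_id
insheet using "pajekmerge.txt", tab names clear
save "pajekmerge.dta", replace

* 2. Metric exported from Pajek; assumed to be two unnamed tab-delimited columns
insheet using "pajek_metric.txt", tab nonames clear
rename v1 pajek_id
rename v2 centrality
* If the program gives no id column and relies on row order, instead use:
* gen pajek_id = _n
save "pajek_metric.dta", replace

* 3. Attach pajek_id to the main data, then attach the metric.
* stata_id must have the same storage type (string or numeric) in both files.
use "main.dta", clear
merge m:1 stata_id using "pajekmerge.dta", generate(merge_key)
drop if merge_key == 2
merge m:1 pajek_id using "pajek_metric.dta", generate(merge_metric)
drop if merge_metric == 2
```

Using m:1 rather than 1:1 is a judgment call: it tolerates main datasets where the id repeats (a panel, say) and cases whose pajek_id is missing because they never appeared in the network file.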

This is obviously a tricky process and there are a lot of stupid ways it could go wrong. Therefore it is absolutely imperative that you spot-check the results. There are usually some cases where you intuitively know about what the new metric should be. Likewise, you may have another variable native to your Stata dataset that should have a reasonably high (positive or negative) correlation with the new metric imported from Pajek. Check this correlation as when things should be correlated but ain’t it often means a merge error.

Note that it’s not necessarily a problem if some cases in your Stata dataset don’t have corresponding entries in your Pajek output. This is because isolates are often dropped from your Pajek data. However you should know who these isolates are and be able to spot-check that the right people are missing. If you’re doing an interlocking board study and you see that an investment bank in your Stata data doesn’t appear in your Pajek data then you probably have a merge error.
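The spot-checks described above could be run along these lines. This is only a sketch: it assumes the merged data are in memory, that the final merge left Stata’s default “_merge” variable, and that “centrality” (the imported metric) and “board_size” (a native variable that ought to track it) are stand-in names.

```stata
* Sketch only: centrality and board_size are hypothetical variable names.
* Assumes the merged dataset is in memory with the default _merge variable.

* How many cases matched, and how many exist only in the Stata data?
tabulate _merge

* Eyeball the unmatched cases: are they plausibly isolates?
list stata_id if _merge == 1

* A native variable that should correlate with the new metric;
* a correlation near zero here is a red flag for a merge error.
correlate centrality board_size
```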

July 22, 2009 at 5:02 am 2 comments

Collaborative code

| Gabriel |

A friend recently told me about a collaborative text editor. I’m perfectly happy having an entirely local text editor because I tend to do most of my coding by myself (I’m a loner, a rebel). Although I co-author a lot, there tends to be a sufficiently clear division of labor that I can just send data files and output to my co-authors and vice versa. For instance on one project I did all the cleaning and my co-author did the analysis. Nonetheless, not everyone works like this so I figured I’d pass it along.

The program my friend told me about is Etherpad. This is a totally cloud solution and is very quick to set up. Unfortunately it’s really bare bones; for instance, it highlights by author (which is good) but doesn’t highlight syntax for anything but Java.

There are also local clients with remote sync. A popular solution for collaborative coding on the Mac is SubEthaEdit. On the plus side there is Stata syntax. On the downside both authors need to have Macs and buy the software (30 euros).

A cross-platform, free, and open-source solution is Gobby. Although there is no Stata syntax file it uses a well documented highlighting standard so it should be feasible to write one. In principle Gobby works on the Mac but there’s no binary so good luck getting it to compile. If you’re a Mac person who can’t get Fink to work my suggestion is to use the Linux or Windows version through virtualization.

May 19, 2009 at 5:30 am

Choosing a (Mac) text editor with Stata

| Gabriel |

I love TextWrangler (free) but I was a little frustrated that it doesn’t allow code folding. (I should note that it does support everything else on my text editor wish list, most notably regular expressions, syntax highlighting, and pushing). I’d seen code folding in action with html editors like Bluefish and this struck me as a great feature. If you’re not familiar with it, code folding is when you hide some block of code, usually a subroutine or loop. TextWrangler’s big brother, BBEdit ($49 educational, $125 commercial) does offer code folding but it’s only really useful if the syntax files are written to make it work because the program has to be able to recognize what a loop looks like. Unfortunately BBEdit doesn’t come with Stata syntax files and the excellent TextWrangler Stata language file written by dataninja doesn’t support BBEdit-only features like code folding. I looked to see if I could figure out how to write code folding syntax into dataninja’s language module but I couldn’t find any documentation about code folding in the developer kit and in any case I’m not that talented. Aaargh!

In desperation I’ve considered using another text editor, even though I really like TextWrangler. Apparently Kate (free) works pretty well with Stata but there’s no Mac version available. (In theory I could recompile it using Fink but that never works for me). Likewise Notepad++ (free) has an excellent Stata syntax file and I highly recommend it to people who use Windows, but there’s no Mac version so I’d have to run it through Crossover/Wine and again that’s a hassle (the key bindings are different and you lose access to the native file browser and AppleScript integration). UltraEdit ($49) also has good Stata support and apparently it will be ported to Mac/Linux, but it’s not going to be out for a few months.

Editra (free) is a very well-featured and cross-platform editor, but there’s no Stata syntax file yet, nor can I figure out how to write one. One minor limitation I’ve noticed is that Editra can’t handle extremely long rows, but I only ever used extremely long rows for file list globals and there’s a better way to do that. A nice feature is that the language support is in the app package (as compared to “~/Library/Application Support”) which makes it easier to run off a key. Likewise Smultron keeps the syntax in the app package and is well-suited to run off a key. It has excellent Stata highlighting but no code folding. Smultron is the only editor I’ve seen that comes with the Stata syntax file included so it might be a good choice for beginners who don’t want to fiddle with the language preferences, libraries, and that sort of thing to install a user-written file.

Currently the best option is looking like TextMate ($45 educational, $53 commercial). Timothy Beatty at York University has put together a bundle that integrates it beautifully with Stata (note that the bundle assumes you have MP and requires some light editing for some of the features to work with other versions). Something I didn’t expect to like as much as I did is that every open file has its own window (rather than a tab drawer like TW) and this makes it much easier to compare two similar files, though it would get unwieldy with dozens of open files. On the other hand I still prefer certain features of TextWrangler. For instance, it’s much easier to execute a multi-file find/replace in TextWrangler than it is in TextMate (which requires you to first set up a “project” then apply the batch to the project). Both the tab thing and the batch thing have something in common, which is that TextWrangler is better suited for cleaning multiple data files, something I do a lot of. However for coding it’s looking like TextMate. I’ve been using it for about a week while working with a very complex file and so far I’ve been very happy with its code folding, (limited) syntax completion, (excellent) syntax highlighting, etc. Some of the other editors I’ve mentioned could be this good in principle (and already are for some languages), but they would need the as-of-yet unwritten syntax files to do so for Stata.

(btw, here are the definitive thoughts on using text editors with Stata for various platforms).

April 15, 2009 at 6:30 am 2 comments

