Contour graphs in Stata
| Gabriel |
Stata has a lot of graphing capabilities but it can’t do contour maps (and can only do monochrome surface maps through an ado file). Contour maps are kind of like a scatterplot except that they have three dimensions, with color-coding standing in for the z-axis. They’re really useful for graphically illustrating complex nonlinear functions (which is why you see them a lot in hard science journals and really hardcore techy network stuff). Usually x and y are parameters and z is some metric that’s calculated by feeding x and y into a nonlinear equation or simulation. Surface maps are similar but they show the z-axis literally instead of through color-coding. Grusky the Younger has used these to show ginormous occupational reproduction xtabs (x is dad’s job, y is son’s job, z is the frequency of that particular combo).
Anyway, these graphs are really useful for certain things but they don’t work in Stata (or Numbers or OpenOffice). On the other hand Excel, R, and Gnuplot have all been able to do this forever. (Gnumeric can do contour but not surface.) This kind of sucks because I work in Stata and it’s a hassle to either a) export a table to Gnumeric or Excel or b) script Stata to push the graphing to R or Gnuplot. Since I need to do these graphs a lot for exploratory purposes I want to be able to do a quick and dirty draft directly within Stata. (I don’t mind using the other software for publication quality stuff, but as every quant knows you do the exploratory stuff a thousand times before you’re ready to set it as publication quality.) There’s a pretty good ado file for surface graphs (just type “findit surface”) but I find color-coding much easier to read than 3-D, so I want contour graphs and I want them in Stata.
So I wrote this code to create draft contour graphs. The command syntax is just “crudecontour x y z” where x and y are the axes and z is the color-coding. The program automatically breaks z into quartiles and shows it color-coded from blue (lowest quartile of z) through green and yellow to red (highest quartile of z).
Note that this little program revels in two distinct aspects of mediocrity. First, it expects the dataset to have exactly one cell for each combination of x and y. If there’s a missing cell it just plots it as white space (unlike the good packages, which will impute). Second, it produces graphs that look like a video game from 1979. On the other hand, it’s native (have you tried to install Gnuplot on a Mac?) and it took five minutes to code. It’s good enough for exploratory work but you definitely want to use something else for the final version. I still haven’t given up on scripting Stata to push it to R or Gnuplot because I’d really rather batch this than do it in a GUI spreadsheet.
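If you want to check the one-cell-per-combination assumption before graphing, something like the following sketch should do it. The toy 3 x 3 grid and the variable names x, y, and z are made up for illustration; isid and fillin are built-in Stata commands.

```stata
* Hypothetical toy data: a 3 x 3 grid with the (2, 2) cell deliberately dropped
clear
set obs 9
generate x = mod(_n - 1, 3) + 1
generate y = floor((_n - 1) / 3) + 1
generate z = x * y
drop if x == 2 & y == 2

* Stops with an error if any (x, y) combination appears more than once
isid x y

* fillin adds a row for every absent (x, y) combination and flags it with _fillin == 1
fillin x y
list x y if _fillin == 1
count if _fillin == 1
```

The rows flagged by _fillin have missing z, so they are the cells that will show up as white space.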
capture program drop crudecontour
program define crudecontour
    set more off
    local x `1'
    local y `2'
    local z `3' /*color-coding variable*/
    /* quartile cutoffs of z */
    quietly sum `z', detail
    local z25=`r(p25)'
    local z50=`r(p50)'
    local z75=`r(p75)'
    /* scatter takes the vertical-axis variable first, so y goes before x */
    /* missing z is excluded from the top bin because Stata sorts missing above every number */
    twoway ///
        (scatter `y' `x' if `z'<`z25', mcolor(blue) msize(huge) msymbol(square)) ///
        (scatter `y' `x' if `z'>=`z25' & `z'<`z50', mcolor(green) msize(huge) msymbol(square)) ///
        (scatter `y' `x' if `z'>=`z50' & `z'<`z75', mcolor(yellow) msize(huge) msymbol(square)) ///
        (scatter `y' `x' if `z'>=`z75' & !missing(`z'), mcolor(red) msize(huge) msymbol(square)) ///
        , legend(order(1 "1Q `z'" 2 "2Q `z'" 3 "3Q `z'" 4 "4Q `z'"))
end
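To try it out, here is a minimal sketch that builds a complete grid and graphs it. The 20 x 20 grid and the sine/cosine surface are arbitrary choices for illustration, not anything from my real data, and the last line assumes crudecontour has already been defined in the current session.

```stata
* Hypothetical demo data: a 20 x 20 grid with exactly one row per (x, y) cell
clear
set obs 400
* x cycles through 1..20 within each value of y
generate x = mod(_n - 1, 20) + 1
* y steps from 1 to 20, once per block of 20 rows
generate y = floor((_n - 1) / 20) + 1
* arbitrary nonlinear function of x and y
generate z = sin(x / 3) * cos(y / 4)

* Draw the draft contour graph (requires crudecontour to be defined first)
crudecontour x y z
```

With a coarser or finer grid you may need to change msize() inside the program so the squares tile without gaps or overlap.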