Archive for October, 2011

Chain of Litigation

| Gabriel |

I was intrigued by the FT infographic on cell phone patent suits and decided to reformat it with a Fruchterman-Reingold (F-R) layout to get the big picture. A few things leap out. First, there is some pretty serious reciprocity (i.e., counter-suits) going on, especially with Apple. On the other hand, Microsoft seems to be pretty good at attacking people who aren’t in a position to fight back (*cough*trolls*cough*). Second, Google is at the periphery of the network, which is pretty strange since many of these suits are actually about Android. This highlights the litigation strategy of picking off the weak members of the Android herd rather than taking the fight directly to Google itself. Furthermore, it suggests a data issue: the network omits positive ties between firms in the form of alliances (especially the Open Handset Alliance), and reading the graph without these positive ties is misleading.

Anyway, here’s the graph. Click on it for a scalable PDF. Click here for the data in “.net” format. Code to produce the graph is below the fold.

October 18, 2011 at 2:30 pm

USC Annenberg Talk on Climbing the Charts

| Gabriel |

USC Annenberg posted video of my talk in which I discuss the genre chapter of Climbing the Charts. In the chapter/talk I discuss how genre conventions structure diffusion, using as examples crossover between radio formats and the institutionalization of reggaetón with the growth of the “hurban” format.

October 14, 2011 at 12:27 pm 4 comments

Importnew.ado (requires R)

| Gabriel |

After hearing from two friends in a single day who are still on Stata 10 that they were having trouble opening Stata 12 .dta files, I rewrote my importspss.ado script to translate Stata files into an older format, by default Stata 9.

I’ve tested this with Stata 12 and in theory it should work with older versions, but please post positive or negative results in the comments. Remember that you need to have R installed. Anyway, I would recommend handling the backwards compatibility issue on the sender’s side with the native “saveold” command, but this should work in a pinch if for some reason you can’t impose on the sender to fix it and you need to fix it on the recipient’s end. Be especially careful if the dataset includes formats that Stata’s been updating a lot lately (e.g., the date formats).

The syntax is just:

importnew foo.dta

Here’s the code:

*importnew.ado
*by GHR 10/7/2011
*this script uses R to make new Stata files backwards compatible
* that is, use it when your collaborator forgot to use "saveold"

*use great caution if you are using data formats introduced in recent versions
* eg, %tb

*DEPENDENCY: R and library(foreign)
*if R exists but is not in PATH, change "R" in the "shell R --vanilla" line (line 29) to the full path of the R executable

capture program drop importnew
program define importnew
	set more off
	local future `1'
	local version=9  /* version number for your copy of Stata */ 
	local obsolete=round(runiform()*1000)
	local sourcefile=round(runiform()*1000)
	capture file close rsource
	file open rsource using `sourcefile'.R, write text replace
	file write rsource "library(foreign)" _n
	file write rsource `"directory <- "`c(pwd)'" "' _n
	file write rsource `"future <- "`future'" "' _n
	file write rsource `"obsolete <- paste("`obsolete'",".dta",sep="") "' _n
	file write rsource "setwd(directory)" _n
	file write rsource `"data <- read.dta(future, convert.factors=TRUE, missing.type=FALSE)"' _n
	file write rsource `"write.dta(data, file=obsolete, version=`version')"' _n
	file close rsource
	shell R --vanilla <`sourcefile'.R
	erase `sourcefile'.R
	use `obsolete'.dta, clear
	erase `obsolete'.dta
end
*have a nice day
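
As a usage sketch (not part of the script itself): the program ends by loading the converted data into memory and erasing its temporary files, so the converted copy only persists if you save it yourself. The file names foo_v9.dta and foo_old.dta below are hypothetical. The second half shows the sender-side "saveold" route recommended above; which older format saveold writes depends on the sender's Stata release, so check its help file.

```stata
* recipient's side: convert with R, then keep a copy you can reopen later
importnew foo.dta
* check that variables and labels survived the round trip
describe
* hypothetical name for the converted copy
save foo_v9.dta, replace

* sender's side (preferred): write an older format directly from the newer Stata
use foo.dta, clear
* hypothetical name for the backwards-compatible copy
saveold foo_old.dta, replace
```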

October 10, 2011 at 11:15 am

So Long Arial

| Gabriel |

Now that I’ve been looking into setting my graphs in a serif font instead of the default Arial, I find that changing fonts in Stata graphs is surprisingly difficult. It’s not that it’s intrinsically difficult, just confusing to learn, because Stata’s handling of fonts breaks with the general Stata convention of specifying options as part of a command. Instead it is closer to the Gnuplot style of changing device preferences and then executing a command that targets the device. One implication is that there’s no option to change the graph font in the graph GUI (which is how I usually learn new bits of command-line syntax).

Another issue is that graphs aren’t WYSIWYG. Rather, the graph displayed on screen can look different from the graph saved to disk, and the saved graph can in turn differ depending on the file format you use. To avoid confusion I just set everything at once, like this:

local graphfont "Palatino"
graph set eps fontface `graphfont'
graph set eps fontfaceserif `graphfont'
graph set eps  /*echo back preferences*/

graph set window fontface `graphfont'
graph set window fontfaceserif `graphfont'
graph set window /*echo back preferences*/

One of the oddities is that there is no set of PDF options. Rather (at least on a Mac) you control the PDF device as part of the display device (“window”). My understanding is that Stata for Mac relies on the low level OS X PDF support for creating PDFs, and this would explain why it considers PDF to be part of the display device rather than one of the file type devices (as well as why it won’t make PDFs if you suppress screen rendering, why Stata for Mac got PDF support earlier than the other platforms, and why the PDFs looked like this until they fixed it in 11.2 and 12). Note that this means that your PDF will use the EPS fonts if you use my graphexportpdf.ado script and the display fonts if you use the base “graph export foo.pdf” syntax.

I haven’t checked, but I wouldn’t be surprised if the PDF preferences in Stata for Windows are controlled by the EPS preferences rather than the display preferences. In any case, I recommend just setting all devices the same so it won’t matter.
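
To check what a given setup actually produces, one option is to apply the settings, draw a graph from a dataset that ships with Stata, and compare the exported files with the on-screen version. This is only a minimal sketch: it assumes Palatino is installed, the output file names are hypothetical, and the PDF export needs a Stata build with PDF support.

```stata
* assumes the Palatino font is installed on this machine
local graphfont "Palatino"
graph set eps fontface `graphfont'
graph set window fontface `graphfont'

* auto.dta ships with Stata
sysuse auto, clear
scatter mpg weight, title("Mileage by weight")

* hypothetical file names; compare both files with the on-screen graph
graph export fonttest.eps, replace
graph export fonttest.pdf, replace
```

Run this from a do-file in one go, since the local macro is gone once the run ends.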

(Thanks to Eric Booth, whose StataList post helped me figure this out.)

October 5, 2011 at 4:00 am 8 comments

Adding elements to graphs as a slideshow

| Gabriel |

One of the tricks to a successful presentation is to limit what your audience sees so they don’t get ahead of you and also to preserve a general sense of timing and flow. This helps keep the audience’s attention and also is good for focusing expectations in such a way that the next bit is counter-intuitive and therefore interesting. Nothing is so boring as sitting in a talk and seeing ten bullet points and realizing that the speaker is only on bullet number three.

Similarly if you’re using graphs in a talk (which you should as much as possible since they read better than tables), you may only want to reveal part of a graph as you talk about it, then reveal the next bit when you’re ready. The most obvious way to do this is to just crop the graph or cover it with boxes that match the background or something. Unfortunately that’s ugly and clunky and doesn’t work if the graph elements are tightly commingled. Another way to do it is to generate two graphs, one of which has the elements and the other of which doesn’t. The problem with this is that the graphs don’t match up properly. For instance, if you have a line graph and you keep adding lines to it, the legend will first appear and then grow larger, crowding out the graph itself.

Ideally, what you want is a set of graphs that are completely identical except some elements are missing in one version which are added in the other version. You can then line the graphs up, talk about the first set of elements, and then do a smooth transition to the version with the full set of elements. Here’s an example from the talk I gave yesterday. In order to explain crossover I first show the song’s native formats then dissolve to also show the crossover formats.

Here’s how I did it. The basic trick is that Stata can create transparent graph elements by setting the color to “none”. You do the exact same graph multiple times; you just set colors to be transparent when you want to conceal elements. That is, the first twoway command (lines 10–14) is identical to the second (lines 17–21) except that lines 12 and 13 set the line color to “none” instead of colors from Stata’s standard s2color scheme.

use final_f, clear
keep if artist=="SARA BAREILLES"
drop if format=="All" | format=="Other"
sum date
local maxdate=`r(max)'
local mindate=`r(min)'
local interval=(`maxdate'-`mindate')/10
local interval=round(`interval',7)

twoway (line Nt_inc_p date if format=="AAA_Rock", lwidth(thick) lcolor(navy)) /*
  */ (line Nt_inc_p date if format=="Hot_AC", lwidth(thick) lcolor(maroon)) /*
  */ (line Nt_inc_p date if format=="Top_40", lwidth(thick) lcolor(none)) /*
  */ (line Nt_inc_p date if format=="Mainstream_AC", lwidth(thick) lcolor(none)) /*
  */ , xtitle("") xmtick(`mindate'(7)`maxdate') xlabel(`mindate'(`interval')`maxdate', labsize(vsmall) angle(forty_five) format(%tdMon_dd,_CCYY)) legend(order (1 "AAA Rock" 2 "Hot AC" 3 "Top 40" 4 "Mainstream AC"))  graphregion(fcolor(white))
graph export $images/sarabareilles_lovesong_1.pdf, replace

twoway (line Nt_inc_p date if format=="AAA_Rock", lwidth(thick) lcolor(navy)) /*
  */ (line Nt_inc_p date if format=="Hot_AC", lwidth(thick) lcolor(maroon)) /*
  */ (line Nt_inc_p date if format=="Top_40", lwidth(thick) lcolor(dkorange)) /*
  */ (line Nt_inc_p date if format=="Mainstream_AC", lwidth(thick) lcolor(forest_green)) /*
*/ , xtitle("") xmtick(`mindate'(7)`maxdate') xlabel(`mindate'(`interval')`maxdate', labsize(vsmall) angle(forty_five) format(%tdMon_dd,_CCYY)) legend(order (1 "AAA Rock" 2 "Hot AC" 3 "Top 40" 4 "Mainstream AC")) graphregion(fcolor(white))
graph export $images/sarabareilles_lovesong_2.pdf, replace 
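
The same trick can be tried on a dataset that ships with Stata. The sketch below is not from the talk: it uses uslifeexp.dta, and the output file names are hypothetical. Both commands draw both lines, so the axes and legend should match across the two files, and only the color of the second line changes.

```stata
* uslifeexp.dta ships with Stata
sysuse uslifeexp, clear

* slide 1: the female series is drawn with no color, so only the male line shows
twoway (line le_male year, lwidth(thick) lcolor(navy)) /*
  */ (line le_female year, lwidth(thick) lcolor(none)) /*
  */ , legend(order(1 "Male" 2 "Female")) graphregion(fcolor(white))
graph export lifeexp_1.pdf, replace

* slide 2: same command, but the female series now has a visible color
twoway (line le_male year, lwidth(thick) lcolor(navy)) /*
  */ (line le_female year, lwidth(thick) lcolor(maroon)) /*
  */ , legend(order(1 "Male" 2 "Female")) graphregion(fcolor(white))
graph export lifeexp_2.pdf, replace
```

As with the code above, run this from a do-file, since the /* */ line continuations do not work when typed interactively.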

October 4, 2011 at 4:42 am 2 comments

