Archive for August, 2009

SPPA 2008

| Gabriel |

The 2008 wave for the Survey of Public Participation in the Arts is now available at CPANDA. In the WSJ, Terry Teachout noticed one basic thing in the data, which is that nobody born since the Ford administration likes jazz. I’ve been waiting for this dataset for a while because several years ago Pete Peterson and I noticed some weird differences between 1992 and 2002 (particularly as they relate to the omnivore hypothesis) and we need a third data point to help us figure it out. Another cool thing about the dataset is that they now ask questions about literature by genre, which, as seen in the literature based on the SPPA music questions, is a good way to get at cultural-capital-type issues.

Anyway, one of the minor annoyances about SPPA is that it uses a convention of “1=Yes 2=No” whereas any native Stata speaker knows that this is an abomination and contrary to the divine rule that in all binary variables, 0 shall equal “no” and 1 shall equal “yes.” (For one thing, this makes it easier to sum the dummies into a count.) As such, I’ve written this code to fix these perverse variables. Just add it to the end of the do-file that CPANDA generates for you when you download the file.

*change all the yes/no vars to Stata convention where 0 is no and 1 is yes
*all variables that are similar to yes/no but slightly different (eg, PEDWWNTO) are left alone
*to avoid confusion by plugging into scripts that assume SPSS yes/no, rename these variables with suffix "r"
global yesnovars "PEX4A PEX4B PEX5 PEQ1A PEQ2A PEQ3A PEQ4A PEQ5A PEQ6A PEQ7A PEQ8AA PEQ9A PEQ10A PEQ10B PEQ11A PEQ12A PEQ13AA1 PEQ13AA2 PEQ13AA3 PEA1A PEA1B PEA2 PEA31 PEA32 PEA33 PEA34 PEA35 PEA36 PEA37 PEA38 PEA39 PEA310 PEA311 PEA41 PEA42 PEA43 PEA44 PEA45 PEA46 PEA47 PEA48 PEA49 PEA410 PEA411 PEA412 PEA413 PEA414 PEB1A PEB2A PEB3A PEB4A PEB5A PEB6 PEB7 PEB8 PEB9 PEB10 PEB11 PEB12 PEB13 PEB14 PEC2A PEC3A PEC4A PEC5A PEC6A PEC7A PEC8A PEC9A PEC10A PEC11A PEC12A PEC13A PEC14A PEC15A PEC15B PEC16A PEC16B PEC16C PEC17A PEC18A PEC19A PEC20A PEC21A PEC25A PEC26A PEC27A PED1A PED1C PED1D PED2A PED2C PED2D PED3A PED3C PED3D PED4A PED4C PED4D PED5A PED5C PED5D PED6A PED6C PED6D PED7A PED7C PED7D HETELAVL HETELHHD HUBUS PEABSPDO PEAFEVER PEAFNOW PEDW4WK PEDWAVL PEDWLKO PEDWLKWK PEDWWK PEERNCOV PEERNLAB PEERNRT PEERNUOT PEHRAVL PEJHWKO PELAYAVL PELAYFTO PELAYLK PELKAVL PEMJOT PENLFRET PESCHENR PUBUS1 PUBUS2OT PUDIS1 PUDIS2 PUHROFF1 PUHROT1 PUIODP1 PUIODP2 PUIODP3 PUJHDP1O PULAY6M PULAYDT "
sum $yesnovars
*check that range is (1,2)

lab def yesno 0 "N" 1 "Y"

foreach var in $yesnovars {
	recode `var' (2=0) (1=1)
	lab val `var' yesno
	ren `var' `var'r
}
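As a minimal sketch of why the 0/1 convention makes counting easy, here is the same recode run on a three-row toy dataset, followed by a row-wise count of "yes" answers. The toy values, the global demovars, and the variable yescount are assumptions made up for illustration and are not part of the SPPA file; run this in a scratch session, since clear drops whatever data is in memory.

```stata
* toy data standing in for three SPPA items coded 1=Yes 2=No (values are invented)
clear
input PEQ1A PEQ2A PEQ3A
1 2 1
2 2 .
1 1 1
end

* demovars is a hypothetical stand-in for the full yesnovars list
global demovars "PEQ1A PEQ2A PEQ3A"
lab def yesno 0 "N" 1 "Y"

foreach var of global demovars {
	* stop if anything other than 1, 2, or missing shows up
	assert inlist(`var', 1, 2) | missing(`var')
	recode `var' (2=0) (1=1)
	lab val `var' yesno
	ren `var' `var'r
}

* with 0/1 dummies a count is just a row total
* note that rowtotal() treats missing values as zero
egen yescount = rowtotal(PEQ1Ar PEQ2Ar PEQ3Ar)
list
```

The assert line automates the "check that range is (1,2)" step, so the loop halts on any variable that turns out not to be a clean yes/no item.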

August 18, 2009 at 5:45 am 1 comment

Data collection

| Gabriel |

A few months ago I took this picture at a rest stop on I-40 near Tucumcari, New Mexico.
[Photo: DSCF7413]

It struck me as pretty funny that the highway department would collect quality control data in this way. On ruminating about it, there’s a kind of McLuhanesque “medium is the message” quality to it in that a survey conducted with push buttons and lightbulbs is necessarily going to be terse and to the point. Compare this to your standard social science or marketing survey, gavaged with the “usual suspects” battery of demographic background questions, the interminable list of Likert-scale questions that are only slight paraphrases of each other, etc.

August 17, 2009 at 5:00 am 1 comment

Applied diffusion modeling

| Gabriel |

Via Slashdot, some mathematicians at the University of Ottawa have modeled zombie infestation. It’s basically your standard endogenous growth model with a cute application. Here’s the conclusion:

In summary, a zombie outbreak is likely to lead to the collapse of civilisation, unless it is dealt with quickly. While aggressive quarantine may contain the epidemic, or a cure may lead to coexistence of humans and zombies, the most effective way to contain the rise of the undead is to hit hard and hit often. As seen in the movies, it is imperative that zombies are dealt with quickly, or else we are all in a great deal of trouble.

Here’s an even more “sophisticated” simulation, which allows spatial heterogeneity.

August 14, 2009 at 11:04 pm

The big gens

| Gabriel |

I heard that a handful of clans (“gentes” in Latin; singular “gens”) dominated the higher offices of the Roman republic and I figured that this would be a good data question. To start, I copied the Fasti Consulares from Wikipedia and limited it to the Republican period, defined as the Rape of Lucretia through the Battle of Actium.

Roman names followed the convention of “personal gens family [honorifics].” So, for instance “Publius Cornelius Scipio Africanus” means “the man Publius from the Scipio branch of the Cornelius clan, the conqueror of Africa.” From the perspective of seeing which clans dominated the Republic, the key bit is the second name so I used the Stata string function “word” to pull the second word out of each of these names.
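The extraction step can be sketched roughly as follows. This is a toy version with four hand-typed names rather than the actual Fasti Consulares file, and the variable names consul, gens, and consulships are assumptions chosen for illustration.

```stata
* toy list of names; the real input would be the Fasti Consulares
clear
input str60 consul
"Publius Cornelius Scipio Africanus"
"Gaius Iulius Caesar"
"Lucius Cornelius Sulla Felix"
"Quintus Fabius Maximus Verrucosus"
end

* word(s, n) returns the nth space-delimited word, here the gens name
gen gens = word(consul, 2)

* collapse to one row per gens with a count of consulships
contract gens, freq(consulships)
gsort -consulships
list
```

Note that word() only works this cleanly when every entry begins with a personal name followed by the gens name; entries with prefixes or missing personal names would need cleaning first.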

As can be seen, the distribution of consulships per gens follows a power law. Since power laws indicate a cumulative advantage mechanism, we can interpret this as meaning that in Rome a family’s power and prestige was endogenous.

[Figure: consuls]

The most dominant clans in the republic were the Furii (41 consulships), the Claudii (45 consulships), the Aemilii (53 consulships), the Fabii (62 consulships), the Valerii (71 consulships), and the Cornelii (106 consulships). This means that a Cornelius was consul about once every six years.
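One rough way to eyeball a power law is a rank-size regression in logs. The sketch below uses only the six counts quoted above, so it is an illustration of the technique rather than a test of the claim: six points from the top of the distribution cannot establish a power law, and the full analysis would use every gens. The variable names are assumptions.

```stata
* the six counts quoted in the text; the full distribution has many more gentes
clear
input str10 gens consulships
"Cornelii" 106
"Valerii" 71
"Fabii" 62
"Aemilii" 53
"Claudii" 45
"Furii" 41
end

* rank from most to fewest consulships
gsort -consulships
gen rank = _n

* under a power law, log count is roughly linear in log rank
gen lnrank = ln(rank)
gen lncount = ln(consulships)
regress lncount lnrank
```

A roughly straight line on the log-log scale is consistent with a power law, though other heavy-tailed distributions can look similar over a short range.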

In contrast the Iulii (as in Gaius Iulius Caesar) held the consulship a relatively paltry 29 times, so small wonder that in order to establish the monarchy they had to form a marriage alliance with the Claudii. Likewise, the Pompeii were a politically obscure family but Pompey Magnus became powerful through his patron-client relationships with the Cornelii.

August 14, 2009 at 5:34 am 1 comment

Did I stutter?

| Gabriel |

Vain creature that I am, I was googling my Dixie Chicks stuff and it was a somewhat frustrating experience. Here are two of the references to it I found, out there on the wilds of the internet:

  • Gabriel Rossman uses the Dixie Chicks incident of when they openly spoke out against Bush to show how synergy is creating corporate censorhip.
  • A study titled “Who Killed the Travelin’ Soldier: Elites, Masses, and Blacklisting of Critical Speakers” done by Gabriel Rossman Dept of Sociology Princeton University supports the fact that neither “Free Republic or other right wing groups” organized boycotts were responsible for the dcx demise. It was the dcx themselves.

Aaaargggghhh!

If you haven’t read or don’t remember the article, the gist of it is that the Dixie Chicks blacklist was probably instigated by right-wing social movements and was definitely not instigated by big companies like Clear Channel. That is to say that both of these references to the paper get it completely back asswards, apparently so as to make it conform to their ideological priors. I can understand people thinking (incorrectly in my opinion) that it was big media’s fault or that right-wing social movements had nothing to do with it, but I really don’t see how you can use findings to the opposite effect as evidence for these positions. Nonetheless I’m sure it happens constantly (sometimes even in journals rather than, as in this case, random websites).

August 13, 2009 at 4:49 am

But it’s got an 11

| Gabriel |

[Image: stata11]

StataCorp cronies like Jeremy may have gotten it weeks ago, but mere end-users like me had to wait a bit longer. This looks to be a significant upgrade (especially for Windows users, who now have a good integrated do-file editor). In particular I wish I had access to the new “factor variables” syntax (as a replacement for “xi”) a few weeks ago. Likewise the new stcrreg model (think an st version of mlogit) looks very good.
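For a sense of what the factor variables syntax buys you, here is a small comparison using the auto dataset that ships with Stata. The choice of dataset and model is an assumption for illustration only.

```stata
sysuse auto, clear

* old style: the xi prefix expands i.rep78 into _I* dummy variables
xi: regress price mpg i.rep78

* Stata 11 style: the same model with no prefix and no new variables created
regress price mpg i.rep78

* interactions are built in; c. marks a continuous variable and ## adds
* both main effects and their interaction
regress price c.mpg##i.foreign
```

The main practical difference is that the factor variable version leaves the dataset untouched, whereas xi adds generated dummies to it.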

A few words on a smooth upgrade. First, remember to reload your ado files.

Also, remember to update any scripts in your text editor so it knows where to push. In TextMate hit control-command-option-B to invoke the script editor. Scroll down to “Stata” and edit the scripts “Send File to Stata” and “Send Selection to Stata.” In each script the key line is

osascript -e "tell application \"StataMP\"

If this line doesn’t refer to your current version of Stata, change it.

August 12, 2009 at 5:39 pm 2 comments

links

| Gabriel |

NYT: Statistics jobs

Note that a lot of what they are calling statistics is really more about data mining, which is why several of the people they highlight are computer scientists, not statisticians. This is consistent with my belief that our training ought to give more emphasis to workflow and data cleaning. Despite the usual standard-error-centric statistics training I’ve managed to develop a decent workflow, but (as can be seen by reading my shell scripts) I still really struggle with data cleaning languages like awk and perl.

Apple Rejecting All e-Book App Store Submissions?

Long story short, Apple is worried about confirming whether the application developers have clear title to the copyright. I see this as indirect fallout from the Kindle “1984” scandal, as well as a good illustration of transaction costs (and by extension, a good argument for limited copyright terms).

August 7, 2009 at 6:06 am
