April | 2011 | Code and Culture

Archive for April, 2011

Maybe a reason it would have been better to keep ASA in Chicago

| Gabriel |

Matt Wray was interviewed today on the Freakonomics podcast to talk about his research on suicide in Las Vegas. My first thought was that his skillful redirection of the Hungary question demonstrates a good return-on-investment for RWJ’s media training. My second thought was to wonder how worried I should be that half the discipline will be visiting this desert oasis of self-murder this August. Let’s work out the math of the expected mortality, shall we?

About 85 Americans kill themselves a day, which out of a population of 300 million works out to a daily personal risk of 2.8*10^-7. Wray et al. (SSM 2008) estimate that the odds double during a visit to Vegas, which implies a daily risk of 5.6*10^-7. ASA usually has an attendance of about 4000 people, each of whom we can assume stays for about four days. This works out to 16,000 person-days, at a risk of 5.6*10^-7 per day, which works out to an expected 0.009 suicides. Of course, we have to account for the baseline risk had we all stayed home and presented papers via Skype, so the excess mortality is something like 0.0045 suicides. Another way to put this is that we could expect a single excess sociologist’s suicide if we were to hold ASA in Vegas every year for the next two centuries.
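
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the same back-of-the-envelope calculation. The function name and its parameters are made up for illustration; the inputs are the figures quoted above, and the sketch treats the doubled odds as a doubled risk, which is a reasonable approximation only because the risk is so small.

```python
def vegas_excess_mortality(daily_deaths, population, attendees, days, multiplier):
    """Back-of-the-envelope expected suicides for a conference held in Vegas.

    Assumes the risk multiplier applies uniformly to every attendee-day.
    """
    baseline_daily_risk = daily_deaths / population
    person_days = attendees * days
    expected_at_home = person_days * baseline_daily_risk
    expected_in_vegas = expected_at_home * multiplier
    excess = expected_in_vegas - expected_at_home
    years_per_excess_death = 1 / excess
    return baseline_daily_risk, expected_in_vegas, excess, years_per_excess_death


risk, in_vegas, excess, years = vegas_excess_mortality(
    daily_deaths=85, population=300_000_000,
    attendees=4000, days=4, multiplier=2,
)
print(f"daily baseline risk:      {risk:.2e}")
print(f"expected suicides, Vegas: {in_vegas:.4f}")
print(f"excess suicides:          {excess:.4f}")
print(f"years per excess death:   {years:.0f}")
```

The unrounded figures differ slightly from the ones in the text, which round the daily risk to 2.8*10^-7 before multiplying, but the conclusion of roughly one excess death per two centuries is the same.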

This blood will be on your hands, ASA!!! *

—————————
* For the benefit of people with no sense of humor and who extend the whole “civility” thing to ASA governance debates, I should be explicit that I’m kidding about this.

April 28, 2011 at 1:08 pm GR 6 comments

Another R bleg about loops and graph devices

| Gabriel |

I’m having another problem with R and was hoping somebody could help me out in the comments. Long story short, I can make graphs properly if I do them one at a time, but when I try to loop it I get device errors. In particular it creates the PDFs but they are either empty or corrupted.

Here is the log, which first fails to do it with a loop, then does it right for one case where I manually assign the variables rather than looping.

> # File-Name:       surfacegraphs.R                 
> # Date:            2011-04-25
> # Author:          Gabriel Rossman                                       
> # Purpose:         graph from Stata
> # Packages Used:   lattice   
> # note, wireframe code from lisa 
> timestamp()
##------ Tue Apr 26 10:44:09 2011 ------##
> library(lattice)
> 
> histopath <- '~/Documents/project/histograms'
> image2 <- '~/Documents/project/images/histograms'
> 
> timestamp()
##------ Tue Apr 26 10:44:10 2011 ------##
> 
> #create surface histograms, showing how population evolves over time
> #  parameters held constant
> setwd(histopath)
> for(d in 0:10) {
+ 	for(p in 0:5) {
+ 		d10 <- d*10
+ 		p100 <- p*100
+ 		datafile <- paste(histopath,'/d',d10,'p',p100,'.txt', sep="")
+ 		dataobject <- read.table(file=datafile,header=TRUE)
+ 		pdfcolor <- paste(image2,'/hist_color_d',d10,'p',p100,'.pdf', sep="")
+ 		pdfgrey <- paste(image2,'/hist_grey_d',d10,'p',p100,'.pdf', sep="")
+ 		pdf(pdfcolor)
+ 		wireframe( dataobject$z~dataobject$x*dataobject$y, shade=TRUE) 
+ 		dev.off()
+ 		
+ 		pdf(pdfgrey)
+ 		wireframe( dataobject$z~dataobject$x*dataobject$y, shade=TRUE, par.settings=standard.theme(color=FALSE))
+ 		dev.off()
+ 	}
+ }
There were 50 or more warnings (use warnings() to see the first 50)
> timestamp()
##------ Tue Apr 26 10:44:12 2011 ------##
> 
> #loop doesn't work
> #  seems to be the dev.off()
> #try a few manually
> d10 <- 0
> p100 <- 0
> datafile <- paste(histopath,'/d',d10,'p',p100,'.txt', sep="")
> dataobject <- read.table(file=datafile,header=TRUE)
> pdfcolor <- paste(image2,'/hist_color_d',d10,'p',p100,'.pdf', sep="")
> pdfgrey <- paste(image2,'/hist_grey_d',d10,'p',p100,'.pdf', sep="")
> pdf(pdfcolor)
> wireframe( dataobject$z~dataobject$x*dataobject$y, shade=TRUE) 
> dev.off()
null device 
          1 
> 
> 
> timestamp()
##------ Tue Apr 26 10:44:14 2011 ------##
> 
>

The warnings start like this and go on from there:

> warnings()
Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf
5: In min(x) : no non-missing arguments to min; returning Inf

Any ideas?
Do I just need to hard-code the graphs I really want rather than batching them?

[Update]
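As Michal suggested, I needed to wrap wireframe in print. Lattice functions like wireframe return a trellis object rather than drawing anything themselves. At the top level R auto-prints that object, which is why the manual run worked, but inside a for loop nothing is auto-printed, so each PDF device was opened and closed without a plot ever being drawn on it. Writing print(wireframe(...)) inside the loop fixes it. Here’s an example of the output (for a baseline simulation).
[Image: hist_color_d0p0]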

April 27, 2011 at 4:51 am GR 7 comments

Simulations, numlist, and order of operations

| Gabriel |

I’ve been programming another simulation and as is typical am batching it through various combinations of parameter values, recording the results each time. In making such heavy (and nested) use of the forvalues loop I noticed some issues with numlist and order of operations in algorithms.

First, Stata’s numlist expression (as in the “forvalues” syntax) introduces weird rounding errors, especially if specified as fractions. Thus it is preferable to count by integers then scale down to the fractional value within the loop. This is also useful if you want to save each run of the simulation as a file as it lets you avoid fractional filenames.

So instead of this:

forvalues i=0(.01)1 {
	replace x=sin(`i')
	save sin`i'.dta, replace
}

Do this:

forvalues i=0/100 {
	local i_scaled=`i'/100
	replace x=sin(`i_scaled')
	save sin`i'.dta, replace
}
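
The underlying issue is not specific to Stata: most decimal fractions have no exact binary floating-point representation. The following Python sketch illustrates the general point. It is not a model of how Stata builds a numlist internally, and the helper names `stepped` and `scaled` are made up for the example. It compares a counter that repeatedly adds 0.01 against one that counts by integers and divides.

```python
def stepped(start, step, count):
    """Accumulate `step` repeatedly, as a naive fractional counter would."""
    values, current = [], start
    for _ in range(count):
        values.append(current)
        current += step
    return values


def scaled(count, divisor):
    """Count by integers and scale down to the fraction inside the loop."""
    return [i / divisor for i in range(count)]


fractional = stepped(0.0, 0.01, 101)
integer_based = scaled(101, 100)

# Indices where the accumulated value is not exactly equal to i/100.
drifted = [i for i, (a, b) in enumerate(zip(fractional, integer_based)) if a != b]
worst = max(abs(a - b) for a, b in zip(fractional, integer_based))

print(f"{len(drifted)} of 101 accumulated values differ from i/100")
print(f"largest discrepancy: {worst:.3e}")
```

Counting by integers also keeps the loop index usable as a clean filename suffix, which is the second benefit noted above.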

Another issue with numlist is that it can introduce infinitesimal errors so that a comparison that ought to amount to “1==1” comes back false. If you have a situation like this you need to make the comparison operator fuzzy. So instead of just writing the expression “if y==x” you would use the expression

if y>x-.0001 & y<x+.0001
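
The same fuzzy comparison can be sketched in Python, again as an illustration of the general technique rather than of Stata itself. The helper `fuzzy_equal` is a made-up name that mirrors the expression above with the same 0.0001 tolerance; `math.isclose` is the standard-library equivalent, which uses a relative tolerance by default.

```python
import math


def fuzzy_equal(y, x, tol=0.0001):
    """True when y lies strictly within `tol` of x, like y>x-tol & y<x+tol."""
    return x - tol < y < x + tol


y = 0.1 + 0.2   # carries a tiny binary rounding error
x = 0.3

print("exact ==     :", y == x)
print("fuzzy_equal  :", fuzzy_equal(y, x))
print("math.isclose :", math.isclose(y, x))
```

A fixed absolute tolerance like 0.0001 is fine when the values are on the order of one. For values spanning many orders of magnitude a relative tolerance is usually the safer choice.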

Finally, I’ve noticed that when you are running nested loops the number of operations grows multiplicatively with each level of nesting (three nested loops of 100 iterations each means a million passes through the innermost block) and so it makes a big difference in what order you do things. In particular, you want to arrange operations so they are repeated the fewest times. For instance, suppose you have batched a simulation over three parameters (x, y, and z) and saved each combination in its own dataset with the convention “results_x_y_z” and you wish to append the results in such a way that the parameter values are variables in the new appended dataset. The simple (but slow) way to run the append is like this:

clear
gen x=.
gen y=.
gen z=.
forvalues x=1/100 {
	forvalues y=1/100 {
		forvalues z=1/100 {
			append using results_`x'_`y'_`z'
			recode x .=`x'
			recode y .=`y'
			recode z .=`z'
		}
	}
}

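Unfortunately this is really slow. The following code has the same number of lines but involves about half as many operations for the computer to do. In the first version there are four commands that are each run 100^3 times, or 4,000,000 executions in all. The second version has two commands that run 100^3 times, one command that runs 100^2 times, and one command that runs 100 times, or 2,010,100 in all. It works because recode only touches observations where the variable is still missing, so filling in y and x can wait until every file sharing that value has been appended.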

clear
gen x=.
gen y=.
gen z=.
forvalues x=1/100 {
	forvalues y=1/100 {
		forvalues z=1/100 {
			append using results_`x'_`y'_`z'
			recode z .=`z'
		}
		recode y .=`y'
	}
	recode x .=`x'
}
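
To check that the two versions really do produce the same dataset with different amounts of work, here is a small Python simulation. It is only a sketch under simplifying assumptions: each results file is modeled as a single row with x, y, and z missing, `fill_missing` is a made-up stand-in for `recode var .=value`, and "commands" counts command executions rather than actual running time.

```python
def fill_missing(rows, name, value):
    """Stand-in for `recode name .=value`: fill only rows still missing."""
    for row in rows:
        if row[name] is None:
            row[name] = value


def append_naive(n):
    """Recode all three parameters after every single append."""
    rows, commands = [], 0
    for x in range(1, n + 1):
        for y in range(1, n + 1):
            for z in range(1, n + 1):
                rows.append({"x": None, "y": None, "z": None})  # append using
                fill_missing(rows, "x", x)
                fill_missing(rows, "y", y)
                fill_missing(rows, "z", z)
                commands += 4
    return rows, commands


def append_hoisted(n):
    """Recode each parameter once per pass of the loop that defines it."""
    rows, commands = [], 0
    for x in range(1, n + 1):
        for y in range(1, n + 1):
            for z in range(1, n + 1):
                rows.append({"x": None, "y": None, "z": None})  # append using
                fill_missing(rows, "z", z)
                commands += 2
            fill_missing(rows, "y", y)
            commands += 1
        fill_missing(rows, "x", x)
        commands += 1
    return rows, commands


def command_counts(n):
    """Closed-form command counts for the naive and hoisted versions."""
    return 4 * n**3, 2 * n**3 + n**2 + n


naive_rows, naive_commands = append_naive(5)
hoisted_rows, hoisted_commands = append_hoisted(5)
print("same result:", naive_rows == hoisted_rows)
print("commands at n=5:", naive_commands, "vs", hoisted_commands)
print("commands at n=100:", command_counts(100))
```

The simulation uses a small n because the stand-in recode rescans every row. The closed-form counts give the numbers for n=100.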

April 26, 2011 at 4:46 am GR 2 comments

Thanks Kate

| Gabriel |

Most of you who would care about this have already seen it, but I just wanted to say publicly that ASA Secretary Kate Berheide has done a great job of providing a breakdown of the ASA’s finances.

Her report is in several pieces so I’ll give links to each of them:

The cover letter
Revenues and expenses broken out as much as possible by program. [I am particularly impressed by this one as it both must have been the most challenging to write and it goes the furthest towards helping us imagine counterfactual sets of priorities. It’s still a hard issue to get a handle on, but I think Kate made it as clear as it possibly could be. Note that the fact that dues are only a minority of revenues means that any change up or down in expenses has an outsize impact on dues].
The ASA’s need for more revenues and an explanation of what services have been cut recently and which would be restored.
Comparison to AAA and APSA. [Note that Kate leaves out AEA but is candid about this right up front on the basis of AEA having a somewhat different model in several different ways and in particular being a quasi-publisher. FWIW, AEA makes about $3 million in licensing fees as compared to about $2 million for ASA. My back of the envelope calculations are that this explains about half the gap in dues.]

These reports go a long way towards illuminating the ASA’s finances and why the leadership has proposed a substantial aggregate increase in dues revenue. Different people may differ about the merits of the dues increase but thanks to Kate we now have a much clearer picture of what’s at issue. I particularly appreciate how quickly she put the reports together and how open and gracious she was about considering what kinds of information would be useful to the membership.

April 25, 2011 at 3:29 pm GR

Immediate field work opportunity

| Gabriel |

If any of you has ever wanted to replicate Festinger’s 1956 classic When Prophecy Fails, you better get into the field now as you’ve only got a few weeks.

April 18, 2011 at 2:34 pm GR 2 comments

Stata for Mac PDF fixed in Stata 11.2

| Gabriel |

About a year ago, I got frustrated with Stata’s “graph export foo.pdf” command, which at the time gave hideous output. Apparently the problem was that Stata used the same code to write to disk as to write to screen. As a work-around, I wrote graphexportpdf.ado, which is basically a wrapper to pipe Stata-generated eps files through Ghostscript.

I am happy to report that the revision notes for Stata 11.2 include this line, from the section about Stata for Mac:

33. Graphs exported as PDF files are now exported with increased resolution.

That is to say, they fixed it. I tested it and it creates beautiful output and does so very quickly. Thanks StataCorp!

I highly recommend that Mac users of Stata 11.2 and higher use the native PDF capabilities through the standard “graph export foo.pdf” syntax. Graphexportpdf.ado may still be useful for Mac users of versions 10 and earlier and to Linux users (who don’t have Quartz but usually have Ghostscript as part of a LaTeX distro).

Finally, remember that “graph export foo.pdf” is a Mac only option so if you want your code to be portable you should treat it like this:

if "`c(os)'"=="MacOSX" { 
  graph export mygraph.pdf, replace
} 
else {
  graph export mygraph.eps, replace
}

April 11, 2011 at 4:29 pm GR 3 comments

ASATransparency.org

| Gabriel |

As discussed before on this and other blogs, the ASA leadership has put a measure on the April/May ballot to increase the dues. The justification provided in Footnotes is extremely misleading by giving most of its attention to questions of distributional impact to the point that a reader could be forgiven for not noticing that it is a strictly monotonic hike for employed sociologists in all income brackets. What the leadership has not provided the membership is a justification of why the ASA should be the most expensive social science professional society, with the proposed dues being roughly two or three times as expensive as AEA and 10-20% more expensive than AAA or APSA. Basically, what services are we getting for the money and do we as a membership really support such expenses? Maybe we do, maybe we don’t, but certainly it’s a question we should take seriously.

Although I am personally optimistic that the ASA will gather and disseminate such information over the next year or so, this is information that the membership needs before voting on the dues hike. Whether we expect that, on having such information, we would support or oppose the dues hike, we can all agree that at present we lack sufficient information to make an informed decision. As such I urge my fellow ASA members to read and sign the petition at asatransparency.org that demands a better justification for the dues hike rather than just letting the ASA expand through a fit of absence of mind.

The text of the petition is reproduced below but you need to visit asatransparency.org to sign it:

We the undersigned sociologists¹ hereby register our concern with the ASA leadership’s recommendation that the membership vote for a significant aggregate dues increase. (See the March issue of Footnotes for the recommendation and rationale.)² We urge ASA members to vote against the proposed dues increase unless the ASA leadership presents a cogent explanation that specifically addresses why a substantial increase in total dues beyond the usual cost of living increase is warranted.

The published rationale argues that ASA dues should be more progressive. Like the ASA leadership, we support progressivity in the distribution of dues payments across the ASA membership. But what of the aggregate size of those payments? As shown in Table 3 of the Footnotes article, the proposal increases dues in every income bracket for employed sociologists.³ The new proposal does much more than just redistribute the dues burden in a more progressive way. It will also generate a substantial amount of new revenue, and the ASA has offered no explanation for why this is needed.

We believe that such a large aggregate increase in dues should be explained to members, before any vote, by a clear account of what more the ASA will be doing or why it needs to raise funds beyond a cost of living increase to continue existing services. This explanation must be specific about the services to be funded by additional dues revenue, and distinguish services that need additional dues funds from those that generate enough revenue on their own to break-even or make a profit. The explanation should also compare dues and services offered by peer organizations like APSA, AEA, and AAA, and provide a compelling explanation of why ASA leadership proposes dues that are higher.⁴

Unless the ASA leadership provides a compelling justification that meets these criteria before the May elections, we urge ASA members to vote against the new dues schedule.

Notes

1. “Sociologists” includes both Ph.Ds and graduate students in sociology, as well as other social scientists who engage in sociological research or teach sociology.

2. http://www.asanet.org/footnotes/mar11/dues_0311.html.

3. The proposal holds student dues steady and decreases dues for unemployed members by twenty dollars (http://www.asanet.org/footnotes/mar11/table3_0311.html), yet it appears the aggregate increase in other categories is far greater than what would be needed to simply balance this decrease for unemployed members.

4. For a comparison of current and proposed ASA dues with other social science organizations, see “A Comparative Look at ASA Membership Costs and Benefits.”

OK, now that you’ve read it, go to ASAtransparency.org and sign the thing.

April 7, 2011 at 4:35 am GR

The one-drop ethnicity

| Gabriel |

One of the findings from the 2010 Census that has been the subject of much discussion is the large increase in the Latino population since 2000, most of which is from natural increase rather than immigration. This is obviously a real trend and is to be expected as Latinos tend to be younger (read: many are in their prime fertility years) and to have higher total fertility than do Anglos. However I was left wondering whether some of this increase might be somewhat exaggerated by how we collect the data.

As is described in Cristina Mora’s fantastic dissertation, during the 1970s there were all sorts of twists and contingencies as “Hispanic” was institutionalized as a category. One of the oddities is that “Hispanic” is not a race, but an “ethnicity” that cross-cuts race. A few decades later, we switched to a “check all that apply” racial taxonomy, but since Hispanic is not a race a Census respondent can check both “black” and “Asian” but not both “Hispanic” and “not Hispanic.” This effectively creates a one-drop rule for Hispanic ethnicity that creates systematic biases in our understanding of Hispanics and has the ironic effect of blinding us to some important ways that this always somewhat artificial distinction is blurring.

My own household consists of an adult Anglo (me), an adult Hispanic white (my wife), and our daughter who the Census Bureau considers a Hispanic white, just the same as if both her parents were Latino. In contrast if my wife were not Mexican but African American, the Census Bureau would make salient this admixture by coding our daughter as both “white” and “black.” Nor is my sort of marriage especially uncommon. In the 2000 Census, 28% of current marriages involving at least one Latino spouse also involved a non-Latino spouse. Thus as a quick and dirty ballpark estimate* we can say that something like a quarter to a third of the natural increase in Latinos involves children who are in some sense “half Latino,” but whom the Census records as simply “Latino.”

Thus implicitly our data have a “one-drop” rule for Latino ethnicity, even as we have moved away from comparable assumptions for racial distinctions. The theoretical question is whether this is an assumption that reflects social reality. In a sense it may be performative: if social institutions treat the offspring of Latino/non-Latino unions as simply “Latino” this may actually push such people towards identifying as Latino. (See a series of fascinating articles by Saperstein and Penner on social institutions and longitudinal racial identity). However I am rather skeptical that a one-drop rule for Latino necessarily reflects reality. Latino has always been a hodge-podge category defined almost entirely by a shared history of the Spanish language and we know that Latinos rapidly lose Spanish language ability when they do not grow up in households with an immigrant member. This Spanish-language loss is especially pronounced for people with a non-Latino parent and absent this defining characteristic we might expect many people who are half Latino and half Anglo to just sort of blend in, with an ethnic identity that is as unmarked and low salience as someone who is Irish/German or Levantine/Scots-Irish. Another way to say this is that we need to be careful about reifying our current arbitrary and historically contingent understanding of ethnicity and understand that a lot of what we see as growth among the Latino minority is really part of an emerging 21st century “beige” majority, no matter what they check on closed-form surveys rooted in the politics, economics, and social sciences of the 1970s.

[Update: Also see similar thoughts from Matt Yglesias].

————————–

* Note that extrapolating marriages to births assumes that (a) the patterns are roughly comparable for non-marital fertility and (b) endogamous Latinos have similar fertility to exogamous Latinos.

April 1, 2011 at 5:05 pm GR 3 comments

Reply to Yesterday’s Comment

| Gabriel |

In the comments to a previous post, Don raised a series of points about the reform process and the nature of ASA governance. This seemed to be most directly in response to OW, who had her own response immediately below Don’s comment. Ezra and I also had some thoughts, which are largely similar to OW’s in emphasizing that we intend no animus and see the issues as largely structural. Since this is a reply to a public comment and largely concerns procedural issues of the nature of debate about the ASA, Ezra and I decided to post it publicly instead of sending it in a private e-mail.

Dear Don:

First, we’d like to thank you for your service to the discipline. We are sure that during your four years as treasury-secretary, you made many sacrifices and had to deal with all kinds of challenges, and obviously your motivations in taking on the job and working so hard at it were based on the purest of intentions. Indeed, it is our view that the ASA’s funding shortfall (if that is an accurate characterization for the rationale for the dues increase) is not a result of impure motivation but of an organization that has become institutionalized in such a way that it is not sufficiently transparent and inclusive. Individuals within such an organization – whether they be staff members or elected officers – may be capable and diligent and have the best of intentions, but what they are up against is much bigger than themselves.

It is of course true that this manner of raising criticisms is somewhat unruly and the rules are unclear. For instance, is it right or wrong to use pseudonyms? What to do about the fact that the questions come up here, there, and everywhere on different blogs? We can imagine that from your perspective it can feel like playing whack-a-mole.

However please consider the situation as an organization theorist: do you really prefer a system in which concerns are channeled via private notes to the ASA president? We think it is not healthy for people to confine their concerns to “going through the proper channels” as isolated complainants. Rather, people who perceive a problem need to caucus amongst themselves to develop a common understanding of the problem and solutions to it. Sometimes they may discover that there are compelling answers for their concerns (for instance, the ASA has now provided a useful account about the interest-rate swap). At other times, they may collectively develop a more thorough understanding of issues that were not apparent at first (such as that journals are a profit center for the association, or that one of the main reasons the ASA is expensive is because it has adopted a mission of raising the public profile of the discipline), and these issues are then raised as policy issues for the membership to consider. As you noted, the ASA has an elected leadership; we would add that another positive externality of caucusing is to recruit informed, motivated candidates.

As to the issue of insulting the staff or officers, we certainly do not want to make ad hominem attacks and we have strived to avoid it. However there is the danger that too strenuous efforts to avoid giving offense will mean never criticizing strategies or policies. Doing so would create the danger of excessive status quo bias and possibly create structural ratchet mechanisms. For instance, can we never question the necessity of the ASA providing a service lest the staff member who provides that service take offense? And is it wrong to criticize a past decision lest the people responsible for it take offense?

Please consider in this regard: We can’t recall the last time that an ASA officer was elected on a reform ticket, or any kind of platform that involved criticism of how previous administrations have done things. In fact, we’re pretty sure that this has never happened. There are a whole bunch of reasons that we think this has never happened, but we will refrain from offering them here. Rather, we would just ask: if we were to design a membership organization, wouldn’t we prefer one in which reform candidates have at least some possibility of existing, and in which the members are encouraged to gather in public spaces and debate the policies and practices of the organization? The answer is obvious, and we apply it in every other organization or government in which we participate. And if that is the answer, the ASA leadership should try to get past the feeling that they are being personally attacked, and realize that the firestorm of criticism is a very productive development for the organization, which should be channeled to reform the system. The first step in doing so is to provide (to the members as well as the leadership) a clear picture of how much the ASA spends on various services, how much it brings in through various revenue streams, and how this compares to peer organizations. Having done so, we will all have a clearer picture of how the ASA works and whether it would be preferable to increase its revenues, decrease its ambitions, or some mix of the two. And in this kind of environment, we can be confident that we will also make more productive uses of our resources such that we might realize our ambitions for sociology while using less of our members’ resources.

Yours,

Gabriel Rossman and Ezra Zuckerman

April 1, 2011 at 12:49 pm GR 9 comments
