Or you could just do regressions

October 20, 2010 at 5:18 am 3 comments

| Gabriel |

Over at the “Office Hours” podcast (née Contexts podcast), Jeremy Freese gives an interview about sociology and genetics. The main theme is that when a process is characterized by nonlinearity, positive feedback, and other sorts of complexity, you can get misleading results from models with essentially additive assumptions, like the models we use to calculate heritability coefficients. (Heritability is closely analogous to a Pearson correlation coefficient. It is usually calculated from data about outcomes for fraternal vs. identical twins and uses reasonable assumptions about how much genetics these twins share, respectively 0.5 vs. 1.0.)
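To make the twin arithmetic concrete, here is a minimal sketch. It assumes the estimator is Falconer's formula, which is the textbook version of the calculation described above but is not something the podcast or this post names. Because identical twins share twice as much genetic material as fraternal twins (1.0 vs. 0.5), doubling the gap between the two twin correlations gives the additive estimate. The correlations plugged in are made-up numbers for illustration only.

```python
def falconer_heritability(r_mz: float, r_dz: float) -> float:
    """Falconer's additive estimate: h^2 = 2 * (r_MZ - r_DZ).

    r_mz: outcome correlation between identical (monozygotic) twins
    r_dz: outcome correlation between fraternal (dizygotic) twins
    """
    return 2.0 * (r_mz - r_dz)


# Hypothetical twin correlations, not real data.
h2 = falconer_heritability(r_mz=0.80, r_dz=0.50)
print(f"estimated heritability: {h2:.2f}")
```

Note that nothing in the formula knows why identical twins resemble each other more; it only sees that they do. That is the whole issue in what follows.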

Jeremy gives the example that if people have small differences in natural endowments, but they specialize in human capital formation in ways that play to their endowments, then this will show up as very high heritability. Jeremy suggests this is misleading since the actual genetic impact on initial endowment is relatively small. I agree in a sense, but in another way, it’s not misleading at all. That is, the heritability coefficient is accurately reflecting that a condition is a predictable consequence of genetics even if the causal mechanism is in some sense social rather than entirely about amino acids.

This is exactly the same issue as an argument I had with one of my co-authors a few years ago. We were studying how pop songs spread across radio and decomposing how much of this was endogenous (stations imitating each other) versus exogenous (stations all imitating something else). The argument was over how to understand the effects of the pop charts published in Billboard and Radio & Records. One of my co-authors argued that these are not radio stations but periodicals and therefore should be considered exogenous to the system of radio stations. The other author and I held the position that appearing on the pop charts is an entirely predictable consequence of being played by a lot of radio stations and therefore it is endogenous, even if the effect is proximately channeled through something outside the system. I believe this is true in an ontological sense, but it’s also a convenient belief since it’s necessary to make the math work.

Anyway, back to Jeremy’s case, you have a lot of things that are predictable outcomes of genetic endowment but for the sake of argument we can assume that we are really dealing with a small initial effect that is greatly magnified by a social mechanism. I would submit that in the current set of social circumstances the heritability coefficient as naively measured is very informative. This is sometimes contrasted with how informative it is in the abstract, but if you take gene-environment interdependence (or any complex system) seriously, then “in the abstract” is a meaningless concept. Rather you can only think about a counterfactual heritability coefficient in a counterfactual social system. This calls out for counterfactual causality logic to see how effects vary on different margins, etc, of the sort developed by Pearl and operationalized for social scientists by Morgan and Winship.

Currently, American social structure allows a lot of self-assignment to different trajectories, including an expensive (at both the personal and societal level) system of “second chances” for people to get back into the academic trajectory whether they show much aptitude for it or not and have sufficient remaining years in the labor market to amortize the human capital expense or not. As such there is sorting but it’s fairly subtle and to a substantial extent voluntary. This is the situation Jeremy describes in his stylized example of people voluntarily accruing human capital to complement natural endowments.

We can contrast this with two hypothetical scenarios. In counterfactual A, imagine that we had perfect sorting to match aptitude to development. Think of how the military uses the ASVAB to assign recruits to occupational specialties. Better yet, imagine some perfectly measured and perfectly interpreted genetic screen for aptitudes measured at birth, and on that basis we sent people from daycare onwards into a humanities track, a hard science track, or various blue collar vocational tracks with no opportunity for later transfers between tracks. That is, in this scenario we would see much stronger sorting to match aptitude and career than in the status quo. In counterfactual B, we can imagine that people are again permanently and coercively tracked, but tracking is assigned by a roulette wheel. That is, there would be no association between endowments and later experiences. In these two scenarios we could puzzle out a variety of consequences. Aside from the degradation of freedom taken as an assumption of the counterfactuals, the most obvious implications are that higher sorting would increase the dispersion of various outcome measures and the apparent heritability effect whereas random sorting would decrease outcome dispersion and measured heritability.
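The comparison between the two counterfactuals can be sketched as a toy simulation. Everything in it is an assumption made for illustration: the endowment is a single standard normal score, fraternal twins' endowments correlate at 0.5, tracks are binary, the `sorting` parameter is the probability that a person is tracked by aptitude rather than by roulette wheel, and the effect sizes `direct` and `track_gain` are arbitrary. Heritability is estimated with Falconer's formula (twice the gap between identical and fraternal twin correlations), which is my choice of estimator rather than anything from the podcast.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def outcome(g, sorting, rng, direct=0.5, track_gain=1.0):
    """Small direct effect of endowment g, plus a track effect, plus noise."""
    if rng.random() < sorting:
        track = 1.0 if g > 0 else -1.0    # tracked by aptitude
    else:
        track = rng.choice((1.0, -1.0))   # tracked by roulette wheel
    return direct * g + track_gain * track + rng.gauss(0.0, 1.0)


def simulate(sorting, n_pairs=20000, seed=0):
    """Return (Falconer h^2, outcome variance) for one sorting regime."""
    rng = random.Random(seed)
    half = math.sqrt(0.5)
    mz_a, mz_b, dz_a, dz_b = [], [], [], []
    for _ in range(n_pairs):
        # Identical twins share a single endowment.
        g = rng.gauss(0.0, 1.0)
        mz_a.append(outcome(g, sorting, rng))
        mz_b.append(outcome(g, sorting, rng))
        # Fraternal twins: unit-variance endowments that correlate at 0.5.
        common = rng.gauss(0.0, 1.0)
        dz_a.append(outcome(half * common + half * rng.gauss(0.0, 1.0), sorting, rng))
        dz_b.append(outcome(half * common + half * rng.gauss(0.0, 1.0), sorting, rng))
    h2 = 2.0 * (pearson(mz_a, mz_b) - pearson(dz_a, dz_b))
    everyone = mz_a + mz_b + dz_a + dz_b
    mean = sum(everyone) / len(everyone)
    spread = sum((y - mean) ** 2 for y in everyone) / len(everyone)
    return h2, spread


results = {
    "A: perfect sorting": simulate(1.0),
    "status quo (partial sorting)": simulate(0.5),
    "B: roulette wheel": simulate(0.0),
}
for label, (h2, spread) in results.items():
    print(f"{label:30s} h2 = {h2:.2f}   outcome variance = {spread:.2f}")
```

The direct genetic effect is identical in all three runs; only the social sorting rule changes. Under these made-up parameters the perfect-sorting regime should show both the widest outcome dispersion and the highest measured heritability, the roulette wheel the lowest, and partial sorting something in between.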

When people talk about heritability coefficients being biased as high, they seem to have in mind something like the random sorting model. This model strikes me as only useful as a thought experiment to establish the lower bounds of heritability since in the real world a Harrison Bergeron dystopia isn’t terribly likely. Rather we can think of scenarios that are roughly similar to reality, but vary on some margin. For instance, we can imagine how various policies (e.g., merit scholarships vs. need-based scholarships) might increase or decrease the sorting of genetic endowment and complementary human capital development on the margin and by extension what impact this would have on the distribution and covariation of outcomes.

[Update 10/22/2010: On further reflection, I can think of a scenario where a naive reading of heritability coefficients would still strike me as grossly misleading, even if it were reliable, and I would prefer the “random assignment” counterfactual as “true” heritability. Imagine a society that is genetically homogeneous as to skin pigmentation genes, but where having detached earlobes was a social basis for assigning people to work indoors. In this scenario, there would be non-trivial heritability for skin color even though (by assumption) this society has no variance (and hence no heritability) for genes directly affecting pigmentation. Similarly, imagine a society where children without cheek dimples were exposed to ample lead and inadequate iodine, thereby making the undimpled into a hereditary caste of half-wits even though the genes that create dimples have no direct effect on g. I suppose what I’m getting at is that social mechanisms that select on and magnify genetic endowments are one thing, whereas social processes based on completely orthogonal stigma are another.]
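The earlobe scenario can be sketched the same way. This is a toy under stated assumptions: the earlobe marker is modeled as a heritable liability score crossing a threshold (a modeling convenience, not a claim about earlobe genetics), fraternal twins share half of that liability, pigmentation has no genetic component at all, and heritability is estimated with Falconer's formula (twice the gap between identical and fraternal twin correlations).

```python
import math
import random


def corr(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def twin_pair(share, by_marker, rng):
    """Skin tone for two twins who share `share` of the marker liability."""
    common = rng.gauss(0.0, 1.0)
    tones = []
    for _ in range(2):
        own = rng.gauss(0.0, 1.0)
        liability = math.sqrt(share) * common + math.sqrt(1.0 - share) * own
        detached = liability > 0.0
        # Indoor work is assigned by earlobe, or else by coin flip.
        indoors = detached if by_marker else rng.random() < 0.5
        # Tone depends only on sun exposure plus noise, never on genes directly.
        tones.append((-1.0 if indoors else 1.0) + rng.gauss(0.0, 1.0))
    return tones


def skin_tone_h2(by_marker, n_pairs=20000, seed=1):
    """Falconer estimate of skin-tone heritability under one assignment rule."""
    rng = random.Random(seed)
    mz = [twin_pair(1.0, by_marker, rng) for _ in range(n_pairs)]
    dz = [twin_pair(0.5, by_marker, rng) for _ in range(n_pairs)]
    r_mz = corr([a for a, _ in mz], [b for _, b in mz])
    r_dz = corr([a for a, _ in dz], [b for _, b in dz])
    return 2.0 * (r_mz - r_dz)


h2_stigma = skin_tone_h2(by_marker=True)
h2_random = skin_tone_h2(by_marker=False)
print(f"assigned by earlobe:   h2 = {h2_stigma:.2f}")
print(f"assigned by coin flip: h2 = {h2_random:.2f}")
```

With earlobe-based assignment the estimate should come out substantial even though no gene touches pigmentation; with coin-flip assignment it should hover around zero, which is why the random-assignment counterfactual looks like the better baseline in this particular case.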

  • 1. Michael Bishop  |  October 20, 2010 at 9:35 pm

    Excellent points and great explanation. Jeremy would agree, wouldn’t he?

  • 2. Jay Livingston  |  October 27, 2010 at 12:30 pm

    1. It’s possible I’ve misunderstood, and this is not anything I know much about. But I think Jeremy’s point (and maybe yours too) is that although the association between genetics and outcomes depends on social arrangements, too many discussions ignore the social and instead seize on the heritability coefficient as absolute and unchangeable and therefore something that policies can do nothing about.

    2. On the Billboard thing, I would have thought that the imitation factor was a matter of information. What’s important is not what other stations were playing but what one station thought or knew that other stations were playing. They can get that information by listening, by monitoring other stations’ playlists, maybe by talking with people at other stations, and probably other ways I can’t think of. Or they can get the information by reading Billboard. Billboard info is more complete, but I’m not sure I understand why it would be less endogenous?

    • 3. gabrielrossman  |  October 27, 2010 at 12:51 pm


      1. that’s pretty much right. responsiveness to policy intervention is one way to put it (either for practical proposals or for dystopian thought experiments), but it could also be an issue of social organization generally. the classic example is that near sightedness is natural and largely genetic but can easily be changed (ie, corrected) with eyeglasses, whereas caste systems, currency, etc, are social constructions but remarkably robust.

      2. it sounds like you and i basically agree. if i’m interested in my field’s behavior, then a magazine summarizing the field is merely a passive information channel insofar as it is accurate. to the extent that it is systematically biased, then it becomes intrinsically interesting. see Anand and Peterson’s 2000 Organization Science article on the Soundscan revolution, a technological improvement that basically corrected long-standing distortions in the chart. Also, see this earlier post, the latter half of which is about political struggles over Soundscan-like technologies. further note that even if you don’t have outright errors of the sort that characterized pre-Soundscan Billboard charts, there are still consequential judgement calls. for instance, in aggregating information across stations should a chart count stations equally (as most Mediabase charts do) or weight stations by how many listeners they have (as the Mediabase country chart does)? similarly, in interpreting prestige surveys of universities should US News take a straight average of peer opinions or do some kind of centrality metric so the opinions of elite peers count more? these aren’t questions with a clear right or wrong answer but they still have consequences.
