Posts tagged ‘categorization’

The one-drop ethnicity

| Gabriel |

One of the findings from the 2010 Census that has been the subject of much discussion is the large increase in the Latino population since 2000, most of which is from natural increase rather than immigration. This is obviously a real trend and is to be expected as Latinos tend to be younger (read: many are in their prime fertility years) and to have higher total fertility than do Anglos. However, I was left wondering whether some of this increase might be somewhat exaggerated by how we collect the data.

As is described in Cristina Mora’s fantastic dissertation, during the 1970s there were all sorts of twists and contingencies as “Hispanic” was institutionalized as a category. One of the oddities is that “Hispanic” is not a race, but an “ethnicity” that cross-cuts race. A few decades later, we switched to a “check all that apply” racial taxonomy, but since Hispanic is not a race a Census respondent can check both “black” and “Asian” but not both “Hispanic” and “not Hispanic.” This effectively creates a one-drop rule for Hispanic ethnicity that creates systematic biases in our understanding of Hispanics and has the ironic effect of blinding us to some important ways that this always somewhat artificial distinction is blurring.

My own household consists of an adult Anglo (me), an adult Hispanic white (my wife), and our daughter who the Census Bureau considers a Hispanic white, just the same as if both her parents were Latino. In contrast if my wife were not Mexican but African American, the Census Bureau would make salient this admixture by coding our daughter as both “white” and “black.” Nor is my sort of marriage especially uncommon. In the 2000 Census, 28% of current marriages involving at least one Latino spouse also involved a non-Latino spouse. Thus as a quick and dirty ballpark estimate* we can say that something like a quarter to a third of the natural increase in Latinos involves children who are in some sense “half Latino,” but whom the Census records as simply “Latino.”
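The quick and dirty ballpark can be made explicit with a short back-of-envelope calculation. In this sketch, the 28% exogamy figure is the 2000 Census number cited above; the relative-fertility values are purely illustrative assumptions, per the starred caveats:

```python
# Back-of-envelope for the "quarter to a third" ballpark. The 28% exogamy
# figure is from the 2000 Census as cited above; the relative fertility
# values are illustrative assumptions (footnote assumption b).
exogamous_share = 0.28  # marriages with a Latino spouse that also have a non-Latino spouse

for rel_fertility in (0.8, 1.0, 1.2):  # exogamous vs. endogamous fertility ratio
    mixed_births = exogamous_share * rel_fertility
    share = mixed_births / (mixed_births + (1 - exogamous_share))
    print(f"relative fertility {rel_fertility}: "
          f"{share:.0%} of recorded Latino natural increase is 'half Latino'")
```

Under these assumptions the "half Latino" share of births runs from roughly 24% to 32%, which is where the quarter-to-a-third range comes from.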

Thus implicitly our data have a “one-drop” rule for Latino ethnicity, even as we have moved away from comparable assumptions for racial distinctions. The theoretical question is whether this is an assumption that reflects social reality. In a sense it may be performative — if social institutions treat the offspring of Latino/non-Latino unions as simply “Latino” this may actually push such people towards identifying as Latino. (See a series of fascinating articles by Saperstein and Penner on social institutions and longitudinal racial identity). However I am rather skeptical that a one-drop rule for Latino necessarily reflects reality. Latino has always been a hodge-podge category defined almost entirely by a shared history of the Spanish language and we know that Latinos rapidly lose Spanish language ability when they do not grow up in households with an immigrant member. This Spanish-language loss is especially pronounced for people with a non-Latino parent and absent this defining characteristic we might expect many people who are half Latino and half Anglo to just sort of blend in, with an ethnic identity that is as unmarked and low salience as someone who is Irish/German or Levantine/Scots-Irish. Another way to say this is that we need to be careful about reifying our current arbitrary and historically contingent understanding of ethnicity and understand that a lot of what we see as growth among the Latino minority is really part of an emerging 21st century “beige” majority, no matter what they check on closed-form surveys rooted in the politics, economics, and social sciences of the 1970s.

[Update: Also see similar thoughts from Matt Yglesias].


* Note that extrapolating marriages to births assumes that (a) the patterns are roughly comparable for non-marital fertility and (b) endogamous Latinos have similar fertility to exogamous Latinos.


April 1, 2011 at 5:05 pm 3 comments

Seeing like a state used to see

| Gabriel |

The Census Bureau website has not only the current edition of the Statistical Abstract of the United States but also most of the old editions, going back to 1878. A lot of the time you just need the basic summary statistics, not the raw microdata, so this is a great resource.

It’s also just plain interesting, especially if you’re interested in categorical schema. For instance, in the 1930s revenues for the radio industry are in the section on “National Government Finances” (because this is the tax base), whereas the same figure is now in a section on “Information & Communications.” This suggests a very different conception about what the information is there for and who is meant to consume it for what purposes.

What really surprised me, though, was seeing the deaf or blind treated as commensurable with convicted felons, but there they are in the “Defectives and Delinquents” chapter of editions through the early 1940s. Such a category seems to rest not on a distinction between “people who might deserve help” and “people who might hurt us” but on a causal model of deviations from normality, one that assumes a eugenics/phrenology conception of crime as based on a malformed brain. Given that we now use MRI to study the brains of murderers, the main thing that’s really shocking about the category is the casual harshness of the word “defective,” not the application of a materialist etiology to both social deviance and physical disability.

September 14, 2010 at 4:45 am

More adventures in the reliability of fixed traits

| Gabriel |

[Update: the finding is not robust. See Hannon and Defina]

Last year I described a memo by Gary Gates about some implications of coding errors for gender.

The new Social Problems has an article by Saperstein and Penner that’s comparably weird (and an impressive effort in secondary data analysis). They use data from NLSY on self-reported race over many waves and find that after a prison term an appreciable number of people who always or almost always had called themselves “white” will start calling themselves “black.” The finding seems to be a real effect rather than just random reliability problems as it’s often a sustained shift in identity over subsequent waves and the rate is strongly in one direction after the prison shock. You can’t really demonstrate validity for this kind of thing in a brief summary, but suffice it to say that if you read the article you get the very strong impression that the authors were both careful from the get go and faced demanding peer reviewers.

It’s obviously a weird finding but it makes some sense once you buy that a) identity is somewhat plastic and that b) the racial associations of crime and criminal justice could have an impact on this. One way to understand the results is that many people have multiple ancestries which they can invoke as a kind of ancestral toolkit. For instance Wentworth Miller ’95 (who given his two most famous performances could be seen as the personification of this article) might decide to emphasize his whiteness, Jewishness, blackness, or Arabness, whereas I am limited to the much narrower gamut of identifying as white or Jewish. Largely as a function of with whom I was associating at the time, I have in fact at times self-identified as mostly Irish or as mostly Jewish, but I’d always imagined that I lack the option of plausibly considering myself black. The really weird thing is that this intuition seems to be mistaken, as the SP article finds that the effect is not limited to people whom the authors could identify as being Hispanic or mixed-race but extends to many non-Hispanic whites with no known black ancestry.

One complication to keep in mind is that almost all American blacks have a lot of white (and to a lesser extent, American Indian) ancestry, but we still tend to consider them “black” rather than “mixed race” unless they have a very high proportion of white ancestry and/or a very white phenotype. That is, the American identity=f(ancestry) mapping is pretty bizarre, and it’s hard to puzzle out how it affects findings like those in the SP piece. For this and other reasons I’d love to see this kind of study replicated in a society with racial schemas other than the “one-drop” tradition.

March 23, 2010 at 5:21 am 9 comments

Ok, what’s your second favorite movie?

| Gabriel |

A few journalists have asked me about the recent changes to the “best picture” category at the Oscars. I’m reproducing my answer here. Also see my previous post on the increasing performativity of Oscar bait. Finally, the official version of the article on centrality and spillovers is finally out.

The Academy’s recent changes to the “best picture” category were a reaction to the increasing dominance of the category by obscure films. This has increasingly been an issue since the surprise win of “Shakespeare in Love” at the 1999 Oscars. This film was much less popular than “Saving Private Ryan” but took the Oscar in part because of an aggressive lobbying campaign from Miramax. Ever since, the Oscars have been increasingly dominated by obscure art films rather than big tentpole films for the simple reason that the small films benefit much more from Oscar attention than do the big films. Simply put, “Avatar” won’t sell any more tickets or DVDs if it wins or doesn’t win an Oscar, whereas Oscars could make a huge difference to the box office and DVD sales of films like “Precious” or “The Hurt Locker.” We saw this again last year in that “Slumdog Millionaire” made most of its money after the Oscars.

This dynamic has created a niche for “Oscar bait” films, which are released in November or December and often feature unpleasant (or if you prefer, challenging) material. The best example of this in the current slate is “Precious,” which is not exactly a “fun” movie. The downside for the Academy is that because audiences aren’t very interested in these movies, it depresses attention for the Oscars. Probably the breaking point came last year, when none of the nominated films had made over $40 million domestic in calendar year 2008 while “Dark Knight” (which made over $530 million domestic that year) was not nominated. This despite the fact that “Dark Knight” was not just a popcorn movie but artistically defensible, having been well received by critics who saw in it a lot of interesting themes about morality and moral culpability. In large part because “Dark Knight” was not nominated, the Oscars had poor ratings, lower than many episodes of “American Idol.” The hope was that by expanding the nominee list and changing the voting system, the Academy could ensure that at least a few hits would be nominated and would be likely to win, thereby halting the evolution of the Oscars into a ghetto for obscure art films.

Of course the downside to expanding the nomination list is that it makes it plausible that a few broadly popular films could split the vote and a film with a cohesive minority of supporters could attain a plurality, despite broad distaste. Similar issues are seen every once in a while in politics, when two mainstream political parties split the vote and seats go to an extremist party. To avoid such a possibility, the Academy has adopted an “instant runoff” voting system wherein voters do not just choose their favorite but rank all of their choices. The tally then considers second and third choices until a film achieves a majority. The voting system should work as intended, which is to say it should ensure a consensus pick that most voters are reasonably happy with rather than a divisive pick with a few fervent fans but which is otherwise despised. The instant runoff system is less likely to produce a dark horse win than the old simple plurality system, but it’s worth noting that by Arrow’s impossibility theorem no ranked voting system can satisfy every reasonable fairness criterion at once.
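For concreteness, here is a minimal sketch of instant-runoff tallying (the ballots are hypothetical and the real Academy procedure has additional details, but the elimination-and-transfer logic is the point):

```python
def instant_runoff(ballots):
    """Instant runoff: repeatedly eliminate the film with the fewest
    first-place votes and transfer its ballots to each voter's next
    surviving choice, until some film holds a majority."""
    ballots = [list(b) for b in ballots]  # work on copies
    while True:
        # count current first choices (skipping exhausted ballots)
        tally = {}
        for b in ballots:
            if b:
                tally[b[0]] = tally.get(b[0], 0) + 1
        total = sum(tally.values())
        leader = max(tally, key=tally.get)
        if tally[leader] * 2 > total:  # strict majority
            return leader
        # eliminate the last-place film and strike it from every ballot
        loser = min(tally, key=tally.get)
        for b in ballots:
            if loser in b:
                b.remove(loser)

# hypothetical ballots: a divisive plurality favorite loses to a consensus pick
ballots = ([["Avatar", "Hurt Locker", "District 9"]] * 4 +
           [["Hurt Locker", "District 9", "Avatar"]] * 3 +
           [["District 9", "Hurt Locker", "Avatar"]] * 2)
print(instant_runoff(ballots))
```

Note that under the old simple plurality rule “Avatar” (4 of 9 first-place votes) would have won these hypothetical ballots, whereas the runoff transfers “District 9” ballots and elects the broadly acceptable “Hurt Locker.”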

In theory, one potential problem with instant runoff is strategic voting. With strategic voting, somebody might report a false second choice if they are afraid that their true second choice is likely to beat out their actual favorite. So if my favorite movie is “Hurt Locker” and my second favorite is “Avatar,” I might falsely claim that my second favorite is “District 9” because I see it as a longshot, thereby ensuring that I won’t contribute to “Avatar” beating “Hurt Locker.” In reality, I think this is unlikely to happen to any great extent because strategic voting requires a lot of coordination and the Academy is very strict about policing negative campaigning. In fact, they just censured a producer of “Hurt Locker” for implying that voters should vote against “Avatar.”

March 6, 2010 at 3:46 pm 4 comments

For your consideration

| Gabriel |

So the Oscar nominees were announced at dawn this morning and half of the ten best picture nominees had done appreciable box office. Likewise, many of the other categories had nominees from self-important movies that nobody saw, like “Invictus.” Of course, the reason we have ten best picture nominees is that the Academy realized that the last few years’ Oscars were of no interest to the average moviegoer because the total box office of all the nominees combined was less than the Kelley Blue Book value of a three-year-old Honda Civic. You could just see the Academy board of governors thinking, “If only we had nominated ‘The Dark Knight,’ people might have tuned in to see the vaudevillian razzamatazz of Hugh Jackman!” Of course, there are two reasons why Dark Knight wasn’t nominated. One is that “Oscar bait” now has highly stylized genre conventions (like close-ups of Sean Penn looking constipated) which the Dark Knight didn’t meet (it was more interested in being watchable).

The other is that Dark Knight didn’t really need an Oscar nomination as it already made plenty of money and had good word of mouth. In contrast, your classic Oscar bait movie extends its theatrical run and/or sells more DVDs after a nomination. Oscars are an essential resource for Oscar bait films, which nobody will watch until they are consecrated. When something is valuable, people tend to pursue it, and hence we have the aggressive “for your consideration” Oscar campaigning, as discussed at length a few weeks ago on The Business.

According to industry legend, this began in a really serious way when Miramax elbowed its way to a best picture for “Shakespeare in Love,” displacing the obvious favorite, “Saving Private Ryan.” I realized that I could test this by looking at my data for signs of the development of Oscar performativity. One of the clearest examples of the film industry organizing itself around the Oscars is the concentration of Oscar bait in December. Late-released films are both more salient to Oscar voters (and thus more likely to be nominated) and more likely to still be in theaters in February (and thus better positioned to exploit a nomination). By looking at the interaction between decade of release and day of the year, we can thus track the development of Oscar performativity. As seen below, there was basically no Oscar performativity in the late 30s and early 40s. From the late 40s through the early 90s there was a steady but low level of performativity. Then in the late 90s and early aughts the performativity gets really strong.

So basically, it looks like it really was “Shakespeare in Love” that set the precedent for all the artsy movies being crammed into December. Thanks Bob and Harvey!

Here’s the code:

set matsize 1000
gen decade=year
recode decade 1900/1935=. 1936/1945=1 1946/1955=2 1956/1965=3 1966/1975=4 1976/1985=5 1986/1995=6 1996/2005=7
capture lab drop decade
lab def decade 1 "1936-1945" 2 "1946-1955" 3 "1956-1965" 4 "1966-1975" 5 "1976-1985" 6 "1986-1995" 7 "1996-2005"
lab val decade decade
replace date=round(date)
logit actor_nom female major FPY00 g_drama centrality pWnom pDnom i.decade##c.date
esttab, se
*have a nice day

Here are the results.

female                  0.924***
major                   0.538***
FPY00                  -0.326***
g_drama                 1.814***
centrality             0.0635***
pWnom                   0.253***
pDnom                   1.026***
date                  0.00185**
1b.decade                   0
2.decade               -0.597*
3.decade               -0.450
4.decade              -0.0529
5.decade               -0.220
6.decade               -1.051***
7.decade               -1.994***
1b.decade#c.date            0
2.decade#c.date       0.00362***
3.decade#c.date       0.00442***
4.decade#c.date       0.00325**
5.decade#c.date       0.00348***
6.decade#c.date       0.00465***
7.decade#c.date       0.00766***
_cons                  -12.43***
N                      147908
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001

February 2, 2010 at 4:32 pm 5 comments

Towards a sociology of living death

| Gabriel |

Daniel Drezner had a post a few months ago talking about how international relations scholars of the four major schools would react to a zombie epidemic. Aside from the sheer fun of talking about something as silly as zombies, it has much the same illuminating satiric purpose as “how many X does it take to screw in a lightbulb” jokes. If you have even a cursory familiarity with IR it is well worth reading.

Here’s my humble attempt to do the same for several schools within sociology. Note that I’m not even going to get into the Foucauldian “who’s to say that life is ‘normal’ and living death is ‘deviant’” stuff because, really, it would be too easy. Also, I wrote this post last week and originally planned to save it for Halloween, but I figured I’d move it up given that Zombieland is doing so well with critics and at the box office.

Public Opinion. Consider the statement that “Zombies are a growing problem in society.” Would you:

  1. Strongly disagree
  2. Somewhat disagree
  3. Neither agree nor disagree
  4. Somewhat agree
  5. Strongly agree
  6. Um, how do I know you’re really with NORC and not just here to eat my brain?

Criminology. In some areas (e.g., Pittsburgh, Raccoon City), zombification is now more common than attending college or serving in the military and must be understood as a modal life course event. Furthermore, as seen in audit studies, employers are unwilling to hire zombies, and so the mark of zombification has persistent and reverberating effects throughout undeath (at least until complete decomposition and putrefaction). However race trumps humanity, as most employers prefer to hire a white zombie over a black human.

Cultural toolkit. Being mindless, zombies have no cultural toolkit. Rather the great interest is understanding how the cultural toolkits of the living develop and are invoked during unsettled times of uncertainty, such as an onslaught of walking corpses. The human being besieged by zombies is not constrained by culture, but draws upon it. Actors can draw upon such culturally-informed tools as boarding up the windows of a farmhouse, shotgunning the undead, or simply falling into panicked blubbering.

Categorization. There’s a kind of categorical legitimacy problem to zombies. Initially zombies were supernaturally animated dead, they were sluggish but relentless, and they sought to eat human brains. In contrast, more recent zombies tend to be infected with a virus that leaves them still living in a biological sense but alters their behavior so as to be savage, oblivious to pain, and nimble. Furthermore even supernatural zombies are not a homogeneous set but encompass varying degrees of decomposition. Thus the first issue with zombies is defining what a zombie is and whether it is commensurable with similar categories (like an inferius in Harry Potter). This categorical uncertainty has effects in that insurance underwriters systematically undervalue life insurance policies against monsters that are ambiguous to categorize (zombies) as compared to those that fall into a clearly delineated category (vampires).

Neo-institutionalism. Saving humanity from the hordes of the undead is a broad goal that is easily decoupled from the means used to achieve it. Especially given that human survivors need legitimacy in order to command access to scarce resources (e.g., shotgun shells, gasoline), it is more important to use strategies that are perceived as legitimate by trading partners (i.e., other terrified humans you’re trying to recruit into your improvised human survival cooperative) than to develop technically efficient means of dispatching the living dead. Although early on strategies for dealing with the undead (panic, “hole up here until help arrives,” “we have to get out of the city,” developing a vaccine, etc) are practiced where they are most technically efficient, once a strategy achieves legitimacy it spreads via isomorphism to technically inappropriate contexts.

Population ecology. Improvised human survival cooperatives (IHSC) demonstrate the liability of newness in that many are overwhelmed and devoured immediately after formation. Furthermore, IHSC demonstrate the essentially fixed nature of organizations as those IHSC that attempt to change core strategy (e.g., from “let’s hole up here until help arrives” to “we have to get out of the city”) show a greatly increased hazard of being overwhelmed and devoured.

Diffusion. Viral zombieism (e.g. Resident Evil, 28 Days Later) tends to start with a single patient zero whereas supernatural zombieism (e.g. Night of the Living Dead, the “Thriller” video) tends to start with all recently deceased bodies rising from the grave. By seeing whether the diffusion curve for zombieism more closely approximates a Bass mixed-influence model or a classic s-curve we can estimate whether zombieism is supernatural or viral, and therefore whether policy-makers should direct grants towards biomedical labs to develop a zombie vaccine or the Catholic Church to give priests a crash course in the neglected art of exorcism. Furthermore marketers can plug plausible assumptions into the Bass model so as to make projections of the size of the zombie market over time, and thus how quickly to start manufacturing such products as brain-flavored Doritos.
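The diagnostic can be sketched with a discrete-time version of the Bass model (all parameter values below are invented for illustration): when the external coefficient p dominates, new zombifications peak at the outset, whereas when the imitative coefficient q dominates, they trace the familiar s-curve with an interior peak.

```python
def bass_adoption(p, q, M, T):
    """Discrete-time Bass model: new adopters at each step come from
    external influence p (e.g., every grave opening at once) plus
    imitative influence q (e.g., bites from existing zombies)."""
    cum, new = 0.0, []
    for _ in range(T):
        n = (p + q * cum / M) * (M - cum)  # adoption hazard times remaining population
        new.append(n)
        cum += n
    return new

M = 10_000  # assumed ultimate zombie population
viral = bass_adoption(p=0.001, q=0.5, M=M, T=30)        # imitation-dominated
supernatural = bass_adoption(p=0.4, q=0.1, M=M, T=30)   # external-shock-dominated

# viral zombieism peaks mid-epidemic; supernatural zombieism peaks immediately
print(viral.index(max(viral)), supernatural.index(max(supernatural)))
```

Fitting observed counts of the walking dead to this curve and inspecting the estimated p and q is the proposed vaccine-versus-exorcism test.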

Social movements. The dominant debate is the extent to which anti-zombie mobilization represents changes in the political opportunity structure brought on by complete societal collapse as compared to an essentially expressive act related to cultural dislocation and contested space. Supporting the latter interpretation is that zombie hunting militias are especially likely to form in counties that have seen recent increases in immigration. (The finding holds even when controlling for such variables as gun registrations, log distance to the nearest army administered “safe zone,” etc.).

Family. Zombieism doesn’t just affect individuals, but families. Having a zombie in the family involves an average of 25 hours of care work per week, including such tasks as going to the butcher to buy pig brains, repairing the boarding that keeps the zombie securely in the basement and away from the rest of the family, and washing a variety of stains out of the zombie’s tattered clothing. Almost all of this care work is performed by women and very little of it is done by paid care workers as no care worker in her right mind is willing to be in a house with a zombie.

Applied micro-economics. We combine two unique datasets, the first being military satellite imagery of zombie mobs and the second records salvaged from the wreckage of Exxon/Mobil headquarters showing which gas stations were due to be refueled just before the start of the zombie epidemic. Since humans can use salvaged gasoline either to set the undead on fire or to power vehicles, chainsaws, etc., we have a source of plausibly exogenous heterogeneity in showing which neighborhoods were more or less hospitable environments for zombies. We show that zombies tended to shuffle towards neighborhoods with low stocks of gasoline. Hence, we find that zombies respond to incentives (just like school teachers, and sumo wrestlers, and crack dealers, and realtors, and hookers, …).

Grounded theory. One cannot fully appreciate zombies by imposing a pre-existing theoretical framework on zombies. Only participant observation can allow one to provide a thick description of the mindless zombie perspective. Unfortunately scientistic institutions tend to be unsupportive of this kind of research. Major research funders reject as “too vague and insufficiently theory-driven” proposals that describe the intention to see what findings emerge from roaming about feasting on the living. Likewise IRB panels raise issues about whether a zombie can give informed consent and whether it is ethical to kill the living and eat their brains.

Ethnomethodology. Zombieism is not so much a state of being as a set of practices and cultural scripts. It is not that one is a zombie but that one does being a zombie such that zombieism is created and enacted through interaction. Even if one is “objectively” a mindless animated corpse, one cannot really be said to be fulfilling one’s cultural role as a zombie unless one shuffles across the landscape in search of brains.

Conversation Analysis.

1  HUMAN:    Hello, (0.5) Uh, I uh, (Ya know) is anyone in there?
2  ZOMBIE1:  Br:ai[ns], =
3  ZOMBIE2:       [Br]:ain[s]
4  ZOMBIE1:              =[B]r:ains
5  HUMAN:    Uh, I uh= li:ke, Hello? =
6  ZOMBIE1:  Br:ai:ns!
7  (0.5)
8  HUMAN:    Die >motherfuckers!<
9  SHOTGUN:  Bang! (0.1) =
10 ZOMBIE1:  Aa:ar:gg[gh!]
11 SHOTGUN:         =[Chk]-Chk, (0.1) Bang!

October 13, 2009 at 4:24 am 21 comments

Underneath it all

| Gabriel |

A few years ago I had a friendly argument with Jenn Lena and Pete Peterson about their ASR article on genre trajectories. While I generally love that article, my one minor quibble is their position that there is such a thing as non-genre music, and in particular that “pop” can be considered unmarked, in genre terms. They write “Not all commercial music can be properly considered a genre in our sense of the term.” They exclude Tin Pan Alley (showtunes) and go on to write that, “Much the same argument holds for pop and teen music. At its core, pop music is music found in Billboard magazine’s Hot 100 Singles chart. Songs intended for the pop music market usually have their distinguishing genre characteristics purposely obscured or muted in the interest of gaining wider appeal.”

Myself, I disagree with treating pop as beyond genre. First, the Hot 100 is an aggregate without any real meaning as a categorical marker. I find it interesting that in radio it’s increasingly prevalent to refer to “Top 40” as “Contemporary Hits Radio,” in recognition of the fact that in the literal sense top 40 hasn’t existed for decades, that many bands who are very popular would nonetheless not get played on CHR, and that many acts (think Britney Spears) only get played on CHR, implying that CHR is itself a genre of what we might call “high pop.” Billboard itself distinguishes between the Hot 100 (whatever is really popular, regardless of genre) and Top 40 Mainstream (CHR).

Second, and more importantly, it is impossible to have non-genre music in the same way that it is impossible to have language-less speech if you take the Howard Becker perspective that genre is about having sufficient shared understandings and expectations to allow coordination between actors. Consider the fact that most genres work on the Buddy Holly model of long-lasting bands who write their own songs, whereas high pop almost exclusively involves project-based collaborations of songwriters, session musicians, producers, and (most salient to the audience) singers. Since standards are especially important when collaborations are ephemeral, coordination through strong shared expectations is more important in high pop than in genre music. Likewise, high pop sounds more monotonous than much genre-based music. Furthermore, high pop is not merely the baseline, but involves specialized skills and techniques (e.g., vocal filters) not found in “genres.”

For the most part this issue is orthogonal to the argument they present in the article (which is why I like the article despite this dispute) but I think it potentially creates problems for the IST (Industry-> Scene-> Traditional) trajectory, most of which involves a spin-off of high pop music (as is seen most clearly with the Nashville Sound, which was basically Tin Pan Alley with cowboy hats). In response to this Pete said that there is a distinction between pop and genre in that with pop change is gradual and more Lamarckian than the creative destruction and churn seen with genres. I think this definition is fair enough; certainly it’s highly relevant to their purposes. So the question of whether it is possible to have non-genre music ultimately comes down to whether you choose to emphasize churn or shared expectations as the defining feature of genre.

Anyway, I was reminded of this discussion a few days ago when my wife and I went to see No Doubt. This band has had 8 singles on the Billboard Hot 100 chart and has had multiple singles on four different Billboard format charts (rhythmic, CHR, adult, modern rock), so I think they are a fair candidate for what Jenn and Pete have in mind as “pop.” However the performance I attended made it apparent that at their core they are ultimately still a ska band. Most obviously, during one of Gwen’s costume changes the band did a cover of The Specials’ arrangement of “Guns of Navarone” and when she came back she was wearing what can only be described as a two-tone sequined romper and later on she wore a metallic Fred Perry shirt and braces (worn hanging). More generally all of their dancing was based on ska steps, their rhythm section dominates their lead guitar, and they had a horn section and keyboard (tuned as an organ).

In a sense, I think you can take No Doubt as a vindication of what Jenn and Pete are arguing. Here you have a band that started out within genre music but graduated into commercial success by recording unmarked pop. Note that their return to ska/dancehall with “Rock Steady” didn’t sell nearly as many copies as the mostly pop albums “Tragic Kingdom” and “Return of Saturn”. However there’s also the interesting fact that when Gwen decided to dive headfirst into high pop, she did so as a “solo” act, which in effect meant that she went from collaborating with Tony Kanal to doing so with Dr Dre and the Neptunes. I take Gwen’s solo career as a vindication for my perspective, the idea being that going into high pop involves not just the negative act of losing the markings and skills of genre and becoming generic music (which presumably Kanal could have done), but the positive act of acquiring the markings and skills of high pop (which required soliciting the efforts of high pop specialists like the Neptunes).

Special bonus armchair speculation!

Compare and contrast No Doubt and Dance Hall Crashers. Both are up-tempo California ska bands that started in the late 80s and have girl singers (two of them in the case of DHC). Although this is necessarily disputable, I would submit that c. 1995 (when No Doubt broke), DHC was the more talented band. Likewise, DHC has the better pedigree, being (along with Rancid) the successors to Operation Ivy. So why is it that Gwen Stefani rather than Elyse Rogers or Karina Denike is the one who ultimately became a world class pop star and an entrepreneur of overpriced designer fauxriental baby clothes?

I have three speculations, listed below in rough order of how much credence I give each of them:

  1. Looking for an explanation is futile because cultural markets are radically stochastic. If you have two talented bands it is literally impossible to predict ex ante which will become popular and in some alternate universe DHC are gazillionaires whereas No Doubt is known only to aficionados of California 90s music.
  2. Jenn and Pete are right and the issue is that No Doubt was better at transcending genre. Noteworthy in this respect is that basically all of DHC’s music is skacore whereas from their very first recordings No Doubt has always included elements of disco and pop, including AC-friendly Tin-Pan-Alley-esque ballads like “Don’t Speak” that it’s pretty hard to imagine DHC playing.
  3. There’s a cluster economy explanation in that No Doubt is from Orange County (which c. 1994 was supposed to be the next Seattle) whereas DHC is from the East Bay.

June 23, 2009 at 5:42 am 1 comment

p(gay married couple | married couple reporting same sex)

| Gabriel |

Over at Volokh, Dale Carpenter reproduces an email from Gary Gates (whom, unfortunately, I don't know personally, even though we're both faculty affiliates of CCPR). In the email, Gates disputes a Census report on gay couples that Carpenter had previously discussed, arguing that many of the "gay" couples were actually straight couples whose gender was miscoded. This struck me as pretty funny, in no small part because in grad school my advisor used to warn me that no variable is reliable, not even self-reported gender. (Paul, you were right). More broadly, this points to the problems of studying small groups. (Gays and lesbians are about 3% of the population; the famous 10% figure is a myth based on Kinsey's use of convenience/purposive sampling).

Of course the usual problem with studying minorities is how to recruit a decent sample size in a way that still approximates a random sample drawn from the (minority) population. If you take a random sample of the population and then use a screening question ("do you consider yourself gay?"), you face a lot of expense, and if the screener involves stigma you also face refusal problems, since refusal and social desirability bias will be higher on a screener than if the same question is asked later in the interview. On the other hand, if you direct your sample recruitment to areas where your minority is concentrated, you'll save a lot of time but you'll only get members of the minority who experience segregation, which is unfortunate since gays who live in West Hollywood are very different from those who live in Northridge, American Indians who live on reservations are very different from those who live in Phoenix, etc. Both premature screeners involving stigma and recruitment by concentrated area are likely to recruit unrepresentative members of the group on such dimensions as salience of the group identity.

These problems are familiar nightmares to anyone who knows survey methods. However the issue described by Gates in response to Carpenter (and the underlying Census study) presents a wholly new issue: when you are dealing with a small class you can have problems even if sampling is not a problem and even if measurement error in defining the class is minimal. Really this is the familiar Bayesian problem that when the baseline probability of an event is low, even reasonably accurate measures can lead to false positives outnumbering true positives. The usual example given in statistics/probability textbooks is that if few people actually have a disease and you have a very accurate test for it, the large majority of people who initially test positive will nonetheless turn out to be healthy. Similarly, if straight marriages are much more common than gay marriages, then most so-called gay marriages can be coding errors of straight marriages even if the odds of a miscoded household roster for any given straight marriage are very low.
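To make the arithmetic concrete, here is a minimal Bayes-rule sketch in Python. All of the rates are hypothetical, chosen only to illustrate how a low base rate lets false positives swamp true positives:

```python
# Bayes-rule sketch of the miscoded-couples problem.
# Every number below is hypothetical, for illustration only.
p_gay = 0.005        # share of married couples that are actually same-sex
p_miscode = 0.01     # chance a straight couple's roster miscodes a spouse's sex
p_correct = 0.999    # chance a gay couple's roster is recorded correctly

# P(recorded as same-sex) = true positives + false positives
p_recorded_same_sex = p_gay * p_correct + (1 - p_gay) * p_miscode

# P(actually gay | recorded as same-sex), by Bayes' rule
posterior = p_gay * p_correct / p_recorded_same_sex
print(f"P(actually gay | recorded same-sex) = {posterior:.2f}")
```

With these made-up rates, only about a third of the couples recorded as same-sex are actually same-sex couples, even though any given straight couple's roster is miscoded only 1% of the time.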

June 22, 2009 at 8:03 am

Graphing novels

| Gabriel |

Via Andrew Gelman, I just found this link to TextArc, a package for automated social network visualization of books. Gelman says that (as is often true of visualizations) it’s beautiful and a technical achievement but it’s not clear what the theoretical payoff is. I wasn’t aware of TextArc at the time, but a few years ago I suggested some possibilities for this kind of analysis in a post on orgtheory in which I graphed R. Kelly’s “Trapped in the Closet.” My main suggestion at the time was, and still is, that social network analysis could be a very effective way to distinguish between two basic types of literature:

  1. episodic works. In these works a single protagonist or small group of protagonists continually meets new people in each chapter. For example The Odyssey or The Adventures of Huckleberry Finn. The network structure for these works will be a star, with a single hub consisting of the main protagonist and a bunch of small cliques each radiating from the hub but not connected to each other. There would also be a steep power-law distribution for centrality.
  2. serial works. In these works an ensemble of several characters, both major and minor, all seem to know each other but often with low tie strength. For example Neal Stephenson’s Baroque Cycle. This kind of work would have a small-world structure. Centrality would follow a Poisson distribution, but there’d be much higher dispersion in the frequency of repeated interaction across edges.

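As a rough illustration of how a coder could separate the two structures, here is a pure-Python sketch that measures how concentrated a character network's edges are on its single most central node. The character names and edge lists are invented toy data, not output from TextArc:

```python
from collections import Counter
from itertools import combinations

def degree_concentration(edges):
    """Share of all edge endpoints that touch the single most central node.
    High values suggest a star (episodic); low values a small world (serial)."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return max(deg.values()) / sum(deg.values())

# Toy "episodic" plot: one hub meets a new clique each chapter
episodic = [("hero", x) for x in ["cyclops", "circe", "sirens", "calypso"]]
# Toy "serial" plot: a five-character ensemble where everyone knows everyone
serial = list(combinations(["a", "b", "c", "d", "e"], 2))

print(degree_concentration(episodic))  # hub touches every edge
print(degree_concentration(serial))    # endpoints spread evenly
```

The hub-and-spoke network scores well above the ensemble network on this statistic; a real implementation would build the edge list from co-occurrences of proper nouns within chapters and could add tie strength from repeated co-occurrence.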
An engine like TextArc could be very powerful at coding for these things (although I’d want to tweak it so it only draws networks among proper nouns, which would be easy enough with a concordance and/or some regular expression filters.) Of course as Gelman asked, what’s the point?

I can think of several. First, there might be strong associations between network structure and genre. Second, we might imagine that the prevalence of these network structures might change over time. A related issue would be to distinguish between literary fiction and popular fiction. My impression is that currently episodic fiction is popular and serial fiction is literary (especially in the medium of television), but in other eras it was the opposite. A good coding method would allow you to track the relationship between formal structure, genre, prestige, and time.

April 26, 2009 at 1:12 pm 3 comments
