Posts tagged ‘diffusion’

Climbing the Charts, ch 4

| Gabriel |

A few months ago Stanford’s sociology department was nice enough to invite me up to give a talk on chapter four of Climbing the Charts. This chapter argues that the opinion leadership hypothesis cannot be supported in radio, and in the talk I show a simulation of why we should be skeptical of this hypothesis in general. There’s no video, but here’s an enhanced audio file with slideshow. Also a separate PDF of the slides in case you have problems with the integrated version. (A caveat: I knew I was speaking to a technically sophisticated audience so I let the jargon flow freely; the chapter itself is much easier to follow for people without a networks background.)

Also in shameless plugging news, Fabio’s review at OrgTheory.

September 19, 2012 at 6:00 am

The Diffusion of Innocence

| Gabriel |

[Update 1: From skimming Al Jazeera’s (conveniently date-stamped) blog on this issue for Saturday and Sunday it looks like the protests have slowed considerably, which would imply an s-curve.]

[Update 2: It looks like the prime mover political entrepreneur was the shock artist, who actively tried to get a reaction out of people. That is, this is more similar to Jones threatening to burn Korans than to the Danish imams going on tour with (forged versions of) the Danish cartoons. Of course, as in purely domestic culture wars issues, there can be a strange symbiosis between partisans on both sides who disagree on the merits but mutually benefit from discord.]

I took the Atlantic Wire’s map (also see the KML file) of “Innocence of Muslims” protest, did my best to add dates, and graphed it as a (cumulative) diffusion curve. Pretty much as you’d expect, it shows exponential growth indicating a process of imitation. Note that the curve rises a bit above trend on Friday, but on the other hand it’s not entirely a Friday thing since you do see growth on Wednesday and Thursday too. I’m gonna split the difference and say it’s about half garden variety imitation and half the fact that Friday is the Islamic Sabbath.

Let’s hope the curve starts bumping up against the asymptote soon and goes from exponential to s-curve. On a more pessimistic note, even after this particular issue burns out, the tactic itself of drumming up outrage against an obscure blasphemy will be imitated at some point in the future by some political entrepreneur, just as this itself was almost certainly inspired by earlier similar efforts by other policy entrepreneurs. That is, there is a logic of imitation at both micro and macro, for protests within each scandal and for scandals imitating each other.
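For anyone who wants to tabulate a cumulative curve like this from a list of dated events, here is a minimal command-line sketch. The dates are made up for illustration (they are not the Atlantic Wire data), and the helper name `cumulate` is hypothetical.

```shell
# Made-up input: one line per protest, keyed by ISO date (not the real data).
events='2012-09-11
2012-09-12
2012-09-12
2012-09-13
2012-09-13
2012-09-13
2012-09-14
2012-09-14
2012-09-14
2012-09-14
2012-09-14
2012-09-14'

# cumulate: read dates on stdin, print "date new_events running_total" per day.
cumulate() {
    sort | uniq -c | awk '{ total += $1; print $2, $1, total }'
}

printf '%s\n' "$events" | cumulate
# 2012-09-11 1 1
# 2012-09-12 2 3
# 2012-09-13 3 6
# 2012-09-14 6 12
```

Daily increments that keep growing along with the running total are what exponential growth looks like in a table; increments that shrink as the total nears a ceiling are the signature of an s-curve.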

A caveat: I did my best to get the dates right, but they often aren’t clear, even in the original news story linked in the KML file. Also, thanks to Neal Caren and Matt Frost for pointing me to the file and showing me how to download it.

September 15, 2012 at 5:16 pm

Now These Are the Names, Pt 1

| Gabriel |

There’s a lot of great research on names and I’ve been a big fan of it for years, although it’s hard to make commensurate with my own favorite diffusion models since names are a flow whereas the stuff I’m interested in generally concerns diffusion across a finite population.

Anyway, I was inspired to play with this data by two things in conversation. The one I’ll discuss today is that somebody repeated a story about a girl named “Lah-d,” which is pronounced “La dash da” since “the dash is not silent.”

This appears to be a slight variation on an existing apocryphal story, but it reflects three real social facts that are well documented in the name literature. First, black girls have the most eclectic names of any demographic group, with a high premium put on creativity and about 30% having unique names. Second, even when their names are unique coinages they still follow systematic rules, as with the characteristic prefix “La” and consonant pair “sh.” Third, these distinctly black names are an object of bewildered mockery (and a basis for exclusion) by others, which is the appeal in retelling this and other urban legends on the same theme.*

To tell if there was any evidence for this story I checked the Social Security data, but the web searchable interface only includes the top 1000 names per year. Thus checking on very rare names requires downloading the raw text files. There’s one file per year, but you can efficiently search all of them from the command line by going to the directory where you unzipped the archive and grepping.

cd ~/Downloads/names
grep '^Lah-d' *.txt
grep '^Lahd' *.txt

As you can see, this name does not appear anywhere in the data. Case closed? Well, there’s a slight caveat in that for privacy reasons the data only include names that occur at least five times in a given birth year. So while it includes rare names, it misses extremely rare names. For instance, you also get a big fat nothing if you do this search:

grep '^Reihan' *.txt

This despite the fact that I personally know an American named Reihan. (Actually I’ve never asked him to show me a photo ID so I should remain open to the possibility that “Reihan Salam” is just a memorable nom de plume and his birth certificate really says “Jason Miller” or “Brian Davis”).
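One wrinkle with searches like these: a pattern such as `^Lahd` matches any name that merely starts with those letters, which is handy when hunting for variants but not when checking one exact spelling. Here is a minimal sketch of an exact-match check. It runs against a tiny made-up stand-in for the yobYYYY.txt files, assuming the name,sex,count layout; the rows and the helper name `years_with_name` are hypothetical, not real Social Security data.

```shell
# Made-up miniature of the data: yobYYYY.txt files with name,sex,count rows.
names_dir=$(mktemp -d)
printf 'Gabriel,M,120\nGabriela,F,30\n' > "$names_dir/yob1990.txt"
printf 'Gabriel,M,150\nGabrielle,F,40\n' > "$names_dir/yob1991.txt"

# years_with_name NAME DIR: list each year file with an exact match for NAME.
# The trailing comma ends the match at the name field, so "Gabriel" skips "Gabriela".
years_with_name() {
    grep -l "^$1," "$2"/yob*.txt
}

years_with_name Gabriel "$names_dir"     # lists both files
years_with_name Gabriela "$names_dir"    # lists only yob1990.txt
years_with_name Reihan "$names_dir" || echo "no matches (absent, or under the five-birth threshold)"
```

An empty result cannot distinguish a name nobody has from a name given to fewer than five babies in every year, which is exactly the caveat above.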

For names that do meet the minimal threshold, though, you can use grep as the basis for a quick and dirty time series. To automate this I wrote a little Stata script called grepnames. To call it, you give it two arguments: the (case-sensitive) name you’re looking for and the directory where you put the name files. It gives you back a time series of how many births had that name.

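capture program drop grepnames
program define grepnames
	* Arguments: 1 = name to look up (case-sensitive), 2 = directory of yobYYYY.txt files, ending in a slash
	local name "`1'"
	local directory "`2'"

	* grep every file for lines that start with the name and save the hits
	tempfile namequery
	shell grep -r '^`name'' "`directory'" > `namequery'

	* Each hit is file:name,sex,count so insheet splits it into v1 (file and name), v2 (sex), v3 (count)
	insheet using `namequery', clear
	* Pull the year and the name out of v1
	gen year=real(regexs(1)) if regexm(v1,"`directory'yob([12][0-9][0-9][0-9])\.txt")
	gen name=regexs(1) if regexm(v1,"`directory'yob[12][0-9][0-9][0-9]\.txt:(.+)")
	* Drop longer names that merely start with the query (e.g. Gabriela for Gabriel)
	keep if name=="`name'"
	ren v3 frequency
	ren v2 sex
	* Fill in zeros for sex-year cells with no recorded births
	fillin sex year
	recode frequency .=0
	sort year sex
	* Plot the male and female series
	twoway (line frequency year if sex=="M") (line frequency year if sex=="F"), legend(order(1 "Male" 2 "Female")) title(`"Time Series for "`name'" by Birth Cohort"')
end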

For instance:

grepnames Gabriel "/Users/rossman/Documents/codeandculture/names/"

Note that these numbers are not scaled for the size of the cohorts, either in reality or as observed by the Social Security Administration. (Their data is noticeably worse for cohorts prior to about 1920.) Still, it’s pretty obvious that my first name has grown more popular over time.
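One rough way to scale by cohort size is to divide each year’s count by the total births of that sex listed in the same year’s file. The sketch below does that from the command line. It assumes the name,sex,count layout, the sample rows are invented, and the helper name `name_share` is hypothetical.

```shell
# Made-up miniature of the data: yobYYYY.txt files with name,sex,count rows.
names_dir=$(mktemp -d)
printf 'Gabriel,M,100\nJohn,M,900\nMary,F,1000\n' > "$names_dir/yob1950.txt"
printf 'Gabriel,M,300\nJohn,M,700\nMary,F,500\n' > "$names_dir/yob2000.txt"

# name_share NAME SEX DIR: print "year count share", where share is the name's
# count divided by all listed births of that sex in the same year's file.
name_share() {
    for f in "$3"/yob*.txt; do
        year=$(basename "$f" .txt | sed 's/^yob//')
        LC_ALL=C awk -F, -v name="$1" -v sex="$2" -v year="$year" '
            $2 == sex { total += $3; if ($1 == name) count += $3 }
            END { if (total > 0) printf "%s %d %.4f\n", year, count, count / total }
        ' "$f"
    done
}

name_share Gabriel M "$names_dir"
# 1950 100 0.1000
# 2000 300 0.3000
```

The denominator here is only the births that appear in the files, so it omits names under the five-birth threshold and inherits whatever gaps the early cohorts have. Treat the shares as approximate.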

We can also replicate a classic example from Lieberson of a name that became less popular over time, for rather obvious reasons.

grepnames Adolph "/Users/rossman/Documents/codeandculture/names/"

Next time, how diverse are names over time with thoughts on entropy indices.

(Also see Jay’s thoughts on names, as well as his taking inspiration from my book to apply Bass models to film box office.)

* Yes, I know that one of those stories is true but the interesting thing is that people like to retell it (and do so with mocking commentary), not that the underlying incident is true. It is also true that yesterday I had eggs and coffee for breakfast, but nobody is likely to forward an e-mail to their friends repeating that particular banal but accurate nugget.

August 10, 2012 at 7:11 am

Is Facebook “Naturally Occurring”?

| Gabriel |

Lewis, Gonzalez, and Kaufman have a forthcoming paper in PNAS on “Social selection and peer influence in an online social network.” The project uses Facebook data from the entire college experience of a single cohort of undergrads at one school in order to pick at the perennial homophily/influence question. (Also see earlier papers from this project).

Overall it’s an excellent study. The data collection and modeling efforts are extremely impressive. Moreover I’m very sympathetic to (and plan to regularly cite) the conclusion that contagious diffusion is over-rated and we need to consider the micro-motives and mechanisms underlying contagion. I especially liked how they synthesize the Bourdieu tradition with diffusion to argue that diffusion is most likely for taste markers that are distinctive in both senses of the term. As is often the case with PNAS or Science, the really good stuff is in the appendix and in this case it gets downright comical as they apply some very heavy analytical firepower to trying to understand why hipsters are such pretentious assholes before giving up and delegating the issue to ethnography.

The thing that really got me thinking though was a claim they make in the methods section:

Because data on Facebook are naturally occurring, we avoided interviewer effects, recall limitations, and other sources of measurement error endemic to survey-based network research

That is, the authors are reifying Facebook as “natural.” If all they mean is that they’re taking a fly on the wall observational approach, without even the intervention of survey interviews, then yes, this is naturally occurring data. However, I don’t think that observational necessarily means natural. If researchers themselves imposed reciprocity, used a triadic closure algorithm to prime recall, and discouraged the deletion of old ties, we’d recognize this as a measurement issue. It’s debatable whether it’s any more natural if Mark Zuckerberg is the one making these operational measurement decisions instead of Kevin Lewis.

Another way to put this is to ask where does social reality end and observation of it begin? In asking the question I’m not saying that there’s a clean answer. On one end of the spectrum we might have your basic random-digit dialing opinion survey that asks people to answer ambiguously-worded Likert-scale questions about issues they don’t otherwise think about. On the other end of the spectrum we might have well-executed ethnography. Sure, scraping Facebook isn’t as unnatural as the survey but neither is it as natural as the ethnography. Of course, as the information regimes literature suggests to us, you can’t really say that polls aren’t natural either insofar as their unnatural results leak out of the ivory tower and become a part of society themselves. (This is most obviously true for things like the unemployment rate and presidential approval ratings).

At a certain point something goes from figure to ground and it becomes practical, and perhaps even ontologically valid, to treat it as natural. You can make a very good argument that market exchange is a social construction that was either entirely unknown or only marginally important for most of human history. However at the present the market so thoroughly structures and saturates our lives that it’s practical to more or less take it for granted when understanding modern societies and only invoke the market’s contingent nature as a scope condition to avoid excessive generalization of economics beyond modern life and into the past, across cultures, and the deep grammar of human nature.

We are, God help us, rapidly approaching a situation where online social networks structure and constitute interaction. Once we do, the biases built into these systems are no longer measurement issues but will be constitutive of social structure. During the transitional period we find ourselves in though, let’s recognize that these networks are human artifices that are in the process of being incorporated into social life. We need a middle ground between “worthless” and “natural” for understanding social media data.

December 22, 2011 at 11:07 am

USC Annenberg Talk on Climbing the Charts

| Gabriel |

USC Annenberg posted video of my talk in which I discuss the genre chapter of Climbing the Charts. In the chapter/talk I discuss how genre conventions structure diffusion, using as examples crossover between radio formats and the institutionalization of reggaetón with the growth of the “hurban” format.

October 14, 2011 at 12:27 pm

The Passively Monitored Self and the Death of a “Backstage”

| Gabriel |

Practical advice will follow, but first a rant.

I have previously complained about “social” features that automate how you share information, especially when such features are opt-out rather than opt-in. For instance, I was not enthusiastic about Skype “mood messages” giving your friends and colleagues a play-by-play of what music you listen to, nor was I enamored of a product that would share your browser history.

It’s not as if I’m an introverted recluse either. I have a blog and I correspond pretty actively by e-mail, but the difference is that in these media I actively and deliberately control the flow of information rather than having the prestigious, shameful, and indifferent aspects of my personality and behavior all indiscriminately broadcast to my alters.

I have a fantasy in which Mark Zuckerberg is weeping in his garden when he overhears some neighbor children saying “take and read.” He looks up and notices an old copy of The Presentation of Self in Everyday Life sitting on the table. Tolle lege Mr. Zuckerberg, tolle lege.

Barring such an epiphany, I wouldn’t be surprised if next year’s Facebook Developer’s Conference includes announcements that American Standard is going social to automatically let your friends know when you use the toilet. Or perhaps Vivid will automatically tell all your second cousins and old friends from high school what pornography you’ve purchased. Or Gap brands could let all your friends know what size pants you wear. Visa could post a status update giving the vendor, address, and dollar value every time you buy anything. Because, really, everything’s better when it’s social regardless of whether it’s humiliating or just pointless information overload. It’s a brave new world of web 2.0 social media integration!

Anyway, I was most recently aggravated by Spotify which (like most things nowadays) defaults to over-sharing. Spotify describes this to NPR as “Freeing people from the hassle of actively sharing songs they like [which] will help keep people engaged in their friends’ listening habits without effort.” Some of us prefer to have this “hassle” because the alternative is an uncensored view of our listening habits. As I wrote when Apple added its “Ping” social feature to iTunes:

As a cultural sociologist who has published research on music as cultural capital, I understand how my successful presentation of self depends on me making y’all believe that I only listen to George Gershwin, John Adams, Hank Williams, the Raveonettes, and Sleater-Kinney, as compared to what I actually listen to 90% of the time, which is none of your fucking business.

Anyway, the worst thing about Spotify freeing you from the “hassle” of privacy is that it does so by default and it’s difficult to opt out. You can edit your profile to suppress playlists, but by default they are all revealed and even if you suppress them, new ones created thereafter are revealed. Worse, editing your profile provides no way to suppress “Top Tracks” and “Top Artists” (at least in the Mac client version 0.6.1). After a fair amount of searching (and coming very close to deleting my account entirely), I discovered that it’s fairly easy to totally suppress all of this through the client’s preferences. Just go to the “Spotify” menu and choose “Preferences . . .” then scroll down and uncheck these boxes:

You may now return to the dignity of crafting a public persona that is only loosely coupled to your backstage behavior. Enjoy.

September 27, 2011 at 4:29 am

This Malawian Life

| Gabriel |

Just a quick tip to check out the current episode of This American Life, which is based on the work of my CCPR colleague Susan Watkins on HIV-related gossip in Malawi. Even if you’re not interested in health or development, it’s very interesting for what it says about social networks, diffusion, statistical discrimination, and concealed stigma. The main issue is that people constantly talk about HIV in attempts to figure out who has HIV and thus makes an undesirable sex partner, but I also had a few somewhat idiosyncratic interests:

  1. Information does not just diffuse through social networks in the usual sense of things that would show up in your edge list or sociomatrix but also through space (I’m at the clinic next door to the HIV clinic when you pick up your meds) and through ad hoc collections of people temporarily bound together (a bunch of people on a bus all start speculating about the HIV status of a pedestrian). I consider this more evidence for my belief that network contagion as a mechanism for information flow is over-rated.
  2. A lot of public health programs emphasize the coals to Newcastle policy of “encouraging discussion” and “raising awareness.” These policies were driven by cosmopolitan elites, international NGOs, etc. That is, it’s John Meyer “world society” kind of stuff run amuck.
  3. About a year ago our mutual grad student, Tom Hannan, started a new project that synthesizes Susan’s concerns in #2 with some of my recent theoretical/methodological interests.

August 30, 2011 at 4:37 am
