Archive for October, 2009

Probability distributions

| Gabriel |

I wrote this little demo for my stats class to show how normal distributions result from complex processes that sum the constituent parts whereas count distributions result from complex processes where a single constituent failure is catastrophic.

*this do-file is a simple demo of how statistical distributions are built up from additive vs sudden-death causation
*this is entirely based on a simulated coin toss -- the function "round(uniform())"
*one either counts how many heads out of 10 tosses or how long a streak of heads lasts
*I'm building up from this simple function for pedagogical purposes, in actual programming there are much more direct functions like rnormal()

*1. The normal distribution
*Failure is an additive setback
clear
set obs 1000
forvalues var=1/10 {
	quietly gen x`var'=.
}
forvalues row=1/1000 {
	forvalues var=1/10 {
		quietly replace x`var'=round(uniform()) in `row'
	}
}

gen sumheads=x1+x2+x3+x4+x5+x6+x7+x8+x9+x10
order sumheads
lab var sumheads "How Many Heads Out of 10 Flips"
*show five examples
list in 1/5
histogram sumheads, discrete normal
graph export sumheads.png, replace

*2. Count distribution
*Failure is catastrophic
clear
set obs 1000
forvalues var=1/30 {
	quietly gen x`var'=.
}
gen streak=0
lab var streak "consecutive heads before first tails"
gen fail=0
forvalues row=1/1000 {
	forvalues var=1/30 {
		quietly replace x`var'=round(uniform()) in `row'
		quietly replace fail=1 if x`var'==0
		quietly replace streak=`var' if fail==0
	}
	*retire this row so later passes leave its streak alone
	quietly replace fail=. in `row'/`row'
}
*rows that fail on the first toss are left with a stale streak by the loop, so zero them here
quietly replace streak=0 if x1==0

*show five partial examples
list streak x1 x2 x3 x4 x5 in 1/5
histogram streak, discrete
graph export streakheads.png, replace

*have a nice day
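
For comparison, here is a minimal sketch of a vectorized variant. It is not the do-file above. It relies on the fact that "gen" and "replace" operate on every row at once, so the row loop can be dropped and each toss generated as a whole column. The variable name "alive" is an assumption introduced for this sketch; the other names are reused from the code above.

```stata
*vectorized sketch of the same two demos (illustrative, not the original do-file)
clear
set obs 1000

*1. additive: each toss is a whole column, and heads simply accumulate
gen sumheads=0
forvalues var=1/10 {
	quietly gen x`var'=round(uniform())
	quietly replace sumheads=sumheads+x`var'
}
tabulate sumheads

*2. sudden death: "alive" (a name assumed here) turns 0 at the first tails and stays 0
drop x*
gen streak=0
gen alive=1
forvalues var=1/30 {
	quietly gen x`var'=round(uniform())
	quietly replace alive=0 if x`var'==0
	quietly replace streak=streak+alive
}
tabulate streak
```

Because "alive" never recovers once it hits zero, streak counts only the unbroken run of heads at the start of each row, which is the same quantity the row-by-row loop builds up.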

October 12, 2009 at 5:41 pm

The winner is …

| Gabriel |

WASHINGTON — The American Sociological Association announced today that it is giving the distinguished book award to the prospectus for Climbing the Chart by Gabriel Rossman. In a statement, the ASA prize committee said they were awarding the prize for the prospectus’s “extraordinary efforts to synthesize sociology of culture, economic sociology, and social networks.”

Appearing in his front yard, Professor Rossman said he was “surprised and deeply humbled” by the committee’s decision, mostly because he hasn’t finished writing the book yet. Previous ASA book awards have gone to such completed manuscripts as Charles Tilly’s Durable Inequality. However, Professor Rossman quickly put to rest any speculation that he might not accept the honor. Describing the award as an “affirmation of the production of culture paradigm’s leadership on behalf of aspirations to scientific rigor held by scholars in all sociological subfields,” he said he would accept it as “a call to action.”

“To be honest,” Professor Rossman said, “I do not feel that I deserve to be in the company of so many of the transformative figures who have been honored by this prize, men and women who’ve inspired me and inspired the entire world through their actually finishing writing their books.”

[Update: I see Mankiw made what is essentially the same joke]

October 9, 2009 at 1:26 pm 3 comments

Correlations and sparseness

| Gabriel |

I just published a paper in which the dependent variable was a binary variable with a frequency of about 1%. You can think of a dummy as basically taking a latent continuous distribution and turning it into a step function. When the dummy is sparse, the step occurs at the extreme right tail. Ideally we would have had the underlying distribution itself, but that’s life. Having not just a binary, but a sparse binary, means that it’s really hard to do things like fixed effects, because when almost all of your cases are coded “0” it’s really easy to run into perfect prediction.

Anyway, one of the ways that sparse binary variables are weird is that nothing really correlates with them. A peer reviewer noticed this in the corr-var-cov matrix and asked about this, and it was a very reasonable question since most people don’t have a lot of experience with sparse binary variables and under most circumstances low zero-order correlations with the dependent variable are a sign of trouble. I found that the easiest way to both get a grasp on the issue myself and to explain it to the reviewer was to just demonstrate it with a simple simulation.

 clear
 set obs 100000
 gen x=0
 replace x=1 in 1/50000
 gen y=0
 replace y=1 in 1/1000
 tab y x, cell nofreq chi2
 corr y x
 corr y x, cov

What this simulation is doing is creating a dataset with one common binary trait (x) and one sparse binary trait (y), where the sparse trait is effectively a subset of the common trait. In the coded illustration, the correlation is about .1, which is pretty low, and the covariance is even lower. On the other hand the Chi2 is through the roof, which you’d expect given that Chi2 defines the null by the marginals, and this dataset shows as much association as is possible given the marginals. Here’s a real-world example. Since 1920 there have been hundreds of millions of native-born Americans over the age of 35. Of these people, a little under half were men and 16 have been president, all of them men. For this population there would be a very high Chi2 but a very low correlation between being male and being president of the United States.
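
The numbers in the simulation can also be worked out by hand. The following is a sketch that assumes the Pearson correlation of two dummies equals the phi coefficient of their 2x2 table and that the Chi2 is the uncorrected Pearson statistic. The cell notation (first index for y, second for x) is introduced here for illustration.

```latex
% Cell counts implied by the simulation (first index is y, second is x)
\[ n_{11} = 1000, \quad n_{10} = 0, \quad n_{01} = 49000, \quad n_{00} = 50000, \quad n = 100000 \]

% Correlation of two dummies (phi coefficient)
\[ \phi = \frac{n_{11} n_{00} - n_{10} n_{01}}{\sqrt{n_{1\cdot}\, n_{0\cdot}\, n_{\cdot 1}\, n_{\cdot 0}}}
        = \frac{1000 \cdot 50000 - 0 \cdot 49000}{\sqrt{1000 \cdot 99000 \cdot 50000 \cdot 50000}}
        = \sqrt{\frac{1000}{99000}} \approx 0.1005 \]

% Pearson chi-squared on the same table
\[ \chi^2 = n \phi^2 = \frac{100000 \cdot 1000}{99000} \approx 1010.1 \]

% Covariance (Stata divides by n - 1, which barely changes the value)
\[ \operatorname{cov}(x, y) \approx \frac{n_{11}}{n} - \frac{n_{\cdot 1}}{n} \cdot \frac{n_{1\cdot}}{n}
        = 0.01 - 0.5 \cdot 0.01 = 0.005 \]
```

The same table gives both figures: the Chi2 is the squared correlation multiplied by n, so with 100,000 cases a correlation of about .1 still produces a Chi2 over a thousand.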

All this is a good illustration of why, technically, you’re not supposed to run correlations with dummies. This is one of those rules that we violate all the time and usually it’s not a big problem. Not only is it usually not a problem, but it’s pretty convenient because there is no appealing alternative for showing zero-order associations for all combinations of a mix of continuous and dummy variables. However, when the dummy gets sparse you can run into trouble. Fortunately things like this are pretty easy to explain with a simulation that is similar to your data/model but where the true structure is known by assumption.
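
To see how the trouble grows as the dummy gets sparser, here is a minimal sketch that reruns the simulation at several levels of sparseness while keeping y a strict subset of x. The particular case counts in the loop are assumptions chosen for illustration, not figures from the paper.

```stata
*sketch: correlation between a common dummy and an increasingly sparse subset of it
clear
set obs 100000
gen byte x=(_n<=50000)
foreach k in 50000 10000 1000 100 {
	*y is 1 for the first k rows only, so it always sits inside x
	quietly gen byte y=(_n<=`k')
	quietly corr y x
	display "y==1 in `k' cases: r = " %6.4f r(rho)
	drop y
}
```

In every pass the association is as strong as the marginals allow, yet the correlation should fall from 1 to about .33, .10, and .03 as y thins out, since in this setup it works out to the square root of k/(n-k).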

October 8, 2009 at 4:46 am

A garbage can Moodle of organizational choice

| Gabriel |

My university recently switched from our in-house course management system, ClassWeb, to the off-the-shelf course management system Moodle. I thought ClassWeb was fine but after a few hours of working with Moodle, I wrote to our IT people and asked them to let me just hand-code my own site (using HTML so primitive that it’s more optimized for Lynx than Firefox). Part of this is that I hate the Moodle GUI (I would say that a GUI for web design is a bad idea but the WordPress GUI is great) and part of it is that Moodle is structured to encourage a kind of course website that I think is silly for a traditional course (though probably excellent for distance learning).

I know that my complete opt-out is a bit extreme and I don’t think many other people are likely to do it, but on the other hand neither do I think many faculty are likely to use the system in the way its designers intended. I browsed through a few of my colleagues’ course websites and most of them have just posted the syllabus to the main page, and maybe a few reading links on the weekly subpages. Only one page that I saw seemed to be using it with anything remotely approximating the ambition for which it was designed (though it’s hard to tell because Moodle defaults to walled garden for everything beyond the main page). Thus we seem to have a case of partial adoption with most of the faculty going along with it, but only in a superficial way.

So what are these features for which it was designed? To caricature it slightly, Moodle is premised on the idea that your course website should be Facebook (which I also hate, as it strikes me as the re-AOL-ization of the internet). There are plenty of discussion forums and the whole thing is based on a plug-in API designed for extensibility. Some of these things seem to be pretty useful, for instance the course Moodle pages here automatically link to the course’s reserve reading page at the library. Many of the features are things that were probably exciting to code and/or relevant to distance learning but don’t fit well into traditional pedagogy. For instance, Moodle can host quizzes, something that sounds useful but it will snow in Hell before I assign quizzes to be done at home. There’s so much of this sort of web 2.0 gee-whizzery that other things that are highly salient to the faculty (like, oh, the syllabus) are pushed into a fairly inconspicuous place.

All this is to say that Moodle is the answer to a question that the faculty didn’t ask. Now college faculty have a lot of highly salient pedagogical issues. We complain that students don’t do the reading, that they don’t pay attention in lecture, that they don’t engage in discussion section. I recently spent over an hour discussing grade-grubbing with a colleague. However I have never heard any professor complain that our students don’t create enough user-generated content on the course website. Broadly speaking, I think it’s fair to say that most professors expect the course webpage to be just a place where they can post the syllabus and other materials, basically a replacement for handouts (and the need to give replacement copies of lost handouts).

In contrast if you read much about the Moodle philosophy you’ll see an explicit rejection of the “I teach – you learn” pedagogy and an emphasis on the teacher as a guide for student self-development. Here’s the Wikipedia summary:

The stated philosophy of Moodle includes a constructivist and social constructionist approach to education, emphasizing that learners (and not just teachers) can contribute to the educational experience in many ways. Moodle’s features reflect this in various design aspects, such as making it possible for students to comment on entries in a database (or even to contribute entries themselves), or to work collaboratively in a wiki. In Moodle, standard features often provide for a greater degree of learner-generated content than is found in other learning management systems. For example, the glossary tool can be configured to allow students to actively collaborate and contribute content.

If you read some stuff by some of the university IT advocates for Moodle (but not, I should note, those at my school)* it gets a bit more blunt. For example the InsideHigherEd Technology and Learning Blog writes:

We see the lecture system of teaching as a remnant of a pre-Guttenberg economy and social order of information scarcity. We believe that people learn by doing, by creating, and that the lecture system of passive note-taking and information regurgitation is about the poorest method for learning ever invented.

Oh, I see. I have my doubts about this as an issue of pedagogy (for most types of material), but let’s put aside the question of technical efficacy and consider it as an issue of diffusion in an organizational context. So basically, the IHE guy is suggesting that the organization’s central stakeholders do something that requires acquiring a new skill set and making a substantial time investment which if successful will undermine their status claims. It’s not that I expect most of the faculty to actively resist this, but I highly doubt that they are going to actively embrace Moodle and foster the kind of web-centric constructivist pedagogy it expects us to.

So here’s my confident prediction. Moodle will become increasingly popular because IT people see it as practical (it’s a standard and it’s free) and/or a way to leverage status claims (it implies that the faculty ought to aspire to being web developers soliciting user-generated content instead of lecturers relating their expertise).** Ironically, because most of the faculty don’t care that much about the course website they will go along with Moodle, but only in a very superficial way. The constructivist pedagogy people will claim success on the basis that there are umpteen million Moodle-based pages without noticing that almost every one of these pages consists entirely of a syllabus and a bunch of completely empty discussion forums, weekly subpages, etc.


* I don’t think that the IT staff here are trying to push a particular pedagogy on the faculty. From talking to them and reading the relevant documents I get the impression that they mostly see switching from ClassWeb to Moodle as an issue of harmonizing technical standards and providing capability (including special grants) to those faculty who choose to do an ambitious web 2.0 website. I should also add that our IT people have done a pretty good job of creating tools to migrate content from ClassWeb to Moodle and to make Moodle more closely match the design aesthetics of my school.

** This is an analytical point, not a put-down. One of the running themes of this blog is that I admire techies and think more social science academics should learn how to program. If it’s a choice between Randy Waterhouse and GEB Kivistik, I’m with Randy in a heartbeat. One of my problems with Moodle is it seems to be a case of the Randys embracing their inner Kivistiks.

October 6, 2009 at 5:10 am 3 comments

Online interactional vandalism

| Gabriel |

In the 1999 AJS article “Talking City Trouble” (which is reproduced in revised form as a chapter in Sidewalk), Mitch Duneier and Harvey Molotch take an ethnomethodological/conversation-analysis approach to understanding what they delicately call “attempt to initiate conversations” but what Dave Chappelle more simply calls “hollerin at bitches.” What they show is that the practice is often much more sophisticated than just yelling out something like “damn, look at that ass.”

The marginal men they studied had complex strategies for leveraging basic etiquette such that it would be rude for women to refuse to flirt with them. Most of the women did still refuse to flirt with the men, but the point is that the men’s overtures had been crafted as facially polite, putting the women between an interactional rock and a hard place. For example, when a man comes up and starts playing with your dog, then asks you an innocuous question related to the dog, you are denying his citizenship and humanity to refuse to answer, even though you know no good can come of talking to this guy. Since demanding that a woman flirt with you is itself rude (especially, let’s face it, when the woman has a much higher social status), the authors refer to this normative jujitsu as “interactional vandalism” in that the men are using interactional norms to get something that they normally have no right to expect out of an interaction.

Anyway, I was reminded of this when I recently got a message “from” a friend and colleague asking me to sign up for Feed Share / InfoAxe, a “service” that shares your browser history with your friends (which strikes me as the worst idea since the ice-cream glove). This was not really a personal invitation from my friend and colleague, but an instance of social spam.

We’re all used to getting messages in broken English offering to sell aphrodisiacs or fake Rolexes or trying to rope us into a 419 con to launder the assets of some imaginary embezzler or tyrant. Most of us have learned to delete these messages at an almost precognitive level. Another sort of message that we are increasingly used to is “so and so has added you as a friend” messages from LinkedIn, Facebook, etc. Now these are messages that you cannot simply delete but must deal with. One way to deal with it is to accept the friend request. Another is to do what I usually do and write back a brief message to my friend along the lines of “I’m not accepting the friend request but that’s only because I’m not interested in the service, so please don’t take it as an insult.” Either way, my friends and colleagues (as compared to some random con artist) have a right to expect that I will reply to them.

The message I got from Feed Share played up this sense of obligation in several ways. First, it was not addressed “from” Feed Share but “from” my friend and colleague, which is why I opened the message. Second, the body of the message contained several lines framing the pitch as being less about whether I was interested in their horrible service than about acknowledging the strength of my connection to my friend and colleague. “Is [name here] your friend?” and “Please respond or [name here] may think you said no :( ” This struck me as very compelling and if I didn’t have a blanket preference to avoid those kinds of services I might have clicked the “yes” button, because, yes [name here] is my friend, and I certainly don’t want [name here] to think [I] said no :(. In contrast, I don’t even remotely contemplate whether I would like the business opportunity of helping some tinhorn dictator launder his assets. This is what makes social spam so invidious: it plays against our sense of mutual obligation. If it becomes common enough, I’m just going to reflexively delete stuff like this and assume that my friends got phished and I don’t need to read the message, let alone apologize for not wanting to join them on some crappy service.

October 5, 2009 at 3:49 pm 3 comments
