A Note on the Uses of Official Statistics

November 4, 2009 at 1:46 pm 4 comments

| Gabriel |

They are ourselves, I replied; and they see only the shadows of the images which the fire throws on the wall of the den; to these they give names, and if we add an echo which returns from the wall, the voices of the passengers will seem to proceed from the shadows.  — Plato

One of the points I like to stress to my grad students is that data is not an objective (or even unbiased) representation of reality but the result of a social process. The WSJ had a story recently on how we get the “jobs created or saved” figures around the stimulus bill, and it makes me want to burn my Stata DVD, take a two-hour shower, and then switch to qualitative methods where at least I know that I would be responsible for any validity problems in my work.

The idea of “jobs created or saved” by a government policy is a meaningful concept in principle but in practice it’s essentially impossible to reckon with any certainty. It’s the kind of problem you might be able to approach empirically if it happened many times and there was some relatively exogenous instrument, but in a single instance you’re probably better off using an answer derived from theory than actually trying to measure it. Nonetheless the political process demands that it be answered empirically and the results are absurd.

The way the government has tried to measure “jobs created or saved” by the stimulus is by simply asking contractors or subcontractors how many jobs were created or saved in their firm by the contract. This involves both false positives of contractors exaggerating the number of jobs they created or saved and false negatives of firms that were not direct beneficiaries of contracts but increased or retained production in expectation of benefiting from the multiplier. In the case covered by the WSJ, a shoe store that sold nine pairs of boots for $100 each to the Army Corps of Engineers didn’t know what else to put and so said they saved nine jobs. When asked about this by the WSJ, the shoe store owner’s daughter/bookkeeper replied:

“The question, I would like to know is: How do you answer that? Did we create zero? Is it creating a job because they have boots and go out and work for the Corps? I would be really curious to hear how somebody does create a job. The formula is out there for anyone to create, and it’s just so difficult,” she said.

Who’d a thunk it, but apparently FA Hayek was reincarnated as a shoe store worker in Kentucky.

(h/t McArdle)



  • 1. Michael Bishop  |  November 16, 2009 at 4:55 pm

    Gabriel, I absolutely share your frustration when officials, pundits, or anybody else provides a number without giving any sense of the difficulties of getting an unbiased estimate. Instead of framing the problem as the limits of empiricism, I would say the problem is that a lot of people say things with more certainty than is justified because there are few consequences for being wrong. This is a special case.

    • 2. gabrielrossman  |  November 16, 2009 at 5:42 pm

      that’s fair enough, there’d be nothing wrong with the estimate were it accompanied by appropriate error bars or (what I think is really the case in this instance) hedges of uncertainty.
      on the other hand though, there’s the subtler question of whether the hedge would really be appreciated. qualitative analyses of fields where people make decisions on the basis of quantitative data show that these people are aware of data uncertainty in principle but in practice are very prone to reifying naive readings of the data, especially when those estimates are convenient. for instance, the advertising market is sensitive to measured changes in audience size that are well within the margin of error for audience surveys (Napoli 2003), television network executives reify pre-testing when it is convenient and ignore it when it is not (Gitlin 1983), and the music industry reacted to improvements in sales measurement as if there were actual changes in the market rather than just corrections of long recognized problems in our ability to perceive it accurately (Anand and Peterson 2000).

      • 3. Michael Bishop  |  November 17, 2009 at 5:11 pm

        Now that is a very interesting point. Individuals clearly use data to rationalize the decision they would have made anyway. Why is it so hard to create a norm that people attempt to honestly quantify their uncertainty?

  • 4. An approximate rant « Code and Culture  |  July 5, 2010 at 2:41 pm

    […] why not just say “about ten”? Because people will ignore the “about.” In a comment thread discussion I noted that even people who know better tend to reify point estimates. For instance, the […]
