| Gabriel |
Rod Dreher at The American Conservative has a post on people invoking the concept of “social construction” with his lead example being a speech and debate team that always changes the subject to a critical race theory rant about the conventions of debate itself, even if the pre-specified debate topic is about national service or green energy or whatever. The judge then awards the match to this non sequitur, invoking “social constructionism” to explain himself.
I can get angry about this on a whole other level than Dreher does, precisely because I think social construction is a valuable concept. And I really do take the concept seriously. My PhD training is as a neo-institutionalist (ie, how organizational practices are socially constructed), I have an ASR on market information regimes (ie, how socially constructed market data shapes market behavior), and my current project is on relational work (ie, how exchange is socially constructed as market or social). I also advise grad students on these sorts of topics. So it’s not like I’m some angry epistemological realist who goes around giving swirlies to phenomenologists.
Social construction is a really useful concept, but unfortunately it has the misfortune of being popular with idiots who don’t really understand it. When this sort of person says “x is socially constructed” the implication is “therefore we can ignore x.” When I lecture on social constructionism I ridicule this sort of thing as “ruby slippers” social constructionism, as if your sociology professor tells you “why Dorothy, you’ve had the power to solve inequality all along, just click your heels three times and say ‘race is a social construct,’ ‘race is a social construct,’ ‘race is a social construct.’” If you really grok social constructionism, the appropriate reaction to somebody invoking the concept in almost any practical context is to shrug and say “your point being?” If you actually read Berger and Luckmann rather than just get the gist of it from some guy with whom you are smoking weed, you’ll see that the key aspects of social constructionism are intersubjectivity and institutions. That is, social construction is important because social interaction is premised on shared conventions and becomes deeply codified to the extent that for most purposes it might as well be objective.
Suppose you had two contractors bidding on remodeling your kitchen. One of them says that it will be done in X days, involving Y materials, and cost you $Z. The other gives you a fascinating (but at times dubious) lecture about whether time exists in the abstract or only relative to perception, the ugly history of exploitation in the formica industry, and the chartalist theory of money. You then go back to the first contractor, who is bewildered and has no rebuttal to the second contractor’s very, um, creative arguments. You would have to be an idiot to award the bid to the second contractor, even if you think they are right about everything they said. As it happens, I actually believe that time, kitchen materials, and money are all socially constructed. But kitchen remodeling is itself a social construct, and one of the conventions of that particular social construct is that you talk about things like time, material, and price rather than offer a critical perspective on the same.
| Gabriel |
This morning Governor Chris Christie endorsed Donald Trump for president. There was widespread speculation that this reflected Christie hoping for an appointment as Attorney General in the event of a Trump victory. This was met with widespread disgust from mainstream conservative intellectuals, all of whom despise Trump (and immediately prior to the endorsement were delighting in Rubio having learned to fight Trump at his own insult comic game). Over on Twitter, Josh Barro observed that it is precisely Trump’s outsider nature that makes endorsing him attractive for an ambitious Republican politician.
This struck me as very astute and reminded me of Gould’s 2002 AJS on The Origins of Status Hierarchies. This model starts with a cumulative advantage model for status. The trick with cumulative advantage models though is to avoid their natural tendency towards absolute inequality and so the models always have some kind of braking mechanism so the histogram ends up as a power-law, not a step function. For instance, Rosen 1981 uses heterogeneity of taste and diminishing marginal returns to avoid what would otherwise be the implication of his model of exactly one celebrity achieving universal acclaim. Anyway, the point is that cumulative advantage models need a brake, and Gould’s brake is reciprocity. Gould observes that attention and resources are finite and so when someone has many followers, they lose the ability to reciprocate with them. To the extent that followers are attentive not only to the status of a patron, but the attention and resources the patron reciprocates, then their high numbers of followers will swamp the ability of high status patrons to reciprocate and so inhibit their ability to attract new followers. For instance, a grad student might rationally prefer to work with an associate professor who has only a few advisees and so can spend several hours a week with each of them than with a Nobel Laureate who has so many advisees he doesn’t recognize some of them in the hallway.
In this sense, Rubio as the clear favorite of the party establishment has already recruited great masses of political talent. Should Rubio win in November, he will have an embarrassment of riches in terms of followers with whom to fill cabinet positions and other high-ranking political roles. That is to say, Rubio’s ability to reciprocate the support of his followers is swamped by the great number of followers he has acquired. (I’m talking about followers among the sorts of people likely to be appointed to administration positions, I’ll get to voters later). This then makes some potential followers decide to affiliate with a patron who is not too busy for them, and hence Chris Christie is hoping to spend the next eight years building RICO cases against people who use the term “short-fingered vulgarian.”
But, there’s a problem with this, which is that status itself provides resources, especially in a system where power is not continuous but winner-take-all. (The discontinuity is really important, as Schilke and I argued recently). In this sense, it shouldn’t matter that a candidate with few endorsements has the fewest supporters competing for patronage because that candidate would lose and so not have patronage to allocate. That would be true if the political science model nicknamed “the party decides” (which we can generalize as the endogeneity of status competition) were true. But if that model were true, we would be seeing Rubio (who recruited the most intellectuals) or Jeb! (who raised the most money) as the clear front-runner and that is anything but the case since the GOP primary this cycle has been consistently dominated by outsiders (Trump, briefly Carson, and even Cruz, who is a senator but a notably un-collegial one).
This then suggests that we have to recognize that power, including the ability to allocate resources to followers, is not necessarily a function of how many followers one has. In ordinary times it might be, especially in the Republican party which normally follows the party decides model. However in this year it is clear that popularity in opinion polls and primaries/caucuses has no (positive) correlation with establishment support. This may be because Trump, like Lenin, is a figure of such immense charisma that he can defy the models. Or it may be that the base is revolting over a substantive issue like immigration. Or maybe the support of neo-Nazis with a bizarre interest in anime and the Frankfurt school is the secret sauce. Whatever the exact nature of why the party decides model is breaking, the fact is that it is. The Republican primary reminds me of Bourdieu’s model of a field of mass cultural production and a restricted field of production. Rubio is clearly dominating in the restricted field of elite conservative opinion, but that does him very little good considering how effective Trump is at the mass field. If we view the competition for endorsements not as an isolated system, but one that is loosely coupled to an adjacent system of competition for voters, then the status competition for endorsements is no longer entirely endogenous but there is a source of exogenous power shaping it. (In the Gould model this would be subsumed as part of Q_j). Hence Trump’s great popularity with voters despite his great unpopularity with party elites makes him more attractive than he would otherwise be to party elites who will break ranks and affiliate with the demagogue.
In Trump’s case, his fame, wit, and shamelessness have gained him the support of voters and this has disrupted the otherwise endogenous system of endorsements. However, the model could generalize to any source of power outside of the endogenous process of consensus building within party elites. A very similar model would apply to those political actors who welcome a foreign invader as supporters in domestic disputes they would otherwise lose. Americans take for granted that the opposition party will be a loyal opposition and so we abide by the maxim that “politics ends at the water’s edge,” which is why periods like the Second Red Scare (or from the other perspective, the Popular Front that preceded it) seem so anomalous. However for centuries, machinations to set yourself up as a client-state after relying on imperial powers to depose the current batch of elites were most of what politics was. In such a scenario, a political actor who lacks much power within the internal dynamics of oligarchy could still acquire followers if they seemed to be favored by the forces massing across the border. So we might expect a lot of ambitious mitteleuropean politicians to affiliate with heretofore minor fascist parties c 1938, or with heretofore minor communist parties c 1943.
| Gabriel |
For each quote, guess the source: a classic of gift exchange or a Los Angeles Times article about deposed Sheriff and soon to be plea bargainee, Lee Baca. Highlight the text to see the answers and score your quiz!
“Until he has given back, the receiver is ‘obliged,’ expected to show his gratitude towards his benefactor or at least to show regard for him, go easy on him, pull his punches…” (Bourdieu Logic of Practice)
“The etiquette of the feast, of the gift that one receives with dignity, but is not solicited, is extremely marked among these tribes.” (Mauss The Gift)
“I don’t solicit any gifts. I’ve never asked for a gift.… People just do it for me.” (Los Angeles Times)
“When you’re taking gifts from strangers, there’s only one reason. They only give gifts because they want something.” (Los Angeles Times)
“These, however, are but the outward signs of kindness, not the kindnesses themselves.” (Seneca Benefits)
“What they’re expressing is appreciation for the respectful way we do business.” (Los Angeles Times)
“No one is really unaware of the logic of exchange … but no one fails to comply with the rules of the game, which is to act as if one did not know the rule.” (Bourdieu Pascalian Meditations)
“Nobody is free to refuse the present that is offered.” (Mauss The Gift)
“My life would be much easier if people did not give me gifts.” (Los Angeles Times)
| Gabriel |
As long-time readers will remember, I have been collecting Twitter with the R library(twitteR). Unfortunately that workflow has proven to be buggy, mostly for reasons having to do with authentication. As such I decided to learn Python and migrate my project to the Twython module. Overall, I’ve been very impressed by the language and the module. I haven’t had any dependency problems and authentication works pretty smoothly. On the other hand, it requires a lot more manual coding to get around rate limits than does twitteR and this is a big part of what my scripts are doing.
I’ll let you follow the standard instructions for installing Python 3 and the Twython module before showing you my workflow. Note that all of my code was run on Python 3.5.1 and OSX 10.9. You want to use Python 3, not Python 2, as tweets are UTF-8. If you’re a Mac person, OSX comes with 2.7 but you will need to install Python 3. For the same reason, use Stata 14 for tweets.
One tip on installation: pip tends to default to 2.7, so use this syntax in bash.
python3 -m pip install twython
I use three py scripts, one to write Twython queries to disk, one to query information about a set of Twitter users, and one to query tweets from a particular user. Note that the query scripts can be slow to execute, which is deliberate as otherwise you end up hitting rate limits. (Twitter’s API allows fifteen queries per fifteen minutes). I call the two query scripts from bash with argument passing. The disk writing script is called by the query scripts and doesn’t require user intervention, though you do need to be sure Python knows where to find it (usually by keeping it in the current working directory). Note that you will need to adjust things like file paths and authentication keys. (When accessing Twitter through scripts instead of your phone, you don’t use usernames and passwords but keys and secrets, you can generate the keys by registering an application).
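The pacing arithmetic behind those deliberate delays can be sketched in a few lines. This is only an illustration that assumes the fifteen-queries-per-fifteen-minutes limit quoted above; the helper name `min_delay` and its `margin` parameter are hypothetical and are not part of Twython.

```python
def min_delay(calls_allowed, window_minutes, margin=1.0):
    """Seconds to wait between calls so as not to exceed a rate limit.

    A margin above 1 pads the delay to be conservative.
    """
    return window_minutes * 60 / calls_allowed * margin

# 15 calls per 15 minutes works out to one call per 60 seconds
base = min_delay(15, 15)
# padding by 50% gives a 90 second rest between calls
padded = min_delay(15, 15, margin=1.5)
print(base, padded)
```

A padded delay wastes some time but leaves headroom if other calls are made against the same keys.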
I am discussing the disk-writing script (tw2csv.py) first even though it is not directly called by the user because it is the most natural place to discuss Twython’s somewhat complicated data structure. A Twython data object is a list of dictionaries. (I adapted the script from Stack Overflow code for exporting lists of dictionaries). You can get a pretty good feel for what these objects look like by using type() and the pprint module. In this sample code, I explore a data object created by infoquery.py.
type(users) #shows that users is a list
type(users[0]) #shows that each element of users is a dictionary
#the objects are a bunch of brackets and commas, use pprint to make a dictionary (sub)object human-readable with whitespace
import pprint
pp=pprint.PrettyPrinter(indent=4)
pp.pprint(users[0])
pp.pprint(users[0]['status']) #you can also zoom in on daughter objects, in this case the user's most recent tweet object. Note that this tweet is a sub-object within the user object, but may itself have sub-objects
As you can see if you use the pprint command, some of the dictionary values are themselves dictionaries. It’s a real fleas upon fleas kind of deal. In the datacollection.py script I pull some of these objects out and delete others for the “clean” version of the data. Also note that tw2csv defaults to writing these second-level fields as one first-level field with escaped internal delimiters. So if you open a file in Excel, some of the cells will be really long and have a lot of commas in them. While Excel automatically parses the escaped commas correctly, Stata assumes you don’t want them escaped unless you use this command:
import delimited "foo.csv", delimiter(comma) bindquote(strict) varnames(1) asdouble encoding(UTF-8) clear
Another tricky thing about Twython data is there can be a variable number of dictionary entries (ie, some fields are missing from some cases). For instance, if a tweet is not a retweet it will be missing the “retweeted_status” dictionary within a dictionary. This was the biggest problem with reusing the Stack Overflow code and required adapting another piece of code for getting the union set of dictionary keys. Note this will give you all the keys used in any entry from the current query, but not those found uniquely in past or future queries. Likewise, the union of keys is a Python set, which is unordered, so the field order is effectively arbitrary from run to run. For these two reasons, I hard-coded tw2csv as overwrite, not append, and built a timestamp into the query scripts. If you tweak the code to append, you will run into problems with the fields not lining up.
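To make the missing-field problem concrete, here is a minimal sketch with two made-up records, one of which lacks a key. It takes the union of keys the same way tw2csv does, then sorts the field names, which is an optional tweak of my own for illustration and not something tw2csv does. The records and the in-memory buffer are stand-ins, not real Twitter data.

```python
import csv
import functools
import io

# two toy records; the second lacks the "retweeted_status" key
rows = [
    {"id": 1, "text": "hello", "retweeted_status": "x"},
    {"id": 2, "text": "world"},
]

# union of keys across all records
allkey = functools.reduce(lambda x, y: x.union(y.keys()), rows, set())

# sorting makes the column order stable for a given set of keys
fieldnames = sorted(allkey)

buffer = io.StringIO()  # in-memory stand-in for a file on disk
writer = csv.DictWriter(buffer, fieldnames, restval="")  # blank for missing keys
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Sorting only stabilizes the order within one set of keys. Two queries whose key sets differ would still produce files with different columns, so appending remains unsafe.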
Anyway, here’s the actual tw2csv code.
#tw2csv.py
def tw2csv(twdata,csvfile_out):
    import csv
    import functools
    allkey = functools.reduce(lambda x, y: x.union(y.keys()), twdata, set())
    with open(csvfile_out,'wt') as output_file:
        dict_writer=csv.DictWriter(output_file,allkey)
        dict_writer.writeheader()
        dict_writer.writerows(twdata)
One of the queries I like to run is getting basic information like date created, description, and follower counts. Basically, all the stuff that shows up on a user’s profile page. The Twitter API allows you to do this for 100 users simultaneously and I do this with the infoquery.py script. It assumes that your list of target users is stored in a text file, but there’s a commented out line that lets you hard code the users, which may be easier if you’re doing it interactively. Likewise, it’s designed to only query 100 users at a time, but there’s a commented out line that’s much simpler in interactive use if you’re only querying a few users.
You can call it from the command line and it takes as an argument the location of the input file. I hard-coded the location of the output. Note the “3” in the command-line call is important as operating systems like OSX default to calling Python 2.7.
python3 infoquery.py list.txt
#infoquery.py
from twython import Twython
import sys
import time
from math import ceil
import tw2csv #custom module

parentpath='/Users/rossman/Documents/twittertrucks/infoquery_py'
targetlist=sys.argv[1] #text file listing feeds to query, one per line. full path ok.
today = time.strftime("%Y%m%d")
csvfilepath_info=parentpath+'/info_'+today+'.csv'

#authenticate
APP_KEY='' #25 alphanumeric characters
APP_SECRET='' #50 alphanumeric characters
twitter=Twython(APP_KEY,APP_SECRET,oauth_version=2) #simple authentication object
ACCESS_TOKEN=twitter.obtain_access_token()
twitter=Twython(APP_KEY,access_token=ACCESS_TOKEN)

handles = [line.rstrip() for line in open(targetlist)] #read from text file given as cmd-line argument
#handles=("gabrielrossman,sociologicalsci,twitter") #alternately, hard-code the list of handles

#API allows 100 users per query. Cycle through, 100 at a time
#users = twitter.lookup_user(screen_name=handles) #this one line is all you need if len(handles) < 100
users=[] #initialize data object
hl=len(handles)
cycles=ceil(hl/100)
#unlike a get_user_timeline query, there is no need to cap total cycles
for i in range(0, cycles): ## iterate through all handles, 100 per query
    h=handles[0:100]
    del handles[0:100]
    incremental = twitter.lookup_user(screen_name=h)
    users.extend(incremental)
    time.sleep(90) ## 90 second rest between api calls. The API allows 15 calls per 15 minutes so this is conservative
tw2csv.tw2csv(users,csvfilepath_info)
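The hundred-at-a-time batching can also be written without deleting from the list as you go. The following is a sketch of that alternative, not a description of how infoquery.py works; the generator name `chunks` and the made-up screen names are purely illustrative, and no API call is made here.

```python
def chunks(items, size=100):
    """Yield successive slices of at most `size` items, leaving `items` intact."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

handles = ["user%d" % n for n in range(250)]  # made-up screen names
batches = list(chunks(handles))
print([len(b) for b in batches])
```

Each batch could then be passed to the lookup query in turn, with the same rest between calls.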
This last script collects tweets for a specified user. The tricky thing about this code is that the Twitter API allows you to query the last 3200 tweets per user, but only 200 at a time, so you have to cycle over them. Moreover, you have to build in a delay so you don’t get rate-limited. I adapted the script from the code credited in its comments but made some tweaks.
One change I made was to only scrape as deep as necessary for any given user. For instance, as of this writing, @SociologicalSci has 1192 tweets, so it cycles six times, but if you run it in a few weeks @SociologicalSci would have over 1200 and so it would run at least seven cycles. This change makes the script run faster, but ultimately gets you to the same place.
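The cycle count is just a ceiling division with a cap. Here is a small sketch of that arithmetic using the 200-per-call and 3200-deep limits mentioned above; the function name `cycles_needed` is my own label for illustration.

```python
from math import ceil

def cycles_needed(statuses_count, per_call=200, max_depth=3200):
    """Number of timeline calls needed, capped at the API depth limit."""
    return min(ceil(statuses_count / per_call), max_depth // per_call)

print(cycles_needed(1192))   # 6
print(cycles_needed(1201))   # 7
print(cycles_needed(50000))  # 16
```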
The other change I made is that I save two versions of the file, one as is and the other that pulls out some objects from the subdictionaries and deletes the rest. If for some reason you don’t care about retweet count but are very interested in retweeting user’s profile background color, go ahead and modify the code. See above for tips on exploring the data structure interactively so you can see what there is to choose from.
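If you do modify which sub-objects get pulled out, one way to guard against fields that are missing from some tweets is a small lookup helper. This is a hedged sketch rather than the approach the script takes: `pluck` is a hypothetical helper and the toy tweet below uses made-up values.

```python
def pluck(record, path, default=None):
    """Follow a sequence of keys into nested dictionaries.

    Returns `default` if any key along the way is missing.
    """
    current = record
    for key in path:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current

# toy tweet with made-up values; it is not a retweet, so that key is absent
tweet = {
    "id": 1,
    "user": {"screen_name": "example", "followers_count": 10},
}

flat = {
    "id": tweet["id"],
    "user_screen_name": pluck(tweet, ["user", "screen_name"]),
    "rt_count": pluck(tweet, ["retweeted_status", "retweet_count"]),
}
print(flat)
```

The trade-off is that every record gets every column, with empty values where a field does not apply.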
As above, you’ll need to register as an application and supply a key and secret.
You call it from bash with the target screenname as an argument.
python3 datacollection.py sociologicalsci
#datacollection.py
from twython import Twython
import sys
import time
import simplejson
from math import ceil
import tw2csv #custom module

parentpath='/Users/rossman/Documents/twittertrucks/feeds_py'
handle=sys.argv[1] #takes target twitter screenname as command-line argument
today = time.strftime("%Y%m%d")
csvfilepath=parentpath+'/'+handle+'_'+today+'.csv'
csvfilepath_clean=parentpath+'/'+handle+'_'+today+'_clean.csv'

#authenticate
APP_KEY='' #25 alphanumeric characters
APP_SECRET='' #50 alphanumeric characters
twitter=Twython(APP_KEY,APP_SECRET,oauth_version=2) #simple authentication object
ACCESS_TOKEN=twitter.obtain_access_token()
twitter=Twython(APP_KEY,access_token=ACCESS_TOKEN)

#adapted from http://www.craigaddyman.com/mining-all-tweets-with-python/
#user_timeline=twitter.get_user_timeline(screen_name=handle,count=200) #if doing 200 or less, just do this one line
user_timeline=twitter.get_user_timeline(screen_name=handle,count=1) #get most recent tweet
lis=user_timeline[0]['id']-1 #tweet id # for most recent tweet
#only query as deep as necessary
tweetsum= user_timeline[0]['user']['statuses_count']
cycles=ceil(tweetsum / 200)
if cycles>16:
    cycles=16 #API only allows depth of 3200 so no point trying deeper than 200*16
time.sleep(60)
for i in range(0, cycles): ## iterate through all tweets up to max of 3200
    incremental = twitter.get_user_timeline(screen_name=handle, count=200, include_retweets=True, max_id=lis)
    user_timeline.extend(incremental)
    lis=user_timeline[-1]['id']-1
    time.sleep(90) ## 90 second rest between api calls. The API allows 15 calls per 15 minutes so this is conservative

tw2csv.tw2csv(user_timeline,csvfilepath)

#clean the file and save it
for i, val in enumerate(user_timeline):
    user_timeline[i]['user_screen_name']=user_timeline[i]['user']['screen_name']
    user_timeline[i]['user_followers_count']=user_timeline[i]['user']['followers_count']
    user_timeline[i]['user_id']=user_timeline[i]['user']['id']
    user_timeline[i]['user_created_at']=user_timeline[i]['user']['created_at']
    if 'retweeted_status' in user_timeline[i].keys():
        user_timeline[i]['rt_count'] = user_timeline[i]['retweeted_status']['retweet_count']
        user_timeline[i]['rt_id'] = user_timeline[i]['retweeted_status']['id'] #was written to 'qt_id', which the quoted_status block below also uses
        user_timeline[i]['rt_created'] = user_timeline[i]['retweeted_status']['created_at']
        user_timeline[i]['rt_user_screenname'] = user_timeline[i]['retweeted_status']['user']['name']
        user_timeline[i]['rt_user_id'] = user_timeline[i]['retweeted_status']['user']['id']
        user_timeline[i]['rt_user_followers'] = user_timeline[i]['retweeted_status']['user']['followers_count']
        del user_timeline[i]['retweeted_status']
    if 'quoted_status' in user_timeline[i].keys():
        user_timeline[i]['qt_created'] = user_timeline[i]['quoted_status']['created_at']
        user_timeline[i]['qt_id'] = user_timeline[i]['quoted_status']['id']
        user_timeline[i]['qt_text'] = user_timeline[i]['quoted_status']['text']
        user_timeline[i]['qt_user_screenname'] = user_timeline[i]['quoted_status']['user']['name']
        user_timeline[i]['qt_user_id'] = user_timeline[i]['quoted_status']['user']['id']
        user_timeline[i]['qt_user_followers'] = user_timeline[i]['quoted_status']['user']['followers_count']
        del user_timeline[i]['quoted_status']
    if user_timeline[i]['entities']['urls']: #list
        for j, val in enumerate(user_timeline[i]['entities']['urls']):
            urlj='url_'+str(j)
            user_timeline[i][urlj]=user_timeline[i]['entities']['urls'][j]['expanded_url']
    if user_timeline[i]['entities']['user_mentions']: #list
        for j, val in enumerate(user_timeline[i]['entities']['user_mentions']):
            mentionj='mention_'+str(j)
            user_timeline[i][mentionj] = user_timeline[i]['entities']['user_mentions'][j]['screen_name']
    if user_timeline[i]['entities']['hashtags']: #list
        for j, val in enumerate(user_timeline[i]['entities']['hashtags']):
            hashtagj='hashtag_'+str(j)
            user_timeline[i][hashtagj] = user_timeline[i]['entities']['hashtags'][j]['text']
    if user_timeline[i]['coordinates'] is not None: #NoneType or Dict
        user_timeline[i]['coord_long'] = user_timeline[i]['coordinates']['coordinates'][0] #GeoJSON order is longitude, latitude
        user_timeline[i]['coord_lat'] = user_timeline[i]['coordinates']['coordinates'][1]
    del user_timeline[i]['coordinates']
    del user_timeline[i]['user']
    del user_timeline[i]['entities']
    if 'place' in user_timeline[i].keys(): #NoneType or Dict
        del user_timeline[i]['place']
    if 'extended_entities' in user_timeline[i].keys():
        del user_timeline[i]['extended_entities']
    if 'geo' in user_timeline[i].keys():
        del user_timeline[i]['geo']

tw2csv.tw2csv(user_timeline,csvfilepath_clean)
| Gabriel |
There has been a tremendous amount of hype over the last few years about universal pre-K as a magic bullet to solve all social problems. We see a lot of talk of return on investment at rates usually only promised by prosperity gospel preachers and Ponzi schemes. Unfortunately, two recent large-scale studies, one in Quebec and one in Tennessee, showed small negative effects for pre-K. An article writing up the Tennessee study in New York advises fear not, for:
These are all good studies, and they raise important questions. But none of them is an indictment of preschool, exactly, so much as an indictment of particular approaches to it. How do we know that? Two landmark studies, first published in 1993 and 2008, demonstrate definitively that, if done right, state-sponsored pre-K can have profound, lasting, and positive effects — on individuals and on a community.
It then goes on to explain that the Perry and Abecedarian projects were studies involving 123 and 100 people respectively, had marvelous outcomes, and were play rather than drill oriented.
The phrase “demonstrate definitively” is the kind of phrase you have to be very careful with, and it just looks silly to say that this definitive knowledge comes from two studies with sample sizes of about a hundred. Tiny studies with absurdly large effect sizes are exactly where you would expect to find publication bias. Indeed, this is almost inevitable when the studies are so underpowered that the only way to get β/se>1.96 is for β to be implausibly large. (As Jeremy Freese observed, this is among the dozen or so major problems with the PNAS himmicane study).
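To put rough numbers on the β/se>1.96 point, here is a back-of-the-envelope sketch. It assumes a simple comparison of two equal-sized arms with the outcome in standard deviation units, so that se is approximately sqrt(2/n) per arm. The arm sizes of 50 and 5000 are hypothetical round numbers, not the actual designs of any study mentioned here.

```python
from math import sqrt

def min_significant_effect(n_per_arm, z=1.96):
    """Smallest standardized mean difference that clears z, two equal arms.

    Uses the approximation se = sqrt(2 / n_per_arm) for a difference in
    means measured in standard deviation units.
    """
    se = sqrt(2 / n_per_arm)
    return z * se

small = min_significant_effect(50)
large = min_significant_effect(5000)
print(round(small, 3), round(large, 3))
```

Under these assumptions a study with 50 per arm can only reach significance with an estimate of about 0.39 standard deviations or more, while a study with 5000 per arm gets there at about 0.04.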
The standard way to detect publication bias is through a meta-analysis showing that small studies have big effects and big studies have small effects. For instance, this is what Card and Krueger showed in a meta-analysis of the minimum wage literature which demonstrated that their previous paper on PA/NJ was only an outlier when you didn’t account for publication bias. Similarly, in a 2013 JEP, Duncan and Magnuson do a meta-analysis of the pre-K literature. Their visualization in figure 2 emphasizes the declining effect sizes over time, but you can also see that the large studies (shown as large circles) generally have much smaller β than the small studies (shown as small circles). If we added the Tennessee and Quebec studies to this plot they would be large circles on the right slightly below the x-axis. That is to say, they would fall right on the regression line and might even pull it down further.
This is what publication bias looks like: old small studies have big effects and new large studies have small effects.
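A toy simulation shows how this pattern can arise from the significance filter alone. Everything here is an assumption made for illustration: a true standardized effect of 0.1, hypothetical arm sizes of 50 and 5000, and a rule that only estimates with β/se>1.96 get published. It is not a reanalysis of any of the studies discussed.

```python
import random
from math import sqrt

random.seed(0)  # fixed seed so the sketch is reproducible

TRUE_EFFECT = 0.1  # assumed true standardized effect, purely illustrative

def published_mean(n_per_arm, studies=2000):
    """Mean estimate among simulated studies that reach z > 1.96."""
    se = sqrt(2 / n_per_arm)
    estimates = [random.gauss(TRUE_EFFECT, se) for _ in range(studies)]
    significant = [b for b in estimates if b / se > 1.96]
    return sum(significant) / len(significant)

small_mean = published_mean(50)
large_mean = published_mean(5000)
print(round(small_mean, 2), round(large_mean, 2))
```

With the same true effect in both cases, the small studies that survive the filter report effects several times larger than the truth, while the large studies that survive report something close to it.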
I suppose it’s possible that the reason Perry and Abecedarian showed big results is because the programs were better implemented than those in the newer studies, but this is not “demonstrated definitively” and given the strong evidence that it’s all publication bias, let’s tentatively assume that if something’s too good to be true (such as that a few hours a week can almost deterministically make kids stay in school, earn a solid living, and stay out of jail), then it ain’t.
| Gabriel |
Today the Economist posted a graph showing the patrons of factions in various civil wars in the Middle East. The point of the graph is that the alliances don’t neatly follow balance theory, since it is in fact sometimes the case that the friend of my enemy is my friend, which is a classic balance theory fail. As such, I thought it would be fun to run a Spinglass model on the graph. Note that I could only do edges, not arcs, so I only included positive ties, not hostility ties. One implication of this is ISIS drops out as it (currently) lacks state patronage.
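For readers who have not met balance theory, the usual criterion for a triad can be stated in one line: the triad is balanced when the product of its three tie signs is positive. A minimal sketch, with `is_balanced` as an illustrative name of my own:

```python
def is_balanced(sign_ab, sign_bc, sign_ac):
    """A triad is balanced when the product of its three tie signs is positive.

    Signs are +1 for a friendly tie and -1 for a hostile tie.
    """
    return sign_ab * sign_bc * sign_ac > 0

# the enemy of my enemy is my friend: balanced
classic = is_balanced(-1, -1, +1)
# the friend of my enemy is my friend: unbalanced
anomaly = is_balanced(-1, +1, +1)
print(classic, anomaly)
```

Checking this requires signed ties, which is why a positive-ties-only graph cannot test balance directly and a community detection model is used instead.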
Here’s the output. The second column is community and the third is betweenness.
> s
Graph community structure calculated with the spinglass algorithm
Number of communities: 4
Modularity: 0.4936224
Membership vector:
 [1] 4 4 3 2 2 2 4 3 4 3 1 4 1 3 3 4 2 4 2
> output
                          b
 [1,] "bahrain_etc"   "4" "0"
 [2,] "egypt_gov"     "4" "9.16666666666667"
 [3,] "egypt_mb"      "3" "1.06666666666667"
 [4,] "iran"          "2" "47.5"
 [5,] "iraq_gov"      "2" "26"
 [6,] "iraq_kurd"     "2" "26"
 [7,] "jordan"        "4" "6.73333333333333"
 [8,] "libya_dawn"    "3" "1.06666666666667"
 [9,] "libya_dignity" "4" "0.333333333333333"
[10,] "qatar"         "3" "27.5333333333333"
[11,] "russia"        "1" "0"
[12,] "saudi"         "4" "4"
[13,] "syria_gov"     "1" "17"
[14,] "syria_misc"    "3" "31.0333333333333"
[15,] "turkey"        "3" "6.83333333333333"
[16,] "uae"           "4" "4"
[17,] "usa"           "2" "74.4"
[18,] "yemen_gov"     "4" "74.3333333333333"
[19,] "yemen_houthi"  "2" "0"
So it looks like we’re in community 2, which is basically Iran and its clients, though in fairness we also have high betweenness as we connect community 2 (Greater Iran), community 3 (the pro Muslim Brotherhood Sunni states), and community 4 (the pro Egyptian government Sunni states). This is consistent with the “offshore balancing” model of Obama era MENA policy.
Here’s the code:
library("igraph")
setwd('~/Documents/codeandculture')
mena <- read.graph('mena.net',format="pajek")
la = layout.fruchterman.reingold(mena)
V(mena)$label <- V(mena)$id #attaches labels
plot.igraph(mena, layout=la, vertex.size=1, vertex.label.cex=0.5, vertex.label.color="darkred", vertex.label.font=2, vertex.color="white", vertex.frame.color="NA", edge.color="gray70", edge.arrow.size=0.5, margin=0)
s <- spinglass.community(mena)
b <- betweenness(mena, directed=FALSE)
output <- cbind(V(mena)$id,s$membership,b)
s
output
And here’s the data:
*Vertices 19
1 "bahrain_etc"
2 "egypt_gov"
3 "egypt_mb"
4 "iran"
5 "iraq_gov"
6 "iraq_kurd"
7 "jordan"
8 "libya_dawn"
9 "libya_dignity"
10 "qatar"
11 "russia"
12 "saudi"
13 "syria_gov"
14 "syria_misc"
15 "turkey"
16 "uae"
17 "usa"
18 "yemen_gov"
19 "yemen_houthi"
*Arcs
1 18
2 9
2 18
4 5
4 6
4 13
4 19
7 2
7 14
7 18
10 3
10 8
10 14
10 18
11 13
12 2
12 9
12 18
15 3
15 8
15 14
16 2
16 9
16 18
17 5
17 6
17 14
17 18
| Gabriel |
The transformations of the television industry are an endlessly fascinating subject that I spend a lot of time ruminating on but haven’t ever, you know, actually published on. We can start with a few basic technological shifts, specifically the DVR and broadband internet. Both technologies have the effect that people are watching fewer commercials. From this we can infer that advertisers will have a pronounced preference for “DVR-proof” advertising.* One form of this is product shots, which are indeed a big deal nowadays, especially in the reality competition genre. Of course product shots are inherently cumbersome and are pretty much the antithesis of the scatter advertising market insofar as they require commitments during pre-production which is even more extreme than up-fronts and which is why we long ago got past the age of Texaco Star Theatre. So basically, the 30 second spot you will always have with you. Or rather, the demand for the 30 second spot you will always have with you and the question is can we find a type of programming where people watch the ads. (Note that the recent Laureate Jean Tirole did work on this issue, as explained by Alex Tabarrok at MR).
In practice getting people to watch spot advertising means programming that has to be watched live and in practice that in turn means sports.** Thus it is entirely predictable that advertisers will pay a premium for sports. It is also predictable that the cable industry will pay a premium for sports because must-watch ephemera is a good insurance policy against cord-cutting. Moreover, as a straight-forward Ricardian rent type issue, we would predict that this increased demand would accrue to the owners of factor inputs: athletes, team owners, and (in the short-run) the owners of cable channels with contracts to carry sports content. Indeed this has basically all happened. You’ve got ESPN being the cash cow of Disney, ESPN and TNT in turn signing a $24 billion deal with the NBA, an NBA team selling for $2 billion, and Kobe Bryant making $30 million in salary. Basically, there’s a ton of money in DVR-proof sports, both from advertising and from the ever-rising carriage fees that get passed on in the form of ever rising basic cable rates. (I imagine a Johnny Cash parody, “how high’s the carriage fees mama? 6 bucks per sub and rising.”).
Here’s something else that is entirely predictable from these premises: we should have declining viewership for sports. Think about it: you have widget A and widget B. Widget A has a user experience that’s the same as it’s always been (ie, you’ve got to watch it when it’s on and sit through the ads) but the price is rapidly increasing (it used to be you could get it over broadcast or just from a basic cable package that was relatively cheap). In contrast you have widget B, which has a dramatically improved user experience (you can watch every episode ever on-demand whenever you feel like it without ads and do so on your tv, tablet, or whatever) and a rapidly declining price (if you’re willing to wait for the previous season, scripted content is practically free). If you’re the marginal viewer who ex ante finds sports and scripted equally compelling, then as sports get more expensive and you keep having to watch ads, whereas scripted gets dirt cheap, ad-free, and generally more convenient, you would give up sports, watch last season’s episodes of Breaking Bad on Netflix, be blissfully unaware of major advertising campaigns, and pocket the $50 difference between a basic cable package and a $10 Netflix subscription. Of course you wouldn’t predict that the kinds of guys who put body paint on their naked torsos would give up on sports just because Netflix has every season of Frasier, but you would predict that at the population level interest in sports would decline slightly to moderately.
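The widget comparison above can be made concrete with a toy sketch. This is only an illustration of the substitution logic, not a model of real viewers: the `Widget` class, the `hassle` and `enjoyment` figures, and the helper names are all assumptions of mine. The only numbers taken from the argument are the $10 Netflix subscription and the $50 gap, which implies a roughly $60 basic cable package.

```python
from dataclasses import dataclass


@dataclass
class Widget:
    name: str
    price: float      # dollars per month
    hassle: float     # assumed dollar-equivalent cost of ads and fixed schedules
    enjoyment: float  # assumed dollar-equivalent value of the content itself

    def total_cost(self) -> float:
        """Money price plus the hassle of consuming the widget."""
        return self.price + self.hassle

    def net_value(self) -> float:
        """What the viewer gets out of the widget after all costs."""
        return self.enjoyment - self.total_cost()


def choose(a: Widget, b: Widget) -> Widget:
    """Pick the widget with the higher net value (ties go to a)."""
    return a if a.net_value() >= b.net_value() else b


def break_even_premium(a: Widget, b: Widget) -> float:
    """How much more the viewer must enjoy a than b to stick with a."""
    return a.total_cost() - b.total_cost()


# The marginal viewer finds both equally compelling (same enjoyment).
# The hassle values are made up purely for illustration.
sports = Widget("live sports on cable", price=60.0, hassle=5.0, enjoyment=70.0)
scripted = Widget("last season on Netflix", price=10.0, hassle=0.0, enjoyment=70.0)

print(choose(sports, scripted).name)
print(break_even_premium(sports, scripted))
```

Under these made-up numbers the marginal viewer switches to scripted, and anybody who stays with sports has to value it at least the break-even premium more than scripted. The guys in body paint clear that bar easily; the prediction is only about the viewers who don’t.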
The weird thing is that this latter prediction didn’t happen. During exactly the same period over which sports got more expensive in absolute terms and there was declining direct cost and hassle for close substitutes, viewership for sports increased. From 2003 to 2013, sports viewership was up 27%. Or rather, baseball isn’t doing so great and basketball is holding its own, but holy moly, people love football. If you look at both the top events and top series on tv, it’s basically football, football, some other crap, and more football. (Also note that football doesn’t appear in the “time-shifted” lists, meaning that people do watch the ads). And it’s not just that people have always liked football or that non-football content is weakening, but that football is growing in absolute popularity.
That this would happen in an era of DVRs and streaming is nuts, and kind of goes contrary to the whole notion of substitutes. I mean, I just can’t understand how, when one thing gets more expensive and something else that’s similar gets a lot cheaper and lower hassle, you see people flocking to the thing that is more money in absolute terms and more hassle in relative terms.*** Maybe we just need to keep heightening the contradictions and then eventually the system will unravel, but this doesn’t explain why we’ve seen a medium-run, fairly substantial rise in sports viewership instead of just stability with a bit of noise.
I’m sure one of my commenters is smarter than me and can explain why either my premises or my logic is incorrect, but at least to me this looks like an anomaly. And even if we can ultimately find some auxiliary hypothesis that explains why of course we’d predict a rise in sports viewership if we only considered that [your brilliant ex post explanation goes here],**** let’s keep in mind that this is all ex post, and adjust down our confidence about making social scientific predictive inferences accordingly. A theory like “a decline in the total cost of widget B will lead to substitution of widget B for widget A” is a pretty good theory, and if its predictions don’t hold in the face of something like bigger linebackers or more exciting editing for instant replay, then you have to wonder how far any theory can get us.
*If we’re a bit more creative we could also infer that the market information regime for audience ratings will see a lot of contentious changes.
**It is interesting that the tv networks aggressively promote Twitter in order to promote live viewing of scripted content and news, but at this point the idea that networks will hashtag their way to higher “C3” ratings is pretty niche/speculative.
*** The closest parallel I can think of is that it’s the easy-going mainline Protestant churches that have seen especially steep declines in attendance/membership and the more personally demanding churches that are relatively strong. I may have to rethink this point though after I fully digest the new Hout & Fischer.
**** Your ex post explanation had better speak to the (extensive) marginal fan and not just the intensity of hardcore fans, since my understanding is that the total number of football viewers is up, and so the explanation can’t be anything like “the growth in fantasy leagues leads hardcore fans to watch 20 hours a week instead of 3 hours a week.”