April 26, 2009

Gabriel

Via Andrew Gelman, I just found this link to TextArc, a package for automated social network visualization of books. Gelman says that (as is often true of visualizations) it’s beautiful and a technical achievement but it’s not clear what the theoretical payoff is. I wasn’t aware of TextArc at the time, but a few years ago I suggested some possibilities for this kind of analysis in a post on orgtheory in which I graphed R. Kelly’s “Trapped in the Closet.” My main suggestion at the time was, and still is, that social network analysis could be a very effective way to distinguish between two basic types of literature:

  1. Episodic works. In these works a single protagonist or small group of protagonists continually meets new people in each chapter. Examples include The Odyssey and The Adventures of Huckleberry Finn. The network structure for these works will be a star, with a single hub consisting of the main protagonist and a bunch of small cliques each radiating from the hub but not connected to each other. There would also be a steep power-law distribution for centrality.
  2. Serial works. In these works an ensemble of several characters, both major and minor, all seem to know each other but often with low tie strength. An example is Neal Stephenson’s Baroque Cycle. This kind of work would have a small-world structure. Centrality would follow a Poisson distribution, but there’d be much higher dispersion in the frequency of repeated interaction across edges.
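As a rough illustration of how the two types might be told apart, here is a minimal sketch using Freeman degree centralization, which is 1.0 for a pure star and 0.0 when every character has the same number of ties. This measure is my own stand-in, not something TextArc computes, and the toy networks, function names, and character labels are all hypothetical. A ring lattice stands in for the ensemble; a real small-world graph would add a few shortcut ties.

```python
from itertools import combinations


def degree_centralization(edges):
    """Freeman degree centralization for a set of undirected, unique edges."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    n = len(degree)
    if n < 3:
        return 0.0
    top = max(degree.values())
    # Normalize by the value a pure star on n nodes would produce.
    return sum(top - d for d in degree.values()) / ((n - 1) * (n - 2))


def episodic_edges(episodes=4, cast_size=3):
    """Hypothetical episodic plot: one hero, plus a fresh clique per episode."""
    edges = set()
    for e in range(episodes):
        cast = [f"ep{e}_char{c}" for c in range(cast_size)]
        edges.update(("hero", name) for name in cast)  # hero meets everyone
        edges.update(combinations(cast, 2))            # the clique knows itself
    return edges


def ensemble_edges(n=13, reach=2):
    """Hypothetical ensemble: each character tied to nearby neighbors on a ring."""
    return {(i, (i + step) % n) for i in range(n) for step in range(1, reach + 1)}


episodic_score = degree_centralization(episodic_edges())
ensemble_score = degree_centralization(ensemble_edges())
print(f"episodic: {episodic_score:.2f}, ensemble: {ensemble_score:.2f}")
```

With these toy inputs the episodic network scores high (the hero has twelve ties and everyone else has three) while the ensemble scores zero. Real texts would fall somewhere in between, and the dispersion of tie strength would need a weighted version of the same idea.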

An engine like TextArc could be very powerful at coding for these things (although I’d want to tweak it so it only draws networks among proper nouns, which would be easy enough with a concordance and/or some regular expression filters). Of course, as Gelman asked, what’s the point?
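To give a sense of what such a regular expression filter might look like, here is a minimal sketch that is not based on TextArc's actual code. It assumes a crude heuristic: a capitalized word counts as a proper noun only if it never appears in lowercase elsewhere in the text, and two names are tied whenever they share a sentence. The function name and sample text are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations


def proper_noun_edges(text):
    """Count sentence-level co-occurrences among likely proper nouns."""
    # Words seen in lowercase are treated as common words, which screens out
    # most sentence-initial capitals such as "The".
    lowercase_words = set(re.findall(r"\b[a-z]+\b", text))
    edges = Counter()
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        names = {
            word
            for word in re.findall(r"\b[A-Z][a-z]+\b", sentence)
            if word.lower() not in lowercase_words
        }
        # Each unordered pair of names in the sentence gets one more tie.
        for pair in combinations(sorted(names), 2):
            edges[pair] += 1
    return edges


sample = (
    "Huck met Jim on the island. "
    "The raft carried Huck and Jim south. "
    "Huck met the Duke."
)
edges = proper_noun_edges(sample)
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

The counts double as tie strength, so the same pass yields both the network and the frequency of repeated interaction. The heuristic will misfire on sentence-initial words that never recur in lowercase and on multi-word names, which is where a concordance or hand-built name list would help.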

I can think of several. First, there might be strong associations between network structure and genre. Second, we might imagine that the prevalence of these network structures might change over time. A related issue would be to distinguish between literary fiction and popular fiction. My impression is that currently episodic fiction is popular and serial fiction is literary (especially in the medium of television), but in other eras it was the opposite. A good coding method would allow you to track the relationship between formal structure, genre, prestige, and time.

  • 1. Jenn Lena  |  April 26, 2009 at 5:45 pm

    I don’t remember now if I commented on this in the old orgtheory thread, but Duncan Watts hand drew the story arc of “Snatch” (the Guy Ritchie movie). This caper flick about a diamond heist is a classic of what you call the “serial works” type.

    And your “so what” answers are all ex post: won’t authors be interested in using such a tool in planning their work, especially in cases with many characters and when very weak connections are exploited over many pages?

  • 2. gabrielrossman  |  April 26, 2009 at 6:05 pm

    oh, you columbia people think of everything.

    it hadn’t occurred to me that this could be a tool not just for analysis, but design. it’s pretty funny imagining a novelist saying something like “hmmm, my mean path length is a little high, i better ask the computer to add a few more random graph elements.” or at least i don’t see them doing it this formally (unless it were some kind of weird experimental work).

    however, i can easily see them saying something roughly equivalent to this, and i think many of them already do. in fact, i know that many authors write what is basically a “show bible” or backstory notes and i wouldn’t be surprised if such preparatory materials included hand-sketched network graphs.

  • 3. Bobby Chen  |  April 29, 2009 at 4:27 am

    I’m not sure if they actually draw out network graphs, but the co-producers of the TV show Lost talk about some variation of this on their podcast. They have a continuity expert who is supposed to keep track of all the dates, characters’ dates of birth, etc. They also have a big storyboard on which all the relationships between the main (and minor) characters are outlined to keep them straight. One might imagine this kind of tool would be helpful for an evolving TV show, especially one that involves time travel.

