| Gabriel |
Via Andrew Gelman, I just found this link to TextArc, a package for automated social network visualization of books. Gelman says that (as is often true of visualizations) it’s beautiful and a technical achievement but it’s not clear what the theoretical payoff is. I wasn’t aware of TextArc at the time, but a few years ago I suggested some possibilities for this kind of analysis in a post on orgtheory in which I graphed R. Kelly’s “Trapped in the Closet.” My main suggestion at the time was, and still is, that social network analysis could be a very effective way to distinguish between two basic types of literature:
- episodic works. In these works a single protagonist or small group of protagonists continually meets new people in each chapter. Examples include The Odyssey and The Adventures of Huckleberry Finn. The network structure for these works will be a star, with a single hub consisting of the main protagonist and a bunch of small cliques, each radiating from the hub but not connected to each other. Centrality would follow a steep power-law distribution.
- serial works. In these works an ensemble of several characters, both major and minor, all seem to know each other but often with low tie strength. An example is Neal Stephenson’s Baroque Cycle. This kind of work would have a small-world structure. Centrality would follow a Poisson distribution, but there’d be much higher dispersion in the frequency of repeated interaction across edges.
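To make the contrast concrete, here is a minimal sketch of two statistics that could separate the two types. Neither comes from TextArc. `degree_centralization` is Freeman's degree centralization, which I'm using as a stand-in for "how star-like is this network," and `weight_dispersion` is a plain variance-to-mean ratio of how often each pair interacts. The two toy edge lists are made up for illustration, not coded from any actual book.

```python
from collections import Counter
from statistics import mean, pvariance


def degree_centralization(edges):
    """Freeman degree centralization: 1.0 for a perfect star,
    0.0 when every character has the same number of ties."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    n = len(degree)
    if n < 3:
        return 0.0
    top = max(degree.values())
    return sum(top - d for d in degree.values()) / ((n - 1) * (n - 2))


def weight_dispersion(weights):
    """Variance-to-mean ratio of interaction counts across edges."""
    weights = list(weights)
    m = mean(weights)
    return pvariance(weights) / m if m else 0.0


# Hypothetical episodic work: a hero tied to three separate pairs,
# each pair meeting the hero (and each other) exactly once.
episodic = {
    ("hero", "a1"): 1, ("hero", "a2"): 1, ("a1", "a2"): 1,
    ("hero", "b1"): 1, ("hero", "b2"): 1, ("b1", "b2"): 1,
    ("hero", "c1"): 1, ("hero", "c2"): 1, ("c1", "c2"): 1,
}

# Hypothetical serial work: six characters who all have three ties,
# but a few pairs interact far more often than the rest.
serial = {
    ("A", "B"): 9, ("B", "C"): 1, ("C", "D"): 1,
    ("D", "E"): 7, ("E", "F"): 1, ("F", "A"): 1,
    ("A", "D"): 1, ("B", "E"): 1, ("C", "F"): 5,
}

for label, net in (("episodic", episodic), ("serial", serial)):
    print(label,
          round(degree_centralization(net), 2),
          round(weight_dispersion(net.values()), 2))
```

On these toy inputs the episodic network scores high on centralization and zero on dispersion, and the serial network does the reverse. Real texts would be messier, and any cutoff between the two types would have to be set empirically.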
An engine like TextArc could be very powerful at coding for these things, although I’d want to tweak it so it only draws networks among proper nouns, which would be easy enough with a concordance and/or some regular expression filters. Of course, as Gelman asked, what’s the point?
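As a rough illustration of the regular expression filter, here is a sketch that is not TextArc's method and makes some crude assumptions: a proper noun is any capitalized word longer than one letter that shows up at least once somewhere other than the start of a sentence, and two names are tied whenever they appear in the same sentence. The function names and the sample text are made up for the example.

```python
import re
from collections import Counter
from itertools import combinations

SENTENCE_SPLIT = re.compile(r"(?<=[.!?])\s+")
WORD = re.compile(r"[A-Za-z]+")


def proper_nouns(text):
    """Capitalized words seen at least once in a non-initial position.
    Crude assumption: this is enough to drop 'The', 'Then', and 'I'."""
    names = set()
    for sentence in SENTENCE_SPLIT.split(text):
        words = WORD.findall(sentence)
        names.update(w for w in words[1:] if w[0].isupper() and len(w) > 1)
    return names


def cooccurrence_edges(text):
    """Count how many sentences each pair of names shares."""
    names = proper_nouns(text)
    edges = Counter()
    for sentence in SENTENCE_SPLIT.split(text):
        present = sorted(set(WORD.findall(sentence)) & names)
        for pair in combinations(present, 2):
            edges[pair] += 1
    return edges


# Made-up sample text, not a quotation from the novel.
sample = ("Huck met Jim on the island. Jim and Huck found the raft. "
          "Then Huck ran into Tom. The Duke tricked Huck.")
edges = cooccurrence_edges(sample)
for pair, count in sorted(edges.items()):
    print(pair, count)
```

A filter this simple will miss multi-word names and aliases (it has no idea that "the Duke" and a character's given name might be the same person), which is where a hand-built concordance would earn its keep.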
I can think of several. First, there might be strong associations between network structure and genre. Second, the prevalence of these network structures might change over time. A related issue would be to distinguish between literary fiction and popular fiction. My impression is that currently episodic fiction is popular and serial fiction is literary (especially in the medium of television), but in other eras it was the opposite. A good coding method would allow you to track the relationship between formal structure, genre, prestige, and time.