Archive for November, 2011

Seven-inch heels, natural language processing, and sociology

The following is a guest post from Trey Causey, a long-time reader of codeandculture and a grad student at Washington who does a lot of work with web scraping. We got to discussing a dubious finding and at my request he graciously wrote up his thoughts into a guest post.

| Trey |

Recently, Gabriel pointed me to a piece in Ad Age (and original press release) about IBM researchers correlating the conversations of fashion bloggers with the state of the economy (make sure you file away the accompanying graph for the next time you teach data visualization). Trevor Davis, a “consumer-products expert” with IBM, claimed that as economic conditions improve, the average height of high heels mentioned by these bloggers decreases. Conversely, as economic conditions worsen, the average height increases. As Gabriel pointed out, these findings seemed to lack any sort of face validity: how likely does it seem that, at any level of economic performance, the average high heel is seven inches tall (even among fashionistas)? I’ll return to the specific problems posed by what I’ll call the “seven-inch heel problem” in a moment, but first some background on the methods that most likely went into this study.

While amusing, if not very credible, the IBM study is part of a growing area (dubbed by some “computational social science”) situated at the intersection of natural language processing and machine learning. By taking advantage of the explosion of available digital text and computing power, researchers in this area are attempting to model structure in and test theories of large-scale social behavior. You’ve no doubt seen some of this work in the media, ranging from “predicting the Arab Spring” to using Twitter to predict GOP primary frontrunners. Many of these works hew towards the style end of the style-substance divide and are not typically motivated by any recognizable theory. However, this is changing as linguists use Twitter to discover regional dialect differences and model the daily cycle of positive and negative emotions.

Much of this work is being met by what I perceive to be reflexive criticism (as in automatic, rather than in the more sociological sense) from within the academy. The Golder and Macy piece on daily mood cycles in particular received sharp criticism in the comments on orgtheory, labeled variously “empiricism gone awry”, non-representative, and even fallacious (and in which yours truly was labeled “cavalier”). Some of this criticism is warranted, although as is often the case with new methods and data sources, much of the criticism seems rooted in misunderstanding. I suspect part of this is the surprisingly long-lived skepticism of scholarly work on “the internet” which, with the rise of Facebook and Twitter, seems to have been reinvigorated.

However, sociologists are doing themselves a disservice by seeing this work as research on the internet qua internet. Incredible amounts of textual and relational data are out there for the analyzing — and we all know if there’s one thing social scientists love, it’s original data. And these data are not limited to blog posts, status updates, and tweets. Newspapers, legislation, historical archives, and more are rapidly being digitized, providing pristine territory for analysis. Political scientists are warming to the approach, as evidenced by none other than the inimitable Gary King and his own start-up Crimson Hexagon, which performs sentiment analysis on social media using software developed for a piece in AJPS. Political Analysis, the top-ranked journal in political science and the methodological showcase for the discipline, devoted an entire issue in 2008 to the “text-as-data” approach. Additionally, a group of historians and literary scholars have adopted these methods, dubbing the new subfield the “digital humanities.”

Sociologists of culture and diffusion have already warmed to many of these ideas, but the potential for other subfields is significant and largely unrealized. Social movement scholars could find ways to empirically identify frames in wider public discourse. Sociologists of stratification have access to thousands of public- and private-sector reports, the texts of employment legislation, and more to analyze. Race, ethnicity, and immigration researchers can model changing symbolic boundaries across time and space. The real mistake, in my view, is dismissing these methods as an end in and of themselves rather than seeing them as a tool for exploring important and interesting sociological questions. Although many of the studies hitting the mass media seem more “proof of concept” than “test of theory,” this is changing; sociologists will not want to be left behind. Below, I will outline the basics of some of these methods and then return to the seven-inch heels problem.

The use of simple scripts or programs to scrape data from the web or Twitter has been featured several times on this blog. The data that I collected for my dissertation were crawled and then scraped from multiple English and Arabic news outlets that post their archives online, including Al Ahram, Al Masry Al Youm, Al Jazeera, and Asharq al Awsat. The actual scrapers are written in Python using the Scrapy framework.

Obtaining the data is the first and least interesting step (to sociologists). Using the scraped data, I am creating chains of topic models (specifically using Latent Dirichlet Allocation) to model latent discursive patterns in the media from the years leading up to the so-called “Arab Spring.” In doing so, I am trying to identify the convergence and divergence in discourse across and within sources to understand how contemporary actors were making sense of their social, political, and economic contexts prior to a major social upheaval. Estimating common knowledge prior to contentious political events is often problematic due to hindsight biases, because of the problems of conducting surveys in non-democracies, and for the obvious reason that we usually don’t know when a major social upheaval is about to happen even if we may know which places may be more susceptible.

Topic modeling is a method that will look familiar in its generalities to anyone who has seen a cluster analysis. Essentially, topic models use unstructured text (i.e., text without labeled fields from a database or from a forced-choice survey) to model the underlying topical components that make up a document or set of documents. For instance, one modeled topic might be composed of the words “protest”, “revolution”, “dictator”, and “tahrir”. The model attempts to find the words that have the highest probability of being found with one another and the lowest probability of being found with other words. The generated topics are devoid of meaning, however, without theoretically informed interpretation. This is analogous to survey researchers who perform cluster or factor analyses to find items that “hang together” and then attempt to figure out what latent construct links them.
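To make the idea concrete, here is a toy collapsed Gibbs sampler for Latent Dirichlet Allocation in plain Python. It is only a sketch for illustration, not the implementation used for the dissertation work described above: the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the four tiny "documents" are all assumptions made up for this example, and a real analysis would use a tested library on a far larger corpus.

```python
import random

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA (illustrative only).

    docs is a list of token lists. Returns (vocab, topic_word), where
    topic_word[k][v] is the estimated probability of word v in topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]             # tokens per (document, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                           # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, z = word_id[w], assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Resample its topic given every other token's assignment.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    probs = [
        [(topic_word[k][v] + beta) / (topic_total[k] + n_words * beta)
         for v in range(n_words)]
        for k in range(n_topics)
    ]
    return vocab, probs

# Hypothetical, already-tokenized documents.
docs = [
    "protest revolution dictator tahrir protest".split(),
    "tahrir revolution protest dictator".split(),
    "market prices inflation economy market".split(),
    "economy inflation prices market".split(),
]
vocab, topics = lda_gibbs(docs, n_topics=2)
for k, dist in enumerate(topics):
    top = sorted(zip(dist, vocab), reverse=True)[:4]
    print(k, [w for _, w in top])
```

The sampler only returns lists of words with probabilities attached. Deciding that one list is "about" protest and another "about" the economy is the interpretive step that the model cannot do for you.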

Collections of documents (a corpus) are usually represented as a document-term matrix, where each row is a document and the columns are all of the words that appear in your set of documents (the vocabulary). The contents of the individual cells are the per-document word frequencies. This produces a very sparse matrix, so some pre-processing is usually performed to reduce the dimensionality. Most documents from any source are filled with words that convey little to no information: prepositions, articles, common adjectives, etc. (see Zipf’s law). Words that appear in every document or in a very small number of documents provide little explanatory power and are usually removed. The texts are often pre-processed using tools such as the Natural Language Toolkit for Python or RTextTools (which is developed in part here at the University of Washington) to remove these words and punctuation. Further, words are often “stemmed” or “lemmatized” so that words sharing a root and a similar meaning collapse into a single term. For example, “run”, “runner”, “running”, and “runs” might all be reduced to “run”.
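A minimal sketch of that pipeline, using only the Python standard library, might look like the following. The stop-word list and the `crude_stem` function are deliberately tiny stand-ins invented for this example; real work would use the much fuller lists and stemmers in a toolkit such as NLTK.

```python
from collections import Counter
import re

# Tiny illustrative stop-word list; real lists are much longer.
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "are", "on"}

def crude_stem(word):
    """Strip a few common English suffixes. A stand-in for a real stemmer."""
    for suffix in ("ning", "ner", "ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def tokenize(text):
    """Lowercase, drop punctuation, remove stop words, then stem."""
    words = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(w) for w in words if w not in STOP_WORDS]

def document_term_matrix(docs):
    """Return (vocabulary, rows); rows[i][j] counts term j in document i."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    rows = [[c[term] for term in vocab] for c in counts]
    return vocab, rows

docs = [
    "The runner runs in the protest.",
    "Protests and running.",
]
vocab, dtm = document_term_matrix(docs)
print(vocab)  # ['protest', 'run']
print(dtm)    # [[1, 2], [1, 1]]
```

With stop words removed and suffixes stripped, nine raw words collapse into a two-column matrix. The same crude stemmer would also mangle many ordinary words, which is one reason to rely on a maintained library instead.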

This approach is known as a “bag-of-words” approach in that the order and context of the words are assumed to be unimportant (obviously, a contentious assumption, but perhaps that is a debate for another blog). Researchers who are uncomfortable with this assumption can use n-grams, groupings of two or more words, rather than single words. However, as the n increases, the number of possible combinations and the accompanying computing power required grow rapidly. You may be familiar with the Google Ngram Viewer. Most of the models are extendable to other languages and are indifferent to the actual content of the text, although obviously the researcher needs to be able to read and make sense of the output.
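Extracting n-grams is simple; the cost lies in how many distinct ones can exist. The sketch below uses a hypothetical helper, `ngrams`, and an assumed vocabulary of 10,000 words chosen purely for illustration.

```python
def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "seven inch high heels".split()
print(ngrams(tokens, 2))
# [('seven', 'inch'), ('inch', 'high'), ('high', 'heels')]

# Upper bound on distinct n-grams for an assumed 10,000-word vocabulary.
vocabulary_size = 10_000
for n in (1, 2, 3):
    print(n, vocabulary_size ** n)
```

Most of those possible combinations never occur in real text, but even the observed ones add columns to the document-term matrix far faster than single words do.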

Other methods require different assumptions. If you are interested in parts of speech, a part-of-speech tagger is required, which assumes that the document is fairly coherent and not riddled with typos. Tracking exact or near-exact phrases is difficult as well, as evidenced by the formidable team of computer scientists working on MemeTracker. The number of possible variations on even a short phrase quickly becomes unwieldy and requires substantial computational resources — which brings us back to the seven-inch heels.

Although IBM now develops the oft-maligned SPSS, they also produced Watson. This is why the total lack of validity of the fashion-blogging results is surprising. If one were seriously going to track the height of heels mentioned and attempt to correlate it with economic conditions, then to have any confidence of having captured an unbiased sample of mentions, the necessary steps would include at least the following:

  • Identifying possible combinations of size metrics and words for heels: seven-inch heels, seven inch heels, seven inch high heels, seven-inch high-heels, seven inch platforms, and so on. This is further complicated by the fact that many text processing algorithms will treat “seven-inch” as one word.
  • Dealing with the problem of punctuational abbreviations for these metrics: 7″ heels, 7″ high heels, 7 and a 1/2 inch heels, etc. Since punctuation is usually stripped out, it would be necessary to leave it in, but then how to distinguish quotation marks that appear as size abbreviations and those that appear in other contexts?
  • Deciding whether to include all of these variations with “pumps.” Is there something systematic (age, location, etc.) about individuals who refer to “pumps” rather than “heels”?
  • Are there words or descriptions for heels that I’m not even aware of? Probably.
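
To give a sense of what the first two items involve, here is a rough sketch of a pattern that catches the variants listed above. It is in no way a reconstruction of IBM's method, which is unknown. The names `NUMBER_WORDS`, `HEEL_PATTERN`, and `heel_heights` are invented for this example, and the pattern only knows a handful of number words and footwear terms.

```python
import re

# Spelled-out numbers to recognize; deliberately incomplete.
NUMBER_WORDS = {"three": 3, "four": 4, "five": 5, "six": 6, "seven": 7, "eight": 8}
_number = r"\d+(?:\.\d+)?|" + "|".join(NUMBER_WORDS)

HEEL_PATTERN = re.compile(
    rf"""
    \b(?P<num>{_number})                  # digits or a spelled-out number
    (?P<half>\s+and\s+a\s+(?:1/2|half))?  # optional half inch
    \s*-?\s*                              # space or hyphen before the unit
    (?:inch(?:es)?|["″”])                 # the word, or a quote mark as abbreviation
    [\s-]*
    (?:high[\s-]*)?                       # optional high, with space or hyphen
    (?:heels?|platforms?|pumps?)\b        # the footwear term
    """,
    re.IGNORECASE | re.VERBOSE,
)

def heel_heights(text):
    """Return the heel heights, in inches, mentioned in text."""
    heights = []
    for match in HEEL_PATTERN.finditer(text):
        raw = match.group("num").lower()
        value = float(NUMBER_WORDS.get(raw, raw))
        if match.group("half"):
            value += 0.5
        heights.append(value)
    return heights

sample = 'She wore seven-inch heels, then 7″ high heels, then 7 and a 1/2 inch pumps. "Wow," he said.'
print(heel_heights(sample))  # [7.0, 7.0, 7.5]
```

The pattern sidesteps the quotation-mark problem only by requiring a number before the mark and a footwear word after it, and it must run on raw text before punctuation is stripped. Every synonym it does not list is silently missed, which is exactly the sampling bias the last two items warn about.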

None of these is an insurmountable problem and I have no doubt that IBM researchers have easy access to substantial computing power. However, each of them requires careful thought prior to and following data collection; the combination of them together quickly complicates matters. Since IBM is unlikely to reveal their methods, though, I have serious doubts as to the validity of their findings.

As any content analyst can tell you, text is a truly unique data source as it is intentional language and is one of the few sources of observational data for which the observation process is totally unobtrusive. In some cases, the authors are no longer alive! Much of the available online text of interest to social scientists was not produced for scholarly inquiry and was not generated from survey responses. However, the sheer volume of the text requires some (but not much!) technical sophistication to acquire and make sense of and, like any other method, these analyses can produce results that are essentially meaningless. Just as your statistics package of choice will output meaningless regression results from just about any data you feed into it, automated and semi-automated text analysis produces its own share of seven-inch heels.

November 21, 2011 at 8:46 am 11 comments

Useless Majors or Small Majors?

| Gabriel |

The WSJ has a very interesting table of the unemployment and wage distributions for various majors. There’s lots to talk about, particularly the STEM/humanities/social/vocational divide, but one thing that struck me was that the highest and lowest unemployment rates were dominated by tiny majors. In general, small populations tend to have more widely varying outcomes just as a function of standard error, which is why you should always ignore headlines about big jumps in the crime rate for small towns. Anyway, I downloaded the data, generated some plots, and yup, it’s your classic funnel.
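
The funnel follows from the standard error of a proportion, sqrt(p(1-p)/n), which shrinks as the number of people observed grows. The short Python sketch below makes the point with made-up numbers: the 5% rate and the sample sizes are assumptions for illustration, since the WSJ table reports only popularity rank and not how many respondents held each major.

```python
import math

def proportion_se(p, n):
    """Standard error of a proportion p estimated from n observations."""
    return math.sqrt(p * (1 - p) / n)

p = 0.05  # assumed true unemployment rate, for illustration only
for n in (50, 500, 5000):  # hypothetical numbers of respondents per major
    se = proportion_se(p, n)
    low, high = max(0.0, p - 1.96 * se), p + 1.96 * se
    print(f"n={n:>5}  SE={se:.3f}  approx. 95% range: {low:.3f} to {high:.3f}")
```

With only a few dozen respondents, a major whose true rate is 5% can plausibly show anything from zero to double digits, which is why the tiny majors crowd both ends of the table.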

Here’s unemployment by rank popularity. Because low rank means popular, the funnel is backwards.

WSJ only provides rank, but I approximated raw size as the inverse of the log of rank plus one, that is, 1/log(1 + rank), and this gives us the typical funnel.

Moral of the story, don’t change your major from clinical psych to actuarial science just yet. On the other hand, nursing, elementary education, and general education really do appear to be real deal outliers of low unemployment.

Here’s the code.

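* Load the tab-delimited WSJ table (reproduced in plain text below)
insheet using ~/Documents/codeandculture/majors.txt, clear
* The trailing tab on each row leaves an empty seventh column
drop v7
* Strip the percent sign and convert the unemployment rate to a number
gen unemploymentpercent_real=real(subinstr(unemploymentpercent,"%","",.))
* Plot 1: unemployment against popularity rank (rank 1 = most popular)
twoway scatter unemploymentpercent_real  popularity, xtitle(Rank Order Popularity) ytitle(% Unemployed)
graph export majors_unemployed_rank.png, replace

* Approximate raw size as 1/log(1+rank), since WSJ reports only rank
gen size=1/(log(1+popularity))
corr unemploymentpercent_real popularity size

* Label the ends of the size axis instead of showing its arbitrary units
lab def size 0 "Obscure" 2 "Ubiquitous"
lab val size size
* Plot 2: unemployment against approximate size
twoway (scatter unemploymentpercent_real size), xlabel(#2, labels angle(forty_five) valuelabel) xtitle(Approximate Raw Size) ytitle(% Unemployed)
graph export majors_unemployed_size.png, replace

*have a nice day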

Also, here’s the data in plain text:

Major Field	Unemployment Percent	25th % Earnings	Median % Earnings	75th % Earnings	Popularity	
ACCOUNTING	5.4%	$41,000	$61,000	$94,000	3	
ACTUARIAL SCIENCE	0.0%	$52,000	$81,000	$116,000	150	
ADVERTISING AND PUBLIC RELATIONS	6.1%	$36,000	$50,000	$74,000	41	
AEROSPACE ENGINEERING	3.6%	$60,000	$84,000	$111,000	105	
AGRICULTURAL ECONOMICS	1.3%	$30,000	$57,000	$99,000	122	
AGRICULTURE PRODUCTION AND MANAGEMENT	3.0%	$32,000	$48,000	$71,000	75	
ANIMAL SCIENCES	5.7%	$26,000	$40,000	$60,000	67	
ANTHROPOLOGY AND ARCHEOLOGY	6.9%	$30,000	$40,000	$60,000	55	
APPLIED MATHEMATICS	4.1%	$52,000	$71,000	$100,000	131	
ARCHITECTURAL ENGINEERING	5.8%	$50,000	$71,000	$96,000	140	
ARCHITECTURE	10.6%	$37,000	$60,000	$85,000	33	
AREA ETHNIC AND CIVILIZATION STUDIES	5.7%	$34,000	$48,000	$76,000	66	
ART AND MUSIC EDUCATION	4.2%	$32,000	$41,000	$51,000	48	
ART HISTORY AND CRITICISM	6.9%	$33,000	$45,000	$71,000	81	
ASTRONOMY AND ASTROPHYSICS	0.0%	$56,000	$62,000	$101,000	170	
ATMOSPHERIC SCIENCES AND METEOROLOGY	1.6%	$40,000	$68,000	$101,000	146	
BIOCHEMICAL SCIENCES	7.1%	$30,000	$48,000	$80,000	87	
BIOLOGICAL ENGINEERING	6.8%	$39,000	$60,000	$94,000	126	
BIOLOGY	5.6%	$35,000	$51,000	$76,000	14	
BIOMEDICAL ENGINEERING	5.9%	$45,000	$68,000	$101,000	137	
BOTANY	6.9%	$26,000	$40,000	$55,000	147	
BUSINESS ECONOMICS	5.0%	$44,000	$71,000	$101,000	80	
BUSINESS MANAGEMENT AND ADMINISTRATION	6.0%	$38,000	$56,000	$85,000	1	
CHEMICAL ENGINEERING	3.8%	$60,000	$86,000	$117,000	49	
CHEMISTRY	5.1%	$39,000	$59,000	$85,000	36	
CIVIL ENGINEERING	4.9%	$55,000	$76,000	$101,000	32	
CLINICAL PSYCHOLOGY	19.5%	$25,000	$40,000	$61,000	168	
COGNITIVE SCIENCE AND BIOPSYCHOLOGY	4.5%	$36,000	$43,000	$91,000	167	
COMMERCIAL ART AND GRAPHIC DESIGN	8.1%	$31,000	$45,000	$69,000	21	
COMMUNICATION DISORDERS SCIENCES AND SERVICES	3.3%	$32,000	$41,000	$50,000	98	
COMMUNICATION TECHNOLOGIES	6.7%	$33,000	$50,000	$73,000	89	
COMMUNICATIONS	6.3%	$35,000	$50,000	$81,000	7	
COMMUNITY AND PUBLIC HEALTH	4.1%	$31,000	$46,000	$70,000	110	
COMPOSITION AND SPEECH	7.7%	$30,000	$40,000	$61,000	99	
COMPUTER ADMINISTRATION MANAGEMENT AND SECURITY	9.5%	$39,000	$52,000	$75,000	114	
COMPUTER AND INFORMATION SYSTEMS	5.6%	$44,000	$62,000	$86,000	31	
COMPUTER ENGINEERING	7.0%	$58,000	$81,000	$102,000	47	
COMPUTER NETWORKING AND TELECOMMUNICATIONS	5.2%	$35,000	$53,000	$76,000	97	
COMPUTER PROGRAMMING AND DATA PROCESSING	6.2%	$39,000	$55,000	$84,000	121	
COMPUTER SCIENCE	5.6%	$50,000	$77,000	$102,000	10	
CONSTRUCTION SERVICES	5.4%	$49,000	$65,000	$101,000	76	
COSMETOLOGY SERVICES AND CULINARY ARTS	7.3%	$26,000	$41,000	$60,000	115	
COUNSELING PSYCHOLOGY	5.2%	$23,000	$34,000	$42,000	133	
COURT REPORTING	4.9%	$36,000	$55,000	$81,000	151	
CRIMINAL JUSTICE AND FIRE PROTECTION	4.7%	$36,000	$50,000	$73,000	13	
CRIMINOLOGY	5.2%	$35,000	$50,000	$71,000	92	
DRAMA AND THEATER ARTS	7.1%	$28,000	$40,000	$60,000	45	
EARLY CHILDHOOD EDUCATION	4.1%	$28,000	$37,000	$45,000	50	
ECOLOGY	5.2%	$31,000	$43,000	$60,000	109	
ECONOMICS	6.3%	$42,000	$69,000	$108,000	16	
EDUCATIONAL ADMINISTRATION AND SUPERVISION	0.0%	$41,000	$65,000	$89,000	171	
EDUCATIONAL PSYCHOLOGY	10.9%	$28,000	$35,000	$51,000	156	
ELECTRICAL AND MECHANIC REPAIRS AND TECHNOLOGIES	8.4%	$30,000	$44,000	$68,000	134	
ELECTRICAL ENGINEERING	5.0%	$60,000	$86,000	$111,000	17	
ELECTRICAL ENGINEERING TECHNOLOGY	5.5%	$42,000	$65,000	$91,000	65	
ELEMENTARY EDUCATION	3.6%	$32,000	$40,000	$49,000	8	
ENGINEERING AND INDUSTRIAL MANAGEMENT	9.2%	$50,000	$71,000	$98,000	127	
ENGINEERING MECHANICS PHYSICS AND SCIENCE	6.5%	$40,000	$67,000	$101,000	132	
ENGINEERING TECHNOLOGIES	5.3%	$40,000	$60,000	$91,000	117	
ENGLISH LANGUAGE AND LITERATURE	6.7%	$32,000	$48,000	$75,000	11	
ENVIRONMENTAL ENGINEERING	2.2%	$54,000	$67,000	$90,000	144	
ENVIRONMENTAL SCIENCE	5.0%	$40,000	$52,000	$76,000	60	
FAMILY AND CONSUMER SCIENCES	5.1%	$30,000	$40,000	$58,000	29	
FILM VIDEO AND PHOTOGRAPHIC ARTS	7.3%	$30,000	$45,000	$71,000	54	
FINANCE	4.5%	$44,000	$65,000	$101,000	12	
FINE ARTS	7.4%	$28,000	$44,000	$65,000	22	
FOOD SCIENCE	6.9%	$34,000	$71,000	$101,000	129	
FORESTRY	3.1%	$38,000	$50,000	$73,000	104	
FRENCH GERMAN LATIN AND OTHER COMMON FOREIGN LANGUAGE STUDIES	5.9%	$32,000	$48,000	$67,000	43	
GENERAL AGRICULTURE	3.0%	$28,000	$44,000	$68,000	71	
GENERAL BUSINESS	5.3%	$38,000	$59,000	$91,000	2	
GENERAL EDUCATION	4.2%	$31,000	$41,000	$53,000	9	
GENERAL ENGINEERING	5.9%	$47,000	$73,000	$101,000	24	
GENERAL MEDICAL AND HEALTH SERVICES	5.8%	$35,000	$50,000	$71,000	74	
GENERAL SOCIAL SCIENCES	8.2%	$34,000	$50,000	$74,000	68	
GENETICS	7.4%	$33,000	$71,000	$99,000	163	
GEOGRAPHY	6.1%	$40,000	$54,000	$81,000	62	
GEOLOGICAL AND GEOPHYSICAL ENGINEERING	0.0%	$56,000	$73,000	$101,000	166	
GEOLOGY AND EARTH SCIENCE	5.7%	$41,000	$60,000	$93,000	73	
GEOSCIENCES	3.2%	$36,000	$52,000	$71,000	153	
HEALTH AND MEDICAL ADMINISTRATIVE SERVICES	4.3%	$36,000	$51,000	$76,000	63	
HEALTH AND MEDICAL PREPARATORY PROGRAMS	5.2%	$40,000	$60,000	$86,000	130	
HISTORY	6.5%	$34,000	$50,000	$81,000	18	
HOSPITALITY MANAGEMENT	5.8%	$32,000	$49,000	$71,000	38	
HUMAN RESOURCES AND PERSONNEL MANAGEMENT	6.6%	$40,000	$55,000	$85,000	40	
HUMAN SERVICES AND COMMUNITY ORGANIZATION	6.9%	$29,000	$38,000	$50,000	77	
HUMANITIES	8.4%	$30,000	$45,000	$62,000	118	
INDUSTRIAL AND MANUFACTURING ENGINEERING	5.6%	$50,000	$75,000	$100,000	59	
INDUSTRIAL AND ORGANIZATIONAL PSYCHOLOGY	10.4%	$45,000	$62,000	$81,000	135	
INDUSTRIAL PRODUCTION TECHNOLOGIES	3.1%	$46,000	$67,000	$91,000	82	
INFORMATION SCIENCES	5.9%	$48,000	$71,000	$95,000	69	
INTERCULTURAL AND INTERNATIONAL STUDIES	6.6%	$32,000	$50,000	$71,000	100	
INTERDISCIPLINARY SOCIAL SCIENCES	6.3%	$31,000	$40,000	$50,000	96	
INTERNATIONAL BUSINESS	8.5%	$38,000	$52,000	$87,000	72	
INTERNATIONAL RELATIONS	5.8%	$40,000	$57,000	$93,000	79	
JOURNALISM	7.0%	$34,000	$50,000	$79,000	25	
LANGUAGE AND DRAMA EDUCATION	5.0%	$32,000	$41,000	$50,000	58	
LIBERAL ARTS	7.6%	$32,000	$48,000	$71,000	20	
LIBRARY SCIENCE	15.0%	$23,000	$36,000	$49,000	159	
LINGUISTICS AND COMPARATIVE LANGUAGE AND LITERATURE	10.2%	$30,000	$44,000	$70,000	90	
MANAGEMENT INFORMATION SYSTEMS AND STATISTICS	4.2%	$47,000	$71,000	$96,000	44	
MARKETING AND MARKETING RESEARCH	5.9%	$40,000	$59,000	$90,000	6	
MASS MEDIA	6.9%	$32,000	$46,000	$69,000	35	
MATERIALS ENGINEERING AND MATERIALS SCIENCE	7.7%	$57,000	$84,000	$105,000	136	
MATERIALS SCIENCE	4.7%	$65,000	$81,000	$106,000	161	
MATHEMATICS	5.0%	$42,000	$63,000	$95,000	28	
MATHEMATICS AND COMPUTER SCIENCE	3.5%	$55,000	$91,000	$151,000	158	
MATHEMATICS TEACHER EDUCATION	3.4%	$34,000	$42,000	$56,000	108	
MECHANICAL ENGINEERING	3.8%	$60,000	$81,000	$106,000	23	
MECHANICAL ENGINEERING RELATED TECHNOLOGIES	6.6%	$38,000	$65,000	$87,000	123	
MEDICAL ASSISTING SERVICES	2.9%	$34,000	$51,000	$71,000	95	
MEDICAL TECHNOLOGIES TECHNICIANS	1.4%	$44,000	$58,000	$74,000	51	
METALLURGICAL ENGINEERING	3.9%	$50,000	$86,000	$110,000	152	
MICROBIOLOGY	5.2%	$40,000	$60,000	$86,000	94	
MILITARY TECHNOLOGIES	10.9%	$81,000	$86,000	$126,000	173	
MINING AND MINERAL ENGINEERING	4.3%	$71,000	$101,000	$121,000	162	
MISCELLANEOUS AGRICULTURE	3.7%	$31,000	$46,000	$67,000	160	
MISCELLANEOUS BIOLOGY	5.3%	$31,000	$50,000	$69,000	125	
MISCELLANEOUS BUSINESS & MEDICAL ADMINISTRATION	5.3%	$35,000	$52,000	$81,000	64	
MISCELLANEOUS EDUCATION	3.7%	$33,000	$46,000	$65,000	61	
MISCELLANEOUS ENGINEERING	7.4%	$42,000	$71,000	$91,000	106	
MISCELLANEOUS ENGINEERING TECHNOLOGIES	6.0%	$45,000	$65,000	$91,000	88	
MISCELLANEOUS FINE ARTS	16.2%	$26,000	$40,000	$49,000	164	
MISCELLANEOUS HEALTH MEDICAL PROFESSIONS	3.3%	$35,000	$45,000	$62,000	93	
MISCELLANEOUS PSYCHOLOGY	10.3%	$30,000	$45,000	$71,000	120	
MISCELLANEOUS SOCIAL SCIENCES	3.8%	$38,000	$52,000	$85,000	143	
MOLECULAR BIOLOGY	5.3%	$32,000	$50,000	$76,000	124	
MULTI-DISCIPLINARY OR GENERAL SCIENCE	4.6%	$36,000	$55,000	$81,000	26	
MULTI/INTERDISCIPLINARY STUDIES	5.5%	$34,000	$42,000	$50,000	107	
MUSIC	5.2%	$30,000	$45,000	$67,000	37	
NATURAL RESOURCES MANAGEMENT	6.9%	$36,000	$50,000	$71,000	78	
NAVAL ARCHITECTURE AND MARINE ENGINEERING	1.7%	$60,000	$96,000	$117,000	145	
NEUROSCIENCE	7.2%	$34,000	$52,000	$76,000	154	
NUCLEAR ENGINEERING	4.1%	$65,000	$96,000	$138,000	149	
NUCLEAR INDUSTRIAL RADIOLOGY AND BIOLOGICAL TECHNOLOGIES	2.2%	$47,000	$64,000	$81,000	142	
NURSING	2.2%	$48,000	$60,000	$80,000	4	
NUTRITION SCIENCES	6.4%	$35,000	$51,000	$71,000	101	
OCEANOGRAPHY	3.3%	$40,000	$50,000	$79,000	148	
OPERATIONS LOGISTICS AND E-COMMERCE	4.7%	$45,000	$65,000	$97,000	102	
OTHER FOREIGN LANGUAGES	6.4%	$32,000	$45,000	$76,000	111	
PETROLEUM ENGINEERING	4.4%	$83,000	$127,000	$178,000	138	
PHARMACOLOGY	0.0%	$48,000	$60,000	$101,000	169	
PHARMACY PHARMACEUTICAL SCIENCES AND ADMINISTRATION	3.2%	$78,000	$105,000	$121,000	53	
PHILOSOPHY AND RELIGIOUS STUDIES	7.2%	$30,000	$42,000	$65,000	42	
PHYSICAL AND HEALTH EDUCATION TEACHING	4.5%	$34,000	$46,000	$59,000	39	
PHYSICAL FITNESS PARKS RECREATION AND LEISURE	4.8%	$33,000	$45,000	$61,000	27	
PHYSICAL SCIENCES	2.5%	$36,000	$51,000	$68,000	157	
PHYSICS	4.5%	$39,000	$68,000	$101,000	70	
PHYSIOLOGY	4.6%	$30,000	$48,000	$68,000	113	
PLANT SCIENCE AND AGRONOMY	2.7%	$28,000	$42,000	$71,000	85	
POLITICAL SCIENCE AND GOVERNMENT	6.0%	$38,000	$57,000	$91,000	15	
PRE-LAW AND LEGAL STUDIES	7.9%	$32,000	$45,000	$69,000	91	
PSYCHOLOGY	6.1%	$30,000	$43,000	$65,000	5	
PUBLIC ADMINISTRATION	6.9%	$36,000	$50,000	$78,000	112	
PUBLIC POLICY	2.2%	$47,000	$65,000	$101,000	141	
SCHOOL STUDENT COUNSELING	0.0%	$18,000	$20,000	$42,000	172	
SCIENCE AND COMPUTER TEACHER EDUCATION	5.0%	$36,000	$47,000	$58,000	116	
SECONDARY TEACHER EDUCATION	3.8%	$35,000	$43,000	$59,000	57	
SOCIAL PSYCHOLOGY	8.8%	$32,000	$45,000	$60,000	155	
SOCIAL SCIENCE OR HISTORY TEACHER EDUCATION	3.0%	$35,000	$45,000	$58,000	83	
SOCIAL WORK	6.8%	$30,000	$39,000	$51,000	30	
SOCIOLOGY	7.0%	$33,000	$45,000	$67,000	19	
SOIL SCIENCE	4.9%	$43,000	$64,000	$81,000	165	
SPECIAL NEEDS EDUCATION	3.6%	$34,000	$42,000	$50,000	52	
STATISTICS AND DECISION SCIENCE	6.9%	$50,000	$76,000	$108,000	128	
STUDIO ARTS	8.0%	$25,000	$37,000	$57,000	84	
TEACHER EDUCATION: MULTIPLE LEVELS	1.1%	$30,000	$38,000	$48,000	86	
THEOLOGY AND RELIGIOUS VOCATIONS	4.1%	$25,000	$38,000	$54,000	46	
TRANSPORTATION SCIENCES AND TECHNOLOGIES	4.4%	$42,000	$68,000	$98,000	56	
TREATMENT THERAPY PROFESSIONS	2.6%	$40,000	$62,000	$81,000	34	
UNITED STATES HISTORY	15.1%	$30,000	$50,000	$96,000	139	
VISUAL AND PERFORMING ARTS	9.2%	$20,000	$36,000	$52,000	103	
ZOOLOGY	6.7%	$33,000	$55,000	$81,000	119	

November 7, 2011 at 9:55 am 34 comments

