Python for Social Scientists

(Originally posted on my old blog on Jan 9, 2014)

As part of the content analysis I am doing for my dissertation, I’ve started to look at using Python to scrape documents from the web, as well as clean them up for analysis. In theory this should save me lots of man-hours of work; in reality, well, we’ll see.
Anyway, I thought I would share my (very limited) experience and sources here. This is meant for total newbies, so if you know anything about coding, go hang out at Stack Overflow and make fun of me there.

Continue reading →

Content Analysis

(Originally posted on my old blog on Dec. 9, 2013)

I’ve had several colleagues ask about using content analysis, so I’ve decided to put together a list of links and other resources in one spot.

A word on the qual vs. quant divide. Basically, when it comes to text as data, for me, your research goals define your methods. Sometimes, you are just not going to be able to answer your question without human coding. On the other hand, if you’re analyzing massive amounts of text, unless your research budget is equal to God’s, you aren’t going to be able to deal with it except by automating the coding process. That said, even after hand coding, I still use quantitative methods to check validity and see if the difference in categories between documents is statistically significant. The divide between the two is much blurrier here, although it does still play a role in how you define your data and what you see as “valid” coding methods.

Continue reading →