(Originally posted on my old blog on Dec. 9, 2013)

I’ve had several colleagues ask about using content analysis, so I’ve decided to put together a list of links and other resources in one spot.

A word on the qual vs. quant divide. For me, when it comes to text as data, your research goals define your methods. Sometimes you simply cannot answer your question without human coding. On the other hand, if you’re analyzing massive amounts of text, unless your research budget is equal to God’s, you can only deal with it by automating the coding process. That said, even after hand coding, I still use quantitative methods to assess the validity of the coding and to test whether the differences in categories between documents are statistically significant. The divide between the two is much blurrier here, although it does still play a role in how you define your data and what you see as “valid” coding methods.
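To make the "statistically significant" part concrete, here is a minimal sketch of one common option, a chi-square test of independence on category counts. The counts and the three-category scheme are made up for illustration, and this is only one of several tests you could reasonably use.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a table of
    counts (rows = documents or document groups, columns = coding categories)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count we would expect if category use did not depend on the document.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


# Hypothetical hand-coded counts: three categories in two sets of documents.
counts = [
    [30, 10, 10],  # document set A
    [10, 20, 20],  # document set B
]
stat, df = chi_square_independence(counts)

# With 2 degrees of freedom, the critical value at the 0.05 level is about 5.99.
CRITICAL_VALUE_DF2 = 5.991
print(f"chi-square = {stat:.2f}, df = {df}")
print("significant at 0.05" if stat > CRITICAL_VALUE_DF2 else "not significant at 0.05")
```

The test assumes the coded segments are independent and that expected counts are not too small, so treat the result as a rough check rather than a final answer.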

Books and Introduction to Content Analysis

Three books that I’ve found very useful on content analysis include:

Schreier, Margrit. 2012. Qualitative Content Analysis in Practice. Thousand Oaks: Sage. 

Neuendorf, Kimberly A. 2002. The Content Analysis Guidebook. Thousand Oaks: Sage.

Krippendorff, Klaus. 2012. Content Analysis: An Introduction to its Methodology. 3rd ed. Thousand Oaks: Sage.

I found the first to be the most accessible, and the other two to be useful as guides to specific issues. Schreier points out many not-so-common-sense issues one might encounter when hand coding text, as well as how to validate your coding (even if you don’t have a second coder).
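One way to check your own consistency without a second coder is to recode the same segments after some time has passed and compare the two passes. The sketch below computes Cohen's kappa for two such passes. The category labels and codes are invented for illustration, and this is my own example, not necessarily the exact procedure any of the books above prescribes.

```python
from collections import Counter


def cohens_kappa(codes_first, codes_second):
    """Cohen's kappa for two codings of the same segments (same order)."""
    if len(codes_first) != len(codes_second) or not codes_first:
        raise ValueError("Both codings must be non-empty and the same length.")

    n = len(codes_first)
    # Share of segments that received the same code in both passes.
    observed = sum(a == b for a, b in zip(codes_first, codes_second)) / n

    # Agreement expected by chance, given how often each code was used per pass.
    freq_first = Counter(codes_first)
    freq_second = Counter(codes_second)
    expected = sum(freq_first[c] * freq_second[c] for c in freq_first) / n**2

    if expected == 1.0:
        return 1.0  # Only one code was ever used, so agreement is trivially perfect.
    return (observed - expected) / (1 - expected)


# Hypothetical codes for ten segments, coded twice a few weeks apart.
first_pass = ["econ", "econ", "health", "health", "other",
              "econ", "health", "other", "econ", "other"]
second_pass = ["econ", "econ", "health", "other", "other",
               "econ", "health", "other", "health", "other"]

print(f"kappa = {cohens_kappa(first_pass, second_pass):.2f}")
```

Kappa corrects raw percent agreement for the agreement you would get by chance, which matters most when one category dominates the data.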

Ted Underwood’s blog also provides a great guide to getting started with text mining projects.

For a survey of automatic content analysis focusing on analysis at the document level, see Grimmer and Stewart (2013). Monroe, Colaresi and Quinn (2008) discuss text-as-data methods that aim to show differences in political speech and propose a model of their own.

Software and Computer Programming

For software (which is a big help but can be costly), these sites provide reviews: http://www.textanalysis.info and http://www.surrey.ac.uk/sociology/research/researchcentres/caqdas/. Many programs have trial versions so you can play around and see what you like, and some offer free versions that would work for smaller projects. If you are doing any hand coding, these make life easier.

Of course, you can also use Python or R, especially if you are doing quantitative analysis or using computer recognition techniques. I’ll write some more about that later.
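In the meantime, here is a minimal sketch of what automated coding can look like in Python: a simple dictionary-based coder that counts how often terms from each category appear in a document. The categories, term lists, and sample sentence are hypothetical, and real projects usually need stemming, phrase handling, and validation against hand-coded text.

```python
import re
from collections import Counter

# Hypothetical coding dictionary: category name -> terms that signal it.
CATEGORY_TERMS = {
    "economy": {"tax", "jobs", "budget"},
    "health": {"hospital", "insurance", "doctor"},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def code_document(text, dictionary=CATEGORY_TERMS):
    """Count how many tokens in the text match each category's term list."""
    token_counts = Counter(tokenize(text))
    return {
        category: sum(token_counts[term] for term in terms)
        for category, terms in dictionary.items()
    }


sample = "The budget cuts jobs; the tax plan cuts more jobs."
print(code_document(sample))
```

The same counts can then feed into whatever quantitative comparison you prefer, with documents as rows and categories as columns.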