2019-04-16 Journal article DOI: 10.18452/20123
Topic modeling for analyzing open-ended survey responses
Open-ended responses are widely used in market research studies. Processing of such responses requires labour-intensive human coding. This paper focuses on unsupervised topic models and tests their ability to automate the analysis of open-ended responses. Since state-of-the-art topic models struggle with the shortness of open-ended responses, the paper considers three novel short text topic models: Latent Feature Latent Dirichlet Allocation, Biterm Topic Model and Word Network Topic Model. The models are fitted and evaluated on a set of real-world open-ended responses provided by a market research company. Multiple components such as topic coherence and document classification are quantitatively and qualitatively evaluated to appraise whether topic models can replace human coding. The results suggest that topic models are a viable alternative for open-ended response coding. However, their usefulness is limited when a correct one-to-one mapping of responses and topics or the exact topic distribution is needed.
This article was supported by the Open Access Publication Fund of Humboldt-Universität zu Berlin.