Evaluating topic models with stability

Topic models are unsupervised techniques that extract likely topics from text corpora by creating probabilistic word-topic and topic-document associations. Evaluating topic models is a challenge because (a) topic models are often applied to unlabelled data, so no ground truth exists, and (b) state-of-the-art topic models produce "soft" (probabilistic) document clusters, which complicates comparison even when ground-truth labels are available. Perplexity has often been used as a performance measure, but it can only be used when the vocabulary and feature set are fixed. The authors turn to an alternative performance measure for topic models, topic stability, and compare its behaviour with that of perplexity as the vocabulary size is varied. They then evaluate two topic models, LDA and GaP, using topic stability. They also use labelled data to test topic stability on these two models, and show that topic stability has significant potential for evaluating topic models on both labelled and unlabelled corpora.
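To make the idea of topic stability concrete, the sketch below illustrates one common way such a measure can be computed; it is an assumption for illustration only, not the authors' exact procedure. It fits LDA twice on the same corpus with different random seeds, aligns the resulting topics across runs with a one-to-one matching, and reports the average similarity of matched topics (the corpus, number of topics, and cosine-similarity choice are all placeholders).

```python
# Illustrative topic-stability sketch (not the paper's exact method):
# fit LDA twice with different random seeds, align topics across runs,
# and average the matched topic similarities.
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "topic models extract topics from text corpora",
    "perplexity measures how well a model predicts held out text",
    "stability compares topics found in repeated runs of a model",
    "word document associations are probabilistic in topic models",
]

X = CountVectorizer().fit_transform(docs)  # term-count matrix
n_topics = 2

def topic_word_dists(seed):
    """Fit LDA and return row-normalised topic-word distributions."""
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=seed)
    lda.fit(X)
    return lda.components_ / lda.components_.sum(axis=1, keepdims=True)

A = topic_word_dists(seed=0)
B = topic_word_dists(seed=1)

# Cosine similarity between every topic pair across the two runs.
sim = (A / np.linalg.norm(A, axis=1, keepdims=True)) @ \
      (B / np.linalg.norm(B, axis=1, keepdims=True)).T

# Hungarian matching gives the best one-to-one topic alignment;
# the mean matched similarity is the stability score (1 = identical topics).
rows, cols = linear_sum_assignment(-sim)
stability = sim[rows, cols].mean()
print(f"topic stability: {stability:.3f}")
```

A higher score indicates that repeated runs recover essentially the same topics, which is what makes stability usable as an evaluation signal even when no document labels are available.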

Reference:

De Waal, A. and Barnard, E. 2008. Evaluating topic models with stability. Nineteenth Annual Symposium of the Pattern Recognition Association of South Africa (PRASA 2008), Cape Town, South Africa, 27-28 November 2008, pp. 79-84.