TY - CONF
TI - On Finding the Natural Number of Topics with Latent Dirichlet Allocation: Some Observations
AU - Arun, R.
AU - Suresh, V.
AU - Veni Madhavan, C. E.
AU - Narasimha Murthy, M. N.
A2 - Zaki, Mohammed J.
A2 - Yu, Jeffrey Xu
A2 - Ravindran, B.
A2 - Pudi, Vikram
T3 - Lecture Notes in Computer Science
AB - This work proposes a measure for identifying the correct number of topics and offers empirical evidence in its favor, in terms of both classification accuracy and the number of topics naturally present in the corpus. The measure's merit is shown by applying it to real-world as well as synthetic data sets (both text and images). In proposing this measure, the authors view LDA as a matrix factorization mechanism, wherein a given corpus $C$ is split into two matrix factors $M_1$ and $M_2$, as given by $C_{d \times w} = (M_1)_{d \times t} \times (M_2)_{t \times w}$, where $d$ is the number of documents, $w$ is the size of the vocabulary, and $t$ is the number of topics.
C3 - Advances in Knowledge Discovery and Data Mining
DA - 2010///
PY - 2010
SP - 391
EP - 402
LA - en
PB - Springer Berlin Heidelberg
SN - 978-3-642-13657-3
ST - On Finding the Natural Number of Topics with Latent Dirichlet Allocation
KW - Topic model optimization
KW - Topic modeling
ER -