TY  - CONF
TI  - On Finding the Natural Number of Topics with Latent Dirichlet Allocation: Some Observations
AU  - Arun, R.
AU  - Suresh, V.
AU  - Veni Madhavan, C. E.
AU  - Narasimha Murthy, M. N.
A2  - Zaki, Mohammed J.
A2  - Yu, Jeffrey Xu
A2  - Ravindran, B.
A2  - Pudi, Vikram
T3  - Lecture Notes in Computer Science
AB  - This work proposes a measure to identify the correct number of topics and offers empirical evidence in its favor in terms of classification accuracy and the number of topics that are naturally present in the corpus. The measure's merit is shown by applying it to real-world as well as synthetic data sets (both text and images). In proposing this measure, LDA is viewed as a matrix factorization mechanism, wherein a given corpus C is split into two matrix factors M1 and M2 as given by C_{d×w} = M1_{d×t} × M2_{t×w}.
C3  - Advances in Knowledge Discovery and Data Mining
DA  - 2010///
PY  - 2010
SP  - 391
EP  - 402
LA  - en
PB  - Springer Berlin Heidelberg
SN  - 978-3-642-13657-3
ST  - On Finding the Natural Number of Topics with Latent Dirichlet Allocation
KW  - Topic model optimization
KW  - Topic modeling
ER  - 