Bibliography – Topic Model Optimization

Selected DH research and resources bearing on, or utilized by, the WE1S project.

Kapadia, Shashank. “Evaluate Topic Models: Latent Dirichlet Allocation (LDA).” Medium, 2020. https://towardsdatascience.com/evaluate-topic-model-in-python-latent-dirichlet-allocation-lda-7d57484bb5d0.
Syed, S., and M. Spruit. “Selecting Priors for Latent Dirichlet Allocation.” In 2018 IEEE 12th International Conference on Semantic Computing (ICSC), 194–202, 2018. https://doi.org/10.1109/ICSC.2018.00035.
George, Clint P., and Hani Doss. “Principled Selection of Hyperparameters in the Latent Dirichlet Allocation Model.” Journal of Machine Learning Research 18, no. 162 (2018): 1–38. http://jmlr.org/papers/v18/15-595.html.
Narkhede, Sarang. Understanding Confusion Matrix, 2018. https://towardsdatascience.com/understanding-confusion-matrix-a9ad42dcfd62.
Dewi, Andisa, and Kilian Thiel. “Topic Extraction: Optimizing the Number of Topics with the Elbow Method.” KNIME (blog), 2017. https://www.knime.com/blog/topic-extraction-optimizing-the-number-of-topics-with-the-elbow-method.
Ellis, Peter. Cross-Validation of Topic Modelling, 2017. http://freerangestats.info/blog/2017/01/05/topic-model-cv.html.
Schofield, Alexandra, Laure Thompson, and David Mimno. “Quantifying the Effects of Text Duplication on Semantic Models.” In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2737–47. Copenhagen, Denmark: Association for Computational Linguistics, 2017. https://doi.org/10.18653/v1/D17-1290.
Schöch, Christof. Topic Modeling with MALLET: Hyperparameter Optimization, 2016. https://dragonfly.hypotheses.org/1051.
Soltoff, Benjamin. Text Analysis: Topic Modeling, 2016. https://cfss.uchicago.edu/fall2016/text02.html.
Allahyari, Mehdi, and Krys Kochut. “Discovering Coherent Topics with Entity Topic Models.” In 2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI), 26–33. Omaha, NE, USA: IEEE, 2016. https://doi.org/10.1109/WI.2016.0015.
Alexander, Eric, and Michael Gleicher. “Task-Driven Comparison of Topic Models.” IEEE Transactions on Visualization and Computer Graphics 22, no. 1 (2016): 320–29. https://doi.org/10.1109/TVCG.2015.2467618.
Murdock, Jaimie, and Colin Allen. “Visualization Techniques for Topic Model Checking.” In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 4284–85. AAAI’15. Austin, Texas: AAAI Press, 2015. http://dl.acm.org/citation.cfm?id=2888116.2888368.
Chuang, Jason, Margaret E. Roberts, Brandon M. Stewart, Rebecca Weiss, Dustin Tingley, Justin Grimmer, and Jeffrey Heer. “TopicCheck: Interactive Alignment for Assessing Topic Model Stability.” In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 175–84. Denver: Association for Computational Linguistics, 2015. https://doi.org/10.3115/v1/N15-1018.
Boyd-Graber, Jordan, David Mimno, and David Newman. “Care and Feeding of Topic Models: Problems, Diagnostics, and Improvements.” Handbook of Mixed Membership Models and Their Applications, 2014. https://doi.org/10.1201/b17520-21.
Sievert, Carson, and Kenneth R. Shirley. “LDAvis: A Method for Visualizing and Interpreting Topics.” In Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces, 63–70. Association for Computational Linguistics, 2014. http://www.aclweb.org/anthology/W14-3110.
Evans, Michael S. “A Computational Approach to Qualitative Analysis in Large Textual Datasets.” PLOS ONE 9, no. 2 (2014): e87908. https://doi.org/10.1371/journal.pone.0087908.
Chuang, Jason, Sonal Gupta, Christopher D. Manning, and Jeffrey Heer. “Topic Model Diagnostics: Assessing Domain Relevance via Topical Alignment.” In Proceedings of the 30th International Conference on Machine Learning - Volume 28, III-612–III-620. ICML’13. Atlanta, GA, USA: JMLR.org, 2013. http://dl.acm.org/citation.cfm?id=3042817.3043005.
Chen, Zhiyuan, Arjun Mukherjee, Bing Liu, Meichun Hsu, Malu Castellanos, and Riddhiman Ghosh. “Discovering Coherent Topics Using General Knowledge.” In Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, 209–18. CIKM ’13. New York, NY, USA: ACM, 2013. https://doi.org/10.1145/2505515.2505519.
Chuang, Jason, Daniel Ramage, Christopher Manning, and Jeffrey Heer. “Interpretation and Trust: Designing Model-Driven Visualizations for Text Analysis.” In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 443–52. CHI ’12. New York, NY, USA: ACM, 2012. https://doi.org/10.1145/2207676.2207738.
Ponweiser, Martin. “Latent Dirichlet Allocation in R.” Diploma Thesis, Vienna University of Economics and Business, 2012. http://epub.wu.ac.at/3558/.
Taddy, Matt. “On Estimation and Selection for Topic Models.” In Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, edited by Neil D. Lawrence and Mark Girolami, 1184–93. PMLR, 2012. http://proceedings.mlr.press/v22/taddy12.html.
Bischof, Jonathan M., and Edoardo M. Airoldi. “Summarizing Topical Content with Word Frequency and Exclusivity.” In Proceedings of the 29th International Conference on Machine Learning, 9–16. ICML’12. USA: Omnipress, 2012. https://icml.cc/2012/papers/113.pdf.
Mimno, David, and David Blei. “Bayesian Checking for Topic Models.” In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 227–37. Association for Computational Linguistics, 2011.
Gavagai (company). “The Advantage of Ethersource on the TOEFL Synonym Test Compared to Other Methods.” Gavagai (blog), 2011. https://www.gavagai.io/blog/2011/12/14/the-advantage-of-ethersource-on-the-toefl-synonym-test-compared-to-other-methods/.
Ratinov, Lev, Dan Roth, Doug Downey, and Mike Anderson. “Local and Global Algorithms for Disambiguation to Wikipedia,” 1375–84, 2011. https://aclweb.org/anthology/papers/P/P11/P11-1138/.
Arun, R., V. Suresh, C. E. Veni Madhavan, and M. N. Narasimha Murthy. “On Finding the Natural Number of Topics with Latent Dirichlet Allocation: Some Observations.” In Advances in Knowledge Discovery and Data Mining, edited by Mohammed J. Zaki, Jeffrey Xu Yu, B. Ravindran, and Vikram Pudi, 391–402. Lecture Notes in Computer Science. Springer Berlin Heidelberg, 2010.
Cao, Juan, Tian Xia, Jintao Li, Yongdong Zhang, and Sheng Tang. “A Density-Based Method for Adaptive LDA Model Selection.” Neurocomputing, Advances in Machine Learning and Computational Intelligence, 72, no. 7 (2009): 1775–81. https://doi.org/10.1016/j.neucom.2008.06.011.
AlSumait, Loulwah, Daniel Barbará, James Gentle, and Carlotta Domeniconi. “Topic Significance Ranking of LDA Generative Models.” In Proceedings of the 2009 European Conference on Machine Learning and Knowledge Discovery in Databases - Volume Part I, 67–82. ECMLPKDD’09. Berlin, Heidelberg: Springer-Verlag, 2009. https://doi.org/10.1007/978-3-642-04180-8_22.
Matveeva, Irina, Gina-anne Levow, Ayman Farahat, and Christiaan Royer. “Terms Representation with Generalized Latent Semantic Analysis.” In Recent Advances in Natural Language Processing IV: Selected Papers from RANLP 2005, 292:45–54. Amsterdam Studies in the Theory and History of Linguistic Science. Amsterdam; Philadelphia: John Benjamins, 2007. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.110.2216.
Newman, David, Jey Han Lau, Karl Grieser, and Timothy Baldwin. “Automatic Evaluation of Topic Coherence.” In COLING-ACL 2006: 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics; 17–21 July 2006, Sydney, Australia; Proceedings of the Conference. Vol. 1. Stroudsburg, PA: Association for Computational Linguistics, 2006. https://dl.acm.org/citation.cfm?id=1858011.
Griffiths, T. L., and M. Steyvers. “Finding Scientific Topics.” Proceedings of the National Academy of Sciences 101, no. Supplement 1 (2004): 5228–35. https://doi.org/10.1073/pnas.0307752101.