TI - When you have a MALLET, everything looks like a nail
AU - Schmidt, Benjamin M.
AB - [Excerpt from beginning of the post]: One reason I'm interested in ship logs is that they give some distance to think about problems in reading digital texts. That's particularly true for machine learning techniques. In my last post, an appendix to the long whaling post, I talked about using k-means clustering and k-nearest neighbor methods to classify whaling voyages. But digital humanists working with texts hardly ever use k-means clustering; instead, they gravitate towards a more sophisticated form of clustering called topic modeling, particularly David Blei's LDA (so much so that I'm going to use 'LDA' and 'topic modeling' synonymously here). There's a whole genre of introductory posts out there encouraging humanists to try LDA: Scott Weingart's wraps a lot of them together, and Miriam Posner's is freshest off the presses.
So as an appendix to that appendix, I want to use ship's data to think about how we use LDA.
