Topic Model Interpretation Protocol

Because complex data analysis can have a “black box” effect, researchers using machine-learning methods (e.g., in so-called in silico science) need not only to document technical workflows for reproducibility but also to make the steps in a workflow humanly understandable. The goal is to facilitate the interpretation of results. Digital humanities research, of course, is rooted not just in data science but also in long-standing traditions of humanistic hermeneutics, including the critical scrutiny of how humans “read” and “interpret” materials. Digital humanists thus carry the extra burden of needing to make visible the machine-to-human and human-to-human steps hidden in the interpretive process: how researchers read a topic model, and how they communicate, discuss, and provide evidence for observations about topic models to reach credible conclusions. Yet there are currently no best practices in the digital humanities for explaining data workflows, let alone for doing so with attention to the act of human interpretation.

WE1S has developed a topic-model interpretation protocol that declares standard instructions and observation steps for researchers using topic models: that is, a transparent, documented, and understandable process for the interaction between machine learning and human interpretation. The goal is not to assert a definitive topic-model interpretation process (because this will differ depending on the nature of projects, materials, resources, and personnel), but to declare a process that can then serve as a paradigm to be adapted, improved, and varied by the larger digital humanities community.

[Diagram: modules of the WE1S Interpretation Protocol]

  • Module 0: User Training
  • Module 1: Overview of model
  • Module 2: Representative topics
  • Module 3.a: Analyze a topic
  • Module 3.b: Analyze a cluster of topics
  • Module 3.c: Analyze a keyword
  • Module 4.a: Compare sets of topics (multiple topics)
  • Module 4.c: Compare two keywords
  • Module 5: Compare two parts of corpus
    • part to whole
    • 2 metadata sets
    • 2 time ranges
    • compare to "random" corpus
  • Module Z: Add-on steps for collaborative interpretation
  • Module 7: Analysis & synthesis of interpretation results
  • Report Module: Instructions for writing report
  • Module 6: Compare two different topic models
  • Module X: Make & document your own workflow
  • Module Y: Document use of additional methods and tools
  • Note in diagram: use Google Doc research report template

The WE1S Interpretation Protocol consists of discrete modules that are combined in sequence or parallel to address research questions. The modules are like Lego™ or Minecraft™ blocks to be creatively snapped together. There is even a “Module X” for making and documenting improvised workflows bridging between other modules.
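One way to picture "in sequence or parallel" is as a list of stages, where the stages follow one another and the modules inside a stage can be worked through side by side. The following is only an illustrative sketch, not part of the WE1S protocol or its tooling: the `Module` class, the `describe` function, and the particular combination of modules in `example_workflow` are all hypothetical, and only the module codes and titles are taken from the protocol diagram.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Module:
    """Hypothetical record for one protocol module."""
    code: str   # module label from the diagram, e.g. "3.a"
    title: str  # module title from the diagram


def describe(workflow: List[List[Module]]) -> List[str]:
    """Return one human-readable line per stage of a workflow.

    Stages are taken in sequence; a stage holding more than one
    module is reported as a set of parallel steps.
    """
    lines = []
    for number, stage in enumerate(workflow, start=1):
        labels = " | ".join(f"Module {m.code}: {m.title}" for m in stage)
        mode = "parallel" if len(stage) > 1 else "single"
        lines.append(f"Stage {number} ({mode}): {labels}")
    return lines


# An assumed example combination, chosen only for illustration.
example_workflow = [
    [Module("1", "Overview of model")],
    [Module("3.a", "Analyze a topic"), Module("3.c", "Analyze a keyword")],
    [Module("7", "Analysis & synthesis of interpretation results")],
]

for line in describe(example_workflow):
    print(line)
```

Writing a workflow down in a form like this is one possible way to record which modules were combined and in what order, which is the kind of documentation the protocol asks researchers to make explicit.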

The modules are implemented as Qualtrics surveys (shareable with others through QSF files that can be imported into Qualtrics at other institutions with a license). Interpretation Protocol modules are also shared as HTML-based forms that can be used by those without a Qualtrics license. (Word and PDF copies of the surveys are also available for reference or for drafting responses.)

WE1S is publishing its Interpretation Protocol in hopes that it will be forked, evolved, and adapted by other projects to create a consensus practice of open, reproducible digital humanities research.

  • Qualtrics surveys for Interpretation Protocol (v. 2.0, June 2019)
    • Live Qualtrics surveys
    • Word doc reference copies (useful for team members to use in drafting responses to the Qualtrics surveys, which can only be operated by an individual in a unique session; also useful for planning research exercises without starting an actual survey)
    • PDF reference copies (useful for planning research exercises without starting an actual survey)
    • QSF files (importable into Qualtrics at other institutions with a Qualtrics license) [pending]
  • HTML-forms version of Interpretation Protocol [pending]