Collections & Topic Models

WE1S studies a corpus of journalistic media and other documents related to the humanities. It harvested these documents for text analysis but, due to copyright constraints, does not store them or make them available as readable text.* The corpus is organized into approximately 30 “collections” (combinations of different kinds of sources and year ranges) to facilitate exploring different research questions. Collections are represented and made available as word-frequency, topic-modeling, and other data generated from analyzing the original texts.

The explanation “cards” below provide brief summaries of the collections. [Cards are under construction.]


* WE1S makes available only derived, “non-consumptive use” datasets (word-frequency, topic-model, and other data) along with their visualizations. These datasets cannot be used to access, read, or reconstruct the original texts.