WE1S Repositories & Deposits

Zenodo is the open-science repository for research data and related outputs that was created through the European OpenAIRE initiative and is operated by CERN. Zenodo follows the FAIR (Findable, Accessible, Interoperable, Reusable) principles.
GitHub is a development platform (proprietary) commonly used by software and other project developers to evolve, maintain, and distribute their code and documentation.
WE1S practices principles of research sustainability and openness by depositing its data (datasets and "collections"), tools, and lab notes in the Zenodo open-science repository. We also distribute our code resources — for our computing "Workspace" (tools and workflow) and the Docker containerization of its computing environment — in GitHub repositories. Below are searchable and sortable tables of our Zenodo deposits and GitHub repos.
Glossary of terms useful for understanding WE1S deposits and repositories:
  • "Corpus / Corpora" -- The total set of texts (and data about them) that WE1S works with. (Compare Collection.)
  • "Datasets" -- Complete sets of data representing the WE1S corpus of texts. The data have been derived from the original texts but are not themselves readable as plain text. For example, data that the WE1S Workspace generates from texts include bags-of-words or term frequencies, ngram counts, etc.
  • "Collection" -- Derived data and visualization files representing a subset of WE1S's datasets and corpus (e.g. just top newspapers, or student newspapers, or only newspaper articles containing both the words humanities and science, etc.).
  • "Project production files" -- Software code (Jupyter notebooks and related tools), derived data, topic model files, and visualization files used to create models and visualizations of "collections".
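
As an illustration of the kind of derived, non-readable data the "Datasets" entry describes, here is a minimal Python sketch that turns a short text into a bag of words (term frequencies) and bigram counts. It is not the WE1S preprocessing algorithm: the tokenizer, the function names, and the sample sentence are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (hypothetical tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def term_frequencies(tokens):
    """Return a bag of words: each term mapped to the number of times it occurs."""
    return Counter(tokens)


def ngram_counts(tokens, n=2):
    """Count every run of n consecutive tokens."""
    return Counter(zip(*(tokens[i:] for i in range(n))))


# Invented sample text, not taken from any WE1S source.
sample = "The humanities matter. The humanities and the sciences both matter."
tokens = tokenize(sample)
bag = term_frequencies(tokens)
bigrams = ngram_counts(tokens, 2)

print(bag.most_common(3))
print(bigrams.most_common(1))
```

The counts preserve which words and word pairs occur and how often, but the original sentence order cannot be read back from them, which is the sense in which such data is "not itself readable as plain text."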

WE1S Deposits in Zenodo: All

Deposit Title | Type | Brief Description | Open License | DOI
WE1S "Workspace" Software | Software code | Ensemble of Jupyter notebooks and other tools that can be used modularly or in a workflow sequence to collect, manage, analyze, topic model, visualize, and perform other operations on texts. (Latest versions of the template files for the Workspace are kept on the WE1S GitHub site.) | MIT | 10.5281/zenodo.5034712
humanities_keywords Dataset | Dataset | The WE1S humanities_keywords dataset contains word-frequency and other non-consumptive-use data about 474,930 unique documents (no duplicates or close variants) mentioning the word "humanities" and other keywords related to the humanities in English-language news sources. Other keywords include "liberal arts," "the arts," "literature," "history," and "philosophy." The documents came from 850 U.S. and 437 international news sources with their associated blogs (including student newspapers) published mostly during 1989-2019. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.5068311
comparison_not_humanities Dataset | Dataset | The WE1S comparison_not_humanities dataset contains word-frequency and other non-consumptive-use data about 1,380,456 unique English-language news documents (no duplicate or close-variant documents) that do not contain the word "humanities." The documents came from mainstream U.S. news sources published during 2000-2019. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.5068699
comparison_sciences Dataset | Dataset | The WE1S comparison_sciences dataset contains word-frequency and other non-consumptive-use data about 553,699 unique English-language news documents (no duplicate or close-variant documents) that contain the words "science" or "sciences." The documents came from U.S. mainstream and student news sources published during 1977-2019 (though mostly from 1985-2019). WE1S researchers use this data to understand how public discourse about the humanities compares to public discourse about science. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.5068756
twitter Dataset | Dataset | The WE1S twitter dataset contains 5,024,756 tweets posted to Twitter between December 6th, 2013 and June 30th, 2019. The dataset is divided into subcollections based on the query terms "humanities", "liberal arts", "stem", "science", and "science-es" (that is, a query for the presence of either "science" or "sciences"). Subcollections can be identified in the dataset from the value of the metapath property. Collectively, the tweets represent the work of 1,886,739 distinct usernames. Each tweet's mentions, hashtags, and links are recorded, as well as the number of likes and retweets. Unlike most other WE1S datasets, the Twitter dataset does not contain extracted features. Instead, it contains the original text of the tweet (the value of the content property), along with a tidy_tweet property, which contains the text of the tweet after preprocessing. Tweets were preprocessed using a modified form of the WE1S preprocessing algorithm. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.5068253
reddit Dataset | Dataset | The WE1S reddit dataset contains 1,034,174 Reddit comments containing the terms "humanities", "liberal arts", or "the arts", downloaded by Raymond Steding using pushshift.io. Initially, comments posted between 2006 and 2018 were collected. Comments from 2019 were later added. This data has been processed using the WhatEvery1Says preprocessor, and, in addition to metadata downloaded from Reddit, sentiment scores generated with TextBlob have been recorded. A description of the process at an early stage in the production of this dataset can be found in Steding's blog post "A Digital Humanities Study of Reddit Student Discourse about the Humanities." (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.5068267
tvarchive Dataset | Dataset | The WE1S tvarchive dataset contains word-frequency and other non-consumptive-use data about 1,205,844 English-language transcriptions of U.S. television news broadcasts. The documents were scraped from the Internet Archive's TV News Archive, which includes automatic captions of select U.S. news broadcasts since 2009. While the complete TV News Archive contains over 2.2 million transcripts, WE1S researchers were only able to collect about 1.2 million documents containing complete transcripts. The full TV News Archive includes transcripts from 33 networks and hundreds of shows. Unlike other WE1S datasets, the tvarchive dataset was not collected using keyword searches for specific terms (i.e., documents containing the word "humanities"). (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.5068267
Collection 1 | Data / Topic model visualizations | U.S. News Media, c. 1989-2019 (WE1S core collection of articles mentioning "humanities") -- A collection of word-frequency and other data representing 82,324 unique articles mentioning "humanities" (no duplicate or close-variant documents) published mostly during 1989-2019 in 850 U.S. news sources and their associated blogs. (About 5,000 articles originate from earlier in the 1980s.) The word "humanities" occurs 134,948 times in the collection. WE1S and other researchers use this data to look for broad patterns and to help guide closer study. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4902187
Project Production Files for Collection 1 | Data / Topic model visualizations | This is an archive of the WE1S project folder from which Collection 1 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5028254
Collection 2 | Data / Topic model visualizations | U.S. News Media, c. 1989-2019 (articles mentioning "humanities" or "liberal arts") -- A collection of word-frequency and other data representing 94,816 unique articles mentioning "humanities" or "liberal arts" (no duplicate or close-variant documents) published mostly during 1989-2019 in 884 U.S. news sources and their associated blogs. (5,492 articles originate from earlier years going back to 1977.) WE1S and other researchers use this data to look for broad patterns and to help guide closer study. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4908882
Project Production Files for Collection 2 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 2 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5030554
Collection 3 | Data / Topic model visualizations | U.S. News Media, c. 1989-2019 (articles mentioning "humanities" or "the arts") -- A collection of word-frequency and other data representing 108,207 unique articles mentioning "humanities" or "the arts" (no duplicate or close-variant documents) published mostly during 1989-2019 in 1,170 U.S. news sources and their associated blogs. (5,308 articles originate from earlier years going back to 1977.) WE1S and other researchers use this data to look for broad patterns and to help guide closer study. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4913688
Project Production Files for Collection 3 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 3 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5030871
Collection 4 | Data / Topic model visualizations | U.S. Top Newspapers, 1977-2018 (articles mentioning "humanities") -- A collection of word-frequency and other data representing 28,375 unique articles mentioning "humanities" (no duplicate or close-variant documents) published from 1977 to 2018 in the 15 top-circulation U.S. news sources and their associated blogs. The word "humanities" occurs 39,852 times in 28,375 documents in the collection. WE1S and other researchers use this data to look for broad patterns and to help guide closer study. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4919794
Project Production Files for Collection 4 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 4 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5031979
Collection 5 | Data / Topic model visualizations | U.S. Top Newspapers, 1977-2018 (articles mentioning "humanities" or "liberal arts") -- A collection of word-frequency and other data representing 30,323 unique articles mentioning "humanities" or "liberal arts" (no duplicate or close-variant documents) published from 1977 to 2018 in the 15 top-circulation U.S. news sources and their associated blogs. The word "humanities" occurs 39,890 times in 28,398 documents in the collection, while the phrase "liberal arts" occurs 2,888 times in 2,380 documents. WE1S and other researchers use this data to look for broad patterns and to help guide closer study. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4914736
Project Production Files for Collection 5 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 5 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5033192
Collection 14 | Data / Topic model visualizations | U.S. Student Newspapers (articles mentioning "humanities" or "liberal arts") -- A collection of word-frequency and other data representing 21,182 unique articles mentioning "humanities" or "liberal arts" (no duplicates or close variants) published in 1998-2018 (primarily 2005-2018) in about 650 U.S. university and college student newspapers that are on the UWire news service. WE1S and other researchers use this data to look for broad patterns and help guide closer study. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4920178
Project Production Files for Collection 14 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 14 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5033590
Collection 15 | Data / Topic model visualizations | Articles mentioning "humanities" or "literature" from ProQuest's Ethnic NewsWatch and GenderWatch -- A collection of word-frequency and other data representing 835 unique articles mentioning "humanities" or "literature" (no duplicate or close-variant documents) published mostly during 2016, 2018, and 2019 in 109 U.S. news sources gathered in ProQuest's Ethnic NewsWatch ("ethnic and minority press") and GenderWatch (sources gathered for "gender and women's studies, and gay, lesbian, bisexual, and transgender [GLBT] research"). WE1S and other researchers use this data to look for broad patterns and to help guide closer study. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4925152
Project Production Files for Collection 15 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 15 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5034610
Collection 18 | Data / Topic model visualizations | U.S. Student Newspapers (articles mentioning "science(s)") -- A collection of word-frequency and other data representing 81,445 unique articles mentioning "science" or "sciences" from the UWire news service. Articles were published in 2000-2018 in 601 university and college student newspapers, mainly from the United States. There is a noticeable spike in the number of articles mentioning "science(s)" between 2017 and 2018, from 8,116 to 14,162. WE1S and other researchers can use this data to look for broad patterns and guide closer study. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4914288
Project Production Files for Collection 18 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 18 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5034718
Collection 20 | Data / Topic model visualizations | U.S. Top Newspapers, 2000-2018 (sample of all articles) -- A collection of word-frequency and other data representing 29,183 unique articles (no duplicates or close variants) published during 2000-2018 in 15 top U.S. newspapers and their associated online blogs. WE1S and other researchers use this data to look for broad patterns and help guide closer study. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4927419
Project Production Files for Collection 20 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 20 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5035825
Collection 21 | Data / Topic model visualizations | U.S. Top Newspapers, 2000-2018 (articles mentioning "humanities" or "science") -- A collection that contains data representing all 15,692 articles from its set of sources in these years mentioning "humanities" but only a sampling of the 388,691 articles mentioning "science" or "sciences" from those same sources and years. It downsamples "science(s)" articles (while maintaining the proportions of articles from particular sources and years) to achieve a 50/50 balance of articles related to the humanities and sciences. The purpose is to allow media discourse on the humanities to be studied alongside that on the sciences and not be buried so far down in the statistical pile that it cannot easily be seen in detail. Collection 21 is thus not a representation of the relative weight of discussion of the humanities and sciences but instead an aid to studying the fine features and structures of each. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4927745
Project Production Files for Collection 21 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 21 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5039471
Collection 28 | Data / Topic model visualizations | Tweets containing keyword "humanities", c. 2014-2017 -- This collection of the WE1S Twitter corpus consists of 799,744 tweets containing the keyword "humanities" from authors who tweeted the term "humanities" more than once between Jan. 1, 2014, and Dec. 31, 2017. (See also C-29, which aggregates tweets by author.) (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4940253
Project Production Files for Collection 28 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 28 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5032911
Collection 29 | Data / Topic model visualizations | Tweets containing keyword "humanities", c. 2014-2017 (tweets aggregated by author) -- This collection of the WE1S Twitter corpus consists of 799,744 tweets containing the keyword "humanities" from authors who tweeted the term "humanities" more than once between Jan. 1, 2014, and Dec. 31, 2017. This version of our Twitter corpus compiles tweets by each author into single "documents" for topic-modeling analysis, resulting in 132,562 total documents. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4940259
Project Production Files for Collection 29 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 29 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5033338
Collection 32 | Data / Topic model visualizations | U.S. Top Newspapers (sample of all articles) -- A collection of word-frequency and other data representing 204,617 unique articles (no duplicates or close variants) published during 2012-2018 in 15 top U.S. newspapers and their associated online blogs. WE1S and other researchers use this data to look for broad patterns and help guide closer study. Included is data based on an approximately 1:40 proportional balance between articles mentioning "humanities" (about 5,000) and a sample of articles on everything else (about 200,000 more or less "random" documents found through searching on common English words). In essence, the collection is a sampled representation of "everything" in these sources for these years (limited by the fact that it is not feasible to know how many articles were actually published in these publications, to determine how completely they were collected in available database repositories, or to harvest everything from such databases). (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4940326
Project Production Files for Collection 32 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 32 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5040629
Collection 33 | Data / Topic model visualizations | Articles classified as being about the humanities or the sciences from U.S. top-circulating newspapers and student newspapers, c. 1998-2018 -- A collection of word-frequency and other data representing 13,214 unique articles (no duplicate or close-variant documents) classified as being about the humanities or science published from 1998-2018 in 507 U.S. top-circulating and student newspapers and their associated blogs. The collection includes 2,477 articles from U.S. top-circulating newspapers and 10,737 articles from student newspapers. Using supervised classification models, 2,869 articles in the collection have been classified as being about the humanities, and 10,345 articles in the collection have been classified as being about science. WE1S and other researchers use this data to look for broad patterns and to help guide closer study. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4940725
Project Production Files for Collection 33 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 33 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5042756
Collection 36 | Data / Topic model visualizations | Articles containing the word "humanities" but that have been classified as not being about the humanities from U.S. top-circulating newspapers and student newspapers, c. 1998-2018 -- A collection of word-frequency and other data representing 27,362 unique articles (no duplicate or close-variant documents) that contain the word "humanities" but that have not been classified as being about the humanities published from 1998-2018 in 545 U.S. top-circulating and student newspapers and their associated blogs. WE1S and other researchers use this data to look for broad patterns and to help guide closer study. The collection includes 13,309 articles from U.S. top-circulating newspapers and 14,053 articles from student newspapers. Supervised classification models have classified these articles as not being about the humanities; this collection therefore helps WE1S understand what articles that contain the word "humanities" but that aren't about the humanities per se are like. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4948902
Project Production Files for Collection 36 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 36 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5043200
Collection 37 | Data / Topic model visualizations | Articles containing the words "science" or "sciences" but that have been classified as not being about science, c. 1998-2018 -- A collection of word-frequency and other data representing 87,278 unique articles (no duplicate or close-variant documents) that contain the words "science" or "sciences" but that have not been classified as being about science published from 1998-2018 in 610 U.S. top-circulating and student newspapers and their associated blogs. WE1S and other researchers use this data to look for broad patterns and to help guide closer study. The collection includes 13,628 articles from U.S. top-circulating newspapers and 73,650 articles from student newspapers. Supervised classification models have classified these articles as not being about science; this collection therefore helps WE1S understand what articles that contain the words "science" or "sciences" but that aren't about science per se are like. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4958256
Project Production Files for Collection 37 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 37 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5043726
Collection 38 | Data / Topic model visualizations | A collection of 124,340 Reddit comments longer than 225 words from 2006 to 2018 containing the terms "humanities," "liberal arts," or "the arts." WE1S and other researchers use this data to look for broad patterns and to help guide closer study. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4958695
Project Production Files for Collection 38 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 38 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5039647
Collection 39 | Data / Topic model visualizations | A subset of WE1S's Collection 38 (C-38) Reddit collection -- Collection 39 is tailored to focus on student discourse about the humanities. Where C-38 includes Reddit comments longer than 225 words from 2006 to 2019 containing the terms humanities, liberal arts, or the arts, C-39 consists of 66,290 comments from that larger collection (about half the original number) that also contain at least one of the terms student, major, or college (including plurals and other forms). (Similar to C-38 is WE1S's Corpus-A, an earlier version of the same collection, but including only the years 2006-2018.) (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4959834
Project Production Files for Collection 39 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 39 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5044260
Wilcoxon Rank Sum Test and Keyphrase Extraction Data | Dataset | This repository contains Wilcoxon rank sum test and keyphrase extraction data cited in the WhatEvery1Says (WE1S) Project's article "What Everyone Says About the Humanities: The Challenge Posed by the Public Perception of the Humanities in the Media." | CC BY-SA 4.0 | 10.5281/zenodo.5112419
Topic Model Interpretation Protocol | Documents | The WhatEvery1Says (WE1S) Project developed a topic-model interpretation protocol that declares standard instructions and observation steps for researchers using topic models, offering a transparent, documented, and understandable process for the interaction between machine learning and human interpretation. We make the Interpretation Protocol files available "as is" in their original Qualtrics survey formats (exported as QSF files for others who can import them into Qualtrics) as well as in adapted Word .docx formats (using customized versions of Word's "document properties" in each file to re-create the editable, repeated "running notes" in the original surveys). These files include instructions and references that are specific to the WE1S project and its materials. We hope that they can be forked, evolved, and adapted by other projects to evolve a consensus practice of open, reproducible digital humanities research. | CC BY-SA 4.0 | 10.5281/zenodo.4940170
Lab-1 | Documents | Lab-1 is the documentation deposit for WE1S research team 1, which studied the media representation of the humanities "crisis". Included in the deposit are the team's reports and lab notes (working folders) for different periods of time during the WE1S project. | CC BY-SA 4.0 | 10.5281/zenodo.4891827
Lab-3 | Documents | Documentation deposit (reports and lab notes) of the WhatEvery1Says (WE1S) project's Team 3 -- research team studying the relation between social groups and the humanities as represented in journalistic media. | CC BY-SA 4.0 | 10.5281/zenodo.4828366
Lab-4 | Documents | Documentation deposit (reports and lab notes) of the WhatEvery1Says (WE1S) project's Team 4 -- research team studying the value of the humanities as represented in journalistic media. | CC BY-SA 4.0 | 10.5281/zenodo.4831043
Lab-5 | Documents | Documentation deposit (reports and lab notes) of the WhatEvery1Says (WE1S) project's Team 5 -- research team studying the broader profile of the humanities in society as represented in journalistic media. | CC BY-SA 4.0 | 10.5281/zenodo.4831113
Lab-6 | Documents | Documentation deposit (reports and lab notes) of the WhatEvery1Says (WE1S) project's Team 6 -- research team studying the humanities in different media, including social media. | CC BY-SA 4.0 | 10.5281/zenodo.4831165
Lab-7 | Documents | Documentation deposit (reports and lab notes) of the WhatEvery1Says (WE1S) project's Team 7 -- research team studying the impact of government, funding agencies, and foundations on the humanities as perceived in journalistic media. | CC BY-SA 4.0 | 10.5281/zenodo.4830907
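
Each deposit above is identified by a DOI of the form 10.5281/zenodo.NNNNNNN. The following Python sketch shows one way to turn such a DOI into links a reader or script could follow. The function name and returned keys are hypothetical, and the Zenodo records API URL pattern is an assumption about Zenodo's public REST API rather than anything stated by WE1S. For deposits with multiple versions, the numeric suffix may identify a version-independent record, so the doi.org resolver link is the safer of the two.

```python
import re

# Zenodo DOIs use the 10.5281 prefix followed by "zenodo." and a numeric identifier.
ZENODO_DOI = re.compile(r"^10\.5281/zenodo\.(\d+)$")


def zenodo_links(doi):
    """Build resolver and API links for a Zenodo DOI (hypothetical helper)."""
    doi = doi.strip()
    match = ZENODO_DOI.match(doi)
    if match is None:
        raise ValueError(f"not a Zenodo DOI: {doi!r}")
    record_id = match.group(1)
    return {
        "record_id": record_id,
        # The DOI resolver redirects to the deposit's landing page.
        "doi_url": f"https://doi.org/{doi}",
        # Assumed pattern for Zenodo's records API; returns JSON metadata when fetched.
        "api_url": f"https://zenodo.org/api/records/{record_id}",
    }


# DOI of the WE1S "Workspace" Software deposit, taken from the table above.
links = zenodo_links("10.5281/zenodo.5034712")
print(links["doi_url"])
print(links["api_url"])
```

The sketch only builds strings and makes no network requests, so it can be run offline; fetching the metadata itself would require an HTTP client.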

Go to WE1S GitHub