Zenodo is the open-science repository for research data and related outputs created through the European OpenAIRE initiative and operated by CERN. Zenodo follows FAIR (Findable, Accessible, Interoperable, Reusable) principles.
GitHub is a development platform (proprietary) commonly used by software and other project developers to evolve, maintain, and distribute their code and documentation.
WE1S practices principles of research sustainability and openness by depositing its data (datasets and "collections"), tools, and lab notes in the Zenodo open-science repository.
We also distribute our code resources — for our computing "Workspace" (tools and workflow) and the Docker containerization of its computing environment — in GitHub repositories.
Below are searchable and sortable tables of our Zenodo deposits and GitHub repos.
Glossary of terms useful for understanding WE1S deposits and repositories.
- "Corpus / Corpora" -- The total set of texts (and data about them) that WE1S works with. (Compare Collection.)
- "Datasets" -- Complete sets of data representing the WE1S corpus of texts that have been derived from the original texts but are not themselves readable as plain text. For example, data that the WE1S Workspace generates from texts include bags-of-words or term frequencies, ngram counts, etc.
- "Collection" -- Derived data and visualization files representing a subset of WE1S's datasets and corpus (e.g. just top newspapers, or student newspapers, or only newspaper articles containing both the words humanities and science, etc.).
- "Project production files" -- Software code (Jupyter notebooks and related tools), derived data, topic model files, and visualization files used to create models and visualizations of "collections".
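To make the "Datasets" entry concrete, the sketch below shows how a text can be reduced to term frequencies and bigram counts that are no longer readable as plain text. It is a minimal illustration using only the Python standard library. The tokenizer and the function names (`tokenize`, `term_frequencies`, `ngram_counts`) are assumptions made for this example and are not the WE1S Workspace's actual preprocessing code.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep only alphabetic word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def term_frequencies(text):
    """Return a bag-of-words: each term mapped to the number of times it occurs."""
    return Counter(tokenize(text))


def ngram_counts(text, n=2):
    """Count each run of n consecutive tokens (bigrams by default)."""
    tokens = tokenize(text)
    ngrams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in ngrams)


# Hypothetical sample sentence, not taken from the WE1S corpus.
sample = "The humanities and the sciences: the humanities matter."
bag = term_frequencies(sample)
bigrams = ngram_counts(sample)
print(bag.most_common(2))         # [('the', 3), ('humanities', 2)]
print(bigrams["the humanities"])  # 2
```

Word order is lost in the bag-of-words and only partly retained in the bigram counts, which is why data of this kind supports analysis without reproducing the original article.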
WE1S Deposits in Zenodo: All
Deposit Title | Type | Brief Description | Open License | DOI |
---|---|---|---|---|
WE1S "Workspace" Software | Software code | Ensemble of Jupyter notebooks and other tools that can be used modularly or in a workflow sequence to collect, manage, analyze, topic model, visualize, and perform other operations on texts. (Latest versions of the template files for the Workspace are kept on the WE1S GitHub site.) | MIT | 10.5281/zenodo.5034712 |
humanities_keywords Dataset | Dataset | The WE1S humanities_keywords dataset contains word-frequency and other non-consumptive-use data about 474,930 unique documents (no duplicate or close variants) mentioning the word "humanities" and other keywords related to the humanities in English-language news sources. Other keywords include "liberal arts," "the arts," "literature," "history," and "philosophy." The documents came from 850 U.S. and 437 international news sources with their associated blogs (including student newspapers) published mostly during 1989-2019. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.5068311 |
comparison_not_humanities Dataset | Dataset | The WE1S comparison_not_humanities dataset contains word-frequency and other non-consumptive-use data about 1,380,456 unique English-language news documents (no duplicate or close-variant documents) that do not contain the word "humanities." The documents came from mainstream U.S. news sources published during 2000-2019. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.5068699 |
comparison_sciences Dataset | Dataset | The WE1S comparison_sciences dataset contains word-frequency and other non-consumptive-use data about 553,699 unique English-language news documents (no duplicate or close-variant documents) that contain the words "science" or "sciences." The documents came from U.S. mainstream and student news sources published during 1977-2019 (though mostly from 1985-2019). WE1S researchers use this data to understand how public discourse about the humanities compares to public discourse about science. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.5068756 |
twitter Dataset | Dataset | The WE1S twitter dataset contains 5,024,756 tweets posted to Twitter between December 6th, 2013 and June 30th, 2019. The dataset is divided into subcollections based on the query terms "humanities", "liberal arts", "stem", "science", and "science-es" (that is, a query for the presence of either "science" or "sciences"). Subcollections can be identified in the dataset from the value of the metapath property. Collectively, the tweets represent the work of 1,886,739 distinct usernames. Each tweet's mentions, hashtags, and links are recorded, as well as the number of likes and retweets. Unlike most other WE1S datasets, the Twitter dataset does not contain extracted features. Instead, it contains the original text of the tweet (the value of the content property), along with a tidy_tweet property, which contains the text of the tweet after preprocessing. Tweets were preprocessed using a modified form of the WE1S preprocessing algorithm. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.5068253 |
reddit Dataset | Dataset | The WE1S reddit dataset contains 1,034,174 Reddit comments containing the terms "humanities", "liberal arts", or "the arts", downloaded by Raymond Steding using pushshift.io. Initially, comments posted between 2006 and 2018 were collected. Comments from 2019 were later added. This data has been processed using the WhatEvery1Says preprocessor, and, in addition to metadata downloaded from Reddit, sentiment scores generated with Textblob have been recorded. A description of the process at an early stage in the production of this dataset can be found in Steding's blog post "A Digital Humanities Study of Reddit Student Discourse about the Humanities." (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.5068267 |
tvarchive Dataset | Dataset | The WE1S tvarchive dataset contains word-frequency and other non-consumptive-use data about 1,205,844 English-language transcriptions of U.S. television news broadcasts. The documents were scraped from the Internet Archive's TV News Archive, which includes automatic captions of select U.S. news broadcasts since 2009. While the complete TV News Archive contains over 2.2 million transcripts, WE1S researchers were only able to collect about 1.2 million documents containing complete transcripts. The full TV News Archive includes transcripts from 33 networks and hundreds of shows. Unlike other WE1S datasets, the tvarchive dataset was not collected using keyword searches for specific terms (e.g., documents containing the word "humanities"). (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.5068267 |
Collection 1 | Data / Topic model visualizations | U.S. News Media, c. 1989-2019 (WE1S core collection of articles mentioning "humanities") -- A collection of word-frequency and other data representing 82,324 unique articles mentioning "humanities" (no duplicate or close-variant documents) published mostly during 1989-2019 in 850 U.S. news sources and their associated blogs. (About 5,000 articles originate from earlier in the 1980s.) The word "humanities" occurs 134,948 times in the collection. WE1S and other researchers use this data to look for broad patterns and to help guide closer study. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4902187 |
Project Production Files for Collection 1 | Data / Topic model visualizations | This is an archive of the WE1S project folder from which Collection 1 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5028254 |
Collection 2 | Data / Topic model visualizations | U.S. News Media, c. 1989-2019 (articles mentioning "humanities" or "liberal arts") -- A collection of word-frequency and other data representing 94,816 unique articles mentioning "humanities" or "liberal arts" (no duplicate or close-variant documents) published mostly during 1989-2019 in 884 U.S. news sources and their associated blogs. (5,492 articles originate from earlier years going back to 1977.) WE1S and other researchers use this data to look for broad patterns and to help guide closer study. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4908882 |
Project Production Files for Collection 2 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 2 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5030554 |
Collection 3 | Data / Topic model visualizations | U.S. News Media, c. 1989-2019 (articles mentioning "humanities" or "the arts") -- A collection of word-frequency and other data representing 108,207 unique articles mentioning "humanities" or "the arts" (no duplicate or close-variant documents) published mostly during 1989-2019 in 1,170 U.S. news sources and their associated blogs. (5,308 articles originate from earlier years going back to 1977.) WE1S and other researchers use this data to look for broad patterns and to help guide closer study. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4913688 |
Project Production Files for Collection 3 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 3 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5030871 |
Collection 4 | Data / Topic model visualizations | U.S. Top Newspapers, 1977-2018 (articles mentioning "humanities") -- A collection of word-frequency and other data representing 28,375 unique articles mentioning "humanities" (no duplicate or close-variant documents) published from 1977 to 2018 in the 15 top-circulation U.S. news sources and their associated blogs. The word "humanities" occurs 39,852 times in 28,375 documents in the collection. WE1S and other researchers use this data to look for broad patterns and to help guide closer study. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4919794 |
Project Production Files for Collection 4 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 4 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5031979 |
Collection 5 | Data / Topic model visualizations | U.S. Top Newspapers, 1977-2018 (articles mentioning "humanities" or "liberal arts") -- A collection of word-frequency and other data representing 30,323 unique articles mentioning "humanities" or "liberal arts" (no duplicate or close-variant documents) published from 1977 to 2018 in the 15 top-circulation U.S. news sources and their associated blogs. The word "humanities" occurs 39,890 times in 28,398 documents in the collection, while the phrase "liberal arts" occurs 2,888 times in 2,380 documents. WE1S and other researchers use this data to look for broad patterns and to help guide closer study. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4914736 |
Project Production Files for Collection 5 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 5 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5033192 |
Collection 14 | Data / Topic model visualizations | U.S. Student Newspapers (articles mentioning "humanities" or "liberal arts") -- A collection of word-frequency and other data representing 21,182 unique articles mentioning the "humanities" or "liberal arts" (no duplicates or close variants) published in 1998-2018 (primarily 2005-2018) in about 650 U.S. university and college student newspapers that are on the UWire news service. WE1S and other researchers use this data to look for broad patterns and help guide closer study. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4920178 |
Project Production Files for Collection 14 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 14 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5033590 |
Collection 15 | Data / Topic model visualizations | Articles mentioning "humanities" or "literature" from ProQuest's Ethnic NewsWatch and GenderWatch -- A collection of word-frequency and other data representing 835 unique articles mentioning "humanities" or "literature" (no duplicate or close-variant documents) published mostly during 2016, 2018, and 2019 in 109 U.S. news sources gathered in ProQuest's Ethnic NewsWatch ("ethnic and minority press") and GenderWatch (sources gathered for "gender and women's studies, and gay, lesbian, bisexual, and transgender [GLBT] research"). WE1S and other researchers use this data to look for broad patterns and to help guide closer study. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4925152 |
Project Production Files for Collection 15 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 15 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5034610 |
Collection 18 | Data / Topic model visualizations | U.S. Student Newspapers (articles mentioning "science(s)") -- A collection of word-frequency and other data representing 81,445 unique articles mentioning "science" or "sciences" from the UWire news service. Articles were published in 2000-2018 in 601 university and college student newspapers, mainly from the United States. There is a noticeable spike in the number of articles mentioning "science(s)" between 2017 and 2018, from 8,116 to 14,162. WE1S and other researchers can use this data to look for broad patterns and guide closer study. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4914288 |
Project Production Files for Collection 18 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 18 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5034718 |
Collection 20 | Data / Topic model visualizations | U.S. Top Newspapers, 2000-2018 (sample of all articles) -- A collection of word-frequency and other data representing 29,183 unique articles (no duplicates or close variants) published during 2000-2018 in 15 top U.S. newspapers and their associated online blogs. WE1S and other researchers use this data to look for broad patterns and help guide closer study. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4927419 |
Project Production Files for Collection 20 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 20 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5035825 |
Collection 21 | Data / Topic model visualizations | U.S. Top Newspapers, 2000-2018 (articles mentioning "humanities" or "science") -- A collection that contains data representing all 15,692 articles from its set of sources in these years mentioning "humanities" but only a sampling of the 388,691 articles mentioning "science" or "sciences" from those same sources and years. It downsamples "science(s)" articles (while maintaining the proportions of articles from particular sources and years) to achieve a 50/50 balance of articles related to the humanities and sciences. The purpose is to allow media discourse on the humanities to be studied alongside that on the sciences and not be buried so far down in the statistical pile that it cannot easily be seen in detail. Collection 21 is thus not a representation of the relative weight of discussion of the humanities and sciences but instead an aid to studying the fine features and structures of each. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4927745 |
Project Production Files for Collection 21 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 21 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5039471 |
Collection 28 | Data / Topic model visualizations | Tweets containing keyword "humanities", c. 2014-2017 -- This collection of the WE1S Twitter corpus consists of 799,744 tweets containing the keyword "humanities" from authors who tweeted the term "humanities" more than once between Jan. 1, 2014, and Dec. 31, 2017. (See also C-29, which aggregates tweets by author.) (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4940253 |
Project Production Files for Collection 28 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 28 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5032911 |
Collection 29 | Data / Topic model visualizations | Tweets containing keyword "humanities", c. 2014-2017 (tweets aggregated by author) -- This collection of the WE1S Twitter corpus consists of 799,744 tweets containing the keyword "humanities" from authors who tweeted the term "humanities" more than once between Jan. 1, 2014, and Dec. 31, 2017. This version of our Twitter corpus compiles tweets by each author into single "documents" for topic-modeling analysis, resulting in 132,562 total documents. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4940259 |
Project Production Files for Collection 29 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 29 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5033338 |
Collection 32 | Data / Topic model visualizations | U.S. Top Newspapers (sample of all articles) -- A collection of word-frequency and other data representing 204,617 unique articles (no duplicates or close variants) published during 2012-2018 in 15 top U.S. newspapers and their associated online blogs. WE1S and other researchers use this data to look for broad patterns and help guide closer study. Included is data based on an approximately 1:40 proportional balance between articles mentioning "humanities" (about 5,000) and a sample of articles on everything else (about 200,000 more or less "random" documents found through searching on common English words). In essence, the collection is a sampled representation of "everything" in these sources for these years (limited by the fact that it is not feasible to know how many articles were actually published in these publications, to determine how completely they were collected in available database repositories, or to harvest everything from such databases.) (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4940326 |
Project Production Files for Collection 32 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 32 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5040629 |
Collection 33 | Data / Topic model visualizations | Articles classified as being about the humanities or the sciences from U.S. top-circulating newspapers and student newspapers, c. 1998-2018 -- A collection of word-frequency and other data representing 13,214 unique articles (no duplicate or close-variant documents) classified as being about the humanities or science published from 1998-2018 in 507 U.S. top-circulating and student newspapers and their associated blogs. The collection includes 2,477 articles from U.S. top-circulating newspapers and 10,737 articles from student newspapers. Using supervised classification models, 2,869 articles in the collection have been classified as being about the humanities, and 10,345 articles in the collection have been classified as being about science. WE1S and other researchers use this data to look for broad patterns and to help guide closer study. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4940725 |
Project Production Files for Collection 33 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 33 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5042756 |
Collection 36 | Data / Topic model visualizations | Articles containing the word "humanities" but that have been classified as not being about the humanities from U.S. top-circulating newspapers and student newspapers, c. 1998-2018 -- A collection of word-frequency and other data representing 27,362 unique articles (no duplicate or close-variant documents) that contain the word "humanities" but that have not been classified as being about the humanities published from 1998-2018 in 545 U.S. top-circulating and student newspapers and their associated blogs. WE1S and other researchers use this data to look for broad patterns and to help guide closer study. The collection includes 13,309 articles from U.S. top-circulating newspapers and 14,053 articles from student newspapers. Supervised classification models have classified these articles as not being about the humanities; this collection therefore helps WE1S understand what articles that contain the word "humanities" but that aren't about the humanities per se are like. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4948902 |
Project Production Files for Collection 36 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 36 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5043200 |
Collection 37 | Data / Topic model visualizations | Articles containing the words "science" or "sciences" but that have been classified as not being about science, c. 1998-2018 -- A collection of word-frequency and other data representing 87,278 unique articles (no duplicate or close-variant documents) that contain the words "science" or "sciences" but that have not been classified as being about science published from 1998-2018 in 610 U.S. top-circulating and student newspapers and their associated blogs. WE1S and other researchers use this data to look for broad patterns and to help guide closer study. The collection includes 13,628 articles from U.S. top-circulating newspapers and 73,650 articles from student newspapers. Supervised classification models have classified these articles as not being about science; this collection therefore helps WE1S understand what articles that contain the words "science" or "sciences" but that aren't about science per se are like. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4958256 |
Project Production Files for Collection 37 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 37 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5043726 |
Collection 38 | Data / Topic model visualizations | A collection of 124,340 Reddit comments longer than 225 words from 2006 to 2018 containing the terms "humanities," "liberal arts," or "the arts." WE1S and other researchers use this data to look for broad patterns and to help guide closer study. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4958695 |
Project Production Files for Collection 38 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 38 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5039647 |
Collection 39 | Data / Topic model visualizations | A subset of WE1S's Collection 38 (C-38) Reddit collection -- Collection 39 is tailored to focus on student discourse about the humanities. Where C-38 includes Reddit comments longer than 225 words from 2006 to 2019 containing the terms humanities, liberal arts, or the arts, C-39 consists of 66,290 comments from that larger collection (about half the original number) that also contain at least one of the terms student, major, or college (including plurals and other forms). (Similar to C-38 is WE1S's Corpus-A, an earlier version of the same collection, but including only the years 2006-2018.) (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.") | CC BY-SA 4.0 | 10.5281/zenodo.4959834 |
Project Production Files for Collection 39 | Software code / Data / Topic model files / Visualization files | This is an archive of the WE1S project folder from which Collection 39 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. | MIT | 10.5281/zenodo.5044260 |
Wilcoxon Rank Sum Test and Keyphrase Extraction Data | Dataset | This repository contains Wilcoxon rank sum test and keyphrase extraction data cited in the WhatEvery1Says (WE1S) Project's article "What Everyone Says About the Humanities: The Challenge Posed by the Public Perception of the Humanities in the Media." | CC BY-SA 4.0 | 10.5281/zenodo.5112419 |
Topic Model Interpretation Protocol | Documents | The WhatEvery1Says (WE1S) Project developed a topic-model interpretation protocol that declares standard instructions and observation steps for researchers using topic models: a transparent, documented, and understandable process for the interaction between machine learning and human interpretation. We make the Interpretation Protocol files available "as is" in their original Qualtrics survey formats (exported as QSF files for others who can import them into Qualtrics) as well as adapted Word .docx formats (using customized versions of Word's "document properties" in each file to re-create the editable, repeated "running notes" in the original surveys). These files include instructions and references that are specific to the WE1S project and its materials. We hope that they can be forked, evolved, and adapted by other projects to develop a consensus practice of open, reproducible digital humanities research. | CC BY-SA 4.0 | 10.5281/zenodo.4940170 |
Lab-1 | Documents | Lab-1 is the documentation deposit for WE1S research team 1, which studied the media representation of the humanities "crisis". Included in the deposit are the team's reports and lab notes (working folders) for different periods of time during the WE1S project. | CC BY-SA 4.0 | 10.5281/zenodo.4891827 |
Lab-3 | Documents | Documentation deposit (reports and lab notes) of the WhatEvery1Says (WE1S) project's Team 3 -- research team studying the relation between social groups and the humanities as represented in journalistic media. | CC BY-SA 4.0 | 10.5281/zenodo.4828366 |
Lab-4 | Documents | Documentation deposit (reports and lab notes) of the WhatEvery1Says (WE1S) project's Team 4 -- research team studying the value of the humanities as represented in journalistic media. | CC BY-SA 4.0 | 10.5281/zenodo.4831043 |
Lab-5 | Documents | Documentation deposit (reports and lab notes) of the WhatEvery1Says (WE1S) project's Team 5 -- research team studying the broader profile of the humanities in society as represented in journalistic media. | CC BY-SA 4.0 | 10.5281/zenodo.4831113 |
Lab-6 | Documents | Documentation deposit (reports and lab notes) of the WhatEvery1Says (WE1S) project's Team 6 -- research team studying the humanities in different media, including social media. | CC BY-SA 4.0 | 10.5281/zenodo.4831165 |
Lab-7 | Documents | Documentation deposit (reports and lab notes) of the WhatEvery1Says (WE1S) project's Team 7 -- research team studying the impact of government, funding agencies, and foundations on the humanities as perceived in journalistic media. | CC BY-SA 4.0 | 10.5281/zenodo.4830907 |
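
The Collection 21 entry in the table above describes downsampling "science(s)" articles while keeping the proportions of articles from particular sources and years, so that they balance the humanities articles 50/50. The following sketch illustrates that general idea (proportional, or stratified, sampling) under stated assumptions. It is not the WE1S code: the function `downsample_proportionally`, the article dictionaries with `source` and `year` keys, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict


def downsample_proportionally(articles, target, seed=0):
    """Sample roughly `target` articles while keeping each (source, year) group's share."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    groups = defaultdict(list)
    for article in articles:
        groups[(article["source"], article["year"])].append(article)

    fraction = target / len(articles)
    sample = []
    for key in sorted(groups):
        members = groups[key]
        # Rounding per group means the total can differ slightly from `target`.
        k = min(len(members), round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Toy data: 100 hypothetical "science" articles, to be balanced against 10 "humanities" articles.
science = (
    [{"source": "A", "year": 2000, "id": i} for i in range(60)]
    + [{"source": "B", "year": 2001, "id": 60 + i} for i in range(40)]
)
humanities_count = 10
balanced_science = downsample_proportionally(science, humanities_count)
print(len(balanced_science))  # 10
```

In this toy case source A contributes 6 of the 10 sampled articles and source B contributes 4, matching their 60/40 split in the full set.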