Research Questions

As part of our research process, we polled our WE1S researchers (faculty and graduate-student research assistants) at several stages of the project for research questions that we wanted to address. Raw answers to such polls became the basis for discussion among our researchers and then for subsequent stages of prioritization, consolidation, and elimination. We iterated this exercise of “asking questions” several times, from the prototype phase of WE1S to its phase as a grant-funded project. We did so because one distinctive aspect of the “hermeneutical circle” of interpretation for digital humanities and other digital research projects is the need to allow conceptual aims and technical opportunities or constraints to shape each other in a reciprocal, iterative, to-and-fro rhythm of adjustment, which one theorist and historian of science and technology, Andrew Pickering, calls “the mangle of practice.” Conceptual aims influence technical methods, while the particular realities of technical methods (adjusted as well to the nature of our underlying research materials) influence conceptual aims, for example by motivating us to re-ask a question in an answerable rather than ideal way.

Below is the end result of the third stage of our “asking questions” and then “mangling” them from their general form into what we called “operationalized” form, which is the “mangling” we thought we could achieve. (See the theory of “operationalizing” as applied to the digital humanities.)

We were not able to get to all these questions, and the ones we did get to often ended up mutating in the process of research as we learned more about our materials, data, and models. The following set of research questions (and their operationalized versions) is presented “as is.”

ID | Question (in form addressed to research goal) | Possible operationalized form of question
A | Primary Corpus Team questions
a.1 | How does public discourse on the humanities related to the academy compare with that on the "public humanities"? | What topic differences do we see between a sample of our corpus with the keywords "university", "college", AND/OR "K-12", and a sample that does NOT contain these words?
a.2 | What correlation (if any) is there between the political leanings of a given publication and the types of discourses surrounding the humanities? How do "left" and "right" publications compare?
a.3 | To what extent does public discourse on the humanities fall into categories of advocacy/crisis/funding shortages/arguing, and to what extent are they a "given"? What other discourses are associated with each form of discussion (i.e., is STEM usually mentioned in conjunction with the humanities-in-crisis discourse? Is reporting on humanities programs always flavored with an earnest justification for its importance to the wider world?). To what extent might the humanities need to be in crisis in order to be recognized as a discourse in their own right? | To what extent are the humanities "invisible" (or, at best, "cosmic background radiation") unless they are in crisis?
a.4 | Are specific humanistic disciplines (e.g., "philosophy," "art history," "literature") discussed differently from "the humanities" as a broader concept, and how?
a.5 | Do "major U.S. newspapers" discuss the humanities in a significantly different way than regional or local news sources? | Are certain themes, tropes, or rhetorical structures about the humanities more prevalent in the top 10 U.S. newspapers by circulation than in a corpus of regional and local U.S. newspapers?
B | Identity & Inclusion Team questions
b.1 | How do ProQuest and LexisNexis (and, more generally, any database) include and classify publications that represent different identities? | How many publications are available in ProQuest vs. LexisNexis that position themselves as representative of a given identity group? How are these publications and sources classified and categorized by the databases (e.g., "women's magazine")?
b.2 | How are the humanities represented in ProQuest Gender and Ethnic Newswatch, and how does that compare/contrast with "identity-focused" publications in LexisNexis? | Which topics are most important in a topic model of these sources? Do topics related to the humanities appear in greater proportion in publications from either database?
b.3 | How are the humanities discussed in university mission statements? | In a corpus of university mission statements, how often are the humanities and their cognates invoked? What is the positionality of the humanities (or the skills associated with the humanities) in relation to a given university's mission (are they central, peripheral, or absent)?
b.4 | Do "identity-focused" sources focus on different topics or issues compared with the sources collected in our primary corpus? | In a topic model of all of our sources, which topics appear in these "identity-focused" publications in greater proportion, if any?
C | Students & Humanities Team questions
c.1 | Do students stigmatise (stereotype?) humanities and STEM subjects/majors differently?

* If so, what reasons do they give?
* What form does this stigmatisation take?
* Are there subfields that are particular targets?
c.2 | How do students navigate, experience, and perceive the humanities in U.S. colleges? | In a topic model of just student-based sources, which topics or clusters of topics are most important? (highest weight, most central, most representative, etc.)
c.3 | How different is student discourse from “general” discourse? | What topics are most unique to the student subcorpus vs the primary corpus? (Using source metadata to sort a combined topic model of the primary and student corpus)
c.4 | How similar or different is student discourse in our journalistic corpus vs our social media corpus vs our human subjects research? Another way of putting this is: What stories about the humanities tend to make it into print? Which remain the province of online consumption and which are most often only private thoughts and feelings? (Could also broaden this beyond the undergrad perspective) | In a topic model of student newspapers, student-based social media sources, and human subjects sources (from UCSB undergrads), which topics are most unique to or most highly represented within each subcorpus? (Could be broadened as well - for instance, by including sources from the primary corpus or the full human subjects survey results, both undergrad and non-undergrad)
c.5 | How similar or different is stereotype and stigma of humanities and STEM discussed in our journalistic corpus vs our social media corpus vs human subjects research? Another way of putting this is: how are the humanities and STEM perceived? What generalizations form into a “humanities” or “STEM” stereotype, and is there a noticeable difference in the language used when stigma is the chosen word instead of stereotype? What causes the stereotype of a humanities student or STEM student to become a stigma? | In a topic model including all student corpora, do topics definitively help answer the questions in 3.a.?
c.6 | How does discourse about the humanities vary among different component communities of a college campus (i.e., students, faculty, departments, administration)? | Within a topic model of academia-based sources (including student newspapers, the Chronicle, teaching blogs, etc.), which topics are most unique to each authorial perspective or source type? We can also do this via human subjects research with a close reading of UCSB (with IRB-approved participants).
c.7 | Are there trends in what humanities discourse looks like based on institution type (for instance, public universities, private universities, liberal arts colleges, or community colleges -- perhaps in conversation with the fantastic data collected by Giorgina and Team 3 on HBCUs, HSIs, and women’s colleges)? A related question: what kinds of institutions tend to use the word “humanities” the most or the least, and what does this say about the socioeconomics of the word itself? | In a topic model built from sources from Teams 2 and 3, what sources cluster in relation to which topics? Which sources are missing or are least represented?
c.8 | What unique opportunities exist for us in using social media and its accompanying metadata? For instance, can we use features like threads, likes, and karma scores to understand what genres of humanities discourse (the crisis, the advertisement, advice, the intellectual debate) are most popular, most powerful, or most successful in promoting “constructive” conversation?

[Alan's comment: I think we need a specific end-goal question that can serve as an aim-point for this fascinating methodological inquiry. Something like: "Which goes most viral in student social media: positive or negative aspects of the humanities?"]
How would we adapt the WE1S interpretation protocol to a social media topic model (like Twitter)?
c.9 | What is the best way to translate our research into advocacy outputs particularly geared towards a college setting? (for instance, a student discussing a choice of major with parents) | Alan: perhaps we need to build this question on top of component questions about the topics or keywords most associated with specific groups (parents, students, politicians, university administrators, etc.)?
c.10 | What can we learn from Twitter about the discourse surrounding the Humanities on Social Media (particularly among students)?
D | Comparison Corpus Team questions
d.1 | What are some of the (lexical, semantic) features of public discourse about the humanities (in specific contexts)? | In a classification task or series of tasks, what specific textual features are most significant in distinguishing between texts about the humanities and texts not about the humanities?
d.2 | How does public discourse about the humanities change across different (historical or publication) contexts? | How do specific lexical or semantic features of texts about the humanities change across different contexts?
d.3 | How does public discourse about the humanities compare to public discourse about the sciences? (This question seems to me to be related to the question Ray has posed in row 21) | In a classification task or series of tasks, what specific textual features are most significant in distinguishing between texts about the humanities and texts about the sciences?
E | Questions from others (including PIs, postdocs, and Interpretation lab team)
e.1 | How does the portrayal of the Humanities by "general" news media compare to that of "industry" publications like The Chronicle of Higher Education and social media publications like Twitter and Reddit? Are there discursive features in common? Are there discursive features from one that can be adopted in the other(s) in order to provide greater social support for the Humanities? | What words are most associated with "humanities", and how does this vary in different genres of publications? Are certain topics generated by the keyword "humanities" more prevalent in one of these genres than in others?
e.2 | How much humanities "cosmic background radiation" is there by comparison with the radiation of science, politics, etc.? | How broadly does "humanities" appear as a word among the top 10 most frequent words in topics, compared to the words "sciences," "politics," "economics," "higher_education"? (E.g., does it appear in 8% of topics in a model as a top-10 term?) Also, the same question for the breadth of topics in which "humanities" appears in the second, third, fourth, fifth, ... decile of keywords.
e.3 | What are the contexts of discussion that bring both the humanities and the sciences into the same discussion? | In topics where "humanities" and "sciences" both occur in the top 20 terms, what are the top other terms, if any, that co-occur in those topics?
e.4 | What are cognates of the "humanities" in public discourse (e.g., "literature," "history," "philosophy," etc.)? | In topics where "humanities" appears in the top 20 terms, what other humanities fields or activities are also named in the top 20 terms (e.g., "literature," "books," "music," "history")?
e.5 | How is "humanities" distributed in public discourse among various scales and kinds of social units for context? E.g., what percentage of the time does it appear in the context of national policy, state policy, city policy, family and parents, schools, the individual, etc.?
e.6 | In taking up any of our research questions, what is the difference in our corpus in 2004-2007 vs. 2008-2012 vs. 2013-2016 (that is, before, during, and after the Great Recession)?
e.7 | In taking up any of our research questions, what differences can we see between different regions of the U.S. (east, west, south, midwest, northwest, northeast, etc.)?
e.8 | In taking up any of our research questions, what differences can we see between how "humanities" is discussed in different forms of media? (Print, web, TV, radio, etc.?)
e.9 | How prevalent is "humanities crisis discourse" (discourse about the crisis in the humanities) as compared with other ways of talking about the humanities (again, within specific contexts)? | Can we identify, broadly -- and probably? through human hand coding -- "crisis" topics in a topic model? How many such topics occur in a model as compared with the other topics in the model? Alternatively, can we identify via human reading, in any given sample of articles, articles/texts that are about the "crisis in the humanities"? How many such articles occur in this model/these models, as compared with other kinds of articles about the humanities?
e.10 | In taking up any of our questions, is there a difference between the part of the corpus collected by searching on "humanities" and the part collected by searching on "liberal arts"?
e.11 | What is the most typical (or perhaps most average) article in our corpus?

(With an answer to that, one could imagine creating a humanities article bot that would churn out exemplar documents from our corpus)
e.12 | To what extent do questions of employability and income appear in discussions regarding the humanities...
e.13 | What don't we talk about when we talk about the humanities?

(The answer is, of course, a lot -- almost everything -- but in the sense of "statistically, what are the biggest topics in (a related discourse) that are small or absent when the word humanities is present?" -- where the related discourse might be news in general, or something more specific like "science.")
e.14 | How have the words associated with the humanities changed over time?
...which co-words (topics) have appeared recently?
...which co-words (topics) have disappeared?
e.15 | Where are people talking about the humanities? If we were to extract and geocode the place names from these topics, what would the geographic distance between media sources and geocoded place names tell us about the "discursive diversity" of a given place and the people in that place? Might we be able to measure the connections between these sources, based on shared topics, to create a network of shared semantics/topics, and what network characteristics would that reveal? | This question (very related to a previous question Alan had!) would require geocoding place names from topics (which we currently have the ability to do) and then mapping the results. Thinking about how to map connections of topics is trickier. They would likely be geographically embedded by the source, but I think more thought needs to be given to what that even means.
e.16 | Which emotions/sentiments are often associated with "humanities", versus "STEM" for example? | Sentiment analysis
e.17 | How is any one (or any set) of the following groups of students discussed in relation to the humanities by comparison with the generic, mainstream figure of "students":
* first-generation to college students
* first-generation immigrant students
* students of color (this likely needs to be broken out into different groups)
* female students
* male students
e.18 | As a hybrid between Rebecca and Scott's questions (rows 4 and 32), I think it could be important to ask a very basic part vs whole framing, such as "how does academic discourse about the humanities compare with public discourse about the humanities?"
e.19 | We currently don't have a question about class. Originally, I think we were calling this "brow" level, but perhaps a more systemic way to do this would be to add the cost of newspaper (or tuition, for schools) as metadata and ask:

"how do more expensive publications (and/or more expensive institutions) talk about the humanities in relation to those that cost less?"
e.20 | Is there any value in trying to capture the "reach" of some of our sources? Could be based on things like: number of comments, number of shares for articles; number of upvotes for Reddit; number of likes or shares, or followers, for tweets.
e.21 | To what extent do successes in a given and narrowly definable Humanities discourse (e.g., Digital Humanities) raise public interest in other, related Humanities discourses (e.g., Humanities Crises, Public Humanities, Environmental Humanities), or in the Humanities at large? | Discovery and human validation of success events, as well as discourse categories. Human reading, full text searches, classification, topic modeling. Longitudinal study.
e.22 | What is the speed at which a given piece of Humanities news spreads in public discourse? | Discovery and human validation of success events. Longitudinal study.
e.23 | What is the turnover rate of news in the Humanities; i.e., how long does it take until a given piece of Humanities news is no longer considered newsworthy or mentioned in public discourse? | Discovery and human validation of success events. Longitudinal study.
e.24 | Does Humanities news follow common geographical patterns when it spreads? | Mapping. Longitudinal study.
e.25 | How do major government agencies, foundations, and professional associations shape or correlate with discussion of the humanities (sciences, majors, etc.)?
e.26 | How do state and local civic agencies shape or correlate with discussion of the humanities (sciences, majors, etc.)?
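
Several of the operationalized forms above, such as those for e.2 and e.3, come down to simple counts over the top terms of each topic in a topic model. The following is only a minimal sketch of how such counts might be computed, not the WE1S implementation. It assumes each topic is available as a list of terms ranked by weight; the toy `topics` data and the function names `topic_breadth` and `shared_context_terms` are hypothetical and invented for illustration.

```python
from collections import Counter

# Hypothetical toy data: each topic is a list of its top terms, ranked by weight.
# A real model would have many more topics and terms per topic.
topics = [
    ["humanities", "arts", "literature", "history"],
    ["science", "research", "sciences", "humanities"],
    ["budget", "state", "funding", "cuts"],
    ["students", "college", "majors", "humanities"],
]


def topic_breadth(topics, term, top_n=10):
    """Return the fraction of topics whose top_n terms include `term` (cf. e.2)."""
    if not topics:
        return 0.0
    hits = sum(1 for terms in topics if term in terms[:top_n])
    return hits / len(topics)


def shared_context_terms(topics, term_a, term_b, top_n=20):
    """Count the other top_n terms in topics containing both terms (cf. e.3)."""
    counts = Counter()
    for terms in topics:
        top = terms[:top_n]
        if term_a in top and term_b in top:
            # Tally every co-occurring term except the two query terms.
            counts.update(t for t in top if t not in (term_a, term_b))
    return counts


breadth = topic_breadth(topics, "humanities", top_n=10)
context = shared_context_terms(topics, "humanities", "sciences", top_n=20)
print(f"'humanities' is a top-10 term in {breadth:.0%} of topics")
print(context.most_common(5))
```

Calling `topic_breadth` with the same `top_n` for "sciences," "politics," or "economics" would give the comparison that e.2 asks for, and slicing a different range of ranks instead of the top 10 would give the decile variant.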