Corpus Sources

WE1S draws on a variety of sources for its corpus of news and other material, including database collections of news media (such as LexisNexis and ProQuest) and direct access to online publications. Collection from LexisNexis is facilitated through the LexisNexis Web Services Kit, and collection from Web-based publications is facilitated through the WE1S-developed Chomp Web-scraping tool.

To help ensure that its collection strategy is informed by an understanding of the social, technical, and other contexts influencing the creation and circulation of public media, WE1S produced “area of focus reports” on areas of collection as well as “scoping research reports” bearing on what a “representative” corpus of public discourse might be.