European News Sources (WE1S Area of Focus Report)

Report by Aili Pettersson Peeker

Final Version Created June 2018


Pettersson Peeker, Aili. “European News Sources.” WhatEvery1Says Project, July 3, 2018.

1. Overview

What is your area(s) of focus?

European news sources in English (except the UK).

Why is this area of focus important to the WE1S corpus?

Journalistic texts published in Europe are an important part of the WE1S corpus for the project’s ambition to study and compare how the humanities are discussed in media discourses across the world, and not only in the United States.

2. Source Scoping Process

How have you been selecting sources for the WE1S corpus? (e.g. collecting from particular databases, using “impact” lists, etc.)

I found the majority of my sources through a list of European news sources published by The Guardian in 2002.

If you are using external lists to guide your selection of sources, include links here and indicate who produced them, for what purpose the list was produced, and any potential bias issues involved.

The Guardian does not provide a reason for creating the list, nor does it identify the individual journalists who worked on it. The problem of bias should never be neglected, and The Guardian is a news source with a political orientation left of center. Still, the publication’s journalistic standards and high reputation provide some safeguard against an “extreme” bias. Furthermore, the list includes both progressive and conservative news sources.

3. Corpus Representativeness

How representative do you think your corpus is? (“Representativeness” can be interpreted and addressed in a number of ways, so tailor it to be most productive for your area.)

I think that this corpus is quite representative of major English-language news sources in Europe, but it is probably not very representative of media discourse in Europe in general. Because of linguistic barriers, and because there are few major English-language news sources in European countries where English is not the official language, it is difficult to create a representative corpus of European news sources when limited to sources in English.

What challenges in achieving representativeness have you encountered?

As mentioned, a major challenge has been the fact that the selection of sources is limited since many European countries have relatively few English language news sources. As will be discussed below, this fact places constraints on the possibility of creating a broader and more representative corpus.

Provide a tally breakdown of the various facets of sources in your area of focus that WE1S is considering as possible measures of overall corpus “representativeness” (for example, by source or media type, nationality, region, political orientation, identification with specific racial, ethnic, and gender audiences, etc.)

The following numbers are all based on the geographical categorization built into the Google sheet used for our collection work. This might be a problem for this corpus, as well as others, if the person collecting the metadata has based their collection work on a geographical categorization other than the one connected to the Google sheet. This is the case for me: I have worked with Wikipedia’s list of European states and territories, and this list does not fully correspond to the current categorization in the Google sheet. For example, Georgia is listed as European in the Wikipedia list but categorized as Asian according to the source linked to the formula used in the Google sheet. I have only looked at the European sources I have collected information about, so as to avoid overlap with other Research Assistants’ work.

Total: 67 Sources

Distribution Method:

  • Print: 33
  • Broadcast: 7
  • Online: 27

Database Access:

  • Yes: 18
  • No: 49

Databases (among the 18 sources with access):

  • Lexis Nexis: 11
  • Newsbank/World Access: 6
  • Other: 1

Countries with More than One Source:

  • Germany: 4
  • Czech Republic: 3
  • Ireland: 3
  • Ukraine: 3
  • Bosnia and Herzegovina: 2
  • Belgium: 2
  • Belarus: 2
  • Spain: 2
  • Hungary: 2
  • The Netherlands: 2
  • Poland: 2
  • Russia: 2

This breakdown highlights the issue of weighting. As of now, I have collected the sources I have been able to find for most countries, as there generally are not more than (at most) a couple of English language news sources in European countries where English isn’t the official language. However, it might be worth considering whether we should “weight” countries and include more sources from countries with larger populations and/or influence (which, of course, would be an intricate question to settle in itself). It might be considered skewed representation that, for example, there are three sources from the Czech Republic but only one from France. However, this might not turn out to be an issue in the end since far from all the current sources are available in any database.

Political Orientation:

  • Center-Left: 6
  • Center-Right: 2
  • Centrist: 6
  • Liberal: 2
  • Progressive: 3
  • Socialist: 1
  • No information: 47

What challenges or difficulties have you encountered in the source selection or collection process? Do you anticipate any challenges emerging from your work going forward?

A frequent and major problem when collecting European sources has been finding sources that are available in any database. As can be seen in the breakdown above, just over a quarter of the sources in the current corpus (18 of 67) are available in a database. This means that the number of European sources used in the project’s text analysis will actually be much smaller than the number of European sources in the current corpus, and thus that the actual text analysis will be based on a much less representative corpus.
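The "just over a quarter" figure can be checked directly against the tallies reported above. The following is only a minimal sketch of that arithmetic; the variable and function names are illustrative and are not part of the WE1S collection tooling.

```python
# Tallies copied from the breakdown in this report.
TOTAL_SOURCES = 67
database_access = {"Yes": 18, "No": 49}
databases = {"Lexis Nexis": 11, "Newsbank/World Access": 6, "Other": 1}


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Sanity checks: the tallies should add up to the stated totals.
assert sum(database_access.values()) == TOTAL_SOURCES
assert sum(databases.values()) == database_access["Yes"]

# Share of collected sources that could actually feed the text analysis.
available_share = share(database_access["Yes"], TOTAL_SOURCES)
print(f"Sources available in a database: {available_share}%")
```

Running the same check on any later version of the tallies would show how far the analyzable portion of the corpus diverges from the collected one.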

Another problem related to representativeness is that of the urban/rural divide. Since most, if not all, of the current European publications are based in capitals, the rural parts of Europe are not represented in our current corpus. This could distort results when comparing Europe to the U.S., since the WE1S corpus currently does include many publications from rural parts of the United States. This problem appears to be rather difficult to get around, as the vast majority of English language news sources in Europe are based in capitals, or at least in major cities.

For some sources, it has been difficult to find even basic information about the publication (such as ownership and circulation numbers). This problem has been particularly frequent when researching publications in countries where democracy is relatively young (and perhaps fragile), e.g. former Soviet states.

A further potential problem concerns the question of how to decide what sources to include in the corpus. Since it would be disproportionately time-consuming to conduct in-depth research about the legitimacy of every publication that I have collected information about, decisions about what sources to include in cases where there is more than one possible source from a country have had to be made on rather loose grounds. I have often found myself basing such decisions on the design of the publication’s website, which might be a good indication of the trustworthiness of a source but need not be. This has not been a very frequent issue, however, since most European countries have relatively few news sources in English.

5. Research Scan

Conduct some preliminary research on the questions or challenges that you provided in sections three and four.

Have other scholars reflected on these issues? Are there publications that address these problems? Has research been conducted on how to overcome these challenges or at least acknowledge them productively?

An interesting and recent book-length publication discussing intersections of identity, representativeness, and media in Europe is We Europeans? Media, Representations, Identities (2008), edited by William Uricchio, Ib Bondebjerg, and Peter Golding. Two essays in this anthology are of particular interest for discussions about representativeness in the current European media landscape (see below).

Alexei Simonov, who is director of the Glasnost Defense Foundation (a non-profit organization working to defend journalists and the freedom of expression in Russia), has written an article based on research on censorship and the state of press freedom in today’s Russia that might be helpful for a better understanding of the media landscape in both Russia and post-Soviet states in Europe.

  • Robins, Kevin. “Media and Cultural Diversity in Europe.” We Europeans? Media, Representations, Identities. Edited by William Uricchio, Ib Bondebjerg, and Peter Golding. Intellect, 2008, pp. 109-122.

  • Simonov, Alexei, and Marjorie Farquharson. “Media as Mouthpiece.” Index on Censorship, vol. 34, no. 4, 2005, pp. 78-82.

  • Uricchio, William. “We Europeans? Media, Representations, Identities.” We Europeans? Media, Representations, Identities. Edited by William Uricchio, Ib Bondebjerg, and Peter Golding. Intellect, 2008, pp. 11-22.