Introduction
The WE1S Identity and Inclusion team (of which we are a part) is focused on research outputs centered on how gender, sexuality, race, and ethnicity factor into humanities discourse. Our core questions of interest include: How are different gender and ethnic groups positioned in relation to the humanities in public discourse? What kind of conversations do these groups hold about the humanities? Are there discrepancies between how these groups position themselves in relation to the humanities and the ways in which other groups position them in relation to the humanities? Do “identity-focused” sources focus on different topics or issues compared with the sources collected in the WE1S primary corpus?
This past year, we considered how to create new outputs for the project that are not focused exclusively on data collection, and we tried to imagine methods and visualizations beyond topic modeling that can help us answer research questions about how various social groups and identities are positioned in relation to the humanities. One of these outputs is a map of universities and colleges in the United States that are institutionally affiliated with particular groups: Historically Black Colleges and Universities (HBCUs), Hispanic-Serving Institutions (HSIs), Tribal Colleges, and Women’s Colleges. No map currently features all of these institutions on a single visualization platform. In addition to filling this gap in representation, the map allows us to ask larger questions about geographic space and place: how these universities are situated in relation to the towns and cities where they are located, and what the racial demographics of the surrounding communities are, among other questions.
Background Methodology
The map is based on a spreadsheet we created to collect and organize data about these universities. The universities we visualize include all Historically Black Colleges and Universities (HBCUs), Hispanic-Serving Institutions (HSIs), Women’s Colleges, and Tribal Colleges. Some of these classifications are driven by enrollment statistics. HSIs, for example, are officially classified as such once Hispanic enrollment at a given university constitutes a minimum of 25% of total enrollment.[1] For each university, we collected the town and state in which it is located, its latitude and longitude, and the year it was established. The rationale behind this collection was to provide geographic data for these schools that could be rendered into a visualization, and to situate these universities in relation to census data, as detailed in our mapping methodology. We take this map to be a starting point for our explorations and, like the universities and populations it represents, one that is not static but continuously changing. In this spirit, we treat the map as an early iteration that will be subject to revision, further exploration, and in-depth analysis in the coming months.
Mapping Methodology
Su created the map in Leaflet, an open source JavaScript library[2] that allows for easy creation of “slippy” maps, that is, web maps that zoom and pan easily.
I. Creating each layer
A. Basemaps
A basemap gives the user geographic or spatial context for the information presented in the map. Mapbox[3] provides several customizable basemaps, which can be accessed through its API with a personal access token. Su created a few simple basemaps in Mapbox Studio that have different color schemes and can be toggled via the layer control in Leaflet. The basemaps include “dark,” “streets,” and “blue,” with subtle differences in color and geographic features.
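To make the role of the style ID and access token concrete, here is a minimal sketch of how a tile URL template of the kind used in Appendix B gets filled in for one tile. The function fillTemplate is a hypothetical stand-in written for this illustration (it is not Leaflet's own code), and the {user} placeholder, style ID, and token values are assumptions rather than real credentials.

```javascript
// Hypothetical helper: substitute {placeholders} in a tile URL template.
// Leaflet performs a similar substitution internally; this is not its code.
function fillTemplate(template, values) {
  return template.replace(/\{(\w+)\}/g, (match, key) => {
    if (!(key in values)) {
      throw new Error(`No value provided for {${key}}`);
    }
    return String(values[key]);
  });
}

// URL pattern modeled on the Mapbox Styles tile endpoint used in Appendix B.
// {user} is parameterized here as an assumption; the appendix hard-codes it.
const template =
  "https://api.mapbox.com/styles/v1/{user}/{id}/tiles/256/{z}/{x}/{y}?access_token={accessToken}";

// Placeholder values only: z/x/y identify one tile at zoom level 4.
const url = fillTemplate(template, {
  user: "example-user",
  id: "example-style-id",
  z: 4,
  x: 3,
  y: 6,
  accessToken: "YOUR_MAPBOX_TOKEN"
});

console.log(url);
```

Each basemap differs only in its style ID, which is why the three tile layers in the appendix can share one URL template and one token.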
B. Racial dot density[4]
The racial dot density layer was first created in R and relies heavily on the sf package. The first challenge in visualizing the racial makeup of the entire United States is choosing an appropriate resolution. Spatial data exists as points, lines, polygons, and rasters (images), and demographic data can be collected at many resolutions within those feature types. The United States Census Bureau provides demographic information mainly through polygons that represent the country’s administrative boundaries, which exist at the national, state, county, Census tract, and Census block levels (among a few others). Because of the heavy computational burden at higher resolutions, Su chose to represent race at the county level, with one dot representing 1,000 people of each race. She downloaded the racial data and the GIS boundary files for U.S. counties from IPUMS NHGIS,[5] the National Historical Geographic Information System, which provides downloadable census tables and boundary files. The number of dots within each county polygon was calculated following an online tutorial:[6] filter for the columns (races) of interest, divide by 1,000, round to a whole number (using randomized rounding so that fractional counts do not introduce systematic bias), and then generate that many random points within the county boundaries. Leaflet expects data layers to be in the WGS84 coordinate reference system (CRS), a geographic coordinate system that serves as a worldwide standard for latitude and longitude. Reprojecting the data into WGS84 and converting it to GeoJSON using mapshaper.org allowed for import into Leaflet.
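The dot-count and point-placement steps can be illustrated outside of R. The following is a minimal JavaScript sketch under stated assumptions: randomRound mirrors the randomized rounding in Appendix A, while randomPointsInRing uses simple rejection sampling on a single polygon ring as a stand-in for what sf's st_sample does in the actual workflow. All function names, the injectable rng parameter, and the example counts are hypothetical and are not real census figures.

```javascript
// Randomized rounding: round up with probability equal to the fractional
// part, so that rounding does not systematically inflate or deflate totals.
// `rng` is injectable only to make the sketch testable (an assumption).
function randomRound(x, rng = Math.random) {
  if (!Number.isFinite(x) || x < 0) return 0; // missing or negative counts give zero dots
  const whole = Math.floor(x);
  const frac = x - whole;
  return whole + (rng() < frac ? 1 : 0);
}

// Number of dots per group for one county, at `peoplePerDot` people per dot.
function dotsForCounty(populations, peoplePerDot = 1000, rng = Math.random) {
  const dots = {};
  for (const [group, count] of Object.entries(populations)) {
    dots[group] = randomRound(count / peoplePerDot, rng);
  }
  return dots;
}

// Ray-casting test: is the point [x, y] inside a simple ring [[x, y], ...]?
function pointInRing([x, y], ring) {
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    const crosses =
      (yi > y) !== (yj > y) &&
      x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Rejection sampling: draw uniform points in the bounding box and keep the
// ones that fall inside the ring. Assumes the ring encloses a nonzero area.
function randomPointsInRing(ring, n, rng = Math.random) {
  const xs = ring.map((p) => p[0]);
  const ys = ring.map((p) => p[1]);
  const minX = Math.min(...xs), maxX = Math.max(...xs);
  const minY = Math.min(...ys), maxY = Math.max(...ys);
  const points = [];
  while (points.length < n) {
    const candidate = [
      minX + rng() * (maxX - minX),
      minY + rng() * (maxY - minY)
    ];
    if (pointInRing(candidate, ring)) points.push(candidate);
  }
  return points;
}

// Made-up county: a triangular "boundary" and illustrative population counts.
const triangle = [[0, 0], [4, 0], [0, 4]];
const dots = dotsForCounty({ white: 41500, black_aa: 12000, hispanic: 730 });
const whiteDots = randomPointsInRing(triangle, dots.white);
console.log(dots, whiteDots.length);
```

Real county boundaries are multipolygons, sometimes with holes, so a production version would need to handle those cases; the sketch only shows why a county with 730 residents of a group still has a chance of receiving one dot.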
C. Identity-focused schools and colleges
The latitude, longitude, and other attributes of our schools of interest were collected into a single spreadsheet. The data was then exported as GeoJSON using mapshaper.org and brought into Leaflet. (Prototypes of the point features were also taken into ArcMap, a mapping application by Esri, but this step was not actually necessary.)
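For readers unfamiliar with GeoJSON, the conversion that mapshaper.org performed can be sketched as follows. This is an illustrative stand-in, not the tool's implementation: the function rowsToGeoJSON, the column names lat and lon, and the example row are assumptions. Only the Category property and its values (such as "HBCU") come from the Leaflet code in Appendix B.

```javascript
// Convert spreadsheet-style rows into a GeoJSON FeatureCollection of points.
// Assumes each row has `lat` and `lon` columns (hypothetical names); every
// other column is carried over unchanged as a feature property.
function rowsToGeoJSON(rows) {
  return {
    type: "FeatureCollection",
    features: rows.map(({ lat, lon, ...properties }) => {
      const latitude = parseFloat(lat);
      const longitude = parseFloat(lon);
      if (!Number.isFinite(latitude) || !Number.isFinite(longitude)) {
        throw new Error("Row is missing a usable latitude or longitude");
      }
      return {
        type: "Feature",
        // GeoJSON coordinate order is [longitude, latitude]
        geometry: { type: "Point", coordinates: [longitude, latitude] },
        properties
      };
    })
  };
}

// Placeholder row for illustration only; not a real institution.
const rows = [
  {
    Name: "Example College",
    Category: "HBCU",
    Town: "Exampletown",
    State: "XX",
    Established: 1900,
    lat: "35.0",
    lon: "-85.0"
  }
];

console.log(JSON.stringify(rowsToGeoJSON(rows), null, 2));
```

Keeping Category as a property is what lets the map pick a different icon for each type of institution when the points are loaded.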
II. Layer Snapshot
Layer | Feature Type | Source |
Basemaps (x3) | Raster | Mapbox |
Racial dot density | Point | U.S. Census Bureau, IPUMS NHGIS |
Identity-focused schools and colleges | Point | Giorgina Paiella, Jamal Russell, and Susan Burtner |
III. Appendix A: R code for the racial dot density layer
library(sf)
library(tidyverse)
# Import county polygons shapefile
us_counties_2017 <- st_read("/Users/sburtner/Documents/WE1S/we1s-mapping/layers/nhgis0006_shapefile_tl2017_us_county_2017/US_county_2017.shp", stringsAsFactors = FALSE, quiet = TRUE)
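# Set projection for shapefile
# (102003 is an ESRI code, USA Contiguous Albers Equal Area Conic, not an EPSG code;
# recent versions of sf expect the "ESRI:" authority prefix)
st_crs(us_counties_2017) <- "ESRI:102003"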
# Import demographic data
nhgis_2017 <- read.csv("/Users/sburtner/Documents/WE1S/we1s-mapping/layers/nhgis0005_ds233_20175_2017_county.csv", header = TRUE, sep = ",") %>%
  select(GISJOIN, COUNTY,
         ends_with("03"), ends_with("04"), ends_with("05"),
         ends_with("06"), ends_with("07"), ends_with("12")) %>%
  select(GISJOIN, COUNTY, starts_with("AHZAE")) %>%
  rename(white = ends_with("03")) %>%
  rename(black_aa = ends_with("04")) %>%
  rename(ai_an = ends_with("05")) %>%
  rename(asian = ends_with("06")) %>%
  rename(nh_pi = ends_with("07")) %>%
  rename(hispanic = ends_with("12"))
# Join the data
county_join <- left_join(us_counties_2017, nhgis_2017, by = c("GISJOIN")) # use st_as_sf() if you join the other way
head(county_join)
# Create function that randomizes rounded values
# Credit for code here: https://www.cultureofinsight.com/blog/2018/05/02/2018-04-08-multivariate-dot-density-maps-in-r-with-sf-ggplot2/
random_round <- function(x) {
  v <- as.integer(x)                  # whole-number part
  r <- x - v                          # fractional part
  test <- runif(length(r), 0.0, 1.0)  # one uniform draw per value
  add <- rep(as.integer(0), length(r))
  add[r > test] <- as.integer(1)      # round up with probability equal to the fractional part
  value <- v + add
  value <- ifelse(is.na(value) | value < 0, 0, value)  # missing or negative counts become zero dots
  return(value)
}
# Create dataframe of number of dots to plot for each race (1 for every 1,000 people)
num_dots <- as.data.frame(county_join) %>%
  select(white:hispanic) %>%
  mutate_all(list(~(./1000))) %>%
  mutate_all(random_round)
# Create the actual vectorized points
sf_dots <- map_df(names(num_dots),
                  ~ st_sample(county_join, size = num_dots[, .x], type = "random") %>% # generate points in each polygon
                    st_cast("POINT") %>% # cast the geom set as 'POINT' data
                    st_coordinates() %>% # pull out coordinates into a matrix
                    as_tibble() %>% # convert to tibble
                    setNames(c("lon", "lat")) %>% # set column names
                    mutate(race = .x) # add categorical race variable
) %>%
  slice(sample(1:n())) # shuffle the row order
# (This will take a long time to run)
#write_csv(sf_dots, "/Users/sburtner/Documents/WE1S/webmap/race_dots1000.csv")
IV. Appendix B: HTML for Leaflet map
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Special Interest Colleges &amp; Universities</title>
<link rel="stylesheet" href="https://unpkg.com/leaflet@1.5.1/dist/leaflet.css"
integrity="sha512-xwE/Az9zrjBIphAcBb3F6JVqxf46+CDLwfLMHloNu6KEQCAWi6HcDUbeOfBIptF7tcCzusKFjFw2yuvEpDL9wQ=="
crossorigin=""/>
<script src="https://unpkg.com/leaflet@1.5.1/dist/leaflet.js"
integrity="sha512-GffPMF3RvMeYyc1LWMHtK8EbPv0iNZ8/oTtHPx9/cc2ILxQ+u905qIwdpULaqDkyBKgOaB57QTMg7ztg8Jm2Og=="
crossorigin=""></script>
<script src="src/plugins/leaflet-ajax-gh-pages/dist/leaflet.ajax.min.js"></script>
<script src='https://api.mapbox.com/mapbox.js/plugins/leaflet-fullscreen/v1.0.1/Leaflet.fullscreen.min.js'></script>
<link href='https://api.mapbox.com/mapbox.js/plugins/leaflet-fullscreen/v1.0.1/leaflet.fullscreen.css' rel='stylesheet'/>
<style>
html, body, #map {
height: 100%;
}
body {
padding: 0;
margin: 0;
}
</style>
</head>
<body>
<div id="map"></div>
<script>
// Marker icons, one per type of institution
var HBCUicon = L.icon({iconUrl: 'img/icons/hbcut.png',
    iconSize: [30, 20]});
var HSIicon = L.icon({iconUrl: 'img/icons/hsit.png',
    iconSize: [25, 20]});
var WCicon = L.icon({iconUrl: 'img/icons/wct.png',
    iconSize: [20, 20]});
var TCicon = L.icon({iconUrl: 'img/icons/tct.png',
    iconSize: [20, 20]});
// Basemaps: three Mapbox Studio styles that differ only in their style id
var dark = L.tileLayer('https://api.mapbox.com/styles/v1/sburtner/{id}/tiles/256/{z}/{x}/{y}?access_token={accessToken}', {
    attribution: 'Basemap by © <a href="https://www.mapbox.com/">Mapbox</a>',
    id: 'cjynuhtax42fh1cq14prabicc',
    accessToken: 'pk.eyJ1Ijoic2J1cnRuZXIiLCJhIjoiY2o4b3JlbXEyMDZiczMzbWt0cGMxdGMzcSJ9.4cfUZCPSQvyz7FFabHrflA',
    opacity: 0.8
}),
streets = L.tileLayer('https://api.mapbox.com/styles/v1/sburtner/{id}/tiles/256/{z}/{x}/{y}?access_token={accessToken}', {
    attribution: 'Basemap by © <a href="https://www.mapbox.com/">Mapbox</a>',
    id: 'cjykd7z7a0ti01dqv6ot88ysz',
    accessToken: 'pk.eyJ1Ijoic2J1cnRuZXIiLCJhIjoiY2o4b3JlbXEyMDZiczMzbWt0cGMxdGMzcSJ9.4cfUZCPSQvyz7FFabHrflA',
    opacity: 0.8
}),
blue = L.tileLayer('https://api.mapbox.com/styles/v1/sburtner/{id}/tiles/256/{z}/{x}/{y}?access_token={accessToken}', {
    attribution: 'Basemap by © <a href="https://www.mapbox.com/">Mapbox</a>',
    id: 'cjxxkp06oaoyv1cqpt8ifdyjm',
    accessToken: 'pk.eyJ1Ijoic2J1cnRuZXIiLCJhIjoiY2o4b3JlbXEyMDZiczMzbWt0cGMxdGMzcSJ9.4cfUZCPSQvyz7FFabHrflA',
    opacity: 0.8
});
var map = L.map('map', {
    preferCanvas: true,
    center: [39.8, -98.6],
    zoom: 4,
    layers: [dark, streets, blue],
    // the option object enables the fullscreen control and configures it
    fullscreenControl: {
        pseudoFullscreen: false
    }
});
var baseMaps = {
    "Streets": streets,
    "Blue": blue,
    "Dark": dark
};
// School points, drawn with the icon that matches each school's category
var schools = new L.GeoJSON.AJAX("https://raw.githubusercontent.com/sburtner/we1s-mapping/master/SIC.json", {
    pointToLayer: function (feature, latlng) {
        switch (feature.properties.Category) {
            case "HSI": return L.marker(latlng, {icon: HSIicon});
            case "HBCU": return L.marker(latlng, {icon: HBCUicon});
            case "Women's College": return L.marker(latlng, {icon: WCicon});
            case "Tribal College": return L.marker(latlng, {icon: TCicon});
        }
    }
}).addTo(map);
// Dot colors by race; categories not listed here are not drawn
var raceColors = {
    "white": "#f511e9",    // magenta
    "black_aa": "#f7f420", // yellow
    "hispanic": "#20f752", // green
    "asian": "#11ecf7"     // turquoise
};
// Build one dot layer from a GeoJSON file of race points
function raceLayer(url) {
    return new L.GeoJSON.AJAX(url, {
        pointToLayer: function (feature, latlng) {
            var color = raceColors[feature.properties.race];
            if (color) {
                return L.circleMarker(latlng, {radius: 1, fillColor: color, weight: 0, fillOpacity: 0.8});
            }
        }
    });
}
// The dots are split across three files
var race1 = raceLayer("https://raw.githubusercontent.com/sburtner/we1s-mapping/master/racedots1.json");
var race2 = raceLayer("https://raw.githubusercontent.com/sburtner/we1s-mapping/master/racedots2.json");
var race3 = raceLayer("https://raw.githubusercontent.com/sburtner/we1s-mapping/master/racedots3.json");
var races = L.layerGroup([race1, race2, race3]);
var overlayMaps = {
    "Schools": schools,
    "Races": races
};
L.control.layers(baseMaps, overlayMaps, {position: 'topleft'}).addTo(map);
</script>
</body>
</html>
Notes
[1] HACU, “Hispanic-Serving Institution Definition” (https://www.hacu.net/hacu/HSI_Definition.asp).
[4] See Appendix A for the R code that created this layer.
[6] https://www.cultureofinsight.com/blog/2018/05/02/2018-04-08-multivariate-dot-density-maps-in-r-with-sf-ggplot2/