One Election Visualization to Rule Them All

I’m fascinated with creating a single visualization that can fully communicate the results of an election. I’ve made a few attempts in the past and even came up with a set of criteria for the ideal visualization. More or less, I want to answer the questions “What happened?”, “What changed?”, and “Why?” in a single visualization. Of course, nobody can fully answer the “Why?” question, but you can point towards an answer by looking at demographic and survey data.

So here’s my most recent attempt at an all-encompassing visualization for the 2016 election:

This might look like a little bit like an abstract painting, but I promise it’s informative. Each color category represents the entire electorate broken up into subgroups by age, housing density, education, gender, race, and state. The size of the bubble represents the population of the group, and the coordinate position represents the margin and turnout. The tail of each bubble represents the change from the 2012 presidential election.

If you want to find out what changed this election, just find the large bubbles with the long tails. And to find the crucial states in the electoral college, find the bubbles that just crossed over the y-axis. Note: this visualization is interactive so you can zoom in for more detail, and filter categories using the legend.

So the older age groups, males, high school graduates, and very low density areas all shifted rightwards in 2016. At the same time, turnout dropped substantially among the Non-Hispanic Black populations. These changes were enough to bring crucial states like WI, PA and MI just across the threshold into Republican territory. No new insights here, but I think it’s interesting to have a single graphic that communicates it all.

A Few More Election Visualizations to Rule Them All

One nice thing about the Catalist dataset is that it has results for every 2 years, so we can look at midterm elections too. So the visualizations below just combines the Catalist margins data with Elections Project turnout data for each demographic.

For state level results in non-presidential years, I aggregated the MIT Elections Lab house election results and combined these results with the Election Project state turnout data. The bubble tails are staggered by 4 years so that presidential and house election results are compared against each other.

There’s a lot to take in here, but I think the most striking thing is the increase in turnout and leftward shift in 2018. Blue wave indeed. This change seems largely to be driven by college educated voters. Another crucial thing to note here is that in 2018 the house vote crossed into Democratic territory in both Wisconsin (Rep -7.5%) and Arizona (Rep -1.71%). These will be crucial swing states in 2020 for both the presidency and senate, so things could get interesting.

Update: Preliminary 2020 results have been added, based on aggregate county level results and the Elections Project state turnout data. The big story of 2020 so far seems to be turnout – specifially an increase in turnout (and a slight shift left) in low and medium density places that was just enough to outweigh the turnout increases in right leaning, very low density places. It’s also interesting to see Republican margins improve in high density places, although the county data is still preliminary so this result could change.

One interesting question is whether the margin changes in 2020 were driven by turnout or vote switching. Only time (and better sources of data) will tell. Catalist won’t be releasing their demographic data until January of 2021, so I’ll add that when it becomes available.

Data Issues

The real challenge with these plots is getting the data, especially the demographic data. The source of the demographic turnout data is the Census Current Population Survey, via the Election Projects dataset [4, 5]. The demographic margin data is courtesy of the Catalist dataset. The margin and turnout data for each housing density and state are derived from county and state level election results, along with Census Bureau county housing density data and Citylab density categories.

The Elections Project reports turnout as a percent of voting eligible population, while my density estimates use percent of citizen voting age population. So this does lead to some differences in estimated turnout:

category estimated_turnout
age 62.28
density 59.44
education 64.19
race 61.39
sex 60.79
state 60.94

Overall turnout was 55.4% of voting age population and 60.2% voting eligible population in 2016. So these numbers are in the right ballpark, but obviously they’re not perfect.

Sources

[1] MIT Election Labs, 2000-2016 County level presidential results: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/VOQCHQ

[2] Citizen Voting Age Population data: https://www.census.gov/programs-surveys/decennial-census/about/voting-rights/cvap.html

[3] United States Election Project, Demographic Turnout Data. http://www.electproject.org/home/voter-turnout/demographics

[4] United States Election Project, State Turnout Data. http://www.electproject.org/home/voter-turnout/voter-turnout-data

[5] American National Election Studies, demographic data. https://electionstudies.org/data-center/anes-time-series-cumulative-data-file/

[6] Catalist demographic data. https://docs.google.com/spreadsheets/d/1UwC_GapbE3vF6-n1THVbwcXoU_zFvO8jJQL99ouX3Rw/edit?ts=5beae6d4#gid=433702266

[7] Revisiting What Happened in the 2018 Election. Yair Ghitza. https://medium.com/@yghitza_48326/revisiting-what-happened-in-the-2018-election-c532feb51c0#_ftn1

[8] U.S. House 1976–2018. MIT MIT Election Data and Science Lab. https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/IG0UN2

[9] CityLab Congressional Density Index, Methodology. https://github.com/theatlantic/citylab-data/blob/master/citylab-congress/methodology.md

[10] Code source for this post: psthomas/onevis: https://github.com/psthomas/onevis