I’m fascinated with creating a single visualization that can fully communicate the results of an election. I’ve made a few attempts in the past and even came up with a set of criteria for the ideal visualization. More or less, I want to answer the questions “What happened?”, “What changed?”, and “Why?” in a single visualization. Of course, nobody can fully answer the “Why?” question, but you can point towards an answer by looking at demographic and survey data.
So here’s my most recent attempt at an all-encompassing visualization for the 2016 election:
This might look a little bit like an abstract painting, but I promise it’s informative. Each color category represents the entire electorate broken up into subgroups by age, housing density, education, gender, race, and state. The size of the bubble represents the population of the group, and the coordinate position represents the margin and turnout. The tail of each bubble represents the change from the weighted average (60/30/10) of the 2012, 2008, and 2004 elections.
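To make the tail calculation concrete, here is a minimal sketch of how a 60/30/10 weighted baseline and the resulting tail could be computed for one subgroup. The function names, the sign convention (positive margin meaning Republican), and all of the numbers are assumptions for illustration; they are not taken from the post's actual code or data.

```python
# Assumed weights for the 2012, 2008, and 2004 elections (60/30/10).
BASELINE_WEIGHTS = {2012: 0.6, 2008: 0.3, 2004: 0.1}


def weighted_baseline(values_by_year, weights=BASELINE_WEIGHTS):
    """Weighted average of one metric (margin or turnout) over past elections."""
    total_weight = sum(weights.values())
    weighted_sum = sum(values_by_year[year] * w for year, w in weights.items())
    return weighted_sum / total_weight


def bubble_tail(current, history):
    """Return the (start, end) points of a tail in (margin, turnout) space.

    The tail starts at the weighted baseline and ends at the current result.
    """
    start = (
        weighted_baseline(history["margin"]),
        weighted_baseline(history["turnout"]),
    )
    end = (current["margin"], current["turnout"])
    return start, end


# Made-up values for a hypothetical subgroup, in percentage points.
history = {
    "margin": {2012: -4.0, 2008: -8.0, 2004: 2.0},
    "turnout": {2012: 60.0, 2008: 62.0, 2004: 58.0},
}
current = {"margin": 3.0, "turnout": 59.0}

start, end = bubble_tail(current, history)
# Horizontal component is the margin shift, vertical is the turnout change.
shift = (end[0] - start[0], end[1] - start[1])
print(start, end, shift)
```

With these made-up inputs the tail points right and slightly down: the group moved toward the Republican side while its turnout dipped a little relative to the baseline.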
If you want to find out what changed this election, just find the large bubbles with the long tails. And to find the crucial states in the electoral college, find the bubbles that just crossed over the y-axis. Note that this visualization is interactive so you can zoom in and mouseover for labels.
So the older age groups, males, high school graduates, and very low density areas all shifted rightwards in 2016. At the same time, turnout dropped substantially among the Non-Hispanic Black populations. These changes were enough to bring crucial states like WI, PA and MI just across the threshold into Republican territory. No new insights here, but I think it’s interesting to have a single graphic that communicates it all.
All the code and data for this post are available in the psthomas/onevis repository on GitHub.
The real challenge with a plot like this is getting the data, especially the demographic data. The source for the turnout data in this plot is the Census Current Population Survey. The margin data for each demographic is from the American National Election Studies cumulative data file. Each of these sources is released in the spring of each year, so you need to wait for a few months to get a clear picture about what happened in an election. Catalist have announced a dataset that will be available more immediately after an election, so maybe I’ll use that in the future (although it’s still not possible to calculate turnout by subgroup using their data alone).
The eagle-eyed among you may have noticed a mathematical impossibility on the plot. The housing density subgroups all have a lower turnout than the sex subgroups. This isn’t possible, because each category splits up the same electorate, and it happens because the numbers are derived from two different sources of data. The housing density data is just from county-level Citizen Voting Age Population data, while the sex data is from the Census Bureau’s Current Population Survey. If I calculate the expected national turnout based on each group’s values, these are the results:
So obviously there’s some variation there; maybe we should just blame it on measurement error? Overall turnout was 55.4% of the voting age population and 60.2% of the voting eligible population in 2016. So these numbers are in the right ballpark, but obviously they’re not perfect.
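The consistency check described above can be sketched as a population-weighted average of subgroup turnout rates. Since every category partitions the same electorate, each one should imply roughly the same national turnout. The function name and all of the populations and rates below are made up for illustration and are not the values behind the plot.

```python
def expected_national_turnout(groups):
    """Population-weighted average of subgroup turnout rates.

    `groups` maps a subgroup name to a (population, turnout_rate) pair.
    """
    total_population = sum(pop for pop, _ in groups.values())
    total_votes = sum(pop * rate for pop, rate in groups.values())
    return total_votes / total_population


# Made-up populations (in millions) and turnout rates, for illustration only.
by_sex = {
    "female": (52.0, 0.60),
    "male": (48.0, 0.55),
}
by_density = {
    "low": (30.0, 0.50),
    "medium": (40.0, 0.55),
    "high": (30.0, 0.60),
}

turnout_from_sex = expected_national_turnout(by_sex)
turnout_from_density = expected_national_turnout(by_density)

# If both categories came from one consistent source, this gap would be ~0.
gap = turnout_from_sex - turnout_from_density
print(f"sex: {turnout_from_sex:.3f}, density: {turnout_from_density:.3f}, gap: {gap:.3f}")
```

A nonzero gap like this one is a sign that the two categories were measured against different denominators or with different survey methods, which is the situation described above.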
A Few More Election Visualizations to Rule Them All
I finally got around to parsing the Catalist demographic margins dataset and combined it with the Elections Project turnout dataset. The nice thing about these sources is that they cover elections every 2 years, so I can show results in between presidential election years.
To get state-level results in non-presidential years, I aggregated the MIT Election Data and Science Lab House election results and combined these results with the Elections Project state turnout data. Instead of comparing against a weighted average of the past three elections, the bubble tails are now just staggered by 4 years, so that each election is compared against the previous election of the same type (presidential against presidential, House against House).
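As a rough sketch of those two steps, the code below sums candidate-level vote rows into a state-level Republican margin and then pairs each result with the one from four years earlier. The row layout, the field names, and the vote counts are hypothetical; the real MIT dataset has its own schema, and this is not the post's actual pipeline.

```python
from collections import defaultdict


def state_margins(rows):
    """Aggregate vote rows into a Republican margin (percentage points) per (state, year)."""
    totals = defaultdict(lambda: {"rep": 0, "dem": 0, "all": 0})
    for row in rows:
        key = (row["state"], row["year"])
        totals[key]["all"] += row["votes"]
        if row["party"] in ("rep", "dem"):
            totals[key][row["party"]] += row["votes"]
    return {
        key: 100.0 * (t["rep"] - t["dem"]) / t["all"]
        for key, t in totals.items()
    }


def four_year_tails(margins, lag=4):
    """Pair each (state, year) margin with the same state's margin `lag` years earlier."""
    tails = {}
    for (state, year), margin in margins.items():
        previous = margins.get((state, year - lag))
        if previous is not None:  # skip years with no earlier comparison point
            tails[(state, year)] = (previous, margin)
    return tails


# Made-up rows for a hypothetical state "XX"; two districts per year.
rows = [
    {"state": "XX", "year": 2014, "party": "rep", "votes": 300},
    {"state": "XX", "year": 2014, "party": "dem", "votes": 200},
    {"state": "XX", "year": 2014, "party": "rep", "votes": 250},
    {"state": "XX", "year": 2014, "party": "dem", "votes": 250},
    {"state": "XX", "year": 2016, "party": "rep", "votes": 500},
    {"state": "XX", "year": 2016, "party": "dem", "votes": 500},
    {"state": "XX", "year": 2018, "party": "rep", "votes": 480},
    {"state": "XX", "year": 2018, "party": "dem", "votes": 520},
]

margins = state_margins(rows)
tails = four_year_tails(margins)
print(tails)
```

Only 2018 gets a tail here, because it is the only year with a result exactly four years earlier. A two-year lag would instead compare a midterm against a presidential year, which mixes two different kinds of electorate.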
There’s a lot to take in here, but I think the most striking thing is the increase in turnout and leftward shift in 2018. Blue wave indeed. This change seems to be driven largely by college-educated voters. Another crucial thing to note here is that in 2018 the House vote crossed into Democratic territory in both Wisconsin (Rep -7.5%) and Arizona (Rep -1.71%). These will be crucial swing states in 2020 for both the presidency and the Senate, so things could get interesting.
 2004-2008 County Voting data: https://github.com/helloworlddata/us-presidential-election-county-results
 2005-2009 County VAP data: https://www.census.gov/programs-surveys/decennial-census/about/voting-rights/cvap.html
 2012-2016 County Voting and VAP data: https://github.com/kyaroch/2012_and_2016_presidential_election_results_by_county
 United States Elections Project, Demographic Turnout Data. http://www.electproject.org/home/voter-turnout/demographics
 United States Elections Project, State Turnout Data. http://www.electproject.org/home/voter-turnout/voter-turnout-data
 American National Election Studies. https://electionstudies.org/data-center/anes-time-series-cumulative-data-file/
 Revisiting What Happened in the 2018 Election. Yair Ghitza. https://medium.com/@yghitza_48326/revisiting-what-happened-in-the-2018-election-c532feb51c0#_ftn1
 U.S. House 1976–2018. MIT Election Data and Science Lab. https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/IG0UN2
 Code source for this post: psthomas/onevis: https://github.com/psthomas/onevis