Introduction
The 2016 presidential election underscored the enduring fault lines between red and blue America. Typically, the electoral result is explained thus: Donald Trump did better in places that were rural, white, less educated, and less well off. That narrative is generally supported by the data. But these broad strokes only tell part of the story. If we ask a different question—can I look at a place and predict whether it will be red or blue—the answer is less clear. There is a lot of noise—rural areas that voted for Hillary Clinton, highly educated or very wealthy areas that voted for Trump, and so on. More importantly, the fault line changes—sometimes, it's race, sometimes it's education, sometimes it's something else. So where exactly are these fault lines, and how well do they capture the split between red and blue America?
The Major Fault Lines
To answer this question, I examined county-level results against a number of variables—urbanization, age, race, income and education. For each of those variables, I created two graphs: the first shows the relationship between that variable and Trump's victory margin in that county; the second shows the cumulative nation-wide margin according to this variable.
Consider, for instance, urbanization. The left chart is simple: each dot is a single county, color-coded by whether it voted for Trump or Clinton. The vertical axis shows the Trump vs. Clinton margin (positive means the county voted for Trump; negative for Clinton). The horizontal axis shows each county's level of urbanization. In this view, we're interested in how the dots are arranged: is there a clear relationship, for example, between the Trump vs. Clinton margin and how urban a county is?
The answer is: sort of. If we start from the far left, we are looking at counties that are 0% urban (totally rural). There are clearly more red dots (counties won by Trump) than blue dots. On the other end, among counties that are 100% urban, there are more blue dots than red dots, which means Clinton won more of those counties. Yet there is much variation in between. Clinton won many rural counties, and the Trump-Clinton margin in rural areas ranged from razor thin to almost 100%. It's hard to look at urbanization alone and say whether a place would be red or blue.
This picture, however, only tells part of the story. Counties vary greatly in size: Loving, Texas reported 65 votes cast, while Los Angeles, California reported 3.4 million. Both are shown as a single dot. In fact, Trump won 84% of all counties in the United States, even though he lost the popular vote by some 2.9 million votes. There is, in other words, an inherent bias in visualizing counties equally: there are many small counties, and they tend to vote Republican.
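The gap between counting counties and counting votes can be illustrated with a minimal sketch. The county names and vote counts below are made up for illustration and are not real results; the function names are likewise hypothetical, and the margin here is taken over the two candidates' combined votes only.

```python
# Hypothetical toy data: (county, trump_votes, clinton_votes). Not real results.
counties = [
    ("Small A", 60, 40),
    ("Small B", 70, 30),
    ("Small C", 55, 45),
    ("Big D", 400, 600),
]


def share_of_counties_won(rows):
    """Fraction of counties in which Trump received more votes than Clinton."""
    won = sum(1 for _, trump, clinton in rows if trump > clinton)
    return won / len(rows)


def vote_weighted_margin(rows):
    """Trump-minus-Clinton margin as a share of the two candidates' votes."""
    trump_total = sum(trump for _, trump, _ in rows)
    clinton_total = sum(clinton for _, _, clinton in rows)
    return (trump_total - clinton_total) / (trump_total + clinton_total)


print(share_of_counties_won(counties))  # 0.75
print(vote_weighted_margin(counties))   # -0.1
```

In this toy case Trump wins three of four counties yet trails by ten points once votes are pooled, which is the same kind of distortion a one-dot-per-county chart can produce.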
This brings us to the second visual, which is less intuitive but very telling. All counties are sorted according to their level of urbanization: the far left shows counties that are 0% urban and the far right shows counties that are 100% urban. The vertical axis, however, is now a running total: the cumulative Trump-Clinton margin as we move from left to right. In simple terms, if the trajectory is ascending, Trump is building a nation-wide lead in those counties; if the trajectory is descending, Clinton is building a nation-wide lead. If the trajectory is flat (meaning the cumulative margin is neither increasing nor decreasing), then those counties are either split evenly (small-margin victories) or they are small counties, so a victory by either side is not enough to move the national total in either direction.
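One way to compute such a running total is sketched below. This is not necessarily how the charts in this article were built: the data is made up, the function name is hypothetical, and the sketch assumes the denominator is the national total of all votes cast, so that the last point on the curve equals the overall national margin.

```python
from itertools import accumulate

# Hypothetical toy data: (pct_urban, trump_votes, clinton_votes, total_votes).
# total_votes may include third-party votes. Not real results.
counties = [
    (100.0, 300, 600, 1000),
    (0.0, 70, 25, 100),
    (50.0, 220, 160, 400),
    (90.0, 200, 280, 500),
]


def cumulative_margin(rows):
    """Sort counties by the variable and return (value, running margin) pairs.

    The running margin is the cumulative Trump-minus-Clinton vote difference
    divided by the national total of votes cast (an assumption of this sketch).
    """
    ordered = sorted(rows, key=lambda r: r[0])
    national_total = sum(r[3] for r in ordered)
    running = accumulate(r[1] - r[2] for r in ordered)
    return [(r[0], diff / national_total) for r, diff in zip(ordered, running)]


curve = cumulative_margin(counties)
for pct_urban, margin in curve:
    print(f"{pct_urban:5.1f}% urban: cumulative margin {margin:+.4f}")
```

With this toy data the curve rises through the rural and half-urban counties, then falls once the two most urban counties are added, ending at the overall margin.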
2016 Presidential Election: Results by county against level of urbanization
Source: Election results from the Associated Press (via New York Times). Urbanization from U.S. Census Bureau, 2010 Census (Table P2, URBAN and RURAL). Results for Alaska reported state-wide; input variables are also state-wide.
Let's start, again, from the far left: counties that are 100% rural. Trump is building a nation-wide margin in these areas. Trump and Clinton both won counties that were 100% rural, but Trump won more counties, bigger counties, or by wider margins (or some combination). By the time we leave the fully rural counties, Trump has a margin of around 0.7%. As we move to the right, counties become more urban, but Trump is still building a lead. In fact, by the time counties are about 85% urban, Trump has a lead that exceeds 10%. To be clear, this margin is calculated over the entire vote, which means that if every county to the right of whatever dot we're looking at voted 50:50 for Trump-Clinton, then the final tally would be a 10% victory margin for Trump. Then we reach an inflection point: when counties are 90% urban or more, Clinton starts to reduce Trump's margin and eventually reverses it. Again, there are still red dots in this space (highly urban areas won by Trump). But the overall trajectory is downward, meaning that winning urban counties was crucial to Clinton's final tally, even though lots of highly or totally urban areas voted for Trump.
Let's examine the other variables (if you want to explore the data yourself, go straight to the appendix). Start with age (median age in years). Here, we observe a possible link: older areas tend to show a higher victory margin for Trump. Even in this case, however, there is a lot of noise. The graph on the right is more compelling: the Trump vs. Clinton cumulative margin is roughly steady across counties with low median ages; Clinton then built a lead between median ages of 32 and 38 years; from then on, Trump showed gains, which petered out in counties with very high median ages (50+ years).
2016 Presidential Election: Results by county against median age in years
Source: Election results from the Associated Press (via New York Times). Median age from U.S. Census Bureau, 2011-2015 American Community Survey 5-Year Estimates, HC01_EST_VC35 in Table S0101, AGE AND SEX. Results for Alaska reported state-wide; input variables are also state-wide.
Race shows a clearer pattern. On the scatter plot, Trump needed a county to be at least 20% white before winning. Even then, the victories were few. The areas where he won bigger leads were even more white. The cumulative tally tells an even clearer story: Clinton built a commanding lead until counties were about 80% white; from then on, Trump generally won. The results are somewhat similar when looking at the share of the population that is Black or African American (see appendix): Trump built a lead until counties were about 5% Black or African American; then there are some ups and downs, but once the share reaches 20% or more, Clinton generally built a solid lead. The trajectory is similar when it comes to the percent of population with Hispanic or Latino origin.
2016 Presidential Election: Results by county against percent white
Source: Election results from the Associated Press (via New York Times). Percent white from U.S. Census Bureau, 2011-2015 American Community Survey 5-Year Estimates, HD01_VD02 divided by HD01_VD01 in Table B02001, RACE. Results for Alaska reported state-wide; input variables are also state-wide.
When it comes to income, the scatter plot is inconclusive. But the cumulative total is clearer: Trump built a lead in areas where median household income ranged from about $35,000 a year to $50,000 a year; after that point, and especially after $65,000 a year, counties went to Clinton, although there were occasional flat stretches (counties either split evenly or too small to change the running total). Other measures show mixed results. Poverty shows a zigzag pattern: Trump built a lead until a big county came in for Clinton, which pushed the curve down (see appendix). In terms of employment, Trump built a lead in counties where the employment/population ratio (the share of people over 16 who are employed) ranged from 35% to 55%. Then, Clinton built a margin, although there was a mini Trump rally on the far right of the chart (see appendix).
2016 Presidential Election: Results by county against median household income
Source: Election results from the Associated Press (via New York Times). Median household income from U.S. Census Bureau, State and County Estimates for 2015. Results for Alaska reported state-wide; input variables are also state-wide.
Education shows no clear pattern when it comes to the percent of people who have finished high school (see appendix). It is clearer, however, when it comes to higher education. The scatter plot shows a weak downward slope, meaning that counties with more people who have higher education degrees favored Clinton. The cumulative margin graph makes this even clearer: if more than 30% of the population had a bachelor's degree or higher, then those counties tended to help Clinton build a lead.
2016 Presidential Election: Results by county against percent with bachelor's or higher
Source: Election results from the Associated Press (via New York Times). Educational attainment from U.S. Census Bureau, 2011-2015 American Community Survey 5-Year Estimates, HC02_EST_VC18 in Table S1501, EDUCATIONAL ATTAINMENT. Results for Alaska reported state-wide; input variables are also state-wide.
On a country-wide level, the scatter plots were inconclusive: too much noise. At a state level, however, the scatter plots are very interesting (see appendix). For instance, if you look at Alabama, Arkansas, Georgia, Mississippi or South Carolina, the correlation between percent white and Trump's victory margin is clear. In other places, like California, there is a step change: Trump only won counties that were at least 70% white. Urbanization shows a much clearer correlation in Illinois, Maryland, Michigan, New Jersey, Ohio and Pennsylvania, but the link is less clear in Mississippi, New Mexico, or North Carolina.
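The state-level comparisons here rest on reading scatter plots, but one possible way to put a number on "clear" versus "less clear" is a per-state Pearson correlation between a county variable and Trump's margin. The sketch below is an illustration only: the state codes "AA" and "BB" and all values are made up, the function names are hypothetical, and a single correlation coefficient would not capture a step change like the one described for California.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy rows: (state, pct_white, trump_margin). Not real results.
rows = [
    ("AA", 30.0, -0.40), ("AA", 50.0, -0.05), ("AA", 70.0, 0.20), ("AA", 90.0, 0.55),
    ("BB", 40.0, 0.10), ("BB", 60.0, -0.20), ("BB", 75.0, 0.15), ("BB", 85.0, -0.05),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_by_state(data):
    """Group county rows by state and correlate the variable with the margin."""
    grouped = defaultdict(list)
    for state, pct_white, margin in data:
        grouped[state].append((pct_white, margin))
    return {
        state: pearson([p for p, _ in pairs], [m for _, m in pairs])
        for state, pairs in grouped.items()
    }


result = correlation_by_state(rows)
for state, r in sorted(result.items()):
    print(f"{state}: r = {r:+.2f}")
```

In the toy data, "AA" behaves like a state where percent white tracks the margin closely, while "BB" shows almost no linear relationship.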
It is equally interesting to look at individual states. In Michigan and Pennsylvania, urbanization tells a compelling story: a few large urban centers were closing the gap for Clinton but weren't enough to tilt the overall state in her favor. The same is true for Ohio, although Trump's margin in Ohio was higher. By contrast, in Virginia, those last few urban centers led to a win for Clinton. In Illinois, it's a single urban county (Cook) that shifts the balance.
In short, there are very interesting sub-stories to explore at the state level—stories that are hard to notice at the country-level graphs above.
Conclusion
What do these results show us? In my view, there are two main takeaways. First, the conventional narrative is correct but incomplete: yes, Trump built leads in counties that were rural, white, less educated, or had lower employment levels or lower median household incomes. But that story only tells us what drove the electoral result; it does not, on its own, tell us exactly what makes a county red or blue, or how much. Second, at the state level, we have lots of examples of cases that either fit nicely into the nation-wide trend or really deviate from it. We have examples where race is a very clear predictor, and other examples where race does not seem to tell us much about how a county would vote. We have stories where the urban/rural split is pronounced, and others where it is not.
In short, if we ask a narrow question—what drove the electoral result—the standard answer is, more or less, accurate. But if we ask a different question—what makes a place red or blue—the answer is far more complicated, and impossible to distill into a few pithy sentences.
Appendix
Source: Election results from the Associated Press (via New York Times). All other variables from U.S. Census Bureau. Urbanization from 2010 Census (Table P2, URBAN and RURAL). Median age, race, origin, employment/population ratio and educational attainment from 2011-2015 American Community Survey 5-Year Estimates (median age: HC01_EST_VC35 in Table S0101, AGE AND SEX; White alone: HD01_VD02/HD01_VD01 in B02001 RACE; Black or African American alone: HD01_VD03/HD01_VD01 in B02001 RACE; origin: HD01_VD03/HD01_VD01 in B03003 HISPANIC OR LATINO ORIGIN; employment/population ratio: HC03_EST_VC01 in Table S2301, EMPLOYMENT STATUS; educational attainment: HC02_EST_VC17 and HC02_EST_VC18 in Table S1501, EDUCATIONAL ATTAINMENT). Median income and poverty from State and County Estimates for 2015. Results for Alaska reported state-wide; input variables are also state-wide. For more information on the data sources and approach, please visit my GitHub repository.