Confidently Uncertain: Introducing ‘Solid Purple’
The more you know, the more you realize you don’t know.
If you’re reading this, we probably have a few things in common. You might not literally have brain worms like Robert Kennedy Jr., but you relate to the sentiment to the degree that you know exactly what I’m talking about when I refer to RFK Jr. having brain worms. You might even find your mood shifting between elation and rage purely on the whims of whatever college few people have ever heard of decides to publish a poll on a Wednesday afternoon. It’s almost as if the fate of the free world hangs in the balance or something.
Riding the pollercoaster, clearly, drives people crazy. It’s even worse when you’re on it and you see people who are supposed to be very serious completely misunderstanding what a particular survey does and doesn’t say. Part of it is ignorance, but more importantly, part of it is driven by motivated reasoning. Polls that I like are real, and polls that I don’t like are methodologically dubious, conducted only to manufacture a media narrative that serves the publisher’s ideological priors. Obviously.
You probably know that any one individual poll that just fell out of a coconut tree should be taken seriously but not literally. “Throw it in the average,” they say. It exists within the context in which it lives and what came before it. Rationally, that makes sense, but it has never really emotionally resonated with me. When Nate Cohn wakes up and decides to choose violence, it colors my entire day until Quinnipiac rolls in to declare that Democrats are winning senior citizens by 30 points or whatever.
So I found a way to break myself out of this cycle...I’ll build my own damn average, with blackjack and hookers!
For the last six months, I’ve been tinkering away, taking a deep dive into the data and forcing myself to commit to methodological choices that could make me look prescient or totally foolish in six months, but if nothing else, I will have been consistent. I chose which cherries I’m going to pick nearly a year out from the election, and there is no changing my mind according to whatever suits my preferred narrative.
Over time, I will elaborate further on the methodology of this forecast and how it differs from what actually qualified academics and statisticians have come up with to try to predict the future. I cannot stress enough that what makes this model different from the others is what makes it objectively worse: I’m a hobbyist who barely knew how to do this at the start. I f-ed around with statistical forecasting in college a little more than a decade ago. I read papers and reviewed GitHub repos from folks like G. Elliott Morris (now at FiveThirtyEight) and could not have approached this without folks who did the legwork on how to do this and pointedly refused to gatekeep it. That said, a lot of it went over my head! If FiveThirtyEight is a hydrogen bomb, my forecast is a coughing baby.
Which brings me to the name...
Qualitative forecasts of elections, like Larry Sabato’s Crystal Ball or the Cook Political Report, sort races into categories ranging from Safe/Solid R to Safe/Solid D, with everything in between being Lean, Likely, or if it’s a true jump ball, Tossup. Since Republicans are red and Democrats are blue (we don’t need to get into that right now), people often call swing states and districts “purple.”
I’m in some group chats where friends and family members are very sure they know what’s going to happen in November. Trump’s got this in the bag. Biden’s going to pull through. These are people who aren’t paying a ton of attention to election coverage, let alone the deltas in the crosstabs between polls of the Presidential race and the generic ballot. But they can read the vibes and go with their gut, and that’s usually accurate enough for them.
On the spectrum of political observers between people who touch grass and the kids who meticulously map the swing of every precinct between 2006 and 2014 in some bellwether county of 3,000 people, I’m somewhere in the middle. I know just enough to sound smart to the normies, but the real heads can immediately clock that I’m a bit out of my depth.
I’ve read the polls. I’ve studied the political science research. I gathered the fundraising data, extrapolated the population and demographic data, controlled for correlated error across states and districts, modeled uncertainty based on national and state polling error over time, sussed out how much ticket splitting and third-party protest voting actually occurs vs. what people report in surveys, etc. etc. etc. With that, I feel well-qualified to rate the 2024 election, and name my forecast of it:
SOLID PURPLE.
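As a quick illustration of one item on the list above, here is a minimal sketch of what simulating correlated polling error across states can look like. Everything here is made up for demonstration purposes: the states, margins, standard deviation, and correlation values are hypothetical, not the actual parameters of this model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical Dem margins (points) in three states, and an assumed
# correlation structure: polling errors tend to move together.
states = ["PA", "WI", "MI"]
mean_margin = np.array([0.5, 1.0, 2.0])
corr = np.array([[1.0, 0.8, 0.8],
                 [0.8, 1.0, 0.8],
                 [0.8, 0.8, 1.0]])
sd = 4.0  # assumed polling-error standard deviation, in points
cov = corr * sd**2

# Simulate 10,000 elections; the correlated error means states tend
# to miss in the same direction, fattening the tails of the forecast.
sims = rng.multivariate_normal(mean_margin, cov, size=10_000)
dem_sweeps = (sims > 0).all(axis=1).mean()
print(f"Dem wins all three states in {dem_sweeps:.0%} of simulations")
```

The point of the correlation matrix is that if the polls are off in Pennsylvania, they are probably off the same way in Wisconsin and Michigan, so sweep scenarios are far more common than independent coin flips would suggest.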
With that, it could not be more fitting that on the weekend the model is going live, the Presidential race and the battle for control of the House of Representatives are essentially tied. Not only that, but one of the major candidates and the God Emperor of the Republican Party, Donald Trump, was convicted on 34 felony counts in the state of New York, a fact that has the potential to completely uproot (or not!) every underlying assumption we have about the upcoming election.
This model will not be updated daily for a few reasons. As you’ll see, there’s not much value in doing that, at least right now, because the forecast has been incredibly stable so far. The text and graphs you see on the site are not dynamic (yet) and it’s a lot of work to go through and update it all, especially if all that’s changing is the vote share went up or down a tenth of a percentage point. Also, as I outlined above, this began as a sort of mental health exercise for an audience of one, and you know what’s not healthy? Refreshing the page of an election forecast waiting for a decimal point to change any time YouGov publishes its third poll in a week.
The timeline of model updates will accelerate over the course of the cycle, as will changes under the hood to the weighting of data. As you might be able to suss out as you take a look at the various tables and charts on the site, there is essentially no weighting on individual state polls at this time. The numbers you see are heavily reliant on national polling of the Presidential election as well as the generic ballot, at least until we reach the point where voters are actively aware of who the candidates are going to be and a consistent pattern emerges to indicate someone is going to over/underperform the national environment for reasons specific to that particular race.
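To give a rough sense of how a recency-weighted national average might work, here is a minimal sketch with made-up numbers. This is not the model’s actual code: the pollster names, margins, and the half-life parameter are all hypothetical.

```python
# Hypothetical national polls: (pollster, days_old, dem_margin in points)
polls = [
    ("Pollster A", 2, 1.0),
    ("Pollster B", 7, -0.5),
    ("Pollster C", 20, 2.0),
]

def recency_weight(days_old, half_life=14):
    """Exponential decay: a poll half_life days old counts half as much."""
    return 0.5 ** (days_old / half_life)

def weighted_average(polls):
    """Average the margins, down-weighting older polls."""
    weights = [recency_weight(days) for _, days, _ in polls]
    weighted_sum = sum(w * margin for w, (_, _, margin) in zip(weights, polls))
    return weighted_sum / sum(weights)

print(f"Weighted national average: D+{weighted_average(polls):.2f}")
```

A real average would also weight on sample size, pollster track record, and methodology, but the decay idea is the core of why a fresh poll moves the number more than a stale one.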
You may notice that the Presidential and Senate forecasts are pretty consistent with conventional wisdom at this point, but a few of the House races are, at first glance, kind of funky. Solid Purple relies heavily on a weighted average of crosstabs in polls, and prioritizes surveys that actually release that data. There are limitations to this approach, and if this model were purely taking crosstabs at face value, the results would really strain credulity unless we have the kind of seismic coalitional realignment in 2024 that completely breaks the field of election forecasting forever. This isn’t even taking into account all the things I don’t know I don’t know (you know?).

But the fact remains that the story being told in the numbers this cycle is that Democrats’ erosion in the polls vs. 2020 is almost entirely concentrated among younger, non-white voters, and unless you throw those numbers out entirely and strictly focus on the topline results, you are going to find some regions shifting 10-20 points in any given direction while other regions stay put. In the near future, I will draft an article and some interactives demonstrating how the national polling could manifest at the district level if it is in any way an accurate reflection of the state of our political landscape.
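The mechanics of why uniform national crosstab shifts produce wildly uneven district swings can be sketched in a few lines. To be clear, this is a toy illustration, not the model: the demographic groups, shift sizes, and district compositions below are all invented for the example.

```python
# Assumed national shift in Dem margin vs. 2020, by group (points).
# These numbers are hypothetical, chosen to mimic the "erosion concentrated
# among younger, non-white voters" pattern described above.
group_shift = {"young_nonwhite": -15.0, "young_white": -3.0,
               "older_nonwhite": -5.0, "older_white": 1.0}

# Hypothetical district electorates (group shares sum to 1).
districts = {
    "Urban district": {"young_nonwhite": 0.40, "young_white": 0.20,
                       "older_nonwhite": 0.25, "older_white": 0.15},
    "Rural district": {"young_nonwhite": 0.05, "young_white": 0.15,
                       "older_nonwhite": 0.05, "older_white": 0.75},
}

def district_swing(composition):
    """Weight each group's national shift by its share of the electorate."""
    return sum(share * group_shift[g] for g, share in composition.items())

for name, comp in districts.items():
    print(f"{name}: {district_swing(comp):+.1f} pts vs. 2020")
```

With these made-up inputs, the urban district swings nearly eight points against Democrats while the rural district barely moves, even though both are experiencing the exact same national crosstab shifts. That is the basic arithmetic behind the funky-looking House numbers.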
So with that...thank you for indulging me on this little project of mine! I’m sure there are lots of little bugs, errors, and oversights all over this damn thing, so if you find any, hit me up on Twitter @zachheltzel or email zach@solid-purple.com.