I found Sam Wang’s article this morning interesting for several reasons. Wang says that Hillary Clinton has a 99% probability of winning tomorrow’s presidential election, and Nate Silver says that it’s kind of ridiculous to posit a number that high. Silver’s “now-cast” number, which estimates the probability if the election were held today, is at 66.7%. Wang does a nice, clear job of explaining how the two of them can arrive at such wildly different numbers.
There's a reasonable range of disagreement. But a model showing Clinton at 98% or 99% is not defensible based on the empirical evidence.
— Nate Silver (@NateSilver538) November 5, 2016
Essentially, the two polling analysts have different goals. Silver wants to estimate final margins in all 56 presidential elections tomorrow. There are 56 presidential elections because there are 50 individual states, the District of Columbia, and five congressional districts (in Maine and Nebraska) that will allocate the Electoral College votes. There are also a couple of potentially faithless electors in Washington State who may overrule the people they’re supposed to represent and vote for some third-party candidate. Because Silver has so many estimates to make, he has more variables and parameters than Wang. That introduces more uncertainty, and more uncertainty works against making an aggressive prediction in favor of the frontrunner. As Wang says, it’s not that Silver is biased against Clinton. His model would be biased against any frontrunner because it’s biased against certainty.
Wang, on the other hand, has a goal of estimating where people’s time and organization should be best spent. If the presidential election is a slam-dunk then people should spend their time phonebanking for close Senate and House races. He doesn’t care how much Clinton wins by in North Carolina vs. Rhode Island. He doesn’t even really care who wins those states. All he cares about is estimating who will win the overall election. Therefore, he has very few parameters and variables to worry about and he can rely more heavily on the data without introducing a lot of his own uncertainty to the analysis.
Of course, Wang still has to make some assumptions. His main task is to estimate how far the actual results will be from the meta-margin (which is defined by how much the polls would have to move in Trump’s direction to create a perfect toss-up). In this case, that number is 2.6%. In each of the last three elections, the winner outperformed the meta-margin: 2004 (1.3%), 2008 (1.2%), and 2012 (2.3%). The 2000 election was a special case because Al Gore won the popular vote and should have won the Electoral College but did not become the president. He actually underperformed the meta-margin, but probably by less than a point. Basically, the 2000 polls were accurate in predicting the popular vote. The important thing is that, while it’s a pitifully small sample, there is no recent precedent for the meta-margin being off by as much as 2.6%, and the trend has been for the polls to underestimate the size of the win they are predicting (which, in this case, cuts against Trump’s chances).
To make this prediction, Wang increased his sample by looking at historical Senate races from the same period, and he came up with an estimate that the polls are likely to be off by only +/- 0.8%. This gives him 99% confidence that Clinton will win. Under alternative assumptions, he comes up with a winning probability of 93% using a slightly different distribution formula, 91% using a +/- 1.5% estimate, and 68% using a +/- 5.0% estimate.
In other words, to arrive at Silver’s number, he has to assume that the polls are off by 5% and that there’s a 50% chance that they’re off in Trump’s favor.
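To see how sensitive the headline probability is to that one error estimate, here is a rough sketch of the arithmetic. It assumes the polling error follows a normal distribution centered on zero, which is my simplification and not necessarily the formula Wang uses, so the printed values should not be expected to match his exactly. The function name `win_probability` is made up for illustration.

```python
from statistics import NormalDist


def win_probability(meta_margin, sigma):
    """Chance the leader still wins after a random polling error.

    Assumption (not from Wang's article): the error is normally
    distributed with mean 0 and standard deviation `sigma`, both in
    percentage points. The leader loses only if the error exceeds the
    meta-margin in the opponent's favor.
    """
    return NormalDist(mu=0.0, sigma=sigma).cdf(meta_margin)


# The 2.6-point meta-margin and the three error sizes come from the article.
for sigma in (0.8, 1.5, 5.0):
    p = win_probability(2.6, sigma)
    print(f"error of +/- {sigma:.1f} points -> P(win) = {p:.3f}")
```

The pattern is what matters here: the same 2.6-point lead looks like a near-lock when the assumed error is small and not much better than a favorite's edge when the assumed error is large.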
Now, Silver’s primary argument against this is that polling errors tend to be highly correlated, so that if the polls are wrong in Clinton’s direction in Ohio they are likely to be similarly off in Pennsylvania and other states.
If you haven't carefully tested how errors are correlated between states, for example, your model will be way overconfident.
— Nate Silver (@NateSilver538) November 5, 2016
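A small calculation shows why correlation matters so much. This is a textbook variance formula, not Silver's model: it assumes every state has the same error size `sigma` and every pair of states shares the same correlation `rho`, both of which are simplifications I'm making for illustration.

```python
import math


def sd_of_average_error(sigma, n_states, rho):
    """Standard deviation of the average polling error across states.

    Assumes equal per-state error `sigma` and a common pairwise
    correlation `rho` (hypothetical simplifications):
        Var(mean) = sigma^2 * (1 + (n - 1) * rho) / n
    """
    return sigma * math.sqrt((1.0 + (n_states - 1) * rho) / n_states)


# Hypothetical inputs: a 3-point error in each of 9 states.
for rho in (0.0, 0.5, 1.0):
    sd = sd_of_average_error(3.0, 9, rho)
    print(f"rho = {rho:.1f} -> sd of average error = {sd:.2f} points")
```

With no correlation, the state-level errors mostly cancel when averaged. With full correlation, nothing cancels, and the average error is as large as the error in any single state. A model that ignores the correlation would treat the first case as true when the second might be closer to reality.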
Wang doesn’t dispute this, but he argues that his model accounts for it, and not by rating every pollster and adding all the uncertainty of weighting polls’ accuracy.
For each state, my code calculates a median and its standard error, which together give a probability. This is done for each of 56 contests: the 50 states, the District of Columbia, and five Congressional districts that have a special rule. Then a compounding procedure is used to calculate the exact distribution of all possibilities, from 0 to 538 electoral votes, without need for simulation. The median of that is the snapshot of where conditions appear to be today.
Note that in 2008 and 2012, this type of snapshot gave the electoral vote count very accurately – closer than FiveThirtyEight, in fact.
This approach has multiple advantages, not least of which is that it automatically sorts out uncorrelated and correlated changes between states. As the snapshot changes from day to day, unrelated fluctuations between states (such as random sampling error) get averaged out. At the same time, if a change is correlated among states, the whole snapshot moves.
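For readers wondering what an exact distribution "without need for simulation" might look like, here is a minimal sketch of one way to do it. It is not Wang's code. It takes a win probability for each contest, treats the contests as independent within a single snapshot (an assumption of this sketch), and builds up the full distribution of electoral votes one contest at a time. The function names and the three sample contests are hypothetical.

```python
def ev_distribution(contests):
    """Exact distribution of a candidate's electoral votes.

    `contests` is a list of (electoral_votes, win_probability) pairs.
    Returns a list where index k holds the probability of winning
    exactly k electoral votes, assuming independent contests.
    """
    total = sum(ev for ev, _ in contests)
    dist = [0.0] * (total + 1)
    dist[0] = 1.0  # before any contest is counted, 0 votes is certain
    for ev, p in contests:
        updated = [0.0] * (total + 1)
        for votes, mass in enumerate(dist):
            if mass == 0.0:
                continue
            updated[votes] += mass * (1.0 - p)  # candidate loses this contest
            updated[votes + ev] += mass * p     # candidate wins this contest
        dist = updated
    return dist


def median_electoral_votes(dist):
    """Smallest vote count whose cumulative probability reaches 50%."""
    cumulative = 0.0
    for votes, mass in enumerate(dist):
        cumulative += mass
        if cumulative >= 0.5:
            return votes
    return len(dist) - 1


# Made-up contests for illustration only: (electoral votes, win probability).
sample = [(29, 0.60), (18, 0.45), (4, 0.90)]
dist = ev_distribution(sample)
print("median electoral votes:", median_electoral_votes(dist))
```

Running the same update over all 56 contests would give every outcome from 0 to 538 electoral votes, and the median of that distribution plays the role of the snapshot Wang describes.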
At this point, I’ll let the statisticians arbitrate who is right. If Wang and Silver can’t agree on this point, I am hardly qualified to decide between them.
The final part of Wang’s column that I found interesting was his description of meeting with financial investors.
Let me start by pointing out that FiveThirtyEight and the Princeton Election Consortium have different goals. One site has the goal of being correct in an academic sense, i.e. mulling over many alternatives and discussing them. The other site is driven by monetary and resource considerations. However, which is which? It’s opposite to what you may think.
Several weeks ago I visited a major investment company to talk about election forecasting. Many people there had strong backgrounds in math, computer science, and physics. They were highly engaged in the Princeton Election Consortium’s math and were full of questions. I suddenly realized that we did the same thing: estimate the probability of real-world events, and find ways to beat the “market.”
Silver has been criticized for having a financial incentive to make the election appear closer than it is, but Wang is defending FiveThirtyEight against that charge and saying that he is the one who is seeking money. But he’s also saying that he isn’t distracted by academic questions and is much more focused on results and the bottom line. Just like investors want to use math to help them predict future events and decide where to invest their money, Wang wants to use math to help people decide how to invest their time.
If you think Wang is right, you should be phone banking and knocking on doors for down-ticket candidates because Clinton doesn’t need your help.