I agree with Ed Kilgore that we cannot cover politics well if we simply ignore polls, and there’s a good debate to be had about how best to aggregate polls. But Kilgore ignored the subject that most interests me. And that is the likelihood that Gallup and (particularly) Rasmussen polls were deliberately favorable to Mitt Romney in the last polling cycle, and what that did to benefit both Romney and Republicans in general.
Candidates operate within a larger nebulous media environment that they can influence but not control. It is easier to function as a candidate in an environment in which you are not being told day after day that the polls show that you have no more than a 33% chance of winning. That was what Nate Silver’s aggregator consistently showed almost all last year, and he still overestimated Romney’s final standing because all the pollsters (but especially Gallup and Rasmussen) were off.
Other aggregators that did not correct for house effect were much worse and more commonly used on television and in print. The effect was to create a false impression that the race was closer than it was, which (on the whole) was beneficial to Romney.
An argument can be made that this false impression created an unwarranted complacency on Romney’s part, preventing him from taking big chances that could have turned the race. Yet, frankly, understanding the role of simple denial in the Romney campaign is a separate topic from how we should interpret polls and poll aggregations in the future. Romney’s in-house pollster may have been the worst of all, and we can’t really know why they were delivering such rose-tinted news to their boss.
In general, closer-than-reality polling was beneficial for Romney. In addition to contributing to an easier media environment, it helped him continue to rake in money until the very end of the campaign.
For media outlets that were more focused on getting eyeballs than getting it right, closer-than-reality polls were helpful because they made the campaign seem more interesting and newsworthy.
So, there are a lot of issues surrounding polling beyond just explaining why the pollsters underestimated Obama’s lead.
“That was what Nate Silver’s aggregator consistently showed almost all last year, and he still overestimated Romney’s final standing because all the pollsters (but especially Gallup and Rasmussen) were off.”
People still take Rasmussen seriously, why? Did people read who was part of that National Review cruise right after the election?
They were used in the aggregators. And Gallup’s name is so storied in polling that it will take proof that they gamed their polling to hurt their reputation enough for people to stop reporting their polls as gold.
One more reason it’s so bizarre is that they were including almost 50% cellphones in their polls according to TPM.
Anyhow, I’d be interested to know what the relationship is between chance of winning and actual vote total. Romney only had a 33% chance of winning but got 47% of the total result.
MNPundit, the chance of winning is based entirely on two factors:
Extremely simple political example:
Candidate A is polling at 51% with a 1 point margin of error.
Candidate B is polling at 49% with the same 1 point margin of error.
The distribution of polls for both candidates is Gaussian (normal). This is commonly referred to as a “bell curve”. That means, in this example, candidate A has exactly as many polls with 49% of the vote as with 53% of the vote. She’ll have even more polls with 50% of the vote, but an equal number with 52% of the vote. The most frequently occurring scenario should have her with 51% of the vote.
Normal distributions are probably the most frequently occurring distribution that we encounter. Wikipedia has a useful (math heavy) article on normal distributions, but for lay usage, this page is much more approachable: http://www.mathsisfun.com/data/standard-normal-distribution.html. Anyways, polling results probably aren’t exactly normal, but they’re mighty close.
Back to our candidates. Based on what we know about normal distributions, 68% of all outcomes fall within one standard deviation of the mean. 95% of all outcomes fall within two standard deviations of the mean, and about 99.7% of all outcomes fall within three standard deviations of the mean. Treating the 1 point margin of error as one standard deviation, this means that if the polling is accurate:
Candidate A has a 68% likelihood of receiving 50-52% of the vote, and a ~100% likelihood of receiving 48-54% of the vote.
Candidate B conversely has a 68% likelihood of receiving 48-50% of the vote, and a ~100% likelihood of receiving 46-52% of the vote.
So how do you predict who wins? Well, there’s a little math involved: you measure how much of the area under the “probability density function” lies above the 50% mark, which is what the cumulative distribution function gives you. You can look this up on Wikipedia for a detailed explanation. It’s fairly tricky to do by hand, but exceedingly easy to do with a computer.
In the example that I described above, Candidate A wins in roughly 85% of the election scenarios.
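For anyone who wants to check that number, here is a minimal sketch in Python. It assumes a two-candidate race in which Candidate A wins whenever her true share is above 50%, and it treats the 1 point margin of error as one standard deviation, as the example does. The helper names `normal_cdf` and `win_probability` are made up for illustration.

```python
import math

def normal_cdf(x, mean, sd):
    """Probability that a normal(mean, sd) variable is at most x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def win_probability(poll_share, sd, threshold=50.0):
    """Chance that the true share exceeds the threshold (hypothetical helper)."""
    return 1.0 - normal_cdf(threshold, poll_share, sd)

# Numbers from the example: A polls at 51%, B at 49%, both with sd = 1 point.
p_a = win_probability(51.0, 1.0)
p_b = win_probability(49.0, 1.0)
print(f"Candidate A wins: {p_a:.1%}")
print(f"Candidate B wins: {p_b:.1%}")
```

Under these assumptions Candidate A comes out at about 84%. That also shows why chance of winning and vote share are different things: a candidate polling at 49% still gets close to half the vote, but wins far less than half the time.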
I should add that Nate Silver did not underpredict because of Gallup and Rasmussen. He used bias-correcting measures to try to counter serious outliers. Silver’s “failure” came from incorporating too much uncertainty into his model. In other words, he trusted the polling aggregate too little. Models that used exclusively polling, such as Sam Wang’s, predicted the election with even higher certainty (Wang was basically at 99.99% certain for Obama on Election Day).
I should say that the description I gave is not exactly how Nate Silver runs his predictions. He actually simulates the election with all his factors figured in. This allows him to plug in factors with unknown effects on the standard deviation (for example). The end result is basically the same, however.
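To give a rough idea of what simulating an election looks like, here is a toy sketch in Python. It is not Silver’s model, only an illustration using the same two-candidate example: each simulated election draws Candidate A’s share from a normal distribution with mean 51 and standard deviation 1, and counts how often she clears 50%. The function name, the number of trials, and the seed are all assumptions chosen for the illustration.

```python
import random

def simulate_win_rate(poll_share, sd, trials=100_000, seed=0):
    """Fraction of simulated elections in which the candidate tops 50%.

    Toy model only: one normal draw per election, nothing else factored in.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    wins = 0
    for _ in range(trials):
        share = rng.gauss(poll_share, sd)  # one simulated election result
        if share > 50.0:
            wins += 1
    return wins / trials

rate = simulate_win_rate(51.0, 1.0)
print(f"Candidate A wins in {rate:.1%} of simulated elections")
```

With enough trials the simulated rate lands close to the figure from the direct calculation. The appeal of simulating is that extra sources of uncertainty can be added inside the loop without having to work out the math by hand.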
Excellent explanation. I was going to quibble about the occurrence of the Gaussian distribution, but I re-read your post “Normal distributions are probably the most frequently occurring distribution that we encounter.” Emphasis added by me. No quarrel. The Gaussian is far and away the easiest distribution to manipulate mathematically and so is used even when the Poisson or Rayleigh distribution should be used. I should add that I recall from the depths of Graduate School that both the Gaussian and Poisson are limiting cases of the binomial distribution (coin-flipping) with different behavior in the limit.
It has been shown (sorry no link) that the Cauchy distribution is a better fit to the stock market than the Gaussian, but Gaussian terms like “mean reversion” and “standard deviation” still abound although the Cauchy distribution has no standard deviation and, indeed, no mean. Near the mode, the two distributions look similar but the Cauchy falls off much slower (lots more “Black Swans”).
A major factor throughout was the probability of winning the electoral votes necessary – although R$ had a large number of votes in red states, he didn’t have much chance [33%] of winning enough electoral votes
Where would the proof come from? A mole inside their organization? Getting elections consistently off the mark by wide margins?
It’s a fix, Booman. A classic fix. Classic. Paid-off…in one way if not others…journalists hype the tomato can and the betting remains high as a result.
It’s not about quality or accuracy, it’s just about branding.
Like
Duh.
AG
The point isn’t to “simply ignore polls”. It’s to put effort into covering the policy points that matter to voters and to stop with the obsessive horse-race coverage that we’re treated to in most campaign articles. The sad truth is that political journalists are largely mathematically illiterate, the consequence being that their coverage of polls is often flat out wrong. A false report is worse than no coverage at all.
Your argument that Gallup and (particularly) Rasmussen polls were deliberately favorable to Romney is unfortunately part of this nonsense. It’s simply the flip side of Dean Chambers’s “unskewed polls”. You’re both pushing for cherry picking.
You have no evidence that Nate Silver overestimated Romney’s standing. The fact that he got the final call right is a good argument that he was doing things pretty well. He kept both Gallup and Rasmussen in the mix, because his aggregator doesn’t know which polls are biased, by how much, or in which direction. We won’t know that next round either.
Close polls (polling reporting) mainly benefit the news business. It’s all about the Horse Race, dontcha know.
I became very worried when these polls showed Romney as so close (I just couldn’t understand why anyone would think he would be a good president, still can’t believe how many millions voted for him.) But now I wonder if those polls also had a positive effect. Maybe they scared everybody out of the September belief that Obama was going to cruise to victory, and made sure that Democrats would get their vote out. People were certainly determined to vote for Obama, they stood in line for hours to do so — there was no complacency.
yes, could be