There is no doubt that Edison-Mitofsky’s exit polls failed to predict the proportions of votes cast for the two presidential candidates in 2004. Because the candidates were so close, and because the exit poll deviated from the count in the loser’s direction, this meant that a predicted Kerry win contrasted painfully with a Bush victory in the count.
The 64 million (or is it billion?) dollar question is: which was wrong, the polls or the count?
“Fraudsters” claim it was the count. Edison-Mitofsky (and the “anti-fraudsters”) claim it was the exit polls. In this diary I have a brand new analysis. Bear with me, even if you find math icky. There will be pictures, if not cookies.
First of all, my credentials: I do not have a PhD, though I am working on it, and if it wasn’t for DKos I might have it by now. But in any case it is not in statistics, although it is a very statistical field, and I use statistics daily. It is in science, though, so I have a scientific training: in other words I am trained to make and test hypotheses. I am also trained to treat my own findings (and the findings of others) with scepticism, to be constantly alert for alternative plausible explanations for my findings, and to set up new testable predictions to further disambiguate my results.
Regarding my “fraudster credentials”: I am a fraudster. I believe your election was inexcusably riggable and may well have been rigged. It was also inexcusably unauditable. I am convinced that there was real and massive voter suppression in Ohio, and that it was probably deliberate. I think the recount in Ohio was a sham, and the subversion of the recount is in itself suggestive of coverup of fraud. I think Kenneth Blackwell should be jailed.
However (and I’ll come clean now in case you want to read no further) I don’t believe the exit polls in themselves are evidence for fraud. I don’t think they are inconsistent with fraud, but I don’t think they support it either. Read on for my reasoning.
My analysis here is of data provided in the Edison-Mitofsky report, namely their table of “within precinct error” for each state (pp32-33).
The exit polls involve two levels of sampling. The first is their sampling of what the pollsters hope will be “typical” precincts. The second is their sampling of voters in those precincts. Both procedures are liable to both error and bias. In their report, EM distinguishes between error/bias arising from each of these two sampling procedures. By comparing the differences between the real count in their selected precincts and the real count in the state as a whole, they conclude that their precinct sampling was pretty good. If anything there was a non-significant bias in Bush’s favour. In other words they picked typical precincts.
The EM report therefore fingers “within precinct error” or “WPE” as the culprit, in other words the difference, in percentage points, between the proportion of sampled voters saying they’d voted for each candidate, and the proportion of votes apparently cast for each candidate in that precinct. EM have not given us precinct level WPE data, but they have given us “average” WPEs for each state.
The point about the WPEs is that they are large – there is a lot of room for random error in this kind of sampling. However, once you average the WPEs for each precinct, the random error should cancel itself out, leaving a mean WPE of near zero. That is if the error is random. If there is bias, either because the sampling technique is biased or because the votes counted do not reflect the votes cast (biased counting in other words) the WPEs will not cancel out – the mean WPE will not be zero. We can test whether the mean WPEs were statistically significantly different from zero, not within each state (because EM have not given us the standard deviations, only the means) but across states. Again, even if some states had significant WPEs in one direction, and other states in the other direction, unless there was systematic bias across the nation, the state average WPEs should cancel themselves out. This we can check.
I did so in this diary, and, surprise, surprise, they are massively significantly different from zero. The mean WPEs are significantly negative, which, in the sign convention used by EM means they over-stated Kerry’s share of the counted vote.
However, I do not think this analysis is valid (though it is mine) and what is invalid about it is also invalid about a number of other analyses that have been done.
RATIONALE FOR THIS NEW ANALYSIS
If you postulate a systematic bias in the polling (whether it is in the count or the poll – I’m remaining neutral at this point) the magnitude of the WPE will mathematically depend on the magnitude of the actual margin. For the sake of clarity I am going to assume the error was in the polling, but the math is the same if it was in the count.
Say you have a precinct with 200 voters. 100 of them vote for Kerry and 100 for Bush. The real margin is 0 (50%-50%). And if you poll 50% of each group, i.e. you interview 50 of the Kerry voters and 50 of the Bush voters, your predicted margin will also be 0. Your WPE for this precinct will be zero. (In real life, as in coin tossing, sometimes you will poll more than 50 Kerry voters, and sometimes more than 50 Bush voters, but over many precincts, the errors will cancel out, provided the coin is not weighted). However, say there is something wrong with your sampling technique, and you actually poll 56 Kerry voters for every 50 Bush voters (as EM allege they must have done). You will interview 56 Kerry voters, but only 50 Bush voters, giving you a total of 106 interviews. 56/106 is 53% for Kerry and 50/106 is only 47% for Bush. So your predicted margin is 6% in Kerry’s favour. As the “true” margin is 0, your WPE is -6.
Now, take this same sampling bias, and apply it to a precinct where there are 160 Kerry voters and only 40 Bush voters. This will give a counted result of 80% Kerry, 20% Bush. But you are over-sampling Kerry voters. So you sample 56% of the Kerry voters, giving you 90 interviews, and 50% of the Bush voters, giving you 20 interviews, i.e. 110 interviews in all. You then compute your proportions, and it gives you 90/110 for Kerry (82%) and 20/110 (18%) for Bush. So your counted margin is 20%-80% = -60%, but your estimated margin is 18%-82% = -64%. Your WPE is therefore -4.
The point here is that the bias is identical in both precincts – you are sampling 56% of the Kerry voters and 50% of the Bush voters. However, in the pro-Kerry precinct this translates into a WPE of -4, and in the evenly split precinct it translates into a WPE of -6.
We’ll do it one more time: take a Bush precinct with 200 voters. 160 of them vote for Bush and you sample 50% of them = 80. 40 vote for Kerry and you sample 56% of them = 22. You have 102 interviews. 80/102 is 78% and 22/102 is 22%. So your ratio in the count is 80% for Bush and 20% for Kerry, but your prediction is 78% for Bush and 22% for Kerry. “Real” margin is 60% in Bush’s favour, predicted margin is 57% (there’s a rounding error here). So the WPE is -3.
This all means that for a given uniform bias, whether in the polls or the count, a “swing” precinct will give an error of 6 percentage points, while strongly Red or Blue precincts will give an error of only 3 to 4 percentage points. Moreover, you can show that as the bias increases in Kerry’s favour, the apparent error will remain low in the red precincts, but increase in the blue precincts.
I have graphed below what the WPEs would look like in precincts with different degrees of partisanship but with an identical, and extreme, degree of polling bias (2 Kerry voters for every 1 Bush voter):
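For anyone who wants to check the arithmetic in the three worked precincts, here is a minimal Python sketch. The function name `wpe` and its parameters are mine, not EM's, and the sign convention is the one used in the examples above (polled Bush-minus-Kerry margin minus counted Bush-minus-Kerry margin). It assumes a two-candidate precinct and does no rounding along the way.

```python
def wpe(kerry_share, kerry_rate=0.56, bush_rate=0.50):
    """WPE in percentage points for one precinct under a uniform sampling bias.

    kerry_share: Kerry's share of the counted vote (0..1), two-candidate race.
    kerry_rate, bush_rate: fraction of each candidate's voters who get polled
    (illustrative values taken from the worked examples in the text).
    """
    bush_share = 1.0 - kerry_share
    kerry_polled = kerry_share * kerry_rate
    bush_polled = bush_share * bush_rate
    total_polled = kerry_polled + bush_polled
    poll_margin = (bush_polled - kerry_polled) / total_polled
    count_margin = bush_share - kerry_share
    return 100.0 * (poll_margin - count_margin)


# Evenly split, strongly Kerry, and strongly Bush precincts.
for share in (0.5, 0.8, 0.2):
    print(f"Kerry counted share {share:.0%}: WPE = {wpe(share):.2f}")
```

Without the rounding of interview counts and percentages, the three precincts come out at roughly -5.7, -3.5 and -3.8 rather than -6, -4 and -3, so the rounded figures in the text should be read as approximate. The point stands either way: the same sampling bias yields a visibly larger WPE in the evenly split precinct.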
An interim conclusion: any observation along the lines of “exits were more strongly in Kerry’s favour in the swing states” is simply an artefact of the math. For a uniform degree of bias, the erroneous margin produced will be greatest where support for each candidate is most evenly matched. What we need to do is to convert the WPE (or any measure of the margin that is expressed as a difference between two percentages) into a parameter that is mathematically independent of the proportion of votes counted. This I have done (if you want the formula, email me). EM kindly give us the WPE for previous years, so I have also done the same for the previous four presidential elections.
It gives an index of bias that is not only independent of state “colour” but is also a linear scale, which means we can use fairly conventional stats on it.
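The diary does not spell out the formula, so the following is only a sketch of one candidate index with the properties described, not necessarily the one I used. The assumption is that the index is the natural log of the ratio between the polled odds and the counted odds for Kerry, which recovers the log of the ratio of the two sampling rates whatever the partisanship of the precinct. The names `bias_index`, `polled_share` and `index_from_wpe` are hypothetical, and the conversion from WPE assumes a two-candidate race.

```python
import math


def bias_index(count_kerry, poll_kerry):
    """Assumed index: ln(polled Kerry odds / counted Kerry odds).

    Positive values mean Kerry voters are over-represented in the poll.
    """
    count_odds = count_kerry / (1.0 - count_kerry)
    poll_odds = poll_kerry / (1.0 - poll_kerry)
    return math.log(poll_odds / count_odds)


def polled_share(count_kerry, kerry_rate, bush_rate):
    """Kerry's share of interviews given per-candidate sampling rates."""
    kerry_polled = count_kerry * kerry_rate
    bush_polled = (1.0 - count_kerry) * bush_rate
    return kerry_polled / (kerry_polled + bush_polled)


def index_from_wpe(count_margin_pts, wpe_pts):
    """Same index from a counted Bush-minus-Kerry margin and a WPE (both in points)."""
    count_kerry = (1.0 - count_margin_pts / 100.0) / 2.0
    poll_kerry = (1.0 - (count_margin_pts + wpe_pts) / 100.0) / 2.0
    return bias_index(count_kerry, poll_kerry)


# The same 56%-vs-50% sampling bias gives the same index at any partisanship.
for share in (0.2, 0.5, 0.8):
    print(share, round(bias_index(share, polled_share(share, 0.56, 0.50)), 4))
```

Under this assumed definition all three precincts give ln(0.56/0.50) = ln(1.12), about 0.1133, even though their WPEs differ. That is the sense in which such an index is independent of the counted proportions.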
Question number 1
Is the mean bias across the nation significantly different from zero – in other words does the bias in different states cancel out when averaged across the nation?
The test here is a single sample t test, and I have graphed the results below. I omitted DC and Oregon, which for various reasons are anomalous.
The error bars are 95% confidence intervals. Note that none of the error bars crosses the 0 line, which means that all are significantly different from zero. The vertical axis gives the value of the bias. Due to a quirk of the math, a positive value means Democratic over-sampling (or undercounting, depending on your hypothesis). You can see that in every single one of these elections, there was a significant over-estimate of the Democratic vote. However, you can also see that in 2004 it was greater than in any previous year. Its nearest rival was 1992. For stats geeks here are the t values:
I therefore conducted a second t test to check whether the bias in 2004 was significantly greater than in 1992. The answer is yes [t(94)=2.100, p<.05]. The variance was also significantly different, but the years remain significantly different, even after appropriate adjustment of the degrees of freedom.
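For the stats geeks, the two tests used so far can be sketched in a few lines of standard-library Python. This is an illustration of the technique only: the two lists of state-level bias values are made-up placeholders, not the EM figures, and the function names are mine. I am assuming that the "appropriate adjustment of the degrees of freedom" is the usual Welch-Satterthwaite correction for unequal variances.

```python
import math
from statistics import mean, variance


def one_sample_t(values, mu0=0.0):
    """t statistic and degrees of freedom for H0: population mean == mu0."""
    n = len(values)
    std_err = math.sqrt(variance(values) / n)
    return (mean(values) - mu0) / std_err, n - 1


def welch_t(xs, ys):
    """Two-sample t statistic with Welch-Satterthwaite degrees of freedom."""
    vx = variance(xs) / len(xs)
    vy = variance(ys) / len(ys)
    t = (mean(xs) - mean(ys)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(xs) - 1) + vy ** 2 / (len(ys) - 1))
    return t, df


# Hypothetical placeholder bias values for a handful of states (NOT EM data).
bias_2004 = [0.21, 0.05, 0.30, 0.12, 0.18, 0.26, 0.09]
bias_1992 = [0.10, 0.02, 0.15, 0.08, 0.11, 0.04, 0.13]

print("one-sample t, df:", one_sample_t(bias_2004))
print("Welch t, df:", welch_t(bias_2004, bias_1992))
```

The t values and degrees of freedom then go into a t table (or any stats package) to get the p value. Welch's version never has more degrees of freedom than the pooled test, which is why it is the more conservative check when the variances differ.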
Question number 2:
Is the mean bias greater in swing states than in safe states?
To answer this I tested the relationship between bias and margin twice: once for a quadratic fit (which would imply that the middle was different to the ends, i.e. swing states different) and once for a linear fit. In three years, the linear fit was significant (1992, 1996 and 2004) but in no case did the quadratic fit result in an improvement to the model. For non-geeks, this means that the swing states were not in any way special.
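The linear-versus-quadratic comparison can be sketched as a nested-model F test: fit both polynomials by least squares and ask whether adding the squared term reduces the residual sum of squares by more than chance would. The code below is a self-contained illustration using the normal equations. The margins and bias values are invented placeholders, not the state data, and the function names are hypothetical.

```python
def polyfit_rss(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns (coefficients ordered constant-first, residual sum of squares).
    """
    m = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            factor = a[r][col] / a[col][col]
            for c in range(col, m):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coef = [0.0] * m
    for i in reversed(range(m)):
        rest = sum(a[i][j] * coef[j] for j in range(i + 1, m))
        coef[i] = (b[i] - rest) / a[i][i]
    rss = sum(
        (y - sum(c * x ** k for k, c in enumerate(coef))) ** 2
        for x, y in zip(xs, ys)
    )
    return coef, rss


def f_for_quadratic_term(xs, ys):
    """F statistic (1 and n-3 df) for adding x^2 to a linear model."""
    _, rss_linear = polyfit_rss(xs, ys, 1)
    _, rss_quadratic = polyfit_rss(xs, ys, 2)
    return (rss_linear - rss_quadratic) / (rss_quadratic / (len(xs) - 3))


# Hypothetical placeholder data: counted margin (x) and bias index (y).
margins = [-0.30, -0.20, -0.10, 0.00, 0.10, 0.20, 0.30]
bias = [0.26, 0.19, 0.20, 0.13, 0.11, 0.08, 0.02]

print("F for quadratic term:", f_for_quadratic_term(margins, bias))
```

If the F value falls below the critical value for 1 and n-3 degrees of freedom, the curve adds nothing over the straight line, which is the sense of "no improvement to the model" above.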
The interesting finding, however, is that the linear fit was positive both in 2004 and in 1988 but negative in 1992. Translated, this means that in 2004 and in 1988, both two-horse races (1988 was Bush senior versus Dukakis), the overestimate of the Democratic vote was greatest in Blue states – the bluer the state, the greater the Democratic over-estimate. However, in 1992, the reverse was true – the redder the state, the greater the Democratic over-estimate. In 1992, remember, Perot was on the ballot for the first time.
Here is a scatter plot showing the bias for 2004 plotted against the counted margin. I have identified the three big swing states by different markers:
I also did some other stuff, but this diary is already way too long.
So: how to interpret all this? Two points:
1. The initial excitement over the exit polls was simply that they were wrong. Not only were they wrong, but they made us think Kerry was going to win. They were also significantly wronger than in any of the previous four elections, and even the nearest competitor (1992) simply over-estimated the extent of Clinton’s victory; it didn’t have him winning when he didn’t, so it may have slipped under our radar. My analysis shows that every single poll has significantly over-estimated the Democratic vote. The probability of this being due to chance is astronomically low in both 1992 and 2004. So we can conclude that there is in-built Democratic bias in the exit-polling system that varies from year to year. It appears (as EM point out in their report) to be greatest where the interest in the election is highest (1992 and 2004). However, this year the bias was significantly greater even than in 1992 (though only at a probability of p<0.05).
2. My analysis shows that the swing states were not in fact more wrong than the safe states. This evidence shows that the greatest bias was in the safest blue states. Although the big swing states of Ohio, Florida and Pennsylvania are all on the high side of the line, they are not significantly out of line (I checked). In fact they are relatively close to the regression line. So even if there was a legitimate reason (as I suspect there is) for an in-built polling bias that is greatest in Democratic states, these three states are not an exception to this rule. Moreover, the pattern of polling bias is the same as in the nearest comparable election, 1988, another two-horse race where there was also a large significant over-estimate of the Democratic vote and another losing Democratic candidate (Dukakis).
I therefore conclude that we have a choice between the following hypotheses:
- There was legitimate over-sampling of Democratic voters that was greatest in the most Democratic states. This is a pattern that has been observed before, but for some reason it was greater than usual this year.
- There was widespread, state-level fraud targeted somewhat inefficiently at the bluest states, not at the swing states.
- The first effect, plus targeted small scale fraud in swing states that is lost in the variance due to polling bias. After all, it remains true that a relatively small degree of ballot stuffing in Ohio, Pennsylvania or Florida could have swung the election for Kerry and would not appear in the exit polls.
But of course this would also mean that the magnitude of the exit poll discrepancy is not evidence of fraud.
Cross posted at Daily Kos