November 28th 2013: How wrong were the polls in Brandon-Souris?

Out of the four by-elections last Monday, Brandon-Souris was the only one where the polls (from Forum) picked the wrong winner (the polls weren't perfect by any means in the other ridings, especially Provencher, but they at least had the right winner).

While the Liberal candidate was projected to win easily, the Conservatives ultimately managed to hold onto this seat by a small margin. It's clear that local effects played a large role, as this riding should never have been in play for the Liberals (or the NDP) based on the last election and provincial polling. Still, not only were the Liberals polled consistently ahead during the last month, their lead was actually increasing. In the very last poll, the lead was 59% to 30%! To think that you can be polled 29 points ahead and still lose is crazy.

Since it was a riding-specific poll, the margins of error were large (or, alternatively, the sample size was small). Strangely enough, Forum decreased the sample size with every new poll, from 527 on November 5th to only 368 on November 24th. Was it on purpose, or was it becoming too hard to find people to answer the polls (it isn't a big riding)?
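For reference, here is a quick sketch of the 95% margin of error implied by those sample sizes, using the standard normal approximation at the worst case of p = 50% (my own illustration, not Forum's published figures):

```python
import math

def moe(p: float, n: int) -> float:
    """95% margin of error for a proportion p with sample size n."""
    return 1.96 * math.sqrt(p * (1 - p) / n)

print(f"n = 527: +/- {moe(0.5, 527):.1%}")  # about +/- 4.3 points
print(f"n = 368: +/- {moe(0.5, 368):.1%}")  # about +/- 5.1 points
```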

The question really is: given the actual results, what were the odds that Forum would get a sample with 59% Liberals and only 30% Conservatives out of 368 respondents? I decided to answer this question by running 100,000 simulations from a multinomial distribution, using the percentages of the actual by-election as the true probabilities.
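A minimal sketch of that simulation in Python, assuming rounded vote shares of roughly 44% CPC and 43% LPC (my approximation of the official result, with the remainder split between the NDP and others):

```python
import numpy as np

# Approximate final by-election shares: CPC, LPC, NDP, others
# (rounded assumptions, not the official figures)
true_shares = [0.44, 0.43, 0.08, 0.05]
n_respondents = 368   # sample size of the last Forum poll
n_sims = 100_000

rng = np.random.default_rng(42)
# Draw 100,000 random samples of 368 respondents each
samples = rng.multinomial(n_respondents, true_shares, size=n_sims)
shares = samples / n_respondents

print("Highest LPC share in any sample:", shares[:, 1].max())
print("Lowest CPC share in any sample:", shares[:, 0].min())
```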

Out of these simulations, the most extreme result was 54% for the Liberals. Let me rephrase that: given the actual results, out of 100,000 samples drawn randomly, the Liberals never got MORE than 54%. So a sample with 59% Liberals never happened, not once in 100,000 simulations. Similarly, the lowest I could get for the Conservatives was 32%. So if the sample was indeed random, Forum could not have got as many Liberals and as few Conservatives in its sample. The percentages observed in the poll are effectively impossible for either party.

Ok, but let's be fair to Forum: their last poll was different from the other ones. They didn't have the Liberals ahead by 29 points every time. So let's use their other two polls. The one from November 22nd had 443 observations and put the Liberals at 50% and the Conservatives at 36%.

In this case, we do see samples with the Liberals as high as 50% and the Tories as low as 36%. But the probability of getting the Liberals above 48% is only 1%, and the probability of getting the Tories below 38% is also only 1%. So even if we aren't too demanding (i.e., we don't require exactly 50% LPC and 36% CPC), the odds are less than 1%. The other polls had the Liberals about right, but still widely underestimated the Tories.
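Re-running the earlier sketch with the November 22nd sample size gives tail probabilities in that neighbourhood (again under my assumed vote shares):

```python
# Same simulation as above, with the November 22nd sample size (n = 443)
samples = rng.multinomial(443, true_shares, size=n_sims)
shares = samples / 443

print("P(LPC >= 48%):", (shares[:, 1] >= 0.48).mean())  # on the order of 1%
print("P(CPC <= 38%):", (shares[:, 0] <= 0.38).mean())  # on the order of 1%
```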

Some of you might argue that it was indeed possible and Forum just got unlucky. That could be a valid excuse if Forum had released only one poll. But Forum released 4 polls in the last month alone, and all of them had the Conservatives at 36% or less. If your sample is truly random, then you can be unlucky a couple of times (in theory, 5 times out of 100), but 4 times in a row during the same month? We are getting close to 0%.
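As a rough back-of-the-envelope check, assuming the four samples are independent and each has at most a 5% chance of being that far off:

```python
p_unlucky_once = 0.05            # landing in the 5% tail on a single poll
p_four_in_a_row = p_unlucky_once ** 4
print(f"{p_four_in_a_row:.6f}")  # 0.000006, i.e. about 1 in 160,000
```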

In conclusion, statistical variation alone can't explain the results of the Forum polls. It simply is not possible to be that unlucky so many times in a row. So clearly, something else happened. Either the people being polled were not the ones who actually voted, or some of them changed their minds. You can find other explanations (or combinations of explanations) if you want.

One excuse could be the low turnout. But at 44.7%, it isn't that much lower than the 57.5% of the last general election. So sure, it's harder to identify likely voters during a by-election, but you can't succeed if you don't try. What's the point of polling four times in a month if you never reach the right people?

Honestly, this puzzles me. Pollsters don't only do political polls; they mostly make their money doing market research for private companies. But if I were a business owner, why on earth would I pay Forum (or any other pollster) when I see how poorly they do during elections? How could I be sure they are indeed reaching my consumers?

As for me, I'll keep relying heavily on the simulations, which allow me to be less dependent on the polls. But at some point, if I need to call every riding with less than a 30-point lead a toss-up, I might as well stop making projections. In the case of Brandon-Souris, the model (based on the regional polls) would have predicted an easy Conservative win. But these riding-specific polls should not be that wrong.