With the Ontario election entering its final two weeks and the polls diverging (mostly depending on whether they are done online or by phone), one could wonder: how accurate are Canadian polls in general? Should we trust them?

The short answer is yes, but they tend to be off once in a while. And even when they aren't completely off (which is the majority of the time), there is still considerable uncertainty, even if you average many polls.

Let's look at how I reached this conclusion.

I collected the polls from the last week (or so) of the campaign for 10 elections: Alberta 2012 and 2015, BC 2013 and 2017, Quebec 2012 and 2014, Ontario 2014, as well as the last three federal elections. This is, I believe, a fairly good sample. It includes many different elections over many years. And it does include the two big misses (Alberta 2012 and BC 2013), so whatever results I find can't be called biased towards pollsters.

For each election, I averaged the polls. A simple, straight-up average. No fancy adjustments based on sample size or whether the poll was done 12 hours after another one. No adjustment for incumbency or anything. I only made sure to include one poll per firm. I believe my personal average will beat this simple one most of the time, but at least it's clear and straightforward.

I then compared the average for the main parties to the actual results. The exact number of parties varies: it's 4 in Ontario, 5 for a federal election or Alberta, etc.

I then calculated what is called the MSE, the mean squared error. It's simply the average of the squares of the differences. Why the square? So that it doesn't matter whether the polls were over- or underestimating a party. A deviation of 2 points above or below will result in the same "penalty" (2 squared, so 4).
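To make the calculation concrete, here is a minimal sketch of the MSE for a single election. The function name and the vote shares are made up for illustration; they are not taken from any of the 10 elections.

```python
def mean_squared_error(poll_average, actual_result):
    """Average of the squared gaps between polled and actual vote shares (in points)."""
    if len(poll_average) != len(actual_result):
        raise ValueError("need one polled share per actual share")
    squared_gaps = [(p - a) ** 2 for p, a in zip(poll_average, actual_result)]
    return sum(squared_gaps) / len(squared_gaps)

# Hypothetical four-party race: illustrative numbers only, not real data.
poll_average = [36.0, 34.0, 22.0, 5.0]
actual_result = [38.0, 32.0, 23.0, 4.0]

mse = mean_squared_error(poll_average, actual_result)
print(f"MSE = {mse:.2f}")
```

Missing a party by 2 points too high or 2 points too low adds the same 4 to the sum, which is the whole point of squaring.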

If you know your stats, for unbiased estimators (and we believe that polls are of this kind, meaning they don't systematically under- or overestimate voting intentions), the MSE is equal to the variance.

Once I have the MSE, I can take the square root and multiply by 1.96 in order to get the margins of error. This is really basic stats but don't worry if you don't get it. I'll interpret the results in a non (or less) technical way.
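The step from MSE to margin of error can be sketched in a couple of lines. This assumes, as the text does, that the errors are unbiased and roughly normal, so that 1.96 standard deviations cover 19 cases out of 20. The function name and the example MSE are illustrative, not the values from the analysis.

```python
import math

def effective_margin_of_error(mse, z=1.96):
    """Turn a mean squared error (in squared points) into a 95% margin of error (in points)."""
    return z * math.sqrt(mse)

# Illustrative value only: an MSE of 4 means a typical miss of 2 points.
example_mse = 4.0
moe = effective_margin_of_error(example_mse)
print(f"MSE {example_mse} -> margin of error of about {moe:.2f} points")
```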

The table below presents the results for these 10 elections.

So on average, for the main parties, the poll average is within 2 points of the actual results. Read this number as follows: if the polls estimate a party at (say) 35%, then on average they were 2 points off from the actual result (so the party ultimately got 33% or 37%). Of course, that's an average. Sometimes polls miss by more (they were off by 11 points for the PC in Alberta 2012 but only off by 1 point for Harper's Conservatives in 2015).

Notice that the 1.92% is for the poll average. If you look at individual polls, it'll be worse in general (there are a few exceptions where a single poll did better than the average, such as the Nanos and Forum polls for the federal election in 2015).

It's good but not exceptional. Again, this is after averaging usually 5-10 polls for each election. Being off by around 2 points on average can mean the outcome is very different from the one you were expecting. Imagine the poll average is showing a race tied at 36%-36%. Applying the average absolute error, the actual results could be 38% to 34%. I can guarantee you that in most cases, this is an error big enough to change the winner. Every percentage point can turn into multiple seats at that level. 2 points off was enough to give Trudeau a majority compared to the projections (they had other problems as well, but let's move on).

A simple observation: out of the 10 elections, the incumbent party was underestimated by the polls in 9. The only exception is Quebec 2014. But in Quebec, it is expected that the Liberals will be underestimated. So we could almost say 10 out of 10.

On average, the underestimation of the incumbent is a crazy 3.6 points! If we exclude the two big misses of Alberta 2012 and BC 2013, it gives us 2.1 points. This is why, in my average, I allocate more undecided voters to the incumbent, to correct for this systematic underestimation. And yes, I'm currently doing it for the Liberals in Ontario. This usually boosts them by 1.5 points (it depends on recent polls, number of undecided, etc). I feel strongly about doing it because it is based on data and evidence. For all the mistakes I made in 2015 (not the best election for me), getting the CPC right is one of the few things I got!

Okay, let's go back to the general accuracy of polls. If we translate the MSE above into margins of error, it means that Canadian polls have actual, effective margins of error of 5.68%, 19 times out of 20. This is equivalent to a poll with only 271 respondents (for a party at 35%, since the MoE is a function of the level of support)!
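The 271 figure can be checked by inverting the textbook margin of error formula for a simple random sample, MoE = 1.96 * sqrt(p(1-p)/n), and solving for n. This is a sketch under that assumption; the function name is made up for illustration.

```python
def equivalent_sample_size(moe, p, z=1.96):
    """Sample size n at which a simple random sample has the given margin of error.

    Solves moe = z * sqrt(p * (1 - p) / n) for n. Both moe and p are proportions (0.0568, not 5.68).
    """
    return p * (1.0 - p) * (z / moe) ** 2

n = equivalent_sample_size(moe=0.0568, p=0.35)
print(f"Equivalent sample size: about {round(n)} respondents")
```

With a 5.68% margin and a party at 35%, this comes out at roughly 271 respondents, matching the number above.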

You might be confused here. Most polls out there report a MoE of around 3% to 3.5% for a sample size of 1,000 respondents. So how can the average have a larger MoE? Averaging multiple polls, independent from each other, should in theory significantly reduce this margin of error. That's true, but only for the theoretical margins of error that are there to take care of sampling variation (in other words: the fact that we only randomly select 1,000 people to answer). But see, in the real world, sampling variation isn't the main issue. Far from it. In the real world, when we actually look at polling accuracy, we need to remember that other factors are at play: turnout, people changing their mind or making up their mind at the last minute, people lying to the pollsters, people refusing to answer but voting anyway, incorrect weighting by the pollsters, etc.
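To see how small the margin would be if sampling variation were the only source of error, here is a rough sketch. It assumes simple random samples, fully independent polls of equal size, and the usual worst case of a party at 50%. The function names are illustrative.

```python
import math

def sampling_moe(n, p=0.5, z=1.96):
    """Theoretical 95% margin of error of one simple random sample of size n."""
    return z * math.sqrt(p * (1.0 - p) / n)

def averaged_moe(n, k, p=0.5, z=1.96):
    """Margin of error of the average of k independent polls of size n each.

    Under these assumptions, averaging k polls is the same as one poll of size k * n.
    """
    return sampling_moe(n * k, p, z)

for k in (1, 5, 10):
    print(f"{k:2d} poll(s) of 1,000: +/- {100 * averaged_moe(1000, k):.2f} points")
```

One poll of 1,000 gives about 3.1 points, and ten such polls averaged together would give about 1 point. The effective margin of 5.68 points measured above is far larger, which is exactly why the non-sampling factors matter more.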

This is why I couldn't care less when we see online polls with non-random samples to which traditional margins of error don't apply. So many people make a big deal out of it. I really don't, because I know that the actual margins of error are much bigger and not a function of sampling. If sampling variation were the single greatest source of uncertainty, then averaging 5-10 polls of 1,000 respondents would give spot-on estimates of the election. But it doesn't.

How does this compare to other countries? Well, it's worse than for French presidential elections, where the effective margin of error is 3.81% (and that's for 2002, 2007 and 2012; I haven't included 2017, where the polls were super accurate). Funnily enough, French pollsters do NOT use probabilistic random samples. And yet they perform quite well. As for the US, this article mentions actual margins of 7 points, although I believe it'd be lower if we only looked at presidential elections.

Out of the 10 elections I included in my analysis, two were clear misses (Alberta 2012 and BC 2013). Some were partial misses (Quebec 2012 for instance). If we remove the two really bad elections (let's just assume they were one-offs, or well, two-offs, possibly caused by online polls not being as well developed as they are now), you get the second column.

With margins of error of 3.2%, we see that Canadian polls are fairly accurate. Still, this means that at the end of this Ontario election, even after averaging 5-10 polls (that will be published during the last week), you should add and subtract around 3 points to the various averages. That gives you fairly wide intervals (6 points is bigger than the current gap between the PC and the NDP!). With how votes are translated into seats, it could mean the PC, for instance, could win 50 or 80 seats! And remember that this is assuming the polls will be "right". If you want to factor in the possibility of a bigger miss like in Alberta, then you need intervals of almost 12 points!

Don't think, however, that the chances of the polls being 12 points off are the same as the chances of them being 2 points off. That's not how you should interpret those margins of error. The upper and lower bounds should be seen as possible but unlikely results for the election based on the polling average. In order to illustrate this, I generated the graph below for a generic party polling at 35% (usually the score of the top party in a Canadian election). The graph gives you a visual representation of how likely each electoral outcome is. (Technical note: this part works better if you accept a Bayesian point of view, where the polling average represents the uncertainty we have about the true voting intentions.) This distribution is the result of simulating 10,000 samples with a sample size of 270.
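For readers who want to reproduce a graph of this kind, here is a minimal sketch. It assumes each simulated sample is 270 independent respondents who each support the party with probability 35%, which is one plausible reading of the description above and not necessarily the exact procedure used. The function name, the seed and the text histogram are illustrative choices.

```python
import random
from collections import Counter

def simulate_vote_shares(p=0.35, n=270, draws=10_000, seed=0):
    """Simulate `draws` samples of `n` respondents; return the party's share (in %) in each."""
    rng = random.Random(seed)  # fixed seed so that the run is reproducible
    shares = []
    for _ in range(draws):
        supporters = sum(rng.random() < p for _ in range(n))
        shares.append(100.0 * supporters / n)
    return shares

shares = simulate_vote_shares()

# Crude text histogram: one '#' per 50 simulations landing on that rounded share.
histogram = Counter(round(s) for s in shares)
for pct in sorted(histogram):
    print(f"{pct:3d}% {'#' * (histogram[pct] // 50)}")

# Fraction of simulations where the party ends up at 30% or lower.
share_at_or_below_30 = sum(s <= 30.0 for s in shares) / len(shares)
print(f"Simulations at 30% or below: {100 * share_at_or_below_30:.1f}%")
```

The bars pile up around 35% and thin out quickly on both sides, so an outcome of 30% shows up in the simulation but only in a small fraction of the runs.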

See, if a party is polled (on average) at 35%, then this party could actually receive only 30% of the vote. But while this is possible, it's highly unlikely (which is why the bars are smaller). Please do not interpret this article or this graph as a validation of saying "anything can happen". This gives the wrong impression that every outcome is equally likely when it isn't. Similarly, if your party is polling at 30%, your party can indeed be underestimated and win. But it's not the most likely scenario. On average, you want your party to be polling high; there is no way around this. People pretending otherwise are either lying or stupid.

What this exercise shows is that despite my best efforts (or those of anyone doing this job), there will always be considerable uncertainty. This is why I believe people should focus more on my probabilities (where I account for this uncertainty with a well-calibrated model) rather than the top-line numbers. What it also shows is that people who simply say "polls are always wrong" are actually wrong. This is simply not true. Yes, Canadian polls aren't perfect and they sometimes miss, but in general they do a fairly good job.