Let's talk about this poll with the NDP at 40%

Polls have been relatively stable this election. Since the writ was dropped, we can really only identify very small trends, namely the small drop of the Conservatives and the slight increase of the Liberals.

If the election were tomorrow, these projections (from a few days ago) would still be valid.

However, there was one poll last week that stood out and generated some discussion. I'm of course talking about the Forum poll that had the New Democrats at 40%, the Liberals at 30% and the Conservatives at only 23%. This is a stunning result, and one that is far off the poll average.

Before I continue, I want to say that I actually quite like Forum. They poll all the time, they provide details and their track record is pretty good. It's easy not to be wrong when you never poll. So this post shouldn't be seen as an indication that I believe Forum's methodology to be fundamentally flawed. Not at all.

Getting variation between polls is not only expected, it's desirable. After all, it shows that the sampling process is truly random, or at least close to it. It's when polls all show the exact same numbers that things get suspicious. With that said, there is a limit to how much variation we should observe, especially given that federal polls have sample sizes of at least 1000 respondents. In the case of the Forum poll, it was actually 1440 respondents.

When you poll, you can get results that are off the mark if you are unlucky in your sampling, for instance by drawing more NDP voters than their true share of the population. But that kind of bad luck happens when you ask (say) 20 people, not 1440. If you are truly selecting people at random, a large miss is incredibly unlikely. If you want an analogy, it's like playing the lottery: yes, you can win in theory, but the odds are definitely against you.

So let's look at just how "unlucky" Forum would have had to be. The recent (completely unadjusted) poll average, excluding this poll, is:

NDP: 34%

CPC: 29%

LPC: 27%

So let's assume (for now) that this poll average is representative of the true voting intentions in the country. The question we have now is: how likely is it to randomly select 1440 Canadians and find that 40% of them would vote NDP, 30% Liberal and only 23% Conservative?

To answer the question, I simply used my usual simulations. Specifically, I ran 10,000 simulations, each one randomly sampling 1440 respondents from a population where 34% support the NDP, 29% the CPC and 27% the LPC.
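For readers who want to try this themselves, here is a minimal sketch of that kind of simulation. It is not my actual simulation code. It assumes simple random sampling with no weighting, and it lumps the remaining 10% of voters into a hypothetical "Other" category. The function names are made up for the illustration.

```python
import random

# Assumed population shares: the poll average above, with the remaining
# 10% lumped into a hypothetical "Other" category.
SHARES = {"NDP": 0.34, "CPC": 0.29, "LPC": 0.27, "Other": 0.10}


def simulate_poll(n_respondents, shares, rng):
    """Draw one simple random sample and return each party's share in it."""
    parties = list(shares)
    weights = [shares[p] for p in parties]
    sample = rng.choices(parties, weights=weights, k=n_respondents)
    return {p: sample.count(p) / n_respondents for p in parties}


def run_simulations(n_sims, n_respondents=1440, shares=SHARES, seed=0):
    """Repeat the simulated poll n_sims times with a seeded generator."""
    rng = random.Random(seed)
    return [simulate_poll(n_respondents, shares, rng) for _ in range(n_sims)]


# 2,000 runs keep this quick in pure Python; raise it to 10,000 or more
# to match the numbers of simulations used in this post.
polls = run_simulations(2_000)
ndp = [p["NDP"] for p in polls]
cpc = [p["CPC"] for p in polls]

# Count simulated polls at least as extreme as Forum's on both parties.
forum_like = sum(1 for p in polls if p["NDP"] >= 0.40 and p["CPC"] <= 0.23)

print(f"mean NDP share: {sum(ndp) / len(ndp):.3f}")
print(f"highest NDP: {max(ndp):.3f}, lowest CPC: {min(cpc):.3f}")
print(f"polls at least as extreme as Forum's: {forum_like}")
```

Changing the seed or the number of runs will move the extremes around a little, but the average NDP share should stay very close to 34%.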

The graph below shows the 10,000 simulations for the level of support of the NDP. As you can see, on average we do indeed get the true support of 34%. But depending on how unlucky we are in our sampling, we sometimes get the NDP much higher or much lower. Most of the time, though, we are between 32 and 36 (if we round). This is what the typical margins of error are telling us.
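That range lines up with the textbook margin-of-error formula for a proportion. Here is a quick sketch, assuming a simple random sample and the usual 95% level (z = 1.96). Real polls are weighted, so treat this as a rough check rather than Forum's actual margin.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Half-width of the usual 95% interval for a share p in a sample of n."""
    return z * math.sqrt(p * (1 - p) / n)


# NDP at 34% in a sample of 1440 respondents.
moe = margin_of_error(0.34, 1440)
print(f"margin of error: +/- {moe * 100:.1f} points")
print(f"range: {(0.34 - moe) * 100:.1f}% to {(0.34 + moe) * 100:.1f}%")
```

With these inputs the margin comes out at roughly 2.4 points on either side of 34%.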

So, how many situations do we have where the NDP is as high as 40% and the Tories as low as 23%? The answer: none! That's right, out of 10,000 simulations, there wasn't a single case where we managed to recreate Forum's numbers. The highest for the NDP was 38.8% and the lowest for the CPC was 24%. And each of those extremes is a single simulation out of 10,000.

If you're wondering why, it's because 1440 isn't that small a sample. If you prefer, think of this analogy: you can flip a coin 5 times and get heads 5 times. It's unlikely (actually less than a 5% chance) but not impossible. But if you flip a coin 1440 times, you will almost certainly end up close to 50% heads and 50% tails. The same logic applies here.
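The coin analogy is easy to check exactly. The sketch below computes the chance of 5 heads in 5 flips, and then the chance that 1440 flips of a fair coin land between 45% and 55% heads. That band is just my choice for the illustration.

```python
import math

# Five flips: probability that all five come up heads.
p_five_heads = 0.5 ** 5

# 1440 flips: exact probability that the share of heads lands between
# 45% and 55%, i.e. between 648 and 792 heads. Counting outcomes with
# integers avoids floating-point underflow for such a large n.
n = 1440
lo, hi = 648, 792
ways = sum(math.comb(n, k) for k in range(lo, hi + 1))
p_near_half = ways / 2 ** n

print(f"5 heads out of 5: {p_five_heads:.4f}")
print(f"45% to 55% heads in {n} flips: {p_near_half:.5f}")
```

The first probability is about 3%, while the second is well above 99.9%: with 1440 flips, straying even 5 points from 50% is already very rare.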

Maybe the odds are actually so small that we need more than 10,000 simulations. So I tried with 100,000. The result: one case with the NDP above 39.6%, and a lowest value for the CPC of 23.5%. So quite close, but let's remember it's 1 out of 100,000.
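Instead of running ever more simulations, you can also compute these odds directly from the binomial distribution. The sketch below is an alternative to my simulations, not what I ran. It assumes simple random sampling, treats the published 40% and 23% as exact cut-offs (576 and 331 respondents out of 1440), and looks at each party separately rather than jointly. The helper names are made up for the illustration.

```python
import math


def log_binom_pmf(k, n, p):
    """Log of the binomial probability of exactly k successes in n draws."""
    log_ways = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_ways + k * math.log(p) + (n - k) * math.log(1 - p)


def upper_tail(k_min, n, p):
    """Probability of k_min or more successes."""
    return sum(math.exp(log_binom_pmf(k, n, p)) for k in range(k_min, n + 1))


def lower_tail(k_max, n, p):
    """Probability of k_max or fewer successes."""
    return sum(math.exp(log_binom_pmf(k, n, p)) for k in range(0, k_max + 1))


n = 1440
ndp_min = n * 40 // 100  # 40% of 1440 = 576 respondents
cpc_max = n * 23 // 100  # 23% of 1440 = 331.2, so at most 331 respondents

p_ndp = upper_tail(ndp_min, n, 0.34)  # true NDP support assumed to be 34%
p_cpc = lower_tail(cpc_max, n, 0.29)  # true CPC support assumed to be 29%

print(f"P(NDP at 40% or more): {p_ndp:.2e}")
print(f"P(CPC at 23% or less): {p_cpc:.2e}")
```

Working in logs keeps the huge binomial coefficients from overflowing. Either probability on its own is already tiny, and Forum's poll needs both at once.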

So, what does it mean? There are essentially three possibilities:

1. Forum was the only one to pick up something. What I mean here is that the average above (the 34-29-27) is actually not valid. Maybe the NDP is indeed on the rise and the Conservatives falling fast.

Of course, the follow-up question is: how come only Forum picked that up? After all, Forum was in the field on August 23rd and 24th. Others were in the field during that time, and some (Ipsos, Innovative and today's Abacus, which is not included in the poll average yet) actually polled after Forum.

2. Forum is actually wrong. And I'm not talking about the "1 time out of 20" type of wrong. I'm talking 1 in more than 100,000. So either they got incredibly unlucky when selecting the sample, or they didn't select it correctly.

People tend to think that pollsters simply choose 1440 people at random across Canada, when in fact we are talking about a complex process with several layers (they first decide how many people to poll in each province, etc.) and weighting. If you look at the poll by province, you see the Conservatives very low in Ontario (26%) and Quebec (11%!), as well as in the Prairies (tied for 2nd at 28%) and even Alberta ("only" 42%). The NDP, on the other hand, was very high in Quebec (54%!!) and the Prairies (first at 41%). So it does seem that some provinces were simply off and dragged the overall results off with them.

Forum polls a lot and fast. Their polls do have more volatility than others (we have observed this before). Earlier this campaign, Forum already had a one-day poll (conducted on a Sunday between 11am and 4pm) with the NDP at 38%. And let's not forget about Brandon-Souris.

At the end of the day, I'd simply classify this poll as an outlier and move on. The NDP is not at 40% and the CPC is not at 23%.