As we approach election day, polls are confirming the Liberals are ahead. While it's far from a sure thing, the projections are now quite confident Justin Trudeau will become prime minister on Monday. With that said, polls are never perfect and can sometimes be very, very wrong (hey, remember Alberta 2012 or BC 2013? Or the recent UK election?). Here are some reasons why the polls could be wrong and why people need to remain cautious. Please note, I'm NOT saying polls are wrong, just reminding you that they can be. And no, I'm not trying to preemptively find excuses as to why my projections could end up being wrong Monday evening.

1. Polls are always a little bit wrong, at best, and very wrong at worst

I have yet to follow or cover an election where polls got it perfectly right. Earlier this year in Alberta, pollsters were congratulating themselves for correctly calling an NDP majority. They did indeed, but they also overestimated the NDP by 2-3 points and underestimated the Progressive Conservatives by 3-4 points. This means the NDP-PC gap ended up being roughly 6 points smaller than predicted. It didn't matter in Alberta because the NDP's lead was massive. But make such a mistake on Monday and Stephen Harper remains PM, at least for a while. Similarly, if polls were as wrong as they were in 2011, that too would most likely prevent Trudeau from winning.
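For those who like to see the arithmetic, here is a minimal Python sketch of why errors in opposite directions add up on the gap between two parties. The Alberta ranges are the ones quoted above; the 5-point federal lead is a number made up purely for illustration, not a poll result.

```python
def gap_error(leader_overestimate, trailer_underestimate):
    """Points by which the polled gap overstates the real gap.

    Overestimating the leader and underestimating the runner-up
    both shrink the true lead, so the two errors add up.
    """
    return leader_overestimate + trailer_underestimate

# Alberta 2015 ranges quoted in the text (in percentage points).
low = gap_error(2, 3)    # NDP +2, PC -3
high = gap_error(3, 4)   # NDP +3, PC -4
mid = (low + high) / 2   # midpoint of the range

# Hypothetical federal lead, invented for this example only.
hypothetical_polled_lead = 5.0
actual_lead = hypothetical_polled_lead - mid

print(f"gap error: between {low} and {high} points (midpoint {mid})")
print(f"a polled lead of {hypothetical_polled_lead} would really be {actual_lead}")
```

With an Alberta-sized miss, a lead of that size would be wiped out entirely, which is the whole point of the warning.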

I personally believe the Tories are underestimated, for multiple reasons. They are the incumbents, and incumbents are almost systematically underestimated. They were also massively underestimated by the polls in 2011, especially in Ontario. I'm not saying it's likely the CPC will be 4 or 5 points above the poll average, but I wouldn't be surprised if they were 1-3 points above. This is why, in my projections, I make small adjustments which result in a 2-point boost for the Tories.

In other recent elections, polls almost always made some mistakes. In Quebec in 2014, the Liberals were underestimated by 1-2 points and the PQ finished 2 points below the average. Also in 2014, the Ontario election saw the surprise majority win of Kathleen Wynne. In this case, polls were especially wrong and overestimated the PCs with their likely-voter adjustments. Only Angus Reid is offering such an LV model this year, but the fact remains that polls are always 1 or 2 points off. Yes, aggregating polls provides lower margins of error, but this only corrects for sampling variation. Measuring vote intentions is prone to more error than that. People can change their minds, they can lie, or they can simply not vote.
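To make the point about aggregation concrete, here is a rough sketch using the textbook margin-of-error formula for a simple random sample. The 35% share, the sample size of 1,000 and the five pooled polls are assumptions chosen for illustration; real polls use weighting and online panels, so their true margins are not this clean.

```python
import math

def margin_of_error(share, n, z=1.96):
    """Approximate 95% margin of error for a vote share (as a proportion),
    assuming a simple random sample of size n."""
    return z * math.sqrt(share * (1 - share) / n)

# Assumed numbers for illustration: a party at 35% in polls of 1,000 people.
single = margin_of_error(0.35, 1000)
pooled = margin_of_error(0.35, 5 * 1000)  # five such polls treated as one sample

print(f"single poll:       +/- {100 * single:.1f} points")
print(f"five polls pooled: +/- {100 * pooled:.1f} points")
```

Pooling shrinks the sampling error by a factor of the square root of the number of polls, but it does nothing about errors that every poll shares. If all five polls miss the same late switchers or non-voters, the average misses them too.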

For this Monday, any small error by the polls in Ontario could drastically change the outcome of the race. The possible outcomes range from a Liberal majority to a Harper victory.

2. People who vote are different from the general population. And people who answer polls are also different.

Polling firms sample from the general population of adult citizens. In the case of online panels, they actually sample from the panel, and to be on this panel you need to volunteer. While I don't have a problem with online polls and find them reliable (the results speak for themselves), the fact remains that turnout can massively affect the results.

One big determinant is age, with older citizens voting in much greater proportion than young people. However, polls weight their samples based on census data. What this means is that if people aged 55+ represent 20% of the population (a number I just made up), pollsters will want respondents aged 55 and over to represent 20% of the sample after weighting. But given that turnout among people aged 55+ is much higher, you can see how this could cause problems: this group makes up a larger share of actual voters than of the weighted sample. In BC, for instance, around half of the voters are above 50, a proportion much greater than in the general population.

This is the typical explanation given for the failure of the polls in British Columbia in 2013. I'm only half convinced, but I acknowledge this is a potential issue.

If you look at the voting intentions by age in some of the most recent polls, you usually see the Conservatives winning the 55+ crowd (I say usually because pollsters don't all seem to agree). So if this age group votes in much greater proportion than the other age groups, the overall results will be closer to its vote intentions.
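Here is a small sketch of the mechanism, comparing census weighting with turnout weighting. Every number in it (population shares, turnout rates, party support by age) is invented for illustration and does not come from any actual poll or from Elections Canada.

```python
# Hypothetical numbers for illustration only.
# age group: (population share, turnout rate, Conservative share, Liberal share)
groups = {
    "18-34": (0.30, 0.45, 0.22, 0.40),
    "35-54": (0.35, 0.60, 0.30, 0.37),
    "55+":   (0.35, 0.75, 0.38, 0.34),
}

def weighted_share(groups, party_index, use_turnout):
    """Overall party share, weighting each age group either by its
    population share (census weighting) or by population share
    times turnout (its share of actual voters)."""
    total_weight = 0.0
    total = 0.0
    for pop, turnout, *shares in groups.values():
        weight = pop * turnout if use_turnout else pop
        total_weight += weight
        total += weight * shares[party_index]
    return total / total_weight

cpc_census = weighted_share(groups, 0, use_turnout=False)
lpc_census = weighted_share(groups, 1, use_turnout=False)
cpc_turnout = weighted_share(groups, 0, use_turnout=True)
lpc_turnout = weighted_share(groups, 1, use_turnout=True)

print(f"census weights:  CPC {100 * cpc_census:.1f}  LPC {100 * lpc_census:.1f}")
print(f"turnout weights: CPC {100 * cpc_turnout:.1f}  LPC {100 * lpc_turnout:.1f}")
```

With these made-up inputs, weighting by who actually votes moves the Conservatives up and the Liberals down by a point or so each, with nobody changing their mind. The size of the shift depends entirely on the assumed turnout gap.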

Age is only one of multiple dimensions where something like this could occur, but you get the idea. When Angus Reid builds its Likely Voters model, it tries to account specifically for this. After all, you have to wonder why pollsters weight based on the census instead of on the demographics of voters. At the same time, as I've said, LV models made the numbers worse in Ontario last year... So maybe no model is better than a bad one?

People also need to realize that those who answer polls are not the same as those who vote or as the general population. Just look at the Angus Reid poll, where we learn that 34% of respondents have already voted, even though the actual advance turnout was below 15%. We always get that, with something like 90% of poll respondents saying they will vote. Of course, a lot of people don't want to admit they don't vote, but it also shows that we can sometimes simply be polling the wrong people.

By the way, both Angus Reid and Ekos have numbers among people who already voted. Angus has the Liberals and Tories both at 34%, while Ekos has the Conservatives at 34.9% and the Liberals at 32.5%. Of course, this is mostly due to the fact that advance polls attract more committed voters (a group where the Tories have the edge). Still, it shows at least two things: 1) despite a surge in turnout, the Tories are tied or ahead; 2) if only the more committed voters go out and vote, Stephen Harper still has a chance. Let's remember in particular that advance voting took place during the Thanksgiving weekend, when the Liberals were surging and most likely at their peak in Ontario.

See? The Liberals can win, but they need the people saying they want to vote for them to actually, you know, go out and vote for them. I know the Liberals and Trudeau have borrowed a lot from the Obama campaign in trying to mobilize younger voters. We'll see on Monday if they were successful or not.

3. Projections make mistakes even when the polls are right

Despite a lot of hard work, there is no denying that seat projection models aren't perfect. Not only do they rely heavily on polls, they also make mistakes even with the correct percentages. You always have surprises: candidates who perform better or worse than expected, local effects, etc. These are usually rare, but you never know. The CAQ somehow managed to win pretty much every single close riding in the Montreal suburbs last year, against all odds. This year, if the Tories can hold on in key parts of the GTA, they could well create an upset. At the same time, errors go both ways and the Liberals could surprise us with a majority (the projections actually show it's possible).

So we'll see, but this post is simply to remind everybody that despite a lot of polls and pretty much every projection model showing the Liberals ahead, it doesn't mean the win is guaranteed. Just ask Christy Clark in BC.

