The day before the election, I tweeted that the BC Liberals had a better chance of winning the election than the Boston Bruins had of coming back against the Maple Leafs (trailing 4-1 in the middle of the third period). Based on the polls, I had the BC NDP's odds of winning at around 86-87%. It seems the Bruins' win should really have prepared us for last night! I can at least argue that I technically had a Liberal victory as a possibility (although the magnitude of the victory, with 50 seats for the Liberals, was really at the maximum of the projection ranges). However, the actual result was incredibly unlikely, at least based on the information we had.
Before I talk about how wrong the polls were, above is what the projections would have looked like if I had had the correct vote shares. Details can be found here. As you can see, the model would have performed quite well, making the correct call in 75 ridings (a success rate of 88%). The mistakes include Christy Clark's riding, Oak Bay-Gordon Head (where I didn't think Andrew Weaver would get such a big boost, although the model did allow for that possibility) and a couple of close races in the Lower Mainland. It also seems the model would have made quite a lot of mistakes on Vancouver Island, probably because the regional swing (of the Greens especially) was so different from the provincial swing. Since some of these 10 mistakes cancel each other out, the overall projections would have been really close. The fact is that if you had come to my blog and entered these vote shares in my simulator, you'd have won your political pool! So "my part of the job", i.e. translating votes into seats, is clearly well done. Let's not miss the achievement here: with the provincial vote shares alone, the model could have correctly called the election as well as 88% of the ridings. On top of that, if you look through the pdf, you'll see that more often than not, not only would the correct call have been made, but the projected vote shares in the riding are very close to the actual ones. It just shows how important the provincial swing is. So translating votes into seats is totally possible with very little information (but a lot of work from me). Now, even more than before, the key is really to find or project the correct vote shares.
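To make the idea of translating provincial vote shares into seats concrete, here is a minimal sketch of the simplest version of it, a uniform provincial swing. This is not my actual model, which is more involved; the riding names, party labels and numbers below are all made up for illustration, and the function names are hypothetical.

```python
from collections import Counter

def project_ridings(prev_ridings, prev_prov, new_prov):
    """Uniform swing: shift each party's previous riding share by its
    change in provincial share, then pick the leader in each riding."""
    swing = {party: new_prov[party] - prev_prov[party] for party in new_prov}
    winners = {}
    for riding, shares in prev_ridings.items():
        projected = {
            party: max(share + swing.get(party, 0.0), 0.0)
            for party, share in shares.items()
        }
        winners[riding] = max(projected, key=projected.get)
    return winners

def success_rate(projected, actual):
    """Fraction of ridings where the projected winner matches the actual one."""
    correct = sum(projected[riding] == actual[riding] for riding in actual)
    return correct / len(actual)

# Made-up previous results (in %), for illustration only.
prev_ridings = {
    "Riding A": {"LIB": 48.0, "NDP": 42.0, "GRN": 10.0},
    "Riding B": {"LIB": 41.0, "NDP": 47.0, "GRN": 12.0},
    "Riding C": {"LIB": 44.0, "NDP": 45.0, "GRN": 11.0},
}
prev_prov = {"LIB": 46.0, "NDP": 42.0, "GRN": 8.0}
new_prov = {"LIB": 47.0, "NDP": 41.0, "GRN": 8.0}  # hypothetical new shares

winners = project_ridings(prev_ridings, prev_prov, new_prov)
seats = Counter(winners.values())

# Pretend "Riding B" actually went the other way, to show the scoring.
actual = {"Riding A": "LIB", "Riding B": "LIB", "Riding C": "LIB"}
rate = success_rate(winners, actual)
```

In this toy example a one-point swing is enough to flip Riding C, and the projection gets 2 of the 3 ridings right. The same scoring applied to the numbers above, 75 correct calls and 10 misses, gives 75/85, or about 88%.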
That, honestly, shouldn't be my job. Pollsters are there for that. We got a lot of polls during this BC election, including 3-4 during the last two days. They were all showing the same outcome: the BC NDP ahead by 5-7 points (except Forum, who had the BC NDP up by only 2, but they released that poll about a week ago, so they might just have been lucky). So even by averaging the polls, you'd still have been far off. If I had accounted only for the purely statistical uncertainty of these polls (i.e., their margins of error), my simulations would never have allowed for the possibility of a BC Liberals win. Indeed, when you combine polls, margins of error go down. It's only because I add a lot more randomness that I technically had the actual outcome as a possibility. My only regret is that two ridings that were called with probabilities of 100% went wrong. I thought I was accounting for enough uncertainty, but it seems I'll have to add even more, both at the provincial and riding levels. The probabilities should account for the possible error of the polls. They already do, but not fully, it seems.
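Here is a quick sketch of why combining polls shrinks the margin of error. It uses the textbook formula for a simple random sample and assumes the polls are independent and unbiased, which is exactly the assumption that failed here. The sample sizes and the 45% share are made-up numbers, not those of the actual BC polls.

```python
import math

def margin_of_error(share, n, z=1.96):
    """95% margin of error for a proportion from a simple random sample of size n."""
    return z * math.sqrt(share * (1.0 - share) / n)

def pooled_margin(share, sample_sizes, z=1.96):
    """Margin of error when independent polls are pooled into one big sample."""
    return margin_of_error(share, sum(sample_sizes), z)

# Hypothetical: one poll of 800 respondents vs. four such polls combined.
single = margin_of_error(0.45, 800)
pooled = pooled_margin(0.45, [800, 800, 800, 800])

print(f"single poll: +/- {single * 100:.1f} points")
print(f"four polls pooled: +/- {pooled * 100:.1f} points")
```

Pooling four polls of the same size cuts the margin of error in half, since it shrinks with the square root of the total sample size. That is why a model relying on sampling error alone would have ruled out a result several points away from the polling average, and why extra randomness has to be added on top to cover the possibility that all the polls are off in the same direction.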
Honestly, it's quite depressing and frustrating to work so hard on building a model just to be so wrong because all the polls are providing you with incorrect information. In the US, during the last election, you didn't have to be Nate Silver to predict the election. A simple poll average would have been (almost) spot on. So pollsters in this country should really ask themselves why they can be so wrong. And not just one or two polls, but all of them! It's the same thing that happened in Alberta (and to a lesser extent in Quebec) last year (we could also include the underestimation of the Conservatives in the last federal election, although some polls were quite close). It's almost funny because people argue a lot about the correct methodology (online panel vs IVR vs phone calls), but the issue lies elsewhere. I'm sure pollsters will come out (they have started already) talking about how there was a late swing and blah blah blah. But to me, the problem is that they don't poll the right people. Ekos is trying to identify likely voters but is obviously failing miserably.
I'll continue to build models and offer projections. After the last Quebec election, I decided to add uncertainty to account for the possibility of big mistakes in the polls. I'll continue that way and, hopefully, polling firms will improve on their part.