Projections update Saturday May 26th 2018

Not a lot to write about today. While Forum did come out yesterday with the NDP at a crazy 47%, other pollsters seemed to agree the race was much closer. Ekos, after leaking some partial results the day before showing the NDP up by 10, ultimately published a full poll with PC and NDP statistically tied. Abacus said on Twitter that they weren't observing a crazy NDP break-out beyond what they had already observed (interesting because Abacus was one of the first to catch the rise of the NDP a couple of days ago) and Mainstreet's tracker is slowly but surely converging to a close race. Finally, Innovative went kinda against the trend by publishing a poll with the PC still relatively comfortably ahead. It should be noted however that the Innovative poll was conducted from the 18th to the 23rd, so it's possible it didn't capture some of the NDP's jump that happened recently.

So it turns out that yesterday's long analysis, where I showed the edge the PC has with vote efficiency, is still valid and relevant! Yeah! By the way, you should definitely read this analysis!

Here below are the most up to date projections.

Voting intentions; Seat projections with confidence intervals; Chances of winning the most seats

Mainstreet also started publishing some riding polls, namely in Ajax and Guelph. Riding polls are by far less accurate but I can't ignore them. After all, I need to make projections because we don't get 124 riding polls. So when there is one, I include it in my forecast.

I can't reveal the exact numbers in Guelph but let's just say that the poll confirmed the Green leader Mike Schreiner is in the race. It's always very difficult to predict such a race where one party is putting everything it has into this one riding. Mike Schreiner's past results didn't seem to indicate a crazy personal effect (as opposed to Elizabeth May or Andrew Weaver for instance) but it seems this is working better this time around. To be fair, Guelph is most likely a better riding and the vote is so split that it takes a low percentage of votes to win this year.

So can he win? Yes, absolutely. But it remains a 3-way race with the PC and NDP (even the Liberal candidate isn't fully out).

So the big change of the day is the confidence interval for the Green now being 0 to 1 instead of 0 to 0. Beyond this, I don't have anything to add for now. Enjoy your Saturday!

The Ontario election is now a competitive race between the Conservatives and the NDP

Aaaaaand we got a race! That's right, after weeks (months?) of the Progressive Conservative Party of Doug Ford clearly leading in the polls (and the seat projections), we now have a close race between this party and the NDP of Andrea Horwath. This is honestly quite impressive given how big the PC lead was just a few weeks ago. If the Tories end up losing this election, I'm sure people will compare them to the Maple Leafs and the famous blown lead to Boston.

After a few days where online polls were showing a tight race (with the NDP sometimes ahead) while phone polls (Ekos, Mainstreet) kept showing the PC way ahead, things changed yesterday. Frank Graves, Ekos' CEO, has tweeted that they now see big shifts and the NDP is first. They should publish the poll today (the preliminary results leaked on Twitter were literally showing the NDP with a 10-point lead!). As for Mainstreet, its CEO Quito Maggi has also tweeted something similar and we should start seeing the changes in their daily trackers (although, since it's a 3-day tracker, it'll take some time). We also know Forum will publish a poll on Thursday and rumours are that they have the NDP very, very high. Edit: Yup, the Forum poll is up and this party is at freaking 47%! If you want to see what the projections would look like with these numbers, just use the simulator.

All that to say that the projections below will most likely be outdated very, very soon. But even if that's the case, you can see a very different race with the NDP within striking distance of the Tories.
Edit: I updated the projections with the new poll from Forum. For the first time, the polling average has the NDP ahead.

Voting intentions; Seat projections with confidence intervals; Chances of winning the most seats

The probabilities for a majority are only 43.2% for the PC and 19.1% for the NDP. You can find all the details at the bottom of this post.


Since the projections won't be valid for very long, I want to focus instead on what a close PC-NDP race, with both parties around 36%, would look like. Of course, I'm assuming here that the race will indeed remain tight for a while and the NDP won't take a large lead. That scenario is far from impossible, but let's ignore it for now. If the rumored Ekos and Forum numbers are true, we are talking of an orange wave and all this article here is for nothing. Oh well.

The model seems to show that in this case (the close race), the Conservatives would have the edge with vote efficiency. In order to fully represent this, I used my simulations and estimated the function below. This is showing you the chances of the PC winning more seats (majority or not) than the NDP as a function of the PC lead in terms of votes (so it's negative if the NDP gets a higher percentage of vote). This is valid for the NDP and PC around 36% and the Liberals around 20%.



If no party has an edge, then the chances of winning should be exactly 50% if the vote spread is 0. In other words, if two parties were to get (say) 36%, then they both should have a 50% chance of winning. So one way to capture the edge of one party is to see where the 50% chance is.

As you can see, the 50% mark is when the PC is around -2 points, which means the NDP needs to win the popular vote by 2 percentage points (ex: 38% versus 36%) in order to have the same chances as the PC. Another way to say this is that if the two parties are actually tied, the PC has around 90% chances of winning more seats.
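To make the "where is the 50% mark" idea concrete, here is a minimal sketch that interpolates the lead at which a win-probability curve crosses 50%. The curve values are illustrative placeholders, not numbers read off my graph (only the 50% at -2 and roughly 90% at a tie echo the text), and the function name is made up for this example.

```python
def lead_at_even_odds(leads, win_probs):
    """Linearly interpolate the vote lead where the win probability hits 50%.

    leads must be sorted ascending and win_probs must be non-decreasing.
    """
    points = list(zip(leads, win_probs))
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        if p0 <= 0.5 <= p1:
            if p1 == p0:
                return x0
            return x0 + (0.5 - p0) * (x1 - x0) / (p1 - p0)
    raise ValueError("the curve never crosses 50%")


# Hypothetical curve: PC lead over the NDP (in points) and the assumed
# probability that the PC wins more seats at that lead.
leads = [-4, -3, -2, -1, 0]
win_probs = [0.10, 0.28, 0.50, 0.72, 0.90]

even_odds_lead = lead_at_even_odds(leads, win_probs)
print(f"Even odds at a PC lead of {even_odds_lead:+.1f} points")
```

A crossing below zero means the PC can trail in the popular vote and still be a coin flip to win more seats, which is exactly the edge described above.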

Just to be clear here, the projections above show the PC with around a 2 points lead and "only" 80% chances of winning. So you might be confused because this graph here shows that if the PC wins by 2 points (+2 on the x-axis), then the PC has over 90% chances of winning more seats. The probabilities here and in the projections above aren't representing the same uncertainty. In the projections, it is accounting for the possibility that the polls are wrong. And we know they can be. On the other hand the graph here shows the uncertainty due to the distribution of the vote (or the electoral system). So at PC +2, this is really looking at the possible seat distribution if the Tories were to actually receive 2% more votes. See the difference?

2 points is a fairly big advantage although this is less than the usually accepted 5 points lead the Liberals need to have on the PQ in Quebec since the Liberal vote is heavily concentrated in non-francophone ridings (note: I don't believe this 5 points rule is remotely true nowadays, but that's another story). This is just another example of the flaws of our electoral system where one party could get fewer votes but more seats.

Ok so why is the PC so much more efficient? One explanation is of course that my model is just wrong. This is completely possible but not super interesting (cause I won't know it unless we actually get the election).

Another explanation is because the NDP is wasting votes in some regions and too far behind in others. The PC vote is really evenly spread while the NDP vote is more concentrated in Hamilton/Niagara and the North. The NDP remains low in central Ontario and the east and, more importantly, is still behind the PC significantly in the GTA (in Toronto proper, it seems to be a 3-way race and you can just roll a die to make your prediction). The GTA is really the key. It's the source of many seats. In the projections above, the Tories are winning 21 seats in the 905 while the NDP is only at 8.

So what the NDP needs right now is to increase its share of votes in the GTA. And possibly in a non-uniform way but instead by increasing more in some key ridings. Of course, this is easier said than done. The vote inefficiency of the NDP in the GTA seems particularly severe.

Look at the map below, from Wikipedia. The GTA was the life source of the Liberals in 2014. The Tories actually lost seats there between 2011 and 2014 (as well as in the greater 905 in general). As for the NDP, it finished below 20% in the 905 and only won 2 seats. The best measure of the inefficiency of the NDP vote is to calculate the standard deviation of its vote share across ridings within the 905. The higher the standard deviation, the more uneven the NDP vote is, which means it's high in a few ridings but also very low in others. For 2014, the standard deviation was around 11 points for the NDP, almost twice as much as for the PC (interestingly, the NDP is close to the Liberals but the Grits were also ahead there, so it's less of an issue for them). On average the NDP was almost 30 points away from winning ridings while the PC was only around 15 points away. And there as well the NDP's deficit varied quite a lot between ridings.
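For readers who want to see what the standard deviation captures, here is a small sketch with two made-up parties that have the same average share across five ridings. The riding numbers are hypothetical and are not the 2014 results for the 905.

```python
from statistics import mean, pstdev

# Hypothetical vote shares (in percent) across five ridings.
evenly_spread = [30, 32, 34, 36, 38]
concentrated = [12, 20, 34, 48, 56]

for label, shares in [("even", evenly_spread), ("concentrated", concentrated)]:
    # pstdev is the population standard deviation, used because the
    # ridings listed are treated as the whole region, not a sample.
    print(f"{label:>12}: mean {mean(shares):.1f}%, std dev {pstdev(shares):.1f} pts")
```

Both parties average 34%, but the concentrated one piles up votes in a couple of ridings and is far behind in the others, which is what a large standard deviation is flagging.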




So the big question this year is really: who will win the lottery since the Liberals are about to lose most of their seats? Based on what we just saw, the PC is in a much better position to win these seats. The NDP really has only two ways here. First, it could simply take a huge lead overall in this region, thus compensating for its inefficient distribution. The second method is to hope that its gains in votes will be optimally distributed in the right ridings. In the east 905, we are talking of ridings in Vaughan for instance that could be the first to go NDP (among the ones not already going orange) or Ajax. On the other hand, the NDP would likely be wasting votes if it was to increase mostly in Thornhill or Markham. So either on the left or right side of the east 905, but not in the middle. In the west 905, the NDP should hope its gains aren't in the ridings closer to Hamilton (Oakville, Burlington) and more in ridings like Mississauga (and Brampton, but it's already projected to go NDP). In general, the west of the 905 (so the Peel Halton region) is better for the NDP than the east (the York region).

There are 4-5 ridings where the NDP is within striking distance. But the others? The others are harder. In some of them the NDP is likely still a good 20 points away. Still, taking 4 ridings away from the PC would go a long way to restore a more balanced race between the two. Good riding targeting by the NDP and this race is a toss-up.

So what the NDP needs right now if it actually wants to win on June 7th is a good transfer of Liberal votes in the GTA. Andrea Horwath needs to inform and convince these voters. Inform them that the OLP has no chance of stopping Doug Ford (note: the latest Leger poll showed just how uninformed most voters were. 50% of Liberal voters thought their party had the best chances of winning the election!) and then convince them to actually make the jump. I'm sure she'll spend a lot of time repeating this during the last two weeks (and the debate).

The other region where the NDP seems to be currently inefficient is the east, including Ottawa. But this is less important than the GTA in the overall vote efficiency of each party. With that said, an OLP to NDP transfer in Ottawa (along with the one in the GTA) would give Horwath a majority.

So here you have it. Unless polls start showing a PC rebound or a large NDP lead, this race will become incredibly competitive. But the Tories have the vote efficiency going for them. They can lose the popular vote by 2 points and still win more seats. They would likely not get a majority however. But that's for another article.

I'll try to update the projections quickly when we get the new polls today.

The detailed projections from above:

Quick morning projections update, May 24th 2018

I already posted a long article about the general accuracy of Canadian polls here, I strongly suggest you give it a read. Since I know other polls will come today (Léger for instance), I just updated my projections using yesterday's Mainstreet tracker as well as the new Pollara poll (for which we have very little information).

So here it is folks, just the numbers, not much blah blah.

Voting intentions; Seat projections with confidence intervals; Chances of winning the most seats

Notice that the Conservatives of Ford are now less than 90% sure to win the most seats. Maybe more significant is the fact the chances of a majority are now barely above 60% (see below for details). If the trend goes on (and I suspect it will with the new Leger numbers, since Leger does online polls, which have shown a tighter race than phone ones), then it'll soon be a toss-up for a Tory majority...

Possible outcomes are here:


Finally the detailed projections. Have a nice day everyone!

Can we trust Canadian polls?

With the Ontario election entering its final two weeks and the polls diverging (mostly depending on whether they are done online or by phone), one could wonder: how accurate are Canadian polls in general? Should we trust them?

The short answer is yes, but they tend to be off once in a while. And even when they aren't completely off -the majority of the time- there is still considerable uncertainty even if you average many polls.

Let's look at how I reached this conclusion.

I collected the polls during the last week (or so) of the campaign for 10 elections. They are Alberta 2012 and 2015, BC 2013 and 2017, Quebec 2012 and 2014, Ontario 2014 as well as the last three federal elections. This is I believe a fairly good sample. It includes many different elections over many years. And it does include the two big misses (Alberta 2012 and BC 2013), so whatever results I find can't be called biased towards pollsters.

For each election, I averaged the polls. A simple, straight up average. No fancy adjustments based on sample size or whether the poll was done 12 hours after another one. No adjustment for incumbency or anything. I only made sure to include only one poll for each firm. I believe my personal average will beat this simple one most of the time, but at least it's clear and straightforward.

I then compared the average for the main parties (the exact number will vary. It's 4 in Ontario, 5 for federal election or Alberta, etc) to the actual results.

I then calculated what is called the MSE, the mean squared error. It's simply the average of the squares of the differences. Why the square? So that it doesn't matter whether the polls were over or underestimating a party. A deviation of 2 points above or below will result in the same "penalty" (2 squared, so 4).

If you know your stats, for unbiased estimators (and we believe that polls are of this kind, meaning they don't systematically under or overestimate voting intentions), the MSE is equal to the variance.

Once I have the MSE, I can take the square root and multiply by 1.96 in order to get the margins of error. This is really basic stats but don't worry if you don't get it. I'll interpret the results in a non (or less) technical way.
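If it helps, the steps above (differences, squares, average, square root, times 1.96) can be sketched in a few lines. The poll averages and results below are invented for illustration and are not taken from any of the 10 elections; the function name is also just a label for this example.

```python
from math import sqrt


def effective_margin_of_error(poll_averages, results, z=1.96):
    """Turn polling errors into an effective 95% margin of error.

    MSE = mean of squared (poll average - actual result); the margin is
    z * sqrt(MSE), treating the polls as unbiased so that MSE = variance.
    """
    errors = [avg - res for avg, res in zip(poll_averages, results)]
    mse = sum(e * e for e in errors) / len(errors)
    return z * sqrt(mse)


# Hypothetical four-party election, all numbers in percentage points.
poll_averages = [36, 34, 22, 8]
results = [38, 33, 20, 9]

moe = effective_margin_of_error(poll_averages, results)
print(f"Effective margin of error: {moe:.2f} points, 19 times out of 20")
```

Here the errors are -2, +1, +2 and -1, so the MSE is 2.5. Squaring is what makes the -2 and the +2 count the same.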

The table below presents the results for these 10 elections.




So on average for the main parties, polls will be within 2 points of the actual results. Interpret this number as saying that on average, if polls estimate a party at (say) 35%, then they were 2 points off from the actual result (so the party ultimately got 33% or 37%). Of course that's an average. Sometimes polls miss by more (they were off by 11 points for the PC in Alberta 2012 but only off by 1 point for the PC of Harper in 2015).

Notice that the 1.92% is for the poll average. If you look at individual polls, it'll be worse in general (there are a few exceptions where a single poll would do better than the average, such as the Nanos and Forum polls for the federal election in 2015).

It's good but not exceptional. Again this is after averaging usually 5-10 polls for each election. Being off by around 2 points on average can mean the outcome is very different from the one you were expecting. Imagine the poll average is showing a race tied at 36%-36%. Applying the average absolute error, it means the actual results could be 38% to 34%. I can guarantee you that in most cases, this is an error big enough to change the winner. Every percentage point can turn into multiple seats at that level. 2 points off was enough to give Trudeau a majority compared to the projections (they had other problems as well, but let's move on).

A simple observation: out of the 10 elections, the incumbent party was underestimated by the polls in 9 of those. The only exception is Quebec 2014. But in Quebec, it is expected that the Liberals will be underestimated. So we could almost say 10 out of 10.

On average, the underestimation of the incumbent is a crazy 3.6 points! If we exclude the two big misses of Alberta 2012 and BC 2013, it gives us 2.1 points. This is why in my average I allocate more undecided to the incumbent, so as to take care of this systematic underestimation. And yes I'm currently doing it for the Liberals in Ontario. This boosts them by 1.5 points usually (it depends on recent polls, number of undecided, etc). I feel strongly about doing it because it is based on data and evidence. For all the mistakes I made in 2015 (not the best election for me), getting the CPC right is one of the few things I got!
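To give an idea of what "allocating more undecided to the incumbent" can look like, here is one possible sketch. It is not my actual allocation rule: the function, the `incumbent_weight` of 2.0 and the poll numbers are all assumptions chosen for illustration.

```python
def allocate_undecided(decided, undecided, incumbent, incumbent_weight=2.0):
    """Split the undecided share among parties, tilted toward the incumbent.

    decided: party -> share of all respondents (percent).
    undecided: share of respondents who are undecided (percent).
    incumbent_weight: hypothetical multiplier; 1.0 gives the usual
    proportional allocation.
    """
    weights = {
        party: share * (incumbent_weight if party == incumbent else 1.0)
        for party, share in decided.items()
    }
    total_weight = sum(weights.values())
    return {
        party: decided[party] + undecided * weights[party] / total_weight
        for party in decided
    }


# Hypothetical poll: 90% decided, 10% undecided.
decided = {"PC": 34, "NDP": 31, "OLP": 20, "GPO": 5}
proportional = allocate_undecided(decided, 10, "OLP", incumbent_weight=1.0)
tilted = allocate_undecided(decided, 10, "OLP")

boost = tilted["OLP"] - proportional["OLP"]
print(f"OLP: {proportional['OLP']:.1f}% proportional vs {tilted['OLP']:.1f}% tilted "
      f"(+{boost:.1f} points)")
```

The size of the boost depends on the number of undecided and on the weight chosen, which is why the adjustment moves around from one poll to the next.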

Okay, let's go back to the general accuracy of polls. If we translate the MSE above into margins of error, it means that Canadian polls have actual, effective margins of error of 5.68% 19 times out of 20. This is equivalent to a poll with only 271 respondents (for a party at 35% as MoE are a function of the level of support)!
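The 271 figure comes from inverting the usual margin-of-error formula for a proportion. A quick sketch, using the standard formula MoE = 1.96 * sqrt(p(1-p)/n), which assumes a simple random sample (the function name is just for this example):

```python
def equivalent_sample_size(moe, share, z=1.96):
    """Sample size whose theoretical 95% margin of error equals `moe`.

    Solves moe = z * sqrt(share * (1 - share) / n) for n, with moe and
    share expressed as fractions (0.0568 for 5.68%).
    """
    return share * (1 - share) * (z / moe) ** 2


n = equivalent_sample_size(moe=0.0568, share=0.35)
print(f"Equivalent sample size: about {round(n)} respondents")
```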

You might be confused here. Most polls out there are reporting MoE of around 3% to 3.5% for a sample size of 1000 respondents. So how can the average have larger MoE? Averaging multiple polls, independent from each other, should in theory significantly reduce this margin of error. That's true, but only for the theoretical margins of error that are there to take care of sampling variations (in other words: the fact we only randomly select 1,000 people to answer). But see, in the real world, sampling variation isn't the main issue. Far from it. In the real world, when we actually look at polling accuracy, we need to remember that other factors are at play: turnout, people changing their mind or making up their mind at the last minute, people lying to the pollsters, people refusing to answer but voting anyway, incorrect weighting by the pollsters, etc.

This is why I couldn't care less when we see online polls with non random samples to which traditional margins of error don't apply. So many people make a big deal out of it. I really don't because I know that the actual margins of error are much bigger and not a function of sampling. If sampling variation was the single greatest source of uncertainty, then averaging 5-10 polls of 1,000 respondents would give spot on estimates of the election. But it doesn't.
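Here is a rough sketch of what pure sampling theory would predict, so you can compare it to the 5.68% observed in practice. It assumes independent simple random samples of equal size, and the choice of 8 polls is arbitrary (anything in the 5-10 range tells the same story).

```python
from math import sqrt


def sampling_moe(share, n, z=1.96):
    """Theoretical 95% margin of error for one random sample of size n."""
    return z * sqrt(share * (1 - share) / n)


single_poll = sampling_moe(0.35, 1000)  # one poll of 1,000, party at 35%
k = 8                                   # assumed number of polls averaged
averaged = single_poll / sqrt(k)        # averaging k independent polls

print(f"One poll: {100 * single_poll:.2f} points")
print(f"Average of {k} polls: {100 * averaged:.2f} points")
print("Observed in the 10 elections: 5.68 points")
```

On paper the average should be good to about 1 point. The gap between that and what actually happened is everything that isn't sampling.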

How does this compare to other countries? Well it's worse than for French presidential elections, where the effective margins of error are 3.81% (and that's for 2002, 2007 and 2012; I haven't included 2017 where the polls were super accurate). Funny enough, French pollsters do NOT use probabilistic random samples. And yet they perform quite well. As for the US, this article mentions actual margins of 7 points, although I believe it'd be lower if we only look at presidential elections.

Out of the 10 elections I included in my analysis, two were clear misses (Alberta 2012 and BC 2013). Some were partial misses (Quebec 2012 for instance). If we remove the two really bad elections (let's just assume they were one-off, or well, two-off -possibly caused by online polls not being as well developed as they are now), you get the second column.

With margins of error of 3.2%, we see that Canadian polls are fairly accurate. Still, this means that at the end of this Ontario election, even after averaging 5-10 polls (that will be published during the last week), you should add and subtract around 3 points to the various averages. That gives you fairly wide intervals (6 points is bigger than the current gap between the PC and the NDP!). With how votes are translated into seats, it could mean the PC for instance could win 50 or 80 seats! And remember that this is assuming the polls will be "right". If you want to factor in the possibility of a bigger miss like in Alberta, then you need intervals of almost 12 points!

Don't think however that the chances that the polls are 12 points off are the same as the chances they are 2 points off. That's not how you should interpret those margins of error. The upper and lower bounds should be seen as possible but unlikely results for the election based on the polling average. In order to illustrate this, I generated the graph below for a generic party polling at 35% (usually the score of the top party in a Canadian election). The graph gives you a visual representation of how likely each electoral outcome is. (Technical note: this part works better if you accept a Bayesian point of view where the polling average represents the uncertainty we have for our knowledge of the correct voting intentions.) This distribution is the result of simulating 10,000 samples with a sample size of 270.
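If you want to reproduce something like this distribution yourself, here is a minimal sketch. It assumes each simulated sample is 270 independent respondents who each pick the party with probability 35%. The seed and variable names are arbitrary, and this is not the exact code behind my graph.

```python
import random
from collections import Counter

random.seed(2018)  # fixed seed so the run is repeatable

N_SIMS = 10_000
SAMPLE_SIZE = 270
TRUE_SHARE = 0.35


def simulate_share(n, p):
    """Share (in percent) of n simulated respondents picking the party."""
    hits = sum(1 for _ in range(n) if random.random() < p)
    return 100 * hits / n


shares = [simulate_share(SAMPLE_SIZE, TRUE_SHARE) for _ in range(N_SIMS)]

# Bucket outcomes to the nearest point, like the bars of a histogram.
histogram = Counter(round(s) for s in shares)
within_moe = sum(abs(s - 35) <= 5.68 for s in shares) / N_SIMS

for point in range(28, 43):
    print(f"{point}% {'#' * (histogram[point] // 25)}")
print(f"Share of outcomes within 5.68 points of 35%: {within_moe:.1%}")
```

The bars near 35% are tall and the ones near 30% or 40% are short: those outcomes are possible, just much less likely.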


See, if a party is polled (on average) at 35%, then this party could actually receive only 30% of the vote. But while this is possible, it's highly unlikely (thus why the bars are smaller). Please do not interpret this article or this graph as a validation of saying "anything can happen". This gives the wrong impression that every outcome is equally likely when it isn't. Similarly, if your party is polling at 30%, your party can indeed be underestimated and win. But it's not the most likely scenario. On average you want your party to be polling high, there is no way around this. People pretending otherwise are either lying or stupid.

What this exercise shows is that despite my best efforts (or anyone doing this job), there will always be considerable uncertainty. This is why I believe people should focus more on my probabilities (where I account for this uncertainty with a well calibrated model) rather than the top line numbers. What it also shows is that people who simply say "polls are always wrong" are actually wrong. This is simply not true. Yes Canadian polls aren't perfect and they sometimes miss, but in general they do a fairly good job.

Ontario election: What is Google Trends showing?

Using searches on Google can be useful during an election. There seems to be at the very least a correlation between the quantity of searches and votes. With that said, it's at best iffy and not something I'd use over polls. It can also be super finicky. Last year during the mayoral election in Montreal, I discovered that I'd get different results if I use the accent on the e of Valérie Plante (seriously, if I was looking at Google Trends for "Valérie Plante", she was losing, but she was winning under "Valerie Plante"...). Still, it can be interesting to look at this.

First, let's look at which leader is most searched. All three leaders (Wynne, Ford and Horwath) are famous enough that Google knows about them. It means we can do a (superior) "topic" search instead of a term search (the latter means we'd simply be looking literally at the words "Kathleen Wynne". It's inferior because it can get polluted by people searching for another Kathleen. It also doesn't capture related searches like "Premier of Ontario". At least that's my understanding). Results are below for Ontario during the last 90 days.

 

Note: I don't choose the colors! Also, these graphs are live. So the text I'm writing here is valid as of May 23rd!

Doug Ford is well ahead. That can be a good thing or not. For instance the main search associated with Doug Ford is "Doug Ford greenbelt" related to the infamous "secret" discussions that Ford had with developers and promising them to open the greenbelt to them.

As for Kathleen Wynne, things don't improve as the main search is "how old is Kathleen Wynne"!

What to take from this graph? Mostly two things. 1. Ford is by far the most searched leader. This is impressive since usually incumbents have an edge. The divisive Ford creates controversies and interest. 2. Horwath is still third. One of her issues is that she is simply less known. In the polls, she is the leader with the most "neutral" or even "don't know" answers when asked if they have a positive or negative opinion of her. You'd think that would have changed with the campaign and the polls showing her in (possibly) a winning position. But this hasn't been the case (yet maybe?). It also didn't really change for John Horgan last year in BC. At least the searches increased for Horwath right as the campaign started.


If instead we look at searches for the parties, the NDP does much better. This is similar to what was happening last year in BC with Christy Clark crushing it with her popularity but the NDP doing much better with the party.

 

Interestingly here, it seems we can pick up the recent decline of the Conservatives and the rise of the NDP. It'll be interesting to monitor this for the next two weeks. The rise is even more pronounced if we use the last 7 days:

 

So, any useful information for predictions? I think yes, if you aren't too picky. These two graphs to me confirm the Liberals are not in the race. They are the incumbent and main (historical) party and are quite low in Google searches. That isn't normal or a good sign for them.

Next, does it show the NDP or PC ahead? I'm tempted to do an average of the two measures as a vote is simultaneously influenced by the leader and the party. If you do that and normalize to 100, you get the following "estimates" for the last 30 days:

OLP: 24%
PC: 50%
NDP: 26%

If you only use the very last 7 days:

OLP: 22%
PC: 45%
NDP: 33%

Not bad! Keeping in mind the possible overestimation of the PC due to the "celebrity status" of Ford (or putting more weight on the party searches rather than the leader ones), you get a pretty good estimate of the polling average.
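For those curious about the mechanics of "average the two measures and normalize to 100", here is a small sketch. The search-interest numbers are invented placeholders, not the actual Google Trends values, and the function name is only for this example.

```python
def blend_and_normalize(leader_interest, party_interest):
    """Average leader and party search interest, then rescale to sum to 100."""
    blended = {
        party: (leader_interest[party] + party_interest[party]) / 2
        for party in leader_interest
    }
    total = sum(blended.values())
    return {party: 100 * value / total for party, value in blended.items()}


# Hypothetical average search interest over the period (Trends-style index).
leader_interest = {"OLP": 20, "PC": 60, "NDP": 10}
party_interest = {"OLP": 16, "PC": 30, "NDP": 24}

estimates = blend_and_normalize(leader_interest, party_interest)
for party, share in estimates.items():
    print(f"{party}: {share:.0f}%")
```

Putting more weight on the party searches than on the leader ones would simply mean replacing the plain average with a weighted one.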

The chances of a majority are falling for Ford while polls disagree with each other

Yesterday, we got the latest Ipsos poll. And for the first time during this campaign, it was showing the NDP ahead (by 1 point) among decided and leaning voters. While this poll confirmed a trend where the PC was decreasing while the NDP was rising, the actual numbers were quite different from other recent polls.

A closer examination of the four most recent polls (Ekos, Abacus, Mainstreet daily tracker and Ipsos) shows that online and phone polls (Automatic phone calls, called IVR) do not agree on the current state of the race. The two phone polls (Ekos and Mainstreet) show the PC well ahead (about 10 points, 39% versus 29%) while the two online polls (Abacus and Ipsos) suggest a tied race. At least everybody seems to agree that the Liberals are far behind in third, so there's that.

Online and phone polls not agreeing with each other isn't new. We had a similar situation between Ekos and Ipsos 4 years ago in Ontario. Still, let's hope they both have a part of the truth and averaging them will work. Arguments can be made both ways. Some people don't trust online polls in general (note: they are wrong). An experiment by Campaign Research 4 years ago showed that online polls tend to overestimate the NDP. If this is the case again, then the NDP is most likely around 32-33%, right between the phone and online polls. Some have criticized IVR polls for failing to reach enough young voters, but whether this is a problem or not mostly depends on what the youth turnout will be. Mainstreet has also talked about the failure of IVR in the Calgary mayoral election.

At the end of the day, we can't really pick one method over the other for now. The best we can do is to be aware of the disparities and keep monitoring them. I wish we could get a live caller poll (CATI) as this is the gold standard of polls. For now, I'll simply continue to average polls. If you feel very strongly that one set of polls is incorrect, feel free to enter your numbers into the simulator.

Using these four most recent polls, we get the following projections.

Voting intentions; Seat projections with confidence intervals; Chances of winning the most seats

The possible outcomes are:


Finally the seat distributions are:



While the Conservatives are still ahead, the chances of a majority are dropping fast. In the span of a week, we went from an almost certainty (over 90% chances) to "only" 66% chances. 66% might seem very high but it means there is 1 chance in 3 that Doug Ford doesn't get a majority. This isn't a small probability. A weak analogy would be that you wouldn't get in a car that has a 33% chance of crashing. This is especially important because if Ford were to fail to secure a majority (63 of the 124 seats), you could expect a complicated situation where the NDP and Liberals could try to make a deal. It'd be a messy situation to say the least.

The situation isn't dire for the Tories (yet?). They are still leading (in the average and phone polls), and their numbers show they do better among older voters (more likely to vote) or people who voted last time (more likely to vote again). They also continue to dominate the GTA, Toronto included, with its ton of seats. Add the seats in the East and Central Ontario plus a couple in the Southwest and the PC is still the favourite to finish first. Make no mistake here, the PC is the favourite right now.

The NDP remains behind but could get some traction if we start talking more and more about their chances (some might say it could also motivate the PC base to go out and vote but I don't think it matters, these voters were already motivated). The NDP vote is also less efficient than the Tories', mostly because of the GTA. The NDP vote is also too unevenly spread. It's concentrated in the North and the Southwest (Hamilton, Niagara, etc). If you look at the regional polling averages, the standard deviation across regions is 4.9pts for the Liberals, 3 for the PC and 5.6 for the NDP. The PC is remarkably widespread and is therefore competitive in every region. This is ideal when you are the main party and you want to win the most seats. Being concentrated like the NDP is good when you are at 19%, not so much at 32%.

I think I'm a little bit surprised at how resilient the Liberal vote has been. After a sharp drop at the very beginning of the campaign, it has remained quite stable. I have them slightly higher than the raw poll average however because I allocated the undecided non-proportionally (a method that has proven to work and a systematic examination of past electoral polls shows a clear underestimation of incumbents). Still, it seems the OLP has reached its floor, at least for now. For Andrea Horwath, her best shot at convincing some Liberal voters to switch side (to block Ford) is more likely during the debate next week.

That's all for now. Here below are the detailed projections.

Yes the NDP can win... but that's nothing new

Polls haven't moved much over the last week. At least not the average. Let's face it, except for the first few days, this has been a fairly boring election as far as polls are concerned (or for out of province neutrals like me). With that said, the latest numbers from Abacus were showing a much different race with the NDP just 1 point behind the PC. Beyond the horse race numbers, Abacus has been doing an excellent job at providing a better insight into this campaign (if you want deeper polls, Abacus and Innovative have been the best so far). One of the points of David Coletto, Abacus' CEO, is that the NDP can indeed win. In this post, I'll argue that while this is true, this isn't something new. The NDP has always had a high ceiling.

Before going further however, here below are the most up to date projections. They are based on the polls published up to Monday evening. That includes naturally the above mentioned Abacus poll as well as others, such as the most recent numbers from the Mainstreet daily tracker.

Voting intentions; Seat projections with confidence intervals; Chances of winning the most seats
The riding by riding projections can be found at the end of this article.
Nothing really different from last week. The NDP isn't rising anymore, at least not if we use the average of recent polls. The PC is stable after having dipped slightly below 40% while the Liberals are surprisingly resilient at around 23-25%. With that said, the projected seat total for the Tories keeps decreasing and is possibly getting close to a situation where a majority would be far from guaranteed. Right now, the odds of a PC majority are around 80%.

It mostly depends on the GTA. If the polls are right and Ford is leading in both the 416 and 905, then a majority is highly likely (close to 100% really). But there is no denying that we have moved from a situation of a potential super majority for the PC to one where a minority is far from impossible (and would open a can of worms depending on the actual results). Please remember that my model does account for the regional polling average but doesn't use it directly. If it did, the Liberals would be lower and the PC higher. Just mentioning it so that you can compare projections and understand them better.


Alright, back to the main topic of the day: can the NDP win?

The projections above are indicating that yes, it's possible, albeit unlikely. But that isn't what David Coletto was really referring to. He was mostly talking about how the NDP has a high vote ceiling. Look at the graph below showing a ceiling of accessible voters of 67%! This is higher than any of the other parties.

Source: Abacus Research

Andrea Horwath, the leader of the Ontario NDP, is also the most popular of the three leaders and the one with, by far, the lowest percentage of people disliking her (she does have a ton of "neutral" and "don't know" though).

In an election whose main theme is clearly "change", and where the leader of the current top party is highly divisive, this should naturally put the NDP in a good position.

I am not disagreeing with Abacus and Coletto here. I think their top line numbers are a little bit off compared to the average but everything else makes sense. What I'm arguing however is whether having a high ceiling is really such a good thing.

On one hand you can see it as a positive since many people are willing to vote for you. On the other, it means a small percentage of these people are actually currently voting for you! Using only the numbers from Abacus, it means Ford is currently convincing 65% of his potential voters to support him and his party. The Liberals have a conversion rate of 57% and the NDP only 51%.
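To make the conversion-rate arithmetic concrete, here is a minimal sketch. Only the 67% NDP ceiling and the 51% conversion rate come from the numbers quoted above; the function names are mine, and the implied vote share is simply what those two figures multiply out to.

```python
def conversion_rate(current_share, ceiling):
    """Fraction of a party's accessible voters who currently support it."""
    if ceiling <= 0:
        raise ValueError("ceiling must be positive")
    return current_share / ceiling


def implied_share(ceiling, rate):
    """Current vote share implied by a ceiling and a conversion rate."""
    return ceiling * rate


# Figures quoted in the post: NDP ceiling of 67%, conversion rate of 51%.
ndp_ceiling = 67.0
ndp_rate = 0.51

ndp_share = implied_share(ndp_ceiling, ndp_rate)
print(f"Implied NDP vote share: {ndp_share:.1f}%")
```

In other words, a 67% ceiling converted at 51% is only about 34% of actual voting intentions, which is why a high ceiling alone doesn't win seats.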

See, the thing on election day is that you don't win seats based on potential voters but as a function of actual votes cast for you! The NDP has a structural problem at converting potential support into actual votes.

This isn't new. Using an Abacus poll from 2011, I had calculated what I called "volatility ranges". See those as margins of error that account not only for sampling uncertainty but also for the fact people are uncertain about their choices and can therefore change their mind. I reproduced the ranges here below, from the article:
Liberal
30%-40%
PC
29%-37%
NDP
22%-33%

As a reminder, the Liberals ultimately got 37.7% of the vote, the PC 35.5% and the NDP 22.7%. The same poll, when asking the traditional question "who would you vote for?" had 38-36-23, so the poll was accurate.

As you can see, there were a lot of people already saying they could vote NDP. But few, relatively, who did. In the survey, respondents were asked to rate, on a scale from 1 to 10, how likely they'd be to vote for each party. What was obvious was that the same likelihood for two parties didn't really mean the same thing. For instance, among the people who were saying 6 (out of 10) for the Liberals, 55% would also choose the Liberals in the more traditional "who would you vote for?" question. On the other hand, only 26% of the 6s for the NDP would be converted in the traditional question. The NDP had a lower conversion rate at every likelihood level.

The NDP therefore ultimately scored at the bottom of its range while the Liberals and PC were at the top. It finished a distant third when, looking at the potential, it could actually have finished first!

Let's be clear here, I'm not saying the situation in 2011 was identical to the one now. The NDP was clearly the third party back then and didn't have much chance of winning, it was simply an unlikely possibility. But it already had a relatively high ceiling, one that was much higher than its actual share of votes.

The current situation is qualitatively similar with the NDP having a high ceiling (much higher though) but a relatively low conversion rate.

The challenge for the NDP is to finally turn more of these potential voters into actual ones. The recent Abacus poll was showing interesting stuff. For instance, voters would support the NDP more after being told this party was actually ahead of the Liberals (and therefore had a better chance to beat the Tories). The problem is it's easy to inform a few hundred respondents to a survey, it's harder to let the entire province know. Remember, people like you, reading my blog, aren't really representative of the average voter. This is why Horwath is trying so hard to tell voters it's between her and Doug Ford.

So we'll see if the NDP can climb higher in the next two weeks. The potential is there, now it's up to this party to convert. The NDP can go and get votes from the other two parties (mainly the Liberals), but it also has to make sure it gets its fair share of the roughly 10% of undecided voters.


Riding by riding projections as of May 22nd 2018:

Best, worst and other scenarios as of day 12 of the 2018 Ontario election

Alright, it's not super easy to come up with articles to write when we get few polls and the overall situation isn't really changing. Still, let's look at some of the possible scenarios based on the most recent polls and projections. This will also allow us to see how much uncertainty there is currently.

I added the new Ekos poll to the average as well as yesterday's Mainstreet tracker (for the daily tracker, I add the new release each day and remove the one from the day before, otherwise I'd be double counting some of Mainstreet's data).

As I was saying, the overall situation is mostly unchanged. Maybe the only thing worth mentioning is how the Liberals have stabilized. If anything they seem to have slightly rebounded (in terms of votes, not seats however). This means the NDP is still far from being a true challenger to the PC of Doug Ford.

Voting intentions; Seat projections with confidence intervals; Chances of winning the most seats


When it says "chances of winning" above, it is referring to the chances of winning the most seats. Still, in our electoral system, remember that you can technically win the most seats but not "win" and become Premier. Obviously, in the current election, that could only happen if Ford was to fail to get a majority. The graph below shows you the various scenarios.


The "others" is really just one scenario: a tie between the PC and the NDP. This occurred 2 times out of 10,000 simulations.

Also, no, I didn't forget about the OLP. This party simply can't win based on the current information (polls, etc). Not even a minority. The Liberals can, however, be tied with the NDP (22 times out of 10,000). I understand some of you might say "but polls have been wrong before" and they'd be right. But my simulations already include very big variations. Also, the probabilities have proven to work in past elections. So no, I really believe the Liberals would have no chance if the election was tomorrow.
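For readers wondering how counts like "2 times out of 10,000" turn into the scenarios above, here is a minimal sketch of the tallying step only. It is not the actual model: the four simulated elections are made up, and the function names are hypothetical. The only figure taken from the post is the 124 seats, which puts a majority at 63.

```python
from collections import Counter


def most_seats_outcomes(simulations):
    """Share of simulations in which each party (or tie) has the most seats."""
    tally = Counter()
    for seats in simulations:
        top = max(seats.values())
        leaders = sorted(p for p, s in seats.items() if s == top)
        # A tie for first place is recorded as its own outcome.
        key = leaders[0] if len(leaders) == 1 else "tie: " + "/".join(leaders)
        tally[key] += 1
    n = len(simulations)
    return {outcome: count / n for outcome, count in tally.items()}


def majority_probability(simulations, party, total_seats=124):
    """Share of simulations in which `party` wins a majority of seats."""
    threshold = total_seats // 2 + 1  # 63 of 124
    wins = sum(1 for seats in simulations if seats[party] >= threshold)
    return wins / len(simulations)


# Hypothetical simulated seat counts (each sums to 124), for illustration only.
sims = [
    {"PC": 70, "NDP": 40, "OLP": 14},
    {"PC": 62, "NDP": 55, "OLP": 7},
    {"PC": 58, "NDP": 58, "OLP": 8},
    {"PC": 55, "NDP": 60, "OLP": 9},
]

outcomes = most_seats_outcomes(sims)
pc_majority = majority_probability(sims, "PC")
print(outcomes, pc_majority)
```

Note that the second made-up simulation is a PC plurality without a majority, which is exactly the "most seats but maybe not Premier" case discussed above.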

The possibility of a PC minority is interesting. What would actually happen depends on many factors. Would the NDP try to get the support from the Liberals? All the leaders currently say no to a coalition but if the PC was to win a small minority, it wouldn't be surprising to see a possible change of attitude from Horwath for instance. Again however, it'd depend on many factors.

Which leads us to the question: can the PC actually win a small minority only? Yes, it's possible. Here below is the distribution of the possible number of seats for each party.


The absolute worst scenario for the PC is at 40 seats. The chances of winning 50 seats or fewer are less than 1%! So right now, even in a scenario where the polls had overestimated the Conservatives and this party had an inefficient vote, it'd still likely win at least 50 seats. At 50 and more, it gets more and more difficult for the NDP and OLP to try to prevent Ford from governing.

In other words: this election currently has the least amount of uncertainty I've seen in a long time. Look at the distributions, they barely overlap. This means we have a high confidence that, if the election were tomorrow, the PC would finish 1st, the NDP 2nd and the Liberals 3rd. Of course, as usual, the election isn't tomorrow.

Speaking of best and worst scenarios, for the NDP these are, respectively, 67 and 21 seats, while it's 42 and 0 for the Liberals. The Greens could at best get 1 seat (chances are less than 1%).

Finally, here are the riding by riding projections.



Is the left/progressive vote splitting each other and giving Ford the win?

With the Progressive Conservative party of Doug Ford continuing to lead comfortably in the polls (and in the corresponding projections), one topic of discussion is how the Liberals and NDP are just splitting each other. After all, the PC is about to win a majority with only 40% of the vote.

Is that true? In this post, I'm not going to argue about whether the OLP and NDP are similar or have similar policies. I'll simply say this: polls have shown that these two parties do seem to share some of the same voters.

The point of this post is more straightforward: would Ford and the PC really lose if the OLP and NDP were together? Are these two parties really splitting the vote?

In order to answer this question, I'm using the second choices from the Mainstreet and Innovative polls. Surprisingly, they show very similar second choices. They both agree that the main second choice of Liberal voters is the NDP (at around 50%) while the OLP is the main second choice of NDP voters (at around 40%). Interestingly (but not used in the calculations for this post), the NDP is also the main second choice of PC voters. See the full details at the end of this post.

At first, it seems that the people talking about vote splitting are right: there are currently 53 ridings where the Tories are projected to win but the total of OLP+NDP is greater. That would leave Ford with fewer than 30 seats and far away from the job of Premier (instead of the 79 seats in the most recent projections).

The mistake, however, is to assume that 100% of Liberal voters would agree to vote NDP (and vice versa). In reality, some Liberals would rather vote PC or not vote at all. Same for the NDP voters. This is a crucial distinction to remember. Just because two parties share some of the same voters doesn't mean all these voters would be okay voting for the other one.

I therefore used the second choices to redistribute the votes in two scenarios: one where the OLP wouldn't exist and one where the NDP wouldn't. The table below shows the results.

Number of seats won by each party


As you can see, if the OLP was to disappear and "merge" with the NDP, this would lead to a very competitive race but the PC would still win (a short majority, which is logical with only two parties winning seats). On the other hand, if the NDP was to withdraw, the Liberals would only pick up enough voters to win 35 seats. The main reason why the Liberals would perform much worse alone than the NDP would is that the share of NDP voters willing to vote Liberal (as 2nd choice) is smaller than the share of Liberal voters having the NDP as 2nd choice.

In both cases, Doug Ford would still win. And keep in mind these two scenarios are already optimistic as I'm assuming the remaining party (the NDP when the Liberals wouldn't exist for instance) would retain 100% of its votes.
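The difference between naively adding OLP+NDP and redistributing by second choices can be sketched for a single riding. This is only an illustration under stated assumptions: the riding's vote counts are hypothetical, the second-choice shares are the rounded figures quoted above (50% NDP, 15% PC), Liberal voters with any other second choice are assumed to stay home, and `remove_party` is a name I made up.

```python
def remove_party(votes, removed, second_choices):
    """Redistribute the removed party's votes according to second choices.

    votes: party -> votes in one riding.
    second_choices: party -> fraction of the removed party's voters who
        name that party as second choice. Voters not covered by these
        fractions are assumed not to vote (an assumption of this sketch).
    """
    result = {p: float(v) for p, v in votes.items() if p != removed}
    for party, fraction in second_choices.items():
        if party in result:
            result[party] += votes[removed] * fraction
    return result


# Hypothetical riding where the PC leads but OLP + NDP is larger.
riding = {"PC": 20000, "NDP": 14000, "OLP": 10000}
olp_second_choices = {"NDP": 0.50, "PC": 0.15}  # rounded figures from the post

naive_total = riding["NDP"] + riding["OLP"]  # the "100% transfer" shortcut
merged = remove_party(riding, "OLP", olp_second_choices)
winner = max(merged, key=merged.get)

print(naive_total, merged, winner)
```

In this made-up riding the naive sum beats the PC, yet once only half of Liberal voters move to the NDP (and some move to the PC), the PC still comes out ahead. That is the mechanism behind the table above.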

So, is the left splitting each other and giving Ford the win? The short answer is currently no. The Conservatives are winning because they are by far the main party at 40%. NDP and OLP are splitting each other partially but this isn't the reason why Ford might end up Premier next month.

That said, there is no denying the electoral system is playing a role here. With a proportional system, Ford wouldn't be able to win a majority. But that's another discussion.


By the way, here are the average second choices used in these calculations.

Source: Mainstreet and Innovative polls
As a side note, the fact that about 30% of PC voters have the NDP as second choice could be a source of great swings in the next few weeks. It's not unreasonable to think some of these voters are currently voting PC just because this party is seen as the main option to defeat the Liberals. I talked about this here.

What would it take for the Ontario election to be competitive?

Let's face it, as it stands, the Ontario election isn't a nail biting contest. If the election was tomorrow, anything but a Doug Ford victory would be a monumental surprise. Here below are my most updated projections to illustrate my point. Not much has changed since yesterday except that I now include the daily Mainstreet tracker (but I will obviously not reveal the actual numbers; consider subscribing if you are interested, for $30 you'll get daily new numbers on their site). I have also fixed some small typos and mistakes that people have noticed.

Voting intentions; Seat projections with confidence intervals; Chances of winning the most seats

The riding by riding projections are available at the bottom of this post.



So, how far are we from a competitive election? The answer is actually closer than what you think. Or what the projections above might make you believe.

See, the projections and probabilities aren't telling you the odds that the situation will change. They're telling you the odds for today, based on the information available. So when I say the PC has a 99% chance of winning, this is if the election was tomorrow. The way you should interpret it is that currently, the only way for the Tories to lose would be for the polls to be incredibly off.

But things will likely change between now and June 7th. Scandals, debates, attacks, etc. All of these will likely change the opinion of some voters.

Who are those voters who could change and, if they do change, would it make the race more competitive? Let's look at them.

Based on the (very long and detailed) Innovative poll, it is clear that PC voters are the least likely to change their mind. On the other hand, NDP and Liberals show less certainty with their current choice. Specifically, almost a majority of OLP voters -48%- are actually likely to switch! Mainstreet shows similar numbers, at least qualitatively speaking.

The NDP also has the particularity of being the main 2nd choice of both Liberal and Conservative voters.

Bottom line, there are mostly two groups of voters that could still change: Some OLP-NDP swing voters (the bigger group) and some PC-NDP swing voters. Here below is the graph from the Innovative poll. For the former, there is a key sub-group that consists essentially of people who want change but still think the Ontario Liberals are the best to form the government. This is a key group Andrea Horwath needs to convince to switch.



So, how many of these voters need to actually switch for the NDP to have a chance? Let's play with the numbers.

Both Mainstreet and Innovative agree that about 50% of Liberal voters have the NDP as second choice (and 15% the PC). Both pollsters also agree that about half of them are likely to change. Let's be a little bit conservative and assume that 30% of Liberal voters will switch. That would leave this party with about 17% of the votes. This is a little bit more than how many are defined as the "core ON Liberals" (people who don't think it's time for change and like the current government, so these voters will not change their vote). It does mean most of the "time for change but think OLP is best" group would switch however.

The NDP would receive 75%*30% of the OLP voters, thus around 5.5 points. The 75% is roughly 50%/(50%+15%), because 50% of OLP voters have the NDP as second choice and 15% have the PC; I'm assuming here that no OLP voters would switch to not voting, so I need to normalize. The PC would also receive some votes (as some OLP voters do have the PC as second choice), around 1.5 points.

That would leave us with the following situation, roughly:

OLP: 17%
PC: 41.5%
NDP: 35%

That still isn't a competitive race. The PC would still be favourite. But the dynamic and coverage of the race could be very different. The NDP could now be seen as a real challenger, something that isn't fully the case for everyone right now. And this could mean that some people who hate the Liberals and really want change but are currently voting PC (seen as the best way to get rid of Wynne) could change their mind. Call it a bandwagon effect.

What if 50% of Liberal voters switch instead?

OLP: 12%
PC: 42.5%
NDP: 38.5%

If we assume that the Liberals would only lose voters to the NDP, then it'd be PC at 40% and NDP at 38.5%, a very close election.
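The arithmetic behind these two scenarios can be sketched as follows. The starting shares (OLP 24%, PC 40%, NDP 29.5%) are not stated in the post; they are my assumption, back-derived from the rounded results above, and the function name is hypothetical. The 50% and 15% second-choice figures are the ones quoted earlier.

```python
def switch_scenario(shares, switch_fraction, ndp_second=0.50, pc_second=0.15):
    """Move a fraction of OLP voters to the NDP and PC by second choice.

    Second choices are normalized so that every switching Liberal goes to
    either the NDP or the PC (no one switches to not voting).
    """
    to_ndp = ndp_second / (ndp_second + pc_second)  # roughly 75%
    moved = shares["OLP"] * switch_fraction
    return {
        "OLP": shares["OLP"] - moved,
        "PC": shares["PC"] + moved * (1 - to_ndp),
        "NDP": shares["NDP"] + moved * to_ndp,
    }


# Assumed starting point, back-derived from the post's rounded results.
assumed_start = {"OLP": 24.0, "PC": 40.0, "NDP": 29.5}

for fraction in (0.30, 0.50):
    result = switch_scenario(assumed_start, fraction)
    line = ", ".join(f"{party}: {share:.1f}%" for party, share in result.items())
    print(f"{fraction:.0%} of Liberal voters switch -> {line}")
```

Under these assumptions the output lands within about half a point of the rounded figures in the two scenarios above; the small gaps come from rounding (75% versus 50/65) and from the assumed starting shares.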

So here you have it, assuming the PC voters won't change opinion easily (and evidence indicates that much), in order for this race to be competitive, we need basically one third of Liberal voters to switch. We need the Liberals to fall to their absolute base or core. Anyone who doesn't think this government has been doing a great job and thinks it's time for change needs to vote for something other than the OLP.

Is it likely? Well, when asked, Liberal voters say there is a 50% chance they will change their mind. Can we interpret that as there is a 50% chance we will have a competitive race by the end of the campaign? I think this is pushing it. I think many people always overestimate their likelihood to change. But it's definitely far from impossible.

And let's remember that there are other scenarios that lead to a competitive race (instead of only Liberal voters switching to the NDP, we can have a mix of Liberal and PC voters). The only real difficulty in having a competitive race is the fact the PC is at 40% and its vote appears to be solid. It's hard to get two parties around 40% as it requires the third one to collapse.

My guess? Well I'm terrible at guesses usually but I believe the Liberals will likely keep dropping in the next 1-2 weeks and the NDP will keep increasing. We will get to a point where we'll be talking about the possibility of a competitive race (think PC at 39% and NDP at 33%). After that, no idea if the trend will continue or not. It'll depend on many factors.

I'll say this: it doesn't take a crazy scenario or insane assumptions for this race to be competitive. A fairly straightforward reading and application of the 2nd choices and chances to change their mind actually leads to it.


Projections update, May 17th 2018: Liberals continue to fall

Just a quick update for the projections. No wall of text or analysis for now.


As you can see, the Conservatives of Doug Ford are still well ahead and in a very comfortable position. But we see more and more signs of a rally of progressive voters behind the NDP.


Riding by riding projections here:

Note: there was a mistake where the Liberals were at -2.1 in one riding. It's obviously an error. I corrected it. It's now at 0. And yes I know the OLP won't literally be at zero but don't take it at face value, take it as "the Liberals are super low there". Thanks to the Reddit user who spotted the mistake.

I have realized that I haven't added the bonus for Doug Ford in his own riding. Not sure why I keep forgetting but it'll be updated next projection (after the next poll).

Vote efficiency in Ontario

As the Ontario election goes on and the rise of the NDP was (kinda) confirmed by the Ipsos poll on Wednesday, now is as good a time as any to talk about vote efficiency. I'm referring to how successful parties are at converting votes into seats in our current electoral system.

I have to admit that when I started writing this post, I thought it'd be very easy. But as I began writing and thinking about it, I realized that defining and measuring vote efficiency isn't as straightforward as I thought. My first instinct was to look at a measure such as votes per seat (or seats per vote). That kinda works but it left me dissatisfied. This works to compare parties within the same election but not across elections (as the number of votes changes with turnout).

I also quickly realized that the real question wasn't so much which party was more or less efficient at their current vote level, but overall. What I mean here is that a measure such as votes per seat will obviously show that the NDP at 19% is less efficient than the Liberals at 38% with a majority. But that isn't saying much. Our electoral system is such that this result is expected. The better question is more: what if the NDP was to reach 38% and the Liberals fall to 19%? Would the NDP win more or fewer seats than the Liberals had?

So here is what I came up with:

1. Vote efficiency is about getting an optimal distribution of the vote over the map
2. Ideally, a party would want not to waste any vote. That means that as soon as you have one more vote than the runner-up in a riding, you move the rest of your votes to another riding. Winning by 1 vote or 5000 has the same result, except that you wasted 4999 votes in the second case.

Therefore, for every election since 2007, I looked at how each party could have distributed its votes in order to maximize the number of seats won, keeping the distribution of votes of the other parties constant. This isn't very difficult to do. Taking the Liberals as an example, all I did was sort the ridings by the number of votes required to win (after removing the Liberal votes). For instance, imagine one riding where the PC finished second (behind the Liberals) with only 5000 votes, then the OLP would allocate 5001 votes to this riding and move on to the next one. You keep doing this until your party runs out of votes.

With this method, vote efficiency is defined as the ratio of seats actually won to the maximum number of seats that could have been won with an optimal distribution of the votes.

Note that for some years, it does mean a party's optimal distribution would have resulted in this party winning all the seats. This is the case in 2014 for the Liberals for instance. Since they "only" won 58 seats, this gives us an efficiency of 58/107=0.54 (or 54%).
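The allocation described above is a simple greedy procedure, sketched below. This is an illustration rather than the exact code behind the table: the function names and the four ridings are hypothetical, and the sketch ignores practical limits such as the number of eligible voters in a riding.

```python
def max_seats(total_votes, rival_votes_by_riding):
    """Most seats a party could win by freely redistributing its votes.

    rival_votes_by_riding: for each riding, the vote counts of the other
    parties (held constant). Winning a riding costs one more vote than the
    strongest rival there, so the cheapest ridings are taken first.
    """
    costs = sorted(max(rivals) + 1 for rivals in rival_votes_by_riding)
    seats = 0
    remaining = total_votes
    for cost in costs:
        if cost > remaining:
            break  # every remaining riding is at least this expensive
        remaining -= cost
        seats += 1
    return seats


def vote_efficiency(seats_won, total_votes, rival_votes_by_riding):
    """Seats actually won divided by the optimal-distribution maximum."""
    best = max_seats(total_votes, rival_votes_by_riding)
    return seats_won / best if best else 0.0


# Hypothetical example: four ridings, rival vote counts in each.
ridings = [(5000, 3000), (12000, 4000), (8000, 7000), (15000, 2000)]
party_votes = 26000

best = max_seats(party_votes, ridings)  # costs: 5001, 8001, 12001, 15001
efficiency = vote_efficiency(2, party_votes, ridings)
print(best, round(efficiency, 2))
```

Because every seat is worth the same and each has a fixed cost, taking the cheapest ridings first gives the maximum. The 2014 Liberal figure quoted above is the same ratio with 58 seats won against a maximum of 107.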

The table below summarizes the findings. It also includes the votes per seat measure (which correlates with my fancier method).



As you can see, the Liberals have been the most efficient party since 2007. This isn't surprising, they won the last 3 elections and got the most votes in all of them.

There is obviously a relationship between your percentage of votes (province-wide) and the efficiency of this vote. Again, this is naturally due to our electoral system.

The graph below shows this relationship. It also illustrates where each party stood for every election. As Ontario didn't experience important swings since 2007 (this is about to change this year it seems), all three parties' data points are close to each other for the 3 elections. This is unfortunate as it would be interesting to see some crossovers.



This graph clearly shows the "money zone" where each extra percentage point turns into many more seats. The slope of the relationship picks up around 25%, which is the usually accepted threshold with FPTP.

Finally, this graph shows that the NDP, while getting overall fewer votes and lower efficiency, is actually doing quite well for the province-wide percentages it got. On the other hand, the Conservatives seem to have slightly underperformed for their vote level. This is most likely due to the GTA, where the Liberals have been incredibly successful over the last decade.



Let's look at other measures (direct or indirect) of vote efficiency.

1. General distribution of the votes

The NDP has had a higher standard deviation of its vote compared to the Liberals since 2007 (14 pts versus 12 in 2014 for instance). The standard deviation of the NDP was actually the highest of the top three parties in 2 of the last 3 elections (NDP and PC were very close in 2011). This is particularly visible if you look at the variations across regions. While the Liberals were between 23% and 49% on average in 2014 in the 10 regions of the model, the NDP varied from 13% to 45%.

The NDP therefore has a vote that is fluctuating a lot between regions. Very concentrated in some regions (the North, the Southwest and Hamilton) while being very low in others (Central, Ottawa, the 905). And that could actually explain the more narrow possible distribution of seats for this party: there are regions where the NDP needs to increase substantially before being able to win seats.

In general, we say that small parties want to have a concentrated vote while big parties want it widespread. The NDP, at 20%, has done well with concentrating its votes in some regions. But if Andrea Horwath wants to become Premier, she had better hope her new votes are in the "right" regions.

On that note, the NDP could actually have quite an inefficient vote this time around if its increase is concentrated in the city of Toronto. The Liberals dominated there in 2014 and there is a chance that despite the fall (and the NDP's rise), most seats will still go to the Liberals.
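As a rough illustration of what "concentrated" versus "widespread" means in terms of standard deviation, here is a small sketch. The regional shares below are entirely hypothetical (10 regions, same average for both parties), and using the population standard deviation is my assumption; the post doesn't say which variant is used.

```python
from statistics import mean, pstdev

# Hypothetical average vote shares (%) in 10 regions. Both parties have the
# same overall average, but very different spreads across regions.
concentrated = [13, 15, 18, 20, 22, 25, 30, 38, 42, 45]
widespread = [23, 24, 25, 26, 27, 27, 28, 29, 29, 30]

for name, shares in (("concentrated", concentrated), ("widespread", widespread)):
    print(f"{name}: mean {mean(shares):.1f}, "
          f"std dev {pstdev(shares):.1f}, "
          f"range {min(shares)}-{max(shares)}")
```

With the same average, the concentrated party is far ahead in a few regions and out of contention in several others, while the widespread party is within reach everywhere.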


2. Margins of victory in ridings won

Ideally, you want your margins to be as small as possible otherwise it means you are wasting votes.

In 2014, the NDP won its 21 seats with a margin of 20 points on average (they completely dominated the North for instance as you can see here). The Liberals only needed 17 points while the PC was at 15 points.
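For clarity on how such an average margin is computed, here is a minimal sketch. The three ridings and their vote shares are hypothetical and the function name is mine; the margin is simply the winner's share minus the runner-up's share, averaged over the ridings a party won.

```python
def average_winning_margin(ridings, party):
    """Average margin (in points) over the ridings won by `party`.

    ridings: list of dicts mapping party -> vote share (%) in one riding.
    Returns None if the party won no riding.
    """
    margins = []
    for shares in ridings:
        ranked = sorted(shares.items(), key=lambda item: item[1], reverse=True)
        (winner, first), (_, second) = ranked[0], ranked[1]
        if winner == party:
            margins.append(first - second)
    return sum(margins) / len(margins) if margins else None


# Hypothetical ridings, for illustration only.
example_ridings = [
    {"NDP": 55, "PC": 25, "OLP": 20},  # NDP wins by 30
    {"NDP": 40, "PC": 30, "OLP": 30},  # NDP wins by 10
    {"NDP": 20, "PC": 45, "OLP": 35},  # PC wins by 10
]

ndp_margin = average_winning_margin(example_ridings, "NDP")
pc_margin = average_winning_margin(example_ridings, "PC")
print(ndp_margin, pc_margin)
```

A large average margin means many votes beyond what was needed to win, which is the "wasted votes" idea from the efficiency definition.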


3. Close races (margin of victory under 5 points) won and lost

If you are well organized and know how to target the right riding (and get the vote out), your efficiency will increase.



No clear pattern here, other than the improvement for the NDP after 2007. The Liberals used to win more close races than they lost, except in 2014. But they still won a majority that year with less than 40% of the vote.

4. What if all three parties were around 31%?

Finally, let's use the model and simulations to see what the predictions would be if there was an almost perfect split between the top 3 parties (the Greens are left at around 5%).

In this scenario, the chances of winning would be

OLP: 51.8%
PC: 26.3%
NDP: 18.6%

Similarly, the 95% confidence intervals for the seats would be:

OLP: 29-59
PC: 28-54
NDP: 30-51

This confirms the advantage the Liberals have. After that, it's pretty close between Conservatives and New Democrats. Let's not forget those are fully hypothetical simulations. We notice again the smaller range for the NDP.


Conclusion

Lots of numbers and measures in this post. At the end of the day, I think the main result is the strong vote efficiency of the Liberals in Ontario. But this efficiency was highly dependent on a) being the top party and b) the GTA. Based on the recent polls, it seems fair to say both points aren't valid this year. Both the PC and NDP have okay vote efficiency. They don't waste a ton of votes in one region (like the provincial Liberals used to in the English part of Montreal and therefore needed about 4-5 points more than the PQ to win an election). So as far as winning the most seats, both the PC and NDP can do it. If the race was to become more competitive between PC and NDP, I think the key would simply be: who will win the GTA? And if we observe a massive migration of voters from the Liberals to the NDP, could this party also inherit the high efficiency of the Liberals? Time will tell.