So, who is leading in the referendum on electoral reform?

Alright, it's almost over! In exactly one week, Elections BC will stop accepting ballots and will start counting them. Turnout is now at 37% (received ballots; it's 31.1% of processed ballots). So the good news is that the final turnout will be decent. More importantly, by finishing above the 40% mark, it ensures that Liberal leader Andrew Wilkinson won't be able to call the result illegitimate. Now the question is really: who will win?

The turnout is actually still ahead of the one for the HST referendum:


That said, we are closer to the deadline than we were at this point of the HST vote. If we instead look at the number of days remaining, we are slightly behind:

It seems a finish around 45% is likely and this is much higher than I'd have thought.

Here below is the current turnout by riding. Remember that some regions received their ballots later and/or were more affected by Canada Post's strike.





So, who is likely winning?

Throughout this campaign, I have tried to estimate the number of votes cast for the YES and NO sides. Ever since I switched to using the turnout by age provided by Elections BC, I've had the NO side ahead but with a decreasing lead. It has been quite stable the last few days.



Remember that these are estimates. They are subject to margins of error, in particular from the polls they are based on (technically I'm using turnout, demographic and polling data to provide these estimates).

So first, let's see how sensitive these estimates are to a change of specifications.

1. What if we apply some margins of error to the polling averages?

I'm using the four polls published during this campaign (from Angus-Reid, Mainstreet, Insights West and Research & Co.; I'm not using my private Google Survey even though the results were very similar, mostly because I don't have as much detail regarding the location of the respondents).

But those are polls, and they have margins of error even when we aggregate them. The issue here is that I'm not sure how accurate these polls are. For federal or provincial elections, I have done the research and know the empirical margins of error. For referendums however, we don't have many data points. In general polls have done quite well for the 2009 and 2011 referendums and the 2015 transit plebiscite, but it doesn't mean it'll always be the case. The number of undecided voters this year is much higher as well.

These polls have also been remarkably similar. So let's use the standard error of these four polls and create an interval by adding or subtracting 1.96 times this standard error. That gives us margins of error of plus or minus 1.5 percentage points (and more by age or regional breakdown, but let's ignore this for now; errors are likely to be correlated anyway. For instance if the polls underestimated the YES in the Lower Mainland, they'd have done the same on the Island).
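Here is a minimal sketch of that interval calculation. It assumes "standard error" means the standard error of the mean of the four polls, and the poll numbers are placeholders I made up for illustration, not the published results.

```python
import statistics

def poll_interval(polls, z=1.96):
    """Return (mean, lower, upper) for a list of poll results in %."""
    mean = statistics.mean(polls)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    std_err = statistics.stdev(polls) / len(polls) ** 0.5
    margin = z * std_err
    return mean, mean - margin, mean + margin

# Placeholder YES shares (%), NOT the published poll numbers.
hypothetical_polls = [48.0, 49.5, 50.0, 48.5]
mean, lower, upper = poll_interval(hypothetical_polls)
print(f"average {mean:.1f}%, interval from {lower:.1f}% to {upper:.1f}%")
```

With only four polls this is a rough interval at best, which is one more reason to treat the bounds as indicative.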

Ok, so let's re-run the estimates above but using either the upper or lower bounds for the support for PR.

% for PR with upper bound: 50.4%
% for PR with lower bound: 47.4%

So the YES can win but the interval shows it's not the most likely outcome.


2. How important is the age turnout?

These estimates work this way: using the turnout by age of 2017 (thus telling me the percentage of voters aged 18-34 in each riding for instance) and the region, I look at the number of votes in one riding and "assign" them to the YES or NO based on the demographic data (the age thingy) as well as the levels of support for this age group in the polls.
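To make the mechanics concrete, here is a rough sketch of the age-based half of that assignment for a single riding. The function name, the riding and the age shares are hypothetical; the only number taken from the polls is the 37% YES among the 55+ (the flip side of the 63% who want to keep FPTP).

```python
def estimate_yes_votes(ballots, age_shares, yes_support):
    """Split a riding's ballots across age groups, then apply poll support."""
    return sum(ballots * age_shares[g] * yes_support[g] for g in age_shares)

# Hypothetical riding with 10,000 ballots returned so far.
ballots = 10_000
# Made-up 2017 turnout mix by age for this riding (must sum to 1).
age_shares = {"18-34": 0.20, "35-54": 0.30, "55+": 0.50}
# YES support by age; the 18-34 and 35-54 values are placeholders.
yes_support = {"18-34": 0.65, "35-54": 0.50, "55+": 0.37}

yes_votes = estimate_yes_votes(ballots, age_shares, yes_support)
print(f"estimated YES share in this riding: {yes_votes / ballots:.1%}")
```

The regional version works the same way with regional support numbers in place of the age ones.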

The issue here is that around 50% of voters in 2017 were actually above 55 (this is a crazy stat as the 55+ do not represent remotely close to 50% of the population). Given that polls show a majority of the 55+ want to keep FPTP (63% exactly), this gives the NO side a huge boost.

But here's the thing: polls always weight their samples based on census data, not turnout data. At first it seems like a perfect recipe to be wrong (since you'll underestimate the impact of the 55+ and overestimate the 18-34 for instance) but in practice it has been relatively fine. Look at the polls for the BC election last year. They did a fairly good job. If you re-weight the polls with the age-turnout data, you usually get a higher number for the Liberals than what the polls showed. This is typical in pretty much every single election where one party does better among the 55+. But again, in real life, we don't observe such parties being systematically underestimated.

What I'm saying here is that my method above might actually be quite harsh against the YES side.

There is also the issue that regressions (see below or previous blog posts) have shown a clear trend: the 18-34 are voting more than in 2017 while the 35-54 are voting a lot less (the 55+ were voting more at first but that's not the case anymore). So there as well I might be underestimating the YES side (note: my method should technically account at least partially for this since more votes are coming from ridings with more voters aged 18-34, but I'm not making any additional adjustments such as increasing the share of the 18-34 at the riding level).

Finally, I would technically need the polling data broken down by age AND region (like what is the level of support for PR for the 18-34 in the Lower Mainland?). Polls don't provide me with this. They provide the average by age OR by region. My method above basically runs the estimation using each and then takes an average.

I contacted the pollsters to try to get the data I want (because they have it). We'll see if they nicely oblige. In the meantime, we can try one thing: let's assume the two (age and region) are independent of each other. So if we compare the levels of support for the 35-54 to the 18-34 in the Lower Mainland, we should get a proportionally similar result as comparing the 35-54 to the 18-34 on the Island for instance. This assumption will amplify the differences. For instance the 18-34 (more in favour in general) living on the Island (generally more pro-PR) will therefore be really, really pro-PR. On the other hand the 55+ in the Interior will be incredibly against it. That might actually be a good thing as I suspect that the two campaigns are motivating the more "extreme" voters (the ones really in favour of PR and those really against it).
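One possible reading of this independence assumption is multiplicative: scale the support by age by how much each region deviates from the provincial average. The sketch below does that. All the support numbers except the 37% for the 55+ are placeholders, and the formula itself is an assumption about how to implement "proportionally similar", not necessarily the exact calculation behind the graph.

```python
def combined_support(age_support, region_support, overall):
    """Build an (age, region) table assuming age and region effects multiply."""
    table = {}
    for region, s_region in region_support.items():
        for age, s_age in age_support.items():
            # Scale the age figure by the region's ratio to the overall average,
            # capped at 100% in case the product overshoots.
            table[(age, region)] = min(1.0, s_age * s_region / overall)
    return table

# Placeholder numbers for illustration only.
overall = 0.49
age_support = {"18-34": 0.65, "35-54": 0.50, "55+": 0.37}
region_support = {"Lower Mainland": 0.49, "Island": 0.55, "Interior": 0.42}

table = combined_support(age_support, region_support, overall)
for (age, region), share in sorted(table.items()):
    print(f"{age:>5} / {region:<14} {share:.1%}")
```

With these inputs the 18-34 on the Island end up more pro-PR than either the age or the regional average alone, and the 55+ in the Interior end up more opposed, which is the amplification described above.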

If I do this instead, I get the following graph:



Same trends but the YES is now ahead. Interesting. The assumption of independence is really strong (and likely invalid) but at the same time, is it worse than doing both separately and averaging? Not sure. If somebody smarter than me knows, please let me know.


3. What about new voters?

8487 people registered to vote by the Elections BC deadline. That's not a large number (it's 0.26% of the registered voters). However, it's reasonable to assume that a majority of these voters did so to vote for PR. Why would people who weren't registered decide to go and vote to keep a system they weren't even using? I'm sure there are some people like this (maybe to protect the province from all the Nazi parties that PR will inevitably bring if you listen to some people lol) but the majority of these 8487 should go to the YES side.

If we assume that 75% of these new voters will pick the YES side, then my estimated percentage for the YES increases by a little less than 0.2 percentage points. Minor? Yes, but that could well be the difference at the end.
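The arithmetic can be checked with a quick back-of-the-envelope sketch. It assumes a baseline YES share of 49% (a placeholder, not my actual estimate), that every new registrant returns a ballot, and that the number of ballots so far is roughly 37% of the registered voters implied by the 0.26% figure.

```python
def yes_share_with_new_voters(base_votes, base_yes, new_voters, new_yes):
    """YES share after adding a block of new voters to the existing ballots."""
    yes = base_votes * base_yes + new_voters * new_yes
    return yes / (base_votes + new_voters)

new_voters = 8487
registered = new_voters / 0.0026   # implied by "0.26% of the registered voters"
base_votes = 0.37 * registered     # ballots received so far (37% turnout)
base_yes = 0.49                    # ASSUMED baseline YES share

share = yes_share_with_new_voters(base_votes, base_yes, new_voters, 0.75)
gain_points = (share - base_yes) * 100
print(f"YES gains {gain_points:.2f} percentage points")
```

The gain shrinks as more ballots come in, since the same 8487 voters get diluted in a larger total.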


4. What about using the votes for each party in 2017?

Maybe instead of trying to mix age and regional data, the best and easiest method is to take the 2017 results by riding and use those as a benchmark. So a riding with a majority of votes for the BC Liberals will vote against PR. This should capture both the age and regional effects.

Unfortunately for us, most polls did not provide the levels of support for each side by political choice in 2017. Only Research Co. and Angus-Reid did so and we can see that BC Liberal voters really, really hate PR (81% in favour of keeping the current system) while NDP and Green voters are mostly in favour (around 70-75%).

So now my calculations are as follows: imagine a riding where 40% voted for the Liberals. Therefore, 40% of the votes for the current referendum will be assumed to come from BC Liberal voters and 81% of them will vote NO. The regressions below show that the 2017 turnout is a strong predictor of the current turnout, so this method might actually be quite valid.
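For a single riding, that calculation can be sketched as follows. The 2017 vote shares are made up, the other parties are simply ignored, and the 72% YES for NDP and Green voters is a placeholder picked from the "around 70-75%" range; only the 19% YES among Liberal voters follows directly from the 81% above.

```python
def yes_share_from_2017(party_shares, yes_by_party):
    """Weight each party's YES support by its 2017 vote share in the riding."""
    return sum(party_shares[p] * yes_by_party[p] for p in party_shares)

# Hypothetical riding; shares sum to 1 because other parties are left out.
party_shares = {"Liberal": 0.40, "NDP": 0.42, "Green": 0.18}
# YES support by 2017 vote; NDP and Green values are placeholders.
yes_by_party = {"Liberal": 0.19, "NDP": 0.72, "Green": 0.72}

riding_yes = yes_share_from_2017(party_shares, yes_by_party)
print(f"estimated YES share in this riding: {riding_yes:.1%}")
```

Multiplying this share by the ballots received in each riding and summing over ridings gives the province-wide estimate.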

If I use this, I get the following graph (yes I know this post has the same graph many times but that's the point of this article):



One potential problem with this method is that it doesn't account for how the other voters will vote this time (in particular the BC Conservatives). Still, we see similar patterns but here again the YES side seems narrowly ahead. The trend for PR looks quite steep but the scale is misleading as it shows results really, really close to 50%. Interestingly this is the method that gives us the closest race so far and the smallest absolute variations (it stays between 49.2% and 50.8%).


5. Regressions

What the polls have shown is that people in BC are mostly split 50-50. So using the turnout by riding so far, we can try to find patterns (and stop trying to actually estimate and count votes). Here are the results with the latest numbers:


If you've been reading my blog during this referendum, you already know that the same regressions were initially showing a positive coefficient for the share of 55+. Over time it has switched from a positive and significant effect to a negative but not statistically significant one. In the less simple model, we are actually getting close to statistical significance.

As for the regions, the Lower Mainland and the Island are above once we adjust for the number of days they had to vote (which is, again, still a determinant of the turnout even this late into the referendum, thus showing once again that Elections BC was right to extend the deadline).

The major impact of the % of people with English as mother tongue just confirms that the Vancouver suburbs of Richmond and Surrey aren't very interested in this process so far.

If I use the percentage for each party instead (rather than looking at whether the riding elected a Liberal or NDP MLA last year), then the share of 55+ is negative and significant. And in this case the higher the % for the Greens, the higher the turnout (significant at 10%) while it's the opposite for the % for the NDP (thus again showing that this party isn't getting its vote out as much, although this could be partially explained by Surrey).


Conclusion

It's still a very close race but the trend remains favourable to the PR side. Margins of error make predicting this referendum almost impossible but there are reasons to be optimistic if you want to change the electoral system.

The regressions show that the 18-34 are voting more than they have in the past while it isn't the case for the 55+, at least using aggregate data (so subject to a possible ecological fallacy - Google it if you want to know more). Regionally, the effect of when the ballots were received (and/or of the postal strikes) is decreasing and we gradually see the Lower Mainland catching up to the Interior while Vancouver Island is now firmly ahead. Both trends should be positive for PR.

On the other hand, ridings with a BC Liberal MLA continue to vote more (we see it with Vancouver Quilchena for instance) and this is good news for the current system. I heard that the BC NDP was now making phone calls to motivate their voters to get out and mail their ballots. Hopefully this isn't too little too late.

Personally I'm becoming cautiously optimistic for the YES side but I'll wait to see the final turnout to really make a call.