This morning, as it does every day, Elections BC reported how many ballots it had received and screened. The screening part is important because Elections BC started reporting last week on how many ballots it has received in total. While the current turnout is only 10.6% if we look at the ballots processed, it's 21% if we look at ballots received. A much higher number.

There are still considerable variations between ridings. The question is really: can these variations be interpreted in any way? For instance, can we see a pattern more favourable to the YES or NO side? The main issue, as I previously reported, is that not all ridings received their voting packages at the same time. Some ridings, mostly in the interior, got theirs as early as October 24th while others, mostly in the Lower Mainland, had to wait until November 2nd. Even among the large group scheduled for the 2nd, you can still observe many differences.

The graph below shows you the turnout as a function of when the ballots were scheduled to arrive. As you can see, this is converging nicely.

Any analysis must account for this. The problem is that we don't have really detailed data. What we do have, however, is the turnout by day for each riding ever since Elections BC started reporting on them (the 5th). In this article, I use this information to account for when ridings received their ballots. Specifically, I look at when a riding crossed the 1% turnout mark and then count how many days it has been since. It isn't perfect but I think it's doing a pretty good job. I don't need a measure that is super accurate. What I need is a way to sort ridings into groups that have had a similar amount of time to vote. So it doesn't matter if I estimate 3 days of voting and it was 6, as long as I'm off by the same margin for other ridings. I chose the 1% arbitrarily. I thought it was low enough that any riding that actually started voting should cross it almost immediately.
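For readers who want to try the same idea, here is a minimal Python sketch of the "days above 1%" measure. It is not the code used for this article: the function name, the data layout (a dictionary of cumulative turnout by reporting date) and the two example ridings are all made up for illustration, the year 2018 is assumed, and counting calendar days rather than reporting days is also an assumption.

```python
from datetime import date


def days_above_threshold(daily_turnout, as_of, threshold=1.0):
    """Days between the first report at or above `threshold` percent and `as_of`.

    `daily_turnout` maps a reporting date to cumulative turnout in percent.
    Returns None if the riding has not reached the threshold by `as_of`.
    """
    crossing_dates = [
        day for day, pct in sorted(daily_turnout.items())
        if pct >= threshold and day <= as_of
    ]
    if not crossing_dates:
        return None
    return (as_of - crossing_dates[0]).days


# Hypothetical ridings with invented numbers (not Elections BC data).
early_riding = {date(2018, 11, 5): 1.4, date(2018, 11, 6): 2.1, date(2018, 11, 7): 3.0}
late_riding = {date(2018, 11, 5): 0.0, date(2018, 11, 6): 0.2, date(2018, 11, 7): 1.1}

as_of = date(2018, 11, 9)
print(days_above_threshold(early_riding, as_of))  # 4
print(days_above_threshold(late_riding, as_of))   # 2
```

The absolute number of days matters less than the ordering it produces: ridings with the same count can be compared with each other on a roughly equal footing.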

Let me give you an example. The riding of Abbotsford-Mission only crossed the 1% mark on November 14th (I'm talking processed ballots here). There were ballots reported from this riding as early as the 5th (2 ballots specifically) but it took until the 14th to get enough ballots to represent 1% of registered voters. This riding was on the November 2nd schedule.

On the other hand, the riding of Boundary-Similkameen crossed the 1% mark on November 5th, the first day reported by Elections BC. This riding was scheduled for October 25th.

You can therefore understand why, even today, the turnout in Boundary-Similkameen is 19.2% while it's only 6.3% in Abbotsford-Mission. Of course the two should ultimately converge but that hasn't happened yet.

Here is an illustration of why my measure is better than simply using the date scheduled by Elections BC. Among the November 2nd ridings, 19 have been over 1% since the 14th while 12 ridings crossed the mark on November 7th, 6 days before. Some ridings, such as Kelowna-Mission, were scheduled for November 2nd but actually crossed the 1% threshold on November 5th already. Thus, as you can see, there is considerable variation even among ridings technically in the same scheduled group. Differences can come from when they received the ballots or from possibly selective processing by Elections BC (Canada Post could also have its influence).

The graph below shows you the turnout as a function of how many days the riding had to vote (again, measured as how many days the riding has been over the 1% threshold).

A quick extrapolation would indicate that the final turnout should be around 30% if things continue at the same rate. Given that we are at 21% right now (received), 30% seems like a good guess unless it suddenly picks up in the last few days. I've been expecting a final result between 20 and 30% since the beginning. It seems I might have been slightly too conservative (or pessimistic).

With this issue controlled for (well, as much as I possibly can), we can try to look at other patterns. For instance, is the Lower Mainland voting more or less than the interior? Are ridings with a higher proportion of 18-34 voting more? Of course we need to control for all other factors while doing this. This is why I used a regression. If you aren't familiar, think of it as a statistical tool looking at correlations but accounting for other factors at the same time. For instance, if I look at the simple correlation between turnout and age, some of the effects could instead come from the region. This is the case as ridings in the Lower Mainland have, on average, a younger electorate. A regression will be able to disentangle the two.
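To make the "disentangling" concrete, here is a small self-contained Python sketch. Everything in it is an assumption for illustration: the six ridings are synthetic, turnout is constructed by hand so that the true age effect is known, and the `ols` helper is a bare-bones least-squares solver, not the software used for the results in this article.

```python
def ols(X, y):
    """Least-squares coefficients, solving the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Synthetic ridings (invented numbers, not Elections BC data).
young = [20, 22, 24, 30, 32, 34]      # share of 18-34, in percent
lower_mainland = [0, 0, 0, 1, 1, 1]   # 1 = Lower Mainland, 0 = Interior
# Turnout is built as 20 - 0.1 * young - 4 * lower_mainland,
# so the true age effect is -0.1 by construction.
turnout = [20 - 0.1 * s - 4 * r for s, r in zip(young, lower_mainland)]

# Turnout on age alone, then turnout on age and region together.
naive = ols([[1, s] for s in young], turnout)
controlled = ols([[1, s, r] for s, r in zip(young, lower_mainland)], turnout)

print(round(naive[1], 2))       # about -0.46: age also picks up the regional gap
print(round(controlled[1], 2))  # about -0.1: the age effect once region is held fixed
print(round(controlled[2], 2))  # about -4.0: the regional gap itself
```

In this toy data the younger ridings are all in the Lower Mainland, so the age-only slope is more than four times too large. Adding the region variable recovers the effect that was built in.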

Without further ado, here are the results with the data of this morning:

There are two things you want to focus on. The first is the sign of the coefficients. If it's positive (like with the share of 65+), it means the higher the share in a riding, the higher the current turnout. Specifically, it means that if the share of people aged 65 and over is 10 percentage points higher, the turnout is 3.07 percentage points higher. For the number of days, you see that it's slightly less than 1 point per day, which matches the extrapolation of a final turnout around 30% given how many days there are left.

The second thing is the number of stars (*), or lack thereof, next to the coefficient. If there is none, it means the relationship isn't statistically significant. To explain it in very simple terms, it means that we can't be very confident that there is indeed a relationship there. It could just be random noise from the limited sample we have. If there is no star, disregard the coefficient and imagine it is zero (i.e., no relationship between these variables). If there is one star, it's significant at 10% (so we are confident at 90%). Two stars is 5% and three stars is 1% or less (so certainty of 99% or more).
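If you prefer to see the star convention spelled out, here is a short Python sketch. It assumes the usual thresholds of one, two and three stars for the 10%, 5% and 1% levels. The helper that turns a t-statistic into a p-value uses a normal approximation, which is a simplification: regression software normally uses Student's t distribution with the appropriate degrees of freedom. Both function names are made up for this example.

```python
import math


def two_sided_p_value(t_stat):
    """Two-sided p-value for a t-statistic, using a normal approximation."""
    return math.erfc(abs(t_stat) / math.sqrt(2))


def significance_stars(p_value):
    """Map a p-value to the star notation used in regression tables."""
    if p_value <= 0.01:
        return "***"
    if p_value <= 0.05:
        return "**"
    if p_value <= 0.10:
        return "*"
    return ""  # not significant: treat the coefficient as zero


for t in (0.5, 1.7, 2.2, 12.0):
    p = two_sided_p_value(t)
    print(t, significance_stars(p) or "(no star)")
```

A small t-statistic such as 0.5 gets no star, while a very large one gets three.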

So, what can we learn? First of all, my measure of number of days above 1%, which is there to account for the fact some ridings have been able to return their ballots for a longer period, is by far the most important variable. It is highly significant (t-stat of over 12 for my fellow nerds out there). What this means is that variation across ridings is currently mostly caused by when the ridings received the ballots and/or how fast Elections BC has been at processing the ballots from each riding. In other words: I might be wasting my time here by using preliminary data.

We do observe a couple of other significant variables. So far it seems ridings with older people (over 55) vote more. Similarly the interior is voting more than the Lower Mainland (you need to interpret the coefficients for Lower Mainland and Island as differences with the Interior). Remember, in both cases this is while accounting for the fact many ridings in the interior got their ballots sooner.

All in all, this isn't looking very good for proportional representation. Polls have shown that younger people are more pro PR. Also, the interior is less in favour of PR than the Lower Mainland or Vancouver Island. At the same time, this is preliminary. Maybe younger, urban voters are taking longer to make their choice. But right now these effects are reinforcing each other and the odds of this referendum being successful aren't really great. The YES camp needs to step up its game and convince the younger voters (as well as the undecided) to mail their ballots back.

Edit:

Someone on Twitter suggested I instead look at the change in turnout between last Friday and today as a way to determine where the new votes were coming from and possibly predict the future trend. I did run this regression. Without posting the full table, here are my findings:

Ridings that had fewer days of voting are catching up (the coefficient in front of number of days is negative). This is logical. Ridings with a higher proportion of 18-34 increased by more, and so did the ridings with more 55+. It therefore seems it's really young people versus old ones, with the middle-aged voting less (they most likely care less). Also, ridings with a higher share of votes for the NDP in 2017 have increased less than the others during this weekend. This is significant. I'm not sure why, but it seems to indicate the NDP isn't really getting its vote out.
