Illustration of how the new model helps: the last federal election

A lot changed in the last federal election. The NDP surged (especially in Quebec, but in Ontario as well) and the Liberals collapsed in Ontario. The polls also clearly underestimated the Conservatives' share of the vote: almost all polls conducted during the final weekend showed the CPC at around 36-37%, not 39-40%. The underestimation was even worse in Ontario.

I projected a Conservative minority in my final projections. Obviously I was wrong. However, the model takes the percentages as input. If the polls are wrong (as was clearly the case), the model can't be right. This is one of the reasons why I was wrong. But there is another one: my model wasn't really meant to project the parties in the case of really large changes or swings. I actually made that very clear from the beginning of the election. This is why I've now updated the model (for Quebec only, for now) to make it more robust to extrapolation. By extrapolation, I mean a party polling way above or below its previous results. For instance, in Quebec the NDP had never done better than 12.5% in recent years before 2011. So having the NDP at around 40% is obviously a totally different story.

The next federal model will naturally include the new features currently implemented in the Quebec model. However, for the sake of illustration, I went back to the model for the last federal election and added the extrapolation features for Ontario and Quebec, two provinces with big swings where the model didn't perform as well as expected. The table below displays the old seat projections, the new seat projections and the actual results. Naturally, I use the true vote percentages as input, not the polls.

Ontario (seats)
            CPC   LPC   NDP   Green
actual       73    11    22       0
model 1.0    63    24    19       0
model 2.0    70    16    20       0

Quebec (seats)
            CPC   LPC   NDP   Green   Bloc
actual        5     7    59       0      4
model 1.0     9    12    44       0     10
model 2.0     8     5    57       0      5
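
One rough way to summarize the table above is to add up, for each model, how many seats it missed by across all parties. The short Python sketch below does only that arithmetic on the numbers from the table. The function name is made up for illustration, and the party order for the first (Ontario) block is assumed to match the Quebec header minus the Bloc.

```python
# Seat counts copied from the table above.
# Assumption: the first block is Ontario, in the order CPC, LPC, NDP, Green.
ONTARIO = {
    "actual":    {"CPC": 73, "LPC": 11, "NDP": 22, "Green": 0},
    "model 1.0": {"CPC": 63, "LPC": 24, "NDP": 19, "Green": 0},
    "model 2.0": {"CPC": 70, "LPC": 16, "NDP": 20, "Green": 0},
}
QUEBEC = {
    "actual":    {"CPC": 5, "LPC": 7,  "NDP": 59, "Green": 0, "Bloc": 4},
    "model 1.0": {"CPC": 9, "LPC": 12, "NDP": 44, "Green": 0, "Bloc": 10},
    "model 2.0": {"CPC": 8, "LPC": 5,  "NDP": 57, "Green": 0, "Bloc": 5},
}


def total_seat_error(province, model):
    """Sum of absolute seat differences between a model and the actual results."""
    actual = province["actual"]
    projected = province[model]
    return sum(abs(projected[party] - actual[party]) for party in actual)


for name, province in (("Ontario", ONTARIO), ("Quebec", QUEBEC)):
    for model in ("model 1.0", "model 2.0"):
        print(name, model, total_seat_error(province, model))
# Ontario model 1.0 26
# Ontario model 2.0 10
# Quebec model 1.0 30
# Quebec model 2.0 8
```

Note that a seat wrongly given to one party is also a seat missing for another, so these totals count each mistake roughly twice. They are only meant for comparing the two models with each other.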

As you can see, the new model would have been a lot closer to the actual results. Let's look at one riding, for instance: Mississauga-Brampton-South. This riding is located in the GTA, a region where the Liberals have traditionally held up better than in the rest of the province (and the Conservatives have had less success). The old model was translating a 1-point change provincially into (say) a 0.8-point change in this region for both parties. So with the actual observed provincial swings between 2008 and 2011 (+5 for the CPC, -8 for the Liberals and +7.4 for the NDP), the model was projecting a close win for the Liberals: 42% vs 37%. That is still a much smaller margin than in 2008, when the LPC won 47.69% to 32.96%.
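
To make the old model's logic concrete, here is a minimal Python sketch of that regional scaling, using the 2008 results and provincial swings quoted above. The function name is hypothetical, and the 0.8 coefficient is only the "(say)" figure from the text, not the model's real estimate, which is why the output lands near the 42% vs 37% projection rather than exactly on it.

```python
def project_share(previous_share, provincial_swing, regional_coefficient):
    """Old-style projection: scale the provincial swing by a regional coefficient."""
    return previous_share + regional_coefficient * provincial_swing


# Illustrative value only: the "(say) 0.8" coefficient for the GTA.
GTA_COEFFICIENT = 0.8

# 2008 results in the riding and 2008-2011 provincial swings, as quoted above.
lpc = project_share(47.69, -8.0, GTA_COEFFICIENT)
cpc = project_share(32.96, +5.0, GTA_COEFFICIENT)

print(f"LPC {lpc:.1f}% vs CPC {cpc:.1f}%")
# LPC 41.3% vs CPC 37.0%
```

With a coefficient below 1, the Liberals lose fewer points here than they do provincially and the Conservatives gain fewer, so the riding stays Liberal in the projection.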

The new model would have projected a close CPC win, 39% vs 38%. The actual result was a CPC win, 44% vs 35%. So while it isn't perfect, it shows how the "extrapolation features" manage to correct for the fact that the Liberals had to start losing votes in ridings where they were above their provincial average. On the other hand, the Conservatives had to start gaining votes in those ridings where their swing was usually dampened. In this specific riding, the new model would still have overestimated the Liberals, but at least it would have picked the right winner.

What this means is that the new model (currently used for the next Quebec election) gives you the best of both worlds. When parties are "close" to their previous results, the model translates the provincial swings into riding-level ones, taking into account the region, the incumbent and other factors. And when one or more parties enter extrapolation territory, the model corrects itself to make better predictions. Of course, mistakes will still (and always) be made, since you can have region-year specific effects or the impact of local issues (plus, as before, you need the right percentages).
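
The exact mechanics of the extrapolation features are not described here, so the Python sketch below is only one hypothetical way to get that "best of both worlds" behaviour, not the actual model. It assumes a regional coefficient (like the illustrative 0.8 above) that is pulled toward a uniform swing (coefficient 1.0) as a party moves outside its historical range. The function names, the 10-point scale and the 5% historical minimum are all made up for illustration; the 12.5% and 40% figures for the NDP in Quebec come from the text.

```python
def extrapolation_weight(current, historical_min, historical_max, scale=10.0):
    """0 inside the historical range, rising linearly to 1 at `scale` points outside it.

    Hypothetical rule: `scale` is an arbitrary illustrative value.
    """
    if current > historical_max:
        distance = current - historical_max
    elif current < historical_min:
        distance = historical_min - current
    else:
        distance = 0.0
    return min(1.0, distance / scale)


def effective_coefficient(regional_coefficient, weight):
    """Blend the usual regional coefficient with a uniform swing (coefficient 1.0)."""
    return (1.0 - weight) * regional_coefficient + weight * 1.0


# NDP in Quebec: prior best of 12.5%; the 5.0% minimum is a made-up placeholder.
w_near = extrapolation_weight(11.0, historical_min=5.0, historical_max=12.5)
w_mid = extrapolation_weight(17.5, historical_min=5.0, historical_max=12.5)
w_far = extrapolation_weight(40.0, historical_min=5.0, historical_max=12.5)

for label, w in (("near", w_near), ("mid", w_mid), ("far", w_far)):
    print(label, w, round(effective_coefficient(0.8, w), 2))
# near 0.0 0.8
# mid 0.5 0.9
# far 1.0 1.0
```

In this sketch, a party polling inside its usual range keeps the regional coefficient untouched, while a party at 40% with a previous best of 12.5% is treated with a plain uniform swing. A real model could just as well let the coefficient go above 1 or estimate the adjustment from past elections.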