Kalman Gain


I've implemented a Kalman filter and used it to track opinion polls for the UK 2017 general election, starting from the 2015 general election result. The Kalman filter is a tracking algorithm that is optimal for linear systems with Gaussian noise. It is best known from guidance and navigation systems, but has much wider applicability.

Kalman filters are normally used where the measurements are equally spaced in time. Opinion polls aren't, so I treat each opinion poll as one observation, and the time axis isn't uniform. The raw data is from Opinion polling for the United Kingdom general election, 2017. The measurement error is the sampling error (which is inversely proportional to the square root of the sample size), plus half a percent as opinion poll results are in whole percentages. I varied the process error until I got a reasonably smooth plot which tracked the opinion polls well. In more typical applications, there's a velocity term. Ideally, I'd have something similar here, to model the network effect of voters changing the behaviour of those they are in frequent contact with (e.g. friends and relatives), but I wasn't sure if that would work, so changes between polls are treated as random.
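For readers who want to see the moving parts, here is a minimal sketch of a scalar, random-walk Kalman filter of the kind described above. It is not the code used for the plots. The function names (`poll_sigma`, `kalman_step`) are made up for illustration, and two things are assumptions: that the sampling error is the binomial standard error, which shrinks as the square root of the sample size grows, and that the half-percent rounding allowance is simply added to it.

```python
import math

def poll_sigma(share_pct, sample_size):
    """Assumed standard deviation (in percentage points) of one poll result."""
    p = share_pct / 100.0
    # Binomial standard error: shrinks as 1/sqrt(sample_size).
    sampling = 100.0 * math.sqrt(p * (1.0 - p) / sample_size)
    rounding = 0.5  # polls are reported in whole percentages
    return sampling + rounding  # assumption: the two errors are simply added

def kalman_step(x, P, z, R, Q):
    """One update of a scalar random-walk Kalman filter (no velocity term).

    x, P: previous estimate and its variance
    z, R: new poll result and its variance
    Q:    process variance added between observations (the tuning knob)
    Returns the new estimate, its variance, and the Kalman gain.
    """
    P_pred = P + Q                # predict: estimate unchanged, uncertainty grows
    K = P_pred / (P_pred + R)     # Kalman gain: how much to trust the new poll
    x_new = x + K * (z - x)       # move the estimate towards the poll
    P_new = (1.0 - K) * P_pred    # uncertainty shrinks after the update
    return x_new, P_new, K

if __name__ == "__main__":
    # Hypothetical numbers, for illustration only: (share %, sample size).
    x, P = 30.0, 1.0
    for z, n in [(29.0, 1000), (31.0, 2000), (33.0, 1500)]:
        R = poll_sigma(z, n) ** 2
        x, P, K = kalman_step(x, P, z, R, Q=0.05)
        print(f"poll={z:.0f}%  gain={K:.3f}  estimate={x:.2f}%")
```

A gain near 1 means the filter mostly believes the new poll, and a gain near 0 means it mostly keeps its previous estimate. Raising `Q` makes the track follow the polls more closely, at the cost of a rougher plot.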

I removed the large YouGov polls, as they're conducted in a non-standard way, and also the large SurveyMonkey polls.

The first plot is of all the political parties.

[Plot: all parties]

Party    Color   Final percentage
Lib Dem  orange  7.994850

The rise in the Tory vote and corresponding decline in the UKIP vote started after the EU referendum. The Labour vote decline coincides with the Labour Party infighting. The rise in the Labour vote starts when the election is called and Labour finally gets its act together. The Tory vote continues to rise at the expense of UKIP, then peaks. Labour also gained from the Lib Dems and Greens.

I translated this into seats using Electoral Calculus's User-defined Prediction, which gives a Conservative majority of 60 seats.

There are several factors which could change this result. The polling companies already make allowances for some: the likelihood that someone polled will vote at all, and that they will vote the way they said they would. There's a known Shy Tory Factor. Also, Electoral Calculus uses a higher SNP vote than the Kalman filter predicts. Opinion polling stopped yesterday, so the estimates are a day out-of-date. There are reports of larger than usual turnout by young people. The first two favour the Tories, and the last two, Labour.

The UK uses first-past-the-post voting, a system known to be unfair if there are more than two parties, so I also examined what would happen if there were only two parties. In my two-party system, UKIP would merge with the Tories to form the Reactionary Party (magenta), and the Lib Dems, Greens, and SNP (and Plaid Cymru) would merge with Labour to form the Progressive Party (green). The result is that the Progressives narrowly win the popular vote (by 2%) but the Reactionaries are still the largest party, 9 seats short of an overall majority: User-defined Prediction. Note, however, that Electoral Calculus still keeps the SNP and the Northern Irish parties.

[Plot: reactionaries vs. progressives]

Post Mortem


The final results for the United Kingdom general election, 2017 were:

Party    Percentage vote
Lib Dem  7.4

So my prediction was wildly inaccurate. Why? Before blaming the opinion polls, I ran the Kalman filter again, this time providing one aggregated poll per day since the start of the election campaign, by spreading each opinion poll over the days it was conducted. Ideally, a Kalman filter should have one input per unit time.
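The daily aggregation can be sketched as follows. This is one plausible reading of "spreading each opinion poll over the days it was conducted", not necessarily the exact scheme used: it assumes a poll's sample is split evenly across its fieldwork days, and that each day's aggregate is the sample-weighted mean of the pieces falling on that day. The function name `daily_aggregate` and the tuple layout are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta

def daily_aggregate(polls):
    """Combine polls into one pseudo-poll per day.

    polls: iterable of (start_date, end_date, sample_size, share_pct).
    Returns {day: (effective_sample_size, weighted_share_pct)}, in date order.
    """
    totals = defaultdict(lambda: [0.0, 0.0])  # day -> [sample, sample * share]
    for start, end, n, share in polls:
        days = (end - start).days + 1
        per_day = n / days                    # assumption: even split over fieldwork days
        for i in range(days):
            d = start + timedelta(days=i)
            totals[d][0] += per_day
            totals[d][1] += per_day * share
    return {d: (n, s / n) for d, (n, s) in sorted(totals.items())}

if __name__ == "__main__":
    # Hypothetical polls, for illustration only.
    polls = [
        (date(2017, 6, 1), date(2017, 6, 2), 1000, 40.0),
        (date(2017, 6, 2), date(2017, 6, 2), 500, 34.0),
    ]
    for day, (n, share) in daily_aggregate(polls).items():
        print(day, n, share)
```

Each daily aggregate can then be fed to the filter as a single observation, with its measurement error computed from the effective sample size.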

[Plot: main parties]

Party    Color   Final percentage vote
Lib Dem  orange  7.665925

That's slightly more accurate, but still nowhere close to the 2.4% difference between the Tories and Labour in the final result. The Labour vote is still underestimated, and the Tory vote is slightly overestimated. Even allowing Labour's upward trend to continue for another day or two wouldn't make much difference, so it looks like most of the opinion polls were wide of the mark, and my Kalman filter is off the naughty step. Garbage in, garbage out.

Am I disappointed that my prediction was so poor? No, because I wanted Labour to win. They did very well to have improved their share of the vote as much as they did, and come so close to winning. Hats off to Jeremy Corbyn.


© Copyright Donald Fisk 2017