It is a year since the 2017 general election. I am sure lots of people will be writing a lot of articles looking back at the election itself and the year since, but I thought I’d write something about the 2017 polling error, which has gone largely unexamined compared to the 2015 error. The polling companies themselves have all carried out their own internal examinations and reported to the BPC, and the BPC will be putting out a report based on that in due course. In the meantime, here are my own personal thoughts about the wider error across the industry.

The error in 2017 wasn’t the same as 2015.

Most casual observers of polls will probably have lumped the errors of 2015 and 2017 in together, and seen 2017 as just “the polls getting it wrong again”. In fact the nature of the error in 2017 was completely different to that in 2015. It would be wrong to say they are unconnected – the cause of the 2017 errors was often pollsters trying to correct the error of 2015 – but the way the polls were wrong was completely different.

To understand the difference between the errors in 2015 and the errors in 2017 it helps to think of polling methodology as being divided into two parts. The first is the sample – the way pollsters try to get respondents who are representative of the public, be that through the sampling itself or the weights they apply afterwards. The second is the adjustments they make to turn that sample into a measure of how people would actually vote – how they model things like turnout, and how they account for people who say they don’t know or refuse to answer.
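To make that split concrete, here is a minimal sketch (invented respondents, weights and turnout scores – not any pollster’s actual data or method) that keeps the two stages separate, so you can see the same weighted sample producing different headline figures depending on what is done to it afterwards:

```python
# Minimal sketch of the two-stage structure of a poll (all numbers invented).
# Stage 1: a demographically weighted sample of stated voting intention.
# Stage 2: a post-fieldwork adjustment (here, a simple turnout weight).

respondents = [
    # (party, demographic weight, self-reported chance of voting out of 10)
    ("Con", 1.0, 9), ("Lab", 1.1, 8), ("Lab", 0.9, 5),
    ("Con", 1.2, 10), ("Lab", 1.0, 7), ("Other", 0.8, 6),
]

def shares(rows):
    """Turn (party, weight) pairs into percentage shares."""
    total = sum(w for _, w in rows)
    out = {}
    for party, w in rows:
        out[party] = out.get(party, 0) + 100 * w / total
    return {p: round(s, 1) for p, s in out.items()}

# Stage 1 only: the weighted sample, before any turnout adjustment.
print("weighted sample:", shares([(p, w) for p, w, _ in respondents]))

# Stage 2: scale each weight by likelihood to vote (one of many possible adjustments).
print("after turnout adjustment:", shares([(p, w * chance / 10) for p, w, chance in respondents]))
```

The 2015 and 2017 errors, as described below, sit in different halves of this sketch.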

In 2015, the polling industry got the first of those wrong, and the second right (or at least, the second of those wasn’t the cause of the error). The Sturgis Inquiry into the 2015 polling error looked at every possible cause of error, and decided that the polls had samples that were not representative. While the inquiry didn’t think the way pollsters predicted turnout was based on strong enough evidence, and recommended improvements there too, it ruled turnout out as a cause of the 2015 error.

In 2017 it was the opposite situation. The polling samples themselves had pretty much the correct result to start with, showing only a small Tory lead. More traditional approaches towards modelling turnout (which typically made only small differences) would have resulted in polls that only marginally overstated the Tory lead. The large errors we saw in 2017 were down to the more elaborate adjustments that pollsters had introduced. If you had stripped away all the adjustments for turnout, don’t knows and suchlike (as in the table below) then the underlying samples the pollsters were working with would have got the Conservative lead over Labour about right:

What did pollsters do that was wrong?

The specific adjustments that pushed the figures wrong varied from pollster to pollster. For ICM, ComRes and Ipsos MORI it looks as if new turnout models inflated the Tory lead; for BMG it was their new adjustment for electoral registration; for YouGov it was reallocating don’t knows. The details were different in each case, but what they had in common was that pollsters had introduced post-fieldwork adjustments with larger impacts than at past elections, and which ended up over-adjusting in favour of the Tories.

In working out how pollsters came to make this error we need to take a closer look at the diagnosis of what went wrong in 2015. Saying that samples were “wrong” is easy; if you are going to solve the problem you need to identify how they were wrong. After 2015 the broad consensus within the industry was that the samples had contained too many politically engaged young people who went out to vote Labour, and not enough uninterested young people who stayed at home. Polling companies took a mixture of two different approaches to dealing with this, though most companies did a bit of both.

One approach was to try and treat the cause of the error by improving the samples themselves, trying to increase the proportion of respondents who had less interest in politics. Companies started adding quotas or weights that had a more direct relationship with political interest, things like education (YouGov, Survation & Ipsos MORI), newspaper readership (Ipsos MORI) or outright interest in politics (YouGov & ICM). Pollsters who primarily took this approach ended up with smaller Tory leads.

The other was to try and treat the symptom of the problem by introducing new turnout models that assumed lower rates of turnout among demographic groups that had not traditionally turned out to vote in the past – the groups where pollsters felt their samples contained too many likely voters. The most notable examples were the decisions by some pollsters to replace turnout models based on self-assessment with turnout models based on demographics – downweighting groups like the young or working class who have traditionally had lower turnouts. Typically these changes produced polls with substantially larger Conservative leads.
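To make the difference between the two treatments concrete, here is a stylised sketch (invented respondents and invented turnout probabilities standing in for whatever historical rates a pollster might assume – not any company’s actual model) of how a self-reported turnout weight and a purely demographic turnout model can produce different leads from the same sample:

```python
# Stylised contrast between two turnout treatments (all numbers invented).
# Approach A: weight by each respondent's own stated likelihood to vote.
# Approach B: weight by an assumed historical turnout rate for their age group.

respondents = [
    # (party, age band, stated likelihood to vote out of 10)
    ("Lab", "18-24", 9), ("Lab", "18-24", 8), ("Con", "18-24", 7),
    ("Lab", "25-49", 8), ("Con", "25-49", 8), ("Lab", "50+", 7),
    ("Con", "50+", 9), ("Con", "50+", 10), ("Con", "50+", 9),
]

# Assumed demographic turnout rates, loosely in the spirit of past elections.
demographic_turnout = {"18-24": 0.45, "25-49": 0.65, "50+": 0.80}

def con_lead(weighted):
    """Conservative lead over Labour, in points, from (party, weight) pairs."""
    total = sum(w for _, w in weighted)
    con = sum(w for p, w in weighted if p == "Con") / total
    lab = sum(w for p, w in weighted if p == "Lab") / total
    return round(100 * (con - lab), 1)

self_reported = [(p, likelihood / 10) for p, _, likelihood in respondents]
demographic = [(p, demographic_turnout[age]) for p, age, _ in respondents]

print("lead with self-reported turnout weights:", con_lead(self_reported))
print("lead with demographic turnout weights:  ", con_lead(demographic))
```

Because the young respondents here say they are fairly likely to vote but belong to a group with a low assumed historical turnout, the demographic model downweights them more heavily and the Conservative lead comes out larger – the same direction of effect described above.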

So was it just to do with pollsters getting youth turnout wrong?

This explanation chimes nicely with the idea that the polling error was down to polling companies getting youth turnout wrong, that young people actually turned out at an unusually high level, but that polling companies fixed youth turnout at an artificially low level, thereby missing this surge in young voting. This is an attractive theory at first glance, but as is so often the case, it’s actually a bit more complicated than that.

The first problem with the theory is that it’s far from clear whether there was a surge in youth turnout. The British Election Study has cast doubt upon whether or not youth turnout really did rise that much. That’s not a debate I’m planning on getting into here, but suffice to say, if there wasn’t really that much of a leap in youth turnout, then it cannot explain some of the large polling misses in 2017.

The second problem with the hypothesis is that there isn’t much of a relationship between which polling companies had about the right proportion of young people in their final voting figures and which got the result right.

The chart below shows the proportion of voters aged under 25 in each polling company’s final polling figures. The blue bar is the proportion in the sample as a whole, the red bar the proportion in the final voting figures, once pollsters had factored in turnout, dealt with don’t knows and so on. As you would expect, everyone had roughly the same proportion of under 25s in their weighted sample (in line with the actual proportion of 18-24 year olds in the population), but among their sample of actual voters it differs radically. At one end, less than 4% of BMG’s final voting intention figures were based on people aged under 25. At the other end, almost 11% of Survation’s final voting figures were based on under 25s.

According to the British Election Study, the closest we have to authoritative figures, the correct figure should have been about 7%. That implies Survation got it right despite having far too many young people. ComRes had too many young people, yet had one of the worst understatements of Labour support. MORI had close to the correct proportion of young people, yet still got it wrong. There isn’t the neat relationship we’d expect if this was all about getting the correct proportion of young voters. Clearly the explanation must be rather more complicated than that.

So what exactly did go wrong?

Without a nice, neat explanation like youth turnout, the best overarching explanation for the 2017 error is that polling companies, seeking to solve the overstatement of Labour in 2015, simply went too far and ended up understating Labour in 2017. The actual details of this differed from company to company, but it’s fair to say that the more elaborate the adjustments that polling companies made for things like turnout and don’t knows, the worse they performed. Essentially, polling companies overdid it.

Weighting down young people was part of this, but it was certainly not the whole explanation and some pollsters came unstuck for different reasons. This is not an attempt to look in detail at each pollster, as they may also have had individual factors at play (in BMG’s report, for example, they’ve also highlighted the impact of doing late fieldwork during the daytime), but there is a clear pattern of over-enthusiastic post-fieldwork adjustments turning essentially decent samples into final figures that were too Conservative (a stylised sketch of how such adjustments can stack up follows the list):

  • BMG’s weighted sample would have shown the parties neck-and-neck. With just traditional turnout weighting they would have given the Tories around a four point lead. However, combining this with an additional down-weighting by past non-voting and the likelihood of different age/tenure groups to be registered to vote changed this into a 13 point Tory lead.
  • ICM’s weighted sample would have shown a five point Tory lead. Adding demographic likelihood to vote weights that largely downweighted the young increased this to a 12 point Tory lead.
  • Ipsos MORI’s weighted sample would have shown the parties neck-and-neck, and MORI’s traditional 10/10 turnout filter looks as if it would have produced an almost spot-on 2 point Tory lead. An additional turnout filter based on demographics changed this to an 8 point Tory lead.
  • YouGov’s weighted sample had a 3 point Tory lead, which would’ve been unchanged by their traditional turnout weights (and which also exactly matched their MRP model). Reallocating don’t knows changed this to a 7 point Tory lead.
  • ComRes’s weighted sample had a 1 point Conservative lead, and by my calculations their old turnout model would have shown much the same. Their new demographic turnout model did not actually understate the proportion of young people, but did weight down working class voters, producing a 12 point Tory lead.
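Below is a stylised sketch of how such adjustments can stack up (all probabilities are invented for illustration – this is not BMG’s or any other company’s actual model). Because each adjustment is applied as a multiplier on the respondent’s weight, two individually modest downweightings that lean against the same group combine into a much larger shift in the headline lead:

```python
# Stylised illustration (invented numbers) of how two separate post-fieldwork
# downweightings combine multiplicatively on the same respondents.

respondents = [
    # (party, assumed P(registered), assumed P(votes))
    ("Lab", 0.85, 0.70), ("Lab", 0.88, 0.75), ("Lab", 0.90, 0.72),
    ("Con", 0.95, 0.82), ("Con", 0.96, 0.85), ("Con", 0.94, 0.80),
]

def con_lead(rows):
    """Conservative lead over Labour, in points, from (party, weight) pairs."""
    total = sum(w for _, w in rows)
    con = sum(w for p, w in rows if p == "Con")
    lab = sum(w for p, w in rows if p == "Lab")
    return round(100 * (con - lab) / total, 1)

raw          = [(p, 1.0)        for p, _, _ in respondents]
reg_only     = [(p, reg)        for p, reg, _ in respondents]
turnout_only = [(p, vote)       for p, _, vote in respondents]
both         = [(p, reg * vote) for p, reg, vote in respondents]

for label, rows in [("no adjustment", raw), ("registration only", reg_only),
                    ("turnout only", turnout_only), ("both combined", both)]:
    print(f"{label:18s} Con lead: {con_lead(rows)}")
```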

Does this mean modelling turnout by demographics is dead?

No. Or at least, it shouldn’t do. The pollsters who got it most conspicuously wrong in 2017 were indeed those who relied on demographic turnout models, but this may have been down to the way they did it.

Normally weights are applied to a sample all at the same time using “rim weighting” (this is an iterative process that lets you weight by multiple items without them throwing each other off). What happened with the demographic turnout modelling in 2017 is that companies effectively did two lots of weights. First they weighted the demographics and past vote of the data so it matched the British population. Then they effectively added separate weights by things like age, gender and tenure so that the demographics of those people included in their final voting figures matched the people who actually voted in 2015. The problem is this may well have thrown out the past vote figures, so the 2015 voters in their samples matched the demographics of 2015 voters, but didn’t match the politics of 2015 voters.
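For readers who want to see the mechanics, here is a minimal sketch of the iterative proportional fitting idea behind rim weighting (a toy example with two weighting variables and invented targets – real implementations handle many more variables, convergence checks and weight trimming):

```python
# Minimal sketch of rim weighting (iterative proportional fitting): adjust the
# weights so the sample matches marginal targets on several variables at once,
# without needing a target for every combination of them.

respondents = [
    {"age": "18-24", "past_vote": "Lab"}, {"age": "18-24", "past_vote": "Con"},
    {"age": "25+", "past_vote": "Lab"},   {"age": "25+", "past_vote": "Con"},
    {"age": "25+", "past_vote": "Con"},   {"age": "25+", "past_vote": "Lab"},
]

# Invented marginal targets: the shares we want the weighted sample to hit.
targets = {
    "age": {"18-24": 0.12, "25+": 0.88},
    "past_vote": {"Lab": 0.35, "Con": 0.65},
}

weights = [1.0] * len(respondents)

for _ in range(50):  # iterate until the margins (approximately) converge
    for variable, wanted in targets.items():
        total = sum(weights)
        for category, target_share in wanted.items():
            members = [i for i, r in enumerate(respondents) if r[variable] == category]
            current_share = sum(weights[i] for i in members) / total
            if current_share > 0:
                for i in members:
                    weights[i] *= target_share / current_share

for r, w in zip(respondents, weights):
    print(r, round(w, 3))
```

The failure mode described above comes from what happens next: rim weighting guarantees the marginal targets, but a second, separate set of turnout weights applied afterwards, outside this loop, can knock a target such as past vote back out of line.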

It’s worth noting that some companies used demographic-based turnout modelling and were far more successful. Kantar’s polling used a hybrid turnout model based upon both demographics and self-reporting, and was one of the most accurate polls. SurveyMonkey’s turnout modelling was based on the demographics of people who voted in 2015, and produced only a 4 point Tory lead. YouGov’s MRP model used demographics to predict respondents’ likelihood to vote and was extremely accurate. There were companies who made a success of it, and it may be more a question of how to do it well, rather than whether to do it at all.

What have polling companies done to correct the 2017 problems, and should I trust them?

For individual polling companies the errors of 2017 are far more straightforward to address than in 2015. For most polling companies it has been a simple matter of dropping the adjustments that went wrong. All the causes of error I listed above have simply been reversed – for example, ICM have dropped their demographic turnout model and gone back to asking people how likely they are to vote, ComRes have done the same. MORI have stopped factoring demographics into their turnout, YouGov aren’t reallocating don’t knows, BMG aren’t currently weighting down groups with lower registration.

If you are worried that the specific type of polling error we saw in 2017 could be happening now, you shouldn’t be – all the methods that caused the error have been removed. A simplistic view that the polls understated Labour in 2017 and that, therefore, Labour are actually doing better than the polls suggest is fallacious. However, that is obviously not a guarantee that polls couldn’t be wrong in other ways.

But what about the polling error of 2015?

This is a much more pertinent question. The methodology changes that were introduced in 2017 were intended to correct the problems of 2015. So if the changes are reversed, does that mean the errors of 2015 will re-emerge? Will polls risk *overstating* Labour support again?

The difficult situation the polling companies find themselves in is that the methods used in 2017 would have got 2015 correct, but got 2017 wrong. The methods used in 2015 would have got 2017 correct, but got 2015 wrong. The question we face is what approach would have got both 2015 and 2017 right?

One answer may be for polling companies to use more moderate versions of the changes they introduced in 2017. Another may be to concentrate more on improving samples, rather than on post-fieldwork adjustments to turnout. As we saw earlier in the article, polling companies took a mixture of two approaches to solving the problem of 2015. The approach of “treating the symptom” by changing turnout models and similar ended up backfiring, but what about the first approach – what became of the attempts to improve the samples themselves?

As we saw above, the actual samples the polls used were broadly accurate. They tended to show the smaller parties a little too high, but the balance between Labour and Conservative was pretty accurate. For one reason or another, the sampling problem from 2015 appears to have completely disappeared by 2017. 2015 samples were skewed towards Labour, but in 2017 they were not. I can think of three possible explanations for this.

  • The post-2015 changes made by the polling companies corrected the problem. This seems unlikely to be the sole reason, as polling samples were better across the board, with those companies who had done little to improve their samples performing in line with those who had made extensive efforts.
  • Weighting and sampling by the EU ref made samples better. There is one sampling/weighting change that nearly everyone made – they started sampling/weighting by recalled EU ref vote, something that was an important driver of how people voted in 2017. It may just be that providence has provided the polling companies with a useful new weighting variable that meant samples were far better at predicting vote shares.
  • Or perhaps the causes of the problems in 2015 just weren’t an issue in 2017. A sample being wrong doesn’t necessarily mean the result will be wrong. For example, if I had too many people with ginger hair in my sample, the results would probably still be correct (unless there is some hitherto unknown relationship between voting and hair colour). It’s possible that – once you’ve controlled for other factors – in 2015 people with low political engagement voted differently to engaged people, but that in 2017 they voted in much the same way. In other words, it’s possible that the sampling shortcomings of 2015 didn’t go away, they just ceased to matter.

It is difficult to come to a firm answer with the data available, but whichever mix of these is the case, polling companies shouldn’t be complacent. Some of them have made substantial attempts to improve their samples since 2015, but if the problems of 2015 disappeared because of the impact of weighting by Brexit or because political engagement mattered less in 2017, then we cannot really tell how successful those improvements were. And it stores up potential problems for the future – weighting by a referendum that happened in 2016 will only be workable for so long, and if political engagement didn’t matter this time, it doesn’t mean it won’t in 2022.

Will MRP save the day?

One of the few conspicuous successes in the election polling was the YouGov MRP model (that is, multilevel regression and post-stratification). I expect come the next election there will be many other attempts to do the same. I will urge one note of caution – MRP is not a panacea for polling’s problems. It can go wrong, and it still relies on the decisions people make in designing the model it runs on.

MRP is primarily a method of modelling opinion at lower geographical levels from a big overall dataset. Hence in 2017 YouGov used it to model the share of the vote in the 632 constituencies in Great Britain. In that sense, it’s a genuinely important step forward in election polling, because it properly models actual seat numbers and, from there, who will win the election and be in a position to form a government. Previously polls could only predict shares of the vote, which others could use to project into a result using the rather blunt tool of uniform national swing. MRP produces figures at the seat level, so it can be used to predict the actual result.

Of course, if you’ve got shares of the vote for each seat then you’ll also be able to use it to get national shares of the vote. However, at that level it really shouldn’t be that different from what you’d get from a traditional poll that weighted its sample using the same variables and the same targets (indeed, the YouGov MRP and traditional polls showed much the same figures for much of the campaign – the differences came down to turnout adjustments and don’t knows). Its level of accuracy will still depend on the quality of the data, the quality of the modelling and whether the people behind it have made the right decisions about the variables used in the model and on how they model things like turnout… in other words, all the same things that determine if an opinion poll gets it right or not.
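As a highly simplified sketch of the post-stratification half of MRP (the cell-level support figures, constituency names and population counts below are all invented – in a real MRP model the cell estimates would come from a multilevel regression fitted to the pooled polling data rather than being typed in by hand), the seat-level figure is just the cell estimates weighted by how many voters each seat has in each cell:

```python
# Highly simplified sketch of the post-stratification step of MRP (all numbers
# and constituency names invented). A real MRP model would estimate the
# cell-level support from a multilevel regression on the pooled polling data.

# Estimated Conservative support within each demographic cell.
cell_support = {
    ("18-24", "degree"): 0.22, ("18-24", "no degree"): 0.30,
    ("25-64", "degree"): 0.38, ("25-64", "no degree"): 0.48,
    ("65+", "degree"): 0.52,   ("65+", "no degree"): 0.62,
}

# Invented counts of voters in each cell for two hypothetical constituencies.
constituency_cells = {
    "Seatville North": {
        ("18-24", "degree"): 4000,  ("18-24", "no degree"): 6000,
        ("25-64", "degree"): 16000, ("25-64", "no degree"): 20000,
        ("65+", "degree"): 5000,    ("65+", "no degree"): 9000,
    },
    "Seatville South": {
        ("18-24", "degree"): 9000,  ("18-24", "no degree"): 5000,
        ("25-64", "degree"): 24000, ("25-64", "no degree"): 12000,
        ("65+", "degree"): 6000,    ("65+", "no degree"): 4000,
    },
}

def post_stratify(cells):
    """Weight each cell's estimated support by the number of voters in it."""
    total = sum(cells.values())
    return sum(cell_support[cell] * n for cell, n in cells.items()) / total

for seat, cells in constituency_cells.items():
    print(seat, round(100 * post_stratify(cells), 1))
```

National shares are then the same calculation run over the combined counts for every seat, which is why, as noted above, the national figure shouldn’t differ much from a conventionally weighted poll using the same variables and targets.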

In short, I do hope the YouGov MRP model works as well in 2022 as it did in 2017, but MRP as a technique is not infallible. Lord Ashcroft also did an MRP model in 2017, and that was showing a Tory majority of 60.

TLDR:

  • The polling error in 2017 wasn’t a repeat of 2015 – the cause was different and the direction of the error was the opposite.
  • In 2017 the polling samples would have got the Tory lead pretty much spot on, but the topline figures ended up being wrong because pollsters added various adjustments to try and correct the problems of 2015.
  • While a few pollsters did come unstuck over turnout models, it’s not as simple as it being all about youth turnout. Different pollsters made different errors.
  • All the adjustments that led to the error have now been reversed, so the specific error we saw in 2017 shouldn’t reoccur.
  • But that doesn’t mean polls couldn’t be wrong in other ways (most worryingly, we don’t really know why the underlying problem behind the 2015 error went away), so pollsters shouldn’t get complacent about potential polling error.
  • MRP isn’t a panacea for the problems – it still needs good modelling of opinion to get accurate results. If it works though, it can give a much better steer on actual seat numbers than traditional polls.


138 Responses to “Why the polls were wrong in 2017”

  1. Fascinating analysis.

  2. “The error in 2017 wasn’t the same as 2017”

    Really?

  3. A lot to read and take in.

    [I’ve done the reading bit…]

  4. it does seem challenging to correct things if you can’t figure whether your adjustments worked or not.

  5. Ultra TLDR

    They got it wrong in 2015. Resulting corrections led to getting it wrong in 2017. They’ve undone the corrections, so the original error could reappear at some point.

    Or maybe some new error.

  6. Well now we know where AW has been for the last couple of weeks: a sterling essay.

  7. How far did the polls overstate the Tory lead before TM called the election? What I am wondering is whether the pollsters’ errors led to the election?

    Of course, the loss of a by-election by Labour does suggest that they were in the doldrums. Although I remember the analysis of the local election result was not as disastrous for Labour as the opinion polls were suggesting – although still a clear win for the Tories.

    Can we use the polls with better adjustment to show when they began to shift? Assuming that they did.

  8. Lots to digest. Let’s see how long it is before someone starts banging on about Brexit.

  9. Superb analysis- one thing that stands out (even if it is a bit of a Doh!) is how much thought and work goes into attempting to get the polls right. I guess if you get down to one of the core failures to assess turnout it surely is largely an impossible thing to get right. Very noticeable how recent exit polls, which don’t have that problem, have come very close to the final result even with big uncertainty on the night over the Scottish sample in particular.

    One puzzle for me is those final Yougov figures in the chart that shows the adjustments (Lab from 38 to 35). The previous 2 Yougov polls were very close to the final result 42/38 and 42/39. At the time that could have retrospectively been put down to a bad sample or the unfair charge of Yougov “bottling it”. But I’m not clear how it is that Yougov with presumably the same sort of weighting that lowered the Lab vote substantially ended up with a bad final poll when the previous were close to being spot on? Perhaps they had a bad final sample and their adjustments made a bad sample worse?

  10. Test

  11. “VALERIE
    Lots to digest. Let’s see how long it is before someone starts banging on about Brexit.”

    That should delay the inevitable for at least five minutes Valerie.

  12. Yougov’s MRP was also considerably more expensive in terms of data given they were throwing 5000-8000 surveys at it each day.

    In some respects I guess this didn’t actually cost yougov much as they can presumably just tack a voting intention question on the front of the normal marketing work. But it’s worth noting that accuracy across finer segments of the population needs more data.

  13. Oh No we’ve got to talk about proper Polling!!!!

    Peter.

  14. @ Various from last thread. I’m as bored repeating myself as I’m sure other folks are reading it (or pretending not to read it). Once more though it seems!

    Leaving the EU will not fix UK’s problems by itself it needs a proactive HMG as well. I’ve repeatedly agreed that EU is not to “blame” for UK’s problems – most of UK’s problems (such as over reliance on London, debt-junkyism and failure to “bend the rules” as well as other countries) are self-inflicted due to decades of CON-LAB-CON laissez faire govts mixed with anglo culture.

    However, we voted to Leave and we are now where we are with a cr4p, weak and unstable govt making a total mess of Brexit (polls leaning heavily to negotiations going badly). We can’t do hindsight on decades of laissez faire govts, we can’t do hindsight on Brexit and unless/until we get rid of May (probably triggering a GE) we can’t do anything about how cr4p May (and Hammond) is (are).

    IMHO the 3 biggest causal problems facing UK are:
    1/ chronic and unsustainable current account deficit
    2/ low productivity
    3/ ageing population

    Other very important issues concerning govt finances (e.g. NHS funding) or raising real wages can’t be fixed in a sustainable way unless we tackle these three issues. Brexit on its own will not solve these issues but it can, and now has to be, the excuse to kick UK out of laissez fairre slumber.

    If any party had a better idea to fix UK’s problems over the last 25+years then it is a bit late now. If anyone from Remain can suggest how BINO delivered by May or a.n.other weak govt will tackle the three issues I highlight then please explain. I’m totally fine with another GE to settle this – preferably with a new CON leader and CoE!

    Last words concerning the EU for today will have been seen by Remain as it was in the Guardian!
    https://www.theguardian.com/commentisfree/2018/jun/06/fix-eu-single-currency-does-not-work

    Soros has been writing harsh words about EU in the Guardian as well. Like Heseltine he wants UK to stay to help reform them!! Well “*&%k that” as they say – saving them from themselves twice was enough, outside of the Euro (thank God) why would they listen to any advice we give them if we beg to rejoin.

    [Here ends a rant by the Young Gammon Shrek liberation party ;)]

  15. Peter Cairns (and/or others)

    I wasn’t following YG’s MRP model very closely in 2017.

    How well did it do in the Scottish polity?

    I don’t mean in accurate prediction of constituency results (too many close marginals for that) but in likely vote shares in different types of constituencies?

  16. Valerie:

    105 minutes!

    It was TW though….

  17. I am not 100% sure this is what AW is saying but if you apply many layers of correction to any set of data the danger of amplifying some small systematic error must increase (my gut feeling!)

  18. Andrew11,
    “It was TW though….”
    yes

  19. Oldnat,

    “How well did it do in the Scottish polity?”

    Pretty good as I recall, it predicted Angus Robertson losing his seat, which I didn’t expect!

    Peter.

  20. It always amazes me that pollsters get it as right as they do. After all their weighting depends on the assumption that the people chosen to represent (say) the young are not atypical of those who will actually vote. And if the past is to guide methodology in the present, the assumption seems to be that if a group turns out to be atypical, this will be repeated in the same way next time round. It seems from Anthony’s analysis that in 2015 the responding young were atypical in having too many politically engaged labour voters while this was not true of the 2017 young respondents. Hence in part the two opposite kinds of error.

    Like almost everyone except the admirable Dr Mibbles (where is he when one needs him, enjoying well deserved winnings in the Bahamas perhaps) I thought the conservatives would win the election much more handsomely than they did. I did not, however, think that they would win as handsomely as the betting and some of the pollsters believed. On this basis I advised my son who unlike me bets and he won quite a lot of money.

    The reason for my belief was basically that youth turnout in 2017 was likely to be higher than in 2015, and the general feel of 2017 was that youth were much more engaged than in 2015.

    So my suggestions for improving polls are that:

    a) they take more trouble to avoid over-correction, and to do this they should make use of more than just last time’s results. If last time’s target was missed by a wide margin one doesn’t usually adjust so that next time one hits last time’s figure exactly, but goes for something like half way between last time’s target and last time’s result.

    b) They should try to measure key aspects of the current context and take this into account in their adjustments. If there was more enthusiasm for labour among the usually uninterested young in 2017 this should have shown up in polling, and it should be possible to take account of it in any adjustments, as I did in my lay way.

    These suggestions are not alternatives to the more technical solutions which Anthony discusses. However, a key problem is clearly the fact that contexts change in ways that can hopefully be detected at the time but which cannot be predicted in advance.

  21. Peter Cairns

    Ta

  22. The problem is that the context and hence the sources of error change from election to election. Part of the answer may lie in making more use of information from before the last election (2010 as well as 2015, as it were). This might reduce over-correction. And it should also be possible to pick up clues as to the current context from current polling (e.g. to notice that the disinterested young voters are more enthusiastic this time).

  23. @ ON

    You can still see the version on the night of the GE here:

    https://yougov.co.uk/uk-general-election-2017/

    It did over-estimate SNP numbers a bit, but the interesting thing is to look at the constituency level, and many of the close calls are often indicated as ‘toss-up’. Which is fair enough, there were some very narrow victories in Scotland in particular.

  24. apologies for second post. the first one seemed to have disappeared and I could not be bothered to repeat it. Probably most won’t have time to read it so the shorter version may have its point

  25. TrigGuy

    Ta to you too.

  26. @Valerie – do stop banging on about Brexit, please!

  27. @ ON

    “It did over-estimate SNP numbers a bit”

    It looks to me like the model got the SNP vs SCon battles mostly about right, but failed to spot the SLAB gains. The map is rather difficult to read, and requires time and patience, but I make it that it only predicted 1 seat for LAB.

  28. TrigGuy:

    Many thanks for putting up that final MRP prediction for Scotland constituency GE results, in response to ON’s request.

    Yes, I thought the YouGov MRP model was good, and seemed to be giving sensible predictions in the run-up. So I was looking at it each day. Labour’s revival in the west Central Belt had been suggested on some days in the run-up, though was not visible as wins in this final prediction.

    It did worry me that there were some changes in predictions from day to day.

  29. AW, polling.

    I go with Charles (b).

    Pollsters try to improve their accuracy by looking at past results and applying the patterns they see there to current data. eg, trying to be cleverer than just asking people if they will vote, but looking at how frequently such and such group voted last time and in effect ignoring what they are saying now and instead applying their past behaviour.

    The problem is, to what extent can a previous pattern be carried forward. If there is no pattern, then seeking one out and applying it will just cause errors. Hence I go with Charles, it is necessary to look for confounding circumstances in the current election,which will change past patterns.

    As best I understand it, this seems rather like the MRP model. Which sought to find patterns in the overall responses, which could be applied to an inadequate random sample on a per constituency basis, and improve the result locally. This kind of process would seem to be what pollsters are trying to do looking at past results, but they are drawing their patterns from old behaviour which may have changed.

    I never studied 2015, or if I did I forget now what I thought (maybe I posted something then). It is arguable that the 2015 result was the exception, so correcting future elections on its basis was building in an error. But every election must have some degree of special circumstances which would have to be identified if pollsters are trying for greater accuracy.

    In this case there were two obvious differences to past elections. The first was Brexit. This caused a clear change in turnout, or caused people to switch party. Thus the composition of the block which was voting for each of the main parties was now different to past elections, and so might be expected to behave differently to a ‘traditional’ labour or tory model.

    The second was Jeremy Corbyn. Not so much the man, I am not clear to what extent his personality was a plus or minus. But he represented a policy change by the labour party which harked back to a pre Thatcher era. Many commented on how this motivated a wholly different group of supporters and presumably this included voters, not just party activists.

    I quite favour the idea that unless a political position has an actual party representing it, it is unlikely to show up as popular. So, whatever the innate national support might be for corbyn style politics, it could not be measured while the choice was Blair style politics, or Cameron. Or, apart from Blair being more a Thatcherite than traditional labour, Brown and then Miliband had campaigns which were more apologies for their parties’ failings than actual new policies with appeal. So whether it was a resurgent left, or resurgent self-confidence, labour had a different platform under Corbyn and so it is unsurprising if its supporters showed different levels of motivation.

    So I go with Charles, it is necessary to find the distinctive features of each elections which motivated voters, and then consider how much each is still true, or has significantly changed.

    Normalising to the referendum result presumably helped a lot. To the extent these groups with this motivation carried across to 2017, then it was an ideal way to pick a sample. Every election should have a trial run the year before, just to help pollsters. In this case, it sounds from what you say that the basic samples, including this correction, were fairly good.

    The samples were also normalised to past political affiliation, which although I am suggesting it was wrong, because it should have been some sort of Corbynised labour normalisation, it was nonetheless a representation of tribal party affiliation. Perhaps what should have happened is marking up labour’s percentage to allow for Corbyn, and then marking down again for differential turnout. Maybe the 2015 correction was not wrong, but cancelled out by Corbyn. Corbyn motivated and therefore cancelled out the traditional labour tendency not to bother voting.

  30. More polling,

    Oh, and we still haven’t resolved the issue that while the referendum was probably a good source to normalise voters in 2017, it may age even faster than usual after an election, because of the differential effect of older voters dying. Or no effect, depending on whether people think becoming a leaver is a result of getting older or a result of formative experiences.

    Eventually this will cease to matter, but it is quite possible we may have another election while Brexit continues to dominate. Currently this issue is a material one which is biasing the polling about brexit issues, even if there is no election.

  31. “The problem is, to what extent can a previous pattern be carried forward.”

    ——-

    Indeed. E.g. the problem of “shy” voters. For example, Tories might be getting a bad rap, then some voters might not admit to wishing to vote Tory. But party perceptions may shift. So there might be fewer shy Tories next time. Similarly, turnout may change if policies or conditions or perceptions change.

    There’s a need to test CURRENT distortions, and compensate for those, rather than relying on past distortions. (Marketing people sometimes do this by throwing in extra questions designed to test for it).

  32. @TrevorWarne: “Soros has been writing harsh words about EU in the Guardian as well. Like Heseltine he wants UK to stay to help reform them!! Well “*&%k that” as they say – saving them from themselves twice was enough, outside of the Euro (thank God) why would they listen to any advice we give them if we beg to rejoin.”

    We will return because the EU has established that it does not really need us, and we have accepted that we cannot live without it.

    Their response to any suggestion to change will be to remember that we don’t matter to them.

    Also the pro-EU faction in the UK, which will obviously be in charge in such circumstances, will be pre-occupied with a significant number of angry Leavers. They are not going to want to draw attention to the EU’s failings.

  33. Getting back to polling.

    If the 2015 based model was to be correct again, would that mean the Tories had a massive lead?

    But if 2017 was right because of a change in voting patterns, how do we know that those patterns have not changed still further? In which case Labour doing fine.

    Is the thing that is unlikely to change rapidly the likelihood of an individual being selected by a survey, and their likelihood of responding to a survey? In which case would that allow the identification of trends?

  34. I thought that the increased turnout affected roughly ages 25-44 as well? Were these people previously downweighted? Being in that age range myself I couldn’t help but notice the difference in 2017 compared to previous elections. The previous attitude which some people had, that it didn’t matter who you vote for, nothing would change, was blown out of the water at the referendum where many of these people were on the losing side. People seemed a lot more motivated than in previous elections and my facebook feed showed quite a different campaign to what the Westminster Bubble thought was going on.

    http://www.britishelectionstudy.com/wp-content/uploads/2018/01/blog-turnout.png

  35. Echoing earlier comments about how difficult polling must be, has there ever been a comparably seismic change in background factors between two general elections, even setting aside how close together they were?

    David Cameron’s Tory/LibDem Coalition, Ed Miliband’s Labour and UKIP running stronger than a traditional third party, versus Theresa May’s Tories, Jeremy Corbyn’s Labour and an intensely polarising referendum inbetween.

    That’s got to be the polling equivalent of whiplash?

  36. YG has a Full Scottish poll for the Times,

    Westminster VI (changes from their January poll in brackets)

    SNP 40 (+4)
    SCon 27 (+4)
    SLab 23 (-5)
    SLD 7 (+1)
    SGP 2 (-1)
    UKIP 1 (-2)
    Other 0 (-1)

    Westminster seat projection (changes from 2017)
    SNP 43 (+8)
    SCon 11 (-2)
    SLD 4 (n/c)
    SLab 1 (-6)

    Also the annual Scottish Social Attitudes Survey has an interesting analysis of the two constitutional divisions that run through Scottish politics

    https://www.natcen.ac.uk/media/1595212/BSA_35_Scotland.pdf

  37. ON

    Why are Labour still struggling in Scotland? I would have thought someone like Corbyn would make the party more attractive north of the border

  38. The main threat to the YouGov alternative model is the wrong assumption set. For the time being it is unlikely to be undermined (it is less sensitive to the reallocation models of the traditional polls). However, once the distribution is set in the alternative model (through the Markov chain – this is what was probably used), it needs extraordinary shift to change the outcome variables (apart from the extreme marginals).

    As to the traditional ones – well, the adjustments all aim at sorting out the deductive and the inductive inference (which also implies precision of today and long term), which cannot be resolved at the same time.

    When the elections are far, even if the question is how would you vote tomorrow, the methodological changes aim at reducing the error rate. When we are close to elections, it is not really relevant – the question is the closeness of the poll to the outcome. These are two completely different questions, two completely different evaluations of the sampling and two completely different adjustment methods.

    I really hope that those who do the surveys know that it is impossible to reconcile the two, and understandably they go for the error reduction methodological changes (as the benchmark is the previous elections).

    I also have to add that I’m not at all convinced of the BES analysis of the 2015 elections – there is no word about the magnitude effect of the assumed fault.

    So, close to the election we have to hope that the sample corresponds to the population (voters), and in between elections that the adjustments are tested (during the campaign in 2017 I think all polling companies changed their methodology). However, the overall methodology of polls is not very good at spotting errors in simulations.

  39. Mike Pearce

    The latest YG Scottish poll shows Sturgeon’s popularity has dipped by 2 points since January to minus 2. Meanwhile Corbyn’s has dropped 27 points to minus 30.

    That would suggest that Corbyn’s appeal to Scots voters is overstated – often that’s done by folk who think that Scotland is a “left-wing” country because it used to elect Labour MPs.

    In reality, most Scots probably vote for what they see as the best interests of Scotland, and that may not involve deep concern for the governance of domestic matters in the English polity.

    Even many of us who are reasonably keen on public ownership would rather see such facilities under the control of Holyrood than Westminster, and Corbyn’s language doesn’t suggest that he thinks in those terms.

  40. Mike Pearce: because it’s just a single poll. Might well be an outlier.

    But also because attitudinal surveys have repeatedly shown that Scotland isn’t much more left-wing than England.

  41. Oldnat: yes, the Tories’ problems in Scotland stem mainly from the Thatcher era, when those living north of Hadrian’s wall were definitely losers of government policy. (Chiefly its opposition to devolution and its use of Scotland as a guinea pig to roll out the poll tax.)

  42. PollTroll

    I’m always puzzled by folk who assume that all those “north of Hadrian’s Wall” are in Scotland!

    I think your analysis of the status of SCon is somewhat dated! You might care to have a look at the table on page 8 of the Social Attitudes Survey on “How Scotland should be governed” –

    https://www.natcen.ac.uk/media/1595212/BSA_35_Scotland.pdf

    If there is a smidgeon of difference between the Lab, Con & LD attitudes of Westminster politicians on that issue, then it isn’t obvious to me.

  43. Thanks Anthony for a fascinating article, and thank God for some discussion of actual polling!

    Polling will be extremely difficult over the next few years (IMO) because of the overriding issue of the B-word, which cuts across usual party loyalties. e.g. if the B-word ends up being seen as a betrayal by Leavers, the Tories will suffer, but if it is seen as a success will that party benefit much, or will Labour-lite voters revert to DK/Labour at the next GE?

  44. Interesting study in the Times today about when tipping points can occur.

    “A study has found that social trends can reach a tipping point where they enter the mainstream, and it comes after about a quarter of the population adopt them.

    For the study, published in the journal Science, scientists devised an experiment to test the emergence of new ideas and social conventions. They also tested whether those trends can be changed by a committed minority of contrarian hipsters — or activists. The experiment involved two stages and almost 200 people. In the first stage they were put in pairs and shown a random photograph of a person they didn’t know. Each participant was given a selection of names and told to choose the one that suited the person. If they chose the same name they received money. If they came up with a different name they did not. Then they got to see what their partner had chosen.

    They were then paired with someone else and the process repeated. Because people were rewarded for conforming, the group gradually converged on the same name.

    In the second stage the scientists inserted the hipsters, whose role was to choose a different name and stick to it. Anyone paired with a hipster found they were no longer rewarded. Would the society ignore them or change to match them? When the group of hipsters was small their efforts were futile. People accepted the risk of being paired with someone who used the wrong name but did nothing about it. However, once the hipsters passed 25 per cent of the population they effected a rapid shift where the more pliable majority changed their behaviour.”

  45. For example, if I had too many people with ginger hair in my sample, the results would probably still be correct (unless there is some hitherto unknown relationship between voting and hair colour)

    But of course there is, Mr Wells, – people with red hair are more likely to vote SNP[1]. Or rather a higher proportion of people born in and living in Scotland have red hair. And people born in and living in Scotland are more likely to vote SNP. Which illustrates another problem – that biases are not always caused directly, but possibly by both being linked to a third, possibly unmeasured factor.

    There was a surprising lack of self-examination in the polling industry after last June. There was a general assumption that everyone ‘knew’ what the cause was, which was over-correction, especially regarding youth turnout. This piece shows that was far from the full story (as many of us already had said) and it’s interesting that BPC (who very quickly decided there was no need for an inquiry last June) are looking at things again.

    Anthony referred to BMG’s Report on their spectacular failure which was published a few days ago:

    http://www.bmgresearch.co.uk/wp-content/uploads/2018/06/BMG-Research-2017-General-Election-Methodology-Report-June-2018.pdf

    which has some very interesting things in it, especially regarding registration and turnout and a rather large elephant conspicuous by its absence.

    [1] Because of higher levels of Labour support among BME voters there may be other links to hair colour there as well.

  46. Mike Pearce: Why are Labour still struggling in Scotland? I would have thought someone like Corbyn would make the party more attractive north of the border

    Scotland voted strongly for Remain. Whatever Corbyn looks like to Leavers, to Remainers he looks like a Leaver.

    Also, he looks oblivious to devolution. AIUI, devolution is viewed favourably by a substantial majority, even if they are unionist in outlook. To me, he looks completely out of tune with Scotland.

  47. @Laszlo

    When the elections are far, even if the question is how would you vote tomorrow, the methodological changes aim at reducing the error rate. When we are close to elections, it is not really relevant – the question is the closeness of the poll to the outcome. These are two completely different questions, two completely different evaluations of the sampling and two completely different adjustment methods.

    I may or may not understand this. Are you saying that when an election is not immediately expected the VI question is getting at something like an attitude (are you pleased with the government?), whereas when the election is close the question is about behaviour (how will you vote tomorrow?)?

    If this is what you are saying it seems to me true. Different factors may clearly apply (e.g. how efficient your party is at getting out the vote, how likely you think your vote is to make a difference, etc.). However the questions are related and I would not call them completely different.

  48. A genuinely interesting thread, and one that gives great pause for thought.

    The interesting notion is that presumably the various corrections are applied if polling companies think their samples or turn out assumptions are incorrect, and to make these decisions traditionally means looking back at the last time. As many have observed, that means effectively fighting the last battle and hoping the various factors are repeated.

    In 2015 it was noted that there were periodic massive polling failures (1992) and it may be that the problems arise when there is a generational shift that alters the underlying patterns, and that this time we just happen to have had two major shifts in quick succession. But ultimately the application of corrections must surely depend on the human judgement of the polling companies – do they think their results are accurate?

    I’m interested in this aspect. There are always comments about ‘herding’ as we approach election dates, but I’m unaware that polling companies consciously change anything in their methodology that would deliver this. It was also interesting to note that Survation got the 2015 result spot on, with a late poll that they discounted as it didn’t look right.

    With the sources of errors identified in each of the last two elections, it does seem that it’s the human judgements rather than simple statistical failings that are the main problem, and I would imagine that this means there is always going to be room for error in polls, especially in circumstances of substantial change in political outlook among the population.

  49. Today’s Times reports a YouGov Poll:-
    Con 44 +2
    Lab 37 -2
    LD 8 -1

