Opinion polling on Brexit has not always been of the highest quality. Highly contentious political issues tend to attract sub-optimal polling, and Brexit has followed that trend. I’ve seen several Brexit polls come up with surprising findings based on agree/disagree statements – that is, questions asked in the form:

Do you agree with the following statement? I think Brexit is great
Agree
Disagree
Don’t know

This is a very common way of asking questions, but one that has a lot of problems. One of the basic rules of writing fair and balanced survey questions is that you should try to give equal prominence to both sides of the argument. Rather than ask “Do you support X?”, a survey should ask “Do you support or oppose X?”. In practice agree-disagree statements break that basic rule – they ask people whether they agree or disagree with one side of the argument, without mentioning the other side of the argument.

In some cases the opposite side of the argument is implicit. If the statement is “Theresa May is doing a good job”, then it is obvious to most respondents that the alternative view is that May is doing a bad job (or perhaps an average job). Even when it’s as obvious as this, it can still make a difference – for whatever reason, decades of academic research into questionnaire design suggest people are more likely to agree with statements than to disagree with them, regardless of what the statement is (generally referred to as “acquiescence bias”).

There is a substantial body of academic evidence exploring this phenomenon (see, for example, Schuman & Presser in the 1980s, or the more recent work of Jon Krosnick). It tends to find that around 10%-20% of people will agree with both a statement and its opposite if it is asked in both directions. Various explanations have been put forward for this in academic studies – that it’s a result of personality type, or that it is satisficing (people just trying to get through a survey with minimal effort). The point is that it exists.

This is not just a theoretical issue that turns up in artificial academic experiments – there are plenty of real-life examples in published polls. My favourite remains this ComRes poll for UKIP back in 2009. It asked if people agreed or disagreed with a number of statements, including “Britain should remain a full member of the EU” and “Britain should leave the European Union but maintain close trading links”. 55% of people agreed that Britain should remain a full member of the EU. 55% of people also agreed that Britain should leave the EU. In other words, at least 10% of the same respondents agreed both that Britain should remain AND leave.

There is another good real life example in this poll. 42% agreed with a statement saying that “divorce should not be made too easy, so as to encourage couples to stay together”. However, 69% of the same sample also agreed that divorce should be “as quick and easy as possible”. At least 11% of the sample agreed both that divorce should be as easy as possible AND that it should not be too easy.
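The “at least 10%” and “at least 11%” figures follow from simple inclusion-exclusion, since both statements were put to the same sample. A minimal sketch of the arithmetic:

```python
# Lower bound on the share of respondents who agreed with BOTH statements.
# By inclusion-exclusion: overlap >= agree_a + agree_b - 100 (and never below 0).
def min_overlap(pct_agree_a: float, pct_agree_b: float) -> float:
    """Smallest possible percentage agreeing with both statements."""
    return max(0.0, pct_agree_a + pct_agree_b - 100.0)

print(min_overlap(55, 55))  # 10.0 - the remain/leave example
print(min_overlap(42, 69))  # 11.0 - the divorce example
```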

Examples like these, where polls asked both sides of the argument and produced contradictory findings, are interesting quirks – but since they asked the statement in both directions they don’t mislead. However, it is easy to imagine how they would risk being misleading if they had asked the statement in only one direction. If the ComRes poll had only asked the pro-Leave statement, it would have looked as if a majority supported leaving. If it had only asked the pro-Remain statement, it would have looked as if a majority supported staying. With agree-disagree statements, if you don’t ask both sides, you risk getting a very skewed picture.

In practice, I fear the problem is often far more serious in published political polls. The academic studies tend to use quite neutrally worded, simple, straightforward statements. In the sort of political polling for pressure groups and campaigning groups that you see in real life the statements are often far more forcefully worded, and are often statements that justify or promote an opinion – below are some examples I’ve seen asked as agree-disagree statements in polls:

“The Brexit process has gone on long enough so MPs should back the Prime Minister’s deal and get it done”
“The result of the 2016 Referendum should be respected and there should be no second referendum”
“The government must enforce the minimum wage so we have a level playing field and employers can’t squeeze out British workers by employing immigrants on the cheap”

I don’t pick these because they are particularly bad (I’ve seen much worse), only to illustrate the difference. These are statements that are making an active argument in favour of an opinion, where the argument in the opposite direction is not being made. They do not give a reason why MPs may not want to back the Prime Minister’s deal, why a second referendum might be a good idea, why enforcing the minimum wage might be bad. It is easy to imagine that respondents might find these statements convincing… but that they might have found the opposite opinion just as convincing if they’d been presented with that. I would expect questions like this to produce a much larger bias in the direction of the statement if asked as an agree-disagree statement.

With a few exceptions I normally try to avoid running agree-disagree statements, but we ran some specially to illustrate the problems, splitting the sample so that one group of respondents were asked if they agreed or disagreed with a statement, and a second group were asked if they agreed or disagreed with a contrasting statement. As expected, it produced varied results.

For simple questions, like whether Theresa May is doing a good job, the difference is small (people disagreed with the statement that “Theresa May is doing a good job” by 57% to 15%, and agreed with the statement that “Theresa May is doing a bad job” by 52% to 18%). Almost a mirror image. On some of the other questions, the differences were stark:

  • If you ask if people agree that “The NHS needs reform more than it needs extra money” then people agree by 43% to 23%. However, if you ask if people agree with the opposite statement, that “The NHS needs extra money more than it needs reform”, then people also agree, by 53% to 20%.
  • If you ask if people agree or disagree that “NHS services should be tailored to the needs of populations in local areas, even if this means that there are differences across the country as a whole” then people agree by 43% to 18%. However, if you ask if they agree or disagree with a statement putting the opposite opinion – “NHS services should be the same across the country” – then people agree by 88% to 2%!
  • By 67% to 12% people agree with the statement that “Brexit is the most important issue facing the government and should be its top priority”. However, by 44% to 26% they also agree with the statement “There are more important issues that the government should be dealing with than Brexit”

I could go on – there are more results here (summary, full tabs) – but I hope the point is made. Agree/disagree statements appear to produce a consistent bias in favour of the statement, and while this can be minor in questions asking simple statements of opinion, if the statements amount to political arguments the scale of the bias can be huge.

A common suggested solution to this issue is to make sure that the statements in a survey are balanced, with an equal number of statements in each direction. So, for example, if you were doing a survey about attitudes towards higher taxes, rather than asking people if they agreed or disagreed with ten statements in favour of higher taxes, you’d ask if people agreed or disagreed with five statements in favour of higher taxes and five statements in favour of lower taxes.

This is certainly an improvement, but it is still less than ideal. First, it can produce contradictory results like the examples above. Secondly, in practice it can often result in rather artificial, clunky-sounding questions and double negatives. Finally, it is often difficult to make sure statements really are balanced (too often I have seen surveys that attempt a balanced statement grid, but where the statements in one direction are hard-hitting and compelling, and those in the other direction are deliberately soft-balled or unappetising).

The better solution is not to ask them as agree-disagree statements at all. Change them into questions with specific answers – instead of asking if people agree that “Theresa May is doing a good job”, ask if May is doing a good or bad job. Instead of asking if people agree that “The NHS needs reform more than it needs more money”, ask what people think the NHS needs more – reform or more money? Questions like the examples I gave above can easily be made better by pairing the contrasting statements and asking which better reflects respondents’ views:

  • Asked to pick between the two statements on NHS reform or funding, 41% of people think it needs reform more, 43% think it needs extra money more.
  • Asked to pick between the two statements on NHS services, 36% think they should be tailored to local areas, 52% would prefer them to be the same across the whole country.
  • Asked to pick between the two statements on the importance of Brexit, 58% think it is the most important issue facing the government, 27% think there are more important issues the government should be dealing with instead.

So what does this mean when it comes to interpreting real polls?

The sad truth is that, despite the known problems with agree-disagree statements, they are far from uncommon. They are quick to ask, require almost no effort to script and are very easy to interpret for clients after a quick headline. And I fear there are some clients to whom the problems with bias are an advantage, not an obstacle; you often see them in polls commissioned by campaigning groups and pressure groups with a clear interest in getting a particular result.

Whenever judging a poll (and this goes for observers reading them, and journalists choosing whether to report them) my advice has always been to go to polling companies’ websites and look at the data tables – look at the actual numbers and the actual question wording. If the questions behind the headlines have been asked using agree-disagree statements, you should be sceptical. It’s a structure that does have an inherent bias, and does result in more people agreeing than if the question had been asked a different way.

Consider how the results may have been very different if the statement had been asked in the opposite direction. If it’s a good poll, you shouldn’t have to imagine that – the company should have made the effort to balance the poll by asking some of the statements in the opposite direction. If they haven’t made that effort, well, to me that rings some alarm bells.

If you get a poll that’s largely made up of agree-disagree statements, that are all worded in the direction that the client wants the respondent to answer rather than some in each direction, that use emotive and persuasive phrasing rather than bland and neutral wording? You would be right to be cautious.


It is a year since the 2017 general election. I am sure lots of people will be writing articles looking back at the election itself and the year since, but I thought I’d write something about the 2017 polling error, something that has gone largely unexamined compared to the 2015 error. The polling companies themselves have all carried out their own internal examinations and reported to the BPC, and the BPC will be putting out a report based on that in due course. In the meantime, here are my own personal thoughts about the wider error across the industry.

The error in 2017 wasn’t the same as 2015.

Most casual observers of polls will probably have lumped the errors of 2015 and 2017 in together, and seen 2017 as just “the polls getting it wrong again”. In fact the nature of the error in 2017 was completely different to that in 2015. It would be wrong to say they are unconnected – the cause of the 2017 errors was often pollsters trying to correct the error of 2015 – but the way the polls were wrong was completely different.

To understand the difference between the errors in 2015 and the errors in 2017 it helps to think of polling methodology as being divided into two bits. The first is the sample – the way pollsters try to get respondents who are representative of the public, be that through their sampling itself or the weights they apply afterwards. The second is the adjustments they make to turn that sample into a measure of how people would actually vote, how they model things like turnout and accounting for people who say don’t know, or refuse to answer.

In 2015, the polling industry got the first of those wrong, and the second right (or at least, the second of those wasn’t the cause of the error). The Sturgis Inquiry into the 2015 polling error looked at every possible cause and decided that the polls had samples that were not representative. While the inquiry didn’t think the way pollsters predicted turnout was based on strong enough evidence, and recommended improvements there too, it ruled turnout out as the cause of the 2015 error.

In 2017 it was the opposite situation. The polling samples themselves had pretty much the correct result to start with, showing only a small Tory lead. More traditional approaches towards modelling turnout (which typically made only small differences) would have resulted in polls that only marginally overstated the Tory lead. The large errors we saw in 2017 were down to the more elaborate adjustments that pollsters had introduced. If you had stripped away all the attempts at modelling turnout, don’t knows and suchlike, then the underlying samples the pollsters were working with would have got the Conservative lead over Labour about right.

What did pollsters do that was wrong?

The actual things that pollsters did to make their figures wrong varied from pollster to pollster. For ICM, ComRes and Ipsos MORI, it looks as if new turnout models inflated the Tory lead; for BMG it was their new adjustment for electoral registration; for YouGov it was reallocating don’t knows. The details were different in each case, but the thing they had in common was that pollsters had introduced post-fieldwork adjustments that had larger impacts than at past elections, and which ended up over-adjusting in favour of the Tories.

In working out how pollsters came to make this error we need to have a closer look at the diagnosis of what went wrong in 2015. Saying that samples were “wrong” is easy; if you are going to solve it you need to identify how they were wrong. After 2015 the broad consensus among the industry was that the samples had contained too many politically engaged young people who went out to vote Labour, and not enough uninterested young people who stayed at home. Polling companies took a mixture of two different approaches towards dealing with this, though most companies did a bit of both.

One approach was to try and treat the cause of the error by improving the samples themselves, trying to increase the proportion of respondents who had less interest in politics. Companies started adding quotas or weights that had a more direct relationship with political interest, things like education (YouGov, Survation & Ipsos MORI), newspaper readership (Ipsos MORI) or straight out interest in politics (YouGov & ICM). Pollsters who primarily took this approach ended up with smaller Tory leads.

The other was to try and treat the symptom of the problem by introducing new approaches to turnout that assumed lower rates of turnout among respondents from demographic groups who had not traditionally turned out to vote in the past, and where pollsters felt samples had too many people who were likely to vote. The most notable example was the decision by some pollsters to replace turnout models based on self-assessment with turnout models based on demographics – downweighting groups like the young or working class who have traditionally had lower turnouts. Typically these changes produced polls with substantially larger Conservative leads.
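As a rough illustration of what a demographic turnout model does, here is a toy sketch – the respondents, parties and turnout rates are invented for illustration, not taken from any pollster’s actual model:

```python
# Toy sketch of a demographic turnout model: each respondent's vote counts in the
# headline figures in proportion to the turnout rate assumed for their age group.
respondents = [
    {"age_group": "18-24", "vote": "Lab"},
    {"age_group": "18-24", "vote": "Lab"},
    {"age_group": "35-54", "vote": "Lab"},
    {"age_group": "35-54", "vote": "Con"},
    {"age_group": "65+",   "vote": "Con"},
    {"age_group": "65+",   "vote": "Con"},
]
assumed_turnout = {"18-24": 0.45, "35-54": 0.70, "65+": 0.80}  # hypothetical rates

def weighted_shares(sample, turnout):
    totals = {}
    for r in sample:
        w = turnout[r["age_group"]]
        totals[r["vote"]] = totals.get(r["vote"], 0.0) + w
    grand_total = sum(totals.values())
    return {party: round(100 * v / grand_total, 1) for party, v in totals.items()}

# Unweighted, this toy sample splits 50/50; downweighting the young turns it into
# a clear Conservative lead.
print(weighted_shares(respondents, assumed_turnout))  # {'Lab': 41.0, 'Con': 59.0}
```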

So was it just to do with pollsters getting youth turnout wrong?

This explanation chimes nicely with the idea that the polling error was down to polling companies getting youth turnout wrong, that young people actually turned out at an unusually high level, but that polling companies fixed youth turnout at an artificially low level, thereby missing this surge in young voting. This is an attractive theory at first glance, but as is so often the case, it’s actually a bit more complicated than that.

The first problem with the theory is that it’s far from clear whether there was a surge in youth turnout. The British Election Study has cast doubt upon whether or not youth turnout really did rise that much. That’s not a debate I’m planning on getting into here, but suffice to say, if there wasn’t really that much of a leap in youth turnout, then it cannot explain some of the large polling misses in 2017.

The second problem with the hypothesis is that there isn’t much of a relationship between whether a polling company had the right proportion of young people in its figures and whether it got the result right.

The chart below shows the proportion of voters aged under 25 in each polling company’s final polling figures. The blue bar is the proportion in the sample as a whole, the red bar the proportion in the final voting figures, once pollsters had factored in turnout, dealt with don’t knows and so on. As you would expect, everyone had roughly the same proportion of under-25s in their weighted sample (in line with the actual proportion of 18-24 year olds in the population), but among their samples of actual voters it differs radically. At one end, less than 4% of BMG’s final voting intention figures were based on people aged under 25. At the other end, almost 11% of Survation’s final voting figures were based on under-25s.

According to the British Election Study, the closest we have to authoritative figures, the correct figure should have been about 7%. That implies Survation got it right despite having far too many young people. ComRes had too many young people, yet had one of the worst understatements of Labour support. MORI had close to the correct proportion of young people, yet still got it wrong. There isn’t the neat relationship we’d expect if this was all about getting the correct proportion of young voters. Clearly the explanation must be rather more complicated than that.

So what exactly did go wrong?

Without a nice, neat explanation like youth turnout, the best overarching explanation for the 2017 error is that polling companies seeking to solve the overstatement of Labour in 2015 simply went too far and ended up understating them in 2017. The actual details of this differed from company to company, but it’s fair to say that the more elaborate the adjustments that polling companies made for things like turnout and don’t knows, the worse they performed. Essentially, polling companies over-did it.

Weighting down young people was part of this, but it was certainly not the whole explanation and some pollsters came unstuck for different reasons. This is not an attempt to look in detail at each pollster, as they may also have had individual factors at play (in BMG’s report, for example, they’ve also highlighted the impact of doing late fieldwork during the daytime), but there is a clear pattern of over-enthusiastic post-fieldwork adjustments turning essentially decent samples into final figures that were too Conservative:

  • BMG’s weighted sample would have shown the parties neck-and-neck. With just traditional turnout weighting they would have given the Tories around a four point lead. However, combining this with an additional down-weighting by past non-voting and the likelihood of different age/tenure groups to be registered to vote changed this into a 13 point Tory lead.
  • ICM’s weighted sample would have shown a five point Tory lead. Adding demographic likelihood to vote weights that largely downweighted the young increased this to a 12 point Tory lead.
  • Ipsos MORI’s weighted sample would have shown the parties neck-and-neck, and MORI’s traditional 10/10 turnout filter looks as if it would have produced an almost spot-on 2 point Tory lead. An additional turnout filter based on demographics changed this to an 8 point Tory lead.
  • YouGov’s weighted sample had a 3 point Tory lead, which would’ve been unchanged by their traditional turnout weights (and which also exactly matched their MRP model). Reallocating don’t knows changed this to a 7 point Tory lead.
  • ComRes’s weighted sample had a 1 point Conservative lead, and by my calculations their old turnout model would have shown much the same. Their new demographic turnout model did not actually understate the proportion of young people, but did weight down working class voters, producing a 12 point Tory lead.

Does this mean modelling turnout by demographics is dead?

No. Or at least, it shouldn’t do. The pollsters who got it most conspicuously wrong in 2017 were indeed those who relied on demographic turnout models, but this may have been down to the way they did it.

Normally weights are applied to a sample all at the same time using “rim weighting” (this is an iterative process that lets you weight by multiple items without them throwing each other off). What happened with the demographic turnout modelling in 2017 is that companies effectively did two lots of weights. First they weighted the demographics and past vote of the data so it matched the British population. Then they effectively added separate weights by things like age, gender and tenure so that the demographics of those people included in their final voting figures matched the people who actually voted in 2015. The problem is this may well have thrown out the past vote figures, so the 2015 voters in their samples matched the demographics of 2015 voters, but didn’t match the politics of 2015 voters.
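For readers unfamiliar with rim weighting, a minimal sketch of the iterative process is below – the sample and targets are purely illustrative, not any company’s actual scheme, and real surveys weight by many more variables:

```python
import numpy as np

# Minimal sketch of rim weighting (iterative proportional fitting): adjust the
# weights to each set of marginal targets in turn, and repeat until all the
# margins hold at the same time.
def rim_weight(sample, target_margins, n_iter=50):
    """sample: list of dicts of respondent attributes.
    target_margins: {variable: {category: target_share}}."""
    w = np.ones(len(sample))
    for _ in range(n_iter):
        for var, targets in target_margins.items():
            for cat, share in targets.items():
                in_cat = np.array([r[var] == cat for r in sample])
                current_share = w[in_cat].sum() / w.sum()
                if current_share > 0:
                    w[in_cat] *= share / current_share
    return w / w.mean()   # normalise so the average weight is 1

sample = [{"age": "18-24", "past_vote": "Con"},
          {"age": "18-24", "past_vote": "Lab"},
          {"age": "65+",   "past_vote": "Con"},
          {"age": "65+",   "past_vote": "Lab"}]
targets = {"age":       {"18-24": 0.12, "65+": 0.88},
           "past_vote": {"Con": 0.38, "Lab": 0.62}}
print(rim_weight(sample, targets).round(2))
```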

It’s worth noting that some companies used demographic-based turnout modelling and were far more successful. Kantar’s polling used a hybrid turnout model based upon both demographics and self-reporting, and was one of the most accurate polls. Surveymonkey’s turnout modelling was based on the demographics of people who voted in 2015, and produced only a 4 point Tory lead. YouGov’s MRP model used demographics to predict respondents’ likelihood to vote and was extremely accurate. There were companies who made a success of it, so it may be more a question of how to do it well, rather than whether one does it at all.

What have polling companies done to correct the 2017 problems, and should I trust them?

For individual polling companies the errors of 2017 are far more straightforward to address than in 2015. For most polling companies it has been a simple matter of dropping the adjustments that went wrong. All the causes of error I listed above have simply been reversed – for example, ICM have dropped their demographic turnout model and gone back to asking people how likely they are to vote, ComRes have done the same. MORI have stopped factoring demographics into their turnout, YouGov aren’t reallocating don’t knows, BMG aren’t currently weighting down groups with lower registration.

If you are worried that the specific type of polling error we saw in 2017 could be happening now, you shouldn’t be – all the methods that caused the error have been removed. A simplistic view that the polls understated Labour in 2017 and that, therefore, Labour are actually doing better than the polls suggest is fallacious.
However, that is obviously not a guarantee that polls couldn’t be wrong in other ways.

But what about the polling error of 2015?

This is a much more pertinent question. The methodology changes that were introduced in 2017 were intended to correct the problems of 2015. So if the changes are reversed, does that mean the errors of 2015 will re-emerge? Will polls risk *overstating* Labour support again?

The difficult situation the polling companies find themselves in is that the methods used in 2017 would have got 2015 correct, but got 2017 wrong. The methods used in 2015 would have got 2017 correct, but got 2015 wrong. The question we face is what approach would have got both 2015 and 2017 right?

One answer may be for polling companies to use more moderate versions of the changes they introduced in 2017. Another may be to concentrate more on improving samples, rather than on post-fieldwork adjustments to turnout. As we saw earlier in the article, polling companies took a mixture of two approaches to solving the problem of 2015. The approach of “treating the symptom” by changing turnout models and similar ended up backfiring, but what about the first approach – what became of the attempts to improve the samples themselves?

As we saw above, the actual samples the polls used were broadly accurate. They tended to have the smaller parties too high, but the balance between Labour and Conservative was pretty accurate. For one reason or another, the sampling problem from 2015 appears to have completely disappeared by 2017. 2015 samples were skewed towards Labour, but in 2017 they were not. I can think of three possible explanations for this.

  • The post-2015 changes made by the polling companies corrected the problem. This seems unlikely to be the sole reason, as polling samples were better across the board, with those companies who had done little to improve their samples performing in line with those who had made extensive efforts.
  • Weighting and sampling by the EU ref made samples better. There is one sampling/weighting change that nearly everyone made – they started sampling/weighting by recalled EU ref vote, something that was an important driver of how people voted in 2017. It may just be that providence has provided the polling companies with a useful new weighting variable that meant samples were far better at predicting vote shares.
  • Or perhaps the causes of the problems in 2015 just weren’t an issue in 2017. A sample being wrong doesn’t necessarily mean the result will be wrong. For example, if I had too many people with ginger hair in my sample, the results would probably still be correct (unless there is some hitherto unknown relationship between voting and hair colour). It’s possible that – once you’ve controlled for other factors – in 2015 people with low political engagement voted differently to engaged people, but that in 2017 they voted in much the same way. In other words, it’s possible that the sampling shortcomings of 2015 didn’t go away, they just ceased to matter.

It is difficult to come to a firm answer with the data available, but whichever mix of these is the case, polling companies shouldn’t be complacent. Some of them have made substantial attempts to improve their samples since 2015, but if the problems of 2015 disappeared because of the impact of weighting by Brexit or because political engagement mattered less in 2017, then we cannot really tell how successful they were. And it stores up potential problems for the future – weighting by a referendum that happened in 2016 will only be workable for so long, and if political engagement didn’t matter this time, it doesn’t mean it won’t in 2022.

Will MRP save the day?

One of the few conspicuous successes in the election polling was the YouGov MRP model (that is, multilevel regression and post-stratification). I expect that come the next election there will be many other attempts to do the same. I will urge one note of caution – MRP is not a panacea for polling’s problems. MRP models can go wrong, and they still rely on the decisions people make in designing the model they run upon.

MRP is primarily a method of modelling opinion at lower geographical areas from a big overall dataset. Hence in 2017 YouGov used it to model the share of the vote in the 632 constituencies in Great Britain. In that sense, it’s a genuinely important step forward in election polling, because it properly models actual seat numbers and, from there, who will win the election and will be in a position to form a government. Previously polls could only predict shares of the vote, which others could use to project into a result using the rather blunt tool of uniform national swing. MRP produces figures at the seat level, so can be used to predict the actual result.

Of course, if you’ve got shares of the vote for each seat then you’ll also be able to use it to get national shares of the vote. However, at that level it really shouldn’t be that different from what you’d get from a traditional poll that weighted its sample using the same variables and the same targets (indeed, the YouGov MRP and traditional polls showed much the same figures for much of the campaign – the differences came down to turnout adjustments and don’t knows). Its level of accuracy will still depend on the quality of the data, the quality of the modelling and whether the people behind it have made the right decisions about the variables used in the model and on how they model things like turnout… in other words, all the same things that determine if an opinion poll gets it right or not.
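To make the mechanics a little more concrete, here is a minimal sketch of the post-stratification step only, with invented cell predictions and counts – in the real thing the cell predictions come from a multilevel regression over far richer demographics:

```python
# Post-stratification sketch: combine model predictions for each demographic cell
# with the number of people in that cell in a given constituency.
cell_predictions = {          # hypothetical P(vote Con) by (age group, education)
    ("18-24", "degree"): 0.18, ("18-24", "no degree"): 0.30,
    ("65+",   "degree"): 0.48, ("65+",   "no degree"): 0.62,
}
constituency_cells = {        # hypothetical census-style counts for one seat
    ("18-24", "degree"): 6000, ("18-24", "no degree"): 9000,
    ("65+",   "degree"): 8000, ("65+",   "no degree"): 17000,
}

def poststratify(preds, counts):
    total = sum(counts.values())
    return sum(preds[cell] * n for cell, n in counts.items()) / total

print(round(poststratify(cell_predictions, constituency_cells), 3))  # 0.454
# Doing this for every seat gives seat-level shares; combining the seat estimates,
# weighted by electorate size, gives the national share.
```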

In short, I do hope the YouGov MRP model works as well in 2022 as it did in 2017, but MRP as a technique is not infallible. Lord Ashcroft also ran an MRP model in 2017, and that was showing a Tory majority of 60.

TLDR:

  • The polling error in 2017 wasn’t a repeat of 2015 – the cause and direction of the error were complete opposites.
  • In 2017 the polling samples would have got the Tory lead pretty much spot on, but the topline figures ended up being wrong because pollsters added various adjustments to try and correct the problems of 2015.
  • While a few pollsters did come unstuck over turnout models, it’s not as simple as it being all about youth turnout. Different pollsters made different errors.
  • All the adjustments that led to the error have now been reversed, so the specific error we saw in 2017 shouldn’t reoccur.
  • But that doesn’t mean polls couldn’t be wrong in other ways (most worryingly, we don’t really know why the underlying problem behind the 2015 error went away), so pollsters shouldn’t get complacent about potential polling error.
  • MRP isn’t a panacea to the problems – it still needs good modelling of opinion to get accurate results. If it works though, it can give a much better steer on actual seat numbers than traditional polls.



Donald Trump has been citing Brexit as the model of how he could win the election despite expectations, and his surrogates have been suggesting there might be a shy Trump vote, like Brexit. So what, if any, lessons can we learn about the US election from recent polling experience in Britain?

In 2015 the British polls got the general election wrong. Every company had Labour and Conservative pretty much neck-and-neck, when in reality the Conservatives won by seven points. In contrast, the opinion polls as a whole were not wrong on Brexit, or at least, they were not all that wrong. Throughout the referendum campaign polls conducted by telephone generally showed Remain ahead, but polls conducted online generally showed a very tight race. Most of the online polls towards the end of the campaign showed Leave ahead, and polls by TNS and Opinium showed Leave ahead in their final eve-of-referendum polls.

That’s the first place the parallel falls down – Brexit wasn’t a surprise because the polls were wrong. The polls were showing a race that was neck-and-neck. It was a surprise because people hadn’t believed or paid attention to that polling evidence. The media expected Remain would win, took polls showing Remain ahead more seriously, and a false narrative built up that the telephone polls were more accurately reflecting the race when, in the event, the online polls showing Leave ahead were right. This is not the case in the US – the media don’t think Trump will lose because they are downplaying inconvenient polling evidence, they think Trump will lose because the polling evidence consistently shows that.

In the 2015 general election, however, the British polls really were wrong, and while some of the polls got Brexit right, some did indeed show solid Remain victories. Do either of those have any relevance for Trump?

The first is the claim of shy voters. Much as 1948 is the famous example of polling failure in the US, in this country 1992 was the famous mistake, and was put down to “Shy Tories” – that is, people who intended to vote Conservative, but were unwilling to admit it to pollsters. Shy voters are extremely difficult to diagnose. If people lie to pollsters about how they’ll vote before the election but tell the truth afterwards, then it is impossible to distinguish “shy voters” from people changing their minds (in the case of recent British polls this does not appear to have happened: in both the 2015 election and the 2016 EU referendum, recontact surveys found no significant movement towards the Conservatives or towards Leave). Alternatively, if people are consistent in lying to pollsters about their intentions beforehand and lying about how they voted afterwards, it’s impossible to catch them out.

The one indirect way of diagnosing shy voters is to compare the answers given to surveys using live interviewers, and surveys conducted online (or in the US, using robocalls – something that isn’t regularly done in the UK). If people are reluctant to admit to voting a certain way, they should be less embarrassed when it isn’t an actual human being doing the interviewing. In the UK the inquiry used this approach to rule out “shy Tories” as a cause of the 2015 polling error (online polls did not have a higher level of Tory support than phone polls).

In the US election there does appear to be some prima facie evidence of “Shy Trumpers”* – online polls and robopolls have tended to produce better figures for Donald Trump than polls conducted by a human interviewer. However, when this same difference was evident during the primary season the polls without a live interviewer were not consistently more accurate (and besides, even polls conducted without a human interviewer still have Clinton reliably ahead).

The more interesting issue is sample error. It is wrong to read directly across from Brexit to Trump – while there are superficial similarities, these are different countries, very different sorts of elections, in different party systems and traditions. There will be many different drivers of support. To my mind the interesting similarity though is the demographics – the type of people who vote for Trump and voted for Brexit.

Going back to the British general election of 2015, the inquiry afterwards identified sampling error as the cause of the polling error: the sort of people who could be contacted by phone and agreed to take part, and the sort of people who joined online panels, were unrepresentative in a way that weights and quotas were not then correcting. While the inquiry didn’t specify how the samples were wrong, my own view (and one that is shared by some other pollsters) is that the root cause was that polling samples were too engaged, too political, too educated. We disproportionately got politically aware graduates, the sort of people who follow politics in the media and understand what is going on, and didn’t get enough of the poorly educated who pay little attention to politics. Since then several British companies have adopted extra weights and quotas by education level and level of interest in politics.

The relevance for Brexit polling is that there was a strong correlation between educational qualifications and how people voted. Even within age cohorts, graduates were more likely to vote to Remain, while people with few or no educational qualifications were more likely to vote to Leave. People with a low level of interest in politics were also more likely to vote to Leave. These continuing sampling issues may well have contributed to the errors of those pollsters who got it wrong in June.

One thing that Brexit does have in common with Trump is those demographics. Trump’s support is much greater among those without a college degree. I suspect if you asked you’d find it was also greater among people who don’t normally pay much attention to politics. In the UK those are groups we’ve had difficulty properly representing in polling samples – if US pollsters have similar issues, then there is a potential source of error. College degree seems to be a relatively standard demographic in US polling, so I assume that is already accounted for. How much interest people have in politics is more nebulous, and less easy to measure or control.

In Britain the root cause of polling mishaps in 2015 (and for some, but not all, companies in 2016) seems to be that the declining pool of people still willing to take part in polls under-represented certain groups, and that those groups were less likely to vote for Labour, more likely to vote for Brexit. If (and it’s a huge if – I am only reporting the British experience, not passing judgement on American polls) the sort of people who American pollsters struggle to reach in these days of declining response rates are more likely to vote for Trump, then they may experience similar problems.

Those thinking that the sort of error that affected British polls could happen in the US are indeed correct… but could happen is not the same as is happening. Saying something is possible is a long way from there being any evidence that it actually is happening. “Some of the British polls got Brexit wrong, and Trump is a little bit Brexity, therefore the polls are wrong” really doesn’t hold water.


*This has no place in a sensible article about polling methodology, but I feel I should point out to US readers that in British schoolboy slang when I was a kid – and possibly still today – to Trump is to fart. “Shy Trump” sounds like it should refer to surreptitiously breaking wind and denying it.


Over the last couple of years Labour’s lead has gradually been whittled away: from ten points in 2012, they are now holding onto only the very narrowest of leads, with many polls showing them neck and neck. At the same time we have seen UKIP’s support getting ever higher, with polls regularly putting them in the mid-teens. One naive assumption could be that supporters have moved directly from Labour to UKIP, but in reality there is a lot of churn back and forth between parties. A political party could be picking up support from one group of voters, but losing an equal number of voters somewhere else. The voters now backing UKIP could be people who earlier in the Parliament were backing Labour, even if they didn’t vote Labour in 2010.

Every month YouGov carry out around twenty polls for the Sun and the Sunday Times. In any individual poll the crossbreaks of 2010 Conservative, Labour and Liberal Democrat voters are too small to be robust, but by aggregating the polls from a whole month we have enough responses to really examine the underlying churn, and by comparing the figures from 2012 and 2013 to today, we can see how party support has changed.

All these charts are based on YouGov’s figures. For simplicity’s sake the movements between the parties are always *net* figures – for example, there are a very small number of people who voted Labour last time but said they’d vote Lib Dem this time, but the vast bulk of the movement is in the opposite direction, so I’ve netted them up to get the overall movement between each pair of parties. I’ve also excluded very small movements of less than 0.2%. The percentages are percentages of the whole sample, not of each party’s support, and because the sample also includes people who say don’t know or won’t vote, things don’t add up to 100%. With that in mind…

[Chart: churn between 2010 vote and current voting intention, October 2012]

Here’s October 2012, a high point for Labour when they were enjoying an average lead of around 10 points in YouGov’s national polls. Labour’s vote at the time was very robust, they were making a very small net loss to UKIP, but otherwise their vote from 2010 was solid and they had added to it small amounts of support from 2010 non-voters and Conservatives and a large chunk of former Liberal Democrats. Lib Dem support had already slumped, with the large majority of their support going to either Labour or to Don’t know/Would not vote (DK/WNV). The Conservatives had started to lose support to UKIP, but it wasn’t yet a flood – they were also losing some support to Labour and a large chunk to DK/WNV.

[Chart: churn between 2010 vote and current voting intention, October 2013]

Moving onto October 2013, Labour’s lead had now fallen to around 6 points in YouGov’s national polls. They were still holding onto their 2010 support, but their gains from the Conservatives and non-voters were starting to falter. The movement of support from the Conservatives to UKIP had vastly increased, but part of this was balanced out by fewer Con-to-DK/WNV and Con-to-Lab switchers. The number of lost Tories was growing, but lost Tories were also switching their destination, saying they’d support UKIP rather than saying Labour or don’t know. The Liberal Democrats and Labour were also starting to see increased movement to UKIP, though at this point the big chunk of LD-to-Lab voters remained solid.

[Chart: churn between 2010 vote and current voting intention, October 2014]

Finally, here is the picture from October 2014. Labour’s average lead in YouGov’s polls last month was just 1.5 points and their retained support from 2010 is now faltering. In 2012, 20.6% of our polls were made up of people who had voted Labour in 2010 and would do so again; that has now dropped to 16.6%. Those 2010 Labour voters are now flaking away towards UKIP, the Greens and the SNP. Movement from Con-to-Lab has now dried up completely. The chunk of Con-to-UKIP voters has continued to grow, but mostly at the expense of Con-to-DK/WNV, meaning Tory support has remained largely unchanged. Most importantly, that solid block of LD-to-Lab switchers has started to wither, down from 6.6% of the sample to 4.6%. The Liberal Democrats themselves aren’t doing any better, but their former supporters are scattering more widely, moving to the Tories, UKIP and Greens.

Comparing the churn from 2012 and now, you can see Labour’s issue. In 2012 all the arrows pointed to Labour: they were picking up support from everywhere and holding on to what they had. Today they still have the benefit of a strong transfer from the Liberal Democrats (though even that’s declining), but they are leaking support in every direction – to the Greens, to UKIP and to the SNP.
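As a footnote on method, here is a minimal sketch of the netting calculation described above – the gross flows are invented for illustration, not YouGov’s actual figures:

```python
# Net off the gross flows between a pair of parties; flows smaller than 0.2% of
# the sample are dropped, as in the charts above.
gross_flows = {             # % of the whole sample moving from 2010 vote to now
    ("Lab", "LD"): 0.3,     # hypothetical: 2010 Labour voters now saying Lib Dem
    ("LD", "Lab"): 6.8,     # hypothetical: 2010 Lib Dems now saying Labour
}

def net_flow(flows, a, b, threshold=0.2):
    net = flows.get((a, b), 0.0) - flows.get((b, a), 0.0)
    if abs(net) < threshold:
        return None                                # too small to chart
    return (a, b, round(net, 1)) if net > 0 else (b, a, round(-net, 1))

print(net_flow(gross_flows, "LD", "Lab"))          # ('LD', 'Lab', 6.5)
```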

One of the reasons the Conservatives ended up falling short at the last election was that they failed to clearly identify themselves as THE party for change – the public wanted rid of Gordon Brown and Labour, but following the debates Nick Clegg managed to make many people think the Liberal Democrats were the more convincing alternative. Ed Miliband may face a similar problem: the government still isn’t popular and still has a relatively low level of support, but the anti-government vote seems to be fracturing away from Labour to alternative non-Conservative parties like UKIP, the Greens and the SNP.

(This post is also featured on the New Statesman’s May2015 site here)


Final lap

A year to go until the general election, meaning there are an awful lot of “year to go till the election” posts out there (though two new things certainly worth looking at are the new British Election Study site here and the Polling Observatory’s new election prediction here). The election prediction by Rob Ford, Will Jennings, Mark Pickup and Christopher Wlezien takes a similar approach to Stephen Fisher’s, which I’ve discussed here before, in averaging current polls and then assuming that they move in the same way over the next 12 months as they have done in the final year of previous Parliaments. Unlike Steve’s projection, which has a Tory lead, Ford et al predict a minuscule (0.4%) Labour lead. The differences between the two projections are small and technical – Ford et al assume a regression towards the long-term average, while I think Steve assumes a regression towards the previous election result; the inputs are slightly different (the Polling Observatory one corrects for errors at the last election, while my average, which Steve uses, doesn’t – the largest effect of that being that Steve predicts a higher Lib Dem score); and there are different smoothing effects in there. Expect more predictions along these lines to pop out of the woodwork in the year ahead.
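For anyone curious about the mechanics, a toy sketch of the regression-to-the-mean idea behind these forecasts is below – the pull coefficient is invented for illustration; the real models estimate it from how polls have moved in the final year of past Parliaments:

```python
# Toy sketch: the forecast pulls the current poll average part of the way back
# towards an anchor - the long-run average of the Parliament for Ford et al,
# or the previous election result in Fisher's approach.
def forecast_share(current_avg: float, anchor: float, pull: float = 0.5) -> float:
    """pull = assumed fraction of the gap to the anchor that closes by polling day."""
    return current_avg + pull * (anchor - current_avg)

# e.g. a party currently averaging 36% whose long-run average this Parliament is 40%
print(forecast_share(36.0, 40.0))   # 38.0 with the assumed pull of 0.5
```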

Anyway, I’ve previously written about what I think are the big five questions that will decide the election: how will the improving economy impact on voting intentions (especially if or as wages start to rise above inflation)? Will Ed Miliband’s mediocre opinion poll ratings increase in salience closer to the election, given they aren’t currently preventing a Labour lead? If and how quickly will UKIP support subside following the European elections? To what degree, if at all, can Lib Dem incumbents resist the tide against them? And, of course, what happens in the Scottish independence referendum? So for today, I’m instead looking at the timetable for the year ahead. These are essentially the “known unknowns” – the things that will happen before the next election but whose impact we don’t yet know, as opposed to all those unpredictable events that will also happen.

25 MAY 2014. European election results. The last set of mid-term elections and the beginning of the final lap. Barring any big surprises UKIP will come top or almost top, the media will go into another Farage frenzy for a couple of weeks and UKIP will enjoy a big spike in the Westminster opinion polls. Do not be surprised to see them at 20% or more in some polls. The question then becomes one of how much of that support is retained over the next eleven months. Also watch how the other parties react: will the Conservative backbenches panic, will the leadership be tempted to ape UKIP? I expected them to go all ferrets-in-a-sack after UKIP did well in the local elections last year, but they held it together surprisingly well. If the Lib Dems do incredibly badly keep an eye on them too – they have been remarkably disciplined as they march towards the guns so far.

5 JUNE 2014. Newark – coming shortly after the European elections we have the Newark by-election. The Conservatives have a fairly chunky majority, it’s not ideal territory for UKIP, and UKIP have picked a candidate who plays to their stereotypes rather than challenging them as Diane James did in Eastleigh, but the timing means UKIP will likely still be enjoying a big boost.

JUNE 2014. The Queen’s Speech and the Private Members’ Ballot – the final session of a Parliament won’t have many exciting bills left, but watch who wins the private members’ ballot. If a compliant enough Conservative comes top of the ballot then they’ll re-introduce the EU Referendum Bill that got lost in the Lords last time, and if it passes the Commons unamended again the Parliament Act would come into play. Labour and the Liberal Democrats may have to act to kill it in the Commons this time round. The Conservatives will hope a second try at the referendum bill will help win back UKIP supporters; a less charitable interpretation would be that it will offer the Conservatives an exciting opportunity to bang on about a subject normal voters don’t much care about every Friday for six months.

JULY 2014? Summer reshuffle – David Cameron has at least one big reshuffle before the general election (two if the coalition is brought to a formal end at some point), which will be his opportunity to put in place the team he wants for the general election. Cameron’s nature so far has been to avoid lots of changes, and it’s rare for a reshuffle to be drastic enough to intrude upon public opinion, but it will determine who some of the big players are.

18 SEPT 2014. Scottish referendum – this is by far the biggest known unknown still facing us. If there is a NO vote (and while the trend has been towards YES, all the polls in the campaign have shown NO ahead) then it will at least have a substantial political impact in Scotland. In the event of a YES vote, absolutely everything would change. There would be a question mark over whether David Cameron should resign, certainly the political agenda would instantly be dominated by questions about the Scottish independence negotiations, and the 2015 election would be fought in the knowledge that 40-odd Labour MPs would be elected for a period of only a year.

21 SEPT 2014 – Conference season. This is one of the few fixed, big events of the year that has the potential to impact on public opinion. The dates are a bit mixed up this year – normally the order goes Lib Dems, Labour, Conservatives, but because the normal dates would have clashed with the Scottish referendum the Liberal Democrats have moved their conference to last, so it will go Lab, Con, LD. All three will be a showcase for the general election; people will be paying more attention as the election approaches, and expect ups and downs in the polls as each party gets its chance in the spotlight.

OCTOBER 2014? New EU Commissioner – not something that will be noticed by the general public, but does have the opportunity to precipitate a by-election if a sitting MP is sent off to Europe as the new British EU Commissioner. It might even precipitate….

DATE TBC. Boris makes his mind up – we don’t know when it will happen, but at some point or other Boris Johnson will either stand for Parliament or rule out the possibility of standing at the next election (even if that’s at the close of nominations… though I expect the Conservative party will want to shut it down one way or the other long before that). Given it’s Boris, it will attract public attention; how the Conservative party manage any return to Parliament will determine whether they can use Boris in a role that helps them or whether it’s seen only as the first shot in a leadership campaign.

DEC 2014. Autumn statement – presumably any big changes will be in the budget, but if the economic news is positive by the Autumn it’s a chance for George Osborne to highlight it.

DATE TBC. The end of the coalition – at some point the coalition between the Liberal Democrats and the Conservatives has to end or, at least, it needs to become clear how it will end. The three obvious possibilities are a disordered breakdown over some issue, resulting in a minority Tory government; disengaging to confidence and supply for the final few months; or remaining in coalition right to polling day. I suspect the first one won’t happen now, which leaves either an ordered break-up, with the Lib Dem ministers resigning but continuing to support the government in matters of confidence, or remaining in office right till the end. Even in the latter case, in order to fight an effective election campaign the Lib Dems will at some point need to appoint spokespeople in areas where they don’t have cabinet ministers and announce policies different to those of the coalition government. How will that impact on Lib Dem support?

JAN 2015. The long campaign begins – in many ways it’s begun already, or will begin after the Euros or after conference season. It’s a matter of perception, but Christmas is the last real break before the election, and on their return in the New Year we will likely see a slew of announcements and policies, the start of the real campaign, and in my view the time when the polls start to come into focus and start to resemble the final result. Don’t get me wrong – there will still be time for things to change, there will still be a budget, manifestos, announcements and possibly debates, but the clock is ticking.

MAR 2015?. The Budget. Budgets are often seen as an opportunity for governments to win support by handing out election bribes. As I write here every year, in recent budgets that really doesn’t seem to have been the result – it’s more common for bad budgets to damage support than good ones to win support. Still, it will be an opportunity for Osborne to give away something to try and win votes, or at least try and portray himself as a reliable safe pair of hands that the country will want to re-elect.

30 MAR 2015. Parliament dissolved.

APR 2015? The Leaders’ debates. How much impact they had last time is still debated (did they genuinely increase Lib Dem support, or was it all froth? Did the opportunity cost of the debates dominating the campaign prevent other changes in public support?), but they certainly have the potential to make a difference. We obviously don’t know what the format will be, when they will happen, who they will involve or even if they’ll happen at all (the genie can go back in the bottle – after the famous JFK v Nixon debate in 1960 there wasn’t another one till 1976) – much of the briefing now by Labour and the Conservatives is probably largely grandstanding and negotiating stances, partly aimed at showing willing so they can paint it as the “other side’s fault” if they don’t go ahead. I wouldn’t expect any debates to have as big an impact as in 2010 because they aren’t “new” anymore – the exception would be if somehow the other parties did agree to Nigel Farage taking part. For a smaller party, what happens at the debate is not as important as the credibility brought by being part of the debate in the first place.

7 MAY 2015 – Election Day