One of the key pieces of evidence about why the polls got it wrong has today popped into the public domain – the British Election Study face-to-face survey. The data itself is downloadable here if you have SPSS or Stata, and the BES team have written about it here and here. The BES has two elements – an online panel study, going back to the same people before, during and after the election campaign, and a post-election random face-to-face study, allowing comparison with similar samples going back to the 1964 BES. This is the latter part.

The f2f BES poll went into the field just after the election and fieldwork was conducted up until September (proper random face-to-face polls take a very long time). On the question of how people voted in the 2015 election the topline figures were CON 41%, LAB 33%, LDEM 7%, UKIP 11%, GRN 3%. These figures are, of course, still far from perfect – the Conservatives and Labour are both too high, UKIP too low – but the gap between Labour and Conservative, the problem that bedevilled all the pre-election polls, is much closer to reality.

This is a heavy pointer towards the make-up of samples having been a cause of the polling error. If the problems had been caused by people incorrectly reporting their voting intentions (“shy Tories”) or people saying they would vote when they did not, then it is likely that exactly the same problems would have shown up in the British Election Study (indeed, given the interviewer effect those problems could have been worse). The difference between the BES f2f results and the pre-election polls suggests that the error is associated with the thing that makes the BES f2f so different from the pre-election polls – the way it is sampled.

As regular readers will know, most published opinion polls are not actually random. Most online polls are conducted using panels of volunteers, with respondents selected using demographic quotas to model the British public as closely as possible. Telephone polls are quasi-random, since they do at least select randomised numbers to call, but the fact that not everyone has a landline and that the overwhelming majority of people do not answer the call or agree to take part means the end result is not really close to a random sample. The British Election Study was a proper randomised study – it randomly picked constituencies, then addresses within them, then a person at that address. The interviewer then repeatedly attempted to contact that specific person to take part (in a couple of cases up to 16 times!). The response rate was 56%.

Looking at Jon Mellon’s write up, this ties in well with the idea that polls were not including enough of the sort of people who don’t vote. One of the things that pollsters have flagged up in the investigations of what went wrong is that they found less of a gap in people’s reported likelihood of voting between young and old people than in the past, suggesting polls might no longer be correctly picking up the differential turnout between different social groups. The f2f BES poll did this far better. Another clue is in the comparison between whether people voted and how difficult it was to get them to participate in the survey – amongst people whom the BES managed to contact on their first attempt 77% said they had voted in the election; among those who took six or more goes, only 74% said they had. A small difference in the bigger scheme of things, but perhaps indicative.

This helps us diagnose the problem at the election – but it still leaves the question of how to solve it. I should pre-empt a couple of wrong conclusions that people will jump to. One is the idea polls should go back to face-to-face – this mixes up mode (whether a poll is done by phone, in person, or online) with sampling (how the people who take part in the poll are selected). The British Election Study poll appears to have got it right because of its sampling (because it was random), not because of its mode (because it was face-to-face). The two do not necessarily go hand-in-hand: when face-to-face polling used to be the norm in the 1980s it wasn’t done using random sampling, it was done using quota sampling. Rather than asking interviewers to contact a specific randomly selected person and to attempt contact time and again, interviewers were given a quota of, say, five middle-aged men, and any old middle-aged men would do.

That, of course, leads to the next obvious question of why pollsters don’t move to genuine random samples. The simple answers there are cost and time. I think most people in market research would agree a proper random sample like the BES is the ideal, but the cost is exponentially higher. This isn’t more expensive in a “well, they should pay a bit if they want better results” type way – it’s more expensive as in a completely different scale of expense, the difference between a couple of thousand and a couple of hundred thousand. No media outlet could ever justify the cost of a full scale random poll, it’s just not ever going to happen. It’s a shame – I for one would obviously be delighted were I to live in a world where people were willing to pay hundreds of thousands of pounds for polls – but such is life. Things like the BES only exist because of big funding grants from the ESRC (and at some elections that has needed to be matched by grants from other charitable trusts).

The public opinion poll industry has always been about finding a way of measuring public opinion that combines accuracy with being affordable enough for people to actually buy and speedy enough to react to events, and whatever solutions emerge from the 2015 experience will have those same aims. Changing sampling techniques to resemble random sampling more closely could, of course, be one of the routes that companies look at. Or controlling their sampling and weighting in ways that better address the shortcomings of the sampling. Or different ways of modelling turnout, like ComRes are looking at. Or something else yet unspeculated. Time will tell.

The other important bit of evidence we are still waiting for is the BES’s voter validation exercise (the large scale comparison of whether poll respondents’ claims on whether they voted or not actually match up against their individual records on the marked electoral register). That will help us understand a lot more about how well or badly the polls measured turnout, and how to predict individual respondents’ likelihood of voting.

Beyond that, the polling inquiry team have a meeting in January to announce their initial findings – we shall see what they come up with.

154 Responses to “What the BES face to face poll tells us about what went wrong”

  1. First time I’ve been first.

    Interesting to read of the financial costs of polling. I’ve never asked what it costs to run a series of polls leading up to, say, a general election, but there is here a real problem of how to balance the provision of seriously believable results with the need to keep costs within the budgets of those who wish to publish. Not easy. But thanks, AW, for the presentation of the BES study.

  2. By the way, does anyone have a view on whether it is easier to meet the BES problems outlined above with more local polling? For example, Old Nat points out that polling within Scotland is often much more accurate than UK/GB wide polling. Would it make more economic sense to have polling (or any individual poll) more limited geographically and thereby more ‘in depth’ and accurate? So during a GE campaign, for example, each day we would have a different area of the UK analysed. Would that work?

  3. AW
    Why can’t random sampling be done digitally from the electoral registers?

  4. John Pilgrim –

    It can, but it would be less accurate than using the Postcode Address File.

    As many people have discussed here in the past, there is a problem with the completeness of the electoral register, so using that to draw random samples risks missing out people who aren’t on the register from your sample.

    The postcode address file however should have every household and other building in the UK on it, so the only people it should systematically miss are rough sleepers, or people living in unofficial addresses (e.g. outbuildings made into residences without planning permission, etc), which I suspect are both pretty marginal concerns compared to the significant body of people not on the register.

  5. I think those choosing the audience for BBC QT could do well to listen to some of these issues/ lessons!

    Especially the one about having as accurate a sample as is feasible/ possible.

    Though the QT audience is made up of people requesting to be audience members (so it is self-selecting), there must be some bio and informational data you can ask for which would help you to balance an audience as far as is possible.

    In all forms of debate – social movements, community groups, panel shows with audiences, political polling – significantly biased samples and self-selection are debilitating our understanding of what the consensus of opinion actually is.

    The GE was a clear example of that disconnect between presumed opinion and actual opinion; ditto current debates around political leaders: and the notion their views draw ecstatic support in the general population because that was the level of emotion at the activists meeting you attended the previous night!

  6. Forgot to mention earlier that West Lothian is having its first snow of the winter :-)

  7. AW

    But surely if someone is not on the electoral register he or she cannot vote. So sampling from the register must be more accurate than just a post code lottery. Or have I misunderstood?

  8. @Rob

    But the BBC wants an engaged and (if possible) badly behaved audience, not one which reflects 100% the wider population. QT is a show, put on for entertainment as much as for serious political debate. That’s one reason I seldom watch it. The other is it goes on rather late……. I’m getting too old for that now……

  9. @John B

    Social research, of which opinion polling is a specialist form, is costly, and the cost is at least partly in the sheer time it takes to contact thousands of people and the labour costs that represents. That’s why online methods are popular – you can pop a survey online and people can answer it.

    What that doesn’t do, firstly, is give you a representative sample; nor does it have the same assurance that people take the same meaning from the questions that you think they ought to (remarkably common), or that they are not just giving random or silly answers – or not completing. That’s where a skilled interviewer comes in. But they need paying.

    If you’re doing an academic piece of qualitative research, you can easily do a half hour interview with a respondent, and that’s why samples are often small ( another reason is that they’re a bugger to write up and code).

    Even on a questionnaire, even if you have one that can be gone through in 10 minutes (believe me, that’s a short or very straightforward questionnaire), and even assuming you can find a contact for everyone (the actual biggest sticking point and why polling companies like volunteer panels – you have the contacts to hand and they’ll usually talk to you) and you get hold of them first time, a 3000 person sample such as this is 500 person hours *just to collect the data*.

    In reality, collecting the data will have taken significantly longer. And cost a lot more.

    And all this before any analysis is done.

    tl;dr Social research can be hard and costs a lot. The polling companies have to strike a balance between utmost rigour and resource constraints and this balance is not easy.

  10. I’m curious as to how pollsters can compare how people said they voted in polls with how they actually voted in the GE. I’m obviously very naive but I had thought that votes in actual elections were entirely confidential but apparently I was wrong. Who else has access to this information?

  11. JohnB – you have, it’s an important point though, one that Jon Mellon makes in his BES article (and, I expect, one that may be crucial in understanding what went wrong when the inquiry comes to report).

    At one level one could be forgiven for thinking that if polls don’t include non-voters it doesn’t matter – they don’t vote. If people who aren’t on the register aren’t sampled it doesn’t make any difference because, by definition, they can’t vote. However, it *does* matter, because they are part of the demographic balance of the sample.

    All our demographic targets are set based on the whole adult population, including non-voters, so it’s a zero-sum game. If you don’t have enough non-voters in a demographic, you end up replacing them with voters.

    So, for example, imagine that 12% of the adult population are under 25 and that only half of them vote. The non-voting under 25s don’t take part in polls either, so the pollsters end up with samples that only have 6% under 25s, but keen to be representative they weight them up to 12% of the sample. You end up with a poll that’s got double the number of under 25s who vote in it.
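    To put that worked example in code (a purely illustrative sketch using the hypothetical numbers above, not any pollster’s actual weighting scheme):

```python
# Hypothetical numbers from the example above:
# 12% of adults are under 25, but only half of them vote.
population_share = 0.12   # true share of under-25s among all adults
raw_sample_share = 0.06   # under-25s reached by the poll (only the voting half)

# The pollster weights the under-25s in the sample back up to their
# share of the whole adult population.
weight = population_share / raw_sample_share
print(weight)  # 2.0

# But every under-25 in the sample is a voter, so after weighting the
# poll behaves as if 12% of its voters are under 25 - double the true 6%.
weighted_share = raw_sample_share * weight
print(weighted_share)  # 0.12, versus a true 0.06 among actual voters
```

    The zero-sum point is that the extra 6 points have to come from somewhere: the phantom under-25 voters displace respondents from other, heavier-turnout groups.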

  12. Watchit – who people vote for is secret. Whether they voted is a matter of public record.

    When you go into the polling station on election day the presiding officer will take your number or name/address and check you are on the electoral register, put a mark by your name and give you a ballot paper.

    That copy of the electoral register, with marks by the names of people who have voted, is a public document, and if you phone up the local electoral services department you are allowed to go and inspect the document and see if people voted or not. The political parties and elected politicians are allowed to actually take a copy of it – ordinary people are only allowed to view it and take notes by hand.

  13. The BES short reports are suggesting that what lies at the heart of the May polling fiasco is the sampling biases inherent in conventional polling. If that outweighs differential turnout and other factors then there is one very stark phenomenon that I don’t understand. In the run-up to the election, the polls basically had Labour and the Tories tied. However, in the first polls reported AFTER the election the Tories were immediately showing the advantages that had emerged in the GE. At that point it was too early for the polling houses to introduce their various ‘fixes’, so presumably the sampling biases were just as influential as they had been prior to May 7. If so – in the absence of a methodological fix – where did the instantaneous Tory lead come from? (Perhaps respondents are inclined to side with the winner after an election. But I have to admit that I suspect that faulty turnout adjustments were more crucial than seems to be suggested in these blogs.)

  14. Unicorn – companies hadn’t made any new or different switches at that point, but immediately after the election everyone switched from weighting using political weighting based on the 2010 election to political weighting based on the 2015 election.

    The reason vote recall in those post-election polls matched the 2015 election result is that the polls weighted it by 2015 vote recall to ensure that.

    The only interesting one was MORI, who don’t use political weighting, and have been showing the Tories ahead in recalled vote. Not sure how that has happened.

  15. Unicorn

    However, in the first polls reported AFTER the election the Tories were immediately showing the advantages that had emerged in the GE. At that point it was too early for the polling houses to introduce their various ‘fixes’, so presumably the sampling biases were just as influential as they had been prior to May 7.

    No they were already introducing the ‘fixes’ by weighting to the 2015 result. If you look at this YouGov for example:

    which seems to be the first done completely after the election, they have weighted to the May vote share. Since not many people would have changed their mind in the three weeks since polling day, you would expect the headline VI to reflect the result. In actual fact it showed a small swing to the Conservatives – a sort of ‘honeymoon effect’ or ‘winner’s bonus’ that you often see after elections.

  16. AW – Thanks … Preoccupied as I was with the pre-GE projections, I overlooked that routine methodological switch. Incidentally, based on the BES summaries, it doesn’t look as if it is going to be easy to adjust for the biases introduced by the failure to sample those hard-to-reach potential voters.

  17. AnthonyJWells, Barbazenzero, Laszlo, MrNameless, Jack Sheldon, Pete B, and all the others who offered help regarding 1975 polling figures, thank you: I am genuinely grateful and somewhat touched by the help offered. Apologies to anybody I omitted

  18. RM – Thanks for your elaboration. You were pipped by Anthony in identifying my blunder. I shall now slip back into hibernation….

  19. @AnthonyJWells

    If memory serves me correctly (it may not: it’s been a while) the postcode address file does include things like hospitals, prisons, homeless shelters, battered women shelters, houses of multiple occupation, bed and breakfasts. It does not include things like unofficial addresses, and houses built in the last six months. I think it also does not include addresses that have never been registered with anybody ever (so things like moor shelters).

    Sorry, I’m not sure if this helps or just muddies the waters further…:-(

  20. Note that the BES hasn’t ruled out differential turnout (yet). The marked register still has to be checked, so we can see how many people are lying about having voted, and whether there is any bias in how they said they voted. I am very interested in the numbers which don’t match (both ways), although the reasons will remain a matter of speculation.

  21. @AW

    Thank you.

    One question though: does it make sense to distinguish (as I was trying to do) between those who are not on the electoral roll (perhaps, who knows, even as a matter of principle), and those who, despite being on the roll, do not vote on any particular occasion?

    @Chris Riley

    Thank you. I can see that gathering the data is a huge task and analysing it another huge task. I suppose that trying to quantify the degree to which any self-selecting panel is ‘representative’ of the public as a whole must also be a nightmare.

    So what of the option of limiting any particular poll to a geographically small area, so as to be able to say with some degree of certainty “Lancashire thinks like this” or “Surrey thinks like this”? Or is it worth concentrating only on those areas which have recently (last twenty years or so, perhaps) shown themselves to be bellwether seats?

    Or is it the case that the only real solution is to say that the MOE is far greater than previously admitted?

  22. Thanks for the explanation, Anthony – that’s good to know.

  23. If anyone wants the data in a better format, here is link to the data I exported into CSV format.

    It’s still 16 meg…….

  24. …sorry 13 meg.

  25. This link references the numbers on the first row to the question.

  26. Sorry, last post on the file….

    Here is a zipped version (1.3m only….)

  27. I loathe people who say “I told you so” – but when I suggested previously that there was a whole silent un-engaged cohort of quiet people out there who are swing voters, which polling companies do not reach, might I not have been onto something? Admittedly it was a hunch, but BES seems to indicate that this was true – or have I got it wrong?

  28. Anthony

    Thanks for the article.

    At the time of the election, Scully, Curtice & others noted that the polls in London, Scotland and Wales had been pretty accurate (and I assumed this to be the case).

    1. Would you agree that this was so


    2. If sampling is the basic problem, why would it only appear to affect England (outwith London)?

  29. @Oldnat
    “If sampling is the basic problem, why would it only appear to affect England (outwith London)?”
    1. the English are reticent (and/or devious – perfidious Albion)
    2. London is no longer full of the English.

    [My computer red lines ‘outwith’. Do you mean ‘except’?

  30. Dave

    No. I meant outwith.

  31. COLIN

    Having looked at the tables and crossbreaks of the new Survation poll, I suspect that you would be right not to trust it and especially not the crossbreaks.

    It claims to be a UK poll of 2007 UK residents, but the raw data prior to weighting is England 1709, Scotland 159, Wales 110 & NI 29, and post weighting England 1683, Scotland 171, Wales 119 & NI 35, making the national predictions for all but England worthless. Just two data points for Scotland should be enough to demonstrate that….

    For Scotland, their weighted sample has Con 16.20% and Lab 14.50% were a General Election to take place tomorrow, with no distinction offered between Westminster and national General Elections.

    For what little they are worth, the national crossbreaks are:

    England: Remain 37.20%, Leave 41.80%, Undecided 21.40%
    Scotland: Remain 48.00%, Leave 30.90%, Undecided 15.40%
    Wales: Remain 34.50%, Leave 36.40%, Undecided 30.90%
    NI: Remain 46.70%, Leave 36.70%, Undecided 20.00%
    UK: Remain 38.50%, Leave 42.90%, Undecided 18.60%

    Excluding Undecideds they are:
    England: Remain 47.10%, Leave 52.90%
    Scotland: Remain 60.90%, Leave 39.10%
    Wales: Remain 48.70%, Leave 51.30%
    NI: Remain 56.00%, Leave 44.00%
    UK: Remain 47.30%, Leave 52.70%

    I would be very interested to know what number of responses AW would consider to be sub-samples worth mentioning.

  32. Barbazenzero

    “making the national predictions for all but England worthless”

    Indeed – though, as it happens, the Scottish, Welsh & NI figures do seem to be roughly in line with proper polling in the devolved administrations.

    Survation do deserve credit for polling in NI – as they said “NI IS important”, and not just defaulting to the tired, old, inaccurate position that other pollsters still seem to be stuck in.

    Polling the whole of the UK (not just a bit of it) to produce an estimate of current opinion is sensible.

    Polling on Westminster intentions across the UK (as Survation did – and then dumping their responses into “others”) much less clever.

    Anthony makes the sensible point above that polling has to be affordable to the clients – but that begs the question “on Westminster VI, why try to poll 3 of the 4 systems and provide a concatenated result which only approximates England to any reasonable degree?”

    Far better to poll on UK matters simultaneously across the 4 nations.

    On matters unique to each administration, poll only there (which happens with Welsh, Scottish & NI polls – but England seldom gets treated with the respect it deserves).

    On Westminster VI (or on matters which clients want to know differing national perspectives), poll each area separately and properly, recognising that England is so dominant that it should properly get more regular polling than the other three administrations.

    Surely even (Chas &) Dave couldn’t object to England being given its rightful place! :-)

  33. @Tony,

    Was that a bit of self-loathing on display there?! ;)

  34. @OldNat

    Is the issue not (and I say this without studying the figures in much depth, so don’t go too hard if I’m talking rubbish) that what was actually wrong was the gap between CON and LAB? The rest – the SNP, the LDs (though everybody gave them too many seats), UKIP (though most gave them Thurrock and S Thanet) were about right. At a quick glance the gap between CON and LAB was similarly wrong in Wales to England. Even in Scotland LAB were overstated though the polls got CON about right.

  35. Jack Sheldon

    I don’t know – that was why I was asking Anthony.

    I wasn’t following the Welsh polls that closely, but Scully commented that “the final YouGov Welsh poll was actually very accurate – only being about one percentage point too low for the Conservatives and one point too high for Labour”.

    The Wiki article suggests that most polling seemed to be within moe

    In Scotland, both Con & Lab were miles behind the SNP, so any errors in their comparative share would have made little difference – and in any case would have been unclear due to the active campaign by SNPouters for tactical voting.

    Again, moe covers most late polling and actual result.

    Wiki doesn’t give the London GE results (and I can’t remember who said the London polling was pretty accurate), so it would be good to see an actual comparison.

    Could it be that the Con/Lab ratio was wrong, but that actually only affected such marginals? (The overall Welsh numbers were right, but in their Con/Lab marginals, the edge went to the Tories. Can we describe such a failure to predict as a “polling error”?)

    If the problem is sampling error, then I think it needs greater explanation as to why it only relates to the sampling of Con/Lab voters.

    Does it apply to all demographic groups, for example.

  36. OLDNAT
    Indeed – though, as it happens, the Scottish, Welsh & NI figures do seem to be roughly in line with proper polling in the devolved administrations.

    Had you emboldened the “as it happens” I would have agreed entirely with your post.

    I do realise that polling is expensive but to poll 29 people in Northern Ireland to allow labelling it as a “UK” poll strikes me as purely cosmetic.

    The national percentages, except for Wales, are much as I would have expected.

  37. Barbazenzero

    I haven’t compared samples with share of UK population. Are the NI 29 less than their appropriate share?

    Including them as part of the UK figure seems sensible. Publishing a crossbreak for them adds little to the sum of human knowledge. :-)

  38. OLDNAT
    Are the NI 29 less than their appropriate share?

    Google and wiki both put it at just over 1.8 million vs 53 million so around 3.3% suggesting just under 100 would be the right sample.

    But that would only apply if NI and England had near identical polities and is not the point I was trying but failing to make.

    A poll of 1000 individuals eligible to vote has pretty low margins of error if conducted in a uniform polity, but we also know that a poll of 300 to 500 shouldn’t be too far out.

    If [who commissioned it] really want to know how the four nations will vote then a 1,000 sample for England and say 350 for each of the other home nations would provide meaningful crossbreaks as well as meaningful state-wide numbers.
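    A rough sketch of why that split works (using the standard worst-case margin-of-error formula, 1.96·√(0.25/n), applied to the sample sizes suggested above):

```python
import math

def moe(n, z=1.96):
    """Worst-case 95% margin of error for a simple random sample of size n."""
    return z * math.sqrt(0.25 / n)

# Boosted design suggested above: 1,000 for England, 350 for each other nation.
boosted = {"England": 1000, "Scotland": 350, "Wales": 350, "NI": 350}
for nation, n in boosted.items():
    print(f"{nation}: +/-{moe(n) * 100:.1f} points")

# A 350-person sub-sample comes out at roughly +/-5 points - rough but usable,
# where a 29-person crossbreak (around +/-18 points) tells you almost nothing.
```

    Whether real samples behave this well is another matter – the formula assumes simple random sampling, which (as the post above explains) commercial polls don’t actually use.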

  39. Barbazenzero

    Thanks for the calculation – so 29 IS cosmetic.

    I would be extremely disappointed with Survation, if they had allowed their client to influence their choice of the geographical distribution of their panel.

    Were that to be the case, one would have to ask if the numbers from each of the English regions were appropriate too.

    I hope that isn’t the case, but if it was, drumming out of the BPC would seem appropriate.

  40. OLDNAT
    I hope that isn’t the case, but if it was, drumming out of the BPC would seem appropriate.

    Better done to the sound of the old Orange flute, perhaps?

  41. Barbazenzero


  42. According to ComRes, the MOE for 29 people is +/- 18%.

    If the sample isn’t representative (very likely) it could be far worse than that.
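    That +/-18% figure follows from the usual worst-case formula, 1.96·√(p(1−p)/n) with p = 0.5 (a sketch of the textbook calculation, not ComRes’s exact method):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error for a simple random sample of size n,
    at its worst case p = 0.5."""
    return z * math.sqrt(p * (1 - p) / n)

print(round(margin_of_error(29) * 100))        # 18 points, matching the figure quoted
print(round(margin_of_error(1000) * 100, 1))   # 3.1 points for a typical 1,000 sample
```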


    Thanks – the minutiae of polling leave me cold, but I admire people who can raise that sort of interest in them.

    As I say -I don’t trust them any more & I’m not sure what they can do to change that.

  44. Barbazenzero

    Just did a comparison of the 2015 electorate and the Survation samples from each of the 4 UK polities.

    Over-represented:
    Wales – by 20.2%
    England – by 1.8%

    Under-represented:
    Scotland – by 3.4%
    Northern Ireland – by 34.0%

    What an odd representation of the 4 polities, and how co-incidentally useful to Leave EU!

  45. @Colin

    Perhaps what you need is to observe a series of polls up to an event saying xyz will happen and the result being close to that.

    That’s my position at the moment.

  46. OLDNAT
    What an odd representation of the 4 polities, and how co-incidentally useful to Leave EU!

    Quite so. Additionally, the Cons moving from 3rd to 2nd in Scotland seems mildly suspicious unless the new LiS leader is seen to be doing even worse than I thought, whilst Wales moves towards a “leave” majority despite Lab not quite having lost the plot there.

    Good night.

  47. OLDNAT

    Wiki doesn’t give the London GE results (and I can’t remember who said the London polling was pretty accurate), so It would be good to see an actual comparison.

    No, Wiki does indeed have the London GE results and pre-election polls here:,_2015_(London)

    There was only one poll near the election (YouGov 20-22 Apr) which gave the result:

    Con 32% (35%)

    Lab 44% (44%)

    UKIP 10% (8%)

    Lib Dem 8% (8%)

    Green 5% (5%)

    Actuals in ().

    So slightly underestimating the Tories, mainly at the expense of UKIP.

  48. Roger Mexico

    Thanks – I was looking at the 2015 polling page.

    However, not a Lab/Con problem in London.

    What I’m interested in is where the sampling error (assuming that that is what it mainly was) occurred both geographically and demographically.

    For example, the Con over Lab lead in Scotland, that BZ referred to in the Survation UK poll, has been replicated in a number of GB wide polls.

    Changing the weightings across GB, even where they create or increase errors which weren’t significant in May, seems unwise.

    I can understand the commercial pollsters trying to reassure their current clients that they have “fixed the problem”, but such responses may be making the polls even more unreliable, if the changes are inappropriate.

  49. The BBC have a turnout map.

    Folks can take a look & make up their own minds whether the regions where the polls were correct had higher than average turnout.

  50. Amber

    Interesting point.

    I had a look at the map, but couldn’t see a clear picture – in any case, I’d still like confirmation of the suggestion made at the time that Scotland, Wales and London had more accurate polling in May 2015.

    If so, then could it be, for example, that these areas were the only ones to have a series of demographically balanced polls?

    If the pollsters did the same for the English regions, would concatenating those results be more effective than trying to match a sample of 1 (or 2) thousand to the GB/UK population as a whole?

    Much of the “default position” that pollsters use assumes that demographic weighting (and then guessing as to likelihood to vote) is sufficient to represent a widely disparate population.

    On some issues, that is probably still correct, but I’m not convinced that it works for VI for Westminster.
