“But the sheer size of the survey […] makes it of interest…”

One of the most common errors in interpreting polls and surveys is the presumption that because something has a really huge sample size it is more meaningful. Or indeed, meaningful at all. Size isn’t what makes a poll meaningful; it is how representative the sample is. Picture it this way: if you’d done an EU referendum poll of only over-60s you’d have got a result that was overwhelmingly LEAVE… even if you polled millions of them. If you’d done a poll that only included people under 30 you’d have got a result that was overwhelmingly REMAIN… even if you polled millions of them. What matters is that the sample accurately reflects the wider population you want it to represent, that you have the correct proportions of both young and old (and male & female, rich & poor, etc, etc). Size alone does not guarantee that.
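
To see the point in miniature, here is a toy simulation in Python. The age splits and vote shares below are invented purely for illustration (they are not real referendum figures): a sample of a million respondents drawn overwhelmingly from one age group lands a long way from the truth, while a random sample of a thousand gets close.

```python
# Toy simulation: a huge but skewed sample vs a small random one.
# All numbers are invented for illustration.
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical population: 40% older voters (70% Leave), 60% younger (40% Leave),
# so the true Leave share is 0.4*0.7 + 0.6*0.4 = 52%.
POP = 1_000_000
older = rng.random(POP) < 0.40
leave = np.where(older, rng.random(POP) < 0.70, rng.random(POP) < 0.40)

# Huge sample (1,000,000 interviews) drawn 90% from older voters.
biased = np.concatenate([
    rng.choice(np.flatnonzero(older), 900_000),
    rng.choice(np.flatnonzero(~older), 100_000),
])

# Small sample (1,000 interviews) drawn at random from the whole population.
random_sample = rng.choice(POP, 1_000, replace=False)

print(f"True Leave share:            {leave.mean():.1%}")
print(f"Biased sample of a million:  {leave[biased].mean():.1%}")
print(f"Random sample of a thousand: {leave[random_sample].mean():.1%}")
```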

The classic real-world example of this is the 1936 Presidential Election in the USA. I’ve referred to it many times, but I thought it worth recounting the story in full, if only so people can direct others to it in future.

Back in 1936 the most respected barometer of public opinion was the survey conducted by the Literary Digest, a weekly news magazine with a hefty circulation. At each Presidential election the Digest carried out a survey by mail, sending surveys to its million-plus subscriber base and to a huge list of other people, gathered from phone directories, membership organisations, subscriber lists and so on. There was no attempt at weighting or sampling, just a pure numbers grab, with literally millions of replies. This method had correctly called the winner of the 1920, 1924, 1928 and 1932 Presidential elections.

In 1936 the Digest sent out more than ten million ballots. The sample size for their final results was 2,376,523. This was, obviously, huge. One can imagine how today’s papers would write up a poll of that size and, indeed, the Digest wrote up their results with not a little hubris. If anything, they wrote it up with huge, steaming, shovel-loads of hubris. They bought all the hubris in the shop, spread it across the newsroom floor and rolled about in it cackling. Quotes included:

  • “We make no claim to infallibility. We did not coin the phrase ‘uncanny accuracy’ which has been so freely applied to our Polls.”
  • “Any sane person can not escape the implication of such a gigantic sampling of popular opinion as is embraced in THE LITERARY DIGEST straw vote.”
  • “The Poll represents the most extensive straw ballot in the field—the most experienced in view of its twenty-five years of perfecting—the most unbiased in view of its prestige—a Poll that has always previously been correct.”

[Image: the Literary Digest’s final 1936 poll]

You can presumably guess what is going to happen here. The final vote shares in the 1936 Literary Digest poll were 57% for Alf Landon (Republican) and 43% for Roosevelt (Democrat). This worked out as 151 electoral votes for Roosevelt and 380 for Landon. The actual result was 62% for Roosevelt, 38% for Landon. Roosevelt received 523 votes in the electoral college, Landon received 8, one of the largest landslide victories in US history. Wrong doesn’t even begin to describe how badly off the Literary Digest was.

At the same time George Gallup was promoting his new business, carrying out what would become proper opinion polls and using them for a syndicated newspaper column called “America Speaks”. His methods were quite far removed from modern ones – he used a mixed-mode approach, mail-out surveys for richer respondents and face-to-face interviews for poorer, harder to reach respondents. The sample size was also still huge by modern standards, about 40,000*. The important difference from the Literary Digest poll, however, was that Gallup attempted to get a representative sample – the mail-out surveys and sampling points for face-to-face interviews had quotas on geography and on urban and rural areas, and interviewers had quotas for age, gender and socio-economic status.

[Image: Gallup’s final 1936 “America Speaks” column]

Gallup set out to challenge and defeat the Literary Digest – a battle between its monstrously huge sample and Gallup’s smaller but more representative one. Gallup won. His final poll predicted Roosevelt 55.7%, Landon 44.3%.** Again, by modern standards it wasn’t that accurate (the poll by his rival Elmo Roper, who was setting quotas based on the census rather than his turnout estimates, was actually better, predicting Roosevelt on 61%… but he wasn’t as media savvy). Nevertheless, Gallup got the story right and the Literary Digest got it hideously wrong. George Gallup’s reputation was made and the Gallup organisation became the best known polling company in the US. The Literary Digest’s reputation was shattered and the magazine folded a couple of years later. The story has remained a cautionary tale of why a representative poll with a relatively small sample is more use than a poll that makes no effort to be representative, however massive it is.

The question of why the Digest poll was so wrong is interesting in itself. Its huge error is normally explained through where the sample came from – it was drawn from things like magazine subscribers, automobile association members and telephone listings. In Depression-era America many millions of voters didn’t have telephones and couldn’t afford cars or magazine subscriptions, creating an inbuilt bias towards wealthier Republican voters. In fact it appears to be slightly more complicated than that – Republican voters were also far more likely to return their slips than Democrat voters were. All of these factors – a skewed sampling frame, a differential response rate and no attempt to combat either – combined to make the Literary Digest’s sample incredibly biased, despite its massive and impressive size.
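
As a rough illustration of how those two factors compound, here is a back-of-the-envelope sketch. The shares and response rates below are invented for the sake of the arithmetic, not the Digest’s actual figures.

```python
# Back-of-the-envelope sketch with invented numbers: a sampling frame that
# over-represents one side, combined with that side being likelier to reply,
# badly skews the returned ballots.
frame = {"Republican-leaning": 0.55, "Democrat-leaning": 0.45}       # share of ballots sent out
reply_rate = {"Republican-leaning": 0.30, "Democrat-leaning": 0.15}  # share who send them back

returned = {group: frame[group] * reply_rate[group] for group in frame}
total = sum(returned.values())
for group, n in returned.items():
    print(f"{group}: {n / total:.0%} of returned ballots")

# With these made-up figures the returned ballots split roughly 71/29,
# even if the actual electorate leans the other way.
```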

Ultimately, it’s not the size that matters in determining if a poll is any good. It’s whether it’s representative or not. Of course, a large representative poll is better than a small representative poll (though it is a case of diminishing returns) but the representativeness is a prerequisite for it being of any use at all.
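
The “diminishing returns” point can be made precise for a genuinely random sample: the margin of error shrinks with the square root of the sample size, so each extra respondent buys less and less precision – and none of it corrects a biased sample, whose error stays put however large it grows. A quick sketch:

```python
# For a simple random sample, the 95% margin of error on a proportion p is
# roughly z * sqrt(p * (1 - p) / n): it shrinks with the square root of n,
# so extra sample size gives diminishing returns. Bias, by contrast, does
# not shrink at all as n grows.
import math

def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    return z * math.sqrt(p * (1 - p) / n)

for n in (500, 1_000, 2_000, 10_000, 100_000, 2_376_523):
    print(f"n = {n:>9,}: +/- {margin_of_error(n):.2%}")
```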

So next time you see some open-access poll shouting about having tens of thousands of responses and are tempted to think “Well, it may not be that representative, but it’s got a squillion billion replies so it must mean something, mustn’t it?” Don’t. If you want something that you can use to draw conclusions about the wider population, it really is whether it reflects that population that counts. Size alone won’t cut it.

=

* You see different sample sizes quoted for Gallup’s 1936 poll – I’ve seen people cite 50,000 as his sample size, or just 3,000. The final America Speaks column before the 1936 election doesn’t include the number of responses he got (though it does mention that he sent out about 300,000 mail-out surveys to try and get it). However, the week after (8th Nov 1936) the Boston Globe carried an interview with the organisation going through the details of how they did it, which says they aimed at 40,000 responses.
** If you are wondering why the headline in that thumbnail says 54% when I’ve said Gallup called the final share as 55.7%, it’s because the polls were sometimes quoted as share of the vote for all candidates and sometimes as share of the vote for just the two main parties. I’ve quoted both polls as “share of the main party vote” to keep things consistent.


Petitions are a rubbish way of measuring public opinion. In fairness, that isn’t actually their purpose – a petition is a way for individuals to record and express their opinion, a way of highlighting an issue and exerting pressure. They can indeed be very good at that job. Some people, however, assume that because a vast number of people sign a petition it must, therefore, reflect wider public opinion. That is not the case – if a million people sign a petition they are not necessarily representative of anyone but themselves. It shows only what the signatories themselves think; the rest of the population may think the opposite, but not be bothered to sign petitions about it (and some demographic or attitudinal groups may just be more inclined to express their opinions through petitions).

So it appears to be with the petition on the Trump visit. Well over a million and a half people have signed a petition against the visit, but a YouGov poll in the Times this morning shows 49% of people think the visit should go ahead and only 36% think it should be cancelled (though it’s important to note the poll question does not relate to the petition specifically – the poll asked whether the visit should go ahead at all, while the petition is about the more technical issue of whether it should be downgraded from a full State Visit).

This does not mean there’s a silent majority of the British public who like Donald Trump – quite the opposite, British public opinion is very hostile towards him and getting worse. 62% now think he will be a poor or terrible president (up from 54% just after the presidential election) and people here are overwhelmingly negative about his policies. The ban on refugees and visitors from seven Muslim countries gets the thumbs down from 50% of British respondents and the support of only 29%. Other policies are even less popular (67% think his wall is a bad idea, and similar figures disapprove of his environmental policies).

One can only assume that the public think the invite to Trump should stand despite their dislike of the man and his policies because, like it or not, he is the leader of a country we need to work with. Asked what the attitude of the British government should be towards Trump, 51% say we should try to work with him, rather than distance ourselves from him (32%). Opinion there is moving swiftly though – there has been a large drop since November, when 66% thought the government should work with him.

I do ponder what sort of reception Donald Trump will get if the visit goes ahead. The British public really don’t like him, and while that petition doesn’t measure the balance of opinion, it probably does give us a good idea of the pool of people available to turn out and protest during any visit. That said, there have been plenty of State Visits by unpopular world leaders in the past that have been managed without incident. I just wouldn’t count on too many large public events…



Donald Trump has won, so we have another round of stories about polling shortcomings, though thankfully it’s someone else’s country this time round (this is very much a personal take from across an ocean – the YouGov American and British teams are quite separate, so I have no insider angle on the YouGov American polls to offer).

A couple of weeks ago I wrote about whether there was potential for the US polls to suffer the same sort of polling mishap as Britain had experienced in 2015. It now looks as if they have. The US polling industry actually has a very good record of accuracy – they obviously have a lot more contests to poll, a lot more information to hand (and probably a lot more money!), but nevertheless – if you put aside the 2000 exit poll, you have to go back to 1948 to find a complete polling catastrophe in the US. That expectation of accuracy means they’ll probably face a lot of flak in the days ahead.

We in Britain have, shall I say, more recent experience of the art of being wrong, so here’s what insight I can offer. First the Brexit comparison. I fear this will be almost universal over the next few weeks, but when it comes to polling it is questionable:

  • In the case of Brexit, the polling picture was mixed. Put crudely, telephone polls showed a clear lead for Remain, online polls showed a tight race, with leave often ahead. Our media expected Remain to win and wrongly focused only on those polls that agreed with them, leading to a false narrative of a clear Remain lead, rather than a close run thing. Some polls were wrong, but the perception that they were all off is wrong – it was a failure of interpretation.
  • In the case of the USA, the polling picture was not really mixed. With the exception of the outlying USC Dornsife/LA Times poll, the polls tended to show a picture of Clinton leading, backed up by state polls also showing Clinton leads consistent with the national polls. People were quite right to interpret the polls as showing Clinton heading towards victory… it was the polls themselves that were wrong.

How wrong were they? As I write, it looks as if Hillary Clinton will actually get the most votes, but lose in the Electoral College. In that sense, the national polls were not wrong when they showed Clinton ahead: she really was. It’s one of the most frustrating situations to be in as a pollster, those times when statistically you are correct… but your figures have told the wrong narrative, so everyone thinks you are wrong. That doesn’t get the American pollsters off the hook though: the final polls were clustered around a 4-point lead for Clinton, when in reality it looks like being about 1 point. More importantly, the state polls were often way out – polls had Ohio as a tight race when Trump stomped it by 8 points, all the polls in Wisconsin had Clinton clearly ahead yet Trump won, and polls in Minnesota were showing Clinton leads of 5-10 points when it ended up on a knife edge. Clearly something went deeply wrong here.

Putting aside exactly how comparable the Brexit polls and the Trump polls are, there are some potential lessons in terms of polling methodology. I am no expert in US polling, so I’ll leave it to others more knowledgeable than me to dig through the entrails of the election polls. However, based on my experience of recent mishaps in British polling, there are a couple of places I would certainly start looking.

One is turnout modelling – US pollsters often approach turnout in a very different way from how British pollsters traditionally did it. We’ve always relied on weighting to the profile of the whole population and asking people if they are likely to vote. US pollsters have access to far more information on which people actually do vote, allowing them to weight their samples to the profile of actual voters in a state. This has helped the normally good record of US pollsters… but it carries a potential risk if the type of people who vote changes – if there is an unexpected increase in turnout among demographics who don’t usually vote. This was one of the ways British pollsters did get burnt over Brexit. After getting the 2015 election wrong, lots of British companies experimented with a more US-style approach, modelling turnout on the basis of people’s demographics. Those companies then faced problems when there was unexpectedly high turnout from more working-class, less well-educated voters at the referendum. Luckily for US pollsters, the relatively easy availability of data on who voted means they should be able to rule this in or out quite easily.
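
For what it’s worth, here is a highly simplified sketch of the two approaches described above. This is not any particular pollster’s actual model – the respondents, profiles and likelihood scores are all invented – it is only meant to show the mechanical difference between weighting to all adults plus stated likelihood to vote, and weighting straight to the profile of past voters.

```python
# Simplified sketch of the two turnout approaches, with invented data.
from collections import Counter

sample = [
    # (age_group, vote, self_reported_likelihood_to_vote)
    ("18-34", "A", 0.6),
    ("18-34", "A", 0.9),
    ("35-54", "B", 0.8),
    ("55+",   "B", 1.0),
    ("55+",   "B", 0.9),
]

population_profile = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}  # all adults
past_voter_profile = {"18-34": 0.20, "35-54": 0.35, "55+": 0.45}  # people who actually voted last time

def vote_share(rows, target_profile, use_likelihood):
    group_counts = Counter(group for group, _, _ in rows)
    tallies = Counter()
    for group, vote, likelihood in rows:
        # Weight each respondent so the sample matches the target profile...
        weight = target_profile[group] / (group_counts[group] / len(rows))
        # ...optionally scaled by their stated likelihood of voting.
        if use_likelihood:
            weight *= likelihood
        tallies[vote] += weight
    total = sum(tallies.values())
    return {vote: round(t / total, 3) for vote, t in tallies.items()}

# "British-style": weight to all adults, then use stated likelihood to vote.
print(vote_share(sample, population_profile, use_likelihood=True))
# "US-style": weight straight to the profile of people who voted last time.
print(vote_share(sample, past_voter_profile, use_likelihood=False))
```

The risk described above sits in that second line: if the people who actually turn out this time don’t look like the people who turned out last time, the “US-style” target profile is wrong from the start.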

The second is sampling. The inquiry into our general election polling error in 2015 found that unrepresentative samples were the core of the problem, and I can well imagine that this is a problem that risks affecting pollsters anywhere. Across the world landline penetration is falling, response rates are falling, and it seems likely that the dwindling number of people still willing to take part in polls are ever more unrepresentative. In this country our samples seemed to be skewed towards people who were too educated, who paid too much attention to politics and followed the news agenda and the political media too closely. We under-represented those with little interest in politics, and several UK pollsters have since started sampling and weighting by political attention to try and address the issue. Were the US pollsters to suffer a similar problem, one can easily imagine how it could result in polls under-representing Donald Trump’s support. If that does end up being the case, the question will be what US pollsters do to address the issue.
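
One standard way of correcting this sort of skew, once you have decided to weight by things like education and political attention, is raking (iterative proportional fitting): respondent weights are adjusted until the sample matches known population margins on each dimension. The sketch below uses invented respondents and invented targets and is only meant to show the mechanics, not any specific pollster’s scheme.

```python
# Raking (iterative proportional fitting) sketch: adjust respondent weights so
# the sample matches population margins on education and political attention.
# Respondents and targets are invented; this only shows the mechanics.
respondents = [
    # (education, political_attention) - sample skewed towards engaged graduates
    ("graduate", "high"), ("graduate", "high"), ("graduate", "high"),
    ("graduate", "low"),
    ("non-graduate", "high"), ("non-graduate", "high"),
    ("non-graduate", "low"),
]

targets = {
    0: {"graduate": 0.30, "non-graduate": 0.70},  # education margin (dimension 0)
    1: {"high": 0.40, "low": 0.60},               # attention margin (dimension 1)
}

weights = [1.0] * len(respondents)

def weighted_margin(dim, category):
    matching = sum(w for r, w in zip(respondents, weights) if r[dim] == category)
    return matching / sum(weights)

for _ in range(50):  # iterate until both margins match their targets
    for dim, margin_targets in targets.items():
        for category, target in margin_targets.items():
            factor = target / weighted_margin(dim, category)
            weights = [w * factor if r[dim] == category else w
                       for r, w in zip(respondents, weights)]

for r, w in zip(respondents, weights):
    print(r, round(w, 2))
```

The disengaged non-graduates end up with the largest weights, which is exactly the worry: the fewer of them you manage to interview, the harder each one has to work, and the more fragile the estimate becomes.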


Donald Trump has been citing Brexit as the model of how he could win the election despite expectations, and his surrogates have suggested there might be a shy Trump vote, as with Brexit. So what, if any, lessons can we learn about the US election from recent polling experience in Britain?

In 2015 the British polls got the general election wrong. Every company had Labour and Conservative pretty much neck-and-neck, when in reality the Conservatives won by seven points. In contrast, the opinion polls as a whole were not wrong on Brexit, or at least, they were not all that wrong. Throughout the referendum campaign polls conducted by telephone generally showed Remain ahead, but polls conducted online generally showed a very tight race. Most of the online polls towards the end of the campaign showed Leave ahead, and polls by TNS and Opinium showed Leave ahead in their final eve-of-referendum polls.

That’s the first point where the parallel falls down – Brexit wasn’t a surprise because the polls were wrong. The polls were showing a race that was neck-and-neck. It was a surprise because people hadn’t believed or paid attention to that polling evidence. The media expected Remain would win, took polls showing Remain ahead more seriously, and a false narrative built up that the telephone polls were more accurately reflecting the race when, in the event, those online polls showing Leave ahead were right. This is not the case in the US – the media don’t think Trump will lose because they are downplaying inconvenient polling evidence, they think Trump will lose because the polling evidence consistently shows that.

In the 2015 general election, however, the British polls really were wrong, and while some of the polls got Brexit right, others did not – they showed solid Remain leads. Do either of those errors have any relevance for Trump?

The first claim is the case of shy voters. Much as 1948 is the famous example of polling failure in the US, in this country 1992 was the famous mistake, and was put down to “Shy Tories” – that is, people who intended to vote Conservative but were unwilling to admit it to pollsters. Shy voters are extremely difficult to diagnose. If people lie to pollsters about how they’ll vote before the election but tell the truth afterwards, then it is impossible to distinguish “shy voters” from people changing their minds (in recent British polls this does not appear to have happened – in both the 2015 election and the 2016 EU referendum, recontact surveys found no significant movement towards the Conservatives or towards Leave). Alternatively, if people are consistent in lying to pollsters about their intentions beforehand and lying about how they voted afterwards, it’s impossible to catch them out.

The one indirect way of diagnosing shy voters is to compare the answers given to surveys using live interviewers, and surveys conducted online (or in the US, using robocalls – something that isn’t regularly done in the UK). If people are reluctant to admit to voting a certain way, they should be less embarrassed when it isn’t an actual human being doing the interviewing. In the UK the inquiry used this approach to rule out “shy Tories” as a cause of the 2015 polling error (online polls did not have a higher level of Tory support than phone polls).

In the US election there does appear to be some prima facie evidence of “Shy Trumpers”* – online polls and robopolls have tended to produce better figures for Donald Trump than polls conducted by a human interviewer. However, when this same difference was evident during the primary season the polls without a live interviewer were not consistently more accurate (and besides, even polls conducted without a human interviewer still have Clinton reliably ahead).

The more interesting issue is sample error. It is wrong to read directly across from Brexit to Trump – while there are superficial similarities, these are different countries, very different sorts of elections, in different party systems and traditions. There will be many different drivers of support. To my mind the interesting similarity though is the demographics – the type of people who vote for Trump and voted for Brexit.

Going back to the British general election of 2015, the inquiry afterwards identified sampling error as the cause of the polling error: the sort of people who could be reached by phone and agreed to take part, and the sort of people who joined online panels, were unrepresentative in a way that weights and quotas were not then correcting. While the inquiry didn’t specify how the samples were wrong, my own view (and one that is shared by some other pollsters) is that the root cause was that polling samples were too engaged, too political, too educated. We disproportionately got politically-aware graduates, the sort of people who follow politics in the media and understand what is going on. We didn’t get enough of the poorly educated who pay little attention to politics. Since then several British companies have adopted extra weights and quotas by education level and level of interest in politics.

The relevance for Brexit polling is that there was a strong correlation between educational qualifications and how people voted. Even within age cohorts, graduates were more likely to vote to Remain, while people with few or no educational qualifications were more likely to vote to Leave. People with a low level of interest in politics were also more likely to vote to Leave. These continuing sampling issues may well have contributed to the errors of those pollsters who did get it wrong in June.

One thing that Brexit does have in common with Trump is those demographics. Trump’s support is much greater among those without a college degree. I suspect that if you asked you’d find it was also greater among people who don’t normally pay much attention to politics. In the UK those are groups we’ve had difficulty in properly representing in polling samples – if US pollsters have similar issues, then there is a potential source of error. College degree seems to be a relatively standard demographic in US polling, so I assume that is already accounted for. How much interest people have in politics is more nebulous, and less easy to measure or control.

In Britain the root cause of polling mishaps in 2015 (and for some, but not all, companies in 2016) seems to be that the declining pool of people still willing to take part in polls under-represented certain groups, and that those groups were less likely to vote for Labour, more likely to vote for Brexit. If (and it’s a huge if – I am only reporting the British experience, not passing judgement on American polls) the sort of people who American pollsters struggle to reach in these days of declining response rates are more likely to vote for Trump, then they may experience similar problems.

Those thinking that the sort of error that affected British polls could happen in the US are indeed correct… but could happen is not the same as is happening. Saying something is possible is a long way from there being any evidence that it actually is happening. “Some of the British polls got Brexit wrong, and Trump is a little bit Brexity, therefore the polls are wrong” really doesn’t hold water.

xxxxxxxxxxxxxxxxxx

*This has no place in a sensible article about polling methodology, but I feel I should point out to US readers that in British schoolboy slang when I was a kid – and possibly still today – to Trump is to fart. “Shy Trump” sounds like it should refer to surreptitiously breaking wind and denying it.


US Election Night

Tonight, you should hardly need telling, is US election night, so here’s a thread for overnight discussion and some things to look out for.

First up, a note about exit polls. Exit polls this year are only being conducted in 31 states rather than all 50 – they aren’t bothering with some of the safe states, including some of those closing first (so no exit polls in Kentucky, Georgia, South Carolina and West Virginia, all safe Republican states). Exit polls are done in the same way as in the UK – people stand outside polling stations and ask a proportion of those leaving after voting to fill in surveys – although unlike in the UK the survey covers their opinions and why they voted as they did, not just how they voted. Exit polls are supplemented with phone polls to take account of early voters.

Exit polls are not released until the polls in that state have closed, though people will inevitably claim to have leaked exit poll data prior to that. Any claimed leaks before 10pm will be bollocks anyway – the polling data will still be under strict quarantine in a locked room. Any claimed leaks after that will probably be bollocks too, and should be ignored anyway as the data won’t be properly weighted yet. Also bear in mind that in recent elections the initially released exit poll data has tended to be a bit skewed to the Democrats, becoming more accurate as actual votes come in. That may or may not be the case this time.

The initial exit polls are updated as actual votes are counted, and weighted based on the declared votes in the districts where the exit poll took place. Once the networks are certain that a candidate has won a state they call it – obviously the closer a state is, the longer it takes for the networks to be certain who has won. Hence, while we have a list of the times that states’ polls close and exit polls will be released, it is only the very safe states that will be called straight away. In 2008, states where the vote was relatively close (say, under 10 points) sometimes took a couple of hours after polls closed for the networks to call the race, while states with the tightest races took much longer: Montana and Florida four hours, North Carolina a day, Missouri a week.
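
As a toy illustration of that reweighting step (this is emphatically not the networks’ actual procedure – the precincts, interviews and declared figures below are invented): once a sampled precinct has declared, its interviews can be reweighted to match the declared result, while undeclared precincts keep their raw weights.

```python
# Toy illustration (invented data, not the networks' actual procedure): once a
# sampled precinct declares its result, that precinct's exit-poll interviews are
# reweighted to match it; undeclared precincts keep their raw weights.
from collections import Counter, defaultdict

interviews = [
    # (precinct, stated_vote)
    ("P1", "Dem"), ("P1", "Dem"), ("P1", "Rep"),
    ("P2", "Dem"), ("P2", "Rep"), ("P2", "Rep"),
]

declared = {"P1": {"Dem": 0.55, "Rep": 0.45}}  # P1 has counted its votes; P2 hasn't yet

precinct_counts = defaultdict(Counter)
for precinct, vote in interviews:
    precinct_counts[precinct][vote] += 1

tallies = Counter()
for precinct, vote in interviews:
    if precinct in declared:
        # Reweight so this precinct's interviews reproduce its declared split.
        sample_share = precinct_counts[precinct][vote] / sum(precinct_counts[precinct].values())
        weight = declared[precinct][vote] / sample_share
    else:
        weight = 1.0  # not yet declared: keep the raw exit-poll weight
    tallies[vote] += weight

total = sum(tallies.values())
print({vote: round(t / total, 3) for vote, t in tallies.items()})
```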

In a tight race, don’t expect a result in the early hours!

Looking at the timetable.

11pm. Most polls close in Indiana and Kentucky – some parts of both states are an hour behind, so the networks may not call them until the whole state has finished voting, but either way both will vote Romney.

12 midnight. Polls close in Georgia, South Carolina, Vermont & Virginia. Most polls close in Florida. Georgia and South Carolina will vote Romney, Vermont will vote Obama. Virginia and Florida are the first toss-up states. For Romney to win, he really needs to win both of these – if Obama wins either of them then it becomes difficult (but not impossible) for Romney to win. The polls in Florida are neck-and-neck, if they have been accurate it is not going to be called for many hours. Most recent polls in Virginia have shown Obama ahead, but it will probably be a few hours until it is called.

12:30 am. North Carolina, Ohio and West Virginia polls close. West Virginia will vote Romney. North Carolina and Ohio are another two key states. Recent polls have tended to show Romney ahead in North Carolina, again it is a state he really does need to win. Ohio is likely to be the key – if Romney wins Virginia, North Carolina and Florida he needs Ohio and another state (New Hampshire, Colorado, Iowa?) to win. If Obama holds Ohio he has won unless he loses something unexpected like Wisconsin or Pennsylvania. At 12:30 though this will be academic – if the race is at all close they aren’t going to be calling it yet.

1 am. A whole slew of states close: Connecticut, Delaware, Maine, Maryland, Massachusetts, New Hampshire, New Jersey, Pennsylvania, Rhode Island, Alabama, Illinois, Mississippi, Missouri, Oklahoma, Tennessee, most of Texas and Michigan, and the remaining part of Florida. Apart from Florida the only really interesting states amongst them are New Hampshire and Pennsylvania. Pennsylvania should be solidly Democrat – Obama won it by 10 points last time, John Kerry won it, and all but one of the recent polls have shown Obama ahead, often by good margins. It should be Obama. However, both campaigns have been targeting it, so it must be seen as somewhat close. In the event that Romney does win it he would probably win overall, unless Obama had won Virginia, North Carolina or Florida instead (though frankly, the idea of the Democrats winning one of those states and not Pennsylvania is somewhat bonkers).

1:30 am. Arkansas polls close.

2:00 am. Polls close in New York, Kansas, Louisiana, Minnesota, Nebraska, Wisconsin, Arizona, Colorado, New Mexico, South Dakota and Wyoming, along with the remainder of Texas and Michigan. Again, these are mostly safe states – the two interesting ones are Wisconsin (which the polls suggest should go for Obama without too much difficulty) and Colorado, where the polls are closer, but are consistently showing Obama ahead. More interesting is that by now we’ve had a couple of hours for votes to be counted in the early-closing states, so with a bit of luck we should start to see competitive states being called. The broad picture of the night – whether Obama is probably home and dry, whether it is going to be close, or whether there could be a Romney win – will hopefully start to emerge around now.

3:00 am. Polls close in Iowa, Montana, Utah, Nevada and most of North Dakota. Iowa and Nevada are the last of the swing states. Nevada looks like it should go to Obama; Iowa’s polls have Obama ahead, but less convincingly.

4:00 am. California, Washington, Hawaii. Most of Idaho, Oregon. All safe states for one side or the other. The closer run states should be being called around now. Unless things have gone right down to the wire we should soon know who has won. If they have gone right down to a couple of states where the candidates are neck-and-neck it could take days. Go to bed.

5:00 am. Alaska. Go! Go to bed. Sheesh.