Sooner or later, a pollster gets something wrong. It happens to everyone if they are in the game for long enough. There are two responses to that: you can deny there is any problem and blame it all on a late swing, or you can go away, work out what went wrong and put it right. The good pollsters do the second one – so when all the companies got it wrong in 1992 there was an industry inquiry, and ICM in particular came up with innovations that addressed the problem and led to many of the methods companies use today. In 2008 when MORI got the London election wrong they went away, looked at what had happened, and made changes to put it right. A pollster that gets things wrong, admits it, and then puts it right isn’t a bad thing.
Anyway, in the US election last year the most venerable of all polling companies, Gallup, managed to get things wrong, showing a small lead for Mitt Romney rather than the eventual victory for Barack Obama. They put their hands up, invited in some academics to help, and went away and looked at their methods – the result is here. Gallup tested about twenty different hypotheses about what could have gone wrong, and found that in the majority of cases things were working okay and there was no issue to address. They ended up with four issues where they think things went wrong and caused the overestimate of Romney’s support.
Most or all of the actual problems Gallup identified aren’t directly relevant to British political polls – different system, different challenges, different methods, different solutions – but it’s still an interesting look at what can go wrong with a poll and how a company should dig through its methods if something has gone wrong.
Likelihood to Vote – In Britain pollsters have a relatively simple way of approaching likelihood to vote: they ask people how likely they are to vote, and then weight and/or filter people’s responses based upon that, either giving people’s answers more weight based on how likely they say they are to vote or excluding people below a certain threshold. The only exception to this is ICM, who also include whether people voted in the 2010 election in their likelihood to vote model. American pollsters tend to use much more complicated methods: they ask people how likely they are to vote, but also whether they voted last time, how interested they are in politics, whether they know where the polling station is and so on – Gallup asks seven questions in all, which it uses to work out a likelihood-to-vote score, then includes only those most likely to vote. Other American pollsters do much the same, but Gallup’s method put more weight on whether people voted in the past, and their adjustment ended up being more pro-Romney than some other companies’. Gallup are going to go away and do more work on turnout, including whether the sort of people who take part in polls are more likely to vote anyway (something that I would certainly expect to be true).
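To make the two British-style approaches concrete, here is a rough sketch. Everything in it is made up for illustration – the 0–10 likelihood scale, the cutoff, and the toy respondents are assumptions, not any actual pollster’s model:

```python
# Toy sample: each respondent states a party and a 0-10 likelihood to vote.
# (Hypothetical scale and data, purely for illustration.)
respondents = [
    {"party": "A", "likelihood": 10},
    {"party": "A", "likelihood": 5},
    {"party": "B", "likelihood": 9},
    {"party": "B", "likelihood": 8},
]

def share_weighted(data):
    """Weight each answer by stated likelihood (weight = likelihood / 10)."""
    totals = {}
    for r in data:
        totals[r["party"]] = totals.get(r["party"], 0) + r["likelihood"] / 10
    grand = sum(totals.values())
    return {party: t / grand for party, t in totals.items()}

def share_filtered(data, threshold=9):
    """Exclude anyone below a certainty threshold, count the rest equally."""
    voters = [r for r in data if r["likelihood"] >= threshold]
    counts = {}
    for r in voters:
        counts[r["party"]] = counts.get(r["party"], 0) + 1
    return {party: c / len(voters) for party, c in counts.items()}
```

Note how the two approaches can give different headline figures from the same raw answers – which is exactly why turnout modelling is where polls so often diverge.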
Sampling – Most telephone pollsters in the USA get their phone numbers in a similar way to British pollsters, by using random digit dialling (RDD). This ensures that people who are ex-directory are not excluded from samples, but at the cost of getting lots of dead telephone numbers, faxes, modems, business numbers and so on. In 2011 Gallup started doing something different. Like most companies they do a fair amount of their interviews on mobile phones, and noted that the majority of ex-directory people did have mobile phones, so theorised that it was safe to draw their landline sample from telephone directories, while bumping up the mobile phone sample to catch those ex-directory people on their mobiles (mobile respondents who reported being ex-directory were weighted up to account for the tiny percentage of ex-directory people without mobiles). In theory it should have worked. In practice, it probably didn’t – before weighting the RDD sample was more Democratic, younger and more pro-Obama than the listed one, so Gallup are going back to using the more expensive RDD method.
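For anyone unfamiliar with random digit dialling, the basic idea is simple: keep the known prefix of a working telephone exchange and randomise the remaining digits, so ex-directory numbers are just as likely to come up as listed ones. A minimal sketch (the area code and exchange here are placeholder values, not real working numbers):

```python
import random

def rdd_number(area_code="202", exchange="555"):
    """Random digit dialling: fix a known area code and exchange, then
    randomise the last four digits. This reaches ex-directory households,
    at the cost of also generating dead numbers, faxes, businesses etc."""
    return f"{area_code}-{exchange}-{random.randint(0, 9999):04d}"
```

A directory-based sample, by contrast, only draws from numbers that appear in the listings – cheaper to dial, but it silently excludes everyone who chose not to be listed, which is where Gallup’s skew crept in.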
Time zone skews – this is an interesting one. As you might expect, Gallup sample within and weight by the regions in the USA. But within some of those regions there are different time zones, and because Gallup started polling at 5pm local time, it meant that in regions that covered more than one time zone they ended up doing more interviews in the eastern part of the region. Correcting this problem would have increased Obama’s support in Gallup’s final poll by 1%. Of course, in Britain we don’t have different time zones to worry about, but it illustrates a problem that can affect any methodology design – skews within the categories you weight by. A pollster can have, for example, the correct proportion of people in the DE social class or people over the age of 55, but what if, within those categories you control for, people are skewed towards more affluent DEs, or towards those only just over the age of 55?
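A quick made-up example of how a within-category skew slips past the weighting. All the numbers below are invented for illustration: the DE proportion in the sample matches the target exactly, yet the headline figure is still biased because the wrong DEs were reached:

```python
# Hypothetical breakdown of the DE social class (invented numbers).
# The population splits 40/60 between more and less affluent DEs,
# but the sample's DEs skew towards the affluent end.
de_breakdown_population = {"affluent_DE": 0.4, "poorer_DE": 0.6}
de_breakdown_sample = {"affluent_DE": 0.6, "poorer_DE": 0.4}

# Suppose the two groups support a given party at different rates.
support = {"affluent_DE": 0.30, "poorer_DE": 0.50}

def de_support(breakdown):
    """Average support across the DE class, given its internal breakdown."""
    return sum(breakdown[group] * support[group] for group in breakdown)

true_de = de_support(de_breakdown_population)   # 0.42
sample_de = de_support(de_breakdown_sample)     # 0.38
```

Even with the overall DE weight spot on, the sample understates the party’s DE support by four points – the same mechanism, in miniature, as Gallup’s time zone problem.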
Race – the final problem was a rather specific one on how Gallup asked about race – instead of giving people a list of race categories and asking which applied to the respondent, they asked them one at a time and got people to say yes or no, which produced some rather odd effects like overstating the proportion of Native Americans and mixed race people.
The full Gallup review is here, and if you’re interested I’d also recommend reading the verdict of Mark Blumenthal here – he spotted some of the problems before Gallup did, and has obviously followed it all far more closely than me.