Weighting by Demographics
As we’ve seen from the sampling article, no sampling technique is perfect: quasi-random sampling by definition has some random variation in it and even YouGov, who know the demographics of all the people they invite to a poll, can’t be certain they will all respond at the same rate. If an achieved sample doesn’t match the known demographics of Great Britain then pollsters deal with it through weighting.
For example, we know from the census and from larger, more robust surveys that 48% of the adult population is male and 52% is female, so if a sample contains only 45% men it is clearly under-representing them. If you want 48% men but only have 45% men, every individual male respondent is given a weight of 48/45 = 1.07. Literally they are multiplied by a factor of 1.07 and count as 1.07 of a person when the totals are tallied up. In our example women, who make up 55% of the sample against a target of 52%, would be weighted down in the same way, by a factor of 52/55 = 0.95.
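The arithmetic can be sketched in a few lines of Python. This is only an illustration of the principle: the function name `demographic_weights` is made up for the example, and the 55% figure for women is simply what is left over once 45% of the sample is male.

```python
def demographic_weights(target_shares, sample_shares):
    """Weight for each group = target share / share achieved in the sample."""
    return {group: target_shares[group] / sample_shares[group]
            for group in target_shares}

# Targets from the example above; the achieved sample is 45% men, so 55% women.
target = {"men": 48, "women": 52}
sample = {"men": 45, "women": 55}

weights = demographic_weights(target, sample)
for group, weight in weights.items():
    print(f"{group}: {weight:.2f}")
# men: 1.07
# women: 0.95
```

Multiplying each group's share of the sample by its weight brings it back to the target: 45 × 48/45 = 48 and 55 × 52/55 = 52.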
Weighting is uncontroversial when it is done on figures where we know what a truly representative sample would look like. Those targets are normally drawn either from the national census and the figures provided by the Office for National Statistics, or from large and genuinely random surveys like the National Readership Survey or the Labour Force Survey. These weights will include things like gender, age, region, social class, level of education, tenure (that’s whether you own your own home, rent it privately or live in social housing), work status, number of cars and whether people took a foreign holiday recently.
As an observer of polls, you can generally ignore these weights – everyone is going to weight their polls pretty much identically in terms of things like gender and age. It’s important that the figures are available so we can check they are weighted properly: whenever a poll shows strange results someone will pop up on an internet forum saying “I bet they only polled people in London and no one in Scotland”. That isn’t actually ever going to happen with reputable, high profile pollsters, but it’s good we can see for ourselves. Two things are worth mentioning though:
1) Don’t fall in love with “raw” figures. Polls are weighted for a reason, to make them representative. The unweighted figures in tables are not some pure unsullied figures that let us see what the real picture is before evil pollsters manipulate them: they are figures from a sample of unknown demographics that may be wildly unrepresentative.
2) In most cases, polls are only weighted to be representative at a national level, not for sub-breaks. In other words, when you look at the tables, for the totals you can be sure that they are based on a sample with the right proportions of old and young people, men and women. However, if you look at the column for men on the table, it doesn’t necessarily follow that it contains the right proportion of old and young men – because polls are only weighted at a national level, it could be that all the old people in the sample were women and all the young people men.
That’s an extreme example and it isn’t always the case – MORI’s quotas for age, gender and work status are all interlocked, YouGov’s quotas for age and gender are interlocked – but as a general rule be a bit cautious of sub-breaks.
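A toy version of that extreme case, using entirely made-up respondents (the 48/52 age split is an assumption chosen purely for illustration), shows how the national totals can look right while a sub-break is badly skewed:

```python
# Hypothetical sample: (gender, age band) for 100 respondents.
# Overall it is 48% male / 52% female and 48% young / 52% old,
# but every man is young and every woman is old.
sample = [("male", "young")] * 48 + [("female", "old")] * 52

def share(rows, position, value):
    """Percentage of rows whose field at `position` equals `value`."""
    return 100 * sum(1 for row in rows if row[position] == value) / len(rows)

# The "men" column of the tables is just the male respondents.
men = [row for row in sample if row[0] == "male"]

print(share(sample, 0, "male"))  # 48.0 - right proportion of men overall
print(share(sample, 1, "old"))   # 52.0 - right proportion of old people overall
print(share(men, 1, "old"))      # 0.0  - but no old men at all in the sub-break
```

Interlocked quotas or weights avoid this by setting a target for each combination (young men, old men, young women, old women) rather than for gender and age separately.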
If I called that the non-controversial part of weighting, what is the controversial part? The answer is political weighting.
In theory if we control a sample so that it is representative in terms of all the measures we do know about, the right number of old and young, employed and retired, owner-occupiers and renters, etc, then it should form a microcosm of society as a whole and represent them in other ways we don’t know about…specifically for our purposes, it should represent them politically. Unfortunately it doesn’t seem to.
A couple of decades ago voting intention was heavily correlated with class. If a sample contained the right numbers of ABs and DEs and the right proportions of council tenants and homeowners it would also be politically representative. There is no longer such a strong correlation, and unrepresentative samples are one of the reasons put forward to explain why the polls were wrong in 1992. The main phone pollsters, ICM and Populus, find that samples that are not politically weighted contain too many Labour supporters. Solving this through weighting gives us a problem, though: if class and tenure no longer correlate strongly with voting intention, what does? More to the point, what is there that strongly correlates with voting intention and that we have solid figures for that we can weight towards?
One very strong correlation is with past voting behaviour and, since we have the election results from 2005, we also have good strong data to draw target weights from. In theory a pollster can ask respondents who they voted for in 2005, weight the sample so that recalled voting pattern reflects what actually happened in 2005 (with some allowance for new voters and so on) and – bingo! – a politically representative sample.
The problem with this approach is down to “false recall” – people are not actually very good at remembering how they voted at the last election. This is undisputed and supported by extremely strong evidence – multiple academic panel studies have demonstrated that if you ask a group of people how they voted, note down the results and then go back to the same people a period of time later and ask them again, their answers will have changed. There are various explanations for this phenomenon: perhaps people say how they wish they had voted, rather than how they actually did, or perhaps they are embarrassed to admit voting for a now unpopular party. There is good evidence to suggest that some people align their “past vote” to match how they would vote now. My two preferred explanations are that people who did not vote claim they did (and tend to claim they voted Labour) and that Labour supporters who voted tactically for the Liberal Democrats in their own particular constituency claim they voted for the party that they actually supported, rather than the party they voted for.
The second of those is a guess, but the first at least we can be pretty certain of, since the last British Election Study checked people who said they’d voted against the marked electoral register, and those people who said they’d voted but hadn’t did disproportionately claim they’d voted Labour. Whatever the reasons, the general trend is that more people claim they voted Labour than actually did, and fewer people claim they voted Lib Dem than actually did.
This gives pollsters a problem. If you got an imaginary, perfectly representative sample and asked them how they’d voted in 2005 it would not match the actual election results because of false recall: you’d probably have too many people claiming to be Labour voters and too few claiming to have voted Lib Dem. If you weighted that perfect sample to match the actual election results you would be making it unrepresentative, because it would have too few people who actually voted Labour. Looking at an actual sample that has more 2005 Labour voters than there should be, and too few Tories and Lib Dems, it is impossible to tell how much of the discrepancy is down to a biased sample (which should be corrected for) and how much is down to false recall (which is genuine and would give false results if corrected for).
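To make the problem concrete, here is a small worked sketch. Every number in it is invented for illustration: it assumes a perfect sample of 100 past voters in which the only false recall is four Lib Dem voters who now say they voted Labour, and it ignores non-voters entirely.

```python
# Hypothetical perfect sample: (actual vote, recalled vote) -> number of people.
sample = {
    ("LAB", "LAB"): 36,
    ("CON", "CON"): 33,
    ("LDEM", "LDEM"): 19,
    ("LDEM", "LAB"): 4,   # false recall: voted Lib Dem, now say Labour
    ("OTH", "OTH"): 8,
}
# Assumed election result, which the sample's ACTUAL votes match exactly.
actual_result = {"LAB": 36, "CON": 33, "LDEM": 23, "OTH": 8}

# What the pollster sees: totals by recalled vote.
recalled = {}
for (_, said), count in sample.items():
    recalled[said] = recalled.get(said, 0) + count

# Naive weights that force recalled vote to match the election result.
weights = {party: actual_result[party] / recalled[party] for party in actual_result}

# What those weights do to the true composition of the sample.
weighted_actual = {}
for (voted, said), count in sample.items():
    weighted_actual[voted] = weighted_actual.get(voted, 0) + count * weights[said]

print(recalled)  # {'LAB': 40, 'CON': 33, 'LDEM': 19, 'OTH': 8}
for party, total in weighted_actual.items():
    print(f"{party}: {total:.1f}")
# LAB: 32.4
# CON: 33.0
# LDEM: 26.6
# OTH: 8.0
```

Under these assumptions the sample started out perfect, but after weighting recalled vote to the election result genuine Labour voters have fallen from 36 to 32.4 and genuine Lib Dem voters have risen from 23 to 26.6.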
ICM, Populus and ComRes, the three pollsters who weight by past vote, deal with this problem by estimating the level of false recall based on their findings at past elections. Populus weight to a point halfway between the average they get in their polls and the actual election result, while ICM weight to a point three quarters of the way towards the actual election result. How ComRes come up with their figure is less clear, but as far as I can tell it is very similar to ICM’s.
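The two rules amount to moving a fixed fraction of the way from the polling average to the real result. A minimal sketch, assuming a made-up polling average of 40% recalled Labour against a made-up actual figure of 36% (the function name `past_vote_target` is also just for illustration, and the pollsters’ real calculations will have more to them than this):

```python
def past_vote_target(poll_average, actual_result, fraction):
    """Move `fraction` of the way from the poll average towards the actual result."""
    return poll_average + fraction * (actual_result - poll_average)

# Hypothetical figures: polls average 40% recalled Labour, actual result 36%.
populus_style = past_vote_target(40.0, 36.0, 0.5)   # halfway
icm_style = past_vote_target(40.0, 36.0, 0.75)      # three quarters of the way

print(populus_style)  # 38.0
print(icm_style)      # 37.0
```

A fraction of 0 would mean no past vote weighting at all, and a fraction of 1 would mean weighting straight to the election result with no allowance for false recall.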
In practice the weighting figures are (and these figures are from Spring/Summer 2007, when all the pollsters provided comparable weighting figures for a BPC conference):
ComRes: CON 18, LAB 20, LDEM 13, Oth 7, Didn’t vote 29, Ref/DK 13
ICM: CON 19, LAB 23, LDEM 13, Oth 4, Didn’t vote 32, Ref/DK 8
Populus: CON 20, LAB 25, LDEM 13, Oth 5, Didn’t vote 29, Ref/DK 7
As you can see, comparing Populus and ICM the figures are very similar to each other (and are pretty similar today) – Populus are slightly nicer to Labour and ICM are slightly nicer to the Lib Dems, a pattern that’s reflected in their poll findings, but the differences are small. ComRes’s figures contrast more sharply: they are much less favourable to Labour and weight the others to a very high figure.
It is, however, important to note that while ICM and Populus’s target weights are pretty much static, ComRes’s seem to be more variable – so we can’t assume ComRes will always produce figures that are more favourable to the Tories than ICM and Populus.
Given that there is such a strong correlation between past voting behaviour and current voting intention, differences in weighting can make a significant difference to the top line figure. Unfortunately there is no accurate way of judging which weighting figures are “right” – looking at a raw sample, it is impossible to say how much of the difference between recalled vote and the actual vote at the last election is down to false recall, and how much is the sample actually being unrepresentative. Deciding the targets is a matter of judgement for the pollsters.
That brings us to the other three pollsters – YouGov, Ipsos MORI and Angus Reid.
For phone pollsters, the only contact they have with a respondent is during the interview – any information they gather has to be got at the time, so if a poll was done in May 2007, the pollster needs to ponder to what extent people’s recall of how they voted has changed in the intervening two years. Because YouGov draw their sample from a panel they can obtain information about respondents one day, save it to file and then use it to weight the results of surveys done months or years later. This alleviates the problem of false recall for YouGov – they don’t have to worry about people’s recalled vote or party ID changing from week to week as party support fluctuates, they have data from a point in the past which they can weight to fixed target figures. YouGov have on occasion updated their party ID targets as new people join their panel, but on the whole the target is static.
YouGov use party ID (i.e. the answer to the question “which party, if any, do you most identify with”) to weight rather than 2005 vote. This doesn’t make much difference in practice – both are strongly correlated to voting intention and the party ID figures YouGov weight to match those of a representative May 2005 sample – but does have two results worth mentioning:
Firstly, rather annoyingly it means we can’t draw any direct comparisons between YouGov’s weighting and that of other pollsters. In practice we couldn’t anyway, because YouGov are weighting using information from May 2005 and the phone pollsters are weighting using info from 2008, and a lot of false recall has happened since then, but it’s still a bit of a shame.
Secondly, they are often misconstrued as being one and the same thing: they aren’t. YouGov weight to proportions of CON 26%, LAB 32%, LDEM 12%, Other 3%, None of them 24.5%, Don’t know 2.5%. In YouGov’s 2005 polling only 72% of people who said they identified with the Labour party actually said they intended to vote Labour – 13% said they were voting Lib Dem and 9% said they were abstaining. Equally, those 24.5% of people who said they didn’t identify with any party are not a bunch of non-voters: many of them do vote, but are genuine floating voters. I occasionally see people compare the figure YouGov weight Labour party ID to and the figure other companies weight Labour recalled vote to and draw the erroneous conclusion that YouGov are weighting Labour more highly. This is wrong – the two figures aren’t comparable.
Alone amongst the pollsters Ipsos MORI do not use any political weighting at all (that isn’t to say they don’t weight, or that their figures are unweighted: they do use all the normal demographic weights like age and gender). This has two important effects – firstly, the weighting used by ICM, Populus and so on nearly always reduces the proportion of Labour voters in a sample, so the absence of weighting in Ipsos MORI polls has historically helped Labour. Since they don’t weight by it, MORI do not as standard ask about recalled vote, but when they did it showed their samples contained roughly the same proportion of people who said they’d voted Conservative in 2005 as ICM and Populus’s polls… but far more people who claimed they had voted Labour.
Following the 2008 London election, when MORI wrongly showed Ken Livingstone ahead of Boris Johnson, they conducted a review of their methodology. This found that their samples had too many public sector workers, something that presumably increased Labour support in their polls. Since then MORI have been weighting by public sector employment along with all their normal demographics. We have very little past vote data from MORI since then, but what we do have suggests their samples are still more Labour than those of Populus and ICM. Here’s the June 2008 recalled 2005 vote data for the three pollsters:
ICM: CON 19, LAB 23, LDEM 13, Oth 4, Didn’t vote 32, Ref/DK 9
Populus: CON 20, LAB 24, LDEM 13, Oth 5, Didn’t vote 30, Ref/DK 8
Ipsos MORI: CON 20, LAB 30, LDEM 10, Oth 5, Didn’t vote 27, Ref/DK 7
In October 2008 MORI showed partial figures for past vote, with the Conservatives on 23%, Labour on 31% and the Lib Dems on 10.5%.
This does raise the question of why, if MORI’s samples seem to contain more people who say they voted Labour in 2005, their topline figures aren’t more pro-Labour than ICM and Populus’s (in fact, it tends to be the other way round). The reasons are that MORI also have a very harsh likelihood to vote filter that acts in the opposite direction (I will look at how pollsters deal with likelihood to vote in a later article) and that ICM and Populus also re-allocate don’t knows in a way that currently favours Labour (again, a subject of a future article). However, in other questions that aren’t filtered by likelihood to vote and don’t have don’t knows re-allocated, such as approval ratings for party leaders or best party on the economy, it is important to remember that the sample in a MORI poll will normally contain far more Labour voters than samples used by other pollsters.
The second effect of MORI not using political weighting is that their polls tend to be more volatile than those produced by rivals. All polls suffer normal sample error, but politically weighted polls will at least always have the same political make-up. When figures change in a politically unweighted poll we have to ponder whether it is just the result of a sample that is more Labour or more Conservative.
If political weighting results in samples that are more consistent and more representative, then one might be forgiven for asking why MORI don’t do it. The answer is actually largely to do with volatility. As we’ve seen above, recalled vote can change over time. If it changes only slowly then the formulas used by ICM and Populus can adapt, and their target weightings for past vote will also change. Where ICM and Populus’s methodology could go wrong is if false recall changed suddenly. Imagine one party did something so heinously awful that huge swathes of the population were suddenly reluctant to admit supporting them at the last election: ICM and Populus would assume they just had a sample that was short of that party’s past voters and weight it back up, disguising a genuine shift in support. That is an extreme example, but illustrates MORI’s concern – what if at least some of the difference between recalled vote in polls isn’t sample error, but is a genuine response to changing public attitudes towards parties? In that case weighting it all away would be wrongly dampening out genuine volatility in public opinion.
There’s no consensus on this point. ICM and Populus believe that recalled past vote is actually very consistent and changes only very slowly over time, and therefore think it is a suitable thing to weight by. Ipsos MORI believe it is more volatile, and therefore think it isn’t suitable for weighting.
Angus Reid weight by past vote, but unlike ICM, Populus and ComRes they do not take any account of false recall. Instead they weight to the actual shares of the vote. This means they normally weight Labour to a significantly smaller proportion of their samples than other pollsters and, consequently, produce a significantly lower level of Labour support.