On small sample sizes

This is a cross-post from the YouGov website here. It’s territory I’ve covered here in the past (and here, and here.) I trust most of my regular readers will know full well that cross-breaks with small sample sizes are not to be trusted. Any news report with comparative results for people of different religions needs particular skepticism. There have been proper polls with parallel representative samples of different religious groups, but a bit of digging often finds that someone is balancing a story on a sub-sample of 8 or something.

– – –

In the small print of opinion polls you’ll often find a ‘margin of error’ quoted, normally of plus or minus 3%. This means that 19 times out of 20, the figures in the opinion poll will be within 3% of the ‘true’ answer you’d get if you interviewed the entire population.

A poll of 1,000 people has a margin of error of +/- 3%, a poll of 2,000 people a margin of error of +/- 2%. The smaller the sample, the less precise it is and the wider the margin of error. Strictly speaking, these calculations are based on the assumption that polls are genuine random samples, with every member of the population having an equal chance of being selected. In many cases this isn’t true: polls are carried out by quota sampling, or from panels of volunteers. Even polls done by randomly dialling phone numbers aren’t truly random, as the majority of people decline to take part. Even so, the margin of error is still a good rough guide to how precise a poll is, and when measured against real events like general elections most polls are indeed within the margin of error of the real result.
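As a rough illustration of where figures like these come from, the sketch below applies the standard textbook formula for the 95% margin of error of a proportion under simple random sampling, 1.96 * sqrt(p * (1 - p) / n), with the conservative assumption p = 0.5. The function name `margin_of_error` is made up for this example and is not taken from any pollster's methodology; real polls involve weighting and design effects that this ignores.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error, in percentage points, for a
    proportion p estimated from a simple random sample of size n.
    Assumes simple random sampling (an idealisation for real polls)."""
    return 100 * z * math.sqrt(p * (1 - p) / n)

# The two sample sizes mentioned in the text.
for n in (1000, 2000):
    print(f"n={n}: +/- {margin_of_error(n):.1f} points")
```

On this formula the two figures come out at roughly 3.1 and 2.2 points, which are conventionally rounded to 3% and 2%.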

However, it is important to note that a margin of error applies to the whole sample. All pollsters who are members of the British Polling Council, like YouGov, will publish computer tables showing the detailed results of the poll, which will include crossbreaks breaking down respondents by age, gender, social class, region and other demographics. While these offer great insight into patterns of public opinion, they do, naturally, have smaller sample sizes. For example, a poll of 1000 people will normally have around 500 men and 500 women, and the margins of error on those figures will be around +/- 4%.

For smaller demographic groups, sample sizes are even smaller and these bring with them much larger margins of error. For example, a poll of 1000 people would have a margin of error of +/- 3%, but if there were only 100 Scottish respondents within that poll the Scottish figures would have a margin of error of +/- 10%. This means that unless what Scottish respondents said differed from what the rest of the sample said by more than 10 percentage points, the difference would not be statistically significant. It could just be random error.

The error is particularly common when looking at responses of ethnic minorities or religious minorities in national polls. Britain is an overwhelmingly Christian or secular country, meaning that in any properly representative poll of the British population, only a small percentage of respondents will be Muslim, Hindu or Jewish, and any crossbreaks by religion or ethnicity will be based on very small numbers with very large margins of error.

Many newspapers, for example, have picked up on a recent report by thinktank Demos based on YouGov polling, claiming it shows that Muslim Britons are more patriotic than average. In our poll 83% of British Muslims agreed with the statement that they were proud to be a British citizen, which was indeed higher than the 79% of the general population who agreed. However, this poll contained only 42 Muslim respondents, giving a margin of error of +/- 15% for Muslim respondents. Take into account the margins of error on the poll, and we can only be confident that between 68% and 98% of British Muslims are proud to be British, compared to between 77% and 81% of the general population. In other words, it is impossible to draw any firm conclusions about whether Muslim Britons are more proud to be British, less proud to be British, or exactly the same as everyone else.
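The arithmetic behind that comparison can be sketched as follows. This is an illustration only: it assumes simple random sampling and the conservative 50% formula for the 42 Muslim respondents, and it takes the +/- 2 points for the general population straight from the 77 to 81 range quoted above rather than from a known sample size. The helper names are hypothetical.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error in percentage points (simple random sample)."""
    return 100 * z * math.sqrt(p * (1 - p) / n)

def interval(estimate, moe):
    """Range implied by an estimate and its margin of error, clipped to 0-100."""
    return (max(0.0, estimate - moe), min(100.0, estimate + moe))

def overlaps(a, b):
    """True if two (low, high) ranges share any values."""
    return a[0] <= b[1] and b[0] <= a[1]

muslim = interval(83, margin_of_error(42))  # 42 respondents in the crossbreak
general = interval(79, 2.0)                 # +/- 2 taken from the quoted range

print(f"Muslim respondents: {muslim[0]:.0f}% to {muslim[1]:.0f}%")
print(f"General population: {general[0]:.0f}% to {general[1]:.0f}%")
print("Ranges overlap:", overlaps(muslim, general))
```

Overlapping ranges are only a rough check rather than a formal significance test, but here the narrower range sits entirely inside the wider one, which is why no firm conclusion can be drawn.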

Remember, it isn’t just the sample size of the overall poll that counts, but the sample sizes of the crossbreaks too. It is very rare that crossbreaks of fewer than 50 or 100 respondents will tell you anything reliable or useful.


47 Responses to “On small sample sizes”

  1. Why don’t pollsters put the MoE for their polls, including all crossbreaks, on every poll they publish?

  2. YouGov poll five times a week. Would it be valid to add together the results from a certain sub sample over the course of a week to give a more accurate result?

  3. Colin – pollsters often do, hidden away in the small print. They don’t on crossbreaks, though occasionally they label significant differences (sometimes you’ll see letters under numbers in the crossbreaks; that indicates the number is significantly different from those in the columns indicated by that letter after taking into account sample size. It isn’t commonly done).

    Colin Green – it gets rid of the problem of sample size, but doesn’t resolve things completely. You still have the other issue of polls being weighted so they are representative *overall*, it doesn’t necessarily follow that the cross-breaks are internally representative.

    For example, polls are weighted so they have the correct proportion of middle class and working class respondents and the correct proportions of men and women. That doesn’t preclude the possibility that there could be too many working class men and too many middle class women.

  4. Thanks Anthony. So better, but still not great. I guess you’d have to rely on the hope that over several polls things might average out.

  5. Oh dear

    ” Just because Credit Suisse bankers are people too (even if 1% people, but still people), and just because they know too damn well that “no ECB intervention” means “no bonus”, and very likely “no job”, they go for broke and join Deutsche Bank, JPM, RBS, and everyone else (but, again, not Goldman), in predicting the end of Europe unless Draghi does his rightful duty and remembers that without banker support he will also be lining up at the jobless claims office very soon.Of course, being a Goldman boy, Draghi will only do what Lloyd tells him to. Either way, here is Credit Suisse’s rejoinder to the global Mutual Assured Destruction tragicomedy, which now makes Honk (as Lagarde calls him) Paulson’s overtures to congress seem like amateur hour. “We seem to have entered the last days of the euro as we currently know it. That doesn’t make a break-up very likely,but it does mean some extraordinary things will almost certainly need to happen – probably by mid-January –to prevent the progressive closure of all the euro zone sovereign bond markets, potentially accompanied by escalating runs on even the strongest banks. That may sound overdramatic, but it reflects the inexorable logic of investors realizing that – as things currently stand – they simply cannot be sure what exactly they are holding or buying in the euro zone sovereign bond markets…One paradox is that pressure on Italian and Spanish bond yields may get quite a lot worse even as their new governments start to deliver reforms – 10-year yields spiking above 9% for a short period is not something one could rule out. For that matter, it’s quite possible that we will see French yields above 5%, and even Bund yields rise during this critical fiscal union debate.” Of course, the explicit message is: help us ECB-Wan Kenobi, you are our only hope. The implicit one is: do it, or we pull the trigger and blow it all up to hell.”

    Print print print!!!!!

  6. ANTHONY
    Thanks.

    The more I read your caveats & explanations, the more I feel that Poll results should come with health warnings like cigarettes.

    Given this thread piece , why provide crossbreak data at all?

    What is the point of it? Who is it for?

  7. RiN

    @” the inexorable logic of investors realizing that – as things currently stand – they simply cannot be sure what exactly they are holding or buying in the euro zone sovereign bond markets…”

    That bit summarises the market’s dilemma in a nutshell.

    And that dilemma exists because of political failure in EZ.

    So the answer is to tell the markets , with absolute certainty -that every EZ sovereign debt is as good as another =as good as Germany’s .

    And that means one tradeable instrument-the EuroBond.

    And that means the Creditor EZ countries guarantee the debts of Debtor EZ countries ( well that saves them from just sending cheques to Athens , which they don’t seem keen on)

    And that means EZ state spending envelope budgets will be vetted by ECB ( =Bundesbank=Germany)

    ps-I love your banker conspiracy obsession-it knows no bounds :-)

  8. Colin – I do ponder the same questions! More seriously, it is often useful, as long as there are big differences. If you’ve asked if people support policy X, and 20% of under 25s support it, but 80% of over 65s support it then it is showing something genuine.

    I pile the caveats about crossbreaks not being internally weighted on so strongly because of the tendency of people to look at voting intention in cross breaks, and on voting intention something that makes a figure 4 or 5 points too high or low completely transforms a poll.

    For other policy questions, where you are really looking at support/oppose/even split a couple of percentage points either way doesn’t really change the big picture.

  9. RiN

    And even German economists are becoming believers ! :–

    http://www.businessinsider.com/eurobonds-new-euro-endgame-2011-11

  10. Anthony
    @”I do ponder the same questions! More seriously, it is often useful, as long as there are big differences.”

    OK-so why not have pollsters agree what constitutes “big” in this context & just publish that with appropriate notes.

    I suppose the downside would be UKPR traffic would decline somewhat ! :-)

  11. I’m afraid not really related to polling samples (though I do love the regular rants about polling ineptitude), but I was wondering if anyone knew of any sort of polls or assessment of how “good” your MPs are. How often do they reply to their constituents’ emails, vote according to their pledges to their constituents etc? Just as an issue of interest.

  12. @Richard in Norway

    Perhaps you Norwegians can use your oil fund to buy up Euro debt.

  13. Wolf

    Well the fund managers are probably stupid enough.

  14. @Ken

    Fancy you advertising on this site.
    Leave it out!! 8-)

  15. @ AW
    A poll of 1,000 people has a margin of error of +/- 3%, a poll of 2,000 people a margin of error of +/- 2%
    ——————————–

    How do pollsters know this to be so?

  16. The lights in Canary Wharf just went out, do you think it means banking, but not as we know it. If only we had those great astrologers, Keynes and Russell Grant to assist, okay, one will do. :-)

  17. VALERIE……I was just vain enough to be fooled myself. :-)

  18. Valerie – see here http://ukpollingreport.co.uk/faq-sampling

    It’s basic probability. If you have a bag of 1000 balls, 500 red, 500 blue, then you know that if you pick one out there is a 50% chance of a blue one, 50% chance of a red one.

    Now imagine you had a big bag of 40,000,000 balls, half blue, half red. And you pick out 1000 balls. We know (ignoring the effect of balls already picked), that each ball you pick has a 50% chance of being red, 50% chance of being blue. Hence one can calculate the odds of getting particular combinations: in 95% of instances, the number of red balls you’d get would be between 469 and 531.

    Polling is just reverse engineering that, so say you didn’t know how many balls of each colour are in the bag, but you got 500 red and 500 blue, you’d be able to be 95% certain that between 47% and 53% of the balls were red.

    For a better explanation though, you’d need a mathematician. It’s their theories! We pollsters just utilize the fruits of their labour.
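For readers who want to check the bag-of-balls figures, here is a small sketch that computes the exact binomial probability of drawing between 469 and 531 red balls in 1,000 draws when half the balls are red. Like the comment, it ignores the effect of balls already picked, so each draw is treated as an independent 50/50 event; the function name is invented for the example.

```python
from math import comb

def prob_between(n, low, high):
    """Exact probability that n independent 50/50 draws give between
    low and high red balls (inclusive)."""
    favourable = sum(comb(n, k) for k in range(low, high + 1))
    return favourable / 2 ** n

# The range quoted in the comment, and a range one ball narrower on each side.
print(f"469 to 531: {prob_between(1000, 469, 531):.3f}")
print(f"470 to 530: {prob_between(1000, 470, 530):.3f}")
```

The first probability comes out a little above 95% and the second a little below, so 469 to 531 is the narrowest symmetric range that covers at least 95% of outcomes.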

  19. Thanks Anthony.
    An explanation I can actually understand! :-)

  20. @Ken

    I see the ad has gone now! :-)

  21. I’ve always thought that the far bigger caveat with polls on voting intention is that, with the exception of the few months leading up to a general election, the question “If there was an election tomorrow, how would you vote” is a very hypothetical one. If there really was an election the following day, chances are most respondents would have put a lot more thought into how they will vote than an off-the-cuff answer given when asked by telephone or an online poll. So there’s no knowing if the floating voters would really have voted the way they said they would in the poll, even if every respondent thought they were voting truthfully.

    This is why I think there needs to be caution when one party leads the other in VI by more than 3% (or even 6% if you want to safeguard against the 3% margin of error in both parties). As someone with a higher education in chemistry/physics, I can of course tell you that statistical error is easy to estimate, but sources of error in the methodology (in this case, the question of whether people actually would vote the way they think they would) are very difficult to quantify. And the absolute bugger is that as we approach polling day and voting intentions shift, we don’t know whether one party has gained in popularity, or the polls are merely re-aligning themselves with true voting intention (as, of course, most voters will give their vote progressively more thought as polling day approaches).

    And this means that even if the polls remain static until 2014, there will still be everything to play for in the final few months.

  22. VALERIE………So it has. Incidentally, for a simpler analysis of statistical randomness, which is what you are discussing, vis a vis polling of course, take pi: in the first six billion decimal places of pi, each of the digits from 0 through 9 shows up about six hundred million times, thus pi, like polling probability, gives you a fair idea of the odds. On second thoughts, perhaps we should stick with Anthony’s bag of balls. :-)

  23. Quite so. I’ve spent half my life explaining to the people I work with that the survey results they are panicking/ complaining/ proposing to take action about, aren’t worth the paper they’re written on because they’re based on a sub-sample of 8 (quite apart from the fact the 8 are extremely unlikely to be random in the real world).

  24. Ken’s ad is back.
    Welcome back Ken.
    ….

    But on polling…surely the fact that there is a panel who are polled on amongst other things politics will tend to increase the tendency of those in the panel to become more politically aware…unlike the vast majority of the populace who couldn’t give a damn and cast their vote at GEs etc based on no more than how a candidate or the leaders of the parties sound (or look)?

  25. MIKE N………..Do I detect a touch of cynicism in your judgement of our electorate, oh ye of little faith. :-o

  26. Chris Neville Smith –

    Absolutely. I try not to bang on about margin of error too much because people then fixate on it as the be all and end all of polling error. It’s the reason I don’t quote sample sizes in the polling tables, as then people automatically assume big sample=most accurate.

    The margin of error only refers to the normal sample error. Polls will also be subject to all sorts of different error, non-response bias, non-contact bias, weighting effects, effects from question wording, none of which can be neatly quantified in the same easy way.

    Mike N – you might have thought so, but it doesn’t really happen. Most YouGov polls are non-political, and the recruitment is normally targeted at non-political groups. We’ve never found any real evidence of panel effect.

  27. Ken

    I’m a realist!

  28. MIKE N………….Then Ed M could do with your help, he seems to have decided to be Michael Foot. :-)

  29. Ken

    He hasn’t bought a duffel coat has he?

  30. RIN…………..Any cartoonists picking up on your notion would be on fertile ground, Foot, Benn, Miliband………duffles all ! :-)

  31. The amount of missing money in the MF global bankruptcy has increased from $600 million to somewhere between 1.2 to 3 billion and still no arrests. There are reports on the main, market cheerleader tv station that small investors are pulling their money out of the market.

  32. Which is your advert Ken?

  33. Hurray! This means I will be able to stop whinging to people about drawing ridiculous conclusions from small samples until they start doing it again. So several minutes then. :?

    Some points that sometimes get missed. The warning about sub-samples not having the same make-up as the sample as a whole does matter. For example the figures are weighted so that there are the expected number of Conservatives in the sample as a whole. But this means that for Scotland, where the percentage of Conservatives is much lower, there will probably be ‘too many’ Conservatives – hence the reason why their figures in the Scottish cross-breaks are a bit bigger than you expect. The point about this sort of bias is that you can’t reduce it by increasing the sample size – another good reason for being careful about cross-breaks.

    Also you have to be careful of what your sample size is. For example if you look at the Voting Intention in YouGov’s daily polls, the sample size in the latest one (published Sunday) is the conveniently round 1700. But 25% said they wouldn’t vote or didn’t know how they would and these were excluded before calculating the VIs. So the real sample size was 1275. This means the MoE is +/- 2.7 not the +/- 2.3 it would initially appear to be.

    To complicate matters the MoE is smaller when the number you are estimating is smaller. Normally MoEs are calculated at the ‘50% level’. To use the 1275 sample quoted above this would be 2.74 at 50%, 2.68 at 40% – so little real difference. However if you estimate MoE for an expected value of 10%, it’s only 1.64%. This explains why the VI figure for the Lib Dems doesn’t move about as much as for the other two. It’s nothing to do with their vote being more solid it’s because it’s smaller.

    Finally and most importantly it must always be remembered that MoE is a minimum. It’s the error you would still get even if everything else is perfect, just because you’re taking a sample. Nothing you can do can make it any smaller. In reality however it will be bigger than that because we don’t live in a perfect world. To make things worse we usually don’t know how imperfect it is. So always mentally add on a bit before you use the figures.
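The two numerical points in the comment above (the reduced sample once non-voters and don’t-knows are dropped, and the way the margin shrinks for smaller shares) can be reproduced approximately with the usual simple-random-sampling formula. This is a sketch under that assumption, with a 95% multiplier of 1.96; the results agree with the quoted figures only to within rounding, and the names are illustrative.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error in percentage points for a share p
    estimated from a simple random sample of size n."""
    return 100 * z * math.sqrt(p * (1 - p) / n)

full_sample = 1700
excluded_share = 0.25  # would not vote or did not know
effective = round(full_sample * (1 - excluded_share))

print("Effective sample size:", effective)
for share in (0.5, 0.4, 0.1):
    moe = margin_of_error(effective, share)
    print(f"share {share:.0%}: +/- {moe:.2f} points")
```

The margin is widest at a 50% share and narrows as the share moves towards 0% or 100%, which is the pattern the comment describes for the smaller party’s figures.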

  34. Richard in Norway

    @”The amount of missing money in the MF global bankruptcy has increased from $600 million to somewhere between 1.2 to 3 billion and still no arrests.”

    It is very heartening to follow your concern for the derivatives traders whose funds have gone missing :-)

  35. Colin

    Well it could have been me! But it’s not just speculators, it’s also small businesses hedging currency etc etc. Also the point is confidence in the market and the liquidity effects of any fears that the market ain’t safe

  36. COLIN……….Valerie was being a bit naughty, a popup on the home page advertising the British Legion, with the strap line, ‘Leave it out Ken’ appeared a few times today, the Ken in question was of the Clarke variety, not me. :-)

  37. Colin

    Also many of the MF global customers were ordinary folk who invested their 401k’s on the markets, that is their retirement money which may have been stolen(don’t quote me on this I need to check it out)

  38. KEN

    Thanks-I was being a bit dim:-)

    RICHARD

    I was just teasing-pleased that you don’t exclude speculators from your concerns ( are they now in the 1% seeing as you are one of them ? )

    If a crime has been committed , I hope & expect the US authorities will prosecute. They are already checking it out. I think you just need a little patience.

  39. RICHARD

    The MF Global numbers currently being reported appear to be :-

    Recovered by administrator so far $3.7 bn

    Of which-distributed already to clients $2bn,

    Leaving $1.7bn to distribute.

    Administrator expects to have returned 60% of clients money by Dec.

    The original estimate of $600m was said to be an 11%
    shortfall-so the new estimate seems to equate to 22% shortfall.

    But there is a huge workload just to establish the facts.

  40. COLIN……….Jon Corzine will find comfort as bridge partner to Bernie Madoff, apparently they share the interest. :-)

  41. Colin

    As always your faith in authority is touching -:)

  42. KEN :-)

    Here’s hoping.

    RICHARD

    It’s the US we’re talking about-not the UK.

  43. Pete B.
    I posted earlier this year a mea culpa re how embarrassed I am looking back at my attitude to bad economic news in the early ’80s. My first reaction would be almost pleasure at the negative impact I felt it would have on the Gov’t support rather than concern for the people affected.
    I can honestly say that this is not the case now. I genuinely hope that the GO strategy will not be as damaging as I think it will be and I hope that there will nuanced a change to a more balance approach even reverting to 5 years rather than 4 for the deficit reduction plan blaming EZ crisis. Of course Labour will say told you so etc but no one will care come the GE how we get to where we are so the Gov’t would benefit and that would be fine by me if they (as I see it) do less damage than would have been the case.
    BTW – the where ‘righties’ similarily gleeful in 2009.

  44. Sorry my English/typing crap.
    I hope there will be a nuanced change to a more balanced approach.
    Also, BTW there were ‘righties’ on here gleeful in 2009.

  45. Anthony would be amazed by what some party workers think are reliable samples when doing their ‘5-bar gates’ at election counts.

    Some tellers think if you sample just thirty votes from a given ballot box that you can call the way a given polling district has voted.

    I’ve done some at-the-count experiments by doing cumulative samples from individual ballot boxes: interestingly, these reveal that although a sample of fifty is usually unreliable, by the time you get down to the 100th vote sampled the cumulative percentage share for each party has pretty well settled down.

    I realise that what I am saying is heresy to any qualified statistician.

    One thing I learnt at MORI is that the size of what we in those days called the ‘universal set’ is not relevant to the accuracy of a poll, but I don’t see how this can be.

    Surely if you have a ballot box with only 200 votes in it, and you sample 100 of them, then the level of confidence you can have in your sample is greater because the remaining 100 (unsampled) ballots can have less effect on the final result.

    Of course this does not apply to Anthony’s assessment of Muslim opinion, as over a million of our fellow citizens belong to this faith.

    Just one thought about this, though: could it be that Muslims are seemingly more patriotic because they fear that they have to go out of their way more than other groups to express their loyalty to Britain (for the simple reason that their loyalty is constantly – and unfairly – under scrutiny)?

  46. Robin – the size of the universe you are polling does make a big difference if it is a case of 200 people, or 1000 people or something. As the universe gets bigger though it makes less and less difference, beyond around 40,000 it makes very little difference at all.

    A sample of 1000 out of 2000 will indeed be more accurate than a sample of 1000 out of 5000 (margins of error would be 2.2 and 2.8 respectively).

    However a sample of 1000 out of 100,000 will not be significantly more accurate than a sample of 1000 out of 1,000,000 (margins of error of 3.08 and 3.1 respectively).
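These figures are consistent with applying the standard finite population correction, sqrt((N - n) / (N - 1)), to the usual margin of error. The sketch below is one way to reproduce them; it assumes simple random sampling without replacement, p = 0.5 and a 95% multiplier of 1.96, and is not necessarily how the numbers above were calculated.

```python
import math

def margin_of_error_fpc(n, population, p=0.5, z=1.96):
    """Approximate 95% margin of error in percentage points for a sample of
    size n drawn without replacement from a finite population."""
    base = z * math.sqrt(p * (1 - p) / n)
    # Finite population correction: shrinks the error as n approaches N.
    fpc = math.sqrt((population - n) / (population - 1))
    return 100 * base * fpc

for population in (2_000, 5_000, 100_000, 1_000_000):
    moe = margin_of_error_fpc(1000, population)
    print(f"1000 out of {population:>9,}: +/- {moe:.2f} points")
```

Once the population is many times larger than the sample, the correction factor is so close to 1 that it makes almost no difference, which is why the size of a national electorate barely matters.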