number crunching ISSUE 18 | Summer 2013
Bringing cutting-edge science into the classroom

MAKING SENSE OF NUMBERS
How statistics help us understand the world

vital statistics
Statistics can seem daunting, but don't panic! This issue of Big Picture shows how we can use maths to understand more about the world around us. Join us as we explore how to use stats to summarise data, see whether our figures are significant and put our findings into context, so we can make decisions based on evidence rather than opinion. Read this alongside our online content at www.wellcome.ac.uk/bigpicture/numbers.

Some thought-provoking numbers from the world around us

[Chart: adults with numeracy skills equivalent to GCSE grade G or higher, by age. Source: www.gov.uk/government/uploads/system/uploads/attachment_data/file/32277/11-1367-2011-skills-for-lifesurvey-findings.pdf]

INSIDE
Vital statistics – Some interesting numbers from the world around us.
How science works – How researchers use science to explore and understand.
Making sense of stats – A look at how we use statistics to interpret data.
Risky business – Exploring how different people understand risk.
Stats Q&A – Putting to rest some common statistical myths.
When stats go bad – Real-life examples of how statistics can be misused.
Real voices – Three people talk about how they use science and stats.

[Chart: number of car drivers killed or seriously injured in Great Britain, by driver age, comparing the 2005–09 average with 2011. Source: www.gov.uk/government/statistical-data-sets/ras30-reported-casualties-in-road-accidents]

ONLINE
Go to www.wellcome.ac.uk/bigpicture/numbers for more teaching resources, including extra articles, useful web links, lesson ideas, curriculum links and more. You can also download the PDF of this magazine and subscribe to the Big Picture series.

data storage
[Infographic comparing data storage: a 32 GB portable MP3 player and a 250 GB games console, set against 18 000 000 GB of usable storage at the Wellcome Trust Sanger Institute by 2011, 100 000 000 GB of Facebook photos and videos by mid-2012, and 200 000 000 GB (200 petabytes) of data amassed by CERN by mid-2012. Sources: www.amazon.com, www.itbusinessedge.com/cm/blogs/lawson/the-big-data-software-problem-behind-cerns-higgs-boson-hunt/?cs=50736, www.extremetech.com/computing/129183-how-big-is-the-cloud, Wellcome Trust Sanger Institute]

worldwide email accounts
3.3 billion in 2012, rising to an estimated 4.3 billion in 2016. Source: www.radicati.com/?p=8417

earnings premium for those who took A-level maths
10%. Source: www.reform.co.uk/client_files/www.reform.co.uk/files/the_value_of_mathematics.pdf

coin tossing probabilities
0.25 – the probability of getting two heads if you toss a coin twice in a row. 60% of the 97 Members of Parliament tested couldn't correctly predict the probability of getting two heads in a row. Source: Royal Statistical Society and Ipsos MORI

finding data
Putting this diagram together, we found that different sources gave different numbers for the same thing. Why don't they match? Well, data can be interpreted in different ways, and estimates can be made using different methods and/or baseline data. Definitions matter, too – different sources might define 'numeracy' or 'adult' differently. Which should you choose? The source itself is important – is it reliable? Are the figures recent? How might an organisation's 'agenda' affect how it calculates and presents data?

how science works
Scientists work to investigate, interpret and understand the world around us.
They use a set of tools and techniques known as the scientific method and produce data. iStockphoto Beating bias Researchers try to keep things objective Bias is anything that introduces errors into research and distorts your findings. Good design means trying as much as possible to eliminate bias throughout the experiment – from the initial research through to the publication of the results. Researchers try to reduce bias in several ways. These include using blind trials, in which certain information is kept from people in a study or even the investigators (e.g. patients not being told whether they are receiving an experimental drug or proven drug). They also use control groups: the control group is treated the same as the experimental group, except in the one variable you are investigating. If a population is being sampled, the sample size needs to be big enough to reflect the overall population as precisely as possible. This increases the study’s reliability (how likely it is that someone repeating the experiment would get results similar to those of the initial investigator), but it often adds to the cost. How the sample is chosen is also important. Choosing the sample randomly or systematically helps to eliminate investigator and other biases. As the name suggests, systematic sampling uses a system. You break a population into elements that are then selected at regular intervals to form the sample – for example, from a list of everyone in year 12, start with a randomly selected student and then pick every 20th student from the list. The null hypothesis A hypothesis is an explanation you can test It’s human nature to look for patterns and draw conclusions from what we observe – for example, to argue that there is a link between x and y. However, science can actually never prove anything with absolute certainty. Instead, researchers assume that no link exists and explore how likely it is that they would still see the same result because of chance or other unknown factors. In science, a hypothesis is the explanation you think is behind an observation. To show a scientific hypothesis to be true, you actually need to show that the null hypothesis – a ‘non-event’ where the effect is not seen – is false. Statistical analysis can then be used to assess the support for the alternative hypothesis. For example, you might think that bumblebees prefer one colour flower over another. In this case, your null hypothesis (H0) is ‘There is no difference in the number of visits to each colour of flower’, and the alternative hypothesis (H1) is ‘There is a difference in the number of visits to each colour of flower’. MORE ONLINE: www.wellcome.ac.uk/bigpicture/numbers 4 | BIG PICTURE 18: number crunching sturdy studies How to recognise good research When it comes to testing a hypothesis, high-quality results come from a study that is well designed and limits bias (see ‘Beating bias’, left). Often, this can mean changing only one variable at time, although this can be hard in real-world situations. So-called ‘multivariate statistics’ can be used where several different variables are being observed at once – for example, when assessing the effects of fertilisers on plant growth, where the variables might include plant height and crop yield. Care should be taken when extrapolating the results of a study. Extrapolating means stretching information beyond the specific group you were studying – for example, applying the findings of animal research or in vitro research (i.e. research in test-tubes) to humans. 
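To make the systematic sampling described under 'Beating bias' above a little more concrete, here is a minimal Python sketch. The year-12 register, its size and the interval of 20 are invented for illustration; it simply takes a random starting point and then every 20th name after it.

```python
import random

def systematic_sample(population, interval):
    """Systematic sampling: pick a random start within the first interval,
    then take every `interval`-th member of the list after that."""
    start = random.randrange(interval)
    return population[start::interval]

# Hypothetical year-12 register of 240 students
year_12 = [f"student_{i:03d}" for i in range(1, 241)]

sample = systematic_sample(year_12, 20)   # roughly 12 students
print(sample)

# For comparison, a simple random sample of the same size:
print(random.sample(year_12, len(sample)))
```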
When results are published in a peer-reviewed journal, this means the articles have been reviewed carefully by scientists working in the same field. This lends credibility to the findings and indicates that they reflect research that has followed the scientific process. FAST FACT Drowsy driving is dangerous: if you’re awake for 24 hours, the effect on driving is equivalent to a blood alcohol level of 0.1 per cent, greater than the UK legal limit. Source: well.blogs.nytimes. com/2013/01/04/drowsy-drivers-pose-major-risks look at the evidence grand designs Evidence is central to science Evidence is what science is all about. It’s all well and good having an idea about how the world works, but you need something to back it up. The scientific method allows researchers to test their ideas through investigations. First, you need a question. This normally comes after some initial observations that suggest something interesting is happening or from a problem you want to solve. You can construct a hypothesis that could explain what you’ve seen and carry out an experiment to Not all experiments are the same test it, then analyse the data and draw conclusions. The overall quality of any study depends on making sure each step is done properly. Not all evidence gathered using the scientific method is the same. Quantitative data are measurable (e.g. length or height), and because they are numerical they can be analysed statistically. Qualitative data are descriptive (e.g. hair colour). Analysing qualitative data is more difficult and interpretations can be subject to personal opinion. How you design a study depends on the question you’re asking. In medicine, the most appropriate type of study depends on whether you are trying to diagnose, treat or calculate the likely outcome of a condition. For more on this, see the diagram below. In ecology, as in medicine, samples are taken: examining an entire population can be time-consuming and damage the environment you’re looking at. The design of the study depends on what you’re investigating. For example, to estimate the size of an animal population, researchers often use a mark–release–recapture method. Marking and releasing a set number of individuals, then capturing another set number and counting how many individuals got caught twice gives a good indication of how many animals there are altogether. To sample plant populations, quadrats are used so that each sample comes from a specific area of ground. types of medical study This diagram summarises some of the different types of study and analysis used in medical research. In general, the higher up the pyramid an approach is, the higher the quality of evidence produced by that approach (and the smaller the amount of evidence available). Laboratory work Case series/case report Case-control study Cohort study Work is done in the lab, using test-tubes or animals, to hone the research method and question before moving on to more advanced studies. Based on the treatment of individual patients and on observation rather than experiment. Even if the data are quantitative (numerical), it is difficult to make any generalisations. An observational study that compares individuals who have a particular condition (‘cases’) and those who do not (‘controls’). You can see any correlation, or link, between a particular factor and the disease, but you cannot draw reliable conclusions about any cause. 
A group of people are monitored over an extended period (often years) to see how changes in one variable affect another – for example, smoking and lung cancer. Randomised controlled trial (RCT) Systematic review and meta-analysis Often used in drug testing, RCTs involve the participants being randomly assigned to receive either the treatment under investigation or a placebo (dummy treatment). Known as the ‘gold standard’ for clinical research. The strongest evidence. A systematic review collects all the available literature on a particular topic, and metaanalysis is used to combine the numerical outcomes of many separate RCTs. Source: Based on a diagram from the UNC Health Sciences Library Summer 2013 | 5 making sense of stats Experiments yield data. How can we interpret this information using statistics? What are some of the common pitfalls in data analysis and interpretation? Graphically thinking Just about average There are different types of average When we talk about ‘an average’, what we’re really trying to do is get some sense of where the middle is. We can then use that as a way of comparing two groups of data. Unfortunately, there isn’t just one type of average – there are several. To get the mean, add the data together and divide the total by how many data points there are (see equation below). Beware, though – outlying data can often skew the mean to be artificially high or low. Take the following number list: 1, 3, 6, 9, 9, 11, 14. The mean here is 7.57. However, add a much higher number to the end of the list, say 50. The mean is now 12.88. Just one particularly large outlier has almost doubled the mean, and the majority of the numbers are below the mean. Imagine how the mean wealth of biology teachers in a room might change if Bill Gates joined them. If you place the numbers in ascending order and look for the middle value in the list, you have the median. If there are an even number of values, you take the mean of the middle pair. For the original list, this is 9. Outliers have a much smaller effect on the median than the mean, so adding 50 again does not alter the median. The mode is the value that occurs most often in a list. For this list, the mode is 9. What’s your type? Not all data are the same iStockphoto Researchers define data in different ways. For example, data are categorical if the values can be sorted into non-overlapping categories (e.g. by blood type, species or sex). Every value should belong to only one category, and it should be clear which one it belongs to. Categorical data are also known as ‘nominal data’, or ‘frequencies’, as the research looks to find out how frequently data fall into each category. Ordinal data, by contrast, can be ranked or have some sort of rating scale. Ordinal data often come from surveys and questionnaires. Data can also be defined as discrete or continuous. Data are discrete, or discontinuous, if they can take only isolated values. Continuous data can take on any value and are limited only by how accurate your measurements are. So, while foot length is continuous, shoe size is discrete because you can’t be a size 7.234434 – you have to be a 7 or a 7.5. MORE ONLINE: www.wellcome.ac.uk/bigpicture/numbers 6 | BIG PICTURE 18: number crunching Using graphs and diagrams to show data When you have your data, you may want to represent them graphically – for example, to show whether two variables are correlated (e.g. a scatter graph plotting duck egg length against duck egg width) or to show different proportions (e.g. 
a pie chart showing the prey items of a lion). Which chart or graph you use will depend on the type of data you have. Common diagrams include bar charts, pie charts, line graphs, scatter graphs and histograms. See www.wellcome.ac.uk/bigpicture/numbers for a how-to guide on histograms and weblinks to other resources on graphs and charts.

A number of significance
Significance has a special meaning in stats
If you want to accept your alternative hypothesis, you must first reject your null hypothesis. There is a chance, however, of rejecting the null hypothesis when it is actually true. It is usually possible to calculate the probability (p-value) that what you observed in an experiment was due just to chance. You use a significance level to decide whether you will reject the null hypothesis, and this is often set at the 0.05 or 5 per cent level. If your measured p-value equals 0.04, for example, then this is less than 5 per cent, so you can reject the null hypothesis and accept your alternative. Still, this doesn't mean that you have proved the alternative hypothesis: if the null hypothesis were true, there would still be a 4 per cent chance of getting your result. If an unscrupulous investigator keeps on doing experiments on useless treatments, they will still get results 'significant at the 5 per cent level' on 1 in 20 occasions. If only those 'positive' trials are reported, we will get a very misleading impression. This is why it is essential to have access to all the evidence, whether positive or negative. 'All trials registered, all results reported' is a campaign started by researchers, doctors and others to try and get all clinical trials past and present to be registered and to have their results reported.

FAST FACT
In a class of 23 people, the chance that two people will have the same birthday is just over 50 per cent. Source: mathforum.org/dr.math/faq/faq.birthdayprob.html

What is normal?
Many things follow a normal distribution
Datasets can be spread out in many different ways. The majority of the data can sit above the mean or below it. In many datasets, however – particularly large ones – the data points seem to settle equally on either side of the mean. Plotted on a graph, the shape of the distribution resembles a bell and so is sometimes called a 'bell curve'. This is also called a normal distribution (see graph below). Standard deviation is a measure of how spread out the numbers are around the mean. If a dataset follows a normal distribution, approximately 68 per cent of the data will fall within one standard deviation on either side of the mean. Around 95 per cent will fall within two standard deviations on either side. In such circumstances, the mean, median and mode of the data are all equal. There are many everyday biological examples that follow a roughly normal distribution, including blood pressure, height and foot length. Along with these examples, you could also try looking at stalk height in daisies, the length of holly leaves or the diameter of lichens (commonly found on gravestones).

[Graph: a normal distribution ('bell curve'), marked from −3SD to +3SD around the mean (SD = standard deviation), with 68% of the data falling within ±1SD, 95% within ±2SD and 99.7% within ±3SD.]

Choose your method
Exploring different statistical tests
There are several different statistical tests that you can use, depending on the type of data you are dealing with. Two examples are given here, and there are more online – plus a worked-through example for chi-squared ('chi' is pronounced 'ki' to rhyme with 'eye') – at www.wellcome.ac.uk/bigpicture/numbers.
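As a rough idea of what such a worked chi-squared calculation can look like, here is a minimal Python sketch reusing the bumblebee null hypothesis from 'How science works'. The visit counts are invented, and 3.841 is the standard critical value at the 5 per cent level for one degree of freedom.

```python
# H0: there is no difference in the number of bee visits to each colour of flower.
observed = {"red": 40, "blue": 22}        # invented visit counts
total = sum(observed.values())
expected = total / len(observed)          # 31 visits per colour if H0 is true

# Chi-squared statistic: sum of (observed - expected)^2 / expected
chi_squared = sum((count - expected) ** 2 / expected for count in observed.values())

critical_value = 3.841                    # 5 per cent level, 1 degree of freedom

print(f"chi-squared = {chi_squared:.2f}")  # about 5.23
if chi_squared > critical_value:
    print("Reject H0: a difference this big is unlikely to be down to chance alone.")
else:
    print("Do not reject H0.")
```

The same compare-with-a-critical-value logic applies to the worm-count example described next.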
The chi-squared test is used with categorical data to see whether any difference in frequencies between your sets of results is due to chance. For example, you could use the test with the null hypothesis that ‘there is no difference in the frequency of worms on different types of ground’. In a chi-squared test, you draw a table of your observed frequencies and your predicted frequencies and calculate the chi-squared value. You compare this to the critical value to see whether the difference between them is likely to have occurred by chance. If your calculated value is bigger than the critical value, you reject your null hypothesis. The t-test enables you to see whether two samples are different when you have data that are continuous and normally distributed. The test allows you to compare the means and standard deviations of the two groups to see whether there is a statistically significant difference between them. For example, you could test the heights of the members of two different biology classes. Jumping to conclusions Take care with correlation So you’ve collected your data and noticed a strong correlation between two of your variables. It might be very tempting to assume that a change in one is causing a change in the other, but don’t fall into the trap. A correlation shows that there is a relationship between your variables; it doesn’t prove there is a causal relationship. Think about Andrew Wakefield’s 1998 claim that using the MMR vaccine can result in autism. It’s true that use of the MMR vaccine had increased up to that point, as had the number of cases of autism recorded, so there’s a correlation between the two. This doesn‘t necessarily mean that the jab is causing the increase in autism – there could be a third factor (called a confounder) causing one or both of the variables to increase, such as an increase in maternal age. Wakefield’s work has been discredited and he was struck off in 2010, meaning he can no longer practise as a doctor. True causation can only be tested with carefully controlled studies, which often compare two groups who are matched in every way except for the variable of interest. This limits the role of confounders as much as possible. Oh, and watch your ‘u’s and ‘s’s – a causal relationship is very different from a casual one! Summer 2013 | 7 Risky business It is impossible to live a life without risk. So, how do we understand and weigh up the risks associated with different activities, behaviours and events? Take a chance Nobody lives a risk-free life responsibility for reducing risk, from individuals to governments. On the roads, for example, laws about speed limits, seat belts and drink-driving are intended to reduce the risk of accidents and injuries, and we all have responsibility for our own behaviour as a driver or passenger. In the home, people install smoke and intruder alarms to reduce the risk of fire damage or burglary. The same is true of carbon monoxide detectors. In addition, risk in the workplace has become a hot topic in recent years. Along with conventional risk reduction devices such as fire extinguishers and alarms, attention is increasingly turning to reducing other, less obvious risks at work. This can include supplying appropriate office furniture to reduce the risk of back pain and taking steps to reduce the risk of stress. 
On a larger scale, national and international government organisations spend a lot of time and money on health campaigns to warn about risks in many areas, from unprotected sex to smoking and from drinking while pregnant to the dangers of drugs. iStockphoto Risk is generally understood as an exposure to the chance of injury, loss or harm. Throughout our lives we come across countless situations that present such risks; they’re impossible to avoid. We often act to limit risks by undertaking a particular action or by reducing or stopping a certain behaviour. Sometimes we do it without giving it much thought – by taking an umbrella if the forecast says it’s likely to rain, for example. Everyone has some Life-changing findings? We don’t all respond to risk the same way Living the (micro) life One way to quantify risk MORE ONLINE: www.wellcome.ac.uk/bigpicture/numbers 8 | BIG PICTURE 18: number crunching deal with short-term choices, where the potential benefits are immediate, better than long-term ones. In some situations we have to weigh up relative risks – for example, with preventative medicines such as antimalaria drugs. Drugs like doxycycline and mefloquine can have unpleasant side-effects, but if you don’t take them when travelling to malaria hotspots, you run the risk of contracting the potentially fatal condition. Research into our perception of risk has tried to place a monetary value on how much the average person would have to be paid to willingly accept a one-in-a-million chance of death – see ‘Living the (micro) life’. You might think it would be high, but the findings estimate that it is just $50. iStockphoto US scientist Ronald A Howard first introduced the concept of ‘micromorts’ – a unit of risk measuring a one-ina-million probability of death. This allows us to compare the risk of day-today events. For example, in the medical world, going under anaesthesia for a non-emergency operation exposes you to an average of ten micromorts. In the UK, giving birth (all births combined) is worth 120 micromorts; a Caesarean section increases this to 170. Skydiving, rock-climbing and hang-gliding come in at ten micromorts or lower. Calculations suggest each mission flown by a member of Bomber Command in World War II carried 27 000 micromorts – a 2.7 per cent chance of death. People perceive and respond to risks in different ways. How we think and behave (psychological factors) and how society works (sociological factors) play a part in this – we often go with our gut feeling or are affected by the behaviour of people around us. It seems most people underestimate the risk in activities where they have control and overestimate the risk of things they can do little about. The timescale involved can also be a factor, particularly when it comes to health. It can be hard to be motivated to change short-term behaviour – such as eating or drinking habits – because of a risk of health problems in the distant future. This can often lead people to put off an action to another day, saying things like “I’ll stop smoking when I get older”. When it comes to long-term benefits, it is also potentially easier to take a ‘positive’ action, such as joining a gym, than a ‘negative’ one, such as drinking less. On the whole, it seems as though people FAST FACT On average, one in five in vitro fertilisation (IVF) pregnancies is a multiple pregnancy, compared to one in 80 for women who conceive naturally. 
Source: www.hfea.gov.uk/Multiplebirths-after-IVF.html

Are you absolutely sure?
Why risk should be reported responsibly
The way numbers are presented can be misleading, so watch out. In 1995, UK news outlets reported advice from the Committee on Safety of Medicines, which suggested that a new version of the contraceptive pill doubled a woman's risk of venous thromboembolism (VTE) – a condition in which blood clots form in the veins around the legs. In the wake of the news, the British Pregnancy Advisory Service estimated that the number of abortions in the UK rose by 13 000, reversing a previously downward trend, as a result of falling trust in the Pill. The number of births to teenage mothers also increased. The actual findings relating to the Pill and VTE are as follows: for every 7000 women that took the previous Pill, one would have VTE. For every 7000 women that took the new Pill, two would have VTE. The claim of 'a doubling' of the risk of VTE was not wrong: the number of women affected had doubled from 1 in 7000 to 2 in 7000. This is the relative risk. But was this the best number to publish? The difference in terms of women affected is just 1 in 7000 (the difference between 2 in 7000 and 1 in 7000 women), or 0.014 per cent.

Calculate your odds
Our gut reactions to probability aren't always correct
We're not always good at doing maths quickly. Often we go by gut feeling, rather than what the numbers tell us. Take childbirth, for example; imagine a woman has given birth to three children, who are all boys. If she becomes pregnant again then people might say – because all of her current children are boys – that there is a strong chance her next child will also be a boy, or that she must be 'due' a girl next. And yet biology tells us there is still a 50 per cent (or 1:1) chance of her having a boy, because each conception is an independent event and is unaffected by the existence of her previous children. The same is true of coin tosses. Just because a coin comes up heads ten times in a row, a head is no more likely than a tail on the 11th flip (provided the coin is not fixed). The probability of having a boy and the probability of the coin coming down heads are both 1/2, no matter what's happened previously. The situation is different if the events are dependent – if you pull an ace from a pack of cards without returning it, the probability of picking another ace goes down from 4/52 to 3/51.

Gut feelings are especially unreliable when unlikely events occur. Take natural disasters, for example. It is often said that events like floods, tsunamis and hurricanes are 'once every 100 years' events, and people are surprised when they occur more often. What it actually means is that there is a 1 per cent (1 in 100) chance of the event happening in any given year. Again, however, these are independent events: if a 'once in 100 years' flood happened last year, that doesn't mean that it can't happen again this year. It is unlikely – but unlikely things happen all the time.

stats q&a
From lottery mythbusting to understanding medical test results, your niggling number questions are answered using science and statistics.

Q: My dad refuses to pick 1, 2, 3, 4, 5 and 6 on the Lotto draw. Is this sensible?
A: Those six numbers are just as likely to come up as any other six numbers. In the UK Lotto, six numbers are drawn at random from numbers 1 to 49, giving 13 983 816 possible combinations.
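A quick way to check that figure is to count the combinations directly; a minimal sketch using only Python's standard library:

```python
from math import comb

# Ways of choosing 6 balls from 49 when the order drawn doesn't matter
combinations = comb(49, 6)   # 49! / (6! * 43!)
print(combinations)          # 13983816
print(f"Chance of the jackpot with one ticket: 1 in {combinations:,}")
```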
The chance of winning is approximately 1 in 14 million, no matter which numbers you pick. However, there is a very good reason why your dad shouldn't play 1, 2, 3, 4, 5 and 6 – if he did win, he would win less money than with another combination. An estimated 10 000 players a week choose the combination of balls 1–6. If you picked those numbers too and they came up, you'd have to share your cash with all those other people, which would severely dilute your winnings. Although you can't increase your chances of winning, there are other ways you might increase the potential amount you'd win. People pick numbers between 1 and 31 more often, for example, because they correspond to the birthdays of people they know. Selecting six numbers between 32 and 49 is more likely to set you apart from the crowd.

Q: My teacher says there's no such thing as 'truth' in science. So why do we bother?
A: Think for a second about how you'd prove something to be true. You could say, for example, that all swans are white – but you can't prove that's true because it's impossible for you to observe every single swan. However, just one observation of a black swan would prove that you were wrong. It's much easier to prove something is untrue than to prove it's true. Accordingly, the scientific method works by trying to falsify a statement. When a scientist puts forward a theory, the rest of the scientific community try to disprove it. It's only those theories that survive the onslaught – the ones that can't (yet) be proven wrong – that we stick with. That doesn't mean that one day a new piece of evidence won't come along to disprove it, but you trust it for now. For example, we think that dropping a brick will mean it falls to the ground, but we can never prove that it wouldn't one day float off. So far, though, the theory that it will fall has withstood all attempts at falsification.

Q: I just read online about a man who cured his cancer with carrot juice. Why can't we all use natural remedies?
A: Remember a key saying in statistics: correlation does not equal causation. In other words, just because there is a relationship between the volume of carrot juice the man consumed and the disappearance of his cancer, it doesn't mean that one caused the other. Even if it did, think what you're not hearing about. If he had died in spite of all the carrots, would an article have been written about it? Maybe there are thousands of failures out there that are not being reported. This is an example of 'selection bias' – where you only hear about the 'successes'. A related phenomenon, 'publication bias', happens in scientific research. Journals and some researchers are more likely to publish new, exciting results than 'negative' findings that a particular variable does not cause an effect. Medical science does not work on the basis of anecdotal evidence (based on personal experiences), but through carefully controlled trials.

Q: I've passed my driving test. I've been told by car insurance companies that there's no discount for being female, even though women are statistically less likely to have an accident. Why?
A: You're right that women are statistically less likely to have an accident. In fact, young men under the age of 22 used to pay an average of £1000 more a year in insurance premiums than women of the same age. This was because the statistics show that young men are twice as likely to suffer a serious collision as young women (and ten times more likely to be killed or seriously injured in one than those aged 35 or over).
However, an EU 'gender directive' that came into force on 21 December 2012 made it illegal to discriminate according to sex when pricing financial products, such as pension annuities and insurance. The change has seen premiums for women under 40 rise and those for men of a similar age fall. Some have argued that this is unfair, but others have said equality can't be selective. The change is unlikely to affect the premiums of those aged over 40 – the age at which the statistics show that men and women become equally likely to have an accident.

FAST FACT
More than 99 per cent of the population has a greater than average number of legs. Source: plus.maths.org/content/all-about-averages

Q: My mum doesn't like that I smoke, but I don't think it's a big deal. My great-grandma has smoked since she was 15 and is still going at 90!
A: A common misconception is that your own experience – or those of your friends or relatives – can be generalised to everyone else. As a famous quote by English statistician and biologist R A Fisher says, "That is an experience, not an experiment". Just because your great-grandma has survived to that age doesn't mean you will; you can't get meaningful data from a sample size of one. This is why most scientists use experiments that collect large volumes of data, rather than relying on anecdotal case studies.

Q: My doctor has told me that if I get a 'positive' result in a medical test, it doesn't definitely mean that I have the disease. Why not?
A: No test is 100 per cent accurate. Not all positive test results mean that someone has a particular condition; not all negative results mean that a disease is absent. Positive and negative test results can be described as true or false, depending on whether they classify correctly the person tested. The diagram shows the outcome of breast screening by an X-ray technique called mammography. For every 1000 women screened, 41 have a positive result and are called back for more tests. Of these, 8 women will be found to have cancer, and 33 will be found not to. So, an initial positive result means just a 20 per cent chance of cancer. Overall, 966.8 (8 + 958.8) women per 1000 screened will get a 'true' result, making this technique 97 per cent accurate.

[Diagram: of 1000 women screened, 41 get a positive test result and go for further testing – 8 are true positives (cancer) and 33 are false positives (no cancer) – while 959 get a negative test result – 958.8 are true negatives (no cancer) and 0.2 are false negatives (cancer).]

when stats go bad
Actually, this is more a case of 'when people using stats go bad'. Look at these real-life examples to see how numbers can be misused, misrepresented or poorly explained, and explore the implications of substandard stats.

Take care with your... wording
"Just 100 cod left in North Sea"
In September 2012 the Daily Telegraph website ran a story with a rather astonishing headline: "Just 100 cod left in North Sea". Actually, there are around 437 million cod in that body of water. What happened? The answer is a lesson in being precise with your vocabulary. The original story came from the Sunday Times, who had reported that an analysis of catches at North Sea ports across Europe in 2011 found no cod over the age of 13. Using statistical sampling techniques, it was calculated that this must mean fewer than 100 such fish exist (otherwise one would have been caught). The finding was reported under the headline of "Only 100 adult cod in North Sea" – but what constitutes a mature cod?
You’d assume from the story that it was any fish over the age of 13; however, cod reach full sexual maturity around the age of six. According to the Government, there are more than 21 million such fish in the North Sea. It all comes down what you classify as an ‘adult’, or ‘mature’, fish. Consider… Q:Can you find examples of other headlines that are misleading because of the words used (or those left out)? Take care with your... calculations iStockphoto “Chance of cot deaths in brothers ‘1 in 73 million’” Sally Clark served three years of a life sentence for the murder of her two children before her conviction was overturned in 2003. In the original case, the defence had claimed that sudden infant death syndrome (SIDS) – commonly known as cot death – was responsible for the death of both boys, who died just over a year apart. The prosecution argued that such a double cot death was exceptionally unlikely and claimed murder. The prosecution’s assertion was based on the expert testimony of Professor Sir Roy Meadow, a researcher in paediatrics. Meadow had said that the chances of one child dying from SIDS in a non-smoking, affluent family was 1 in 8543. When working out the probability of two cot deaths in the same family, he squared this probability – multiplying 8543 by 8543 – to get 1 in 73 million. MORE ONLINE: www.wellcome.ac.uk/bigpicture/numbers 12 | BIG PICTURE 18: number crunching This would have been the right thing to do if the two events were independent of each other, like tosses of a coin. The chance of getting two heads in a row is 1/2 x 1/2 (1/4). However, two cot deaths in the same family are not independent events; there could be underlying genetic or environmental factors that make them more likely. The Royal Statistical Society deemed Meadow’s account a “mis-use of statistics”. Consider… Q:Even if the chance of a double cot death in the same family really was 1 in 73 million, why would this not have meant there was only a 1 in 73 million (0.0000014 per cent) chance of the accused being innocent? Search for “prosecutor’s fallacy” online to find out more. Q:What figure should this number have been compared with to work out the relative likelihood of guilt or innocence? Q:Should statistical evidence in court only be presented by experts in statistics, rather than by experts in the field in which the statistics are being used? FAST FACT Although Grenada won only one medal at the London 2012 Olympic Games, it comes top in the medal table per capita (per person living there). The USA, which won the most medals, is ranked 49th when you count medals this way. Source: www.medalspercapita.com Take care with your... claimS “Lifescan, like an MOT for your body” Lifescan is a company offering you a CT scan to provide an “MOT for your body”. Their original TV advert described it as “a quick and easy scan that could detect the early signs of life-threatening diseases, way Wellcome Images before the symptoms begin”. It is also described as a “check-up all in one go”. In 2010, however, the Advertising Standards Authority (ASA) – the UK’s independent regulator of advertising across all media – ruled that the advert could no longer be broadcast in its original form after complaints from two medical doctors. According to the ASA, the advert implied that CT scanning in patients with no current symptoms could pick up any kind of underlying health problem, and they ruled there was no evidence to back up these claims. 
The ASA also ruled that Lifescan didn’t provide enough information about the potential risks of exposure to radiation. According to a 2007 report from a governmental advisory committee, a typical CT scan carries a one-in-2000 lifetime risk of developing a fatal cancer. Although this risk might be acceptable in high-risk patients, it might be unacceptable for people without symptoms. In explaining their verdict, they said: “We were concerned that the...respondents entering the competition were selected on the merits of their competition entry [the short story] and may have been inclined to be less than impartial in their survey responses in order to stand a better chance of winning.” Why risk should be reported responsibly Consider… Q: How much information can a company reasonably be expected to include in a short advert? Q: To what extent is the consumer responsible for doing their own research before having a scan? Take care with your... sample “Recommended by 93% of Red readers” iStockphoto The Advertising Standards Authority (ASA) banned a TV advert for hair product Nice ‘n’ Easy in 2009. The advert included a voiceover that said “93% of Red magazine readers would recommend Nice ‘n’ Easy to a friend. The other 7% probably don’t have any friends”. Afterwards, some text flashed up on the screen: “Participants in a survey of 245 Red magazine readers, April 2008”. The problem wasn’t so much the small sample (a professional opinion survey typically covers at least 1000 people) as the way the data were collected. Participants had volunteered to take part in the survey and were sent a pack of the hair dye, as well as the survey (which included a question about whether they would recommend the product to a friend). If they returned the survey, along with a photo of themselves and a short story, they had a chance of winning a trip to New York. The ASA banned the advert because they believed the claim of 93 per cent was misleading. Consider… Q: How many of a magazine’s readers should you ask before you can make claims about the opinions of its readership? Q: Is it OK to provide incentives to people taking part in surveys? Will you get the same results, or will the incentive skew the findings? Q: Is there any guarantee that the people filling in the survey actually tried the hair dye they were sent? How could it have been done differently? discuss Is it ever morally acceptable to ‘spin’ information about risk to try to influence people’s behaviour? Imagine a new report has been published, linking eating meat to the development of cancer. Is it right for a newspaper to report the relative risk without the absolute risk (see page 9), to create a more striking story and therefore sell more papers? Would you feel differently if this were done by a government health campaign or a cancer charity trying to raise funds for research? What if it were used in adverts by a business that produces vegetarian food? Summer 2013 | 13 real voices Three people tell us about the role of statistics and science in their lives. Meet Vicky Peterkin, a biostatistician at a pharmaceutical company; biology teacher David Colthurst; and Anthony Underwood, who uses bioinformatics at Public Health England. Vicky Peterkin dr David Colthurst Senior biostatistician at a pharmaceutical company What do you do? I work as a statistician in the pharmaceutical industry, specialising in clinical trials for new treatments. How do you use statistics in your job? 
To find out whether the drug we’re testing significantly improves a measurement of interest compared with a control drug. For example, if we’re testing a new drug to treat high blood pressure, the measurement of interest might be the change in blood pressure since starting treatment; however, it’s not enough to just look at the mean blood pressure change in each treatment group. We need to adjust for other information (such as age and weight) and show whether the difference is statistically significant. Why is that? Adjusting for other information ensures that any differences between the treatment group results are due to the drug, not due to differences in disease severity and demographics. Statistical significance indicates that the size of the difference between treatment groups is too big to have occurred by chance and must be due to the treatments. What is a p-value? When we run a statistical test, we get a p-value at the end. It shows the statistical significance – the degree to Biology teacher leading a project to do scientific research in schools which we can be certain that the effects we see are due to our drug, rather than chance. It’s the key piece of information we want from the trial. It tells us whether the drug is useful and whether we should carry on researching it. That’s what I love about statistics: you can boil a huge amount of information down to a single, clear, yes-or-no decision. Why might people be wary of maths? I think people imagine statisticians sit at a computer working on their own all day. In my job that’s just not the case. I’m involved at every stage of a clinical trial – I help design it, check the data quality while it’s running and analyse it all at the end – and I work closely with medics, scientists and trial monitors on a daily basis. What training do you have? I studied maths at university, then a Master’s in statistics. There’s a lot of on-the-job training, too. I think students need to be exposed to statistics at an earlier age, so they can see for themselves how useful it is. The industry is in desperate need of young statisticians. You can become involved in important trials very quickly, and your opinion is highly valued. Find out more about working as a statistician at www.psiweb.org/newcareers. MORE ONLINE: www.wellcome.ac.uk/bigpicture/numbers 14 | BIG PICTURE 18: number crunching What do you do? I am a secondary school science teacher. I am also the lead teacher on the MBP2 project, which gives sixth-form students the opportunity to carry out genuine research. What is the MBP2 project? Five years ago my wife was diagnosed with multiple sclerosis. I wondered if I would be able to combine my 15 years of experience as a biochemist with my 15 years of teaching. The Myelin Basic Protein Project (MBP2) is investigating the role of this protein in multiple sclerosis using genetically modified yeast. The project is run in collaboration with researchers at the University of Kent. When we first started, they ran a workshop at the University to teach DNA and protein techniques to a small group of students, who then taught them to their teachers and other students. It was a nice turnaround to have the students be the experts while the teachers sat scratching their heads, thinking “How does this bit work?” What do your students gain? The real wake-up call for them was around experimental procedure – often, experiments don’t work as planned or give the results that are expected. When this happens, you have to tweak and change the design. 
Very quickly the students realised that while you may do steps A, B, C and D in the hope you will get result E, nine times out of ten it won’t work! It has also given the students the opportunity to try lots of techniques that they normally wouldn’t get to experience until university. For example, our students have developed a genetically modified yeast strain that can make the human myelin basic protein. While doing this they have learnt how to extract DNA, use the polymerase chain reaction (PCR) to amplify DNA, and carry out Western blots to study the proteins produced. What is next for the project? I believe MBP2 provides a model for how students can carry out research in schools. Inspired by this, Authentic Biology is a series of research projects led by sixth-form students in five schools across the UK. Each one is drawing on the expertise of researchers at their local universities to investigate topics relevant to them. In London, for example, students are researching diabetes, and in Sheffield they are looking at heart disease. We are currently planning the second Authentic Biology Symposium. This is an opportunity for the schools involved to share their research with each other and with academics from the partner universities. Find out more at www. thelangtonstarcentre.org/ index.php/mbp-squared-link. Big Picture Debates the Brain Is being in love just a chemical reaction? Is technology harming our brains? These are just some of the areas of debate presented in our free app, which explores social and ethical questions about the human brain. Find it at brainapp.wellcomeapps.com or on the App Store. dr ANTHONY UNDERWOOD Bioinformatician at Public Health England What do you do? I’m a bioinformatician, which means I use computers as a laboursaving device to answer biological questions more quickly than would be possible using the traditional biological methods. I work for Public Health England (PHE), an executive agency of the Department of Health with a broad remit to protect the community against health dangers, including infectious disease, chemicals and radiation. How do you use statistics in your job? I work in the bioinformatics unit of the Microbiology Services Division of PHE. We are currently involved in a project that aims to sequence many of the genomes from bacteria or viruses in patient samples sent to us by hospitals and GP surgeries. Obviously, that generates a vast amount of data very quickly, and it’s my job to crunch it so we can get meaningful information from it. What kind of information? My lab colleagues need to identify the species and characterise the particular strain of the bacteria or virus. They look for sets of genes that might make it resistant to particular drugs or produce variants of a toxin that make it more harmful when infecting people. The revolution in sequencing technology means we can now identify differences between bacteria at the level of a single DNA nucleotide. That means we can do some really neat stuff in terms of tracking the infection to its source – whether that’s a meat-packaging factory, within a particular community (a school, for example), or a healthcare worker in a hospital. Why do you think people can be wary of maths? I know my own kids think “What’s the use of that formula I’ve just learned?” I think most people like to see how they can use numbers to achieve something concrete. Did you like maths at school? I always enjoyed maths, but biology was something I could reach out and touch. 
I could look at a pond sample under a microscope or tear apart a leaf and see what was going on inside it. It was only when I was working as a molecular biologist that I saw the real-world uses of maths. How did you get into bioinformatics? I was looking for amino acid patterns in nematode worms, which was a long, laborious process. Someone mentioned that bioinformatics might be able to help me out. When I finally managed to get the computer program I’d written to work, I got the data in minutes instead of days. That was a huge kick. I still get a thrill now, when I write a program that can do weeks of work instantaneously. The big leaps forward in science today come from combining different skill sets, which is what bioinformatics does. It’s really taking off; our group has expanded from a team of three to ten in a year, so it’s a career for the future. Find out more at www.gov.uk/ government/organisations/publichealth-england. the team Education editor: Stephanie Sinclair Editor: Chrissie Giles Assistant editor: Kirsty Strawbridge Writers: Penny Bailey, Chrissie Giles, Emma Rhule, Colin Stuart Project manager: Rosie Cotter Graphic designer: Malcolm Chivers Illustrator: Glen McBeth Publisher: Mark Henderson Head of Education and Learning: Hilary Leevers Teachers’ advisory board: Peter Anderson, Paul Connell, Alison Davies, Helen English, Ian Graham, Stephen Ham, Kim Hatfield, Jaswinder Kaur, Moss Newnham, Jonathan Schofield, Robert Rowney Advisory board: Sarah Allen, Graham Currell, Fiona Davidge, Neville Davies, Thomas Ezard, Marianne Freiberger, Alexis Gilbert, Jenny Koenig, Nancy Lee, Ross MacFarlane, Giles Newton, David Spiegelhalter, Andrew Steele, Robin Sutton Wellcome Trust: We are a global charitable foundation dedicated to achieving extraordinary improvements in human and animal health. We support the brightest minds in biomedical research and the medical humanities. Our breadth of support includes public engagement, education and the application of research to improve health. We are independent of both political and commercial interests. The future of science depends on the quality of science education today. All images, unless otherwise indicated, are from Wellcome Images (images.wellcome.ac.uk). Big Picture is © the Wellcome Trust 2013 and is licensed under Creative Commons Attribution 2.0 UK. ISSN 1745-7777. Cartoon illustrations are © Glen McBeth. This is an open access publication and, with the exception of images and illustrations, the content may unless otherwise stated be reproduced free of charge in any format or medium, subject to the following conditions: content must be reproduced accurately; content must not be used in a misleading context; the Wellcome Trust must be attributed as the original author and the title of the document must be specified in the attribution. The Wellcome Trust is a charity registered in England and Wales, no. 210183. Its sole trustee is The Wellcome Trust Limited, a company registered in England and Wales, no. 2711000 (whose registered office is at 215 Euston Road, London NW1 2BE, UK). PU-5687/23K/05–2013/MC Summer 2013 | 15 Veer Want to bring cutting-edge science to the classroom? We’re looking for trainee and newly qualified post-16 biology teachers to join the Big Picture advisory board. You’ll be involved in reviewing content, shaping the future of the magazine, and much more. It’s a brilliant opportunity to influence the Wellcome Trust’s education work and could boost your career prospects. 
For more information, contact the team at bigpicture@wellcome.ac.uk BigPicture free subscriptions Sign up to receive free regular copies of Big Picture at www.wellcome.ac.uk/ bigpicture/order Here, you can also order more copies of this issue of Big Picture, or (not all available) past issues, which include Inside the Brain. Or you can contact us: T +44 (0)20 7611 8651 E publishing@wellcome.ac.uk Big Picture Wellcome Trust Freepost RSHU-ZJKL-LCZK Feltham TW14 0RN Are you a teacher in the UK? If so, you can order a class set. Email publishing@wellcome.ac.uk Name: Job title: Organisation: Address: Email address: Feedback Questions, comments, ideas? Share your thoughts on Big Picture by emailing us: bigpicture@wellcome.ac.uk Big Picture is a free post-16 resource that explores issues around biology and medicine. 50% This document was printed on material made from 25 per cent post-consumer waste & 25 per cent pre-consumer waste.