Tall Stories
or how a simple question doesn’t always have a simple answer
Neil Sheldon
Royal Statistical Society Centre for Statistical Education
neilsheldon.net

Some people think statistics is a branch of mathematics ...
... but they’re wrong. It’s more important than that ...
... it’s a life skill

The purpose of statistics is understanding, not numbers
Understanding statistics is understanding the world around you
Understanding statistics enables you to take better decisions

The statistics tell a story ...
... but first you have to understand the story behind the statistics

Tall stories
• How tall am I ...
  – in absolute terms? What is my height in feet and inches?
  – in relative terms? Am I short, or tall, or about average?

Tall stories
• What factors influence the answers to
  – the absolute question?
  – the relative question?
• Variation, variation, variation!

If only ...

Charts by group, from the US National Health And Nutrition Examination Survey ‘NHANES III’ 1988-94:
Men, white; Men, black; Men, Hispanic; Men, other; Women, white; Women, black; Women, Hispanic; Women, other

But where do the data come from?
And what are the implications of sampling variation?
NormalSimulation.xlsx

Cross-sectional and longitudinal studies
Age differences in height derived from cross-sectional studies can be the result of differential secular influences among the age cohorts. To determine the magnitude of height loss that accompanies aging, longitudinal studies are required. The authors studied 2,084 men and women aged 17–94 years enrolled from 1958 to 1993 in the Baltimore Longitudinal Study of Aging, Baltimore, Maryland.
On average, men's height was measured nine times during 15 years and women's height five times during 9 years. The rate of decrease in height was greater for women than for men. For both sexes, height loss began at about age 30 years and accelerated with increasing age. Cumulative height loss from age 30 to 70 years averaged about 3 cm for men and 5 cm for women; by age 80 years, it increased to 5 cm for men and 8 cm for women.
Am J Epidemiol 1999;150:969-77.

Longitudinal data

Overlapping longitudinal data

They were shorter back then ...
• Judged by the height of the doorframes he built, medieval man seems to have been short by today’s standards.
• But evidence gathered from 3,000 skeletons reveals that human height has varied little over the past 1,000 years.
• From the 10th century through to the 19th, the average height of adult men was 5ft 7in or 170cm - just 2in below today's average.
• Women were an average of 5ft 2in or 158cm - just over an inch shorter than today.
All the bones in the study came from the medieval St Peter's Church in Barton upon Humber, North East Lincolnshire.

... Or were they?
Based on a modest sample of skeletons from northern Europe, average heights fell from 173.4 cm in the early Middle Ages to a low of roughly 167 cm during the 17th and 18th centuries. Taking the data at face value, this decline of approximately 6.4 cm substantially exceeds any prolonged downturns found during industrialization in several countries that have been studied. Significantly, recovery to levels achieved in the early Middle Ages was not attained until the early 20th century.
It is plausible to link the decline in average height to climate deterioration; growing inequality; urbanization and the expansion of trade and commerce, which facilitated the spread of diseases; fluctuations in population size that impinged on nutritional status; the global spread of diseases associated with European expansion and colonization; and conflicts or wars over state building or religion. Because it is reasonable to believe that greater exposure to pathogens accompanied urbanization and industrialization, and there is evidence of climate moderation, increasing efficiency in agriculture, and greater interregional and international trade in foodstuffs, it is plausible to link the reversal of the long-term height decline with dietary improvements.
Richard H Steckel

Variation in height during the day
Did you know that astronauts are up to 2 inches taller while they're in space? As soon as they come back to Earth, though, they return to their normal height.
Imagine that the vertebrae in your back form a giant spring. Pushing down on the spring keeps it coiled tightly. When the force is released, the spring stretches out. In the same way, the spine elongates by up to three percent while humans travel in space.
To some degree, a similar stretching of the spine happens to you every night. When you lie down, gravity isn't pushing down on your vertebrae. Measure your height carefully as soon as you get up or while you are still lying down. You will find that you're about a centimeter or two taller.

Variation with wealth

Mean height of Dutch adults

Time series data

Variation by sex
Men are, on average, taller than women.
But some women are taller than some men.
How can we quantify this?
men and women heights overlap.xls

Data, data everywhere ...
Human height - Wikipedia, the free encyclopedia.htm

Self-reported height

Genetic variation
• Children of tall parents are, on average,
  – as tall as
  – taller than
  – shorter than
  their parents ... ?
• Parents of tall children are, on average,
  – as tall as
  – taller than
  – shorter than
  their children ... ?

Regression to the mean
[Figure: C plotted against P, with the line C=P]
Individual heights are measured in standard deviations from the male mean or the female mean as appropriate. Then P is the average of father’s height and mother’s height.

• Variation, variation, variation
  – A very simple question, ‘How tall am I?’, raises many issues to do with variation
  – These issues go to the heart of many statistical concepts

• Variation, variation, variation
  – variation within groups
This is ‘the usual’ concept of variation: the variability within a population or a sample is measured by the standard deviation or the inter-quartile range.

• Variation, variation, variation
  – variation between groups
Groups may differ from one another. Sometimes the variation between groups is more important, sometimes it is less important, than the variation within groups. (Analysis of variance treats this in fine detail.)

• Variation, variation, variation
  – variation within individuals
Sometimes the attribute to be measured is not constant:
it may have a cyclical variation
it may have a trend over time
... and it may have both

• Variation, variation, variation
  – variation over time
Where an attribute is observed to vary with time, the variation may be
cross-sectional: “the older ones were like that when they were young”
longitudinal: “that’s what happens as you get older”
... or a combination of the two

• Variation, variation, variation
  – historical variation
A longer-term variation that may be quite distinct from longitudinal or cross-sectional variation.

• Variation, variation, variation
  – variation in definition
Any attribute being measured or counted has first to be defined.
It is very common for definitions to vary from one situation to another.

• Variation, variation, variation
  – sampling “error” variation
A misnomer, as it is in the nature of samples to vary: it’s not a bug but a feature. We all know that samples vary, but we are often tempted to read more information into a sample than it can actually offer.

• Variation, variation, variation
  – sampling bias variation
There are many ways in which a non-random sample can be unrepresentative. Opportunity sampling – measuring or counting whatever is at hand – may be the most common and the most dangerous.

• Variation, variation, variation
  – self-reporting variation
Lacks objectivity and so can be deeply misleading. It’s like anecdotal evidence on a large scale.

• Variation, variation, variation
  – variation and correlation
Strong correlations reduce variation: knowing the value of one variable can reduce the uncertainty about another.

• Variation, variation, variation
  – variation by error
And, underpinning everything else, there are the errors we all make in counting and measuring, recording and tabulating. Even if all the other sources of variation are controlled and understood, our own fallibility ensures that there will always be variation in the data.
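
The point about sampling variation (samples vary, and a single sample offers less information than we are tempted to read into it) can be tried out with a small simulation. This is only a sketch and not the spreadsheet used in the talk: the population mean of 175 cm, the standard deviation of 7 cm and the function name `sample_means` are illustrative assumptions, not figures or names from the slides.

```python
import random
import statistics

# Illustrative assumptions (not taken from the talk): adult heights in the
# population are normal with mean 175 cm and standard deviation 7 cm.
POP_MEAN_CM = 175.0
POP_SD_CM = 7.0


def sample_means(sample_size, n_samples, seed=0):
    """Draw n_samples random samples of heights; return each sample's mean."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    means = []
    for _ in range(n_samples):
        sample = [rng.gauss(POP_MEAN_CM, POP_SD_CM) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means


if __name__ == "__main__":
    for n in (10, 100, 1000):
        means = sample_means(sample_size=n, n_samples=1000)
        observed = statistics.stdev(means)  # spread of the sample means
        theory = POP_SD_CM / n ** 0.5       # standard error, sigma / sqrt(n)
        print(f"n={n:5d}  sd of sample means={observed:.2f} cm  "
              f"sigma/sqrt(n)={theory:.2f} cm")
```

Every sample comes from the same population, yet the sample means differ from one another. The spread shrinks as the sample size grows, roughly in line with sigma / sqrt(n), but it never reaches zero.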