Statistics Part1 by Arier Lee

Introduction to statistics in medicine – Part 1
Arier Lee

Introduction
• Who am I
• Who do I work with
• What do I do

Why do we need statistics
[Figure: population and sample]

The important role of statistics in medicine
• Statistics pervades every aspect of medical research
• Medical practice and research generate lots of data
• Research involves asking lots of questions with strong statistical aspects
• The evaluation of new treatments, procedures and preventative measures relies on statistical concepts in both design and analysis
• Statisticians are consulted at an early stage of a medical study

Research process
• Research question
• Primary and secondary endpoints
• Study design
• Sampling and/or randomisation scheme
• Power and sample size calculation
• Pre-define analysis methods
• Analyse data
• Interpret results
• Disseminate

Bias
• A form of systematic error that can affect scientific research
• Selection bias – reduced by well-defined inclusion / exclusion criteria and randomisation
• Assessment bias – reduced by blinding
• Response bias, lost-to-follow-up bias – reduced by maximising response
• Questionnaire bias – reduced by careful wording and good interviewer training

Some common data types
• Continuous: age, weight, height, blood pressure
• Percentages: % of households owning a dog
• Counts: number of pre-term babies
• Binary: yes/no, male/female, sick/healthy
• Ordinal: taste of biscuits (strongly dislike, dislike, neutral, like, strongly like)
• Nominal categorical: ethnicity (European, Maori, Pacific Islander, Chinese etc.)

Descriptive statistics for continuous data – the average
• Mean: (sum of values)/(number in group)
• Median: the middle value, 50th percentile
• Mode: the value that occurs the most often
Example: 3 4 7 8 8 8 9 11 11 13 21 23 24
mean=11.54, median=9, mode=8

Descriptive statistics for continuous data – the spread
Example (18 numbers, with the quartiles Q1, Q2 and Q3 marked): 0, 1, 2, 5, 8, 8, 9, 10, 12, 14, 18, 20, 21, 23, 25, 27, 34, 43
• Range: minimum and maximum numbers
• Interquartile range: quartiles divide the data into quarters; the interquartile range runs from Q1 to Q3
• Standard deviation: a statistic that tells us how far from the mean the data are spread (for normally distributed data, about 95% of values lie within 2 SD of the mean)
  $s = \sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 / (n - 1)}$

Estimation
• Estimation: determine the value of a variable and its likely range (i.e. 95% confidence intervals)
• Statistical inference is a process of generalising results calculated from a sample to a population
• We are interested in some numerical characteristic of a population (called a parameter), e.g. the mean height or the proportion of pregnant women with hypertension
• We take a sample from the population and calculate an estimate of this parameter

Estimation – a simple example
• We want to estimate the mean height of 10-year-old boys
• Take a random sample of 100 ten-year-old boys and calculate the sample mean
• The mean height of my random sample is 141cm
• Based on our random sample, we estimate the mean height of 10-year-old boys is 141cm

Distribution of data
• It is essential to know the distribution of your data so you can choose the appropriate statistical method to analyse the data
• Data can be distributed (spread out) in different ways
• Continuous data: there are many cases when the data tend to be around a central value with no bias to the left or right – normal distribution

Distribution of data – Normal distribution
• Many parametric methods assume the data are normally distributed
• Bell curve
• Peak at a central value
• Symmetric about the centre
• Mean=median=mode
• The distribution can be described by two parameters – mean and standard deviation

Standard deviation
• Standard deviation shows how much variation or ‘dispersion’ exists in the data
• For normally distributed data, about 95% of the data are contained within 2 standard deviations of the mean

A simulated example – Birth weight
[Figure: histogram of birth weight, Mean=3250g, SD=550g]

Some other common distributions
• Binomial distribution – gestational diabetes (Yes/No)
• Uniform distribution – throwing a die, equal (uniform) probability for each of the six sides
• And many many more…

Sampling variability
• Because of random sampling, the estimated value will be just an estimate – not exactly the same as the true value
• If repeated samples are taken from a population then each sample, and hence each sample mean and standard deviation, is different. This is known as sampling variability
• In practice we do not repeat the sampling to measure sampling variability; instead we endeavour to obtain a random sample and use statistical theory to quantify the error
• Fundamental principle to justify that our estimate is reasonable: if it were possible to repeat a study over and over again, in the long run the estimates of each study would be distributed around the true value
• If we have a random sample then the sampling variability depends on the size of the sample and the underlying variability of the variable being measured
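
The averages and spread measures from the descriptive statistics slides can be checked with a short Python sketch. It uses the two example data sets from the slides and Python's standard `statistics` module; the variable names are illustrative and not part of the original talk. Quartile values depend on the interpolation convention used, so the Q1 and Q3 printed here may differ slightly from those marked on the slide.

```python
import math
import statistics

# Example data from the "average" slide (13 values)
values = [3, 4, 7, 8, 8, 8, 9, 11, 11, 13, 21, 23, 24]

mean = statistics.mean(values)      # (sum of values) / (number in group)
median = statistics.median(values)  # middle value, 50th percentile
mode = statistics.mode(values)      # value that occurs most often

print(round(mean, 2), median, mode)  # 11.54 9 8

# Example data from the "spread" slide (18 values)
spread = [0, 1, 2, 5, 8, 8, 9, 10, 12, 14, 18, 20,
          21, 23, 25, 27, 34, 43]

# Range: minimum and maximum
data_range = (min(spread), max(spread))

# Quartiles divide the sorted data into quarters.
# statistics.quantiles uses its default "exclusive" method here;
# other conventions give slightly different Q1 and Q3.
q1, q2, q3 = statistics.quantiles(spread, n=4)
iqr = q3 - q1

# Sample standard deviation with the (n - 1) denominator
sd = statistics.stdev(spread)

# The same quantity written out from the formula on the slide
m = statistics.mean(spread)
sd_manual = math.sqrt(sum((x - m) ** 2 for x in spread) / (len(spread) - 1))

print(data_range, (q1, q2, q3), iqr)
print(round(sd, 2), round(sd_manual, 2))
```

The manual calculation and `statistics.stdev` agree because both divide by n - 1, which is the sample (not population) standard deviation.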
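
Sampling variability can be illustrated by doing on a computer what cannot be done in practice: repeating the study many times. The sketch below assumes birth weights are normally distributed with the mean (3250g) and SD (550g) from the simulated example; the sample sizes, number of repeats, random seed and function names are arbitrary choices for illustration, not taken from the slides.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

TRUE_MEAN = 3250  # grams, from the simulated birth weight example
TRUE_SD = 550     # grams


def sample_mean(n):
    """Draw one random sample of n birth weights and return its mean."""
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(n)]
    return statistics.mean(sample)


def simulate(n, repeats=1000):
    """Repeat the 'study' many times and collect the sample means."""
    return [sample_mean(n) for _ in range(repeats)]


means_small = simulate(n=25)
means_large = simulate(n=400)

# The sample means are distributed around the true value, and they
# vary less from study to study when the sample is larger.
for label, means in [("n=25", means_small), ("n=400", means_large)]:
    print(label,
          "centre of sample means:", round(statistics.mean(means)),
          "spread of sample means:", round(statistics.stdev(means), 1))

# Share of individual simulated birth weights within 2 SD of the mean
weights = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(10_000)]
within_2sd = sum(abs(w - TRUE_MEAN) <= 2 * TRUE_SD for w in weights) / len(weights)
print("share within 2 SD:", round(within_2sd, 3))
```

Under these assumptions, statistical theory predicts that the spread of the sample means is roughly SD/√n (about 110g for n=25 and about 27.5g for n=400), which is how the error is quantified from a single sample without repeating the study. The last line should print a share close to 95%, matching the 2 SD rule for normally distributed data.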
