Biostatistics course
Part 6: Normal distribution

Dr. en C. Nicolas Padilla Raygoza
Department of Nursing and Obstetrics
Division of Health Sciences and Engineering
Campus Celaya Salvatierra
University of Guanajuato, Mexico

Biosketch
Medical Doctor, Autonomous University of Guadalajara.
Pediatrician, certified by the Mexican Council of Certification on Pediatrics.
Postgraduate Diploma in Epidemiology, London School of Hygiene and Tropical Medicine, University of London.
Master of Sciences with emphasis in Epidemiology, Atlantic International University.
Doctorate in Sciences with emphasis in Epidemiology, Atlantic International University.
Professor Titular A, Full Time, University of Guanajuato.
Level 1, National Researcher System.
padillawarm@gmail.com

Competencies
The reader will define the Normal distribution and the standard Normal distribution.
He (she) will know what the Normal and standard Normal distributions look like.
He (she) will apply the properties of the standard Normal distribution.
He (she) will know how to standardize values to convert them to a standard Normal distribution.

Introduction
We know how to calculate probabilities and how to find the binomial distribution. However, other variables can take more than two values.
If they have a limited number of categories, they are categorical variables.
If they can take many different values, they are numeric variables.

Quantitative variables
They can take many values, which may be negative or positive.

[Figure: Distribution of glycemia in 500 persons. Histogram of number of persons by level of glycemia (mg/100 ml), in classes <40, 40-69, 70-99, 100-129, 130-159, 160-189, 190-219, 220+.]

[Figure: Distribution of blood glucose levels in 500 persons. Histogram of percentage (%) by level of glycemia (mg/100 ml), same classes.]

What happens if the sample size is bigger? Does the histogram change?
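One way to explore this question is by simulation. The sketch below is only an illustration: it draws hypothetical glycemia values from an assumed Normal population (the mean of 110 and standard deviation of 35 mg/100 ml are invented for the example, not taken from the course data) and prints the percentage of values in each class of the histogram for n = 500 and n = 3,500. The function names are also hypothetical.

```python
import random

# Upper boundaries between the classes shown on the histogram axis
EDGES = [40, 70, 100, 130, 160, 190, 220]
LABELS = ["<40", "40-69", "70-99", "100-129",
          "130-159", "160-189", "190-219", "220+"]


def class_percentages(values):
    """Return the percentage of values falling in each glycemia class."""
    counts = [0] * len(LABELS)
    for v in values:
        # The number of boundaries the value has reached is its class index
        idx = sum(1 for edge in EDGES if v >= edge)
        counts[idx] += 1
    return [100 * c / len(values) for c in counts]


def simulate(n, mean=110.0, sd=35.0, seed=1):
    """Draw n hypothetical glycemia values from an assumed Normal population."""
    rng = random.Random(seed)
    return [rng.gauss(mean, sd) for _ in range(n)]


if __name__ == "__main__":
    for n in (500, 3500):
        print(f"n = {n}")
        for label, pct in zip(LABELS, class_percentages(simulate(n))):
            print(f"  {label:>8}: {pct:5.1f}%")
```

Comparing the two printed tables gives a feel for how much the class percentages move when the sample is seven times larger; changing the seed shows how much they vary from sample to sample.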
[Figure: Distribution of glycemia in 3,500 persons. Histogram of number of persons by level of glycemia (mg/100 ml), in classes <40, 40-69, 70-99, 100-129, 130-159, 160-189, 190-219, 220+.]

The distributions of many variables are symmetrical, especially when the sample size is big.

[Figure: Two histograms of number of persons by stature (m), from 1.5 to 1.8 m. Panels: Stature in meters, n=10; Stature in meters, n=1000.]

Normal distribution
It is used to represent the distribution of values that we would observe if we included the whole population.
It shows the distribution of values we would obtain if we repeated the measurement many times in a large population.
Because of this, the Y axis of the Normal distribution is called probability.
A histogram shows the distribution of values observed in a sample.
A Normal plot shows the distribution of values that are thought to occur in the population from which the sample was obtained.

We can use the Normal distribution to answer questions such as:
What is the probability that an adult man has a glycemia level less than or equal to 150 mg/100 ml?
We can answer by taking the percentage of observed men with glycemia levels of 150 mg/100 ml or less.

Standard Normal distribution
The Normal distribution is defined by a complicated mathematical formula, but published tables give the area under the Normal curve for one particular case: the standard Normal distribution.
In it, the mean is 0 and the standard deviation is 1.
These tables are found in statistics textbooks.

When Z = 0.00, the area on either side of Z is 0.5.
When Z = 1.00, the area to the right of Z is 0.159 and the area to the left is 0.841.

[Figure: Standard Normal curve with the horizontal axis marked at -1, 0, +1.]

Many times we want the area of the curve outside a range.
The area outside the range is the complement of the area inside the range.

[Figure: Standard Normal curve with the horizontal axis marked at -1, 0, +1.]

Range     Area inside the range   Area outside the range
-1, +1    68.3%                   31.7%
-2, +2    95.4%                   4.6%
-3, +3    99.7%                   0.3%
-4, +4    99.99%                  0.01%

The area outside the range is the one we shall use most often.
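The areas quoted above can also be recomputed without a printed table. The following is a minimal sketch, assuming only Python's standard `math.erf`; the helper names `phi`, `area_inside`, and `area_outside` are hypothetical and chosen for this example.

```python
import math


def phi(z):
    """Area under the standard Normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def area_inside(k):
    """Area between -k and +k standard deviations."""
    return phi(k) - phi(-k)


def area_outside(k):
    """Two-tailed area: the complement of the area inside the range."""
    return 1.0 - area_inside(k)


if __name__ == "__main__":
    print(f"Area to the left of Z = 0.00:  {phi(0.0):.3f}")
    print(f"Area to the left of Z = 1.00:  {phi(1.0):.3f}")
    print(f"Area to the right of Z = 1.00: {1.0 - phi(1.0):.3f}")
    for k in (1, 2, 3, 4):
        inside = 100 * area_inside(k)
        outside = 100 * area_outside(k)
        print(f"-{k}, +{k}: inside {inside:.2f}%, outside {outside:.2f}%")
```

Because the curve is symmetrical, each single tail is half of the two-tailed area returned by `area_outside`.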
Published tables give these values; they are the two-tailed tables.

Standardized values
Any Normal distribution can be converted to a standard Normal distribution.
To standardize a value, we subtract the mean from it and divide the result by the standard deviation.

Example
The mean stature is 1.58 m with s = 0.12 m.
The standardized value for a stature of 1.70 m is
(1.70 - 1.58)/0.12 = 0.12/0.12 = 1.00

How do we apply the lessons learned?
What is the probability that one person in the population is shorter than 1.60 m?
We need to calculate the area under the curve to the left of 1.60 m, under a Normal curve with a mean of 1.58 and s of 0.12.
(1.60 - 1.58)/0.12 = 0.02/0.12 = 0.167
Using the tables of the standard Normal distribution, the lower-tail area (to the left of the standardized value) for 0.167, read at Z = 0.17, is 0.5675 = 56.75%.
We can answer that the probability that an individual in this population is shorter than 1.60 m is 56.75%.
It is a population with low stature!

Cautions:
Sample size: We used a sample of 1000 measurements; with a smaller sample, the results may differ.
Assumption: The results depend on the assumption that statures are Normally distributed with the same mean and standard deviation found in the sample. If the assumption is incorrect, the results are wrong.

Non-Normal distribution
Not all quantitative variables have a Normal distribution.

[Figure: Distribution of levels of glucose in blood, n=10. Histogram of number of patients by glucose in blood (mg/100 ml).]

We measured levels of glycemia in 10 persons: the distribution is skewed.
Can we use the properties of the Normal distribution?

If the distribution is skewed to the right, we can apply a logarithmic transformation.
If it is skewed to the left, we square each value.
The original values are transformed into their natural logarithms (the "ln" button on scientific calculators).
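The effect of the logarithmic transformation can be illustrated numerically. The sketch below uses ten invented glucose values (the course data are not given, so `GLUCOSE` is purely hypothetical) and a simple moment-based skewness coefficient, which is one of several possible definitions. A positive coefficient indicates skew to the right.

```python
import math

# Hypothetical right-skewed glucose values in mg/100 ml (not the course data)
GLUCOSE = [62, 65, 68, 70, 72, 75, 80, 88, 105, 160]


def skewness(values):
    """Moment-based skewness coefficient: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def log_transform(values):
    """Replace each value by its natural logarithm (the 'ln' button)."""
    return [math.log(v) for v in values]


if __name__ == "__main__":
    logs = log_transform(GLUCOSE)
    print(f"Skewness of original values: {skewness(GLUCOSE):.2f}")
    print(f"Skewness of ln values:       {skewness(logs):.2f}")
```

For these invented values the coefficient is smaller after the transformation, which is the behaviour the transformation is meant to produce; whether the result is close enough to Normal still has to be judged from the histogram of the transformed values.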
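Returning to the stature example under Standardized values, the same calculation can be sketched in code. This is an illustration under the slide's own assumption that stature is Normally distributed with mean 1.58 m and s = 0.12 m; the helper names `standardize` and `phi` are hypothetical. Printed tables list Z to two decimals, so the table value 0.5675 corresponds to Z = 0.17, while the unrounded Z of 0.167 gives about 0.566.

```python
import math


def standardize(x, mean, sd):
    """Standardized value: subtract the mean, then divide by the SD."""
    return (x - mean) / sd


def phi(z):
    """Area under the standard Normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    mean, sd = 1.58, 0.12  # values from the stature example

    z_170 = standardize(1.70, mean, sd)
    print(f"Standardized value for 1.70 m: {z_170:.2f}")

    z_160 = standardize(1.60, mean, sd)
    print(f"Standardized value for 1.60 m: {z_160:.3f}")

    # Table lookup uses Z rounded to two decimals
    p_table = phi(round(z_160, 2))
    p_exact = phi(z_160)
    print(f"P(stature < 1.60 m), Z rounded to 0.17: {100 * p_table:.2f}%")
    print(f"P(stature < 1.60 m), unrounded Z:       {100 * p_exact:.2f}%")
```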
[Figure: Distribution of values of glucose in blood, n=10. Histogram of number of patients by glucose in blood (mg/100 ml).]

[Figure: Distribution of log values of glucose in blood, n=10. Histogram of number of patients by log glucose in blood (mg/100 ml).]

Bibliography
1. Last JM. A dictionary of epidemiology. 4th ed. New York: Oxford University Press; 2001: 173.
2. Kirkwood BR. Essentials of medical statistics. Oxford: Blackwell Science; 1988: 14.
3. Altman DG. Practical statistics for medical research. Boca Raton: Chapman & Hall/CRC; 1991: 1-9.