Lecture 1: Introduction to Statistics
By: Zinabu Dawit (MPH, Epidemiology and Biostatistics)

At the end of this topic, the students will be able to:
  Define statistics/biostatistics
  Describe the types of statistics
  Describe the characteristics and limitations of statistics
  Discuss the application of biostatistics in health sciences

Outline
  Definition of statistics/biostatistics
  Types of statistics
  Characteristics of statistical data
  Limitations of statistics
  Uses/application of biostatistics

1. What is statistics/biostatistics?
2. What are the roles/uses of biostatistics in health sciences?

Definition
  Statistics is the field of study that deals with the collection, organization, summarization, analysis, and interpretation of masses of numerical data for understanding a phenomenon or making wise decisions (inferences) by examining only a small part of the data.

Statistics therefore means two things:
1. Statistical data
   Refers to numerical descriptions of things/events.
2. Statistical methods
   Refers to procedures for collecting, organizing/presenting, and analyzing masses of numerical data.

Statistical methods are employed in many fields. The application of statistical methods to the life and health sciences (biology, medicine, public health) is called biostatistics.

Types of statistics
1. Descriptive statistics
   Deals with the description of data in a clear and informative manner using numbers, tables, and graphs/charts.
   Involves the organization and summarization of a body of data with one or more meaningful tools.
   Helps to identify/describe the general features, characteristics, and trends in a set of data and to extract useful information.
2.
Inferential (inductive) statistics
   Deals with techniques for drawing conclusions (inferences) about a population based on information obtained from a sample drawn from that population.
   In other words, observed sample statistics are used to infer the corresponding population parameters.
   Inferential statistics builds upon descriptive statistics.
   Examples: estimation, hypothesis testing

Characteristics of statistical data
1. They must be aggregates of facts.
   Statistics are collections of numerical facts; a single fact, even though numerically stated, cannot be called statistics.
2. They must be affected by a multiplicity of causes.
   E.g. malaria is attributable to human factors, parasite factors, and environmental factors.
3. They must be enumerated or estimated according to a reasonable standard of accuracy.
4. They must have been collected in a systematic manner for a predetermined purpose.
5. They must be placed in relation to each other (must be comparable).
   Numerical facts may be placed in relation to each other in point of time, space, or condition.

Uses/application of biostatistics
  Provides a way of organizing complex data in a suitable form
  Assessment of health status
  Resource allocation
  Health program evaluation
  Assessing risk factors
  Drawing of inferences (conclusions)

Limitations of statistics
1. It deals only with those subjects of inquiry that are capable of being quantitatively measured and numerically expressed.
2. It deals only with aggregates of facts; no importance is attached to individual observations.
3. Statistical data are only approximately, not mathematically, correct (certain errors are involved).

Variable
  A characteristic which takes different values in different persons, places, or things.
  Any aspect of an individual or object that is measured (e.g. weight) or recorded (e.g. age) and can take any value.
  There may be one or many variables in a study.

Types of variables
1. Qualitative (categorical): nominal and ordinal
2. Quantitative (numerical): discrete and continuous

1.
Categorical variable
   A variable which cannot be measured in quantitative form but can only be sorted by name or category.
   It cannot be measured the way height or weight is measured.
   The notion of magnitude is absent or implicit.
   Categories must not overlap and must cover all possibilities.

Categorical variables are divided into two:
1. Nominal variable
   The values fall into unordered categories or classes.
   Uses names, labels, or symbols to classify each measurement.
   Examples: blood type (A, B, AB, O), sex (male/female)
2. Ordinal variable
   Assigns each measurement to one of a limited number of categories that are ranked in terms of order.
   Although non-numerical, the categories can be considered to have a natural ordering.
   Examples:
   1. Cancer stages: 1, 2, 3, 4
   2. Pain severity: no pain, slight pain, moderate pain, severe pain

2. Quantitative variable
   A variable that can be measured or counted and expressed numerically.
   Has the notion of magnitude.
   E.g. the number of students in this class.

Quantitative variables are divided into two:
1. Discrete variable
   It can take only a limited number of distinct values, typically integers.
   Characterized by gaps or interruptions in the values.
   Both the order and the magnitude of the values matter.
   E.g. number of children in a household (0, 1, 2, 3, etc.)
2. Continuous variable
   It can take an infinite number of possible values in any given interval or range.
   Both the magnitude and the order of the values matter.
   Does not possess gaps or interruptions.
   E.g. weight (50.123...), height (1.342...)

Manipulation of variables
  Continuous variables can be discretized.
  E.g. age (1 and 1/12 years -> 1 year) can be rounded to whole numbers.
  Continuous or discrete variables can be categorized.
  E.g. age categories: 1 (1-5), 2 (6-10), 3 (11-15)
  Categorical variables can be re-categorized.
  E.g.
  marital status (single, married, divorced, widowed): lumping from 4 categories down to 2 (married, single)

Scales of measurement
  The type of unit on which a variable is measured is called a scale.
  The assignment of numbers to subjects, objects, or events according to a set of rules is called a scale of measurement.
  Not all measurements are the same.
  E.g. the weight and height of an individual are measured on different scales (weight in kg vs. height in meters).

1. Nominal scale
   The values fall into unordered categories or classes.
   Simply names differences.
   Examples:
   1. Religion (Orthodox, Protestant, Muslim, Catholic)
   2. Marital status (single, married, divorced)

2. Ordinal scale
   Assigns each measurement to one of a limited number of categories that are ranked in terms of order.
   Although non-numerical, the categories can be considered to have a natural ordering.
   The numbers have limited meaning: apart from their utility as labels, 4 > 3 > 2 > 1 is all we know.
   Example: social class (very poor, poor, rich, very rich)

3. Interval scale
   Differences between any two numbers on the scale are of known size (magnitude + constant distance between points).
   Example: temperature in °F on 4 consecutive days

   Day:         A    B    C    D
   Temp. (°F):  50   55   60   65

   For these data, day A (50 °F) is not only cooler than day D (65 °F) but is exactly 15 °F cooler.
   It has no true zero point ("0" is arbitrarily chosen and does not reflect the absence of temperature).

4. Ratio scale
   Measurement begins at a true zero point and the scale has equal spacing (magnitude + constant distance between points + true zero).
   Examples: length, weight, etc.

Identify the type of data (nominal, ordinal, interval, or ratio) represented by each of the following.
(N = nominal, O = ordinal, I = interval, R = ratio)
1. Blood group: N
2. Temperature (Celsius): I
3. Ethnic group: N
4. Job satisfaction index (1-5): O
5. Number of heart attacks: R
6. Calendar year: I
7. Serum uric acid (mg/100 ml): R
8. Number of accidents in a 3-year period: R
9. Number of cases of each reportable disease reported by a health worker: R
10. The average weight gain of six 1-year-old dogs (with a special diet): R

Variable types and measurement scales have implications:
  They affect how data should be displayed or summarized.
  They determine the kind of statistical procedures that should be used.

THANK YOU!!!!
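
Supplementary sketch: the point that the measurement scale determines how data should be summarized can be illustrated in a few lines of Python. This is only a minimal sketch under stated assumptions. The helper name `summarize`, the sample values, and the scale-to-summary mapping (mode for nominal data, mode and median for ordinal data, mean and median for interval/ratio data) follow common textbook guidance and are not taken from the lecture itself.

```python
from statistics import mean, median, median_low, mode


def summarize(values, scale):
    """Return summaries that are meaningful for the given measurement scale.

    Hypothetical helper for illustration only; the mapping below is an
    assumption based on common textbook guidance.
    """
    if scale == "nominal":
        # Categories have no order, so only the most frequent one is reported.
        return {"mode": mode(values)}
    if scale == "ordinal":
        # Order is meaningful, so a median can be reported too. median_low
        # always returns an actual data value, which suits coded ranks.
        return {"mode": mode(values), "median": median_low(values)}
    if scale in ("interval", "ratio"):
        # Distances between values are meaningful, so the mean is usable.
        return {"mean": mean(values), "median": median(values)}
    raise ValueError(f"unknown scale: {scale}")


# Made-up sample data (assumptions, not from the lecture).
blood_groups = ["A", "O", "B", "O", "AB", "O"]   # nominal
pain_scores = [0, 1, 1, 2, 3]                    # ordinal: 0 = no pain ... 3 = severe
weights_kg = [50.0, 60.0, 70.0]                  # ratio

print(summarize(blood_groups, "nominal"))
print(summarize(pain_scores, "ordinal"))
print(summarize(weights_kg, "ratio"))
```

Computing a mean of the blood groups would be meaningless, and a mean of the pain codes would depend on the arbitrary numbers chosen as labels, which is why the sketch restricts each scale to the summaries that respect its properties.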