Presented by Dr. Sajid Ali Yousuf Zai
sayousuf@numl.edu.pk

Why study statistics?
• To be a good consumer of research
• To summarize data and understand graphs
• To separate actual differences from chance variations
  – Medication, treatments, etc.
• How many of you have taken statistics or research methods course(s) before?

Misleading Statistics
"There are three kinds of lies: lies, damned lies, and statistics." (popularized by Mark Twain)

What is Data Analysis?
• Data are a set of values of one or more variables.
• A variable is something that has different values.
• Values can be numbers or names, depending on the variable:
  – Numeric, e.g. weight
  – Counting, e.g. number of injuries
  – Ordinal, e.g. competitive level (values are numbers/names)
  – Nominal, e.g. sex (values are names)

Descriptive and Inferential Statistics
• Statistics allow us to describe particular characteristics of a set of data.
  – Descriptive statistics are concerned with the presentation, organization, and summarization of data.
  – Ex. Charts, graphs, summarizing with numbers, etc.
• Statistics also allow us to generalize our findings to the general population.
  – Inferential statistics allow us to generalize from our sample of data to a larger group of subjects.

Variables
• A variable is simply what is being observed or measured.
  – Variables can take one of a number of values
  – Gender (Male or Female)
  – Height (20 in to 90 in)
• Two types of variables:
  – The dependent variable is the outcome of interest, which should change in response to some intervention. Ex. Treatment effect
  – The independent variable is the intervention, or what is being manipulated. Ex. Medication

Variables
Example: Examine the impact of using cooperative teaching methods versus using lecture teaching methods in HIV/AIDS awareness programs.
• Independent variable: teaching method
• Dependent variable: awareness

Variables
Example: Do high school males report greater participation in risky health-related behaviors than high school females?
• Independent variable: gender (male or female)
• Dependent variable: amount of participation in risky health-related behavior

Types of Data
• Have students list variables to describe themselves; how are the characteristics of these variables different?
• Qualitative and Quantitative
• Categorical versus Continuous Data
• Scales of Measure: Nominal, Ordinal, Interval, and Ratio Data
• The distinction can be important because it limits the types of statistical tests that can be used

Quantitative & Qualitative
• Quantitative variables
  – Variables with levels indicating different amounts
  – A higher score indicates more of the variable than a lower score
  – Examples: height, weight, time (continuous variables)
• Qualitative variables
  – Variables with levels indicating different kinds
  – Examples: gender, political affiliation, year in school (discrete variables)

Categorical and Continuous Data
• Categorical data have values that can assume only whole numbers or a fixed number of outcomes
  – Can have only a limited set of values
  – Examples: gender, hair/eye color, political preference
• Continuous data may take an infinite number of values within a defined range
  – Often have units and decimals
  – Examples: height, weight, age, time, temperature, blood pressure

Nominal Scale, Ordinal Scale, Interval Scale, Ratio Scale

Nominal Data
• A nominal variable consists of named categories, with no implied order among the categories.
  – Existential/Dichotomous: two categories (exists or doesn’t exist)
  – Polytomous: multiple categories
• Examples:
  – Have condition X or not
  – Gender (M/F vs. 0/1) [coding is common]
  – Hair/eye color
  – Marital status
  – Social security numbers, number on back of runners

Ordinal Data
• An ordinal variable consists of ordered categories, where the differences between categories cannot be considered to be equal.
• Examples:
  – Letter grades (A, B, …)
  – Evaluations (Excellent, …, Poor)
  – Places finishing a race (1st, 2nd, …)
  – Birth order, Olympic medals, Likert scale

Interval Data
• An interval variable has equal distances between values, but the zero point is arbitrary.
  – Arithmetic can be applied: add, subtract, average
  – No “natural” zero: a value of zero does not mean that none of the quantity is present
• Examples:
  – IQ scores (quasi-interval)
  – Temperature (Fahrenheit and Celsius)
  – Scores on standardized tests

Ratio Data
• A ratio variable has equal intervals between values and a meaningful zero point.
• Examples:
  – Height
  – Weight
  – Temperature (Kelvin)
  – Time, scores on school tests, dollar bills

Scales of Measurement
• Nominal scale
  – Numbers serve as labels and do not indicate any quantitative relationship
  – All you can say is that the numbers represent different things
  – Examples: social security numbers, number on back of runners
• Ordinal scale
  – You can say the numbers are different and that one is greater than another, but not that the differences between them are equal
  – Indicates rank order
  – Examples: birth order, Olympic medals, Likert scale
• Interval scale
  – You can say the numbers are different, there is an order, and there is an equal distance between numbers
  – Examples: temperature (F and C), scores on standardized tests
• Ratio scale
  – Keeps all of the properties of the other three scales and has a true zero point, so you can say that 4 is twice as much as 2
  – 16 kilograms is 4 times as heavy as 4 kilograms
  – Examples: height, weight, time, scores on school tests, dollar bills

Types of Data
Variable Type    Assumptions
Nominal          Named categories
Ordinal          Same as nominal, plus ordered categories
Interval         Same as ordinal, plus equal intervals
Ratio            Same as interval, plus a meaningful zero

Which one is which type of variable?
• Gender (Male / Female)
• Height
• Age
• Hair color
• Nationality
• Anxiety level
• Blood pressure
• Test scores
• Program enrolled in
• Olympic medals

Descriptive Statistics
Central Tendency: Where is the center of the distribution?
• Mode = category with highest frequency
• Median = middle category or score
• Mean = average score

Descriptive Statistics
Variability: “Where are the ends of the distribution? How are cases distributed around the middle?”
• Range = difference between highest and lowest scores
• Standard deviation = measure of variability; involves deviations of scores from the mean; in a roughly normal distribution, about 68% of scores fall within one standard deviation above or below the mean.
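
The measures of central tendency and variability defined above can be computed with Python's standard `statistics` module. The following is only a minimal sketch: the `describe` helper and the `scores` list are hypothetical, invented for illustration, and the choice of the sample standard deviation (dividing by n - 1) is an assumption, since the slides do not say which version is intended.

```python
import statistics


def describe(scores):
    """Return the descriptive statistics named on the slides for a list of numbers."""
    mean = statistics.mean(scores)
    # Sample standard deviation (divides by n - 1); this choice is an assumption.
    sd = statistics.stdev(scores)
    # Fraction of scores lying within one standard deviation of the mean.
    within_one_sd = sum(1 for x in scores if abs(x - mean) <= sd) / len(scores)
    return {
        "mode": statistics.mode(scores),        # most frequent score
        "median": statistics.median(scores),    # middle score
        "mean": mean,                           # average score
        "range": max(scores) - min(scores),     # highest minus lowest
        "sd": sd,
        "within_one_sd": within_one_sd,
    }


# Hypothetical test scores, made up for this example only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe(scores)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

For these made-up scores the mode is 4, the median is 4.5, the mean is 5, and the range is 7. Six of the eight scores (75%) lie within one standard deviation of the mean, which illustrates that the "most scores" rule of thumb is approximate and depends on the shape of the distribution.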