Uploaded by Saba Shaikh

1 - Descriptive Statistics

Presented by
Dr. Sajid Ali Yousuf Zai
sayousuf@numl.edu.pk
Why study statistics?
• To be a good consumer of research
• To summarize data and understand graphs
• To separate actual differences from chance variations
– Medication, treatments, etc.
• How many of you have taken a statistics or research methods course before?
Misleading Statistics
"There are three kinds of lies: lies, damned lies, and statistics." (Mark Twain)
What is Data Analysis?
• Data are a set of values of one or more variables.
• A variable is something that can take different values.
• Values can be numbers or names, depending on the variable:
– Numeric, e.g. weight
– Counting, e.g. number of injuries
– Ordinal, e.g. competitive level (values are numbers or names)
– Nominal, e.g. sex (values are names)
Descriptive and Inferential Statistics
• Statistics allow us to describe particular characteristics of a set of data.
– Descriptive statistics are concerned with the presentation, organization, and summarization of data (e.g., charts, graphs, numerical summaries).
• Statistics also allow us to generalize our findings to the broader population.
– Inferential statistics allow us to generalize from our sample of data to a larger group of subjects.
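The contrast between describing a sample and generalizing from it can be sketched in a few lines of Python. This is only an illustration: the scores are made-up values, and the interval uses a rough normal approximation (z = 1.96) that is an assumption of this sketch, not something the slides prescribe.

```python
import math
import statistics

# Hypothetical sample: test scores of 10 students (made-up values).
sample = [62, 70, 75, 68, 80, 74, 66, 71, 77, 69]

# Descriptive: summarize the sample itself.
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)  # sample standard deviation (n - 1)

# Inferential: estimate the population mean from the sample.
# Rough 95% interval using the normal approximation (z = 1.96); with a
# sample this small, a t-based interval would be somewhat wider.
standard_error = sample_sd / math.sqrt(len(sample))
ci_low = sample_mean - 1.96 * standard_error
ci_high = sample_mean + 1.96 * standard_error

print(f"Sample mean: {sample_mean:.1f}, SD: {sample_sd:.1f}")
print(f"Approximate 95% interval for the population mean: {ci_low:.1f} to {ci_high:.1f}")
```

The first two numbers describe only the ten observed scores; the interval is a statement about the larger group the sample was drawn from.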
Variables
• A variable is simply what is being observed or measured.
– Variables can take one of a number of values, e.g. gender (male or female) or height (20 in to 90 in)
• Two types of variables:
– The dependent variable is the outcome of interest, which should change in response to some intervention. Ex. Treatment effect
– The independent variable is the intervention, or what is being manipulated. Ex. Medication
Variables
• Example: Examine the impact of using cooperative teaching methods versus lecture teaching methods in HIV/AIDS awareness programs.
– Independent variable: teaching method
– Dependent variable: awareness
Variables
• Example: Do high school males report greater participation in risky health-related behaviors than high school females?
– Independent variable: gender (male or female)
– Dependent variable: amount of participation in risky health-related behavior
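To make the two roles concrete, here is a minimal Python sketch based on the teaching-method example. The records and scores are hypothetical, invented purely for illustration: the independent variable defines the groups, and the dependent variable is what gets summarized within each group.

```python
import statistics

# Hypothetical records: (teaching_method, awareness_score). Values are made up.
records = [
    ("cooperative", 78), ("cooperative", 85), ("cooperative", 81),
    ("lecture", 72), ("lecture", 75), ("lecture", 69),
]

# Group the dependent variable (awareness) by the independent variable (method).
scores_by_method = {}
for method, awareness in records:
    scores_by_method.setdefault(method, []).append(awareness)

# Summarize the outcome within each level of the independent variable.
group_means = {m: statistics.mean(s) for m, s in scores_by_method.items()}
for method, mean_score in group_means.items():
    print(f"{method}: mean awareness = {mean_score:.1f}")
```

A difference between the group means is only a description of these records; deciding whether it reflects more than chance variation is a question for inferential statistics.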
Types of Data
• Have students list variables that describe themselves; how do the characteristics of these variables differ?
• Qualitative and Quantitative
• Categorical versus Continuous Data
• Scales of Measurement
– Nominal, Ordinal, Interval, and Ratio Data
• The distinction can be important because it limits the types of statistical tests that can be used
Quantitative & Qualitative
• Quantitative variables
– Variables with levels indicating different amounts
– A higher score indicates more of the variable than a lower score
– Examples: height, weight, time (continuous variables)
• Qualitative variables
– Variables with levels indicating different kinds
– Examples: gender, political affiliation, year in school (discrete variables)
Categorical and Continuous Data
• Categorical data have values that can assume only whole numbers or a fixed number of outcomes
– Can have only a limited set of values
– Examples: Gender, hair/eye color, political preference
• Continuous data may take an infinite number of values within a defined range
– Often have units and decimals
– Examples: Height, weight, age, time, temperature, blood pressure
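One practical consequence of this distinction is the kind of summary that makes sense for each type. The sketch below uses made-up observations and standard-library calls only: categories are counted, while continuous measurements can be averaged.

```python
from collections import Counter
import statistics

# Hypothetical observations (made-up values for illustration).
eye_color = ["brown", "blue", "brown", "green", "brown", "blue"]  # categorical
height_cm = [162.5, 170.2, 158.9, 181.0, 175.4, 166.3]            # continuous

# Categorical data: count how often each category occurs.
color_counts = Counter(eye_color)
print(color_counts.most_common())

# Continuous data: arithmetic summaries such as the mean are meaningful.
mean_height = statistics.mean(height_cm)
print(f"Mean height: {mean_height:.1f} cm")
```

Averaging the eye colors would be meaningless, which is one reason the type of data limits which statistics and tests apply.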
• Nominal Scale
• Ordinal Scale
• Interval Scale
• Ratio Scale
Nominal Data
• A nominal variable consists of named categories, with no implied order among the categories.
– Existential/Dichotomous: two categories (exists or doesn't exist)
– Polytomous: multiple categories
• Examples:
– Have condition X or not
– Gender (M/F vs. 0/1) [coding is common]
– Hair/eye color
– Marital status
– Social security numbers, numbers on runners' backs
Ordinal Data
• An ordinal variable consists of ordered categories, where the differences between categories cannot be considered to be equal.
• Examples:
– Letter grades (A, B, …)
– Evaluations (Excellent, …, Poor)
– Finishing places in a race (1st, 2nd, …)
– Birth order, Olympic medals, Likert scale
Interval Data
• An interval variable has equal distances between values, but the zero point is arbitrary.
– Arithmetic can be applied: add, subtract, average
– No "natural" zero: a value of zero does not mean that none of the quantity is present
• Examples:
– IQ scores (quasi-interval)
– Temperature (Fahrenheit and Celsius)
– Scores on standardized tests
Ratio Data
• A ratio variable has equal intervals between values and a meaningful zero point.
• Examples:
– Height
– Weight
– Temperature (Kelvin)
– Time, scores on school tests, dollar bills
Scales of Measurement
• Nominal scale
– Numbers serve as labels and do not indicate any quantitative relationship
– All you can say is that the numbers represent different things
– Examples: social security numbers, numbers on runners' backs
• Ordinal scale
– You can say the numbers are different and that one is greater than another, but not that the differences between them are equal
– Indicates rank order
– Examples: birth order, Olympic medals, Likert scale
• Interval scale
– You can say the numbers are different, there is an order, and there is an equal distance between numbers
– Examples: temperature (F and C), scores on standardized tests
• Ratio scale
– Keeps all of the properties of the other three scales and has a true zero point, so you can say that 4 is twice as much as 2
– 16 kilograms is 4 times as heavy as 4 kilograms
– Examples: height, weight, time, scores on school tests, dollar bills
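The difference between an interval scale and a ratio scale shows up as soon as you try to form a ratio. The following sketch uses temperature and mass values chosen only for illustration; the Celsius-to-Kelvin offset of 273.15 is the standard conversion.

```python
# Interval scale (Celsius): differences are meaningful, ratios are not,
# because 0 degrees Celsius is an arbitrary zero point.
celsius_a, celsius_b = 10.0, 20.0
difference_c = celsius_b - celsius_a   # 10 degrees warmer: meaningful
naive_ratio = celsius_b / celsius_a    # 2.0, but NOT "twice as hot"

# Ratio scale (Kelvin): zero is a true zero, so ratios are meaningful.
kelvin_a = celsius_a + 273.15
kelvin_b = celsius_b + 273.15
true_ratio = kelvin_b / kelvin_a       # only slightly above 1

# Ratio scale (mass): 16 kg really is 4 times as heavy as 4 kg.
mass_ratio = 16 / 4

print(difference_c, naive_ratio, round(true_ratio, 3), mass_ratio)
```

The Celsius ratio of 2.0 is an artifact of where zero happens to sit; on the Kelvin scale the same two temperatures differ by only a few percent.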

Types of Data
Variable Type    Assumptions
Nominal          Named categories.
Ordinal          Same as nominal plus ordered categories.
Interval         Same as ordinal plus equal intervals.
Ratio            Same as interval plus meaningful zero.
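Because each scale keeps the assumptions of the one before it and adds one more, the hierarchy can be written as a cumulative lookup. This is a small illustrative sketch; the names `SCALE_ORDER`, `ADDED_PROPERTY`, and `properties_of` are hypothetical helpers invented here.

```python
# Each scale keeps the previous scale's properties and adds one more.
SCALE_ORDER = ["nominal", "ordinal", "interval", "ratio"]
ADDED_PROPERTY = {
    "nominal": "named categories",
    "ordinal": "ordered categories",
    "interval": "equal intervals",
    "ratio": "meaningful zero",
}

def properties_of(scale):
    """Return every property a scale has, including inherited ones."""
    position = SCALE_ORDER.index(scale)
    return [ADDED_PROPERTY[s] for s in SCALE_ORDER[: position + 1]]

for scale in SCALE_ORDER:
    print(f"{scale}: {', '.join(properties_of(scale))}")
```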
Which one is which type of variable?
• Gender (Male / Female)
• Height
• Age
• Hair color
• Nationality
• Anxiety level
• Blood pressure
• Test scores
• Program enrolled in
• Olympic medals
Descriptive Statistics
Central Tendency
Where is the center of the distribution?
• Mode = category or score with the highest frequency
• Median = middle category or score when the values are put in order
• Mean = average score (sum of the scores divided by the number of scores)
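The three measures can be computed directly with Python's standard `statistics` module. The scores below are made-up values used only to show the calls.

```python
import statistics

# Hypothetical test scores (made-up values).
scores = [70, 85, 85, 90, 60, 75, 85]

mode_score = statistics.mode(scores)      # most frequent score
median_score = statistics.median(scores)  # middle score after sorting
mean_score = statistics.mean(scores)      # sum of scores / number of scores

print(f"Mode: {mode_score}, Median: {median_score}, Mean: {mean_score:.1f}")
```

Here the low score of 60 pulls the mean below the median, which is a typical sign of a skewed distribution.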
Descriptive Statistics
Variability
• “Where are the ends of the distribution? How are cases distributed around the middle?”
– Range = difference between highest and lowest scores
– Standard deviation = measure of variability; involves deviations of scores from the mean; in a roughly normal distribution, about two-thirds of scores fall within one standard deviation above or below the mean.
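As a rough sketch of both measures, the snippet below computes the range and the standard deviation for a small set of made-up scores, then counts how many scores fall within one standard deviation of the mean. It assumes the sample standard deviation (dividing by n - 1); the slides do not say which version is intended.

```python
import statistics

# Hypothetical test scores (made-up values).
scores = [70, 85, 85, 90, 60, 75, 85]

# Range: difference between the highest and lowest scores.
score_range = max(scores) - min(scores)

mean_score = statistics.mean(scores)
# Sample standard deviation (divides by n - 1); statistics.pstdev would
# give the population version (divides by n).
sd = statistics.stdev(scores)

# Scores lying within one standard deviation above or below the mean.
within_one_sd = [s for s in scores if abs(s - mean_score) <= sd]
share = len(within_one_sd) / len(scores)

print(f"Range: {score_range}, SD: {sd:.2f}")
print(f"Share within one SD of the mean: {share:.0%}")
```

With so few scores the share will not match the two-thirds figure exactly; that figure describes large, roughly normal distributions.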