Chapter 1:
Introduction to Statistics
Instructor:
Dr. John J. Kerbs, Associate Professor
Joint Ph.D. in Social Work and Sociology
© 2013 - - DO NOT CITE, QUOTE, REPRODUCE, OR DISSEMINATE WITHOUT
WRITTEN PERMISSION FROM THE AUTHOR:
Dr. John J. Kerbs can be emailed for permission at kerbsj@ecu.edu
Variables
A variable is a characteristic or condition that can change or take on different values.
Most research begins with a general question about the relationship between two variables for a specific group of individuals.
Population
The entire group of individuals is called the population .
For example, a researcher may be interested in the relation between class size (variable 1) and academic performance (variable 2) for the population of third-grade children.
Sample
Usually populations are so large that a researcher cannot examine the entire group. Therefore, a sample is selected to represent the population in a research study. The goal is to use the results obtained from the sample to help answer questions about the population.
Relationships between
Population and Sample
Descriptive Statistics
Descriptive statistics are methods for organizing and summarizing data.
For example, tables or graphs are used to organize data, and descriptive values such as the average score are used to summarize data.
A descriptive value for a population is called a parameter and a descriptive value for a sample is called a statistic .
Inferential Statistics
Inferential statistics are methods for using sample data to make general conclusions
(inferences) about populations.
Because a sample is typically only a part of the whole population, sample data provide only limited information about the population.
As a result, sample statistics are generally imperfect representatives of the corresponding population parameters.
Sampling Error
The discrepancy between a sample statistic and its population parameter is called sampling error .
Defining and measuring sampling error is a large part of inferential statistics.
The Role of Inferential and
Descriptive Statistics
Is the difference between the two groups a real difference or is it due to chance (sampling error)?
Data
The measurements obtained in a research study are called the data .
The goal of statistics is to help researchers organize and interpret the data.
Constructs: Intangible Attributes
Some variables can be observed and measured directly.
Examples include height, weight, eye color, number of arrests, etc.
Constructs or Hypothetical Constructs - -
These variables are intangible and cannot be directly observed.
Intelligence, anxiety, hunger, etc.
Types of Variables
Variables can be classified as discrete or continuous based upon the types of values that can be assigned to them.
Discrete variables (such as class size) consist of indivisible categories and they are commonly restricted to whole, countable numbers
(impossible to observe values between two numbers - - e.g., half a student).
Continuous variables (such as time or weight) are infinitely divisible into whatever units a researcher may choose. For example, time can be measured to the nearest minute, second, half-second, etc.
Real Limits
To define the units for a continuous variable, a researcher must use real limits which are boundaries located exactly halfway between adjacent categories on a continuous number line.
Every score has two real limits:
The upper real limit is at the top of the interval.
The lower real limit is at the bottom.
Weight Example: If a person weighs 150 pounds, the upper real limit would be 150.5 pounds and the lower real limit would be 149.5 pounds.
Measuring Variables
To establish relationships between variables, researchers must observe the variables and record their observations. This requires that the variables be measured .
The process of measuring a variable requires a set of categories called a scale of measurement and a process that classifies each individual into one category.
1.
Four Types of Measurement
Scales
A nominal scale is an unordered set of categories identified only by name. Nominal measurements only permit you to determine whether two individuals are the same or different.
2.
An ordinal scale is an ordered set of categories.
Ordinal measurements tell you the direction of difference between two individuals.
3.
4.
Four Types of Measurement
Scales (continued)
An interval scale consists of ordered categories that are all intervals of exactly the same size. Equal differences between numbers on a scale reflect equal differences in magnitude.
Interval measurements identify the direction and magnitude of a difference. The zero point is located arbitrarily on an interval scale.
A ratio scale is an interval scale where a value of zero indicates none of the variable.
Ratio measurements identify the direction and magnitude of differences and allow ratio comparisons of measurements.
Correlational Studies
The goal of a correlational study is to determine whether there is a relationship between two variables and to describe the relationship.
A correlational study simply observes
(measures) the two variables as they exist naturally.
The Correlational Method Using A
Scatter Plot
Experiments
The goal of an experiment is to demonstrate a cause-and-effect relationship between two variables; that is, to show that changing the value of one variable causes changes to occur in a second variable.
Experiments (continued)
In an experiment , one variable is manipulated to create treatment conditions.
A second variable is observed and measured to obtain scores for a group of individuals in each of the treatment conditions.
The measurements are then compared to see if there are differences between treatment conditions.
All other variables are controlled to prevent them from influencing the results.
The manipulated variable is called the independent variable (predictor) and the observed variable is the dependent variable
(outcome) .
Experiments (continued)
There are times when experiments include a test condition and a control condition.
Experimental Condition - - Research subjects in this group receive special treatment (e.g., violent offenders who are given cognitive-behavioral treatment)
Control Condition - - Research subjects in this group do not receive special treatment (e.g., no cognitive behavioral treatment)
The Structure of an Experiment
Cause and Effect Considerations
To say that a difference between two groups is caused by a particular variable, the researcher must rule out competing explanations that might otherwise explain between-group differences - - e.g., any other variables that might affect the outcome must be controlled for (included in) the study.
Participant Variables - - Characteristics of the study participants like age, gender, and income.
Environmental Variables - - Characteristics of the environment like lighting, time of day, weather conditions.
3 Ways to Control for Other
Variables that Might Affect
Outcomes
Option 1: Random Assignment of Participants
Evenly distributes characteristics across two groups so that the groups are no different.
This approach can be used to control for participant and environmental variables.
3 Ways to Control for Other
Variables that Might Affect
Outcomes
Option 2: Matching Samples
Matching to ensure for equivalent groups
(e.g., same proportion of males and females for treatment and comparison groups) or equivalent environments.
3 Ways to Control for Other
Variables that Might Affect
Outcomes
Option 3: Hold Variables Constant
Researchers can hold a variable constant so that it does not affect outcomes
Example: Only study 10-year old children.
Two groups of 10-year older children with one group watching 30-minutes of comedy and the second group watching
30 minutes of violent material.
Other Types of Studies
Other types of research studies, know as nonexperimental or quasi-experimental , are similar to experiments because they also compare groups of scores.
These studies do not use a manipulated variable to differentiate the groups. Instead, the variable that differentiates the groups is usually a pre-existing participant variable (such as male/female) or a time variable (such as before/after).
Other Types of Studies
(Continued.)
Because these studies do not use the manipulation and control of true experiments, they cannot demonstrate cause and effect relationships.
As a result, they are similar to correlational research because they simply demonstrate and describe relationships.
Two Examples of
Non-Experimental Studies
Nonequivalent
Group Study
NOTE: In a non-experimental study, the “independent variable” that is used to create the different groups of scores is often called the quasiindependent variable.
Pre-Post Study
Statistical Notation
The individual measurements or scores obtained for a research participant will be identified by the letter X (or X and Y if there are multiple scores for each individual).
The number of scores in a data set will be identified by N for a population or n for a sample.
Summing a set of values is a common operation in statistics and has its own notation. The Greek letter sigma, Σ , will be used to stand for "the sum of." For example,
Σ X identifies the sum of the scores.
Order of Operations
1.
2.
3.
4.
5.
All calculations within parentheses are done first.
Squaring or raising to other exponents is done second.
Multiplying, and dividing are done third, and should be completed in order from left to right.
Summation with the Σ notation is done next.
Any additional adding and subtracting is done last and should be completed in order from left to right.
Order of Operations Examples
1.
A set of scores consists of the following values: 4, 7, 3, 9, and 2
2.
1.
2.
3.
4.
5.
6.
Please calculate
ΣX =
( ΣX) 2 =
ΣX 2 =
ΣX + 5 =
Σ(X-2) =
Σ(X-2) 2 =
Order of Operations Examples
1.
A set of scores consists of the following values: 4, 7, 3, 9, and 2
2.
1.
2.
3.
4.
5.
6.
Please calculate
ΣX = 4 + 7 + 3 + 9 + 2 = 25
( ΣX) 2 = (25) 2 = 625
ΣX 2 = 16 + 49 + 9 + 81 + 4 = 159
ΣX + 5 = 25 + 5 = 30
Σ(X-2) = 2 + 5 + 1 + 7 + 0 = 15
Σ(X-2) 2 = 2 2 + 5 2 + 1 2 + 7 2 + 0 2 = 79
= 4 + 25 + 1 + 49 + 0 = 79