Uploaded by kengkeng arabit

Psychstats

What is statistics?
Statistics – a branch of mathematics that focuses on the organization, analysis, and interpretation of a group of numbers.
• Psychologists use statistical methods to help them make sense of the numbers they collect when conducting research.
• The word statistics comes from the Italian word statista, a person dealing with affairs of state (from stato, “state”).
The two branches of statistical methods
descriptive statistics – procedures for summarizing a group of scores or otherwise making them more understandable.
inferential statistics – procedures for drawing conclusions based on the scores collected in a research study but going beyond them.
Basic Concepts
variable – characteristic that can have different values.
Example: Stress level; age; gender; religion
values – A value is a specific
measurement or number obtained
during data collection. It
represents a raw piece of data
before any analysis.
score – A score often refers to a processed or interpreted value. It typically results from applying a formula or assessment to the raw data to make it meaningful.
Example:
Age: 5 years, 10 years, 25 years
Height: 150 cm, 175 cm, 190 cm
equal-interval variable – variable
in which the numbers stand for
approximately equal amounts of
what is being measured.
Example:
Temperature in Celsius: The
difference between 20°C and
30°C is the same as the difference
between 30°C and 40°C.
IQ Scores: The difference between
an IQ of 100 and 110 is the same
as the difference between an IQ
of 110 and 120.
levels of measurement – Levels of
measurement refer to the different
ways that variables can be
quantified and categorized. They
determine the types of statistical
analysis that can be performed.
ratio scale – an equal-interval
variable is measured on a ratio
scale if it has an absolute zero
point, meaning that the value of
zero on the variable indicates a
complete absence of the variable.
Example:
Weight: 0 kg (no weight), 50 kg,
100 kg
Height: 0 cm (no height), 150 cm,
180 cm
numeric variable – variable whose
values are numbers (as opposed
to a nominal variable). Also called
quantitative variable.
Levels of measurement (kinds of variables)
rank-order variable – numeric variable in which the values are ranks, such as class standing or place finished in a race. Also called ordinal variable.
Example:
Race Position: 1st place, 2nd
place, 3rd place (the difference
between 1st and 2nd place may
not be the same as between 2nd
and 3rd place)
Movie Ratings: 5 stars, 4 stars, 3
stars (the difference between
each star rating is not necessarily
the same)
nominal variable – variable with values that are categories (that is, they are names rather than numbers); it represents categories with no intrinsic order or ranking. Also called categorical variable.
Example:
Eye Color: Blue, Brown, Green
Type of Pet: Dog, Cat, Bird
In summary:
Numeric Variable: Age, Height
Equal-Interval Variable:
Temperature in Celsius, IQ Scores
Ratio Scale: Weight, Height
Rank-Order Variable: Race
Position, Movie Ratings
Nominal Variable: Eye Color, Type
of Pet
discrete variable – variable that
has specific values and that
cannot have values between
these specific values.
Example:
Number of Children: A family can
have 0, 1, 2, 3, etc., children. You
cannot have 2.5 children.
Number of Cars in a Parking Lot:
You can count the cars as 0, 1, 2,
3, etc., but not 1.5 cars.
continuous variable – variable for
which, in theory, there are an
infinite number of values between
any two values.
Example:
Height: A person’s height could be
170.2 cm, 165.5 cm, or any other
value within a range.
Temperature: The temperature
can be 20.5°C, 21.3°C, or any
value within the temperature
range.
Frequency Table
frequency table – a way to organize data to show how often each value or range of values occurs in a dataset.
How to Make a Frequency Table
1. Make a list down the page
of each possible value, from
lowest to highest – Note that
even if one of the ratings
between 0 and 10 is not
used, you still include that
value in the listing, showing it
as having a frequency of 0.
For example, if no one gave
a stress rating of 2, you still
include 2 as one of the
values on the frequency
table.
2. Go one by one through the
scores, making a mark for
each next to its value on
your list.
3. Make a table showing how
many times each value on
your list is used.
4. Figure the percentage of
scores for each value – To
do this, take the frequency
for that value, divide it by
the total number of scores,
and multiply by 100. You
may need to round off the
percentage. We
recommend that you round
percentages to one decimal
place.
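The four steps above can be sketched in Python. This is a minimal illustration, not a prescribed method: the function name `frequency_table` and the 0 to 10 rating range are assumptions, and the ten scores are borrowed from the mean example later in these notes and treated here as ratings.

```python
from collections import Counter

def frequency_table(scores, low, high):
    """Return (value, frequency, percentage) rows for every value from low to high.

    Assumes scores is a non-empty list of whole numbers.
    """
    counts = Counter(scores)              # Step 2: tally each score
    total = len(scores)
    rows = []
    for value in range(low, high + 1):    # Step 1: list every possible value
        freq = counts[value]              # Step 3: unused values get frequency 0
        pct = round(freq / total * 100, 1)  # Step 4: percentage, one decimal place
        rows.append((value, freq, pct))
    return rows

# Scores taken from the mean example in these notes (assumed to be 0-10 ratings).
ratings = [7, 8, 8, 7, 3, 1, 6, 9, 3, 8]
for value, freq, pct in frequency_table(ratings, 0, 10):
    print(f"{value:>5} {freq:>9} {pct:>7.1f}%")
```

Values that nobody chose, such as 2, still appear in the output with a frequency of 0, as Step 1 requires.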
Grouped Frequency Tables
interval – range of values in a grouped frequency table that are grouped together. (For example, if the interval size is 10, one of the intervals might be from 10 to 19.)
grouped frequency table –
frequency table in which the
number of individuals (frequency)
is given for each interval of
values.
Note: Sometimes there are so many possible values that an ordinary frequency table is too awkward to give a simple picture of the scores. The solution is to group the values into ranges, each of which includes all the values that fall within it. A combined category of this kind is called an interval, and a frequency table that uses intervals is called a grouped frequency table.
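A grouped frequency table can be sketched as follows. The function name `grouped_frequency_table` and the sample scores are hypothetical, and the sketch assumes whole-number scores and intervals that start at multiples of the interval size (for example 10 to 19 when the size is 10).

```python
from collections import Counter

def grouped_frequency_table(scores, interval_size):
    """Return (interval_start, interval_end, frequency) rows, lowest interval first."""
    # Integer division maps each score to the start of its interval,
    # e.g. with interval_size 10 the scores 10..19 all map to 10.
    starts = Counter((score // interval_size) * interval_size for score in scores)
    lowest, highest = min(starts), max(starts)
    rows = []
    for start in range(lowest, highest + 1, interval_size):
        # Intervals with no scores are still listed, with a frequency of 0.
        rows.append((start, start + interval_size - 1, starts[start]))
    return rows

scores = [12, 15, 19, 23, 27, 41, 44, 45, 48, 10]  # hypothetical scores
for start, end, freq in grouped_frequency_table(scores, 10):
    print(f"{start}-{end}: {freq}")
```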
Histograms
Histogram – barlike graph of a frequency distribution in which the values are plotted along the horizontal axis and the height of each bar is the frequency of that value; the bars are usually placed next to each other without spaces, giving the appearance of a city skyline.
• A graph is another good way to make a large group of scores easy to understand.
• Researchers make histograms to show the pattern in a frequency table visually.
• In a histogram, (a) the values, from lowest to highest, go along the bottom; (b) the frequencies, from 0 at the bottom to the highest frequency of any value at the top, go along the left edge; (c) above each value is a bar with a height of the frequency for that value.
How to Make a Histogram
1. Make a frequency table (or
grouped frequency table).
2. Put the values along the
bottom of the page, from left
to right, from lowest to
highest.
3. Make a scale of frequencies
along the left edge of the
page that goes from 0 at the
bottom to the highest
frequency for any value.
4. Make a bar above each
value with a height for the
frequency of that value.
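The steps above can be sketched as a plain-text histogram. This is an illustration only: the function name `text_histogram` is hypothetical, the bars are drawn with `#` characters instead of a plotting library, and the scores are the ten from the mean example in these notes.

```python
from collections import Counter

def text_histogram(scores, low, high):
    """Return the lines of a vertical text histogram for values low..high."""
    counts = Counter(scores)                     # Step 1: frequency table
    values = list(range(low, high + 1))          # Step 2: values, lowest to highest
    freqs = [counts[v] for v in values]
    lines = []
    # Step 3: the frequency scale runs from the highest frequency down to 0.
    for level in range(max(freqs), 0, -1):
        # Step 4: a value's bar reaches this level if its frequency is at least level.
        bars = "".join(" # " if f >= level else "   " for f in freqs)
        lines.append(f"{level:>2} |{bars}")
    lines.append(" 0 +" + "---" * len(values))
    lines.append("    " + "".join(f"{v:^3}" for v in values))
    return lines

scores = [7, 8, 8, 7, 3, 1, 6, 9, 3, 8]
print("\n".join(text_histogram(scores, 0, 10)))
```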
Shapes of frequency
distributions
frequency distribution – pattern of
frequencies over the various
values; what a frequency table,
histogram, or frequency
polygon describes.
unimodal distribution – has one
peak or mode, which is the value
that appears most frequently.
bimodal distribution – has two
peaks or modes. These peaks
represent values that occur more
frequently than others in the
dataset.
multimodal distribution – has more
than two peaks or modes.
rectangular distribution – also called uniform distribution; has values that are all equally likely to occur, resulting in a flat, rectangular shape when graphed.
In summary:
Frequency Distribution: Shows how
often each value occurs.
Unimodal Distribution: One peak.
Bimodal Distribution: Two peaks.
Multimodal Distribution: More than
two peaks.
Rectangular Distribution: Values
occur with equal frequency.
Symmetrical and Skewed
Distributions
symmetrical distribution – is a type
of distribution where the left and
right sides are mirror images of
each other.
skewed distribution – distribution in
which the scores pile up on one
side of the middle and are spread
out on the other side; distribution
that is not symmetrical.
 A distribution that is skewed
to the right is also called
positively skewed. A
distribution skewed to the
left is also called negatively
skewed.
floor effect – situation in which
many scores pile up at the low end
of a distribution (creating skewness
to the right) because it is not
possible to have any lower score.
ceiling effect – situation in which
many scores pile up at the high
end of a distribution (creating
skewness to the left) because it is
not possible to have a higher
score.
Normal and Kurtotic
Distributions
normal curve – specific, mathematically defined, bell-shaped frequency distribution that is symmetrical and unimodal; distributions observed in nature and in research commonly approximate it.
Kurtosis – extent to which a
frequency distribution deviates
from a normal curve in terms of
whether its curve in the
middle is more peaked or flat than
the normal curve.
Central tendency
Central tendency – is a statistical
measure that identifies a single
value as representative of an
entire dataset.
Mean – arithmetic average of a
group of scores; sum of the scores
divided by the number of scores.
Mode – is the value that appears
most frequently in a dataset. A
dataset may have one mode,
more than one mode, or no mode
at all if no number repeats.
Median – The median is the middle
value in a dataset when the values
are arranged in ascending or
descending order. If the dataset
has an odd number of values, the
median is the middle one. If it has
an even number of values, the
median is the average of the two
middle values.
Mean Example and Statistical Symbols
• The rule for figuring the mean is to add up all the scores and divide by the number of scores.
M – mean.
∑ – sum of; add up all the scores following this symbol.
X – scores in the distribution of the variable X.
N – number of scores in a distribution.
Formula: M = ∑X / N
Example:
∑X is 7 + 8 + 8 + 7 + 3 + 1 + 6 + 9 + 3 + 8, which is 60.
In our example, there are 10 scores. Thus, N equals 10.
M = ∑X / N = 60 / 10 = 6.
Mode Example
Example:
Consider the dataset: 1, 2, 2, 3, 4. The mode is 2.
If the dataset is 1, 1, 2, 2, 3, it has two modes: 1 and 2 (bimodal).
If the dataset is 1, 2, 3, 4, 5, it has no mode since no number repeats.
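The mean rule and the three mode cases can be checked with a short Python sketch. The function names `mean` and `modes` are hypothetical. Returning an empty list when no number repeats follows the definition of the mode used in these notes.

```python
from collections import Counter

def mean(scores):
    """Sum of the scores divided by the number of scores (M = sum of X / N)."""
    return sum(scores) / len(scores)

def modes(scores):
    """Return every value tied for the highest frequency, or [] if nothing repeats."""
    counts = Counter(scores)
    highest = max(counts.values())
    if highest == 1:          # no number repeats, so there is no mode
        return []
    return sorted(value for value, count in counts.items() if count == highest)

print(mean([7, 8, 8, 7, 3, 1, 6, 9, 3, 8]))  # 60 / 10
print(modes([1, 2, 2, 3, 4]))
print(modes([1, 1, 2, 2, 3]))
print(modes([1, 2, 3, 4, 5]))
```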
Median Example
Example:
Consider the dataset: 3, 1, 4, 1, 5.
First, arrange the data in ascending order: 1, 1, 3, 4, 5.
The median is 3.
For an even number of values, consider the dataset: 3, 1, 4, 1.
Arranged in ascending order: 1, 1, 3, 4.
The median is (1 + 3) / 2 = 2.
Note: When an answer is not a whole number, we suggest that you use two more decimal places in the answer than for the original numbers.
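The median procedure can be sketched in Python as follows. The function name `median` is hypothetical. For the even-sized dataset the two middle values are 1 and 3, so the function returns their average.

```python
def median(scores):
    """Middle value of the sorted scores; average of the two middle values if N is even."""
    ordered = sorted(scores)          # arrange in ascending order
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                    # odd number of values: take the middle one
        return ordered[middle]
    # even number of values: average the two middle ones
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([3, 1, 4, 1, 5]))
print(median([3, 1, 4, 1]))
```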
 The variance is the sum of
the squared deviations of
the scores from the mean,
divided by the number of
scores.
Variance – measure of how spread out a set of scores is; the average of the squared deviations from the mean.
deviation score – score minus the
mean.
squared deviation score – square
of the difference between a score
and the mean.
sum of squared deviations – total
of each score’s squared
difference from the mean.
Formulas for the Variance and the Standard Deviation
Variance: SD² = ∑(X − M)² / N
Standard deviation: SD = √(SD²), the square root of the variance.
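A minimal Python sketch of these definitions follows. The function names are hypothetical, the scores are the ten from the mean example in these notes, and the sum of squared deviations is divided by N, as the definition above states.

```python
import math

def variance(scores):
    """Average of the squared deviations from the mean (divides by N)."""
    m = sum(scores) / len(scores)
    squared_deviations = [(x - m) ** 2 for x in scores]   # (score minus mean) squared
    return sum(squared_deviations) / len(scores)          # sum of squares / N

def standard_deviation(scores):
    """Square root of the variance."""
    return math.sqrt(variance(scores))

scores = [7, 8, 8, 7, 3, 1, 6, 9, 3, 8]
print(variance(scores))
print(round(standard_deviation(scores), 2))
```

Python's standard library offers `statistics.pvariance` and `statistics.pstdev`, which also divide by N; `statistics.variance` and `statistics.stdev` divide by N − 1 instead and give slightly larger results.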
Z scores
Z score – number of standard
deviations that a score is above
(or below, if it is negative) the
mean of its distribution; it is thus an
ordinary score transformed so that
it better describes the score’s
location in a distribution.
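Read literally, the definition says a Z score is the score's distance from the mean, measured in standard deviations. A minimal Python sketch under that reading follows. The function names and sample numbers are hypothetical, and the standard deviation is computed by dividing by N.

```python
import math

def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

def z_scores(scores):
    """Convert every score in a distribution to a Z score."""
    m = sum(scores) / len(scores)
    sd = math.sqrt(sum((x - m) ** 2 for x in scores) / len(scores))
    return [z_score(x, m, sd) for x in scores]

print(z_score(8, 6, 2))    # above the mean
print(z_score(6, 6, 2))    # exactly at the mean
print(z_score(2, 6, 2))    # below the mean
print(z_scores([2, 4, 4, 4, 5, 5, 7, 9]))
```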
Formula
Z = (X − M) / SD, where X is the score, M is the mean, and SD is the standard deviation of the distribution.
Z-score of 0 – Indicates that the data point is exactly at the mean.
Positive Z-score – Indicates that the data point is above the mean.
Negative Z-score – Indicates that the data point is below the mean.
Magnitude of Z-score – Indicates how far (in terms of standard deviations) the data point is from the mean.
Uses of Z-score
Standardization: Allows comparison of data points from different datasets.
Probability Calculations: Helps calculate probabilities under the normal distribution curve.
Outlier Detection: Identifies data points that are unusually high or low compared to the rest of the dataset.
Sample and population
Population – the entire group of individuals, items, or data points that we want to study and draw conclusions about.
Sample – a subset of the population that is selected to represent the larger group.
• Psychologists usually study samples and not populations because it is not practical in most cases to study the entire population.
Methods of sampling
Simple Random Sampling – every
member of the population has an
equal chance of being selected.
Selection is done randomly without
any bias.
Stratified Sampling – The
population is divided into
homogeneous subgroups (strata)
based on certain characteristics
(e.g., age, gender, income). Then,
random samples are taken from
each stratum.
Systematic Sampling – Selecting
every nth individual from a list or
population. The first individual is
randomly chosen, and subsequent
selections are made at regular
intervals.
Example:
Selecting every 10th person from a
list of registered voters.
Cluster Sampling – The population
is divided into clusters
(geographical or administrative
units), and then some clusters are
randomly selected. All individuals
within the selected clusters are
included in the sample.
Example:
Randomly selecting several
schools in a district, then surveying
all students within those schools.
Convenience Sampling –
Individuals who are readily
available and willing to participate
are included in the sample. This
method is easy and quick but may
not represent the entire population
accurately.
Snowball Sampling – Initially
selecting a few individuals who
meet the criteria for the study.
These individuals then refer others
they know who also meet the
criteria, creating a chain or
'snowball' effect.
Quota Sampling – Similar to
stratified sampling but nonrandom. Researchers choose
individuals to fulfill certain quotas
based on predetermined criteria
(e.g., age, gender) until the quota
is met.
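The four random methods above (simple random, stratified, systematic, and cluster sampling) can be sketched in Python. Everything here is illustrative: the function names, the list of 100 numbered voters, the strata, and the school clusters are all hypothetical, and a seeded `random.Random` is used only so the run is repeatable.

```python
import random

def simple_random_sample(population, n, rng):
    """Every member has an equal chance of being selected."""
    return rng.sample(population, n)

def systematic_sample(population, step, rng):
    """Random starting point within the first `step` members, then every step-th one."""
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a random sample of the same size from each stratum."""
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and include everyone inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(0)                      # seeded for a repeatable illustration
voters = list(range(1, 101))                # hypothetical list of 100 registered voters
strata = {"under_30": voters[:50], "30_plus": voters[50:]}
schools = {"school_a": voters[:30], "school_b": voters[30:60], "school_c": voters[60:]}

print(simple_random_sample(voters, 10, rng))
print(systematic_sample(voters, 10, rng))   # every 10th voter
print(stratified_sample(strata, 3, rng))
print(len(cluster_sample(schools, 2, rng)))
```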
Probability, Outcome,
Frequency
Probability – measures the
likelihood or chance of a specific
outcome occurring.
Outcome – refers to the result of an
experiment, observation, or action.
It represents a possible result or
event that can occur.
Frequency – is how many times
something happens.
Expected relative frequency – is
what you expect to get in the long
run if you repeat the experiment
many times.
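The long-run idea can be illustrated with a small simulation. This is a sketch under stated assumptions: a fair coin (probability of heads 0.5) is a hypothetical example, and the function name `relative_frequency` is made up for illustration.

```python
import random

def relative_frequency(trials, rng):
    """Proportion of simulated fair-coin flips that come up heads."""
    heads = sum(1 for _ in range(trials) if rng.random() < 0.5)
    return heads / trials

rng = random.Random(1)   # seeded so the run is repeatable
for trials in (10, 1_000, 100_000):
    print(trials, relative_frequency(trials, rng))
```

With few flips the proportion of heads can be far from 0.5; as the number of flips grows it tends to settle close to the expected relative frequency.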
Steps for Finding Probabilities
1. Determine the number of
possible successful
outcomes.
2. Determine the number of all
possible outcomes.
3. Divide the number of
possible successful
outcomes (Step ❶) by the
number of all possible
outcomes (Step ❷).
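The three steps can be sketched in Python. The example (rolling a 1 or a 2 on a six-sided die) and the function name `probability` are hypothetical, and the calculation assumes that all outcomes are equally likely.

```python
from fractions import Fraction

def probability(successful_outcomes, all_outcomes):
    """Successful outcomes divided by all possible outcomes (equally likely outcomes assumed)."""
    return Fraction(len(successful_outcomes), len(all_outcomes))

die = [1, 2, 3, 4, 5, 6]                              # Step 2: all possible outcomes
successes = [face for face in die if face <= 2]       # Step 1: successful outcomes
p = probability(successes, die)                       # Step 3: divide
print(p, round(float(p), 2))
```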
Hypothesis testing
Theory – set of principles that attempt to explain one or more facts, relationships, or events.
Hypothesis – prediction, often based on informal observation, previous research, or theory, that is tested in a research study.
Hypothesis testing – procedure for deciding whether the outcome of a study (results for a sample) supports a particular theory or practical innovation (which is thought to apply to a population).
Research hypothesis – Claims a significant relationship, effect, or difference between variables.
Null hypothesis – States there is no significant relationship, effect, or difference; any observed results are due to random chance.
• In hypothesis testing, researchers typically seek to reject the null hypothesis in favor of the research hypothesis, based on empirical data and statistical analysis.