Probability (Chapter 6)

 The relationship between populations and samples is often
described in terms of ‘probability’
 Knowing the make-up of a population allows us to infer the
likely characteristics of samples from the same population
(population to sample inference)
 This, however, is backwards from what we do in inferential
statistics
 In inferential statistics, we make inferences from a known
sample to what the population it came from must be like
(sample to population inference)
 In research we are interested in whether the sample values
belong to the control group population or to some other population
different from the control; we determine this by using
probability
Definition of probability:
 Probability of event ‘A’ = number of outcomes classified as ‘A’
divided by the total number of possible outcomes
 e.g., the probability of picking the queen of spades from a complete
deck of cards, written as p (queen of spades), is:
p (queen of spades) = 1/52
 The probability of picking an ace is:
p (ace) = 4/52
 p (spade) is:
p (spade) = 13/52 = 1/4 = 0.25 = 25%
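The definition above can be checked by counting outcomes directly. The following is a minimal sketch, assuming a standard 52-card deck; the names `DECK` and `probability` are illustrative and not part of the original notes.

```python
from fractions import Fraction
from itertools import product

# Build a standard 52-card deck as (rank, suit) pairs (an assumed representation).
RANKS = ["2", "3", "4", "5", "6", "7", "8", "9", "10",
         "jack", "queen", "king", "ace"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]
DECK = list(product(RANKS, SUITS))


def probability(event, outcomes=DECK):
    """p(A) = number of outcomes classified as A / total possible outcomes."""
    favourable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favourable, len(outcomes))


p_queen_of_spades = probability(lambda card: card == ("queen", "spades"))
p_ace = probability(lambda card: card[0] == "ace")
p_spade = probability(lambda card: card[1] == "spades")

# Fraction reduces automatically, so 4/52 prints as 1/13 and 13/52 as 1/4.
print(p_queen_of_spades, p_ace, p_spade)  # 1/52 1/13 1/4
```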
Random sampling as a condition for determining
probability:
 For our definition of probability to be correct, we need to
assume our sample is obtained by random sampling
 There are 2 requirements for random sampling:
 Each individual in the population must have an equal
chance of being selected
 If more than one case is selected, we must have a
constant probability for every selection (this is
accomplished by sampling with replacement)
For n = 2 cards:
 If you have already taken 1 card out of a deck, the probability of
picking a certain card on the second attempt changes:
Pick 1:
p (jack of hearts) = 1/52
Pick 2:
If pick 1 did not produce the jack of hearts,
p (jack of hearts) = 1/51
Note: Pick 2 violates the second requirement for random sampling
(constant probability for every selection)
 To keep probabilities constant, we must select with replacement
(replacement becomes less important as the size of the population
increases relative to the sample)
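The two-card example can be sketched in code to show how the probability shifts when the first card is not returned to the deck. This is only an illustration; the function `p_specific_card` and its parameters are hypothetical names introduced here.

```python
from fractions import Fraction


def p_specific_card(pick, deck_size=52, replacement=False):
    """Chance of drawing one specific card on a given pick (1-based),
    assuming that card was not drawn on any earlier pick."""
    # Without replacement, each earlier pick removes one card from the deck.
    cards_left = deck_size if replacement else deck_size - (pick - 1)
    return Fraction(1, cards_left)


without_replacement = [p_specific_card(k) for k in (1, 2)]
with_replacement = [p_specific_card(k, replacement=True) for k in (1, 2)]

print(without_replacement)  # [Fraction(1, 52), Fraction(1, 51)]
print(with_replacement)     # [Fraction(1, 52), Fraction(1, 52)]
```

Only the version with replacement keeps the probability constant across picks.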
 Probability and frequency distributions:
For the scores 6, 6, 8, 9, 12, 13, 13, 15 (n = 8):
p(x>8) = 5/8
p(x<9) = 3/8
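For a small set of scores, these probabilities are simply proportions of the scores that meet a condition. A minimal sketch using the eight scores listed above (the helper name `proportion` is an assumption for illustration):

```python
from fractions import Fraction

scores = [6, 6, 8, 9, 12, 13, 13, 15]


def proportion(values, condition):
    """Proportion of values satisfying the condition, as an exact fraction."""
    matching = sum(1 for x in values if condition(x))
    return Fraction(matching, len(values))


p_greater_than_8 = proportion(scores, lambda x: x > 8)
p_less_than_9 = proportion(scores, lambda x: x < 9)

print(p_greater_than_8, p_less_than_9)  # 5/8 3/8
```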
Normal distributions
 Definition:
 Symmetrical
 Highest frequency in the middle (mode = mean = median)
 Frequencies taper off as scores get further from the mean
 34.13% (34%) of the data lies between the mean and 1 standard
deviation above it, and 34.13% lies between the mean and
1 standard deviation below it
 13.59% (14%) of the data is between the 1st and 2nd
standard deviation from the mean in each direction
 2.28% (2%) of the distribution is beyond the 2nd standard
deviation in each direction
Check the z-scores for each of these proportions of the
distribution using the unit normal table
Note: many variables are considered to be approximately normally
distributed if you collect enough data, e.g. height, age, IQ
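The percentages quoted for the normal distribution can be reproduced without a table. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

unit_normal = NormalDist()  # mean 0, standard deviation 1

# Proportion between the mean and 1 standard deviation above it.
mean_to_1sd = unit_normal.cdf(1) - unit_normal.cdf(0)
# Proportion between the 1st and 2nd standard deviation above the mean.
sd1_to_sd2 = unit_normal.cdf(2) - unit_normal.cdf(1)
# Proportion beyond the 2nd standard deviation above the mean.
beyond_2sd = 1 - unit_normal.cdf(2)

print(f"{mean_to_1sd:.4f} {sd1_to_sd2:.4f} {beyond_2sd:.4f}")
# 0.3413 0.1359 0.0228
```

By symmetry, the same three proportions apply below the mean.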
Probability associated with specific samples
(proportions under curve):
 e.g., assuming heights are normally distributed with a
mean of 68 inches and a standard deviation of 6 inches:
 What is the probability of randomly selecting someone who is
taller than 72 inches?
 The question is really one of proportion under the curve, so always
sketch the curve first
 We need to work out the z-score first:
z = (x − μ) / σ = (72 − 68) / 6 ≈ 0.66
Note: for a z distribution, μ = 0 and σ = 1
 Tables have been developed that give the proportion of the curve
associated with every z-score (the unit normal table, columns B, C
and D)
 In this case, the proportion above a z of 0.66 (72 inches) is
0.2546, or about 25%
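The height example can also be computed directly rather than looked up. This is a sketch under the stated assumptions (mean 68, standard deviation 6), using the standard library's `statistics.NormalDist`. Note that 4/6 is 0.667 when not truncated, so the exact tail area differs slightly from the table value for z = 0.66.

```python
from statistics import NormalDist

heights = NormalDist(mu=68, sigma=6)

# z-score for a height of 72 inches: (x - mean) / standard deviation.
z = (72 - heights.mean) / heights.stdev

# Proportion of the curve above 72 inches, computed from the untruncated z.
p_taller_exact = 1 - heights.cdf(72)

# The same lookup the unit normal table performs for z = 0.66.
p_taller_table = 1 - NormalDist().cdf(0.66)

print(f"z = {z:.3f}")
print(f"exact tail = {p_taller_exact:.4f}, table tail = {p_taller_table:.4f}")
```

Either way the answer is about 25%.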
Quartiles:
 Percentiles divide distributions into 100 equal parts
 Quartiles divide distributions into 4 equal parts
For a normal distribution:
Q1: z = −0.67
Q2 (median): z = 0
Q3: z = +0.67
 e.g., find the 1st, 2nd and 3rd quartiles for a population distribution
with a mean of 50 and a standard deviation of 10
For Q1, z = −0.67:
x = μ + zσ
x = 50 + (−0.67)(10)
x = 50 − 6.7
x = 43.3
For Q2, z = 0:
x = 50 + (0)(10)
x = 50
For Q3, z = 0.67:
x = 50 + (0.67)(10)
x = 50 + 6.7
x = 56.7
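The quartile calculation can be sketched in two ways: by hand with z = ±0.67, as in the notes, and with the inverse of the cumulative distribution from `statistics.NormalDist`. The variable names are illustrative; the small differences come from rounding z to two decimals.

```python
from statistics import NormalDist

mu, sigma = 50, 10
population = NormalDist(mu=mu, sigma=sigma)

# Hand calculation: x = mu + z * sigma with the rounded table values of z.
by_hand = [round(mu + z * sigma, 1) for z in (-0.67, 0, 0.67)]

# Direct calculation: scores below which 25%, 50% and 75% of the curve lies.
exact = [population.inv_cdf(p) for p in (0.25, 0.50, 0.75)]

print(by_hand)                         # [43.3, 50, 56.7]
print([round(x, 1) for x in exact])    # [43.3, 50.0, 56.7]
```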
Semi-interquartile range:
(Q3 − Q1) / 2 = (56.7 − 43.3) / 2 = 13.4 / 2 = 6.7
Note:
In a normal distribution, the semi-interquartile range also equals
the distance from the median to either quartile
In a normal distribution, the semi-interquartile range always equals
0.67 times the standard deviation: 0.67σ
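As a check on the relationship with the standard deviation, the semi-interquartile range of a normal distribution can be computed for any mean and standard deviation. A minimal sketch, with `semi_interquartile_range` as a hypothetical helper name:

```python
from statistics import NormalDist


def semi_interquartile_range(mu, sigma):
    """(Q3 - Q1) / 2 for a normal distribution with the given parameters."""
    dist = NormalDist(mu=mu, sigma=sigma)
    return (dist.inv_cdf(0.75) - dist.inv_cdf(0.25)) / 2


siqr = semi_interquartile_range(50, 10)
ratio = siqr / 10  # semi-interquartile range as a multiple of sigma

print(f"{siqr:.2f} {ratio:.2f}")  # 6.74 0.67
```

The hand value of 6.7 differs slightly because z was rounded to 0.67 before multiplying.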
Work through all problems at the end of Chapter 6. Remember that
answers to the even-numbered questions are available at the psych
lab.