Chapter 6 - Using Probability to Make Decisions about Data



Going Forward

Your goals in the chapter are to learn:

• What probability is

• How to compute the probability of raw scores and sample means using z-scores

• How random sampling should produce a representative sample

• How sampling error may produce an unrepresentative sample

• How to use a sampling distribution of means to decide whether a sample represents a particular population

Random Sampling

Selecting a sample so that all events or individuals in the population have an equal chance of being selected is known as random sampling.

Probability

• The probability of an event is equal to the event’s relative frequency in the population of possible events that can occur

• The symbol for probability is p

Probability Distributions


A probability distribution indicates the probability of all possible events in a population

• An empirical probability distribution is created by observing the relative frequency of every event in the population

• A theoretical probability distribution is based on how we assume nature distributes events in the population
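As a rough illustration of the two kinds of distribution, the Python sketch below builds an empirical distribution from simulated die rolls and compares it with a theoretical distribution that assumes every face is equally likely. The die, the number of rolls, the seed, and the function name `empirical_distribution` are assumptions chosen for this example, not part of the chapter.

```python
import random
from collections import Counter

def empirical_distribution(events):
    """Relative frequency of each observed event: count / total observations."""
    counts = Counter(events)
    total = len(events)
    return {event: counts[event] / total for event in sorted(counts)}

# Hypothetical "population" of observed events: 6,000 simulated die rolls.
rng = random.Random(0)  # fixed seed so the run is repeatable
rolls = [rng.randint(1, 6) for _ in range(6000)]

empirical = empirical_distribution(rolls)

# Theoretical distribution: we assume each of the six faces is equally likely.
theoretical = {face: 1 / 6 for face in range(1, 7)}

for face in range(1, 7):
    print(face, round(empirical.get(face, 0.0), 3), round(theoretical[face], 3))
```

With this many rolls the observed relative frequencies should sit close to 1/6, but they will rarely match it exactly.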

Obtaining Probability from the Standard Normal Curve

Probability of Individual Scores

The proportion of the total area under the standard normal curve for particular scores equals the probability of those scores.
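One way to see this is to compute areas under the standard normal curve directly. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper names `area_between` and `area_beyond` and the z-scores chosen are assumptions made for illustration.

```python
from statistics import NormalDist

# The standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

def area_between(z_low, z_high):
    """Proportion of the area under the curve between two z-scores."""
    return standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

def area_beyond(z):
    """Proportion of the area in the tail beyond z (away from the mean)."""
    return 1.0 - standard_normal.cdf(abs(z))

p_within_one_sd = area_between(-1.0, 1.0)  # about .68
p_beyond_two = area_beyond(2.0)            # about .023
print(round(p_within_one_sd, 4), round(p_beyond_two, 4))
```

Each returned proportion can be read as the probability of randomly selecting a score in that part of the distribution.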

The Probability of Sample Means

One type of theoretical probability distribution, known as the sampling distribution of means, is used to determine the probability of randomly obtaining any particular sample mean.

[Figure: Sampling distribution of SAT means when N = 25]

Probability of Sample Means

• The probability of selecting a particular sample mean is the same as the probability of randomly selecting a sample of participants whose scores produce that sample mean

• The larger the absolute value of a sample mean’s z-score, the less likely the mean is to occur when samples are drawn from the underlying raw score population
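The sketch below shows how a sample mean's z-score and probability might be computed on a sampling distribution of means. The population values (mu = 500, sigma = 100) and the sample means tested are assumed purely for illustration; only N = 25 comes from the slide title. The function name `z_for_sample_mean` is also hypothetical.

```python
import math
from statistics import NormalDist

def z_for_sample_mean(sample_mean, mu, sigma, n):
    """z-score of a sample mean on the sampling distribution of means."""
    standard_error = sigma / math.sqrt(n)  # standard error of the mean
    return (sample_mean - mu) / standard_error

# Assumed values for illustration only: mu = 500, sigma = 100, N = 25.
mu, sigma, n = 500.0, 100.0, 25

z_near = z_for_sample_mean(520.0, mu, sigma, n)  # (520 - 500) / 20
z_far = z_for_sample_mean(540.0, mu, sigma, n)   # (540 - 500) / 20

# Area beyond each z in the upper tail: the probability of a mean at least this high.
p_near = 1.0 - NormalDist().cdf(z_near)
p_far = 1.0 - NormalDist().cdf(z_far)
print(z_near, round(p_near, 4))
print(z_far, round(p_far, 4))
```

The mean that lies farther from mu has the larger z-score and the smaller probability, which is the pattern the second bullet describes.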

Random Sampling and Sampling Error

Representative Samples

• A representative sample is one in which the characteristics of the individuals and scores in the sample accurately reflect the characteristics of the individuals and scores in the population

• Sampling error occurs when random chance produces a sample statistic (e.g., $s^2$) not equal to the population parameter it represents (e.g., $\sigma^2$)
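A small simulation can make sampling error concrete. The sketch below draws several random samples from a made-up population and reports how far each sample mean lands from the population mean. The population (normal, mean 100, standard deviation 15), the sample size of 25, and the seed are all assumptions for this example, and the mean is used as the statistic only because it is easy to inspect.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical population of 10,000 scores (assumed normal, mu = 100, sigma = 15).
population = [rng.gauss(100, 15) for _ in range(10_000)]
population_mean = statistics.fmean(population)

# Draw several random samples of N = 25 and compare each mean with the parameter.
sampling_errors = []
for _ in range(5):
    sample = rng.sample(population, 25)
    sample_mean = statistics.fmean(sample)
    sampling_errors.append(sample_mean - population_mean)

for error in sampling_errors:
    print(round(error, 2))
```

Every sample here is drawn at random from the same population, so any nonzero difference printed is due to chance alone.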

Which Is It?

• It is always possible to obtain a sample that is not representative

• Therefore, any sample might either poorly represent one population because of sampling error or accurately represent a different population

Deciding Whether a Sample Represents a Population

Likely vs. Unlikely

Sample mean A is likely. Sample mean B is unlikely.

Region of Rejection

• At some point, a sample mean is so far above or below the population mean it is unbelievable that chance produced such an unrepresentative sample

• The area beyond these points is called the region of rejection


The region of rejection is the part of a sampling distribution containing values so unlikely we “reject” the idea they represent the underlying raw score population.

[Figure: A sampling distribution of means showing the region of rejection]

Criterion

The criterion is the probability that defines sample means as too unlikely to represent the underlying raw score population.

Critical Value

• A critical value marks the inner edge of the region of rejection

• For a criterion of .05 in a two-tailed test, the area in each tail equals .025

• ± 1.96 is the critical value of z for a criterion of .05 in a two-tailed test
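The critical value can be recovered from the criterion by asking which z-score leaves half of the criterion in each tail. The sketch below does this with `NormalDist.inv_cdf` from the Python standard library; the function name `two_tailed_critical_z` is an assumption made for this example.

```python
from statistics import NormalDist

def two_tailed_critical_z(criterion):
    """Positive critical z that leaves criterion / 2 of the area in each tail."""
    area_per_tail = criterion / 2
    return NormalDist().inv_cdf(1.0 - area_per_tail)

critical = two_tailed_critical_z(0.05)
print(round(critical, 2))  # the critical values are -critical and +critical
```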

Rejection Rule

• When a sample’s z-score lies beyond the critical value, reject the idea the sample represents the underlying raw score population reflected by the sampling distribution

• When the z-score does not lie beyond the critical value, retain the idea the sample represents the underlying raw score population

Summary

1. Set up the sampling distribution

– Select the criterion (e.g., .05)

– Locate the region of rejection

– Determine the critical value (e.g., ± 1.96 in a two-tailed test with a criterion of .05)


2. Compute the sample mean and its z-score

– Compute the standard error of the mean, $\sigma_{\bar{X}}$

– Compute z using $\bar{X}$ and the $\mu$ of the sampling distribution

One-Tailed and Two-Tailed Tests

• Two-tailed tests

– Reject the idea the sample mean is representative if it falls in either the negative tail or the positive tail of the distribution

• One-tailed tests

– If we are interested in positive z-scores, reject the idea the sample mean is representative only if it falls in the positive tail

– If we are interested in negative z-scores, reject the idea the sample mean is representative only if it falls in the negative tail
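The difference between the two kinds of test can be sketched as a single decision function. It assumes the usual convention that a one-tailed test places the entire criterion in one tail, so the one-tailed cutoff for a criterion of .05 is about 1.645 rather than 1.96. The function name `in_region_of_rejection`, its `tail` labels, and the z of 1.80 are hypothetical.

```python
from statistics import NormalDist

def in_region_of_rejection(z, criterion=0.05, tail="two"):
    """Return True when z lies beyond the critical value for the chosen test.

    tail: "two" (either tail), "upper" (positive tail only),
          or "lower" (negative tail only).
    """
    standard_normal = NormalDist()
    if tail == "two":
        # Split the criterion across both tails.
        critical = standard_normal.inv_cdf(1.0 - criterion / 2)
        return abs(z) > critical
    if tail == "upper":
        # Whole criterion in the positive tail.
        critical = standard_normal.inv_cdf(1.0 - criterion)
        return z > critical
    if tail == "lower":
        # Whole criterion in the negative tail.
        critical = standard_normal.inv_cdf(criterion)
        return z < critical
    raise ValueError("tail must be 'two', 'upper', or 'lower'")

# A hypothetical z of +1.80 is beyond the one-tailed cutoff (about 1.645)
# but not beyond the two-tailed cutoff (about 1.96).
two_tailed = in_region_of_rejection(1.80, tail="two")
upper_tailed = in_region_of_rejection(1.80, tail="upper")
lower_tailed = in_region_of_rejection(1.80, tail="lower")
print(two_tailed, upper_tailed, lower_tailed)
```

The same z-score can therefore lead to different decisions depending on which tail or tails make up the region of rejection.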

One-Tailed Tests

Example

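A sample of 10 scores yields a sample mean of 305. Does the sample represent the population where $\mu = 325$ and $\sigma_X = 25$?

$$\sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{N}} = \frac{25}{\sqrt{10}} = 7.91$$

$$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{305 - 325}{7.91} = \frac{-20}{7.91} = -2.53$$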


• With a criterion of .05 and a region of rejection in two tails, the critical value is ± 1.96.

• Since the sample z of –2.53 is beyond –1.96, it is in the region of rejection. We reject the idea that the sample represents the population.
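The arithmetic in this example can be checked with a few lines of Python. The numbers are the ones given in the example; the variable names are chosen here for readability.

```python
import math

mu = 325.0           # population mean from the example
sigma = 25.0         # population standard deviation from the example
n = 10               # sample size
sample_mean = 305.0  # obtained sample mean
critical = 1.96      # two-tailed critical value for a criterion of .05

standard_error = sigma / math.sqrt(n)    # 25 / sqrt(10)
z = (sample_mean - mu) / standard_error  # (305 - 325) / standard error

# Reject when z lies beyond the critical value in either tail.
reject = abs(z) > critical
print(round(standard_error, 2), round(z, 2), reject)
```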
