INFERENCE

What can you find out about a population by looking at a sample?

Getting started

 You need a population to sample

 There should be a reason to sample

The koala learning activity was developed by Anthony Harradine (2008)

Are the koalas healthy?

Take a sample and make a dotplot

What do you notice?

The median of the population is likely to be within the range of sample medians.

The median weight of the female koalas is likely to be between 4.7kg and 5.4kg.
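The idea of collecting medians from repeated samples can be sketched in code. This is only an illustration: the population below is synthetic (invented normal-shaped weights, not the koala data from the activity), and the helper name `sample_medians` is hypothetical.

```python
import random
import statistics

def sample_medians(population, sample_size, n_samples, rng):
    """Median of each of n_samples random samples (drawn without replacement)."""
    return [statistics.median(rng.sample(population, sample_size))
            for _ in range(n_samples)]

rng = random.Random(1)
# Synthetic stand-in population: NOT the koala data used in the activity
population = [round(rng.gauss(5.1, 0.8), 1) for _ in range(500)]

medians = sample_medians(population, sample_size=30, n_samples=20, rng=rng)
low, high = min(medians), max(medians)
print(f"sample medians range from {low} to {high}")
print(f"population median: {statistics.median(population)}")
```

In most runs the population median falls inside the printed range, which is the pattern the dot plots of sample medians are meant to reveal.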

Are the koalas healthy?

Making an inference

The actual population median is 5.1kg.

Usually we only see one sample.

We make an inference that the population median is the same as the sample median (even though we know that it is probably not exactly the same).

This is called a point estimate.

Making an interval estimate

At NZC level 7, the idea of the interval is developed further.

Taking samples of different sizes and collecting the medians, you can demonstrate that there is less variation in the medians of large samples than the medians of small samples.
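A minimal simulation of this demonstration, assuming a synthetic normal-shaped population (the values 75 and 15 are invented for illustration) and a hypothetical helper `median_spread`:

```python
import random
import statistics

def median_spread(population, sample_size, n_samples, rng):
    """Standard deviation of the medians of n_samples random samples."""
    medians = [statistics.median(rng.sample(population, sample_size))
               for _ in range(n_samples)]
    return statistics.stdev(medians)

rng = random.Random(2011)
# Synthetic population, invented for illustration only
population = [rng.gauss(75, 15) for _ in range(5000)]

# 200 samples at each of the three sample sizes
spreads = {size: median_spread(population, size, 200, rng)
           for size in (15, 30, 60)}
for size, spread in spreads.items():
    print(f"sample size {size}: spread of medians = {spread:.2f}")
```

The spread of the medians should shrink as the sample size grows, roughly in proportion to 1/√(sample size).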

Collections of medians

[Figure: dot plots of sample medians for sample size 15, for 200 samples of size 30, and for sample size 60. Horizontal axis: median, 40 to 110.]

Lindsay Smith, University of Auckland

Stats Day 2011

What else might affect the uncertainty in estimating the population median?

 The spread of the population

 Comparing the heights of intermediate school students (years 7 and 8) with the heights of junior high school students (years 7 to 10)
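The effect of population spread can be checked with the same kind of simulation. The sketch below assumes two synthetic normal-shaped populations with the same centre but different spreads; the numbers 150, 7 and 12 are invented and are not the school height data.

```python
import random
import statistics

def median_spread(center, sd, sample_size, n_samples, rng):
    """Std dev of sample medians when sampling from a normal-shaped population."""
    medians = []
    for _ in range(n_samples):
        sample = [rng.gauss(center, sd) for _ in range(sample_size)]
        medians.append(statistics.median(sample))
    return statistics.stdev(medians)

rng = random.Random(7)
# Hypothetical populations: same centre, different spread, same sample size
narrow = median_spread(center=150, sd=7, sample_size=30, n_samples=200, rng=rng)
wide = median_spread(center=150, sd=12, sample_size=30, n_samples=200, rng=rng)
print(f"narrow population: spread of medians = {narrow:.2f}")
print(f"wide population:   spread of medians = {wide:.2f}")
```

With the sample size held fixed, the more spread-out population should give more variable sample medians.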

Sampling variability: effect of spread

[Figure: dot plots of height for the Intermediate and Middle School populations, and box plots of two samples from each population (Sample of Intermediate, Sample of Middle School). Horizontal axis: height, 100 to 200.]

Estimating the spread of the population

 Best estimate: using the IQR of our sample

 Using the quartiles of our sample as point estimates for the quartiles of the population
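As a small worked example, the quartiles and IQR of a sample can be computed as below. The eleven heights are made up for illustration. Note that `statistics.quantiles` uses one particular quartile convention (its default "exclusive" method); hand methods taught in class may give slightly different quartiles for some samples.

```python
import statistics

# Hypothetical sample of 11 heights in cm (invented for illustration)
sample = [142, 148, 150, 151, 153, 155, 158, 160, 163, 165, 171]

# Three cut points: lower quartile, median, upper quartile
q1, med, q3 = statistics.quantiles(sample, n=4)
iqr = q3 - q1

print(f"lower quartile = {q1}, median = {med}, upper quartile = {q3}")
print(f"IQR = {iqr}")
```

Each of these sample values serves as a point estimate of the corresponding population value.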

Providing an interval estimate (a confidence interval) for the population median

There are two factors which affect the uncertainty of estimating the parameter:

1. Sample size

2. Spread of population, estimated with sample IQR

 How confident do we want to be that our interval estimate contains the true population median?

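Development of formula for confidence interval

$$\text{population median} = \text{sample median} \pm \frac{\text{measure of spread}}{\sqrt{\text{sample size}}}$$

To ensure we predict the population median 90% of the time:

$$\text{population median} = \text{sample median} \pm 1.5 \times \frac{\text{measure of spread}}{\sqrt{\text{sample size}}}$$

$$\text{population median} = \text{sample median} \pm 1.5 \times \frac{\text{IQR}}{\sqrt{n}}$$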
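The calculation sample median ± 1.5 × IQR / √n can be written as a short function. This is a sketch only: the function name `informal_interval` and the sample heights are hypothetical, and the quartiles follow the default convention of Python's `statistics.quantiles`, which may differ slightly from quartiles found by hand.

```python
import math
import statistics

def informal_interval(sample):
    """Return (low, high) = sample median -/+ 1.5 * IQR / sqrt(n)."""
    n = len(sample)
    q1, _, q3 = statistics.quantiles(sample, n=4)
    margin = 1.5 * (q3 - q1) / math.sqrt(n)
    median = statistics.median(sample)
    return median - margin, median + margin

# Hypothetical sample of 11 heights in cm (invented for illustration)
sample = [142, 148, 150, 151, 153, 155, 158, 160, 163, 165, 171]
low, high = informal_interval(sample)
print(f"interval estimate for the population median: {low:.1f} to {high:.1f}")
```

The interval is centred on the sample median, and it narrows as n grows or as the sample IQR shrinks.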

Justification for the calculation

Based on simulations,

 The interval includes the true population median for 9 out of 10 samples - the population median is probably in the interval somewhere.

 This leads to being able to make a claim about two populations when their intervals do not overlap.

 Sampling variation only produces a shift large enough to make a mistaken claim about once in 40 pairs of samples.
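The "9 out of 10" figure can be explored with a simulation like the one below. It is a sketch under stated assumptions: samples are drawn from a normal-shaped population (so the true median is known to be `mu`), the sample size of 30 and the 2000 trials are arbitrary choices, and `informal_interval` and `coverage` are hypothetical helper names. Populations with other shapes may give somewhat different coverage.

```python
import math
import random
import statistics

def informal_interval(sample):
    """Return (low, high) = sample median -/+ 1.5 * IQR / sqrt(n)."""
    q1, _, q3 = statistics.quantiles(sample, n=4)
    margin = 1.5 * (q3 - q1) / math.sqrt(len(sample))
    median = statistics.median(sample)
    return median - margin, median + margin

def coverage(n, trials, rng, mu=0.0, sigma=1.0):
    """Fraction of simulated samples whose interval contains the true median."""
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        low, high = informal_interval(sample)
        if low <= mu <= high:
            hits += 1
    return hits / trials

rng = random.Random(40)
rate = coverage(n=30, trials=2000, rng=rng)
print(f"interval contained the true median in {rate:.0%} of samples")
```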

Comparing two populations

 Sampling variation is always present and will cause a shift in the medians

 We are looking for sufficient evidence: a shift between the intervals for the medians that is big enough to claim that there is a difference back in the populations
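One way to sketch this comparison in code is to build the interval for each sample and make a claim only when the two intervals do not overlap. The function names and data below are hypothetical, and the simulation assumes both samples come from the same normal-shaped population, so any claim it makes is a mistaken one caused by sampling variation alone.

```python
import math
import random
import statistics

def informal_interval(sample):
    """Return (low, high) = sample median -/+ 1.5 * IQR / sqrt(n)."""
    q1, _, q3 = statistics.quantiles(sample, n=4)
    margin = 1.5 * (q3 - q1) / math.sqrt(len(sample))
    median = statistics.median(sample)
    return median - margin, median + margin

def can_claim_difference(sample_a, sample_b):
    """True when the two interval estimates do not overlap."""
    low_a, high_a = informal_interval(sample_a)
    low_b, high_b = informal_interval(sample_b)
    return high_a < low_b or high_b < low_a

def false_claim_rate(n, trials, rng):
    """Share of sample pairs from ONE population that still lead to a claim."""
    claims = 0
    for _ in range(trials):
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]
        if can_claim_difference(a, b):
            claims += 1
    return claims / trials

# Two clearly separated made-up samples: a claim can be made
group_a = list(range(100, 111))
group_b = list(range(150, 161))
print(can_claim_difference(group_a, group_b))

rng = random.Random(2)
rate = false_claim_rate(n=30, trials=2000, rng=rng)
print(f"mistaken claims in {rate:.1%} of pairs from the same population")
```

The mistaken-claim rate should come out small, in the same region as the "about once in 40 pairs" figure, though the exact value depends on the population shape and the sample size.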

“NCEA level 2 is not an endpoint. It is a platform.”
