Chapter 2 Slides

INTRODUCTORY STATISTICS

Chapter 2 DESCRIPTIVE STATISTICS

PowerPoint Image Slideshow

SEC. 2.2 KEY VOCABULARY

• Stem-plot: organization of data using the last digit as the leaf and previous digits as the stem

• Line graph: graph made by connecting data points

• Bar graph: consists of bars that are separate from each other

STEM-PLOTS

Create a stem-plot for the following data of times it takes students to get to school: 15, 21, 17, 55, 11, 31, 25, 14, 15, 3

Are there any potential outliers?
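
One way to check a hand-drawn stem-plot is a short script. The sketch below assumes whole-number data and uses the last digit as the leaf, as defined in the key vocabulary; the function name `stem_plot` is made up for this illustration.

```python
from collections import defaultdict

def stem_plot(values):
    """Return stem-plot rows as strings, one per stem (empty stems included)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # last digit is the leaf, the rest is the stem
        leaves[stem].append(leaf)
    rows = []
    for stem in range(min(leaves), max(leaves) + 1):
        row = "".join(str(d) for d in leaves.get(stem, []))
        rows.append(f"{stem} | {row}".rstrip())
    return rows

# Times (in the slide's units) it takes students to get to school
times = [15, 21, 17, 55, 11, 31, 25, 14, 15, 3]
for row in stem_plot(times):
    print(row)
```

Empty stems are kept on purpose: a gap between the bulk of the leaves and an isolated value is what makes a potential outlier visible.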

FIGURE 2.2

What does this line graph represent?

How many total pieces of data are there?

FIGURE 2.3

What does this bar graph represent?

PRACTICE:

Create a bar graph for the following information:

Randomly choose 5 countries (173 total) to show their education expenditures as a percentage of their GDP.

CIA Factbook

FIGURE 2.5

A histogram is similar to a bar graph, but has adjoining bars and has either frequency or relative frequency on the y-axis.

FIGURE 2.6

Create a relative frequency column and then a histogram displaying weight and relative frequency.

SEC. 2.4: MEASURES OF LOCATION OF DATA

• Quartiles: numbers that separate the data into quarters; quartiles may or may not be part of the data. The second quartile is the median of the data.

• Percentile: The pth percentile of a distribution is the value with p percent of the observations less than it.

• Median: 50th percentile

• Q1: 25th percentile

• Q3: 75th percentile

• Interquartile Range (IQR): range of the middle 50 percent of the data values; IQR = Q3 – Q1.

• Outlier: a value below Q1 – 1.5·IQR or above Q3 + 1.5·IQR

FIGURE 2.11

What is the 75th percentile?

• Does more data lie between 4 and 7 or between 1 and 2? How do you know?

• Where do the middle 50% of values lie?

EXAMINING PERCENTILES

Jenny earned a score of 86 on her test. How did she perform relative to the rest of the class?

6 | 7

7 | 2334

7 | 5777899

8 | 00123334

8 | 569

9 | 03

Answer: Her score was greater than 21 of the 25 observations.

Since 21 of the 25, or 84%, of the scores are below hers, Jenny is at the 84th percentile in the class’s test score distribution.
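
The count behind Jenny's percentile can be reproduced in a few lines. This is a minimal sketch that reads the 25 scores off the stem-plot and applies the slide's definition (percent of observations strictly less than the value); the helper name `percent_below` is an assumption for illustration.

```python
# Scores read from the stem-plot (stems are tens, leaves are ones)
scores = [67,
          72, 73, 73, 74,
          75, 77, 77, 77, 78, 79, 79,
          80, 80, 81, 82, 83, 83, 83, 84,
          85, 86, 89,
          90, 93]

def percent_below(data, value):
    """Percent of observations strictly less than `value`."""
    below = sum(1 for x in data if x < value)
    return 100 * below / len(data)

jenny = percent_below(scores, 86)
print(jenny)  # 21 of 25 scores are below 86
```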

MORE PERCENTILES

• What score corresponds with the 50th percentile?

• What score corresponds with the 10th percentile?

• What score corresponds with the 90th percentile?

PRACTICE:

• Where is the 25th percentile?

• Which range contains the most data values: 95-400, 10-20 or 490-790? How do you know?

• What is the IQR?

• What can you say about the spread of the data?

SEC. 2.5: BOX PLOTS

• Which data set has more values above 80?

• Which data set is more likely to have an outlier?

• What situation could fit these box plots? What could we infer about the data in each situation?

FIGURE 2.13

• What is the median of this set?

• What is a potential set of data that could fit this box plot?

PRACTICE:

Barry Bonds set the major league record by hitting 73 home runs in the 2001 season. Here are data on the number of home runs that Bonds hit in each of his 21 complete seasons.

16, 25, 24, 19, 33, 25, 34, 46, 37, 33, 42, 40, 37, 34, 49, 73, 42, 45, 45, 26, 28

a) Are there any potential outliers? Use Q1 – 1.5·IQR and Q3 + 1.5·IQR to determine if there are any actual outliers.

b) Create a box plot for the data (find the min, Q1, Q2, Q3 and max).
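
To check the hand calculation, here is a sketch of the five-number summary and the 1.5·IQR outlier fences (Q1 – 1.5·IQR and Q3 + 1.5·IQR). It assumes the "median of each half" convention for quartiles, leaving the median out of both halves when the count is odd; other software may use a different quartile convention and give slightly different values. The function name `five_number_summary` is made up for this illustration.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    # With an odd count, the median itself is excluded from both halves
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

home_runs = [16, 25, 24, 19, 33, 25, 34, 46, 37, 33, 42,
             40, 37, 34, 49, 73, 42, 45, 45, 26, 28]

smallest, q1, q2, q3, largest = five_number_summary(home_runs)
iqr = q3 - q1
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in home_runs if x < low_fence or x > high_fence]

print(smallest, q1, q2, q3, largest)
print(iqr, low_fence, high_fence, outliers)
```

Under this convention the fences come out at -1.5 and 70.5, so the 73-home-run season is the only value flagged.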

FIGURE 2.46

• What generalizations can you make about the ages of 3 series vs. 7 series drivers?

• What can you say about the spread of ages of 5 series drivers?

FIGURE 2.43

• Why does China’s graph have no box?

• Why is Germany’s box missing the Q3 mark in the box?

• Why is the U.S. missing a smallest value mark?

SEC. 2.6 MEASURES OF THE CENTER OF DATA

• Mean for a sample: $\bar{x} = \dfrac{\text{sum of all values in the sample}}{\text{number of values in the sample}}$

• Mean for a population: $\mu = \dfrac{\text{sum of all values in the population}}{\text{number of values in the population}}$

• Mode: Most frequent value
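
As a quick illustration of these measures of center, the sketch below uses Python's standard `statistics` module on the travel-time data from the stem-plot slide. Reusing that data set here is a choice made for this example.

```python
from statistics import mean, median, mode

# Travel times to school from the stem-plot exercise
times = [15, 21, 17, 55, 11, 31, 25, 14, 15, 3]

avg = mean(times)    # sum of all values / number of values
mid = median(times)  # middle of the sorted data (average of the two middle values here)
most = mode(times)   # most frequent value

print(avg, mid, most)
```

The mean (20.7) sits above the median (16) because the single large value 55 pulls it upward.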

USING A CALCULATOR TO FIND MEASURES OF CENTER

Enter your data values (do both L1 and L2)

Hit the “STAT” key to get here

Scroll down for more…

FINDING AVERAGES ONLY KNOWING GROUPED VALUES

• Find the midpoint of each interval

• Multiply the midpoint by the frequency

• Mean = (Sum of Frequency × Midpoint) / Total Frequency
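
The three steps above can be sketched as a small function. The intervals and frequencies below are hypothetical values chosen only to show the arithmetic; they are not the data from any slide. The function name `grouped_mean` is also made up for this illustration.

```python
def grouped_mean(intervals, frequencies):
    """Estimate the mean from grouped data using interval midpoints."""
    total = sum(frequencies)
    weighted = sum((low + high) / 2 * f            # midpoint x frequency
                   for (low, high), f in zip(intervals, frequencies))
    return weighted / total

# Hypothetical grouped data (illustration only)
intervals = [(0, 4), (5, 9), (10, 14)]
frequencies = [2, 5, 3]

estimate = grouped_mean(intervals, frequencies)
print(estimate)  # (2*2 + 7*5 + 12*3) / 10
```

The result is an estimate, since every value in an interval is treated as if it sat at the midpoint.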

PRACTICE: FIND THE MEAN OF CAPPUCCINOS SOLD PER HOUR

Cappuccinos

0-3

4-7

8-11

12-15

16-19

3

2

3

8

Frequency

2

If 13 cappuccinos were sold from 7am to 8am, is this above or below the average number of cappuccinos?

SEC. 2.7: SKEWNESS AND THE MEAN, MEDIAN AND MODE

This is a symmetrical distribution.

What would be true of the mean, median and mode?

SKEWED TO THE LEFT

Write the set of data.

• Find the mean, median and mode. How do they compare?

Note: The mean is less than the mode when skewed left.

SKEWED RIGHT

Write the set of data.

• Find the mean, median and mode. How do they compare?

Note: The mean is larger than the mode when skewed right.
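
A small numeric check of the two notes above. Both data sets are hypothetical and were built for this illustration (the left-skewed set is the mirror image of the right-skewed one). The ordering shown is typical of skewed data, not guaranteed for every data set.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed data: a long tail toward large values
right = [1, 2, 2, 2, 3, 3, 4, 5, 9]
# Mirror image gives left-skewed data: a long tail toward small values
left = [10 - x for x in right]

for name, data in (("skewed right", right), ("skewed left", left)):
    print(name, round(mean(data), 2), median(data), mode(data))
```

For the right-skewed set the mean is above the median, which is above the mode; for the left-skewed set the order reverses.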

SEC. 2.8: MEASURES OF THE SPREAD OF DATA

KEY TERMS

• If x is a number, then the difference "x – mean" is called its deviation.

• The variance is the average of the squares of the deviations.

• The standard deviation is a number that measures how far data values are from their mean. It is the square root of the variance.

• For a sample, $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$

• For a population, $\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}}$

The z-score of a value is how many standard deviations from the mean it is: $z\text{-score} = \dfrac{\text{value} - \text{mean}}{\text{standard deviation}}$

USE YOUR CALCULATOR TO FIND THE STANDARD DEVIATION

Given the set of sample data for wait times at a doctor’s office, find the standard deviation of wait times.

10, 12, 5, 32, 21, 23, 7

Make sure to identify whether it is sample standard deviation or population standard deviation.

Would a wait time of 30 minutes be very “unusual”? (Within 2 standard deviations is considered normal.)
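
The calculator result can be cross-checked with Python's standard `statistics` module. This sketch treats the wait times as a sample, as the problem states, so it uses `stdev` (divides by n – 1); `pstdev` (divides by N) is shown only for comparison.

```python
from statistics import mean, stdev, pstdev

wait_times = [10, 12, 5, 32, 21, 23, 7]  # sample data, in minutes

x_bar = mean(wait_times)
s = stdev(wait_times)        # sample standard deviation (n - 1 in the denominator)
sigma = pstdev(wait_times)   # population standard deviation (N in the denominator)

# z-score: how many standard deviations a 30-minute wait is from the mean
z = (30 - x_bar) / s

print(round(x_bar, 2), round(s, 2), round(sigma, 2), round(z, 2))
print("unusual" if abs(z) > 2 else "within 2 standard deviations")
```

With s of roughly 9.86 minutes, a 30-minute wait is about 1.45 standard deviations above the mean, so it falls within the 2-standard-deviation range the slide treats as normal.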

FINDING STANDARD DEVIATION FROM A FREQUENCY TABLE

• Enter midpoints into L1 and frequencies into L2

• Select STAT, then 1-Var Stats, L1, L2 and Enter

• Would a height of 152 cm or a height of 172 cm be farther from the average in this class?
