stats ch 1 notes packet - Grayslake North High School

AP STATS
CH 1 EXPLORING DATA
Name:____________
AP Stats Fathom Intro & Exploration 2013
Part 1 – Exploring Variables
1) Open the file called “Fathom Intro 2013” which should have been emailed to you.
2) The box you see is called the collection box. Drag the bottom right corner of the box to see each individual case
in the collection.
3) Double click on one of the cases to open what is called an Inspection Window. Click on each tab (Cases,
Measures, Comments and Display) to familiarize yourself with the information given in the Inspector.
a.
How much money does the person in your case spend on his/her haircut? _________________
4) Double click on several cases (any order is fine). Try to get a sense of how much money students in our class
spend on a haircut. About how much money do people spend? _____________________________________
_________________________________________________________________________________________
_________________________________________________________________________________________
5) You may exit the inspection collection window. Minimize the collection box by dragging the bottom right corner
toward the top left corner (opposite of what you did to expand). With the collection box highlighted, drag down
a new case table from the tool bar (it is to the right of the collection box icon). Adjust the table so that you can
see all the table headings but only a few cases within the data. See diagram below.
6) Drag down a new graph from the tool bar. Drag the word Height from the case table onto the x-axis and
release. Drag each of the other attributes (headings) onto the x-axis of the same graph.
a. What different types of graphical representations were used? _______________________________
______________________________________________________________________________________
______________________________________________________________________________________
b. Why were the graphical representations not all the same? ___________________________________
______________________________________________________________________________________
______________________________________________________________________________________
______________________________________________________________________________________
Part 2 – Exploring Dot Plots and Histograms
7) Drag the HairCut attribute onto the x-axis of the graph. Fathom automatically graphs a dot plot. What does
each dot represent?
_________________________________________________________________________________________
_________________________________________________________________________________________
8) Go to the upper right hand corner of the graph window and change the dot plot to a histogram. Compare and
contrast a dot plot and a histogram. Be sure to include benefits and drawbacks of each.
_________________________________________________________________________________________
_________________________________________________________________________________________
_________________________________________________________________________________________
9) How do you think a histogram is created? _______________________________________________________
_________________________________________________________________________________________
_________________________________________________________________________________________
10) Highlight the first histogram bar by clicking on it once and keep the hand positioned on the bar. Now look in the
very lower left hand corner of the computer screen at the dialogue box. Record what is in the dialogue box and
explain what it means.
_________________________________________________________________________________________
_________________________________________________________________________________________
11) Try highlighting other bars and look at the dialogue box to get a better feel for what is being graphed.
12) Double click on the x-axis. A text box will appear with information about the graph. How big is the bin width?
________ Double click on this number, type in 10 and hit enter. Does the graph look different? If so, how?
_________________________________________________________________________________________
_________________________________________________________________________________________
_________________________________________________________________________________________
13) Change the bin width back to 5.
14) Pull down a new graph and drag the Money (how much money was in one’s pocket) attribute onto the x-axis.
Change it to a histogram. Double click on the x-axis and use the same bin width used for the HairCut histogram.
15) One of the nice features of Fathom is how it can display highlights in more than one window. Highlight a bar in
the Money histogram. What is highlighted in the HairCut histogram? Is that what you would expect? Explain why
or why not? _________________________________________________________________________________
_________________________________________________________________________________________
_________________________________________________________________________________________
_________________________________________________________________________________________
16) Delete the Money graph.
Part 3 – Comparing Relationships
17) Drag the Gender attribute onto the y-axis of the graph that has HairCut on the x-axis. It will create a split graph. Does
the graph look how you would expect it to? Explain why or why not? ____________________________
_________________________________________________________________________________________
_________________________________________________________________________________________
_________________________________________________________________________________________
18) Drag the Money attribute onto the y-axis of the graph that has HairCut on the x-axis. Does there seem to be a
relationship between the amount of money someone in our sample spends on a haircut and the amount of
money in his/her pocket?____________________________________________________________________
_________________________________________________________________________________________
_________________________________________________________________________________________
_________________________________________________________________________________________
19) Drag the Gender attribute on top of the Money vs. HairCut graph. It will create a color-coded graph looking at
the data by gender.
20) Select a point that seems far from the majority of points. Does the point you picked represent a male or a
female? ____________ Describe the amount of money he/she spends on a haircut and the amount of money in
his/her pocket relative to other people in AP Stats? ______________________________________
_________________________________________________________________________________________
_________________________________________________________________________________________
_________________________________________________________________________________________
21) Think back to the beginning of the lab when you tried to get a better sense of how much money people in the
sample spent on their haircut by clicking on the individual cases. Explain how your understanding of this data
has changed as a result of exploring the features of Fathom.
_________________________________________________________________________________________
_________________________________________________________________________________________
_________________________________________________________________________________________
_________________________________________________________________________________________
22) Delete any existing graphs.
Part 4 – Exploring on Your Own
23) Create a new blank graph.
24) Experiment with various attributes on the graph and look for interesting things about students in AP Stats.
Choose two graphs to copy and paste in Microsoft Word and answer these questions for each graph:
• Give a description of what each graph displays.
• Describe the relationship between the variables in the context of the situation.
• Pick one data point of interest on each graph to discuss in detail.
Homework: Skim through Chapter P in the book. Become familiar with the book and what it has to offer. I read
the book the first time I taught the class and it was readable and helpful.
AP Stats Ch P Quantitative and Qualitative Notes
Warm ups
Try to match the vocabulary on the left with the definition on the right.
Categorical variable (qualitative)   The objects described by a set of data
Quantitative variable                Where we deliberately do something to individuals in order to observe their responses.
Individual                           Any characteristic of an individual; can take on different values for different individuals.
Variable                             Places an individual into one of several groups or categories
Distribution                         Observe individuals and measure variables of interest but we do not attempt to influence the responses.
Observational Study                  Takes numerical values for which arithmetic operations such as adding and averaging make sense
Experiment                           Tells us what values the variable takes and how often it takes those values.
1. What is the difference between an observational study and an experiment?
2. What are some advantages/disadvantages of each?
3. When might you use each?
4. What do you think is important to know about the data that is to be or has been collected?
Graphs for Qualitative Data
Graphs for Quantitative Data
5. Why don’t we use a bar graph or pie chart for quantitative data?
6. What is a distribution? Where have you seen one before?
For the following questions use the student data sheet (use pg 38-47 in text as reference):
7. #Athletic: _____ # Clubs: _____ # Both: _____ # Neither: _____
Create a bar graph showing the number of students in each category. What type of data is this?
8. Using the height data, create a dot plot below. What type of data is this?
9. Using the height data, make a regular stem and leaf plot below:
10. Using the height data, broken up by gender, make a back-to-back stem plot below:
Male
Female
11. What is the purpose of a back-to-back stemplot?
12. What is the mean height of the class?
13. What is the median height for the class?
14. Explain one benefit and one drawback for a Stemplot or Dot Plot.
Name
Paper Airplane and Histogram Activity
How far can a paper airplane travel? This data collection activity will try to find out. All pairs will follow these standards:
1) Select a single paper airplane for use. (Why?)
2) Distances will be measured in feet.
3) Distance will be measured along the direction of release, as illustrated below.
Launch your plane 20 times. Record the flight distance for each trial below. Round to the nearest foot.
4) Below, make a histogram of your flight distances.
o Step 1: Determine class width & identify classes
o Step 2: Prepare frequency distribution
Class Limits
Tally
Frequency
o Step 3 & 4: Label/title axes & graph, draw bars to represent frequency in each class.
5) Describe the graph you have constructed. Be sure to discuss the following: center, shape, and spread.
AP Stats 1.1b Histograms, Center, Shape, and Spread Notes
Goals: Understand and be able to create Histograms.
Date:________________________
The above histograms are for college students.
What can you conclude from the above histograms?__________________________________________________
Why do you think this is?_________________________________________________________________________
Does your personal experience back this up?_________________________________________________________
1) Why use a histogram to represent a distribution?
a. __________________________________________________________________________
b. __________________________________________________________________________
c. __________________________________________________________________________
2) Main Features:
a. Shape:
i. ____________________________________________________________________
ii. ____________________________________________________________________
____________________________________________________________________
b. Center: _________________________________________________________________________
c. Spread: _________________________________________________________________________
d. Major Peaks: ____________________________________________________________________
e. Outliers: ________________________________________________________________________
3) Describe the shape of the following histograms:
___________________
______________________
_____________________ ___________________
__________________
4) How to create a histogram by hand:
Prepare a histogram to present the following data set. Use 6 classes.
Time it takes for us to get to school in minutes.
Class Limits   Tally   Frequency
Step 1: Determine class width & identify classes
Step 2: Prepare frequency distribution
Step 3 & 4, Label/title axes & graph, draw bars to represent frequency in each class.
5) Prepare a histogram to present the following data set. Use 6 classes and your calculator.
See page 59 for calculator instructions. Use the data below:
Time spent studying for math per night
6) Nurses on the eighth floor of Community Hospital believe they need extra staffing at night. To
estimate the night workload, a random sample of 35 nights was used. For each night, the total
number of room calls to the nurses’ station on the eighth floor was recorded as follows:
68, 60, 69, 70, 83, 58, 90
86, 71, 71, 92, 80, 95, 70
74, 46, 18, 84, 82, 75, 63
101, 77, 102, 86, 85, 73, 86
62, 100, 90, 37, 88, 70, 87
\sum x = 2649
Use five classes. Describe the shape/center of the distribution.
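One way to check Steps 1 and 2 for the nurse data is a short Python sketch. It assumes a common textbook convention that may differ from the one used in class: class width = (largest value - smallest value) / number of classes, increased to the next whole number, with the first class starting at the smallest value. The function name frequency_table is made up for this illustration.

```python
import math

# Room calls per night, copied from problem 6 above (35 nights).
calls = [68, 60, 69, 70, 83, 58, 90, 86, 71, 71, 92, 80, 95, 70, 74, 46, 18, 84,
         82, 75, 63, 101, 77, 102, 86, 85, 73, 86, 62, 100, 90, 37, 88, 70, 87]

def frequency_table(data, num_classes):
    """Return (class_width, [(lower, upper, frequency), ...]) for whole-number data.

    Assumed convention: width = (max - min) / num_classes, increased to the
    next whole number; the first class starts at the minimum.
    """
    low, high = min(data), max(data)
    width = math.floor((high - low) / num_classes) + 1
    table = []
    for k in range(num_classes):
        lower = low + k * width
        upper = lower + width - 1            # class limits are inclusive
        freq = sum(lower <= x <= upper for x in data)
        table.append((lower, upper, freq))
    return width, table

width, table = frequency_table(calls, 5)
print("class width:", width)
for lower, upper, freq in table:
    # One tally mark per night, followed by the frequency.
    print(f"{lower:>3}-{upper:<3} {'|' * freq:<16} {freq}")
```

The frequencies should add up to 35. If they do not, a value was missed or counted twice in the tally.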
AP Stats 1.1c Rel. Freq., Percentiles, and Ogive Notes
Warm UP
If your doctor told you that you were in the 60th percentile for height, what exactly would that mean?
Percentile: __________________________________________________________________________
____________________________________________________________________________________
____________________________________________________________________________________
Relative Cumulative Frequency Graph: _____________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
Ogives vs. Histograms: __________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
Creating an Ogive:
Step 1:
Step 2:
Step 3:
Step 4:
Step 5:
Using the table below, make an Ogive.
Daily High Temperatures During the Aspen Ski Season (Fahrenheit)
Class Limits   Frequency   Relative Frequency   Cumulative Frequency   Relative Cumulative Frequency
10-20          23          15.2%                23                     15.2%
20-30          43          28.5%                66                     43.7%
30-40          51          33.8%                117                    77.5%
40-50          27          17.9%                144                    95.4%
50-60          7           4.6%                 151                    100%
Total          151
What temperature corresponds to the 60th percentile?
Find the center of the distribution. Interpret this.
The temperature of 45° corresponds to what percentile? Interpret this.
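Reading an ogive amounts to interpolating along the straight line segments between class boundaries. The Python sketch below does that for the Aspen table. It assumes the days inside each class are spread evenly across the class, which is what the straight segments of a hand-drawn ogive imply. The function names are made up for this illustration.

```python
# Class boundaries and frequencies from the Aspen table above.
boundaries = [10, 20, 30, 40, 50, 60]
frequencies = [23, 43, 51, 27, 7]

total = sum(frequencies)

# Relative cumulative frequency (percent) at each boundary; 0% at the lowest one.
cum_percent = [0.0]
running = 0
for f in frequencies:
    running += f
    cum_percent.append(100 * running / total)

def value_at_percentile(p):
    """Go across from p% on the vertical axis, then down to the temperature axis."""
    for i in range(1, len(boundaries)):
        if p <= cum_percent[i]:
            frac = (p - cum_percent[i - 1]) / (cum_percent[i] - cum_percent[i - 1])
            return boundaries[i - 1] + frac * (boundaries[i] - boundaries[i - 1])
    return boundaries[-1]

def percentile_of_value(t):
    """Go up from temperature t, then across to the percent axis."""
    for i in range(1, len(boundaries)):
        if t <= boundaries[i]:
            frac = (t - boundaries[i - 1]) / (boundaries[i] - boundaries[i - 1])
            return cum_percent[i - 1] + frac * (cum_percent[i] - cum_percent[i - 1])
    return 100.0

print(f"60th percentile: about {value_at_percentile(60):.1f} degrees")
print(f"center (50th percentile): about {value_at_percentile(50):.1f} degrees")
print(f"45 degrees is at about the {percentile_of_value(45):.0f}th percentile")
```

A value read from a hand-drawn ogive will only agree with these numbers to within a degree or so, which is fine for this exercise.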
Make an ogive for the height of our students.
Make an ogive for the amount of money our students have in their wallets.
Time Plots:
Example 1:
Pyramid Lake, Nevada, is described as the pride of the Paiute Indian Nation. It is a beautiful desert lake famous for very
large trout. The elevation of the lake surface (feet above sea level) varies according to the annual flow of the Truckee
River from Lake Tahoe. The US Geological Survey provided the following data:
Year   Elevation
1986   3817
1987   3815
1988   3810
1989   3812
1990   3808
1991   3803
1992   3798
1993   3797
1994   3795
1995   3797
1996   3802
1997   3807
1998   3811
1999   3816
2000   3817
Make a time plot displaying the data.
Hw 1.1
Describe the distribution at right.
2.
AP Stats 1.2 Boxplots and Outliers
Warm Ups
Define
1) Mean:
2) Median:
3) Mode:
Notes
3) Quartiles:
4) Five Number Summary:
5) Outliers:
6) Outlier Rule:
7) Mark the quartiles on the boxplots below. Describe what the histograms for these data sets would look like.
8) How to create a box plot:
See #9 (skip the identifying outliers step)
9) How to create a modified box plot:
i) Organize the data (if necessary)
ii) Find the five-number summary
iii) Identify any outliers using the 1.5*IQR rule
iv) Sketch graph using your 5 number summary and marking any outliers
• Note: whiskers from the box go to the ______ ____
10) How many calories are in a serving of cheese pizza? An article in Consumer Reports gave the calories in a 5-ounce
serving of supermarket cheese pizza. The calories are:
332, 364, 393, 347, 350, 353, 357, 296
358, 322, 337, 323, 333, 299, 316, 275
Find the mean and median. Construct a stemplot then use it to prepare a five-number summary and a box plot. Be sure
to identify any outliers and calculate the interquartile range.
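To check hand work on the pizza data, here is a minimal Python sketch of the five-number summary and the 1.5*IQR rule. It takes each quartile as the median of one half of the sorted data, and it assumes that the overall median is left out of both halves when n is odd. Other software may use a different quartile rule and give slightly different values. The function names are made up for this illustration.

```python
from statistics import mean, median

# Calories per 5-ounce serving, from problem 10 above.
calories = [332, 364, 393, 347, 350, 353, 357, 296,
            358, 322, 337, 323, 333, 299, 316, 275]

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max), with quartiles as medians of the two halves.

    Assumption: when n is odd, the overall median is left out of both halves.
    """
    xs = sorted(data)
    half = len(xs) // 2
    lower, upper = xs[:half], xs[-half:]
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

def outliers(data):
    """Return the observations that fall outside the 1.5*IQR fences."""
    _, q1, _, q3, _ = five_number_summary(data)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]

print("mean:", mean(calories))
print("five-number summary:", five_number_summary(calories))
print("outliers:", outliers(calories))
```

The same two functions can be reused on the bachelor's degree and credit hour data in the next two problems.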
11) What percentage of the general US population have bachelor’s degrees? The Statistical Abstract of the United
States, 120th Edition, gives the percentage of bachelor’s degrees by state.
22, 26, 24, 17, 27, 38, 34, 24, 22, 22
26, 21, 26, 18, 22, 26, 20, 21, 23, 35
31, 21, 32, 19, 23, 24, 20, 20, 27, 31
25, 27, 24, 22, 26, 24, 27, 24, 27, 21
26, 19, 24, 28, 28, 32, 29, 19, 24, 22
Find the mean and median. Prepare a five-number summary and a modified box plot using your calculator. Be sure to
identify any outliers and calculate the interquartile range.
12) Bellview College must make a report to the budget committee about the average credit hour load a full-time
student takes. (A 12-credit-hour load is the minimum requirement for full-time status. For the same tuition, students
may take up to 20 credit hours.) A random sample of 40 students yielded the following information (in credit hours):
17, 12, 14, 17, 13, 16, 18, 20, 13, 12
12, 17, 16, 15, 14, 12, 12, 13, 17, 14
15, 12, 15, 16, 12, 18, 20, 19, 12, 15
19, 14, 16, 17, 15, 19, 12, 13, 12, 15
Prepare a five-number summary and a modified box plot using your calculator. Be sure to identify any outliers and
calculate the interquartile range.
1.1 Hw
AP Stats 1.2 Variance and Standard Deviation
Warm Ups
Two separate polls were done regarding favorite radio stations. The ages of the participants for the polls are below.
Construct a dot plot with the data and find the mean and median age of each poll.
Poll 1: 20, 60, 20, 60, 20, 60, 40
Poll 2: 40, 40, 40, 40, 40, 40, 40
1) What conclusions can you draw based on the measures of center and the dot plots?
2) Suppose you were given just the mean and the median. What other information would be useful in
understanding the data that was used/collected?
Variance (of a sample):
s^2 = \frac{1}{n-1} \sum (x - \bar{x})^2
or
s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \dots + (x_n - \bar{x})^2}{n-1}
Standard Deviation:
s = \sqrt{\frac{1}{n-1} \sum (x - \bar{x})^2}
3) Big Blossom Greenhouse gathered a random sample of mature peak blooms from a test product
named Hybrid B. The eight blossoms had these widths (in inches):
5, 5, 5, 6, 6, 6, 7, 8
a) What is the range of the data?
b) What is the interquartile range?
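c) Calculate the variance and standard deviation. Complete the table, one row per blossom:
x     x - \bar{x}     (x - \bar{x})^2
\sum x = ____     \sum (x - \bar{x}) = ____     \sum (x - \bar{x})^2 = ____
n = ____     \bar{x} = \frac{1}{n} \sum x = ____
Variance: s^2 = \frac{1}{n-1} \sum (x - \bar{x})^2 = ____
Standard Deviation: s = ____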
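The column-by-column calculation for the blossom widths can be mirrored in a few lines of Python. This is only a sketch for checking hand work. It prints the deviation and squared deviation for each blossom, then divides the sum of squares by n - 1. The last line cross-checks the result against the standard library's statistics.variance.

```python
from math import sqrt
from statistics import variance

widths = [5, 5, 5, 6, 6, 6, 7, 8]     # blossom widths in inches, from problem 3

n = len(widths)
x_bar = sum(widths) / n

print(f"{'x':>3} {'x - x_bar':>10} {'(x - x_bar)^2':>14}")
sum_sq = 0.0
for x in widths:
    dev = x - x_bar                    # deviation from the mean
    sum_sq += dev ** 2                 # running sum of squared deviations
    print(f"{x:>3} {dev:>10.2f} {dev ** 2:>14.2f}")

s_squared = sum_sq / (n - 1)           # divide by n - 1 for a sample
s = sqrt(s_squared)
print(f"n = {n}, x_bar = {x_bar}, sum of squares = {sum_sq}")
print(f"s^2 = {s_squared:.4f}, s = {s:.4f}")

# Cross-check against the standard library's sample variance.
assert abs(s_squared - variance(widths)) < 1e-9
```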
4) A sample of different landscapes in Mesa Verde National Park was taken over a 2-year period, and the
number of deer per square kilometer was determined. The results were (deer per square kilometer):
30, 20, 5, 29, 58, 7, 20, 18, 4, 29, 22, 9
a) Calculate the range, sample mean, sample variance, and sample standard deviation.
5) Hatching success of game birds is a topic discussed in the book Wildlife Management Techniques.
What percentage of Canada goose nests are successful (at least one gosling survives)? Studies in
Montana, Illinois, Wyoming, Utah, and California gave the following percentage of successful nests:
23.9, 52.5, 68.5, 78.6, 71, 17.8, 57.5, 59, 52
a) Calculate the range, sample mean, sample variance, and sample standard deviation.
6) Using the height data from our two AP Stats classes, calculate the range, sample mean, sample variance, and
sample standard deviation (use your class data sheet from the beginning of the year).
7) If s = 0 what does that tell us about the spread? What if s is very close to 0?
8) Is the standard deviation resistant to outliers or not? Explain.
9) When is it best to use five-number summary (or box plot) to describe a set of data as opposed to
the mean and standard deviation? When are the mean and standard deviation useful?
Variance and Standard deviation homework.
Special Problem # 1 – Travel Times
Choose a partner and complete. Due Tomorrow.
How long does it take? Here are the travel times from home to work, in minutes, for 15 workers in North Carolina,
chosen at random by the Census Bureau:
30, 20, 10, 40, 25, 20, 10, 60, 15, 40, 5, 30, 12, 10, 10
1. Create a stem plot of these data, describe the distribution, and find the mean.
2. How many of the 15 travel times are greater than the mean?
3. What is the percentage?
4. If we leave out the single longest travel time, what is the mean of this distribution?
5. How many of the 14 travel times are greater than the mean?
6. What is the percentage?
7. Find the median of the original distribution and the altered distribution (without 60).
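Questions 1 through 7 compare how the mean and the median react when the longest travel time is removed. The Python sketch below is one way to check the arithmetic. The helper name summarize is made up for this illustration.

```python
from statistics import mean, median

# Travel times in minutes for the 15 North Carolina workers listed above.
nc_times = [30, 20, 10, 40, 25, 20, 10, 60, 15, 40, 5, 30, 12, 10, 10]

def summarize(times):
    """Return (mean, median, count above the mean, percent above the mean)."""
    m = mean(times)
    above = sum(t > m for t in times)
    return m, median(times), above, 100 * above / len(times)

without_longest = sorted(nc_times)[:-1]    # leave out the single longest time

for label, data in [("all 15 times", nc_times), ("longest removed", without_longest)]:
    m, med, above, pct = summarize(data)
    print(f"{label}: mean = {m:.2f}, median = {med}, "
          f"{above} of {len(data)} above the mean ({pct:.0f}%)")
```

Comparing the two printed lines shows how far each measure of center moves when a single large value is dropped.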
Travel times in New York State are longer than North Carolina. Here are the travel times, in minutes, for 20 randomly
chosen New York workers:
10, 30, 5, 25, 40, 20, 10, 15, 30, 20
15, 20, 85, 15, 65, 15, 60, 60, 40, 45
8. Make a stem plot and describe the distribution, then find the median.
9. Find the 5 number summaries of the original North Carolina data set and the New York data set.
10. Would the third quartiles change if the last number changed to 600 in both data sets? Why?
11. Draw parallel box plots. (see calculator output on pg. 82) Use two stat plots on your calculator if not doing by hand.
12. Find the standard deviation and variance of each distribution.
13. Which distribution is more variable? How can you tell?
14. Describe the shape of both distributions. (Include how you know this)
15. Showing your work, determine any outliers for both distributions.
AP Statistics Section 1.2 Linear Transformations
Warm ups
1) What does linear mean, and what makes something linear? What makes something nonlinear? (Think in terms
of an equation and powers)
2) What does the word transform mean?
3) What does it mean to make a linear transformation?
Please find the mean and standard deviation (using a calculator) for each of the columns below.
List 1: 2167, 2189, 2245, 2011, 1999, 2272
Mean =
Std. Dev. =
List 2: Subtract 147 from each of the entries in column 1. (L1 - 147) → L2
Mean =
Std. Dev. =
List 3: Divide each entry from Column 1 by 64. (L1/56) → L3
Mean =
Std. Dev. =
4) What happened to the mean from L1 → L2? L1 → L3?
5) What happened to the standard deviation from L1 → L2? L1 → L3?
What about variance?
LINEAR TRANSFORMATIONS
Linear transformations: When every value of the variable x is transformed into a new value x_new given by the equation
x_new = a + bx.
For example, x_new = 3 + 2x.
Original Data (x): 3, 4, 6, 8, 12, 15, 20
Median: ____  Mean: ____  Range: ____  IQR: ____  St. Dev.: ____  Variance: ____
Add 4 to each value in the original data and complete the table.
x_new = 4 + x: 7, 8, 10, 12, 16, 19, 24
Median: ____  Mean: ____  Range: ____  IQR: ____  St. Dev.: ____  Variance: ____
Multiply each value in the original data by 3 and complete the table.
x_new = 3x: 9, 12, 18, 24, 36, 45, 60
Median: ____  Mean: ____  Range: ____  IQR: ____  St. Dev.: ____  Variance: ____
Multiply each value in the original data by 2 and add 3 and complete the table.
x_new = 3 + 2x: 9, 11, 15, 19, 27, 33, 43
Median: ____  Mean: ____  Range: ____  IQR: ____  St. Dev.: ____  Variance: ____
Compare the values for each measurement by column. How is each summary statistic of x affected by the linear
transformation x_new = a + bx?
Median new =
Mean new =
Range new =
IQR new =
St. Dev. new =
Variance new =
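A quick way to test a conjecture about these "new" statistics is to compute both columns and compare. The Python sketch below does this for x_new = 3 + 2x using the original data set from this page. The quartile rule (medians of the halves, with the middle value left out when n is odd) is an assumption and may differ from your calculator's. The helper names are made up for this illustration.

```python
from statistics import mean, median, stdev, variance

x = [3, 4, 6, 8, 12, 15, 20]           # original data from this page
a, b = 3, 2                            # x_new = 3 + 2x
x_new = [a + b * v for v in x]

def iqr(data):
    """Q3 - Q1, with quartiles as medians of the halves (middle value left out if n is odd)."""
    xs = sorted(data)
    half = len(xs) // 2
    return median(xs[-half:]) - median(xs[:half])

def summary(data):
    """Collect the six summary statistics used in the worksheet rows."""
    return {
        "median": median(data),
        "mean": mean(data),
        "range": max(data) - min(data),
        "IQR": iqr(data),
        "st_dev": stdev(data),
        "variance": variance(data),
    }

old, new = summary(x), summary(x_new)
for name in old:
    print(f"{name:>8}: {old[name]:8.3f} -> {new[name]:8.3f}")
```

In the output, the measures of center follow a + b(old value), the range, IQR, and standard deviation are multiplied by |b|, and the variance is multiplied by b^2. Adding a shifts the center but leaves every measure of spread alone.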
a) Suppose a teacher gave a test for which \bar{x} = 70 and s = 21. He wants to apply a linear transformation
x_new = a + bx to “scale” the grades so that \bar{x}_new = 82 and s_new = 7. Find a and b.
b) Now create a data set where n = 15, \bar{x} = 70 and s = 21. Perform the linear transformation on your data using the
values of a and b that you found in part a above. Afterwards, verify that \bar{x}_new = 82 and s_new = 7.
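One possible approach to parts a and b, sketched in Python: since the standard deviation is multiplied by |b| and the mean becomes a + b(mean), solve for b first and then for a. The sketch assumes b > 0 so that the order of the grades is kept. For part b it starts from 15 hypothetical values (1 through 15) and rescales them to have mean 70 and s = 21. Any 15 values that are not all equal would work.

```python
from fractions import Fraction
from statistics import mean, stdev

old_mean, old_s = 70, 21
new_mean, new_s = 82, 7

# s_new = |b| * s, so b = 7/21 (taking b > 0 keeps the order of the grades).
b = Fraction(new_s, old_s)
# mean_new = a + b * mean, so a = 82 - b * 70.
a = new_mean - b * old_mean
print("a =", a, " b =", b)

# Part b: build 15 scores with mean 70 and s = 21 from hypothetical starting values.
raw = list(range(1, 16))
raw_mean, raw_s = mean(raw), stdev(raw)
scores = [old_mean + old_s * (r - raw_mean) / raw_s for r in raw]

# Apply the scaling transformation and check the new mean and standard deviation.
scaled = [float(a) + float(b) * score for score in scores]
print("original:", round(mean(scores), 6), round(stdev(scores), 6))
print("scaled:  ", round(mean(scaled), 6), round(stdev(scaled), 6))
```

Building the scores by standardizing and rescaling is itself a linear transformation, which is why the target mean and standard deviation come out exactly (up to rounding).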
Exploring Univariate Data Ch1 Rev
Measures of Spread
Range: Range = maximum – minimum
Interquartile Range (IQR): IQR = Q3 - Q1.
Quartiles:
The first quartile (Q1) is the value below which 25% of the observations fall. It is the median of the first half of
the set of observations.
The third quartile (Q3) is the value below which 75% of the observations fall. It is the median of the second half
of the set of observations.
Note: IQR is typically used to describe spread when Median is used to describe center.
Five number summary: Min, Q1, Median, Q3, Max
Outliers: An observation is called an outlier if it lies more than 1.5 × IQR above Q3 or more than 1.5 × IQR below Q1.
Variance (s^2): The variance is roughly the average of the squared differences between each observation and the
mean.
s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \dots + (x_n - \bar{x})^2}{n-1}
or
s^2 = \frac{1}{n-1} \sum (x_i - \bar{x})^2
Standard deviation (s): The standard deviation is the square root of variance.
s = \sqrt{\frac{1}{n-1} \sum (x_i - \bar{x})^2}
Note: Variance and Standard Deviation are used to measure spread when the mean is used to describe center.
Note: When the distribution is approximately symmetric, the mean and standard deviation are generally used to
summarize the distribution. If the distribution is skewed, a five number summary is generally used.
Use class data.
Find each of the following for the distribution of height (inches).
Five number summary:
IQR:
Are there any outliers in the distribution of height?
Find the variance and standard deviation.
Construct a boxplot for the height (inches) using the grid as a guide.
[Blank grid; axis: Height (inches), marked 0, 40, 50, 60, 70]
Construct parallel boxplots for the height for men and women using the grid as a guide.
[Blank grid with rows for Men and Women; axis: Height (inches), marked 0, 40, 50, 60, 70]
Using the boxplots above, compare and contrast the distributions of height for men and women. (Shape, center, and
spread)
In each of the following settings, give the values of a and b for the linear transformation
x_new = a + bx that express the change in measurement units. Then explain how the transformation will affect the mean, the IQR, the median, and the standard deviation of the original
distribution.
a) You collect data on the power of car engines, measured in horsepower. Your teacher requires
you to convert the power to watts. One horsepower is 746 watts.
b) You measure the temperature (in degrees Fahrenheit) of your school’s swimming pool at 20
different locations within the pool. Your swim team coach wants the summary statistics in
degrees Celsius (F = (9/5)C + 32).
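As a check on the setup, each unit change can be written in the form x_new = a + bx. The worked lines below are a sketch of one way to get a and b. Part b requires solving the given formula for C first, because the data start in degrees Fahrenheit.

```latex
% Requires the amsmath package for align*.
\begin{align*}
% (a) horsepower to watts: multiply by 746, nothing added
x_{\text{new}} &= 0 + 746\,x
  &&\Rightarrow\; a = 0,\quad b = 746 \\
% (b) Fahrenheit to Celsius: solve F = (9/5)C + 32 for C
C &= \tfrac{5}{9}(F - 32) = -\tfrac{160}{9} + \tfrac{5}{9}F
  &&\Rightarrow\; a = -\tfrac{160}{9} \approx -17.8,\quad b = \tfrac{5}{9} \\
% Effect on the summary statistics (same rules in both parts)
\bar{x}_{\text{new}} &= a + b\,\bar{x}, \qquad
  \text{median}_{\text{new}} = a + b\cdot\text{median} \\
\text{IQR}_{\text{new}} &= |b|\cdot\text{IQR}, \qquad
  s_{\text{new}} = |b|\,s
\end{align*}
```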
Morning review 2013 #1 a, 2012 # 3 a, 2009 #1 a,b