Probability and Statistics Curriculum Guide

Mathematics
Probability
and Statistics
Curriculum Guide
Revised 2010
This page is intentionally left blank.
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Introduction
PRINCE WILLIAM COUNTY SCHOOLS
The Mathematics Curriculum Guide serves as a guide for teachers when planning instruction and assessment. It defines the content knowledge, skills,
and understandings that are measured by the Standards of Learning assessment. It provides additional guidance to teachers as they develop an
instructional program appropriate for their students. It also assists teachers in their lesson planning by identifying essential understandings, defining
essential content knowledge, and describing the intellectual skills students need to use. This Guide delineates in greater specificity the content that all
teachers should teach and all students should learn.
The format of the Curriculum Guide facilitates teacher planning by identifying the key concepts, knowledge, and skills that should be the focus of
instruction for each objective. The Curriculum Guide is divided into sections: Curriculum Information, Essential Knowledge and Skills, Key
Vocabulary, Essential Questions and Understandings, Teacher Notes and Elaborations, Resources, and Sample Instructional Strategies and Activities.
The purpose of each section is explained below.
Curriculum Information:
This section includes the objective, focus or topic, and in some, not all, foundational objectives that are being built upon.
Essential Knowledge and Skills:
Each objective is expanded in this section. What each student should know and be able to do in each objective is outlined. This is not meant to be an
exhaustive list nor is a list that limits what is taught in the classroom. This section is helpful to teachers when planning classroom assessments as it is a
guide to the knowledge and skills that define the objective.
Key Vocabulary:
This section includes vocabulary that is key to the objective and many times the first introduction for the student to new concepts and skills.
Essential Questions and Understandings:
This section delineates the key concepts, ideas, and mathematical relationships that all students should grasp to demonstrate an understanding of the
objectives.
Teacher Notes and Elaborations:
This section includes background information for the teacher. It contains content that is necessary for teaching this objective and may extend the
teachers’ knowledge of the objective beyond the current grade level. It may also contain definitions of key vocabulary to help facilitate student learning.
Resources:
This section lists various resources that teachers may use when planning instruction. Teachers are not limited to only these resources.
Sample Instructional Strategies and Activities:
This section lists ideas and suggestions that teachers may use when planning instruction.
1
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
PRINCE WILLIAM COUNTY SCHOOLS
Probability and Statistics in Prince William County is a semester course. The following chart lists the objectives for the Probability and Statistics
Curriculum organized by topic. Specific objectives have been selected from the VDOE Probability and Statistics Standards to meet the objectives of this
semester course. The chart organizes the objectives by topic. The Prince William County cross-content vocabulary terms that are in this course are:
analyze, compare and contrast, conclude, evaluate, explain, generalize, question/inquire, sequence, solve, summarize, and synthesize.
Topic
Descriptive Statistics
Data Collection
Probability
Inferential Statistics
Objectives
PS 1, PS 2, PS 3, PS 4
PS 8, PS 9
PS 11, PS 12, PS 13, PS 16
PS 17
2
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Descriptive Statistics
Virginia Standard PS.1
The student will analyze graphical displays
of univariate data including dotplots (line
plot), stemplots (stem-and-leaf plot), and
histograms, to identify and describe patterns
and departures from patterns, using central
tendency, spread, clusters, gaps, and
outliers. Appropriate technology will be
used to create graphical displays.
Essential Knowledge and Skills
Key Vocabulary
The student will use problem solving,
mathematical communication, mathematical
reasoning, connections and representations
to:
• Create and interpret graphical displays of
data, including dotplots, stemplots and
histograms.
• Examine graphs of data for clusters and
gaps, and relate those phenomena to the data
in context.
• Examine graphs of data for outliers, and
explain the outlier(s) within the context of
the data.
• Examine graphs of data, and identify the
central tendency of the data as well as the
spread. Explain the central tendency and the
spread of the data within the context of the
data.
Key Vocabulary
cluster
dotplot (line plot)
gap
histogram
mean
measures of dispersion
median
mode
outliers
population
spread
stemplots (stem-and-leaf plot)
univariate data
PRINCE WILLIAM COUNTY SCHOOLS
Essential Questions and Understandings
Teacher Notes and Elaborations
Essential Questions
• What are different methods by which data can be displayed?
• How do measures of dispersion describe data?
Essential Understandings
• Data are collected for a purpose and have meaning in a context.
• Measures of central tendency describe how the data cluster or group.
• Measures of dispersion describe how the data spread (disperse) around the center of the data.
• Graphical displays of data may be analyzed informally.
• Data analysis must take place within the context of the problem.
Teacher Notes and Elaborations
Univariate refers to an expression, equation, function, or polynomial of only one variable. Univariate
data involves a single variable per case. A categorical variable places an individual into one of several
groups or categories. A quantitative variable takes numerical values for which arithmetic operations
such as adding and averaging make sense. Quantitative data is often displayed on a histogram and
categorical data is often displayed on a bar chart. A histogram is a bar graph that represents the
frequency distribution of a data set. It has a horizontal scale that is quantitative and measures the data
values, has a vertical scale that measures the frequencies of the classes, and the consecutive bars must
touch.
Population is the entire set of individuals from which samples are drawn. It is the set of people or
things (units) that is being investigated.
A measure of central tendency is a value that represents a typical or central entry of a data set. The
three most commonly used measures of central tendency are the mean, the median, and the mode.
The mean of a data set is the sum of the data entries divided by the number of entries. It is the balance
point of a distribution. To find the mean of a data set, use one of the following formulas.
∑x
∑x
Population Mean: µ =
Sample Mean: x =
N
n
The lowercase Greek letter µ (pronounced mu) represents the population mean and x (read as: “x
bar”) represents the sample mean. Note that N represents the number of entries in a population and n
represents the number of entries in a sample. The symbol Σ, for sum, means to add up all the values of
x.
(continued)
3
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Descriptive Statistics
Virginia Standard PS.1
The student will analyze graphical displays
of univariate data including dotplots (line
plot), stemplots (stem-and-leaf), and
histograms, to identify and describe patterns
and departures from patterns, using central
tendency, spread, clusters, gaps, and
outliers. Appropriate technology will be
used to create graphical displays.
PRINCE WILLIAM COUNTY SCHOOLS
Essential Questions and Understandings
Teacher Notes and Elaborations
Teacher Notes and Elaborations (continued)
The median is the midpoint of a distribution, the number such that half the data set is smaller and the other half is larger. To find the median of a
distribution:
1. Arrange all data in the set in order of size, from smallest to largest.
n +1
data
2. If the number n of data in the set is odd, the median M is the center in the ordered list. The location of the median is found by counting
2
up from the bottom of the list.
3. If the number n of data in the set is even, the median M is the average of the two center data in the ordered list. The location of the median is again
n +1
from the bottom of the list.
2
The mode is a peak of the distribution. There may be one, more than one, or no mode.
A cluster is a naturally occurring subgroup of a population used in sampling. On a plot, a cluster is a group of data “clustering” close to the same value,
away from other groups.
A gap is a difference as in between two totals. On a plot, a gap is the space that separates clusters of data.
A stemplot (stem and leaf plot) is similar to a histogram but has the advantage that the graph still contains the original data values. In a stemplot, each
number is separated into a stem (the entries leftmost digits) and a leaf (the rightmost digit).
A dotplot is used to graph quantitative data. Each data entry is plotted using a point above its value on a horizontal axis. Like a stemplot, a dotplot
illustrates how data are distributed, determines specific data entries, and identifies unusual data values (outliers). An outlier is a data entry that is far
removed from the other entries in the data set. It is data that falls outside the overall pattern of the graph.
Spread is the degree to which values in a distribution differ. Measures of variability or spread, for quantitative variables include the standard deviation,
interquartile range, and range. Statisticians like to measure and analyze the measures of dispersion (spread) of the data set about the mean in order to
assist in making inferences about the population. One measure of spread would be to find the sum of the deviations between each element and the mean
whose sum is always zero. There are two methods to overcome this mathematical dilemma:
1. take the absolute value of the deviations before finding the average; or
2. square the deviations before finding the average.
The mean absolute deviation uses the first method and the variance and standard deviation uses the second.
4
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Descriptive Statistics
Virginia Standard PS.1
Resources
PRINCE WILLIAM COUNTY SCHOOLS
Sample Instructional Strategies and Activities
Text:
Elementary Statistics: Picturing the World, 3rd
Edition, 2006, Larson and Farber, Pearson
Prentice Hall
Elementary Statistics: A Step by Step
Approach, Sixth Edition, 2007, Bluman,
Glencoe McGraw-Hill
PWC Mathematics website
http://pwcs.math.schoolfusion.us/
Mathematics SOL Resources
www.doe.virginia.gov/instruction/high_school/
mathematics/index.shtml
5
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Descriptive Statistics
Virginia Standard PS.2
The student will analyze numerical
characteristics of univariate data sets to
describe patterns and departure from
patterns, using mean, median, mode,
variance, standard deviation, interquartile
range, range, and outliers.
Essential Knowledge and Skills
Key Vocabulary
PRINCE WILLIAM COUNTY SCHOOLS
Essential Questions and Understandings
Teacher Notes and Elaborations
The student will use problem solving,
mathematical communication, mathematical
reasoning, connections and representations
to:
• Interpret mean, median, mode, range,
interquartile range, variance, and standard
deviation of a univariate data set in terms of
the problem’s context.
• Identify possible outliers, using an
algorithm.
• Explain the influence of outliers on a
univariate data set.
• Explain ways in which standard deviation
addresses dispersion by examining the
formula for standard deviation.
Essential Questions
• Why is data collected?
• What is an outlier and how does it influence a data set?
• Do all dispersions contain an outlier?
• How are measures of central tendency used?
• What is meant by the spread of the data?
Key Vocabulary
deviation
dispersion
interquartile range
mean
median
mode
outlier
range
standard deviation
variance
The median of a data set is the value that lies in the middle of the data when the data set is ordered. If
the data set has an odd number of entries, the median is the middle data entry. If the data set has an
even number of entries, the median is the mean of the two middle data entries.
Essential Understandings
• Data are collected for a purpose and have meaning within a context.
• Analysis of the descriptive statistical information generated by a univariate data set should include
the interplay between central tendency and dispersion as well as among specific measures.
• Data points identified algorithmically as outliers should not be excluded from the data unless
sufficient evidence exists to show them to be in error.
Teacher Notes and Elaborations
The mean of a data set is the sum of the data entries divided by the number of entries.
The mode of a data set is the data entry that occurs with the greatest frequency. If no data entry is
repeated, the data set has no mode. If two entries occur with the same greatest frequency, each entry
is a mode and the data set is called bimodal.
The range of a data set is the difference between the maximum and minimum data entries in the set.
Range = (Maximum data entry) – (Minimum data entry)
Deviation of a data entry in a population data set is the difference between the entry and the mean µ
of the data set. It is the difference from the mean x − x , or other measure of center.
Deviation of x = x − µ
Dispersion is the degree to which the values of a frequency distribution are scattered around some
central point, usually the arithmetic mean or median.
The standard deviation is the positive square root of the variance of the data set. The greater the value
of the standard deviation, the more spread out the data are about the mean. The lesser (closer to 0) the
value of the standard deviation, the closer the data are clustered about the mean.
(continued)
6
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Descriptive Statistics
Virginia Standard PS.2
The student will analyze numerical
characteristics of univariate data sets to
describe patterns and departure from
patterns, using mean, median, mode,
variance, standard deviation, interquartile
range, range and outliers.
PRINCE WILLIAM COUNTY SCHOOLS
Essential Questions and Understandings
Teacher Notes and Elaborations
Teacher Notes and Elaborations (continued)
Another characteristic of data is its level of measurement. The level of measurement determines which statistical calculations are meaningful. The four
levels of measurement, in order from lowest to highest, are nominal, ordinal, interval, and ratio. The following table summarizes the four levels of
measurement.
Level of Measurement
Nominal
(Qualitative data)
Ordinal
(Qualitative or quantitative data)
Interval
(Quantitative data)
Ratio
(Quantitative data)
Meaningful Calculations
Put in a category.
Put in a category and put in order.
Put in a category, put in order, and find differences
between values.
Put in a category, put in order, find differences
between values, and find ratios of values.
A variance is a measure of spread equal to the square of the standard deviation. The average of the squared deviations from the mean is known as the
variance, and is another measure of the spread of the elements in a data set.
n
∑ (x − µ)
2
i
2
To calculate variance use (σ ) =
i =1
, where µ represents the mean of the data set, n represents the number of elements in the data set, and xi
n
represents the ith element of the data set. (This is the formula that will be used on the new Algebra I SOL assessment and included on the formula sheet for
the Algebra EOC SOL.)
The differences between the elements and the arithmetic mean are squared so that the differences do not cancel each other out when finding the sum.
When squaring the differences, the units of measure are squared and larger differences are “weighted” more heavily than smaller differences. In order to
provide a measure of variation in terms of the original units of the data, the square root of the variance is taken, yielding the standard deviation.
The standard deviation is the positive square root of the variance of the data set. The greater the value of the standard deviation, the more spread out the
data are about the mean. The lesser (closer to 0) the value of the standard deviation, the closer the data are clustered about the mean.
∑(x − µ)
2
i
To calculate standard deviation use (σ ) =
i =1
, where µ represents the mean of the data set, n represents the number of elements in the data set,
n
and xi represents the i element of the data set. (This is the formula that will be used on the new Algebra I SOL assessment and included on the formula
sheet for the Algebra EOC SOL.)
th
Often, textbooks will use two distinct formulas for standard deviation. In these formulas, the Greek letter “σ ”, written and read “sigma”, represents the
standard deviation of a population, and “s” represents the sample standard deviation. The population standard deviation can be estimated by calculating
the sample standard deviation. The formulas for sample and population standard deviation look very similar except that in the sample standard deviation
formula, n – 1 is used instead of n in the denominator. The reason for this is to account for the possibility of greater variability of data in the population
than what is seen in the sample. When n – 1 is used in the denominator, the result is a larger number. Therefore, the calculated value of the sample
standard deviation will be larger than the population standard deviation. As sample sizes get larger (n gets larger), the difference between the sample
standard deviation and the population standard deviation gets smaller. The use of n – 1 to calculate the sample standard deviation is known as Bessel’s
correction.
(continued)
7
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Descriptive Statistics
Virginia Standard PS.2
The student will analyze numerical
characteristics of univariate data sets to
describe patterns and departure from
patterns, using mean, median, mode,
variance, standard deviation, interquartile
range, range, and outliers.
PRINCE WILLIAM COUNTY SCHOOLS
Essential Questions and Understandings
Teacher Notes and Elaborations
Teacher Notes and Elaborations (continued)
To locate the center of distribution, divide the data into a lower and upper half. Find the values that divide each half in half again. These two values, the
lower quartile, Q1 and the upper quartiles, Q3, together with the median, divide the data into fourths. The interquartile range or measure of spread is the
distance between upper and lower quartiles (IQR = Q3 – Q1). IQR represents 50% of the data. Outliers are unusual data values. Typically, outliers are 1.5
IQR away from Q1 and Q3. It is possible that distributions will not contain outliers. For example, in a normal distribution, there are no outliers.
Appropriate technology will be used to calculate statistics.
8
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Descriptive Statistics
Virginia Standard PS.2
Resources
PRINCE WILLIAM COUNTY SCHOOLS
Sample Instructional Strategies and Activities
Text:
Elementary Statistics: Picturing the World, 3rd
Edition, 2006, Larson and Farber, Pearson
Prentice Hall
Elementary Statistics: A Step by Step
Approach, Sixth Edition, 2007, Bluman,
Glencoe McGraw-Hill
PWC Mathematics website
http://pwcs.math.schoolfusion.us/
Mathematics SOL Resources
www.doe.virginia.gov/instruction/high_school/
mathematics/index.shtml
9
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Descriptive Statistics
Virginia Standard PS.3
The student will compare distributions of
two or more univariate data sets, analyzing
center and spread (within group and
between group variations, clusters and gaps,
shapes, outliers, or other unusual features.
Essential Knowledge and Skills
Key Vocabulary
The student will use problem solving,
mathematical communication, mathematical
reasoning, connections and representations
to:
• Compare and contrast two or more
univariate data sets by analyzing measures
of center and spread within a contextual
framework.
• Describe any unusual features of the data,
such as clusters, gaps, or outliers, within the
context of the data.
• Analyze in context kurtosis and skewness in
conjunction with other descriptive measures.
Key Vocabulary
kurtosis
measure of center
normal distribution
skewness
statistical tendency
PRINCE WILLIAM COUNTY SCHOOLS
Essential Questions and Understandings
Teacher Notes and Elaborations
Essential Questions
• How can unusual data be described?
• How is statistical tendency used?
Essential Understandings
• Data are collected for a purpose and have meaning in a context.
• Statistical tendency refers to typical cases but not necessarily to individual cases.
Teacher Notes and Elaborations
A measure of center is a single number summary that measures the “center” of a distribution; usually
the mean (or average) is used. Median and mode are also measures of center.
A normal distribution is a useful probability distribution that has a symmetric bell or mound shape
and tails extending infinitely in both directions.
Normal distributions are a family of probability models that assign probabilities to events as areas
under a curve. The normal curves are symmetric and bell-shaped. A specific normal curve is
completely described by giving its mean, µ, and its standard deviation, σ.
Kurtosis is a measure of the concentration of a distribution about its mean.
Statistical tendency is a way in which something (data) typically behaves or happens or is likely to
react, behave or happen.
Skewness is a measure of the symmetry of a distribution about its mean. If the mean equals the median
there is no skew, the distribution is symmetric. If the mean is smaller than the median the distribution
will be left skewed. If the mean is larger than the median the distribution will be right skewed.
Distributions are skewed when they show bunching at one end and a long tail stretching out in the
other direction.
Data are useful only if it can be organized and presented so the meaning is clear. Two principles that
are useful in exploring and analyzing data are:
1. Examine each variable by itself, and then look at the relationship among the variables.
2. Begin with a graph or graphs then add numerical summaries of specific aspects of the data. In
any graph of data, the overall pattern can be described by its shape, center, and spread.
Outliers fall outside the overall pattern.
Appropriate technology will be used to generate graphical displays.
10
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Descriptive Statistics
Virginia Standard PS.3
Resources
PRINCE WILLIAM COUNTY SCHOOLS
Sample Instructional Strategies and Activities
Text:
Elementary Statistics: Picturing the World, 3rd
Edition, 2006, Larson and Farber, Pearson
Prentice Hall
Elementary Statistics: A Step by Step
Approach, Sixth Edition, 2007, Bluman,
Glencoe McGraw-Hill
PWC Mathematics website
http://pwcs.math.schoolfusion.us/
Mathematics SOL Resources
www.doe.virginia.gov/instruction/high_school/
mathematics/index.shtml
11
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Descriptive Statistics
Virginia Standard PS.4
The student will analyze scatterplots to
identify and describe the relationship
between two variables, using shape; strength
of relationship; clusters; positive, negative,
or no association; outliers; and influential
points.
Essential Knowledge and Skills
Key Vocabulary
The student will use problem solving,
mathematical communication, mathematical
reasoning, connections and representations
to:
• Examine scatterplots of data, and describe
skewness, kurtosis, and correlation within
the context of the data.
• Describe and explain any unusual features
of the data, such as clusters, gaps or outliers,
within the context of the data.
• Identify influential data points (observations
that have great effect on a line of best fit
because of extreme x-values) and describe
the effect of the influential points.
Key Vocabulary
bivariate data
correlation
influential point
kurtosis
regression
scatterplot
PRINCE WILLIAM COUNTY SCHOOLS
Essential Questions and Understandings
Teacher Notes and Elaborations
Essential Questions
• How can graphs be used to examine data?
• What is the role of outliers in data observations?
• What is strength of an association between two variables?
Essential Understandings
• A scatterplot serves two purposes:
- to determine if there is a useful relationship between two variables, and
- to determine the family of equations that describes the relationship.
• Data are collected for a purpose and have meaning in a context.
• Association between two variables considers both the direction and strength of the association.
• The strength of an association between two variables reflects how accurately the value of one
variable can be predicted based on the value of the other variable.
• Outliers are observations with large residuals and do not follow the pattern apparent in the other
data points.
Teacher Notes and Elaborations
A scatterplot is a plot that shows the relationship between two quantitative variables, usually with
each case represented by a dot. On a scatterplot, an influential point is a point that strongly influences
the regression equation and correlation. To judge a point’s influence, fit a line and compute a
correlation first with, and then without, the point in question.
Kurtosis is a descriptive property of distributions designed to indicate the general form of
concentration around the mean.
A correlation is a numerical value between −1 and 1 inclusive that measures the strength and
direction of a linear relationship between two variables. Strength of an association between two
variables is strong if there is little variation within each vertical strip (conditional distribution of y
given x). If there is a lot of variation, the relationship is weak.
A regression is the statistical study of the relationship between two (or more) quantitative variables,
such as fitting a line to bivariate data.
Bivariate data is data that involve two variables per case. For quantitative variables, it is often
displayed on a scatterplot.
Influential points are points that cause large changes in parameter estimates when they are deleted. For
example, a substantially low score on a test (outlier) will affect the mean of the distribution. If deleted,
the measures of central tendency and dispersion will change.
(continued)
12
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Descriptive Statistics
Virginia Standard PS.4
The student will analyze scatterplots to
identify and describe the relationship
between two variables, using shape; strength
of relationship; clusters; positive, negative,
or no association; outliers; and influential
points.
PRINCE WILLIAM COUNTY SCHOOLS
Essential Questions and Understandings
Teacher Notes and Elaborations
Teacher Notes and Elaborations (continued)
If a logical relationship exists between two variables, a graph is used to plot the available data. A scatterplot contains an x (independent or explanatory)
value and a y (dependent or response) value.
A scatterplot serves two purposes:
1. it helps to see if there is a useful relationship between the two variables; and
2. it helps to determine the type of equation to use to describe the relationship.
Appropriate technology will be used to generate scatterplots and identify outliers and influential points.
13
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Descriptive Statistics
Virginia Standard PS.4
Resources
PRINCE WILLIAM COUNTY SCHOOLS
Sample Instructional Strategies and Activities
Text:
Elementary Statistics: Picturing the World, 3rd
Edition, 2006, Larson and Farber, Pearson
Prentice Hall
Elementary Statistics: A Step by Step
Approach, Sixth Edition, 2007, Bluman,
Glencoe McGraw-Hill
PWC Mathematics website
http://pwcs.math.schoolfusion.us/
Mathematics SOL Resources
www.doe.virginia.gov/instruction/high_school/
mathematics/index.shtml
14
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Data Collection
Virginia Standard PS.8
The student will describe the methods of
data collection in a census, sample survey,
experiment, and observational study and
identify an appropriate method of solution
for a given problem setting.
Essential Knowledge and Skills
Key Vocabulary
The student will use problem solving,
mathematical communication, mathematical
reasoning, connections and representations
to:
• Compare and contrast controlled
experiments and observational studies and
the conclusions one can draw from each.
• Compare and contrast population and
sample and parameter and statistic.
• Identify biased sampling methods.
• Describe simple random sampling.
• Select a data collection method appropriate
for a given context.
Key Vocabulary
biased
biased sampling
census
control group
experiment
observational study
parameter
population
sample survey
simple random sampling
statistic
PRINCE WILLIAM COUNTY SCHOOLS
Essential Questions and Understandings
Teacher Notes and Elaborations
Essential Questions
• What are the various methods of data collection?
• How does data collection affect conclusions for a problem?
• What are the differences between controlled experiments and observational studies?
• What determines whether a sample is biased?
Essential Understandings
• The value of a sample statistic varies from sample to sample if the simple random samples are
taken repeatedly from the population of interest.
• Poor data collection can lead to misleading and meaningless conclusions.
Teacher Notes and Elaborations
An experiment is when a treatment is assigned to a person, animal, or object, to observe a response.
A control group is a group in an experiment that provides a standard for comparison to evaluate the
effectiveness of a treatment; often given the placebo.
An observational study is a study in which the conditions of interest are already built into the units
being studied and are not randomly assigned.
Population is the set of people or things (units) that is being investigated.
Census is a count or measure of the entire population.
A parameter is a summary number that describes a population (usually unknown) or a probability
distribution. It is a numerical description of a population characteristic.
A statistic is any function of a number of random variables usually identically distributed, that may be
used as an estimator for a population parameter. A statistic is a numerical description of a sample
characteristic. The value of a statistic is known when a sample is taken, but it can change from sample
to sample.
A sampling method is biased if it tends to give samples in which some characteristic of the population
is underrepresented or overrepresented (biased sampling).
Sample selection bias (convenience sampling) is the extent to which a sampling procedure produces
samples that tend to result in numerical summaries that are systematically too high or too low.
Simple random sampling is a sample in which individuals are selected by using some random process.
A sample survey is an investigation of one or more characteristics of a population.
(continued)
15
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Data Collection
PRINCE WILLIAM COUNTY SCHOOLS
Essential Questions and Understandings
Teacher Notes and Elaborations
Teacher Notes and Elaborations (continued)
Statistical Studies
Virginia Standard PS.8
The student will describe the methods of
data collection in a census, sample survey,
experiment, and observational study and
identify an appropriate method of solution
for a given problem setting.
Observational
(Observe and measure
but do not modify)
Observe individuals and measure
variables of interest but do not influence
the responses. The purpose is to describe
some group or situation.
Differences
Experimental
(Apply some
treatment)
Impose some treatment on individuals in
order to observe their responses. The
purpose of an experiment is to study
whether the treatment causes a change in
the response.
16
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Data Collection
Resources
PRINCE WILLIAM COUNTY SCHOOLS
Sample Instructional Strategies and Activities
Text:
Elementary Statistics: Picturing the World, 3rd
Edition, 2006, Larson and Farber, Pearson
Prentice Hall
Virginia Standard PS.8
Elementary Statistics: A Step by Step
Approach, Sixth Edition, 2007, Bluman,
Glencoe McGraw-Hill
PWC Mathematics website
http://pwcs.math.schoolfusion.us/
Mathematics SOL Resources
www.doe.virginia.gov/instruction/high_school/
mathematics/index.shtml
17
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Data Collection
Virginia Standard PS.9
The student will plan and conduct a survey.
The plan will address sampling techniques
(e.g., simple random and stratified) and
methods to reduce bias.
Essential Knowledge and Skills
Key Vocabulary
The student will use problem solving,
mathematical communication, mathematical
reasoning, connections and representations
to:
• Investigate and describe sampling
techniques, such as simple random
sampling, stratified sampling, and cluster
sampling.
• Determine which sampling technique is
best, given a particular context.
• Plan a survey to answer a question or
address an issue.
• Given a plan for a survey, identify possible
sources of bias, and describe ways to reduce
bias.
• Design a survey instrument.
• Conduct a survey.
Key Vocabulary
biased sample
cluster
cluster sample
convenience sample
simple random sampling
stratified sampling
survey
PRINCE WILLIAM COUNTY SCHOOLS
Essential Questions and Understandings
Teacher Notes and Elaborations
Essential Questions
• What is required to plan and conduct a survey?
• What are sampling techniques and how do they reduce bias?
Essential Understandings
• The purpose of sampling is to provide sufficient information so that population characteristics
may be inferred.
• Inherent bias diminishes as sample size increases.
Teacher Notes and Elaborations
To survey is to look over or examine in detail. A survey is a detailed collection of information.
Surveys can be valuable in determining the attitude of a population about a candidate, product, or
issue. The most common types of surveys are done by interview, mail, or telephone. In designing a
survey, it is important to word the questions so they do not lead to biased results.
The design of a statistical study is biased if it systematically favors certain outcomes.
A biased sample has a distribution that is not determined only by the population from which it is
drawn, but also by some property that influences the distribution of the sample. Biased samples do not
represent the entire population of the study. For example, an opinion poll might be biased by
geographical location.
Another source of bias is voluntary response samples. These are biased because people with strong
opinions are more likely to respond.
A cluster is a naturally occurring subgroup of a population used in stratified sampling.
A cluster sample is when a population falls into a naturally occurring subgroup which has similar
characteristics.
Simple random sampling is the process of collecting samples devised to avoid any interference from
any shared property of, or relation between the elements selected, so that its distribution is affected
only by that of the whole population and can therefore be taken to be representative of it.
A stratified sample is a sample that is not drawn at random from the whole population, but is drawn
separately from a number of disjoint strata of the population in order to ensure a more representative
sample. To achieve a stratified random sample, divide the units of the sampling frame into nonoverlapping subgroups and choose a simple random sample from each subgroup.
A type of sample that often leads to biased studies is a convenience sample. A convenience sample
consists only of available members of the population or members of the population that are easiest to
reach. For this reason, this is not a recommended sampling technique.
18
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Data Collection
Resources
PRINCE WILLIAM COUNTY SCHOOLS
Sample Instructional Strategies and Activities
Text:
Elementary Statistics: Picturing the World, 3rd
Edition, 2006, Larson and Farber, Pearson
Prentice Hall
Virginia Standard PS.9
Elementary Statistics: A Step by Step
Approach, Sixth Edition, 2007, Bluman,
Glencoe McGraw-Hill
PWC Mathematics website
http://pwcs.math.schoolfusion.us/
Mathematics SOL Resources
www.doe.virginia.gov/instruction/high_school/
mathematics/index.shtml
19
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Probability
Virginia Standard PS.11
The student will identify and describe two
or more events as complementary,
dependent, independent, and/or mutually
exclusive.
Essential Knowledge and Skills
Key Vocabulary
The student will use problem solving,
mathematical communication, mathematical
reasoning, connections and representations
to:
• Define and give contextual examples of
complementary, dependent, independent,
and mutually exclusive events.
• Given two or more events in a problem
setting, determine if the events are
complementary, dependent, independent,
and/or mutually exclusive.
Key Vocabulary
complement
dependent event
independent event
mutually exclusive
PRINCE WILLIAM COUNTY SCHOOLS
Essential Questions and Understandings
Teacher Notes and Elaborations
Essential Questions
• What is meant by mutually exclusive?
• What is meant by independent/dependent outcomes?
• How are events defined and what are examples of each?
Essential Understandings
• The complement of event A consists of all outcomes in which event A does not occur.
• Two events, A and B, are independent if the occurrence of one does not affect the probability of
the occurrence of the other. If A and B are not independent, then they are said to be dependent.
• Events A and B are mutually exclusive if they cannot occur simultaneously.
Teacher Notes and Elaborations
The sum of the probabilities of all outcomes in a sample space is 1 or 100%. In the following
examples, the rectangle represents the total probability of the sample space.
Two events A and B are mutually exclusive if A and B cannot occur at the same time.
A and B are mutually exclusive.
A
B
The complement of event E is the set of all outcomes in a sample space that are not included in event
E. The complement of event E is denoted by E′ and is read as “E prime”.
E′
1
E
6
2
5
3
4
The area of the circle represents the probability of event E, and the area outside
the circle represents the probability of the complement of event E.
(continued)
20
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Probability
PRINCE WILLIAM COUNTY SCHOOLS
Essential Questions and Understandings
Teacher Notes and Elaborations
Teacher Notes and Elaborations (continued)
Two events are independent if the occurrence of one of the events does not affect the probability of the occurrence of the other event. Events that are not
independent are dependent events. An example of an independent event is the roll of a die and the flip of a coin.
Virginia Standard PS.11
The student will identify and describe two
or more events as complementary,
dependent, independent, and/or mutually
exclusive.
21
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Probability
Virginia Standard PS.11
Resources
PRINCE WILLIAM COUNTY SCHOOLS
Sample Instructional Strategies and Activities
Text:
Elementary Statistics: Picturing the World, 3rd
Edition, 2006, Larson and Farber, Pearson
Prentice Hall
Elementary Statistics: A Step by Step
Approach, Sixth Edition, 2007, Bluman,
Glencoe McGraw-Hill
PWC Mathematics website
http://pwcs.math.schoolfusion.us/
Mathematics SOL Resources
www.doe.virginia.gov/instruction/high_school/
mathematics/index.shtml
22
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Probability
Virginia Standard PS.12
The student will find probabilities (relative
frequency and theoretical), including
conditional probabilities for events that are
either dependent or independent, by
applying the Law of Large Numbers
concept, the addition rule, and the
multiplication rule.
PRINCE WILLIAM COUNTY SCHOOLS
Essential Knowledge and Skills
Key Vocabulary
The student will use problem solving,
mathematical communication, mathematical
reasoning, connections and representations
to:
• Calculate relative frequency and expected
frequency.
• Find conditional probabilities for dependent,
independent, and mutually exclusive events.
Key Vocabulary
addition rule
conditional probability
Law of Large Numbers
multiplication rule
relative frequency
theoretical probability
Essential Questions and Understandings
Teacher Notes and Elaborations
Essential Questions
• How are probabilities calculated?
• How is the Law of Large Numbers applied?
Essential Understandings
• Data are collected for a purpose and have meaning in a context.
• Venn diagrams may be used to find conditional probabilities.
• The Law of Large Numbers states that as a procedure is repeated again and again, the relative
frequency probability of an event tends to approach the actual probability.
Teacher Notes and Elaborations
Theoretical probability is used when each outcome in a sample space is equally likely to occur.
Number of outcomes in E
P( E ) =
Total number of outcomes in sample space
A conditional probability is the probability of an event occurring, given that another event has already
occurred. The conditional probability of event B occurring, given that event A has occurred, is
denoted by P(B|A) and is read as “probability of B, given A.
To find the probability of two events occurring in sequence, use the multiplication rule.
If events A and B are independent then the rule is: P(A and B) = P(A) · P(B)
To use the multiplication rule, first find the probability that the first event occurs, find the probability
the second event occurs given the first event has occurred, and then multiply these two probabilities.
Two events A and B are mutually exclusive if A and B cannot occur at the same time. The addition
rule for the probability of A or B states that the probability that events A or B will occur (A or B) is
given by:
P(A or B) = P(A) + P(B) – P(A and B).
If events A and B are mutually exclusive, then the rule can be simplified to:
P(A or B) = P(A) + P(B).
As the number of times a probability experiment is repeated, the empirical probability (relative
frequency) of an event approaches the theoretical probability of the event (The Law of Large
Numbers).
The relative frequency of a class is the portion or percentage of the data that falls in that class. To find
the relative frequency of a class, divide the frequency f by the sample size n.
Class frequency
Relative frequency =
Sample size
=
f
n
23
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Probability
Virginia Standard PS.12
Resources
PRINCE WILLIAM COUNTY SCHOOLS
Sample Instructional Strategies and Activities
Text:
Elementary Statistics: Picturing the World, 3rd
Edition, 2006, Larson and Farber, Pearson
Prentice Hall
Elementary Statistics: A Step by Step
Approach, Sixth Edition, 2007, Bluman,
Glencoe McGraw-Hill
PWC Mathematics website
http://pwcs.math.schoolfusion.us/
Mathematics SOL Resources
www.doe.virginia.gov/instruction/high_school/
mathematics/index.shtml
24
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Probability
Virginia Standard PS.13
The student will develop, interpret, and
apply the binomial probability distribution
for discrete random variables, including
computing the mean and standard deviation
for the binomial variable.
PRINCE WILLIAM COUNTY SCHOOLS
Essential Knowledge and Skills
Key Vocabulary
The student will use problem solving,
mathematical communication, mathematical
reasoning, connections and representations
to:
• Develop the binomial probability
distribution within a real-world context.
• Calculate the mean and standard deviation
for the binomial variable.
• Use the binomial distribution to calculate
probabilities associated with experiments
for which there are only two possible
outcomes.
Key Vocabulary
binomial distribution
probability distribution
Essential Questions and Understandings
Teacher Notes and Elaborations
Essential Questions
• How is the mean and standard deviation calculated for a binomial variable?
• What is a probability distribution?
• What is the relationship between variances and standard deviation?
• What is meant by binomial distribution?
• How are binomial probabilities determined?
• How can the binomial distribution be applied to real-world applications?
Essential Understandings
• A probability distribution is a complete listing of all possible outcomes of an experiment together
with their probabilities. The procedure has a fixed number of independent trials.
• A random variable assumes different values depending on the event outcome.
• A probability distribution combines descriptive statistical techniques and probabilities to form a
theoretical model of behavior.
Teacher Notes and Elaborations
A binomial experiment is a probability experiment that satisfies the following conditions:
1. The experiment is repeated for a fixed number n of trials, where each trial is independent of
the other trials.
2. There are only two possible outcomes of interest for each trial. The outcomes can be classified
as a success or as a failure.
3. The probability of a success is the same for each trial.
4. The random variable x counts the number of successful trials out of the n trials.
The parameters of a binomial distribution are n and p.
If data are produced in a binomial setting, then the random variable X = number of successes is called
a binomial random variable, and the probability distribution of X is called a binomial distribution. The
distribution of the count X of successes in the binomial setting is the binomial distribution with
parameters n and p. The parameter n is the number of observation, and p is the probability of a success
on any one observation. The possible values of X are the whole numbers from 0 to n or X is B(n, p).
There are several ways to find the probability of x successes in n trials. One way is to use the binomial
probability formula.
P( x ) = n Cx p x q n − x
n!
p xqn− x
( n − x)! x !
where: x = the number of successes in n trials
p = the probability of success in a single trial
q = probability of failure in a single trial
q=1–p
=
The mean and the standard deviation of a binomial distribution can be computed using the formulas:
µ = np and σ = np (1 − p ) .
25
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Probability
Virginia Standard PS.13
Resources
PRINCE WILLIAM COUNTY SCHOOLS
Sample Instructional Strategies and Activities
Text:
Elementary Statistics: Picturing the World, 3rd
Edition, 2006, Larson and Farber, Pearson
Prentice Hall
Elementary Statistics: A Step by Step
Approach, Sixth Edition, 2007, Bluman,
Glencoe McGraw-Hill
PWC Mathematics website
http://pwcs.math.schoolfusion.us/
Mathematics SOL Resources
www.doe.virginia.gov/instruction/high_school/
mathematics/index.shtml
26
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Probability
Virginia Standard PS.16
The student will identify properties of a
normal distribution and apply the normal
distribution to determine probabilities, using
a table or graphing calculator.
PRINCE WILLIAM COUNTY SCHOOLS
Essential Knowledge and Skills
Key Vocabulary
The student will use problem solving,
mathematical communication, mathematical
reasoning, connections and representations
to:
• Identify the properties of a normal
probability distribution.
• Describe how the standard deviation and the
mean affect the graph of the normal
distribution.
• Determine the probability of a given event,
using the normal distribution.
Key Vocabulary
continuous probability distribution
normal curve
normal distribution
Essential Questions and Understandings
Teacher Notes and Elaborations
Essential Questions
• What are the properties of a normal probability distribution?
• How does the standard deviation and mean affect the graph of the normal distribution?
• How is the probability of an event calculated?
Essential Understandings
• The normal distribution curve is a family of symmetrical curves defined by the mean and the
standard deviation.
• Areas under the curve represent probabilities associated with continuous distributions.
• The normal curve is a probability distribution and the total area under the curve is 1.
Teacher Notes and Elaborations
A continuous random variable has an infinite number of possible values that can be represented by an
interval on the number line. Its probability distribution is called a continuous probability distribution.
A normal distribution is a continuous probability distribution for a random variable x. The graph of a
normal distribution is called the normal curve and has the following properties:
1. The mean, median, and mode are equal.
2. The normal curve is bell shaped and is symmetric about the mean.
3. The total area under the normal curve is equal to one.
4. The normal curve approaches but never touches the x-axis as it extends farther and farther
away from the mean.
5. Between µ − σ and µ + σ (in the center of the curve) the graph curves downward. The
graph curves upward to the left of µ − σ and to the right of µ + σ . The points at which the
curve changes from curving upward to curving downward are called inflection points.
Inflection points
x
-3σ
-2σ
-1σ
µ
1σ
2σ
3σ
68%
95%
99.7%
27
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Probability
Virginia Standard PS.16
Resources
PRINCE WILLIAM COUNTY SCHOOLS
Sample Instructional Strategies and Activities
Text:
Elementary Statistics: Picturing the World, 3rd
Edition, 2006, Larson and Farber, Pearson
Prentice Hall
Elementary Statistics: A Step by Step
Approach, Sixth Edition, 2007, Bluman,
Glencoe McGraw-Hill
PWC Mathematics website
http://pwcs.math.schoolfusion.us/
Mathematics SOL Resources
www.doe.virginia.gov/instruction/high_school/
mathematics/index.shtml
28
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Inferential Statistics
Virginia Standard PS.17
The student, given data from a large sample,
will find and interpret point estimates and
confidence intervals for parameters. The
parameters will include proportion and
mean, difference between two proportions,
and difference between two means
(independent and paired).
Essential Knowledge and Skills
Key Vocabulary
The student will use problem solving,
mathematical communication, mathematical
reasoning, connections and representations
to:
• Construct confidence intervals to estimate a
population parameter, such as a proportion
or the difference between two proportions;
or a mean or the difference between two
means.
• Select a value for alpha (Type I error) for a
confidence interval.
• Interpret confidence intervals in the context
of the data.
• Explain the importance of random sampling
for confidence intervals.
• Calculate point estimates for parameters,
and discuss the limitations of point
estimates.
Key Vocabulary
confidence interval
parameter
point estimate
Type I error
PRINCE WILLIAM COUNTY SCHOOLS
Essential Questions and Understandings
Teacher Notes and Elaborations
Essential Questions
• Why are confidence intervals and tests of significance important?
• How is sampling used and why is it important?
Essential Understandings
• A primary goal of sampling is to estimate the value of a parameter based on a statistic.
• Confidence intervals use the sample statistic to construct an interval of values that one can be
reasonably certain contains the true (unknown) parameter.
• Confidence intervals and tests of significance are complementary procedures.
• Paired comparisons experimental design allows control for possible effects of extraneous
variables.
Teacher Notes and Elaborations
A parameter is a numerical description of a population characteristic.
A statistic is a numerical description of a sample characteristic.
A Type I error is the error of rejecting the null hypothesis when it is in fact true. In a hypothesis test,
the level of significance is the maximum allowable probability of making a Type I error. To decrease
the probability of a Type I error, decrease the significance level. Changing the sample size has no
effect of the probability of a Type I error. A Type I error is denoted by α , the lowercase Greek letter
alpha.
A point estimate is a single value estimate for a population parameter. The most unbiased point
estimate of the population means µ is the sample mean x .
Using a point estimate and a margin of error, an interval estimate of a population parameter such as
µ can be constructed. This interval estimate is called a confidence interval.
The margin of error is calculated by the z-score for the given confidence level times the standard error.
p (1 − p )
For proportions, the standard error can be computed using the formula σ =
.
n
A confidence interval for proportion is p plus or minus margin of error.
A c-confidence interval for the population mean µ is x − E < µ < x + E . The probability that the
confidence interval contains µ is c.
29
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
Curriculum Information
Topic
Inferential Statistics
Resources
PRINCE WILLIAM COUNTY SCHOOLS
Sample Instructional Strategies and Activities
Text:
Elementary Statistics: Picturing the World, 3rd
Edition, 2006, Larson and Farber, Pearson
Prentice Hall
Virginia Standard PS.17
Elementary Statistics: A Step by Step
Approach, Sixth Edition, 2007, Bluman,
Glencoe McGraw-Hill
PWC Mathematics website
http://pwcs.math.schoolfusion.us/
Mathematics SOL Resources
www.doe.virginia.gov/instruction/high_school/
mathematics/index.shtml
30
PROBABILITY AND STATISTICS CURRICULUM GUIDE (Revised 2010)
PRINCE WILLIAM COUNTY SCHOOLS
31