Chapter 13 Notes

Chapter 13
Statistics and Probability
Statistics and Probability – involve studying a group of individuals or objects (the POPULATION) and its
subsets (SAMPLES)
Statistics – used to make sense of data by organizing, summarizing and drawing conclusions.
a) Data are gathered by RANDOM SAMPLES – each member of the population has an equal
chance of being in the sample
2 Types of Data:
#1. Qualitative – data divided into categories, such as color of eyes, male or female
#2. Quantitative – numerical data, such as the number of miles home, time spent studying
Further broken down into:
a) discrete – if there is a minimum increment between 2 different values
b) continuous – if the difference between 2 values can be arbitrarily small
Ways to Display DATA:
#1. Frequency Table – displays the frequency of an occurrence by tally marks and the relative frequency
by a fraction, decimal or percent.
Ex.
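A minimal Python sketch of a frequency table. The eye-color data below are made up for illustration and are not from the textbook; the tally marks are printed as `|` characters.

```python
from collections import Counter

# Hypothetical sample: eye colors of 10 students (illustrative data only)
eye_colors = ["brown", "blue", "brown", "green", "brown",
              "blue", "brown", "hazel", "blue", "brown"]

counts = Counter(eye_colors)   # frequency of each category
n = len(eye_colors)

# Each entry: category -> (frequency, relative frequency as a decimal)
table = {color: (freq, freq / n) for color, freq in counts.items()}

for color, (freq, rel) in table.items():
    # Columns: category, tally marks, frequency, relative frequency as a percent
    print(f"{color:<6} {'|' * freq:<6} {freq:>2}  {rel:.0%}")
```

The relative frequencies always add up to 1 (100%), which is a quick check on the table.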
#2. Bar Graph – displays the categories on a horizontal axis and the frequencies or relative frequencies
on a vertical axis, or vice versa.
a) The height of the bar shows the frequency of the value
b) All bars should be the same width
Ex.
#3. Pie Chart – displays the categories and the relative frequencies.
a) Is divided into sectors; the central angle of each sector equals the category's relative frequency times 360°
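As a quick sketch of the pie chart rule, the sector angle for each category is its fraction of the data times 360. The category counts here are hypothetical.

```python
# Hypothetical category counts (illustrative data only)
counts = {"brown": 5, "blue": 3, "green": 1, "hazel": 1}
n = sum(counts.values())

# Central angle of each sector = (frequency / n) * 360 degrees
angles = {category: 360 * freq / n for category, freq in counts.items()}

for category, angle in angles.items():
    print(f"{category:<6} {angle:6.1f} degrees")
```

The angles add up to 360 because the relative frequencies add up to 1.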
Distribution Shapes:
a) Uniform = all data values have the same frequency
b) Symmetric = the right and left sides of the distribution have frequencies that are mirror images of
each other
c) Skewed right = the right side of the distribution has much lower frequencies than the left
d) Skewed left = the left side of the distribution has much lower frequencies than the right
Outlier – a data value that is far removed from the rest of the data
*usually caused by errors or by unusual members of the population
2 Ways to Display Quantitative Data:
#1. Stem Plot – usually displays small data sets
a) Leading digit/s represent the stems
b) Arrange the stems vertically from lowest to highest value
c) Last digit is the leaf – arrange the leaves horizontally from lowest to highest
d) Can see the shape of the distribution from the leaves
#2. Histogram – a bar graph with no gaps between bars
a) Divide the range of data into classes of equal width (CLASS INTERVALS), so that each data value
is in exactly one class
b) Draw horizontal axis and indicate the first value in each class interval
c) Draw vertical scale and label it with either frequencies or relative frequencies.
d) Draw rectangles with a width equal to the class interval and height equal to the frequency of
the data within each interval.
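The histogram steps above can be sketched in Python as follows. The data, the class width of 20, and the starting value of 0 are all assumptions chosen for illustration.

```python
# Hypothetical data: minutes spent studying by 12 students
data = [12, 25, 31, 38, 40, 44, 47, 52, 55, 58, 63, 79]

class_width = 20   # assumed width of every class interval
start = 0          # assumed lower edge of the first class interval
num_classes = 4    # classes: 0-20, 20-40, 40-60, 60-80 (upper edge excluded)

# Step a) put each data value in exactly one class
frequencies = [0] * num_classes
for x in data:
    index = (x - start) // class_width
    frequencies[index] += 1

# Steps b)-d) as a text histogram: one row per class, one '#' per data value
for i, freq in enumerate(frequencies):
    low = start + i * class_width
    print(f"{low:>2}-{low + class_width:<3} {'#' * freq}")
```

Excluding the upper edge of each class is one common convention; it guarantees that a value such as 40 is counted in only one class.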
13.2 Measures of Center and Spread
Measures of Center:
#1. Mean – the average; is affected by outliers
#2. Median – middle number in an ordered data set (if odd) (if even – take mean of 2 middle #’s);
is resistant to outliers
#3. Mode – the number with the highest frequency (occurs most often)
Measure of Center Distributions:
A) If a distribution is symmetric = the mean and median are equal
B) If a distribution is skewed left = the mean is to the left of the median
C) If a distribution is skewed right = the mean is to the right of the median
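A short sketch of the three measures of center using Python's standard `statistics` module. The data set is hypothetical and is skewed right by one large value, so the mean lands to the right of the median.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed data set: the value 40 acts as an outlier
data = [2, 3, 3, 4, 5, 6, 40]

data_mean = mean(data)       # pulled toward the outlier
data_median = median(data)   # middle value of the sorted data
data_mode = mode(data)       # value that occurs most often

# Same data with the outlier removed, for comparison
trimmed = data[:-1]
trimmed_mean = mean(trimmed)
trimmed_median = median(trimmed)

print(data_mean, data_median, data_mode)
print(trimmed_mean, trimmed_median)
```

Removing the outlier changes the mean a lot (from 9 to about 3.8) but the median only slightly (from 4 to 3.5).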
Measure of Spread (variability of data):
#1. Standard Deviation – measures the typical distance of a data element from the mean; the most
common measure of variability – best used if data is symmetric about the mean
a) Find the deviation of each data value from the mean: xi – x̄
b) Then square each deviation and find the average = VARIANCE
c) The square root of the variance = STANDARD DEVIATION
***However, if the data is taken from a sample rather than a population, it is common to divide
by n – 1 instead of n when taking the average of the squared deviations = SAMPLE STANDARD
DEVIATION (denoted by s)
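The three steps for standard deviation can be sketched as follows, using a small hypothetical data set. The variable names are illustrative.

```python
from math import sqrt

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data set
n = len(data)
x_bar = sum(data) / n              # mean

# a) deviation of each data value from the mean
deviations = [x - x_bar for x in data]

# b) square each deviation and average them (divide by n) = variance
squared = [d ** 2 for d in deviations]
variance = sum(squared) / n

# c) square root of the variance = standard deviation
std_dev = sqrt(variance)

# For a sample, divide by n - 1 instead of n
sample_std_dev = sqrt(sum(squared) / (n - 1))

print(variance, std_dev, round(sample_std_dev, 3))
```

The sample standard deviation is always a little larger than the population version computed from the same numbers, because the divisor is smaller.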
#2. Range – the difference between the maximum and minimum data values.
#3. Interquartile Range – a measure of variability that is resistant to extreme values, yet still gives a
good indication of the spread of the data. = Q3 – Q1
A) 5 Number Summary and Box Plot: consists of
1) Minimum, Q1, Median, Q3, Maximum
2) Minimum = lowest # in data set
3) Q1 = median of the lower half of the data
4) Median = middle number of the whole data set
5) Q3 = median of the upper half of the data
6) Maximum = biggest # in data set
**Box plot is a visual representation of 5 number summary
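A sketch of the 5 number summary and interquartile range. It assumes the median-of-halves convention (the overall median is left out of both halves when the number of values is odd); calculators and software may use slightly different quartile rules. The function name and data are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max); assumes at least 2 values."""
    s = sorted(values)
    half = len(s) // 2
    lower = s[:half]      # lower half of the data
    upper = s[-half:]     # upper half of the data
    return s[0], median(lower), median(s), median(upper), s[-1]

data = [1, 3, 4, 6, 7, 8, 10, 12, 15]   # hypothetical data set
minimum, q1, med, q3, maximum = five_number_summary(data)
iqr = q3 - q1                            # interquartile range = Q3 - Q1

print(minimum, q1, med, q3, maximum, iqr)
```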
13.3 Basic Probability
Experiment – any process that generates one or more observable outcomes.
a) Sample Space = the set of all possible outcomes
Ex. page 865
b) Event – any outcome or set of outcomes in the sample space.
Ex. in rolling a die, the set {1,3,5} is an event that can be described as rolling an odd number
Probability – of an event is a number from 0 to 1 that indicates how likely the event is to occur
a) Probability of 0 = event cannot occur
b) Probability of 1 = event must occur
c) Sum of the probabilities of all outcomes in the sample space is 1.
d) The probability of an event is the sum of the probabilities of the outcomes in the event
Probability Distribution – a table (or rule) that gives the probability of each outcome in the sample space
Mutually Exclusive Events – two events that have no outcomes in common
a) Cannot both occur in the same trial of an experiment
b) Find the probability of an event (E or F) by adding the individual probabilities
c) Complement of an Event = the set of all outcomes that are not contained in the event; if the
event has probability p, its complement has probability 1 – p
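A small sketch of the addition rule and the complement rule, using one roll of a fair six-sided die. The events E and F are chosen for illustration; exact fractions avoid rounding.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}   # one roll of a fair die
prob = {outcome: Fraction(1, 6) for outcome in sample_space}

def p(event):
    """Probability of an event = sum of the probabilities of its outcomes."""
    return sum(prob[outcome] for outcome in event)

E = {1, 3, 5}   # rolling an odd number
F = {6}         # rolling a six

assert E & F == set()          # mutually exclusive: no outcomes in common

p_E_or_F = p(E | F)            # equals p(E) + p(F) for mutually exclusive events
p_not_E = p(sample_space - E)  # complement of E, equals 1 - p(E)

print(p_E_or_F, p_not_E)
```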
Independent Events – if the occurrence or non-occurrence of one event has no effect on the probability
of the other event.
a) If 2 events are independent, then the probability of event (E and F) is the product of the
individual probabilities
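A sketch that checks the product rule by counting, assuming two rolls of a fair six-sided die. The events E and F are illustrative choices that depend on different rolls.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))   # 36 equally likely (first, second) pairs

def p(event):
    """Probability by counting equally likely outcomes."""
    return Fraction(len(event), len(rolls))

E = {r for r in rolls if r[0] % 2 == 0}   # first roll is even
F = {r for r in rolls if r[1] == 6}       # second roll is a six

p_E_and_F = p(E & F)                      # counted directly from the sample space

print(p(E), p(F), p_E_and_F)
```

Counting gives 3 of the 36 outcomes, or 1/12, which matches P(E) · P(F) = (1/2)(1/6).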
Differences between Mutually Exclusive and Independent:
Mutually Exclusive
1. Term often refers to 2 possible results for a SINGLE trial of an experiment.
2. The word “or” is often used to describe a pair of mutually exclusive events.
3. For mutually exclusive events E and F,
P(E or F) = P(E) + P(F) ** P(E ∪ F)
Independent
1. Term often refers to the results from 2 or MORE trials of an experiment or from
different experiments.
2. The word “and” is often used to describe a pair of independent events.
3. For independent events E and F,
P(E and F) = P(E) · P(F) ** P(E ∩ F)
Random Variable – a function that assigns a number to each outcome in the sample space of the
experiment.
Ex. rolling two dice: Random variable is the total of the numbers shown on the faces
a) Write sample space
b) Find range of random variable
c) List outcomes for which the value of 7 is assigned.
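Parts a) through c) of the two-dice example can be sketched as follows, assuming two standard six-sided dice and treating each outcome as an ordered pair.

```python
from itertools import product

# a) sample space: 36 ordered pairs (first die, second die)
sample_space = list(product(range(1, 7), repeat=2))

def total(outcome):
    """The random variable: total of the two faces shown."""
    return outcome[0] + outcome[1]

# b) range of the random variable: every value it can take
value_range = sorted({total(o) for o in sample_space})

# c) outcomes to which the value 7 is assigned
sevens = [o for o in sample_space if total(o) == 7]

print(len(sample_space), value_range)
print(sevens)
```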
Expected Value(Mean) of a Random Variable – the average value of the outcomes.
**If the experiment is repeated a large number of times, the average approaches the expected
value.
****TO CALCULATE THE EXPECTED VALUE FROM A PROBABILITY DISTRIBUTION = multiply
each value by its probability and add the results:
EX.
Sum of spins:   0      1      2      3      4      5      6
Probability:    1/16   1/8    3/16   1/4    3/16   1/8    1/16
Expected value = 0(1/16) + 1(1/8) + 2(3/16) + 3(1/4) + 4(3/16) + 5(1/8) + 6(1/16) = 3
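The expected value calculation for the sum-of-spins distribution can be checked with exact fractions. The dictionary below restates the values and probabilities from the example.

```python
from fractions import Fraction as F

# Sum of spins -> probability, as given in the example
distribution = {
    0: F(1, 16), 1: F(1, 8), 2: F(3, 16), 3: F(1, 4),
    4: F(3, 16), 5: F(1, 8), 6: F(1, 16),
}

assert sum(distribution.values()) == 1   # probabilities must add up to 1

# Multiply each value by its probability and add the results
expected_value = sum(value * p for value, p in distribution.items())

print(expected_value)
```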
13.4 Determining Probabilities
The exact probability of a real-world event usually cannot be known, so we estimate it in 2 ways:
#1. Experimental Estimates of Probability
a) ** As the number of trials of an experiment increases, the relative frequency of an outcome
approaches the probability of the outcome
P(E) ≈ (number of trials with an outcome in E) / n, where n = total number of trials
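A simulation sketch of an experimental estimate: roll a simulated fair die many times and track the relative frequency of an odd result. The function name, the seed, and the trial counts are arbitrary choices for illustration.

```python
import random

def relative_frequency(trials, seed=0):
    """Fraction of simulated die rolls that come up odd."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(trials) if rng.randint(1, 6) % 2 == 1)
    return hits / trials

# Estimates for an increasing number of trials
estimates = {n: relative_frequency(n) for n in (10, 100, 10_000, 100_000)}

for n, estimate in estimates.items():
    print(f"{n:>7} trials: {estimate:.4f}")
```

With more trials the estimate should settle near the theoretical value of 1/2, although any single run can still be off by a small amount.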
#2. Theoretical Estimates of Probability
a) Suppose an experiment has a sample space of n outcomes, all of which are equally likely.
Then the probability of each outcome is 1/n, and the probability of an event E is given by:
P(E) = (number of outcomes in E) / n
Ex. an experiment consists of spinning a spinner divided into 5 equal sections numbered
from 1 to 5. Suppose that all outcomes are equally likely:
a) Write a probability distribution for the experiment
b) Find the probability of the event that the spinner lands on a prime number
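A sketch of the spinner example, assuming all 5 sections are equally likely as stated. The helper `is_prime` is a hypothetical name for a simple primality check.

```python
from fractions import Fraction

sections = [1, 2, 3, 4, 5]

# a) probability distribution: each outcome has probability 1/n
distribution = {s: Fraction(1, len(sections)) for s in sections}

def is_prime(k):
    """True if k is prime (simple check, fine for small numbers)."""
    return k > 1 and all(k % d != 0 for d in range(2, k))

# b) P(prime) = (number of outcomes in the event) / n
primes = [s for s in sections if is_prime(s)]
p_prime = Fraction(len(primes), len(sections))

print(distribution)
print(primes, p_prime)
```

Note that 1 is not prime, so the event is {2, 3, 5} and the probability is 3/5.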
Fundamental Counting Principle (multiplication principle) = consider a set of k experiments. Suppose
the first experiment has n1 outcomes, the second has n2 outcomes and so on. Then the total number of
outcomes is n1•n2•…•nk for all k experiments.
Ex. Suppose there are 5 roads from Town A to Town B, 4 roads from Town B to Town C, and 6 roads
from Town C to Town D. How many different routes are there from Town A to Town D, passing through
both Towns B and C?
Ex. Do 3 coin toss tree diagram on page 880
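Both counting examples can be sketched in a few lines. The road counts come from the example above; the coin outcomes are listed by brute force, which plays the same role as a tree diagram. `math.prod` assumes Python 3.8 or later.

```python
from itertools import product
from math import prod

# Roads: A to B, B to C, C to D
roads = [5, 4, 6]
routes = prod(roads)    # multiply the number of choices at each stage

# Three coin tosses: 2 outcomes per toss, so 2 * 2 * 2 outcomes in all
tosses = list(product("HT", repeat=3))

print(routes)
print(["".join(t) for t in tosses])
```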
Permutations – if r items are chosen in order without replacement from n possible items, the number of
permutations is:
nPr = n! / (n – r)!
Combinations – if r items are chosen in any order without replacement from n possible items, the
number of combinations is:
nCr = n! / (r!(n – r)!)
**if each item is equally likely to be chosen, the permutations and combinations are all equally likely for
a given value of r.
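A sketch that evaluates both formulas for the illustrative values n = 5 and r = 3 and compares them with a brute-force listing. `math.perm` and `math.comb` assume Python 3.8 or later.

```python
from itertools import combinations, permutations
from math import comb, factorial, perm

n, r = 5, 3   # illustrative values

# Formulas from the notes
n_P_r = factorial(n) // factorial(n - r)
n_C_r = factorial(n) // (factorial(r) * factorial(n - r))

# Brute-force check: list every ordered and unordered choice of r items
items = range(n)
listed_permutations = len(list(permutations(items, r)))
listed_combinations = len(list(combinations(items, r)))

print(n_P_r, n_C_r)
```

Each combination of 3 items can be ordered in 3! = 6 ways, which is why nPr is 6 times nCr here.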
Ex. Suppose 5 tiles labeled with capital letters A,B,C,D and E are placed randomly in slots a,b,c,d and e.
What is the probability that each capital letter is matched with its lowercase counterpart?
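One way to check the tile example is to list every arrangement, assuming each slot holds exactly one tile and all arrangements are equally likely.

```python
from fractions import Fraction
from itertools import permutations

tiles = "ABCDE"
slots = "abcde"

# Every way to place the 5 tiles into slots a, b, c, d, e (in that order)
arrangements = list(permutations(tiles))

# Arrangements where every capital letter sits in its lowercase slot
matches = [a for a in arrangements if "".join(a).lower() == slots]

p_all_match = Fraction(len(matches), len(arrangements))

print(len(arrangements), p_all_match)
```

Only 1 of the 5! = 120 arrangements matches every letter, so the probability is 1/120.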
Ex. A bag contains 26 tiles, each labeled with a letter A through Z. Julie chooses 5 tiles at random from
the bag. What is the probability that she chooses the letters of her name in any order (matching all 5
letters)? What is the probability of matching 4 letters? 3 letters? 0 letters?
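A sketch of the Julie example using combinations. It assumes the tiles are drawn without replacement and reads "matching k letters" as exactly k of the 5 tiles being letters of the name (J, U, L, I, E are 5 different letters). The function name `p_match` is hypothetical.

```python
from fractions import Fraction
from math import comb

total_tiles = 26    # letters A through Z
drawn = 5           # tiles Julie chooses
name_letters = 5    # J, U, L, I, E

def p_match(k):
    """P(exactly k of the drawn tiles are letters of the name)."""
    # Choose k of the 5 name letters and the rest from the other 21 letters
    ways = comb(name_letters, k) * comb(total_tiles - name_letters, drawn - k)
    return Fraction(ways, comb(total_tiles, drawn))

probabilities = {k: p_match(k) for k in (5, 4, 3, 0)}

for k, p in probabilities.items():
    print(f"{k} letters: {p} (about {float(p):.5f})")
```

There are 65,780 possible sets of 5 tiles, and the numbers of favorable sets are 1, 105, 2100, and 20,349 for 5, 4, 3, and 0 matching letters.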
13.5 Normal Distributions
Properties:
1) Bell shaped
2) Symmetric about the mean
3) X-axis is a horizontal asymptote
4) Area under the curve and above x-axis is 1
5) Maximum value occurs at the mean
6) Has 2 points of inflection at 1 standard deviation to the right and left of the mean
7) The mean, median and mode all have the same value = CENTER
Empirical Rule:
a) About 68% of the data values are within 1 standard deviation of the mean
b) About 95% of the data values are within 2 standard deviations of the mean
c) About 99.7% of the data values are within 3 standard deviations of the mean
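The Empirical Rule percentages can be checked against the normal curve itself. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) with mean 0 and standard deviation 1; the same percentages hold for any normal distribution.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Area under the curve within k standard deviations of the mean
within = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, area in within.items():
    print(f"within {k} standard deviation(s): {area:.2%}")
```

The computed areas are about 68.27%, 95.45%, and 99.73%, which the rule rounds to 68%, 95%, and 99.7%.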