Uploaded by Vallabh Bharamgunde

PPT 1

Data Science for Managerial Decisions
Objective of the Course
This course aims to provide the basic foundations needed for data scientists.
It includes the fundamental concepts and covers the mathematical and statistical
essentials required for understanding and implementing predictive and
prescriptive models for solving business problems. It will provide hands-on
training to students through an Excel-based approach.
Textbooks for the Course
• Anderson, D. R., Sweeney, D. J., Williams, T. A., Camm, J. D., and Cochran, J. J. (2018). Statistics for Business & Economics. Cengage Learning.
• Hastie, T., Tibshirani, R., and Friedman, J. H. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd ed.). New York: Springer.
• Provost, F., and Fawcett, T. (2013). Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking. O'Reilly Media, Inc.
Grading & Evaluation
Component                                                    Weight
Mid-Term Exam                                                20%
End-Term Exam                                                40%
Project, Assignment, Attendance, Quiz, Class participation   40%
Total                                                        100%
Course Outline
Topic(s) to be covered
1. Introduction
-> Applications of quantitative techniques in business
-> Data – Categorical & Quantitative
-> Scales of measurement
2. Descriptive Statistics – Tabular & Graphical Display
-> Summarizing data for qualitative variables
-> Summarizing data for quantitative variables
-> Summarizing data for two variables
3. Descriptive Statistics-Numerical Measures
-> Measures of location – mean, weighted mean, median, geometric mean, mode,
percentiles, quartiles
-> Measures of variability – range, interquartile range, variance, standard
deviation, coefficient of variation
-> Measures of distribution shape, relative location, and detecting outliers
-> Five-Number Summaries & Box plot
-> Measures of association between two variables
4. Introduction to Probability
-> Experiments, counting rules, assigning probabilities
-> Events and their probabilities
-> Basic relationships of probabilities
-> Conditional probability
-> Bayes’ theorem
5. Discrete and Continuous Probability Distribution
-> Random Variables
-> Expected value and variance
-> Binomial probability distribution
-> Poisson probability distribution
-> Hypergeometric probability distribution
-> Uniform probability distribution
-> Normal probability distribution
-> Exponential probability distribution
6. Sampling and sampling distributions
-> Selecting a sample
-> Point estimation
-> Sampling distribution of x̄
-> Properties of point estimators
7. Interval Estimation
-> Margin of error and interval estimate
-> Population mean: σ known
-> Population mean: σ unknown
-> Determining sample size
8. Hypothesis Tests
-> Developing Null & Alternative hypothesis
-> Type I and type II errors
-> One-Tailed Test
-> Two-Tailed Test
9. Inference
-> Inferences About the Difference Between Two
Population Means: σ1 and σ2 Known
-> Inferences About the Difference Between Two
Population Means: σ1 and σ2 Unknown
-> Inferences About the Difference Between Two
Population Means: Matched Sample
Topics for Chapter 1
• Statistics
• Applications in Business and Economics
• Data
• Data Sources
• Descriptive Statistics
• Statistical Inference
• Analytics
• Big Data and Data Mining
• Computers and Statistical Analysis
• Ethical Guidelines for Statistical Practice
The Dominance of Data
We live in a world that’s drowning in data.
• Websites track every user’s every click.
• Your smartphone is building up a record of your location and speed every second of every day.
• “Quantified selfers” wear pedometers-on-steroids that are always recording their heart rates, movement habits, diet, and sleep patterns.
• Smart cars collect driving habits.
• Smart homes collect living habits.
• Smart marketers collect purchasing habits.
• The Internet itself represents a huge graph of knowledge that contains (among other things) an enormous cross-referenced encyclopedia; domain-specific databases about movies, music, sports results, pinball machines, and memes; and too many government statistics.
Data Science
Data science is a blend of skills in three major areas:
• Business acumen
• Mathematics expertise
• Technology: hacking skills
Data science is an interdisciplinary field that combines techniques, tools, and methodologies from statistics, mathematics, computer science, and domain knowledge to extract insights and knowledge from large volumes of structured and unstructured data.
Applications of Data Science in Business
• Fraud detection and risk management
• Sales forecasting and demand prediction
• Recommender systems
• Supply chain optimization
Applications of Data Science in Healthcare
• Disease prediction and early diagnosis
• Personalized medicine and treatment optimization
• Health monitoring and wearable devices
• Drug discovery and clinical trials
• Public health analytics and outbreak prediction
Applications of Data Science in Finance
• Credit scoring and risk assessment
• Algorithmic trading and financial forecasting
• Fraud detection and anti-money laundering
• Customer sentiment analysis for investment decisions
• Portfolio optimization and wealth management
What is Data?
Data are the facts and figures collected, analyzed, and summarized for presentation and interpretation.
All the data collected in a particular study are referred to as the data set for the study.
Every day, we come into contact with an enormous amount of data in the form of facts, numerical figures, tables, and graphs through newspapers, television, magazines, and other forms of communication.
These could be cricket batting or bowling averages, company revenues, city temperatures, five-year plan expenditures, polling results, and more.
Data are numerical or nonnumerical facts or figures collected for a purpose.
Applications of Data Science: Netflix
Personalized video ranking, Trending Now ranker
• Over 208 million paid subscribers worldwide.
• Streaming supported on thousands of smart devices.
• Around 3 billion hours watched every month.
• Data collected on over 100 billion events every day.
• Advanced use of data analytics and recommendation systems.
i) Personalized Recommendation System
Data: viewing time, platform searches for keywords, and content pauses, rewinds, and rewatches.
Predict what a viewer is likely to watch and give the user a personalized watchlist.
ii) Content Development using Data Analytics
Come up with content that viewers want to watch even before they know they want to watch it.
Data Types
• Qualitative: Nominal, Ordinal
• Quantitative: Discrete, Continuous; Interval, Ratio
Scale of Measurement
• Categorical data, numeric: Nominal or Ordinal
• Categorical data, nonnumeric: Nominal or Ordinal
• Quantitative data, numeric: Interval or Ratio
Nominal scale
Used to categorize data into mutually exclusive categories or groups
Examples:
• Gender
• Marital status
• Religion
• Race
• Hair Color
• Country
• Zip code
• Student ID
Ordinal scale
Depicts the order of the values but not the size of the difference between them
Interval Scale
• Negative values are possible because the zero point does not indicate the absence of value.
• Shows not only the order and direction of values but also the exact differences between them.
• On the interval scale, the distance between adjacent values is identical.
• There is no fixed beginning point.
• Ratios cannot be calculated.
• Addition and subtraction are meaningful, but multiplication and division are not.
Examples of Interval Scale
• Time of day on a 12-hour clock
• Temperature in degrees Fahrenheit or Celsius (not Kelvin)
• IQ test
• SAT and ACT scores
• Age
• Income range
• Year
• Voltage
• Grade levels in school
Ratio Scale
• A quantitative scale with a true zero and equal distances between adjacent points. It is an extension of the interval level of measurement: both the difference between values and the ratio of values are meaningful.
• Temperature ratios can only be calculated using the Kelvin scale. Although 40 degrees is numerically double 20 degrees, 40° is not twice as hot as 20° on the Celsius or Fahrenheit scales.
• However, on the Kelvin scale, 40 K is twice as hot as 20 K because this scale's beginning point is a true zero.
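The point about temperature ratios can be checked numerically. The sketch below is only an illustration: the helper name `celsius_to_kelvin` and the two readings are chosen here, not taken from the slides. It shows that the "2 to 1" ratio of the Celsius readings disappears once the same temperatures are expressed on a scale with a true zero.

```python
# Ratios are only meaningful on a scale with a true zero (Kelvin).
def celsius_to_kelvin(temp_c: float) -> float:
    """Convert a Celsius reading to Kelvin."""
    return temp_c + 273.15

low_c, high_c = 20.0, 40.0

# Ratio of the raw Celsius numbers: 2.0, but it has no physical meaning.
ratio_celsius = high_c / low_c

# Ratio of the same two temperatures on the Kelvin scale.
ratio_kelvin = celsius_to_kelvin(high_c) / celsius_to_kelvin(low_c)

print(f"Ratio of Celsius readings: {ratio_celsius:.3f}")
print(f"Ratio of Kelvin readings:  {ratio_kelvin:.3f}")
```

Differences, by contrast, are unchanged by the conversion (20 degrees in both cases), which is why Celsius still qualifies as an interval scale.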
Ratio Scale of Measurement
Suppose the temperature outside is 0 degrees Celsius. Zero degrees does not mean that there is no temperature; it is simply a value on the scale.
Temperature expressed in F or C is not a ratio variable: a temperature of 0.0 on either of those scales does not mean 'no heat'.
However, temperature in Kelvin is a ratio variable, as 0.0 Kelvin really does mean 'no heat'.
Examples of Ratio Scale
Characteristics
of Ratio Scale
Source: https://www.questionpro.com/blog/ratio-scale/
https://www.cuemath.com/measurement/scales-ofmeasurement/
Comparison of four scales of
measurement
Data, Data Sets, Elements, Variables, and
Observations
• Elements are the entities on which data are collected.
• A variable is a characteristic of interest for the elements.
• An observation is the set of measurements obtained for a particular element.
• A data set with n elements contains n observations.
• The total number of data values in a complete data set is the number of elements multiplied by the number of variables.
Question: Identify the type of scale of measurement ?
Practice problem
Which of the following variables are qualitative and which are quantitative? If the variable is quantitative, then specify whether the variable is discrete or continuous.
a. Points scored in a football game.
b. Racial composition of a high school classroom.
c. Heights of 15-year-olds.
Data Types
1. Cross-sectional data
Refers to data collected by recording a characteristic of many subjects at the same point in time, or without regard to differences in time.
Place   Max temperature   Humidity   Wind   Date
P1      T1                H1         W1     30-August-2021
P2      T2                H2         W2     30-August-2021
P3      T3                H3         W3     30-August-2021
Data Types
2. Time series data
Refers to data collected over several time periods focusing on certain groups of people, specific events, or objects. Time series data can include hourly, daily, weekly, monthly, quarterly, or annual observations.
Place   Max temperature   Humidity   Wind   Date
P1      T1                H1         W1     11-March-2020
P2      T2                H2         W2     11-March-2021
P3      T3                H3         W3     11-March-2022
Data Types
3. Panel Data (Longitudinal Data)
Combination of cross-sectional and time series data
Place   Max temperature   Humidity   Wind   Date
P1      T1                H1         W1     1-March-2020
P2      T2                H2         W2     12-April-2021
P3      T3                H3         W3     09-Jan-2022
Data Types
4. Structured data
• Reside in a pre-defined, row-column
format.
• Spreadsheet or database applications.
• Enter, store, query, and analyze.
• Numerical information that is objective
and not open to interpretation.
5. Unstructured data
• Do not conform to a pre-defined, row-column format.
• Textual and multimedia content.
• Do not conform to database structures.
• Do not conform to a row-column model required in most database systems.
• Example: social media data such as Twitter, YouTube, Facebook, and blogs.
6. Big Data
Five characteristics
(a) Volume: an immense amount of data compiled from single or multiple sources
(b) Velocity: data are generated at rapid speed, so managing them is a critical issue
(c) Variety: data of all types and forms, structured or unstructured
(d) Veracity: the credibility, quality, and reliability of the data
(e) Value: a methodological plan for formulating questions, curating the right data, and unlocking hidden potential
Types of Data
Classification of Data Based on its Presentation
Ungrouped Data
Grouped Data
Types of Data
In statistics, raw data refers to data that has been collected directly from a primary source and has not been processed in any way, for example, data obtained from government reports.
Ungrouped data is data given as individual points (i.e., values or numbers) such as 15, 63, 34, 20, 25, and so on.
Grouped data is data given in the form of class intervals such as 0-20, 20-40, and so on.
Grouped Vs. Ungrouped Data
• For example, suppose there are 30 men in a colony, whose ages are as follows:
• 55, 35, 29, 35, 24, 77, 65, 45, 26, 29, 35, 66, 57, 59, 33, 31, 64, 28, 63, 55, 25, 69, 46, 38, 48, 61, 37, 55, 24, 64
• Data available in such a form is called raw data. Each entry (55, 35, 29, and so forth) is a value or observation.
• To analyze these data, we first arrange them in increasing order:
• 24, 24, 25, 26, 28, 29, 29, 31, 33, 35, 35, 35, 37, 38, 45, 46, 48, 55, 55, 55, 57, 59, 61, 63, 64, 64, 65, 66, 69, 77
Grouped Vs. Ungrouped Data
Frequency Distribution Table for Ungrouped Data
24, 24, 25, 26, 28, 29, 29, 31, 33, 35, 35, 35, 37, 38, 45, 46, 48, 55, 55, 55, 57, 59, 61, 63, 64, 64, 65, 66, 69, 77
Age   Frequency
24    2
25    1
26    1
28    1
29    2
31    1
33    1
35    3
37    1
38    1
45    1
46    1
48    1
55    3
57    1
59    1
61    1
63    1
64    2
65    1
66    1
69    1
77    1
If we create a frequency distribution table with one row for every distinct observation, the table becomes large. For easier understanding, we can instead make a table with groups of observations, say 0 to 10, 10 to 20, and so on.
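As a rough illustration of how the ungrouped frequency table is built, the sketch below counts each distinct age in the list of 30 ages given earlier. It uses Python's standard `collections.Counter`; the variable names are chosen here for illustration.

```python
from collections import Counter

ages = [55, 35, 29, 35, 24, 77, 65, 45, 26, 29, 35, 66, 57, 59, 33,
        31, 64, 28, 63, 55, 25, 69, 46, 38, 48, 61, 37, 55, 24, 64]

# Count how often each distinct age occurs.
frequency = Counter(ages)

# Print the table with the ages in increasing order.
print("Age  Frequency")
for age in sorted(frequency):
    print(f"{age:>3}  {frequency[age]}")

print("Number of rows (distinct ages):", len(frequency))
```

With one row per distinct age, 30 observations already need more than twenty rows, which is the motivation for grouping the values into classes.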
Grouped Vs. Ungrouped Data
Consider the marks of 50 students of MBA class in Supply Chain Management
obtained in an examination. The maximum marks of the exam are 50.
23, 8, 13, 18, 32, 44, 19, 8, 25, 27, 10, 30, 22, 40, 39, 17, 25, 9, 15, 20, 30, 24, 29,
19, 16, 33, 38, 46, 43, 22, 37, 27, 17, 11, 34, 41, 35, 45, 31, 26, 42, 18, 28, 30, 22,
20, 33, 39, 40, 32
Frequency Distribution Table for Grouped Data
Groups (marks)   Frequency (No. of Students)
0-10             3
10-20            11
20-30            14
30-40            14
40-50            8
Total            50
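The grouped table can be reproduced with a short tally. The sketch below makes one assumption that the slide does not state explicitly: a mark that falls exactly on a class boundary (for example 10, 20, or 30) is counted in the class that starts at that value. Under that convention the counts agree with the table above.

```python
marks = [23, 8, 13, 18, 32, 44, 19, 8, 25, 27, 10, 30, 22, 40, 39, 17, 25,
         9, 15, 20, 30, 24, 29, 19, 16, 33, 38, 46, 43, 22, 37, 27, 17, 11,
         34, 41, 35, 45, 31, 26, 42, 18, 28, 30, 22, 20, 33, 39, 40, 32]

class_width = 10
last_lower_limit = 40  # classes are 0-10, 10-20, 20-30, 30-40, 40-50

# Start every class at zero so that empty classes would still be listed.
counts = {f"{low}-{low + class_width}": 0
          for low in range(0, last_lower_limit + 1, class_width)}

for mark in marks:
    # Assumed convention: a boundary value belongs to the class it starts.
    # A full mark of 50 (not present here) would go into the last class.
    low = min((mark // class_width) * class_width, last_lower_limit)
    counts[f"{low}-{low + class_width}"] += 1

for group, count in counts.items():
    print(f"{group:>6}  {count}")
print(" Total ", sum(counts.values()))
```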
Data Preparation
Three Steps:
• Counting and sorting
• Handling missing values
• Sub-setting
Subsetting is the process of extracting a
portion of the data set that is relevant for
subsequent statistical analysis.
Data Presentation: Tabular and Graphical Methods
A categorical variable consists of observations that represent labels or names. Summarize the data with a frequency distribution.
Methods to Visualize a Categorical Variable
A. Bar chart
B. Pie chart
It is a segmented circle whose segments portray the relative frequencies of the categories of a qualitative variable.
Data Presentation: Tabular and Graphical Methods
Understanding the relationship between two categorical variables
• Use a contingency table to examine the relationship between two categorical variables.
• Use a stacked column chart to visualize more than one categorical variable.
Relationship between two categorical variables
          Personality
          Analyst   Diplomat   Explorer   Sentinel
Female    55        164        194        79
Male      61        160        210        77
Data Presentation: Tabular and Graphical Methods
Use a frequency distribution to summarize a numerical variable. Instead of categories, a series of intervals or classes needs to be defined.
Methods to Visualize a Numeric Variable
Histogram: A histogram is the counterpart to the vertical bar chart used for a categorical variable. There are no gaps between the bars (intervals).
Data Presentation: Tabular and Graphical Methods
Polygon
• Midpoint of each interval/class
on the x-axis
• Frequency or relative frequency
on the y-axis
Line chart
Stem-and-leaf diagram
An ogive depicts a cumulative frequency
or cumulative relative frequency.
Use a scatterplot to examine the
relationship between two numerical
variables.
Data Presentation: Tabular and Graphical Methods
Source:
https://www.mathsisfun.com/definitions/stem-andleaf-plot.html
Stem and Leaf Diagram
Application: Japanese train timetable
bus Timetable
• Source: https://byjus.com/maths/frequencypolygons/
Tabular and Graphical Procedures
Qualitative Data
Tabular Methods
• Frequency Distribution
• Rel. Freq. Dist.
• % Freq. Dist.
• Cross-tabulation
Graphical Methods
• Bar Graph
• Pie Chart
Quantitative Data
Tabular Methods
• Frequency Distribution
• Rel. Freq. Dist.
• Cum. Freq. Dist.
• Cum. Rel. Freq. Distribution
• Cross-tabulation
Graphical Methods
• Histogram
• Freq. curve
• Box plot
• Scatter Diagram
• Stem-and-Leaf Display
Descriptive Statistics: Tabular and Graphical Displays
1. Summarizing Data for a Categorical Variable
• Categorical data use labels or names to identify categories of like items.
2. Summarizing Data for a Quantitative Variable
• Quantitative data are numerical values that indicate how much or how many.
Summarizing Categorical Data
• Frequency Distribution
• Relative Frequency Distribution
• Percent Frequency Distribution
• Bar Chart
• Pie Chart
Frequency Distribution
• A frequency distribution is a tabular overview of data indicating the number
(frequency) of observations in each of multiple non-overlapping groups or
classes.
• The goal is to reveal insights into the data that cannot be immediately reached
by looking at the raw data.
Frequency Distribution
• Guests at Marada Inn were asked to rate the quality of their accommodations as excellent, above average, average, below average, or poor.
• The ratings given by a sample of 20 guests are as follows:
Below Average, Average, Above Average, Above Average, Above Average,
Above Average, Above Average, Below Average, Below Average, Average,
Poor, Poor, Above Average, Excellent, Above Average,
Average, Above Average, Average, Above Average, Average
Frequency Distribution: Marada Inn
Rating          Frequency
Poor            2
Below Average   3
Average         5
Above Average   9
Excellent       1
Total           20
Relative Frequency Distribution
• The relative frequency of a class refers to the proportion or fraction of the total
number of data items that belong to that specific class.
Relative frequency of a class = (Frequency of the class) / n, where n is the total number of observations.
• A relative frequency distribution is a table that summarizes a dataset by
displaying the relative frequency of each class.
Percent Frequency Distribution
• The percent frequency of a class is the relative frequency multiplied by 100.
• A percent frequency distribution is a table that summarizes a dataset by displaying the percent frequency of each class.
Relative Frequency and Percent Frequency Distributions: Marada Inn
Rating          Relative Frequency   Percent Frequency
Poor            .10                  10
Below Average   .15                  15
Average         .25                  25
Above Average   .45                  45
Excellent       .05                  5
Total           1.00                 100
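A minimal sketch of how the relative and percent frequencies follow from the Marada Inn frequency counts. The dictionary layout and variable names are illustrative choices; only the counts come from the slides.

```python
# Frequency of each rating in the sample of 20 Marada Inn guests.
frequencies = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}

n = sum(frequencies.values())  # total number of observations

# Relative frequency = frequency of the class / n
relative = {rating: count / n for rating, count in frequencies.items()}

# Percent frequency = relative frequency * 100
percent = {rating: 100 * share for rating, share in relative.items()}

print(f"{'Rating':<15}{'Relative':>10}{'Percent':>10}")
for rating in frequencies:
    print(f"{rating:<15}{relative[rating]:>10.2f}{percent[rating]:>10.0f}")
```

The relative frequencies always sum to 1 and the percent frequencies to 100, which is a quick check on any hand-built table.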
Bar Chart
• A bar chart is a visual representation used to display qualitative data.
• The labels for each class are specified on one axis, typically the horizontal
axis.
• The other axis, typically the vertical axis, can be represented using a
frequency, relative frequency, or percent frequency scale.
• A bar of fixed width is drawn above each class label, with its height extended to the frequency, relative frequency, or percent frequency of the class. The bars are separated to emphasize that each class is a distinct category.
Bar Chart
[Figure: bar chart of Marada Inn quality ratings. Horizontal axis: Quality Rating (Poor, Below Average, Average, Above Average, Excellent); vertical axis: Frequency.]
Pareto Diagram
• In quality control, bar charts are used to identify the most important causes of problems.
• A Pareto diagram is a bar chart in which the bars are arranged in descending order of height from left to right, with the most frequently occurring cause appearing first.
• The diagram is named after its originator, the Italian economist Vilfredo Pareto.
Pie Chart
A pie chart is a popular way to show relative frequency and percent frequency distributions for categorical data.
First, draw a circle. Then use the relative frequencies to divide the circle into sectors that correspond to the relative frequency of each class.
Since a circle has 360 degrees, a class with a relative frequency of .25 takes up .25 × 360 = 90 degrees of the circle.
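The sector angles can be computed for every class in the same way. The sketch below applies the rule "angle = relative frequency × 360" to the Marada Inn relative frequencies given earlier in these slides; the variable names are illustrative.

```python
# Relative frequencies of the Marada Inn quality ratings.
relative_frequency = {
    "Poor": 0.10,
    "Below Average": 0.15,
    "Average": 0.25,
    "Above Average": 0.45,
    "Excellent": 0.05,
}

# Each class gets a share of the 360 degrees equal to its relative frequency.
angles = {rating: share * 360 for rating, share in relative_frequency.items()}

for rating, angle in angles.items():
    print(f"{rating:<15}{angle:>6.0f} degrees")
```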
Pie Chart
[Figure: pie chart of Marada Inn quality ratings. Poor 10%, Below Average 15%, Average 25%, Above Average 45%, Excellent 5%.]
Half of the customers who were asked about the
quality of Marada said it was "above average" or
"excellent" (look at the left side of the pie). This
might make the boss happy.
If you look at the top of the pie, you can see that for
every person who gave an "excellent" rating, there
were two who gave a "poor" rating. This should
make the boss unhappy.
Summarizing Quantitative Data
• Frequency Distribution
• Relative Frequency and Percent Frequency Distributions
• Dot Plot
• Histogram
• Cumulative Distributions
• Stem-and-Leaf Display
Frequency
Distribution
The boss of Hudson Auto wants to know more
about how much the parts used to tune up engines
in the shop cost. She looks at 50 tune-up bills from
customers. The prices of parts, rounded to the
nearest dollar, are shown below:
Frequency Distribution
The three steps needed to determine the classes for a frequency distribution with quantitative data are:
1. Determine the number of nonoverlapping classes.
2. Determine the width of each class.
3. Determine the class limits.
Frequency Distribution: Guidelines for Determining the Number of Classes
Use between 5 and 20 classes.
Data sets with a larger number of elements usually require a larger number of classes.
Smaller data sets usually require fewer classes.
The goal is to use enough classes to show the variation in the data, but not so many
classes that some contain only a few data items.
Frequency Distribution: Guidelines for Determining the Width of Each
Class
• Use classes of equal width.
• Approximate Class Width = (Largest data value − Smallest data value) / Number of classes
• Making the classes the same width reduces the chance of inappropriate
interpretations
Note on Number of Classes and Class Width
In practice, the number of classes and the
appropriate class width are determined
by trial and error.
Once a possible number of classes is
chosen, the appropriate class width is
found.
The process can be repeated for a
different number of classes.
Ultimately, the analyst uses judgment to
determine the combination of the
number of classes and class width that
provides the best frequency distribution
for summarizing the data.
Guidelines for Determining the Class Limits
Class limits must be chosen so that each data item belongs to one and only one class.
The lower class limit is the smallest possible data value assigned to the class.
The upper class limit is the largest possible data value assigned to the class.
The appropriate values for the class limits depend on the level of accuracy of the data.
A class with no upper limit or no lower limit is called an open-end class.
Hudson Auto Repair
• If we choose six classes:
• Approximate Class Width = (109 − 52)/6 = 9.5, which is rounded up to 10
Parts Cost ($)   Frequency
50-59            2
60-69            13
70-79            16
80-89            7
90-99            7
100-109          5
Total            50
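The class-width calculation and the resulting class limits can be sketched as follows. Two points are assumptions made for this illustration: the approximate width is rounded up to the next whole dollar, and the first lower class limit is set to 50 (the smallest cost, 52, rounded down to a convenient value), which matches the classes shown in the table.

```python
import math

smallest, largest = 52, 109  # extreme parts costs used on the slide
number_of_classes = 6

# Approximate class width = (largest - smallest) / number of classes
approximate_width = (largest - smallest) / number_of_classes
class_width = math.ceil(approximate_width)  # round up to a whole dollar

# Assumption: start the first class at 50 rather than at 52.
first_lower_limit = 50
class_limits = [
    (first_lower_limit + i * class_width,
     first_lower_limit + (i + 1) * class_width - 1)
    for i in range(number_of_classes)
]

print("Approximate width:", approximate_width)
print("Class width used: ", class_width)
for lower, upper in class_limits:
    print(f"{lower}-{upper}")
```

Because the costs are rounded to the nearest dollar, limits such as 50-59 and 60-69 leave no gaps and no overlaps between classes.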
Relative Frequency and Percent Frequency Distributions
Hudson Auto Repair
Insights Gained from the % Frequency Distribution: Hudson
Auto Repair
• Only 4% of the parts costs are in the $50-59 class.
• 30% of the parts costs are under $70.
• The greatest percentage (32%, or almost one-third) of the parts costs are in the $70-79 class.
• 10% of the parts costs are $100 or more.
Dot Plot
• One of the simplest graphical summaries of data is a dot plot.
• A horizontal axis shows the range of data values.
• Then each data value is represented by a dot placed above the
axis
Histogram
A histogram is a common graphical display of quantitative data.
The variable of interest is placed on the horizontal axis.
A rectangle is drawn above each class interval with a height that corresponds to the frequency, relative frequency, or percent frequency of the interval.
Unlike a bar graph, a histogram has no natural separation between the rectangles of adjacent classes.
Histogram: Example: Hudson Auto Repair
Histograms: Symmetric
• The left tail is a copy of the right tail.
• Height of People is an example.
Histograms showing Skewness: Moderately Skewed Left
The left side has a longer tail.
Example: Exam Scores
Histograms showing Skewness: Moderately Right Skewed
A Longer tail to the right
Example: Housing Values
Histograms showing Skewness: Highly Skewed Right
A very long tail to the right
Example: Executive Salaries
Cumulative Distributions
Cumulative frequency distribution: indicates the number of items with values less than or equal to the upper limit of each class.
Cumulative relative frequency distribution: indicates the proportion of items with values less than or equal to the upper limit of each class.
Cumulative percent frequency distribution: indicates the percentage of items with values less than or equal to the upper limit of each class.
Cumulative Distributions
The last entry in a cumulative frequency distribution always equals the total number of observations.
The last entry in a cumulative relative frequency distribution always equals 1.00.
The last entry in a cumulative percent frequency distribution always equals 100.
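These properties can be verified on the Hudson Auto Repair frequency distribution. The sketch below builds the three cumulative distributions with a running total from `itertools.accumulate`; the list layout and names are illustrative choices.

```python
from itertools import accumulate

classes = ["50-59", "60-69", "70-79", "80-89", "90-99", "100-109"]
frequencies = [2, 13, 16, 7, 7, 5]
n = sum(frequencies)

# Running total of the class frequencies.
cumulative_frequency = list(accumulate(frequencies))
cumulative_relative = [c / n for c in cumulative_frequency]
cumulative_percent = [100 * c / n for c in cumulative_frequency]

print(f"{'Cost ($)':<10}{'Cum. freq.':>12}{'Cum. rel.':>12}{'Cum. %':>10}")
for label, cf, cr, cp in zip(classes, cumulative_frequency,
                             cumulative_relative, cumulative_percent):
    upper_limit = label.split("-")[1]
    print(f"<= {upper_limit:<7}{cf:>12}{cr:>12.2f}{cp:>10.0f}")
```

The second row reproduces the earlier insight that 30% of the parts costs are under $70.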
Cumulative Distributions: Hudson Auto Repair
Stem-and-Leaf Display
• A stem-and-leaf display shows both the rank order and the shape of the data distribution.
• It is similar to a histogram on its side, but it shows the actual data values.
• The first digits of each data item are arranged to the left of a vertical line.
• To the right of the vertical line, we record the last digit of each item in rank order.
• Each line (row) in the display is referred to as a stem. Each digit on a stem is a leaf.
Example: Hudson Auto Repair
The manager of Hudson Auto would like to obtain a better knowledge of the
cost of parts utilized in the shop’s engine tune-ups. She checks 50 client invoices
for potential improvements. The following information shows the component
costs, rounded to the nearest dollar.
Stem-and-Leaf Display
Example: Hudson Auto Repair
Stretched Stem-and-Leaf Display
• If we believe the original stem-and-leaf display has condensed the
data too much, we can stretch the display vertically by using two
stems for each leading digit(s).
• Whenever a stem value is stated twice, the first value corresponds
to leaf values of 0 - 4, and the second value corresponds to leaf
values of 5 - 9.
Stem-and-Leaf Display
Example: Hudson Auto Repair
Stem-and-Leaf Display: Leaf Units
• A single digit is used to define each leaf.
• In the preceding example, the leaf unit was 1.
• Leaf units may be 100, 10, 1, 0.1, and so on.
• Where the leaf unit is not shown, it is assumed to equal 1.
• The leaf unit indicates how to multiply the stem-and-leaf numbers in order to approximate the original data.
Stem-and-Leaf Display
• Example: Leaf Unit = 0.1
If we have data with values such as
8.6  11.7  9.4  9.1  10.2  11.0  8.8
Leaf Unit = 0.1
 8 | 6 8
 9 | 1 4
10 | 2
11 | 0 7
Stem-and-Leaf Display
• Example: Leaf Unit = 10
If we have data with values such as
1806 1717 1974 1791 1682 1910 1838
Leaf Unit = 10
16 | 8
17 | 1 9
18 | 0 3
19 | 1 7
The 82 in 1682 is rounded down to 80 and is represented as an 8.
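Both leaf-unit examples can be reproduced with one small routine. The function below is a sketch written for these slides, not a library call: the name `stem_and_leaf`, its return format, and the restriction to nonnegative values are all assumptions. Digits beyond the leaf position are dropped, which matches the "rounded down" note in the Leaf Unit = 10 example.

```python
from collections import defaultdict

def stem_and_leaf(values, leaf_unit=1):
    """Return {stem: sorted leaves} for nonnegative values.

    Each value is expressed in leaf units; the last digit becomes the
    leaf, the remaining digits the stem. Finer digits are dropped.
    """
    rows = defaultdict(list)
    for value in values:
        # round() guards against float noise such as 8.6 / 0.1
        # evaluating to 85.99999999999999; int() then drops extra digits.
        scaled = int(round(value / leaf_unit, 9))
        stem, leaf = divmod(scaled, 10)
        rows[stem].append(leaf)
    return {stem: sorted(leaves) for stem, leaves in sorted(rows.items())}

def show(display, leaf_unit):
    """Print the display in the usual 'stem | leaves' layout."""
    print(f"Leaf Unit = {leaf_unit}")
    for stem, leaves in display.items():
        print(f"{stem:>3} | {' '.join(str(leaf) for leaf in leaves)}")

show(stem_and_leaf([8.6, 11.7, 9.4, 9.1, 10.2, 11.0, 8.8], 0.1), 0.1)
show(stem_and_leaf([1806, 1717, 1974, 1791, 1682, 1910, 1838], 10), 10)
```

Multiplying a stem-and-leaf number by the leaf unit recovers the approximate data value, for example 168 × 10 = 1680 for the original 1682.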
Methods of organizing, exploring, and summarizing data
• Visual (charts and graphs): provides insight into characteristics of a data set without using mathematics.
• Numerical (statistics or tables): provides insight into characteristics of a data set using mathematics.
Descriptive Statistics: Tabular and Graphical Displays
• Summarizing Data for Two Variables Using Tables
• Summarizing Data for Two Variables Using Graphical Displays
• Data Visualization: Best Practices in Creating Effective Graphical Displays
Summarizing Data for Two Variables using
Tables
• Thus far we have focused on methods that are used to summarize
the data for one variable at a time.
• Often a manager is interested in tabular and graphical methods that
will help understand the relationship between two variables.
• Crosstabulation is a method for summarizing the data for two
variables.
Cross-tabulation
• A cross-tabulation is a tabular summary of data for
two variables.
• Crosstabulation can be used when:
   • one variable is categorical and the other is quantitative,
   • both variables are categorical, or
   • both variables are quantitative.
• The left and top margin labels define the classes for
the two variables.
Example: Finger Lakes Homes: Cross-tabulation
The number of Finger Lakes homes sold for each style and price for the past two years is shown below.
                Home Style
Price Range     Colonial   Log   Split   A-Frame   Total
< $250,000      18         6     19      12        55
> $250,000      12         14    16      3         45
Total           30         20    35      15        100
Cross-tabulation Example: Finger Lakes Homes
Insights Gained from Preceding Cross-tabulation
• The greatest number of homes (19) in the sample are a split-level style and priced at less
than $250,000.
• Only three homes in the sample are an A-Frame style and priced at $250,000 or more.
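A short sketch of how the margins of this crosstabulation are obtained from the cell counts. The counts are as read from the Finger Lakes table above; the data layout, the variable names, and the added row percentages are illustrative choices rather than part of the slides.

```python
styles = ["Colonial", "Log", "Split", "A-Frame"]

# Cell counts: one row per price range, one column per home style.
counts = {
    "< $250,000": [18, 6, 19, 12],
    "> $250,000": [12, 14, 16, 3],
}

# Row totals (right margin) and column totals (bottom margin).
row_totals = {price: sum(row) for price, row in counts.items()}
column_totals = [sum(column) for column in zip(*counts.values())]
grand_total = sum(row_totals.values())

# Row percentages: the mix of home styles within each price range.
row_percent = {
    price: [round(100 * cell / row_totals[price], 1) for cell in row]
    for price, row in counts.items()
}

print("Row totals:   ", row_totals)
print("Column totals:", dict(zip(styles, column_totals)))
print("Grand total:  ", grand_total)
for price, shares in row_percent.items():
    print(price, dict(zip(styles, shares)))
```

Converting counts to row or column percentages often makes the relationship between the two variables easier to see than the raw counts do.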
Cross-tabulation: Simpson’s Paradox
Data in two or more crosstabulations are often aggregated to produce a summary crosstabulation.
We must be careful in drawing conclusions about the relationship between the two variables in the aggregated crosstabulation.
In some cases, the conclusions based upon an aggregated crosstabulation can be completely reversed if we look at the unaggregated data. The reversal of conclusions based on aggregate and unaggregated data is called Simpson’s paradox.
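The reversal is easier to believe once it is seen in numbers. The sketch below uses illustrative counts that are not taken from the slides: two options, A and B, are compared within two groups and then in aggregate. The group labels, option labels, and counts are all assumptions made for this example.

```python
# (successes, trials) for options A and B, split by a third variable.
# Illustrative counts only; they do not come from the slides.
results = {
    "Group 1": {"A": (81, 87), "B": (234, 270)},
    "Group 2": {"A": (192, 263), "B": (55, 80)},
}

def rate(successes, trials):
    """Success rate as a proportion."""
    return successes / trials

# Unaggregated view: compare A and B inside each group.
for group, by_option in results.items():
    a, b = rate(*by_option["A"]), rate(*by_option["B"])
    print(f"{group}: A = {a:.1%}, B = {b:.1%}")

# Aggregated view: add the counts over the groups, then compare again.
totals = {}
for option in ("A", "B"):
    successes = sum(results[group][option][0] for group in results)
    trials = sum(results[group][option][1] for group in results)
    totals[option] = (successes, trials)

print(f"Overall: A = {rate(*totals['A']):.1%}, B = {rate(*totals['B']):.1%}")
```

Option A has the higher rate in each group, yet option B has the higher rate overall, because the two options were applied in very different proportions across the groups.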
Summarizing Data for Two Variables Using Graphical
Displays
• In most cases, a graphical display is more useful than a table for recognizing
patterns and trends.
• Displaying data in creative ways can lead to powerful insights.
• Scatter diagrams and trendlines are useful in exploring the relationship
between two variables.
Scatter Diagram and Trendline
• A scatter diagram depicts the relationship between two quantitative variables
graphically.
• One variable is depicted on the horizontal axis, while the other is depicted on
the vertical axis.
• The overall link between the variables is suggested by the general pattern of the
plotted dots.
• The relationship is approximated by a trendline.
Scatter Diagram
A Positive Relationship
A Negative Relationship
No Relationship
Scatter Diagram
• Example: Panthers Football Team
The Panthers football team is interested in investigating the relationship, if any, between interceptions made and points scored.
x = Number of Interceptions   y = Number of Points Scored
1                             14
3                             24
2                             18
1                             17
3                             30
Scatter Diagram and Trendline
[Figure: scatter diagram with trendline for the Panthers data. Horizontal axis: Number of Interceptions (x); vertical axis: Number of Points Scored (y).]
Example: Panthers Football Team
• Insights Gained from the Preceding Scatter Diagram
• The scatter diagram indicates a positive relationship between the number of
interceptions and the number of points scored.
• Higher points scored are associated with a higher number of interceptions.
• The relationship is not perfect; not all plotted points in the scatter diagram lie on a straight line.
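The slides do not say how the trendline is fitted. A common choice, assumed here, is the least-squares line y = intercept + slope × x. The sketch below computes it for the five Panthers games using only the standard formulas; the variable names are illustrative.

```python
interceptions = [1, 3, 2, 1, 3]    # x
points = [14, 24, 18, 17, 30]      # y

n = len(interceptions)
mean_x = sum(interceptions) / n
mean_y = sum(points) / n

# Least-squares slope = sum((x - mean_x)(y - mean_y)) / sum((x - mean_x)^2)
s_xy = sum((x - mean_x) * (y - mean_y)
           for x, y in zip(interceptions, points))
s_xx = sum((x - mean_x) ** 2 for x in interceptions)

slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

print(f"Trendline: y = {intercept:.2f} + {slope:.2f} x")
```

The positive slope agrees with the insight above: games with more interceptions tend to have more points scored. With only five observations, the line should be read as a description of this sample rather than a reliable prediction.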
Side-by-Side Bar Chart
• A side-by-side bar chart is a graphical display
for depicting multiple bar charts on the same
display.
• Each cluster of bars represents one value of
the first variable.
• Each bar within a cluster represents one value
of the second variable.
Side-by-Side Bar Chart
[Figure: side-by-side bar chart for Finger Lake Homes. Horizontal axis: Home Style (Colonial, Log, Split-Level, A-Frame); vertical axis: Frequency; one bar per price range (< $250,000 and > $250,000).]
Stacked Bar Chart
• Another method for comparing and displaying
two variables simultaneously is a stacked bar
chart.
• It is a bar chart in which each bar is broken into rectangular segments of different colors.
• When percentage frequencies are shown, all
bars have the same height (or length) and reach
100%.
Stacked Bar Chart
[Figure: stacked bar chart for Finger Lake Homes. Horizontal axis: Home Style (Colonial, Log, Split, A-Frame); vertical axis: Frequency; segments: < $250,000 and > $250,000.]
Stacked Bar Chart
[Figure: percent stacked bar chart for Finger Lake Homes. Horizontal axis: Home Style (Colonial, Log, Split, A-Frame); vertical axis: Percentage Frequency; segments: < $250,000 and > $250,000.]
Data Visualization: Best Practices in Creating
Effective Graphical Displays
• Data visualization is the process of presenting and summarizing information
about a data set using graphical displays.
• The objective is to provide the most important details about the data in an
efficient and understandable manner.
Creating Effective Graphical Displays
• Effective graphical display design requires both science and art.
• Here are some suggestions.
• Clearly and succinctly title the display.
• Don't complicate the display.
• Include the units of measurement and clearly mark each axis.
• If colors are used to distinguish categories, make sure they are distinct.
Choosing the Type of Graphical Display
Displays used to show the distribution of data
• Bar Chart to show the frequency distribution or relative frequency distribution for
categorical data
• Pie Chart to show the relative frequency or percent frequency for categorical data
• Dot Plot to show the distribution of quantitative data over the entire range of the
data
• Histogram to show the frequency distribution for quantitative data over a set of
class intervals
Choosing the Type of Graphical Display
A. Displays used to make comparisons:
• Side-by-Side Bar Chart to compare two variables
• Stacked Bar Chart to compare the relative frequency or
Percent frequency of two categorical variables
B. Displays used to show relationships
• Scatter Diagram to show the relationship between two
quantitative variables
• Trendline to approximate the relationship of data in a
scatter diagram
Data Dashboards
• A data dashboard is a widely used data visualization tool.
• It organizes and presents key performance indicators
(KPIs) used to monitor an organization or process.
• It provides timely, summary information that is easy to
read, understand, and interpret.
• Some additional guidelines include:
• Minimize the need for screen scrolling.
• Avoid unnecessary use of color or 3D.
• Use borders between charts to improve readability.
Tabular and Graphical Displays
Categorical Data
• Tabular Displays: Frequency Distribution, Rel. Freq. Dist., Percent Freq. Distribution, Cross-tabulation
• Graphical Displays: Bar Chart, Pie Chart, Side-by-Side Bar Chart, Stacked Bar Chart
Quantitative Data
• Tabular Displays: Frequency Dist., Rel. Freq. Dist., % Freq. Dist., Cum. Freq. Dist., Cum. Rel. Freq. Dist., Cum. % Freq. Dist., Cross-tabulation
• Graphical Displays: Dot Plot, Histogram, Stem-and-Leaf Display, Scatter Diagram
Practice Problem
Construct a stem and leaf display for the following data
A. 11.3, 9.6, 10.4, 7.5, 8.3, 10.5, 10.0, 9.3, 8.1, 7.7, 7.5,
8.4, 6.3, 8.8
B. 1161, 1206, 1478, 1300, 1604, 1725, 1361,1422, 1221,
1378, 1623, 1426, 1557, 1730, 1706, 1689
Solution
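One possible way to build the display for data set A is sketched below in Python. The function name `stem_and_leaf` and the `leaf_unit` parameter are illustrative choices, not part of the course material; the sketch assumes each value is rounded to the nearest leaf unit, so the last digit becomes the leaf and the remaining digits become the stem.

```python
from collections import defaultdict

def stem_and_leaf(values, leaf_unit):
    """Group values into stems (leading digits) and single-digit leaves."""
    groups = defaultdict(list)
    for v in sorted(values):
        scaled = round(v / leaf_unit)     # express the value in leaf units
        stem, leaf = divmod(scaled, 10)   # last digit is the leaf
        groups[stem].append(leaf)
    return dict(groups)

# Data set A from the practice problem (leaf unit = 0.1)
data_a = [11.3, 9.6, 10.4, 7.5, 8.3, 10.5, 10.0, 9.3, 8.1, 7.7, 7.5,
          8.4, 6.3, 8.8]
display_a = stem_and_leaf(data_a, leaf_unit=0.1)

for stem in sorted(display_a):
    leaves = " ".join(str(leaf) for leaf in display_a[stem])
    print(f"{stem:>3} | {leaves}")
```

For data set B a leaf unit of 10 could be used in the same way, bearing in mind that this sketch rounds the units digit rather than dropping it.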
Assignment 1
Showcase the representation of various graphical methods for
categorical and numerical variables using Excel, Tableau, and Power
BI applications.
Mode of Submission: Online (OneDrive)
Submission: PPT
Type: Individual
Last Date: 12/July/2023
Descriptive Statistics: Numerical Measures
Measures of Location
• Mean
• Median
• Mode
• Weighted Mean
• Geometric Mean
• Percentiles
• Quartiles
Measures of Variability
• Range
• Interquartile Range
• Variance
• Standard Deviation
• Coefficient of Variation
Descriptive Measures
Measures of Central Tendency or Measure of
Location/Position
Mean
Median
Mode
These are statistical terms that endeavour to determine the "centre" of a set of
numbers.
Measures of central tendency seek to condense data into a single value and make
data comparisons easier.
Mean
• It is the most commonly used measure
• It represents the average of the data
values
• Useful for a data set that does not have
outliers
• (Outlier- is a number in a set of data that is
much bigger or smaller than the rest of the
numbers)
• Types of Mean- (a) Arithmetic or Simple
Mean (b) Harmonic (c) Geometric
Mean: Merits and Demerits
Merits of Mean
• It is simple to compute and has a distinct value.
• It is least susceptible to sampling variations.
• It may be employed for additional statistical analysis.
• It is founded on every observation.
Demerits of Mean
• For qualitative data, it cannot be computed.
• Extreme items (outliers) have an undue influence on the mean.
• It cannot be computed by observation nor by graphical location.
Arithmetic Mean
The following equation (1) can be used to calculate the mean
for any data set:
It is represented using $\bar{x}$ (for sample) or $\mu$ (for population)
$\text{Mean} = \frac{\text{Sum of all observations}}{\text{Number of observations}}$
Mean for Ungrouped data
$\bar{x} = \frac{\sum x_i}{n}$ (1)
Mean for Grouped data
$\bar{x} = \frac{\sum f_i x_i}{n}$; Here $n = \sum f_i$
Mean for ungrouped data
Example 1:
Rahul requires a B or better in chemistry to graduate. He did poorly on the first three of his tests, but well on the last four. These are his evaluations:
46; 53; 54; 74; 78; 81 and 100
Compute the mean and determine if Rahul's grade will be a B (80 to 89 average) or a C (70 to 79 average).
$\bar{x} = \frac{\sum x_i}{n}$ = (46+53+54+74+78+81+100)/7 = 486/7 ≈ 69.43
With the scores as listed, the mean falls just below the C band (70 to 79), so Rahul does not reach a B.
In Excel, the sum can be obtained by selecting Sigma (AutoSum) from the drop-down menu.
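The arithmetic in Example 1 can be checked with a few lines of Python. This is only a sketch using the seven scores exactly as listed, which sum to 486; the variable names are illustrative.

```python
# Rahul's seven test scores as listed in Example 1
scores = [46, 53, 54, 74, 78, 81, 100]

# Ungrouped mean: sum of all observations / number of observations
mean_score = sum(scores) / len(scores)

print(sum(scores), round(mean_score, 2))
```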
Geometric Mean
Geometric Mean (GM):
$\text{Geometric mean} = \sqrt[n]{\text{product of } n \text{ positive values}}$
Ungrouped Data
$G.M. = \text{Antilog}\left(\frac{\sum \log x_i}{n}\right)$
Grouped Data
$G.M. = \text{Antilog}\left(\frac{\sum f_i \log x_i}{n}\right)$; Here $n = \sum f_i$
EXCEL command to compute the geometric mean of a list of values
=GEOMEAN(Number 1, Number 2, ..., Number n)
Geometric Mean
• Calculated by finding the nth root of the product of n values.
• Analyzing growth rates in financial data (where using the arithmetic mean will provide
misleading results).
• Determine the mean rate of change over several successive periods (be it years, quarters,
weeks, . . .).
Applications
Changes in populations of species
Crop yields
Pollution levels
Birth and death rates
Geometric Mean
Rate of Return
Period | Return (%) | Growth Factor
1 | -6.0 | 0.940
2 | -8.0 | 0.920
3 | -4.0 | 0.960
4 | 2.0 | 1.020
5 | 5.4 | 1.054
$\bar{x}_g = \sqrt[5]{(.94)(.92)(.96)(1.02)(1.054)} = [.89254]^{1/5} = .97752$
Average growth rate per period is (.97752 - 1)(100) = -2.248%
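The rate-of-return calculation can be reproduced as follows. This is a minimal sketch: the five returns come from the slide, while the variable names are illustrative and `math.prod` assumes Python 3.8 or later.

```python
import math

# Returns (%) for periods 1 to 5, as given on the slide
returns_pct = [-6.0, -8.0, -4.0, 2.0, 5.4]

# Convert each return to a growth factor, e.g. -6.0% -> 0.940
growth_factors = [1 + r / 100 for r in returns_pct]

# Geometric mean: nth root of the product of the n growth factors
geo_mean = math.prod(growth_factors) ** (1 / len(growth_factors))

# Average growth rate per period, in percent
avg_rate_pct = (geo_mean - 1) * 100
print(round(geo_mean, 5), round(avg_rate_pct, 3))
```

The arithmetic mean of the five returns is -2.12%, which is why it is described as misleading for compounding growth: it does not reproduce the cumulative growth factor of about 0.89254.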
Harmonic Mean
Harmonic Mean Computation
Ungrouped Data
$H = \frac{n}{\sum \frac{1}{x_i}}$
Grouped Data
$H = \frac{n}{\sum \frac{f_i}{x_i}}$; Here $n = \sum f_i$
EXCEL Command to compute the Harmonic mean
=HARMEAN(Number 1, Number 2, ..., Number n)
Weighted Mean
Example 2: Suppose your midterm test score is 83 and your final exam score is
95. Using weights of 40% for the midterm and 60% for the final exam, compute
the weighted average of your scores. If the minimum average for an A is 90, will
you earn an A?
$\bar{x} = \frac{\sum w x}{n}$; Here $n = \sum w$, $w$ = weights
Weighted mean = (0.40 × 83 + 0.60 × 95)/(0.40 + 0.60) = 90.2, so you will earn an A.
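Example 2 can be sketched in Python as below; the variable names are illustrative, and the weights are written as fractions (0.40 and 0.60) rather than percentages.

```python
# Scores and their weights from Example 2
scores = [83, 95]          # midterm, final exam
weights = [0.40, 0.60]     # 40% and 60%

# Weighted mean: sum of (weight * value) divided by the sum of the weights
weighted_mean = sum(w * x for w, x in zip(weights, scores)) / sum(weights)

# An A requires a weighted average of at least 90
earns_a = weighted_mean >= 90
print(round(weighted_mean, 1), earns_a)
```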
Solved Example: Using Excel
Example 3: The weight recorded to the nearest grams of 60 apples picked out at random from a
consignment are given below (Source: http://www.uop.edu.pk/ocontents/chapter%203.pdf)
106; 107; 76; 82; 109; 107; 115; 93; 187; 95; 123; 125; 111; 92; 86; 70; 126; 68; 130; 129; 139; 119; 115; 128; 100;
186; 84; 99; 113; 204; 111; 141; 136; 123; 90; 115; 98; 110; 78; 185; 162; 178; 140; 152; 173; 146; 158; 194; 148;
90; 107; 181; 131; 75; 184; 104; 110; 80; 118; 82
Calculate the arithmetic mean, geometric mean and harmonic mean.
Weight (grams) | Frequency
65-84 | 9
85-104 | 10
105-124 | 17
125-144 | 10
145-164 | 5
165-184 | 4
185-204 | 5
Answer: AM: 122.5 ; GM: 117.7021; HM: 113.1139
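The grouped-data formulas can be applied to the frequency table of Example 3 as sketched below. This assumes, as is conventional, that each class is represented by its midpoint; the variable names are illustrative and not from the slides.

```python
import math

# (lower limit, upper limit, frequency) for each weight class in grams
classes = [(65, 84, 9), (85, 104, 10), (105, 124, 17), (125, 144, 10),
           (145, 164, 5), (165, 184, 4), (185, 204, 5)]

n = sum(f for _, _, f in classes)                        # total frequency
midpoints = [((lo + hi) / 2, f) for lo, hi, f in classes]

# Arithmetic mean: sum(f * x) / n
am = sum(f * x for x, f in midpoints) / n

# Geometric mean: antilog of sum(f * log x) / n
gm = math.exp(sum(f * math.log(x) for x, f in midpoints) / n)

# Harmonic mean: n / sum(f / x)
hm = n / sum(f / x for x, f in midpoints)

print(round(am, 4), round(gm, 4), round(hm, 4))
```

For positive data the three means always satisfy HM ≤ GM ≤ AM, which gives a quick sanity check on the printed values.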
Practice Problem
Practice Problem 1: Calculate the arithmetic mean, harmonic mean and
geometric mean for the following raw data. (Try Yourself)
Raw Data (seconds)
41, 96, 109, 203, 264, 266, 267, 285, 289, 290, 292, 307, 311, 358, 362, 372,
378, 380, 382, 385, 414, 417, 421, 421, 423, 424, 427, 429, 429, 439, 445, 448,
454, 470, 484, 499, 514, 522, 524, 552, 561, 587, 653, 775, 792, 809, 878
Hint: group the data into the following classes.
Lower | Upper | fj
0 | 150 | 3
150 | 300 | 8
300 | 450 | 21
450 | 600 | 10
600 | 750 | 1
750 | 900 | 4
Median
Median: When the observations are arranged in ascending or descending order,
the value that divides the distribution into two equal parts is called the median.
Median for ungrouped data
If n is odd,
$\text{Median} = \text{size of } \left(\frac{n+1}{2}\right)\text{th observation}$
If n is even,
$\text{Median} = \frac{\text{size of } \left(\frac{n}{2}\right)\text{th} + \left(\frac{n}{2}+1\right)\text{th observation}}{2}$
Median
Median for ungrouped data when n is odd
Calculate the median of the following marks obtained by 9 students:
X: 45; 32; 37; 46; 39; 36; 41; 48; 36
Solution: Arrange the data in ascending or descending order
32, 36, 36, 37, 39, 41, 45, 46, 48
n = 9, i.e. n is odd
$\text{Median} = \text{size of } \left(\frac{n+1}{2}\right)\text{th observation} = \text{size of } \left(\frac{9+1}{2}\right)\text{th observation} = 5\text{th observation} = 39$
Median
Median for ungrouped data when n is even
Calculate the median of the following marks obtained by 10 students:
X: 45; 32; 37; 46; 39; 36; 41; 48; 36; 50
Solution: Arrange the data in ascending or descending order
32, 36, 36, 37, 39, 41, 45, 46, 48, 50
n = 10, i.e. n is even
$\text{Median} = \frac{\text{size of } \left(\frac{n}{2}\right)\text{th} + \left(\frac{n}{2}+1\right)\text{th observation}}{2} = \frac{\text{size of } \left(\frac{10}{2}\right)\text{th} + \left(\frac{10}{2}+1\right)\text{th observation}}{2}$
= Size of (5th + 6th) observation/2 = (39+41)/2 = 40
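Both cases of the ungrouped median can be covered by one small function. The sketch below uses the two mark lists from the examples; the function name `median` is an illustrative choice, and Python's built-in `statistics.median` would give the same results.

```python
def median(values):
    """Median of ungrouped data: middle value (n odd) or mean of the two middle values (n even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # ((n + 1) / 2)th observation, which is index mid in 0-based terms
        return ordered[mid]
    # average of the (n / 2)th and (n / 2 + 1)th observations
    return (ordered[mid - 1] + ordered[mid]) / 2

marks_9 = [45, 32, 37, 46, 39, 36, 41, 48, 36]   # n is odd
marks_10 = marks_9 + [50]                        # n is even

print(median(marks_9), median(marks_10))
```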
Median
Median for grouped data
$\text{Median} = l + \frac{h}{f}\left(\frac{n}{2} - C\right)$; Here $n = \sum f$
Where, l= lower class boundary of the median class
h= width of the class
f= frequency of the median class
C= Cumulative frequency of the class preceding the median class
Solved Example: Median
Median for continuous grouped data
Example 4: Find the median for the distribution of examination marks given
below:
Class Boundaries | Number of Students (fi)
29.5-39.5 | 8
39.5-49.5 | 87
49.5-59.5 | 190
59.5-69.5 | 304
69.5-79.5 | 211
79.5-89.5 | 85
89.5-99.5 | 20
Here n = 905, so n/2 = 452.5. The cumulative frequency first exceeds 452.5 in the class 59.5-69.5, which is therefore the median class, with l = 59.5, h = 10, f = 304 and C = 285.
$\text{Median} = l + \frac{h}{f}\left(\frac{n}{2} - C\right) = 59.5 + \frac{10}{304}\left(\frac{905}{2} - 285\right) = 65 \text{ marks}$
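The grouped-median formula can be sketched in code as follows. The function name `grouped_median` is hypothetical, and the last class is taken as 89.5-99.5 on the assumption that the class boundaries are contiguous.

```python
def grouped_median(boundaries, freqs):
    """Median = l + (h / f) * (n / 2 - C) for continuous grouped data."""
    n = sum(freqs)
    cumulative = 0                     # C: cumulative frequency before the current class
    for (lower, upper), f in zip(boundaries, freqs):
        if cumulative + f >= n / 2:    # first class whose cumulative frequency reaches n/2
            h = upper - lower          # class width
            return lower + (h / f) * (n / 2 - cumulative)
        cumulative += f
    raise ValueError("frequencies must be non-empty and positive")

boundaries = [(29.5, 39.5), (39.5, 49.5), (49.5, 59.5), (59.5, 69.5),
              (69.5, 79.5), (79.5, 89.5), (89.5, 99.5)]
students = [8, 87, 190, 304, 211, 85, 20]

exam_median = grouped_median(boundaries, students)
print(round(exam_median, 2))
```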
Median: Advantages and Disadvantages
Advantages
• It is easy to compute and simple to understand.
• It is not affected by extreme values.
• It has a definite and certain value because it is rigidly defined.
• It can be calculated even if the values of the extremes are not known. However, the number of items should be known.
Disadvantages
• Since the median is an average position, arranging the data in ascending or descending order of magnitude is time-consuming in case of a large number of observations.
• It is a positional average and does not consider the magnitude of the items.
• It is not dependent on all the observations, so it cannot be considered as their good representative.
Mode
The value that occurs most frequently in a data set is called the mode.
For example:
4, 6, 7, 8, 3, 4, 5, 4, 3, 5, 5, 2, 1, 4, 9, 4
Mode is 4
The data having one mode is called uni-modal distribution.
The data having two modes is called bi-modal distribution.
The data having more than two modes is called multi-modal distribution
Mode
Mode for grouped data
The value which has the largest frequency in a set of data is called the mode.
For example:
No of Assistants | Frequency
0 | 3
1 | 4
2 | 6
3 | 7
4 | 10
5 | 6
6 | 5
7 | 5
8 | 3
9 | 1
The largest frequency is 10, so the mode is 4 assistants.
Mode
Mode for continuous grouped data
$\text{Mode} = l + \frac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h$
Where l = the lower limit of the modal class.
h = the size of the class interval.
fm = the frequency of the modal class.
f1 denotes the frequency of the class preceding the modal class.
f2 denotes the frequency of the class succeeding the modal class.
Solved Example: Mode
Example 5: Find the mode for the distribution of examination marks
given below:
Class Boundaries | Number of Students (fi)
29.5-39.5 | 8
39.5-49.5 | 87
49.5-59.5 | 190
59.5-69.5 | 304
69.5-79.5 | 211
79.5-89.5 | 85
89.5-99.5 | 20
Answer:
$\text{Mode} = 59.5 + \frac{304 - 190}{(304 - 190) + (304 - 211)} \times 10 = 59.5 + 5.5072 = 65.01$
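A short sketch of the modal-class calculation is given below. The function name `grouped_mode` is hypothetical; it assumes equal class widths, a single modal class, and treats a missing neighbouring class as having frequency 0.

```python
def grouped_mode(boundaries, freqs):
    """Mode = l + (fm - f1) / ((fm - f1) + (fm - f2)) * h for continuous grouped data."""
    m = max(range(len(freqs)), key=lambda i: freqs[i])   # index of the modal class
    lower, upper = boundaries[m]
    fm = freqs[m]
    f1 = freqs[m - 1] if m > 0 else 0                    # class preceding the modal class
    f2 = freqs[m + 1] if m < len(freqs) - 1 else 0       # class succeeding the modal class
    h = upper - lower                                    # size of the class interval
    return lower + (fm - f1) / ((fm - f1) + (fm - f2)) * h

boundaries = [(29.5, 39.5), (39.5, 49.5), (49.5, 59.5), (59.5, 69.5),
              (69.5, 79.5), (79.5, 89.5), (89.5, 99.5)]
students = [8, 87, 190, 304, 211, 85, 20]

exam_mode = grouped_mode(boundaries, students)
print(round(exam_mode, 2))
```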
Practice Problem: Mode
Practice Problem 2: Calculate the mode for the following data. (Try Yourself)
Wages | Number of Workers
Below 100 | 8
100-200 | 12
200-300 | 25
300-400 | 15
400-500 | 10
Above 500 | 6
Mode: Advantage and Disadvantages
Advantages
• It is easy to compute and simple to understand.
• It is a value around which there is maximum concentration of observations. Hence, it is the best representative of the data.
• It is not affected by extreme values of the given data.
• It can be determined graphically with the help of a histogram.
• Useful for both quantitative & qualitative data.
Disadvantages
• The value of mode is not based on each and every item of the series as it considers only the highest concentration of frequencies.
• The value of mode may not always be determined.
• There are two methods of determining mode, Inspection Method and Grouping Method. We may not get the same value of mode by the two methods. So, it is not rigidly defined.
• Since it is not based on all the observations and not rigidly defined, it is not suitable for further algebraic treatment.
Application of Mean, Median and Mode
Mode:
Everyday decisions are often based on the modal concept, for example:
Selecting an elective in which the majority of students are enrolling.
Choosing the car color that the majority of people are purchasing.
Going to a movie where the majority of people are going.
Median:
A country's income distribution in which the income of 20% of the wealthy is roughly equivalent
to the income of the remaining 80% of the population.
Due to the elevated income of 20% of the population, the average income will be distorted.
The median provides a more realistic average income.
Application of Mean, Median and Mode
Mean:
If data are symmetrical or normally distributed, the mean provides a very
accurate depiction of the central value.
Because many measured quantities are approximately symmetrical, the mean is the most
common measure of central tendency in industrial and non-industrial settings.
Mean height, weight, blood pressure, score, etc.
Percentiles
• Information about how data are spread over the interval from the smallest value
to the largest value.
• Example: Admission test scores for colleges and universities
• The pth percentile of a data set is a value such that at least p percent of the
items take on this value or less and at least (100 - p) percent of the items take
on this value or more.
Percentiles
The median is the middle value of a given set of observations.
What are percentiles, quartiles, and deciles?
Suppose your GATE exam percentile is 90. It implies that 90% of the
people who took the GATE exam scored fewer marks than you.
Percentiles
Q1= First Quartile /Lower Quartile = 25th percentile
Q2= Second Quartile/Middle Quartile = 50th percentile
Q3 = Third Quartile/ Upper Quartile = 75th percentile
IQR = Interquartile Range = Q3 - Q1
In general, the pth percentile divides the data into two parts.
Approximately p percent of the observations are less than the pth percentile.
Approximately (100 - p) percent of the observations are greater than the pth
percentile.
Percentiles
• Arrange the data in ascending order.
• Compute Lp, the location of the pth percentile:
Lp = (p/100)(n + 1)
80th Percentile
Example: Apartment Rents
Lp = (p/100)(n + 1) = (80/100)(70 + 1) = 56.8
(the 56th value plus .8 times the difference between the 57th and 56th values)
80th Percentile = 635 + .8(649 – 635) = 646.2
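Apartment rents (70 values, sorted in ascending order down each column):
525 | 540 | 550 | 565 | 580 | 610 | 675
530 | 540 | 550 | 570 | 585 | 615 | 675
530 | 540 | 550 | 570 | 590 | 625 | 680
535 | 545 | 550 | 572 | 590 | 625 | 690
535 | 545 | 550 | 575 | 590 | 625 | 700
535 | 545 | 560 | 575 | 600 | 635 | 700
535 | 545 | 560 | 575 | 600 | 649 | 700
535 | 545 | 560 | 580 | 600 | 650 | 700
540 | 550 | 565 | 580 | 600 | 670 | 715
540 | 550 | 565 | 580 | 610 | 670 | 715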
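The location rule Lp = (p/100)(n + 1) with linear interpolation can be sketched as below, using the 70 apartment rents. The function name `percentile` is an illustrative choice; other software (including some Excel functions) may use a different location rule and give slightly different values.

```python
def percentile(values, p):
    """pth percentile using the location Lp = (p / 100) * (n + 1)."""
    ordered = sorted(values)
    n = len(ordered)
    location = (p / 100) * (n + 1)
    k = int(location)                  # whole part: position of the lower value
    frac = location - k                # fractional part used for interpolation
    if k < 1:
        return ordered[0]
    if k >= n:
        return ordered[-1]
    # kth value plus frac times the gap to the (k + 1)th value (1-based positions)
    return ordered[k - 1] + frac * (ordered[k] - ordered[k - 1])

rents = [525, 530, 530, 535, 535, 535, 535, 535, 540, 540,
         540, 540, 540, 545, 545, 545, 545, 545, 550, 550,
         550, 550, 550, 550, 550, 560, 560, 560, 565, 565,
         565, 570, 570, 572, 575, 575, 575, 580, 580, 580,
         580, 585, 590, 590, 590, 600, 600, 600, 600, 610,
         610, 615, 625, 625, 625, 635, 649, 650, 670, 670,
         675, 675, 680, 690, 700, 700, 700, 700, 715, 715]

p80 = percentile(rents, 80)
q1, q2, q3 = (percentile(rents, p) for p in (25, 50, 75))
print(round(p80, 1), q1, q2, q3)
```

The same function yields the quartiles used in the five-number summary, since quartiles are the 25th, 50th and 75th percentiles.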
80th Percentile
Example: Apartment Rents
“At least 80% of the items take on a value of 646.2 or less.” 56/70 = .8 or 80%
“At least 20% of the items take on a value of 646.2 or more.” 14/70 = .2 or 20%
Quartiles
Quartiles are specific percentiles.
First Quartile = 25th Percentile
Second Quartile = 50th Percentile = Median
Third Quartile = 75th Percentile
Third Quartile (75th Percentile)
Example: Apartment Rents
Lp = (p/100)(n + 1) = (75/100)(70 + 1) = 53.25
(the 53rd value plus .25 times the difference between the 54th and 53rd values)
Third quartile = 625 + .25(625 – 625) = 625
Five-Number Summaries and Box Plots
• Summary statistics and easy-to-draw graphs can be used to quickly summarize
large quantities of data.
• Two tools that accomplish this are five-number summaries and box plots.
Five-Number Summary
1. Smallest Value
2. First Quartile
3. Median
4. Third Quartile
5. Largest Value
Five-Number Summary: Example: Apartment Rents
Lowest Value = 525
First Quartile = 545
Median = 575
Third Quartile = 625
Largest Value = 715
Box Plot
• A box plot is a graphical summary of data that is based on a five-number summary.
• A key to the development of a box plot is the computation of the
median and the quartiles Q1 and Q3.
• Box plots provide another way to identify outliers.
Boxplot
A boxplot, also referred to as a box-and-whisker plot, is a way to graphically
display a five-number summary.
Steps for constructing Boxplot
a. Place the five-number summary values on the horizontal axis in ascending
order.
b. Make a box that spans from the first quartile to the third quartile.
c. In the box, draw a dashed vertical line at the median.
d. Draw a line ("whisker") from Q1 to the smallest data value that is not more
than 1.5*IQR below Q1. Similarly, draw a whisker from Q3 to the largest data
value that is not more than 1.5*IQR above Q3.
e. To indicate observations that are more than 1.5*IQR from the box (outliers),
use an asterisk (or another symbol).
Interpreting a Boxplot
A boxplot is used to estimate the shape of a distribution informally.
Symmetry: median is in the center of the box, and the left and right
whiskers are equally distant from their respective quartiles
Positively skewed: the median is left of center and the right
whisker is longer than the left whisker
Negatively skewed: the median is right of center and the left
whisker is longer than the right whisker
Measures of Variability/Dispersion
• It is often desirable to consider measures of variability
(dispersion), as well as measures of location.
• For example, in choosing supplier A or supplier B we might
consider not only the average delivery time for each but also
the variability in delivery time for each.
Measures of Variability/Dispersion
• Measures of central location give the value on which all the
observations could be assumed to be located or concentrated.
• They do not give any information about the variability present in the
data, i.e., how much the observations are scattered or differ from each
other.
• Measures of dispersion help to interpret the variability of data, i.e., to know how
homogeneous or heterogeneous the data are.
Measures of Variability
• Range
• Interquartile Range
• Variance
• Standard Deviation
• Coefficient of Variation
Range
• The range of a data set is the difference between the largest
and smallest data values.
Range = Largest value – Smallest value
• It is the simplest measure of variability.
• It is very sensitive to the smallest and largest data values.
RANGE
What is the range of the following data: 4; 8; 1; 6; 6; 2; 9; 3; 6; 9
The maximum value is 9; the minimum value is 1; the range is 9 - 1 = 8
Two very different sets of data can have the same range:
1 1 1 1 9 vs 1 3 5 7 9
Range
Example: Apartment Rents
Range = largest value - smallest value
Range = 715 - 525 = 190
Interquartile Range
• The interquartile range of a data set is the difference between the
third quartile and the first quartile.
• It is the range for the middle 50% of the data.
• It overcomes the sensitivity to extreme data values.
Interquartile Range (IQR)
Example: Apartment Rents
3rd Quartile (Q3) = 625
1st Quartile (Q1) = 545
IQR = Q3 - Q1 = 625 - 545 = 80
Variance
• The variance is a measure of variability that utilizes all the data.
• It is based on the difference between the value of each observation
($x_i$) and the mean ($\bar{x}$ for a sample, $\mu$ for a population).
• The variance is useful in comparing the variability of two or more
variables.
Variance
• Average of the squared differences between each data value and the
mean.
• Computed as follows:
$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$ for a sample
$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$ for a population
Standard Deviation
• Positive square root of the variance.
• Measured in the same units as the data, making it more easily
interpreted than the variance.
• Computed as follows:
$s = \sqrt{s^2}$ for a sample
$\sigma = \sqrt{\sigma^2}$ for a population
Coefficient of Variation
Indicates how large the standard deviation is in relation to the mean.
$\left(\frac{s}{\bar{x}} \times 100\right)\%$ for a sample
$\left(\frac{\sigma}{\mu} \times 100\right)\%$ for a population
Mean Deviation
For ungrouped data
$\text{Mean Deviation} = \frac{\sum |x_i - \bar{x}|}{n}$
For grouped data
$\text{Mean Deviation} = \frac{\sum f_i |x_i - \bar{x}|}{n}$
Standard Deviation
For ungrouped data
$\text{Variance } \sigma^2 = \frac{1}{n}\sum (x_i - \bar{x})^2$; $\text{Standard deviation} = \sqrt{\text{Variance}}$
Example: Compute the variance and standard deviation for the following ungrouped
data.
Observation ($x_i$) | $x_i - \bar{x}$ | $(x_i - \bar{x})^2$
1 | -1 | 1
2 | 0 | 0
3 | +1 | 1
Sum: 6 | 0 | 2
Mean: 2 | 0 | 2/3 = 0.67
$\text{Variance } \sigma^2 = \frac{1}{n}\sum (x_i - \bar{x})^2 = \frac{2}{3} = 0.67$
and S.D. ($\sigma$) = $\sqrt{0.67}$ = 0.82
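The worked table can be reproduced step by step in Python. This sketch follows the slide in dividing by n (the population form); the variable names are illustrative.

```python
import math

data = [1, 2, 3]
n = len(data)

mean = sum(data) / n                              # 6 / 3 = 2
deviations = [x - mean for x in data]             # -1, 0, +1
squared = [d ** 2 for d in deviations]            # 1, 0, 1

variance = sum(squared) / n                       # divides by n, as on the slide
std_dev = math.sqrt(variance)

print(round(variance, 2), round(std_dev, 2))
```

Dividing by n - 1 instead of n would give the sample variance defined earlier, which for these three values is 1.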
Standard Deviation
For grouped data
$\text{Variance} = \frac{\sum f_i (x_i - \bar{x})^2}{\sum f_i} = \frac{\sum f_i x_i^2 - \left(\sum f_i\right) \bar{x}^2}{\sum f_i}$
$\text{Standard deviation} = \sqrt{\text{Variance}}$
Example 6: Compute the mean, mean deviation, standard deviation and variance.
Class Interval | Frequency | Mid-point of class interval ($x_i$)
40-50 | 10 | 45
50-60 | 20 | 55
60-70 | 20 | 65
70-80 | 15 | 75
80-90 | 15 | 85
90-100 | 20 | 95
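One way to work through Example 6 is sketched below. It applies the grouped-data formulas from the preceding slides, using the class midpoints and dividing by n = Σf; the variable names are illustrative.

```python
import math

# (lower, upper, frequency) for each class interval in Example 6
classes = [(40, 50, 10), (50, 60, 20), (60, 70, 20),
           (70, 80, 15), (80, 90, 15), (90, 100, 20)]

n = sum(f for _, _, f in classes)
midpoints = [((lo + hi) / 2, f) for lo, hi, f in classes]

mean = sum(f * x for x, f in midpoints) / n
mean_deviation = sum(f * abs(x - mean) for x, f in midpoints) / n
variance = sum(f * (x - mean) ** 2 for x, f in midpoints) / n
std_dev = math.sqrt(variance)

print(mean, mean_deviation, variance, round(std_dev, 2))
```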
Practice Problem
Practice Problem: Compute the mean, mean deviation, standard
deviation and variance for the data given below.
Class Interval | Frequency
2000-3000 | 2
3000-4000 | 5
4000-5000 | 6
5000-6000 | 4
6000-7000 | 3
Practice Problem
Example: using the five-number summaries for Growth and Value
Find the IQR for Growth.
Determine whether any outliers exist
Repeat for Value
Solution
Example: using the five-number summaries for Growth and Value
Growth
IQR = Q3 − Q1 = 36.97 − 2.86 = 34.11,
Limit = 1.5*IQR = 1.5*34.11 = 51.17;
Q1 − Min = 2.86 − (−40.90) = 43.76 (left whisker),
Max − Q3 = 79.48 − 36.97 = 42.51 (right whisker)
Both are less than 51.17, so there are no outliers
Solution
Value
IQR = Q3 − Q1 = 22.44 − 1.70 = 20.74;
Limit = 1.5*IQR = 1.5*20.74 = 31.11;
Q1 − Min = 1.70 − (−46.52) = 48.22 (left whisker),
Max − Q3 = 44.08 − 22.44 = 21.64 (right whisker)
The left distance (48.22) exceeds the limit (31.11); hence, there is at least one
outlier on the left side
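The 1.5*IQR check used in this solution can be sketched as a small helper. The function name `check_whiskers` is hypothetical, and the Growth third quartile is taken as 36.97, the value implied by the stated IQR of 34.11 and right whisker of 42.51.

```python
def check_whiskers(minimum, q1, q3, maximum):
    """Flag outliers on either side using the 1.5 * IQR rule."""
    iqr = q3 - q1
    limit = 1.5 * iqr
    return {
        "iqr": iqr,
        "limit": limit,
        "low_outlier": (q1 - minimum) > limit,    # Min lies more than 1.5*IQR below Q1
        "high_outlier": (maximum - q3) > limit,   # Max lies more than 1.5*IQR above Q3
    }

# Five-number summary values quoted in the solution (Q3 for Growth assumed to be 36.97)
growth = check_whiskers(minimum=-40.90, q1=2.86, q3=36.97, maximum=79.48)
value = check_whiskers(minimum=-46.52, q1=1.70, q3=22.44, maximum=44.08)

print(growth)
print(value)
```

Because only the five-number summaries are available, the check can tell that an outlier exists on a side but not how many there are.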
Numerical problem
• Revenue Growth Rate. Annual revenue for Corning Supplies grew
by 5.5% in 2014, 1.1% in 2015, −3.5% in 2016, −1.1% in 2017, and
1.8% in 2018. What is the mean annual growth rate over this period?
Solution
To calculate the mean growth rate, we must first compute the geometric
mean of the five growth factors:
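Year | % Growth | Growth Factor xi
2014 | 5.5 | 1.055
2015 | 1.1 | 1.011
2016 | -3.5 | 0.965
2017 | -1.1 | 0.989
2018 | 1.8 | 1.018
Solution
$\bar{x}_g = \sqrt[5]{(1.055)(1.011)(0.965)(0.989)(1.018)} = [1.036275]^{1/5} = 1.007152$
The mean annual growth rate is (1.007152 – 1)100% = 0.7152%.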
Numerical Problem
• Hardshell Jacket Ratings. OutdoorGearLab is an organization that tests outdoor gear used
for climbing, camping, mountaineering, and backpacking. Suppose that the following data
show the ratings of hardshell jackets based on the breathability, durability, versatility,
features, mobility, and weight of each jacket. The ratings range from 0 (lowest) to 100
(highest).
• 42, 66, 67, 71, 78, 62, 61, 76, 71, 67
• 61, 64, 61, 54, 83, 63, 68, 69, 81, 53
a. Compute the mean, median, and mode.
b. Compute the first and third quartiles.
c. Compute and interpret the 90th percentile.
Solution
Numerical Problem
Air Quality Index. The Los Angeles Times regularly reports the air quality index
for various areas of Southern California. A sample of air quality index values for
Pomona provided the following data: 28, 42, 58, 48, 45, 55, 60, 49, and 50.
a. Compute the range and interquartile range.
b. Compute the sample variance and sample standard deviation.
c. A sample of air quality index readings for Anaheim provided a sample mean of
48.5, a sample variance of 136, and a sample standard deviation of 11.66. What
comparisons can you make between the air quality in Pomona and that in
Anaheim on the basis of these descriptive statistics?
Solution
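Parts (a) and (b) could be approached as sketched below with Python's standard `statistics` module. The sketch assumes the quartiles follow the Lp = (p/100)(n + 1) location rule from the percentile slides, which corresponds to `statistics.quantiles` with `method="exclusive"` (Python 3.8 or later).

```python
import statistics

# Air quality index sample for Pomona
pomona = [28, 42, 58, 48, 45, 55, 60, 49, 50]

# (a) Range and interquartile range
data_range = max(pomona) - min(pomona)
q1, q2, q3 = statistics.quantiles(pomona, n=4, method="exclusive")
iqr = q3 - q1

# (b) Sample variance and sample standard deviation (divide by n - 1)
sample_mean = statistics.mean(pomona)
sample_var = statistics.variance(pomona)
sample_sd = statistics.stdev(pomona)

print(data_range, iqr, round(sample_mean, 2), sample_var, round(sample_sd, 2))
```

For part (c), the printed mean, variance and standard deviation can be set beside the Anaheim figures (48.5, 136 and 11.66) to compare the level and the variability of the two samples.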
Measures of Distribution Shape, Relative Location, and Detecting Outliers
Distribution Shape: Skewness
z-Scores
Chebyshev’s Theorem
Empirical Rule
Detecting Outliers
Distribution Shape: Skewness
• An important measure of the shape of a distribution is called
skewness.
• The formula for the skewness of sample data is
$\text{Skewness} = \frac{n}{(n-1)(n-2)} \sum \left(\frac{x_i - \bar{x}}{s}\right)^3$
A histogram provides a graphical display showing the shape of a distribution.
Source: https://en.wikipedia.org/wiki/Skewness
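The sample skewness formula can be sketched as follows. The function name `sample_skewness` and both small data sets are made up for illustration; `s` is the sample standard deviation (divisor n - 1).

```python
import statistics

def sample_skewness(values):
    """Skewness = n / ((n - 1)(n - 2)) * sum(((x - mean) / s) ** 3)."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)       # sample standard deviation
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in values)

symmetric = [1, 2, 3, 4, 5]            # hypothetical symmetric data
right_skewed = [1, 2, 3, 4, 20]        # hypothetical data with a long right tail

print(sample_skewness(symmetric), sample_skewness(right_skewed))
```

The symmetric set gives a skewness of zero (up to floating-point rounding), while the set with one large value gives a positive skewness.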
Distribution Shape: Skewness
Symmetric (not skewed)
• Skewness is zero
• Mean and median are equal
[Figure: relative frequency histogram of a symmetric distribution; Skewness = 0]
Often, the distribution of income is positively skewed. The majority of people earn an
average income, while a few individuals make a substantial amount of money. These few
affluent individuals sit on the extreme right of the distribution, so the distribution of
income is right-skewed.
A researcher conducts a survey with a group of elderly people about their age of
retirement. Because the majority of people retire in their mid-60s or older and only a
few retire much earlier, the distribution would be negatively skewed.
Distribution Shape: Skewness
Moderately Skewed Left
Skewness is negative.
Mean will usually be less than the median.
[Figure: relative frequency histogram of a moderately left-skewed distribution; Skewness = -.31]
Distribution Shape: Skewness
Moderately Skewed Right
Skewness is positive
Mean will usually be more than the median
[Figure: relative frequency histogram of a moderately right-skewed distribution; Skewness = .31]
Distribution Shape: Skewness
Highly Skewed Right
• Skewness is positive (often above 1.0).
• Mean will usually be more than the median.
[Figure: relative frequency histogram of a highly right-skewed distribution; Skewness = 1.25]
z-Scores
In addition to measures of location, variability, and shape, we are also interested in
the relative location of values within a data set.
Measures of relative location help us determine how far a particular value is from
the mean.
• The z-score is often called the standardized value.
• It denotes the number of standard deviations a data value $x_i$ is from the mean.
$z_i = \frac{x_i - \bar{x}}{s}$
• Excel’s STANDARDIZE function can be used to compute the z-score.
z-Scores
• An observation’s z-score is a measure of the relative location of the observation
in a data set.
• A data value less than the sample mean will have a z-score less than zero.
• A data value greater than the sample mean will have a z-score greater than zero.
• A data value equal to the sample mean will have a z-score of zero.
Sample mean, $\bar{x}$ = 44; sample standard deviation, s = 8.
The z-score of −1.50 for the fifth observation shows it is farthest from the
mean; it is 1.50 standard deviations below the mean.
The z-score for any observation can be interpreted as a measure of the
relative location of the observation in a data set.
Chebyshev’s Theorem
• Enables us to make statements about the proportion of data values that must
be within a specified number of standard deviations of the mean.
• At least $(1 - 1/z^2)$ of the items in any data set will be within z standard
deviations of the mean, where z is any value greater than 1.
• Chebyshev’s theorem requires z > 1; but z need not be an integer.
Chebyshev’s Theorem
• At least 75% of the data values must be within z = 2 standard deviations of the mean.
• At least 89% of the data values must be within z = 3 standard deviations of the mean.
• At least 94% of the data values must be within z = 4 standard deviations of the mean.
Practice Problem
Suppose that the midterm test scores for 100 students in a college
business statistics course had a mean of 70 and a standard deviation of 5.
• How many students had test scores between 60 and 80?
• How many students had test scores between 58 and 82?
Solution
For the test scores between 60 and 80, we note that 60 is two standard deviations
below the mean and 80 is two standard deviations above the mean
• Using Chebyshev’s theorem, we see that at least .75, or at least 75%, of the observations,
must have values within two standard deviations of the mean.
• Thus, at least 75% of the students must have scored between 60 and 80.
Solution
For the test scores between 58 and 82, we see that (58 − 70)/5 = −2.4
indicates 58 is 2.4 standard deviations below the mean and that (82 −
70)/5 = +2.4 indicates 82 is 2.4 standard deviations above the mean.
Applying Chebyshev’s theorem with z = 2.4, we have 1 − 1/(2.4)² = 1 − 1/5.76 ≈ 0.826.
At least 82.6% of the students must have test scores between 58 and 82.
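The two Chebyshev calculations above can be checked with a short Python sketch. The function name `chebyshev_lower_bound` is an assumption made for illustration; the mean, standard deviation, and intervals come from the practice problem.

```python
def chebyshev_lower_bound(z):
    """Minimum proportion of data within z standard deviations of the mean."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem requires z > 1")
    return 1 - 1 / z**2

mean, sd = 70, 5  # test-score example from the practice problem
for low, high in [(60, 80), (58, 82)]:
    z = (high - mean) / sd  # each interval is symmetric about the mean
    print(f"{low}-{high}: z = {z}, at least {chebyshev_lower_bound(z):.1%}")
```

With 100 students, the bounds translate to at least 75 students in the first interval and at least 82 in the second.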
Empirical Rule
The advantage of Chebyshev’s theorem is that it applies to any data set, regardless of the
shape of the distribution of the data.
In many practical applications, however, data sets exhibit a symmetric, mound-shaped or
bell-shaped distribution.
When the data are believed to approximate this distribution, the empirical rule can be used
to determine the percentage of data values that must be within a specified number of
standard deviations of the mean.
Empirical Rule
When the data are believed to approximate a bell-shaped distribution:
• The empirical rule can be used to determine the percentage of data values that must
be within a specified number of standard deviations of the mean.
• The empirical rule is based on the normal distribution
For data having a bell-shaped distribution:
• Approximately 68% of the data values will be within one standard deviation of the
mean.
• Approximately 95% of the data values will be within two standard deviations of the
mean.
• Almost all of the data values will be within three standard deviations of the mean.
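Because the empirical rule is based on the normal distribution, its percentages can be reproduced from the standard normal curve. The sketch below is an illustration only; the helper name `normal_within` is hypothetical, and it relies on the identity P(|Z| < k) = erf(k/√2) for a standard normal Z.

```python
import math

def normal_within(k):
    """P(|Z| < k) for a standard normal Z, using the error function."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {normal_within(k):.4f}")
```

The three values come out near 68%, 95%, and 99.7%, which is where the rule's "approximately 68%", "approximately 95%", and "almost all" originate.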
Detecting Outliers
• An outlier is an unusually small or unusually large value in a data set.
• A data value with a z-score less than –3 or greater than +3 might be
considered an outlier.
• It might be:
• an incorrectly recorded data value
• a data value that was incorrectly included in the data set
• a correctly recorded data value that belongs in the data set
Measures of Association Between Two Variables
• We have examined numerical methods used to summarize the data for one
variable at a time.
• Often a manager or decision maker is interested in the relationship between two
variables.
• Two descriptive measures of the relationship between two variables are
covariance and correlation coefficient.
Covariance
• The covariance is a measure of the linear association between two variables.
• Positive values indicate a positive relationship.
• Negative values indicate a negative relationship.
• The covariance is computed as follows:
For samples: $s_{xy} = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$
For populations: $\sigma_{xy} = \dfrac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}$
Correlation Coefficient
Correlation is a measure of linear association and not necessarily causation.
Just because two variables are highly correlated, it does not mean that one
variable is the cause of the other.
The correlation coefficient is computed as follows:
For samples: $r_{xy} = \dfrac{s_{xy}}{s_x s_y}$
For populations: $\rho_{xy} = \dfrac{\sigma_{xy}}{\sigma_x \sigma_y}$
Correlation Coefficient
• The coefficient can take on values between –1 and +1.
• Values near –1 indicate a strong negative linear relationship.
• Values near +1 indicate a strong positive linear relationship.
• The closer the correlation is to zero, the weaker the linear relationship.
Covariance and Correlation Coefficient
Example: Golfing Study
A golfer is interested in investigating the relationship, if any, between driving distance and
18-hole score.
Average Driving Distance (yds.)    Average 18-Hole Score
277.6                              69
259.5                              71
269.1                              70
267.0                              70
255.6                              71
272.9                              69
Covariance and Correlation Coefficient
Example: Golfing Study
             x        y       (x_i − x̄)   (y_i − ȳ)   (x_i − x̄)(y_i − ȳ)
             277.6    69      10.65       −1.0        −10.65
             259.5    71      −7.45       1.0         −7.45
             269.1    70      2.15        0           0
             267.0    70      0.05        0           0
             255.6    71      −11.35      1.0         −11.35
             272.9    69      5.95        −1.0        −5.95
Average      267.0    70.0
Std. Dev.    8.2192   .8944
Total                                                 −35.40
Covariance and Correlation Coefficient
• Example: Golfing Study
• Sample Covariance
$s_{xy} = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1} = \dfrac{-35.40}{6 - 1} = -7.08$
• Sample Correlation Coefficient
$r_{xy} = \dfrac{s_{xy}}{s_x s_y} = \dfrac{-7.08}{(8.2192)(.8944)} = -.9631$
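The golfing-study figures can be reproduced with a few lines of Python. This is a sketch using only the standard library; the variable names are chosen for illustration, and the data are the six pairs from the example.

```python
import math

distance = [277.6, 259.5, 269.1, 267.0, 255.6, 272.9]  # x, yards
score = [69, 71, 70, 70, 71, 69]                        # y, 18-hole score

n = len(distance)
x_bar = sum(distance) / n
y_bar = sum(score) / n

# Sample covariance: sum of cross-products of deviations over (n - 1).
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(distance, score)) / (n - 1)

# Sample standard deviations, also with the (n - 1) divisor.
s_x = math.sqrt(sum((x - x_bar) ** 2 for x in distance) / (n - 1))
s_y = math.sqrt(sum((y - y_bar) ** 2 for y in score) / (n - 1))

r_xy = s_xy / (s_x * s_y)
print(round(s_xy, 2), round(r_xy, 4))
```

The strongly negative correlation says that, in this sample, longer average drives go with lower (better) scores; it does not by itself establish causation.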
Quiz-1
Q-1
Q-2
In each of the following scenarios, define the type of
measurement scale.
a. An investor collects data on the weekly closing price
of gold throughout the year.
b. An analyst assigns a sample of bond issues to one of
the following credit ratings, given in descending order of
credit quality (increasing probability of default): AAA,
AA, BBB, BB, CC, D.
c. The dean of the business school at a local university
categorizes students by major (i.e., accounting, finance,
marketing, etc.) to help in determining class offerings in
the future.
Practice Problem
A data set has a mean of 1,500 and a standard deviation of
100.
a. Using Chebyshev’s theorem, what percentage of the
observations fall between 1,300 and 1,700?
b. Using Chebyshev’s theorem, what percentage of the
observations fall between 1,100 and 1,900?
Solution
Practice Problem
A sample space S yields five equally likely events, A, B, C, D, and E.
a. Find P(D)
b. Find P(B^c)
c. Find P(A ∪ C ∪ E)
Solution
Practice Problem
An alarming number of U.S. adults are either overweight or obese. The distinction
between overweight and obese is made on the basis of body mass index (BMI),
expressed as weight/height². An adult is considered overweight if the BMI is 25 or
more but less than 30. An obese adult will have a BMI of 30 or greater. According
to a January 2012 article in the Journal of the American Medical Association,
33.1% of the adult population in the United States is overweight and 35.7% is
obese. Use this information to answer the following questions.
a. What is the probability that a randomly selected adult is either overweight or
obese?
b. What is the probability that a randomly selected adult is neither overweight nor
obese?
c. Are the events “overweight” and “obese” exhaustive?
d. Are the events “overweight” and “obese” mutually exclusive?
Solution
Introduction to Probability
• Random Experiments, Counting Rules, and Assigning
Probabilities
• Events and Their Probability
• Some Basic Relationships of Probability
• Conditional Probability
• Bayes’ Theorem
Uncertainties
• Managers often base their decisions on an analysis of
uncertainties such as the following:
• What are the chances that the sales will decrease if we increase
prices?
• What is the likelihood a new assembly method will increase
productivity?
• What are the odds that a new investment will be profitable?
Probability
• Probability is a numerical measure of the likelihood that an event will occur.
• Probability values are always assigned on a scale from 0 to 1.
• A probability near zero indicates an event is quite unlikely to occur.
• A probability near one indicates an event is almost certain to occur.
Statistical Experiments
• In statistics, the notion of an experiment differs somewhat from that
of an experiment in the physical sciences.
• In statistical experiments, probability determines outcomes.
• Even though the experiment is repeated exactly the same way, an
entirely different outcome may occur.
• For this reason, statistical experiments are sometimes called random
experiments.
Random Experiment and Its Sample Space
• A random experiment is a process that generates well-defined experimental outcomes.
• The sample space for an experiment is the set of all experimental outcomes.
• An experimental outcome is also called a sample point.
Experiment              Experimental Outcomes
Toss a coin             Head, tail
Inspect a part          Defective, non-defective
Conduct a sales call    Purchase, no purchase
Roll a die              1, 2, 3, 4, 5, 6
Play a football game    Win, lose, tie
Probability
A probability is a numerical value that measures the likelihood that an
event occurs.
– Between zero (0) and one (1)
– 0 → impossible event that never occurs
– 1 → a definite event that always occurs
• An experiment is a process that leads to one of several possible
outcomes.
– Actual outcome is not known with certainty before the experiment begins
– Diversity of outcomes is due to uncertainty
• Example: rolling a fair die
Probability
The sample space of an experiment, denoted S, contains all possible outcomes of the
experiment. For example, letter grades in a course: S = {A, B, C, D, F}; passing a
course or not: S = {P, F}.
An event is any subset of outcomes of the experiment. It is a simple event if it contains
a single outcome, but it may contain several outcomes. For example, a passing grade is
the event Pass = {A, B, C, D}. Another example is tossing a coin.
Probability
Mutually exclusive events
They do not share any common outcomes.
The occurrence of one event precludes the occurrence of the others.
For example,
Sample space for a die: S = {1, 2, 3, 4, 5, 6}
X = event that the number is even = {2, 4, 6}
Y = event that the number is odd = {1, 3, 5}
Here, X and Y are mutually exclusive events: X ∩ Y = ∅
Two events are mutually exclusive if the occurrence of either event excludes the
occurrence of the other.
Probability
Exhaustive events
All possible outcomes of an experiment belong to the events; together they include all
outcomes in the sample space.
For example,
Sample space for throwing a die once: S = {1, 2, 3, 4, 5, 6}
X = event that the number on the die is less than 4 = {1, 2, 3}
Y = event that the number on the die is greater than 2 and less than 5 = {3, 4}
Z = event that the number on the die is greater than 4 = {5, 6}
Here, X ∪ Y ∪ Z = {1, 2, 3, 4, 5, 6} = S
Events are exhaustive if at least one of them necessarily occurs whenever the
experiment is performed.
Probability
We can define events based on one or more outcomes of the experiment and also combine
events to form new events.
Venn Diagram
Sample space S with a rectangle
Two circles to represent the events A and B
Union of two events
Denoted A ∪ B
All outcomes in A or B (or both)
The portion in the Venn diagram that is included in either A or B
Probability
Intersection of two events
Denoted A ∩ B
All outcomes in A and B
The portion in the Venn diagram that is included in both A and B, the overlap
Probability
Complement of an event A
Denoted A^c
All outcomes in the sample space S that are not in A
The portion in the Venn diagram that is everything in S that is not included in A
Practice Problem
You roll a die with the sample space S = {1, 2, 3, 4, 5, 6}. You define A as
{1, 2, 3}, B as {1, 2, 3, 5, 6}, C as {4, 6}, and D as {4, 5, 6}. Determine
which of the following events are exhaustive and/or mutually exclusive.
a. A and B
b. A and C
c. A and D
d. B and C
Solution
a. A ∪ B = {1, 2, 3, 5, 6} ≠ {1, 2, 3, 4, 5, 6} = S; the events A and B are not
exhaustive.
A ∩ B = {1, 2, 3}; the events A and B are not mutually exclusive.
b. A ∪ C = {1, 2, 3, 4, 6} ≠ {1, 2, 3, 4, 5, 6} = S; the events A and C are not
exhaustive.
A ∩ C = ∅; the events A and C are mutually exclusive.
c. A ∪ D = {1, 2, 3, 4, 5, 6} = S; the events A and D are exhaustive.
A ∩ D = ∅; the events A and D are mutually exclusive.
d. B ∪ C = {1, 2, 3, 4, 5, 6} = S; the events B and C are exhaustive.
B ∩ C = {6}; the events B and C are not mutually exclusive.
Probability
Properties of probability
1. The probability of an event A is a value between 0 and 1; that is, 0 ≤ P(A) ≤ 1.
2. The sum of the probabilities of any list of mutually exclusive and exhaustive events equals 1.
There are three types of probabilities.
Subjective: calculated by drawing on personal and subjective judgement
Empirical: calculated as a relative frequency of occurrence
Classical: based on logical analysis
Because empirical and classical probabilities do not vary from person to person, they are often grouped as objective probabilities.
Rules of Probability
1. Complement rule
Follows from one of the defining properties of probability: P(A) + P(A^c) = 1
Rearrange: P(A^c) = 1 − P(A)
2. Addition rule
Used to find the probability of the union of two events
The probability that A or B occurs, or that at least one of these events occurs
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
P(A ∩ B) is subtracted because it is counted twice, once in P(A) and once in P(B)
P(A ∩ B) is referred to as the joint probability
Rules of Probability
For mutually exclusive events A and B, the probability of their
intersection is zero, P(A ∩ B) = 0, so the addition rule simplifies to
P(A ∪ B) = P(A) + P(B)
Conditional probability
In business applications, the probability of interest is often a
conditional probability.
Examples include the probability that the customer will make an
online purchase conditional on receiving an e-mail with a discount
offer; the probability of making a six-figure salary conditional on
getting an MBA; and the probability that sales will improve
conditional on the firm launching a new marketing campaign.
Conditional probability
The conditional probability that A occurs given that B has occurred is derived as
$P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$
Because P(A | B) is conditional on B (B has occurred), the sample space reduces to B.
P(A | B) is the A ∩ B portion in the Venn diagram that is included in B.
Similarly,
$P(B \mid A) = \dfrac{P(A \cap B)}{P(A)}$
Multiplication rule of Probability
We can find the joint probability as the product of
probabilities using the conditional probability formula;
this is the multiplication rule.
The joint probability of events A and B is derived as
P(A ∩ B) = P(A | B) P(B)
Probability
Two events are independent if the occurrence of one event
does not affect the probability of the occurrence of the other
event.
Events are considered dependent if the occurrence of one is
related to the probability of the occurrence of the other event.
Two events, A and B, are independent if
P(A | B) = P(A) or, equivalently,
P(A ∩ B) = P(A) P(B).
Practice Problem
Suppose that for a given year there is a 2% chance that
your desktop computer will crash and a 6% chance that
your laptop computer will crash. Moreover, there is a
0.12% chance that both computers will crash. Is the
reliability of the two computers independent of each
other?
Solution
Let D represent the outcome that your desktop crashes, P(D) = 0.02
Let L represent the outcome that your laptop crashes, P(L) = 0.06
The joint probability is P(D ∩ L) = 0.0012
Calculate
$P(D \mid L) = \dfrac{P(D \cap L)}{P(L)} = \dfrac{0.0012}{0.06} = 0.02$
So, P(D | L) = P(D). If your laptop crashes, it does not alter
the probability that your desktop also crashes.
The reliability of the two computers is independent.
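A short Python sketch of the independence check, using both equivalent conditions. The variable names are illustrative; `math.isclose` is used because floating-point division rarely reproduces decimals exactly.

```python
import math

p_d = 0.02       # P(D): desktop crashes
p_l = 0.06       # P(L): laptop crashes
p_both = 0.0012  # P(D ∩ L): both crash

p_d_given_l = p_both / p_l

# Compare with a tolerance because the inputs are floating-point numbers.
independent = math.isclose(p_d_given_l, p_d) and math.isclose(p_both, p_d * p_l)
print(p_d_given_l, independent)
```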
Practice Problem
Let P(A) = 0.65, P(B) = 0.30, and P(A | B) = 0.45.
a. Calculate P(A ∩ B).
b. Calculate P(A ∪ B).
c. Calculate P(B | A).
Consider the following probabilities: P(A) = 0.40, P(B) = 0.50,
and P(A^c ∩ B^c) = 0.24. Find:
a. P(A^c | B^c)
b. P(A^c ∪ B^c)
c. P(A ∪ B)
Hint for part c: P(A^c ∩ B^c) = P((A ∪ B)^c) = 1 − P(A ∪ B) = 0.24.
Solution
a. 0.48
b. 0.86
c. 0.76
Practice Problem
An analyst estimates that the probability of default on a seven-year AA-rated
bond is 0.06, while that on a seven-year A-rated bond is 0.13. The probability
that they will both default is 0.04.
a. What is the probability that at least one of the bonds defaults?
b. What is the probability that neither the seven-year AA-rated bond nor the
seven-year A-rated bond defaults?
c. Given that the seven-year AA-rated bond defaults, what is the probability that
the seven-year A-rated bond also defaults?
Solution
Let A be the event that the AA-rated bond defaults and B the event that the A-rated bond defaults.
P(A) = 0.06, P(B) = 0.13, and P(A ∩ B) = 0.04
a. P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 0.06 + 0.13 − 0.04 = 0.15
b. P((A ∪ B)^c) = 1 − P(A ∪ B) = 1 − 0.15 = 0.85
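The bond-default calculations can be sketched in Python. The slide shows parts a and b only; the last line applies the conditional probability formula to part c as an illustration, not as the instructor's worked answer. Variable names are assumptions.

```python
p_a = 0.06        # P(A): seven-year AA-rated bond defaults
p_b = 0.13        # P(B): seven-year A-rated bond defaults
p_a_and_b = 0.04  # P(A ∩ B): both default

p_a_or_b = p_a + p_b - p_a_and_b  # addition rule (part a)
p_neither = 1 - p_a_or_b          # complement rule (part b)
p_b_given_a = p_a_and_b / p_a     # conditional probability (part c)

print(round(p_a_or_b, 2), round(p_neither, 2), round(p_b_given_a, 4))
```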
Practice Problem
Apple products have become a household name in America. Suppose that the likelihood of
owning an Apple product is 61% for households with kids and 48% for households without
kids. Suppose there are 1,200 households in a representative community, of which 820 are with
kids and the rest are without kids.
a. Are the events “household with kids” and “household without kids” mutually exclusive and
exhaustive? Explain.
b. What is the probability that a household is without kids?
c. What is the probability that a household is with kids and owns an Apple product?
d. What is the probability that a household is without kids and does not own an Apple product?
Practice Problem
Contingency tables and probabilities
A contingency table is useful when examining the relationship
between two categorical variables.
It shows the frequencies for two categorical variables, x and y.
Each cell represents a mutually exclusive combination of the
pair of x and y values.
We can estimate an empirical probability by calculating the
relative frequency of the occurrence of the event.
Practice Problem
Consider the following contingency table.
        A     A^c
B       26    14
B^c     34    26
a. Convert the contingency table into a joint probability table.
b. What is the probability that A occurs?
c. What is the probability that A and B occur?
d. Given that B has occurred, what is the probability that A occurs?
e. Given that A^c has occurred, what is the probability that B occurs?
f. Are A and B mutually exclusive events? Explain.
g. Are A and B independent events? Explain.
Solution
         A      A^c    Total
B        0.26   0.14   0.40
B^c      0.34   0.26   0.60
Total    0.60   0.40   1
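As a sketch, the joint probability table and the remaining parts of the problem can be derived from the cell counts in Python. The dictionary keys (`"Ac"` for A^c, `"Bc"` for B^c) and variable names are illustrative choices, and the answers to parts b through g are computed here rather than taken from the slide.

```python
import math

counts = {("A", "B"): 26, ("Ac", "B"): 14, ("A", "Bc"): 34, ("Ac", "Bc"): 26}
total = sum(counts.values())

# Joint probability table: each cell count divided by the grand total.
joint = {cell: n / total for cell, n in counts.items()}

p_a = joint[("A", "B")] + joint[("A", "Bc")]   # marginal P(A)
p_b = joint[("A", "B")] + joint[("Ac", "B")]   # marginal P(B)
p_ac = 1 - p_a
p_a_given_b = joint[("A", "B")] / p_b          # P(A | B)
p_b_given_ac = joint[("Ac", "B")] / p_ac       # P(B | A^c)

mutually_exclusive = joint[("A", "B")] == 0
independent = math.isclose(joint[("A", "B")], p_a * p_b)

print(p_a, joint[("A", "B")], p_a_given_b, p_b_given_ac)
print(mutually_exclusive, independent)
```

Since P(A ∩ B) = 0.26 is neither zero nor equal to P(A)P(B) = 0.24, the events are neither mutually exclusive nor independent.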
Contingency Tables and Probabilities
Enrollment and age group from the introductory case
• What is the probability that a randomly selected attendee enrolls in the fitness center?
• What is the probability that a randomly selected attendee is over 50 years old?
• What is the probability that a randomly selected attendee enrolls in the fitness center
and is over 50 years old?
• What is the probability that an attendee enrolls in the fitness center, given the attendee
is over 50 years old?
Solution
Let E denote the event of enrolling in the fitness center.
Let O denote the event of being over 50 years old.
a. What is the probability that a randomly selected attendee enrolls in the fitness center?
$P(E) = \dfrac{140}{400} = 0.35$
b. What is the probability that a randomly selected attendee is over 50 years old?
$P(O) = \dfrac{132}{400} = 0.33$
c. What is the probability that a randomly selected attendee enrolls in the fitness center and is over
50 years old?
$P(E \cap O) = \dfrac{44}{400} = 0.11$
d. What is the probability that an attendee enrolls in the fitness center, given the attendee is over 50
years old?
$P(E \mid O) = \dfrac{44}{132} = 0.33$
Equivalently, $P(E \mid O) = \dfrac{P(E \cap O)}{P(O)} = \dfrac{0.11}{0.33} = 0.33$
Joint Probability
Let X and Y be two events in a sample space. Then the joint
probability of the two events, written as P(X ∩ Y), is given by
$P(X \cap Y) = \dfrac{\text{Number of observations in } X \cap Y}{\text{Total number of observations}}$
Total probability rule and Bayes’ Theorem
The total probability rule expresses the probability of an event, A, in terms of
probabilities of the intersection of A with any mutually exclusive and exhaustive
events.
The total probability rule based on two events, B and B^c, is given by
P(A) = P(A ∩ B) + P(A ∩ B^c).
Total probability rule and Bayes’ Theorem
Bayes’ theorem is a procedure for updating probabilities based on new
information; it uses the total probability rule.
The original probability is an unconditional probability called a prior probability,
in the sense that it reflects only what we know before the arrival of new
information.
On the basis of new information, we update the prior probability to arrive at a
conditional probability called a posterior probability.
The posterior probability P(B | A) can be found using the information on the prior
probability P(B) along with conditional probabilities.
Bayes Theorem
• Bayes’ theorem is a fundamental concept in analytics that is widely utilized in
solving various problems through the application of Bayesian statistics.
$P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$
$P(B \mid A) = \dfrac{P(A \cap B)}{P(A)}$
The two equations allow us to demonstrate that
$P(B \mid A) = \dfrac{P(A \mid B) P(B)}{P(A)}$
Terms for Bayes Theorem components
• The prior probability (estimate of the probability without any further information) is denoted by P(B).
• P(B|A) is known as the posterior probability: given the new information (or additional
evidence) that A has occurred, what is the updated probability that B will occur?
• P(A|B) denotes the likelihood of observing evidence A if B is true.
• P(A) represents the prior probability of A.
Prior Probabilities → New Information → Application of Bayes’ Theorem → Posterior Probabilities
Total probability rule and Bayes’ Theorem
The posterior probability P(B | A) can be found using the information
on the prior probability P(B) along with conditional probabilities as
$P(B \mid A) = \dfrac{P(A \cap B)}{P(A)} = \dfrac{P(A \cap B)}{P(A \cap B) + P(A \cap B^c)} = \dfrac{P(A \mid B) P(B)}{P(A \mid B) P(B) + P(A \mid B^c) P(B^c)}$
Total probability rule and Bayes’ Theorem
The analysis can be extended to include n mutually exclusive and exhaustive events
B_1, B_2, ⋯, B_n.
For the extended case, Bayes’ theorem, for any i = 1, 2, . . ., n, is
$P(B_i \mid A) = \dfrac{P(A \cap B_i)}{P(A)} = \dfrac{P(A \cap B_i)}{P(A \cap B_1) + P(A \cap B_2) + \cdots + P(A \cap B_n)}$
$P(B_i \mid A) = \dfrac{P(A \mid B_i) P(B_i)}{P(A \mid B_1) P(B_1) + P(A \mid B_2) P(B_2) + \cdots + P(A \mid B_n) P(B_n)}$
Generalization of Bayes Theorem
Event generated from mutually exclusive subsets
Practice Problem
Christine has always been weak in mathematics. Based on her performance prior
to the final exam in Calculus, there is a 40% chance that she will fail the course if
she does not have a tutor. With a tutor, her probability of failing decreases to 10%.
There is only a 50% chance that she will find a tutor at such short notice.
a. What is the probability that Christine fails the course?
b. Christine ends up failing the course. What is the probability that she had found
a tutor?
Solution
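The slide leaves this solution blank. One way to work it, offered as a sketch rather than the instructor's solution, is to apply the total probability rule for part a and Bayes' theorem for part b. Here T stands for "finds a tutor" and F for "fails the course"; these labels and the variable names are assumptions.

```python
p_tutor = 0.50             # P(T): Christine finds a tutor
p_fail_given_tutor = 0.10  # P(F | T)
p_fail_given_none = 0.40   # P(F | T^c)

# Total probability rule: P(F) = P(F | T) P(T) + P(F | T^c) P(T^c)
p_fail = p_fail_given_tutor * p_tutor + p_fail_given_none * (1 - p_tutor)

# Bayes' theorem: P(T | F) = P(F | T) P(T) / P(F)
p_tutor_given_fail = p_fail_given_tutor * p_tutor / p_fail

print(round(p_fail, 4), round(p_tutor_given_fail, 4))
```

Under these inputs the prior 50% chance of having a tutor drops to 20% once we learn that she failed.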
Practice Problem
• In a lie-detector test, an individual is asked to answer a series of
questions while connected to a polygraph (lie detector).
• This instrument measures and records several physiological responses
of the individual on the basis that false answers will produce distinctive
measurements.
• Assume that 99% of the individuals who go in for a polygraph test tell
the truth.
• These tests are considered to be 95% reliable.
• In other words, there is a 95% chance that the test will detect a lie if an
individual actually lies.
• Let there also be a 0.5% chance that the test erroneously detects a lie
even when the individual is telling the truth.
• An individual has just taken a polygraph test and the test has detected a
lie. What is the probability that the individual was actually telling the
truth?
Solution
• Let D and T correspond to the events that the polygraph detects a lie
and that an individual is telling the truth, respectively; P(T) = 0.99 and P(T^c) = 0.01.
• We formulate P(D | T^c) = 0.95 and P(D | T) = 0.005.
• We can use Bayes’ theorem to find
$P(T \mid D) = \dfrac{P(D \mid T) P(T)}{P(D \mid T) P(T) + P(D \mid T^c) P(T^c)} = \dfrac{0.005 \times 0.99}{0.005 \times 0.99 + 0.95 \times 0.01} = 0.34256$
Solution
The table provided can assist in solving the
problem in a systematic manner.
Monty Hall Problem
Practice Problem
Dr. Miriam Johnson has been teaching accounting for over 20 years. From her
experience, she knows that 60% of her students do homework regularly.
Moreover, 95% of the students who do their homework regularly pass the course.
She also knows that 85% of her students pass the course.
a. What is the probability that a student will do homework regularly and also
pass the course?
b. What is the probability that a student will neither do homework regularly nor
will pass the course?
c. Are the events “pass the course” and “do homework regularly” mutually
exclusive? Explain.
d. Are the events “pass the course” and “do homework regularly” independent?
Explain
Solution
Let event A correspond to “Do homework regularly” and B to “Pass the
course”.
P(A) = 0.60, P(B | A) = 0.95, P(B) = 0.85
a. P(A ∩ B) = P(B | A) P(A) = 0.95(0.60) = 0.57
b. P((A ∪ B)^c) = 1 − P(A ∪ B)
P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 0.60 + 0.85 − 0.57 = 0.88
Therefore, P((A ∪ B)^c) = 1 − 0.88 = 0.12
c. No, because P(A ∩ B) = 0.57 ≠ 0
d. No, because P(B | A) = 0.95 ≠ 0.85 = P(B)
Counting Rules to compute possible event
arrangements
• Basic Rule
-> If event X can occur in n1 ways and event Y can occur in n2 ways, then
events X and Y can occur in n1 × n2 ways.
-> In general, m events can occur in n1 × n2 × … × nm ways.
-> How many different stock-keeping unit (SKU) labels can a hardware
store make by utilizing two letters (AA-ZZ) followed by four numerals
(0-9)?
Factorial
• The number of possibilities to arrange n items in a specific
order is n factorial.
• The product of all integers from 1 to n is n factorial.
• n! = n(n−1)(n−2)...1
• Factorials can be used to count the possible orderings of
any n things.
• There are n different ways to choose the first, n − 1 different
ways to choose the second, and so on.
• A home appliance service vehicle is required to make three
stops (A, B, and C). How many different ways may the
three stops be arranged? 3! = 3 × 2 × 1 = 6
Permutations
• A permutation, indicated by nPr, is an arrangement in a specific order
of r randomly picked elements from a set of n objects.
• In other words, how many ways can the r things be organised from
the n things while treating each arrangement as distinct (i.e., XYZ is
distinct from ZYX)?
Combinations
• A combination is a selection of r items chosen at random among
n items, where the order of the elements is irrelevant (i.e., XYZ is the
same as ZYX).
• The symbol for a combination is nCr.
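The counting rules above can be tried out with Python's standard `math` module (`math.perm` and `math.comb` require Python 3.8 or later). The slides do not show the formulas; the sketch assumes the standard ones, nPr = n!/(n − r)! and nCr = n!/(r!(n − r)!), and the choice n = 3, r = 2 is purely illustrative.

```python
import math

# Basic counting rule: two letters (26 choices each), four digits (10 each).
sku_labels = 26**2 * 10**4

# Factorial: orderings of the three service stops A, B, and C.
stop_orders = math.factorial(3)

# Permutations and combinations of r = 2 items drawn from n = 3 (X, Y, Z).
n, r = 3, 2
n_p_r = math.perm(n, r)  # n! / (n - r)!
n_c_r = math.comb(n, r)  # n! / (r! (n - r)!)

print(sku_labels, stop_orders, n_p_r, n_c_r)
```

Each combination of r items corresponds to r! permutations, so nPr = r! × nCr.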
Practice Problem
Black boxes used in aircraft are manufactured by three companies, A, B,
and C. 75% are manufactured by A, 15% by B, and 10% by C. The
defect rates of black boxes manufactured by A, B, and C are 4%, 6%,
and 8%, respectively. If a black box tested randomly is found to be
defective, what is the probability that it is manufactured by company A?
Solution
Let A, B, and C be the events that the black box is manufactured by companies A, B, and C,
respectively, and let D be the event that a black box is defective. The
probability P(A | D) is calculated as follows:
$P(A \mid D) = \dfrac{P(D \mid A) \times P(A)}{P(D)}$
• Now P(D | A) = 0.04 and P(A) = 0.75.
P(D) = 0.75 × 0.04 + 0.15 × 0.06 + 0.10 × 0.08 = 0.047
So,
$P(A \mid D) = \dfrac{0.04 \times 0.75}{0.047} \approx 0.6383$
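The black-box calculation is an instance of Bayes' theorem with n = 3 mutually exclusive and exhaustive events. The sketch below generalizes it; the function name `posterior` and the dictionary layout are assumptions made for illustration.

```python
def posterior(priors, likelihoods):
    """Posterior P(B_i | A) for mutually exclusive, exhaustive events B_i.

    priors[k] is P(B_k); likelihoods[k] is P(A | B_k).
    """
    joint = {k: priors[k] * likelihoods[k] for k in priors}
    p_evidence = sum(joint.values())  # total probability rule gives P(A)
    return {k: joint[k] / p_evidence for k in joint}

priors = {"A": 0.75, "B": 0.15, "C": 0.10}        # share of production
defect_rates = {"A": 0.04, "B": 0.06, "C": 0.08}  # P(D | manufacturer)

result = posterior(priors, defect_rates)
print({k: round(v, 4) for k, v in result.items()})
```

The three posteriors sum to 1, which is a quick sanity check that the priors were exhaustive.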
Discrete & Continuous
Probability Distribution
Case: Available Staff for Probable Customers
Anne Jones works as a manager at a nearby Starbucks. Starbucks revealed plans to close 500
stores in the United States in 2008. While Anne's shop will remain open, she is afraid that
surrounding store closings would have an impact on her business. Anne must decide on
workforce requirements.
A store with too many employees would be costly. Customers who opt not to wait may be lost
if there are not enough personnel.
Determine the likelihood that a typical customer will visit the store a certain number of
times in a particular time period.
Using her grasp of the probability distribution of customer arrivals, Anne will be able to
calculate the expected number of visits from a typical Starbucks customer in a particular
time period.
Random Variables
• A random variable is a function that converts every
possible outcome in the sample space to a real value.
• Each sample point in the sample space S is
assigned a real number using this function.
• A random variable is a reliable and practical
technique of describing the results of a random
experiment.
Discrete Random Variables
A discrete random variable is one that can only take on a finite or countably infinite set
of values. Here are some examples of discrete random variables:
• Credit rating (typically categorized as low, medium, or high, or using designations such
as AAA, AA, A, BBB, and so on).
• The number of orders received by an e-commerce retailer, which can be countably infinite.
• Customer churn (the random variable takes binary values: 1. Churn and 2. Do not churn).
• Fraud (the random variable takes binary values: 1. Fraudulent transaction and 2. Genuine
transaction).
• Any experiment that involves counting (for example, the number of returns in a day from
users of e-commerce sites such as Amazon and Flipkart; the number of clients who do not
accept job offers from an organization).
Continuous Random Variables
• A continuous random variable X is a random variable that can take on any value in an
interval, so it has an uncountably infinite number of possible values.
• The following are examples of random variables that are continuous:
• A company's market share (which can assume any of an infinite number of
values between 0% and 100%).
• A company's attrition rate as a proportion of its workforce.
• Engineering systems' time to failure.
• The amount of time necessary to execute an online order.
• Call and service centers’ resolution duration for consumer complaints.
Probability mass function
Every discrete random variable is associated with a
probability distribution.
For a discrete random variable X, the probability that X takes a specific value xi, P(X = xi), is called the probability mass function P(xi).
That is, a probability mass function maps each possible value of the random variable to its probability.
Expected Value
• The expected value (or mean) of a discrete random variable is given by
\[ E(X) = \sum_{i=1}^{n} x_i \, P(x_i) \]
• Here xi is a specific value taken by the discrete random variable X, and P(xi) is the corresponding probability, that is, P(X = xi).
Variance and Standard Deviation
• The variance of a discrete random variable is given by
\[ \mathrm{Var}(X) = \sum_{i=1}^{n} [x_i - E(X)]^{2} \, P(x_i) \]
• The standard deviation of a discrete random variable is given by
\[ \sigma = \sqrt{\mathrm{Var}(X)} \]
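The expected value, variance, and standard deviation formulas for a discrete random variable can be sketched in a few lines of Python. The probability table below is purely hypothetical (it is not taken from the slides), and the function names are illustrative.

```python
import math

# Hypothetical PMF (an assumption for illustration only):
# number of items a customer returns, mapped to its probability.
pmf = {0: 0.5, 1: 0.3, 2: 0.2}

def expected_value(pmf):
    """E(X) = sum of x_i * P(x_i)."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = sum of (x_i - E(X))^2 * P(x_i)."""
    mu = expected_value(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

def std_dev(pmf):
    """Standard deviation is the square root of the variance."""
    return math.sqrt(variance(pmf))

print(expected_value(pmf), variance(pmf), std_dev(pmf))
```

The same three functions work for any table of values and probabilities, provided the probabilities sum to 1.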
Probability Density Function (pdf)
The probability density function, f(x), is defined as the limiting ratio of the probability that the random variable X lies in an infinitesimally small interval between x and x + δx to the width of that interval:
\[ f(x) = \lim_{\delta x \to 0} \frac{P(x \le X \le x + \delta x)}{\delta x} \]
Cumulative Distribution Function (CDF)
• The cumulative distribution function (CDF) of a continuous random variable is defined by
\[ F(a) = P(X \le a) = \int_{-\infty}^{a} f(x)\,dx \]
Key Points about the Distributions
• The probability density function and cumulative distribution function of a continuous random variable satisfy the following properties:
\[ f(x) \ge 0 \]
\[ F(\infty) = \int_{-\infty}^{+\infty} f(x)\,dx = 1 \]
\[ P(a \le X \le b) = \int_{a}^{b} f(x)\,dx = F(b) - F(a) \]
• The probability between two values a and b, P(a ≤ X ≤ b), is the area between a and b under the probability density function.
Key Points about the Distributions
• The expected value of a continuous random variable, E(X), is given by
\[ E(X) = \int_{-\infty}^{+\infty} x\,f(x)\,dx \]
• The variance of a continuous random variable, Var(X), is given by
\[ \mathrm{Var}(X) = \int_{-\infty}^{+\infty} [x - E(X)]^{2}\, f(x)\,dx \]
Types of Probability Distributions
Discrete
• Binomial Distribution
• Poisson Distribution
• Hypergeometric Distribution
Continuous
• Normal Distribution
• Uniform Distribution
• Exponential Distribution
• Logistic Distribution
Practice Problem
Brad Williams is the owner of a large car dealership in Chicago. Brad decides to
construct an incentive compensation program that equitably and consistently
compensates employees on the basis of their performance.
a. Calculate the expected value of the annual bonus amount.
b. Calculate the variance and the standard deviation of the annual bonus amount.
c. What is the total annual amount that Brad can expect to pay in bonuses if he has 25
employees?
Solution
Let the random variable X denote the bonus amount (in $1,000s).
a. The expected value is E(X) = μ = Σ xi P(X = xi) = 4.2, or $4,200.
b. The variance is Var(X) = σ² = Σ (xi − μ)² P(X = xi) = 9.97 (in ($1,000s)²), and the standard deviation is SD(X) = σ = √9.97 = 3.158, or $3,158.
c. With 25 employees, Brad can expect to pay $4,200 × 25 = $105,000 in bonuses.
Binomial Distribution
A random variable X is said to follow a Binomial distribution when
• Each trial can have only two outcomes, success and failure (such trials are known as Bernoulli trials).
• The objective is to find the probability of getting k successes out of n trials.
• The probability of success is p, and thus the probability of failure is (1 − p).
• The probability p is constant and does not change between trials.
Binomial Probability Distribution
Four Properties of a Binomial Experiment
1. The experiment consists of a sequence of n identical trials.
2. Two outcomes, success and failure, are possible on each trial.
3. The probability of a success, denoted by p, does not change from trial to
trial. (This is referred to as the stationarity assumption)
4. The trials are independent.
Binomial Probability Distribution
• Our interest is in the number of successes occurring in the n
trials.
• We let x denote the number of successes occurring in the n trials.
Binomial Probability Distribution
Binomial Probability Function
\[ f(x) = \frac{n!}{x!\,(n-x)!}\, p^{x} (1-p)^{n-x} \]
Where:
x = the number of successes
p = the probability of a success on one trial
n = the number of trials
f(x) = the probability of x successes in n trials
n! = n(n – 1)(n – 2) … (2)(1)
Binomial Probability Function
\[ f(x) = \frac{n!}{x!\,(n-x)!}\, p^{x} (1-p)^{n-x} \]
• \( \frac{n!}{x!\,(n-x)!} \): the number of experimental outcomes providing exactly x successes in n trials
• \( p^{x} (1-p)^{n-x} \): the probability of a particular sequence of trial outcomes with x successes in n trials
Probability Mass Function (PMF) of Binomial
Distribution
• The PMF of the binomial distribution (the probability that the number of successes will be exactly x out of n trials) is given by
\[ \mathrm{PMF}(x) = P(X = x) = \binom{n}{x} p^{x} (1-p)^{n-x}, \quad 0 \le x \le n, \qquad \text{where } \binom{n}{x} = \frac{n!}{x!\,(n-x)!} \]
Cumulative Distribution Function (CDF) of
Binomial Distribution
• The CDF of a binomial distribution, F(a), representing the probability that the random variable X takes a value less than or equal to a, is given by
\[ F(a) = P(X \le a) = \sum_{k=0}^{a} P(X = k) = \sum_{k=0}^{a} \binom{n}{k} p^{k} (1-p)^{n-k} \]
Mean and Variance of Binomial Distribution
The mean of a binomial distribution is given by
\[ \text{Mean} = E(X) = \sum_{x=0}^{n} x \times \mathrm{PMF}(x) = \sum_{x=0}^{n} x \binom{n}{x} p^{x} (1-p)^{n-x} = np \]
The variance of a binomial distribution is given by
\[ \mathrm{Var}(X) = \sum_{x=0}^{n} (x - E(X))^{2} \times \mathrm{PMF}(x) = \sum_{x=0}^{n} (x - E(X))^{2} \binom{n}{x} p^{x} (1-p)^{n-x} = np(1-p) \]
If the number of trials (n) in a binomial distribution is large, then it can be approximated by a normal distribution with mean np and variance npq, where q = 1 − p.
Mean, Variance, and SD of Binomial Distribution
Expected value: E(X) = μ = np
Variance: Var(X) = σ² = np(1 − p)
Standard deviation: SD(X) = σ = √(np(1 − p))
Practice Problem
Fashion Trends Online (FTO) is an e-commerce company that sells women's apparel. It is observed that about 10% of its customers return the items they purchase for many reasons (such as size, color, and material mismatch). On a particular day, 20 customers purchased items from FTO. Calculate:
(a) Probability that exactly 5 customers will return the items.
(b) Probability that a maximum of 5 customers will return
the items.
(c) Probability that more than 5 customers will return the
items purchased by them.
(d) Average number of customers who are likely to return the
items.
(e) The variance and the standard deviation of the number of
returns.
Solution
In this case, n = 20 and p = 0.1.
(a) The probability that exactly 5 customers will return the items purchased is
\[ P(X = 5) = \binom{20}{5} (0.1)^{5} (0.9)^{15} = 0.03192 \]
(b) The probability that a maximum of 5 customers will return the items purchased is
\[ P(X \le 5) = \sum_{k=0}^{5} \binom{20}{k} (0.1)^{k} (0.9)^{20-k} = 0.9887 \]
(c) The probability that more than 5 customers will return the items purchased is
\[ P(X > 5) = 1 - P(X \le 5) = 1 - \sum_{k=0}^{5} \binom{20}{k} (0.1)^{k} (0.9)^{20-k} = 1 - 0.9887 = 0.0113 \]
(d) The average number of customers who are likely to return the items is
E(X) = n × p = 20 × 0.1 = 2
(e) The variance of a binomial distribution is given by
Var(X) = n × p × (1 − p) = 20 × 0.1 × 0.9 = 1.8
and the corresponding standard deviation is √1.8 = 1.3416
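As a cross-check on the hand calculation, the same binomial quantities can be computed with Python's standard library. This is a minimal sketch: the helper names binom_pmf and binom_cdf are illustrative, not part of the course material.

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binom_cdf(a, n, p):
    """P(X <= a), obtained by summing the PMF from 0 to a."""
    return sum(binom_pmf(k, n, p) for k in range(a + 1))

n, p = 20, 0.1                      # 20 customers, 10% return rate
p_exactly_5 = binom_pmf(5, n, p)    # part (a)
p_at_most_5 = binom_cdf(5, n, p)    # part (b)
p_more_than_5 = 1 - p_at_most_5     # part (c)
mean = n * p                        # part (d)
var = n * p * (1 - p)               # part (e)
sd = sqrt(var)

print(round(p_exactly_5, 5), round(p_at_most_5, 4), round(p_more_than_5, 4))
print(mean, round(var, 2), round(sd, 4))
```

Summing the PMF directly is fine for small n such as 20; for very large n a statistics library would be a more robust choice.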
Poisson Distribution
• It is another type of discrete probability distribution.
• It can be regarded as a limiting form of the binomial distribution in which n, the number of trials, becomes very large and p, the probability of success on each trial, is very small.
• It was proposed by the French mathematician Siméon Denis Poisson in 1837.
Why we need the Poisson Distribution
It is used in cases where the chance of any individual event being a success is very small. This distribution is used to describe the behavior of rare events.
Examples:
• The number of defective screws per box of 2000 screws
• The number of printing mistakes on each page of the first proof of a book
• The number of air accidents in India in one year
• The number of scratches on a sheet of glass
• The number of customers who use a new banking app in a day
• The number of spam emails received in a month
• The number of defects in a 50-yard roll of fabric
Characteristics of Poisson Distribution
An experiment follows a Poisson process when the following conditions hold:
• The random variable X is discrete.
• Each occurrence can be classified into one of two alternatives, such as success and failure.
• The number of trials n is very large and the probability of success p is very small.
• Occurrences are statistically independent.
Two Properties of a Poisson Experiment
1. The probability of an occurrence is the same for any two intervals of equal
length.
2. The occurrence or nonoccurrence in any interval is independent of the
occurrence or nonoccurrence in any other interval.
Poisson Probability Function
\[ f(x) = \frac{\mu^{x} e^{-\mu}}{x!} \]
Where:
x = the number of occurrences in an interval
f(x) = the probability of x occurrences in an interval
µ = mean number of occurrences in an interval
e ≈ 2.71828
x! = x(x – 1)(x – 2) . . . (2)(1)
Poisson Distribution
For a Poisson random variable X, the probability of x successes over a given interval of time or space is
\[ P(X = x) = \frac{e^{-\mu} \mu^{x}}{x!} \]
This is for x = 0, 1, 2, ⋯
μ is the mean number of successes
e ≈ 2.718 is the base of the natural logarithm
The mean is E(X) = μ
The variance is Var(X) = σ² = E(X) = μ
Example: Mercy Hospital
Patients arrive at the emergency room of Mercy Hospital at the average rate
of 6 per hour on weekend evenings.
What is the probability of 4 arrivals in 30 minutes on a weekend evening?
Solution
Using the probability function:
µ = 6/hour = 3/half-hour, x = 4
\[ f(4) = \frac{3^{4} (2.71828)^{-3}}{4!} = 0.1680 \]
Numerical Example
The average number of accidents at a particular intersection every year is 18. Calculate the probability that there are exactly 2 accidents there this month.
Solution
There are 12 months in a year, so μ = 18/12 = 1.5 accidents per month.
\[ P(X = x) = \frac{e^{-\mu} \mu^{x}}{x!} \]
\[ P(X = 2) = \frac{e^{-1.5} (1.5)^{2}}{2!} = 0.2510 \]
Numerical Example
Example: Anne is concerned about staffing needs at the Starbucks that
she manages. She believes that the typical Starbucks customer averages
18 visits to the store over a 30-day month.
a. How many visits should Anne expect in a 5-day period from a
typical Starbucks customer?
b. What is the probability that a customer visits the chain five times in a
5-day period?
c. What is the probability that a customer visits the chain no more than
two times in a 5-day period?
d. What is the probability that a customer visits the chain at least three times in a 5-day period?
Numerical Example
a. Given the rate of 18 visits over a 30-day month, the mean for the 30-day period is μ30 = 18. So the mean for the 5-day period is μ5 = 18 × 5/30 = 3.
b. \[ P(X = 5) = \frac{e^{-3}\, 3^{5}}{5!} = 0.1008 \]
c. P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2)
= 0.0498 + 0.1494 + 0.2240 = 0.4232
d. P(X ≥ 3) = P(X = 3) + P(X = 4) + ⋯
This sum cannot be evaluated term by term since there is an infinite number of possibilities, so use the complement instead:
P(X ≥ 3) = 1 − [P(X = 0) + P(X = 1) + P(X = 2)]
= 1 − 0.4232 = 0.5768
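The Starbucks calculation can be reproduced with a short Python sketch. It rescales the 30-day mean to a 5-day mean and then applies the Poisson probability function; the helper names are illustrative. The results should agree with the hand calculation up to rounding in the fourth decimal place.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """P(X = x) for a Poisson random variable with mean mu."""
    return exp(-mu) * mu**x / factorial(x)

def poisson_cdf(a, mu):
    """P(X <= a), obtained by summing the PMF from 0 to a."""
    return sum(poisson_pmf(k, mu) for k in range(a + 1))

mu_30 = 18                 # mean visits in a 30-day month
mu_5 = mu_30 * 5 / 30      # part (a): mean visits in a 5-day period

p_exactly_5 = poisson_pmf(5, mu_5)      # part (b)
p_at_most_2 = poisson_cdf(2, mu_5)      # part (c)
p_at_least_3 = 1 - p_at_most_2          # part (d): complement rule

print(mu_5, round(p_exactly_5, 4), round(p_at_most_2, 4), round(p_at_least_3, 4))
```

The complement rule in the last step is what makes part (d) tractable, since the direct sum has infinitely many terms.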
Numerical Example using Excel
• Craft breweries that make beer in small batches are experiencing a spectacular
growth in bars and liquor stores across the nation. The craft beer industry now
boasts of 4,269 breweries, representing a 12% market share of the total beer
market in the United States (Fortune, March 22, 2016). It has been estimated that
1.5 craft breweries open every day. Assume this number represents an average
that remains constant over time.
a. What is the probability that no more than 10 craft breweries open in a given week?
b. What is the probability that exactly 10 craft breweries open in a given week?
Numerical Example using Excel
• Excel Command
Since an average of 1.5 craft breweries open every day, the mean for a week is μ = 1.5 × 7 = 10.5.
a. To find the probability that no more than 10 craft breweries open in a week, P(X ≤ 10), we enter “=POISSON.DIST(10, 10.5, 1)” and Excel returns 0.5207. There is a 52.07% chance that no more than 10 craft breweries open in a week.
b. To find the probability that exactly 10 craft breweries open in a week, P(X = 10), we enter “=POISSON.DIST(10, 10.5, 0)” and Excel returns 0.1236. There is a 12.36% chance that exactly 10 craft breweries open in a week.
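For readers without Excel, the behavior of POISSON.DIST can be approximated in Python. The function below is a hypothetical stand-in that mirrors the spreadsheet function's three arguments (x, mean, cumulative); it is a sketch and not Excel's actual implementation. It builds each probability from the previous one using P(X = k) = P(X = k − 1) × μ/k, which avoids computing large factorials.

```python
from math import exp

def poisson_dist(x, mean, cumulative):
    """Sketch of a POISSON.DIST-style function.

    cumulative=True  -> P(X <= x)
    cumulative=False -> P(X = x)
    """
    term = exp(-mean)          # P(X = 0)
    total = term
    for k in range(1, x + 1):
        term *= mean / k       # P(X = k) from P(X = k - 1)
        total += term
    return total if cumulative else term

weekly_mean = 1.5 * 7          # 1.5 openings per day over 7 days

p_at_most_10 = poisson_dist(10, weekly_mean, True)
p_exactly_10 = poisson_dist(10, weekly_mean, False)

print(round(p_at_most_10, 4), round(p_exactly_10, 4))
```

Passing True or False for the last argument plays the same role as the 1 or 0 flag in the spreadsheet formula.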
Practice problem
Assume that X is a Poisson random variable with μ = 20. Use Excel’s function options to find the following probabilities.
a) P(X < 14)
b) P(X ≥ 20)
c) P(X = 25)
d) P(18 ≤ X ≤ 23)
Practice problem: Solution
a. P(X < 14) = P(X ≤ 13) = 0.0661
In Excel: =POISSON.DIST(13,20,TRUE)
b. P(X ≥ 20) = 1 − P(X ≤ 19) = 0.5297
In Excel: =1 - POISSON.DIST(19,20,TRUE)
c. P(X = 25) = 0.0446
In Excel: =POISSON.DIST(25,20,FALSE)
d. P(18 ≤ X ≤ 23) = P(X ≤ 23) − P(X ≤ 17) = 0.4905
In Excel: =POISSON.DIST(23,20,TRUE) - POISSON.DIST(17,20,TRUE)
Practice problem
A textile manufacturing process finds that on average, two flaws
occur per every 50 yards of material produced.
a. What is the probability of exactly two flaws in a 50-yard piece of
material?
b. What is the probability of no more than two flaws in a 50-yard
piece of material?
c. What is the probability of no flaws in a 25-yard piece of material?
Practice problem: Solution
a. What is the probability of exactly two flaws in a 50-yard piece of material?
With μ = 2 flaws per 50 yards, P(X = 2) = 0.2707
In Excel: =POISSON.DIST(2,2,0) returns 0.270670566
b. What is the probability of no more than two flaws in a 50-yard piece of material?
P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = 0.6767
In Excel: =POISSON.DIST(2,2,1)
c. What is the probability of no flaws in a 25-yard piece of material?
For a 25-yard piece the mean is halved to μ = 1, so P(X = 0) = 0.3679
In Excel: =POISSON.DIST(0,1,0) returns 0.367879
Exponential distribution
The exponential distribution is a useful non-symmetric continuous probability distribution.
It is related to the Poisson distribution, which describes the number of occurrences over a given interval of time or space.
With the exponential distribution, we are instead interested in the time or space between the occurrences or arrivals.
The exponential random variable is nonnegative.
Exponential distribution
The probability distribution is defined in terms of its rate parameter λ.
This is the same λ as with the Poisson distribution: the average number of arrivals per unit of time or space.
For an exponential random variable, the mean is the inverse of λ, E(X) = 1/λ: the average time between arrivals.
It is used to model lifetimes or failure times.
Exponential distribution
Example: the time between e-mail messages during work hours is
exponentially distributed with a mean of 25 minutes.
a. Calculate the rate parameter ๐œ†๐œ†.
b. What is the probability that you do not get an e-mail for more
than one hour?
c. What is the probability that you get an e-mail within 10
minutes?
Solution
a. \[ \lambda = \frac{1}{E(X)} = \frac{1}{25} = 0.04 \text{ e-mails per minute} \]
b. \[ P(X > 60) = e^{-0.04(60)} = 0.0907 \]
c. \[ P(X \le 10) = 1 - e^{-0.04(10)} = 1 - 0.6703 = 0.3297 \]
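The solution relies on the exponential cumulative distribution function P(X ≤ x) = 1 − e^(−λx) for x ≥ 0. A minimal Python sketch of the e-mail example follows; the helper name expon_cdf is illustrative.

```python
from math import exp

def expon_cdf(x, lam):
    """P(X <= x) for an exponential random variable with rate lam."""
    return 1 - exp(-lam * x) if x >= 0 else 0.0

mean_minutes = 25            # average time between e-mails
lam = 1 / mean_minutes       # part (a): rate parameter, e-mails per minute

p_more_than_60 = 1 - expon_cdf(60, lam)   # part (b): no e-mail for over an hour
p_within_10 = expon_cdf(10, lam)          # part (c): e-mail within 10 minutes

print(lam, round(p_more_than_60, 4), round(p_within_10, 4))
```

Note that x and the mean must be in the same time unit (minutes here), which is why one hour enters as 60.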
Continuous Uniform Distribution
• One of the simplest continuous probability distributions is called the continuous uniform distribution.
• This distribution is appropriate when the underlying random variable has an equally likely chance of assuming a value within a specified range [a, b].
\[ f(x) = \begin{cases} \dfrac{1}{b-a} & \text{for } a \le x \le b \\ 0 & \text{for } x < a \text{ or } x > b \end{cases} \]
\[ E(X) = \mu = \frac{a+b}{2} \]
\[ SD(X) = \sigma = \sqrt{\frac{(b-a)^{2}}{12}} \]
Continuous Uniform Distribution
The probability density function does not directly represent probability.
The area under the curve represents probability.
For the uniform distribution this is the area of a rectangle, base times height:
\[ \text{Probability} = (\text{length of the interval}) \times \frac{1}{b-a} \]
Practice Problem
• Example: Sales for a particular cosmetic line follow a continuous uniform distribution with a lower limit of $2,500 and an upper limit of $5,000.
• What are the mean and standard deviation?
\[ \mu = \frac{2500 + 5000}{2} = \$3{,}750 \]
\[ \sigma = \sqrt{\frac{(5000 - 2500)^{2}}{12}} = \$721.69 \]
Solved Problem
Example, continued.
What is the probability the sales exceed $4,000?
\[ P(X > 4000) = 1000 \times \frac{1}{5000 - 2500} = 0.40 \]
Solved Problem
Example, continued.
What is the probability the sales are between $3,200 and $3,800?
\[ P(3200 \le X \le 3800) = 600 \times \frac{1}{5000 - 2500} = 0.24 \]
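The "base times height" rule for the cosmetic-line example can be sketched in Python as follows. The helper uniform_prob is an illustrative name; it clips the requested interval to [a, b] so that any part lying outside the range contributes no probability.

```python
from math import sqrt

def uniform_prob(lo, hi, a, b):
    """P(lo <= X <= hi) for X uniformly distributed on [a, b]."""
    lo, hi = max(lo, a), min(hi, b)      # keep only the part inside [a, b]
    return max(hi - lo, 0) / (b - a)     # base (interval length) times height 1/(b - a)

a, b = 2500, 5000                        # lower and upper sales limits in dollars

mean = (a + b) / 2
sd = sqrt((b - a) ** 2 / 12)
p_above_4000 = uniform_prob(4000, b, a, b)
p_3200_to_3800 = uniform_prob(3200, 3800, a, b)

print(mean, round(sd, 2), p_above_4000, p_3200_to_3800)
```

The same helper applies to any continuous uniform problem once the bounds a and b are identified.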
Practice Problem
Example: For a continuous random variable X with an upper bound of 4,
P(0 ≤ X ≤ 2.5) = 0.54 and P(2.5 ≤ X ≤ 4) = 0.16.
Calculate the following probabilities.
a. P(X < 0)
b. P(X > 2.5)
c. P(0 ≤ X ≤ 4)
Solution
a. P(X < 0) = 1 − P(X ≥ 0) = 1 − (0.54 + 0.16) = 1 − 0.70 = 0.30
b. Since 4 is the upper bound, P(X > 2.5) = P(2.5 ≤ X ≤ 4) = 0.16
c. P(0 ≤ X ≤ 4) = P(0 ≤ X ≤ 2.5) + P(2.5 ≤ X ≤ 4) = 0.54 + 0.16 = 0.70
Practice Problem
Suppose the average price of electricity for a New England customer follows the continuous uniform distribution with a lower bound of 12 cents per kilowatt-hour and an upper bound of 20 cents per kilowatt-hour.
a. Calculate the average price of electricity for a New England customer.
b. What is the probability that a New England customer pays less than 15.5 cents
per kilowatt-hour?
c. A local carnival is not able to operate its rides if the average price of electricity
is more than 14 cents per kilowatt-hour. What is the probability that the carnival
will need to close?
Solution
Practice Problem
You were informed at the nursery that your peach tree will definitely
bloom sometime between March 18 and March 30. Assume that the
bloom times follow a continuous uniform Distribution between these
specified dates.
a. What is the probability that the tree does not bloom until March
25?
b. What is the probability that the tree will bloom by March 20?
Solution
Practice Problem
For a continuous random variable X, P (20 ≤ X ≤ 40) = 0.15 and P (X
> 40) = 0.16. Calculate the following probabilities.
a. P(X = 40)
b. P(X < 40)
Solution
a. The probability of a continuous random variable taking a particular value is zero; that is, P(X = x) = 0 for any value x. Therefore P(X = 40) = 0. The fact that the random variable has a zero probability of taking any one value is what distinguishes continuous random variables from discrete ones.
b. Since P(X = 40) = 0, P(X < 40) = 1 − P(X > 40) = 1 − 0.16 = 0.84.
Hypergeometric Distribution
The Hypergeometric Distribution
The binomial distribution is appropriate when you sample with replacement.
• The probability of success does not change from trial to trial.
• The trials are independent.
Sampling without replacement: after an item is drawn, it is not put back for subsequent draws.
• The trials are not independent; they depend on each other.
• The probability of success changes from trial to trial.
Use the hypergeometric distribution in place of the binomial distribution when sampling without replacement.
• It describes the number of successes in a two-outcome experiment.
• The trials are not independent of one another.
With replacement means the same item can be chosen more than once. Without replacement means the same item cannot be selected more than once.
The Hypergeometric Distribution
The probability of x successes in a random selection of n items is
\[ P(X = x) = \frac{\binom{S}{x}\binom{N-S}{n-x}}{\binom{N}{n}} \]
N is the population size, S is the number of population successes, and n is the sample size.
For x = 0, 1, 2, ⋯, n if n ≤ S, or x = 0, 1, 2, ⋯, S if n > S.
The Hypergeometric Distribution
The formula consists of three parts:
• \( \binom{S}{x} \): the number of ways to select x successes from S population successes
• \( \binom{N-S}{n-x} \): the number of ways to select n − x failures from N − S population failures
• \( \binom{N}{n} \): the number of ways a sample of size n can be selected from a population of size N
The mean and variance are
\[ \mu = n\,\frac{S}{N} \quad \text{and} \quad \sigma^{2} = n \cdot \frac{S}{N}\left(1 - \frac{S}{N}\right)\frac{N-n}{N-1} \]
The Hypergeometric Distribution: Excel Command
The hypergeometric distribution is important because it provides an accurate way of determining probabilities when the number of trials is not very large and the samples are taken from a finite population without replacement.
Solution using Excel
Example: Inspect five mangoes from a box containing 20 mangoes, exactly two of which are damaged.
What is the probability that one out of the five mangoes is damaged? (n = 5, N = 20, x = 1, S = 2)
\[ P(X = 1) = \frac{\binom{2}{1}\binom{20-2}{5-1}}{\binom{20}{5}} = 0.3947 \]
If the manager decides to reject the shipment if one or more of the mangoes are damaged, what is the probability that the shipment will be rejected?
\[ P(X = 0) = \frac{\binom{2}{0}\binom{20-2}{5-0}}{\binom{20}{5}} = 0.5526 \]
\[ P(X \ge 1) = 1 - P(X = 0) = 1 - 0.5526 = 0.4474 \]
Calculate the expected value, the variance, and the standard deviation.
\[ E(X) = 5\left(\frac{2}{20}\right) = 0.50, \quad \mathrm{Var}(X) = 5\left(\frac{2}{20}\right)\left(1 - \frac{2}{20}\right)\left(\frac{20-5}{20-1}\right) = 0.3553, \quad SD(X) = \sqrt{0.3553} = 0.5960 \]
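The mango inspection can also be checked in Python with the standard library. This is a sketch with an illustrative helper name, hypergeom_pmf; the argument names follow the slide notation (x successes in a sample of n, drawn from a population of N containing S successes). Excel offers a HYPGEOM.DIST function for the same purpose.

```python
from math import comb, sqrt

def hypergeom_pmf(x, n, S, N):
    """P(X = x): x successes in a sample of n drawn without replacement
    from a population of N items that contains S successes."""
    return comb(S, x) * comb(N - S, n - x) / comb(N, n)

N, S, n = 20, 2, 5                           # 20 mangoes, 2 damaged, inspect 5

p_one_damaged = hypergeom_pmf(1, n, S, N)
p_reject = 1 - hypergeom_pmf(0, n, S, N)     # one or more damaged mangoes

mean = n * S / N
var = n * (S / N) * (1 - S / N) * (N - n) / (N - 1)
sd = sqrt(var)

print(round(p_one_damaged, 4), round(p_reject, 4), mean, round(var, 4), round(sd, 4))
```

The factor (N − n)/(N − 1) in the variance is the only difference from the binomial variance with p = S/N; it shrinks the variance because sampling is without replacement.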
Practice Problem
Assume that X is a hypergeometric random variable with N = 25, S = 3, and n = 4.
Calculate the following probabilities.
a. P(X = 0)
b. P(X = 1)
c. P(X ≤ 1)
Normal Distribution
• The normal distribution is bell-shaped and symmetric around its mean.
• The normal distribution is completely described by two parameters: the population mean μ and the population variance σ².
• The normal distribution is asymptotic in the sense that the tails get closer and
closer to the horizontal axis but never touch it.
• The total area under the curve and above the horizontal axis is equal to 1:
\[ \int_{-\infty}^{\infty} f(x)\,dx = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2\sigma^{2}}(x-\mu)^{2}}\,dx = 1 \]
[Figure: normal curve centered at μ, with points x1 and x2 marked on the horizontal axis]
Normal Distribution
\[ P(x_1 < x < x_2) = \frac{1}{\sigma\sqrt{2\pi}} \int_{x_1}^{x_2} e^{-\frac{1}{2\sigma^{2}}(x-\mu)^{2}}\,dx \]
denotes the probability of x in the interval (x1, x2).
\[ \text{Mean} = \mu = \int_{-\infty}^{\infty} x\,f(x)\,dx = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} x\,e^{-\frac{1}{2\sigma^{2}}(x-\mu)^{2}}\,dx \]
\[ \text{Variance} = \sigma^{2} = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} (x-\mu)^{2}\,e^{-\frac{1}{2}\left[\frac{x-\mu}{\sigma}\right]^{2}}\,dx \]
The standard deviation σ is the square root of the variance.
Standard Normal Distribution
μ controls location
σ controls spread
The wider the curve, the larger the standard deviation and the more variation exists in the process.
The location of the normal distribution is determined by the mean, and the dispersion or spread of the distribution is determined by the standard deviation.
Standard Normal Distribution
Calculating P(x1 < x < x2) directly is computationally complex for arbitrary (x1, x2) and given μ and σ.
To avoid this difficulty, the standard normal distribution is used.
The standard normal distribution is a special case of the normal distribution, denoted by Z.
Condition 1: The mean has to be equal to zero, μ = E(Z) = 0.
Condition 2: The standard deviation (and hence the variance) has to be equal to one, σ = SD(Z) = 1.
The lowercase letter z is used to denote a value that the standard normal random variable Z may assume.
Finding the probability of the standard normal
distribution
Z-tables for standard normal probabilities
Tabulated areas under the standard normal density are the probabilities of intervals extending from the mean μ = 0 to points z to its right.
Any normally distributed data can be converted to the standardized form using the formula:
\[ z = \frac{x-\mu}{\sigma} \]
where x is the data point in question, and
z (the z-score) measures the number of standard deviations between that data point and the mean.
Finding the probability of the standard normal
distribution
Example: P(Z ≤ 1.52) = 0.9357
Finding the probability of the standard normal
distribution
Example: P(Z ≤ −1.96) = 0.0250
Finding the probability of the standard normal
distribution table
Example
P(0 ≤ Z ≤ 1.96) = P(Z ≤ 1.96) − P(Z ≤ 0)
= 0.9750 − 0.5000 = 0.4750
Finding the probability of the standard normal
distribution table
P(1.52 ≤ Z ≤ 1.96) = P(Z ≤ 1.96) − P(Z ≤ 1.52)
= 0.9750 − 0.9357 = 0.0393 (equivalently, using areas measured from the mean, 0.4750 − 0.4357 = 0.0393)
Finding the probability of the standard normal
distribution table
P(−1.52 ≤ Z ≤ 1.96) = P(Z ≤ 1.96) − P(Z ≤ −1.52)
= 0.9750 − 0.0643 = 0.9107
Finding the probability of the standard normal
distribution table
Example: find z that satisfies the following
P(Z ≤ z) = 0.6808; z = 0.47
Finding the probability of the standard normal distribution
table
Example: find z that satisfies the following
P(Z > z) = 0.0212 → P(Z ≤ z) = 1 − 0.0212 = 0.9788
0.9788 − 0.5 = 0.4788 → z = 2.03
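Table lookups like these can be cross-checked in Python. The sketch below assumes Python 3.8 or later, where the standard library provides statistics.NormalDist with cdf (area to the left of z) and inv_cdf (the z value for a given left-tail area). Variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal: mean 0, standard deviation 1

# Forward lookup: cumulative probability for a given z
p_le_196 = Z.cdf(1.96)                   # P(Z <= 1.96)
p_0_to_196 = Z.cdf(1.96) - Z.cdf(0)      # P(0 <= Z <= 1.96)

# Reverse lookup: z such that P(Z > z) = 0.0212, i.e. P(Z <= z) = 0.9788
z_for_tail = Z.inv_cdf(1 - 0.0212)

print(round(p_le_196, 4), round(p_0_to_196, 4), round(z_for_tail, 2))
```

Because cdf always returns the area to the left of z, a table that reports areas from the mean to z corresponds to cdf(z) − 0.5 for positive z.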
Normal distribution transformation
Any normally distributed random variable can be transformed into the standard normal random variable as follows.
Let X have a normal distribution with mean μ and standard deviation σ.
X can be transformed into Z using
\[ Z = \frac{X-\mu}{\sigma} \]
Any x can be transformed into z using
\[ z = \frac{x-\mu}{\sigma} \]