Uploaded by Natalie

economic analysis PPT notes

Ch 2 powerpoint notes, descriptive statistics: Tabular and graphical
displays
1. Summarizing data for two variables using tables
2. Summarizing data for two variables using graphical displays
3. Data visualization: best practices in creating effective graphical displays
SUMMARIZING DATA FOR TWO VARIABLES USING TABLES
o Thus far we have focused on methods that are used to summarize the
data for one variable at a time
o Often a manager is interested in tabular and graphical methods that will
help understand the relationship between two variables
Crosstabulation is a method for summarizing the data for two variables
o Tabular summary of data for two variables
o Crosstabulation can be used when
 One variable is qualitative, and the other variable is
quantitative
 Both variables are quant.
 Both variables are qual.
o The left and top margin labels define the classes for the two
variables
o Example: Finger Lakes Homes
o Price range- quantitative variable
o Home styles- categorical variable
o The # of finger lakes homes sold for each style and price for the
past 2 years is shown below
Price range     Colonial    Log    Split    A-frame    Total
(columns are home styles)
<$200,000          18         6      19        12        55
>=$200,000         12        14      16         3        45
Total              30        20      35        15       100
o Insights gained from preceding crosstabulation
o the greatest number of homes (19) in the sample are a split-level
style and priced at less than 200,000
o only three homes in the sample are an A-frame style and priced at
200,000 or more
The numbers 55 and 45 are the frequency distribution for the price
range variable
30, 20, 35, 15 are the frequency distribution for the home style variable
Crosstabulation: Row or Column percentages
Converting the entries in the table into row percentages or column
percentages can provide additional insight about the relationship
between the two variables
Crosstabulation: row percentages
Price range     Colonial    Log     Split    A-frame    Total
<$200,000        32.73     10.91    34.55     21.82      100
>=$200,000       26.67     31.11    35.56      6.67      100
Example: (colonial and >=$200,000)/(all >=$200,000)*100 = (12/45)*100 = 26.67
Crosstabulation: column percentages
Price range     Colonial    Log     Split    A-frame
<$200,000        60.00     30.00    54.29     80.00
>=$200,000       40.00     70.00    45.71     20.00
Total           100       100      100       100
Example: (colonial and >=$200,000)/(all colonial)*100 = (12/30)*100 = 40.00
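The row and column percentages can be reproduced directly from the Finger Lakes Homes counts. The following is a minimal Python sketch; the variable names are illustrative and not part of the slides.

```python
# Counts from the Finger Lakes Homes crosstabulation (rows: price range, columns: home style).
styles = ["colonial", "log", "split", "a-frame"]
counts = {
    "<$200,000": [18, 6, 19, 12],
    ">=$200,000": [12, 14, 16, 3],
}

# Row percentages: each cell divided by its row (price range) total.
row_pct = {
    price: [round(100 * c / sum(row), 2) for c in row]
    for price, row in counts.items()
}

# Column percentages: each cell divided by its column (home style) total.
col_totals = [sum(col) for col in zip(*counts.values())]
col_pct = {
    price: [round(100 * c / t, 2) for c, t in zip(row, col_totals)]
    for price, row in counts.items()
}

for price in counts:
    print(price, "row %:", row_pct[price], "column %:", col_pct[price])
```

Row percentages answer "of the homes in this price range, what share is each style?", while column percentages answer "of the homes of this style, what share falls in each price range?".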
Crosstabulation: Simpson’s paradox
o Data in two or more crosstabulations are often aggregated to produce a
summary crosstabulation
o We must be careful in drawing conclusions about the relationship
between the two variables in the aggregated crosstabulation
o In some cases the conclusions based upon an aggregated
crosstabulation can be completely reversed if we look at the
unaggregated data. The reversal of conclusions based on aggregated and
unaggregated data is called Simpson’s paradox
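A small numerical illustration may make the reversal concrete. The counts below are hypothetical and chosen only to show the effect; they do not come from the slides. Option A has the higher success rate in each group, yet option B looks better once the groups are pooled.

```python
# Hypothetical (successes, total) counts for two options in two subgroups.
data = {
    "group 1": {"A": (9, 10), "B": (80, 100)},
    "group 2": {"A": (30, 100), "B": (2, 10)},
}

def rate(successes, total):
    """Success rate as a proportion."""
    return successes / total

# Unaggregated view: compare A and B within each group.
for group, options in data.items():
    print(group, {name: round(rate(*st), 3) for name, st in options.items()})

# Aggregated view: pool the counts across groups before comparing.
pooled = {}
for name in ("A", "B"):
    s = sum(data[g][name][0] for g in data)
    t = sum(data[g][name][1] for g in data)
    pooled[name] = rate(s, t)
print("aggregated", {name: round(r, 3) for name, r in pooled.items()})
```

The reversal arises because the two options are spread very unevenly across the groups, so the pooled rates mostly reflect which group each option was used in.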
Summarizing data for two variables using graphical displays
o In most cases, a graphical display is more useful than a table for
recognizing patterns and trends
o Displaying data in creative ways can lead to powerful insights
o Scatter diagrams and trendlines are useful in exploring the relationship
between two variables
Scatter diagram and trendline
o A scatter diagram is a graphical presentation of the relationship
between two quantitative variables
o One variable is shown on the horizontal axis and the other variable is
shown on the vertical axis
o The general pattern of the plotted points suggests the overall
relationship between the variables.
o A trendline provides an approximation of the relationship
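o Positive relationship: the trendline slopes upward (y tends to increase as x increases)
o Negative relationship: the trendline slopes downward (y tends to decrease as x increases)
o No apparent relationship: the trendline is roughly horizontal (no upward or downward pattern)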
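As a rough sketch of how a trendline can be obtained, the code below fits a least-squares line and reads the direction of the relationship from the sign of the slope. Least squares is an assumption here (the notes do not say how the trendline is fitted), and the data are hypothetical.

```python
def trendline(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 5, 4, 6]      # tends to rise with x
y_down = [6, 4, 5, 4, 2]    # tends to fall with x
y_flat = [3, 3, 3, 3, 3]    # no upward or downward pattern

slope_up, intercept_up = trendline(x, y_up)
slope_down, _ = trendline(x, y_down)
slope_flat, _ = trendline(x, y_flat)
print(slope_up, slope_down, slope_flat)
```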
Side by side bar chart
o Graphical display for depicting multiple bar charts on the same display
o Each cluster of bars represents one value of the first variable
o Each bar within a cluster represents one value of the second variable
Stacked bar chart is another way to display and compare two variables on the
same display
o It is a bar chart in which each bar is broken into rectangular segments
of a different color
o If percentage frequencies are displayed, all bars will be of the same
height (or length) extending to the 100% mark
Data visualization: best practices in creating effective graphical displays
o Data visualization describes the use of graphical displays to summarize and
present info about a data set
o The goal is to communicate as effectively and clearly as possible the key
info about the data
Creating effective graphical displays
o Art and science
o Guidelines
o Give the display a clear and concise title
o Keep display simple
o Clearly label each axis and provide the units of measure
o If colors are used make sure they’re distinct
o If multiple colors or lines are used provide a legend
Choosing the type of graphical display
o Displays used to show the dist of the data
o Bar chart, pie chart, dot plot, histogram, stem and leaf display
o Displays used to make comparisons
o Side by side bar chart, Stacked bar chart
o Displays used to show relationships
o Scatter diagram, trendline
Data dashboard is a widely used data visualization tool
o It organizes and presents key performance indicators (kpis) used to
monitor an org or process
o It provides timely, summary info that is easy to read, understand and
interpret
o Some additional guidelines include
o Minimize the need for screen scrolling
o Avoid unnecessary use of color or 3d
o Use borders between charts to improve readability
Tabular and graphical displays
Ch. 2 part A, Descriptive Stats: tabular and graphical displays
o Summarizing data for a categorical variable
o Summarizing data for a quantitative variable
Categorical data use labels or names to identify categories of like items
Quantitative data are numerical values that indicate how much or how many
FREQUENCY DISTRIBUTION
o A frequency distribution is a tabular summary of data showing the
number (frequency) of observations in each of several non-overlapping
categories or classes
o The objective is to provide insights about the data that cannot be
quickly obtained by looking only at the original data
Relative Frequency Distribution
o The relative frequency of a class is the fraction or proportion of the
total number of data items belonging to the class
o A relative frequency distribution is a tabular summary of a set of data
showing the relative frequency for each class
o A percent frequency of a class is the relative frequency multiplied by
100
o A percent frequency distribution is a tabular summary of a set of data
showing the percent frequency for each class
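The three distributions defined above can be computed in a few lines. This is a minimal sketch using hypothetical categorical data; the labels and variable names are illustrative.

```python
from collections import Counter

# Hypothetical categorical observations (e.g. soft drink purchases).
purchases = ["cola", "diet", "cola", "water", "cola",
             "diet", "water", "cola", "juice", "diet"]

freq = Counter(purchases)                                     # frequency distribution
n = sum(freq.values())
rel_freq = {label: count / n for label, count in freq.items()}  # relative frequency
pct_freq = {label: 100 * r for label, r in rel_freq.items()}    # percent frequency

for label, count in freq.most_common():
    print(f"{label:<6} {count:>3}  {rel_freq[label]:.2f}  {pct_freq[label]:.0f}%")
```

The relative frequencies always sum to 1 and the percent frequencies to 100, which is a quick check on any hand-built table.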
Bar chart
o A bar chart is a graphical display for depicting qualitative data
o On one axis (usually the horizontal axis) we specify the labels that are
used for each of the classes
o A frequency, relative frequency or percent frequency scale can be used
for the other axis (usually the vertical axis)
o Using a bar of fixed width drawn above each class label, we extend the
height appropriately
o The bars are separated to emphasize the fact that each class is a
separate category
Pareto diagram
o In quality control, bar charts are used to identify the most important
causes of problems
o When the bars are arranged in descending order of height from left to
right (with the most frequently occurring cause appearing first) the bar
chart is called a Pareto diagram
o This diagram is named for its founder Vilfredo Pareto an Italian
economist
Pie chart
o Pie chart is a commonly used graphical display for presenting relative
frequency and percent frequency distributions for categorical data
o First draw a circle then use the relative frequencies to subdivide the
circle into sectors that correspond to the relative frequency for each
class
o Since there are 360 degrees in a circle, a class with a relative frequency
of .25 would consume .25(360)=90 degrees of the circle.
Frequency distribution
o The three steps necessary to define the classes for a frequency
distribution with quantitative data are:
1. Determine the # of non overlapping classes
2. Determine the width of each class
3. Determine the class limits
Guidelines for determining the number of classes
o Use between 5 and 20 classes
o Data sets with a larger number of elements usually require a larger
number of classes
o Smaller data sets usually require fewer classes
o The goal is to use enough classes to show the variation in the data but
not so many classes that some contain only a few data items
o Use classes of equal width
o Approximate class width= (largest data value-smallest data value)/#
classes
o Making the classes the same width reduces the chance of inappropriate
interpretations
o *Note on number of classes and class width
o In practice the number of classes and the appropriate class width
are determined by trial and error
o Once a possible number of classes is chosen the appropriate class
width is found
o The process can be repeated for a diff number of classes
o Ultimately the analyst uses judgment to determine the combo of
the number of classes and class width that provides the best
frequency distribution for summarizing the data
o Guidelines for determining the class limits
o Class limits must be chosen so that each data item belongs to one
and only one class
o The lower class limit identifies the smallest possible data value
assigned to the class
o The upper class limit identifies the largest possible data value
assigned to the class
o The appropriate values for the class limits depend on the level of
accuracy of the data
o An open end class requires only a lower class or an upper class
limit
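The three steps (number of classes, class width, class limits) can be sketched as follows. The data are hypothetical whole numbers, and the choice of 5 classes is just one trial value; in practice it would be adjusted by trial and error as noted above.

```python
# Hypothetical whole-number data values.
data = [23, 31, 18, 45, 27, 36, 29, 41, 22, 34,
        38, 26, 30, 19, 33, 47, 25, 28, 35, 40]

num_classes = 5  # step 1: a trial number of classes

# Step 2: approximate class width = (largest - smallest) / number of classes,
# bumped to the next whole number so the last class reaches the largest value.
approx_width = (max(data) - min(data)) / num_classes
width = int(approx_width) + 1

# Step 3: class limits for whole-number data. Each class runs from
# lower to lower + width - 1, so every value falls in one and only one class.
start = min(data)
classes = [(start + k * width, start + (k + 1) * width - 1) for k in range(num_classes)]

frequency = {limits: sum(limits[0] <= x <= limits[1] for x in data) for limits in classes}

for (low, high), count in frequency.items():
    print(f"{low}-{high}: {count}")
```

Because the limits are whole numbers with no gaps or overlaps between adjacent classes, the frequencies add up to the number of data values.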
Dot plot
o One of the simplest graphical summaries of data is a dot plot
o A horizontal axis shows the range of data values
o Then each data value is represented by a dot placed above the axis
Histogram
o Another common graphical display of quantitative data is a histogram
o The variable of interest is placed on the horizontal axis
o A rectangle is drawn above each class interval with its height
corresponding to the interval’s frequency, relative frequency or percent
frequency
o Unlike a bar graph, a histogram has no natural separation between
rectangles of adjacent classes
Cumulative distributions
o Cumulative frequency distribution shows the number of items with
values less than or equal to the upper limit of each class
o Cumulative relative frequency distribution- shows the proportion of
items with values less than or equal to the upper limit of each class
o Cumulative percent frequency distribution
o Shows the percentage of items with values less than or equal to the
upper limit of each class
o The last entry in a cumulative frequency distribution always equals the
total number of observations
o The last entry in a cumulative relative frequency distribution always
equals 1.00
o The last entry in a cumulative percent frequency distribution always
equals 100
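A cumulative distribution is a running total of the class frequencies. The sketch below uses a hypothetical frequency distribution; the class labels and counts are made up for illustration.

```python
from itertools import accumulate

# Hypothetical frequency distribution: class labels and their frequencies.
classes = ["10-14", "15-19", "20-24", "25-29", "30-34"]
freq = [3, 7, 6, 3, 1]
n = sum(freq)

cum_freq = list(accumulate(freq))          # items <= upper limit of each class
cum_rel = [c / n for c in cum_freq]        # cumulative relative frequency
cum_pct = [100 * r for r in cum_rel]       # cumulative percent frequency

for label, cf, cr, cp in zip(classes, cum_freq, cum_rel, cum_pct):
    print(f"<= {label.split('-')[1]}: {cf:>2}  {cr:.2f}  {cp:.0f}%")
```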
Stem and leaf display
o A stem and leaf display shows both the rank order and shape of the
distribution of the data
o It is similar to a histogram on its side but it has the advantage of
showing the actual data values
o The first digits of each data item are arranged to the left of a vertical
line
o To the right of the vertical line we record the last digit for each item in
rank order
o Each line in the display is referred to as a stem
o Each digit on a stem is a leaf
Stretched stem and leaf display
o If we believe the original stem and leaf display has condensed the data
too much we can stretch the display vertically by using two stems for
each leading digit
o Whenever a stem value is stated twice the first value corresponds to
leaf values of 0-4 and the second value corresponds to leaf values of 5-9
o Leaf units
o A single digit is used to define each leaf
o In the preceding example the leaf unit was 1
o Leaf units may be 100,10,1,0.1, and so on
o Where the leaf unit is not shown, it is assumed to equal 1
o The leaf unit indicates how to multiply the stem and leaf numbers
to approximate the original data
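A stem and leaf display can be built by splitting each value into its leading digits (the stem) and one final digit (the leaf). This is a minimal sketch that assumes non-negative whole-number data and a leaf unit of 1, 10, 100, and so on; the function name and data are illustrative.

```python
from collections import defaultdict

def stem_and_leaf(values, leaf_unit=1):
    """Return {stem: [leaves]} with leaves in rank order; each leaf is a single digit."""
    display = defaultdict(list)
    for v in sorted(values):
        scaled = int(v) // leaf_unit        # drop digits below the leaf unit
        stem, leaf = divmod(scaled, 10)     # last remaining digit is the leaf
        display[stem].append(leaf)
    return dict(display)

scores = [68, 72, 75, 81, 84, 85, 85, 92, 97, 73]   # hypothetical data
for stem, leaves in stem_and_leaf(scores).items():
    print(f"{stem:>2} | {' '.join(map(str, leaves))}")
```

With `leaf_unit=10`, a value such as 1565 becomes stem 15 and leaf 6, so multiplying "156" by the leaf unit gives 1560 as the approximation of the original value.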
Statistics
o The term stats can refer to numerical facts such as averages medians
percents and index numbers that help us understand a variety of
business and economic situations
o Stats can also refer to the art and science of collecting, analyzing,
presenting and interpreting data
Applications in Business and Economics
o Accounting
o Public accounting firms use statistical sampling procedures when
conducting audits for their clients
o Economics
o Economists use statistical information in making forecasts about
the future of the economy or some aspect of it
o Finance
o Financial advisors use price earnings ratios and dividend yields to
guide their investment advice
o Marketing
o Electronic point of sale scanners at retail checkout counters are
used to collect data for a variety of marketing research
applications
o Production
o A variety of statistical quality control charts are used to monitor
the output of a production process
o Information Systems
o A variety of statistical info helps administrators assess the
performance of computer networks
Data and Data sets
o Data- facts and figures collected, analyzed and summarized for
presentation and interpretation
o All the data collected in a particular study are referred to as the data set
for the study
Elements, Variables and observations
o Elements are the entities on which data are collected
o A variable is a characteristic of interest for the elements
o The set of measurements obtained for a particular element is called an
observation
o A data set with n elements contains n observations
o The total number of data values in a complete data set is the number of
elements multiplied by the number of variables
Scales of measurement
o Includes nominal, ordinal, interval, ratio
o The scale determines the amt of info contained in the data
o The scale indicates the data summarization and statistical analyses that
are most appropriate
o Nominal
o Data are labels or names used to identify an attribute of the
element
o A nonnumeric label or numeric code may be used
o Ex. Students of a uni are classified by the school in which they are
enrolled using a nonnumeric label such as business, humanities,
education and so on
o Alternatively a numeric code could be used for the school
variable (e.g. 1 denotes business, 2 denotes humanities, etc.)
o Ordinal
o The data have the properties of nominal data and the order or
rank of the data is meaningful
o A nonnumeric label or numeric code may be used
o Ex. Students of uni are classified by class standing using a
nonnumeric label such as freshman sophomore junior or senior
o Alternatively a numeric code could be used for the class standing
variable (e.g. 1 denotes freshman, 2 denotes sophomore)
o Interval
o The data have the properties of ordinal data and the interval
between observations is expressed in terms of a fixed unit of
measure
o Interval data are always numeric
o Ex. Melissa has an SAT score of 1985, while Kevin has an SAT score
of 1880. Melissa scored 105 points more than Kevin
o Ratio
o The data have all the properties of interval data and the ratio of
two values is meaningful
o Variables such as distance height weight and time use the ratio
scale
o This scale must contain a zero value that indicates that nothing
exists for the variable at the zero point
o Ex. Melissa’s college record shows 36 credit hours earned, while
Kevin’s shows 72 credit hours earned. Kevin has twice as many
credit hours earned as Melissa
Categorical and Quantitative Data
o Data can be further classified as being categorical or quantitative
o The statistical analysis that is appropriate depends on whether
the data for the variable are categorical or quantitative
o In general there are more alternatives for statistical analysis
when the data are quantitative
o Categorical data are labels or names used to identify an attribute of each element
o Often referred to as qualitative data
o Use either the nominal or ordinal scale of measurement
o Can be either numeric or nonnumeric
o Appropriate statistical analyses are rather limited
Quantitative data indicate how many or how much
o Discrete, if measuring how many
o Continuous, if measuring how much
o Quantitative data are always numeric
o Ordinary arithmetic operations are meaningful for quantitative
data
Cross-Sectional Data
o Cross-sectional data are collected at the same or approx. the same point
in time
o Ex. Data detailing the number of building permits issued in nov
2012 in each of the counties of Ohio
Time Series Data
o Time series data are collected over several time periods
 Ex. Data detailing the number of building permits issued in
Lucas County, Ohio in each of the last 36 months
o Graphs of time series help analysts:
 Understand what happened in the past
 Identify any trends over time
 Project future levels for the time series
o Data sources
o Existing sources
 Internal company records- almost any department
 Business database services- Dow Jones and Co.
 Govt Agencies- US dept of labor
 Industry associations- travel industry association of America
 Special interest orgs- graduate management admission
council
 Internet- more and more firms
o Data Sources
o Data available from internal company records
 Employee records- data including name, address, SS #
 Production records- part number, quantity produced, direct
labor cost, material cost
 Inventory records- part number, quantity in stock, reorder
level, economic order quantity
 Sales records- product number, sales volume, sales volume
by region
 Credit scores- customer name, credit limit, accounts
receivable balance
 Customer profile- age, gender, income, household size
o Data available from selected govt agencies
o Census bureau- population data, number of households,
household income
o Federal Reserve Board- data on money supply, exchange rates,
discount rates
o Office of Mgmt and Budget- data on revenue, expenditures, debt
of federal govt
o Dept of Commerce- data on business activity, value of shipments,
profit by industry
o Bureau of Labor Statistics- consumer spending, unemployment
rate, hourly earnings, safety record
o Data Sources
o Statistical studies- experimental
 In experimental studies the variable of interest is first
identified. Then one or more other variables are identified
and controlled so that data can be obtained about how they
influence the variable of interest
 The largest experimental study ever conducted is believed
to be the 1954 public health service experiment of the salk
polio vaccine. Nearly 2 million US children were selected
o Statistical studies- observational
 In observational (nonexperimental) studies no attempt is
made to control or influence the variables of interest
 A survey is a good example
 Studies of smokers and nonsmokers are
observational studies because researchers do not
determine or control who will smoke and who won’t
o Data Acquisition considerations
o Time requirement
 Searching for info can be time consuming
 Info may no longer be useful by the time it’s available
o Cost of acquisition
 Orgs often charge for info even when it’s not their primary
business activity
o Data errors
 Using any data that happen to be available or were acquired
with little care can lead to misleading info
o Descriptive stats
o Most of the statistical info in newspapers, mags, company reports
and other publications consists of data that are summarized and
presented in a form that is easy to understand
o Such summaries of data which may be tabular, graphical, or
numerical are referred to as descriptive stats
o Ex. The manager of Hudson auto would like to have a better
understanding of the cost of parts used in the engine tune ups
performed in her shop. She examines 50 customer invoices for
tune-ups. The costs of parts, rounded to the nearest dollar, are
listed on the next slide
Numerical Descriptive Stats
o The most common numerical descriptive statistic is the average (or
mean)
o The avg demonstrates a measure of the central tendency or central
location of the data for a variable
Statistical inference
o Population- the set of all elements of interest in a particular study
o Sample- a subset of the population
o Statistical inference- the process of using data obtained from a sample
to make estimates and test hypotheses about the characteristics of a
population
o Census- collecting data for the entire population
o Sample survey- collecting data for a sample
Process of statistical inference
1. Population consists of all tune ups. Avg cost of parts is unknown
2. A sample of 50 engine tune-ups is examined.
3. The sample data provide a sample average parts cost of 79 per tuneup
4. The sample avg is used to estimate the population avg
Computers and statistical analysis
o Statisticians often use computer software to perform the statistical
computations required with large amts of data
o Many of the data sets in this book are available on the website that
accompanies the book
o The data sets can be downloaded in either Minitab or Excel format
o Also, the excel add-in stat tools can be downloaded from the website
Data warehousing
o Organizations obtain large amts of data on a daily basis by means of
magnetic card readers, bar code scanners, point of sale terminals and
touch screen monitors
o Wal-mart captures data on 20-30 million transactions per day
o Visa processes 6800 payment transactions per second
o Capturing storing and maintaining the data, referred to as data
warehousing, is a significant undertaking
Data Mining
o Analysis of the data in the warehouse might aid in decisions that will
lead to new strategies and higher profits for the organization
o Using a combination of procedures from stats, math, and comp sci,
analysts mine the data to convert it into useful info
o The most effective data mining systems use automated procedures to
discover relationships in the data and predict future outcomes,
prompted by only general even vague queries by the user
Data mining applications
o The major applications of data mining have been made by companies
with a strong consumer focus such as retail, financial and
communication firms
o Data mining is used to identify related products that customers who
have already purchased a specific product are also likely to purchase
(and then pop-ups are used to draw attention to those related
products)
o As another example, data mining is used to identify customers who
should receive special discount offers based on their past purchasing
volumes
Data mining reqts
o Statistical methodology such as multiple regression, logistic regression,
and correlation are heavily used
o Also needed are computer science techs involving AI and machine
learning
o A significant investment in time and money is reqd as well
Data mining model reliability
o Finding a statistical model that works well for a particular sample of
data does not necessarily mean that it can be reliably applied to other
data
o With the enormous amt of data available the data set can be partitioned
into a training set (for model development) and a test set (for validating
the model)
o There is however a danger of over fitting the model to the point that
misleading association and conclusions appear to exist
o Careful interpretation of results and extensive testing is important
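The partition into a training set and a test set can be sketched as a simple random split. The 25% test share, the fixed seed, and the function name are assumptions made for illustration; the notes do not prescribe a particular split.

```python
import random

def train_test_split(records, test_fraction=0.25, seed=0):
    """Shuffle a copy of the records and split it into (training set, test set)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)   # seeded so the split is repeatable
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical data set of 100 record ids.
train, test = train_test_split(range(100))
print(len(train), len(test))
```

A model would be developed on `train` only and then checked against `test`; a large gap between the two results is one sign of overfitting.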
Ethical guidelines for statistical practice
o In a statistical study, unethical behavior can take a variety of forms
including
o Improper sampling
o Inappropriate analysis of the data
o Development of misleading graphs
o Use of inappropriate summary stats
o Biased interpretation of the results
o You should strive to be fair, thorough, objective, and neutral as you
collect, analyze and present data
o As a consumer of stats, you should also be aware of the possibility of
unethical behavior by others
o The American statistical association developed the report ethical
guidelines for statistical practice
o The report contains 67 guidelines organized into 8 topic areas
o Professionalism
o Responsibilities to funders, clients, employers
o Responsibilities in publications and testimony
o Responsibilities to research subjects
o Responsibilities to research team colleagues
o Responsibilities to other statisticians/practitioners
o Responsibilities regarding allegations of misconduct
o Responsibilities of employers including orgs, individuals,
attorneys, or other clients
Ch. 3 part A Descriptive Statistics: Numerical Measures
Measures of location
o Mean
o Most important measure of location is the mean
o Provides a measure of central location
o The mean of a data set is the avg of all the data values
o The sample mean x̄ is the point estimator of the population mean
μ
o Weighted mean
o In some instances, the mean is computed by giving each
observation a weight that reflects its relative importance
o The choice of weights depends on the application
o The weights might be the number of credit hours earned for each
grade, as in GPA
o In other weighted mean computations, quantities such as pounds,
dollars, or volume are frequently used as weights
 If data is from a population, μ replaces x̄
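The weighted mean is the sum of each value times its weight, divided by the sum of the weights. Below is a minimal sketch using the GPA case mentioned above; the grades and credit hours are hypothetical.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Hypothetical transcript: grade points (A=4, B=3, C=2) and credit hours per course.
grade_points = [4, 3, 4, 2]
credit_hours = [3, 4, 3, 2]

gpa = weighted_mean(grade_points, credit_hours)
simple_mean = sum(grade_points) / len(grade_points)
print(round(gpa, 2), simple_mean)
```

The weighted result differs from the simple mean because the 4-credit course counts for more than the 2-credit course.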
o Median
o The median of a data set is the value in the middle when the data
items are arranged in ascending order
o Whenever a data set has extreme values the median is the
preferred measure of central location
o The median is the measure of location most often reported for
annual income and property value data
o A few extremely large incomes or property values can inflate the
mean
o Trimmed Mean
o Another measure, sometimes used when extreme values are
present is the trimmed mean
o It is obtained by deleting a percentage of the smallest and largest
values from the data set and then computing the mean of the
remaining values
 Ex. The 5% trimmed mean is obtained by removing the
smallest 5% and the largest 5% of data values and then
computing the mean of the remaining values
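A trimmed mean can be sketched as follows. The data are hypothetical, and truncating the number of trimmed values to a whole number is an assumption for cases where the percentage does not divide the data set evenly.

```python
def trimmed_mean(values, percent):
    """Mean after dropping `percent`% of the values from each end of the sorted data.

    Assumes 0 <= percent < 50 so that some values remain.
    """
    ordered = sorted(values)
    k = int(len(ordered) * percent / 100)     # values removed from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Hypothetical data: 20 values with one extreme value (95).
data = [10, 12, 12, 13, 14, 14, 15, 15, 15, 16,
        16, 17, 17, 18, 18, 19, 19, 20, 21, 95]

print(sum(data) / len(data))      # ordinary mean, pulled up by 95
print(trimmed_mean(data, 5))      # 5% trimmed mean drops 10 and 95
```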
o Geometric mean
o Calculated by finding the nth root of the product of n values
o It is often used in analyzing growth rates in financial data (where
using the arithmetic mean will provide misleading results)
o It should be applied anytime you want to determine the mean
rate of change over several successive periods, be it years,
quarters, weeks
o Other common applications include changes in populations of
species, crop yields, pollution levels and birth and death rates
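For growth rates, the geometric mean is applied to growth factors (1 + rate) rather than to the rates themselves. The sketch below uses hypothetical yearly returns; `math.prod` requires Python 3.8 or later.

```python
import math

def geometric_mean(factors):
    """nth root of the product of n positive values."""
    return math.prod(factors) ** (1 / len(factors))

# Hypothetical yearly returns of +10%, -5%, +20%, expressed as growth factors.
growth_factors = [1.10, 0.95, 1.20]

mean_factor = geometric_mean(growth_factors)
mean_rate = mean_factor - 1                      # mean rate of change per period
arithmetic_rate = sum(f - 1 for f in growth_factors) / len(growth_factors)
print(round(mean_rate, 4), round(arithmetic_rate, 4))
```

Compounding the geometric mean factor over the three periods reproduces the actual overall growth, which the arithmetic mean of the rates does not.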
o Mode
o The mode is the value that occurs with the greatest frequency
o The greatest frequency can occur at two or more diff values
o If the data have exactly two modes, the data are bimodal
 If the data have more than two modes, the data are
multimodal
 Caution: if the data are bimodal or multimodal Excel’s
MODE function will incorrectly identify a single mode
o Percentiles
o A percentile provides info about how the data are spread over the
interval from the smallest value to the largest value
o Admission test scores for colleges and unis are frequently
reported in terms of percentiles
o The pth percentile of a data set is a value such that at least p
percent of the items take on this value or less and at least (100-p)
percent of the items take on this value or more
o Arrange data in ascending order
o Compute index i, the position of the pth percentile: i=(p/100)n
o If i is not an integer, round up; the pth percentile is the value in the
ith position (after rounding up)
o If i is an integer, the pth percentile is the avg of the values in
positions i and i+1
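The percentile steps above translate directly into code. This is a minimal sketch that assumes p is a whole number with 0 < p < 100; the data are hypothetical. Note that spreadsheet and library functions may use other interpolation rules and so can return slightly different values.

```python
def percentile(values, p):
    """pth percentile using the index rule i = (p/100) * n."""
    ordered = sorted(values)                 # arrange data in ascending order
    n = len(ordered)
    whole, remainder = divmod(p * n, 100)    # i = whole + remainder/100
    if remainder:
        # i is not an integer: round up to position whole + 1 (1-based),
        # which is index `whole` in a 0-based list.
        return ordered[whole]
    # i is an integer: average the values in positions i and i + 1 (1-based).
    return (ordered[whole - 1] + ordered[whole]) / 2

data = [12, 5, 9, 20, 15, 7, 18, 11, 14, 3]   # hypothetical data, n = 10

q1 = percentile(data, 25)       # first quartile
median = percentile(data, 50)   # second quartile
q3 = percentile(data, 75)       # third quartile
print(q1, median, q3)
```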
o Quartiles
o Specific percentiles
o First quartile 25%
o Second 50% (median)
o Third quartile 75%
 If the measures are computed for data from a sample, they
are called sample stats
 If the measures are computed for data from a population,
they are called population parameters
 A sample statistic is referred to as the point estimator of
the corresponding population parameter
Measures of variability
o It is often desirable to consider measures of var as well as measures of
location
o For example if choosing supplier a or b we might consider not
only the avg delivery time for each but also the variability in
delivery time for each
o Range
o Difference b/w the largest and smallest data values
o It is the simplest measure of variability
o It is very sensitive to the smallest and largest data values
o Interquartile range
o Difference between the third and first quartile, middle 50% of data
o It overcomes the sensitivity to extreme data values
o Variance
o Measure of the variability that utilizes all the data
o Based on diff between value of each observation and the mean
o The variance is useful in comparing the variability of two or more
variables
o Variance is the avg of the squared differences between each data
value and the mean
o Standard deviation
o Positive square root of the variance
o It is measured in the same units as the data, making it more easily
interpreted than the variance
o Coefficient of variation
o Indicates how large the standard deviation is in relation to the
mean
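The measures of variability above can be computed with Python's standard `statistics` module. The sample is hypothetical. One detail worth noting: for a sample the squared deviations are divided by n - 1, while for a population they are divided by N.

```python
import statistics

delivery_days = [4, 8, 6, 5, 7]   # hypothetical sample of delivery times

mean = statistics.mean(delivery_days)
data_range = max(delivery_days) - min(delivery_days)
sample_var = statistics.variance(delivery_days)    # divides by n - 1
pop_var = statistics.pvariance(delivery_days)      # divides by N
sample_sd = statistics.stdev(delivery_days)        # positive square root of sample_var
cv = 100 * sample_sd / mean                        # standard deviation as a % of the mean

print(mean, data_range, sample_var, pop_var, round(sample_sd, 3), round(cv, 1))
```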
Ch. 3 part B Descriptive stats: numerical measures
Distribution shape: skewness
o Important measure of the shape of a distribution is called skewness
o The formula for the skewness of sample data is
o Skewness can be easily computed using statistical software
o Look at PowerPoint for types of skewness
Z-scores
o Often called the standardized value
o Denotes the number of standard deviations a data value is from the
mean
o Excel’s STANDARDIZE function can be used to compute the z score
o An observation’s z score is a measure of the relative location of the
observation in a data set
o A data value less than the sample mean will have a z score less than zero
o A data value greater than sample mean will have z score greater than
zero
o A data value equal to the sample mean will have a z score of zero
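A z-score is the data value minus the mean, divided by the standard deviation. The sketch below uses the sample standard deviation and hypothetical data; the function name is illustrative.

```python
import statistics

def z_scores(values):
    """z = (x - mean) / s for each value, using the sample standard deviation s."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    return [(x - mean) / s for x in values]

data = [4, 8, 6, 5, 7]   # hypothetical sample with mean 6
z = z_scores(data)
for x, score in zip(data, z):
    print(x, round(score, 2))
```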
Chebyshev’s theorem
o At least (1-1/z^2) of the items in any data set will be within z standard
deviations of the mean where z is any value greater than 1
o Chebyshev’s theorem requires z>1 but z need not be an integer
o At least 75% of the data values must be within z=2 standard
deviations of the mean
o At least 89% of the data values must be within z=3 standard
deviations of the mean
o At least 94% of the data values must be within z=4 standard
deviations of the mean
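The percentages quoted above follow from evaluating 1 - 1/z^2 at z = 2, 3, and 4. A short sketch (the function name is illustrative):

```python
def chebyshev_bound(z):
    """Minimum proportion of data within z standard deviations of the mean (z > 1)."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem requires z > 1")
    return 1 - 1 / z ** 2

for z in (1.5, 2, 3, 4):
    print(z, f"at least {100 * chebyshev_bound(z):.2f}%")
```

Because the theorem holds for any data set regardless of shape, these are lower bounds; the actual percentage within z standard deviations is often much higher.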
Empirical Rule
o When the data are believed to approximate a bell-shaped distribution,
the empirical rule can be used to determine the percentage of data
values that must be within a specified number of standard deviations of
the mean
o The empirical rule is based on the normal distribution which is
covered in Ch. 6
For data having a bell-shaped distribution
o 68.26% of the values of a normal random variable are within +/-1
standard deviation of its mean
o 95.44% of the values of a normal random variable are within +/-2
standard deviations of its mean
o 99.72% of the values of a normal random variable are within +/-3 standard
deviations of its mean
Detecting outliers
o An outlier is an unusually small or unusually large value in a data set
o A data value with a z score less than -3 or greater than +3 might be
considered an outlier
o It might be
o An incorrectly recorded data value
o A data value that was incorrectly included in the data set
o A correctly recorded data value that belongs in the data set
Five number summaries and box plots
o Summary stats and easy-to-draw graphs can be used to quickly
summarize large quantities of data
o Two tools that accomplish this are five number summaries and box
plots
o Five number summary
o 1 smallest value
o 2 first quartile
o 3 median
o 4 third quartile
o 5 largest value
Box plot
o Graphical summary of data that is based on five number summary
o A key to the development of a box plot is the computation of the
median and the quartiles Q1 and Q3
o Box plots provide another way to identify outliers
o Limits are located, not drawn, using the interquartile range (IQR)
o Data outside these limits are considered outliers
o The location of each outlier is shown with the symbol *
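The notes do not state how far the limits sit from the box. A common convention, assumed here, places them 1.5 IQR below Q1 and 1.5 IQR above Q3. The data and quartile values below are hypothetical.

```python
def outlier_limits(q1, q3, k=1.5):
    """Lower and upper limits located k * IQR beyond the quartiles (k=1.5 is an assumed convention)."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(values, q1, q3):
    """Values that fall outside the limits."""
    lower, upper = outlier_limits(q1, q3)
    return [x for x in values if x < lower or x > upper]

# Hypothetical data with first quartile 7 and third quartile 15.
data = [3, 5, 7, 9, 11, 12, 14, 15, 18, 40]
limits = outlier_limits(7, 15)
outliers = find_outliers(data, 7, 15)
print(limits, outliers)
```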
Measures of association between two variables
o Thus far we’ve examined numerical methods used to summarize the
data for one variable at a time
o Often a manager or decision maker is interested in the relationship
between two variables
o Two descriptive measures of the relationship between two variables
are covariance and correlation coefficient
Covariance
o Measure of linear association between two variables
o Positive values indicate a positive relationship
o Negative values indicate a negative relationship
Correlation coefficient
o Correlation is a measure of linear association and not necessarily
causation
o Just bc two variables are highly correlated it doesn’t mean that one
variable is the cause of the other
o The coefficient can take on values between -1 and +1
o Values near -1 indicate a strong negative linear relationship
o Values near +1 indicate a strong positive linear relationship
o The closer the correlation is to zero, the weaker the relationship
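A small Python sketch of the sample covariance and correlation coefficient, assuming the usual sample formulas s_xy = Σ(xi - x̄)(yi - ȳ)/(n - 1) and r_xy = s_xy/(s_x s_y). The paired observations are made up for illustration.

```python
from math import sqrt

x = [1, 2, 3, 4, 5]   # made-up paired observations
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Sample covariance: average cross-product of deviations, dividing by n - 1.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# Sample standard deviations of x and y.
s_x = sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
s_y = sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))

# Correlation coefficient: covariance rescaled to lie between -1 and +1.
r_xy = s_xy / (s_x * s_y)
print(f"covariance = {s_xy:.2f}, correlation = {r_xy:.4f}")
```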
Data dashboards: adding numerical measures to improve effectiveness
o Not limited to graphical displays
o The addition of numerical measures, like the mean and standard
deviation of KPIs to a data dashboard is often critical
o Dashboards are often interactive
o Drilling down refers to functionality in interactive dashboards that
allows the user to access info and analyses at an increasingly detailed level
Ch. 4 Intro to Probability
Uncertainties
o Managers often base their decisions on an analysis of uncertainties such
as the following:
o What are the chances that sales will decrease if we increase
prices?
o What is the likelihood a new assembly method will increase
productivity?
o What are the odds that a new investment will be profitable?
Probability
o Numerical measure of the likelihood that an event will occur
o Probability values are always assigned on a scale from 0-1
o A probability near zero indicates an event is quite unlikely to occur
o A probability near 1 indicates an event is almost certain to occur
Probability as a numerical measure of the likelihood of occurrence
Statistical Experiments
o In stats, the notion of an experiment differs somewhat from that of an
experiment in the physical sciences
o In statistical experiments, probability determines outcomes
o Even though the experiment is repeated in exactly the same way an
entirely different outcome may occur
o For this reason statistical experiments are sometimes called random
experiments
An Experiment and its Sample Space
o An experiment is any process that generates well defined outcomes
o The sample space for an experiment is the set of all experimental
outcomes
o An experimental outcome is also called a sample point
Experiment              Experimental outcomes
Toss a coin             Head, tail
Inspect a part          Defective, nondefective
Conduct a sales call    Purchase, no purchase
Roll a die              1, 2, 3, 4, 5, 6
Play a football game    Win, lose, tie
o Bradley investments example
o Bradley has invested in two stocks, Markley Oil and Collins Mining
o Bradley has determined that the possible outcomes of these
investments three months from now are as follows
 Investment gain or loss in 3 months (in $000)
Markley Oil    Collins Mining
10             8
5              -2
0
-20
A counting rule for multiple-step experiments
o If an experiment consists of a sequence of k steps in which there are n1
possible results for the first step, n2 possible results for the second, and
so on, then the total number of experimental outcomes is given by
(n1)(n2)…(nk).
o A helpful graphical representation of a mult step experiment is a tree
diagram
A counting rule for multiple step experiments
o Bradley investments
o Bradley investments can be viewed as a two step experiment.
o It involved two stocks, each with a set of experimental outcomes
 Markley oil: n1=4
 Collins mining: n2=2
 Total # of experimental outcomes= (n1)(n2)=(4)(2)=8
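The counting rule can be checked by listing the sample space directly. A minimal Python sketch using the Bradley gain/loss figures from the notes (the variable names are illustrative):

```python
from itertools import product

markley_oil = [10, 5, 0, -20]   # n1 = 4 possible results (gain/loss in $000)
collins_mining = [8, -2]        # n2 = 2 possible results (gain/loss in $000)

# Each experimental outcome pairs one Markley result with one Collins result,
# which is the same as walking every branch of the tree diagram.
sample_space = list(product(markley_oil, collins_mining))

for markley, collins in sample_space:
    print(f"({markley}, {collins})  net gain/loss: {markley + collins}")

print("total outcomes:", len(sample_space))
```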
Counting rule for combinations
o Number of combos of N objects taken n at a time
o A second useful counting rule enables us to count the number of
experimental outcomes when n objects are to be selected from a
set of N objects
o C(N, n) = N! / (n!(N - n)!)
Counting rule for Permutations
o Number of permutations of N objects taken n at a time
o A third useful counting rule enables us to count the number of
experimental outcomes when n objects are to be selected from a
set of N objects, where the order of selection is important
o P(N, n) = N! / (N - n)!
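A quick Python check of both counting rules, using the standard-library functions math.comb and math.perm (Python 3.8+). The values N = 5 and n = 2 are chosen only for illustration.

```python
from math import comb, factorial, perm

N, n = 5, 2   # illustrative values: choose 2 objects from a set of 5

# Combinations: order of selection does not matter, N! / (n!(N - n)!).
combinations = comb(N, n)

# Permutations: order of selection matters, N! / (N - n)!.
permutations = perm(N, n)

# Every combination of n objects can be ordered in n! ways.
print(combinations, permutations, combinations * factorial(n))
```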
Assigning probabilities
o Basic req’t for assigning probabilities
1. The probability assigned to each experimental outcome must be
between 0 and 1, inclusively
2. The sum of the probabilities for all experimental outcomes must = 1
o Classical method
o Assigning probabilities based on the assumption of equally likely
outcomes
o Ex. Rolling a die
 If an experiment has n possible outcomes, the classical
method would assign a probability of 1/n to each outcome
 Experiment: rolling a die
 Sample space: S={1,2,3,4,5,6}
 Probabilities: each sample point has a 1/6 chance of
occurring
o Relative frequency method
o Assigning probabilities based on experimentation or historical
data
o Ex. Lucas tool rental
 Lucas Tool Rental would like to assign probabilities to the number of car
polishers it rents each day
 Office records show the following frequencies of daily
rentals for the last forty days
 Each probability assignment is given by dividing the
frequency (number of days) by the total frequency (total
number of days)
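A small Python sketch of the relative frequency method. The frequency table did not survive in these notes, so the daily counts below are illustrative placeholders that merely sum to forty days; they are not taken from the notes.

```python
# Illustrative counts only: number of days (out of 40) with 0, 1, 2, 3, 4 rentals.
days_by_rentals = {0: 4, 1: 6, 2: 18, 3: 10, 4: 2}

total_days = sum(days_by_rentals.values())

# Probability = frequency (number of days) / total frequency (total number of days).
probabilities = {rentals: days / total_days for rentals, days in days_by_rentals.items()}

for rentals, prob in probabilities.items():
    print(f"{rentals} polishers rented: {prob:.2f}")
```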
o Subjective method
o Assigning probabilities based on judgment
o When economic conditions and a company’s circumstances
change it might be inappropriate to assign probabilities based
solely on historical data
o We can use any data available as well as our experience and
intuition but ultimately a probability value should express our
degree of belief that the experimental outcome will occur
o The best probability estimates are often obtained by combining
the estimates from the classical or relative frequency approach
with the subjective estimate
o Ex. Bradley investments
o An analyst made the following probability estimates
Events and their probabilities
o An event is a collection of sample points
o The probability of any event is equal to the sum of the probabilities of
the sample points in the event
o If we can identify all the sample points of an experiment and assign a
probability to each, we can compute the probability of an event
Some basic relationships of probability
o There are some basic probability relationships that can be used to
compute the probability of an event without knowledge of all the
sample point probabilities
o Complement of an event
 The complement of event A is defined to be the event
consisting of all sample points that are not in A
 The complement of A is denoted by A^c
o Union of two events
 The union of the events A and B is the event containing all
sample points that are in A or B or both
 The union of events A and B is denoted by A ∪ B
o Intersection of two events
 The intersection of events A and B is the set of all sample
points that are in both A and B
 The intersection of events A and B is denoted by A ∩ B
o Addition law
 Provides a way to compute the probability of event A, or B,
or both A and B occurring
 The law is written as P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
o Mutually exclusive events
 Two events are said to be mutually exclusive if the events
have no sample points in common
 Two events are mutually exclusive if, when one event
occurs, the other can’t
 If events A and B are mutually exclusive, P(A ∩ B) = 0
 The addition law for mutually exclusive events is P(A ∪ B) = P(A) + P(B)
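A minimal Python sketch of the complement, union, intersection and addition law, using one roll of a fair die. The events A and B are made up for illustration, and the addition law is assumed in its usual form P(A ∪ B) = P(A) + P(B) - P(A ∩ B).

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}   # one roll of a fair die
A = {2, 4, 6}                       # illustrative event: an even number
B = {4, 5, 6}                       # illustrative event: a number greater than 3


def prob(event):
    """Classical method: equally likely sample points."""
    return Fraction(len(event), len(sample_space))


p_union = prob(A | B)                               # counted directly from A ∪ B
p_addition_law = prob(A) + prob(B) - prob(A & B)    # same value via the addition law
p_complement = prob(sample_space - A)               # P(A^c) = 1 - P(A)

print(p_union, p_addition_law, p_complement)
```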
Conditional probability
o The probability of an event given that another event has occurred is
called conditional probability
o The conditional probability of A given B is denoted by P(A|B)
o A conditional probability is computed as follows: P(A|B) = P(A ∩ B) / P(B)
Multiplication Law
o Provides a way to compute the probability of the intersection of two
events
o The law is written as: P(A ∩ B) = P(B)P(A|B)
Joint Probability table
Independent events
o If the probability of event A is not changed by the existence of event B
we would say that events A and B are independent
o Two events A and B are independent if
o P(A|B)=P(A) or P(B|A)=P(B)
Multiplication Law for Independent Events
o The multiplication law also can be used as a test to see if two events are
independent
o The law is written as: P(A ∩ B) = P(A)P(B)
Mutual Exclusiveness and Independence
o Do not confuse the notion of mutually exclusive events with that of
independent events
o Two events with nonzero probabilities cannot be both mutually
exclusive and independent
o If one mutually exclusive event is known to occur the other cannot
occur, thus the probability of the other event occurring is reduced to
zero and they are therefore dependent
o Two events that are not mutually exclusive might or might not be
independent
Bayes’ Theorem
o Often we begin probability analysis with initial or prior probabilities
o Then, from a sample, special report or a product test we obtain some
additional info
o Given this info we calculate revised or posterior probabilities
o Bayes theorem provides the means for revising prior probabilities
o Prior probabilities -> new info -> application of Bayes’ theorem -> posterior
probabilities
o Ex. L.S. Clothiers
o A proposed shopping center will provide strong competition for
downtown businesses like L.S. Clothiers. If the shopping center is
built the owner of LS Clothiers feels it would be best to relocate
to the shopping center
o The shopping center cannot be built unless a zoning change is
approved by the town council
o The planning board must first make a recommendation, for or
against the zoning change, to the council
o New info
 The planning board has recommended against the zoning
change. Let B denote the event of a negative
recommendation by the planning board
 Given that B has occurred, should LSC revise the
probabilities that the town council will approve or
disapprove the zoning change?
Conditional probabilities
o Ex. LS Clothiers
o Past history with the planning board and the town council
indicates the following:
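 P(B|A1) = .2 and P(B|A2) = .9, where A1 = town council approves the
zoning change and A2 = town council disapproves the zoning change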
o Bayes Theorem
o To find the posterior probability that event A will occur given that
event B has occurred, we apply this theorem
o P(Ai|B) = P(Ai)P(B|Ai) / [P(A1)P(B|A1) + P(A2)P(B|A2) + ... + P(An)P(B|An)]
o Bayes theorem is applicable when the events for which we want
to compute posterior probabilities are mutually exclusive and
their union is the entire sample space
Posterior Probabilities
o Ex. LS clothier
o Given the planning board’s recommendation not to approve the
zoning change we revise the prior probabilities as follows:
o P(A1|B) = (.7)(.2) / [(.7)(.2) + (.3)(.9)] = .14 / .41 = .34
o The planning board’s recommendation is good news for LS
clothiers. The posterior probability of the town council approving
the zoning change is .34 compared to a prior probability of .70
Bayes Theorem: Tabular approach
o Example: LS Clothiers
o Step 1
 Prepare the following three columns
 Column 1: the mutually exclusive events for which
posterior probabilities are desired
 Column 2: the prior probabilities for the events
 Column 3: the conditional probabilities of the new
info given each event
o Step 2
 Prepare the fourth column
 Column 4: compute the joint probabilities for each
event and the new info B by using the multiplication
law
 Multiply the prior probabilities in column 2 by the
corresponding conditional probabilities in column 3
 We see that there is a .14 probability of the town council
approving the zoning change and a negative
recommendation by the planning board
 There is a .27 probability of the town council disapproving
the zoning change and negative recommendation by the
planning board
o Step 3
 Sum the joint probabilities in column 4
 The sum is the probability of the new info, P(B)
 The sum .14+.27 shows an overall probability of .41 of a
negative recommendation by the planning board
o Step 4
 Prep the fifth column
 Column 5: compute the posterior probabilities using
the basic relationship of conditional probability
 The joint probabilities P(Ai ∩ B) are in column 4,
and the probability P(B) is the sum of column 4
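The four steps of the tabular approach can be sketched in Python. The prior of .70 for approval comes from the notes and .30 is its complement; the conditional probabilities .2 and .9 are not listed in the surviving text and are inferred here from the joint probabilities .14 and .27 (.14/.70 and .27/.30), so treat them as an assumption.

```python
# Column 1: mutually exclusive events for which posterior probabilities are desired.
events = ["A1: council approves", "A2: council disapproves"]

# Column 2: prior probabilities.
prior = [0.70, 0.30]

# Column 3: conditional probabilities P(B | Ai) of a negative recommendation
# (inferred from the joint probabilities in the notes, see lead-in).
conditional = [0.2, 0.9]

# Column 4: joint probabilities P(Ai ∩ B) via the multiplication law.
joint = [p * c for p, c in zip(prior, conditional)]

# Step 3: P(B) is the sum of column 4.
p_b = sum(joint)

# Column 5: posterior probabilities P(Ai | B) = P(Ai ∩ B) / P(B).
posterior = [j / p_b for j in joint]

for row in zip(events, prior, conditional, joint, posterior):
    print("{:<25} {:.2f}  {:.2f}  {:.2f}  {:.4f}".format(*row))
print(f"P(B) = {p_b:.2f}")
```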
Ch. 5 Discrete Probability Distributions
Random Variables
o Random variable: numerical description of the outcome of an
experiment
o Discrete random variable: may assume either a finite number of values
or an infinite sequence of values
o Continuous random variable: may assume any numerical value in an
interval or collection of intervals
Discrete Random variable with a Finite number of values
o Ex. JSL Appliances
o Let x= number of TVs sold at the store in one day, where x can
take on 5 values (0,1,2,3,4)
o We can count the TVs sold and there is a finite upper limit on the
number that might be sold (which is the number of TVs in stock)
Discrete Random variable with an Infinite Sequence of values
o Ex. JSL Appliances
o Let x= number of customers arriving in one day, where x can take on
the values 0,1,2,….
o We can count the customers arriving but there is no finite upper limit
on the number that might arrive
Random variables
Question                            Random Variable X                                                                 Type
Family size                         X=# dependents reported on tax return                                             Discrete
Distance from home to store         X=distance in miles from home to the store site                                   Continuous
Own dog or cat                      X=1 if own no pet, 2 if own dogs only, 3 if own cats only, 4 if own dogs and cats   Discrete
Discrete probability distributions
o The probability distribution for a random variable describes how
probabilities are distributed over the values of the random variable
o We can describe a discrete probability distribution with a table, graph, or
formula
o Two types of discrete probability distributions will be introduced
o 1. Uses the rules of assigning probabilities to experimental
outcomes to determine probabilities for each value of the random
variable
o 2. uses a special mathematical formula to compute probabilities
for each value of the random variable
o The probability distribution is defined by a probability function denoted
by f(x) that provides the probability for each value of the random
variable
o The required conditions for a discrete probability function are
f(x) >= 0 and Σ f(x) = 1
o There are three methods for assigning probabilities to random
variables: the classical method, the subjective method, and the relative
frequency method
o The use of the relative frequency method to develop discrete
probability distributions leads to what is called an empirical discrete
distribution
o In addition to tables and graphs, a formula that gives the probability
function f(x) for every value of x is often used to describe the
probability distributions
o Several discrete probability distributions specified by formulas are the
discrete uniform, binomial, Poisson and hypergeometric distributions
Discrete uniform probability distribution
o Simplest example of discrete probability distribution given by a formula
o The discrete uniform probability function is f(x)=1/n where n=
number of values the random variable may assume
Expected Value
o Measure of its central location
o E(x) = μ = Σ x f(x)
o The expected value is a weighted avg of the values the random variable
may assume
o The weights are the probabilities
o The expected value doesn’t have to be a value the random variable can
assume
Variance and Standard deviation
o The variance summarizes the variability in the values of a random
variable
o Var(x) = σ^2 = Σ (x - μ)^2 f(x)
o The variance is a weighted average of the squared deviations of a
random variable from its mean
o The weights are the probabilities
o The standard deviation is defined as the positive square root of the
variance
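A short Python sketch of the expected value, variance and standard deviation as probability-weighted sums, assuming the standard definitions E(x) = Σ x f(x) and Var(x) = Σ (x - μ)^2 f(x). It uses the discrete uniform distribution for one roll of a die, f(x) = 1/6.

```python
from fractions import Fraction
from math import sqrt

values = [1, 2, 3, 4, 5, 6]

# Discrete uniform probability function: f(x) = 1/n for each of the n values.
f = {x: Fraction(1, len(values)) for x in values}

# Required conditions for a discrete probability function.
valid = all(p >= 0 for p in f.values()) and sum(f.values()) == 1

# Expected value: weighted average of the values, weights are the probabilities.
mean = sum(x * p for x, p in f.items())

# Variance: weighted average of the squared deviations from the mean.
variance = sum((x - mean) ** 2 * p for x, p in f.items())

# Standard deviation: positive square root of the variance.
std_dev = sqrt(variance)

print(valid, mean, variance, round(std_dev, 4))
```

Note that the expected value of 7/2 is not a value the die can actually show.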
Bivariate Distributions
o A probability distribution involving two random variables
o Each outcome of a bivariate experiment consists of two values, one for
each random variable
o Ex. Rolling a pair of dice
o When dealing with bivariate probability distributions, we are often
interested in the relationship between the random variables
A bivariate discrete probability distribution
o A company asked 200 of its employees how they rated their benefit
package and job satisfaction
o The crosstabulation below shows the ratings data
o The bivariate empirical discrete probabilities for benefits rating and job
satisfaction are listed below
o Covariance for random variables x and y:
σ_xy = [Var(x + y) - Var(x) - Var(y)] / 2
Binomial Probability distribution
o Four properties of a binomial experiment
1. The experiment consists of a sequence of n identical trials
2. Two outcomes, success and failure, are possible on each trial
3. The probability of a success, denoted by p, does not change from
trial to trial (stationary assumption)
4. The trials are independent
o Our interest is in the number of successes occurring in the n trials
o We let x denote the number of successes occurring in the n trials
o Ex. Evans Electronics
o Evans electronics is concerned about a low retention rate for its
employees
o In recent years, management has seen a turnover of 10% of hourly
employees annually
o Thus for any hourly employee chosen at random management
estimates a probability of .1 that the person will not be with the
company next year
o Choosing 3 hourly employees at random, what is the probability
that 1 of them will leave the company this year?
 The probability of the first employee leaving and the
second and third employees staying, denoted (S,F,F), is
given by
 p(1-p)(1-p)
 With a .10 probability of an employee leaving on any one
trial, the probability of an employee leaving on the first trial
and not on the second and third trials is given by
 (.1)(.9)(.9)=(.1)(.9)^2=.081
o Two other experimental outcomes also result in one success and
two failures
o The probabilities for the three experimental outcomes involving
one success follow
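The Evans Electronics calculation can be sketched in Python, assuming the standard binomial probability function f(x) = C(n, x) p^x (1 - p)^(n - x). The function name binomial_pmf is illustrative.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 3, 0.10   # 3 employees, probability .1 that any one of them leaves

# One particular sequence, e.g. (S, F, F): first leaves, the other two stay.
single_sequence = p * (1 - p) ** 2

# Three sequences give exactly one success: (S,F,F), (F,S,F), (F,F,S).
p_one_leaves = binomial_pmf(1, n, p)

print(f"one sequence: {single_sequence:.3f}")
print(f"exactly one of three leaves: {p_one_leaves:.3f}")
```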
Binomial probabilities and cumulative probabilities
o Statisticians have developed tables that give probabilities and
cumulative probabilities for a binomial random variable
o These tables can be found in some stats textbooks
o With modern calculators and the capability of statistical software
packages, such tables are almost unnecessary
Poisson probability distribution
o A Poisson distributed random variable is often useful in estimating the
number of occurrences over a specified interval of time or space
o It is a discrete random variable that may assume an infinite sequence of
values (x=0,1,2,…)
o Examples of Poisson distributed random variables
o The number of knotholes in 14 linear feet of pine board
o The number of vehicles arriving at a toll booth in one hour
o Bell Labs used the Poisson distribution to model the arrival of phone
calls
o Two properties of a Poisson experiment
1. The probability of an occurrence is the same for any two intervals of
equal length
2. The occurrence or nonoccurrence in any interval is independent of
the occurrence or nonoccurrence in any other interval
o Poisson probability function
o f(x) = (μ^x e^(-μ)) / x!
o Where:
 x = number of occurrences in an interval
 f(x) = the probability of x occurrences in an interval
 μ = mean number of occurrences in an interval
 e = 2.71828
 x! = x(x-1)(x-2)…(2)(1)
o Since there is no stated upper limit for the number of
occurrences, the probability function f(x) is applicable for values
x=0,1,2,… without limit
o In practical applications, x will eventually become large enough
so that f(x) is approximately zero and the probability of any
larger value of x becomes negligible
o Ex. Mercy Hospital
o Patients arrive at the emergency room of Mercy Hospital at the
average rate of 6 per hour on weekend evenings. What’s the
probability of 4 arrivals in 30 mins on a weekend evening?
o μ = 6/hour = 3/half-hour and x = 4, so f(4) = (3^4 e^(-3)) / 4! = .1680
o A property of the Poisson distribution is that the mean and variance are
equal
o Ex. Mercy Hospital
o Variance for number of arrivals
 During 20 min periods
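A Python sketch of the Mercy Hospital numbers, assuming the standard Poisson probability function f(x) = μ^x e^(-μ) / x!. The hourly rate of 6 is rescaled to the interval of interest before the function is applied; the helper names are illustrative.

```python
from math import exp, factorial


def poisson_pmf(x: int, mu: float) -> float:
    """Probability of exactly x occurrences when the interval mean is mu."""
    return mu**x * exp(-mu) / factorial(x)


rate_per_hour = 6

# 30-minute interval: mean = 6 * (30/60) = 3 arrivals.
mu_30 = rate_per_hour * 30 / 60
p_four_arrivals = poisson_pmf(4, mu_30)

# 20-minute interval: mean = 6 * (20/60) = 2, and the variance equals the mean.
mu_20 = rate_per_hour * 20 / 60
variance_20 = mu_20

print(f"P(4 arrivals in 30 min) = {p_four_arrivals:.4f}")
print(f"20-min periods: mean = variance = {variance_20}")
```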
Hypergeometric probability Distribution
- Closely related to binomial distribution
- However for the hypergeometric distribution
o the trials are not independent
o the probability of success changes from trial to trial
- Ex. Neveready’s Batteries
o Bob Neveready has removed two dead batteries from a flashlight
and inadvertently mingled them with the two good batteries he
intended as replacements
o The four batteries look identical
o Bob now randomly selects two of the four batteries
o What is the probability he selects the two good batteries?
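A Python sketch of the battery question, assuming the standard hypergeometric probability function f(x) = C(r, x) C(N - r, n - x) / C(N, n), where N = population size, r = number of successes in the population, n = number of trials. The function name is illustrative.

```python
from fractions import Fraction
from math import comb


def hypergeometric_pmf(x: int, N: int, r: int, n: int) -> Fraction:
    """Probability of x successes in n draws without replacement."""
    return Fraction(comb(r, x) * comb(N - r, n - x), comb(N, n))


# N = 4 batteries in total, r = 2 good ones, n = 2 selected, x = 2 good selected.
p_two_good = hypergeometric_pmf(2, N=4, r=2, n=2)
print(p_two_good, float(p_two_good))
```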
- Consider a hypergeometric distribution with n trials and let p=r/N
denote the probability of a success on the first trial
- If the population size is large, the term (N-n)/(N-1) approaches 1
- The expected value and variance can be written E(x)=np and
Var(x)=np(1-p)
- Note that these are the expressions for the expected value and variance
of a binomial distribution
- When the population size is large a hypergeometric distribution can be
approximated by a binomial distribution with n trials and probability of
success p=(r/N)
Ch. 6 Continuous Probability Distributions
Continuous probability distributions
- A continuous random variable can assume any value in an interval on
the real line or in a collection of intervals
- It’s not possible to talk about the probability of the random variable
assuming a particular value
- Instead, we talk about the probability of the random variable assuming
a value within a given interval
- The probability of the random variable assuming a value within a given
interval from x1 to x2 is defined to be the area under the graph of the
probability density function between x1 and x2
Uniform probability distribution
- A random variable is uniformly distributed whenever the probability is
proportional to the interval’s length
- The uniform probability density function is:
o f(x) = 1/(b - a) for a <= x <= b, and f(x) = 0 elsewhere
o Where a = smallest value the variable can assume
o b = largest value the variable can assume
- Expected value of x: E(x)=(a+b)/2
- Variance of x: Var(x)=(b-a)^2/12
- Ex. Slater’s Buffet
o Slater customers are charged for the amt of salad they take
o Sampling suggests that the amt of salad taken is uniformly
distributed between 5 and 15 ounces
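A Python sketch for Slater's Buffet, assuming the uniform density f(x) = 1/(b - a) on the interval from a = 5 to b = 15 ounces. Probability is the area of a rectangle: interval width times height. The 12 to 15 ounce interval is chosen only for illustration.

```python
a, b = 5, 15   # salad amount is uniformly distributed between 5 and 15 ounces


def uniform_prob(x1: float, x2: float) -> float:
    """P(x1 <= x <= x2): width of the overlap with [a, b] times height 1/(b - a)."""
    low, high = max(x1, a), min(x2, b)
    return max(high - low, 0) / (b - a)


expected_value = (a + b) / 2      # E(x) = (a + b)/2
variance = (b - a) ** 2 / 12      # Var(x) = (b - a)^2/12
p_12_to_15 = uniform_prob(12, 15)

print(expected_value, round(variance, 2), p_12_to_15)
```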
Area as a measure of probability
- The area under the graph of f(x) and probability are identical
- This is valid for all continuous random variables
- The probability that x takes on a value between some lower value x1
and some higher value x2 can be found by computing the area under
the graph of f(x) over the interval from x1 to x2
Normal probability distribution
- Most important distribution for describing a continuous random
variable
- It is widely used in statistical inference
- It has been used in a wide variety of applications including
o Heights of people
o Rainfall amts
o Test scores
o Scientific measurements
- Abraham de Moivre, a French mathematician, published The Doctrine of
Chances in 1733
- He derived the normal distribution
- Characteristics
o The distribution is symmetric, its skewness measure is zero
o The entire family of normal probability distributions is defined by
its mean and its standard deviation
o The highest point on the normal curve is at the mean, which is
also the median and mode
o The mean can be any numerical value: negative, zero, or positive
o The standard deviation determines the width of the curve: larger
values result in wider, flatter curves
o Probabilities for the normal random variable are given by areas
under the curve
o The total area under the curve is 1 (.5 to the left of the mean and .5
to the right)
o 68.26% of values of a normal random variable are within +/-1
standard deviation of its mean
o 95.44% of values of a normal random variable are within +/-2
standard deviations of its mean
o 99.72% of values of a normal random variable are within +/-3
standard deviations of its mean
Standard normal probability distribution
- Characteristics
o A random variable having a normal distribution with a mean of 0
and standard deviation of 1 is said to have a standard normal
probability distribution
o The letter z is used to designate the standard normal random
variable
o Converting to the standard normal distribution, we can think of z
as a measure of the number of standard deviations x is from μ
o z = (x - μ) / σ
o Ex. Pep Zone
 Pep zone sells auto parts and supplies including a popular
multi-grade motor oil
 When the stock of this oil drops to 20gal, a replenishment
order is placed
 The store manager is concerned that sales are being lost
due to stockouts while waiting for a replenishment order
 It has been determined that demand during replenishment
lead-time is normally distributed with a mean of 15 gal and a
standard deviation of 6 gal
 The manager would like to know the probability of a
stockout during replenishment lead-time. In other words,
what is the probability that demand during lead time will
exceed 20 gal?
 P(x>20)=?
- Solving for a stockout probability
o Step 1: convert x to the standard normal distribution
o z = (x - μ) / σ = (20 - 15) / 6 = .83
o Step 2: find the area under the standard normal curve to the left
of z=.83
o P(z <= .83) = .7967
o Step 3: compute the area under the standard normal curve to the
right of z=.83
o P(z > .83) = 1 - P(z <= .83) = 1 - .7967 = .2033
- If the manager of Pep Zone wants the probability of a stockout during
replenishment lead-time to be no more than .05 what should the
reorder point be?
- Hint: given a probability, we can use the standard normal table in an
inverse fashion to find the corresponding z value
- Solving for reorder point
o Step 1: find the z value that cuts off an area of .05 in the right tail
of the standard normal distribution
o z.05 = 1.645
o Step 2: convert z.05 to the corresponding value of x
o x = μ + z.05 σ = 15 + 1.645(6) = 24.87, or about 25
o A reorder point of 25 gal will place the probability of a stockout during
lead time at slightly less than .05
o By raising the reorder point from 20 to 25 gal on hand, the
probability of a stockout decreases from about .2 to .05
o This is a significant decrease in the chance that Pep Zone will be
out of stock and unable to meet a customer’s desire to make a
purchase
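Both Pep Zone questions can be checked with Python's standard-library statistics.NormalDist instead of a table. This is only a sketch; because table lookups round z to two decimals, hand results can differ from these values in the third decimal place.

```python
from statistics import NormalDist

# Demand during replenishment lead time: mean 15 gal, standard deviation 6 gal.
demand = NormalDist(mu=15, sigma=6)

# Probability of a stockout with a reorder point of 20 gal: P(x > 20).
p_stockout_at_20 = 1 - demand.cdf(20)

# Reorder point that leaves .05 in the right tail (inverse use of the table).
reorder_point = demand.inv_cdf(0.95)

# Probability of a stockout if the reorder point is rounded up to 25 gal.
p_stockout_at_25 = 1 - demand.cdf(25)

print(f"P(x > 20) = {p_stockout_at_20:.4f}")
print(f"reorder point for .05 = {reorder_point:.2f} gal")
print(f"P(x > 25) = {p_stockout_at_25:.4f}")
```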
Normal Approximation of Binomial Probabilities
- When the number of trials n becomes large evaluating the binomial
probability function by hand or with a calculator is difficult
- The normal prob dist provides an easy to use approximation of binomial
probabilities when np >= 5 and n(1-p) >= 5
- Set μ = np and σ = sqrt(np(1-p))
- Add and subtract a continuity correction factor of .5 because a continuous
distribution is being used to approximate a discrete distribution
- Ex.
o Suppose that a company has a history of making errors in 10% of
its invoices. A sample of 100 invoices has been taken and we want
to compute the probability that 12 invoices contain errors
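A Python sketch of the invoice example, assuming the usual setup μ = np, σ = sqrt(np(1 - p)) and a continuity correction of .5 on each side of x = 12. The exact binomial probability is computed alongside for comparison.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, x = 100, 0.10, 12   # 100 invoices, 10% error rate, exactly 12 with errors

mu = n * p                        # 10
sigma = sqrt(n * p * (1 - p))     # 3

# Normal approximation: P(x = 12) is approximated by P(11.5 <= x <= 12.5).
approx = NormalDist(mu, sigma)
p_normal_approx = approx.cdf(x + 0.5) - approx.cdf(x - 0.5)

# Exact binomial probability for comparison.
p_exact = comb(n, x) * p**x * (1 - p) ** (n - x)

print(f"normal approximation: {p_normal_approx:.4f}")
print(f"exact binomial:       {p_exact:.4f}")
```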
Exponential probability distribution
- Useful in describing the time it takes to complete a task
- The exponential random variables can be used to describe:
o Time between vehicle arrivals at a toll booth
o Time req’d to complete a questionnaire
o Distance between major defects in a highway
o In waiting line applications, the exponential distribution is often
used for service times
- A property of exponential distribution is that the mean and the
standard deviation are equal
- The exponential distribution is skewed to the right
- Cumulative Probabilities
o P(x <= x0) = 1 - e^(-x0/μ), where x0 = some specific value of x
- Ex. Al’s Full service pump
o The time between arrivals of cars at Al’s follows an exponential
probability distribution with a mean time between arrivals of 3
mins
o Al would like to know the probability that the time between two
successive arrivals will be 2 mins or less
o P(x <= 2) = 1 - e^(-2/3) = 1 - .5134 = .4866
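A short Python sketch of the calculation for Al's, assuming the exponential cumulative probability P(x <= x0) = 1 - e^(-x0/μ) with μ = 3 minutes. The function name is illustrative.

```python
from math import exp


def exponential_cdf(x0: float, mean: float) -> float:
    """P(x <= x0) for an exponential random variable with the given mean."""
    return 1 - exp(-x0 / mean)


mean_minutes = 3   # mean time between arrivals
p_within_2_min = exponential_cdf(2, mean_minutes)

print(f"P(time between arrivals <= 2 min) = {p_within_2_min:.4f}")
```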
Relationship between the Poisson and Exponential Distributions
- The Poisson distribution provides an appropriate description of the
number of occurrences per interval
- And the exponential distribution provides an appropriate description of
the length of the interval between occurrences