IE 415: SUMMER 2015
LAB 1: Simulating Confidence Intervals with Excel
1. Introduction
In this laboratory exercise, you will work in teams of two to estimate the
“confidence level” of confidence intervals for the mean of a random variable. As
was (or will be) discussed in lecture, confidence intervals fall under the topic of
statistical inference, where the objective is to draw conclusions about a population
or probability distribution based on sample observations. A confidence interval is
most often constructed as an interval intended to contain the mean of a probability
distribution with a specified “confidence”. The confidence level is interpreted in
terms of repeated data collection and interval construction: it equals the long-run
fraction of the constructed intervals that contain the mean of the probability
distribution.
However, confidence levels are based on assumptions made about the probability
distribution from which the data samples are realizations. If the data are assumed
to be independent samples from a normal distribution, and the standard deviation
is estimated from the data, then the 95% confidence interval for the mean is:
$$\left[\, \bar{x} - t_{0.025,\,n-1}\,\frac{s}{\sqrt{n}},\;\; \bar{x} + t_{0.025,\,n-1}\,\frac{s}{\sqrt{n}} \,\right]$$
where $\bar{x}$ is the sample average of the $n$ observations, $s$ is the sample
standard deviation, and $t_{0.025,\,n-1}$ is the value from a t-distribution with
$n-1$ degrees of freedom such that the probability of observing a value greater
than $t_{0.025,\,n-1}$ is equal to 0.025.
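As a cross-check outside Excel, the interval formula can be sketched in Python. This is a minimal illustration, not part of the assignment: the function name `t_interval_95`, the constant name, and the sample data are hypothetical, and the critical value 2.776 (the value of $t_{0.025,4}$ quoted in the Lab Assignment section) is only valid for n = 5 observations.

```python
import math
import statistics

# Critical value t_{0.025,4}, valid only for n = 5 observations.
# Other sample sizes need a different critical value.
T_CRIT_N5 = 2.776


def t_interval_95(sample, t_crit=T_CRIT_N5):
    """Return (lower, upper) of the 95% t-interval for the mean."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    half_width = t_crit * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Hypothetical data, chosen only for illustration.
low, high = t_interval_95([2.1, 7.4, 5.0, 9.3, 1.2])
print(f"95% CI: [{low:.3f}, {high:.3f}]")
```

In Excel, the same quantities correspond to AVERAGE, STDEV.S, SQRT, and T.INV applied to the five observations.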
When the assumptions under which a confidence interval is constructed are not met,
the actual confidence level will usually differ from the stated confidence level.
In this lab you will use Monte Carlo simulation to estimate the actual confidence
level of such intervals under different conditions.
2. Generating Data in Excel
Excel can generate observations from several distributions. To do so, the Analysis
ToolPak must be installed. This is done through the selection:
File→Options→Add-Ins.
The generation of observations from distributions is done through the selection:
Data (a tab)→Data Analysis→Random Number Generation.
The lab instructors will demonstrate. You will use this Excel capability to
generate observations from a normal distribution with mean 5 and standard
deviation 5.
You will also generate observations from an exponential distribution with
mean 5 (and therefore standard deviation 5).
If we let X denote the exponential random variable, then a single observation x
will be generated using the formula:
x = −5 ∗ ln(RAND())
which is entered in an Excel cell as =-5*LN(RAND()). RAND() is the Excel function
for generating a random number (a value equally likely to be any value between
0 and 1).
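The formula is an instance of the inverse-transform method: if U is uniform on (0, 1), then −5 ln(U) has an exponential distribution with mean 5. The following Python sketch mirrors the Excel formula so its behavior can be checked independently. The function name, seed, and number of draws are arbitrary choices for illustration.

```python
import math
import random


def exponential_observation(mean=5.0, rng=random):
    """Inverse-transform sample: x = -mean * ln(U), with U uniform on (0, 1]."""
    u = 1.0 - rng.random()  # random() lies in [0, 1); flipping avoids ln(0)
    return -mean * math.log(u)


rng = random.Random(415)  # arbitrary seed, used only for reproducibility
draws = [exponential_observation(5.0, rng) for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)
print(f"sample mean = {sample_mean:.2f}")  # should land close to 5
```

With this many draws, the sample mean should sit near the theoretical mean of 5, which is a quick sanity check on the generator.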
3. Useful Excel Features
Two useful features in Excel that you can use in this lab assignment are the IF
function and relative and absolute cell referencing.
IF Function
The IF function in Excel has the following syntax:
=IF(logical_test, [value_if_true], [value_if_false])
See Excel Help for more information on the arguments. The “IF function” can be
used in many ways in a spreadsheet simulation. An important feature is that IF
functions can be nested in other IF functions, which permits the modeling of
outcomes based on multiple conditions, without having to enumerate all of the
possible conditions.
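To make the nesting concrete, consider a hypothetical Excel formula such as =IF(5<B2,"below",IF(5>C2,"above","inside")), where B2 and C2 are assumed to hold the lower and upper limits of an interval. The Python sketch below expresses the same logic. The function name and labels are illustrative only.

```python
def locate_mean(lower, upper, true_mean=5.0):
    """Classify where true_mean falls relative to the interval [lower, upper]."""
    if true_mean < lower:      # outer IF: logical_test is true
        return "below"
    elif true_mean > upper:    # inner IF, nested in the outer value_if_false
        return "above"
    else:                      # inner value_if_false
        return "inside"


print(locate_mean(1.0, 9.0))   # mean 5 lies within [1, 9]
print(locate_mean(6.0, 12.0))  # mean 5 lies below the interval
```

Two nested tests cover three outcomes, which is the pattern the paragraph above describes.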
Relative and Absolute Cell Referencing
In Excel, formulas and functions can reference values in other cells (e.g., cell
A3). If cell C3 contains the formula “=A3 + B3” (the sum of the values in cells A3
and B3) and is copied to cell C4, the formula in C4 automatically changes to
“=A4+B4”. To control this automatic adjustment of cell references, a method called
absolute column and row referencing can be used. If the formula in cell C3 is
“=$A$3 + $B$3”, and cell C3 is copied and pasted into any other cell location, the
copied formula remains “=$A$3 + $B$3”. If the “$” sign is removed from the position
in front of a row number, the row reference will change automatically, but the
column reference will not. The opposite is true if the “$” sign is removed from the
position in front of a column letter. This absolute referencing technique is
helpful when creating tables that reference values in the column and row headings.
To practice this, create a small table (a multiplication table is a good example)
with numeric column and row headings. Then insert into one cell of the table a
formula that references the column heading (using an absolute row reference) and
the row heading (using an absolute column reference). You should then be able to
copy and paste this single formula into all remaining table cells.
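The copy-and-paste behavior can be imitated with a short Python sketch that shifts every reference part not locked by a “$”. This is a simplified model for illustration, not how Excel is implemented: the function names are hypothetical, the pattern would also match function names that look like cell references, and shifts that move a reference off the sheet are not handled.

```python
import re

# Matches references such as A3, $A3, A$3, $A$3 (simplified pattern).
REF = re.compile(r"(\$?)([A-Z]+)(\$?)(\d+)")


def col_to_num(col):
    """Convert a column letter to a number: A -> 1, Z -> 26, AA -> 27."""
    n = 0
    for ch in col:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def num_to_col(n):
    """Convert a positive column number back to letters."""
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def copy_formula(formula, d_rows, d_cols):
    """Shift relative parts of each reference; parts preceded by $ stay fixed."""
    def shift(match):
        col_abs, col, row_abs, row = match.groups()
        if not col_abs:
            col = num_to_col(col_to_num(col) + d_cols)
        if not row_abs:
            row = str(int(row) + d_rows)
        return f"{col_abs}{col}{row_abs}{row}"
    return REF.sub(shift, formula)


print(copy_formula("=A3+B3", 1, 0))      # copied one row down
print(copy_formula("=$A2*B$1", 3, 2))    # multiplication-table cell B2 copied to D5
```

In the second call, the row heading stays in column A and the column heading stays in row 1, which is why a single formula can fill a whole multiplication table.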
4. Lab Assignment
To complete the lab, work with your partner and follow the general procedure
below separately for both the normal and the exponential random variables (each
with mean 5 and standard deviation 5).
- Generate five independent observations of the random variable.
- Construct a 95% confidence interval.
- Record whether the confidence interval contains the known mean.
- Repeat for a total of 5000 replications (i.e., generate five independent
  observations 5000 times and construct 5000 95% confidence intervals).
- Estimate the confidence level from the simulation results. Since you know
  the true mean, you can tabulate the percentage of confidence intervals that
  contain it. This percentage is the estimated confidence level.
Note: $t_{0.025,4} = 2.776$, which Excel returns from =T.INV(0.975,4).
Do not use any macros or other Excel add-ins to complete the assignment.
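For an independent check of the spreadsheet results (not a substitute for the Excel work required here), the procedure above can be sketched in Python. The function names, the seed, and the use of Python's `random` module are assumptions made for illustration. The critical value 2.776 is the one given in the note above.

```python
import math
import random
import statistics

T_CRIT = 2.776    # t_{0.025,4}, as given in the note above
TRUE_MEAN = 5.0   # known mean of both distributions
N_OBS = 5         # observations per confidence interval
N_REPS = 5000     # number of confidence intervals


def draw_normal(rng):
    """One observation from a normal distribution with mean 5 and sd 5."""
    return rng.gauss(5.0, 5.0)


def draw_exponential(rng):
    """One observation from an exponential distribution with mean 5."""
    return -5.0 * math.log(1.0 - rng.random())


def estimate_confidence_level(draw, rng, reps=N_REPS):
    """Fraction of simulated 95% t-intervals that contain TRUE_MEAN."""
    hits = 0
    for _ in range(reps):
        sample = [draw(rng) for _ in range(N_OBS)]
        x_bar = statistics.mean(sample)
        half = T_CRIT * statistics.stdev(sample) / math.sqrt(N_OBS)
        if x_bar - half <= TRUE_MEAN <= x_bar + half:
            hits += 1
    return hits / reps


rng = random.Random(2015)  # arbitrary seed, used only for reproducibility
normal_level = estimate_confidence_level(draw_normal, rng)
expo_level = estimate_confidence_level(draw_exponential, rng)
print(f"normal:      {normal_level:.3f}")
print(f"exponential: {expo_level:.3f}")
```

Because the estimates are themselves random, a spreadsheet result will not match these numbers exactly. Agreement to within a percentage point or so is a reasonable expectation with 5000 replications.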
What to turn in
E-mail your completed spreadsheet to your lab TA. Include the following:
- Names of team members (File Name: Last Name-Last Name-Lab#).
- Clear row and column headings.
- Any other documentation to make the spreadsheet understandable. This is
  a judgment call, so add more rather than less documentation. Assignments
  that are too hard to understand will be penalized. Spreadsheets with no
  column/row labels and no documentation will receive a zero.
- Clearly show the estimated confidence level.