Lab 1

STAT 452/652 - MINITAB LAB 1
Plan:
1. Computation of probabilities, cdf values, and quantiles for theoretical distributions, continuous and
discrete.
2. Random sample generation from theoretical distributions.
3. Random sample generation: FTSS in action.
4. Histograms: information about the pdf of the underlying (model) distribution.
5. Illustration of the Central Limit Theorem and the distributions of t- and z-statistics.
LAB WORK
Computations of probabilities, cdf and quantile function for theoretical distributions.
1. Continuous distributions, for example Normal. Let X ∼ Normal(1, 2), that is X has a normal distribution with mean 1 and standard deviation 2.
(a) Find the probability P (X ≤ 3), that is find its cdf at 3. What is P (X < 3)?
In MINITAB: Calc-Probability Distributions-Normal: (1) fill mean and standard deviation
boxes; (2) ask for cumulative probability, (3) input constant 3. Result: P (X ≤ 3) = 0.8413.
(b) Find the probability that X is larger than 3: P (X > 3).
Need to find cdf F at 3 and take 1 − F (3) = 1 − 0.8413 = 0.1587.
(c) Find the probability that X is between -2 and 2: P (−2 < X < 2).
Find the difference between the cdf at 2 and at -2: P (−2 < X < 2) = F (2) − F (−2) = 0.6915 − 0.0668 = 0.6247.
(d) Find the 85th percentile of X, i.e. the value a such that P (X ≤ a) = 0.85.
In MINITAB: Calc-Probability Distributions-Normal: (1) fill mean and standard deviation boxes; (2) ask for INVERSE CUMULATIVE PROBABILITY, (3) input constant
0.85. Result: 3.0729, i.e. P (X ≤ 3.0729) = 0.85.
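For readers who want to cross-check the MINITAB output, the four computations above can be sketched in Python. This is only an illustration, not part of the lab: it uses the standard-library `statistics.NormalDist`, assumes (as the results above imply) that Normal(1, 2) means mean 1 and standard deviation 2, and the variable names are made up for this example.

```python
from statistics import NormalDist

# X ~ Normal with mean 1 and standard deviation 2 (assumed parametrization)
X = NormalDist(mu=1, sigma=2)

p_le_3 = X.cdf(3)                 # (a) P(X <= 3), the cdf at 3
p_gt_3 = 1 - X.cdf(3)             # (b) P(X > 3) = 1 - F(3)
p_between = X.cdf(2) - X.cdf(-2)  # (c) P(-2 < X < 2) = F(2) - F(-2)
q85 = X.inv_cdf(0.85)             # (d) 85th percentile (inverse cdf)

print(round(p_le_3, 4), round(p_gt_3, 4), round(p_between, 4), round(q85, 4))
```

Because X is continuous, P (X < 3) and P (X ≤ 3) coincide, so `p_le_3` answers both questions in (a).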
2. Discrete distributions, for example binomial. Let X ∼ Bin(10, 0.6), that is X has a binomial distribution
with number of trials 10 and probability of success 0.6.
(a) Find the probability X is equal to 5: P (X = 5).
In MINITAB: Calc-Probability Distributions-Binomial: (1) fill number of trials (10) and
event probability (0.6); (2) ask for PROBABILITY, (3) input constant 5. Result: 0.2007.
(b) Find the probability that X is larger than 5: P (X > 5).
Need to find cdf F at 5 and take 1 − F (5) = 1 − 0.3669 = 0.6331.
(c) Find the 85th percentile of X.
In MINITAB: Calc-Probability Distributions-Binomial: (1) fill number of trials and event
probability; (2) ask for INVERSE CUMULATIVE PROBABILITY, (3) input constant
0.85. Results: P (X ≤ 7) = 0.8327 and P (X ≤ 8) = 0.9536, or F (7) = 0.8327 and F (8) = 0.9536.
Remember we need q = min{x : F (x) ≥ 0.85}. Since F (7) < 0.85 ≤ F (8), the 85th percentile of X is 8.
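The binomial computations above can also be checked by hand-rolling the pmf, the cdf, and the quantile rule q = min{x : F (x) ≥ 0.85}. The sketch below is an illustration only; the helper names `binom_pmf`, `binom_cdf`, and `binom_quantile` are hypothetical and do not correspond to any MINITAB command.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """F(k) = P(X <= k), summing the pmf from 0 to k."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

def binom_quantile(q, n, p):
    """Smallest x with F(x) >= q, matching the definition used in the lab."""
    cumulative = 0.0
    for x in range(n + 1):
        cumulative += binom_pmf(x, n, p)
        if cumulative >= q:
            return x
    return n  # guard against rounding when q is extremely close to 1

n, p = 10, 0.6
p_eq_5 = binom_pmf(5, n, p)       # (a) P(X = 5)
p_gt_5 = 1 - binom_cdf(5, n, p)   # (b) P(X > 5) = 1 - F(5)
q85 = binom_quantile(0.85, n, p)  # (c) 85th percentile

print(round(p_eq_5, 4), round(p_gt_5, 4), q85)
```

For a discrete distribution the cdf jumps over 0.85 between 7 and 8, which is why the quantile is defined through the minimum rather than by solving F (x) = 0.85 exactly.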
3. Random sample generation from theoretical distributions.
a) Uniform. Generate a random sample of size 100 from U(0, 1).
In MINITAB: Calc-Random Data-Uniform; (1) fill: generate 100 rows, (2) fill lower and
upper endpoints as 0 and 1, (3) provide storage column, say C1.
b) Binomial. Generate a random sample of size 20 from Bin(10, 0.1).
In MINITAB: Calc-Random Data-Binomial; (1) fill: generate 20 rows, (2) fill number of
trials and probability of success as 10 and 0.1, (3) provide storage column, say C2.
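As a rough analogue of Calc-Random Data, the two samples can be generated with Python's standard `random` module. This is a sketch under stated assumptions: the seed value is arbitrary, and the binomial draw is built by counting successes in 10 Bernoulli trials, which is one simple way to do it and not necessarily how MINITAB generates binomial data.

```python
import random

rng = random.Random(452)  # arbitrary seed, used only to make the run reproducible

# a) 100 observations from U(0, 1), playing the role of column C1
uniform_sample = [rng.uniform(0, 1) for _ in range(100)]

def binomial_draw(n, p):
    """One Bin(n, p) observation: count successes in n independent trials."""
    return sum(rng.random() < p for _ in range(n))

# b) 20 observations from Bin(10, 0.1), playing the role of column C2
binomial_sample = [binomial_draw(10, 0.1) for _ in range(20)]

print(uniform_sample[:5])
print(binomial_sample)
```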
4. Random sample generation - FTSS in action. Generate a sample of size 100 from a normal distribution with mean 0 and standard deviation 1 and a sample from an exponential distribution with mean
2, using the FTSS.
(1) Generate the sample from U(0,1), store in C1;
(2) Compute the inverse normal cdf on that sample and store the result in another column, say C2. This gives a sample
from the standard normal distribution.
(3) Compute the inverse exponential cdf on the uniform sample, and store it in a column. (Note that the inverse
cdf for the exponential distribution with mean 2 is F⁻¹(u) = −2 ln(1 − u), where u is an observation from the U(0,1)
distribution.) In MINITAB natural log is loge.
(4) Graph histograms of your samples from the uniform, normal, and exponential distributions and see
if they look like samples from uniform, normal and exponential distributions.
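Steps (1) to (3) amount to applying an inverse cdf to uniform draws. A minimal Python sketch follows, assuming the standard-library `NormalDist.inv_cdf` as the inverse normal cdf and the formula −2 ln(1 − u) from step (3); the seed and the helper `open_unit_uniform` are choices made for this example only.

```python
import math
import random
from statistics import NormalDist

rng = random.Random(1)  # arbitrary seed, only for reproducibility

def open_unit_uniform():
    """Draw from U(0, 1), redrawing the (very rare) exact 0.0."""
    value = rng.random()
    while value == 0.0:
        value = rng.random()
    return value

# (1) uniform sample, playing the role of column C1
u = [open_unit_uniform() for _ in range(100)]

# (2) inverse standard normal cdf applied to each uniform value
std_normal = NormalDist(mu=0, sigma=1)
normal_sample = [std_normal.inv_cdf(x) for x in u]

# (3) inverse cdf of the exponential with mean 2: -2 ln(1 - u)
exp_sample = [-2 * math.log(1 - x) for x in u]

print(sum(normal_sample) / 100, sum(exp_sample) / 100)
```

The zero is excluded because the inverse normal cdf is only defined on the open interval (0, 1). Both derived samples reuse the same uniform values, just as both MINITAB columns are computed from C1.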
5. Illustration of the CLT.
(1) Generate a sample of 1000 observations from N(1,2) and store it in C1. Repeat 4
more times: you will have 5 samples of size 1000 each from N(1,2), stored in C1 - C5.
(2) Now look at the data by row: you have 1000 samples of size 5 from N(1, 2). For each of
those compute the sample mean x̄ and standard deviation s using row statistics: Calc-Row Statistics-Mean, then repeat to compute the standard deviation of each row.
(3) Now check the distribution of the sample means: draw a histogram, and compute the mean and standard deviation of the sample means column. Are the results consistent with the CLT?
(4) Compare the distribution of the samples of size 1000 from N(1,2) with the distribution of the sample means.
Graph histograms of both samples (any of the first 5 columns from N(1,2), and the sample means) on one graph
page. Similarities and differences?
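The column-then-row layout of this exercise can be mimicked in a few lines of Python. The sketch below is illustrative only: it assumes N(1,2) means mean 1 and standard deviation 2, uses an arbitrary seed, and compares the standard deviation of the row means with σ/√n = 2/√5, the value suggested by the theory for means of samples of size 5.

```python
import random
from statistics import mean, stdev

rng = random.Random(2024)  # arbitrary seed, only for reproducibility
mu, sigma = 1, 2           # assumed: mean 1, standard deviation 2
n_rows, n_cols = 1000, 5

# (1) five columns C1..C5, each a sample of size 1000 from N(1, 2)
columns = [[rng.gauss(mu, sigma) for _ in range(n_rows)] for _ in range(n_cols)]

# (2) read the data by row: 1000 samples of size 5, with row mean and row sd
rows = list(zip(*columns))
row_means = [mean(row) for row in rows]
row_sds = [stdev(row) for row in rows]

# (3) summary of the sample means against the theoretical values
print("mean of sample means:", mean(row_means), "theory:", mu)
print("sd of sample means:  ", stdev(row_means), "theory:", sigma / n_cols ** 0.5)

# (4) spread of one raw column, for comparison with the sample means
print("sd of column C1:     ", stdev(columns[0]), "theory:", sigma)
```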
6. Distributions of z- and t-statistics. Use the data for normal distribution you already have (first 5
columns).
(1) Compute two common statistics for the 1000 samples of size 5 from N(1,2), the t- and z-statistics:
t = (x̄ − µ) / (s/√n) and z = (x̄ − µ) / (σ/√n),
where µ = 1, σ = 2, and n = 5. Note that you can use the computed sample means and standard deviations
for the t-statistic.
(2) Compare the distributions of the t- and z-statistics. Graph histograms of both on one graph page.
Use the option of fitting a (normal) density to the histogram. Similarities and differences?
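The two statistics can be computed row by row in Python as well. This sketch is self-contained, so it regenerates 1000 samples of size 5 instead of reusing the MINITAB columns; the seed is arbitrary, and µ = 1, σ = 2, n = 5 are taken from the text above.

```python
import math
import random
from statistics import mean, stdev

rng = random.Random(7)  # arbitrary seed, only for reproducibility
mu, sigma, n = 1, 2, 5

# 1000 samples of size 5 from N(1, 2) (mean 1, standard deviation 2)
samples = [[rng.gauss(mu, sigma) for _ in range(n)] for _ in range(1000)]

# t uses the sample standard deviation s, z uses the known sigma
t_stats = [(mean(s) - mu) / (stdev(s) / math.sqrt(n)) for s in samples]
z_stats = [(mean(s) - mu) / (sigma / math.sqrt(n)) for s in samples]

print("sd of z-statistics:", stdev(z_stats))
print("sd of t-statistics:", stdev(t_stats))
print("largest |z|:", max(abs(z) for z in z_stats))
print("largest |t|:", max(abs(t) for t in t_stats))
```

Comparing the spread and the extreme values of the two lists is a numerical counterpart to comparing the two histograms with a fitted normal density.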
HOMEWORK 3. Due class time, Thursday, February 16.
1. What happens to the histogram when we increase the sample size?
Generate histograms of samples of size: 100, 1000, 10,000 from a normal distribution with mean 0 and
standard deviation 1. What happens to the histogram as you increase the sample size? What curve does
the histogram approximate better and better when we increase sample size?
2. Let X have an exponential distribution with mean 2.
(a) Find the probability that X is larger than 3.
(b) Find the 20th percentile of X.
3. Let X have a binomial distribution with 20 trials and probability of success 0.7 on each trial.
(a) Find the probability that X is greater than 5 and less than or equal to 12.
(b) Find the 65th percentile of X.
(c) Find the 50th percentile of X.
4. Undergraduate students only. I want to generate one observation from a χ² distribution with 10
degrees of freedom. Using a random number generator I generated an observation from the uniform
distribution on the interval (0,1). The observation I generated is 0.977. Using the FTSS I computed the
value of the corresponding observation from the χ² distribution with 10 degrees of freedom. What was that value?
5. Graduate Students only. I want to generate one observation from a Pareto distribution with shape
parameter α which has the following pdf:
f(x) = α / x^(α+1)   if x ≥ 1;
f(x) = 0             otherwise.
Using a random number generator I generated an observation from the uniform distribution on the interval (0,1). The observation I generated is 0.5. Using the FTSS I computed the value of the corresponding
observation from the Pareto distribution with α = 2. What was that value?