Stat 350 Lab Session
GSI: Yizao Wang
Section 016: Mon 2:30-4pm, MH 444-D
Section 043: Wed 2:30-4pm, MH 444-B

Outline
• Binomial and normal distribution
• Sampling distribution and CLT (Module 4)
• Confidence intervals (Module 5, Actv.1)
• Permission to post forms
• Today’s Qwizdom questions are anonymous. You don’t have to log in with your UMID.

Binomial Distribution
Example of B(n,p): coin flipping
Flip a coin n times. The probability of getting heads each time is p. The number of heads we get in the n flips is a r.v. distributed as B(n,p).

Conditions to verify for a binomial r.v.
Example: the number of heads from flipping a coin n times
1- n trials, with n fixed in advance. (Flipping n times.)
2- 2 possible outcomes each trial. (Heads or tails.)
3- Independent outcomes between trials. (The result of any flip won’t change the others.)
4- Probability of success is p, fixed for all trials (identical distribution). (Decided by the same coin.)
Another classical example is giving a survey with one ‘yes/no’ question to n randomly selected people.

Normal Approximation of Binomial Distribution
X ~ B(n,p)
• P(X = k) is decided by the parameters… but
• When n is large, it is very difficult to calculate!
• Approximation by a normal distribution:
  Approximately X ~ N( np, sqrt(np(1-p)) )

Normal Distribution
• An exactly normal distribution is very rare in the real world, but it is often a very good approximation, with some nice mathematical properties.
• Written as X ~ N(μ,σ)
• The Z-score (z-statistic) is the standardized X: Z = (X-μ)/σ
• Z ~ N(0,1) (why do we want to standardize X?)
• What do the normal distributions look like? How does the shape relate to the two parameters?

Normal Distribution
• 10-minute in-lab review (8 questions)
  CTools\Lab Info\Lab review: Normal Distribution

Population vs. Sample
                    Population                               Sample
Definition          Collection of items you want to study    Small collection of population items
Size                Too large                                Small
Example             Heights of all UM students               Heights of students in a certain lab
Random or fixed?    Fixed                                    Random (why?)

Parameters vs. Statistics
                        Parameters                    Statistics
Where are they from?    Population                    Sample
Example                 Mean height of UM students    Mean height of students in a certain lab
Known or not?           No                            Calculable from sample
Random or fixed?        Fixed                         Random (why?)

Examples of parameters and corresponding statistics
                      Parameter                Statistic
Mean                  Population mean          Sample mean
Standard deviation    Population s.d.          Sample s.d.
Proportion            Population proportion    Sample proportion
Statistics are random variables. Parameters are constants.

Statistical Inference
• Population parameters are unknown constants.
• Statistics are random variables obtained through sampling.
• Statistical inference: using statistics to estimate parameters.
• Statistics are also called estimators (of parameters). Example: X-bar is the estimator of μ.
• We need to study the distribution of statistics. (Random variables have fixed distributions.)

Sampling Distribution
• The probability distribution of a sample statistic is called its sampling distribution.
The X in the pictures is not a random variable… Consider it as X-bar.

Statistical Inference
What kind of estimators do we prefer?
• Unbiased: the mean of the estimator equals the parameter.
• Small variation: small standard deviation.

Module 4
• Tasks 1-3
• Objectives: study the influence of the sample size and the distribution of the parent population on the sampling distribution.
• Sampling Distribution Applet (CTools/Lab Info)

Summary
• The shape of the sampling distribution depends on the distribution of the original parent population as well as the sample size.
• The sampling distribution is approximately normal when…

4(a) Sampling Distribution of the Sample Mean
If the parent population is a normal distribution with a mean μ and a standard deviation σ, then for any sample size, the sample mean will have a __________ distribution with a mean of _____ and a standard deviation of _____.

4(b) Central Limit Theorem
If the parent population is NOT a normal distribution but has a mean μ and a standard deviation
σ, then for a large sample size, the sample mean will have a __________ distribution with a mean of _____ and a standard deviation of _____.

What is the distinction between 4(a) and 4(b)?
Choose all that apply...
A) Shape of parent population
B) Shape of distribution of sample mean
C) Standard deviation of sample mean
D) Sample size

True or False
• If n is large, the sample data will always have a normal distribution.
Click in your answer.

Confidence Interval
Recall the parameter-statistic comparison…
• We never know the true population parameter value.
• We use a statistic from one sample (with several observations) to estimate it.
• A sample statistic may not be exactly equal to the corresponding parameter value. (why confidence interval?)

Confidence Interval
Example: we are 95% confident that the true parameter value lies inside the confidence interval [a, b].
A confidence interval provides a method of stating:
• What the interval tells: how close the value of a statistic is likely to be to the value of a parameter
• What the confidence tells: the chance of it being that close

Confidence Interval
Basic structure for any confidence interval:
  estimate ± multiplier × standard error
• Estimate: a sample statistic such as p-hat or x-bar.
• Multiplier × standard error: the margin of error.
The bigger the margin of error, the wider the CI (why?)

Confidence Interval
Two interpretations:
1. A 95% Confidence Interval: We are 95% confident that the true parameter value lies inside the confidence interval. The interval provides a range of reasonable values for the population parameter.
2. The 95% Confidence Level: If the procedure were repeated many times (that is, if we repeatedly took a random sample of the same size and computed the 95% confidence interval for each sample), we would expect 95% of the resulting confidence intervals to contain the true population parameter.

Confidence Interval
Principles for using CIs to guide decision making:
• Principle 1: A value not in a CI can be rejected as a possible value of the population parameter.
  A value in a CI is an “acceptable” or “reasonable” possibility for the value of a population parameter.
• Principle 2: When the CIs for parameters of two different populations do not overlap, it is reasonable to conclude that the parameters for the two populations are different.

Confidence Interval
• The probability that the true parameter lies in a particular, already computed, confidence interval is either 0 or 1. The interval is now fixed and the parameter is not random, so the parameter is either in that particular interval or it is not.

Module 5 Activity 1
• Good summary on p26
• Confidence Interval for Mean Applet (CTools/Lab Info)

#4: Interpret the (95%) confidence level in terms of a population mean.
A) We are 95% confident that the population mean will be in the computed confidence interval.
B) The computed confidence interval will contain the population mean 95% of the time.
C) 95% of all confidence intervals created with this method are expected to contain the population mean.

Before we finish today…
Questions or comments?
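
Supplement: simulating the confidence level
The repeated-sampling reading of the 95% confidence level from the Confidence Interval slides can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the population is normal with a known standard deviation, the multiplier is the usual z value 1.96, and the population values (mean height 68, s.d. 3, samples of 25) are made up for the example. The function names mean_ci and coverage are hypothetical helpers, not part of the lab materials.

```python
import random
import statistics


def mean_ci(sample, sigma, z=1.96):
    """Return (low, high) = x-bar ± z * sigma / sqrt(n).

    Assumes the population standard deviation sigma is known.
    """
    n = len(sample)
    xbar = statistics.fmean(sample)   # the estimate
    margin = z * sigma / n ** 0.5     # multiplier × standard error
    return xbar - margin, xbar + margin


def coverage(mu, sigma, n, repetitions, seed=0):
    """Fraction of repeated-sample intervals that contain the true mean mu."""
    rng = random.Random(seed)         # fixed seed so reruns match
    hits = 0
    for _ in range(repetitions):
        # Draw a fresh random sample of size n from the (assumed) normal population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        low, high = mean_ci(sample, sigma)
        if low <= mu <= high:
            hits += 1
    return hits / repetitions


if __name__ == "__main__":
    # Made-up population values, for illustration only.
    print(coverage(mu=68.0, sigma=3.0, n=25, repetitions=10_000))
```

The printed fraction should come out close to 0.95. Each individual interval either contains the population mean or it does not; the 95% describes the procedure over many repeated samples.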