252grass3-052 10/25/05 Name: Class days and time:

(Open this document in 'Page Layout' view!)
Name:                    Class days and time:
Please include this on what you hand in!

Graded Assignment 3

A. In your outline there are 6 methods to compare means or medians: methods D1, D2, D3, D4, D5a and D5b. Methods D6a and D6b compare proportions, and method D7 compares variances or standard deviations. In the following cases, identify $H_0$ and $H_1$ and identify which method to use.

If the hypotheses involve a mean, state the hypotheses in terms of both $\mu$ and $D = \mu_1 - \mu_2$. If the hypotheses involve a proportion, state them in terms of both $p$ and $\Delta p = p_1 - p_2$. If the hypotheses involve standard deviations or variances, state them in terms of both $\sigma^2$ and $\sigma_1^2 / \sigma_2^2$ or $\sigma_2^2 / \sigma_1^2$.

All the questions involve means, medians, proportions or variances. One of these problems is a chi-squared test. Remember that a yes answer is not acceptable without an explanation.

Note: Look at 252thngs on the syllabus supplement part of the website before you start (and before you take exams). Remember that I use $\mu$, $\sigma$, $\eta$ and $p$ as parameters and $\bar{x}$, $s$, $x_{.50}$ and $\bar{p} = x/n$ as sample statistics.

----------------------------------------------------------------------------------------------------------------------------
Example: This may seem long, but it appears on an old Graded Assignment 3. A group of supervisors is given exams on management skills before and after taking a course in management. Scores are as follows.

Supervisor   Before   After
    1          63       78
    2          93       92
    3          84       91
    4          72       80
    5          65       69
    6          72       85
    7          91       99
    8          84       82
    9          71       81
   10          80       87
   11          68       93

If we assume that the distribution of results is Normal, what method should we use to answer the question “Has the course improved the scores of the managers?”

Solution: You are comparing means before and after the course. You can get away with using means because the parent distributions are Normal.
If $\mu_2$ is the mean of the second sample, you are hoping that $\mu_2 > \mu_1$, which, because it contains no equality, is an alternate hypothesis. So your hypotheses are
$\begin{cases} H_0: \mu_1 \ge \mu_2 \\ H_1: \mu_1 < \mu_2 \end{cases}$ or $\begin{cases} H_0: \mu_1 - \mu_2 \ge 0 \\ H_1: \mu_1 - \mu_2 < 0 \end{cases}$. If $D = \mu_1 - \mu_2$, then $\begin{cases} H_0: D \ge 0 \\ H_1: D < 0 \end{cases}$.
The important thing to notice here is that the data are in before and after pairs, so you use Method D4.
-------------------------------------------------------------------------------------------------------------------------------

1. You have data on income in two villages ($x_1$ in village 1, $x_2$ in village 2). You want to test the hypothesis that village 2 has higher earnings than village 1. You know that income has an extremely skewed distribution, and you have to decide whether to use the mean or the median income.

2. The data in the file CONCRETE 1 on your CD represent the strength (measured by how many thousands of pounds per square inch they can take without buckling) of 40 concrete samples on the second and seventh days after pouring. ($x_1$ is the strength on the second day and $x_2$ is the strength on the seventh day; each line refers to a single sample.) Assume that the underlying distribution is Normal and test the hypothesis that the concrete is stronger on the seventh day.

3. You have interviewed a sample of 80 small businesses in the Northeast and 75 small businesses in the Southeast. Each business has indicated whether it sells in foreign markets. You want to show that businesses in the Northeast are more likely to export. ($x_1$ is the total number of firms that export in the Northeast sample, $x_2$ in the Southeast.)

4. You expand the sample in 3 by adding 60 small businesses in the Midwest ($x_3$ is the number of these that export). You test the hypothesis that the same fraction of businesses export in each region.

6. In order to see which garage to use under contract for automobile repairs, 10 cars are towed first to garage 1 and then to garage 2.
You end up with two data sets: the first data column, $x_1$, is estimates from the first garage and the second data column, $x_2$, is estimates from the second garage. Each of the 10 lines of data refers to one car. You believe that the estimates are approximately Normally distributed. Compare the estimates in garages 1 and 2. Would you change your method if there were 200 cars?

7. You have processing times in seconds, $x_1$, for a sample of 5 computer jobs from the accounting department and, $x_2$, for 6 jobs from the research department. You believe that the underlying distributions are Normal and want to show that research jobs take longer than accounting jobs. Would you change your method if $n_1 = n_2 = 205$?

8. You are having a part produced on two different machines. $x_1$ is 200 randomly selected data points that represent the length of parts from machine 1; $x_2$ is 200 randomly selected data points that represent the length of parts from machine 2. You want to test your suspicion that parts from machine 2 are longer than parts from machine 1. In a problem of this type you would assume that the lengths are Normally distributed.

9. You also suspect that parts from machine 2 are more variable in length than parts from machine 1 (this is the same as saying that machine 2 is less reliable than machine 1). Test this suspicion.

10. A panel is exposed to an ad for Smelly-Welly Dirt Devourer. Before seeing the ad, 5 out of the 40 members had a favorable impression of Smelly-Welly. After seeing the ad, 2 more members of the panel plus the original 5 had a favorable impression. Has the proportion with favorable impressions risen significantly?

B. You have 3 methods that can be used for goodness of fit tests: chi-squared, Kolmogorov-Smirnov and Lilliefors. Which would you use in the following cases?

1. You want to know if the Normal distribution applies to a data set.

a.
The data set consists of 15 numbers. You do not know the population mean and variance and will have to compute the sample mean and variance from the data.

b. The data set consists of 15 numbers. You think that you know the population mean and variance.

c. The data set consists of 5000 numbers and you have observed frequencies for the following intervals: below 1000, 1000-1199.99, 1200-1399.99, 1400-1599.99, …, 2600-2799.99, 2800 and above. You think you know the population mean and variance.

d. The data set consists of 5000 numbers and you have observed frequencies for the following intervals: below 1000, 1000-1199.99, 1200-1399.99, 1400-1599.99, …, 2600-2799.99, 2800 and above. You have computed a sample mean and variance from the data.
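
For the supervisor example in Part A, the paired structure of the data can be made concrete with a short computation. The sketch below is only an illustration and assumes that Method D4 in the outline is the usual paired-difference t statistic, $t = \bar{d} / (s_d / \sqrt{n})$ with $n - 1$ degrees of freedom; the function name `paired_t` is hypothetical, and the differences are taken as $d = x_1 - x_2$ (before minus after) to match $D = \mu_1 - \mu_2$.

```python
import math
import statistics

# Scores copied from the supervisor example in this assignment.
before = [63, 93, 84, 72, 65, 72, 91, 84, 71, 80, 68]
after = [78, 92, 91, 80, 69, 85, 99, 82, 81, 87, 93]


def paired_t(x1, x2):
    """Return (d_bar, s_d, t, df) for paired samples, with d = x1 - x2.

    Hypothetical helper: assumes the standard paired-difference t statistic.
    """
    if len(x1) != len(x2):
        raise ValueError("paired samples must have the same length")
    d = [a - b for a, b in zip(x1, x2)]   # one difference per supervisor
    n = len(d)
    d_bar = statistics.mean(d)            # sample mean of the differences
    s_d = statistics.stdev(d)             # sample standard deviation (n - 1 divisor)
    t = d_bar / (s_d / math.sqrt(n))      # test ratio for H0: D >= 0 vs H1: D < 0
    return d_bar, s_d, t, n - 1


d_bar, s_d, t, df = paired_t(before, after)
print(f"d_bar = {d_bar:.3f}, s_d = {s_d:.3f}, t = {t:.3f}, df = {df}")
```

Because $H_1: D < 0$, this is a lower-tail test: the computed ratio would be compared with the negative critical value of t for the chosen significance level and the reported degrees of freedom. Working with the single column of differences is what distinguishes the paired method from methods that treat the two samples as independent.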
