HOMEWORK 1 SOLUTIONS

1.2.1 Exercises

2. A field experiment is conducted to compare the yield of three varieties of corn used for biofuel. Each variety will be planted on 10 randomly selected plots and the yield will be measured at the time of harvest.

(a) Describe the population(s) involved.
The populations involved are the three varieties of corn.

(b) What is the characteristic of interest?
The characteristic of interest is the yield of each corn variety.

(c) Describe the sample(s).
The samples are the 10 plots of each variety of corn.

4. A consumer magazine article titled “How Safe is the Air in Airplanes” reports that the air quality, as quantified by the degree of staleness, was measured on 175 domestic flights.

(a) Identify the population of interest, and the population units.
The population of interest is all of the air on airplanes that fly domestically. The population units are the individual planes.

(b) Identify the sample.
The sample is the air on the 175 domestic flights.

(c) What is the characteristic of interest?
The characteristic of interest is the staleness of the air.

5. In an effort to determine the didactic benefits of computer activities, when used as an integral part of a statistics course for engineers, one section is taught using the traditional method, while another is taught with computer activities. At the end of the semester, each student’s score on the same test is recorded. To eliminate unnecessary variability, both sections were taught by the same professor.

(a) Is there one or two populations involved in the above study?
There are two populations involved in the study.

(b) Describe the population(s) involved.
Both populations consist of students; they are distinguished by the method with which the students are taught.

(c) Is (are) the population(s) involved hypothetical or not?
Both populations are hypothetical.

(d) What is (are) the sample(s) in this study?
The samples are the two sections of students taught with the two methods.

1.4.5 Exercises

2. A type of universal remote for home theater systems is manufactured in three distinct locations. 20% of the remotes are manufactured in location A, 50% in location B, and 30% in location C. The quality control team (QCT) wants to inspect a simple random sample (srs) of 100 remotes to see if a recently reported problem with the menu feature has been corrected. The QCT requests that each location send to the QC Inspection Facility a srs of remotes from their recent production as follows: 20 from location A, 50 from B, and 30 from C.

(a) Does the sampling scheme described above produce a simple random sample of size 100 from the recent production of remotes?
No, this does not produce a simple random sample of size 100.

(b) Justify your answer in part (a). If you answer no, then what kind of sampling is it?
This sampling scheme produces a stratified sample. In a simple random sample of size 100, every set of 100 remotes is equally likely to be chosen, whereas this scheme can only produce samples containing exactly 20, 50, and 30 remotes from locations A, B, and C.

3. A civil engineering student, working on his thesis, plans a survey to determine the proportion of all current drivers that regularly wear a seat belt. He decides to interview his classmates in the three classes in which he is currently enrolled.

(a) What is the population of interest?
The population is all current drivers.

(b) Do the student’s classmates constitute a simple random sample from the population of interest?
No.

(c) What name have we given to the sample that the student collected?
A convenience sample.

(d) Do you think that this sample proportion is likely to overestimate, or underestimate the true proportion of all drivers who regularly wear seatbelts?
The sample proportion is likely to overestimate the true proportion, since young drivers are more likely to wear seatbelts than drivers who learned to drive when seatbelts weren’t required or available.

4. A service agency wishes to assess clients’ views of quality of service over the past year.
Computer records identify 1000 clients over the past 12 months, and a decision is made to select 100 clients to survey.

(a) Describe a procedure for selecting a simple random sample of 100 clients from last year’s population of 1000 clients.
Assign the numbers 1 to 1000 to the clients. Use a software package to randomly select 100 of these numbers without replacement, and survey the corresponding clients.

(b) The population of 1000 clients consists of 800 Caucasian-Americans, 150 African-Americans, and 50 Hispanic-Americans. Describe an alternative procedure for selecting a representative random sample of size 100 from the population of 1000 clients.
For a stratified sample, randomly select 80 Caucasian-Americans, randomly select 15 African-Americans, and randomly select 5 Hispanic-Americans.

7. An automotive assembly line is manned by two shifts a day. The first shift accounts for two thirds of the overall production. The task of quality control engineers is to monitor the number of non-conformances per car. Each day a simple random sample of six cars from the first shift, and a simple random sample of three cars from the second shift is taken, and the number of non-conformances per car is recorded. Does the sampling scheme described above produce a simple random sample of size nine from the day’s production? Justify your answer.
No, this sampling scheme produces a stratified sample. A simple random sample of nine cars could contain any split between the two shifts, whereas this scheme always takes exactly six cars from the first shift and three from the second.

1.5.1 Exercises

2. In a population of 500 tin plates, the number of plates with 0, 1, and 2 scratches is N0 = 190, N1 = 160, and N2 = 150.

(a) Identify the variable of interest and the statistical population.
The variable of interest is the number of scratches. The statistical population is the set of scratch counts for the 500 plates (190 zeros, 160 ones, and 150 twos).

(b) Is the variable of interest quantitative or qualitative?
Quantitative.

(c) Is the variable of interest univariate or multivariate?
Univariate.

3.
At the final assembly point of BMW cars in Graz, Austria, a quality control inspector, who visits for the day, records the total number of non-conformances in the engine and transmission that arrive from Germany and France, respectively.

(a) Is the variable of interest univariate, bivariate, or multivariate?
Univariate.

(b) Is the variable of interest quantitative or qualitative?
Quantitative.

(c) Describe the statistical population.
The statistical population is the collection of 0’s, 1’s, 2’s, etc. giving the total number of non-conformances in each car.

(d) Suppose the number of non-conformances in the engine and transmission are recorded separately for each car. Is the new variable univariate, bivariate, or multivariate?
Bivariate.

5. A car manufacturing company, which makes three different types of cars, wants information about customer satisfaction for cars sold during the previous year. Each customer is asked for the type of car he/she bought last year, and to rate his/her satisfaction on a scale from 1-6.

(a) Identify the variable recorded and the statistical population.
The variables recorded are type of car and satisfaction. The statistical population is the data collected from the customers who purchased cars last year.

(b) Is the variable of interest univariate?
No, two values are recorded for each customer, making the variable bivariate.

(c) Is the variable of interest quantitative or qualitative?
Qualitative.

1.6.4 Exercises

1. A polling organization samples 1000 adults nationwide and finds that the average duration of daily exercise is 37 minutes with a standard deviation of 18 minutes.

(a) The correct notation for the number 37 is: (i) x̄ (ii) µ.
x̄.

(b) The correct notation for the number 18 is: (i) S (ii) σ.
S.

(c) Of the 1000 adults in the sample 72% favor tougher penalties for persons convicted of drunk driving. The correct notation for the number 0.72 is: (i) p̂ (ii) p.
p̂.

2.
In the year 2000 census, the United States Census Bureau found that the average number of children of all married couples is 2.3 with a standard deviation of 1.6.

(a) The correct notation for the number 2.3 is: (i) x̄ (ii) µ.
µ.

(b) The correct notation for the number 1.6 is: (i) S (ii) σ.
σ.

(c) According to the same census 17% of all adults choose not to marry. The correct notation for the number 0.17 is: (i) p̂ = 0.17 (ii) p = 0.17.
p = 0.17.

3. (a) Use the information given in Example 1.6.3 to calculate the population variance and standard deviation.

\[ \sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (v_i - \mu)^2 = \frac{1}{10000}\left[300(1 - 3.47)^2 + 700(2 - 3.47)^2 + 4000(3 - 3.47)^2 + 4000(4 - 3.47)^2 + 1000(5 - 3.47)^2\right] = 0.7691 \]

\[ \sigma = \sqrt{\sigma^2} = 0.877 \]

(b) Use the data given in Example 1.6.5 to calculate the sample variance and standard deviation.

\[ S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{10 - 1}\left[(2 - 3.7)^2 + 3(3 - 3.7)^2 + 4(4 - 3.7)^2 + 2(5 - 3.7)^2\right] = 0.9 \]

\[ S = \sqrt{S^2} = 0.949 \]

6. The possible samples of size two, taken with replacement from the population {0,1}, are {0,0}, {0,1}, {1,0}, {1,1}.

(a) Calculate the sample variance for each of the four possible samples listed above.
The sample variances are \(S^2 = 0, 1/2, 1/2, 0\), respectively.

(b) Compare the average of the four sample variances with the population variance.
They are both equal to 1/4.

8. Consider a statistical population consisting of the N values \(v_1, \ldots, v_N\), and let \(\mu_V\), \(\sigma_V^2\), \(\sigma_V\) denote the population mean value, variance, and standard deviation.

(a) Suppose that the \(v_i\) values are coded \(w_1, \ldots, w_N\), where \(w_i = c_1 + v_i\), where \(c_1\) is a known constant.
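Show that the mean value, variance, and standard deviation of the statistical population \(w_1, \ldots, w_N\) are \(\mu_W = c_1 + \mu_V\), \(\sigma_W^2 = \sigma_V^2\), \(\sigma_W = \sigma_V\).

\[ \mu_W = \frac{1}{N}\sum_{i=1}^{N} w_i = \frac{1}{N}\sum_{i=1}^{N} (c_1 + v_i) = c_1 + \frac{1}{N}\sum_{i=1}^{N} v_i = c_1 + \mu_V \]

\[ \sigma_W^2 = \frac{1}{N}\sum_{i=1}^{N} (w_i - \mu_W)^2 = \frac{1}{N}\sum_{i=1}^{N} \left((c_1 + v_i) - (c_1 + \mu_V)\right)^2 = \frac{1}{N}\sum_{i=1}^{N} (v_i - \mu_V)^2 = \sigma_V^2 \]

\[ \sigma_W = \sqrt{\sigma_W^2} = \sqrt{\sigma_V^2} = \sigma_V \]

(b) Suppose that the \(v_i\) values are coded \(w_1, \ldots, w_N\), where \(w_i = c_2 v_i\), where \(c_2\) is a known constant. Show that the mean value, variance, and standard deviation of the statistical population \(w_1, \ldots, w_N\) are \(\mu_W = c_2 \mu_V\), \(\sigma_W^2 = c_2^2 \sigma_V^2\), \(\sigma_W = |c_2| \sigma_V\).

\[ \mu_W = \frac{1}{N}\sum_{i=1}^{N} w_i = \frac{1}{N}\sum_{i=1}^{N} (c_2 v_i) = c_2 \frac{1}{N}\sum_{i=1}^{N} v_i = c_2 \mu_V \]

\[ \sigma_W^2 = \frac{1}{N}\sum_{i=1}^{N} (w_i - \mu_W)^2 = \frac{1}{N}\sum_{i=1}^{N} \left((c_2 v_i) - (c_2 \mu_V)\right)^2 = c_2^2 \left(\frac{1}{N}\sum_{i=1}^{N} (v_i - \mu_V)^2\right) = c_2^2 \sigma_V^2 \]

\[ \sigma_W = \sqrt{\sigma_W^2} = \sqrt{c_2^2 \sigma_V^2} = |c_2| \sigma_V \]

(c) Suppose that the \(v_i\) values are coded \(w_1, \ldots, w_N\), where \(w_i = c_1 + c_2 v_i\), where \(c_1, c_2\) are known constants. Show that the mean value, variance, and standard deviation of the statistical population \(w_1, \ldots, w_N\) are \(\mu_W = c_1 + c_2 \mu_V\), \(\sigma_W^2 = c_2^2 \sigma_V^2\), \(\sigma_W = |c_2| \sigma_V\).

\[ \mu_W = \frac{1}{N}\sum_{i=1}^{N} w_i = \frac{1}{N}\sum_{i=1}^{N} (c_1 + c_2 v_i) = c_1 + c_2 \frac{1}{N}\sum_{i=1}^{N} v_i = c_1 + c_2 \mu_V \]

\[ \sigma_W^2 = \frac{1}{N}\sum_{i=1}^{N} (w_i - \mu_W)^2 = \frac{1}{N}\sum_{i=1}^{N} \left((c_1 + c_2 v_i) - (c_1 + c_2 \mu_V)\right)^2 = c_2^2 \left(\frac{1}{N}\sum_{i=1}^{N} (v_i - \mu_V)^2\right) = c_2^2 \sigma_V^2 \]

\[ \sigma_W = \sqrt{\sigma_W^2} = \sqrt{c_2^2 \sigma_V^2} = |c_2| \sigma_V \]

12. The following data show the starting salaries, in $1000 per year, for a sample of 15 engineers.

(a) Assuming that the 15 senior engineers represent a simple random sample from the population of senior engineers, estimate the population mean and variance.

\[ \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1}{15}\sum_{i=1}^{15} x_i = 192.8 \]

\[ S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{15 - 1}\sum_{i=1}^{15} (x_i - 192.8)^2 = 312.3143 \]

(b) Give the sample mean and variance for the data on second-year salaries for the same group of engineers

i. if each engineer gets a $5000 raise

Here \(x_i\) denotes the second-year salaries, each 5 (in $1000) above the corresponding first-year salary:

\[ \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1}{15}\sum_{i=1}^{15} x_i = 197.8 \]

\[ S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{15 - 1}\sum_{i=1}^{15} (x_i - 197.8)^2 = 312.3143 \]

ii.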
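if each engineer gets a 5% raise

Here \(x_i\) denotes the second-year salaries, each 1.05 times the corresponding first-year salary:

\[ \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1}{15}\sum_{i=1}^{15} x_i = 202.44 \]

\[ S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{15 - 1}\sum_{i=1}^{15} (x_i - 202.44)^2 = 344.3265 \]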
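
The two raises in Exercise 12(b) are instances of the coding \(w_i = c_1 + c_2 v_i\) from Exercise 8(c), so the second-year summaries can also be obtained from the first-year mean and variance alone. The following is a minimal Python sketch of that shortcut, not part of the original solution. The helper name `recode_summary` and the `toy` data are hypothetical, and the toy values are made up for the check because the salary table is not reproduced here. It relies on the same shift and scale rules holding for the sample variance (divisor n − 1), which is what `statistics.variance` computes.

```python
import math
import statistics


def recode_summary(mean, variance, c1=0.0, c2=1.0):
    """Mean and variance of w = c1 + c2 * x, given the mean and variance of x."""
    return c1 + c2 * mean, c2 ** 2 * variance


# Summary statistics from part (a), in units of $1000 per year.
first_year_mean, first_year_var = 192.8, 312.3143

# i. A $5000 raise is a shift by c1 = 5 (in $1000); the variance is unchanged.
shift_mean, shift_var = recode_summary(first_year_mean, first_year_var, c1=5.0)

# ii. A 5% raise is a rescaling by c2 = 1.05; the variance scales by 1.05**2.
scale_mean, scale_var = recode_summary(first_year_mean, first_year_var, c2=1.05)

print(round(shift_mean, 2), round(shift_var, 4))
print(round(scale_mean, 2), round(scale_var, 4))

# Sanity check on made-up data (NOT the salaries from the exercise):
# recomputing from the recoded values agrees with the shortcut.
toy = [150.0, 172.5, 181.0, 199.0, 210.5]
raised = [1.05 * x for x in toy]
expected_mean, expected_var = recode_summary(
    statistics.mean(toy), statistics.variance(toy), c2=1.05
)
assert math.isclose(statistics.mean(raised), expected_mean)
assert math.isclose(statistics.variance(raised), expected_var)
```

Up to rounding, the printed values should agree with the means and variances reported for the two raises above.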