Uploaded by amenderng

Sample final 2

SINGAPORE MANAGEMENT UNIVERSITY
Sample Final Examination 2
COR-STAT1202 Introductory Statistics
Instructions to Candidates
(i) This test paper contains ten (10) true-or-false questions, ten (10) multiple-choice questions, ten (10) fill-in-the-blanks questions and three (3) problem-solving questions. It comprises twelve (12) printed pages.
(ii) Candidates must answer all the 33 questions.
(iii) Each true-or-false question carries 1 mark. Each multiple-choice question
carries 3 marks. Each fill-in-the-blanks question carries 3 marks. Each
problem-solving question carries 10 marks. The total mark for this paper
is 100.
(iv) The formula sheets are provided.
(v) The normal statistical table and t-distribution table are provided.
(vi) All answers must be written in the answer books.
(vii) You are given 120 minutes to write.
Section A: True or False (Write ‘T’ on the answer books if the statement is true
and ‘F’ if the statement is false.)
[1.] (1 mark) If the regression line is estimated to be Ŷi = 0.76 − 0.43Xi and the coefficient
of determination r² = 0.49, the coefficient of correlation between X and Y is 0.7.
[2.] (1 mark) When sampling without replacement from a finite population, the use of the
finite population correction factor will reduce the standard error of the sample mean.
[3.] (1 mark) A binomial distribution with parameters n = 800 and p = 0.1 is highly
right-skewed.
[4.] (1 mark) The larger the p-value, the weaker the evidence to reject the null hypothesis.
[5.] (1 mark) The test statistic measures how close the computed sample statistic has come
to the hypothesized population parameter.
[6.] (1 mark) When constructing a confidence interval for population proportion, if we
decrease the confidence level, the estimated standard error of sample proportion would
increase.
[7.] (1 mark) A Type I error is committed when we do not reject the null hypothesis that
is false.
[8.] (1 mark) In regression analysis, the Total Sum of Squares (SST) does not depend on
the values of independent variable Xi .
[9.] (1 mark) If we are performing a two-tailed test of whether p = 0.5, the probability of
detecting a shift of the proportion to 0.6 will be less than the probability of detecting a
shift of the proportion to 0.7.
[10.] (1 mark) The process of using sample statistics to draw conclusions about true
population parameters is called descriptive statistics.
Section B: Multiple Choices (Write your choice on the answer books.)
[11.] (3 marks) Which one of the following statements is false?
(A) Sample mean X̄ is an unbiased estimator of the population mean µ regardless of the
population size and sample size.
(B) The sampling distribution of X̄ is approximately normal provided the sample size is
sufficiently large according to Central Limit Theorem.
(C) The coefficient of correlation must be between -1 and 1, inclusive.
(D) As the number of degrees of freedom decreases, the t-distribution approaches the
standard normal distribution.
[12.] (3 marks) An auto analyst is conducting a satisfaction survey, sampling a list of
1,000 new car buyers. The list includes 250 Nissan buyers, 250 BMW buyers, 250 Honda
buyers, and 250 Toyota buyers. The analyst selects a sample of 80 car buyers, by randomly
sampling 20 buyers of each brand. Is this an example of a simple random sample?
(A) Yes, because each buyer in the sample was randomly sampled.
(B) Yes, because car buyers of every brand were equally represented in the sample.
(C) No, because every possible 80-buyer sample did not have an equal chance of being
chosen.
(D) No, because the population consisted of purchasers of four different brands of car.
[13.] (3 marks) Other things being equal, which of the following actions will reduce the
power of a hypothesis test?
(I) Increasing sample size
(II) Increasing significance level
(III) Decreasing sample size
(IV) Decreasing significance level
(A) I and II
(B) I and IV
(C) II and III
(D) III and IV
[14.] (3 marks) A national consumer magazine reported the following correlation analysis:
The correlation between car weight and car reliability is -0.34. The correlation between
car weight and annual maintenance cost is 0.21. Which of the following statements are
true?
(I) Heavier cars tend to be less reliable.
(II) Heavier cars tend to cost more to maintain.
(III) Car weight is related more strongly to reliability than to maintenance cost.
(A) I only
(B) II only
(C) I and II
(D) I, II and III
[15.] (3 marks) If we randomly choose a number from 2 to 8, inclusive, the probability
distribution of the 7 possible outcomes is given as:
Outcome        2     3     4     5     6     7     8
Probability   0.1   0.2   0.2   0.1   0.1   0.2   0.1
Events G1 , G2 , G3 , and G4 are defined as follows.
G1 : {The number is odd}
G2 : {The number is less than 5 }
G3 : {The number is less than 4 or it is even}
G4 : {The number is more than 3 and it is odd}
Which one of the following statements is false?
(A) Events G2 and G4 are mutually exclusive.
(B) Events G1 and G4 are statistically independent.
(C) Events G1 and G3 are collectively exhaustive.
(D) P (G2 |G3 ) ≠ P (G3 |G2 )
[16.] (3 marks) A major metropolitan newspaper selected a simple random sample of 1,000
readers from their list of subscribers. They asked whether the paper should increase its
coverage of local news. The sample survey finds that 420 readers wanted more local news.
What is the 99% confidence interval for the proportion of readers who would like more
coverage of local news?
(A) 0.3798 to 0.4602
(B) 0.3850 to 0.4550
(C) 0.3894 to 0.4506
(D) 0.3943 to 0.4457
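As a rough check on the arithmetic in question 16, the sketch below computes a large-sample confidence interval for a proportion using the normal approximation. The function name `proportion_ci` is a hypothetical helper introduced only for illustration; the critical value comes from Python's standard library rather than from the printed normal table, so the last decimal place may differ slightly from a hand calculation.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 2.576 for 99% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Figures from question 16: 420 of 1,000 readers, 99% confidence.
lower, upper = proportion_ci(420, 1000, 0.99)
print(f"{lower:.4f} to {upper:.4f}")
```

The printed interval can be compared with the four options.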
[17.] (3 marks) It has been claimed that 65% of homeowners would prefer to heat with
electricity instead of gas. A study finds that 71% of 200 homeowners prefer electric heating
to gas. In a two-tail test, can we conclude that the percentage who prefer electric heating
may differ from 65%? Determine the p-value for the test.
(A) p-value = 0.075; At the 0.05 level of significance, we have sufficient evidence to
conclude that the percentage who prefer electric heating may differ from 65%.
(B) p-value = 0.075; At the 0.05 level of significance, we do not have sufficient evidence
to conclude that the percentage who prefer electric heating may differ from 65%.
(C) p-value = 0.0375; At the 0.01 level of significance, we have sufficient evidence to
conclude that the percentage who prefer electric heating may differ from 65%.
(D) p-value = 0.0375; At the 0.01 level of significance, we do not have sufficient evidence
to conclude that the percentage who prefer electric heating may differ from 65%.
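The two-tailed p-value asked for in question 17 can be sketched as follows, assuming the usual one-sample z test for a proportion with the standard error evaluated at the hypothesized value. The helper name `one_proportion_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(p_hat, p0, n):
    """Two-tailed z test of H0: p = p0 using the normal approximation."""
    se = sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Figures from question 17: claimed 65%, observed 71% of 200 homeowners.
z_stat, p_value = one_proportion_z_test(0.71, 0.65, 200)
print(f"z = {z_stat:.3f}, p-value = {p_value:.4f}")
```

The decision then follows from comparing the p-value with the stated significance level.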
[18.] (3 marks) A developmental psychologist believes that the age at which a normal
child begins to speak words clearly is closely related to the age at which the child first begins to
use complete sentences. A random sample of 20 normal children was taken, and careful
records were kept for each. Let X (in months) be the age at which words are first clearly
used, and let Y (in months) be the age at which complete sentences are used. The sample
provides the following statistics:
Σxi = 260.9, Σxi² = 3423.09, Σyi = 495, Σyi² = 12280.08, Σxi yi = 6473.26
Based on the simple linear regression analysis, predict the age at which complete sentences
are used if a child first clearly uses words at the age of 14.5 months.
(A) 24.74 months
(B) 25.93 months
(C) 26.48 months
(D) 27.26 months
[19.] (3 marks) Referring to the information given in question 18, determine what percentage
of the variation in Y can be explained by the independent variable X in this model.
(A) 45.11%
(B) 43.25%
(C) 44.78%
(D) 46.24%
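Questions 18 and 19 both work from summary sums rather than raw data. A minimal sketch of the least-squares computation from those sums is given below; the function name `regression_from_sums` and its argument names are hypothetical and chosen only for illustration.

```python
def regression_from_sums(n, sum_x, sum_x2, sum_y, sum_y2, sum_xy):
    """Least-squares intercept, slope and r-squared from summary sums."""
    ss_xx = sum_x2 - sum_x ** 2 / n       # corrected sum of squares of x
    ss_yy = sum_y2 - sum_y ** 2 / n       # total sum of squares (SST)
    ss_xy = sum_xy - sum_x * sum_y / n    # corrected sum of cross products
    b1 = ss_xy / ss_xx
    b0 = sum_y / n - b1 * sum_x / n
    r2 = ss_xy ** 2 / (ss_xx * ss_yy)
    return b0, b1, r2


# Summary statistics from question 18 (n = 20 children).
b0, b1, r2 = regression_from_sums(20, 260.9, 3423.09, 495, 12280.08, 6473.26)
predicted = b0 + b1 * 14.5
print(f"prediction at x = 14.5: {predicted:.2f} months")
print(f"r-squared: {100 * r2:.2f}%")
```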
[20.] (3 marks) A group of high school students registered for a special SAT mathematics
preparatory course offered in their school. They took a sample SAT the first day and then
took another the last day. The scores and their differences (After Scores minus Before
Scores) were as follows:
Student        1    2    3    4    5    6    7    8    9   10   Mean   Standard deviation
Before       540  460  520  580  670  590  640  490  530  540    556   64.84169
After        570  510  530  570  680  610  660  520  540  580    577   57.74465
Difference    30   50   10  -10   10   20   20   30   10   40     21   17.2884
We would like to test whether the course has any impact on the SAT scores at the 5%
significance level. Determine the test statistic for this test, its degrees of freedom, and the
decision of the test.
(A) |t| = 3.8412 with 9 degrees of freedom; reject H0
(B) |t| = 3.8412 with 9 degrees of freedom; do not reject H0
(C) |t| = 0.7648 with 18 degrees of freedom; reject H0
(D) |t| = 0.7648 with 18 degrees of freedom; do not reject H0
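Because each student is measured twice, one natural reading of question 20 is a paired t test on the differences. The sketch below follows that reading; the critical value 2.2622 is taken from a standard t table (9 degrees of freedom, 0.025 in each tail) and is hard-coded as an assumption rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

before = [540, 460, 520, 580, 670, 590, 640, 490, 530, 540]
after = [570, 510, 530, 570, 680, 610, 660, 520, 540, 580]

# Paired design: analyse the per-student differences (After minus Before).
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)
t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
df = n - 1

T_CRIT = 2.2622  # assumed table value: t(0.025, df = 9)
reject_h0 = abs(t_stat) > T_CRIT
print(f"|t| = {abs(t_stat):.4f}, df = {df}, reject H0: {reject_h0}")
```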
Section C: Fill in the blanks (Write your final answer to four decimal places
on the answer book).
[21.] (3 marks) Seven of the 15 campus police officers available for assignment to the
auditorium in which a local politician is to speak have received advanced training in
crowd control. If 5 officers are randomly selected for service during the speech, what is
the probability that at least 3 of them will have had advanced training in crowd control?
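Question 21 describes sampling without replacement from a small group, so a hypergeometric model is a reasonable assumption. The sketch below sums the relevant hypergeometric probabilities; `hypergeom_pmf` is a hypothetical helper, not a library function.

```python
from math import comb


def hypergeom_pmf(k, population, successes, draws):
    """P(exactly k successes) when drawing without replacement."""
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))


# 15 officers, 7 with advanced training, 5 selected; want P(X >= 3).
p_at_least_3 = sum(hypergeom_pmf(k, 15, 7, 5) for k in range(3, 6))
print(f"{p_at_least_3:.4f}")
```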
[22.] (3 marks) During a study of auto accidents, the Highway Safety Council found that
60 percent of all accidents occur at night, 52 percent are alcohol-related, and 36 percent
occur at night and are alcohol-related. What is the probability that an accident was not
alcohol-related, given that it occurred at night?
[23.] (3 marks) Martin Coleman, credit manager for Beck’s, knows that the company
uses three methods to encourage collection of delinquent accounts. From past collection
records, he learns that 70 percent of the accounts are called on personally, 20 percent are
phoned, and 10 percent are sent a letter. The probabilities of collecting an overdue amount
from an account with three methods are 0.75, 0.60, and 0.40 respectively. Mr Coleman has
just received payment from a past-due account. What is the probability that this account
was called on personally?
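Question 23 is a Bayes' theorem setup: prior probabilities for each collection method and a conditional probability of collecting under each. A small sketch of the posterior calculation, with dictionary keys chosen only for illustration:

```python
# Prior share of accounts handled by each method (from the question).
prior = {"personal call": 0.70, "phone": 0.20, "letter": 0.10}
# Probability of collecting the overdue amount given each method.
collect_given = {"personal call": 0.75, "phone": 0.60, "letter": 0.40}

# Total probability of collection, then Bayes' theorem for each method.
p_collect = sum(prior[m] * collect_given[m] for m in prior)
posterior = {m: prior[m] * collect_given[m] / p_collect for m in prior}

print(f"P(personal call | collected) = {posterior['personal call']:.4f}")
```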
[24.] (3 marks) Robertson Employment Service customarily gives standard intelligence
and aptitude tests to all people who seek employment through the firm. The firm has
collected data for several years and has found that the distribution of scores is not normal,
but is skewed to the left with a mean of 86 and a standard deviation of 16. What is the
probability that in a sample of 100 applicants who take the test, the mean score will be
less than 84 or greater than 90?
[25.] (3 marks) A psychologist wrote a computer program to simulate the way a person
responds to a standard IQ test. To test the program, he gave the computer 15 different
forms of a popular IQ test and computed its IQ score (X) from each form. The results
are as follows: Σxi = 2,145 and Σxi² = 307,125, with both sums taken over i = 1, ..., 15.
Based on these sample results, what is the coefficient of variation?
[26.] (3 marks) Arrivals at a walk-in optometry department in a shopping mall have been
found to be Poisson distributed with a mean of 2.5 potential customers arriving per hour.
What is the probability that the time interval between two potential customers is more
than 20 minutes but less than 30 minutes?
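If arrivals follow a Poisson process, the time between consecutive arrivals is exponentially distributed with the same rate. Under that assumption, question 26 reduces to a difference of two exponential tail probabilities, sketched below with a hypothetical helper `prob_gap_between`.

```python
from math import exp


def prob_gap_between(rate_per_hour, low_minutes, high_minutes):
    """P(low < T < high) for an exponential inter-arrival time T."""
    low, high = low_minutes / 60, high_minutes / 60  # convert to hours
    # P(T > t) = exp(-rate * t), so the interval probability is a difference.
    return exp(-rate_per_hour * low) - exp(-rate_per_hour * high)


p_gap = prob_gap_between(2.5, 20, 30)
print(f"{p_gap:.4f}")
```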
[27.] (3 marks) A criminologist has developed a questionnaire for predicting whether a
teenager will become a delinquent. Scores on the questionnaire can range from 0 to 100,
with higher values reflecting a presumably greater criminal tendency. As a rule of thumb,
the criminologist decides to classify a teenager as a potential delinquent if his or her score
exceeds 75. The questionnaire has already been tested on a large sample of teenagers,
both delinquent and nondelinquent. Among those considered nondelinquent, scores were
normally distributed with a mean of 60 and a standard deviation of 10. Among those
considered delinquent, scores were normally distributed with a mean of 80 and a standard
deviation of 5. In a randomly selected group of four teenagers considered delinquent, what is the
probability that the criminologist will classify all of them as delinquents?
[28.] (3 marks) In a study of the effects of a medication on the body temperature of
normal adults, a scientist wishes to be 95% sure that the estimates made from a sample
are within 0.01 °C of the population mean. The population under study is believed to
have a standard deviation in body temperature of 0.07 °C. At least how many subjects
should be used in the sample if these conditions are to be met?
[29.] (3 marks) A dental experiment involves coating patients’ teeth with a special compound
which is intended to reduce formation of plaque and so reduce the number of cavities.
The compound is applied to the teeth of a sample group of 65 volunteers. After 3 years,
these patients developed a mean of 3.2 cavities with a standard deviation of 1.4. A 99%
confidence interval for the mean number of cavities developed by all similar patients using
this compound for 3 years is then calculated. Find the margin of error.
[30.] (3 marks) The Chevrolet dealers of a large city are conducting a study to determine
the proportion of car owners in the city who are considering the purchase of a new car
within the next year. If the population proportion is believed to be 0.15, how many owners
must be included in a simple random sample if the dealers want to be 90% confident that
the margin of error will be no more than 0.02?
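Questions 28 and 30 both ask for a minimum sample size given a target margin of error. A sketch of the two standard formulas follows, assuming a known population standard deviation for the mean and a planning value for the proportion. The function names are hypothetical, and the critical values come from the standard library rather than the printed table.

```python
from math import ceil
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided standard normal critical value."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size_for_mean(sigma, margin, confidence):
    """Smallest n with z * sigma / sqrt(n) <= margin."""
    return ceil((z_critical(confidence) * sigma / margin) ** 2)


def sample_size_for_proportion(p, margin, confidence):
    """Smallest n with z * sqrt(p(1-p)/n) <= margin."""
    return ceil(z_critical(confidence) ** 2 * p * (1 - p) / margin ** 2)


n_mean = sample_size_for_mean(0.07, 0.01, 0.95)        # question 28
n_prop = sample_size_for_proportion(0.15, 0.02, 0.90)  # question 30
print(n_mean, n_prop)
```

Rounding up with `ceil` reflects that the sample size must be a whole number that still meets the margin requirement.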
Section D: Problem Solving (Answer all problems in the answer books)
[31.] (10 marks) The following table contains the probability distribution for the number
of traffic accidents (X) daily in a small city:
x            0      1      2      3      4
P (X = x)   0.20   0.25   0.30   0.15   0.10
(a) Suppose you randomly select 100 days and observe the number of traffic accidents in
each day. What is the probability that the sample mean of traffic accidents per day
is more than 1.55? State clearly whether any assumption is needed and explain why.
(b) If you only randomly select 5 days, instead of 100 days in the sample, what is the
probability that the sample has exactly two days without any traffic accident?
(c) In a sample of 5 days, the daily observations are 0, 4, 2, 3 and 2 traffic accidents.
Find the sample mean and estimated standard error of sample mean.
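The three parts of question 31 can be sketched numerically as below. Part (a) relies on the Central Limit Theorem to treat the mean of 100 independent days as approximately normal, and part (b) treats the number of accident-free days as binomial; both are modelling assumptions that the question asks candidates to state, not something the code verifies.

```python
from math import comb, sqrt
from statistics import NormalDist, mean, stdev

# Probability distribution of daily accidents from the table.
dist = {0: 0.20, 1: 0.25, 2: 0.30, 3: 0.15, 4: 0.10}
mu = sum(x * p for x, p in dist.items())
sigma = sqrt(sum(x ** 2 * p for x, p in dist.items()) - mu ** 2)

# (a) CLT approximation: sample mean of 100 days is roughly normal.
p_mean_above = 1 - NormalDist(mu, sigma / sqrt(100)).cdf(1.55)

# (b) Exactly 2 accident-free days out of 5, binomial with p = P(X = 0).
p_two_zero_days = comb(5, 2) * dist[0] ** 2 * (1 - dist[0]) ** 3

# (c) Sample mean and estimated standard error from the 5 observations.
sample = [0, 4, 2, 3, 2]
x_bar = mean(sample)
std_error = stdev(sample) / sqrt(len(sample))

print(f"(a) {p_mean_above:.4f}  (b) {p_two_zero_days:.4f}  "
      f"(c) mean = {x_bar}, SE = {std_error:.4f}")
```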
[32.] (10 marks) Two research laboratories have independently produced drugs that provide
pain relief to arthritis sufferers. In laboratory 1, Drug A was tested on a group of 60
arthritis sufferers and produced a mean of 8.5 hours of relief, and a sample standard
deviation of 1.8 hours. In laboratory 2, Drug B was tested on 40 arthritis sufferers,
producing a mean of 7.4 hours of relief, and a sample standard deviation of 2.1 hours.
Assume that the variances of the two populations are equal even though they are unknown.
(a) At the 0.05 level of significance, can we conclude that the average lengths of relief
period to arthritis sufferers provided by the two laboratories are different? If we can,
which laboratory provides a longer period of relief to arthritis sufferers in general?
State clearly whether any additional assumption is needed and explain why.
(b) Based on the decision made in part (a), what type of error of the test is possibly
committed? Explain why.
(c) You are also given the following information about the characteristics of arthritis
sufferers tested in both laboratories:
Laboratory   Age Range   Ratio of Male to Female
1            40-45       1:2
2            60-65       4:1
Would the above additional information affect the conclusion made in part (a)? Explain briefly.
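For part (a) of question 32, the equal-variance assumption stated in the question points to a pooled-variance two-sample t statistic. A minimal sketch of that statistic is given below; the helper name `pooled_t` is hypothetical, and the result would still need to be compared with the critical value from the t table.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic assuming equal population variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Drug A (laboratory 1) versus Drug B (laboratory 2).
t_pooled, df_pooled = pooled_t(8.5, 1.8, 60, 7.4, 2.1, 40)
print(f"t = {t_pooled:.4f} with {df_pooled} degrees of freedom")
```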
[33.] (10 marks) A firm administers a test to sales trainees before they go into the field.
The management of the firm is interested in determining the relationship between the test
scores (X) and the sales (Y in units sold) made by the trainees at the end of one year in
the field. Data are collected for 10 sales personnel who have been in the field one year.
The simple linear regression model Yi = β0 + β1 Xi + εi is used to model the situation. The
data provide the following statistics:
Σxi = 784, Σxi² = 62348, Σyi = 2998, Σyi² = 913394, Σxi yi = 238393
(a) Write down the regression equation used to predict the sales as a function of the test
scores.
(b) At the 0.05 level of significance, do the data present sufficient evidence to conclude
that the test score contributes useful information for the prediction of the sales?
State clearly all the assumptions needed for your hypothesis testing.
(c) Construct a 95% confidence interval estimate of the increase in sales for every 1 point
increase in test scores.
(d) Construct a 95% prediction interval of the sales of a particular salesperson with
a test score of 78.
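The calculations behind question 33 can be sketched from the summary sums as follows. This assumes the usual simple linear regression inference (independent, normally distributed errors with constant variance), and the critical value 2.3060 is hard-coded from a standard t table for 8 degrees of freedom rather than computed. Variable names are illustrative only.

```python
from math import sqrt

n = 10
sum_x, sum_x2 = 784, 62348
sum_y, sum_y2 = 2998, 913394
sum_xy = 238393

ss_xx = sum_x2 - sum_x ** 2 / n
ss_yy = sum_y2 - sum_y ** 2 / n
ss_xy = sum_xy - sum_x * sum_y / n

# (a) Fitted line.
b1 = ss_xy / ss_xx
b0 = sum_y / n - b1 * sum_x / n

# (b) t statistic for H0: beta1 = 0.
sse = ss_yy - b1 * ss_xy
s = sqrt(sse / (n - 2))          # standard error of the estimate
se_b1 = s / sqrt(ss_xx)
t_slope = b1 / se_b1

T_CRIT = 2.3060  # assumed table value: t(0.025, df = 8)

# (c) 95% confidence interval for the slope.
slope_ci = (b1 - T_CRIT * se_b1, b1 + T_CRIT * se_b1)

# (d) 95% prediction interval for an individual with x = 78.
x0 = 78
y_hat = b0 + b1 * x0
se_pred = s * sqrt(1 + 1 / n + (x0 - sum_x / n) ** 2 / ss_xx)
pred_interval = (y_hat - T_CRIT * se_pred, y_hat + T_CRIT * se_pred)

print(f"Y-hat = {b0:.4f} + {b1:.4f} X, t = {t_slope:.3f}")
print("slope CI:", slope_ci, "prediction interval:", pred_interval)
```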
–END–