Exam 1

{2 points} Name ____________________________
You MUST work alone – no tutors; no help from classmates. Email me or see me with
questions. You will receive a score of 0 if this rule is violated. Classroom Lab is open on
Monday from 9am-3pm.
EPE/EDP 660 EXAM 1:
Summer 2015
{3 points} Minitab (or other approved software) output must be included. It must be clearly
labeled, with all answers clearly identified. You may also choose to copy output and embed
within the exam responses. You must also include a copy of your session window. Make
sure to enable commands in the worksheet. Do NOT include a copy of the worksheet.
Directions: Read each question before responding. In order to receive partial credit, work
must be shown.
PART A (33 POINTS): FILL IN THE BLANK (with best choice) {1 point per blank}
1. The field of statistics can be roughly divided into two areas (also known as branches of
statistics): ____________________ and ____________________ .
2. The mean, μ, and standard deviation, σ, completely specify the ____________________
distribution.
3. If we think that a variable, x, may explain or even cause changes in another variable, y, we
call x a(n) ____________________ variable and y a(n) ____________________ variable.
4. The ____________________ measures the direction and the strength of the linear
association between two variables.
5. Fill in the following table with OK, Type I error, or Type II error.

                                   Decision
Null Hypothesis     Fail to Reject Ho       Reject Ho
Ho True             ________________        ________________
Ho False            ________________        ________________
6. At a UK basketball game, 500 fans were randomly selected and asked how much they
paid for parking. From this group a mean of $14 was computed. Match the items in
Column II with the statistical term in Column I.
Column I              Column II
___ Statistic         (a) 500 selected fans
___ Data              (b) All fans in attendance
___ Sample            (c) The amount paid for parking by each of the 500 selected fans
___ Population        (d) The computed $14
Use the following Table to answer # 7 – 12.

ID    Anxiety [0-100%]    Sex    Result on Test
1     25                  F      Fail
2     2                   F      Pass
3     77                  F      Pass
4     56                  M      Pass
5     95                  M      Fail
6     0                   M      Pass

7. What type of variable is ID? ____
A. Nominal
B. Ordinal
C. Interval
D. Ratio
8. What type of variable is Sex? ____
A. Nominal
B. Ordinal
C. Interval
D. Ratio
9. What type of variable is Anxiety? ____
A. Nominal
B. Ordinal
C. Interval
D. Ratio
10. What type of variable is Result on Test? ____
A. Nominal
B. Ordinal
C. Interval
D. Ratio
11. What type of variable is ID? ____
A. Categorical
B. Numerical
12. What type of variable is Anxiety? ____
A. Categorical
B. Numerical
Use the following equation to answer # 13 – 15.
E(y) = β0 + β1x
13. β0 and β1 are ___________________.
A. Population parameters
B. Sample statistics
C. Intercepts
D. Slope estimates
14. What is the y-intercept when x is 0? ____
A. E(y)
B. β0
C. β1
D. x
15. What is the slope estimate? ____
A. E(y)
B. β0
C. β1
D. x
Fill in the blank with either “A” for True or “B” for False.
16. The mean is a measure of central tendency. ____
A. True
B. False
17. If a density curve is perfectly symmetric, the mean, median, and mode are the same. ____
A. True
B. False
18. The ε is a random variable with mean = 1 and variance σ². ____
A. True
B. False
19. R-square (R²) values range between 0 and 100%. ____
A. True
B. False
20. Pearson correlation values range between 0 and 1. ____
A. True
B. False
21. To test if the slope parameter is zero, we use a t-test. ____
A. True
B. False
22. SST does not change with the model, as it depends only on values of the dependent variable, y. ____
A. True
B. False
23. SSE decreases as variables are added to a model, and SSM increases by the same amount. ____
A. True
B. False
24. In a hypothesis test, if the p-value = 0.08 and you have set alpha at 0.05, the decision would be to Reject
the null hypothesis. ____
A. True
B. False
25. Sums of squares measure the amounts of explained and unexplained information due to the model. ____
A. True
B. False
PART B (18 POINTS): SHORT ANSWER
Answer the following questions in a few sentences.
1. Central tendency can be measured by the mean, median, and mode. Briefly describe each one providing
an example of when it might be used. {3 points}
2. What three components do you need to produce a confidence interval? Why do statisticians prefer
confidence intervals over point estimates? {3 points}
3. A researcher demonstrates that the number of statistics courses taken is highly negatively correlated with
exam anxiety. She goes on to argue that increasing the number of statistics classes students take will
cause a decrease in exam anxiety. Is this a reasonable argument? Explain. {4 points}
4. As a graduate student discusses the results of a study he conducted, a classmate suggests he has
committed a Type II error. What does this mean, and what might the student do to guard against making
this type of error in the future? Be specific about how the adjustment would help. {4 points}
5. A teacher fits a regression model using age to predict effort. For the model, Minitab reports an R-square
value of 0.63. The teacher argues that while this may not be the best estimate, it is good. Do you agree?
Explain. {4 points}
PART C (44 POINTS): DATA ANALYSIS
1. A researcher wants to investigate the relationship between students’ “self-concept” and their academic
performance. The researcher is specifically interested in how much “self-concept” contributes to
explaining GPA after the effect of IQ is taken into account. The study includes a sample of 78
seventh-grade students in a rural Midwestern school. The variables include each student’s grade
point average (GPA), score on a standard IQ test (IQ), and score on the Piers-Harris Children’s
Self-Concept Scale (Self-concept).
A. Produce descriptive statistics (be sure to produce at a minimum the mean, median, n and standard
deviation) of GPA, IQ, and Self-Concept (you may wish to use graphical summary). {3 points}
B. Discuss the distribution of each variable, in terms of central tendency and variation. You may also
wish to discuss the general shape of the plotted variables. {3 points}
C. Now investigate the potential effect of IQ on GPA: Create a scatter plot of GPA versus IQ.
{1 point}; Calculate the correlation estimate between these two variables. {1 point}; Perform a
simple linear regression. {2 points} You are not required to check assumptions here.
D. Now investigate the potential effect of student’s self-concept on the GPA: Create a scatter plot of
GPA versus self-concept {1 point}; Calculate the correlation estimate between these two
variables. {1 point}; Perform a simple linear regression. {2 points} You are not required to check
assumptions here.
E. Which variable appears to have the greatest variation? Explain. {2 points}
F. Do you feel that both IQ and Self-concept are useful predictors of GPA? Explain. {3 points}
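Note for students using software other than Minitab: the summaries requested in parts A, C, and D can be sketched in Python as below. This is only an illustration under stated assumptions. The lists `gpa` and `iq` hold made-up values (not the exam data set), and the helper names `describe` and `pearson_r` are hypothetical. Confirm with the instructor that your software is approved.

```python
import statistics

# Hypothetical values for illustration only; NOT the exam data set.
gpa = [2.0, 2.5, 3.0, 3.5, 4.0]
iq = [90, 100, 105, 115, 120]


def describe(values):
    """Return n, mean, median, and sample standard deviation."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample SD (divides by n - 1)
    }


def pearson_r(x, y):
    """Pearson correlation computed from sums of squares and cross-products."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


summary = describe(gpa)
r = pearson_r(iq, gpa)
print(summary)
print("r =", round(r, 4))
```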
2. A researcher is interested in estimating birth rates for young mothers (ages 15-17) from
the poverty rate in the state in which they live. He consults the U.S. Census Bureau for the
year 2000. He records the state name (Location), percentage of population living in
households with income below poverty level (PovPct) and birth rate for females 15 to 17
years old (Brth15to17). Birth rate refers to births per 1,000 persons in the group.
A. Create a scatterplot of PovPct (x) and Brth15to17 (y). {1 point}
B. Based on your plot, describe the strength and direction of the relationship between
Birth Rate (y) and Percent of Poverty (x). {2 points}
C. Do you feel that simple linear regression is a sound choice in this setting? Explain.
You are not being asked to run the regression at this point. {2 points}
D. Compute the least-squares regression equation, R-square, and coefficient estimates. Be
sure to also create the 4-1 plot to check the assumptions of regression. {3 points}
E. Are the assumptions satisfied for linear regression? Discuss in terms of the plots.
Provide detail of either support or non-support. {3 points}
F. Write the most accurate least-squares regression equation using your output. {1 point}
G. Interpret your slope (b1) in the context of these variables. {2 points}
H. Test if b0 and b1 are significant. Is one of these tests more important? Why? {3 points}
I. Report and Interpret your R-square value. {2 points}
J. Identify and report any extreme values in your data set. If none, state that. {2 points}
K. Given a PovPct of 68, use your regression equation to predict the expected birth rate.
(Show your work). Do you have any issues with this estimate? Explain. {2 points}
L. Produce the line of best fit; be sure to include 95% Confidence and Prediction intervals
on your graph. {2 points}
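Note for students using software other than Minitab: the least-squares quantities requested in question 2 (b0, b1, and R-square) can be sketched in Python as below. This is a minimal illustration, not the required workflow. The function name `fit_line` is hypothetical, and the `pov` and `birth` lists are made-up numbers, not the Census data. It does not produce the residual plots or the confidence and prediction intervals, which still need to come from your statistical software.

```python
def fit_line(x, y):
    """Least-squares fit of y = b0 + b1*x; returns (b0, b1, r_squared)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx            # slope estimate
    b0 = my - b1 * mx         # intercept estimate
    sst = sum((yi - my) ** 2 for yi in y)                        # total SS
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))  # error SS
    r_squared = 1 - sse / sst
    return b0, b1, r_squared


# Hypothetical values for illustration only; NOT the Census data.
pov = [10.0, 12.0, 15.0, 18.0, 20.0]
birth = [15.0, 18.0, 22.0, 28.0, 30.0]

b0, b1, r2 = fit_line(pov, birth)
print("b0 =", round(b0, 4), "b1 =", round(b1, 4), "R-square =", round(r2, 4))
```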