{2 points} Name ____________________________

You MUST work alone – no tutors; no help from classmates. Email me or see me with questions. You will receive a score of 0 if this rule is violated. Classroom Lab is open on Monday from 9am-3pm.

EPE/EDP 660 EXAM 1: Summer 2015

{3 points} Minitab (or other approved software) output must be included. It must be clearly labeled, with all answers clearly identified. You may also choose to copy output and embed it within the exam responses. You must also include a copy of your session window. Make sure to enable commands in the worksheet. Do NOT include a copy of the worksheet.

Directions: Read each question before responding. In order to receive partial credit, work must be shown.

PART A (33 POINTS): FILL IN THE BLANK (with best choice) {1 point per blank}

1. The field of statistics can be roughly divided into two areas (also known as branches of statistics): ____________________ and ____________________ .

2. The mean, μ, and standard deviation, σ, completely specify the ____________________ distribution.

3. If we think that a variable, x, may explain or even cause changes in another variable, y, we call x a(n) ____________________ variable and y a(n) ____________________ variable.

4. The ____________________ measures the direction and the strength of the linear association between two variables.

5. Fill in the following table with OK, Type I error, or Type II error.

                                      Decision
                            Fail to Reject Ho     Reject Ho
   Null          Ho True    _________________     _________________
   Hypothesis    Ho False   _________________     _________________

6. At a UK basketball game, 500 fans were randomly selected and asked how much they paid for parking. From this group a mean of $14 was computed. Match the items in Column II with the statistical term in Column I.

   Column I           Column II
   ___ Statistic      (a) 500 selected fans
   ___ Data           (b) All fans in attendance
   ___ Sample         (c) The amount paid for parking by each of the 500 selected fans
   ___ Population     (d) The computed $14

Use the following Table to answer # 7 – 12.
   ID   Anxiety [0-100%]   Sex   Result on Test
   1    25                 F     Fail
   2    2                  F     Pass
   3    77                 F     Pass
   4    56                 M     Pass
   5    95                 M     Fail
   6    0                  M     Pass

7. What type of variable is ID? ____
   A. Nominal  B. Ordinal  C. Interval  D. Ratio

8. What type of variable is Sex? ____
   A. Nominal  B. Ordinal  C. Interval  D. Ratio

9. What type of variable is Anxiety? ____
   A. Nominal  B. Ordinal  C. Interval  D. Ratio

10. What type of variable is Result on Test? ____
   A. Nominal  B. Ordinal  C. Interval  D. Ratio

11. What type of variable is ID? ____
   A. Categorical  B. Numerical

12. What type of variable is Anxiety? ____
   A. Categorical  B. Numerical

Use the following equation to answer # 13 – 15.

   E(y) = β0 + β1x

13. β0 and β1 are ___________________.
   A. Population parameters  B. Sample statistics  C. Intercepts  D. Slope estimates

14. What is the y-intercept when x is 0? ____
   A. E(y)  B. β0  C. β1  D. x

15. What is the slope estimate? ____
   A. E(y)  B. β0  C. β1  D. x

Fill in the blank with either “A” for True or “B” for False.

16. The mean is a measure of central tendency. ____
   A. True  B. False

17. If a density curve is perfectly symmetric, the mean, median, and mode are the same. ____
   A. True  B. False

18. The ε is a random variable with mean = 1 and variance σ². ____
   A. True  B. False

19. R-square (R²) values range between 0 and 100%. ____
   A. True  B. False

20. Pearson correlation values range between 0 and 1. ____
   A. True  B. False

21. To test if the slope parameter is zero, we use a t-test. ____
   A. True  B. False

22. SST does not change with the model, as it depends only on values of the dependent variable, y. ____
   A. True  B. False

23. SSE decreases as variables are added to a model, and SSM increases by the same amount. ____
   A. True  B. False

24. In a hypothesis test, if the p-value = 0.08 and you have set alpha at 0.05, the decision would be to reject the null hypothesis. ____
   A. True  B. False

25. Sums of squares measure the amounts of explained and unexplained information due to the model. ____
   A. True B.
False

PART B (18 POINTS): SHORT ANSWER

Answer the following questions in a few sentences.

1. Central tendency can be measured by the mean, median, and mode. Briefly describe each one, providing an example of when it might be used. {3 points}

2. What three components do you need to produce a confidence interval? Why do statisticians prefer confidence intervals over point estimates? {3 points}

3. A researcher demonstrates that the number of statistics courses taken is highly negatively correlated with exam anxiety. She goes on to argue that increasing the number of statistics classes students take will cause a decrease in exam anxiety. Is this a reasonable argument? Explain. {4 points}

4. As a graduate student discusses the results of a study he conducted, a classmate suggests he has committed a Type II error. What does this mean, and what might the student do to guard against making this type of error in the future? Be specific about how the adjustment would help. {4 points}

5. A teacher fits a regression model using age to predict effort. For the model, Minitab reports an R-square value of 0.63. The teacher argues that while this may not be the best estimate, it is good. Do you agree? Explain. {4 points}

PART C (44 POINTS): DATA ANALYSIS

1. A researcher wants to investigate the relationship between students’ “self-concept” and their academic performance. The researcher is specifically interested in how much “self-concept” contributes to explaining GPA after the effect of IQ is taken into account. The study includes a sample of 78 seventh-grade students in a rural Midwestern school. The variables include each student’s grade point average (GPA), score on a standard IQ test (IQ), and score on the Piers-Harris Children’s Self-Concept Scale (Self-concept).

A. Produce descriptive statistics (at a minimum the mean, median, n, and standard deviation) of GPA, IQ, and Self-concept (you may wish to use the graphical summary). {3 points}

B.
Discuss the distribution of each variable in terms of central tendency and variation. You may also wish to discuss the general shape of the plotted variables. {3 points}

C. Now investigate the potential effect of IQ on GPA: create a scatter plot of GPA versus IQ {1 point}; calculate the correlation estimate between these two variables {1 point}; perform a simple linear regression {2 points}. You are not required to check assumptions here.

D. Now investigate the potential effect of students’ self-concept on GPA: create a scatter plot of GPA versus Self-concept {1 point}; calculate the correlation estimate between these two variables {1 point}; perform a simple linear regression {2 points}. You are not required to check assumptions here.

E. Which variable appears to have the greatest variation? Explain. {2 points}

F. Do you feel that both IQ and Self-concept are useful predictors of GPA? Explain. {3 points}

2. A researcher is interested in estimating birth rates for young mothers (ages 15-17) from the poverty rate in the state in which they live. He consults the U.S. Census Bureau for the year 2000. He records the state name (Location), the percentage of the population living in households with income below the poverty level (PovPct), and the birth rate for females 15 to 17 years old (Brth15to17). Birth rate refers to births per 1,000 persons in the group.

A. Create a scatterplot of PovPct (x) and Brth15to17 (y). {1 point}

B. Based on your plot, describe the strength and direction of the relationship between Birth Rate (y) and Percent of Poverty (x). {2 points}

C. Do you feel that simple linear regression is a sound choice in this setting? Explain. You are not being asked to run the regression at this point. {2 points}

D. Compute the least-squares regression equation, R-square, and coefficient estimates. Be sure to also create the 4-in-1 residual plot to check the assumptions of regression. {3 points}

E. Are the assumptions satisfied for linear regression? Discuss in terms of the plots.
Provide detail of either support or non-support. {3 points}

F. Write the most accurate least-squares regression equation using your output. {1 point}

G. Interpret your slope (b1) in the context of these variables. {2 points}

H. Test if b0 and b1 are significant. Is one of these tests more important? Why? {3 points}

I. Report and interpret your R-square value. {2 points}

J. Identify and report any extreme values in your data set. If there are none, state that. {2 points}

K. Given a PovPct of 68, use your regression equation to predict the expected birth rate. (Show your work.) Do you have any issues with this estimate? Explain. {2 points}

L. Produce the line of best fit; be sure to include 95% confidence and prediction intervals on your graph. {2 points}