{2 points} Name ____________________________

You MUST work alone – no tutors; no help from classmates. Email me or see me with questions. You will receive a score of 0 if this rule is violated.

EPE/EDP 660 EXAM 1: Summer 2014

{3 points} Minitab (or other approved software) output must be included. It must be clearly labeled, with all answers clearly identified. In addition, you must include a copy of your session window. Make sure to enable commands in the worksheet. Do NOT include a copy of the worksheet.

Directions: Read each question before responding. In order to receive partial credit, work must be shown.

PART A (33 POINTS): FILL IN THE BLANK (with best choice) {1 point per blank}

1. The field of statistics can be roughly divided into two areas (also known as branches of statistics): ____________________ and ____________________ .

2. The mean, μ, and standard deviation, σ, completely specify the ____________________ distribution.

3. If we think that a variable, x, may explain or even cause changes in another variable, y, we call x a(n) ____________________ variable and y a(n) ____________________ variable.

4. The ____________________ measures the direction and the strength of the linear association between two variables.

5. Fill in the following table with OK, Type I error, or Type II error.

                                   Decision
   Null Hypothesis    Fail to Reject Ho       Reject Ho
   Ho True            ________________        ________________
   Ho False           ________________        ________________

6. At a local high school, 100 students were randomly selected and asked how many hours they slept the previous night. From this group a mean of 7.2 hours was computed. Match the items in Column II with the statistical terms in Column I.

   Column I           Column II
   ___ Statistic      (a) The 100 students selected
   ___ Data           (b) All students at the local high school
   ___ Sample         (c) The number of hours each of the 100 students slept
   ___ Population     (d) The computed 7.2 hours of sleep

Use the following table to answer # 7 – 12.

   ID    Sex    Anxiety [0-100%]    Result on Test
   1     F      25                  Fail
   2     F      2                   Pass
   3     F      77                  Pass
   4     M      56                  Pass
   5     M      95                  Fail
   6     M      0                   Pass

7. What type of variable is ID? ____
   A. Nominal B. Ordinal C. Interval D. Ratio

8. What type of variable is Sex? ____
   A. Nominal B. Ordinal C. Interval D. Ratio

9. What type of variable is Anxiety? ____
   A. Nominal B. Ordinal C. Interval D. Ratio

10. What type of variable is Result on Test? ____
    A. Categorical B. Numerical

11. What type of variable is ID? ____
    A. Categorical B. Numerical

12. What type of variable is Anxiety? ____
    A. Categorical B. Numerical

Use the following equation to answer # 13 – 15: E(y) = β0 + β1x

13. β0 and β1 are ___________________.
    A. Population parameters B. Sample statistics C. Intercepts D. Slope estimates

14. What is the y-intercept when x is 0? ____
    A. E(y) B. β0 C. β1 D. x

15. What is the slope estimate? ____
    A. E(y) B. β0 C. β1 D. x

Fill in the blank with either “A” for True or “B” for False.

16. The standard deviation is a measure of central tendency. ____
    A. True B. False

17. If a density curve is perfectly symmetric, the mean, median, and mode are the same. ____
    A. True B. False

18. The ε is a random variable with mean = 0 and variance σ². ____
    A. True B. False

19. R-square values range between 0 and 100%. ____
    A. True B. False

20. Pearson correlation values range between 0 and 1. ____
    A. True B. False

21. To test if the slope parameter is zero, we use a t-test. ____
    A. True B. False

22. SST does not change with the model, as it depends only on values of the dependent variable, y. ____
    A. True B. False

23. SSE decreases as variables are added to a model, and SSM increases by the same amount. ____
    A. True B. False

24. In a hypothesis test, if the p-value = .94 and you have set alpha at .05, you would reject the null hypothesis. ____
    A. True B. False

25. Sums of squares measure the amounts of explained and unexplained information due to the model. ____
    A. True B. False

PART B (18 POINTS): SHORT ANSWER

Answer the following questions in a few sentences.

1. Central tendency can be measured by the mean, median, and mode.
Briefly describe each one, providing an example of when it might be used. {3 points}

2. What three components do you need to produce a confidence interval? Why do statisticians prefer confidence intervals over point estimates? {3 points}

3. A researcher demonstrates that number of hours slept is highly negatively correlated with grade on exam. He goes on to argue that lack of sleep causes poor grades. Is this a reasonable argument? Explain. {4 points}

4. As a graduate student discusses the results of a study she conducted, a classmate suggests she has committed a Type II error. What does this mean, and what might the student do to guard against making this type of error in the future? {4 points}

5. A teacher fits a regression model using height to predict effort. For the model, Minitab reports an R-square value of 0.54. The teacher argues that while this may not be the best estimate, it is good. Do you agree? Explain. {4 points}

PART C (44 POINTS): DATA ANALYSIS

1. A researcher wants to investigate the relationship between students’ “self-concept” and their academic performance. She is specifically interested in how much “self-concept” contributes to explaining GPA after the effect of IQ is taken into account. The study includes a sample of 78 seventh-grade students in a rural Midwestern school. The variables include each student’s grade point average (GPA), score on a standard IQ test (IQ), and score on the Piers-Harris Children’s Self-Concept Scale (Self-concept).

A. Produce descriptive statistics (at a minimum the mean, median, n, and standard deviation) of GPA, IQ, and Self-concept (you may wish to use a graphical summary). {3 points}

B. Discuss the distribution of each variable in terms of central tendency and variation. You may also wish to discuss the general shape of the plotted variables. {3 points}

C. Now investigate the potential effect of IQ on GPA: Create a scatter plot of GPA versus IQ.
{1 point}; Calculate the correlation estimate between these two variables. {1 point}; Perform a simple linear regression. {2 points}

D. Now investigate the potential effect of students’ self-concept on GPA: Create a scatter plot of GPA versus Self-concept. {1 point}; Calculate the correlation estimate between these two variables. {1 point}; Perform a simple linear regression. {2 points}

E. Which variable appears to have the greatest variation? Explain. {2 points}

F. Do you feel that both IQ and Self-concept are useful predictors of GPA? Explain. {3 points}

2. A researcher is interested in estimating birth rates for young mothers (ages 15-17) from the poverty rate in the state in which they live. He consults the U.S. Census Bureau for the year 2000. He records the state name (Location), percentage of population living in households with income below poverty level (PovPct), and birth rate for females 15 to 17 years old (Brth15to17). Birth rate refers to births per 1,000 persons in the group.

A. Create a scatterplot of PovPct (x) and Brth15to17 (y). {1 point}

B. Based on your plot, describe the strength and direction of the relationship between Birth Rate (y) and Percent of Poverty (x). {2 points}

C. Do you feel that simple linear regression is a sound choice in this setting? Explain. {2 points}

D. Compute the least-squares regression equation, R-square, and coefficient estimates. Be sure to also create the 4-in-1 plot to check the assumptions. {4 points}

E. Are the assumptions satisfied for linear regression? Discuss in terms of the plot. Provide detail of either support or non-support. {3 points}

F. Write the most accurate least-squares regression equation using your output. {1 point}

G. Interpret your slope (b1) in the context of these variables. {2 points}

H. Test if b0 and b1 are significant. Is one of these tests more important? Why? {3 points}

I. Report and interpret your R-square value. {2 points}

J. Identify and report any extreme values in your data set.
If none, state that. {2 points}

K. Given a PovPct of 68, use your regression equation to predict the expected birth rate. (Show your work.) Do you have any issues with this estimate? Explain. {3 points}
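
Note for students using approved software other than Minitab: the least-squares computations requested in Part C, Question 2 (coefficient estimates, R-square, and a prediction from the fitted equation) might be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the five data pairs are invented placeholders, NOT the exam data set, and the function names fit_simple_regression and predict are hypothetical. Only the variable names PovPct and Brth15to17 come from the question.

```python
from statistics import mean


def fit_simple_regression(x, y):
    """Return (b0, b1, r_square) for the least-squares line y = b0 + b1*x."""
    x_bar, y_bar = mean(x), mean(y)
    # Sums of squares and cross-products about the means
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx              # slope estimate
    b0 = y_bar - b1 * x_bar     # intercept estimate
    # SST: total variation in y; SSE: variation left unexplained by the line
    sst = sum((yi - y_bar) ** 2 for yi in y)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    r_square = 1 - sse / sst
    return b0, b1, r_square


def predict(b0, b1, x_new):
    """Predicted mean response at x_new from the fitted equation."""
    return b0 + b1 * x_new


# Placeholder values for illustration only (assumed, not the real data)
pov_pct = [10.0, 12.0, 14.0, 16.0, 18.0]       # stands in for PovPct
brth_15_to_17 = [15.0, 19.0, 21.0, 27.0, 28.0]  # stands in for Brth15to17

b0, b1, r_square = fit_simple_regression(pov_pct, brth_15_to_17)
print(f"Brth15to17 = {b0:.3f} + {b1:.3f} * PovPct")
print(f"R-square = {100 * r_square:.1f}%")
print(f"Predicted birth rate at PovPct = 15: {predict(b0, b1, 15.0):.2f}")
```

Replace the placeholder lists with the values from the data file before reporting anything. Residual plots and significance tests are not covered by this sketch and still need to be produced with your software.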