STAT 5050

Instructor: Scott Crawford
E-mail: scrawfo8@uwyo.edu
Office hours: MWF 12-1pm in Ross 333 (for other times just email me)
Website: http://www.uwyo.edu/crawford/stat5050
Class: EN 2109 from 9am to 9:50am

Required Text: K.G. Gerow, Stats Alive: Concepts and Tools. This is an e-book that will be available online.
Optional Text: Fred Ramsey, Dan Schafer. The Statistical Sleuth, 3rd Edition.

Teaching Assistant: Jared Studyvin
E-mail: jstudyvin@uwyo.edu
Office: Ross Hall 351 (make appointments by email)

Course Goals
(1) Formulate your science questions into questions about appropriate parameters;
(2) Choose and understand consequences of different study designs;
(3) Understand the idea of the sampling distribution of a statistic;
(4) Choose statistical tools applicable to your questions, and assess their validity;
(5) Correctly report the results of your statistical analyses;
(6) Read critically statistical aspects of studies in your literature.

Below is a list of topics we will study. The list is not a chronology; I reserve the right to alter the plan as appropriate.

Statistical Concepts and one- and two-sample tools:
(1) sampling distributions of statistics
    a. What does it mean to say: “The sample mean has a distribution”?
    b. The role of Normality in (many) sampling distributions
(2) hypothesis testing
    a. the role of the sampling distribution
    b. null/alternate hypotheses
    c. one- and two-tailed alternate hypotheses
    d. the α-level of a test
    e. the p-value as a form of evidence
(3) confidence intervals
    a. the role of the sampling distribution
    b. impact of sample size, confidence level, underlying variation on intervals
    c. margin of error (in classically constructed intervals)
    d. one-sided intervals (a.k.a. confidence bounds)
    e. connections between intervals and tests
(4) tools
    a. Numerical and graphical description
    b. The t-tools (for inferences on differences in means)
    c. Z-distribution tools for Binomial proportions
    d. Nonparametric methods for inferences on medians (and differences in medians)

Simple Linear Regression
This topic is of limited importance in and of itself; I believe it serves its highest purpose by being the simplest setting in which to study matters that play out in more complicated and subtle ways with multiple predictors.
(1) Concepts
    a. Purposes for exploring relationships between a response and a predictor (prediction and (apparent) effect of one variable on the other)
    b. Observational and experimental data: correlation is not causation
    c. Validity conditions
    d. Detecting and understanding influence (or lack thereof) of outliers
(2) Tools
    a. Simple linear regression, scatterplots, fitted line plots
    b. Residual analysis for tool validation
    c. The role of transformations
    d. Extensions: quadratic regression
    e. Making predictions

Multiple Regression
(1) Concepts
    a. Working analogy: building groups of people to work on a project vis-à-vis building multiple regression models
    b. Model selection: historical (forward, backward, stepwise) and contemporary (best subsets) methods
    c. Interactions between predictors
    d. The role of transformations
(2) Tools
    a. Fitting and interpreting models
    b. Testing for lack of fit
    c. Incorporating categorical predictors
    d. Testing for and interpreting interactions
    e. Residual analyses for examining model validity

ANOVA for one Factor
(1) Concepts
    a. Multiple groups (extension of two-samples)
    b. Strengths and limitations
(2) Tools
    a. One-way ANOVA; data organization
    b. Residual plots for assessing validity
    c. Post-hoc methods (Duncan’s Multiple Range, Tukey’s “all pairs”, contrasts); Bonferroni (and other) multiple test adjustments
    d. Contrasts (independent and non-independent)

Chi-Square and related analyses
(1) Concepts
    a. 2×2 tables and Binomial proportions
    b. Goodness of fit
    c. Tests for independence
(2) Tools
    a. Pearson’s statistic; G-statistic
    b. Inference for one-way, two-way, multi-way tables

Grading

Discussion Questions (30 points)
There will be 10 Discussion Question assignments (3 percentage points each). On Mondays I will assign a discussion question. You should be prepared to discuss your solutions (which might be terrible) on Friday. Then on the following Monday you should be prepared to hand in your final answer.

Homework assignments (30 points)
There will be six small data analysis problems (5 percentage points each), due on Fridays by 8:00am (see schedule next page). I will assign a partner to you; hand in one jointly authored response. The homework is graded as if it were a full project, although you get full credit for every completed problem. All due dates are posted on the class website http://www.uwyo.edu/crawford/stat5050

Mid-Term Project (20 points)
The mid-term project is not a team project (you do your own work alone). Talking with other students about the project is not allowed. You are free to discuss the project with me. The project will have a scenario and data set, with specific questions for you to answer as if you were replying to a client who does not know statistics.

Final Project (20 points)
The final project is not a team project (you do your own work alone). Talking with other students about the project is not allowed. You are free to discuss the project with me. The project will have a scenario and data set, with specific questions for you to answer as if you were replying to a client who does not know statistics.

5050 Schedule
Discussion Questions (DQ) are expected to have a first draft by Friday and are due the next Monday. Homework assignments (HW) will be due on Fridays by class time on the day indicated.
Monday               Wednesday            Friday
Sep 1   Labor Day    Sep 3                Sep 5
Sep 8                Sep 10               Sep 12  HW 1
Sep 15  DQ 1         Sep 17               Sep 19
Sep 22  DQ 2         Sep 24               Sep 26  HW 2
Sep 29  DQ 3         Oct 1                Oct 3
Oct 6   DQ 4         Oct 8                Oct 10  HW 3
Oct 13  DQ 5         Oct 15               Oct 17
Oct 20  DQ 6         Oct 22               Oct 24  HW 4
Oct 27  DQ 7         Oct 29               Oct 31
Nov 3                Nov 5                Nov 7   Mid-Term Project Due
Nov 10  DQ 8         Nov 12               Nov 14  HW 5
Nov 17  DQ 9         Nov 19               Nov 21
Nov 24  DQ 10        Nov 26  Thanksgiving Nov 28
Dec 1                Dec 3                Dec 5   HW 6
Dec 8                Dec 10               Dec 12  Final Project Due