Statistics and Zoology 5050

STAT 5050
Instructor: Scott Crawford
E-mail: scrawfo8@uwyo.edu
Office hours: MWF 12-1pm in Ross 333 (for other times just email me)
Website: http://www.uwyo.edu/crawford/stat5050
Class: EN 2109 from 9am to 9:50am
Required Text: K.G. Gerow, Stats Alive: Concepts and Tools
This is an e-book that will be available online.
Optional Text: Fred Ramsey, Dan Schafer. The Statistical Sleuth, 3rd Edition.
Teaching Assistant: Jared Studyvin
E-mail: jstudyvin@uwyo.edu
Office: Ross Hall 351 (make appointments by email)
Course Goals
(1) Formulate your science questions into questions about appropriate parameters;
(2) Choose and understand consequences of different study designs;
(3) Understand the idea of the sampling distribution of a statistic;
(4) Choose statistical tools applicable to your questions, and assess their validity;
(5) Correctly report the results of your statistical analyses;
(6) Read critically statistical aspects of studies in your literature.
Below is a list of topics we will study. The list is not a chronology;
I reserve the right to alter the plan as appropriate.
Statistical Concepts and one- and two-sample tools:
(1) sampling distributions of statistics
a. What does it mean to say: “The sample mean has a distribution”?
b. The role of Normality in (many) sampling distributions
(2) hypothesis testing
a. the role of the sampling distribution
b. null/alternate hypotheses
c. one- and two-tailed alternate hypotheses
d. the α-level of a test
e. the p-value as a form of evidence
(3) confidence intervals
a. the role of the sampling distribution
b. impact of sample size, confidence level, underlying variation on intervals
c. margin of error (in classically constructed intervals)
d. one-sided intervals (a.k.a. confidence bounds)
e. connections between intervals and tests
(4) tools
a. Numerical and graphical description
b. The t-tools (for inferences on differences in means)
c. Z-distribution tools for Binomial proportions
d. Nonparametric methods for inferences on medians (and differences in
medians)
Simple Linear Regression
This topic is of limited importance in and of itself; I believe it serves its highest purpose
by being the simplest setting in which to study matters that play out in more complicated
and subtle ways with multiple predictors.
(1) Concepts
a. Purposes for exploring relationships between a response and a predictor
(prediction and (apparent) effect of one variable on the other)
b. Observational and experimental data: correlation is not causation
c. Validity conditions
d. Detecting and understanding influence (or lack thereof) of outliers
(2) Tools
a. Simple linear regression, scatterplots, fitted line plots
b. Residual analysis for tool validation
c. The role of transformations
d. Extensions: quadratic regression
e. Making predictions
Multiple Regression
(1) Concepts
a. Working analogy: building groups of people to work on a project vis-à-vis
building multiple regression models
b. Model selection: historical (forward, backward, stepwise) and
contemporary (best subsets) methods.
c. Interactions between predictors
d. The role of transformations
(2) Tools
a. Fitting and interpreting models
b. Testing for lack of fit
c. Incorporating categorical predictors
d. Testing for and interpreting interactions
e. Residual analyses for examining model validity
ANOVA for one Factor
(1) Concepts
a. Multiple groups (extension of two-samples)
b. Strengths and limitations
(2) Tools
a. One-way ANOVA; data organization;
b. Residual plots for assessing validity;
c. Post-hoc methods (Duncan’s Multiple Range, Tukey’s “all pairs”,
contrasts); Bonferroni (and other) multiple test adjustments
d. Contrasts (independent and non-independent)
Chi-Square and related analyses
(1) Concepts
a. 2×2 tables and Binomial proportions
b. Goodness of fit
c. Tests for independence
(2) Tools
a. Pearson’s statistic; G-statistic
b. Inference for one-way, two-way, and multi-way tables
Grading
Discussion Questions (30 points).
There will be 10 Discussion Question assignments (3 percentage points each). On
Mondays I will assign a discussion question. You should be prepared to discuss your
solutions (which might be terrible) on Friday. Then, on the following Monday, you should
be prepared to hand in your final answer.
Homework assignments (30 points)
There will be six small data analysis problems (5 percentage points each), due on Fridays
by 8:00am (see schedule next page). I will assign a partner to you; hand in one jointly
authored response. The homework is graded as if it were a full project, although you get
a full grade for every completed problem. All due dates are posted on the class website:
http://www.uwyo.edu/crawford/stat5050
Mid-Term Project (20 Points)
The mid-term project is not a team project (you do your own work alone). Talking with
other students about the project is not allowed. You are free to discuss the project with
me. The project will have a scenario and data set, with specific questions for you to
answer as if you were replying to a client who does not know statistics.
Final Project (20 Points)
The final project is not a team project (you do your own work alone). Talking with other
students about the project is not allowed. You are free to discuss the project with me.
The project will have a scenario and data set, with specific questions for you to
answer as if you were replying to a client who does not know statistics.
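The point values above can be tallied with a short sketch. The counts and per-item values come from this Grading section; the names `components` and `course_total` are only illustrative and are not part of any official grade book.

```python
# Point values as stated in the Grading section:
# (number of items, points per item) for each graded component.
components = {
    "Discussion Questions": (10, 3),
    "Homework assignments": (6, 5),
    "Mid-Term Project": (1, 20),
    "Final Project": (1, 20),
}

# Points available in each component.
totals = {name: count * each for name, (count, each) in components.items()}

# Points available in the whole course.
course_total = sum(totals.values())

for name, points in totals.items():
    print(f"{name}: {points} points")
print(f"Course total: {course_total} points")
```

Because the total is 100, points and percentage points are interchangeable throughout this section.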
5050 Schedule
Discussion Questions (DQ) are expected to have a first draft by Friday and are due the next Monday.
Homework assignments (HW) will be due on Fridays by class time on the day indicated.
Monday                  Wednesday               Friday
Sep 1  Labor Day        3                       5
8                       10                      12  HW 1
15  DQ 1                17                      19
22  DQ 2                24                      26  HW 2
29  DQ 3                October 1               3
6  DQ 4                 8                       10  HW 3
13  DQ 5                15                      17
20  DQ 6                22                      24  HW 4
27  DQ 7                29                      31
November 3              5                       7  Mid-Term Project Due
10  DQ 8                12                      14  HW 5
17  DQ 9                19                      21
24  DQ 10               26  Thanksgiving        28
December 1              3                       5  HW 6
8                       10                      12  Final Project Due