Introduction to Biostatistics (MATH 225)

BIOSTATISTICS II (MATH 335)
Instructor: Doug Landsittel, Ph.D.
Phone: 412-396-1419; E-Mail: landsittel@mathcs.duq.edu
Office: 419 College Hall; Office Hours: TBD.
Teaching Assistant(s): TBD
Prerequisites: Biostatistics I
You should understand the following concepts:
• Basic concept of sampling and sampling variability
• Numerical summary statistics and graphical displays
• 1- and 2-sample confidence intervals for μ or μ1 – μ2
• 1- and 2-sample hypothesis tests of μ = μ0 or μ1 = μ2
• Analysis of variance tests
• Chi-square tests (OK if you did not cover these)
Course Objectives: This course extends your knowledge of statistical methods to
scenarios and types of data not covered in Biostatistics I.
• Non-parametric Statistics:
  o Methods for non-normal data
  o 1- and 2-sample hypothesis tests
• Categorical Data Analysis:
  o Methods for contingency tables (2 or more categorical variables)
  o Summary statistics and hypothesis tests
• Rates and Standardization:
  o Methods for calculating and standardizing morbidity/mortality rates
• Survival Analysis:
  o Methods for survival times, i.e. continuous data with censoring
  o Displaying survival distributions and 2-sample hypothesis tests
• Correlation and Simple Linear Regression:
  o Relationship between 2 continuous variables (usually one
    independent and one dependent variable)
  o Summary statistics, the regression model, and associated tests
• Multiple Regression:
  o Regression methods for a single continuous dependent variable but
    two or more independent variables
  o Summary statistics, the regression model, and associated tests
• Logistic Regression:
  o Regression methods for a single binary dependent variable
  o Summary statistics, the regression model, and associated tests
Overall Goal: learn enough statistics for analyses about non-normal, categorical,
and multidimensional data; provide a background for more advanced work.
Lectures will typically include a description of the concept/methods followed by a
numerical example that generally focuses on a biological or other scientific
application. Mathematical notation will be relatively minimal, although some
mathematics is necessary. Students are responsible for attending class, taking
notes, collecting any additional handouts, and completing HW as we finish
covering the topic in class. Blackboard will be used for this course.
Statistical Software Package: SPSS or S-Plus.
Required Textbook: Principles of Biostatistics (2nd Edition). By M. Pagano and
K. Gauvreau. Pacific Grove, CA: Duxbury. Homework assignments come
from the textbook.
Grading:
Homework: Homework will be assigned, but will not be collected or graded.
Class time will be reserved for answering homework questions.
Attendance: Attendance will be taken, but not formally incorporated into grading.
Quizzes: Quizzes will be given (without prior announcement), but not graded.
Exams: Exams 1, 2 and 3 are worth 20%, 25%, and 25%, respectively (not cumulative).
Final Exam: 30%; the final is cumulative.
Final grades may be curved if necessary, but will likely follow the usual 10-point
scale (e.g. 90-100 =A, 80-89 = B, etc.), with borderline grades just below the
cutoff potentially receiving a + or -. Typically, there will not be any specific curve
on individual exams. It is unlikely anyone will be ‘curved into’ the A range.
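As an illustration of how the exam weights above combine, here is a minimal sketch. The function names and the example scores are hypothetical, the cutoffs below 80 are an assumption filling in the "etc." of the 10-point scale, and no curve or +/- adjustment is modeled.

```python
# Exam weights as stated in this syllabus (they sum to 100%).
WEIGHTS = {"exam1": 0.20, "exam2": 0.25, "exam3": 0.25, "final": 0.30}


def course_percentage(scores):
    """Return the weighted course percentage from exam scores on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(percentage):
    """Map a percentage to a 10-point scale (no curve, no +/-).

    Only the A and B cutoffs are stated in the syllabus; the C, D and F
    cutoffs are assumed to continue the same pattern.
    """
    for cutoff, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if percentage >= cutoff:
            return letter
    return "F"


# Hypothetical scores, for illustration only.
example = {"exam1": 85, "exam2": 78, "exam3": 92, "final": 88}
total = course_percentage(example)
print(round(total, 1), letter_grade(total))
```

With these made-up scores the weighted total is 0.20(85) + 0.25(78) + 0.25(92) + 0.30(88) = 85.9, which falls in the B range before any adjustment.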
Exams will NOT be open book; students are allowed to bring notes fitting on
an 8.5×11 sheet of paper (back and front). Also, bring a calculator to the
exam. Copies of needed statistical tables will be provided with the exam.
All of the exams will have a take-home component.
Policy on missed exams: Students must make arrangements PRIOR to the exam
to have the opportunity to take the exam late.
Expectations of the Students:
• Lecture: Attend lecture, take notes, and review notes
• Homework: Complete homework for both repetition and
  application/expansion of concepts learned in lecture
• Statistical Software: Learn and apply statistical software (during later
  sections of the course)
• Exams: Demonstrate an understanding of both the concepts and methods,
  and an ability to apply them to “similar” problems.
Anything I write down during class is potentially fair game for an exam (except
notes that I clearly denote as ‘an aside’). Exam questions are meant to be
reflective of the notes and HW as a whole, but also challenge that understanding.
Course Schedule:
Week  Mon   Wed   Fri   Topic
1.    1/9   1/11  1/13  Non-parametric Statistics
2.    1/16  1/18  1/20  Non-parametric Statistics
3.    1/23  1/25  1/27  Categorical Data
4.    1/30  2/1   2/3   Categorical Data / Exam 1: 2/3
5.    2/6   2/8   2/10  Rates and Standardization
6.    2/13  2/15  2/17  Rates and Standardization
7.    2/20  2/22  2/24  Survival Analysis
8.    2/27  3/1   3/3   Survival Analysis / Exam 2: 3/3
      3/6   3/8   3/10
9.    3/13  3/15  3/17  Correlation & Simple Regression
10.   3/20  3/22  3/24  Simple & Multiple Regression
11.   3/27  3/29  3/31  Multiple Regression
12.   4/3   4/5   4/7   Exam 3: 4/5; Logistic Regression
13.   4/10  4/12  4/14  Logistic Regression
14.   4/17  4/19  4/21  Overview & Review for the Final
The final is scheduled by the University for Mon 5/1, 11am-1pm.
Reading Assignments:
The given reading assignments provide an important supplement to the lecture
notes. However, they do not provide a replacement for lecture! Lecture notes will
include material not contained within the text. You may want to do the reading
assignment before (for preparation) and/or immediately after (for repetition of
concepts) the relevant lecture discussions.
Non-parametric Statistics: Chapter 13
Categorical Data: Chapters 15 and 16
Rates and Standardization: Chapter 4
Survival Analysis: Chapter 21
Correlation: Chapter 17
Simple Regression: Chapter 18
Multiple Regression: Chapter 19
Logistic Regression: Chapter 20
Homework Assignments:
Homework assignments should be completed immediately after completing the
relevant lectures in class. Within 1-2 classes after completing a given topic, I will
reserve class discussion time for homework. This period will only be useful if you
have at least attempted the assignment, and have formulated subsequent questions.
I will not pass out solutions (since students tend to just wait for the solutions!).
Non-parametric Statistics: Section 13.6; page 317-321
Assignment: #1-5, #6, #7a, #10
Additional practice: #8a, #9, #11, #12, #13, #14, #15b, #16
Categorical Data: Section 15.6; page 366-372.
Assignment: #1-5, #8a-b, #10, #13; Additional practice: #6-7, #9, #12a, #14, #16
Section 16.4; page 393-396.
Assignment: #1-4, #5; Additional practice: #6-8
Rates and Standardization: Section 4.4; page 89-95.
Assignment: #1-6, #7, #8, #15
Additional practice: #16
Survival Analysis: Section 21.5; page 511-512.
Assignment: #1-5, #6; Additional practice: #7, #8, #9
Correlation: Section 17.5; page 412-414.
Assignment: #1-4, #5
Additional practice: #6-8
Simple Linear Regression: Section 18.5; page 443-447.
Assignment: #1-7, #9, #11
Additional practice: #10, #12, #13
Multiple Linear Regression: Section 19.4; page 465-469.
Assignment: #1-6, #7, #8, #9
Additional practice: #10, #11, #12
Logistic Regression: Section 20.5; page 484-487.
Assignment: #1-4, #5, #7
Additional practice: #6, #8, #9
Policy regarding Student Disabilities:
Students who feel they may have a disability that requires special accommodation
should contact the Office of Freshman Development and Special Student Services,
309 Student Union, at 412-396-6658. This office will then provide the appropriate
paperwork to the student to pass on to his/her instructors. Appropriate
arrangements will then be made for exam accommodations or other such issues in
accordance with University policy.
Policy regarding Academic Integrity:
Any students found to be sharing answers or assisting each other on any exams
will be assigned an F (0%) for that exam.
Extra Credit:
Students wishing to raise their grades can turn in an additional project which uses
as many of the techniques we learned as possible. You must pick a dataset to
analyze and prepare a 3-6 page report which has an abstract, introduction,
methods, results and discussion section. You must organize your report around
one or more logical research questions. You must also include a list of references
for any literature that you cite in the report. All extra credit projects must use a
statistical software package for at least some of the analyses.
The introduction section describes relevant background information and a
summary of existing knowledge on the subject. Your introduction must include at
least a few references to the published literature. The methods section includes a
brief description of the data set and the statistical approaches that you use to
analyze the data. This section also must include at least one citation of the
literature, which could be your textbook (but should also include a reference to
where you got the dataset, which could cite the internet if you get your data
online). The results section describes the statistical results you got, and does not
need to include any references. The discussion should describe the importance
and limitations of the results you got. The discussion should include at least a few
literature citations. Since this is for extra credit, I will not provide extensive help
on this project.
Students must select individual projects; team projects will not be accepted. All of
the above specifications must be met to receive any extra credit. Any pages
beyond the 6-page limit will be discarded and not considered at all in the grading.
Projects satisfying all of the above specifications will get between 1% and 10%
added to their final grade, with an expected average of 2-3% extra credit. Only
extremely innovative and well-done projects will receive more than 4-5% extra
credit.
The actual data set and research question for this project are not limited to
biostatistics or the medical field. You may, for instance, choose a subject (such as
NFL statistics) that is of personal interest to you. I will accept multiple projects
that happen to be on the same subject, but you may not work together at all.
Projects which appear to be very similar in content (not just the overall idea) will
be discounted and given no extra credit.