Applied Epidemiology Using R

UC Berkeley School of Public Health
Syllabus for PH 251D CC# 76238
Fall 2012, Mondays, 4pm-6pm
214 Haviland Hall, Updated 2012-09-23
Faculty
Tomás J. Aragón, MD, DrPH
Principal Investigator, Cal PREPARE Systems Research
Center for Infectious Diseases & Emergency Readiness
UC Berkeley School of Public Health
Health Officer, City & County of San Francisco
San Francisco Department of Public Health
Email: tomas.aragon@sfdph.org
G-Phone: 415-78-SALUD (415-787-2583)
Michael C. Samuel, DrPH
Chief, Surveillance and Epidemiology Section
STD Control Branch, DCDC, CID
California Department of Public Health
Email: Michael.Samuel@cdph.ca.gov; Tel: 510-620-3198
Course description
This is an intensive one-semester introduction to the R programming language for applied
epidemiology. R is a freely available, multi-platform (Linux, Mac OS, Windows, etc.), versatile,
and powerful program for statistical computing and graphics (http://www.r-project.org). This
course will focus on core basics of organizing, managing, and manipulating epidemiologic data;
basic epidemiologic applications; introduction to R programming; and basic R graphics.
Target audience
This course is intended for epidemiologists, medical epidemiologists, data analysts, and
demographers who want an introduction to the R language for epidemiologic applications.
Course prerequisites
Completion of one semester of epidemiology and one semester of bio/statistics.
ph251d_2012fall_syllabus.odt
p. 1 of 4
Created in LibreOffice!
Course schedule
Date | Wk | Topics | Lecturer
08/27/12 | 1 | Getting started with R | Tomás Aragón
09/03/12 | -- | Academic and Administrative Holiday | --
09/10/12 | 2 | Working with R data objects I (Vectors, Matrices, Arrays) – HW1 due | Tomás Aragón
09/17/12 | 3 | Working with R data objects II (Lists, Data Frames) | Tomás Aragón
09/24/12 | 4 | Managing epidemiologic data I (Entering, Editing, Transforming, Merging, Exporting) – HW2 due | Tomás Aragón
10/01/12 | 5 | Managing epidemiologic data II (Importing, Dates, Missing values, Regular expressions) | Tomás Aragón
10/08/12 | 6 | R programming I – HW3 due | Tomás Aragón
10/15/12 | 7 | Analyzing epidemiologic data (basic) | Tomás Aragón
10/22/12 | 8 | R programming II | Tomás Aragón
10/29/12 | 9 | Graphing epidemiologic data I | Michael Samuel
11/05/12 | 10 | Graphing epidemiologic data II (to be confirmed) | Michael Samuel
11/12/12 | -- | Academic and Administrative Holiday | --
11/19/12 | 11 | Analyzing epidemiologic data (regression) | Students
11/26/12 | 12 | Student Presentations | Students
12/03/12 | 13 | Student Presentations (last day of class) | Students
Course objectives
Upon completion of this course, participants will be able to:
• Use R as a scientific calculator and a functional spreadsheet;
• Enter, manage, and manipulate epidemiologic data in R;
• Conduct basic epidemiologic analyses in R;
• Graphically display epidemiologic data;
• Write basic R programs.
Course format
Lecture and computer demonstration. You are welcome to bring your laptop with R and RStudio
pre-installed.
Course enrollment and fee
UC Berkeley students should register for Public Health 251D. Non-registered students who want
to receive academic credit will need to register and pay the UC Extension fee (see
http://extension.berkeley.edu/info/ConcurrentOverview.html). Auditors are welcome if there is
seating space.
ph251d_2012fall_syllabus.odt
p. 2 of 4
Created in LibreOffice!
Course location and schedule
Schedule: Every Monday, 4:00 pm - 6:00 pm.
Location: 214 Haviland Hall
Course materials
• Applied Epidemiology Using R, by Tomás J. Aragón (required)
  Available at http://www.medepi.com. Updated chapters will appear weekly.
• Getting Started with RStudio, by John Verzani (highly recommended)
  Permalink: http://amzn.com/1449309038
• Epidemiology: An Introduction, by Kenneth J. Rothman (highly recommended)
  Permalink: http://amzn.com/0199754551
• An Introduction to R. Freely available at http://cran.r-project.org/doc/manuals/R-intro.pdf
  (also comes with the default installation)
Course web site
Documents will be posted at http://www.medepi.com; bSpace will be used for forums and secure documents.
Course requirements and evaluation
Grading and evaluation
For registered UC Berkeley/Extension students: Units: 2; Grading: Letter or S/U
Grading will be based on weekly attendance (10%), homework (20%), and a student project
(70%). A grade of Satisfactory (S) or Passed (P) requires work at a minimum level of B.
Student homework
Homework will involve (a) completing problems at the end of each chapter, and (b)
demonstrating the following skills:
1. Run the program file (filename1.r) using the 'source' command;
2. Demonstrate reading an ASCII data file (filename2.dat) to create a 'data frame';
3. Demonstrate simple data manipulation (e.g., variable transformation, recoding, etc.);
4. Demonstrate the use of calendar and Julian dates;
5. Conduct a simple analysis using existing functions (from R, colleagues, etc.);
6. Conduct a simple analysis demonstrating simple programming (e.g., a 'for' loop);
7. Conduct a simple analysis demonstrating an original function created by the student;
8. Create a simple graph with title, axis labels, and legend, and output to file;
9. Demonstrate the use of regular expressions;
10. Demonstrate the use of the 'sink' function to generate an output file.
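As a rough illustration of what several of these skills might look like together, here is a minimal sketch in base R. It is not taken from the course materials: the data set, the variable names (outbreak, exposed, ill), and the risk() function are invented for illustration, and temporary files stand in for filename2.dat and the output files. Skill 1 (the 'source' command) is not shown.

```r
# Hypothetical data file standing in for filename2.dat (written to a temp location)
dat_file <- tempfile(fileext = ".dat")
writeLines(c("id onset exposed ill",
             "1 2012-09-01 1 1",
             "2 2012-09-03 1 0",
             "3 2012-09-04 0 1",
             "4 2012-09-07 0 0"), dat_file)

# Skill 2: read an ASCII data file into a data frame
outbreak <- read.table(dat_file, header = TRUE, stringsAsFactors = FALSE)

# Skill 3: recode a 0/1 variable into a labeled factor
outbreak$exposed_f <- factor(outbreak$exposed, levels = c(0, 1),
                             labels = c("No", "Yes"))

# Skill 4: calendar dates and Julian dates (days since 1970-01-01)
outbreak$onset <- as.Date(outbreak$onset, format = "%Y-%m-%d")
outbreak$onset_julian <- as.numeric(julian(outbreak$onset))

# Skill 9: regular expression flagging onsets on days 01-04 of the month
outbreak$early <- grepl("-0[1-4]$", format(outbreak$onset))

# Skill 7: an original (illustrative) function: risk = cases / persons
risk <- function(ill) sum(ill) / length(ill)

# Skill 6: a 'for' loop computing the risk in each exposure group
risks <- numeric(0)
for (grp in levels(outbreak$exposed_f)) {
  risks[grp] <- risk(outbreak$ill[outbreak$exposed_f == grp])
}
risk_ratio <- unname(risks["Yes"] / risks["No"])

# Skill 8: a simple graph with title, axis labels, and legend, written to a file
plot_file <- tempfile(fileext = ".pdf")
bar_cols <- c("gray80", "gray40")
pdf(plot_file)
barplot(risks, col = bar_cols, ylim = c(0, 1),
        main = "Risk of illness by exposure",
        xlab = "Exposed", ylab = "Risk")
legend("topright", legend = names(risks), fill = bar_cols, title = "Exposed")
dev.off()

# Skill 10: send printed results to an output file with sink()
out_file <- tempfile(fileext = ".txt")
sink(out_file)
print(risks)
cat("Risk ratio:", risk_ratio, "\n")
sink()  # restore output to the console
```

Saving these lines to a script file and running it with source() would cover the first skill as well. With real homework data, the temporary files would be replaced by the files supplied for the assignment.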
Student project and presentation
• Prepare and submit a brief paper/article illustrating an epidemiologic method or topic
  using R. For examples, see short articles in the R Journal (http://journal.r-project.org).
• Students will present their paper during the last sessions of the course, and submit
  their paper on the last day.
Course free and open source software (required)
• Install R on your computer (visit http://cran.cnr.berkeley.edu/)
• Install RStudio on your computer (visit http://www.rstudio.org)
Creating LaTeX documents (optional)
I use LaTeX to create high-quality typeset documents. You can create LaTeX documents online at
http://www.scribtex.com/. RStudio can also be used to generate LaTeX documents, but LaTeX
must be installed on your computer.
Additional readings & resources
• Official R manuals. (Available from http://cran.r-project.org/manuals.html)
• Contributed R tutorials. (Available from http://cran.r-project.org/other-docs.html)
Related fall courses
This course is not a statistics course, although we review some statistical methods. The following
highly recommended course complements this one:
• PH 248 (3 units), Steve Selvin, Statistical/Computer Analysis Using R, Th 2pm-4pm, 330
  Evans Hall; CCN 76202
R for epidemiologic computing
Where do I get R?
• Download from UC Berkeley at http://cran.cnr.berkeley.edu/
• R manuals available at http://cran.r-project.org/manuals.html
• R tutorials available at http://cran.r-project.org/other-docs.html
How can I get help with R?
• We will set up a bSpace forum for the course.
• Subscribe to R mailing lists at http://www.r-project.org/mail.html