STAT 565 Computer Intensive Statistics

Professor: Stephen S. Lee, Rm 412 Brink Hall, (208) 885-7701, stevel@uidaho.edu
Course Objective: Computational speed and power have revolutionized the field of statistics in both
theoretical research and applications. This course will provide you with some of the computational
techniques used by statistical researchers and practitioners beyond standard statistical software packages.
The goal is to expand your statistical toolbox through computational and simulation methods.
Additionally, the material will teach you how to approach statistical problems from a computational
perspective as a complement to the theoretical training you receive in mathematical statistics courses and
work. It emphasizes the role of computation as a fundamental tool for discovery in data analysis, for
statistical inference, and for the development of statistical theory and methods.
Prereq: Stat 451, Stat 452, Math 330, and computer programming experience
Office hours: MWF 11:40 am – 12:30 pm, or by appointment
Textbooks:
1. Christian Robert & George Casella: Monte Carlo Statistical Methods. Springer.
2. Maria Rizzo: Statistical Computing with R. Chapman & Hall.
3. Givens & Hoeting: Computational Statistics. Wiley.
Software:
The main analysis software used in the course is R. Information for learning R, including links for
downloading, can be obtained at http://www.r-project.org/
Topics:
Introduction to R
Random Number Generation
Monte Carlo Integration
Monte Carlo Methods in Inference
Markov Chain Monte Carlo
Bootstrap and Jackknife
Probability Density Estimation
Numerical Methods
Optimization and EM Algorithm
Permutation Tests
Visualization of Multivariate Data
Grading: There are approximately 6 homework assignments and a term project. Two-thirds (66%) of the
grade comes from homework, and one-third (34%) from the term project. All homework and the project are to be
submitted as hard copies, with the R programs posted on your student webpage. This will allow me
to run your R code to check its accuracy. Please e-mail your student webpage address to me
ASAP.
Homework: Homework assignments will be available on the course web page throughout the
semester as announced in class. The exact due date for each homework will be given, and you are expected to
meet it. No late homework will be accepted.
You are encouraged to discuss homework problems with other students, but you should write
your solutions independently. Your homework solutions for problems requiring computing must include
concise computer output that is properly edited, labeled, and neatly displayed, along with an appendix containing a
documented, clearly presented version of your code.
Any plagiarism will be dealt with according to University policy.
Project:
As part of the course you will be asked to do an individual term project. The project is for you to identify
an article in the scientific/statistical literature that involves intensive computing of a statistical nature, and
to design, conduct, and report on a similar study using R. In other words, you will replicate some
portions of the study published in the chosen article and extend the published results in some way.
I will post a list of potential papers on the course homepage.
The project grade will be based on the quality of the oral presentation and the written report.
A preliminary report should be placed on your student webpage on or before Nov 10 (12-th week);
otherwise 10% will be deducted for each week it is late. (Note: the project is worth 34% of the course grade.)
The oral presentation is an in-class 25-minute presentation right after your return from Thanksgiving
break (i.e., during the last two weeks of classes). You are required to attend all project presentations, and
peer-evaluate your fellow students’ presentations.
The written report is an 8-10 page paper in journal-style format (i.e., a maximum of 10 pages of text, not
counting figures, tables, and bibliography), in 12 pt font, single-spaced, with one-inch margins, and with figures
and tables clearly presented and labeled at the end of the paper. The written report is due at 5:00 p.m. on
Wednesday, December 13. I recommend that you finish and turn in the report prior to the due date; late
projects will NOT be accepted!
You may wish to use this opportunity to learn LaTeX to type the manuscript. LaTeX is a professional
way to produce scientific text documents with nice mathematical symbols, styles, and figures. To
encourage you to learn and use LaTeX if you do not already know it, I offer a 10% bonus if
you write your report in LaTeX.
Project Timeline:
1. Sept 22 (5-th week) --- Find and describe an article in the scientific/statistical literature that uses
“Computer Intensive Statistics” methods, or propose your own research topic
2. Oct 13 (8-th week) --- Design a plan of study using R programming
3. Nov 10 (12-th week) --- Preliminary report posted on your web page. (10% off if not done by Nov 10)
4. Nov 27/Dec 6 (15-th and 16-th weeks) --- Oral report on web and class presentations
5. Dec 13 (17-th week) --- Written report due at 5:00 p.m. on Wednesday, December 13. (Bonus of
10% if using LaTeX to write your report). Late projects will NOT be accepted!
Students' Webpages:
Each student will prepare a web page for submitting homework and presenting the project. If you
do not yet have your student webpage set up, please do so soon. Help is available through UI
Information Technology Services at http://www.its.uidaho.edu/. You may also call the help desk at (208)
885-HELP (4357) or e-mail helpdesk@uidaho.edu.
Disclaimer: This syllabus is intended to provide a comprehensive overview of the course, and no one can
claim any rights from it. Topic coverage, the relative weights of the homework and project grades, and dates
may change; official announcements are always made in class and online.