STAT 6190: Introduction to Mathematical Statistics
Fall 2015

Instructor: Chao DU
E-mail: cd2wb@virginia.edu
Phone: 434-924-3014
Office: Halsey 107
Class Schedule: TuTh 11:00AM - 12:15PM
Location: Physics Bldg 210
Office Hour: Wed 1:00PM-3:00PM
Grader: Yin Zhang (yz4an@virginia.edu)

Course Description:

In everyday life, we often say something like “if I toss a coin, the chance of obtaining a head is 50%” or “I think I have an 80% chance of passing the coming final exam.” While we invoke the concept of chance or randomness in both statements, we may fail to realize a fundamental difference between them: in the first case, we could toss the coin as many times as we like and actually obtain different outcomes; in the second case, we can attempt this particular final exam only once: either we fail or we pass. How, then, can we apply the same concept to two such different scenarios?

To ensure financial success, Hollywood studios often hold a few pre-screenings before releasing a new film. The reception at the pre-screenings is then used to predict whether the film will be a blockbuster or a financial disaster. Such a method, if employed properly, can actually work very well. But what is the reasoning behind such a method? When does this method actually work? Is there any opportunity for improvement?

You may have already been exposed to similar examples in real life or during a college-level statistics class. You may also remember some techniques for making statistical estimates and calculating error bounds. Such knowledge is certainly necessary for becoming a professional statistician and useful for solving a large class of real-world problems, but it is not sufficient to give you further insight into the aforementioned problems.
To take one step further, we need to develop the knowledge and mindset that professional statisticians use to recognize randomness in real problems, to understand and propose corresponding abstract models, and to analyze, evaluate, and develop empirical methods. This course is designed to help first-year graduate students achieve this goal. For this purpose, we will venture into the theoretical foundation of statistical science, that is, the theory of probability.

Prerequisites:

You are expected to have completed college-level classes in calculus (including multivariate calculus) and linear algebra. In particular, a good level of proficiency in the following is needed: using basic set operations; calculating derivatives and integrals of both single-variable and multivariable functions; applying Taylor series expansions to approximate exponential and logarithm functions; and performing arithmetic operations on vectors and matrices. Knowledge from introductory-level courses in probability theory and statistical methods is highly recommended but not absolutely necessary. Nonetheless, you may find it a good idea to review some basic material in probability theory (see the Useful References section for a recommended textbook) during the first few weeks. You do not need knowledge of real analysis or complex analysis to take this class. If you feel your preparation is particularly weak, please consult with me about your concerns.

Course Objectives:

1. Understand the essential concepts of “randomness” in the theory of probability. Relate the abstract concepts to real-world examples. Identify the underlying assumptions and components regarding “randomness” in empirical methods.
2. Master the major theoretical models and tools that describe randomness. Apply such tools to address theoretical problems and evaluate empirical models.
3. Study the essential characteristics of, and relationships between, the major elementary “modules” in probability and statistics that can be used to construct complex models. Examine how these elementary modules arise from practical concerns and how they might be used to construct models and solve real problems.
4. Practice commonly used techniques that combine the aforementioned “modules”. Apply the knowledge from this class to study and describe the properties of complicated models.
5. Master the probability limit theorems that link the randomness of individuals to the deterministic behavior of the ensemble. Apply such theories to understand the limiting behavior of probability models and evaluate the effectiveness of empirical tools.
6. Master key mathematical tools that are essential for advanced statistical topics.

Course Assessment:

The final numerical grades will be calculated based on the major factors described below. Letter grades will be assigned based on the overall distribution of numerical grades. No fixed threshold will be set in advance.

1. Class participation (10%). You are expected to attend all lectures. The lecture materials will be drawn from a wide spectrum of sources and, in general, will not follow the textbook. In addition, the in-class discussion will help you master the tools learned in this class and obtain an intuitive understanding of the theory beyond mathematical derivations.
2. Homework (25%). There will be around 8 written assignments. Each assignment will be weighted equally towards the final grade. Homework is designed to help you master the theoretical tools and learn to apply such knowledge to relevant questions outside the focus of this class. The grading will be generous, but you are strongly advised to work on the problems independently before discussing them with fellow students. Solutions will be provided after the due date to help you evaluate your own understanding.
3. Exams (65%).
Both the mid-term (25%) and final (40%) exams aim to test your mastery of the knowledge learned in class and your ability to apply such knowledge to solve theoretical questions, as well as to analyze and evaluate empirical problems. While the mid-term exam will focus on the application of a single skill or theorem, the final exam will gauge your ability to integrate different aspects of this class to solve complicated problems.

Textbook:

George Casella and Roger L. Berger. 2001. Statistical Inference, Duxbury Thomson Learning, 2nd ed.

Our lectures will not follow the structure of the textbook. Still, this textbook contains a wide range of examples and problems that can complement the lectures.

Useful References:

No set of lectures can cover all the knowledge you will need in your career. The following lists are meant to provide you with a starter’s reference list with the aim to:
1) Help you review some prerequisite materials.
2) Help you master the class materials.
3) Provide references that go beyond the scope of this class.
Choose among them according to your needs.

College-level textbook on probability theory:
Sheldon M. Ross. Introduction to Probability Models. Academic Press. (Any edition)
In case you have never been exposed to probability theory, it would be a good idea to skim through the first 3 chapters of this book during the first 4 weeks of the class. This book is very well written, easy to read, and contains rich examples on probability theory.

Advanced textbooks on probability theory (with rigorous math)
Jeffrey S. Rosenthal. 2006. A First Look at Rigorous Probability Theory. World Scientific Publishing Company.
Patrick Billingsley. 1995. Probability and Measure. John Wiley & Sons.
In this class, we will not provide rigorous proofs of every theorem we encounter. Thus, if you love math or just want to see more rigorous versions of the proofs, you may check the books listed above.
These books are also very good textbooks and references for your future study.

Classical Probability Textbooks
William Feller. 1951. An Introduction to Probability Theory and Its Applications. Volume I. John Wiley & Sons.
William Feller. 1971. An Introduction to Probability Theory and Its Applications: Volume II. John Wiley & Sons.
These two volumes are considered “classics” in probability theory. They are not well suited to a first-time learner but contain invaluable insights into the theory of probability and its applications.

Books on Limit Theorems
Valentin V. Petrov. 1995. Limit Theorems of Probability Theory: Sequences of Independent Random Variables. Oxford University Press.
A.W. van der Vaart. 1998. Asymptotic Statistics. Cambridge University Press.

Books on Stochastic Processes
Samuel Karlin and Howard M. Taylor. 1975. A First Course in Stochastic Processes. Academic Press.
Samuel Karlin and Howard M. Taylor. 1981. A Second Course in Stochastic Processes. Academic Press.
The four books listed above focus on particular aspects of probability theory. Take a look if you want to venture into these fields.

Class Policies:

Class Participation
You are expected to attend all lectures and participate actively in class discussion.

Homework
In order to be graded and counted towards the final grade, each assignment must be submitted on time. Extensions of assignment deadlines will be granted only in the most exceptional circumstances. Any extension request must be made to the course instructor at least 24 hours before the due date.

Exams
Both mid-term and final exams are closed-book, but you will be allowed to bring in two/three double-sided A4 sheets of written (NO PHOTOCOPY & NO PRINTING) notes. Besides helping you pass the exam, the process of preparing notes also helps you organize the knowledge we learn in this class. For this reason, you must prepare your own notes, and PHOTOCOPYING and PRINTING are NOT allowed.
The mid-term exam is on Oct 20th, 11:00am-12:15pm. The final exam will take place on December 17th, from 9:00am to 12:00pm.

Honor Policy:

As the only true way to acquire knowledge is through your own hard work, it is of the utmost importance that all submitted work, such as homework assignments and exam papers, reflects your independent efforts made during the learning process. Hence, the following honor policy will be enforced throughout the semester. Any breach of the policy will be reported directly to the UVa Honor Committee.

Although students may discuss homework assignments in small groups, each student must finish his or her assignments independently, based on his or her own understanding. Copying others’ work will not be tolerated.

Students must not consult any external resource other than the allowed sheets of written notes during the mid-term and final exams.

Class Schedule:

The topics covered in this class can be roughly grouped into the following areas:

1. Fundamental concepts: events, probability space, random variable, independence, and conditional probability.
2. Major theoretical tools: distribution, expectation and conditional expectation, moment generating function, characteristic function, and probability inequalities.
3. Elementary “modules”: major distribution families including Bernoulli, Binomial, Geometric, Poisson, Normal, Exponential, Gamma, Beta, and multivariate Normal; order statistics; location and scale families; and the exponential family.
4. From elementary “modules” to complex models: transformation of random variables, convolution, mixture models, and hierarchical models.
5. Asymptotic theory: sums of independent random variables, different types of convergence, the weak and strong laws of large numbers, and the central limit theorem.

Tentative Weekly Schedule

Week 1 (08/23-08/29)
Concepts and theories: What is probability? Outcomes, events, and sigma-algebras. Probability triple. Probability measure.
Useful Skills: Set manipulation; understanding sets in terms of probability events.

Week 2 (08/30-09/05)
Concepts and theories: Axioms of probability. Limits of events. Random variables. Distributions.
Useful Skills: Apply the axioms of probability to derive probability theorems.

Week 3 (09/06-09/12)
Foundation concepts: Types of distributions. pdf, pmf, and CDF. Independence and conditional probability.
Useful Skills: Calculate probabilities using distribution functions, independence, and the rules of conditional probability.

Week 4 (09/13-09/19)
Concepts and theories: Transformation of random variables. Jacobian matrix. Parametric families. Bernoulli, Binomial, Geometric, and Negative Binomial distributions.
Useful Skills: Calculate the distribution of transformed random variables. Apply the properties of the major distribution families discussed.

Week 5 (09/20-09/26)
Concepts and theories: Uniform, Exponential, Gamma, and Normal distributions. Beta and Poisson distributions. Probability integral transformation. Location and scale families.
Useful Skills: Apply the properties of the major distribution families discussed. Apply the properties of the Gamma and Beta functions.

Week 6 (09/27-10/03)
Concepts and theories: Order statistics. Expectation. Variance and covariance.
Useful Skills: Calculate expectations using linearity. Apply the tool of expectation to solve actual problems.

Week 7 (10/04-10/10)
Concepts and theories: Conditional probability and conditional expectation. Hierarchical models.
Useful Skills: Apply the rules of conditional expectation.
No class on Oct 6th (Reading Days).

Week 8 (10/11-10/17)
Concepts and theories: Moment generating function and characteristic function.
Useful Skills: Calculate moments using the moment generating function.
Mid-term review on Oct 15th.

Week 9 (10/18-10/24)
Concepts and theories: Multivariate random variables. Covariance matrix.
Useful Skills: Calculate the covariance matrix of linearly transformed random variables.
Mid-term on Oct 20th.

Week 10 (10/25-10/31)
Concepts and theories: Multivariate Normal distribution. Sample mean and sample variance of a Normal random vector. Conditional Normal distribution. Exponential family.
Useful Skills: Calculate the distribution function of a linearly transformed Normal vector. Calculate the distribution of a conditional Normal distribution.

Week 11 (11/01-11/07)
Concepts and theories: Probability inequalities. Sample mean and sample variance for general random variables.
Useful Skills: Apply probability inequalities and calculate the moments of the sample mean and sample variance.

Week 12 (11/08-11/14)
Concepts and theories: Types of convergence. Weak and strong laws of large numbers.
Useful Skills: Apply the techniques used in proving the weak and strong laws of large numbers.

Week 13 (11/15-11/21)
Concepts and theories: Continuous mapping theorem. Slutsky's theorem. Central limit theorem.
Useful Skills: Apply these theorems to study the limits of complicated random variables.

Week 14 (11/22-11/28)
Concepts and theories: Introduction to limit theorems for multivariate random variables.
No class on November 26th (Thanksgiving recess).

Week 15 (11/29-12/05)
Concepts and theories: Delta method. Central limit theorem beyond the i.i.d. case.
Useful Skills: Apply the delta method to study the limiting distribution of complicated random variables.

Week 16 (12/06-12/12)
Review. No class on Dec 10th.

Week 17 (12/13-12/20)
Final Exam on Dec 17th, 9am-12pm.