Course overview:
  Data Mining (Exploratory Data Analysis)
  Probability Theory + Stochastic Theory
  Statistics
  Estimation & Inferences
  Modeling & Prediction
ATM 315 Environmental Statistics Course 2014-01-28

SUGGESTED READING
R-Programming:
  The Art of R Programming, Norman Matloff (free e-book PDF http://it-ebooks.info/book/1734/) (The Art of R Programming.pdf)
  R in a Nutshell, Joseph Adler (O’Reilly)
ONLINE Tutorials:
  http://cran.r-project.org/doc/contrib/Verzani-SimpleR.pdf (Verzani-SimpleR.pdf)
  http://www.nceas.ucsb.edu/files/scicomp/Dloads/RProgramming/BestFirstRTutorial.pdf (BestFirstRTutorial.pdf)
More links:
  http://math.ucdenver.edu/RTutorial/RBookResources.html

SUGGESTED READING
Statistics:
  Statistical Methods in the Atmospheric Sciences (Daniel S. Wilks, 2nd ed., AP)
  Introduction to Probability and Statistics Using R, G. Jay Kerns
    http://cran.r-project.org/web/packages/IPSUR/vignettes/IPSUR.pdf
  Statistical Concepts in Environmental Sciences, Dr. David B. Stephenson (a course book, no updated link found)
    (http://www.spatial.maine.edu/~beard/Documents/basicstats.pdf)
  Collaborative Statistics, Barbara Illowsky, Susan Dean
    http://cnx.org/content/col10522/latest/
Files: IPSUR.pdf, basicstats.pdf, introductory_course_statistics.pdf

Course overview:
Probability Theory
  Laws of Probability
  The Frequentist's Interpretation of Probability
  Independence and Conditional Dependence
  Causal Reasoning
Exploratory Data Analysis
  Description of the data sample
  Mean, Median, Standard Deviation, Histograms, Boxplots, Quantiles
  Empirical Distributions (Probability Density Function)

Course overview:
  Data samples
  Sample size
  Independent, Identically Distributed Data
  Linear Transformation: Shifting and Rescaling
  Non-linear Transformation
  Paired Observations: Scatter Plot
  Covariance & Correlation
  Linear Regression
  Principal Component Analysis, Part I
  Logistic Regression
Mid-Term Exam 03/11/2014

Course overview:
  Data Mining (Exploratory Data Analysis)
    ?
  Probability Theory + Stochastic Theory
    ?
  Statistics
    ?
  Estimation & Inferences
    ?
  Modeling & Prediction
Give each connecting arrow a fitting verb that describes the relationship.

A first session with R-Studio (Windows 7)
  Memory window: list of objects currently in memory for use
  Multi-function window (for file browsing, plotting, software package management, …)
  Command-line window
ATM 315 Environmental Statistics Course 2014-01-23

Our first R-session: data types and objects
  example001.R

R: INSTALLING PACKAGES
  Packages are collections of extra functions (and data) for specific purposes
  Packages make R a universal toolbox
  We will install the package “prob”:
    Menu: Tools -> Install Packages …
    Type package name “prob”

R: BASIC PROBABILITY WITH R
  Note! This is not a valid command in R
  Put descriptive text into comment lines starting with #
  library(prob) does the same: it loads the package into the R session
  rolldie() is a function from the package prob

NOTES
  The winning streak of the Oakland Athletics in 2002 was 20 consecutive games.
  Assume it was a ‘fair game’, with all teams at an even level.
  What are the odds of witnessing such a winning streak?
  What makes it even more ‘unreal’?

Axioms and Laws of Probability
The Axioms of Probability set the mathematical foundation for the calculus!
We can measure the probability of events and compare them.
(1) The probability of any event is nonnegative
(2) The probability of the compound event is 1
(3) The probability that one or the other of two mutually exclusive events occurs is the sum of the two individual probabilities
Or:
  1) P(e) >= 0
  2) P(S) = 1 (S is the compound event, e.g. coin: S = (head OR tail))
  3) P(e1 OR e2) = P(e1) + P(e2) (e.g. throwing the die shows 1 OR 2)
NOTE: In Axiom (3) it is important to refer to “mutually exclusive” events!
Test yourself: What are mutually exclusive events?
  Tomorrow's weather forecast: It will rain (e1) or it won't rain (e2)
  Tomorrow's weather forecast: It will rain (e1) or it will snow (e2)

THE SAMPLE SPACE:
  The coin flipping game: ‘Head’ ‘Tail’
  The 6-sided die game: 1 2 3 4 5 6
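
The three axioms can be checked by hand on the two sample spaces above. The following is a minimal sketch in base R, not the course's example001.R and not using the prob package. It assumes a fair coin and a fair die (every outcome equally likely); the names coin, die, p_coin, p_die and prob_of are made up for this illustration.

```r
# Sample spaces from the slide, written as plain vectors (base R only).
coin <- c("Head", "Tail")
die  <- 1:6

# Assumption: fair coin and fair die, so each outcome is equally likely.
p_coin <- setNames(rep(1 / length(coin), length(coin)), coin)
p_die  <- setNames(rep(1 / length(die),  length(die)),  die)

# Hypothetical helper: probability of an event, given as a set of die faces.
prob_of <- function(event) sum(p_die[as.character(event)])

# Axiom 1: no outcome has a negative probability.
axiom1_ok <- all(p_coin >= 0) && all(p_die >= 0)

# Axiom 2: the probabilities over the whole sample space S sum to 1.
axiom2_ok <- isTRUE(all.equal(sum(p_coin), 1)) &&
             isTRUE(all.equal(sum(p_die), 1))

# Axiom 3: "die shows 1" and "die shows 2" are mutually exclusive,
# so P(1 OR 2) equals P(1) + P(2).
p_1_or_2  <- prob_of(c(1, 2))
axiom3_ok <- isTRUE(all.equal(p_1_or_2, prob_of(1) + prob_of(2)))

# Counter-example: A = {1, 2} and B = {2, 3} share the outcome 2,
# so they are NOT mutually exclusive and the simple sum overcounts.
A <- c(1, 2)
B <- c(2, 3)
p_A_or_B   <- prob_of(union(A, B))      # outcome 2 counted once
p_A_plus_B <- prob_of(A) + prob_of(B)   # outcome 2 counted twice

print(c(axiom1 = axiom1_ok, axiom2 = axiom2_ok, axiom3 = axiom3_ok))
print(c(P_A_or_B = p_A_or_B, P_A_plus_B = p_A_plus_B))
```

The last two numbers differ, which is why Axiom (3) has to be stated for mutually exclusive events only.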