Class-02

Course overview:
[Diagram: Data Mining (Exploratory Data Analysis); Probability Theory + Stochastic Theory; Statistics; Estimation & Inferences; Modeling & Prediction]
ATM 315 Environmental Statistics Course
2014-01-28
SUGGESTED READING

R-Programming:

The Art of R Programming – Norman Matloff
(free e-book PDF http://it-ebooks.info/book/1734/)

R in a Nutshell – Joseph Adler (O’Reilly)

ONLINE Tutorials:

http://cran.r-project.org/doc/contrib/Verzani-SimpleR.pdf (Verzani-SimpleR.pdf)

http://www.nceas.ucsb.edu/files/scicomp/Dloads/RProgramming/BestFirstRTutorial.pdf

More links: http://math.ucdenver.edu/RTutorial/RBookResources.html
SUGGESTED READING

Statistics:

Statistical Methods in the Atmospheric Sciences (Daniel S. Wilks, 2nd
ed., AP)

Introduction to Probability and Statistics Using R – G. Jay Kerns
http://cran.r-project.org/web/packages/IPSUR/vignettes/IPSUR.pdf

Statistical Concepts in Environmental Sciences – Dr. David B.
Stephenson (A Course Book – no updated link found)
(http://www.spatial.maine.edu/~beard/Documents/basicstats.pdf)

Collaborative Statistics - Barbara Illowsky, Susan Dean

http://cnx.org/content/col10522/latest/
Course overview:

Probability Theory
• Laws of Probability
• The Frequentist’s Interpretation of Probability
• Independence and Conditional Dependence
• Causal Reasoning

Exploratory Data Analysis

Description of the data sample
• Mean, Median, Standard Deviation, Histograms, Boxplots, Quantiles
• Empirical Distributions (Probability Density Function)

Data samples
• Sample size
• Independent, Identically Distributed Data
• Linear Transformation: Shifting and Rescaling
• Non-linear transformation

Paired Observations:
• Scatter Plot
• Covariance & Correlation
• Linear Regression
• Principal Component Analysis – Part I
• Logistic Regression

Mid-Term Exam: 03/11/2014
Course overview:
[Diagram: Data Mining (Exploratory Data Analysis); Probability Theory + Stochastic Theory; Statistics; Estimation & Inferences; Modeling & Prediction; each connecting arrow is marked “?”]
Give a fitting verb to each connecting arrow
that describes the relationship it stands for.
A first session with R-Studio (Windows 7)
Memory window: list of objects currently in memory for use
Multi-function window (for file browsing, plotting, software package management, …)
Command-line window
ATM 315 Environmental Statistics Course, 2014-01-23
Our first R-session: data types and objects

example001.R
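
The contents of example001.R are not reproduced in these notes. As a stand-in, here is a minimal sketch of basic R data types and objects; all variable names and values (temperature, station, is_raining, temps) are made up for illustration and are not taken from the course script.

```r
# Hypothetical example objects (not from example001.R)
temperature <- 21.5        # a numeric value
station     <- "Albany"    # a character string
is_raining  <- FALSE       # a logical value

class(temperature)   # "numeric"
class(station)       # "character"
class(is_raining)    # "logical"

# A vector collects several values of the same type
temps <- c(18.2, 21.5, 19.9)
length(temps)        # 3
mean(temps)          # arithmetic mean of the three values

# ls() lists the objects currently held in memory
ls()
```

In R-Studio the objects created this way also show up in the memory window.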
R: INSTALLING PACKAGES


Packages are collections of extra functions (and data) for specific
purposes
Packages make R a universal toolbox
We will install the package “prob”:
Menu: Tools -> Install Packages …
Type the package name “prob”
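
The same installation can also be done from the command-line window. A minimal sketch, assuming an internet connection and that the package “prob” is available from your CRAN mirror:

```r
# Install "prob" only if it is not installed yet
if (!requireNamespace("prob", quietly = TRUE)) {
  install.packages("prob")
}

# Load the package into the current R session
library(prob)
```

Installing is needed only once per computer; loading with library() is needed in every new R session.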
R: BASIC PROBABILITY WITH R
Note! This is not a valid command in R.
Put descriptive text into comment lines starting with #.
library(prob) does the same: it loads the package into the R session.
rolldie() is a function from the package prob.
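
The script shown on the slide is not reproduced here. A short sketch of the same ideas, assuming the package “prob” is already installed; the object names one_die, two_dice and sevens are made up for illustration:

```r
# Lines starting with # are comments; R ignores them.
library(prob)            # load the package into the R session

one_die  <- rolldie(1)   # all outcomes of one 6-sided die
two_dice <- rolldie(2)   # all ordered outcomes of two dice (columns X1, X2)

nrow(one_die)            # 6
nrow(two_dice)           # 36

# Outcomes where the two dice add up to 7
sevens <- subset(two_dice, X1 + X2 == 7)
nrow(sevens) / nrow(two_dice)   # 6 of 36 equally likely outcomes
```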
NOTES
In 2002 the Oakland Athletics won 20 consecutive games.
Assume every game was a ‘fair game’, with all teams at an even level.
What are the odds of witnessing such a winning streak?
What makes it even more ‘unreal’?
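
A rough sketch of the first question, under two assumptions: each game is won with probability 0.5 (the ‘fair game’ above) and the games are independent of each other (an added assumption). The variable names are made up for illustration.

```r
n_games <- 20      # length of the winning streak
p_win   <- 0.5     # assumed chance of winning one fair game

# Independent games: multiply the single-game probabilities
p_streak <- p_win^n_games
p_streak                   # about 9.5e-07

# Odds against the streak, i.e. "x to 1"
odds_against <- (1 - p_streak) / p_streak
odds_against               # 1048575
```

This is the chance that one particular run of 20 games is won completely. It is not the chance of seeing such a streak somewhere within a whole season.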
Axioms and Laws of Probability
The Axioms of Probability set the mathematical foundation for the calculus of probability!
With them we can measure the probability of events and compare them.
(1) The probability of any event is nonnegative
(2) The probability of the compound event (the whole sample space) is 1
(3) The probability that one of two mutually exclusive events occurs is
the sum of the two individual probabilities
Or:
1) P(e) >= 0
2) P(S) = 1 (S is the compound event, e.g. for a coin: S = (head OR tail))
3) P(e1 OR e2) = P(e1) + P(e2) (e.g. throwing the die shows 1 OR 2)
NOTE: In Axiom (3) it is important to refer to “mutually exclusive”
events! Test yourself: Which of these are mutually exclusive events?
Tomorrow’s weather forecast: It will rain (e1) or it won’t rain (e2)
Tomorrow’s weather forecast: It will rain (e1) or it will snow (e2)
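
The three axioms can be checked numerically. A minimal sketch in base R for a fair 6-sided die; the fair die and the events “even” and “greater than 3” are chosen here for illustration only.

```r
die <- 1:6
p   <- rep(1 / 6, 6)       # assumed fair die: each face has probability 1/6
names(p) <- die

# Axiom 1: no probability is negative
all(p >= 0)                       # TRUE

# Axiom 2: the probabilities of the whole sample space add up to 1
isTRUE(all.equal(sum(p), 1))      # TRUE

# Axiom 3: "shows 1" and "shows 2" are mutually exclusive, so add them
p_1_or_2 <- unname(p["1"] + p["2"])
p_1_or_2                          # 1/3

# Counter-example: "even" and "greater than 3" overlap (4 and 6)
even <- die %% 2 == 0
big  <- die > 3
p_union <- sum(p[even | big])          # faces 2, 4, 5, 6 -> 4/6
p_naive <- sum(p[even]) + sum(p[big])  # 3/6 + 3/6 = 1, too large
```

Simple addition overshoots whenever the two events share outcomes, which is why Axiom (3) is restricted to mutually exclusive events.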
THE SAMPLE SPACE:
The coin flipping game: ‘Head’, ‘Tail’
The 6-sided die game: 1, 2, 3, 4, 5, 6
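
Both sample spaces can be written as R vectors and used to play the games. A minimal sketch in base R; the object names, the seed and the number of trials are arbitrary choices for illustration.

```r
coin <- c("Head", "Tail")   # sample space of the coin flipping game
die  <- 1:6                 # sample space of the 6-sided die game

set.seed(1)                 # arbitrary seed, only to make the run repeatable

# Play each game 10 times; every outcome is drawn from the sample space
flips <- sample(coin, size = 10, replace = TRUE)
rolls <- sample(die,  size = 10, replace = TRUE)

table(flips)                # how often each coin outcome occurred
table(rolls)                # how often each die face occurred

# No outcome can fall outside the sample space
all(flips %in% coin)        # TRUE
all(rolls %in% die)         # TRUE
```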