Data analysis Ben Graham October 5, 2015 MA930, University of Warwick

Data Analysis
I George Box: All models are wrong, some models are useful.
I Corollary: This model is wrong, therefore it is useful.
I Peter Norvig (not): All models are wrong, and increasingly you
can succeed without them.
Statistics
I What is a statistic?
I Why is this course not called statistics?
I Data
I Designed experiments
I Observed data
I Big data
I Small data
I Summary statistics (mean, median, mode, min, max, ...)
I Graphs
I Probabilistic models
I Seek the model parameters
I Frequentist statistics
I Bayesian statistics
Key principles of statistics
I Taking averages is good.
I Correlation does not imply causation (missing covariates,
Simpson's paradox).
I Interpolation good; extrapolation bad.
I Can you play catch?
Machine learning *
I Supervised learning
I Learning to approximate high dimensional functions
I Boring? Includes a huge range of problems.
I Neural networks, decision trees, random forests, support vector
machines, bagging and boosting.
I Unsupervised learning
I Clustering
I PCA, LLE, RBMs
I Dimensionality reduction
I Simplify correlation structures of data?
Problems with English
From the FT:
Linda is single, outspoken, and deeply engaged in social
issues. Which of the following is more likely?
1. That Linda is a bank manager.
2. That Linda is a bank manager who is an active feminist.
Set theory
Definition 1.1.1. The set, $S$, of all possible outcomes of a particular
experiment is called the sample space for the experiment.
I Coin toss
I Sequence of coin tosses
I Two children, at least one of them a boy.
I Waiting time at a red traffic light.
I Waiting time passing a traffic light.
Events
Definition 1.1.2. An event is any collection of possible outcomes of
an experiment, that is, any subset of $S$.
Includes $\emptyset$, $\{x\}$ for every $x \in S$, and $S$.
How many events when you
I toss a coin
I roll a die
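Since an event is any subset of $S$, a finite sample space with $|S|$ outcomes has $2^{|S|}$ events. A minimal sketch that enumerates them (the helper name `all_events` is illustrative, not from the course):

```python
from itertools import chain, combinations


def all_events(sample_space):
    """Return every subset of a finite sample space as a frozenset."""
    outcomes = list(sample_space)
    subsets = chain.from_iterable(
        combinations(outcomes, k) for k in range(len(outcomes) + 1)
    )
    return [frozenset(s) for s in subsets]


coin_events = all_events({"H", "T"})
die_events = all_events(range(1, 7))

print(len(coin_events))  # 2**2 = 4
print(len(die_events))   # 2**6 = 64
```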
De Morgan's Laws
$(A \cup B)^c = A^c \cap B^c$
$(A \cap B)^c = A^c \cup B^c$
Proof ?
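An exhaustive check on a small sample space is not a proof, but it is a quick sanity test of both laws. The sketch below assumes the die sample space $S = \{1, \ldots, 6\}$ and tries every pair of events:

```python
from itertools import combinations

S = frozenset(range(1, 7))


def subsets(s):
    """All subsets of a finite set, as frozensets."""
    items = sorted(s)
    return [
        frozenset(c)
        for k in range(len(items) + 1)
        for c in combinations(items, k)
    ]


def complement(a):
    """Complement of the event a relative to the sample space S."""
    return S - a


events = subsets(S)
de_morgan_holds = all(
    complement(a | b) == complement(a) & complement(b)
    and complement(a & b) == complement(a) | complement(b)
    for a in events
    for b in events
)
print(de_morgan_holds)
```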
Disjoint events
Definition 1.1.5. Two events $A$ and $B$ are disjoint if $A \cap B = \emptyset$.
Definition 1.1.6. If $A_1, A_2, \ldots$ are a collection of pairwise disjoint
events, and if $\bigcup_i A_i = S$, then $A_1, A_2, \ldots$ form a partition of $S$.
Axioms of Probability
Def 1.2.1 A collection of events is called a σ-algebra, denoted $\mathcal{B}$, if
1. $\emptyset \in \mathcal{B}$
2. If $A \in \mathcal{B}$ then $A^c \in \mathcal{B}$ (closed under complements)
3. If $A_1, A_2, \ldots \in \mathcal{B}$, then $\bigcup_{i=1}^{\infty} A_i \in \mathcal{B}$ (closed under countable unions).
Examples
I Toss a coin
I Roll a die
I Roll a die to see if you get a 6.
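For a finite sample space the three conditions can be checked directly. The sketch below (the function name `is_sigma_algebra` is illustrative) tests the "roll a die to see if you get a 6" collection. On a finite space, closure under pairwise unions is enough to give closure under countable unions.

```python
from itertools import combinations


def is_sigma_algebra(sample_space, collection):
    """Check the sigma-algebra conditions on a finite sample space."""
    S = frozenset(sample_space)
    B = {frozenset(a) for a in collection}
    # 1. The empty set belongs to the collection.
    if frozenset() not in B:
        return False
    # 2. Closed under complements.
    if any(S - a not in B for a in B):
        return False
    # 3. Closed under unions (pairwise suffices when B is finite).
    return all(a | b in B for a, b in combinations(B, 2))


S = range(1, 7)
six_or_not = [set(), {6}, {1, 2, 3, 4, 5}, set(S)]

print(is_sigma_algebra(S, six_or_not))
print(is_sigma_algebra(S, [set(), {6}, set(S)]))  # complement of {6} missing
```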
Probability space
Def 1.2.4 Given $S$ and $\mathcal{B}$, a probability function is a function
$P : \mathcal{B} \to [0, 1]$ s.t.
1. $P(A) \ge 0$ for all $A \in \mathcal{B}$
2. $P(S) = 1$
3. If $A_1, A_2, \ldots \in \mathcal{B}$ are pairwise disjoint, then
$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_i P(A_i)$.
Examples
I Toss a coin
I Roll a die to see if you get a 6.
I Circle $\{(x, y) : (x - 0.5)^2 + (y - 0.5)^2 \le 1\}$ in the unit square $[0, 1]^2$.
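If $P$ is taken to be area on the unit square (an assumption, since the slide does not say which measure is meant), the probability of the disc can be estimated by sampling uniform points. Note that with right-hand side 1 as written, every point of the unit square satisfies the inequality, because the farthest corner is at squared distance 0.5 from the centre. The second call uses squared radius 0.25 purely as an illustrative alternative, for which the exact answer is $\pi/4$.

```python
import math
import random


def disc_probability(radius_squared, n_samples=100_000, seed=0):
    """Monte Carlo estimate of the area of the disc centred at (0.5, 0.5),
    restricted to the unit square, under the uniform distribution."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if (x - 0.5) ** 2 + (y - 0.5) ** 2 <= radius_squared:
            hits += 1
    return hits / n_samples


print(disc_probability(1.0))                 # the disc covers the whole square
print(disc_probability(0.25), math.pi / 4)   # estimate vs exact area
```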
National Lottery counting
I There are $49! = 1 \times 2 \times \cdots \times 48 \times 49$ ways to pick 49 balls in
order (without replacement).
I If we only pick 6 balls, there are
$\frac{49 \times 48 \times \cdots \times 44}{6 \times 5 \times \cdots \times 1}$
possibilities.
Def 1.2.17 Binomial coefficients
$\binom{n}{r} = n \text{ choose } r = \frac{n!}{r!\,(n - r)!}$
ways of picking $r$ objects from $n$ objects.
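The lottery count is the binomial coefficient with $n = 49$ and $r = 6$. A short check using the standard library (`math.perm` and `math.comb` need Python 3.8 or later):

```python
import math

# Ordered draws of 6 balls from 49: 49 * 48 * ... * 44
ordered_draws = math.perm(49, 6)

# Each unordered selection of 6 balls appears 6! times among the ordered draws.
tickets = ordered_draws // math.factorial(6)

# The same number from the factorial formula n! / (r! (n - r)!)
by_formula = math.factorial(49) // (math.factorial(6) * math.factorial(43))

print(tickets)  # 13983816
```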
Conditional probability
Def 1.3.2
I Events $A, B \in \mathcal{B}$.
I $P(B) > 0$.
I The conditional probability of $A$ given $B$ is
$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$
$P(\cdot \mid B)$ satisfies the axioms for being a probability measure!
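As a worked instance of the definition, take the earlier "two children, at least one of them a boy" sample space. The sketch assumes the four ordered outcomes are equally likely; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Outcomes are (elder, younger); assumed equally likely.
sample_space = list(product("BG", repeat=2))


def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    favourable = sum(1 for s in sample_space if event(s))
    return Fraction(favourable, len(sample_space))


def conditional(a, b):
    """P(A | B) = P(A and B) / P(B), defined only when P(B) > 0."""
    p_b = prob(b)
    if p_b == 0:
        raise ValueError("P(B) must be positive")
    return prob(lambda s: a(s) and b(s)) / p_b


def both_boys(s):
    return s == ("B", "B")


def at_least_one_boy(s):
    return "B" in s


print(conditional(both_boys, at_least_one_boy))  # 1/3
```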
Bayes Rule
By the definition of conditional probability:
$P(A \mid B) = P(B \mid A) \frac{P(A)}{P(B)}$
Theorem 1.3.5:
I Let $A_1, A_2, \ldots$ partition the sample space $S$,
I Let $B$ be any event ($P(B) > 0$),
$P(A_i \mid B) = \frac{P(B \mid A_i) P(A_i)}{\sum_{j=1}^{\infty} P(B \mid A_j) P(A_j)}$
Or if $A_1 = A$, $A_2 = A^c$,
$P(A \mid B) = \frac{P(B \mid A) P(A)}{P(B \mid A) P(A) + P(B \mid A^c) P(A^c)}$
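A minimal sketch of Theorem 1.3.5 for a finite partition. The numbers are made up for illustration only: a prior $P(A) = 1/100$, with $P(B \mid A) = 9/10$ and $P(B \mid A^c) = 1/10$.

```python
from fractions import Fraction


def posterior(priors, likelihoods):
    """Return P(A_i | B) for each i, given P(A_i) and P(B | A_i)
    over a finite partition A_1, ..., A_k."""
    joint = [lik * prior for lik, prior in zip(likelihoods, priors)]
    total = sum(joint)  # P(B), by the law of total probability
    return [j / total for j in joint]


# Hypothetical values for the two-set partition {A, A^c}.
priors = [Fraction(1, 100), Fraction(99, 100)]
likelihoods = [Fraction(9, 10), Fraction(1, 10)]

post = posterior(priors, likelihoods)
print(post[0])  # 1/12
```

Even with a likelihood of 9/10, the posterior stays small because the prior is small.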
Independence
Def 1.3.7: Two events $A$ and $B$ are independent if
$P(A \cap B) = P(A) P(B)$.
Can an event be independent of itself?
Is $A^c$ independent of $B$?
Def 1.3.12: A collection of events $A_1, A_2, \ldots$ is independent if for
every $n$ element subset $A_{i_1}, \ldots, A_{i_n}$
$P\left(\bigcap_{j=1}^{n} A_{i_j}\right) = \prod_{j=1}^{n} P\left(A_{i_j}\right)$.
Do not confuse this with pairwise independence!
Do not think about pairwise independence!!!
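A standard example shows why the two notions differ. Toss two fair coins and let A be "first toss is H", B be "second toss is H", and C be "both tosses agree". The sketch below assumes equally likely outcomes and checks Def 1.3.12 over every subcollection of size at least two.

```python
from fractions import Fraction
from itertools import combinations, product

sample_space = set(product("HT", repeat=2))


def prob(event):
    """Probability of an event (a set of outcomes), outcomes equally likely."""
    return Fraction(len(event), len(sample_space))


A = {s for s in sample_space if s[0] == "H"}   # first toss is H
B = {s for s in sample_space if s[1] == "H"}   # second toss is H
C = {s for s in sample_space if s[0] == s[1]}  # both tosses agree


def is_independent(events):
    """Check the product rule on every subcollection of size >= 2."""
    for n in range(2, len(events) + 1):
        for subset in combinations(events, n):
            intersection = set.intersection(*subset)
            product_of_probs = Fraction(1)
            for e in subset:
                product_of_probs *= prob(e)
            if prob(intersection) != product_of_probs:
                return False
    return True


pairwise = all(is_independent(list(pair)) for pair in combinations([A, B, C], 2))
mutual = is_independent([A, B, C])
print(pairwise, mutual)  # True False
```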
Random variables
Def 1.4.1: A random variable is a function $X : S \to \mathbb{R}$.
I Represent something random like rolling a die
I Not actually random themselves,
I also not actually variables, on account of being functions.
Examples
I Toss a coin (1 for H, 0 for T)
I Toss $n$ coins and count the number of H
I Toss a coin repeatedly: count how many H before the first T.
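The "function on the sample space" view can be made literal. In the sketch below the random variable is an ordinary Python function on outcomes, and the randomness comes only from the probability on $S$ (assumed uniform, with $n = 3$ chosen for illustration).

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def pmf_of(random_variable, sample_space):
    """Distribution of X when every outcome in S is equally likely."""
    counts = Counter(random_variable(s) for s in sample_space)
    return {value: Fraction(c, len(sample_space)) for value, c in counts.items()}


n = 3
sample_space = list(product("HT", repeat=n))


def number_of_heads(outcome):
    """X : S -> R, the number of H in a sequence of tosses."""
    return outcome.count("H")


heads_pmf = pmf_of(number_of_heads, sample_space)
for value in sorted(heads_pmf):
    print(value, heads_pmf[value])
```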
CDF - cumulative distribution function
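Def 1.5.1
$F_X(x) = P(X \le x)$
I cadlag: continue à droite, limite à gauche (right-continuous with left limits)
I Left limit 0: $F_X(x) \to 0$ as $x \to -\infty$
I Right limit 1: $F_X(x) \to 1$ as $x \to \infty$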
I non-decreasing
Examples
I Roll a die
I Traffic lights waiting time.
I Radioactive decay
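For the die example the CDF is a step function. A small sketch, assuming a fair six-sided die, that shows the jump at each face and the right-continuity at the jump:

```python
from fractions import Fraction


def die_cdf(x):
    """F_X(x) = P(X <= x) for a fair six-sided die."""
    faces_at_most_x = sum(1 for face in range(1, 7) if face <= x)
    return Fraction(faces_at_most_x, 6)


print(die_cdf(2.999))  # 1/3, just to the left of the jump at 3
print(die_cdf(3))      # 1/2, the value at the jump (right-continuous)
print(die_cdf(0))      # 0
print(die_cdf(6))      # 1
```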
Density and mass functions
Def 1.6.1: Discrete r.v. - probability mass function
$f_X(x) = P(X = x)$ for all $x$
Def 1.6.3: Continuous r.v. - probability density function $f_X(x)$
satisfies
$P(X \le x) = F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$
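The relation between density and CDF can be checked numerically. The sketch below assumes an exponential waiting time with rate 1 (a common model for radioactive decay, chosen here for illustration) and compares a trapezoid-rule integral of the density with the closed form $1 - e^{-x}$.

```python
import math


def exponential_pdf(t, rate=1.0):
    """Density of an exponential waiting time; zero for negative t."""
    return rate * math.exp(-rate * t) if t >= 0 else 0.0


def cdf_by_integration(x, rate=1.0, steps=10_000):
    """Approximate the integral of the density from -infinity to x.
    The density vanishes below 0, so the integral starts at 0."""
    if x <= 0:
        return 0.0
    h = x / steps
    total = 0.5 * (exponential_pdf(0.0, rate) + exponential_pdf(x, rate))
    total += sum(exponential_pdf(i * h, rate) for i in range(1, steps))
    return total * h


x = 2.0
numeric = cdf_by_integration(x)
exact = 1 - math.exp(-x)
print(numeric, exact)
```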