All of Statistics: Chapter 7

Toby Xu
UW-Madison
07/02/07
The Empirical Distribution Function

Def: The empirical distribution function $\hat{F}_n$ is the CDF that puts mass $1/n$ at each data point $X_i$. Formally,

$$\hat{F}_n(x) = \frac{1}{n} \sum_{i=1}^{n} I(X_i \le x),$$

where

$$I(X_i \le x) = \begin{cases} 1 & \text{if } X_i \le x, \\ 0 & \text{if } X_i > x. \end{cases}$$
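The definition translates almost directly into code. The following is a minimal sketch using only the Python standard library; the function name `ecdf` and the sample values are illustrative assumptions, not part of the slides.

```python
from bisect import bisect_right

def ecdf(data):
    """Return the empirical CDF of `data` as a callable x -> F_hat_n(x)."""
    xs = sorted(data)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one observation")

    def f_hat(x):
        # bisect_right(xs, x) counts the observations with X_i <= x,
        # i.e. the sum of the indicators I(X_i <= x).
        return bisect_right(xs, x) / n

    return f_hat

sample = [3.1, 0.4, 2.2, 2.2, 5.0]  # made-up illustrative data
F_hat = ecdf(sample)
print(F_hat(0.0))   # 0.0 (no observation is <= 0)
print(F_hat(2.2))   # 0.6 (three of five observations are <= 2.2)
print(F_hat(10.0))  # 1.0 (all observations are <= 10)
```

Sorting once and using binary search keeps each evaluation at O(log n); tied observations simply contribute a jump of size (number of ties)/n.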
Theorems: For any fixed $x$,

$$E(\hat{F}_n(x)) = F(x),$$

$$V(\hat{F}_n(x)) = \frac{F(x)(1 - F(x))}{n},$$

$$\mathrm{MSE} = \frac{F(x)(1 - F(x))}{n} \to 0,$$

and

$$\sup_x |\hat{F}_n(x) - F(x)| \to 0.$$

Note: The supremum or least upper bound of a set S of real numbers is denoted by sup(S) and is defined to be the smallest real number that is greater than or equal to every number in S.
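A short derivation may help connect the mean, variance, and MSE results. It is a sketch that assumes the usual setup of $X_1, \dots, X_n$ drawn independently from $F$, and it is not necessarily the argument used in the lecture.

```latex
% Fix x. Each indicator I(X_i \le x) is a Bernoulli variable
% with success probability F(x).
\begin{align*}
E\big(I(X_i \le x)\big) &= P(X_i \le x) = F(x), \\
V\big(I(X_i \le x)\big) &= F(x)\,(1 - F(x)), \\
E\big(\hat{F}_n(x)\big) &= \frac{1}{n}\sum_{i=1}^{n} E\big(I(X_i \le x)\big) = F(x), \\
V\big(\hat{F}_n(x)\big) &= \frac{1}{n^2}\sum_{i=1}^{n} V\big(I(X_i \le x)\big)
                         = \frac{F(x)\,(1 - F(x))}{n}, \\
\mathrm{MSE} &= \mathrm{bias}^2 + V\big(\hat{F}_n(x)\big)
              = 0 + \frac{F(x)\,(1 - F(x))}{n} \le \frac{1}{4n} \to 0 .
\end{align*}
```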

DKW inequality: for any $\epsilon > 0$,

$$P\left(\sup_x |F(x) - \hat{F}_n(x)| > \epsilon\right) \le 2 e^{-2 n \epsilon^2}.$$
DKW: confidence level




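$$L(x) = \max\{\hat{F}_n(x) - \epsilon_n,\ 0\}$$

$$U(x) = \min\{\hat{F}_n(x) + \epsilon_n,\ 1\}$$

where

$$\epsilon_n = \sqrt{\frac{1}{2n} \log\left(\frac{2}{\alpha}\right)}.$$

For any $F$,

$$P\big(L(x) \le F(x) \le U(x) \text{ for all } x\big) \ge 1 - \alpha.$$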
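The band is simple to compute once the empirical CDF is available. Below is a minimal sketch with the standard library only; the helper name `dkw_band` and the toy data are assumptions made for illustration.

```python
import math
from bisect import bisect_right

def dkw_band(data, alpha=0.05):
    """Return (eps_n, band), where band(x) gives (L(x), U(x)) for a 1 - alpha band."""
    xs = sorted(data)
    n = len(xs)
    # eps_n = sqrt( log(2 / alpha) / (2 n) )
    eps = math.sqrt(math.log(2.0 / alpha) / (2.0 * n))

    def band(x):
        f_hat = bisect_right(xs, x) / n          # empirical CDF at x
        lower = max(f_hat - eps, 0.0)            # clip at 0
        upper = min(f_hat + eps, 1.0)            # clip at 1
        return lower, upper

    return eps, band

data = list(range(1, 101))        # toy data: the integers 1..100
eps, band = dkw_band(data, alpha=0.05)
print(round(eps, 4))              # half-width of the band, about 0.136 for n = 100
print(band(50))                   # F_hat(50) = 0.5, so roughly (0.364, 0.636)
print(band(0))                    # lower limit is clipped at 0
```

Because $\epsilon_n$ does not depend on $x$ or on $F$, the half-width shrinks at rate $1/\sqrt{n}$ for every distribution; quadrupling the sample size halves it.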
Statistical Functionals





A statistical functional $T(F)$ is any function of $F$. Examples:
Mean: $\mu = \int x \, dF(x)$
Variance: $\sigma^2 = \int (x - \mu)^2 \, dF(x)$
Median: $m = F^{-1}(1/2)$
The plug-in estimator of $\theta = T(F)$ is defined by

$$\hat{\theta}_n = T(\hat{F}_n).$$

If $T(F) = \int r(x) \, dF(x)$ for some function $r(x)$, then $T$ is called a linear functional.
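To make the plug-in idea concrete, here is a small sketch for the mean and the median. The function names are hypothetical, and the quantile convention $\hat{F}_n^{-1}(p) = \inf\{x : \hat{F}_n(x) \ge p\}$ is an assumption, since the slides only write $F^{-1}(1/2)$.

```python
import math

def plugin_mean(data):
    """T(F_hat_n) for the mean functional: integrate x against F_hat_n."""
    # F_hat_n puts mass 1/n on each X_i, so the integral is the sample average.
    return sum(data) / len(data)

def plugin_quantile(data, p):
    """F_hat_n^{-1}(p), assuming the convention inf{x : F_hat_n(x) >= p}."""
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]")
    xs = sorted(data)
    # Smallest k with k/n >= p (floating-point rounding in n * p is ignored here).
    k = math.ceil(len(xs) * p)
    return xs[k - 1]              # the k-th order statistic

sample = [3.1, 0.4, 2.2, 2.2, 5.0]        # made-up illustrative data
print(plugin_mean(sample))                # approximately 2.58
print(plugin_quantile(sample, 0.5))       # 2.2, the plug-in median
```

With an even sample size this convention returns the lower of the two middle order statistics rather than their average, which is one reason to state the convention explicitly.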
Statistical Functionals continued

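The plug-in estimator for the linear functional $T(F) = \int r(x) \, dF(x)$ is

$$T(\hat{F}_n) = \int r(x) \, d\hat{F}_n(x) = \frac{1}{n} \sum_{i=1}^{n} r(X_i).$$

Assume we can find the standard error se of $T(\hat{F}_n)$. Then, in many cases,

$$T(\hat{F}_n) \approx N(T(F), \mathrm{se}^2).$$

Normal-based interval at the 95% confidence level:

$$T(\hat{F}_n) \pm 2\,\mathrm{se}$$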
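As an illustration, the interval can be computed for any linear functional once a standard error is chosen. The sketch below assumes the standard error is estimated by the plug-in standard deviation of $r(X)$ divided by $\sqrt{n}$; that choice, the function name, and the data are assumptions for illustration rather than something the slides specify.

```python
import math

def linear_functional_interval(data, r=lambda x: x):
    """Plug-in estimate of T(F) = integral of r(x) dF(x), with a +/- 2 se interval."""
    n = len(data)
    values = [r(x) for x in data]
    estimate = sum(values) / n                      # (1/n) * sum of r(X_i)
    # Assumed se estimate: plug-in variance of r(X), divided by n, square-rooted.
    var_hat = sum((v - estimate) ** 2 for v in values) / n
    se_hat = math.sqrt(var_hat / n)
    return estimate, (estimate - 2 * se_hat, estimate + 2 * se_hat)

data = [1, 2, 3, 4, 5]                              # toy data
est, (low, high) = linear_functional_interval(data)            # r(x) = x: the mean
print(est, low, high)                               # 3.0 and roughly (1.74, 4.26)

est2, _ = linear_functional_interval(data, r=lambda x: x ** 2)  # second moment
print(est2)                                         # 11.0
```

The multiplier 2 is the rounded normal quantile 1.96; with a sample this small the normal approximation is rough, so the numbers only show the mechanics.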
Examples:

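The Mean: let $\mu = T(F) = \int x \, dF(x)$. The plug-in estimator is $\hat{\mu} = \int x \, d\hat{F}_n(x) = \bar{X}_n$, with $\mathrm{se} = \sqrt{V(\bar{X}_n)} = \sigma / \sqrt{n}$.

The Variance: $\sigma^2 = T(F) = V(X) = \int x^2 \, dF(x) - \left( \int x \, dF(x) \right)^2$. The plug-in estimator is

$$\begin{aligned}
\hat{\sigma}^2 &= \int x^2 \, d\hat{F}_n(x) - \left( \int x \, d\hat{F}_n(x) \right)^2 \\
&= \frac{1}{n} \sum_{i=1}^{n} X_i^2 - \left( \frac{1}{n} \sum_{i=1}^{n} X_i \right)^2 \\
&= \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X}_n)^2.
\end{aligned}$$

Sample Variance:

$$S_n^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X}_n)^2$$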
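A quick numerical check of the variance formulas may be useful. This sketch uses made-up data chosen so that the arithmetic is exact; the function names are illustrative.

```python
def plugin_variance(data):
    """(1/n) * sum (X_i - X_bar)^2, the plug-in estimator."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / n

def plugin_variance_moments(data):
    """(1/n) * sum X_i^2 - ((1/n) * sum X_i)^2, the same estimator via moments."""
    n = len(data)
    return sum(x * x for x in data) / n - (sum(data) / n) ** 2

def sample_variance(data):
    """S_n^2, which divides by n - 1 instead of n."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]          # toy data with mean 5
print(plugin_variance(data))             # 4.0
print(plugin_variance_moments(data))     # 4.0
print(sample_variance(data))             # 32/7, slightly larger than 4.0
```

The two estimators differ only by the factor $n/(n-1)$, so the gap vanishes as $n$ grows. In floating point the moment form can lose precision when the mean is large relative to the spread, which is why the centered form is usually preferred in code.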
Examples Continued


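The Skewness:

$$\kappa = \frac{E(X - \mu)^3}{\sigma^3} = \frac{\int (x - \mu)^3 \, dF(x)}{\left\{ \int (x - \mu)^2 \, dF(x) \right\}^{3/2}}$$

Correlation:

$$\hat{\rho} = \frac{\sum_i (X_i - \bar{X}_n)(Y_i - \bar{Y}_n)}{\sqrt{\sum_i (X_i - \bar{X}_n)^2} \, \sqrt{\sum_i (Y_i - \bar{Y}_n)^2}}$$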
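Both quantities can be estimated by replacing $F$ with the empirical distribution. The following sketch shows one way to do this with the standard library; the function names and toy data are assumptions for illustration.

```python
import math

def plugin_skewness(data):
    """Plug-in skewness: third central moment over the 3/2 power of the second."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n   # plug-in variance
    m3 = sum((x - mean) ** 3 for x in data) / n   # plug-in third central moment
    return m3 / m2 ** 1.5

def plugin_correlation(xs, ys):
    """Plug-in correlation of paired observations (X_i, Y_i)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    # The 1/n factors in numerator and denominator cancel.
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]                      # toy data, symmetric about 3
ys = [2 * x + 1 for x in xs]              # exact increasing linear function of xs
print(plugin_skewness(xs))                # 0.0 for symmetric data
print(plugin_skewness([1, 1, 1, 10]))     # positive: long right tail
print(plugin_correlation(xs, ys))         # 1.0 up to rounding
```

Both functions divide by a variance term, so they fail with a division by zero on constant data; a production version would need to handle that case.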