Random Variables & Entropy:
Extension and Examples
Brooks Zurn
EE 270 / STAT 270
FALL 2007
Overview
• Density Functions and Random Variables
• Distribution Types
• Entropy
Density Functions
• PDF vs. CDF
[Figure: PDF and CDF of the normalized rodent size, plotted against size bins; probability axis runs from 0 to 1]
– PDF shows probability of each size bin
– CDF shows cumulative probability for all sizes up to and
including current bin
– This data shows the normalized, relative size of a rodent as
seen from an overhead camera for 8 behaviors
Markov & Chebyshev Inequalities
• What’s the point?
• Setting an upper bound on a probability
• This limits the search space for a solution
– When looking for a needle in a haystack, it helps
to have a smaller haystack.
• The bound can be used to determine the necessary
sample size
Markov & Chebyshev Inequalities
• Example: Mean height of a child in a kindergarten class is 3’6” = 42 inches. (Leon-Garcia
text, p. 137 – see end of presentation)
– Using Markov’s inequality, the probability of a child being taller than 9 feet (108
inches) is P{H ≥ 108} ≤ 42/108 ≈ 0.389.
So there will be fewer than 39 students over 9 feet tall in a class of 100
students.
Also, there will be NO FEWER THAN 61 students who are under 9’ tall.
– Using Chebyshev’s inequality (and assuming the variance is 1 foot², so σ = 12 inches),
the probability of a child being taller than 9 feet (a deviation of at least 66 inches
from the 42-inch mean) is ≤ 12²/66² ≈ 0.033.
So there will be fewer than 4 students taller than 9’ in a class of 100
students (this is also consistent with Markov’s inequality).
Also, there will be NO FEWER THAN 96 students under 9’ tall.
This gives us a basic idea of how many student heights we need to measure to
rule out the possibility that we have a 9’-tall student…
SAMPLE SIZE!!
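The arithmetic above, restated as a short Python check (the 1-foot standard deviation is the assumption made on this slide, not a measured value):

# Heights in inches: mean 3'6" = 42, threshold 9' = 108.
mean_h = 42.0                            # E[H] from the slide
c = 108.0                                # 9-foot threshold
sigma = 12.0                             # slide's assumption: variance = 1 ft^2

markov = mean_h / c                      # P{H >= 108} <= E[H]/c
chebyshev = sigma**2 / (c - mean_h)**2   # P{|H - 42| >= 66} <= sigma^2/66^2

n_students = 100
print(f"Markov:    P <= {markov:.3f} -> fewer than {markov * n_students:.1f} of {n_students} students")
print(f"Chebyshev: P <= {chebyshev:.3f} -> fewer than {chebyshev * n_students:.1f} of {n_students} students")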
Markov’s Inequality
For a random variable X ≥ 0 and a constant c > 0,
P\{X \ge c\} \le \frac{E[X]}{c}
Derivation:
E[X] = \int_0^\infty x f_X(x)\,dx, where f_X(x) = P[x - \varepsilon/2 \le X \le x + \varepsilon/2]/\varepsilon
for small ε; because this is a continuous integral, the expression also holds at any single point x = a.
Since the integrand is nonnegative, dropping the region below c can only shrink the integral,
and on the remaining region x ≥ c, so x may be replaced by the constant c:
E[X] \ge \int_c^\infty x f_X(x)\,dx \ge c \int_c^\infty f_X(x)\,dx = c\,P\{X \ge c\}
Dividing both sides by c gives the inequality.
References: Lefebvre text.
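A quick Monte Carlo sanity check of the bound. This is only an illustrative sketch; the exponential distribution is an arbitrary choice of nonnegative random variable.

import numpy as np

# Empirical check of P{X >= c} <= E[X]/c for a nonnegative X.
rng = np.random.default_rng(1)
x = rng.exponential(scale=2.0, size=100_000)    # X >= 0 with E[X] = 2

for c in (1.0, 2.0, 4.0, 8.0):
    empirical = np.mean(x >= c)                 # estimate of P{X >= c}
    bound = x.mean() / c                        # Markov bound E[X]/c
    print(f"c = {c:>3}: P(X >= c) ~ {empirical:.4f} <= {bound:.4f}")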
Chebyshev’s Inequality
For a random variable Y with mean E[Y] and variance σ²,
P\{|Y - E[Y]| \ge c\} \le \frac{\sigma^2}{c^2}, \quad c > 0
Derivation:
Apply Markov’s inequality to the nonnegative random variable (Y − E[Y])². Since
(|Y − E[Y]|)² = (Y − E[Y])², the event {|Y − E[Y]| ≥ c} is the same as the event
{(Y − E[Y])² ≥ c²}, so, as per Markov’s inequality,
P\{|Y - E[Y]| \ge c\} = P\{(Y - E[Y])^2 \ge c^2\} \le \frac{E[(Y - E[Y])^2]}{c^2} = \frac{\sigma^2}{c^2}
Reference: Lefebvre text.
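The same kind of empirical check for Chebyshev, with the reduction to Markov made explicit through the squared deviation (the Gaussian here is an arbitrary test distribution):

import numpy as np

# Chebyshev is Markov applied to (Y - E[Y])^2; verify the bound empirically.
rng = np.random.default_rng(2)
y = rng.normal(loc=5.0, scale=1.5, size=100_000)

mu, var = y.mean(), y.var()
for c in (1.0, 2.0, 3.0):
    lhs = np.mean(np.abs(y - mu) >= c)          # P{|Y - E[Y]| >= c}
    rhs = var / c**2                            # sigma^2 / c^2
    print(f"c = {c}: P(|Y - E[Y]| >= c) ~ {lhs:.4f} <= {rhs:.4f}")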
Note
• Both inequalities are related to the Central Limit
Theorem, which is derived in the Leon-Garcia text
on p. 287.
• Central Limit Theorem states that the CDF of a
normalized sequence of n random variables
approaches the CDF of a Gaussian random
variable. (p. 280)
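A small simulation of that statement (a sketch: the uniform distribution and n = 30 are arbitrary choices, and SciPy is assumed available for the Gaussian CDF):

import numpy as np
from scipy.stats import norm

# CDF of a normalized sum of n i.i.d. uniforms vs. the standard Gaussian CDF.
rng = np.random.default_rng(3)
n = 30
sums = rng.uniform(size=(100_000, n)).sum(axis=1)
z = (sums - n * 0.5) / np.sqrt(n / 12.0)        # uniform: mean 1/2, variance 1/12

for t in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"t = {t:+.0f}: empirical CDF = {np.mean(z <= t):.4f}, Phi(t) = {norm.cdf(t):.4f}")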
Overview
• Entropy
– What is it?
– Used in…
Entropy
• What is it?
– According to Jorge Cham (PhD Comics): [cartoon omitted]
Entropy
• “Measure of uncertainty in a random
experiment”
Reference: Leon-Garcia Text
• Used in information theory
– Message transmission (for example, Lathi text p. 682)
– Decision Tree ‘Gain Criterion’
• Leon-Garcia text p. 167
• ID3, C4.5, ITI, etc. by J. Ross Quinlan and Paul Utgoff
• Note: NOT the same as the Gini index used as a splitting
criterion by the CART tree method (Breiman et al., 1984).
Entropy
• ID3 Decision Tree:
Expected Information for a Binary Tree
E(A) = \sum_{j=1}^{q} \frac{s_{1j} + s_{2j} + \cdots + s_{nj}}{s}\, I(s_{1j}, s_{2j}, \ldots, s_{nj})
where the entropy I is
I(s_1, s_2, \ldots, s_n) = -\sum_{i=1}^{n} p_i \log_2 p_i
E(A) is the average information needed to classify A (see the sketch after this list).
• ITI (Incremental Tree Inducer):
– Based on ID3 and its successor, C4.5.
– Uses a gain-ratio metric to improve performance for certain cases.
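A minimal Python sketch of the entropy I and expected information E(A) defined above; the toy attribute/label data is hypothetical, chosen only to make the arithmetic easy to follow.

import math
from collections import Counter

def entropy(labels):
    # I(s1, ..., sn) = -sum_i p_i * log2(p_i) over the class proportions
    total = len(labels)
    return -sum((k / total) * math.log2(k / total)
                for k in Counter(labels).values())

def expected_information(values, labels):
    # E(A): entropy of each subset induced by an attribute value,
    # weighted by that subset's share of all samples
    total = len(labels)
    e = 0.0
    for v in set(values):
        subset = [lab for val, lab in zip(values, labels) if val == v]
        e += len(subset) / total * entropy(subset)
    return e

attr = ['a', 'a', 'b', 'b', 'b', 'a']        # hypothetical attribute values
labels = ['+', '+', '-', '-', '+', '-']      # hypothetical class labels
gain = entropy(labels) - expected_information(attr, labels)
print(f"I = {entropy(labels):.3f}, E(A) = {expected_information(attr, labels):.3f}, gain = {gain:.3f}")

The difference I − E(A) is the ‘gain criterion’ mentioned earlier: ID3 splits on the attribute that maximizes it.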
Entropy
• ITI Decision Tree for Rodent Behaviors
– ITI is an extension of ID3
Reference: ‘Rodent Data’ paper.
Distribution Types
• Continuous Random Variables
– Normal (or Gaussian) Distribution
– Uniform Distribution
– Exponential Distribution
– Rayleigh Distribution
• Discrete (‘counting’) Random Variables
– Binomial Distribution
– Bernoulli and Geometric Distributions
– Poisson Distribution
Poisson Distribution
P\{X = n\} = \frac{e^{-\alpha} \alpha^n}{n!}
and the probability generating function is
P_X(z) = e^{-\alpha} \sum_{n=0}^{\infty} \frac{(\alpha z)^n}{n!} = e^{\alpha(z-1)}
• Number of events occurring in one time unit, where the time between events is
exponentially distributed with mean 1/α.
• Gives a method for modeling completely random, independent events
that occur after a random interval of time. (Leon-Garcia p. 106)
• The Poisson distribution can model a long sequence of Bernoulli trials when the
success probability is small. (Leon-Garcia p. 109)
– Bernoulli gives the probability of a single coin toss.
References: Kao text, Leon-Garcia text.
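The first bullet can be checked by simulation: counting exponentially spaced events inside one time unit reproduces the Poisson PMF. A sketch, with α = 3 chosen arbitrarily:

import math
import numpy as np

# Count events in one time unit when inter-event times are exponential
# with mean 1/alpha; the counts should follow P{X = n} = e^-a * a^n / n!.
rng = np.random.default_rng(4)
alpha, trials = 3.0, 50_000

counts = np.empty(trials, dtype=int)
for i in range(trials):
    t, n = 0.0, 0
    while True:
        t += rng.exponential(1.0 / alpha)      # next inter-arrival time
        if t > 1.0:
            break
        n += 1
    counts[i] = n

for n in range(6):
    pmf = math.exp(-alpha) * alpha ** n / math.factorial(n)
    print(f"n = {n}: simulated {np.mean(counts == n):.4f}, Poisson {pmf:.4f}")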
Poisson Distribution
• [Figure: Poisson PMF for several values of the rate parameter:
http://en.wikipedia.org/wiki/Image:Poisson_distribution_PMF.png]
References
• Lefebvre text: Applied Stochastic Processes, Mario Lefebvre. New York, NY: Springer, 2003.
• Kao text: An Introduction to Stochastic Processes, Edward P. C. Kao. Belmont, CA: Duxbury
Press at Wadsworth Publishing Company, 1997.
• Lathi text: Modern Digital and Analog Communication Systems, 3rd ed., B. P. Lathi. New York,
Oxford: Oxford University Press, 1998.
• Entropy-based decision trees:
– ID3: P. E. Utgoff, "Incremental induction of decision trees," Machine Learning, vol. 4, pp.
161-186, 1989.
– C4.5: J. R. Quinlan, C4.5: Programs for Machine Learning, 1st ed. San Francisco, CA:
Morgan Kaufmann Publishers Inc., 1993.
– ITI: P. E. Utgoff, N. C. Berkman, and J. A. Clouse, "Decision tree induction based on
efficient tree restructuring," Machine Learning, vol. 29, pp. 5-44, 1997.
• Other decision tree methods:
– CART: L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, Classification and Regression
Trees. Belmont, CA: Wadsworth, 1984.
• Rodent data: J. Brooks Zurn, Xianhua Jiang, and Yuichi Motai, "Video-Based Tracking and
Incremental Learning Applied to Rodent Behavioral Activity under Near-Infrared Illumination."
To appear: IEEE Transactions on Instrumentation and Measurement, December 2007 or
February 2008.
• Poisson distribution example: http://en.wikipedia.org/wiki/Image:Poisson_distribution_PMF.png
Questions?