Maximum Entropy and Fourier Transformation

Nicole Rogers

An Introduction to Entropy

Known as the ‘law of disorder.’

Entropy is a measurement of uncertainty associated with a random variable.

Measures the ‘multiplicity’ associated with the state of objects.

Thermodynamic Entropy

Thermodynamic entropy is related to Shannon entropy through the Boltzmann constant, which scales the dimensionless Shannon entropy into thermodynamic units.
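As a hedged sketch of that relation, assuming the Gibbs form of thermodynamic entropy S over microstate probabilities p_i and Shannon entropy H measured in bits (the symbols S, k_B, and p_i are introduced here for illustration):

```latex
S = -k_B \sum_i p_i \ln p_i = (k_B \ln 2)\, H,
\qquad H = -\sum_i p_i \log_2 p_i
```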

Shannon Entropy

Shannon entropy measures how undetermined the outcome of an uncertain situation is.

The higher the Shannon Entropy, the more undetermined the system is.

Shannon Entropy Example

Let’s use the example of a dog race.

Four dogs have various chances of winning the race.

If we apply the entropy equation, with logarithms taken in base 2:

$$H = -\sum_i P_i \log_2(P_i)$$

Racer      Chance to Win (P)   -log(P)   -P log(P)
Fido       0.08                3.64      0.29
Ms Fluff   0.17                2.56      0.43
Spike      0.25                2.00      0.50
Woofers    0.50                1.00      0.50

Shannon Entropy Example (cont.)

Adding up the -P log(P) column for the four dogs:

$$H = -\sum_i P_i \log_2(P_i)$$

H = 0.29 + 0.43 + 0.50 + 0.50

The Shannon entropy is 1.72 bits.
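The arithmetic for the dog race can be checked with a few lines of Python. This is a minimal sketch: the variable names are illustrative, and base-2 logarithms are assumed because they reproduce the -log(P) column.

```python
import math

# Win probabilities from the dog race table.
race = {"Fido": 0.08, "Ms Fluff": 0.17, "Spike": 0.25, "Woofers": 0.50}

H = 0.0
for name, p in race.items():
    surprise = -math.log2(p)   # the -log(P) column, in bits
    term = p * surprise        # the -P log(P) column
    H += term
    print(f"{name:9s} P={p:.2f}  -log2(P)={surprise:.2f}  -P*log2(P)={term:.2f}")

print(f"H = {H:.2f} bits")
```

Because the script adds unrounded terms, it reports about 1.73 rather than 1.72; the difference comes only from rounding each term to two decimals before summing.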

Things to Notice


If you add the chance of each dog to win, the total will be one.

This is because the chances are normalized: exactly one dog wins the race, so the four probabilities must add up to one.

The more uncertain a situation, the higher the Shannon entropy.

This will be demonstrated in the next example.

Two Uncertain Examples

Racer      Chance to Win (P)   -log(P)   -P log(P)
Fido       0.25                2.00      0.50
Ms Fluff   0.25                2.00      0.50
Spike      0.25                2.00      0.50
Woofers    0.25                2.00      0.50

$$H = -\sum_i P_i \log_2(P_i)$$

H = 0.5 + 0.5 + 0.5 + 0.5

With every outcome equally likely, and so completely uncertain, the Shannon entropy will be 2.0.

Two Uncertain Examples

Racer      Chance to Win (P)   -log(P)   -P log(P)
Fido       0.01                6.64      0.07
Ms Fluff   0.01                6.64      0.07
Spike      0.01                6.64      0.07
Woofers    0.97                0.0439    0.04

$$H = -\sum_i P_i \log_2(P_i)$$

H = 0.07 + 0.07 + 0.07 + 0.04

With the situation fairly certain, the Shannon entropy will be 0.25.

Comparisons to Draw

Situation          Shannon Entropy
High Uncertainty   H = 2.00
Fair Uncertainty   H = 1.72
Low Uncertainty    H = 0.25

The more uncertain the situation, the higher the entropy. In this sense, entropy is a measure of chaos.
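The three cases can be compared side by side with a small helper. This is a sketch only: the function name `shannon_entropy` and the dictionary labels are chosen here for illustration, and the values are computed without rounding intermediate terms, so they can differ from the slide figures in the second decimal.

```python
import math

def shannon_entropy(probs):
    """Return H = -sum(p * log2(p)) in bits, skipping zero-probability outcomes."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

cases = {
    "High uncertainty": [0.25, 0.25, 0.25, 0.25],
    "Fair uncertainty": [0.08, 0.17, 0.25, 0.50],
    "Low uncertainty": [0.01, 0.01, 0.01, 0.97],
}

entropies = {label: shannon_entropy(p) for label, p in cases.items()}
for label, h in entropies.items():
    print(f"{label:17s} H = {h:.2f} bits")
```

The ordering is what matters: the flatter the distribution, the larger H.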

Maximum Entropy

The principle of maximum entropy states that, subject to precisely stated prior data (a proposition that expresses testable information), the probability distribution that best represents the current state of knowledge is the one with the largest information-theoretic entropy.

In most practical cases, the stated prior data or testable information is given by a set of conserved quantities associated with the probability distribution in question.

We use the method of Lagrange multipliers to solve this constrained maximization.

Lagrange Multiplier

In mathematical optimization, the method of Lagrange multipliers provides a strategy for finding the local maxima and minima of a function subject to equality constraints.

Here the function being maximized is the entropy, so applying the Lagrange method assumes maximum entropy.
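The two constraints can be written out as follows. This is a sketch in a common textbook notation, not necessarily the slide's own: p(A_i) is the probability of outcome A_i, while g(A_i) (the quantity attached to each outcome) and G (its known average) are names assumed here.

```latex
1 = \sum_i p(A_i)
\qquad\text{and}\qquad
G = \sum_i p(A_i)\, g(A_i)
```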

The first of these equations is a normalization constraint: all of the probabilities must sum to 1.

The second equation is a general constraint. We will see more of what this is in the next example.

Lagrange Multiplier

Since the Lagrange method is being used to maximize the entropy, we can say:

Maximizing L with respect to each of the p(A_i) is done by differentiating L with respect to one of the p(A_i) while keeping α, β, and all other p(A_i) constant. The result is:
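One common way to write the Lagrangian and the resulting stationarity condition is sketched below. It assumes entropy in bits, a constrained quantity g(A_i) with known average G (both names are assumptions), and a first multiplier written as α − log2(e) purely so that the result comes out tidy.

```latex
L = \sum_i p(A_i) \log_2\frac{1}{p(A_i)}
    - (\alpha - \log_2 e)\Bigl(\sum_i p(A_i) - 1\Bigr)
    - \beta\Bigl(\sum_i g(A_i)\, p(A_i) - G\Bigr)

\frac{\partial L}{\partial p(A_i)} = 0
\quad\Longrightarrow\quad
\log_2\frac{1}{p(A_i)} = \alpha + \beta\, g(A_i)
```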

Lagrange Multiplier

Rearranging the equation, we can get:

Where f(β) = 0 follows from combining the two constraints. Using this method, we can solve problems that have only a minimal set of constraints.
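A sketch of the rearranged forms, assuming entropy in bits and the notation g(A_i) for the constrained quantity and G for its known average (both names are assumptions): the first line gives each probability, the second follows from normalization, and the third is the single equation in β that remains to be solved.

```latex
p(A_i) = 2^{-\alpha}\, 2^{-\beta\, g(A_i)}

\alpha = \log_2\Bigl(\sum_i 2^{-\beta\, g(A_i)}\Bigr)

f(\beta) = \sum_i \bigl(g(A_i) - G\bigr)\, 2^{-\beta\,(g(A_i) - G)} = 0
```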

Fast Food Frenzy

Food      Price
Burger    $1.00
Chicken   $2.00
Fish      $3.00
Tofu      $8.00

A fast food restaurant sells four types of product. They find that the average amount of money made for each purchase is $2.50. The products are chosen by the consumer based on price alone, and not preference. What is the percentage of purchase for each of these four foods?

Fast Food Frenzy

We know that:

Applying Lagrange Method:

Fast Food Frenzy

Entropy is the largest, subject to the constraints, if

Where

Fast Food Frenzy

A zero-finding program was used to find the variables in these equations. The results were:

Food      Probability of Purchase
Burger    0.3546
Chicken   0.2964
Fish      0.2477
Tofu      0.1011

0.3546 + 0.2964 + 0.2477 + 0.1011 = 0.9998

This equals one to within rounding error, so the probabilities are normalized.

The Lagrange method and maximum entropy can determine probabilities using only a small set of constraints. This answer makes sense because the probabilities are consistent with the price constraint: weighting each price by its probability gives an average of about $2.50.
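The zero-finding step can be sketched in Python. This is not the program used for the slides: it assumes the maximum-entropy form p = 2^(-α - β·price) with entropy in bits, solves f(β) = Σ (price − average)·2^(−β(price − average)) = 0 by plain bisection, and uses a search interval of [0, 5] chosen here because f changes sign across it. All function and variable names are illustrative.

```python
import math

prices = {"Burger": 1.00, "Chicken": 2.00, "Fish": 3.00, "Tofu": 8.00}
average = 2.50  # average amount per purchase, from the problem statement

def f(beta):
    """Constraint residual; its root is the beta that reproduces the average price."""
    return sum((g - average) * 2.0 ** (-beta * (g - average)) for g in prices.values())

def bisect(func, lo, hi, tol=1e-12):
    """Plain bisection; assumes func(lo) and func(hi) have opposite signs."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)

beta = bisect(f, 0.0, 5.0)
# alpha follows from normalization once beta is known.
alpha = math.log2(sum(2.0 ** (-beta * g) for g in prices.values()))
probs = {food: 2.0 ** (-alpha - beta * g) for food, g in prices.items()}

for food, p in probs.items():
    print(f"{food:8s} {p:.4f}")
print("sum  =", round(sum(probs.values()), 4))
print("mean =", round(sum(p * prices[food] for food, p in probs.items()), 4))
```

Bisection is used only because it needs no derivative and f(β) decreases steadily over the interval; any other root finder would serve equally well.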

Remarks


Only by assuming maximum entropy are we able to evaluate these equations.

Since this example is evaluated on price alone, the burger is chosen most often because it is the cheapest. The probabilities are lower for the more expensive items, as the results indicate.

When the number of possible outcomes increases, so does the maximum possible entropy.

Because we only had four outcomes, the entropy at maximum is lower than it would have been with five outcomes.
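To make the comparison concrete: when normalization is the only constraint, the entropy of n outcomes is largest for the uniform distribution p_i = 1/n (a standard result; n and p_i are symbols introduced here). In bits:

```latex
H_{\max}(n) = -\sum_{i=1}^{n} \frac{1}{n}\log_2\frac{1}{n} = \log_2 n,
\qquad H_{\max}(4) = 2, \quad H_{\max}(5) = \log_2 5 \approx 2.32
```

An extra constraint, such as the average price in the fast food example, can only lower the maximum below this bound.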

Fourier Transformation

The Fourier transform is a mathematical operation, with many applications in physics and engineering, that expresses a function of time as a function of frequency. The function is represented as a combination of sine and cosine waves of different frequencies.

Fourier transforms and maximum entropy can both be used to find the specific frequencies present in a signal built from sine and cosine waves.

Fourier vs. Max Entropy

Test signal, sampled at Num points:

    x(i)=dsin(twopi*2.d0*t)
    x(i)=x(i)+dsin(twopi*3.d0*t)
    x(i)=5.d0+x(i)+dsin(twopi*3.2d0*t)

Figure panels: Num=30, Num=90, Num=150

Fourier vs. Max Entropy

Since we were looking for the frequencies 2.0, 3.0, and 3.2 (the multipliers of 2π in our sine waves), maximum entropy was consistently better at locating them on the graphs.

Maximum entropy works better than Fourier for records of 30 to 150 data points. This is because it can form its estimate from a small amount of data.

If the amount of data were dramatically increased, the Fourier method would work better.
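The Fourier side of this comparison can be sketched in Python. The sketch covers only the discrete Fourier transform, not the maximum entropy spectral estimate, and it rests on assumptions the source does not state: samples are taken at t = i·dt starting from zero, with a sampling interval dt = 0.1. The function names are illustrative. The spacing between Fourier frequency bins is 1/(Num·dt), so the peaks at 3.0 and 3.2 can only land in separate bins once that spacing is well below 0.2.

```python
import cmath
import math

def signal(num, dt):
    """Sample x(t) = 5 + sin(2*pi*2.0*t) + sin(2*pi*3.0*t) + sin(2*pi*3.2*t)."""
    two_pi = 2.0 * math.pi
    return [
        5.0
        + math.sin(two_pi * 2.0 * i * dt)
        + math.sin(two_pi * 3.0 * i * dt)
        + math.sin(two_pi * 3.2 * i * dt)
        for i in range(num)
    ]

def dft_magnitudes(x):
    """Direct O(N^2) discrete Fourier transform; returns |X[k]| for k = 0..N//2."""
    n = len(x)
    return [
        abs(sum(x[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n)))
        for k in range(n // 2 + 1)
    ]

def top_frequencies(num, dt, count=3):
    """Frequencies of the `count` largest non-DC bins, in ascending order."""
    mags = dft_magnitudes(signal(num, dt))
    bins = sorted(range(1, len(mags)), key=lambda k: mags[k], reverse=True)[:count]
    return sorted(round(k / (num * dt), 3) for k in bins)

dt = 0.1  # assumed sampling interval; the source does not state one
for num in (30, 90, 150):
    spacing = 1.0 / (num * dt)
    print(f"Num={num:3d}  bin spacing={spacing:.3f}  top bins={top_frequencies(num, dt)}")
```

Under these assumptions the bin spacing shrinks from about 0.33 at Num=30 to about 0.07 at Num=150, which is consistent with the remark that the Fourier method improves as the amount of data grows.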

