6 Probabilities and Entropy

Here we express entropy in terms of probabilities so that we can treat letters or states that are not equally probable. For instance, we might seek the probability p_m that a polymer has exactly m folds. Many average properties can then be computed, such as the average length of a polymer at a given temperature. This chapter re-expresses information in terms of such state probabilities.

Introducing Unequal Probabilities

To address cases where the N states of a system are not equally likely to occur, we assign G_1 (equally likely) "balls" of type one, G_2 "balls" of type two, and so on. We make these assignments in order to simulate the probability p_1 = G_1/G for the system to be in state 1, and likewise for the other states:

    p_1 = G_1/G,   p_2 = G_2/G,   ...,   p_N = G_N/G.

More succinctly,

    p_i = \frac{G_i}{G}                                             (1)

with

    G = \sum_i G_i.                                                 (2)

The number of distinguishable configurations W is

    W = \frac{G!}{G_1! \, G_2! \cdots G_N!}                         (3)

Substitute this into the Boltzmann form and reduce the result with Stirling's formula:

    I = k \ln W \approx k \left[ G \ln G - G - \sum_i \left( G_i \ln G_i - G_i \right) \right]

Now replace the G_i's by p_i's:

    I = kG \left[ (\ln G - 1)\left( 1 - \sum_i p_i \right) - \sum_i p_i \ln p_i \right]

Finally, introduce

    \sum_i p_i = 1                                                  (4)

to obtain

    I = -kG \sum_i p_i \ln p_i.                                     (5)

1. Show that the statistical relation (4) follows directly from condition (2).

2. Derive Eq. (5) for the case of only two probabilities, p_1 and p_2.

3. (a) Given n equally probable letters, what is the probability of selecting any one?
   (b) Show that the information in a message having G of these letters is kG \ln n.

Shannon's Information Entropy

Although Eq. (5) is the mathematical equivalent of Boltzmann's form, it is usual to express information entropy S as the information per symbol, S = I/G:

    S = -k \sum_i p_i \ln p_i                                       (6)

The context of the problem should indicate whether the number of symbols G should be included. "Bits" are an arbitrary choice of units for measuring information, and the constant k is, to that extent, an arbitrary scale factor. Generally k is taken to be 1/\ln 2 in coding problems and Boltzmann's constant in physical problems.

4. Life forms in the atmosphere of Jupiter are found to have 4 DNA nucleotides with the following frequencies:

       1/3 each for A and T
       1/6 each for C and G

   These nucleotides code for 20 amino acids, of which 10 each have probability 0.09 and the other 10 each have probability 0.01.

   (a) Find the average information per nucleotide. [ans. 1.92 bits]
   (b) Find the average information per amino acid. [ans. 3.79 bits]
   (c) How many nucleotides are in the shortest possible 'codons'? [ans. 2]
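A quick way to check the answers quoted in problem 4 is to evaluate Eq. (6) numerically with k = 1/\ln 2. The following is a minimal Python sketch; the function and variable names are only illustrative.

    from math import log

    def entropy_bits(probs):
        """Average information per symbol, Eq. (6), with k = 1/ln 2 (i.e., in bits)."""
        return -sum(p * log(p, 2) for p in probs)

    # Problem 4: Jovian nucleotides A, T (probability 1/3 each) and C, G (1/6 each)
    nucleotides = [1/3, 1/3, 1/6, 1/6]

    # 20 amino acids: 10 with probability 0.09 and 10 with probability 0.01
    amino_acids = [0.09] * 10 + [0.01] * 10

    h_nuc = entropy_bits(nucleotides)   # about 1.92 bits per nucleotide
    h_aa = entropy_bits(amino_acids)    # about 3.79 bits per amino acid

    print(f"(a) per nucleotide: {h_nuc:.2f} bits")
    print(f"(b) per amino acid: {h_aa:.2f} bits")
    print(f"(c) nucleotides per codon >= {h_aa / h_nuc:.2f}")

Since each amino acid carries about 3.79 bits while each nucleotide carries at most 1.92 bits, a codon must contain at least 3.79/1.92 ≈ 2 nucleotides, which is the answer quoted for part (c).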
Equation (6) is appropriate for any probability distribution, including equal probabilities. We repeat the rubber band problem here using this form for entropy. Remember that k represents Boltzmann's constant in these physical cases.

5. Consider a rubber band comprised of N links of length a that may each point left or right. The whole chain has end-to-end length L, as shown.

   [Figure: a chain of N links, each of length a, with end-to-end length L]

   (a) Let N_R and N_L represent the number of links pointing right and left, respectively. Show that N_R and N_L can be expressed in terms of N and L:

           N_R = \frac{1}{2}\left( N + \frac{L}{a} \right),   N_L = \frac{1}{2}\left( N - \frac{L}{a} \right)

   (b) The probability of a link pointing to the right is N_R/N. Use the probabilities for pointing left and right in the expression for the entropy of N links, S = -kN \sum_i p_i \ln p_i. Show that expanding S to second order in L gives

           S \approx Nk \ln 2 - \frac{kL^2}{2Na^2}

   (c) The fundamental thermodynamic relation applied to this system is

           dU = T \, dS + f \, dL

       where f is the tension applied to the rubber band. Solve this for dS. The result is a physical expression for the entropy as a function of U and L. Compare this with the mathematical identity

           dS = \left( \frac{\partial S}{\partial U} \right)_L dU + \left( \frac{\partial S}{\partial L} \right)_U dL

       to find the relation of S to f:

           \left( \frac{\partial S}{\partial L} \right)_U = -\frac{f}{T}

       Use this to evaluate f and compare your result with problem 7 of chapter 4.

Summary

We have expressed entropy in terms of probabilities. In subsequent chapters we will use entropy to derive expressions for the (unequal) probabilities of systems. We will see under which circumstances distributions are equiprobable or canonical.
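As a closing numerical sketch of problem 5, the exact two-state entropy of the rubber band can be compared with its second-order expansion. The parameter values below (k = 1, N = 100, a = 1) are illustrative assumptions, not values from the text.

    from math import log

    # Assumed illustrative parameters: k = 1, N = 100 links, link length a = 1
    k, N, a = 1.0, 100, 1.0

    def entropy_exact(L):
        """S = -kN (pR ln pR + pL ln pL), with pR = NR/N and pL = NL/N from parts (a)-(b)."""
        pR = 0.5 * (1.0 + L / (N * a))
        pL = 1.0 - pR
        return -k * N * (pR * log(pR) + pL * log(pL))

    def entropy_quadratic(L):
        """Second-order expansion: S ~ Nk ln 2 - k L^2 / (2 N a^2)."""
        return N * k * log(2.0) - k * L**2 / (2.0 * N * a**2)

    for L in (5.0, 10.0, 20.0):
        print(f"L = {L:5.1f}:  exact = {entropy_exact(L):.4f}   quadratic = {entropy_quadratic(L):.4f}")

For L much smaller than Na the two expressions agree closely, and differentiating the quadratic term gives the entropic tension f = -T (\partial S/\partial L)_U = kTL/(Na^2) asked for in part (c).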