Statistical Physics 763620S

Jani Tuorila
Department of Physics
University of Oulu
December 6, 2012

General

The website of the course includes links to this material, to the exercises, and to their solutions. Possible changes to the schedule below are also announced on the web page.

Schedule and Practicalities

Lectures and exercise sessions will follow the schedule below:

• Lectures: Monday 12-14, room TE320; Tuesday 10-12, room FY254-2
• Exercises: Friday 12-14, room FY254-2

By solving the exercises you can earn extra points for the final exam. The more problems you solve, the higher the raise. The maximum raise is one grade point. It is worth remembering, however, that the biggest gain from trying to solve the exercises is a deeper and more efficient learning of the course!

Literature

This course is not based on a single book, but the main parts of these lecture notes can be found in any of the following books:

• R. Pathria and P. Beale, Statistical Mechanics, 3rd ed. This is used widely as a textbook. I have been mainly reading this.
• F. Reif, Fundamentals of Statistical and Thermal Physics. A classic, but missing some modern examples.
• J. Sethna, Statistical Mechanics: Entropy, Order Parameters, and Complexity. A book with a modern flavor, available free online (http://pages.physics. ...).

I strongly encourage you to read at least one of the books above. It will give you a deeper understanding of the subject than the lectures alone. In addition, you can read this course material.

Other literature:

• R. Fitzpatrick, Thermodynamics & Statistical Mechanics: An Intermediate Level Course. Lecture notes (sm1/sm1.html), elegantly written but not advanced enough.
• L. Reichl, A Modern Course in Statistical Physics.
• D. Chandler, Introduction to Modern Statistical Mechanics.
• P. Pietiläinen, Statistical Physics. Previous lecture notes of the course that lay the foundation for these notes.
• J. Arponen, Statistical Physics. Old notes were based on this.

Contents

The course will cover the following topics:

• Background
  – Probability
  – Random walk
  – Central limit theorem
• Statistical approach to many-body problem
  – Statistical formulation
  – Equal a priori probabilities
  – H-theorem
• Statistical Thermodynamics
  – Phase space and Liouville's theorem
  – Ergodicity theorem
  – Ensemble theory
  – Thermodynamic limit
• Quantum Statistics
  – Density matrix
  – Bose and Fermi statistics
• Ideal Quantum Gases
  – Bose-Einstein condensation
  – Magnetism
• Interacting Systems
  – Cluster expansion
  – Second quantization
• Phase Transitions
  – Landau's mean field theory
  – Critical exponents
  – Scaling
• Non-equilibrium Statistics
  – Langevin equation
  – Fluctuation-Dissipation theorem

1. Introduction

All matter we see around us consists of a vast number of microscopic particles that are in constant motion and in interaction with one another. At any given instant of time, it is possible in principle to find the state of this kind of many-body system. The laws of physics are such that the future evolution of the state can then be solved from the governing equations of motion. In practice, however, the number of constituent particles is typically of the order of Avogadro's number

$$N_A = 6 \cdot 10^{23},$$

leading to a stupendously large set of coupled equations. Even storing the initial state of these equations would exceed the combined memory of all of the computers in the world. This is the so-called many-body problem of physics.

When the number of constituent particles is immense, it has turned out useful to use the tools of probability calculus to describe their emergent behaviour. In this course, we will not be interested in the details of individual particles, but in the relations between the macroscopic properties that result. In fact, we lack most of the information required to describe the internal state of the system, and therefore our approach has to be probabilistic in nature. This is the starting point of statistical physics.

The equations governing the macroscopic properties (e.g. energy, pressure, temperature) of a many-body system are statistical averages of those of the constituent particles. Therefore, it is implicitly assumed that the averaged quantities experience statistical deviations. For example, the familiar equation of state of the ideal gas

$$PV = nRT$$

relates the average pressure $P$ and average volume $V$ to the average temperature $T$. Quite generally, the deviations from the average in a many-body system with $N$ particles go as $1/\sqrt{N}$. The astounding number of particles, which makes the calculation of the macroscopic properties from first principles practically impossible, leads to such an accuracy of the statistical results that they can be taken as exact physical laws.

Motivation

The systematic study of macroscopic systems started from the phenomenological studies of the 18th and 19th centuries. These were done by the likes of James Watt, William Rankine, Rudolf Clausius, Sadi Carnot, and William Thomson. The resulting theory of thermodynamics is concerned with heat and its relation to other forms of energy and work. The strength of thermodynamics is in its generality, emerging from the fact that the theory does not depend on any detailed assumptions about the microscopic properties of the system. This is also the main weakness of thermodynamics, since only relatively few statements can be made on such general grounds. This restricts the scope of the theory and leaves many interesting properties of nature unexplained.

The idea that the macroscopic properties of matter are caused by the microscopic constituents was first introduced in the second half of the 19th century by James Clerk Maxwell, Ludwig Boltzmann, Max Planck, Rudolf Clausius, and J. Willard Gibbs. The theory was completed in the first half of the 20th century by the advent of quantum mechanics, which gives the correct description of atomic size particles. By applying probabilistic ideas to systems in equilibrium, one can obtain all results of thermodynamics together with the ability to relate the macroscopic parameters of the system to its microscopic constituents. This powerful method will be the subject of the first half of this course. The description of systems not in equilibrium is a much more difficult task. One can still make some general statements about such systems, which leads to the statistical physics of irreversible processes. This will be discussed briefly in the latter part of the course.

Statistical physics can, in principle, be applied to any state of matter: solids, liquids, gases, matter composed of several phases and/or several components, matter under extreme conditions of temperature and pressure, matter in equilibrium with radiation, and so on. In addition, the concepts and methods of statistical physics have turned out useful in many other fields of science: chemistry, study of dynamical systems, communications, bioinformatics, complexity, and even stock markets.

"If a million monkeys typed ten hours a day, it is extremely unlikely that their output would exactly equal all the books of the richest libraries of the world; and yet, in comparison, it is even more unlikely that the laws of statistical mechanics would ever be violated, even briefly."
- Emile Borel, 1913

2. Background

2.1 Ensemble and Probability

The laws of physics are deterministic in the sense that, given the state of a system at one time, one can calculate the state at all later times by using the time-dependent Schrödinger equation or classical mechanics. Then, the outcome of an experiment on the system should be determined completely by this final state. Even though this can be done in principle, it is practically impossible to determine the initial state of a system with $N_A$ particles, let alone to calculate its time evolution.

Statistical physics circumvents this problem by considering an (infinitely) large set of similar experiments, instead of a single one. This kind of set of mental copies of similar systems is called an ensemble¹.

¹ Ensemble is the French word for group. We will discuss in a later chapter the reasons why this kind of treatment can be done.

Let $S$ denote a general system and consider an experiment on the system with a general outcome $X$. We denote with $\Sigma$ the ensemble of mental copies of $S$. Now, the probability of observing the outcome $X$ is given by

$$P(X) = \lim_{\Omega(\Sigma)\to\infty} \frac{\Omega(X)}{\Omega(\Sigma)}, \tag{1}$$

where $\Omega(\Sigma)$ is the number of systems in the ensemble and $\Omega(X)$ is the number of systems exhibiting the outcome $X$. This is the scientific definition of probability.

Example: Throwing a die

One could in principle determine the initial position and orientation of a die. Also, one could figure out exactly how the die is thrown out of the hand. By using these as initial conditions, one could use the classical equations of motion to determine exactly which of the six numbers turns up. Practically, this is impossible.

Instead, we usually consider a large group of (nearly) similarly thrown dice. The result of a single throw is no longer deterministic, since we do not have exact knowledge of the initial state. Nevertheless, we can find the probability of a particular outcome if we can assume that each outcome is equally likely². In such a case, we obtain

$$P(X) = \frac{1}{6}.$$

This can, of course, be compared with actual throws. Indeed, if we notice a deviation from this assumption of equally probable outcomes, we immediately suspect that the die is crooked!

² This is in fact a postulate of so-called a priori probabilities that lies in the foundation of statistical physics, and will be discussed more thoroughly in the following chapter!

2.2 Random Walk

Next, we will consider the one-dimensional random walk as a simple example that captures the concepts of probability calculus needed in this course. Assume that a particle performs successive steps, each of which has a length $l$ and is taken in a random direction. The successive steps are statistically independent, so we can denote

$p$ = probability of a right step
$q = 1 - p$ = probability of a left step

One obtains great insight by considering the number $p$ as the fraction of right steps in the ensemble of mental copies of single steps. The random walk of $N$ individual steps is formed by taking a subset, i.e. a sample, from the ensemble and calculating the sum of its members. After $N$ steps, the particle lies at

$$x = ml,$$

where

$$m = n_r - n_l \tag{2}$$

and $n_r$ ($n_l$) denotes the number of steps to the right (left). Naturally, $-N \le m \le N$ and

$$N = n_r + n_l. \tag{3}$$

By taking into account the number of different ways of taking $n_r$ steps out of $N$, we arrive at the probability

$$W_N(n_r) = \frac{N!}{n_r!\,n_l!}\, p^{n_r} q^{n_l} \tag{4}$$

of taking $n_r$ steps to the right. Probability function (4) is referred to as the binomial distribution since it resembles the terms in the binomial expansion. By using Eqs. (2) and (3), one can show (exercise) that the probability of finding the particle at position $m$ after $N$ steps is

$$P_N(m) = W_N(n_r) = \frac{N!}{[(N+m)/2]!\,[(N-m)/2]!}\, p^{(N+m)/2} q^{(N-m)/2}. \tag{5}$$

Moments of a discrete distribution

Before continuing with the random walk example, let us recall a couple of concepts that characterize a given discrete distribution (later, we will generalize these for continuous distributions).

The mean of an arbitrary (normalized) distribution $P(u)$ is given by

$$\mu \equiv \langle u \rangle = \sum_{i=1}^{M} u_i P(u_i) = \lim_{\Omega(\Sigma)\to\infty} \sum_{i=1}^{M} \frac{u_i\,\Omega(u_i)}{\Omega(\Sigma)}, \tag{6}$$

where $u_i$ are the $M$ possible values the variable $u$ can have. The latter equality allows the interpretation of the mean as the ensemble average.
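The binomial distribution (4), the displacement distribution (5) and the ensemble reading of probability in Eq. (1) can be tried out numerically. The following is only a minimal sketch, assuming Python 3.8+ and its standard library; the function names (`binomial_prob`, `displacement_prob`, `empirical_distribution`) are hypothetical helpers chosen for this illustration and do not come from the course material.

```python
import math
import random


def binomial_prob(N, n_r, p):
    """W_N(n_r) of Eq. (4): probability of n_r right steps out of N."""
    return math.comb(N, n_r) * p**n_r * (1 - p) ** (N - n_r)


def displacement_prob(N, m, p):
    """P_N(m) of Eq. (5); zero unless |m| <= N and N + m is even."""
    if abs(m) > N or (N + m) % 2:
        return 0.0
    return binomial_prob(N, (N + m) // 2, p)


def empirical_distribution(N, p, walks, seed=0):
    """Relative frequencies of n_r in a finite ensemble of random walks.

    This mimics Eq. (1) with a finite number of mental copies (walks).
    """
    rng = random.Random(seed)
    counts = [0] * (N + 1)
    for _ in range(walks):
        n_r = sum(1 for _ in range(N) if rng.random() < p)
        counts[n_r] += 1
    return [c / walks for c in counts]


if __name__ == "__main__":
    N, p = 20, 0.5
    exact = [binomial_prob(N, n, p) for n in range(N + 1)]
    freq = empirical_distribution(N, p, walks=20000)
    # Mean number of right steps, computed as in Eq. (6) from both sources.
    mean_exact = sum(n * w for n, w in enumerate(exact))
    mean_freq = sum(n * f for n, f in enumerate(freq))
    print("mean from W_N:", mean_exact, " mean from ensemble:", mean_freq)
```

Increasing `walks` plays the role of the limit in Eq. (1): the relative frequencies should move closer to the values of $W_N(n_r)$ as the finite ensemble grows.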
The mean has a familiar interpretation due to the law of large numbers:

If one repeats endlessly the same random experiment, the average of the results approaches the mean of the ensemble formed by mental copies of the single experiment.

This law is, of course, a straightforward consequence of our definition of probability.

Let us consider a series of random steps ($l = 1$) and, after each step, calculate the average displacement of a single random walk

$$\bar{x} = l\,\frac{n_r - n_l}{N} \to l(p - q) = \langle x \rangle.$$

[Figure: net displacement versus number of steps]

We have assumed $p = 0.5$ and $l = 1$ in the figure. As one increases the number of steps, the size of the sample grows and the average of the steps approaches that of the ensemble.

Assume that $f(u)$ and $g(u)$ are any functions of the discrete variable $u$ and $c$ is a constant. Then,

$$\langle f(u) \rangle = \sum_{i=1}^{M} f(u_i) P(u_i)$$

$$\langle f(u) + g(u) \rangle = \langle f(u) \rangle + \langle g(u) \rangle$$

$$\langle c f(u) \rangle = c \langle f(u) \rangle.$$

The mean is called the first moment of the distribution. In general, the $n$th moment is defined as

$$\langle u^n \rangle = \sum_{i=1}^{M} u_i^n P(u_i). \tag{7}$$

In principle, the whole distribution can be resolved by calculating all of its moments.

A useful mean value is the variance, or the second moment about the mean,

$$\sigma^2 \equiv \langle (u - \langle u \rangle)^2 \rangle = \sum_{i=1}^{M} (u_i - \langle u \rangle)^2 P(u_i). \tag{8}$$

The variance is of great importance since its square root $\sigma$, the standard deviation, can be used to characterize the width of the range over which the random variable is distributed around its mean. The following relation is often useful:

$$\sigma^2 = \langle u^2 \rangle - \langle u \rangle^2.$$

Mean values of the random walk

First, let us calculate the mean and variance of the whole ensemble of single steps (with $l = 1$, a right step contributes $+1$ and a left step $-1$):

$$\mu = p - q, \qquad \sigma^2 = p\,(1 - (p - q))^2 + q\,(-1 - (p - q))^2 = 4pq.$$

The random walk of length $N$ is formed by taking a sample from the ensemble. It can be characterized with either distribution (4) or (5). According to Eq. (4), we can show that the binomial distribution is normalized, i.e.

$$\sum_{n_r=0}^{N} W_N(n_r) = 1,$$

since the sum is the binomial expansion of $(p + q)^N$ and $p + q = 1$. We can also calculate the mean number of steps to the right as

$$\langle n_r \rangle = Np,$$

where we have used the result

$$n_r\, p^{n_r} = p\,\frac{\partial}{\partial p}\left(p^{n_r}\right).$$

The mean displacement is then given by

$$\langle m \rangle = N(p - q) = N\mu,$$

which is (naturally) zero when $p = q$. We see that the average displacement is formed by $N$ independent single displacements with the mean $\mu = p - q$.

Correspondingly, the variance of the right steps in the random walk is

$$\sigma_r^2 = \langle (n_r - \langle n_r \rangle)^2 \rangle = Npq.$$

Now, the variance of the net displacement is

$$\sigma_m^2 = \langle (m - \langle m \rangle)^2 \rangle = 4\sigma_r^2 = N\sigma^2,$$

where we have used the relations (2) and (3). For the number of right steps, we obtain the relative standard deviation

$$\frac{\sigma_r}{\langle n_r \rangle} = \sqrt{\frac{q}{p}}\,\frac{1}{\sqrt{N}}. \tag{9}$$

We see that the relative width decreases as $1/\sqrt{N}$ with increasing sample size $N$.³

³ This kind of $1/\sqrt{N}$-law is common in statistics, and is the reason why statistical mechanics gives (nearly) exact results.

[Figure: $W_N(n_r)$ versus $n_r$] Binomial distributions of the number of right steps with $N = 20$ (red) and $N = 40$ (blue). The arrows indicate the standard deviations.

Large N limit

Let us then assume that $N$ is large, meaning that we are in the limit of a long random walk, or of a large sample. To keep the calculation simple, we do it for the number of right steps $n_r$, but the same reasoning applies also to the net displacement $m$. First, we note that since

$$\langle n_r \rangle = Np \gg 1 \quad \text{and} \quad \sigma_r = \sqrt{Npq} \gg 1,$$

we have that near the mean $\langle n_r \rangle$ and in the limit of large $N$

$$|W_N(n_r + 1) - W_N(n_r)| \ll W_N(n_r).$$

This allows us to consider $W_N$ as a continuous function of a continuous variable $n_r$.

Now, the location of the maximum $n_r = \tilde{n}_r$ can be found approximately from the equation

$$\frac{d \ln W_N}{d n_r} = 0.$$

We make a Taylor expansion of $\ln W_N$ in the vicinity of $\tilde{n}_r$, and obtain⁴

$$\ln W_N(n_r) = \ln W_N(\tilde{n}_r) + B_1 \eta + \frac{1}{2} B_2 \eta^2 + \frac{1}{6} B_3 \eta^3 + \cdots, \tag{10}$$

where $\eta = n_r - \tilde{n}_r$ and

$$B_k = \left.\frac{d^k \ln W_N}{d n_r^k}\right|_{n_r = \tilde{n}_r}.$$

⁴ We use $\ln W_N$ instead of $W_N$ since its expansion converges more rapidly.

By using Stirling's formula (exercise)

$$\ln n! \approx n \ln n - n,$$

we obtain

$$\frac{d \ln W_N}{d n_r} = -\ln n_r + \ln(N - n_r) + \ln p - \ln q.$$

At the maximum ($n_r = \tilde{n}_r$) this derivative is $B_1 = 0$, which implies

$$\tilde{n}_r = Np = \langle n_r \rangle.$$

Similarly,

$$B_2 = -\frac{1}{Npq} = -\frac{1}{\sigma_r^2} < 0,$$

as it should be for a maximum. The higher order derivatives $B_k \propto N^{-(k-1)}$, and can be neglected in the limit of large $N$ in the expansion (10). We have thus obtained

$$W_N(n_r) = \tilde{W}_N\, e^{-\frac{1}{2}|B_2|\eta^2},$$

where $\tilde{W}_N = W_N(\langle n_r \rangle)$ can be determined from the normalization condition⁵

$$\sum_{n_r=0}^{N} W_N(n_r) \approx \int_{-\infty}^{\infty} W_N(\langle n_r \rangle + \eta)\, d\eta = 1 \quad \Rightarrow \quad \tilde{W}_N = \sqrt{\frac{|B_2|}{2\pi}}.$$

⁵ We have seen that the distribution has a pronounced maximum, so the continuation of the integral from $-\infty$ to $\infty$ is an excellent approximation.

Thus, we have obtained that, in the limit of large $N$ and $n_r$, the binomial distribution converges to the Gaussian distribution

$$W_N(n_r) = \frac{1}{\sqrt{2\pi\sigma_r^2}}\, e^{-\frac{(n_r - \langle n_r \rangle)^2}{2\sigma_r^2}}. \tag{11}$$

[Figure: $W_N(n_r)$ versus $n_r$] Binomial distribution (dots) with $p = 0.5$, $N = 20$ and $N = 40$, together with the corresponding Gaussian distributions (solid lines).

2.3 Central Limit Theorem

The preceding discussion is one special case of the so-called central limit theorem⁶:

If we sum $N$ ($N \to \infty$) systems that are statistically independent and identical, the resultant distribution is under very general conditions a Gaussian.

⁶ The central limit theorem is notoriously difficult to prove, and the proof is therefore omitted here.

In terms of the random walk, we started with a single step with the probability distribution

$$P(X) = \begin{cases} p, & X = \text{right} \\ q, & X = \text{left.} \end{cases}$$

Then, we combine a large number $N$ of such random steps and calculate the net displacement to the right $m = n_r - n_l$, where $n_r$ ($n_l$) is the number of steps to the right (left). In the limit of $N \to \infty$, we obtain

$$\frac{n_r}{N} \to p, \qquad \frac{n_l}{N} \to q, \qquad \frac{m}{N} \to p - q = \mu.$$

This is the law of large numbers.
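The convergence of the binomial distribution (4) to the Gaussian (11), and the $1/\sqrt{N}$ narrowing of Eq. (9), can be checked numerically. The sketch below is a hedged illustration rather than part of the original notes: it assumes Python 3.8+ with only the standard library, and the helper names (`binomial_prob`, `gaussian_approx`, `max_abs_error`, `relative_width`) are hypothetical.

```python
import math


def binomial_prob(N, n_r, p):
    """Exact W_N(n_r) of Eq. (4)."""
    return math.comb(N, n_r) * p**n_r * (1 - p) ** (N - n_r)


def gaussian_approx(N, n_r, p):
    """Gaussian of Eq. (11) with mean N*p and variance N*p*q."""
    mean = N * p
    var = N * p * (1 - p)
    return math.exp(-((n_r - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def max_abs_error(N, p):
    """Largest pointwise difference between Eq. (4) and Eq. (11)."""
    return max(
        abs(binomial_prob(N, n, p) - gaussian_approx(N, n, p)) for n in range(N + 1)
    )


def relative_width(N, p):
    """sigma_r / <n_r> of Eq. (9)."""
    return math.sqrt((1 - p) / p) / math.sqrt(N)


if __name__ == "__main__":
    for N in (20, 40, 400):
        print(N, max_abs_error(N, 0.5), relative_width(N, 0.5))
```

Printing the error for a few values of $N$ gives a feeling for how quickly the higher-order terms $B_k$ of the expansion (10) become irrelevant; the pointwise comparison is one of several possible error measures and is chosen here only for simplicity.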
Additionally for a fixed the probability that u has the value ui and v the value vj . N , the central limit theorem tells us that the measured We require the normalization values for individual random walks are spread around the N X M X mean. This spread is characterized by the relevant stanP (ui , vj ) = 1. 7 dard deviation . i=1 j=1 We showed that for a random walk, the number of right steps obeys a Gaussian distribution (11) when N is large. We can use the relations (2) and (3) to show that also other resultant random variables are distributed normally. For example, let us study the average displacement m̄ = m/N . Now, m = 2nr − N , hm̄i = Naturally, P (ui ) = M X P (ui , vj ) and P (vj ) = N X j=1 σ2 hmi 2 = µ , σm̄ . = N N P (ui , vj ). i=1 When the outcome of one variable is independent of the outcome of the other, we say that the two variables are statistically independent, or uncorrelated. In this case, Because the change of variables is linear, we have that dn (m̄−hm̄i)2 − 1 r 2 2σm̄ e PN (m̄) = WN (nr ) . = p 2 dm 2πσm̄ P (ui , vj ) = P (ui )P (vj ). Let F (u, v), G(u, v), f (u) and g(v) be arbitrary functions. Now, distribution 8 hF i = 6 N X M X F (ui , vj )P (ui , vj ) i=1 j=1 hF + Gi = hF i + hGi 4 hf i = 2 M N X X f (ui )P (ui , vj ) = i=1 j=1 - 0.15 - 0.10 - 0.05 0.00 0.05 0.10 0.15 average displacement N X f (ui )P (ui ) i=1 hf gi = hf ihgi, where the last equality holds only for uncorrelated u and v. The proofs are left as an exercise. In the figure, N = 400 and p = 0.5. We have calculated m̄ for 3000 random walks. The height of the bars denote the number of random walks with m̄ within the interval m + δm. The interval δm is the width of the bars. The height of the bars is normalized so, that the total area under them is 1. The red curve is the Gaussian distribution with 1 hm̄i = 0 and σm̄ = √ . 
N Continuous distributions We have already had one example in the previous sections on the probability distribution of a continuous random variable in the form of the Gaussian. Let us consider such cases more generally. Assume that u can obtain any value in the continuous range a1 < u < a2 . The probability of finding the variable between u and u + du is 2.4 Several Variables and Continuous Distributions P (u) = P(u)du, (12) Several variables In statistical physics, the typical problems deal with sys- where P is the probability density. The normalization tems with 1024 , or so, particles. Therefore, it is evident condition and the definition of the expectation value are that we have to deal with probability distributions with straightforwardly generalized from the discrete case. Thus, Z a2 more than one variable. We consider here the case of two P(u)du = 1 (13) variables, but the results are straightforwardly generaliza1 able to cases with any number of variables. Z a2 Assume that we have two random variables u and v with and possible values ui and vj , where i = 1, . . . , N and j = hf (u)i = f (u)P(u)du, (14) a1 1, . . . , M . We denote with where f is an arbitrary function of u. These results are P (ui , vj ) easily generalized to the cases where one has more than 7 The mean is only the most probable outcome. For example, if one continuous random variable (exercise). you toss a coin four times, the mean is two heads and two tails. But, sometimes you might get all four tails! 6 3. Statistical Formulation Assume that we have tossed two heads. This is the ”macrostate” of the three coin system. Nevertheless from In this chapter, we will present the formal description of the statistical approach to the analysis of the internal mo- the Table, we see that we can obtain two heads in three diftions of a many-body system. This approach holds for sys- ferent ways. If we do not posses any additional knowledge tems in equilibrium 8 . 
The non-equilibrium statistics will on the system (e.g. labels on the coins), we can only say that the system is after the toss in one of the corresponding be studied in the latter part of the course. three ”microstates” 2,3 or 4. Assume then that the number of coins is of the order of NA . One can easily imagine that there exists, in general, We want to describe a system that is macroscopic, by an extremely large number of possible states for a given which we mean that the number N of its constituent par- value of heads. ticles obeys We can generalize the above coin flip example to the 1 √ 1. (15) system consisting of macroscopic number of microscopic N particles. The extensive macroscopic parameters, that deCharacterization of the state Typically, N ≈ NA ≈ 1024 . This means that the statistical arguments discussed in the previous chapter can be applied. termine the macroscopic state, give us only partial information about the system. Nevertheless, they impose constraints to the possible states the system is allowed to occupy. The macroscopic state, i.e macrostate, of the system Macroscopic state can be characterized with the so-called is a function of a set of f coordinates, where f denotes the state variables. Extensive variables are proportional to the number of the degrees of freedom in the system (i.e. the posize of the system, i.e. to the particle number N or to sition coordinates and the possible spins of the constituent the volume V . For example, the total energy E of N nonparticles) and is proportional to the number N of the coninteracting particles is stituent particles. Any given macrostate can be achieved X by several different sets of values for the coordinates. Each E= ni εi , (16) such set is called a microstate. i where ni denotes the number of particles with energy εi . Naturally, X N= ni . 
(17) Postulate of equal a priori probabilities From the simple coin flip example, we can see that for a given macrostate the number of possible microstates can i increase very rapidly as the number of random variables Intensive variables are independent on the size, and can increases. What can we then say about the probability of be determined from any sufficiently large volume element finding the system in any of its allowed microstates? As of the system. Examples include temperature T , pressure an example, consider an isolated system. By definition, p, chemical potential µ, ratios of extensive variables, such the energy is conserved which imposes a constraint on the possible microstates the system can occupy. Usually, there as density ρ = N/V . are however many such states. The macroscopic state, or macrostate, is determined by General statements can be made when the system is in the values of the extensive variables, e.g. N , V and E. equilibrium. The equilibrium is characterized by the fact that the probability of finding the system in any of its Statistical ensemble accessible microstates is independent on time. There is nothing in the laws of physics that would make one assume that some of these microstates would be occupied more Example: Coin flips frequently than the others. So, in the case of any other Assume that we flip three coins. There are altogether prior knowledge, we assume that: four different possible outcomes: 0,1,2 or 3 heads. Isolated system in equilibrium is equally likely State index Outcome Number of heads to be found in any of its accessible states. 1 hhh 3 In statistical physics, the above assumption is called the 2 hht 2 postulate of equal a priori probabilities. Therefore, it can3 hth 2 not be proven but one can only derive its consequences and 4 thh 2 compare them with measurements. This kind of measure5 htt 1 ments have been done in multitude, and the agreement has 6 tht 1 9 so far been very good . 
7 tth 1 9 8 ttt 0 We naturally assume this postulate when we play a game of chance, such as coin toss, dice toss, or cards. In the absence of any additional information, we always assume that the different outcomes occur with equal probabilities, as we discussed in the first dice toss example! 8 We will define the equilibrium in the course of the discussion. At the moment, it is sufficient to consider it as the state of the system which does not show any temporal macroscopic changes. 7 3. τ ≈ texp : Distribution over accessible states is not uniform and changes in the experimental time scales. Problem cannot be reduced to a discussion of equilibrium situations. Approach to equilibrium The postulate of a priori probabilities indicates that the equilibrium of the isolated system is characterized by a uniform distribution of its accessible microstates. The accessible states are those that do not violate the constraints According to the H-theorem, after the relaxation the set by the known values of systems macroscopic parameters quantity H is a static constant, and can be written for y1 , y2 , . . ., yn (such as E, V and N ). Thus, the number of an isolated system as the accessible states can be written in functional form X Heq = Pr ln Pr = − ln Ωf . Ω = Ω(y1 , y2 , . . . , yn ), r meaning that the parameter α lies in the range yα and Thus, we see that yα + δyα . If some constraints of an isolated system are Assume that initially the isolated system is in equilib- removed then the parameters of the system tend rium, meaning that it can be found equally likely in any to spontaneously readjust themselves in such a of its Ωi states. A removal of a constraints leads to an way that increase of accessible states Ω(y1 , . . . , yn ) → maximum. Ωf ≥ Ω i . We call process the transition from the initial to the final macrostate. When Ωf = Ωi , the process is called reversible. 
The initial macrostate can be recovered by reimposing the constraint since the system is throughout the process in equilibrium. If Ωf > Ωi the corresponding process is irreversible. The imposition or removal of additional conConsider a system in such a state of non-equilibrium. Let straint cannot spontaneously recover the initial macrostate Pr be the probability of the system to be in the microstate because of the uniform distribution over Ωf states. r. The so-called H-theorem 10 says that (exercise) We see that the quantity ln Ω measures the degree of dH irreversibility of the macrostate of an isolated system in ≤ 0, (18) dt equilibrium. Especially for any physical process operating on an isolated system, where H = hln Pr i and the equality holds when Pr = Ps for all microstates r and s. This means that the nonln Ωf − ln Ωi ≥ 0. equilibrium state changes so, that the quantity H decreases until the uniform equilibrium distribution of the For historical reasons, this measure of irreversibility is written in the units joules/kelvin. Thus, it is multiplied with microstates is reached. the Boltzmann constant (kB = 1.381 · 10−23 J/K) The H-theorem guarantees that isolated systems approach the equilibrium, eventually. The time scale that S = kB ln Ω, (19) characterizes the process is called the relaxation time τ . The relaxation time depends on the particular system, es- and termed the (equilibrium) entropy. pecially on the inter-particle interactions. The postulate of equal a priori probabilities holds for the isolated systems Implications of the statistical definition of enin equilibrium. In order for the postulate to hold, it is tropy therefore important to wait a number of relaxation times The H-theorem gives straightforwardly the second law of after the decoupling of the system from the environment. 
thermodynamics 11 : By comparing with the experimental time scale texp we can The entropy of an isolated system tends to inseparate three distinctive cases: crease with time and can never decrease. 1. τ texp : System reaches equilibrium very quickly As one goes to lower energies, one has to turn into quanand equilibrium statistics are applicable. tum mechanical treatment12 . This is because in classical physics there is no lower bound on the energy of the system. 2. τ texp : Equilibrium is reached very slowly. In fact, Accordingly, we can write the third law of thermodynamics: it is so slow in the experimental time scale that it can 11 In thermodynamics, the second law results after rather cumberbe considered to be in equilibrium (with additional some considerations of Carnot cycles. This has been extensively studconstraints that prevent the reaching to the equilib- ied in in the Thermodynamics course and will be omitted here. 12 In order to have unique value for the entropy, one actually has rium). Hence, the statistical arguments apply. When Ωf > Ωi , this means that immediately after the removal of the constraint the system is not distributed equally between its allowed microstates, it is not in equilibrium, and its macroscopic parameters change in time. 10 The to treat it quantum mechanically. This is because the volume elements in classical phase-space (discussed later) are not bounded by the uncertainty principle. theorem was first proposed and proven by L. Boltzmann in 1872. 8 Entropy of a system approaches to a constant 4. Statistical Thermodynamics value as the temperature approaches zero. The previous chapter indicates that the statistiThe constant is determined solely by the degeneracy of cal physics of a macrostate follows from the number the quantum mechanical ground state of the system. In Ω(E, V, N ) of allowed microstates. 
In general, this number the case of non-degenerate ground state (one accessible mi- is a function of the macroscopic energy of the system, in crostate, Ω = 1), the entropy is zero. addition to the external variables such as the particle number and volume. We leave the detailed computations of Ω Probability calculations to the next chapter, and concentrate, in this chapter, in its Consider an isolated system in equilibrium whose (con- relation with thermodynamic quantities. It is worthwhile stant) energy is known to be somewhere between E and to remember that thermodynamics is expressed in terms of macroscopic parameters, not relying on any microscopic E + δE. We denote with Ω(E) the total number of states with energy in this range. Let us assume that we measure knowledge on the matter under study. the property y, and let yk be the possible outcomes of that measurement and Ω(E; yk ) the number of the state with such outcome. Based on our discussions in the previous chapter, we form an ensemble of the Ω(E) states. Then, our fundamental postulate of a priori probabilities tells us that each of the Ω(E) states in the ensemble is equally likely to occur. Thus, the probability of obtaining yk from the measurement is P (yk ) = We have learned that the macroscopic quantities are statistical in nature. This means that in equilibrium their values are statistical averages of NA , or so, particles. The largeness of NA means that the distributions of the averages are Gaussians whose deviations are small compared with the most probable value, e.g. 1 σE ∼√ . hEi NA Ω(E; yk ) . Ω(E) Thus, we can replace macroscopic quantities, such as energy, temperature and pressure, by their mean values. This We assume in the above that δE is sufficiently large, so is the essence of classical thermodynamics. that we can take Ω(E) → ∞. This is generally true in macroscopic systems. 
4.1 Contact Between Statistics and Thermodynamics

We see that the statistical physics calculations reduce simply to counting states, subject to relevant constraints (such as for $E$ in the above discussion).

Thermal interaction

Consider two systems $A_1$ and $A_2$ separately in equilibrium. The number of microstates in system $A_1$ in a macrostate with energy in the range $E_1$ to $E_1 + \delta E_1$ is denoted with $\Omega_1(E_1)$. Correspondingly, we denote the number of microstates of the system $A_2$ with $\Omega_2(E_2)$. Assume that the systems are brought into thermal contact, meaning that they can exchange energy, but that their particle numbers and volumes stay fixed. We thus assume that the energies $E_1$ and $E_2$ can vary, but the variations are restricted by the energy conservation condition

$$E^{(0)} = E_1 + E_2 = \text{constant}.$$

Each microstate in $A_1$ can be combined with a state in $A_2$, so the total number of microstates in the composite system is

$$\Omega^{(0)} = \Omega_1(E_1)\,\Omega_2(E_2) = \Omega_1(E_1)\,\Omega_2(E^{(0)} - E_1).$$

Based on the discussion in the previous chapter, after we have brought the systems together and waited several relaxation times, the combined system can be assumed to have reached equilibrium. We have shown with the H-theorem that the equilibrium is such that $\Omega^{(0)}$ is maximized. Thus (we use partial derivatives since we assume a purely thermal interaction in which the other parameters are kept fixed)

$$\left(\frac{\partial \Omega_1(E_1)}{\partial E_1}\right)_{E_1=\langle E_1\rangle} \Omega_2(\langle E_2\rangle) + \Omega_1(\langle E_1\rangle) \left(\frac{\partial \Omega_2(E_2)}{\partial E_2}\right)_{E_2=\langle E_2\rangle} \frac{\partial E_2}{\partial E_1} = 0.$$

This implies ($\partial E_2/\partial E_1 = -1$) that the equilibrium condition of the composite system can be written as

$$\left(\frac{\partial \ln \Omega_1(E_1)}{\partial E_1}\right)_{E_1=\langle E_1\rangle} = \left(\frac{\partial \ln \Omega_2(E_2)}{\partial E_2}\right)_{E_2=\langle E_2\rangle}$$

or

$$\beta_1 = \beta_2,$$

where

$$\beta = \left(\frac{\partial \ln \Omega(N, V, E)}{\partial E}\right)_{E=\langle E\rangle}. \tag{20}$$

The parameter $\beta$ has the following properties:

1. If two systems in equilibrium with the same value of $\beta$ are brought into thermal contact, they will remain in equilibrium.

2. If two systems in equilibrium with different values of $\beta$ are brought into thermal contact, the system with the higher value of $\beta$ will absorb heat from the other until the two $\beta$ values are the same (exercise).

In addition, we notice that $\beta$ has the units of inverse energy. We define the thermodynamic or absolute temperature with

$$\frac{1}{T} = k_B \beta = \left(\frac{\partial S(E, V, N)}{\partial E}\right)_{V,N,E=\langle E\rangle}, \tag{21}$$

where the subscripts contain the assumption of constant volume and particle number. We thus see that the thermodynamic temperature of a macroscopic system depends only on the rate of change of the number of accessible microstates with the total energy.

According to the previous discussion, it is easy to arrive at the zeroth law of thermodynamics:

If two systems are separately in thermal equilibrium with a third system then they must also be in thermal equilibrium with one another.

General interaction

In the most general case, the external parameters do not remain fixed and the system is not thermally insulated. Additionally, the (essentially quantum mechanical) energies $E_r$ of the accessible microstates depend on the external parameters, $E_r = E_r(x_1, \ldots, x_n)$, where $x_i$ are the relevant external macroscopic parameters. For simplicity, we consider here the case where only one external parameter $x$ is allowed to vary. In addition to the constraint due to $x$, we assume that the energy of the system is found in the interval $E$ to $E + \delta E$. When we change $x$ by $dx$ the energy $E_r$ of a microstate $r$ changes by $(\partial E_r/\partial x)\,dx$.

We want to calculate how the total number of accessible states $\Omega(E, x)$ changes as we increase $x$ by $dx$. First, we define the generalized force conjugate to the parameter $x$ as

$$X = -\frac{\partial E_r}{\partial x}. \tag{22}$$

For a fixed value of $X$ the number of such states that are located within energy $-X\,dx$ below $E$, and change to a value greater than $E$, is

$$\sigma_X(E) = -\frac{\Omega_X(E, x)}{\delta E}\, X\, dx,$$

where $\Omega_X(E, x)$ is the number of such microstates that have their energy $E_r$ within the interval above, and the generalized force within $X$ and $X + \delta X$.

Thus, we obtain the total number of states $\sigma(E)$ that change from an energy value lower than $E$ to one that is larger

$$\sigma(E) = \sum_X \sigma_X(E) = -\frac{\Omega(E, x)}{\delta E}\, \bar{X}\, dx,$$

where

$$\bar{X} = \frac{1}{\Omega(E, x)} \sum_X \Omega_X(E, x)\, X$$

is the average of the generalized forces over all accessible states obeying the uniform distribution. We have also denoted the total number of states with energy within $E$ to $E + \delta E$ as

$$\Omega(E, x) = \sum_X \Omega_X(E, x).$$

The total number of states within $E$ and $E + \delta E$ changes as

$$\frac{\partial \Omega(E, x)}{\partial x}\, dx = \sigma(E) - \sigma(E + \delta E) = -\frac{\partial \sigma}{\partial E}\, \delta E.$$

We obtain

$$\frac{\partial \Omega}{\partial x} = \frac{\partial}{\partial E}\left(\Omega \bar{X}\right).$$

This implies that

$$\frac{\partial \ln \Omega}{\partial x} = \frac{\partial \ln \Omega}{\partial E}\, \bar{X} + \frac{\partial \bar{X}}{\partial E}.$$

Since $\bar{X}$ is intensive but $\Omega$ and $E$ are extensive variables, one sees that the second term is negligible compared with the first in the macroscopic limit.

Thus, we have obtained that a change in an external parameter results in a mean generalized force given by the relation

$$\frac{\partial S}{\partial x} = \frac{1}{T}\, \bar{X}. \tag{23}$$

In general, if we calculate the infinitesimal change in entropy $S = S(E, x_1, \ldots, x_n)$, we obtain

$$dS = \left(\frac{\partial S}{\partial E}\right)_{x_1,\ldots,x_n} dE + \sum_{i=1}^{n} \left(\frac{\partial S}{\partial x_i}\right)_{E,x_1,\ldots,x_{i-1},x_{i+1},\ldots,x_n} dx_i, \tag{24}$$

where the subscripts denote the variables that are kept constant during partial derivation.

In order to apply the results (20) and (23), one should take care that the process leading to the change in $S$ is reversible. This guarantees that the average values are well defined throughout the process. The reversibility is achieved by requiring that the changes in $E$ and the external parameters $x_i$ are carried out so slowly that the system remains arbitrarily close to equilibrium at all stages of the process. This is called a quasi-static process. The slowness is, naturally, a characteristic of the system and determined by the length of the relaxation time. For a quasi-static process, one can write

$$dS = \frac{1}{T}\, dE + \frac{1}{T} \sum_{i=1}^{n} \bar{X}_i\, dx_i. \tag{25}$$

We see that the latter term is a sum of components of the form generalized force times a change in the corresponding coordinate. Thus, it can be related to the decrease in the internal energy $dE$ during the process. Accordingly, we call it the work done by the system and denote it with

$$đW = \sum_{i=1}^{n} \bar{X}_i\, dx_i.$$

The change in the internal energy caused by the thermal interaction is called heat

$$đQ = dE + đW. \tag{26}$$

One should pay attention to the notation. The special symbols $đQ$ and $đW$ denote the fact that the heat and work are infinitesimal quantities that do not correspond to any differences between two heats or two works of the initial and final states of the process. Such infinitesimal quantities are called inexact differentials. In contrast, the differentials of state variables (such as $E$, $p$, $V$, $T$, ...) that have well defined values for the initial and final states are referred to as exact differentials.

Generally speaking, consider a function $F(x, y)$ of two independent variables $x$ and $y$. The differential

$$dF = \frac{\partial F}{\partial x}\, dx + \frac{\partial F}{\partial y}\, dy$$

is exact if

$$\frac{\partial^2 F}{\partial x\, \partial y} = \frac{\partial^2 F}{\partial y\, \partial x}.$$

Thus, $dF$ is integrable and

$$\int_1^2 dF = F(2) - F(1), \tag{27}$$

meaning that the integral of an exact differential does not depend on the path of integration.

In the case of inexact differentials, the integrals depend on the path taken, but there always exists an integrating factor $\lambda(x, y)$ so that

$$\lambda\, đF = df$$

is an exact differential. For example, the integrating factor of heat is $1/T$ [13]:

$$dS = \frac{đQ}{T}.$$

[13] Note that $dS \geq đQ/T$ for an arbitrary process; the equality holds only for reversible ones.

Now, the total change in the mean energy during the process is

$$\Delta E = E_f - E_i = Q - W, \tag{28}$$

where $W$ is the macroscopic work done by the system as a result of the change in external parameters. The quantity $Q$ is the heat absorbed by the system, i.e. the change in the macroscopic energy due to the thermal interaction. For a quasi-static process, this is a restatement of the first law of thermodynamics, which in its most general form says:

Energy of an isolated system is conserved.

Equilibrium conditions

Let us consider the case with an exchange of the external parameters $V$ and $N$ between the subsystems $A_1$ and $A_2$, in addition to energy. Again, we consider a process leading from initial to final equilibrium. The change in entropy $S_1 = S_1(E_1, V_1, N_1)$ due to infinitesimal changes in the arguments can be written as

$$dS_1 = \left(\frac{\partial S_1}{\partial E_1}\right)_{V_1,N_1} dE_1 + \left(\frac{\partial S_1}{\partial V_1}\right)_{E_1,N_1} dV_1 + \left(\frac{\partial S_1}{\partial N_1}\right)_{E_1,V_1} dN_1.$$

One can write a similar formula for the change in entropy of the system $A_2$ by replacing $1 \to 2$ in the above formula. Due to the fact that the total system is isolated, we have that $dE_2 = -dE_1$, $dV_2 = -dV_1$ and $dN_2 = -dN_1$. Thus, the change in entropy of the total system can be written as

$$dS = dS_1 + dS_2 = \left[\left(\frac{\partial S_1}{\partial E_1}\right)_{V_1,N_1} - \left(\frac{\partial S_2}{\partial E_2}\right)_{V_2,N_2}\right] dE_1 + \left[\left(\frac{\partial S_1}{\partial V_1}\right)_{E_1,N_1} - \left(\frac{\partial S_2}{\partial V_2}\right)_{E_2,N_2}\right] dV_1 + \left[\left(\frac{\partial S_1}{\partial N_1}\right)_{E_1,V_1} - \left(\frac{\partial S_2}{\partial N_2}\right)_{E_2,V_2}\right] dN_1. \tag{29}$$

In equilibrium, small deviations in the total entropy have to vanish, leading to new equilibrium conditions

$$\left(\frac{\partial S_1}{\partial V_1}\right)_{E_1,N_1} = \left(\frac{\partial S_2}{\partial V_2}\right)_{E_2,N_2} \tag{30}$$

$$\left(\frac{\partial S_1}{\partial N_1}\right)_{E_1,V_1} = \left(\frac{\partial S_2}{\partial N_2}\right)_{E_2,V_2}. \tag{31}$$

Recall from thermodynamics that the definitions of pressure and chemical potential are respectively

$$\left(\frac{\partial E}{\partial V}\right)_{S,N} = -p, \qquad \left(\frac{\partial E}{\partial N}\right)_{S,V} = \mu.$$

Using the mathematical identities, valid for any function $f$ of variables $x$ and $y$,

$$\left(\frac{\partial f}{\partial x}\right)_y \left(\frac{\partial x}{\partial y}\right)_f \left(\frac{\partial y}{\partial f}\right)_x = -1, \qquad \left(\frac{\partial f}{\partial x}\right)_y = \left(\frac{\partial x}{\partial f}\right)_y^{-1},$$

one arrives at the statistical definition for pressure

$$\frac{p}{T} = \left(\frac{\partial S}{\partial V}\right)_{E,N} \tag{32}$$

and that for chemical potential

$$-\frac{\mu}{T} = \left(\frac{\partial S}{\partial N}\right)_{E,V}. \tag{33}$$

What we have obtained is the condition of equilibrium in the case where all the state variables $E$, $V$ and $N$ are allowed to vary

$$T_1 = T_2, \qquad p_1 = p_2, \qquad \mu_1 = \mu_2. \tag{34}$$

Also, we have obtained the first law of thermodynamics for infinitesimal reversible processes

$$dE = đQ - đW = T\,dS - p\,dV + \mu\,dN. \tag{35}$$

This is often called the fundamental thermodynamic relation. It should be noted that the differential of the internal energy consists of terms $A\,dB$, in which one of the two variables is always extensive and the other intensive. This observation holds generally.

Note also that for thermally isolated systems

$$dS = \frac{đQ}{T} = 0,$$

which is the definition of an adiabatic process.

4.2 Equations of State

Relations (20) and (23) are of particular interest

$$\frac{1}{T} = \frac{\partial S}{\partial E}, \qquad \frac{\partial S}{\partial x} = \frac{1}{T}\,\bar{X}. \tag{36}$$

Namely, they constitute a way to find relations between temperature, generalized forces (like $p$ and $\mu$) and external parameters (like $N$ and $V$). The relations are called equations of state and are important since they connect quantities that are readily measurable. We will list here some examples of well known equations of state.

Ideal gas

Let us consider a gas of $N$ monatomic particles. We assume that the gas is confined in a container of volume $V$, and that the interactions between the particles and quantum effects are negligible. This is called the classical ideal gas. The ideal gas is a good approximation for dilute gases that have $\rho = N/V \to 0$. For such a system, one can readily write the dependence of $\Omega(E, V, N)$ on $V$. We will settle for now for an order-of-magnitude calculation, and improve it once we start to discuss the calculation of $\Omega$ more generally.

Let us consider a single particle in the gas. Since the system has a fixed number of particles, the number of its accessible microstates depends only on $E$ and $V$, as $\Omega(E, V)$. We have to find the number of states within the interval $E$ to $E + \delta E$ and in the volume $V$. Thus, $\Omega$ has to be proportional to the total volume $V$ and that of the energy space. Since we assume that the particles are classical and that the interactions are negligible, they can occupy the same space. For the $N$ particle system, we end up with

$$\Omega(E, V) \propto V^N \chi(E), \tag{37}$$

where $\chi(E)$ does not depend on volume.

Now, the calculation of the equation of state is straightforward. We have

$$S = k_B \ln \Omega = k_B N \ln V + k_B \ln \chi(E) + \text{constant},$$

which gives the pressure

$$p = k_B T\, \frac{N}{V} \qquad \text{or} \qquad pV = N k_B T. \tag{38}$$

Chemists prefer the form

$$pV = nRT,$$

where $n = N/N_A$ is the number of moles of the gas particles and $R = k_B N_A$ is the gas constant. Note that neither the equation of state (38) nor $R$ depends on the kind of molecules constituting the gas.

In order to acquire complete knowledge of the macrostate of the ideal gas, one should also calculate the equation of state for the internal energy $E$ from the equation

$$\frac{1}{T} = \frac{\partial S}{\partial E} = k_B\, \frac{\partial \ln \chi(E)}{\partial E}.$$

This means that the mean energy $E = E(T)$ is a function of temperature only, not of volume $V$. The explicit dependence requires knowledge of $\chi(E)$. We will return to its determination later.

If the gas is composed of $m$ different species of molecules, the equation of state is still (38) where

$$N = \sum_{i=1}^{m} N_i \qquad \text{and} \qquad p = \sum_{i=1}^{m} p_i.$$

Here

$$p_i = N_i k_B T / V$$

is the partial pressure of the $i$th gas.

Virial expansion of real gases

The interactions between the constituent molecules become important in real gases. Nevertheless, the equation of state of any gas can be written as a general series in the particle density $\rho = N/V$ as

$$p = k_B T \left[\rho + B_2(T)\rho^2 + B_3(T)\rho^3 + \cdots\right], \tag{39}$$

where $B_k$ is the $k$th virial coefficient.

van der Waals equation

The interactions between the molecules of real gases are

• repulsive at short distances (due to the Pauli principle); every particle needs at least the volume $b$ which implies $V \gtrsim Nb$.

• attractive at large distances due to the induced dipole moments. The pressure decreases when two particles are separated by the attraction distance. The probability of this is $\propto \rho^2$.

This can be modelled with the Lennard-Jones potential

$$V_{LJ}(r) = \varepsilon \left[\left(\frac{r_m}{r}\right)^{12} - 2\left(\frac{r_m}{r}\right)^{6}\right],$$

where $\varepsilon$ is the depth of the potential and $r_m$ is the distance where the potential is at minimum.

[Figure: the Lennard-Jones potential $V_{LJ}$ as a function of $r/r_m$, with the repulsive and attractive regions marked.]

We improve the ideal gas equation of state

$$p'V' = N k_B T$$

with

$$V' = V - Nb \qquad \text{and} \qquad p = p' - a\rho^2 = \text{true pressure}.$$

Thus,

$$(p + a\rho^2)(V - Nb) = N k_B T.$$

Stretched wire

Consider a wire at rest with length $L_0$. The stretching of the wire to length $L$ causes the tension

$$\sigma = E(T)(L - L_0)/L_0,$$

where $E(T)$ is the temperature dependent constant of elasticity.

Surface tension

Surface tension is a property of the surface of a liquid that allows it to resist external forces. It has the temperature dependence

$$\sigma = \sigma_0 \left(1 - \frac{T}{T_C}\right)^{n},$$

where the temperature is measured in °C, $T_C$ and $n$ are dependent on the liquid, and $\sigma_0$ is the tension at $T = 0$ °C.

Electric polarization

When a piece of material is in an external electric field $\mathbf{E}$ we define

$$\mathbf{D} = \epsilon_0 \mathbf{E} + \mathbf{P},$$

where

$\mathbf{D}$ = electric flux density
$\mathbf{P}$ = electric polarization = atomic total dipole moment density.

If the material is homogeneous and dielectric it has

$$\mathbf{P} = \left(a + \frac{b}{T}\right)\mathbf{E},$$

where $a$ and $b$ are almost constant and $a, b \geq 0$.

Curie's law

When a piece of paramagnetic material is in a magnetic field $\mathbf{H}$ we write

$$\mathbf{B} = \mu_0(\mathbf{H} + \mathbf{m}),$$

where

$\mathbf{B}$ = magnetic flux density
$\mathbf{m}$ = magnetic polarization = atomic total magnetic moment density.

Polarization obeys roughly Curie's law so that

$$\mathbf{m} = \frac{\rho C}{T}\, \mathbf{H},$$

where $C$ is a constant related to the individual atom.

4.3 Thermodynamic Potentials

Consider a conservative mechanical system, such as a spring, where work can be stored in the form of potential energy and subsequently retrieved. Under certain circumstances the same is true for thermodynamic systems. Energy can be stored in a thermodynamic system by doing work on it through a reversible process. This energy can later be retrieved in the form of work. This kind of energy that can be stored and retrieved in the form of work is called free energy. We will show in the following that there are as many free energies as there are combinations of constraints. The free energies are analogous to the potential energy of a spring and are, thus, called thermodynamic potentials.

Internal energy

The internal energy of a thermodynamic system is determined as a sum of all forms of energy intrinsic to the system. According to the first law (35),

$$dE = T\,dS - p\,dV + \mu\,dN,$$

which means that $S$, $V$ and $N$ are the natural variables of the internal energy [14]

$$E = E(S, V, N).$$

[14] Often internal energy is also denoted with $U$.

The first law implies the relations

$$\left(\frac{\partial E}{\partial S}\right)_{V,N} = T, \qquad \left(\frac{\partial E}{\partial V}\right)_{S,N} = -p, \qquad \left(\frac{\partial E}{\partial N}\right)_{S,V} = \mu.$$

Because $E$, $S$, $V$ and $N$ are extensive, we have that

$$E(\lambda S, \lambda V, \lambda N) = \lambda E(S, V, N),$$

where $\lambda$ is an arbitrary constant characterizing the size of the system. Assume then that we make infinitesimal changes to the internal energy, i.e. $S \to S + \epsilon S$, $V \to V + \epsilon V$ and $N \to N + \epsilon N$ with an infinitesimal $\epsilon$. Now

$$E(S + \epsilon S, V + \epsilon V, N + \epsilon N) = E(S, V, N) + \left(\frac{\partial E}{\partial S}\right)_{V,N} \epsilon S + \left(\frac{\partial E}{\partial V}\right)_{S,N} \epsilon V + \left(\frac{\partial E}{\partial N}\right)_{S,V} \epsilon N.$$

On the other hand, we have that

$$E(S + \epsilon S, V + \epsilon V, N + \epsilon N) = E(S, V, N) + \epsilon E(S, V, N).$$

We end up with the Euler equation for homogeneous functions

$$E = S\left(\frac{\partial E}{\partial S}\right)_{V,N} + V\left(\frac{\partial E}{\partial V}\right)_{S,N} + N\left(\frac{\partial E}{\partial N}\right)_{S,V},$$

or,

$$E = TS - pV + \mu N, \tag{40}$$

which is the fundamental equation.

The combinations of parameters in the above equation are such that they lead to an exact differential for $E$. Thus, the second derivatives of $E$ have to be independent of the order of differentiation, e.g.

$$\frac{\partial^2 E}{\partial V\, \partial S} = \frac{\partial^2 E}{\partial S\, \partial V}.$$

This leads to a useful set of so-called Maxwell relations

$$\left(\frac{\partial T}{\partial V}\right)_{S,N} = -\left(\frac{\partial p}{\partial S}\right)_{V,N}$$

$$\left(\frac{\partial T}{\partial N}\right)_{S,V} = \left(\frac{\partial \mu}{\partial S}\right)_{V,N}$$

$$\left(\frac{\partial p}{\partial N}\right)_{S,V} = -\left(\frac{\partial \mu}{\partial V}\right)_{S,N}.$$

In an irreversible process,

$$T\,\Delta S > \Delta Q = \Delta E + \Delta W,$$

which means that

$$\Delta E < T\,\Delta S - p\,\Delta V + \mu\,\Delta N.$$

If $S$, $V$ and $N$ stay constant in the process then the internal energy decreases. Thus we can deduce that

In an equilibrium with given $S$, $V$ and $N$ the internal energy is at the minimum.

We consider a reversible process in a thermally isolated system

[Figure: a system with volumes $V_1$ and $V_2$ at pressures $p_1$ and $p_2$, a force $F$, and a displacement $\Delta L$ from the equilibrium position.]

We partition $\Delta W$ into the components

$\int p\,dV$ = work due to the change of the total volume
$\Delta W_{\rm free}$ = work done by the gas against the force $F$.

Now

$$\Delta W_{\rm free} = \Delta W_1 + \Delta W_2 = p_1 \Delta V_1 + p_2 \Delta V_2 = (p_1 - p_2)\,\Delta V_1 = (p_1 - p_2) A\,\Delta L = -F\,\Delta L.$$

According to the first law we have

$$\Delta E = \Delta Q - \Delta W = \Delta Q - \int p\,dV - \Delta W_{\rm free} = \Delta Q - \Delta W_{\rm free}.$$

Because now $\Delta Q = 0$, we have

$$\Delta E = -\Delta W_{\rm free} = F\,\Delta L,$$

i.e. when the variables $S$, $V$ and $N$ are kept constant the change of the internal energy is completely exchangeable with the work. $\Delta E$ is then called free energy and $E$ thermodynamic potential.

Note: If there are irreversible processes in an isolated system ($V$ and $N$ constants) then

$$\Delta W_{\rm free} \leq -\Delta E.$$

Helmholtz free energy

Internal energy is the appropriate thermodynamic potential in processes where $S$, $V$ and $N$ are kept constant. Additionally, the internal energy can be determined as a function of these natural variables, and all thermodynamic properties of the system can be calculated by taking partial derivatives of the internal energy with respect to its natural variables. Often, the experimental circumstances are such that the macrostate of the studied system is more conveniently represented with a different set of independent state variables. The change in the variable set is made with the Legendre transformation.

For example, let us look at the function $f(x, y)$ of two variables. We denote

$$z = f_y = \frac{\partial f(x, y)}{\partial y}$$

and define the function

$$g = f - y f_y = f - yz.$$

Now,

$$dg = df - y\,dz - z\,dy = f_x\,dx + f_y\,dy - y\,dz - z\,dy = f_x\,dx - y\,dz.$$

Thus we can take $x$ and $z$ as independent variables of the function $g$, i.e. $g = g(x, z)$. Obviously

$$y = -\frac{\partial g(x, z)}{\partial z}.$$

Corresponding to the Legendre transformation $f \to g$, there is the inverse transformation $g \to f$

$$f = g - z g_z = g + yz.$$

In a process where $T$, $V$ and $N$ are constant, it is worthwhile to study the transformed potential

$$E \to A = E - S\left(\frac{\partial E}{\partial S}\right)_{V,N}$$

or

$$A = E - TS,$$

which is the (Helmholtz) free energy. Now

$$dA = -S\,dT - p\,dV + \mu\,dN,$$

so the natural variables of $A$ are $T$, $V$ and $N$. We immediately obtain the partial derivatives

$$S = -\left(\frac{\partial A}{\partial T}\right)_{V,N}, \qquad p = -\left(\frac{\partial A}{\partial V}\right)_{T,N}, \qquad \mu = \left(\frac{\partial A}{\partial N}\right)_{T,V}.$$

Due to the fact that $dA$ is an exact differential, we obtain the Maxwell relations

$$\left(\frac{\partial S}{\partial V}\right)_{T,N} = \left(\frac{\partial p}{\partial T}\right)_{V,N}$$

$$\left(\frac{\partial S}{\partial N}\right)_{T,V} = -\left(\frac{\partial \mu}{\partial T}\right)_{V,N}$$

$$\left(\frac{\partial p}{\partial N}\right)_{T,V} = -\left(\frac{\partial \mu}{\partial V}\right)_{T,N}.$$

Similar to the case with internal energy, we have for an irreversible change

$$\Delta A < -S\,\Delta T - p\,\Delta V + \mu\,\Delta N,$$

i.e. when the variables $T$, $V$ and $N$ are constant the system drifts to the minimum of the free energy. Correspondingly

$$\Delta W_{\rm free} \leq -\Delta A,$$

when $(T, V, N)$ is constant.

Helmholtz free energy is suitable for processes that are done at constant volume and temperature. The constant temperature can be achieved by embedding the studied system into a heat bath. Generally, a bath is an equilibrium system, much larger than the system under consideration, which can exchange a given extensive property with the system [15].

[15] In the case of a heat bath, the exchanged property is entropy, resulting in the constant temperature of the system determined by the bath.

Enthalpy

Using the Legendre transform

$$E \to H = E - V\left(\frac{\partial E}{\partial V}\right)_{S,N} = E + pV$$

we move from the variables $(S, V, N)$ to the variables $(S, p, N)$. The quantity

$$H = E + pV$$

is called enthalpy. Now,

$$dH = T\,dS + V\,dp + \mu\,dN.$$

From this we can read the partial derivatives

$$T = \left(\frac{\partial H}{\partial S}\right)_{p,N}, \qquad V = \left(\frac{\partial H}{\partial p}\right)_{S,N}, \qquad \mu = \left(\frac{\partial H}{\partial N}\right)_{S,p}.$$

The corresponding Maxwell relations are

$$\left(\frac{\partial T}{\partial p}\right)_{S,N} = \left(\frac{\partial V}{\partial S}\right)_{p,N}$$

$$\left(\frac{\partial T}{\partial N}\right)_{S,p} = \left(\frac{\partial \mu}{\partial S}\right)_{p,N}$$

$$\left(\frac{\partial V}{\partial N}\right)_{S,p} = \left(\frac{\partial \mu}{\partial p}\right)_{S,N}.$$

In an irreversible process one has

$$\Delta Q = \Delta E + \Delta W - \mu\,\Delta N < T\,\Delta S.$$

Now $\Delta E = \Delta(H - pV)$, so that

$$\Delta H < T\,\Delta S + V\,\Delta p + \mu\,\Delta N.$$

We see that

In a process where $S$, $p$ and $N$ are constant spontaneous changes lead to the minimum of $H$, i.e. in an equilibrium of a $(S, p, N)$-system the enthalpy is at the minimum.

The enthalpy is a suitable potential for an isolated system in a pressure bath ($p$ is constant).

Let us look at an isolated system in a pressure bath. Now

$$dH = dE + d(pV)$$

and

$$dE = đQ - đW + \mu\,dN.$$

Again we partition the work into two components:

$$đW = p\,dV + đW_{\rm free}.$$

Now

$$dH = đQ + V\,dp - đW_{\rm free} + \mu\,dN$$

and for a finite process

$$\Delta H \leq \int T\,dS + \int V\,dp - \Delta W_{\rm free} + \int \mu\,dN.$$

When $(S, p, N)$ is constant one has

$$\Delta H \leq -\Delta W_{\rm free},$$

i.e. $\Delta W_{\rm free}$ is the minimum work required for the change $\Delta H$.

Note: Another name for enthalpy is heat function (in constant pressure).

Gibbs' function

The Legendre transformation

$$E \to G = E - S\left(\frac{\partial E}{\partial S}\right)_{V,N} - V\left(\frac{\partial E}{\partial V}\right)_{S,N}$$

defines the Gibbs function or the Gibbs free energy

$$G = E - TS + pV.$$

Its differential is

$$dG = -S\,dT + V\,dp + \mu\,dN,$$

so the natural variables are $T$, $p$ and $N$. For the partial derivatives we can read the expressions

$$S = -\left(\frac{\partial G}{\partial T}\right)_{p,N}, \qquad V = \left(\frac{\partial G}{\partial p}\right)_{T,N}, \qquad \mu = \left(\frac{\partial G}{\partial N}\right)_{T,p}.$$

From these we obtain the Maxwell relations

$$\left(\frac{\partial S}{\partial p}\right)_{T,N} = -\left(\frac{\partial V}{\partial T}\right)_{p,N}$$

$$\left(\frac{\partial S}{\partial N}\right)_{T,p} = -\left(\frac{\partial \mu}{\partial T}\right)_{p,N}$$

$$\left(\frac{\partial V}{\partial N}\right)_{T,p} = \left(\frac{\partial \mu}{\partial p}\right)_{T,N}.$$

In an irreversible process

$$\Delta G < -S\,\Delta T + V\,\Delta p + \mu\,\Delta N$$

holds, i.e. when the variables $T$, $p$ and $N$ stay constant the system drifts to the minimum of $G$. Correspondingly

$$\Delta W_{\rm free} \leq -\Delta G,$$

when $(T, p, N)$ is constant. The Gibbs function is suitable for systems which are allowed to exchange mechanical energy and heat.

Grand potential

The Legendre transform

$$E \to \Omega = E - S\left(\frac{\partial E}{\partial S}\right)_{V,N} - N\left(\frac{\partial E}{\partial N}\right)_{S,V}$$

defines the grand potential

$$\Omega = E - TS - \mu N.$$

Its differential is

$$d\Omega = -S\,dT - p\,dV - N\,d\mu,$$

so the natural variables are $T$, $V$ and $\mu$. The partial derivatives are now

$$S = -\left(\frac{\partial \Omega}{\partial T}\right)_{V,\mu}, \qquad p = -\left(\frac{\partial \Omega}{\partial V}\right)_{T,\mu}, \qquad N = -\left(\frac{\partial \Omega}{\partial \mu}\right)_{T,V}.$$

We get the Maxwell relations

$$\left(\frac{\partial S}{\partial V}\right)_{T,\mu} = \left(\frac{\partial p}{\partial T}\right)_{V,\mu}$$

$$\left(\frac{\partial S}{\partial \mu}\right)_{T,V} = \left(\frac{\partial N}{\partial T}\right)_{V,\mu}$$

$$\left(\frac{\partial p}{\partial \mu}\right)_{T,V} = \left(\frac{\partial N}{\partial V}\right)_{T,\mu}.$$

In an irreversible process

$$\Delta \Omega < -S\,\Delta T - p\,\Delta V - N\,\Delta \mu$$

holds, i.e. when the variables $T$, $V$ and $\mu$ are kept constant the system moves to the minimum of $\Omega$. Correspondingly

$$\Delta W_{\rm free} \leq -\Delta \Omega,$$

when $(T, V, \mu)$ is constant. The grand potential is suitable for systems that are allowed to exchange heat and particles.

4.4 Application of Thermodynamics

Specific heat

Consider an increase in temperature by $\Delta T$ due to absorption of heat $\Delta Q$ by the system. We define the specific heat or heat capacity as

$$C = \frac{\Delta Q}{\Delta T}.$$

If the increase in temperature is done without a change in volume, the process is called isochoric. According to the first law, we have for infinitesimal reversible processes

$$dE = T\,dS = đQ,$$

which results in the specific heat at constant volume

$$C_V = T\left(\frac{\partial S}{\partial T}\right)_{V,N} = \left(\frac{\partial E}{\partial T}\right)_{V,N}.$$

For (isobaric) processes at constant pressure, we have

$$dH = T\,dS$$

and

$$C_p = T\left(\frac{\partial S}{\partial T}\right)_{p,N} = \left(\frac{\partial H}{\partial T}\right)_{p,N}. \tag{41}$$

Compressibility and expansion

Typically, one can control $T$ and $p$ in the experiments (instead of $S$ and $V$). If we consider the isobaric change in volume as a function of temperature, we can define the volume heat expansion coefficient

$$\alpha_p = \frac{1}{V}\left(\frac{\partial V}{\partial T}\right)_{p,N}.$$

Similarly, for an isothermal volume change we define the compressibility

$$\kappa_T = -\frac{1}{V}\left(\frac{\partial V}{\partial p}\right)_{T,N}.$$

In the case of adiabatic compression, we define the adiabatic compressibility

$$\kappa_S = -\frac{1}{V}\left(\frac{\partial V}{\partial p}\right)_{S,N}.$$

The sound propagation is related to adiabatic compression of the gas, rather than isothermal, resulting in the relation

$$c_S = \sqrt{\frac{1}{\rho_m \kappa_S}}$$

between the velocity of sound and the adiabatic compressibility. In the above formula, we have denoted the mass density of the gas with $\rho_m$. One can show (exercise) that the two compressibilities are related by

$$\kappa_T = \kappa_S + \frac{V T \alpha_p^2}{C_p}.$$

Free expansion

Consider a thermally insulated rigid container with two compartments. Assume that the volume $V_1$ is filled with gas at temperature $T_1$. Open the valve separating the compartments so that the gas is free to expand and to fill the whole container with volume $V_2$.

[Figure: container with the gas initially in the volume $V_1$; $\Delta Q = \Delta W = 0$.]

We are interested in the temperature $T_2$ after the equilibrium is reached. It is worthwhile to remember that right after the valve is opened, the system is in a non-equilibrium state and that the expansion of the gas is irreversible. Now, the system is adiabatically insulated and thus

$$\Delta Q = 0.$$

Also, the gas does not do work in the process, i.e.

$$\Delta W = 0.$$

According to the first law,

$$\Delta E = 0.$$

This means that due to the conservation of energy we have that

$$E(T_2, V_2) = E(T_1, V_1).$$

Remember that for the ideal gas we had that $E(T)$ does not depend on volume. Thus, the temperature does not change in the free expansion of the ideal gas, i.e. $T_1 = T_2$. One can show that for a real gas (like a van der Waals gas) with interactions, the free expansion results in cooling, i.e.

$$T_2 < T_1.$$

Joule-Thomson (throttling) process

Consider then a gas flow in a pipe with thermally insulated walls.

[Figure: gas flow through a porous plug, with $p_1$, $V_1$ before the plug and $p_2$, $V_2$ after it.]

Introduction of a constriction to the flow, such as a porous plug, results in a constant pressure difference $p_1 > p_2$ across the constriction. In a steady state $p_1$ and $p_2$ are constant in time, and the process is again irreversible. Given the temperature $T_1$ before (on the left hand side of) the plug, we are interested in the temperature $T_2$ after the plug. When an infinitesimal amount of matter passes through the choke the work done by the system is

$$đW = p_2\,dV_2 + p_1\,dV_1.$$

                 V_1        V_2
Initial state    V_init     0
Final state      0          V_final

The work done by the system is

$$\Delta W = \int đW = p_2 V_{\rm final} - p_1 V_{\rm init}.$$

According to the first law we have

$$\Delta E = E_{\rm final} - E_{\rm init} = \Delta Q - \Delta W = -\Delta W,$$

so that

$$E_{\rm init} + p_1 V_{\rm init} = E_{\rm final} + p_2 V_{\rm final}.$$

Thus in this process, the enthalpy $H = E + pV$ is constant, i.e. the process is isenthalpic,

$$\Delta H = H_{\rm final} - H_{\rm init} = 0.$$

We consider now a reversible isenthalpic (and $dN = 0$) process initial → final. Here

$$dH = T\,dS + V\,dp = 0,$$

so

$$dS = -\frac{V}{T}\,dp. \tag{$*$}$$

Now $T = T(S, p)$, so that

$$dT = \left(\frac{\partial T}{\partial S}\right)_p dS + \left(\frac{\partial T}{\partial p}\right)_S dp.$$

On the other hand

$$\left(\frac{\partial T}{\partial S}\right)_p = \frac{T}{C_p},$$

where $C_p$ is the isobaric heat capacity (see Eq. (41)). Using the Maxwell relation

$$\left(\frac{\partial T}{\partial p}\right)_S = \left(\frac{\partial V}{\partial S}\right)_p$$

and the partial derivative relation

$$\left(\frac{\partial V}{\partial S}\right)_p = \left(\frac{\partial V}{\partial T}\right)_p \left(\frac{\partial T}{\partial S}\right)_p$$

we can write

$$dT = \frac{T}{C_p}\,dS + \frac{T}{C_p}\left(\frac{\partial V}{\partial T}\right)_p dp.$$

Substituting into this the differential $dS$ at constant enthalpy ($*$) we get the so-called Joule-Thomson coefficient

$$\left(\frac{\partial T}{\partial p}\right)_H = \frac{T}{C_p}\left[\left(\frac{\partial V}{\partial T}\right)_p - \frac{V}{T}\right].$$

Defining the heat expansion coefficient $\alpha_p$ so that

$$\alpha_p = \frac{1}{V}\left(\frac{\partial V}{\partial T}\right)_p,$$

we can rewrite the Joule-Thomson coefficient as

$$\mu_{JT} = \left(\frac{\partial T}{\partial p}\right)_H = \frac{V}{C_p}\,(T\alpha_p - 1).$$

We see that when the pressure decreases the gas

• cools down, if $T\alpha_p > 1$.

• warms up, if $T\alpha_p < 1$.

For ideal gases $(\partial T/\partial p)_H = 0$ holds. For real gases $(\partial T/\partial p)_H$ is positive below the inversion temperature, so the gas cools down. In the case of a van der Waals gas, one can show that

$$\mu_{JT} = \frac{1}{C_p}\left[\frac{2a}{RT} - b\right],$$

where one assumes that the gas has low density implying $RTv \gg a$ and $v \gg b$ ($v = V/n$ is the molar volume and $n = N/N_A$).

The Joule-Thomson effect is often used in the process of liquefying gases. In order to achieve cooling, one has to work in such a temperature and pressure range that $\mu_{JT} > 0$. At room temperature, all gases except hydrogen, helium and neon cool upon expansion by the Joule-Thomson process. For example, helium has to be pre-cooled first (e.g. with liquid hydrogen) below 34 K. Also, the throttling process is applied in many thermal machines, such as refrigerators, air conditioners, and heat pumps.

Chemical reaction

Chemical reactions involve large numbers of molecules breaking and reforming their chemical bonds. The chemical bonds characterize the manner by which the atoms are joined to form molecules [16]. Examples include:

• Covalent bond, caused by the mutual sharing of electron pairs, such as in $\mathrm{H_2}$.

• Ionic bond, molecule held together by electrostatic attraction of two oppositely charged ions, e.g. NaCl.

• Polar bond, covalent bond but unequal sharing of electrons which causes a permanent dipole moment of the molecule, e.g. $\mathrm{H_2O}$.

[16] More info in the courses Solid State Physics and Condensed Matter Physics.

Regardless of the type of bonds, the macroscopic properties of all chemical reactions can be described by thermodynamics. In the following, we will discuss only reactions involving electrically neutral species of molecules.

Chemical equilibrium is reached when the rate of production equals the rate of depletion for each species of molecules involved. Reactions themselves do not stop, even at equilibrium. In general, the chemical reaction formula is written as

$$0 = \sum_j \nu_j M_j.$$

Here $\nu_j \in \mathbb{Z}$ (integers) are the stoichiometric coefficients and $M_j$ stand for the molecular species.

Example: Consider for example the chemical reaction (burning of hydrogen sulphide)

$$2\,\mathrm{H_2S} + 3\,\mathrm{O_2} \rightleftharpoons 2\,\mathrm{H_2O} + 2\,\mathrm{SO_2}.$$

j        A      B     C      D
M_j      H2S    O2    H2O    SO2
ν_j      −2     −3    2      2

Typically, the chemical reactions take place at constant pressure and temperature. Thus, the relevant thermodynamic potential is the Gibbs potential. It turns out that the Gibbs free energy can be generalized to such situations where the system is out of equilibrium. This can be done with a single additional variable called the degree of reaction $\xi$ so that

$$dN_j = \nu_j\,d\xi.$$

When $\xi$ increments by one, one reaction of the reaction formula from left to right takes place. Convention: When $\xi = 0$ the reaction is as far left as it can be. Then we always have $\xi \geq 0$.

The differential for the Gibbs potential can now be written as

$$dG = \sum_j \mu_j\,dN_j = d\xi \sum_j \nu_j \mu_j.$$

We define

$$\Delta_r G \equiv \Delta_r \equiv \left(\frac{\partial G}{\partial \xi}\right)_{p,T} = \sum_j \nu_j \mu_j.$$

$\Delta_r$ is thus the change in the Gibbs function per one reaction.

Since $(p, T)$ is constant $G$ has a minimum at an equilibrium. The equilibrium condition is thus

$$\Delta_r G^{\rm eq} = \sum_j \nu_j \mu_j^{\rm eq} = 0.$$

In a non-equilibrium $dG/dt < 0$, so if $\Delta_r > 0$ we must have $d\xi/dt < 0$, i.e. the reaction proceeds to the left and vice versa.

We assume that the components obey the ideal gas equation of state. Then one can show [17]

$$\mu_j = k_B T\left[\phi_j(T) + \ln p + \ln x_j\right],$$

where $x_j = N_j/N$ is the concentration of the $j$th component, $N = \sum_j N_j$,

$$\phi_j(T) = \frac{\mu_j^0}{k_B T} - \eta_j - \frac{5}{2}\ln T,$$

and $\mu_j^0$ and $\eta_j$ are constants. So,

$$\Delta_r G = k_B T \sum_j \nu_j \phi_j(T) + k_B T \ln\Big[\prod_j p^{\nu_j} x_j^{\nu_j}\Big].$$

[17] For the rigorous derivation, one needs the concepts of quantum statistics, which will be introduced later. For now the derivation of this formula is omitted.

The equilibrium condition can now be written as

$$\prod_j x_j^{\nu_j} = p^{-\sum_j \nu_j}\, K(T),$$

where

$$K(T) = e^{-\sum_j \nu_j \phi_j(T)}$$

is the equilibrium constant of the reaction. The equilibrium condition is called the law of mass action.

The reaction heat is the change of heat $\Delta_r Q$ per one reaction to the right. A reaction is

• Endothermic, if $\Delta_r Q > 0$ i.e. the reaction takes heat.

• Exothermic, if $\Delta_r Q < 0$ i.e. the reaction releases heat.

We write $\Delta_r G$ as

$$\Delta_r G = -k_B T \ln K(T) + k_B T \sum_j \nu_j \ln (p x_j).$$

Now

$$\Delta Q = \Delta E + \Delta W = \Delta E + p\,\Delta V = \Delta(E + pV) = \Delta H,$$

since $\Delta p = 0$. When the total amount of matter is constant

$$dG = -S\,dT + V\,dp$$

holds in a reversible process and

$$d\left(\frac{G}{T}\right) = \frac{1}{T}\,dG - \frac{G}{T^2}\,dT = -\left(\frac{G}{T^2} + \frac{S}{T}\right)dT + \frac{V}{T}\,dp = -\frac{H}{T^2}\,dT + \frac{V}{T}\,dp,$$

because $G = H - TS$. We see that

$$H = -T^2\left[\frac{\partial}{\partial T}\left(\frac{G}{T}\right)\right]_{p,N}.$$

Now

$$\frac{\partial}{\partial T}\left(\frac{\Delta_r G}{T}\right) = -k_B\,\frac{d}{dT}\ln K(T),$$

so that

$$\Delta_r H = k_B T^2\,\frac{d}{dT}\ln K(T).$$

This expression is known as the reaction heat.

5. Ensemble Theory

The preceding chapter was devoted to studying the macroscopic consequences of statistical physics in the thermodynamic limit. We argued that the whole of classical thermodynamics can be derived from the knowledge of the functional dependence of the number of microstates on the macroscopic parameters, e.g. $\Omega = \Omega(S, V, N)$. In this chapter, we will consider how the actual explicit dependence can be resolved.

5.1 Phase Space of a Classical System

The microstate of a classical system can be determined up to arbitrary accuracy, at least in principle. This is achieved by measuring the instantaneous positions and momenta of all particles constituting the system. This requires $3N$ position coordinates $(q_1, q_2, \ldots, q_{3N})$ and $3N$ momentum coordinates $(p_1, p_2, \ldots, p_{3N})$. The space spanned by the position coordinates is often referred to as the configuration space, and that spanned by the momentum coordinates as the momentum space. Together, the $6N$ coordinates form the so-called phase space.

The phase point $(q_i, p_i)$, where $i = 1, 2, \ldots, 3N$, is a representative point. Every representative point corresponds to a possible state of the system. As it evolves in time, a representative point carves a trajectory into the phase space. The direction of the trajectory is determined by the velocity vector $\mathbf{v} = (\dot{q}_i, \dot{p}_i)$, whose components are solved from Hamilton's equations of motion

$$\dot{q}_i = \frac{\partial H(q_i, p_i)}{\partial p_i} \tag{42}$$

$$\dot{p}_i = -\frac{\partial H(q_i, p_i)}{\partial q_i}. \tag{43}$$
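The way a representative point moves under Hamilton's equations of motion can be sketched numerically. The example below is a minimal illustration under stated assumptions, not a method from these notes: it takes a single one-dimensional harmonic oscillator with $H = p^2/2m + kq^2/2$, and uses a leapfrog (kick-drift-kick) integrator, a common numerical choice. The function names `hamiltonian` and `leapfrog_trajectory` and all parameter values are hypothetical.

```python
def hamiltonian(q: float, p: float, mass: float = 1.0, k: float = 1.0) -> float:
    """Energy of the assumed 1D harmonic oscillator, H = p^2/(2m) + k q^2/2."""
    return p * p / (2.0 * mass) + 0.5 * k * q * q


def leapfrog_trajectory(q0, p0, dt, n_steps, mass=1.0, k=1.0):
    """Integrate dq/dt = dH/dp = p/m and dp/dt = -dH/dq = -k q.

    Returns the list of phase points (q, p) visited by the representative point.
    """
    q, p = q0, p0
    trajectory = [(q, p)]
    for _ in range(n_steps):
        p -= 0.5 * dt * k * q   # half kick:  dp/dt = -dH/dq
        q += dt * p / mass      # full drift: dq/dt =  dH/dp
        p -= 0.5 * dt * k * q   # half kick with the updated position
        trajectory.append((q, p))
    return trajectory


if __name__ == "__main__":
    traj = leapfrog_trajectory(q0=1.0, p0=0.0, dt=0.01, n_steps=1000)
    energies = [hamiltonian(q, p) for q, p in traj]
    # The energy should stay very close to its initial value of 0.5
    print(min(energies), max(energies))
```

Along the computed trajectory the value of the Hamiltonian stays very close to its initial value, so in this example the phase point moves on a closed curve of (nearly) constant energy in the $(q, p)$ plane.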
2 (42) (43) where H(qi , pi ) ≡ H(q, p) is the Hamiltonian of the system. Macroscopic parameters limit the accessible regions in phase space. For example, the knowledge of the volume of the system V immediately tells us which are the possible values for the coordinates qi in the configuration space. Moreover, if the system has the energy E, the trajectory is restricted to the hypersurface H(q, p) = E. More realistically, the energy is determined in the interval E to E + δE, and the trajectories are within the hypershell defined by these limits. Given the values of the relevant macroscopic state variables (such as E, V and N ), the system can be in any of the possible microstates within the corresponding hypershell. We define the ensemble in phase space as a group of mental copies of the given system, one for each possible representative point. At any given time t, the number of representative points within the volume element d3N qd3N p around the point (q, p) is given by ρ(q, p; t)d3N qd3N p, 20 (44) where ρ(q, p; t) is the density function. Accordingly, we define the ensemble average of a given physical quantity f (q, p) as R hf i = f (q, p)ρ(q, p; t)d3N qd3N p R . ρ(q, p; t)d3N qd3N p We call the ensemble stationary if 5.2 Liouville’s Theorem and Microcanonical Ensemble We consider the ”flow” of representative points in the phase space. Let ω be an arbitrary volume in the phase (45) space and σ the corresponding surface. The number of points within the volume increases as Z ∂ ρdω, ∂t ω ∂ρ = 0. ∂t where dω = d3N qd3N p. On the other hand, the flow out of the volume can be written as Z Z This implies time-independent averages hf i and, thus, the ρv · n̂dσ = ∇ · (ρv)dω, stationary ensemble describes a system in equilibrium. σ ω where v is the velocity of the representative points, dσ is the surface element, and n̂ the unit vector normal to the element. 
We have also applied the divergence theorem, where the divergence in the phase space is naturally given by 3N X ∂ ∂ (ρq̇i ) + (ρṗi ) . ∇ · (ρv) = ∂qi ∂pi i Ergodicity We can imagine two kinds of averages in the ensemble. If we consider a single representative point, we can follow it along its trajectory for a long time T and calculate the time average 1 T →∞ T Z f = lim T f (q(t), p(t))dt. The number of representative points is conserved, and thus Z Z On the other hand, we can look at the whole swarm of ∂ ρdω = − ∇ · (ρv)dω. representative points and take the ensemble average (45) ∂t ω ω at any given time. If the ensemble is stationary, the hf i is This has to hold for an arbitrary volume ω, so we end up time-independent and with the continuity equation of the ensemble Z T 1 ∂ρ hf i = lim hf idt. + ∇ · (ρv) = 0. (46) T →∞ T 0 ∂t 0 By applying the definition of divergence, the continuity The two averages in the previous equation are completely equation can be rewritten as independent. Thus, ∂ρ dρ = + {ρ, H} = 0, (47) dt ∂t hf i = hf i. which is called Liouville’s theorem. In the above, we have According to ergodic hypothesis 18 , the trajectory of a repre- denoted the classical Poisson bracket with sentative point travels through every point in the ensemble 3N X in a sufficiently long period of time. Thus, the time aver∂A ∂B ∂A ∂B − {A, B} = ages of physical quantities have to be the same for every ∂qi ∂pi ∂pi ∂qi i=1 member of the ensemble. Thus, Liouville’s theorem means that the representative points in hf i = f , the ensemble move in the phase space in the same manner as an incompressible fluid in the physical space. where the latter is calculated for a trajectory of any member of the ensemble. Microcanonical ensemble In the experiments, what we measure are long-time avLiouville’s theorem holds quite generally for classical syserages. 
In other word, tems, whereas ∂ρ The ensemble average of any physical quantity =0 f is identical to the value one expects to ob∂t tain by making an appropriate measurement on is merely a requirement for the equilibrium. Thus, in equithe given system. librium we get 18 The proof of the ergodic hypothesis is not at all trivial and, thus, omitted here. It was first proposed by L. Boltzmann (1871). Notice also the similarity with the law of large numbers. 21 {ρ, H(q, p)} = 3N X ∂ρ i=1 ∂ρ q̇i + ṗi ∂qi ∂pi = 0. Consider then a system of N molecules confined to a volume V with the total energy within the range E ≤ H(q, p) ≤ E + δE. One possible way to satisfy both of the above conditions is to assume that the density is completely independent on the on the coordinates, i.e. constant if E ≤ H(q, p) ≤ E + δE ρ(q, p) = 0 otherwise Let us try to justify this more rigorously in terms of the phase space arguments presented above. The Hamiltonian of the ideal gas is independent on the configuration space, due to the lack of interactions, i.e. This is the definition of microcanonical ensemble. Physically, this implies that the ensemble is uniformly distributed among all possible microstates. Thus, there is equal a priori probability for a given member of the microcanonical ensemble of being in any of the accessible microstates. Microcanonical ensemble is, therefore, describing isolated systems in equilibrium. It is of great importance, since whenever we have two interacting systems A and A0 the situation can reduced to the case of an isolated system by studying the combined system A + A0 . where m is the mass of the atom. The volume of the hypershell of the accessible states is given by Z Z Z 3N 3N N ω= d q d p=V d3N p. 3N X p2i H(q, p) = , 2m i=1 P3N p2i ≤ E + δE defines a shell The condition E ≤ i=1 2m in the momentum √ space, lying in between a sphere with radius R(E) = 2mE and that of slightly larger radius R(E + δE). 
In the limit δE E, the thickness of the shell is approximately It should be kept in mind that even though classical r physics allows in principle the determination of the coorm . R(E + δE) − R(E) ≈ δE dinates qi and pi up to an arbitrary accuracy, in practice 2E there always exists uncertainties ∆qi and ∆pi (e.g. due to measurement precision). We denote the product of uncer- The volume integral in momentum space is then given by the product of this thickness and the surface area of the tainties of conjugated variables with (momentum) hypersphere with radius R(E). In general, h0 = ∆qi ∆pi . the surface area Sn (R) of n-dimensional sphere with radius R is given by This means that the phase space is divided into volume 2π n/2 Rn−1 f elements of size h0 (f = 3N is the number of degrees of Sn (R) = , Γ(n/2) freedom), each of which can be seen to contain one distinguishable microstate. Thus, the number of microstates where Γ is the gamma-function. Thus, we obtain within the relevant hypershell of the phase space is, asympδE N (2πmE)3N/2 totically, ω= V . ω E [(3N/2) − 1]! Ω= f, h0 Now, the number of accessible microstates is where Z Z ω= dω = df qdf p Ω(E, V, N ) = and the integration is done over the hypershell. From the knowledge of Ω, thermodynamics follows via entropy S = kB ln Ω, δE ω = V N χ(E), 3N h Eh3N where χ(E) = (2πmE)3N/2 /(E[(3N/2) − 1]!). (48) as in the previous chapters. 5.3 Canonical Ensemble Often the system under study is not isolated, but interFundamentally, the simultaneous measurement of canon- acting with a much larger system (i.e. a bath or a reservoir) ically conjugated variables is restricted by the Heisenberg fixing one or more of the systems macroscopic properties. uncertainty principle One example of such instances is when the system is immersed in a heat bath. ∆qi ∆pi ≥ ~. (49) We consider a small system A in thermal interaction with This implies that the ultimate limit of accuracy is h0 → h. a heat bath A0 . 
In equilibrium, the two systems have the same temperature T , but the energy E of the studied system is a variable. This kind of treatment is often beneficial, Example: Ideal gas since in experimental situations it is typically easier to conConsider the equilibrium of monatomic ideal gas consist- trol temperature, instead of energy. ing of N particles and confined in volume V . Previously, We assume that the bath contains an infinitely large we argued in Eq. (37) that the number of accessible minumber of mental copies of the system A. These copies crostates in ideal gas is form the so-called canonical ensemble in which the macroΩ ∝ V N χ(E). scopic state of the systems is determined by the parameters 22 T , V and N . The system A can in principle be any relatively small macroscopic system, or even a distinguishable microscopic system. Gibbs entropy Then, we would like to know what is the probability Pr that the system is in any one particular microstate r of energy Er . In other words, we are not interested in measuring any properties that depend upon the heat bath, but want a statistical ensemble for the system that averages over the relevant states of the bath. The entropy of the whole system (system+bath) is again given by Eq. (48). But, we are interested merely on the properties of the small system A, and thus we have to generalize our definition of entropy to20 Z d3N qd3N pρ(q, p) ln ρ(q, p) X = −kB Pr ln Pr . S = −kB (53) (54) Consider that the common temperature of the system in r equilibrium with the heat bath is T . This determines the total energy E (0) of the combined system A + A0 . Now, This is the Gibbs entropy which is valid even for nonassuming that the system A is in a state with energy Er , equilibrium systems, as well as system that interact with 0 then the bath has such energy Er that surroundings. The latter form contrasts the fact that there is a limit in how small ”compartments” one can divide the 0 (0) Er + Er = E = constant. 
phase space (ultimately set by the Heisenberg uncertainty principle), given by h3N . Thus, The bath is much larger than the system, and thus Er = E (0) Pr = ρ(qr , pr )h3N . Er0 1 − (0) 1. E (50) We denote with Ω0 (Er0 ) the number of states of accessible states of the bath with energy Er0 . Because the combined system A + A0 is isolated and in equilibrium, all states should be found with equal probabilities. Thus, Pr ∝ Ω0 (Er0 ) = Ω0 (E (0) − Er ). Because of Eq. (50), we have that19 We see immediately that in the case of an isolated system in equilibrium, Pr = 1/Ω for all r and S = kB ln Ω. Now, we can show alternatively21 that the canonical distribution results from the maximization of the entropy with the constraints (exercise) X hEi = Er Pr = constant r 0 ln Ω (Er0 ) 0 = ln Ω (E (0) )+ 0 ∂ ln Ω ∂E 0 X (Er0 −E (0) ) + ··· Pr = 1. r E 0 =E (0) 0 ' constant − β Er . Physical significance of canonical distribution In equilibrium β 0 = β = 1/kB T , and we obtain The internal energy for the system interacting with a heat bath is given by the expectation value Pr ∝ exp(−βEr ). P ∂ r Er exp(−βEr ) So, we see that the probability Pr is proportional to the =− ln Z(β). hEi = P ∂β r exp(−βEr ) Boltzmann constant. The probabilities should be normalized, so by P summing over all possible states r of A we We recall from the chapter discussing the thermodynamic should have r Pr = 1, and potentials, that the relevant potential for processes in conexp(−βEr ) . Pr = P r exp(−βEr ) (51) This is called the canonical distribution. The sum over states normalizing the distribution, Z(β) = X exp(−βEr ), stant temperatures is the Helmholtz free energy A. Now, ∂(A/T ) hEi = A + T S = , ∂(1/T ) N,V which implies (52) A(T, V, N ) = −kB T ln Z(T, V ). (55) r This is the most fundamental result of canonical ensemis often referred to as the partition function. 
Canonical ble theory, since now the all thermodynamic quantities are ensemble consists of mental copies of A, all of which obey given in terms of the partition function. the distribution (51). 20 Compare this with the proof of the H-theorem. 19 Again, 21 By using Lagrange’s multipliers, check e.g. the course of Analytical Mechanics. we study the logarithm of Ω due to reasons of convergence. 23 and Example: Classical ideal gas gi exp(−βEi ) Pi = P . Again, we consider classical ideal gas as an example. We i gi exp(−βEi ) recall that ρ(q, p) is the measure of the probability of finding a representative point in the vicinity of the phase space Often one can assume that the accessible energy levels are very close to one another compared with the average energy point (q, p). In canonical ensemble, E. Accordingly, one can assume that the energy E is conρ(q, p) ∝ exp(−βH(q, p)), tinuous within any reasonable interval (E, E + δE). Correspondingly, the probability that the given system from the The number of distinct microstates in a volume element canonical ensemble has the energy in this energy range is dω was found to be22 dω . N !h3N (56) P (E) = P(E)dE ∝ exp(−βE)g(E)dE, (58) where P(E) is the probability density and g(E) is the density of states. We obtain that Z ∞ Z(T, V ) = exp(−βE)g(E)dE. Thus, the partition function is Z 1 Z(T, V ) = exp(−βH(q, p))dω. N !h3N 0 In the case of ideal gas, Naturally, if δE is sufficiently small energy interval, we have that the number of states within range (E, E + δE) is Ω ≈ g(E)δE. N X p2i H(q, p) = 2m i=1 and the partition function can be written in the form Since the Boltzmann factor is a monotonically decreas N 1 V 1 ing function of E, and the density of states monotonically 3/2 N Z(T, V ) = (2πmkB T ) [Z1 (T, V )] . = increasing one, we have that N ! h3 N! 
(57) ∂ −βE This means that because of the absence of the interparti=0 e g(E) ∂E cle interactions, the partition function can be written as a E=E ∗ product of single molecule partition functions. determines a maximum of P (E). This can rewritten as Now, we can write ∂ ln g(E) 2 3/2 N h = β. A(T, V, N ) = N kB T ln −1 , ∂E E=E ∗ V 2πmkB T Now, we can write 3/2 N h2 ∂A = kB T ln µ= ∂N T,V V 2πmkB T ∂S(E) 1 = . S = kB ln g and ∂E T and E=hEi ∂A N kB T P =− = . Thus, we see that the most probable value is equal to the ∂V V,N V mean, i.e. E ∗ = hEi. Also, we obtain the expression for the internal energy of Consider then the mean the classical ideal gas R Eg(E) exp(−βE)dE ∂ 3 hEi = R . hEi = A + T S = − ln Z = N kB T. g(E) exp(−βE)dE ∂β 2 Er we are interested in the variance Energy fluctuations in canonical ensemble σ 2 = hE 2 i − hEi2 = − ∂hEi = kB T 2 CV . ∂β In many physical cases, the accessible energy levels can be degenerate, meaning that there exists in general gi states We see that because CV and hEi are extensive variables, belonging to the same energy value Ei . Thus, one has σ 1 X ∝√ . Z(T, V ) = gi exp(−βEi ), hEi N i 22 The additional factor N ! comes from the quantum mechanical fact that the constituting particles in the system are indistinguishable. We will return to this later when we talk about quantum statistics more thoroughly. 24 Thus, in the limit N → ∞ we have that the energy of a system in the canonical ensemble is for all practical purposes equal to the mean energy. This means that the situation is more or less the same as in the microcanonical ensemble! Moreover, if we make the Taylor expansion around E ∗ = The derivation of the equipartition theorem embodies hEi also the so-called viral theorem by Clausius (1870). The virial of the system h i S 1 −βE 2 X ln e g(E) ≈ − βhEi + − 2 (E − hEi) + · · · . V= hqi ṗi i = −3N kB T. (62) kB 2σ i Thus, we see that Immediately from Eq. 
(60), one obtains P(E) ∝ e −βE g(E) ≈ e −β(hEi−T S) exp (E − hEi)2 , − 2σ 2 V = −2K, meaning that the probability distribution is Gaussian with √ the mean hEi and the standard deviation σ = kB T 2 CV . (63) where K is the average kinetic energy of the system. 5.4 Applications of Canonical Ensemble Equipartition theorem A system of harmonic oscillators Consider the expectation value R ∂H −βH xi ∂xj e dω ∂H R i= , hxi ∂xj e−βH dω Harmonic oscillator is one of the most practical models in physics. We will now examine a system of N noninteracting oscillators, mainly to illustrate the canonical ensemble formulation, but also for the later development where xi and xj are any of the 6N generalized coordinates of the theory of black-body radiation and the theory of (q, p), and H is the (classical) Hamiltonian of the system. lattice vibrations. By using partial integration to the numerator, we obtain First, consider the classical Hamiltonian of a single oshxi cillator ∂H i = δij kB T. ∂xj 1 2 1 mω 2 qi2 + p . 2 2m i The single oscillator partition function is then Z Z dqdp kB T Z1 (β) = e−βH(q,p) = , h ~ω H(qi , pi ) = We thus obtain hpi ∂H ∂H i = hpi q˙i i = kB T = −hqi ṗi i = hqi i. ∂pi ∂qi where ~ = h/2π. This is the classical counting of the number of accessible microstates. Similar to the non3N 3N interacting gas, the partition function of N oscillators is X X hpi q̇i i = − hqi ṗi i = 3N kB T. (59) a product of N single oscillator partition functions24 i=1 i=1 N kB T N Z(β) = [Z1 (β)] = . Hamiltonian of a system that is quadratic in its coordi~ω nates can be written as X X Again, from the knowledge of the partition function, the H= Aj Pj2 + Bj Q2j , thermodynamics follows. For example, j j ~ω , A = N kB T ln where Qj and Pj are canonically conjugate and Aj and Bj kB T constants of the problem. Clearly, ~ω µ = kB T ln , X ∂H ∂H kB T Pj + Qj = 2H. (60) ∂Qj ∂Pj p = 0, j kB T This implies S = N kB ln +1 , ~ω 1 hHi = f kB T, (61) 2 hEi = N kB T. 
where f is the number of non-zero coefficients Aj and Bj . So, we see that each harmonic term in the Hamiltonian The internal energy agrees with the equipartition of energy, makes a contribution of 12 kB T to the internal energy of because there are two independent quadratic terms for each the system. This also results to a contribution of 12 kB to single-oscillator Hamiltonian. the specific heat CV . This the theorem of equipartition of It should be noted at this point that when we derived the energy among the degrees of freedom of the system23 . canonical distribution, we did not make assumptions on the By summing over i, we obtain 23 There will corrections to this theorem in the quantum limit where some of the degrees of freedom can be ”frozen” (cannot be excited) due to the insufficient amount of available (thermal) energy. 25 24 Note, that the oscillators themselves are distinguishable, but the excitations (i.e. photons/phonons) are distinguishable. We will return to this. nature of the energy levels Er of the system. So, instead of classical energies, we can calculate the thermodynamics starting from the quantum mechanical eigenvalues for the single oscillators, namely 1 En = n + ~ω. 2 Consider N such magnetic dipoles, each with magnetic moment µ. We assume that the dipoles are not interacting with one another, and that they are localized and classical in the sense that they are distinguishable and freely orientable. The internal energy of the N dipole system is given by its potential energy due to the external field H E= Accordingly, the single-oscillator partition function is Z1 (β) = N X Ei = − i=1 ∞ X −1 ~ω . e−β(n+1/2)~ω = 2 sinh 2kB T n=0 N X µi · H = − N X i=1 µH cos θi . i=1 Again, due to the fact that the dipoles do not interact, we can write This results in the N -oscillator partition function Z(β) = [Z1 (β)]N , X Z1 (β) = eβµH cos θ . Z(β) = [Z1 (β)]N = e−(N/2)β~ω (1 − e−β~ω )−N . 
θ Again, we obtain 1 A = N ~ω + kB T ln(1 − e−~ω/kB T ) , 2 The dipoles will obviously align themselves along the field. The magnitude of the magnetization will be ∂A N ∂ . (65) ln Z1 (β) = − Mz = N hµ cos θi = β ∂H ∂H T µ = A/N, p = 0, S = N kB By denoting with dΩ = sin θdθdφ the elemental solid angle representing the orientation of the dipole, we obtain β~ω −β~ω − ln(1 − e ) , eβ~ω − 1 Z 1 ~ω hEi = N ~ω + β~ω . 2 e −1 (64) It is interesting to notice, that the energy per oscillator approaches the equipartition energy only when kB T ~ω. In low temperatures, the internal energy is given by the zero-point energy of the oscillators ~ω/2. XE\ 2π π Z eβµH cos θ sin θdθdφ = 4π Z1 (β) = 0 0 We have, thus, obtained the magnetization per dipole µz = Mz = µL(βµH), N where L(x) = coth x − NÑΩ 3.0 1 x is the Langevin function. Classical 2.0 Μz Μ 1.2 1.5 1.0 2.5 Schrödinger Planck 1.0 sinh(βµH) . βµH ΒΜH 3 0.8 0.5 0.6 0.5 1.0 1.5 2.0 2.5 3.0 k B T ÑΩ 0.4 Average energy (64) per oscillator, labelled as Schrödinger oscillator. Note that if there were no zeropoint oscillations, the classical limit would be off by − 12 N ~ω (Planck oscillator). Statistics of paramagnetism 0.2 ΒΜH 0 2 4 6 8 10 12 14 We note that the parameter βµH gives the strength of the field in terms of thermal energy. When µH kB T , we have magnetic saturation Paramagnetism is a form of magnetism caused by permaMz ' N µ. nent dipole moments of the constituent atoms, generated by the electron spin and angular momentum. Magnetiza- In the large temperature limit, µH k T and B tion of paramagnetic materials is created only in the presence of an external magnetic field H. The magnetization N µ2 C M ' H ≡ H, z disappears when the field is removed. 3kB T T 26 and the corresponding energies are ±ε = ±µB H. Thus, which is the Curie law of paramagnetism and C is the Curie constant. Z(β) = [2 cosh(βε)]N , Quantum mechanics limits the number of possible orientations of the dipoles. 
Generally, the magnetic moment and the total angular momentum l of the dipole are related by e l, µ= g 2m where g(e/2mc) is the gyromagnetic ratio, g is Lande’s gfactor, l2 = J(J + 1)~2 , J = and 1 3 5 , , , . . . or 0, 1, 2 . . . 2 2 2 A = −N kB T ln[2 cosh(βε)], ε ε ε S = N kB ln 2 cosh − tanh , kB T kB T kB T ε E = −N ε tanh , kB T ε M = N µB tanh , kB T 2 ∂E ∆ e∆/kB T (1 + e∆/kB T )−2 , C= = N kB ∂T H kB T where ∆ = ε − (−ε) = 2µB H is the difference between the two allowed dipole energies. Let us analyse these a bit. First, we found the natural result E = −M H. In the limit T → 0, S → 0 and E → −N ε. This is a result of complete ordering of the dipoles along the field. At high temperatures, the dipoles are randomly oriented and the energy and magnetization vanish. 3 S(S + 1) − L(L + 1) g= + 2 2J(J + 1) where S and L are the spin and orbital quantum numbers of the dipole. We obtain µ2 = g 2 µ2B J(J + 1) C Nk B 0.7 and 0.6 µz = gµB m , m = −J, −J + 1, . . . , J − 1, J 0.5 where µB = e~/2m is the Bohr magneton. 0.4 Now, the single-dipole partition function is J X Z1 (β) = 0.3 0.2 eβgµB mH 0.1 m=−J = sinh 1 1 βgµB mH . 1+ βgµB mH / sinh 2J 2J 0 1 ∂ Mz = ln Z1 (β) = (gµB J)BJ (βgµB mH), N β ∂H BJ (x) = 1 1+ 2J 3 4 5 6 k B T ¶ Interestingly, where 2 The specific heat displays the Schottky anomaly, i.e. a peak as function of temperature. Generally, the Schottky anomaly occurs in systems with limited number of energy levels. The mean magnetic moment per dipole is µz = 1 1 kB =− tanh−1 T ε 1 1 1 coth 1 + x − coth x 2J 2J 2J E , Nε which implies that T < 0 when E > 025 . Positive values of energy are abnormal, since they correspond to a situation is the Brillouin function of order J. Similar to classical where the magnetization is in opposite direction to the apcase, we have saturation for strong fields. In the weak field plied field. 
Yet, it has been realized in experiments with limit, we also obtain the Curie law nuclear spins in a crystal (with LiF-crystal by Purcell and 2 2 Pound in 1951). A sudden reversal of a strong magnetic g µB J(J + 1) CJ µz ' H≡ H, (66) field will result in negative magnetization and thus in neg3kB T T ative temperature, assumed that the relaxation time for the mutual interaction among the spins is small compared where CJ is the Curie constant. with that for the spins and the lattice. It should be noted that when J → ∞ the number of We showed before that the heat will flow from the syspossible orientations of the dipoles become infinitely large tem with lower value of β to the one with a higher value. and the problem reduces to its classical counterpart. Interestingly, one can show that this holds also for negative Peculiarities of spin systems temperatures. In a sense then, systems at negative temConsider system with N non-interacting spin- 1 (J = 1 ) peratures are hotter than those at positive temperatures! 2 2 25 This dipoles. Such a dipole has only two possible orientations, 27 should be compared with the exercise 3.1! 5.5 Grand Canonical Ensemble so In addition to energy, also the number N of particles in a statistical system is hardly ever measured. Thus, the next natural step is to regard both N and E as variables and allow them to fluctuate. S= hEi hN i −µ + kB ln ZG . T T Grand potential Now, one can proceed in the same way as in the case of In thermodynamics we defined the canonical ensemble. One can either consider a small system 0 A immersed in the large reservoir A with which it can Ω = E − T S − µN, exchange energy and particles, or, maximize the entropy so in the grand canonical ensemble the grand potential is with the constraints X hEi = constant Ω = −kB T ln ZG . hN i = constant With the help of this the grand canonical distribution can be written as Pr,N = eβ(Ω−Er +µN ) . Pr = 1. 
r Notify The grand canonical state sum depends on the variables T , V and µ, i.e. In both cases, one ends up with the probability exp(−β(Er − µN ) r,N exp(−β(Er − µN ) Pr,N = P (67) ZG = ZG (T, V, µ). of observing the system with energy Er and N particles. This is the grand canonical distribution. Similar to the Density and energy fluctuations canonical distribution, the studied system can be any relLet us evaluate the fluctuations in the number of partiatively small macroscopic system or a distinguishable micles around the mean croscopic system. Moreover, the grand canonical ensemble 1 ∂hN i consists of mental copies of system A obeying the grand 2 . σN = hN 2 i − hN i2 = canonical distribution in the particle-energy bath. β ∂µ T,V We can write the grand canonical partition function. X ZG = e−β(Er −µN )) We obtain for the particle density n = hN i/V 2 σn2 σN kB T ∂hN i kB T ∂v = = = − hni2 hN i2 hN i2 ∂µ T,V V ∂µ T 1 ∝p hN i r,N Number of particles and energy where v = 1/n. Now P r,N hN i = N e−β(Er −µN )) ZG 1 ∂ ln ZG β ∂µ = and P hEi = = r,N It is straightforward to show that G = µN , implying that the chemical potential is just another name for the amount free Gibbs energy per particle. Thus, one obtains (68) the Gibbs-Duhem equation S dµ = vdP − dT. (69) hN i Er e−β(Er −µN )) By assuming that T is constant, we obtain ZG ∂ ln ZG ∂ ln ZG kB T 2 + kB T µ . ∂T ∂µ σn2 kB T = κT , 2 hni V where κT = −(1/V )(∂V /∂P )T is the isothermal compressibility. Entropy Typically, the relative fluctuations of the particle go as p 1/ hN i. There are, however, exceptions. For example, in phase transitions 26 the compressibility can become excessively large. Especially, at critical points27 the compressibility diverges leading to large fluctuations in the particle According to the definition, the Gibbs entropy is X S = −kB Pr,N ln Pr,N . r,N 26 We will discuss more on phase transitions in the following section, and also later in the course. 27 See definition later. 
Now ln Pr,N = −βEr + βµN − ln ZG , 28 density. In this kind of situations the grand canonical en- GB (NB , P, T ) semble can lead to different result than the corresponding ∂GB ∂GA canonical ensemble, and should be preferred due to the dG = dNA + dNB = (µA −µB )dNA . correct physical picture it gives. ∂NA T,P ∂NB T,P We can repeat the calculation for the variance of the energy. It gives 2 σE 2 = kB T C V + ∂hEi ∂N 2 Because Gibbs energy is minimized at the equilibrium, we have that if µA > µB the number of molecules in phase B will increase and that of A decrease. Similarly, if µA < µB the NA will increase and NB decrease. When 2 σN . T,V This means that the fluctuations in energy of the grand canonical ensemble is equal to that of the canonical ensemble plus a contribution arising from the fluctuating N . These fluctuations are typically negligible, but can also become large in phase transitions due to the second term in the formula. 5.6 Phase Equilibrium and Phase Transitions A big part of the success of statistical physics and thermodynamics has been gained in the study of phase transitions. Thermodynamic phases are such regions in the phase diagram where the thermodynamic functions are analytic. Phase transitions on the other hand arise at point, lines or surfaces of the diagram where such properties are nonanalytic. The typical phases of matter are µA (P.T ) = µB (P, T ) the Gibbs free energy is independent on NA and NB and the phases are said to coexist. This equilibrium condition draws a coexistence curve in (p, T )-plane. If three phases are simultaneously in equilibrium, the is crossing of three coexistence curves is called the triple point. The point where the coexistence curve ends is called critical point. c o e x is te n c e c u rv e T 1 2 p Now, let us define pσ (T ) on the coexistence curve in (p, T )-plane, determining the phase boundary between the • vapor phase is a low density gas obeying ideal-gas phases A and B. 
Accordingly, equation of state, and possible corrections described by the virial expansion (such as van der Waals equa∂µA ∂µB dpσ ∂µB dpσ ∂µA tion), + + = . ∂T p ∂p T dT ∂T p ∂p T dT • liquid phase is dense fluid with strong interactions beAccording to the Gibbs-Duhem equation (69), we have tween the atoms, • solid phase is static structure where the constituent atoms are bound together either in regularly or irregularly. S ∂µ , =− N ∂T p V ∂µ = . N ∂p T • plasma phase is similar to gas in which certain portion of the atoms is ionized. No definite shape or volume, We obtain the Clausius-Clapeyron equation but unlike gas, can form structures in magnetic fields. SB − SA ∆S ∆H dpσ = = = , (70) In addition to these, there exists exotic phases in low dT VB − VA ∆V T ∆V temperatures that arise from the corrections in statistics introduced by quantum mechanics. Such phases include where the superfluid and supersolid phases in helium, and the ∆Q = T ∆S = ∆E + p∆V = ∆H superconducting phase that is observed in numerous comis the phase transition heat or latent heat released in a pounds. process where the coexistence curve is crossed at constant T and p. Phase equilibrium Consider two phases A and B contained in a cylinder Phase transitions at constant pressure P and temperature T . Assume that Generally, when the free energy is an analytic function the total number of particles is N = NA + NB . If the two phases do not coexist at these P and T , the numbers NA only single phase exists. This is denoted in the phase diaand NB will change in order to approach equilibrium. This grams with the open spaces between the coexistence curves. leads to the change in Gibbs energy G = GA (NA , P, T ) + When the parameters (in our case T or p) are changed so, 29 that the system reaches the coexistence curve, the free enAccording to the Clausius-Clapeyron equation ergy becomes non-analytic and a phase transition occurs. 
dp ∆Hls In a phase transition the chemical potential = dT T ∆Vls ∂G µ= we have ∂N p,T is continuous. Instead S=− ∂G ∂T V = ∂G ∂p > 0, if ∆Vls > 0 1) dp dT < 0, if ∆Vls < 0 2) . p d p p and dp dT 1 ) s o lid d T flu id > 0 p d p d T 2 ) < 0 flu id s o lid T are not necessarily continuous. T T A transition is of first order, if the first order derivatives (S, V ) of G are discontinuous and of second order, if the We see that when the pressure is increased in constant second order derivatives are discontinuous. temperature the system 1) drifts ”deeper” into the solid phase, Examples 2) can go from the solid phase to the liquid phase. a) Vapor pressure curve p fu s io n c u rv e c ritic a l p o in t C s o lid flu id v a p o r s u b lim a tio n T v a p o r p re ssu re c u rv e c u rv e trip le p o in t T c) Sublimation curve Now dH = T dS + V dp = Cp dT + V (1 − T αp ) dp, since S = S(p, T ) and using Maxwell relations and definitions of thermodynamic response functions ∂S ∂S ∂V Cp dS = dp + dT = − dp + dT. ∂p T ∂T p ∂T p T We consider the transition The vapor pressure is small so dp ≈ 0, and liquid → vapor. Supposing that we are dealing with ideal gas we have ∆V = Vv = Hs = Hs0 + Z T Cps dT solid phase Cpv dT vapor (gas). 0 N kB T , p Hv = Hv0 + Z T 0 since Let us suppose that the vapor satisfies the ideal gas state equation. Then Vl(iquid) Vv(apor) . Because the vaporization heat (the phase transition heat) ∆Hlv is roughly constant on the vapor pressure curve we have dp ∆Hlv p so = . dT N kB T 2 Integration gives us ∆Vvs = N kB T N kB T − Vs ≈ , p p dp ∆Hvs p ∆Hvs = ≈ , dT T ∆Vvs N kB T 2 where ∆Hvs = Hs − Hv . p = p0 e−∆Hlv /N kB T . For a monatomic ideal gas Cp = 52 kB N , so that 0 ∆Hvs 5 1 ln p = − + ln T − N kB T 2 kB N b) Fusion curve Now ∆Vls = Vl(iquid) − Vs(olid) Z RT 0 Cps dT 0 dT +constant. T2 0 Here ∆Hvs is the sublimation heat at 0 temperature and pressure. can be positive or negative (for example H2 O). 
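The vapor pressure curve of example a) can be checked numerically. The following is a minimal sketch, assuming a constant latent heat per particle (the ratio ∆H_lv/N of the text); the function names and the numerical values of the latent heat, reference pressure and temperature are illustrative assumptions, not taken from the notes. It compares a finite-difference slope of p(T) with the Clausius-Clapeyron right-hand side for an ideal-gas vapor.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def vapor_pressure(T, p0, dh):
    """p = p0 * exp(-dh / (k_B T)); dh is the latent heat per particle (Delta H_lv / N)."""
    return p0 * math.exp(-dh / (K_B * T))


def clausius_clapeyron_slope(T, p, dh):
    """dp/dT = dh * p / (k_B T^2), valid when the vapor is an ideal gas and V_l << V_v."""
    return dh * p / (K_B * T ** 2)


# Illustrative (assumed) inputs: latent heat of the order of 40 kJ/mol per particle,
# arbitrary reference pressure, and a temperature of 350 K.
dh = 6.8e-20      # J per particle
p0 = 1.0          # arbitrary pressure unit
T = 350.0         # K
step = 1e-3       # K, step of the central difference

numeric_slope = (vapor_pressure(T + step, p0, dh)
                 - vapor_pressure(T - step, p0, dh)) / (2 * step)
analytic_slope = clausius_clapeyron_slope(T, vapor_pressure(T, p0, dh), dh)
relative_error = abs(numeric_slope - analytic_slope) / analytic_slope
print(relative_error)
```

The two slopes agree up to the discretization error of the central difference, which is what one expects if the exponential form is indeed the integral of the Clausius-Clapeyron equation with constant latent heat.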
5.7 Gibbs Paradox and Correct Enumeration of Microstates

Up to this point, our discussion of the enumeration of the accessible microstates has been classical. According to classical physics, each microstate corresponds to a point in phase space, spanned by the coordinates and momenta of the constituent particles of the studied system. Now, classical physics considers elementary particles as distinguishable, meaning that a pairwise exchange of the phase-space coordinates of two particles leads to a new microstate. However, this line of thinking results in overcounting of the accessible states, which is traditionally demonstrated by the so-called Gibbs paradox related to the entropy of mixing of two ideal gases.

Entropy of mixing

[Figure: two compartments containing gases A and B at the same temperature T and pressure p, separated by a valve.]

Consider two ideal gases (A and B) confined in compartments separated by a valve in volume $V$. Suppose that initially $p_A = p_B = p$ and $T_A = T_B = T$. If we open the valve, both gases expand adiabatically ($\Delta Q = 0$) to the whole volume. In a mixture of ideal gases every component satisfies the state equation
\[
p_j V = N_j k_B T.
\]
The concentration of the component $j$ is
\[
x_j = \frac{N_j}{N} = \frac{p_j}{p},
\]
where the total pressure $p$ is
\[
p = \sum_j p_j.
\]
Each constituent gas expands in turn into the volume $V$. Since $p_A = p_B$ and $T_A = T_B$, we have $V_j = V x_j$. The change in the entropy is (exercise)
\[
\Delta S = \sum_j N_j k_B \ln\frac{V}{V_j}
\]
or
\[
\Delta S_{\rm mix} = -N k_B \sum_j x_j \ln x_j. \tag{71}
\]
Now $\Delta S_{\rm mix} \ge 0$, since $0 \le x_j \le 1$, and the equality holds only when either $N_A = 0$ or $N_B = 0$.

The problem arises when the two gases are the same. Equation (71) for the entropy of mixing still holds, implying that the entropy will increase after the opening of the valve. However, since the two gases are the same, the process should be reversible! This is the Gibbs paradox.

When the numbers of particles are large, we can use Stirling's formula and write
\[
\Delta S_{\rm mix} = S_{A+B} - S_A - S_B
= -k_B\left(N_A \ln N_A - N_A \ln N + N_B \ln N_B - N_B \ln N\right)
\approx k_B\left(\ln N! - \ln N_A! - \ln N_B!\right).
\]
Thus, we see that the problem would be solved by reducing each of the entropies by a term $k_B \ln N_{(A,B)}!$, evaluated with the particle number of the system in question. This kind of term appears in the entropy when the particles constituting the system are considered indistinguishable [28]. By correcting the counting of the states in this way, one obtains results that agree with reality, at least in the classical limit. We will later show, independently, that this kind of reduction has to be made to the number of accessible microstates in order to have classical statistical physics as the true limit of quantum statistical physics.

[28] In fact, this was done already in Eq. (56).

6. Quantum Statistics

The ensemble theory developed in the preceding chapter is extremely general. However, when applied to quantum systems composed of indistinguishable entities, one has to proceed with great care. One benefits from writing the ensemble theory in a manner that is more suitable for a quantum mechanical treatment, i.e. in the language of operators.

6.1 Density Operator

Let us consider a quantum system characterized by the Hamiltonian operator $\hat H$. Recall from quantum mechanics that the (normalized) state vector $|\psi;t\rangle$ gives the maximal description of the system. Quantum states that can be represented in this manner as vectors of a Hilbert space are called pure states. The time evolution of the state vector is given by the Schrödinger equation
\[
\hat H|\psi;t\rangle = i\hbar\frac{\partial}{\partial t}|\psi;t\rangle.
\]
Then, let $\{|n\rangle\}$ be a complete set of orthonormal vectors, i.e. a basis, in the Hilbert space under consideration. Every vector can be written as a linear superposition of the basis vectors, especially
\[
|\psi;t\rangle = \sum_n a_n(t)|n\rangle,
\]
where $a_n(t) = \langle n|\psi;t\rangle$. Thus, the state of the system can be described equally well with the coefficients $a_n(t)$, which obey
\[
i\hbar\,\dot a_n(t) = \sum_m a_m(t)\langle n|\hat H|m\rangle = \sum_m a_m(t) H_{nm},
\]
where $H_{nm} = \langle n|\hat H|m\rangle$ are the matrix elements of $\hat H$. The coefficients $a_n(t)$ have the standard interpretation as the probability amplitudes of the system to be in the states $|n\rangle$. Since we assume that the state is normalized, we have
\[
\sum_n |a_n(t)|^2 = 1.
\]
Each measurable quantity $\alpha$ of the system has a corresponding Hermitian operator $\hat A$ in the Hilbert space $\mathcal H$ of the system. The possible outcomes of a measurement of the property $\alpha$ are the (real) eigenvalues of $\hat A$. The mean outcome is given by the expectation value
\[
\langle\hat A\rangle \equiv \langle\psi;t|\hat A|\psi;t\rangle.
\]
A description of the system, equivalent to the one given by the state vector, can be obtained by defining the so-called density operator
\[
\hat\rho(t) = |\psi;t\rangle\langle\psi;t| = \sum_{m,n} a_m(t)a_n^*(t)|m\rangle\langle n| = \sum_{mn}\rho_{mn}(t)|m\rangle\langle n|, \tag{72}
\]
where the matrix elements are $\langle m|\hat\rho|n\rangle \equiv \rho_{mn}(t) = a_m(t)a_n^*(t)$. One can show that all of the measurable information in the state vector is contained in the density operator. For example, the expectation value can be written as [29]
\[
\langle\hat A\rangle = {\rm Tr}(\hat A\hat\rho) \tag{73}
\]
and the Schrödinger equation as
\[
\dot{\hat\rho} = -\frac{i}{\hbar}[\hat H, \hat\rho(t)]. \tag{74}
\]
The latter equation, describing the time evolution of the density operator, is called the von Neumann equation. It should be noted that the von Neumann equation follows from the Liouville equation by the application of Dirac's correspondence principle, i.e.
\[
\{H, \rho\} \to -\frac{i}{\hbar}[\hat H, \hat\rho].
\]
The above discussion holds an implicit assumption that the system is isolated, and that we have complete knowledge of the preparation of the quantum state $|\psi;t\rangle$. In practice, however, we have to settle for less. We can only restrict the set of accessible states into a set, an ensemble, $\{|\psi^k;t\rangle \mid k = 1,\dots,N\}$, whose members do not have to be mutually orthogonal. Then, what we can say is that the state $k$ is prepared with probability $P(k)$. Accordingly, we generalize the definition of the density operator into
\[
\hat\rho(t) = \sum_k P(k)|\psi^k;t\rangle\langle\psi^k;t| = \sum_k P(k)\sum_{mn}\rho^k_{mn}(t)|m\rangle\langle n| \equiv \sum_{mn}\rho_{mn}(t)|m\rangle\langle n|.
\]
We see that the matrix elements of the density operator are ensemble averages of the corresponding elements of the states in the ensemble. Especially, the diagonal elements
\[
\rho_{nn}(t) = \sum_k P(k)\rho^k_{nn}(t) = \sum_k P(k)|a^k_n(t)|^2
\]
give the probabilities of finding a system in state $n$ in the ensemble. Notice the two types of averaging, one quantum mechanical due to the probability amplitudes, and one statistical.

This describes a mixed state, where we have incomplete knowledge of the state of the system. The density operator presented in this form contains both the statistical and the quantum mechanical information about the system. It should be noted that Eqs. (73) and (74) hold also for the general definition.

[29] It should be noted that the trace operation is independent of the choice of the basis. Also, ${\rm Tr}\,\hat\rho = 1$, which follows from the normalization of the state vectors.

Entanglement with the environment

One important aspect, which will be discussed more rigorously later when we study non-equilibrium systems, is that the studied systems in general interact with their surroundings, one example being a system immersed in a heat bath. In such cases, the Hamiltonian of the whole system (system + environment + interactions) is generally of the form [30]
\[
\hat H_T = \hat H + \hat H_B + \hat H_{\rm int}.
\]
Only in the cases where $\hat H_{\rm int} = 0$ is it possible to write down the state vector of the bare system. In the presence of interactions, the system and its environment become entangled, meaning that the state vector cannot be written as a tensor product of a system part and a bath part, i.e. $|\Psi;t\rangle \neq |\psi;t\rangle\otimes|\psi_B;t\rangle$. Therefore, it is reasonable to speak only about the state vector of the combined system [31]
\[
|\Psi;t\rangle = \sum_{nB} A_{nB}(t)|n\rangle|B\rangle.
\]
The Schrödinger equation gives, in principle, the time evolution of the state vector of the combined system. But even if we could produce a solution [32], it would give an unnecessarily large amount of information about the bath. This excess information is naturally interesting in its own right, but at the moment we are only interested in the behaviour of the system. Therefore, it is practical and convenient to "trace out" the bath and hide its contributions into the probabilities $P(k)$. This can be seen by considering the expectation value of an operator $\hat A$ in the system's Hilbert space:
\[
\langle\hat A\rangle = \langle\Psi;t|\hat A|\Psi;t\rangle
= \sum_{nn'}\sum_{BB'}\langle\Psi;t|nB\rangle\langle nB|\hat A|n'B'\rangle\langle n'B'|\Psi;t\rangle
\equiv {\rm Tr}_{\rm sys}(\hat A\hat\rho_{\rm sys}),
\]
where
\[
\hat\rho_{\rm sys} = \sum_B\sum_{nn'}|n'\rangle\langle n'B|\Psi;t\rangle\langle\Psi;t|nB\rangle\langle n|
\]
is the reduced density operator of the system. Clearly,
\[
\hat\rho_{\rm sys} = {\rm Tr}_B\left(|\Psi;t\rangle\langle\Psi;t|\right). \tag{75}
\]
We thus see that the density operator of the system is written as a partial trace of that of the combined system, taken over the environment variables only.

[30] It should be noted that the environment can consist of a large set of quantum systems, or even of a single quantum system.

[31] Formally, we have to combine the Hilbert spaces $\mathcal H_A$ and $\mathcal H_B$ of the interacting quantum systems with the tensor product $\mathcal H_A\otimes\mathcal H_B$. Accordingly in our case, $\hat H \to \hat H\otimes\hat I_B$, $\hat H_B \to \hat I\otimes\hat H_B$, $|n\rangle|B\rangle \equiv |n\rangle\otimes|B\rangle$, where $|B\rangle$ are the basis vectors of $\mathcal H_B$ and $\hat I$ ($\hat I_B$) is the identity operator of $\mathcal H$ ($\mathcal H_B$).

[32] This is a formidable task, since the bath contains in general a macroscopic number of degrees of freedom!

6.2 Equilibrium Density Operator

The density operator is an extremely powerful tool in the study of both equilibrium and non-equilibrium quantum systems, both closed and open. The non-equilibrium cases will be studied later. In the case of an equilibrium system, the density operator should be static. This implies that
\[
i\hbar\frac{\partial\hat\rho}{\partial t} = [\hat H, \hat\rho] = 0.
\]
Thus, the density operator commutes with the Hamiltonian, which means that they can be diagonalized simultaneously. Let $\{|n\rangle\}$ be the eigenbasis of the Hamiltonian. In this basis, the density operator is diagonal, i.e.
\[
\hat\rho = \sum_n \rho_n|n\rangle\langle n|.
\]

Entropy

Recall that the Gibbs entropy was defined as
\[
S = -k_B\sum_n P_n\ln P_n,
\]
where $P_n$ is the probability of finding a system in the state $n$ in the ensemble. Based on the above discussion, $P_n = \rho_n$ and we can write
\[
S = -k_B\sum_n\rho_n\ln\rho_n = -k_B\sum_n\langle n|\hat\rho\ln\hat\rho|n\rangle = -k_B\,{\rm Tr}\left(\hat\rho\ln\hat\rho\right).
\]
The last equality is the general definition of entropy in quantum mechanics. It should be noted that it is independent of the choice of basis (since the trace is). Also, the definition bears great generality, as it holds also in non-equilibrium systems.

Microcanonical ensemble

Recall that in the microcanonical ensemble the constituting mental copies of a system are characterized by a fixed $N$ in a fixed volume $V$. The energy of the system lies in the interval $(E, E+\delta E)$, where $\delta E \ll E$. The number of accessible microstates is denoted by $\Omega(E, V, N; \delta E)$, and each microstate occurs with equal probability (in accordance with the postulate of equal a priori probabilities). Thus, the matrix elements of the density operator in the energy eigenbasis are of the form
\[
\rho_n = \begin{cases} 1/\Omega & \text{for each accessible state,} \\ 0 & \text{for all other states.} \end{cases}
\]
We see immediately that ${\rm Tr}\,\hat\rho = 1$, as it should for normalized states. We see that the entropy in the microcanonical ensemble is given by the familiar formula
\[
S = k_B\ln\Omega.
\]
We first emphasize that $\Omega$ is calculated quantum mechanically, so that the indistinguishability of the particles has been taken into account right from the start. This means that we should avoid paradoxes, such as that of Gibbs. Also, if the number of quantum states within $\delta E$ is $\Omega = 1$, we obtain $S = 0$, i.e. the Nernst theorem (third law of thermodynamics).

Moreover, the case $\Omega = 1$ corresponds to the pure state, in which case the construction of the ensemble is quite superfluous because every system in the ensemble would be in the same state.

Canonical ensemble

Consider then the canonical ensemble, where the macrostate is characterized by constant parameters $N$, $V$ and $T$, and the energy $E$ is a variable quantity. The density matrix in the energy basis is again diagonal, and
\[
\rho_n = \frac{e^{-\beta E_n}}{Z}, \qquad n = 0, 1, 2, \dots,
\]
where
\[
Z = \sum_n e^{-\beta E_n}
\]
is the partition function. This means that the density operator can be written as
\[
\hat\rho = \frac{e^{-\beta\hat H}}{{\rm Tr}\,e^{-\beta\hat H}}, \tag{76}
\]
where
\[
e^{-\beta\hat H} = \sum_{j=0}^{\infty}(-1)^j\frac{(\beta\hat H)^j}{j!}.
\]
The expectation values are thus of the form
\[
\langle\hat A\rangle = \frac{{\rm Tr}\left(\hat A e^{-\beta\hat H}\right)}{{\rm Tr}\,e^{-\beta\hat H}}.
\]

Grand canonical ensemble

Similarly, in the case of the grand canonical ensemble, where both $E$ and the particle number $N$ are variables, we can write
\[
\hat\rho = \frac{e^{-\beta(\hat H - \mu\hat N)}}{Z_G}, \tag{77}
\]
where $\hat N$ is the particle number operator and
\[
Z_G = \sum_{r,N} e^{-\beta(E_r - \mu N)} = {\rm Tr}\,e^{-\beta(\hat H - \mu\hat N)}
\]
is the grand partition function. Again, the ensemble average is
\[
\langle\hat A\rangle = \frac{{\rm Tr}\left(\hat A e^{-\beta(\hat H - \mu\hat N)}\right)}{Z_G}.
\]

6.3 Systems of Indistinguishable Particles

Let us then turn to the quantum mechanical description of a system of $N$ identical particles. For simplicity, we will deal first with systems where the particles are non-interacting. The Hilbert space of the system is a tensor product of single-particle Hilbert spaces $\mathcal H_1$,
\[
\mathcal H = \underbrace{\mathcal H_1\otimes\mathcal H_1\otimes\cdots\otimes\mathcal H_1}_{N\ \text{copies}}.
\]
If $|q_i\rangle$ are the position eigenstates in the single-particle Hilbert spaces, one can write a general state of the $N$-particle system as
\[
|\Psi\rangle = \int\cdots\int dq_1\cdots dq_N\,|q_1\dots q_N\rangle\,\Psi(q_1,\dots,q_N),
\]
where $|q_1\dots q_N\rangle = |q_1\rangle\otimes\cdots\otimes|q_N\rangle$, and $\Psi(q_1,\dots,q_N) = \langle q_1\dots q_N|\Psi\rangle$ is the wave function. Now, because the particles do not interact, we can write the solution of the time-independent Schrödinger equation (we denote $q = (q_1,\dots,q_N)$)
\[
\hat H\Psi_E(q) = E\Psi_E(q)
\]
as
\[
\Psi_E(q) = \underbrace{\psi_{\varepsilon_1}(q_1)\otimes\cdots\otimes\psi_{\varepsilon_N}(q_N)}_{N\ \text{times}}
\]
with
\[
E = \sum_{i=1}^{N}\varepsilon_i.
\]
The pairs $(\varepsilon_i, \psi_i(i))$ are the solutions of the single-particle Schrödinger equation
\[
\hat H_i\psi_i(i) = \varepsilon_i\psi_i(i), \tag{78}
\]
where we have denoted $\psi_i(i) = \psi_{\varepsilon_i}(q_i)$. We denote by $n_i$ the number of particles in the state with energy $\varepsilon_i$. Thus,
\[
N = \sum_i n_i \qquad\text{and}\qquad E = \sum_i n_i\varepsilon_i.
\]
Now it should be noted that, according to quantum mechanics, even if two particles are in different single-particle states, an interchange between them should not lead to a new microstate.
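As a brief numerical aside on Sections 6.1 and 6.2, the following sketch illustrates the difference between a pure and a mixed state for a two-level system, and the entropy formulas of the microcanonical and canonical ensembles. It is only an illustration under stated assumptions: the two-level system, the helper names (`outer`, `mix`, `expect`, `purity`, `gibbs_entropy`, `canonical_probs`) and the units ($k_B = 1$) are choices made here, not part of the notes, and the entropy helper assumes a density operator that is diagonal in the chosen basis.

```python
import math


def outer(psi):
    """Density operator |psi><psi| of a pure state given by its amplitudes a_n."""
    return [[a * b.conjugate() for b in psi] for a in psi]


def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


def mix(states, probs):
    """Mixed state rho = sum_k P(k) |psi_k><psi_k|."""
    n = len(states[0])
    rho = [[0j] * n for _ in range(n)]
    for psi, P in zip(states, probs):
        r = outer(psi)
        for i in range(n):
            for j in range(n):
                rho[i][j] += P * r[i][j]
    return rho


def expect(A, rho):
    """Expectation value <A> = Tr(A rho), Eq. (73)."""
    return trace(matmul(A, rho)).real


def purity(rho):
    """Tr(rho^2): equals 1 for a pure state and is smaller for a mixed state."""
    return trace(matmul(rho, rho)).real


def gibbs_entropy(probs):
    """S/k_B = -sum_n rho_n ln rho_n for a diagonal density operator."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def canonical_probs(energies, beta):
    """Diagonal elements rho_n = exp(-beta E_n) / Z of the canonical ensemble."""
    weights = [math.exp(-beta * E) for E in energies]
    Z = sum(weights)
    return [w / Z for w in weights]


sigma_x = [[0, 1], [1, 0]]
up, down = [1.0, 0.0], [0.0, 1.0]
plus = [1 / math.sqrt(2), 1 / math.sqrt(2)]   # superposition (|up> + |down>)/sqrt(2)

rho_pure = outer(plus)                        # quantum superposition
rho_mixed = mix([up, down], [0.5, 0.5])       # statistical mixture

print("Tr rho      :", trace(rho_pure).real, trace(rho_mixed).real)
print("<sigma_x>   :", expect(sigma_x, rho_pure), expect(sigma_x, rho_mixed))
print("Tr rho^2    :", purity(rho_pure), purity(rho_mixed))
print("S/k_B, microcanonical with Omega = 4:", gibbs_entropy([0.25] * 4))
print("S/k_B, canonical two-level, beta = 1:",
      gibbs_entropy(canonical_probs([0.0, 1.0], 1.0)))
```

Both states give the same diagonal elements (probability 1/2 for each basis state), but only the pure superposition has a nonzero $\langle\hat\sigma_x\rangle$, which is one way to see the two kinds of averaging contained in the density operator.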
Indistinguishability thus means that, as long as the numbers $n_i$ remain the same, any permutation $P$ among the $N$ particles should result in the same microstate. In this way, the indistinguishability of the particles is guaranteed. We define the permutation $P$ as the reordering of the coordinates,
\[
P\Psi(1, 2, \dots, N) = \Psi(P1, P2, \dots, PN).
\]
For arbitrary $P$, we should have
\[
|P\Psi|^2 = |\Psi|^2. \tag{79}
\]
The simplest way to form a wave function $\Psi$ that obeys the above condition is to make a linear combination of all possible $N!$ permutations of the wave function $\Psi_E$. There are two possibilities with Eq. (79) fulfilled:
\[
P\Psi = \Psi \quad\text{for all } P,
\]
meaning that the wave function is symmetric, or
\[
P\Psi = \begin{cases} +\Psi & \text{if } P \text{ is an even permutation,} \\ -\Psi & \text{if } P \text{ is an odd permutation,} \end{cases}
\]
which means that the wave function is antisymmetric. One can show that every permutation of coordinates can be decomposed into successive pairwise exchanges called transpositions. If the number of transpositions forming the permutation is even (odd), the permutation is called even (odd). The sign of the permutation is
\[
\sigma(P) = \begin{cases} +1 & \text{if } P \text{ is an even permutation,} \\ -1 & \text{if } P \text{ is an odd permutation.} \end{cases}
\]
The symmetric wave functions are of the form
\[
\Psi_S(q) = \frac{1}{\sqrt{N!}}\sum_P P\Psi_E(q). \tag{80}
\]
The antisymmetric wave functions can be written as
\[
\Psi_A(q) = \frac{1}{\sqrt{N!}}\sum_P\sigma(P)P\Psi_E(q)
= \frac{1}{\sqrt{N!}}
\begin{vmatrix}
\psi_1(1) & \psi_1(2) & \cdots & \psi_1(N) \\
\psi_2(1) & \psi_2(2) & \cdots & \psi_2(N) \\
\vdots & \vdots & & \vdots \\
\psi_N(1) & \psi_N(2) & \cdots & \psi_N(N)
\end{vmatrix},
\]
which is the so-called Slater determinant. One should notice that in the antisymmetric state, if we exchange two particles in the same single-particle state, we should obtain the same state. This means that $P\Psi_A = \Psi_A$. But, since the exchange of two particles is an odd permutation, we should have $P\Psi_A = -\Psi_A$. Thus, it is impossible to have an antisymmetric wave function with two particles in the same single-particle state. This is a result equivalent to Pauli's exclusion principle for electrons.

The systems composed of particles obeying the exclusion principle are described by an antisymmetric wave function. The statistics governing the particles is called Fermi-Dirac statistics, and the constituent particles fermions. For symmetric wave functions we have no such problems. The particle numbers $n_i$ are not restricted. The statistics governing such systems is Bose-Einstein statistics, and the constituent particles are bosons. It should be noted that the Hilbert space of a many-particle system is not the whole space spanned by the functions $\Psi_E$ but a subspace spanned by either symmetric (bosons) or antisymmetric (fermions) state vectors.

Even though the presentation has been done for non-interacting particles, one can show that the wave functions have to be either symmetric or antisymmetric also in systems of interacting particles.

Spin-Statistics Theorem

One can show by using relativistic quantum field theory that there exists an intrinsic link between the spin of the particle and the statistics it obeys. Namely, particles with integral spin must have symmetric wave functions and are, therefore, bosons. Correspondingly, particles with half-odd-integer spin have antisymmetric wave functions and are fermions. All particles in nature are either fermions or bosons. The fermions constitute the matter around us: quarks (up, down, strange, charm, bottom, top), leptons (electron, muon, tau, corresponding neutrinos and antiparticles), and their compositions with an odd number of constituents. Bosons are responsible for interactions: photons, gluons, W and Z bosons, Higgs boson; but they include also compositions of fermions with an even number of constituents.

In fact, since the Hilbert spaces of quantum systems are in general complex vector spaces, one could in principle have for every transposition $P_T$
\[
P_T\Psi = e^{i\theta}\Psi,
\]
where $\theta$ is a real variable, which is connected to the spin $s$ of the particle by $\theta = 2\pi s$. One immediately realizes that
\[
e^{2i\pi s} = (-1)^{2s} = \begin{cases} +1 & \text{for bosons,} \\ -1 & \text{for fermions,} \\ e^{2i\pi s} & \text{for anyons.} \end{cases} \tag{81}
\]
Thus, anyons are particles that can acquire any phase when two of them are interchanged. For topological reasons, this can occur only in dimensions smaller than 3. When the spin of an anyon is a rational fraction, other than an integer or half-integer, it is said to obey fractional statistics. Naturally, elementary particles cannot be anyons. Nevertheless, the collective excitations, i.e. quasiparticles, of interacting elementary particles restricted to two dimensions can display fractional statistics, and are essential ingredients e.g. in the theory of the fractional Hall effect.

One interesting "hot topic" at the moment is the search for anyons obeying non-abelian statistics. This means that instead of a phase factor, the many-particle state is changed by a matrix operation in a transposition, implying that two successive transpositions do not in general commute. One candidate for this are the so-called Majorana fermions, fermions that are their own antiparticles. Again, elementary particles are not found to be Majorana fermions, but they have been realized as (anyonic) quasiparticles in superconductors.

6.4 Non-Interacting Bosons and Fermions

We want to calculate the statistics for $N$ non-interacting bosons or fermions, i.e. for a quantum ideal gas. Recall that the quantum state has $n_i$ particles in the single-particle energy state with energy $\varepsilon_i$. The eigenstate of the Hamiltonian of $N$ non-interacting particles is then denoted by
\[
|n_1, n_2, \dots, n_i, \dots\rangle, \tag{82}
\]
resulting in
\[
\hat H|n_1, n_2, \dots, n_i, \dots\rangle = \sum_i n_i\varepsilon_i\,|n_1, n_2, \dots, n_i, \dots\rangle
\]
with
\[
N = \sum_i n_i \qquad\text{and}\qquad E = \sum_i n_i\varepsilon_i.
\]
It is straightforward to show that the basis $\{|n_1, n_2, \dots, n_i, \dots\rangle\}$ is complete,
\[
\sum_{\{n_i\}}|n_1, n_2, \dots, n_i, \dots\rangle\langle n_1, n_2, \dots, n_i, \dots| = \hat I,
\]
where $\hat I$ is the identity operator, and orthonormal,
\[
\langle n_1', n_2', \dots|n_1, n_2, \dots\rangle = \delta_{n_1 n_1'}\delta_{n_2 n_2'}\cdots
\]
For such an energy eigenstate, we can always define the creation operator $\hat a_i^\dagger$ so that
\[
\hat a_i^\dagger|n_1, n_2, \dots, n_i, \dots\rangle = C\,|n_1, n_2, \dots, n_i + 1, \dots\rangle,
\]
i.e. $\hat a_i^\dagger$ creates one particle into the state $i$. Correspondingly, the annihilation operator obeys
\[
\hat a_i|n_1, n_2, \dots, n_i, \dots\rangle = C'\,|n_1, n_2, \dots, n_i - 1, \dots\rangle,
\]
meaning that $\hat a_i$ removes one particle from the state $i$.

It is beneficial to use the grand canonical ensemble, since we are dealing with both particle and energy exchange with the bath. Again, because the particles do not interact, we can write the grand partition function $Z_G$ as a product of the partition functions $Z_G^k$ of the single-particle states,
\[
Z_G = \prod_k Z_G^k.
\]

Bosons

Creation and annihilation operators for bosons

For bosons, the creation and annihilation operators obey the commutation relations
\[
[\hat a_i, \hat a_j^\dagger] = \delta_{ij}, \qquad [\hat a_i, \hat a_j] = [\hat a_i^\dagger, \hat a_j^\dagger] = 0.
\]
One can show that (this has been done for harmonic oscillator creation and annihilation operators in quantum mechanics courses)
\[
\hat a_i|n_1, n_2, \dots, n_i, \dots\rangle = \sqrt{n_i}\,|n_1, n_2, \dots, n_i - 1, \dots\rangle,
\]
\[
\hat a_i^\dagger|n_1, n_2, \dots, n_i, \dots\rangle = \sqrt{n_i + 1}\,|n_1, n_2, \dots, n_i + 1, \dots\rangle.
\]
Also, the (occupation) number operator is the familiar
\[
\hat n_i = \hat a_i^\dagger\hat a_i
\]
and obeys the relation
\[
\hat n_i|n_1, n_2, \dots, n_i, \dots\rangle = n_i|n_1, n_2, \dots, n_i, \dots\rangle,
\]
with $n_i = 0, 1, \dots$

Bose-Einstein distribution

For bosons, all fillings $n_k$ of the eigenstate $\psi_k$ are allowed. Each particle in the state contributes with energy $\varepsilon_k$ and chemical potential $\mu$, so the grand partition function of the single-particle state $k$ is
\[
Z_G^k = {\rm Tr}\,e^{-\beta(\hat H_k - \mu\hat n_k)} = \sum_{n_k=0}^{\infty} e^{-\beta(\varepsilon_k - \mu)n_k} = \frac{1}{1 - e^{-\beta(\varepsilon_k - \mu)}},
\]
where the trace is taken over the complete set of symmetric many-body states. Thus, the grand partition function for bosons is
\[
Z_G^{\rm boson} = \prod_k\frac{1}{1 - e^{-\beta(\varepsilon_k - \mu)}}.
\]
The grand potential is
\[
\Omega_G = -k_B T\ln Z_G^{\rm boson} = k_B T\sum_k\ln\left(1 - e^{-\beta(\varepsilon_k - \mu)}\right).
\]
We recall that the average number of particles is given by Eq. (68),
\[
\langle N\rangle = -\frac{\partial\Omega_G}{\partial\mu}.
\]
This means that, since the grand potential of the whole $N$-particle system is composed of a sum of grand potentials of the individual states, we can write the expected number of particles in state $\psi_k$ as
\[
\langle\hat n_k\rangle = -\frac{\partial\Omega_G^k}{\partial\mu} = \frac{1}{e^{\beta(\varepsilon_k - \mu)} - 1}.
\]
We have thus seen that the number of particles in a single-particle bosonic state obeys the Bose-Einstein distribution
\[
\langle\hat n\rangle_{\rm BE} = \frac{1}{e^{\beta(\varepsilon - \mu)} - 1}. \tag{83}
\]
This describes the filling of single-particle eigenstates with non-interacting bosons.

[Figure: the occupations ⟨n⟩_BE and ⟨n⟩_MB as functions of (ε − µ)/k_B T; the curves approach each other in the classical limit.]

When the occupancy of a state is low, $\langle n\rangle \ll 1$, we obtain the Boltzmann distribution
\[
\langle\hat n\rangle_{\rm BE} \approx e^{-\beta(\varepsilon - \mu)}.
\]
The condition for this is $\varepsilon - \mu \gg k_B T$ [33]. As $\varepsilon \to \mu$ we obtain $\langle\hat n\rangle_{\rm BE} \to \infty$. In systems of non-interacting bosons $\mu$ is always less than the lowest single-particle eigenenergy.

As a final remark, Eq. (64) corresponds to the average excitation level of a quantum harmonic oscillator,
\[
\langle\hat n\rangle_{\rm qho} = \frac{1}{e^{\beta\hbar\omega} - 1}.
\]
One can use this to show that the excitations of a harmonic oscillator can be treated as particles obeying Bose-Einstein statistics.

Fermions

Creation and annihilation operators for fermions

For fermions, the creation and annihilation operators obey the anti-commutation relations
\[
\{\hat a_i, \hat a_j^\dagger\} = \hat a_i\hat a_j^\dagger + \hat a_j^\dagger\hat a_i = \delta_{ij}, \qquad
\{\hat a_i, \hat a_j\} = \{\hat a_i^\dagger, \hat a_j^\dagger\} = 0.
\]
One can show that
\[
\hat a_i|n_1, n_2, \dots, n_i, \dots\rangle = \begin{cases} (-1)^{S_i}\sqrt{n_i}\,|n_1, n_2, \dots, n_i - 1, \dots\rangle & \text{if } n_i = 1, \\ 0 & \text{otherwise,} \end{cases}
\]
\[
\hat a_i^\dagger|n_1, n_2, \dots, n_i, \dots\rangle = \begin{cases} (-1)^{S_i}\sqrt{n_i + 1}\,|n_1, n_2, \dots, n_i + 1, \dots\rangle & \text{if } n_i = 0, \\ 0 & \text{otherwise.} \end{cases}
\]
In the above, $S_i = n_1 + n_2 + \cdots + n_{i-1}$. The (occupation) number operator is
\[
\hat n_i = \hat a_i^\dagger\hat a_i
\]
and obeys again the relation
\[
\hat n_i|n_1, n_2, \dots, n_i, \dots\rangle = n_i|n_1, n_2, \dots, n_i, \dots\rangle,
\]
with $n_i = 0, 1$.

Example: Spin-1/2 system

A spin-1/2 particle is characterized by two quantum states, spin-"up" and spin-"down", denoted by
\[
|\!\uparrow\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad |\!\downarrow\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.
\]
One can show that every operator $\hat O$ in the two-dimensional Hilbert space spanned by $\{|\!\downarrow\rangle, |\!\uparrow\rangle\}$ can be written as
\[
\hat O = c\hat I + a_1\hat\sigma_x + a_2\hat\sigma_y + a_3\hat\sigma_z,
\]
where $c$ and $a_i$ are complex coefficients (real when $\hat O$ is Hermitian) and
\[
\hat\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
\hat\sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad
\hat\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
\]
are the Pauli spin matrices. The creation and annihilation operators are now
\[
\hat\sigma_+ = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
\hat\sigma_- = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}.
\]

Fermi-Dirac distribution

Due to the antisymmetry of the wave function of fermions, only $n_k = 0$ and $n_k = 1$ are allowed. This means that the grand partition function of the single-particle state $k$ is
\[
Z_G^k = \sum_{n_k=0}^{1} e^{-\beta(\varepsilon_k - \mu)n_k} = 1 + e^{-\beta(\varepsilon_k - \mu)}.
\]
The grand partition function for the $N$-fermion system is thus
\[
Z_G^{\rm fermion} = \prod_k\left(1 + e^{-\beta(\varepsilon_k - \mu)}\right).
\]
We obtain the average occupation of the state $\psi_k$ to be
\[
\langle\hat n_k\rangle = \frac{\sum_{n_k=0}^{1} n_k e^{-\beta(\varepsilon_k - \mu)n_k}}{\sum_{n_k=0}^{1} e^{-\beta(\varepsilon_k - \mu)n_k}} = \frac{1}{e^{\beta(\varepsilon_k - \mu)} + 1}.
\]
We have thus shown that the average occupation of a single-particle state in a fermionic system obeys the Fermi-Dirac distribution
\[
f(\varepsilon) = \langle\hat n\rangle_{\rm FD} = \frac{1}{e^{\beta(\varepsilon - \mu)} + 1}, \tag{84}
\]
where $f(\varepsilon)$ is often called the Fermi function. Again, in the limit $\varepsilon_k - \mu \gg k_B T$, the occupation of the state $\psi_k$ is low and the distribution is given by the Boltzmann distribution. In fermionic systems, the chemical potential can be greater or less than any eigenenergy; at low temperatures the chemical potential separates the filled states from the empty states, and is often called the Fermi energy $\varepsilon_F = \mu(T = 0)$.

[Figure: the occupations ⟨n⟩_FD and ⟨n⟩_MB as functions of ε/µ at T = 0 and at small T; the step at ε = µ is smeared over a width Δε ~ k_B T, and the curves merge in the classical limit.]

Note that only states within $|\varepsilon - \mu| \lesssim k_B T$ are partially filled.

[33] Rather interestingly, this is usually fulfilled at high temperatures, because then there are more available states for occupation, leading to a large and negative $\mu$.

Maxwell-Boltzmann gas

We calculate the trace in the grand partition function over the whole Hilbert space, i.e. over the states $\Psi_E$. For each value of $N$, we eliminate the overcounting by dividing by $N!$:
\[
Z_G = \sum_N\frac{1}{N!}\left(\sum_k e^{-\beta\varepsilon_k}\right)^N e^{N\beta\mu} \tag{85}
\]
\[
\phantom{Z_G} = \prod_k\exp\left(e^{-\beta(\varepsilon_k - \mu)}\right).
\]
The grand potential is
\[
\Omega_G = -k_B T\sum_k e^{-\beta(\varepsilon_k - \mu)}.
\]
Thus, we end up with the average number of particles in a single-particle eigenstate with energy $\varepsilon$,
\[
\langle\hat n\rangle_{\rm MB} = e^{-\beta(\varepsilon - \mu)}, \tag{86}
\]
which is called the Maxwell-Boltzmann distribution. One sees that this indeed is the limiting case of both the Bose-Einstein and Fermi-Dirac distributions in the limit $\varepsilon - \mu \gg k_B T$. It is worthwhile to notice that the correct result was obtained by tracing over the whole Hilbert space, and by requiring the indistinguishability of the particles with the factor $1/N!$.

Interestingly, the non-interacting quantum particles fill the single-particle states according to the same law
\[
\langle\hat n\rangle = \frac{1}{e^{\beta(\varepsilon - \mu)} + c},
\]
where $c = -1$ for bosons, $c = 1$ for fermions and $c = 0$ for Maxwell-Boltzmann statistics.

7. Ideal Bose Gas

We see that the distributions for both bosons and fermions imply that in a certain parameter range the statistics deviate from the classical Maxwell-Boltzmann statistics. For bosons, the standard textbook examples include Bose-Einstein condensation, black body radiation and vibrations of the crystal lattice. The last example is vital for understanding, e.g., the specific heats of metals at high temperatures. Nevertheless, we will omit the discussion here [34].

Degeneracy discriminant

Before we continue the exploration of the resulting phenomena, we would like to have some kind of criterion to separate the classical and quantum realms.

We have seen that the Maxwell-Boltzmann distribution is valid when $\langle\hat n_k\rangle \ll 1$ for all single-particle states $k$. This can be rewritten as
\[
e^{\beta\mu} \ll e^{\beta\varepsilon_k} \ \text{ for all } k, \qquad\text{i.e.}\qquad e^{\beta\mu} \ll 1,
\]
since $\min\varepsilon_k = 0$. In the above, $e^{\beta\mu}$ is often called the fugacity, which describes the ease of adding particles to the system. Now, recall that the average number of particles can be written as
\[
N = \langle\hat N\rangle = -\frac{\partial\Omega_G}{\partial\mu},
\]
where the grand potential $\Omega_G$ for the Maxwell-Boltzmann gas is
\[
\Omega_G = -k_B T e^{\beta\mu}\sum_k e^{-\beta\varepsilon_k} = -k_B T e^{\beta\mu}Z_1,
\]
where $Z_1(\beta)$ is the single-particle canonical partition function.
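The three occupation laws of Section 6.4 can be compared numerically. The sketch below is only an illustration with hypothetical helper names (`occupation`, `mean_from_sum`); it uses the dimensionless variable $x = (\varepsilon - \mu)/k_B T$, evaluates $\langle\hat n\rangle = 1/(e^{x} + c)$ for $c = -1, 0, +1$, and also recomputes the Bose-Einstein and Fermi-Dirac occupations directly from the sums over $n_k$ that define the single-state grand partition function (the bosonic sum is truncated at a large cutoff, which is an approximation).

```python
import math


def occupation(x, c):
    """<n> = 1 / (exp(x) + c) with x = (eps - mu)/(k_B T).

    c = -1: Bose-Einstein, c = 0: Maxwell-Boltzmann, c = +1: Fermi-Dirac.
    """
    if c == -1 and x <= 0:
        raise ValueError("Bose-Einstein occupation requires eps > mu")
    return 1.0 / (math.exp(x) + c)


def mean_from_sum(x, n_max):
    """<n> = sum_n n e^{-x n} / sum_n e^{-x n}, with n = 0..n_max.

    n_max = 1 reproduces the fermionic case exactly; a large n_max
    approximates the bosonic geometric series.
    """
    weights = [math.exp(-x * n) for n in range(n_max + 1)]
    return sum(n * w for n, w in enumerate(weights)) / sum(weights)


print("   x      BE          MB          FD")
for x in (0.1, 1.0, 5.0, 10.0):
    print(f"{x:5.1f}  {occupation(x, -1):10.6f}  {occupation(x, 0):10.6f}  "
          f"{occupation(x, 1):10.6f}")

# Direct sums over the allowed fillings agree with the closed forms.
x = 1.0
print("BE from sum:", mean_from_sum(x, 200), " closed form:", occupation(x, -1))
print("FD from sum:", mean_from_sum(x, 1), " closed form:", occupation(x, 1))
```

For every $x > 0$ the values are ordered as BE > MB > FD, and for $x \gg 1$ all three coincide, which is the classical limit discussed in the text.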
In the classical limit, we obtained the result (57)
\[
Z_1(\beta) = \sum_k e^{-\beta\varepsilon_k} \approx \frac{V}{h^3}(2\pi m k_B T)^{3/2} \equiv V\lambda_T^{-3},
\]
where $\lambda_T = \sqrt{h^2/(2\pi m k_B T)}$ is the thermal de Broglie wavelength. Since $N = -\partial\Omega_G/\partial\mu = e^{\beta\mu}Z_1$, the fugacity is $e^{\beta\mu} = \rho\lambda_T^3$, and we can write the condition for classical statistics as
\[
\rho\lambda_T^3 \ll 1, \tag{87}
\]
where $\rho = N/V$ is the density of the gas.

One gets an intuitive picture of the above condition by considering the volume per particle as a sphere of radius $r$,
\[
\frac{1}{\rho} = \frac{4\pi}{3}r^3.
\]

[34] The interested reader can refer to the material of the course on condensed matter physics.

In addition, it is beneficial to consider the particles composing the gas as quantum mechanical wave packets [35]. Next, we make a rough estimate of the size of the wave packet. We assume that the average energy of a particle is given by the thermal energy (equipartition theorem):
\[
\frac{p^2}{2m} \approx k_B T \quad\to\quad p \approx \frac{h}{\sqrt{h^2/(2\pi m k_B T)}} = \frac{h}{\lambda_T}.
\]
If we assume that the quantum mechanical uncertainty in the momentum $\Delta p$ is less than the momentum $p$ itself, we obtain according to Heisenberg's uncertainty principle
\[
\Delta x \ge \lambda_T.
\]
Now, the thermal wavelength $\lambda_T$ can be considered as the minimum size of the wave packet. Thus, we see that the classical limit occurs when the wave packets do not overlap. In other words, this means that the quantum correlations between the particles are suppressed when the particles are small compared to the volume they are occupying.

One should note that the quantum effects reveal themselves when the density of the particles is high and/or the temperature of the system is low, together with a small particle mass [36].

Density of free particles in a box

Before we start the actual demonstration of the power of quantum statistics, we note that one often has to study statistical quantities of the type
\[
\sum_\varepsilon F_\varepsilon,
\]
where $F_\varepsilon$ is a function defined by the single-particle energy eigenstates. If the studied volume is large, the eigenstates lie close to one another and $F_\varepsilon$ can be taken as a continuous function of energy. Thus, the summations are easily evaluated as integrals.

In the case of free particles, we assume that they are confined to a cube with $V = L^3$ and subjected to periodic boundary conditions [37], resulting in a solution of the Schrödinger equation in the form of plane waves
\[
\psi_{\mathbf k} = \frac{1}{\sqrt V}e^{i\mathbf k\cdot\mathbf r},
\]
with the wave vector $\mathbf k = (2\pi/L)(n_1, n_2, n_3)$, where the $n_i$ are integers. Each state occupies the volume $(2\pi/L)^3$ in the $k$-space, so the density of plane waves in the $k$-space is
\[
D(\mathbf k) = \frac{V}{8\pi^3}.
\]
Now, the energy of a plane wave is given by the dispersion relation [38]
\[
\varepsilon_k = \frac{\hbar^2 k^2}{2m}.
\]
Thus, the number of plane waves in a small range of energy $d\varepsilon$ is
\[
D(\varepsilon)\,d\varepsilon = (4\pi k^2)\,\frac{V}{8\pi^3}\,\frac{dk}{d\varepsilon}\,d\varepsilon,
\]
where the first term is the surface area of a sphere in $k$-space, the second term is the density of plane waves in $k$-space, and the third is the thickness of the spherical shell. The above equation defines the energy density of states
\[
D(\varepsilon) = \frac{2\pi V}{h^3}(2m)^{3/2}\varepsilon^{1/2}.
\]

7.1 Bose-Einstein Condensation

We saw in the previous section that the number of particles in a single-particle state of a many-boson system can take any non-negative integer value.

[Figure: the occupation ⟨n⟩_BE as a function of (ε − µ)/k_B T, showing how the distribution changes as T decreases.]

Especially interesting is the case of extremely low temperatures, where all of the particles tend to go to the lowest accessible quantum state. This results in a new form of matter, the Bose-Einstein condensate. Let us try to estimate the temperature region where this macroscopic quantum phenomenon occurs.

Now, with $z = e^{\beta\mu}$ denoting the fugacity, the number of particles in the ground state ($\varepsilon = 0$) is
\[
n_0 = \langle\hat n_0\rangle = \frac{1}{z^{-1} - 1} = \frac{z}{1 - z} \to \infty, \tag{88}
\]
because $z \to 1$ when $T \to 0$ (exercise). Thus, we obtain for the total number of particles
\[
N = \sum_\varepsilon\frac{1}{z^{-1}e^{\beta\varepsilon} - 1} = \frac{z}{1 - z} + \sum_{\varepsilon\neq 0}\frac{1}{z^{-1}e^{\beta\varepsilon} - 1}.
\]

[35] See Quantum mechanics.

[36] It is vital to acknowledge that the temperature and density should be compared with each other.
Now, when the volume $V$ is large, the spectrum of single-particle energy states is almost continuous. Thus, one can approximate the summation with an integral [39]

$$N = n_0 + \int_0^\infty \frac{D(\varepsilon)}{z^{-1}e^{\beta\varepsilon}-1}\,d\varepsilon = n_0 + \frac{V}{\lambda_T^3}\, g_{3/2}(z).$$

[39] Notice that we separated the ground-state term from the integral, since the density of states would take it into account incorrectly, giving it zero weight.

We have also identified the Bose-Einstein functions

$$g_\nu(z) = \frac{1}{\Gamma(\nu)}\int_0^\infty \frac{x^{\nu-1}\,dx}{z^{-1}e^{x}-1} = \sum_{n=1}^\infty \frac{z^n}{n^\nu}.$$

Similarly, one can show that the grand potential is

$$\Omega_{BE} = k_BT\ln(1-z) - \frac{V k_BT}{\lambda_T^3}\, g_{5/2}(z),$$

which implies

$$p = -\left(\frac{\partial \Omega_{BE}}{\partial V}\right)_{T,\mu} = \frac{k_BT}{\lambda_T^3}\, g_{5/2}(z). \tag{89}$$

Similarly, we can calculate the internal energy of the system

$$E = \int_0^\infty \frac{\varepsilon D(\varepsilon)}{z^{-1}e^{\beta\varepsilon}-1}\,d\varepsilon = \frac{3}{2}\, k_BT\, \frac{V}{\lambda_T^3}\, g_{5/2}(z).$$

Thus, we obtain for the ideal Bose gas

$$p = \frac{2E}{3V}.$$

In the classical limit we have $\langle n_l\rangle \ll 1$ for all states $l$. Thus, $n_0$ is negligible compared with $V g_{3/2}(z)/\lambda_T^3$, and $\rho\lambda_T^3 = g_{3/2}(z)$. Since $\mu < 0$, we have that $z \to 0$ when $T \to \infty$, which is in accordance with our condition (87) for classical statistics. Also, when $T \to 0$ the fugacity $z \to 1$. So, the fugacity takes values $0 < z < 1$, and within this interval the Bose-Einstein function $g_{3/2}$ is a monotonically increasing function with the limits

$$g_{3/2}(0) = 0, \qquad g_{3/2}(1) = 2.612.$$

When the fugacity $z = 1$ (i.e. $\mu(T, N) = 0$), the density $\rho$ and the temperature $T$ obey

$$\rho\lambda_T^3 = g_{3/2}(1) = 2.612.$$

This implies that $n_0 = 0$. If we still increase the density, or decrease the temperature, the increase in the quantity $\rho\lambda_T^3$ must originate from the particles occupying the ground state. We thus see that

$$\rho\lambda_T^3 = g_{3/2}(z) \quad \text{when } \rho\lambda_T^3 < 2.612,$$

$$\rho\lambda_T^3 = \frac{n_0}{V}\lambda_T^3 + g_{3/2}(1) \quad \text{when } \rho\lambda_T^3 \geq 2.612.$$

In other words, when

$$\rho\lambda_T^3 \geq 2.612$$

there will be a macroscopic occupation in the ground state, forming a Bose-Einstein condensate. Again, the condensate starts to arise when the thermal wave packets describing the bosons start to overlap.

Typically, the density of the gas is kept constant and the temperature is lowered. The temperature below which the chemical potential $\mu$ is zero, i.e. $z = 1$, is

$$T_c = \frac{2\pi\hbar^2}{m k_B}\left(\frac{\rho}{2.612}\right)^{2/3}.$$

This is called the critical temperature, below which the condensate starts to form. Below $T_c$, the relative fraction of particles in the condensate is

$$\frac{n_0}{N} = 1 - \frac{2.612}{\lambda_T^3\rho} = 1 - \left(\frac{T}{T_c}\right)^{3/2}. \tag{90}$$

[Figure: the fractions $n_0/N$ and $n_e/N$ as functions of $T/T_c$; the coexistence region lies at $T < T_c$.]

We see that at $T_c$ the gas goes through a phase transition, and starts to coexist in two phases. The particles in the ground state form the condensed phase, characterised by the order parameter $n_0/N$ [40]. The particles in the excited states form a normal phase with the relative fraction of particles given by

$$\frac{n_e}{N} = 1 - \frac{n_0}{N}.$$

[40] We will return to the concept of order parameter when we study more about phase transitions.

One can also imagine that the density of the gas is increased at constant temperature, resulting in the critical density

$$\rho_c = 2.612 \times \lambda_T^{-3}.$$

Now, the critical temperature leads, together with Eq. (89), to the critical pressure

$$p_c = \frac{k_BT_c}{\lambda_{T_c}^3}\, g_{5/2}(1).$$

[Figure: specific heat $C_V$ of the ideal Bose gas as a function of $T$, with $T_c$ marked.]

The phase transition is of first order, meaning that the first derivative of the Gibbs energy is discontinuous.

Experimental realisation

Helium-4

Helium is a peculiar element, since it resists going into the solid phase at standard pressure, even at the lowest temperatures. This is because the small mass of the helium atom causes a zero-point motion large enough to prevail over the weak inter-atomic attractive (van der Waals) forces. Consequently, the boiling point of helium is extremely low, 4.2 kelvins. Helium has two stable isotopes, $^3$He and $^4$He. $^3$He is a fermion and, thus, we concentrate here on the $^4$He atoms, which are bosons.

Helium-4 goes through a phase transition at about 2.17 K. This is visible as the discontinuity in the specific heat $C_V$.

[Figure: specific heat $C_V$ of liquid $^4$He as a function of $T$; the phases He II and He I meet at $T_c$.]

Due to the shape of the $C_V$ curve, the transition was named the λ-transition and the corresponding $T_c$ the λ-temperature. Interestingly, the shape of the $C_V$ curve resembles that of a Bose gas. Moreover, the critical temperature for Bose-Einstein condensation, calculated with the $^4$He parameters, gives $T_c = 3.13$ K, which is not that far from the observed transition temperature. This led people to believe that the transition in $^4$He was a manifestation of Bose-Einstein condensation.

However, $^4$He is not an ideal liquid. There exist strong repulsive interactions between $^4$He atoms at short distances, and attractive (although weak) ones at longer distances. Nevertheless, the theoretical description of the fluid below the critical temperature is based on the so-called two-fluid model. Below the critical temperature, the particles occupying the ground state are thought to form a superfluid, a state of matter which behaves like a fluid with zero viscosity and zero entropy. The phase with a superfluid component is commonly called helium-II, whereas the completely normal $^4$He is helium-I. On the other hand, the particles occupying the excited states are identified as the normal component. This can be denoted as

$$\rho = \rho_s + \rho_n, \qquad \mathbf j = \mathbf j_s + \mathbf j_n.$$

It should, nevertheless, be stressed that quantitative agreement with the experiments requires the inclusion of interactions in the model.

BEC in ultra-cold atomic gases

The first demonstration of gaseous Bose-Einstein condensation was made in dilute gases of alkali atoms in 1995 [41]. In these experiments, the gases were trapped and cooled with magneto-optical traps, which effectively confine the gas in a three-dimensional harmonic potential. The experiment is performed many times with different final temperatures, and once the temperature is reached the trapping magnetic field is turned off and the subsequent ballistic motion of the atoms is observed. Atoms in a high-temperature gas obey the thermal distribution and expand in a thermal manner. The atoms below the condensation temperature are in the ground state of the harmonic trap, and will hardly move after the field is turned off.

[Figure: the onset of Bose-Einstein condensation in rubidium, taken from the experiments on rubidium atoms by Wieman and Cornell's group at JILA-NIST.]

[41] Wieman and Cornell with rubidium atoms, and Ketterle with sodium atoms. They shared the physics Nobel prize for this in 2001.

7.2 Blackbody Radiation (Photon Gas)

Matter emits and absorbs radiation. In equilibrium, the rates of these processes are equal, which is called the detailed balance. Now, consider a perfect absorber, a blackbody, that can absorb all radiation at all frequencies. What will its emission spectrum look like?

First, recall that electromagnetic radiation consists of plane waves, similar to Eq. (88). Since photons are massless, the dispersion relation is $\omega_k = ck$, where $c$ is the speed of light. There are two modes for each wave vector $\mathbf k$, one for each transverse polarization.

Experimentally, it is impossible to have a perfect blackbody. A closed cavity with a small opening, kept at a constant temperature, nevertheless produces a good approximation.

The radiation spectrum of a blackbody was one of the key ignitions for the theory of quantum mechanics. Namely, classical physics predicts the so-called ultraviolet catastrophe, radiation at infinite power. This results from the fact that each mode of the radiation field can have the equipartition energy $k_BT$. Moreover, in a 3-dimensional cavity, the number of modes is proportional to the square of the frequency. So, the radiated power should follow the Rayleigh-Jeans law, and be proportional to the squared frequency, leading to unlimited radiated power at high frequencies. This is clearly in contradiction with the real world [42], except for the limit of low frequencies.

[42] You do not get a sun-tan by opening the door of a hot oven!

The correct description of this problem was given by Max Planck at the beginning of the 20th century, when he proposed that the emitted energies are not distributed continuously, but come in lumps proportional to the frequency, i.e.

$$\varepsilon_k = \hbar\omega_k.$$

Let us apply Planck's ideas and consider a cavity of volume $V$ at temperature $T$. There exist two conceptually different, but practically identical, viewpoints, as the field can be seen as

• an assembly of quantum harmonic oscillators with energies $\varepsilon_s = \hbar\omega_s(n_s + \frac{1}{2})$, where $n_s = 0, 1, 2, \ldots$ and $\omega_s$ is the angular frequency of the oscillator.

• a gas of identical and indistinguishable quanta, photons, with energies $\hbar\omega_s$. Each photon is a boson, and each mode can support any number $n_s = 0, 1, 2, \ldots$ of photons.

In both cases, one obtains the same expectation value

$$\langle n_s\rangle = \frac{1}{e^{\beta\hbar\omega_s}-1}.$$

Notice that this is the Bose-Einstein distribution with $\mu = 0$. Since we are dealing with an equilibrium gas, the free energy should be minimized in order to find the equilibrium number of photons. The number of photons is not fixed by any conservation law, unlike particle numbers that are tied to the conservation of mass, charge, lepton number and such. Thus, there are no constraints imposed in the minimization, giving just $(\partial A/\partial N)_{T,V} = 0$. This implies the disappearance of the chemical potential. In other words, the total number of particles in the present case is indefinite.

We see that there exists a complete correspondence between "an oscillator in eigenstate $n_s$ with eigenenergy $\hbar\omega_s(n_s + \frac{1}{2})$" and "the occupation of the energy level $\hbar\omega_s$ by $n_s$ photons". From this point on, regardless of the viewpoint used, the treatments are the same.

The number of single-particle plane waves in a small range $d\omega$ is given by

$$D(\omega)\,d\omega = \frac{2V}{8\pi^3}\,(4\pi k^2)\,\frac{dk}{d\omega}\,d\omega,$$

where the factor two is due to the number of modes per $\mathbf k$. We thus obtain the density of states per unit frequency

$$D(\omega) = \frac{V\omega^2}{\pi^2c^3}.$$

Thus, the average total energy in the field is [43]

$$\langle E\rangle = \int_0^\infty d\omega\, D(\omega)\, \frac{\hbar\omega}{e^{\beta\hbar\omega}-1}.$$

[43] Notice that here we neglect the (infinite) zero-point energy. This is the so-called renormalization of the energy, justified by stating that only differences in energy can be measured. Observant readers have noticed that in the "photon picture" there seems to be no such term as the zero-point energy. Nevertheless, this is just sloppy notation, since according to quantum field theory, the zero-point fluctuations in the oscillator model are present as vacuum fluctuations in the photon model. The vacuum fluctuations arise as the constant creation and annihilation of virtual photons, giving rise to a vacuum energy equivalent to the zero-point energy of the oscillators.

We thus see that the energy density can be written as

$$\frac{E}{V} = \frac{\hbar}{\pi^2c^3}\int_0^\infty d\omega\, \frac{\omega^3}{e^{\beta\hbar\omega}-1}.$$

This implies the energy density per unit frequency

$$u(\omega) = \frac{\hbar\omega^3}{\pi^2c^3\,(e^{\beta\hbar\omega}-1)}, \tag{91}$$

which is Planck's law of radiation. The spectrum of a perfect absorber is determined solely by its temperature, not by its shape, material or structure. At room temperature, most of the energy radiated by the blackbody is infra-red, not perceived by the human eye. Only above about 800 K does the blackbody start to radiate visible light.

[Figure: energy density $u(\omega)$ as a function of $\omega$ for temperatures $T_1 < T_2 < T_3$, showing the maximum $\omega_{\max}(T)$ and the equipartition (Rayleigh-Jeans) curve.]

We see that the maximum of the energy density follows the Wien displacement law

$$\omega_{\max} = \text{constant} \times T.$$

In the limit of small frequencies, one obtains

$$u(\omega) \approx \frac{k_BT}{\pi^2c^3}\,\omega^2 = k_BT\,\frac{D(\omega)}{V},$$

which is the Rayleigh-Jeans law, equivalent to the equipartition energy density: $k_BT$ per classical oscillator.

In general, the total energy density obeys the Stefan-Boltzmann law

$$\frac{E}{V} = \frac{4}{c}\,\sigma T^4,$$

where

$$\sigma = \frac{\pi^2k_B^4}{60\hbar^3c^2} = 5.67\cdot 10^{-8}\ \frac{\text{W}}{\text{m}^2\text{K}^4}.$$

We can also calculate other properties of the photon gas. The free energy is given by the grand potential (because $\mu = 0$)

$$A = \Omega_G + \mu N = \Omega_G = k_BT\int_0^\infty D(\omega)\ln\left(1 - e^{-\beta\hbar\omega}\right)d\omega.$$

We obtain by the change of variable $\beta\hbar\omega = x$, and by integration by parts,

$$A = -\frac{4\sigma}{3c}\,VT^4 = -\frac{1}{3}E.$$

The entropy of the gas is given by

$$S = -\frac{\partial A}{\partial T} = \frac{16\sigma}{3c}\,VT^3,$$

and the pressure by

$$p = -\frac{\partial A}{\partial V} = \frac{4\sigma}{3c}\,T^4.$$

We thus see that the pressure of the photon gas is one third of its energy density, i.e.

$$p = \frac{1}{3}\frac{E}{V}.$$

This is a well-known result of radiation theory.

One of the most important examples of blackbody radiation is the 2.7 K cosmic microwave background, which is thought to be the remnant from the Big Bang.

8. Ideal Fermi Gas

In this chapter, we will study the gas of indistinguishable, non-interacting fermions. According to the previous chapters, we have obtained the grand potential

$$\Omega_G^{\text{fermion}} = -k_BT\sum_\varepsilon \ln\left(1 + z e^{-\beta\varepsilon}\right)$$

and the average number of particles

$$N = \sum_\varepsilon \langle \hat n_\varepsilon\rangle = \sum_\varepsilon \frac{1}{z^{-1}e^{\beta\varepsilon}+1}.$$

Unlike for bosons, the fugacity of the fermion gas is unrestricted, i.e. $0 \leq z < \infty$. Because of the Pauli principle, no phenomena like the Bose-Einstein condensation, where a macroscopic number of particles go into the same single-particle state, can occur. Nevertheless, the fermion gas has its own peculiarities that arise from the characteristic quantum nature of its constituents, and those will be studied here.

Similar to boson gases, one can change the summations of statistical variables into integrations for large enough volumes $V$. This leads to the energy density of states

$$D(\varepsilon) = g_s\,\frac{2\pi V}{h^3}\,(2m)^{3/2}\,\varepsilon^{1/2},$$

where $g_s$ is the number of different spin states (for a spin-$\frac{1}{2}$ particle $g_s = 2$). Accordingly, we obtain

$$\frac{p}{k_BT} = \frac{g_s}{\lambda_T^3}\, f_{5/2}(z)$$

and

$$\rho = \frac{g_s}{\lambda_T^3}\, f_{3/2}(z).$$
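To judge which regime of the relation $\rho = g_s f_{3/2}(z)/\lambda_T^3$ applies to a given gas, it can help to evaluate the dimensionless combination $\rho\lambda_T^3/g_s$ directly. The sketch below is an illustration added here, not part of the original notes. It assumes the thermal wavelength $\lambda_T = h/\sqrt{2\pi m k_BT}$, and the two electron densities and temperatures are assumed order-of-magnitude sample values (a typical metal and a dilute plasma).

```python
import math

H = 6.62607015e-34    # Planck constant, J s
K_B = 1.380649e-23    # Boltzmann constant, J/K
M_E = 9.1093837e-31   # electron mass, kg

def thermal_wavelength(mass, temperature):
    """Thermal de Broglie wavelength lambda_T = h / sqrt(2 pi m k_B T), in metres."""
    return H / math.sqrt(2.0 * math.pi * mass * K_B * temperature)

def degeneracy_parameter(density, mass, temperature, g_s=2):
    """Dimensionless rho * lambda_T^3 / g_s; values >> 1 indicate a degenerate gas."""
    return density * thermal_wavelength(mass, temperature) ** 3 / g_s

if __name__ == "__main__":
    # Assumed illustrative values, not taken from the notes:
    # conduction electrons in a metal (~8.5e28 m^-3) at room temperature,
    # and electrons in a dilute plasma (~1e18 m^-3) at 1e4 K.
    metal = degeneracy_parameter(8.5e28, M_E, 300.0)
    plasma = degeneracy_parameter(1.0e18, M_E, 1.0e4)
    print(f"metal:  rho*lambda^3/g_s = {metal:.3g}")
    print(f"plasma: rho*lambda^3/g_s = {plasma:.3g}")
```

With these sample numbers the metallic electrons come out strongly degenerate, whereas the plasma electrons are deep in the classical regime.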
The Fermi-Dirac functions appearing in the pressure and the density above are defined as

$$f_\nu(z) = \frac{1}{\Gamma(\nu)}\int_0^\infty \frac{x^{\nu-1}\,dx}{z^{-1}e^{x}+1} = \sum_{n=1}^\infty (-1)^{n-1}\frac{z^n}{n^\nu}.$$

The internal energy is given by

$$E = \int_0^\infty \frac{\varepsilon D(\varepsilon)}{z^{-1}e^{\beta\varepsilon}+1}\,d\varepsilon = \frac{3}{2}\,k_BT\,\frac{g_sV}{\lambda_T^3}\, f_{5/2}(z),$$

giving also for the ideal Fermi gas the relation

$$p = \frac{2E}{3V}.$$

We see that in the classical limit

$$\frac{\rho\lambda_T^3}{g_s} = f_{3/2}(z) \ll 1.$$

This means that $f_\nu(z) \approx z$. Thus, we obtain the familiar classical results

$$pV = Nk_BT, \qquad E = \frac{3}{2}Nk_BT$$

and so on. For the values $\rho\lambda_T^3/g_s \sim 1$, one has to rely on numerical calculations in expanding the series in the Fermi-Dirac functions. In the quantum limit $\rho\lambda_T^3/g_s \to \infty$, one can write the thermodynamics in a closed form, and the gas is completely degenerate, meaning that

$$n_{\rm FD} = \begin{cases} 1 & \text{for } \varepsilon < \mu_0 \\ 0 & \text{for } \varepsilon > \mu_0. \end{cases}$$

Here $\mu_0 = \varepsilon_F$ is the chemical potential at $T = 0$, often referred to as the Fermi energy. All single-particle states up to the Fermi energy are completely filled (according to the Pauli principle), and those above the Fermi energy are completely empty.

The value of the Fermi energy can be calculated from the density, because

$$N = \int_0^{\varepsilon_F} D(\varepsilon)\,d\varepsilon = \frac{g_s\,(2\varepsilon_Fm)^{3/2}\,V}{6\pi^2\hbar^3} \quad\Rightarrow\quad \varepsilon_F = \left(\frac{6\pi^2\rho}{g_s}\right)^{2/3}\frac{\hbar^2}{2m}.$$

Similarly, one can calculate the average energy at $T = 0$, or the so-called ground-state energy, as

$$E_0 = \int_0^{\varepsilon_F}\varepsilon D(\varepsilon)\,d\varepsilon = \frac{\sqrt{2}\,g_sVm^{3/2}}{5\pi^2\hbar^3}\,\varepsilon_F^{5/2},$$

which implies the average energy per particle

$$\frac{E_0}{N} = \frac{3}{5}\,\varepsilon_F.$$

Often needed quantities are the Fermi temperature, momentum and wave vector, which are defined respectively as

$$T_F = \frac{\varepsilon_F}{k_B}, \qquad p_F = \sqrt{2m\varepsilon_F}, \qquad k_F = \frac{p_F}{\hbar}.$$

If we want to study properties like the specific heat and the entropy, we have to have a finite temperature. If we can still assume that the temperature is low, meaning that $T \ll \mu/k_B$, we expect that the deviations from the ground state are not that large and that we need only the first terms from the so-called asymptotic Sommerfeld expansion

$$f_\nu(z) = \frac{(\beta\mu)^\nu}{\Gamma(\nu+1)}\left[1 + \nu(\nu-1)\frac{\pi^2}{6}\frac{1}{(\beta\mu)^2} + \nu(\nu-1)(\nu-2)(\nu-3)\frac{7\pi^4}{360}\frac{1}{(\beta\mu)^4} + \cdots\right]. \tag{92}$$

The Sommerfeld expansion gives us the low-temperature corrections to the thermodynamic quantities, such as

$$\frac{E}{N} = \frac{3}{5}\,\varepsilon_F\left[1 + \frac{5\pi^2}{12}\left(\frac{k_BT}{\varepsilon_F}\right)^2 + \cdots\right],$$

$$p = \frac{2E}{3V} = \frac{2}{5}\,\rho\,\varepsilon_F\left[1 + \frac{5\pi^2}{12}\left(\frac{k_BT}{\varepsilon_F}\right)^2 + \cdots\right].$$

One immediately sees that the leading terms are identical to the ground-state results. From the average energy, we can calculate the specific heat

$$\frac{C_V}{Nk_B} = \frac{\pi^2}{2}\,\frac{k_BT}{\varepsilon_F}.$$

This implies that for $T \ll T_F$ the specific heat is much less than the classical value $\frac{3}{2}Nk_B$.

This last result is a consequence of a fundamental property of the Fermi gas. Due to the Pauli principle, the single-particle states far below the Fermi energy are completely filled. It means that the particles occupying such states cannot be (easily) excited, because there are no empty states nearby, and therefore they do not contribute to the specific heat, or for example to the electric conductivity. For metals, the Fermi temperature is typically high (of the order of $10^4$ K), meaning that the low-temperature approximation is justified at room temperature. Thus, the low-temperature ideal Fermi gas serves as an excellent starting point for the study of metallic properties [44].

[44] This fact is exhaustively discussed in the condensed matter physics course, and will not be studied further here.

8.1 Magnetism in an Ideal Fermi Gas

Let us study the behaviour of the ideal Fermi gas under the influence of an external magnetic field $B$. Previously on page 26, we showed that if the orientations of the spins of the fermions are distributed according to the classical Boltzmann distribution, the net magnetic moment $M_z$ becomes saturated at low temperatures (compared with the magnitude of the field). This implies that the magnetic susceptibility $\chi(T)$ at constant temperature becomes zero,

$$\chi(T) = \mu_0\lim_{B\to\infty}\left(\frac{\partial M_z}{\partial B}\right)_T = 0.$$

If one takes into account the Fermi statistics, one obtains a very different behaviour, especially at low temperatures.

Pauli paramagnetism

For simplicity, we assume that we are dealing with spin-$\frac{1}{2}$ particles at low temperatures ($T \ll T_F$). This kind of situation occurs, for example, with the electron gas in metals. The energy of such a particle in an external field is

$$\varepsilon = \begin{cases} \dfrac{p^2}{2m} - \mu_BB & \text{when } \boldsymbol\mu \text{ is parallel to } \mathbf B \\[2mm] \dfrac{p^2}{2m} + \mu_BB & \text{when } \boldsymbol\mu \text{ is antiparallel to } \mathbf B, \end{cases}$$

where $\boldsymbol\mu$ is the intrinsic magnetic moment of the particle and $\mu_B = e\hbar/2m$ is the Bohr magneton. Thus, we see that the introduction of the magnetic field divides the spins into two populations. Now, since the studied Fermi gas is strongly degenerate ($T \ll T_F$), the occupations in the two populations are given by the Fermi distributions

$$n_p^{\pm} = \frac{1}{e^{\beta(\varepsilon_p \pm \mu_BB - \varepsilon_F)}+1},$$

where $\varepsilon_p = p^2/2m$ and $\varepsilon_F = \mu(T = 0)$.

All the states up to the Fermi energy $\varepsilon_F$ are filled, which implies that the wave numbers $k_F^{\pm}$ for the two populations are different and can be determined from the conditions

$$\frac{\hbar^2k_F^{+2}}{2m} + \mu_BB = \varepsilon_F, \qquad \frac{\hbar^2k_F^{-2}}{2m} - \mu_BB = \varepsilon_F.$$

[Figure: the energies $\varepsilon$ of the two spin populations as functions of $k$, with the Fermi energy $\varepsilon_F$ and the Fermi wave numbers $k_F^-$ and $k_F^+$ marked.]

Clearly, the number of states in the population with the parallel alignment is larger. In terms of the Fermi wave numbers, we obtain the population occupations

$$N_+ = \int_0^{k_F^+} 4\pi k^2D(k)\,dk = V\frac{k_F^{+3}}{6\pi^2}, \qquad N_- = \int_0^{k_F^-} 4\pi k^2D(k)\,dk = V\frac{k_F^{-3}}{6\pi^2}.$$

Thus, we obtain a net magnetic moment

$$M = \mu_B(N_- - N_+) = \frac{\mu_BV}{6\pi^2}\left(k_F^{-3} - k_F^{+3}\right) = \frac{\mu_BV}{6\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\left[(\varepsilon_F + \mu_BB)^{3/2} - (\varepsilon_F - \mu_BB)^{3/2}\right].$$

Realistic values for fields produced in laboratories are of the order $B \lesssim 10$ T, which corresponds to $\mu_BB/k_B \lesssim 7$ K. This is much smaller than the Fermi temperatures in metals, and thus we can assume that $\mu_BB \ll \varepsilon_F$. This way we have

$$M \approx \frac{V}{2\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\mu_B^2\,B\,\varepsilon_F^{1/2}.$$

We end up with the weak-field Pauli susceptibility (per unit volume)

$$\chi_0 = \lim_{B\to 0}\frac{\mu_0}{V}\frac{\partial M}{\partial B} = \frac{3}{2}\,\mu_0\,\rho\,\frac{\mu_B^2}{\varepsilon_F},$$

which holds for $T \ll T_F$ and $\mu_BB \ll \varepsilon_F$. Thus, we see that instead of being zero, the susceptibility is a constant determined by the density. This should be compared to the classical (high-temperature) Curie law, Eq. (66) (with $g = 2$ and $J = \frac{1}{2}$), which implies

$$\chi_\infty = \rho\,\frac{\mu_0\mu_B^2}{k_BT}.$$

At finite temperatures, one should apply the Sommerfeld expansions to resolve the susceptibility. We omit the calculation here, and just state that the lowest-order corrections to the low- and high-temperature susceptibilities are

$$\chi = \begin{cases} \chi_0\left[1 - \dfrac{\pi^2}{12}\left(\dfrac{k_BT}{\varepsilon_F}\right)^2\right] & \text{when } T \text{ is low} \\[3mm] \chi_\infty\left[1 - \dfrac{\rho\lambda_T^3}{2^{5/2}}\right] & \text{when } T \text{ is high,} \end{cases}$$

respectively. The high-temperature correction is proportional to $(T_F/T)^{3/2}$ and tends to zero as $T \to \infty$.

Landau diamagnetism

In addition to the (para)magnetism associated with the orientations of the spins, there exists an additional form of magnetism which arises from the quantization of the kinetic energy of charged particles associated with their motion perpendicular to the direction of the field. This is diamagnetic in character, meaning that it creates a field opposite to the external field $B$, implying a negative susceptibility $\chi(T)$. According to the Bohr-van Leeuwen theorem, the phenomenon of diamagnetism cannot be explained in terms of classical physics (exercise).

Consider an ideal Fermi gas formed by particles with charge $q$. In the presence of an external magnetic field, the single-charge Hamiltonian is written as

$$\hat H = \frac{1}{2m}\left(-i\hbar\nabla - q\mathbf A\right)^2,$$

where

$$\mathbf B = \nabla\times\mathbf A.$$

We assume that the magnetic field points in the z-direction, i.e. $\mathbf B = B\hat{\mathbf z}$, which allows us to choose the Landau gauge [45] for the vector potential, $\mathbf A = (0, B\hat x, 0)$.

[45] Recall that a given field can be created with different choices for the potentials. The behaviour of a particle in the field is independent of the choice of the potential. This is called gauge invariance.

Thus, we obtain the Hamiltonian

$$\hat H = \frac{\hat p_x^2}{2m} + \frac{1}{2m}\left(\hat p_y - qB\hat x\right)^2 + \frac{\hat p_z^2}{2m}.$$

As a result, a charged particle has a helical trajectory whose axis is parallel to the external field, and whose projection to the $(x, y)$-plane is a circle.

We see that the Hamiltonian does not depend on the operator $\hat y$, meaning that $\hat H$ and $\hat p_y$ can be diagonalized in the same basis. This means that in the y-direction the Hamiltonian describes a free particle, and altogether in the $(x, y)$-plane we have

$$\hat H_{xy} = \frac{\hat p_x^2}{2m} + \frac{1}{2}m\omega_c^2\left(\hat x - \frac{\hbar k_y}{m\omega_c}\right)^2,$$

i.e. a harmonic oscillator Hamiltonian with a displacement in the x-coordinate by $X = k_yl_0^2$, where $l_0 = \sqrt{\hbar/qB}$ is the magnetic length. We have also defined the cyclotron frequency $\omega_c = qB/m$. The eigenvalues of the harmonic oscillator Hamiltonian do not depend on translations of the coordinates. Thus, the eigenenergies are of the form

$$E_n = \left(n + \frac{1}{2}\right)\hbar\omega_c, \qquad n = 0, 1, 2, \ldots$$

These are the so-called Landau levels of a charged particle in a magnetic field.

By using the periodic boundary conditions, we obtain in the (free) y-direction

$$k_y = \frac{2\pi n_y}{L_y}, \qquad n_y = 0, \pm 1, \pm 2, \ldots$$

and in the harmonic coordinate x

$$0 \leq X < L_x.$$

These restrict the possible values for $n_y$:

$$\Rightarrow\quad N_s \equiv n_y^{\max} = \frac{L_xL_y}{2\pi l_0^2} = Z\,\frac{e}{h}\,L_xL_yB = Z\,\frac{\Phi}{\Phi_0},$$

where $q = Ze$, $\Phi_0 = h/e$ is the flux quantum, and $\Phi$ is the magnetic flux through the system. We thus see that each Landau level is degenerate, with exactly one state for each flux quantum and for each elementary charge.

The total energy of the ideal charged Fermi gas is

$$\varepsilon_n = \frac{p_z^2}{2m} + E_n,$$

where the kinetic energy associated with the linear motion along the field is assumed to be a continuous variable. Now, we can calculate the grand partition function

$$\ln Z_G = \sum_\varepsilon \ln\left(1 + ze^{-\beta\varepsilon}\right) = \frac{L_z}{h}\int_{-\infty}^{\infty}dp_z \times L_xL_y\,\frac{qB}{h}\sum_{n=0}^{\infty}\ln\left[1 + z\,e^{-\beta\left(\hbar\omega_c(n+1/2) + p_z^2/2m\right)}\right].$$

Similar to Eq. (65), we can calculate the magnetization as

$$M = \frac{1}{\beta}\left(\frac{\partial}{\partial B}\ln Z_G\right)_{z,V,T}.$$

After a somewhat lengthy calculation (which we skip here), we obtain the low-field ($\mu_{\rm eff}B \ll k_BT$) susceptibility

$$\chi(T) = -\mu_0\,\frac{(2\pi m)^{3/2}\mu_{\rm eff}^2}{3h^3\beta^{1/2}}\, f_{1/2}(z),$$

where $\mu_{\rm eff} = q\hbar/2m$ (this is the Bohr magneton if $q = e$). Notice that the susceptibility is diamagnetic in nature, irrespective of the sign of the charge of the particle.

In the classical limit, we have that $f_{1/2}(z) \approx z \approx \rho\lambda_T^3$. This implies

$$\chi_\infty = -\frac{\rho\mu_0\mu_{\rm eff}^2}{3k_BT}.$$

We thus see that at high temperatures the diamagnetism obeys the Curie law, but with the opposite sign. In the quantum limit ($T \ll T_F$ but still $\mu_{\rm eff}B \ll k_BT$), we obtain

$$\chi_0(T) \approx -\frac{\rho\mu_0\mu_{\rm eff}^2}{2\varepsilon_F}.$$

We have applied the first term of the asymptotic Sommerfeld expansion (92). Note that the asymptotic results $\chi_0$ and $\chi_\infty$ are precisely one third in magnitude of the paramagnetic results, given that the particles are electrons. This tells us that the diamagnetic contribution to the response of an ideal electron gas to a magnetic field is smaller than the paramagnetic contribution, so that the net response is paramagnetic.

8.2 Relativistic Electron Gas in White Dwarfs

The first application of Fermi statistics was related to the equilibrium studies of white dwarf stars. White dwarfs have burned most of their hydrogen in the nuclear fusion of hydrogen into helium. When this energy release becomes absent, the gravitational force is unbalanced and the star collapses into a white dwarf.

Consider then a typical idealization of a white dwarf, consisting of a ball of helium with a mass $M \approx 10^{30}$ kg and mass density $\rho_m \approx 10^{10}$ kg m$^{-3}$. The temperature within the white dwarf is of the order $T = 10^7$ K. This means that the thermal energy per helium atom is

$$\frac{3}{2}k_BT \approx 1\ \text{keV} \gg E_{\rm ion} \approx 10\ \text{eV},$$

where $E_{\rm ion}$ is the ionization energy of a helium atom. Thus, the helium atoms in white dwarfs are completely ionized.

Now, the mass of the helium consists mostly of that of the nucleus, and can be approximated as ($^4$He is more abundant)

$$M \approx N(m_e + 2m_p) \approx 2Nm_p,$$

where $N$ is the number of electrons, $m_e$ is the mass of the electron, and $m_p = 1.673\cdot 10^{-27}$ kg is the mass of the proton. This gives us the approximate electron density

$$\rho = \frac{N}{V} = \frac{\rho_m}{2m_p} \approx 10^{36}\ \text{electrons/m}^3.$$

This gives us the Fermi energy of the electron gas

$$\varepsilon_F \approx 10^6\ \text{eV} \approx mc^2 = 0.511\cdot 10^6\ \text{eV}.$$

Thus, the kinetic energies of the electrons are of the order of their relativistic rest energy, meaning that in white dwarfs we are dealing with a relativistic Fermi gas. Additionally, the Fermi energy gives $T_F \approx 10^{10}$ K, which means that the electrons are (almost) completely degenerate.

In the equilibrium of the white dwarf, the outward force due to the pressure $p$ of the relativistic degenerate electron gas balances the gravitational pull

$$\frac{dp}{dr} = -\rho_m\frac{GM(r)}{r^2},$$

where $G = 6.673\cdot 10^{-11}$ Nm$^2$/kg$^2$ is the gravitational constant, and the mass within the distance $r$ from the centre of the star is

$$M(r) = \frac{4\pi}{3}\rho_mr^3,$$

where we assume that the density is constant within the star. By simple integration, using $\rho_m = 2m_p\rho$, we obtain the pressure at the centre of the star

$$p = \frac{8\pi}{3}\,Gm_p^2\rho^2R^2,$$

where we have assumed that the pressure vanishes at the surface $r = R$.

On the other hand, the pressure can be calculated from the ground-state properties of the relativistic electron gas. First of all, recall that the relativistic energy of the electron can be written as

$$\varepsilon_p = \sqrt{(mc^2)^2 + (cp)^2}.$$

This can be rewritten in terms of the wave vector as

$$\varepsilon_k = c\hbar\sqrt{k^2 + k_c^2},$$

where $p = \hbar k$ and $k_c = mc/\hbar$ is the Compton wave vector. Now, in the relativistic limit $k_F \gg k_c$, and one can write the grand potential as

$$\Omega_G = -k_BT\sum_\varepsilon \ln\left(1 + ze^{-\beta\varepsilon}\right) \approx -k_BT\int_0^\infty D(\varepsilon)\ln\left(1 + ze^{-\beta\varepsilon}\right)d\varepsilon$$

$$= -k_BT\int_0^\infty 2\cdot 4\pi k^2D(k)\ln\left(1 + ze^{-\beta\varepsilon(k)}\right)dk = -\frac{V}{3\pi^2}\int_0^\infty \frac{1}{z^{-1}e^{\beta\varepsilon(k)}+1}\,\frac{d\varepsilon}{dk}\,k^3\,dk$$

$$\approx -\frac{V\hbar c}{12\pi^2}\left(k_F^4 - k_c^2k_F^2 + \cdots\right),$$

where the factor 2 accounts for the two spin states of the electron, $D(k) = V/8\pi^3$, and we have applied partial integration. The grand potential gives the pressure

$$p = -\left(\frac{\partial\Omega_G}{\partial V}\right)_{T,\mu} = \frac{\hbar c}{12\pi^2}\left(k_F^4 - k_c^2k_F^2 + \cdots\right).$$

This should obey the previous balance condition, resulting in

$$\frac{8\pi}{3}\,Gm_p^2\rho^2R^2 = \frac{\hbar c}{12\pi^2}\left(k_F^4 - k_c^2k_F^2 + \cdots\right).$$

By substituting $k_F^3 = 3\pi^2\rho$, we obtain

$$\left(\frac{M}{M_c}\right)^{2/3} = 1 - \left(\frac{R}{R_c}\right)^2\left(\frac{M_c}{M}\right)^{2/3},$$

where

$$M_c = m_p\left(\frac{9\pi}{512}\right)^{1/2}\left(\frac{\hbar c}{Gm_p^2}\right)^{3/2} \approx 0.52\cdot 10^{57}\,m_p,$$

$$R_c = \frac{\hbar}{mc}\left(\frac{9\pi}{8}\right)^{1/3}\left(\frac{M_c}{m_p}\right)^{1/3} \approx 4700\ \text{km}.$$

For the radius of the star we obtain

$$R = R_c\left(\frac{M}{M_c}\right)^{1/3}\left[1 - \left(\frac{M}{M_c}\right)^{2/3}\right]^{1/2}.$$

We see that the white dwarf has the maximum mass $M = M_c$, which it cannot exceed without collapsing to a neutron star or a black hole. This is called the Chandrasekhar limit, and even more detailed calculations give $M_c \approx 1.44M_\odot$, where $M_\odot$ is the mass of the Sun.

9. Entropy, Information and Arrow of Time

Before the discussion on the main topic of this chapter, a word of caution is in order. The discussion in this chapter will be subjective and, at some places, somewhat speculative. Therefore, the contents should be read with an open mind and with scientific doubt.

Entropy is perhaps the most influential, but at the same time controversial, concept that arises from statistical physics. So far, we have encountered three different ways to determine the entropy. It is hard to tell exactly what entropy is, but we should nevertheless try to develop an intuition for it [46]. In this chapter, we will review the definitions, and also look closer at some of the myriad of implications that result.

[46] The same statement holds also for energy and time. Both of them are hard to define exactly, but we have learned to be fluent in using them to describe phenomena around us. This chapter is intended to give guidelines to do the same in the case of entropy.

9.1 Thermodynamic Entropy: Irreversibility and Arrow of Time

One really cannot overvalue the second law of thermodynamics. Heat cannot spontaneously flow from cold regions to hot regions. It is impossible to build a perpetual motion machine. After P. W. Bridgman, there have been nearly as many formulations of the second law as there have been discussions of it. But, possibly the deepest philosophical message is two-fold. The second law introduces the concept of entropy, which gives us the way to characterize all processes in nature: in an isolated system, only such processes are allowed that increase the entropy [47]. As such, this is among the most fundamental statements we have about nature. But, it tells us more. It points us the arrow of time, and separates the future from the past.

[47] Strictly, processes that do not decrease the entropy.

Consider the previous example on the mixing of two ideal gases (see p. 31). As we open the valve, the constraints for the two gases are changed. In the statistical physics language we have learned to use, the number of accessible microstates increases and the systems are not in equilibrium. As the new equilibrium is achieved, one notices that the thing that characterizes this process is the second law, and the quantity that changes is the entropy [48].

[48] As a side note, in this example it is important to notice the inadequacy of the thermodynamical definition of entropy

$$\Delta S \geq \frac{Q}{T},$$

where $Q$ is the heat flow into the system. The circumstances we set up for the system were such that heat is not exchanged between the gas mixture and the surroundings, thus the change in entropy cannot result from such a process.

The increase in entropy calls for irreversibility. For example, it is impossible to restore the initial equilibrium of the two gases, both in separate compartments, simply by closing the valve. In general, the ever-growing entropy tells us that one direction of time is preferred over the other. It defines, at least in the thermodynamical sense, the future as the direction of time in which the entropy increases.

Loschmidt paradox

Let us dwell on this last statement. Consider a film first played forwards and then backwards. Normally, we can separate between the directions. A classic example is the falling of a drinking glass to the floor and its shattering into small fragments. The reverse (the re-organization of the fragments) seems irrational to us. Our minds are formed so that we can separate between the directions of time in irreversible processes. But, the laws of nature are time-reversible. The motions of atoms look reasonable whether the time is running backwards or forwards. It is then quite a mystery how the irreversibility emerges from the reversible behaviour of things. Johann Loschmidt, a contemporary of Boltzmann, was the first to see this contradiction between the second law and the reversible laws of nature, hence it was named the Loschmidt paradox.

One attempt to resolve this issue is to state that the whole Universe was in a low-entropy state right after the Big Bang. This way the familiar definitions of the past and the future come about in terms of the increasing entropy in the Universe. As a result, we would be witnessing a process where the Universe is moving towards equilibrium as time increases. In this equilibrium, in the words of Helmholtz (1854), all energy will become heat, all temperatures will become equal, and everything will be condemned to a state of eternal rest. This ultimate fate of the Universe has been named the heat death of the Universe.

9.2 Disorder

Another way to interpret the entropy is as a measure of the disorder in the system. Consider again the entropy of mixing. We obtained (see Eq. (71) and the following discussion) that for the mixture of two gases of indistinguishable particles, the entropy of mixing can be written as (we assume that in the beginning $V_A = V_B = V/2$ and $N_A = N_B = N/2$)

$$\Delta S_{\rm mix} = S_{\rm mixed} - S_{\rm unmixed} = 2k_B\ln\left[\frac{V^{N/2}}{(N/2)!}\right] - 2k_B\ln\left[\frac{(V/2)^{N/2}}{(N/2)!}\right] = Nk_B\ln 2.$$

We thus see that the entropy increases by $k_B\ln 2$ every time we insert a particle into either of the two compartments, provided that we do not look which compartment we choose. This is the simplest possible instance of the counting entropy

$$S_{\rm counting} = k_B\ln(\#\text{ of configurations}).$$
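The mixing entropy and the counting entropy can be compared numerically. The sketch below is an illustration added here, not part of the original notes; the function names are hypothetical and the particle number and volume are arbitrary sample values. It evaluates $\Delta S_{\rm mix}/k_B$ from the logarithms of $V^{N/2}/(N/2)!$, using the log-gamma function to avoid overflow in the factorial, and checks it against $\ln 2^N$.

```python
import math

def ln_states_one_gas(n_particles, volume):
    """ln[ V^n / n! ] for one gas of n indistinguishable particles in volume V."""
    return n_particles * math.log(volume) - math.lgamma(n_particles + 1)

def mixing_entropy_over_kB(n_total, volume):
    """Delta S_mix / k_B for two gases of N/2 particles each (n_total must be even)."""
    if n_total % 2 != 0:
        raise ValueError("n_total must be even")
    half = n_total // 2
    mixed = 2 * ln_states_one_gas(half, volume)          # each gas fills V
    unmixed = 2 * ln_states_one_gas(half, volume / 2.0)  # each gas fills V/2
    return mixed - unmixed

def counting_entropy_over_kB(n_configurations):
    """S_counting / k_B = ln(number of configurations)."""
    return math.log(n_configurations)

if __name__ == "__main__":
    n_total, volume = 1000, 1.0  # arbitrary sample values
    print(mixing_entropy_over_kB(n_total, volume))   # N ln 2
    print(counting_entropy_over_kB(2 ** n_total))    # ln(2^N), the same number
```

The factorials cancel in the difference, so the result does not depend on the chosen volume, only on $N$.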
In the case of mixing entropy, each particle has two possible compartments, so $N$ particles can be arranged into $2^N$ different configurations. Generally, if there are $m \geq 2$ possibilities for a particle, the added entropy is accordingly $k_B\ln m$. This idea will be exploited in the last section of this chapter, when we discuss information entropy.

The above discussion holds in cases where there exists a discrete number of choices for the new particle. Discrete choice arises naturally in quantum statistics, where the states are quantized. Generally, as we have already discussed, the equilibrium entropy measures the logarithm of the number of different states the system can be in. For example, the previous equilibrium statistical mechanics (Boltzmann) entropy

$$S = k_B\ln\Omega(E) \tag{93}$$

measures the number of microstates with macroscopic energy $E$, with phase-space volume $h^{3N}$ allocated to each state.

Maxwell's Demon

Is it possible to extract work from the mixing entropy? Consider a mixture of two ideal gases, A and B, and a clever creature, the Maxwell's demon, that works at the valve between two compartments of the container. Suppose that the demon is able to control the valve in such a way that it lets the A-molecules pass through the valve to the right side, and the B-molecules to the left. Additionally, she would prevent the opposite processes. As a result, each transfer of a molecule through the valve decreases the entropy by $k_B\ln 2$, apparently violating the second law. The thing that is overlooked is that the operation of the demon itself produces entropy, which then balances the entropy decrease as a whole. We will return to this in the following section.

Szilard's engine

But first, let us return to the initial goal of the demon, the extraction of work from entropy. Consider a simplified version of the Maxwell's demon, first presented in 1929 by the Hungarian L. Szilard. This was the first time that the significance of information in physics [49] was pointed out. Szilard's idea was to study a one-molecule ideal gas in volume $V$. Note that we can consider the single molecule as an ideal gas by forming an ensemble of one-molecule gases and calculating the averages of various quantities as if it was an ideal gas with a large number of molecules.

[49] We will go deeper into this connection in the following section.

[Figure: the cycle of Szilard's engine, steps (a) to (d). (Taken from K. Maruyama et al., Rev. Mod. Phys. 81, 1 (2009).)]

The process goes as follows.

(a) We insert a partition that divides the volume into two equal compartments.

(b) The demon measures the position of the molecule, and records the measurement (whether the molecule is on the right or on the left of the partition).

(c) The demon connects a load to the partition, working now as a piston, on the side where she has measured the molecule.

(d) By keeping the system at constant temperature, the demon can let the gas expand isothermally and quasistatically (no change in internal energy), resulting in the work

$$W = Q = \int_{V/2}^{V}p\,dV = k_BT\int_{V/2}^{V}V^{-1}\,dV = k_BT\ln 2.$$

Thus, we see that Szilard's demonic engine extracts heat $Q$ from the surroundings and converts it completely into mechanical work, apparently violating the second law. The explanation of this requires further insights into our concept of entropy.

9.3 Information Entropy

Szilard's realisation of Maxwell's demon leads us to our last and most profound interpretation of the entropy, as a measure of our ignorance about a system. As we have seen, in the entropy of mixing we add $k_B\ln 2$ to our ignorance of the system every time we add a particle into the container without looking where it went. Similarly, a cycle in Szilard's engine transfers an amount $k_BT\ln 2$ of heat from the surroundings into work, provided that one knows on which side of the partition the molecule is. Generally, this implies that the entropy is not a property of the system, but is related to our knowledge about the system. Additionally, the second law then tells us that in equilibrium our knowledge becomes minimal, reflected in the equal probabilities of finding the system in any one of its allowed states.

Shannon entropy

In 1948, C. Shannon laid the foundation of information theory by studying the information content of random variables. Interestingly, information and physics seem to have deep connections.

Consider a random variable $X$ with the possible outcomes $\{x_1, x_2, \ldots, x_n\}$ [50]. Intuitively, we assume that the information content $I(X = x_i)$ related to an outcome $x_i$ depends only on the probability of the event. By definition, we require that $I$ is non-negative and additive. That is, if we have another random variable $Y$, independent of $X$, the information content of measuring $x_i$ and $y_j$ simultaneously should be the sum of the individual information contents, i.e. $I(X \cap Y) = I(X) + I(Y)$. By taking these into account, one unavoidably ends up with the information content function

$$I(x_i) = -\log P(x_i),$$

where $P(x_i)$ is the probability of observing the outcome $x_i$. The base of the logarithm determines the units of information. In base 2 the units are bits, and in base $e$ the units are nats.

[50] We assume here that $X$ is a discrete random variable. Generalization to continuous variables is straightforward.

The average information content

$$S = \langle I(x_i)\rangle = -\sum_i P(x_i)\ln P(x_i)$$

is the cornerstone of the theory of information. Interestingly, the average information content is exactly of the form of the Gibbs entropy

$$S = -k_B\langle\ln p_i\rangle = -k_B\sum_i p_i\ln p_i, \tag{94}$$

[Figure credit: (Taken from S. Toyabe et al., Nat. Phys. 6, 988 (2010).)]
which was our generalization of the Boltzmann entropy! Thus, it was named as the Shannon entropy, according to its founder. Note, that in information theory the unit of entropy does not have the historical burden of having the units J/K. In 1951, L. Brillouin became the first to treat the entropy and information on the same footing. The Gibbs entropy can be seen as the maximum amount of information one can obtain from the system by making a measurement. To clarify this, consider an isolated system initially with entropy Si . Assume then that we acquire information about the state of the system by making a measurement. The gain in information is linked to the change in the entropy by I meas = Si − Sf > 0, where Sf is the physical entropy after the measurement. This means that the gain in information decreases physical entropy. In the Szilard’s engine, this occurs when we go from (a) to (b), and we see that the entropy in the system has turned into information inside the demon’s head. This implies the duality of entropy, i.e. its having both thermodynamic and information theoretic aspects. If further measurements are not made, we see that the second law implies unavoidable decrease in the information about the system. Thus indeed, entropy can be seen as the lack of knowledge about the system. In their experiment, Toyabe et al. studied system equivalent to particle stochastically jumping in a spiral-staircase due to the thermal fluctuations51 . In order to prevent downward jumps, the demon places a block behind the particle whenever she observes an upward jump. Thus, the particle climbs up the stairs without direct energy injection. Naturally, the energy for useful work is extracted from the heat bath setting up the temperature in the Szilard’s engine. As we have already discussed, this apparently violates the second law because it does a perfect conversion of heat Q into work W . 
Szilard’s original idea was that this energy is returned to the heat bath by the measurement device. However, Bennett showed in 1982 that the measurement can be done reversibly, i.e. without any change in entropy, at least in principle. But, there is still one thing that we have overlooked. At the stage where the work is done, the gas expands isothermally and the demon looses her information about the location of the molecule. This is a logically irreversible process. In general, logical reversibility means injective, i.e. one-toone, correspondence between logical and physical states. When we process information the reversibility guarantees that no information is lost. But, erasure of information is irreversible! The mapping is not one-to-one, instead it maps all physical states into the same logical state, which we refer to as the standard state. It corresponds to a blank sheet of paper. Landauer argued in his erasure principle that the logical irreversibility must involve dissipation. The minimum amount of dissipation equals the energy kB T log 2 Exorcized demon per bit. This being true, the demon’s efforts to violate What about our demon then? Well, as we have the second law have come to an end. In fact, just recently seen, the decrease in the entropy of Szilard’s en- A. Bérut et al. (2012) have demonstrated experimentally gine leads to an increase demon’s knowledge about that the Landauer limit of the information erasure can be the location of the molecule. Further on, this in- reached but not exceeded. 51 We will discuss more about fluctuations later when we consider formation can be then converted into useful work, as shown recently by S. Toyabe et al. (2010). non-equilibrium statistics. 50 10. Interacting Systems Quantum entropy What about the Universe then? Is it erasing information at the Landauer rate per bit towards the unavoidable heat death, or is it computing reversibly its following step? 
It is important to realise that non-zero entropies of isolated systems are related to our ignorance about the state of the system. The von Neumann definition, The discussion so far has been about systems consisting of non-interacting entities. For a real connection between the theory and experiments, one has to include the interparticle interactions operating in the system. The interactions are the reason why the equations in many-body physics become intractable. One possibility to include the interactions is in terms of the so-called cluster expansion. S = −kB Tr(ρ̂ ln ρ̂), (95) In low-density gases, the non-interacting system can serve is the fundamental version of the entropy, since it is quan- as an approximation. The physical quantities can then tum mechanical, as is the nature. One should notice, how- be written in terms of series expansions in terms of twoever, that the von Neumann entropy is zero for pure states particle potentials, which provide the corrections to the and non-zero for mixed states, but in both cases indepen- main terms arising from the non-interacting solution. dent in time for isolated systems (exercise). This means Another approach we are going to discuss is that of secthat if we can follow the microscopic evolution of a closed ond quantization. system, our ignorance of the state does not change. The ignorance arises only from our lack of knowledge on how 10.1 Cluster Expansions the system was build. Mayer et al. (1937 onwards) developed a systematic Consider then the whole Universe. By definition, it is isolated, and regardless of whether the Universe is in a method to carry out a cluster expansion in real gases obeypure or in a mixed state, its entropy has to be a constant! ing classical statistics. The generalization to interacting This makes the heat death of the Universe seem consider- quantum gases was initiated by Kahn and Uhlenbeck, and ably exaggerated. However, it is impossible to determine completed by Lee and Yang. 
The interplay between quanthe state of the whole Universe at microscopic level, let tum statistical effects and the effects due to interactions alone calculate its future evolution from the Schrödinger leads to involved mathematics and will be omitted here. equation. Therefore, the constancy of its entropy should Consider a gas obeying the Hamiltonian be out of reach of observations. N X X p2i Consider then us making observations, which can only + uij , H= 2m i<j cover a tiny fraction of the whole. Moreover, since the small i=1 region of which we have knowledge, or want to study, is unavoidably interacting with a part that is unknown to us, where uij = u(rij ) is the interaction potential between parwe cannot accurately call it isolated. Thus, if we measure ticles i and j, and rij = |ri −rj | is the distance between the an increase in the thermodynamical entropy (in accordance particles. One example of such interaction is the Lennardwith the second law) it unavoidably has to be consequence Jones potential on p. 13. of a decrease in entropy of the environment. This is the In the classical limit, the canonical partition function is only way to keep the total entropy of the Universe constant. Z 1 dω −βH Finally, when we derived the H-theorem in the exercises Z(T, V, N ) = e 3N N ! h we introduced the transition rates between different states. Z Z P P p2 These would not be present in an isolated system, and 1 i −β i<j uij 3N −β N 3N i=1 2m = d pe d re can be seen as resulting from the (weak) coupling to the N !h3N environment. In addition, the equations are coarse-grained, 1 = QN (T, V ), i.e. we have wiped out the quantum coherence between the N !λ3N T states. 
As a result, we see that the apparent irreversibility that we observe in nature arises from our lack of exact where we have carried out the momentum integration and knowledge on the coupling between the system and the defined the configuration integral rest of the Universe, and on the coherences between the Z P quantum states, thus resolving Loschmidt’s paradox. QN (T, V ) = Q(T, V, N ) = d3N re−β i<j uij Z Y References = d3N r e−βuij , i<j This chapter is partly based on the following articles: where the product is taken over all pairs of particles (ij) (there are N (N − 1)/2 such pairs). For ideal gas, uij = 0 and VN Z(T, V, N ) = N !λ3N T • K. Maruyama et al., Rev. Mod. Phys 81, 1 (2009). • S. Toyabe et al., Nat. Phys. 6, 988 (2010). • A. Bérut et al., Nature 483, 187 (2012). • S. Popescu et al., Nat. Phys. 2, 754 (2006). which is the same as Eq. (57) obtained previously. 51 Cluster expansion The grand potential allows us to calculate thermodynamic quantities, such as the pressure We would like to calculate the configuration integral QN . ∞ For this, we expand it in terms of Mayer functions X ∂ΩG bl (T )eβlµ p = − = k T B −βuij ∂V T,µ λ3l − 1. fij = f (rij ) = e T l=1 and the average density of particles ∞ X 1 ∂ΩG lbl (T )eβlµ ρ=− . = V ∂µ T,V λ3l T u ij 2 u ij l=1 1 By recalling the virial expansion (39) f ij pV = kB T [ρ + B2 (T )ρ2 + B3 (T )ρ3 + · · · ], r ij (96) we can show that the virial coefficients Bk are now functions of the cluster integrals, e.g. (exercise) -1 Z We 1 3 −βu(r) d B (T ) = −b (T ) = r 1 − e 2 2 see that the Mayer functions fij are bounded everywhere 2 and have the same range as the potential uij (we have 2 B3 (T ) = 4b2 − 2b3 used the Lennard-Jones potential). For the evaluation, we B4 (T ) = −20b32 + 18b2 b3 − 3b4 make the expansion Z .. Y . QN = d3 r1 · · · d3 rN (1 + fij ). 
i<j Ursell showed that the grand partition function can be written in terms of cumulant expansion as van der Waals equation For very dilute gases, two-body clusters give the dominant contribution for the interactions. The interactions between atoms in real gases can be well approximated by the Lennard-Jones potential, as we have discussed. Accordingly (see p. 13), we assume that ∞ X 1 eβN µ QN 3N N !λ T N =0 X Z ∞ 1 βlµ 3 3 = exp e d r1 · · · d rl Ul (r1 , . . . , rl ) , l!λ3l T l=1 ZG = • the interaction is hard core, i.e. strongly repulsive when the distance between atoms r . rm . where Ul (r1 , . . . , rl ) are called cluster functions or Ursell functions. The cluster functions can be obtained by expanding the cumulant expansion in powers of λ−3 T exp(βµ), leading into • on the average, the interaction is attractive when r & rm , but weak compared with the temperature, i.e. βu(r) 1. U1 (r1 ) = 1 Thus, U2 (r1 , r2 ) = f12 U3 (r1 , r2 , r3 ) = f12 f13 + f12 f23 + f13 f23 + f12 f13 f23 .. . e−βu(r) = 0 1 − βu(r) when r . rm when r & rm The second virial coefficient becomes Z ∞ B2 = 2π drr2 1 − e−βu(r) We thus see that Ul depends on all connected clusters of l particles. 0 Virial expansion ≈b− We define the cluster integral as Z 1 bl = d3 r1 · · · d3 rl Ul (r1 , . . . , rl ). l!V where b= Thus, the grand potential becomes explicitly proportional to V : ΩG (T, V, µ) = −kB T ln ZG (T, V, µ) = −V kB T λ3l T 2π 3 r 3 mZ ∞ a = −2π drr2 u(r) > 0. rm ∞ X bl (T )eβlµ l=1 a , T . In the above approximation, we have ended up with the van der Waals equation. 52 10.2 Second Quantization where Ĥ1 is the non-interacting single-particle Hamiltonian, and Û is the potential operator for the two-particle Another way to introduce the interactions to the manybody systems is by a formalism known as the second quan- interactions. By applying54the second quantization, we can tization. 
In short, the distinction between the first and write the Hamiltonian as X 1X the second quantization is equivalent to that between the Ĥ = hi|Ĥ0 |jiâ†i âj + hij|Û |kliâ†i â†j âl âk . Hermite polynomials, and the creation and annihilation 2 ij ijkl operators together with number (Fock) states in the single harmonic oscillator problem. The first quantization can be In the above, we have defined the matrix elements seen as the quantization of the motion of the elementary Z particles. The second quantization quantizes also the elec- hi|Ĥ |ji = d3 rψ ∗ (r)Ĥ (r)ψ (r) 0 0 j i tromagnetic field, and even the particles. It is the starting Z Z Z point of the microscopic (BCS-)theory of superconductivhij|Û |kli = d3 r d3 r0 ψi∗ (r)ψj∗ (r0 )Û (r0 − r)ψl (r0 )ψk (r), ity52 , the quantum field theories and the quantum electrodynamics, the theory of interaction between matter and where ψi are the single-particle wave-functions defined in radiation. Eq. (78). We have already introduced most of the essential conFor convenience, one often defines the so-called field opcepts. In a many-body system, the wave functions erators are (anti-)symmetric superpositions of products of singleX particle wave functions (see Eqs. (80-81)). The first step ψ̂(r) = ψi (r)âi , in second quantization is to change from the wave function i X representation into the abstract number basis with the defψ̂ † (r) = ψi∗ (r)â†i . inition (82). This makes the calculations more convenient i since the number basis does not carry along the unphysical information on the particle locations, which are not sensi- This way the Hamiltonian operator in the second quantible quantities since the particles are indistinguishable. zation formalism can be written as Z Next, the vacuum state is defined as Ĥ = d3 rψ̂ † (r)Ĥ0 (r)ψ̂(r) |0, 0, . . .i ≡ |0i. Z Z 1 This is a unique state that contains no particles. With d3 rd3 r0 ψ̂ † (r)ψ̂ † (r0 )Û (r0 − r)ψ̂(r0 )ψ̂(r). 
+ 2 the application of the creation operators, one can write an arbitrary number state in terms of the vacuum state as We notice that this resembles an expectation value of the Hamiltonian operator, but instead of wave functions we 1 † n1 † n2 â â2 · · · |0i. |n1 , n2 , . . .i = √ take calculate it with respect to the field operators. Hence n1 !n2 ! · · · 1 the name second quantization. We have also defined before the number operator of the ith The field operators obey the (anti-)commutation relasingle-particle state as tions for all r and r0 n̂i = â†i âi . [ψ̂(r), ψ̂(r0 )]± = [ψ̂ † (r), ψ̂ † (r0 )]± = 0 The number states are eigenstates of the number operators [ψ̂(r), ψ̂ † (r0 )]± = δ(r − r0 ), n̂i |n1 , . . . , ni , . . .i = ni |n1 , . . . , ni , . . .i, where the eigenvalue ni is the number of particles in the where minus (plus) sign stands for (anti-)commutator for corresponding single-particle state. Recall, that for bosons bosons (fermions). A straightforward calculation gives the ni = 0, 1, 2, . . ., and for the fermions ni = 0, 1. The opera- number operator in terms of the field operators as Z tor for the total number of particles in the system is given by N̂ = d3 rψ̂ † (r)ψ̂(r). X N̂ = n̂i . i One can also show that The second quantization equals to writing every operator in terms of the creation and annihilation operators53 . Especially, we assume that the interacting many-body Hamiltonian operator is Ĥ = N X i=1 Ĥ0 (ri ) + X Û (ri − rj ) [Ĥ, N̂ ] = 0. In general, an arbitrary many-body operator  = N X Â(ri ), i=1 i<j 52 Further 54 We omit the lengthy derivation here. An interested reader can find it from Fetter and Walecka: Quantum Theory of Many-Particle Systems. information, see Superconductivity course. that creation and annihilation operators obey the (anti)commutation relations for bosons (fermions). 53 Recall, 53 can be written in second quantization as X  = hi|Â|jiâ†i âj is the momentum transfer during the collision. 
In the case of half-integer spin particles, one has to label the annihilation and creation operators also with the zcomponent of the spin. Finally, the Hamiltonian can be written in the second quantization as ij Z = d3 rψ̂ † (r)Â(r)ψ̂(r). Ĥ = X εk â†kσ âkσ + kσ Plane wave expansion 1 XXXX U (k2 − k4 ) (97) 2V k1 σ k2 λ k3 k4 δk1 +k2 ,k3 +k4 â†k1 σ â†k2 λ âk4 λ âk3 σ . × If we assume that there is no external potential, the nonTo simplify the notations, we have assumed that the pointeracting part of the Hamiltonian is of the form tential does not change the spins of the particles. In such ~2 2 cases, one should include also the conservation of spin into Ĥ0 = − ∇ . 2m the calculation of the matrix elements of the interaction. In the case of weak interactions, a good choice for the single-particle states are plane waves 1 ψk (r) = √ eik·r . V The Fourier transform of the potential defines the socalled scattering amplitude 55 This gives the kinetic energy matrix elements (now i → k0 and so on) Z 0 1 ~2 k 2 ik·r ~2 0 2 hk |∇ |ki = d3 re−ik ·r e = εk δkk0 , − 2m V 2m a(k) = 2 2 a λT and ρa3 1. In this limit, the particles are slow, k ≈ 0 and the scattering length is Z mU0 where U0 = d3 rU (r). a= 4π~2 Correspondingly, the matrix elements of the potential energy operator are Z Z 1 hk1 k2 |Û |k3 k4 i = 2 d3 rd3 r0 e−i(k1 −k3 )·r V For bosons, we can omit the spin-summations in Eq. (97) and we obtain in the low-temperature limit (recall k = k2 − k4 = −(k3 − k1 ) = 0) X 2πa~2 X † † Ĥ = εk â†k âk + ak ak ak ak mV k k X † † † † + ak ak0 ak ak0 + ak0 ak ak0 ak . i(k4 −k2 )·r0 × U (r − r)e Z 1 d3 re−ik·r U (r) = V 1 ≡ U (k), V where Z U (k) = m U (k), 4π~2 which defines the length scale of the scattering process. Now, we are interested in the low-temperature behaviour, meaning that we assume k where εk = ~2m . 
Recall, that we have applied the periodic boundary conditions in the normalization of the plane wave, which implies that the wave vectors k are quantized, and thus we can apply the Kronecker delta function. 0 Low-temperature behaviour of interacting Bosegas k6=k0 d3 rU (r)e−ik·r By employing the properties of the annihilation and creation operators, one can write the Hamiltonian into form (exercise) X X 2πa~2 2N̂ 2 − N̂ − n̂2k . Ĥ = εk n̂k + mV is the Fourier transformation whose inverse is defined as 1 X U (r) = U (k)eik·r . V k k k Note, that the matrix elements give the probability amplitudes for the collision between single particles in momenThe energy eigenvalues are thus of form tum states k3 and k4 . In the collision, total momentum X X has to be conserved, i.e. the momentum in the scattered 2πa~2 2 E{n } ' n ε + 2N − n2k , k k k states has to be equal to that of the colliding states mV k k1 + k2 = k3 + k4 . k where the dependence of the eigenvalue on the distribution between different wave vectors has been explicitly implicated. This has been applied in the last equality. In the above, k = k2 − k4 = −(k1 − k3 ) 55 Compare 54 with Quantum Mechanics II. For the ground state, the distribution set becomes N for k = 0 nk = 0 for k 6= 0. This means that the interactions lead to non-zero chemical potential giving a correction to the fugacity zc ' 1 + 4g3/2 (1) This gives the ground state energy E0 = 2πa~2 N 2 , mV pressure p0 = − ∂E0 ∂V = N 2πa~2 ρ2 , m and chemical potential ∂E0 4πa~2 ρ µ0 = = . ∂N V m We see that the ground state properties of the interacting Bose-gas at low temperatures are given by the scattering length and density. Generally, the thermodynamics can be calculated from the partition function X Z(T, V, N ) = exp(−βE{nk }) {nk } = X exp − β {nk } X k n2 2πa~2 N 2 2 − 02 εk nk + mV N . The lowest order corrections to the ground state results, that arise from the low but finite temperatures, can be obtained by recalling the Eq. 
(90) for the ideal gas: g3/2 (1) 1/3 λ3 n0 = 1 − 3c where λc = N λT ρ =1− v λ3T 1 vc = . where v = vc ρ g3/2 (1) By using this as an estimate for the relative population in the ground state, one obtains the Helmholtz free energy A(T, V, N ) = −kB T ln Z = Aig (T, V, N ) + 2πa~2 N m 1 2 v + − 2 , v vc vc where Aig is the free energy of the ideal Bose gas. The thermodynamics is obtained from the free energy ∂(A/N ) 2πa~2 1 1 ∂A =− = pid + + 2 p=− ∂V T,N ∂v m v2 vc T ∂A 4πa~2 1 1 µ= = µig + + . ∂N T,V m v vc These should be compared with the ground state results when vc → ∞. The condensation starts when v = vc and λT = λc , leading to 4πa~2 g3/2 (1)2 mλ6c a µc = 4g3/2 (1)kB Tc . λc pc = pig + 55 a . λc 11. Phase Transitions The phenomena to which statistical physics has been applied can be divide roughly to two categories. So far, we have studied the first category, in which the constituting systems can be considered non-interacting. It includes specific heats of gases and solids, chemical reactions, Bose-Einstein condensation, blackbody radiation, paramagnetism, elementary electron theory of metals, to name a few. The most significant feature of this category is that the thermodynamic functions of the systems involved are smooth and continuous56 . The second category can be characterised with discontinuous or singular thermodynamic functions, corresponding to phase transitions. It includes the condensation of gases, the melting of solids, coexistence of phases, the behaviour of mixtures and solutions, ferromagnetism, antiferromagnetism, the order-disorder transitions, superfluid transition from He-I to He-II, the normal to superconducting transition, and so on. These phenomena are characterized by the fact that the interactions between the constituents play an important role. The co-operation leads to macroscopic significance at a particular critical temperature Tc . We try here to capture the aspects of these cooperative processes in terms of simplified models. 
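The contrast between smooth and singular thermodynamic functions can be made concrete with a small numerical sketch. It borrows the mean-field self-consistency condition $\langle\sigma\rangle = \tanh(\alpha\langle\sigma\rangle)$ with $\alpha = \nu J/2k_B T = T_c/T$, which is derived for the Ising model in Section 11.3, and solves it by simple fixed-point iteration. The function name, the tolerance, and the choice of starting value are illustrative assumptions and are not part of the lecture notes.

```python
import math


def mean_field_order_parameter(t_reduced, tol=1e-12, max_iter=100_000):
    """Solve m = tanh(m / t_reduced) by fixed-point iteration.

    t_reduced is T / T_c, so that alpha = nu*J / (2*k_B*T) = 1 / t_reduced.
    Starting from m = 1 selects the non-negative solution.
    """
    if t_reduced <= 0:
        raise ValueError("t_reduced must be positive")
    m = 1.0
    for _ in range(max_iter):
        m_next = math.tanh(m / t_reduced)
        if abs(m_next - m) < tol:
            return m_next
        m = m_next
    # Not converged within max_iter; only expected very close to T_c,
    # where the iteration slows down critically.
    return m


if __name__ == "__main__":
    for t in (0.5, 0.9, 0.99, 1.1, 1.5):
        m = mean_field_order_parameter(t)
        print(f"T/Tc = {t:4.2f}   <sigma> = {m:.6f}")
```

In this sketch the order parameter is numerically zero for $T > T_c$ and grows continuously from zero below $T_c$, so that it has a kink at $T = T_c$. This is the kind of non-analytic behaviour that separates the second category from the first.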
The regions with zero slope correspond to coexistence of two or more phases with different densities, implying the onset of phase transition. The positive slopes in the (p, v)-isotherms are non-physical, and result always from the approximations made in the derivation of the equation of state58 . Such regions should be corrected with the Maxwell construction of equal areas. • The exact partition function with finite N corrects the singularities resulting from such corrections made for the approximate isotherms at the limit N → ∞. 11.2 Condensation of a van der Waals Gas Let us start with the simplest model for an interacting gas, the van der Waals gas, obeying the equation of state p= a RT − , v − b v2 where v is the molar volume of the gas, and thus a and b pertain to one mole of the gas. At the critical temperature, both (∂p/∂v)T and (∂ 2 p/∂v 2 )T vanish. This leads to the coordinates of the critical point pc = 8a a , vc = 3b , Tc = , 2 27b 27bR 11.1 General Remarks on Condensation implying RTc 8 We study a system with N particles, and assume that K= = . pc vc 3 the potential can be written as a sum of two-particle terms This allows us to write the equation of state in a universal u(r) = ∞ for r ≤ σ form by using reduced variables ∗ 0 > u(r) > −ε for σ < r < r ∗ u(r) = 0 for r ≥ r . p v T pr = , vr = , Tr = . pc vc Tc Thus, each particle can be seen as a hard sphere with the diameter σ, surrounded with attractive potential of range This implies that the reduced equation of state becomes r∗ and depth −ε. Note, that the Lennard-Jones potential meets these requirements up to a very good approximation. 3 (3vr − 1) = 8Tr . p + r We expect that the exact interparticle potentials lead to vr2 phenomena that are, at least qualitatively, similar to those obtained from our simplified form. 
p There are some generally accepted properties that the exact partition function Z(T, V, N ) has to display the following: critical point • In the thermodynamic limit, f (v, T ) = N −1 ln Z = −A/N kB T tends to depend only on v = V /N and T . Thus, it is an intensive variable, and the thermodynamic pressure is then given by ∂f ∂A = kB T , p(v, T ) = − ∂V T,N ∂v T turning out to be a strictly non-negative quantity. 56 With pc T=Tc T<Tc p(T) coexistence curve vl vc • Function f is concave, implying57 ∂p ≤ 0. ∂v T 57 This T>Tc vg Below the critical temperature, the van der Waals isotherms display regions with positive slope, i.e. 58 This and the following sentence will be clarified in the next section where we study these properties in terms of van der Waals gas. a sole exception of BEC. is relevant for the mechanical stability. 56 v (∂p/∂v)T > 0. As stated, these regions should be corrected due to their non-physical nature. We replace them with horisontal lines. In the phase diagram (((T, p)-plane), these indicate the locations of phase transitions. The replacements are made in terms of Maxwell’s construction of equal areas. The requirement for equal areas follows simply from the requirements dp = 0 and dT = 0 for the corrected isotherm, that imply the change in the (molar) Gibbs energy dg = −sdT + vdp = 0. Where the actual value of constant is unimportant, and can be used as the zero of energy. Interactions with Jij > 0 favour the parallel alignment, and can display ferromagnetism. If Jij < 0, the situation is reversed and we see the possibility of antiferromagnetism. The most important interactions are assumed to occur between the nearest neighbours. As a result, the Hamiltonian of the N spin system can be written as X X H=− Jij si · sj − µB · si i hiji The same result should be obtained by integrating the theoretical isotherm, i.e. the quantity dg = vdp, from the state where hiji indicates that the summation goes over all nearvl to state vg . est neighbour pairs. 
The locus of points $v_l(T)$ and $v_g(T)$ is called the coexistence curve. Within the region enclosed by this curve the gaseous and liquid phases coexist.

The critical behaviour of the van der Waals system differs in several respects from that of real systems. Nevertheless, it is still used as the first approximation against which more sophisticated theories are compared.

11.3 Ising Model of Phase Transitions

Next, we present a model that provides a unified theoretical basis for understanding (anti-)ferromagnetism, gas-liquid and liquid-solid transitions, order-disorder transitions in alloys, and so on. We will discuss the model in terms of ferromagnetism, but the formulation can be translated into the language of the other physical phenomena.

We assume that the atoms are bound to lattice sites, implying that the spin degrees of freedom are independent of the other degrees of freedom,
\[ H \approx H_{\rm spin} + H_{\rm other}. \]
This implies that we can factorize the partition function,
\[ Z = Z_{\rm spin} Z_{\rm other}. \]
Because of the exchange interaction⁵⁹, there is an energy difference between the state of parallel spins and the one of antiparallel spins, given by
\[ \varepsilon_{\uparrow\uparrow} - \varepsilon_{\uparrow\downarrow} = -2J_{ij}. \]
Because
\[ \mathbf{s}_i\cdot\mathbf{s}_j = \frac{1}{2}\left[(\mathbf{s}_i+\mathbf{s}_j)^2 - \mathbf{s}_i^2 - \mathbf{s}_j^2\right] = \frac{1}{2}S(S+1) - s(s+1) = \begin{cases} \frac{1}{4} & \text{if } S = 1 \\ -\frac{3}{4} & \text{if } S = 0, \end{cases} \]
we can write the interaction energy as
\[ \varepsilon_{ij} = \text{const.} - 2J_{ij}(\mathbf{s}_i\cdot\mathbf{s}_j). \]

⁵⁹ The exchange interaction arises as a manifestation of the interplay between the Coulomb repulsion and the Pauli principle between two electrons. The interested reader can refer to the course of Condensed Matter Physics.

In the resulting spin Hamiltonian, $B$ denotes the external magnetic field. In the following, we simplify the model even further by assuming that the strength of the interaction does not vary between the lattice sites, i.e. $J_{ij} = J$. Also, we restrict the spin quantum number to the case $s_i = \frac{1}{2}$, and take into account only its z-component. This allows us to treat the system classically, since all variables in the Hamiltonian commute. We end up with the Ising model
\[ H = -J\sum_{\langle ij\rangle}\sigma_i\sigma_j - \mu B\sum_i \sigma_i, \qquad (98) \]
where $\sigma_i = +1$ for an "up" spin and $-1$ for a "down" spin. The Ising model can be solved exactly in one and two dimensions. Systems analogous to the Ising model are, e.g.,

• a binary mixture composed of two species of atoms, A and B, where each lattice point is occupied by either type of atom.

• a lattice gas, where each lattice point is either occupied by an atom or empty.

Mean field approach to the Ising model

Before we solve the Ising model exactly, let us introduce an approximate way to handle the pairwise interactions. We make the mean field approximation, which approximates the interaction energy between a lattice site and its nearest neighbours by assuming that the neighbours all have spin $\langle\sigma\rangle$, the average spin per site. This leads to the mean field Hamiltonian
\[ H = -\sum_{i=1}^{N}\frac{1}{2}\nu J\langle\sigma\rangle\sigma_i - \mu B\sum_{i=1}^{N}\sigma_i = -\sum_{i=1}^{N}E(J,B)\,\sigma_i, \]
where $E(J,B) = \frac{1}{2}\nu J\langle\sigma\rangle + \mu B$, and $\nu$ is the number of nearest neighbours. The prefactor $\frac{1}{2}$ guarantees that every pair is counted only once. The expectation value $\langle\sigma\rangle$ has to be determined in a self-consistent manner. The partition function of a single spin is
\[ Z = \sum_{\sigma_i=\pm 1} e^{\beta E\sigma_i} = 2\cosh(\beta E). \]
Thus, the probability that the site $i$ has the spin $\sigma_i$ is
\[ P(\sigma_i) = \frac{e^{\beta E\sigma_i}}{2\cosh(\beta E)}. \]
Again, the average magnetization of the lattice is
\[ M = N\mu\langle\sigma\rangle, \quad\text{where}\quad \langle\sigma\rangle = \frac{\sum_{\sigma_i=\pm 1}\sigma_i e^{\beta E\sigma_i}}{\sum_{\sigma_i=\pm 1}e^{\beta E\sigma_i}} = \tanh(\beta E). \]
If $B = 0$ we obtain the magnetization from
\[ \langle\sigma\rangle = \tanh\left(\frac{1}{2}\beta\nu J\langle\sigma\rangle\right) = \tanh\left(\frac{\nu J\langle\sigma\rangle}{2k_BT}\right). \]
We see that $\langle\sigma\rangle$ is the order parameter for the spin lattice. When the temperature is high, the magnetization is zero since the spins are randomly oriented. At lower temperatures, the spins align spontaneously, resulting in a non-zero magnetization. We can determine the critical temperature $T_c$ below which the lattice becomes more and more ordered. This is done graphically by plotting $\langle\sigma\rangle$ and $\tanh(\alpha\langle\sigma\rangle)$ into the same graph as functions of $\langle\sigma\rangle$. Here $\alpha = \nu J/2k_BT$.

[Figure: the line $\langle\sigma\rangle$ and the curves $\tanh(\alpha\langle\sigma\rangle)$ for $\alpha > 1$ and $\alpha < 1$, plotted against $\langle\sigma\rangle$; for $\alpha > 1$ the curves intersect at $0$ and $\pm\sigma_0$.]

The solutions are found at the intersections of the curves. We find that when $\alpha < 1$, there is only one intersection, at $\langle\sigma\rangle = 0$. Similarly, when $\alpha > 1$ there exist three intersections, at $\langle\sigma\rangle = 0$ and $\pm\sigma_0$. One can show that the free energy per site is
\[ \frac{G(J,0)}{N} = \begin{cases} -k_BT\ln 2 & \text{if } \langle\sigma\rangle = 0 \\ -k_BT\ln\left[2\cosh\left(\frac{1}{2}\beta\nu J\sigma_0\right)\right] & \text{if } \langle\sigma\rangle = \pm\sigma_0. \end{cases} \]
Thus, $\langle\sigma\rangle = \pm\sigma_0$ are the possible equilibrium states when $\alpha > 1$, since they minimize the free energy. There is a phase transition (critical point) at $\alpha = 1$, implying the critical temperature
\[ T_c = \frac{\nu J}{2k_B}. \]
We have thus shown that in the mean field approximation there is a phase transition at a finite temperature for a d-dimensional lattice. In the following, we will show that the mean field theory leads to an incorrect result in the case of a one-dimensional lattice.

Exact calculation of one-dimensional Ising model

In practice, the only exactly solvable models are the one- and two-dimensional Ising models. The one-dimensional Ising model was first solved by Ising himself (1925). We assume periodic boundaries of the spin chain, i.e.
\[ \sigma_{N+1} = \sigma_1. \]
This eliminates the undesired effects arising from the termination of the chain, but does not affect the thermodynamics of the (infinitely long) chain. The Hamiltonian of the system can be written as
\[ H = -J\sum_{i=1}^{N}\sigma_i\sigma_{i+1} - \mu B\sum_{i=1}^{N}\sigma_i, \]
where $\sigma_i = \pm 1$. Now, the partition function can be written as
\[ Z = \sum_{\sigma_1}\sum_{\sigma_2}\cdots\sum_{\sigma_N} e^{\beta J\sum_{i=1}^{N}\sigma_i\sigma_{i+1} + \beta\mu B\sum_{i=1}^{N}\sigma_i} = \sum_{\sigma_1}\sum_{\sigma_2}\cdots\sum_{\sigma_N}\prod_{i=1}^{N} e^{\beta J\sigma_i\sigma_{i+1} + \frac{1}{2}\beta\mu B(\sigma_i+\sigma_{i+1})}. \]
We define the $2\times 2$ transition matrix
\[ T_{\sigma\sigma'} = e^{\beta J\sigma\sigma' + \frac{1}{2}\beta\mu B(\sigma+\sigma')}, \]
where $\sigma,\sigma' = \pm 1$. The partition function is then
\[ Z = \sum_{\sigma_1}\sum_{\sigma_2}\cdots\sum_{\sigma_N} T_{\sigma_1\sigma_2}T_{\sigma_2\sigma_3}\cdots T_{\sigma_N\sigma_1} = \sum_{\sigma_1}\left(T^N\right)_{\sigma_1\sigma_1} = \mathrm{Tr}\,T^N. \]
The transition matrix
\[ T = \begin{pmatrix} e^{\beta(J+\mu B)} & e^{-\beta J} \\ e^{-\beta J} & e^{\beta(J-\mu B)} \end{pmatrix} \]
is clearly symmetric. Its eigenvalues are given by
\[ \lambda_\pm = e^{\beta J}\left[\cosh(\beta\mu B) \pm \sqrt{\sinh^2(\beta\mu B) + e^{-4\beta J}}\right]. \]
The trace is independent of the basis, and thus
\[ Z = \lambda_+^N + \lambda_-^N. \]
Also,
\[ \ln Z = N\ln\lambda_+ + \ln\left[1 + \left(\frac{\lambda_-}{\lambda_+}\right)^N\right]. \]
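The transfer matrix result $Z = \lambda_+^N + \lambda_-^N$ for the periodic chain can be checked against a direct sum over all $2^N$ spin configurations. The following is a minimal sketch, not part of the original notes; the function names and the dimensionless parameters `beta_j` $= \beta J$ and `beta_mu_b` $= \beta\mu B$ are choices made here for illustration.

```python
import math
from itertools import product


def partition_function_brute_force(n_spins, beta_j, beta_mu_b):
    """Sum exp(-beta*H) over all 2^N configurations of a periodic Ising chain."""
    total = 0.0
    for spins in product((1, -1), repeat=n_spins):
        exponent = 0.0
        for i in range(n_spins):
            # Nearest-neighbour bond with periodic boundary sigma_{N+1} = sigma_1.
            exponent += beta_j * spins[i] * spins[(i + 1) % n_spins]
            # Coupling of each spin to the external field.
            exponent += beta_mu_b * spins[i]
        total += math.exp(exponent)
    return total


def transfer_matrix_eigenvalues(beta_j, beta_mu_b):
    """Closed-form eigenvalues lambda_+ and lambda_- of the 2x2 transition matrix."""
    root = math.sqrt(math.sinh(beta_mu_b) ** 2 + math.exp(-4.0 * beta_j))
    lam_plus = math.exp(beta_j) * (math.cosh(beta_mu_b) + root)
    lam_minus = math.exp(beta_j) * (math.cosh(beta_mu_b) - root)
    return lam_plus, lam_minus


def partition_function_transfer_matrix(n_spins, beta_j, beta_mu_b):
    """Z = Tr T^N = lambda_+^N + lambda_-^N."""
    lam_plus, lam_minus = transfer_matrix_eigenvalues(beta_j, beta_mu_b)
    return lam_plus ** n_spins + lam_minus ** n_spins


if __name__ == "__main__":
    z_direct = partition_function_brute_force(8, 0.7, 0.3)
    z_matrix = partition_function_transfer_matrix(8, 0.7, 0.3)
    print(z_direct, z_matrix)
```

The brute-force sum grows as $2^N$ and is only usable for short chains, whereas the eigenvalue formula works for any $N$; the ratio $(\lambda_-/\lambda_+)^N$ also shows directly how quickly the smaller eigenvalue becomes irrelevant.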
In the thermodynamic limit ($N\to\infty$),
\[ \left(\frac{\lambda_-}{\lambda_+}\right)^N \to 0. \]
Thus,
\[ \ln Z \approx N\ln\lambda_+, \]
meaning that the larger eigenvalue determines the thermodynamic properties. The free energy per spin is given by
\[ \frac{G}{N} = -\frac{k_BT}{N}\ln Z = -J - k_BT\ln\left[\cosh(\beta\mu B) + \sqrt{\sinh^2(\beta\mu B) + e^{-4\beta J}}\right]. \]
Again, thermodynamic quantities can be calculated from the free energy. In particular, the average spin per particle is
\[ \sigma \equiv \langle\sigma\rangle = -\left(\frac{\partial (G/N)}{\partial(\mu B)}\right)_T = \frac{\sinh(\beta\mu B)}{\sqrt{\sinh^2(\beta\mu B) + e^{-4\beta J}}}. \]
We saw in the mean field approximation that the expectation value $\sigma$ behaves as an order parameter: $\sigma = 0$ corresponds to completely randomly oriented spins, whereas $|\sigma| = 1$ corresponds to the case of parallel aligned spins. As previously, the magnetization is
\[ M = N\mu\sigma. \]
Clearly, when $B\to 0$ also $M\to 0$ for all finite temperatures $T$. This rules out the possibility of a spontaneous breaking of the symmetry between the opposite spins. Thus, we cannot observe spontaneous magnetization, and hence no phase transition, contrary to the mean field approach!

From the low-field susceptibility
\[ \chi_0(T) = \lim_{B\to 0}\left(\frac{\partial M}{\partial B}\right)_T = \frac{N\mu^2}{k_BT}e^{2J/k_BT}, \]
we see immediately that a ferromagnetic ($J > 0$) system magnetizes strongly at low (but finite) temperatures. Nevertheless, when the external field is removed, the system returns to the disordered state with $\sigma = 0$. Thus, the one-dimensional Ising chain is a paramagnetic system without any phase transitions. However, it does not obey the Curie law ($M \not\propto 1/T$), so it is not a Curie paramagnet. The low-field susceptibility $\chi_0$ diverges when $T\to 0$. This means that there exists a phase transition at absolute zero, corresponding to complete alignment of the spins.

The failure of the mean field theory in one dimension is an extreme case of a general trend. One can show that for $d\le 3$, the mean field calculation gives too high estimates for the critical temperature. For dimensions $d\ge 4$, the estimates are good but have only mathematical, rather than physical, interest.

Two-dimensional Ising model

The two-dimensional Ising model can also be solved exactly by generalizing the transition matrix method (Onsager, 1944). Unfortunately, the mathematics involved in the calculation is rather challenging. Therefore, we only give the results here. It turns out that in the two-dimensional case there is a phase transition at the temperature
\[ k_BT_c = \frac{2J}{\ln(1+\sqrt{2})} \approx 2.269\,J. \]
Also, the specific heat diverges (logarithmically) at the critical point $T = T_c$, and the phase transition is continuous. The two-dimensional Ising model is perhaps the simplest exactly solvable model that displays a phase transition.

11.4 Critical Exponents

One way to classify phase transitions is by the quantity $m$ called the order parameter. As in the case of magnetization, the order parameter is usually assigned to the expectation value of some observable of the system (such as the average spin in the case of magnetization). Analogously to $\mu B/k_BT$ in the magnetization case, we identify $h$ as the ordering field coupled to the observable.⁶⁰ In the low-field limit ($h\to 0$), $m$ tends to a limiting value $m_0$, which is $0$ when $T\ge T_c$ and $\ne 0$ for $T < T_c$. We say that the phase transition is continuous if the order parameter approaches zero continuously. On the other hand, if the change at the critical temperature is finite, the phase transition is discontinuous.⁶¹ We consider here continuous phase transitions.

⁶⁰ For a gas-liquid transition, one can choose the density difference $(\rho_l - \rho_c)$ as the order parameter and the pressure difference $(p - p_c)$ as the ordering field.

⁶¹ Typically, the order parameter is related to thermodynamic observables that are given in terms of first derivatives of the free energy. In our previous terminology then, the continuous transitions are second order transitions and the discontinuous transitions are first order transitions.

It is typical for a continuous (second order) phase transition that the system goes from a high temperature phase to a low temperature phase with lower symmetry. This is referred to as spontaneous symmetry breaking. Examples include:

• Structural formation of crystal lattice.

• Ferromagnetic phase transition.

• Superconducting and superfluid transitions.

• Symmetry breaking of the electroweak interaction.

The order parameter, various response functions (such as the susceptibility) and correlation functions display non-analytic behaviour, such as singularities, near the critical point. The singular parts are, with great accuracy, proportional to some powers of the thermodynamic quantities $h$ and $T - T_c$. The critical exponents are defined as follows:

• The behaviour of the order parameter is determined by the exponent $\beta$,
\[ m(T) \sim \begin{cases} 0 & \text{when } T > T_c \\ (T_c - T)^{\beta} & \text{when } T < T_c. \end{cases} \]

• The divergence of the low-field susceptibility when $T\to T_c$ is given by the exponents $\gamma$ (above) and $\gamma'$ (below),
\[ \chi_0(T) \sim \left.\frac{\partial m}{\partial h}\right|_{T,\,h\to 0} \sim \begin{cases} (T - T_c)^{-\gamma} & \text{when } T > T_c \\ (T_c - T)^{-\gamma'} & \text{when } T < T_c. \end{cases} \]
Within experimental accuracy we have $\gamma = \gamma'$.

• The dependence of the order parameter on the field is given by the exponent $\delta$,
\[ m|_{T=T_c} \sim h^{1/\delta} \quad\text{when } T = T_c,\ h\to 0. \]

• The singular part of the specific heat is determined by the exponents $\alpha$ and $\alpha'$,
\[ C_h(T) \sim -T\left(\frac{\partial^2 G}{\partial T^2}\right)_h \sim \begin{cases} (T - T_c)^{-\alpha} & \text{when } T > T_c \\ (T_c - T)^{-\alpha'} & \text{when } T < T_c. \end{cases} \]
In practice, we have $\alpha = \alpha'$.

One can show that the critical exponents for the van der Waals gas (exercise) and for the Ising model in the mean field approximation are
\[ \beta = \frac{1}{2},\quad \gamma = \gamma' = 1,\quad \delta = 3,\quad \alpha = \alpha' = 0. \]
In fact, all mean field theories give this set of exponents despite the possible structural differences. This suggests that they belong to the same universality class.

11.5 Landau's Theory of Phase Transitions

In a continuous phase transition, the slope of the free energy changes continuously and a symmetry is always broken. As we have discussed, this can be seen as the appearance of an order parameter in the phase with lower symmetry. All such transitions can be described in terms of the mean field theory presented first by Landau (1937). We have already noticed that the mean field theory does not describe all features of continuous phase transitions correctly, but it nevertheless gives a good starting point for understanding such transitions.

Landau argued that the critical behaviour of a given system can be determined by expanding its free energy in powers of the zero-field order parameter $m_0 = m(h = 0)$ as
\[ \phi_0(t, m_0) = \phi_0(t, 0) + r(t)m_0^2 + s(t)m_0^4 + \cdots, \]
where
\[ t = \frac{T - T_c}{T_c}, \quad |t| \ll 1. \]
The free energy should have the same symmetries as the Hamiltonian. Generally, the Hamiltonian remains unchanged (in the zero-field limit) when $m_0\to -m_0$, which is reflected in the disappearance of the odd powers of $m_0$. The coefficients may be written as
\[ \phi_0(t, 0) = \sum_{k\ge 0} q_k t^k, \quad r(t) = \sum_{k\ge 0} r_k t^k, \quad s(t) = \sum_{k\ge 0} s_k t^k, \ \ldots \]
The equilibrium occurs when the free energy is minimized. Thermodynamic stability requires $s(t) > 0$. We expect that the equilibrium is found at $m_0 = 0$ when $T > T_c$, and at $m_0 \ne 0$ when $T < T_c$. This is obtained by choosing $r(t) > 0$ and $r(t) < 0$, respectively. Thus, $r(0) = 0$. The lowest order approximation gives
\[ \phi_0(t, m_0) \approx \phi_0(t, 0) + r_1 t\,m_0^2 + s_0 m_0^4 + \cdots. \]
Since we are considering phenomena near the critical point, we include only the terms written explicitly above.

[Figure: the free energy $\Phi_0$ as a function of $m_0$ for $t > 0$, $t = 0$ and $t < 0$; for $t < 0$ the minima lie at $\pm m_s$.]

We see that for $t > 0$ we have one minimum at $m_0 = 0$. For $t < 0$, we obtain two minima whose locations are given by
\[ \frac{\partial\phi_0}{\partial m_0} = 0 \ \Rightarrow\ m_s = \pm\sqrt{r_1|t|/2s_0}. \]
The region between these values corresponds to unstable states. It should be corrected with a method similar to Maxwell's equal areas construction. Based on the above, the order parameter depends on the temperature as
\[ m_0 \propto (T_c - T)^{1/2}, \]
implying that the critical exponent $\beta = 1/2$ for the mean field theories.

We can thus write near the critical point
\[ \phi_0(t) \approx \begin{cases} \phi_0(t, 0) & \text{for } T > T_c \\ \phi_0(t, 0) - \dfrac{r_1^2}{4T_c^2 s_0}(T_c - T)^2 & \text{for } T < T_c. \end{cases} \]
This implies that the critical exponents $\alpha = \alpha' = 0$. It also implies a jump $r_1^2/2s_0T_c$ in the specific heat at the critical point.

Effects of the field and universality

When the field is non-zero, it is useful to make the Legendre transformation
\[ \phi_h(t, m) = -hm + \phi_h(t, 0) + r_1 t\,m^2 + s_0 m^4 + \cdots. \]
With a straightforward calculation, one can show that the critical exponents $\gamma = \gamma' = 1$. Quick inspection reveals that the introduction of the field breaks the symmetry in the system already when $T > T_c$. This means that there is no longer a continuous phase transition.

Thus, we have shown that the Landau theory, built only upon the analyticity of the free energy and the symmetries of the Hamiltonian, results in exactly the same critical exponents as the previous calculations made with mean field theory! This means that while the free energy itself depends on the structure of the system, the critical exponents do not. This universality of the critical exponents suggests that we are dealing with a class of systems that display qualitatively the same critical behaviour, despite the structural differences.

Breakdown of Landau theory

The mean field (Landau) theory neglects the fluctuations of the order parameter on large length scales. However, when one approaches the critical point these fluctuations diverge. They are often referred to as critical fluctuations. When the fluctuations become larger than the mean value of the order parameter, the mean field theory breaks down. In other words, a mean field theory cannot be valid arbitrarily close to the critical point.

To see this, consider correlations between particles in the high temperature phase of the Ising model. They can be characterized with the spin-spin correlation function
\[ g(i, j) = \langle\sigma_i\sigma_j\rangle - \langle\sigma_i\rangle\langle\sigma_j\rangle, \qquad (99) \]
defined for the pair of spins at sites $i$ and $j$.
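The correlation function (99) can be evaluated by brute force for a small system. The sketch below is an illustration added here, under the assumption of a periodic one-dimensional chain at zero field ($B = 0$); the function names and the dimensionless coupling `beta_j` $= \beta J$ are hypothetical. For this special case the closed form $\langle\sigma_0\sigma_r\rangle = (u^r + u^{N-r})/(1 + u^N)$ with $u = \tanh\beta J$ is a standard transfer matrix result and serves as a cross-check.

```python
import math
from itertools import product


def spin_correlation(n_spins, beta_j, i, j):
    """g(i, j) = <s_i s_j> - <s_i><s_j> for a periodic zero-field Ising chain."""
    z = 0.0
    sum_i = 0.0
    sum_j = 0.0
    sum_ij = 0.0
    for spins in product((1, -1), repeat=n_spins):
        bonds = sum(spins[k] * spins[(k + 1) % n_spins] for k in range(n_spins))
        weight = math.exp(beta_j * bonds)  # Boltzmann weight exp(-beta*H)
        z += weight
        sum_i += spins[i] * weight
        sum_j += spins[j] * weight
        sum_ij += spins[i] * spins[j] * weight
    return sum_ij / z - (sum_i / z) * (sum_j / z)


def spin_correlation_exact(n_spins, beta_j, separation):
    """Closed form for the periodic zero-field chain, with u = tanh(beta*J)."""
    u = math.tanh(beta_j)
    return (u ** separation + u ** (n_spins - separation)) / (1.0 + u ** n_spins)


if __name__ == "__main__":
    for r in range(5):
        print(r, spin_correlation(8, 0.5, 0, r))
```

For $i = j$ the function returns $1$, the variance of a single spin with $\langle\sigma\rangle = 0$, and the values decrease as the separation grows up to half the chain length.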
When $i = j$, one obtains the fluctuations (variance) of the spin $\sigma$ at site $i$. When the separation between $i$ and $j$ increases, one expects that the corresponding spins become uncorrelated and $\langle\sigma_i\sigma_j\rangle \to \langle\sigma_i\rangle\langle\sigma_j\rangle$, leading to $g(i, j)\to 0$.

By using the partition function of the Hamiltonian (98), we obtain
\[ \frac{\partial}{\partial B}\ln Z = \beta\mu\Big\langle\sum_i\sigma_i\Big\rangle = \beta\langle M\rangle, \]
where $\mu\sum_i\sigma_i$ is the net magnetization of the system. Similarly,
\[ \frac{\partial^2}{\partial B^2}\ln Z = \beta^2\left(\langle M^2\rangle - \langle M\rangle^2\right). \]
Thus, we obtain for the susceptibility
\[ \chi = \frac{\partial\langle M\rangle}{\partial B} = \beta\left(\langle M^2\rangle - \langle M\rangle^2\right) = \beta\mu^2\sum_i\sum_j g(i, j). \]
This is an important result! It relates the susceptibility to the fluctuations in the order parameter (magnetization). Also, it tells that the fluctuations are intrinsically related to the correlations between the individual spins.

The mean field theory is assumed to be valid if it obeys the so-called Ginzburg criterion
\[ \frac{\sigma_M^2}{\langle M\rangle^2} \ll 1 \ \Rightarrow\ k_BT\chi \ll \langle M\rangle^2, \]
where $\sigma_M^2 = \langle M^2\rangle - \langle M\rangle^2$. In the mean field theory, the susceptibility diverges at the critical temperature, implying the breakdown of the theory. In some systems, however, the breakdown occurs only for very small $|t|$. Examples of such systems include

• Superfluids: The normal-superfluid transition in He⁴ is a continuous phase transition which involves broken gauge symmetry. The order parameter is the macroscopic wave function $\Psi$ of the condensed phase, and the free energy is given as
\[ \phi(T, P) = \phi_0(T, P) + r(T, P)|\Psi|^2 + s(T, P)|\Psi|^4 + \cdots, \]
where $r(T, P) = r_0(T, P)(T - T_c)$, and $r_0$ and $s$ are slowly varying functions of $T$ and $P$.

• Superconductors: The condensed phase is a macroscopically occupied quantum state $\Psi$, serving as the order parameter. In the normal-superconductor transition the phase of the order parameter changes. The free energy is given by
\[ \psi(T) = \int d^3r\left[\psi_n(T) + r(T)|\Psi(\mathbf{r})|^2 + s(T)|\Psi(\mathbf{r})|^4 + \frac{1}{2m}|i\hbar\nabla_{\mathbf{r}}\Psi(\mathbf{r})|^2\right]. \]
The Ginzburg-Landau theory of superconductivity is obtained by extremizing the free energy with respect to variations in $\Psi^*$.

11.6 Scaling, Universality and Renormalization Group Approach

Even though the neglect of fluctuations in the mean field theory leads to quantitatively incorrect results, it nevertheless produces the correct result that every physical system belongs to a specific universality class, based on its critical behaviour. In spite of this prediction, the mean field theory does not provide any reason for universality. For that, we need the scaling theory.

Widom scaling

We have described the singular behaviour in the vicinity of the critical point in terms of the critical exponents. The free energy can be written in terms of regular and singular terms as
\[ \phi(t, h) = \phi_r(t, h) + \phi_s(t, h), \]
implying that the critical exponents must come from the singular part $\phi_s$.

The idea behind the scaling hypothesis is that when the system approaches the critical point, the correlation length⁶² $\xi$ becomes infinitely large, resulting in diminished sensitivity to length transformations (changes of scale). In other words, we say that there is no natural measure of length in the system, except the atomic length. The system looks similar no matter what scale is used, and it is called self-similar. This suggests that the singular part of the free energy is a generalized homogeneous function, i.e. it scales like
\[ \phi_s(\lambda^p t, \lambda^q h) = \lambda\,\phi_s(t, h), \]
where $\lambda > 0$, and $p$ and $q$ are independent of the system. Direct calculations give the critical exponents in terms of $p$ and $q$ (exercise):
\[ \beta = \frac{1-q}{p}, \quad \delta = \frac{q}{1-q}, \quad \gamma = \gamma' = \frac{2q-1}{p}, \quad \alpha = \alpha' = 2 - \frac{1}{p}. \]
These imply the scaling laws
\[ \alpha + 2\beta + \gamma = 2, \qquad \beta(\delta - 1) = \gamma. \]
We thus see that only two of the critical exponents are independent.

⁶² The correlation length is the measure of the range of the correlations in the system.

Kadanoff scaling and renormalization group

Unlike the Widom scaling hypothesis, the method presented by Kadanoff and developed by Wilson is based on the microscopic properties of matter.
We will present only the outline of the method:

• Combine the original microscopic state variables blockwise into block variables.

• Determine the effective interactions between the blocks. This coarse graining of the system is called the block transform.

• The block transforms form a semigroup, the so-called renormalization group. One can perform transforms sequentially.

• Because a system at the critical point has no natural length scale due to the diverging correlation length, the transformed systems look like copies of each other. Thus, the critical point corresponds to a fixed point of the transformations.

In general, renormalization is a collection of techniques to remove the infinities arising from the theory. It was first developed to deal with the infinite terms in the perturbative expansions of quantum electrodynamics. Such infinities typically arise because, when constructing the theory, we have chosen to describe phenomena occurring at a chosen scale of length, ignoring the degrees of freedom at shorter scales. Theories of this kind can be described in terms of the renormalization group, and are called effective field theories. For example, practically all of condensed matter physics consists of effective field theories. Also, general relativity can be seen as an effective field theory of a (yet unknown) complete theory of quantum gravity.

12. Non-Equilibrium Statistics

This course has so far dealt only with statistical averages of physical quantities. We have shown that, up to very high accuracy, these averages correspond to the relevant measurements made on systems in equilibrium. Nevertheless, there do exist deviations from the average values, called fluctuations. These fluctuations are generally small in magnitude, but are important for several reasons:

• In the onset of the high degree of spatial correlations in the vicinity of the critical point.

• In understanding how a non-equilibrium state approaches equilibrium, and how the equilibrium persists.
• In relating the dissipative properties of a system with the microscopic properties of the equilibrium (fluctuation-dissipation theorem).

In this final chapter, we wish to gain understanding of the fluctuations in thermal equilibrium, the approach to equilibrium, and dissipation. It turns out that all these are intimately connected.

12.1 Brownian Motion

Brownian motion of a relatively massive particle immersed in a fluid will provide us with the paradigm for describing equilibrium and non-equilibrium processes. Even in thermal equilibrium, the Brownian particle exhibits random-like motion. This randomness is a consequence of the unavoidable interactions between the particle and its surroundings.

The situation can be modelled with the Hamiltonian⁶³
\[ H = H_{\rm sys} + H_{\rm bath} + H_{\rm int}, \]
where $H_{\rm sys}$, $H_{\rm bath}$ and $H_{\rm int}$ are the Hamiltonians for the system (the particle), its surroundings, and the interaction between them. We will consider, for simplicity, a one-dimensional particle, and we focus our attention on the location $x$ of the particle. Hamilton's equations of motion give
\[ \frac{dx}{dt} = v, \qquad m\frac{dv}{dt} = -\frac{dH_{\rm sys}}{dx} - \frac{dH_{\rm int}}{dx} \equiv \mathcal{F}(t) + F(t). \]
We have assumed above that the system and the bath interact (linearly) via the variable $x$. We denote with $\mathcal{F}$ the force exerted on the massive particle by some external potential, and with $F$ the force that describes the interactions with the many other degrees of freedom in the surrounding bath⁶⁴ (we will use the words bath and fluid interchangeably in the following).

⁶³ Note that this Hamiltonian describes a very general open system, i.e. a system interacting with a much larger bath.

⁶⁴ The term noise is often used to describe the fluctuating force $F$, since it describes the typically undesired deviations from the behaviour of an isolated system.

Since the studied particle is large compared to the vast number of smaller ones constituting the fluid, its movement is very slow, and $F(t)$ is a very rapidly fluctuating function on the particle's time scale. Thus, we are not interested in the explicit time evolution of the bath degrees of freedom. What we assume instead is that in the above equation of motion those are averaged (traced out) quantities. As a consequence, the force $F(t)$ can be treated as a random function of time $t$. Additionally, we assume that in equilibrium ($v = 0$) there is no preferred direction in space, and the ensemble average vanishes, $\langle F\rangle = 0$. It is beneficial to write the random force as
\[ F = \langle F\rangle_\xi + \xi, \]
where $\xi$ denotes the rapidly fluctuating part around the ensemble average⁶⁵ $\langle F\rangle_\xi$. Naturally, $\langle\xi\rangle = 0$. In the limit of small $\langle v\rangle$, one can expand $\langle F\rangle$ as
\[ \langle F\rangle_\xi = -m\gamma\langle v\rangle_\xi. \]
We will show later that the coefficient $\gamma$ is determined by the properties of the function $F(t)$ itself. For the moment, it will be taken as a phenomenological constant of friction. One sees that the effect of the slowly varying component $\langle F\rangle_\xi$ of the fluctuating force is to cause relaxation of a particle with an initial velocity $v(0)$ towards the equilibrium $v = 0$.

Due to the rapidly fluctuating force, the velocity of the particle also fluctuates in time. Similarly to the force, the velocity can be represented in terms of rapid, zero-mean fluctuations $v'$ around the ensemble average velocity $\langle v\rangle_\xi$ as
\[ v = \langle v\rangle_\xi + v'. \]
The variations in $\langle v\rangle_\xi$ are much slower than those in $v'$. Altogether, the equation of motion is written as
\[ m\frac{dv}{dt} = \mathcal{F}(t) - m\gamma v + \xi(t), \]
where we have assumed that $\gamma v = \gamma\langle v\rangle_\xi$⁶⁶. The above equation of motion is called the Langevin equation. The Langevin equation is a stochastic differential equation, meaning that it contains a random term. The random solutions of such equations are generally called stochastic variables.

⁶⁵ In the present case, the non-equilibrium ensemble is formed as mental copies of the Brownian particle with different realisations of the random variable $\xi$. We will continue to use the notation $\langle\cdot\rangle$ for the equilibrium (microcanonical, canonical, grand canonical) averages.

⁶⁶ Fluctuations in $v$ are slower than those of the force $F$ that produces them. This is because of the large mass of the particle.

The Langevin equation can be shown to hold also for the quantum mechanical operator $\hat{x}$, in addition to the classical expectation value $x = \langle\hat{x}\rangle$ discussed above. For further details on the quantum Langevin equation, refer e.g. to Gardiner and Zoller, Quantum Noise.

The informal derivation of the Langevin equation presented above shows explicitly how irreversibility arises in Hamiltonian (quantum) mechanics. Consider the combined system of the particle and the bath (fluid). This constitutes an isolated system, whose governing equations are reversible and whose energy is conserved. Now, the Langevin equation covers only the degrees of freedom of the particle, neglecting the time evolution of the bath. The effect of the bath on the particle is described with the frictional force $-m\gamma v$, leading to dissipation of energy from the particle into the bath, and with the random force $\xi(t)$ that causes fluctuations in the particle degrees of freedom even at equilibrium.

Equilibrium fluctuations

We assume for now that the external potential is zero, and that the particle is in thermal equilibrium with the bath. Thus, the Langevin equation (multiplied by $x$) gives
\[ mx\frac{d\dot{x}}{dt} = m\left[\frac{d}{dt}(x\dot{x}) - \dot{x}^2\right] = -m\gamma x\dot{x} + x\xi(t). \]
Next, we take an ensemble average of both sides, use $\langle x\xi\rangle_\xi = 0$, and employ the equipartition theorem $\frac{1}{2}m\langle\dot{x}^2\rangle = \frac{1}{2}k_BT$. The resulting differential equation is easily solved, leading to
\[ \langle x\dot{x}\rangle_\xi = Ce^{-\gamma t} + \frac{k_BT}{m\gamma}, \]
where $C$ is a constant of integration. We thus see that $\gamma^{-1}$ is a characteristic time constant of the system. If we assume that initially at $t = 0$ each particle in the ensemble is at $x(0) \equiv x_0 = 0$, we obtain $C = -k_BT/m\gamma$. Thus,
\[ \langle x\dot{x}\rangle_\xi = \frac{k_BT}{m\gamma}\left(1 - e^{-\gamma t}\right), \]
and by integrating once more we obtain the equilibrium fluctuations (variance, $x_0 = 0$)
\[ \langle(x(t) - x_0)^2\rangle_\xi = \langle x^2\rangle_\xi = \frac{2k_BT}{m\gamma}\left[t - \gamma^{-1}\left(1 - e^{-\gamma t}\right)\right]. \qquad (100) \]
When $t \ll \gamma^{-1}$, we have
\[ \langle x^2\rangle_\xi \approx \frac{k_BT}{m}t^2. \]
This means that at short time intervals the particle behaves as a free particle with the constant thermal velocity $v = \sqrt{k_BT/m}$. In the opposite limit, $t \gg \gamma^{-1}$, we have
\[ \langle x^2\rangle_\xi \approx \frac{2k_BT}{m\gamma}t. \]
Here one should recall the result from the very beginning of the course, which stated that the variance of a random walk grows as the number of steps $N$. Thus, in the long time limit the particle executes a continuous random walk and diffuses into the fluid with the diffusion constant
\[ D = \frac{k_BT}{m\gamma}, \]
which is referred to as the Einstein relation. We stress that the variance $\langle x^2\rangle_\xi$ is caused by the coupling to the environment. In the absence of the coupling, the particle located initially at $x_0$ stays there forever. However, the constant bombardment by the fluid molecules causes random "kicks" to the particle, thus causing the diffusive behaviour.

Correlation functions

Instead of assuming thermal equilibrium, we can calculate directly the mean square velocity of the Brownian particle (assume again that $\mathcal{F} = 0$). First of all, by integrating the Langevin equation we obtain
\[ v(t) = v(0)e^{-\gamma t} + \frac{1}{m}\int_0^t e^{-\gamma(t-t')}\xi(t')\,dt'. \qquad (101) \]
It should be noted that the solution of the above equation holds for a single realisation of $\xi$. Because $\xi(t)$ is a random variable, also $v(t)$ and $x(t)$ are random variables whose properties are determined by $\xi(t)$. This leads to
\[ \langle v^2(t)\rangle_\xi = v^2(0)e^{-2\gamma t} + \frac{2v(0)}{m}e^{-2\gamma t}\int_0^t e^{\gamma t'}\langle\xi(t')\rangle_\xi\,dt' + \frac{1}{m^2}e^{-2\gamma t}\int_0^t\!\!\int_0^t dt'\,dt''\,e^{\gamma(t'+t'')}\langle\xi(t')\xi(t'')\rangle_\xi. \]
The second term in the above equation is zero because $\langle\xi(t)\rangle_\xi = 0$ for all $t$.

The function $\langle\xi(t')\xi(t'')\rangle_\xi$ appearing in the above equation is of greatest importance to us. It can be seen as the temporal equivalent of the spin-spin correlation function defined in Eq. (99). It is a measure of the statistical correlation between the value of $\xi$ at time $t'$ and its value at time $t''$. We will refer to it as the (time) autocorrelation function of the variable $\xi$. Generally, we will denote the temporal correlation function of the variables $A$ and $B$ with
\[ C_{AB}(t, t') = \langle A(t)B(t')\rangle_\xi. \]
In order to proceed with the calculation, one thus needs information on the time correlations of $\xi$. Typically in the case of a Brownian particle, the fluctuating force is assumed to be independent of its previous values⁶⁷. In addition, one often assumes that $\xi(t)$ is a stationary process, meaning that its mean and variance do not change under shifts in time and space. Mathematically this is called a Gaussian white noise process⁶⁸ and is expressed as the delta-correlation
\[ C_\xi(t, t') = g\,\delta(t - t'), \]
where $\delta(t)$ is Dirac's delta function and $g$ is a constant given by the equipartition theorem.

⁶⁷ This assumption of short memory is referred to as the Markovian approximation.

⁶⁸ This terminology will become clear later when we discuss the spectral density.

This assumption results in
\[ \langle v^2(t)\rangle_\xi = v^2(0)e^{-2\gamma t} + \frac{g}{2m^2\gamma}\left(1 - e^{-2\gamma t}\right). \]
When $t\to\infty$ we have that $\langle v^2(t)\rangle \to g/2m^2\gamma$. In this limit, no matter what the initial velocity $v(0)$ is, we expect that the particle is in equilibrium with the bath. Thus, the equipartition theorem gives
\[ g = 2m\gamma k_BT. \]
We obtain for delta-correlated noise
\[ \langle v^2(t)\rangle_\xi = v^2(0)e^{-2\gamma t} + \frac{k_BT}{m}\left(1 - e^{-2\gamma t}\right). \]
By integrating Eq. (101), we obtain
\[ x(t) = x(0) + \frac{v(0)}{\gamma}\left(1 - e^{-\gamma t}\right) + \frac{1}{m\gamma}\int_0^t\left(1 - e^{-\gamma(t-t')}\right)\xi(t')\,dt'. \]
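The relaxation of $\langle v^2(t)\rangle_\xi$ towards $k_BT/m$ can be illustrated with a direct simulation of the Langevin equation for delta-correlated noise of strength $g = 2m\gamma k_BT$ and zero external force. The following is a minimal sketch using a simple Euler-type discretisation, which is an assumption of this example and not a method from the notes; the function name, the parameter names and the default values are illustrative. In each step of length `dt` the velocity receives the friction change $-\gamma v\,dt$ and a Gaussian kick of variance $2\gamma(k_BT/m)\,dt$.

```python
import math
import random


def simulate_mean_square_velocity(n_particles=2000, n_steps=400, dt=0.025,
                                  gamma=1.0, kt_over_m=1.0, v0=0.0, seed=1):
    """Estimate <v^2> at time n_steps*dt over an ensemble of Brownian particles.

    Discretised Langevin equation with white noise of strength g = 2*m*gamma*kB*T:
        v -> v - gamma*v*dt + sqrt(2*gamma*(kB*T/m)*dt) * N(0, 1)
    """
    rng = random.Random(seed)
    kick = math.sqrt(2.0 * gamma * kt_over_m * dt)
    total = 0.0
    for _ in range(n_particles):
        v = v0
        for _ in range(n_steps):
            v += -gamma * v * dt + kick * rng.gauss(0.0, 1.0)
        total += v * v
    return total / n_particles


def mean_square_velocity_exact(t, gamma=1.0, kt_over_m=1.0, v0=0.0):
    """Analytic result <v^2(t)> = v0^2 e^{-2 gamma t} + (kB*T/m)(1 - e^{-2 gamma t})."""
    decay = math.exp(-2.0 * gamma * t)
    return v0 * v0 * decay + kt_over_m * (1.0 - decay)


if __name__ == "__main__":
    print(simulate_mean_square_velocity(), mean_square_velocity_exact(10.0))
```

With a finite time step and a finite ensemble the estimate agrees with the analytic value only up to a discretisation bias of order $\gamma\,dt$ and a statistical error of order $\sqrt{2/n_{\rm particles}}$, so the comparison should be read as a consistency check rather than a precise result.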
Similarly to $v$, we obtain (we set the origin at $x(0) = 0$)
\[ \langle x^2(t)\rangle_\xi = \left[\frac{v(0)}{\gamma}\left(1 - e^{-\gamma t}\right)\right]^2 + \frac{2k_BT}{m\gamma}\int_0^t ds\left(1 - e^{-\gamma(t-s)}\right)\left(1 - e^{-\gamma(t-s)}\right) \]
\[ = \left[\frac{v(0)}{\gamma}\left(1 - e^{-\gamma t}\right)\right]^2 + \frac{2k_BT}{m\gamma}\left[t - \frac{2}{\gamma}\left(1 - e^{-\gamma t}\right) + \frac{1}{2\gamma}\left(1 - e^{-2\gamma t}\right)\right] \]
\[ = \left[\frac{v(0)}{\gamma}\left(1 - e^{-\gamma t}\right)\right]^2 + \frac{2k_BT}{m\gamma}t - \frac{k_BT}{m\gamma^2}\left(1 - e^{-\gamma t}\right)\left(3 - e^{-\gamma t}\right). \]
When the initial velocity has the thermal equilibrium value $v^2(0) = k_BT/m$, we recover the result (100).

Fokker-Planck equation

Instead of solving the average behaviour of an ensemble of Brownian particles, a more general way is to find the rate at which the distribution of Brownian particles approaches thermal equilibrium. The equation which governs the decay of the non-equilibrium distribution is called the master equation. Here, we present a simplified version of it, named the Fokker-Planck equation.

Consider the displacement $x(t)$ of a Brownian particle⁶⁹. Let $f(x, t)dx$ be the probability that an arbitrary particle from the ensemble is found with a displacement between $x$ and $x + dx$ at any given time $t$. Naturally,
\[ \int_{-\infty}^{\infty} f(x, t)\,dx = 1. \]
The time-evolution of a probability density, such as $f(x, t)$, is typically governed by the master equation
\[ \frac{\partial f}{\partial t} = \int_{-\infty}^{\infty}\left[f(x', t)W(x', x) - f(x, t)W(x, x')\right]dx', \]
where $W(x, x')\delta t$ is the probability that within a small time interval $\delta t$ the particle makes a transition from the displacement $x$ into $x'$. Note that here we also make a Markovian approximation by assuming that the rate $W(x, x')$ does not depend on the history of the particle.

⁶⁹ The following discussion can be done also for any other dynamical random variable, such as $v$.

In the case of Brownian motion, we assume that only transitions between nearby states are possible. This means that $W(x, x')$ is sharply peaked at $x' = x$, and rapidly goes to zero away from $x$. By writing the master equation in terms of $\eta = x - x'$, and making a second order Taylor expansion around $\eta = 0$, one ends up with (exercise)
\[ \frac{\partial f}{\partial t} = -\frac{\partial}{\partial x}\left[\mu(x)f(x, t)\right] + \frac{1}{2}\frac{\partial^2}{\partial x^2}\left[D(x)f(x, t)\right], \]
which is called the Fokker-Planck equation. In the above, the first and second moments of the transition rate $W(x, x')$ are given by
\[ \mu(x) = \int_{-\infty}^{\infty}\eta\,W(x, x-\eta)\,d\eta = \frac{\langle\delta x\rangle}{\delta t} = \langle v\rangle, \qquad D(x) = \int_{-\infty}^{\infty}\eta^2\,W(x, x-\eta)\,d\eta = \frac{\langle\delta x^2\rangle}{\delta t}. \]
In the case of the Brownian particle, we obtained
\[ \mu(x) = \langle v\rangle = 0, \qquad D(x) = \frac{2k_BT}{m\gamma} \equiv 2D. \]
Thus, the Fokker-Planck equation reduces to
\[ \frac{\partial f}{\partial t} = D\frac{\partial^2 f}{\partial x^2}, \]
which is the one-dimensional diffusion equation. Its solution is a Gaussian,
\[ f(x, t) = \frac{1}{\sqrt{4\pi Dt}}\,e^{-\frac{(x-x_0)^2}{4Dt}}, \]
with the variance $\sigma^2 = 2Dt$. Again, the Gaussian solution suggests the interpretation of the diffusion process in terms of a continuous random walk.

12.2 Fluctuation-Dissipation Theorem

In order to complete the discussion, we would still like to have a method for relating the constant of friction $\gamma$ to the fluctuations. This is done with the fluctuation-dissipation theorem, which basically tells that the decay of the equilibrium fluctuations obeys the same law as that of a non-equilibrium state⁷⁰. We would like to study how a thermodynamic system responds when it is (weakly) forced out of equilibrium. Somewhat surprisingly, the calculation turns out to be the simplest using the quantum mechanical density operator approach.

⁷⁰ This way of stating the fluctuation-dissipation theorem is the Onsager regression hypothesis.

Let us consider the changes to an observable $\hat{A}$ made by a weak external field $F(t)$. The nature of the field is arbitrary; it can be stochastic or deterministic. We assume that the field is coupled to the system via the observable $\hat{B}$. The Hamiltonian for the system is
\[ \hat{H} = \hat{H}_0 - F(t)\hat{B}, \]
where $\hat{H}_0$ is the equilibrium state Hamiltonian. Correspondingly, the density operator of the equilibrium state is given by
\[ \hat\rho_{\rm eq} = \frac{\exp(-\beta\hat{H}_0)}{\mathrm{Tr}\exp(-\beta\hat{H}_0)}. \]
Also, the equilibrium expectation value of an arbitrary (hermitian) operator $\hat{A}$ is given by
\[ \langle\hat{A}\rangle_{\rm eq} = \mathrm{Tr}(\hat{A}\hat\rho_{\rm eq}). \]
The application of the small time-dependent field induces a small change to the density operator,
\[ \hat\rho(t) = \hat\rho_{\rm eq} + \delta\hat\rho(t). \]
The time-evolution is given by the von Neumann equation, which can be approximated as
\[ \frac{\partial\hat\rho}{\partial t} = -\frac{i}{\hbar}[\hat{H}, \hat\rho(t)] \approx -\frac{i}{\hbar}\left([\hat{H}_0, \delta\hat\rho(t)] - F(t)[\hat{B}, \hat\rho_{\rm eq}]\right). \]
We see that in this approximation, the response is linear with respect to the field. In the interaction picture⁷¹ one can write the solution as
\[ \delta\hat\rho(t) = \frac{i}{\hbar}\int_{-\infty}^{t}dt'\,F(t')\,e^{-i\hat{H}_0(t-t')/\hbar}[\hat{B}, \hat\rho_{\rm eq}]\,e^{i\hat{H}_0(t-t')/\hbar}. \]
The change in the expectation value (the response) of the observable is then
\[ \langle\delta\hat{A}(t)\rangle = \langle\hat{A}\rangle - \langle\hat{A}\rangle_{\rm eq} = \mathrm{Tr}\left(\hat{A}\,\delta\hat\rho(t)\right). \]

⁷¹ The unitary transformation $\hat\rho_I = \hat{U}^\dagger\hat\rho\hat{U}$ into the interaction picture is obtained with $\hat{U} = \exp(-i\hat{H}_0t/\hbar)$.

One can show (exercise) that the response can be written as the convolution
\[ \langle\delta\hat{A}(t)\rangle = \frac{i}{\hbar}\int_{-\infty}^{t}dt'\,F(t')\,\langle[\hat{A}(t), \hat{B}(t')]\rangle_{\rm eq}, \]
where $\hat{A}(t) = e^{i\hat{H}_0t/\hbar}\hat{A}e^{-i\hat{H}_0t/\hbar}$. Thus, we can identify the linear response function (see the end of this chapter for more information on linear response theory)
\[ \chi_{AB}(\tau) = \frac{i}{\hbar}\langle[\hat{A}(\tau), \hat{B}(0)]\rangle_{\rm eq}, \]
where we have employed the knowledge that the equilibrium correlation depends only on the time interval $\tau$.

We see that the Fourier transform of the response becomes
\[ \langle\delta\hat{A}(\omega)\rangle = \int_{-\infty}^{\infty}dt\,e^{i\omega t}\langle\delta\hat{A}(t)\rangle = \left[\chi'_{AB}(\omega) + i\chi''_{AB}(\omega)\right]F(\omega), \]
where
\[ \chi''_{AB}(\omega) = \frac{1}{2\hbar}\int_{-\infty}^{\infty}dt\,\langle[\hat{A}(t), \hat{B}(0)]\rangle_{\rm eq}\,e^{i\omega t}, \]
and $\chi'_{AB}(\omega)$ is given by the Kramers-Krönig relation (defined later).

On the other hand, the power spectrum related to temporal correlations between the observables $\hat{A}$ and $\hat{B}$ is given by the Fourier transform of the correlation function (this is a generalization of the Wiener-Khintchine theorem presented below)
\[ S_{AB}(\omega) = \int_{-\infty}^{\infty}dt\,\langle\delta\hat{A}(t)\delta\hat{B}(0)\rangle_{\rm eq}\,e^{i\omega t}, \]
where the fluctuations are defined as $\delta\hat{A}(t) = \hat{A}(t) - \langle\hat{A}\rangle_{\rm eq}$. By employing the result (exercise)
\[ \int_{-\infty}^{\infty}dt\,\langle\hat{B}(0)\hat{A}(t)\rangle_{\rm eq}\,e^{i\omega t} = e^{-\beta\hbar\omega}\int_{-\infty}^{\infty}dt\,\langle\hat{A}(t)\hat{B}(0)\rangle_{\rm eq}\,e^{i\omega t}, \]
we obtain
\[ \chi''_{AB}(\omega) = \frac{1}{2\hbar}\left(1 - e^{-\beta\hbar\omega}\right)S_{AB}(\omega). \]
This is the celebrated fluctuation-dissipation theorem that relates the dissipation due to the perturbative field to the spontaneous thermal fluctuations that occur in equilibrium.

The classical limit of the fluctuation-dissipation theorem is obtained when $\hbar\omega \ll k_BT$:
\[ \chi''_{AB}(\omega) = \frac{\omega}{2k_BT}S_{AB}(\omega). \]

Linear response theory

Generally, the linear response theory studies how the system behaves when it is displaced from equilibrium. It assumes that the force $F(t)$ driving the system out of equilibrium is weak, leading to
\[ \langle\delta\hat{x}(t)\rangle \equiv \langle\hat{x}(t) - \langle\hat{x}\rangle_{\rm eq}\rangle = \int_{-\infty}^{t}dt'\,\chi(t-t')F(t') = \int_0^{\infty}d\tau\,\chi(\tau)F(t-\tau), \]
where $\langle\hat{x}\rangle_{\rm eq}$ is the equilibrium value of the observable $\hat{x}$, and $\chi$ is the response function that characterises the variable's response to the force field. The observable $\hat{x}$ can here be taken as the position of the Brownian particle, but can in general be any relevant observable that is randomized by the force. The integral can be extended to $-\infty$ by assuming causality, i.e. $\chi(t) = 0$ for $t < 0$. Thus, we see that the response of $x$ is a convolution of the response function and the force. By taking a Fourier transform, we obtain the linear response in frequency space,
\[ \langle\delta\hat{x}(\omega)\rangle = \chi(\omega)F(\omega). \]
The Fourier transform of the response function is often referred to as the susceptibility. Note that the response function is real since the response $\delta\hat{x}$ is an observable quantity. Nevertheless, the susceptibility is complex,
\[ \chi(\omega) = \chi'(\omega) + i\chi''(\omega), \]
where
\[ \chi'(\omega) = \int_{-\infty}^{\infty}\chi(t)\cos(\omega t)\,dt, \qquad \chi''(\omega) = \int_{-\infty}^{\infty}\chi(t)\sin(\omega t)\,dt. \]
The real part of the susceptibility is sometimes (especially when one studies the linear response of electric circuits) called the reactive response, and the imaginary part the dissipative response. In particular, in the case of a Brownian particle subjected to white noise, one can show that the susceptibility is
\[ \chi(\omega) = \left[m(\gamma - i\omega)\right]^{-1}. \]
One can show (omitted here) that causality implies that the reactive part can be written in terms of the dissipative response, or vice versa, as
\[ \chi'(\omega) = \frac{2}{\pi}\int_0^{\infty}\chi''(\omega')\frac{\omega'}{\omega'^2 - \omega^2}\,d\omega', \qquad \chi''(\omega) = -\frac{2\omega}{\pi}\int_0^{\infty}\chi'(\omega')\frac{1}{\omega'^2 - \omega^2}\,d\omega', \]
which are the Kramers-Krönig relations (the integrals are understood as principal values).

Power spectral density

Consider the fluctuation $\delta\hat{x}(t) = \hat{x}(t) - \langle\hat{x}\rangle$ of an observable $\hat{x}$. It can be split into spectral components by taking the (truncated) Fourier transform
\[ \delta\hat{x}(\omega) = \int_{-T/2}^{T/2}dt\,\delta\hat{x}(t)\,e^{i\omega t}. \]
This describes the fraction of the fluctuation $\delta\hat{x}(t)$ that is oscillating with the frequency $\omega$.

The Wiener-Khintchine theorem relates the power spectral decomposition of the fluctuations in an observable to the Fourier transform of the autocorrelation function of the fluctuations. It assumes that the state is stationary, meaning that
\[ C_{\delta x}(t+\tau, t) = C_{\delta x}(\tau, 0) \equiv C_{\delta x}(\tau). \]

Conclusions

This concludes the course of Statistical Physics. The first part of the course lays the microscopic foundation for the classical theory of thermodynamics. The quantum mechanical indistinguishability of identical particles was discussed in the beginning of the latter half of the course. New phenomena that result from the quantum nature of particles, such as Bose-Einstein condensation and the relativistic electron gas, were introduced. The inclusion of unavoidable interactions between particles was studied in terms of cluster expansions and second quantization. Finally, an introduction to the more elaborate theories of phase transitions and non-equilibrium statistics was given. Throughout the course, the aim has been to learn the fundamental statistical concepts that are needed in the more specialised advanced courses, such as Superconductivity, Condensed Matter Physics, Quantum Optics in Electric Circuits, and so on.
Typically, the power of the oscillations is related to the square of the variable, which persuades us to define the equilibrium power spectral density as 1 h|δ x̂(ω)|2 i. S(ω) = lim T →∞ T Now, we can write72 Z Z 0 1 T /2 T /2 dtdt0 eiω(t −t) hδ x̂(t0 )δ x̂(t)i T →∞ T −T /2 −T /2 Z T T − |τ | iωτ = lim dτ e hδ x̂(τ )δ x̂(0)i T →∞ −T T Z ∞ = dτ eiωτ Cδx (τ ). S(ω) = lim −∞ 72 Recall how the change of variables is made with multidimensional integrals! 67
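Returning to the power spectral density, the Wiener-Khintchine relation can be checked numerically in a discrete setting. The sketch below is only an illustration under stated assumptions, not part of the course material: time is discretised into N steps, the signal is treated as periodic (so the triangular weight $(T - |\tau|)/T$ of the continuous derivation does not appear), and the fluctuation is a toy Langevin-type sequence. The helper names `dft`, `circular_autocorrelation` and `toy_fluctuations`, and the parameter values, are hypothetical choices made for this example.

```python
import cmath
import math
import random


def dft(values):
    """Discrete analogue of the transform in the text: sum_n x_n exp(+i w_k n)."""
    n = len(values)
    return [
        sum(v * cmath.exp(2j * math.pi * k * j / n) for j, v in enumerate(values))
        for k in range(n)
    ]


def circular_autocorrelation(values):
    """C_m = (1/N) sum_n x_{n+m} x_n, with periodic indexing (an assumption)."""
    n = len(values)
    return [
        sum(values[(j + m) % n] * values[j] for j in range(n)) / n
        for m in range(n)
    ]


def toy_fluctuations(n, gamma=0.2, seed=1):
    """Toy signal: x_{k+1} = (1 - gamma) x_k + Gaussian kick, minus its mean."""
    rng = random.Random(seed)
    x, samples = 0.0, []
    for _ in range(n):
        x = (1.0 - gamma) * x + rng.gauss(0.0, 1.0)
        samples.append(x)
    mean = sum(samples) / n
    return [v - mean for v in samples]  # delta x = x - <x>


delta_x = toy_fluctuations(64)
n_samples = len(delta_x)

# Left-hand side: (1/T) |delta x(omega)|^2 for each discrete frequency.
periodogram = [abs(c) ** 2 / n_samples for c in dft(delta_x)]

# Right-hand side: Fourier transform of the autocorrelation function C(tau).
spectrum_from_c = [c.real for c in dft(circular_autocorrelation(delta_x))]

max_diff = max(abs(a - b) for a, b in zip(periodogram, spectrum_from_c))
print(f"largest deviation between the two spectra: {max_diff:.2e}")
```

The two spectra agree to rounding error because the double sum over n and n' can be reorganised into a sum over the lag m = n' - n, which is the discrete counterpart of the change of variables used in the derivation of S(ω).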