Predicting the most likely state for a basic geophysical flow: theoretical framework.

Expository paper submitted in partial fulfillment for the degree of Master of Science, Mathematics

by Rodrigo Duran (rduran@coas.oregonstate.edu)

Major Advisor: Dr. Yevgeniy Kovchegov
Committee: Dr. Nathan L. Gibson and Dr. Radu Dascaliuc

Department of Mathematics, Oregon State University
Kidder Hall 368, Corvallis, OR 97331-5503, USA

Last revision: January 9, 2015

Preface

During my PhD research (physical oceanography), I developed an interest in statistical mechanics applications to geophysical fluid dynamics. This interest motivated the topic for my expository paper as the final requisite to my master's in math. The underlying objective was to write a paper through which anyone with enough background may be introduced to the topic without prior knowledge of either fluid dynamics or statistical mechanics. To this effect I have added a number of appendix sections as well as references throughout the text. At the very least, I deem this was a great learning experience.

It is clear from the concept of an expository paper (also called a review paper) that no original ideas are to be credited to the author, but just to make sure: I claim no original work in this paper. My job has been to assimilate and put together the information needed to achieve the stated objective. This information has been mainly found in different journal articles, textbooks and talks with specialists. A number of proofs and explanations of details, or parts thereof, were not available in the references and have been worked out by the author under advisement of Dr. Yevgeniy Kovchegov, Dr. Radu Dascaliuc and Dr. Nathan Gibson.

I wish to thank my mathematics advisor, Dr. Yevgeniy Kovchegov, who has been an extraordinary professor both inside and outside classrooms.
I have benefited much from his rigorous proofs as well as from his generous sharing of insights and meanings (the greater picture, as he calls it), all while under a constantly positive and constructive ambiance (see footnote 1). For all of this and more, I am indebted. I also thank my committee members, Dr. Radu Dascaliuc and Dr. Nathan Gibson from the Mathematics department, for their insightful help and comments that made this a better paper, as well as for the voluntary committee duty that they took on, out of their busy schedules, for my benefit. They have been very generous in their availability and explanations, which directly resulted in my academic benefit.

I also wish to thank my physical oceanography PhD advisor, Dr. Roger M. Samelson, for his moral support while completing my master's. I wish to acknowledge his rigorous, very positive and fruitful academic influence on my development as a graduate student. I have undoubtedly improved much thanks to my advisors' and committee members' tutelage.

Footnote 1: "constantly positive and constructive ambiance" should be understood in a literal sense.

Contents

1 Introduction
2 Foundations for Geophysical Fluid Dynamics
  2.1 Conservation of Mass
  2.2 Boussinesq approximation
  2.3 Conservation of Energy
  2.4 Navier Stokes equations
  2.5 Euler equations
  2.6 Buoyancy Frequency
  2.7 Vorticity
    2.7.1 Potential vorticity
    2.7.2 Baroclinic vs barotropic fluid
  2.8 Shallow water equations
  2.9 Conservation of potential vorticity in shallow water equations
  2.10 A streamfunction for two-dimensional flow
    2.10.1 Solution to Poisson's equation
  2.11 Rossby number
  2.12 Quasi-geostrophy
  2.13 Barotropic QGE
  2.14 A basic solution to the barotropic QGE
  2.15 Non-linear stability of the barotropic QGE
3 Foundations of Statistical mechanics
  3.1 Mixing leads to ergodicity
  3.2 Statistical mechanics' main ingredient: Liouville property
  3.3 Conserved quantities and their ensemble averages
  3.4 Shannon's Entropy
  3.5 Casimir's conservation laws
4 Statistical theory for a simple Geophysical Flow
  4.1 Galerkin approximation
  4.2 Conserved quantities
  4.3 Non-linear stability of the exact solution to the truncated system
  4.4 Liouville property of the truncated system
  4.5 Statistical predictions of the truncated system
  4.6 The limit Λ → ∞
    4.6.1 Choosing µ = µΛ
    4.6.2 The limit of the mean states
    4.6.3 Choosing α = αΛ and the enstrophy constraint
    4.6.4 The energy constraint
5 References
A Existence and Uniqueness to Poisson's equation
  A.1 Green's identities
  A.2 Maximum-Minimum principle
  A.3 Existence and Uniqueness for Dirichlet and Robin Conditions in a bounded domain
  A.4 Existence and Uniqueness for Neumann Conditions
  A.5 Unbounded domains
B Dominated Convergence Theorem
  B.1 Bounded Convergence Theorem
C Ensemble average
D Flow map of a differential equation
E Entropy of a Measurable Partition
  E.1 Uniqueness of Shannon's entropy
  E.2 Motivation
F Lagrange multiplier
G Orthogonal projections

1 Introduction

Statistical mechanics studies the probability that a system is in a certain state given one or more constraints, which are usually fixed conserved quantities. It is a particularly useful and powerful approach for problems with a large number of degrees of freedom, where a complete knowledge of the system is not practical or even possible. By reducing the complexity of the system to a few parameters, statistical mechanics replaces the question 'what is the state of a system?' with the question 'what is the most likely state of the system given some known constraints?'.

Holloway (1986) reviews successful applications of statistical mechanics to a variety of geophysical fluid dynamics (GFD) problems, including geostrophic turbulence over topography, two-dimensional turbulence on a plane and on a sphere, closed-basin circulation and Western intensification, the shape of a thermocline, baroclinic turbulence, eddy heat transport, predictability (i.e. sensitivity of flow evolution to perturbations in the initial conditions), stirring of tracer fields, internal gravity waves and buoyant turbulence, among others. More recently, statistical mechanics has been successfully used to understand aspects of large-scale GFD. For example, the Robert-Sommeria-Miller (RSM) equilibrium statistical mechanics has been used to interpret rings and jets as statistical equilibria (Bouchet & Venaille 2012).

Statistical mechanics has also been successfully applied to numerical GFD, a subject of great interest for humanity not only for purely scientific reasons, but also for the large number of applications to real life.
A sharp increase in the ability to numerically simulate oceans and atmospheres, as well as in the interest in projecting current states into the future, has fueled important developments in numerical GFD.

The rest of this paper is organized as follows. Section 2 introduces the equations of geophysical fluid dynamics and some simplifications; some details of the simplified equations, like the non-linear stability of the steady state solution, are also presented. Section 3 develops the theory of statistical mechanics. We obtain a probability function with which we can predict the most likely state of the system describing a generalized flow. In section 4 we predict the most likely state of a basic geophysical flow. The first step applies the theory of section 3 to a Galerkin approximation of the simplified equations from section 2. The final step extends the result to a continuum by taking the limit to infinity of the truncated system. An appendix includes material for some of the concepts we use.

2 Foundations for Geophysical Fluid Dynamics

In this section we introduce the equations used in GFD, from which the simplified models used to make different problems tractable are derived. The main objective is to introduce the physical principles at play as well as their mathematical expressions. Further information and rigorous derivations can be found in the references given throughout this section. We start by defining what kind of fluid dynamics is considered to be GFD and making some comments on the domain. The different sections are based on the classical GFD book by Pedlosky (1987) as well as on Cushman-Roisin & Beckers (2011), Kundu & Cohen (2008), Majda (2003) and others mentioned in the text.

Geophysical fluid dynamics is the study of fluid dynamics on a rotating sphere (our planet is close enough to be represented as a sphere).
Rotation is relevant to the dynamics of fluid flow when a couple of inequalities constraining the time, length and velocity scales, $T$, $L$ and $U$ respectively, are met. Let $\Omega$ be the ambient rotation frequency defined as:

$$\Omega = \frac{2\pi \ \text{radians}}{\text{time of one revolution}}$$

The criterion for geophysical fluid dynamics is that the time scale of the motion satisfies:

$$T \geq \frac{1}{\Omega}$$

but to be able to account for the fact that the timescale is related to the lengthscale through the velocity of fluid motion, the following inequality is usually used instead:

$$\frac{U}{L} \leq \Omega \tag{1}$$

A natural coordinate system for the Earth is spherical coordinates. However, for many useful GFD applications the sphere may conveniently be represented locally by a tangent plane, such that near a fixed latitude $\phi_0$ the horizontal components of spherical coordinates $(\lambda, \phi)$ may be replaced by Cartesian coordinates $(x_1, x_2) = R_e(\lambda\cos\phi_0,\ \phi - \phi_0)$, where $R_e$ is the radius of the Earth. This approximation is no longer valid when a typical horizontal length scale for the motion we are studying approaches the radius of the Earth ($L \approx R_e$). The coordinate system, tangent to Earth, is rotating with an angular velocity $\boldsymbol{\Omega}$, which introduces the Coriolis parameter (or planetary vorticity). Locally, rotation is about the vertical axis and thus we define (see footnote 2)

$$f \equiv 2\boldsymbol{\Omega}\cdot\hat{\mathbf{k}} = 2\Omega\sin\phi$$

which can be represented locally in Cartesian coordinates (again in a tangent plane approximation) as

$$f \approx f_0 + \beta x_2 \tag{2}$$

where $x_2 = R_e(\phi - \phi_0)$, $f_0 = 2\Omega\sin\phi_0$ and

$$\beta \equiv \frac{1}{R_e}\left.\frac{df}{d\phi}\right|_{\phi_0} = \frac{2\Omega\cos\phi_0}{R_e}$$

These local Cartesian coordinates are known as the $\beta$-plane approximation. Typical values at mid-latitudes are $f_0 = 10^{-4}\ \mathrm{s^{-1}}$ and $\beta = 2\times 10^{-11}\ \mathrm{m^{-1}\,s^{-1}}$ (Samelson 2011). For a complete derivation of the Coriolis term due to a rotating frame of reference, the interested reader is referred to chapter 2 of Cushman-Roisin & Beckers (2011) and references therein.
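As a rough numerical check of the typical values quoted above, the following Python sketch evaluates $f_0$ and $\beta$ from their definitions. The rotation rate and Earth radius used here are standard approximate values assumed for illustration (they are not given in the text), and the function names are hypothetical.

```python
import math

# Assumed constants (not from the text): approximate rotation rate and radius of the Earth.
OMEGA = 7.292e-5    # rad/s
R_EARTH = 6.371e6   # m


def beta_plane(lat0_deg):
    """Return (f0, beta) for a beta-plane centred at latitude lat0_deg (degrees)."""
    phi0 = math.radians(lat0_deg)
    f0 = 2.0 * OMEGA * math.sin(phi0)              # f0 = 2 Omega sin(phi0)
    beta = 2.0 * OMEGA * math.cos(phi0) / R_EARTH  # beta = 2 Omega cos(phi0) / Re
    return f0, beta


def coriolis_beta_plane(lat0_deg, x2):
    """Linearised Coriolis parameter f ~ f0 + beta * x2, with x2 in metres."""
    f0, beta = beta_plane(lat0_deg)
    return f0 + beta * x2


def coriolis_exact(lat0_deg, x2):
    """Exact f = 2 Omega sin(phi), with phi = phi0 + x2 / Re."""
    phi = math.radians(lat0_deg) + x2 / R_EARTH
    return 2.0 * OMEGA * math.sin(phi)


f0, beta = beta_plane(45.0)
print(f"f0 = {f0:.3e} 1/s, beta = {beta:.3e} 1/(m s)")
```

At 45 degrees latitude this gives values of roughly $10^{-4}\ \mathrm{s^{-1}}$ and $1.6\times10^{-11}\ \mathrm{m^{-1}\,s^{-1}}$, in line with the orders of magnitude above, and the linearised and exact Coriolis parameters stay close for meridional displacements of about a hundred kilometers.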
Fluid dynamics can be studied from an Eulerian perspective, which treats the fluid as a field in which velocity and density are to be determined as functions of space and time, or from a Lagrangian perspective, where the fluid is a continuous field of particles whose trajectories, velocities and mass densities are to be determined at each point in their trajectory as a function of the particle's identity (often its initial position). Although we shall work using the Eulerian perspective in this paper, we will allude to quantities that are conserved following the movement of a weightless (Lagrangian) particle passively moving with the fluid (being advected). A quantity that is conserved following the motion is said to be materially conserved. For further reading on the two perspectives of fluid flow the interested reader is referred to Price (2006) and Bennet (2006).

Footnote 2: The symbol $\equiv$ is used throughout for definitions, so that $a \equiv b$ means $a$ is defined as the expression $b$.

2.1 Conservation of Mass

An important equation used in GFD is the differential form of conservation of mass when there are no sources or sinks:

$$\frac{\partial \rho}{\partial t} + \nabla\cdot(\rho\mathbf{u}) = 0 \tag{3}$$

The first term on the left hand side accounts for the local change of density with respect to time; it is balanced by the divergence of the mass flux. For an incompressible or nearly incompressible fluid (like air under regular atmospheric conditions, or even more so like water in the ocean) the contribution due to the change of density with respect to time is negligible, and (3) can be written:

$$\nabla\cdot\mathbf{u} = 0 \tag{4}$$

This does not necessarily imply that we are assuming $\rho$ is a constant. Rather, $\partial\rho/\partial t$ is given (as we will see) by (5), and it only vanishes identically when conductive or internal heating (see footnote 3) can be neglected, in other words when motion is adiabatic. The validity of (4) must be examined in each situation by systematic scaling arguments (Pedlosky 1986).
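To illustrate what the incompressibility condition (4) requires of a velocity field, here is a minimal Python sketch that estimates the horizontal divergence with central differences. The two test fields and all function names are hypothetical examples chosen for illustration; they do not come from the references.

```python
def divergence(velocity, x1, x2, h=1.0e-5):
    """Central-difference estimate of du1/dx1 + du2/dx2 at the point (x1, x2)."""
    du1_dx1 = (velocity(x1 + h, x2)[0] - velocity(x1 - h, x2)[0]) / (2.0 * h)
    du2_dx2 = (velocity(x1, x2 + h)[1] - velocity(x1, x2 - h)[1]) / (2.0 * h)
    return du1_dx1 + du2_dx2


def solid_body_rotation(x1, x2):
    """u = (-x2, x1): a divergence-free field, so it satisfies (4)."""
    return (-x2, x1)


def radial_outflow(x1, x2):
    """u = (x1, x2): divergence equal to 2 everywhere, so it violates (4)."""
    return (x1, x2)


div_rotation = divergence(solid_body_rotation, 0.3, -0.7)
div_outflow = divergence(radial_outflow, 0.3, -0.7)
print(div_rotation, div_outflow)
```

The rotating field passes the check because fluid is only carried around, while the outflow field fails it because it would require a mass source at every point.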
2.2 Boussinesq approximation

The basis for the Boussinesq approximation is noting that changes in density due to water pressure (compression), haline contraction and/or thermal expansion are very small compared to the mean value of water density (which is $O(1000\ \mathrm{kg/m^3})$; Vallis 2006). Consequently we can define density as

$$\tilde{\rho} \equiv \rho_0 + \bar{\rho}(x_3) + \rho'(\mathbf{x}, t) = \rho_0 + \rho(\mathbf{x}, t)$$

where $\rho_0$ is a background or mean density, $\bar{\rho}(x_3)$ is a background stratification, $\rho'$ is a density perturbation, so that $\rho = \bar{\rho} + \rho'$, and

$$|\bar{\rho}|,\ |\rho'|,\ |\rho| \ll \rho_0$$

The Boussinesq approximation, which applies to a vast number of natural phenomena, is a commonly used series of simplifications under which (3) reduces to (4). It can be briefly explained as an approximation where variations in density are neglected except where multiplied by gravity. More details can be found in Vallis (2006) or Kundu & Cohen (2008). We will be using the Boussinesq approximation throughout this paper.

2.3 Conservation of Energy

Using the notation of the previous section, we present another equation needed to model fluid motion, namely conservation of energy. In approximate form (Pedlosky 1986, Majda 2003) it can be written as:

$$\frac{\partial \rho'}{\partial t} + \mathbf{u}_H\cdot\nabla_H\rho' + u_3\left(\frac{\partial\rho'}{\partial x_3} + \frac{d\bar{\rho}}{dx_3}\right) = \kappa\Delta_H\rho' + \kappa\frac{d^2\bar{\rho}}{dx_3^2} \tag{5}$$

Here a vector or operator with subscript $H$ includes the first two dimensions only, $\kappa$ is the coefficient of thermal diffusivity and $\Delta \equiv \nabla\cdot\nabla$ represents the Laplace operator. It is important to notice that although they may seem similar, equations (3) and (5) describe two completely different physical principles.

2.4 Navier Stokes equations

We introduce the equations of motion for a fluid continuum. A suitable domain for these equations is $D\times[0,\infty)$, where $D\subset\mathbb{R}^n$, $n\in\{2,3\}$ (either two or three dimensional flow) and time $t\in[0,\infty)$. For a physical space rotating at a constant rate with angular velocity $\boldsymbol{\Omega}$ (e.g.
Earth) the Navier-Stokes equations of motion describing Newton's second law are:

$$\rho_0\left(\frac{D\mathbf{u}}{Dt} + 2\boldsymbol{\Omega}\times\mathbf{u}\right) = -\nabla p + \rho\nabla\phi + \mathbf{F}(\mathbf{u}) \tag{6}$$

The left hand side of (6) is the background density $\rho_0$ (mass per unit volume) multiplying:

1. the material (or total) derivative of the $n$-dimensional velocity field $\mathbf{u} = \mathbf{u}(\mathbf{x}, t)$, $\mathbf{x}\in\mathbb{R}^n$.
2. the Coriolis acceleration term $2\boldsymbol{\Omega}\times\mathbf{u}$ that accounts for the rotation of the coordinate system.

Footnote 3: The term for internal heating is not included in (5).

The material derivative is a linear operator given by

$$\frac{Df}{Dt} \equiv \frac{\partial f}{\partial t} + \mathbf{u}\cdot\nabla f \tag{7}$$

It is the time derivative of some $f = f(\mathbf{x}(t), t)$ taken while following the motion of $f$, which is passively transported with the fluid's motion. The second term of the total derivative, known as the advective term, comes from the chain rule, because the position vector at each time is given by a fluid element's path (trajectory) $\mathbf{x} = \mathbf{x}(t)$. Thus we use the chain rule and equate $dx_i/dt = u_i$ to obtain the expression for the material derivative.

A comment on the notation: while $f$ above is a scalar-valued function, in (6) we are applying the material derivative to a vector $\mathbf{u}$. We will limit ourselves to suggesting how it can be thought of: for the $i$-th component of the vector equation, we are applying the material derivative to the $i$-th component of $\mathbf{u}$, which indeed is a scalar. For a detailed clarification of the notation involving tensors, we refer the interested reader to the paper by DeCaria & Sikora (2010).

When $f = u_i$ as in (6), it is momentum that is being transported or advected. Notice that the advective term makes these equations non-linear. An interesting (well known) observation made by Pedlosky (1987) is that, in physical terms, the non-linearity usually implies interactions between motions of different length scales. Thus when numerically integrating (6) on a grid that does not resolve certain length scales (sub-grid motion), the interaction between sub-grid motion and resolved motion will not be explicitly accounted for.
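The way advection couples length scales can be seen in a one-dimensional toy example. This is an illustration under simplifying assumptions, not taken from the references: take a single-mode velocity $u = a\sin(kx)$, with hypothetical amplitude $a$ and wavenumber $k$, and evaluate the advective term.

```latex
u\,\frac{\partial u}{\partial x}
  = a\sin(kx)\cdot a k\cos(kx)
  = \frac{a^{2}k}{2}\,\sin(2kx)
```

The advective term therefore forces a component with wavenumber $2k$, that is, with half the original wavelength, even though the initial field contained only wavenumber $k$.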
Likewise, with any analytical treatment that includes the advection of momentum, we can expect motions with different length scales to interact and cause the solution to change. This needs to be taken into account if the flow is projected onto a finite number of modes, each with a characteristic length scale.

On the right hand side of (6) the first term is the pressure gradient force. The second accounts for any and all body forces, represented by a potential $\phi$ multiplying the fluid perturbation density $\rho = \rho(\mathbf{x}, t)$. Forces like gravity are represented through this term by defining $\tilde{\phi} = -g x_3$, where $g$ represents the Newtonian gravitational acceleration. When the physical space is rotating, it is common to combine the effect of gravity and that of the centrifugal force due to rotation into $\phi$; in this case $g$ is the effective gravity of the system. This is possible because the potential $\phi$ can be written as the sum of the gravitational and centrifugal potentials. An equipotential surface is then the surface perpendicular to $\hat{\mathbf{k}} = (0, 0, 1)$.

$\mathbf{F}$ represents any nonconservative force. Most importantly, it represents the frictional forces which are ultimately responsible for the dissipation of kinetic energy, draining momentum (that would otherwise be conserved) into molecular disorder (i.e. heat). Frictional forces prevent geophysical flows from accelerating without bound in the presence of a persistent external forcing (like solar heating). In large-scale flows however (here "large" is yet to be made unambiguous), molecular viscosity is often negligible in the force balance (especially away from boundaries). It is important to mention that friction at the ocean surface (boundary conditions) is often used as a forcing mechanism through which wind transmits momentum to the ocean. The exact expression for $\mathbf{F}$ will vary depending on the properties of the fluid and is given by what is known as a constitutive equation, i.e. the relation between stress and deformation in a continuum.
A fluid at rest only feels the influence of a normal stress which is isotropic (i.e. ambient pressure). A moving fluid feels additional components of stress due to viscosity. This last part, which is exclusively due to fluid motion, is a nonisotropic tensor called the deviatoric stress tensor, $\sigma \equiv \sigma_{ij}$, and it depends on the velocity gradient tensor $\partial u_i/\partial x_j$. Like any second-order tensor, the velocity gradient tensor can be written as the sum of an antisymmetric and a symmetric part. The antisymmetric part, given by $\frac{1}{2}\left(\partial u_i/\partial x_j - \partial u_j/\partial x_i\right)$, represents fluid rotation without deformation and does not, by itself, generate stress. For Newtonian fluids like water or air, the assumption (which works remarkably well) is that the deviatoric stress tensor is a linear function of the symmetric part. The symmetric part is known as the strain rate tensor $e_{ij}$ and it is given by:

$$e_{ij} = \frac{1}{2}\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i}\right)$$

The assumed linear relationship between the nonisotropic stress and the strain rate tensor can be written as:

$$\sigma_{ij} = \sum_{p=1}^{n}\sum_{q=1}^{n} K_{ijpq}\,e_{pq}$$

$\mathbf{F}$ is then given by:

$$F_i = \sum_{j=1}^{n}\frac{\partial\sigma_{ij}}{\partial x_j}$$

For an incompressible fluid, we can use (4) (i.e. $\sum_{i=1}^{n}\partial u_i/\partial x_i = 0$) to simplify this last expression, so that $\mathbf{F}$ can be written as:

$$F_i = \mu\sum_{j=1}^{n}\frac{\partial^2 u_i}{\partial x_j^2}$$

Here $\mu$ is the only constant that survives after assuming that $K_{ijpq}$ is an isotropic tensor and exploiting the consequential symmetry (see e.g. Kundu & Cohen page 101). Also, to be able to exchange the order of differentiation, we have assumed $\mathbf{u}\in C^2(D)$. Thus we write

$$\mathbf{F}(\mathbf{u}) = \mu\Delta\mathbf{u}$$

The ad hoc assumptions we have mentioned turn out to be a very accurate approximation for geophysical flows. For a physical intuition on the meaning of such viscous forces the interested reader is referred to section 3.4.4 of Price (2006). Although the form above is by far the most common dissipation operator, other forms of $\mathbf{F}$ are also used in GFD. Some examples can be found in section 1.1 of Majda and Wang (2006). We will denote a general dissipation operator by $\mathbf{F}$.
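The step from the linear stress-strain law to the Laplacian form can be sketched as follows. The sketch assumes the standard result that, for an isotropic $K_{ijpq}$ and an incompressible fluid, the constitutive relation reduces to $\sigma_{ij} = 2\mu e_{ij}$; this intermediate form is not written out in the text above, and the derivation is offered as an outline rather than a full proof.

```latex
\begin{align*}
\sigma_{ij} &= 2\mu\,e_{ij}
  = \mu\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i}\right), \\
F_i &= \sum_{j=1}^{n}\frac{\partial\sigma_{ij}}{\partial x_j}
  = \mu\sum_{j=1}^{n}\frac{\partial^{2}u_i}{\partial x_j^{2}}
  + \mu\,\frac{\partial}{\partial x_i}\sum_{j=1}^{n}\frac{\partial u_j}{\partial x_j}
  = \mu\sum_{j=1}^{n}\frac{\partial^{2}u_i}{\partial x_j^{2}}.
\end{align*}
```

The second term vanishes by (4), and pulling $\partial/\partial x_i$ outside the sum is where the assumption $\mathbf{u}\in C^2(D)$ is used.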
2.5 Euler equations

With $\mathbf{F} = 0$ and $\nabla\phi = \mathbf{g}$, where $\mathbf{g} = (0, 0, g)$ is the gravitational acceleration, equation (6) becomes a special case known as the Euler equations (on a rotating frame), which for future reference we will write as:

$$\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} + 2\boldsymbol{\Omega}\times\mathbf{u} = -\frac{1}{\rho_0}\nabla p + \frac{\rho}{\rho_0}\mathbf{g} \tag{8}$$

2.6 Buoyancy Frequency

If we take $d\bar{\rho}/dx_3$ in (5) to be a constant, and consider

$$\mathbf{u} = (0, 0, u_3(x_1, x_2, t)), \quad p = 0, \quad \rho' = \rho'(x_1, x_2, t)$$

as an elementary solution for equations (4), (5) and (6), we will find (Majda 2003) that both $u_3$ and $\rho'$ satisfy a simple harmonic oscillator equation (for the case $d\bar{\rho}/dx_3 < 0$) with frequency $N$, where:

$$N^2 \equiv -\frac{g}{\rho_0}\frac{d\bar{\rho}}{dx_3} \tag{9}$$

$N$ is the buoyancy frequency of the background stratification and has units of 1/time. A stable stratification happens when $d\bar{\rho}/dx_3 < 0$, which implies that lighter fluid is above heavier fluid. If heavier fluid is above lighter fluid ($d\bar{\rho}/dx_3 > 0$) the solution is unstable, and this is manifested in its most basic form as a solution with exponential growth (no longer a simple harmonic oscillator). Unstable solutions do not last long in the physical world.

2.7 Vorticity

The curl of the velocity field

$$\boldsymbol{\omega} \equiv \nabla\times\mathbf{u} \tag{10}$$

is a measure of the angular momentum of a small spherical particle of fluid about its center of mass. It is thus used as the basis upon which a conservation law may be sought in hopes of constraining fluid motion. Such a constraint facilitates the description and understanding of GFD.

2.7.1 Potential vorticity

The dynamics of geophysical flows can be studied through the equation of a scalar quantity derived from the principle of conservation of angular momentum. In particular, the equation of a scalar quantity derived from the vorticity is a prognostic equation from which we may know the evolution of all other variables for a prominent class of motion (e.g. geostrophy and quasi-geostrophy) which is contained in (8). This is the essential reason why potential vorticity (yet to be defined) is of central importance to our current understanding of GFD.
Similar vorticity laws play fundamental roles in other types of fluid dynamics for the same reasons. We will come back to finding an equation for a scalar quantity derived from the vorticity vector and how this equation relates to fluid flow. An excellent introduction to vorticity can be found at: http://web.mit.edu/hml/ncfmf.html

In GFD the curl of the fluid's velocity (10) is known as the relative vorticity. A physical system like our planet, rotating with angular velocity $\boldsymbol{\Omega}$, causes the fluid on it (like the oceans) to rotate as a solid body (after a spin-up period) with a uniform angular velocity, which gives the planetary velocity $\mathbf{u}_p \equiv \boldsymbol{\Omega}\times\mathbf{r}$, where $\mathbf{r}$ is a position vector for a fluid particle. The planetary vorticity is therefore twice the rate of rotation: $\boldsymbol{\omega}_p \equiv \nabla\times\mathbf{u}_p = 2\boldsymbol{\Omega}$. Thus in a rotating frame the absolute vorticity is the sum of the relative and planetary vorticity: $\boldsymbol{\omega}_a = \boldsymbol{\omega} + \boldsymbol{\omega}_p = \boldsymbol{\omega} + 2\boldsymbol{\Omega}$. An equation for $\boldsymbol{\omega}_a$ can be found by taking the curl of the momentum equation (8) (see e.g. Pedlosky 1986 section 2.4). This equation allows us to know how the vorticity vector evolves in time and space; for an incompressible fluid it has the form:

$$\frac{D\boldsymbol{\omega}_a}{Dt} = \boldsymbol{\omega}_a\cdot\nabla\mathbf{u} - \boldsymbol{\omega}_a\nabla\cdot\mathbf{u} + \frac{\nabla\rho\times\nabla p}{\rho^2}$$

Thus we see that, following fluid motion, vorticity can change due to (from left to right on the right hand side) vertical stretching/shrinking and twisting, changes of volume and, finally, baroclinicity (to be explained). However, as mentioned above, we would like to find a constraint on this evolution in hopes of facilitating our understanding of the flow; the above equation is really nothing more than a different way of writing (8). One such constraint is the potential vorticity theorem due to Ertel (1942; see also section 2.5 of Pedlosky 1986). The strength of Ertel's theorem is that it gives an equation by which all dependent variables and their evolution are completely determined for a number of important GFD simplifications. This equation is in terms of a scalar named potential vorticity.
Roughly, the vorticity is constrained by projecting it on a surface of constant density. Under the assumption of adiabatic motion, density is materially conserved ($D\rho/Dt = 0$), and projecting the absolute vorticity on a material density isosurface (see footnote 4) allows a generalization of Kelvin's theorem that may be applicable to oceans and atmospheres. In particular, Ertel's potential vorticity allows the existence of a baroclinic fluid (which is of central importance for many types of realistic simulations and is defined in the next subsection), while Kelvin's theorem requires the fluid to be barotropic. A general equation for potential vorticity in terms of any conserved quantity $\lambda$ (i.e. $D\lambda/Dt = 0$), not necessarily density although usually so, and assuming no frictional forces, is given by

$$\frac{D}{Dt}\left(\frac{\boldsymbol{\omega}_a}{\rho}\cdot\nabla\lambda\right) = \nabla\lambda\cdot\frac{\nabla\rho\times\nabla p}{\rho^3} \tag{11}$$

The term on the right hand side includes the baroclinicity vector, which is introduced in the next section. The quantity $\Pi \equiv (\boldsymbol{\omega}_a/\rho)\cdot\nabla\lambda$ is called the potential vorticity.

Footnote 4: A material isosurface is an isosurface that is transported with the flow; it may be deformed but it always remains an isosurface as long as material conservation holds.

2.7.2 Baroclinic vs barotropic fluid

Vorticity may be produced through baroclinicity. The physical principle is a torque induced by a mismatch between the pressure and density isosurfaces (i.e. a mismatch of the centers of gravity that causes rotation). Baroclinicity is measured through the baroclinic vector. A fluid is baroclinic when the baroclinic vector is different from zero, $\nabla\rho\times\nabla p \neq 0$, and it is considered barotropic when it equals zero. A barotropic fluid is two-dimensional with no variation in the vertical coordinate except possibly the variation found in the pressure field; this happens if and only if density is a function of the pressure: $\rho = \rho(p(x, y, z))$. Note that density may be everywhere constant as well, in which case the density gradient is zero and the following proof is trivial.
Proposition 2.1 Suppose $\nabla p, \nabla\rho \neq 0$. Then

$$\nabla\rho\times\nabla p = 0 \iff \rho = \rho(p(x, y, z))$$

Proof. ($\Leftarrow$) Suppose $\rho = \rho(p(x, y, z))$. Then $\nabla\rho = (\partial\rho/\partial p)\nabla p$, which implies $\nabla\rho\times\nabla p = (\partial\rho/\partial p)(\nabla p\times\nabla p) = 0$.

($\Rightarrow$) $\nabla\rho\times\nabla p = 0 \iff |\nabla\rho\times\nabla p| = 0$, and $|\nabla\rho\times\nabla p| = |\nabla\rho||\nabla p|\sin\theta$, which, since both gradients are nonzero, vanishes only if $\theta\in\{0, \pi\}$. We are considering the angle between a surface of constant density and a surface of constant pressure, thus the cases $\theta\in\{0, \pi\}$ are equivalent to $\theta = 0$, i.e. the isosurfaces are parallel. This in turn implies that we can write $\nabla\rho = f(x, y, z)\,(\nabla p/|\nabla p|)$ for some scalar-valued function $f$ that at each $(x, y, z)$ gives the magnitude of the vector $\nabla\rho$. We can redefine $f$ to include $|\nabla p|^{-1}$ (also scalar valued), so we arrive at $\nabla\rho = f(x, y, z)\nabla p$. Write the gradients as the exterior derivative (a 1-form), e.g. in the case of density: $\nabla\rho = d\rho = (\partial\rho/\partial x)\,dx + (\partial\rho/\partial y)\,dy + (\partial\rho/\partial z)\,dz$. Thus $f = d\rho/dp$ and $\nabla\rho = (d\rho/dp)\nabla p$; integrating both sides we get that $\rho = \rho(p(x, y, z))$.

2.8 Shallow water equations

A common and very useful approximation to the Euler equations is called the shallow water equations (SWE). The following two sections follow Pedlosky's (1986) derivations of the SWE closely but not exclusively. The results will be used to illustrate a time-proven two-dimensional model that can be used to simulate a variety of geophysical flows.

To describe our domain, let the vector $\mathbf{g}$ represent the body forces arising from the potential $\phi$ in (6); this of course includes (predominantly) gravitational acceleration. Let our space be $\mathbb{R}^3$. We will take the direction of $\mathbf{g}$ as our vertical axis, with unit vector $\hat{\mathbf{k}}$, so that this is indeed the rotation axis of the fluid; this means that the angular velocity simplifies to $\boldsymbol{\Omega} = \Omega\hat{\mathbf{k}}$. Let us define the Coriolis parameter $f \equiv 2\Omega$ which is, under the present circumstances, twice the angular speed. Let the distance from the plane $z = 0$ to the fluid's free surface be denoted by $h = h(x_1, x_2, t)$ and let the rigid bottom of our domain be given by the bathymetry (or bottom topography) $z = h_B(x_1, x_2)$.
The three dimensional velocity is given by $\mathbf{u} = (u_1, u_2, u_3)$. We will take the density to be homogeneous. Clearly (3) then reduces to (4) (see footnote 5).

Let $D$ be a typical scale for the depth of the water column; it could be taken to be the mean depth, for example. Let $L$ be a characteristic horizontal length scale for the motion. Then the meaning of "shallow" can be made precise with what Pedlosky calls the "fundamental parametric condition that characterizes shallow-water theory":

$$\delta = \frac{D}{L} \ll 1$$

Perhaps we still need a bit more precision: what exactly does "much less than one" mean? The answer is that $L$ should be at least 10 times bigger than $D$. The average depth of our planet's oceans is 4 km, so we should expect that the SWE could apply to motions of order 40 km and bigger. As Pedlosky points out, the vertical extent of the major currents in our oceans is usually much less than 4 km, yet their horizontal scale is often hundreds or thousands of kilometers. Atmospheric motions similarly have a very small aspect ratio $\delta$.

Footnote 5: The assumption of a constant density could be taken at first consideration as a serious drawback of the model: it will not be able to simulate stratified fluids, nor the stratification of fluids, as we usually encounter in oceans and atmosphere. However, it is possible to develop the theory for SWE under this assumption and then use a model of several, say $N$, shallow water layers stacked one above the other, each one with its own constant density. This allows stratified fluids to be simulated; two-layer models, for example, are often encountered. It is in fact possible to consider "continuously" stratified models by taking the limit $N\to\infty$ and solving the equations by separation, with the vertical structure given through normal modes (see e.g. Shimizu 2011). More advanced mathematical methods allow for density anomalies to develop across the water column for models that include diffusion (Gardiner-Garden 1991).

We can now look into what is big and what is small.
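The rule of thumb for shallowness can be sketched in a few lines of Python. The function names are hypothetical, and the default threshold of 0.1 simply encodes the "at least 10 times bigger" statement above; it is an illustrative choice rather than part of Pedlosky's formulation.

```python
def aspect_ratio(depth_m, length_m):
    """Aspect ratio delta = D / L for depth scale D and horizontal scale L (metres)."""
    return depth_m / length_m


def is_shallow(depth_m, length_m, threshold=0.1):
    """True when delta <= threshold, i.e. L is at least 1/threshold times D."""
    return aspect_ratio(depth_m, length_m) <= threshold


MEAN_OCEAN_DEPTH = 4000.0  # m, the average ocean depth quoted in the text

# Horizontal scales of 10 km, 40 km and 1000 km against a 4 km deep ocean.
for horizontal_scale in (10.0e3, 40.0e3, 1000.0e3):
    delta = aspect_ratio(MEAN_OCEAN_DEPTH, horizontal_scale)
    print(horizontal_scale, delta, is_shallow(MEAN_OCEAN_DEPTH, horizontal_scale))
```

Under these assumptions a 10 km motion fails the test, a 40 km motion sits exactly at the threshold, and a basin-scale current of 1000 km is comfortably shallow.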
Let us consider a characteristic magnitude $U$ for horizontal velocities and $W$ for vertical velocities, and let $T$ be a characteristic scale for time. The first two terms in (4) are then $O(U/L)$, so that $W$ needs to satisfy $W/D \leq O(U/L)$. So $W$ is bounded above by $O(\delta U)$. Typical horizontal velocities for the oceans are usually less than 0.5 m/s, and a very energetic current (like the Gulf Stream) is about 2 m/s, so we conclude that vertical motion is often negligible.

Having mentioned the above main assumptions, the interested reader is referred to section 3.3 of Pedlosky (1986) for a formal dimensional analysis of equation (6) that leads to the shallow water equations. Equivalently,

$$\frac{\partial p}{\partial x_3} = -\rho g + O(\delta^2) \tag{12}$$

also known as the hydrostatic approximation, can be taken as the definition of a shallow-water model. Integrating (12) (which is the vertical equation of motion) from an arbitrary depth $x_3$ to the ocean's surface $\tilde{h}$, and assuming that pressure at the surface is constant for all time, i.e. $p(x_1, x_2, \tilde{h}, t) = p_0$, we get that the pressure at any given point is equal to the weight of the water column above that point at that time:

$$p(x_1, x_2, x_3, t) = \rho g\left(\tilde{h} - x_3\right) + p_0 \tag{13}$$

where $\tilde{h} = \tilde{h}(x_1, x_2, t)$ is the height of the ocean's surface above the $x_3 = 0$ plane. The horizontal momentum equations then become:

$$\begin{aligned}
\frac{\partial u_1}{\partial t} + u_1\frac{\partial u_1}{\partial x_1} + u_2\frac{\partial u_1}{\partial x_2} - f u_2 &= -g\frac{\partial \tilde{h}}{\partial x_1} \\
\frac{\partial u_2}{\partial t} + u_1\frac{\partial u_2}{\partial x_1} + u_2\frac{\partial u_2}{\partial x_2} + f u_1 &= -g\frac{\partial \tilde{h}}{\partial x_2}
\end{aligned} \tag{14}$$

An important consequence of (13) is that the horizontal pressure gradient is independent of $x_3$: $(\partial p/\partial x_1, \partial p/\partial x_2) = \rho g\,(\partial\tilde{h}/\partial x_1, \partial\tilde{h}/\partial x_2)$. Thus horizontal accelerations are independent of $x_3$. This means that if the horizontal velocities are initially independent of $x_3$, then they will remain so for all time. We will assume such an initial condition in our work. If we then integrate (4) with respect to $x_3$ and consider the proper boundary conditions (see e.g.
Pedlosky section 3.3) our conservation of mass equation becomes:

$$\frac{DH}{Dt} + H\left(\frac{\partial u_1}{\partial x_1} + \frac{\partial u_2}{\partial x_2}\right) = 0 \tag{15}$$

where $H \equiv \tilde{h} - h_B$ has been defined, $\tilde{h}$ is the ocean's surface and $h_B = h_B(x_1, x_2)$ is the bathymetry (height of the ocean's floor above the plane $x_3 = 0$).

2.9 Conservation of potential vorticity in shallow water equations

Under the shallow water assumption (i.e. $(u_1, u_2)$ are independent of depth) and in $\mathbb{R}^3$ with its canonical basis $\{\hat{\mathbf{i}}, \hat{\mathbf{j}}, \hat{\mathbf{k}}\}$, the relative vorticity components are:

$$\begin{aligned}
\hat{\mathbf{i}} \cdot \boldsymbol{\omega} &= \frac{\partial u_3}{\partial x_2} \in O\!\left(\frac{W}{L}\right) = O\!\left(\delta\,\frac{U}{L}\right) \\
\hat{\mathbf{j}} \cdot \boldsymbol{\omega} &= -\frac{\partial u_3}{\partial x_1} \in O\!\left(\frac{W}{L}\right) = O\!\left(\delta\,\frac{U}{L}\right) \\
\hat{\mathbf{k}} \cdot \boldsymbol{\omega} &= \frac{\partial u_2}{\partial x_1} - \frac{\partial u_1}{\partial x_2} \in O\!\left(\frac{U}{L}\right)
\end{aligned}$$

Since by definition of shallow water $\delta \ll 1$ (in fact $\delta < 0.1$ is often satisfied), only the vertical component of the relative vorticity needs to be considered. So let

$$\zeta \equiv \hat{\mathbf{k}} \cdot \boldsymbol{\omega} = \frac{\partial u_2}{\partial x_1} - \frac{\partial u_1}{\partial x_2} \tag{16}$$

By cross-differentiating (14), i.e. differentiating the first equation with respect to $x_2$ and the second equation with respect to $x_1$, the fluid's free surface $\tilde{h}$ (assumed to be twice continuously differentiable) is eliminated and we get an equation for the time evolution of $\zeta$:

$$\frac{D\zeta}{Dt} = -(\zeta + f)\left(\frac{\partial u_1}{\partial x_1} + \frac{\partial u_2}{\partial x_2}\right) \tag{17}$$

Using (15) we can write (17) as

$$\frac{D\zeta}{Dt} = \frac{\zeta + f}{H}\,\frac{DH}{Dt} \tag{18}$$

So vorticity can increase due to column stretching ($DH/Dt > 0$) or decrease due to column shrinking ($DH/Dt < 0$). As we have $f$ constant (with respect to time) we can write (18) as:

$$\frac{D}{Dt}\left(\frac{\zeta + f}{H}\right) = 0 \tag{19}$$

so that $\Pi \equiv (\zeta + f)/H$ is conserved following fluid motion. Notice that if the depth increases (decreases) then the absolute vorticity $\zeta + f$ must also increase (decrease), so that $\Pi$ may remain constant for any given fluid parcel; this is the stretching mechanism described after (18). In the context of SWE (i.e. two-dimensional flow), $\Pi$ is called the potential vorticity.

It can be shown that if we define $\lambda \equiv (x_3 - h_B)/H$ and if the fluid is barotropic then (11) coincides with (19). In the case of (19) the conserved quantity $\lambda$ is the ratio between the relative vertical position of a fluid particle in the water column and the total depth; it is easy to show that it is indeed conserved following fluid motion (see e.g. Cushman-Roisin & Beckers 2011, page 215).

2.10 A streamfunction for two-dimensional flow

An incompressible two-dimensional flow can be expressed entirely in terms of a scalar-valued function known as the streamfunction. Following Samelson & Wiggins (2006), let $\mathbf{u} = (u_1, u_2)$ be the velocity and assume incompressibility for a homogeneous fluid: $\partial u_1/\partial x_1 + \partial u_2/\partial x_2 = 0$. By integrating the divergence over a surface and by using (the generalized) Stokes' theorem we can write:

$$\begin{aligned}
0 &= -\iint_R \left(\frac{\partial u_1}{\partial x_1} + \frac{\partial u_2}{\partial x_2}\right) dx_1\,dx_2 \\
&= -\oint_C \mathbf{u} \cdot \mathbf{n}\,ds \\
&= \oint_C u_2\,dx_1 - u_1\,dx_2
\end{aligned}$$

where $C$ is the boundary of region $R$, with outward normal vector $\mathbf{n}$. If the velocity field is time-dependent, $t$ is treated as a fixed variable and integration is done only with respect to the spatial variables. The integrand of the last line integral can be represented as the (spatial) differential of a scalar-valued function $\psi = \psi(x_1, x_2, t)$:

$$d\psi(x_1, x_2, t) = u_2(x_1, x_2, t)\,dx_1 - u_1(x_1, x_2, t)\,dx_2$$

Using the definition of the differential $d\psi$ (i.e. the differential of a 0-form $\psi$ is a 1-form $d\psi$):

$$d\psi = \frac{\partial\psi}{\partial x_1}(x_1, x_2, t)\,dx_1 + \frac{\partial\psi}{\partial x_2}(x_1, x_2, t)\,dx_2$$

we arrive at:

$$\begin{aligned}
\frac{\partial\psi}{\partial x_1}(x_1, x_2, t) &= u_2(x_1, x_2, t) \\
\frac{\partial\psi}{\partial x_2}(x_1, x_2, t) &= -u_1(x_1, x_2, t)
\end{aligned} \tag{20}$$

And so the streamfunction can be defined by

$$\nabla\psi \equiv \mathbf{u}^{\perp} \tag{21}$$

where $\mathbf{u}^{\perp} \equiv (u_2, -u_1)$. Because the line integral above is equal to zero, the sign of the streamfunction in (20) is defined by convention. It corresponds to flow following the streamline with higher streamfunction values to the right. It follows by using definition (16) that

$$\Delta\psi = \zeta \tag{22}$$

2.10.1 Solution to Poisson's equation

By solving Poisson's equation (22) in terms of the known vorticity $\zeta$, the velocity field is completely determined; it is therefore essential to be able to invert the Laplacian.
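Relations (20) and (22) can be checked numerically for any smooth test streamfunction. The sketch below is only an illustration under stated assumptions: the test function $\psi = \sin x_1 \cos x_2$, the helper names and the finite-difference step sizes are all hypothetical choices, not taken from Samelson & Wiggins (2006). For this $\psi$ the exact Laplacian is $-2\psi$.

```python
import math


def psi(x1, x2):
    """Assumed test streamfunction; its exact Laplacian is -2 * psi."""
    return math.sin(x1) * math.cos(x2)


def velocity(x1, x2, eps=1e-5):
    """Velocity from (20): u1 = -d(psi)/dx2, u2 = d(psi)/dx1 (central differences)."""
    dpsi_dx1 = (psi(x1 + eps, x2) - psi(x1 - eps, x2)) / (2.0 * eps)
    dpsi_dx2 = (psi(x1, x2 + eps) - psi(x1, x2 - eps)) / (2.0 * eps)
    return -dpsi_dx2, dpsi_dx1


def divergence_and_vorticity(x1, x2, eps=1e-4):
    """Return (du1/dx1 + du2/dx2, du2/dx1 - du1/dx2) by central differences."""
    u1_e, u2_e = velocity(x1 + eps, x2)
    u1_w, u2_w = velocity(x1 - eps, x2)
    u1_n, u2_n = velocity(x1, x2 + eps)
    u1_s, u2_s = velocity(x1, x2 - eps)
    div = (u1_e - u1_w) / (2.0 * eps) + (u2_n - u2_s) / (2.0 * eps)
    zeta = (u2_e - u2_w) / (2.0 * eps) - (u1_n - u1_s) / (2.0 * eps)
    return div, zeta


# Evaluate at an arbitrary point: divergence should vanish, vorticity should equal -2 psi.
div_value, zeta_value = divergence_and_vorticity(0.7, 0.3)
laplacian_exact = -2.0 * psi(0.7, 0.3)
```

Up to finite-difference error, `div_value` vanishes and `zeta_value` agrees with `laplacian_exact`, which is the content of (22) for this particular $\psi$.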
Existence and uniqueness for Poisson's equation is briefly treated in Appendix A. The fundamental solution of Poisson's equation in two dimensions is given by the Green function:

$$G(x, y) = \frac{1}{2\pi}\log\frac{1}{|x - y|} \tag{23}$$

where $x, y \in \mathbb{R}^2$, and $x$ is fixed.

2.11 Rossby number

The Rossby number compares the distance traveled horizontally by a fluid parcel during one revolution ($U/2\Omega$) with the length scale over which the parcel's motion takes place. Rotational effects are important when the former is less than the latter. It is an equivalent way of writing (1):

$$Ro = \frac{U}{2\Omega L}$$

Rotational effects are dynamically important when the Rossby number is of the order of unity or less. The Rossby number could also be thought of as a comparison of the advection term to the Coriolis acceleration. As a general rule the characteristics of geophysical flows vary greatly with the value of the Rossby number.

2.12 Quasi-geostrophy

GFD is heavily constrained by Earth's rotation; this is the case when the Rossby number is small and advection is negligible. The Coriolis acceleration $2\mathbf{\Omega} \times \mathbf{u}$ is a leading term in (6) or (8), and it is largely balanced by the pressure gradient $-\frac{1}{\rho_0}\nabla p$. This is called geostrophic balance and it is often the leading order balance found in GFD.

Clearly there are no accelerations in this balance, which implies that some other terms must likewise be relevant; otherwise, once geostrophic balance is achieved, fluid flow would be stationary for eternity, which we know is not the case. When the timescale is longer than about a day, geophysical flows are usually in a nearly geostrophic state but not identically so. An advantageous simplified dynamical formalism called quasi-geostrophy captures the leading geostrophic balance plus second order terms that allow large-scale flow to be emulated with great accuracy. The underlying idea is an asymptotic expansion retaining the two leading terms.
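The small Rossby number regime assumed here can be made concrete with the definition of section 2.11. The following is a minimal sketch: the function name is hypothetical, Earth's rotation rate is taken as approximately $7.2921 \times 10^{-5}$ rad/s, the 2 m/s speed is the Gulf Stream value quoted in section 2.8, and the 100 km width and the small-scale example are assumed illustrative scales.

```python
# Earth's angular speed in rad/s (approximate sidereal rotation rate).
OMEGA_EARTH = 7.2921e-5


def rossby_number(speed, length, omega=OMEGA_EARTH):
    """Ro = U / (2 * Omega * L), with U in m/s and L in m."""
    return speed / (2.0 * omega * length)


# Energetic current: U = 2 m/s over an assumed width of L = 100 km.
ro_current = rossby_number(2.0, 100.0e3)

# Small-scale flow: U = 0.1 m/s over an assumed L = 1 m.
ro_small_scale = rossby_number(0.1, 1.0)
```

With these assumed scales the current has a Rossby number of roughly 0.14, so rotation is dynamically important, while the small-scale flow has a Rossby number in the hundreds and is essentially unaffected by rotation.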
Better still, quasi-geostrophy can be studied completely in terms of a suitable potential vorticity equation. Thus we have a very good approximation to GFD which is amenable to analytical treatment. In particular, for a barotropic fluid the SWE can be written in terms of a suitable potential vorticity. A formal and rigorous derivation of the quasi-geostrophic equations (QGE) can be found in Majda (2003); it includes convergence of the SWE to the QGE. Further information can be found in Pedlosky (1988), Cushman-Roisin & Beckers (2011) and Vallis (2006). Here we will only state and introduce the barotropic quasi-geostrophic equations on a $\beta$-plane for later use in section 4.

2.13 Barotropic QGE

In this section we describe the equations we will be using for an analytical application of statistical mechanics to GFD. We will set the background density $\rho_0 = 1$, as is often done. There is no loss of generality in doing so. It will be convenient to write the velocity in terms of the streamfunction. Making sure we are consistent with (21), we can define the velocity as the orthogonal gradient of the streamfunction:

$$\mathbf{u} = \nabla^{\perp}\psi = \begin{pmatrix} -\partial\psi/\partial x_2 \\ \partial\psi/\partial x_1 \end{pmatrix}$$

Our domain is $[0, 2\pi] \times [0, 2\pi]$ and we will work with periodic boundary conditions in both spatial directions. Periodic boundary conditions are not physically unreasonable except if we want to simulate flow near a boundary. However, even in that case generalizations to a number of other geometries like a channel (periodic in one direction) or a closed basin are not difficult. Doubly-periodic boundary conditions allow us to avoid generation and dissipation of vorticity near boundaries; in terms of the Navier-Stokes equations we are comfortably allowed to set $\mathbf{F} = 0$. We will include Earth's rotation by using a $\beta$-plane, but since (2) is not a periodic function, a large-scale mean flow needs to be introduced to overcome the difficulty (as we will see in the next section).
More precisely, we introduce a non-periodic stream function $\psi$ consisting of a non-periodic large-scale mean flow and a small-scale periodic component $\psi'$:

$$\psi = -V(t)\,x_2 + \psi' \tag{24}$$

Thus the small-scale component of our stream function satisfies $\psi'(x_1 + 2\pi, x_2, t) = \psi'(x_1, x_2 + 2\pi, t) = \psi'(x_1, x_2, t)$. The velocity field is given by:

$$\mathbf{u} = \nabla^{\perp}\psi = \begin{pmatrix} V(t) \\ 0 \end{pmatrix} + \nabla^{\perp}\psi' \tag{25}$$

The total energy, which is the squared $L^2$ norm of the velocity (divided by two), is given by

$$E \equiv \frac{1}{2}\int |\mathbf{u}|^2\,dx = \frac{(2\pi)^2}{2}V^2(t) + \frac{1}{2}\int |\nabla\psi'|^2\,dx \tag{26}$$

where the cross term between the mean flow and $\nabla^{\perp}\psi'$ integrates to zero by periodicity. Moving on, we arrive at the potential vorticity, which for our case will include the relative and planetary vorticity (as explained above) but also a topographic term $h = h(x_1, x_2)$ usually called the bottom topography; this term is equal to the fractional change in layer thickness divided by the Rossby number. Thus, our potential vorticity is given by:

$$q = \Delta\psi + \beta x_2 + h \tag{27}$$

This form of potential vorticity arises from the asymptotic expansion of the SW potential vorticity (19); the details can be found in Majda (2003). The potential vorticity is also split into a small-scale component ($q'$) and a large-scale component ($\beta x_2$):

$$q = q' + \beta x_2, \qquad q' = \Delta\psi' + h$$

The relative vorticity is given in terms of the small-scale stream function:

$$\zeta = \Delta\psi'$$

The conservation of potential vorticity is expressed as:

$$\frac{\partial q}{\partial t} + J(\psi, q) = 0 \tag{28}$$

where $J(\cdot, \cdot)$ is the Jacobian determinant; in our case we can write $J(\psi, q) = \nabla^{\perp}\psi \cdot \nabla q$. We introduce the following notation for the normalized integral over a domain $D$ with measure $\lambda$:

$$\fint_D \,\cdot\; d\lambda \equiv \frac{1}{\lambda(D)}\int_D \,\cdot\; d\lambda$$

Finally we will include a term by which the small-scale motion and the large-scale motion will interact: topographic stress.

$$\frac{d}{dt}V(t) = -\fint \frac{\partial h}{\partial x_1}\,\psi' \tag{29}$$

This term comes from the conservation of energy law.
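The total energy is given by the sum of large-scale and small-scale contributions:

$$E_{Total} = E_{Large} + E_{Small}$$

and if we assume conservation of energy we can write the above expression as equal to a constant:

$$E_{Total} = \frac{1}{2}(2\pi)^2 V^2(t) + \frac{1}{2}\int |\nabla\psi'|^2 = K$$

which means

$$\frac{dE_{Total}}{dt} = (2\pi)^2 V(t)\frac{dV(t)}{dt} + V(t)\int \frac{\partial h}{\partial x_1}\,\psi' = 0$$

from which (29) follows after dividing by $(2\pi)^2 V(t)$. To see why the time derivative of the small-scale energy is equal to $V(t)\int \frac{\partial h}{\partial x_1}\psi'$, first note that:

$$E_{Small} = \frac{1}{2}\int |\nabla\psi'|^2 = \frac{1}{2}\int \nabla\psi' \cdot \nabla\psi' = -\frac{1}{2}\int \psi'\,\Delta\psi'$$

Since $h$ and $\beta x_2$ do not depend on time, $\partial q/\partial t = \partial(\Delta\psi')/\partial t$, and both factors in $\psi'\Delta\psi'$ contribute equally to the time derivative. We can then use (28) to write

$$\begin{aligned}
\frac{dE_{Small}}{dt} &= -\int \psi'\,\frac{\partial q}{\partial t} \\
&= \underbrace{\int \psi'\,\nabla^{\perp}\psi' \cdot \nabla q}_{=\,-\int q\,\nabla\cdot\left(\psi'\nabla^{\perp}\psi'\right)\,=\,0} + V(t)\int \psi'\,\frac{\partial q}{\partial x_1} \\
&= V(t)\underbrace{\int \psi'\,\frac{\partial\,\Delta\psi'}{\partial x_1}}_{=\,\int \frac{\partial}{\partial x_1}\left(\frac{1}{2}|\nabla\psi'|^2\right)\,=\,0} + V(t)\int \frac{\partial h}{\partial x_1}\,\psi'
\end{aligned}$$

Intermediate steps of the integrations by parts are indicated with underbraces. It is thus that we arrive at

$$\frac{dE_{Small}}{dt} = V(t)\int \frac{\partial h}{\partial x_1}\,\psi'$$

To see why this term is called topographic stress, the integral in (29) can be written (integration by parts over periodic functions) as:

$$\fint \frac{\partial h}{\partial x_1}\,\psi' = -\fint h\,\frac{\partial\psi'}{\partial x_1}$$

which helps us visualize the reason for such a name: $\partial\psi'/\partial x_1$ is proportional to the pressure gradient, so that $(\partial\psi'/\partial x_1)\,h$ can be named topographic stress: a linear representation of stress for the geostrophic velocity with the proportionality coefficient given by the bottom topography $h$. Thus, our set of equations is:

$$\begin{aligned}
&\frac{\partial q}{\partial t} + J(\psi, q) = 0 \\
&\frac{d}{dt}V(t) = -\fint \frac{\partial h}{\partial x_1}\,\psi' \\
&q = \Delta\psi' + h + \beta x_2, \qquad \psi = -V(t)\,x_2 + \psi'
\end{aligned} \tag{30}$$

The periodic functions are $h$, $\zeta$ and $\psi'$. The set of equations (30) is close to being the simplest set of equations capable of meaningfully describing geophysical flows; we will use it as a test case for the predictions of statistical mechanics. In particular we are interested in simulating quasi-geostrophic turbulence, a common feature in GFD that forms long-lasting quasi-stationary structures such as long-lived eddies.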
This means we need to verify that in some sense our equations are stable; we make this precise in the following subsection. QG turbulence is a close relative of two-dimensional turbulence (it conserves energy and enstrophy) but it is more realistic in that it allows for the effects of vortex stretching and a varying Coriolis parameter. It could also allow for vertical stratification, which is not pursued in this paper.

2.14 A basic solution to the barotropic QGE

From here on we will assume that the solutions (and their perturbations, of course) live in the Sobolev space $H^2$. We will also assume that any function with a Fourier expansion is at least piecewise $C^1$; whenever we need to exchange the order of differentiation, functions are assumed to be in $C^2$.

Consider the term $J(\psi, q)$: if $q = q(\psi)$ then the Jacobian determinant vanishes (this statement is proven in Proposition 2.1 under a different notation). A common ansatz is a linear relationship $q = \mu\psi$. It turns out this is often a good approximation (see e.g. Merryfield et al. 2001, and references therein). Under these conditions the QG equation becomes:

$$\mu\frac{\partial\psi}{\partial t} = 0, \qquad \mu\psi = \Delta\psi' + \beta x_2 + h(x) \tag{31}$$

The beta-plane effect is eliminated from (31) by assuming a large-scale mean flow $V_0 = -\beta/\mu$, so that $\psi = -V_0 x_2 + \psi'$ where

$$\mu\psi' = \Delta\psi' + h(x) \tag{32}$$

If there were no topographic effects, then $\mu$ would need to be an eigenvalue of the Laplacian with associated eigenfunctions given by the small-scale stream function $\psi'$. But with topography (32) is solvable only if $\mu$ is not an eigenvalue of the Laplacian, as we will verify through Fourier analysis. Assume for simplicity that $h$ and $\psi'$ are periodic and have zero mean, with their Fourier expansions given by:

$$h = \sum_{|k| \neq 0} \hat{h}_k\,e^{ik\cdot x} + \text{c.c.}, \qquad \psi' = \sum_{|k| \neq 0} \hat{\psi}'_k\,e^{ik\cdot x} + \text{c.c.} \tag{33}$$

where c.c. stands for complex conjugate (reality condition). Introducing (33) into (32) we get:

$$\left(\mu + |k|^2\right)\hat{\psi}'_k = \hat{h}_k \tag{34}$$

Clearly, to solve this we need $\mu \neq -|k|^2$.
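Relation (34) can be sanity-checked on a topography made of a single Fourier mode. The sketch below is an illustration under assumptions not in the references: the topography $h = a\cos(k\cdot x)$, the helper names, the wavenumber $k = (1, 2)$ and the finite-difference Laplacian are all hypothetical choices. For this $h$, (34) gives $\psi' = a\cos(k\cdot x)/(\mu + |k|^2)$, and the residual of (32) should vanish up to discretization error.

```python
import math


def psi_hat(h_hat, k, mu):
    """Solve (34) for one wavenumber k = (k1, k2): psi_hat = h_hat / (mu + |k|^2)."""
    k_sq = k[0] ** 2 + k[1] ** 2
    if k_sq == 0:
        raise ValueError("the zero wavenumber is excluded (zero-mean fields)")
    if mu + k_sq == 0:
        raise ValueError("mu = -|k|^2 is an eigenvalue of the Laplacian; (34) is not solvable")
    return h_hat / (mu + k_sq)


def residual_of_32(x1, x2, amplitude, k, mu, eps=1e-3):
    """Residual mu*psi' - Laplacian(psi') - h for h = amplitude * cos(k . x)."""
    coeff = psi_hat(amplitude, k, mu)

    def psi_small(y1, y2):
        return coeff * math.cos(k[0] * y1 + k[1] * y2)

    topography = amplitude * math.cos(k[0] * x1 + k[1] * x2)
    # Five-point finite-difference Laplacian (second-order accurate).
    laplacian = (
        psi_small(x1 + eps, x2) + psi_small(x1 - eps, x2)
        + psi_small(x1, x2 + eps) + psi_small(x1, x2 - eps)
        - 4.0 * psi_small(x1, x2)
    ) / eps ** 2
    return mu * psi_small(x1, x2) - laplacian - topography


# Assumed parameters: unit-amplitude topography, k = (1, 2), mu = 0.5.
residual = residual_of_32(0.4, 1.1, 1.0, (1, 2), 0.5)
```

The guard in `psi_hat` mirrors the solvability condition: when $\mu = -|k|^2$ the coefficient is undefined, which is why $\mu$ must not be an eigenvalue of the Laplacian when topography is present.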
From (34) we can get an expression for the stream function and thus for the velocity and relative vorticity as well:

$$\bar{\psi} = -V_0\,x_2 + \psi' = \frac{\beta}{\mu}\,x_2 + \sum_{|k| \neq 0} \frac{1}{|k|^2 + \mu}\,\hat{h}_k\,e^{ik\cdot x} \tag{35}$$

$$\bar{\mathbf{u}} = \nabla^{\perp}\bar{\psi} = \begin{pmatrix} -\beta/\mu \\ 0 \end{pmatrix} + \sum_{|k| \neq 0} \frac{1}{|k|^2 + \mu}\begin{pmatrix} -k_2 \\ k_1 \end{pmatrix}\hat{h}_k\,i\,e^{ik\cdot x} \tag{36}$$

$$\bar{\zeta} = \Delta\bar{\psi} = -\sum_{|k| \neq 0} \frac{|k|^2}{|k|^2 + \mu}\,\hat{h}_k\,e^{ik\cdot x} \tag{37}$$

To make it unambiguous in future references that this is a steady-state solution, we will use the overbar notation $\bar{V} \equiv V_0$ and $\bar{\psi} \equiv \psi$, and in general $\bar{q} \equiv q$. We now return to the stability of the steady-state solution.

2.15 Non-linear stability of the barotropic QGE

Non-linear stability analysis uses the full equation (without linearizing) by bounding perturbations and considering the evolution of small, finite perturbations. The intuitive idea is that a steady-state solution is non-linearly stable if, for small initial perturbations, the resulting perturbations remain small for all time. We will use the standard definition of inner product under the following notation:

$$\langle f g \rangle \equiv \fint f g\,dx$$

and we will measure perturbations using the $L^2$ norm:

$$\|(q, V)\|_0^2 = \|q\|_0^2 + \|V\|_0^2 = \fint |q|^2 + |V|^2$$

where the zero subscript refers to the above inner product. Assume that a steady-state solution to (30) is given by $\bar{q} = \bar{q}(x_1, x_2)$ and by $V(t) = \bar{V}$.

Definition 1 The steady-state solution $(\bar{q}, \bar{V})$ of the barotropic quasi-geostrophic equations (30) is non-linearly stable in the $L^2$ sense if there are constants $C > 0$ and $R > 0$ such that for any initial perturbation $\delta q_0$ of $\bar{q}$ and any initial perturbation $\delta V_0$ of $\bar{V}$ given by:

$$q|_{t=0} = \bar{q} + \delta q_0, \qquad V|_{t=0} = \bar{V} + \delta V_0$$

with $\|(\delta q_0, \delta V_0)\|_0 \leq R$, the resulting perturbed solution $(q(x_1, x_2, t), V(t))$ of (30) given by:

$$q(x_1, x_2, t) = \bar{q} + \delta q(x_1, x_2, t), \qquad V(t) = \bar{V} + \delta V(t)$$

has perturbations that satisfy:

$$\|(\delta q(t), \delta V(t))\|_0 \leq C\,\|(\delta q_0, \delta V_0)\|_0$$

for any time $t > 0$.
We will also need the following definition:

Definition 2 A non-linear functional $W(\delta q, \delta V)$ is locally positive definite (with respect to the $L^2$ norm) provided there are $R_0 > 0$ and $C \geq 1$ such that if $\|(\delta q, \delta V)\|_0 \leq R_0$ then

$$C^{-1}\|(\delta q, \delta V)\|_0^2 \leq W(\delta q, \delta V) \leq C\,\|(\delta q, \delta V)\|_0^2 \tag{38}$$

If we can construct such a conserved functional then we can prove that the steady state is non-linearly stable. In particular,

Proposition 2.2 Assume that $W(\delta q, \delta V)$ is a conserved functional for the quasi-geostrophic equations (30) with the property that it is locally positive definite. Then the steady state $(\bar{q}(x_1, x_2), \bar{V})$ is non-linearly stable.

Proof Assume that the initial perturbation satisfies $\|(\delta q_0, \delta V_0)\|_0 \leq R = \frac{R_0}{2C}$. Suppose that the future perturbation $(\delta q(t), \delta V(t))$ satisfies $\|(\delta q(t), \delta V(t))\|_0 \leq R_0$ for all $t \leq T$. Since $W(\delta q, \delta V)$ is a conserved quantity for the flow and it satisfies the inequalities in (38), we then have that:

$$C^{-1}\|(\delta q(t), \delta V(t))\|_0^2 \leq W(\delta q(t), \delta V(t)) = W(\delta q_0, \delta V_0) \leq C\,\|(\delta q_0, \delta V_0)\|_0^2$$

from which we get that

$$\|(\delta q(t), \delta V(t))\|_0 \leq C\,\|(\delta q_0, \delta V_0)\|_0 \leq R_0/2$$

We conclude that the hypothesis of (38) holds for all time and hence the steady state $(\bar{q}(x_1, x_2), \bar{V})$ is non-linearly stable. To see that it holds for all time, suppose that at some time $t \leq T$ we had $\|(\delta q(t), \delta V(t))\|_0 = R_0$; this is a contradiction since it follows that $\|(\delta q(t), \delta V(t))\|_0 \leq C\,\|(\delta q_0, \delta V_0)\|_0 \leq R_0/2$. We can then choose a bigger $T$ and repeat the same reasoning ad infinitum.

The above proposition shows that to prove non-linear stability, it is enough to produce a conserved functional $W(\delta q, \delta V)$ satisfying (38). At this point we will assume that the total energy:

$$E(q(t), V(t)) \equiv \frac{1}{2}V^2(t) + \frac{1}{2}\fint \left|\nabla^{\perp}\psi'\right|^2 = \frac{1}{2}V^2(t) - \frac{1}{2}\fint \psi'\zeta \tag{39}$$

(where integration by parts of a periodic function has been used in the last part), as well as the large-scale enstrophy:

$$\mathcal{E}(q(t), V(t)) \equiv \beta V(t) + \frac{1}{2}\fint |q'|^2 \tag{40}$$

are two conserved quantities.
For now we will assume they are conserved; when we use this fact in our GFD application we will actually prove that this assertion holds. It then becomes natural to try combining these two conserved quantities to construct the needed conserved functional $W$ for the perturbation $(\delta q(t), \delta V(t))$.

In section 2.14 we found a steady-state solution. Considering that in the northern hemisphere $\beta > 0$, and considering that the large-scale mean velocity was $(\bar{V}, 0) = (-\beta/\mu, 0)$, we can see that $\mu > 0$ corresponds to a mean westward flow. We will now prove that for westward mean large-scale flow the steady-state solution is non-linearly stable.

Theorem 2.1 The steady-state solution $(\bar{q}, \bar{V})$ given in section 2.14 is a non-linearly stable steady-state solution of the quasi-geostrophic equations (30) provided that $0 < \mu < \infty$.

Proof Towards constructing the functional $W$, we will expand the conserved quantities about a mean state; let $q = \bar{q} + \delta q$ and $V = \bar{V} + \delta V$. Then the energy in (39) has the expansion:

$$E(\bar{q} + \delta q, \bar{V} + \delta V) = E(\bar{q}, \bar{V}) + \bar{V}\,\delta V - \fint \bar{\psi}\,\delta q + \frac{1}{2}\delta V^2 - \frac{1}{2}\fint \delta q\,\Delta^{-1}\delta q \tag{41}$$

and the enstrophy has the expansion:

$$\mathcal{E}(\bar{q} + \delta q, \bar{V} + \delta V) = \mathcal{E}(\bar{q}, \bar{V}) + \beta\,\delta V + \fint \bar{q}\,\delta q + \frac{1}{2}\fint |\delta q|^2 \tag{42}$$

We will take a linear combination of the energy and enstrophy expansions to form the functional $W$; notice that we will get rid of the linear terms in the process:

$$\begin{aligned}
W_\mu(\delta q, \delta V) &= \left[\mathcal{E}(\bar{q} + \delta q, \bar{V} + \delta V) - \mathcal{E}(\bar{q}, \bar{V})\right] + \mu\left[E(\bar{q} + \delta q, \bar{V} + \delta V) - E(\bar{q}, \bar{V})\right] \\
&= \left(\beta + \mu\bar{V}\right)\delta V + \fint \left(\bar{q} - \mu\bar{\psi}\right)\delta q + \frac{\mu}{2}\delta V^2 + \frac{1}{2}\fint |\delta q|^2 - \frac{\mu}{2}\fint \delta q\,\Delta^{-1}\delta q
\end{aligned} \tag{43}$$

Since the steady state satisfies $\bar{q} = \mu\bar{\psi}$ and $\bar{V} = -\beta/\mu$, we have that (43) reduces to:

$$W_\mu(\delta q, \delta V) = \frac{\mu}{2}\delta V^2 + \frac{1}{2}\fint |\delta q|^2 - \frac{\mu}{2}\fint \delta q\,\Delta^{-1}\delta q \tag{44}$$

To verify that the functional $W_\mu$ is positive definite we use the following Fourier expansion:

$$\delta q = \sum_{k \neq 0} \widehat{\delta q}_k\,e^{ik\cdot x} \tag{45}$$

Since $\Delta^{-1}$ acts on each Fourier mode as multiplication by $-1/|k|^2$, Parseval's identity$^6$ can be used to represent the functional as:

$$W_\mu(\delta q, \delta V) = \frac{\mu}{2}\delta V^2 + \frac{1}{2}\sum_{k \neq 0}\left(1 + \frac{\mu}{|k|^2}\right)\left|\widehat{\delta q}_k\right|^2 \tag{46}$$

$^6$ The infinite sum of the squares of the Fourier coefficients of a function equals the square of the $L^2$ norm of the same function. It is a generalization to a Hilbert space of the Pythagorean theorem, which states that the sum of the squared components of a vector equals the squared norm of the vector.

We are looking for a constant $C_\mu$ that depends on $\mu$ such that the following two inequalities hold:

$$C_\mu^{-1} \leq \frac{1}{2}\left(1 + \frac{\mu}{s}\right) \leq C_\mu, \qquad C_\mu^{-1} \leq \frac{\mu}{2} \leq C_\mu \tag{47}$$

where $1 \leq s < \infty$ and $0 < \mu < \infty$. Then $C_\mu \geq 1$ can be given by:

$$C_\mu = \begin{cases} 2/\mu & \text{if } 0 < \mu < 1 \\ \max\{(1 + \mu)/2,\ 2\} & \text{if } 1 \leq \mu < \infty \end{cases} \tag{48}$$

The fact that $C_\mu \to \infty$ as $\mu \to 0$ is consistent with the fact that the coefficient of the mean flow perturbation goes to zero as well. With no mean flow the analysis is different from what is shown here; more details can be found in section 4.2 of Majda & Wang (2003). Plugging (47) back into (46) gives the desired result:

$$C_\mu^{-1}\|(\delta q, \delta V)\|_0^2 \leq W_\mu(\delta q, \delta V) \leq C_\mu\,\|(\delta q, \delta V)\|_0^2$$

which proves that for $0 < \mu < \infty$ the steady-state solution of the QG equations (30) is non-linearly stable.

3 Foundations of Statistical mechanics

In the context of classical mechanics, the state of a system is completely described by a point in phase space, the space formed by the coordinates (i.e. where the system is) and the momenta associated with those coordinates (i.e. where the system is heading). An important concept for statistical mechanics is that there cannot be any loss of information in phase space. Given a position at a certain time we should be able to know where the particle was at a prior time; this is essentially the Liouville property. Qualitatively, trajectories do not merge, do not cross and do not approach each other asymptotically. If there is stretching in one direction there must be compression in another to compensate; we equivalently say that the flow is incompressible.
This is a crucial property that will allow an initial probability distribution $P_0(x)$ in phase space to be carried to a probability distribution $P(x, t)$ at any future time (as will be shown). The Liouville property thus allows a probability measure over phase space at any given time, with which we can define an information-theoretic entropy. Another necessary condition for the statistical mechanics approach to work is the existence of conserved quantities, which impose constraints on the probability distribution. With the Liouville property plus the existence of conserved quantities we can invoke the maximum entropy principle to obtain the most likely probability distribution on phase space, i.e. the least biased probability or Gibbs measure. It can be further proved that the Gibbs measure is an invariant measure of the flow map. Thus statistical solutions like a mean flow and fluctuations about that mean can be estimated by using the most likely probability distribution.

The fact that we can find the statistical equilibrium state (most probable state) does not necessarily imply that this state will be reached. Additionally, we need our system to be non-linear, so that a healthy amount of bending and twisting of trajectories can be achieved. A chaotic system will help ensure ergodicity; strong mixing makes ergodicity more likely. Proving the convergence to statistical equilibrium is an open problem in mathematics and will not be pursued, but it is good to know that numerical simulations as well as experience confirm the statistical mechanics approach as a useful tool in predicting the most likely mean state of many a system as well as its fluctuations about the mean. In the next subsection we present an argument showing that ergodicity is indeed feasible.

But how does this relate to Geophysical Fluid Dynamics? Physical space is called the configuration space.
The properties of the phase space flow field that have just been mentioned are similar to those found in the configuration space of many types of ocean flows which, in general and at large scales, tend to be chaotic, quite nearly incompressible and nearly two-dimensional. These geophysical turbulent flows tend to form large-scale coherent$^7$ structures (perhaps most notably eddies). The robustness and ubiquity of eddies (see e.g. Chelton 2007, 2011) suggest that the large-scale coherent flow does not depend on the fine details of the dynamics (in a sense we will not make precise!). As their name suggests, these structures last for relatively long periods of time,$^8$ hence the importance that the equations we use to simulate the flow are stable in the sense of section 2.15. Under these circumstances, using statistical mechanics to solve GFD problems becomes a natural alternative, the ultimate goal being the prediction of these large-scale coherent features based on the bulk properties retained in a theoretical formulation that avoids the tremendous number of degrees of freedom inherent to the original problem.

Onsager was the first to use statistical mechanics to explain two-dimensional turbulence, in 1949. A review of Onsager's contributions to turbulence theory can be found in Eyink & Sreenivasan (2006). The Robert-Sommeria-Miller equilibrium statistical mechanics explains spontaneous organization of unforced, undissipated geophysical flows. An excellent as well as relevant introduction to quasi-geostrophic turbulence can be found in Salmon (1998, Chapter 6); it includes references to a number of reviews on the topic. For further reading on applications of statistical mechanics to GFD the books by Bouchet and Venaille (2011 and 2012) are recommended.

In this section we introduce the mathematical foundations of statistical theories for geophysical flows following Majda and Wang (2006). We let our phase space be $\mathbb{R}^N$. We begin by revisiting the claim that mixing$^9$ causes ergodicity.
$^7$ Coherent in GFD usually refers to a spatio-temporal structure persisting during the evolution of a system.

$^8$ In GFD terms: long when compared to the eddy's turn-around time.

$^9$ Mixing is meant in the probability sense of the word. In the context of GFD mixing has a different meaning; under the current context and in GFD terminology it should be thought of as stirring.

3.1 Mixing leads to ergodicity

It was claimed in the introduction to this section that mixing makes ergodicity more likely; in this subsection we make this statement precise. The reader not familiar with the problem of ergodicity is encouraged to read Appendix C first. For the purpose of outlining a feasibility argument, we need the following mathematical definitions (Sarig 2006 and Durret 1996):

Definition 3 The orbit of a transformation $\Phi$, denoted by $\{\Phi^n(X)\}_{n \in \mathbb{Z}}$, is a record of the time evolution of a system originally in state $X$.

Remark: Here we used discrete time; it is straightforward to extend these concepts to their continuous counterpart.

Definition 4 A measurable set $A \in \mathcal{B}$ is called an invariant set if $\Phi^{-1}(A) = A$.

Definition 5 Let $(D, \mathcal{B}, P)$ be a probability space. A quartet $(D, \mathcal{B}, P, \Phi)$ is called a measure preserving transformation if

1. $\Phi$ is measurable: $A \in \mathcal{B} \Rightarrow \Phi^{-1}A \in \mathcal{B}$, and
2. $P$ is an invariant probability measure, i.e. $P\left(\Phi^{-1}A\right) = P(A),\ \forall A \in \mathcal{B}$.

Definition 6 A measure preserving transformation $\Phi$ on a probability space $(D, \mathcal{B}, P)$ is called mixing if for all measurable sets $A$ and $B$:

$$\lim_{n \to \infty} P\left(A \cap \Phi^{-n}B\right) = P(A)\,P(B) \tag{49}$$

Definition 7 A measure preserving transformation $(D, \mathcal{B}, P, \Phi)$ is called ergodic if every invariant set $A$ satisfies $P(A) = 0$ or $P(D \setminus A) = 0$. We say $P$ is an ergodic probability measure.

Proposition 3.1 Mixing implies ergodicity.

Proof Take $B = A$ to be an invariant set in (49). Since $\Phi^{-n}A = A$ for every $n$, we get that $P(A) = P(A)P(A)$. This can only hold if $P(A) \in \{0, 1\}$, i.e. $P$ is ergodic.
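The mixing condition (49) can be observed numerically on a standard textbook example. The sketch below is an illustration only and is not taken from the cited references: it assumes the doubling map $\Phi(x) = 2x \bmod 1$ on $[0, 1)$ with Lebesgue measure, which is a well-known mixing transformation, and it estimates $P(A \cap \Phi^{-n}B)$ on a uniform grid of initial conditions. The sets $A = [0, 0.5)$ and $B = [0, 0.3)$, the grid size and all function names are hypothetical choices.

```python
def doubling_map(x):
    """One application of the assumed example map Phi(x) = 2x mod 1."""
    y = 2.0 * x
    return y - 1.0 if y >= 1.0 else y


def iterate(x, n):
    """Return Phi^n(x)."""
    for _ in range(n):
        x = doubling_map(x)
    return x


def joint_probability(n, a_interval, b_interval, n_points=200000):
    """Grid estimate of P(A intersect Phi^{-n} B) under Lebesgue measure on [0, 1).

    A point x belongs to Phi^{-n} B exactly when Phi^n(x) belongs to B.
    """
    a_lo, a_hi = a_interval
    b_lo, b_hi = b_interval
    hits = 0
    for i in range(n_points):
        x = (i + 0.5) / n_points  # midpoint rule over [0, 1)
        if a_lo <= x < a_hi and b_lo <= iterate(x, n) < b_hi:
            hits += 1
    return hits / n_points


SET_A = (0.0, 0.5)   # P(A) = 0.5
SET_B = (0.0, 0.3)   # P(B) = 0.3

joint_n0 = joint_probability(0, SET_A, SET_B)   # P(A intersect B) = 0.3
joint_n8 = joint_probability(8, SET_A, SET_B)   # close to P(A) P(B) = 0.15
```

At $n = 0$ the sets are strongly correlated (the estimate is close to 0.3), while after a few iterations the estimate is close to the product $P(A)P(B) = 0.15$, as (49) requires. The number of iterations is kept small because repeated doubling in floating-point arithmetic eventually exhausts the bits of the initial condition.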
On the other hand, ergodicity implies that:

$$\frac{1}{n}\sum_{m=0}^{n-1} \mathbf{1}_B\left(\Phi^m X\right) \xrightarrow{\;n \to \infty\;} P(B) = E[\mathbf{1}_B] \tag{50}$$

It is a basic fact of measure theory that measurable functions (continuous functions being a subset) can be approximated by linear combinations of characteristic functions (i.e. simple functions); this allows (50) to be generalized. In particular, for some probability measure $P$ and some measurable function $g$, it becomes feasible that we can write:

$$\frac{1}{T}\int_{T_0}^{T} g(X(t))\,dt \xrightarrow{\;T \to \infty\;} \int_{\mathbb{R}^N} g(X)\,P(X)\,dX = E[g(X)]$$

where the notation $X(t)$ is for a trajectory.

3.2 Statistical mechanics' main ingredient: Liouville property

Definition 8 A vector field $\mathbf{F}(X)$ satisfies the Liouville property if it is divergence free, i.e.

$$\nabla \cdot \mathbf{F} = \sum_{j=1}^{N} \frac{\partial F_j}{\partial X_j} = 0 \tag{51}$$

Let $X \in \mathbb{R}^N$ and $\mathbf{F} = (F_1, \ldots, F_N)$ with $N \gg 1$. Consider a system of ordinary differential equations given by

$$\frac{dX}{dt} = \mathbf{F}(X), \qquad X|_{t=0} = X_0 \tag{52}$$

The Liouville property implies that the flow map associated with (52) is volume (or measure) preserving on phase space, as we are about to see.

Definition 9 We define the time-$t$ flow map associated with the finite-dimensional ODE system (52) by:

$$\frac{d\Phi^t(X)}{dt} = \mathbf{F}\left(\Phi^t(X)\right), \qquad \Phi^t(X)|_{t=0} = X \tag{53}$$

We will also refer to it as the flow map for simplicity. Section D in the appendix elaborates on the notation and meaning of the flow map. An interesting analogy can be drawn here: if this flow map were over physical instead of phase space, $\Phi^t(X)$ would be the trajectory of fluid elements and $\mathbf{F}$ would be the velocity field, (53) would be a statement of the Fundamental Theorem of Kinematics (see e.g. Price 2006), and (51) would be the condition of incompressibility stated in (4).
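The condition of incompressibility for the flow map is given in terms of the Jacobian determinant $J(t) \equiv \det\left(\nabla_X \Phi^t(X)\right)$ of the transformation that takes the initial position to the position at time $t$. Thus, to prove that the flow map is volume (or measure) preserving we only need a calculus identity, which is proven in the following proposition:

Proposition 3.2 Assume $\Phi \in C^2$. Then

$$\frac{dJ(t)}{dt} = \left(\nabla \cdot \mathbf{F}\right)\left(\Phi^t(X)\right) J(t) \tag{54}$$

where $J(t)$ is the Jacobian determinant of the transformation $X \mapsto \Phi^t(X)$ for each $t$.

Proof We write the argument for $N = 3$ and use the notation $J(t) \equiv \frac{\partial\left(\Phi^t_1, \Phi^t_2, \Phi^t_3\right)}{\partial(X_1, X_2, X_3)}$. The derivative of a Jacobian determinant is:

$$\frac{dJ(t)}{dt} = \frac{\partial\left(F_1, \Phi^t_2, \Phi^t_3\right)}{\partial(X_1, X_2, X_3)} + \frac{\partial\left(\Phi^t_1, F_2, \Phi^t_3\right)}{\partial(X_1, X_2, X_3)} + \frac{\partial\left(\Phi^t_1, \Phi^t_2, F_3\right)}{\partial(X_1, X_2, X_3)} \tag{55}$$

where we have used that $d\Phi^t_i/dt = F_i\left(\Phi^t\right)$ for $i = 1, 2, 3$ by the definition of the flow map (53). It is straightforward to see (by computation, using the chain rule) that:

$$\begin{aligned}
\frac{\partial\left(F_1, \Phi^t_2, \Phi^t_3\right)}{\partial(X_1, X_2, X_3)} &= \left.\frac{\partial(F_1, Y_2, Y_3)}{\partial(Y_1, Y_2, Y_3)}\right|_{Y = \Phi^t} \frac{\partial\left(\Phi^t_1, \Phi^t_2, \Phi^t_3\right)}{\partial(X_1, X_2, X_3)} \\
\frac{\partial\left(\Phi^t_1, F_2, \Phi^t_3\right)}{\partial(X_1, X_2, X_3)} &= \left.\frac{\partial(Y_1, F_2, Y_3)}{\partial(Y_1, Y_2, Y_3)}\right|_{Y = \Phi^t} \frac{\partial\left(\Phi^t_1, \Phi^t_2, \Phi^t_3\right)}{\partial(X_1, X_2, X_3)} \\
\frac{\partial\left(\Phi^t_1, \Phi^t_2, F_3\right)}{\partial(X_1, X_2, X_3)} &= \left.\frac{\partial(Y_1, Y_2, F_3)}{\partial(Y_1, Y_2, Y_3)}\right|_{Y = \Phi^t} \frac{\partial\left(\Phi^t_1, \Phi^t_2, \Phi^t_3\right)}{\partial(X_1, X_2, X_3)}
\end{aligned} \tag{56}$$

Each first factor on the right reduces to a single partial derivative, e.g. $\partial(F_1, Y_2, Y_3)/\partial(Y_1, Y_2, Y_3) = \partial F_1/\partial Y_1$, so their sum is $\nabla \cdot \mathbf{F}$. It is then straightforward to see (again by computation) that plugging (56) into (55) gives (54).

We are now in a position to show that for an incompressible vector field, the associated flow map is volume (or measure) preserving, which is the main reason we need (51). We consider phase space in particular:

Proposition 3.3 $\Phi^t(X)$ is volume (or measure) preserving on phase space, that is:

$$J(t) = 1, \quad \forall t \geq 0 \tag{57}$$

Proof We know that

$$\frac{dJ(t)}{dt} = \left(\nabla \cdot \mathbf{F}\right)\left(\Phi^t(X)\right) J(t)$$

Since $\nabla \cdot \mathbf{F} = 0$, we have that $J(t) = J(0),\ \forall t \geq 0$. Using the initial value of the flow map given by (53) we get that $J(0)$ is the determinant of the identity matrix, thus $J(0) = 1$.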
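Proposition 3.3 can be checked numerically on a small example. The sketch below rests on assumptions that are not part of the text: a two-dimensional phase space with the pendulum field $\mathbf{F}(q, p) = (p, -\sin q)$, which is divergence free, a classical fourth-order Runge-Kutta integrator standing in for the exact flow map, and a central-difference approximation of $\nabla_X \Phi^t$. All function names are hypothetical.

```python
import math


def field(state):
    """Assumed divergence-free example: F(q, p) = (p, -sin q), so dF1/dq + dF2/dp = 0."""
    q, p = state
    return (p, -math.sin(q))


def flow_map(state, t, n_steps=1000):
    """Approximate Phi^t(state) with classical fourth-order Runge-Kutta steps."""
    dt = t / n_steps
    q, p = state
    for _ in range(n_steps):
        k1 = field((q, p))
        k2 = field((q + 0.5 * dt * k1[0], p + 0.5 * dt * k1[1]))
        k3 = field((q + 0.5 * dt * k2[0], p + 0.5 * dt * k2[1]))
        k4 = field((q + dt * k3[0], p + dt * k3[1]))
        q += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        p += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return (q, p)


def jacobian_determinant(state, t, eps=1e-5):
    """det of d(Phi^t)/d(state), estimated by central differences in q and p."""
    q, p = state
    plus_q = flow_map((q + eps, p), t)
    minus_q = flow_map((q - eps, p), t)
    plus_p = flow_map((q, p + eps), t)
    minus_p = flow_map((q, p - eps), t)
    d11 = (plus_q[0] - minus_q[0]) / (2.0 * eps)
    d21 = (plus_q[1] - minus_q[1]) / (2.0 * eps)
    d12 = (plus_p[0] - minus_p[0]) / (2.0 * eps)
    d22 = (plus_p[1] - minus_p[1]) / (2.0 * eps)
    return d11 * d22 - d12 * d21


# J(0) is the determinant of the identity; J(t) should remain 1 at later times.
j_initial = jacobian_determinant((1.0, 0.5), 0.0)
j_later = jacobian_determinant((1.0, 0.5), 2.0)
```

Both determinants equal one up to integration and differencing error, even though the individual entries of the Jacobian matrix change with time; stretching in one direction is compensated by compression in another.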
As mentioned in the introduction of this section, the incompressibility condition assures no loss of information; in other words, the flow map is invertible. Since we are interested in the behavior of an ensemble of solutions, it is natural to consider probability measures on phase space. Let a probability density function at time $t = 0$ be given by $P_0(X)$.

Definition 10 We define a probability density function $P(X, t)$ by the pull-back of the initial probability density with the flow map, that is:

$$P(X, t) \equiv P_0\left(\left(\Phi^t\right)^{-1}(X)\right) \tag{58}$$

We shall now verify that this definition yields a probability measure. This will follow from the fact that the probability measure is transported by the vector field. More precisely we have:

Proposition 3.4 $P(X, t) = P_0\left(\left(\Phi^t\right)^{-1}(X)\right)$ is transported by the vector field $\mathbf{F}$, or equivalently it satisfies the Liouville equation:

$$\frac{\partial P}{\partial t} + \mathbf{F} \cdot \nabla_X P = 0 \tag{59}$$

and hence $P(X, t)$ is a probability density function for all time.

Proof We will use the transport theorem (Majda and Wang 2006), which states that for any function $f(X, t) \in C^1[D]$ on a subset of phase space $D \subset \mathbb{R}^N$:

$$\frac{\partial}{\partial t}\int_{\Phi^t(D)} f(X, t)\,dX = \int_{\Phi^t(D)} \left(\frac{\partial f}{\partial t} + \operatorname{div}(f\,\mathbf{F})\right) dX \tag{60}$$

By (51) this equation reduces to

$$\frac{\partial}{\partial t}\int_{\Phi^t(D)} f(X, t)\,dX = \int_{\Phi^t(D)} \left(\frac{\partial f}{\partial t} + \mathbf{F} \cdot \nabla_X f\right) dX \tag{61}$$

Let $f(X, t) = P(X, t)$. Using the definition (58):

$$\begin{aligned}
\int_{\Phi^t(D)} P(X, t)\,dX &= \int_{\Phi^t(D)} P_0\left(\left(\Phi^t\right)^{-1}(X)\right) dX \\
&= \int_D P_0(Y)\,\det\left(\nabla_Y \Phi^t(Y)\right) dY \\
&= \int_D P_0(Y)\,dY
\end{aligned} \tag{62}$$

where we have used the substitution $X = \Phi^t(Y)$ and (57) to replace the Jacobian of the substitution by one. The right-hand side of (62) does not depend on $t$, so plugging (62) back into (61) we see that:

$$0 = \frac{\partial}{\partial t}\int_{\Phi^t(D)} P(X, t)\,dX = \int_{\Phi^t(D)} \left(\frac{\partial P}{\partial t} + \mathbf{F} \cdot \nabla_X P\right) dX \tag{63}$$

Since $D$ is arbitrary we conclude that

$$\frac{\partial P}{\partial t} + \mathbf{F} \cdot \nabla_X P = 0 \tag{64}$$

Notice that implicit in definition (58) is the fact that $P(t)$ is a family of probability measures depending on the parameter $t$. To see that $P(t)$ is a probability density function for all time, we can use (58) to get that $P(X, t) \geq 0$, and (62) with $D = \mathbb{R}^N$ to verify that

$$\int_{\mathbb{R}^N} P(X, t)\,dX = \int_{\mathbb{R}^N} P_0(X)\,dX = 1 \tag{65}$$

is satisfied.
Corollary 3.1 If $G(P)$ is any differentiable function of the probability density $P$ then
$$\frac{\partial G(P)}{\partial t} + \mathbf{F} \cdot \nabla G(P) = 0 \tag{66}$$
this further implies
$$\frac{d}{dt} \int_{\mathbb{R}^N} G(P(\mathbf{X}, t))\, d\mathbf{X} = 0 \tag{67}$$

Proof The total time derivative is
$$\frac{dG(P)}{dt} = \frac{\partial G(P)}{\partial t} + \mathbf{F} \cdot \nabla G(P) = G'(P)\left(\frac{\partial P}{\partial t} + \mathbf{F} \cdot \nabla P\right) = 0 \tag{68}$$
where we have used the chain rule as in (7), as well as the Liouville equation (59), itself a consequence of (51); this proves (66). Integrating (68) over $\mathbb{R}^N$ and assuming that the integral and derivative can be exchanged (see footnote 10) we can write:
$$\begin{aligned}
\frac{d}{dt} \int_{\mathbb{R}^N} G(P(\mathbf{X}, t))\, d\mathbf{X} &= \int_{\mathbb{R}^N} \frac{\partial G(P)}{\partial t}\, d\mathbf{X} = -\int_{\mathbb{R}^N} \mathbf{F} \cdot \nabla G(P)\, d\mathbf{X} \\
&= -\int_{\mathbb{R}^N} \nabla \cdot [G(P) \mathbf{F}]\, d\mathbf{X} \\
&= -\lim_{r \to \infty} \oint_{\partial B_r(0)} [G(P) \mathbf{F}] \cdot d\mathbf{s} = 0
\end{aligned}$$
where we have used (51) to get the second-to-last line and integrated (divergence theorem) to get the last. Although we have not said much on the precise nature of $G(P)\mathbf{F}$, we assume that its decay at infinity is fast enough for the surface integral to vanish in the limit.

Footnote 10: A formal proof for a similar exchange between derivative and integral will follow in proposition 3.8.

3.3 Conserved quantities and their ensemble averages

An essential ingredient for statistical mechanics is the existence of conserved quantities known as Casimirs; in the context of GFD they exist due to a relabeling symmetry of fluid mechanics (Salmon 1998). Here we will just assume our system has $L$ conserved quantities $E_l$ such that for $1 \leq l \leq L$:
$$E_l\left(\Phi^t(\mathbf{X})\right) = E_l(\mathbf{X}) \tag{69}$$
We will now show that the ensemble averages of these conserved quantities are conserved in time.

Definition 11 We define the ensemble average of a conserved quantity as:
$$\overline{E}_l = \langle E_l \rangle_P \equiv \int_{\mathbb{R}^N} E_l(\mathbf{X}) P(\mathbf{X})\, d\mathbf{X} \tag{70}$$
Remarks: Here $\overline{E}_l$ denotes a specific constant quantity; it will be used again when defining a set of probability measures that satisfy certain constraints (equation (73)).

Proposition 3.5 $\langle E_l \rangle_{P(t)} = \langle E_l \rangle_{P_0}$, $\forall t$

Proof
$$\begin{aligned}
\langle E_l \rangle_{P(t)} &= \int_{\mathbb{R}^N} E_l(\mathbf{X}) P(\mathbf{X}, t)\, d\mathbf{X} \\
&= \int_{\mathbb{R}^N} E_l(\mathbf{X}) P_0\left((\Phi^t)^{-1}(\mathbf{X})\right) d\mathbf{X} \\
&= \int_{\mathbb{R}^N} E_l\left(\Phi^t(\mathbf{Y})\right) P_0(\mathbf{Y})\, d\mathbf{Y} \\
&= \int_{\mathbb{R}^N} E_l(\mathbf{Y}) P_0(\mathbf{Y})\, d\mathbf{Y} = \langle E_l \rangle_{P_0}
\end{aligned}$$
where we have used property (57): the substitution $\mathbf{Y} = (\Phi^t)^{-1}(\mathbf{X})$ preserves volume, and that $E_l$ is conserved in time as stated in (69).

3.4 Shannon's Entropy

With a probability density function $P$ on $\mathbb{R}^N$ we may use the definition of entropy.

Definition 12 The Shannon entropy $S$ for the probability density function $P(\mathbf{X})$ on $\mathbb{R}^N$ that is absolutely continuous with respect to Lebesgue measure (see footnote 11) is defined by:
$$S(P) \equiv -\int_{\mathbb{R}^N} P(\mathbf{X}) \ln P(\mathbf{X})\, d\mathbf{X} \tag{71}$$
The entropy is a measure of the information we do not know (see Appendix E). By setting $G(P) = -P \ln P$ in (67) we get that the Shannon entropy is conserved in time:
$$S(P(t)) = S(P_0) \tag{72}$$
The next step is to choose a probability measure that guarantees the least bias. The maximum entropy principle states that the probability density function with the least bias should be the one that maximizes the Shannon entropy (71), subject to the constraint set of conserved quantities as defined through (70). To this purpose we define the set:
$$\mathcal{P} = \left\{ P \;\Big|\; P(\mathbf{X}) \geq 0,\ \int_{\mathbb{R}^N} P(\mathbf{X})\, d\mathbf{X} = 1,\ \langle E_l \rangle_P = \overline{E}_l \right\} \tag{73}$$
Clearly, if we have an initial probability density function $P_0$, and we use it to define $\overline{E}_l$, then $\mathcal{P} \neq \emptyset$. The maximum entropy principle predicts that the most probable probability density function $P_* \in \mathcal{P}$ is the one that satisfies:
$$S(P_*) = \max_{P \in \mathcal{P}} S(P) \tag{74}$$
To solve this maximization problem we will use the Lagrange multiplier method (see Appendix F), for which we need to calculate the variational derivative of a functional.

Footnote 11: Let $(\Omega, \mathcal{A})$ be a measurable space and $\mu$ and $\nu$ measures on $\mathcal{A}$; then $\nu$ is absolutely continuous with respect to $\mu$ if $\mu(A) = 0 \Rightarrow \nu(A) = 0$. In this case Lebesgue measure zero implies that the probability measure $P$ is also zero.
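The definition of this derivative can be motivated by recalling the definition of the directional derivative of a function $f : \mathbb{R}^N \to \mathbb{R}$ in the direction of a vector $\nu$:
$$\lim_{h \to 0} \frac{f(x + h\nu) - f(x)}{h} \equiv \nabla f(x) \cdot \nu \tag{75}$$
Instead of the Euclidean inner product in (75), we will use the $L^2$ inner product between functions $f$ and $g$ over a domain $D$ with Lebesgue measure $\lambda(D)$, given by:
$$\langle f \,|\, g \rangle \equiv \frac{1}{\lambda(D)} \int_D f \bar{g}\, d\lambda = \fint_D f \bar{g}\, d\lambda$$
where $\fint$ denotes the normalized (average) integral. Let $H$ be a Hilbert space; we define:

Definition 13 The variational derivative of a functional $F : H \to \mathbb{R}$ with respect to an infinitesimal continuous real-valued function $\delta u$ is given by the function $\delta F/\delta u$ that satisfies:
$$\lim_{\epsilon \to 0} \frac{F(u + \epsilon\, \delta u) - F(u)}{\epsilon} = \left\langle \frac{\delta F}{\delta u} \,\Big|\, \delta u \right\rangle \tag{76}$$
We notice that $\delta F/\delta u$ is analogous to the gradient of $f$ in (75). In general the variational derivative of $F(u) = \fint G(u)\, dx$ for some function $G \in C^1[\Omega]$ is given by:
$$\lim_{\epsilon \to 0} \frac{F(u + \epsilon\, \delta u) - F(u)}{\epsilon} = \lim_{\epsilon \to 0} \fint \frac{G(u + \epsilon\, \delta u) - G(u)}{\epsilon}\, dx = \fint G'(u)\, \delta u\, dx \tag{77}$$
(Exchanging the limit with the integral will be justified in proposition 3.8.) Thus we identify:
$$\frac{\delta F}{\delta u} = G' \tag{78}$$
We can now use (78) to calculate the variational derivative of (71) as:
$$\frac{\delta S}{\delta P} = -(1 + \ln P) \tag{79}$$
The constraints of this maximization problem are given in (73). In particular we calculate the variational derivative of the ensemble averages of the conserved quantities using (70):
$$\frac{\delta \overline{E}_l}{\delta P} = E_l(\mathbf{X}) \tag{80}$$
as well as the constraint that $P \in \mathcal{P}$ is a probability density over $\mathbb{R}^N$, which gives
$$\frac{\delta}{\delta P} \int_{\mathbb{R}^N} P\, d\mathbf{X} = 1$$
The Lagrange multiplier method tells us that the probability density $P_*$ that maximizes the entropy $S$ subject to the constraints $\overline{E}_l$ with $1 \leq l \leq L$, as well as the constraint that $P_*$ is a probability density function, is given by:
$$-(1 + \ln P_*) = \theta_0 + \sum_{l=1}^{L} \theta_l E_l(\mathbf{X})$$
Solving for $P_*$ we get:
$$P_* = C \exp\left(-\sum_{l=1}^{L} \theta_l E_l(\mathbf{X})\right) \tag{81}$$
Here $\theta_0$ has been absorbed into $C$. Assuming that $\exp\left(-\sum_{l=1}^{L} \theta_l E_l(\mathbf{X})\right)$ is integrable, we can write:
$$C^{-1} \equiv I = \int_{\mathbb{R}^N} \exp\left(-\sum_{l=1}^{L} \theta_l E_l(\mathbf{X})\right) d\mathbf{X}$$
With this definition for $C$ (notice we chose $\theta_0 = \ln(I) - 1$ so that $C^{-1} = \exp(\theta_0 + 1) = I$) we call
$$G_{\theta_l} \equiv P_* = C \exp\left(-\sum_{l=1}^{L} \theta_l E_l(\mathbf{X})\right) \tag{82}$$
the Gibbs measure.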
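The maximum entropy argument leading to (81) and (82) can be illustrated on a finite state space, where the integrals become sums. The following sketch is an assumption-laden toy, not the continuous problem treated in the text: it takes four hypothetical energy levels and one constraint ($L = 1$), finds the multiplier $\theta$ by bisection so that the ensemble average matches a prescribed value, and compares the entropy of the resulting Gibbs weights with that of another distribution satisfying the same constraints. All names and numbers are illustrative.

```python
import numpy as np

E = np.array([0.0, 1.0, 2.0, 3.0])  # hypothetical energy levels
E_target = 1.2                       # prescribed ensemble average, cf. (70)

def gibbs(theta):
    """Discrete analogue of (82): weights proportional to exp(-theta * E)."""
    w = np.exp(-theta * E)
    return w / w.sum()

def mean_energy(theta):
    return float(gibbs(theta) @ E)

def entropy(p):
    """Discrete Shannon entropy, the analogue of (71)."""
    return float(-(p * np.log(p)).sum())

# The mean energy is a decreasing function of theta, so bisection applies.
lo, hi = -50.0, 50.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if mean_energy(mid) > E_target:
        lo = mid
    else:
        hi = mid
theta = 0.5 * (lo + hi)
p_star = gibbs(theta)

# A perturbation that preserves both the normalization and the mean energy:
# sum(d) = 0 and sum(d * E) = 0.
d = np.array([1.0, -2.0, 1.0, 0.0])
p_other = p_star + 0.01 * d

print("theta =", theta)
print("S(Gibbs) =", entropy(p_star), " S(other) =", entropy(p_other))
```

Both distributions belong to the discrete counterpart of the set (73), and the Gibbs weights have the larger entropy, as the strict concavity of $-p \ln p$ requires. The logarithm of `p_star` is affine in `E`, which is the discrete form of the stationarity condition preceding (81).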
We will show that any smooth function of the conserved quantities $E_l$ of the ODE (52) is a steady state solution to the Liouville equation; in particular the Gibbs measure satisfies:
$$\mathbf{F} \cdot \nabla_{\mathbf{X}} G_{\theta_l} = 0 \tag{83}$$

Proposition 3.6 Let $E_j$ for $1 \leq j \leq J$ be conserved quantities of system (52). Any smooth function $G = G(E_1, ..., E_J)$ is a steady state solution of the Liouville equation.

Proof $E_j(\mathbf{X}(t))$ is conserved in time so for $1 \leq j \leq J$:
$$0 = \frac{d}{dt} E_j(\mathbf{X}(t)) = \frac{\partial E_j}{\partial t} + \mathbf{F} \cdot \nabla_{\mathbf{X}} E_j = \mathbf{F} \cdot \nabla_{\mathbf{X}} E_j$$
Using the chain rule on $G$ we get the desired result:
$$\mathbf{F} \cdot \nabla_{\mathbf{X}} G(E_1, ..., E_J) = \sum_{j=1}^{J} \frac{\partial G}{\partial E_j}\, \mathbf{F} \cdot \nabla_{\mathbf{X}} E_j = 0 \tag{84}$$
Setting $G_{\theta_l} = G$ in (84) we get (83).

We can use this result to show that any probability measure satisfying (83) is an invariant probability measure.

Definition 14 A measure $\mu$ on $\mathbb{R}^n$ is said to be an invariant measure under the flow map $\Phi^t$ if
$$\mu\left((\Phi^t)^{-1}(D)\right) = \mu(D) \tag{85}$$
for all measurable sets $D \subset \mathbb{R}^n$ and all time $t$.

Proposition 3.7 The Gibbs measure $G_{\theta_l}$ is an invariant measure of the system (52); in other words it is a stationary statistical solution to (52).

Proof
$$\begin{aligned}
\frac{d}{dt} \int_{(\Phi^t)^{-1} D} G_{\theta_l}(\mathbf{X})\, d\mathbf{X} &= \frac{d}{dt} \int_D G_{\theta_l}\left(\Phi^t(\mathbf{Y})\right) d\mathbf{Y} \\
&= \int_D \mathbf{F} \cdot \nabla_{\mathbf{X}} G_{\theta_l}(\mathbf{X})\Big|_{\mathbf{X} = \Phi^t(\mathbf{Y})}\, d\mathbf{Y} \\
&= 0
\end{aligned}$$
We have used the usual substitution as well as the volume preserving property of the flow field, $J = 1$. The conclusion is that $\int_{(\Phi^t)^{-1} D} G_{\theta_l}(\mathbf{X})\, d\mathbf{X}$ is independent of time and therefore:
$$\int_D G_{\theta_l}(\mathbf{X})\, d\mathbf{X} = \int_{(\Phi^t)^{-1} D} G_{\theta_l}(\mathbf{X})\, d\mathbf{X}$$
which in turn implies $G_{\theta_l}$ is an invariant probability measure of $\Phi^t$.

3.5 Casimir's conservation laws.

Both Euler and the quasi-geostrophic equations conserve an infinite number of functionals named Casimirs. They are of the form:
$$C_s[q] = \int_D s(q)\, d\mathbf{x} \tag{86}$$
where $q = q(\mathbf{x}, t)$ is either the vorticity or the potential vorticity, both of which are materially conserved, i.e. they satisfy $Dq/Dt = 0$. Throughout this paper we have exchanged a derivative with an integral without rigorous treatment. We will use this section to show how this can be done, while introducing Casimirs at the same time.
Although we will not be referencing these conserved quantities any further, readers that follow up on the topic will invariably encounter them. Thus, in the author's mind at least, there is some added value in introducing them while we prove that the exchange of derivative and integral may be an acceptable practice.

Proposition 3.8 $C_s$ is materially conserved for any $s \in C^1[D]$. That is
$$\frac{DC_s}{Dt} = 0$$

Proof We need to prove that
$$\frac{D}{Dt} C_s[q] = \int_D s'(q) \underbrace{\left(\frac{\partial q}{\partial t} + \mathbf{u} \cdot \nabla q\right)}_{=0} d\mathbf{x} = 0 \tag{87}$$
For this to be valid it is only necessary to justify the exchange of the time derivative with the integral over the fixed spatial domain $D$. The problem amounts to bringing the limit into the integral in:
$$\frac{D}{Dt} C_s[q] = \lim_{h \to 0} \int_D \frac{s(q(\mathbf{x}, t + h)) - s(q(\mathbf{x}, t))}{h}\, d\mathbf{x} \tag{88}$$
We would like to use the bounded convergence theorem (Appendix B.1). First we will assume that the domain is closed and bounded, but eventually we will relax this assumption. This will illustrate the differences between the two types of domain. Any closed and bounded set $D \subset \mathbb{R}^n$ is compact, and any continuous function on a compact domain is uniformly bounded; thus it is straightforward that $s$ and $Ds/Dt$ are uniformly bounded. Next define
$$s_n(q(\mathbf{x}, t)) \equiv \frac{s\left(q\left(\mathbf{x}, t + \tfrac{1}{n}\right)\right) - s(q(\mathbf{x}, t))}{\tfrac{1}{n}}$$
From the Mean Value Theorem: $\forall n \in \mathbb{N}$ $\exists\, t^\star \in \left(t, t + \tfrac{1}{n}\right)$ such that $\frac{Ds}{Dt}\big|_{(\mathbf{x}, t^\star)} = s_n(q(\mathbf{x}, t))$; therefore for some $0 < M \in \mathbb{R}$ we have $|s_n| < M$ $\forall n \in \mathbb{N}$. At this point the bounded convergence theorem applies and we can take the limit into the integral in (88) to get (87).

Remark: This proof can be extended to an unbounded (e.g. $D = \mathbb{R}^n$) or some other open domain by requiring that the derivative of the integrand is bounded and not only continuous. Next, the dominated convergence theorem could be used by finding a non-negative integrable function $g$ such that $|s_n| \leq g$.

Remark: Usually $s(q) = q^n$, $n \in \mathbb{N}$, so the assumption $s \in C^1[D]$ implies no restriction.

4 Statistical theory for a simple Geophysical Flow.
We will now use the theory developed in section 3 to find the mean state of the Gibbs measure for a simple realistic barotropic flow with no forcing or dissipation as defined in section 2.12:
$$\begin{aligned}
\frac{\partial q}{\partial t} + J(\psi, q) &= 0 \\
\frac{dV(t)}{dt} &= -\fint \frac{\partial h}{\partial x}\, \psi'
\end{aligned} \tag{89}$$
where
$$q = \Delta\psi' + h + \beta y, \quad \zeta \equiv \Delta\psi', \quad \psi = -V(t)\, y + \psi', \quad q' \equiv \Delta\psi' + h, \quad \mathbf{v} = \nabla^\perp \psi = \begin{pmatrix} V(t) \\ 0 \end{pmatrix} + \nabla^\perp \psi'$$
Remarks: We are using equation (22) for the relative vorticity $\zeta$. Here $\nabla^\perp = (-\partial/\partial x_2, \partial/\partial x_1)$. See section 2.13 for more details. We can choose our units so that our domain is $D = [0, 2\pi] \times [0, 2\pi]$ with periodic boundary conditions.

We will use the statistical theory for the truncated quasi-geostrophic dynamics (89) originally developed by Kraichnan (1975), Salmon et al. (1976) as well as Carnevale and Frederiksen (1987). The first step is to write a Galerkin approximation to (89) where the equations are projected to a finite dimensional space. Periodic boundary conditions make Fourier series a natural option. Next we verify which quantities are conserved in the finite dimensional space and whether the truncated equations satisfy the Liouville property. These are the essential ingredients as we have seen in the previous section. Next we find the Gibbs measure for this system and compute its mean state. The last step consists of taking the limit as the number of dimensions of our space tends to infinity, thus obtaining the continuum limit.

4.1 Galerkin approximation

We will use the truncated spatial Fourier series expansions of the small-scale stream function $\psi'$, the vorticity $\zeta$ and the topography $h$ as spanned by the basis $B_\Lambda \equiv \left\{ e^{i\mathbf{k}\cdot\mathbf{x}} \;\big|\; 1 \leq |\mathbf{k}|^2 \leq \Lambda \right\}$ where $\mathbf{k} = (k_1, k_2) \in \mathbb{Z}^2$ so that:
$$\begin{aligned}
\psi'_\Lambda &\equiv \sum_{1 \leq |\mathbf{k}|^2 \leq \Lambda} \hat{\psi}_k(t)\, e^{i\mathbf{k}\cdot\mathbf{x}} = -\sum_{1 \leq |\mathbf{k}|^2 \leq \Lambda} \frac{1}{|\mathbf{k}|^2}\, \hat{\zeta}_k(t)\, e^{i\mathbf{k}\cdot\mathbf{x}} \\
h_\Lambda &\equiv \sum_{1 \leq |\mathbf{k}|^2 \leq \Lambda} \hat{h}_k\, e^{i\mathbf{k}\cdot\mathbf{x}} \\
\zeta_\Lambda &\equiv \sum_{1 \leq |\mathbf{k}|^2 \leq \Lambda} \hat{\zeta}_k(t)\, e^{i\mathbf{k}\cdot\mathbf{x}} = \sum_{1 \leq |\mathbf{k}|^2 \leq \Lambda} -|\mathbf{k}|^2 \hat{\psi}_k\, e^{i\mathbf{k}\cdot\mathbf{x}}
\end{aligned} \tag{90}$$
We note that the spatial Fourier transformation maps spatial derivatives to products; for example the Laplacian operator is mapped as $\Delta(\cdot) \mapsto -|\mathbf{k}|^2 (\cdot)$ and $\partial/\partial x_1 (\cdot) \mapsto i k_1 (\cdot)$.
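The mapping of derivatives to products can be seen directly with a discrete Fourier transform on the periodic domain $[0, 2\pi] \times [0, 2\pi]$. The sketch below is only an illustration under assumptions of its own: the test function $\psi = \sin(2x_1)\cos(3x_2)$, the grid size and the variable names are hypothetical choices, and NumPy's FFT is used as a stand-in for the Fourier coefficients of the text (divided by the number of grid points so that they match the normalized integral).

```python
import numpy as np

n = 32                                   # grid points per direction (illustrative)
x = 2 * np.pi * np.arange(n) / n
X1, X2 = np.meshgrid(x, x, indexing="ij")

# Hypothetical test function; analytically, its Laplacian is -(2^2 + 3^2) * psi.
psi = np.sin(2 * X1) * np.cos(3 * X2)

k = np.fft.fftfreq(n, d=1.0 / n)         # integer wavenumbers 0, 1, ..., -1
K1, K2 = np.meshgrid(k, k, indexing="ij")
psi_hat = np.fft.fft2(psi) / n**2        # discrete analogue of the coefficients

# Laplacian: multiply each coefficient by -|k|^2, then transform back.
zeta = np.real(np.fft.ifft2(-(K1**2 + K2**2) * psi_hat) * n**2)

# Derivative in x1: multiply each coefficient by i * k1.
dpsi_dx1 = np.real(np.fft.ifft2(1j * K1 * psi_hat) * n**2)

err_lap = np.max(np.abs(zeta - (-13.0 * psi)))
err_dx1 = np.max(np.abs(dpsi_dx1 - 2 * np.cos(2 * X1) * np.cos(3 * X2)))
print("max error, Laplacian:", err_lap)
print("max error, d/dx1:   ", err_dx1)
```

Both errors should be at the level of floating-point round-off, because the test function is a trigonometric polynomial resolved exactly by the grid.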
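Thus in the first and last line of (90) we have used (22). For simplicity we have assumed zero mean for our functions. The functions being expanded are real valued, i.e. their Fourier coefficients satisfy $\hat{\psi}_{-k} = \hat{\psi}^*_k$, $\hat{h}_{-k} = \hat{h}^*_k$ and $\hat{\zeta}_{-k} = \hat{\zeta}^*_k$, where $(\cdot)^*$ denotes complex conjugation. The Fourier coefficients are defined in the usual way, for example:
$$\hat{\psi}_k \equiv \left\langle \psi \,\big|\, e^{i\mathbf{k}\cdot\mathbf{x}} \right\rangle = \fint \psi\, e^{-i\mathbf{k}\cdot\mathbf{x}}\, d\mathbf{x} = \frac{1}{4\pi^2} \int_0^{2\pi}\!\!\int_0^{2\pi} \psi\, e^{-i\mathbf{k}\cdot\mathbf{x}}\, dx_1\, dx_2$$
$B_\Lambda$ is an orthonormal basis under the above $L^2$ inner product, a basic result of Fourier analysis. Let $V_\Lambda = \operatorname{span}\{B_\Lambda\}$. We obtain the truncated dynamical equations for barotropic quasi-geostrophy by projecting (89) orthogonally onto $V_\Lambda$. Let $P_\Lambda$ denote the orthogonal projection onto $V_\Lambda$ (see footnote 12); if we project the barotropic quasi-geostrophic equations (89) onto $V_\Lambda$ we get the truncated dynamical equations. It is a Galerkin approximation using a Fourier basis:
$$\frac{\partial q'_\Lambda}{\partial t} + \beta \frac{\partial \psi'_\Lambda}{\partial x_1} + V \frac{\partial q'_\Lambda}{\partial x_1} + P_\Lambda\left(\nabla^\perp \psi'_\Lambda \cdot \nabla q'_\Lambda\right) = 0 \tag{91}$$
$$\frac{dV}{dt} - \fint h_\Lambda \frac{\partial \psi'_\Lambda}{\partial x_1} = 0 \tag{92}$$
Remarks:

• $V_\Lambda$ is not closed under multiplication since for $|\mathbf{p}|^2 \leq \Lambda$, $|\mathbf{q}|^2 \leq \Lambda$ we may have $|\mathbf{p} + \mathbf{q}|^2 > \Lambda$ (note $\exp(i\mathbf{p}\cdot\mathbf{x}) \exp(i\mathbf{q}\cdot\mathbf{x}) = \exp(i(\mathbf{p} + \mathbf{q})\cdot\mathbf{x})$). Therefore the non-linearity in (91) needs special treatment when projecting the Fourier series onto $V_\Lambda$. This projection is made clear in the next set of equations when (91) and (92) are written in Fourier space.

• The last integral is the result of integration by parts of (29).

Plugging (90) into (91) and (92) we get an ODE system describing evolution in time. Thus we have that, for the Fourier coefficients satisfying $1 \leq |\mathbf{k}|^2 \leq \Lambda$:
$$\frac{d\hat{\zeta}_k}{dt} - \frac{i \beta k_1}{|\mathbf{k}|^2}\, \hat{\zeta}_k + i V k_1 \left(\hat{\zeta}_k + \hat{h}_k\right) - \sum_{\substack{\mathbf{p} + \mathbf{q} = \mathbf{k} \\ |\mathbf{p}|^2 \leq \Lambda,\ |\mathbf{q}|^2 \leq \Lambda}} \frac{\mathbf{p}^\perp \cdot \mathbf{q}}{|\mathbf{p}|^2}\, \hat{\zeta}_p \left(\hat{\zeta}_q + \hat{h}_q\right) = 0 \tag{93}$$
$$\frac{dV}{dt} - i \sum_{1 \leq |\mathbf{k}|^2 \leq \Lambda} \frac{k_1}{|\mathbf{k}|^2}\, \hat{h}_{-k}\, \hat{\zeta}_k = 0 \tag{94}$$
Remarks:

• The relations $\hat{q}_k = \hat{\zeta}_k + \hat{h}_k$ and $\hat{\psi}_k = -\hat{\zeta}_k / |\mathbf{k}|^2$ have been used in (93).

• $\mathbf{q}$ and $\mathbf{p}$ are never zero since we have assumed zero mean for the Fourier expansion.

4.2 Conserved quantities.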
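As noted above, the existence of conserved quantities is one of the basic ingredients needed for statistical mechanics. We will now prove that our truncated equations (91) and (92) indeed have two conserved quantities with which we can proceed. These quantities are the energy $E_\Lambda$ and the enstrophy $\mathcal{E}_\Lambda$; they were introduced as equations (39) and (40):
$$E_\Lambda = \frac{1}{2} V^2 + \frac{1}{2} \fint \left|\nabla^\perp \psi'_\Lambda\right|^2 d\mathbf{x} = \frac{1}{2} V^2 + \frac{1}{2} \sum_{1 \leq |\mathbf{k}|^2 \leq \Lambda} |\mathbf{k}|^2 |\hat{\psi}_k|^2 \tag{95}$$
$$\mathcal{E}_\Lambda = \beta V + \frac{1}{2} \fint \left(q'_\Lambda\right)^2 d\mathbf{x} = \beta V + \frac{1}{2} \sum_{1 \leq |\mathbf{k}|^2 \leq \Lambda} \left| -|\mathbf{k}|^2 \hat{\psi}_k + \hat{h}_k \right|^2 \tag{96}$$

Proposition 4.1 The truncated energy $E_\Lambda$ and enstrophy $\mathcal{E}_\Lambda$ are conserved in the finite-dimensional truncated dynamics.

Proof We need to prove that $\frac{dE_\Lambda}{dt} = 0$ and $\frac{d\mathcal{E}_\Lambda}{dt} = 0$ for all time. We start with energy:
$$\begin{aligned}
\frac{dE_\Lambda}{dt} &= V \frac{\partial V}{\partial t} - \fint \psi'_\Lambda \frac{\partial \zeta_\Lambda}{\partial t}\, d\mathbf{x} \\
&= V \frac{\partial V}{\partial t} - \fint \psi'_\Lambda \frac{\partial q'_\Lambda}{\partial t}\, d\mathbf{x} \\
&= V \fint h_\Lambda \frac{\partial \psi'_\Lambda}{\partial x}\, d\mathbf{x} + \underbrace{\beta \fint \psi'_\Lambda \frac{\partial \psi'_\Lambda}{\partial x}\, d\mathbf{x}}_{= \frac{\beta}{2} \fint \frac{\partial}{\partial x} (\psi'_\Lambda)^2\, d\mathbf{x} = 0} + V \fint \psi'_\Lambda \frac{\partial q'_\Lambda}{\partial x}\, d\mathbf{x} + \fint P_\Lambda\left(\nabla^\perp \psi'_\Lambda \cdot \nabla q'_\Lambda\right) \psi'_\Lambda\, d\mathbf{x} \\
&= \underbrace{V \fint h_\Lambda \frac{\partial \psi'_\Lambda}{\partial x}\, d\mathbf{x} + V \fint \psi'_\Lambda \frac{\partial h_\Lambda}{\partial x}\, d\mathbf{x}}_{= V \fint \frac{\partial}{\partial x} (h_\Lambda \psi'_\Lambda)\, d\mathbf{x} = 0} + \fint \left(\nabla^\perp \psi'_\Lambda \cdot \nabla q'_\Lambda\right) \psi'_\Lambda\, d\mathbf{x} = 0
\end{aligned}$$
where we have used the relationships between our functions as described in (90), (91) and (92), as well as integration by parts. In the first line we used
$$\fint \frac{\partial}{\partial t} \frac{1}{2} \left|\nabla \psi'_\Lambda\right|^2 = \fint \nabla \psi'_\Lambda \cdot \frac{\partial \nabla \psi'_\Lambda}{\partial t} = -\fint \psi'_\Lambda \frac{\partial \zeta_\Lambda}{\partial t}$$
We have also used that the integral of the derivative of a periodic function is zero, as well as the product rule. The last integral vanishes because $\left(\nabla^\perp \psi'_\Lambda \cdot \nabla q'_\Lambda\right) \psi'_\Lambda = \nabla^\perp \left(\tfrac{1}{2} (\psi'_\Lambda)^2\right) \cdot \nabla q'_\Lambda$, which integrates to zero over the periodic domain since $\nabla^\perp \cdot \nabla = 0$. The orthogonal projection can be dropped under integration due to the orthogonality relation of the Fourier basis:
$$\left\langle e^{i\mathbf{m}\cdot\mathbf{x}} \,\big|\, e^{i\mathbf{n}\cdot\mathbf{x}} \right\rangle = \begin{cases} 1 & \text{if } \mathbf{m} = \mathbf{n} \\ 0 & \text{if } \mathbf{m} \neq \mathbf{n} \end{cases}$$
The same techniques used above will work on the time derivative of enstrophy to show it is zero.

Footnote 12: A brief introduction to orthogonal projections is given in Appendix G.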
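The two expressions for the small-scale part of the energy in (95), one an average over physical space and one a sum over Fourier modes, can be compared numerically. The sketch below is an illustration only, with $V = 0$ and a hypothetical stream function $\psi' = \sin(2x_1)\cos(3x_2) + \tfrac{1}{2}\cos(x_1 + x_2)$ whose gradient is written out analytically; the grid size and names are assumptions.

```python
import numpy as np

n = 32                                   # grid points per direction (illustrative)
x = 2 * np.pi * np.arange(n) / n
X1, X2 = np.meshgrid(x, x, indexing="ij")

# Hypothetical zero-mean stream function and its analytic partial derivatives.
psi = np.sin(2 * X1) * np.cos(3 * X2) + 0.5 * np.cos(X1 + X2)
psi_x1 = 2 * np.cos(2 * X1) * np.cos(3 * X2) - 0.5 * np.sin(X1 + X2)
psi_x2 = -3 * np.sin(2 * X1) * np.sin(3 * X2) - 0.5 * np.sin(X1 + X2)

# Physical-space form: half the normalized integral of |grad psi|^2.
# The grid mean equals the normalized integral for resolved trigonometric
# polynomials, and |grad-perp psi| = |grad psi|.
energy_physical = 0.5 * np.mean(psi_x1**2 + psi_x2**2)

# Spectral form: half the sum of |k|^2 |psi_hat_k|^2.
k = np.fft.fftfreq(n, d=1.0 / n)
K1, K2 = np.meshgrid(k, k, indexing="ij")
psi_hat = np.fft.fft2(psi) / n**2
energy_spectral = 0.5 * np.sum((K1**2 + K2**2) * np.abs(psi_hat) ** 2)

print(energy_physical, energy_spectral)
```

The agreement of the two numbers is the orthonormality of the Fourier basis at work, the same relation that lets the projection $P_\Lambda$ be dropped under the integral in the proof above.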
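4.3 Non-linear stability of the exact solution to the truncated system

Similar to what we did in section 2.15, it is easy to verify that the truncated equation has an exact solution when assuming a linear relationship
$$\overline{q}_\Lambda = \mu \overline{\psi}_\Lambda \tag{97}$$
where, similar to section 2.15, this means we have:
$$\Delta \overline{\psi}'_\Lambda + h_\Lambda = \mu \overline{\psi}'_\Lambda, \qquad \overline{V} = -\frac{\beta}{\mu} \tag{98}$$
To assess non-linear stability, a positive quadratic form for the perturbations can be constructed from the truncated energy and enstrophy. Let $(\delta q_\Lambda, \delta V)$ be the perturbations; following section 2.15 and using (98) we have:
$$\mu E_\Lambda(q_\Lambda, V) + \mathcal{E}_\Lambda(q_\Lambda, V) = \mu E_\Lambda\left(\overline{q}_\Lambda, \overline{V}\right) + \mathcal{E}_\Lambda\left(\overline{q}_\Lambda, \overline{V}\right) + W_\mu(\delta q_\Lambda, \delta V) \tag{99}$$
where
$$\begin{aligned}
W_\mu(\delta q_\Lambda, \delta V) &= \frac{\mu}{2}\, \delta V^2 + \frac{1}{2} \sum_{1 \leq |\mathbf{k}|^2 \leq \Lambda} \left(1 + \frac{\mu}{|\mathbf{k}|^2}\right) \left|\widehat{\delta q}_k\right|^2 \\
&= \frac{\mu}{2} \left(V - \overline{V}\right)^2 + \frac{1}{2} \sum_{1 \leq |\mathbf{k}|^2 \leq \Lambda} |\mathbf{k}|^2 \left(\mu + |\mathbf{k}|^2\right) \left|\hat{\psi}_k - \hat{\overline{\psi}}_k\right|^2
\end{aligned} \tag{100}$$
Recall that $\delta q = q - \overline{q}$. The same arguments as in section 2.15 prove that the steady-state solution $\left(\overline{q}_\Lambda, \overline{V}\right)$ is non-linearly stable for $\mu > 0$.

4.4 Liouville property of the truncated system.

Choose a set $S = \{\mathbf{k}_1, ..., \mathbf{k}_M\}$ of the modes $\mathbf{k}$ satisfying $1 \leq |\mathbf{k}|^2 \leq \Lambda$ such that $\mathbf{k} \in S \Rightarrow -\mathbf{k} \notin S$ but $S \cup (-S) = \left\{\mathbf{k} : 1 \leq |\mathbf{k}|^2 \leq \Lambda\right\}$. Let $N = 2M + 1$ and define $\mathbf{X} \in \mathbb{R}^N$ by $\mathbf{X} \equiv \left(V, \operatorname{Re}\hat{\psi}_{k_1}, \operatorname{Im}\hat{\psi}_{k_1}, ..., \operatorname{Re}\hat{\psi}_{k_M}, \operatorname{Im}\hat{\psi}_{k_M}\right)$. Now the entire state of the finite-dimensional system can be represented by a point $\mathbf{X} \in \mathbb{R}^N$ where $N \gg 1$. Equations (93) and (94) can be represented as:
$$\frac{d\mathbf{X}}{dt} = \mathbf{F}(\mathbf{X}) \tag{101}$$
with initial conditions $\mathbf{X}|_{t=0} = \mathbf{X}_0$. Under this form it becomes easy to prove that the Liouville property is satisfied since the vector field $\mathbf{F}$ satisfies:
$$F_j(\mathbf{X}) = F_j(X_1, ..., X_{j-1}, X_{j+1}, ..., X_N) \tag{102}$$
in other words:

Proposition 4.2 $F_j$ does not depend on $X_j$ in (101).

Proof Clearly $F_1$ is independent of $X_1 = V$. When thinking about $\mathbf{F}$, it is appropriate to remember the relationship $\hat{\zeta}_k = -|\mathbf{k}|^2 \hat{\psi}_k$. Next notice that in (93), $F_2$ and $F_3$ correspond to $\operatorname{Re}\hat{\psi}_{k_1}$ and $\operatorname{Im}\hat{\psi}_{k_1}$ respectively. In general $F_{2j}$ and $F_{2j+1}$ correspond to $X_{2j} = \operatorname{Re}\hat{\psi}_{k_j}$ and $X_{2j+1} = \operatorname{Im}\hat{\psi}_{k_j}$ respectively.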
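Clearly the linear terms in (101) are either independent of $\hat{\psi}_{k_j}$ or only cause a rotation of $X_{2j}$ and $X_{2j+1}$ of the form:
$$\frac{d}{dt} \begin{pmatrix} X_{2j} \\ X_{2j+1} \end{pmatrix} = C(\mathbf{k}_j) \begin{pmatrix} -X_{2j+1} \\ X_{2j} \end{pmatrix}$$
where $C(\mathbf{k}_j)$ is a constant for each $\mathbf{k}_j$. Clearly, this rotation keeps $F_j$ from being dependent on $X_j$. The nonlinear term in (93) has contribution zero from $\hat{\psi}_{k_j}$ and $\hat{\psi}_{-k_j}$ (i.e. the contribution from $X_{2j}$ and $X_{2j+1}$ is also zero) since the restriction on the summation implies that either $\mathbf{p} = 0$, $\mathbf{q} = \mathbf{k}_j$ or $\mathbf{q} = 0$, $\mathbf{p} = \mathbf{k}_j$ or $\mathbf{p} = 2\mathbf{k}_j$, $\mathbf{q} = -\mathbf{k}_j$ or $\mathbf{q} = 2\mathbf{k}_j$, $\mathbf{p} = -\mathbf{k}_j$; in any case $\mathbf{p}^\perp \cdot \mathbf{q} = 0$.

4.5 Statistical predictions of the truncated system.

The truncated system has two conserved quantities, it satisfies the Liouville property and it is non-linearly stable to small perturbations. We can now use equation (82) for our least biased probability with the expressions for energy and enstrophy given by (95) and (96) to get that:
$$G_{\alpha,\theta} = C \exp\left(-\alpha \left[\beta V + \frac{1}{2} \sum_{1 \leq |\mathbf{k}|^2 \leq \Lambda} \left| -|\mathbf{k}|^2 \hat{\psi}_k + \hat{h}_k \right|^2\right] - \theta \left[\frac{1}{2} V^2 + \frac{1}{2} \sum_{1 \leq |\mathbf{k}|^2 \leq \Lambda} |\mathbf{k}|^2 |\hat{\psi}_k|^2\right]\right) \tag{103}$$
where $\alpha$ and $\theta$ are the Lagrange multipliers for enstrophy and energy respectively; they are determined from the ensemble average enstrophy and energy constraints derived from (70). For (103) to be a probability measure some restrictions are needed. In particular, the coefficients of the quadratic terms must be negative, which means that $\alpha |\mathbf{k}|^4 + \theta |\mathbf{k}|^2 > 0$ for each $\mathbf{k}$ satisfying $|\mathbf{k}|^2 \leq \Lambda$. Additionally, if $V \neq 0$ then $\theta > 0$ is also needed. Let
$$\mu \equiv \frac{\theta}{\alpha}, \quad \alpha \neq 0 \tag{104}$$
The above condition implies that either
$$\alpha, \mu > 0 \tag{105}$$
or
$$V \equiv 0, \quad \alpha > 0, \quad \mu > -1 \tag{106}$$
or
$$\alpha < 0, \quad \mu < -\Lambda, \quad \theta > 0 \tag{107}$$
The last condition depends on the truncation so it is not physically relevant. Condition (106) is a possibility when there is no large scale zonal flow, but it will not be pursued in this paper. We focus on (105), under which we can use:
$$\overline{V} = -\frac{\beta}{\mu}, \qquad \overline{\psi}'_\Lambda(\mathbf{x}, t) = \sum_{1 \leq |\mathbf{k}|^2 \leq \Lambda} \hat{\overline{\psi}}_k\, e^{i\mathbf{k}\cdot\mathbf{x}} \tag{108}$$
with
$$\hat{\overline{\psi}}_k = \frac{\hat{h}_k}{\mu + |\mathbf{k}|^2} \tag{109}$$
As we have seen in 4.3 the solution $\left(\overline{V}, \overline{\psi}'_\Lambda\right)$ under condition (105) is non-linearly stable.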
Thus we can write the Gibbs measure (103) in terms of (100) as:
$$\begin{aligned}
G_{\alpha,\mu} &= C \exp\left(-\left(\alpha \mathcal{E}_\Lambda + \alpha\mu E_\Lambda\right)\right) = C \exp\left(-\alpha \left(\mathcal{E}_\Lambda + \mu E_\Lambda\right)\right) = C \exp\left(-\alpha W_\mu(\delta q, \delta V)\right) \\
&= C \exp\left(-\alpha \left[\frac{\mu}{2} \left(V - \overline{V}\right)^2 + \frac{1}{2} \sum_{1 \leq |\mathbf{k}|^2 \leq \Lambda} |\mathbf{k}|^2 \left(|\mathbf{k}|^2 + \mu\right) \left|\hat{\psi}_k - \hat{\overline{\psi}}_k\right|^2\right]\right)
\end{aligned} \tag{110}$$
or equivalently:
$$G_{\alpha,\mu}(\mathbf{X}) = \prod_{j=1}^{N} G^j_{\alpha,\mu}(X_j)$$
We see that the Gibbs measure for the dynamics is a product of Gaussian probabilities; the good news is that it is a measure with which we are well familiar. In fact calculating the ensemble average or mean state of $(V, \psi'_\Lambda)$ is straightforward, and it is given by:
$$\langle \mathbf{X} \rangle = \int_{\mathbb{R}^N} \mathbf{X}\, G_{\alpha,\mu}(\mathbf{X})\, d\mathbf{X} = \left(\overline{V}, \hat{\overline{\psi}}_{k_1}, \ldots, \hat{\overline{\psi}}_{k_M}\right) \tag{111}$$
Thus, it turns out that the non-linearly stable steady state solution is in fact the ensemble average or mean state of the system. Assuming ergodicity, we can predict that for long-enough time $T$ the time-average of the solutions converges to the non-linearly stable steady state solution of the QG truncated equations; more precisely we have:
$$\lim_{T \to \infty} \frac{1}{T} \int_{T_0}^{T_0 + T} \psi'_\Lambda(\mathbf{x}, t)\, dt = \overline{\psi}'_\Lambda \tag{112}$$
$$\lim_{T \to \infty} \frac{1}{T} \int_{T_0}^{T_0 + T} V(t)\, dt = \overline{V} = -\frac{\beta}{\mu} \tag{113}$$
Equations (112) and (113) show that, regardless of the initial conditions, a specific large-scale coherent mean flow will develop from the truncated equations (91), converging after a long enough time period to the most probable mean state, which is the non-linearly stable exact steady state solution. To see that this solution is a specific flow, recall that the steady state solution $\left(\overline{V}, \overline{\psi}'_\Lambda\right)$ is given by (108) and (109); for a given bottom topography $h$ and a given constant $\mu$ the solution converges to a specific steady-state solution. The underlying physics explaining this is conservation of angular momentum: topography and vorticity are anti-correlated. More will be said about choosing $\mu$ in section 4.6.1.
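The mean state in (108) and (109) is simple to evaluate once the topography coefficients are known. The sketch below is a hypothetical illustration rather than a computation from the paper: it draws a random real topography, truncates it to $1 \leq |\mathbf{k}|^2 \leq \Lambda$, builds the mean state for assumed values of $\alpha$, $\mu$ and $\beta$ satisfying (105), checks the mean field equation (98) mode by mode, and samples $V$ from its Gaussian factor in (110). All parameter values, the random topography and the variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, Lam = 32, 50                      # grid size and truncation (illustrative)
alpha, mu, beta = 2.0, 1.5, 1.0      # hypothetical parameters with alpha, mu > 0

k = np.fft.fftfreq(n, d=1.0 / n)
K1, K2 = np.meshgrid(k, k, indexing="ij")
Ksq = K1**2 + K2**2
mask = (Ksq >= 1) & (Ksq <= Lam)     # retained modes 1 <= |k|^2 <= Lam

# Random real topography, truncated to the retained modes (zero mean).
h_hat = np.fft.fft2(rng.standard_normal((n, n))) / n**2 * mask

# Mean state, following (108) and (109).
V_bar = -beta / mu
psi_bar_hat = h_hat / (mu + Ksq)

# Mean field equation (98) in Fourier space:
# -|k|^2 psi_hat + h_hat - mu * psi_hat should vanish on every mode.
residual = np.max(np.abs(-Ksq * psi_bar_hat + h_hat - mu * psi_bar_hat))

# In (110) the factor for V is Gaussian with mean V_bar and variance 1/(alpha*mu).
V_samples = V_bar + rng.standard_normal(200_000) / np.sqrt(alpha * mu)

print("residual of (98):", residual)
print("sample mean of V:", V_samples.mean(), " predicted:", V_bar)
```

The sample mean of $V$ approximates $\overline{V} = -\beta/\mu$, which is the first component of (111); the remaining components could be sampled in the same way from their own Gaussian factors.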
4.6 The limit $\Lambda \to \infty$

It is when we take the limit $\Lambda \to \infty$ that equations (91) and (92) converge to (89); thus we are interested in investigating the asymptotic behavior of the invariant Gibbs measures as well as their mean states as $\Lambda$ increases. This is the continuum limit. How can the parameters $\alpha = \alpha_\Lambda$ and $\mu = \mu_\Lambda$ be chosen so that they satisfy the energy and enstrophy constraints as $\Lambda \to \infty$? We begin by noticing that we do not need the truncated energy/enstrophy constraints to hold. It is in the limit of the cut-off wave number approaching infinity that the constraints need to be satisfied:
$$\lim_{\Lambda \to \infty} \langle E_\Lambda \rangle = E_0, \qquad \lim_{\Lambda \to \infty} \langle \mathcal{E}_\Lambda \rangle = \mathcal{E}_0 \tag{114}$$
Here it is possible to exchange the integral and the limit because $E_\Lambda \leq E$, $\forall \Lambda$ (the same holds for $\mathcal{E}$), and both the enstrophy and energy are bounded. The second thing to notice is that the energy and enstrophy are related by an imposed constraint. The mean enstrophy $\mathcal{E}_0$ must be greater than the minimum enstrophy associated with a given energy level. To be precise:
$$\mathcal{E}_0 > \min_{E(\psi) = E_0} \mathcal{E}(\psi) = \mathcal{E}_*(E_0) \tag{115}$$
We only consider the case where the above inequality, as written, is strict. The reason is that when there is equality in (115) any state satisfying the energy/enstrophy constraint must be an enstrophy-minimizing (selective decay) state, which does not have much randomness and is therefore not of interest (see e.g. section 4.5 of Majda and Wang 2006 and references therein).
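Next we recall the definition of variance: $\operatorname{var}[x] \equiv \mathrm{E}\left[x^2\right] - \mathrm{E}[x]^2$. Because we know the Gaussian distribution well, it is straightforward to calculate that:
$$\int V^2\, G_{\alpha,\mu} = (\alpha\mu)^{-1} + \frac{\beta^2}{\mu^2}, \qquad \int \left|\hat{\psi}_k\right|^2 G_{\alpha,\mu} = \left(\alpha |\mathbf{k}|^2 \left(\mu + |\mathbf{k}|^2\right)\right)^{-1} + \left|\hat{\overline{\psi}}_k\right|^2 \tag{116}$$
We can therefore observe that the ensemble averages of the energy and enstrophy separate naturally into two parts: one that corresponds to the mean state and one that corresponds to the fluctuation part:
$$\begin{aligned}
\langle E_\Lambda \rangle &= \langle E_\Lambda \rangle_G = \overline{E}_\Lambda + E'_\Lambda \\
\overline{E}_\Lambda &= \frac{1}{2} \frac{\beta^2}{\mu^2} + \frac{1}{2} \sum_{1 \leq |\mathbf{k}|^2 \leq \Lambda} \frac{|\mathbf{k}|^2 \left|\hat{h}_k\right|^2}{\left(\mu + |\mathbf{k}|^2\right)^2} \\
E'_\Lambda &= \frac{1}{2\alpha} \left(\mu^{-1} + \sum_{1 \leq |\mathbf{k}|^2 \leq \Lambda} \frac{1}{\mu + |\mathbf{k}|^2}\right)
\end{aligned} \tag{117}$$
while for the enstrophy
$$\begin{aligned}
\langle \mathcal{E}_\Lambda \rangle &= \langle \mathcal{E}_\Lambda \rangle_G = \overline{\mathcal{E}}_\Lambda + \mathcal{E}'_\Lambda \\
\overline{\mathcal{E}}_\Lambda &= -\frac{\beta^2}{\mu} + \frac{1}{2} \sum_{1 \leq |\mathbf{k}|^2 \leq \Lambda} \frac{\mu^2 \left|\hat{h}_k\right|^2}{\left(\mu + |\mathbf{k}|^2\right)^2} \\
\mathcal{E}'_\Lambda &= \frac{1}{2\alpha} \sum_{1 \leq |\mathbf{k}|^2 \leq \Lambda} \frac{|\mathbf{k}|^2}{\mu + |\mathbf{k}|^2}
\end{aligned} \tag{118}$$
The fluctuation part of the energy is isotropic and independent of the mean state. For large $\Lambda$ the summation may be approximated by integration so that
$$\begin{aligned}
E'_\Lambda &= \frac{1}{2\alpha\mu} + \frac{1}{2\alpha} \sum_{1 \leq |\mathbf{k}|^2 \leq \Lambda} \frac{1}{\mu + |\mathbf{k}|^2} \\
&\cong \frac{1}{2\alpha\mu} + \frac{2\pi}{2\alpha} \int_1^{\sqrt{\Lambda}} \frac{|\mathbf{k}|}{\mu + |\mathbf{k}|^2}\, d|\mathbf{k}| \\
&= \frac{1}{2\alpha\mu} + \frac{\pi}{2\alpha} \ln\left(\mu + |\mathbf{k}|^2\right) \Big|_1^{\sqrt{\Lambda}} \\
&= \frac{1}{2\alpha\mu} + \frac{\pi}{2\alpha} \ln\left(\frac{\mu + \Lambda}{\mu + 1}\right)
\end{aligned} \tag{119}$$
where a transformation into polar coordinates has been done to carry out the double integration. Likewise the fluctuation part of the total enstrophy can be estimated as:
$$\begin{aligned}
\mathcal{E}'_\Lambda &= \frac{1}{2\alpha} \sum_{1 \leq |\mathbf{k}|^2 \leq \Lambda} \frac{|\mathbf{k}|^2}{\mu + |\mathbf{k}|^2} \\
&\cong \frac{2\pi}{2\alpha} \int_1^{\sqrt{\Lambda}} \frac{|\mathbf{k}|^3}{\mu + |\mathbf{k}|^2}\, d|\mathbf{k}| \\
&= \frac{\pi}{\alpha} \int_1^{\sqrt{\Lambda}} \frac{|\mathbf{k}| \left(\mu + |\mathbf{k}|^2\right) - \mu |\mathbf{k}|}{\mu + |\mathbf{k}|^2}\, d|\mathbf{k}| \\
&= \frac{\pi}{2\alpha} \left[\Lambda - 1 - \mu \ln\left(\frac{\mu + \Lambda}{\mu + 1}\right)\right]
\end{aligned} \tag{120}$$
With this we can see what is needed for the parameters $\alpha$ and $\mu$. If they are bounded independent of $\Lambda$ then, as $\Lambda \to \infty$, the fluctuation part of the energy and enstrophy will go to infinity, which contradicts the energy constraint. It can be shown that the parameter $\alpha$ must approach infinity; $\mu$ should remain bounded since large $\mu$ implies small geophysical influence, which contradicts physical reality (recall Coriolis acceleration is often one of the terms in the leading geophysical balance).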
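The replacement of the lattice sum by a polar integral in (119) can be examined numerically. The sketch below is an illustration with assumed values ($\mu = 1$ and a few cut-offs $\Lambda$); it compares $\sum 1/(\mu + |\mathbf{k}|^2)$ over the integer modes with $\pi \ln((\mu + \Lambda)/(\mu + 1))$. The function names are hypothetical.

```python
import numpy as np

def lattice_sum(mu, Lam):
    """Sum of 1/(mu + |k|^2) over integer modes with 1 <= |k|^2 <= Lam."""
    kmax = int(np.floor(np.sqrt(Lam)))
    k = np.arange(-kmax, kmax + 1)
    K1, K2 = np.meshgrid(k, k, indexing="ij")
    Ksq = K1**2 + K2**2
    mask = (Ksq >= 1) & (Ksq <= Lam)
    return float(np.sum(1.0 / (mu + Ksq[mask])))

def integral_estimate(mu, Lam):
    """Polar-coordinate estimate used in (119): pi * ln((mu+Lam)/(mu+1))."""
    return float(np.pi * np.log((mu + Lam) / (mu + 1.0)))

mu = 1.0  # assumed value, for illustration only
for Lam in (1e2, 1e3, 1e4):
    s = lattice_sum(mu, Lam)
    i = integral_estimate(mu, Lam)
    print(f"Lambda = {Lam:8.0f}   sum = {s:8.3f}   integral = {i:8.3f}   ratio = {s / i:.3f}")
```

Both quantities grow logarithmically with $\Lambda$, which is why a bounded $\alpha$ cannot keep the fluctuation energy finite. The two differ by a bounded offset coming from the lowest modes, so their ratio approaches one only slowly.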
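By the realizability condition (105) and the explicit formula for the energy ensemble average (117) we have that $E'_\Lambda > 0$. This further implies that for large $\Lambda$ the asymptotic behavior is:
$$\frac{\beta^2}{2\mu_\Lambda^2} \leq \overline{E}_\Lambda \leq E_0 \tag{121}$$
from where we can establish a lower bound on $\mu_\Lambda$:
$$\mu_\Lambda \geq \frac{\beta}{\sqrt{2E_0}} \tag{122}$$
Substituting into the mean-state part of the enstrophy in (118) we see that, again asymptotically for large $\Lambda$:
$$\overline{\mathcal{E}}_\Lambda \geq -\frac{\beta^2}{\mu_\Lambda} \geq -\beta\sqrt{2E_0} \tag{123}$$
With this, the fluctuation part of the ensemble enstrophy can be asymptotically bounded:
$$\mathcal{E}'_\Lambda \leq \mathcal{E}_0 - \overline{\mathcal{E}}_\Lambda \leq \mathcal{E}_0 + \beta\sqrt{2E_0} \tag{124}$$
Now that $\mu_\Lambda$ is bounded above and positive from the realizability condition (105), it follows that
$$\lim_{\Lambda \to \infty} \frac{1}{\Lambda} \ln\left(\frac{\mu_\Lambda + \Lambda}{\mu_\Lambda + 1}\right) = \lim_{\Lambda \to \infty} \frac{\ln \Lambda}{\Lambda} = 0 \tag{125}$$
Combining (125), (124) and (120) we have that asymptotically for large $\Lambda$
$$\alpha_\Lambda \geq \frac{\pi \Lambda}{2\left(\mathcal{E}_0 + \beta\sqrt{2E_0}\right)} \tag{126}$$
Use (126), (122) and the boundedness of $\mu$ in (119) to deduce that
$$E'_\Lambda \to 0, \quad \text{as } \Lambda \to \infty \tag{127}$$
Thus we conclude that all energy resides in the mean state asymptotically for large $\Lambda$. We now investigate how to choose the parameters and the asymptotic behaviour of the parameters as well as the mean states.

4.6.1 Choosing $\mu = \mu_\Lambda$

It is possible to find a unique $\mu = \mu_\Lambda > 0$ such that
$$\overline{E}_\Lambda = E\left(\overline{\psi}'_{\mu_\Lambda}, \overline{V}_{\mu_\Lambda}\right) = E_0 \tag{128}$$
Then $\left(\overline{\psi}'_{\mu_\Lambda}, \overline{V}_{\mu_\Lambda}\right)$ satisfy
$$\Delta \overline{\psi}'_{\mu_\Lambda} + h_\Lambda = \mu_\Lambda \overline{\psi}'_{\mu_\Lambda}, \qquad \overline{V}_{\mu_\Lambda} = -\frac{\beta}{\mu_\Lambda} \tag{129}$$
Clearly $\mu_\Lambda$ needs to be a non-decreasing function of $\Lambda$, since for fixed $\mu$ the mean energy state $\overline{E}_\Lambda$ in (117) is a monotonically increasing function of $\Lambda$ (and, for fixed $\Lambda$, a decreasing function of $\mu > 0$). It is easy to check that there exists a $\mu = \mu_\infty > 0$ such that
$$\lim_{\Lambda \to \infty} \mu_\Lambda = \mu \tag{130}$$
monotonically. In terms of statistical mechanics we then have a positive "temperature" for all energy levels.

4.6.2 The limit of the mean states.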
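The mean state also converges. It is easy to check, using (108), (109) and (130), that
$$\overline{\psi}'_{\mu_\Lambda} \to \overline{\psi}'_\mu = \sum \frac{\hat{h}_k}{\mu + |\mathbf{k}|^2}\, e^{i\mathbf{k}\cdot\mathbf{x}}, \qquad \overline{V}_{\mu_\Lambda} \to \overline{V}_\mu = -\frac{\beta}{\mu} \tag{131}$$
where $\mu = \mu_\infty > 0$ is the unique $\mu \in (0, \infty)$ such that the energy constraint is met by the limit mean state $\left(\overline{\psi}'_\mu, \overline{V}_\mu\right)$, that is
$$E\left(\overline{\psi}'_\mu, \overline{V}_\mu\right) = E_0 \tag{132}$$
It is also true that, thanks to (131) and (132), the limit mean state satisfies the limit mean field equation
$$\Delta \overline{\psi}'_\mu + h = \mu \overline{\psi}'_\mu, \qquad \overline{V}_\mu = -\frac{\beta}{\mu} \tag{133}$$
which means it is non-linearly stable according to section 2.15 (see e.g. proposition 2.2).

4.6.3 Choosing $\alpha = \alpha_\Lambda$ and the enstrophy constraint

For a given energy $E_0$ there exists a minimal enstrophy $\mathcal{E}_*(E_0)$, which is equal to $\mathcal{E}\left(\overline{\psi}'_\mu, \overline{V}_\mu\right)$ since $\left(\overline{\psi}'_\mu, \overline{V}_\mu\right)$ is the enstrophy-minimizing state (or selective decay state) with the topography $h$ as well as with $\beta$. On the other hand, thanks to (131) - (133),
$$\lim_{\Lambda \to \infty} \overline{\mathcal{E}}_\Lambda = \mathcal{E}\left(\overline{\psi}'_\mu, \overline{V}_\mu\right) = \mathcal{E}_*(E_0) \tag{134}$$
Combining (115) and (134) we have that for large $\Lambda$
$$\mathcal{E}_0 > \overline{\mathcal{E}}_\Lambda = \mathcal{E}\left(\overline{\psi}'_{\mu_\Lambda}, \overline{V}_{\mu_\Lambda}\right) \tag{135}$$
We now pick an $\alpha_\Lambda$ so that the enstrophy constraint is satisfied for all truncations, i.e.
$$\mathcal{E}_0 = \langle \mathcal{E}_\Lambda \rangle = \overline{\mathcal{E}}_\Lambda + \mathcal{E}'_\Lambda \tag{136}$$
which by (120) amounts to requiring
$$\alpha_\Lambda \cong \frac{\pi}{2} \frac{\Lambda}{\mathcal{E}_0 - \overline{\mathcal{E}}_\Lambda} \tag{137}$$
for $\Lambda \gg 1$. Note that $\alpha_\Lambda \to \infty$ as $\Lambda \to \infty$.

4.6.4 The energy constraint

As mentioned, the energy fluctuation goes to zero as $\Lambda \to \infty$. To be precise, use (137) in (119) to get
$$E'_\Lambda \cong \frac{\pi \ln \Lambda}{2\alpha_\Lambda} \cong \left(\mathcal{E}_0 - \overline{\mathcal{E}}_\Lambda\right) \frac{\ln \Lambda}{\Lambda} \to 0 \quad \text{as } \Lambda \to \infty \tag{138}$$
Combining (138) with (128) and (117) we can conclude that the energy constraint is satisfied in the sense of (114).

5 References

Bennett, A. (2006). Lagrangian Fluid Dynamics. Cambridge University Press.
Bouchet, F., A. Venaille (2011). Statistical mechanics of two-dimensional and geophysical flows. arXiv:1110.6245v1 [physics.flu-dyn]. http://arxiv.org/abs/1110.6245 Also published in Elsevier.
Bouchet, F., A. Venaille (2012). Applications of equilibrium statistical mechanics to atmospheres and oceans. See also http://perso.ens-lyon.fr/antoine.venaille/
Carnevale, G.F., Frederiksen, J.S. (1987).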
Nonlinear stability and statistical mechanics of flow over topography. J. Fluid Mech. 175, 157-181.
Cushman-Roisin B., Beckers J. (2011). Introduction to Geophysical Fluid Dynamics. 2nd edition. Academic Press.
DeCaria A. J., T. D. Sikora (2010). Momentum Advection and the Gradient of a Vector Field: A Discussion of Standard Notation. Journal of Atmospheric Sciences. DOI: 10.1175/2009JAS3393.1
Dubinkina S. (2010). Statistical Mechanics and Numerical Modeling of Geophysical Fluid Dynamics. PhD thesis, Universiteit van Amsterdam. Center for Computer Science and Mathematics (CWI). Thomas Stieltjes Institute for Mathematics. Published in the Journal of Computational Physics.
Durrett, R. (1996). Probability: theory and examples. 2nd ed. Duxbury Press.
Ertel, H. (1942). Ein neuer hydrodynamischer Wirbelsatz. Meteorologische Zeitschrift, 59, 277-281.
Eyink G. L. and K. R. Sreenivasan (2006). Onsager and the theory of hydrodynamic turbulence. Rev. Mod. Phys. 78, 87-135.
Gardiner-Garden R. S. (1991). New vertical modes for Dissipative Stratified Circulations with applications to Coastal Upwelling. Jour. Geoph. Res. Vol. 96, No. C5, pages 8811-8822.
Gluss, David and Weisstein, Eric W. "Lagrange Multiplier." From MathWorld, a Wolfram Web Resource. http://mathworld.wolfram.com/LagrangeMultiplier.html
Holloway, G. (1986). Eddies, waves, circulation and mixing: Statistical Geofluid Mechanics. Ann. Rev. Fluid Mech. 18: 91-147.
Kundu P. K., I. M. Cohen (2008). Fluid Mechanics. 4th Edition. Academic Press.
Kraichnan, R. H. (1975). Statistical dynamics of two-dimensional flow. J. Fluid Mech. 67, 155-175.
Majda A. J. (2003). Introduction to PDE's and Waves for the Atmosphere and Ocean. Courant Lecture Notes in Mathematics. American Mathematical Society.
Majda A. J., X. Wang (2006). Nonlinear Dynamics and Statistical Theories for Basic Geophysical Flows. Cambridge University Press.
Meyer, C. D. (2000). Matrix analysis and applied linear algebra. Society for Industrial and Applied Mathematics.
Merryfield W. J., P. F. Cummins and G. Holloway (2001). Equilibrium Statistical Mechanics of Barotropic Flow over Finite Topography. Journal of Physical Oceanography. Vol. 31, 1880-1890.
McDonald, J. N., N. A. Weiss (1999). A course in Real Analysis. Academic Press.
Modern Physics Course: Statistical Mechanics. Stanford University Youtube Channel. http://youtu.be/H1Zbp6__uNw
Pedlosky J. (1987). Geophysical Fluid Dynamics. 2nd Edition. Springer.
Price, J. F. (2006). Lagrangian and Eulerian representation of Fluid Flow: Kinematics and the Equations of Motion. Woods Hole Oceanographic Institution. http://www.whoi.edu/science/PO/people/jprice
Randall, D. A. (2010). The Evolution of Complexity in General Circulation Models. In: The Development of Atmospheric General Circulation Models: Complexity, Synthesis, and Computation, L. Donner, W. Schubert, and R. C. J. Somerville, Eds. Cambridge University Press, 272 pp.
Salmon, R., Holloway, G. and Hendershott, M. (1976). The equilibrium statistical mechanics of simple quasigeostrophic models. J. Fluid Mech. 75, 691-703.
Salmon, R. (1998). Lectures on Geophysical Fluid Dynamics. Oxford University Press.
Samelson, R. M., S. Wiggins (2006). Lagrangian transport in Geophysical Jets and Waves: the dynamical systems approach. Springer.
Samelson, R. M. (2011). The theory of large-scale ocean circulation. Cambridge University Press.
Samelson, R. M. (2012). Lagrangian motion, coherent structures, and lines of persistent material strain. Ann. Rev. Mar. Sci. Accepted for publication.
Sarig, O. (2008). Lecture Notes on Ergodic Theory - Graduate course at Penn State University. http://www.math.psu.edu/sarig/506/ErgodicNotes.pdf
Shimizu, K. (2011). A Theory of Vertical Modes in Multilayer Stratified Fluids. J. Phys. Oceanogr., 41, 1694-1707. doi: http://dx.doi.org/10.1175/2011JPO4546.1
Simić, S. N. (2005). Notes for Math 134. San Jose State University. http://www.math.sjsu.edu/~simic/Fall05/Math134/flows.pdf
Vallis, G. K.
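(2006). Atmospheric and Oceanic Fluid Dynamics. Cambridge University Press.
Venaille A., G. K. Vallis and S. M. Griffies (2012). The catalytic role of beta effect in barotropization processes. arXiv:1201.0657v1 [physics.flu-dyn].

A Existence and Uniqueness to Poisson's equation

A.1 Green's identities

Suppose that $\psi, \phi \in C^2(D) \cap C^1(\bar{D})$, where $D$ is a bounded normal domain with boundary $B$. Starting from the Gauss Divergence theorem, the following identities, known as Green's first, second and third identities, are obtained:
$$\int_D \left[\phi \Delta\psi + \nabla\phi \cdot \nabla\psi\right] d\mathbf{x} = \int_B \phi \frac{\partial \psi}{\partial n}\, d\sigma \tag{139}$$
where $\frac{\partial \psi}{\partial n} \equiv \nabla\psi \cdot \mathbf{n}$ is the outward normal derivative.
$$\int_D \left[\phi \Delta\psi - \psi \Delta\phi\right] d\mathbf{x} = \int_B \left(\phi \frac{\partial \psi}{\partial n} - \psi \frac{\partial \phi}{\partial n}\right) d\sigma \tag{140}$$
$$\int_D \Delta\psi\, d\mathbf{x} = \int_B \frac{\partial \psi}{\partial n}\, d\sigma \tag{141}$$

A.2 Maximum-Minimum principle

Let $D$ be a bounded domain and $\psi \in C^2(D) \cap C^1(\bar{D})$:

1. if $\Delta\psi \geq 0$ in $D$, then $\psi(\mathbf{x}) \leq \max_{\mathbf{y} \in B} \psi(\mathbf{y})$ for $\mathbf{x} \in D$
2. if $\Delta\psi \leq 0$ in $D$, then $\psi(\mathbf{x}) \geq \min_{\mathbf{y} \in B} \psi(\mathbf{y})$ for $\mathbf{x} \in D$
3. if $\Delta\psi = 0$ in $D$, then $\min_{\mathbf{y} \in B} \psi(\mathbf{y}) \leq \psi(\mathbf{x}) \leq \max_{\mathbf{y} \in B} \psi(\mathbf{y})$ for $\mathbf{x} \in D$

A.3 Existence and Uniqueness for Dirichlet and Robin Conditions in a bounded domain.

Let $D$ be a bounded normal domain; its closure $\bar{D} \equiv D \cup B$ includes the boundary $B$. Consider the problem
$$\Delta\psi = \zeta, \quad \mathbf{x} \in D$$
If $B$ is twice continuously differentiable then there is at most one function $\psi \in C^2(D) \cap C^1(\bar{D})$ that solves (22) with either:
$$\psi = f(\mathbf{x}), \quad \mathbf{x} \in B \quad \text{(Dirichlet)}$$
or
$$\frac{\partial \psi}{\partial n} + \alpha(\mathbf{x})\psi = f(\mathbf{x}), \quad \mathbf{x} \in B \quad \text{(Robin)}$$
where $\zeta$ is the known vorticity in equation (22), $f(\mathbf{x})$ is a given function and $\alpha(\mathbf{x}) \geq 0$ is bounded, continuous and not identically zero.

Proof: Let $\psi_1$ and $\psi_2$ be solutions to the Robin problem and define $\psi = \psi_1 - \psi_2$; then $\psi$ satisfies $\Delta\psi = 0$, $\mathbf{x} \in D$ and $\partial\psi/\partial n + \alpha\psi = 0$, $\mathbf{x} \in B$. Take $\phi = \psi$ in (139):
$$\int_D |\nabla\psi|^2\, d\mathbf{x} = \int_B \psi \frac{\partial \psi}{\partial n}\, d\sigma = -\int_B \alpha \psi^2\, d\sigma$$
The first integral is non-negative and the last integral is non-positive, so both vanish. Hence $\psi$ is constant in $D$, and since $\alpha\psi^2 = 0$ on $B$ with $\alpha$ not identically zero, this constant is zero: $\psi = 0$. Similar reasoning proves uniqueness for the Dirichlet problem.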
A.4 Existence and Uniqueness for Neumann Conditions

Under the same conditions as above it can be shown that two solutions of the Neumann problem

$$\Delta\psi = \zeta, \quad x \in D; \qquad \frac{\partial\psi}{\partial n} = f(x), \quad x \in B$$

differ at most by an additive constant. Because of the boundary condition, a further necessary condition for the existence of a solution is that Green's third identity (141) holds, that is, $\int_D \zeta\, dx = \int_B f\, d\sigma$. When the normal derivative is zero (no normal flow through the boundary), this means that the integral of the Laplacian over the domain $D$ is zero.

A.5 Unbounded domains

In a bounded domain we can easily strengthen the uniqueness result for the Dirichlet problem (i.e. drop the need for the domain to be normal) by using the Maximum-Minimum principle (see section A.2). The proof is straightforward: assume that there are two solutions $\psi_1, \psi_2$ and define $\psi = \psi_1 - \psi_2$, which satisfies $\Delta\psi = 0$ in $D$ and $\psi = 0$ on $B$. By the Maximum-Minimum principle $\psi = 0$, and so there is only one solution. Using the Maximum principle it is also possible to relax the conditions needed for the Robin problem with $\alpha(x) > 0$, so that for $\psi \in C^2(D) \cap C^1(\bar{D})$, where $D$ is a bounded domain, there is at most one solution. However, we used Green's identities in the previous section because they aid in the generalization to infinite domains.

Further requirements for infinite domains are discussed next. We start with some definitions. The statement

$$w = O\left(|x|^{-p}\right) \quad \text{as } x \to \infty$$

means that there exist constants $M$ and $a$ such that

$$|w(x)| \le M|x|^{-p} \quad \text{for } |x| \ge a$$

We define an exterior domain $D$ as the complement of a bounded domain $D_1$ that includes the origin; if the bounded domain is normal then we say that the complement is a normal exterior domain. $B$ is the boundary of $D$. Also let $K_r(c) \equiv \{x : |x - c| < r\}$; the boundary of $K_r(c)$ is $S_r(c) \equiv \{x : |x - c| = r\}$.

It can be shown that if $\psi, \phi \in C^2(D) \cap C^1(\bar{D})$ and $\psi, \phi = O\left(|x|^{-1}\right)$ and $\nabla\psi, \nabla\phi = O\left(|x|^{-2}\right)$ as $|x| \to \infty$, then the Green identities (139)-(141) hold on $D$.
The exterior Dirichlet problem for $D$ is to find a function $\psi \in C^2(D) \cap C^1(\bar{D})$ that satisfies

$$\Delta\psi = h(x) \text{ in } D, \qquad \psi = f(x) \text{ on } B, \qquad \psi \to 0 \text{ uniformly at infinity}$$

The exterior Neumann problem for $D$ is to find a function $\psi \in C^2(D) \cap C^1(\bar{D})$ that satisfies

$$\Delta\psi = h(x) \text{ in } D, \qquad \frac{\partial\psi}{\partial n} = f(x) \text{ on } B, \qquad \psi \to 0 \text{ uniformly at infinity}$$

The exterior Robin problem for $D$ is to find a function $\psi$ that satisfies Poisson's equation, vanishes at infinity and satisfies

$$\frac{\partial\psi}{\partial n} + \alpha(x)\psi = f(x) \text{ on } B$$

where $\alpha(x) \ge 0$ on $B$, $\alpha \not\equiv 0$.

Proof of uniqueness for the Dirichlet problem: Let $\psi = \psi_1 - \psi_2$ where $\psi_1, \psi_2$ are solutions to the Dirichlet problem. Let $\epsilon > 0$ and fix a point $x_0 \in D$. Choose $a > |x_0|$ large enough so that $D_1 \subset K_a(0)$ and $|\psi| < \epsilon$ on $S_a(0)$, which is possible due to the uniform convergence $\psi \to 0$ as $x \to \infty$. Since $\psi = 0$ on $B$, the Maximum principle gives $|\psi| < \epsilon$ for $x \in D \cap K_a$; in particular $|\psi(x_0)| < \epsilon$. This holds for any $\epsilon > 0$, so we conclude that $\psi(x_0) = 0$. Since $x_0$ is an arbitrary point in $D$, $\psi \equiv 0$. Thus, $\psi_1 = \psi_2$.

The proof for the Neumann problem follows from Green's first identity (139) with $\phi = \psi = \psi_1 - \psi_2$, which holds on $D$ given the decay of $\psi_1, \psi_2$ at infinity: it gives $\nabla\psi = 0$, so $\psi$ is constant, and $\psi \to 0$ at infinity forces $\psi \equiv 0$.

B Dominated Convergence Theorem

Let $(\Omega, \mathcal{A}, \mu)$ be a measure space. Suppose that $\{f_n\}_{n=1}^{\infty}$ is a sequence of complex-valued, $\mathcal{A}$-measurable functions that converge $\mu$-a.e. Further suppose that there is a nonnegative Lebesgue integrable function $g$ such that $|f_n| \le g$ $\mu$-a.e. for each $n \in \mathbb{N}$. Then

$$\int_E \lim_{n\to\infty} f_n\, d\mu = \lim_{n\to\infty} \int_E f_n\, d\mu$$

for each $E \in \mathcal{A}$. Proof: page 197 of McDonald and Weiss (1999).

B.1 Bounded Convergence Theorem

Let $(\Omega, \mathcal{A}, \mu)$ be a finite measure space. Suppose that $\{f_n\}_{n=1}^{\infty}$ is a sequence of uniformly bounded, complex-valued, $\mathcal{A}$-measurable functions that converge $\mu$-a.e. Then

$$\int_E \lim_{n\to\infty} f_n\, d\mu = \lim_{n\to\infty} \int_E f_n\, d\mu \tag{142}$$

for each $E \in \mathcal{A}$. Proof: page 199 of McDonald and Weiss (1999).
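To make the role of the hypotheses concrete, the following sketch compares two sequences on $[0,1]$ with Lebesgue measure. It is an illustration under stated assumptions rather than part of the cited proofs: the integrals are approximated with a simple midpoint rule, and the names `midpoint_integral`, `dominated` and `escaping_spike` are hypothetical. The sequence $f_n(x) = x^n$ is uniformly bounded by $g = 1$ and converges to $0$ almost everywhere, so both theorems apply and $\int f_n \to 0$. The sequence $f_n = n\,\mathbf{1}_{(0,1/n)}$ also converges to $0$ pointwise, but it has no integrable dominating function, and its integrals stay equal to $1$.

```python
def midpoint_integral(f, m=10_000):
    """Midpoint-rule approximation of the integral of f over [0, 1]."""
    h = 1.0 / m
    return h * sum(f((j + 0.5) * h) for j in range(m))


def dominated(n):
    """f_n(x) = x**n: bounded by g = 1 on [0, 1], tends to 0 for x < 1."""
    return lambda x: x**n


def escaping_spike(n):
    """f_n = n on (0, 1/n) and 0 elsewhere: tends to 0 pointwise, but no
    integrable g dominates every f_n."""
    return lambda x: float(n) if 0.0 < x < 1.0 / n else 0.0


if __name__ == "__main__":
    for n in (1, 10, 100):
        # Exact value of the first integral is 1 / (n + 1).
        print(n, midpoint_integral(dominated(n)), midpoint_integral(escaping_spike(n)))
```

For the first sequence the limit and the integral can be exchanged, as in (142); for the second they cannot, since the integral of the limit is $0$ while the limit of the integrals is $1$.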
C Ensemble average

Consider a canonical Hamiltonian system. Let $\{y_i\}_{1\le i\le N}$ represent the generalized coordinates, $\{p_i\}_{1\le i\le N}$ the conjugate momenta, and $H(\{y_i, p_i\})$ the Hamiltonian. The phase space is the $2N$-dimensional space formed by $\{y_i, p_i\}_{1\le i\le N}$, and each point $(\{y_i, p_i\})$ is called a microstate. The system is completely described if we know the coordinates and momenta of all particles. Define the time average of a physical quantity $A$ by:

$$\bar{A} = \lim_{\Delta t\to\infty} \frac{1}{\Delta t} \int_{t_0}^{t_0+\Delta t} A\left[\{y_i(t), p_i(t)\}\right] dt$$

Often we know neither the system's exact location in phase space nor its trajectory. What we know is the macroscopic state. It is therefore usual to investigate an ensemble of systems: a collection of all the microscopic systems that could possibly belong to the same macroscopic state. This means that the definition of $\bar{A}$ above needs to be reformulated in these terms. Instead of a time average over one system we calculate an average over an ensemble of equivalent systems at a fixed time. Implicitly, the assumption is that a trajectory in phase space will spend equal time intervals in all regions of a constant energy surface (i.e. an energy conservation constraint) that are accessible from the initial configuration. This property is called ergodicity. Thus we write

$$\bar{A} = \langle A\rangle_P \equiv \int_B A(\{y_i, p_i\})\, P(\{y_i, p_i\})\, dy\, dp$$

where $B$ is the whole phase space and $P$ is the probability density that a unit volume in phase space is occupied.

Ergodicity of a system tends to be extremely difficult to prove, but it is commonly believed that the parts of phase space $B$ in which motion is trapped (recirculating trajectories, i.e. regions that are inaccessible to trajectories outside them) occupy an extremely small relative volume of $B$. This seems to be the common wisdom derived from the successfully passed empirical tests of a century of statistical mechanics studies (Bouchet & Venaille 2011).
Likewise, numerical simulations are encouraging: even when it can be proved that ergodicity does not hold strictly, statistical mechanics yields useful predictions when compared with the numerical solution.

D Flow map of a differential equation

From Simić (2005). Suppose $F : \mathbb{R}^n \to \mathbb{R}^n$ is a $C^1$ vector field; then for each $X_0 \in \mathbb{R}^n$ the ODE $dX/dt = F(X)$ has a unique solution $X = X(t)$ with initial condition $X(0) = X_0$.

Definition 15 Let $\Omega \subset \mathbb{R}$. The flow $\Phi : \Omega \times \mathbb{R}^n \to \mathbb{R}^n$ of this ODE is defined by:

$$\Phi(t, X_0) = X(t)$$

Therefore the defining properties of the flow map are:

$$\Phi(0, X_0) = X_0 \quad \text{and} \quad \frac{d\Phi}{dt}(t, X_0) = F(\Phi(t, X_0)), \quad \forall t$$

Definition 16 The time-$t$ map of the flow map is $\Phi_t : \mathbb{R}^n \to \mathbb{R}^n$ and it is defined by

$$\Phi_t(X_0) = \Phi(t, X_0)$$

It is the state of the system after $t$ units of time, a transformation of $\mathbb{R}^n$ to $\mathbb{R}^n$. Often notation is abused and we say that the collection of time-$t$ maps $\Phi_t$ is the flow map. From the uniqueness of the solutions we have that

$$\Phi_0 = \mathrm{id} \quad \text{and} \quad \Phi_{t+s} = \Phi_s \circ \Phi_t \tag{143}$$

Thus, the time-$t$ map is invertible and $\Phi_t^{-1} = \Phi_{-t}$.

One of the conveniences of the definition of a flow map is that, instead of considering a particular solution $X(t)$, when we write $\Phi_t(X_0)$ we consider all the solutions as a function of the initial condition, a more global point of view. This way the variable becomes $X_0$; to avoid confusion we relabel it with a more commonly used variable name, and we write $\Phi_t(X)$ for the solution that starts at $X$ when $t = 0$.

Remark: Notice that implicit in (143) is the fact that composition of time-$t$ maps is commutative, since $\Phi_s \circ \Phi_t = \Phi_{t+s} = \Phi_{s+t} = \Phi_t \circ \Phi_s$. Clearly, keeping track of the initial conditions is necessary for this to work out. Further reading can be found in Bennet (2006).

E Entropy of a Measurable Partition

Definition 17 Let $P$ be a discrete probability measure on the sample space $A = \{a_1, \ldots, a_n\}$,

$$P = \sum_{i=1}^{n} P_i \delta_{a_i}, \quad P_i \ge 0, \quad \sum_{i=1}^{n} P_i = 1 \tag{144}$$

where $\delta_{a_i}$ is the delta function at the point $a_i$. The Shannon entropy $S(P)$ of the probability $P$ is defined as

$$S(P) = S(P_1, \ldots, P_n) = -\sum_{i=1}^{n} P_i \ln P_i \tag{145}$$

The function $S$ is the information-theoretic entropy; it was used by Shannon to measure information. To recall the intuition behind this equation, consider a "word" message as a sequence of binary digits with length $n$, i.e. we need $n$ digits to characterize it. The set $A_{2^n}$ of all words of length $n$ has $2^n = N$ elements and, clearly, the amount of information needed to characterize one element is $n = \log_2 N$. The amount of information needed to characterize an element of any set $A_N$ is $\log_2 N$ for general $N$.

Let $A = A_{N_1} \cup \cdots \cup A_{N_k}$, where the sets $A_{N_i}$ are pairwise disjoint and each set $A_{N_i}$ has $N_i$ elements. Let $P_i$ be given by $P_i = N_i/N$, where $N = \sum_i N_i$. If we know that an element of $A$ belongs to some $A_{N_i}$, we then need $\log_2 N_i$ additional information to determine it completely. Thus, the average amount of information we need to determine an element, provided that we already know the $A_{N_i}$ to which it belongs, is given by

$$\sum_i \frac{N_i}{N} \log_2 N_i = \sum_i P_i \log_2 P_i + \log_2 N \tag{146}$$

Recall that $\log_2 N$ is the information needed to determine an element of $A$ if we do not know to which $A_{N_i}$ the given element belongs. Subtracting (146) from $\log_2 N$, the corresponding average lack of information is

$$-\sum_i P_i \log_2 P_i \tag{147}$$

so now we can see that (145), which differs from (147) only by the base of the logarithm, is a measure of the lack of information. It can be shown that the Shannon entropy is unique; see Majda & Wang (2006, from which this discussion is taken), page 186, for details.

E.1 Uniqueness of Shannon's entropy

Shannon's entropy is a unique measure of the lack of information up to a positive constant. This was originally proved by Jaynes in 1957; please see Majda and Wang (2006), proposition 6.1, and corresponding references.

Proposition E.1 Let $H_n$ be a function defined on the space of discrete probability measures $PM_n(A) \equiv \{p = \sum_{i=1}^{n} p_i \delta_{a_i},\ p_i \ge 0,\ \sum_{i=1}^{n} p_i = 1\}$ over the sample space $A$ and satisfying three properties:

• $H_n(p_1, \ldots, p_n)$ is a continuous function.
• $A(n) = H_n(1/n, \ldots, 1/n)$ is monotonic increasing in $n$, i.e. $H_n$ is monotonic increasing with increasing uncertainty.
• (Composition law). If the sample space $A = \{a_1, \ldots, a_n\}$ is divided into two sets $A_1 = \{a_1, \ldots, a_k\}$ and $A_2 = \{a_{k+1}, \ldots, a_n\}$ with probabilities $w_1 = p_1 + \cdots + p_k$ and $w_2 = p_{k+1} + \cdots + p_n$ and conditional probabilities $(p_1/w_1, \ldots, p_k/w_1)$ and $(p_{k+1}/w_2, \ldots, p_n/w_2)$, then the amount of uncertainty with the information split in this way is the same as it was originally:

$$H_n(p_1, \ldots, p_n) = H_2(w_1, w_2) + w_1 H_k(p_1/w_1, \ldots, p_k/w_1) + w_2 H_{n-k}(p_{k+1}/w_2, \ldots, p_n/w_2)$$

Then $H_n$ is a positive multiple of the Shannon entropy:

$$H_n(p_1, \ldots, p_n) = K S(p_1, \ldots, p_n) = -K \sum_{i=1}^{n} p_i \ln p_i, \quad K > 0$$

E.2 Motivation

Consider a probability space $(\Omega, \mathcal{A}, \nu)$ and suppose that the probability of a particle position in $\Omega$ is given by the probability measure $\nu$. That is, for each $A \in \mathcal{A}$ the probability that the particle $p$ is in $A$ is equal to $\nu(A)$. We would like to know the position of $p$ as closely as possible, and for this purpose the concept of a measurable partition is used.

Definition 18 Let $(\Omega, \mathcal{A})$ be a measurable space and $A \in \mathcal{A}$. A finite sequence $\{A_k\}_{k=1}^{n}$ of subsets of $\Omega$ is said to be a measurable partition of $A$ if the $A_k$'s are $\mathcal{A}$-measurable, pairwise disjoint and their union is $A$. That is,

1. $A_k \in \mathcal{A}$, $k = 1, 2, \ldots, n$
2. $A_i \cap A_j = \emptyset$ for $i \ne j$
3. $\cup_{k=1}^{n} A_k = A$

Let $\mathcal{B}$ be a measurable partition of $(\Omega, \mathcal{A})$. Suppose we can extract information about the location of $p$ by answering the question "is $p$ in $A$?" for each $A \in \mathcal{B}$. We are trying to ascertain which element of $\mathcal{B}$ contains $p$. Some measurable partitions will give us more information than others. For example, if a partition reduces the probability by half when one of the two elements of the partition is chosen (i.e.
each partition element has probability $\nu = 1/2$) then we will have more information than with a partition that also has two elements, but where one has probability $\nu = 1/1000$ and the other element has the probability of the complement, unless we are very, very lucky. We need to assign a number to the amount of information gained by a measurable partition to be able to proceed rigorously. That number is called entropy.

Definition 19 Entropy of a measurable partition. Let $(\Omega, \mathcal{A}, \nu)$ be a probability space and $\mathcal{B}$ a measurable partition of $(\Omega, \mathcal{A})$. Then the entropy of $\mathcal{B}$, denoted $H(\mathcal{B})$, is defined by

$$H(\mathcal{B}) = -\sum_{A \in \mathcal{B}} \nu(A) \log \nu(A),$$

where the convention $0 \log 0 = 0$ has been adopted. Here the sum over all the elements of the measurable partition can be thought of, in a more general sense, as an integral with respect to the probability measure for that space. Notice that the negative sign is needed for the entropy to be nonnegative, since by definition $0 \le \nu(A) \le 1$, $\forall A \in \mathcal{B}$, so that $\log \nu(A) \le 0$.

F Lagrange multiplier

Theorem F.1 (Lagrange Multiplier theorem) Let $f : U \subset \mathbb{R}^n \to \mathbb{R}$ and $g : U \subset \mathbb{R}^n \to \mathbb{R}$ be given $C^1$ functions. Let $x_0 \in U$ and $g(x_0) = c$, and let $S$ be the level set for $g$ with value $c$, i.e. $S \equiv \{x \in \mathbb{R}^n : g(x) = c\}$. Assume $\nabla g(x_0) \ne 0$. If $f|_S$ has a maximum or minimum on $S$ at $x_0$ then there is a number $\Lambda$ such that:

$$\nabla f(x_0) = \Lambda \nabla g(x_0) \tag{148}$$

Remark: Together with the constraint $g(x_0) = c$, (148) gives $n+1$ equations ($g(x_0) = c$ plus the $n$ components of the gradient vectors) in $n+1$ unknowns (the Lagrange multiplier $\Lambda$ plus the $n$ components of $x_0 \in \mathbb{R}^n$).

Proof (A sketch). The tangent space of $S$ at $x_0$ is by definition the space orthogonal to $\nabla g(x_0)$. Consider a path $\sigma(t)$ that lies on $S$ such that $\sigma(0) = x_0$; then $\sigma'(0)$ is a tangent vector to $S$ at $x_0$. Since $g(\sigma(t)) = c$ for all $t$,

$$\frac{d}{dt} g(\sigma(t)) = \frac{dc}{dt} = 0$$

and at the same time, by the chain rule,

$$\left.\frac{d}{dt} g(\sigma(t))\right|_{t=0} = \nabla g(x_0) \cdot \sigma'(0)$$

So that $\nabla g(x_0) \cdot \sigma'(0) = 0$.
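If $f|_S$ has a maximum at $x_0$, then clearly $f(\sigma(t))$ has a maximum at $t = 0$; therefore $\left. df(\sigma(t))/dt \right|_{t=0} = 0$, and by the chain rule:

$$0 = \left.\frac{d}{dt} f(\sigma(t))\right|_{t=0} = \nabla f(x_0) \cdot \sigma'(0)$$

Thus $\nabla f(x_0)$ is perpendicular to the tangent at $x_0$ of every curve on $S$ through $x_0$, and so it is also perpendicular to the tangent space of $S$ at $x_0$. Since $\nabla g(x_0) \ne 0$ is perpendicular to the same tangent space, the gradients of $f$ and $g$ at $x_0$ are parallel, which is what (148) states.

It is straightforward to extend this theorem to the case with multiple constraints of the form $g_i(x_0) = c_i$ for $1 \le i \le N$. In this case we need to solve:

$$\nabla f(x_0) = \sum_{i=1}^{N} \Lambda_i \nabla g_i(x_0)$$

G Orthogonal projections

In this section we include the basic notions of an orthogonal projection; further concepts and examples may be found in the text by Meyer (2000), which we use as reference for this section. First, what is a projector (also called a projection)?

Definition 20 If $\mathbf{P}$ is idempotent (i.e. $\mathbf{P} = \mathbf{P}^2$) then $\mathbf{P}$ is called a projector.

We will also need the definition of an orthogonal complement.

Definition 21 The orthogonal complement $\mathcal{M}^\perp$ of a subset $\mathcal{M}$ of an inner-product space$^{13}$ $\mathcal{V}$ is defined as:

$$\mathcal{M}^\perp = \{x \in \mathcal{V} : \langle m \mid x\rangle = 0,\ \forall m \in \mathcal{M}\}$$

Let $\mathcal{M}$ be a subspace of a finite-dimensional inner-product space $\mathcal{V}$; then $\mathcal{V} = \mathcal{M} \oplus \mathcal{M}^\perp$, i.e. for every $v \in \mathcal{V}$ we have $v = m + n$ where $m \in \mathcal{M}$ and $n \in \mathcal{M}^\perp$, and $\mathcal{M} \cap \mathcal{M}^\perp = \{0\}$. We call $m$ the orthogonal projection of $v$ onto $\mathcal{M}$. The projector $\mathbf{P}_\mathcal{M}$ onto $\mathcal{M}$ along $\mathcal{M}^\perp$ is called the orthogonal projector onto $\mathcal{M}$. $\mathbf{P}_\mathcal{M}$ is the unique linear operator such that $\mathbf{P}_\mathcal{M} v = m$.

If the field is $\mathbb{C}$, a formula for the orthogonal projector onto $\mathcal{M}$ is given by

$$\mathbf{P}_\mathcal{M} = \mathbf{M}\left(\mathbf{M}^*\mathbf{M}\right)^{-1}\mathbf{M}^*$$

where the columns of $\mathbf{M}$ are some (any) basis for $\mathcal{M}$ and $(\cdot)^*$ denotes the conjugate transpose. We note that if $\dim \mathcal{M} = r$ then $\mathbf{M}^*\mathbf{M}$ is $r \times r$ and $\mathrm{rank}(\mathbf{M}^*\mathbf{M}) = \mathrm{rank}(\mathbf{M}) = r$; this shows that $\mathbf{M}^*\mathbf{M}$ is nonsingular. In the particular case when the columns of $\mathbf{M}$ form an orthonormal basis, $\mathbf{M}^*\mathbf{M} = \mathbf{I}$ and our formula reduces to

$$\mathbf{P}_\mathcal{M} = \mathbf{M}\mathbf{M}^*$$

Moreover, if the columns of $\mathbf{M}$ and $\mathbf{N}$ constitute orthonormal bases for $\mathcal{M}$ and $\mathcal{M}^\perp$ respectively, then $\mathbf{U} = \left(\mathbf{M} \mid \mathbf{N}\right)$ is a unitary matrix$^{14}$.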
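In this case the formula for the orthogonal projector can be written as:

$$\mathbf{P}_\mathcal{M} = \mathbf{U} \begin{pmatrix} \mathbf{I}_r & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{pmatrix} \mathbf{U}^*$$

where $\mathbf{I}_r$ is the $r \times r$ identity matrix. So the projector onto $\mathcal{M}$ built from orthonormal bases is similar to a diagonal matrix with ones and zeros. Whatever the formula we use for $\mathbf{P}_\mathcal{M}$, the formula for $\mathbf{P}_{\mathcal{M}^\perp}$ is given by:

$$\mathbf{P}_{\mathcal{M}^\perp} = \mathbf{I} - \mathbf{P}_\mathcal{M}$$

13 A vector space with an inner product.
14 A unitary matrix $\mathbf{U}_{n\times n}$ is a complex matrix whose columns (or rows) constitute an orthonormal basis for $\mathbb{C}^n$. Unitary matrices satisfy $\mathbf{U}^*\mathbf{U} = \mathbf{U}\mathbf{U}^* = \mathbf{I}$.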
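The projector formulas of this section can be checked on a small example. The following is a minimal sketch in plain Python and is not taken from Meyer (2000): it assumes a two-dimensional subspace of $\mathbb{C}^3$, so that $\mathbf{M}^*\mathbf{M}$ is $2 \times 2$ and can be inverted in closed form. The helper names (`conj_transpose`, `matmul`, `inverse_2x2`, `orthogonal_projector`, `complement`, `max_abs_diff`) and the matrix `EXAMPLE_M` are hypothetical choices made for illustration.

```python
def conj_transpose(A):
    """Conjugate transpose A* of a matrix stored as a list of rows."""
    return [[complex(A[i][j]).conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]


def matmul(A, B):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def inverse_2x2(G):
    """Closed-form inverse of a nonsingular 2 x 2 matrix."""
    (a, b), (c, d) = G
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def orthogonal_projector(M):
    """P = M (M* M)^{-1} M* for an n x 2 matrix M with independent columns."""
    Ms = conj_transpose(M)
    return matmul(matmul(M, inverse_2x2(matmul(Ms, M))), Ms)


def complement(P):
    """Projector onto the orthogonal complement, I - P."""
    n = len(P)
    return [[(1.0 if i == j else 0.0) - P[i][j] for j in range(n)]
            for i in range(n)]


def max_abs_diff(A, B):
    """Largest entrywise absolute difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


# Columns (1, 0, 1) and (0, 1, i) span a 2-dimensional subspace of C^3.
EXAMPLE_M = [[1, 0], [0, 1], [1, 1j]]

if __name__ == "__main__":
    P = orthogonal_projector(EXAMPLE_M)
    print("idempotent:", max_abs_diff(matmul(P, P), P))
    print("Hermitian: ", max_abs_diff(conj_transpose(P), P))
    print("fixes M:   ", max_abs_diff(matmul(P, EXAMPLE_M), EXAMPLE_M))
```

The three printed numbers should be at the level of rounding error: $\mathbf{P}$ is idempotent (Definition 20), equals its own conjugate transpose, and leaves the columns of $\mathbf{M}$ unchanged, while `complement(P)` annihilates them. For subspaces of higher dimension the closed-form $2 \times 2$ inverse would have to be replaced by a general linear solve.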