Advanced Engineering Mathematics Textbook

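Michael D. Greenberg

SELECTED FORMULAS

FIRST-ORDER LINEAR: $y' + p(x)y = q(x)$
General solution: $y(x) = e^{-\int p(x)\,dx}\left(\int e^{\int p(x)\,dx}\,q(x)\,dx + C\right)$
If $y(a) = b$: $y(x) = e^{-\int_a^x p(\xi)\,d\xi}\left(\int_a^x e^{\int_a^\xi p(\zeta)\,d\zeta}\,q(\xi)\,d\xi + b\right)$

LEGENDRE EQUATION: $(1 - x^2)y'' - 2xy' + \lambda y = 0$
Bounded solutions $P_n(x)$ on $-1 \le x \le 1$ if $\lambda = n(n+1)$, $n = 0, 1, 2, \ldots$

BESSEL EQUATION: $x^2y'' + xy' + (x^2 - \nu^2)y = 0$
General solution: $y(x) = AJ_\nu(x) + BY_\nu(x)$

MODIFIED BESSEL EQUATION: $x^2y'' + xy' + (-x^2 - \nu^2)y = 0$
General solution: $y(x) = AI_\nu(x) + BK_\nu(x)$

REDUCIBLE TO A BESSEL EQUATION: $(x^a y')' + b x^c y = 0$
Solution: $y(x) = x^{\nu/\alpha} Z_\nu\!\left(\alpha\sqrt{|b|}\,x^{1/\alpha}\right)$, with $\alpha = \dfrac{2}{c - a + 2}$ and $\nu = \dfrac{1 - a}{c - a + 2}$,
where $Z_\nu$ denotes $J_\nu$ and $Y_\nu$ if $b > 0$, and $I_\nu$ and $K_\nu$ if $b < 0$.

MATRICES: $A^{-1} = \dfrac{1}{\det A}\,\mathrm{adj}\,A$, $(AB)^{-1} = B^{-1}A^{-1}$, $(A^T)^{-1} = (A^{-1})^T$, $(AB)^T = B^TA^T$

CARTESIAN COORDINATES: $u = u(x, y, z)$, $\mathbf{v} = v_x\mathbf{i} + v_y\mathbf{j} + v_z\mathbf{k}$
$\mathbf{R} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$, $d\mathbf{R} = dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k}$
$dA = dy\,dz$ (constant-$x$ surface), $dx\,dz$ (constant-$y$ surface), $dx\,dy$ (constant-$z$ surface)
$dV = dx\,dy\,dz$
$\nabla u = \dfrac{\partial u}{\partial x}\mathbf{i} + \dfrac{\partial u}{\partial y}\mathbf{j} + \dfrac{\partial u}{\partial z}\mathbf{k}$
$\nabla\cdot\mathbf{v} = \dfrac{\partial v_x}{\partial x} + \dfrac{\partial v_y}{\partial y} + \dfrac{\partial v_z}{\partial z}$
$\nabla\times\mathbf{v} = \left(\dfrac{\partial v_z}{\partial y} - \dfrac{\partial v_y}{\partial z}\right)\mathbf{i} + \left(\dfrac{\partial v_x}{\partial z} - \dfrac{\partial v_z}{\partial x}\right)\mathbf{j} + \left(\dfrac{\partial v_y}{\partial x} - \dfrac{\partial v_x}{\partial y}\right)\mathbf{k}$
$\nabla^2 u = \dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial^2 u}{\partial y^2} + \dfrac{\partial^2 u}{\partial z^2}$

CYLINDRICAL COORDINATES: $u = u(r, \theta, z)$, $x = r\cos\theta$, $y = r\sin\theta$, $z = z$, $\mathbf{v} = v_r\hat{\mathbf{e}}_r + v_\theta\hat{\mathbf{e}}_\theta + v_z\hat{\mathbf{e}}_z$
$\mathbf{R} = r\hat{\mathbf{e}}_r + z\hat{\mathbf{e}}_z$, $d\mathbf{R} = dr\,\hat{\mathbf{e}}_r + r\,d\theta\,\hat{\mathbf{e}}_\theta + dz\,\hat{\mathbf{e}}_z$
$dA = r\,d\theta\,dz$ (constant-$r$ surface), $dr\,dz$ (constant-$\theta$ surface), $r\,dr\,d\theta$ (constant-$z$ surface)
$dV = r\,dr\,d\theta\,dz$
$\dfrac{d\hat{\mathbf{e}}_r}{d\theta} = \hat{\mathbf{e}}_\theta$, $\dfrac{d\hat{\mathbf{e}}_\theta}{d\theta} = -\hat{\mathbf{e}}_r$
$\nabla u = \dfrac{\partial u}{\partial r}\hat{\mathbf{e}}_r + \dfrac{1}{r}\dfrac{\partial u}{\partial\theta}\hat{\mathbf{e}}_\theta + \dfrac{\partial u}{\partial z}\hat{\mathbf{e}}_z$
$\nabla\cdot\mathbf{v} = \dfrac{1}{r}\dfrac{\partial (r v_r)}{\partial r} + \dfrac{1}{r}\dfrac{\partial v_\theta}{\partial\theta} + \dfrac{\partial v_z}{\partial z}$
$\nabla\times\mathbf{v} = \left(\dfrac{1}{r}\dfrac{\partial v_z}{\partial\theta} - \dfrac{\partial v_\theta}{\partial z}\right)\hat{\mathbf{e}}_r + \left(\dfrac{\partial v_r}{\partial z} - \dfrac{\partial v_z}{\partial r}\right)\hat{\mathbf{e}}_\theta + \dfrac{1}{r}\left(\dfrac{\partial (r v_\theta)}{\partial r} - \dfrac{\partial v_r}{\partial\theta}\right)\hat{\mathbf{e}}_z$
$\nabla^2 u = \dfrac{1}{r}\dfrac{\partial}{\partial r}\left(r\dfrac{\partial u}{\partial r}\right) + \dfrac{1}{r^2}\dfrac{\partial^2 u}{\partial\theta^2} + \dfrac{\partial^2 u}{\partial z^2}$

SPHERICAL COORDINATES: $u = u(\rho, \phi, \theta)$, $x = \rho\sin\phi\cos\theta$, $y = \rho\sin\phi\sin\theta$, $z = \rho\cos\phi$, $\mathbf{v} = v_\rho\hat{\mathbf{e}}_\rho + v_\phi\hat{\mathbf{e}}_\phi + v_\theta\hat{\mathbf{e}}_\theta$
$\mathbf{R} = \rho\hat{\mathbf{e}}_\rho$, $d\mathbf{R} = d\rho\,\hat{\mathbf{e}}_\rho + \rho\,d\phi\,\hat{\mathbf{e}}_\phi + \rho\sin\phi\,d\theta\,\hat{\mathbf{e}}_\theta$
$dA = \rho^2|\sin\phi|\,d\phi\,d\theta$ (constant-$\rho$ surface), $\rho|\sin\phi|\,d\rho\,d\theta$ (constant-$\phi$ surface), $\rho\,d\rho\,d\phi$ (constant-$\theta$ surface)
$dV = \rho^2|\sin\phi|\,d\rho\,d\phi\,d\theta$
$\dfrac{\partial\hat{\mathbf{e}}_\rho}{\partial\phi} = \hat{\mathbf{e}}_\phi$, $\dfrac{\partial\hat{\mathbf{e}}_\rho}{\partial\theta} = \sin\phi\,\hat{\mathbf{e}}_\theta$, $\dfrac{\partial\hat{\mathbf{e}}_\phi}{\partial\phi} = -\hat{\mathbf{e}}_\rho$, $\dfrac{\partial\hat{\mathbf{e}}_\phi}{\partial\theta} = \cos\phi\,\hat{\mathbf{e}}_\theta$, $\dfrac{\partial\hat{\mathbf{e}}_\theta}{\partial\theta} = -\sin\phi\,\hat{\mathbf{e}}_\rho - \cos\phi\,\hat{\mathbf{e}}_\phi$
$\nabla u = \dfrac{\partial u}{\partial\rho}\hat{\mathbf{e}}_\rho + \dfrac{1}{\rho}\dfrac{\partial u}{\partial\phi}\hat{\mathbf{e}}_\phi + \dfrac{1}{\rho\sin\phi}\dfrac{\partial u}{\partial\theta}\hat{\mathbf{e}}_\theta$
$\nabla\cdot\mathbf{v} = \dfrac{1}{\rho^2}\dfrac{\partial(\rho^2 v_\rho)}{\partial\rho} + \dfrac{1}{\rho\sin\phi}\dfrac{\partial(\sin\phi\,v_\phi)}{\partial\phi} + \dfrac{1}{\rho\sin\phi}\dfrac{\partial v_\theta}{\partial\theta}$
$\nabla\times\mathbf{v} = \dfrac{1}{\rho\sin\phi}\left(\dfrac{\partial(\sin\phi\,v_\theta)}{\partial\phi} - \dfrac{\partial v_\phi}{\partial\theta}\right)\hat{\mathbf{e}}_\rho + \dfrac{1}{\rho}\left(\dfrac{1}{\sin\phi}\dfrac{\partial v_\rho}{\partial\theta} - \dfrac{\partial(\rho v_\theta)}{\partial\rho}\right)\hat{\mathbf{e}}_\phi + \dfrac{1}{\rho}\left(\dfrac{\partial(\rho v_\phi)}{\partial\rho} - \dfrac{\partial v_\rho}{\partial\phi}\right)\hat{\mathbf{e}}_\theta$
$\nabla^2 u = \dfrac{1}{\rho^2}\left[\dfrac{\partial}{\partial\rho}\left(\rho^2\dfrac{\partial u}{\partial\rho}\right) + \dfrac{1}{\sin\phi}\dfrac{\partial}{\partial\phi}\left(\sin\phi\dfrac{\partial u}{\partial\phi}\right) + \dfrac{1}{\sin^2\phi}\dfrac{\partial^2 u}{\partial\theta^2}\right]$

AREA ELEMENT:
If $x = x(u, v)$, $y = y(u, v)$, $z = z(u, v)$: $dA = \sqrt{EG - F^2}\,du\,dv$, where $E = x_u^2 + y_u^2 + z_u^2$, $F = x_ux_v + y_uy_v + z_uz_v$, $G = x_v^2 + y_v^2 + z_v^2$
If $z = f(x, y)$: $dA = \sqrt{1 + f_x^2 + f_y^2}\,dx\,dy$

VOLUME ELEMENT: $dV = \left|\dfrac{\partial(x, y, z)}{\partial(u, v, w)}\right| du\,dv\,dw = \left|\det\begin{pmatrix} x_u & x_v & x_w \\ y_u & y_v & y_w \\ z_u & z_v & z_w \end{pmatrix}\right| du\,dv\,dw$

LEIBNIZ RULE: $\dfrac{d}{dt}\displaystyle\int_{a(t)}^{b(t)} f(x, t)\,dx = \int_{a(t)}^{b(t)} \dfrac{\partial f}{\partial t}\,dx + b'(t)f(b(t), t) - a'(t)f(a(t), t)$

FOURIER SERIES:
$f(x)$ $2\ell$-periodic: $f(x) = a_0 + \displaystyle\sum_{n=1}^{\infty}\left(a_n\cos\dfrac{n\pi x}{\ell} + b_n\sin\dfrac{n\pi x}{\ell}\right)$,
$a_0 = \dfrac{1}{2\ell}\displaystyle\int_{-\ell}^{\ell} f(x)\,dx$, $a_n = \dfrac{1}{\ell}\displaystyle\int_{-\ell}^{\ell} f(x)\cos\dfrac{n\pi x}{\ell}\,dx$, $b_n = \dfrac{1}{\ell}\displaystyle\int_{-\ell}^{\ell} f(x)\sin\dfrac{n\pi x}{\ell}\,dx$
$f(x)$ defined only on $0 < x < L$:
HRC: $f(x) = a_0 + \displaystyle\sum_{n=1}^{\infty} a_n\cos\dfrac{n\pi x}{L}$, $a_0 = \dfrac{1}{L}\displaystyle\int_0^L f(x)\,dx$, $a_n = \dfrac{2}{L}\displaystyle\int_0^L f(x)\cos\dfrac{n\pi x}{L}\,dx$
HRS: $f(x) = \displaystyle\sum_{n=1}^{\infty} b_n\sin\dfrac{n\pi x}{L}$, $b_n = \dfrac{2}{L}\displaystyle\int_0^L f(x)\sin\dfrac{n\pi x}{L}\,dx$
QRC: $f(x) = \displaystyle\sum_{n=1,3,\ldots}^{\infty} a_n\cos\dfrac{n\pi x}{2L}$, $a_n = \dfrac{2}{L}\displaystyle\int_0^L f(x)\cos\dfrac{n\pi x}{2L}\,dx$
QRS: $f(x) = \displaystyle\sum_{n=1,3,\ldots}^{\infty} b_n\sin\dfrac{n\pi x}{2L}$, $b_n = \dfrac{2}{L}\displaystyle\int_0^L f(x)\sin\dfrac{n\pi x}{2L}\,dx$

FOURIER INTEGRAL: $f(x) = \displaystyle\int_0^{\infty}\left[a(\omega)\cos\omega x + b(\omega)\sin\omega x\right]d\omega$ $(-\infty < x < \infty)$,
$a(\omega) = \dfrac{1}{\pi}\displaystyle\int_{-\infty}^{\infty} f(x)\cos\omega x\,dx$, $b(\omega) = \dfrac{1}{\pi}\displaystyle\int_{-\infty}^{\infty} f(x)\sin\omega x\,dx$

FOURIER TRANSFORM: $F\{f(x)\} = \hat{f}(\omega) = \displaystyle\int_{-\infty}^{\infty} f(x)e^{-i\omega x}\,dx$, $F^{-1}\{\hat{f}(\omega)\} = f(x) = \dfrac{1}{2\pi}\displaystyle\int_{-\infty}^{\infty} \hat{f}(\omega)e^{i\omega x}\,d\omega$

LAPLACE TRANSFORM: $L\{f(t)\} = F(s) = \displaystyle\int_0^{\infty} f(t)e^{-st}\,dt$

DIVERGENCE THEOREM: $\displaystyle\int_V \nabla\cdot\mathbf{v}\,dV = \int_S \hat{\mathbf{n}}\cdot\mathbf{v}\,dA$

GREEN'S FIRST IDENTITY: $\displaystyle\int_V\left(\nabla u\cdot\nabla v + u\nabla^2 v\right)dV = \int_S u\dfrac{\partial v}{\partial n}\,dA$

GREEN'S SECOND IDENTITY: $\displaystyle\int_V\left(u\nabla^2 v - v\nabla^2 u\right)dV = \int_S\left(u\dfrac{\partial v}{\partial n} - v\dfrac{\partial u}{\partial n}\right)dA$

STOKES'S THEOREM: $\displaystyle\int_S \hat{\mathbf{n}}\cdot\nabla\times\mathbf{v}\,dA = \oint_C \mathbf{v}\cdot d\mathbf{R}$

GREEN'S THEOREM: $\displaystyle\int_S\left(\dfrac{\partial Q}{\partial x} - \dfrac{\partial P}{\partial y}\right)dA = \oint_C P\,dx + Q\,dy$

STURM-LIOUVILLE EQUATION: $[p(x)y']' + q(x)y + \lambda w(x)y = 0$

SINE INTEGRAL FUNCTION: $\mathrm{Si}(x) = \displaystyle\int_0^x \dfrac{\sin t}{t}\,dt$, $\mathrm{Si}(\infty) = \dfrac{\pi}{2}$

EXPONENTIAL INTEGRAL FUNCTION: $E_1(x) = \displaystyle\int_x^{\infty} \dfrac{e^{-t}}{t}\,dt$ $(x > 0)$

GAMMA FUNCTION: $\Gamma(x) = \displaystyle\int_0^{\infty} t^{x-1}e^{-t}\,dt$ $(x > 0)$

ERROR FUNCTION: $\mathrm{erf}(x) = \dfrac{2}{\sqrt{\pi}}\displaystyle\int_0^x e^{-t^2}\,dt$, $\mathrm{erf}(\infty) = 1$

TRIGONOMETRIC FUNCTION IDENTITIES:
$\cos x = \dfrac{e^{ix} + e^{-ix}}{2}$, $\sin x = \dfrac{e^{ix} - e^{-ix}}{2i}$
$\cos(ix) = \cosh x$, $\sin(ix) = i\sinh x$
$\cos^2 x + \sin^2 x = 1$
$\cos(A \pm B) = \cos A\cos B \mp \sin A\sin B$
$\sin(A \pm B) = \sin A\cos B \pm \sin B\cos A$
$\cos A\cos B = [\cos(A + B) + \cos(A - B)]/2$
$\sin A\cos B = [\sin(A + B) + \sin(A - B)]/2$
$\sin A\sin B = [\cos(A - B) - \cos(A + B)]/2$

HYPERBOLIC FUNCTION IDENTITIES:
$\cosh x = \dfrac{e^x + e^{-x}}{2}$, $\sinh x = \dfrac{e^x - e^{-x}}{2}$
$\cosh(ix) = \cos x$, $\sinh(ix) = i\sin x$
$\cosh^2 x - \sinh^2 x = 1$
$\cosh(A \pm B) = \cosh A\cosh B \pm \sinh A\sinh B$
$\sinh(A \pm B) = \sinh A\cosh B \pm \sinh B\cosh A$

TAYLOR SERIES:
$f(x) = f(a) + f'(a)(x - a) + \dfrac{f''(a)}{2!}(x - a)^2 + \dfrac{f'''(a)}{3!}(x - a)^3 + \cdots$
$\dfrac{1}{1 - x} = 1 + x + x^2 + \cdots$ (Geometric Series), $|x| < 1$
$e^x = 1 + x + \dfrac{x^2}{2!} + \dfrac{x^3}{3!} + \cdots$, $|x| < \infty$
$\cos x = 1 - \dfrac{x^2}{2!} + \dfrac{x^4}{4!} - \cdots$, $|x| < \infty$
$\sin x = x - \dfrac{x^3}{3!} + \dfrac{x^5}{5!} - \cdots$, $|x| < \infty$

Advanced Engineering Mathematics
SECOND EDITION
Michael D. Greenberg
Department of Mechanical Engineering
University of Delaware, Newark, Delaware
PRENTICE HALL, Upper Saddle River, New Jersey 07458

Library of Congress Cataloging-in-Publication Data
Greenberg, Michael D., date-
Advanced engineering mathematics / Michael D. Greenberg. - 2nd ed.
p. cm.
Includes bibliographical references and index.
ISBN 0-13-321431-1
1. Engineering mathematics. I. Title.
TA330.G725 1998
515'.14--dc21 97-43585 CIP

Technical Consultant: Dr. E.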
Murat Sozer
Acquisition editor: George Lobell
Editorial director: Tim Bozik
Editor-in-chief: Jerome Grant
Editorial assistant: Gale Epps
Executive managing editor: Kathleen Schiaparelli
Managing editor: Linda Mihatov Behrens
Production editor: Nick Romanelli
Director of creative services: Paula Maylahn
Art manager: Gus Vibal
Art director / cover designer: Jayne Conte
Cover photos: Timothy Hursley
Marketing manager: Melody Marcus
Marketing assistant: Jennifer Pan
Assistant vice president of production and manufacturing: David Riccardi
Manufacturing buyer: Alan Fischer

© 1998, 1988 by Prentice-Hall, Inc.
Simon & Schuster / A Viacom Company
Upper Saddle River, New Jersey 07458

All rights reserved. No part of this book may be reproduced, in any form or by any means, without permission in writing from the publisher.

Printed in the United States of America.

ISBN 0-13-321431-1

Prentice-Hall International (UK) Limited, London
Prentice-Hall of Australia Pty. Limited, Sydney
Prentice-Hall Canada, Inc., Toronto
Prentice-Hall Hispanoamericana, S.A., Mexico
Prentice-Hall of India Private Limited, New Delhi
Prentice-Hall of Japan, Inc., Tokyo
Simon & Schuster Asia Pte, Ltd., Singapore
Editora Prentice-Hall do Brasil, Ltda., Rio de Janeiro

Contents

Part I: Ordinary Differential Equations

1 INTRODUCTION TO DIFFERENTIAL EQUATIONS
  1.1 Introduction
  1.2 Definitions
  1.3 Introduction to Modeling

2 EQUATIONS OF FIRST ORDER
  2.1 Introduction
  2.2 The Linear Equation
    2.2.1 Homogeneous case
    2.2.2 Integrating factor method
    2.2.3 Existence and uniqueness for the linear equation
    2.2.4 Variation-of-parameter method
  2.3 Applications of the Linear Equation
    2.3.1 Electrical circuits
    2.3.2 Radioactive decay; carbon dating
    2.3.3 Population dynamics
    2.3.4 Mixing problems
  2.4 Separable Equations
    2.4.1 Separable equations
    2.4.2 Existence and uniqueness (optional)
    2.4.3 Applications
    2.4.4 Nondimensionalization (optional)
  2.5 Exact Equations and Integrating Factors
    2.5.1 Exact differential equations
    2.5.2 Integrating factors
  Chapter 2 Review

3 LINEAR DIFFERENTIAL EQUATIONS OF SECOND ORDER AND HIGHER
  3.1 Introduction
  3.2 Linear Dependence and Linear Independence
  3.3
    3.3.1
    3.3.2
  3.4
    3.4.1
    3.4.2
    3.4.4
    3.4.5
  3.6
    3.6.1 Cauchy–Euler equation
    3.6.2 Reduction of order (optional)
    3.6.3 Factoring the operator (optional)
  3.7 Solution of Nonhomogeneous Equation
    3.7.1 General solution
    3.7.2 Undetermined coefficients
    3.7.3 Variation of parameters
    3.7.4 Variation of parameters for higher-order equations (optional)
  3.8 Application to Harmonic Oscillator: Forced Oscillation
    3.8.1 Undamped case
    3.8.2 Damped case
  3.9 Systems of Linear Differential Equations
    3.9.1 Examples
    3.9.2 Existence and uniqueness
    3.9.3 Solution by elimination
  Chapter 3 Review

4 POWER SERIES SOLUTIONS
  4.1 Introduction
  4.2 Power Series Solutions
    4.2.1 Review of power series
    4.2.2 Power series solution of differential equations
  4.3 The Method of Frobenius
    4.3.1 Singular points
    4.3.2 Method of Frobenius
  4.4 Legendre Functions
    4.4.1 Legendre polynomials
    4.4.2 Orthogonality of the $P_n$'s
    4.4.3 Generating functions and properties
  4.5 Singular Integrals; Gamma Function
    4.5.1 Singular integrals
    4.5.2 Gamma function
    4.5.3 Order of magnitude
  4.6
    4.6.1 $\nu \ne$ integer
    4.6.2 $\nu =$ integer
    4.6.3 General solution of Bessel equation
    4.6.4 Hankel functions (optional)
    4.6.5 Modified Bessel equation
    4.6.6 Equations reducible to Bessel equations
  Chapter 4 Review

5
  5.1 Introduction
  5.2 Calculation of the Transform
  5.3 Properties of the Transform
  5.4 Application to the Solution of Differential Equations
  5.5 Discontinuous Forcing Functions; Heaviside Step Function
  5.6 Impulsive Forcing Functions; Dirac Impulse Function (Optional)
  5.7 Additional Properties
  Chapter 5 Review

6
  6.1 Introduction
  6.2 Euler's Method
  6.3 Improvements: Midpoint Rule and Runge–Kutta
    6.3.1 Midpoint rule
    6.3.2 Second-order Runge–Kutta
    6.3.3 Fourth-order Runge–Kutta
    6.3.4 Empirical estimate of the order (optional)
    6.3.5 Multi-step and predictor-corrector methods (optional)
  6.4 Application to Systems and Boundary-Value Problems
    6.4.1 Systems and higher-order equations
    6.4.2 Linear boundary-value problems
  6.5 Stability and Difference Equations
    6.5.1 Introduction
    6.5.2 Stability
    6.5.3 Difference equations (optional)
  Chapter 6 Review

7
  7.1 Introduction
  7.2 The Phase Plane
  7.3 Singular Points and Stability
    7.3.1 Existence and uniqueness
    7.3.2 Singular points
    7.3.3 The elementary singularities and their stability
    7.3.4 Nonelementary singularities
  7.4 Applications
    7.4.1 Singularities of nonlinear systems
    7.4.2 Applications
    7.4.3 Bifurcations
  7.5 Limit Cycles, van der Pol Equation, and the Nerve Impulse
    7.5.1 Limit cycles and the van der Pol equation
    7.5.2 Application to the nerve impulse and visual perception
  7.6 The Duffing Equation: Jumps and Chaos
    7.6.1 Duffing equation and the jump phenomenon
    7.6.2 Chaos
  Chapter 7 Review

Part II: Linear Algebra

8 SYSTEMS OF LINEAR ALGEBRAIC EQUATIONS; GAUSS ELIMINATION
  8.1 Introduction
  8.2 Preliminary Ideas and Geometrical Approach
  8.3 Solution by Gauss Elimination
    8.3.1 Motivation
    8.3.2 Gauss elimination
    8.3.3 Matrix notation
    8.3.4 Gauss–Jordan reduction
    8.3.5 Pivoting
  Chapter 8 Review

9 VECTOR SPACE
  9.1 Introduction
  9.2 Vectors; Geometrical Representation
  9.3 Introduction of Angle and Dot Product
  9.4 n-Space
  9.5 Dot Product, Norm, and Angle for n-Space
    9.5.1 Dot product, norm, and angle
    9.5.2 Properties of the dot product
    9.5.3 Properties of the norm
    9.5.4 Orthogonality
    9.5.5 Normalization
  9.6 Generalized Vector Space
    9.6.1 Vector space
    9.6.2 Inclusion of inner product and/or norm
  9.7 Span and Subspace
  9.8 Linear Dependence
  9.9 Bases, Expansions, Dimension
    9.9.1 Bases and expansions
    9.9.2 Dimension
    9.9.3 Orthogonal bases
  9.10 Best Approximation
    9.10.1 Best approximation and orthogonal projection
    9.10.2 Kronecker delta
  Chapter 9 Review

10 MATRICES AND LINEAR EQUATIONS
  10.1 Introduction
  10.2 Matrices and Matrix Algebra
  10.3 The Transpose Matrix
  10.4 Determinants
  10.5 Rank; Application to Linear Dependence and to Existence and Uniqueness for $Ax = c$
    10.5.1 Rank
    10.5.2 Application of rank to the system $Ax = c$
  10.6 Inverse Matrix, Cramer's Rule, Factorization
    10.6.1 Inverse matrix
    10.6.2 Application to a mass-spring system
    10.6.3 Cramer's rule
    10.6.4 Evaluation of $A^{-1}$ by elementary row operations
    10.6.5 LU-factorization
  10.7 Change of Basis (Optional)
  10.8 Vector Transformation (Optional)
  Chapter 10 Review

11 THE EIGENVALUE PROBLEM
  11.1 Introduction
  11.2 Solution Procedure and Applications
    11.2.1 Solution and applications
    11.2.2 Application to elementary singularities in the phase plane
  11.3 Symmetric Matrices
    11.3.1 Eigenvalue problem $Ax = \lambda x$
    11.3.2 Nonhomogeneous problem $Ax = \Lambda x + c$ (optional)
  11.4 Diagonalization
  11.5 Application to First-Order Systems with Constant Coefficients (optional)
  11.6 Quadratic Forms (Optional)
  Chapter 11 Review

12 EXTENSION TO COMPLEX CASE (OPTIONAL)
  12.1 Introduction
  12.2 Complex n-Space
  12.3 Complex Matrices
  Chapter 12 Review

Part III: Scalar and Vector Field Theory

13 DIFFERENTIAL CALCULUS OF FUNCTIONS OF SEVERAL VARIABLES
  13.1 Introduction
  13.2 Preliminaries
    13.2.1 Functions
    13.2.2 Point set theory definitions
  13.3 Partial Derivatives
  13.4 Composite Functions and Chain Differentiation
  13.5 Taylor's Formula and Mean Value Theorem
    13.5.1 Taylor's formula and Taylor series for $f(x)$
    13.5.2 Extension to functions of more than one variable
  13.6 Implicit Functions and Jacobians
    13.6.1 Implicit function theorem
    13.6.2 Extension to multivariable case
    13.6.3 Jacobians
    13.6.4 Applications to change of variables
  13.7 Maxima and Minima
    13.7.1 Single variable case
    13.7.2 Multivariable case
    13.7.3 Constrained extrema and Lagrange multipliers
  13.8 Leibniz Rule
  Chapter 13 Review

14
  14.1 Introduction
  14.2 Dot and Cross Product
  14.3 Cartesian Coordinates
  14.4 Multiple Products
    14.4.1 Scalar triple product
    14.4.2 Vector triple product
  14.5 Differentiation of a Vector Function of a Single Variable
  14.6 Non-Cartesian Coordinates (Optional)
    14.6.1 Plane polar coordinates
    14.6.2 Cylindrical coordinates
    14.6.3 Spherical coordinates
    14.6.4 Omega method
  Chapter 14 Review

15
  15.1 Introduction
  15.2 Curves and Line Integrals
    15.2.1 Curves
    15.2.2 Arclength
    15.2.3 Line integrals
  15.3 Double and Triple Integrals
    15.3.1 Double integrals
    15.3.2 Triple integrals
  15.4 Surfaces
    15.4.1 Parametric representation of surfaces
    15.4.2 Tangent plane and normal
  15.5 Surface Integrals
    15.5.1 Area element $dA$
    15.5.2 Surface integrals
  15.6 Volumes and Volume Integrals
    15.6.1 Volume element $dV$
    15.6.2 Volume integrals
  Chapter 15 Review

16 SCALAR AND VECTOR FIELD THEORY
  16.1 Introduction
  16.2 Preliminaries
    16.2.1 Topological considerations
    16.2.2 Scalar and vector fields
  16.3 Divergence
  16.4 Gradient
  16.5 Curl
  16.6 Combinations; Laplacian
  16.7
    16.7.1 Cylindrical coordinates
    16.7.2 Spherical coordinates
  16.8 Divergence Theorem
    16.8.1 Divergence theorem
    16.8.2 Two-dimensional case
    16.8.3 Non-Cartesian coordinates (optional)
  16.9 Stokes's Theorem
    16.9.1 Line integrals
    16.9.2 Stokes's theorem
    16.9.3 Green's theorem
    16.9.4 Non-Cartesian coordinates (optional)
  16.10 Irrotational Fields
    16.10.1 Irrotational fields
    16.10.2 Non-Cartesian coordinates
  Chapter 16 Review

Part IV: Fourier Methods and Partial Differential Equations

17
  17.1 Introduction
  17.2 Even, Odd, and Periodic Functions
  17.3 Fourier Series of a Periodic Function
    17.3.1 Fourier series
    17.3.2 Euler's formulas
    17.3.3 Applications
  17.4 Half- and Quarter-Range Expansions
  17.5 Manipulation of Fourier Series (Optional)
  17.6 Vector Space Approach
  17.7 The Sturm–Liouville Theory
    17.7.1 Sturm–Liouville problem
    17.7.2 Lagrange identity and proofs (optional)
  17.8 Periodic and Singular Sturm–Liouville Problems
  17.9 Fourier Integral
  17.10 Fourier Transform
    17.10.2 Properties and applications
  17.11 Fourier Cosine and Sine Transforms, and Passage
    17.11.1 Cosine and sine transforms
  Chapter 17 Review

18 DIFFUSION EQUATION
  18.1 Introduction
  18.2 Preliminary Concepts
    18.2.1 Definitions
    18.2.3 Diffusion equation and modeling
  18.3 Separation of Variables
    18.3.1 The method of separation of variables
    18.3.2 Verification of solution (optional)
    18.3.3 Use of Sturm–Liouville theory (optional)
  18.4 Fourier and Laplace Transforms (Optional)
  18.5 The Method of Images (Optional)
    18.5.1 Illustration of the method
    18.5.2 Mathematical basis for the method
  18.6 Numerical Solution
    18.6.1 The finite-difference method
  Chapter 18 Review

19 WAVE EQUATION
  19.1 Introduction
  19.2 Separation of Variables; Vibrating String
    19.2.1 Solution by separation of variables
    19.2.2 Traveling wave interpretation
    19.2.3 Using Sturm–Liouville theory (optional)
  19.3 Separation of Variables; Vibrating Membrane
  19.4 Vibrating String; d'Alembert's Solution
    19.4.1 d'Alembert's solution
    19.4.2 Use of images
    19.4.3 Solution by integral transforms (optional)

20
  20.1
  20.2
  20.3
    20.3.1 Plane polar coordinates
    20.3.2 Cylindrical coordinates (optional)
    20.3.3 Spherical coordinates (optional)
  20.4
  20.5
    20.5.1 Rectangular domains
    20.5.2 Nonrectangular domains
    20.5.3 Iterative algorithms (optional)

Part V: Complex Variable Theory

21
  21.1
  21.2
  21.3
    21.3.1 Preliminary ideas
    21.3.2 Exponential function
    21.3.3 Trigonometric and hyperbolic functions
    21.3.4 Application of complex numbers to integration and the solution of differential equations
  21.4
    21.4.1 Polar form
    21.4.2 Integral powers of $z$ and de Moivre's formula
    21.4.3 Fractional powers
    21.4.4 The logarithm of $z$
    21.4.5 General powers of $z$
    21.4.6 Obtaining single-valued functions by branch cuts
    21.4.7 More about branch cuts (optional)

22
  22.1
  22.2
  22.3
  22.4 Additional Mappings and Applications
  22.5 More General Boundary Conditions
  22.6 Applications to Fluid Mechanics
  Chapter 22 Review

23 THE COMPLEX INTEGRAL CALCULUS
  23.1 Introduction
  23.2 Complex Integration
    23.2.1 Definition and properties
    23.2.2 Bounds
  23.3 Cauchy's Theorem
  23.4 Fundamental Theorem of the Complex Integral Calculus
  23.5 Cauchy Integral Formula
  Chapter 23 Review

24 TAYLOR SERIES, LAURENT SERIES, AND THE RESIDUE THEOREM
  24.1 Introduction
  24.2 Complex Series and Taylor Series
    24.2.1 Complex series
    24.2.2 Taylor series
  24.3 Laurent Series
  24.4 Classification of Singularities
  24.5 Residue Theorem
    24.5.1 Residue theorem
    24.5.2 Calculating residues
    24.5.3 Applications of the residue theorem
  Chapter 24 Review

REFERENCES

APPENDICES
  A Review of Partial Fraction Expansions
  B Existence and Uniqueness of Solutions of Systems of Linear Algebraic Equations
  C Table of Laplace Transforms
  D Table of Fourier Transforms
  E Table of Fourier Cosine and Sine Transforms
  F Table of Conformal Maps

ANSWERS TO SELECTED EXERCISES

INDEX

Preface

Purpose and Prerequisites

This book is written primarily for a single- or multi-semester course in applied mathematics for students of engineering or science, but it is also designed for self-study and reference. By self-study we do not necessarily mean outside the context of a formal course. Even within a course setting, if the text can be read independently and understood, then more pedagogical options become available to the instructor.

The prerequisite is a full year sequence in calculus, but the book is written so as to be usable at both the undergraduate level and also for first-year graduate students of engineering and science. The flexibility that permits this broad range of use is described below in the section on Course Use.

Changes from the First Edition

Principal changes from the first edition are as follows:

1. Part I on ordinary differential equations. In the first edition we assumed that the reader had previously completed a first course in ordinary differential equations.
However, differential equations is traditionally the first major topic in books on advanced engineering mathematics so we begin this edition with a seven chapter sequence on ordinary differential equations. Just as the book becomes increasingly sophisticated from beginning to end, these seven chapters are written that way as well, with the final chapter on nonlinear equations being the most challenging.

2. Incorporation of a computer-algebra-system. Several powerful computer environments are available, such as Maple, Mathematica, and MATLAB. We selected Maple, as a representative and user-friendly software. In addition to an Instructor's Manual, a brief student supplement is also available, which presents parallel discussions of Mathematica and MATLAB.

3. Revision of existing material and format. Pedagogical improvements that evolved through eight years of class use led to a complete rewriting rather than minor modifications of the text. The end-of-section exercises are refined and expanded.

Format

The book is comprised of five parts:

I Ordinary Differential Equations
II Linear Algebra
III Multivariable Calculus and Field Theory
IV Fourier Methods and Partial Differential Equations
V Complex Variable Theory

This breakdown is explicit only in the Contents, to suggest the major groupings of the chapters. Within the text there are no formal divisions between parts, only between chapters. Each chapter begins with an introduction and (except for the first chapter) ends with a chapter review. Likewise, each section ends with a review called a closure, which is often followed by a section on computer software that discusses the Maple commands that are relevant to the material covered in that section; see, for example, pages 29~3. Subsections are used extensively to offer the instructor more options in terms of skipping or including material.
Course Use at Different Levels

To illustrate how the text might serve at different levels, we begin by outlining how we have been using it for courses at the University of Delaware: a sophomore/junior level mathematics course for mechanical engineers, and a first-year graduate level two-semester sequence in applied mathematics for students of mechanical, civil, and chemical engineering, and materials science. We denote these courses as U, G1, and G2, respectively.

Sophomore/junior level course (U). This course follows the calculus/differential equations sequence taught in the mathematics department. We cover three main topics:

Linear Algebra: Chapter 8, Sections 9.1-9.5 (plus a one lecture overview of Secs. 9.7-9.9), 10.1-10.6, and 11.1-11.3. The focus is n-space and applications, such as the mass-spring system in Sec. 10.6.2, Markov population dynamics in Sec. 11.2, and orthogonal modes of vibration in Sec. 11.3.

Field Theory: Chapters 14 and 16. The heart of this material is Chapter 16. Having skipped Chapter 15, we distribute a one page "handout" on the area element formula (18) in Sec. 15.5, since that formula is needed for the surface integrals that occur in Chapter 16. Emphasis is placed on the physical applications in the sections on the divergence theorem and irrotational fields since those applications lead to two of the three chief partial differential equations that will be studied in the third part of the course: the diffusion equation and the Laplace equation.

Fourier Series and PDE's: Sections 17.1-17.4, 18.1, 18.3, 18.6.1, 19.1-19.2.2, 20.1-20.3.1, 20.5.1-20.5.2. Solutions are by separation of variables, using only the half- and quarter-range Fourier series, and by finite differences.

First semester of graduate level course (G1). Text coverage is as follows: Sections 4.4-4.6, 5.1-5.6, Chapter 9, Secs. 11.1-11.4, 11.6, 13.5-13.8, 14.6, 15.4-15.6, Chapter 16, Secs. 17.3, 17.6-17.11, 18.1-18.3.1, 18.3.3-18.4, 19.1-19.2, 20.1-20.4.
As in "U" we do cover the important Chapter 16, although quickly. Otherwise, the approach complements that in "U." For instance, in Chapter 9, "U" focuses on n-space, but "G1" focuses on generalized vector space (Sec. 9.6), to get ready for the Sturm–Liouville theory (Section 17.7); in Chapter 11 we emphasize the more advanced sections on diagonalization and quadratic forms, as well as Section 11.3.2 on the eigenvector expansion method in finite-dimensional space, so we can use that method to solve nonhomogeneous partial differential equations in later chapters. Likewise, in covering Chapter 17 we assume that the student has worked with Fourier series before so we move quickly, emphasizing the vector space approach (Sec. 17.6), the Sturm–Liouville theory, and the Fourier integral and transform. When we come to partial differential equations we use Sturm–Liouville eigenfunction expansions (rather than the half- and quarter-range formulas that suffice in "U"), integral transforms, delta functions, and Bessel and Legendre functions. In solving the diffusion equation in "U" we work only with the homogeneous equation and constant end conditions, but in "G1" we discuss the nonhomogeneous equation and nonconstant end conditions, uniqueness, and so on; these topics are discussed in the exercises.

Second semester of graduate level course (G2). In the second semester we complete the partial differential equation coverage with the methods of images and Green's functions, then turn to complex variable theory, the variational calculus, and an introduction to perturbation methods. For Green's functions we use a "handout," and for the variational calculus and perturbation methods we copy the relevant chapters from M.D. Greenberg, Foundations of Applied Mathematics (Englewood Cliffs, NJ: Prentice Hall, 1978).
(If you are interested in using any of these materials please contact the College Mathematics Editor office at Prentice-Hall, Inc., One Lake Street, Upper Saddle River, NJ 07458.) Text coverage is as follows: Chapters 21-24 on complex variable theory; then we return to PDE's, first covering Secs. 18.5-18.6, 19.3-19.4, and 20.3.2-20.4 that were skipped in "G1"; "handouts" on Green's functions, perturbation methods, and the variational calculus.

Shorter courses and optional Sections. A number of sections and subsections are listed as Optional in the Contents, as a guide to instructors in using this text for shorter or more introductory courses. In the chapters on field theory, for example, one could work only with Cartesian coordinates, and avoid the more difficult non-Cartesian case, by omitting those optional sections. We could have labeled the Sturm–Liouville theory section (17.7) as optional but chose not to, because it is such an important topic. Nonetheless, if one wishes to omit it, as we do in "U," that is possible, since subsequent use of the Sturm–Liouville theory in the PDE chapters is confined to optional sections and exercises.

Let us mention Chapter 4, in particular, since its development of series solutions, the method of Frobenius, and Legendre and Bessel functions might seem more detailed than you have time for in your course. One minimal route is to cover only Sections 4.2.2 on power series solutions of ordinary differential equations (ODE's) and 4.4.1 on Legendre polynomials, since the latter does not depend on the more detailed Frobenius material in Section 4.3. Then one can have Legendre functions available when the Laplace equation is studied in spherical coordinates. You might also want to cover Bessel functions but do not want to use class time to go through the Frobenius material. In my own course ("G1") I deal with Bessel functions by using a "handout" that is simpler and shorter, which complements the more thorough treatment in the text.
Exercises

Exercises are of different kinds and arranged, typically, as follows. First, and usually near the beginning of the exercise group, are exercises that follow up on the text or fill in gaps or relate to proofs of theorems stated in that section, thus engaging the student more fully in the reading (e.g., Exercises 1-3 in Section 7.2, Exercise 8 in Section 16.8). Second, there are usually numerous "drill type" exercises that ask the reader to mimic steps or calculations that are essentially like those demonstrated in the text (e.g., there are 19 matrices to invert by hand in Exercise 1 of Section 10.6, and again by computer software in Exercise 3). Third, there are exercises that require the use of a computer, usually employing software that is explained at the end of the section or in an earlier section; these vary from drill type (e.g., Exercise 1, Section 10.6) to more substantial calculations (e.g., Exercise 15, Section 19.2). Fourth, there are exercises that involve physical applications (e.g., Exercises 8, 9, and 12 of Section 16.10, on the stream function, the entropy of an ideal gas, and integrating the equation of motion of fluid mechanics to obtain the Bernoulli equation). And, fifth, there are exercises intended to extend the text and increase its value as a reference book. In these, we usually guide the student through the steps so that the exercise becomes more usable for subsequent reference or self-study (e.g., see Exercises 17-22 of Section 18.3).

Answers to selected exercises (which are denoted in the text by underlining the exercise number) are provided at the end of the book; a more complete set is available for instructors in the Instructor's Manual.

Specific Pedagogical Decisions

In Chapter 2 we consider the linear first-order equation and then the case of separable first-order equations.
It is tempting to reverse the order, as some authors do, but we prefer to elevate the linear/nonlinear distinction, which grows increasingly important in engineering mathematics; to do that, it seems best to begin with the linear equation.

It is stated, at the beginning of Chapter 3 on linear differential equations of second order and higher, that the reader is expected to be familiar with the theory of the existence and uniqueness of solutions of linear algebraic equations, especially the role of the determinant of the coefficient matrix, even though this topic occurs later in the text. The instructor is advised to handle this need either by assigning, as prerequisite reading, the brief summary of the needed information given in Appendix B or, if a tighter blending of the differential equation and linear algebra material is desired, by covering Sections 8.1-10.6 before continuing with Chapter 3.

Similarly, it is stated at the beginning of Chapter 3 that an elementary knowledge of the complex plane and complex numbers is anticipated. If the class does not meet that prerequisite, then Section 21.2 should be covered before Chapter 3. Alternatively, we could have made that material the first section of Chapter 3, but it seemed better to keep the major topics together; in this case, to keep the complex variable material together.

Some authors prefer to cover second-order equations in one chapter and then higher-order equations in another. My opinion about that choice is that: (i) it is difficult to grasp clearly the second-order case (especially insofar as the case of repeated roots is concerned) without seeing the extension to higher order, and (ii) the higher-order case can be covered readily, so that it becomes more efficient to cover both cases simultaneously.

Finally, let us explain why Chapter 8, on systems of linear algebraic equations and Gauss elimination, is so brief.
Just as one discusses the real number axis before discussing functions that map one real axis to another, it seems best to discuss vectors before discussing matrices, which map one vector space into another. But to discuss vectors, span, linear dependence, bases, and expansions, one needs to know the essentials regarding the existence and uniqueness of solutions of systems of linear algebraic equations. Thus, Chapter 8 is intended merely to suffice until, having introduced matrices in Chapter 10, we can provide a more complete discussion.

Appendices

Appendix A reviews partial fraction expansions, needed in the application of Laplace and Fourier transforms. Appendix B summarizes the theory of the existence and uniqueness of solutions of linear algebraic equations, especially the role of the determinant of the coefficient matrix, and is a minimum prerequisite for Chapter 3. Appendices C through F are tables of transforms and conformal maps.

Instructor's Manual

An Instructor's Manual will be available to instructors from the office of the Mathematics Editor, College Department, Prentice-Hall, Inc., 1 Lake Street, Upper Saddle River, NJ 07458. Besides solutions to exercises, this manual contains additional pedagogical ideas for lecture material and some additional coverage, such as the Fast Fourier Transform, that can be used as "handouts."

Acknowledgements

I am grateful for a great deal of support in writing this second edition, but especially to Dr. E. Murat Sozer, who acted as a Technical Consultant. Dr. Sozer's work on the LaTeX preparation of text and the figures went far beyond his original desire to learn the LaTeX system, and his generous help was always extraordinary in quantity and quality. Sincere thanks also to my mathematics editor, George Lobell, for his insight and support, to Nick Romanelli in charge of production, to Anita Z.
Hoover at the University of Delaware for extensive help with the LaTeX computer system, and to these outside reviewers of the developing manuscript: Gregory Adams (Bucknell University), James M. W. Brownjohn (Nanyang Technical University), Melvin G. Davidson (Western Washington University), John H. Ellison (Grove City College), Mark Farris (Midwestern State University), Amitabha Ghosh (Rochester Institute of Technology), Evans M. Harrell, II (Georgia Tech.), Allen Hesse (Rochester Institute of Technology), Chung-yau Lam (Nanyang Technical University), Mohan Manoharan (Nanyang Technical University), James G. McDaniel (Boston University), Carruth McGehee (Louisiana State University), William Moss (Clemson University), Jean-Paul Nicol (Auburn University), John A. Pfaltzgraff (University of North Carolina, Chapel Hill), Mohammad Tavakoli (Chaffey College), David E. Weidner (University of Delaware), and Jingyi Zhu (University of Utah). I also thank these graduate students in this department for their help with working and debugging exercises: Rishikesh Bhalerao, Nadeem Faiz, Santosh Prabhu, Rahul Rao, and Zhaohui Chen. I'm grateful to my wife, Yisraela, for her deep support and love when this task looked like more than I could handle, and for assuming many of my responsibilities, to give me the needed time. I dedicate this book to her. Most of all, I am grateful to the Lord for bringing this book back to life and watching over all aspects of its writing and production: “From whence cometh my help? My help cometh from the Lord, who made heaven and earth.” (Psalm 121)

Michael D. Greenberg

Chapter 1
Introduction to Differential Equations

1.1 Introduction

The mathematical formulation of problems in engineering and science usually leads to equations involving derivatives of one or more unknown functions. Such equations are called differential equations. Consider, for instance, the motion of a body of mass $m$ along a straight line, which we designate as an $x$ axis.
Let the mass be subjected to a force $F(t)$ along that axis, where $t$ is the time. Then according to Newton’s second law of motion
$$m\frac{d^2x}{dt^2} = F(t), \tag{1}$$
where $x(t)$ is the mass’s displacement measured from the origin. If we prescribe the displacement $x(t)$ and wish to determine the force $F(t)$ required to produce that displacement, then the solution is simple: according to (1), we merely differentiate the given $x(t)$ twice and multiply by $m$. However, if we know the applied force $F(t)$ and wish to determine the displacement $x(t)$ that results, then (1) is a “differential equation” on $x(t)$ since it involves the derivative, more precisely the second derivative, of the unknown function $x(t)$ with respect to $t$. To solve for $x$ we need to “undo” the differentiations. That is, we need to integrate (1), twice in fact. For definiteness and simplicity, suppose that $F(t) = F_0$ is a constant. Then, integrating (1) once with respect to $t$ gives
$$m\frac{dx}{dt} = F_0 t + A, \tag{2}$$
where $A$ is an arbitrary constant of integration, and integrating again gives
$$mx = \frac{F_0}{2}t^2 + At + B,$$
or,
$$x(t) = \frac{1}{m}\left(\frac{F_0}{2}t^2 + At + B\right). \tag{3}$$
The constants of integration, $A$ and $B$, can be found from (2) and (3) if the displacement $x$ and velocity $dx/dt$ are prescribed at the initial time ($t = 0$). If both $x(0)$ and $\frac{dx}{dt}(0)$ are zero, for instance, then (by setting $t = 0$) we find from (2) that $A = 0$, and then from (3) that $B = 0$. Thus, (3) gives the solution as $x(t) = F_0 t^2/2m$, and this solution holds for all $t \ge 0$. Unfortunately, most differential equations cannot be solved this easily, that is, by merely undoing the derivatives. For instance, suppose that the mass is restrained by a coil spring that supplies a restoring force proportional to the displacement $x$ with constant of proportionality $k$ (Fig. 1). Then in place of (1), the differential equation governing the motion is
$$m\frac{d^2x}{dt^2} = -kx + F(t)$$
or,
$$m\frac{d^2x}{dt^2} + kx = F(t). \tag{4}$$
After one integration, (4) becomes
$$m\frac{dx}{dt} + k\int x(t)\,dt = \int F(t)\,dt + A, \tag{5}$$
where $A$ is the constant of integration.
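As an aside, the constant-force result (3) can be checked with a short numerical sketch before continuing with (5). The function names, the trapezoidal rule, and the sample values below are illustrative assumptions, not part of the text; the constants follow from (2) and (3) at $t = 0$ as $A = m\,x'(0)$ and $B = m\,x(0)$.

```python
def displacement(t, F0, m, x0=0.0, v0=0.0):
    """Closed form (3); A and B are fixed by the initial displacement x0 and velocity v0."""
    A = m * v0  # from (2) at t = 0:  m x'(0) = A
    B = m * x0  # from (3) at t = 0:  x(0) = B / m
    return (F0 * t**2 / 2 + A * t + B) / m


def integrate_twice(F, m, t_end, n=1000, x0=0.0, v0=0.0):
    """'Undo' the two derivatives in m x'' = F(t) by repeated trapezoidal quadrature."""
    dt = t_end / n
    x, v = x0, v0
    for i in range(n):
        t = i * dt
        # first integration: acceleration -> velocity
        v_new = v + 0.5 * (F(t) + F(t + dt)) / m * dt
        # second integration: velocity -> displacement
        x += 0.5 * (v + v_new) * dt
        v = v_new
    return x


if __name__ == "__main__":
    F0, m, t_end = 3.0, 2.0, 4.0  # hypothetical values, chosen only for illustration
    print(displacement(t_end, F0, m))
    print(integrate_twice(lambda t: F0, m, t_end))
```

The two printed values should agree to rounding error. The same quadrature works for any prescribed $F(t)$, whereas the spring term in (4) involves the unknown $x(t)$ itself and cannot be handled this way.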
Since $F(t)$ is a prescribed function, the integral of $F(t)$ can be evaluated, but since $x(t)$ is the unknown, the integral of $x(t)$ cannot be evaluated, and we cannot proceed with our solution-by-integration. Thus, we see that solving differential equations is not merely a matter of undoing the derivatives by direct integration. Indeed, the theory and technique involved are considerable, and will occupy us for these first seven chapters.

1.2 Definitions

In this section we introduce some of the basic terminology.

Differential equation. By a differential equation we mean an equation containing one or more derivatives of the function under consideration. Here are some examples of differential equations that we will study in later chapters:
$$m\frac{d^2x}{dt^2} + kx = F(t), \tag{1}$$
$$L\frac{d^2i}{dt^2} + \frac{1}{C}\,i = \frac{dE}{dt}, \tag{2}$$
$$\frac{d^2\theta}{dt^2} + \frac{g}{l}\sin\theta = 0, \tag{3}$$
$$\frac{dx}{dt} = cx, \tag{4}$$
$$\frac{d^2y}{dx^2} = C\sqrt{1 + \left(\frac{dy}{dx}\right)^2}, \tag{5}$$
$$EI\frac{d^4y}{dx^4} = -w(x). \tag{6}$$

[Figure 1. Electrical circuit, equation (2).]

Equation (1) is the differential equation governing the linear displacement $x(t)$ of a body of mass $m$, subjected to an applied force $F(t)$ and a restraining spring of stiffness $k$, as mentioned in the preceding section. Equation (2) governs the current $i(t)$ in an electrical circuit containing an inductor with inductance $L$, a capacitor with capacitance $C$, and an applied voltage source of strength $E(t)$ (Fig. 1), where $t$ is the time. Equation (3) governs the angular motion $\theta(t)$ of a pendulum of length $l$, under the action of gravity, where $g$ is the acceleration of gravity and $t$ is the time (Fig. 2). Equation (4) governs the population $x(t)$ of a single species, where $t$ is the time and $c$ is a net birth/death rate constant. Equation (5) governs the shape of a flexible cable or string, hanging under the action of gravity, where $y(x)$ is the deflection and $C$ is a constant that depends upon the mass density of the cable and the tension at the midpoint $x = 0$ (Fig. 3).
Finally, equation (6) governs the deflection $y(x)$ of a beam subjected to a loading $w(x)$ (Fig. 4), where $E$ and $I$ are physical constants of the beam material and cross section, respectively.

[Figure 3. Hanging cable, equation (5).]

[Figure 4. Loaded beam, equation (6).]

Ordinary and partial differential equations. We classify a differential equation as an ordinary differential equation if it contains ordinary derivatives with respect to a single independent variable, and as a partial differential equation if it contains partial derivatives with respect to two or more independent variables. Thus, equations (1)-(6) are ordinary differential equations (often abbreviated as ODE’s). The independent variable is $t$ in (1)-(4) and $x$ in (5) and (6). Some representative and important partial differential equations (PDE’s) are as follows:
$$\alpha^2\frac{\partial^2 u}{\partial x^2} = \frac{\partial u}{\partial t}, \tag{7}$$
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0, \tag{8}$$
$$c^2\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right) = \frac{\partial^2 u}{\partial t^2}, \tag{9}$$
$$\frac{\partial^4 u}{\partial x^4} + 2\frac{\partial^4 u}{\partial x^2\,\partial y^2} + \frac{\partial^4 u}{\partial y^4} = 0. \tag{10}$$
Equation (7) is the heat equation, governing the time-varying temperature distribution $u(x, t)$ in a one-dimensional rod or slab; $x$ locates the point under consideration within the material, $t$ is the time, and $\alpha^2$ is a material property called the diffusivity. Equation (8) is the Laplace equation, governing the steady-state temperature distribution $u(x, y, z)$ within a three-dimensional body; $x, y, z$ are the coordinates of the point within the material. Equation (9) is the wave equation, governing the deflection $u(x, y, t)$ of a vibrating membrane such as a drum head. Equation (10) is the biharmonic equation, governing the stream function $u(x, y)$ in the case of the slow (creeping) motion of a viscous fluid such as a film of wet paint. Besides the possibility of having more than one independent variable, there could be more than one dependent variable.
For instance,
$$\begin{aligned}
\frac{dx_1}{dt} &= -(k_{21} + k_{31})x_1 + k_{12}x_2 + k_{13}x_3,\\
\frac{dx_2}{dt} &= k_{21}x_1 - (k_{12} + k_{32})x_2 + k_{23}x_3,\\
\frac{dx_3}{dt} &= k_{31}x_1 + k_{32}x_2 - (k_{13} + k_{23})x_3
\end{aligned} \tag{11}$$
is a set, or system, of three ODE’s governing the three unknowns $x_1(t), x_2(t), x_3(t)$; (11) arises in chemical kinetics, where $x_1, x_2, x_3$ are concentrations of three reacting chemical species, such as in a combustion chamber, where the $k_{ij}$’s are reaction rate constants, and where the reactions are, in chemical jargon, first-order reactions. Similarly,
$$\frac{\partial E_1}{\partial x} + \frac{\partial E_2}{\partial y} = \frac{\sigma(x, y)}{\epsilon}, \qquad \frac{\partial E_1}{\partial y} - \frac{\partial E_2}{\partial x} = 0 \tag{12}$$
is a system of two PDE’s governing the two unknowns $E_1(x, y)$ and $E_2(x, y)$, which are the $x$ and $y$ components of the electric field intensity, respectively, $\sigma(x, y)$ is the charge distribution density, and $\epsilon$ is the permittivity; these are Maxwell’s equations governing two-dimensional electrostatics. At this point, we limit our subsequent attention to ordinary differential equations. We will not return to partial differential equations until much later on in this book. Thus, we will generally omit the adjective “ordinary,” for brevity, and will speak only of “differential equations” over the next several chapters.

Order. We define the order of a differential equation as the order of the highest derivative therein. Thus, (4) is of first order, (1), (2), (3), and (5) are of second order, (6) is of fourth order, and (11) is a system of first-order ODE’s. More generally,
$$F\left(x, u(x), u'(x), u''(x), \ldots, u^{(n)}(x)\right) = 0 \tag{13}$$
is said to be an $n$th-order differential equation on the unknown $u(x)$, where $n$ is the order of the highest derivative present in (13). Here, we use the standard prime notation for derivatives: $u'(x)$ denotes $du/dx$, $u''(x)$ denotes the second derivative, ..., and $u^{(n)}(x)$ denotes the $n$th derivative. In the fourth-order differential equation (6), for instance, in which the dependent variable is $y$ rather than $u$, $F(x, y, y', y'', y''', y'''') = EIy'''' + w(x)$, which happens not to contain $y, y', y'', y'''$.

Solution.
A function is said to be a solution of a differential equation, over a particular domain of the independent variable, if its substitution into the equation reduces that equation to an identity everywhere within that domain.

EXAMPLE 1. The function $y(x) = 4\sin x - x\cos x$ is a solution of the differential equation
$$y'' + y = 2\sin x \tag{14}$$
on the entire $x$ axis because its substitution into (14) yields $(-4\sin x + 2\sin x + x\cos x) + (4\sin x - x\cos x) = 2\sin x$, which is an identity for all $x$. Note that we said “a” solution rather than “the” solution since there are many solutions of (14):
$$y(x) = A\sin x + B\cos x - x\cos x \tag{15}$$
is a solution for any values of the constants $A$ and $B$, as is verified by substitution of (15) into (14). [In a later chapter, we will be in a position to derive the solution (15), and to show that it is the most general possible solution, that is, that every solution of (14) can be expressed in the form (15).]

EXAMPLE 2. The function $y(x) = 1/x$ is a solution of the differential equation
$$y' + y^2 = 0 \tag{16}$$
over any interval that does not contain the origin since its substitution into (16) gives $-1/x^2 + 1/x^2 = 0$, which relation is an identity, provided that $x \ne 0$.

EXAMPLE 3. Whereas (14) admits an infinity of solutions [one for each choice of $A$ and $B$ in (15)], the equation
$$|y'| + |y| + 3 = 0 \tag{17}$$
evidently admits none since the two nonnegative terms and one positive term cannot possibly sum to zero for any choice of $y(x)$.

In applications, however, one normally expects that if a problem is formulated carefully then it should indeed have a solution, and that the solution should be unique, that is, there should be one and only one. Thus, the issues of existence (Does the equation have any solution?) and uniqueness (If it does have a solution, is that solution unique?) are of considerable interest.

Initial-value problems and boundary-value problems.
Besides the differential equation to be satisfied, the unknown function is often subjected to conditions at one or more points on the interval under consideration. Conditions specified at a single point (often the left end point of the interval) are called initial conditions, and the differential equation together with those initial conditions is called an initial-value problem. Conditions specified at both ends are called boundary conditions, and the differential equation together with the boundary conditions is called a boundary-value problem. For initial-value problems the independent variable is often the time, though not necessarily, and for boundary-value problems the independent variable is often a space variable.

EXAMPLE 4. Straight-Line Motion of a Mass. Consider once again the problem of predicting the straight-line motion of a body of mass $m$ subjected to a force $F(t)$. According to Newton's second law of motion, the governing differential equation on the displacement $x(t)$ is $mx'' = F(t)$. Besides the differential equation, suppose that we wish to impose the conditions $x(0) = 0$ and $x'(0) = V$; that is, the initial displacement and velocity are $0$ and $V$, respectively. Then the complete problem statement is the initial-value problem
$$mx'' = F(t) \quad (0 \le t < \infty), \qquad x(0) = 0, \quad x'(0) = V. \tag{18}$$
That is, $x(t)$ is to satisfy the differential equation $mx'' = F(t)$ on the interval $0 \le t < \infty$ and the initial conditions $x(0) = 0$ and $x'(0) = V$.

EXAMPLE 5. Deflection of a Loaded Cantilever Beam. Consider the deflection $y(x)$ of a cantilever beam of length $L$, under a loading $w(x)$ newtons per meter (Fig. 5). Using the so-called Euler beam theory, one finds that the governing problem is as follows:
$$EIy'''' = -w(x) \quad (0 \le x \le L), \qquad y(0) = 0, \quad y'(0) = 0, \quad y''(L) = 0, \quad y'''(L) = 0, \tag{19}$$
where $E$ and $I$ are known physical constants.

[Figure 5. Loaded cantilever beam.]

The appended conditions are boundary conditions because some are specified at one end, and some at the other end, and (19) is therefore a boundary-value problem.
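To make the boundary-value problem (19) concrete, here is a minimal sketch for the special case of a uniform load $w(x) = w$ (an assumption made only for this illustration). The polynomial used below, $y(x) = -\frac{w}{24EI}\,(x^4 - 4Lx^3 + 6L^2x^2)$, is the standard closed form for that case, and the code simply substitutes it into the differential equation and the four boundary conditions. The function name and numerical values are hypothetical.

```python
def cantilever_uniform_load(x, w, E, I, L):
    """Return (y, y', y'', y''', y'''') for y = -w/(24 E I) * (x^4 - 4 L x^3 + 6 L^2 x^2)."""
    c = -w / (24.0 * E * I)
    y = c * (x**4 - 4 * L * x**3 + 6 * L**2 * x**2)
    y1 = c * (4 * x**3 - 12 * L * x**2 + 12 * L**2 * x)
    y2 = c * (12 * x**2 - 24 * L * x + 12 * L**2)
    y3 = c * (24 * x - 24 * L)
    y4 = c * 24.0
    return y, y1, y2, y3, y4


if __name__ == "__main__":
    # Hypothetical data: load in N/m, E in Pa, I in m^4, L in m.
    w, E, I, L = 1000.0, 200e9, 1e-6, 2.0
    y0, y0p, _, _, y4 = cantilever_uniform_load(0.0, w, E, I, L)
    yL, _, yLpp, yLppp, _ = cantilever_uniform_load(L, w, E, I, L)
    print("y(0), y'(0)     =", y0, y0p)       # conditions at the wall
    print("y''(L), y'''(L) =", yLpp, yLppp)   # conditions at the free end
    print("E I y'''' + w   =", E * I * y4 + w)  # residual of the differential equation
    print("tip deflection  =", yL)
```

All four boundary values and the residual should be zero to rounding error, which is exactly what it means for a function to solve (19).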
The physical significance of the boundary conditions is as follows: $y(0) = 0$ is true simply by virtue of our chosen placement of the origin of the $x, y$ coordinate system; $y'(0) = 0$ follows since the beam is cantilevered out of the wall, so that its slope at $x = 0$ is zero; $y''(L) = 0$ and $y'''(L) = 0$ because the “moment” and “shear force,” respectively, are zero at the end of the beam.

Linear and nonlinear differential equations. An $n$th-order differential equation is said to be linear if it is expressible in the form
$$a_0(x)y^{(n)}(x) + a_1(x)y^{(n-1)}(x) + \cdots + a_n(x)y(x) = f(x), \tag{20}$$
where $a_0(x), \ldots, a_n(x)$ are functions of the independent variable $x$ alone, and nonlinear otherwise. Thus, equations (1), (2), (4), and (6) are linear, and (3) and (5) are nonlinear. If $f(x) = 0$, we say that (20) is homogeneous; if not, it is nonhomogeneous. If $a_0(x)$ does not vanish on the $x$ interval of interest, then we may divide (20) by $a_0(x)$ (to normalize the leading coefficient) and re-express it as
$$y^{(n)}(x) + p_1(x)y^{(n-1)}(x) + \cdots + p_n(x)y(x) = q(x). \tag{21}$$
We will find that the theory of linear differential equations is quite comprehensive with respect to all of our major concerns (the existence and uniqueness of solutions, and how to find them), especially if the coefficients $a_0(x), \ldots, a_n(x)$ are constants. Even in the nonconstant coefficient case the theory provides substantial guidance. Nonlinear equations are, in general, far more difficult, and the available theory is not nearly as comprehensive as for linear equations. Whereas for linear equations solutions can generally be found either in closed form or as infinite series, for nonlinear equations one might focus instead upon obtaining qualitative information about the solution, rather than the solution itself, or upon pursuing numerical solutions by computer simulation, or both. The tendency in science and engineering, until around 1960, when high-speed digital computers became widely available, was to try to get by almost exclusively with linear theory.
For instance, consider the nonlinear equation (3), namely,
$$\theta'' + \frac{g}{l}\sin\theta = 0, \tag{22}$$
governing the motion of a pendulum, where $\theta(t)$ is the angular displacement from the vertical and $t$ is the time. If one is willing to limit one’s attention to small motions, that is, where $\theta$ is small compared to unity (i.e., 1 radian), then one can use the approximation
$$\sin\theta = \theta - \frac{1}{3!}\theta^3 + \frac{1}{5!}\theta^5 - \cdots \approx \theta$$
to replace the nonlinear equation (22) by the approximate “linearized” equation
$$\theta'' + \frac{g}{l}\theta = 0, \tag{23}$$
which (as we shall see in Chapter 3) is readily solved. Unfortunately, the linearized version (23) is not only less and less accurate as larger motions are considered, it may even be incorrect in a qualitative sense as well. That is, from a phenomenological standpoint, replacing a nonlinear differential equation by an approximate linear one may amount to “throwing out the baby with the bathwater.” Thus, it is extremely important for us to keep the distinction between linear and nonlinear clearly in mind as we proceed with our study of differential equations. Aside from Sections 2.4 and 2.5, most of our study of nonlinear equations takes place in Chapters 6 and 7.

Closure. Notice that we have begun, in this section, to classify differential equations, that is, to categorize them by types. Thus far we have distinguished ODE’s (ordinary differential equations) from PDE’s (partial differential equations), established the order of a differential equation, distinguished initial-value problems from boundary-value problems, linear equations from nonlinear ones, and homogeneous equations from nonhomogeneous ones. Why do we classify so extensively? Because the most general differential equation is far too difficult for us to deal with. The most reasonable program, then, is to break the set of all possible differential equations into various categories and to try to develop theory and solution strategies that are tailored to the specific nature of a given category.
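The remark above that the linearized pendulum equation (23) becomes less accurate for larger motions can be illustrated with a small numerical experiment. The sketch below is only an illustration under stated assumptions: it takes $g/l = 1$, releases the pendulum from rest at an angle $\theta_0$, integrates both (22) and (23) with the classical fourth-order Runge-Kutta method (a technique not introduced in this section), and reports the largest gap between the two solutions over $0 \le t \le 10$. All function names and parameter values are hypothetical.

```python
import math


def rk4_second_order(accel, theta0, omega0, t_end, n_steps):
    """Integrate theta'' = accel(theta) with classical RK4; return the list of theta values."""
    dt = t_end / n_steps
    theta, omega = theta0, omega0
    history = [theta]
    for _ in range(n_steps):
        k1_t, k1_w = omega, accel(theta)
        k2_t, k2_w = omega + 0.5 * dt * k1_w, accel(theta + 0.5 * dt * k1_t)
        k3_t, k3_w = omega + 0.5 * dt * k2_w, accel(theta + 0.5 * dt * k2_t)
        k4_t, k4_w = omega + dt * k3_w, accel(theta + dt * k3_t)
        theta += dt * (k1_t + 2 * k2_t + 2 * k3_t + k4_t) / 6
        omega += dt * (k1_w + 2 * k2_w + 2 * k3_w + k4_w) / 6
        history.append(theta)
    return history


def max_linearization_gap(theta0, g_over_l=1.0, t_end=10.0, n_steps=10_000):
    """Largest |theta_nonlinear - theta_linear| on [0, t_end], starting from rest at theta0."""
    nonlinear = rk4_second_order(lambda th: -g_over_l * math.sin(th), theta0, 0.0, t_end, n_steps)
    linear = rk4_second_order(lambda th: -g_over_l * th, theta0, 0.0, t_end, n_steps)
    return max(abs(a - b) for a, b in zip(nonlinear, linear))


if __name__ == "__main__":
    for theta0 in (0.05, 0.5, 2.0):  # initial angles in radians
        print(f"theta0 = {theta0:4.2f} rad   max gap = {max_linearization_gap(theta0):.3e}")
```

For the small initial angle the two solutions should be nearly indistinguishable, while for the largest one the gap becomes comparable to the amplitude itself, which is the loss of accuracy described in the text.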
Historically, however, the early work on differential equations, by such mathematicians as James Bernoulli (1654-1705) and his brother Johann (John) (1667-1748), Joseph-Louis Lagrange (1736-1813), Alexis-Claude Clairaut (1713-1765), and Jean le Rond d’Alembert (1717-1783), generally involved attempts at solving specific equations rather than the development of a general theory. From an applications point of view, we shall find that in many cases diverse physical phenomena are governed by the same differential equation. For example, consider equations (1) and (2) and observe that they are actually the same equation, to within a change in the names of the various quantities: $m \to L$, $k \to 1/C$, $F(t) \to dE(t)/dt$, and $x(t) \to i(t)$. Thus, to within these correspondences, their solutions are identical. We speak of the mechanical system and the electrical circuit as analogs of each other. This idea is deeper and more general than can be seen from this one example, and the result is that if one knows a lot about mechanical systems, for example, then one thereby knows a lot about electrical, biological, and social systems, to whatever extent they are governed by differential equations of the same form. Or, returning to PDE’s for the moment, consider equation (7), which we introduced as the one-dimensional heat equation. Actually, (7) governs any one-dimensional diffusion process, be it the diffusion of heat by conduction, or the diffusion of material such as a pollutant in a river. Thus, when one is studying heat conduction one is also learning about all diffusion processes because the governing differential equation is the same. The significance of this fact can hardly be overstated as a justification for a careful study of the mathematical field of differential equations, or as a cause for marvel at the underlying design of the physical universe.

EXERCISES 1.2

1. Determine the order of each differential equation, and whether or not the given functions are solutions of that equation.
(a) $(y')^2 = 4y$; $y_1(x) = x^2$, $y_2(x) = 2x^2$, $y_3(x) = e^{-x}$
(b) $2yy' = 9\sin 2x$; $y_1(x) = \sin x$, $y_2(x) = 3\sin x$, $y_3(x) = e^{x}$
(c) $y'' - 9y = 0$; $y_1(x) = e^{3x} - e^{x}$, $y_2(x) = 3\sinh 3x$, $y_3(x) = 2e^{3x} - e^{-3x}$
(d) $(y')^2 - 4xy' + 4y = 0$; $y_1(x) = x^2 - x$, $y_2(x) = 2x - 1$
(e) $y'' + 9y = 0$; $y_1(x) = 4\sin 3x + 3\cos 3x$, $y_2(x) = 6\sin(3x + 2)$
(f) $y'' - y' - 2y = 6$; $y_1(x) = 5e^{2x} - 3$, $y_2(x) = -2e^{-x} - 3$
(g) $y''' - 6y'' + 12y' - 8y = 32 - 16x$; $y_1(x) = 2x - 1 + (A + Bx + Cx^2)e^{2x}$ for any constants $A, B, C$
(h) $y' + 2xy = 1$; $y_1(x) = Ae^{-x^2}$, $y_2(x) = e^{-x^2}\int_a^x e^{t^2}\,dt$ for any constants $A$ and $a$.

2. Verify that $u(x, t) = Ax + B + (C\sin\kappa x + D\cos\kappa x)\exp(-\kappa^2\alpha^2 t)$ is a solution of (7) for any constants $A, B, C, D, \kappa$. NOTE: We will sometimes use the notation $\exp(\ )$ in place of $e^{(\ )}$ because it takes up less vertical space.

3. Verify that $u(x, y, z) = A\sin ax\,\sin by\,\sinh cz$ is a solution of (8) for any constants $A, a, b, c$, provided that $a^2 + b^2 = c^2$.

4. (a) Verify that $u(x, t) = (Ax + B)(Ct + D) + (E\sin\kappa x + F\cos\kappa x)(G\sin\kappa ct + H\cos\kappa ct)$ is a solution of the one-dimensional wave equation
$$c^2\frac{\partial^2 u}{\partial x^2} = \frac{\partial^2 u}{\partial t^2},$$
for any constants $A, B, \ldots, H, \kappa$.
(b) Verify that $u(x, t) = f(x - ct) + g(x + ct)$ is a solution of that equation for any twice-differentiable functions $f$ and $g$.
(c) For what value(s) of the constant $m$ is $u(x, t) = \sin(x + mt)$ a solution of that equation?

5. For what value(s) of the constant $\lambda$ will $y = \exp(\lambda x)$ be a solution of the given differential equation? If there are no such $\lambda$’s, state that.
(a) $y' + 3y = 0$
(b) $y' + 3y^2 = 0$
(c) $y'' - 3y' + 2y = 0$
(d) $y'' - 2y' + y = 0$
(e) $y'' - y' = 0$
(f) $y''' - 2y'' - y' + 2y = 0$
(g) $y'''' - 6y'' + 5y = 0$
(h) $y'' + 5yy' + y = 0$

6. First, verify that the given function is a solution of the given differential equation, for any constants $A, B$. Then, solve for $A, B$ so that $y$ satisfies the given initial or boundary conditions.
(a) $y'' + 4y = 8x^2$; $y(x) = 2x^2 - 1 + A\sin 2x + B\cos 2x$; $y(0) = 1$, $y'(0) = 0$
(b) $y'' - y = x^2$; $y(x) = -x^2 - 2 + A\sinh x + B\cosh x$; $y(0) = -2$, $y'(0) = 0$
(c) $y'' + 2y' + y = 0$; $y(x) = (A + Bx)e^{-x}$; $y(0) = 1$, $y(2) = 0$
(d) $y'' - y' = 0$; $y(x) = A + Be^{x}$; $y'(0) = 1$, $y(3) = 0$

7. Classify each equation as linear or nonlinear:
(a) $y' + e^{x}y = 4$
(b) $yy' = x + y$
(c) $e^{x}y' = x - 2y$
(d) $y' - \exp y = \sin x$
(e) $y'' + (\sin x)y = x^2$
(f) $y'' - y = \exp x$
(g) $yy'' + 4y = 3x$
(h) $y'' = y$

8. Recall that the nonlinear equation (5) governs the deflection $y(x)$ of the flexible cable shown in Fig. 3. Supposing that the sag is small compared to the span, suggest a linearized version of (5) that can be expected to give good accuracy in predicting the shape $y(x)$.

1.3 Introduction to Modeling

[...] as Heat Transfer, Fluid Mechanics, and Circuit Theory. However, we wish to emphasize the close relationship between the mathematics and the underlying physics, and to motivate the mathematics more fully. Thus, besides the purely mathematical examples in the text, we will include physical applications and some of the underlying modeling as well. Our intention in this section is only to illustrate the nature of the modeling process, and we will do so through two examples. We suggest that you pay special attention to Example 1 because we will come back to it at numerous points later on in the text.

EXAMPLE 1. Mechanical Oscillator. Consider a block of mass $m$ lying on a table and restrained laterally by an ordinary coil spring (Fig. 1), and denote by $x$ the displacement of the mass (measured as positive to the right) from its “equilibrium position,” that is, when $x = 0$ the spring is neither stretched nor compressed.

[Figure 1. Mechanical oscillator.]

We imagine the mass to be disturbed from its equilibrium position by an initial disturbance and/or an applied force $F(t)$, where $t$ is the time, and we seek the differential equation governing the resulting displacement history $x(t)$. Our first step is to identify the relevant physics which, in this case, is Newton’s second law of motion. Since the motion is constrained to be along a straight line, we need consider only the forces in the $x$ direction, and these are shown in Fig. 2: $F_s$ is the force exerted
by the spring on the mass (the spring force, for brevity), $F_a$ is the aerodynamic drag, $F_f$ is the force exerted on the bottom of the mass due to its sliding friction, and $F$ is the applied force, the driving force.

[Figure 2. The forces, if $x > 0$ and $x' > 0$.]

How do we know if $F_s$, $F_f$, and $F_a$ act to the left or to the right? The idea is to make assumptions on the signs of the displacement $x(t)$ and the velocity $x'(t)$ at the instant under consideration. For definiteness, suppose that $x > 0$ and $x' > 0$. Then it follows that each of the forces $F_s$, $F_f$, and $F_a$ is directed to the left, as shown in Fig. 2. (The equation of motion that we obtain will be insensitive to those assumptions, as we shall see.) Newton’s second law then gives (mass)($x$ acceleration) = sum of $x$ forces or,
$$mx'' = F - F_s - F_f - F_a, \tag{1}$$
and we now need to express each of the forces $F_s$, $F_f$, and $F_a$ in terms of the dependent and independent variables $x$ and $t$.

[Figure 3. Spring force and displacement.]

Consider $F_s$ first. If one knows enough about the geometry of the spring and the material of which it is made, one can derive an expression for $F_s$ as a function of the extension $x$, as might be discussed in a course in Advanced Strength of Materials. In practice, however, one can proceed empirically, and more simply, by actually applying various positive (i.e., to the right, in the positive $x$ direction) and negative forces (to the left) to the spring and measuring the resulting displacement $x$ (Fig. 3). For a typical coil spring, the resulting graph will be somewhat as sketched in Fig. 4, where its steepening at $A$ and $B$ is due to the coils becoming completely compressed and completely extended, respectively. Thus, $F_s$ in (1) is the function the graph of which is shown as the curve $AB$. (Ignore the dashed line $L$ for the moment.) Next, consider the friction force $F_f$. The modeling of $F_f$ will depend upon the nature of the contact between the mass and the table; in particular, upon whether it is dry or lubricated.
Let us suppose it is lubricated, so that the mass rides on a thin film of lubricant such as oil. To model $F_f$, then, we must consider the fluid mechanics of the lubricating film. The essential idea is that the stress $\tau$ (force per unit area) on the bottom of the mass is proportional to the gradient $du/dy$ of the fluid velocity $u$ (Fig. 5), where the constant of proportionality is the coefficient of viscosity $\mu$: $\tau = \mu\,du/dy$. But $u(y)$ is found, in a course in Fluid Mechanics, to be a linear function, namely,
$$u(y) = \frac{u(h) - u(0)}{h}\,y = \frac{x'(t) - 0}{h}\,y = \frac{x'(t)}{h}\,y,$$
so
$$\tau = \mu\frac{du}{dy} = \frac{\mu}{h}\,x'(t).$$
Thus, $F_f$ = (stress $\tau$)(area $A$ of bottom of block) $= \dfrac{\mu A}{h}\,x'(t)$. That is, it is of the form
$$F_f = cx'(t), \tag{2}$$
for some constant $c$ that we may consider as known. Thus, the upshot is that the friction force is proportional to the velocity. We will call $c$ the damping coefficient because, as we will see in Chapter 3, the effect of the $cx'$ term in the governing differential equation is to cause the motion to “damp out.”

[Figure 4. Force-displacement graph.]

Likewise, one can model the aerodynamic drag force $F_a$, but let us neglect $F_a$, on the tentative assumption that it can be shown to be small compared to the other two forces. Then (1) becomes
$$mx''(t) + cx'(t) + F_s(x) = F(t). \tag{3}$$
Equation (3) is nonlinear because $F_s(x)$ is not a linear function of $x$, as seen from its graph $AB$ in Fig. 4. As a final simplifying approximation, let us suppose that the $x$ motion is small enough, say between $a$ and $b$ in Fig. 4, so that we can linearize $F_s$, and hence the governing differential equation, by approximating the graph of $F_s$ by its tangent line $L$. Since $L$ is a straight line through the origin, it follows that we can express
$$F_s(x) \approx kx. \tag{4}$$
We call $k$ the spring stiffness. Thus, the final form of our governing differential equation, or equation of motion, is the linearized approximation
$$mx'' + cx' + kx = F(t), \tag{5}$$
on $0 \le t < \infty$, where the constants $m$, $c$, $k$ and the applied force $F(t)$ are known. Equation (5) is important, and we will return to it repeatedly.

[Figure 5. Lubricating film.]

To this equation we wish to append suitable initial or boundary conditions. This particular problem is most naturally of initial-value type since we envision initiating the motion in some manner at the initial time, say $t = 0$, and then considering the motion that results. Thus, to (5) we append initial conditions
$$x(0) = x_0 \quad \text{and} \quad x'(0) = x'_0, \tag{6}$$
for some (positive, negative, or zero) specified constants $x_0$ and $x'_0$. It should be plausible intuitively that we do need to specify both the initial displacement $x(0)$ and the initial velocity $x'(0)$ if we are to ensure a unique resulting motion. In any case, the theoretical appropriateness of the conditions (6) is covered in Chapter 3.

[Figure 6. Other assumptions on the signs of $x$ and $x'$: (a) $x > 0$, $x' < 0$; (b) $x < 0$, $x' > 0$.]

The differential equation (5) and initial conditions (6) comprise our resulting mathematical model of the physical system. By no means is there an exact correspondence between the model and the system since approximations were made in modeling the forces $F_s$ and $F_f$, and in neglecting $F_a$ entirely. Indeed, even our use of Newtonian mechanics, rather than relativistic mechanics, was an approximation. This completes the modeling phase. The next step would be to solve the differential equation (5) subject to the initial conditions (6), for the motion $x(t)$.

COMMENT 1. Let us examine our claim that the resulting differential equation is insensitive to the assumptions made as to the signs of $x$ and $x'$. In place of our assumption that $x > 0$ and $x' > 0$ at the instant under consideration, suppose we assume that $x > 0$ and $x' < 0$.
Since $x > 0$, it follows that $F_s$ acts to the left, and since $x' < 0$, it follows that $F_f$ acts to the right. Then (Fig. 6a)
$$mx'' = F - F_s + F_f, \tag{7}$$
where we continue to neglect $F_a$. The sign of the $F_f$ term is different in (7), compared with (1), because $F_f$ now acts to the right, but notice that $F_f$ now needs to be written as $F_f = c\,(-x'(t))$, rather than $cx'(t)$, since $x'$ is negative. Further, $F_s$ is still $kx$, so (7) becomes
$$mx'' = F(t) - kx + (-cx'), \tag{8}$$
which is indeed equivalent to (5), as claimed. Next, what if $x < 0$ and $x' > 0$? This time (Fig. 6b)
$$mx'' = F + F_s - F_f, \tag{9}$$
which differs from (1) in the sign of the $F_s$ term. But now $F_s$ needs to be written as $F_s = k\,(-x(t))$ since $x$ is negative. Further, $F_f$ is $cx'$, so (9) becomes $mx'' = F + k(-x) - cx'$, which, again, is equivalent to (5). The remaining case, where $x < 0$ and $x' < 0$, is left for the exercises.

COMMENT 2. The approximation (4) was introduced from consideration of the graph shown in Fig. 4, but it amounts to expanding $F_s(x)$ in a Taylor series about the equilibrium point $x = 0$, as
$$F_s(x) = F_s(0) + F_s'(0)\,x + \frac{1}{2!}F_s''(0)\,x^2 + \cdots,$$
and linearizing, that is, cutting off after the first-order term:
$$F_s(x) \approx F_s(0) + F_s'(0)\,x = 0 + kx = kx.$$
This idea, the simplification of a differential equation by such tangent-line approximation, is of great importance in modeling.

COMMENT 3. The final equation for $F_s$, $F_s = kx$, is well known as Hooke’s law, after Robert Hooke (1635-1703). Hooke published his law of elastic behavior in 1676 as the anagram ceiiinosssttuv and, two years later, the solution ut tensio sic vis: roughly, “as the force, so is the displacement.” In view of the complexity with which we can now deal, Hooke’s law must look quite modest, but one must appreciate it within its historical context. In spirit, it followed Galileo Galilei (1564-1642) who, in breaking with lines established by the ancient Greeks, sought to establish a quantitative science, expressed in formulas and mathematical terms.
For example, where Aristotle explained the increasing speed of a falling body in terms of the body moving more and more jubilantly as it approached its natural place (the center of the earth, which was believed to coincide with the center of the universe), Galileo sidestepped the question of cause entirely, and instead put forward the formula $v = 9.8t$, where $v$ is the speed (in meters per second) and $t$ is the time (in seconds). It may be argued that such departure from the less productive Greek tradition marked the beginning of modern science.

COMMENT 4. In science and engineering it is useful to think in terms of inputs and outputs. Here, there are three inputs, the two initial conditions and the applied force $F(t)$, and the output is the resulting solution, or response, $x(t)$.

The foregoing introductory example illustrates several general truths about modeling. First, we see that it is not necessarily an easy task and generally requires a sound understanding of the underlying physics. Even in this little example one senses that obtaining suitable expressions for $F_f$ and $F_a$ (if one does include $F_a$) requires skillful handling of the fluid mechanics of the lubrication film and the aerodynamics of the moving block. Second, we see that approximations will no doubt be necessary, and the idea is to make them judiciously. In this example we made several approximations. The expression $u(y) = x'(t)y/h$, for instance, is probably quite accurate but is not exact, especially near the ends of the block. Further, one can imagine that as the motion continues, the lubricant will heat up so that the viscosity $\mu$ will decrease. This effect is probably negligible, but we mention it in order to suggest that there is virtually no end to the levels of complexity that one may address, or suppress, in the modeling process. The key is to seek a level of complexity that will provide sufficient accuracy for the purpose at hand, and to seek a uniform level of approximation.
For instance, it would hardly make sense to model $F_f$ with great sophistication and accuracy if $F_a$ is of comparable magnitude and is modeled only crudely. To stay on this point a bit longer, note several more approximations that were implicit to our discussion. First, we implicitly assumed that the block is rigid, whereas the applied forces will cause some slight distortion of its shape and dimensions; again, neglect of this effect is surely an excellent approximation when considering the motion $x(t)$. Second, and more serious, notice that our empirical determination of $F_s(x)$ was based on a static test whereas, like the block, the spring is itself in motion. Thus, there is an inertial effect for the spring, analogous to the $mx''$ term for the mass, that we have neglected. If the mass of the spring is not negligible compared to that of the block, then that approximation may be insufficiently accurate. Finally, notice carefully that we neglect a particular effect, in modeling, not on the grounds that it is small in an absolute sense, but because it is small relative to other effects. For instance, an aerodynamic force $F_a = 0.001$ newton may seem small numerically, but would not be negligible if $F$, $F_s$, and $F_f$ were of comparable size.

Let us consider one more example.

EXAMPLE 2. Suspension Bridge Cable. To design a suspension bridge cable, one needs to know the relationships among the deflected shape of the cable, the tension in it, and the weight distribution that it needs to support. In the case of a typical suspension bridge, the roadbed supported by the cables is much heavier than the cables themselves, so let us neglect the weight of the cables, and assume that the loading is due entirely to the roadbed. Consider the configuration shown schematically in Fig. 7.

Figure 8. Typical cable element.
A cable is a distributed system, rather than one or more point masses, and for such systems a useful modeling approach is to isolate a typical element of the system and apply to it the relevant physical laws. In the present case a typical element is an incremental portion of the cable lying between $x$ and $x + \Delta x$, for any $x$ between $-L/2$ and $L/2$, as shown in Fig. 8: $\Delta s$ is the arc length, $T$ the tension in the cable, $\theta$ the inclination with respect to the horizontal, and $\Delta W$ the load supported by the element. If the roadbed is uniform and weighs $2w$ newtons per meter, then each of the two cables supports $w$ newtons per meter, so $\Delta W = w\,\Delta x$.

Besides neglecting the weight of the cable itself, as mentioned above, there are three additional approximations that are implicit in the foregoing. First, in assuming a uniform load $w$ per unit length, we have really assumed that the vertical support spacing $d$ is very small compared to the span $L$, so that the intermittent support forces can be distributed as a uniform load. Second, in assuming that the tension is in the direction of the cable we have really assumed that the cable is flexible, a term that we now need to explain. The general state of affairs at any section, such as at the $x + \Delta x$ end of the element, is as shown in Fig. 9, namely, there can be a shear force $V$, a tensile force $T$ through the centerline, and a moment or "couple" $M$. ($V$ is the net effect of shearing stresses distributed over the face, and $T$ and $M$ are the net effect of normal stresses distributed over the face.)

Figure 9. Forces and moments at an end.

By a flexible string or cable, we mean one that is unable to support any shear $V$ or moment $M$, that is, $V = M = 0$. For instance, if one takes a typical household string between the thumb and forefinger of each hand one finds that it offers virtually no resistance to shearing or bending, but considerable resistance to stretching. Thus, when we include only tensile forces in Fig.
8 we are assuming that the cable is flexible. Of course, if we imagine taking the suspension cables on the Golden Gate Bridge "between our fingers" we can imagine quite a bit of resistance to shearing and bending! But the point to keep in mind is that even those heavy cables would offer little resistance to shearing and bending by the enormous loads to which they are actually subjected by the weight of the roadbed. Finally, we assume that the cable is inextensible, even under the large tensions that are anticipated.

If we can accept these assumptions we can now apply the relevant physics which, again, is Newton's second law of motion. But this time there is no acceleration, so the element is in static equilibrium. Thus, the $x$ and $y$ forces must each sum to zero:

$$ x:\quad T(x+\Delta x)\cos\theta(x+\Delta x) - T(x)\cos\theta(x) = 0, \tag{10a} $$

$$ y:\quad T(x+\Delta x)\sin\theta(x+\Delta x) - T(x)\sin\theta(x) - w\,\Delta x = 0. \tag{10b} $$

Dividing each of these equations by $\Delta x$ and letting $\Delta x \to 0$, we obtain

$$ \frac{d}{dx}(T\cos\theta) = 0, \tag{11a} $$

$$ \frac{d}{dx}(T\sin\theta) = w, \tag{11b} $$

or, upon integration,

$$ T\cos\theta = A, \tag{12a} $$

$$ T\sin\theta = wx + B, \tag{12b} $$

where $A$, $B$ are arbitrary constants. Dividing (12b) by (12a), to eliminate the unknown tension $T(x)$, and noting that $\tan\theta = dy/dx$, we obtain the differential equation

$$ \frac{dy}{dx} = \frac{w}{A}x + \frac{B}{A} \tag{13} $$

governing the cable shape $y(x)$, which we are trying to determine. In this case, the solution is simple enough so that we might as well complete the solution. To solve, we merely integrate (13), obtaining

$$ y(x) = \frac{w}{2A}x^2 + \frac{B}{A}x + C, $$

where $A$, $B$, $C$ are arbitrary constants. To evaluate them, we invoke the associated boundary conditions:

$$ y(0) = 0 \quad \text{(by choice of location of origin)}, \tag{14a} $$

$$ y'(0) = 0 \quad \text{(by symmetry about } x = 0\text{)}, \tag{14b} $$

$$ y\!\left(\tfrac{L}{2}\right) = H \quad \text{(from Fig. 7)}. \tag{14c} $$

Equation (14a) gives $C = 0$, and (14b) gives $B/A = 0$, and hence $B = 0$. Thus far, $y(x) = \dfrac{w}{2A}x^2$, and (14c) then gives $A = wL^2/8H$. Thus, the cable's shape is given by the parabola

$$ y(x) = \frac{4H}{L^2}x^2. \tag{15} $$

Finally, the distribution of tension $T(x)$ may be found by squaring and adding (12a) and (12b):

$$ T(x) = w\sqrt{x^2 + \frac{L^4}{64H^2}}. \tag{16} $$

In a sense, obtaining $y(x)$ and $T(x)$ marks the end of the analysis, and if we are content that the expressions (15) and (16) are sufficiently accurate, then the next step would be to use them to help in the actual bridge design. Before doing that, however, we should check those results, for there may be errors in our analysis. Also, the approximations that we have made may prove to be insufficiently accurate.

One of the standard ways of checking results is by means of special cases and limiting cases, for which the correct results are known or obvious. Considering (15), we observe first that the parabolic shape looks reasonable. Furthermore, (15) implies that $y(x) \to 0$, over $-L/2 < x < L/2$, as $H \to 0$ with $L$ fixed, and also that $y(x) \to 0$ at each $x$, as $L \to \infty$ with $H$ fixed. These results look reasonable too. Turning to (16), observe that the tension becomes infinite throughout the cable as $H \to 0$, as expected. (Try straightening out a loaded washline by pulling on one end!) Finally, consider the limiting case $H \to \infty$, with $L$ fixed. In that case, (16) gives $T(L/2) \to wL/2$, which agrees with the result obtained from a simple consideration of the physics (Exercise 2).

Closure. The purpose of this section is to illustrate the modeling process, whereby one begins with the physical problem at hand and ends up with an equivalent mathematical statement, or model. Actually, we should say approximately equivalent since the modeling process normally involves a number of assumptions and approximations. By no means do we claim to show how to model in general, but only to illustrate the modeling process and the intimate connection between the physics and the mathematics. As we proceed through this text we will attempt to keep that connection in view, even though our emphasis will be on the mathematics.
Finally, let us note that when we speak of the physical problem and the physics we intend those words to cover a much broader range of possibilities. For instance, the problem might be in the realm of economics, such as predicting the behavior of the Dow Jones Stock Index as a function of time. In that case the relevant "physical laws" would be economic laws such as the law of supply and demand. Or, the problem might fall in the realm of sociology, ecology, biology, chemistry, and so on. In any case, the general modeling approach is essentially the same, independent of the field of application.

EXERCISES 1.3

1. In Example 1 we showed that the same differential equation, (5), results, independent of whether $x > 0$ and $x' > 0$, or $x > 0$ and $x' < 0$, or $x < 0$ and $x' > 0$. Consider the last remaining case, $x < 0$ and $x' < 0$, and show that, once again, one obtains equation (5).

2. At the end of Example 2, we stated that the result $T(L/2) \to wL/2$, obtained from (16), for the limiting case where $H \to \infty$ with $L$ fixed, agrees with the result obtained from a simple consideration of the physics. Explain that statement.

3. (Catenary) In our Suspension Bridge Cable example we neglected the weight of the cable itself relative to the weight of the roadbed. At the other extreme, suppose that the weight of the roadbed (or other loading) is negligible compared to the weight of the cable. Indeed, consider a uniform flexible cable, or catenary, hanging under the action of its own weight only, as sketched in the figure. Then Fig. 8 still holds, but with $\Delta W = \mu\,\Delta s$, where $\mu$ is the weight per unit arc length of the cable.

(Figure: cable hanging over $-L/2 \le x \le L/2$ with sag $H$; axes $x$ and $y$.)

(a) Proceeding somewhat as in (10)-(12), derive the governing differential equation

$$ y'' = C\sqrt{1 + y'^2}, \tag{3.1} $$

where $C$ is an unknown constant.

(b) Since $y(x)$ is symmetric about $x = 0$, it suffices to consider the interval $0 \le x \le L/2$. Then we have the boundary conditions $y(0) = 0$, $y'(0) = 0$, and $y(L/2) = H$. Verify (you need not derive it) that

$$ y(x) = \frac{1}{C}\left(\cosh Cx - 1\right) \tag{3.2} $$

satisfies (3.1) and the boundary conditions $y(0) = 0$ and $y'(0) = 0$. But it remains to determine $C$. Invoking the remaining boundary condition, $y(L/2) = H$, show that $C$ satisfies the equation

$$ H = \frac{1}{C}\left(\cosh\frac{CL}{2} - 1\right). \tag{3.3} $$

Unfortunately, (3.3) is a transcendental equation for $C$, so that we cannot solve it explicitly. We can solve it numerically, for given values of $H$ and $L$, but you need not do that.

(c) As a partial check on these results, notice that they should reduce to the parabolic cable solution in the limiting case where the sag-to-span ratio $H/L$ tends to zero, for then the load per unit $x$ length, due to the weight of the cable, approaches a constant, as it is in Example 2, where the load is due entirely to the uniform roadbed. The problem that we pose for you is to carry out that check. HINT: Think of $L$ as fixed and $H$ tending to zero. For $H$ to approach zero, in (3.3), we need $CL/2$ to approach zero, that is, $C \to 0$. Thus, we can expand the $\cosh Cx - 1$ in (3.2) in a Maclaurin series in $C$ and retain the leading term. Show that that step gives $y(x) \sim Cx^2/2$, and the boundary condition $y(L/2) = H$ enables us to determine $C$. The result should be identical to (15).

(d) Actually, for small sag-to-span ratio we should be able to neglect the $y'^2$ term in (3.1), relative to unity, so that (3.1) can be linearized as

$$ y'' = C. \tag{3.4} $$

Integrating (3.4) and using the boundary conditions $y(0) = 0$, $y'(0) = 0$, and $y(L/2) = H$, show that one obtains (15) once again.

Chapter 2 Differential Equations of First Order

2.1 Introduction

In studying algebraic equations, one considers the very simple first-degree polynomial equation $ax = b$ first, then the second-degree polynomial equation (quadratic equation), and so on. Likewise, in the theory of differential equations it is reasonable and traditional to begin with first-order equations, and that is the subject of this chapter.
In Chapter 3 we turn to equations of second order and higher. Recall that the general first-order equation is given by

$$ F(x, y, y') = 0, \tag{1} $$

where $x$ and $y$ are independent and dependent variables, respectively. In spite of our analogy with algebraic equations, first-order differential equations can fall anywhere in the spectrum of complexity, from extremely simple to hopelessly difficult. Thus, we identify several different subclasses of (1), each of which is susceptible to a particular solution method, and develop them in turn. Specifically, we consider these subclasses: the linear equation $a_0(x)y' + a_1(x)y = f(x)$ in Section 2.2, "separable" equations in Section 2.4, equations that are "exact" (or can be made exact) in Section 2.5, and various other more specialized cases within the exercises.

These subclasses are not mutually exclusive. For instance, a given equation could be both linear and separable, in which case we could solve it by one method or the other. Given such a choice, choose whichever you prefer. In other cases the given equation might not fit into any of these subclasses and might be hopelessly difficult from an analytical point of view. Thus, it will be important to complement our analytical methods with numerical solution techniques. But that is a long way off, in Chapter 6. Analytical methods and the general theory of differential equations will occupy us in Chapters 2 through 5.

It should be stressed that the equation types that are susceptible to the analytical solution techniques described in these chapters can also be solved analytically by computer algebra software that is currently available, such as Maple, Mathematica, and MATLAB, and this approach is discussed here as well. One needs to know both: the underlying theory and solution methodology on one hand, and the available computer software on the other.
2.2 The Linear Equation

The first case that we consider is the general first-order linear differential equation

$$ a_0(x)y' + a_1(x)y = f(x). \tag{1} $$

Dividing through by $a_0(x)$ [which is permissible if $a_0(x) \neq 0$ over the $x$ interval of interest], we can re-express (1) in the more concise form

$$ y' + p(x)y = q(x). \tag{2} $$

We assume that $p(x)$ and $q(x)$ are continuous over the $x$ interval of interest.

2.2.1. Homogeneous case. It is tempting to think that to solve (2) for $y(x)$ we need to get rid of the derivative, and that we can accomplish that merely by integration. It's true that if we integrate (2) with respect to $x$,

$$ \int y'\,dx + \int p\,y\,dx = \int q\,dx, $$

then the first term reduces nicely to $y$ (plus a constant of integration), but the catch is that the $\int p\,y\,dx$ term becomes a stumbling block because we cannot evaluate it since $y(x)$ is unknown! Essentially, then, we have merely converted the differential equation (2) to an "integral equation", that is, one involving the integral of the unknown function. Thus, we are no better off.

To solve (2), we begin with the simpler special case where $q(x)$ is zero,

$$ y' + p(x)y = 0, \tag{3} $$

which is called the homogeneous version of (2). To solve (3), divide by $y$ (assuming that $y$ is nonzero on the $x$ interval of interest; this assumption is tentative since $y$ is not yet known) and integrate on $x$. Using the fact that $y'\,dx = dy$, from the calculus, we thus obtain

$$ \int \frac{dy}{y} + \int p(x)\,dx = 0, \tag{4} $$

and recalling that $\int \dfrac{dx}{x} = \ln|x| + \text{constant}$, (4) gives

$$ \ln|y| = -\int p(x)\,dx + C, \tag{5} $$

where the arbitrary constant $C$ can include the arbitrary constant of integration from the $p$ integral as well. Thus,

$$ |y(x)| = e^{-\int p(x)\,dx + C} = e^{C}e^{-\int p(x)\,dx} = B\,e^{-\int p(x)\,dx}, \tag{6} $$

where we have set $e^{C} = B$ for simplicity. Since $C$ is real, $e^{C}$ is positive, so $B$ is an arbitrary positive constant: $B > 0$. The integral $\int p(x)\,dx$ does indeed exist since we have assumed that $p(x)$ is continuous. Finally, it follows from (6) that $y(x) = \pm B\exp\left(-\int p\,dx\right)$ or,

$$ y(x) = A\,e^{-\int p(x)\,dx} \tag{7} $$

if we now allow the arbitrary constant $A$ to be positive, zero, or negative.
Observe that our tentative assumption, just below (3), that $y(x) \neq 0$, is now seen to be justified because the exponential function in (7) is strictly positive. [Of course $y(x) = 0$ if $A = 0$, but in that simple case $y(x) = 0$ is seen to satisfy (3) without further ado.]

Summarizing, the solution of the homogeneous equation (3) is given by (7), where $A$ is an arbitrary constant.

The presence of the arbitrary constant $A$, in (7), permits the solution (7) to satisfy an initial condition, if one is specified. Thus, suppose that we seek a solution $y(x)$ of our differential equation $y' + p(x)y = 0$ that satisfies an initial condition $y(a) = b$, for specified values of $a$ and $b$. For this purpose, it is convenient to re-express (7) as

$$ y(x) = A\,e^{-\int_a^x p(\xi)\,d\xi}, \tag{8} $$

which is equivalent to (7) since $\int p(x)\,dx$ and $\int_a^x p(\xi)\,d\xi$ differ at most by an additive constant, say $D$, and the resulting $e^{D}$ factor in (8) can be absorbed into the arbitrary constant $A$.

A point of notation: why do we change the integration variable from $x$ to $\xi$ in (8)? Because $\int_a^x p(\xi)\,d\xi$ means that we integrate along some axis, say a $\xi$ axis, from $a$ to $x$. Thus, $x$ is a fixed endpoint, and is different from the integration variable that runs from $a$ to $x$. To write $\int_a^x p(x)\,dx$ runs the risk of confusing the $x$'s inside the integral with the fixed endpoint $x$. The name of the integration variable is immaterial, so one calls it a "dummy variable." For instance, $\int_0^x \xi\,d\xi$, $\int_0^x \eta\,d\eta$, and $\int_0^x \rho\,d\rho$ are all the same, namely, $x^2/2$. One often sees the letter $\xi$ used as a dummy $x$ variable because it is the Greek version of $x$. In Roman letters it is written as xi and is pronounced as ksē. Occasionally, we may be guilty of bad notation and write an integral as $\int^x f(x)\,dx$, simply to minimize the letters used, but even then we need to remember the distinction between the $x$ in the upper limit and the $x$'s inside the integral. In fact, this notation is typical in books on engineering and science, where there is less focus on such points of rigor.
Now imposing the condition $y(a) = b$ on (8) gives $y(a) = b = Ae^{0} = A$, so $A = b$ and hence

$$ y(x) = b\,e^{-\int_a^x p(\xi)\,d\xi}. \tag{9} $$

Thus, (9) satisfies the initial condition $y(a) = b$. To verify that it also satisfies the differential equation (3), we use the fundamental theorem of the calculus: If $f(x)$ is continuous on an interval $I$, namely $x_1 \le x \le x_2$, and

$$ F(x) = \int_a^x f(\xi)\,d\xi \tag{10a} $$

on $I$, then

$$ F'(x) = f(x) \tag{10b} $$

on $I$. Using this theorem, and chain differentiation [let the $-\int_a^x p(\xi)\,d\xi$ in the exponent be $u$, say, and write $de^{u}/dx = (de^{u}/du)(du/dx)$], it is shown that (9) does indeed satisfy the differential equation $y' + p(x)y = 0$ on an interval $I$ if $p(x)$ is continuous on $I$.

EXAMPLE 1. Consider the differential equation

$$ y' + 2xy = 0 \tag{11} $$

on $-\infty < x < \infty$, over which interval $p(x) = 2x$ is continuous, as we have assumed. Then (7) gives

$$ y(x) = A\,e^{-\int 2x\,dx} = A\,e^{-x^2} \tag{12} $$

on $-\infty < x < \infty$. The graphs of the solutions, (12), are called the solution curves or integral curves corresponding to the given differential equation (11), and these are displayed for several values of $A$ in Fig. 1. Those above and below the $x$ axis correspond to $A > 0$ ($A = 1, 2, 3$) and $A < 0$ ($A = -1, -2$), respectively, and $A = 0$ gives the solution curve $y(x) = 0$.

In Example 1 we used the term "solution curve." A solution curve, or integral curve, corresponding to a differential equation $y'(x) = f(x, y)$, is simply the graph of a solution to that equation.

Besides several solution curves, Fig. 1 also contains a field of lineal elements through a discrete set of points, or grid. By a lineal element at a point $(x_0, y_0)$, corresponding to a differential equation $y'(x) = f(x, y)$, we mean a short straight line segment through the point $(x_0, y_0)$, centered at that point and having the slope $f(x_0, y_0)$. That is, each lineal element has the same slope as the solution curve passing through that point and is therefore a small part of the tangent line to that solution curve.
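The definition of a lineal element can be made concrete with a short computation. The following is a minimal sketch in Python (a language chosen here for illustration; the text itself refers to Maple, Mathematica, and MATLAB), and the function names `slope`, `lineal_element`, and `solution`, the grid spacing, and the segment length are all assumptions of this sketch rather than anything prescribed by the text. It builds the segments for Example 1, $y' = -2xy$, and compares the slope of one segment with the slope of the solution curve $y = 2e^{-x^2}$ through the same point.

```python
import math

def slope(x, y):
    """Right-hand side f(x, y) of y' = f(x, y) for Example 1, y' + 2xy = 0."""
    return -2.0 * x * y

def lineal_element(x0, y0, half_length=0.1):
    """Endpoints of a short segment centered at (x0, y0) with slope f(x0, y0)."""
    m = slope(x0, y0)
    # Step along the direction (1, m), normalized so the segment has fixed length.
    dx = half_length / math.sqrt(1.0 + m * m)
    dy = m * dx
    return (x0 - dx, y0 - dy), (x0 + dx, y0 + dy)

def solution(x, A):
    """Solution curve (12): y = A exp(-x^2)."""
    return A * math.exp(-x * x)

# Lineal elements on a 9-by-9 grid covering -2 <= x <= 2, -2 <= y <= 2.
grid = [(i * 0.5, j * 0.5) for i in range(-4, 5) for j in range(-4, 5)]
elements = [lineal_element(x0, y0) for (x0, y0) in grid]

# At a point on the solution curve with A = 2, the element should be tangent.
x0 = 0.5
y0 = solution(x0, 2.0)
(xa, ya), (xb, yb) = lineal_element(x0, y0)
element_slope = (yb - ya) / (xb - xa)

h = 1e-6  # step for a central-difference estimate of the curve's slope
curve_slope = (solution(x0 + h, 2.0) - solution(x0 - h, 2.0)) / (2.0 * h)

print("slope of lineal element:", element_slope)
print("slope of solution curve:", curve_slope)
```

The two printed slopes should agree to within the accuracy of the finite-difference estimate. Passing the endpoint pairs in `elements` to any plotting tool would give a picture of the kind shown in Fig. 1.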
The set of all of the lineal elements is called the direction field corresponding to the given differential equation. In intuitive language, the direction field shows the "flow" of solution curves. Given a sufficiently dense (computer) plot of the direction field, one can visualize the various solution curves, or sketch the solution curve through a given initial point.

Figure 1. The solution curves and direction field for $y' + 2xy = 0$.

EXAMPLE 2. Consider the problem

$$ (x + 2)y' - xy = 0, \tag{13a} $$

$$ y(0) = 3. \tag{13b} $$

Since an initial value is prescribed, let us use (9) rather than (7), with $p(x) = -x/(x + 2)$:

$$ y(x) = 3\,e^{\int_0^x \xi\,d\xi/(\xi+2)} = 3\,e^{\left[\xi + 2 - 2\ln|\xi+2|\right]_0^x} = 3\,e^{\left[x + 2 - 2\ln|x+2| - (2 - 2\ln 2)\right]} = 3\,e^{x}\,e^{\ln|x+2|^{-2}}\,e^{\ln 4} = \frac{12\,e^{x}}{|x+2|^{2}} = \frac{12\,e^{x}}{x^2 + 4x + 4}, \tag{14} $$

where we have used the identity $e^{\ln f} = f$. Of course, we could have used (7) instead, and then imposed the initial condition on that result:

$$ y(x) = A\,e^{\int x\,dx/(x+2)} = A\,e^{x + 2 - 2\ln|x+2|} = A\,\frac{e^{x+2}}{(x+2)^2}, $$

where $y(0) = 3 = Ae^{2}/4$ gives $A = 12e^{-2}$ and, once again, we obtain the solution

$$ y(x) = \frac{12\,e^{x}}{(x+2)^2}. \tag{15} $$

The graph of that solution is given in Fig. 2, in which we show the direction field of the equation (13a), as well.

Figure 2. Solution to $(x + 2)y' - xy = 0$; $y(0) = 3$, together with the direction field.

On what $x$ interval is the solution (15) valid? Recall that the solution (9) was guaranteed over the broadest interval, containing the initial point $x = a$, over which $p(x)$ is continuous. In this case $a = 0$ and $p(x) = -x/(x + 2)$, which is undefined (infinite) at $x = -2$, so (15) is valid at least on $-2 < x < \infty$. In fact, the interval of validity cannot be extended any further to the left in this example, because whereas we need both $y$ and $y'$ to have uniquely defined finite values satisfying $(x + 2)y' - xy = 0$, both $y$ and $y'$ given by (15) are undefined at $x = -2$. Thus, the interval of validity of (15) is $-2 < x < \infty$.

With the homogeneous problem (3) solved, we now turn to the more difficult case, the nonhomogeneous equation (2).
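Before moving on, the closed form (15) from Example 2 can be spot-checked numerically. The sketch below is illustrative only: the names `y` and `residual`, the sample points, and the finite-difference step are assumptions of this example, not part of the text. It confirms that $y(0) = 3$, that the left-hand side of (13a) is close to zero at several points in $-2 < x < \infty$, and that the solution grows without bound as $x$ approaches $-2$ from the right.

```python
import math

def y(x):
    """Closed-form solution (15): y = 12 e^x / (x + 2)^2."""
    return 12.0 * math.exp(x) / (x + 2.0) ** 2

def residual(x, h=1e-5):
    """Left-hand side (x + 2) y' - x y of (13a), with y' from a central difference."""
    dydx = (y(x + h) - y(x - h)) / (2.0 * h)
    return (x + 2.0) * dydx - x * y(x)

initial_value = y(0.0)

# Sample points inside the interval of validity -2 < x < infinity.
sample_points = [-1.5, -1.0, 0.0, 1.0, 3.0]
residuals = [residual(x) for x in sample_points]

# Values of y as x approaches the singular point x = -2 from the right.
near_singularity = [y(-2.0 + eps) for eps in (1e-1, 1e-2, 1e-3)]

print("y(0) =", initial_value)
print("residuals of (13a):", residuals)
print("y near x = -2:", near_singularity)
```

The residuals are small but not exactly zero, because $y'$ is only approximated by a difference quotient.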
We will show how to solve (2) by two different methods: first, by using an integrating factor and, second, by the method of variation of parameters.

2.2.2. Integrating factor method. To solve (2) by the integrating factor method, we begin by multiplying both sides by a yet to be determined function $\sigma(x)$, so that (2) becomes

$$ \sigma y' + \sigma p y = \sigma q. \tag{16} $$

[For (2) and (16) to be equivalent, we require $\sigma(x)$ to be nonzero on the $x$ interval of interest, for if $\sigma(x) = 0$ at one or more points, then it does not follow from (16) that $y' + py$ equals $q$ at those points.] The idea is to seek $\sigma(x)$ so that the left-hand side of (16) is a derivative:

$$ \sigma y' + \sigma p y = \frac{d}{dx}(\sigma y), \tag{17} $$

because then (16) becomes

$$ \frac{d}{dx}(\sigma y) = \sigma q, \tag{18} $$

which can be solved by integration. For instance, to solve the equation $y' + \dfrac{1}{x}y = 4$, observe that if we multiply through by $x$, then we have $xy' + y = 4x$, or $(xy)' = 4x$, which can be integrated to give $xy = 2x^2 + C$ and hence the solution $y = 2x + C/x$. In this case $\sigma(x)$ was simply $x$. Such a function $\sigma(x)$ is called an integrating factor, and its use is called the integrating factor method. The idea was invented by Leonhard Euler (1707-1783), one of the greatest and most prolific mathematicians of all time. He contributed to virtually every branch of mathematics and to the application of mathematics to the science of mechanics.

One might say that the integrating factor method is similar to the familiar method of solving a quadratic equation by completing the square. In completing the square, we add a suitable quantity to both sides so that the left-hand side becomes a "perfect square," and the equation can then be solved by taking square roots. In the integrating factor method we multiply both sides by a suitable quantity so that the left-hand side becomes a "perfect derivative," and the equation can then be solved by integration.

How do we find $\sigma(x)$, if indeed such a function exists?
Writing out the right-hand side of (17) gives $\sigma y' + \sigma p y = \sigma y' + \sigma' y$, which is satisfied identically if we choose $\sigma(x)$ so that

$$ \sigma'(x) = p(x)\sigma(x). \tag{19} $$

But (19) is of the same form as (3), with $y$ changed to $\sigma$ and $p$ to $-p$, so its solution follows from (7) as $\sigma(x) = A\,e^{+\int p(x)\,dx}$. From (16), we see that any constant scale factor in $\sigma$ is inconsequential, so we can take $A = 1$ without loss. Thus, the desired integrating factor is

$$ \sigma(x) = e^{\int p(x)\,dx}. \tag{20} $$

Putting this $\sigma(x)$ into (18) gives

$$ \frac{d}{dx}\left(e^{\int p(x)\,dx}\,y\right) = e^{\int p(x)\,dx}\,q(x), $$

so

$$ e^{\int p(x)\,dx}\,y = \int e^{\int p(x)\,dx}\,q(x)\,dx + C, $$

or

$$ y(x) = e^{-\int p(x)\,dx}\left(\int e^{\int p(x)\,dx}\,q(x)\,dx + C\right), \tag{21} $$

where $C$ is an arbitrary constant of integration.

Not only does (21) satisfy (2), but it can be seen from the steps leading to (21) that every solution to (2) must be of the form (21). Thus, we call (21) the general solution of (2).

EXAMPLE 3. Solve

$$ y' + 3y = x. \tag{22} $$

With $p(x) = 3$ and $q(x) = x$, we have

$$ e^{\int p(x)\,dx} = e^{\int 3\,dx} = e^{3x}, $$

so (21) gives

$$ y(x) = e^{-3x}\left(\int x\,e^{3x}\,dx + C\right) = \frac{x}{3} - \frac{1}{9} + C\,e^{-3x} \tag{23} $$

as the general solution to (22).

If we wish to solve for $C$, in (21), so as to satisfy an initial condition $y(a) = b$, then it is convenient to return to (20) and use the definite integral $\int_a^x p(\xi)\,d\xi$ in place of the indefinite integral $\int p(x)\,dx$. These integrals differ by at most an arbitrary additive constant, say $B$, and the scale factor $e^{B}$ that results, in the right-hand side of (20), can be discarded without loss. Thus, let us change both $p$ integrals in (21) to definite integrals, with $a$ as lower limit. Likewise, the $q$ integral in (21) can be changed to a definite integral with $a$ as lower limit since this step changes the integral by an arbitrary additive constant at most, which constant can be absorbed by the arbitrary constant $C$. Thus, equivalent to (21), we have

$$ y(x) = e^{-\int_a^x p(\xi)\,d\xi}\left(\int_a^x e^{\int_a^{\xi} p(\zeta)\,d\zeta}\,q(\xi)\,d\xi + C\right), $$

where $\zeta$ is zeta. If we impose on this result the initial condition $y(a) = b$, we obtain $y(a) = b = e^{-0}(0 + C)$, where each $\xi$ integral is zero because its lower and upper integration limits are the same.
Thus, $C = b$, and

$$ y(x) = e^{-\int_a^x p(\xi)\,d\xi}\left(\int_a^x e^{\int_a^{\xi} p(\zeta)\,d\zeta}\,q(\xi)\,d\xi + b\right). \tag{24} $$

As a partial check, notice that (24) does reduce to (9) in the event that $q(x) = 0$.

Whereas (21) was the general solution to (2), we call (24) a particular solution since it corresponds to one particular solution curve, the solution curve through the point $(a, b)$.

EXAMPLE 4. Solve

$$ y' - 2xy = \sin x, \tag{25a} $$

$$ y(0) = 3. \tag{25b} $$

This time an initial condition is prescribed, so it is more convenient to use (24) than (21). With $p(x) = -2x$, $q(x) = \sin x$, $a = 0$, and $b = 3$, we have

$$ e^{\int_a^x p(\xi)\,d\xi} = e^{-\int_0^x 2\xi\,d\xi} = e^{-x^2}, $$

so that (24) gives the desired solution as

$$ y(x) = e^{x^2}\left(\int_0^x e^{-\xi^2}\sin\xi\,d\xi + 3\right). \tag{26} $$

The integral in (26) is said to be nonelementary in that it cannot be evaluated in closed form in terms of the so-called elementary functions: powers of $x$, trigonometric and inverse trigonometric functions, exponentials, and logarithms. Thus, we will leave it as it is. It can be evaluated in terms of nonelementary functions, or integrated numerically, but such discussion is beyond the scope of this example.

2.2.3. Existence and uniqueness for the linear equation. A fundamental issue, in the theory of differential equations, is whether a given differential equation $F(x, y, y') = 0$ has a solution through a given initial point $y(a) = b$ in the $x, y$ plane. That is the question of existence. If a solution does exist, then our next question is: is that solution unique, or is there more than one such solution? That is the question of uniqueness. Finally, if we do indeed have a unique solution, then over what $x$ interval does it apply?

For the linear problem

$$ y' + p(x)y = q(x); \qquad y(a) = b, \tag{27} $$

all of these questions are readily answered. Our proof of existence is said to be constructive because we actually found a solution, (24). We are also assured that that solution is unique because our derivation of (24) did not offer any alternatives: each step was uniquely implied by the preceding one, as you can verify by reviewing those steps.
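Although the text leaves the integral in (26) unevaluated, it is easy to evaluate numerically. The sketch below is one possible way to do so, using composite Simpson's rule written out by hand; the quadrature rule, the number of subintervals, and the names `simpson`, `y`, and `residual` are all choices made for this illustration and are not the text's method. It then checks that the resulting $y(x)$ satisfies (25a) and (25b) to within discretization error.

```python
import math

def simpson(f, lo, hi, n=200):
    """Composite Simpson's rule on [lo, hi] with n subintervals (n made even)."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

def y(x):
    """Solution (26): y = exp(x^2) * (integral from 0 to x of exp(-xi^2) sin(xi) + 3)."""
    integral = simpson(lambda xi: math.exp(-xi * xi) * math.sin(xi), 0.0, x)
    return math.exp(x * x) * (integral + 3.0)

def residual(x, h=1e-4):
    """y' - 2 x y - sin x, with y' approximated by a central difference."""
    dydx = (y(x + h) - y(x - h)) / (2.0 * h)
    return dydx - 2.0 * x * y(x) - math.sin(x)

y_at_0 = y(0.0)
residuals = [residual(x) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]

print("y(0) =", y_at_0)
print("residuals of (25a):", residuals)
```

Note that `simpson` also handles $x < 0$, since a negative step simply integrates from 0 down to $x$.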
Nonetheless, let us show how to prove uniqueness, for (27), by a different line of approach. Suppose that we have two solutions of (27), $y_1(x)$ and $y_2(x)$, on any $x$ interval $I$ containing the initial point $a$. That is,

$$ y_1' + p(x)y_1 = q(x), \qquad y_1(a) = b, \tag{28a} $$

and

$$ y_2' + p(x)y_2 = q(x), \qquad y_2(a) = b. \tag{28b} $$

Next, denote the difference $y_1(x) - y_2(x)$ as $u(x)$, say. If we subtract (28b) from (28a), and use the fact that $(f - g)' = f' - g'$, known from the calculus, we obtain the "homogenized" problem

$$ u' + p(x)u = 0, \qquad u(a) = 0, \tag{29} $$

on $u(x)$. But $u' + pu = 0$ implies that $\int du/u + \int p\,dx = 0$, which implies that $\ln|u| = -\int p\,dx + C$, which implies that $|u| = \exp\left(-\int p\,dx + C\right)$ and, finally, that $u(x) = A\exp\left(-\int p\,dx\right)$, where $A$ is an arbitrary constant. Since $u(a) = 0$ and the exponential is nonzero, it follows that $A$ must be zero, so $u(x) = y_1(x) - y_2(x)$ must be identically zero. Thus, $y_1(x)$ and $y_2(x)$ must be identical on $I$. Since $y_1(x)$ and $y_2(x)$ are any two solutions of (27), the solution must be unique.

That approach, in proving uniqueness, is somewhat standard. Namely, we suppose that we have two solutions to the given problem, we let their difference be $u$, say, obtain a homogenized problem on $u$, and show that $u$ must be identically zero.

Finally, what is the interval of existence of our solution (24)? The only possible breakdown of (24) is that one or more of the integrals might not exist (be convergent). But since $p$ is continuous, by assumption, it follows that the $\xi$ and $\zeta$ integrals of $p$ both exist and, indeed, are continuous functions of $x$ and $\xi$, respectively. Thus $\exp\left(\int_a^{\xi} p(\zeta)\,d\zeta\right)$ is continuous too, and since $q$ is also continuous by assumption, then the integral of the exponential times $q$ must exist. In summary, we have the following result.

THEOREM 2.2.1 Existence and Uniqueness, for the Linear Equation. The linear equation $y' + p(x)y = q(x)$ does admit a solution through an initial point $y(a) = b$ if $p(x)$ and $q(x)$ are continuous at $a$.
That solution is unique, and it exists at least on the largest $x$ interval containing $x = a$, over which $p(x)$ and $q(x)$ are continuous.

In Example 1, for instance, $p(x) = 2x$ and $q(x) = 0$ were continuous for all $x$, so every solution was valid over $-\infty < x < \infty$. In Example 2, $p(x) = -x/(x + 2)$ and $q(x) = 0$, so the broadest interval containing the initial point $x = 0$, over which $p$ and $q$ were continuous, was $-2 < x < \infty$ and, sure enough, we found that the solution (15) was valid over that interval but not beyond it because of the singularity in the solution at $x = -2$. We might think of the solution as "inheriting" that singularity from the singular behavior of $p(x) = -x/(x + 2)$ at that point.

EXAMPLE 5. A Narrow Escape. The condition of continuity of $p(x)$ and $q(x)$ is sufficient to imply the conclusions stated in the theorem, but is not necessary, as illustrated by the problem

$$ xy' + 3y = 6x^3. \tag{30} $$

The general solution of (30) is readily found, from (21), to be

$$ y(x) = x^3 + \frac{C}{x^3}. \tag{31} $$

The graphs of the solution for several values of $C$ (i.e., the solution curves) are shown in Fig. 3.

Now, $p(x) = 3/x$ is continuous for all $x$ except $x = 0$, and $q(x) = 6x^2$ is continuous for all $x$, so if we append to (30) an initial condition $y(a) = b$, for some $a > 0$, then Theorem 2.2.1 tells us that the solution (31) that passes through that initial point will be valid over the interval $0 < x < \infty$ at least. For instance, if $a = 1$ and $b = 2.5$ (the point $P_1$), then (31) gives $2.5 = 1 + C$, so $C = 1.5$, and the solution (31) is valid only over $0 < x < \infty$ since it is undefined at $x = 0$ because of the $1/x^3$ term, as can also be seen from the figure. However, if $a = 1$ and $b = 1$ (the point $P_2$) then $C = 0$, and the solution (31) is valid over the broader interval $-\infty < x < \infty$ because $C = 0$ removes the singular $1/x^3$ term in (31).
That is, if the initial point happens to lie on the solution curve $y(x) = x^3$ through the origin, then the solution $y(x) = x^3$ is valid on $-\infty < x < \infty$; if not, then the solution (31) is valid on $0 < x < \infty$ if $a > 0$ and on $-\infty < x < 0$ if $a < 0$.

Figure 3. Representative integral curves (31) for the equation (30).

2.2.4. Variation of parameter method. A second method for the solution of the general first-order linear equation

$$ y' + p(x)y = q(x) \tag{32} $$

is the method of variation of parameters, due to the great French mathematician Joseph Louis Lagrange (1736-1813) who, like Euler, also worked on the applications of mathematics to mechanics, especially celestial mechanics.

Lagrange's method is as follows. We begin by considering the homogeneous version of (32),

$$ y' + p(x)y = 0, \tag{33} $$

which is more readily solved. Recall that we solved it by integrating

$$ \int \frac{dy}{y} + \int p(x)\,dx = 0 $$

and obtaining the general solution

$$ y_h(x) = A\,e^{-\int p(x)\,dx}. \tag{34} $$

We use the subscript $h$ because $y_h(x)$ is called the homogeneous solution of (32). That is, it is the solution of the homogeneous version (33) of the original nonhomogeneous equation (32). [In place of $y_h(x)$, some authors write $y_c(x)$ and call it the complementary solution.]

Lagrange's idea is to try varying the "parameter" $A$, the arbitrary constant in (34). Thus, we seek a solution $y(x)$ of the nonhomogeneous equation in the form

$$ y(x) = A(x)\,e^{-\int p(x)\,dx}. \tag{35} $$

(The general idea of seeking a solution of a differential equation in a particular form is important and is developed further in subsequent chapters.) Putting (35) into (32) gives

$$ \left(A'e^{-\int p\,dx} + A(-p)e^{-\int p\,dx}\right) + pA\,e^{-\int p\,dx} = q. \tag{36} $$

Cancelling the two $A$ terms and solving for $A'$ gives

$$ A'(x) = q(x)\,e^{\int p(x)\,dx}, \tag{37} $$

which can be integrated to give

$$ A(x) = \int e^{\int p(x)\,dx}\,q(x)\,dx + C $$

and hence the general solution

$$ y(x) = A(x)\,e^{-\int p(x)\,dx} = e^{-\int p(x)\,dx}\left(\int e^{\int p(x)\,dx}\,q(x)\,dx + C\right), \tag{38} $$

which is identical to our previous result (21).
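As a small illustration of these steps, here is the method applied to the equation of Example 3, $y' + 3y = x$. This worked example is added here for concreteness; the choice of equation is ours, and the result is checked against the general solution (23) obtained earlier by the integrating factor method.

```latex
\begin{align*}
y_h(x) &= A\,e^{-3x}
  && \text{homogeneous solution, (34) with } p(x) = 3 \\
y(x) &= A(x)\,e^{-3x}
  && \text{vary the parameter, as in (35)} \\
A'(x)\,e^{-3x} - 3A(x)\,e^{-3x} + 3A(x)\,e^{-3x} &= x
  && \text{substitute into } y' + 3y = x \\
A'(x) &= x\,e^{3x}
  && \text{the two } A \text{ terms cancel, as in (37)} \\
A(x) &= \left(\frac{x}{3} - \frac{1}{9}\right)e^{3x} + C
  && \text{integrate by parts} \\
y(x) &= \frac{x}{3} - \frac{1}{9} + C\,e^{-3x}
  && \text{in agreement with (23)}
\end{align*}
```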
It is easy to miss how remarkable the idea behind Lagrange's method is, because it starts out looking like a foolish idea and ends up working beautifully. Why do we say that it looks foolish? Because it is completely nonspecific. To explain what we mean by that, let us put Lagrange's idea aside, for a moment, and consider the second-order linear equation $y'' + y' - 2y = 0$, for instance. In Chapter 3 we will learn to seek solutions of such equations in the exponential form $y = e^{\lambda x}$, where $\lambda$ is a constant that needs to be determined. Putting that form into $y'' + y' - 2y = 0$ gives the equation $(\lambda^2 + \lambda - 2)e^{\lambda x} = 0$, which implies that $\lambda$ needs to satisfy the quadratic equation $\lambda^2 + \lambda - 2 = 0$, with roots $\lambda = 1$ and $\lambda = -2$. Thus, we are successful, in this example, in finding two solutions of the assumed form, $y(x) = e^{x}$ and $y(x) = e^{-2x}$.

Notice how easily this idea works. It is easily implemented because most of the work has been done by deciding to look for solutions in the correct form, exponentials, rather than looking within the set of all possible functions. Similarly, if we lose our eyeglasses, the task of finding them is much easier if we know that they are somewhere on our desk than if we know only that they are somewhere in the universe.

Returning to Lagrange's idea, observe that the form (35) is completely nonspecific. That is, every function can be expressed in that form by a suitable choice of $A(x)$. Thus, (35) seems useless in that it does not narrow down the search in the least. That's why we say that at first glance Lagrange's idea looks like a foolish idea.

Next, why do we say that, nevertheless, it works beautifully? Notice that the equation (36), governing $A(x)$, is itself a first-order nonhomogeneous equation of the same form as the original equation (32), and looks even harder than (32), except for the fact that the two A terms cancel, so that we obtain the simple equation (37) that can be solved by direct integration.
The cancellation of the two A terms was not serendipitous. For suppose that $A(x)$ is a constant. Then the $A'$ term in (36) drops out, and the two A terms must cancel to zero because if A is a constant then (35) is a solution of the homogeneous equation! In Chapter 3 we generalize Lagrange's method to higher-order differential equations.

Closure. In this chapter we begin to solve differential equations. In particular, we consider the general first-order linear equation

$$y' + p(x)y = q(x), \tag{39}$$

where $p(x)$ and $q(x)$ are given. We begin with the homogeneous case,

$$y' + p(x)y = 0, \tag{40}$$

because it is simpler, and find its general solution

$$y(x) = Ae^{-\int p(x)\,dx}. \tag{41}$$

If an initial condition $y(a) = b$ is appended to (40), then (41) gives the particular solution of (40) through the initial point $(a, b)$ as

$$y(x) = be^{-\int_a^x p(\xi)\,d\xi}. \tag{42}$$

Turning next to the full nonhomogeneous equation (39), we derive the general solution

$$y(x) = e^{-\int p(x)\,dx}\left(\int e^{\int p(x)\,dx}\,q(x)\,dx + C\right), \tag{43}$$

first by the integrating factor method, and then again by the method of variation of parameters. Both of these methods will come up again in subsequent sections and chapters.

If an initial condition $y(a) = b$ is appended to (39), then (43) gives the particular solution of (39) through the initial point $(a, b)$ as

$$y(x) = e^{-\int_a^x p(\xi)\,d\xi}\left(\int_a^x e^{\int_a^\xi p(\zeta)\,d\zeta}\,q(\xi)\,d\xi + b\right), \tag{44}$$

which solution is unique, and which is valid on the broadest x interval containing $x = a$ on which both $p(x)$ and $q(x)$ are continuous, and possibly even on a broader interval than that.

Finally, we introduce the idea of lineal elements and the direction field of a differential equation $y' = f(x, y)$.

It is noteworthy that we are successful in finding the general solution of the general first-order linear equation $y' + p(x)y = q(x)$ explicitly and in closed form. For other equation types we may not be so successful, as we shall see.
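When the integrals in (44) cannot be evaluated in closed form, the formula can still be used numerically. The following is a minimal sketch, not part of the text: it evaluates (44) with the trapezoidal rule, and the function name `solve_linear_ivp`, the step count `n`, and the test equation $y' + 2y = e^{-x}$, $y(0) = 3$ are all assumptions made for illustration.

```python
import math

def solve_linear_ivp(p, q, a, b, x, n=2000):
    """Approximate y(x) for y' + p(x) y = q(x), y(a) = b, via formula (44).

    Both integrals in (44) are accumulated with the trapezoidal rule
    on n equal subintervals between a and x.
    """
    h = (x - a) / n          # signed step, so x < a is handled too
    P = 0.0                  # running integral of p from a to xi
    integral = 0.0           # running integral of exp(P(xi)) * q(xi)
    p_prev = p(a)
    f_prev = q(a)            # exp(0) * q(a)
    for k in range(1, n + 1):
        xi = a + k * h
        p_cur = p(xi)
        P += 0.5 * h * (p_prev + p_cur)
        f_cur = math.exp(P) * q(xi)
        integral += 0.5 * h * (f_prev + f_cur)
        p_prev, f_prev = p_cur, f_cur
    return math.exp(-P) * (integral + b)

# Illustrative problem (an assumption, not from the text):
# y' + 2y = exp(-x), y(0) = 3, with exact solution y = exp(-x) + 2 exp(-2x).
approx = solve_linear_ivp(lambda x: 2.0, lambda x: math.exp(-x), a=0.0, b=3.0, x=1.0)
exact = math.exp(-1.0) + 2.0 * math.exp(-2.0)
print(approx, exact)
```

The two printed values should agree closely. The sketch is only meaningful on an interval where p and q are continuous, in line with the existence statement above.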
In closing, we call attention to the exercises, to follow, that introduce additional important special cases, the Bernoulli, Riccati, d'Alembert-Lagrange, and Clairaut equations. In subsequent sections we occasionally refer to those equations and to those exercises.

Computer software. There are now several powerful computer-algebra systems, such as Mathematica, MATLAB, and Maple, that can be used to implement much of the mathematics presented in this text - numerically, symbolically, and graphically. Consider the application of Maple, as a representative software, to the material in this section.

There are two types of applications involved in this section. One entails finding the general solution of a given first-order differential equation, or a particular solution satisfying a given initial condition. These can be carried out on Maple using the dsolve command ("function," in Maple terminology). For example, to solve the equation $(x + 2)y' - xy = 0$ of Example 2, for $y(x)$, enter

    dsolve((x + 2)*diff(y(x), x) - x*y(x) = 0, y(x));

(including the semicolon) and return; dsolve is the differential equation solver, and diff is the derivative command for $y'$. [The command for $y''$ would be diff(y(x), x, x), and so on for higher derivatives.] The output is the general solution

$$y(x) = \frac{\_C1\,\exp(x)}{x^2 + 4x + 4},$$

where _C1 is Maple notation for an arbitrary constant. To solve the same equation, but this time with the initial condition $y(0) = 3$, enter

    dsolve({(x + 2)*diff(y(x), x) - x*y(x) = 0, y(0) = 3}, y(x));

and return. The output is the particular solution

$$y(x) = \frac{12\,\exp(x)}{x^2 + 4x + 4},$$

which agrees with our result in Example 2.

The dsolve command can cope with differential equations that contain unspecified parameters or functions. For example, to solve $y' + p(x)y = 0$, where $p(x)$ is not specified, enter

    dsolve(diff(y(x), x) + p(x)*y(x) = 0, y(x));

and return.
The output is the general solution

$$y(x) = \exp\left(-\int p(x)\,dx\right)\_C1.$$

The second type of application entails generating the graphical display of various solution curves and/or the direction field for a given differential equation. Both of these tasks can be carried out on Maple using the phaseportrait command. For example, to obtain the plot shown as Fig. 1, enter

    with(DEtools):

to access the phaseportrait command; then return and enter

    phaseportrait(-2*x*y, [x, y], x = -1.5..1.5, {[0,-2], [0,-1], [0,0], [0,1], [0,2], [0,3]}, arrows = LINE);

and return. The items within the outer parentheses are as follows:

    phaseportrait(right-hand side of y' = f(x, y), [variables], xrange, {initial points}, optional specification to include direction field lineal elements and choice of their line thickness);

The yrange is set automatically, but it can be specified as an additional optional item if you wish. All items following the {initial points} are optional, so if you want the yrange to be $-1 < y < 3$, say, then modify the phaseportrait command as follows:

    phaseportrait(-2*x*y, [x, y], x = -1.5..1.5, {[0,-2], [0,-1], [0,0], [0,1], [0,2], [0,3]}, y = -1..3, arrows = LINE);

To run phaseportrait over and over, one needs to enter "with(DEtools):" only at the beginning of the session.

To obtain a listing of the mathematical functions and operators (or commands) available in Maple, enter ?lib and return. Within that list one would find such commands as dsolve and phaseportrait. To learn how to use a command enter a question mark, then the command name, then return. For example, type ?dsolve and return.

In the exercises that follow, and those in subsequent sections, problems are included that require the use of a computer-algebra system such as one of the systems mentioned above. These are important, and we strongly urge you to develop skill in the use of at least one such system in parallel with, but not in place of, developing understanding of the underlying mathematics presented in this text.
Differential Equationsof First Order EXERCISES 2.2 1. Assuming thatp(x) and g(x) are continuous, verify by direct substitution (a)y(2)= (d)y(1)= (b)y(0)= 1 (e)y(—2)= 0 (c)y(-1) = (f)y(—3)= 0 7. Find the general solution using any method of this section. The answer may be left in implicit form, rather than explicit 2. In each case find the general solution, both using the ‘“‘off- form, if necessary. HINT: Remember that which variable is the-shelf” formula (21) and then again by actually carrying out the independent variable and which is the dependent variable the steps of the integrating factor method. That is, find the inte- is a matter of viewpoint, and one can change one’s viewpoint. grating factor o(a) and then carry out solution steps analogous In these problems, consider whether it might be better to reto those in our derivation of (21). Understand the 2x interval gard x as a function of y, and recall from the calculus that on which the equation is defined to be the broadest interval on ue = 1/(dx/dy). which bothp(z) and g(x) are continuous. For example, in part (a) that (9) satisfies (3) (b) that (21) satisfies (2) (a) the x interval is —co < @ < ov, in part (e) it is any interval on which tan x is continuous(suchas 7/2 < x < 37/2), and (a) in part (f) it is either -co < @ < Oor0 < & < oo [to ensure thecontinuityof p(w) = 2/2]. (a)y' —y = 3e* (b)y’ + 4y = 8 (c)yty =a" (e)y’ ~ (tanz)y = 6 (g)zy!—2y=2° (d)y =y—sin2z (f)ay!+2y=a8 (h) y’ + (cot x)y = 2cosx () (v7 —5)(ay' +3y)=2 dx, | Gia —6r=e! d (m) tSat 7 (n) oe 7, + 2(cot 20)r = (Kyd te 3.(a)-(n) 43e=0 © yA ad 4ay—4y? =1 x =2 x ar For the equation given in Exercise 2, solve by the method of variation of parameters. That is, first find the homogeneous solution, then vary the parameter, and so on — as we did in (34)—(37) for the general equation (31). 4.(a)~(n) For the equation given in Exercise 2, find the general solution using computer software (such as Mathematica, MATLAB, or Maple). 
Verify your result by showing that it does satisfy the given differential equation. 5. Solve zy’ + y = 62x?subject to the given initial condition using any method of this section, and state the (broadest) interval of validity of the solution. Also, sketch the graph of the solution, by hand, and label any key values. (a)y(1)= 0 (d)y(—3)= 1 (b)y(1)= 2 (c)y(2)= 2 (e)y(—-38)=—-5 (Ay(-2) =8 6. Solve vy’ + 2y = v + 2 subject to the given initial condition using any method of this section, and state the (broadest) the solution, by hand, and label any key values. 1 1 ~ 2 oe (b) i (c) (6y? —2x)%-y=0 dz = 62 + vs (d) (y* siny +2) aie 8. (Direction fields) The direction field concept was within Example 1. For the differential equation given, use computer software to plot the direction field over the specified rectangular region in the x, y plane, as well as the integral curve through the specified point P. Also, if you can identify any integral curves exactly, from an inspection of the direction field, then give the equations of those curves, and verify that they do satisfy the given differential et (a)yi = 2+ (2x —y)" on |z| < 4, ly] < 4; P =(2,1) (b)yy’= y(y? - 4) on {al <4, ly) < 4; P= “O, 1) (c)y= (3—y7)? on |a| < 2, ly] < 3; P = (0,0) on fal < 3, jy| < 2; P = (0,0.5) (d)y+ 2y=e* -3) — 1) on |z| <3, [yf <3; P= (—3, (e)y!= 2°/(y? (Ny+e=y ona]<20,0<y< 20;P=(0,1) (jy =e"y ond<x<50,0<y< 50; P=(0,10) (h)y’=azsiny 9. (Bernoulli on0O<x<10,0<y<10; P=(2,2) equation) The equation y+ p(a)y= q(z)y", (9.1) where 7 is a constant (not necessarily an integer), is called equation, after the Swiss mathematician Jakob Bernoulli’s Bernoulli. Jakob (1654-1705), his brother Johann (1667-1748), and Johann’s son Daniel (1700-1782), are the best known of the eight members of the Bernoulli family who were prominent mathematicians and scientists. (a) Give the general solution of (9.1) for the special cases n=OQOandn = 1. (b) If m is neither 0 nor 1, then (9.1) is nonlinear. 
Neverthe- less, show that by transforming the dependent variable from 2.2. The Linear Equation y(x) to v(x) according to v = yi-® (forn 4 0,1), (9.1) can be converted to the equation which is linear andcan be solved by themethodsdevelopedin this section. This method of solution was discovered by Gotin 1696. 10. Use the method suggested in Exercise 9(b) to find the general solution to each of the following. (a)y’ —dy=4y? (c)2ayy!+y?=Qa @)y=y" (e)y’ = ay? +2a—- 2° (d)Jy(3y'+y)=« (Hy!=ay? (h)y! = (2—y)y 13. (d’Alembert-Lagrange ear differential equation equation) The first- order nonlin- y = xf (p) + 9(p) (13.1) on y(z), whereit will be convenientto denotey’ as p, and f and g are given functions of p, is known as a d’AlembertLagrange equation after the French mathematicians Jean le Rond d’Alembert (1717-1783) and Joseph-Louis Lagrange (1736-1813). 11. (Riccati equation) The equation (a) Differentiating y' =p(x)y?+q(x)y+r(z) HINT: See if you can find a Y (x) in the form az?. (b)ay’ —2y = xy? (g)y" = (y')? — HINT:First,lety/(a) = u(a). (h)y’” + (y”)? =0 —HINT:First,lety”(x) = u(a). is called Riccati’s Y(«)=2e* y=e ty? ~y |HINT: (9.2) (f)y= (I - He ant) (gy =y- vo+(1—n)p(a)u = (1 —n)aq(2), ifried Wilhelm Leibniz (1646-1716) Gowa (1.1) equation, after the Italian mathematician (13.1) with respect to x, show that p—fp) =(ef) +9) Fdp (13.2) The Riccati equation Observe that this nonlinear equation on p(x} can be converted is nonlinearif p(a) is not identically zero. Recall from Exer- to a linear equation if we interchange the roles of x and p by Jacopo Francesco Riccati (1676-1754). cise 9 that the Bernoulli equation can always be reduced to a linear equation by a suitable change of variables. Likewise, for the Riccati equation, provided that any one particular solution can be found. Let Y(2) be anyoneparticularsolution of (11.1),asfound by inspection, trial and error, or any other means. 
[Depending now regarding x as the independent variable and p as the dependent variable. Thus, obtain from (13.2) the linear equation f(y). 9) p—f(p) —p—flr) (13.3) on p(x), g(x), and r(c), finding such a Y(x) may be easy, on x(p). Since we have divided by p ~ f(p) we must restrict or it may prove too great a task.] Show that by changing the f(p) so thatf(p) 4 p. Solving thesimplerequation(13.3)for dependent variablefromy(a) tou(x) accordingto z(p)},the solution of (13.1) is therebyobtainedin parametric form: x = x(p) from solving (13.3),andy = a(p) f(p) + g(p) y=¥(a)+— Uu (11.2) from (13.1). This result is the key idea of this exercise, and is illustrated in parts (b)—(c). In parts (d)-(k) we consider a more specialized result, namely, for the case where f(p) happens to the Riccati equation (11.1) can be converted to the equation have a “fixed point.” u’ + [(2p(a)Y(x) + q(x)]u = —p(x), (11.3) (b) To illustrate part (a), consider the equation y = 2ay’ + 3y’ [ie., where f(p) = 2p and g(p) = 3p], andderive a paramet- which is linear and can be solved by the methods developed in this section. This method of solution was discovered by Leonhard Euler (1707-1783) in 1760. ric solution as discussed in (a). (c) To illustrate part (a), consider the equation y = a(y! + y'*) 12. Use the method suggested in Exercise || to find the solution discussed in (a). fie., f(p) = p+ p? and g(p) = 0], and derive theparametric general solution to each of the following. Nonelementary inte- (d) Suppose that f(p) has a fixed point Po, that is, such that grals, such as [ exp (az) de, may be left as is. (a)y’~4y=y? HINT: Y(x) = —4 (b)y! = y? —ay + 1 HINT: Y(x) =a (c) (cosa)y’ =1—y* HINT: Y(z) =sinz f( Po) = Po. [A given function f may have none, one, or any number of fixed points. They are found as the solutions of the equation f(p) = p.] Show that (13.1) thenhas the straight line y = Pox + g(Po) (13.4) 34 Chapter2. 
Differential Equations of First Order cases where the integrals that occur in the general solution of (14.2) y = Cx + g{C), (e) Show that f(p) = 3p? has two fixed points, p = 0 and where C’ is an arbitrary constant. p = 1/3, and henceshow that theequationy = 3xp? + g(p) (b) Recall that (13.3) does not hold if f(p) = p, but (13.2) (13.3) are too difficult to evaluate.| has straight-line solutions y = g(0) and y ==aa +g @ for any given function g. (f) Determine all particular solutions of the form (13.4), if any, for theequationy = x (y’* —2y' + 2) + ev. does. Letting f(p) = p in (13.2), derive the family of solutions (14.2), as well as the additional parametrically by particular a = —g'(p), y = —pg'(p) +g(p). (g) Same as (f), for y = ze’ — 5cos y'. (h) Same as (f), for y = x (y”? — 2y') + by’. solution given (14.3a) (14.3b) (i) Sameas(f),for y = x (y3 ~ 3y’') —2siny’. (j) Sameas (f),for y —x (y’? + 3) = y’. (c) To illustrate, find the parametric solution (14.3) for the equation y = wy’ — y’. Show that in this example (14.3) can be gotten into the explicit form y = x?/4 by eliminating (k) Sameas (f), for y + a (2y’ +3) = er". the parameter p between (14.3a) and (14.3b). 14. (Clairaut equation) For the special case f(p) = p, the d’Alembert~Lagrange equation (13.1) in the preceding exercise becomes the family (14.2), for C = 0, +1/2, £1,£2, together with the solution y = x7/4. (Observe, from that plot, that the particular solution y = «7/4 forms an “envelope” of the family of straight-line solutions. Such a solution is called a singular so- y=«up+g(p), (14.1) which is known as the Clairaut equation. after the French mathematician Alexis Claude Clairaut (1713-1765). (Recall thatp denotesy’ here.) (a) Verify, by direct substitution into (14.1), that (14.1) admits lution of the differential Plot, by hand, equation.) (d) Instead of a hand plot, do a computer plot of y = 27/4 and 3, on the family (14.2), for C = 0, +£0.25,+£0.5,£0.75,..., 12. 
−8 < x < 8, −10 < y < the family of solutions

In this section we consider representative physical applications that are governed by linear first-order equations: electrical circuits, radioactivity, population dynamics, and mixing problems, with additional applications introduced in the exercises.

2.3.1. Electrical circuits. In Section 1.3 we discussed the mathematical modeling of a mechanical oscillator. The relevant physics was Newton's second law of motion, which relates the net force on a body to its resulting motion. Thus, we needed to find sufficiently accurate expressions for the forces contributed by the individual elements within that system: the forces due to the spring, the friction between the block and the table, and the aerodynamic drag.

In the case of electrical circuits the relevant underlying physics, analogous to Newton's second law for mechanical systems, is provided by Kirchhoff's laws. Instead of forces and displacements in a mechanical system comprised of various elements such as masses and springs, we are interested now in voltages and currents in an electrical system comprised of various elements such as resistors, inductors, and capacitors.

First, by a current we mean a flow of charges: the current through a given control surface, such as the cross section of a wire, is the charge per unit time crossing that surface. Each electron carries a negative charge of $1.6 \times 10^{-19}$ coulomb, and each proton carries an equal positive charge. Current is measured in amperes, with one ampere being a flow of one coulomb per second. By convention, a current is counted as positive in a given direction if it is the flow of positive charge in that direction. While, in general, currents can involve the flow of positive or negative charges, in an electrical circuit the flow is of negative charges, free electrons.
Thus, when one speaks of a current of one ampere in a given direction in an electrical circuit one really means the flow of one coulomb per second of negative charges (electrons) in the opposite direction.

Just as heat flows due to a temperature difference, from one point to another, an electric current flows due to a difference in the electric potential, or voltage, measured in volts.

We will need to know the relationship between the voltage difference across a given circuit element and the corresponding current flow. The circuit elements of interest here are resistors, inductors, and capacitors.

For a resistor, the voltage drop $E(t)$, where t is the time (in seconds), is proportional to the current $i(t)$ through it:

$$E(t) = Ri(t), \tag{1}$$

where the constant of proportionality R is called the resistance and is measured in ohms; (1) is called Ohm's law. By a resistor we usually mean an "off-the-shelf" electrical device, often made of carbon, that offers a specified resistance, such as 100 ohms, 500 ohms, and so on. But even the current-carrying wire in a circuit is itself a resistor, with its resistance directly proportional to its length and inversely proportional to its cross-sectional area, though that resistance is probably negligible compared to that of other resistors in the circuit. The standard symbolic representation of a resistor is shown in Fig. 1.

For an inductor, the voltage drop is proportional to the time rate of change of current through it:

$$E(t) = L\frac{di(t)}{dt}, \tag{2}$$

where the constant of proportionality L is called the inductance and is measured in henrys. Physically, most inductors are coils of wire, hence the symbolic representation shown in Fig. 1.

For a capacitor, the voltage drop is proportional to the charge $Q(t)$ on the capacitor:

$$E(t) = \frac{1}{C}Q(t), \tag{3}$$

where C is called the capacitance and is measured in farads.
Physically, a capacitor is normally comprised of two plates separated by a gap across which no current flows, and $Q(t)$ is the charge on one plate relative to the other. Though no current flows across the gap, there will be a current $i(t)$ that flows through the circuit that links the two plates and is equal to the time rate of change of charge on the capacitor:

$$i(t) = \frac{dQ(t)}{dt}. \tag{4}$$

From (3) and (4) it follows that the desired voltage/current relation for a capacitor can be expressed as

$$E(t) = \frac{1}{C}\int i(t)\,dt. \tag{5}$$

Figure 1. The circuit elements. Resistor: $E_1 - E_2 = E = Ri$. Inductor: $E_1 - E_2 = E = L\,di/dt$. Capacitor: $E_1 - E_2 = E = \frac{1}{C}\int i\,dt$.

Now that we have equations (1), (2), and (5) relating the voltage drop to the current, for our various circuit elements, how do we deal with a grouping of such elements within a circuit? The relevant physics that we need, for that purpose, is given by Kirchhoff's laws, named after the German physicist Gustav Robert Kirchhoff (1824-1887):

Kirchhoff's current law states that the algebraic sum of the currents approaching (or leaving) any point of a circuit is zero.

Kirchhoff's voltage law states that the algebraic sum of the voltage drops around any loop of a circuit is zero.

To apply these ideas, consider the circuit shown in Fig. 2a, consisting of a single loop containing a resistor, an inductor, a capacitor, a voltage source (such as a battery or generator), and the necessary wiring. Let us consider the current $i(t)$ to be positive clockwise; if it actually flows counterclockwise then its numerical value will be negative. In this case Kirchhoff's current law simply says that the current i is a constant from point to point within the circuit and therefore varies only with time. That is, the current law states that at any given point P in the circuit (Fig. 2b), $i_1 + (-i_2) = 0$ or $i_1 = i_2$. Kirchhoff's voltage law, which is really the self-evident algebraic identity

$$(V_a - V_d) + (V_b - V_a) + (V_c - V_b) + (V_d - V_c) = 0, \tag{6}$$

gives

$$E(t) - Ri - L\frac{di}{dt} - \frac{1}{C}\int i\,dt = 0. \tag{7}$$

Figure 2. RLC circuit.

The latter is called an integrodifferential equation because it contains both derivatives and integrals of the unknown function, but we can convert it to a differential equation in either of two ways. First, we could differentiate with respect to t to eliminate the integral sign, thereby obtaining

$$L\frac{d^2i}{dt^2} + R\frac{di}{dt} + \frac{1}{C}i = \frac{dE(t)}{dt}. \tag{8}$$

Alternatively, we could use $Q(t)$ instead of $i(t)$ as our dependent variable, for then $\int i\,dt = Q(t)$, and (7) becomes

$$L\frac{d^2Q}{dt^2} + R\frac{dQ}{dt} + \frac{1}{C}Q = E(t). \tag{9}$$

Either way, we obtain a linear second-order differential equation. Since we are discussing applications of first-order linear equations here, let us treat two special cases.

EXAMPLE 1. RL Circuit. If we omit the capacitor from our circuit, then (7) reduces to the first-order linear equation*

$$L\frac{di}{dt} + Ri = E(t). \tag{10}$$

If $E(t)$ is a continuous function of time and the current at the initial instant $t = 0$ is $i(0) = i_0$, then the solution to the initial-value problem consisting of (10) plus the initial condition $i(0) = i_0$ is given by (24) in Section 2.2, with "p" $= R/L$ and "q" $= E(t)/L$:

$$i(t) = e^{-\int_0^t \frac{R}{L}\,d\tau}\left(\int_0^t e^{\int_0^\tau \frac{R}{L}\,d\mu}\,\frac{E(\tau)}{L}\,d\tau + i_0\right)$$

or

$$i(t) = i_0e^{-Rt/L} + \frac{1}{L}\int_0^t e^{R(\tau - t)/L}E(\tau)\,d\tau \tag{11}$$

over $0 \le t < \infty$, where $\tau$ and $\mu$ have been used as dummy integration variables.

For instance, if $E(t) = \text{constant} = E_0$, then (11) gives

$$i(t) = i_0e^{-Rt/L} + \frac{E_0}{R}\left(1 - e^{-Rt/L}\right) \tag{12}$$

or

$$i(t) = \frac{E_0}{R} + \left(i_0 - \frac{E_0}{R}\right)e^{-Rt/L}. \tag{13}$$

Figure 3. Response $i(t)$ for the case $E(t) = \text{constant} = E_0$; approach to steady state.

As $t \to \infty$, the exponential term in (13) tends to zero, and $i(t) \to E_0/R$. Thus we call the $E_0/R$ term in (13) the steady-state solution and the $(i_0 - E_0/R)e^{-Rt/L}$ term the transient part of the solution. The approach to steady state, for several different initial conditions, is shown in Fig. 3.

As another case, let $E(t) = E_0\sin\omega t$ and $i_0 = 0$.
Then (11) gives

$$i(t) = \frac{E_0\omega L}{R^2 + (\omega L)^2}\left(e^{-Rt/L} + \frac{R}{\omega L}\sin\omega t - \cos\omega t\right). \tag{14}$$

*It may seem curious that if we try deleting the capacitor by setting $C = 0$, then the capacitor term in (7) becomes infinite rather than zero. Physically, however, one can imagine removing the capacitor, in effect, by moving its plates together until they touch. Since the capacitance C varies as the inverse of the gap dimension, then as the gap diminishes to zero $C \to \infty$, and the capacitor term in the differential equation does indeed drop out because of the $1/C$ factor.

As $t \to \infty$, the exponential term in (14) tends to zero, and we are left with the steady-state solution

$$i(t) \to \frac{E_0\omega L}{R^2 + (\omega L)^2}\left(\frac{R}{\omega L}\sin\omega t - \cos\omega t\right) \qquad (t \to \infty). \tag{15}$$

Observe that by a steady-state solution we mean that which remains after transients have died out; it is not necessarily a constant. For the case where $i(0) = i_0$ and $E(t) = E_0$ the steady-state solution is the constant $E_0/R$, and for the case where $i(0) = 0$ and $E(t) = E_0\sin\omega t$ the steady-state solution is the oscillatory function given by (15). ■

EXAMPLE 2. RC Circuit. If, instead of removing the capacitor from the circuit shown in Fig. 2, we remove the inductor (so that $L = 0$), then (8) becomes

$$R\frac{di}{dt} + \frac{1}{C}i = \frac{dE(t)}{dt}, \tag{16}$$

which, again, is a first-order linear equation. If we also impose an initial condition $i(0) = i_0$, then

$$i(t) = i_0e^{-t/RC} + \frac{1}{R}\int_0^t e^{(\tau - t)/RC}\,\frac{dE(\tau)}{d\tau}\,d\tau \tag{17}$$

gives the solution in terms of $i_0$ and $E(t)$. ■

Figure 4. Schematic of the system: input $[i_0, E(t)]$, system, output $[i(t)]$.

Let us use the electrical circuit problem of Example 1 to make some general remarks. We speak of the initial condition $i_0$ and the applied voltage $E(t)$ as the inputs to the system consisting of the electrical circuit, and the resulting current $i(t)$ as the output (or response), as denoted symbolically in Fig. 4. From (11), we see that if $i_0 = 0$ and $E(t) = 0$, then $i(t) = 0$: if we put nothing in we get nothing out.*

Consider the inputs and their respective responses separately.
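Before taking the inputs one at a time, it may help to see that the closed forms (13) and (14) really are what the integral formula (11) produces. The sketch below is an illustration only: it evaluates (11) with the trapezoidal rule and compares the result with (13) and (14). The function names and the numerical values of R, L, $E_0$, $i_0$, and $\omega$ are assumptions chosen for the demonstration, not data from the text.

```python
import math

def rl_current(E, R, L, i0, t, n=4000):
    """Evaluate (11) numerically: i0*exp(-Rt/L) + (1/L) * integral over [0, t]
    of exp(R*(tau - t)/L) * E(tau), using the trapezoidal rule."""
    h = t / n
    total = 0.0
    for k in range(n + 1):
        tau = k * h
        weight = 0.5 if k in (0, n) else 1.0   # trapezoid end-point weights
        total += weight * math.exp(R * (tau - t) / L) * E(tau)
    return i0 * math.exp(-R * t / L) + (h / L) * total

def rl_constant(E0, R, L, i0, t):
    """Closed form (13) for E(t) = E0."""
    return E0 / R + (i0 - E0 / R) * math.exp(-R * t / L)

def rl_sine(E0, omega, R, L, t):
    """Closed form (14) for E(t) = E0*sin(omega*t) and i0 = 0."""
    c = E0 * omega * L / (R**2 + (omega * L) ** 2)
    return c * (math.exp(-R * t / L)
                + (R / (omega * L)) * math.sin(omega * t)
                - math.cos(omega * t))

# Hypothetical parameter values, chosen only for the check.
R, L, E0, i0, omega, t = 2.0, 0.5, 10.0, 1.0, 3.0, 1.5

num_const = rl_current(lambda tau: E0, R, L, i0, t)
num_sine = rl_current(lambda tau: E0 * math.sin(omega * tau), R, L, 0.0, t)
print(num_const, rl_constant(E0, R, L, i0, t))
print(num_sine, rl_sine(E0, omega, R, L, t))
```

Each printed pair should agree to several decimal places. Calling `rl_current` with $E = 0$ and with $i_0$ doubled, or with $i_0 = 0$ and E doubled, gives a quick numerical look at the proportionality discussed next.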
If $E(t) = 0$ and $i_0 \ne 0$, then the response $i(t) = i_0e^{-Rt/L}$ to the input $i_0$ is seen to be proportional to $i_0$: if we double $i_0$ we double its response, if we triple $i_0$ we triple its response, and so on. Similarly, if $i_0 = 0$ and $E(t)$ is not identically zero, then the response

$$i(t) = \frac{1}{L}\int_0^t e^{R(\tau - t)/L}E(\tau)\,d\tau$$

to the input $E(t)$ is proportional to $E(t)$. This result illustrates an important general property of linear systems: the response to a particular input is proportional to that input.

*In contrast with linear initial-value problems, linear boundary-value problems can yield nonzero solutions even with zero input, that is, even if the boundary conditions are zero and the equation is homogeneous. These are called eigensolutions, and are studied in later chapters.

Further, observe from (11) that the total response $i(t)$ is the sum of the individual responses to $i_0$ and $E(t)$. This result illustrates the second key property of linear systems: the response to more than one input is the sum of the responses to the individual inputs. In Chapter 3 we prove these two important properties and use them in developing the theory of linear differential equations of second order and higher.

Before closing this discussion of electrical circuits, we wish to emphasize the correspondence, or analogy, between the RLC electrical circuit and the mechanical oscillator studied in Section 1.3, and governed by the equation

$$m\frac{d^2x}{dt^2} + c\frac{dx}{dt} + kx = F(t). \tag{18}$$

For we see that both equations (8) (the current formulation) and (9) (the charge formulation) are of exactly the same form as (18). Thus, their mathematical solutions are identical, and hence their physical behavior is identical too. Consider (8), for instance. Comparing it with (18), we note the correspondence

$$L \leftrightarrow m, \quad R \leftrightarrow c, \quad 1/C \leftrightarrow k, \quad i(t) \leftrightarrow x(t), \quad \frac{dE(t)}{dt} \leftrightarrow F(t). \tag{19}$$
Thus, given the values of m, c, k, and the function $F(t)$, we can construct an electrical analog circuit by setting $L = m$, $R = c$, $C = 1/k$, and $E(t) = \int F(t)\,dt$. If we also match the initial conditions by setting $i(0) = x(0)$ and $\frac{di}{dt}(0) = \frac{dx}{dt}(0)$, then the resulting current $i(t)$ will be identical to the motion $x(t)$.

Or, we could use (9) to create a different analog, namely,

$$L \leftrightarrow m, \quad R \leftrightarrow c, \quad 1/C \leftrightarrow k, \quad Q(t) \leftrightarrow x(t), \quad E(t) \leftrightarrow F(t). \tag{20}$$

In either case we see that, in mechanical terminology, the inductor provides "inertia" (as does the mass), the resistor provides energy dissipation (as does the friction force), and the capacitor provides a means of energy storage (as does the spring).

Our interest in such analogs is at least twofold. First, to whatever extent we understand the mechanical oscillator, we thereby also understand its electrical analog circuit, and vice versa. Second, if the system is too complex to solve analytically, we may wish to study it experimentally. If so, by virtue of the analogy we have the option of studying whichever is more convenient. For instance, it would no doubt be simpler, experimentally, to study the RLC circuit than the mechanical oscillator.

Finally, just as Hooke's law can be derived theoretically using the governing partial differential equations of the theory of elasticity, our circuit element relations (1) to (5) can be derived using the theory of electromagnetism, the governing equations of which are the celebrated Maxwell's equations. We will meet some of the Maxwell's equations later on in this book, when we study scalar and vector field theory.

2.3.2. Radioactive decay; carbon dating. Another important application of first-order linear equations involves radioactive decay and carbon dating.

Radioactive materials, such as carbon-14, einsteinium-253, plutonium-241, radium-226, and thorium-234, are found to decay at a rate that is proportional to the amount of mass present.
This observation is consistent with the supposition that the disintegration of a given nucleus, within the mass, is independent of the past or future disintegrations of the other nuclei, for then the number of nuclei disintegrating, per unit time, will be proportional to the total number of nuclei present:

$$\frac{dN}{dt} = -kN, \tag{21}$$

where k is known as the disintegration constant, or decay rate. Actually, the graph of $N(t)$ proceeds in unit steps since $N(t)$ is integer-valued, so $N(t)$ is discontinuous and hence nondifferentiable. However, if N is very large, then the steps are very small compared to N. Thus, we can regard N, approximately, as a continuous function of t and can tolerate the $dN/dt$ derivative in (21). However, it is inconvenient to work with N since one cannot count the number of atoms in a given mass. Thus, we multiply both sides of (21) by the atomic mass, in which case (21) becomes the simple first-order linear equation

$$\frac{dm}{dt} = -km, \tag{22}$$

where $m(t)$ is the total mass, a quantity which is more readily measured. Solving, by means of either (9) or (24) in Section 2.2, we obtain

$$m(t) = m_0e^{-kt}, \tag{23}$$

where $m(0) = m_0$ is the initial amount of mass (Fig. 5). This result is indeed the exponential decay that is observed experimentally.

Figure 5. Exponential decay.

Since k gives the rate of decay, it can be expressed in terms of the half-life T of the material, the time required for any initial amount of mass $m_0$ to be reduced by half, to $m_0/2$. Then (23) gives

$$\frac{m_0}{2} = m_0e^{-kT},$$

so $k = (\ln 2)/T$, and (23) can be re-expressed in terms of T as

$$m(t) = m_0 2^{-t/T}. \tag{24}$$

Thus, if $t = T, 2T, 3T, 4T, \ldots$, then $m(t) = m_0/2, m_0/4, m_0/8, m_0/16$, and so on.

Radioactivity has had an important archeological application in connection with dating. The basic idea behind any dating technique is to identify a physical process that proceeds at a known rate.
If we measure the state of the system now, and we know its state at the initial time, then from these two quantities together with the known rate of the process, we can infer how much time has elapsed; the mathematics enables us to "travel back in time as easily as a wanderer walks up a frozen river."*

*Ivar Ekeland, Mathematics and the Unexpected. Chicago: University of Chicago Press, 1988.

In particular, consider carbon dating, which was developed by Willard Libby in the 1950's. The essential idea is as follows. Cosmic rays consisting of high-velocity nuclei penetrate the earth's lower atmosphere. Collisions of these nuclei with atmospheric gases produce free neutrons. These, in turn, collide with nitrogen, thus changing some of the nitrogen to carbon-14, which is radioactive, and which decays to nitrogen-14 with a half-life of around 5,570 years. Thus, some of the carbon dioxide which is formed in the atmosphere contains this radioactive C-14. Plants absorb both radioactive and nonradioactive CO2, and humans and animals inhale both and also eat the plants. Consequently, the countless plants and animals living today contain both C-12 and, to a much lesser extent, its radioactive isotope C-14, in a ratio that is essentially the same from one plant or animal to another.

EXAMPLE 3. Carbon Dating. Consider a wood sample that we wish to date. Since C-14 emits approximately 15 beta particles per minute per gram, we can determine how many grams of C-14 are contained in the sample by measuring the rate of beta particle emission. Suppose that we find that the sample contains 0.2 gram of C-14, whereas if it were alive today it would, based upon its weight, contain around 2.6 grams. Thus, we assume that it contained 2.6 grams of C-14 when it died. That mass of C-14 will have decayed, over the subsequent time span $t$, to 0.2 gram. Then (24) gives

$$0.2 = (2.6)\,2^{-t/5570},$$

and, solving for $t$, we determine the sample to be around $t = 20{,}600$ years old.

However, it must be emphasized that this method (and the various others that are based upon radioactive decay) depends critically upon assumptions of uniformity.
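The arithmetic of the carbon-dating example can be reproduced by solving the half-life relation $m = m_0\,2^{-t/T}$ for $t$, which gives $t = T \log_2(m_0/m)$. The sketch below is illustrative only; the function name is an assumption, and the default half-life is the 5,570-year figure quoted in the text.

```python
import math

def age_from_c14(m_now, m_initial, half_life=5570.0):
    """Solve m_now = m_initial * 2**(-t/half_life) for the elapsed time t (years)."""
    return half_life * math.log2(m_initial / m_now)

if __name__ == "__main__":
    # Numbers from the example: 2.6 grams at death, 0.2 gram today.
    age = age_from_c14(0.2, 2.6)
    print(f"estimated age: {age:.0f} years")
    # Round trip: decaying 2.6 grams over that time span should return 0.2 gram.
    print(2.6 * 2.0 ** (-age / 5570.0))
```

With these inputs the estimate comes out at roughly 20,600 years. Because the age is proportional to the assumed half-life and depends logarithmically on the assumed initial mass, the estimate is only as good as those two assumptions.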
To date the wood sample studied in this example, for instance, we need to know the amount of C-14 present in the sample when the tree died, and what the decay rate was over the time period in question. To apply the method, we assume, first, that the decay rate has remained constant over the time period in question and, second, that the ratio of the amounts of C-14 to C-12 was the same when the tree died as it is today. Observe that although these assumptions are usually stated as fact they can never be proved, since it is too late for direct observation and the only evidence available now is necessarily circumstantial.

2.3.3. Population dynamics. In this application, we are again interested in the variation of a population $N(t)$ with the time $t$, not the population of atoms this time, but the population of a particular species such as fruit flies or human beings. According to the simplest model, the rate of change $dN/dt$ is proportional to the population $N$:

$$\frac{dN}{dt} = \kappa N, \tag{25}$$

where the constant of proportionality $\kappa$ is the net birth/death rate, that is, the birth rate minus the death rate. As in our discussion of radioactive decay, we regard $N(t)$ as continuous because the unit steps in $N$ are extremely small compared to $N$ itself. Solving (25), we obtain the exponential behavior

$$N(t) = N_0 e^{\kappa t}, \tag{26}$$

where $N(0) = N_0$ is the initial condition. If the death rate exceeds the birth rate, then $\kappa < 0$ and (26) expresses exponential decrease, with $N \to 0$ as $t \to \infty$. That result seems fair enough. However, if $\kappa > 0$, then (26) expresses exponential growth, with $N \to \infty$ as $t \to \infty$, as displayed in Fig. 6 for several different initial conditions $N_0$. That result is unrealistic because as $N$ becomes sufficiently large other factors will undoubtedly come into play, such as insufficient food or other resources.

Figure 6. Exponential growth.

In other words, we expect that $\kappa$ will not really be a constant but will vary with $N$. In particular, we expect it to decrease as $N$ increases.
As a simple model of such behavior, suppose that $\kappa$ varies linearly with $N$: $\kappa = a - bN$, with $a$ and $b$ positive, so that $\kappa$ diminishes as $N$ increases, and even becomes negative when $N$ exceeds $a/b$. Then (25) is to be replaced by the equation

$$\frac{dN}{dt} = (a - bN)N. \tag{27}$$

The latter is known as the logistic equation, or the Verhulst equation, after the Belgian mathematician P. F. Verhulst (1804-1849) who introduced it in his work on population dynamics. Due to the $N^2$ term, the equation is nonlinear, so that the solution that we developed in Section 2.2 does not apply. However, the Verhulst equation is interesting, and we will return to it.

2.3.4. Mixing problems. In this final application we consider a mixing tank with an inflow of $Q(t)$ gallons per minute and an equal outflow, where $t$ is the time; see Fig. 7. The inflow is at a constant concentration $c_1$ of a particular solute (pounds per gallon), and the tank is constantly stirred, so that the concentration $c(t)$ within the tank is uniform. Hence, the outflow is at concentration $c(t)$. Let $v$ denote the volume within the tank, in gallons; $v$ is a constant because the inflow and outflow rates are equal.

Figure 7. Mixing tank.

To keep track of the instantaneous mass of solute $x(t)$ within the tank, let us carry out a mass balance for the "control volume" $V$ (dashed lines in the figure):

$$\text{Rate of increase of mass of solute within } V = \text{Rate in} - \text{Rate out}, \tag{28}$$

$$\frac{dx}{dt}\ \frac{\text{lb}}{\text{min}} = \left(Q(t)\ \frac{\text{gal}}{\text{min}}\right)\left(c_1\ \frac{\text{lb}}{\text{gal}}\right) - \left(Q(t)\ \frac{\text{gal}}{\text{min}}\right)\left(c(t)\ \frac{\text{lb}}{\text{gal}}\right), \tag{29}$$

or, since $c(t) = x(t)/v$,

$$\frac{dx(t)}{dt} + \frac{Q(t)}{v}\,x(t) = c_1 Q(t), \tag{30}$$

which is a first-order linear equation on $x(t)$. Alternatively, we have the first-order linear equation

$$\frac{dc(t)}{dt} + \frac{Q(t)}{v}\,c(t) = \frac{c_1 Q(t)}{v} \tag{31}$$

on the concentration $c(t)$.

Recall that in modeling a physical system one needs to incorporate the relevant physics such as Newton's second law or Kirchhoff's laws. In the present application, the relevant physics is provided entirely by (28).
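To get a feeling for the concentration equation (31), it can be integrated numerically and, for a constant flow rate, compared with the closed form $c(t) = c_1 + (c_0 - c_1)e^{-Qt/v}$ that the first-order linear theory gives. The sketch below is not from the original text: the function names, the classical fourth-order Runge-Kutta stepper, and all parameter values are illustrative assumptions.

```python
import math

def c_exact(t, Q, v, c1, c0):
    """Closed-form solution of dc/dt + (Q/v) c = c1 Q / v for constant Q, with c(0) = c0."""
    return c1 + (c0 - c1) * math.exp(-Q * t / v)

def c_rk4(t_end, Q_of_t, v, c1, c0, steps=1000):
    """Integrate dc/dt = (Q(t)/v) (c1 - c) from 0 to t_end with classical RK4."""
    def rhs(t, c):
        return Q_of_t(t) / v * (c1 - c)

    h = t_end / steps
    t, c = 0.0, c0
    for _ in range(steps):
        k1 = rhs(t, c)
        k2 = rhs(t + h / 2, c + h * k1 / 2)
        k3 = rhs(t + h / 2, c + h * k2 / 2)
        k4 = rhs(t + h, c + h * k3)
        c += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return c

if __name__ == "__main__":
    # Hypothetical tank: 5 gal volume, 3 gal/min flow, inflow at 0.4 lb/gal.
    Q, v, c1, c0 = 3.0, 5.0, 0.4, 0.1
    for t_end in (0.5, 2.0, 10.0):
        numeric = c_rk4(t_end, lambda t: Q, v, c1, c0)
        print(t_end, numeric, c_exact(t_end, Q, v, c1, c0))
```

The tank concentration relaxes toward the inflow concentration $c_1$ with time constant $v/Q$. Passing a time-dependent `Q_of_t` lets the same stepper handle a variable flow rate, for which the simple closed form above no longer applies.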
To better understand (28), suppose we rewrite it with one more term included on the right-hand side:

$$\begin{aligned}\text{Rate of increase of mass of solute within } V ={}& \text{Rate into } V - \text{Rate out of } V \\ &+ \text{Rate of creation of mass within } V.\end{aligned} \tag{32}$$

The equation (32) is merely a matter of logic, or bookkeeping, not physics. Since (28) follows from (32) only if there is no creation (or destruction) of mass, we can now understand (28) to be a statement of the physical principle of conservation of mass, namely, that matter can neither be created nor destroyed (except under exceptional circumstances that are not present in this situation).

Closure. In this section we study applications of first-order linear equations to electrical circuit theory, to radioactivity and population dynamics, and to mixing problems. Although our RLC circuit gives rise to a second-order differential equation, we find that we can work with first-order equations if we omit either the inductor or the capacitor. We will return to the RLC circuit when we discuss second-order equations, so the background provided here, including the expressions for the voltage/current relations and Kirchhoff's two laws, will be drawn upon at that time.

The electrical circuit applications also give us an opportunity to emphasize the extremely important consequences of the linearity of the differential equation upon the relationship between the input and output. The key ideas are that for a linear system: (1) the response to a particular input is proportional to that input, and (2) the response to more than one input is the sum of the responses to the individual inputs. These ideas are developed and proved in Chapter 3.

EXERCISES 2.3

NOTE: Thus far we have assumed that $p(x)$ and $q(x)$ in $y' + p(x)y = q(x)$ are continuous, yet in applications that may not be the case. In particular, the "input" $q(x)$ may be discontinuous. In Example 1, for instance, $E(t)$ in $L\,di/dt + Ri = E(t)$ may well be discontinuous, such as

$$E(t) = \begin{cases} E_0, & 0 < t < t_1 \\ 0, & t_1 < t < \infty. \end{cases}$$
We state that in such cases, where $E(t)$ has one or more jump discontinuities, the solution (11) [more generally, (24) in Section 2.2] is still valid, and can be used in these exercises.

1. (RL circuit) For the RL circuit of Example 1, with $i_0 = 0$ and $E(t) = E_0$, determine the
(a) time required for $i(t)$ to reach 99% of its steady-state value;
(b) resistance $R$ needed to ensure that $i(t)$ will attain 99% of its steady-state value within 2 seconds, if $L = 0.001$ henry;
(c) inductance $L$ needed to ensure that $i(t)$ will attain 99% of its steady-state value within 0.5 seconds, if $R = 50$ ohm.

2. (RL circuit) For the RL circuit of Example 1, suppose that $i(0) = i_0$ and that $E(t)$ is as given below. In each case, determine $i(t)$ and identify the steady-state solution. If a steady state does not exist, then state that. Also, sketch the graph of $i(t)$ and label key values.
(a) $E(t) = \begin{cases} E_0, & 0 < t < t_1 \\ 0, & t_1 < t < \infty \end{cases}$
(b) $E(t) = \begin{cases} 0, & 0 < t < t_1 \\ E_0, & t_1 < t < \infty \end{cases}$
(c) $E(t) = \begin{cases} 0, & 0 < t < t_1 \\ E_0, & t_1 < t < t_2 \\ 0, & t_2 < t < \infty \end{cases}$

3. (RC circuit) (a) For the RC circuit of Example 2, suppose that $i_0 = 0$ and that $E(t) = E_0 e^{-Rt/L}$. Solve for $i(t)$ and identify the steady-state solution, treating these cases separately: $R^2C \ne L$, and $R^2C = L$. If there does not exist a steady state, then state that. Sketch the graph of $i(t)$ and label any key values.
(b) Same as (a), but with $R = C = 1$ and $E(t) = E_0 \sin t$.

4. Verify that (14) can be re-expressed as

$$i(t) = \frac{E_0\,\omega L}{R^2 + (\omega L)^2}\, e^{-Rt/L} + \frac{E_0}{\sqrt{R^2 + (\omega L)^2}}\, \sin(\omega t - \phi),$$

where $\phi$ is the (unique) angle between 0 and $\pi/2$ such that $\tan\phi = \omega L/R$; $\phi$ is called the phase angle.

5. [...] the same size. How old is it? Approximately how many years did it take for its C-14 content to diminish from its initial value to 99% of that?

6. If 10 grams of some radioactive substance will be reduced to 8 grams in 60 years, in how many years will 2 grams be left? In how many years will 0.1 gram be left?

7. If 20% of a radioactive substance disappears in 70 days, what is its half-life?

8. Show that if $m_1$ and $m_2$ grams of a radioactive substance are present at times $t_1$ and $t_2$, respectively, then its half-life is

$$T = (t_2 - t_1)\,\frac{\ln 2}{\ln(m_1/m_2)}.$$

9. (Verhulst equation) Solve the Verhulst equation (27), subject to the initial condition $N(0) = N_0$, two ways:
(a) by noting that it is a Bernoulli equation;
(b) by noting that it is (also) a Riccati equation.
NOTE: The Bernoulli and Riccati equations, and their solutions, were discussed in the exercises for Section 2.2. (The Verhulst equation can also be solved by the method of separation of variables, which method is the subject of the next section.)

10. (Mixing tank) For the mixing tank governed by (31):
(a) Let $Q(t) = \text{constant} = Q$ and $c(0) = c_0$. Solve for $c(t)$.
(b) Let $Q(t) = 4$ for $0 < t < 1$ and 2 for $t > 1$, and let $v = c_1 = 1$ and $c(0) = 0$. Solve for $c(t)$. HINT: The application of (24) in Section 2.2 is not so hard when $q(x)$ in the differential equation $y' + p(x)y = q(x)$ is defined piecewise (e.g., as in Exercise 2 above), but is tricky when $p(x)$ is defined piecewise. In this exercise we suggest that you use (24) to solve for $c(t)$ first for $0 < t < 1$, with "a" = 0 and "b" = $c(0) = 0$. Then, use that solution to evaluate $c(1)$ and use (24) again, for $1 < t < \infty$, this time with "a" = 1 and "b" = $c(1)$, where $c(1)$ has already been determined.
(c) Let $Q(t) = 2$, $c_1 = 0$, $v = 1$, and $c(0) = 0.3$. Solve for $c(t)$ and thus show that although $c(t) \to 0$ as $t \to \infty$, it never actually reduces to zero, so that it is not possible to wash every molecule of solute out of the tank. Does this result make sense? Explain.

11. (Mass on an inclined plane) The equation $mx'' + cx' = mg\sin\alpha$ governs the straight-line displacement $x(t)$ of a mass $m$ along a plane that is inclined at an angle $\alpha$ with respect to the horizontal, if it slides under the action of gravity and friction. If $x(0) = 0$ and $x'(0) = 0$, solve for $x(t)$. HINT: First, integrate the equation once with respect to $t$ to reduce it to a first-order linear equation.

12.
(Free fall; terminal velocity) The equation of motion of a body of mass $m$ falling vertically under the action of a downward gravitational force $mg$ and an upward aerodynamic drag force $f(v)$, is

$$mv' = mg - f(v), \tag{12.1}$$

where $v(t)$ is the velocity [so that $v'(t)$ is the acceleration]. The determination of the form of $f(v)$, for the given body shape, would require either careful wind tunnel measurements, or sophisticated theoretical and/or numerical analysis, the result being a plot of the nondimensional drag coefficient versus the nondimensional Reynolds number. All we need to know here is that for a variety of body shapes, the result of such an analysis is the determination that $f(v)$ can be approximated (over some limited range of velocities) in the form $cv^{\beta}$, for suitable constants $c$ and $\beta$. For low velocities (more precisely, for low Reynolds numbers) $\beta \approx 1$, and for high velocities (i.e., for high Reynolds numbers) $\beta \approx 2$.
(a) Solve (12.1), together with the initial condition $v(0) = 0$, for the case where $f(v) = cv$. What is the terminal (i.e., steady-state) velocity?
(b) Same as (a), for $f(v) = cv^2$. HINT: Read Exercise 11 in Section 2.2.

13. (Light extinction) As light passes through window glass some of it is absorbed. If $x$ is a coordinate normal to the glass (with $x = 0$ at the incident face) and $I(x)$ is the light intensity at $x$, then the fractional loss in intensity, $-dI/I$ (with the minus sign included because $dI$ will be negative), will be proportional to $dx$: $-dI/I = k\,dx$, where $k$ is a positive constant. Thus, $I(x)$ satisfies the differential equation $I'(x) = -kI(x)$. The problem: If 80% of the light penetrates a 1-inch thick slab of this glass, how thin must the glass be to let 95% penetrate? NOTE: Your answer should be numerical, not in terms of an unknown $k$.

14. (Pollution in a river) Suppose that a pollutant is discharged into a river at a steady rate $Q$ (grams/second) over a distance $L$, as sketched in the figure, and we wish to determine the distribution of pollutant in the river, that is, its concentration $c$ (grams/meter$^3$). Measure $x$ as arc length along the river, positive downstream. The river flows with velocity $U$ (meters/second) and has a cross-sectional area $A$ (meters$^2$), both of which, for simplicity, we assume to be constant. Also for simplicity, suppose that $c$ is a function of $x$ only. That is, it is a constant over each cross section of the stream. This is evidently a poor approximation near the interval $0 < x < L$, where we expect appreciable across-stream and vertical variations in $c$, but it should suffice if we are concerned mostly with the far field, that is, more than several river widths upstream or downstream of the interval $0 < x < L$. Then it can be shown that $c(x)$ is governed by the differential equation

$$kc'' - Uc' - \beta c = -\frac{Q(x)}{A}, \qquad (-\infty < x < \infty) \tag{14.1}$$

where $k$ (meters$^2$/second) is a diffusion constant, $\beta$ (grams per second per gram) is a chemical decay constant, and $Q(x)$ is the constant $Q$ over $0 < x < L$ and 0 outside that interval. [Physically, (14.1) expresses a mass balance between the input $-Q(x)/A$, the transport of pollutant by diffusion, $kc''$, the transport of pollutant by convection with the moving stream, $Uc'$, and by disappearance through chemical decay, $\beta c$.] We assume that the river is clear upstream; that is, we have the initial condition $c(-\infty) = 0$.
(a) Let $L = \infty$. Suppose that $k$ is sufficiently small so that we can neglect the diffusion term. Then (14.1) reduces to the first-order linear equation $Uc' + \beta c = Q(x)/A$. Solve for $c(x)$ and sketch its graph, labeling any key values.
(b) Repeat part (a) for the case where $L$ is finite.

15. (Newton's law of cooling) Suppose that a body initially at a uniform temperature $u_0$ is exposed to a surrounding environment that is at a lower temperature $U$. Then the outer portion of the body will cool relative to its interior, and this temperature differential within the body will cause heat to flow from the interior to the surface. If the body is a sufficiently good conductor of heat so that the heat transfer within the body is much more rapid than the rate of heat loss to the environment at the outer surface, then it can be assumed, as an approximation, that heat transfer will be so rapid that the interior temperature will adjust to the surface temperature instantaneously, and the body will be at a uniform temperature $u(t)$ at each instant $t$. Newton's law of cooling states that the time rate of change of $u(t)$ will be proportional to the instantaneous temperature difference $U - u$, so that

$$\frac{du}{dt} = k(U - u), \tag{15.1}$$

where $k$ is a constant.
(a) Solve (15.1) for $u(t)$ subject to the initial condition $u(0) = u_0$. NOTE: Actually, it is not necessary that $U < u_0$; (15.1) is equally valid if $U > u_0$. In most physical applications, however, one is interested in a hot body (such as a cup of coffee or a hot ingot) in a cooler environment.
(b) An interesting application of (15.1) occurs in connection with the determination of the time of death in a homicide. Suppose that a body is discovered at a time $T$ after death and its temperature is measured to be 90°F. We wish to solve for $T$. Suppose that the ambient temperature is $U = 70$°F and assume that $u_0 = 98.6$°F. Putting this information into the solution to (15.1) we can solve for $T$, provided that we know $k$, but we don't. Proceeding indirectly, we can infer the value of $k$ by taking one more temperature reading.
Thus, suppose that we wait an hour and again measure the temperature of the body, and find that $u(T + 1) = 87$°F. Use this information to solve for $T$.

16. (Compound interest) Suppose that a sum of money earns interest at a rate $k$, compounded yearly, monthly, weekly, or even daily. If it is compounded continuously, then $dS/dt = kS$, where $S(t)$ denotes the sum at time $t$. If $S(0) = S_0$, then the solution is

$$S(t) = S_0 e^{kt}. \tag{16.1}$$

Instead, suppose that interest is compounded yearly. Then after $t$ years

$$S(t) = S_0(1 + k)^t,$$

and if the compounding is done $n$ times per year, then

$$S(t) = S_0\left(1 + \frac{k}{n}\right)^{nt}. \tag{16.2}$$

(a) Show that if we let $n \to \infty$ in (16.2), then we do recover the continuous compounding result (16.1). HINT: Recall, from the calculus, that

$$\lim_{m \to \infty}\left(1 + \frac{1}{m}\right)^m = e.$$

(b) Let $k = 0.05$ (i.e., 5% interest) and compare $S(t)/S_0$ after 1 year ($t = 1$) if interest is compounded yearly, monthly, weekly, daily, and continuously.

2.4. Separable Equations

The general first-order differential equation is of the form

$$F(x, y, y') = 0. \tag{1}$$

If we can solve (1), by algebra, for $y'$, then we can re-express it in the form

$$y' = f(x, y). \tag{2}$$

That is not always possible; consider, for instance, the equation $xy' - y = \sin y' + 4$. Suppose that $f(x, y)$ can be expressed as a function of $x$ times a function of $y$ or, equivalently, that (2) can be written as

$$y' = X(x)Y(y); \tag{3}$$

then the equation is said to be separable. For instance, $y' = x\exp(x + 2y)$ is separable because $x\exp(x + 2y)$ can be factored as $x \exp x$ times $\exp(2y)$, but $y' = 3x - y$ is not.

To solve (3), we divide both sides by $Y(y)$ (if $Y(y) \ne 0$) and integrate both sides with respect to $x$:

$$\int \frac{1}{Y(y)}\frac{dy}{dx}\,dx = \int X(x)\,dx, \tag{4}$$

or, since $(dy/dx)\,dx = dy$, from the differential calculus,

$$\int \frac{dy}{Y(y)} = \int X(x)\,dx. \tag{5}$$

We also know from the integral calculus that if $1/Y(y)$ is a continuous function of $y$ (over the relevant $y$ interval) and $X(x)$ is a continuous function of $x$ (over the relevant $x$ interval), then the two integrals in (5) exist, in which case (5) gives the general solution of (2).

EXAMPLE 1. Solve the equation

$$y' = -y^2. \tag{6}$$

Though not linear, (6) is separable. Separating the variables and integrating gives

$$\int \frac{dy}{y^2} = -\int dx, \tag{7}$$

$$-\frac{1}{y} + C_1 = -x + C_2, \tag{8}$$

where $C_1$ and $C_2$ are arbitrary.
With $C = C_1 - C_2$, we have the general solution

$$y(x) = \frac{1}{x + C}. \tag{9}$$

If we impose an initial condition $y(0) = y_0$ then we can solve for $C$ and obtain the particular solution

$$y(x) = \frac{1}{x + 1/y_0} = \frac{y_0}{1 + y_0 x}, \tag{10}$$

which is plotted in Fig. 1 for the representative values $y_0 = 1$ and $y_0 = 2$. The solution through the initial point $(0, 1)$ exists over $-1 < x < \infty$, the one through $(0, 2)$ exists over $-1/2 < x < \infty$. More generally, the one through $(0, y_0)$ exists over $-1/y_0 < x < \infty$ because the denominator in (10) becomes zero at $x = -1/y_0$. We could plot (10) to the left of that point as well, but such extension of the graph would be meaningless because the point $x = -1/y_0$ serves as a "barrier": $y$ and $y'$ fail to exist there, so the solution cannot be continued beyond that point.

Figure 1. Particular solutions given by (10).

EXAMPLE 2. Solve the initial-value problem

$$y' = \frac{4x}{1 + 2e^y}; \qquad y(0) = 1. \tag{11}$$

Though not linear, the differential equation is separable and can be solved accordingly:

$$\int (1 + 2e^y)\,dy = \int 4x\,dx, \tag{12}$$

$$y + 2e^y = 2x^2 + C. \tag{13}$$

Unfortunately, the latter is a transcendental equation in $y$, so we cannot solve it for $y$ explicitly as a function of $x$, as we were able to solve (8). Nevertheless, we can impose the initial condition on (13) to evaluate $C$: $1 + 2e = 0 + C$, so $C = 1 + 2e$ and the solution is given, in "implicit" form, by

$$y + 2e^y = 2x^2 + 1 + 2e. \tag{14}$$

The resulting solution is plotted in Fig. 2, along with the direction field. [Actually, we did not plot (14) in Fig. 2; we used the following Maple phaseportrait commands to solve (11) and to plot the solution:

    with(DEtools):
    phaseportrait(4*x/(1+2*exp(y)), [x,y], x=-20..20, {[0,1]}, stepsize=0.05, arrows=LINE);

where the default stepsize was too large and gave a jagged curve, so we reduced it to 0.05, and where we also included the direction field to give us a feeling for the overall "flow."]

Figure 2. The solution (14) of (11).

COMMENT 1.
Observe that if we use the definite integrals

$$\int_1^y (1 + 2e^y)\,dy = \int_0^x 4x\,dx,$$

with the lower limits dictated by the initial condition $y(0) = 1$, then we bypass the need for an integration constant $C$ and its subsequent evaluation.

COMMENT 2. What is the interval of existence of the solution? In Example 1 we were able to ascertain that interval by direct examination of the solution (10). Here, however, such examination is not possible because the solution (14) is in implicit form. It appears, from Fig. 2, that the solution exists for all $x$, but of course Fig. 2 covers only $-20 < x < 20$. Equation (14) reveals the asymptotic behavior $2e^y \sim 2x^2$, or $y \sim 2\ln|x|$ as $|x| \to \infty$, so it seems clear that the solution continues to grow smoothly as $|x|$ increases.

2.4.2. Existence and uniqueness. (Optional) In this section we have begun to solve nonlinear differential equations. Before we get too deeply involved in solution techniques, let us return to the more fundamental questions of existence and uniqueness of solutions.

For the linear equation

$$y' + p(x)y = q(x) \tag{15}$$

we have Theorem 2.2.1, which tells us that (15) does admit a solution through an initial point $y(a) = b$ if $p(x)$ and $q(x)$ are continuous at $a$. That solution is unique, and it exists at least on the largest $x$ interval containing $x = a$, over which $p(x)$ and $q(x)$ are continuous.

What can be said about existence and uniqueness for the more general equation $y' = f(x, y)$ (which could, of course, be linear, but, in general, is not)?

THEOREM 2.4.1 Existence and Uniqueness
If $f(x, y)$ is continuous on some rectangle $R$ in the $x, y$ plane containing the point $(a, b)$, then the problem

$$y' = f(x, y); \qquad y(a) = b \tag{16}$$

has at least one solution defined on some open $x$ interval* containing $x = a$. If, in addition, $\partial f/\partial y$ is continuous on $R$, then the solution to (16) is unique on some open interval containing $x = a$.
*By an open interval we mean $x_1 < x < x_2$, and by a closed interval we mean $x_1 \le x \le x_2$. Thus, a closed interval includes its endpoints, an open interval does not. It is common to use the notation $(x_1, x_2)$ and $[x_1, x_2]$ for open and closed intervals, respectively. Further, $(x_1, x_2]$ means $x_1 < x \le x_2$, and $[x_1, x_2)$ means $x_1 \le x < x_2$.

Notice that whereas Theorem 2.2.1 predicts the minimum interval of existence and uniqueness, Theorem 2.4.1 merely ensures existence and uniqueness over some interval; it gives no clue as to how broad that interval will be. Thus, we say that Theorem 2.4.1 is a local result; it tells us that under the stipulated conditions all is well locally, in some neighborhood of $x = a$. More informative theorems could be cited, but this one will suffice here. Let us illustrate Theorem 2.4.1 with two examples.

EXAMPLE 3. The equation

$$y' = \frac{y(y - 2)}{x(y - 1)} \tag{17}$$

is separable, and separating the variables gives

$$\int \frac{y - 1}{y(y - 2)}\,dy = \int \frac{dx}{x}. \tag{18}$$

By partial fractions (which method is reviewed in Appendix A),

$$\frac{y - 1}{y(y - 2)} = \frac{1}{2y} + \frac{1}{2(y - 2)}. \tag{19}$$

With this result, integration of (18) gives

$$\frac{1}{2}\ln|y| + \frac{1}{2}\ln|y - 2| = \ln|x| + C, \tag{20}$$

where $C$ is the arbitrary constant of integration. Equivalently,

$$\ln\left|\frac{y(y - 2)}{x^2}\right| = 2C, \tag{21}$$

so

$$\left|\frac{y(y - 2)}{x^2}\right| = e^{2C} \equiv B, \qquad (0 \le B < \infty) \tag{22}$$

where $B$ is introduced for convenience and is nonnegative because $\exp(2C)$ is nonnegative. Thus,

$$\frac{y(y - 2)}{x^2} = \pm B \equiv A, \qquad (-\infty < A < \infty) \tag{23}$$

where $A$ replaces the "$\pm B$." Finally, (23) gives $y^2 - 2y - Ax^2 = 0$ so, by the quadratic formula, we have the general solution

$$y(x) = 1 \pm \sqrt{1 + Ax^2} \tag{24}$$

of (17).

Figure 3. Solution curves corresponding to equation (17).

These solution curves are plotted in Fig. 3. The choice $A = 0$ gives the solution curves $y(x) = 0$ and $y(x) = 2$. As representative of solutions above the line $y = 2$, consider the initial condition $y(1) = 4$. Then (24) gives $y(1) = 4 = 1 \pm \sqrt{1 + A}$, which requires that we select the plus sign and $A = 8$, so $y(x) = 1 + \sqrt{1 + 8x^2}$.
As representative of solutions below the line $y = 0$, consider the initial condition $y(1) = -3$. Then (24) gives $y(1) = -3 = 1 \pm \sqrt{1 + A}$, which requires that we select the minus sign and $A = 15$, so $y(x) = 1 - \sqrt{1 + 15x^2}$. Finally, as representative of the solutions between $y = 0$ and $y = 2$, consider the initial condition $y(2) = 3/2$, say. Then $y(2) = 3/2 = 1 \pm \sqrt{1 + 4A}$, so we choose the plus sign and $A = -3/16$, in which case (24) gives $y(x) = 1 + \sqrt{1 - 3x^2/16}$, namely, the upper branch of the ellipse

$$\left(\frac{\sqrt{3}\,x}{4}\right)^2 + (y - 1)^2 = 1.$$

In terms of the Existence and Uniqueness Theorem 2.4.1, observe that the conditions of the theorem are met everywhere in the $x, y$ plane except along the vertical line $x = 0$ and the horizontal line $y = 1$, and indeed we do have breakdowns in existence and uniqueness all along these lines. On $x = 0$ (the $y$ axis) there are no solutions through initial points other than $y = 0$ and $y = 2$ (lack of existence), and through each of those points there are an infinite number of solutions (lack of uniqueness). Initial points on the line $y = 1$ are a bit more subtle. We do have elliptical solution curves through each such point, yet at the initial point (on $y = 1$) the slope is infinite, so the differential equation (17) cannot be satisfied at that point. Thus, we have a breakdown in existence for each initial point on $y = 1$. Further, realize that for any such ellipse, between $y = 0$ and $y = 2$, the upper and lower halves are separate solutions. For instance, the ellipse $(\sqrt{3}\,x/4)^2 + (y - 1)^2 = 1$, mentioned above, really amounts to the separate solutions $y(x) = 1 \pm \sqrt{1 - (\sqrt{3}\,x/4)^2}$, each valid over $-4/\sqrt{3} < x < 4/\sqrt{3}$.

EXAMPLE 4. Free Fall. This time consider a physical application. Suppose that a body of mass $m$ is dropped, from rest, at time $t = 0$. With its displacement $x(t)$ measured downward from the point of release, the equation of motion is $mx'' = mg$, where $g$ is the acceleration of gravity and $t$ is the time. Thus,

$$x'' = g, \qquad (0 \le t < \infty) \tag{25a}$$
$$x(0) = 0, \tag{25b}$$
$$x'(0) = 0. \tag{25c}$$

Equation (25a) is of second order, whereas this chapter is about first-order equations, but it is readily integrated twice with respect to $t$. Doing so, and invoking (25b) and (25c), gives the solution

$$x(t) = \frac{g}{2}t^2, \tag{26}$$

which result is probably familiar to you from a first course in physics.

However, instead of multiplying (25a) through by $dt$ and integrating on $t$, let us multiply it by $dx$ and integrate on $x$. Then $x''\,dx = g\,dx$ and since, from the calculus,

$$x''\,dx = \frac{dx'}{dt}\,dx = \frac{dx'}{dt}\frac{dx}{dt}\,dt = x'\frac{dx'}{dt}\,dt = x'\,dx', \tag{27}$$

$x''\,dx = g\,dx$ becomes

$$x'\,dx' = g\,dx. \tag{28}$$

Integrating (28) gives

$$\frac{1}{2}x'^2 = gx + A, \tag{29}$$

and $x(0) = x'(0) = 0$ imply that $A = 0$. Thus, we have reduced (25) to the first-order problem

$$x' = \sqrt{2g}\,x^{1/2}, \qquad (0 \le t < \infty) \tag{30a}$$
$$x(0) = 0, \tag{30b}$$

which shall now be the focus of this example. Equation (30a) is separable and readily solved. The result is the general solution

$$x(t) = \frac{1}{4}\left(\sqrt{2g}\,t + C\right)^2, \tag{31}$$

which is shown, for various values of $C$, in Fig. 4. Applying (30b) gives $C = 0$, so (31) gives $x(t) = gt^2/2$, in agreement with (26). However, from the figure we can see that although a solution exists over the full $t$ interval of interest ($t \ge 0$), that solution is not unique because other solutions satisfying both (30a) and (30b) are given by the curve $x(t) = 0$ from the origin up to any point $Q$, followed by the parabola $QR$. Physically, the solution $OQR$ corresponds to the mass levitating until time $T$, then beginning its descent.

Figure 4. Nonuniqueness of the solution to (30).

Surely that sounds physically impossible, but let us look at the mathematics. We cannot apply Theorem 2.2.1 because (30) is nonlinear, but we can use Theorem 2.4.1 (with $x$ and $y$ replaced by $t$ and $x$, of course). Since $f(t, x) = \sqrt{2g}\,x^{1/2}$, we see that $f$ is continuous for all $t \ge 0$ and $x \ge 0$, but $f_x(t, x) = \sqrt{g/2}\,x^{-1/2}$ is unbounded as $x \to 0$ and hence is not continuous over any rectangle containing the initial point $t = 0$, $x = 0$.
Thus, the theorem tells us that there does exist a solution over some $t$ interval containing $t = 0$ (which turns out to be the entire positive $t$ axis), but it does not guarantee uniqueness over any such interval, and as it turns out we do not have uniqueness over any such interval.

Next, consider the physics. When we multiply force by distance we get work, and work shows up (in a system without dissipation, as in this example) as energy. Thus, multiplying (25a) by $dx$ and integrating converted the original force equation (Newton's second law) to an energy equation. That is, (29) tells us that the total energy (kinetic plus potential) is conserved; it is constant for all time: $x'^2/2 + (-gx) = \text{constant}$ or, equivalently,

$$\underbrace{\tfrac{1}{2}mx'^2}_{\text{Kinetic energy}} + \underbrace{(-mgx)}_{\text{Potential energy}} = \text{Constant}. \tag{32}$$

Since $x(0) = x'(0) = 0$, the total energy is zero. When the mass falls, its kinetic energy becomes positive and its potential energy becomes negative such that their total remains zero for all $t > 0$. However, the energy equation is also satisfied if the released mass levitates for any amount of time and then falls, or if indeed it levitates for all time [that is, $x(t) = 0$ for all $t \ge 0$]. Thus, our additional solutions are indeed physically meaningful in that they do satisfy the requirement of conservation of energy. Observe, however, that they do not satisfy the equation of motion (25a) since the insertion of $x(t) = 0$ into that equation gives $0 = g$. Thus, the spurious additional solution $x(t) = 0$ must have entered somewhere between (25) and (30). In fact, we introduced it inadvertently when we multiplied (25a) by $dx$ because $x''\,dx = g\,dx$ is satisfied not only by $x'' = g$, but also by $dx = 0$ [i.e., by $x(t) = \text{constant}$].

The upshot is that although the solution to (30) is nonunique, a look at our derivation of (30) shows that we should discount the solution $x(t) = 0$ of (30) since it does not also satisfy the original equation of motion $x'' = g$.
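The nonuniqueness discussed above can be checked directly. The sketch below is illustrative and not part of the original text; it takes the "levitate until time $T$, then fall" curves, $x(t) = 0$ for $t \le T$ and $x(t) = g(t - T)^2/2$ for $t > T$, and evaluates the residuals of the first-order equation (30a) and of the original equation of motion (25a). The function names and the value $g = 9.81$ are assumptions made for the illustration.

```python
import math

def x_delayed(t, T, g=9.81):
    """Displacement that stays at 0 until time T, then follows g (t - T)^2 / 2."""
    return 0.0 if t <= T else 0.5 * g * (t - T) ** 2

def v_delayed(t, T, g=9.81):
    """Time derivative of x_delayed; both pieces give 0 at t = T, so it is continuous."""
    return 0.0 if t <= T else g * (t - T)

def residual_first_order(t, T, g=9.81):
    """x' - sqrt(2 g) x^(1/2); a zero value means (30a) is satisfied at time t."""
    return v_delayed(t, T, g) - math.sqrt(2.0 * g * x_delayed(t, T, g))

def residual_second_order(t, T, g=9.81):
    """x'' - g; a zero value means the equation of motion (25a) is satisfied at time t."""
    accel = 0.0 if t <= T else g
    return accel - g

if __name__ == "__main__":
    for T in (1.0, 2.0):
        worst = max(abs(residual_first_order(0.1 * n, T)) for n in range(51))
        print(f"T = {T}: max |residual of (30a)| = {worst:.2e}; "
              f"residual of (25a) while levitating = {residual_second_order(T / 2, T)}")
```

Every delay $T$ gives a curve that satisfies (30a) and (30b) to rounding error, yet during the levitation phase the residual of (25a) equals $-g$ rather than zero, which is the sense in which those extra solutions are spurious.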
In that case we are indeed left with the unique solution $x(t) = gt^2/2$, corresponding to the parabola $OP$ in Fig. 4.

It is important to understand that the solution $x(t) = 0$ of (30) is not contained within the general solution (31), for any finite choice of $C$. Such an additional solution is known as a singular solution, and brief consideration of these will be reserved for the exercises.

2.4.3. Applications. Let us study two physical applications of the method of separation of variables.

EXAMPLE 5. Gravitational Attraction. Newton's law of gravitation states that the force of attraction $F$ exerted by any one point mass $M$ on any other point mass $m$ is*

$$F = G\frac{Mm}{d^2}, \tag{33}$$

where $d$ is the distance between them and $G$ ($= 6.67 \times 10^{-8}$ cm$^3$/g sec$^2$) is called the universal gravitational constant; (33) is said to be an inverse-square law since the force varies as the inverse square of the distance. (By $M$ and $m$ being point masses, we mean that their sizes are negligible compared with $d$.)

Consider the linear motion of a rocket of mass $m$ that is launched from the surface of the earth, as sketched in Fig. 5, where $M$ and $R$ are the mass and radius of the earth, respectively. From Newton's second law of motion and his law of gravitation, it follows that the equation of motion of the rocket is

$$m\frac{d^2x}{dt^2} = -G\frac{Mm}{(x + R)^2}. \tag{34}$$

Although (34) is a second-order equation, we can reduce it to one of first order by noting that

$$\frac{d^2x}{dt^2} = \frac{d}{dt}\left(\frac{dx}{dt}\right) = \frac{dv}{dt} = \frac{dv}{dx}\frac{dx}{dt} = v\frac{dv}{dx}, \tag{35}$$

where $v$ is the velocity, and where the third equality follows from the chain rule.

*Newton derived (33) from Kepler's laws of planetary motion which, in turn, were inferred empirically from the voluminous measurements recorded by the Danish astronomer Tycho Brahe (1546-1601). Usually, in applications (not to mention homework assignments in mechanics), one is given the force exerted on a mass and asked to determine the motion by twice integrating Newton's second law of motion. In deriving (33), however, Newton worked "backwards": the motion of the planets was supplied in sufficient detail by Kepler's laws, and Newton used those laws to infer the force needed to sustain that motion. It turned out to be an inverse-square force directed toward the sun. Being aware of other such forces between masses, for example, the force that kept his shoes on the floor, Newton then proposed the bold generalization that (33) holds not just between planets and the sun, but between any two bodies in the universe; hence the name universal law of gravitation. Just as it is difficult to overestimate the importance of Newton's law of gravitation and its impact upon science, it is also difficult to overestimate how the idea of a force acting at a distance, rather than through physical contact, must have been incredible when first proposed. In fact, such eminent scientists and mathematicians as Huygens, Leibniz, and John Bernoulli referred to Newton's idea of gravitation as absurd and revolting. Imagine Newton's willingness to stand nonetheless upon the results of his mathematics, in inferring the concept of gravitation, even in the absence of any physical mechanism or physical plausibility, and in the face of such opposition. Remarkably, Coulomb's law subsequently stated an inverse-square type of electrical attraction or repulsion between two charges. Why these two types of force field turn out to be of the same mathematical form is not known. Equally remarkable is the fact that although the forms of the two laws are identical, the magnitudes of the forces are staggeringly different. Specifically, the ratio of the electrical force of repulsion to the gravitational force exerted on each other by two electrons (which is independent of the distance of separation) is $4.17 \times 10^{42}$.
Thus (34) becomes the first-order equation

\[ v\,\frac{dv}{dx} = -\frac{GM}{(x+R)^2}, \tag{36} \]

which is separable and gives

\[ \int v\,dv = -GM\int \frac{dx}{(x+R)^2}, \tag{37} \]

\[ \frac{v^2}{2} = \frac{GM}{x+R} + C. \tag{38} \]

If the launch velocity is $v(0) = V$, then (38) gives $C = (V^2/2) - GM/R$, so

\[ v = \sqrt{V^2 - \frac{2GM}{R}\,\frac{x}{x+R}} \tag{39} \]

is the desired expression of $v$ as a function of $x$. If we wish to know $x(t)$ as well, then we can re-write (39) as

\[ \frac{dx}{dt} = \sqrt{V^2 - \frac{2GM}{R}\,\frac{x}{x+R}}, \tag{40} \]

which once again is variable separable and can be solved for $x(t)$. However, let us be content with (39).

Observe from (39) that $v$ decreases monotonically with increasing $x$, from its initial value $v = V$ to $v = 0$, the latter occurring at

\[ x_{\max} = \frac{V^2R^2}{2GM - V^2R}. \tag{41} \]

Subsequently, the rocket will be drawn back toward the earth and will strike it with speed $V$. [We need to choose the negative square root in (39) to track the return motion.] Equation (41) can be simplified by noting that when $x = 0$, the right-hand side of (34) must be $-mg$, where $g$ is the familiar gravitational acceleration at the earth's surface. Thus, $-mg = -GMm/R^2$, so $GM/R^2 = g$, and (41) becomes

\[ x_{\max} = \frac{V^2R}{2gR - V^2}. \tag{42} \]

We see from (42) that $x_{\max}$ increases as $V$ is increased, as one would expect, and becomes infinite as $V \to \sqrt{2gR}$. Thus, the critical value $V_e = \sqrt{2gR}$ is the escape velocity. Numerically, $V_e \approx 6.9$ miles/sec.

COMMENT 1. Recall that the law of gravitation (33) applies to two point masses separated by a distance $d$, whereas the earth is hardly a point mass. Thus, it is appropriate to question the validity of (34). In principle, to find the correct attractive force exerted on the rocket by the earth we need to consider the earth as a collection of point masses $dM$, compute the force $d\mathbf{F}$ induced by each $dM$, and add the $d\mathbf{F}$'s vectorially to find the resultant force $\mathbf{F}$.
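As a numerical aside on (42) and the escape velocity, the following sketch evaluates $V_e = \sqrt{2gR}$ and $x_{\max}$. The numerical values of g and R (32.174 ft/sec² and 3960 miles) and the function names are assumptions chosen here for illustration; the text itself quotes only the result of about 6.9 miles/sec.

```python
import math

G_FT_PER_S2 = 32.174   # assumed surface gravitational acceleration, ft/sec^2
R_MILES = 3960.0       # assumed mean radius of the earth, miles
FT_PER_MILE = 5280.0

# Work in miles and seconds throughout.
G_MILES_PER_S2 = G_FT_PER_S2 / FT_PER_MILE


def escape_velocity(g, R):
    """Critical launch speed V_e = sqrt(2 g R)."""
    return math.sqrt(2.0 * g * R)


def x_max(V, g, R):
    """Maximum altitude from (42): V^2 R / (2 g R - V^2).

    Returns infinity when V is at or above the escape velocity,
    since the rocket then never turns around.
    """
    denom = 2.0 * g * R - V * V
    if denom <= 0.0:
        return math.inf
    return V * V * R / denom


if __name__ == "__main__":
    Ve = escape_velocity(G_MILES_PER_S2, R_MILES)
    print(f"escape velocity: {Ve:.2f} miles/sec")
    for fraction in (0.25, 0.5, 0.9, 0.99):
        V = fraction * Ve
        print(f"V = {fraction:.2f} V_e -> x_max = {x_max(V, G_MILES_PER_S2, R_MILES):.1f} miles")
```

With these assumed values the computed escape velocity is consistent with the figure quoted above, and launching at half the escape velocity gives $x_{\max} = R/3$, as (42) predicts when $V^2 = gR/2$.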
This calculation is carried out later, in Section 15.7, and the result, remarkably, is that the resultant $\mathbf{F}$ acting at any point $P$ outside the earth (or any homogeneous spherical mass), per unit mass at $P$, is the same as if its entire mass $M$ were concentrated at a single point, namely, at its center! Thus, the earth might as well be thought of as a point mass, of mass $M$, located at its center, so (34) is exactly true, if we are willing to approximate the earth as a homogeneous sphere.

COMMENT 2. The steps in (35), whereby we were able to reduce our second-order equation (34) to the first-order equation (36), were not limited to this specific application. They apply whenever the force is a function of $x$ alone, for if we apply (35) to the equation

\[ m\,\frac{d^2x}{dt^2} = f(x), \tag{43} \]

we get the separable first-order equation

\[ mv\,\frac{dv}{dx} = f(x), \tag{44} \]

with solution

\[ \frac{mv^2}{2} = \int f(x)\,dx + C \tag{45} \]

or, equivalently,

\[ \left.\frac{mv^2}{2}\right|_{x_1}^{x_2} = \int_{x_1}^{x_2} f(x)\,dx. \tag{46} \]

In the language of mechanics, the right-hand side is the work done by the force $f(x)$ as the body moves from $x_1$ to $x_2$, and $mv^2/2$ is the kinetic energy. Thus, the physical significance of (46) is that it is a work-energy statement: the change in the kinetic energy of the body is equal to the work done on it.

COMMENT 3. Observe the change in viewpoint as we progressed from (34) to (36). Until the third equality in (35), we regarded $x$ and $v$ as dependent variables, that is, as functions of the independent variable $t$. But beginning with the right-hand side of that equality, we began to regard $v$ as a function of $x$. However, once we solved (36) for $v$ in terms of $x$, in (39), we replaced $v$ by $dx/dt$, and $x$ changed from independent variable to dependent variable once again. In general, then, which variable is regarded as the independent variable and which is the dependent variable is not so much figured out, as it is a decision that we make, and that decision, or viewpoint, can sometimes change, profitably, over the course of the solution. ■

EXAMPLE 6.
Verhulst Population Model. Consider the Verhulst population model

\[ N'(t) = (a - bN)N; \qquad N(0) = N_0 \tag{47} \]

that was introduced in Section 2.3.3, where $N(t)$ is the population of a given species. This example emphasizes that a given equation might be solvable by a number of different methods. Though (47) is not a linear equation, it is both a Bernoulli equation and a Riccati equation, which equations were discussed in the exercises of Section 2.2. Now we see that it is also separable, since the right side is a function of $N$ [namely, $(a - bN)N$] times a function of $t$ (namely, 1). Thus,

\[ \int \frac{dN}{(a - bN)N} = \int dt. \tag{48} \]

By partial fractions,

\[ \frac{1}{(a - bN)N} = -\frac{1}{b}\,\frac{1}{\left(N - \frac{a}{b}\right)N} = -\frac{1}{a}\,\frac{1}{N - \frac{a}{b}} + \frac{1}{a}\,\frac{1}{N}, \]

so (48) gives

\[ -\frac{1}{a}\ln\left|N - \frac{a}{b}\right| + \frac{1}{a}\ln N = t + C, \tag{49} \]

where $C$ is an arbitrary constant ($-\infty < C < \infty$). [Whether we write $\ln|N|$ or $\ln N$ in (49) is immaterial since $N > 0$.] Equivalently,

\[ \left|\frac{N}{N - a/b}\right|^{1/a} = e^{t + C}, \qquad \left|\frac{N}{N - a/b}\right| = e^{at + aC} = Be^{at}, \tag{50} \]

where we have replaced $\exp(aC)$ by $B$, so $0 < B < \infty$. Thus

\[ \frac{N}{N - a/b} = \pm Be^{at} \equiv Ae^{at}, \tag{51} \]

where $A$ is arbitrary ($-\infty < A < \infty$). Finally, imposing the initial condition $N(0) = N_0$ gives $A = N_0/(N_0 - a/b)$, and putting that expression into (51) and solving for $N$ gives

\[ N(t) = \frac{aN_0}{bN_0 + (a - bN_0)e^{-at}}. \tag{52} \]

What can be learned of the behavior of $N(t)$ from (52)? We can see from (52) that for every initial value $N_0$ (other than $N_0 = 0$), $N(t)$ tends to the constant value $a/b$ as $t \to \infty$. [If $N_0 = 0$, then $N(t) = 0$ for all $t$, as it should, because if a species starts with no members it can hardly wax or wane.]

Beyond observing that asymptotic information, it is an excellent idea to plot the results, especially now that one has such powerful and convenient computer software for that purpose. However, observe that the solution (52) contains the three parameters $a$, $b$, and $N_0$, and to use plotting software we need to choose numerical values for these parameters. If, for instance, we wish to plot $N(t)$ versus $t$ for five values of $a$, five of $b$, and five of $N_0$, then we will be generating $5^3 = 125$ curves!
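Before plotting, it can be reassuring to confirm the closed form (52) numerically. The sketch below is a minimal check under assumed parameter values (a = 1, b = 0.01, and a few choices of N0, none of which come from the text); the function names are likewise illustrative. It evaluates (52), compares a central-difference estimate of N'(t) with the right-hand side (a − bN)N of (47), and shows the approach to a/b.

```python
import math


def verhulst(t, a, b, N0):
    """Closed-form solution (52): N(t) = a N0 / (b N0 + (a - b N0) e^{-a t})."""
    return a * N0 / (b * N0 + (a - b * N0) * math.exp(-a * t))


def rhs(N, a, b):
    """Right-hand side of the Verhulst equation (47): (a - b N) N."""
    return (a - b * N) * N


def max_residual(a, b, N0, t_end=10.0, steps=200, h=1e-5):
    """Largest |N'(t) - (a - b N) N| on a grid, with N' from a central difference."""
    worst = 0.0
    for i in range(1, steps + 1):
        t = t_end * i / steps
        dN = (verhulst(t + h, a, b, N0) - verhulst(t - h, a, b, N0)) / (2.0 * h)
        worst = max(worst, abs(dN - rhs(verhulst(t, a, b, N0), a, b)))
    return worst


if __name__ == "__main__":
    a, b = 1.0, 0.01            # assumed values, so a/b = 100
    for N0 in (10.0, 100.0, 250.0):
        print(f"N0 = {N0}: N(50) = {verhulst(50.0, a, b, N0):.6f}, "
              f"max ODE residual = {max_residual(a, b, N0):.2e}")
```

Starting below, at, or above a/b, each run settles at the same limiting population, in agreement with the asymptotic statement above.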
Thus, the point is that if we wish to do a parametric study of the solution (i.e., examine the solution for a range of values of the various parameters), then there is a serious problem with managing all of the needed plots. In Section 2.4.4 below, we offer advice on how to deal with this common and serious predicament. ■

2.4.4. Nondimensionalization. (Optional) One can usually reduce the number of parameters in a problem, sometimes dramatically, by a suitable scaling of the independent and dependent variables so that the new variables are nondimensional (i.e., dimensionless).

EXAMPLE 7. Example 6, Continued. To begin such a process of nondimensionalization, we list all dependent and independent variables and parameters, and their dimensions:

                 Variable   Dimensions      Parameter   Dimensions
  Independent:   t          time            a           1/time
  Dependent:     N          number          b           1/[(time)(number)]
                                            N_0         number

(By number we mean the number of living members of the species.) How did we know that $a$ has dimensions of 1/time, and that $b$ has dimensions of 1/[(time)(number)]? From the differential equation $dN/dt = aN - bN^2$. That is, the dimensions of the term on the left are number/time, so the dimensions of $aN$ and $bN^2$ must be the same. Dimensionally, then,

  aN = number/time,    so  a = 1/time.

Similarly,

  bN^2 = number/time,  so  b = 1/[(time)(number)].

Next, we nondimensionalize the independent and dependent variables ($t$ and $N$) using suitable combinations of the parameters. From the parameter list, observe that $1/a$ has dimensions of time and can therefore be used as a "reference time" to nondimensionalize the independent variable $t$. That is, we can introduce a nondimensional version of $t$, say $\bar{t}$, by $\bar{t} = t/(1/a) = at$. Next, we need to nondimensionalize the dependent variable $N$. From the parameter list, observe that $N_0$ has dimensions of number, so let us introduce a nondimensional version of $N$, say $\bar{N}$, by $\bar{N} = N/N_0$.
In case the notion of nondimensionalization still seems unclear, realize that it is merely a change of variables, from $t$ and $N$ to $\bar{t}$ and $\bar{N}$; a rather simple change of variables in fact, since $\bar{t}$ is simply a constant times $t$, and $\bar{N}$ is simply a constant times $N$. Putting $t = \bar{t}/a$ and $N = N_0\bar{N}$ into (47) gives

\[ aN_0\,\frac{d\bar{N}}{d\bar{t}} = (a - bN_0\bar{N})\,N_0\bar{N}; \qquad N_0\bar{N}(0) = N_0, \tag{53} \]

where the left side of the differential equation follows from the chain differentiation

\[ \frac{dN}{dt} = \frac{dN}{d\bar{N}}\,\frac{d\bar{N}}{d\bar{t}}\,\frac{d\bar{t}}{dt} = (N_0)\left(\frac{d\bar{N}}{d\bar{t}}\right)(a) = aN_0\,\frac{d\bar{N}}{d\bar{t}}. \]

(More simply, but less rigorously, we could merely replace the $dN$ in $dN/dt$ by $N_0\,d\bar{N}$ and the $dt$ by $d\bar{t}/a$.) Simplifying (53) gives

\[ \frac{d\bar{N}}{d\bar{t}} = (1 - \alpha\bar{N})\bar{N}; \qquad \bar{N}(0) = 1, \tag{54} \]

where $\alpha = bN_0/a$. Thus, (54) contains only the single parameter $\alpha$. The solution of (54) is

\[ \bar{N}(\bar{t}) = \frac{1}{\alpha + (1 - \alpha)e^{-\bar{t}}}. \tag{55} \]

The idea is that if we plot $\bar{N}(\bar{t})$ versus $\bar{t}$ (rather than $N(t)$ versus $t$), then we have only the one-parameter family of solutions given by (55), where the parameter is the nondimensional quantity $\alpha = bN_0/a$. Those solutions are shown in Fig. 6 for several different values of $\alpha$. As $\bar{t} \to \infty$ (and hence $t \to \infty$), $\bar{N} \to 1/\alpha$, so $N/N_0 \to 1/(bN_0/a)$, or $N \to a/b$, as found in Example 4.

[Figure 6. Nondimensional solution of Verhulst problem.]

COMMENT. The nondimensionalization of the independent and dependent variables can often be done in more than one way. In the present example, for instance, we used $N_0$ to nondimensionalize $N$: $\bar{N} = N/N_0$. However, $a/b$ also has the dimensions of number, so we could have defined $\bar{N} = N/(a/b) = bN/a$ instead. Similarly, we could have nondimensionalized $t$ differently, as $\bar{t} = N_0bt$, because $N_0b$ has dimensions of 1/time. Any nondimensionalization will work, and we leave these other choices for the exercises. ■

EXAMPLE 8. Example 5, Continued. As one more example of the simplifying use of nondimensionalization, consider the initial-value problem

\[ \frac{dx}{dt} = \sqrt{V^2 - \frac{2GM}{R}\,\frac{x}{x+R}}; \qquad x(0) = 0 \tag{56} \]

from Example 5.
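Returning briefly to the Verhulst scaling of Example 7, the claim that (55) is a one-parameter family can be checked directly: two quite different dimensional parameter sets that share the same $\alpha = bN_0/a$ should collapse onto the same nondimensional curve. The sketch below is illustrative only; the two parameter sets and the function names are assumptions made here, chosen so that both give $\alpha = 0.25$.

```python
import math


def verhulst(t, a, b, N0):
    """Dimensional solution (52)."""
    return a * N0 / (b * N0 + (a - b * N0) * math.exp(-a * t))


def verhulst_nondim(tbar, alpha):
    """Nondimensional solution (55): Nbar = 1 / (alpha + (1 - alpha) e^{-tbar})."""
    return 1.0 / (alpha + (1.0 - alpha) * math.exp(-tbar))


def collapse_error(a, b, N0, tbars):
    """Largest gap between N(t)/N0, evaluated at t = tbar/a, and Nbar(tbar)."""
    alpha = b * N0 / a
    return max(
        abs(verhulst(tb / a, a, b, N0) / N0 - verhulst_nondim(tb, alpha))
        for tb in tbars
    )


if __name__ == "__main__":
    tbars = [0.1 * k for k in range(0, 101)]        # tbar from 0 to 10
    # Two assumed parameter sets with the same alpha = b*N0/a = 0.25.
    for a, b, N0 in ((1.0, 0.01, 25.0), (2.0, 0.1, 5.0)):
        print(f"a={a}, b={b}, N0={N0}: collapse error = "
              f"{collapse_error(a, b, N0, tbars):.2e}")
    print("limit of Nbar for alpha = 0.25:", verhulst_nondim(60.0, 0.25))
```

Both parameter sets reproduce the same curve to within rounding error, and the curve levels off at $1/\alpha$, which is the nondimensional form of $N \to a/b$.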
As above, we begin by listing all variables and parameters, and their dimensions:

                 Variable   Dimensions      Parameter   Dimensions
  Independent:   t          time            V           length/time
  Dependent:     x          length          R           length

We didn't bother with $G$ and $M$ in the parameter list because $V$ and $R$ are all we need to nondimensionalize $t$ and $x$. Specifically, $R$ has dimensions of length, so we can choose $\bar{x} = x/R$, and $R/V$ has dimensions of time, so we can choose $\bar{t} = t/(R/V)$. Putting $x = R\bar{x}$ and $t = R\bar{t}/V$ into (56) gives

\[ V\,\frac{d\bar{x}}{d\bar{t}} = \sqrt{V^2 - \frac{2GM}{R}\,\frac{R\bar{x}}{R\bar{x} + R}}; \qquad R\bar{x}(0) = 0 \tag{57} \]

or

\[ \frac{d\bar{x}}{d\bar{t}} = \sqrt{1 - \alpha\,\frac{\bar{x}}{\bar{x} + 1}}; \qquad \bar{x}(0) = 0 \tag{58} \]

with the single parameter $\alpha = 2GM/RV^2$. Since all other quantities in the final differential equation are nondimensional, it follows that $\alpha$ must be nondimensional as well, as could be checked from the known dimensions of $G$, $M$, $R$, and $V$. Of course, whereas we've used the generic dimensions "time" and "length," we could have used specific dimensions such as seconds and meters.

It is common in engineering and science to nondimensionalize the governing equations and initial or boundary conditions even before beginning the solution, so as to reduce the number of parameters as much as possible. In each of the foregoing two examples we ended up with a single parameter, but the final number of parameters will vary from case to case.

The nondimensional parameters that result (such as $\alpha$ in Example 6) are sometimes well known and of great importance. For instance, if one nondimensionalizes the differential equations governing fluid flow, two nondimensional parameters that arise are the Reynolds number $Re$ and the Mach number $M$. Without getting into the fluid mechanics, let it suffice to say that the Reynolds number is a measure of the relative importance of viscous effects to inertial effects: if the Reynolds number is sufficiently large one can neglect the viscous terms in the governing equations of motion, and if it is sufficiently small then one can neglect the inertial terms.
Similarly, the Mach number is a measure of the importance of the effects of the compressibility of the fluid: if $M$ is sufficiently small then one can neglect those effects and consider the fluid to be incompressible. In fact, any given approximation that is made in engineering science is probably based upon whether some relevant nondimensional parameter is sufficiently large or small, for one is always neglecting one effect relative to others.

Closure. We see that the method of separation of variables is relatively simple: one separates the variables and integrates. Thus, given a specific differential equation, one is well advised to see immediately if the equation is of separable type and, if it is, to solve by separation of variables. Of course, it might turn out that one or both of the integrations are difficult, but the general rule of thumb is that there is a conservation of difficulty, so that if the integrations are difficult, then an equivalent difficulty will show up if one tries a different solution technique.

In the last part of this section we discuss the idea of nondimensionalization. The latter is not central to the topic of this section, separation of variables, but arises tangentially with regard to the efficient management of systems that contain numerous parameters, which situation makes graphical display and general understanding of the results more difficult.

Computer software. A potential difficulty with the method of separation of variables is that the integrations involved may be difficult. Using Maple, for instance, integrations can be carried out using the int command. For example, to evaluate the integral on the left side of (48), enter

    int(1/((a - b*N)*N), N);

and return. The output is

\[ -\frac{\ln(a - bN)}{a} + \frac{\ln(N)}{a}, \]

which (to within an additive constant) is the same as the left side of equation (49). That is not to say that all integrals can be evaluated in closed form by computer the integral of e~**from x = 0 to x = oo.
Enter int(exp(—2x), v = 0..infinity); and return. The output is 1. 60 Chapter 2. Differential Equations of First Order 1 7 =40°+Cl y(2) EXERCISES 2.4 NOTE: Solutions should be expressed in explicit form if possible. 1. Use separation of variables to find the general solution. Then, obtain the particular solution satisfying the given initial condition. Sketch the graph of the solution, showing the key features, and label any key values. (a)y' —327e¥=0; (b)y’ = 62°+5; y(0)=0 separable, in general, but it is if p and q are constants, if one times the other. Obtain the general solution for the case where each is a nonzero constant, for any real number n. HINT: A difficult integral will occur. Our discussion of the Bernoulli equation in the exercises for Section 2.2 should help you to find a change of variables that will simplify that integration. (e)y’ =(y° —y)e"; y(0)=2 (Hy = y>+y—6; y(5)= 10 hy =624; y(0)= —4 yl) =e 6. Solvey’ = (6x? + 1)/(y —1),subjectto thegiveninitial (iy! =e; (0) =1 y Qy = 5 «¥8)=-1 (k)y' +3y(y+ 1)sin2z= 0; (0) =1 condition. (a)y(0)= -2 (c)y(0)= 0 (d)y(1)=3 y=Iny, y(0)=5 (m)y’=ylny; y(0)=5 (f)y(-1) =0 7. Solve y’ = (3x? —1)/2y, subject to the given initial condition. (n)y'+2y=y?+1; y(-3) =0 2.(a)—(n) For the equation given in Exercise was studied in Section 2.3 and solved as a Bernoulli equation, and also as a Riccati equation. Here we ask you to solve it by separation of variables. of the functionsp(x) and q(x) is zero,or if one is a constant (dy =14y"; y(2)=5 yl (a>0, b>0) N(0) = No N'(t) = (a—bN)N; 5. The Bernoulli equationy’ +p(x)y = q(x)y” is not variable y(0) =0 (c)y' +4y= 0; y(—1)=0 (gy =yly+3); 4. The Verhulst population problem 1, use computer (a)y(0)= —3 (b)y(0)= —1 (c)y(4)=5 software to solve for y(a). Verify, by direct substitution, that (d)y(-1) =0 (e)y(—2)= -4—s (f y(1)= —6 your solution does satisfy the given differential equation and A function f(#1,...,2n) functions) 8. (Homogeneous initial condition. 
said to be homogeneous of degree & if f(Avi,...,Atn) 3. The problemdu/dt = k(U —u); u(0) = up , where& and M f(a1,...,;2n) for anyA. For example, U are constants, occurred in the exercises for Section 2.3 in connection with Newton’s law of cooling. Solve by separation of variables. ∕ = y+ ↨ oe sin. (2) 1s = 2.4, Separable Equations 61 (e)Similarly, fory! = (@~ y ~ 4)/(a@ + y —4). is homogeneous of degree 3 because ↕↕ (f) Devise a method of solution that will work in the excep- State whether f is homogeneous or not. If it is, determine its degree. 12. (Algebraic, exponential, and explosive growth) We saw, in Section 2.3.3, that the population model ∫ ∩ ↨ eS X ∑∑ ↓ ∫ (©+ 2y —1)/(2e ∑ yf!= ∑∶ +∑dy —1). ferns) aoe (2) (a)f(a,y) = 2?+4y?—7 dN — =A#N dt 52 dN 9. (Homogeneous equation) The equation (of degree zero); see the preceding exercise. CAUTION: The term homogeneous is also used to describe a linear differential equation that has zero as its “forcing function” on the righthand side, as defined in Section 1.2. Thus, one needs to use the context to determine which meaning is intended. (a) Show, by examples, that (9.1) may, but need not, be separable. (b) In any case, show that the change of dependent variable w = y/a, from y(z) to w(x), reduces(9.1)to the separable 1 Fw) —w 7 L ty +2y? yf =e e)y (ey = el/* by’ = ar (by @y=--2a +: (a, b,...,f constants) 1 w(p — 1)NG* (12.3) No denotes the initial value N(0}. Observe that T’ diminishes as p increases. 7 we nondimensionInExample 13. (Nondimensionalization) alized according tof = at and N = N/No._ Instead, nondimensionalize (47) according tot = at and N = bN/a, and thus derive the solution B MO By Be where3 = bNo/a. Sketchthe graphof N(é) versusé, for (11.1) several different values of 7, labeling any key value(s). can be reduced to homogeneous form by the change of variables x = u+h,y = u-+k, where hand k are suitably chosen constants, provided that ae ~ bd #0. 
growth as a limiting case of algebraic growth, in the limit as the exponent 7 becomes infinite. Thus, exponential growth is powerful indeed.) If p is increased beyond | then we expect the growth to be even more spectacular. Show thatifp > 1 then the solution exhibits explosive growth, explosive in the sense that NV-> ooin finite time, as t > T', where or 11. (Almost-homogeneous equation) (a) Show that ~dx Fey +f (12.2) (Of course, when p = 1 we then have exponential growth, as mentioned above, so we can think — crudely ~ of exponential (9.1) +ey yl = ax + by +e («> 0) N(t) ~ at® as t + oo]. Show that as p -+ 0 the exponent @ tends to unity, and as p > 1 the exponent f tends to infinity. T= 10. Use the idea contained in the preceding exercise, to find the general solution to each of the following equations. y x 2y— 2 (a)y’—_ ==+4+3,/= y p anRNP, (9.1) where p is a positive constant. Solve (12.2) and show that if 0 <p < 1 then the solution exhibits algebraic growth [1.c., is said to be homogeneous because f(y/x) is homogeneous Ww (12.1) generally, consider the model (d)f(x,y)=sin(x*+y*) form («>0) gives exponential growth, whereby N’ — oo as t + oo. More (c) f(a, y) = 2? —y? + Tez —32y v=1@) ∕ tional case where ae ~—bd = 0, and apply it to the case (b)Thus,find thegeneralsolutionof y! = (22 —y ~ 6)/(a y —3). (c)Similarly, for y’ = (1 —y)/(a + dy —3). (d)Similarly, for y! = (a + y)/(a —y +1). 14. The initial-value problem w'(0)= 24 (14.1) corresponding to a damped mechanical oscillator driven contains seven parameters: by the force Fsinwt, Nondimensionalize (14.1). How many m,c,k,F,w,2o,29. parameters are present in the nondimensionalized system? ma” - ea’ +ka = Fsinwt; z(0)=2, 62 2.5 Exact Equations and Integrating Factors Thus far we have developed solution techniques for first-order differential equations that are linear or separable. In addition, Bernoulli, Riccati, Clairaut, homogeneous, and almost-homogeneous equations were discussed in the exercises. 
In this section we consider one more important case, equations that are "exact," and ones that are not exact but can be made exact.

First, let us review some information, from the calculus, about partial derivatives. Specifically, recall that the symbol $\dfrac{\partial^2 f}{\partial x\,\partial y}$ is understood to mean $\dfrac{\partial}{\partial x}\left(\dfrac{\partial f}{\partial y}\right)$. If we use the standard subscript notation instead, then this quantity would be expressed as $f_{yx}$, that is, $(f_y)_x$. Does the order of differentiation matter? That is, is $f_{yx} = f_{xy}$? It is shown in the calculus that a sufficient condition for $f_{xy}$ to equal $f_{yx}$ is that $f_x$, $f_y$, $f_{yx}$, and $f_{xy}$ all be continuous within the region in question. These conditions are met so typically in applications, that in textbooks on engineering and science $f_{xy}$ and $f_{yx}$ are generally treated as indistinguishable. Here, however, we will treat them as equal only if we explicitly assume the continuity of $f_x$, $f_y$, $f_{yx}$, and $f_{xy}$.

2.5.1. Exact differential equations. To motivate the idea of exact equations, consider the equation

\[ \frac{dy}{dx} = \frac{\sin y}{2y - x\cos y} \tag{1} \]

or, rewritten in differential form,

\[ \sin y\,dx + (x\cos y - 2y)\,dy = 0. \tag{2} \]

If we notice that the left-hand side is the differential of $F(x, y) = x\sin y - y^2$, then (2) is simply $dF = 0$, which can be integrated to give $F = \text{constant}$; that is,

\[ F(x, y) = x\sin y - y^2 = C, \tag{3} \]

where $C$ is an arbitrary constant of integration. Equation (3) gives the general solution to (1), in implicit form.

Really, our use of the differential form (2) begs justification since we seem to have thereby treated $dy/dx$ as a fraction of computable quantities $dy$ and $dx$, whereas it is actually the limit of a difference quotient. Such justification is possible, but it may suffice to note that the use of differentials is a matter of convenience and is not essential to the method.
For instance, observe that if we write

\[ \sin y + (x\cos y - 2y)\,\frac{dy}{dx} = 0 \tag{4} \]

in place of (2), to avoid any questionable use of differentials, then the left-hand side of (4) is the $x$ derivative (total, not partial) of $F(x, y) = x\sin y - y^2$:

\[ \frac{d}{dx}F(x, y(x)) = \frac{d}{dx}\left(x\sin y - y^2\right) = \sin y + (x\cos y - 2y)\,\frac{dy}{dx}, \]

so $dF/dx = 0$. Integrating the latter gives $F(x, y) = x\sin y - y^2 = C$, just as before. Thus, let us continue, without concern about manipulating $dy/dx$ as though it were a fraction.

Seeking to generalize the method outlined above, we consider the differential equation

\[ \frac{dy}{dx} = -\frac{M(x, y)}{N(x, y)}, \tag{5} \]

where the minus sign is included so that when we re-express (5) in the differential form

\[ M(x, y)\,dx + N(x, y)\,dy = 0, \tag{6} \]

then both signs on the left will be positive. It is important to be aware that in equation (5) $y$ is regarded as a function of $x$, as is clear from the presence of the derivative $dy/dx$. That is, there is a hierarchy whereby $x$ is the independent variable and $y$ is the dependent variable. But upon re-expressing (5) in the form (6) we change our viewpoint and now consider $x$ and $y$ as having the same status; now they are both independent variables.

We observe that integration of (6) is simple if $M\,dx + N\,dy$ happens to be the differential of some function $F(x, y)$, for if there does exist a function $F(x, y)$ such that

\[ dF(x, y) = M(x, y)\,dx + N(x, y)\,dy, \tag{7} \]

then (6) is

\[ dF(x, y) = 0, \tag{8} \]

which can be integrated to give the general solution

\[ F(x, y) = C, \tag{9} \]

where $C$ is an arbitrary constant.

Given $M(x, y)$ and $N(x, y)$, suppose that there does exist an $F(x, y)$ such that $M\,dx + N\,dy = dF$. Then we say that $M\,dx + N\,dy$ is an exact differential, and that (6) is an exact differential equation. That case is of great interest because its general solution is given immediately, in implicit form, by (9). Two questions arise. How do we determine if such an $F$ exists and, if it does, then how do we find it? The first is addressed by the following theorem.
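The statement that (3) is the general solution of (1) can be tested numerically: along any solution curve of (1), the quantity $F(x, y) = x\sin y - y^2$ should stay constant. The sketch below is one way to check this, under assumptions made here for illustration: a classical fourth-order Runge-Kutta integrator (not a method discussed in the text), the starting point (0, 1), and the interval 0 ≤ x ≤ 1, on which the denominator $2y - x\cos y$ stays positive.

```python
import math


def slope(x, y):
    """Right-hand side of equation (1): dy/dx = sin y / (2y - x cos y)."""
    return math.sin(y) / (2.0 * y - x * math.cos(y))


def F(x, y):
    """The function whose differential is the left-hand side of (2)."""
    return x * math.sin(y) - y * y


def rk4(f, x0, y0, x1, n):
    """Integrate dy/dx = f(x, y) from x0 to x1 in n classical Runge-Kutta steps."""
    h = (x1 - x0) / n
    x, y = x0, y0
    for _ in range(n):
        k1 = f(x, y)
        k2 = f(x + h / 2.0, y + h * k1 / 2.0)
        k3 = f(x + h / 2.0, y + h * k2 / 2.0)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x += h
    return y


if __name__ == "__main__":
    x0, y0, x1 = 0.0, 1.0, 1.0          # assumed starting point and interval
    y1 = rk4(slope, x0, y0, x1, 200)
    print(f"y({x1}) = {y1:.6f}")
    print(f"F at start = {F(x0, y0):.10f}, F at end = {F(x1, y1):.10f}")
```

The two printed values of F agree to within the integration error, so the computed curve lies on the level set $x\sin y - y^2 = -1$.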
THEOREM 2.5.1 Test for Exactness
Let $M(x, y)$, $N(x, y)$, $\partial M/\partial y$, and $\partial N/\partial x$ be continuous within a rectangle $R$ in the $x, y$ plane. Then $M\,dx + N\,dy$ is an exact differential, in $R$, if and only if

\[ \frac{\partial M}{\partial y} = \frac{\partial N}{\partial x} \tag{10} \]

everywhere in $R$.

Partial Proof: Let us suppose that $M\,dx + N\,dy$ is exact, so that there is an $F$ satisfying (7). Then it must be true, according to the chain rule of the calculus, that

\[ M = \frac{\partial F}{\partial x} \tag{11a} \]

and

\[ N = \frac{\partial F}{\partial y}. \tag{11b} \]

Differentiating (11a) partially with respect to $y$, and (11b) partially with respect to $x$, gives

\[ M_y = F_{xy}, \tag{12a} \]

and

\[ N_x = F_{yx}. \tag{12b} \]

Since $M$, $N$, $M_y$, and $N_x$ have been assumed continuous in $R$, it follows from (11) and (12) that $F_x$, $F_y$, $F_{xy}$, and $F_{yx}$ are too, so $F_{xy} = F_{yx}$. Then it follows from (12) that $M_y = N_x$, which is equation (10). Because of the "if and only if" wording in the theorem, we also need to prove the reverse: that the truth of (10) implies the existence of $F$. That part of the proof can be carried out using results established in Section 16.12, and will not be given here.

Actually, $R$ need not be a rectangle; it merely needs to be "simply connected," that is, a region without holes. Simple connectedness will be defined and used extensively in Chapter 16 on Field Theory.

Assuming that the conditions of the theorem are met, so that we are assured that such an $F$ exists, how do we find $F$? We can find it by integrating (11a) with respect to $x$, and (11b) with respect to $y$. Let us illustrate the method by reconsidering the example given above.

EXAMPLE 1. Consider equation (1) once again, or, in differential form,

\[ \sin y\,dx + (x\cos y - 2y)\,dy = 0. \tag{13} \]

First, we identify $M = \sin y$ and $N = x\cos y - 2y$. Clearly, $M$, $N$, $M_y$, and $N_x$ are continuous in the whole plane, so we turn to the exactness condition (10): $M_y = \cos y$, and $N_x = \cos y$, so (10) is satisfied, and it follows from Theorem 2.5.1 that there does exist an $F(x, y)$ such that the left-hand side of (13) is $dF$. Next, we find $F$ from (11):

\[ \frac{\partial F}{\partial x} = \sin y, \tag{14a} \]

\[ \frac{\partial F}{\partial y} = x\cos y - 2y. \tag{14b} \]

Integrating (14a) partially, with respect to $x$, gives

\[ F(x, y) = \int \sin y\;\partial x = x\sin y + A(y), \tag{15} \]

where the $\sin y$ integrand was treated as a constant in the integration since it was a "partial integration" on $x$, holding $y$ fixed [just as $y$ was held fixed in computing $\partial F/\partial x$ in (14a)]. The constant of integration $A$ must therefore be allowed to depend upon $y$, since $y$ was held fixed and was therefore constant. If you are not convinced of this point, observe that taking a partial $x$-derivative of (15) does indeed recover (14a).

Observe that initially $F(x, y)$ was unknown. The integration of (14a) reduced the problem from an unknown function $F$ of $x$ and $y$ to an unknown function $A$ of $y$ alone. $A(y)$, in turn, can now be determined from (14b). Specifically, we put the right-hand side of (15) into the left-hand side of (14b) and obtain

\[ x\cos y + A'(y) = x\cos y - 2y, \tag{16} \]

where the prime denotes $d/dy$. Cancelling terms gives $A'(y) = -2y$, so

\[ A(y) = -\int 2y\,dy = -y^2 + B, \tag{17} \]

where this integration was not a "partial integration"; it was an ordinary integration on $y$ since $A'(y)$ was an ordinary derivative of $A$. Combining (17) and (15) gives

\[ F(x, y) = x\sin y - y^2 + B = \text{constant}. \tag{18} \]

Finally, absorbing $B$ into the constant, and calling the result $C$, gives the general solution

\[ x\sin y - y^2 = C \tag{19} \]

of (1), in implicit form.

COMMENT 1. Be aware that the partial integration notation $\int(\;)\,\partial x$ and $\int(\;)\,\partial y$ is not standard; we use it here because we find it reasonable, and helpful in reminding us that any $y$'s in the integrand of $\int(\;)\,\partial x$ are to be treated as constants, and likewise for any $x$'s in $\int(\;)\,\partial y$.

COMMENT 2. From (13) all the way through (19), $x$ and $y$ have been regarded as independent variables. With (19) in hand, we can now return to our original viewpoint of $y$ being a function of $x$. We can, if possible, solve (19) by algebra for $y(x)$ [in this case it is not possible because (19) is transcendental], plot the result, and so on. Even failing to solve (19) for $y(x)$, we can nevertheless verify that $x\sin y - y^2 = C$ satisfies (1) by differentiating with respect to $x$. That step gives $\sin y + x(\cos y)y' - 2yy' = 0$ or $y' = (\sin y)/(2y - x\cos y)$, which does agree with (1).

COMMENT 3. It would be natural to wonder how this method can fail to work. That is, whether or not $M_y = N_x$, why can't we always successfully integrate (11) to find $F$? The answer is to be found in (16). For suppose (16) were $2x\cos y + A'(y) = x\cos y - 2y$ instead. Then the $x\cos y$ terms would not cancel, as they did in (16), and we would have $A'(y) = -x\cos y - 2y$, which is impossible because it expresses a relationship between $x$ and $y$, whereas $x$ and $y$ are regarded here as independent variables. Thus, the cancellation in (16) was a consequence of the fact that $M$ and $N$ satisfied the exactness condition (10).

COMMENT 4. Though we used (14a) first, then (14b), the order is immaterial and could have been reversed.

2.5.2. Integrating factors. It may be discouraging to realize that for any given pair of functions $M$ and $N$ it is unlikely that the exactness condition (10) will be satisfied. However, there is power available to us that we have not yet tapped, for even if $M$ and $N$ fail to satisfy (10), so that the equation

\[ M(x, y)\,dx + N(x, y)\,dy = 0 \tag{20} \]

is not exact, it may be possible to find a multiplicative factor $\sigma(x, y)$ so that

\[ \sigma(x, y)M(x, y)\,dx + \sigma(x, y)N(x, y)\,dy = 0 \tag{21} \]

is exact. That is, we seek a function $\sigma(x, y)$ so that the revised exactness condition

\[ \frac{\partial}{\partial y}(\sigma M) = \frac{\partial}{\partial x}(\sigma N) \tag{22} \]

is satisfied. Of course, we need $\sigma(x, y) \neq 0$ for (21) to be equivalent to (20).

If we can find a $\sigma(x, y)$ satisfying (22), then we call it an integrating factor of (20) because then (21) is equivalent to $dF = 0$, for some $F(x, y)$, and $dF = 0$ can be integrated immediately to give the solution of the original differential equation as $F(x, y) = \text{constant}$.

How do we find such a $\sigma$? It is any (nonzero) solution of (22), that is, of

\[ \sigma_y M + \sigma M_y = \sigma_x N + \sigma N_x. \tag{23} \]

Of course, (23) is a first-order partial differential equation on $\sigma$, so we have made dubious headway: to solve our original first-order ordinary differential equation on $y(x)$, we now need to solve the first-order partial differential equation (23) on $\sigma(x, y)$!

However, perhaps an integrating factor $\sigma$ can be found that is a function of $x$ alone: $\sigma(x)$. Then (23) reduces to the differential equation

\[ \sigma M_y = \frac{d\sigma}{dx}N + \sigma N_x \qquad\text{or}\qquad \frac{1}{\sigma}\frac{d\sigma}{dx} = \frac{M_y - N_x}{N}, \tag{24} \]

which is separable. This idea succeeds if and only if the $(M_y - N_x)/N$ ratio on the right-hand side of (24) is a function of $x$ only, for if it did contain any $y$ dependence then (24) would amount to the impossible situation of a function of $x$ equalling a function of $x$ and $y$, where $x$ and $y$ are independent variables. Thus, if

\[ \frac{M_y - N_x}{N} = \text{function of } x \text{ alone}, \tag{25} \]

then integration of (24) gives

\[ \sigma(x) = e^{\int \frac{M_y - N_x}{N}\,dx}. \tag{26} \]

Actually, the general solution of (24) includes an arbitrary constant factor, but that factor is inconsequential and can be taken to be 1. Also, remember that we need $\sigma$ to be nonzero and we are pleased to see, a posteriori, that the $\sigma$ given in (26) cannot equal zero because it is an exponential function.

If $(M_y - N_x)/N$ is not a function of $x$ alone, then an integrating factor $\sigma(x)$ does not exist, but we can try to find $\sigma$ as a function of $y$ alone: $\sigma(y)$. Then (23) reduces to

\[ \frac{d\sigma}{dy}M + \sigma M_y = \sigma N_x \qquad\text{or}\qquad \frac{1}{\sigma}\frac{d\sigma}{dy} = -\frac{M_y - N_x}{M}, \]

which, again, is separable. If

\[ \frac{M_y - N_x}{M} = \text{function of } y \text{ alone}, \tag{27} \]

then

\[ \sigma(y) = e^{-\int \frac{M_y - N_x}{M}\,dy}. \tag{28} \]

EXAMPLE 2. Consider the equation (already expressed in differential form)

\[ dx + (3x - e^{-2y})\,dy = 0. \tag{29} \]

Then $M = 1$ and $N = 3x - e^{-2y}$, so (10) is not satisfied and (29) is not exact. Seeking an integrating factor that is a function of $x$ alone, we find that

\[ \frac{M_y - N_x}{N} = \frac{0 - 3}{3x - e^{-2y}} \neq \text{function of } x \text{ alone}, \tag{30} \]

and conclude that $\sigma(x)$ is not possible. Seeking instead an integrating factor that is a function of $y$ alone,

\[ \frac{M_y - N_x}{M} = \frac{0 - 3}{1} = -3 = \text{function of } y \text{ alone}, \tag{31} \]

so that $\sigma(y)$ is possible, and is given by

\[ \sigma(y) = e^{-\int \frac{M_y - N_x}{M}\,dy} = e^{\int 3\,dy} = e^{3y}. \]

Multiply (29) through by the integrating factor $\sigma = e^{3y}$ and obtain

\[ e^{3y}\,dx + e^{3y}(3x - e^{-2y})\,dy = 0, \tag{32} \]

which is now exact. Thus,

\[ \frac{\partial F}{\partial x} = e^{3y} \qquad\text{and}\qquad \frac{\partial F}{\partial y} = e^{3y}(3x - e^{-2y}), \]

so

\[ F(x, y) = \int e^{3y}\,\partial x = xe^{3y} + A(y), \]

and

\[ \frac{\partial F}{\partial y} = e^{3y}(3x - e^{-2y}) = 3xe^{3y} + A'(y). \]

The latter gives

\[ A'(y) = -e^{y}, \tag{33} \]

so $A(y) = -e^{y} + B$. Thus,

\[ F(x, y) = xe^{3y} - e^{y} + B = \text{constant} \]

or

\[ xe^{3y} - e^{y} = C, \tag{34} \]

where $C$ is an arbitrary constant; (34) is the general solution of (29), in implicit form.

COMMENT. Can we solve (34) for $y$? If we let $e^{y} = z$, then (34) is the cubic equation $xz^3 - z = C$ in $z$, and there is a known solution to cubic equations. If we can solve for $z$, then we have $y$ as $y = \ln z$. However, the solution of that cubic equation (as can be obtained using the Maple solve command) is quite a messy expression. ■

EXAMPLE 3. First-Order Linear Equation. We've already solved the general first-order linear equation

\[ y' + p(x)y = q(x) \tag{35} \]

in Section 2.2, but let us see if we can solve it again, using the ideas of this section. First, express (35) in the form

\[ [p(x)y - q(x)]\,dx + dy = 0. \tag{36} \]

Thus, $M = p(x)y - q(x)$ and $N = 1$, so $M_y = p(x)$ and $N_x = 0$. Hence $M_y \neq N_x$, so (36) is not exact [except in the trivial case when $p(x) = 0$]. Since

\[ \frac{M_y - N_x}{N} = \frac{p(x) - 0}{1} = \text{function of } x \text{ alone}, \]

\[ \frac{M_y - N_x}{M} = \frac{p(x) - 0}{p(x)y - q(x)} \neq \text{function of } y \text{ alone}, \]

we can find an integrating factor that is a function of $x$ alone, but not one that is a function of $y$ alone. We leave it for the exercises to show that the integrating factor is

\[ \sigma(x) = e^{\int p(x)\,dx} \]

and that the final solution (this time obtainable in explicit form) is

\[ y(x) = e^{-\int p\,dx}\left(\int e^{\int p\,dx}\,q\,dx + C\right), \tag{37} \]

as found earlier, in Section 2.2. ■

Closure. Let us summarize the main results. Given a differential equation $dy/dx = f(x, y)$, the first step in using the method of exact differentials is to re-express it in the differential form $M(x, y)\,dx + N(x, y)\,dy = 0$. If $M$, $N$, $M_y$, and $N_x$ are all continuous in the region of interest, check to see if the exactness condition (10) is satisfied.
If it is, then the equation is exact, and its general solution is $F(x,y) = C$, where $F$ is found by integrating (11a) and (11b). As a check on your work, a differential of $F(x,y) = C$ should give you back the original equation $M\,dx + N\,dy = 0$.

If it is not exact, see if $(M_y - N_x)/N$ is a function of $x$ alone. If it is, then an integrating factor $\sigma(x)$ can be found from (26). Multiplying the given equation $M\,dx + N\,dy = 0$ through by that $\sigma(x)$, the new equation is exact, and you can proceed as outlined above for an exact equation.

If $(M_y - N_x)/N$ is not a function of $x$ alone, check to see if $(M_y - N_x)/M$ is a function of $y$ alone. If it is, then an integrating factor $\sigma(y)$ can be found from (28). Multiplying $M\,dx + N\,dy = 0$ through by that $\sigma(y)$, the new equation is exact, and you can proceed as outlined above for an exact equation.

If $(M_y - N_x)/N$ is not a function of $x$ alone and $(M_y - N_x)/M$ is not a function of $y$ alone, then the method is of no help unless an integrating factor $\sigma$ can be found that is a function of both $x$ and $y$.

EXERCISES 2.5

NOTE: Solutions should be expressed in explicit form if possible.

1. Show that the equation is exact, and obtain its general solution. Also, find the particular solution corresponding to the given initial condition as well.
(a) $3\,dx - dy = 0$; $y(0) = 6$
(b) $x^2\,dx + y^2\,dy = 0$; $y(9) = -1$
(c) $x\,dx + 2y\,dy = 0$; $y(1) = 2$
(d) $4\cos 2u\,du - e^{-5v}\,dv = 0$; $v(0) = 6$
(e) $e^{y}\,dx + (xe^{y} - 1)\,dy = 0$; $y(-5) = 6$
(f) $(e^{y} + z)\,dy - (\sin z - y)\,dz = 0$; $z(0) = 0$
(g) $(x - 2z)\,dx - (2x - z)\,dz = 0$; [initial condition illegible]
(h) $(\sin y + y\cos x)\,dx + (\sin x + x\cos y)\,dy = 0$; $y(2) = 3$
(i) $(\sin xy + xy\cos xy)\,dx + x^2\cos xy\,dy = 0$; [initial condition illegible]
(j) $(3x^2\sin 2y - 2xy)\,dx + (2x^3\cos 2y - x^2)\,dy = 0$; $y(0.5) = 3.1$
(k) $(4x^3y^5\sin 3x + 3x^4y^5\cos 3x)\,dx + 5x^4y^4\sin 3x\,dy = 0$; $y(0) = 1$
(l) [illegible]
(m) $(2ye^{2xy}\sin x + e^{2xy}\cos x + 1)\,dx + 2xe^{2xy}\sin x\,dy = 0$; $y(2.3) = -1.25$

2. (a)-(m) Find the general solution of the equation given in Exercise 1 using computer software, and also the particular solution corresponding to the given initial condition.

3. Make up three different examples of exact equations.

4. Determine whatever conditions, if any, are needed on the constants $a, b, \ldots, f, A, B, \ldots, F$ for the equation to be exact.
(a) $(ax + by + c)\,dx + (Ax + By + C)\,dy = 0$
(b) $(ax^2 + by^2 + cxy + dx + ey + f)\,dx + (Ax^2 + By^2 + Cxy + Dx + Ey + F)\,dy = 0$

5. Find a suitable integrating factor $\sigma(x)$ or $\sigma(y)$, and use it to find the general solution of the differential equation.
(a) $3y\,dx + dy = 0$
(b) $y\,dx + x\ln x\,dy = 0$
(c) $y\ln y\,dx + (x + y)\,dy = 0$
(d) $dx + (x - e^{-y})\,dy = 0$
(e) $dx + x\,dy = 0$
(f) $(ye^{-x} + 1)\,dx + (xe^{-x})\,dy = 0$
(g) $\cos y\,dx - [2(x - y)\sin y + \cos y]\,dy = 0$
(h) $(1 - x - z)\,dx + dz = 0$
(i) $(2 + \tan^2 x)(1 + e^{-y})\,dx - e^{-y}\tan x\,dy = 0$
(j) $(5u^2\sinh 3v - 2u)\,du + 3u^3\cosh 3v\,dv = 0$
(k) $\cos x\,dx + (3\sin x + 3\cos y - \sin y)\,dy = 0$
(l) $(y\ln y + 2xy^2)\,dx + (x + x^2y)\,dy = 0$
(m) $(3x - 2p)\,dx - x\,dp = 0$
(n) $y\,dx + (x^2 - x)\,dy = 0$
(o) $2xy\,dx + (y^2 - x^2)\,dy = 0$

6. (First-order linear equation) Verify that $\sigma(x) = e^{\int p(x)\,dx}$ is an integrating factor for the general linear first-order equation (35), and use it to derive the general solution (37).

7. Show that the given equation is not exact and that an integrating factor depending on $x$ alone or $y$ alone does not exist. If possible, find an integrating factor in the form $\sigma(x,y) = x^a y^b$, where $a$ and $b$ are suitably chosen constants. If such a $\sigma$ can be found, then use it to obtain the general solution of the differential equation; if not, state that.
(a) $(3xy - 2y^2)\,dx + (2x^2 - 3xy)\,dy = 0$
(b) $(3xy + 2y^2)\,dx + (3x^2 + 4xy)\,dy = 0$
(c) $(x + y^2)\,dx + (x - y)\,dy = 0$
(d) $y\,dx - (x^2y - x)\,dy = 0$

8. Show that the equation is not exact and that an integrating factor depending on $x$ alone or $y$ alone does not exist. Nevertheless, find a suitable integrating factor by inspection, and use it to obtain the general solution.
(a) $e^{y}\,dx + e^{x}\,dy = 0$
(b) $y^2\,dx - e^{3x}\,dy = 0$
(c) [illegible]

9. Obtain the general solution, using the methods of this section.
(a) [illegible]
(b) [illegible]
(c) $\dfrac{dy}{dx} = \dfrac{2xy - e^{y}}{x(e^{y} - x)}$
(d) $\dfrac{dy}{dx} = -\dfrac{\sin y + y\cos x}{\sin x + x\cos y}$
(e) $\dfrac{dr}{d\theta} = -\dfrac{r^2\cos\theta + 1}{2r\sin\theta}$
(f) $\dfrac{dy}{dx} = \dfrac{y(2x - \ln y)}{x}$

10. What do the integrating factors defined by (26) and (28) turn out to be if the equation is exact to begin with?

11. (a) Show that $(x^2 + y)\,dx + (y^2 + x)\,dy = 0$ is exact.
(b) More generally, is $M(x,y)\,dx + M(y,x)\,dy$ exact? Explain.

12. If $F(x,y) = C$ is the general solution (in implicit form) of a given first-order equation, then what is the particular solution (in implicit form) satisfying the initial condition $y(a) = b$?

13. If $M\,dx + N\,dy = 0$ and $P\,dx + Q\,dy = 0$ are exact, is $(M + P)\,dx + (N + Q)\,dy = 0$ exact? Explain.

14. Show that for $[p(x) + q(y)]\,dx + [r(x) + s(y)]\,dy = 0$ to be exact, it is necessary and sufficient that $q(y)\,dx + r(x)\,dy$ be an exact differential.

15. Show that for $p(x)\,dx + q(x)r(y)\,dy = 0$ to be exact, it is necessary and sufficient that $q(x)$ be a constant.

Chapter 2 Review

Following is a listing of the types of equations covered in this chapter.

SECTION 2.2

First-order linear: $y' + p(x)y = q(x)$.
This equation can be solved by the integrating factor method or by solving the homogeneous equation and using variation of parameters. Its general solution is

$$y(x) = e^{-\int p(x)\,dx}\left(\int e^{\int p(x)\,dx}\,q(x)\,dx + C\right).$$

A particular solution satisfying $y(a) = b$ is

$$y(x) = e^{-\int_a^x p(\xi)\,d\xi}\left(\int_a^x e^{\int_a^{\xi} p(\zeta)\,d\zeta}\,q(\xi)\,d\xi + b\right).$$

Bernoulli: $y' + p(x)y = q(x)y^n$ $(n \neq 0, 1)$.
This equation can be converted to the linear equation $v' + (1 - n)p(x)v = (1 - n)q(x)$ by the change of variables $v = y^{1-n}$ (Exercise 9).

Riccati: $y' = p(x)y^2 + q(x)y + r(x)$.
This equation can be solved by setting $y = Y(x) + \dfrac{1}{u}$, if a particular solution $Y(x)$ of the Riccati equation can be found (Exercise 11).

d'Alembert-Lagrange: $y = xf(y') + g(y')$ $[f(y') \neq y']$.
By letting $y' = p$ be a new independent variable, one can obtain a linear first-order equation on $x(p)$ (Exercise 13).

Clairaut: $y = xy' + g(y')$.
This equation admits the family of straight-line solutions $y = Cx + g(C)$ and, in general, a singular solution as well (Exercise 14).

SECTION 2.4

Separable: $y' = X(x)Y(y)$.
General solution obtained by integrating

$$\int \frac{dy}{Y(y)} = \int X(x)\,dx.$$

Homogeneous: $y' = f\!\left(\dfrac{y}{x}\right)$.
Can be made separable by setting $v = y/x$ (Exercise 9).

Almost Homogeneous: $y' = \dfrac{ax + by + c}{dx + ey + f}$ $(ae - bd \neq 0)$.
Can be made homogeneous by setting $x = u + h$, $y = v + k$ (Exercise 11).

SECTION 2.5

Exact: $M(x,y)\,dx + N(x,y)\,dy = 0$ $(M_y = N_x)$.
General solution $F(x,y) = C$ found by integrating $F_x = M$, $F_y = N$. If $M_y \neq N_x$, can make exact by means of an integrating factor $\sigma(x)$ if $(M_y - N_x)/N$ is a function of $x$ only, or by an integrating factor $\sigma(y)$ if $(M_y - N_x)/M$ is a function of $y$ only.

Chapter 3
Linear Differential Equations of Second Order and Higher

PREREQUISITES: In this chapter on linear differential equations, we encounter systems of linear algebraic equations, and it is presumed that the reader is familiar with the theory of the existence and uniqueness of solutions to such equations, especially as regards the role of the determinant of the coefficient matrix. That material is covered in Chapters 8-10, but the essential results that are needed for the present chapter are summarized briefly in Appendix B. Thus, either Sections 8.1-10.6 or Appendix B is a prerequisite for this chapter. Also presumed is a familiarity with the complex plane and the algebra of complex numbers. That material is covered in Section 21.2 which, likewise, is a prerequisite for Chapter 3.

3.1 Introduction

As we prepare to move from first-order equations to those of higher order, this is a good time to pause for an overview that looks back to Chapter 2 and ahead to Chapters 3-7. If, as you proceed through Chapters 3-7, you lose sight of the forest for the trees, we urge you to come back to this overview.

LINEAR EQUATIONS

First order:

$$y' + p(x)y = q(x). \tag{1}$$

General solution found [(2.1) in Section 2.2] in explicit form. Existence and uniqueness of solution of initial-value problem [with $y(a) = b$] guaranteed over a predetermined interval, based upon the continuity of $p(x)$ and $q(x)$.
Solution of initial-value problem expressible as a superposition of responses to the two inputs [the initial value $b$ and the forcing function $q(x)$], with each response being proportional to that input: for example, if we double the input we double the output.

Second order and higher:

$$a_0(x)\frac{d^n y}{dx^n} + a_1(x)\frac{d^{n-1} y}{dx^{n-1}} + \cdots + a_{n-1}(x)\frac{dy}{dx} + a_n(x)y = f(x). \tag{2}$$

Constant coefficients (the $a_j$'s are constants) and homogeneous ($f = 0$): This is the simplest case. We will see (Section 3.4) that the general solution can be found in terms of exponential functions, and perhaps powers of $x$ times exponential functions.

Constant coefficients and nonhomogeneous: Additional solution is needed due to the forcing function $f(x)$ and can be found by the method of undetermined coefficients (Section 3.7.2) or the method of variation of parameters (Sections 3.7.3 and 3.7.4). Still simple. An alternative approach, the Laplace transform, is given in Chapter 5.

Nonconstant coefficients: Essentially, the only simple case is the Cauchy-Euler equation (Section 3.6.1). Other cases are so much more difficult that we give up on finding closed form solutions and use power series methods (Chapter 4). Two particularly important cases are the Legendre (Section 4.4) and Bessel (Section 4.6) equations, which will be needed later in the chapters on partial differential equations.

NONLINEAR EQUATIONS

First order:

$$y' = f(x,y). \tag{3}$$

No solution available for the general case. Need to identify subcategories that are susceptible to special solution techniques. The most important of these subcategories are separable equations (Section 2.4) and exact equations (Section 2.5), and these methods give solutions in implicit form. Several important but more specialized cases are given in the exercises: the Bernoulli, Riccati, d'Alembert-Lagrange, and Clairaut equations in Section 2.2, and "homogeneous" equations in Section 2.4.
The idea of the response being a superposition of responses, as it is for the linear equation, is not applicable for nonlinear equations. The subcategories and special cases mentioned above by no means cover all possible equations of the form $y' = f(x,y)$, so that many first-order nonlinear equations simply are not solvable by any known means. A powerful alternative to analytical methods [i.e., methods designed to obtain an analytical expression for $y(x)$] is to seek a solution in numerical form, with the help of a computational algorithm and a computer, and these methods are discussed in Chapter 6.

Second order and higher: Some nonlinear equations of first order can be solved analytically, as we have seen, but for nonlinear equations of higher order analytical solution is generally out of the question, and we rely instead upon a blend of numerical solution (Chapter 6) and qualitative methods, such as the phase plane method described in Chapter 7.

To get started, we limit our attention in the next several sections to the homogeneous version of the linear equation (2), namely, where $f(x) = 0$, because that case is simpler and because to solve the nonhomogeneous case we will need to solve the homogeneous version first, anyhow.

To attach physical significance to the distinction between homogeneous and nonhomogeneous equations, it may help to recall from Section 1.3 that the differential equation governing a mechanical oscillator is

$$m\frac{d^2x}{dt^2} + c\frac{dx}{dt} + kx = F(t), \tag{4}$$

where $m$, $c$, $k$ are the mass, damping coefficient, and spring stiffness, respectively, and $F(t)$ is the applied force. (In this case, of course, the variables happen to be $x$ and $t$ rather than $y$ and $x$.) If $F(t) = 0$, then (4) governs the unforced, or "free," vibration of the mass $m$.
Likewise, for any linear differential equation, if all terms containing the unknown and its derivatives are moved to the left-hand side, then whatever is left on the right-hand side is regarded as a "forcing function." From a physical point of view then, when we consider the homogeneous case in the next several sections, we are really limiting our attention to unforced systems.

A brief outline of this chapter follows:

3.2 Linear Dependence and Linear Independence. The concept of a general solution to a linear differential equation requires the idea of linear dependence and linear independence, so these ideas are introduced first.

3.3 Homogeneous Equation: General Solution. Here we establish the concept of a general solution to the homogeneous equation (2), but do not yet show how to obtain it.

3.4 Solution of Homogeneous Equation: Constant Coefficients. It is shown how to find the general solution in the form of a linear combination of solutions that are either exponentials or powers of $x$ times exponentials.

3.5 Application to Harmonic Oscillator: Free Oscillation. The foregoing concepts and methods are applied to an extremely important physical application: the free oscillation of a harmonic oscillator.

3.6 Solution of Homogeneous Equation: Nonconstant Coefficients. Nonconstant-coefficient equations can be solved in closed form only in exceptional cases. The most important such case is the Cauchy-Euler equation, and that case occupies most of this section.

3.7 Solution of Nonhomogeneous Equation. It is shown how to find the additional solution, due to the forcing function, by the methods of undetermined coefficients and variation of parameters.

3.8 Application to Harmonic Oscillator: Forced Oscillation. We return to the example of the harmonic oscillator, begun in Section 3.5, and obtain and discuss the solution for the forced oscillation.

3.9 Systems of Linear Differential Equations.
We consider linear systems of $n$ coupled first-order differential equations on $n$ unknowns and show how to obtain uncoupled $n$th-order differential equations on each of the $n$ unknowns, which equations can then be solved by the methods described in the preceding sections of this chapter.

3.2 Linear Dependence and Linear Independence

Asked how many different paints he had, a painter replied five: red, blue, green, yellow, and purple. However, it could be argued that the count was inflated since only three (for instance red, blue, and yellow) are independent: the green can be obtained from a certain proportion of the blue and the yellow, and the purple can be obtained from the red and the blue. Similarly, in studying linear differential equations, we will need to determine how many "different," or "independent," functions are contained within a given set of functions. The concept is made precise as follows.

We begin by defining a linear combination of a set of functions $f_1, \ldots, f_n$ as any function of the form $a_1f_1 + \cdots + a_nf_n$, where the $a_j$'s are constants. For instance, $2\sin x - 7\cos x$ is a linear combination of $\sin x$ and $\cos x$.

DEFINITION 3.2.1 Linear Dependence and Linear Independence
A set of functions $\{u_1, \ldots, u_n\}$ is said to be linearly dependent on an interval $I$ if at least one of them can be expressed as a linear combination of the others on $I$. If none can be so expressed, then the set is linearly independent.

If we do not specify the interval $I$, then it will be understood to be the entire $x$ axis. NOTE: Since the terms linearly dependent and linearly independent will appear repeatedly, it will be convenient to abbreviate them in this book as LD and LI, respectively, but be aware that this notation is not standard outside of this text.

EXAMPLE 1. The set $\{x^2, e^{x}, e^{-x}, \sinh x\}$ is seen to be LD (linearly dependent) because we can express $\sinh x$ as a linear combination of the others:

$$\sinh x = \frac{e^{x} - e^{-x}}{2} = \frac{1}{2}e^{x} - \frac{1}{2}e^{-x} + 0x^2. \tag{1}$$

In fact, we could express $e^{x}$ as a linear combination of the others too, for solving (1) for $e^{x}$ gives $e^{x} = 2\sinh x + e^{-x} + 0x^2$. Likewise, we could express $e^{-x} = e^{x} - 2\sinh x + 0x^2$. We cannot express $x^2$ as a linear combination of the others [since we cannot solve (1) for $x^2$], but the set is LD nonetheless, because we only need to be able to express "at least one" member as a linear combination of the others. NOTE: The hyperbolic sine and cosine functions, $\sinh x$ and $\cosh x$, were studied in the calculus, but if these functions and their graphs and properties are not familiar to you, you may wish to turn to the review in Section 3.4.1. ∎

The foregoing example was simple enough to be worked by inspection. In more complicated cases, the following theorem provides a test for determining whether a given set is LD or LI.

THEOREM 3.2.1 Test for Linear Dependence/Independence
A finite set of functions $\{u_1, \ldots, u_n\}$ is LD on an interval $I$ if and only if there exist scalars $a_j$, not all zero, such that

$$a_1u_1(x) + a_2u_2(x) + \cdots + a_nu_n(x) = 0 \tag{2}$$

identically on $I$. If (2) is true only if all the $a$'s are zero, then the set is LI on $I$.

Proof: Because of the "if and only if" we need to prove the statement in both directions. First, suppose that the set is LD. Then, according to the definition of linear dependence, one of the functions, say $u_j$, can be expressed as a linear combination of the others:

$$u_j(x) = a_1u_1(x) + \cdots + a_{j-1}u_{j-1}(x) + a_{j+1}u_{j+1}(x) + \cdots + a_nu_n(x), \tag{3}$$

which equation can be rewritten as

$$a_1u_1(x) + \cdots + a_{j-1}u_{j-1}(x) + (-1)u_j(x) + a_{j+1}u_{j+1}(x) + \cdots + a_nu_n(x) = 0. \tag{4}$$

Even if all the other $a$'s are zero, the coefficient $a_j$ of $u_j(x)$ in (4) is nonzero, namely, $-1$, so there do exist scalars $a_1, \ldots, a_n$, not all zero, such that (2) holds.

Conversely, suppose that (2) holds with the $a$'s not all zero.
If $a_k$, for instance, is nonzero, then (2) can be divided by $a_k$ and solved for $u_k(x)$ as a linear combination of the other $u$'s, in which case $\{u_1, \ldots, u_n\}$ is LD. ∎

EXAMPLE 2. To determine if the set $\{1, x, x^2\}$ is LD or LI using Theorem 3.2.1, write equation (2),

$$a_1 + a_2x + a_3x^2 = 0, \tag{5}$$

and see if the truth of (5) requires all the $a$'s to be zero. Since (5) needs to hold for all $x$'s in the interval (which we take to be $-\infty < x < \infty$), let us write it for $x = 0, 1, 2$, say, to generate three equations on the three $a$'s:

$$a_1 = 0, \qquad a_1 + a_2 + a_3 = 0, \qquad a_1 + 2a_2 + 4a_3 = 0. \tag{6}$$

Solution of (6) gives $a_1 = a_2 = a_3 = 0$, so the set is LI.

In fact, (5) really amounts to an infinite number of linear algebraic equations on the three $a$'s since there is no limit to the number of $x$ values that could be chosen. However, three different $x$ values sufficed to establish that all of the $a$'s must be zero. ∎

Alternative to writing out (2) for $n$ specific $x$ values, to generate $n$ equations on $a_1, \ldots, a_n$, it is more common to generate $n$ such equations by writing (2) and its first $n - 1$ derivatives (assuming, of course, that $u_1, \ldots, u_n$ are $n - 1$ times differentiable on $I$),

$$\begin{aligned}
a_1u_1(x) + \cdots + a_nu_n(x) &= 0,\\
a_1u_1'(x) + \cdots + a_nu_n'(x) &= 0,\\
&\;\;\vdots\\
a_1u_1^{(n-1)}(x) + \cdots + a_nu_n^{(n-1)}(x) &= 0.
\end{aligned} \tag{7}$$

Let us denote the determinant of the coefficients as

$$W[u_1, \ldots, u_n](x) = \begin{vmatrix} u_1(x) & \cdots & u_n(x)\\ u_1'(x) & \cdots & u_n'(x)\\ \vdots & & \vdots\\ u_1^{(n-1)}(x) & \cdots & u_n^{(n-1)}(x) \end{vmatrix}, \tag{8}$$

which is known as the Wronskian determinant of $u_1, \ldots, u_n$, or simply the Wronskian of $u_1, \ldots, u_n$, after the Polish mathematician Josef M. H. Wronski (1778-1853). The Wronskian $W$ is itself a function of $x$. From the theory of linear algebraic equations, we know that if there is any value of $x$ in $I$, say $x_0$, such that $W[u_1, \ldots, u_n](x_0) \neq 0$, then it follows from (7), with $x$ set equal to $x_0$, that all the $a$'s must be zero, so the set $\{u_1, \ldots, u_n\}$ is LI.
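As a numerical companion to Example 2 and to the determinant (8), the following Python sketch (standard library only) evaluates the coefficient determinant of the three equations (6) and the Wronskian of $\{1, x, x^2\}$ at a sample point. The helper names `det`, `sample_matrix`, and `wronskian_1_x_x2` are illustrative assumptions, not notation from the text, and the cofactor expansion is only meant for small determinants.

```python
from fractions import Fraction

def det(m):
    """Determinant by cofactor expansion along the first row (fine for small n)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # Minor: delete row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def sample_matrix(xs):
    """Coefficient matrix of (6): one row [1, x, x^2] per sample point x."""
    return [[Fraction(1), Fraction(x), Fraction(x) ** 2] for x in xs]

def wronskian_1_x_x2(x):
    """Wronskian (8) of {1, x, x^2}: the functions, then their first and second derivatives."""
    x = Fraction(x)
    return det([
        [1, x, x ** 2],  # u1, u2, u3
        [0, 1, 2 * x],   # first derivatives
        [0, 0, 2],       # second derivatives
    ])

coeff_det = det(sample_matrix([0, 1, 2]))
print(coeff_det)            # nonzero, so (6) forces a1 = a2 = a3 = 0
print(wronskian_1_x_x2(5))  # nonzero at x = 5, so the set is LI
```

Both determinants are nonzero, which is the same conclusion reached in Example 2: the only solution of the homogeneous system is the trivial one.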
THEOREM 3.2.2 Wronskian Condition for Linear Independence
If, for a set of functions $\{u_1, \ldots, u_n\}$ having derivatives through order $n - 1$ on an interval $I$, $W[u_1, \ldots, u_n](x)$ is not identically zero on $I$, then the set is LI on $I$.

Be careful not to read into Theorem 3.2.2 a converse, namely, that if $W[u_1, \ldots, u_n](x)$ is identically zero on $I$ (which we write as $W \equiv 0$), then the set is LD on $I$. In fact, the latter is not true, as shown by the following example.

EXAMPLE 3. Consider the set $\{u_1, u_2\}$, where

$$u_1(x) = \begin{cases} x^2, & x \le 0\\ 0, & x \ge 0, \end{cases} \qquad u_2(x) = \begin{cases} 0, & x \le 0\\ x^2, & x \ge 0. \end{cases} \tag{9}$$

(Sketch their graphs.) Then (2) becomes

$$a_1x^2 + a_2(0) = 0 \quad\text{for } x \le 0, \qquad a_1(0) + a_2x^2 = 0 \quad\text{for } x \ge 0.$$

The first implies that $a_1 = 0$, and the second implies that $a_2 = 0$. Hence $\{u_1, u_2\}$ is LI. Yet,

$$W[u_1, u_2](x) = \begin{vmatrix} x^2 & 0\\ 2x & 0 \end{vmatrix} = 0 \quad\text{on } x \le 0, \qquad W[u_1, u_2](x) = \begin{vmatrix} 0 & x^2\\ 0 & 2x \end{vmatrix} = 0 \quad\text{on } x \ge 0,$$

so $W[u_1, u_2](x) = 0$ for all $x$. ∎

However, our interest in linear dependence and independence, in this chapter, is not going to be in connection with sets of randomly chosen functions, but with sets of functions which have in common that they are solutions of a given linear homogeneous differential equation. In that case, it can be shown that the inverse of Theorem 3.2.2 is true: that is, if $W \equiv 0$, then the set is LD. Thus, for that case we have the following stronger theorem which, for our subsequent purposes, will be more important to us than Theorem 3.2.2.

THEOREM 3.2.3 A Necessary and Sufficient Condition for Linear Dependence
If $u_1, \ldots, u_n$ are solutions of an $n$th-order linear homogeneous differential equation

$$\frac{d^n y}{dx^n} + p_1(x)\frac{d^{n-1} y}{dx^{n-1}} + \cdots + p_{n-1}(x)\frac{dy}{dx} + p_n(x)y = 0, \tag{10}$$

where the coefficients $p_j(x)$ are continuous on an interval $I$, then $W[u_1, \ldots, u_n](x) \equiv 0$ on $I$ is both necessary and sufficient for the linear dependence of the set $\{u_1, \ldots, u_n\}$ on $I$.

EXAMPLE 4. It is readily verified that each of the functions $1, e^{x}, e^{-x}$ satisfies the equation $y''' - y' = 0$.
Since their Wronskian is

$$W[1, e^{x}, e^{-x}](x) = \begin{vmatrix} 1 & e^{x} & e^{-x}\\ 0 & e^{x} & -e^{-x}\\ 0 & e^{x} & e^{-x} \end{vmatrix} = 2 \neq 0,$$

it follows from Theorem 3.2.3 that the set $\{1, e^{x}, e^{-x}\}$ is LI. Another set of solutions of $y''' - y' = 0$ is $e^{x}, e^{-x}, \cosh x$. Their Wronskian is

$$W[e^{x}, e^{-x}, \cosh x](x) = \begin{vmatrix} e^{x} & e^{-x} & \cosh x\\ e^{x} & -e^{-x} & \sinh x\\ e^{x} & e^{-x} & \cosh x \end{vmatrix} = 0,$$

so the set $\{e^{x}, e^{-x}, \cosh x\}$ is LD. ∎

In connection with Theorem 3.2.3, it would be natural to wonder if $W$ could be zero for some $x$'s and nonzero for others. Subject to the conditions of that theorem, it can be shown (Exercise 5) that

$$W(x) = W(\xi)\exp\left[-\int_{\xi}^{x} p_1(t)\,dt\right], \tag{11}$$

where $\xi$ is any point in the interval and $p_1$ is the coefficient of the next-to-highest derivative in (10), and where we have written $W[u_1, \ldots, u_n](x)$ as $W(x)$, and $W[u_1, \ldots, u_n](\xi)$ as $W(\xi)$, for brevity. Due to the French mathematician Joseph Liouville (1809-1882), and known as Liouville's formula, (11) shows that under the conditions of Theorem 3.2.3 the Wronskian is either everywhere zero or everywhere nonzero, for the exponential function is positive for all finite values of its argument and the constant $W(\xi)$ is either zero or not. This fact is illustrated by the two Wronskians in Example 4.

Finally, it is useful to cite the following three simple results, proofs of which are left for the exercises.

THEOREM 3.2.4 Linear Dependence/Independence of Two Functions
A set of two functions, $\{u_1, u_2\}$, is LD if and only if one is expressible as a scalar multiple of the other.

THEOREM 3.2.5 Linear Dependence of Sets Containing the Zero Function
If a set $\{u_1, \ldots, u_n\}$ contains the zero function [that is, $u_j(x) = 0$ for some $j$], then the set is LD.

THEOREM 3.2.6 Equating Coefficients
Let $\{u_1, \ldots, u_n\}$ be LI on an interval $I$. Then, for

$$a_1u_1(x) + \cdots + a_nu_n(x) = b_1u_1(x) + \cdots + b_nu_n(x)$$

to hold on $I$, it is necessary and sufficient that $a_j = b_j$ for each $j = 1, \ldots, n$. That is, the coefficients of corresponding terms on the left- and right-hand sides must match.

EXAMPLE 5.
The set $\{x, \sin x\}$ is LI on $-\infty < x < \infty$ according to Theorem 3.2.4 because $x$ is surely not expressible as a constant times $\sin x$ (for $x/\sin x$ is not a constant), nor is $\sin x$ expressible as a constant times $x$. ∎

EXAMPLE 6. We've seen that $\{1, e^{x}, e^{-x}\}$ is LI on $-\infty < x < \infty$. Thus, if we meet the equation

$$a + be^{x} + ce^{-x} = 6 - 2e^{-x}, \tag{12}$$

then it follows from Theorem 3.2.6 that we must have $a = 6$, $b = 0$, $c = -2$, for if we rewrite (12) as

$$(a - 6)(1) + be^{x} + (c + 2)e^{-x} = 0,$$

then it follows from the linear independence of $1, e^{x}, e^{-x}$ that $a - 6 = 0$, $b = 0$, $c + 2 = 0$; that is, $a = 6$, $b = 0$, $c = -2$. ∎

Closure. We have introduced the concept of linear dependence and linear independence as preliminary to our development of the theory of linear differential equations, which follows next. Following the definitions of these terms, we gave three theorems for the testing of a given set of functions to determine if they are LI or LD. Of these, Theorem 3.2.3 will be most useful to us in the sections to follow because it applies to sets of functions that arise as solutions of a given differential equation.

In case you have trouble remembering which of the conditions $W = 0$ and $W \neq 0$ corresponds to linear dependence and which to linear independence, think of it this way. If we randomly make up a determinant, the chances are that its value is nonzero; that is the generic case. Likewise, if we randomly select a set of functions out of the set of all possible functions, the generic case is for them to be unrelated, namely, LI. The generic cases go together ($W \neq 0$ corresponding to linear independence) and the nongeneric cases go together ($W = 0$ corresponding to linear dependence).

The concept of linear dependence and independence will prove to be important to us later as well, when we study $n$-dimensional vector spaces and the expansion of a given vector in terms of a set of base vectors.

EXERCISES 3.2

1. (a) Can a set be neither LD nor LI? Explain.
(b) Can a set be both LD and LI? Explain.

2. Show that the following sets are LD by expressing one of the functions as a linear combination of the others.
(a) $\{1, x + 2, 3x - 5\}$
(b) $\{x^2, x^2 + x, x^2 + x + 1, x - 1\}$
(c) [illegible]
(d) $\{e^{x}, e^{2x}, \sinh x, \cosh x\}$
(e) $\{\sinh 3x, e^{x}, e^{3x}, e^{5x}, e^{-3x}\}$
(f) $\{e^{x}, e^{2x}, xe^{x}, (7x - 2)e^{x}\}$

3. Show whether the given set is LD or LI. HINT: In most cases, the brief discussion of determinants given in Appendix B will suffice; in others, you will need to use known properties of determinants given in Section 10.4. Also, note that the Maple command for calculating determinants (the elements of which need not be constants) is given at the end of Section 10.4.
(a) $\{1, x, x^2, \ldots, x^n\}$
(b) $\{e^{a_1x}, e^{a_2x}, \ldots, e^{a_nx}\}$
(c) $\{1, 1 + x, 1 + x^2\}$
(d) [illegible]
(e) $\{\sin x, \cos x, \sinh x\}$
(f) [illegible]
(g) $\{1, \sin 3x\}$
(h) $\{x, 2x, x^2\}$
(i) $\{x, 1 + x, e^{x}\}$

4. Verify that each of the given functions is a solution of the given differential equation, and then use Theorem 3.2.3 to determine if the set is LD or LI. As a check, use Theorem 3.2.4 if that theorem applies.
(a) $y''' - 6y'' + 11y' - 6y = 0$, $\{e^{x}, e^{2x}, e^{3x}\}$
(b) $y'' + 4y = 0$, $\{\sin 2x, \cos 2x\}$
(c) $y''' - 6y'' + 9y' - 4y = 0$, $\{e^{x}, xe^{x}, e^{4x}\}$
(d) $y''' - 6y'' + 9y' - 4y = 0$, $\{e^{x}, xe^{x}, (1 - x)e^{x}\}$
(e) $y''' - y'' - 2y' = 0$, $\{1, e^{-x}, e^{2x}\}$
(f) $y'''' - 5y'' + 4y = 0$, $\{e^{x}, e^{-x}, e^{2x}, e^{-2x}\}$
(g) $x^2y'' - 3xy' + 3y = 0$, $\{x, x^3\}$, on $x > 0$
(h) $x^2y'' - 3xy' + 4y = 0$, $\{x^2, x^2\ln x\}$, on $x > 0$
(i) $y'' - 4y' + 4y = 0$, $\{e^{2x}, xe^{2x}\}$
(j) $x^3y''' + xy' - y = 0$, $\{x, x\ln x, x(\ln x)^2\}$, on $x > 0$

5. (Liouville's formula) (a) Derive Liouville's formula, (11), for the special case where $n = 2$, by writing out $W'(x)$, showing that

$$W'(x) = -p_1(x)W(x), \tag{5.1}$$

and integrating the latter to obtain (11).
(b) Derive (11) for the general case (i.e., where $n$ need not equal 2), by showing that $W'(x)$ is the sum of $n$ determinants where the $j$th one is obtained from the $W$ determinant by differentiating the $j$th row and leaving the other rows unchanged. Show that each of these $n$ determinants, except the $n$th one, has two identical rows and hence vanishes, so that

$$W'(x) = \begin{vmatrix} u_1(x) & \cdots & u_n(x)\\ \vdots & & \vdots\\ u_1^{(n-2)}(x) & \cdots & u_n^{(n-2)}(x)\\ u_1^{(n)}(x) & \cdots & u_n^{(n)}(x) \end{vmatrix}. \tag{5.2}$$

In the last row, substitute $u_j^{(n)}(x) = -p_1(x)u_j^{(n-1)}(x) - \cdots - p_n(x)u_j(x)$ from (10), again omit vanishing determinants, and again obtain (5.1) and hence the solution (11). HINT: You may use the various properties of determinants, given in Section 10.4.

6. (a) Prove Theorem 3.2.4.
(b) Prove Theorem 3.2.5.
(c) Prove Theorem 3.2.6.

7. If $u_1$ and $u_2$ are LI, $u_1$ and $u_3$ are LI, and $u_2$ and $u_3$ are LI, does it follow that $\{u_1, u_2, u_3\}$ is LI? Prove or disprove. HINT: If a proposition is false it can be disproved by a single counterexample, but if it is true then a single example does not suffice as proof.

8. Verify that $x^2$ and $x^3$ are solutions of $x^2y'' - 4xy' + 6y = 0$ on $-\infty < x < \infty$. Also verify, from Theorem 3.2.4, that they are LI on that interval. Does the fact that their Wronskian $W[x^2, x^3](x) = x^4$ vanishes at $x = 0$, together with their linear independence on $-\infty < x < \infty$, violate Theorem 3.2.3? Explain.

3.3 Homogeneous Equation: General Solution

3.3.1. General solution. We studied the first-order linear homogeneous equation

$$y' + p(x)y = 0 \tag{1}$$

in Chapter 2, where $p(x)$ is continuous on the $x$ interval of interest, $I$, and found the solution to be

$$y(x) = Ce^{-\int p(x)\,dx}, \tag{2}$$

where $C$ is an arbitrary constant. If we append to (1) an initial condition $y(a) = b$, where $a$ is a point in $I$, then we obtain, from (2),

$$y(x) = be^{-\int_a^x p(\xi)\,d\xi}, \tag{3}$$

as was shown in Section 2.2.

The solution (2) is really a family of solutions because of the arbitrary constant $C$. We showed that (2) contains all solutions of (1), so we called it a general solution of (1). In contrast, (3) was only one member of that family, so we called it a particular solution.

Now we turn to the $n$th-order linear equation

$$\frac{d^n y}{dx^n} + p_1(x)\frac{d^{n-1} y}{dx^{n-1}} + \cdots + p_{n-1}(x)\frac{dy}{dx} + p_n(x)y = 0, \tag{4}$$

and once again we are interested in general and particular solutions.
By a general solution of (4), on an interval $I$, we mean a family of solutions that contains every solution of (4) on that interval, and by a particular solution of (4), we mean any one member of that family of solutions.

We begin with a fundamental existence and uniqueness theorem.*

THEOREM 3.3.1 Existence and Uniqueness for Initial-Value Problem
If $p_1(x), \ldots, p_n(x)$ are continuous on a closed interval $I$, then the initial-value problem consisting of the differential equation

$$\frac{d^n y}{dx^n} + p_1(x)\frac{d^{n-1} y}{dx^{n-1}} + \cdots + p_{n-1}(x)\frac{dy}{dx} + p_n(x)y = 0, \tag{5a}$$

together with initial conditions

$$y(a) = b_1, \quad y'(a) = b_2, \quad \ldots, \quad y^{(n-1)}(a) = b_n, \tag{5b}$$

where the initial point $a$ is in $I$, has a solution on $I$, and that solution is unique.

*For a more complete sequence of theorems, and their proofs, we refer the interested reader to the little book by J. C. Burkill, The Theory of Ordinary Differential Equations (Edinburgh: Oliver and Boyd, 1956) or to William E. Boyce and Richard C. DiPrima, Elementary Differential Equations and Boundary Value Problems, 5th ed. (New York: Wiley, 1992).

Notice how the initial conditions listed in (5b) are perfect (not too few and not too many) in narrowing the general solution of (5a) down to a unique particular solution, for (5a) gives $y^{(n)}(x)$ as a linear combination of $y^{(n-1)}(x), \ldots, y(x)$, the derivative of (5a) gives $y^{(n+1)}(x)$ as a linear combination of $y^{(n)}(x), \ldots, y(x)$, and so on. Thus, knowing $y(a), \ldots, y^{(n-1)}(a)$ we can use the differential equation (5a) and its derivatives to compute $y^{(n)}(a)$, $y^{(n+1)}(a)$, and so on, and therefore to develop the Taylor series of $y(x)$ about the point $a$; that is, to determine $y(x)$.

Let us leave the initial-value problem (5) now, and turn our attention to determining the nature of the general solution of the $n$th-order linear homogeneous equation (4).
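As a brief aside on the Taylor-series remark above, the idea can be sketched in a few lines of Python. The equation used here, $y'' + y = 0$ with $y(0) = b_1$, $y'(0) = b_2$, is an illustrative assumption chosen for simplicity and is not an example from the text; the function name `taylor_from_ivp` is likewise hypothetical. The differential equation and its derivatives supply every higher derivative at the initial point from the two initial values, and the closed form $b_1\cos x + b_2\sin x$ is used only as a check.

```python
import math

def taylor_from_ivp(b1, b2, x, terms=30):
    """Sum the Taylor series about a = 0 of the solution of y'' + y = 0.

    The equation gives y'' = -y; differentiating it k times gives
    y^(k+2) = -y^(k), so every derivative at 0 follows from b1 and b2.
    """
    derivs = [b1, b2]                  # y(0), y'(0)
    for k in range(2, terms):
        derivs.append(-derivs[k - 2])  # y^(k)(0) = -y^(k-2)(0)
    return sum(d * x ** k / math.factorial(k) for k, d in enumerate(derivs))

b1, b2, x = 2.0, -1.0, 0.7
approx = taylor_from_ivp(b1, b2, x)
exact = b1 * math.cos(x) + b2 * math.sin(x)
print(abs(approx - exact) < 1e-12)
```

With only $b_1$ supplied the recursion could not start, and a third initial value would have nothing left to determine, which is the sense in which the $n$ conditions in (5b) are neither too few nor too many.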
We begin by re-expressing (4) in the compact form

$$L[y] = 0, \tag{6}$$

where

$$L = \frac{d^n}{dx^n} + p_1(x)\frac{d^{n-1}}{dx^{n-1}} + \cdots + p_{n-1}(x)\frac{d}{dx} + p_n(x) \tag{7}$$

is called an $n$th-order differential operator and

$$L[y] = \left(\frac{d^n}{dx^n} + p_1(x)\frac{d^{n-1}}{dx^{n-1}} + \cdots + p_n(x)\right)[y] = \frac{d^n}{dx^n}y(x) + p_1(x)\frac{d^{n-1}}{dx^{n-1}}y(x) + \cdots + p_n(x)y(x) \tag{8}$$

defines the action of $L$ on any $n$-times differentiable function $y$. $L[y]$ is itself a function of $x$, with values $L[y](x)$. For instance, if $n = 2$, $p_1(x) = \sin x$, $p_2(x) = 5x$, and $y(x) = x^2$, then $L[y](x) = (x^2)'' + (\sin x)(x^2)' + 5x(x^2) = 2 + 2x\sin x + 5x^3$.

The key property of the operator $L$ defined by (8) is that

$$L[\alpha u + \beta v] = \alpha L[u] + \beta L[v] \tag{9}$$

for any ($n$-times differentiable) functions $u$, $v$ and any constants $\alpha$, $\beta$. Let us verify (9) for the representative case where $L$ is of second order:

$$\begin{aligned}
L[\alpha u + \beta v] &= \left(\frac{d^2}{dx^2} + p_1\frac{d}{dx} + p_2\right)(\alpha u + \beta v) \\
&= (\alpha u + \beta v)'' + p_1(\alpha u + \beta v)' + p_2(\alpha u + \beta v) \\
&= \alpha u'' + \beta v'' + p_1\alpha u' + p_1\beta v' + p_2\alpha u + p_2\beta v \\
&= \alpha(u'' + p_1 u' + p_2 u) + \beta(v'' + p_1 v' + p_2 v) \\
&= \alpha L[u] + \beta L[v].
\end{aligned} \tag{10}$$

Similarly for $n \ge 3$.

Recall that the differential equation (4) was classified as linear. Likewise, the corresponding operator $L$ given by (8) is said to be a linear differential operator. The key and defining feature of a linear differential operator is the linearity property (9), which will be of great importance to us.

In fact, (9) holds not just for two functions $u$ and $v$, but for any finite number of functions, say $u_1, \ldots, u_k$. That is,

$$L[\alpha_1 u_1 + \cdots + \alpha_k u_k] = \alpha_1 L[u_1] + \cdots + \alpha_k L[u_k] \tag{11}$$

for any functions $u_1, \ldots, u_k$ and any constants $\alpha_1, \ldots, \alpha_k$. (Of course it should be understood, whether we say so explicitly or not, that $u_1, \ldots, u_k$ must be $n$ times differentiable since they are being operated on by the $n$th-order differential operator $L$.) To prove (11) we apply (9) step by step. For instance, if $k = 3$ we have

$$\begin{aligned}
L[\alpha_1 u_1 + \alpha_2 u_2 + \alpha_3 u_3] &= L[\alpha_1 u_1 + 1\,(\alpha_2 u_2 + \alpha_3 u_3)] \\
&= \alpha_1 L[u_1] + 1\,L[\alpha_2 u_2 + \alpha_3 u_3] && \text{from (9)} \\
&= \alpha_1 L[u_1] + \alpha_2 L[u_2] + \alpha_3 L[u_3] && \text{from (9) again.}
\end{aligned}$$

From (11) we have the following superposition theorem:

THEOREM 3.3.2 Superposition of Solutions of (4)
If $y_1, \ldots$
, $y_k$ are solutions of (4), then $C_1 y_1 + \cdots + C_k y_k$ is too, for any constants $C_1, \ldots, C_k$.

Proof: It follows from (11) that

$$L[C_1 y_1 + \cdots + C_k y_k] = C_1 L[y_1] + \cdots + C_k L[y_k] = C_1(0) + \cdots + C_k(0) = 0. \ \blacksquare$$

EXAMPLE 1. Superposition. It is readily verified, by direct substitution, that $y_1 = e^{3x}$ and $y_2 = e^{-3x}$ are solutions of the equation

$$y'' - 9y = 0. \tag{12}$$

(We are not yet concerned with how to find such solutions; we will come to that in the next section.) Thus $y = C_1 e^{3x} + C_2 e^{-3x}$ is also a solution, as can be verified by substituting it into (12), for any constants $C_1$, $C_2$. ■

To emphasize that the theorem does not hold for nonlinear or nonhomogeneous equations, we offer two counter-examples:

EXAMPLE 2. It can be verified that $y_1 = 1$ and $y_2 = x^2$ are solutions of the nonlinear equation $x^3 y'' - yy' = 0$, yet their linear combination $4 + 3x^2$ is not. ■

EXAMPLE 3. It can be verified that $y_1 = 4e^{3x} - 2$ and $y_2 = e^{3x} - 2$ are solutions of the nonhomogeneous equation $y'' - 9y = 18$, yet their sum $5e^{3x} - 4$ is not. ■

We can now prove the following fundamental result.

THEOREM 3.3.3 General Solution of (4)
Let $p_1(x), \ldots, p_n(x)$ be continuous on an open interval $I$. Then the $n$th-order linear homogeneous differential equation (4) admits exactly $n$ LI solutions; that is, at least $n$ and no more than $n$. If $y_1(x), \ldots, y_n(x)$ is such a set of LI solutions on $I$, then the general solution of (4), on $I$, is

$$y(x) = C_1 y_1(x) + \cdots + C_n y_n(x), \tag{13}$$

where $C_1, \ldots, C_n$ are arbitrary constants.

Proof: To show that (4) has no more than $n$ LI solutions, suppose that $y_1(x), \ldots, y_m(x)$ are solutions of (4), where $m > n$. Let $\xi$ be some point in $I$. The $n$ linear algebraic equations

$$\begin{aligned}
c_1 y_1(\xi) + \cdots + c_m y_m(\xi) &= 0 \\
&\;\;\vdots \\
c_1 y_1^{(n-1)}(\xi) + \cdots + c_m y_m^{(n-1)}(\xi) &= 0
\end{aligned} \tag{14}$$

in the $m$ unknown $c$'s have nontrivial solutions because $m > n$. Choosing such a nontrivial set of $c$'s, define

$$v(x) = c_1 y_1(x) + \cdots + c_m y_m(x), \tag{15}$$

and observe first that

$$L[v] = L[c_1 y_1 + \cdots + c_m y_m] = c_1 L[y_1] + \cdots + c_m L[y_m] = c_1(0) + \cdots + c_m(0) = 0,$$

where $L$ is the differential operator in (4) and, second, that

$$v(\xi) = v'(\xi) = \cdots = v^{(n-1)}(\xi) = 0. \tag{16}$$

One function $v(x)$ that satisfies $L[v] = 0$ and $v(\xi) = \cdots = v^{(n-1)}(\xi) = 0$ is simply $v(x) = 0$. By the uniqueness part of Theorem 3.3.1 it is the only such function, so $v(x) = 0$. Recalling that the $c$'s in $v(x) = c_1 y_1(x) + \cdots + c_m y_m(x) = 0$ are not all zero, it follows that $y_1(x), \ldots, y_m(x)$ must be LD. Thus, (4) cannot have more than $n$ LI solutions.

To show that there are indeed $n$ LI solutions of (4), let us put forward $n$ such solutions. According to the existence part of Theorem 3.3.1, there must be solutions $y_1(x), \ldots, y_n(x)$ of (4) satisfying the initial conditions

$$\begin{aligned}
y_1(a) = \alpha_{11}, \quad y_1'(a) = \alpha_{12}, \quad &\ldots, \quad y_1^{(n-1)}(a) = \alpha_{1n}, \\
&\;\;\vdots \\
y_n(a) = \alpha_{n1}, \quad y_n'(a) = \alpha_{n2}, \quad &\ldots, \quad y_n^{(n-1)}(a) = \alpha_{nn},
\end{aligned} \tag{17}$$

where $a$ is any chosen point in $I$ and the $\alpha$'s are any chosen numbers such that their determinant is nonzero. (For instance, one such set of $\alpha$'s is given by $\alpha_{ii} = 1$ for each $i = 1$ through $n$ and $\alpha_{ij} = 0$ whenever $i \ne j$.) According to Theorem 3.2.3, $y_1(x), \ldots, y_n(x)$ must be LI since their Wronskian is nonzero at $x = a$. Thus, there are indeed $n$ LI solutions of (4).

Finally, every solution of (4) must be expressible as a linear combination of those $n$ LI solutions, as in (13), for otherwise there would be more than $n$ LI solutions of (4). ■

Any such set of $n$ LI solutions is called a basis, or fundamental set, of solutions of the differential equation.

EXAMPLE 4. Suppose we begin writing solutions of the equation $y'' - 9y = 0$, from Example 1: $e^{3x}$, $5e^{3x}$, $e^{-3x}$, $2e^{3x} + e^{-3x}$, $\sinh 3x$, $\cosh 3x$, $e^{3x} - 4\cosh 3x$, and so on. (That each is a solution is easily verified.) From among these we can indeed find two that are LI, but no more than two. For instance, $\{e^{3x}, e^{-3x}\}$, $\{e^{3x}, 2e^{3x} + e^{-3x}\}$, $\{e^{3x}, \sinh 3x\}$, $\{\sinh 3x, \cosh 3x\}$, $\{\sinh 3x, e^{-3x}\}$ are bases, so the general solution can be expressed in any of these ways:

$$y(x) = C_1 e^{3x} + C_2 e^{-3x}, \tag{18a}$$
$$y(x) = C_1 e^{3x} + C_2\left(2e^{3x} + e^{-3x}\right), \tag{18b}$$
$$y(x) = C_1 e^{3x} + C_2 \sinh 3x, \tag{18c}$$
$$y(x) = C_1 \sinh 3x + C_2 \cosh 3x, \tag{18d}$$

and so on. Each of these is a general solution of $y'' - 9y = 0$, and all of them are equivalent.
For instance, (18a) implies (18d) because

$$\begin{aligned}
y(x) = C_1 e^{3x} + C_2 e^{-3x} &= C_1(\cosh 3x + \sinh 3x) + C_2(\cosh 3x - \sinh 3x) \\
&= (C_1 + C_2)\cosh 3x + (C_1 - C_2)\sinh 3x \\
&= C_1' \cosh 3x + C_2' \sinh 3x,
\end{aligned}$$

where $C_1' = C_1 + C_2$ and $C_2' = C_1 - C_2$ are arbitrary constants too. ■

EXAMPLE 5. Solve the initial-value problem

$$y''' + y' = 0 \tag{19a}$$
$$y(0) = 3, \quad y'(0) = 5, \quad y''(0) = -4. \tag{19b}$$

A general solution of (19a) is

$$y(x) = C_1 \cos x + C_2 \sin x + C_3, \tag{20}$$

because $\cos x$, $\sin x$, and 1 are LI solutions of (19a). Imposing (19b) gives

$$y(0) = 3 = C_1 + C_3, \qquad y'(0) = 5 = C_2, \qquad y''(0) = -4 = -C_1,$$

which equations admit the unique solution $C_1 = 4$, $C_2 = 5$, $C_3 = -1$. Thus, $y(x) = 4\cos x + 5\sin x - 1$ is the unique solution to the initial-value problem (19). ■

3.3.2. Boundary-value problems. It must be remembered that the existence and uniqueness theorem, Theorem 3.3.1, is for initial-value problems. Though most of our interest is in initial-value problems, one also encounters problems of boundary-value type, where conditions are specified at two points, normally the ends of the interval $I$. Not only are boundary-value problems not covered by Theorem 3.3.1, it is striking that boundary-value problems need not have unique solutions. In fact, they may have no solution, a unique solution, or a nonunique solution, as shown by the following example.

EXAMPLE 6. Boundary-Value Problem. It is readily verified that the differential equation

$$y'' + y = 0 \tag{21}$$

admits a general solution

$$y(x) = C_1 \cos x + C_2 \sin x. \tag{22}$$

Consider three different sets of boundary values.

Case 1: $y(0) = 2$, $y(\pi) = 1$. Then

$$y(0) = 2 = C_1 + 0, \qquad y(\pi) = 1 = -C_1 + 0,$$

which has no solution for $C_1$, $C_2$, so the boundary-value problem has no solution for $y(x)$.

Case 2: $y(0) = 2$, $y(\pi/2) = 3$. Then

$$y(0) = 2 = C_1 + 0, \qquad y(\pi/2) = 3 = 0 + C_2,$$

so $C_1 = 2$, $C_2 = 3$, and the boundary-value problem has the unique solution $y(x) = 2\cos x + 3\sin x$.

Case 3: $y(0) = 2$, $y(\pi) = -2$. Then

$$y(0) = 2 = C_1 + 0, \qquad y(\pi) = -2 = -C_1 + 0,$$

so $C_1 = 2$, and $C_2$ is arbitrary, and the boundary-value problem has the nonunique solution (indeed, the infinity of solutions) $y(x) = 2\cos x + C_2 \sin x$, where $C_2$ is an arbitrary constant.
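The three cases of Example 6 amount to asking whether two linear algebraic equations in $C_1$, $C_2$ have no solution, one solution, or infinitely many. The sketch below is one way to organize that check numerically; the function name `classify_bvp` and the tolerance `tol` are assumptions made here for illustration (floating-point values such as $\sin \pi$ are only approximately zero, hence the tolerance).

```python
import math


def classify_bvp(x0, y0, x1, y1, tol=1e-12):
    """Classify y'' + y = 0, y(x0) = y0, y(x1) = y1 by inserting the general
    solution y = C1 cos x + C2 sin x into the two boundary conditions."""
    a11, a12 = math.cos(x0), math.sin(x0)
    a21, a22 = math.cos(x1), math.sin(x1)
    det = a11 * a22 - a12 * a21  # equals sin(x1 - x0)
    if abs(det) > tol:
        # Nonsingular 2x2 system: Cramer's rule gives the unique C1, C2.
        c1 = (y0 * a22 - a12 * y1) / det
        c2 = (a11 * y1 - a21 * y0) / det
        return "unique", (c1, c2)
    # Singular system: the second row is s times the first, with s = +1 or -1.
    s = a11 * a21 + a12 * a22  # equals cos(x1 - x0)
    if abs(y1 - s * y0) <= tol:
        return "nonunique", None  # consistent: infinitely many solutions
    return "none", None           # inconsistent: no solution


print(classify_bvp(0.0, 2.0, math.pi, 1.0))      # Case 1
print(classify_bvp(0.0, 2.0, math.pi / 2, 3.0))  # Case 2
print(classify_bvp(0.0, 2.0, math.pi, -2.0))     # Case 3
```

The determinant of the system is $\sin(x_1 - x_0)$, so for this particular equation the troublesome cases arise exactly when the two boundary points are separated by a multiple of $\pi$.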
# -;Yn(x), thanksto the linearity of L, namely, the property of Z that Llau + Bv] = aLlul + BL[v| of n LI solutions {y,.. for brevity) for that equation. Theorems 3.3.1 and 3.3.3 are especially important. EXERCISES 3.3 NOTE: If not specified, the interval J is understood to be the entire @axis. 1. Show whether or not each of the following is a general solution to the equation given. (a)y” —3y' + 2y= 0; (b) y” — 3y' + 2y =0; (ce) yy! —2y=0; dy" ~y!—2y=0; (e) y'"” + dy’ = 0; (f) yl" + 4y! = 0; Cie” —e”) + Cre* Ci(e~* + e?*) Cre~*+ Coe** Cy cos 2a + Cy sin 22 Cy + Cocos 2x + Cs sin 22 (g)yy"—2y"ty! =0; (Cy+ Coa+ C32?)e* (hyy!" =~2Qy"+ yl = 0; +ys0; Gy" yy (C1 + Caw) eF + Cy +Cgae® Cre*+ Cye7® 2. Show whether or not each of the following is a basis for the given equation. (a)y"”—9y = 0; (b)yy”-9y=0; (c)y"—-y=0; (d) yl" _ 3y" { e°*cosh 3z, sinh 3x } {e%*, cosh3a} {sinh 3a, 2cosh 3x} a 3y/ -y= 0; {e*,ce*, (e)y"—3y"=0; {1,a,e%*} ae" } 90 sinc} (ayy + 2y' + 38y=0; | 3. Are the following general solutions of x7y" + xy’ —4y = 0 (b)y” + 2y' + 3y=0; y (c)y" +2y' —y = 0; on0 < a@< 00? On —oo < & < co? Explain. {cosa,sinx,acosz,a (fyyy” + 2y" +y=0; (d)ay!" + ay! —y = 0; (a)Cia? (b) C12? (ce) 22y”—y' —y = 0; (f)(sina)y“"+ vy" =0; y/"(2)= -9 ++ ya ma (c) Cy(a? +27 2) + Cy(2? − ∙ ∫ ∏ ↔ ∏ ∶ of (21). ∶ 10. Verify that (22) is indeed a general∶ solution ∏ ↕ ∏∙ {e*,e~*} (c) {x, vIn|z|} (d) {w+ eln|2|,v - eln|2|} 5. Show whether or not the following is a general solution of _ 4ylvt) yo) 4 56yo”) _ 14y") _ 196y” 4 A9y!" _ 36y’ +4 Ce" a Cye7* at Cae?" + Cye7?* +Cyc?+Ce +Cye~® (b)Cye™ C7 sinh x + Cg cosh 2x + Cre3™ ae Cee 3 +Cye™+ +Cye®* 6. Show that y, = 1 and yo = 2 are solutions of (y° ~ 6y? +lly ~6) y” =OIsy=y+y =1l+2=3 a solution as well? Does your result contradict the sentence preceding Example 2? Explain. 7. 7 (a)y(0)= 0, y(2)=0 (b)y(0)= 0, y(2r) = -3 (c)y(1)= 1, y(2)= (d)(0) = 0, y(5)= 1 l44dy = 0, (a) ∶ ∩ 11. 
Consider the boundary-value problem consisting of the differential equation x" + y = 0 plus the boundary conditions given. Does the problem have any solutions? [f so, find them. Is the solution unique? HINT: A general solution of the differ- (a) {x, a} (b) y'( y(2)=y'(2 Show that each of the functions y, = (e)y/(0)= 0, y/(7)=0 (f)y/(0)= 0, y’(67) = (g)y/(0)= 0, y’(27)=38 12. Consider the boundary-value problem consisting of the differential equation y””” + 2y” + y = 0 plus the boundary conditions given. Does the problem have any solutions? If so, find them. Is the solution unique? HINT: A general solution of thedifferentialequationis y = (Cy + Cox) cosa + 3a? — x and (C3 + Cyr) sina. ya = x” —x is a solution of the equation a7y” — 2y = 2z. Is = y'(0)= 0, y(m)= 0, y/(m)= 2 thelinear combinationCy, + Coy2 a solution as well, for all (a)y(0) choices of the constants C’; and C2? 8. (Taylor series method) Use the Taylor series method described below Theorem 3.3.1 to solve each of the following initial-value problems for y(a), up to and including terms of fifthorder.NOTE: The termf'")(a)(a@— a)"/n! in theTaylor seriesof f(x) aboutz = a is said to be ofnth-order. ay" +y=0; y(0)=4, y/(0) (b)y(0) = y/(0) = y"(0)=0, ute) =1 (c)y(0) = y"(0)= 0, y(r) =0= y(n) =0 (d)y(0)= y"(0) = 0, y(m) = " mm) =3 13. Prove that the linearity property (10) is equivalent to the two properties Lilut+v] = Liu) +Lf], (13.1a) Lau] = alu). (13.1b) =3 (b)y"”~dy=0; y(0)= —1,y’(0)= (c)y" + 5y'+6y=0; y(0)= 2, y/(0) = (d)y+ sy =0; y(0)=1, y'(0)= 0 (e)y" te “y=0; y(0)=2, y/(0)= (f) yy”— 3y = 0; (5) = 4, y'(5) = 5 That is, show that the truth of (10) implies the truth of (13.1), and conversely. HINT: Expand 14. We showedthat (11) holds for thecase & = 3, but did not prove it in general. Here, we ask you to prove (11) for any in(g)y"+3y'-y=0; y(1)= 2, y’(1) =0 HINT: Expand teger & > 1. HINT: Itis suggested that you use mathematical about v= 1. 
induction, whereby a proposition $P(k)$, for $k \ge 1$, is proved by first showing that it holds for $k = 1$, and then showing that if it holds for $k$ then it must also hold for $k + 1$. In the present example, the proposition $P(k)$ is the equation (11).

(h) y''' - y' + 2y = 0; y(0) = 0, y'(0) = 0, y''(0) = 1
(i) y" - by = 0; y(0) = 0, y'(0) = 3, y''(0) = -2

9. Does the problem stated have a unique solution? No solution? A nonunique solution? Explain.

15. (Example 4, Continued) (a) Verify that each of (18a) through (18d) is a general solution of $y'' - 9y = 0$. (b) It seems reasonable that if $C_1$, $C_2$ are arbitrary constants, and if we call

$$C_1 + C_2 = C_1' \quad \text{and} \quad C_1 - C_2 = C_2', \tag{15.1}$$

then $C_1'$, $C_2'$ are arbitrary too, as we claimed at the end of Example 4. Actually, for that claim to be true we need to be able to show that corresponding to any chosen values $C_1'$, $C_2'$ the equations (15.1) on $C_1$, $C_2$ are consistent; that is, that they admit one or more solutions for $C_1$, $C_2$. Show that (15.1) is indeed consistent. (c) Show that if, instead, we had $C_1 + C_2 = C_1'$ and $2C_1 + 2C_2 = C_2'$, where $C_1$, $C_2$ are arbitrary constants, then it is not true that $C_1'$, $C_2'$ are arbitrary too.

3.4 Solution of Homogeneous Equation: Constant Coefficients

Knowing that the general solution of an $n$th-order homogeneous differential equation is an arbitrary linear combination of $n$ LI (linearly independent) solutions, the question is: How do we find those solutions? That question will occupy us for the remainder of this chapter and for the next three chapters as well. In this section we consider the constant-coefficient case,

$$\frac{d^n y}{dx^n} + a_1\frac{d^{n-1}y}{dx^{n-1}} + \cdots + a_{n-1}\frac{dy}{dx} + a_n y = 0; \tag{1}$$

that is, where the $a_j$ coefficients are constants, not functions of $x$.
This case is said to be "elementary" in the sense that the solutions will always be found among the elementary functions (powers of $x$, trigonometric functions, exponentials, and logarithms), but it is also elementary in the sense that it is the simplest case: nonconstant-coefficient equations are generally much harder, and nonlinear equations are much harder still.

Fortunately, the constant-coefficient case is not only the simplest, it is also of great importance in science and engineering. For instance, the equations $mx'' + cx' + kx = 0$ and $Li'' + Ri' + \frac{1}{C}i = 0$, governing mechanical and electrical oscillators, where primes denote derivatives with respect to the independent variable $t$, are both of the type (1) because $m$, $c$, $k$ and $L$, $R$, $C$ are constants; they do not vary with $t$.

Chapter 3. Linear Differential Equations of Second Order and Higher

3.4.1. Euler's formula and review of the circular and hyperbolic functions. We are going to be involved with the exponential function $e^z$, where $z = x + iy$ is complex and $i = \sqrt{-1}$. The first point to appreciate is that we cannot figure out how to evaluate $e^{x+iy}$ from our knowledge of the function $e^x$ where $x$ is real. That is, $e^{x+iy}$ is a "new object," and its values are a matter of definition, not a matter of figuring out. To motivate that definition, let us proceed as follows:

$$\begin{aligned}
e^z = e^{x+iy} = e^x e^{iy} &= e^x\left[1 + iy + \frac{(iy)^2}{2!} + \frac{(iy)^3}{3!} + \frac{(iy)^4}{4!} + \cdots\right] \\
&= e^x\left[\left(1 - \frac{y^2}{2!} + \frac{y^4}{4!} - \cdots\right) + i\left(y - \frac{y^3}{3!} + \frac{y^5}{5!} - \cdots\right)\right].
\end{aligned} \tag{2}$$

Recognizing the two series as the Taylor series representations of $\cos y$ and $\sin y$, respectively, we obtain

$$e^{x+iy} = e^x(\cos y + i\sin y), \tag{3}$$

which is known as Euler's formula, after the great Swiss mathematician Leonhard Euler (1707-1783), whose many contributions to mathematics included the systematic development of the theory of linear constant-coefficient differential equations. We say that (3) defines $e^{x+iy}$ since it gives $e^{x+iy}$ in the standard Cartesian form $a + ib$, where the real part $a$ is $e^x \cos y$ and the imaginary part $b$ is $e^x \sin y$.
Observe carefully that we cannot defend certain steps in (2). Specifically, the second equality seems to be the familiar formula $e^{a+b} = e^a e^b$, but the latter is for real numbers $a$ and $b$, whereas $iy$ is not real. Likewise, the third equality rests upon the Taylor series formula $e^u = 1 + u + \frac{u^2}{2!} + \cdots$ that is derived in the calculus for the case where $u$ is real, but $iy$ is not real. The point to understand, then, is that the steps in (2) are merely heuristic; trying to stay as close to real-variable theory as possible, we arrive at (3). Once (3) is obtained, we throw out (2) and take (3) as our (i.e., Euler's) definition of $e^{x+iy}$.

Of course, there are an infinite number of ways one can define a given quantity, but some are more fruitful than others. To fully appreciate why Euler's definition is the perfect one for $e^{x+iy}$, one needs to study complex-variable theory, as we will in later chapters. For the present, we merely propose that the steps in (2) make (3) a reasonable choice as a definition of $e^z$.

As a special case of (3), let $x = 0$. Then (3) becomes

$$e^{iy} = \cos y + i\sin y. \tag{4a}$$

For instance, $e^{\pi i} = \cos\pi + i\sin\pi = -1 + 0i = -1$, and $e^{2-3i} = e^2(\cos 3 - i\sin 3) = 7.39(-0.990 - 0.141i) = -7.32 - 1.04i$. Since (4a) holds for all $y$, it must hold also with $y$ changed to $-y$: $e^{-iy} = \cos(-y) + i\sin(-y)$, and since $\cos(-y) = \cos y$ and $\sin(-y) = -\sin y$, it follows that

$$e^{-iy} = \cos y - i\sin y. \tag{4b}$$

Conversely, we can express $\cos y$ and $\sin y$ as linear combinations of the complex exponentials $e^{iy}$ and $e^{-iy}$, for adding (4a) and (4b) and subtracting them gives $\cos y = (e^{iy} + e^{-iy})/2$ and $\sin y = (e^{iy} - e^{-iy})/(2i)$. Let us frame these formulas, for emphasis and reference:

$$e^{iy} = \cos y + i\sin y, \qquad e^{-iy} = \cos y - i\sin y \tag{5a,b}$$

and

$$\cos y = \frac{e^{iy} + e^{-iy}}{2}, \qquad \sin y = \frac{e^{iy} - e^{-iy}}{2i}. \tag{6a,b}$$

Observe that all four of these formulas come from the single formula (4a). (Of course there is nothing essential about the name of the variable in these formulas. For instance, $e^{ix} = \cos x + i\sin x$, $e^{i\theta} = \cos\theta + i\sin\theta$, and so on.)
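Readers who like to check formulas numerically may find the following sketch useful. It evaluates $e^{x+iy}$ directly from the definition (3), compares the result with the complex exponential supplied by Python's standard `cmath` module, and recovers $\cos y$ and $\sin y$ from (6a,b). The function names are illustrative choices made here, not notation from the text.

```python
import cmath
import math


def exp_euler(z):
    """e^(x + iy) computed from Euler's definition (3): e^x (cos y + i sin y)."""
    z = complex(z)
    return math.exp(z.real) * complex(math.cos(z.imag), math.sin(z.imag))


def cos_from_exponentials(y):
    """cos y from (6a): (e^{iy} + e^{-iy}) / 2."""
    return ((exp_euler(1j * y) + exp_euler(-1j * y)) / 2).real


def sin_from_exponentials(y):
    """sin y from (6b): (e^{iy} - e^{-iy}) / (2i)."""
    return ((exp_euler(1j * y) - exp_euler(-1j * y)) / 2j).real


w = exp_euler(2 - 3j)
print(exp_euler(math.pi * 1j))   # close to -1, up to rounding error
print(w, cmath.exp(2 - 3j))      # the two values should agree closely
print(cos_from_exponentials(0.7), math.cos(0.7))
print(sin_from_exponentials(0.7), math.sin(0.7))
```

Agreement with `cmath.exp` is a consistency check, not a proof; it merely illustrates that the library's complex exponential behaves as the definition (3) prescribes.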
There is a similarity between (5) and (6), which relate the cosine and sine to the complex exponentials, and the analogous formulas relating the hyperbolic cosine and hyperbolic sine to real exponentials. If we recall the definitions

$$\cosh y = \frac{e^y + e^{-y}}{2}, \qquad \sinh y = \frac{e^y - e^{-y}}{2} \tag{7a,b}$$

of the hyperbolic cosine and hyperbolic sine, we find, by addition and subtraction of these formulas, that

$$e^y = \cosh y + \sinh y, \qquad e^{-y} = \cosh y - \sinh y. \tag{8a,b}$$

Compare (5) with (8), and (6) with (7).

The graphs of $\cosh x$, $\sinh x$, $e^x$, and $e^{-x}$ are given in Fig. 1. Using (6) and (7) we obtain the properties

$$\cos^2 y + \sin^2 y = 1, \tag{9}$$
$$\cosh^2 y - \sinh^2 y = 1. \tag{10}$$

[Figure 1. $\cosh x$, $\sinh x$, $e^x$, and $e^{-x}$.]

From a geometric point of view, if we parametrize a curve $C$ by the relations

$$x = \cos\tau, \qquad y = \sin\tau \tag{11}$$

over $0 \le \tau < 2\pi$, say, then it follows from (9) that $x^2 + y^2 = 1$, so that $C$ is a circle. And if we parametrize $C$ by

$$x = \cosh\tau, \qquad y = \sinh\tau \tag{12}$$

instead, then it follows from (10) that $x^2 - y^2 = 1$, so $C$ is a hyperbola. Thus, one refers to $\cos x$ and $\sin x$ as circular functions and to $\cosh x$ and $\sinh x$ as hyperbolic functions, the hyperbolic cosine and the hyperbolic sine, respectively.

Besides (9) and (10), various useful identities, such as

$$\sin(A + B) = \sin A\cos B + \sin B\cos A, \tag{13a}$$
$$\cos(A + B) = \cos A\cos B - \sin A\sin B, \tag{13b}$$
$$\sinh(A + B) = \sinh A\cosh B + \sinh B\cosh A, \tag{13c}$$
$$\cosh(A + B) = \cosh A\cosh B + \sinh A\sinh B, \tag{13d}$$

can be derived from (6) and (7), as well as the derivative formulas

$$\frac{d}{dx}\cos x = -\sin x, \quad \frac{d}{dx}\sin x = \cos x, \quad \frac{d}{dx}\cosh x = \sinh x, \quad \frac{d}{dx}\sinh x = \cosh x. \tag{14}$$

We shall be interested specifically in the function $e^{\lambda x}$ and its derivatives with respect to $x$, where $\lambda$ is a constant that may be complex, say $\lambda = a + ib$. We know from the calculus that

$$\frac{d}{dx}e^{\lambda x} = \lambda e^{\lambda x} \tag{15}$$

when $\lambda$ is a real constant. Does (15) hold when $\lambda$ is complex? To answer the question, use Euler's formula (3) to express $e^{\lambda x} = e^{(a+ib)x} = e^{ax}(\cos bx + i\sin bx)$.
Thus,

$$\begin{aligned}
\frac{d}{dx}e^{\lambda x} &= \frac{d}{dx}\left(e^{ax}\cos bx\right) + i\frac{d}{dx}\left(e^{ax}\sin bx\right) \\
&= \left(ae^{ax}\cos bx - be^{ax}\sin bx\right) + i\left(ae^{ax}\sin bx + be^{ax}\cos bx\right) \\
&= e^{ax}(a + ib)(\cos bx + i\sin bx) \\
&= \lambda e^{ax}(\cos bx + i\sin bx) = \lambda e^{\lambda x},
\end{aligned}$$

so the familiar formula (15) does hold even for complex $\lambda$.

There is one more fact about the exponential function that we will be needing, namely, that the exponential function $e^z$ cannot be zero for any choice of $z$; that is, it has no zeros, for

$$|e^z| = |e^{x+iy}| = |e^x(\cos y + i\sin y)| = |e^x|\,|\cos y + i\sin y| = e^x\,|\cos y + i\sin y| = e^x.$$

The fourth equality follows from the fact that the real exponential is everywhere positive, and the fifth equality from the fact that $|a + ib|$ is the square root of the sum of the squares of $a$ and $b$, and $\cos^2 y + \sin^2 y = 1$. Finally, we know that $e^x > 0$ for all $x$, so $|e^z| > 0$ for all $z$, and hence $e^z \ne 0$ for all $z$, as claimed.

3.4.2. Exponential solutions. To guide our search for solutions of (1), it is a good idea to begin with the simplest case, $n = 1$:

$$\frac{dy}{dx} + a_1 y = 0, \tag{16}$$

the general solution of which is

$$y(x) = Ce^{-a_1 x}, \tag{17}$$

where $C$ is an arbitrary constant. One can derive (17) by noticing that (16) is a first-order linear equation and using the general solution developed in Section 2.2, or by using the fact that (16) is separable.

Observing that (17) is of exponential form, it is natural to wonder if higher-order equations admit exponential solutions too. Consider the second-order equation

$$y'' + a_1 y' + a_2 y = 0, \tag{18}$$

where $a_1$ and $a_2$ are real numbers, and let us seek a solution in the form

$$y(x) = e^{\lambda x}. \tag{19}$$

If (19) is to be a solution of (18), then it must be true that

$$\lambda^2 e^{\lambda x} + a_1\lambda e^{\lambda x} + a_2 e^{\lambda x} = 0, \tag{20}$$

or

$$\left(\lambda^2 + a_1\lambda + a_2\right)e^{\lambda x} = 0, \tag{21}$$

where (20) holds, according to (15), even if the not-yet-determined constant $\lambda$ turns out to be complex. For (19) to be a solution of (18) on some interval $I$, we need (21) to be satisfied on $I$. That is, we need the left side of (21) to vanish identically on $I$.
Since $e^{\lambda x}$ is not identically zero on any interval $I$ for any choice of $\lambda$, we need $\lambda$ to be such that

$$\lambda^2 + a_1\lambda + a_2 = 0. \tag{22}$$

This equation and its left-hand side are called the characteristic equation and characteristic polynomial, respectively, corresponding to the differential equation (18). In general, (22) gives two distinct roots, say $\lambda_1$ and $\lambda_2$, which can be found from the quadratic formula as

$$\lambda = \frac{-a_1 \pm \sqrt{a_1^2 - 4a_2}}{2}.$$

(The nongeneric case of repeated roots, which occurs if $a_1^2 - 4a_2$ vanishes, is discussed separately, below.) Thus, our choice of the exponential form (19) has been successful. Indeed, we have found two solutions of that form, $e^{\lambda_1 x}$ and $e^{\lambda_2 x}$. Next, from Theorem 3.3.2 it follows [thanks to the linearity of (18)] that if $e^{\lambda_1 x}$ and $e^{\lambda_2 x}$ are solutions of (18) then so is any linear combination of them,

$$y(x) = C_1 e^{\lambda_1 x} + C_2 e^{\lambda_2 x}. \tag{23}$$

Theorem 3.3.3 guarantees that (23) is a general solution of (18) if $e^{\lambda_1 x}$ and $e^{\lambda_2 x}$ are LI on $I$, and Theorem 3.2.4 tells us that they are indeed LI since neither one is expressible as a scalar multiple of the other. Thus, by seeking solutions in the form (19) we were successful in finding the general solution (23) of (18).

EXAMPLE 1. For the equation

$$y'' - y' - 6y = 0, \tag{24}$$

the characteristic equation is $\lambda^2 - \lambda - 6 = 0$, with roots $\lambda = -2, 3$, so

$$y(x) = C_1 e^{-2x} + C_2 e^{3x} \tag{25}$$

is a general solution of (24). ■

EXAMPLE 2. For the equation

$$y'' - 9y = 0, \tag{26}$$

the characteristic equation is $\lambda^2 - 9 = 0$, with roots $\lambda = \pm 3$, so a general solution of (26) is

$$y(x) = C_1 e^{3x} + C_2 e^{-3x}. \tag{27}$$

COMMENT 1. As discussed in Example 4 of Section 3.3, an infinite number of forms of the general solution to (26) are equivalent to (27), such as

$$y(x) = C_1\cosh 3x + C_2\sinh 3x, \tag{28}$$
$$y(x) = C_1\sinh 3x + C_2\left(5e^{-3x} - 2\cosh 3x\right), \tag{29}$$
$$y(x) = C_1\left(e^{3x} + 4\sinh 3x\right) + C_2\left(\cosh 3x - 7\sinh 3x\right), \tag{30}$$

and so on. Of these, one would normally choose either (27) or (28). What is wrong with (29) and (30)?
Nothing, except that they are ugly; $e^{3x}$ and $e^{-3x}$ make a "handsome couple," and $\cosh 3x$ and $\sinh 3x$ do too, but the choices in (29) and (30) seem ugly and purposeless.

COMMENT 2. If (27) and (28) are equivalent, does it matter whether we choose one or the other? No, since they are equivalent. However, one may be more convenient than the other insofar as the application of initial or boundary conditions.

For instance, suppose we append to (26) the initial conditions $y(0) = 4$, $y'(0) = -5$. Applying these to (27) gives

$$y(0) = 4 = C_1 + C_2, \qquad y'(0) = -5 = 3C_1 - 3C_2, \tag{31}$$

so $C_1 = 7/6$, $C_2 = 17/6$, and $y(x) = (7e^{3x} + 17e^{-3x})/6$. Applying these initial conditions to (28), instead, gives

$$y(0) = 4 = C_1, \qquad y'(0) = -5 = 3C_2, \tag{32}$$

so $C_1 = 4$, $C_2 = -5/3$, and $y(x) = 4\cosh 3x - (5/3)\sinh 3x$. Whereas our final results are equivalent, we see that (32) was more readily solved than (31). Thus, $\cosh 3x$ and $\sinh 3x$ make a slightly better choice in this case than $e^{3x}$ and $e^{-3x}$; namely, when initial conditions are given at $x = 0$.

Or, suppose we consider $I$ to be $0 < x < \infty$ and impose the boundary conditions that $y(0) = 6$, and that $y(x)$ is to be bounded as $x \to \infty$. That is, rather than impose a numerical value on $y$ at infinity, we impose a boundedness condition, that $|y(x)| < M$ for all $x$, for some constant $M$. Applying these conditions to (27) we see, from the boundedness condition, that we need $C_1 = 0$ since otherwise the $e^{3x}$ will give unbounded growth. Next, $y(0) = 6 = C_2$, and hence the solution is $y(x) = 6e^{-3x}$. Notice how easily the boundedness condition was applied to (27).

If we use (28) instead, the solution is harder since both $\cosh 3x$ and $\sinh 3x$ grow unboundedly as $x \to \infty$. We can't afford to set both $C_1 = 0$ and $C_2 = 0$ in (28) since then we would have $y(x) = 0$, which does indeed satisfy both the equation (26) and the boundedness condition, but cannot satisfy the remaining condition $y(0) = 6$. However, perhaps the growth in $\cosh 3x$ and $\sinh 3x$ can be made to cancel.
That is, write

$$\begin{aligned}
y(x) &= C_1\cosh 3x + C_2\sinh 3x \\
&= C_1\left(\frac{e^{3x} + e^{-3x}}{2}\right) + C_2\left(\frac{e^{3x} - e^{-3x}}{2}\right) \\
&= \left(\frac{C_1 + C_2}{2}\right)e^{3x} + \left(\frac{C_1 - C_2}{2}\right)e^{-3x},
\end{aligned} \tag{33}$$

so for boundedness we need $C_1 + C_2 = 0$ (and hence $C_2 = -C_1$). Then (33) gives $y(x) = C_1 e^{-3x}$, and $y(0) = 6$ gives $C_1 = 6$ and $y(x) = 6e^{-3x}$, as before. Thus, in the case of a boundedness boundary condition at infinity we see that the exponential form (27) is more convenient than the hyperbolic form (28).

To summarize, when confronted with a choice, such as between (27) and (28), look ahead to the application of any initial or boundary conditions to see if one form will be more convenient than the other. ■

EXAMPLE 3. For

$$y'' + 9y = 0, \tag{34}$$

the characteristic equation is $\lambda^2 + 9 = 0$, with roots $\lambda = \pm 3i$, so a general solution of (34) is

$$y(x) = C_1 e^{i3x} + C_2 e^{-i3x}. \tag{35}$$

COMMENT 1. Just as the general solution of $y'' - 9y = 0$ was expressible in terms of the real exponentials $e^{3x}$, $e^{-3x}$ or the hyperbolic functions $\cosh 3x$, $\sinh 3x$, the general solution of (34) is expressible in terms of the complex exponentials $e^{i3x}$, $e^{-i3x}$ or in terms of the circular functions $\cos 3x$, $\sin 3x$, for we can use Euler's formula to re-express (35) as

$$\begin{aligned}
y(x) &= C_1(\cos 3x + i\sin 3x) + C_2(\cos 3x - i\sin 3x) \\
&= (C_1 + C_2)\cos 3x + i(C_1 - C_2)\sin 3x.
\end{aligned} \tag{36}$$

Since $C_1$ and $C_2$ are arbitrary constants, we can simplify this result by letting $C_1 + C_2$ be a new constant $A$, and letting $i(C_1 - C_2)$ be a new constant $B$, so we have, from (36), the form

$$y(x) = A\cos 3x + B\sin 3x, \tag{37}$$

where $A$, $B$ are arbitrary constants. As in Example 2, we note that (35) and (37) are but two out of an infinite number of equivalent forms.

COMMENT 2. You may be concerned that if $y(x)$ is a physical quantity such as the displacement of a mass or the current in an electrical circuit, then it should be real, whereas the right side of (35) seems to be complex.
To explore this point, let us solve a complete problem, the differential equation (34) plus a representative set of initial conditions, say $y(0) = 7$, $y'(0) = 3$, and see if the final answer is real or not. Imposing the initial conditions on (35),

$$y(0) = 7 = C_1 + C_2, \qquad y'(0) = 3 = i3C_1 - i3C_2,$$

so $C_1 = (7 - i)/2$ and $C_2 = (7 + i)/2$. Putting these values into (35), we see from (36) that

$$y(x) = \tfrac{1}{2}\left[(7 - i) + (7 + i)\right]\cos 3x + \tfrac{i}{2}\left[(7 - i) - (7 + i)\right]\sin 3x = 7\cos 3x + \sin 3x,$$

which is indeed real. Put differently, if the differential equation and initial conditions represent some physical system, then the mathematics "knows all about" the physics; it is built in, and we need not be anxious. ■

Having already made the point that the general solution can always be expressed in various different (but equivalent) forms, we will generally adopt the exponential form when the exponentials are real, and the circular function form when they are complex. This decision is one of personal preference.

EXAMPLE 4. The equation

$$y'' + 4y' + 7y = 0 \tag{38}$$

has the characteristic equation $\lambda^2 + 4\lambda + 7 = 0$, with distinct roots $\lambda = -2 \pm i\sqrt{3}$, so a general solution of (38) is

$$\begin{aligned}
y(x) &= C_1 e^{(-2 + i\sqrt{3})x} + C_2 e^{(-2 - i\sqrt{3})x} \\
&= e^{-2x}\left(C_1 e^{i\sqrt{3}x} + C_2 e^{-i\sqrt{3}x}\right) \\
&= e^{-2x}\left(A\cos\sqrt{3}x + B\sin\sqrt{3}x\right).
\end{aligned} \tag{39}$$

That is, first we factor out the common factor $e^{-2x}$, then we re-express the complex exponentials in terms of the circular functions. If we impose initial conditions $y(0) = 1$, $y'(0) = 0$, say, we find that $A = 1$ and $B = 2/\sqrt{3}$, so

$$y(x) = e^{-2x}\left(\cos\sqrt{3}x + \frac{2}{\sqrt{3}}\sin\sqrt{3}x\right).$$

According to Theorem 3.3.1, that solution is unique. ■

3.4.3. Higher-order equations (n > 2).
Examples 1-4 are representative of the four possible cases for second-order equations having distinct roots of the characteristic equation: if the roots are both real then the solution is expressible as a linear combination of two real exponentials (Example 1); if they are both real and equal and opposite in sign, then the solution is expressible either as exponentials or as a hyperbolic cosine and a hyperbolic sine (Example 2); if they are not both real then they will be complex conjugates. If those complex conjugates are purely imaginary, then the solution is expressible as a linear combination of two complex exponentials or as a sine and a cosine (Example 3); if they are not purely imaginary, then the solution is expressible as a real exponential times a linear combination of either two complex exponentials or a sine and a cosine (Example 4).

Turning to higher-order equations ($n > 2$), our attention focuses on the characteristic equation

$$\lambda^n + a_1\lambda^{n-1} + \cdots + a_{n-1}\lambda + a_n = 0. \tag{40}$$

If $n = 1$, then (40) becomes $\lambda + a_1 = 0$ which, of course, has the root $\lambda = -a_1$ on the real axis. If $n = 2$, then (40) becomes $\lambda^2 + a_1\lambda + a_2 = 0$, and to be assured of the existence of solutions we need to extend our number system from a real axis to a complex plane. If the roots are indeed complex (and both $a_1$ and $a_2$ are real) they will necessarily occur as a complex conjugate pair, as in Example 4. One might wonder if a further extension of the number system, beyond the complex plane, is required to assure the existence of solutions to (40) for $n \ge 3$. However, it turns out that the complex plane continues to suffice. The characteristic equation (40) necessarily admits $n$ roots. As for the case $n = 2$, they need not be distinct and they need not be real, but if there are complex roots then they necessarily occur in complex conjugate pairs (if all of the $a_j$'s are real).

In this subsection we limit attention to the case where there are $n$ distinct roots of (40), which we denote as $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then each of the exponentials $e^{\lambda_1 x}, \ldots,$
en” is a solution of (1) and, by jiheorem 3.3.3, y(z) = Che *® +---4+ Che” (41) is a general solution of (1) if and only if the set of exponentials is LI. THEOREM 3.4.1 Linear Independence ofa Set of Exponentials Let \i,...,An be any numbers, real or complex. The set {er . (on any given interval J) if and only if the \’s are distinct. is LI 100 Proof: Recall from Theorem 3.2.2 that if the Wronskian determinant erie WwW lems, vas ome (a) = ∙∙ ∕ * ∶ ↕ ∏ \Miotewe ∙ − ↕ (42) ∕ is not identically zero on J, then the set is LI on J. According to the properties of determinants (Section 10.4), we can factor e* out of the first column, e*2” out of the second, and so on, so that we can re-express W as lowe. lew, WwW — ewe] (x) _ eiterbAn)e 7 rot tae 7 . (43) vee Qnol The exponential function on the right-hand side is nonzero. Further, the determinant is of Vandermonde type (Section 10.4), a key property of which is that it is nonzero if the A’s are distinct, as indeed has been assumed here. Thus, W is nonzero (on any interval), so the given set is LI. Conversely, if the A’s are not distinct, then surely the set is LD because at least two of its members are identical. @ Consider the following examples. EXAMPLE 5. The equation —8y' + 8y =0 (44) hasthecharacteristicequation\? —8\ + 8 = 0. Trial anderrorrevealsthatA = 2 is one root. Hence we can factor\? ~ 8\ + 8 as (A — 2)p(A), wherep(A) is a quadratic functionof \. To findp(A) wedivide \ —2 into A®—8A + 8, by longdivision,andobtain p(A) = A? + 2\ —4 which, in turn,can befactoredas [A—(-1 + V5)][A ~ (-1 — V5)]. Thus, \ equals 2 and —1 + V5, so y(x) = Cie?" + Cel 1+v5)« 4 Cye(-l-vo)= 40*(Cer+ye-V) =Oye a is a general solution of (44). COMMENT. Alternative to long division, we can find p(\) by writing \3 — 8A +8 = (A = 2)(ad?+ BA+c) = a3 + (b—2a)A?+ (c —26)\ —2canddetermininga, b,c so that coefficients of like powers of \ match on both sides of the equation. & EXAMPLE 6. The equation ya -y=0 (46) 3.4. 
has the characteristic equation $\lambda^3 - 1 = 0$, which surely has the root $\lambda = 1$. Thus, $\lambda^3 - 1$ is $\lambda - 1$ times a quadratic function of $\lambda$, which function can be found, as above, by long division. Thus, we obtain $(\lambda - 1)(\lambda^2 + \lambda + 1) = 0$, so $\lambda$ equals 1 and $(-1 \pm \sqrt{3}\,i)/2$. Hence

$$\begin{aligned} y(x) &= C_1 e^{x} + C_2 e^{(-1+\sqrt{3}i)x/2} + C_3 e^{(-1-\sqrt{3}i)x/2} \\ &= C_1 e^{x} + e^{-x/2}\left(C_2 e^{i\sqrt{3}x/2} + C_3 e^{-i\sqrt{3}x/2}\right) \\ &= C_1 e^{x} + e^{-x/2}\left(C_2' \cos\frac{\sqrt{3}}{2}x + C_3' \sin\frac{\sqrt{3}}{2}x\right), \end{aligned} \tag{47}$$

where $C_1$, $C_2'$, $C_3'$ are arbitrary constants. (Of course, we don't really need the primes in the final answer.) ■

EXAMPLE 7. The equation

$$y^{(v)} - 7y''' + 12y' = 0 \tag{48}$$

has the characteristic equation $\lambda^5 - 7\lambda^3 + 12\lambda = 0$ or $\lambda(\lambda^4 - 7\lambda^2 + 12) = 0$. The $\lambda$ factor gives the root $\lambda = 0$. The quartic factor is actually a quadratic in $\lambda^2$, so the quadratic formula gives $\lambda^2 = 4$ and $\lambda^2 = 3$. Thus, $\lambda$ equals $0, \pm 2, \pm\sqrt{3}$, so

$$y(x) = C_1 + C_2 e^{2x} + C_3 e^{-2x} + C_4 e^{\sqrt{3}x} + C_5 e^{-\sqrt{3}x} \tag{49}$$

is a general solution of (48). ■

EXAMPLE 8. The equation

$$y^{(iv)} + ky = 0 \tag{50}$$

arises in studying the deflected shape $y(x)$ of a beam on an elastic foundation, where $k$ is a known positive physical constant. Since the characteristic equation $\lambda^4 + k = 0$ gives $\lambda^4 = -k$, to find $\lambda$ we need to evaluate $(-k)^{1/4}$. The general result is that $z^{1/n}$, for any complex number $z = a + ib$ and any integer $n > 1$, has $n$ values in the complex plane. These values are equally spaced on a circle of radius $r^{1/n}$, where $r = \sqrt{a^2 + b^2}$, centered at the origin of the complex plane, as is explained in Section 22.4. For our present purpose, let it suffice to merely give the result: $\lambda = (-k)^{1/4} = \pm k^{1/4}(1+i)/\sqrt{2}$ and $\pm k^{1/4}(1-i)/\sqrt{2}$, so

$$\begin{aligned} y(x) &= C_1 e^{k^{1/4}(1+i)x/\sqrt{2}} + C_2 e^{k^{1/4}(1-i)x/\sqrt{2}} + C_3 e^{-k^{1/4}(1+i)x/\sqrt{2}} + C_4 e^{-k^{1/4}(1-i)x/\sqrt{2}} \\ &= e^{k^{1/4}x/\sqrt{2}}\left(C_1' \cos\frac{k^{1/4}}{\sqrt{2}}x + C_2' \sin\frac{k^{1/4}}{\sqrt{2}}x\right) + e^{-k^{1/4}x/\sqrt{2}}\left(C_3' \cos\frac{k^{1/4}}{\sqrt{2}}x + C_4' \sin\frac{k^{1/4}}{\sqrt{2}}x\right) \end{aligned} \tag{51}$$

is a general solution of (50). ■

3.4.4. Repeated roots. Thus far we have considered only the generic case, where the $n$th-order characteristic equation (40) admits $n$ distinct roots $\lambda_1, \ldots, \lambda_n$. To
complete our discussion, we need to consider the case where one or more of the roots is repeated. We say that a root $\lambda_j$ of (40) is repeated if (40) contains the factor $\lambda - \lambda_j$ more than once. More specifically, we say that $\lambda_j$ is a root of order $k$ if (40) contains the factor $\lambda - \lambda_j$ $k$ times. For instance, if the characteristic equation for some given sixth-order equation can be factored as $(\lambda + 2)(\lambda - 5)^3(\lambda - 1)^2 = 0$, then the roots $\lambda = 5$ and $\lambda = 1$ are repeated; $\lambda = 5$ is a root of order 3 and $\lambda = 1$ is a root of order 2. We can say that $y(x) = C_1 e^{-2x} + C_2 e^{5x} + C_3 e^{x}$ is a solution for any constants $C_1$, $C_2$, $C_3$, but the latter falls short of being a general solution of the sixth-order differential equation since it is not a linear combination of six LI solutions.

The problem, in such a case of repeated roots, is how to find the missing solutions. Evidently, they will not be of the form $e^{\lambda x}$, for if they were then we would have found them when we sought $y(x)$ in that form. We will use a simple example to show how to obtain such "missing solutions," and will then state the general result as a theorem.

EXAMPLE 9. Reduction of Order. The equation

$$y'' + 2y' + y = 0 \tag{52}$$

has the characteristic equation $\lambda^2 + 2\lambda + 1 = (\lambda + 1)^2 = 0$, so $\lambda = -1$ is a root of order 2. Thus, we have the solution $Ae^{-x}$ but are missing a second linearly independent solution, which is needed if we are to obtain a general solution of (52). To find the missing solution, we use Lagrange's method of reduction of order, which works as follows. Suppose that we know one solution, say $y_1(x)$, of a given linear homogeneous differential equation, and we seek one or more other linearly independent solutions. If $y_1(x)$ is a solution then, of course, so is $Ay_1(x)$, where $A$ is an arbitrary constant. According to the method of reduction of order, we let $A$ vary and seek $y(x)$ in the form $y(x) = A(x)y_1(x)$.
Putting that form into the given differential equation results in another differential equation on the unknown $A(x)$, but that equation inevitably will be simpler than the original differential equation on $y$, as we shall see. In the present example, $y_1(x)$ is $e^{-x}$, so to find the missing solution we seek

$$y(x) = A(x)e^{-x}. \tag{53}$$

From (53), $y' = (A' - A)e^{-x}$ and $y'' = (A'' - 2A' + A)e^{-x}$, and putting these expressions into (52) gives

$$(A'' - 2A' + A + 2A' - 2A + A)e^{-x} = 0, \tag{54}$$

so that $A(x)$ must satisfy the second-order differential equation obtained by equating the coefficient of $e^{-x}$ in (54) to zero, namely, $A'' - 2A' + A + 2A' - 2A + A = 0$. The cancellation of the three $A$ terms in that equation is not a coincidence, for if $A(x)$ were a constant [in which case the $A'$ and $A''$ terms in (54) would drop out] then the terms on the left-hand side of (54) would have to cancel to zero because $Ae^{-x}$ is a solution of the original homogeneous differential equation if $A$ is a constant. Thanks to that (inevitable) cancellation, the differential equation governing $A(x)$ will be of the form

$$A'' + aA' = 0, \tag{55}$$

for some constant $a$, and this second-order equation can be reduced to the first-order equation $v' + av = 0$ by setting $A' = v$; hence the name reduction of order for the method. In fact, not only do the $A$ terms cancel, as they must, the $A'$ terms happen to cancel as well, so in place of (55) we have the even simpler equation

$$A'' = 0 \tag{56}$$

on $A(x)$. Integration gives $A(x) = C_1 + C_2 x$, so that (53) becomes

$$y(x) = C_1 e^{-x} + C_2 x e^{-x}. \tag{57}$$

The $C_1 e^{-x}$ term merely reproduces that which was already known (recall the second sentence of this example), and the $C_2 x e^{-x}$ term is the desired missing solution. Since the two are LI, (57) is a general solution of (52). ■

Similarly, suppose we have an eighth-order equation, the characteristic equation of which can be factored as $(\lambda - 2)^3(\lambda + 1)^4(\lambda + 5)$, say, so that 2 is a root of order 3 and $-1$ is a root of order 4.
If we take the solution $Ae^{2x}$ associated with the root $\lambda = 2$, and apply reduction of order by seeking $y$ in the form $A(x)e^{2x}$, then we obtain $A''' = 0$ and $A(x) = C_1 + C_2 x + C_3 x^2$ and hence the "string" of solutions $C_1 e^{2x}$, $C_2 x e^{2x}$, $C_3 x^2 e^{2x}$ coming from the root $\lambda = 2$. Likewise, if we take the solution $Ae^{-x}$ associated with the root $\lambda = -1$, and apply reduction of order, we obtain $A(x) = C_4 + C_5 x + C_6 x^2 + C_7 x^3$ and hence the string of solutions $C_4 e^{-x}$, $C_5 x e^{-x}$, $C_6 x^2 e^{-x}$, $C_7 x^3 e^{-x}$ coming from the root $\lambda = -1$, so that we have a general solution

$$y(x) = (C_1 + C_2 x + C_3 x^2)e^{2x} + (C_4 + C_5 x + C_6 x^2 + C_7 x^3)e^{-x} + C_8 e^{-5x} \tag{58}$$

of the original differential equation. [To verify that this is indeed a general solution one would need to show that the eight solutions contained within (58) are LI, as could be done by working out the Wronskian $W$ and showing that $W \ne 0$.]

EXAMPLE 10. For

$$y'''' - y'' = 0 \tag{59}$$

the characteristic equation $\lambda^4 - \lambda^2 = 0$ gives $\lambda = 0, 0, 1, -1$ and hence the solution $y(x) = A + Be^{x} + Ce^{-x}$. The latter falls short of being a general solution of (59) because the repeated root $\lambda = 0$ gave the single solution $A$. To find the missing solution by reduction of order we could vary the parameter $A$ and seek $y(x) = A(x)$, but surely there can be no gain in that step since it merely amounts to a name change, from $y(x)$ to $A(x)$. This situation will always occur when the repeated root is zero, but in that case we can achieve a reduction of order more directly. In the case of (59) we can set $y'' = p$. Then the fourth-order equation (59) is reduced to the second-order equation $p'' - p = 0$, so $p(x) = Ae^{x} + Be^{-x}$. But $y'' = p$, so $y'(x) = \int p(x)\,dx = Ae^{x} - Be^{-x} + C$. Hence

$$y(x) = \int \left(Ae^{x} - Be^{-x} + C\right) dx = Ae^{x} + Be^{-x} + Cx + D$$

is the general solution of (59). Observe that the pattern is the same: the repeated root $\lambda = 0$ gives the solution $(C_1 + C_2 x)e^{0x}$, where $C_1$ is $D$ and $C_2$ is $C$. ■

We organize these results as the following theorem.
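Before the general statement, the results of Examples 9 and 10 can be checked symbolically. The following is a minimal sketch, assuming the SymPy library is available; the names `nth_derivative`, `y9`, `y10`, and `basis10` are illustrative only and do not come from the text. It substitutes (57) into (52), substitutes the general solution of Example 10 into (59), and evaluates the Wronskian of the four functions $1$, $x$, $e^{x}$, $e^{-x}$.

```python
import sympy as sp

x = sp.symbols("x")
C1, C2, C3, C4 = sp.symbols("C1:5")  # four arbitrary constants


def nth_derivative(f, n):
    """Differentiate f with respect to x, n times (n = 0 returns f)."""
    for _ in range(n):
        f = sp.diff(f, x)
    return f


# Example 9: y'' + 2y' + y = 0, double root lambda = -1, solution (57).
y9 = (C1 + C2 * x) * sp.exp(-x)
residual9 = sp.simplify(nth_derivative(y9, 2) + 2 * nth_derivative(y9, 1) + y9)

# Example 10: y'''' - y'' = 0, double root 0 and simple roots +1 and -1.
y10 = C1 + C2 * x + C3 * sp.exp(x) + C4 * sp.exp(-x)
residual10 = sp.simplify(nth_derivative(y10, 4) - nth_derivative(y10, 2))

# Wronskian of the four basis functions of Example 10:
# row i holds the i-th derivatives, column j holds the j-th function.
basis10 = [sp.Integer(1), x, sp.exp(x), sp.exp(-x)]
W10 = sp.simplify(
    sp.Matrix(4, 4, lambda i, j: nth_derivative(basis10[j], i)).det()
)

print("residual of (52):", residual9)
print("residual of (59):", residual10)
print("Wronskian for Example 10:", W10)
```

Both residuals vanish identically, and the Wronskian is a nonzero constant, so the four functions are LI, in line with the bracketed remark following (58).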
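THEOREM 3.4.2 Repeated Roots of Characteristic Equation
If $\lambda_1$ is a root of order $k$ of the characteristic equation (40), then $e^{\lambda_1 x}, xe^{\lambda_1 x}, \ldots, x^{k-1}e^{\lambda_1 x}$ are $k$ LI solutions of the differential equation (1).

Proof: Denote (1) in operator form as $L[y] = 0$, where

$$L = \frac{d^n}{dx^n} + a_1\frac{d^{n-1}}{dx^{n-1}} + \cdots + a_{n-1}\frac{d}{dx} + a_n. \tag{60}$$

Then

$$L\left[e^{\lambda x}\right] = \left(\lambda^n + a_1\lambda^{n-1} + \cdots + a_{n-1}\lambda + a_n\right)e^{\lambda x} = (\lambda - \lambda_1)^k p(\lambda)e^{\lambda x}, \tag{61}$$

where $p(\lambda)$ is a polynomial in $\lambda$ of degree $n - k$. Since (61) holds for all $\lambda$, we can set $\lambda = \lambda_1$ in that formula. Doing so, the right-hand side of (61) vanishes, so that $L[e^{\lambda_1 x}] = 0$ and hence $e^{\lambda_1 x}$ is a solution of $L[y] = 0$. Our object, now, is to show that $xe^{\lambda_1 x}, \ldots, x^{k-1}e^{\lambda_1 x}$ are solutions as well. To proceed, differentiate (61) with respect to $\lambda$ ($\lambda$, not $x$):

$$\frac{d}{d\lambda}L\left[e^{\lambda x}\right] = \frac{d}{d\lambda}\left[(\lambda - \lambda_1)^k p(\lambda)e^{\lambda x}\right] = k(\lambda - \lambda_1)^{k-1}p(\lambda)e^{\lambda x} + (\lambda - \lambda_1)^k \frac{d}{d\lambda}\left[p(\lambda)e^{\lambda x}\right]. \tag{62}$$

The left-hand side of (62) calls for $e^{\lambda x}$ to be differentiated first with respect to $x$, according to the operator $L$ defined in (60), and then with respect to $\lambda$. Since we can interchange the order of these differentiations, we can express the left-hand side as $L\left[\frac{d}{d\lambda}e^{\lambda x}\right]$, that is, as $L\left[xe^{\lambda x}\right]$. Thus, one differentiation of (61) with respect to $\lambda$ gives

$$L\left[xe^{\lambda x}\right] = k(\lambda - \lambda_1)^{k-1}p(\lambda)e^{\lambda x} + (\lambda - \lambda_1)^k \frac{d}{d\lambda}\left[p(\lambda)e^{\lambda x}\right]. \tag{63}$$

Setting $\lambda = \lambda_1$ in (63) gives $L[xe^{\lambda_1 x}] = 0$. Hence, not only is $e^{\lambda_1 x}$ a solution, so is $xe^{\lambda_1 x}$. Repeated differentiation with respect to $\lambda$ reveals that $x^2 e^{\lambda_1 x}, \ldots, x^{k-1}e^{\lambda_1 x}$ are solutions as well, as was to be proved. That's as far as we can go because at that point one more differentiation would give a leading term of $k!\,(\lambda - \lambda_1)^0 p(\lambda)e^{\lambda x}$ plus terms with factors of $(\lambda - \lambda_1), (\lambda - \lambda_1)^2, \ldots, (\lambda - \lambda_1)^k$ on the right-hand side. The latter terms vanish for $\lambda = \lambda_1$, but the leading term does not because $p(\lambda_1) \ne 0$ (because $\lambda - \lambda_1$ is not among the factors of $p$) and $e^{\lambda_1 x} \ne 0$. Verification that the solutions $e^{\lambda_1 x}, xe^{\lambda_1 x}, \ldots, x^{k-1}e^{\lambda_1 x}$ are LI is left for the exercises. ■

EXAMPLE 11.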
As a final example, consider the equation

$$y'''' - 8y''' + 26y'' - 40y' + 25y = 0 \tag{64}$$

with characteristic equation $\lambda^4 - 8\lambda^3 + 26\lambda^2 - 40\lambda + 25 = 0$ and repeated complex roots $\lambda = 2+i,\ 2+i,\ 2-i,\ 2-i$. It follows that

$$\begin{aligned} y(x) &= (C_1 + C_2 x)e^{(2+i)x} + (C_3 + C_4 x)e^{(2-i)x} \\ &= e^{2x}\left[\left(C_1 e^{ix} + C_3 e^{-ix}\right) + x\left(C_2 e^{ix} + C_4 e^{-ix}\right)\right] \\ &= e^{2x}\left[(A\cos x + B\sin x) + x(C\cos x + D\sin x)\right] \end{aligned}$$

is a general solution of (64). ■

3.4.5. Stability. An important consideration in applications, especially feedback control systems, is whether or not a system is "stable." Normally, stability has to do with the behavior of a system over time, so let us change the name of the independent variable from $x$ to $t$ in (1):

$$\frac{d^n y}{dt^n} + a_1\frac{d^{n-1}y}{dt^{n-1}} + \cdots + a_{n-1}\frac{dy}{dt} + a_n y = 0, \tag{65}$$

and let us denote the general solution of (65) as $y(t) = C_1 y_1(t) + \cdots + C_n y_n(t)$. We say that the system described by (65) (be it mechanical, electrical, economic, or whatever) is stable if all of its solutions are bounded; that is, if there exists a constant $M_j$ for each solution $y_j(t)$ such that $|y_j(t)| < M_j$ for all $t \ge 0$. If the system is not stable, then it is unstable.

THEOREM 3.4.3 Stability
For the system described by (65) to be stable, it is necessary and sufficient that the characteristic equation of (65) have no roots to the right of the imaginary axis in the complex plane and that any roots on the imaginary axis be nonrepeated.

Proof: Let $\lambda = a + ib$ be any nonrepeated root of the characteristic equation; we call $a$ the real part of $\lambda$ and write $\operatorname{Re}\lambda = a$, and $b$ the imaginary part of $\lambda$ and write $\operatorname{Im}\lambda = b$. Such a root will contribute a solution $e^{(a+ib)t} = e^{at}(\cos bt + i\sin bt)$. Since the magnitude (modulus, to be more precise) of a complex number $x + iy$ is defined as $|x + iy| = \sqrt{x^2 + y^2}$, and the magnitude of the product of complex numbers is the product of their magnitudes, we see that

$$\left|e^{(a+ib)t}\right| = \left|e^{at}(\cos bt + i\sin bt)\right| = \left|e^{at}\right|\left|\cos bt + i\sin bt\right| = e^{at}\sqrt{\cos^2 bt + \sin^2 bt} = e^{at},$$

so that solution will be bounded if and only if $a \le 0$, that is, if $\lambda$ does not lie to the right of the imaginary axis.

Next, let $\lambda = a + ib$ be a repeated root of order $k$, with $a \ne 0$. Such a root will contribute solutions of the form $t^p e^{(a+ib)t} = t^p e^{at}(\cos bt + i\sin bt)$, for $p = 0, \ldots, k-1$, with magnitude $t^p e^{at}$. Surely the latter grows unboundedly if $a > 0$ because both factors do, but its behavior is less obvious if $a < 0$ since then the $t^p$ factor grows and the $e^{at}$ decays. To see which one "wins," one can rewrite the product as $t^p / e^{-at}$ and then apply l'Hôpital's rule $p$ times. Doing so, one finds that the ratio tends to zero as $t \to \infty$. [Recall that l'Hôpital's rule applies to indeterminate forms of the type $0/0$ or $\infty/\infty$, not $(\infty)(0)$; that is why we first rewrite $t^p e^{at}$ in the form $t^p / e^{-at}$.] The upshot is that such solutions are bounded if $\lambda = a + ib$ lies in the left half plane ($a < 0$), and unbounded if it lies in the right half plane ($a > 0$). If $\lambda$ lies on the imaginary axis ($a = 0$), then $|t^p e^{(a+ib)t}| = |t^p e^{ibt}| = |t^p(\cos bt + i\sin bt)| = t^p$, which grows unboundedly for $p \ge 1$.

Our conclusion is that all solutions are bounded if and only if no roots lie to the right of the imaginary axis and no repeated roots lie on the imaginary axis, as was to be proved. ■

One is often interested in being able to determine whether the system is stable or not without actually evaluating the $n$ roots of the characteristic equation (40). There are theorems that provide information about stability based directly upon the $a_j$ coefficients in (40). One such theorem is stated below. Another, the Routh-Hurwitz criterion, is given in the exercises to Section 10.4.

THEOREM 3.4.4 Coefficients of Mixed Sign
If the coefficients in (40) are real and of mixed sign (there is at least one positive and at least one negative), then there is at least one root $\lambda$ with $\operatorname{Re}\lambda > 0$, so the system is unstable.

These theorems are not as important as they were before the availability of computer software that can determine the roots of (40) numerically and with great ease. For instance, using Maple, one can obtain all roots of the equation $x^5 + 3x^4 - 2x^3 + x^2 + x + 5 = 0$ simply by using the fsolve command. Enter

    fsolve(x^5 + 3*x^4 - 2*x^3 + x^2 + x + 5 = 0, x, complex);

and return. This gives the following printout of the five solutions:

    -3.6339286, -.58045036 - .79731249 I, -.58045036 + .79731249 I,
    .89741468 - .78056850 I, .89741468 + .78056850 I

In this example, observe that there are, indeed, roots with positive real parts, as predicted by Theorem 3.4.4, so the system is unstable. For equations of fourth degree or lower, such software works even if one or more of the coefficients are unspecified, in which case the roots are given in terms of those parameters.

Closure. In this section we limited our attention to linear homogeneous differential equations with constant coefficients, a case of great importance in applications. Seeking solutions in exponential form, we found the characteristic equation to be central. According to the fundamental theorem of algebra, such equations always have at least one root, so we are guaranteed of finding at least one exponential solution of the differential equation. If the $n$ roots $\lambda_1, \ldots, \lambda_n$ are distinct, then each root $\lambda_j$ contributes a solution $e^{\lambda_j x}$, and their superposition gives a general solution of (1) in the form

$$y(x) = C_1 e^{\lambda_1 x} + \cdots + C_n e^{\lambda_n x}. \tag{66}$$

If any root $\lambda_j$ is repeated, say of order $k$, then it contributes not only the solution $e^{\lambda_j x}$, but the $k$ LI solutions $e^{\lambda_j x}, xe^{\lambda_j x}, \ldots, x^{k-1}e^{\lambda_j x}$ to the general solution.
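As a computational aside, the numerical root-finding step described in connection with Theorem 3.4.4 can be reproduced with software other than Maple. The sketch below assumes NumPy and uses its `numpy.roots` routine; the helper name `right_half_plane_check` and the tolerance `tol` are hypothetical choices, not from the text. It only detects roots with positive real part; it does not test for repeated roots on the imaginary axis, which Theorem 3.4.3 also requires for a complete stability verdict.

```python
import numpy as np


def right_half_plane_check(coeffs, tol=1e-9):
    """Return (roots, unstable) for lambda^n + a1*lambda^(n-1) + ... + an = 0.

    coeffs lists the coefficients from the highest power down,
    e.g. [1, a1, ..., an]. 'unstable' is True if some root has
    real part greater than tol (a root to the right of the imaginary axis).
    """
    roots = np.roots(coeffs)
    unstable = bool(np.max(roots.real) > tol)
    return roots, unstable


# The quintic used in the Maple illustration:
# x^5 + 3x^4 - 2x^3 + x^2 + x + 5 = 0
coeffs = [1, 3, -2, 1, 1, 5]
roots, unstable = right_half_plane_check(coeffs)

# Hypothesis of Theorem 3.4.4: real coefficients of mixed sign.
mixed_sign = any(c > 0 for c in coeffs) and any(c < 0 for c in coeffs)

for r in roots:
    print(f"{r.real:+.7f} {r.imag:+.7f}i")
print("mixed-sign coefficients:", mixed_sign)
print("root with positive real part found:", unstable)
```

The coefficients are of mixed sign, so Theorem 3.4.4 applies, and the computed roots include a complex pair with positive real part, consistent with the printout quoted above.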
Thus, in the generic case of distinct roots, the general solution of (1) is of the form (66); in the nongeneric case of repeated roots, the solution also contains one or more terms which are powers of $x$ times exponentials.

It should be striking how simple is the solution process for linear constant-coefficient homogeneous equations, with the only difficulty being algebraic: the need to find the roots of the characteristic equation. The reason for this simplicity is that most of the work was done simply in deciding to look for solutions in the right place, within the set of exponential functions. Also, observe that although in a fundamental sense the solving of a differential equation in some way involves integration, the methods discussed in this section required no integrations, in contrast to most of the methods of solution of first-order equations in Chapter I.

In the final (optional) section we introduced the concept of stability, and in Theorem 3.4.3 we related the stability of the physical system to the placement of the roots of the characteristic equation in the complex plane.

Computer software. To obtain a general solution of $y''' - 9y' = 0$ using Maple, use the command

    dsolve({diff(y(x),x,x,x) - 9*diff(y(x),x) = 0}, y(x));

and to solve the ODE subject to the initial conditions $y(0) = 5$, $y'(0) = 2$, $y''(0) = -4$, use the command

    dsolve({diff(y(x),x,x,x) - 9*diff(y(x),x) = 0, y(0) = 5, D(y)(0) = 2, D(D(y))(0) = -4}, y(x));

In place of diff(y(x),x,x,x) we could use diff(y(x), x$3), for brevity.

EXERCISES 3.4

1. Use whichever of equations (5)-(8) are needed to derive these relations between the circular and hyperbolic functions:
(a) $\cos(ix) = \cosh x$ (b) $\sin(ix) = i\sinh x$ (c) $\cosh(ix) = \cos x$ (d) $\sinh(ix) = i\sin x$

(o) y) - 2y'' - 3y = 0 (p) yo) + 6y'' + 8y = 0

2. Use equations (6) and/or (7) to derive or verify lowing equations, and a particular solution satisfying the given conditions, if such conditions are given.
(f) equation(13d) (e)equation(13c) 3. Theorem 3.4.2 states that e*!*, ze™*,... Prove that claim. (x)yt" = 2y yo +2y =0 5. (a)—(r) Solve the corresponding problem in Exercise 4 Cee eee 6. (Repeated roots) Find-a general solution of each of the fol- ° equation Oo (a) equation (130) io scuation Ok ) c) equa fone (q)y(iv)) _+Ty” +Ly =0 − Wt og tt ,e*—1e*1*are LI. y(-3)=5,y/(-3)=-1 (@y"=0; (b) a + 6y’ as 3} 0; ae i y(1) = ()y” =0; y(0)=3, 4. (Nonrepeated roots)Find a generalsolutionof eachof the (@)y'” +5y”=0; following equations, anda particular solution satisfying the givenconditions, if suchconditionsaregiven. (a) y" + 5y! =0 "WW (c)y”+y' =0; y(0)=3,y'(0) =0 dy" -3y'+2y=0; y(l)=1, y(1)=0 =1,y(1)= —4y’—5y=0; ey” (e) y"— 4y'—~5y =0; yA) y(1)= 1, y'(1) =0 tt 5,4 —3y" + 3y' —y =0 ~y"” -y +y =0 ap mn () y - 2y a ss _ 0 y(0)=2, y(O)=4, y/(0)=5 y/(0)=-1 v(0) ~ . v0) ()e q y Tine mins _ 0. a) ~~ _ 0. ∙− _ 3 yu 0) ~ 0. mw nn (j) yO) +By” + 16y= 0 (Hy!ty! Ry=0; y(-1)=2,y(-1)=5 sy (fy a Oe) (g)y" —4y'+5y=0; (h)yy”—2y'+3y=0; y(0) =6 Vo wt) ey ee at " Mn 0; (0) = y/(0)= y"(0) = y"(0) = y= (kK) y(0) =0, y(0)=3 7. (a)~(k) Solve the correspondingproblem in Exercise 6 usingcomputersoftware. 8. If the roots of the characteristic equation are as follows, then find the original differential equationand also a general Dy =O |WO Yoo yeay YuV; = 1, 4 0)=0,YM) =U, Y = o wsonaFie =1, = 2,6 (a) y+ y” —2y=0 (m) (n)yy) ~y =0 y"(0)=0 (h) y + 3y" = 0 =0 ty! (iv) ty! (i) ! 4 ! y” 0) = 1 y(0)=0, (€) yl" + by" + 3y' +y =0 (g) yl" (b) y” — y’ =0 y(0)=—-5, y(0)=1, (c)4 —21,4423 —2i (b)21, (d)—2,3,5 3.4, Solution of HomogeneousEquation: Constant Coefficients (f) 1,1, -2 (e) 2,3, —1 (g)4,4,4,i,-i du yu (h)1,-1,2+%,2—i (i)0,0,0,0,7,9 G@iltil+ijl—i1—i da =0. 109 (11.2) 9. (Complexaj;'s)Find a generalsolution of each of the fol- Sl ve (11.2) for u, put that wuon the right-hand side of lowing equations. 
NOTE: Normally, the a; coefficients in (1) are real, but the results of this section hold even if they are not (except for Theorem 3.4.4, which explicitly requires that the coefficients be real). However, be aware that if the y — A»y = u, which is again of first order, and solve the dx latter for y. Show that if A,, A2 are distinct, then the result is given by (23), whereas if they are repeated,then the result is y(x) = (Cy + Cox)e*?”. (c) Solve y” — 3y’ + 2y = 0 by factoring the operator as necessarily occur in complex conjugate pairs. For instance, (D ~1)(D —2)y = 0. Solvethelatterby themethodoutlined A?4 2d + 1 = Ohastheroots\ = (/2 — 1)i, -(/2 + 1)i -u=0 in (b): Setting(D —2)y = u, solve (D -lju=u' (b)y” —3iy’ —2y =0 (a)y” —2iy’+y =0 for u(x). Then,knowingu(x), solve (D —2)y = u, namely, a; coefficients are not all real, then complex (c)y"+ty’-y=0 roots do not (d)y” —2iy’ —y=0 y' —2y=u(x),fory(a). HINT: Verify, and use, the fact that (d) Same as (c), for y” — 4y = 0. (e)Sameas (c),for y” + 4y’ + 3y = 0. Vi = (1+ 3)/v2. (f) Sameas (c),for y + 2y’ +y = 0. Hy" +diy” —y' =0 Same as (c), for y” + 4y’ + 4y = 0. (g) HINT: Verify, and use, the fact that NOTE: Similarly for higher-order equations. For instance, f—j=4(1i)/V2 (h)yl!"—(1+ 2i)y” + (i +i)y’ —2014+ dy= 0 HINT: One y” — 2y” — y' + 24y= (D - 2)(D+ 1)(D —-1)y = O can be solved by setting (D + 1)(D — 1)y = uw and solving root is found, by inspection, to be A = (D — 2)u = 0 for u(x); then set (D — 1)y = v and solve (e) y’ — iy = 0 10. (a)—(h) Solve the corresponding oroblem in Exercise 9 using computer software. 11. (Solution by factorization of the operator) We motivated the idea of seeking solutions to (1) in the form e** by ob- ane that the general solution of the first-order equation y' + ayy = 0 is an exponential, Ce~*, and wondering if higher-order equations might admit exponential solutions too. A more compelling approach is as follows. 
Having already seen that the first-order equation admits an exponential solution, consider the second-order equation (18). (a) Show that (18) can be written, equivalently, (D —Ar)(D —d2)y=0, where D denotes d/dz, as and A, and \» are the two roots of By theleft-handsideof (11.1),we mean (D —1)((D —2)y). Thatis, firstlet theoperatortotheleft of y (namely, D — A») act on y, then let the operator to the left of (D —Az)y (namely,D —\;) acton that. homogeneous differential equation with constant coefficients can be reduced to the solution of a sequence of n first-order linear equations. 12. Use computer software to obtain the roots of the given characteristic equation, state whether the system is stable or unstable, and explain why. If Theorem 3.4.4 applies, then show that your results are consistent with the predictions of that theorem. (a) 8 —8d? + 26’ -—2 =0 (11.1) (b)A?+ 3A? + 2A +2 =0 d? +a;\+ a2 = 0. NOTE: In (11.1) we accomplish a factorization of the original differential operator L = D?-+a,D+az as (D—,)(D—z). (D+ 1)v = u for v(x); finally, solve (D — 1)y = v for y(xz). The upshotis that the solution of an nth-order linear (b) To solve (11.1), let (D — Az)y = u, so that (18) reduces to the first-order equation + AS+31? +21 42=0 (c) M44 +(4+4=0 (d)M+ A452 (e)AS + 45 4.544 243—AZ+A4-3 =0 (f) X48+ OA54+5A4A+21 4+717 +(14+3=0 (g)AS+ AS+ 5AS+ 43 + 447 4+8A 4+4=0 (hyAS+ AF+ 5AA+ 2A3 + TAZ+A43 =0 (i) AB—AS + AS+5AA + QAP+ 72 +143 =0 (j) ABE AT AS + AE4 SASF 21 + 717 +A43=0 3.5 Application to Harmonic Oscillator: Free Oscillation In Section 1.3 we discussed the modeling of the mechanical oscillator reproduced here in Fig. 1. Neglecting air resistance, the block of mass m is subjected to a restoring force due to the spring, a “drag” force due to the friction between the block and the lubricated table top, and an applied force f(t). (By a restoring force, we mean that the force opposes the stretch or compression in the spring.) 
Most of that discussion focused on the modeling of the spring force and friction force, and we derived the approximate equation of motion

$$mx'' + cx' + kx = f(t), \tag{1}$$

where $c$ is the damping coefficient and $k$ is the spring stiffness. Besides the differential equation, let us regard the initial displacement and initial velocity,

$$x(0) = x_0 \quad\text{and}\quad x'(0) = x_0', \tag{2}$$

respectively, as specified values. In this section we consider the solution for the case where $f(t) = 0$:

$$mx'' + cx' + kx = 0. \tag{3}$$

This is the so-called unforced, or free, oscillation. According to Theorem 3.3.1, the solution $x(t)$ to (3) and (2) does exist and is unique. To find it, we seek $x(t) = e^{\lambda t}$ and obtain the characteristic equation $m\lambda^2 + c\lambda + k = 0$, with roots

$$\lambda = \frac{-c \pm \sqrt{c^2 - 4mk}}{2m}. \tag{4}$$

Consider first the case where there is no damping, so $c = 0$ and (3) becomes

$$mx'' + kx = 0. \tag{5}$$

That is, the friction is small enough so that it can be neglected altogether. Then (4) gives $\lambda = \pm i\sqrt{k/m}$, and the solution of (5) is

$$x(t) = Ae^{i\omega t} + Be^{-i\omega t}, \tag{6}$$

where $\omega = \sqrt{k/m}$ is the so-called natural frequency of the system, in rad/sec. Or, equivalent to (6) and favored in this text,

$$x(t) = C\cos\omega t + D\sin\omega t. \tag{7}$$

In fact, there is another useful form of the general solution, namely,

$$x(t) = E\sin(\omega t + \phi), \tag{8}$$

where the integration constants $E$ and $\phi$ can be determined in terms of $C$ and $D$ as follows. To establish the equivalence of (8) and (7), recall the trigonometric identity $\sin(A + B) = \sin B\cos A + \sin A\cos B$. Then

$$E\sin(\omega t + \phi) = E\sin\phi\cos\omega t + E\cos\phi\sin\omega t,$$

which is identical to $C\cos\omega t + D\sin\omega t$ if

$$C = E\sin\phi \quad\text{and}\quad D = E\cos\phi. \tag{9a,b}$$

Squaring and adding equations (9), and also dividing one by the other, gives

$$E = \sqrt{C^2 + D^2} \quad\text{and}\quad \phi = \tan^{-1}\frac{C}{D}, \tag{10a,b}$$

respectively, as the connection between the equivalent forms (7) and (8). It will be important to be completely comfortable with the equivalence between the solution forms (6), (7), and (8).

Both the square root and the $\tan^{-1}$ in (10) are multi-valued. We will understand the square root as the positive one and the $\tan^{-1}$
to lie between $-\pi$ and $\pi$. Specifically, it follows from (9), with $E > 0$, that if $C > 0$ and $D > 0$ then $0 < \phi < \pi/2$, if $C > 0$ and $D < 0$ then $\pi/2 < \phi < \pi$, if $C < 0$ and $D > 0$ then $-\pi/2 < \phi < 0$, and if $C < 0$ and $D < 0$ then $-\pi < \phi < -\pi/2$. For instance, consider $6\cos t - 2\sin t$. Then $E = \sqrt{36 + 4} = \sqrt{40}$ and $\phi = \tan^{-1}\left(\frac{+6}{-2}\right)$. A calculator or typical computer software will interpret $\tan^{-1}(\ )$ as $-\pi/2 < \tan^{-1}(\ ) < \pi/2$, namely, in the first or fourth quadrant. Not able to distinguish $(+6)/(-2)$ from $(-6)/(+2)$, it will give $\tan^{-1}(-3) = -1.25$ rad, which is incorrect. The correct value is in the second quadrant, namely, $\phi = \pi - 1.25 = 1.89$ rad. Thus, $6\cos t - 2\sin t = \sqrt{40}\sin(t + 1.89)$.

Whereas $C$ and $D$ in (7) have no special physical significance, $E$ and $\phi$ in (8) are the amplitude and phase angle of the vibration, respectively (Fig. 2a).

Figure 2. (a) Graphical significance of $\omega$, $\phi$. (b) Undamped free oscillation.

Although (8) is advantageous conceptually, in that the amplitude $E$ and phase angle $\phi$ are physically and graphically meaningful, it is a bit easier to apply the initial conditions to (7):

$$x(0) = x_0 = C, \qquad x'(0) = x_0' = \omega D,$$

so $C = x_0$, $D = x_0'/\omega$, and the solution is

$$x(t) = x_0\cos\omega t + \frac{x_0'}{\omega}\sin\omega t, \tag{11}$$

a plot of which is shown in Fig. 2b for representative initial conditions $x_0$ and $x_0'$.

Before continuing, consider the relationship between the mathematics and the physics. For example, the frequency $\omega = \sqrt{k/m}$ increases with $k$, decreases with $m$, and is independent of the initial conditions $x_0$ and $x_0'$, and hence of the amplitude which, according to (11) and (10), is $\sqrt{x_0^2 + (x_0'/\omega)^2}$. Do these results make sense? Probably the increase of $\omega$ with $k$ fits with our experience with springs, and its decrease with $m$ makes sense as well. However, one might well expect the frequency to vary with the amplitude of the vibration. We will come back to this point in Chapter 7, where we consider more realistic nonlinear models.

Now suppose there is some damping, $c > 0$.
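A brief computational aside on the undamped result, before the damped cases are taken up: the quadrant bookkeeping for (10b) is what a two-argument arctangent does automatically. The sketch below uses Python's standard `math.atan2` and `math.hypot`; the function name `amplitude_phase` is hypothetical. Since (9a,b) give $\sin\phi = C/E$ and $\cos\phi = D/E$, the call is `atan2(C, D)`, with $C$ as the first argument.

```python
import math


def amplitude_phase(C, D):
    """Return (E, phi) such that C*cos(wt) + D*sin(wt) = E*sin(wt + phi).

    E is the positive square root in (10a); phi is chosen in (-pi, pi]
    using sin(phi) = C/E and cos(phi) = D/E from (9a,b).
    """
    E = math.hypot(C, D)
    phi = math.atan2(C, D)  # first argument plays the role of sin, second of cos
    return E, phi


# The example from the text: 6 cos t - 2 sin t
E, phi = amplitude_phase(6.0, -2.0)
print(f"E = {E:.4f}, phi = {phi:.4f} rad")

# Spot check of the identity at an arbitrary time
t = 0.7
lhs = 6.0 * math.cos(t) - 2.0 * math.sin(t)
rhs = E * math.sin(t + phi)
print(f"difference at t = {t}: {abs(lhs - rhs):.2e}")
```

For the example in the text this gives $E = \sqrt{40}$ and $\phi \approx 1.89$ rad, the second-quadrant value, without any manual correction of the single-argument $\tan^{-1}$.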
From (4) we see that there are three cases of interest. If we define the critical damping as $c_{cr} = \sqrt{4mk} = 2\sqrt{mk}$, then the solution is qualitatively different depending upon whether $c < c_{cr}$ (the "underdamped" case), $c = c_{cr}$ (the "critically damped" case), or $c > c_{cr}$ (the "overdamped" case).

Underdamped vibration ($c < c_{cr}$). In this case (4) gives two complex conjugate roots

$$\lambda = \frac{1}{2m}\left(-c \pm \sqrt{c^2 - c_{cr}^2}\right) = \frac{1}{2m}\left(-c \pm i\sqrt{c_{cr}^2 - c^2}\right) = -\frac{c}{2m} \pm i\sqrt{\omega^2 - \left(\frac{c}{2m}\right)^2},$$

so a general solution of (3) is

$$x(t) = e^{-\frac{c}{2m}t}\left[A\cos\sqrt{\omega^2 - \left(\frac{c}{2m}\right)^2}\,t + B\sin\sqrt{\omega^2 - \left(\frac{c}{2m}\right)^2}\,t\right], \tag{12}$$

where $A$ and $B$ can be determined from the initial conditions (2). Of course, we could express the bracketed part in the form (8) if we like. Comparing (7) and (12), observe that the damping has two effects. First, it introduces the $e^{-(c/2m)t}$ factor, which causes the oscillation to "damp out" as $t \to \infty$, as illustrated in Fig. 3. That is, the amplitude tends to zero as $t \to \infty$. Second, it reduces the frequency from the natural frequency $\omega$ to $\sqrt{\omega^2 - (c/2m)^2}$; that is, it makes the system more sluggish, as seems intuitively reasonable. (It might appear from Fig. 2b and 3 that the damping increases the frequency, but that appearance is only because we have compressed the $t$ scale in Fig. 3.)

Figure 3. Underdamped free oscillation.

Critically damped vibration ($c = c_{cr}$). As $c$ is increased further, the system becomes so sluggish that when $c$ attains the critical value $c_{cr}$ the oscillation ceases altogether. In this case (4) gives the repeated root $\lambda = -c/2m$, of order two, so

$$x(t) = (A + Bt)e^{-\frac{c}{2m}t}. \tag{13}$$

Although the $t$ in $A + Bt$ grows unboundedly, the exponential function decays more powerfully (as discussed within the proof of Theorem 3.4.3) and the solution (13) decays without oscillation, as shown in the $c = c_{cr}$ part of Fig. 4.

Overdamped vibration ($c > c_{cr}$). As $c$ increases beyond $c_{cr}$, (4) once again gives two distinct roots, but now they are both real and negative (because the $\sqrt{c^2 - 4mk}$ is smaller than $c$), so

$$x(t) = e^{-\frac{c}{2m}t}\left[A\cosh\sqrt{\left(\frac{c}{2m}\right)^2 - \omega^2}\,t + B\sinh\sqrt{\left(\frac{c}{2m}\right)^2 - \omega^2}\,t\right], \tag{14}$$

where $A$ and $B$ can be determined from the initial conditions (2). Indeed, if one or both roots were positive then we could have exponential growth, which would make no sense, physically. If that did happen we should expect that either there is an error in our mathematics or that our mathematical modeling of the phenomenon is grossly inaccurate. A representative plot of that solution is shown in the $c > c_{cr}$ part of Fig. 4. For the sake of comparison we have used the same initial conditions to generate the three plots in Figures 3 and 4. Though one can use positive and negative exponentials within the parentheses in (14), in place of the hyperbolic cosine and sine, the latter are more convenient for the application of the initial conditions since the sinh is zero at $t = 0$ and so is the derivative of the cosh.

Figure 4. Critically damped and overdamped cases.

This completes our solution of equation (3), governing the free oscillation of the mechanical oscillator shown in Fig. 1. It should be emphasized that Fig. 1 is intended only as a schematic equivalent of the actual physical system. For instance, suppose the actual system consists of a beam cantilevered downward, with a mass $m$ at its end, as shown in Fig. 5a. We assume the mass of the beam to be negligible compared to $m$. It is known from Euler beam theory that if we apply a constant force $F$, as shown in Fig. 5b, then the end deflection $x$ is given by $x = FL^3/(3EI)$, where $L$ is the length of the beam and $EI$ is its "stiffness" ($E$ is Young's modulus of the material and $I$ is a cross-sectional moment of inertia). Re-expressing the latter as $F = (3EI/L^3)x$, we see that it is of the form $F = kx$, as for a linear spring of stiffness $k$. Thus, insofar as the modeling and analysis is concerned, the physical beam system is equivalent to the mass-spring arrangement shown in Fig.
5c, where $k_{eq} = 3EI/L^3$ is the stiffness of the equivalent spring and where there is no friction between the block and the table top. The governing equation of motion is

$$mx'' + k_{eq}x = 0. \tag{15}$$

Just as we neglected the mass of the beam, compared to $m$, likewise let us neglect the mass of the spring compared to $m$. (How to account for that mass, approximately, is discussed in the exercises.) It should be noted that, in addition, we are neglecting the rotational motion of the mass, in Fig. 5b, since we have already limited ourselves to the case of small deflections of the beam.

Figure 5. Equivalent mechanical systems.

Finally, it has already been pointed out, in Section 2.3, that the force-driven mechanical oscillator is analogous to the voltage-driven RLC electrical circuit reproduced here in Fig. 6, under the equivalence

$$L \leftrightarrow m, \quad R \leftrightarrow c, \quad \frac{1}{C} \leftrightarrow k, \quad i(t) \leftrightarrow x(t), \quad \frac{dE(t)}{dt} \leftrightarrow F(t), \tag{16}$$

so whatever results we have obtained in this section for the mechanical oscillator apply equally well to the electrical oscillator shown in Fig. 6, according to the equivalence given above.

Figure 6. Electrical oscillator; RLC circuit.

Closure. In this section we have considered the free oscillations of the mechanical harmonic oscillator. We found that for the undamped case the solution is a pure sine wave with an amplitude and phase shift that depend upon the initial conditions; that is, the solution is "harmonic." In the presence of light damping (i.e., for $c < c_{cr}$), the solution suffers exponential decay and a reduction in frequency, these effects becoming more pronounced as $c$ is increased. When $c$ reaches a critical value $c_{cr}$ the oscillation ceases altogether, and as $c$ is increased further the exponential decay is increasingly pronounced.

It should be emphasized that by the damped harmonic oscillator we mean a system that can be modeled by a linear equation of the form $mx'' + cx' + kx = 0$.
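For that linear equation, the three closed-form cases (12), (13), and (14) can be gathered into one routine. The following is a minimal sketch, not a definitive implementation; the function name `free_response` and the use of `math.isclose` to detect critical damping are assumptions made here. The constants $A$ and $B$ are obtained in each branch by applying the initial conditions (2), with `v0` standing for $x_0'$.

```python
import math


def free_response(m, c, k, x0, v0, t):
    """Evaluate x(t) for m x'' + c x' + k x = 0 with x(0) = x0, x'(0) = v0."""
    omega = math.sqrt(k / m)      # natural frequency
    a = c / (2.0 * m)             # decay rate c/2m
    c_cr = 2.0 * math.sqrt(m * k) # critical damping

    if math.isclose(c, c_cr):
        # Critically damped, form (13): x = (A + B t) e^{-a t}
        A, B = x0, v0 + a * x0
        return (A + B * t) * math.exp(-a * t)

    if c < c_cr:
        # Underdamped, form (12): reduced frequency sqrt(omega^2 - a^2)
        wd = math.sqrt(omega**2 - a**2)
        A, B = x0, (v0 + a * x0) / wd
        return math.exp(-a * t) * (A * math.cos(wd * t) + B * math.sin(wd * t))

    # Overdamped, form (14): hyperbolic functions with rate sqrt(a^2 - omega^2)
    s = math.sqrt(a**2 - omega**2)
    A, B = x0, (v0 + a * x0) / s
    return math.exp(-a * t) * (A * math.cosh(s * t) + B * math.sinh(s * t))


# Same initial conditions for all three regimes (m = k = 1, so c_cr = 2)
for c in (0.5, 2.0, 5.0):
    values = [free_response(1.0, c, 1.0, 1.0, 0.0, t) for t in (0.0, 1.0, 5.0)]
    print(c, [round(v, 4) for v in values])
```

Using the hyperbolic form in the overdamped branch keeps the application of the initial conditions parallel to the underdamped branch, which is the convenience noted after (14).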
In most applications, however, the restoring force can be regarded as a linear function of $x$ (namely, $kx$) only for motions that are sufficiently small departures from an equilibrium configuration; if the motion is not sufficiently small, then one must deal with a more difficult nonlinear differential equation. Thus, for the harmonic oscillator, damped or not, we are able to generate simple closed form solutions, as we have done in this section. For nonlinear oscillators one often gives up on the possibility of finding closed form analytical solutions and relies instead on numerical simulation, as will be discussed in Chapter 6. To illustrate how such nonlinear oscillators arise in applications, we have included several such examples in the exercises that follow.

In terms of formulas, the equivalence of the three forms (6), (7), and (8) should be clearly understood and remembered. In a given application we will use whichever of these seems most convenient, usually (7).

EXERCISES 3.5

1. ... that is, evaluate $E$, $\phi$, $\omega$.
(a) $6\cos t + \sin t$
(c) $5\cos 2t - 12\sin 2t$
(d) $-2\cos 3t + 2\sin 3t$
(e) $\cos 5t - \sin 5t$
(f) $x_0\cos\omega t + \dfrac{x_0'}{\omega}\sin\omega t$, from (11)

2. We emphasized the equivalence of the solution forms (6), (7), and (8), and discussed the equations (10a,b) that relate $C$ and $D$ in (7) to $E$ and $\phi$ in (8). Of course, we could have used the cosine in place of the sine, and expressed

$$x(t) = G\cos(\omega t + \psi) \tag{2.1}$$

instead. Derive formulas analogous to (10a,b), expressing $G$ and $\psi$ in terms of $C$ and $D$.

3. Apply the initial conditions (2) to the general solution (12), and solve for the integration constants $A$ and $B$ in terms of $m$, $c$, $\omega$, $x_0$, and $x_0'$.

4. Apply the initial conditions (2) to the general solution (14), and solve for the integration constants $A$ and $B$ in terms of $m$, $c$, $\omega$, $x_0$, and $x_0'$.
5. Consider an undamped harmonic oscillator governed by the equation $mx'' + kx = 0$, with initial conditions $x(0) = x_0$, $x'(0) = x_0'$. One might expect the frequency of oscillation to depend on the initial displacement $x_0$. Does it? Explain.

6. We mentioned in the text that the oscillation ceases altogether when $c$ is increased to $c_{cr}$, or beyond. Let us make that statement more precise: for $c \ge c_{cr}$, the graph of $x(t)$ has at most one "flat spot" (on $0 < t < \infty$), that is, where $x' = 0$.
(a) Prove that claim.
(b) Make up a case (i.e., give numerical values of $m$, $c$, $k$, $x_0$, $x_0'$) where there is no flat spot on $0 < t < \infty$.
(c) Make up a case where there is one flat spot on $0 < t < \infty$.

7. (Logarithmic decrement) For the underdamped case ($c < c_{cr}$), let $x_n$ and $x_{n+1}$ denote any two successive maxima of $x(t)$.
(a) Show that the ratio $r_n = x_n/x_{n+1}$ is a constant, say $r$; that is, $x_1/x_2 = x_2/x_3 = \cdots = r$.
(b) Further, show that the natural logarithm of $r$, called the logarithmic decrement $\delta$, is given by

$$\delta = \frac{2\pi c}{\sqrt{4mk - c^2}}.$$

8. (Grandfather clock) Consider a pendulum governed by the equation of motion $ml\theta'' + mg\sin\theta = 0$, or

$$\theta'' + \frac{g}{l}\sin\theta = 0, \tag{8.1}$$

where $g$ is the acceleration of gravity. (See the figure.) If $|\theta| \ll 1$ (where $\ll$ means much smaller than), then $\sin\theta \approx \theta$, and the nonlinear equation of motion (8.1) can be simplified to the linear equation

$$\theta'' + \frac{g}{l}\theta = 0, \tag{8.2}$$

or, if we allow for some inevitable amount of damping due to friction and air resistance,

$$\theta'' + \epsilon\theta' + \frac{g}{l}\theta = 0, \tag{8.3}$$

where $0 < \epsilon \ll 1$. Now imagine the pendulum to be part of a grandfather's clock. If a ratchet mechanism converts each oscillation to one second of recorded time, how does the clock maintain its accuracy even as it runs down, that is, even when its amplitude of oscillation has diminished to a small fraction of its initial value? Explain.

9. (Correction for the mass of the spring) Recall that our model of the mechanical oscillator neglects the effect of the mass of the spring on the grounds that it is sufficiently small compared to that of the mass $m$. In this exercise we seek to improve our model so as to account, if only approximately, for the mass of the spring. In doing so, we consider the undamped case, for which the equation of motion is $mx'' + kx = 0$.
(a) Multiplying that equation by $dx$ and integrating, derive the "first integral"

$$\frac{1}{2}mx'^2 + \frac{1}{2}kx^2 = C, \tag{9.1}$$

which states that the total energy, the kinetic energy of the mass plus the potential energy of the spring, is a constant.
(b) Let the mass of the spring be $m_s$. Suppose that the velocity of the elements within the spring at any time $t$ varies linearly from 0 at the fixed end to $x'(t)$ at its attachment to the mass $m$. Show, subject to that assumption, that the kinetic energy in the spring is $\frac{1}{6}m_s x'^2(t)$. Improving (9.1) to the form

$$\frac{1}{2}\left(m + \frac{1}{3}m_s\right)x'^2 + \frac{1}{2}kx^2 = C, \tag{9.2}$$

obtain, by differentiation with respect to $t$, the improved equation of motion

$$\left(m + \frac{1}{3}m_s\right)x'' + kx = 0. \tag{9.3}$$

Thus, as a correction, to take into account the mass of the spring, we merely replace the mass $m$ in $mx'' + kx = 0$ by an "effective mass" $m + \frac{1}{3}m_s$, which incorporates the effect of the mass of the spring. NOTE: This analysis is approximate in that it assumes the velocity distribution within the spring, whereas that distribution itself needs to be determined, which determination involves the solution of a partial differential equation of wave type, as studied in a later chapter.
(c) In obtaining an effective mass of the form $m + \alpha m_s$, why is it reasonable that $\alpha$ turns out to be less than 1?

10. (Piston oscillator) Let a piston of mass $m$ be placed at the midpoint of a closed cylinder of cross-sectional area $A$ and length $2L$, as sketched. Assume that the pressure $p$ on either side of the piston satisfies Boyle's law (namely, that the pressure times the volume is constant), and let $p_0$ be the pressure on both sides when $x = 0$.
(a) If the piston is disturbed from its equilibrium position $x = 0$, show that the governing equation of motion is

$$mx'' + 2p_0AL\,\frac{x}{L^2 - x^2} = 0. \tag{10.1}$$

(b) Is (10.1) linear or nonlinear? Explain.
(c) Expand the $x/(L^2 - x^2)$ term in a Taylor series about $x = 0$, up to the third-order term. Keeping only the leading term, derive the linearized version

$$mx'' + \frac{2p_0A}{L}x = 0 \tag{10.2}$$

of (10.1), which is restricted to the case of small oscillations, that is, where the amplitude of oscillation is small compared to $L$.
(d) From (10.2), determine the frequency of oscillation, in cycles per second.
(e) Is the resulting linearized model equivalent to the vibration of a mass/spring system, with an equivalent spring stiffness of $k_{eq} = 2p_0A/L$? Explain.

11. (Lateral vibration of a bead on a string) Consider a mass $m$, such as a bead, restrained by strings (of negligible mass), in each of which there is a tension $\tau_0$, as shown in Fig. a. We seek the frequency of small lateral oscillations of $m$. A lateral displacement $x$ (Fig. b) will cause the length of each string to increase from $l_0$ to $l(x) = \sqrt{l_0^2 + x^2}$. Suppose that the tension $\tau$ is found, empirically, to increase with $l$, from its initial value $\tau_0$, as shown in Fig. c.
(a) Show that the governing equation of lateral motion is

$$mx'' + 2\tau\!\left(\sqrt{l_0^2 + x^2}\right)\frac{x}{\sqrt{l_0^2 + x^2}} = 0, \tag{11.1}$$

where $\tau\!\left(\sqrt{l_0^2 + x^2}\right)$ is a function, not a product.
(b) Is (11.1) linear or nonlinear? Explain.
(c) Expand the $\tau\!\left(\sqrt{l_0^2 + x^2}\right)x/\sqrt{l_0^2 + x^2}$ term in a Taylor series about $x = 0$, up to the third-order term. [You should find that the coefficients of these terms involve $l_0$, $\tau_0$, and $\tau'(l_0)$.]
(d) Linearize the equation of motion by retaining only the leading term of that Taylor series, show that the equivalent spring stiffness is $k_{eq} = 2\tau_0/l_0$, and that the frequency of small oscillations is $\dfrac{1}{2\pi}\sqrt{\dfrac{2\tau_0}{ml_0}}$ cycles/sec.

12. (Oscillating platform) A uniform horizontal platform of mass $m$ is supported by counter-rotating cylinders a distance $L$ apart (see figure). The friction force $f$ exerted on the platform by each cylinder is proportional to the normal force $N$ between the platform and the cylinder, with constant of proportionality (coefficient of sliding friction) $\mu$: $f = \mu N$. Show that if the platform is disturbed from its equilibrium position ($x = 0$), then it will undergo a lateral oscillation of frequency $\omega = \sqrt{2\mu g/L}$ rad/sec, where $g$ is the acceleration of gravity. HINT: Derive the equation of motion governing the lateral displacement $x$ of the midpoint of the platform relative to a point midway between the cylinders.

13. (Tilted pendulum) Consider a rod of length $l$ with a point mass $m$ at its end, where the mass of the rod is negligible compared to $m$. The rod is welded at a right angle to another, which rotates without friction about an axis that is tilted by an angle of $\alpha$ with respect to the vertical (see figure). Let $\theta$ denote the angle of rotation of the pendulum, with respect to its equilibrium position (where $m$ is at its lowest possible point, namely, in the plane of the paper).
(a) Derive the governing equation of motion

$$\theta'' + \frac{g}{l}\sin\alpha\,\sin\theta = 0. \tag{13.1}$$

As a partial check of this result, observe that for $\alpha = \pi/2$ (13.1) does reduce to the equation of motion of the ordinary pendulum (see Exercise 8). HINT: Write down an equation of conservation of energy (kinetic plus potential energy equal a constant), and differentiate it with respect to the time $t$.
(b) What is the frequency of small amplitude oscillations, in rad/sec? In cycles/sec?

3.6 Solution of Homogeneous Equation: Nonconstant Coefficients

We return to the $n$th-order linear homogeneous equation

$$\frac{d^ny}{dx^n} + a_1(x)\frac{d^{n-1}y}{dx^{n-1}} + \cdots + a_{n-1}(x)\frac{dy}{dx} + a_n(x)y = 0. \tag{1}$$

The theory of the linear homogeneous equation, given in Section 3.3, holds whether the coefficients in (1) are constant or not. Finding solutions, however, is generally more difficult when the coefficients in (1) are not all constants.
Only in special cases are we able to find solutions in closed form; otherwise we seek power series solutions (Chapter 4) or pursue a numerical approach (Chapter 6).

3.6.1. Cauchy-Euler equation. If (1) is of the special form

$$x^n\frac{d^ny}{dx^n} + c_1x^{n-1}\frac{d^{n-1}y}{dx^{n-1}} + \cdots + c_{n-1}x\frac{dy}{dx} + c_ny = 0, \tag{2}$$

where the $c_j$'s are constants, it is called a Cauchy-Euler equation, and is also called an equidimensional equation. Of most importance to us will be the case where $n = 2$, so let us consider that case first, namely,

$$x^2y'' + c_1xy' + c_2y = 0, \tag{3}$$

and let us consider the $x$ interval to be $0 < x < \infty$; the case of negative $x$'s will be treated separately, below.

Suppose we try to solve (3) by seeking $y$ in the form $y = e^{\lambda x}$, where $\lambda$ is a yet-to-be-determined constant, which form proved successful for the constant-coefficient case. Then $y' = \lambda e^{\lambda x}$ and $y'' = \lambda^2e^{\lambda x}$, so (3) becomes

$$\lambda^2x^2e^{\lambda x} + \lambda c_1xe^{\lambda x} + c_2e^{\lambda x} = 0. \tag{4}$$

If we cancel the (nonzero) exponentials we obtain a quadratic equation for $\lambda$, solution of which gives $\lambda$ as a function of $x$. However, $\lambda$ was supposed to be a constant, so we have a contradiction, and the method fails. (Specifically, if $\lambda$ turns out to be a function of $x$, then $y' = \lambda e^{\lambda x}$ and $y'' = \lambda^2e^{\lambda x}$, above, were incorrect.) Said differently, the $x^2e^{\lambda x}$, $xe^{\lambda x}$, $e^{\lambda x}$ terms in (4) are LI and cannot be made to cancel identically to zero on any given $x$ interval. The reason we have discussed this fruitless approach is to emphasize that it is incorrect, and to caution against using it. By contrast, if the equation were of constant-coefficient type, say $y'' + c_1y' + c_2y = 0$, then $y = e^{\lambda x}$ would work because $y = e^{\lambda x}$, $y' = \lambda e^{\lambda x}$, $y'' = \lambda^2e^{\lambda x}$ are LD, so the combination $y'' + c_1y' + c_2y$ could be made to cancel to zero by suitable choice of $\lambda$.

Although the form $e^{\lambda x}$ will not work for Cauchy-Euler equations, the form

$$y = x^\lambda \tag{5}$$

will, because $y = x^\lambda$, $xy' = \lambda x^\lambda$, $x^2y'' = \lambda(\lambda - 1)x^\lambda, \ldots$ are LD since each is a constant times $x^\lambda$. Putting (5) into the second-order equation (3) gives

$$[\lambda(\lambda - 1) + c_1\lambda + c_2]\,x^\lambda = 0.$$

Since $x^\lambda \ne 0$, we require of $\lambda$ that $\lambda^2 - (1 - c_1)\lambda + c_2 = 0$, so

$$\lambda = \frac{1 - c_1 \pm \sqrt{(1 - c_1)^2 - 4c_2}}{2}. \tag{6}$$

We distinguish three cases, depending upon whether the discriminant $\Delta = (1 - c_1)^2 - 4c_2$ is positive, zero, or negative:

$\Delta > 0$: Distinct real roots. If $\Delta > 0$, then (6) gives two distinct real roots, say $\lambda_1$ and $\lambda_2$, so we have the general solution to (3) as

$$y(x) = Ax^{\lambda_1} + Bx^{\lambda_2}. \tag{7}$$

EXAMPLE 1. To solve

$$x^2y'' - 2xy' - 10y = 0, \tag{8}$$

seek $y = x^\lambda$. That form gives $\lambda^2 - 3\lambda - 10 = 0$, with roots $\lambda = -2$ and 5, so the general solution of (8) is

$$y(x) = \frac{A}{x^2} + Bx^5. \qquad\blacksquare$$

$\Delta = 0$: Repeated real roots. In this case (6) gives the repeated root $\lambda = \dfrac{1 - c_1}{2} \equiv \lambda_1$. Thus we have the solution $Ax^{\lambda_1}$, but are missing a second linearly independent solution, which is needed if we are to obtain a general solution of (3). Evidently, the missing solution is not of the form $x^\lambda$, or we would have found it when we sought $y(x) = x^\lambda$. To find the missing solution we use Lagrange's method of reduction of order, as we did in Section 3.3.3 for constant-coefficient differential equations with repeated roots of the characteristic equation. That is, we let $A$ vary, and seek

$$y(x) = A(x)x^{\lambda_1}. \tag{9}$$

Putting (9) into (3) gives (we leave the details to the reader, as Exercise 3) $xA'' + A' = 0$. Next, set $A' = p$, say, to reduce the order:

$$x\frac{dp}{dx} + p = 0, \tag{10}$$

so $p = D/x$ and $A(x) = D\ln x + C$, where $C$, $D$ are arbitrary constants (Exercise 4). Finally, putting the latter back into (9) gives the general solution of (3) as

$$y(x) = (C + D\ln x)\,x^{\lambda_1}. \tag{11}$$

EXAMPLE 2. To solve

$$x^2y'' + 7xy' + 9y = 0, \tag{12}$$

seek $y = x^\lambda$. That form gives $\lambda^2 + 6\lambda + 9 = 0$, with the repeated root $\lambda = -3$, so the general solution of (12) is $y(x) = (A + B\ln x)\,x^{-3}$. $\blacksquare$

$\Delta < 0$: Complex roots. In this case (6) gives the distinct complex-conjugate roots

$$\lambda = \frac{1 - c_1}{2} \pm i\,\frac{\sqrt{4c_2 - (1 - c_1)^2}}{2} \equiv \alpha \pm i\beta, \tag{13}$$

so we have the general solution of (3) as

$$y(x) = Ax^{\alpha + i\beta} + Bx^{\alpha - i\beta} = x^\alpha\left(Ax^{i\beta} + Bx^{-i\beta}\right). \tag{14}$$

However, since we normally prefer real solution forms, let us use the identity $u = e^{\ln u}$ to re-express (14) as*

$$\begin{aligned}
y(x) &= x^\alpha\left(Ae^{\ln x^{i\beta}} + Be^{\ln x^{-i\beta}}\right) = x^\alpha\left(Ae^{i\beta\ln x} + Be^{-i\beta\ln x}\right) \\
&= x^\alpha\left\{A\left[\cos(\beta\ln x) + i\sin(\beta\ln x)\right] + B\left[\cos(\beta\ln x) - i\sin(\beta\ln x)\right]\right\} \\
&= x^\alpha\left[(A + B)\cos(\beta\ln x) + i(A - B)\sin(\beta\ln x)\right].
\end{aligned} \tag{15}$$

Or, letting $A + B = C$ and $i(A - B) = D$,

$$y(x) = x^\alpha\left[C\cos(\beta\ln x) + D\sin(\beta\ln x)\right]. \tag{16}$$

*It is important to appreciate that the $x^{\alpha \pm i\beta}$ quantities, in (14), are "new objects" for us, for we have not yet (in this book) defined a real number $x$ raised to a complex power (unless $x$ happens to be $e$, in which case the already-discussed Euler's formulas apply). Staying as close as possible to familiar real variable results, let us write

$$x^{\alpha + i\beta} = x^\alpha x^{i\beta} = x^\alpha e^{\ln x^{i\beta}} = x^\alpha e^{i\beta\ln x}$$

and similarly for $x^{\alpha - i\beta}$. None of these three equalities is justifiable, since they rely on the formulas $x^{a+b} = x^ax^b$, $u = e^{\ln u}$, and $\ln x^c = c\ln x$, which assume $x$, $a$, $b$, $c$ to be real, but we hereby understand them to hold by definition. Observe that complex quantities and complex functions keep forcing themselves upon us. Therefore, it behooves us to establish a general theory of complex functions, rather than deal with these issues one by one. We will do exactly that, but not until much later in the text.

EXAMPLE 3. To solve

$$x^2y'' - 2xy' + 4y = 0, \tag{17}$$

seek $y = x^\lambda$. That form gives $\lambda^2 - 3\lambda + 4 = 0$, so $\lambda = \frac{3}{2} \pm i\frac{\sqrt7}{2}$. Hence

$$\begin{aligned}
y(x) &= Ax^{3/2 + i\sqrt7/2} + Bx^{3/2 - i\sqrt7/2} = x^{3/2}\left(Ax^{i\sqrt7/2} + Bx^{-i\sqrt7/2}\right) \\
&= x^{3/2}\left(Ae^{i\frac{\sqrt7}{2}\ln x} + Be^{-i\frac{\sqrt7}{2}\ln x}\right) \\
&= x^{3/2}\left[C\cos\left(\frac{\sqrt7}{2}\ln x\right) + D\sin\left(\frac{\sqrt7}{2}\ln x\right)\right]. \qquad\blacksquare
\end{aligned}$$

Recall that we have limited our discussion of (3) to the case where $x > 0$. The reason for that limitation is as follows. For a function $y(x)$ to be a solution of (3) on an $x$ interval $I$, we first of all need each of $y$, $y'$, and $y''$ to exist on $I$; that is, to be defined there. The function $\ln x$ and its derivatives are not defined at $x = 0$, nor is $\ln x$ defined (as a real-valued function) for $x < 0$. The functions $x^{\lambda_1}$, $x^{\lambda_2}$ in (7), $x^{\lambda_1}$ in (11), and $x^\alpha$ in (16) cause similar problems.
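The three cases for $x > 0$ can be sketched in a few lines of Python. This is only an illustrative check, not part of the text's method: the function names `cauchy_euler_case` and `residual`, the tolerance, and the finite-difference step are all assumptions made here. The first function evaluates the roots (6) and reports which case applies; the second estimates $x^2y'' + c_1xy' + c_2y$ numerically, so that a candidate solution can be tested.

```python
import cmath
import math


def cauchy_euler_case(c1, c2, tol=1e-12):
    """Classify x^2 y'' + c1 x y' + c2 y = 0 via the roots of equation (6).

    Returns (case, params): params is (lam1, lam2) for distinct real roots,
    (lam1,) for a repeated root, and (alpha, beta) for roots alpha +/- i*beta.
    """
    disc = (1.0 - c1) ** 2 - 4.0 * c2          # the discriminant Delta
    root = cmath.sqrt(disc)                    # complex sqrt handles Delta < 0
    lam_plus = ((1.0 - c1) + root) / 2.0
    lam_minus = ((1.0 - c1) - root) / 2.0
    if disc > tol:
        return "distinct real", (lam_plus.real, lam_minus.real)
    if disc < -tol:
        return "complex", (lam_plus.real, abs(lam_plus.imag))
    return "repeated real", (lam_plus.real,)


def residual(y, x, c1, c2, h=1e-4):
    """Central-difference estimate of x^2 y'' + c1 x y' + c2 y at the point x."""
    d1 = (y(x + h) - y(x - h)) / (2.0 * h)
    d2 = (y(x + h) - 2.0 * y(x) + y(x - h)) / (h * h)
    return x * x * d2 + c1 * x * d1 + c2 * y(x)


if __name__ == "__main__":
    print(cauchy_euler_case(-2.0, -10.0))   # Example 1
    print(cauchy_euler_case(7.0, 9.0))      # Example 2
    case, (alpha, beta) = cauchy_euler_case(-2.0, 4.0)   # Example 3
    y = lambda x: x ** alpha * math.cos(beta * math.log(x))
    print(case, alpha, beta, residual(y, 2.0, -2.0, 4.0))
```

For Example 3 the residual of $x^{3/2}\cos\left(\frac{\sqrt7}{2}\ln x\right)$ should come out near zero, limited only by finite-difference error.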
To deal with the case where $x < 0$, it is more convenient to make the change of variable $x = -\xi$ in (3), so that $\xi$ will be positive. Letting $y(x) = y(-\xi) \equiv Y(\xi)$,*

$$\frac{dy}{dx} = \frac{dY}{d\xi}\frac{d\xi}{dx} = -\frac{dY}{d\xi}, \qquad \frac{d^2y}{dx^2} = \frac{d}{d\xi}\left(-\frac{dY}{d\xi}\right)\frac{d\xi}{dx} = \frac{d^2Y}{d\xi^2},$$

so (3) becomes

$$\xi^2\frac{d^2Y}{d\xi^2} + c_1\xi\frac{dY}{d\xi} + c_2Y = 0, \qquad (\xi > 0)$$

which is the same as (3)! Thus, its solutions are the same, but with $x$ changed to $\xi$. For the case of distinct real roots, for instance, $y(x) = Ax^{\lambda_1} + Bx^{\lambda_2}$ for $x > 0$, and

$$y(x) = Y(\xi) = Y(-x) = A(-x)^{\lambda_1} + B(-x)^{\lambda_2}$$

for $x < 0$. Observe that both of these forms are accounted for by the single expression

$$y(x) = A|x|^{\lambda_1} + B|x|^{\lambda_2}.$$

Similarly for the other cases (repeated real roots and complex roots). Let us state these results, for reference, as a theorem.

*Why do we change the name of the dependent variable from $y$ to $Y$? Because they are different functions. To illustrate, suppose $y(x) = 5 + x^3$. Then $Y(\xi) = 5 + (-\xi)^3 = 5 - \xi^3$. For instance, if the argument of $y$ is 2, then $y$ is 13, but when the argument of $Y$ is 2, then $Y$ is $-3$.

THEOREM 3.6.1 Second-Order Cauchy-Euler Equation
The general solution of the second-order Cauchy-Euler equation

$$x^2y'' + c_1xy' + c_2y = 0, \tag{20}$$

on any $x$ interval not containing the origin, is

$$y(x) = \begin{cases}
A|x|^{\lambda_1} + B|x|^{\lambda_2} \\
(A + B\ln|x|)\,|x|^{\lambda_1} \\
|x|^\alpha\left[A\cos(\beta\ln|x|) + B\sin(\beta\ln|x|)\right]
\end{cases} \tag{21}$$

if the roots $\lambda_1$, $\lambda_2$ of $\lambda^2 + (c_1 - 1)\lambda + c_2 = 0$ are real and distinct, real and repeated, or complex ($\lambda = \alpha \pm i\beta$), respectively.

Of course, if the $x$ interval is to the right of the origin, then the absolute value signs in (21) can be dropped.

To close our discussion of the Cauchy-Euler equation, consider the higher-order case, $n > 2$. For simplicity, we consider $x > 0$; as for the second-order case treated above, $x < 0$ can be handled simply by changing all $x$'s in the solution to $|x|$.

EXAMPLE 4. Consider the third-order Cauchy-Euler equation

$$x^3y''' - 3x^2y'' + 7xy' - 8y = 0. \tag{22}$$

Seeking $y(x) = x^\lambda$ gives $\lambda^3 - 6\lambda^2 + 12\lambda - 8 = 0$, with the roots $\lambda = 2, 2, 2$. Thus we have the solution $y(x) = Ax^2$, but we need two more linearly independent solutions.
To find them, we use reduction of order and seek $y(x) = A(x)x^2$. Putting that form into (22) gives the equation

$$x^3A''' + 3x^2A'' + xA' = 0$$

on $A(x)$, which can be reduced to the second-order equation

$$x^2p'' + 3xp' + p = 0 \tag{23}$$

by letting $A' = p$. The latter is again of Cauchy-Euler type, and letting $p(x) = x^\lambda$ gives $\lambda = -1, -1$, so that

$$p(x) = (B + C\ln x)\frac{1}{x}.$$

Since $A' = p$,

$$A(x) = \int p\,dx = B\ln x + C(\ln x)^2 + D$$

(with the factor $\frac12$ from integrating $\ln x/x$ absorbed into the arbitrary constant $C$), and

$$y(x) = \left[C_1 + C_2\ln x + C_3(\ln x)^2\right]x^2 \tag{24}$$

is the desired general solution of (22). $\blacksquare$

Comparing the latter result with the solution (11) for the second-order Cauchy-Euler equation with repeated root $\lambda_1$, we might well suspect that if any Cauchy-Euler equation has a repeated root $\lambda_1$ of order $k$, then that root contributes the form

$$\left[C_1 + C_2\ln x + C_3(\ln x)^2 + \cdots + C_k(\ln x)^{k-1}\right]x^{\lambda_1} \tag{25}$$

to the general solution. We state, without proof, that that is indeed the case.

EXAMPLE 5. As a summary example, suppose that upon seeking solutions of a given eighth-order Cauchy-Euler equation in the form $y(x) = x^\lambda$ we obtain the roots $\lambda = -2.4,\ 1.7,\ 1.7,\ 1.7,\ -3 + 4i,\ -3 + 4i,\ -3 - 4i,\ -3 - 4i$. Then the general solution is

$$y(x) = C_1x^{-2.4} + \left[C_2 + C_3\ln x + C_4(\ln x)^2\right]x^{1.7} + (C_5 + C_6\ln x)\,x^{-3 + 4i} + (C_7 + C_8\ln x)\,x^{-3 - 4i}, \tag{26}$$

or (Exercise 5),

$$y(x) = C_1x^{-2.4} + \left[C_2 + C_3\ln x + C_4(\ln x)^2\right]x^{1.7} + \left\{\left[C_9\cos(4\ln x) + C_{10}\sin(4\ln x)\right] + \ln x\left[C_{11}\cos(4\ln x) + C_{12}\sin(4\ln x)\right]\right\}x^{-3}. \tag{27}$$

Although such high-order Cauchy-Euler equations are uncommon, we include this example to illustrate the general solution pattern for any order $n \ge 2$. $\blacksquare$

This concludes our discussion of the Cauchy-Euler equation. We will meet Cauchy-Euler equations again in the next chapter, in connection with power series solutions of differential equations, and again when we study the partial differential equations governing such phenomena as heat conduction, electric potential, and certain types of fluid flow, in later chapters.

3.6.2. Reduction of order.
(Optional) We have already used Lagrange's method of reduction of order to find "missing solutions," for constant-coefficient equations and for Cauchy-Euler equations as well. Here, we focus not on constant-coefficient or Cauchy-Euler equations, but on the method itself, and indicate its more general application to any linear homogeneous differential equation. For definiteness, consider the second-order case,

$$y'' + a_1(x)y' + a_2(x)y = 0. \tag{28}$$

Suppose that one solution is known to us, say $Y(x)$, and that a second linearly independent solution is sought. If $Y(x)$ is a solution, then so is $AY(x)$ for any constant $A$. The idea behind Lagrange's method of reduction of order is to seek the missing solution in the form

$$y(x) = A(x)Y(x), \tag{29}$$

where $A(x)$ is to be determined.

The method is similar to Lagrange's method of variation of parameters, introduced in Section 2.2, but its purpose is different. The latter was used to find the general solution of the nonhomogeneous equation $y' + p(x)y = q(x)$ from the solution $y_h(x) = Ae^{-\int p(x)\,dx}$ of the homogeneous equation $y' + p(x)y = 0$, by varying the parameter $A$ and seeking $y(x) = A(x)e^{-\int p(x)\,dx}$. Reduction of order is similar in that we vary the parameter $A$ in $y = AY(x)$, but different in that it is used to find a missing solution of a homogeneous equation from a known solution $Y(x)$ of that homogeneous equation.

We begin by emphasizing that at first glance the form (29) seems to be without promise. To explain that statement, observe that the search for a pair of lost glasses can be expected to be long and arduous if we merely know that they are somewhere in North America, but shorter and easier to whatever extent we are able to narrow the domain of the search. If, for instance, we know that they are somewhere on our desk, then the search is short and simple.
Likewise, when we solve a constant-coefficient equation by seeking $y$ in the form $e^{\lambda x}$, then the search is short and simple since, first, solutions will indeed be found within that subset and, second, because that subset is tiny compared to the set of all possible functions, just as one's desk is tiny compared to North America. Similarly, when we solve a Cauchy-Euler equation by seeking $y$ in the form $x^\lambda$. With this idea in mind, observe that the form (29) does not narrow our search in the slightest, since it includes all functions! That is, any given function $f(x)$ can be expressed as $A(x)Y(x)$ simply by choosing $A(x)$ to be $f(x)/Y(x)$.

Proceeding nonetheless, we put (29) into (28) and obtain the differential equation

$$A''Y + (2Y' + a_1Y)A' + (Y'' + a_1Y' + a_2Y)A = 0 \tag{30}$$

on $A(x)$. At first glance it appears that this differential equation on $A(x)$ is probably even harder than the original equation, (28), on $y(x)$. However, and this is the heart of Lagrange's idea, all of the undifferentiated $A$ terms must cancel, because if $A$ were a constant (in which case the $A'$ and $A''$ terms would all drop out), then the remaining terms would have to cancel to zero because $AY(x)$ is a solution of (28). Thus, the coefficient of $A$ in (30) is zero, so (30) becomes

$$A''Y + (2Y' + a_1Y)A' = 0, \tag{31}$$

the order of which can now be reduced from two to one by letting $A' = p$:

$$\frac{dp}{dx} + \left(\frac{2Y' + a_1Y}{Y}\right)p = 0. \tag{32}$$

Integrating the latter gives

$$p(x) = Be^{-\int\left(2\frac{Y'}{Y} + a_1\right)dx} = Be^{-2\ln Y - \int a_1\,dx} = BY(x)^{-2}e^{-\int a_1(x)\,dx}.$$

Finally, integration of $A' = p$ gives

$$A(x) = \int p(x)\,dx = B\int Y(x)^{-2}e^{-\int a_1(x)\,dx}\,dx + C,$$

so (29) becomes

$$y(x) = \left[B\int Y(x)^{-2}e^{-\int a_1(x)\,dx}\,dx + C\right]Y(x). \tag{33}$$

The $CY(x)$ term merely reproduces the solution that was already known; the missing solution is provided by the other term, $BY(x)\int Y(x)^{-2}e^{-\int a_1(x)\,dx}\,dx$. That this solution and the original solution $Y(x)$ are necessarily LI is left for Exercise 6.
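As a rough numerical illustration of (33), the sketch below evaluates the $B$ term by quadrature and compares it with a known answer. It rests on several assumptions made here, not taken from the text: both indefinite integrals are replaced by definite integrals starting at a hypothetical base point `x0` (which only shifts the arbitrary constants), Simpson's rule is used for the quadrature, and the helper names `simpson` and `missing_solution` are illustrative. The test case is Example 2 of Section 3.6.1 written in the form (28), with $Y(x) = x^{-3}$ and $a_1(x) = 7/x$, for which the missing solution is $x^{-3}\ln x$.

```python
import math


def simpson(f, a, b, n=200):
    """Composite Simpson's rule on [a, b] using n (even) subintervals."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def missing_solution(Y, a1, x, x0=1.0):
    """Evaluate Y(x) * integral_{x0}^{x} Y(s)^-2 * exp(-integral_{x0}^{s} a1) ds.

    This is the B-term of (33) with both indefinite integrals replaced by
    definite integrals starting at the (arbitrarily chosen) point x0.
    """
    def integrand(s):
        inner = simpson(a1, x0, s)              # integral of a1 from x0 to s
        return math.exp(-inner) / Y(s) ** 2
    return Y(x) * simpson(integrand, x0, x)


if __name__ == "__main__":
    # Example 2 of Section 3.6.1 in normalized form: y'' + (7/x) y' + (9/x^2) y = 0
    Y = lambda x: x ** -3.0          # known solution
    a1 = lambda x: 7.0 / x
    x = 2.0
    print(missing_solution(Y, a1, x), math.log(x) * x ** -3.0)
```

The two printed numbers should agree to many digits; with `x0 = 1` the quadrature reproduces $x^{-3}\ln x$ itself rather than that function plus a multiple of $Y$.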
Incidentally, the result (33) could also be written using definite integrals if one prefers, as

$$y(x) = \left[B\int_\alpha^x Y(\xi)^{-2}e^{-\int_\beta^\xi a_1(\eta)\,d\eta}\,d\xi + C\right]Y(x), \tag{34}$$

where the lower limits $\alpha$ and $\beta$ are arbitrary numbers, for the effect of changing $\alpha$ is simply to add some constant to the $\xi$ integral, and that constant times $B$ can be absorbed by the arbitrary constant $C$. Likewise, changing $\beta$ simply adds some constant, say $P$, to the $\eta$ integral, and the resulting $e^{-P}$ factor can be absorbed by the arbitrary constant $B$.

EXAMPLE 6. Legendre's equation. The equation

$$(1 - x^2)y'' - 2xy' + 2y = 0, \qquad (-1 < x < 1) \tag{35}$$

is known as Legendre's equation, after the French mathematician Adrien Marie Legendre (1752-1833). It is studied in Chapter 4, and used in later chapters when it arises in physical applications. Observe that (35) admits the simple solution $Y(x) = x$. To find a second solution, and hence a general solution, we can seek $y(x) = A(x)x$ and follow the steps outlined above. Rather, let us simply use the derived result (33). First, we divide (35) by $1 - x^2$ to reduce it to the form (28), so that we can identify $a_1(x)$. Thus, with $a_1(x) = -\dfrac{2x}{1 - x^2}$ and $Y(x) = x$, (33) gives

$$y(x) = \left[B\int\frac{e^{\int 2x\,dx/(1 - x^2)}}{x^2}\,dx + C\right]x = \left[B\int\frac{dx}{x^2(1 - x^2)} + C\right]x = \left[B\left(-\frac1x + \frac12\ln\frac{1 + x}{1 - x}\right) + C\right]x,$$

or, equivalently,

$$y(x) = C_1x + C_2\left(1 - \frac{x}{2}\ln\frac{1 + x}{1 - x}\right). \qquad\blacksquare$$

In this example we were able to evaluate the integrals that occurred. In other cases we may not be able to, even with the help of computer software, and may therefore need to leave the answer in integral form.

3.6.3. Factoring the operator. (Optional) We have been considering the $n$th-order linear homogeneous equation

$$L[y] = \left[\frac{d^n}{dx^n} + a_1(x)\frac{d^{n-1}}{dx^{n-1}} + \cdots + a_n(x)\right]y = 0, \tag{36}$$

or

$$\left(D^n + a_1D^{n-1} + \cdots + a_n\right)y = 0,$$

where $D = \dfrac{d}{dx}$, $D^2 = DD = \dfrac{d}{dx}\dfrac{d}{dx} = \dfrac{d^2}{dx^2}$, and so on.

Suppose, first, that (36) is of constant-coefficient type (i.e., each $a_j$ is a constant), and that the characteristic polynomial $\lambda^n + a_1\lambda^{n-1} + \cdots + a_n$ can be factored as $(\lambda - \lambda_1)(\lambda - \lambda_2)\cdots(\lambda - \lambda_n)$, where one or more of the roots $\lambda_j$ may be repeated. Then the differential operator $L = D^n + a_1D^{n-1} + \cdots + a_n$ can be factored in precisely the same way, as $(D - \lambda_1)(D - \lambda_2)\cdots(D - \lambda_n)$, where we understand $(D - \lambda_1)(D - \lambda_2)\cdots(D - \lambda_n)y$ to mean that first $y$ is operated on by $D - \lambda_n$, then the result of that step is operated on by $D - \lambda_{n-1}$, and so on. That is, we begin at the right and proceed to the left. Further, it is readily verified that the sequential order of the $D - \lambda_j$ factors is immaterial, that is, they commute. If $n = 2$, for instance,

$$(D - \lambda_1)(D - \lambda_2)y = (D - \lambda_1)(y' - \lambda_2y) = D(y' - \lambda_2y) - \lambda_1(y' - \lambda_2y) = y'' - (\lambda_1 + \lambda_2)y' + \lambda_1\lambda_2y$$

and

$$(D - \lambda_2)(D - \lambda_1)y = (D - \lambda_2)(y' - \lambda_1y) = D(y' - \lambda_1y) - \lambda_2(y' - \lambda_1y) = y'' - (\lambda_2 + \lambda_1)y' + \lambda_2\lambda_1y$$

are the same.

By factoring $L$, we are able to reduce the solution of

$$(D - \lambda_1)(D - \lambda_2)\cdots(D - \lambda_n)y = 0 \tag{37}$$

to the solution of a sequence of $n$ first-order equations, each of which is of the form

$$y' - py = q \quad\text{or}\quad (D - p)y = q, \tag{38}$$

where $p$ is a constant and $q(x)$ is known. From Section 2.2, we know that the solution of (38) is

$$y(x) = e^{px}\left(\int e^{-px}q(x)\,dx + A\right), \tag{39}$$

where $A$ is an arbitrary constant. Let us illustrate with an example.

EXAMPLE 7. The equation

$$y''' - 3y'' + 4y = 0 \tag{40}$$

admits the characteristic roots $\lambda = -1, 2, 2$, so we can factor (40) as

$$(D + 1)(D - 2)(D - 2)y = 0. \tag{41}$$

We begin the solution procedure by setting

$$(D - 2)(D - 2)y = u, \tag{42}$$

so that (41) becomes $(D + 1)u = 0$, with the solution $u(x) = Ae^{-x}$. Putting the latter into (42) gives

$$(D - 2)(D - 2)y = Ae^{-x}, \tag{43}$$

in which we set

$$(D - 2)y = v. \tag{44}$$

Then (43) becomes $(D - 2)v = Ae^{-x}$, with the solution

$$v(x) = e^{2x}\left(\int e^{-2x}Ae^{-x}\,dx + B\right) = -\frac{A}{3}e^{-x} + Be^{2x}.$$

Finally, putting the latter into (44) gives

$$(D - 2)y = -\frac{A}{3}e^{-x} + Be^{2x},$$

with the solution

$$y(x) = e^{2x}\left[\int e^{-2x}\left(-\frac{A}{3}e^{-x} + Be^{2x}\right)dx + C\right] = \frac{A}{9}e^{-x} + (Bx + C)e^{2x}$$

or, equivalently,

$$y(x) = C_1e^{-x} + (C_2 + C_3x)\,e^{2x},$$

which is the same solution as obtained by methods discussed in earlier sections.
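A short Python check of this constant-coefficient example may be helpful. It is a sketch under stated assumptions: the helper names `poly_mul`, `operator_coefficients`, and `residual` are illustrative, and the expansion treats $D$ like an ordinary algebraic symbol, which is legitimate only because every factor $D - \lambda_j$ here has a constant $\lambda_j$. The first two functions expand $(D + 1)(D - 2)(D - 2)$ back into $D^3 - 3D^2 + 4$; the third substitutes $y = C_1e^{-x} + (C_2 + C_3x)e^{2x}$ into (40) using derivatives worked out by hand.

```python
import math


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, highest power first."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def operator_coefficients(roots):
    """Expand (D - r1)(D - r2)...(D - rn) into coefficients of D^n, ..., D^0."""
    coeffs = [1.0]
    for r in roots:
        coeffs = poly_mul(coeffs, [1.0, -r])
    return coeffs


def residual(x, c1, c2, c3):
    """y''' - 3y'' + 4y for y = c1 e^{-x} + (c2 + c3 x) e^{2x}, exact derivatives."""
    em, ep = math.exp(-x), math.exp(2.0 * x)
    y = c1 * em + (c2 + c3 * x) * ep
    y2 = c1 * em + (4.0 * c2 + 4.0 * c3 + 4.0 * c3 * x) * ep      # y''
    y3 = -c1 * em + (8.0 * c2 + 12.0 * c3 + 8.0 * c3 * x) * ep    # y'''
    return y3 - 3.0 * y2 + 4.0 * y


if __name__ == "__main__":
    print(operator_coefficients([-1.0, 2.0, 2.0]))
    print(residual(0.3, 1.5, -2.0, 0.7))
```

The residual should vanish to rounding error for any choice of the three constants, since the exponential terms cancel identically.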
Notice, in particular, that the presence of the repeated root $\lambda = 2, 2$ presented no additional difficulty. $\blacksquare$

Although the factorization method reduces an $n$th-order equation to a sequence of $n$ first-order equations, it is quite different from the method of reduction of order described above in Section 3.6.2.

Thus far we have limited our discussion of factorization to the constant-coefficient case. The nonconstant-coefficient case is more difficult. To appreciate the difficulty, consider the equation

$$y'' - x^2y = (D^2 - x^2)y = 0. \tag{45}$$

If we can factor $D^2 - x^2$ as $(D + x)(D - x)$, then we can solve (45) by the method outlined above. However,

$$\begin{aligned}
(\underline{D} + x)(D - \underline{x})y &= (D + x)(y' - xy) = D(y' - xy) + x(y' - xy) \\
&= y'' - xy' - y + xy' - x^2y = y'' - (x^2 + 1)y,
\end{aligned} \tag{46}$$

so $(D + x)(D - x) = D^2 - (x^2 + 1)$ is not the same as $D^2 - x^2$. The problem is that the differential operator on the left-hand side of (46) acts not only on $y$ but also on itself, in the sense that an additional term is contributed to the final result, namely $-y$, through the action of the underlined $D$ on the underlined $x$. Observe, further, that $D + x$ and $D - x$ do not commute since $(D + x)(D - x) = D^2 - (x^2 + 1)$, whereas $(D - x)(D + x) = D^2 - (x^2 - 1)$.

Thus, the following practical question arises: given a nonconstant-coefficient operator, can it be factored and, if so, how? Limiting our attention to equations of second order (which, arguably, is the most important case in applications), suppose that $a_1(x)$ and $a_2(x)$ are given, and that we seek $a(x)$ and $b(x)$ so that

$$y'' + a_1(x)y' + a_2(x)y = [D - a(x)][D - b(x)]y. \tag{47}$$

Writing out the right-hand side,

$$y'' + a_1y' + a_2y = (D - a)(y' - by) = y'' - (a + b)y' + (ab - b')y. \tag{48}$$

Since this equation needs to hold for all (twice-differentiable) functions $y$, $a$ and $b$ must satisfy the conditions (Exercise 13)

$$a + b = -a_1, \tag{49a}$$
$$ab - b' = a_2, \tag{49b}$$

or, isolating $a$ and $b$ (Exercise 14),

$$a' = a^2 + a_1a + (a_2 - a_1'), \tag{50a}$$
$$b' = -b^2 - a_1b - a_2. \tag{50b}$$

Each of these equations is a special case of the nonlinear Riccati equation

$$y' = p(x)y^2 + q(x)y + r(x), \tag{51}$$

which was discussed in Exercise 11 of Section 2.2. Thus, from a global point of view, it is interesting to observe that the class of second-order equations with nonconstant coefficients is, in a sense, equivalent in difficulty to the class of nonlinear first-order Riccati equations. We saw, in Exercise 11 of Section 2.2, that in the exceptional case where a particular solution $Y(x)$ of (51) can be found, perhaps by inspection, the nonlinear equation (51) can be converted to the linear equation

$$v' + [2p(x)Y(x) + q(x)]v = -p(x) \tag{52}$$

by the change of variables

$$y = Y(x) + \frac{1}{v}. \tag{53}$$

Thus, just as we are able to solve the Riccati equation only in exceptional cases, we are able to factor second-order nonconstant-coefficient equations (and solve them readily) only in exceptional cases. In general, then, nonconstant-coefficient differential equations are hard in that we are unable to find closed form solutions.

EXAMPLE 8. Consider the equation

$$y'' - (x^2 + 1)y = 0. \tag{54}$$

Here $a_1(x) = 0$ and $a_2(x) = -(x^2 + 1)$, so (50a,b) are

$$a' = a^2 - x^2 - 1, \tag{55a}$$
$$b' = -b^2 + x^2 + 1. \tag{55b}$$

In this case we are lucky enough to notice the particular solution $a(x) = -x$ of (55a). Putting this result into (49a) then gives $b(x) = x$. [Equivalently, we could have noticed the particular solution $b(x) = x$ of (55b) and then obtained $a(x) = -x$ from (49a).] Thus, we have the factorization

$$y'' - (x^2 + 1)y = (D + x)(D - x)y = 0. \tag{56}$$

Proceeding as outlined above, we are able (Exercise 15) to derive the general solution

$$y(x) = Ae^{x^2/2} + Be^{x^2/2}\int e^{-x^2}\,dx. \tag{57}$$

Going one step further, suppose that initial conditions $y(0) = 0$ and $y'(0) = 1$ are prescribed and that we wish to evaluate $A$ and $B$. First, we re-express (57) in the equivalent and more convenient form

$$y(x) = Ae^{x^2/2} + Be^{x^2/2}\int_0^x e^{-\xi^2}\,d\xi. \tag{58}$$

We could have used any lower integration limit, but 0 will be most convenient because the initial conditions are at $x = 0$.
Then

$$y(0) = 0 = A, \qquad y'(0) = 1 = B,$$

where we have used the fundamental theorem of the calculus (Section 2.2) to differentiate the integral term. Thus, $A = 0$ and $B = 1$, so

$$y(x) = e^{x^2/2}\int_0^x e^{-\xi^2}\,d\xi \tag{59}$$

is the desired particular solution. $\blacksquare$

The integral in (59) is nonelementary in that it cannot be evaluated in closed form in terms of the elementary functions. But it arises often enough so that it has been used to define a new function, the so-called error function

$$\operatorname{erf}(x) = \frac{2}{\sqrt\pi}\int_0^x e^{-\xi^2}\,d\xi, \tag{60}$$

where the $2/\sqrt\pi$ is included to normalize $\operatorname{erf}(x)$ to unity as $x \to \infty$ since (as will be shown in a later chapter)

$$\int_0^\infty e^{-\xi^2}\,d\xi = \frac{\sqrt\pi}{2}. \tag{61}$$

The graph of $\operatorname{erf}(x)$ is shown in Fig. 1 for $x > 0$. For $x < 0$ we rely on the fact that $\operatorname{erf}(-x) = -\operatorname{erf}(x)$ (Exercise 18); for instance, $\operatorname{erf}(-\infty) = -\operatorname{erf}(\infty) = -1$.

Figure 1. The error function erf(x).

Since $e^{-\xi^2}$ is (to within a scale factor) a normal probability distribution, one way in which the error function arises is in the study of phenomena that are governed by normal distributions. For instance, we will encounter the error function when we study the movement of heat by the physical process of conduction.

Thus, our solution (59) can be re-expressed as $y(x) = \dfrac{\sqrt\pi}{2}e^{x^2/2}\operatorname{erf}(x)$. Just as we know the values of $\sin x$, its Taylor series, and its various properties, likewise we know the values of $\operatorname{erf}(x)$, its Taylor series, and its various properties, so we should feel comfortable with $\operatorname{erf}(x)$ and regard it henceforth as a known function. Though not included among the so-called "elementary functions," it is one of many "special functions" that are now available in the engineering science and mathematics literature.

Closure. We have seen, in this section, that nonconstant-coefficient equations can be solved in closed form only in exceptional cases. The most important of these is the case of the Cauchy-Euler equation

$$x^n\frac{d^ny}{dx^n} + c_1x^{n-1}\frac{d^{n-1}y}{dx^{n-1}} + \cdots + c_{n-1}x\frac{dy}{dx} + c_ny = 0.$$
Recall that a constant-coefficient equation necessarily admits at least one solution in the form e**, and that in the case of a repeated root of order k the solutions corresponding to that root can be found by reduction of order to be (Cy + Cotter + Cya*—!) e**, Analogously, a Cauchy—Euler equation necessarily admits at least one solution in the form 2%,and in the case of a repeatedroot of order & the solutions corresponding to that root can be found by reduction of order to be [Ci +Cglna+---+C,(In )*-}] x. In fact, it turns out that the connection between constant-coefficient equations and Cauchy —Euler equations is even closer than that in as much as any given Cauchy—Euler equation can be reduced to a constant-coefficient equation by a change of independent variable according to x = e’. Discussion of that point is reserved for the exercises. Beyond our striking success with the Cauchy—Euler equation, other successes for nonconstant-coefficient equations are few and far between. For instance, we might be able to obtain one solution by inspection and others, from it, by reduction of order. Or, we might, in exceptional cases, be successful in factoring the differential operatorbut, again, such successesare exceptional. Thus, otherlines of approach will be needed for nonconstant-coefficient equations, and they are developed in Chapters 4 and 6. EXERCISES 3.6 tionby seekingy(a) = x. That is, derivethesolution,rather (g) wy" + day! + 2y= 0; dition, find the particular solution corresponding to the initial conditions, if such conditions are given, and state the interval (i) dary” + By =0; y= (j) 22y" + ay’ + 4y =0 (a)cy’ +y=0 (k) ay! + 2ry' —2y = 0; HINT: Letz+2= (I) (@+ 2)*y"—-y=0 of validity of thatsolution. (b)zy’-y=0; (c) ay” + y' = 0 (d)ay” ~dy'=0; yG —2y'=0; (m)ay” —y" y(0) _ =0; y(2)=5 y(1)=0, (n) ny" y/(1)=3 (e)a*y" +ay'-Sy=0; y(2)=1, y/(2)=2 (0)2?y" + cy’ —Ky =0 (p)why+ay’—y=0 t. 132 ult - whereD acting on y(x) meansd/dz and D acting on Y(t) meansd/dt. 
(q)ay!” +2ey!—2y = 0 (r) x?yfa +ay"—y' (s) no yl” + Gay (t) atyll! + bar” (u) ety! (vy) ay! " “+ Tay’ - as ys 3a7y"" a 3024! + Gay!" _ Qyl! =0 a 0; y(1) _ say’ _ Bay! + dy =0 +y= 0 (b) The results (8.2) suggest that the formula a*D¥y = D(D—-1)---(D-k+Y, —_5, yl) =9") = y"(1)=0 (8.3) holds for all positive integers k. Prove that (8.3) is indeed 2. (a)—(v) For the corresponding problem in Exercise 1, use computer software to obtain a general solution, as well as a particular solution if initial conditions are given. 3. Putting (9) into (3), show that the equation cA” + A’ = 0 results, as claimed below equation (9). correct. HINT: Use mathematical induction. That is, assume that (8.3) holds for any given positive & and, by differentiating both sides with respect to x, show that at! pkt+1y—D(D—1)---(D—k)Y, (8.4) 4. Solve (10), and derive the general solution A(z) = Dlnz+ C' stated below equation (10). which is the same as (8.3) but with k changed to k + 1. Thus, 5. Fill in the steps between (26) and (27). (c)Finally,replacingeachx*D*y in (8.1)by thecorrespond- it must be true that (8.3) holds for all positive integers k. ing right-hand side of (8.3), state why the resulting differential on }(t) will be of constant-coefficient type. equation (34)]arenecessarilyLI. You mayassumethata;(2)is continuous on the x interval of interest. HINT: Recall the fundamen- 9. (Electric potential) The electric potential ® within an antal theorem of the calculus (given in Section 2.2). nular region such as a long metal pipe of inner radius ry and outer radius ro, satisfies the differential equation 7. It was stated in the Closure that any given Cauchy—Euler equation can be reduced to a constant-coefficient equation by the change of variables x = e'. In this exercise we ask you to (ry <r <re) try that idea for some specific cases; in the next exercise we ask 6. Prove that the two solutions within (33) [or, equivalently, for a generalproofof theitalicized claim. 
Let y(x(t)) = Y(#), Solve for the potential distribution ®(r) subject to these andlet y’ andY’ denotedy/dz anddY/dt, respectively. (a) Show that the change of variables « = e! reduces the boundary conditions: Cauchy—Euler equation 27y’’ —zy’ — 3y =0 to the constantcoefficient equation ¥"” — 2Y’ ~ 3Y = 0. Thus, show that 1b (1) =O, (rg)=% (a) Y(t) = Ae! + Be**. Since ¢ = Inz, showthaty(a) = Agw} + Ba, (b) Same as (a), for w*y" +ay' ~4y = 0. (c) Sameas (a), for 2”aw + ay’ +4y =0. (d) Same as (a), for 27y" + 3ay' + y = 0. to Exercise radius r, and outer radius r2, is governed by the differential du +e yen Ppn-t ob 7. Consider Cn—1 xD the gen- + Cr) Y = (). (8.1) whereD = d/dax.Leta = e’, anddefiney(x(t)) = Y(t). (a) Using chain differentiation, tDy = DY, az’D*y = D(D du "a—>77 +2— dp =0. eral Cauchy—Euler equation (2"D" 10. (Steady-state temperature distribution) The steady-state temperature distribution uwwithin a hollow sphere, of inner equation (e)Sameas(a),for x?y"”+ ry’ —9y = 0. (f) Sameas(a),for 27y" + y = 0. (g)Sameas(a),for 27y” + Qey’ —2y = 0. (h)Sameas (a),for 42*y’””— y = 0. 8. First, read the introduction (b) = (r1)=0, (ro) =o» Solve for u(r) subjectto theseboundaryconditions: (a)u(7;) =u, du, , (b) —(r1) = 3, dr ulre) = Ue u(re) = 0 d u uy, at" 2) =0 u(ry) (c) (r)==tu, o)u FOR OPTIONAL SECTION 3.6.2 show that EXERCISES — LY, 11. Use the given solution y;(z) of the differential equation to find the general solution by the method of reduction of order (leaving the second solution in integral form if necessary). a’ D*y = D(D —1)(D- 2)¥, (8.2) 3.7. Solution of Nonhomogeneous Equation — 133 ay” w(x) =2 +ay -y=0; (b)ay” + ay -y=0; (c)8xy"—azy'+y=0; (d)(a? ~ Ly” —2y =0; (a) ey! —Qay' + 2y = 0 yil(e)=2 (b)x*y”+ zy’ +9y = 0 (c)ay” +xy!—9y= 0 w(x) =2 (d)w?y" + Say! + 4y = 0 yi(x) = 27-1 (e)2y"+ay’—2y=0; yi(e)=a?+2 12. (a)~(e) Obtain a general solution of the corresponding differential equation in Exercise |1, using computer software. 
EXERCISES FOR OPTIONAL SECTION 3.6.3 18. From its integral definition, (60), show that erf(—z) = ~erf(x). 19. (Integral representations) The notion of an integral representation 13. State, clearly and convincingly, logic by which (49a,b) follow from (48). 14. Fill in the steps between (49a,b) and (50a,b). 15. Provide the steps that are missing between the equation (56) and its solution (57). Ing = 16. If a(x) and ag(x) are constants,then the factorization (47) should be simple. Show that the Riccati equations (50a,b) on a and 6 do indeed give the same results for a and b as can be obtained by more elementary means. 17. In general, the Riccati-type equations (50a,b) are hard. However, we should be able to solve them if the given as used in (60) to define the error of a function, function erf(x), might be unfamiliar to you. If so, it might help to point out that even the elementary functions can be introduced in that manner. For example, one can define the logarithm In x as [ dt —, 1 t («>0) (19.1) from which formula the values of Ing can be derived (by numerical integration), and its various properties derived as well. (a) To illustrate the latter claim, use (19.1) to derive the well nonconstant-coefficient equationy” + a(x)y’ + a2(x)y = 0 known property In z* = alnz of the logarithm. is a Cauchy—Euler equation because that case is simple. Thus, (b) Likewise, use (19.1) to derive the property In(zy) = use the method of factoring the operator for theseequations: Inz+Iny. case Ly] = f(x), (1) nonzeroforcing function f(a). dx ~~ dz +ce—+ka = Fit mae Tae Te 6) (2) governed by the equations dj ae at di 1, oa dE(t) a on thecurrentz(t), and #Q +R dQ +GQ=Blt) 1 Loz w(x) (4) on the charge ()(t) on thecapacitor, the forcing functions are the time derivative of the applied voltage£(t), andthe applied voltageE(t), respectively. 
As one more example, we give (without derivation) the differential equation $EI\dfrac{d^4 y}{dx^4} + ky = w(x)$ (5) governing the deflection $y(x)$ of a beam that rests upon an elastic foundation, under a load distribution $w(x)$ (i.e., load per unit $x$ length), as sketched in Fig. 1.

[Figure 1. Beam on elastic foundation.]

$E$, $I$, and $k$ are known physical constants: $E$ is the Young's modulus of the beam material, $I$ is the inertia of the beam's cross section, and $k$ is the spring stiffness per unit length (force per unit length per unit length) of the foundation. Thus, in this case the forcing function is $w(x)$, the applied load distribution. [Derivation of (5) involves the so-called Euler beam theory and is part of a first course in solid mechanics.]

3.7.1. General solution. Remember that the general solution of the homogeneous equation $L[y] = 0$ is a family of solutions that contains every solution of that equation, over the interval of interest. Likewise, by the general solution of the nonhomogeneous equation $L[y] = f$, we mean a family of solutions that contains every solution of that equation, over the interval of interest. Like virtually all of the concepts and methods developed in this chapter, the concepts that follow rest upon the assumed linearity of the differential equation (1), in particular, upon the fact that if $L$ is linear then $L[\alpha u(x) + \beta v(x)] = \alpha L[u(x)] + \beta L[v(x)]$ (6) for any two functions $u, v$ ($n$-times differentiable, of course, if $L$ is an $n$th-order operator) and any constants $\alpha, \beta$. Indeed, recall the analogous result for any number of functions: $L[\alpha_1 u_1(x) + \cdots + \alpha_k u_k(x)] = \alpha_1 L[u_1(x)] + \cdots + \alpha_k L[u_k(x)]$ (7) for any functions $u_1, \ldots, u_k$ and constants $\alpha_1, \ldots, \alpha_k$. To begin, we suppose that $y_h(x)$ is a general solution of the homogeneous version of (1), $L[y] = 0$, and that $y_p(x)$ is any particular solution of (1): $L[y_p(x)] = f(x)$. That is, $y_p(x)$ is any function which, when put into the left-hand side of (1), gives $f(x)$.
We will refer to $y_h(x)$ and $y_p(x)$ as homogeneous and particular solutions of (1), respectively. [Some authors write $y_c(x)$ in place of $y_h(x)$, and call it the complementary solution.]

THEOREM 3.7.1 General Solution of $L[y] = f$
If $y_h(x)$ and $y_p(x)$ are homogeneous and particular solutions of (1), respectively, on an interval $I$, then a general solution of (1), on $I$, is $y(x) = y_h(x) + y_p(x)$. (8)

Proof: That (8) satisfies (1) follows from the linearity of (1): $L[y_h(x) + y_p(x)] = L[y_h(x)] + L[y_p(x)] = 0 + f(x) = f(x)$, where the first equality follows from (6), with $\alpha = \beta = 1$, and $u, v$ equal to $y_h$ and $y_p$, respectively. To see that it is a general solution, let $y$ be any solution of (1). Again using the linearity of $L$, we have $L[y - y_p] = L[y] - L[y_p] = f - f = 0$, so that the most general $y - y_p$ is a linear combination of a fundamental set of solutions of the homogeneous version of (1), namely $y_h$. Hence $y = y_h + y_p$ is a general solution of (1). ■

Thus, to solve the nonhomogeneous equation (1) we need to augment the homogeneous solution $y_h(x)$ by adding to it any particular solution $y_p(x)$. Often, in applications, $f(x)$ is not a single term but a linear combination of terms: $f(x) = f_1(x) + \cdots + f_k(x)$. In the equation $L[y] = 5x^2 - 2\sin x + 6$, for instance, we can identify $f_1(x) = 5x^2$, $f_2(x) = -2\sin x$, and $f_3(x) = 6$.

THEOREM 3.7.2 General Solution of $L[y] = f_1 + \cdots + f_k$
If $y_h(x)$ is a general solution of $L[y] = 0$ on an interval $I$, and $y_{p1}(x), \ldots, y_{pk}(x)$ are particular solutions of $L[y] = f_1, \ldots, L[y] = f_k$ on $I$, respectively, then a general solution of $L[y] = f_1 + \cdots + f_k$ on $I$ is $y(x) = y_h(x) + y_{p1}(x) + \cdots + y_{pk}(x)$. (9)

Proof: That (9) satisfies (1) follows from (7), with all the $\alpha$'s equal to 1: $L[y_h + y_{p1} + \cdots + y_{pk}] = L[y_h] + L[y_{p1}] + \cdots + L[y_{pk}] = 0 + f_1 + \cdots + f_k$, (10) as was to be verified. To see that it is a general solution of (1), let $y$ be any solution of (1). Then $L[y - y_{p1} - \cdots - y_{pk}] = L[y] - L[y_{p1}] - \cdots - L[y_{pk}] = f - f_1 - \cdots - f_k = 0$, so the most general $y - y_{p1} - \cdots - y_{pk}$ is a general solution $y_h$ of the homogeneous version of (1). ■
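As a small worked illustration of Theorem 3.7.2, the block below revisits the forcing function $5x^2 - 2\sin x + 6$ mentioned above. The text leaves $L$ unspecified there, so the choice $L[y] = y'' + 4y$ is purely an assumption made for this sketch; each line can be checked by direct substitution.

```latex
% Assumption for illustration only: L[y] = y'' + 4y (not specified in the text).
\begin{align*}
f_1 = 5x^2:      &\quad y_{p1} = \tfrac{5}{4}x^2 - \tfrac{5}{8},
                 &\quad L[y_{p1}] &= \tfrac{5}{2} + 5x^2 - \tfrac{5}{2} = 5x^2, \\
f_2 = -2\sin x:  &\quad y_{p2} = -\tfrac{2}{3}\sin x,
                 &\quad L[y_{p2}] &= \tfrac{2}{3}\sin x - \tfrac{8}{3}\sin x = -2\sin x, \\
f_3 = 6:         &\quad y_{p3} = \tfrac{3}{2},
                 &\quad L[y_{p3}] &= 0 + 6 = 6.
\end{align*}
% With y_h = A\cos 2x + B\sin 2x, formula (9) then gives
% y = A\cos 2x + B\sin 2x + \tfrac{5}{4}x^2 - \tfrac{5}{8} - \tfrac{2}{3}\sin x + \tfrac{3}{2}.
```

Under that assumed operator, the three particular solutions are found one forcing term at a time and simply added, which is the point of (9).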
This result is a superposition principle. It tells us that the response $y_p$ to a superposition of inputs (the forcing functions $f_1, \ldots, f_k$) is the superposition of their individual outputs ($y_{p1}, \ldots, y_{pk}$). The upshot is that to solve a nonhomogeneous equation we need to find both the homogeneous solution $y_h$ and a particular solution $y_p$. Having already developed methods for determining $y_h$ (at least for certain types of equations), we now need to present methods for determining particular solutions $y_p$, and that is the subject of Sections 3.7.2 and 3.7.3 that follow.

3.7.2. Undetermined coefficients. The method of undetermined coefficients is a procedure for determining a particular solution to the linear equation $L[y] = f(x) = f_1(x) + \cdots + f_k(x)$, (11) subject to two conditions:
(i) Besides being linear, $L$ is of constant-coefficient type.
(ii) Repeated differentiation of each $f_j(x)$ term in (11) produces only a finite number of LI (linearly independent) terms.

To explain the latter condition, consider $f_j(x) = 2xe^{-x}$. The sequence consisting of this term and its successive derivatives is $2xe^{-x} \to \{2xe^{-x},\ 2e^{-x} - 2xe^{-x},\ -4e^{-x} + 2xe^{-x},\ \ldots\}$, and we can see that this sequence contains only the two LI functions $e^{-x}$ and $xe^{-x}$. Thus, $f_j(x) = 2xe^{-x}$ satisfies condition (ii). As a second example, consider $f_j(x) = x^2$. This term generates the sequence $x^2 \to \{x^2,\ 2x,\ 2,\ 0,\ 0,\ \ldots\}$, which contains only the three LI functions $x^2$, $x$, and $1$. Thus, $f_j(x) = x^2$ satisfies condition (ii). The term $f_j(x) = 1/x$, however, generates the sequence $1/x \to \{1/x,\ -1/x^2,\ 2/x^3,\ -6/x^4,\ \ldots\}$, which contains an infinite number of LI terms ($1/x$, $1/x^2$, $1/x^3$, ...). Thus, $f_j(x) = 1/x$ does not satisfy condition (ii). If the term $f_j(x)$ does satisfy condition (ii), then we will call the finite set of LI terms generated by it, through repeated differentiation, the family generated by $f_j(x)$. (That terminology is for convenience here and is not standard.)
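The test in condition (ii) can be mimicked mechanically for forcing terms of the form $x^m e^{ax}$ with integer $m$. The following is only a sketch under that restriction: the pair `(m, a)` stands for $x^m e^{ax}$, and the function names and the cutoff `max_size` are hypothetical choices made here, not part of the text.

```python
def derivative_terms(term):
    """Basis terms appearing in d/dx of x**m * exp(a*x), where term = (m, a)."""
    m, a = term
    terms = set()
    if m != 0:
        terms.add((m - 1, a))  # contribution m * x**(m-1) * exp(a*x)
    if a != 0:
        terms.add((m, a))      # contribution a * x**m * exp(a*x)
    return terms


def generated_family(term, max_size=50):
    """Collect the terms reached by repeated differentiation of x**m * exp(a*x).

    Returns the finite family as a set of (m, a) pairs, or None if more than
    max_size distinct terms appear (taken here as a sign that condition (ii)
    fails; the cutoff is an assumption of this sketch).
    """
    family = {term}
    pending = [term]
    while pending:
        for new in derivative_terms(pending.pop()):
            if new not in family:
                family.add(new)
                pending.append(new)
        if len(family) > max_size:
            return None
    return family


print(generated_family((1, -1)))  # 2x e^{-x}: the terms x e^{-x} and e^{-x}
print(generated_family((2, 0)))   # x^2: the terms x^2, x, 1
print(generated_family((-1, 0)))  # 1/x: no finite family is found
```

Constant multipliers are ignored because only linear independence matters; terms such as $\sin 2x$ would need complex values of `a` and are outside this sketch.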
Thus, the family generated by $2xe^{-x}$ is comprised of $e^{-x}$ and $xe^{-x}$, and the family generated by $3x^2$ is comprised of $x^2$, $x$, and $1$. Let us now illustrate the method of undetermined coefficients.

EXAMPLE 1. Consider the differential equation $y'''' - y'' = 3x^2 - \sin 2x$. (12) First, we see that $L$ is linear, with constant coefficients, so condition (i) is satisfied. Next, we identify $f_1(x)$, $f_2(x)$, and their generated sequences as $f_1(x) = 3x^2 \to \{3x^2,\ 6x,\ 6,\ 0,\ 0,\ \ldots\}$, (13a) $f_2(x) = -\sin 2x \to \{-\sin 2x,\ -2\cos 2x,\ 4\sin 2x,\ \ldots\}$. (13b) Thus, $f_1$ and $f_2$ do generate the finite families $f_1(x) = 3x^2 \to \{x^2,\ x,\ 1\}$, (14a) $f_2(x) = -\sin 2x \to \{\sin 2x,\ \cos 2x\}$, (14b) so condition (ii) is satisfied. To find a particular solution $y_{p1}$ corresponding to $f_1$, tentatively seek it as a linear combination of the terms in (14a): $y_{p1}(x) = Ax^2 + Bx + C$, (15) where $A, B, C$ are the so-called undetermined coefficients. Next, we write down the homogeneous solution of (12), $y_h(x) = C_1 + C_2 x + C_3 e^{x} + C_4 e^{-x}$, (16) and check each term in $y_{p1}$ [i.e., in (15)] for duplication with terms in $y_h$. Doing so, we find that the $Bx$ and $C$ terms in (15) duplicate (to within constant scale factors) the $C_2 x$ and $C_1$ terms, respectively, in (16). The method then calls for multiplying the entire family involved in the duplication by the lowest positive integer power of $x$ needed to eliminate all such duplication. Thus, we revise (15) as $y_{p1}(x) = x(Ax^2 + Bx + C) = Ax^3 + Bx^2 + Cx$, (17) but find that the $Cx$ term in (17) is still "in violation" in that it duplicates the $C_2 x$ term in (16). Thus, try $y_{p1}(x) = x^2(Ax^2 + Bx + C) = Ax^4 + Bx^3 + Cx^2$. (18) This time we are done, since all duplication has now been eliminated. Next, we put the final revised form (18) into the equation $y'''' - y'' = 3x^2$ [i.e., $L[y] = f_1(x)$] and obtain $24A - 12Ax^2 - 6Bx - 2C = 3x^2$. (19) Finally, equating coefficients of like terms gives $x^2$: $-12A = 3$; $x$: $-6B = 0$; $1$: $24A - 2C = 0$, (20) so that $A = -1/4$, $B = 0$, $C = -3$.
Thus $y_{p1}(x) = -\dfrac{1}{4}x^4 - 3x^2$. (21) Next, we need to find $y_{p2}$ corresponding to $f_2$. To do so, we seek it as a linear combination of the terms in (14b): $y_{p2}(x) = D\sin 2x + E\cos 2x$. (22) Checking each term in (22) for duplication with terms in $y_h$, we see that there is no such duplication. Thus, we accept the form (22), put it into the equation $y'''' - y'' = -\sin 2x$ [i.e., $L[y] = f_2(x)$], and obtain $20D\sin 2x + 20E\cos 2x = -\sin 2x$. (23) Equating coefficients of like terms gives $20D = -1$ and $20E = 0$, so that $D = -1/20$ and $E = 0$. Thus, $y_{p2}(x) = -\dfrac{1}{20}\sin 2x$. (24) Finally, a general solution of (12) is, according to Theorem 3.7.2, $y(x) = y_h(x) + y_p(x) = y_h(x) + y_{p1}(x) + y_{p2}(x)$, namely, $y(x) = C_1 + C_2 x + C_3 e^{x} + C_4 e^{-x} - \dfrac{1}{4}x^4 - 3x^2 - \dfrac{1}{20}\sin 2x$. (25)

COMMENT 1. We obtained (20) by "equating coefficients of like terms" in (19). That step amounted to using the concept of linear independence, namely, noting that $1, x, x^2$ are LI (on any given $x$ interval) and then using Theorem 3.2.6. Alternatively, we could have rewritten (19) as $(24A - 2C)\cdot 1 + (-6B)x + (-12A - 3)x^2 = 0$ (26) and used the linear independence of $1, x, x^2$ to infer that the coefficient of each term must be zero, which step once again gives (20).

COMMENT 2. The key point in the analysis is that the system (20), consisting of three linear algebraic equations in the three unknowns $A, B, C$, is consistent. Similarly for the system $20D = -1$, $20E = 0$ governing $D$ and $E$. The guarantee provided by the method is that the resulting system of equations for the undetermined coefficients will be consistent. Had we retained the unrevised form (15), for instance, then in place of (19) we would have obtained $-2A = 3x^2$, that is, $x^2$: $0 = 3$; $x$: $0 = 0$; $1$: $-2A = 0$, which system is inconsistent. ■

Let us summarize.

STEPS IN THE METHOD OF UNDETERMINED COEFFICIENTS:
1. Verify that condition (i) is satisfied.
2. Identify the $f_j(x)$'s and verify that each one satisfies condition (ii).
3. Determine the finite family generated by each $f_j(x)$ [(14a,b) in Example 1].
4. Seek $y_{p1}(x)$, tentatively, as a linear combination of the terms in the family corresponding to $f_1(x)$ [(15) in Example 1].
5. Obtain the homogeneous solution $y_h(x)$ [(16) in Example 1].
6. If any of the terms in $y_{p1}(x)$ duplicate any of those in $y_h(x)$, multiply the entire family by the lowest positive integer power of $x$ necessary to remove all such duplication between those terms and the terms in $y_h$ [(18) in Example 1].
7. Put the final form of $y_{p1}(x)$ into the left-hand side of the equation $L[y] = f_1$, and equate coefficients of like terms.
8. Solve the resulting system of linear algebraic equations for the undetermined coefficients. That step completes our determination of $y_{p1}(x)$.
9. Repeat steps 4 to 8 for $y_{p2}, \ldots, y_{pk}$.
10. Then the general solution of $L[y] = f_1 + \cdots + f_k$ is given, according to Theorem 3.7.2, by $y(x) = y_h(x) + y_p(x) = y_h(x) + y_{p1}(x) + \cdots + y_{pk}(x)$.

EXAMPLE 2. Consider the equation $y'' - 9y = 4 + 5\sinh 3x$, (27) which is indeed linear and of constant-coefficient type. Since $f_1(x) = 4 \to \{4,\ 0,\ 0,\ \ldots\}$ and $f_2(x) = 5\sinh 3x \to \{5\sinh 3x,\ 15\cosh 3x,\ 45\sinh 3x,\ \ldots\}$, we see that these terms generate the finite families $f_1(x) = 4 \to \{1\}$ and $f_2(x) = 5\sinh 3x \to \{\sinh 3x,\ \cosh 3x\}$, so we tentatively seek $y_{p1}(x) = A$. (28) Since $y_h(x) = C_1 e^{3x} + C_2 e^{-3x}$, (29) there is no duplication between the term in (28) and those in (29). Putting (28) into $y'' - 9y = 4$ gives $-9A = 4$, so $A = -4/9$. Thus, $y_{p1}(x) = -\dfrac{4}{9}$. (30) Next, we tentatively seek $y_{p2}(x) = B\sinh 3x + C\cosh 3x$. (31) At first glance, it appears that there is no duplication between any of the terms in (31) and those in (29). However, since $\sinh 3x$ and $\cosh 3x$ are linear combinations of $e^{3x}$ and $e^{-3x}$, we do indeed have duplication. Said differently, each of the $\sinh 3x$ and $\cosh 3x$ terms is a solution of the homogeneous equation. Thus, we need to multiply the right-hand side of (31) by $x$ and revise $y_{p2}$ as $y_{p2}(x) = x(B\sinh 3x + C\cosh 3x)$. (32) Now that $y_{p2}$ is in a satisfactory form we put that form into $y'' - 9y = 5\sinh 3x$ [i.e., $L[y] = f_2(x)$] and obtain the equation $(3C + 3C)\sinh 3x + (3B + 3B)\cosh 3x + (9B - 9B)x\sinh 3x + (9C - 9C)x\cosh 3x = 5\sinh 3x$. Equating coefficients of like terms gives $6C = 5$ and $6B = 0$, so $B = 0$ and $C = 5/6$, and $y_{p2}(x) = \dfrac{5}{6}x\cosh 3x$. (33) It follows then, from Theorem 3.7.2, that a general solution of (27) is $y(x) = C_1 e^{3x} + C_2 e^{-3x} - \dfrac{4}{9} + \dfrac{5}{6}x\cosh 3x$. (34) Naturally, one's final result can (and should) be checked by direct substitution into the original differential equation.

COMMENT. Suppose that in addition to the differential equation (27), initial conditions $y(0) = 0$, $y'(0) = 2$ are specified. Imposing these conditions on (34) gives $C_1 + C_2 = 4/9$ and $3(C_1 - C_2) + 5/6 = 2$, so $C_1 = 5/12$, $C_2 = 1/36$, and hence the particular solution $y(x) = \dfrac{5}{12}e^{3x} + \dfrac{1}{36}e^{-3x} - \dfrac{4}{9} + \dfrac{5}{6}x\cosh 3x$.
(35) Do not be concerned that we call (35) a particular solution even though each of the two exponential terms in (35) is a homogeneous solution: if we put (35) into the left-hand side of (27) it does give the right-hand side of (27); thus, it is a particular solution of (27). ■

As a word of caution, suppose the differential equation is $y'' - 3y' + 2y = 2\sinh x$, with homogeneous solution $C_1 e^{x} + C_2 e^{2x}$. Observe that $2\sinh x = e^{x} - e^{-x}$ contains an $e^{x}$ term, which corresponds to one of the homogeneous solutions. To bring this duplication into the light, we should re-express the differential equation as $y'' - 3y' + 2y = e^{x} - e^{-x}$ before beginning the method of undetermined coefficients. Then, the particular solution due to $f_1(x) = e^{x}$ will be $y_{p1}(x) = Axe^{x}$ and the assumed particular solution due to $f_2(x) = -e^{-x}$ will be $y_{p2}(x) = Be^{-x}$. We find that $A = -1$ and $B = -1/6$, so the general solution is $y(x) = C_1 e^{x} + C_2 e^{2x} - xe^{x} - \dfrac{1}{6}e^{-x}$.

Closing this discussion of the method of undetermined coefficients, let us reconsider condition (ii), that repeated differentiation of each term in the forcing function must produce only a finite number of LI terms. How broad is the class of functions that satisfy that condition? If a forcing function $f$ satisfies that condition, then it must be true that coefficients $a_j$ exist, not all of them zero, such that $a_0 f^{(N)} + a_1 f^{(N-1)} + \cdots + a_{N-1}f' + a_N f = 0$ (36) over the $x$ interval under consideration. From our discussion of the solution of such constant-coefficient equations we know that solutions $f$ of (36) must be of the form $Cx^m e^{(\alpha + i\beta)x}$ or a linear combination of such terms. Such functions are so common in applications that condition (ii) is not as restrictive as it may seem.

3.7.3. Variation of parameters. Although easy to apply, the method of undetermined coefficients is limited by the two conditions cited above: that $L$ be of constant-coefficient type, and that repeated differentiation of each $f_j(x)$ forcing term produces only a finite number of LI terms.
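Whichever method produces it, a candidate particular solution can be checked by substitution. The sketch below does this numerically for the cautionary equation $y'' - 3y' + 2y = 2\sinh x$ discussed above, using central finite differences in place of exact derivatives. The step size `h`, the sample points, and the function names are assumptions of this sketch, and a small residual is only supporting evidence, not a proof.

```python
import math


def y_p(x):
    """Candidate from the text: -x e^x - (1/6) e^{-x}."""
    return -x * math.exp(x) - math.exp(-x) / 6.0


def y_unrevised(x):
    """Same form but without the factor x on the e^x term (duplicates y_h)."""
    return -math.exp(x) - math.exp(-x) / 6.0


def residual(y, x, h=1e-3):
    """Approximate y'' - 3y' + 2y - 2 sinh x using central differences."""
    d1 = (y(x + h) - y(x - h)) / (2.0 * h)
    d2 = (y(x + h) - 2.0 * y(x) + y(x - h)) / (h * h)
    return d2 - 3.0 * d1 + 2.0 * y(x) - 2.0 * math.sinh(x)


points = (-1.0, 0.0, 0.5, 1.0, 2.0)
max_residual = max(abs(residual(y_p, x)) for x in points)
bad_residual = abs(residual(y_unrevised, 1.0))

print(max_residual)  # small: limited by the finite-difference error
print(bad_residual)  # not small: the unrevised form fails
```

The second candidate is included only to show what a failed check looks like: its $e^x$ term is annihilated by the operator, so the $e^x$ part of the forcing function is left unmatched.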
The method of variation of parameters, due to Lagrange, is more powerful in that it is not subject to those restrictions. As with automobile engines, we can expect more power to come at a higher price and, as we shall see, Lagrange's method is indeed the more difficult to apply. In fact, we have already presented the method, in Section 2.2, for the general linear first-order equation $y' + p(x)y = q(x)$, (37) and we urge you to review that discussion. The idea was to seek a particular solution $y_p$ by varying the parameter $A$ (i.e., the constant of integration) in the homogeneous solution $y_h(x) = Ae^{-\int p(x)\,dx}$. Thus, we sought $y_p(x) = A(x)e^{-\int p(x)\,dx}$, put that form into (37), and solved for $A(x)$. Likewise, if an $n$th-order linear differential equation $L[y] = f$ has a homogeneous solution $y_h(x) = C_1 y_1(x) + \cdots + C_n y_n(x)$, (38) then according to the method of variation of parameters we seek a particular solution in the form $y_p(x) = C_1(x)y_1(x) + \cdots + C_n(x)y_n(x)$; (39) that is, we "vary the parameters" (constants of integration) $C_1, \ldots, C_n$ in (38).

Let us carry out the procedure for the linear second-order equation $L[y] = y'' + p_1(x)y' + p_2(x)y = f(x)$, (40) where the coefficients $p_1(x)$ and $p_2(x)$ are assumed to be continuous on the $x$ interval of interest, say $I$. We suppose that $y_h(x) = C_1 y_1(x) + C_2 y_2(x)$ (41) is a known general solution of the homogeneous equation on $I$, and seek $y_p(x) = C_1(x)y_1(x) + C_2(x)y_2(x)$. (42) Needing $y_p'$ and $y_p''$ to substitute into (40), we differentiate (42): $y_p' = C_1 y_1' + C_2 y_2' + C_1' y_1 + C_2' y_2$. (43) Looking ahead, $y_p''$ will include $C_1, C_2, C_1', C_2', C_1'', C_2''$ terms, so that (40) will become a nonhomogeneous second-order differential equation in $C_1$ and $C_2$, which can hardly be expected to be simpler than the original equation (40)! However, it will be only one equation in the two unknowns $C_1, C_2$, so we are free to impose another condition on $C_1, C_2$ to complete, and simplify, the system. An especially convenient condition to impose will be $C_1' y_1 + C_2' y_2 = 0$, (44) for this condition will knock out the $C_1', C_2'$ terms in (43), so that $y_p''$ will contain only first-order derivatives of $C_1$ and $C_2$. Then (43) reduces to $y_p' = C_1 y_1' + C_2 y_2'$, (45) so $y_p'' = C_1 y_1'' + C_2 y_2'' + C_1' y_1' + C_2' y_2'$, and (40) becomes $C_1(y_1'' + p_1 y_1' + p_2 y_1) + C_2(y_2'' + p_1 y_2' + p_2 y_2) + C_1' y_1' + C_2' y_2' = f$. (46) The two parenthetic groups vanish by virtue of $y_1$ and $y_2$ being solutions of the homogeneous equation $L[y] = 0$, so (46) simplifies to $C_1' y_1' + C_2' y_2' = f$. That result, together with (44), gives us the equations $y_1 C_1' + y_2 C_2' = 0$, $y_1' C_1' + y_2' C_2' = f$ (47) on $C_1', C_2'$. The latter will be solvable for $C_1', C_2'$, uniquely, if the determinant of the coefficients does not vanish on $I$. In fact, we recognize that determinant as the Wronskian of $y_1$ and $y_2$, $W[y_1, y_2](x) = \begin{vmatrix} y_1(x) & y_2(x) \\ y_1'(x) & y_2'(x) \end{vmatrix}$, and the latter is necessarily nonzero on $I$ by Theorem 3.2.3 because $y_1$ and $y_2$ are LI solutions of $L[y] = 0$. Solving (47) by Cramer's rule gives $C_1'(x) = \dfrac{\begin{vmatrix} 0 & y_2 \\ f & y_2' \end{vmatrix}}{\begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix}} = \dfrac{W_1(x)}{W(x)}$, $C_2'(x) = \dfrac{\begin{vmatrix} y_1 & 0 \\ y_1' & f \end{vmatrix}}{\begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix}} = \dfrac{W_2(x)}{W(x)}$, where $W_1, W_2$ simply denote the determinants in the numerators. Integrating these equations and putting the results into (42) gives $y_p(x) = \left[\int \dfrac{W_1(x)}{W(x)}\,dx\right]y_1(x) + \left[\int \dfrac{W_2(x)}{W(x)}\,dx\right]y_2(x)$, (49) or, more compactly, $y_p(x) = \int^x \dfrac{W_1(\xi)y_1(x) + W_2(\xi)y_2(x)}{W(\xi)}\,d\xi$. (50)

EXAMPLE 3. To solve $y'' - 4y = 8e^{2x}$, (51) we note the general solution $y_h(x) = C_1 e^{2x} + C_2 e^{-2x}$ (52) of the homogeneous equation, so that we may take $y_1(x) = e^{2x}$, $y_2(x) = e^{-2x}$. Then $W(x) = y_1 y_2' - y_1' y_2 = -4$, $W_1(x) = -f(x)y_2(x) = -8e^{2x}e^{-2x} = -8$, and $W_2(x) = f(x)y_1(x) = 8e^{2x}e^{2x} = 8e^{4x}$, so (49) gives $y_p(x) = \left(\int 2\,dx\right)e^{2x} + \left(\int -2e^{4x}\,dx\right)e^{-2x} = (2x + A)e^{2x} + \left(-\dfrac{e^{4x}}{2} + B\right)e^{-2x} = 2xe^{2x} - \dfrac{e^{2x}}{2} + Ae^{2x} + Be^{-2x}$, (53) where $A, B$ are the arbitrary constants of integration.
We can omit the $A, B$ terms in (53) because they give terms ($Ae^{2x}$ and $Be^{-2x}$) that merely duplicate those already present in the homogeneous solution $y_h$. That will always be the case: we can omit the constants of integration in the two integrals in (49). In the present example we can even drop the $-e^{2x}/2$ term in the right side of (53) since it too is a homogeneous solution [and can be absorbed in the $C_1 e^{2x}$ term in (55)]. Thus, we write $y_p(x) = 2xe^{2x}$. (54) Finally, $y(x) = y_h(x) + y_p(x) = C_1 e^{2x} + C_2 e^{-2x} + 2xe^{2x}$ (55) gives a general solution of (51). ■

3.7.4. Variation of parameters for higher-order equations. (Optional) For higher-order equations the idea is essentially the same. For the third-order equation $L[y] = y''' + p_1(x)y'' + p_2(x)y' + p_3(x)y = f(x)$, (56) for instance, if $y_h(x) = C_1 y_1(x) + C_2 y_2(x) + C_3 y_3(x)$ (57) is known, then we seek $y_p(x) = C_1(x)y_1(x) + C_2(x)y_2(x) + C_3(x)y_3(x)$. (58) Looking ahead, when we put (58) into (56) we will have one equation in the three unknown functions $C_1, C_2, C_3$, so we can impose two additional conditions. Differentiating (58) gives $y_p' = C_1 y_1' + C_2 y_2' + C_3 y_3' + C_1' y_1 + C_2' y_2 + C_3' y_3$, (59) so we set $C_1' y_1 + C_2' y_2 + C_3' y_3 = 0$ (60) to suppress higher-order derivatives of $C_1, C_2, C_3$. Then (59) reduces to $y_p' = C_1 y_1' + C_2 y_2' + C_3 y_3'$, (61) and another differentiation gives $y_p'' = C_1 y_1'' + C_2 y_2'' + C_3 y_3'' + C_1' y_1' + C_2' y_2' + C_3' y_3'$. (62) Again, to suppress higher-order derivatives of $C_1, C_2, C_3$, set $C_1' y_1' + C_2' y_2' + C_3' y_3' = 0$. (63) Then (62) reduces to $y_p'' = C_1 y_1'' + C_2 y_2'' + C_3 y_3''$, (64) so $y_p''' = C_1 y_1''' + C_2 y_2''' + C_3 y_3''' + C_1' y_1'' + C_2' y_2'' + C_3' y_3''$. (65) Finally, putting (65), (64), (61), and (58) into (56) gives $C_1(y_1''' + p_1 y_1'' + p_2 y_1' + p_3 y_1) + C_2(y_2''' + p_1 y_2'' + p_2 y_2' + p_3 y_2) + C_3(y_3''' + p_1 y_3'' + p_2 y_3' + p_3 y_3) + C_1' y_1'' + C_2' y_2'' + C_3' y_3'' = f$, (66) or $C_1' y_1'' + C_2' y_2'' + C_3' y_3'' = f$ (67) since each of the three parenthetic groups in (66) vanishes because $y_1, y_2, y_3$ are homogeneous solutions.

Equations (60), (63), and (67) are three linear algebraic equations on $C_1', C_2', C_3'$: $y_1 C_1' + y_2 C_2' + y_3 C_3' = 0$, $y_1' C_1' + y_2' C_2' + y_3' C_3' = 0$, $y_1'' C_1' + y_2'' C_2' + y_3'' C_3' = f$. (68) The determinant of the coefficients is the Wronskian $W[y_1, y_2, y_3](x) = \begin{vmatrix} y_1 & y_2 & y_3 \\ y_1' & y_2' & y_3' \\ y_1'' & y_2'' & y_3'' \end{vmatrix}$, (69) which is nonzero on the interval $I$ because $y_1, y_2, y_3$ are LI on $I$ by the assumption that (57) is a general solution of the homogeneous equation $L[y] = 0$. Solving (68) by means of Cramer's rule gives $C_1' = \dfrac{1}{W(x)}\begin{vmatrix} 0 & y_2 & y_3 \\ 0 & y_2' & y_3' \\ f & y_2'' & y_3'' \end{vmatrix} = \dfrac{W_1(x)}{W(x)}$, $C_2' = \dfrac{1}{W(x)}\begin{vmatrix} y_1 & 0 & y_3 \\ y_1' & 0 & y_3' \\ y_1'' & f & y_3'' \end{vmatrix} = \dfrac{W_2(x)}{W(x)}$, $C_3' = \dfrac{1}{W(x)}\begin{vmatrix} y_1 & y_2 & 0 \\ y_1' & y_2' & 0 \\ y_1'' & y_2'' & f \end{vmatrix} = \dfrac{W_3(x)}{W(x)}$. (70) Finally, integrating these equations and putting the results into (58) gives the particular solution $y_p(x) = \left[\int \dfrac{W_1(x)}{W(x)}\,dx\right]y_1(x) + \left[\int \dfrac{W_2(x)}{W(x)}\,dx\right]y_2(x) + \left[\int \dfrac{W_3(x)}{W(x)}\,dx\right]y_3(x)$. (71)

EXAMPLE 4. Consider the nonhomogeneous Cauchy-Euler equation $x^3 y''' + x^2 y'' - 2xy' + 2y = \dfrac{2}{x}$ $(0 < x < \infty)$. (72) Observe that we cannot use the method of undetermined coefficients in this case because the differential operator is not of constant-coefficient type, and also because the forcing function does not generate only a finite number of linearly independent derivatives. To use (71), we need to know $y_1, y_2, y_3$, and $f$. It is readily found that the homogeneous solution is $y_h(x) = C_1\dfrac{1}{x} + C_2 x + C_3 x^2$, (73) so we can take $y_1 = 1/x$, $y_2 = x$, $y_3 = x^2$. But be careful: $f(x)$ is not $2/x$ because (72) is not yet in the form of (56). That is, (72) must be divided by $x^3$ so that the coefficient of $y'''$ becomes 1, as in (56). Doing so, it follows that $f(x) = 2/x^4$. We can now evaluate the determinants needed in (71): $W(x) = \begin{vmatrix} x^{-1} & x & x^2 \\ -x^{-2} & 1 & 2x \\ 2x^{-3} & 0 & 2 \end{vmatrix} = \dfrac{6}{x}$, $W_1(x) = \begin{vmatrix} 0 & x & x^2 \\ 0 & 1 & 2x \\ 2x^{-4} & 0 & 2 \end{vmatrix} = \dfrac{2}{x^2}$, $W_2(x) = \begin{vmatrix} x^{-1} & 0 & x^2 \\ -x^{-2} & 0 & 2x \\ 2x^{-3} & 2x^{-4} & 2 \end{vmatrix} = -\dfrac{6}{x^4}$, $W_3(x) = \begin{vmatrix} x^{-1} & x & 0 \\ -x^{-2} & 1 & 0 \\ 2x^{-3} & 0 & 2x^{-4} \end{vmatrix} = \dfrac{4}{x^5}$, (74) so (71) gives $y_p(x) = \left(\int \dfrac{1}{3x}\,dx\right)\dfrac{1}{x} + \left(\int -\dfrac{1}{x^3}\,dx\right)x + \left(\int \dfrac{2}{3x^4}\,dx\right)x^2 = \dfrac{\ln x}{3x} + \dfrac{1}{2x} - \dfrac{2}{9x} = \dfrac{\ln x}{3x} + \dfrac{5}{18x}$. (75) The $5/(18x)$ term can be dropped because it is a homogeneous solution, so $y_p(x) = \dfrac{\ln x}{3x}$, (76) as can be verified by direct substitution into (72). ■

Generalization of the method to the $n$th-order equation $y^{(n)} + p_1(x)y^{(n-1)} + \cdots + p_{n-1}(x)y'$
+Pn(z)y =f(a) (77) is straightforward and the result is Yp(«)= if Wie ag yi(z) to + if Wie is Yn(@), (78) where the y;’s are n LI homogeneous solutions, W is the Wronskian of yi,.-.; Yns and W; is identical to W, but with the jth column replaced by a column of zeros except for the bottom element, which is f. Closure. In this section we have discussed the nonhomogeneous equation L[{y]= f, where L is an nth-order linear differential operator. In Section 3.7.1 we provided the theoretical framework, which was based upon the linearity of L. Specifically, Theorem 3.7.1 showed that the general solution of Lily] = f can be formed as the sum of a homogeneous solution yp(x), and a particularsolutionyp(x); y,(x) is a generalsolutionof L[y] = 0, so it containsthe n constants of integration, and yp»(x)is any particular solution of the full equation Lly| = f. Further, we showed that if f is broken down as f = fi +-::+ fr, 148 ..,Ypk Corresponding to fy, ... respectively. i Sh tionsfj (a). EXERCISES 3.7 1. Show whether or not the given forcing function satisfies condition (ii), below equation (11). If so, givea finite family of LI functions generated by it. (a) 2? cosx (c)lnz (b) cos z sinh 2z (d)2? Ine (f) = (e)sina/a (h)(x —1)/(x+ 2) (g)e®* (i) tan x (j) e®cos 3x (k) we7* sinh x (1)cos x cos 22 (m) sin zsin 2z sin 3x (n) e*/(x +1) 2. Obtain a general solution using the method of undetermined coefficients. 4, Obtain a general solution using the method of variation of parameters. (a)y!+2y= 4e?* (b)y’-y= ae? +1 (c)cy’ -y= 23 (d)ay’+y=I1/e (e) °y' +a27y=1 (x>0) (x >0) (hy —y= 8« (g)y” —y = 8e* (h)y"”~2y'+y = 62? (i) y —2Qy’+ y = Qe (Dy +y =4sing (k)y"”+4y!+4y= 207?" (1)6y” — 5y’ +y = 2? (x >0) (n)27y" —vy! —3y = 4a (a <0) (m) xy" + zy! Co)y"+y" (p) yl! (g)y" —y!= 5sin2x (h) y” +y! = 4ve* + 3sinaz _ by!" -y' + _ Any =] -y=x ly’ _ 6y = eit (i) y” + y = 3sin 22 — 5 + 2a? 5. (a)—(p) Use computer software to solve the corresponding problem in Exercise 4. (k) y" +y = 6cosz +2 6. 
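In the method of variation of parameters we used indefinite integrals [in (49) and (78)]. However, we could use definite integrals in those formulas, instead, if we choose. Specifically, show that, in place of (49), $y_p(x) = \left[\int_{a_1}^{x} \dfrac{W_1(\xi)}{W(\xi)}\,d\xi\right]y_1(x) + \left[\int_{a_2}^{x} \dfrac{W_2(\xi)}{W(\xi)}\,d\xi\right]y_2(x)$ is also correct, for any choice of the constants $a_1, a_2$ (although normally one would choose $a_1$ and $a_2$ to be the same). (m)y” —2y' + y = ae Q@y"+y' -%&=23-e* (1) y"" 4 2y! —_ ae + 4e2t (n) y” —4y = 5(cosh2x ~ x) (0)y" _ y! = 2ret (p) y/” ~ y! = 25cos 2¢ (q) yy!"~ y” = 6a + 2coshz + y” —Qy = 327 ~1 (ry (s)yy" —y = 5(a+ cosz) 3. (a)-(s) Use computer software to solve the corresponding problem in Exercise 2. work, Cry + Coy2 = 6,

3.8 Application to Harmonic Oscillator: Forced Oscillation

The free oscillation of the harmonic oscillator (Fig. 1) was studied in Section 3.5. Now that we know how to find particular solutions, we can return to the harmonic oscillator and consider the case of forced oscillations, governed by the second-order, linear, constant-coefficient, nonhomogeneous equation $mx'' + cx' + kx = f(t)$. (1)

[Figure 1. Mechanical oscillator.]

In particular, we consider the important case where the forcing function is harmonic, $f(t) = F_0\cos\Omega t$. (2)

3.8.1. Undamped case. To begin, consider the undamped case ($c = 0$), $mx'' + kx = F_0\cos\Omega t$. (3) The homogeneous solution of (3) is $x_h(t) = A\cos\omega t + B\sin\omega t$, (4) where $\omega = \sqrt{k/m}$ is the natural frequency (i.e., the frequency of the free oscillation), and the forcing function $F_0\cos\Omega t$ generates the family $\{\cos\Omega t, \sin\Omega t\}$. Thus, to find a particular solution of (3) by the method of undetermined coefficients, seek $x_p(t) = C\cos\Omega t + D\sin\Omega t$. (5) Two cases present themselves. In the generic case, the driving frequency $\Omega$ is different from the natural frequency $\omega$, so the terms in (5) do not duplicate any of those in (4) and we can accept (5) without modification. In the exceptional, or "singular," case where $\Omega$ is equal to $\omega$, the terms in (5) repeat those in (4), so we need to modify (5) by multiplying the right side of (5) by $t$. For reasons that will become clear below, these cases are known as nonresonance and resonance, respectively.

Nonresonant oscillation. Putting (5) into the left side of (3) gives $(\omega^2 - \Omega^2)C\cos\Omega t + (\omega^2 - \Omega^2)D\sin\Omega t = \dfrac{F_0}{m}\cos\Omega t$. (6) Since $\Omega \neq \omega$ by assumption, it follows from (6), by equating the coefficients of $\cos\Omega t$ and $\sin\Omega t$ on the left and right sides, that $C = (F_0/m)/(\omega^2 - \Omega^2)$ and $D = 0$. Thus $x_p(t) = \dfrac{F_0/m}{\omega^2 - \Omega^2}\cos\Omega t$, (7) so a general solution of (3) is $x(t) = x_h(t) + x_p(t) = A\cos\omega t + B\sin\omega t + \dfrac{F_0/m}{\omega^2 - \Omega^2}\cos\Omega t$. (8) In a sense we are done, and if we wish to impose any prescribed initial conditions $x(0)$ and $x'(0)$, then we could use those conditions to evaluate the constants $A$ and $B$ in (8). Then, for any desired numerical values of $m$, $k$, $F_0$, and $\Omega$ we could plot $x(t)$ versus $t$ and see what the solution looks like. However, in science and engineering one is interested not only in obtaining answers, but also in understanding phenomena, so the question is: How can we extract, from (8), an understanding of the phenomenon?

To answer that question, let us first rewrite (8) in the equivalent form $x(t) = E\sin(\omega t + \phi) + \dfrac{F_0/m}{\omega^2 - \Omega^2}\cos\Omega t$ (9) since then we can see it more clearly as a superposition of two harmonic solutions, of different amplitude, frequency, and phase. The homogeneous solution $E\sin(\omega t + \phi)$ in (9), the "free vibration," was already discussed in Section 3.5. [Alternative to $E\sin(\omega t + \phi)$, we could use the form $E\cos(\omega t + \phi)$, whichever one prefers; it doesn't matter.] Thus, consider the particular solution, or "forced response," given by (7) and the last term in (9). It is natural to regard $m$ and $k$ (and hence $\omega$) as fixed, and $F_0$ and $\Omega$ as controllable quantities or parameters. That the response (7) is merely proportional to $F_0$ is no surprise, for it follows from the linearity of the differential operator in (3). We also see, from (7), that the response is at the same frequency as the forcing function, $\Omega$. More interesting is the variation of the amplitude $(F_0/m)/(\omega^2 - \Omega^2)$ with $\Omega$, which is sketched in Fig. 2.

[Figure 2. Magnitude of response (undamped case).]

The change in sign, as $\Omega$ increases through $\omega$, is awkward since it prevents us from interpreting the plotted quantity as a pure magnitude. Thus, let us re-express (7) in the equivalent form $x_p(t) = \dfrac{F_0/m}{|\omega^2 - \Omega^2|}\cos(\Omega t + \Phi)$, (10) where the phase angle $\Phi$ is $0$ for $\Omega < \omega$ and $\pi$ for $\Omega > \omega$ [since $\cos(\Omega t + \pi) = -\cos\Omega t$ gives the desired sign change for $\Omega > \omega$]. The resulting amplitude- and phase-response curves are shown in Fig. 3.

[Figure 3. (a) Amplitude. (b) Phase.]

From Fig. 3a, observe that as the driving frequency approaches the natural frequency the amplitude tends to infinity! [Of course, we must remember that right at $\Omega = \omega$ our particular solution is invalid since (6) is then $(0)\cos\Omega t + (0)\sin\Omega t = (F_0/m)\cos\Omega t$, which cannot be satisfied.] Further, as $\Omega \to \infty$ the amplitude tends to zero. Finally, we see from Fig. 3b that the response is in-phase ($\Phi = 0$) with the forcing function for $\Omega < \omega$, but for all $\Omega > \omega$ it is $180°$ out-of-phase. This discontinuous jump is striking since only an infinitesimal change in $\Omega$ (from just below $\omega$ to just above it) produces a discontinuous change in the response. Also of considerable interest phenomenologically is the possibility of what is known as beats, but we will postpone that discussion until we have had a look at the special case of resonance.

Resonant oscillation. For the special case where $\Omega = \omega$ (that is, where we force the system precisely at its natural frequency), the terms in (5) duplicate those in (4) so, according to the method of undetermined coefficients, we need to revise $x_p$ as $x_p(t) = t(C\cos\omega t + D\sin\omega t)$. (11) Since the duplication has thereby been removed, we accept (11). Putting that form into (3), we find that $C = 0$ and $D = F_0/(2m\omega)$, so $x_p(t) = \dfrac{F_0}{2m\omega}\,t\sin\omega t$. (12)

Beats.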
We also see, from (7), that the response is at the same frequency as the forcing function, $\Omega$. More interesting is the variation of the amplitude $(F_0/m)/(\omega^2 - \Omega^2)$ with $\Omega$, which is sketched in Fig. 2. The change in sign, as $\Omega$ increases through $\omega$, is awkward since it prevents us from interpreting the plotted quantity as a pure magnitude. Thus, let us re-express (7) in the equivalent form

$x_p(t) = \frac{F_0/m}{|\omega^2 - \Omega^2|}\cos(\Omega t + \Phi),$ (10)

where the phase angle $\Phi$ is 0 for $\Omega < \omega$ and $\pi$ for $\Omega > \omega$ [since $\cos(\Omega t + \pi) = -\cos\Omega t$ gives the desired sign change for $\Omega > \omega$]. The resulting amplitude- and phase-response curves are shown in Fig. 3.

Figure 2. Magnitude of response (undamped case).

Figure 3. Amplitude- and phase-response curves (undamped case): (a) Amplitude; (b) Phase.

From Fig. 3a, observe that as the driving frequency approaches the natural frequency the amplitude tends to infinity! [Of course, we must remember that right at $\Omega = \omega$ our particular solution is invalid since (6) is then $(0)\cos\Omega t + (0)\sin\Omega t = (F_0/m)\cos\Omega t$, which cannot be satisfied.] Further, as $\Omega \to \infty$ the amplitude tends to zero. Finally, we see from Fig. 3b that the response is in-phase ($\Phi = 0$) with the forcing function for $\Omega < \omega$, but for all $\Omega > \omega$ it is 180° out-of-phase. This discontinuous jump is striking since only an infinitesimal change in $\Omega$ (from just below $\omega$ to just above it) produces a discontinuous change in the response. Also of considerable interest phenomenologically is the possibility of what is known as beats, but we will postpone that discussion until we have had a look at the special case of resonance.

Resonant oscillation. For the special case where $\Omega = \omega$ (that is, where we force the system precisely at its natural frequency), the terms in (5) duplicate those in (4) so, according to the method of undetermined coefficients, we need to revise $x_p$ as

$x_p(t) = t\,(C\cos\omega t + D\sin\omega t).$ (11)

Since the duplication has thereby been removed, we accept (11). Putting that form into (3), we find that $C = 0$ and $D = F_0/(2m\omega)$, so

$x_p(t) = \frac{F_0}{2m\omega}\,t\sin\omega t,$ (12)

which is shown in Fig. 4.

Figure 4. Resonant oscillation.

In this special case the response is not a harmonic oscillation but a harmonic function times $t$, which factor causes the magnitude to tend to infinity as $t \to \infty$. This result is known as resonance. Of course, the magnitude does not grow unboundedly in a real application since the mathematical model of the system (the governing differential equation) will become inaccurate for sufficiently large amplitudes, parts will break, and so on. Resonance is sometimes welcome and sometimes unwelcome. That is, sometimes we wish to amplify a given input, and can do so by "tuning" the system to be at or near resonance, as when we tune a radio circuit to a desired broadcast frequency. And other times we wish to suppress inputs, as a well designed automobile suspension suppresses, rather than amplifies, the inputs from a bumpy road.

Beats. Isn't it striking that the response $x(t)$ is the sum of two harmonics [given by (5)] for all $\Omega \ne \omega$, yet it is of the different form (12) for the single case $\Omega = \omega$? One might wonder whether the resonant case is really of any importance at all since one can never get $\Omega$ to exactly equal $\omega$. It is therefore of interest to look at the solution $x(t)$ as $\Omega$ approaches $\omega$. To do so, let us use the simple initial conditions $x(0) = 0$ and $x'(0) = 0$, for definiteness, in which case we can evaluate $A$ and $B$ in (8), and obtain

$x(t) = \frac{F_0/m}{\Omega^2 - \omega^2}\,(\cos\omega t - \cos\Omega t);$ (13)

or, recalling the trigonometric identity $\cos A - \cos B = 2\sin\frac{B+A}{2}\,\sin\frac{B-A}{2}$,

$x(t) = \frac{2F_0/m}{\omega^2 - \Omega^2}\,\sin\!\left(\frac{\omega + \Omega}{2}\right)t\;\sin\!\left(\frac{\omega - \Omega}{2}\right)t.$ (14)

Now, suppose that $\Omega$ is close to (but not equal to) the natural frequency $\omega$.
Then the frequency of the second sinusoid in (14) is very small compared to that of the first, so the $\sin\left(\frac{\omega - \Omega}{2}\right)t$ factor amounts, essentially, to a slow "amplitude modulation" of the relatively high frequency $\sin\left(\frac{\omega + \Omega}{2}\right)t$ factor. This phenomenon is known as beats, and is seen in Fig. 5, where we have plotted the solution (14) for four representative cases: in Fig. 5a $\Omega$ is not close to $\omega$, and there is no discernible beat phenomenon, but as $\Omega$ is increased the beat phenomenon becomes well established, as seen in Fig. 5b, 5c, and 5d. [We have shown the "envelope" $\sin\left(\frac{\omega - \Omega}{2}\right)t$ as dotted.]

We can now see that the resonance phenomenon at $\Omega = \omega$ is not an isolated behavior but is a limiting case as $\Omega \to \omega$. That is, resonance (Fig. 4) is actually a limit of the sequence shown in Fig. 5, as $\Omega \to \omega$. Rather than depend only on these suggestive graphical results, we can proceed analytically as well. Specifically, we can take the limit of the response (13) as $\Omega \to \omega$ and, with the help of l'Hôpital's rule, we do obtain (12)!

With our foregoing discussion of the undamped forced harmonic oscillator in mind, we cannot overstate that we are by no means dealing only with the solving of equations but with the phenomena thereby being described. To understand phenomena, we normally need to do several things: we do need to solve the equations that model the phenomena (analytically or, if that is too hard, numerically), but we also need to study, interpret, and understand the results. Such study normally includes the generation of suitably chosen graphical displays (such as our Fig. 2, 3, and 4), the isolation of special cases [such as our initial consideration of the case where there is no damping; $c = 0$ in (1)], and perhaps the examination of various limiting cases (such as the limit $\Omega \to \omega$ in the present example).
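The limit $\Omega \to \omega$ discussed above can also be checked numerically. The following is a minimal sketch, not part of the original text: the function names, the parameter values $\omega = F_0 = m = 1$, and the time window $0 \le t \le 20$ are all illustrative assumptions. It evaluates the response (13), its product form (14), and the resonant response (12), and reports how far (13) is from (12) as $\Omega$ is moved toward $\omega$.

```python
import math

def x_beats(t, Omega, omega=1.0, F0=1.0, m=1.0):
    """Response (13) for x(0) = x'(0) = 0; requires Omega != omega."""
    return (F0 / m) / (Omega**2 - omega**2) * (math.cos(omega * t) - math.cos(Omega * t))

def x_beats_product(t, Omega, omega=1.0, F0=1.0, m=1.0):
    """Product form (14): fast sinusoid times slow 'envelope' sinusoid."""
    fast = math.sin(0.5 * (omega + Omega) * t)
    slow = math.sin(0.5 * (omega - Omega) * t)
    return (2.0 * F0 / m) / (omega**2 - Omega**2) * fast * slow

def x_resonant(t, omega=1.0, F0=1.0, m=1.0):
    """Resonant response (12): a harmonic function times t."""
    return F0 / (2.0 * m * omega) * t * math.sin(omega * t)

def max_gap(Omega, t_end=20.0, n=2001):
    """Largest |(13) - (12)| on a uniform grid over 0 <= t <= t_end (illustrative window)."""
    ts = [t_end * i / (n - 1) for i in range(n)]
    return max(abs(x_beats(t, Omega) - x_resonant(t)) for t in ts)

if __name__ == "__main__":
    # The gap should shrink as Omega approaches omega = 1 on this fixed window.
    for Omega in (0.9, 0.99, 0.999):
        print(Omega, max_gap(Omega))
```

Note that the agreement is only over a fixed, finite time window: for any $\Omega \ne \omega$ the response (13) stays bounded, whereas (12) grows without bound, so the two must eventually part company for large enough $t$.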
Emphasis in this book is on the mathematics, with the detailed study of the relevant physics left for applications courses such as Fluid Mechanics, Electromagnetic Field Theory, and so on, but we will occasionally try to show not only the connections between the mathematics and the physics but also the process whereby we determine those connections.

Figure 5. Beats, and approach to resonance.

3.8.2. Damped case. We now reconsider the harmonically driven oscillator, this time with a $cx'$ damping term included ($c > 0$):

$mx'' + cx' + kx = F_0\cos\Omega t.$ (15)

Recall from Section 3.5 that the homogeneous solution is

$$x_h(t) = \begin{cases} e^{-ct/2m}\left[A\cos\sqrt{\omega^2 - \left(\frac{c}{2m}\right)^2}\,t + B\sin\sqrt{\omega^2 - \left(\frac{c}{2m}\right)^2}\,t\right] \\[4pt] e^{-ct/2m}\,(A + Bt) \\[4pt] e^{-ct/2m}\left[A\cosh\sqrt{\left(\frac{c}{2m}\right)^2 - \omega^2}\,t + B\sinh\sqrt{\left(\frac{c}{2m}\right)^2 - \omega^2}\,t\right] \end{cases}$$ (16)

for the underdamped ($c < c_{cr}$), critically damped ($c = c_{cr}$), and overdamped ($c > c_{cr}$) cases, respectively, and where $\omega = \sqrt{k/m}$ and $c_{cr} = 2\sqrt{mk}$. This time, when we write

$x_p(t) = C\cos\Omega t + D\sin\Omega t,$ (17)

according to the method of undetermined coefficients, there is no duplication between terms in (17) and (16), even if $\Omega = \omega$, because of the $\exp(-ct/2m)$ factors in (16), so we can accept (17) without modification. Putting (17) into (15) and equating coefficients of the $\cos\Omega t$ terms on both sides of the equation, and similarly for the coefficients of the $\sin\Omega t$ terms, enables us to solve for $C$ and $D$. The result (Exercise 3a) is that

$x_p(t) = \frac{(F_0/m)(\omega^2 - \Omega^2)}{(\omega^2 - \Omega^2)^2 + (c\Omega/m)^2}\cos\Omega t + \frac{F_0 c\,\Omega/m^2}{(\omega^2 - \Omega^2)^2 + (c\Omega/m)^2}\sin\Omega t,$ (18)

or (Exercise 3b), equivalently,

$x_p(t) = E\cos(\Omega t + \Phi),$ (19a)

where the amplitude $E$ and phase $\Phi$ are

$E = \frac{F_0/m}{\sqrt{(\omega^2 - \Omega^2)^2 + (c\Omega/m)^2}},$ (19b)

$\Phi = \tan^{-1}\left[\;\cdots\;\right]$ (19c)

with the $\tan^{-1}$ understood to lie between 0 and $\pi$.

Of interest are the response curves, the graphs of the amplitude $E$ and the phase $\Phi$ with respect to the driving frequency $\Omega$. The former is given in Fig. 6 for various values of the damping coefficient $c$, and the latter is left for the exercises. From Fig.
6 we see that true resonance is possible only in the case of no damping ($c = 0$), which case is an idealization since in reality there is inevitably some damping present. Analytically, we see the same thing: (19b) shows that the amplitude $E$ can become infinite only if $c = 0$, and that occurs only for $\Omega = \omega$. However, for $c > 0$ there is still a peaking of the amplitude, even if that peak is now finite, at a driving frequency $\Omega$ which diminishes from $\omega$ as $c$ increases, and which is 0 for all $c > c_{cr}$. Further, the peak magnitude (located by the dotted curve) diminishes from $\infty$ to $F_0/k$ as $c$ is increased from 0 to $c_{cr}$, and remains $F_0/k$ for all $c > c_{cr}$. What is the significance of the $F_0/k$ value? For $\Omega = 0$ the differential equation becomes $mx'' + cx' + kx = F_0$, and the method of undetermined coefficients gives $x_p(t) = \text{constant} = F_0/k$, which is merely the static deflection of the mass under the steady force $F_0$. Even if true resonance is possible only for the undamped case ($c = 0$), the term resonance is often used to refer to the dramatic peaking of the amplitude response curves if $c$ is not too large.

The general solution, of course, is the sum

$x(t) = x_h(t) + x_p(t) = x_h(t) + E\cos(\Omega t + \Phi),$ (20)

where $E$ and $\Phi$ are given by (19b,c) and $x_h(t)$ is given by the suitable right-hand side of (16), according to whether the system is underdamped, critically damped, or overdamped. If we impose initial conditions $x(0)$ and $x'(0)$ on (20), then we can solve for the integration constants $A$ and $B$ within $x_h(t)$. Notice carefully that the $x_h(t)$ part of the solution inevitably tends to zero as $t \to \infty$ because of the $\exp(-ct/2m)$ factor, no matter how small $c$ is, as long as $c > 0$. Thus, we call $x_h(t)$ in (20) the transient part of the solution and we call $x_p(t)$ the steady-state part since $x(t) \to E\cos(\Omega t + \Phi)$ as $t \to \infty$. The transient part depends upon the initial conditions, whereas the steady-state part does not.

Figure 7. A representative response $x(t)$ (solid); approach to the steady-state oscillation $x_p(t)$ (dotted).
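The steady-state formulas lend themselves to a quick numerical check. The sketch below is illustrative only: the function names and the sample parameter values are assumptions, and the phase is computed here with `atan2` from the coefficients $C$ and $D$ of (18), on the branch $-\pi \le \Phi \le 0$, chosen simply so that $E\cos(\Omega t + \Phi)$ reproduces $C\cos\Omega t + D\sin\Omega t$; that branch choice is an assumption of this sketch and may differ from the convention attached to (19c). The code substitutes (18) into (15) to confirm that the residual vanishes, and locates the peak of the amplitude (19b) by a simple grid search.

```python
import math

def steady_state(Omega, m, c, k, F0):
    """Return (C, D, E, Phi) for m x'' + c x' + k x = F0 cos(Omega t).

    C, D follow (18) and E follows (19b). Phi is this sketch's own choice:
    atan2(-D, C), so that E cos(Omega t + Phi) = C cos(Omega t) + D sin(Omega t).
    Not valid for c = 0 with Omega = sqrt(k/m) (division by zero).
    """
    w2 = k / m
    den = (w2 - Omega**2) ** 2 + (c * Omega / m) ** 2
    C = (F0 / m) * (w2 - Omega**2) / den
    D = (F0 * c * Omega / m**2) / den
    E = (F0 / m) / math.sqrt(den)
    Phi = math.atan2(-D, C)
    return C, D, E, Phi

def residual(t, Omega, m, c, k, F0):
    """Left side minus right side of (15) with x = C cos(Omega t) + D sin(Omega t)."""
    C, D, _, _ = steady_state(Omega, m, c, k, F0)
    x = C * math.cos(Omega * t) + D * math.sin(Omega * t)
    xd = Omega * (-C * math.sin(Omega * t) + D * math.cos(Omega * t))
    xdd = -Omega**2 * x
    return m * xdd + c * xd + k * x - F0 * math.cos(Omega * t)

def peak_frequency(m, c, k, F0, n=6001, factor=3.0):
    """Grid-search estimate of the Omega in [0, factor*omega] that maximizes E; use c > 0."""
    omega = math.sqrt(k / m)
    grid = [factor * omega * i / (n - 1) for i in range(n)]
    return max(grid, key=lambda Om: steady_state(Om, m, c, k, F0)[2])

if __name__ == "__main__":
    m, k, F0 = 1.0, 4.0, 2.0          # illustrative values; omega = 2, c_cr = 4
    for c in (0.4, 1.0, 4.0):
        Om = peak_frequency(m, c, k, F0)
        print(c, Om, steady_state(Om, m, c, k, F0)[2], F0 / k)
```

Evaluating `steady_state(0.0, m, c, k, F0)` returns the amplitude $F_0/k$, the static deflection noted above, for any $c > 0$.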
A representative underdamped case is shown in Fig. 7, where we see the approach to the steady-state oscillation $x_p(t)$.

Closure. In this section we considered the forced vibration of a harmonic oscillator, that is, a system governed by the differential equation $mx'' + cx' + kx = f(t)$, for the case of the harmonic excitation $f(t) = F_0\cos\Omega t$. Thus, besides a homogeneous solution we needed to find a particular solution, and that was done by the method of undetermined coefficients. The particular solution is especially important physically since even an infinitesimal amount of damping will cause the homogeneous solution to tend to zero as $t \to \infty$, so that the particular solution becomes the steady-state response. To understand the physical significance of that response we attached importance to the amplitude- and phase-response curves and discussed the phenomena of resonance and beats.

Our discussion in this section has been limited in that we have considered only the case of harmonic excitation, whereas in applications $f(t)$ surely need not be harmonic. However, that case is important enough to deserve this special section. When we study the Laplace transform method in Chapter 5, we will be able to return to problems such as this and obtain solutions for virtually any forcing function $f(t)$.

EXERCISES 3.8

1. Applying the initial conditions $x(0) = 0$ and $x'(0) = 0$ to (8), derive (13). Show that the same result can be obtained if we start with the form (9) instead of (8).

2. Derive (12) from (11).

3. (a) Derive (18). (b) Derive (19a,b,c).

4. The amplitude- and phase-response curves shown in Fig. 3 correspond to the equation $mx'' + kx = F_0\cos\Omega t$. Obtain the equations of the analogous response curves for the equation $mx'' + kx = F_0\sin\Omega t$, and give labeled sketches of the two curves.

5. Figure 6 shows the amplitude-response curves ($E$ versus $\Omega$) corresponding to (19b), for various values of $c$.
(a) What happens to the graph as $c \to \infty$? Is $E(\Omega)$ continuous on $0 \le \Omega < \infty$ for $c = \infty$? Explain.
(b) From (19c), obtain the phase-response curves ($\Phi$ versus $\Omega$), either by a careful freehand sketch or using a computer, for various values of $c$, being sure to include the important case $c = 0$. What happens to the graph as $c \to \infty$?

6. In Fig. 7 we show the approach of a representative response curve (solid) to the steady-state oscillation (dotted), for an underdamped system.
(a) Do the same (with a computer plot) for a critically damped case. The values of $m$, $c$, $k$, $F_0$, $\Omega$, $x(0)$, $x'(0)$ are up to you, but the idea is to demonstrate graphically the approach to $x_p(t)$ clearly, as we have in Fig. 7.
(b) Same as (a), for an overdamped system, where $c = 4c_{cr}$, say.

7. Show that taking the limit of the response (13) as $\Omega \to \omega$, with the help of l'Hôpital's rule, does give (12), as claimed two paragraphs below (14).

8. Observe from Fig. 6 that the amplitude $E$ tends to zero as $\Omega \to \infty$. Explain (physically, mathematically, or both) why that result makes sense.

9. (a) What choice of initial conditions $x(0)$ and $x'(0)$ will reduce the solution (20) to just the particular solution, $x(t) = E\cos(\Omega t + \Phi)$?
(b) Using a sketch of a representative $x_p(t)$ such as the dotted curve in Fig. 7, show the graphical significance of those special values of $x(0)$ and $x'(0)$.

10. Imagine the experimental means that would be required to apply a force $F_0\cos\Omega t$ to a mass. It doesn't sound so hard if the mass is stationary, but imagine trying to apply such a force to a moving mass! In many physical applications, such as earthquake-induced vibration, the driving force is applied indirectly, by "shaking" the wall, rather than being applied directly to the mass. Specifically, for the system shown in the figure, use Newton's second law to show that if the wall is displaced laterally according to $\delta(t) = \delta_0\cos\Omega t$, then the equation of motion of the mass $m$ is $mx'' + kx = F_0\cos\Omega t$, where $F_0 = k\delta_0$. Here, $x$ and $\delta$ are measured relative to fixed points in space. NOTE: Observe that such an experiment is more readily performed since it is easier to apply a harmonic displacement $\delta(t)$ than a harmonic force; for instance, one could use a slider-crank mechanism (which converts circular motion to linear harmonic motion). Note further that a displacement input is precisely what an automobile suspension is subjected to when we drive over a bumpy road.

[Figure: mass $m$ attached through a spring $k$ to a wall with displacement $\delta(t)$; mass displacement $x$.]

11. For the mechanical oscillator governed by the differential equation $mx'' + cx' + kx = F(t)$, obtain computer plots of the amplitude- and phase-response curves ($E$ versus $\Omega$ and $\Phi$ versus $\Omega$), for the case where $F(t) = 25\sin\Omega t$, for these six values of the damping coefficient $c$: $0$, $0.25c_{cr}$, $0.5c_{cr}$, $c_{cr}$, $2c_{cr}$, $4c_{cr}$, where
(a) $m = 1$, $k = 1$ (b) $m = 2$, $k = 5$ (c) $m = 2$, $k = 10$ (d) $m = 4$, $k = 2$ (e) $m = 4$, $k = 10$

12. (Complex function method) Let $L$ be a linear constant-coefficient differential operator, and consider the equation

$L[x] = F_0\cos\Omega t.$ (12.1)

According to the method of undetermined coefficients, we can find a particular solution $x_p(t)$ by seeking $x_p(t) = A\cos\Omega t + B\sin\Omega t$ (or, in exceptional cases, $t$ to an integer power times that). A slightly simpler line of approach that is sometimes used is as follows. Consider, in place of (12.1),

$L[w] = F_0 e^{i\Omega t}.$ (12.2)

Equation (12.2) is simpler than (12.1) in that to find a particular solution we need only one term, $w_p(t) = Ae^{i\Omega t}$. (If $z = a + ib$ is any complex number, it is standard to call $\mathrm{Re}\,z = a$ and $\mathrm{Im}\,z = b$ the real part and the imaginary part of $z$, respectively.) Because, according to Euler's formula, $e^{i\Omega t} = \cos\Omega t + i\sin\Omega t$, it follows that $\mathrm{Re}\,e^{i\Omega t} = \cos\Omega t$ and $\mathrm{Im}\,e^{i\Omega t} = \sin\Omega t$. Since the forcing function in (12.1) is the real part of the forcing function in (12.2), it seems plausible that $x_p(t)$ should be the real part of $w_p(t)$. Thus, we have the following method: to find a particular solution to (12.1) consider instead the simpler equation (12.2). Solve for $w_p(t)$ by seeking $w_p(t) = Ae^{i\Omega t}$, and then recover the desired $x_p(t)$ from $x_p(t) = \mathrm{Re}\,w_p(t)$.
(a) Prove that the method described above works. HINT: The key is the linearity of $L$, so that if $w = u + iv$, then $L[w] = L[u + iv] = L[u] + iL[v]$.
(b)-(k) Use the method to obtain a particular solution to the given equation:
(b) $mx'' + cx' + kx = F_0\cos\Omega t$ (c) $mx'' + cx' + kx = F_0\sin\Omega t$ (d) $x' + 3x = 5\cos 2t$ (e) $x' - x = 4\sin 3t$ (f) $x'' - x' + x = \cos 2t$ (g) 2" + 5a’ +a = 3sindt (h) $x'' - 2x' + x = 6\cos 5t$ (i), (j) Qe" Ga” +e" +e'+a = 3sint +a +x =3cost (k) 2!" + 2a" + 4a = 9sin 6t

13. (Electrical circuit) Recall from Section 2.3 that the equations governing the current $i(t)$ in the circuit shown, and the charge $Q(t)$ on the capacitor are

$L\frac{d^2 i}{dt^2} + R\frac{di}{dt} + \frac{1}{C}\,i = \frac{dE(t)}{dt}$ (13.1)

and

$L\frac{d^2 Q}{dt^2} + R\frac{dQ}{dt} + \frac{1}{C}\,Q = E(t),$ (13.2)

respectively, where $L$, $R$, $C$, $E$, $i$, and $Q$ are measured in henrys, ohms, farads, volts, amperes, and coulombs, respectively.

[Figure: circuit with applied voltage $E(t)$, inductor $L$, resistor $R$, and capacitor $C$.]

(a) Let $L = 2$, $R = 4$, and $C = 0.05$. Solve for $Q(t)$ subject to the initial conditions $Q(0) = Q'(0) = 0$, where $E(t) = 100$. Identify the steady-state solution. Give a computer plot of the solution for $Q(t)$ over a sufficiently long time period to clearly show the approach of $Q$ to its steady state. (Naturally, all plots should be suitably labeled.)
(b) Same as (a), but for $C = 0.08$.
(c) Same as (a), but for $C = 0.2$.
(d) Same as (a), but for $E(t) = 10e^{-t}$.
(e) Same as (a), but for $E(t) = 10\,(1 - e^{-t})$.
(f) Same as (a), but for E(t) = 50 (1 + e~°**).

a large set of differential equations. A realistic model could easily contain 100 differential equations on 100 unknowns. If there are two or more unknowns, then we are involved not with a single differential equation but with a system of such equations.
For instance, according to the well known Lotka-Volterra model of predator-prey population dynamics, the populations $x(t)$ and $y(t)$ of predator and prey are governed by the system of two equations

$$\begin{aligned} x' &= -\alpha x + \beta xy,\\ y' &= \gamma y - \delta xy, \end{aligned}$$ (1)

where $\alpha$, $\beta$, $\gamma$, $\delta$ are empirical constants and $t$ is the time. This particular system happens to be nonlinear because of the $xy$ products; we will return to it in Chapter 7 when we study nonlinear systems. The present chapter is devoted exclusively to linear differential equations.

By definition, a linear first-order system of $n$ equations in the $n$ unknowns $x_1(t), \ldots, x_n(t)$ is of the form

$$\begin{aligned} a_{11}(t)x_1' + \cdots + a_{1n}(t)x_n' + b_{11}(t)x_1 + \cdots + b_{1n}(t)x_n &= f_1(t),\\ &\;\;\vdots\\ a_{n1}(t)x_1' + \cdots + a_{nn}(t)x_n' + b_{n1}(t)x_1 + \cdots + b_{nn}(t)x_n &= f_n(t), \end{aligned}$$ (2)

where the forcing functions $f_j(t)$ and the coefficients $a_{jk}(t)$ and $b_{jk}(t)$ are prescribed, and where it is convenient to use a double-subscript notation: $a_{jk}(t)$ denotes the coefficient of $x_k'(t)$ in the $j$th equation, and $b_{jk}(t)$ denotes the coefficient of $x_k(t)$ in the $j$th equation.

We call (2) a first-order system because the highest derivatives are of first order. If the highest derivatives were of second order, we would call it a second-order system, and so on. A linear second-order system of $n$ equations in the $n$ unknowns $x_1(t), \ldots, x_n(t)$ would be of the same form as (2), but with each left-hand side being a linear combination of the second-, first-, and zeroth-order derivatives of the unknowns.

The system (2) is a generalization of the linear first-order equation $y' + p(x)y = q(x)$ in the one unknown $y(x)$ studied in Chapter 2. There, and in most of Chapter 3, we favored $x$ as the generic independent variable and $y$ as the generic dependent variable, but in this section the independent variable in most of our applications happens to be the time $t$, so we will use $t$ as the independent variable.
As in the case of a single differential equation, by a solution of a system of differential equations (be they linear or not), in the unknowns $x_1(t), \ldots, x_n(t)$ over some $t$ interval $I$, we mean a set of functions $x_1(t), \ldots, x_n(t)$ that reduce those equations to identities over $I$.

3.9.1. Examples. Let us begin by giving a few examples of how such systems arise in applications.

EXAMPLE 1. RL Circuit. Consider the circuit shown in Fig. 1, comprised of three loops. We wish to obtain the differential equations governing the various currents in the circuit. There are two ways to proceed that are different but equivalent, and which correspond to the current labeling shown in Fig. 2a and 2b (in which we have omitted the circuit elements, for simplicity).

Figure 1. Circuit of Example 1.

[Figure 2, panels (a) and (b): current labeling for the circuit of Fig. 1.]

First consider the former. If the current approaching the junction $p$ from the "west" is designated as $i_1$, and the current leaving to the east is $i_2$, then it follows from Kirchhoff's current law (namely, that the algebraic sum of the currents approaching or leaving any point of a circuit is zero) that the current to the south must be $i_1 - i_2$. Similarly, if we designate the current leaving the junction $q$ to the east as $i_3$, then the current leaving to the south must be $i_2 - i_3$. With the current approaching $r$ from the north and east being $i_2 - i_3$ and $i_3$, it follows that the current leaving to the west must be $i_2$. Similarly, the current leaving $s$ to the west must be $i_1$.

Next, apply Kirchhoff's voltage law (namely, that the algebraic sum of the voltage drops around each loop of the circuit must be zero) to each loop, recalling from Section 2.3 that the voltage drops across inductors, resistors, and capacitors (of which there are none in this particular circuit) are $L\,\frac{di}{dt}$, $Ri$, and $\frac{1}{C}\int i\,dt$, respectively. For the left-hand loop that step gives $L_1\frac{di_1}{dt} + R_1(i_1 - i_2) - E_1(t) = 0$, where the last term (corresponding to the applied voltage $E_1$) is counted as negative because it amounts to a voltage rise (according to the polarity denoted by the $\pm$ signs in Fig. 1) rather than a drop. Thus, we have for the left, middle, and right loops,

$$\begin{aligned} L_1 i_1' + R_1(i_1 - i_2) &= E_1(t),\\ L_2 i_2' + R_2(i_2 - i_3) + R_1(i_2 - i_1) &= E_2(t),\\ L_3 i_3' + R_3 i_3 + R_2(i_3 - i_2) &= E_3(t), \end{aligned}$$ (3)

respectively, or,

$$\begin{aligned} L_1 i_1' + R_1 i_1 - R_1 i_2 &= E_1(t),\\ L_2 i_2' - R_1 i_1 + (R_1 + R_2)\,i_2 - R_2 i_3 &= E_2(t),\\ L_3 i_3' - R_2 i_2 + (R_2 + R_3)\,i_3 &= E_3(t), \end{aligned}$$ (4)

where $E_1(t)$, $E_2(t)$, $E_3(t)$ are prescribed. It must be remembered that the currents do not need to flow in the directions assumed by the arrows: after all, they are the unknowns. If any of them turn out to be negative (at any given instant $t$), that merely means that they are flowing in the direction opposite to that tentatively assumed in Fig. 2a.

Alternatively, one can use the idea of "loop currents," as denoted in Fig. 2b. In that case the south-flowing currents in $R_1$ and $R_2$ are the net currents $i_1 - i_2$ and $i_2 - i_3$, respectively, just as in Fig. 2a. Either way, the result is the linear first-order system (4). ■

It is important to see that the system (4) is coupled. That is, the individual equations contain more than one unknown so that we cannot separate them and solve the first for $i_1$ (for instance), the second for $i_2$, and the third for $i_3$. Put differently, the currents $i_1$, $i_2$, $i_3$ are interrelated. It is only natural for systems of differential equations to be coupled since the coupling is the mathematical expression of the relatedness of the dependent variables. On the other hand, if we write differential equations governing the current $i(t)$ in a circuit and the price of tea in China, $p(t)$, we would hardly expect those equations to be coupled and, indeed, it would hardly make sense to group them as part of the same system.

EXAMPLE 2. LC Circuit.
For the circuit shown in Fig. 3, the same reasoning as above gives the integro-differential equations

$[\;\cdots\;] = E(t), \qquad [\;\cdots\;] = 0$ (5)

on the currents $i_1(t)$ and $i_2(t)$ or, differentiating to eliminate the integral signs,

$[\;\cdots\;]$ (6)

Whereas (4) was a first-order system, (6) is of second order. ■

Figure 3. LC circuit.

EXAMPLE 3. Mass-Spring System. This time consider a mechanical system, shown in Fig. 4 and comprised of masses and springs. The masses rest on a frictionless table and are subjected to applied forces $F_1(t)$, $F_2(t)$, respectively. When the displacements $x_1$ and $x_2$ are zero, the springs are neither stretched nor compressed, and we seek the equations of motion of the system, that is, the differential equations governing $x_1(t)$ and $x_2(t)$.

The relevant physics is Newton's second law of motion, and Hooke's law for each of the three springs, as were discussed in Section 1.3. To proceed, it is useful to make a concrete assumption on $x_1$ and $x_2$. Specifically, suppose that at the instant $t$ we have $x_1 > x_2 > 0$, as assumed in Fig. 5 (which figure, in the study of mechanics, is called a free-body diagram). Then the left spring is stretched by $x_1$ so it exerts a force to the left, on $m_1$, equal (according to Hooke's law) to $k_1x_1$. The middle spring is compressed by $x_1 - x_2$ so it exerts a force $k_{12}(x_1 - x_2)$ to the left on $m_1$ and to the right on $m_2$, and the right spring is compressed by $x_2$ and exerts a force $k_2x_2$ to the left on $m_2$, as shown in the figure. With the help of the information given in Fig. 5, Newton's second law for each of the two masses gives

$$\begin{aligned} m_1x_1'' &= -k_1x_1 - k_{12}(x_1 - x_2) + F_1(t),\\ m_2x_2'' &= -k_2x_2 + k_{12}(x_1 - x_2) + F_2(t) \end{aligned}$$ (7)

as the desired equations of motion or, rearranging terms,

$$\begin{aligned} m_1x_1'' + (k_1 + k_{12})\,x_1 - k_{12}x_2 &= F_1(t),\\ m_2x_2'' - k_{12}x_1 + (k_2 + k_{12})\,x_2 &= F_2(t). \end{aligned}$$ (8)

Figure 5. Free-body diagram of the masses.

Figure 6.
Revised free-body diagram for $m_1$.

COMMENT. Our assumption that $x_1 > x_2 > 0$ was only for definiteness; the resulting equations (8) are insensitive to whatever such assumption is made. For instance, suppose that we assume, instead, that $x_2 > x_1 > 0$. Then the middle spring is stretched by $x_2 - x_1$, so the free-body diagram of $m_1$ changes to that shown in Fig. 6, and Newton's law for $m_1$ gives $m_1x_1'' = -k_1x_1 + k_{12}(x_2 - x_1) + F_1(t)$, which is seen to be equivalent to the first of equations (7). Similarly for $m_2$. ■

3.9.2. Existence and uniqueness. The fundamental theorem regarding existence and uniqueness is as follows.*

THEOREM 3.9.1 Existence and Uniqueness for Linear First-Order Systems
Let the functions $a_{11}(t), a_{12}(t), \ldots, a_{nn}(t)$ and $f_1(t), \ldots, f_n(t)$ be continuous on a closed interval $I$. And let numbers $b_1, \ldots, b_n$ be given such that

$x_1(a) = b_1, \quad x_2(a) = b_2, \quad \ldots, \quad x_n(a) = b_n,$ (9)

where $a$ is a given point in $I$. Then the system

$$\begin{aligned} x_1' &= a_{11}(t)x_1 + a_{12}(t)x_2 + \cdots + a_{1n}(t)x_n + f_1(t),\\ &\;\;\vdots\\ x_n' &= a_{n1}(t)x_1 + a_{n2}(t)x_2 + \cdots + a_{nn}(t)x_n + f_n(t), \end{aligned}$$ (10)

subject to the initial conditions (9), has a unique solution on the entire interval $I$.

*There is a subtle point that is worth noting, namely, that (10) is not quite of the same form as the general first-order linear system (2) in that its left-hand sides are simply $x_1', \ldots, x_n'$, rather than linear combinations of those terms. (What follows presumes that you have already studied the sections on matrices, rank, and Gauss-Jordan reduction.) The idea is that (2) can be reduced to the form (10) by elementary row operations, as in the Gauss-Jordan reduction of linear algebraic equations, unless the rank of the $\{a_{jk}(t)\}$ matrix is less than $n$. In that case, such operations would yield at least one equation, at the bottom of the system, which has no derivatives in it. If not all of the coefficients of the undifferentiated $x_j$ terms in that equation are zero, then one could use that equation to solve for one of the $x_j$'s in terms of the others and use that result to reduce the system by one unknown and one equation; if all of the coefficients of the undifferentiated $x_j$ terms in that equation are zero, then that equation would either be $0 = 0$, which could be discarded, or zero equal to some nonzero prescribed function of $t$, which would cause the system to have no solution. To avoid these singular cases, it is conventional to use the form (10), rather than (2), in the existence and uniqueness theorem.

Observe that we have added the word "entire" for emphasis, for recall from Section 2.4 that the Existence and Uniqueness Theorem 2.4.1 for the nonlinear initial-value problem $y'(x) = f(x, y)$ with initial condition $y(a) = b$ is a local one; it guarantees the existence of a unique solution over some interval $|x - a| < h$, but it does not tell us how big $h$ can be. In contrast, Theorem 3.9.1 tells us that the solution exists and is unique over the entire interval $I$ over which the specified conditions are met.

EXAMPLE 4. Example 1, Revisited. Suppose that we add initial conditions, say $i_1(0) = b_1$, $i_2(0) = b_2$, $i_3(0) = b_3$ to the system (4) governing the RL circuit of Example 1. If the $E_j(t)$'s are continuous on $0 \le t \le T$ and the $L_j$'s are nonzero [so we can divide through by them in reducing (4) to the form of (10)] then, according to Theorem 3.9.1, the initial-value problem has a solution on $0 \le t \le T$, and it is unique. ■

It would appear that Theorem 3.9.1 does not apply to the system (8) of Example 3 because the latter is of second order rather than first. However, and this is important, higher-order systems can be reduced to first-order ones by introducing artificial, or auxiliary, dependent variables.

EXAMPLE 5. Reduce the second-order system (8) to a first-order system. The idea is to introduce artificial dependent variables $u$ and $v$ according to $x_1' = u$ and $x_2' = v$ because then the second-order derivatives $x_1''$ and $x_2''$ become first-order derivatives $u'$ and $v'$, respectively. Thus, (8) can be re-expressed, equivalently, as the first-order system

$$\begin{aligned} x_1'(t) &= u(t),\\ u'(t) &= -\frac{k_1 + k_{12}}{m_1}\,x_1 + \frac{k_{12}}{m_1}\,x_2 + \frac{1}{m_1}F_1(t),\\ x_2'(t) &= v(t),\\ v'(t) &= \frac{k_{12}}{m_2}\,x_1 - \frac{k_2 + k_{12}}{m_2}\,x_2 + \frac{1}{m_2}F_2(t). \end{aligned}$$ (11)

To see that this system is of the form (10), let "$x_1$" $= x_1$, "$x_2$" $= u$, "$x_3$" $= x_2$, and "$x_4$" $= v$. Then $a_{11} = a_{13} = a_{14} = 0$, $a_{12} = 1$, $f_1(t) = 0$, $a_{21} = -(k_1 + k_{12})/m_1$, $a_{22} = a_{24} = 0$, $a_{23} = k_{12}/m_1$, $f_2(t) = F_1(t)/m_1$, and so on. All of the $a_{jk}(t)$ coefficients are constants and hence continuous for all $t$. Let the forcing functions $F_1(t)$ and $F_2(t)$ be continuous on $0 \le t < \infty$. Thus, according to Theorem 3.9.1, if we prescribe initial conditions $x_1(0)$, $u(0)$, $x_2(0)$, $v(0)$, then the initial-value problem consisting of (11), together with those initial conditions, will have a unique solution for $x_1(t)$, $u(t)$, $x_2(t)$, $v(t)$. Equivalently, the initial-value problem consisting of (8), together with prescribed initial values $x_1(0)$, $x_1'(0)$, $x_2(0)$, $x_2'(0)$, will have a unique solution for $x_1(t)$, $x_2(t)$. ■

Consider one more example on auxiliary variables.

EXAMPLE 6. Consider the third-order equation

$x''' + 2t\,x'' - x' + (\sin t)\,x = \cos t,$ (12)

which is a system, a system of $n$ equations in $n$ unknowns, where $n = 1$. To reduce it to a system of first-order equations, introduce auxiliary variables $u$, $v$ according to $x' = u$ and $x'' = u' = v$. Then

$$\begin{aligned} x' &= u,\\ u' &= v,\\ v' &= -(\sin t)\,x + u - 2tv + \cos t \end{aligned}$$ (13)

is the desired equivalent first-order system, where the last of the three equations follows from the first two together with (12). ■

3.9.3. Solution by elimination. We now give a method of solution of systems of linear differential equations for the special case of constant coefficients, a method of elimination that is well suited to systems that are small enough for us to carry out the steps by hand.
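Before turning to elimination, it may help to see why the first-order form of Example 5 is convenient in practice: a system written as (10) can be handed directly to a standard numerical integrator. The following is a minimal sketch under stated assumptions, not a method from the text: the helper names `make_rhs` and `rk4`, the classical fourth-order Runge-Kutta stepping, and the unit parameter values are all illustrative choices. It encodes the right-hand sides of (11) with the state ordered as $(x_1, u, x_2, v)$ and integrates an unforced case whose exact solution is easy to write down: with equal masses and springs and $x_1(0) = x_2(0) = 1$, the middle spring is never deformed, so $x_1 = x_2 = \cos t$.

```python
import math

def make_rhs(m1, m2, k1, k2, k12, F1=lambda t: 0.0, F2=lambda t: 0.0):
    """Right-hand sides of the first-order system (11); state s = (x1, u, x2, v)."""
    def rhs(t, s):
        x1, u, x2, v = s
        return (
            u,
            (-(k1 + k12) * x1 + k12 * x2 + F1(t)) / m1,
            v,
            (k12 * x1 - (k2 + k12) * x2 + F2(t)) / m2,
        )
    return rhs

def rk4(f, s, t, t_end, n):
    """Integrate s' = f(t, s) from t to t_end with n classical Runge-Kutta steps."""
    h = (t_end - t) / n
    for _ in range(n):
        a = f(t, s)
        b = f(t + h / 2, tuple(si + h / 2 * ai for si, ai in zip(s, a)))
        c = f(t + h / 2, tuple(si + h / 2 * bi for si, bi in zip(s, b)))
        d = f(t + h, tuple(si + h * ci for si, ci in zip(s, c)))
        s = tuple(si + h / 6 * (ai + 2 * bi + 2 * ci + di)
                  for si, ai, bi, ci, di in zip(s, a, b, c, d))
        t += h
    return s

if __name__ == "__main__":
    f = make_rhs(1.0, 1.0, 1.0, 1.0, 1.0)          # illustrative unit values, no forcing
    x1, u, x2, v = rk4(f, (1.0, 0.0, 1.0, 0.0), 0.0, 2.0, 2000)
    print(x1, x2, math.cos(2.0))                   # numerical x1, x2 versus exact cos(2)
```

The same `rk4` routine would accept the system (13) of Example 6 unchanged, once its three right-hand sides are coded as a function of $(t, (x, u, v))$; that reuse is the practical payoff of the reduction to first order.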
We introduce the method with an example after first recalling (from Section 3.3) the idea of a linear differential operator,

$$L = a_0(t)\frac{d^n}{dt^n} + a_1(t)\frac{d^{n-1}}{dt^{n-1}} + \cdots + a_n(t) = a_0(t)D^n + a_1(t)D^{n-1} + \cdots + a_n(t),$$

where $D$ denotes $d/dt$, $D^2$ denotes $d^2/dt^2$, and so on. By $L[x]$ we mean the function $a_0\,\dfrac{d^n x}{dt^n} + a_1\,\dfrac{d^{n-1}x}{dt^{n-1}} + \cdots + a_n x$. We say that $L$ is of order $n$ (if $a_0$ is not identically zero) and that it "acts" on $x$, or "operates" on $x$. Further, by $L_1L_2[x]$ we mean $L_1[L_2[x]]$; that is, first the operator immediately to the left of $x$ acts on $x$, then the operator to the left of that acts on the result. Two operators, say $L_1$ and $L_2$, are said to be equal if $L_1[x] = L_2[x]$ for all functions $x(t)$ (that are sufficiently differentiable for $L_1$ and $L_2$ to act on them).

Finally, in general, differential operators do not commute: $L_1L_2 \ne L_2L_1$. For instance, if $L_1 = D$ and $L_2 = tD$, then $L_1L_2[x] = D(tDx) = D(tx') = tx'' + x'$, whereas $L_2L_1[x] = tD(Dx) = tDx' = tx''$. However, they do commute if their $a_j$ coefficients are constants. For instance,

$$(2D - 1)(D + 3)x = (2D - 1)(x' + 3x) = 2x'' + 6x' - x' - 3x$$

and

$$(D + 3)(2D - 1)x = (D + 3)(2x' - x) = 2x'' - x' + 6x' - 3x$$

are identical for all functions $x(t)$, so $(2D - 1)(D + 3) = (D + 3)(2D - 1)$.

EXAMPLE 7. To solve the system

$$x' - x - y = 3t, \tag{14a}$$
$$x' + y' - 5x - 2y = 5, \tag{14b}$$

it is convenient to begin by re-expressing it as

$$(D - 1)x - y = 3t, \tag{15a}$$
$$(D - 5)x + (D - 2)y = 5, \tag{15b}$$

or

$$L_1[x] + L_2[y] = 3t, \tag{16a}$$
$$L_3[x] + L_4[y] = 5, \tag{16b}$$

where $L_1 = D - 1$, $L_2 = -1$, and so on. To solve by the method of elimination, let us operate on (16a) with $L_3$ and on (16b) with $L_1$, giving

$$L_3L_1[x] + L_3L_2[y] = L_3[3t], \tag{17a}$$
$$L_1L_3[x] + L_1L_4[y] = L_1[5], \tag{17b}$$

where we have used the linearity of $L_3$ in writing $L_3[L_1[x] + L_2[y]]$ as $L_3L_1[x] + L_3L_2[y]$ in obtaining (17a) and, similarly, the linearity of $L_1$ in obtaining (17b). Subtracting one equation from the other, and cancelling the $x$ terms because $L_3L_1 = L_1L_3$, enables us to eliminate $x$ and to obtain the equation

$$(L_1L_4 - L_3L_2)[y] = L_1[5] - L_3[3t] \tag{18}$$

on $y$ alone.
At this point we can return to the non-operator form, with $L_1L_4 - L_3L_2 = (D - 1)(D - 2) - (D - 5)(-1) = D^2 - 2D - 3$ and $L_1[5] - L_3[3t] = (D - 1)(5) - (D - 5)(3t) = 15t - 8$. Thus,

$$y'' - 2y' - 3y = 15t - 8, \tag{19}$$

which admits the general solution

$$y(t) = Ae^{3t} + Be^{-t} - 5t + 6. \tag{20}$$

To find $x(t)$, we can proceed in the same manner. This time, operate on (16a) with $L_4$ and on (16b) with $L_2$:

$$L_4L_1[x] + L_4L_2[y] = L_4[3t], \tag{21a}$$
$$L_2L_3[x] + L_2L_4[y] = L_2[5], \tag{21b}$$

and subtraction gives

$$(L_4L_1 - L_2L_3)[x] = L_4[3t] - L_2[5], \tag{22}$$

or

$$x'' - 2x' - 3x = 8 - 6t, \tag{23}$$

with general solution

$$x(t) = Ce^{3t} + Ee^{-t} + 2t - 4. \tag{24}$$

(We avoid using $D$ as an integration constant because $D = d/dt$ here.) It might appear that $A$, $B$, $C$, $E$ are all arbitrary, but don't forget that $x$ and $y$ are related through (14), so these constants might be related as well. In fact, putting (20) and (24) into (14a) gives, after cancellation of terms,

$$(2C - A)e^{3t} - (2E + B)e^{-t} = 0, \tag{25}$$

and the linear independence of $e^{3t}$ and $e^{-t}$ requires that $A = 2C$ and $B = -2E$. Putting (20) and (24) into (14b) gives this same result. Thus, the general solution of (14) is

$$x(t) = Ce^{3t} + Ee^{-t} + 2t - 4, \tag{26a}$$
$$y(t) = 2Ce^{3t} - 2Ee^{-t} - 5t + 6. \tag{26b}$$

COMMENT 1. With hindsight, it would have been easier to eliminate $y$ first and solve for $x$ since we could have put that $x$ [namely, as given by (26a)] into (14a) and solved that equation for $y$. That step would have produced (26b) directly.

COMMENT 2. Notice that (14) is not of the "standard" form (10) because (14b) has both $x'$ and $y'$ in it. While we need it to be in that form to apply Theorem 3.9.1, we do not need the system to be of that form to apply the method of elimination.

A review of the steps in the elimination process reveals that the operators $L_1, \ldots, L_4$ might just as well have been constants, by the way we have manipulated them. In fact, a useful way to organize the procedure is to use Cramer's rule (Section 10.6).
For instance, if we have two differential equations

$$L_1[x] + L_2[y] = f_1(t), \tag{27a}$$
$$L_3[x] + L_4[y] = f_2(t), \tag{27b}$$

we can, heuristically, use Cramer's rule to write

$$x = \frac{\begin{vmatrix} f_1 & L_2 \\ f_2 & L_4 \end{vmatrix}}{\begin{vmatrix} L_1 & L_2 \\ L_3 & L_4 \end{vmatrix}} = \frac{L_4[f_1] - L_2[f_2]}{L_1L_4 - L_2L_3}, \tag{28a}$$

$$y = \frac{\begin{vmatrix} L_1 & f_1 \\ L_3 & f_2 \end{vmatrix}}{\begin{vmatrix} L_1 & L_2 \\ L_3 & L_4 \end{vmatrix}} = \frac{L_1[f_2] - L_3[f_1]}{L_1L_4 - L_2L_3}. \tag{28b}$$

Of course, the division by an operator on the right-hand sides of (28a,b) is not defined, so we need to put the $L_1L_4 - L_2L_3$ back up on the left-hand side, where it came from. That step gives

$$(L_1L_4 - L_2L_3)[x] = L_4[f_1] - L_2[f_2], \tag{29a}$$
$$(L_1L_4 - L_2L_3)[y] = L_1[f_2] - L_3[f_1], \tag{29b}$$

which equations correspond to (22) and (18), respectively, in Example 7. Again, this approach is heuristic, but it does give the correct result and is readily applied, and extended to systems of three equations in three unknowns, four in four unknowns, and so on.

What might possibly go wrong with our foregoing solution of (27)? In the application of Cramer's rule to linear algebraic equations, the case where the determinant in the denominator vanishes is singular, and either there are no solutions (the system is "inconsistent") or there is an infinite number of them (the system is "redundant"). Likewise, the system (27) is singular if $L_1L_4 - L_2L_3$ is zero and is either inconsistent (with no solution) or redundant (with infinitely many linearly independent solutions). For instance, the system

$$Dx + 2Dy = 1, \tag{30a}$$
$$2Dx + 4Dy = 3 \tag{30b}$$

has $L_1L_4 - L_2L_3 = 4D^2 - 4D^2 = 0$ and has no solution since the left-hand sides are in the ratio 1:2, whereas the right-hand sides are in the ratio 1:3. However, if we change the 3 to a 2, then the new system still has $L_1L_4 - L_2L_3 = 0$ but is now consistent. Indeed, then the second equation is merely twice the first and can be discarded, leaving the single equation $Dx + 2Dy = 1$ in the two unknowns $x(t)$ and $y(t)$. We can choose one of these arbitrarily and use $Dx + 2Dy = 1$ to solve for the other, so there are infinitely many linearly independent solutions.
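Because constant-coefficient operators commute, they can be manipulated like ordinary polynomials in $D$. The following is a minimal sketch of that idea, not part of the text's method: the helper names `poly_mul`, `poly_sub`, and `determinantal_polynomial` are hypothetical, and each operator is represented (by assumption) as a list of coefficients with the lowest power of $D$ first. It forms $L_1L_4 - L_2L_3$ for Example 7 and for the system (30).

```python
def poly_mul(p, q):
    """Multiply two polynomials in D (coefficient lists, lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_sub(p, q):
    """Subtract q from p and strip trailing zero coefficients."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    out = [a - b for a, b in zip(p, q)]
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out


def determinantal_polynomial(L1, L2, L3, L4):
    """Return L1*L4 - L2*L3 for constant-coefficient operators."""
    return poly_sub(poly_mul(L1, L4), poly_mul(L2, L3))


# Example 7: L1 = D - 1, L2 = -1, L3 = D - 5, L4 = D - 2
example7 = determinantal_polynomial([-1, 1], [-1], [-5, 1], [-2, 1])

# System (30): L1 = D, L2 = 2D, L3 = 2D, L4 = 4D
system30 = determinantal_polynomial([0, 1], [0, 2], [0, 2], [0, 4])

print(example7)  # coefficients of D^2 - 2D - 3, lowest degree first
print(system30)  # the zero polynomial: the singular case
```

Under this representation a result of `[0]` signals the singular case, and the length of the list minus one is the degree of the determinantal polynomial.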
Understand that the singular nature of (30), and the modified system, is intrinsic to those systems and is not a fault of the method of elimination.

In the generic case, however, $L_1L_4 - L_2L_3 \ne 0$ and we can solve (29a) and (29b) for $x(t)$ and $y(t)$, respectively. It can be shown* that the number of independent arbitrary integration constants is the same as the degree of the determinantal polynomial $L_1L_4 - L_2L_3$. In Example 7, for instance, $L_1L_4 - L_2L_3 = D^2 - 2D - 3$ is of second degree, so we could have known in advance that there would be two independent arbitrary constants.

*See pages 144-150 in the classic treatise by E. L. Ince, Ordinary Differential Equations (New York: Dover, 1956).

EXAMPLE 8. Mass-Spring System in Fig. 4. Let us study the two-mass system shown in Fig. 4, and let $m_1 = m_2 = k_1 = k_{12} = k_2 = 1$ and $F_1(t) = F_2(t) = 0$, for definiteness. Then equations (8) become

$$(D^2 + 2)\,x_1 - x_2 = 0, \tag{31a}$$
$$-x_1 + (D^2 + 2)\,x_2 = 0. \tag{31b}$$

With $L_1 = L_4 = D^2 + 2$ and $L_2 = L_3 = -1$, and $f_1(t) = f_2(t) = 0$, (29a,b) become

$$(D^4 + 4D^2 + 3)\,x_1 = 0, \tag{32a}$$
$$(D^4 + 4D^2 + 3)\,x_2 = 0, \tag{32b}$$

so (Exercise 2)

$$x_1(t) = A\cos t + B\sin t + C\cos\sqrt{3}\,t + E\sin\sqrt{3}\,t, \tag{33a}$$
$$x_2(t) = F\cos t + G\sin t + H\cos\sqrt{3}\,t + I\sin\sqrt{3}\,t. \tag{33b}$$

To determine any relationships among the constants $A, B, \ldots, I$, we put (33) into (31a) [or (31b); the result would be the same] and find that

$$(A - F)\cos t + (B - G)\sin t - (C + H)\cos\sqrt{3}\,t - (E + I)\sin\sqrt{3}\,t = 0,$$

from which we learn that $F = A$, $G = B$, $H = -C$, and $I = -E$, so the general solution of (31) is

$$x_1(t) = A\cos t + B\sin t + C\cos\sqrt{3}\,t + E\sin\sqrt{3}\,t, \tag{34a}$$
$$x_2(t) = A\cos t + B\sin t - C\cos\sqrt{3}\,t - E\sin\sqrt{3}\,t. \tag{34b}$$

The determinantal polynomial was of fourth degree and, as asserted above, there are four independent arbitrary integration constants. There are important things to say about the result expressed in (34):

COMMENT 1.
It will be more illuminating to re-express (34) in the form

$$x_1(t) = G\sin(t + \phi) + H\sin(\sqrt{3}\,t + \psi), \tag{35a}$$
$$x_2(t) = G\sin(t + \phi) - H\sin(\sqrt{3}\,t + \psi), \tag{35b}$$

where the four constants $G$, $H$, $\phi$, $\psi$ are determined from the initial conditions $x_1(0)$, $x_1'(0)$, $x_2(0)$, and $x_2'(0)$. While neither $x_1(t)$ nor $x_2(t)$ is a pure sinusoid, each is a superposition of two pure sinusoids, the frequencies of which are characteristics of the system (i.e., independent of the initial conditions). Those frequencies, $\omega = 1$ rad/sec and $\omega = \sqrt{3}$ rad/sec, are the natural frequencies of the system. If the initial conditions are such that $H = 0$, then the motion is of the form

$$x_1(t) = G\sin(t + \phi), \qquad x_2(t) = G\sin(t + \phi); \tag{36}$$

that is, the two masses swing in unison at the lower frequency $\omega = 1$. Such a motion is called a low mode motion because it is at the lower of the two natural frequencies. If instead the initial conditions are such that $G = 0$, then

$$x_1(t) = H\sin(\sqrt{3}\,t + \psi), \qquad x_2(t) = -H\sin(\sqrt{3}\,t + \psi); \tag{37}$$

the masses swing in opposition, at the higher frequency $\omega = \sqrt{3}$ rad/sec, so the latter is called a high mode motion. For instance, the initial conditions $x_1(0) = x_2(0) = 1$ and $x_1'(0) = x_2'(0) = 0$ give (Exercise 7) the purely low mode motion

$$x_1(t) = \sin(t + \pi/2) = \cos t, \qquad x_2(t) = \sin(t + \pi/2) = \cos t, \tag{38}$$

and the conditions $x_1(0) = 1$, $x_2(0) = -1$, and $x_1'(0) = x_2'(0) = 0$ give the purely high mode motion

$$x_1(t) = \sin(\sqrt{3}\,t + \pi/2) = \cos\sqrt{3}\,t, \qquad x_2(t) = -\sin(\sqrt{3}\,t + \pi/2) = -\cos\sqrt{3}\,t. \tag{39}$$

If, instead, $x_1(0) = 1$ and $x_2(0) = x_1'(0) = x_2'(0) = 0$, say, then both $G$ and $H$ will be nonzero and the motion will be a linear combination of the low and high modes.

COMMENT 2. Why is the frequency corresponding to the masses swinging in opposition higher than that corresponding to the masses swinging in unison? Remember from the single-mass case studied in Section 3.5 that the natural frequency in that case is $\sqrt{k/m}$; that is, the stiffer the system (the larger the value of $k$), the higher the frequency.
For the two-mass system, observe that in the low mode the middle spring is completely inactive, whereas in the high mode it is being stretched and compressed. Thus, there is more stiffness encountered in the high mode, so the high mode frequency is higher.

COMMENT 3. Just as the free vibration of a single mass is governed by one differential equation, $mx'' + kx = 0$, and has a single mode of vibration with natural frequency $\omega = \sqrt{k/m}$, a two-mass system is governed by two differential equations and its general vibration is a linear combination of two modes (unison and opposition in this example), each with its own natural frequency. Similarly, the free vibration of an $n$-mass system will be governed by $n$ differential equations, and its general vibration will be a linear combination of $n$ distinct modes, each with its own pattern and natural frequency. In the limit, we can think of a continuous system, such as a beam, as an infinite-mass system, an infinite number of tiny masses connected together. In that limiting case, in place of an infinite number of ordinary differential equations we obtain a partial differential equation on the deflection $y(x, t)$, solution of which yields the general solution as a linear combination of an infinite number of discrete modes of vibration. In applications it is important to know the natural frequencies of a given system because if it is driven by a harmonic forcing function, then it will have a large, perhaps catastrophic, response if the driving frequency is close to one of the natural frequencies.

COMMENT 4. Finally, we note that molecules and atoms can be modeled as mass-spring systems, and the spectrum of the natural frequencies is of great importance in determining their allowable energy levels.

We will have more to say about the foregoing example later, when we study matrix theory and the eigenvalue problem.
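The natural frequencies and mode shapes found in Example 8 can be checked with a short computation. The sketch below is illustrative only (the function name `two_mass_modes` is hypothetical): it treats the determinantal polynomial $D^4 + 4D^2 + 3$ as a quadratic in $s = D^2$, takes $\omega = \sqrt{-s}$ for each root, and reads the amplitude ratio $x_2/x_1 = s + 2$ off (31a).

```python
import math


def two_mass_modes():
    """Natural frequencies and amplitude ratios x2/x1 for system (31)."""
    # (D^2 + 2)^2 - 1 = D^4 + 4 D^2 + 3; with s = D^2 this is s^2 + 4 s + 3 = 0.
    a, b, c = 1.0, 4.0, 3.0
    disc = math.sqrt(b * b - 4.0 * a * c)
    roots = sorted([(-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)], reverse=True)
    modes = []
    for s in roots:
        omega = math.sqrt(-s)   # s = -omega^2 for a sinusoidal solution
        ratio = s + 2.0         # from (31a): (s + 2) x1 - x2 = 0
        modes.append((omega, ratio))
    return modes


modes = two_mass_modes()
for omega, ratio in modes:
    print(f"omega = {omega:.6f} rad/sec, x2/x1 = {ratio:+.1f}")
```

A ratio of $+1$ corresponds to the masses moving in unison (low mode) and a ratio of $-1$ to the masses moving in opposition (high mode), in agreement with (36) and (37).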
Observe that once a system of linear constant coefficient equations is converted by the process of elimination to a set of uncoupled equations such as (32a,b), the homogeneous solutions of those equations can be sought in the usual exponential form. In fact, one can do that even at the outset, without first going through the process of elimination. For instance, to solve (31a,b) one can start out by seeking a solution in the form $x_1(t) = \xi_1 e^{rt}$ and $x_2(t) = \xi_2 e^{rt}$. Putting those forms into (31a,b) gives what is known as an eigenvalue problem on the unknown constants $\xi_1$, $\xi_2$ and $r$. That discussion is best reserved for the chapters on matrix theory and linear algebra, as an important application of the eigenvalue problem, so we will not pursue it in the present section.

Closure. Systems of ordinary differential equations arise in the modeling of physical systems that involve more than one dependent variable. For instance, in modeling an ecological system such as the fish populations in a given lake, the dependent variables might be the populations of each fish species, as a function of the independent variable $t$. Surely these populations are interrelated (for instance, one species might be the primary food supply for another species), so the governing differential equations will be coupled. It is precisely the coupling that produces the interest in this section because if they are not coupled then we can solve them, individually, by the methods developed in preceding sections.

Our first step was to give the basic existence and uniqueness theorem. That theorem guaranteed both existence and uniqueness, under rather mild conditions of continuity, over an interval that is known in advance. The theorem applied to first-order systems, but we showed that systems of higher order can be converted to first-order systems by suitable introduction of auxiliary dependent variables.

Then we outlined a method of elimination for systems with constant coefficients.
Elimination is similar to the steps in the solution of linear algebraic equations by Gauss elimination, where the coefficients of the unknowns are operators rather than numbers. The correct result can even be obtained by using Cramer's rule, provided that the determinantal operator in the denominator does not vanish, and provided that we move that operator back "upstairs," as we did in converting (28) to (29). If the operator does vanish, then the problem is singular and there will be no solution or infinitely many linearly independent solutions.

In subsequent chapters on matrix theory we shall return to systems of linear differential equations with constant coefficients and develop additional solution techniques that are based upon the so-called eigenvalue problem.

Computer software. Often, one can solve systems of differential equations using computer-algebra systems. For instance, to find the general solution of the system

$$(D + 1)x + 2y = 0, \qquad 3x + (D + 2)y = 0$$

using Maple, enter

    dsolve({diff(x(t),t) + x(t) + 2*y(t) = 0, 3*x(t) + diff(y(t),t) + 2*y(t) = 0}, {x(t), y(t)});

and return. The result is the general solution

    {y(t) = C2 exp(t) + 3/2 C1 exp(-4t), x(t) = -C2 exp(t) + C1 exp(-4t)}

If we wish to include initial conditions $x(0) = 3$, $y(0) = 2$, use instead the command

    dsolve({diff(x(t),t) + x(t) + 2*y(t) = 0, 3*x(t) + diff(y(t),t) + 2*y(t) = 0, x(0)=3, y(0)=2}, {x(t), y(t)});

The result is the particular solution

    {y(t) = -exp(t) + 3 exp(-4t), x(t) = exp(t) + 2 exp(-4t)}

Alternatively, one can first define the two equations and then call them in the dsolve command. That is, enter

    deq1 := diff(x(t),t) + x(t) + 2*y(t) = 0:
    deq2 := 3*x(t) + diff(y(t),t) + 2*y(t) = 0:

The colon at the end of each line marks it as a definition and suppresses the printed output; commands whose results are to be displayed are followed by semicolons. Now, enter the dsolve command:

    dsolve({deq1, deq2, x(0)=3, y(0)=2}, {x(t), y(t)});

and return.
The result is

    {y(t) = -exp(t) + 3 exp(-4t), x(t) = exp(t) + 2 exp(-4t)}

EXERCISES 3.9

1. Derive the solution (20) of (19).

2. Derive the solutions (33a,b) of (32a,b).

3. Derive the system of differential equations governing the displacements $x_j(t)$, using the assumption that $x_1 > x_2 > x_3 > 0$. Repeat the derivation assuming instead that $x_3 > x_2 > x_1 > 0$ and again, assuming that $x_1 > x_3 > x_2 > 0$, and show that the resulting equations are the same, independent of these different assumptions.

[Figure: mass-spring system for Exercise 3, with masses $m$, springs $k$, and displacements $x_j$.]

4. (a),(b),(c) Derive the system of differential equations governing the currents $i_j(t)$, but you need not solve them. State any physical laws that you use.

[Figure: circuits (a), (b), (c) for Exercise 4.]

5. Obtain the general solution by the method of elimination either step-by-step or using the Cramer's rule shortcut.
(a) $(D - 1)x + Dy = 0$, $(D + 1)x + (2D + 2)y = 0$
(b) $(D - 1)x + 2Dy = 0$, $(D + 1)x + 4Dy = 0$
(c) $Dx + (D - 1)y = 5$, ™D+l)a+(D+l)y=0
(d) a t+y=ytt, xv—3y' = -27 +2
(e) $x' = \sin t - y$, $y' = -9x + 4$
(f) 2’ =x ~8y, yo/ =r-n-~y—3t
(g) $x' = 2x + 6y - t + 7$, $y' = 2x - 2y$
(h) Qe +y +at+y=P-1, ety t+art+y=0
(i) 2 +y'+a2-y=e, av’ +2y +2 —-2%=1-t
(j) $x' - 3x + y = 4\sin 2t$, $3x + y' - y = 6$
(k) a” =a~dy, y= 20 y
(l) $x'' = x - 2y$, $y'' = 2x - 4y$
(m) «” —-x2+2y=0, Wa+y"+dy=1-t?
(n) a” —a+3y=0, y +a+y=4
(o) 2! +a+y = 24, y! + 38a~ y = -8
(p) al! _L 3a" yl" ~y" = — y +6
(q) (2D? + 3)z + (2D + 1)y = 4e**—7, $Dx + (D - 2)y = 2$

6. (a)-(q) Find the general solution of the corresponding problem in Exercise 5 using computer software. Separately, make up any set of initial conditions, and use the computer to find the particular solution corresponding to those initial conditions.

7. (Mass-spring system of Examples 3 and 8) (a) Derive the particular solutions (38) and (39) from the general solution (35) by applying the given sets of initial conditions.
(b) Evaluate $G$, $H$, $\phi$, $\psi$ for the initial conditions $x_1(0) = 1$, $x_2(0) = x_1'(0) = x_2'(0) = 0$, and show that both modes are present in the solution. Obtain a computer plot of $x_1(t)$ and $x_2(t)$, over $0 \le t \le 20$ (so as to show several cycles).
(c) Same as (b), for $x_1'(0) = 1$, $x_1(0) = x_2(0) = x_2'(0) = 0$.
(d) Same as (b), for $x_1(0) = x_2(0) = 0$, $x_1'(0) = 2$, $x_2'(0) = 3$.
(e) Same as (b), for x, (0) x3(0) = -1.

8. (Chemical kinetics) Two substances, with concentrations $x(t)$ and $y(t)$, react to form a third substance, with concentration $z(t)$. The reaction is governed by the system $x' + \alpha x = 0$, $z' = \beta y$ and $x + y + z = \gamma$, where $\alpha$, $\beta$, $\gamma$ are known positive constants. Solve for $x(t)$, $y(t)$, $z(t)$, subject to the initial conditions $z(0) = z'(0) = 0$ for these cases:
(a) $\alpha \ne \beta$
(b) $\alpha = \beta$. HINT: Apply l'Hôpital's rule to your answer to part (a).

9. (Motion of a charged mass) Consider a particle of mass $m$, carrying an electrical charge $q$, and moving in a uniform magnetic field of strength $B$. The field is in the positive $z$ direction. The equations of motion of the particle are

$$mx'' = qBy', \qquad my'' = -qBx', \qquad mz'' = 0, \tag{9.1}$$

where $x(t)$, $y(t)$, $z(t)$ are the $x$, $y$, $z$ displacements as a function of the time $t$.
(a) Find the general solution of (9.1) for $x(t)$, $y(t)$, $z(t)$. How many independent arbitrary constants of integration are there?
(b) Show that by a suitable choice of initial conditions the motion can be circular, in a constant $z$ plane, and centered at any desired point $x_0$, $y_0$. Propose such a set of initial conditions.
(c) Besides a circular motion in a constant $z$ plane, are any other types of motion possible? Explain.

10. Show that the given system is singular (i.e., either inconsistent or redundant). If it has no solutions show that; if it has solutions find them.
(a) 2’ ~x2@+y=t
(b) $(D - 1)x + y = t$, $(D^2 - 1)x + (D + 1)y = t + 1$
(c) (D+ Le — Dy = ot, (D? — 1a —(D? — D)y =0
(d) (D D +1)x+ Dy = et, (D? —1)z+(D? - D)y = 3t

Chapter 3 Review

A differential equation is far more tractable, insofar as analytical solution is concerned, if it is linear than if it is nonlinear.
We could see a hint of that even in Chapter 2, where we were able to derive the general solution of the general first-order linear equation $y' + p(x)y = q(x)$ but had success with nonlinear equations only in special cases. In fact, for linear equations of any order (first, second, or higher) a number of important results follow.

The most important is that for an $n$th-order linear differential equation $L[y] = f(x)$, with constant coefficients or not, a general solution is expressible as the sum of a general solution $y_h(x)$ to the homogeneous equation $L[y] = 0$, and any particular solution $y_p(x)$ to the full equation $L[y] = f$:

$$y(x) = y_h(x) + y_p(x).$$

In turn, $y_h(x)$ is expressible as an arbitrary linear combination of any $n$ LI (linearly independent) solutions of $L[y] = 0$:

$$y_h(x) = C_1y_1(x) + \cdots + C_ny_n(x).$$

Thus, linear independence is introduced early, in Section 3.2, and theorems are provided for testing a given set of functions to see if they are LI or not. We then show how to find the solutions $y_1(x), \ldots, y_n(x)$ for the following two extremely important cases: for constant-coefficient equations and for Cauchy-Euler equations.

For constant-coefficient equations the idea is to seek $y_h(x)$ in the exponential form $e^{\lambda x}$. Putting that form into $L[y] = 0$ gives an $n$th-degree polynomial equation on $\lambda$, called the characteristic equation. Each nonrepeated root $\lambda_j$ contributes a solution $e^{\lambda_j x}$, and each repeated root $\lambda_j$ of order $k$ contributes $k$ solutions $e^{\lambda_j x}, xe^{\lambda_j x}, \ldots, x^{k-1}e^{\lambda_j x}$.

For Cauchy-Euler equations the form $e^{\lambda x}$ does not work. Rather, the idea is to seek $y_h(x)$ in the power form $x^{\lambda}$. Each nonrepeated root $\lambda_j$ contributes a solution $x^{\lambda_j}$, and each repeated root $\lambda_j$ of order $k$ contributes $k$ solutions $x^{\lambda_j}, (\ln x)x^{\lambda_j}, \ldots, (\ln x)^{k-1}x^{\lambda_j}$.

Two different methods are put forward for finding particular solutions, the method of undetermined coefficients and Lagrange's method of variation of parameters.
Undetermined coefficients is easier to apply but is subject to the conditions that (i) besides being linear, $L$ must be of constant-coefficient type, and (ii) repeated differentiation of each term in $f$ must produce only a finite number of LI terms.

Variation of parameters, on the other hand, merely requires $L$ to be linear. According to the method, we vary the parameters (i.e., the constants of integration in $y_h$) $C_1, \ldots, C_n$, and seek $y_p(x) = C_1(x)y_1(x) + \cdots + C_n(x)y_n(x)$. Putting that form into the given differential equation gives one condition on the $C_j(x)$'s. That condition is augmented by $n - 1$ additional conditions that are designed to preclude the presence of derivatives of the $C_j(x)$'s that are of order higher than first.

In Section 3.8 we study the harmonic oscillator, both damped and undamped, both free and driven. Of special interest are the concepts of natural frequency for the undamped case, critical damping, amplitude- and frequency-response curves, resonance, and beats. This application is of great importance in engineering and science and should be understood thoroughly.

Finally, Section 3.9 is devoted to systems of linear differential equations. We give an existence/uniqueness theorem and show how to solve systems by elimination.

Chapter 4
Power Series Solutions

PREREQUISITES: This chapter presumes a familiarity with the complex plane and the algebra of complex numbers, material which is covered in Section 21.2.

4.1 Introduction

In Chapter 2 we presented a number of methods for obtaining analytical closed form solutions of first-order differential equations, some of which methods could be applied even to nonlinear equations. In Chapter 3 we studied equations of second order and higher, and found them to be more difficult.
Restricting our discussion to linear equations, even then we were successful in developing solutions only for the (important) cases of equations with constant coefficients and Cauchy-Euler equations. We also found that we can solve nonconstant-coefficient equations if we can factor the differential operator, but such factorization can be accomplished only in exceptional cases.

In Chapter 4 we continue to restrict our discussion to linear equations, but we now study nonconstant-coefficient equations. That case is so much more difficult than the constant-coefficient case that we do two things: we consider only second-order equations, and we give up on the prospect of finding solutions in closed form and seek solutions in the form of infinite series.

To illustrate the idea that is developed in the subsequent sections, consider the simple example

$$\frac{dy}{dx} + y = 0. \tag{1}$$

To solve by the series method, we seek a solution in the form of a power series expansion about any desired point $x = x_0$, $y(x) = \sum_{n=0}^{\infty} a_n(x - x_0)^n$, where the $a_n$ coefficients are to be determined so that the assumed form satisfies the given differential equation (1), if possible. If we choose $x_0 = 0$ for simplicity, then

$$y(x) = \sum_{n=0}^{\infty} a_nx^n = a_0 + a_1x + a_2x^2 + \cdots \tag{2a}$$

and

$$\frac{dy}{dx} = \frac{d}{dx}\left(a_0 + a_1x + a_2x^2 + \cdots\right) = a_1 + 2a_2x + 3a_3x^2 + \cdots. \tag{2b}$$

Putting (2a,b) into (1) gives

$$\left(a_1 + 2a_2x + 3a_3x^2 + \cdots\right) + \left(a_0 + a_1x + a_2x^2 + \cdots\right) = 0, \tag{3}$$

or, rearranging terms,

$$(a_1 + a_0) + (2a_2 + a_1)x + (3a_3 + a_2)x^2 + \cdots = 0. \tag{4}$$

If we realize that the right side of (4) is really $0 + 0x + 0x^2 + \cdots$, then, by equating coefficients of like powers of $x$ on both sides of (4), we obtain $a_1 + a_0 = 0$, $2a_2 + a_1 = 0$, $3a_3 + a_2 = 0$, and so on. Thus,

$$a_1 = -a_0, \qquad a_2 = -a_1/2 = -(-a_0)/2 = a_0/2, \qquad a_3 = -a_2/3 = -(a_0/2)/3 = -a_0/6, \tag{5}$$

and so on, where $a_0$ remains arbitrary. Thus, we have

$$y(x) = a_0\left(1 - x + \frac{1}{2}x^2 - \frac{1}{6}x^3 + \cdots\right) \tag{6}$$

as the general solution to (1). Here, $a_0$ is the constant of integration; we could rename it $C$, for example, if we wish.
Thus, we have the solution, not in closed form but as a power series. In this simple example we are able to "sum the series" into closed form, that is, to identify it as the Taylor series of $e^{-x}$, so that our general solution is really $y(x) = Ce^{-x}$. However, for nonconstant-coefficient differential equations we are generally not so fortunate, and must leave the solution in series form.

As simple as the above steps appear, there are several questions that need to be addressed before we can have confidence in the result given by (6):

(i) In (2b) we differentiated an infinite series term by term. That is, we interchanged the order of the differentiation and the summation and wrote

$$\frac{d}{dx}\sum a_nx^n = \sum \frac{d}{dx}\left(a_nx^n\right). \tag{7}$$

That step looks reasonable, but observe that it amounts to an interchange in the order of two operations, the summation and the differentiation, and it is possible that reversing their order might give different results. For instance, do we get the same results if we put toothpaste on our toothbrush and then brush, or if we brush and then put toothpaste on the brush?

(ii) Re-expressing (3) in the form of (4) is based on a supposition that we can add series term by term:

$$\sum A_n + \sum B_n = \sum (A_n + B_n). \tag{8}$$

Again, that step looks reasonable, but is it necessarily correct?

(iii) Finally, inferring (5) from (4) is based on a supposition that if

$$\sum A_nx^n = \sum B_nx^n \tag{9}$$

for all $x$ in some interval of interest, then it must be true that $A_n = B_n$ for each $n$. Though reasonable, does it really follow that for the sums to be the same the corresponding individual terms need to be the same?

Thus, there are some technical questions that we need to address, and we do that in the next section. Our approach, in deriving (6), was heuristic, not rigorous, since we did not attend to the issues mentioned above.
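The recurrence behind (5), namely $(n+1)a_{n+1} + a_n = 0$, is easy to run mechanically. The sketch below is a numerical sanity check only, not a proof, and the function names `series_coefficients` and `partial_sum` are hypothetical. It generates exact coefficients with $a_0 = 1$ and compares a partial sum of (6) with $e^{-x}$ at one sample point.

```python
from fractions import Fraction
import math


def series_coefficients(n_terms, a0=Fraction(1)):
    """Coefficients a_0, ..., a_{n_terms-1} from (n + 1) a_{n+1} + a_n = 0."""
    coeffs = [a0]
    for n in range(n_terms - 1):
        coeffs.append(-coeffs[n] / (n + 1))
    return coeffs


def partial_sum(coeffs, x):
    """Evaluate the truncated power series sum of a_n x^n."""
    return sum(float(c) * x**n for n, c in enumerate(coeffs))


coeffs = series_coefficients(15)
print(coeffs[:4])                                   # a_0 .. a_3 as exact fractions
print(partial_sum(coeffs, 0.5), math.exp(-0.5))     # truncated series vs. exp(-x)
```

Exact fractions are used for the coefficients so that the pattern $a_n = (-1)^n a_0/n!$ is visible without rounding; only the final evaluation is done in floating point.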
We can sidestep the several questions of rigor that arose in deriving the series (6) if, instead, we verify, a posteriori, that (6) does satisfy the given differential equation (1). However, that procedure begs exactly the same questions: termwise differentiation of the series, termwise addition of series, and equating the coefficients of like powers of $x$ on both sides of the equation.

Here is a brief outline of this chapter:

4.2 Power Series Solutions. In Section 4.2, we review infinite series, power series, and Taylor series, then we show how to find solutions to the equation $y'' + p(x)y' + q(x)y = 0$ in the form of a power series about a chosen point $x_0$ if $p(x)$ and $q(x)$ are sufficiently well-behaved at $x_0$.

4.3 The Method of Frobenius. If $p(x)$ and $q(x)$ are not sufficiently well-behaved at $x_0$, then the singular behavior of $p$ and/or $q$ gets passed on in some form to the solutions of the differential equation; hence those solutions cannot be found in power series form. Yet, if $p(x)$ and $q(x)$ are not too singular at $x_0$, then solutions can still be found, but in a more general form, a so-called Frobenius series. Section 4.3 puts forward the theoretical base for such solutions and the procedure whereby to obtain them.

4.4 Legendre Functions. This section focuses on a specific important example, the Legendre equation $(1 - x^2)y'' - 2xy' + \lambda y = 0$, where $\lambda$ is a constant.

4.5 Singular Integrals; Gamma Function. Singular integrals are defined and their convergence is discussed. An important singular integral, the gamma function, is introduced and studied.

4.6 Bessel Functions. Besides the Legendre equation, we need to study the extremely important Bessel equation, $x^2y'' + xy' + (x^2 - \nu^2)y = 0$, where $\nu$ is a constant, but preparatory to that study we first need to introduce singular integrals and the gamma function, which will be needed again in Chapter 5 in any case.

4.2 Power Series Solutions

4.2.1. Review of power series.
Whereas a finite sum,

$$\sum_{k=1}^{N} a_k = a_1 + a_2 + \cdots + a_N, \tag{1}$$

is well-defined thanks to the commutative and associative laws of addition, an infinite sum, or infinite series,

$$\sum_{k=1}^{\infty} a_k = a_1 + a_2 + a_3 + \cdots, \tag{2}$$

is not. For example, is the series $\sum_{k=1}^{\infty}(-1)^{k-1} = 1 - 1 + 1 - 1 + \cdots$ equal to $(1 - 1) + (1 - 1) + \cdots = 0 + 0 + \cdots = 0$? Is it (by grouping differently) $1 - (1 - 1) - (1 - 1) - \cdots = 1 - 0 - 0 - \cdots = 1$? In fact, besides grouping the numbers in different ways we could rearrange their order as well. The point, then, is that (2) is not self-explanatory, it needs to be defined; we need to decide, or be told, how to do the calculation.

To give the traditional definition of (2), we first define the sequence of partial sums of the series (2) as

$$s_1 = a_1, \qquad s_2 = a_1 + a_2, \qquad s_3 = a_1 + a_2 + a_3, \tag{3}$$

and so on:

$$s_n = \sum_{k=1}^{n} a_k, \tag{4}$$

where $a_k$ is called the $k$th term of the series. If the limit of the sequence $s_n$ exists, as $n \to \infty$, and equals some number $s$, then we say that the series (2) is convergent, and that it converges to $s$; otherwise it is divergent. That is, an infinite series is defined as the limit (if that limit exists) of its sequence of partial sums:

$$\sum_{k=1}^{\infty} a_k \equiv \lim_{n\to\infty}\sum_{k=1}^{n} a_k = \lim_{n\to\infty} s_n = s. \tag{5}$$

That definition, known as ordinary convergence, is not the only one possible. For instance, another definition, due to Cesàro, is discussed in the exercises. However, ordinary convergence is the traditional definition and is the one that is understood unless specifically stated otherwise.

Recall from the calculus that by $\lim_{n\to\infty} s_n = s$, in (5), we mean that to each number $\epsilon > 0$, no matter how small, there exists an integer $N$ such that $|s - s_n| < \epsilon$ for all $n > N$. (Logically, the words "no matter how small" are unnecessary, but we include them for emphasis.) In general, the smaller the chosen $\epsilon$, the larger the $N$ that is needed, so that $N$ is a function of $\epsilon$.
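The definition (3)-(5) can be made concrete by tabulating partial sums. The sketch below is illustrative only; the function names are hypothetical, and the Cesàro means are computed under the usual definition (the running averages of the partial sums), which is an assumption here since the text defers that definition to the exercises. It contrasts the series $1 - 1 + 1 - 1 + \cdots$ with the geometric series $\tfrac12 + \tfrac14 + \tfrac18 + \cdots$.

```python
from itertools import accumulate


def partial_sums(terms):
    """Return s_1, s_2, ..., s_n for the given list of terms a_1, ..., a_n."""
    return list(accumulate(terms))


def cesaro_means(terms):
    """Return the running averages (s_1 + ... + s_n) / n of the partial sums."""
    s = partial_sums(terms)
    running = list(accumulate(s))
    return [running[n] / (n + 1) for n in range(len(s))]


grandi = [(-1) ** k for k in range(8)]          # 1, -1, 1, -1, ...
geometric = [0.5 ** k for k in range(1, 30)]    # 1/2, 1/4, 1/8, ...

print(partial_sums(grandi))         # oscillates between 1 and 0: no limit
print(cesaro_means(grandi))         # the averages settle toward 1/2
print(partial_sums(geometric)[-1])  # approaches 1
```

The oscillating partial sums illustrate divergence in the ordinary sense, while the averaged sequence shows how a different definition of "sum" can assign a value to the same series.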
The significance of the limit concept cannot be overstated, for in mathematics it is often as limits of "old things" that we introduce "new things." For instance, the derivative is introduced as the limit of a difference quotient, the Riemann integral is introduced as the limit of a sequence of Riemann sums, infinite series are introduced as limits of sequences of partial sums, and so on.

To illustrate the definition of convergence given above, consider two simple examples. The series $1 + 1 + 1 + \cdots$ diverges because $s_n = n$ fails to approach a limit as $n \to \infty$. However, for a series to diverge its partial sums need not grow unboundedly. For instance, the series $1 - 1 + 1 - 1 + \cdots$, mentioned above, diverges because its sequence of partial sums (namely, $1, 0, 1, 0, 1, \ldots$) fails to approach a limit. Of course, determining whether a series is convergent or divergent is usually much harder than for these examples. Ideally, one would like a theorem that gives necessary and sufficient conditions for convergence. Here is such a theorem.

THEOREM 4.2.1 Cauchy Convergence Theorem
An infinite series is convergent if and only if its sequence of partial sums $s_n$ is a Cauchy sequence; that is, if to each $\epsilon > 0$ (no matter how small) there corresponds an integer $N(\epsilon)$ such that $|s_m - s_n| < \epsilon$ for all $m$ and $n$ greater than $N$.

Unfortunately, this theorem is difficult to apply, so one develops (in the calculus) an array of theorems (i.e., tests for convergence/divergence), each of which is more specialized (and hence less powerful) than the Cauchy convergence theorem, but easier to apply. For instance, if in Theorem 4.2.1 we set $m = n - 1$, then the stated condition becomes: to each $\epsilon > 0$ (no matter how small) there corresponds an integer $N(\epsilon)$ such that $|s_m - s_n| = |a_n| < \epsilon$ for all $n > N$. The latter is equivalent to saying that $a_n \to 0$ as $n \to \infty$. Thus, we have the specialized, but readily applied, theorem that for the series $\sum^{\infty} a_n$ to converge, it is necessary (but not sufficient) that $a_n \to 0$ as $n \to \infty$.
From this theorem it follows immediately that the series $1 + 1 + 1 + \cdots$ and $1 - 1 + 1 - 1 + \cdots$, cited above, both diverge because in each case the terms do not tend to zero.

Let us now focus on the specific needs of this chapter, power series, that is, series of the form

$$\sum_{n=0}^{\infty} a_n (x - x_0)^n = a_0 + a_1 (x - x_0) + a_2 (x - x_0)^2 + \cdots, \tag{6}$$

where the $a_n$'s are numbers called the coefficients of the series, $x$ is a variable, and $x_0$ is a fixed point called the center of the series. We say that the expansion is "about the point $x_0$." In a later chapter we study complex series, but in this chapter we restrict all variables and constants to be real. Notice that the quantity $(x - x_0)^n$ on the left side of (6) is the indeterminate form $0^0$ when $n = 0$ and $x = x_0$; that form must be interpreted as 1 if the leading term of the series is to be $a_0$, as desired.

The terms in (6) are now functions of $x$ rather than numbers, so that the series may converge at some points on the $x$ axis and diverge at others. At the very least (6) converges at $x = x_0$ since then it reduces to the single term $a_0$.

THEOREM 4.2.2 Interval of Convergence of Power Series
The power series (6) converges at $x = x_0$. If it converges at other points as well, then those points necessarily comprise an interval $|x - x_0| < R$ centered at $x_0$ and, possibly, one or both endpoints of that interval (Fig. 1), where $R$ can be determined from either of the formulas

$$R = \frac{1}{\displaystyle\lim_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right|} \qquad \text{or} \qquad R = \frac{1}{\displaystyle\lim_{n\to\infty} \sqrt[n]{|a_n|}}, \tag{7a,b}$$

if the limits in the denominators exist and are nonzero. If the limits in (7a,b) are zero, then (6) converges for all $x$ (i.e., for every finite $x$, no matter how large), and we say that "$R = \infty$." If the limits fail to exist by virtue of being infinite, then $R = 0$ and (6) converges only at $x_0$.

Figure 1. Interval of convergence of power series.

We call $|x - x_0| < R$ the interval of convergence, and $R$ the radius of convergence.
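Formula (7a) can be explored numerically by evaluating $|a_n / a_{n+1}|$ at a finite index $n$. The sketch below does so for three coefficient sequences chosen only for illustration (they are assumptions, not examples from the text); a finite-$n$ ratio merely suggests the limit and does not prove it.

```python
from fractions import Fraction
from math import factorial

def ratio_estimate(coeff, n):
    """Finite-n version of (7a): |a_n / a_{n+1}|, i.e. 1 / |a_{n+1} / a_n|."""
    return abs(Fraction(coeff(n)) / Fraction(coeff(n + 1)))

# a_n = 3^n: the ratio is exactly 1/3 for every n, suggesting R = 1/3.
r_geometric = ratio_estimate(lambda n: 3 ** n, 40)

# a_n = n + 1: the ratio (n+1)/(n+2) creeps up toward 1, suggesting R = 1.
r_linear = [ratio_estimate(lambda n: n + 1, n) for n in (10, 100, 1000)]

# a_n = 1/n!: the ratio n + 1 grows without bound, the "R = infinity" case.
r_factorial = [ratio_estimate(lambda n: Fraction(1, factorial(n)), n)
               for n in (10, 100, 1000)]

print(r_geometric, [float(r) for r in r_linear], r_factorial)
```

Exact rational arithmetic (`Fraction`) is used so that the large factorials do not overflow or lose precision.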
If a power series converges to a function $f$ on some interval, we say that it represents $f$ on that interval, and we call $f$ its sum function.

EXAMPLE 1. Consider $\sum_{0}^{\infty} n!\, x^n$, so $a_n = n!$ and $x_0 = 0$. Then (7a) is easier to apply than (7b), and gives

$$R = 1 \Big/ \lim_{n\to\infty} \left| \frac{(n+1)!}{n!} \right| = 1 \Big/ \lim_{n\to\infty} (n+1) = 1/\infty = 0,$$

so the series converges only at $x = x_0 = 0$.

EXAMPLE 2. Consider $\sum_{0}^{\infty} (-1)^n \left[ (x+5)/2 \right]^n$. Then $a_n = (-1)^n / 2^n$, $x_0 = -5$, and (7a) gives

$$R = 1 \Big/ \lim_{n\to\infty} \left| \frac{(-1)^{n+1}}{2^{n+1}} \, \frac{2^n}{(-1)^n} \right| = 1 \Big/ \left( \tfrac{1}{2} \right) = 2,$$

so the series converges in $|x + 5| < 2$ and diverges in $|x + 5| > 2$. For $|x + 5| = 2$ ($x = -7, -3$) the theorem gives no information. However, we see that for $x = -7$ and $-3$ the terms do not tend to zero as $n \to \infty$, so the series diverges for $x = -7$ and $-3$.

EXAMPLE 3. Consider $\sum_{0}^{\infty} \dfrac{(x-1)^n}{(n+1)^n}$. Then $a_n = (n+1)^{-n}$, $x_0 = 1$, and (7b) gives

$$R = 1 \Big/ \lim_{n\to\infty} \sqrt[n]{(n+1)^{-n}} = 1 \Big/ \lim_{n\to\infty} \frac{1}{n+1} = 1/0 = \infty,$$

so the series converges for all $x$; that is, the interval of convergence is $|x - 1| < \infty$.

EXAMPLE 4. Consider the series

$$\sum_{0}^{\infty} \frac{(x-3)^{2n}}{5^n}. \tag{8}$$

This series is not of the form (6) because the powers of $x - 3$ proceed in steps of 2. However, if we set $X = (x-3)^2$, then we have the standard form $\sum_{0}^{\infty} a_n X^n$, with $a_n = 1/5^n$ and

$$R = 1 \Big/ \lim_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right| = 1 \Big/ \lim_{n\to\infty} \frac{5^n}{5^{n+1}} = 5.$$

Thus, $R = 5$, and the series converges in $|X| < 5$ (i.e., $|x - 3| < \sqrt{5}$), and diverges in $|X| > 5$ (i.e., $|x - 3| > \sqrt{5}$).

Recall from our introductory example, in Section 4.1, that several questions arose regarding the manipulation of power series. The following theorem answers those questions and, therefore, will be needed when we apply the power series method of solution.

THEOREM 4.2.3 Manipulation of Power Series
(a) Termwise differentiation (or integration) permissible. A power series may be differentiated (or integrated) termwise (i.e., term by term) within its interval of convergence $I$. The series that results has the same interval of convergence $I$ and represents the derivative (or integral) of the sum function of the original series.
(b) Termwise addition (or subtraction or multiplication) permissible. Two power series (about the same point $x_0$) may be added (or subtracted or multiplied) termwise within their common interval of convergence $I$. The series that results has the same interval of convergence $I$ and represents the sum (or difference or product) of their two sum functions.

(c) If two power series are equal, then their corresponding coefficients must be equal. That is, for

$$\sum_{0}^{\infty} a_n (x - x_0)^n = \sum_{0}^{\infty} b_n (x - x_0)^n \tag{9}$$

to hold in some common interval of convergence, it must be true that $a_n = b_n$ for each $n$. In particular, if

$$\sum_{0}^{\infty} a_n (x - x_0)^n = 0 \tag{10}$$

in some interval, then each $a_n$ must be zero.

Part (a) means that if $f(x) = \sum_{0}^{\infty} a_n (x - x_0)^n$ within $I$, then

$$f'(x) = \sum_{0}^{\infty} \frac{d}{dx} \left[ a_n (x - x_0)^n \right] = \sum_{1}^{\infty} n a_n (x - x_0)^{n-1} \tag{11}$$

and

$$\int_a^b f(x)\, dx = \sum_{0}^{\infty} a_n \int_a^b (x - x_0)^n \, dx = \sum_{0}^{\infty} a_n \frac{(b - x_0)^{n+1} - (a - x_0)^{n+1}}{n+1} \tag{12}$$

within $I$, where $a$, $b$ are any two points within $I$. Part (b) means that if $f(x) = \sum_{0}^{\infty} a_n (x - x_0)^n$ and $g(x) = \sum_{0}^{\infty} b_n (x - x_0)^n$ on $I$, then

$$f(x) \pm g(x) = \sum_{0}^{\infty} (a_n \pm b_n)(x - x_0)^n, \tag{13}$$

and, with $z = x - x_0$ for brevity,

$$\begin{aligned}
f(x) g(x) &= \left( \sum_{0}^{\infty} a_n z^n \right) \left( \sum_{0}^{\infty} b_n z^n \right) = (a_0 + a_1 z + \cdots)(b_0 + b_1 z + \cdots) \\
&= a_0 (b_0 + b_1 z + b_2 z^2 + \cdots) + a_1 z (b_0 + b_1 z + b_2 z^2 + \cdots) + a_2 z^2 (b_0 + b_1 z + b_2 z^2 + \cdots) + \cdots \\
&= a_0 b_0 + (a_0 b_1 + a_1 b_0) z + (a_0 b_2 + a_1 b_1 + a_2 b_0) z^2 + \cdots \\
&= \sum_{n=0}^{\infty} (a_0 b_n + a_1 b_{n-1} + \cdots + a_n b_0) z^n
\end{aligned} \tag{14}$$

within $I$. The series on the right-hand side of (14) is known as the Cauchy product of the two series. Of course, if the two convergence intervals have different radii, then the common interval means the smaller of the two.

In summary, we see that convergent power series can be manipulated in essentially the same way as if they were finite-degree polynomials.

The last items to address, before coming to the power series method of solution of differential equations, are Taylor series and analyticity. Recall from the calculus that the Taylor series of a given function $f(x)$ about a chosen point $x_0$, which we denote here as $\mathrm{TS}\, f \big|_{x_0}$, is defined as the infinite series

$$\mathrm{TS}\, f \big|_{x_0} = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \cdots = \sum_{n=0}^{\infty} \frac{f^{(n)}(x_0)}{n!} (x - x_0)^n, \tag{15}$$

where $0! = 1$. The purpose of Taylor series is to represent the given function, so the fundamental question is: does it? Does the Taylor series really converge to $f(x)$ on some $x$ interval, in which case we can write, in place of (15),

$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(x_0)}{n!} (x - x_0)^n? \tag{16}$$

For that to be the case we need three conditions to be met:

(i) First, we need $f$ to have a Taylor series (15) about that point. Namely, $f$ must be infinitely differentiable at $x_0$ so that all of the coefficients $f^{(n)}(x_0)/n!$ in (15) exist.

(ii) Second, we need the resulting series in (15) to converge in some interval $|x - x_0| < R$, for $R > 0$.

(iii) Third, we need the sum of the Taylor series to equal $f$ in the interval, so that the Taylor series represents $f$ over that interval, which is, after all, our objective.

The third condition might seem strange, for how could the Taylor series of $f(x)$ converge, but to something other than $f(x)$? Such cases can indeed be put forward, but they are somewhat pathological and not likely to be encountered in applications.

If a function is represented in some nonzero interval $|x - x_0| < R$ by its Taylor series [i.e., $\mathrm{TS}\, f \big|_{x_0}$ exists, and converges to $f(x)$ there], then $f$ is said to be analytic at $x_0$. If a function is not analytic at $x_0$, then it is singular there.

Most functions encountered in applications are analytic for all $x$, or for all $x$ with the exception of one or more points called singular points of $f$. (Of course, the points are not singular, the function is.) For instance, polynomial functions, $\sin x$, $\cos x$, $e^x$, and $e^{-x}$ are analytic for all $x$. On the other hand, $f(x) = 1/(x - 1)$ is analytic for all $x$ except $x = 1$, where $f$ and all of its derivatives are undefined (fail to exist). The function $f(x) = \tan x = \sin x / \cos x$ is analytic for all $x$ except $x = n\pi/2$ ($n = \pm 1, \pm 3, \ldots$), where it is undefined because $\cos x$ vanishes in the denominator. The function $f(x) = x^{4/3}$ is analytic for all $x$ except $x = 0$, for even though $f(0)$ and $f'(0)$ exist, the subsequent derivatives $f''(0), f'''(0), \ldots$
do not (Fig. 2). In fact, $f(x) = x^{\alpha}$ is singular at $x = 0$ for any noninteger value of $\alpha$.

Figure 2. $f(x) = x^{4/3}$ and its first two derivatives.

Observe that there is a subtle difficulty here. We know how to test a given Taylor series for convergence since a Taylor series is a power series, and Theorem 4.2.2 on power series convergence even gives formulas for determining the radius of convergence $R$. But how can we determine if the sum function (i.e., the function to which the series converges) is the same as the original function $f$? We won't be able to answer this question until we study complex variable theory, in later chapters. However, we repeat that the cases where the Taylor series of $f$ converges, but not to $f$, are exceptional and will not occur in the present chapter, so it will suffice to understand analyticity at $x_0$ to correspond to the convergence of the Taylor series in some nonzero interval about $x_0$.

In fact, it is also exceptional for $f$ to have a Taylor series about a point (i.e., be infinitely differentiable at that point) and to have that Taylor series fail to converge in some nonzero interval about $x_0$. Thus, as a rule of thumb that will suffice until we study complex variable theory, we will test a function for analyticity at a given point simply by seeing if it is infinitely differentiable at that point.

4.2.2. Power series solution of differential equations. We can now state the following basic theorem.

THEOREM 4.2.4 Power Series Solution
If $p$ and $q$ are analytic at $x_0$, then every solution of

$$y'' + p(x) y' + q(x) y = 0 \tag{17}$$

is too, and can therefore be found in the form

$$y(x) = \sum_{0}^{\infty} a_n (x - x_0)^n. \tag{18}$$

Further, the radius of convergence of every solution (18) is at least as large as the smaller of the radii of convergence of $\mathrm{TS}\, p \big|_{x_0}$ and $\mathrm{TS}\, q \big|_{x_0}$.

Although we will not prove this theorem, we shall explain why one can expect it to be true.
Since $p$ and $q$ are analytic at the chosen point $x_0$, they admit convergent Taylor series about $x_0$, so that we can write (17) as

$$y'' + \left[ p(x_0) + p'(x_0)(x - x_0) + \cdots \right] y' + \left[ q(x_0) + q'(x_0)(x - x_0) + \cdots \right] y = 0. \tag{19}$$

Locally, near $x_0$, we can approximate (19) as

$$y'' + p(x_0) y' + q(x_0) y = 0,$$

all solutions of which are either exponential or $x$ times an exponential, and are therefore analytic and representable in the form (18), as claimed.

In many applications, $p(x)$ and $q(x)$ are rational functions, that is, one polynomial in $x$ divided by another. Let $F(x) = N(x)/D(x)$ be any rational function, where the numerator and denominator polynomials are $N(x)$ and $D(x)$, respectively, and where any common factors have been canceled. It will be shown, when we study complex variable theory, that $F(x)$ is singular only at those points in the complex plane where $D = 0$ (at the zeros of $D$), so that a Taylor expansion of $F$ about a point $x_0$ on the $x$ axis will have a radius of convergence which, we know in advance, will be equal to the distance from $x_0$ on the $x$ axis to the nearest zero of $D$ in the complex plane. For instance, if $F(x) = (2 + 3x)/[(4 + x)(9 + x^2)]$, then $D$ has zeros at $-4$ and $\pm 3i$. Thus, if we expand $F$ about $x = 2$, say, then the radius of convergence will be the distance from 2 to the nearest zero, which is $+3i$ (or, equally, $-3i$), namely, $\sqrt{13}$ (Fig. 3). If, instead, we expand about $x = -6$, say, then the radius of convergence will be 2, the distance from $-6$ to the zero of $D$ at $-4$.

EXAMPLE 5. Solve

$$y'' + y = 0 \tag{20}$$

by the power series method. Of course, this equation is elementary. We know the solution and do not need the power series method to find it. Let us use it nevertheless, as a first example, to illustrate the method.

We can choose the point of expansion $x_0$ in (18) as any point at which both $p(x)$ and $q(x)$ in (17) are analytic. In the present example, $p(x) = 0$ and $q(x) = 1$ are analytic for all $x$, so we can choose the point of expansion $x_0$ to be whatever we like.
Let $x_0 = 0$, for simplicity. Then Theorem 4.2.4 assures us that all solutions can be found in the form

$$y(x) = \sum_{0}^{\infty} a_n x^n, \tag{21a}$$

and their series will have infinite radii of convergence.

Figure 3. Disks of convergence in $z$ plane ($z = x + iy$).

Within that (infinite) interval of convergence we have, from Theorem 4.2.3(a),

$$y'(x) = \sum_{1}^{\infty} n a_n x^{n-1}, \tag{21b}$$

$$y''(x) = \sum_{2}^{\infty} n(n-1) a_n x^{n-2}, \tag{21c}$$

so (20) becomes

$$\sum_{n=2}^{\infty} n(n-1) a_n x^{n-2} + \sum_{n=0}^{\infty} a_n x^n = 0. \tag{22}$$

We would like to combine the two sums, but to do that we need the exponents of $x$ to be the same, whereas they are $n - 2$ and $n$. To have the same exponents, let us set $n - 2 = m$ in the first sum, just as one might make a change of variables in an integral. In the integral, one would then need to change the integration limits, if necessary, consistent with the change of variables; here, we need to do likewise with the summation limits. With $n - 2 = m$, $n = \infty$ corresponds to $m = \infty$, and $n = 2$ corresponds to $m = 0$, so (22) becomes

$$\sum_{m=0}^{\infty} (m+2)(m+1) a_{m+2} x^m + \sum_{n=0}^{\infty} a_n x^n = 0. \tag{23}$$

Next, and we shall explain this step in a moment, let $m = n$ in the first sum in (23):

$$\sum_{n=0}^{\infty} (n+2)(n+1) a_{n+2} x^n + \sum_{n=0}^{\infty} a_n x^n = 0 \tag{24}$$

or, with the help of Theorem 4.2.3(b),

$$\sum_{n=0}^{\infty} \left[ (n+2)(n+1) a_{n+2} + a_n \right] x^n = 0. \tag{25}$$

Finally, it follows from Theorem 4.2.3(c) that each coefficient in (25) must be zero:

$$(n+2)(n+1) a_{n+2} + a_n = 0. \qquad (n = 0, 1, 2, \ldots) \tag{26}$$

Before using (26), let us explain our setting $m = n$ in (23) since that step might seem to contradict the preceding change of variables $n - 2 = m$. The point to appreciate is that $m$ in the first sum in (23) is a dummy index just as $t$ is a dummy variable in $\int_0^1 t^2 \, dt$. (We shall use the word index for a discrete variable; $m$ takes on only integer values, not a continuous range of values.) Just as $\int_0^1 t^2 \, dt = \int_0^1 r^2 \, dr = \int_0^1 x^2 \, dx = \cdots = \tfrac{1}{3}$, the sums in (23) are insensitive to whether the dummy index is $m$ or $n$:

$$\sum_{m=0}^{\infty} (m+2)(m+1) a_{m+2} x^m = 2a_2 + 6a_3 x + 12 a_4 x^2 + \cdots$$

and

$$\sum_{n=0}^{\infty} (n+2)(n+1) a_{n+2} x^n = 2a_2 + 6a_3 x + 12 a_4 x^2 + \cdots$$

are identical, even though the summation indices are different.

Equation (26) is known as a recursion (or recurrence) formula on the unknown coefficients since it gives us the $n$th coefficient in terms of preceding ones. Specifically,

$$a_{n+2} = -\frac{1}{(n+2)(n+1)} a_n \qquad (n = 0, 1, 2, \ldots) \tag{27}$$

so that

$$\begin{aligned}
n = 0: \quad & a_2 = -\frac{1}{(2)(1)} a_0 = -\frac{1}{2} a_0, \\
n = 1: \quad & a_3 = -\frac{1}{(3)(2)} a_1 = -\frac{1}{6} a_1, \\
n = 2: \quad & a_4 = -\frac{1}{(4)(3)} a_2 = \frac{1}{24} a_0, \\
n = 3: \quad & a_5 = -\frac{1}{(5)(4)} a_3 = \frac{1}{120} a_1,
\end{aligned} \tag{28}$$

and so on, where $a_0$ and $a_1$ remain arbitrary and are, in effect, the integration constants.

Putting these results into (21a) gives

$$\begin{aligned}
y(x) &= a_0 + a_1 x - \frac{1}{2} a_0 x^2 - \frac{1}{6} a_1 x^3 + \frac{1}{24} a_0 x^4 + \frac{1}{120} a_1 x^5 + \cdots \\
&= a_0 \left( 1 - \frac{1}{2!} x^2 + \frac{1}{4!} x^4 - \cdots \right) + a_1 \left( x - \frac{1}{3!} x^3 + \frac{1}{5!} x^5 - \cdots \right)
\end{aligned} \tag{29}$$

or

$$y(x) = a_0 y_1(x) + a_1 y_2(x), \tag{30}$$

where $y_1(x)$ and $y_2(x)$ are the series within the first and second pairs of parentheses, respectively. From their series, we recognize $y_1(x)$ as $\cos x$ and $y_2(x)$ as $\sin x$ but, in general, we can't expect to identify the power series in terms of elementary functions because normally we reserve the power series method for nonelementary equations (except for pedagogical examples such as this). Thus, let us continue to call the series "$y_1(x)$" and "$y_2(x)$."

We don't need to check the series for convergence because Theorem 4.2.4 guarantees that they will converge for all $x$. We should, however, check to see if $y_1$, $y_2$ are LI (linearly independent), so that (30) is a general solution of (20). To do so, it suffices to evaluate the Wronskian $W[y_1, y_2](x)$ at a single point, say $x = 0$:

$$W[y_1, y_2](0) = \begin{vmatrix} y_1(0) & y_2(0) \\ y_1'(0) & y_2'(0) \end{vmatrix} = \begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix} = 1, \tag{31}$$

which is nonzero. It follows from that result and Theorem 3.2.3 that $y_1$, $y_2$ are LI on the entire $x$ axis, so (30) is indeed a general solution of (20) for all $x$. Actually, since there are only two functions it would have been easier to apply Theorem 3.2.4: $y_1$, $y_2$ are LI because neither one is a constant multiple of the other.
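The recursion formula (27) lends itself to direct computation. The following Python sketch (the function names `coefficients` and `evaluate` are illustrative assumptions) generates the coefficients exactly and compares the truncated series with $\cos x$ and $\sin x$.

```python
from fractions import Fraction
from math import cos, sin

def coefficients(a0, a1, count):
    """First `count` coefficients from a_{n+2} = -a_n / ((n+2)(n+1)), Eq. (27)."""
    a = [Fraction(a0), Fraction(a1)]
    for n in range(count - 2):
        a.append(-a[n] / ((n + 2) * (n + 1)))
    return a

def evaluate(a, x):
    """Evaluate the truncated series sum of a_n x^n at the point x."""
    return sum(float(c) * x ** n for n, c in enumerate(a))

# y1: a0 = 1, a1 = 0 (the cosine-like series); y2: a0 = 0, a1 = 1 (sine-like).
y1 = coefficients(1, 0, 12)
y2 = coefficients(0, 1, 12)

print(y1[:5])   # 1, 0, -1/2, 0, 1/24
print(y2[:6])   # 0, 1, 0, -1/6, 0, 1/120
print(evaluate(y1, 0.5) - cos(0.5), evaluate(y2, 0.5) - sin(0.5))
```

Exact fractions make it easy to compare the generated coefficients with (28); the two arbitrary starting values play the role of the integration constants $a_0$ and $a_1$.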
COMMENT 1. To evaluate $y_1(x)$ or $y_2(x)$ at a given $x$, we need to add enough terms of the series to achieve the desired accuracy. For small values of $x$ (i.e., for $x$'s that are close to the point of expansion $x_0$, which in this case is 0) just a few terms may suffice. For example, the first four terms of $y_1(x)$ give $y_1(0.5) = 0.877582$, whereas the exact value is 0.877583 (to six decimal places). As $x$ increases, more and more terms are needed for comparable accuracy. The situation is depicted graphically in Fig. 4, where we plot the partial sum $s_3$ and a higher-order partial sum, along with the sum function $y_1(x)$ (i.e., $\cos x$). Observe that the larger $n$ is, the broader is the $x$ interval over which the $n$-term approximation $s_n$ stays close to the sum function. However, whereas $y_1(x)$ is oscillatory and remains between $-1$ and $+1$, $s_n(x)$ is a polynomial, and therefore it eventually tends to $+\infty$ or $-\infty$ as $x$ increases ($-\infty$ if $n$ is even and $+\infty$ if $n$ is odd).

Figure 4. Partial sums of $y_1(x)$, compared with $y_1(x)$.

Observe that if we do need to add a great many terms, then it is useful to have an expression for the general term in the series. In this example it is not hard to establish that

$$y_1(x) = \sum_{0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!}, \qquad y_2(x) = \sum_{0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!}. \tag{32}$$

COMMENT 2. Besides obtaining the values of the solutions $y_1(x)$ and $y_2(x)$, one is usually interested in determining some of their properties. Some properties can be obtained directly from the series. For instance, in this case we can see that $y_1(-x) = y_1(x)$ and $y_2(-x) = -y_2(x)$ (so that the graphs of $y_1$ and $y_2$ are symmetric and antisymmetric, respectively, about $x = 0$), and that $y_1'(x) = -y_2(x)$ and $y_2'(x) = y_1(x)$. The differential equation is also a source of information.

COMMENT 3. We posed (20) without initial conditions. Suppose that we wish to impose the initial conditions $y(0) = 4$ and $y'(0) = -1$. Then, from (30),

$$\begin{aligned}
y(0) &= 4 = a_0 y_1(0) + a_1 y_2(0), \\
y'(0) &= -1 = a_0 y_1'(0) + a_1 y_2'(0).
\end{aligned} \tag{33}$$

From the series representations of $y_1$ and $y_2$ in (30), we see that $y_1(0) = 1$, $y_2(0) = 0$, $y_1'(0) = 0$, and $y_2'(0) = 1$, so we can solve (33) for $a_0$ and $a_1$: $a_0 = 4$ and $a_1 = -1$, hence the desired particular solution is $y(x) = 4 y_1(x) - y_2(x)$, on $-\infty < x < \infty$.

We can now see more clearly how to select the point of expansion $x_0$, besides selecting it to be a point at which $p$ and $q$ in (17) are analytic. We have emphasized that the series solutions are especially convenient when the calculation point $x$ is close to $x_0$, for then only a few terms of the series may suffice for the desired accuracy. Thus, if our interest is limited to the interval $6 < x < 10$, say, then it would make sense to seek series solutions about a point somewhere in that interval, such as a midpoint or endpoint, rather than about some distant point such as $x = 0$.

In the event that initial conditions are supplied at some point $x_i$, then it is especially helpful to let $x_0$ be $x_i$ because when we apply the initial conditions we will need to know the values of $y_1(x_i)$, $y_2(x_i)$, $y_1'(x_i)$, and $y_2'(x_i)$, as we did in (33). If $x_0$ is other than $x_i$, then each of these evaluations requires the summing of an infinite series, whereas if it is chosen as $x_i$ then these evaluations are trivial (as in Comment 3, above).

EXAMPLE 6. Solve the initial-value problem

$$(x - 1) y'' + y' + 2(x - 1) y = 0, \qquad y(4) = 5, \quad y'(4) = 0 \tag{34}$$

on the interval $4 \le x < \infty$. To get (34) into the standard form $y'' + p(x) y' + q(x) y = 0$, we divide by $x - 1$ (which is permissible since $x - 1 \ne 0$ on the interval of interest):

$$y'' + \frac{1}{x - 1} y' + 2y = 0, \tag{35}$$

so $p(x) = 1/(x - 1)$ and $q(x) = 2$. These are analytic for all $x$ except $x = 1$, where $p(x)$ is undefined. In particular, they are analytic at the initial point $x = 4$, so let us choose $x_0 = 4$ and seek

$$y(x) = \sum_{0}^{\infty} a_n (x - 4)^n. \tag{36}$$

To proceed we can use either form (34) or (35). Since we are expanding each term in the differential equation about $x = 4$, we need to expand $x - 1$ and $2(x - 1)$ if we use (34), or the $1/(x - 1)$ factor if we use (35).
The former is easier since

$$x - 1 = 3 + (x - 4) \tag{37}$$

is merely a two-term Taylor series, whereas (Exercise 6)

$$\frac{1}{x - 1} = \frac{1}{3} \sum_{0}^{\infty} \frac{(-1)^n}{3^n} (x - 4)^n \tag{38}$$

is an infinite series. Thus, let us use (34). Putting (36) and its derivatives and (37) into (34) gives

$$\left[ 3 + (x - 4) \right] \sum_{2}^{\infty} n(n-1) a_n (x - 4)^{n-2} + \sum_{1}^{\infty} n a_n (x - 4)^{n-1} + 2 \left[ 3 + (x - 4) \right] \sum_{0}^{\infty} a_n (x - 4)^n = 0 \tag{39}$$

or, absorbing the $3 + (x - 4)$ terms into the series that they multiply and setting $z = x - 4$ for compactness,

$$\sum_{2}^{\infty} 3n(n-1) a_n z^{n-2} + \sum_{2}^{\infty} n(n-1) a_n z^{n-1} + \sum_{1}^{\infty} n a_n z^{n-1} + \sum_{0}^{\infty} 6 a_n z^n + \sum_{0}^{\infty} 2 a_n z^{n+1} = 0. \tag{40}$$

To adjust all $z$ exponents to $n$, let $n - 2 = m$ in the first sum, $n - 1 = m$ in the second and third, and $n + 1 = m$ in the last:

$$\sum_{0}^{\infty} 3(m+2)(m+1) a_{m+2} z^m + \sum_{1}^{\infty} (m+1) m\, a_{m+1} z^m + \sum_{0}^{\infty} (m+1) a_{m+1} z^m + \sum_{0}^{\infty} 6 a_n z^n + \sum_{1}^{\infty} 2 a_{m-1} z^m = 0. \tag{41}$$

Next, we change all of the $m$ indices to $n$. Then we have $z^n$ in each sum, but we cannot yet combine the five sums because the lower summation limits are not all the same; three are 0 and two are 1. We can handle that problem as follows. The lower limit in the second sum can simply be changed from 1 to 0 because the zeroth term is zero anyhow (due to the $m$ factor). And the lower limit in the last sum can be changed to 0 if we simply agree that the $a_{-1}$ that occurs in the zeroth term be zero by definition. Then (41) becomes

$$\sum_{0}^{\infty} 3(n+2)(n+1) a_{n+2} z^n + \sum_{0}^{\infty} (n+1) n\, a_{n+1} z^n + \sum_{0}^{\infty} (n+1) a_{n+1} z^n + \sum_{0}^{\infty} 6 a_n z^n + \sum_{0}^{\infty} 2 a_{n-1} z^n = 0 \tag{42}$$

or

$$\sum_{0}^{\infty} \left[ 3(n+2)(n+1) a_{n+2} + (n+1)^2 a_{n+1} + 6 a_n + 2 a_{n-1} \right] z^n = 0, \tag{43}$$

with $a_{-1} = 0$. Setting the square-bracketed coefficients to zero for each $n$ [Theorem 4.2.3(c)] then gives the recursion formula

$$3(n+2)(n+1) a_{n+2} + (n+1)^2 a_{n+1} + 6 a_n + 2 a_{n-1} = 0$$

or

$$a_{n+2} = -\frac{n+1}{3(n+2)} a_{n+1} - \frac{2}{(n+2)(n+1)} a_n - \frac{2}{3(n+2)(n+1)} a_{n-1} \tag{44}$$

for $n = 0, 1, 2, \ldots$.
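Thus,

$$\begin{aligned}
n = 0: \quad & a_2 = -\frac{1}{6} a_1 - a_0, \\
n = 1: \quad & a_3 = -\frac{2}{9} a_2 - \frac{1}{3} a_1 - \frac{1}{9} a_0 = -\frac{2}{9} \left( -\frac{1}{6} a_1 - a_0 \right) - \frac{1}{3} a_1 - \frac{1}{9} a_0 = \frac{1}{9} a_0 - \frac{8}{27} a_1, \\
n = 2: \quad & a_4 = -\frac{1}{4} a_3 - \frac{1}{6} a_2 - \frac{1}{18} a_1 = -\frac{1}{4} \left( \frac{1}{9} a_0 - \frac{8}{27} a_1 \right) - \frac{1}{6} \left( -\frac{1}{6} a_1 - a_0 \right) - \frac{1}{18} a_1 = \frac{5}{36} a_0 + \frac{5}{108} a_1,
\end{aligned} \tag{45}$$

and so on, where $a_0$ and $a_1$ remain arbitrary.

Putting these expressions for the $a_n$'s back into (36) then gives

$$\begin{aligned}
y(x) &= a_0 + a_1 (x - 4) + \left( -a_0 - \frac{1}{6} a_1 \right) (x - 4)^2 + \left( \frac{1}{9} a_0 - \frac{8}{27} a_1 \right) (x - 4)^3 + \left( \frac{5}{36} a_0 + \frac{5}{108} a_1 \right) (x - 4)^4 + \cdots \\
&= a_0 \left[ 1 - (x - 4)^2 + \frac{1}{9} (x - 4)^3 + \frac{5}{36} (x - 4)^4 + \cdots \right] \\
&\quad + a_1 \left[ (x - 4) - \frac{1}{6} (x - 4)^2 - \frac{8}{27} (x - 4)^3 + \frac{5}{108} (x - 4)^4 + \cdots \right] \\
&= a_0 y_1(x) + a_1 y_2(x),
\end{aligned} \tag{46}$$

where $y_1(x)$, $y_2(x)$ are the functions represented by the bracketed series. To test $y_1$, $y_2$ for linear independence it is simplest to use Theorem 3.2.4: $y_1$, $y_2$ are LI because neither one is a constant multiple of the other. Thus, $y(x) = a_0 y_1(x) + a_1 y_2(x)$ is a general solution of $(x - 1) y'' + y' + 2(x - 1) y = 0$.

Imposing the initial conditions is easy because the expansions are about the initial point $x = 4$:

$$\begin{aligned}
y(4) &= 5 = a_0 y_1(4) + a_1 y_2(4) = a_0 (1) + a_1 (0), \\
y'(4) &= 0 = a_0 y_1'(4) + a_1 y_2'(4) = a_0 (0) + a_1 (1),
\end{aligned} \tag{47}$$

so $a_0 = 5$ and $a_1 = 0$, and

$$y(x) = 5 y_1(x) = 5 \left[ 1 - (x - 4)^2 + \frac{1}{9} (x - 4)^3 + \frac{5}{36} (x - 4)^4 + \cdots \right] \tag{48}$$

is the desired particular solution.

COMMENT. Recall that Theorem 4.2.4 guaranteed that the power series solution would have a radius of convergence $R$ at least as large as 3, namely, the distance from the center of the expansion ($x_0 = 4$) to the singularity in $1/(x - 1)$ at $x = 1$. For comparison, let us determine $R$ from our results. In this example it is difficult to obtain a general expression for $a_n$. (Indeed, we didn't attempt to; we were content to develop the first several terms of the series, knowing that we could obtain as many more as we wish, from the recursion formula.) Can we obtain $R$ without an explicit expression for $a_n$?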
Yes, we can use the recursion formula (44), which tells us that $a_{n+2} \sim -\frac{1}{3} a_{n+1}$ as $n \to \infty$ or, equivalently, $a_{n+1} \sim -\frac{1}{3} a_n$. Then, from (7a),

$$R = \frac{1}{\displaystyle\lim_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right|} = \frac{1}{\left| -\frac{1}{3} \right|} = 3.$$

Thus, if we were hoping to obtain the solution over the entire interval $4 \le x < \infty$ we are disappointed to find that the power series converges only over $1 < x < 7$, and hence only over the $4 \le x < 7$ part of the problem domain. Does this result mean that the solution is singular at $x = 7$ and can't be continued beyond, or that it doesn't exist beyond $x = 7$? No, the convergence is simply being limited by the singularity at $x = 1$, which lies outside of the problem domain $4 \le x < \infty$. For further discussion of this point, see Exercise 12.

Closure. In Section 4.2.1 we reviewed the basic concepts of series and power series and, in Theorem 4.2.3, we listed the properties of power series that are needed to solve differential equations. In Section 4.2.2 we provided a basis for the power series solution method of Theorem 4.2.4 and then showed, by two examples, how to implement it. It is best to use summation notation, as we did in Examples 5 and 6, because it is more concise and leads to the recursion relation. (But that notation is not essential to the method; for example, we did not use it in our introductory example in Section 4.1.) The recursion relation is important because it permits the calculation of as many coefficients of the series as we desire, and because it can be used in determining the radius of convergence of the resulting series solutions. The method may be outlined as follows:

(1) Write the differential equation in the standard form $y'' + p(x) y' + q(x) y = 0$ to identify $p(x)$ and $q(x)$ and their singularities (if any).

(2) Choose an expansion point $x_0$ at which $p$ and $q$ are analytic. If initial conditions are given at some point, it is suggested that that point be used as $x_0$.
(3) The least possible radius of convergence can be predicted as the distance (in the complex plane) from $x_0$ to the nearest singular point of $p$ and $q$.

(4) Seeking $y(x)$ in the form of a power series about $x_0$, put that form into the differential equation, and expand all coefficients of $y''$, $y'$, $y$ about $x_0$ as well.

(5) By changing dummy indices of summation and lower summation limits, as necessary, obtain a form where each summation has the same exponent on $x - x_0$ and the same summation limits.

(6) Combine the sums into a single sum.

(7) Set the coefficient of $(x - x_0)^n$, within that sum, to zero; that step gives the recursion formula.

(8) From the recursion formula, obtain as many of the coefficients as desired and hence the solution form $y(x) = A y_1(x) + B y_2(x)$, where $A$, $B$ are arbitrary constants and $y_1(x)$, $y_2(x)$ are power series. If possible, obtain expressions for the general term in each of those series.

(9) Verify that $y_1$, $y_2$ are LI.

Computer software. One can use computer software to generate Taylor series and also to obtain power series solutions to differential equations. Using Maple, for instance, the relevant commands are taylor and dsolve. For example, to obtain the Taylor series of $1/(a - x)$ about $x = 0$, up to terms of third order, where $a$ is a constant, enter

    taylor(1/(a - x), x = 0, 4);

and return. The result is

$$\frac{1}{a} + \frac{1}{a^2} x + \frac{1}{a^3} x^2 + \frac{1}{a^4} x^3 + O(x^4),$$

where the $O(x^4)$ denotes that there are more terms, of order 4 and higher.

To obtain a power series solution to $y'' + y = 0$ about the point $x = 0$, enter

    dsolve(diff(y(x), x, x) + y(x) = 0, y(x), type = series);

and return. The result is

$$y(x) = y(0) + D(y)(0)\, x - \frac{1}{2} y(0)\, x^2 - \frac{1}{6} D(y)(0)\, x^3 + \frac{1}{24} y(0)\, x^4 + \frac{1}{120} D(y)(0)\, x^5 + O(x^6),$$

where $D(y)(0)$ means $y'(0)$. The default is to expand about $x = 0$ and to go as far as the fifth-order term.
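Steps (7) and (8) of the outline can also be carried out without a computer-algebra system. As a minimal sketch in Python (an assumption on our part; the text itself uses Maple, and the function name `example6_coefficients` is hypothetical), the code below iterates the recursion formula (44) of Example 6 with exact fractions, reproduces the leading coefficients of (48), and inspects the ratio $a_{n+1}/a_n$ at a large index as a rough numerical check on the radius of convergence $R = 3$.

```python
from fractions import Fraction

def example6_coefficients(a0, a1, count):
    """Coefficients a_0 .. a_{count-1} of sum a_n (x-4)^n from recursion (44)."""
    a = [Fraction(a0), Fraction(a1)]
    for n in range(count - 2):
        # a_{-1} is taken to be zero by definition, as agreed below (41).
        a_prev = a[n - 1] if n >= 1 else Fraction(0)
        a_next = (-Fraction(n + 1, 3 * (n + 2)) * a[n + 1]
                  - Fraction(2, (n + 2) * (n + 1)) * a[n]
                  - Fraction(2, 3 * (n + 2) * (n + 1)) * a_prev)
        a.append(a_next)
    return a

# Initial conditions of (34): y(4) = 5 and y'(4) = 0, so a_0 = 5 and a_1 = 0.
a = example6_coefficients(5, 0, 200)
print(a[:5])   # 5, 0, -5, 5/9, 25/36, matching (48)

# Ratio of successive coefficients at a large index; by (7a) its magnitude
# should be near 1/R = 1/3 (finite n only approximates the limit).
ratio = float(a[199] / a[198])
print(ratio)
```

The ratio approaches $-1/3$ only slowly, so the value at $n \approx 200$ is close to, but not equal to, the limiting value; this is a numerical indication of $R = 3$, not a proof.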
If we want an expansion about $x = 4$, say, and through the seventh-order term, enter

    Order := 8;

to set the order, then return and enter

    dsolve({diff(y(x), x, x) + y(x) = 0, y(4) = a, D(y)(4) = b}, y(x), type = series);

and return. Here the initial conditions serve to identify the point about which the expansion is desired. The result is

$$\begin{aligned}
y(x) = {} & a + b(x - 4) - \frac{1}{2} a (x - 4)^2 - \frac{1}{6} b (x - 4)^3 + \frac{1}{24} a (x - 4)^4 + \frac{1}{120} b (x - 4)^5 \\
& - \frac{1}{720} a (x - 4)^6 - \frac{1}{5040} b (x - 4)^7 + O\left( (x - 4)^8 \right).
\end{aligned}$$

4.2 EXERCISES 1. Use (7a) or (7b) to determine the radius of convergence $R$ of the given power series. 3. Work out the Taylor series of the given function, about the given point $x_0$, and use (7a) or (7b) to determine its radius of convergence. na” (a) ¥ (b)e"*, (d)sinz, (f)cosz, a=1 (a)e", w%o=7 (c)sinz, «9 = 7/2 (e)cosz, Lg =5 (g) cose, Gi)2’, (h) lng, «w=3 ro = 2 (k) cos(x —2), 2 xo = 0 (m) 7 aa 2%=-2 % =7/2 wo=7 w=1 (j)22° - 4, xo =0 () Toi to =0 (n) sin(32'°), xy =0 4. Use computer software to obtain the first 12 nonzero terms in the Taylor series expansion of the given function f, about the given point xo, and obtain a computer plot of f and the 2. Determine the radius of convergence $R$ of the Taylor series expansion of the given rational function, about the specified point $x_0$, using the ideas given in the paragraph preceding Example 5. Also, prepare a sketch analogous to those in Fig. 3. (a) Pe 1 [t= (b) 2 79° {c) we (a) e+ ro z+ 0 Ly =: Qe +1 so ae , fp SO 26 Bo = -2 (x +1)° (0 e+ 2a +40 1-2 a vta-2 (g) reget dg 4 I: O<aK<4 wo =0, (a) f(w) =e", 0O<a<10 I: wj=0, (b) f(z) =sinz, IF: 0<ae<2 a=l, (c) f(a) =Ine, LF: -l<ar<l wx=0, (d) f(w7)=1/(l1-—2), I: O<aK<4 wo =2, (e) f(x) =1/z, LF: -l<e<l rm =0, (f) f(z) =1/+2"), I: -13 <2 < 0.36 wo =0, (g) f(x) =4/(4+a+27), 1 to = —4 a? —3e+1 interval J. 5. (Geometric series) (a) Show that ° (e) x?
… partial sums $s_3(x)$, $s_6(x)$, $s_9(x)$, and $s_{12}(x)$ over the given …

5. (a) Show that
$$\frac{1-x^n}{1-x} = 1 + x + x^2 + \cdots + x^{n-1} \tag{5.1}$$
is an identity for all $x \neq 1$ and any positive integer $n$, by multiplying through by $1 - x$ (which is nonzero since $x \neq 1$) and simplifying.
(b) The identity (5.1) can be used to study the Taylor series known as the geometric series $\sum_{k=0}^{\infty} x^k$ since, according to (5.1), its partial sum $s_n(x)$ is
$$s_n(x) = \sum_{k=0}^{n-1} x^k = \frac{1-x^n}{1-x}. \qquad (x \neq 1) \tag{5.2}$$
Show, from (5.2), that the sequence $s_n(x)$ converges, as $n \to \infty$, for $|x| < 1$, and diverges for $|x| > 1$.
(c) Determine, by any means, the convergence or divergence of the geometric series for the points at the ends of the interval of convergence, $x = \pm 1$. NOTE: The formula (5.2) is quite striking because it reduces $s_n(x)$ to the closed form $(1 - x^n)/(1 - x)$, direct examination of which gives not only the interval of convergence but also the sum function $1/(1-x)$. It is rare that one can reduce $s_n(x)$ to closed form.

6. (a) Derive the Taylor series of $1/(x-1)$ about $x = 4$ using the Taylor series formula (16), and show that your result agrees with (38).
(b) Show that the same result is obtained (more readily) by writing
$$\frac{1}{x-1} = \frac{1}{3 + (x-4)} = \frac{1}{3}\,\frac{1}{1 + \frac{x-4}{3}} \tag{6.1}$$
and using the geometric series formula
$$\frac{1}{1-t} = \sum_{k=0}^{\infty} t^k \qquad (|t| < 1) \tag{6.2}$$
from Exercise 5, with $t = -(x-4)/3$. Further, deduce the interval of convergence of the result from the convergence condition $|t| < 1$ in (6.2).

7. For each of the following differential equations do the following: Identify $p(x)$ and $q(x)$ and, from them, determine the least possible guaranteed radius of convergence of power series solutions about the specified point $x_0$; seeking a power series solution form, about that point, obtain the recursion formula and the first four nonvanishing terms in the power series for $y_1(x)$ and $y_2(x)$; verify that $y_1$, $y_2$ are LI.
(a) $y'' + 2y' + y = 0$, $x_0 = 0$
(b) $y'' + 2y' = 0$, $x_0 = 0$
(c) $y'' + 2y' = 0$, $x_0 = 3$
(d) $xy'' + y' + y = 0$, $x_0 = -5$
(e)–(g) …
(h) $y'' + y' + (1 + x + x^2)y = 0$, $x_0 = 0$
(i)–(j) …
(k) $y'' + xy' + y = 0$, $x_0 = 0$
(l) …
(m) $y'' + (x-1)^2 y = 0$, $x_0 = 2$

8. (a)–(m) Use computer software to obtain the general solution, in power series form, for the corresponding problem given in Exercise 7, about the given expansion point.

9. (Airy equation) For the Airy equation,
$$y'' - xy = 0, \qquad (-\infty < x < \infty) \tag{9.1}$$
derive the power series solution
$$y(x) = a_0 y_1(x) + a_1 y_2(x) = a_0\left(1 + \frac{x^3}{2\cdot 3} + \frac{x^6}{2\cdot 3\cdot 5\cdot 6} + \cdots\right) + a_1\left(x + \frac{x^4}{3\cdot 4} + \frac{x^7}{3\cdot 4\cdot 6\cdot 7} + \cdots\right) \tag{9.2}$$
and verify that it is a general solution. NOTE: These series are not summable in closed form in terms of elementary functions; thus, certain linear combinations of $y_1$ and $y_2$ are adopted as a usable pair of LI solutions. In particular, it turns out to be convenient (for reasons that are not obvious) to use the Airy functions $Ai(x)$ and $Bi(x)$, which satisfy these initial conditions: $Ai(0) = 0.35502$, $Ai'(0) = -0.25881$ and $Bi(0) = 0.61493$, $Bi'(0) = 0.44829$.

10. Use computer software to obtain power series solutions of the following initial-value problems, each defined on $0 \le x < \infty$, through terms of eighth order, and obtain a computer plot of $s_2(x)$, $s_4(x)$, $s_6(x)$, and $s_8(x)$.
(a) $y'' + 4y' + y = 0$, $y(0) = 1$, $y'(0) = 0$
(b) $y'' + x^2 y = 0$, $y(0) = 2$, $y'(0) = 0$
(c) $y'' - xy' + y = 0$, $y(0) = 0$, $y'(0) = $ …
(d) $(1 + x)y'' + y = 0$, $y(0) = 2$, $y'(0) = 0$
(e) $(3 + x)y'' + y' + y = 0$, $y(0) = 0$, $y'(0) = $ …
(f) $(1 + x^2)y'' + y = 0$, $y(0) = 1$, $y'(0) = $ …

11. From the given recursion formula alone, determine the radius of convergence of the corresponding power series solutions.
(a) $(n+3)(n+2)a_{n+2} - (n+1)^2 a_{n+1} + n a_n = 0$
(b)–(f) …

12. In the Comment at the end of Example 6 we wondered what the divergence of the series solution over $7 < x < \infty$ implied about the nature of the solution over that part of the domain. To gain insight, we propose studying a simple problem with similar features. Specifically, consider the problem
$$(x - 1)y' + y = 0, \quad \ldots \tag{12.1}$$
on the interval $4 \le x < \infty$.
(a) Solve (12.1) analytically, and show that the solution is
$$\ldots \tag{12.2}$$
over $4 \le x < \infty$. Sketch the graph of (12.2), showing it as a solid curve over the domain $4 \le x < \infty$, and dotted over $-\infty < x < 4$.
(b) Solve (12.1), instead, by seeking $y(x) = \sum_{0}^{\infty} a_n (x-4)^n$.
(c) Show that the solution obtained in (b) is, in fact, the Taylor expansion of (12.2) about $x = 4$ and that it converges only in $|x - 4| < 3$ so that it represents the solution (12.2) only over the $4 \le x < 7$ part of the domain, even though the solution (12.2) exists and is perfectly well-behaved over $7 < x < \infty$.

13. Rework Example 5 without using the $\sum$ summation notation. That is, just write out the series, as we did in the introductory example of Section 4.1. Keep powers of $x$ up to and including fifth order, $x^5$, and show that your result agrees (up to terms of fifth order) with that given in (29).

14. Rework Example 6 without using the $\sum$ summation notation. That is, just write out the series as we did in the introductory example of Section 4.1. Keep powers of $x - 4$ up to and including fourth order, $(x-4)^4$, and show that your result agrees (up to terms of fourth order) with that given in (46).

15. (Cesàro summability) Although (5) gives the usual definition of infinite series, it is not the only possible one nor the only one used. For example, according to Cesàro summability, which is especially useful in the theory of Fourier series, one defines
$$\sum_{n=1}^{\infty} a_n \equiv \lim_{N\to\infty} \frac{s_1 + s_2 + \cdots + s_N}{N}, \tag{15.1}$$
that is, the limit of the arithmetic means of the partial sums. It can be shown that if a series converges to $s$ according to "ordinary convergence" [equation (5)], then it will also converge to the same value in the Cesàro sense. Yet, there are series that diverge in the ordinary sense but that converge in the Cesàro sense. Show that for the geometric series (see Exercise 5),
$$\frac{s_1 + s_2 + \cdots + s_N}{N} = \frac{1}{1-x} - \frac{x\,(1 - x^N)}{N(1-x)^2}$$
for all $x \neq 1$, and use that result to show that the Cesàro definition gives divergence for all $|x| > 1$ and for $x = 1$, and convergence for $|x| < 1$, as does ordinary convergence, but that for $x = -1$ it gives convergence to $1/2$, whereas according to ordinary convergence the series diverges for $x = -1$.

4.3 The Method of Frobenius

We continue to consider the equation
$$y'' + p(x)y' + q(x)y = 0. \tag{1}$$
Power series solutions of (1) can be developed by expansions about any point $x_0$ at which both $p$ and $q$ are analytic. We call such a point $x_0$ an ordinary point of the equation (1). Typically, $p$ and $q$ are analytic everywhere on the $x$ axis except perhaps at one or more singular points, so that all points of the $x$ axis, except perhaps a few, are ordinary points. In that case one can readily select such an $x_0$ and develop two LI power series solutions about that point. Nevertheless, in the present section we examine singular points more closely, and show that one can often obtain modified series solutions about singular points.

Why should we want to develop a method of finding series solutions about a singular point when we can stay away from the singular point and expand about an ordinary point? There are at least two reasons, which are explained later in this section. Proceeding, we begin by classifying singular points as follows:

DEFINITION 4.3.1 Regular and Irregular Singular Points of (1)
Let $x_0$ be a singular point of $p$ and/or $q$. We classify it as a regular or irregular singular point of equation (1) as follows: $x_0$ is
(a) a regular singular point of (1) if $(x - x_0)p(x)$ and $(x - x_0)^2 q(x)$ are analytic at $x_0$,
(b) an irregular singular point of (1) if it is not a regular singular point.

EXAMPLE 1. Consider $x(x-1)^2 y'' - 3y' + 5y = 0$ or, dividing by $x(x-1)^2$ to put the equation in the standard form $y'' + p(x)y' + q(x)y = 0$,
$$y'' - \frac{3}{x(x-1)^2}\,y' + \frac{5}{x(x-1)^2}\,y = 0. \tag{2}$$
Thus, $p(x) = -3/[x(x-1)^2]$ and $q(x) = 5/[x(x-1)^2]$. These are analytic for all $x$ except for $x = 0$ and $x = 1$, so every $x$ is an ordinary point except for those points.
Let us classify those two singular points:
$$x_0 = 0:\quad (x - x_0)p(x) = (x - 0)\left(\frac{-3}{x(x-1)^2}\right) = -\frac{3}{(x-1)^2}, \tag{3a}$$
$$(x - x_0)^2 q(x) = (x - 0)^2\left(\frac{5}{x(x-1)^2}\right) = \frac{5x}{(x-1)^2}, \tag{3b}$$
$$x_0 = 1:\quad (x - x_0)p(x) = (x - 1)\left(\frac{-3}{x(x-1)^2}\right) = -\frac{3}{x(x-1)}, \tag{3c}$$
$$(x - x_0)^2 q(x) = (x - 1)^2\left(\frac{5}{x(x-1)^2}\right) = \frac{5}{x}. \tag{3d}$$
To classify the singular point at $x = 0$, consider (3a) and (3b). Since the right-hand sides of (3a) and (3b) are analytic* at 0, we classify $x = 0$ as a regular singular point of (2). (The fact that those right-hand sides are singular elsewhere, at $x = 1$, is irrelevant.) To classify the singular point at $x = 1$, we turn to (3c) and (3d). Whereas the right-hand side of (3d) is analytic at $x = 1$, the right-hand side of (3c) is not, so we classify the singular point at $x = 1$ as an irregular singular point of (2). ■

*Recall the rule of thumb given in the last sentence of Section 4.2.1, that we will classify a function as analytic at a given point if it is infinitely differentiable at that point.

EXAMPLE 2. Consider the case
$$y'' + \sqrt{x}\,y = 0. \qquad (0 < x < \infty) \tag{4}$$
Then $p(x) = 0$ and $q(x) = \sqrt{x}$, and these are analytic (infinitely differentiable) for all $x > 0$, but not at $x = 0$ because $q(x)$ is not even once differentiable there, let alone infinitely differentiable. To classify the singular point at $x = 0$, observe that $(x - x_0)p(x) = (x)(0) = 0$ is analytic at $x = 0$, but $(x - x_0)^2 q(x) = x^2\sqrt{x} = x^{5/2}$ is not; it is twice differentiable there (those derivatives being zero), but all higher derivatives are undefined at $x = 0$. Thus, $x = 0$ is an irregular singular point of (4). (See Exercise 2.) ■

4.3.2. Method of Frobenius. To develop the method of Frobenius, we require that the singular point about which we expand be a regular singular point. Before stating theorems and working examples, let us motivate the idea behind the method. We consider the equation
$$y'' + p(x)y' + q(x)y = 0 \tag{5}$$
to have a regular singular point at the origin (and perhaps other singular points as well). There is no loss of generality in assuming it to be at the origin since, if it is at $x = x_0 \neq 0$, we can always make a change of variable $\xi = x - x_0$ to move it to the origin in terms of the new variable $\xi$ (Exercise 3).
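As a brief sketch of that change of variable (the symbol $Y$ is a name introduced here for illustration, not notation from the text): with $\xi = x - x_0$ and $Y(\xi) \equiv y(x_0 + \xi)$, the chain rule gives $dy/dx = dY/d\xi$ and $d^2y/dx^2 = d^2Y/d\xi^2$, so the equation $y'' + p(x)y' + q(x)y = 0$ becomes

$$\frac{d^2Y}{d\xi^2} + p(x_0 + \xi)\,\frac{dY}{d\xi} + q(x_0 + \xi)\,Y = 0.$$

The products $(x - x_0)p(x)$ and $(x - x_0)^2 q(x)$ that appear in Definition 4.3.1 become $\xi\,p(x_0 + \xi)$ and $\xi^2 q(x_0 + \xi)$, so a regular singular point at $x = x_0$ corresponds to a regular singular point at $\xi = 0$.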
Until stated otherwise, let us assume that the interval of interest is $x > 0$. We begin by multiplying equation (5) by $x^2$ and rearranging terms as
$$x^2 y'' + x\,[xp(x)]\,y' + [x^2 q(x)]\,y = 0. \tag{6}$$
Since $x = 0$ is a regular singular point, it follows that $xp$ and $x^2 q$ can be expanded about the origin in convergent Taylor series, so we can write
$$x^2 y'' + x\,(p_0 + p_1 x + \cdots)\,y' + (q_0 + q_1 x + \cdots)\,y = 0. \tag{7}$$
Locally, in the neighborhood of $x = 0$, we can approximate (7) as
$$x^2 y'' + p_0 x y' + q_0 y = 0, \tag{8}$$
which is a Cauchy-Euler equation. As such, (8) has at least one solution in the form $x^r$, for some constant $r$. Returning to (7), it is reasonable to expect that equation, likewise, to have at least one solution that behaves like $x^r$ (for the same value of $r$) in the neighborhood of $x = 0$. More completely, we expect it to have at least one solution of the form
$$y(x) = x^r\,(a_0 + a_1 x + a_2 x^2 + \cdots), \tag{9}$$
where the power series factor is needed to account for the deviation of $y(x)$, away from $x = 0$, from its asymptotic behavior $y(x) \sim a_0 x^r$ as $x \to 0$. That is, in place of the power series expansion
$$y(x) = \sum_{0}^{\infty} a_n x^n \tag{10}$$
that is guaranteed to work when $x = 0$ is an ordinary point of (5), it appears that we should seek $y(x)$ in the more general form
$$y(x) = x^r \sum_{0}^{\infty} a_n x^n = \sum_{0}^{\infty} a_n x^{n+r} \tag{11}$$
if $x = 0$ is a regular singular point. Is (11) really different from (10)? Yes, because whereas (10) necessarily represents an analytic function, (11) represents a nonanalytic function because of the $x^r$ factor (unless $r$ is a nonnegative integer). Let us try the solution form (11) in an example.

EXAMPLE 3. The equation
$$6x^2 y'' + 7x y' - (1 + x^2)\,y = 0, \qquad (0 < x < \infty) \tag{12}$$
has a regular singular point at $x = 0$ because whereas $p(x) = 7x/(6x^2) = 7/(6x)$ and $q(x) = -(1 + x^2)/(6x^2)$ are singular at $x = 0$, $xp(x) = 7/6$ and $x^2 q(x) = -(1 + x^2)/6$ are analytic there. Let us seek $y(x)$ in the form (11). Putting that form into (12) gives
$$6x^2 \sum_{0}^{\infty} (n+r)(n+r-1)\,a_n x^{n+r-2} + 7x \sum_{0}^{\infty} (n+r)\,a_n x^{n+r-1} - (1 + x^2) \sum_{0}^{\infty} a_n x^{n+r} = 0 \tag{13}$$
or
$$\sum_{0}^{\infty} 6(n+r)(n+r-1)\,a_n x^{n+r} + \sum_{0}^{\infty} 7(n+r)\,a_n x^{n+r} - \sum_{0}^{\infty} a_n x^{n+r} - \sum_{0}^{\infty} a_n x^{n+r+2} = 0. \tag{14}$$
Letting $n + r + 2 = m + r$ in the last sum, (14) becomes
$$\sum_{n=0}^{\infty} \left[6(n+r)(n+r-1) + 7(n+r) - 1\right] a_n x^{n+r} - \sum_{m=2}^{\infty} a_{m-2}\,x^{m+r} = 0. \tag{15}$$
Changing the lower limit in the last sum to 0, with the understanding that $a_{-2} = a_{-1} \equiv 0$, and changing the $m$'s to $n$'s, we can finally combine the sums as
$$\sum_{n=0}^{\infty} \left\{\left[6(n+r)^2 + n + r - 1\right] a_n - a_{n-2}\right\} x^{n+r} = 0, \tag{16}$$
where we have also simplified the square bracket in (15) to $6(n+r)^2 + n + r - 1$. From (16), we infer the recursion formula
$$\left[6(n+r)^2 + n + r - 1\right] a_n - a_{n-2} = 0 \tag{17}$$
for each $n = 0, 1, 2, \ldots$. In particular, $n = 0$ gives
$$(6r^2 + r - 1)\,a_0 - a_{-2} = 0 \tag{18}$$
or, since $a_{-2} = 0$,
$$(6r^2 + r - 1)\,a_0 = 0. \tag{19}$$
Recall that $a_0$ is the first nonvanishing coefficient. Proceeding, with $a_0 \neq 0$, it follows from (19) that
$$6r^2 + r - 1 = 0, \tag{20}$$
so $r = -1/2$ and $r = 1/3$.

First, set $r = -1/2$. The corresponding recursion formula (17) is then
$$a_n = \frac{1}{6\left(n - \frac{1}{2}\right)^2 + n - \frac{3}{2}}\;a_{n-2} \tag{21}$$
for $n = 1, 2, \ldots$, since the $n = 0$ case has already been carried out:
$$\begin{aligned}
n = 1:&\quad a_1 = a_{-1} = 0,\\
n = 2:&\quad a_2 = \tfrac{1}{14}\,a_0,\\
n = 3:&\quad a_3 = \tfrac{1}{39}\,a_1 = 0,\\
n = 4:&\quad a_4 = \tfrac{1}{76}\,a_2 = \tfrac{1}{(76)(14)}\,a_0,\\
n = 5:&\quad a_5 = \tfrac{1}{125}\,a_3 = 0,\\
n = 6:&\quad a_6 = \tfrac{1}{186}\,a_4 = \tfrac{1}{(186)(76)(14)}\,a_0,
\end{aligned} \tag{22}$$
and so on. From these results we have the solution
$$y(x) = a_0 x^{-1/2}\left[1 + \frac{1}{14}x^2 + \frac{1}{(76)(14)}x^4 + \frac{1}{(186)(76)(14)}x^6 + \cdots\right] \equiv a_0 y_1(x), \tag{23}$$
where $a_0$ remains arbitrary.

Next, set $r = 1/3$. The corresponding recursion formula (17) is then
$$a_n = \frac{1}{6\left(n + \frac{1}{3}\right)^2 + n - \frac{2}{3}}\;a_{n-2}, \tag{24}$$
and proceeding as we did for $r = -1/2$, we obtain the solution (Exercise 4)
$$y(x) = a_0 x^{1/3}\left[1 + \frac{1}{34}x^2 + \frac{1}{(116)(34)}x^4 + \frac{1}{(246)(116)(34)}x^6 + \cdots\right] \equiv a_0 y_2(x), \tag{25}$$
where $a_0$ remains arbitrary. [Of course, the $a_0$'s in (23) and (25) have nothing to do with each other; they are independent arbitrary constants.] According to Theorem 3.2.4, the solutions $y_1$ and $y_2$ are LI because neither is a scalar multiple of the other, so a general solution to (12) is $y(x) = C_1 y_1(x) + C_2 y_2(x)$, where $y_1$ and $y_2$ are given in (23) and (25).

What are the regions of convergence of the series in (23) and (25)?
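Before answering analytically, the coefficients can be generated and inspected numerically. The following Python sketch is an illustration only and is not part of the original text: the function name `frobenius_coeffs` and the normalization $a_0 = 1$ are assumptions made here. It iterates the recursion $[6(n+r)^2 + n + r - 1]\,a_n = a_{n-2}$ of Example 3 in exact rational arithmetic for the two roots $r = -1/2$ and $r = 1/3$, and forms the ratios $a_{n+2}/a_n$ for $r = -1/2$.

```python
from fractions import Fraction


def frobenius_coeffs(r, n_max):
    """Return [a_0, ..., a_n_max] for Example 3, assuming a_0 = 1.

    Uses the recursion [6(n+r)^2 + (n+r) - 1] a_n = a_{n-2},
    with a_{-1} = 0, for a given indicial root r.
    """
    a = [Fraction(0)] * (n_max + 1)
    a[0] = Fraction(1)
    for n in range(1, n_max + 1):
        prev = a[n - 2] if n >= 2 else Fraction(0)  # a_{-1} = 0
        s = n + r
        a[n] = prev / (6 * s * s + s - 1)
    return a


# Coefficients for the two indicial roots of 6r^2 + r - 1 = 0.
coeffs_minus_half = frobenius_coeffs(Fraction(-1, 2), 12)
coeffs_third = frobenius_coeffs(Fraction(1, 3), 12)

# Ratios a_{n+2}/a_n for r = -1/2 (even n only, since odd coefficients vanish).
ratios = [coeffs_minus_half[n + 2] / coeffs_minus_half[n] for n in range(0, 11, 2)]

print(coeffs_minus_half[:7])
print(coeffs_third[:7])
print([float(q) for q in ratios])
```

The printed ratios shrink steadily with $n$, which is consistent with (though of course not a proof of) an infinite radius of convergence. Exact fractions are used so that the values can be compared directly with the denominators 14, 76, 186 and 34, 116, 246 in the text.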
Though we don't have the general terms for those series, we can use their recursion formulas, (21) and (24), respectively, to study their convergence. Consider the series in (23) first. Its recursion formula is (21) or, equivalently,
$$a_{n+2} = \frac{1}{6\left(n + 2 - \frac{1}{2}\right)^2 + n + 2 - \frac{3}{2}}\;a_n. \tag{26}$$
We need to realize that the $a_{n+2}$ on the left side is really the next coefficient after $a_n$, the "$a_{n+1}$" in Theorem 4.2.2, since every other term in the series is missing (because $a_1 = a_3 = a_5 = \cdots = 0$). Thus, (26) gives
$$\lim_{n\to\infty}\left|\frac{a_{n+2}}{a_n}\right| = \lim_{n\to\infty} \frac{1}{6\left(n + \frac{3}{2}\right)^2 + n + \frac{1}{2}} = 0, \tag{27}$$
and it follows from Theorem 4.2.2 that $R = \infty$; the series converges for all $x$. Of course, the $x^{-1/2}$ factor in (23) "blows up" at $x = 0$, so (23) is valid in $0 < x < \infty$, which is the full interval specified in (12). Similarly, we can show that the series in (25) converges for all $x$, so (25) is valid over the full interval $0 < x < \infty$. ■

With Example 3 completed, we can come back to the important question that we posed near the beginning of Section 4.3.1: "Why should we want to develop a method of finding series solutions about a singular point when we can stay away from the singular point and expand about an ordinary point?" Observe that our Frobenius-type solution $y(x) = C_1 y_1(x) + C_2 y_2(x)$, with $y_1(x)$ and $y_2(x)$ given by (23) and (25), was valid on the full interval $0 < x < \infty$. Furthermore, it even showed us the singular behavior at the origin explicitly:
$$y(x) = C_1 x^{-1/2}\left(1 + \frac{1}{14}x^2 + \cdots\right) + C_2 x^{1/3}\left(1 + \frac{1}{34}x^2 + \cdots\right) \sim \frac{C_1}{\sqrt{x}} \tag{28}$$
as $x \to 0$. In contrast, if we had avoided the singular point $x = 0$ and pursued power series expansions about an ordinary point, say $x = 2$, then the resulting solution would have been valid only in $0 < x < 4$, and it would not have explicitly displayed the $1/\sqrt{x}$ singular behavior at $x = 0$.

Let us review the ideas presented above, and get ready to state the main theorem. If $x = 0$ is a regular singular point of the equation $y'' + p(x)y' + q(x)y = 0$, which we rewrite as
$$x^2 y'' + x\,[xp(x)]\,y' + [x^2 q(x)]\,y = 0, \tag{29}$$
then $xp(x)$ and $x^2 q(x)$ admit Taylor series representations $xp(x) = p_0 + p_1 x + \cdots$ and $x^2 q(x) = q_0 + q_1 x + \cdots$. Locally then, near $x = 0$, (29) can be approximated as
$$x^2 y'' + p_0 x y' + q_0 y = 0, \tag{30}$$
which is a Cauchy-Euler equation. Cauchy-Euler equations, we recall, always admit at least one solution in the form $x^r$, and this fact led us to seek solutions $y(x)$ to (29) that behave like $y(x) \sim x^r$ as $x \to 0$, or
$$y(x) = x^r \sum_{0}^{\infty} a_n x^n = \sum_{0}^{\infty} a_n x^{n+r}, \tag{31}$$
where the $\sum_{0}^{\infty} a_n x^n$ factor is to account for the deviation of $y$ from the local behavior $y(x) \sim x^r$ away from $x = 0$. Putting (31) into
$$x^2 y'' + x\,(p_0 + p_1 x + \cdots)\,y' + (q_0 + q_1 x + \cdots)\,y = 0 \tag{32}$$
gives
$$\sum_{0}^{\infty} (n+r)(n+r-1)\,a_n x^{n+r} + (p_0 + p_1 x + \cdots)\sum_{0}^{\infty} (n+r)\,a_n x^{n+r} + (q_0 + q_1 x + \cdots)\sum_{0}^{\infty} a_n x^{n+r} = 0, \tag{33}$$
and equating coefficients of the various powers of $x$ to zero gives
$$x^{r}:\quad [r(r-1) + p_0 r + q_0]\,a_0 = 0, \tag{34a}$$
$$x^{r+1}:\quad [(r+1)r + p_0(r+1) + q_0]\,a_1 + (p_1 r + q_1)\,a_0 = 0, \tag{34b}$$
$$x^{r+2}:\quad [(r+2)(r+1) + p_0(r+2) + q_0]\,a_2 + (\text{etc})\,a_1 + (\text{etc})\,a_0 = 0, \tag{34c}$$
$$x^{r+3}:\quad [(r+3)(r+2) + p_0(r+3) + q_0]\,a_3 + (\text{etc})\,a_2 + (\text{etc})\,a_1 + (\text{etc})\,a_0 = 0, \tag{34d}$$
and so on, where we've used "etc's" for brevity since we're most interested, here, in showing the form of the equations. Assuming, without loss of generality, that $a_0 \neq 0$, (34a) gives
$$r^2 + (p_0 - 1)\,r + q_0 = 0, \tag{35}$$
which quadratic equation for $r$ is called the indicial equation; in Example 3 the indicial equation was equation (20). Let the roots be $r_1$ and $r_2$. Setting $r = r_1$ in (34b,c,d,...) gives a system of linear algebraic equations to find $a_1, a_2, \ldots$ in terms of $a_0$, with $a_0$ remaining arbitrary. Next, we set $r = r_2$ in (34b,c,d,...) and again try to solve for $a_1, a_2, \ldots$ in terms of $a_0$. If all goes well, those steps should produce two LI solutions of the differential equation $y'' + p(x)y' + q(x)y = 0$.

The process is straightforward and was carried out successfully in Example 3. Can anything go wrong? Yes. One potential difficulty is that the indicial equation might have repeated roots ($r_1 = r_2$), in which case the procedure gives only one solution.
To seek guidance as to how to find a second LI solution, realize that the same situation occurred for the simplified problem, the Cauchy-Euler equation (30): if, seeking $y(x) = x^r$ in (30), we obtain a repeated root for $r$, then a second solution can be found (by the method of reduction of order) to be of the form $x^r$ times $\ln x$. Similarly, for $y'' + p(x)y' + q(x)y = 0$, as we shall see in the theorem below [specifically, (41b)].

The other possible difficulty, which is more subtle, occurs if the roots differ by a nonzero integer. For observe that if we denote the bracketed coefficient of $a_0$ in (34a) as $F(r)$, then the coefficient of $a_1$ in (34b) is $F(r+1)$, that of $a_2$ in (34c) is $F(r+2)$, and so on. To illustrate, suppose that $r_1 = r_2 + 1$, so that the roots differ by 1. Then not only will $F(r)$ vanish in (34a) when we are using $r = r_2$, but so will $F(r+1)$ in (34b) [though not $F(r+2)$ in (34c), nor $F(r+3)$ in (34d), etc.], in which case (34b) becomes $0\,a_1 + (p_1 r_2 + q_1)\,a_0 = 0$. If $p_1 r_2 + q_1$ happens not to be zero then the equation (34b) cannot be satisfied, and there is no set of $a_n$'s that satisfy the system (34). Thus, for the algebraically smaller root $r_2$ (e.g., $-6$ is algebraically smaller than 2), no solution is found. But if $p_1 r_2 + q_1$ does equal zero, then (34b) becomes $0\,a_1 = 0$ and $a_1$ (in addition to $a_0$) remains arbitrary. Then (34c,d,...) give $a_2, a_3, \ldots$ as linear combinations of $a_0$ and $a_1$, and one obtains a general solution
$$y(x) = a_0 x^{r_2}\,(\text{a power series}) + a_1 x^{r_2}\,(\text{a different power series}) = a_0 y_1(x) + a_1 y_2(x), \tag{36}$$
where $a_0$, $a_1$ are arbitrary and $y_1$, $y_2$ are LI.

If, however, $r_2$ gives no solution, then we can turn to $r_1$. For $r_1$ the difficulty cited in the preceding paragraph does not occur, and the method produces a single solution "$y_1(x)$."

If, instead, $r_1 = r_2 + 2$, say, then the same sort of difficulty shows up, but not until (34c). Similarly, if $r_1 = r_2 + 3$, $r_1 = r_2 + 4$, and so on.
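The distinctions drawn above (repeated roots, roots differing by a nonzero integer, and everything else) depend only on $p_0$ and $q_0$ through the indicial equation $r^2 + (p_0 - 1)r + q_0 = 0$. As a minimal sketch, assuming $p_0$ and $q_0$ are supplied as exact rationals, the following Python function sorts a given pair into those three situations; the name `indicial_root_type` and its return strings are hypothetical labels introduced here, not notation from the text.

```python
from fractions import Fraction
from math import isqrt


def indicial_root_type(p0, q0):
    """Classify the roots of r^2 + (p0 - 1) r + q0 = 0.

    p0, q0 should be exact (int or Fraction). The difference of the
    roots is sqrt(disc), where disc = (p0 - 1)^2 - 4 q0.
    """
    p0, q0 = Fraction(p0), Fraction(q0)
    disc = (p0 - 1) ** 2 - 4 * q0
    if disc == 0:
        return "repeated"            # r1 = r2: a logarithmic term is expected
    if disc < 0:
        return "generic"             # complex conjugates never differ by an integer
    # Real, distinct roots: r1 - r2 = sqrt(disc) is an integer exactly when
    # disc is an integer that is a perfect square.
    if disc.denominator == 1 and isqrt(disc.numerator) ** 2 == disc.numerator:
        return "integer difference"  # the smaller root may give two solutions or none
    return "generic"                 # both roots give Frobenius series directly


# Example 3 of the text: x p(x) = 7/6, x^2 q(x) = -(1 + x^2)/6.
print(indicial_root_type(Fraction(7, 6), Fraction(-1, 6)))
```

For Example 3 the discriminant is $25/36$, so the roots differ by $5/6$ and the call reports the generic situation, in agreement with the two series found there.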
The upshot is that if $r_1$, $r_2$ differ by a nonzero integer, then the algebraically smaller root $r_2$ leads either to no solution or to a general solution. In either case, the larger root $r_1$ leads to one solution. The theorem is as follows.

THEOREM 4.3.1 Regular Singular Point; Frobenius Solution
Let $x = 0$ be a regular singular point of the differential equation
$$y'' + p(x)y' + q(x)y = 0, \qquad (x > 0) \tag{37}$$
with $xp(x) = p_0 + p_1 x + \cdots$ and $x^2 q(x) = q_0 + q_1 x + \cdots$ having radii of convergence $R_1$, $R_2$ respectively. Let $r_1$, $r_2$ be the roots of the indicial equation
$$r^2 + (p_0 - 1)\,r + q_0 = 0, \tag{38}$$
where $r_1 \ge r_2$ if the roots are real. (Otherwise they are complex conjugates.) Seeking $y(x)$ in the form
$$y(x) = x^r \sum_{0}^{\infty} a_n x^n = \sum_{0}^{\infty} a_n x^{n+r}, \qquad (a_0 \neq 0) \tag{39}$$
with $r = r_1$ inevitably leads to a solution
$$y_1(x) = x^{r_1} \sum_{0}^{\infty} a_n x^n, \qquad (a_0 \neq 0) \tag{40}$$
where $a_1, a_2, \ldots$ are known multiples of $a_0$, which remains arbitrary. For definiteness, we choose $a_0 = 1$ in (40). The form of the second LI solution, $y_2(x)$, depends on $r_1$ and $r_2$ as follows:

(i) $r_1$ and $r_2$ distinct and not differing by an integer. (Complex conjugate roots belong to this case.) Then with $r = r_2$, (39) yields
$$y_2(x) = x^{r_2} \sum_{0}^{\infty} b_n x^n, \qquad (b_0 \neq 0) \tag{41a}$$
where the $b_n$'s are generated by the same recursion relation as the $a_n$'s, but with $r = r_2$ instead of $r = r_1$; $b_1, b_2, \ldots$ are known multiples of $b_0$, which is arbitrary. For definiteness, we choose $b_0 = 1$ in (41a).

(ii) Repeated roots, $r_1 = r_2 \equiv r$. Then $y_2(x)$ can be found in the form
$$y_2(x) = y_1(x)\ln x + x^r \sum_{1}^{\infty} c_n x^n. \tag{41b}$$

(iii) $r_1 - r_2$ equal to an integer. Then the smaller root $r_2$ leads to both solutions, $y_1(x)$ and $y_2(x)$, or to neither. In either case, the larger root $r_1$ gives the single solution (40). In the latter case, $y_2(x)$ can be found in the form
$$y_2(x) = \kappa\,y_1(x)\ln x + x^{r_2} \sum_{0}^{\infty} d_n x^n, \tag{41c}$$
where the constant $\kappa$ may turn out to be zero, in which case there is no logarithmic term in (41c).

The radius of convergence of each of the series in (40) and (41) is at least as large as the smaller of $R_1$, $R_2$.
If (37) is on $x < 0$ rather than $x > 0$, then the foregoing is valid, provided that each $x^r$, $x^{r_1}$, $x^{r_2}$ and $\ln x$ is changed to $|x|^r$, $|x|^{r_1}$, $|x|^{r_2}$ and $\ln |x|$, respectively.

Outline of Proof of (ii): Our discussion preceding this theorem contained an outline of the proof of case (i), and also some discussion of case (iii). Here, let us focus on case (ii) and, again, outline the main ideas behind a proof. We consider the case of repeated roots, where $r_1 = r_2 \equiv r$. Since $y_1(x)$, given by (40), is a solution, then so is $y(x) = A y_1(x)$, where $A$ is arbitrary. To find $y_2(x)$, let us use reduction of order; that is, seek $y_2(x) = A(x)y_1(x)$, where $y_1(x)$ is known and $A(x)$ is to be found. Putting that form into (37) gives
$$A'' y_1 + A'\,(2y_1' + p y_1) + A\,(y_1'' + p y_1' + q y_1) = 0. \tag{42}$$
Since $y_1$ satisfies (37), the last term in (42) is zero, so (42) becomes
$$A'' y_1 + A'\,(2y_1' + p y_1) = 0. \tag{43}$$
Replacing $A''$ by $dA'/dx$, multiplying through by $dx$, and dividing by $y_1$ and $A'$ gives
$$\frac{dA'}{A'} + 2\,\frac{dy_1}{y_1} + p\,dx = 0. \tag{44}$$
Integrating,
$$\ln |A'| + 2\ln |y_1| + \int p(x)\,dx = \text{constant, say } \ln C, \text{ for } C > 0,$$
so
$$\ln \frac{|A'|\,y_1^2}{C} = -\int p(x)\,dx,$$
and
$$|A'(x)| = C\,\frac{e^{-\int p(x)\,dx}}{y_1^2(x)} = C\,\frac{e^{-\int \left(\frac{p_0}{x} + p_1 + p_2 x + \cdots\right)dx}}{\left[x^r (1 + a_1 x + \cdots)\right]^2} = C\,\frac{e^{-p_0 \ln x}\,e^{(-p_1 x - \cdots)}}{x^{2r}\,(1 + 2a_1 x + \cdots)}, \tag{45}$$
where we write $\ln x$ rather than $\ln |x|$ since $x > 0$ here, by assumption. Since $\exp\left(-\int p(x)\,dx\right) > 0$, we see from the first equality in (45) that $A'(x)$ is either everywhere positive or everywhere negative on the interval. Thus, we can drop the absolute value signs around $A'(x)$ if we now allow $C$ to be positive or negative. Further, $e^{-p_0 \ln x} = e^{\ln x^{-p_0}} = x^{-p_0}$, and $e^{(-p_1 x - \cdots)}/(1 + 2a_1 x + \cdots)$ is analytic at $x = 0$ and can be expressed in Taylor series form as $1 + \kappa_1 x + \kappa_2 x^2 + \cdots$, so
$$A'(x) = C\,\frac{1}{x^{2r + p_0}}\,(1 + \kappa_1 x + \kappa_2 x^2 + \cdots). \tag{46}$$
For $r$ to be a double root of the indicial equation (38), it is necessary that $2r + p_0 = 1$, in which case integration of (46) gives
$$A(x) = C\,(\ln x + \kappa_1 x + \cdots). \tag{47}$$
Finally, setting $C = 1$ with no loss of generality, we have the form
$$\begin{aligned}
y_2(x) = A(x)y_1(x) &= (\ln x + \kappa_1 x + \cdots)\,y_1(x)\\
&= y_1(x)\ln x + (\kappa_1 x + \cdots)\,x^r\,(1 + a_1 x + \cdots)\\
&= y_1(x)\ln x + x^r \sum_{1}^{\infty} c_n x^n
\end{aligned} \tag{48}$$
as given in (41b). ■

In short, the Frobenius method is guaranteed to provide two LI solutions to (37) if $x = 0$ is a regular singular point of that equation. If $x = 0$ is an irregular singular point, the theorem does not apply. That case is harder, and we have no analogous theory to guide us.

EXAMPLE 4. Case (ii). Solve the equation
$$x^2 y'' - (x + x^2)\,y' + y = 0; \qquad (0 < x < \infty) \tag{49}$$
that is, find two LI solutions. The only singular point of (49) is $x = 0$, and it is a regular singular point. Seeking
$$y(x) = \sum_{0}^{\infty} a_n x^{n+r}, \qquad (a_0 \neq 0) \tag{50}$$
substitution of that form into (49) gives
$$\sum_{0}^{\infty} (n+r)(n+r-1)\,a_n x^{n+r} - \sum_{0}^{\infty} (n+r)\,a_n x^{n+r} - \sum_{0}^{\infty} (n+r)\,a_n x^{n+r+1} + \sum_{0}^{\infty} a_n x^{n+r} = 0. \tag{51}$$
Set $n + 1 = m$ in the third sum, change the lower limit from $n = 0$ to $m = 1$, extend that limit back to 0 by defining $a_{-1} \equiv 0$, change the $m$'s back to $n$'s, and combine the four sums. Those steps give
$$\sum_{0}^{\infty} \left\{\left[(n+r)(n+r-1) - (n+r) + 1\right] a_n - (n+r-1)\,a_{n-1}\right\} x^{n+r} = 0$$
and hence the recursion formula
$$\left[(n+r)(n+r-1) - (n+r) + 1\right] a_n - (n+r-1)\,a_{n-1} = 0 \tag{52}$$
for $n = 0, 1, 2, \ldots$. For $n = 0$, (52) becomes $(r^2 - 2r + 1)\,a_0 - (r-1)\,a_{-1} = 0$. Since $a_{-1} = 0$ and $a_0 \neq 0$, the latter gives the indicial equation
$$r^2 - 2r + 1 = 0, \tag{53}$$
with repeated roots $r = 1, 1$. Thus, this example illustrates case (ii). Putting $r = 1$ into (52), we obtain the recursion formula
$$a_n = \frac{1}{n}\,a_{n-1} \tag{54}$$
for $n = 1, 2, \ldots$. Thus, $a_1 = a_0$, $a_2 = \frac{1}{2}a_1 = \frac{1}{2}a_0$, $a_3 = \frac{1}{3}a_2 = \frac{1}{3!}a_0$, and we can see that $a_n = \frac{1}{n!}a_0$, so
$$y(x) = x\left(a_0 + a_0 x + \frac{a_0}{2!}x^2 + \cdots\right) = a_0 \sum_{0}^{\infty} \frac{x^{n+1}}{n!} \equiv a_0 y_1(x). \tag{55}$$
Let us keep working with the series form in this example.

Theorem 4.3.1 tells us that $y_2$ can be found in the form (41b), where $r = 1$ and the $c_n$'s are to be determined. Putting that form into (49) gives
$$\begin{aligned}
x^2 y_2'' - (x + x^2)\,y_2' + y_2 ={}& \left[x^2 y_1'' - (x + x^2)\,y_1' + y_1\right]\ln x + 2x y_1' - (2 + x)\,y_1\\
&+ \sum_{1}^{\infty} (n+1)n\,c_n x^{n+1} - \sum_{1}^{\infty} (n+1)\,c_n x^{n+1} - \sum_{1}^{\infty} (n+1)\,c_n x^{n+2} + \sum_{1}^{\infty} c_n x^{n+1} = 0,
\end{aligned} \tag{56}$$
where $y_1(x) = \sum_{0}^{\infty} x^{n+1}/n!$.
The square-bracketed terms in (56) cancel to zero because $y_1$ is a solution of (49). If we move $2x y_1' - (2 + x)\,y_1$ to the right-hand side, and write out the various series, then (56) becomes
$$\left(c_1 x^2 + 4c_2 x^3 + 9c_3 x^4 + \cdots\right) - \left(2c_1 x^3 + 3c_2 x^4 + \cdots\right) = -x^2 - x^3 - \frac{1}{2}x^4 - \cdots, \tag{57}$$
and equating coefficients of like powers of $x$, on the left- and right-hand sides, gives
$$\begin{aligned}
x^2:&\quad c_1 = -1,\\
x^3:&\quad 4c_2 - 2c_1 = -1,\\
x^4:&\quad 9c_3 - 3c_2 = -\tfrac{1}{2},
\end{aligned} \tag{58}$$
and so on. Thus, $c_1 = -1$, $c_2 = -\frac{3}{4}$, $c_3 = -\frac{11}{36}, \ldots$ and
$$y_2(x) = y_1(x)\ln x + \sum_{1}^{\infty} c_n x^{n+1} = \left(x + x^2 + \frac{1}{2!}x^3 + \cdots\right)\ln x - x^2 - \frac{3}{4}x^3 - \frac{11}{36}x^4 - \cdots. \tag{59}$$
If, instead, we retain the summation notation, then in place of (57) we obtain, after manipulation and simplification,
$$\sum_{1}^{\infty} \left(n^2 c_n - n c_{n-1}\right) x^{n+1} = -\sum_{1}^{\infty} \frac{x^{n+1}}{(n-1)!} \tag{60}$$
[where $c_0 \equiv 0$ because there is no $c_0$ term in (41b)] and hence the recursion formula $n^2 c_n - n c_{n-1} = -1/(n-1)!$ or, more conveniently,
$$c_n = \frac{1}{n}\,c_{n-1} - \frac{1}{n\,(n!)}. \qquad (c_0 = 0) \tag{61}$$
Solving (61) gives
$$c_1 = -1, \qquad c_2 = -\frac{1}{2!}\left(1 + \frac{1}{2}\right) = -\frac{3}{4}, \qquad c_3 = -\frac{1}{3!}\left(1 + \frac{1}{2} + \frac{1}{3}\right) = -\frac{11}{36}, \tag{62}$$
and so on. These results agree with those obtained from (58), but the gain, here, is that (61) can give us as many $c_n$'s as we wish. In fact, by carrying (62) further we can see that
$$c_n = -\frac{1}{n!}\left(1 + \frac{1}{2} + \cdots + \frac{1}{n}\right) \tag{63}$$
for any $n = 1, 2, \ldots$. [The price that we paid for that gain was that we needed to manipulate the series by shifting some of the summation indices and summation limits in order to obtain (60).]

COMMENT 1. In this example we were able to sum the series in (55) and obtain $y_1(x)$ in the closed form
$$y_1(x) = x e^x. \tag{64}$$
In such a case it is more convenient to seek $y_2$ by actually carrying out the reduction-of-order process that lay behind the general form (41b). Thus, seek $y_2(x) = A(x)y_1(x)$. The steps were already carried out in the preceding outline of the proof of Theorem 4.3.1, so we can immediately use (45). With $C = 1$,
$$A'(x) = \frac{e^{-\int p(x)\,dx}}{y_1^2(x)} = \frac{e^{-\int \left(-\frac{1}{x} - 1\right)dx}}{x^2 e^{2x}} = \frac{e^{\ln x + x}}{x^2 e^{2x}} = \frac{e^{-x}}{x}, \tag{65}$$
so
$$A(x) = \int \frac{e^{-x}}{x}\,dx = \int \frac{1}{x}\left(1 - x + \frac{x^2}{2!} - \frac{x^3}{3!} + \cdots\right)dx = \ln x + \sum_{1}^{\infty} \frac{(-1)^n x^n}{n\,(n!)} \tag{66}$$
and
$$y_2(x) = A(x)y_1(x) = \left[\ln x + \sum_{1}^{\infty} \frac{(-1)^n x^n}{n\,(n!)}\right] x e^x, \tag{67}$$
which expression is found, upon expanding the $e^x$, to agree with (59).

COMMENT 2. As a matter of fact, we can leave the integral in (66) intact because it is a tabulated function. Specifically, the integral of $e^{-x}/x$, though nonelementary, comes up often enough for it to have been defined, in the literature, as a special function:
$$E_1(x) \equiv \int_x^{\infty} \frac{e^{-t}}{t}\,dt \qquad (x > 0) \tag{68}$$
is known as the exponential integral. Among its various properties is the expansion
$$E_1(x) = -\gamma - \ln x - \sum_{1}^{\infty} \frac{(-1)^n x^n}{n\,(n!)}, \qquad (x > 0) \tag{69}$$
where $\gamma = 0.5772157$ is Euler's constant. Using the $E_1(x)$ function, we can express
$$y_2(x) = A(x)\,x e^x = \left(\int_a^x \frac{e^{-t}}{t}\,dt\right) x e^x = \left(\int_a^{\infty} \frac{e^{-t}}{t}\,dt - \int_x^{\infty} \frac{e^{-t}}{t}\,dt\right) x e^x = \left[E_1(a) - E_1(x)\right] x e^x, \tag{70}$$
for any $a > 0$. The $E_1(a)\,x e^x$ term is merely $E_1(a)$ times $y_1(x)$, so it can be dropped with no loss. Further, the factor $-1$ in front of the $E_1(x)\,x e^x$ can likewise be dropped. Thus, in this example we were able to obtain both solutions in closed form, $y_1(x) = x e^x$ and $y_2(x) = E_1(x)\,x e^x$.

COMMENT 3. Observe that the Taylor series
$$xp(x) = x\left(-\frac{x + x^2}{x^2}\right) = -1 - x, \qquad x^2 q(x) = x^2\left(\frac{1}{x^2}\right) = 1 \tag{71}$$
both terminate, hence they surely converge with $R_1 = \infty$ and $R_2 = \infty$, respectively. Thus, Theorem 4.3.1 guarantees that $\sum_{0}^{\infty} a_n x^n$ in (40) and $\sum_{1}^{\infty} c_n x^n$ in (41b) will likewise have infinite radii of convergence. Of course the $\ln x$ in (41b) tends to $-\infty$ as $x \to 0$, but nevertheless our solutions $y_1$ and $y_2$ are indeed valid on the full interval $0 < x < \infty$. ■

EXAMPLE 5. Case (iii). Solve the equation
$$x y'' + y = 0; \tag{72}$$
that is, find two LI solutions. The only singular point of (72) is $x = 0$, and it is a regular singular point. Seeking
$$y(x) = \sum_{0}^{\infty} a_n x^{n+r}, \qquad (a_0 \neq 0)$$
(72) becomes
$$\sum_{0}^{\infty} (n+r)(n+r-1)\,a_n x^{n+r-1} + \sum_{0}^{\infty} a_n x^{n+r} = 0. \tag{73}$$
Set $n - 1 = m$ in the first sum, in which case the lower summation limit becomes $-1$, then change the $m$'s back to $n$'s. In the second sum change the lower limit to $-1$, with the understanding that $a_{-1} \equiv 0$. Then (73) becomes
$$\sum_{n=-1}^{\infty} \left[(n+r+1)(n+r)\,a_{n+1} + a_n\right] x^{n+r} = 0,$$
so we have the recursion formula
$$(n+r+1)(n+r)\,a_{n+1} + a_n = 0, \qquad (a_{-1} = 0,\ a_0 \neq 0) \tag{74}$$
for $n = -1, 0, 1, 2, \ldots$. Setting $n = -1$, and using $a_{-1} = 0$ and $a_0 \neq 0$, gives the indicial equation
$$r(r - 1) = 0, \tag{75}$$
with roots $r_1 = 1$ and $r_2 = 0$. These differ by an integer, so that the problem is of case (iii) type. Let us try the smaller root first. That root will, according to Theorem 4.3.1, lead to both solutions or to neither. With $r = r_2 = 0$, (74) becomes $(n+1)n\,a_{n+1} + a_n = 0$. Having already used $n = -1$, to obtain (75), we next set $n = 0$. That step gives $0 + a_0 = 0$, so that $a_0 = 0$, which contradicts the assumption that $a_0 \neq 0$. Thus, $r = r_2 = 0$ yields no solutions. Thus, we will use the larger root, $r = r_1 = 1$, to obtain one solution, and then (41c) to obtain the other. With $r = r_1 = 1$, (74) gives
$$a_{n+1} = -\frac{1}{(n+2)(n+1)}\,a_n. \tag{76}$$
Working out the first several $a_n$'s from (76), we find that
$$a_n = \frac{(-1)^n}{(n+1)(n!)^2}\,a_0,$$
so
$$y(x) = a_0 \sum_{0}^{\infty} \frac{(-1)^n}{(n+1)(n!)^2}\,x^{n+1} \equiv a_0 y_1(x), \quad\text{where}\quad y_1(x) = \sum_{0}^{\infty} \frac{(-1)^n}{(n+1)(n!)^2}\,x^{n+1}. \tag{77}$$
[Remember, throughout, that $0! = 1$ and $(-1)^0 = 1$.] To find $y_2$, we use (41c) and seek
$$y_2(x) = \kappa\,y_1(x)\ln x + \sum_{0}^{\infty} d_n x^n. \tag{78}$$
Putting (78) into (72), and multiplying through by $x$, gives
$$\kappa x^2 y_1'' \ln x + 2\kappa x y_1' - \kappa y_1 + \sum_{0}^{\infty} n(n-1)\,d_n x^n + \kappa x y_1 \ln x + \sum_{0}^{\infty} d_n x^{n+1} = 0. \tag{79}$$
Cancelling the $\ln x$ terms [because $y_1$ satisfies (72)], re-expressing the last sum in (79) as $\sum_{0}^{\infty} d_n x^{n+1} = \sum_{1}^{\infty} d_{m-1} x^m = \sum_{0}^{\infty} d_{n-1} x^n$, where $d_{-1} \equiv 0$, and putting (77) in for the $y_1$ terms, (79) becomes
$$\begin{aligned}
\sum_{0}^{\infty} \left[n(n-1)\,d_n + d_{n-1}\right] x^n &= \kappa y_1 - 2\kappa x y_1'\\
&= \kappa \sum_{0}^{\infty} \frac{(-1)^n x^{n+1}}{(n+1)(n!)^2} - 2\kappa \sum_{0}^{\infty} \frac{(-1)^n x^{n+1}}{(n!)^2}\\
&= -\kappa \sum_{1}^{\infty} \frac{(-1)^{n-1}(2n-1)}{n\,[(n-1)!]^2}\,x^n,
\end{aligned} \tag{80}$$
where, to obtain the last equality, we let $n + 1 = m$ and then set $m = n$. Equating coefficients of like powers of $x$ gives the recursion formula
$$n(n-1)\,d_n + d_{n-1} = -\kappa\,\frac{(-1)^{n-1}(2n-1)}{n\,[(n-1)!]^2} \tag{81}$$
for $n = 1, 2, \ldots$. [We can begin with $n = 1$ because equating the constant terms on both sides of (80) merely gives $0 = 0$.] Letting $n = 1, 2, \ldots$ gives
$$\begin{aligned}
n = 1:&\quad d_0 = -\kappa,\\
n = 2:&\quad d_2 = \tfrac{3}{4}\kappa - \tfrac{1}{2}d_1,\\
n = 3:&\quad d_3 = -\tfrac{7}{36}\kappa + \tfrac{1}{12}d_1,\\
n = 4:&\quad d_4 = \tfrac{35}{1728}\kappa - \tfrac{1}{144}d_1,
\end{aligned} \tag{82}$$
and so on, where $d_1$ remains arbitrary. Thus, the series in (78) is
$$\sum_{0}^{\infty} d_n x^n = \kappa\left(-1 + \frac{3}{4}x^2 - \frac{7}{36}x^3 + \frac{35}{1728}x^4 - \cdots\right) + d_1\left(x - \frac{1}{2}x^2 + \frac{1}{12}x^3 - \frac{1}{144}x^4 + \cdots\right). \tag{83}$$
The series multiplying $d_1$, on the right side of (83), is identical to $y_1(x)$, given by (77), so we can set $d_1 = 0$ without loss. With $d_1 = 0$, we see that the entire right side of (78) is scaled by $\kappa$, which has remained arbitrary, so there is no loss in setting $\kappa = 1$. Thus, $y_2(x)$ is given by (78), wherein $y_1(x)$ is given by (77) and the $d_n$'s by (81), with $d_1$ taken to be zero and $\kappa = 1$. ■

EXAMPLE 6. Case (iii). Solve
$$4x^2 y'' + 4x y' - y = 0 \tag{84}$$
by the method of Frobenius. This has been a long and arduous section so we will only outline the solution to (84). Seeking a Frobenius expansion $y(x) = \sum_{0}^{\infty} a_n x^{n+r}$ about the regular singular point $x = 0$, we obtain the indicial equation $4r^2 - 1 = 0$, so $r = \pm 1/2$, which corresponds to case (iii) of Theorem 4.3.1. We find that the larger root $r_1 = 1/2$ leads to the one-term solution $y(x) = a_0 x^{1/2}$ (i.e., $a_1 = a_2 = \cdots = 0$), and that the smaller root $r_2 = -1/2$ leads to $y(x) = a_0 x^{-1/2} + a_1 x^{1/2}$ (i.e., $a_2 = a_3 = \cdots = 0$), which is the general solution.

We did not, in (84), specify the $x$ interval of interest. Suppose that it is $x < 0$. Then a general solution of (84) is $y(x) = a_0 |x|^{-1/2} + a_1 |x|^{1/2}$, and that solution is valid on the entire interval $x < 0$.

In fact, (84) is an elementary equation, a Cauchy-Euler equation, so we could have solved it more easily. But we wanted to show that it can nonetheless be solved by the Frobenius method, and that that method does indeed give the correct one-term solutions. ■

One final point: what if the indicial equation gives complex roots $r = \alpha \pm i\beta$? This issue came up in Section 3.6.1 as well, for the Cauchy-Euler equation. Our treatment here is virtually the same as in Section 3.6.1 and is left for Exercise 10.

Closure.
The Frobenius theory, embodied in Theorem 4.3.1, enables us to find a general solution to any second-order linear ordinary differential equation with a regular singular point at $x = 0$, in the form of generalized power series expansions about that point, possibly with $\ln x$ included. There are exactly three possible cases. Let the roots of the indicial equation (38) be $r_1$, $r_2$, where $r_1 \ge r_2$ if they are real. If the roots are distinct and do not differ by an integer (which includes the case where the roots are complex), then LI solutions are given by (40) and (41a); if the roots are repeated, then LI solutions are given by (40) and (41b); and if $r_1 - r_2$ is a nonzero integer, then LI solutions are given by (40) and (41c).

Theorem 4.3.1 is by no means of theoretical interest alone, since applications, especially the solution by separation of variables of the classical partial differential equations of mathematical physics and engineering (such as the diffusion, Laplace, and wave equations), often lead to nonconstant-coefficient second-order linear differential equations with regular singular points, such as the well known Legendre and Bessel equations. We devote Sections 4.4 and 4.6 to those two important cases.

Computer software. It is fortunate that computer-algebra systems can even generate Frobenius-type solutions, fortunate because the hand calculations can be quite tedious, as our examples have shown. Thus, we urge you to study the theory in this …

… 0, y(x), type = series);

$$\begin{aligned}
y(x) ={}& \_C1\,x\left(1 - \frac{1}{2}x + \frac{1}{12}x^2 - \frac{1}{144}x^3 + \frac{1}{2880}x^4 - \frac{1}{86400}x^5 + O(x^6)\right)\\
&+ \_C2\left[\ln(x)\left(-x + \frac{1}{2}x^2 - \frac{1}{12}x^3 + \frac{1}{144}x^4 - \frac{1}{2880}x^5 + O(x^6)\right) + \left(1 - \frac{3}{4}x^2 + \frac{7}{36}x^3 - \frac{35}{1728}x^4 + \frac{101}{86400}x^5 + O(x^6)\right)\right]
\end{aligned}$$

EXERCISES 4.3

1. For each equation, identify all singular points (if any), and classify each as regular or irregular. For each regular singular point use Theorem 4.3.1 to determine the minimum possible radii of convergence of the series that will result in (40) and (41) (but you need not work out those series).
(a) $y'' - x^2 y' + xy = 0$
(b) $xy'' - (\cos x)\,y' + 5y = 0$
(c) $(x^2 - 3)\,y'' - y = 0$
(d) $x(x^2 + 3)\,y'' + y = 0$
(e) $(x + 1)^2 y'' - 4y' + (x + 1)\,y = 0$
(f) $y'' + (\ln x)\,y' + 2y = 0$
(g) $(x - 1)(x + 3)^2 y'' + y' + y = 0$
(h) $xy'' + (\sin x)\,y' - (\cos x)\,y = 0$
(i) $x(x^2 + 2)\,y'' + \ldots\,y = 0$
(j) $(x^2 - 1)\,y'' + xy' - x^2 y = 0$
(k) $(x^4 - 1)\,y'' + (x^2 - 1)^2 y' - y = 0$
(l) $(x^4 - 1)^3 y'' - 3(x + 1)^2 y' + x(x + 1)\,y = 0$
(m) $(xy')' - 5y = 0$
(n) $[x^3 (x - 1)\,y']' + 2y = 0$
(o) $2x^2 y'' - xy' + 7y = 0$
(p) $xy'' + 4y' = 0$
(q) $x^2 y'' - 3y = 0$
(r) $2x^2 y'' + \sqrt{x}\,y = 0$

2. Sometimes one can change an irregular singular point to a regular singular point, by suitable change of variables, so that the Frobenius theory can be applied. The purpose of this exercise is to present such a case. We noted, in Example 2, that $y'' + \sqrt{x}\,y = 0$ $(x > 0)$ has an irregular singular point at $x = 0$, because of the $\sqrt{x}$.
(a) Show that if we change the independent variable from $x$ to $t$, say, according to $\sqrt{x} = t$, then the equation on $y(x(t)) \equiv Y(t)$ is
$$Y''(t) - \frac{1}{t}\,Y'(t) + 4t^3\,Y(t) = 0. \qquad (t > 0) \tag{2.1}$$
(b) Show that (2.1) has a regular singular point at $t = 0$ (which point corresponds to $x = 0$).
(c) Obtain a general solution of (2.1) by the Frobenius method. (If possible, give the general term of any series obtained.) Putting $t = \sqrt{x}$ in that result, obtain the corresponding general solution of $y'' + \sqrt{x}\,y = 0$. Is that general solution for $y(x)$ of Frobenius form? Explain.
(d) Use computer software to find a general solution.

3. In each case, there is a regular singular point at the left end of the stated $x$ interval; call that point $x_0$. Merely introduce a change of independent variable, from $x$ to $t$, according to $x - x_0 = t$, and obtain the new differential equation on $y(x(t)) \equiv Y(t)$. You need not solve that equation.
(a) $(x - 1)\,y'' + y' - y = 0$, $(1 < x < \infty)$
(b) $(x^2 - 1)\,y'' + y = 0$, $(1 < x < \infty)$
(c) $(x + 3)\,y'' - 2(x + 3)\,y' - 4y = 0$, $(-3 < x < \infty)$
(d) $(x - 5)^2 y'' + 2(x - 5)\,y' - y = 0$, $(5 < x < \infty)$

4. Derive the series solution (25).

5. Make up a differential equation that will have as the roots of its indicial equation
(a) $1, 4$ (b) $3, 3$ (c) $1/2, 2$ (d) $-1/2, 1/2$ (e) $2 \pm 3i$ (f) $-1, -1$ (g) $-2/3, 5$ (h) $-1 \pm i$ (i) $(1 \pm 2i)/3$ (j) $5/4, 8/3$

6. In each case verify that $x = 0$ is a regular singular point, and use the method of Frobenius to obtain a general solution $y(x) = Ay_1(x) + By_2(x)$ of the given differential equation, on the interval $0 < x < \infty$. That is, determine $y_1(x)$ and $y_2(x)$. On what interval can you be certain that your solution is valid? HINT: See Theorem 4.3.1.
(a) $2xy'' + y' + x^2 y = 0$
(b) $xy'' + y' - xy = 0$
(c) $xy'' + y' + x^3 y = 0$
(d) $xy'' + y' + xy = 0$
(e) $xy'' + xy' - y = 0$
(f) $x^2 y'' + xy' - 2y = 0$
(g) $x^2 y'' + xy' - (1 + 2x)\,y = 0$
(h)–(l) …
(m) $x^2 y'' - (2 + 3x)\,y = 0$
(n)–(p) …
(q) $16x^2 y'' + 8xy' - 3y = 0$
(r) $16x^2 y'' + 8xy' - (3 + x)\,y = 0$
(s) $x^2 y'' + xy' + (\sin x)\,y = 0$

8. Use the method of Frobenius to obtain a general solution to the equation $xy'' + cy' = 0$ on $x > 0$, where $c$ is a real constant. You may need to treat different cases, depending upon $c$.

9. (a) The equation
$$(x^2 - x)\,y'' + (4x - 2)\,y' + 2y = 0, \qquad (0 < x < 1) \tag{9.1}$$
has been "rigged" to have, as solutions, $1/x$ and $1/(1 - x)$. Solve (9.1) by the method of Frobenius, and show that you do indeed obtain those two solutions.
(b) You may have wondered how we made up the equation (9.1) so as to have the two desired solutions. Here, we ask you to make up a linear homogeneous second-order differential equation that has two prescribed LI solutions $F(x)$ and $G(x)$.

10. (Complex roots) Since $p(x)$ and $q(x)$ are real-valued functions, $p_0$ and $q_0$ are real. Thus, if the indicial equation (38) has complex roots they will be complex conjugates, $r = \alpha \pm i\beta$, so case (i) of Theorem 4.3.1 applies, and the method of Frobenius will give a general solution of the form
$$y(x) = Ay_1(x) + By_2(x) = Ax^{\alpha + i\beta} \sum_{0}^{\infty} a_n x^n + Bx^{\alpha - i\beta} \sum_{0}^{\infty} b_n x^n. \tag{10.1}$$
…
$$x^{\alpha \pm i\beta} = x^{\alpha}\left[\cos(\beta \ln x) \pm i \sin(\beta \ln x)\right] \tag{10.2}$$
…
$$y(x) = Cx^{\alpha}\left(\cos(\beta \ln x) \sum_{0}^{\infty} c_n x^n - \sin(\beta \ln x) \sum_{0}^{\infty} d_n x^n\right) + Dx^{\alpha}\left(\cos(\beta \ln x) \sum_{0}^{\infty} d_n x^n + \sin(\beta \ln x) \sum_{0}^{\infty} c_n x^n\right), \tag{10.3}$$
where $c_n$, $d_n$ are the real and imaginary parts of $a_n$, respectively: $a_n = c_n + i d_n$.
(c) Find a general solution of the form (10.3) for the equation … That is, determine $\alpha$, $\beta$ and $c_n$, $d_n$ …
(Ginz)], —sin(GInz) 7p draw”) (p)2zy" + e*y'+y =0 7. (a)—(x) Use computer software to obtain a general solution of the corresponding differential equation in Exercise 6. byw”. show that (10.1) [with },, replaced by @,, according to the result found in part (a) above] can be re-expressed in terms of real functions as (n) bay" + y' + 8x7y = 0 (o) ry" +e"y =0 (t)(xy) —9y'+zy = 0 (u)(zy’)’ -y =0 (v)(xy’)’ —2y'-y =0 0 (a) Show that the 6,,’s will be the complex conjugates of the Qn’S: bn = Gn. (b) Recalling, from Section 3.6.1, that (h)c?7y”+ ay’ -y =0 (i)zy" +2y + (1+2)y=0 (j)3zy" + y' +y =0 (kK) c(L +x)y" +y = 0 (I)¢?(2+ ay" —~y=0 (10.1) ane” + Bat say. ty =0. in (10.3), through n = 3, (d)The sameas(c), for 2*y" + ay’ +(1 - x)y = 0. 212 4.4 Legendre Functions 4.4.1. Legendre polynomials. The differential equation (1) (1 — x”) y — Qaey'+ Ay = 0, where A is a constant, is known as Legendre’s equation, after the French mathematician Adrien-Marie Legendre (1752-1833). The z interval of interest is —1 < x < 1, and (1) has regular singular points at each endpoint, « = -1. In this section we study aspects of the Legendre equation and its solutions that will be needed in applications in later chapters. There, we will be interestedin power series solutions about the ordinary point « = 0, (2) y() =S>ana. k=0 Putting (2) into (1) leads to the recursion formula (Exercise 1) k(k+1)-A a2 = Pyke) = (k=0,1,2,--- ay. ) 3 (3) Setting A = 0,1,2,..., in turn, shows that ao and a, are arbitrary, and that subsequent a,’s can be expressed, alternately, in terms of ag and ay: r —F5a9, a= 2—<A 43 = 2 . 6 (6—A)A 40, aq = —~—zr ay, 24 and so on, and we have the general solution ON a0 u(e) = ao| +4 |: t 2—AX 6 e+ 24” (12 (CU) 4 —A)(2—A) a | Clr 720 Lo = | (4) = agy1(«) + ayye(x) of (1). 
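The way the recursion formula generates the two series in (4) can be sketched in a few lines of Python. This is only an illustration: it assumes that (3) reads $a_{k+2} = \dfrac{k(k+1)-\lambda}{(k+1)(k+2)}\,a_k$, and the function name `legendre_series_coeffs` and its parameters are hypothetical, not part of the text.

```python
from fractions import Fraction

def legendre_series_coeffs(lam, a0=1, a1=0, kmax=8):
    """Return [a_0, ..., a_kmax] for y = sum a_k x^k, built from the
    recursion a_{k+2} = (k(k+1) - lam) / ((k+1)(k+2)) * a_k (assumed form of (3))."""
    a = [Fraction(0)] * (kmax + 1)
    a[0] = Fraction(a0)
    if kmax >= 1:
        a[1] = Fraction(a1)
    for k in range(kmax - 1):
        # exact rational arithmetic, so vanishing coefficients show up as exact zeros
        a[k + 2] = (Fraction(k * (k + 1)) - Fraction(lam)) / ((k + 1) * (k + 2)) * a[k]
    return a

# even-powered series (a0 = 1, a1 = 0) and odd-powered series (a0 = 0, a1 = 1)
even_series = legendre_series_coeffs(1, a0=1, a1=0)
odd_series = legendre_series_coeffs(1, a0=0, a1=1)
print(even_series)
print(odd_series)
```

Calling the function with `a0=1, a1=0` gives the coefficients of $y_1(x)$, and with `a0=0, a1=1` those of $y_2(x)$; exact fractions are used so that coefficients that vanish for particular values of $\lambda$ appear as exact zeros rather than as roundoff.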
To determine the radii of convergence of the two series in (4) we can use the recursion formula (3) and Theorem 4.2.2, provided that we realize that the $a_{k+2}$ on the left side of (3) is really the next coefficient after $a_k$, the "$a_{k+1}$" in Theorem 4.2.2, since every other term in each series is missing. Thus, (3) gives

$$\lim_{k\to\infty}\left|\frac{a_{k+2}}{a_k}\right| = \lim_{k\to\infty}\left|\frac{k(k+1)-\lambda}{(k+1)(k+2)}\right| = 1, \tag{5}$$

and it follows from Theorem 4.2.2 that $R = 1$, so each series converges in $-1 < x < 1$.

In physical applications of the Legendre equation, such as finding the steady-state temperature distribution within a sphere subjected to a known temperature distribution on its surface, one needs solutions that are bounded on $-1 \le x \le 1$. [$F(x)$ being bounded on an interval $I$ means that there exists a finite constant $M$ such that $|F(x)| \le M$ for all $x$ in $I$. If $F(x)$ is not bounded then it is unbounded.] However, for arbitrary values of the parameter $\lambda$ the functions $y_1(x)$ and $y_2(x)$ given in (4) grow unboundedly as $x \to \pm 1$, as illustrated in Fig. 1 for $\lambda = 1$.

Figure 1. $y_1(x)$ and $y_2(x)$ in (4), for $\lambda = 1$.

If you studied Section 4.3, then you could investigate why that is so by developing a Frobenius-type solution about the right endpoint $x = 1$, which is a regular singular point of (1). Doing so, you would find a $\ln(1 - x)$ term, within the solutions, which is infinite at $x = 1$. Similarly, a Frobenius solution about $x = -1$ would reveal a $\ln(1 + x)$ term, which is infinite at $x = -1$. Evidently, $y_1(x)$ and $y_2(x)$, above, contain linear combinations of $\ln(1 - x)$ and $\ln(1 + x)$ [of course, one cannot see them explicitly in (4) because (4) is an expansion about $x = 0$, not $x = 1$ or $x = -1$] so they grow unboundedly as $x \to \pm 1$. Nonetheless, for certain specific values of $\lambda$ one series or the other, in (4), will terminate and thereby be bounded on the interval since it is then a finite degree polynomial!
Specifically, if $\lambda$ happens to be such that

$$\lambda = n(n+1) \tag{6}$$

for any integer $n = 0, 1, 2, \ldots$, then we can see from (3) that one of the two series terminates at $k = n$: if $\lambda = n(n+1)$, where $n$ is even, then the even-powered series terminates at $k = n$ (because $a_{n+2} = a_{n+4} = \cdots = 0$). For example, if $n = 2$ and $\lambda = 2(2+1) = 6$, then the $6 - \lambda$ factor in every term after the second, in the even-powered series, causes all those terms to vanish, so the series terminates as $1 - 3x^2$. Similarly, if $\lambda = n(n+1)$, where $n$ is odd, then the odd-powered series terminates at $k = n$. The first five such $\lambda$'s, and their corresponding polynomial solutions of (1), are shown in the second and third columns of Table 1.

Table 1. The first five Legendre polynomials.
n | λ = n(n+1) | Polynomial solution | Legendre polynomial P_n(x)
0 | 0 | $1$ | $P_0(x) = 1$
1 | 2 | $x$ | $P_1(x) = x$
2 | 6 | $1 - 3x^2$ | $P_2(x) = \frac{1}{2}(3x^2 - 1)$
3 | 12 | $x - \frac{5}{3}x^3$ | $P_3(x) = \frac{1}{2}(5x^3 - 3x)$
4 | 20 | $1 - 10x^2 + \frac{35}{3}x^4$ | $P_4(x) = \frac{1}{8}(35x^4 - 30x^2 + 3)$

These polynomial solutions can, of course, be scaled by any desired numerical factor. Scaling them to equal unity at $x = 1$, by convention, they become the so-called Legendre polynomials. Thus, the Legendre polynomial $P_n(x)$ is a polynomial solution of the Legendre equation

$$(1 - x^2)y'' - 2xy' + n(n+1)y = 0, \tag{7}$$

scaled so that $P_n(1) = 1$. In fact, it can be shown that they are given explicitly by the formula

$$P_n(x) = \frac{1}{2^n\, n!}\,\frac{d^n}{dx^n}\left[(x^2 - 1)^n\right], \tag{8}$$

which result is known as Rodrigues's formula.

4.4.2. Orthogonality of the $P_n$'s. For reasons that will not be evident until we study vector spaces (Section 9.6), the integral formula

$$\int_{-1}^{1} P_j(x)\,P_k(x)\,dx = 0 \qquad (j \ne k) \tag{9}$$

is known as the orthogonality relation. By virtue of (9), we say that $P_j(x)$ and $P_k(x)$ are orthogonal to each other, provided that they are different Legendre polynomials (i.e., $j \ne k$). Proof of (9) is not difficult.
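Before turning to the proof, the orthogonality relation can be spot-checked with exact arithmetic. The sketch below assumes Rodrigues's formula in the form $P_n(x) = \frac{1}{2^n n!}\frac{d^n}{dx^n}(x^2-1)^n$ and stores each polynomial as a list of coefficients $[c_0, c_1, \ldots]$; the helper names (`poly_mul`, `poly_deriv`, `legendre`, `inner`) are illustrative only.

```python
from fractions import Fraction
from math import factorial

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists [c0, c1, ...]."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def poly_deriv(p):
    """Derivative of a polynomial given as a coefficient list."""
    return [i * c for i, c in enumerate(p)][1:] or [Fraction(0)]

def legendre(n):
    """Coefficients of P_n from Rodrigues's formula (assumed form of (8))."""
    p = [Fraction(1)]
    for _ in range(n):                       # build (x^2 - 1)^n
        p = poly_mul(p, [Fraction(-1), Fraction(0), Fraction(1)])
    for _ in range(n):                       # differentiate n times
        p = poly_deriv(p)
    scale = Fraction(1, 2**n * factorial(n))
    return [scale * c for c in p]

def integrate_m1_to_1(p):
    """Exact integral over [-1, 1]: x^i contributes 2/(i+1) if i is even, else 0."""
    return sum(c * Fraction(2, i + 1) for i, c in enumerate(p) if i % 2 == 0)

def inner(j, k):
    """Integral of P_j(x) P_k(x) over [-1, 1]."""
    return integrate_m1_to_1(poly_mul(legendre(j), legendre(k)))

for j in range(5):
    print([str(inner(j, k)) for k in range(5)])
```

Because the arithmetic is rational, the off-diagonal entries of the printed array are exact zeros rather than small floating-point numbers, which is the content of (9) for the first five polynomials.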
Noting that the $(1 - x^2)y'' - 2xy'$ terms in (7) can be combined as $[(1 - x^2)y']'$, we begin by considering $\int_{-1}^{1} [(1 - x^2)P_j']'\,P_k\,dx$ and integrating by parts until all the derivatives have been transferred from $P_j$ to $P_k$:

$$\begin{aligned}
\int_{-1}^{1} \left[(1 - x^2)P_j'\right]' P_k\,dx
&= (1 - x^2)P_j' P_k\Big|_{-1}^{1} - \int_{-1}^{1} (1 - x^2)P_j' P_k'\,dx \\
&= 0 - (1 - x^2)P_j P_k'\Big|_{-1}^{1} + \int_{-1}^{1} P_j\left[(1 - x^2)P_k'\right]' dx \\
&= \int_{-1}^{1} P_j\left[(1 - x^2)P_k'\right]' dx.
\end{aligned} \tag{10}$$

The next to last term is zero because of the $1 - x^2$ factor, just as the boundary term following the first equal sign is zero. Since $P_j$ and $P_k$ are solutions of the Legendre equation

$$\left[(1 - x^2)y'\right]' + n(n+1)y = 0 \tag{11}$$

for $n = j$ and $k$, respectively, we can use (11) to re-express (10) as

$$-j(j+1)\int_{-1}^{1} P_j P_k\,dx = -k(k+1)\int_{-1}^{1} P_j P_k\,dx \tag{12}$$

or

$$\left[k(k+1) - j(j+1)\right]\int_{-1}^{1} P_j(x)\,P_k(x)\,dx = 0. \tag{13}$$

Since $j \ne k$, it follows from (13) that $\int_{-1}^{1} P_j P_k\,dx = 0$, as was to be proved. We will see later that (9) is but a special case of the more general orthogonality relation found in the Sturm-Liouville theory, which theory will be essential to us when we solve partial differential equations.

4.4.3. Generating function and properties. Besides (9), another important property of Legendre polynomials is expressed by the formula

$$\left(1 - 2xr + r^2\right)^{-1/2} = \sum_{n=0}^{\infty} P_n(x)\,r^n. \qquad (|x| \le 1,\ |r| < 1) \tag{14}$$

That is, if we regard the left side of (14) as a function of $r$ and expand it in a Taylor series about $r = 0$, then the coefficient of $r^n$ turns out to be $P_n(x)$. Thus, $(1 - 2xr + r^2)^{-1/2}$ is called the generating function for the $P_n$'s (Exercise 4). Equation (14) is the source of considerable additional information about the $P_n$'s. For instance, by changing $x$ to $-x$ in (14) it can be seen that

$$P_n(-x) = (-1)^n P_n(x). \tag{15}$$

Now, if $f(-x) = f(x)$, then the graph of $f$ is symmetric about $x = 0$ and we say that $f$ is an even function of $x$. If, instead, $f(-x) = -f(x)$, then the graph of $f$ is antisymmetric about $x = 0$ and we say that $f$ is an odd function of $x$.
Noting that $(-1)^n$ is $+1$ if $n$ is an even integer and $-1$ if $n$ is an odd integer, we see from (15) that $P_n(x)$ is an even function of $x$ if $n$ is an even integer, and an odd function of $x$ if $n$ is an odd integer, as is seen to be true for the $P_n$'s that are shown in Fig. 2.

Figure 2. Graphs of the first five Legendre polynomials.

Also, by taking $\partial/\partial r$ of (14) one can show (Exercise 6) that

$$nP_n(x) = (2n - 1)\,x\,P_{n-1}(x) - (n - 1)\,P_{n-2}(x), \qquad (n = 2, 3, \ldots) \tag{16}$$

which is a recursion relation giving $P_n$ in terms of $P_{n-1}$ and $P_{n-2}$. Or by taking $\partial/\partial x$ of (14) instead, one can show (Exercise 7) that

$$P_n'(x) - 2x\,P_{n-1}'(x) + P_{n-2}'(x) = P_{n-1}(x). \qquad (n = 2, 3, \ldots) \tag{17}$$

Finally, squaring both sides of (14) and integrating on $x$ from $-1$ to $+1$, and using the orthogonality relation (9), one can show that

$$\int_{-1}^{1} \left[P_n(x)\right]^2 dx = \frac{2}{2n + 1}, \qquad (n = 0, 1, 2, \ldots) \tag{18}$$

which result is a companion to (9); it covers the case where $j = k$ ($= n$, say). We will need (9) and (18) in later chapters.

Closure. Our principal application of Legendre's equation and Legendre polynomials, in this text, is in connection with the solution of the Laplace equation in spherical coordinates. There, we need to know how to expand a given function in terms of the Legendre polynomials $P_0(x), P_1(x), P_2(x), \ldots$, and the theory behind such expansions is covered in Section 17.6 on the Sturm-Liouville theory. To help put that idea into perspective, recall from a first course in physics or mechanics or calculus that one can expand any given vector in 3-space in terms of orthogonal (perpendicular) vectors "i, j, k." That fact is of great importance and was probably used extensively in those courses. Remarkably, we will be able to generalize the idea of vectors so as to regard functions as vectors. It will turn out that the set of Legendre polynomials $P_0, P_1, \ldots$
constitute an infinite orthogonal set of vectors such that virtually any given Function defined on -1 < @ < lcan be expanded in termsot them, just as any given “arrow vector” in 3-space can be expanded in termsof ii ,j,k. In the present section we have not gotten that far, but he results obtained here will be used later, when we finish the story. For a more extensive treatment of Legendre functions, Bessel functions, and the various other important special functions of mathematical physics, see, for instance, D. E. Johnson and J. R. Johnson, Mathematical Methods in Engineering and Physics (Englewood Cliffs, NJ: Prentice Hall, 1982). Even short of a careful study of the other special functions —such as those associated with the names Bessel, Hermite, Laguerre, Chebyshev, — we recommend browsing and Mathieu through a book like Johnson and Johnson so as to at least become aware of these functions and the circumstances under which they arise. Computer software. In Maple, P,,(a) is denotedas P(n,x). say,enter To obtain P7(x), with(orthopoly): and return, then P(7,z); and return.The result is a x! EXERCISES G03x + oo ee 4.4 1. Putting (2) into (1), derive the recursion formula (3). (b) Verify (17) forn = 2. 2. Obtain (4) using computer software. (c) Verify (17) form = 3. 3. e v. Use Rodrigues’s formula, (8), to reproduce the first five Legendre polynomials, cited in Table | 4. Expanding the left-hand side of (14) in a Taylor series in r, about r = 0, through 7%,say,De that the coefficients of r°,...,r3 areindeedPo(x),..., P3(ax),respectively. Squaring (14) and integrating 8. (a) Derive (18) as follows. from —1 to 1, obtain 1 57 eal [—e-f m=) Sir P,(2)) da. n=Q (8.1) relation orthogonality the using and side, left Integrating the Show thosestepsand explain (9) to simplify the right side, obtain 5. We stated that by changing 2 to —x in (14) it can be seen thatP,(—a) = (-1)"P,,(x). your reasoning. to(HE)-E{[mora a2 6. 
(a) We stated that by taking 0/0r of (14) one obtains (16). Show those steps and explain your reasoning. (b) Verify (16) forn = 2. (c) Verify (16) for n = 3. 7. (a) We stated that by taking 0/0c of (14) one obtains (17). Show those steps and explain your reasoning. n=) ai Finally, expanding the left-hand side in a Taylor series in r, show that (18) follows. (b) Verify (18), by working out the integral, for the cases n= 0,1, and 2. 4.4. Legendre Functions 9, (Integral representation of P,,) It can be shown that 217 Lea x Qi(e)=C § In(==) - i 4+Dyx.— (1t4b) = 4 to (x + f/x? — 1 cos t)" dt, Pale) (9.1) By convention, choose Co = 1, Dp = 0, Cy = land D, = 0, (n = 0,1,2,...) so that which is called Laplace’s integral form for P,,(a). Here, we ask you to verify (9.1) for the cases n = 0,1, and 2, by working out the integral for those cases. 10. We sought power series expansions of (1) about the ordinary point 2 = 0 and, for the case where \ = n(n+1), we obtained a bounded solution (namely, the Legendre polynomial P,,(z)] and an unboundedsolution. Instead,seek a Frobeniustype solution about the regular singular points z = 1 and xz= —1,for the case where {fajn=0 (A=0) (c)n=2 (A=2) (b)n=1 1 11. (Legendre functions of second kind) For the Legendre equation (7) on the interval (11.5a) = ↕in(2t®)-1 Orla) (2) = = —ln ~I. (11.5) ∏ ∶ ∏∏ as the P,,’s. Thus, with Qo and Q, in hand we can use (16) to obtain Qs, Qs, and so on. Do that: show that Q2(z) = (A=1) l+2 Qo(x) = 5 in (; = 2), 3a" ~ 1 ri In ( l+ez +) 3 9% (11.6) andobtainQ3(z) aswell. < x2 < 1, we obtained the -1 boundedsolution y(x) = P(x). In this exercisewe seek a secondLI solution,denotedas Q,(x) and called the Legen- 12. (Electric field induced by two charges) Given a positive dre function of the second kind. Then the general solution of (7) can be expressed as charges lie on the z axis, it follows y(z) = AP, (2) + BQn(2). 
(a) For the special case n charge Q and a negative charge —Q, a distance 2a apart, let us introduce a coordinate system as in the figure below. Since the (11.1) that the electric field that they induce will be symmetric about the z axis. = 0, solve (7) and show that a secondLI solution is In[(1 + 2)/(1 —x)]. Scaling this solution by 1/2, we define fl 4. Qo(x) = sin (722), (11.2) l—-«ax Sketch the graph of Qo(z) on -1 < a < 1, and notice that|Qo(z)| + co asa > +1. (b) More generally, consider any nonnegative integer n. With only P,,(z) in hand, seek a second solution (by reductionof order)in the form y(x) = A(x)P,(x), and show thatQ,(2) is given by Qr(z) = | aa) e lt ie )(Pa)? +e DyPple). (a) Specifically, the electric potential (i.c., the voltage) ® in- ducedby achargeq is ®= (1/47req)(q/r),wherethephysical (11.3) constant €9 is the permittivity of free space and r is the dis(c) Evaluating the integral in (11.3), show that the first two tance from the charge to the field point. Thus the potential Qn's are induced at the field point P shown in the figure is Qo(a) = 1 300 In (; Ll+a2 - | + Do, Cl {.4a) P= = L (2-2). 4még \ Pe p- (12.1) 218 Chapter 4. Power Series Solutions though a@is not tending to zero and @ to infinity, the field induced by that molecule is approximately equal to that of an idealized dipole of strength pz= 2@a, at points sufficiently far away (ie., for p/a >> 1). (c) As a different limit of interest, imagine the point P as fixed, and this time let @become arbitrarily large. Show, from Show that (12.1) gives Arey a? + p? —2apcos 1 a? + p? + 2apcos;| 1 trap 2Q 20 (12.2), that we obtain a\” > (2) @)~ ®&(p, Pr(cos@) (p>a) 1 Aireg 2Q —z pcos a = 1 2Q —Zz 4rreg a? (12.5) as a — oo. Notice that if, as @is increased, Q is increased such that Q/a? 
is held constant, then theelectric field intensity E (which, we recall from a course in physics, is the negative (12.2) of the derivative of the potential) is @constant: (b) With the point P fixed, imagine letting a become arbitrarE= ily small. Show, from (12.2), that we obtain (12.6) (2)" P,(cos@).(p<a) da 1 cos @ ®(p,¢)~ Arey 2(a p (12.3) constant, then (12.3) becomes [LE cos@ p? Ameg2a?’ dz asa > 0. Thus, ®(p,¢) — 0 as a -+ 0, as makes sense, because the positive and negative charges cancel each other as they are moved together. However, observe that if, as a is decreased, Q is increased such that the product Qa is held &(p,)~ dren 1 Q that is, we have a uniform field. Thus, a uniform electric field can be thought of as resulting from moving apart two charges, +Q and —Q, and at the sametime increasingtheir strengthQ such that Q/a? is held constant as a —->oo. Similarly, in fluid mechanics, a uniform fluid velocity field can be thought of as resulting from moving a fluid “source” of strength +@ and a fluid “sink” of strength —Q apart in such a way that Q/a? is held constant as @—+00, where 2a is their separationdistance, as sketched schematically in the figure. (12.4) where pp= 2Qa is called the dipole moment, and the charge configuration is said to constitute an electric dipole. If, for instance, a molecule is comprised of equal and opposite charges, +Q and —Q, displaced by a very small distance 2a, then, even nc ep @ + Q re —__ I ee AA ® ~ Q 219 For example, if =[iire%ds,n=fber* del va, [3 = f Ig = for Jxe* dz, dz/(x — 1), w thenJ; is singular due to the infinite limit, />is singular becausetheintegrandtends to co as x -+ 0, J is singular because the integrand is unbounded (tends to —co as x — 1 from the left, and tends to +-oo as a — 1 from the right), and J, is regular. 
Most of our interest will be in integrals that are singular by virtue of an infinite upper limit (illustrated by $I_1$) and/or a singularity in the integrand at the left endpoint (illustrated by $I_2$), so we limit this brief discussion to those cases. Other cases are considered in the exercises.

Consider the first type, $\int_a^\infty f(x)\,dx$. Analogous to our definition $\sum_{n=0}^{\infty} a_n = \lim_{N\to\infty}\sum_{n=0}^{N} a_n$ of an infinite series, we define

$$I = \int_a^\infty f(x)\,dx \equiv \lim_{X\to\infty}\int_a^X f(x)\,dx. \tag{3}$$

If the limit exists, we say that $I$ is convergent; if not, it is divergent.

Recall, from our review of infinite series in Section 4.2, that the necessary and sufficient condition for the convergence of an infinite series is given by the Cauchy convergence theorem, but that theorem is difficult to apply. Thus, in the calculus, we studied a wide variety of specialized but more easily applied methods and theorems. For instance, one proves, in the calculus, that the p-series,

$$\sum_{n=1}^{\infty} \frac{1}{n^p}, \tag{4}$$

converges if $p > 1$ and diverges if $p \le 1$, the case $p = 1$ giving the well known (and divergent) harmonic series $1 + \frac{1}{2} + \frac{1}{3} + \cdots$. That is, the terms need to die out fast enough, as $n$ increases, for the series to converge. As $p$ is increased, they die out faster and faster, and the borderline case is $p = 1$, with convergence requiring $p > 1$.

Then one establishes one or more comparison tests. For instance: If $S_1 = \sum_0^\infty a_n$ and $S_2 = \sum_0^\infty b_n$ are series of finite positive terms, and $a_n \sim K b_n$ as $n \to \infty$ for some finite constant $K$, then $S_1$ and $S_2$ both converge or both diverge. (The lower limits are inconsequential insofar as convergence/divergence is concerned and have been taken to be 0 merely for definiteness.) For instance, to determine the convergence or divergence of the series

$$S = \sum_{n=1}^{\infty} \frac{2n + 3}{n^4 + 5},$$

we observe that $\dfrac{2n + 3}{n^4 + 5} \sim \dfrac{2}{n^3}$ as $n \to \infty$. Now, $\sum_1^\infty 1/n^3$ is convergent because it is a p-series with $p = 3 > 1$, and by the comparison test stated above it follows that $S$ is convergent too.
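The comparison argument just given can be illustrated numerically. The sketch below takes the series to be $S = \sum_{n\ge 1} (2n+3)/(n^4+5)$ (the lower limit is an assumption and does not affect convergence) and checks two things: that the ratio of the general term to $2/n^3$ approaches 1, and that the partial sums settle down. The names `term` and `partial_sum` are illustrative.

```python
def term(n):
    """General term (2n + 3) / (n^4 + 5) of the series S."""
    return (2 * n + 3) / (n**4 + 5)

def partial_sum(N):
    """Sum of the first N terms, n = 1, ..., N."""
    return sum(term(n) for n in range(1, N + 1))

# ratio of the general term to the comparison term 2/n^3
for n in (10, 1_000, 1_000_000):
    print(n, term(n) / (2 / n**3))

# partial sums change less and less as N grows
for N in (10, 100, 1_000, 2_000):
    print(N, partial_sum(N))
```

A numerical experiment of this kind can only suggest convergence, of course; it is the comparison test that establishes it.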
Our development for determining the convergence/divergence of singular integrals is analogous to the development described above for infinite series. Analogous to the p-series, we study the horizontal p-integral,

$$I = \int_a^\infty \frac{dx}{x^p}, \qquad (a > 0) \tag{5}$$

where $p$ is a constant. (The name "horizontal p-integral" is not standard, and is explained below.) The latter integral is simple enough so that we can determine its convergence/divergence by direct evaluation. Then we can use that result, in conjunction with comparison tests, to determine the convergence/divergence of more complicated integrals. Proceeding,

$$I = \int_a^\infty \frac{dx}{x^p} = \lim_{X\to\infty}\int_a^X \frac{dx}{x^p} =
\begin{cases}
\displaystyle\lim_{X\to\infty} \ln x\,\Big|_a^X, & p = 1 \\[2ex]
\displaystyle\lim_{X\to\infty} \frac{x^{1-p}}{1-p}\,\Big|_a^X, & p \ne 1.
\end{cases} \tag{6}$$

Now, $\lim_{X\to\infty} \ln X$ is infinite and hence does not exist, and similarly for $\lim_{X\to\infty} X^{1-p}$ if $p < 1$, whereas the latter does exist if $p > 1$. Thus,

THEOREM 4.5.1 Horizontal p-Integral
The horizontal p-integral, (5), converges if $p > 1$ and diverges if $p \le 1$.

That result is easy to remember since the p-series, likewise, converges if $p > 1$ and diverges if $p \le 1$. Graphically, the idea is that $p$ needs to be positive enough (namely, $p > 1$) so that the infinitely long sliver of area (shaded in Fig. 1) is squeezed thin enough to have a finite area.

Figure 1. The effect, on $1/x^p$, of varying $p$.

We state the following comparison tests without proof.

THEOREM 4.5.2 Comparison Tests
Let $I_1 = \int_a^\infty f(x)\,dx$ and $I_2 = \int_a^\infty g(x)\,dx$, where $f(x)$ and $g(x)$ are positive (and bounded) on $a \le x < \infty$.
(a) If there exist constants $K$ and $X$ such that $f(x) \le K g(x)$ for all $x \ge X$, then the convergence of $I_2$ implies the convergence of $I_1$, and the divergence of $I_1$ implies the divergence of $I_2$.
(b) If $f(x) \sim C g(x)$ as $x \to \infty$, for some finite constant $C$, then $I_1$ and $I_2$ both converge or both diverge.

Of course, $K$ must be finite. Actually, (b) is implied by (a), but we have included it explicitly since it is a simpler statement and is easier to use. Note that $C$ cannot be zero because the notation $f(x) \sim 0$ makes no sense.
That ts, f(x) ~ g(a) as © — xo meansthat f(x)/g(a) + las x + xo, and f(x)/0 cannot possibly tend to 1. 1. EXAMPLE Consider J = | "9 2a +3 22 +3 fe - r dx. Since g et4+5 et+5 2 — as x — oo, and x3 fo” dx/x® is a convergent p-integral (p = 4 > 1), it follows from Theorem 4.5.2(b) that I is convergent. COMMENT. If, instead,theintegrandwere(22 + 3)/(x* +5), thentheintegralwouldbe p-integral (p = 1). divergentbecause(2¢+3)/(a? +5) ~ 2/a, and[5° dx/x is a divergent It wouldbe incorrectto arguethattheintegralconvergesbecause(22 + 3)/(2? +5) + 0 as x — oo. Tending to zero is not enough; the integrand must tend to zero fast enough. 4 Since the integrand of the integral in question might not be positive, as was assumedin Theorem 4.5.2, the following theorem is useful. THEOREM Co | 4.5.3 Absolute Convergence oO |f(x)| dx converges,then so does | a converges absolutely. f(x) da, and we say that the latter ° EXAMPLE 2.Consider [ =|q sa*+1 sing »OO positive. We have Now, I dz, the integrand of which is not everywhere sin | 1 1 < ow 322 4+1|~ 3e2 +1 3a? as . ISBros 7 7) dx/x* is a convergent p-integral (p = 2 > 1). Thus, by the asymptotic relation in (7)andTheorem4.5.2(b),[;° dz/(3a? + 1) converges.Next,by theinequalityin (7) andTheorem4.5.2(a),f>~|sina/(3a* + 1)| dx converges.Finally, by Theorem4.5.3,[ converges. EXAMPLE 3. ConsiderJ = {5°2!e~°°l"dx. It mightappearthatthis integral divergesbecauseof thedramaticgrowthof thez!°, in spiteof the e~°-°!*decay. Let us see, Writing 100 0.0128 _ wp100 e001 _ 14 (0.01n)+ igor) x 100 102!)10 < war = (o2/10" (0.012) 102 7) 102! e100 ‘ 5 foes 200 ) (8) we see, by comparison with the p-integral, with p = 2, that J converges. EXAMPLE 4. Observe that "Oo I =| da BOO =| aie AI ane) @ 3 (9) _ Jim In (Inx)|3 = 00, co 2 xcpad so I is divergent. 
This example illustrates just how weakly $\ln x \to \infty$ as $x \to \infty$, for the integral of $1/x$ is borderline divergent ($p = 1$), and the $\ln x$ in the denominator does not even provide enough help, as $x \to \infty$, to produce borderline convergence! ■

So much for the case where the upper limit is $\infty$. The other case that we consider is that in which the integrand "blows up" (i.e., tends to $+\infty$ or $-\infty$) at a finite endpoint, say the left endpoint $x = a$. If the integrand $f(x)$ blows up as $x \to a$, then in the same spirit as (3) we define

$$I = \int_a^b f(x)\,dx \equiv \lim_{\epsilon\to 0}\int_{a+\epsilon}^b f(x)\,dx, \tag{10}$$

where $\epsilon \to 0$ through positive values. We first consider the so-called vertical p-integral

$$I = \int_0^b \frac{dx}{x^p}. \qquad (b > 0) \tag{11}$$

According to (10),

$$I = \int_0^b \frac{dx}{x^p} = \lim_{\epsilon\to 0}\int_\epsilon^b \frac{dx}{x^p} =
\begin{cases}
\displaystyle\lim_{\epsilon\to 0} \ln x\,\Big|_\epsilon^b, & p = 1 \\[2ex]
\displaystyle\lim_{\epsilon\to 0} \frac{x^{1-p}}{1-p}\,\Big|_\epsilon^b, & p \ne 1.
\end{cases} \tag{12}$$

Now, $\lim_{\epsilon\to 0} \ln\epsilon$ is infinite ($-\infty$) and hence does not exist and, similarly for $\lim_{\epsilon\to 0} \epsilon^{1-p}$ if $p > 1$, whereas the latter limit does exist if $p < 1$. Thus,

THEOREM 4.5.4 Vertical p-Integral
The vertical p-integral, (11), converges if $p < 1$ and diverges if $p \ge 1$.

Recall that as $p$ is increased, the horizontal sliver of area (shaded in Fig. 1) is squeezed thinner and thinner. For $p > 1$ it is thin enough to have finite area. However, the effect near $x = 0$ is the opposite: increasing $p$ causes the singularity at $x = 0$ to become stronger, and the vertical column of area (shaded in Fig. 2) to become thicker. Thus, to squeeze the vertical column thin enough for it to have finite area, we need to make $p$ small enough; namely, we need $p < 1$.

Figure 2. The effect, on $1/x^p$, of varying $p$.

The motivation behind the terms "horizontal" and "vertical" p-integrals should now be apparent; the former involves the horizontal sliver shown (shaded) in Fig. 1, and the vertical p-integral involves the vertical sliver shown (shaded) in Fig. 2.

Next, we add the following comparison test:

THEOREM 4.5.5 Comparison Test
Let $I = \int_0^b f(x)\,dx$, where $0 < b < \infty$. If $f(x) \sim K/x^p$ as $x \to 0$ for some constants $K$ and $p$, and $f(x)$ is continuous on $0 < x \le b$, then $I$ converges if $p < 1$ and diverges if $p \ge 1$.

EXAMPLE 5. Test the integral $\int_0^b (\sin 2x / x^{3/2})\,dx$, with finite $b > 0$, for convergence/divergence. Evidently, the integrand blows up as $x \to 0$ and needs to be examined there more closely. Recalling the Taylor series $\sin 2x = (2x) - (2x)^3/3! + (2x)^5/5! - \cdots$, we see that $\sin 2x \sim 2x$ as $x \to 0$ [as can be verified, if you wish, by applying l'Hôpital's rule to show that $\sin(2x)/2x \to 1$ as $x \to 0$], so

$$\frac{\sin 2x}{x^{3/2}} \sim \frac{2x}{x^{3/2}} = \frac{2}{x^{1/2}}. \tag{13}$$

Thus, according to Theorem 4.5.5, with $p = 1/2$, the integral is convergent. ■

Example 5 concludes our introduction to singular integrals, and we are now prepared to study the gamma function.

4.5.2. Gamma function. The integral

$$\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\,dt \qquad (x > 0) \tag{14}$$

is nonelementary; that is, it cannot be evaluated in closed form in terms of the so-called elementary functions. Since it arises frequently, it has been given a name, the gamma function, and has been studied extensively.

Observe that the integral is singular for two reasons: first, the upper limit is $\infty$ and, second, the integrand blows up as $t \to 0$ if the exponent $x - 1$ is negative. To determine its convergence or divergence, we can separate the two singularities by breaking the integral into the sum of an integral from $t = 0$ to $t = \tau$, say, for any $\tau > 0$, plus another from $\tau$ to $\infty$.* In the first, we have $t^{x-1}e^{-t} \sim t^{x-1} = 1/t^{1-x}$ as $t \to 0$, and by Theorem 4.5.5 we see that we have convergence if $1 - x < 1$ (i.e., $x > 0$), and divergence if $x \le 0$. In the part from $\tau$ to $\infty$, we have convergence no matter how large $x$ is, due to the $e^{-t}$. Thus, the integral in (14) is convergent only if $x > 0$; hence the parenthetic stipulation in (14).

*That is, if $f(t)$ is unbounded as $t \to 0$, then the integral
$$\int_0^\infty f(t)\,dt \equiv \lim_{\substack{\epsilon\to 0 \\ T\to\infty}}\int_\epsilon^T f(t)\,dt = \lim_{\epsilon\to 0}\int_\epsilon^\tau f(t)\,dt + \lim_{T\to\infty}\int_\tau^T f(t)\,dt$$
exists if and only if each of the last two integrals exists.
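As a rough numerical illustration of the convergence of (14), the integral can be truncated at a large upper limit $T$ and approximated by a quadrature rule. The sketch below is restricted to $x \ge 1$, so that the integrand stays bounded at $t = 0$ (the singular case $0 < x < 1$ would need more care), and uses a composite Simpson rule; the function name `gamma_integral` and the choices $T = 60$ and $n = 100{,}000$ subintervals are assumptions made for the example. The result is compared with the standard library's `math.gamma`.

```python
import math

def gamma_integral(x, T=60.0, n=100_000):
    """Composite Simpson estimate of the integral of t^(x-1) e^(-t) over [0, T].

    Restricted to x >= 1 so that the integrand is bounded at t = 0;
    n must be even.
    """
    if x < 1:
        raise ValueError("this sketch only handles x >= 1")
    if n % 2:
        raise ValueError("n must be even for Simpson's rule")
    h = T / n

    def f(t):
        return t ** (x - 1) * math.exp(-t)

    total = f(0.0) + f(T)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3

for x in (1.0, 2.5, 5.0):
    print(x, gamma_integral(x), math.gamma(x))
```

The truncation at $T$ is harmless here because the $e^{-t}$ factor makes the neglected tail extremely small, which is the numerical counterpart of the statement that the part of the integral from $\tau$ to $\infty$ converges for every $x$.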
An important property of the gamma function can be derived from the definition (14) by integration by parts. With "$u$" $= t^{x-1}$ and "$dv$" $= e^{-t}\,dt$,

$$\Gamma(x) = -t^{x-1}e^{-t}\Big|_0^\infty + (x - 1)\int_0^\infty t^{x-2}e^{-t}\,dt. \tag{15}$$

The integral in (15) converges [and is $\Gamma(x - 1)$] only if $x > 1$ (rather than $x > 0$, because the exponent on $t$ is now $x - 2$), in which case the boundary term vanishes. Thus, (15) becomes

$$\Gamma(x) = (x - 1)\,\Gamma(x - 1). \qquad (x > 1) \tag{16}$$

The latter is a recursion formula because it gives $\Gamma$ at one point in terms of $\Gamma$ at another point. In fact, if we compute $\Gamma(x)$, by numerical integration, over a unit interval such as $0 < x \le 1$, then (16) enables us to compute $\Gamma(x)$ for all $x > 1$. For example,

$$\Gamma(3.2) = 2.2\,\Gamma(2.2) = (2.2)(1.2)\,\Gamma(1.2) = (2.2)(1.2)(0.2)\,\Gamma(0.2), \tag{17}$$

and one can find $\Gamma(0.2)$ in a table. (Actually, tabulations are normally given over the interval $1 \le x \le 2$ because accurate integration is difficult if $x$ is close to 0. In fact, tables are no longer essential since the gamma function is available within most computer libraries.) Note, in particular, that if $n$ is a positive integer, then

$$\Gamma(n + 1) = n\,\Gamma(n) = n(n - 1)\,\Gamma(n - 1) = \cdots = n(n - 1)(n - 2)\cdots(1)\,\Gamma(1), \tag{18}$$

and since

$$\Gamma(1) = \int_0^\infty e^{-t}\,dt = 1, \tag{19}$$

(18) becomes

$$\Gamma(n + 1) = n!. \tag{20}$$

Thus, the gamma function can be evaluated analytically at positive integer values of its argument. Another $x$ at which the integration can be carried out is $x = 1/2$, and the result is

$$\Gamma\!\left(\tfrac{1}{2}\right) = \sqrt{\pi}. \tag{21}$$

Derivation of (21) is interesting and is left for the exercises.

Recall that (14) defines $\Gamma(x)$ only for $x > 0$; for $x \le 0$ the integral diverges and (14) is meaningless. What is a reasonable way to extend the definition of $\Gamma(x)$ to negative $x$? Recall that if we know $\Gamma(x)$, then we can compute $\Gamma(x + 1)$ from the recursion formula $\Gamma(x + 1) = x\,\Gamma(x)$. Instead of using this formula to step to the right [for instance, recall that knowing $\Gamma(0.2)$ we were able, in (17), to compute $\Gamma(3.2)$], we can turn it around and use it to step to the left. Thus, let us define

$$\Gamma(x) \equiv \frac{\Gamma(x + 1)}{x} \qquad \text{for } x < 0. \tag{22}$$

For example,

$$\Gamma(-2.6) = \frac{\Gamma(-1.6)}{-2.6} = \frac{\Gamma(-0.6)}{(-2.6)(-1.6)} = \frac{\Gamma(0.4)}{(-2.6)(-1.6)(-0.6)}, \tag{23}$$

where $\Gamma(0.4)$ is known because its argument is positive. The resulting graph of $\Gamma(x)$ is shown in Fig. 3.

Figure 3. Gamma function, $\Gamma(x)$.

In summary then, $\Gamma(x)$ is defined for all $x \ne 0, -1, -2, \ldots$ by the integral (14) together with the leftward-stepping recursion formula (22). The singularity of $\Gamma(x)$ at $x = 0$ propagates to $x = -1, -2, \ldots$ by virtue of that formula. Especially notable is the fact that $\Gamma(x) = (x - 1)!$ at $x = 1, 2, 3, \ldots$, and for this reason $\Gamma(x)$ is often referred to as the generalized factorial function.

A great many integrals are not themselves gamma function integrals but can be evaluated by making suitable changes of variables so as to reduce them to gamma function integrals.

EXAMPLE 6. Evaluate $I = \int_0^\infty t^{2/3}e^{-\sqrt{t}}\,dt$. Setting $\sqrt{t} = u$, we obtain

$$I = \int_0^\infty (u^2)^{2/3}e^{-u}\,2u\,du = 2\int_0^\infty u^{7/3}e^{-u}\,du = 2\,\Gamma\!\left(\tfrac{10}{3}\right), \tag{24}$$

where $\Gamma(10/3)$ can be obtained from tables or a computer. ■

4.5.3. Order of magnitude. In some of the foregoing examples it was important to assess the relative magnitudes of two given functions. In Example 3, for instance, the $x^{100}$ grows as $x \to \infty$ while the $e^{-0.01x}$ decays. Which one "wins," and by what margin, determines whether the integral converges or diverges. Of particular interest are the relative growth and decay of the exponential, algebraic, and logarithmic functions as $x \to \infty$ and $x \to 0$, and we state the following elementary results as a theorem, both for emphasis and for reference.

THEOREM 4.5.6 Relative Growth and Decay
For any choice of positive real numbers $\alpha$ and $\beta$,

$$x^{\alpha}e^{-\beta x} \to 0 \quad \text{as } x \to \infty, \tag{25a}$$
$$(\ln x)/x^{\alpha} \to 0 \quad \text{as } x \to \infty, \tag{25b}$$
$$x^{\alpha}\ln x \to 0 \quad \text{as } x \to 0. \tag{25c}$$

Proof of (25a) can proceed as a generalization of (8) or by using l'Hôpital's rule, and (25b,c) can be proved by l'Hôpital's rule. To prove (25c), for example, observe that $x^{\alpha}\ln x \to (0)(-\infty)$, which result is indeterminate. To use l'Hôpital's rule we need to have $0/0$ or $\infty/\infty$. Thus, express $x^{\alpha}\ln x$ as $(\ln x)/x^{-\alpha}$, which tends to $-\infty/\infty$ as $x \to 0$.
Then l'Hôpital's rule gives

$$\lim_{x\to 0}\frac{\ln x}{x^{-\alpha}} = \lim_{x\to 0}\frac{1/x}{-\alpha x^{-\alpha-1}} = \lim_{x\to 0}\left(-\frac{x^{\alpha}}{\alpha}\right) = 0.$$

We say that $x^{\alpha}$ exhibits algebraic growth as $x \to \infty$, and we see from (25a) that algebraic growth $x^{\alpha}$ is no match for exponential decay $e^{-\beta x}$, no matter how large $\alpha$ is and no matter how small $\beta$ is! Of course, it follows from (25a) that $x^{-\alpha}e^{\beta x} \to \infty$ as $x \to \infty$: algebraic decay is no match for exponential growth.

Just as exponential growth is extremely strong, logarithmic growth is extremely weak, for (25b) shows that $x^{\alpha}$ dominates $\ln x$ as $x \to \infty$, no matter how small $\alpha$ is. Similarly as $x \to 0$: $x^{-\alpha} \to \infty$ and $\ln x \to -\infty$ (recall that $\ln x$ is zero at $x = 1$, increases without bound as $x \to \infty$, and decreases without bound as $x \to 0$; sketch it), and (25c), rewritten as $(\ln x)/x^{-\alpha} \to 0$, shows that $x^{-\alpha} \to \infty$ faster than $\ln x \to -\infty$, no matter how small $\alpha$ is.

Crudely then, we can think of $\ln x$ as being of the order of $x$ to an infinitesimal positive power as $x \to \infty$, and of the order of $x$ to an infinitesimal negative power as $x \to 0$. In contrast, one can think, crudely, of $e^{x}$ as being of the order of $x$ to an arbitrarily large positive power as $x \to \infty$, and $e^{-x}$ as being of the order of $x$ to an arbitrarily large negative power as $x \to \infty$.

When considering the relative strength of functions as $x$ tends to some value $x_0$, constant scale factors are of no consequence no matter how large or small they may be. For instance, $(87\ln x)/x^{0.01} \to 0$ as $x \to \infty$ just as $(\ln x)/x^{0.01}$ does. Thus, in place of the asymptotic notation $f(x) \sim g(x)$ as $x \to x_0$, which means that $f(x)/g(x) \to 1$ as $x \to x_0$, we will sometimes use the "big oh" notation

$$f(x) = O(g(x)) \quad \text{as } x \to x_0 \tag{26a}$$

to mean that*

$$f(x) \sim C\,g(x) \quad \text{as } x \to x_0 \tag{26b}$$

for some finite nonzero constant $C$. For instance, whereas

$$f(x) \sim \frac{\sqrt{1+\sqrt{3}}}{\Gamma(1+\sqrt{5})}\,\frac{1}{\sqrt{x}}$$

as $x \to 0$, it is simpler to write $f(x) = O(x^{-1/2})$ as $x \to 0$.
That is, the scale factor $C = \sqrt{1+\sqrt{3}}/\Gamma(1+\sqrt{5})$ can be omitted insofar as the order of magnitude of $f$ is concerned. In words, we say that $f$ is big oh of $x^{-1/2}$ as $x \to 0$. Of course, $x_0$ can be any point in (26); often, but not always, $x_0$ is 0 or $\infty$.

*Actually, the notation (26a) means that $f(x)/g(x)$ is bounded as $x \to x_0$. Our usage is more restricted but consistent with that definition, for if (26b) holds, then surely $f(x)/g(x)$ is bounded as $x \to x_0$. The restricted definition (26b) of (26a) is sufficient for our purposes and is easier to understand and use.

As one more illustration of the big oh notation, observe from the Taylor series

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots \tag{28}$$

that each of the following is true:

$$\sin x = O(x), \tag{29a}$$
$$\sin x = x + O(x^3), \tag{29b}$$
$$\sin x = x - \frac{x^3}{6} + O(x^5), \tag{29c}$$

and so on, as $x \to 0$. For instance, l'Hôpital's rule shows that $(\sin x)/x \to 1$ as $x \to 0$, so $\sin x \sim x$; hence (26b) holds, with $C = 1$, so (29a) is correct. Similarly, l'Hôpital's rule shows that $(\sin x - x)/x^3 \to -1/6$, so $\sin x - x \sim -x^3/6$; hence $\sin x - x = O(x^3)$ or $\sin x = x + O(x^3)$, so (29b) is correct.

The big oh notation is especially useful in working with series. For instance, (29b) states that if we retain only the leading term of the Taylor series (28), then the error thereby incurred is of order $O(x^3)$. Put differently, the portion omitted,

$$-\frac{x^3}{3!} + \frac{x^5}{5!} - \cdots,$$

is simply $O(x^3)$.

Closure. In Section 4.5.1, we define singular integrals as integrals in which something is infinite: one or both integration limits and/or the integrand. We make such integrals meaningful by defining them as limits of regular integrals. Just as the convergence and divergence of infinite series is a long story, so is the convergence and divergence of singular integrals, but our aim here is to consider only types that will arise in this text.
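As a numerical aside on the big oh estimates (29a) and (29b) of Section 4.5.3, the limits $(\sin x)/x \to 1$ and $(\sin x - x)/x^3 \to -1/6$ can be spot-checked with a few lines of Python. This is only a minimal sketch and not part of the original text; the helper names `ratio_29a` and `ratio_29b` are hypothetical, chosen purely for illustration.

```python
import math

def ratio_29a(x):
    """(sin x)/x; by (26b) with C = 1 this should tend to 1 as x -> 0."""
    return math.sin(x) / x

def ratio_29b(x):
    """(sin x - x)/x**3; by (26b) with C = -1/6 this should tend to -1/6 as x -> 0."""
    return (math.sin(x) - x) / x**3

if __name__ == "__main__":
    # Print both ratios for a decreasing sequence of x values.
    for x in (0.5, 0.1, 0.02):
        print(f"x = {x}: (sin x)/x = {ratio_29a(x):.6f}, "
              f"(sin x - x)/x^3 = {ratio_29b(x):.6f}")
```

Much smaller values of $x$ are best avoided in the second ratio, because the subtraction $\sin x - x$ then loses significant digits in floating-point arithmetic.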
Though the convergence of singular integrals of the type $\int_0^\infty f(x)\,dx$ and of infinite series $\sum_{n=1}^\infty a_n$ bear a strong resemblance (e.g., the $p$-series and horizontal $p$-integral both converge for $p > 1$ and diverge for $p \le 1$), one should by no means expect all results about infinite series to merely carry over. For instance, for series convergence it is necessary (but not sufficient) that $a_n \to 0$ as $n \to \infty$, but it is not necessary for the convergence of $\int_0^\infty f(x)\,dx$ that $f(x) \to 0$ as $x \to \infty$. For instance, we state without proof that $\int_0^\infty \sin(x^2)\,dx$ converges, even though $\sin(x^2)$ does not tend to zero as $x \to \infty$.

In Section 4.5.2, we introduce a specific and useful singular integral, the gamma function, and obtain its recursion formula and some of its values. The exercises indicate some of its many applications.

In the final section, 4.5.3, our aim is to clarify the relative orders of magnitude of exponential, algebraic, and logarithmic functions. It is important for you to be familiar with the results listed in Theorem 4.5.6, just as you are familiar with the relative weights of cannonballs and feathers. We also introduce a simple big oh notation which is especially useful in Chapter 6 on the numerical integration of differential equations.

Computer software. Many integrals can be evaluated by symbolic computer software. With Maple, for instance, the relevant command is int. To evaluate the integral $J$ in Example 6, for instance, enter

int(t^(2/3)*exp(-t^(1/2)), t = 0..infinity);

and return. The result is

$$\frac{112}{81}\,\frac{\pi\sqrt{3}}{\Gamma(2/3)},$$

which looks different from the result obtained in Example 6 but is actually equivalent to that result (Exercise 12). To evaluate the latter, enter

evalf(");

and return. The result is 5.556316963.

EXERCISES 4.5 1. If ~> Oand @ > 0 show that, no matter how large a is and (h) fx COSZar5 no matter how small G is, | of 2. If a > 0, showthatno matterhow small a is, ) [ (a) (Inz)/xe* +0 (b)a*/Ina->O0 asx — oo ast ∙∙ 0 3.
Show whether the given integral converges or diverges. As usual, be sure to explain your reasoning. (a) & a c ~ Jo (e)i 0 (f) 0 nec’ HINT:ShowthatIna < 2/4 for all sufficiently large x, by showing that (Inz)/a!/4 2 Ined J/g + 0 as x -+ 00. HINT: Make the change of variables av Ve 7 = ∞dx/x? o . . . . integral,andstatewhethertheresultingintegralis singularor Caen vt +100 dav not. (a)I Pa ee a= Va. of BES er 2 Ine dz a) | converge? Explain. C 6. Enter the indicated change of variables in the given singular (e) "© sin? a dx g 4, Show whether the given integral converges or diverges. 5. For what p’s, if any,does [ ∞∶ at 2 “aw dz 9 wt +2 a [ r2 COSL 1/€ andusethehint given in part(a). ‘ 2 0, (b) dx a (b) ~~ dx 4 HINT: Let€é = a —1. end a asx —oo (a) e%e7 8" +0 (b) a7 %e8* -3 00 «asa 4 00 TY g b *daLe °) o Va’ ∕ 1 es = ;dx E wt+2 “cosa aoe 1 dx g too 2 4.5. Singular Integrals; Gamma Function — 229 (d) °° cos x dx , ve 11. Evaluate as many of the integrals in Exercise 10 as possible using computer software. ie ’ 7. For what range of a’s (such as 0 < a < 2,a@ > 4, no a’s, etc.) does the given integral converge? Explain. °° a (a) [ dz (6) 0, vit] 2 o%sined df bP ) [| ue c 7) 4 0 de ‘b ve +3 12. In Example 6 we obtainedthe value 2P(10/3). Using Maple, instead,showthattheresultis (112/81)rV/3/T(2/3). x sin sina dx « da . 05 4\e 7, Then, use any formulas given in this section or in these exercises to show that the two results are equivalent. 13. Deduce, from the formula given in Exercise [O(a), that I'(z) ~ 1/e as x tendsto zero throughpositivevalues. x®dz 14, (Beta function) Derive the result 8. Evaluate, using a suitable recursion formula and the known value[(1/2) = \/7. Repeatthe evaluationusing computer 1 B(p,q) = software. “ (a)(3.5) (b)P(—3.5) —(c)F(6.5) 9, Derive(21),thatP(1/2) = 7. (1/2) =2| J0 (d)(0.5) HINT: Show that 0 show that =4 o f cau meee Jo (14.1) (14.2) zr?'e~*dz, [(p) = [ en du, = af CigI(p) 0 D(p+q)’ for p > 0, q > 0; B(p,q) is known as the beta function. 
HINT: Putting « = wu?in so that [P(i/2)]? =4f = P(p)l'(q) a? (1 —2)tdr 0 e 0 ew”du ~(ua? +07) du dv. Regarding the latter as a double integral in a Cartesian u,v plane, change from z, v to polar coordinates r, 0. The resulting double integral should be easier to evaluate. 10. Show by suitable change of variables that 0 da | wete 0 v@le-" dy, (14.3) Regarding the latter as a double integral in a Cartesian u,v plane, change from w, v to polar coordinates r, #. Making one more change of variables in each integral, the r integral gives ['(p + q) and the @integral gives B(p, q). 15. Derive, from (14.1) above, the alternative forms: (a) B(p,9) =| pet 8OO Gaore @ (15.1) (p>0, q>0) p 0 n! 7 ro (p > 0) HINT: Seta = ¢/(1 +t) in (14.1). nf2 B(p,q) = 2f (b) (b) z™ (Ina)"dx = (—1)" oy Jo (m+ 1)r*t vy —x 4 vail cos??~! 9 sin??~! @dO (15.2) (p>0, q >0) (m,n nonnegative integers) ~ 0 16. Using any results from the preceding two exercises, show that ie (a) JO 1 +1 q4+l1 (16.1) cos#sin?0d0= =B(2 2° i) 2 (p>-1, a ¢>-1) 230 Chapter 4. Power Series Solutions (b) / om[2 0 tan?0d0= / om{2 0 cot?@d0 ia lt+p1-p — 38 (=. L i ° x =) a | 29 Jo Vcos@ — cos 4 ~ Deosbe 2 T/4 dt, 0 (18.2) where T' is the period and @)is the maximum swing. We expect T' to depend on 09, so we denote it as T'(99). For the (16.2) case0) = 7/2,showthat for —1 < p< 1. HINT: You may use (17.1), below. ° (c) adr (1+ aye [ 1 atl cb-a-1 52 ( b ? b r (24) _l fora > -1,b>0,be-a>1. b nl P(1/4) T(n/2) = ) (16.3) (#=¢-4) T(c) ↕ ∶ ↨ 9 T/A) NOTE: You mayuseresultsfromtheprecedingexercises. (b) At first glance,it appearsfrom (18.2)thatT'(@9)> 0 as 9 — 0. Is that assessment correct? Explain. 17. It can be shown,from theresiduetheoremof thecomplex 19. Let F(x) = 4/(1 + a7) = 4-42? integralcalculus,that G(e) 2+ 32 (x) 7x —2+1 (a) = pgp PMO yg tM 0 ger! ine" u Gao (0<a<1) (17.1) J(z)i = Net (a) F(z) Using this result, (15.1), and (14.1), show that T(e)P(1 aa @) => ne 7 (0 <ac< 1) (17.2) 18. 
(Period of oscillation of pendulum) Conservation of energy dictates that the angular displacement $\theta(t)$ of a pendulum, of length $l$ and mass $m$, satisfies the differential equation 1 (4) =m (l@)+mg(l—Icos@) = j ) + mg 3a,, andK(z) Fin each: F(x) (d) ae =4— 32 Verif the truthof y asa 30 =O(1) asx 0 (b)F(z) =4+O0(2?) 3) 4 4at—.-.., 2x —3lnz ae 4x2 ~ Ae as (f) H(z) = O(r) (g) H(z) =O(1) (h)I(z) =O(a7!) + O (x*) ase asx 3 0 0 ee asx — 00 asx 70 asa —oo l—Icos@). (18.1 " ° )=mg(i—leos).(18-1) Fa) O(n) asa40 (a)From(18.1),showthat j) J(z) =O(x) (k) J(xz) =O(1) (1)K(z) =O(1) ast — co asx 0 asx—-0

4.6 Bessel Functions

The differential equation

$$x^2y'' + xy' + (x^2 - \nu^2)\,y = 0, \tag{1}$$

where $\nu$ is a nonnegative real number, is known as Bessel's equation of order $\nu$. The equation was studied by Friedrich Wilhelm Bessel (1784–1846), director of the astronomical observatory at Königsberg, in connection with his work on planetary motion. Outside of planetary motion, the equation appears prominently in a wide range of applications such as steady and unsteady diffusion in cylindrical regions, and one-dimensional wave propagation and diffusion in variable cross-section media, and it is one of the most important differential equations in mathematical physics.

Dividing through by the leading coefficient $x^2$, we see from $p(x) = 1/x$ and $q(x) = (x^2 - \nu^2)/x^2$ that there is one singular point, $x = 0$, and that it is a regular singular point because $xp(x) = 1$ and $x^2q(x) = x^2 - \nu^2$ are analytic at $x = 0$.

4.6.1. $\nu \ne$ integer. Consider the case where the parameter $\nu$ is not an integer. Seeking a Frobenius solution about the regular singular point $x = 0$,

$$y(x) = \sum_{k=0}^{\infty} a_k x^{k+r}, \qquad (a_0 \ne 0) \tag{2}$$

gives (Exercise 1)

$$\sum_{k=0}^{\infty}\left\{\left[(k+r)^2 - \nu^2\right]a_k + a_{k-2}\right\}x^{k+r} = 0, \tag{3}$$

where $a_0 \ne 0$ and $a_{-2} = a_{-1} \equiv 0$. Equating to zero the coefficient of each power of $x$ in (3) gives

$$k = 0: \quad (r^2 - \nu^2)\,a_0 = 0, \tag{4a}$$
$$k = 1: \quad \left[(r+1)^2 - \nu^2\right]a_1 = 0, \tag{4b}$$
$$k \ge 2: \quad \left[(r+k)^2 - \nu^2\right]a_k + a_{k-2} = 0. \tag{4c}$$

Since $a_0 \ne 0$, (4a) gives the indicial equation $r^2 - \nu^2 = 0$, with the distinct roots $r = \pm\nu$. First, let $r = +\nu$. Then (4b) gives $a_1 = 0$ and (4c) gives the recursion relation

$$a_k = -\frac{1}{k(k+2\nu)}\,a_{k-2}. \qquad (k \ge 2) \tag{5}$$

From (5), together with the fact that $a_1 = 0$, it follows that $a_1 = a_3 = a_5 = \cdots = 0$ and that

$$a_{2k} = \frac{(-1)^k}{2^{2k}\,k!\,(\nu+k)(\nu+k-1)\cdots(\nu+1)}\,a_0. \tag{6}$$

If $\nu$ were an integer, then the ever-growing product $(\nu+k)(\nu+k-1)\cdots(\nu+1)$ could be simplified into closed form as $(\nu+k)!/\nu!$. But since $\nu$ is not an integer, we seek to accomplish such simplification by means of the generalized factorial function, the gamma function. If we recall the gamma function recursion formula $\Gamma(x) = (x-1)\,\Gamma(x-1)$, then

$$\Gamma(\nu+k+1) = (\nu+k)\,\Gamma(\nu+k) = (\nu+k)(\nu+k-1)\,\Gamma(\nu+k-1) = \cdots = (\nu+k)(\nu+k-1)\cdots(\nu+1)\,\Gamma(\nu+1),$$

which gives $(\nu+k)(\nu+k-1)\cdots(\nu+1) = \Gamma(\nu+k+1)/\Gamma(\nu+1)$. With this replacement, (6) becomes

$$a_{2k} = \frac{(-1)^k\,\Gamma(\nu+1)}{2^{2k}\,k!\,\Gamma(\nu+k+1)}\,a_0, \tag{7}$$

so we have the solution

$$y(x) = a_0\,2^{\nu}\,\Gamma(\nu+1)\sum_{k=0}^{\infty}\frac{(-1)^k}{k!\,\Gamma(\nu+k+1)}\left(\frac{x}{2}\right)^{2k+\nu}. \tag{8}$$

Dropping the $a_0 2^{\nu}\Gamma(\nu+1)$ scale factor, we call the resulting solution the Bessel function of the first kind, of order $\nu$:

$$J_{\nu}(x) = \left(\frac{x}{2}\right)^{\nu}\sum_{k=0}^{\infty}\frac{(-1)^k}{k!\,\Gamma(\nu+k+1)}\left(\frac{x}{2}\right)^{2k}. \tag{9}$$

To obtain a second linearly independent solution, we turn to the other indicial root, $r = -\nu$. There is no need to retrace all of our steps; all we need to do is to change $\nu$ to $-\nu$ everywhere on the right side of (9). Denoting the result as $J_{-\nu}(x)$, the Bessel function of the first kind, of order $-\nu$, we have

$$J_{-\nu}(x) = \left(\frac{x}{2}\right)^{-\nu}\sum_{k=0}^{\infty}\frac{(-1)^k}{k!\,\Gamma(k-\nu+1)}\left(\frac{x}{2}\right)^{2k}. \tag{10}$$

Both series, (9) and (10), converge for all $x$, as follows from Theorem 4.3.1 with $R_1 = R_2 = \infty$ or from Theorem 4.2.2 and the recursion formula (5). The leading terms of the series in (9) and (10) are constants times $x^{\nu}$ and $x^{-\nu}$, respectively, so neither of the solutions $J_{\nu}(x)$ and $J_{-\nu}(x)$ is a scalar multiple of the other. Thus, they are LI, and we conclude that

$$y(x) = A\,J_{\nu}(x) + B\,J_{-\nu}(x) \tag{11}$$

is a general solution of (1).
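The series (9) and (10) are straightforward to sum numerically. The following is a minimal Python sketch, not taken from the text: it evaluates a partial sum of $J_{\nu}(x) = (x/2)^{\nu}\sum_k (-1)^k (x/2)^{2k}/[k!\,\Gamma(\nu+k+1)]$ using the standard-library gamma function, and compares it with the known half-integer closed forms $J_{1/2}(x) = \sqrt{2/(\pi x)}\,\sin x$ and $J_{-1/2}(x) = \sqrt{2/(\pi x)}\,\cos x$. The function name `bessel_j_series` and the default of 40 terms are assumptions made for illustration; the sketch assumes $x > 0$ and a noninteger (or nonnegative) order, so that no gamma function argument is zero or a negative integer.

```python
import math

def bessel_j_series(nu, x, terms=40):
    """Partial sum of the Frobenius series for J_nu(x), valid for x > 0.

    Assumes nu + k + 1 is never zero or a negative integer (true for
    noninteger nu and for nu >= 0); otherwise math.gamma raises ValueError.
    """
    half = x / 2.0
    total = 0.0
    for k in range(terms):
        # k-th term: (-1)^k / (k! * Gamma(nu + k + 1)) * (x/2)^(2k)
        total += (-1) ** k / (math.factorial(k) * math.gamma(nu + k + 1)) * half ** (2 * k)
    return half ** nu * total

if __name__ == "__main__":
    x = 2.0
    prefactor = math.sqrt(2.0 / (math.pi * x))
    print("J_{1/2}(2):  series", bessel_j_series(0.5, x), " closed form", prefactor * math.sin(x))
    print("J_{-1/2}(2): series", bessel_j_series(-0.5, x), " closed form", prefactor * math.cos(x))
```

Changing the sign of `nu` is all that is needed to pass from (9) to (10), which mirrors the remark that one need only change $\nu$ to $-\nu$ in the series.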
Writing out (9) and (10),

$$J_{\nu}(x) = x^{\nu}\left[\frac{1}{\Gamma(\nu+1)\,2^{\nu}} - \frac{x^2}{\Gamma(\nu+2)\,2^{\nu+2}} + \cdots\right], \tag{12}$$

$$J_{-\nu}(x) = x^{-\nu}\left[\frac{1}{\Gamma(1-\nu)\,2^{-\nu}} - \frac{x^2}{\Gamma(2-\nu)\,2^{2-\nu}} + \cdots\right]. \tag{13}$$

Since the power series within the square brackets tend to $1/[\Gamma(\nu+1)2^{\nu}]$ and $1/[\Gamma(1-\nu)2^{-\nu}]$, respectively, as $x \to 0$, we see that $J_{\nu}(x) \sim [1/\Gamma(\nu+1)2^{\nu}]\,x^{\nu}$ and $J_{-\nu}(x) \sim [1/\Gamma(1-\nu)2^{-\nu}]\,x^{-\nu}$ as $x \to 0$. It is simpler and more concise to use the big oh notation introduced in Section 4.5.3, and say that $J_{\nu}(x) = O(x^{\nu})$ and $J_{-\nu}(x) = O(x^{-\nu})$ as $x \to 0$. Thus, the $J_{\nu}(x)$'s tend to zero and the $J_{-\nu}(x)$'s tend to infinity as $x \to 0$. As representative, we have plotted $J_{1/2}(x)$ and $J_{-1/2}(x)$ in Fig. 1.

In fact, for the half-integer values $\nu = \pm 1/2, \pm 3/2, \pm 5/2, \ldots$ the series in (9) and (10) can be shown to represent elementary functions. For instance (Exercise 5),

$$J_{1/2}(x) = \sqrt{\frac{2}{\pi x}}\,\sin x, \qquad J_{-1/2}(x) = \sqrt{\frac{2}{\pi x}}\,\cos x. \tag{14a,b}$$

4.6.2. $\nu =$ integer. If $\nu$ is a positive integer $n$, then $\Gamma(\nu+k+1) = \Gamma(n+k+1) = (n+k)!$ in (9), so we have from (9) the solution

$$J_n(x) = \sum_{k=0}^{\infty}\frac{(-1)^k}{k!\,(k+n)!}\left(\frac{x}{2}\right)^{2k+n} \tag{15}$$

of (1). For instance,

$$J_0(x) = 1 - \frac{x^2}{2^2} + \frac{x^4}{2^4(2!)^2} - \frac{x^6}{2^6(3!)^2} + \cdots, \tag{16a}$$

$$J_1(x) = \frac{x}{2} - \frac{x^3}{2^3\,2!} + \frac{x^5}{2^5\,2!\,3!} - \frac{x^7}{2^7\,3!\,4!} + \cdots. \tag{16b}$$

We need to be careful with (10) because if $\nu = n$, then the $\Gamma(k-n+1)$ in (10) is, we recall from Section 4.5.2, undefined when its argument is zero or a negative integer, namely, for $k = 0, 1, \ldots, n-1$. One could say that $\Gamma(k-n+1)$ is infinite for those $k$'s, so $1/\Gamma(k-n+1)$ equals zero for $k = 0, 1, \ldots, n-1$, and it equals $1/(k-n)!$ for $k = n, n+1, \ldots$, in which case (10) becomes

$$J_{-n}(x) = \sum_{k=n}^{\infty}\frac{(-1)^k}{k!\,(k-n)!}\left(\frac{x}{2}\right)^{2k-n}. \tag{17}$$

[The resulting equation (17) is correct, but our reasoning was not rigorous since $\Gamma(k-n+1)$ is undefined at $k = 0, 1, \ldots, n-1$, rather than "$\infty$." A rigorous line of approach is suggested in Exercise 10.] Replacing the dummy summation index $k$ by $m$ according to $k - n = m$,

$$J_{-n}(x) = \sum_{m=0}^{\infty}\frac{(-1)^{m+n}}{(m+n)!\,m!}\left(\frac{x}{2}\right)^{2m+n}.$$

If $(-1)^n$ is factored out, the series that remains is the same as that given in (15), so that

$$J_{-n}(x) = (-1)^n J_n(x). \tag{18}$$

The result is that $J_n(x)$ and $J_{-n}(x)$ are linearly dependent, since (18) tells us that one is a scalar multiple of the other. Thus, whereas $J_{\nu}(x)$ and $J_{-\nu}(x)$ are LI and give the general solution (11) if $\nu$ is not an integer, we have only one linearly independent solution thus far for the case where $\nu = n$, namely, $y_1(x) = J_n(x)$ given by (15). To obtain a second LI solution $y_2(x)$ we rely on Theorem 4.3.1.

Let us begin with $n = 0$. Then we have the case of repeated indicial roots, $r = \pm n = \pm 0$, which corresponds to case (ii) of that theorem. Accordingly, we seek $y_2(x)$ in the form

$$y_2(x) = J_0(x)\ln x + \sum_{k=1}^{\infty} c_k x^k. \tag{19}$$

Doing so, we can evaluate the $c_k$'s, and we obtain

$$y_2(x) = J_0(x)\ln x + \left(\frac{x}{2}\right)^2 - \left(1 + \frac{1}{2}\right)\frac{1}{(2!)^2}\left(\frac{x}{2}\right)^4 + \cdots, \tag{20}$$

which is called $\mathbf{Y}_0(x)$, the Neumann function of order zero. Thus, Theorem 4.3.1 leads us to the two LI solutions $y_1(x) = J_0(x)$ and $y_2(x) = \mathbf{Y}_0(x)$, so we can use them to form a general solution of (1).

However, following Weber, it proves to be convenient and standard to use, in place of $\mathbf{Y}_0(x)$, a linear combination of $J_0(x)$ and $\mathbf{Y}_0(x)$, namely,

$$y_2(x) = \frac{2}{\pi}\left[\mathbf{Y}_0(x) + (\gamma - \ln 2)\,J_0(x)\right] \equiv Y_0(x), \tag{21}$$

where

$$Y_0(x) = \frac{2}{\pi}\left[\left(\ln\frac{x}{2} + \gamma\right)J_0(x) + \frac{x^2}{2^2} - \left(1 + \frac{1}{2}\right)\frac{x^4}{2^4(2!)^2} + \left(1 + \frac{1}{2} + \frac{1}{3}\right)\frac{x^6}{2^6(3!)^2} - \cdots\right] \tag{22}$$

is Weber's Bessel function of the second kind, of order zero; $\gamma = 0.5772157$ is known as Euler's constant and is sometimes written as $C$, and $Y_0(x)$ is sometimes written as $N_0(x)$.

The graphs of $J_0(x)$ and $Y_0(x)$ are shown in Fig. 2.

Figure 2. $J_0$ and $Y_0$.

Important features are that $J_0(x)$ and $Y_0(x)$ look a bit like damped cosine and sine functions, except that $Y_0(x)$ tends to $-\infty$ as $x \to 0$. Specifically, we see from (16a) and (22) that

$$J_0(x) \sim 1, \qquad Y_0(x) \sim \frac{2}{\pi}\ln x \tag{23a,b}$$

as $x \to 0$, and it can be shown (Exercise 6) that

$$J_0(x) \sim \sqrt{\frac{2}{\pi x}}\,\cos\left(x - \frac{\pi}{4}\right), \qquad Y_0(x) \sim \sqrt{\frac{2}{\pi x}}\,\sin\left(x - \frac{\pi}{4}\right) \tag{24a,b}$$

as $x \to \infty$.
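The large-$x$ form $J_0(x) \sim \sqrt{2/(\pi x)}\,\cos(x - \pi/4)$ can be compared against the power series $J_0(x) = \sum_k (-1)^k (x/2)^{2k}/(k!)^2$ directly. The Python sketch below is illustrative only and is not part of the text; the names `j0_series` and `j0_asymptotic`, and the choice of 80 terms, are assumptions. The series is summed with a running term to avoid forming large factorials, and in double precision it is trustworthy only for moderate $x$ (roughly $x \le 20$), because of cancellation between large alternating terms.

```python
import math

def j0_series(x, terms=80):
    """Partial sum of J_0(x) = sum_k (-1)^k (x/2)^(2k) / (k!)^2."""
    q = (x / 2.0) ** 2
    term = 1.0      # k = 0 term
    total = 1.0
    for k in range(1, terms):
        term *= -q / (k * k)   # ratio of consecutive terms is -(x/2)^2 / k^2
        total += term
    return total

def j0_asymptotic(x):
    """Leading large-x approximation sqrt(2/(pi x)) * cos(x - pi/4)."""
    return math.sqrt(2.0 / (math.pi * x)) * math.cos(x - math.pi / 4.0)

if __name__ == "__main__":
    for x in (5.0, 10.0, 20.0):
        print(f"x = {x}: series = {j0_series(x):.6f}, asymptotic = {j0_asymptotic(x):.6f}")
```

The two columns should agree more closely as $x$ grows, which is the content of the asymptotic statement; at small $x$ the approximation is not expected to be accurate.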
Indeed, we can see from (24) why the Weber Bessel function $Y_0$ is a nicer companion for $J_0$ than the Neumann Bessel function $\mathbf{Y}_0$, for

$$\mathbf{Y}_0(x) \sim \frac{1}{\sqrt{\pi x}}\left[\left(\frac{\pi}{2} - \gamma + \ln 2\right)\sin x - \left(\frac{\pi}{2} + \gamma - \ln 2\right)\cos x\right] \tag{24c}$$

as $x \to \infty$; surely (24b) makes a nicer companion for (24a) than does (24c).

It might appear, from Fig. 2 and (24), that the zeros of $J_0$ and $Y_0$ [i.e., the roots of $J_0(x) = 0$ and $Y_0(x) = 0$] are equally spaced, but they are not; they approach an equal spacing only as $x \to \infty$. For instance, the first several zeros of $J_0$ are 2.405, 5.520, 8.654, 11.792, 14.931. Their differences are 3.115, 3.134, 3.138, 3.139, and these are seen to rapidly approach a constant [namely, $\pi$, the spacing between the zeros of $\cos(x - \pi/4)$ in (24a)]. The zeros of the various Bessel functions turn out to be important, and they are tabulated to many significant figures.

For $n = 1, 2, \ldots$ the indicial roots $r = \pm n$ differ by an integer, which corresponds to case (iii) of Theorem 4.3.1. Using that theorem, and the ideas given above for $Y_0$, we obtain Weber's Bessel function of the second kind, of order $n$,

$$Y_n(x) = \frac{2}{\pi}\left[\left(\ln\frac{x}{2} + \gamma\right)J_n(x) - \frac{1}{2}\sum_{k=0}^{n-1}\frac{(n-k-1)!}{k!}\left(\frac{x}{2}\right)^{2k-n} + \frac{1}{2}\sum_{k=0}^{\infty}(-1)^{k+1}\,\frac{\phi(k) + \phi(k+n)}{k!\,(k+n)!}\left(\frac{x}{2}\right)^{2k+n}\right], \tag{25}$$

which formula holds for $n = 0, 1, 2, \ldots$; $\phi(0) = 0$ and $\phi(k) = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{k}$ for $k \ge 1$. (For $n = 0$ the finite sum is empty.)

4.6.3. General solution of Bessel equation. Thus, we have two different general solution forms for (1), depending on whether $\nu$ is an integer or not:

$$y(x) = \begin{cases} A\,J_{\nu}(x) + B\,J_{-\nu}(x), & \nu \text{ not an integer}, \\ A\,J_n(x) + B\,Y_n(x), & \nu = n = 0, 1, 2, \ldots. \end{cases}$$

It turns out that if we define

$$Y_{\nu}(x) = \frac{(\cos\nu\pi)\,J_{\nu}(x) - J_{-\nu}(x)}{\sin\nu\pi} \tag{26}$$

for noninteger $\nu$, then the limit of $Y_{\nu}(x)$ as $\nu \to n$ $(n = 0, 1, 2, \ldots)$ gives the same result as (25). Furthermore, $J_{\nu}(x)$ and $Y_{\nu}(x)$ are LI (Exercise 11), so the upshot is that we can express the general solution of (1) as

$$y(x) = A\,J_{\nu}(x) + B\,Y_{\nu}(x) \tag{27}$$

for all values of $\nu$, with $Y_{\nu}$ defined by (25) and (26) for integer and noninteger values of $\nu$, respectively.

The graphs of several $J_n$'s and $Y_n$'s are shown in Fig. 3. For reference, we cite the following asymptotic behavior:

$$J_n(x) \sim \frac{1}{n!}\left(\frac{x}{2}\right)^n, \qquad (n = 0, 1, 2, \ldots) \tag{28a}$$

$$Y_n(x) \sim \begin{cases} \dfrac{2}{\pi}\ln x, & (n = 0) \\[2mm] -\dfrac{(n-1)!}{\pi}\left(\dfrac{2}{x}\right)^n, & (n = 1, 2, \ldots) \end{cases} \tag{28b}$$

as $x \to 0$, and

$$J_n(x) \sim \sqrt{\frac{2}{\pi x}}\,\cos(x - \omega_n), \qquad (n = 0, 1, 2, \ldots) \tag{29a}$$

$$Y_n(x) \sim \sqrt{\frac{2}{\pi x}}\,\sin(x - \omega_n), \qquad (n = 0, 1, 2, \ldots) \tag{29b}$$

as $x \to \infty$, where $\omega_n = (2n+1)\pi/4$.

Observe the sort of conservation expressed in (28a,b): as $n$ increases, the $Y_n$'s develop stronger singularities ($\ln x, x^{-1}, x^{-2}, \ldots$) while the $J_n$'s develop stronger zeros ($1, x, x^2, \ldots$). (We say that $x^3$ has a stronger zero at the origin than $x^2$, for example, because $x^3/x^2 \to 0$ as $x \to 0$.) Finally, we call attention to the interlacing of the zeros of $J_n$ and $Y_n$. All of these features can be seen in Fig. 3.

In summary, our key result is the general solution (27) with $J_{\nu}$ given by (9), $Y_{\nu}$ by (25) for integer $\nu$ and by (26) for noninteger $\nu$, and with $J_{-\nu}$ in (26) given by (10).

4.6.4. Hankel functions. (Optional) Recall that the harmonic oscillator equation $y'' + y = 0$ has two preferred bases: $\cos x, \sin x$ and $e^{ix}, e^{-ix}$. Usually, the former is used because those functions are real valued, but sometimes the complex exponentials are more convenient. The connection between them is given by Euler's formulas:

$$e^{ix} = \cos x + i\sin x, \qquad e^{-ix} = \cos x - i\sin x.$$

Analogous to the complex basis $e^{ix}, e^{-ix}$ for the equation $y'' + y = 0$, a complex-valued basis is defined for the Bessel equation (1):

$$H_{\nu}^{(1)}(x) = J_{\nu}(x) + i\,Y_{\nu}(x), \tag{30a}$$
$$H_{\nu}^{(2)}(x) = J_{\nu}(x) - i\,Y_{\nu}(x). \tag{30b}$$

These are called the Hankel functions of the first and second kind, respectively, of order $\nu$. Thus, alternatively to (27), we have the general solution

$$y(x) = A\,H_{\nu}^{(1)}(x) + B\,H_{\nu}^{(2)}(x) \tag{31}$$

of (1). As a result of (29a,b), the Hankel functions $H_n^{(1)}(x)$, $H_n^{(2)}(x)$ have the pure complex exponential behavior

$$H_n^{(1)}(x) \sim \sqrt{\frac{2}{\pi x}}\,e^{i(x-\omega_n)}, \tag{32a}$$
$$H_n^{(2)}(x) \sim \sqrt{\frac{2}{\pi x}}\,e^{-i(x-\omega_n)} \tag{32b}$$

as $x \to \infty$. The Hankel functions are particularly useful in the study of wave propagation.

4.6.5. Modified Bessel equation. Besides the Bessel equation of order $\nu$, one also encounters the modified Bessel equation of order $\nu$,

$$x^2y'' + xy' + (-x^2 - \nu^2)\,y = 0,$$

where the only difference is the minus sign in front of the second $x^2$ term.
Let us limit our attention, for brevity, to the case where $\nu$ is an integer $n$, so we have

$$x^2y'' + xy' + (-x^2 - n^2)\,y = 0. \tag{33}$$

The change of variables $t = ix$ (or $x = -it$) converts (33) to the Bessel equation

$$t^2Y'' + tY' + (t^2 - n^2)\,Y = 0, \tag{34}$$

where $y(x) = y(-it) \equiv Y(t)$ and the primes on $Y$ denote $d/dt$. Since a general solution to (34) is $Y(t) = A\,J_n(t) + B\,Y_n(t)$ we have, immediately, the general solution

$$y(x) = A\,J_n(ix) + B\,Y_n(ix) \tag{35}$$

of (33). From (15),

$$J_n(ix) = \sum_{k=0}^{\infty}\frac{(-1)^k}{k!\,(k+n)!}\left(\frac{ix}{2}\right)^{2k+n} = i^n\sum_{k=0}^{\infty}\frac{1}{k!\,(k+n)!}\left(\frac{x}{2}\right)^{2k+n}, \tag{36}$$

so we can absorb the $i^n$ into $A$ and be left with the real-valued solution

$$I_n(x) = i^{-n}J_n(ix) = \sum_{k=0}^{\infty}\frac{1}{k!\,(k+n)!}\left(\frac{x}{2}\right)^{2k+n}, \tag{37}$$

known as the modified Bessel function of the first kind, and order $n$. In place of $Y_n(ix)$ it is standard to introduce, as a second real-valued solution, the modified Bessel function of the second kind, and order $n$,

$$K_n(x) = \frac{\pi}{2}\,i^{n+1}\left[J_n(ix) + i\,Y_n(ix)\right]. \tag{38}$$

For instance,

$$I_0(x) = 1 + \frac{x^2}{2^2} + \frac{x^4}{2^4(2!)^2} + \frac{x^6}{2^6(3!)^2} + \cdots, \tag{39a}$$

$$K_0(x) = -\left(\ln\frac{x}{2} + \gamma\right)I_0(x) + \frac{x^2}{2^2} + \left(1 + \frac{1}{2}\right)\frac{x^4}{2^4(2!)^2} + \left(1 + \frac{1}{2} + \frac{1}{3}\right)\frac{x^6}{2^6(3!)^2} + \cdots, \tag{39b}$$

and the graphs of these functions are plotted in Fig. 4. As a general solution of (33) we have

$$y(x) = A\,I_n(x) + B\,K_n(x). \tag{40}$$

Whereas the Bessel functions are oscillatory, the modified Bessel functions are not. To put the various Bessel functions in perspective, observe that the relationship between the $J, Y$ solutions of the Bessel equation and the $I, K$ solutions of the modified Bessel equation is analogous to that between the solutions $\cos x, \sin x$ of the harmonic oscillator equation $y'' + y = 0$ and the $\cosh x, \sinh x$ solutions of the "modified harmonic oscillator" equation $y'' - y = 0$. For instance, just as $\cos(ix) = \cosh(x)$ and $\sin(ix) = i\sinh(x)$, $I_n(x)$ and $K_n(x)$ are linear combinations of $J_n(ix)$ and $Y_n(ix)$.

Finally, the asymptotic behavior of $I_n$ and $K_n$ is as follows:

$$I_n(x) \sim \frac{1}{n!}\left(\frac{x}{2}\right)^n, \qquad (n = 0, 1, 2, \ldots) \tag{41a}$$

$$K_n(x) \sim \begin{cases} -\ln x, & (n = 0) \\[2mm] \dfrac{(n-1)!}{2}\left(\dfrac{2}{x}\right)^n, & (n = 1, 2, \ldots) \end{cases} \tag{41b}$$

as $x \to 0$, and

$$I_n(x) \sim \frac{1}{\sqrt{2\pi x}}\,e^{x}, \qquad (n = 0, 1, 2, \ldots) \tag{42a}$$

$$K_n(x) \sim \sqrt{\frac{\pi}{2x}}\,e^{-x} \qquad (n = 0, 1, 2, \ldots) \tag{42b}$$

as $x \to \infty$. As $n$ increases, the $I_n$'s develop stronger zeros at $x = 0$ ($1, x, x^2, \ldots$) while the $K_n$'s develop stronger singularities there ($\ln x, x^{-1}, x^{-2}, \ldots$).

4.6.6. Equations reducible to Bessel equations. The results discussed in Sections 4.6.1-4.6.5 are all the more important because there are many equations which, although they are not Bessel equations or modified Bessel equations, can be reduced to Bessel or modified Bessel equations by changes of variables and then solved in closed form in terms of Bessel or modified Bessel functions.

EXAMPLE 1. Solve

$$xy'' + y' + \kappa^2 xy = 0. \tag{43}$$

Equation (43) is not quite a Bessel equation of order zero because of the $\kappa^2$. Let us try to absorb the $\kappa^2$ by a change of variable. Specifically, scale $x$ as $t = \alpha x$, where $\alpha$ is to be determined. Then

$$\frac{d}{dx} = \frac{d}{dt}\frac{dt}{dx} = \alpha\frac{d}{dt}, \quad \text{so} \quad \frac{d^2}{dx^2} = \alpha^2\frac{d^2}{dt^2},$$

and (43) becomes, after division by $\alpha$,

$$tY'' + Y' + \frac{\kappa^2}{\alpha^2}\,tY = 0, \tag{44}$$

where $y(x) = y(t/\alpha) \equiv Y(t)$. Thus, we can absorb the $\kappa^2$ in (44) by choosing $\alpha = \kappa$. Then (44) is a Bessel equation of order zero with general solution $Y(t) = A\,J_0(t) + B\,Y_0(t)$, so

$$y(x) = A\,J_0(t) + B\,Y_0(t) = A\,J_0(\kappa x) + B\,Y_0(\kappa x) \tag{45}$$

is a general solution of (43). ■

More generally, the equation

$$\frac{d}{dx}\left(x^a\frac{dy}{dx}\right) + b\,x^c\,y = 0, \tag{46}$$

where $a, b, c$ are real numbers, can be transformed to a Bessel equation by transforming both independent and dependent variables. Because of the powers of $x$ in (46), it seems promising to change variables from $x, y(x)$ to $t, u(t)$ according to the forms $t = Ax^B$, $u = x^C y$, and to try to find $A, B, C$ so that the new equation, on $u(t)$, is a Bessel equation. It turns out that that plan works, and one finds that under the change of variables

$$t = \alpha\sqrt{b}\,x^{1/\alpha} \quad \text{and} \quad u = x^{-\nu/\alpha}\,y \tag{47}$$

equation (46) becomes the Bessel equation of order $\nu$,

$$t^2\frac{d^2u}{dt^2} + t\frac{du}{dt} + (t^2 - \nu^2)\,u = 0, \tag{48}$$

if we choose

$$\alpha = \frac{2}{c - a + 2} \quad \text{and} \quad \nu = \frac{1 - a}{c - a + 2}. \tag{49}$$

[The latter is meaningless if $c - a + 2 = 0$, but in that case (46) is merely a Cauchy-Euler equation.]
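The bookkeeping in (49) is mechanical and easy to get wrong by hand, so a small helper can be useful. The following Python sketch assumes (49) reads $\alpha = 2/(c-a+2)$ and $\nu = (1-a)/(c-a+2)$ for an equation of the form $(x^a y')' + b x^c y = 0$; the function name `bessel_reduction_params` is hypothetical, and exact rational arithmetic is used so that values such as $4/5$ are not blurred by rounding. It only computes the two parameters; it does not solve the differential equation.

```python
from fractions import Fraction

def bessel_reduction_params(a, c):
    """Return (alpha, nu) for (x^a y')' + b x^c y = 0, assuming
    alpha = 2/(c - a + 2) and nu = (1 - a)/(c - a + 2)."""
    a = Fraction(a)
    c = Fraction(c)
    denom = c - a + 2
    if denom == 0:
        # In this case the equation is of Cauchy-Euler type and the
        # reduction to a Bessel equation does not apply.
        raise ValueError("c - a + 2 = 0: Cauchy-Euler equation, no Bessel reduction")
    return Fraction(2) / denom, (1 - a) / denom

if __name__ == "__main__":
    # Sample values a = 0, c = 1/2, i.e. an equation y'' + b sqrt(x) y = 0.
    alpha, nu = bessel_reduction_params(0, Fraction(1, 2))
    print("alpha =", alpha, " nu =", nu)
    print("exponent of x in the prefactor, nu/alpha =", nu / alpha)
    print("exponent of x in the argument, 1/alpha =", 1 / alpha)
```

The sign of $b$ does not enter these two parameters; it only decides whether the ordinary or the modified Bessel functions appear in the solution.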
Thus, if $Z_{\nu}$ denotes any Bessel function solution of (48), then putting (47) into $u(t) = Z_{\nu}(t)$ and solving for $y$ gives the solution

$$y(x) = x^{\nu/\alpha}\,Z_{\nu}\!\left(\alpha\sqrt{|b|}\,x^{1/\alpha}\right) \tag{50}$$

of (46). If $b > 0$, then $Z_{\nu}$ denotes $J_{\nu}$ and $Y_{\nu}$, and if $b < 0$, then $Z_{\nu}$ denotes $I_{\nu}$ and $K_{\nu}$ (though we gave formulas for $I_{\nu}$ and $K_{\nu}$ only for the case where $\nu$ is an integer).

EXAMPLE 2. Solve the equation

$$y'' + 3\sqrt{x}\,y = 0. \qquad (0 < x < \infty) \tag{51}$$

Comparing (51) with (46), we see that $a = 0$, $b = 3$, and $c = 1/2$, so $\alpha = 2/(1/2 - 0 + 2) = 4/5$ and $\nu = 1/(1/2 - 0 + 2) = 2/5$. Thus, (50) gives

$$y(x) = x^{1/2}\,Z_{2/5}\!\left(\tfrac{4}{5}\sqrt{3}\,x^{5/4}\right) \tag{52}$$

and

$$y(x) = \sqrt{x}\left[A\,J_{2/5}\!\left(\tfrac{4}{5}\sqrt{3}\,x^{5/4}\right) + B\,Y_{2/5}\!\left(\tfrac{4}{5}\sqrt{3}\,x^{5/4}\right)\right] \tag{53}$$

is a general solution of (51). ■

EXAMPLE 3. Solve

$$xy'' + 3y' + y = 0, \qquad (0 < x < \infty) \tag{54}$$

or

$$y'' + \frac{3}{x}\,y' + \frac{1}{x}\,y = 0. \tag{55}$$

Writing out (46) as $x^a y'' + a x^{a-1} y' + b x^c y = 0$ or

$$y'' + \frac{a}{x}\,y' + b\,x^{c-a}\,y = 0, \tag{56}$$

and comparing (56) and (55) term by term gives $a = 3$, $b = 1$, and $c - a = -1$, so $c = 2$. Hence, $\alpha = 2/(2 - 3 + 2) = 2$ and $\nu = (1 - 3)/(2 - 3 + 2) = -2$, so (50) becomes

$$y(x) = x^{-1}Z_{-2}\!\left(2\sqrt{1}\,x^{1/2}\right) = x^{-1}Z_2\!\left(2\sqrt{x}\right) \tag{57}$$

and

$$y(x) = \frac{1}{x}\left[A\,J_2\!\left(2\sqrt{x}\right) + B\,Y_2\!\left(2\sqrt{x}\right)\right] \tag{58}$$

is a general solution of (54). ■

NOTE: In the second equality of (57) we changed the $Z_{-2}$ to $Z_2$. More generally, if the $\nu$ that we compute in (49) turns out to be negative we can always change the $Z_{\nu}$ in (50) to $Z_{|\nu|}$, for if $\nu$ is a negative integer $-n$, then the $Z_{\nu}$ in (50) gives $J_{-n}$ and $Y_{-n}$; but (18) told us that $J_{-n}$ is identical to $J_n$ to within a constant scale factor, and it can likewise be shown that $Y_{-n}$ is identical to $Y_n$ to within a constant scale factor [namely, $Y_{-n}(x) = (-1)^n Y_n(x)$]. And if $\nu$ is negative but not an integer, then the $Z_{\nu}$ in (50) gives $J_{\nu}$ and $J_{-\nu}$, and that is equivalent to $Z_{-\nu}$ giving $J_{-\nu}$ and $J_{\nu}$.

EXAMPLE 4. Solve

$$(xy')' - 4x^3y = 0. \tag{59}$$

We see from (46) that $a = 1$, $b = -4$, $c = 3$, so $\alpha = 1/2$, $\nu = 0$ and

$$y(x) = x^0 Z_0\!\left(\tfrac{1}{2}\sqrt{|-4|}\,x^2\right),$$

so

$$y(x) = A\,I_0(x^2) + B\,K_0(x^2) \tag{60}$$

is a general solution of (59). ■

Closure. In this section we studied the Bessel equation

$$x^2y'' + xy' + (x^2 - \nu^2)\,y = 0 \tag{61}$$

and the modified Bessel equation

$$x^2y'' + xy' + (-x^2 - \nu^2)\,y = 0. \tag{62}$$

For heuristic purposes, it is useful to keep in mind the similarity between the Bessel equation and the harmonic oscillator equation

$$y'' + y = 0, \tag{63}$$

and between the modified Bessel equation and the "modified harmonic oscillator" equation

$$y'' - y = 0. \tag{64}$$

For large $x$, the left side of (61), after division by $x^2$, becomes

$$y'' + \frac{1}{x}\,y' + \left(1 - \frac{\nu^2}{x^2}\right)y \approx y'' + \frac{1}{x}\,y' + y, \tag{65}$$

so we expect qualitative similarity between the solutions of (61) and those of (63). In fact, the solutions $J_{\nu}(x)$ and $Y_{\nu}(x)$ of (61) do tend to harmonic functions as $x \to \infty$, like the cosine and sine solutions of (63), and the $y'/x$ term in (65) causes some damping of those harmonic functions, by a factor of $1/\sqrt{x}$. Thus, the general solution

$$y(x) = A\,J_{\nu}(x) + B\,Y_{\nu}(x) \tag{66}$$

of (61) is similar, qualitatively, to that of (63). Further, just as one can use pure complex exponential solutions of (63) according to the Euler definitions, one can introduce the Hankel functions in essentially the same way, and write the general solution of (61), alternatively, as

$$y(x) = A\,H_{\nu}^{(1)}(x) + B\,H_{\nu}^{(2)}(x). \tag{67}$$

Likewise, for the modified Bessel equation (62), the left side of which, after division by $x^2$, becomes

$$y'' + \frac{1}{x}\,y' - \left(1 + \frac{\nu^2}{x^2}\right)y \approx y'' + \frac{1}{x}\,y' - y \tag{68}$$

for large $x$, we find nonoscillatory solutions analogous to the hyperbolic cosine and sine solutions of (64).

So much for large $x$. As $x \to 0$, the $Y_{\nu}$ solutions of (61) are unbounded, as are the $K_{\nu}$ solutions of (62).

Computer software. As a general rule of thumb, if we can derive a solution to a given differential equation by hand, we can probably obtain it using computer software. For instance, if, using Maple, we attempt to solve the nonconstant-coefficient differential equation (54) by the command

dsolve(x*diff(y(x), x, x) + 3*diff(y(x), x) + y(x) = 0, y(x));

we do obtain the same general solution as was obtained here in Example 3.

EXERCISES 4.6 1. Putting the solution form (2) into the Bessel equation (1), 2u derive the recursion relation (3). + Ky-1(2). Kuyzi(2) = = Kel) Frobenius.
Show that your two LI solutions can be expressed in closed form as given in (14a,b).

3. Show that with $Y_0(x)$ defined by (21), the asymptotic behavior given in (24b) follows from (24a) and (24c).

4. (Recursion formulas) It can be shown that

$$\frac{d}{dx}\left[x^{\nu} Z_{\nu}(x)\right] = \begin{cases} x^{\nu} Z_{\nu-1}(x), & (Z = J, Y, I, H^{(1)}, H^{(2)}) \\ -x^{\nu} Z_{\nu-1}(x), & (Z = K) \end{cases} \tag{4.1}$$

and

$$\frac{d}{dx}\left[x^{-\nu} Z_{\nu}(x)\right] = \begin{cases} -x^{-\nu} Z_{\nu+1}(x), & (Z = J, Y, K, H^{(1)}, H^{(2)}) \\ x^{-\nu} Z_{\nu+1}(x), & (Z = I) \end{cases} \tag{4.2}$$

where the "$Z = J, Y, I, H^{(1)}, H^{(2)}$" means that the formula holds with $Z$ equal to each of the itemized functions. In particular, the formula

$$Z_0'(x) = \begin{cases} -Z_1(x), & (Z = J, Y, K, H^{(1)}, H^{(2)}) \\ Z_1(x), & (Z = I) \end{cases} \tag{4.3}$$

corresponding to (4.2) for the case $\nu = 0$, is useful in evaluating certain Bessel function integrals by integration by parts.
(a) Verify (4.1) and (4.2) for the case where $Z$ is $J$ (i.e., $J_\nu$, $J_{-\nu}$, and $J_n$) using the series given in the text for those functions.
(b) From the formulas given above, show that

$$Z_{\nu+1}(x) = \frac{2\nu}{x} Z_{\nu}(x) - Z_{\nu-1}(x), \quad (Z = J, Y, H^{(1)}, H^{(2)}) \tag{4.4}$$

$$I_{\nu+1}(x) = -\frac{2\nu}{x} I_{\nu}(x) + I_{\nu-1}(x), \tag{4.5}$$

and

$$K_{\nu+1}(x) = \frac{2\nu}{x} K_{\nu}(x) + K_{\nu-1}(x). \tag{4.6}$$

(c) Use computer software to differentiate $x^3 J_3(x)$, $x^2 Y_2(x)$, $x^5 I_5(x)$, $x^{-2} K_2(x)$, $J_0(x)$, $Y_0(x)$, $I_0(x)$, and $K_0(x)$, and show that the results are in accord with the formulas (4.1)-(4.3).

5. (Half-integer formulas)
(a) Putting $\nu = 1/2$ in (9) and (10), show that they give

$$J_{1/2}(x) = \sqrt{\frac{2}{\pi x}}\,\sin x \tag{5.1}$$

and

$$J_{-1/2}(x) = \sqrt{\frac{2}{\pi x}}\,\cos x. \tag{5.2}$$

(b) Derive, from (4.4), the recursion formula

$$J_{n+1/2}(x) = \frac{2n-1}{x}\, J_{n-1/2}(x) - J_{n-3/2}(x). \tag{5.3}$$

(c) It follows from (5.1)-(5.3) that all $J_{n+1/2}$'s are expressible in closed form in terms of $\sin x$, $\cos x$, and powers of $x$. Derive those expressions for $J_{3/2}(x)$ and $J_{-3/2}(x)$.

6. (a) (Normal form) By making the change of variables $y(x) = \sigma(x)v(x)$, from $y$ to $v$, in $y'' + p(x)y' + q(x)y = 0$, show that the first derivative term can be eliminated by choosing $\sigma$ such that $2\sigma' + p\sigma = 0$. Show that the result is the normal form (i.e., canonical or simplest form)

$$v'' + Q(x)\,v = 0, \tag{6.1}$$

where

$$Q(x) = \frac{\sigma'' + p\sigma' + q\sigma}{\sigma}. \tag{6.2}$$

(b) (Large-x behavior of Bessel functions) For the Bessel equation (1), show that $\sigma(x) = 1/\sqrt{x}$ and that (6.1) is

$$v'' + \left(1 - \frac{\nu^2 - 1/4}{x^2}\right) v = 0. \tag{6.3}$$

NOTE: If we write $1 - (\nu^2 - 1/4)/x^2 \approx 1$ for large $x$, then (6.3) becomes $v'' + v \approx 0$, so we expect that $v(x) \approx A\cos(x + \phi)$ or, equivalently, $A\sin(x + \phi)$, where $A$ and $\phi$ are arbitrary constants. Thus,

$$y(x) = \sigma(x)v(x) \approx \frac{A}{\sqrt{x}}\cos(x + \phi) \quad\text{or}\quad \frac{A}{\sqrt{x}}\sin(x + \phi), \tag{6.4}$$

which forms are the same as those given by (24a) and (24b). Thus, we expect every solution of (1) to behave according to (6.4) as $x \to \infty$. Evaluating the constants $A$ and $\phi$ corresponding to a particular solution, such as $J_0(x)$ or $Y_0(x)$, is complicated and will not be discussed here.

7. Recall from Example 1 that $J_n(\kappa x)$ satisfies the differential equation $x^2 y'' + xy' + (\kappa^2 x^2 - n^2)y = 0$ or, equivalently,

$$(xy')' + \left(\kappa^2 x - \frac{n^2}{x}\right) y = 0. \quad (n = 0, 1, 2, \ldots) \tag{7.1}$$

Let the $x$ interval be $0 < x < c$, and suppose that $\kappa$ is chosen so that $J_n(\kappa c) = 0$; i.e., $\kappa c$ is any of the zeros of $J_n(x) = 0$. The purpose of this exercise is to derive the formula

$$\int_0^c \left[J_n(\kappa x)\right]^2 x\,dx = \frac{c^2}{2}\left[J_{n+1}(\kappa c)\right]^2, \tag{7.2}$$

which will be needed when we show how to use the Sturm-Liouville theory to expand functions in terms of Bessel functions. In turn, that concept will be needed later in our study of partial differential equations. To derive (7.2), we suggest the following steps.
(a) Multiplying (7.1) by $2xy'$ and integrating on $x$ from 0 to $c$, obtain

$$\left.(xy')^2\right|_0^c + 2\int_{x=0}^{x=c} \left(\kappa^2 x^2 - n^2\right) y\,dy = 0. \tag{7.3}$$

(b) Show that with $y = J_n(\kappa x)$, the $(xy')^2$ term is zero at $x = 0$ for $n = 0, 1, 2, \ldots$, and that at $x = c$ it is $c^2\kappa^2\left[J_{n+1}(\kappa c)\right]^2$. HINT: It follows from (4.2) that $J_n'(x) = -J_{n+1}(x) + \frac{n}{x}J_n(x)$.
(c) Thus, show that (7.3) reduces to

$$c^2\kappa^2\left[J_{n+1}(\kappa c)\right]^2 + 2\kappa^2\int_{x=0}^{x=c} x^2 y\,dy - \left. n^2 y^2\right|_{x=0}^{x=c} = 0. \tag{7.4}$$

(d) Show that the $n^2$ term is zero for any $n = 0, 1, 2, \ldots$, integrate the remaining integral by parts and show that the resulting boundary term is zero, and thus obtain the desired result (7.2).

8. (Generating function for $J_n$) Just as there is a "generating function" for the Legendre polynomials [see (9) in Section 4.4], there is one for the Bessel functions. Specifically, it can be shown that

$$e^{\frac{x}{2}\left(t - \frac{1}{t}\right)} = \sum_{n=-\infty}^{\infty} J_n(x)\,t^n, \tag{8.1}$$

and the left-hand side is called the generating function for the $J_n$'s.
(a) We do not ask you to derive (8.1) but only to verify the $n = 0$ term. That is, expanding $e^{xt/2}$ and $e^{-x/2t}$ in Maclaurin series (one in ascending powers of $t$ and one in powers of $1/t$) and multiplying these series together, show that the coefficient of $t^0$ is $J_0(x)$.
(b) Equation (8.1) is useful for deriving various properties of the $J_n$'s. For example, taking $\partial/\partial x$ of both sides, show that

$$\frac{d}{dx}J_n(x) = \frac{1}{2}\left[J_{n-1}(x) - J_{n+1}(x)\right]. \quad (n = 0, 1, 2, \ldots) \tag{8.2}$$

(c) Similarly, taking $\partial/\partial t$ of both sides, show that

$$J_{n+1}(x) = \frac{x}{2(n+1)}\left[J_n(x) + J_{n+2}(x)\right]. \tag{8.3}$$

(d) Using computer software, differentiate $J_0(x)$ and $J_1(x)$ and show that the results agree with (8.2).

9. (Integral representation of $J_n$) Besides the generating function (preceding exercise), another source of information about the $J_n$'s is the integral representation

$$J_n(x) = \frac{1}{\pi}\int_0^{\pi} \cos(n\theta - x\sin\theta)\,d\theta. \tag{9.1}$$

Verify (9.1) for the case $n = 0$ by using the Taylor series of $\cos t$, where $t = n\theta - x\sin\theta$, and integrating term by term. HINT: You may use any of the formulas given in the exercises to Section 4.5.

10. To derive (17) from (10), we argued that $1/\Gamma(k - n + 1) = 0$ for $k = 0, 1, \ldots, n - 1$, on the grounds that for those $k$'s $\Gamma(k - n + 1)$ is infinite. However, while it is true that the gamma function becomes infinite as its argument approaches $0, -1, -2, \ldots$, it is not rigorous to say that it is infinite at those points; it is simply undefined there. Here, we ask you to verify that the $k = 0, 1, \ldots, n - 1$ terms are zero so that the correct lower limit in (17) is, indeed, $k = n$. For definiteness, let $\nu = 3$ and $r = -\nu = -3$. (You should then be able to generalize the result for the case of any positive integer $n$, but we do not ask you to do that; $\nu = 3$ will suffice.)
HINT: Rather than work from (17), go back to the formulas (4a,b,c).

11. It was stated, below (26), that $J_\nu$ and $Y_\nu$ are LI. Prove that claim. HINT: Use (25) for $\nu = n$ and (26) for $\nu \ne n$.

12. Each differential equation is given on $0 < x < \infty$. Use (50) to obtain a general solution. Gamma functions that appear need not be evaluated.
(a) $y'' + 4x^2 y = 0$
(b) $xy'' - 2y' - x^2 y = 0$
(c) $xy'' - 2y' + xy = 0$
(d) $4y'' + 9xy = 0$
(e) $y'' + \sqrt{x}\,y = 0$
(f) $y'' - xy = 0$
(g) $xy'' + 3y' - xy = 0$
(h) $4xy'' + y = 0$
(i) $4xy'' + 2y' + xy = 0$
(j) $xy'' + 4y' - 4xy = 0$
(k) $xy'' + y' - 9x^2 y = 0$
(l) $y'' + xy = 0$

13. (a)-(l) Solve the corresponding problems in Exercise 12, this time using computer software.

14. (a) Use (50) to find a general solution of

$$xy'' + 3y' + 9xy = 0. \quad (0 < x < \infty)$$

(b) Find a particular solution satisfying the boundary conditions $y(0) = 6$, $y'(0) = 0$.
(c) Show that there is no particular solution satisfying the initial conditions $y(0) = 6$, $y'(0) = 2$. Does that result contradict the result stated in Theorem 3.3.1? Explain.

15. Use (50) to solve $y'' + 4y = 0$, and show that your solution agrees with the known elementary solution. You may use any results given in these exercises.

16. (Lateral vibration of hanging rope) Consider a flexible rope or chain that hangs from the ceiling under the sole action of gravity (see the accompanying sketch). If we pull the rope to one side and let go, it will oscillate from side to side in a complicated pattern which amounts to a superposition of many different modes, each having a specific shape $Y(x)$ and temporal frequency $\omega$. It can be shown (from Newton's second law of motion) that each shape $Y(x)$ is governed by the differential equation

$$\left[\rho g(l - x)Y'\right]' + \rho\omega^2 Y = 0, \quad (0 < x < l) \tag{16.1}$$

where $\rho$ is the mass per unit length and $g$ is the acceleration of gravity.
(a) Derive the general solution

$$Y(x) = A\,J_0\!\left(\frac{2\omega}{\sqrt{g}}\sqrt{l - x}\right) + B\,Y_0\!\left(\frac{2\omega}{\sqrt{g}}\sqrt{l - x}\right) \tag{16.2}$$

of (16.1). HINT: It may help to first make the change of variables $l - x = \xi$. NOTE: Observe from (16.2) that the displacement $Y$ will be unbounded at the free end $x = l$ because of the logarithmic singularity in $Y_0$ when its argument is zero (namely, when $x = l$). Mathematically, that singularity can be traced to the vanishing of the coefficient $\rho g(l - x)$ in (16.1) at $x = l$, which vanishing introduces a regular singular point of (16.1) at $x = l$ and results in the logarithmic singularity in the solution (16.2). Physically, observe that the coefficient $\rho g(l - x)$ in (16.1) represents the tension in the rope. The greater the tension the smaller the displacement (as anyone who has strung a clothesline knows). Hence the vanishing of the tension $\rho g(l - x)$ at the free end leads to the mathematical possibility of unbounded displacements there. In posing suitable boundary conditions, it is appropriate to preclude such unbounded displacements there by prescribing the boundary condition that $Y(l)$ be bounded; that is, a "boundedness condition." Imposing that condition implies that $B = 0$, so that the solution (16.2) reduces to $Y(x) = A\,J_0\!\left(\frac{2\omega}{\sqrt{g}}\sqrt{l - x}\right)$.
(b) As a second boundary condition, set $Y(0) = 0$. That condition does not lead to the evaluation of $A$ (which remains arbitrary); rather, it permits us to find the allowable temporal frequencies $\omega$. If the first three zeros of $J_0(x)$ are $x = 2.405$, $5.520$, and $8.654$, evaluate the first three frequencies $\omega$ (in terms of $g$ and $l$) and the corresponding mode shapes $Y(x)$ (to within the arbitrary scale factor $A$). Sketch those mode shapes by hand over $0 < x < l$.
(c) Use computer software to obtain the zeros quoted above (2.405, 5.520, 8.654), and to obtain computer plots of the three mode shapes. (Set $A = 1$, say.)

Chapter 4 Review

In this chapter we present methods for the solution of second-order homogeneous differential equations with nonconstant coefficients. The most important general results are Theorems 4.2.4 and 4.3.1, which guarantee specific forms of series solutions about ordinary and regular singular points, respectively.
About an ordinary point one can find two LI power series solutions and hence the general solution. About a regular singular point, say $x = 0$, one can find two LI solutions by the method of Frobenius, in terms of power series and power series modified by the multiplicative factors $|x|^r$ and/or $\ln|x|$, where $r$ is found by solving a quadratic equation known as the indicial equation. The combination of these forms is dictated by whether the roots $r$ are repeated or distinct and, if distinct, whether they differ by an integer or not. Note that the $|x|^r$ and $\ln|x|$ factors introduce singularities in the solutions (unless $r$ is a nonnegative integer).

Besides these general results, we meet these special functions:

Exponential integral (Section 4.3): $E_1(x) = \int_x^{\infty} \frac{e^{-t}}{t}\,dt, \quad (x > 0)$

Gamma function (Section 4.5): $\Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\,dt, \quad (x > 0)$

and study these important differential equations and find solutions for them:

Legendre equation (Section 4.4): $(1 - x^2)y'' - 2xy' + \lambda y = 0$
Solutions that are bounded on $-1 \le x \le 1$ exist only if $\lambda = n(n + 1)$ for $n = 0, 1, 2, \ldots$, and these are the Legendre polynomials $P_n(x)$: $P_0(x) = 1$, $P_1(x) = x$, $P_2(x) = \frac{1}{2}(3x^2 - 1), \ldots$.

Bessel equation (Section 4.6): $x^2 y'' + xy' + (x^2 - \nu^2)y = 0$
General solution: $y(x) = A\,J_\nu(x) + B\,Y_\nu(x) = C\,H_\nu^{(1)}(x) + D\,H_\nu^{(2)}(x)$, where $J_\nu$, $Y_\nu$ are Bessel functions of the first and second kind, respectively, of order $\nu$, and $H_\nu^{(1)}$, $H_\nu^{(2)}$ are the Hankel functions of the first and second kind, respectively, of order $\nu$.

Modified Bessel equation (Section 4.6): $x^2 y'' + xy' + (-x^2 - \nu^2)y = 0$
For brevity, we consider only the case where $\nu = n$ is an integer. General solution: $y(x) = A\,I_n(x) + B\,K_n(x)$, where $I_n$, $K_n$ are modified Bessel functions of the first and second kinds, respectively, of order $n$.

NOTE: We suggest that to place the many Bessel function results in perspective it is helpful to see the Bessel equation $x^2 y'' + xy' + (x^2 - \nu^2)y = 0$ and the modified Bessel equation $x^2 y'' + xy' + (-x^2 - \nu^2)y = 0$ as analogous to the harmonic oscillator equation $y'' + y = 0$ and the "modified harmonic oscillator equation" $y'' - y = 0$. For instance, the oscillatory $J_\nu(x)$ and $Y_\nu(x)$ solutions of the Bessel equation are analogous to the oscillatory $\cos x$ and $\sin x$ solutions of $y'' + y = 0$, and the complex Hankel function solutions $H_\nu^{(1)}(x)$ and $H_\nu^{(2)}(x)$ are analogous to the complex $e^{ix}$ and $e^{-ix}$ solutions. Similarly, the nonoscillatory $I_\nu(x)$ and $K_\nu(x)$ solutions of the modified Bessel equation are analogous to the nonoscillatory $e^{x}$ and $e^{-x}$ solutions of the equation $y'' - y = 0$.

Equations reducible to Bessel equations (Section 4.6): The equation

$$\frac{d}{dx}\left(x^a \frac{dy}{dx}\right) + b\,x^c\,y = 0,$$

where $a$, $b$, $c$ are real numbers, has solutions

$$y(x) = x^{\nu/\alpha}\, Z_\nu\!\left(\alpha\sqrt{|b|}\,x^{1/\alpha}\right),$$

where

$$\alpha = \frac{2}{c - a + 2}, \qquad \nu = \frac{1 - a}{c - a + 2},$$

and $Z_\nu$ denotes $J_\nu$ and $Y_\nu$ if $b > 0$, and $I_\nu$ and $K_\nu$ if $b < 0$.

Chapter 5 Laplace Transform

5.1 Introduction

The Laplace transform is an example of an integral transform, namely, a relation of the form

$$F(s) = \int_a^b K(t, s)\,f(t)\,dt, \tag{1}$$

which transforms a given function $f(t)$ into another function $F(s)$; $K(t, s)$ is called the kernel of the transform, and $F(s)$ is known as the transform of $f(t)$. Thus, whereas a function sends one number into another [for example, the function $f(x) = x^2$ sends the point $x = 3$ on an $x$ axis into the point $f = 9$ on an $f$ axis], (1) sends one function into another, namely, it sends $f(t)$ into $F(s)$. Probably the most well known integral transform is the Laplace transform, where $a = 0$, $b = \infty$, and $K(t, s) = e^{-st}$. In that case (1) takes the form

$$F(s) = \int_0^{\infty} f(t)\,e^{-st}\,dt. \tag{2}$$

The parameter $s$ can be complex, but we limit it to real values in this chapter. Besides the notation $F(s)$ used in (2), the Laplace transform of $f(t)$ is also denoted as $L\{f(t)\}$ or as $\bar{f}(s)$, and in a given application we will use whichever of these three notations seems best.

The basic idea behind any transform is that the given problem can be solved more readily in the "transform domain." To illustrate, consider the use of the natural logarithm in numerical calculation.
While the addition of two numbers is arithmetically simple, their multiplication can be quite laborious; for example, try working out 2.761359 × 8.247504 by hand. Thus, given two positive numbers $u$ and $v$, suppose we wish to compute their product $y = uv$. Taking the logarithm of both sides gives $\ln y = \ln uv$. But $\ln uv = \ln u + \ln v$, so we have $\ln y = \ln u + \ln v$. Thus, whereas the original problem was one of multiplication, the problem in the "transform domain" is merely one of addition. The idea, then, is to look up $\ln u$ and $\ln v$ in a table and to add these two values. With the sum in hand, we again enter the table, this time using it in the reverse direction to find the antilog, $y$. (Of course, with pocket calculators and computers available logarithm tables are no longer needed, as they were fifty years ago, but the transform nature of the logarithm remains the same, whether we use tables or not.) Similarly, the logarithm reduces exponentiation to multiplication since if $y = u^v$, then $\ln y = \ln(u^v) = v\ln u$, and it reduces division to subtraction.

Analogously, given a linear ordinary differential equation with constant coefficients, we see that if we take a Laplace transform of all terms in the equation then we obtain a linear algebraic equation on the transform $X(s)$ of the unknown function $x(t)$. That equation can be solved for $X(s)$ by simple algebra and the solution $x(t)$ obtained from a Laplace transform table. The method is especially attractive for nonhomogeneous differential equations with forcing functions which are step functions or impulse functions; we study those cases in Section 5.5.

Observe that we have departed from our earlier usage of $x$ as the independent variable. Here we use $t$ and consider the interval $0 < t < \infty$ because in most (though not all) applications of the Laplace transform the independent variable is the time $t$, with $0 \le t < \infty$.

A brief outline of this chapter follows:

5.2 Calculation of the Transform.
In this section we study the existence of the transform, and its calculation.

5.3 Properties of the Transform. Three properties of the Laplace transform are discussed: linearity of the transform and its inverse, the transform of derivatives, and the convolution theorem. These are crucial in the application of the method to the solution of ordinary differential equations, homogeneous or not.

5.4 Application to the Solution of Differential Equations. Here, we demonstrate the principal application of the Laplace transform, namely, to the solution of linear ordinary differential equations.

5.5 Discontinuous Forcing Functions; Heaviside Step Function. Discontinuous forcing functions are common in engineering and science. In this section we introduce the Heaviside step function and demonstrate its use.

5.6 Impulsive Forcing Function; Dirac Impulse Function. Likewise common are impulsive forcing functions such as the force imparted to a mass by a hammer blow. In this section we introduce the Dirac delta function to model such impulsive actions.

5.7 Additional Properties. There are numerous useful properties of the transform beyond the three discussed in Section 5.3. A number of these are given here, as a sequence of theorems.

5.2 Calculation of the Transform

The first question to address is whether the transform $F(s)$ of a given function $f(t)$ exists, that is, whether the integral

$$F(s) = \int_0^{\infty} f(t)\,e^{-st}\,dt \tag{1}$$

converges. Before giving an existence theorem, we define two terms. First, we say that $f(t)$ is of exponential order as $t \to \infty$ if there exist real constants $K$, $c$, and $T$ such that

$$|f(t)| \le K e^{ct} \tag{2}$$

for all $t \ge T$. That is, the set of functions of exponential order is the set of functions that do not grow faster than exponentially, which includes most functions of engineering interest.

EXAMPLE 1. Is $f(t) = \sin t$ of exponential order? Yes: $|\sin t| \le 1$ for all $t$, so (2) holds with $K = 1$, $c = 0$, and $T = 0$.
Of course, these values are not uniquely chosen for (2) holds also with $K = 7$, $c = 12$, and $T = 100$, for instance. ■

EXAMPLE 2. Is $f(t) = t^2$ of exponential order? l'Hôpital's rule gives

$$\lim_{t\to\infty} \frac{t^2}{e^{ct}} = \lim_{t\to\infty} \frac{2t}{c\,e^{ct}} = \lim_{t\to\infty} \frac{2}{c^2 e^{ct}} = 0$$

if $c > 0$. Choose $c = 1$, say. Then, from the definition of limit, there must be a $T$ such that $t^2/e^{t} < 0.06$, say, for all $t \ge T$. Thus, $|f(t)| = t^2 < 0.06\,e^{t}$ for all $t \ge T$, hence $f(t)$ is of exponential order. ■

On the other hand, the function $f(t) = e^{t^2}$ is not of exponential order because

$$\lim_{t\to\infty} \frac{e^{t^2}}{e^{ct}} = \lim_{t\to\infty} e^{t^2 - ct} = \infty, \tag{3}$$

no matter how large $c$ is.

We say that $f(t)$ is piecewise continuous on $a \le t \le b$ if there exist a finite number of points $t_1, t_2, \ldots, t_N$ such that $f(t)$ is continuous on each open subinterval $a < t < t_1$, $t_1 < t < t_2$, ..., $t_N < t < b$, and has a finite limit as $t$ approaches each endpoint from the interior of that subinterval. For instance, the function $f(t)$ shown in Fig. 1 is piecewise continuous on the interval $0 \le t \le 4$. The values of $f$ at the endpoints $a, t_1, t_2, \ldots, b$ are not relevant to whether or not $f$ is piecewise continuous; hence we have not even indicated those values in Fig. 1. For instance, the limit of $f$ as $t$ tends to 2 from the left exists and is 5, and the limit of $f$ as $t$ tends to 2 from the right exists and is 10, so the value of $f$ at $t = 2$ does not matter. Thus, piecewise continuity allows for the presence of jump discontinuities.

We can now provide a theorem that gives sufficient conditions on $f(t)$ for the existence of its Laplace transform $F(s)$.

THEOREM 5.2.1 Existence of the Laplace Transform
Let $f(t)$ satisfy these conditions: (i) $f(t)$ is piecewise continuous on $0 \le t \le A$, for every $A > 0$, and (ii) $f(t)$ is of exponential order as $t \to \infty$, so that there exist real constants $K$, $c$, and $T$ such that $|f(t)| \le K e^{ct}$ for all $t \ge T$. Then the Laplace transform of $f(t)$, namely, $F(s)$ given by (1), exists for all $s > c$.

Proof: We need to show only that the singular integral in (1) is convergent.
Breaking it up as

$$\int_0^{\infty} f(t)\,e^{-st}\,dt = \int_0^{T} f(t)\,e^{-st}\,dt + \int_T^{\infty} f(t)\,e^{-st}\,dt, \tag{4}$$

the first integral on the right exists since the integrand is piecewise continuous on the finite interval $0 \le t \le T$. In the second integral, $|f(t)e^{-st}| = |f(t)|e^{-st} \le K e^{-(s-c)t}$. Now, $\int_T^{\infty} K e^{-(s-c)t}\,dt$ is convergent for $s > c$, so $\int_T^{\infty} f(t)\,e^{-st}\,dt$ is absolutely convergent and hence, by Theorem 4.5.3, convergent. ■

Being thus assured by Theorem 5.2.1 that the transform $F(s)$ exists for a large and useful class of functions, we proceed to illustrate the evaluation of $F(s)$ for several elementary functions, say $f(t) = 1$, $e^{at}$, $\sin at$, where $a$ is a real number, and $1/\sqrt{t}$.

EXAMPLE 3. If $f(t) = 1$, then the conditions of Theorem 5.2.1 are met for any $c \ge 0$, so according to Theorem 5.2.1, $F(s)$ should exist for all $s > 0$. Let us see.

$$F(s) = \int_0^{\infty} e^{-st}\,dt = \lim_{B\to\infty} \left.\frac{e^{-st}}{-s}\right|_0^{B} = \frac{1}{s}, \tag{5}$$

where the limit does indeed exist for all $s > 0$. Such restriction on $s$ will cause no difficulty in applications. ■

EXAMPLE 4. If $f(t) = e^{at}$, the conditions of Theorem 5.2.1 are met for any $c \ge a$ so according to the theorem, $F(s)$ should exist for all $s > a$. In fact,

$$F(s) = \int_0^{\infty} e^{at}e^{-st}\,dt = \lim_{B\to\infty} \left.\frac{e^{-(s-a)t}}{-(s-a)}\right|_0^{B} = \frac{1}{s-a}, \tag{6}$$

where the limit does indeed exist for all $s > a$. ■

EXAMPLE 5. If $f(t) = \sin at$, then the conditions of Theorem 5.2.1 are met for any $c \ge 0$ so $F(s)$ should exist for all $s > 0$. In fact, integrating by parts twice gives

$$F(s) = \int_0^{\infty} \sin at\;e^{-st}\,dt = \lim_{B\to\infty}\left\{\left[\sin at\,\frac{e^{-st}}{-s}\right]_0^{B} + \frac{a}{s}\left[\cos at\,\frac{e^{-st}}{-s}\right]_0^{B} - \frac{a^2}{s^2}\int_0^{B} \sin at\;e^{-st}\,dt\right\} = (0 - 0) + \frac{a}{s}\left(0 + \frac{1}{s}\right) - \frac{a^2}{s^2}F(s), \tag{7}$$

where the limit exists if $s > 0$. The latter can be solved for $F(s)$, and gives

$$F(s) = \frac{a}{s^2 + a^2} \quad (s > 0) \tag{8}$$

as the transform of $\sin at$.

COMMENT. An alternative approach, which requires a knowledge of the algebra of complex numbers (Section 21.2), is as follows:

$$\int_0^{\infty} \sin at\;e^{-st}\,dt = \int_0^{\infty} \left(\operatorname{Im} e^{iat}\right) e^{-st}\,dt = \operatorname{Im}\int_0^{\infty} e^{-(s-ia)t}\,dt = \operatorname{Im}\lim_{B\to\infty}\left.\frac{e^{-(s-ia)t}}{-(s-ia)}\right|_0^{B} = \operatorname{Im}\frac{1}{s-ia} = \operatorname{Im}\left(\frac{1}{s-ia}\,\frac{s+ia}{s+ia}\right) = \frac{a}{s^2+a^2}, \tag{9}$$

as before, where the fourth equality follows because

$$\left|e^{-(s-ia)B}\right| = \left|e^{-sB}\right|\left|e^{iaB}\right| = e^{-sB} \to 0 \tag{10}$$

as $B \to \infty$, if $s > 0$. In (10) we have used the fact that $|e^{iaB}| = |\cos aB + i\sin aB| = \sqrt{\cos^2 aB + \sin^2 aB} = 1$. ■

EXAMPLE 6. If $f(t) = 1/\sqrt{t}$, then

$$F(s) = \int_0^{\infty} t^{-1/2}e^{-st}\,dt = \int_0^{\infty} \left(\frac{\tau}{s}\right)^{-1/2} e^{-\tau}\,\frac{d\tau}{s} = \frac{1}{\sqrt{s}}\int_0^{\infty} \tau^{-1/2}e^{-\tau}\,d\tau, \tag{11}$$

where we have used the substitution $st = \tau$. Having studied the gamma function in Section 4.5, we see that the final integral is $\Gamma(1/2) = \sqrt{\pi}$ so

$$F(s) = \sqrt{\frac{\pi}{s}} \quad (s > 0). \tag{12}$$

Observe that $f(t) = 1/\sqrt{t}$ does not satisfy the conditions of Theorem 5.2.1 because it is not piecewise continuous on $0 \le t < \infty$ since it does not have a limit as $t \to 0$. Nonetheless, the singularity at $t = 0$ is not strong enough to cause divergence of the integral in (11), and hence the transform exists. Thus, remember that all we need is convergence of the integral in (1); the conditions in Theorem 5.2.1 are sufficient, not necessary. ■

From these examples we could begin to construct a Laplace transform table, with $f(t)$ in one column and its transform $F(s)$ in another. Such a table is supplied in Appendix C. More extensive ones are available,* and one can also obtain transforms and their inverses directly using computer software.

Tables can be used in either direction. For example, just as the transform of $e^{at}$ is $1/(s - a)$, it is also true that the unique function whose transform is $1/(s - a)$ is $e^{at}$. Operationally, we say that

$$L\{e^{at}\} = \frac{1}{s-a} \quad\text{and}\quad L^{-1}\left\{\frac{1}{s-a}\right\} = e^{at}, \tag{13}$$

where $L$ is the Laplace transform operator defined by

$$L\{f(t)\} = \int_0^{\infty} f(t)\,e^{-st}\,dt, \tag{14}$$

and $L^{-1}$ is the inverse Laplace transform operator. It turns out that $L^{-1}$ is, like $L$, an integral operator, namely

$$L^{-1}\{F(s)\} = \frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty} F(s)\,e^{st}\,ds, \tag{15}$$

where $\gamma$ is a sufficiently positive real number. The latter is an integration in a complex $s$ plane, and to carry out such integrations one needs to study the complex integral calculus.
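The closed forms obtained in Examples 3 to 5 can be spot-checked numerically. The following is only a minimal sketch, not part of the text: it approximates the defining integral (14) by composite Simpson quadrature on a truncated interval. The helper name `laplace_numeric`, the cutoff `upper`, and the panel count `n` are illustrative assumptions, and the truncation is legitimate only when the integrand is already negligible at `upper`.

```python
import math

def laplace_numeric(f, s, upper=60.0, n=60000):
    """Approximate F(s) = integral_0^upper f(t) e^{-st} dt by composite Simpson's rule.

    Assumption: the infinite upper limit in (14) is replaced by `upper`,
    which is acceptable only if f(t) e^{-st} is negligible beyond it.
    """
    if n % 2:
        raise ValueError("n must be even for Simpson's rule")
    h = upper / n

    def g(t):
        return f(t) * math.exp(-s * t)

    total = g(0.0) + g(upper)
    for k in range(1, n):
        # Simpson weights: 4 at odd nodes, 2 at even interior nodes
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3.0

a, s = 2.0, 3.0  # sample values with s > a, so all three transforms exist
checks = [
    ("1",      lambda t: 1.0,              1.0 / s),            # equation (5)
    ("e^{at}", lambda t: math.exp(a * t),  1.0 / (s - a)),      # equation (6)
    ("sin at", lambda t: math.sin(a * t),  a / (s * s + a * a)) # equation (8)
]
for name, f, exact in checks:
    approx = laplace_numeric(f, s)
    print(f"L{{{name}}} at s = {s}: numeric {approx:.6f}, closed form {exact:.6f}")
```

The transform of $1/\sqrt{t}$ from Example 6 is left out on purpose: its integrand is singular at $t = 0$, so a plain Simpson rule that samples $t = 0$ would fail there, and a substitution such as $t = u^2$ would be needed first.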
If, for instance, we would put $1/(s - a)$ into the integrand in (15) for $F(s)$ and carry out the integration, we would obtain $e^{at}$. We will return to (15) near the end of this text, when we study the complex integral calculus, but we will not use it in our present discussion; instead, we will rely on tables (and computer software) to obtain inverses. In fact, there are entire books on the Laplace transform that do not even contain the inversion formula (15). Our purpose in presenting it here is to show that the inverse operator is, like $L$, an integral operator, and to close the operational "loop": $L\{f(t)\} = F(s)$, and then $L^{-1}\{F(s)\} = f(t)$.

*See, for example, A. Erdélyi (ed.), Tables of Integral Transforms, Vol. 1 (New York: McGraw-Hill, 1954).

What can we say about the existence and uniqueness of the inverse transform? Although we do not need to go into them here, there are conditions that $F(s)$ must satisfy if the inversion integral in (15) is to exist, that is, to converge. Thus, if one writes a function $F(s)$ at random, it may not have an inverse; there may be no function $f(t)$ whose transform is that particular $F(s)$. For instance, there is no function $f(t)$ whose transform is $s^2$ because its inversion integral is divergent. But suppose that we can, indeed, find an inverse for a given $F(s)$. Is that inverse necessarily unique; might there be more than one function $f(t)$ having the same transform? Strictly speaking, the answer is always yes. For instance, not only does the function $f(t) = 1$ have the transform $F(s) = 1/s$ (as found in Example 3), but so does the function

$$g(t) = \begin{cases} 1, & 0 \le t < 3,\ 3 < t < \infty \\ 500, & t = 3 \end{cases}$$

have the transform $G(s) = 1/s$, because the integrands in $\int_0^{\infty} g(t)\,e^{-st}\,dt$ and $\int_0^{\infty} f(t)\,e^{-st}\,dt$ differ only at the single point $t = 3$. Since there is no area under a single point (of finite height), $G(s)$ and $F(s)$ are identical: $G(s) = F(s) = 1/s$.
Clearly, one can construct an infinite number of functions, each having $1/s$ as its transform, but in a practical sense these functions differ only superficially. In fact, it is known from Lerch's theorem* that the inverse transform is unique to within an additive null function, a function $N(t)$ such that $\int_0^{T} N(t)\,dt = 0$ for every $T > 0$, so we can be content that the inverse transform is essentially unique.

*See, for example, D. V. Widder, The Laplace Transform (Princeton, NJ: Princeton University Press, 1941).

Closure. Theorem 5.2.1 guarantees the existence of the Laplace transform of a given function $f(t)$, subject to the (sufficient but not necessary) conditions that $f$ be piecewise continuous on $0 \le t < \infty$ and of exponential order as $t \to \infty$. We proceed to demonstrate the evaluation of the transforms of several simple functions, and discuss the building up of a transform table. Regarding the use of such a table, one needs to know whether the inverse transform found in the table is necessarily unique, and we use Lerch's theorem to show that for practical purposes we can, indeed, regard inverses as unique.

Computer software. On Maple, the laplace and invlaplace commands give transforms and inverses, respectively, provided that we enter readlib(laplace) first. To illustrate, the commands

    readlib(laplace):
    laplace(1 + t^(-1/2), t, s);

give the transform of $1 + t^{-1/2}$ as

$$\frac{1}{s} + \frac{\sqrt{\pi}}{\sqrt{s}} \tag{16}$$

and the command

    invlaplace(a/(s^2 + a^2), s, t); (17)

gives the inverse transform of $a/(s^2 + a^2)$ as $\sin at$.

EXERCISES 5.2

1. Show whether or not the given function is of exponential order. If it is, determine a suitable set of values for $K$, $c$, and $T$ in (2).
(a) $5e^{4t}$
(b) $-10e^{-5t}$
(c) $\sinh 2t$
(d) $\cosh 3t$
(e) $\sinh t^2$
(f) $e^{4t}\sin t$
(g) $\cos t^2$
(h) $t^{100}$
(i) $1/(t + 2)$
(j) $\cosh t^2$
(k) $6t + e^{t}\cos t$
(l) $t^{1000}$

2. If $f(t)$ is of exponential order, does it follow that $df/dt$ is too? HINT: Consider $f(t) = \sin e^{t^2}$.

3.
If $f(t)$ and $g(t)$ are each of exponential order, does it follow that $f(g(t))$ is too? HINT: Consider the case where $f(t) = e^{t}$ and $g(t) = t^2$.

4. In Example 6 we state that the result (12) holds if $s > 0$. Show why that condition is needed.

5. Does $L\{t^{-3/2}\}$ exist? Explain.

6. Does $L\{t^{-2/3}\}$ exist? Explain.

7. Derive $L\{\cos at\}$ two ways: using integration by parts and using the fact that $\cos at = \operatorname{Re} e^{iat}$. (See Example 5.)

8. Derive $L\{e^{at}\sin bt\}$ two ways: using integration by parts and using $\sin bt = \operatorname{Im} e^{ibt}$.

9. Derive $L\{e^{at}\cos bt\}$ two ways: using integration by parts and using $\cos bt = \operatorname{Re} e^{ibt}$.

10. Derive by integration the Laplace transform for each of the following entries in Appendix C:
(a) entry 5 (b) entry 6 (c) entry 7 (d) entry 8

11. Derive entry 11 in Appendix C two ways:
(a) by writing the transform as

$$\frac{1}{2i}\int_0^{\infty} t\,e^{-(s-ia)t}\,dt - \frac{1}{2i}\int_0^{\infty} t\,e^{-(s+ia)t}\,dt$$

or, if you prefer, by writing the transform as the single integral $\operatorname{Im}\int_0^{\infty} t\,e^{-(s-ia)t}\,dt$, and using integration by parts;
(b) by differentiating both sides of the known transform

$$\int_0^{\infty} \sin at\;e^{-st}\,dt = \frac{a}{s^2 + a^2}$$

(which we derived in Example 5) with respect to $s$, assuming the validity of the interchange

$$\frac{d}{ds}\int_0^{\infty} \sin at\;e^{-st}\,dt = \int_0^{\infty} \frac{\partial}{\partial s}\left(\sin at\;e^{-st}\right)dt$$

in the order of integration and differentiation.

12. Use the idea in Exercise 11(b) to derive
(a) entry 12 from entry 4
(b) entry 7 from entry 1
(c) entry 13 from entry 5
(d) entry 14 from entry 6
(e) entry 15 from entry 2

13. Show that $L\{e^{at}\} = 1/(s - a)$ holds even if $a = \operatorname{Re} a + i\operatorname{Im} a$ is complex, provided that $s > \operatorname{Re} a$.

14. Use computer software to verify the given entry in Appendix C in both directions. That is, show that the transform of $f(t)$ is $F(s)$, and also show that the inverse of $F(s)$ is $f(t)$.
(a) 1-3 (b) 4-7 (c) 8-10 (d) 11-13 (e) 14-16 (f) 17-19 (g) 20-22

5.3 Properties of the Transform

When we studied the integral calculus we might have evaluated a few simple integrals directly from the definition of the Riemann integral, but for the most part we learned how to evaluate integrals using a number of properties.
For instance, we used linearity (whereby $\int_a^b [\alpha u(x) + \beta v(x)]\,dx = \alpha\int_a^b u(x)\,dx + \beta\int_a^b v(x)\,dx$ for any constants $\alpha$, $\beta$, and any functions $u$, $v$, if the two integrals on the right exist), integration by parts, and the fundamental theorem of the calculus (which enabled us to generate a long list of integrals from an already known long list of derivatives).

Our plan for the Laplace transform is not much different; we work out a handful of transforms by direct integration, and then rely on a variety of properties of the transform and its inverse to extend that list substantially. There are many such properties, but in this section we present only the handful that will be essential when we apply the Laplace transform method to the solution of differential equations, in the next section. Additional properties are discussed in the final section of this chapter.

We begin with the linearity property of the transform and its inverse.

THEOREM 5.3.1 Linearity of the Transform
If $u(t)$ and $v(t)$ are any two functions such that the transforms $L\{u(t)\}$ and $L\{v(t)\}$ both exist, then

$$L\{\alpha u(t) + \beta v(t)\} = \alpha L\{u(t)\} + \beta L\{v(t)\} \tag{1}$$

for any constants $\alpha$, $\beta$.

Proof: We have

$$\begin{aligned}
L\{\alpha u(t) + \beta v(t)\} &= \int_0^{\infty} [\alpha u(t) + \beta v(t)]\,e^{-st}\,dt \\
&= \lim_{B\to\infty} \int_0^{B} [\alpha u(t) + \beta v(t)]\,e^{-st}\,dt \\
&= \lim_{B\to\infty} \left[\alpha\int_0^{B} u(t)\,e^{-st}\,dt + \beta\int_0^{B} v(t)\,e^{-st}\,dt\right] \\
&= \alpha\lim_{B\to\infty}\int_0^{B} u(t)\,e^{-st}\,dt + \beta\lim_{B\to\infty}\int_0^{B} v(t)\,e^{-st}\,dt \\
&= \alpha\int_0^{\infty} u(t)\,e^{-st}\,dt + \beta\int_0^{\infty} v(t)\,e^{-st}\,dt \\
&= \alpha L\{u(t)\} + \beta L\{v(t)\},
\end{aligned} \tag{2}$$

where the third equality follows from the linearity property of Riemann integration, and the fourth equality amounts to the result, from the calculus, that $\lim[\alpha f(B) + \beta g(B)] = \alpha\lim f(B) + \beta\lim g(B)$ as $B \to B_0$, if the latter two limits exist. ■

EXAMPLE 1. To evaluate the transform of $6 - 5e^{4t}$, for example, we need merely know the transforms of the simpler functions 1 and $e^{4t}$ for $L\{6 - 5e^{4t}\} = 6L\{1\} - 5L\{e^{4t}\}$. Now, $L\{1\} = 1/s$ for $s > 0$, and $L\{e^{4t}\} = 1/(s - 4)$ for $s > 4$ so

$$L\{6 - 5e^{4t}\} = 6\,\frac{1}{s} - 5\,\frac{1}{s-4} = \frac{s - 24}{s(s-4)}$$

for $s > 4$. ■

THEOREM 5.3.2
Linearity of the Inverse Transform
For any $U(s)$ and $V(s)$ such that the inverse transforms $L^{-1}\{U(s)\} = u(t)$ and $L^{-1}\{V(s)\} = v(t)$ exist,

$$L^{-1}\{\alpha U(s) + \beta V(s)\} = \alpha L^{-1}\{U(s)\} + \beta L^{-1}\{V(s)\} \tag{3}$$

for any constants $\alpha$, $\beta$.

Proof: Equation (3) follows either upon taking $L^{-1}$ of both sides of (1) or from the linearity property of the integral in the inversion formula [equation (15) in Section 5.2]. ■

EXAMPLE 2. Asked to evaluate the inverse of $F(s) = 3/(s^2 + 3s - 10)$, we turn to Appendix C but do not find this $F(s)$ in the column of transforms. However, we can simplify $F(s)$ by using partial fractions. Accordingly, we express

$$\frac{3}{s^2 + 3s - 10} = \frac{A}{s+5} + \frac{B}{s-2} = \frac{(A + B)s + (-2A + 5B)}{s^2 + 3s - 10}. \tag{4}$$

To make the latter an identity we equate the coefficients of $s^1$ and $s^0$ in the numerators of the left- and right-hand sides: $s^1$ gives $0 = A + B$, and $s^0$ gives $3 = -2A + 5B$ so $A = -3/7$ and $B = 3/7$. Then

$$L^{-1}\left\{\frac{3}{s^2 + 3s - 10}\right\} = L^{-1}\left\{-\frac{3/7}{s+5} + \frac{3/7}{s-2}\right\} = -\frac{3}{7}L^{-1}\left\{\frac{1}{s+5}\right\} + \frac{3}{7}L^{-1}\left\{\frac{1}{s-2}\right\} = -\frac{3}{7}e^{-5t} + \frac{3}{7}e^{2t}, \tag{5}$$

where the second equality follows from Theorem 5.3.2, and the last equality follows from entry 2 in Appendix C.

COMMENT. Actually, we could have used entry 9 in Appendix C, which says that

$$L^{-1}\left\{\frac{b}{(s-a)^2 + b^2}\right\} = e^{at}\sin bt, \tag{6}$$

for if we equate $(s - a)^2 + b^2 = s^2 - 2as + a^2 + b^2$ to $s^2 + 3s - 10$ we see that $a = -3/2$ and $b = \pm 7i/2$. Choose $b = +7i/2$, say ($b = -7i/2$ will give the same result). Then

$$L^{-1}\left\{\frac{3}{s^2 + 3s - 10}\right\} = \frac{3}{7i/2}\,L^{-1}\left\{\frac{7i/2}{(s + 3/2)^2 + (7i/2)^2}\right\} = \frac{6}{7i}\,e^{-3t/2}\sin\frac{7it}{2} = \frac{6}{7i}\,e^{-3t/2}\,\frac{e^{i(7it/2)} - e^{-i(7it/2)}}{2i} = \frac{3}{7}\left(-e^{-5t} + e^{2t}\right), \tag{7}$$

where the first equality is true by linearity and the second follows from (6). This result is the same as the one found above by partial fractions. This example illustrates the fact that we can often invert a given transform in more than one way. ■

If we are going to apply the Laplace transform method to differential equations, we need to know how to take transforms of derivatives.
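The partial-fraction step of Example 2 amounts to a small linear system, and it can be checked in exact arithmetic. The sketch below is an illustration under stated assumptions rather than the text's own method: it takes $A$ as the coefficient over $s + 5$ and $B$ as the coefficient over $s - 2$ (which is what the numerator $(A + B)s + (-2A + 5B)$ in (4) implies), solves for them by Cramer's rule, and confirms that the two forms of $F(s)$ agree at a few sample points. The function names `F` and `F_partial` are hypothetical.

```python
from fractions import Fraction

# Matching numerators in 3/(s^2 + 3s - 10) = A/(s + 5) + B/(s - 2),
# i.e. A(s - 2) + B(s + 5) = 3, gives the linear system
#   s^1:    A +  B = 0
#   s^0:  -2A + 5B = 3
a11, a12, b1 = Fraction(1), Fraction(1), Fraction(0)
a21, a22, b2 = Fraction(-2), Fraction(5), Fraction(3)

# Cramer's rule in exact rational arithmetic (no rounding error)
det = a11 * a22 - a12 * a21
A = (b1 * a22 - a12 * b2) / det
B = (a11 * b2 - b1 * a21) / det
print("A =", A, " B =", B)

def F(s):
    """Original transform 3/(s^2 + 3s - 10)."""
    return Fraction(3) / (s * s + 3 * s - 10)

def F_partial(s):
    """Partial-fraction form A/(s + 5) + B/(s - 2)."""
    return A / (s + 5) + B / (s - 2)

# The two forms must agree wherever both are defined (s != -5 and s != 2).
for s in (Fraction(3), Fraction(7, 2), Fraction(10)):
    assert F(s) == F_partial(s)
```

The computed coefficients are the ones that multiply $e^{-5t}$ and $e^{2t}$, respectively, in the inverse (5), so the exact check ties the algebra of (4) to the final result.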
THEOREM 5.3.3 Transform of the Derivative
Let $f(t)$ be continuous and $f'(t)$ be piecewise continuous on $0 \le t \le t_0$ for every finite $t_0$, and let $f(t)$ be of exponential order as $t \to \infty$ so that there are constants $K$, $c$, $T$ such that $|f(t)| \le K e^{ct}$ for all $t \ge T$. Then $L\{f'(t)\}$ exists for all $s > c$, and

$$L\{f'(t)\} = sL\{f(t)\} - f(0). \tag{8}$$

Proof: Since $L\{f'(t)\} = \lim_{B\to\infty}\int_0^{B} f'(t)\,e^{-st}\,dt$, consider the integral

$$I = \int_0^{B} f'(t)\,e^{-st}\,dt = \int_0^{t_1} f'(t)\,e^{-st}\,dt + \cdots + \int_{t_n}^{B} f'(t)\,e^{-st}\,dt, \tag{9}$$

where $t_1, \ldots, t_n$ are the points, in $0 < t < B$, at which $f'$ is discontinuous. Integrating by parts gives

$$I = \left. f(t)\,e^{-st}\right|_0^{t_1} + \cdots + \left. f(t)\,e^{-st}\right|_{t_n}^{B} + s\int_0^{t_1} f(t)\,e^{-st}\,dt + \cdots + s\int_{t_n}^{B} f(t)\,e^{-st}\,dt. \tag{10}$$

By virtue of the continuity of $f$, the boundary terms at $t_1, \ldots, t_n$ cancel in pairs so that, after recombining the integrals in (10), we have

$$I = f(B)\,e^{-sB} - f(0) + s\int_0^{B} f(t)\,e^{-st}\,dt. \tag{11}$$

Since $f$ is of exponential order as $t \to \infty$ it follows that $f(B)\,e^{-sB} \to 0$ as $B \to \infty$ (for $s > c$). Thus,

$$L\{f'(t)\} = \lim_{B\to\infty}\left[f(B)\,e^{-sB} - f(0) + s\int_0^{B} f(t)\,e^{-st}\,dt\right] = 0 - f(0) + sL\{f(t)\}, \tag{12}$$

as was to be proved. ■

The foregoing result can be used to obtain the transforms of higher derivatives as well. For example, if $f'(t)$ satisfies the conditions imposed on $f$ in Theorem 5.3.3, then replacement of $f$ by $f'$ in (8) gives $L\{f''\} = sL\{f'\} - f'(0)$. If, besides $f'$, $f$ also satisfies the conditions of Theorem 5.3.3, so that the $L\{f'\}$ term on the right side can itself be expressed by (8), then

$$L\{f''\} = s\left[sL\{f\} - f(0)\right] - f'(0) = s^2 L\{f\} - s f(0) - f'(0). \tag{13}$$

Similarly,

$$L\{f'''\} = s^3 L\{f\} - s^2 f(0) - s f'(0) - f''(0), \tag{14}$$

if $f''$, $f'$, and $f$ satisfy the conditions of Theorem 5.3.3, and so on for the transforms of higher-order derivatives.

The last of the major properties of the Laplace transform that we discuss in this section is the Laplace convolution theorem.

THEOREM 5.3.4 Laplace Convolution Theorem
If $L\{f(t)\} = F(s)$ and $L\{g(t)\} = G(s)$ both exist for $s > c$, then

$$L^{-1}\{F(s)G(s)\} = \int_0^{t} f(\tau)\,g(t - \tau)\,d\tau \tag{15}$$

or, equivalently,

$$L\left\{\int_0^{t} f(\tau)\,g(t - \tau)\,d\tau\right\} = F(s)G(s) \tag{16}$$

for $s > c$.

Proof: Since (15) and (16) are equivalent statements, it suffices to prove just one, say (16).
By definition,

$$L\left\{\int_0^{t} f(\tau)\,g(t - \tau)\,d\tau\right\} = \int_0^{\infty}\left\{\int_0^{t} f(\tau)\,g(t - \tau)\,d\tau\right\} e^{-st}\,dt. \tag{17}$$

Regarding the latter as an iterated integral over a 45° wedge in a $\tau, t$ plane as shown in Fig. 1, let us invert the order of integration in (17). Recalling from the calculus the equivalent notations

$$\int_a^b\left\{\int_c^d f(x, y)\,dy\right\}dx = \int_a^b\int_c^d f(x, y)\,dy\,dx = \int_a^b dx\int_c^d f(x, y)\,dy \tag{18}$$

for iterated integrals (where $a$, $b$, $c$, $d$ are constants, say), inverting the order of integration in (17) gives

$$\begin{aligned}
L\left\{\int_0^{t} f(\tau)\,g(t - \tau)\,d\tau\right\} &= \int_0^{\infty}\int_{\tau}^{\infty} f(\tau)\,g(t - \tau)\,e^{-st}\,dt\,d\tau \\
&= \int_0^{\infty} f(\tau)\,d\tau\int_{\tau}^{\infty} g(t - \tau)\,e^{-st}\,dt \\
&= \int_0^{\infty} f(\tau)\,d\tau\int_0^{\infty} g(\mu)\,e^{-s(\mu + \tau)}\,d\mu \\
&= \int_0^{\infty} f(\tau)\,e^{-s\tau}\,d\tau\int_0^{\infty} g(\mu)\,e^{-s\mu}\,d\mu.
\end{aligned} \tag{19}$$

The last product is simply $F(s)$ times $G(s)$, so the theorem is proved. ■

The integral on the right side of (15) is called the Laplace convolution of $f$ and $g$ and is denoted as $f * g$. It too is a function of $t$:

$$(f * g)(t) = \int_0^{t} f(\tau)\,g(t - \tau)\,d\tau. \tag{20}$$

CAUTION: Be sure to see that the inverse of the product, $L^{-1}\{F(s)G(s)\}$, is not simply the algebraic product of the inverses, $f(t)g(t)$; rather, it is (according to Theorem 5.3.4) their convolution, $(f * g)(t)$.

EXAMPLE 3. In Example 2 we inverted $F(s) = 3/(s^2 + 3s - 10)$ in two different ways. Let us now obtain the inverse by still another method, the convolution theorem:

$$L^{-1}\left\{\frac{3}{s^2 + 3s - 10}\right\} = 3L^{-1}\left\{\frac{1}{s-2}\,\frac{1}{s+5}\right\} = 3\,L^{-1}\left\{\frac{1}{s-2}\right\} * L^{-1}\left\{\frac{1}{s+5}\right\} = 3\,e^{2t} * e^{-5t} = 3\int_0^{t} e^{2\tau}e^{-5(t - \tau)}\,d\tau = \frac{3}{7}\left(e^{2t} - e^{-5t}\right), \tag{21}$$

which is the same result as obtained in Example 2. ■

Observe that in equation (15) it surely doesn't matter if we write $F(s)G(s)$ or $G(s)F(s)$ because ordinary multiplication is commutative. Yet it is not clear that the results are the same, $\int_0^{t} f(\tau)\,g(t - \tau)\,d\tau$ in one case and $\int_0^{t} g(\tau)\,f(t - \tau)\,d\tau$ in the other. Nonetheless, these results are indeed the same, proof of which claim is left as an exercise. In fact, although the convolution is not an ordinary product it does share several of the properties of ordinary multiplication:

$$f * g = g * f, \quad\text{(commutative)} \tag{22a}$$
$$f * (g * h) = (f * g) * h, \quad\text{(associative)} \tag{22b}$$
$$f * (g + h) = f * g + f * h, \quad\text{(distributive)} \tag{22c}$$
$$f * 0 = 0. \tag{22d}$$

Closure.
The properties studied in this section (linearity, the transform of a derivative, and the convolution theorem) should be thoroughly understood. All are used in the next section, where we use the Laplace transform method to solve differential equations. The convolution property, in particular, should be studied carefully. The convolution theorem is useful in both directions. If we have a transform $H(s)$ that is difficult to invert, it may be possible to factor $H$ as $F(s)G(s)$, where $F$ and $G$ are more easily inverted. If so, then $h(t)$ is given, according to (15), as the convolution of $f(t)$ and $g(t)$. Furthermore, we may need to find the transform of an integral that is in convolution form. If so, then the transform is given easily by (16). Finally, we mention that convolution integrals arise in hereditary systems, systems whose behavior at time $t$ depends not only on the state of the system at that instant but also on its past history. Examples occur in the study of viscoelasticity and population dynamics.

EXERCISES 5.3

1. Find the inverse of the given transform two different ways: using partial fractions and using the convolution theorem. Cite any entries used from Appendix C.
(a) $3/[s(s+8)]$
(b) $1/(3s^2 + 5s - 2)$
(c) $1/(s^2 - a^2)$
(d) $5/[(s+1)(3s+2)]$
(e) $1/(s^2 + s)$
(f) $2/(2s^2 - s - 1)$

2. (a)-(f) Find the inverse of the corresponding transform in Exercise 1 using computer software.

3. Use entry 9 in Appendix C to evaluate the inverse of each. If necessary, use entry 10 as well. NOTE: See the Comment in Example 2.
(a) [illegible]
(b) $1/(s^2 - 3s + 3)$
(c) $1/(s^2 - s)$
(d) $1/(s^2 - s - 2)$
(e) $s/(s^2 - 2s + 2)$
(f) $(s+1)/(s^2 + 4s + 6)$
(g) $(s+1)/(s^2 - s)$
(h) $(2s-1)/(s^2 - 6s + 5)$

4. Use (8) together with mathematical induction to verify the general formula
$$\mathcal{L}\{f^{(n)}\} = s^n\,\mathcal{L}\{f\} - s^{n-1} f(0) - s^{n-2} f'(0) - \cdots - f^{(n-1)}(0), \tag{4.1}$$
which is valid if $f^{(n)}, f^{(n-1)}, \ldots, f'$, and $f$ satisfy the conditions of Theorem 5.3.3.

5. Prove
(a) equation (22a)
(b) equation (22b)
(c) equation (22c)
(d) equation (22d)

6. Prove that $\mathcal{L}\{f * g * h\} = F(s)G(s)H(s)$ or, equivalently, that $\mathcal{L}^{-1}\{F(s)G(s)H(s)\} = f * g * h$. NOTE: Does $f * g * h$ mean $(f * g) * h$ or $f * (g * h)$? According to the associative property (22b) it doesn't matter: they are equal.

7. To illustrate the result stated in Exercise 6, find the inverse of $1/s^3$ as $\mathcal{L}^{-1}\left\{\frac{1}{s}\cdot\frac{1}{s}\cdot\frac{1}{s}\right\} = 1 * 1 * 1$, and show that the result agrees with that given directly in Appendix C.

8. Factoring
$$\frac{s}{(s^2+a^2)^2} = \frac{s}{s^2+a^2}\cdot\frac{1}{s^2+a^2},$$
it follows from the convolution theorem and entries 3 and 4 of Appendix C that
$$\mathcal{L}^{-1}\left\{\frac{s}{(s^2+a^2)^2}\right\} = \cos at * \frac{\sin at}{a}.$$
Evaluate this convolution and show that the result agrees with that given directly by entry 11.

9. Verify (8) and (13) directly, for each given $f(t)$, by working out the left- and right-hand sides and showing that they are equal. You may use the table in Appendix C to evaluate $\mathcal{L}\{f''(t)\}$, $\mathcal{L}\{f'(t)\}$, and $\mathcal{L}\{f(t)\}$.
(a) [illegible]
(b) [illegible]
(c) $t^2 + 5t - 1$
(d) $\sinh 4t$
(e) $\cosh 3t + 5t$
(f) [illegible]

10. Evaluate the transform of each:
(a) [illegible]
(b) $\int_0^t \cos 3(t-\tau)\,d\tau$
(c) [illegible]
(d) $\int_0^t$ [illegible] $\cosh 3(t-\tau)\,d\tau$
(e) $\int_0^t$ [illegible] $\sinh 4(t-\tau)\,d\tau$
(f) [illegible]

11. We emphasized that $\mathcal{L}^{-1}\{F(s)G(s)\}$ equals the convolution of $f$ and $g$; in general, it does not merely equal the product $f(t)g(t)$. Show that $\mathcal{L}^{-1}\{F(s)G(s)\} \ne f(t)g(t)$ for each given pair of functions.
(a) $f(t) = t$, $g(t) =$ [illegible]
(b) $f(t) = \sin t$, $g(t) = 4$
(c) $f(t) = t$, $g(t) =$ [illegible]
(d) $f(t) = \cos t$, $g(t) = t + 6$

5.4 Application to the Solution of Differential Equations

[...] initial conditions at $t = 0$.

EXAMPLE 1. Consider the mechanical oscillator shown in Fig. 1, subjected to a constant force $F_0$, so that the displacement $x(t)$ satisfies the equation
$$m x'' + k x = f(t) = F_0. \tag{1}$$

[Figure 1. Mechanical oscillator.]

To apply the Laplace transform we multiply each term in (1) by the Laplace kernel $e^{-st}$ and integrate on $t$ from $0$ to $\infty$; we use $\mathcal{L}$ to denote that step:
$$\mathcal{L}\{m x'' + k x\} = \mathcal{L}\{F_0\}. \tag{2}$$
By the linearity of $\mathcal{L}$,
$$m\,\mathcal{L}\{x''(t)\} + k\,\mathcal{L}\{x(t)\} = \mathcal{L}\{F_0\}. \tag{3}$$
Writing $\mathcal{L}\{x(t)\} = X(s)$, using (13) of Section 5.3 for $\mathcal{L}\{x''(t)\}$, and noting that $\mathcal{L}\{F_0\} = F_0/s$, (3) becomes
$$m\left[s^2 X(s) - s\,x(0) - x'(0)\right] + k X(s) = \frac{F_0}{s}, \tag{4}$$
which is a linear algebraic equation that can be solved for $X(s)$. Doing so gives
$$X(s) = \frac{s\,x(0) + x'(0)}{s^2+\omega^2} + \frac{F_0}{m\,s\,(s^2+\omega^2)}, \tag{5}$$
where $\omega = \sqrt{k/m}$ is the natural frequency. With the solving for $X(s)$ completed, we now invert (5) to obtain $x(t)$:
$$x(t) = \mathcal{L}^{-1}\left\{\frac{s\,x(0) + x'(0)}{s^2+\omega^2} + \frac{F_0}{m\,s\,(s^2+\omega^2)}\right\} = x(0)\,\mathcal{L}^{-1}\left\{\frac{s}{s^2+\omega^2}\right\} + x'(0)\,\mathcal{L}^{-1}\left\{\frac{1}{s^2+\omega^2}\right\} + \frac{F_0}{m}\,\mathcal{L}^{-1}\left\{\frac{1}{s\,(s^2+\omega^2)}\right\}, \tag{6}$$
where the second equality follows from the linearity of the $\mathcal{L}^{-1}$ operator (Theorem 5.3.2). Appendix C gives
$$\mathcal{L}^{-1}\left\{\frac{s}{s^2+\omega^2}\right\} = \cos\omega t \quad\text{and}\quad \mathcal{L}^{-1}\left\{\frac{1}{s^2+\omega^2}\right\} = \frac{\sin\omega t}{\omega}, \tag{7}$$
but the third inverse in (6) is not found in the table. We could evaluate it with the help of partial fractions, but it is easier to use the convolution theorem:
$$\mathcal{L}^{-1}\left\{\frac{1}{s\,(s^2+\omega^2)}\right\} = \mathcal{L}^{-1}\left\{\frac{1}{s}\right\} * \mathcal{L}^{-1}\left\{\frac{1}{s^2+\omega^2}\right\} = 1 * \frac{\sin\omega t}{\omega} = \int_0^t (1)\,\frac{\sin\omega(t-\tau)}{\omega}\,d\tau = \frac{1-\cos\omega t}{\omega^2}, \tag{8}$$
so (6), (7), and (8) give the desired particular solution as
$$x(t) = x(0)\cos\omega t + \frac{x'(0)}{\omega}\sin\omega t + \frac{F_0}{k}\left(1 - \cos\omega t\right). \tag{9}$$
For instance, if $x(0) = x'(0) = 0$, then $x(t) = (F_0/k)(1 - \cos\omega t)$, as depicted in Fig. 2. Does it seem correct that the constant force $F_0$ should cause an oscillation? Yes, for imagine rotating the apparatus 90° so that the mass hangs down. Then we can think of $F_0$ as the downward gravitational force on $m$. In static equilibrium, the mass will hang down an amount $x = F_0/k$. If we release it from $x = 0$, it will fall and then oscillate about the equilibrium position $x = F_0/k$, as shown in Fig. 2.

[Figure 2. Release from rest.]

COMMENT 1. Recall that $f * g = g * f$, so we can write the convolution integral either as $\int_0^t f(\tau)\,g(t-\tau)\,d\tau$ or as $\int_0^t g(\tau)\,f(t-\tau)\,d\tau$; that is, we can let the argument of $f$ be $\tau$ and the argument of $g$ be $t - \tau$, or vice versa, whichever we choose. In (8) we chose the $\tau$ argument for $1$ and the $t - \tau$ argument for $(\sin\omega t)/\omega$. (Of course, if we change all the $t$'s in $1$ to $\tau$'s we still have $1$ because there are no $t$'s in $1$.) Alternatively, we could have expressed the inverse in (8) as
$$\frac{\sin\omega t}{\omega} * 1 = \int_0^t \frac{\sin\omega\tau}{\omega}\,(1)\,d\tau = \frac{1 - \cos\omega t}{\omega^2},$$
as obtained in (8).

COMMENT 2. Observe that (9) is the particular solution satisfying the initial conditions $x = x(0)$ and $x' = x'(0)$ at $t = 0$. If those quantities are not prescribed, we can replace them by arbitrary constants, and then (9) amounts to the general solution of $m x'' + k x = F_0$.
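As a numerical cross-check of the closed-form response (9), the following minimal sketch integrates $m x'' + k x = F_0$ with a classical fourth-order Runge-Kutta scheme and compares the result with (9). The function names, the parameter values, and the step count are illustrative assumptions, not part of the text's method.

```python
import math

def closed_form(t, m, k, F0, x0, v0):
    """Equation (9): x(0) cos(wt) + (x'(0)/w) sin(wt) + (F0/k)(1 - cos(wt))."""
    w = math.sqrt(k / m)  # natural frequency
    return (x0 * math.cos(w * t)
            + (v0 / w) * math.sin(w * t)
            + (F0 / k) * (1.0 - math.cos(w * t)))

def rk4_displacement(t_end, m, k, F0, x0, v0, steps=4000):
    """Integrate m x'' + k x = F0 from 0 to t_end as the system x' = v, v' = (F0 - k x)/m."""
    def accel(x):
        return (F0 - k * x) / m

    h = t_end / steps
    x, v = x0, v0
    for _ in range(steps):
        k1x, k1v = v, accel(x)
        k2x, k2v = v + 0.5 * h * k1v, accel(x + 0.5 * h * k1x)
        k3x, k3v = v + 0.5 * h * k2v, accel(x + 0.5 * h * k2x)
        k4x, k4v = v + h * k3v, accel(x + h * k3x)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return x

if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    args = dict(m=2.0, k=8.0, F0=4.0, x0=0.5, v0=-1.0)
    for t in (0.5, 1.0, 3.0):
        print(t, rk4_displacement(t, **args), closed_form(t, **args))
```

The two printed columns should agree closely at each time, which is consistent with (9) satisfying both the differential equation and the initial conditions.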
Thus, the method gives either a particular solution or a general solution, whichever is desired.

COMMENT 3. If, instead of the specific forcing function $f(t) = F_0$, we allow $f(t)$ to be an unspecified function, then we have, in place of (5),
$$X(s) = \frac{s\,x(0) + x'(0)}{s^2+\omega^2} + \frac{F(s)}{m\,(s^2+\omega^2)} \tag{10}$$
and, in place of (9),
$$x(t) = x(0)\cos\omega t + \frac{x'(0)}{\omega}\sin\omega t + \frac{1}{m}\,\mathcal{L}^{-1}\left\{\frac{F(s)}{s^2+\omega^2}\right\}. \tag{11}$$
Using the convolution theorem to write
$$\mathcal{L}^{-1}\left\{F(s)\,\frac{1}{s^2+\omega^2}\right\} = \mathcal{L}^{-1}\{F(s)\} * \mathcal{L}^{-1}\left\{\frac{1}{s^2+\omega^2}\right\} = f(t) * \frac{\sin\omega t}{\omega} \tag{12}$$
gives
$$x(t) = x(0)\cos\omega t + \frac{x'(0)}{\omega}\sin\omega t + \frac{1}{m\omega}\int_0^t \sin\omega\tau\; f(t-\tau)\,d\tau \tag{13}$$
as the solution.

With Example 1 completed, there are several observations that can be made about the method. First, consider the general second-order equation
$$x'' + a x' + b x = f(t), \tag{14}$$
where $a, b$ are constants, although the following discussion applies to higher-order equations as well. If we solve (14) by the methods of Chapter 3, then we need both homogeneous and particular solutions. To find the homogeneous solution we need to factor the characteristic polynomial $\lambda^2 + a\lambda + b$ or, equivalently, to find the roots of the characteristic equation $\lambda^2 + a\lambda + b = 0$. Solving (14) by the Laplace transform instead, we obtain, and need to invert,
$$X(s) = \frac{(s+a)\,x(0) + x'(0)}{s^2 + as + b} + \frac{F(s)}{s^2 + as + b}. \tag{15}$$
Whether we invert these terms by partial fractions or by some other method, their inversion depends, essentially, on our being able to factor the $s^2 + as + b$ denominator. That polynomial is none other than the characteristic polynomial corresponding to (14). Thus, whether we solve (14) by seeking exponential solutions for the homogeneous equation and then seeking a particular solution, or by using the Laplace transform, we need to face up to the same task, finding the roots of the characteristic equation.

Second, observe that if we invert the $F(s)/(s^2 + as + b)$ term in (15) by the convolution theorem, then we convolve the inverse of $F(s)$, namely, $f(t)$, with the inverse of $1/(s^2 + as + b)$.
Therefore, if we use the convolution theorem, then there is no need to evaluate the transform $F(s)$ of $f(t)$ when transforming the given differential equation.

Third, observe how the initial conditions become "built in" when we take the transform of the differential equation. Thus, there is no need to apply them at the end.

Fourth, recall that Laplace transforms come with restrictions on $s$. For instance, $\mathcal{L}\{1\} = 1/s$ for $s > 0$. However, such restrictions in no way impede the solution steps in using the Laplace transform method, and once we invert $X(s)$ to obtain $x(t)$ they are no longer relevant.

Fifth, we need to realize that when we apply the Laplace transform method to a differential equation, we take the transform of the unknown and one or more of its derivatives, but since we don't yet know the solution we don't yet know whether or not these functions are transformable. The procedure, then, is to assume that they are transformable in order to proceed, and to verify that they are once the solution is in hand.

Finally, and most important, understand that the power of the Laplace transform, in solving linear constant-coefficient differential equations, is in its ability to convert such an equation to a linear algebraic equation on $X(s)$, which ability flows from the fact that the transform of $f'(t)$ is merely a multiple of $F(s)$ plus a constant (and therefore similarly for $f'', f''', \ldots$). Indeed, the transform $\mathcal{L}\{f(t)\} = \int_0^\infty f(t)\,e^{-st}\,dt$ was designed so as to have this property. That is, the "kernel" $e^{-st}$ was designed so as to imbue the transform with that property.

EXAMPLE 2. Solve the initial-value problem
$$y'''' - y = 0; \qquad y(0) = 1,\quad y'(0) = y''(0) = y'''(0) = 0 \tag{16}$$
for $y(x)$. That the independent and dependent variables are $x, y$, rather than $t, x$, is immaterial to the application of the Laplace transform; the transform of $y(x)$ is now $Y(s) = \int_0^\infty y(x)\,e^{-sx}\,dx$. Taking the transform of (16) gives
$$\left[s^4 Y(s) - s^3 y(0) - s^2 y'(0) - s\,y''(0) - y'''(0)\right] - Y(s) = 0. \tag{17}$$
Putting the initial conditions into (17), and solving for $Y(s)$, gives
$$Y(s) = \frac{s^3}{s^4 - 1}. \tag{18}$$
To invert the latter, we can use partial fractions:
$$\frac{s^3}{s^4-1} = \frac{A}{s+1} + \frac{B}{s-1} + \frac{C}{s+i} + \frac{D}{s-i} = \frac{(s-1)(s^2+1)A + (s+1)(s^2+1)B + (s-i)(s^2-1)C + (s+i)(s^2-1)D}{s^4-1}. \tag{19}$$
Equating coefficients of like powers of $s$ in the numerators gives the linear equations
$$s^3:\quad 1 = A + B + C + D,$$
$$s^2:\quad 0 = -A + B - iC + iD,$$
$$s:\quad 0 = A + B - C - D,$$
$$1:\quad 0 = -A + B + iC - iD,$$
solution of which (for instance by Gauss elimination) gives $A = B = C = D = 1/4$. Thus,
$$y(x) = \tfrac{1}{4}e^{-x} + \tfrac{1}{4}e^{x} + \tfrac{1}{4}e^{-ix} + \tfrac{1}{4}e^{ix} = \tfrac{1}{2}\left(\cosh x + \cos x\right) \tag{20}$$
is the desired particular solution.

EXAMPLE 3. Solve the first-order initial-value problem
$$x' + p x = q(t); \qquad x(0) = x_0 \tag{21}$$
for $x(t)$, where $p$ is a constant and $q(t)$ is any prescribed forcing function. Application of the Laplace transform gives
$$X(s) = \frac{x_0}{s+p} + \frac{Q(s)}{s+p} \tag{22}$$
and hence the particular solution
$$x(t) = x_0\,e^{-pt} + \int_0^t e^{-p(t-\tau)}\,q(\tau)\,d\tau = e^{-pt}\left[x_0 + \int_0^t q(\tau)\,e^{p\tau}\,d\tau\right]. \tag{23}$$

COMMENT. Alternatively, let us begin by integrating the differential equation on $t$, from $0$ to $t$:
$$x(t) - x(0) + p\int_0^t x(\tau)\,d\tau = \int_0^t q(\tau)\,d\tau \tag{24}$$
or, since $x(0) = x_0$,
$$x(t) + p\int_0^t x(\tau)\,d\tau = x_0 + \int_0^t q(\tau)\,d\tau. \tag{25}$$
Of course, (25) is not the solution of the differential equation because the unknown $x(t)$ is under the integral sign. Thus, (25) is an example of an integral equation. Although we will not study integral equations systematically in this text, it will be useful to at least introduce them. Observe, first, that (25) is equivalent to both the differential equation and the initial condition, for they led to (25); conversely, the derivative of (25) gives back $x' + px = q(t)$ [if $q(t)$ is continuous], and putting $t = 0$ in (25) gives back $x(0) = x_0$.
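The equivalence between the solution (23) and the integral equation (25) can be checked numerically. The sketch below evaluates (23) by quadrature and reports the residual of (25), which should be near zero for any $t$ and exactly zero at $t = 0$. The composite Simpson rule, the function names, and the choice of forcing function are illustrative assumptions rather than anything prescribed by the text.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b]; n must be even."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def x_solution(t, p, x0, q):
    """Equation (23): x0 e^{-pt} + integral_0^t e^{-p(t - tau)} q(tau) dtau."""
    conv = simpson(lambda tau: math.exp(-p * (t - tau)) * q(tau), 0.0, t)
    return x0 * math.exp(-p * t) + conv

def integral_equation_residual(t, p, x0, q):
    """Left side of (25) minus right side of (25), using x(t) from (23)."""
    lhs = x_solution(t, p, x0, q) + p * simpson(lambda tau: x_solution(tau, p, x0, q), 0.0, t)
    rhs = x0 + simpson(q, 0.0, t)
    return lhs - rhs

if __name__ == "__main__":
    # Hypothetical data: p = 0.7, x0 = 2, q(t) = cos t.
    for t in (0.0, 0.5, 1.5):
        print(t, integral_equation_residual(t, 0.7, 2.0, math.cos))
```

A residual of zero at $t = 0$ reflects the fact that (25) reduces there to $x(0) = x_0$.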
That is, unlike the differential equation, the integral equation version has the initial condition "built in." Further, we can solve (25) by the Laplace transform conveniently because each integral is of convolution type: the first is $1 * x(t)$, and the second is $1 * q(t)$. Thus, taking a Laplace transform of (25), and noting that $\mathcal{L}\{1 * x(t)\} = \mathcal{L}\{1\}\,\mathcal{L}\{x(t)\} = (1/s)X(s)$ and $\mathcal{L}\{1 * q(t)\} = (1/s)Q(s)$, gives
$$X(s) + \frac{p}{s}X(s) = \frac{x_0}{s} + \frac{1}{s}Q(s), \tag{26}$$
which, once again, gives (22) and hence the solution (23).

Closure. In this section we describe the application of the Laplace transform to the solution of linear differential equations with constant coefficients, homogeneous or nonhomogeneous. In a sense, the method is of comparable difficulty to the solution methods studied in Chapter 3 in that one still needs to be able to factor the characteristic polynomial, which can be difficult if the equation is of high order. However, the Laplace transform method has a number of advantages. First, the method reduces a linear differential equation to a linear algebraic equation. Second, the hardest part, namely the inversion of the transform of the unknown, can often be accomplished with the help of tables or computer software, as well as with several additional theorems that are given in the final section of this chapter. Third, any initial conditions that are given become built in, in the process of taking the transform of the differential equation, so they do not need to be applied separately, at the end, as they were in Chapter 3.

We also saw, in the final example, that the Laplace transform is convenient to use in solving integral equations (equations in which the unknown function appears under an integral sign), provided that the integrals therein are Laplace convolution integrals; additional discussion of this idea is left for the exercises. In fact, it might be noted that the Laplace transform itself, $F(s) = \int_0^\infty f(t)\,e^{-st}\,dt$, is really an integral equation for $f(t)$ if $F(s)$ is known.
Although that integral equation was studied by Laplace, it was Simeon-Denis Poisson (1781-1840) who discovered the solution
$$f(t) = \frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty} F(s)\,e^{st}\,ds,$$
namely, the Laplace inversion formula. Poisson was one of the great nineteenth century analysts and a professor at the Ecole Polytechnique. Also left for the exercises is discussion of the application of the method to a limited class of nonconstant-coefficient differential equations.

EXERCISES 5.4

1. [...]
(a) $x' + 2x =$ [illegible]
(b) $3x' + x =$ [illegible]
(c) [illegible]
(d) [illegible]
(e) $x'' + 5x' = 10$; [conditions illegible]
(f) [illegible]
(g) $x'' - 3x' + 2x = 0$; $x(0) = 3$, $x'(0) = 1$
(h) $x'' - 4x' - 5x = 2 + e^{-t}$; $x(0) = x'(0) = 0$
(i) $x'' - x' - 12x = t$; $x(0) = -1$, $x'(0) = 0$
(j) $x'' + 6x' + 9x = 1$; $x(0) = 0$, $x'(0) = -2$
(k) $x'' - 2x' + 2x = -2t$; $x(0) = 0$, $x'(0) = -5$
(l) $x'' - 2x' + 3x = 5$; $x(0) = 1$, $x'(0) = -1$
(m) $x''' - x'' + 2x' = t^2$; [conditions illegible]
(n) $x''' + x'' - 2x' = 1 + e^{t}$; [conditions illegible]
(o) $x''' + 5x'' = t^4$; [conditions illegible]
(p) [illegible]
(q) $x''' - x'' - x' + x = 0$; $x(0) = 2$, $x'(0) = x''(0) = 0$
(r) [illegible]
(s) $x'''' + 3x''' = 0$; $x(0) = x'(0) = 0$, $x''(0) = x'''(0) = 3$
(t) $x'''' + 8x'' + 16x = 4$; [conditions illegible]
(u) $x'''' - x = 1$; $x(0) = x'(0) = x''(0) = 0$, [remaining condition illegible]
(v) $x'''' - x = 4$; $x(0) = x'(0) = x''(0) = 1$, $x'''(0) = 4$
(w) $x'''' - 16x = -32$; $x(0) = 0$, $x'(0) = 2$, [remaining conditions illegible]

2. (a) Show that for a constant-coefficient linear homogeneous differential equation of order $n$, the Laplace transform $X(s)$ of the solution $x(t)$ is necessarily of the form
$$X(s) = \frac{P(s)}{Q(s)}, \tag{2.1}$$
where $Q(s)$ and $P(s)$ are polynomials in $s$, with $Q$ of degree $n$ and $P$ of degree less than $n$.
(b) Show that if $Q(s) = 0$ has $n$ distinct roots $r_1, \ldots, r_n$, then
$$x(t) = \sum_{k=1}^{n} \frac{P(r_k)}{Q'(r_k)}\,e^{r_k t}. \tag{2.2}$$

3. Our purpose, in this exercise, is to follow up on Example 3 in showing a connection between differential equations and integral equations, and in considering the solution of certain integral equations by the Laplace transform method.
(a) Convert the initial-value problem
$$m x'' + k x = f(t), \quad (0 \le t < \infty); \qquad x(0) = x_0,\quad x'(0) = x_0' \tag{3.1}$$
to an integral equation, as follows. Integrate the differential equation from $0$ to $t$ twice. Using the initial conditions, show that those steps give
$$m\,x(t) - m x_0 - m x_0'\,t + k\int_0^t\!\int_0^{t'} x(\tau)\,d\tau\,dt' = \int_0^t\!\int_0^{t'} f(\tau)\,d\tau\,dt'. \tag{3.2}$$
Show that, by interchanging the order of integration, the double integrals can be reduced to single integrals, so that the integral equation (3.2) can be simplified to the form
$$m\,x(t) - m x_0 - m x_0'\,t + k\int_0^t (t-\tau)\,x(\tau)\,d\tau = \int_0^t (t-\tau)\,f(\tau)\,d\tau. \tag{3.3}$$
(b) Taking a Laplace transform of (3.3), obtain
$$X(s) = \frac{s\,x_0 + x_0'}{s^2+\omega^2} + \frac{F(s)}{m\,(s^2+\omega^2)}, \qquad \left(\omega = \sqrt{k/m}\right)$$
which is the same as equation (10).

4. Convert the initial-value problem
$$m x'' + c x' + k x = f(t), \quad (0 \le t < \infty); \qquad x(0) = x_0,\quad x'(0) = x_0'$$
to an integral equation, analogous to (3.3) in Exercise 3. Then, solve that integral equation for $x(t)$ by using the Laplace transform.

5. (Variable-coefficient equation) Consider the problem
$$t x'' + x' + t x = 0, \quad (0 \le t < \infty); \qquad x(0) = 1,\quad x'(0) = 0, \tag{5.1}$$
where our special interest lies in seeing whether or not we can solve (5.1) by the Laplace transform method even though the differential equation has nonconstant coefficients.
(a) Take the Laplace transform of the differential equation. Note that the transforms of $t x''(t)$ and $t x(t)$,
$$\mathcal{L}\{t x''(t)\} = \int_0^\infty t\,x''\,e^{-st}\,dt, \qquad \mathcal{L}\{t x(t)\} = \int_0^\infty t\,x\,e^{-st}\,dt,$$
present a difficulty in that we cannot express them in terms of $X(s)$ the way we can express $\mathcal{L}\{x'(t)\} = sX(s) - x(0)$ and $\mathcal{L}\{x''(t)\} = s^2X(s) - s\,x(0) - x'(0)$. Nevertheless, these terms can be handled as follows. Observe that
$$\mathcal{L}\{t x''(t)\} = \int_0^\infty t\,x''\,e^{-st}\,dt = -\int_0^\infty x''\,\frac{d}{ds}\left(e^{-st}\right)dt = -\frac{d}{ds}\int_0^\infty x''\,e^{-st}\,dt = -\frac{d}{ds}\left[s^2X(s) - s\,x(0) - x'(0)\right] = -\frac{d}{ds}\left[s^2X(s) - s\right], \tag{5.2}$$
if we assume that the unknown $x(t)$ is sufficiently well behaved for the third equality (where we have interchanged the order of two limit processes, the $s$ differentiation and the $t$ integration) to be justified. Handling the $\mathcal{L}\{t x(t)\}$ term in the same way, show that application of the Laplace transform to (5.1) leads to the equation
$$(s^2+1)\frac{dX}{ds} + sX = 0 \tag{5.3}$$
on $X(s)$. Note that whereas the Laplace transform method reduces constant-coefficient differential equations to linear algebraic equations on $X(s)$, here the nonconstant coefficients result in the equation on $X(s)$ being itself a linear differential equation! However, it is a simple one. Solving (5.3), show that
$$X(s) = \frac{C}{\sqrt{s^2+1}}. \tag{5.4}$$
(b) From Appendix C, we find the inverse as $x(t) = C J_0(t)$, where $J_0$ is the Bessel function of the first kind, of order zero. Applying the initial condition once again gives $x(0) = 1 = C J_0(0) = C$, so $C = 1$, and the desired solution of (5.1) is $x(t) = J_0(t)$. Here, however, we ask you to proceed as though you don't know about Bessel functions. Specifically, re-express (5.4) as
$$X(s) = \frac{C}{s\sqrt{1 + (1/s^2)}} = \frac{C}{s}\left(1 - \frac{1}{2}\frac{1}{s^2} + \cdots\right), \tag{5.5}$$
where the last equality amounts to the Taylor expansion of $1/\sqrt{1+r}$ in the quantity $r$, about $r = 0$, where $r = 1/s^2$. Carry that expansion further; invert the resulting series term by term (assuming that that step is valid), and thus show that
$$x(t) = C\left[1 - \frac{t^2}{2^2} + \frac{t^4}{2^2 4^2} - \frac{t^6}{2^2 4^2 6^2} + \cdots\right].$$
Setting $x(0) = 1$ gives $C = 1$, and the result is that we have obtained the solution in power series form. Of course, that power series is the Taylor series of the Bessel function $J_0(t)$.
NOTE: Observe that rather than pulling an $s$ out of the square root in (5.5), and then expanding $1/\sqrt{1 + (1/s^2)}$ in powers of $1/s^2$, we could have expanded (5.4) directly in powers of $s$ as $X(s) = C\left(1 - \tfrac{1}{2}s^2 + \cdots\right)$. However, positive powers of $s$ are not invertible, so this form is of no use. [We will see, in Theorem 5.7.6, that to be invertible a transform must tend to zero as $s \to \infty$. Positive powers of $s$ do not satisfy this condition, but negative powers do.] Also, observe that the degree to which nonconstant-coefficient differential equations are harder than constant-coefficient ones can be glimpsed from the fact that coefficients proportional to $t$ cause the equation on $X(s)$ to be a first-order differential equation; coefficients proportional to $t^2$ will cause the equation on $X(s)$ to be a second-order differential equation, and so on.

6. It is found that the integral equation
$$C(T) = \int_0^\infty e^{-0.0744\,\nu^2/T^2}\,\rho(\nu)\,d\nu \tag{6.1}$$
is an approximate relation between the frequency spectrum $\rho(\nu)$ and the specific heat $C(T)$ of a crystal, where $T$ is the temperature. Solve for $\rho(\nu)$ if
(a) $C(T) = T$
(b) $C(T) = T e^{-1/T}$
HINT: By a suitable change of variables, the integral can be made to be a Laplace transform.

7. We have seen that two crucial properties of the Laplace transform are its linearity and the property that $\mathcal{L}\{f'(t)\} = sF(s) - f(0)$; that is, the transform of the derivative is of the simple form $\mathcal{L}\{f'(t)\} = aF(s) + b$. With these properties in mind, consider the general integral transform
$$F(s) = \int_c^d f(t)\,K(t,s)\,dt \tag{7.1}$$
[equation (1) in Section 5.1] from a "design" point of view: how to choose the limits $c, d$ and the kernel $K(t,s)$ to achieve these properties. Since $0 < t < \infty$, it is reasonable to choose $c = 0$ and $d = \infty$. Further, (7.1) automatically satisfies the linearity property $\mathcal{L}\{\alpha u(t) + \beta v(t)\} = \alpha\,\mathcal{L}\{u(t)\} + \beta\,\mathcal{L}\{v(t)\}$ because the right side of (7.1) is an integral and integrals satisfy the property of linearity. Thus, we simply ask you for a logical derivation of the choice $K(t,s) = e^{-st}$ so that $\mathcal{L}\{f'(t)\}$ is of the form $aF(s) + b$.

5.5 Discontinuous Forcing Functions; Heaviside Step Function

Although we show in Section 5.2 that a given function has a Laplace transform if it is piecewise continuous on $0 \le t \le A$ for every $A$ and of exponential order as $t \to \infty$, we have thus far avoided functions with discontinuities. In applications, however, systems are often subjected to discontinuous forcing functions. For instance, a circuit might be subjected to an applied voltage that is held constant at 12 volts for a minute and then shut off (i.e., reduced to zero for all subsequent time).

In this section we study systems with forcing functions that are discontinuous, although we still assume that they are piecewise continuous on $0 \le t \le A$ for every $A$ and of exponential order as $t \to \infty$, so that they are Laplace transformable.

We begin by defining the Heaviside step function* or unit step function (Fig.
1a),
$$H(t) = \begin{cases} 0, & t < 0 \\ 1, & t \ge 0 \end{cases} \tag{1}$$
which is a basic building block for our discussion. The value of $H(t)$ at $t = 0$ (i.e., at the jump discontinuity) is generally inconsequential in applications. We have chosen $H(0) = 1$ somewhat arbitrarily, and do not show the value of $H(t)$ at $t = 0$ in Fig. 1a to suggest that it is unimportant in this discussion.

[Figure 1. Unit step function: (a) $H(t)$; (b) $H(t-a)$.]

Since $H(t)$ is a unit step at $t = 0$, $H(t-a)$ is a unit step shifted to $t = a$, as shown in Fig. 1b. In fact, the step function is useful in building up more complicated cases. We begin with the rectangular pulse shown in Fig. 2. Denoting that function as $P(t; a, b)$, we have
$$P(t; a, b) = H(t-a) - H(t-b). \tag{2}$$

[Figure 2. Rectangular pulse: $H(t-a)$, $H(t-b)$, and $P(t; a, b)$.]

More generally, observe that any piecewise continuous function
$$f(t) = \begin{cases} f_1(t), & 0 < t < t_1 \\ f_2(t), & t_1 < t < t_2 \\ \;\;\vdots \\ f_n(t), & t_{n-1} < t < t_n \\ f_{n+1}(t), & t_n < t < \infty \end{cases} \tag{3}$$
defined on $0 < t < \infty$ (which is the interval of interest in Laplace transform applications) can be given by the single expression
$$f(t) = f_1(t)\,P(t; 0, t_1) + \cdots + f_n(t)\,P(t; t_{n-1}, t_n) + f_{n+1}(t)\,H(t - t_n). \tag{4}$$
In $0 < t < t_1$, for instance, each $P$ function in (4) is zero except for the first, which equals unity in that interval; also, $H(t - t_n)$ is zero there, so (4) gives $f(t) = f_1(t)$. Similarly, in $t_1 < t < t_2, \ldots,$ and $t_{n-1} < t < t_n$. In $t_n < t < \infty$, each $P$ function is zero and the $H(t - t_n)$ is unity, so (4) gives $f(t) = f_{n+1}(t)$ there.

*Oliver Heaviside (1850-1925), initially a telegraph and telephone engineer, is best known for his contributions to vector field theory and to the development of a systematic Laplace transform methodology for the solution of differential equations. Note the spelling: Heaviside, not Heavyside.

Note that (3) does not define $f(t)$ at the endpoints $0, t_1, \ldots, t_n$. The Laplace transform of $f$ will be the same no matter what those values are (assuming that they are finite) since the transform is an integral, an integral represents area, and there is no area under a finite number of points.
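The single-expression form (4) translates directly into code. The following minimal sketch defines the step function (1), the pulse (2), and a helper that assembles a piecewise function from its pieces and breakpoints. The helper name `single_expression` and the three-piece example function are hypothetical, chosen only for illustration.

```python
import math

def H(t):
    """Heaviside step function (1), with the convention H(0) = 1."""
    return 1.0 if t >= 0 else 0.0

def P(t, a, b):
    """Rectangular pulse (2): H(t - a) - H(t - b)."""
    return H(t - a) - H(t - b)

def single_expression(t, pieces, breaks):
    """Evaluate the form (4).

    pieces = [f_1, ..., f_{n+1}] are callables and breaks = [t_1, ..., t_n]
    are the increasing breakpoints. A piece is evaluated only where its
    window is nonzero, so a piece that is singular outside its own interval
    causes no trouble.
    """
    edges = [0.0] + list(breaks)
    total = 0.0
    for f, a, b in zip(pieces[:-1], edges[:-1], edges[1:]):
        window = P(t, a, b)
        if window:
            total += f(t) * window
    if H(t - edges[-1]):
        total += pieces[-1](t)
    return total

# Hypothetical example: t on 0 < t < 1, 2 on 1 < t < 3, e^{-t} on t > 3.
pieces = [lambda t: t, lambda t: 2.0, lambda t: math.exp(-t)]
breaks = [1.0, 3.0]

if __name__ == "__main__":
    for t in (0.5, 2.0, 4.0):
        print(t, single_expression(t, pieces, breaks))
```

With $H(0) = 1$ as in (1), each breakpoint is assigned to the piece on its right; as noted in the text, that choice has no effect on the Laplace transform.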
Thus, those values will be inconsequential; hence we don't even specify them in (3).

EXAMPLE 1. The function
$$f(t) = \begin{cases} 2 + t^2, & 0 < t < 2 \\ 6, & 2 < t < 3 \\ 2/(2t-5), & 3 < t < \infty \end{cases} \tag{5}$$
shown in Fig. 3, can be expressed, according to (4), as
$$f(t) = (2 + t^2)\left[H(t) - H(t-2)\right] + 6\left[H(t-2) - H(t-3)\right] + \frac{2}{2t-5}\,H(t-3). \tag{6}$$
Actually, since the interval is $0 < t < \infty$ we cannot distinguish between $H(t)$ and unity, so we could replace the $H(t)$ in the first term by $1$.

[Figure 3. $f(t)$ of Example 1.]

EXAMPLE 2. Ramp function. The function
$$f(t) = \begin{cases} 0, & 0 < t < a \\ t - a, & a < t < \infty \end{cases} \tag{7}$$
shown in Fig. 4 is called a ramp function and, according to (4), it can be expressed as $f(t) = (t-a)H(t-a)$.

[Figure 4. The ramp function of Example 2.]

Before considering applications, observe that
$$\mathcal{L}\{H(t-a)\} = \int_0^\infty H(t-a)\,e^{-st}\,dt = \int_a^\infty e^{-st}\,dt = \frac{e^{-as}}{s},$$
so the Laplace transform of $H(t-a)$ is
$$\mathcal{L}\{H(t-a)\} = \frac{e^{-as}}{s}. \tag{8}$$
Also important to us is the result
$$\mathcal{L}\{H(t-a)\,f(t-a)\} = e^{-as}F(s) \tag{9a}$$
or, equivalently,
$$\mathcal{L}^{-1}\{e^{-as}F(s)\} = H(t-a)\,f(t-a) \tag{9b}$$
for any (Laplace-transformable) function $f(t)$. Proof is as follows:
$$\mathcal{L}\{H(t-a)\,f(t-a)\} = \int_0^\infty H(t-a)\,f(t-a)\,e^{-st}\,dt = \int_a^\infty f(t-a)\,e^{-st}\,dt = \int_0^\infty f(\tau)\,e^{-s(\tau+a)}\,d\tau = e^{-as}\int_0^\infty f(\tau)\,e^{-s\tau}\,d\tau = e^{-as}F(s), \tag{10}$$
where the third equality follows from the change of variables $t - a = \tau$. In words, $H(t-a)f(t-a)$ is the function $f(t)$ delayed by a time interval $a$, as illustrated in Fig. 5 for the function $f(t) = \sin t$.

[Figure 5. Delay significance of $H(t-a)f(t-a)$, shown for $\sin t$ and $\sin(t-a)\,H(t-a)$.]

EXAMPLE 3. LC Circuit. We saw in Section 2.3 that the differential equation governing the charge $Q(t)$ on the capacitor in the circuit shown (Fig. 6) is $LQ'' + RQ' + (1/C)Q = E(t)$. Let $R = 0$ and let $E(t)$ be the rectangular pulse shown in Fig. 2, of magnitude $E_0$, with $a = 2$ and $b = 5$. If $Q(0) = Q_0$ and $Q'(0) = 0$, then we have the initial-value problem
$$LQ'' + \frac{1}{C}Q = E_0\left[H(t-2) - H(t-5)\right], \tag{11a}$$
$$Q(0) = Q_0, \qquad Q'(0) = 0 \tag{11b}$$
on $Q(t)$. [Since the current $i(t)$ is $dQ/dt$, $Q'(0) = 0$ means that $i(0) = 0$, so we can think of a switch being open until time $t = 0$, and closed at that instant.] We wish to solve (11) for $Q(t)$.
Be careful: we will need to distinguish the inductance $L$ from the Laplace transform $\mathcal{L}$ by the context. Taking a Laplace transform of (11a), and using (11b) and (8), gives
$$L\left(s^2\bar{Q}(s) - sQ_0\right) + \frac{1}{C}\bar{Q}(s) = E_0\,\frac{e^{-2s} - e^{-5s}}{s}, \tag{12}$$
so
$$\bar{Q}(s) = Q_0\,\frac{s}{s^2+\omega^2} + \frac{E_0}{L}\,\frac{e^{-2s} - e^{-5s}}{s\,(s^2+\omega^2)}, \tag{13}$$
where $\omega = 1/\sqrt{LC}$. [Generally, we use the notation $\mathcal{L}\{f(t)\} = F(s)$, but in $Q(t)$ the $Q$ is already capitalized, so we use $\mathcal{L}\{Q(t)\} = \bar{Q}(s)$ instead.]

[Figure 6. RLC circuit.]

To invert (13), we begin with
$$\mathcal{L}^{-1}\left\{\frac{1}{s\,(s^2+\omega^2)}\right\} = 1 * \frac{\sin\omega t}{\omega} = \int_0^t \frac{\sin\omega\tau}{\omega}\,d\tau = \frac{1 - \cos\omega t}{\omega^2}. \tag{14}$$
Then, using (14) and (9b) and $\mathcal{L}^{-1}\{s/(s^2+\omega^2)\} = \cos\omega t$ from Appendix C, we have
$$Q(t) = Q_0\cos\omega t + E_0C\left\{H(t-2)\left[1 - \cos\omega(t-2)\right] - H(t-5)\left[1 - \cos\omega(t-5)\right]\right\}, \tag{15}$$
which is shown in Fig. 7 for the representative case where $Q_0 = E_0 = L = C = 1$.

[Figure 7. $Q(t)$ given by (15).]

COMMENT 1. Most striking is the way the use of the Heaviside notation and the Laplace transform have enabled us to solve for $Q(t)$ on the entire $t$ domain ($0 < t < \infty$). In contrast, if we rely on the methods of Chapter 3 we need to break (11) into three separate problems:
$$0 < t < 2: \quad LQ'' + (1/C)Q = 0, \quad Q(0) = Q_0,\; Q'(0) = 0 \tag{16a}$$
$$2 < t < 5: \quad LQ'' + (1/C)Q = E_0, \quad Q(2) = ?,\; Q'(2) = ? \tag{16b}$$
$$5 < t < \infty: \quad LQ'' + (1/C)Q = 0, \quad Q(5) = ?,\; Q'(5) = ? \tag{16c}$$
First, we solve (16a) for $Q(t)$ on $0 < t < 2$. The final values from that solution, $Q(2)$ and $Q'(2)$, then serve as the initial conditions for the next problem, (16b). Then we solve for $Q(t)$ on $2 < t < 5$ and use the final values from that solution, $Q(5)$ and $Q'(5)$, as the initial conditions for the next problem, (16c). Clearly, this approach is more tedious than the Laplace transform approach that led to (15).

COMMENT 2. A fundamental question comes to mind: Does the discontinuous nature of the input $E(t)$ result in the output $Q(t)$ being discontinuous as well? We can see from the graph in Fig. 7 that the answer is no. The continuity of $Q(t)$ may be surprising from the solution form (15), because of the presence of the two Heaviside functions in (15).
However, the jump discontinuity implied by the $H(t-2)$ is eliminated by its $1 - \cos\omega(t-2)$ factor since the latter vanishes at $t = 2$. Similarly, the jump discontinuity that is implied by the $H(t-5)$ is eliminated by its $1 - \cos\omega(t-5)$ factor since the latter vanishes at $t = 5$.

To better understand how a discontinuous input can produce a continuous output, consider the following simplified situation, the equation
$$Q''(t) = H(t-a) \tag{17}$$
with discontinuous right-hand side. Integrating (17) gives
$$Q'(t) = (t-a)H(t-a) + A, \tag{18}$$
because the derivative of the right-hand side is indeed $H(t-a)$, as can be seen from Fig. 4. Integrating again, we obtain
$$Q(t) = \frac{(t-a)^2}{2}H(t-a) + At + B \tag{19}$$
and these results are shown in Fig. 8 (for the case where $A = B = 0$, say). The idea is that a differential equation is solved by a process which, essentially, involves integration, and integration is a smoothing process! For observe that whereas $Q''(t) = H(t-a)$ is discontinuous at $t = a$, $Q'(t) = (t-a)H(t-a)$ is continuous but with a "kink," and $Q(t) = (t-a)^2H(t-a)/2$ is continuous and smooth (differentiable) as well.

[Figure 8. The smoothing effect of integration: $H(t-a)$, $(t-a)H(t-a)$, and $\tfrac{1}{2}(t-a)^2H(t-a)$.]

EXAMPLE 4. RC Circuit. In Example 3 we took $R = 0$ in the circuit shown in Fig. 6, and considered the resulting $LC$ circuit. Here, let us take $L = 0$ instead, and consider the resulting $RC$ circuit, governed by the first-order equation $RQ' + (1/C)Q = E(t)$. Further, let $Q(0) = 0$ and let $E(t) = 50t$ on $0 < t < 2$ and $E(t) = 40$ on $2 < t < \infty$ (sketch it). According to (4) then, $E(t) = 50t\left[1 - H(t-2)\right] + 40H(t-2)$. Let $R = C = 1$, for simplicity. Then the initial-value problem on $Q(t)$ is
$$Q' + Q = 50t + (40 - 50t)H(t-2), \tag{20a}$$
$$Q(0) = 0. \tag{20b}$$
Laplace transforming (20a), and using (20b),
$$s\bar{Q}(s) + \bar{Q}(s) = \frac{50}{s^2} + \mathcal{L}\{(40 - 50t)H(t-2)\}, \tag{21}$$
where
$$\mathcal{L}\{(40 - 50t)H(t-2)\} = \mathcal{L}\{\left[-60 - 50(t-2)\right]H(t-2)\} = -60\,\mathcal{L}\{H(t-2)\} - 50\,\mathcal{L}\{(t-2)H(t-2)\}$$
$$= -60\,\frac{e^{-2s}}{s} - 50\,e^{-2s}\,\mathcal{L}\{t\} = -60\,\frac{e^{-2s}}{s} - 50\,\frac{e^{-2s}}{s^2}. \tag{22}$$
Putting (22) into (21) and solving for $\bar{Q}(s)$ gives
$$\bar{Q}(s) = \frac{50}{s^2(s+1)} - 60\,\frac{e^{-2s}}{s(s+1)} - 50\,\frac{e^{-2s}}{s^2(s+1)}, \tag{23}$$
which we now need to invert. Taking one term at a time,
$$\mathcal{L}^{-1}\left\{\frac{1}{s(s+1)}\right\} = \mathcal{L}^{-1}\left\{\frac{1}{s}\right\} * \mathcal{L}^{-1}\left\{\frac{1}{s+1}\right\} = 1 * e^{-t} = \int_0^t e^{-\tau}\,d\tau = 1 - e^{-t}, \tag{24a}$$
$$\mathcal{L}^{-1}\left\{\frac{1}{s^2(s+1)}\right\} = t * e^{-t} = \int_0^t (t-\tau)\,e^{-\tau}\,d\tau = t - 1 + e^{-t}, \tag{24b}$$
so, with the help of (9b),
$$Q(t) = 50\left(t - 1 + e^{-t}\right) + H(t-2)\left(90 - 50t + 10\,e^{-(t-2)}\right).$$

EXERCISES 5.5

1. Use (4) to give a single expression for $f(t)$, and give a labeled sketch of its graph, as well. From that expression, evaluate the transform $F(s)$ of $f(t)$.
(a) $f(t) = t$ on $0 < t < 2$, $4 - t$ on $2 < t < 4$, and $0$ on $t > 4$
(b) $f(t) = e^{-t}$ on $0 < t < 1$, $0$ on $t > 1$
(c) $f(t) = 2$ on $0 < t < 5$, $-3$ on $5 < t < 7$, $1$ on $t > 7$
(d) $f(t) =$ [illegible] on $0 < t < 1$, $-6$ on $t > 1$
(e) $f(t) = 2 - t$ on $0 < t < 2$, $2t - 6$ on $2 < t < 5$, $t$ on $t > 5$
(f) $f(t) =$ [illegible] on $0 < t < 10$, [illegible] on $10 < t < 20$, $5t$ on $t > 20$
(g) $f(t) = \sin t$ on $0 < t < 5\pi$, $0$ on $t > 5\pi$
(h) $f(t) = \cos t$ on $0 < t < \pi$, $-1$ on $t > \pi$

2. Draw a labeled sketch of the graph of each function.
(a) [illegible]
(b) $H(t - 2\pi)\cos(t - 2\pi)$
(c) $(1 + t)H(t-2)$
(d) $(2 + t)\left[H(t-2) - H(t-3)\right]$
(e) $t\left[H(t-1) - H(t-2) + H(t-3)\right]$
(f) $t^2\left[2H(t-1) - H(t-3) - H(t-4)\right]$
(g) $\left[H(t - \pi/2) - H(t - \pi)\right]\sin t$
(h) $1 + H(t-1) + H(t-2) + H(t-3) + H(t-4)$
(i) $H(t-3)\left[H(t-2) - H(t-1)\right]$

3. Evaluate in terms of Heaviside functions. You may use these results for the definite and indefinite integrals of the Heaviside function:
$$\int_0^t H(\tau)\,d\tau = \begin{cases} 0, & t < 0 \\ t, & t \ge 0 \end{cases} = tH(t) \tag{3.1}$$
and
$$\int H(t)\,dt = tH(t) + \text{constant}. \tag{3.2}$$
(a) [illegible]
(b) $\int H(\tau - 2)\,d\tau$ [limits illegible]
(c) $\int \left[1 - H(\tau - 5)\right]d\tau$ [limits illegible]
(d) $\int \left[H(\tau - a) - H(\tau - b)\right]d\tau \quad (b > a)$ [limits illegible]
(e) $\int \left[H(\tau - 2) - H(\tau - 3)\right]d\tau$ [limits illegible]
(f) $\int H(\tau - 1)\,d\tau$ [limits illegible]
(g) $\int H(\tau - t)\,d\tau$ [limits illegible]
(h) [illegible] $* \,H(t-1)$
(i) $\sin t * \left[H(t-1) - H(t-2)\right]$
(j) [illegible] $* \,H(t-5)$
(k) $1 * H(t-1)$

4. (a)-(k) Evaluate the integral in the corresponding part of Exercise 3 using computer software such as the Maple int command.

5. Solve $x' - x = f(t)$, where $x(0) = 0$, by the methods of this section, where $f(t)$ is:
(a) $H(t-1)$
(b) $e^{-t}H(t-3)$
(c) $t$ on $0 < t < 2$, $2$ on $t > 2$
(d) $0$ on $0 < t < 5$, $10$ on $5 < t < 7$, $0$ on $t > 7$
(e) $0$ for $t \ne 5$, $100$ for $t = 5$
(f) [illegible] on $0 < t < 6$, $0$ on $t > 6$
(g) $0$ on $0 < t < 1$, $1$ on $1 < t < 2$, $2$ on $2 < t < 3$, $1$ on $3 < t < 4$, $0$ on $t > 4$

6. (a)-(j) Same as Exercise 5, but using computer software such as the Maple dsolve command.

7. (a)-(j) Same as Exercise 5, but for $x'' - x = f(t)$, $x(0) = x'(0) = 0$.

5.6 Impulsive Forcing Functions; Dirac Impulse Function (Optional)

Besides forcing functions that are discontinuous, we are interested in ones that are impulsive, that is, sharply focused in time. For instance, consider the forced mechanical oscillator governed by the differential equation
$$m x'' + c x' + k x = f(t), \tag{1}$$
where $f(t)$ is the force applied to the mass. If the force is due to a hammer blow, for instance, initiated at time $t = 0$, then we expect $f(t)$ to be somewhat as sketched in Fig. 1a. However, we do not know the functional form of $f(t)$ corresponding to such an event as a hammer blow, so the problem that we pose is how to proceed with the solution of (1) without knowing $f$. Of course we can solve (1) in terms of $f$, but eventually we need to know $f$ to find the response $x(t)$.

[Figure 1. Impulsive force at $t = 0$: (a) actual force of duration $\epsilon$; (b) rectangular pulse of height $I/\epsilon$.]

In working with impulsive forces one normally tries to avoid dealing with the detailed shape of $f$ and tries to limit one's concern to a global quantity known as the impulse $I$ of $f$, the area under its graph. The idea is that if the duration $\epsilon$ of the blow (Fig. 1a) really is small, then the response $x(t)$, while sensitive to $I$, should be rather insensitive to the detailed shape of $f$. That is, if we vary the shape of $f$ but keep its area the same, then we expect little change in the response $x(t)$. This idea suggests that we replace the unknown $f$ by a simple rectangular pulse having the correct impulse, as shown in Fig. 1b: $f(t) = I/\epsilon$ for $0 < t < \epsilon$, and $0$ for $t > \epsilon$. With $f$ thus simplified we can proceed to solve for the response $x(t)$.
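To make the rectangular-pulse idealization concrete, the sketch below evaluates the response of an undamped oscillator to the pulse $f(t) = I/\epsilon$ on $0 < t < \epsilon$ for several pulse widths. It assumes, purely for illustration, $m = k = 1$, $c = 0$, and $x(0) = x'(0) = 0$ in (1); under those assumptions the step-function method of Section 5.5 gives $x(t) = (I/\epsilon)\{[1 - \cos t] - H(t-\epsilon)[1 - \cos(t-\epsilon)]\}$. The function name is hypothetical.

```python
import math

def pulse_response(t, impulse, eps):
    """Response of x'' + x = f(t), x(0) = x'(0) = 0, to f = impulse/eps on 0 < t < eps.

    Assumes m = k = 1 and c = 0 in (1); these values are illustrative only.
    """
    if t <= 0.0:
        return 0.0
    x = (impulse / eps) * (1.0 - math.cos(t))
    if t >= eps:
        # The pulse has been switched off: subtract the delayed step response.
        x -= (impulse / eps) * (1.0 - math.cos(t - eps))
    return x

if __name__ == "__main__":
    for eps in (0.5, 0.1, 0.02, 0.01):
        print(eps, pulse_response(1.0, 1.0, eps))
```

The printed values change less and less as the pulse width is reduced with the impulse held fixed, which illustrates how weakly the response depends on the pulse width once that width is small.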
But even so, the solution still depends upon $\epsilon$, and the latter is probably not known, just as the actual shape of $f$ is not known. Thus, we adopt one more idealization: we suppose that since $\epsilon$ is very small, we might as well take the limit of the solution as $\epsilon \to 0$, to eliminate $\epsilon$.

Figure 1. Impulsive force at $t = 0$.

Let us denote such a rectangular pulse having a unit impulse ($I = 1$) as $D(t;\epsilon)$:

$$D(t;\epsilon) = \begin{cases} 1/\epsilon, & 0 < t < \epsilon \\ 0, & t > \epsilon, \end{cases} \tag{2}$$

where we use $D$ (after the physicist P. A. M. Dirac, who developed the idea of impulsive forces in 1929). As $\epsilon \to 0$, $D$ becomes taller and narrower as shown in Fig. 2, in such a way as to maintain its unit area. Of course, the limit

$$\lim_{\epsilon \to 0} D(t;\epsilon) = \begin{cases} \infty, & t = 0 \\ 0, & t > 0 \end{cases} \tag{3}$$

does not exist, because $\infty$ is not an acceptable value, but Dirac showed that it is nevertheless useful to think of that limiting case as representing an idealized point unit impulse focused at $t = 0$.

Figure 2. Letting $\epsilon \to 0$ in (2).

To explain, we first prove that

$$\lim_{\epsilon \to 0} \int_0^\infty g(\tau)\, D(\tau;\epsilon)\, d\tau = g(0) \tag{4}$$

for any function $g$ that is continuous at the origin. To begin our proof, write

$$\lim_{\epsilon \to 0} \int_0^\infty g(\tau)\, D(\tau;\epsilon)\, d\tau = \lim_{\epsilon \to 0} \int_0^\epsilon g(\tau)\, \frac{1}{\epsilon}\, d\tau. \tag{5}$$

Suppose that $g$ is continuous on $0 \le \tau \le b$ for some positive $b$. We can assume that $\epsilon < b$ because we are letting $\epsilon \to 0$. Thus, $g$ is continuous on the integration interval $0 \le \tau \le \epsilon$, so the mean value theorem of the integral calculus tells us that there is a point $\tau_1$ in $[0,\epsilon]$ (i.e., the closed interval $0 \le \tau \le \epsilon$) such that $\int_0^\epsilon g(\tau)\, d\tau = g(\tau_1)\,\epsilon$. Thus, (5) gives

$$\lim_{\epsilon \to 0} \int_0^\infty g(\tau)\, D(\tau;\epsilon)\, d\tau = \lim_{\epsilon \to 0} \frac{1}{\epsilon}\, g(\tau_1)\,\epsilon = \lim_{\epsilon \to 0} g(\tau_1) = g(0), \tag{6}$$

where the last equality holds since $\tau_1$ is in the interval $[0,\epsilon]$, and $\epsilon$ is going to zero. Finally, since $b$ is arbitrarily small, we only need the continuity of $g$ at $\tau = 0$. This completes our proof of (4).

For brevity, it is customary to dispense with calling attention to the $\epsilon$ limit and to express (4) as

$$\int_0^\infty g(\tau)\, \delta(\tau)\, d\tau = g(0), \tag{7}$$

where $\delta(\tau)$ is known as the Dirac delta function, or unit impulse function.
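The limit in (4) can be watched numerically. The following is a minimal sketch, not part of the text's development: the helper name `pulse_average` and the choice $g = \cos$ are illustrative assumptions, and the integral against $D(\tau;\epsilon)$ is approximated by the midpoint rule.

```python
import math

def pulse_average(g, eps, n=1000):
    """Approximate the integral of g(tau) * D(tau; eps) over 0 < tau < infinity.

    D(tau; eps) equals 1/eps on 0 < tau < eps and 0 for tau > eps, so the
    integral reduces to (1/eps) times the integral of g over [0, eps].
    The latter is estimated with the midpoint rule on n subintervals.
    """
    h = eps / n
    integral = h * sum(g((k + 0.5) * h) for k in range(n))
    return integral / eps

# Illustrative choice: g(tau) = cos(tau), which is continuous at 0 with g(0) = 1.
for eps in (1.0, 0.1, 0.01, 0.001):
    print(eps, pulse_average(math.cos, eps))
```

As $\epsilon$ shrinks, the printed values should approach $g(0) = 1$, which is the content of (4) and, in shorthand, of (7).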
We can think of $\delta(\tau)$ as being zero everywhere except at the origin and infinite at the origin, in such a way as to have unit area, but it must be noted that that definition is not satisfactory within the framework of ordinary function theory. To create a legitimate place for the delta function, one needs to extend the concept of function. That was done by L. Schwartz, and the result is known as the theory of distributions, but that theory is well beyond our present scope.

Let us illustrate the application of the delta function with an example.

EXAMPLE 1. Consider (1), with $m = k = 1$ and $c = 0$; let $f(t)$ correspond to a hammer blow as sketched in Fig. 1a, and let $x(0) = x'(0) = 0$, so that before the blow the mass is at rest. The solution of the problem

$$x'' + x = f(t), \qquad x(0) = x'(0) = 0 \tag{8}$$

is found, for instance, by using the Laplace transform, to be

$$x(t) = \int_0^t \sin(t-\tau)\, f(\tau)\, d\tau. \tag{9}$$

As outlined above, the idea is to replace $f(\tau)$ by a rectangular pulse $I D(\tau;\epsilon)$ having the same area $I$ as $f(\tau)$ and then to take the limit as $\epsilon \to 0$:

$$x(t) = \lim_{\epsilon \to 0} \int_0^t \sin(t-\tau)\, I D(\tau;\epsilon)\, d\tau = \lim_{\epsilon \to 0} \int_0^\epsilon \sin(t-\tau)\, \frac{I}{\epsilon}\, d\tau = I \lim_{\epsilon \to 0} \frac{\cos(t-\epsilon) - \cos t}{\epsilon} = I \sin t, \tag{10}$$

where the last equality follows from l'Hôpital's rule.

Alternatively and more simply, let $f(\tau) = I\delta(\tau)$ in (9), where the scale factor $I$ is needed since the delta function is a unit impulse whereas we want the impulse to be $I$. Then property (7) of the delta function gives

$$x(t) = \int_0^t \sin(t-\tau)\, I\delta(\tau)\, d\tau = I \sin(t-\tau)\Big|_{\tau=0} = I \sin t, \tag{11}$$

as obtained previously in (10). You may be concerned that we have applied (7) even though the upper integration limits in (7) and (9) are not the same. However, in (5) we see that the $\infty$ was immediately changed to $\epsilon$, and then we let $\epsilon$ tend to zero. Thus, (7) holds for any positive upper limit; we used $\infty$ just for definiteness. ■

Let us review the idea. Since, generally, we know neither the exact shape nor the duration of an impulsive forcing function $f$, we do two things to solve for the response.
We replace $f$ by an equivalent rectangular pulse (i.e., having the same impulse, or area, as $f$), solve for $x(t)$, and then we let the width of the pulse, $\epsilon$, tend to zero. Equivalently and more simply, we take $f$ to be a Dirac delta function and evaluate the resulting integral using the fundamental property (7) of the delta function. The latter procedure is more efficient because one no longer needs to take the limit of the integral as $\epsilon \to 0$; the limit was already carried out, once and for all, in our derivation of (4).

Since $\delta(t)$ is focused at $t = 0$, it follows that $\delta(t-a)$ is focused at $t = a$, and (7) generalizes to

$$\int_0^\infty g(\tau)\, \delta(\tau - a)\, d\tau = g(a). \tag{12}$$

Here we continue to use the $0, \infty$ limits, but it should be understood that the result is $g(a)$ for any limits $A, B$ (with $B > A$) such that the point of action of the delta function is contained within the interval of integration. If the point $\tau = a$ falls outside the interval, then the integral is zero. Thus, for reference, we give the following more complete result:*

$$\int_A^B g(\tau)\, \delta(\tau - a)\, d\tau = \begin{cases} g(a), & A \le a < B \\ 0, & a < A \text{ or } a \ge B. \end{cases} \tag{13}$$

EXAMPLE 2. RC Circuit. Recall from Section 5.5 that the charge $Q(t)$ on the capacitor of the RC circuit is governed by the differential equation $RQ' + (1/C)Q = E(t)$. Let $E(t)$ be an impulsive voltage, with impulse $I$ acting at $t = T$, and let $Q(0) = Q_0$. We wish to solve for $Q(t)$. Expressing $E(t) = I\delta(t-T)$, the initial-value problem is

$$Q' + \kappa Q = I\delta(t-T), \qquad Q(0) = Q_0, \tag{14}$$

where $\kappa = 1/(RC)$. Taking the Laplace transform of (14) gives $s\bar{Q} - Q_0 + \kappa\bar{Q} = I\,L\{\delta(t-T)\}$, so

$$\bar{Q} = \frac{Q_0}{s+\kappa} + \frac{I}{s+\kappa}\, L\{\delta(t-T)\}, \tag{15}$$

and

$$Q(t) = Q_0 e^{-\kappa t} + I e^{-\kappa t} * \delta(t-T) = Q_0 e^{-\kappa t} + I \int_0^t e^{-\kappa(t-\tau)}\, \delta(\tau - T)\, d\tau = Q_0 e^{-\kappa t} + \begin{cases} 0, & t < T \\ I e^{-\kappa(t-T)}, & t > T \end{cases} = Q_0 e^{-\kappa t} + I H(t-T)\, e^{-\kappa(t-T)}, \tag{16}$$

where the third equality follows from (13). ■

Observe that we do not need to know the transform of the delta function in Example 2; we merely call its transform $L\{\delta(t-T)\}$, and inversion by the convolution theorem gives us back the $\delta(t-T)$ that we started with.
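To see the idealization in Example 2 at work, one can compare the response to a narrow rectangular pulse $I D(t-T;\epsilon)$ with the delta-function result (16). The sketch below is illustrative only: the function names and the parameter values ($Q_0 = 1$, $\kappa = 2$, $I = 3$, $T = 1$) are assumptions chosen for the demonstration, and the convolution integral is evaluated by the midpoint rule.

```python
import math

def q_pulse(t, T, eps, I, kappa, Q0, n=2000):
    """Charge at time t when the forcing is the rectangular pulse I * D(t - T; eps).

    The forced part is the convolution integral of exp(-kappa * (t - tau))
    against the pulse, which is nonzero only for T < tau < min(t, T + eps).
    """
    lo, hi = T, min(t, T + eps)
    forced = 0.0
    if hi > lo:
        h = (hi - lo) / n
        total = sum(math.exp(-kappa * (t - (lo + (k + 0.5) * h))) for k in range(n))
        forced = (I / eps) * total * h
    return Q0 * math.exp(-kappa * t) + forced

def q_delta(t, T, I, kappa, Q0):
    """Charge at time t from formula (16), i.e., with delta-function forcing."""
    step = 1.0 if t > T else 0.0
    return Q0 * math.exp(-kappa * t) + I * step * math.exp(-kappa * (t - T))

# Hypothetical parameter values, chosen only for illustration.
Q0, kappa, I, T, t = 1.0, 2.0, 3.0, 1.0, 2.0
for eps in (0.1, 0.01, 0.001):
    print(eps, q_pulse(t, T, eps, I, kappa, Q0), q_delta(t, T, I, kappa, Q0))
```

The two columns should agree more and more closely as $\epsilon$ decreases, which is the sense in which the delta function stands in for any sufficiently short pulse of the same impulse.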
Nonetheless, for reference, let us work out its Laplace transform. According to (12),

$$L\{\delta(t-a)\} = \int_0^\infty \delta(t-a)\, e^{-st}\, dt = e^{-st}\Big|_{t=a} = e^{-as}. \tag{17}$$

In particular, $L\{\delta(t)\} = 1$.

*Following (12), we state that the result is $g(a)$ if the delta function acts within the integration interval. How then do we interpret the integral when $a$ is at an endpoint ($A$ or $B$)? We've met that case in equation (7). Since the $D(\tau;\epsilon)$ sequence (Fig. 2) is defined on $(0,\epsilon]$, the delta function acts essentially to the right of $\tau = 0$, hence within the interval of integration, and the result of the integration is $g(0)$. To be consistent, let us suppose that the $D$ sequence is always to the right of the point $\tau = a$. Then the integral in (13) will be $g(a)$ if $A \le a < B$ and $0$ if $a < A$ or $a \ge B$.

Since this section is about the Laplace transform, the independent variable has been the time $t$, so the delta function has represented actions that are focused in time. But the argument of the delta function need not be time. For instance, if $w(x)$ is the load distribution on a beam (Fig. 3a), in pounds per unit length, then $\delta(x-a)$ represents a point unit load (i.e., one pound) at $x = a$ (Fig. 3b).

Let us close this discussion with a comment on the delta function idealization from a modeling point of view. Consider a metal plate, extending over $-\infty < x < \infty$ and $0 < y < \infty$, loaded by pressing a metal coin against it, at the origin, with a force $P$ (Fig. 4a). If one is to determine (from the theory of elasticity) the stress distribution within the plate, one needs to know the load distribution $w(x)$ along the edge of the plate (namely, the $x$ axis). Because the coin will flatten slightly, at the point of contact, the load $w(x)$ will be distributed over a short interval, say from $x = -\epsilon$ to $x = \epsilon$. However, the function $w(x)$ is not known a priori and its determination is part of the problem.
Whether one needs to determine the exact $w(x)$ distribution, or whether it suffices to represent it simply as an idealized point force of magnitude $P$, $w(x) = P\delta(x)$, depends upon whether one is interested in the "near field" or the "far field." By the near field we mean that part of the plate within several $\epsilon$ lengths of the point of the load application, for instance, within the dashed semicircle shown in Fig. 4b. The far field is the region beyond. If we are concerned only with the far field, then it should suffice to use

$$w(x) = P\delta(x), \tag{18}$$

but if we are concerned with the near field then the approximation (18) will lead to large errors. A ball bearing manufacturer, for instance, is primarily interested in the near field induced by a loaded ball bearing, due to concern regarding wear and surface damage. Within the theory of elasticity, the insensitivity of the far field to the detailed shape of $w(x)$ [given that the area under the $w(x)$ graph is held fixed] is an example of Saint Venant's principle.

Figure 4. Delta function idealization.

Computer software. The Maple name for $\delta(t)$ is Dirac(t).

Closure. We introduce the delta function out of a need to deal effectively with impulsive forcing functions, functions that are highly focused in time or space. Often we know neither the precise form of such a function nor the precise interval of application. If that interval is short enough one can model the force as an idealized point force, represented mathematically as a delta function $\delta(t)$. One is not so much interested in the numerical values of $\delta(t)$ [indeed, one says that $\delta(0) = \infty$] as in the effect of integration upon a delta function, and that effect is expressed by (13), which we regard as the most important formula in this section.

EXERCISES 5.6

1. Solve for $x(t)$, on $0 \le t < \infty$.

is nonzero only at $t = 0$, so there is no difference between $f(t)\delta(t)$ and $f(0)\delta(t)$.
(a)a” ~ a = 6(f- 2); (0) = 2'(0) =0 (b)2” ~ de = 6d(t~ 1); 2(0) = 0, x'(0) = -3 (c) a” ~ 3a 4+2¢=2+d6(t-5); 2(0)=2'(0)=0 (a) (d)a”+a’ =14+6(t—2); 2(0)=0,(0)=3 (e)a" +2a" +e2=100(t-5); x (f) 20” —a’ = d(t- 1) -—d(t- 2); \ =2(0) =0 x(0)= 2’(0) =0 Ga" (O)= —4e" = 36(t- 1); — (0) = (2.5) the delta and Heaviside functions. Alternatively, we can write that relation as (3.1) 2(0) = 2/(0) =a “(0 )=0, 1 (k)a” ~— 5a" +42 = 6d(t- 2); 2!" 5(r)dr=H(t) 3. The result (2.5), above, reveals the close relation between (g)a” —3a’ + 2x = 1006(t- 3); 2(0) = 4, 2/(0)=0 (h)a’” = 26(¢-5); 2(0)= 2’(0) = x”(0) = 0 (i) a” + 32" 4+22’ = d(t-—5); 2(0) = 2'(0) = x"(0) =0 a" [ 0 (0) = x'(0) = x”(0) = The latter follows from (2.5) only in a formal sense, but is (I)cl”—«=d(t-1); 2(0) =2(0)=x"(0) = 2""(0) =0 2. Show that the delta function has these properties, where « is a nonzeroreal constant,and the function f(t) is continuous at the origin. NOTE: Recall that the delta function is defined by its integral behavior. Thus, by an equation such as quite useful, along with (2.2)-(2.5). a(t) = d(t) we meanthat d(~—t) i g(t)5(—t) dt=i —OO g(t)6(t)dt (2.1) OO for every function g(t) that is continuous at the origin. The rightsideof (2.1)is g(0), so to showthat6(—t)= d(t i), in part (a),you needto verify thatthe left side of (2.1) is g(0) too. (a) 5(—t) = d(t) (b) b(xt)=ig 6(t) («#0) @—fepsty=y TOPOFAO 0, f(0)=0 (2.2) For instance, suppose we wish to verify thatx(t)= H(t—1)sin initial-valueproblem2” +a2=6(t-1); Differentiatingw(t) gives = = = (t — 1) satisfiesthe 2«(0)= 2’(0) =0. A(t —1)cos(t—1) A’(t—1)sin(t—1)+ 6(t—1)sin(t —1) + H(t—1)cos(t—1) 0+ A(t—1)cos(t—1), (3.2) and H(t — 1)sin(t—- 1) H'(t—1)cos(t—1)— 6(t —1)cos(t —1) —H(t —1)sin(t —1) o(t —1) —H(t —1)sin(t - 1) (3.3) sox +x doesgive 6(t—1). In thesecondequality in (3.2)we used(3.1),and in thethird we used(2.4):6(¢— 1) sin (t —1) = 6(t —1)sinO = 0. 
In the second equality in (3.3) we used (3.1), and in the third we used (2.4): $\delta(t-1)\cos(t-1) = \delta(t-1)\cos 0 = \delta(t-1)$. Further, we see that $x(0) = 0$ and, from (3.2), that $x'(0) = 0$. Here is the problem: In the same manner as above, verify the following solutions that are given in the Answers to the Selected Exercises.
(a) exercise 1(a)
(b) exercise 1(d)
(c) exercise 1(g)

For instance, $(3t+2)\delta(t) = 2\delta(t)$, $(\sin t)\delta(t) = 0$, $(3t+2)\delta(t-1) = 5\delta(t-1)$, and $(t^2+t-2)\delta(t-1) = 0$. Formally, the first part of (2.4) makes sense as follows: $\delta(t)$

5.7 Additional Properties

In Section 5.3 we establish the linearity of the transform and its inverse, the transform of the derivative $f'(t)$, and the Laplace convolution theorem, results that we deem essential in applying the Laplace transform to the solution of differential equations. In this final section of Chapter 5 we present several additional useful properties of the Laplace transform.

THEOREM 5.7.1 s-Shift
If $L\{f(t)\} = F(s)$ exists for $s > s_0$, then for any real constant $a$,

$$L\{e^{-at} f(t)\} = F(s+a) \tag{1}$$

for $s + a > s_0$ or, equivalently,

$$L^{-1}\{F(s+a)\} = e^{-at} f(t). \tag{2}$$

Proof:

$$L\{e^{-at} f(t)\} = \int_0^\infty e^{-at} f(t)\, e^{-st}\, dt = \int_0^\infty f(t)\, e^{-(s+a)t}\, dt = F(s+a). \;■$$

EXAMPLE 1. Determine $L\{t^3 e^{[\ldots]}\}$. From Appendix C, $L\{t^3\} = 6/s^4$, so it follows from Theorem 5.7.1 that [...] (3) ■

EXAMPLE 2. [...] to note that

$$\frac{2s+1}{s^2+2s+4} = \frac{2(s+1)-1}{(s+1)^2+3},$$

so

$$L^{-1}\left\{\frac{2s+1}{s^2+2s+4}\right\} = 2L^{-1}\left\{\frac{s+1}{(s+1)^2+3}\right\} - L^{-1}\left\{\frac{1}{(s+1)^2+3}\right\} = 2e^{-t}\cos\sqrt{3}\,t - \frac{1}{\sqrt{3}}\, e^{-t}\sin\sqrt{3}\,t,$$

where in the last step we use entries 3 and 4 in Appendix C and Theorem 5.7.1. ■

THEOREM 5.7.2 t-Shift
If $L\{f(t)\} = F(s)$ exists for $s > s_0$, then for any constant $a > 0$

$$L\{H(t-a) f(t-a)\} = e^{-as} F(s) \tag{6}$$

for $s > s_0$ or, equivalently,

$$L^{-1}\{e^{-as} F(s)\} = H(t-a) f(t-a). \tag{7}$$

Equations (6) and (7) are already given in Section 5.5, where we studied the Heaviside step function, but we repeat them here because the t-shift results seem a natural companion for the s-shift results given in Theorem 5.7.1.
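Both shift theorems lend themselves to a quick numerical sanity check. The sketch below is an illustration under stated assumptions, not the text's method: the helper `laplace` approximates the transform integral by the midpoint rule on a truncated interval, and the choices $f(t) = \sin t$ (so that $F(s) = 1/(s^2+1)$), $s = 1.5$, and $a = 2$ are arbitrary.

```python
import math

def laplace(f, s, t_max=60.0, n=60_000):
    """Midpoint-rule estimate of the integral of f(t) * exp(-s t) over [0, t_max].

    t_max is assumed large enough that the neglected tail is negligible.
    """
    h = t_max / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += f(t) * math.exp(-s * t)
    return total * h

def heaviside(t):
    """Unit step: 0 for t < 0 and 1 otherwise."""
    return 1.0 if t >= 0.0 else 0.0

def F(s):
    """Transform of f(t) = sin t."""
    return 1.0 / (s * s + 1.0)

s, a = 1.5, 2.0

# s-shift, equation (1): L{exp(-a t) f(t)} = F(s + a)
s_shift_num = laplace(lambda t: math.exp(-a * t) * math.sin(t), s)
s_shift_exact = F(s + a)

# t-shift, equation (6): L{H(t - a) f(t - a)} = exp(-a s) F(s)
t_shift_num = laplace(lambda t: heaviside(t - a) * math.sin(t - a), s)
t_shift_exact = math.exp(-a * s) * F(s)

print(s_shift_num, s_shift_exact)
print(t_shift_num, t_shift_exact)
```

In each printed pair the quadrature estimate and the closed form should agree to several digits; the agreement illustrates the theorems but of course does not prove them.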
THEOREM 5.7.3 Multiplication by 1/s
If $L\{f(t)\} = F(s)$ exists for $s > s_0$, then

$$L\left\{\int_0^t f(\tau)\, d\tau\right\} = \frac{F(s)}{s} \tag{8}$$

for $s > \max\{0, s_0\}$ or, equivalently,

$$L^{-1}\left\{\frac{F(s)}{s}\right\} = \int_0^t f(\tau)\, d\tau. \tag{9}$$

Proof: This theorem is but a special case of the convolution theorem. Specifically, $\int_0^t f(\tau)\, d\tau = 1 * f$ so, according to that theorem,

$$L\left\{\int_0^t f(\tau)\, d\tau\right\} = L\{1 * f\} = L\{1\}\, L\{f\} = \frac{1}{s}\, F(s), \tag{10}$$

as asserted. ■

EXAMPLE 3. To evaluate $L^{-1}\{1/[s(s^2+1)]\}$, for example, we identify $F(s)$ as $1/(s^2+1)$. Since $f(t) = L^{-1}\{1/(s^2+1)\} = \sin t$,

$$L^{-1}\left\{\frac{1}{s(s^2+1)}\right\} = \int_0^t \sin\tau\, d\tau = 1 - \cos t. \tag{11}$$

Alternatively, we could have used partial fractions. ■

Next, we obtain two useful theorems by differentiating and integrating the definition

$$F(s) = \int_0^\infty f(t)\, e^{-st}\, dt \tag{12}$$

with respect to $s$. First, we state without proof that if the integral in (12) converges for $s > s_0$, then

$$\frac{dF(s)}{ds} = \frac{d}{ds}\int_0^\infty f(t)\, e^{-st}\, dt = \int_0^\infty \frac{\partial}{\partial s}\left[f(t)\, e^{-st}\right] dt = -\int_0^\infty t f(t)\, e^{-st}\, dt = -L\{t f(t)\}, \tag{13}$$

for $s > s_0$, and

$$\int_a^b F(s)\, ds = \int_a^b \int_0^\infty f(t)\, e^{-st}\, dt\, ds = \int_0^\infty f(t) \left(\int_a^b e^{-st}\, ds\right) dt \tag{14}$$

for $b > a > s_0$. The key step in (13) is the second equality, where we have inverted the order of the integration with respect to $t$ and the differentiation with respect to $s$. In (14), the key is again the second equality, where we have inverted the order of integration with respect to $t$ and the integration with respect to $s$. In particular, if $a = s$ and $b = \infty$, then (14) becomes

$$\int_s^\infty F(s')\, ds' = \int_0^\infty f(t) \left(\int_s^\infty e^{-s't}\, ds'\right) dt = \int_0^\infty \frac{f(t)}{t}\, e^{-st}\, dt = L\left\{\frac{f(t)}{t}\right\} \tag{15}$$

for all $s > s_0$. For reference, we state the results (13) and (15) as theorems.

THEOREM 5.7.4 Differentiation with Respect to s
If $L\{f(t)\} = F(s)$ exists for $s > s_0$, then

$$L\{t f(t)\} = -\frac{dF(s)}{ds} \tag{16}$$

for $s > s_0$ or, equivalently,

$$L^{-1}\left\{\frac{dF(s)}{ds}\right\} = -t f(t). \tag{17}$$

EXAMPLE 4. From the known transform

$$L\{\sin at\} = \frac{a}{s^2+a^2} \qquad (s > 0) \tag{18}$$

we may use (16) to infer the additional results

$$L\{t \sin at\} = -\frac{d}{ds}\, \frac{a}{s^2+a^2} = \frac{2as}{(s^2+a^2)^2}, \qquad (s > 0) \tag{19}$$

$$L\{t^2 \sin at\} = -\frac{d}{ds}\, \frac{2as}{(s^2+a^2)^2} = \frac{2a(3s^2-a^2)}{(s^2+a^2)^3}, \qquad (s > 0) \tag{20}$$

and so on.
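Result (19) of Example 4 can be spot-checked in two independent ways: by numerically integrating $t \sin at\, e^{-st}$, and by differencing $F(s) = a/(s^2+a^2)$. The sketch below assumes the illustrative values $a = 2$ and $s = 1$; the helper `laplace` and the step `ds` are choices made here, not taken from the text.

```python
import math

def laplace(f, s, t_max=80.0, n=80_000):
    """Midpoint-rule estimate of the integral of f(t) * exp(-s t) over [0, t_max]."""
    h = t_max / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += f(t) * math.exp(-s * t)
    return total * h

a, s = 2.0, 1.0

def F(s):
    """Transform (18) of sin(a t)."""
    return a / (s * s + a * a)

# Left side of (16): transform of t * f(t), computed by quadrature.
numeric = laplace(lambda t: t * math.sin(a * t), s)

# Right side of (16): -dF/ds, estimated by a central difference.
ds = 1e-5
dF_ds = (F(s + ds) - F(s - ds)) / (2.0 * ds)

# Closed form (19).
closed = 2.0 * a * s / (s * s + a * a) ** 2

print(numeric, -dF_ds, closed)
```

All three printed numbers should agree closely, which is what Theorem 5.7.4 predicts for this particular $f$.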
THEOREM 5.7.5 Integration with Respect to s
If there is a real number $s_0$ such that $L\{f(t)\} = F(s)$ exists for $s > s_0$, and $\lim_{t \to 0} f(t)/t$ exists, then

$$L\left\{\frac{f(t)}{t}\right\} = \int_s^\infty F(s')\, ds' \tag{21}$$

for $s > s_0$ or, equivalently,

$$L^{-1}\left\{\int_s^\infty F(s')\, ds'\right\} = \frac{f(t)}{t}. \tag{22}$$

EXAMPLE 5. To evaluate

$$L^{-1}\left\{\ln\frac{s-a}{s-b}\right\}, \tag{23}$$

where $a$ and $b$ are real numbers, note that

$$\ln\frac{s-a}{s-b} - \ln\frac{s_1-a}{s_1-b} = \int_{s_1}^{s} \left(\frac{1}{s'-a} - \frac{1}{s'-b}\right) ds' \tag{24}$$

for any $s_1$. Letting $s_1 \to \infty$ and recalling that $\ln 1 = 0$, (24) gives

$$\ln\frac{s-a}{s-b} = \int_s^\infty \left(-\frac{1}{s'-a} + \frac{1}{s'-b}\right) ds'. \tag{25}$$

Thus, identify $F(s)$ in (21) as $-1/(s-a) + 1/(s-b)$ (which does exist for $s > \max\{a,b\} \equiv s_0$). Then

$$f(t) = L^{-1}\left\{-\frac{1}{s-a} + \frac{1}{s-b}\right\} = e^{bt} - e^{at}. \tag{26}$$

Furthermore,

$$\lim_{t \to 0} \frac{f(t)}{t} = \lim_{t \to 0} \frac{e^{bt} - e^{at}}{t} = b - a \tag{27}$$

does exist, the last equality following from l'Hôpital's rule, so (22) gives the desired inverse as

$$L^{-1}\left\{\ln\frac{s-a}{s-b}\right\} = \frac{e^{bt} - e^{at}}{t}. \tag{28}$$ ■

THEOREM 5.7.6 Large s Behavior of F(s)
Let $f(t)$ be piecewise continuous on $0 \le t \le t_0$ for each finite $t_0$ and of exponential order as $t \to \infty$. Then
(i) $F(s) \to 0$ as $s \to \infty$,
(ii) $sF(s)$ is bounded as $s \to \infty$.

Proof: Since $f(t)$ is of exponential order as $t \to \infty$, there exist real constants $K$ and $c$, with $K > 0$, such that $|f(t)| \le Ke^{ct}$ for all $t \ge t_0$ for some sufficiently large $t_0$. And since $f(t)$ is piecewise continuous on $0 \le t \le t_0$, there must be a finite constant $M$ such that $|f(t)| \le M$ on $0 \le t \le t_0$. Then

$$|F(s)| = \left|\int_0^\infty f(t)\, e^{-st}\, dt\right| \le \int_0^\infty |f(t)|\, e^{-st}\, dt = \int_0^{t_0} |f(t)|\, e^{-st}\, dt + \int_{t_0}^\infty |f(t)|\, e^{-st}\, dt$$
$$\le \int_0^{t_0} M e^{-st}\, dt + \int_{t_0}^\infty K e^{-(s-c)t}\, dt = M\, \frac{1 - e^{-st_0}}{s} + K\, \frac{e^{-(s-c)t_0}}{s-c} < \frac{M}{s} + \frac{K}{s-c} \tag{29}$$

for all $s > c$. It follows from this result that $F(s) \to 0$ as $s \to \infty$, and that $sF(s)$ is bounded as $s \to \infty$. ■

For instance, for each of the entries 1-7 in Appendix C we do have $F(s) \to 0$ and $sF(s)$ bounded as $s \to \infty$. For entry 8 we do too, unless $-1 < p < 0$, in which case $F(s) \to 0$ but $sF(s)$ is not bounded. However, in this case $f(t) = t^p$ is not piecewise continuous since $f(t) \to \infty$ as $t \to 0$ if $p$ is negative.
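The remark about $f(t) = t^p$ can be made concrete with a few numbers. The sketch below assumes the standard transform $L\{t^p\} = \Gamma(p+1)/s^{p+1}$ for $p > -1$ (taken here to be the Appendix C entry referred to above) and simply tabulates $sF(s)$ for increasing $s$; the function name `s_times_F` is illustrative.

```python
import math

def s_times_F(p, s):
    """Return s * F(s) for f(t) = t**p, using F(s) = Gamma(p + 1) / s**(p + 1), p > -1."""
    return s * math.gamma(p + 1.0) / s ** (p + 1.0)

# p = 1 and p = 0 satisfy the hypotheses of Theorem 5.7.6; p = -0.5 does not,
# because t**(-0.5) is unbounded as t -> 0 and so is not piecewise continuous.
for p in (1.0, 0.0, -0.5):
    values = [s_times_F(p, s) for s in (1e1, 1e2, 1e3, 1e4)]
    print(p, values)
```

For $p = 1$ the products decay like $1/s$, for $p = 0$ they stay equal to 1, and for $p = -1/2$ they grow like $\sqrt{\pi s}$, so $sF(s)$ is unbounded exactly in the case that the theorem's hypotheses exclude.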
THEOREM 5.7.7 Initial-Value Theorem
Let $f$ be continuous and $f'$ be piecewise continuous on $0 \le t \le t_0$ for each finite $t_0$, and let $f$ and $f'$ be of exponential order as $t \to \infty$. Then

$$\lim_{s \to \infty} [sF(s)] = f(0). \tag{30}$$

Proof: With the stated assumptions on $f$ and $f'$, it follows from Theorem 5.3.3 that

$$L\{f'(t)\} = sL\{f(t)\} - f(0). \tag{31}$$

Since $f'$ satisfies the conditions of Theorem 5.7.6, it follows that $L\{f'(t)\} \to 0$ as $s \to \infty$. Thus, letting $s \to \infty$ in (31) gives the result stated in (30). ■

Normally, we invert $F(s)$ and obtain $f(t)$. However, if it is only $f(0)$ that we desire, not $f(t)$, we do not need to invert $F(s)$; all we need to do, according to (30), is to determine the limit of $sF(s)$ as $s \to \infty$.

As our final item, we show how to transform a periodic function, which is important in applications. First, we define what is meant by a periodic function. If there exists a positive constant $T$ such that

$$f(t+T) = f(t) \tag{32}$$

for all $t \ge 0$, then we say that $f$ is periodic, with period $T$.

EXAMPLE 6. The function $\sin t$ is periodic with period $2\pi$ since $\sin(t+2\pi) = \sin t \cos 2\pi + \sin 2\pi \cos t = \sin t$, for all $t$. ■

EXAMPLE 7. The function $f$ shown in Fig. 1 is, by inspection, seen to be periodic with period $T = 4$, for if we "stamp out" the segment ABC repeatedly, then we generate the graph of $f$. ■

Figure 1. Periodic function $f$.

Notice that if $f$ is periodic with period $T$, then it is also periodic with period $2T, 3T, 4T$, and so on. For instance, it follows from (32) that

$$f(t+2T) = f((t+T)+T) = f(t+T) = f(t),$$

so that if $f$ is periodic with period $T$ then it is also periodic with period $2T$. If there is a smallest period, it is called the fundamental period. Thus, $\sin t$ in Example 6 is periodic with period $2\pi, 4\pi, 6\pi, \ldots$, so its fundamental period is $2\pi$; $f(t)$ in Example 7 is periodic with period $4, 8, 12, \ldots$, so its fundamental period is 4. In contrast, $f(t) = 3$ (i.e., a constant) is periodic with period $T$ for every $T > 0$. Thus, there is no smallest period, and hence this $f$ does not have a fundamental period.
To evaluate the Laplace transform of a periodic function $f(t)$, with period $T$ (which is normally taken to be the fundamental period of $f$, if $f$ has a fundamental period), it seems like a good start to break up the integral on $t$ as

$$L\{f(t)\} = \int_0^\infty f(t)\, e^{-st}\, dt = \int_0^T f(t)\, e^{-st}\, dt + \int_T^{2T} f(t)\, e^{-st}\, dt + \int_{2T}^{3T} f(t)\, e^{-st}\, dt + \cdots. \tag{33}$$

Next, let $\tau = t$ in the first integral on the right side of (33), $\tau = t - T$ in the second, $\tau = t - 2T$ in the third, and so on. Thus,

$$L\{f(t)\} = \int_0^T f(\tau)\, e^{-s\tau}\, d\tau + \int_0^T f(\tau+T)\, e^{-s(\tau+T)}\, d\tau + \int_0^T f(\tau+2T)\, e^{-s(\tau+2T)}\, d\tau + \cdots, \tag{34}$$

but $f(\tau+T) = f(\tau)$, $f(\tau+2T) = f(\tau)$, and so on, so (34) becomes

$$L\{f(t)\} = \left(1 + e^{-sT} + e^{-2sT} + \cdots\right) \int_0^T f(\tau)\, e^{-s\tau}\, d\tau. \tag{35}$$

Unfortunately, this expression is an infinite series. However, observe that

$$1 + e^{-sT} + e^{-2sT} + \cdots = 1 + \left(e^{-sT}\right) + \left(e^{-sT}\right)^2 + \cdots \tag{36}$$

is a geometric series $1 + z + z^2 + \cdots$, with $z = e^{-sT}$, and the latter is known to have the sum $1/(1-z)$ if $|z| < 1$. Since $|z| = |e^{-sT}| = e^{-sT} < 1$ if $s > 0$, we can sum the parenthetic series in (35) as $1/(1 - e^{-sT})$. Finally, if we ask that $f$ be piecewise continuous on $0 \le t \le T$, to ensure the existence of the integral in (35), then we can state the result as follows.

THEOREM 5.7.8 Transform of Periodic Function
If $f$ is periodic with period $T$ on $0 \le t < \infty$ and piecewise continuous on one period, then

$$L\{f(t)\} = \frac{1}{1 - e^{-sT}} \int_0^T f(t)\, e^{-st}\, dt \tag{37}$$

for $s > 0$.

The point, of course, is that (37) requires integration only over one period, rather than over $0 \le t < \infty$, and gives the transform in closed form rather than as an infinite series.

EXAMPLE 8. If $f$ is the sawtooth wave shown in Fig. 2, then $T = 2$, and

$$\int_0^T f(t)\, e^{-st}\, dt = \int_0^2 2t\, e^{-st}\, dt = 2\, \frac{1 - (1+2s)e^{-2s}}{s^2}, \tag{38}$$

so

$$L\{f(t)\} = \frac{2}{s^2}\, \frac{1 - (1+2s)e^{-2s}}{1 - e^{-2s}} = \frac{2}{s^2} - \frac{4}{s}\, \frac{e^{-2s}}{1 - e^{-2s}} \tag{39}$$

for $s > 0$.

A more interesting question is the reverse: What is the inverse of

$$F(s) = \frac{2}{s^2} - \frac{4}{s}\, \frac{e^{-2s}}{1 - e^{-2s}}, \tag{40}$$

where we pretend no advance knowledge of the sawtooth wave in Fig. 2, or even the knowledge that $f(t)$ is periodic?
The key is to proceed in reverse, that is, to expand the $1/(1 - e^{-2s})$ in a geometric series in powers of $e^{-2s}$. Thus

$$F(s) = \frac{2}{s^2} - \frac{4}{s}\, e^{-2s}\left(1 + e^{-2s} + e^{-4s} + \cdots\right) = \frac{2}{s^2} - \frac{4}{s}\left(e^{-2s} + e^{-4s} + e^{-6s} + \cdots\right). \tag{41}$$

Assuming that the series can be inverted termwise,

$$f(t) = 2t - 4\left[H(t-2) + H(t-4) + H(t-6) + \cdots\right]. \tag{42}$$

The first few partial sums,

$$f_1(t) = 2t, \qquad f_2(t) = 2t - 4H(t-2), \qquad f_3(t) = 2t - 4H(t-2) - 4H(t-4),$$

are sketched in Fig. 3, and it is easy to infer that (42) gives the periodic sawtooth wave shown in Fig. 2.

Figure 3. Partial sums of (42).

Figure 4. The staircase $4[H(t-2) + H(t-4) + \cdots]$.

COMMENT. Observe that the presence of $1 - e^{-sT}$ in the denominator of a transform does not suffice to imply that the inverse is periodic. For example, the inverse of $4e^{-2s}/[s(1 - e^{-2s})]$, in (40), is the nonperiodic "staircase" shown in Fig. 4. It is only when this staircase is subtracted from $2t$ that a periodic function results. ■

This completes our discussion of the Laplace transform. Just as we used it, in this chapter, to solve linear ordinary differential equation initial-value problems, in later chapters we will use it to solve linear partial differential equation initial-value problems.

Closure. It would be difficult to pick out the one or two most important results in this section, since there are eight theorems, all of comparable importance. Most of these theorems are included as entries within Appendix C.

EXERCISES 5.7

1. Invert each of the following by any method. Cite any items from Appendix C and any theorems that you use. If it is applicable, verify the initial-value theorem, namely, that $sF(s) \to f(0)$ as $s \to \infty$; if it is not, then state why it is not.
(a) - | ae (s? + a*)? (d) s* ~ (s+ 1)3 (b) : 5 (s? — a’)? (e) 1 s+ (c) >“ ; (s — 2) (f) y _- (s—a)s/? es e e728 (@)or (h)(s+ 46 ou(s+1)?
| (j)In € + =) (k) cH 2y4 (H)In (1 - = eee ee 4284 (m 8° aS +1 iyn(S**) @— (vy) 5x8 the solution is u(t) = of [1 e~ (t-0)) − (yy) OY s-e) = sint, show that (37) _ 0) (6.1) ↔ > ∙ 2 (a)—(x)Invert the transform given in the corresponding part of Exercise |, using computer software. does indeed give A(t A(t —1) fh ≤ 3. (a) In the simple case where f(t) _ +[1—e~@?)|H(t —2) - + (w)= (F472 we —[l—e"-)] (0)s*(s?so —28 —2) ∶ 289 3 2 I 0 t 4 HINT: It would be wastefulto determineF(s) becausethe solution can be expressed as a convolution integral involving L{sint} = f(t) directly. In thatintegral,expressf(t) asan infiniteseries of Heaviside functions. 1 st41 (b) Sketch and label the graph of x(t) over 0 < t < 3, say, (b) In the case where f(t) = cost, show that (37) does indeed give for thecasewhereto = 0. Is x(t) periodic? If not,is therea valueof xp suchthatx(t) is periodic? Explain. 4. (Scalechanges)Show thatif L{f(¢)} = F(s), then f(£) by the Laplace transform,where 7. Solve 2! +2 z(0) = ao and f(t) is theperiodicfunctionshown. HINT: 8 Lf{cost} = Fu Read Exercise 6. r @L{flat)}= +e(2) wyo-'¢r(as)} =49 (<) 1 8 Higgs _ il t 31 5. Determine the Laplace transform of the function f(t) that is periodic and defined on one period as follows. (a)sint, O<t<7 (c)sin2t, (e) t, O<t<a7 0, O<t<1 − 1<t<2 2, 0O<t<l (g)¢ 4, 1<t<2 1, 2<t<38 (b) 1, O<t<2 0, 2<t<3 (djem', (f) t, t-2, − (a) I O<t<2 O0O<t<1 ~ 1<t<2 2 (b) Ae ∶ ∶∶∶ 2 4 6 8 t 8. Solve vw”+ 2 = f(t) by the Laplace transform,where 6. (a) Solve v’ + x = f(t) by the Laplace transform, where x(0) = a, 2/(0) = ag, andf(t) is thesquarewaveshownin z(Q) = xg and f(t) is the square wave shown, and show that Exercise 6. Evaluate2(5) if v9 = xg = 1. 290 Chapter 5. Laplace Transform Chapter 5 Review The Laplace transform has a variety of uses, but its chief application is in the solution of linear ordinary and partial differential differential equations. 
In this chapter our focus is on its use in solving linear ordinary differential equations with constant coefficients. The power of the method is that it reduces such a differential equation, homogeneous or not, to a linear algebraic one. The hardest part, the inversion, is often accomplished with the help of tables, a considerable number of theorems, and computer software. Also, any initial conditions that are given become built in, in the process of transforming the differential equation, so they do not need to be applied at the end.

Chief properties given in Section 5.3 are:

Linearity of the transform and its inverse
$$L\{\alpha u(t) + \beta v(t)\} = \alpha L\{u(t)\} + \beta L\{v(t)\},$$
$$L^{-1}\{\alpha U(s) + \beta V(s)\} = \alpha L^{-1}\{U(s)\} + \beta L^{-1}\{V(s)\},$$

Transform of derivatives
$$L\{f'\} = sF(s) - f(0), \qquad L\{f''\} = s^2 F(s) - sf(0) - f'(0),$$

Convolution Theorem
$$L\{(f * g)(t)\} = F(s)G(s) \quad \text{or} \quad L^{-1}\{F(s)G(s)\} = (f * g)(t),$$
where $(f * g)(t) = \int_0^t f(\tau)\, g(t-\tau)\, d\tau$ is the Laplace convolution of $f$ and $g$.

In Sections 5.5 and 5.6 we introduce the step and impulse functions $H(t-a)$ and $\delta(t-a)$, defined by

$$H(t-a) = \begin{cases} 0, & t < a \\ 1, & t > a \end{cases}$$

and

$$\int_A^B g(\tau)\, \delta(\tau - a)\, d\tau = \begin{cases} g(a), & A \le a < B \\ 0, & a < A \text{ or } a \ge B, \end{cases}$$

to model piecewise-defined and impulsive forcing functions. Finally, in Section 5.7 we derive additional properties:

s-shift
$$L\{e^{-at} f(t)\} = F(s+a) \quad \text{or} \quad L^{-1}\{F(s+a)\} = e^{-at} f(t).$$

t-shift
$$L\{H(t-a) f(t-a)\} = e^{-as} F(s) \quad \text{or} \quad L^{-1}\{e^{-as} F(s)\} = H(t-a) f(t-a).$$

Multiplication by 1/s
$$L\left\{\int_0^t f(\tau)\, d\tau\right\} = \frac{F(s)}{s} \quad \text{or} \quad L^{-1}\left\{\frac{F(s)}{s}\right\} = \int_0^t f(\tau)\, d\tau.$$

Differentiation with respect to s
$$L\{t f(t)\} = -\frac{dF(s)}{ds} \quad \text{or} \quad L^{-1}\left\{\frac{dF(s)}{ds}\right\} = -t f(t).$$

Integration with respect to s
$$L\left\{\frac{f(t)}{t}\right\} = \int_s^\infty F(s')\, ds' \quad \text{or} \quad L^{-1}\left\{\int_s^\infty F(s')\, ds'\right\} = \frac{f(t)}{t}.$$

Large s behavior of F(s)
$$F(s) \to 0 \text{ as } s \to \infty, \qquad sF(s) \text{ bounded as } s \to \infty.$$

Transform of periodic function of period T
$$L\{f(t)\} = \frac{1}{1 - e^{-sT}} \int_0^T f(t)\, e^{-st}\, dt.$$

NOTE: The preceding list is intended as an overview so, for brevity, the various conditions under which these results hold have been omitted.
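As a numerical illustration of the last entry in the list, the periodic-function formula can be compared with a brute-force evaluation of the transform integral for the sawtooth of Example 8 in Section 5.7 ($T = 2$, $f(t) = 2t$ on one period). This is only a sketch: the function names, the value $s = 1$, and the truncation of the infinite integral at $t = 60$ are assumptions made for the demonstration.

```python
import math

T = 2.0  # period of the sawtooth wave

def sawtooth(t):
    """Sawtooth of period T equal to 2t on 0 <= t < T."""
    return 2.0 * (t % T)

def midpoint(g, a, b, n):
    """Midpoint-rule estimate of the integral of g over [a, b] using n subintervals."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def transform_via_one_period(s):
    """Periodic-function formula: integrate over one period, divide by 1 - exp(-s T)."""
    one_period = midpoint(lambda t: sawtooth(t) * math.exp(-s * t), 0.0, T, 4000)
    return one_period / (1.0 - math.exp(-s * T))

def transform_direct(s, t_max=60.0):
    """Brute-force integral over [0, t_max]; the tail beyond t_max is neglected."""
    return midpoint(lambda t: sawtooth(t) * math.exp(-s * t), 0.0, t_max, 120_000)

def transform_closed_form(s):
    """Closed form obtained in Example 8: 2/s^2 - 4 exp(-2s) / (s (1 - exp(-2s)))."""
    return 2.0 / s**2 - 4.0 * math.exp(-2.0 * s) / (s * (1.0 - math.exp(-2.0 * s)))

s = 1.0
print(transform_via_one_period(s), transform_direct(s), transform_closed_form(s))
```

The one-period computation needs only a short integration interval yet should match both the long brute-force integral and the closed form, which is the practical point of the periodic-function theorem.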
Chapter 6
Quantitative Methods: Numerical Solution of Differential Equations

6.1 Introduction

Following the introduction in Chapter 1, Chapters 2-5 cover both the underlying theory of differential equations and analytical solution techniques. That is, the objective thus far has been to find an analytical solution, in closed form if possible or as an infinite series if necessary. Unfortunately, a great many differential equations encountered in applications, and most nonlinear equations in particular, are simply too difficult for us to solve analytically.

Thus, in Chapters 6 and 7 our approach is fundamentally different, and complements the analytical approach adopted in Chapters 2-5: in Chapter 6 we develop quantitative methods, and in Chapter 7 our view is essentially qualitative. More specifically, in this chapter we "discretize" the problem and seek, instead of an analytical solution, the numerical values of the dependent variable at a discrete set of values of the independent variable so that the result is a table or graph, with those values determined approximately (but accurately), rather than exactly.

Perhaps the greatest drawback to numerical simulation is that whereas an analytical solution explicitly displays the dependence of the dependent variable(s) on the various physical parameters (such as spring stiffnesses, driving frequencies, electrical resistances, inductances, and so on), one can carry out a numerical solution only for a specific set of values of the system parameters. Thus, parametric studies (i.e., studies of the qualitative and quantitative effects of the various parameters upon the solution) can be tedious and unwieldy, and it is useful to reduce the number of parameters as much as possible (by nondimensionalization, as discussed in Section 2.4.4) before embarking upon a numerical study.

The numerical solution of differential equations covers considerable territory so the present chapter is hardly complete.
Rather, we aim at introducing the fundamental ideas, concepts, and potential difficulties, as well as specific methods that are accurate and readily implemented. We do mention computer software that carries out these computations automatically, but our present aim is to provide enough information so that you will be able to select a specific method and program it. In contrast, in Chapter 7, where we look more at qualitative issues, we rely heavily upon available software.

6.2 Euler's Method

In this section and the two that follow, we study the numerical solution of the first-order initial-value problem

$$y' = f(x,y); \qquad y(a) = b \tag{1}$$

on $y(x)$.

To motivate the first and simplest of these methods, Euler's method, consider the problem

$$y' = y + 2x - x^2, \qquad y(0) = 1 \qquad (0 \le x < \infty) \tag{2}$$

with the exact solution (Exercise 1)

$$y(x) = x^2 + e^x. \tag{3}$$

Of course, in practice one wouldn't solve (2) numerically because we can solve it analytically and obtain the solution (3), but we will use (2) as an illustration.

In Fig. 1 we display the direction field defined by $f(x,y) = y + 2x - x^2$, as well as the exact solution (3). In graphical terms, Euler's method amounts to using the direction field as a road map in developing an approximate solution to (2). Beginning at the initial point $P$, namely $(0,1)$, we move in the direction dictated by the lineal element at that point. As seen from the figure, the farther we move along that line, the more we expect our path to deviate from the exact solution. Thus, the idea is not to move very far. Stopping at $x = 0.5$, say, for the sake of illustration, we revise our direction according to the slope of the lineal element at that point $Q$. Moving in that new direction until $x = 1$, we revise our direction at $R$, and so on, moving in $x$ increments of 0.5. We call the $x$ increment the step size and denote it as $h$. In Fig. 1, $h$ is 0.5.

Let us denote the $y$ values at $Q, R, \ldots$ as $y_1, y_2, \ldots$.
They are computed as $y_1 = y_0 + f(x_0, y_0)h$, $y_2 = y_1 + f(x_1, y_1)h, \ldots$, where $(x_0, y_0)$ is the initial point $P$. Expressed as a numerical algorithm, the Euler method is therefore as follows:

$$y_{n+1} = y_n + f(x_n, y_n)\,h, \qquad (n = 0, 1, 2, \ldots) \tag{4}$$

where $f$ is the function on the right side of the given differential equation (1), $x_0 = a$, $y_0 = b$, $h$ is the chosen step size, and $x_n = x_0 + nh$.

Figure 1. Direction field motivation of Euler's method, for the initial-value problem (2).

Euler's method is also known as the tangent-line method because the first straight-line segment of the approximate solution is tangent to the exact solution $y(x)$ at $P$, and each subsequent segment emanating from $(x_n, y_n)$ is tangent to the solution curve through that point.

Apparently, the greater the step size the less accurate the results, in general. For instance, the first point $Q$ deviates more and more from the exact solution as the step size is increased, that is, as the segment $PQ$ is extended. Conversely, we expect the approximate solution to approach the exact solution curve as $h$ is reduced. This expectation is supported by the results shown in Table 1 for the initial-value problem (2), obtained by Euler's method with step sizes of $h = 0.5$, $0.1$, and $0.02$; we have included the exact solution $y(x)$, given by (3), for comparison.

Table 1. Comparison of numerical solution of (2) using Euler's method, with the exact solution.

x    | h = 0.5 | h = 0.1 | h = 0.02 | y(x)
0.5  | 1.5000  | 1.7995  | 1.8778   | 1.8987
1.0  | 2.6250  | 3.4344  | 3.6578   | 3.7183
1.5  | 4.4375  | 6.1095  | 6.5975   | 6.7317

With $h = 0.5$, for instance,

$$y_1 = y_0 + (y_0 + 2x_0 - x_0^2)\,h = 1 + (1 + 0 - 0)(0.5) = 1.5,$$
$$y_2 = y_1 + (y_1 + 2x_1 - x_1^2)\,h = 1.5 + (1.5 + 1 - 0.25)(0.5) = 2.625,$$
$$y_3 = y_2 + (y_2 + 2x_2 - x_2^2)\,h = 2.625 + (2.625 + 2 - 1)(0.5) = 4.4375.$$

With $h = 0.1$, the values tabulated at $x = 0.5, 1.0, 1.5$ are $y_5, y_{10}, y_{15}$, with the intermediate computed $y$ values omitted for brevity.
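Algorithm (4) takes only a few lines to program. The following sketch applies it to the initial-value problem (2) for the three step sizes used in Table 1; the function names (`euler`, `exact`) and the way the tabulated points are picked out are choices made here for illustration, not a prescribed implementation.

```python
import math

def f(x, y):
    """Right-hand side of the differential equation in (2)."""
    return y + 2.0 * x - x * x

def exact(x):
    """Exact solution (3)."""
    return x * x + math.exp(x)

def euler(f, x0, y0, h, n_steps):
    """Apply Euler's method (4): y_{n+1} = y_n + f(x_n, y_n) h, with x_n = x0 + n h."""
    xs, ys = [x0], [y0]
    for n in range(n_steps):
        ys.append(ys[-1] + f(xs[-1], ys[-1]) * h)
        xs.append(x0 + (n + 1) * h)
    return xs, ys

for h in (0.5, 0.1, 0.02):
    steps_per_half = round(0.5 / h)          # number of steps between tabulated x values
    xs, ys = euler(f, 0.0, 1.0, h, 3 * steps_per_half)
    row = [ys[k * steps_per_half] for k in (1, 2, 3)]   # values at x = 0.5, 1.0, 1.5
    print("h =", h, row)

print("exact", [exact(x) for x in (0.5, 1.0, 1.5)])
```

Each printed row corresponds to one column of Table 1, so the output can be compared with the tabulated values entry by entry.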
Scanning each row of the tabulation, we can see that the approximate solution appears to be converging to the exact solution as $h \to 0$ (though we cannot be certain from such results no matter how small we make $h$), and that the convergence is not very rapid, for even with $h = 0.02$ the computed value at $x = 1.5$ is in error by 2%.

As strictly computational as this sounds, two important theoretical questions present themselves: Does the method converge to the exact solution as $h \to 0$ and, if so, how fast? By a method being convergent we mean that for any fixed $x$ value in the $x$ interval of interest the sequence of $y$ values, obtained using smaller and smaller step size $h$, tends to the exact solution $y(x)$ as $h \to 0$.

Let us see whether the Euler method is convergent. Observe that there are two sources of error in the numerical solution. One is the tangent-line approximation upon which the method is based, and the other is the accumulation of numerical roundoff errors within the computing machine since a machine can carry only a finite number of significant figures, after which it rounds off (or chops off, depending upon the machine). In discussing convergence, one ignores the presence of such roundoff error and considers it separately. Thus, in this discussion we imagine our computer to be perfect, carrying an infinite number of significant figures.

Local truncation error. Although we are interested in the accumulation of error after a great many steps have been carried out, to reach any given $x$, it seems best to begin by investigating the error incurred in a single step, from $x_{n-1}$ to $x_n$ (or from $x_n$ to $x_{n+1}$, it doesn't matter). We need to distinguish between the exact and approximate solutions so let us denote the exact solution at $x_n$ as $y(x_n)$ and the approximate numerical solution at $x_n$ as $y_n$.
These are given by the Taylor series

$$y(x_n) = y(x_{n-1}) + y'(x_{n-1})(x_n - x_{n-1}) + \frac{y''(x_{n-1})}{2!}(x_n - x_{n-1})^2 + \cdots = y(x_{n-1}) + y'(x_{n-1})h + \frac{y''(x_{n-1})}{2!}h^2 + \cdots \tag{5}$$

and the Euler algorithm

$$y_n = y_{n-1} + f(x_{n-1}, y_{n-1})h, \tag{6}$$

respectively. It is important to understand that the Euler method (6) amounts to retaining only the first two terms of the Taylor series in (5). Thus, it replaces the actual function by its tangent-line approximation.

We suppose that $y(x_{n-1})$ and $y_{n-1}$ are identical, and we ask how large the error $e_n = y(x_n) - y_n$ is after making that single step, from $x_{n-1}$ to $x_n$. We can get an expression for $e_n$ by subtracting (6) from (5), but the right side will be an infinite series. Thus, it is more convenient to use, in place of the (infinite) Taylor series (5), the (finite) Taylor's formula with remainder,

$$y(x_n) = y(x_{n-1}) + y'(x_{n-1})h + \frac{y''(\xi)}{2!}h^2, \tag{7}$$

where $\xi$ is some point in the interval $[x_{n-1}, x_n]$. Now, subtracting (6) from (7), and noting that $y'(x_{n-1}) = f[x_{n-1}, y(x_{n-1})] = f(x_{n-1}, y_{n-1})$ because of our supposition that $y(x_{n-1}) = y_{n-1}$, gives

$$e_n = \frac{y''(\xi)}{2}h^2. \tag{8}$$

The latter expression for $e_n$ is awkward to apply since we don't know $\xi$, except that $x_{n-1} \le \xi \le x_n$.* However, (8) is of more interest in that it shows how the single-step error $e_n$ varies with $h$. Specifically, since $x_{n-1} \le \xi \le x_{n-1} + h$, we have

$$e_n = O(h^2) \tag{9}$$

as $h \to 0$. [The big oh notation is defined in Section 4.5, and (9) simply means that $e_n \sim Ch^2$ as $h \to 0$ for some nonzero constant $C$.] Since the error $e_n$ is due to truncation of the Taylor series it is called the truncation error; more specifically, it is the local truncation error because it is the truncation error incurred in a single step.

*It also appears that we do not know the $y''$ function, but it follows from (2) that $y'' = y' + 2 - 2x = (y + 2x - x^2) + 2 - 2x = y + 2 - x^2$.

Accumulated truncation error and convergence.
Of ultimate interest, however, is the truncation error that has accumulated over all of the preceding steps since that error is the difference between the exact solution and the computed solution at any given $x_n$. We denote it as $E_n = y(x_n) - y_n$ and call it the accumulated truncation error. If it seems strange that we have defined both the local and accumulated truncation errors as $y(x_n) - y_n$, it must be remembered that the former is that which results from a single step (from $x_{n-1}$ to $x_n$) whereas the latter is that which results from the entire sequence of steps (from $x_0$ to $x_n$).

We can estimate $E_n$ at a fixed $x$ location (at least insofar as its order of magnitude) as the local truncation error $e_n$ times the number of steps $n$. Since $e_n = O(h^2)$, this idea gives

$$E_n = O(h^2)\cdot n = O(h^2)\,\frac{x_n}{h} = O(h)\cdot x_n = O(h), \tag{10}$$

where we have used $n = x_n/h$ (the initial point in (2) being $x_0 = 0$), and where the last step follows from the fact that the selected location $x_n$ is held fixed as $h \to 0$. Since the big oh notation is insensitive to scale factors, the $x_n$ factor can be absorbed by the $O(h)$ so

$$E_n = O(h), \tag{11}$$

which result tells us how fast the numerical solution converges to the exact solution (at any fixed $x$ location) as $h \to 0$. Namely, $E_n \sim Ch$ for some nonzero constant $C$.

To illustrate, consider the results shown in Table 1, and consider $x = 1.5$, say, in particular. According to $E_n \sim Ch$, if we reduce $h$ by a factor of five, from 0.1 to 0.02, then likewise we should reduce the error by a factor of five. We find that $(6.7317 - 6.1095)/(6.7317 - 6.5975) \approx 4.6$, which is indeed close to five. We can't expect it to be exactly five for two reasons: First, (11) holds only as $h \to 0$, whereas we have used $h = 0.1$ and $0.02$. Second, we obtained the values in Table 1 using a computer, and a computer introduces an additional error, due to roundoff, which has not been accounted for in our derivation of (11). Probably it is negligible in this example.

While (11) can indeed be proved rigorously, be aware that our reasoning in (10) was only heuristic.
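The factor-of-five check described above can be repeated numerically. The following Python sketch (the helper name `euler_at` is a hypothetical choice made here) recomputes the Euler values at $x = 1.5$ for $h = 0.1$ and $h = 0.02$ and forms the ratio of the two accumulated errors; for a first-order method that ratio should be somewhat close to 5, though not exactly 5 for the reasons given in the text.

```python
import math

def euler_at(x_end, h):
    """Euler's method (4) for problem (2); return the approximation at x_end."""
    n_steps = round(x_end / h)          # number of steps from x_0 = 0 to x_end
    x, y = 0.0, 1.0
    for n in range(n_steps):
        y += (y + 2.0 * x - x * x) * h  # y_{n+1} = y_n + f(x_n, y_n) h
        x = (n + 1) * h
    return y

x_end = 1.5
y_exact = x_end**2 + math.exp(x_end)            # exact solution (3)
err_coarse = y_exact - euler_at(x_end, 0.1)     # accumulated error with h = 0.1
err_fine = y_exact - euler_at(x_end, 0.02)      # accumulated error with h = 0.02
ratio = err_coarse / err_fine

print(f"error (h = 0.1)  = {err_coarse:.4f}")
print(f"error (h = 0.02) = {err_fine:.4f}")
print(f"ratio of errors  = {ratio:.2f}")
```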
To understand the shortcoming of our logic, consider the diagram in Fig. 2, where we show only two steps, for simplicity.

Figure 2. The global truncation error.

Our reasoning, in writing $E_n = O(h^2)\cdot n$ in (10), is that the accumulated truncation error $E_n$ is (at least insofar as order of magnitude) the sum of the $n$ single-step errors. However, that is not quite true. We see from Fig. 2 that $E_2$ is $e_2 + \beta$, not the sum of the single-step errors $e_2 + e_1$, and $\beta$ is not identical to $e_1$. The difference between $\beta$ and $e_1$ is the result of the slightly different slopes of $L_1$ and $L_2$ acting over the short distance $h$, and that difference can be shown to be a higher-order effect that does not invalidate the final result that $E_n = O(h)$, provided that $f$ is well enough behaved (for example, if $f$, $f_x$, and $f_y$ are all continuous on the $x, y$ region of interest).

In summary, (11) shows that the Euler method (4) is convergent because the accumulated truncation error tends to zero as $h \to 0$. More generally if, for a given method, $E_n = O(h^p)$ as $h \to 0$, then the method is convergent if $p > 0$, and we say that it is of order $p$. Thus, the Euler method is a first-order method.

Although convergent and easy to implement, Euler's method is usually too inaccurate for serious computation because it is only a first-order method. That is, since the accumulated truncation error is proportional to $h$ to the first power, we need to make $h$ extremely small if the error is to be extremely small. Why can't we do that? Why can't we merely let $h = 10^{-8}$, say? There are two important reasons. One is that with $h = 10^{-8}$, it would take $10^8$ steps to generate the Euler solution over a unit $x$ interval. That number of steps might simply be impractically large in terms of computation time and expense. Second, besides the truncation error that we have discussed there is also machine roundoff error, and that error can be expected to grow with the number of calculations.
Thus, as we diminish the step size $h$ and increase the number of steps, to reduce the truncation error, we inflict a roundoff error penalty that diminishes the intended increase in accuracy. In fact, we can anticipate the existence of an optimal $h$ value so that to decrease $h$ below that value is counterproductive. Said differently, a given level of accuracy may prove unobtainable because of the growth in the roundoff error as $h$ is reduced. Further discussion of this point is contained in the exercises.

Finally, there is an important practical question not yet addressed: How do we know how small to choose $h$? We will have more to say about this later, but for now let us give a simple procedure, namely, reducing $h$ until the results settle down to the desired accuracy. For instance, suppose we solve (2) by Euler's method using $h = 0.5$, as a first crack. Pick any fixed point $x$ in the interval of interest, such as $x = 1.5$. The computed solution there is 4.4375. Now reduce $h$, say to 0.1, and run the program again. The result this time, at $x = 1.5$, is 6.1095. Since those results differ considerably, reduce $h$ again, say to 0.02, and run the program again. Simply repeat that procedure until the solution at $x = 1.5$ settles down to the desired number of significant figures. Accept the results of the final run, and discard the others. (Of course, one will not have an exact solution to compare with as we did in Table 1.)

The foregoing idea is merely a rule of thumb, and is the same idea that we use in computing an infinite series: keep adding more and more terms until successive partial sums agree with the desired number of significant figures.

Closure. The Euler method is embodied in (4). It is easy to implement, either using a hand-held calculator or programming it to be run on a computer. The method is convergent but only of first order and hence is not very accurate. Thus, it is important to develop more accurate methods, and we do that in the next section.
We also use our discussion of the Euler method to introduce the concept of the local and accumulated truncation errors $e_n$ and $E_n$, respectively, which are $O(h^2)$ and $O(h)$, respectively.

EXERCISES 6.2

1. Derive the particular solution (3) of the initial-value problem (2).

2. Use the Euler method to compute, by hand, $y_1$, $y_2$, and $y_3$ for the specified initial-value problem using $h = 0.2$.
(a) $y' = -y$; $y(0) = 1$
(b) $y' = 2xy$; $y(0) = 0$
(c) $y' = 3x^2y^2$; $y(0) = 0$
(d) $y' = 1 + 2xy^2$; $y(1) = -2$
(e) $y' = 2xe^{-y}$; $y(1) = -1$
(f) $y' = x^2 - y^2$; $y(3) = 5$
(g) $y' = x\sin y$;
(h) $y' = \tan(x + y)$;
(iy =5e-2/y, Qy= Very, y(0)=0 y(1)=2 y(0)=4 y(0)=3

3. Program and run Euler's method for the initial-value problem $y' = f(x, y)$, with $y(0) = 1$ and $h = 0.1$, through $y_{10}$. Print $y_1, \ldots, y_{10}$ and the exact solution $y(x_1), \ldots, y(x_{10})$ as well. (Six significant figures will suffice.) Evaluate $E_{10}$. Use the $f(x, y)$ specified below.
(a)2x ee glta+y (b)—6y? (c)a+y (h)—ytan (i)ett +1)/2 (e)(y? (A)dwen¥

4. (a)-(h) Program and run Euler's method for the initial-value problem $y' = f(x, y)$ (with $f$ given in the corresponding part of Exercise 3), and print out the result at $x = 0.5$. Use $h = 0.1$, then 0.05, then 0.01, then 0.005, then 0.001, and compute the accumulated truncation error at $x = 0.5$ for each case. Is the rate of decrease of the accumulated truncation error, as $h$ decreases, consistent with the fact that Euler's method is a first-order method? Explain.

5. Thus far we have taken the step $h$ to be positive, and therefore developed a solution to the right of the initial point. Is Euler's method valid if we use a negative step, $h < 0$, and hence develop a solution to the left? Explain.

6. We have seen that by discretizing the problem, we can approximate the solution $y(x)$ of a differential equation $y' = f(x, y)$ by a discrete variable $y_n$ by solving

$$y_{n+1} = y_n + f(x_n, y_n)h \tag{6.1}$$

sequentially, for $n = 0, 1, 2, \ldots$. Besides being a numerical algorithm for the calculation of the $y_n$'s, (6.1) is an example of a difference equation governing the sequence of $y_n$'s, just as $y' = f(x, y)$ is a differential equation governing $y(x)$. If $f$ is simple enough it may be possible to solve (6.1) for $y_n$ analytically, and that idea is the focus of this exercise. Specifically, consider $y' = Ay$, where $A$ is a given constant. Then (6.1) becomes

$$y_{n+1} = (1 + Ah)y_n. \tag{6.2}$$

(a) Derive the solution

$$y_n = C(1 + Ah)^n \tag{6.3}$$

of (6.2), where $C$ is the initial value $y_0$, if one is specified.
(b) Show that as $h \to 0$ (6.3) does converge to the solution $Ce^{Ax}$ of the original equation $y' = Ay$. HINT: Begin by expressing $(1 + Ah)^n$ as $e^{\ln(1 + Ah)^n}$. NOTE: Thus, for the simple differential equation $y' = Ay$ we have been able to prove the convergence of the Euler method by actually solving (6.2) for $y_n$, in closed form, then taking the limit of that result as $h \to 0$.
(c) Use computer software to obtain the solution (6.3) of the difference equation (6.2). On Maple, for instance, use the rsolve command.

7. In this section we have taken the step size $h$ to be a constant from one step to the next. Is there any reason why we could not vary $h$ from one step to the next? Explain.

6.3 Improvements: Midpoint Rule and Runge-Kutta

Our objective in this section is to develop more accurate methods than the first-order Euler method, namely, higher-order methods. In particular, we are aiming at the widely used fourth-order Runge-Kutta method, which is an excellent general-purpose differential equation solver. To bridge the gap between these two methods, we begin with some general discussion about how to develop higher-order methods.

6.3.1. Midpoint rule. To derive more accurate differential equation solvers, Taylor series (better yet, Taylor's formula with remainder) offers a useful line of approach. To illustrate, consider the Taylor's formula with remainder,

$$y(x) = y(a) + y'(a)(x - a) + \frac{y''(\xi)}{2!}(x - a)^2, \tag{1}$$

where $\xi$ is some point in $[a, x]$. If we let $x = x_{n+1}$, $a = x_n$, and $x - a = x_{n+1} - x_n = h$, then (1) becomes
$$y(x_{n+1}) = y(x_n) + y'(x_n)h + \frac{y''(\xi)}{2!}h^2. \tag{2}$$

Since $y' = f(x, y)$, we can replace the $y'(x_n)$ in (2) by $f(x_n, y(x_n))$. Also, the last term in (2) can be expressed more simply as $O(h^2)$ so we have

$$y(x_{n+1}) = y(x_n) + f(x_n, y(x_n))h + O(h^2). \tag{3}$$

If we neglect the $O(h^2)$ term and call attention to the approximation thereby incurred by replacing the exact values $y(x_{n+1})$ and $y(x_n)$ by the approximate values $y_{n+1}$ and $y_n$, respectively, then we have the Euler method

$$y_{n+1} = y_n + f(x_n, y_n)h. \tag{4}$$

Since the term that we dropped in passing from (3) to (4) was $O(h^2)$, the local truncation error is $O(h^2)$, and the accumulated truncation error is $O(h)$.

One way to obtain a higher-order method is to retain more terms in the Taylor's formula. For instance, begin with

$$y(x_{n+1}) = y(x_n) + y'(x_n)h + \frac{y''(x_n)}{2}h^2 + \frac{y'''(\xi)}{6}h^3 \tag{5}$$

in place of (2) or, since $y'' = \frac{d}{dx}f(x, y(x)) = f_x + f_y y' = f_x + f_y f$,

$$y(x_{n+1}) = y(x_n) + f(x_n, y(x_n))h + \frac{1}{2}\left[f_x(x_n, y(x_n)) + f_y(x_n, y(x_n))\,f(x_n, y(x_n))\right]h^2 + O(h^3). \tag{6}$$

If we truncate (6) by dropping the $O(h^3)$ term, and change $y(x_{n+1})$ and $y(x_n)$ to $y_{n+1}$ and $y_n$, respectively, then we have the method

$$y_{n+1} = y_n + f(x_n, y_n)h + \frac{1}{2}\left[f_x(x_n, y_n) + f_y(x_n, y_n)\,f(x_n, y_n)\right]h^2 \tag{7}$$

with a local truncation error that is $O(h^3)$ and an accumulated truncation error that is $O(h^2)$; that is, we now have a second-order method.

Why do we say that (7) is a second-order method? Following the same heuristic reasoning as in Section 6.2, the accumulated truncation error $E_n$ is of the order of the local truncation error times the number of steps so

$$E_n = O(h^3)\cdot n = O(h^3)\,\frac{x_n}{h} = O(h^2)\cdot x_n = O(h^2),$$

as claimed. In fact, as a simple rule of thumb it can be said that if the local truncation error is $O(h^p)$, with $p > 1$, then the accumulated truncation error is $O(h^{p-1})$, and one has a $(p-1)$th-order method.
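To make (7) concrete, here is a minimal Python sketch for the test problem $y' = y + 2x - x^2$, $y(0) = 1$ of Section 6.2, for which $f_x = 2 - 2x$ and $f_y = 1$. The function names (`f`, `f_x`, `f_y`, `taylor2_at`) are illustrative assumptions, and the partial derivatives are coded by hand for this particular $f$. Halving $h$ should reduce the error at a fixed $x$ by a factor of roughly four if the method is indeed second order.

```python
import math

def f(x, y):
    return y + 2.0 * x - x * x      # right-hand side of the test problem

def f_x(x, y):
    return 2.0 - 2.0 * x            # partial derivative of f with respect to x

def f_y(x, y):
    return 1.0                      # partial derivative of f with respect to y

def taylor2_at(x_end, h):
    """Second-order Taylor method (7), starting from y(0) = 1."""
    n_steps = round(x_end / h)
    x, y = 0.0, 1.0
    for n in range(n_steps):
        slope = f(x, y)
        # y_{n+1} = y_n + f h + (1/2)(f_x + f_y f) h^2
        y += slope * h + 0.5 * (f_x(x, y) + f_y(x, y) * slope) * h * h
        x = (n + 1) * h
    return y

x_end = 0.5
y_exact = x_end**2 + math.exp(x_end)
err_h = y_exact - taylor2_at(x_end, 0.1)
err_half = y_exact - taylor2_at(x_end, 0.05)
ratio = err_h / err_half

print(f"error with h = 0.10: {err_h:.6f}")
print(f"error with h = 0.05: {err_half:.6f}")
print(f"ratio: {ratio:.2f}   (about 4 is expected for a second-order method)")
```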
Although the second-order convergence of (7) is an improvement over the first-order convergence of Euler's method, the attractiveness of (7) is diminished by an approximately threefold increase in the computing time per step since it requires three function evaluations ($f$, $f_x$, $f_y$) per step whereas Euler's method requires only one ($f$). It's true that to carry out one step of Euler's method we need to evaluate $f$, multiply that by $h$, and add the result to $y_n$, but we can neglect the multiplication by $h$ and addition of $y_n$ on the grounds that a typical $f(x, y)$ involves many more arithmetic steps than that. Thus, as a rule of thumb, one compares the computation time per step of two methods by comparing only the number of function evaluations per step.

Not satisfied with (7) because it requires three function evaluations, let us return to Taylor's formula (5). If we replace $h$ by $-h$, that amounts to making a backward step so the term on the left will be $y(x_{n-1})$ instead of $y(x_{n+1})$. Making those changes, and also copying (5), for comparison, we have

$$y(x_{n-1}) = y(x_n) - y'(x_n)h + \frac{y''(x_n)}{2}h^2 - \frac{y'''(\zeta)}{6}h^3, \tag{8a}$$

$$y(x_{n+1}) = y(x_n) + y'(x_n)h + \frac{y''(x_n)}{2}h^2 + \frac{y'''(\eta)}{6}h^3, \tag{8b}$$

respectively, where $\zeta$ is some point in $[x_{n-1}, x_n]$ and $\eta$ is some point in $[x_n, x_{n+1}]$. Now we can eliminate the bothersome $y''$ terms by subtracting (8a) from (8b). Doing so gives

$$y(x_{n+1}) - y(x_{n-1}) = 2y'(x_n)h + \frac{y'''(\eta) + y'''(\zeta)}{6}h^3$$

or

$$y(x_{n+1}) = y(x_{n-1}) + 2f(x_n, y(x_n))h + O(h^3).$$

Finally, if we drop the $O(h^3)$ term and change $y(x_{n+1})$, $y(x_{n-1})$, $y(x_n)$ to $y_{n+1}$, $y_{n-1}$, $y_n$, respectively, we have

$$y_{n+1} = y_{n-1} + f(x_n, y_n)(2h), \tag{9}$$

which method is known as the midpoint rule. Like (7), the midpoint rule is a second-order method but, unlike (7), it requires only one function evaluation per step. It is an example of a multi-step method because it uses information from more than one of the preceding points, namely, from two: the $n$th and $(n-1)$th.
Thus, it is a two-step method whereas Euler's method and the Taylor series method given by (7) are single-step methods.

A disadvantage of the midpoint rule (and other multi-step methods) is that it is not self-starting. That is, the first step gives $y_1$ in terms of $x_0$, $y_0$, $y_{-1}$, but $y_{-1}$ is not defined. Thus, (9) applies only for $n \ge 1$, and to get the method started we need to compute $y_1$ by a different method. For instance, we could use Euler's method to obtain $y_1$ and then switch over to the midpoint rule (9). Of course, if we do that we should do it not in a single Euler step but in many so as not to degrade the subsequent second-order accuracy.

EXAMPLE 1. Consider the same "test problem" as in Section 6.2,

$$y' = y + 2x - x^2, \qquad y(0) = 1, \qquad (0 \le x < \infty) \tag{10}$$

with the exact solution $y(x) = x^2 + e^x$. Let us use the midpoint rule with $h = 0.1$. To get it started, carry out ten steps of Euler's method with $h = 0.01$. The result of those steps is the approximate solution 1.11358 at $x = 0.1$, which we now take as $y_1$. Then proceeding with the midpoint rule we obtain from (9)

$$y_2 = y_0 + 2(y_1 + 2x_1 - x_1^2)h = 1 + 2(1.11358 + 0.2 - 0.01)(0.1) = 1.26072,$$
$$y_3 = y_1 + 2(y_2 + 2x_2 - x_2^2)h = 1.11358 + 2(1.26072 + 0.4 - 0.04)(0.1) = 1.43772,$$

and so on. The results are shown in Table 1 and contrasted with the less accurate Euler results using the same step size, $h = 0.1$. ∎

Table 1. Comparison of Euler, midpoint rule, and exact solutions of the initial-value problem (10), with h = 0.1.

| x    | Euler   | Midpoint | Exact   |
|------|---------|----------|---------|
| 0.10 | 1.10000 | 1.11358  | 1.11517 |
| 0.20 | 1.22900 | 1.26072  | 1.26140 |
| 0.30 | 1.38790 | 1.43772  | 1.43986 |
| 0.40 | 1.57769 | 1.65026  | 1.65182 |
| 0.50 | 1.79946 | 1.89577  | 1.89872 |

Before leaving the midpoint rule, it is interesting to interpret the improvement in accuracy, from Euler to midpoint, graphically. If we solve

$$y(x_{n+1}) \approx y(x_n) + y'(x_n)h \qquad \text{(Euler)} \tag{11}$$

and

$$y(x_{n+1}) \approx y(x_{n-1}) + 2y'(x_n)h \qquad \text{(midpoint)} \tag{12}$$

for $y'(x_n)$, we have

$$y'(x_n) \approx \frac{y(x_{n+1}) - y(x_n)}{h} \qquad \text{(Euler)} \tag{13}$$

and

$$y'(x_n) \approx \frac{y(x_{n+1}) - y(x_{n-1})}{2h} \qquad \text{(midpoint)} \tag{14}$$

which are difference quotient approximations of the derivative $y'(x_n)$.
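Returning for a moment to Example 1, the midpoint computation can be sketched in Python as follows. The function names are illustrative assumptions; the starting value $y_1$ is generated, as in the example, by ten Euler steps with $h = 0.01$, after which the midpoint rule (9) is applied with $h = 0.1$. The printed values can be compared with the Midpoint column of Table 1.

```python
def f(x, y):
    return y + 2.0 * x - x * x          # right-hand side of (10)

def euler_end(f, x0, y0, h, n_steps):
    """Euler's method; return only the final y value (used here as a starter)."""
    y = y0
    for n in range(n_steps):
        y += f(x0 + n * h, y) * h
    return y

def midpoint(f, x0, y0, y1, h, n_steps):
    """Midpoint rule (9): y_{n+1} = y_{n-1} + 2 h f(x_n, y_n), for n >= 1.

    The rule is not self-starting, so both y_0 and y_1 must be supplied.
    Returns [y_0, y_1, ..., y_{n_steps}].
    """
    ys = [y0, y1]
    for n in range(1, n_steps):
        x_n = x0 + n * h
        ys.append(ys[n - 1] + 2.0 * h * f(x_n, ys[n]))
    return ys

h = 0.1
y1 = euler_end(f, 0.0, 1.0, 0.01, 10)   # starting value at x = 0.1
ys = midpoint(f, 0.0, 1.0, y1, h, 5)    # y_0 through y_5 (x = 0 to 0.5)

for n, y in enumerate(ys):
    print(f"x = {n * h:.2f}   midpoint y = {y:.5f}")
```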
Figure 1. Graphical interpretation of midpoint rule versus Euler.

In Fig. 1, we can interpret (14) and (13) as approximating $y'(x_n)$ by the slopes of the chords $AC$ and $BC$, respectively, while the exact $y'(x_n)$ is the slope of the tangent line $TL$ at $x_n$. We can see from the figure that $AC$ gives a more accurate approximation than $BC$.

6.3.2. Second-order Runge-Kutta. The Runge-Kutta methods are developed somewhat differently. Observe that the low-order Euler method $y_{n+1} = y_n + f(x_n, y_n)h$ amounts to an extrapolation away from the initial point $(x_n, y_n)$ using the slope $f(x_n, y_n)$ at that point. Expecting an average slope to give greater accuracy, one might try the algorithm

$$y_{n+1} = y_n + \frac{1}{2}\left[f(x_n, y_n) + f(x_{n+1}, y_{n+1})\right]h, \tag{15}$$

which uses an average of the slopes at the initial and final points. Unfortunately, the formula (15) does not give $y_{n+1}$ explicitly since $y_{n+1}$ appears not only on the left-hand side but also in the argument of $f(x_{n+1}, y_{n+1})$. Intuition tells us that we should still do well if we replace that $y_{n+1}$ by an estimated value, say, the Euler estimate $y_{n+1} = y_n + f(x_n, y_n)h$. Then the revised version of (15) is

$$y_{n+1} = y_n + \frac{1}{2}\left\{f(x_n, y_n) + f\left[x_{n+1},\, y_n + f(x_n, y_n)h\right]\right\}h. \tag{16}$$

Thus guided initially by intuition, we can put the idea on a rigorous basis by considering a method of the form

$$y_{n+1} = y_n + \left\{a f(x_n, y_n) + b f\left[x_n + \alpha h,\, y_n + \beta f(x_n, y_n)h\right]\right\}h \tag{17}$$

and choosing the adjustable parameters $a, b, \alpha, \beta$ so as to make the order of the method (17) as high as possible; $\alpha, \beta$ determine the second slope location, and $a, b$ determine the "weights" of the two slopes. That is, we seek $a, b, \alpha, \beta$ so that the left- and right-hand sides of

$$y(x_{n+1}) \approx y(x_n) + \left\{a f[x_n, y(x_n)] + b f\left[x_n + \alpha h,\, y(x_n) + \beta f(x_n, y(x_n))h\right]\right\}h \tag{18}$$

agree to as high a degree in $h$ as possible. Thus, expand the left-hand side (LHS) and right-hand side (RHS) of (18) in Taylor series in $h$:

$$\text{LHS} = y(x_n) + y'(x_n)h + \frac{y''(x_n)}{2}h^2 + \cdots = y + fh + \frac{1}{2}(f_x + f_y f)h^2 + \cdots, \tag{19}$$

where $y$ means $y(x_n)$ and the arguments of $f$, $f_x$, $f_y$ are $x_n, y(x_n)$. Similarly (Exercise 9),

$$\text{RHS} = y + (a + b)fh + (\alpha f_x + \beta f f_y)\,b\,h^2 + \cdots. \tag{20}$$

Matching the $h$ terms requires that

$$a + b = 1. \tag{21a}$$

Matching the $h^2$ terms, for any function $f$, requires that

$$\alpha b = \frac{1}{2} \quad \text{and} \quad \beta b = \frac{1}{2}. \tag{21b}$$

The outcome then is that any method (17), with $a, b, \alpha, \beta$ chosen so as to satisfy (21), has a local truncation error that is $O(h^3)$ and is therefore a second-order method [subject to mild conditions on $f$ such as the continuity of $f$ and its first- and second-order partial derivatives so that we can justify the steps in deriving (19) and (20)]. These are the Runge-Kutta methods of second order.¹

For instance, with $a = b = 1/2$ and $\alpha = \beta = 1$ we have

$$y_{n+1} = y_n + \frac{1}{2}\left\{f(x_n, y_n) + f\left[x_{n+1},\, y_n + f(x_n, y_n)h\right]\right\}h, \tag{22}$$

which is usually expressed in the computationally convenient form

$$y_{n+1} = y_n + \frac{1}{2}(k_1 + k_2), \qquad k_1 = hf(x_n, y_n), \qquad k_2 = hf(x_{n+1}, y_n + k_1). \tag{23}$$

¹The Runge-Kutta method was originated by Carl D. Runge (1856-1927), a German physicist and mathematician, and extended by the German aerodynamicist and mathematician M. Wilhelm Kutta (1867-1944). Kutta is well known for the Kutta-Joukowski formula for the lift on an airfoil, and for the "Kutta condition" of classical airfoil theory.

To understand this result, note that Euler's method would give $y_{n+1} = y_n + f(x_n, y_n)h$. If we denote that Euler estimate as $y_{n+1}^{\text{Euler}}$ then (22) can be expressed as

$$y_{n+1} = y_n + \frac{f(x_n, y_n) + f\left(x_{n+1}, y_{n+1}^{\text{Euler}}\right)}{2}\,h.$$

That is, we take a tentative step using Euler's method, then we average the slopes at the initial point $x_n, y_n$ and at the Euler estimate of the final point $x_{n+1}, y_{n+1}^{\text{Euler}}$, and then make another Euler step, this time using the improved (average) slope. For this reason (23) is also known as the improved Euler method.

A different choice, $a = 0$, $b = 1$, $\alpha = \beta = 1/2$, gives what is known as the modified Euler method.

EXAMPLE 2.
Let us proceed through the first two steps of the improved Euler method (23) for the same test problem as was used in Example 1,

$$y' = y + 2x - x^2; \qquad y(0) = 1, \qquad (0 \le x < \infty) \tag{24}$$

with $h = 0.1$; a more detailed illustration is given in Section 6.3.3 below. Here, $f(x, y) = y + 2x - x^2$.

$n = 0$:
$$k_1 = hf(x_0, y_0) = 0.1\left[1 + 0 - (0)^2\right] = 0.1,$$
$$k_2 = hf(x_1, y_0 + k_1) = 0.1\left[(1 + 0.1) + 2(0.1) - (0.1)^2\right] = 0.129,$$
$$y_1 = y_0 + \tfrac{1}{2}(k_1 + k_2) = 1 + 0.5(0.1 + 0.129) = 1.1145;$$

$n = 1$:
$$k_1 = hf(x_1, y_1) = 0.1\left[1.1145 + 2(0.1) - (0.1)^2\right] = 0.13045,$$
$$k_2 = hf(x_2, y_1 + k_1) = 0.1\left[(1.1145 + 0.13045) + 2(0.2) - (0.2)^2\right] = 0.160495,$$
$$y_2 = y_1 + \tfrac{1}{2}(k_1 + k_2) = 1.1145 + 0.5(0.13045 + 0.160495) = 1.2600,$$

compared with the values $y(x_1) = y(0.1) = 1.1152$ and $y(x_2) = y(0.2) = 1.2614$ obtained from the known exact solution $y(x) = x^2 + e^x$. ∎

6.3.3. Fourth-order Runge-Kutta. Using this idea of a weighted average of slopes at various points in the $x, y$ plane, with the weights and locations determined so as to maximize the order of the method, one can derive higher-order Runge-Kutta methods as well, although the derivations are quite tedious. One of the most commonly used is the fourth-order Runge-Kutta method:

$$y_{n+1} = y_n + \frac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4),$$
$$k_1 = hf(x_n, y_n), \qquad k_2 = hf\left(x_n + \frac{h}{2},\, y_n + \frac{1}{2}k_1\right),$$
$$k_3 = hf\left(x_n + \frac{h}{2},\, y_n + \frac{1}{2}k_2\right), \qquad k_4 = hf(x_{n+1}, y_n + k_3), \tag{25}$$

which we give without derivation. Here the effective slope used is a weighted average of the slopes at the four points $(x_n, y_n)$, $(x_n + h/2, y_n + k_1/2)$, $(x_n + h/2, y_n + k_2/2)$, and $(x_{n+1}, y_n + k_3)$ in the $x, y$ plane, an average because the sum of the coefficients 1/6, 2/6, 2/6, 1/6 that multiply the $k$'s is 1. Similarly, the sum of the coefficients 1/2, 1/2 in the second-order version (23) is 1 as well.

EXAMPLE 3.
As a summarizing illustration, we solve another "test problem,"

$$y' = -y, \qquad y(0) = 1 \tag{26}$$

by each of the methods considered, using a step size of $h = 0.05$ and single precision arithmetic (on the computer used that amounts to carrying eight significant figures; double precision would carry 16). The results are given in Table 2, together with the exact solution $y(x) = e^{-x}$ for comparison; 0.529E+2, for instance, means $0.529 \times 10^2$. The value of $y_1$ for the midpoint rule was obtained by Euler's method with a reduced step size of 0.0025. To illustrate the fourth-order Runge-Kutta calculation, let us proceed through the first step:

$n = 0$:
$$k_1 = hf(x_0, y_0) = -hy_0 = -0.05(1) = -0.05,$$
$$k_2 = hf\left(x_0 + \tfrac{h}{2},\, y_0 + \tfrac{1}{2}k_1\right) = -h\left(y_0 + \tfrac{1}{2}k_1\right) = -0.05(1 - 0.025) = -0.04875,$$
$$k_3 = hf\left(x_0 + \tfrac{h}{2},\, y_0 + \tfrac{1}{2}k_2\right) = -h\left(y_0 + \tfrac{1}{2}k_2\right) = -0.05(1 - 0.024375) = -0.04878125,$$
$$k_4 = hf(x_1, y_0 + k_3) = -h(y_0 + k_3) = -0.05(1 - 0.04878125) = -0.047560938,$$
$$y_1 = y_0 + \tfrac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4) = 0.95122943,$$

which final result does agree with the corresponding entry in Table 2. Actually, there is a discrepancy of 2 in the last digit, but an error of that size is not unreasonable in view of the fact that the machine used carried only eight significant figures. Most striking is the excellent accuracy of the fourth-order Runge-Kutta method, with six significant figure accuracy over the entire calculation.

COMMENT.
We see that the midpoint rule and the second-order Runge-Kutta method yield comparable results initially, but the midpoint rule eventually develops an error that oscillates in sign, from step to step, and grows in magnitude. The reason for this strange (and incorrect) behavior will be studied in Section 6.5. ∎

Table 2.

| x     | Euler         | Midpoint       | 2nd-order Runge-Kutta | 4th-order Runge-Kutta | Exact = e^{-x} |
|-------|---------------|----------------|-----------------------|-----------------------|----------------|
| 0.00  | 0.10000000E+1 | 0.10000000E+1  | 0.10000000E+1         | 0.10000000E+1         | 0.10000000E+1  |
| 0.05  | 0.94999999E+0 | 0.95116991E+0  | 0.95125002E+0         | 0.95122945E+0         | 0.95122945E+0  |
| 0.10  | 0.90249997E+0 | 0.90488303E+0  | 0.90487659E+0         | 0.90483743E+0         | 0.90483743E+0  |
| 0.15  | 0.85737497E+0 | 0.86068159E+0  | 0.86076385E+0         | 0.86070800E+0         | 0.86070800E+0  |
| 0.20  | 0.81450623E+0 | 0.81881487E+0  | 0.81880164E+0         | 0.81873077E+0         | 0.81873077E+0  |
| 0.25  | 0.77378094E+0 | 0.77880013E+0  | 0.77888507E+0         | 0.77880079E+0         | 0.77880079E+0  |
| 0.30  | 0.73509192E+0 | 0.74093485E+0  | 0.74091440E+0         | 0.74081820E+0         | 0.74081820E+0  |
| ...   | ...           | ...            | ...                   | ...                   | ...            |
| 2.00  | 0.12851217E+0 | 0.13573508E+0  | 0.13545239E+0         | 0.13533530E+0         | 0.13533528E+0  |
| 2.05  | 0.12208656E+0 | 0.12853225E+0  | 0.12884909E+0         | 0.12873492E+0         | 0.12873492E+0  |
| 2.10  | 0.11598223E+0 | 0.12288185E+0  | 0.12256770E+0         | 0.12245644E+0         | 0.12245644E+0  |
| 2.15  | 0.11018312E+0 | 0.11624406E+0  | 0.11659253E+0         | 0.11648417E+0         | 0.11648415E+0  |
| 2.20  | 0.10467397E+0 | 0.11125745E+0  | 0.11090864E+0         | 0.11080316E+0         | 0.11080315E+0  |
| 2.25  | 0.99440269E-1 | 0.10511832E+0  | 0.10550185E+0         | 0.10539923E+0         | 0.10539922E+0  |
| 2.30  | 0.94468258E-1 | 0.10074562E+0  | 0.10035863E+0         | 0.10025885E+0         | 0.10025885E+0  |
| ...   | ...           | ...            | ...                   | ...                   | ...            |
| 5.00  | 0.59205294E-2 | 0.12618494E-1  | 0.67525362E-2         | 0.67379479E-2         | 0.67379437E-2  |
| 5.05  | 0.56245029E-2 | 0.25511871E-3  | 0.64233500E-2         | 0.64093345E-2         | 0.64093322E-2  |
| 5.10  | 0.53432779E-2 | 0.12592983E-1  | 0.61102118E-2         | 0.60967477E-2         | 0.60967444E-2  |
| 5.15  | 0.50761141E-2 | -0.10041796E-2 | 0.58123390E-2         | 0.57994057E-2         | 0.57994043E-2  |
| 5.20  | 0.48223082E-2 | 0.12693400E-1  | 0.55289874E-2         | 0.55165654E-2         | 0.55165626E-2  |
| 5.25  | 0.45811930E-2 | -0.22735195E-2 | 0.52594491E-2         | 0.52475194E-2         | 0.52475161E-2  |
| 5.30  | 0.43521333E-2 | 0.12920752E-1  | 0.50030509E-2         | 0.49915947E-2         | 0.49915928E-2  |
| ...   | ...           | ...            | ...                   | ...                   | ...            |
| 9.70  | 0.47684727E-4 | 0.64383668E+0  | 0.61541170E-4         | 0.61283507E-4         | 0.61283448E-4  |
| 9.75  | 0.45300490E-4 | -0.67670959E+0 | 0.58541038E-4         | 0.58294674E-4         | 0.58294663E-4  |
| 9.80  | 0.43035467E-4 | 0.71150762E+0  | 0.55687164E-4         | 0.55451608E-4         | 0.55451590E-4  |
| 9.85  | 0.40883693E-4 | -0.74786037E+0 | 0.52972413E-4         | 0.52747200E-4         | 0.52747171E-4  |
| 9.90  | 0.38839509E-4 | 0.78629363E+0  | 0.50390008E-4         | 0.50174691E-4         | 0.50174654E-4  |
| 9.95  | 0.36897534E-4 | -0.82648975E+0 | 0.47933496E-4         | 0.47727641E-4         | 0.47727597E-4  |
| 10.00 | 0.35052657E-4 | 0.86894262E+0  | 0.45596738E-4         | 0.45399935E-4         | 0.45399931E-4  |

Of course, in real applications we do not have the exact solution to compare with the numerical results. In that case, how do we know whether or not our results are sufficiently accurate? A useful rule of thumb, mentioned in Section 6.2, is to redo the entire calculation, each time with a smaller step size, until the results "settle down" to the desired number of significant digits.

Thus far we have taken $h$ to be a constant, for simplicity, but there is no reason why it cannot be varied from one step to the next. In fact, there may be a compelling reason to do so. For instance, consider the equation $y' + y = \tanh 20x$ on $-10 \le x \le 10$. The function $\tanh 20x$ is almost a constant, except near the origin, where it varies dramatically, approximately from $-1$ to $+1$. Thus, we need a very fine step size $h$ near the origin for good accuracy, but to use that $h$ over the entire $x$ interval would be wasteful in terms of computer time and expense.
One can come up with a rational scheme for varying the step size to maintain a consistent level of accuracy, but such refinements are already available within existing software. For example, the default numerical differential equation solver in Maple is a “fourth-fifth order Runge—Kutta—Fehlbergmethod” denoted as RKF45 in the literature. According a tentative step is made, first using a fourth- to RKF45, order Runge-Kutta method, and then again using a fifth-order Runge-Kutta method. If the two results agree to a prespecified number of significant digits, then the fifthorder result is accepted. If they agree to more than that number of significant digits, then / is increased and the next step is made. If they agree to less than that number of significant digits, then A is decreased and the step is repeated. 6.3.4. Empirical estimate of the order. (Optional) The relative accuracies achieved by the different methods, as seen from the results in Table 2, strikingly reveal the importance of the order of the method. Thus, it is important to know how to verify theorder of whatever method we use, if only as a partial check on the programming. Recall that by a method being of order p we mean that at any chosen «xthe error behaves as CAP for some constant C: (27) Yexact ~ Yeomp ™ Ch? as h —»0. Suppose we wish to check the order of a given method. Select a test problem such as the one in Example 3, and use the method to compute y at any x point such as z = 1, for two different h’s say hy and hg. Letting ysornp and yoorn denote the y’s computed at « = 1 using step sizes of hy and he, respectively, we have (4) Yexact —~Yeomp ~ Chi, ↕ ∕ ∕ ≥ Chi. ↨−− D ∏ Dividing one equation by the other, to cancel the unknown C, and solving for p, 308 gives l Yexact~ usormp per In| -~ ey Yexact~ Ycomp| (28) In A 0.1 and hy = 0.05. The results at z = 1 are = 0.348678440100 hy=0.1, —yShp yap = 0.358485922409 ho=sn ? 
’ 0.367879441171, (28) gives p = 1.08, which is respectably and since Yexact(1) = close to 1. We should be able to obtain a more accurate es imate of p by using gives p ~ 1.01. Using those same step sizes, we also obtain p = 2.05, 2.02, and rule, second-order Runge-Kutta, and fourth-order RungeKutta methods, respectively. Why not use even smaller 4’s to determine p more accura arise. One is thatas the h’s aredecreasedthe computed solutions become more and 4.03 for the midpoint moreaccurate,and theyexact—YeompandYexact— Yeomp&ifferencesin (28)are known to fewer and fewer significant figures, due to cancelation. This is especially true for a high-order method. The other difficulty is that (27) applies to the truncation error alone so, implicit in our use of (27) is the assumption that roundoff errors are negligible. If we make h too small, that assumption may become invalid. For both of these reasons it is important to use extended precision for such calculations, as we have for the preceding calculations. 6.3.5. Multi-step ¥ parabolic fit and predictor-corrector methods. ods known as Adams—Bashforth from @, to Gp+y: . | ∏− ∶ y' dx -/ ↓ f (a, y(x)) dx aun or ‘Ent bk y(@n41) = y(Ln) + | m =2. We’ve already methods, obtained by integrating y’ = f(z, y) JEn Figure 2. Adams—Bashforth interpolationof f for the case (Optional) called attention to the multi-step nature of the midpoint rule. Our purpose in this optional section is to give a brief overview of a class of multi-step meth- f (a. y(2)) dz. (29) (30) Jen To evaluatethe integral, we fit f (a, y(a)) with a polynomial of degreem, which is readily integrated. The polynomial f (x, y(w)) at the m + 1 points am, interpolates (i.e., takes on the same values as) -..,Un—1, Un as illustrated in Fig. 2 for the case m = 2. As the simplest case, let m = 0. Then the zeroth degree polynomial f,, denotes approximation of f (a, y(x)) on [ap,en4i] is f (2, y(x)) © fn, where f(2n, Yn), and (30) gives the familiar Euler method yn. 
$= y_n + f_n h$.

Omitting the steps in this overview, we state that with $m = 3$ one obtains the fourth-order Adams-Bashforth method

$$y_{n+1} = y_n + \frac{h}{24}\left(55f_n - 59f_{n-1} + 37f_{n-2} - 9f_{n-3}\right) \qquad (31a)$$

with a local truncation error of

$$(e_n)_{AB} = \frac{251}{720}\, y^{(\mathrm{v})}(\xi)\, h^5 \qquad (31b)$$

for some $\xi$ in $[x_{n-3}, x_n]$. We can see that (31a) is not self-starting; it applies only for $n = 3, 4, \ldots$, so the first three steps (for $n = 0, 1, 2$) need to be carried out by some other method.

Suppose that instead of interpolating $f$ at $x_{n-m}, \ldots, x_{n-1}, x_n$ we interpolate at $x_{n-m+1}, \ldots, x_n, x_{n+1}$. With $m = 3$, again, one obtains the fourth-order Adams-Moulton method

$$y_{n+1} = y_n + \frac{h}{24}\left(9f_{n+1} + 19f_n - 5f_{n-1} + f_{n-2}\right) \qquad (32a)$$

with a local truncation error of

$$(e_n)_{AM} = -\frac{19}{720}\, y^{(\mathrm{v})}(\xi)\, h^5, \qquad (32b)$$

where the $\xi$'s in (31b) and (32b) are different, in general.

Although both methods are fourth order, the Adams-Moulton method is more accurate because the constant factor in $(e_n)_{AM}$ is roughly thirteen times smaller than the constant in $(e_n)_{AB}$. This increase in accuracy occurs because the points $x_{n-m+1}, \ldots, x_n, x_{n+1}$ are more centered on the interval of integration (from $x_n$ to $x_{n+1}$) than the points $x_{n-m}, \ldots, x_{n-1}, x_n$. On the other hand, the term $f_{n+1} = f(x_{n+1}, y_{n+1})$ in (32a) is awkward because the argument $y_{n+1}$ is not yet known! [If $f$ is linear in $y$, then (32a) can be solved for $y_{n+1}$ by simple algebra, and the awkwardness disappears.] Thus, the method (32a) is said to be of closed type, whereas (31a) and all of our preceding methods have been of open type.

To handle this difficulty, it is standard practice to solve closed formulas by iteration. Using superscripts to indicate the iterate, (32a) becomes

$$y_{n+1}^{(k+1)} = y_n + \frac{h}{24}\left[9f\left(x_{n+1}, y_{n+1}^{(k)}\right) + 19f_n - 5f_{n-1} + f_{n-2}\right]. \qquad (33)$$

To start the iteration, we compute $y_{n+1}^{(0)}$ from a predictor formula, with subsequent corrections made by the corrector formula (33).
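The predictor-corrector idea can be made concrete with a short program. The following is only a minimal Python sketch under stated assumptions: the names `rk4_step` and `ab_am4` are hypothetical, the three starting values are generated with the fourth-order Runge-Kutta method (the text asks only for "some other method"), the corrector (33) is applied once per step, and the test problem $y' = -y$, $y(0) = 1$ is chosen purely for illustration.

```python
import math

def rk4_step(f, x, y, h):
    """One classical fourth-order Runge-Kutta step (used here only for start-up)."""
    k1 = h * f(x, y)
    k2 = h * f(x + h / 2, y + k1 / 2)
    k3 = h * f(x + h / 2, y + k2 / 2)
    k4 = h * f(x + h, y + k3)
    return y + (k1 + 2 * k2 + 2 * k3 + k4) / 6

def ab_am4(f, x0, y0, h, n_steps):
    """Adams-Bashforth predictor (31a) followed by one Adams-Moulton correction (33)."""
    xs = [x0 + i * h for i in range(n_steps + 1)]
    ys = [y0]
    # (31a) is not self-starting: take the first three steps with RK4 (an assumption).
    for i in range(min(3, n_steps)):
        ys.append(rk4_step(f, xs[i], ys[i], h))
    fs = [f(x, y) for x, y in zip(xs, ys)]  # stored slopes f_0, ..., f_3
    for n in range(3, n_steps):
        # Predictor: fourth-order Adams-Bashforth, equation (31a).
        y_pred = ys[n] + h / 24 * (55 * fs[n] - 59 * fs[n - 1]
                                   + 37 * fs[n - 2] - 9 * fs[n - 3])
        # Corrector: fourth-order Adams-Moulton, equation (33), applied once.
        y_corr = ys[n] + h / 24 * (9 * f(xs[n + 1], y_pred) + 19 * fs[n]
                                   - 5 * fs[n - 1] + fs[n - 2])
        ys.append(y_corr)
        fs.append(f(xs[n + 1], y_corr))  # slope stored for later steps
    return xs, ys

# Illustrative test problem (an assumption): y' = -y, y(0) = 1, exact solution exp(-x).
xs, ys = ab_am4(lambda x, y: -y, 0.0, 1.0, 0.1, 10)
error_at_1 = abs(ys[-1] - math.exp(-1.0))
```

Each pass through the main loop calls `f` twice, once inside the corrector and once to store the new slope; the older slopes are reused from the list `fs`.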
It is recommended that the predictor and corrector formulas be of the same order (certainly, the corrector should never be of lower order than the predictor), with the corrector applied only once. Thus, the Adams-Bashforth and Adams-Moulton methods constitute a natural predictor-corrector pair, with "AB" as the predictor and "AM" as the corrector.

Why might we choose the fourth-order AB-AM predictor-corrector over the Runge-Kutta method of the same order, or vice versa? On the negative side, AB-AM is not self-starting, it requires the storage of $f_{n-3}$, $f_{n-2}$, and $f_{n-1}$, and it is more tedious to program. On the other hand, it involves only two function evaluations per step (namely, $f_n$ and $f_{n+1}$) if the corrector is applied only once, whereas Runge-Kutta involves four. Thus, if $f(x, y)$ is reasonably complicated then we can expect AB-AM to be almost twice as fast. In large-scale computing, the savings can be significant.

Closure. Motivated to seek higher-order methods than the first-order Euler method, we use a Taylor series approach to obtain the second-order midpoint rule. Though more accurate, a disadvantage of the midpoint rule is that it is not self-starting. Pursuing a different approach, we look at the possibility of using a weighted average of slopes at various points in the $x, y$ plane, with the weights and locations determined so as to maximize the order of the method. We thereby derive the second-order Runge-Kutta method and present, without derivation, the fourth-order Runge-Kutta method. The latter is widely used because it is accurate and self-starting. Because of the importance of the order of a given method, we suggest that the order be checked empirically using a test problem with a known exact solution. The resulting approximate expression for the order is given by (28).
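As a quick illustration of how (28) is used in practice, the following minimal Python sketch applies it to the Euler method. The test problem $y' = -y$, $y(0) = 1$ (exact solution $e^{-x}$), the step sizes 0.1 and 0.05, and the function names `euler` and `estimate_order` are assumptions made for the sketch.

```python
import math

def euler(f, x0, y0, h, x_end):
    """Euler's method from x0 to x_end with fixed step h; returns y at x_end."""
    n = round((x_end - x0) / h)
    x, y = x0, y0
    for _ in range(n):
        y += h * f(x, y)
        x += h
    return y

def estimate_order(y_exact, y1, y2, h1, h2):
    """Empirical order p from equation (28)."""
    return math.log(abs((y_exact - y1) / (y_exact - y2))) / math.log(h1 / h2)

f = lambda x, y: -y                      # assumed test problem y' = -y, y(0) = 1
y1 = euler(f, 0.0, 1.0, 0.1, 1.0)        # computed y(1) with h1 = 0.1
y2 = euler(f, 0.0, 1.0, 0.05, 1.0)       # computed y(1) with h2 = 0.05
p = estimate_order(math.exp(-1.0), y1, y2, 0.1, 0.05)
```

For a first-order method such as Euler's, `p` should come out close to 1; swapping `euler` for a higher-order routine gives the corresponding check for that method.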
In the final section we return to the idea of multi-step methods and present a brief overview of the Adams-Bashforth methods, derived most naturally from an approximate integral approach. Though not self-starting, the fourth-order Adams-Bashforth method (31a) is faster than the Runge-Kutta method of the same order because it requires only one function evaluation per step (namely, $f_n$; the $f_{n-1}$, $f_{n-2}$, and $f_{n-3}$ terms are stored from previous steps). A further refinement consists of predictor-corrector variations of the Adams-Bashforth methods. However, we stress that such refinements become worthwhile only if the scope of the computational effort becomes large enough to justify the additional inconvenience caused by such features as the absence of self-starting and predictor-corrector iteration. Otherwise, one might as well stick to a simple and accurate method such as fourth-order Runge-Kutta.

Computer software. Computer-software systems such as Maple include numerical differential equation solvers. In Maple one can use the dsolve command together with a numeric option. The default numerical solution method is the RKF45 method mentioned above. Note that with the numeric option of dsolve one does not specify a step size $h$, since that choice is controlled within the program and, in general, is varied from step to step to maintain a certain level of accuracy. To specify the absolute error tolerance one can use an additional option called abserr, which is formatted as abserr = Float(1,2-digits) and which means 1 times 10 to the one- or two-digit exponent. For instance, to solve

$$y' = -y; \qquad y(0) = 1$$

for $y(x)$ with an absolute error tolerance of $1 \times 10^{-5}$, and to print the results at $x = 2, 10$, enter

    with(DEtools):

and return. Then enter

    dsolve({diff(y(x), x) = -y(x), y(0) = 1}, y(x), type = numeric, value = array([2, 10]), abserr = Float(1, -5));

and return. The printed result is

    [x, y(x)]
    2.     .1353337989380555
    10.    .00004501989255717160

For comparison, the exact values are $y(2) = \exp(-2) = 0.1353352832$ and $y(10) = \exp(-10) = 0.0000453999$, respectively.
EXERCISES 6.3

1. Evaluate $y_1$ and $y_2$ by hand, by the second-order and fourth-order Runge-Kutta methods, with $h = 0.02$. Obtain the exact values $y(x_1)$ and $y(x_2)$ as well.
(a) $y' = 3000xy^{-2}$; $y(0) = $ …
(b) $y' = 40xe^{-y}$; $y(0) = $ …
(c) $y' = x + y$; $y(-1) = 5$
(d) $y' = -y\tan x$; $y(1) = $ …
(e) $y' = (y^2 + 1)/4$; $y(0) = $ …
(f) $y' = -2y\sin x$; $y(2) = $ …

2. (a)-(f) Program the second- and fourth-order Runge-Kutta methods and use them to solve the initial-value problem given in the corresponding part of Exercise 1 but with the initial condition $y(0) = 1$. Use $h = 0.05$. Print out all computed values of $y$, up to $x = 0.5$, as well as the exact solution.

3. … the order of the given method. Use (28), with $h = 0.1$ and $0.05$, say. Do the evaluation at two different locations, such as $x = 1$ and $x = 2$. (The order should not depend upon $x$, so your results at the two points should be almost identical.)
(a) Euler's method
(b) Second-order Runge-Kutta method
(c) Fourth-order Runge-Kutta method

4. (Liquid level) Liquid is pumped into a tank of horizontal cross-sectional area $A$ (m$^2$) at a rate $Q$ (liters/sec), and is drained by a valve at its base as sketched in the figure. According to Bernoulli's principle, the efflux velocity $v(t)$ is approximately $\sqrt{2gx(t)}$, where $g$ is the acceleration of gravity. Thus, a mass balance gives

$$A x'(t) = Q(t) - B v(t) \qquad (4.1)$$

where $B$ is the cross-sectional area of the efflux pipe. For definiteness, suppose that $A = 1$ and $B\sqrt{2g} = 0.01$ so

$$x' = Q(t) - 0.01\sqrt{x}. \qquad (4.2)$$

We wish to know the depth $x(t)$ at the end of 10 minutes ($t = 600$ sec), 20 minutes, ..., up to one hour. Program the computer solution of (4.2) by the second-order Runge-Kutta method for the following cases, and use it to solve for those $x$ values: $x(600), x(1200), \ldots, x(3600)$. (Using the rule of thumb given below … 3, reduce $h$ until those results settle down to four significant digits.)
(a) $Q(t) = 0.02$; $x(0) = 0$
(b) $Q(t) = 0.02$; $x(0) = $ …
(c) $Q(t) = 0.02$; $x(0) = $ …
(d) $Q(t) = 0.02$; $x(0) = $ …
(e) $Q(t) = 0.02(1 - e^{…})$; $x(0) = 0$
(f) $Q(t) = 0.02(1 - e^{-0.004t})$; $x(0) = $ …
(g) $Q(t) = 0.02t$; $x(0) = $ …
(h) $Q(t) = 0.02(1 + \sin 0.1t)$; $x(0) = 0$
NOTE: Surely, we will need $h$ to be small compared to the period $20\pi$ of $Q(t)$ in part (h).

5. (a)-(h) (Liquid level) Same as Exercise 4, but use fourth-order Runge-Kutta instead of second order.

6. (a)-(h) (Liquid level) Same as Exercise 4, but use computer software to do the numerical solution of the differential equation. In Maple, for instance, the dsolve command uses the fourth-fifth order RKF45 method.

7. (Liquid level) (a) For the case where $Q(t)$ is a constant, derive the general solution of (4.2) in Exercise 4 as

$$Q - 0.01\sqrt{x} - Q\ln\left(Q - 0.01\sqrt{x}\right) = 0.00005t + C, \qquad (7.1)$$

where $C$ is the constant of integration.
(b) Evaluate $C$ in (7.1) if $Q = 0.02$ and $x(0) = 0$. Then, solve (7.1) for $x(t)$ at $t = 600, 1200, \ldots, 3600$. NOTE: Unfortunately, (7.1) is in implicit rather than explicit form, but you can use computer software to solve. In Maple, for instance, the relevant command is fsolve.

8. Suppose that we have a convergent method, with $E_n \sim Ch^p$ as $h \to 0$. Someone offers to improve the method by either halving $C$ or by doubling $p$. Which would you choose? Explain.

9. Expand the right-hand side of (18) in a Taylor series in $h$ and show that the result is as given in (20). HINT: To expand the $f(x_n + \alpha h, y_n + \beta f h)$ term you need to use chain differentiation.

10. (a) Program the fourth-order Runge-Kutta method (25) and use it to run the test problem (10) and to compute $y$ at $x = 1$ using $h = 0.05$ and then $h = 0.02$. From those values and the known exact solution, empirically verify that the method is fourth order.
(b) To see what harm a programming error can cause, change the $x_n + h/2$ in the formula for $k_2$ to $x_n$, repeat the two evaluations of $y$ at $x = 1$ using $h = 0.05$ and $h = 0.02$, and empirically determine the order of the method. Is it still a fourth-order method?
(c) Instead of introducing the programming error suggested in part (b), suppose we change the coefficient of $k_3$ in $y_{n+1} = y_n + \frac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4)$ from 2 to 3. Do you think the method will still be convergent? Explain.

11. (Rectangular, trapezoidal, and Simpson's rule) Consider the special case where $f$ in $y' = f$ is a function of $x$ only. Integrating $y' = f(x)$ from $x_n$ to $x_{n+1}$, we have

$$y(x_{n+1}) = y(x_n) + \int_{x_n}^{x_{n+1}} f(x)\,dx. \qquad (11.1)$$

If we fit $f(x)$, over $[x_n, x_{n+1}]$, with a zeroth-degree polynomial (i.e., a constant) that interpolates $f$ at $x_n$, then we have $f(x) \approx f(x_n)$, and (11.1) gives $y(x_{n+1}) \approx y(x_n) + f(x_n)h$ and hence the Euler method $y_{n+1} = y_n + f(x_n)h$.
(a) Show that if we fit $f(x)$, over $[x_n, x_{n+1}]$, with a first-degree polynomial (a straight line) that interpolates $f$ at $x_n$ and $x_{n+1}$, then $f(x) \approx f(x_n) + [f(x_{n+1}) - f(x_n)](x - x_n)/h$. Putting that approximation into (11.1), derive the approximation

$$y(x_{n+1}) \approx y(x_n) + \frac{1}{2}\left[f(x_n) + f(x_{n+1})\right]h, \qquad (11.2)$$

and show that (for the special case where $f$ is a function of $x$ only) (11.2) is identical to the second-order Runge-Kutta method (23).
(b) Show that if we fit $f(x)$, over $[x_n, x_{n+1}]$, with a second-degree polynomial (a parabola) that interpolates $f$ at $x_n$, $x_n + h/2$, and $x_{n+1} = x_n + h$, and put that approximation into (11.1), then one obtains

$$y(x_{n+1}) \approx y(x_n) + \frac{1}{6}\left[f(x_n) + 4f(x_n + h/2) + f(x_{n+1})\right]h, \qquad (11.3)$$

and show that (for the case where $f$ is a function of $x$ only) (11.3) is identical to the fourth-order Runge-Kutta method (25). NOTE: These three results amount to the well-known rectangular, trapezoidal, and Simpson's rules of numerical integration for a single interval of size $h$. If we sum over all of the intervals, they take the forms

$$\int_a^b f(x)\,dx \approx \left[f(a) + f(a+h) + \cdots + f(b-h)\right]h,$$
$$\int_a^b f(x)\,dx \approx \left[f(a) + 2f(a+h) + 2f(a+2h) + \cdots + 2f(b-h) + f(b)\right]\frac{h}{2},$$
$$\int_a^b f(x)\,dx \approx \left[f(a) + 4f(a+h) + 2f(a+2h) + 4f(a+3h) + \cdots + 4f(b-h) + f(b)\right]\frac{h}{3}, \qquad (11.4)$$

respectively. In passing from (11.3) to the last line of (11.4) we have replaced $h/2$ by $h$ everywhere in (11.3), so that the revised (11.3) reads $y_{n+2} = y_n + [f(x_n) + 4f(x_{n+1}) + f(x_{n+2})]h/3$, where $x_n = a + nh$.
For the rectangular and trapezoidal rules the number of subdivisions, $(b - a)/h$, can be even or odd, but for Simpson's rule it must be even. The order of the error for these integration methods is $O(h)$, $O(h^2)$, and $O(h^4)$, respectively.

12. (a) Using $m = 1$, derive from (30) the Adams-Bashforth method

$$y_{n+1} = y_n + \frac{h}{2}\left(3f_n - f_{n-1}\right). \qquad (12.1)$$

(b) Determine the order of the method (12.1) empirically by using it to solve the test problem (10), at $x = 1$, with two different step sizes, and then using (28).

13. This exercise is to take you through the fourth-order AB-AM predictor-corrector scheme.
(a) For the problem $y' = 2xy$, $y(0) = 1$, compute $y_1, y_2, y_3$ from the exact solution, with $h = 0.1$, and use those as starting values to determine $y_4$, by hand, by means of the fourth-order AB-AM predictor-corrector scheme given by (31a) and (33). Apply the corrector three times.
(b) Continuing in the same way, determine $y_5$.

Consider the system

$$u'(x) = f(x, u, v); \qquad u(a) = u_0 \qquad (1a)$$
$$v'(x) = g(x, u, v); \qquad v(a) = v_0. \qquad (1b)$$

We apply the Euler method to each of the problems (1a) and (1b) as follows:

$$u_{n+1} = u_n + f(x_n, u_n, v_n)h,$$
$$v_{n+1} = v_n + g(x_n, u_n, v_n)h \qquad (2)$$

for $n = 0, 1, 2, \ldots$. Equations (2) are coupled (since each involves both $u$ and $v$), but $u_n$ and $v_n$ are both known from the preceding step.

EXAMPLE 1. Consider the system

$$u' = x + v; \qquad u(0) = 0$$
$$v' = uv^2; \qquad v(0) = 1. \qquad (3)$$

The latter looks fairly simple, but it is not. It is nonlinear because of the $uv^2$ term. Turning to numerical solution using the Euler method (2), let $h = 0.1$, say, and let us go through the first couple of steps. First, $u_0 = 0$ and $v_0 = 1$ from the initial conditions. Then,

$n = 0$:
$u_1 = u_0 + (x_0 + v_0)h = 0 + (0 + 1)(0.1) = 0.1,$
$v_1 = v_0 + u_0 v_0^2 h = 1 + (0)(1)^2(0.1) = 1.$

$n = 1$:
$u_2 = u_1 + (x_1 + v_1)h = 0.1 + (0.1 + 1)(0.1) = 0.21,$
$v_2 = v_1 + u_1 v_1^2 h = 1 + (0.1)(1)^2(0.1) = 1.01,$

and so on. ■

Similarly, if the system contains more than two equations.

Next, we show how to adapt the fourth-order Runge-Kutta method to the system (1).
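Before the Runge-Kutta version, the Euler steps of Example 1 can be reproduced with a few lines of code. This is a minimal Python sketch, assuming the system is $u' = x + v$, $v' = uv^2$ with $u(0) = 0$, $v(0) = 1$ and $h = 0.1$, as read off from the arithmetic of the example; the function name `euler_system` is hypothetical.

```python
def euler_system(f, g, x0, u0, v0, h, n_steps):
    """Euler's method (2) for the pair u' = f(x, u, v), v' = g(x, u, v)."""
    x, u, v = x0, u0, v0
    history = [(x, u, v)]
    for _ in range(n_steps):
        # Both slopes use values from the preceding step, so the two
        # updates must be computed together before either is overwritten.
        du = f(x, u, v)
        dv = g(x, u, v)
        u, v = u + du * h, v + dv * h
        x += h
        history.append((x, u, v))
    return history

# System of Example 1 (as assumed above).
steps = euler_system(lambda x, u, v: x + v,
                     lambda x, u, v: u * v ** 2,
                     0.0, 0.0, 1.0, 0.1, 2)
```

The entries `steps[1]` and `steps[2]` hold $(x_1, u_1, v_1)$ and $(x_2, u_2, v_2)$, which can be compared with the hand calculation of Example 1. A system with more equations is handled the same way, with one slope function and one state variable per equation.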
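Recall that for the single equation

$$y' = f(x, y); \qquad y(a) = y_0 \qquad (4)$$

the algorithm is

$$y_{n+1} = y_n + \tfrac{1}{6}\left(k_1 + 2k_2 + 2k_3 + k_4\right),$$
$$k_1 = hf(x_n, y_n), \qquad k_2 = hf\left(x_n + \tfrac{h}{2},\, y_n + \tfrac{1}{2}k_1\right),$$
$$k_3 = hf\left(x_n + \tfrac{h}{2},\, y_n + \tfrac{1}{2}k_2\right), \qquad k_4 = hf\left(x_{n+1},\, y_n + k_3\right). \qquad (5)$$

For the system (1) it becomes

$$u_{n+1} = u_n + \tfrac{1}{6}\left(k_1 + 2k_2 + 2k_3 + k_4\right),$$
$$v_{n+1} = v_n + \tfrac{1}{6}\left(l_1 + 2l_2 + 2l_3 + l_4\right),$$
$$k_1 = hf(x_n, u_n, v_n), \qquad l_1 = hg(x_n, u_n, v_n),$$
$$k_2 = hf\left(x_n + \tfrac{h}{2},\, u_n + \tfrac{1}{2}k_1,\, v_n + \tfrac{1}{2}l_1\right), \qquad l_2 = hg\left(x_n + \tfrac{h}{2},\, u_n + \tfrac{1}{2}k_1,\, v_n + \tfrac{1}{2}l_1\right),$$
$$k_3 = hf\left(x_n + \tfrac{h}{2},\, u_n + \tfrac{1}{2}k_2,\, v_n + \tfrac{1}{2}l_2\right), \qquad l_3 = hg\left(x_n + \tfrac{h}{2},\, u_n + \tfrac{1}{2}k_2,\, v_n + \tfrac{1}{2}l_2\right),$$
$$k_4 = hf\left(x_{n+1},\, u_n + k_3,\, v_n + l_3\right), \qquad l_4 = hg\left(x_{n+1},\, u_n + k_3,\, v_n + l_3\right). \qquad (6)$$

EXAMPLE 2. Consider again the system of Example 1,

$$u' = x + v; \qquad u(0) = 0$$
$$v' = uv^2; \qquad v(0) = 1,$$

with $h = 0.1$. Then (6) gives

$n = 0$:
$k_1 = (0.1)(0 + 1) = 0.1,$
$l_1 = (0.1)(0)(1)^2 = 0,$
$k_2 = (0.1)[(0 + 0.05) + (1 + 0)] = 0.105,$
$l_2 = (0.1)(0 + 0.05)(1 + 0)^2 = 0.005,$
$k_3 = (0.1)[(0 + 0.05) + (1 + 0.0025)] = 0.10525,$
$l_3 = (0.1)(0 + 0.0525)(1 + 0.0025)^2 = 0.005276,$
$k_4 = (0.1)[(0 + 0.1) + (1 + 0.005276)] = 0.110528,$
$l_4 = (0.1)(0 + 0.10525)(1 + 0.005276)^2 = 0.010636,$
$u_1 = u_0 + \tfrac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4) = 0.105171,$
$v_1 = v_0 + \tfrac{1}{6}(l_1 + 2l_2 + 2l_3 + l_4) = 1.005198, \qquad (7)$

$n = 1$:
$k_1 = 0.110520, \quad l_1 = 0.010627,$
$k_2 = 0.116051, \quad l_2 = 0.016382,$
$k_3 = 0.116339, \quad l_3 = 0.016760,$
$k_4 = 0.122196, \quad l_4 = 0.023134,$
$u_2 = 0.221420, \quad v_2 = 1.021872,$

and so on for $n = 2, 3, \ldots$. We suggest that you fill in the details for the calculation of the values $k_1, \ldots, v_2$ shown above for $n = 1$. ■

Of course, the idea is to carry out such calculations on a computer, not by hand. The calculations shown in Examples 1 and 2 are merely intended to clarify the methods.

What about higher-order equations? The key is to re-express an $n$th-order equation as an equivalent system of $n$ first-order equations.

EXAMPLE 3. The problem

$$y''' - xy'' + y' - 2y^3 = \sin x; \qquad y(1) = 2, \quad y'(1) = 0, \quad y''(1) = -3 \qquad (8)$$

can be converted to an equivalent system of three first-order equations as follows. Define $y' = u$ and $y'' = v$ (hence $u' = v$). Then (8) can be re-expressed in the form

$$y' = u; \qquad y(1) = 2$$
$$u' = v; \qquad u(1) = 0$$
$$v' = \sin x + 2y^3 - u + xv; \qquad v(1) = -3. \qquad (9a,b,c)$$

Of the three differential equations in (9), the first two merely serve to introduce the auxiliary dependent variables $u$ and $v$, and since $v'$ is $y'''$ the third one is a restated version of the given equation $y''' - xy'' + y' - 2y^3 = \sin x$.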
Equation (9a) is the $y$ equation, so the initial condition is on $y(1)$, namely, $y(1) = 2$, as given in (8). Equation (9b) is the $u$ equation, so the initial condition is on $u(1)$, and we have $u(1) = y'(1) = 0$, from (8). Similarly for equation (9c).

The system (9) can now be solved by the Euler or fourth-order Runge-Kutta methods or any other such algorithm. To illustrate, let us carry out the first two steps using Euler's method, taking $h = 0.2$, say.

$n = 0$:
$y_1 = y_0 + u_0 h = 2 + (0)(0.2) = 2,$
$u_1 = u_0 + v_0 h = 0 + (-3)(0.2) = -0.6,$
$v_1 = v_0 + \left(\sin x_0 + 2y_0^3 - u_0 + x_0 v_0\right)h = -3 + \left[\sin 1 + 2(2)^3 - 0 + (1)(-3)\right](0.2) = -0.231706.$

$n = 1$ (with $x_1 = x_0 + h = 1.2$):
$y_2 = y_1 + u_1 h = 2 + (-0.6)(0.2) = 1.88,$
$u_2 = u_1 + v_1 h = -0.6 + (-0.231706)(0.2) = -0.646341,$
$v_2 = v_1 + \left(\sin x_1 + 2y_1^3 - u_1 + x_1 v_1\right)h = -0.231706 + \left[\sin 1.2 + 2(2)^3 - (-0.6) + (1.2)(-0.231706)\right](0.2) = 3.219093,$

and so on for $n = 2, 3, \ldots$.

COMMENT. Observe that at each step we compute $y$, $u$, and $v$, yet we are not really interested in the auxiliary variables $u$ and $v$. Perhaps we could just compute $y_1, y_2, \ldots$ and not the $u, v$ values? No; equations (9) are coupled, so we need to bring all three variables along together. Of course, we don't need to print or plot $u$ and $v$, but we do need to compute them. ■

EXAMPLE 4. Examples 1 and 2 involve a system of first-order equations, and Example 3 involves a single higher-order equation. As a final example, consider a combination of the two such as the initial-value problem

$$u'' - 3xuv = \sin x; \qquad u(0) = 4, \quad u'(0) = -1$$
$$v'' + 2u - v = 5x; \qquad v(0) = 7, \quad v'(0) = 0. \qquad (10)$$

The idea is exactly the same as before. We need to recast (10) as a system of first-order initial-value problems. We can do so by introducing auxiliary dependent variables $w$ and $z$ according to $u' = w$ and $v' = z$. Then (10) becomes

$$u' = w; \qquad u(0) = 4$$
$$w' = \sin x + 3xuv; \qquad w(0) = -1$$
$$v' = z; \qquad v(0) = 7$$
$$z' = 5x - 2u + v; \qquad z(0) = 0 \qquad (11)$$

which system can now be solved by Euler's method or any other such numerical differential equation solver. ■

6.4.2. Linear boundary-value problems. Our discussion is based mostly upon the following example.
EXAMPLE 5. Consider the third-order boundary-value problem

$$y''' - x^2 y = -x^4; \qquad y(0) = 0, \quad y'(0) = 0, \quad y(2) = 4. \qquad (12)$$

To solve numerically, we begin by recasting (12) as the first-order system:

$$y' = u; \qquad y(0) = 0, \quad y(2) = 4$$
$$u' = v; \qquad u(0) = 0$$
$$v' = x^2 y - x^4. \qquad (13a,b,c)$$

However, we cannot apply the numerical integration techniques that we have discussed because the problem (13c) does not have an initial condition, so we cannot get the solution started. Whereas (13c) is missing an initial condition on $v$, (13a) has an extra condition, the right end condition $y(2) = 4$, but that condition is of no help in developing a numerical integration scheme that develops a solution beginning at $x = 0$.

Nevertheless, the linearity of (12) saves the day and permits us to work with an initial-value version instead. Specifically, suppose that we solve (numerically) the four initial-value problems

$$L[Y_1] = 0, \qquad Y_1(0) = 1, \quad Y_1'(0) = 0, \quad Y_1''(0) = 0,$$
$$L[Y_2] = 0, \qquad Y_2(0) = 0, \quad Y_2'(0) = 1, \quad Y_2''(0) = 0,$$
$$L[Y_3] = 0, \qquad Y_3(0) = 0, \quad Y_3'(0) = 0, \quad Y_3''(0) = 1,$$
$$L[Y_p] = -x^4, \qquad Y_p(0) = 0, \quad Y_p'(0) = 0, \quad Y_p''(0) = 0, \qquad (14)$$

where $L = d^3/dx^3 - x^2$ is the differential operator in (12). The nine initial conditions in the first three of these problems were chosen so as to have a nonzero determinant, so that $Y_1, Y_2, Y_3$ comprise a fundamental set of solutions (i.e., a linearly independent set of solutions) of the homogeneous equation $L[Y] = 0$. The three initial conditions on the particular solution $Y_p$ were chosen as zero for simplicity; any values will do since any particular solution will do. Suppose we imagine that the four initial-value problems in (14) have now been solved by the methods discussed above. Then $Y_1, Y_2, Y_3, Y_p$ are known functions of $x$ over the interval of interest $[0, 2]$, and we have the general solution

$$y(x) = C_1 Y_1(x) + C_2 Y_2(x) + C_3 Y_3(x) + Y_p(x) \qquad (15)$$

of $L[y] = -x^4$.
Finally, we evaluate the integration constants $C_1, C_2, C_3$ by imposing the boundary conditions given in (12):

$$y(0) = 0 = C_1 + 0 + 0 + 0,$$
$$y'(0) = 0 = 0 + C_2 + 0 + 0,$$
$$y(2) = 4 = C_1 Y_1(2) + C_2 Y_2(2) + C_3 Y_3(2) + Y_p(2). \qquad (16)$$

Solving (16) gives $C_1 = C_2 = 0$ and $C_3 = [4 - Y_p(2)]/Y_3(2)$, so we have the desired solution of (12) as

$$y(x) = \frac{4 - Y_p(2)}{Y_3(2)}\, Y_3(x) + Y_p(x). \qquad (17)$$

In fact, since $C_1 = C_2 = 0$ the functions $Y_1(x)$ and $Y_2(x)$ have dropped out, so we don't need to calculate them. All we need are $Y_3(x)$ and $Y_p(x)$, and these are found by the numerical integration of the initial-value problems

$$Y_3' = U_3, \qquad Y_3(0) = 0,$$
$$U_3' = V_3, \qquad U_3(0) = 0,$$
$$V_3' = x^2 Y_3, \qquad V_3(0) = 1, \qquad (18)$$

and

$$Y_p' = U_p, \qquad Y_p(0) = 0,$$
$$U_p' = V_p, \qquad U_p(0) = 0,$$
$$V_p' = x^2 Y_p - x^4, \qquad V_p(0) = 0, \qquad (19)$$

respectively.

COMMENT. Remember that whereas initial-value problems have unique solutions (if the functions involved are sufficiently well behaved), boundary-value problems can have no solution, a unique solution, or even an infinite number of solutions. How do these possibilities work out in this example? The clue is that (17) fails if $Y_3(2)$ turns out to be zero. The situation is seen more clearly from (16), where all of the possibilities come into view. Specifically, if $Y_3(2) \neq 0$, then we can solve uniquely for $C_3$, and we have a unique solution, given by (17). If $Y_3(2)$ does vanish, then there are two possibilities, as seen from (16): if $Y_p(2) \neq 4$, then there is no solution, and if $Y_p(2) = 4$ then there are an infinite number of solutions of (12), namely,

$$y(x) = C_3 Y_3(x) + Y_p(x), \qquad (20)$$

where $C_3$ remains arbitrary. ■

We see that boundary-value problems are more difficult than initial-value problems. From Example 5 we see that a nonhomogeneous $n$th-order linear boundary-value problem generally involves the solution of $n + 1$ initial-value problems, although in Example 5 (in which $n = 3$) we were lucky and did not need to solve for two of the four unknowns, $Y_1$ and $Y_2$.
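The procedure of Example 5 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: equation (12) is read as $y''' - x^2 y = -x^4$, consistent with the operator $L$ and with (18) and (19); the integrator is a hand-written fourth-order Runge-Kutta routine for vector systems with $h = 0.01$; and the names `rk4_system`, `homogeneous`, and `particular` are hypothetical. As a sanity check, one can verify by substitution that $y = x^2$ satisfies (12) and all three boundary conditions.

```python
def rk4_system(F, x0, Y0, h, n_steps):
    """Classical RK4 for the vector system Y' = F(x, Y); returns a list of (x, Y)."""
    x, Y = x0, list(Y0)
    out = [(x, tuple(Y))]
    for i in range(n_steps):
        k1 = F(x, Y)
        k2 = F(x + h / 2, [y + h / 2 * k for y, k in zip(Y, k1)])
        k3 = F(x + h / 2, [y + h / 2 * k for y, k in zip(Y, k2)])
        k4 = F(x + h, [y + h * k for y, k in zip(Y, k3)])
        Y = [y + h / 6 * (a + 2 * b + 2 * c + d)
             for y, a, b, c, d in zip(Y, k1, k2, k3, k4)]
        x = x0 + (i + 1) * h
        out.append((x, tuple(Y)))
    return out

def homogeneous(x, Y):
    """System (18): Y3' = U3, U3' = V3, V3' = x^2 * Y3."""
    y, u, v = Y
    return [u, v, x ** 2 * y]

def particular(x, Y):
    """System (19): Yp' = Up, Up' = Vp, Vp' = x^2 * Yp - x^4."""
    y, u, v = Y
    return [u, v, x ** 2 * y - x ** 4]

h, n = 0.01, 200                                           # integrate over [0, 2]
sol3 = rk4_system(homogeneous, 0.0, (0.0, 0.0, 1.0), h, n)
solp = rk4_system(particular, 0.0, (0.0, 0.0, 0.0), h, n)

Y3_end = sol3[-1][1][0]                                    # Y3(2)
Yp_end = solp[-1][1][0]                                    # Yp(2)
C3 = (4.0 - Yp_end) / Y3_end                               # from (16); needs Y3(2) != 0
ys = [C3 * s3[1][0] + sp[1][0] for s3, sp in zip(sol3, solp)]   # equation (17)
```

The list `ys` holds the computed $y(x)$ at $x = 0, 0.01, \ldots, 2$. A production code would test whether `Y3_end` is close to zero before dividing, since that is exactly the case in which (12) has no solution or infinitely many.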
Nonlinear boundary-value problems are more difficult still, because we cannot use the idea of finding a fundamental set of solutions plus a particular solution and thus forming a general solution, as we did in Example 5; that idea is based upon linearity.

One viable line of approach comes under the heading of shooting methods. For instance, to solve the nonlinear boundary-value problem

$$y'' + \sin y = 3x; \qquad y(0) = 0, \quad y(5) = 2 \qquad (21)$$

we can solve the initial-value problem

$$y' = u, \qquad y(0) = 0$$
$$u' = 3x - \sin y, \qquad u(0) = u_0 \qquad (22)$$

iteratively. That is, we can guess at the initial condition $u_0$ [which is the initial slope $y'(0)$] and solve (22) for $y(x)$ and $u(x)$. Next, we compare the computed value of $y(5)$ with the boundary condition $y(5) = 2$ (which we have not yet used). If the computed value is too high, then we return to (22), reduce the value of $u_0$, and solve again. Comparing the new computed value of $y(5)$ with the prescribed value $y(5) = 2$, we again revise our value of $u_0$. If these revisions are done in a rational way, one can imagine obtaining a convergent scheme. Such a scheme is called a shooting method because of the obvious analogy with the shooting of a projectile, with the intention of having the projectile strike the ground at some distant prescribed point.

Thus, we can see the increase in difficulty as we move away from linear initial-value problems. For a linear boundary-value problem of order $n$ we need to solve not one problem but $n + 1$ of them. For a nonlinear boundary-value problem we need to solve an infinite sequence of them, in principle; in practice, we need to carry out only enough iterations to produce the desired accuracy.

Closure. In Section 6.4.1 we extend the Euler and fourth-order Runge-Kutta solution methods to cover systems of equations and higher-order equations.
In that discussion it is more convenient to use $n$-dimensional vector notation because of its compactness, but that notation is not introduced until Chapters 9 and 10. Nonetheless, let us indicate the result, if only for the Euler method, for future reference. The idea is that we can express the system

$$y_1'(x) = f_1\left(x, y_1(x), \ldots, y_n(x)\right); \qquad y_1(a) = y_{10},$$
$$\vdots$$
$$y_n'(x) = f_n\left(x, y_1(x), \ldots, y_n(x)\right); \qquad y_n(a) = y_{n0} \qquad (23)$$

in the vector form

$$\mathbf{y}'(x) = \mathbf{f}(x, \mathbf{y}(x)); \qquad \mathbf{y}(a) = \mathbf{y}_0, \qquad (24)$$

where the boldface letters denote $n$-dimensional column "vectors:"

$$\mathbf{y}(x) = \begin{bmatrix} y_1(x) \\ \vdots \\ y_n(x) \end{bmatrix}, \qquad \mathbf{y}'(x) = \begin{bmatrix} y_1'(x) \\ \vdots \\ y_n'(x) \end{bmatrix}, \qquad \mathbf{f}(x, \mathbf{y}(x)) = \begin{bmatrix} f_1(x, \mathbf{y}(x)) \\ \vdots \\ f_n(x, \mathbf{y}(x)) \end{bmatrix}, \qquad (25)$$

and where $f_j(x, \mathbf{y}(x))$ is simply a shorthand notation for $f_j(x, y_1(x), \ldots, y_n(x))$. Then the Euler algorithm corresponding to (24) is

$$\mathbf{y}_{n+1} = \mathbf{y}_n + \mathbf{f}(x_n, \mathbf{y}_n)\,h. \qquad (26)$$

In Section 6.4.2 we turn to boundary-value problems, but only linear ones. Given an $n$th-order linear boundary-value problem $L[y] = f(x)$ on an interval $[a, b]$ plus $n$ boundary conditions, the idea is to solve the problems

$$L[Y_1] = 0; \qquad Y_1(a) = 1, \quad Y_1'(a) = \cdots = Y_1^{(n-1)}(a) = 0,$$
$$\vdots$$
$$L[Y_n] = 0; \qquad Y_n(a) = \cdots = Y_n^{(n-2)}(a) = 0, \quad Y_n^{(n-1)}(a) = 1,$$
$$L[Y_p] = f; \qquad Y_p(a) = Y_p'(a) = \cdots = Y_p^{(n-1)}(a) = 0 \qquad (27)$$

for $Y_1(x), \ldots, Y_n(x), Y_p(x)$ and to form the general solution as

$$y(x) = C_1 Y_1(x) + \cdots + C_n Y_n(x) + Y_p(x). \qquad (28)$$

Finally, application of the original boundary conditions to (28) yields $n$ linear algebraic equations for $C_1, \ldots, C_n$, which equations will have a unique solution, no solution, or an infinity of solutions.

Computer software. No new software is needed for the methods described in this section. For instance, we can use the Maple command dsolve, with the numeric option, to solve the problem

$$u' = x + v; \qquad u(0) = 0$$
$$v' = -5uv; \qquad v(0) = 1$$

and to print the results at $x = 1, 2$, and $3$. First, enter

    with(DEtools):

and return. Then enter

    dsolve({diff(u(x), x) = x + v(x), diff(v(x), x) = -5*u(x)*v(x), u(0) = 0, v(0) = 1}, {u(x), v(x)}, type = numeric, value = array([1, 2, 3]));

and return. The printed result is

    [x, u(x), v(x)]   1.   2.   3.
1.032499017614234 2.544584704578166 5.044585755162072 .07285274036469075 .00001413488345836790 —.3131443346304622 x 107° The only differences between the command above and the one given at the end of Section 6.3 is that here we have entered two differential equations, two initial conditions, and two dependent variables, and we have omitted the abserr option. Observe that to solve a differential equation, or system of differential equations, numerically, we must first express the equations as a system of first-order equations, as illustrated in Example 4. However, to use the Maple dsolve command we can leave the original higher-order equations intact. 322 EXERCISES 6.4 1. In Example 2 we gave ky, l1, ka, la, ky, lg, ka, ly forn = 1, (d)Sameas(a),butwithx(0) = y(0) = 0,2(0) = 10. and the resulting 5. We re-expressed (8) and (10) as the equivalent systems of values of up and ve, but did not show the calculations. Provide thosecalculations, as we did for the step first-order initial-value problems (9) and (11), respectively. Do n= 0. the same for the problem given. You need go no further. 2. As we did in Example |, work out y,, 21, by hand. Use three (a)ma" +ca'+ka= 2'(0) = 25 f(t); 2(0)=20, methods: Euler, second-order Runge-Kutta, and fourth-order Runge-Kutta, and take & = 0.2. These problems are rigged so as to have simple closed-form solutions, some of which are given in brackets. Compare your results with the exact solution. 
(a) yo=z 2(0)=0 2(2)=0 zi=—y, 2(0)=0 () y=-2/y, yA)=5, (e)y’" —Qsiny’ = 32; y(-2)=7, y/Q)=-1 dy" +y' —4y=32; y(0)=2, y/(0)= y"(—2)=0 y" a+5y =0; (b) y' = 42; y(2)=5 zg =-y; (c)y”—ayy’=sing; y/(-2)=4 y”"(1)=0 y/(A)=2, y(l)=3, (f)y' +2y=cos2z; 2’(0)=—-1 2(0)=2, (g) 2’ +22 —3y =10cos3t; y(0)=1 z=-y;, (b)Li” +Ri +(1/C)i=E'(t); i(0)=%o,i(0)=%, yO)=1 [y(x)=e7*] 2(3) =0,2'(3) =8 e"+ty-—z=g(a); 2”(1)=3 z(l)=2’(1)=0, (i) 2” —8az=sint; y(1) =6 Ee Tee 2(1)=2 ~doy=e* [z(x) =e7*] (d)y= 2e27/y;y(l)=1 [y(z) =27] gi =y/2*; 2(1)=1 [2(z) =a] ghs—y2?; c(l)=1 [2(x) = 1/2] y(0)=4,y/(0)=3 (h) yy"+ay'z= f(z); y(3)=2, y/(3)=-1 y!(3)=6 (e:)yo=(et+y)z—-1,yQ)=1 [y(2)=] Gy 3. (a)—(e) First, read Exercise 2. Use computer software to solve the initial-value problem given in the corresponding part =yz; (0) = 1, 2 = —xy + 2; 2(0) = 6. Use computer software to solve the given system numeri- of Exercise 2, for y(x) and z(x) at x = 3,5, and 10, and compare those results with the exact solution at those points. cally, and print out the solution for y(z) and z(x) at x = 1,2. (a) y" − aoa =5ax; y(0) = 2, y'(0)= —1 2(0)=1 =-3, z! + yz 4. (a) Just as (2) and (6) give the Euler and fourth-order (b) Runge- Kutta algorithms for the second-order system (1), write down the analogous Euler, second-order Runge—Kutta, and fourthorder Runge-Kutta algorithms for the third-order system x(t) = f(t,2,y, 2), x(a) = x y'(t)= g(t,2,y,2), y(a)= yo ai(t)= f(t,e,y,2z),.2(a).= 29. (c)yl=2!+a; (d) yz" =a; =2"(1)=0 21)=1, 2/(1) (4.1) 7. Complete the solution of Example 5 by using computer software to solve (18) for ¥3(x) and (19) for Y, (az),atx = 2,4, 6, and then using (17) to determine y(z)-at those points. Use the Euler and second-order Runge—Kutta algorithms to 8. Use the method explained in Example 5 to reduce the given work out 21,41, 21 and 29,yg, 22, by hand, for the case where linear boundary-value problem to a system of linear initialfig,hare y — 1,z,t +a + 3(2 — y + 1), respectively,with value problems. 
Then complete the solution and solve for the the initial conditions(0) = —3,y(0) = 0, z(0) = 2 using specified quantity, either using computer software or by programming any of the numerical solution methods that we have studied. Obtain accuracy to five significant figures, and indicate why you believe that you have achieved that accuracy. h=0.3. 6.5. Stability and Difference Equations If you believethatthereis no solution or thatit exists but is nonunique, then give your reasoning. HINT: You can specify (c) y” ~ [In(a + 1)]y/ —y = 2sin 38a4+1; y(O) = 3, y(2)=—1. homogeneousinitial conditions for the Y, problem, as we did Determine y(a) at « = 0.5,1.0, 1.5. y(5)=2 y(0)=1, in Example 5, but be aware that you do not have to use homo- —(q) yi +y—-ay=03; 1,2,3,4. = ata y(a) petermine reduceyour to able be thatyoumay and geneousconditions, (ec)y” +ay = 203 y(t)=y'(1)=0, (2) laborby a moreoptimal choice of thoseconditions. y(0)=1, (a)y" —2ay'+y=3sine; y(1) Determine (b)y+ (cosz)y=0; Determiney(2). 6.5 y(0)=1, Determiney(2). y(2)=3. y(x) ata = 4,5. Determine Equations 6.5.1. Introduction. In progressing from thesimple Euler method to the more sophisticated higher-order methods our aim was improvement in accuracy. However, there are cases where the results obtained not only fail to be sufficiently accurate but are grossly incorrect, as illustrated in thetwo examples to follow. The second one introduces the idea of stability, and in Section 6.5.2 we concentrate on that topic. EXAMPLE 1. The initial-valueproblem (1) hastheexactsolutiony(z) = exp (—42). If we solve it by thefourth-orderRunge-Kutta method for the step sizes h = 0.1, 0.05, and 0.01, we obtain in Table | the results shown, at Table 1. Runge—Kutta solution of (1). 
the representative points $x = 1, 4, 8$, and $12$.

Table 1. Runge-Kutta solution of (1).

     x  |  h = 0.1        |  h = 0.05       |  h = 0.01       |  Exact
     1  |   0.179006 E-1  |   0.182893 E-1  |   0.183153 E-1  |  0.183156 E-1
     4  |  -0.167842 E+0  |  -0.106538 E-1  |  -0.146405 E-3  |  0.112535 E-6
     8  |  -0.500286 E+3  |  -0.317586 E+2  |  -0.436763 E+0  |  0.126642 E-13
    12  |  -0.149120 E+7  |  -0.946704 E+5  |  -0.130197 E+4  |  0.142516 E-20

Since the Runge-Kutta method is convergent, the results should converge to the exact solution at any given $x$ as $h$ tends to zero, but that convergence is hard to see in the tabulated results except for $x = 1$. In fact, it is doubtful that we could ever come close to the exact values at $x = 8$ or $12$, since we might need to make $h$ so small that roundoff errors might come to dominate before the accumulated truncation error is sufficiently reduced.

More central to the purpose of this example is to see that with $h$ fixed the results diverge dramatically from the exact solution as $x$ increases, so as to become grossly incorrect. We cannot blame this strange and unexpected result on complications due to nonlinearity because (1) is linear.

To understand the source of the difficulty, note that the general solution of the differential equation is $y(x) = \exp(-4x) + C\exp(2x)$, where $C$ is an arbitrary constant. The initial condition implies that $C = 0$, leaving the particular solution $y(x) = \exp(-4x)$. In Fig. 1 we show several solution curves for values of $C$ close to and equal to zero, and we can see the rapid divergence of neighboring curves from the solution $y(x) = \exp(-4x)$. Thus, the explanation of the difficulties found in the tabulated numerical results is that even a very small numerical error shifts us from the exact solution curve to a neighboring curve, which then diverges from the true solution. ■

Figure 1. Solution curves for the equation $y' - 2y = -6e^{-4x}$.

EXAMPLE 2.
In Example 3 of Section 6.3 we solved the equation $y' = -y$, with initial condition $y(0) = 1$, by several methods, from the simple Euler method to the more accurate and sophisticated fourth-order Runge-Kutta method, and we gave the results in Table 2. Since the midpoint rule and the second-order Runge-Kutta methods are both of second order we expected their accuracy to be comparable. Indeed they were initially, but the midpoint rule eventually developed an error that oscillated in sign from step to step and grew in magnitude (see Table 2 in Section 6.3). Let us solve the similar problem

$$y' = -2y; \qquad y(0) = 1 \qquad (2)$$

by the midpoint rule, with $h = 0.05$. Since the midpoint rule is not self-starting, we use ten Euler steps from $x = 0$ to $x = 0.05$ before switching over to the midpoint rule. We have plotted the results in Fig. 2, along with the exact solution, $y(x) = \exp(-2x)$. Once again, we see that the midpoint rule results follow the exact solution initially, but they develop an error that oscillates in sign and grows such that the results are soon hopelessly incorrect. This numerical difficulty is different from the one found above in Example 1, for rather than being due to an extreme sensitivity to initial conditions, it is associated with machine roundoff error and is an example of numerical instability. ■

Figure 2. Illustration of numerical instability associated with the midpoint rule, for the initial-value problem (2).

6.5.2. Stability. Let us analyze the phenomenon of numerical instability that we encountered in Example 2. Recall that we denote the exact solution of a given initial-value problem as $y(x_n)$ and the numerical solution as $y_n$. Actually, the latter is not quite the same as the computer printout because of the inevitable presence of machine roundoff errors.
Thus, let us distinguish further between the numerical solution $y_n$ that would be generated on a perfect computer, and the solution $y_n^*$ that is generated on a real machine and which includes the effects of numerical roundoff, that is, the truncation of numbers after a certain number of significant figures. It is useful to decompose the total error, at any $n$th step, as

$$\text{Total error} = y(x_n) - y_n^* = [y(x_n) - y_n] + [y_n - y_n^*] = [\text{accum. truncation error}] + [\text{accum. roundoff error}]. \tag{3}$$

We ask two things of a method: first, that the accumulated truncation error tend to zero at any fixed $x$ as the step size $h$ tends to zero and, second, that the accumulated roundoff error remain small compared to the exact solution. The first is the issue of convergence, discussed earlier in this chapter, and the second is the issue of stability, our present concern.

We have already noted that the midpoint rule can produce the strange behavior shown in Fig. 2, so let us study the application of that method to the standard "test problem,"

$$y' = Ay; \qquad y(0) = 1, \tag{4}$$

where it is useful to include the constant $A$ as a parameter. The midpoint rule generates $y_n$ according to the algorithm

$$y_{n+1} = y_{n-1} + 2h f(x_n, y_n) = y_{n-1} + 2Ah\,y_n; \qquad y_0 = 1 \tag{5}$$

for $n = 0, 1, 2, \ldots$. To determine whether a solution algorithm, in this case (5), is stable, it is customary to "inject" a roundoff error at any step in the solution, say at $n = 0$, and to see how much the perturbed solution differs from the exact solution as $n$ increases, assuming that no further roundoff errors occur. Thus, in place of (5), consider the perturbed problem

$$y_{n+1}^* = y_{n-1}^* + 2Ah\,y_n^*; \qquad y_0^* = 1 - \epsilon, \tag{6}$$

say, where $\epsilon$ is the (positive or negative) roundoff error in the initial condition. Defining the error $e_n \equiv y_n - y_n^*$, and subtracting (6) from (5), gives

$$e_{n+1} = e_{n-1} + 2Ah\,e_n, \tag{7}$$

with the initial condition $e_0 = \epsilon$, as governing the evolution of $e_n$. We call (7) a difference equation.
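The injection experiment behind (7) can be mimicked by iterating the error recurrence directly. The following is a minimal sketch rather than part of the derivation: the function name `midpoint_error`, the values $A = \pm 2$, $h = 0.05$, $\epsilon = 10^{-10}$, and the second starting value $e_1 = 0$ are all assumptions chosen for illustration.

```python
import math


def midpoint_error(A, h, eps, steps):
    """Iterate e_{n+1} = e_{n-1} + 2*A*h*e_n, i.e. recurrence (7).

    Assumed starting values: e_0 = eps (the injected roundoff error)
    and e_1 = 0. Returns the list [e_0, e_1, ..., e_steps].
    """
    e_prev, e_curr = eps, 0.0
    history = [e_prev, e_curr]
    for _ in range(steps - 1):
        e_prev, e_curr = e_curr, e_prev + 2.0 * A * h * e_curr
        history.append(e_curr)
    return history


# Compare the injected error with the exact solution exp(A*x_n) at x_n = n*h.
for A in (2.0, -2.0):
    h, steps = 0.05, 400
    errors = midpoint_error(A, h, 1e-10, steps)
    exact = math.exp(A * h * steps)
    print(f"A = {A:+.0f}: e_n = {errors[-1]:.3e}, e_n / y(x_n) = {errors[-1] / exact:.3e}")
```

Under these assumptions the ratio of error to exact solution stays tiny for the positive value of $A$, whereas for the negative value the error changes sign from step to step and eventually dwarfs the decaying exact solution.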
Just as certain differential equations can be solved by seeking solutions in the form $y(x) = e^{\lambda x}$, the appropriate form for the difference equation (7) is

$$e_n = \rho^n, \tag{8}$$

where $\rho$ is to be determined. Putting this expression into (7) gives

$$\rho^{n+1} - 2Ah\rho^n - \rho^{n-1} = 0 \tag{9}$$

or

$$\left(\rho^2 - 2Ah\rho - 1\right)\rho^{n-1} = 0. \tag{10}$$

Since $\rho^{n-1}$ is not zero, it follows from (10) that we must have $\rho^2 - 2Ah\rho - 1 = 0$, so we have the two roots

$$\rho = Ah + \sqrt{1 + A^2h^2} \quad\text{and}\quad \rho = Ah - \sqrt{1 + A^2h^2}. \tag{11}$$

By considerations analogous to those for differential equations, we have

$$e_n = C_1\left(Ah + \sqrt{1 + A^2h^2}\right)^n + C_2\left(Ah - \sqrt{1 + A^2h^2}\right)^n \tag{12}$$

as the general solution of (7). If we let $h \to 0$, then

$$\left(Ah + \sqrt{1 + A^2h^2}\right)^n \sim (Ah + 1)^n = e^{n\ln(1 + Ah)} \sim e^{nAh} = e^{Ax_n}, \tag{13}$$

where we have used the identity $a = e^{\ln a}$, the Taylor expansion $\ln(1 + x) = x - x^2/2 + \cdots \sim x$, and the fact that $x_n = nh$. Similarly,

$$\left(Ah - \sqrt{1 + A^2h^2}\right)^n \sim (Ah - 1)^n = (-1)^n(1 - Ah)^n = (-1)^n e^{n\ln(1 - Ah)} \sim (-1)^n e^{-nAh} = (-1)^n e^{-Ax_n}, \tag{14}$$

so (12) becomes

$$e_n = C_1 e^{Ax_n} + C_2(-1)^n e^{-Ax_n} \tag{15}$$

as $h \to 0$. Since there are two arbitrary constants, $C_1$ and $C_2$, two initial conditions are appropriate, whereas we have attached only the single condition $e_0 = \epsilon$ in (7). With no great loss of generality let us specify as a second initial condition $e_1 = 0$. Imposing these conditions on (15), we have

$$e_0 = \epsilon = C_1 + C_2, \qquad e_1 = 0 = C_1 e^{Ax_1} - C_2 e^{-Ax_1}.$$

Finally, solving for $C_1$ and $C_2$, and inserting these values into (15), gives

$$e_n = \frac{\epsilon}{2\cosh Ax_1}\left[e^{A(x_n - x_1)} + (-1)^n e^{-A(x_n - x_1)}\right]. \tag{16}$$

To infer from (16) whether the method is stable or not, we consider the cases $A > 0$ and $A < 0$ separately. If $A > 0$, then the second term in (16) decays to zero, and even though the first term grows exponentially, it remains small compared to the exact solution $y(x_n) = \exp(Ax_n)$ as $n$ increases because $\epsilon$ is very small (for example, on the order of $10^{-10}$). We conclude, formally, that if $A > 0$ then the midpoint rule is stable.
On the other hand, if $A < 0$, then the second term starts out quite small, due to the $\epsilon$ factor, but grows exponentially with $x_n$ and oscillates due to the $(-1)^n$, whereas the exact solution is $\exp(Ax)$. This is precisely the sort of behavior that was observed in Example 2 (where $A$ was $-2$), and we conclude that if $A < 0$, then the midpoint rule is unstable.

Since the stability of the midpoint rule depends upon the sign of $A$ in the test equation $y' = Ay$ (stability for $A > 0$ and instability for $A < 0$), we say that the midpoint rule is only weakly stable. If, instead, a method is stable independent of the sign of $A$, then we classify it as strongly stable.

Having found that the midpoint rule when applied to the equation $y' = Ay$ is stable for $A > 0$ and unstable for $A < 0$, what about the stability of the midpoint rule when it is applied to an equation $y' = f(x, y)$ that is more complicated? Observing that $A$ is the partial derivative of $Ay$ (i.e., the right-hand side of $y' = Ay$) with respect to $y$, we expect, as a rule of thumb, the midpoint rule to be stable if $\partial f/\partial y > 0$ and unstable if $\partial f/\partial y < 0$ over the $x, y$ domain of interest. For instance, if $y' = e^{xy}$ on $x > 0$, then we can expect the midpoint rule to be stable because $\partial(e^{xy})/\partial y = xe^{xy} > 0$ on $x > 0$, but if $y' = e^{-xy}$ on $x > 0$, then we can expect the midpoint rule to be unstable on $x > 0$ because $\partial(e^{-xy})/\partial y = -xe^{-xy} < 0$ on $x > 0$.

Besides arriving at the above-stated conclusions as to the stability of the midpoint rule for the test equation $y' = Ay$, we can now understand the origin of the instability, for notice that the difference equations $y_{n+1} - 2Ah\,y_n - y_{n-1} = 0$ and $e_{n+1} - 2Ah\,e_n - e_{n-1} = 0$, governing $y_n$ and $e_n$, are identical. Thus, analogous to (15) we must have

$$y_n = B_1 e^{Ax_n} + B_2(-1)^n e^{-Ax_n} \tag{17}$$

for arbitrary constants $B_1$, $B_2$, as $h$ tends to zero.
The first of these terms coincides with the exact solution of the original equation $y' = Ay$, and the second term (which gives rise to the instability if $A < 0$) is an extraneous solution that enters because we have replaced the original first-order differential equation by a second-order difference equation (second-order because the difference between the subscripts $n + 1$ and $n - 1$ is 2).

Single-step methods (e.g., Euler and Runge-Kutta) are strongly stable (i.e., independent of the sign of $A$) because the resulting difference equation is only first order, so there are no extraneous solutions. Thus, we can finally see why, in Example 3 of Section 6.3, the midpoint rule proved unstable but the other methods were stable.

Understand that these stability claims are based upon analyses in which we let $h$ tend to zero, whereas in practice $h$ is, of course, finite. To illustrate what can happen as $h$ is varied, let us solve

$$y' = -1000(y - x^3) + 3x^2; \qquad y(0) = 0 \tag{18}$$

by Euler's method. The exact solution is simply $y(x) = x^3$ so that at $x = 1$, for instance, we have $y(1) = 1$. By comparison, the values computed by Euler's method are as given in Table 2. Even from this limited data we can see that we do have the stability claimed above for the single-step Euler method, but only when $h$ is made sufficiently small.

Table 2. Finite-h stability.

  h         Computed y(1)
  0.2500    2.3737566 × 10^5
  0.1000    8.7725049 × 10^14
  0.0100    Exponential overflow
  0.0010    0.99999726
  0.0001    0.99999970

To understand this behavior, consider the relevant test equation $y' = -1000y$, namely, $y' = Ay$, where $A = \partial[-1000(y - x^3) + 3x^2]/\partial y = -1000$. Then Euler's method for that test equation is $y_{n+1} = y_n - 1000h\,y_n$. Similarly, $y_{n+1}^* = y_n^* - 1000h\,y_n^*$. Subtracting these two equations, we find that the roundoff error $e_n = y_n - y_n^*$ satisfies the simple difference equation

$$e_{n+1} = (1 - 1000h)\,e_n. \tag{19}$$

Letting $n = 0, 1, 2, \ldots$
in (19) reveals that the solution of (19) is

$$e_n = (1 - 1000h)^n e_0, \tag{20}$$

where $e_0$ is the initial roundoff error. If we take the limit as $h \to 0$, then

$$e_n = (1 - 1000h)^n e_0 = e_0 e^{n\ln(1 - 1000h)} \sim e_0 e^{-1000nh} = e_0 e^{-1000x_n}, \tag{21}$$

which is small compared to the exact solution $y_n = e^{-1000x_n}$ because of the $e_0$ factor, so the method is stable. This result is in agreement with the numerical results given in Table 2: as $h \to 0$ the scheme is stable. However, in a real calculation $h$ is, of course, finite, and it appears from the tabulation that there is some critical value, say $h_{cr}$, such that the guaranteed stability is realized only if $h < h_{cr}$. To see this, let us retain (20) rather than let $h \to 0$. It is seen from (20) that if $|1 - 1000h| < 1$, then $e_n \to 0$ as $n \to \infty$, and if $|1 - 1000h| > 1$, then $|e_n| \to \infty$ as $n \to \infty$. Thus, for stability we need $|1 - 1000h| < 1$, or $-1 < 1 - 1000h < 1$. The right-hand inequality imposes no restriction on $h$ because it is true for all $h$'s (provided that $h$ is positive, as is normally the case), and the left-hand inequality is true only for $h < 0.002$. Hence $h_{cr} = 0.002$ in this example, and this result is consistent with the tabulated results, which show instability for the $h$'s greater than that value, and stability for the $h$'s smaller than that value.

Thus, when we say that the Euler method is strongly stable, what we should really say is that it is strongly stable for sufficiently small $h$. Likewise for the Runge-Kutta and other single-step methods.

6.5.3. Difference equations. (Optional) Difference equations are important in their own right, and the purpose of this Section 6.5.3 is not only to clarify some of the steps in Section 6.5.2, but also to take this opportunity to present some basics regarding the theory and solution of such equations.

To begin, we define a difference equation of order $N$ as a relation involving $y_n, y_{n+1}, \ldots, y_{n+N}$. As we have seen, one way in which difference equations arise is in the numerical solution of differential equations.
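One concrete instance is Euler's method applied to the initial-value problem (18) of Section 6.5.2, which is itself a first-order difference equation that can simply be iterated. The sketch below is illustrative only: the function name `euler_y1` and the particular list of step sizes are assumptions, chosen so that some lie above and some below the critical value $h_{cr} = 0.002$.

```python
def euler_y1(h):
    """Euler's method for y' = -1000*(y - x**3) + 3*x**2, y(0) = 0.

    Marches from x = 0 to (approximately) x = 1 with step h and returns
    the computed value of y(1). The exact value is y(1) = 1.
    """
    n_steps = round(1.0 / h)
    x, y = 0.0, 0.0
    for _ in range(n_steps):
        y += h * (-1000.0 * (y - x**3) + 3.0 * x**2)
        x += h
    return y


# Step sizes above h_cr = 0.002 should blow up; those below should not.
for h in (0.25, 0.1, 0.0025, 0.00125, 0.001, 0.0001):
    print(f"h = {h:<8} computed y(1) = {euler_y1(h):.7e}")
```

In double-precision arithmetic the larger step sizes give very large values rather than an overflow, but the qualitative change in behavior at $h_{cr}$ is the same as that described in Section 6.5.2.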
For instance, if we discretize the differential equation $y' = -y$ and solve by Euler's method or the midpoint rule, then in place of the differential equation we have the first- and second-order difference equations $y_{n+1} = y_n - hy_n = (1 - h)y_n$ and $y_{n+1} = y_{n-1} - 2hy_n$, or

$$y_{n+1} - (1 - h)y_n = 0 \tag{22}$$

and

$$y_{n+1} + 2hy_n - y_{n-1} = 0, \tag{23}$$

respectively. In case it is not clear that (23) is of second order, we could let $n - 1 = m$ and obtain $y_{m+2} + 2hy_{m+1} - y_m = 0$ instead, which equation is more clearly of second order. That is, the order is always the difference between the largest and smallest subscripted indices. Analogous to differential equation terminology, we say that (22) and (23) are linear because they are of the form

$$a_0(n)y_{n+N} + a_1(n)y_{n+N-1} + \cdots + a_N(n)y_n = f(n), \tag{24}$$

homogeneous because $f(n)$ is zero in each case, and of constant-coefficient type because their $a_j$'s are constants rather than functions of $n$. By a solution of (24) is meant any sequence $y_n$ that reduces (24) to a numerical identity for each $n$ under consideration, such as $n = 0, 1, 2, \ldots$.

The theory of difference equations is analogous to that of differential equations. For instance, just as one seeks solutions to a linear homogeneous differential equation with constant coefficients in the form $y(x) = e^{\lambda x}$, one seeks solutions to a linear homogeneous difference equation with constant coefficients in the form

$$y_n = \rho^n \tag{25}$$

as we did in Section 6.5.2.
[In case these forms don't seem analogous, observe that $e^{\lambda x} = (e^{\lambda})^x$ is a constant to the power $x$, just as $\rho^n$ is a constant to the power $n$.] Putting (25) into such an $N$th-order difference equation gives an $N$th-degree polynomial equation on $\rho$, the characteristic equation corresponding to the given difference equation, and if the $N$ roots $(\rho_1, \ldots, \rho_N)$ are distinct, then

$$y_n = C_1\rho_1^n + \cdots + C_N\rho_N^n, \tag{26}$$

where the $C_j$'s are arbitrary constants, can be shown to be a general solution of the difference equation in the sense that every solution is of the form (26) for some specific choice of the $C_j$'s. For an $N$th-order linear differential equation, $N$ initial conditions ($y$ and its first $N - 1$ derivatives at the initial point) are appropriate for narrowing a general solution down to a particular solution. Likewise, for a linear difference equation $N$ initial conditions are appropriate, namely, the first $N$ values $y_0, y_1, \ldots, y_{N-1}$.

EXAMPLE 3. Solve the difference equation

$$y_{n+1} - 4y_n = 0. \tag{27}$$

Since (27) is linear, homogeneous and of constant-coefficient type, seek solutions in the form (25). Putting that form into (27) gives

$$\rho^{n+1} - 4\rho^n = (\rho - 4)\rho^n = 0 \tag{28}$$

so that if $\rho \neq 0$ then $\rho - 4 = 0$, $\rho = 4$, and the general solution of (27) is

$$y_n = C(4)^n. \tag{29}$$

For example, if an initial condition $y_0 = 3$ is specified, then $y_0 = 3 = C(4)^0 = C$ gives $C = 3$ and hence the particular solution $y_n = 3(4)^n$.

Actually, (27) is simple enough to solve more directly since (for $n = 0, 1, \ldots$) it gives $y_1 = 4y_0$, $y_2 = 4y_1 = 4^2y_0$, $y_3 = 4y_2 = 4^3y_0$, and so on, so one can see that $y_n = y_0(4)^n$ or, if $y_0$ is not specified, $y_n = C(4)^n$, which is the same as (29). ∎

EXAMPLE 4. Solve the difference equation

$$y_{n+2} - y_{n+1} - 6y_n = 0. \tag{30}$$

Seeking solutions in the form (25) gives the characteristic equation $\rho^2 - \rho - 6 = 0$ with roots $-2$ and $3$, so the general solution of (30) is

$$y_n = C_1(-2)^n + C_2(3)^n. \tag{31}$$

If initial conditions are prescribed, say $y_0 = 4$ and $y_1 = -13$, then

$$y_0 = 4 = C_1 + C_2, \qquad y_1 = -13 = -2C_1 + 3C_2$$

give $C_1 = 5$ and $C_2 = -1$.
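The result of Example 4 can be checked by comparing the closed-form solution with values generated directly from the recurrence. This is a small sketch for illustration; the helper names `fit_constants`, `iterate`, and `closed_form` are hypothetical and not from the text.

```python
def fit_constants(r1, r2, y0, y1):
    """Solve y0 = C1 + C2, y1 = C1*r1 + C2*r2 for distinct roots r1, r2."""
    c2 = (y1 - r1 * y0) / (r2 - r1)
    c1 = y0 - c2
    return c1, c2


def iterate(y0, y1, n_terms):
    """Generate y_0, ..., y_{n_terms-1} from y_{n+2} = y_{n+1} + 6*y_n."""
    ys = [y0, y1]
    while len(ys) < n_terms:
        ys.append(ys[-1] + 6 * ys[-2])
    return ys


def closed_form(n):
    """Particular solution found in Example 4: y_n = 5*(-2)**n - 3**n."""
    return 5 * (-2) ** n - 3 ** n


c1, c2 = fit_constants(-2, 3, 4, -13)
direct = iterate(4, -13, 10)
formula = [closed_form(n) for n in range(10)]
print("C1, C2 =", c1, c2)
print("recurrence and closed form agree:", direct == formula)
```

The same two-by-two solve works for any second-order equation with distinct real characteristic roots; only the roots and initial values change.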
∎

If the characteristic polynomial has a pair of complex conjugate roots, say $\rho_1 = \alpha + i\beta$ and $\rho_2 = \alpha - i\beta$, then the solution can still be expressed in real form, for if we express $\rho_1$ and $\rho_2$ in polar form (as is explained in Section 22.2 on complex numbers) as

$$\rho_1 = re^{i\theta} \quad\text{and}\quad \rho_2 = re^{-i\theta}, \tag{32}$$

where $r = \sqrt{\alpha^2 + \beta^2}$ and $\theta = \tan^{-1}(\beta/\alpha)$, then

$$C_1\rho_1^n + C_2\rho_2^n = C_1r^ne^{in\theta} + C_2r^ne^{-in\theta} = r^n\left(C_1e^{in\theta} + C_2e^{-in\theta}\right) = r^n\left[C_1(\cos n\theta + i\sin n\theta) + C_2(\cos n\theta - i\sin n\theta)\right] = r^n\left(C_3\cos n\theta + C_4\sin n\theta\right), \tag{33}$$

where $C_3 = C_1 + C_2$ and $C_4 = i(C_1 - C_2)$ are arbitrary constants.

EXAMPLE 5. Solve the difference equation

$$y_{n+2} + 4y_n = 0. \tag{34}$$

The characteristic roots are $\pm 2i$, so (32) becomes $\rho_1 = 2e^{i\pi/2}$ and $\rho_2 = 2e^{-i\pi/2}$, and (33) gives the general solution

$$y_n = 2^n\left(A\cos\frac{n\pi}{2} + B\sin\frac{n\pi}{2}\right), \tag{35}$$

where $A$, $B$ are arbitrary constants. ∎

As we have seen, one way in which difference equations arise is in the numerical solution of differential equations, wherein the continuous process (described by the differential equation) is approximated by a discrete one (described by the difference equation). However, they also arise directly in modeling discrete processes. To illustrate, let $p_n$ be the principal in a savings account at the end of the $n$th year. We say that the process is discrete in that $p$ is a function of the integer variable $n$ rather than a continuous time variable $t$. If the account earns an annual interest of $I$ percent, then the growth of principal from year to year is governed by the difference equation

$$p_{n+1} = \left(1 + \frac{I}{100}\right)p_n, \tag{36}$$

which is of the same form as (22).

In fact, discrete processes governed by nonlinear difference equations are part of the modern theory of dynamical systems, in which theory the phenomenon of chaos plays a prominent role. Let us close with a brief mention of one such discrete process that is familiar to those who study dynamical systems and chaos. Let $x_n, y_n$ be a point in a Cartesian $x, y$ plane, and let its polar coordinates be $r$ and $\theta_n$.
Consider a simple process, or mapping, which sends that point into a point $x_{n+1}, y_{n+1}$ at the same radius but at an incremented angle $\theta_n + \alpha$. Then, recalling the identity $\cos(A + B) = \cos A\cos B - \sin A\sin B$, we can express

$$x_{n+1} = r\cos(\theta_n + \alpha) = r\cos\theta_n\cos\alpha - r\sin\theta_n\sin\alpha = x_n\cos\alpha - y_n\sin\alpha$$

and, recalling the identity $\sin(A + B) = \sin A\cos B + \sin B\cos A$, we can express

$$y_{n+1} = r\sin(\theta_n + \alpha) = r\sin\theta_n\cos\alpha + r\cos\theta_n\sin\alpha = y_n\cos\alpha + x_n\sin\alpha.$$

Thus, the process is described by the system of linear difference equations

$$x_{n+1} = (\cos\alpha)x_n - (\sin\alpha)y_n, \qquad y_{n+1} = (\sin\alpha)x_n + (\cos\alpha)y_n. \tag{37}$$

Surely, if one plots such a sequence of points it will fall on the circle of radius $r$ centered at the origin. Suppose that we now modify the process by including two quadratic $x_n^2$ terms, so that we have the nonlinear system

$$x_{n+1} = (\cos\alpha)x_n - (\sin\alpha)\left(y_n - x_n^2\right), \qquad y_{n+1} = (\sin\alpha)x_n + (\cos\alpha)\left(y_n - x_n^2\right). \tag{38}$$

… interesting by virtue of the nonlinearity. For a discussion of the main results, we highly recommend the little book Mathematics and the Unexpected, by Ivar Ekeland (Chicago: University of Chicago Press, 1988).

Closure. This section is primarily about the concept of stability in the numerical solution of differential equations. A scheme is stable if the roundoff error remains small compared to the exact solution. Normally, one establishes the stability or instability of a method with respect to the simple test equation $y' = Ay$. Assuming that roundoff enters in the initial condition and that the computer is perfect thereafter, one can derive a difference equation governing the roundoff error $e_n$, and solve it analytically to see if $e_n$ remains small. Doing so, we show that the midpoint rule is only weakly stable: stable if $A > 0$ and unstable if $A < 0$. As a rule of thumb, we suggest that for a given differential equation $y' = f(x, y)$ we can expect the midpoint rule to be stable if $\partial f/\partial y > 0$ and unstable if $\partial f/\partial y < 0$ over the $x, y$ region of interest.
To explain the source of the instability in the midpoint rule, we observe that the exact solution (17) of the midpoint rule difference equation corresponding to the test equation $y' = Ay$ contains two terms, one that corresponds to the exact solution of $y' = Ay$ and the other extraneous. The latter enters because the midpoint rule difference equation is of second order, whereas the differential equation is only of first order, and it is that extraneous term that leads to the instability. Single-step methods such as the Euler and Runge-Kutta methods, however, are strongly stable, provided that $h$ is sufficiently small.

Observe that the only multi-step method that we examine is the midpoint rule; we neither show nor claim that all multi-step methods exhibit such instability. For instance, it is left for the exercises to show that the multi-step fourth-order Adams-Moulton method is strongly stable (for sufficiently small $h$). Thus, the idea is that the extraneous terms in the solution, which arise because the difference equation is of a higher order than the differential equation, can, but need not, cause trouble.

We close the section with a brief study of difference equations, independent of any connection with differential equations and stability, since they are important in their own right in the modeling of discrete systems. We stress how analogous are the theories governing differential and difference equations that are linear, homogeneous, and with constant coefficients.

Computer software. Just as many differential equations can be solved analytically by computer-algebra systems, so can many difference equations. Using Maple, for instance, the relevant command is rsolve. For instance, to solve the difference equation $y_{n+2} - y_{n+1} - 6y_n = 0$ (from Example 4), enter

rsolve(y(n + 2) - y(n + 1) - 6*y(n) = 0, y(n));

and return. The result is equivalent to

$$\left(\tfrac{3}{5}y(0) - \tfrac{1}{5}y(1)\right)(-2)^n + \left(\tfrac{2}{5}y(0) + \tfrac{1}{5}y(1)\right)3^n,$$

the correct solution for any initial values $y(0)$ and $y(1)$. Of course, we could re-express the latter as $C_1(-2)^n + C_2(3)^n$ … have entered … y(n)); and would have obtained … as the desired particular solution.

EXERCISES 6.5

1. If the given initial-value problem were to be solved by the fourth-order Runge-Kutta method (and we are not asking you to do that), do you think accurate results could be obtained? Explain. The $x$ domain is $0 \le x < \infty$.
(a) $y' = 2y - 8x + 4$; $y(0) = 0$
(b) $y' = y - 2e^{-x}$; $y(0) = 1$
(c) $y' = y + 5e^{-4x}$; $y(0) = -1$
(d) $y' = 1 + 3(y - x)$; $y(0) = 0$

2. It is natural to wonder how well we would fare trying to solve (1) using computer software. Using the Maple dsolve command with the abserr option, see if you can obtain accurate results at the points $x = 1, 4, 8, 12$ listed in Table 1.

3. One can see if a computed solution exhibits instability, as did the solution obtained by the midpoint rule and plotted in Fig. 2, when we have the exact solution to compare it with. In practice, of course, we don't have the exact solution to compare with; if we did, then we would not be solving numerically in the first place. Thus, when a computed solution exhibits an oscillatory behavior, how do we know that it is incorrect; perhaps the exact solution has exactly that oscillatory behavior? One way to find out is to rerun the solution with $h$ halved. If the oscillatory behavior is part of the exact solution, then the new results will oscillate every two steps rather than every step. Using this idea, run the case shown in Fig. 2 twice, for $h = 0.05$ and $h = 0.025$, and comment on whether the results indicate a true instability or not.

4. We derived the solution (12) of the difference equation (7) in the text. Verify, by direct substitution, that (12) does satisfy (7) for any choice of the arbitrary constants $C_1$ and $C_2$.

5. In (13) we showed that $\left(Ah + \sqrt{1 + A^2h^2}\right)^n \sim e^{Ax_n}$ as $h \to 0$, yet it would appear that $\left(Ah + \sqrt{1 + A^2h^2}\right)^n \sim \left(\sqrt{1}\right)^n = 1$. Explain the apparent contradiction.

6. The purpose of this exercise is to explore the validity of the rule of thumb that we gave regarding the solution of the equation $y' = f(x, y)$ by the midpoint rule, namely, that the method should be stable if $\partial f/\partial y > 0$ and unstable if $\partial f/\partial y < 0$ over the region of interest. Specifically, in each case apply the rule of thumb and draw what conclusions you can about the stability of the midpoint rule solution of the given problem. Then, program and run the midpoint rule with $h = 0.05$, say, over the given $x$ interval. Discuss the numerical results and whether the rule of thumb was correct. (Since the midpoint rule is not self-starting, use ten Euler steps from $x = 0$ to $x = 0.05$ to get the method started.)
(a) $y' = e^{\ldots}$; $y(0) = 1$, $0 \le x \le 10$
(b) $y' = e^{\ldots}$; …
(c) $y' = (4 - x)y$; $y(0) = 1$, $0 \le x \le 5$
(d) $y' = (x - 1)y$; $y(0) = 2$, $0 \le x \le 4$

7. We stated in the text that the results in Table 2 are consistent with a critical $h$ value of 0.002 because the calculations change from unstable to stable as $h$ decreases from 0.01 to 0.001. Program and carry out the Euler calculation of the solution to the initial-value problem (18) using $h = 0.0021$ and $0.0019$, out to around $x = 1$, and see if these $h$ values continue to bracket the change from unstable to stable. (You may try to bracket $h_{cr}$ even more tightly if you wish.)

8. (Stability of second-order Runge-Kutta methods) In Section 6.3 we derived the general second-order Runge-Kutta method, which includes these as special cases: the improved Euler method,

$$y_{n+1} = y_n + \frac{h}{2}\left\{f(x_n, y_n) + f\left[x_{n+1},\, y_n + hf(x_n, y_n)\right]\right\} \tag{8.1}$$

and the modified Euler method,

$$y_{n+1} = y_n + hf\left[x_n + \frac{h}{2},\, y_n + \frac{h}{2}f(x_n, y_n)\right]. \tag{8.2}$$

(a) For the test equation $y' = Ay$, show that the improved Euler method is strongly stable for sufficiently small $h$ (i.e., as $h \to 0$). For the case where $A < 0$, show that that stability is achieved only if $h < h_{cr} = 2/|A|$.
(b) For the test equation $y' = Ay$, show that the modified Euler method is strongly stable for sufficiently small $h$ (i.e., as $h \to 0$). For the case where $A < 0$, show that that stability is achieved only if $h < h_{cr} = 2/|A|$.

9. (Strong stability of the multi-step Adams-Bashforth method) This exercise is to show that the "AB" method is strongly stable for sufficiently small $h$ (i.e., as $h \to 0$) even though it is a multi-step method. Recall, from Section 6.3, the fourth-order Adams-Bashforth method

$$y_{n+1} = y_n + \frac{h}{24}\left(55f_n - 59f_{n-1} + 37f_{n-2} - 9f_{n-3}\right), \tag{9.1}$$

where $f_n = f(x_n, y_n)$.
(a) Consider the test equation $y' = Ay$, where the constant $A$ can be positive or negative; that is, let $f(x, y) = Ay$ in (9.1). Seeking a solution of the fourth-order difference equation (9.1) in the form $y_n = \rho^n$, show that $\rho$ must satisfy the fourth-degree characteristic equation

$$\rho^4 - (1 + 55\alpha)\rho^3 + 59\alpha\rho^2 - 37\alpha\rho + 9\alpha = 0, \tag{9.2}$$

where $\alpha = Ah/24$.
(b) Notice that as $h$ tends to zero so does $\alpha$, and (9.2) reduces to $\rho^4 - \rho^3 = 0$, with the roots $\rho = 0, 0, 0, 1$. Thus, if we denote the roots of (9.2) as $\rho_1, \ldots, \rho_4$, then we see that the first three of these tend to zero and the last to unity as $h \to 0$, and the general solution for $y_n$ behaves as

$$y_n = C_1\rho_1^n + C_2\rho_2^n + C_3\rho_3^n + C_4\rho_4^n \sim C_4 1^n \tag{9.3}$$

as $h \to 0$. Since $\rho^n$ tends to zero, unity, or infinity, depending upon whether $\rho$ is smaller than, equal to, or greater than unity, we need to examine the $1^n$ term in (9.3) more closely. Specifically, seeking $\rho_4$ in the power series form $\rho_4 = 1 + a\alpha + \cdots$, put that form into (9.2). Equating coefficients of $\alpha$ on both sides of that equation through first-order terms, show that $a = 24$. Thus, in place of (9.3) we have the more informative statement

$$y_n \sim C_4(1 + 24\alpha)^n = C_4(1 + Ah)^n \sim C_4e^{Ax_n} \tag{9.4}$$

as $n \to \infty$. Show why the final step in (9.4) is true. Since the right-hand side of (9.4) is identical to the exact solution $y(x)$ of the given differential equation, whether $A$ is positive or negative, we conclude that the AB method is strongly stable for sufficiently small $h$.

10. (Strong stability of the multi-step Adams-Moulton method) Recall, from Section 6.3, the fourth-order Adams-Moulton method,

$$y_{n+1} = y_n + \frac{h}{24}\left(9f_{n+1} + 19f_n - 5f_{n-1} + f_{n-2}\right). \tag{10.1}$$

Proceeding along the same lines as outlined in Exercise 9, show that the AM method is strongly stable for sufficiently small $h$.

11. Derive the general solution of the given difference equation. If initial conditions are specified, then find the corresponding particular solution. In each case $n = 0, 1, 2, \ldots$.
(a) $y_{n+1} - 4y_n = 0$; $y_0 = 5$
(b) $y_{n+2} - y_n = 0$; …
(c) $y_{n+2} + y_{n+1} - 6y_n = 0$; …
(d) $y_{n+2} - 4y_{n+1} + 3y_n = 0$; …
(e) $y_{n+2} - 3y_{n+1} + 2y_n = 0$; …
(f) $y_{n+3} - y_{n+2} - 4y_{n+1} + 4y_n = 0$; …
(g) $y_{n+4} - 5y_{n+2} + 4y_n = 0$
(h) $y_{n+4} - 6y_{n+2} + 8y_n = 0$

12. (a)-(h) Use computer software to obtain the general solution of the corresponding problem in Exercise 11, and the particular solution as well, if initial conditions are given.

13. (Repeated roots) Recall from the theory of linear homogeneous differential equations with constant coefficients that if, when we seek $y(x) = e^{\lambda x}$, $\lambda$ is a root of multiplicity $k$ of the characteristic equation, then it gives rise to the solutions $y(x) = \left(C_1 + C_2x + \cdots + C_kx^{k-1}\right)e^{\lambda x}$, where $C_1, \ldots, C_k$ are arbitrary constants. An analogous result holds for difference equations. Specifically, verify that the characteristic equation of $y_{n+2} - 2by_{n+1} + b^2y_n = 0$ has the root $b$ with multiplicity 2, and that $y_n = (C_1 + C_2n)b^n$.

14. Show that if $y_n^{(1)}$ and $y_n^{(2)}$ are solutions of the second-order linear homogeneous difference equation $a_0(n)y_{n+2} + a_1(n)y_{n+1} + a_2(n)y_n = 0$, and $Y_n$ is any particular solution of the nonhomogeneous equation $a_0(n)y_{n+2} + a_1(n)y_{n+1} + a_2(n)y_n = f(n)$, then $y_n = C_1y_n^{(1)} + C_2y_n^{(2)} + Y_n$ is a solution of the nonhomogeneous equation.

15. (Nonhomogeneous difference equations) For the given equation, first find the general solution of the homogeneous equation. Then adapt the method of undetermined coefficients … in the case of differential equations … solution. Finally, give the general solution of the given nonhomogeneous difference equation. (First, read Exercise 14.)
(a) $y_{n+1} - 3y_n = \ldots$
(b) $y_{n+1} - 2y_n = 3\sin n$
(c) $y_{n+1} - y_n = 2 + \cos n$
(d) $y_{n+2} - 5y_{n+1} + 6y_n = 2n^2 - 6n - 1$
(e) …
(f) …
(g) …
(h) …

16. (a)-(h) Use computer software to obtain the general solution of the corresponding problem in Exercise 15.

Chapter 6 Review

Decomposing the error as …

One's interest in higher-order methods is not just a matter of accuracy because, in principle, one could rely exclusively on the simple and easily programmed Euler method, and make $h$ small enough to achieve any desired accuracy. There are two problems with that idea. First, as $h$ is decreased the number of steps increases and one can expect the numerical roundoff error to grow, so that it may not be possible to achieve the desired accuracy. Second, there is the question of economy. For instance, while the fourth-order Runge-Kutta method (for example) is about four times as slow as the Euler method (because it requires four function evaluations per step compared to one for the Euler method), the gain in accuracy that it affords is so great that we can use a step size much more than four times that needed by the Euler method for the same accuracy, thereby resulting in greater economy.

Naturally, higher-order methods are more complex and hence more tedious to program. Thus, we strongly urge (in Section 6.3.4) the empirical estimation of the order, if only as a check on the programming and implementation of the method.

In Section 6.4 we showed that the methods developed for the single equation $y' = f(x, y)$ can be used to solve systems of equations and higher-order equations as well. There we also study boundary-value problems, and find them to be significantly more difficult than initial-value problems.
However, we show how to use the principle of superposition to convert a boundary-value problem to one or more problems of initial-value type, provided that the problem is linear.

Finally, in Section 6.5 we look at "what can go wrong," mostly insofar as numerical instability due to the growth of roundoff error, and an analytical approach is put forward for predicting whether a given method is stable. Actually, stability depends not only on the solution algorithm but also on the differential equation, and our analyses are for the simple test equation $y' = Ay$ rather than for the general case $y' = f(x, y)$. We find that whereas the differential equation $y' = Ay$ is of first order, the difference equation expressed by the algorithm is of higher order if the method is of multi-step type. Thus, it has among its solutions the exact solution (as $h \to 0$) and one or more extraneous solutions as well. It is those extraneous solutions that can cause instability. For instance, the midpoint rule is found to be stable if $A > 0$ and unstable if $A < 0$; we classify it as weakly stable because its stability depends upon the sign of $A$. However, the fourth-order Adams-Bashforth and Adams-Moulton methods are stable, even though they are multi-step methods, because the extraneous solutions do not grow. Single-step methods such as Euler and those of Runge-Kutta type do not give rise to extraneous solutions and are stable. Finally, we stress that even if a method is stable as $h \to 0$, $h$ needs to be reduced below some critical value for that stability to be manifested.

Chapter 7

Qualitative Methods: Phase Plane and Nonlinear Differential Equations

7.1 Introduction

This is the final chapter on ordinary differential equations, although we do return to the subject in Chapter 11, where we reconsider systems of linear differential equations using matrix methods.
Interest in nonlinear differential equations is virtually as old as the subject of differential equations itself, which dates back to Newton, but little progress was made until the late 1880's, when the great mathematician and astronomer Henri Poincaré (1854-1912) took up a systematic study of the subject in connection with celestial mechanics. Realizing that nonlinear equations are rarely solvable analytically, and not yet having the benefit of computers to generate solutions numerically, he sidestepped the search for solutions altogether and instead sought to answer fundamental questions about the qualitative and topological nature of solutions of nonlinear differential equations without actually finding them.

The entire chapter reflects either his methods, such as the use of the so-called "phase plane" and focusing attention upon the "singular points" of the equation, or the spirit of his approach. In addition, however, we can now rely heavily upon computer simulation. Thus, our approach in this chapter is a blend of a qualitative, topological, and geometric approach, with quantitative results obtained readily with computer software.

Though Poincaré's work was motivated primarily by problems of celestial mechanics, the subject began to attract broader attention during and following World War II, especially in connection with nonlinear control theory. In the postwar years, interest was stimulated further by the publication in English of N. Minorsky's Nonlinear Mechanics (Ann Arbor, MI: J. W. Edwards) in 1947. With that and other books, such as A. Andronov and C. Chaikin's Theory of Oscillations (Princeton: Princeton University Press, 1949) and J. J. Stoker's Nonlinear Vibrations (New York: Interscience, 1950) available as texts, the subject appeared in university curricula by the end of the 1950's.
With that base, and the availability of digital computers by then, the subject of nonlinear dynamics, generally known now as dynamical systems, has blossomed into one of the most active research areas, with applications well beyond celestial mechanics and engineering: to biological systems, the social sciences, economics, and chemistry. The shift from the orderly determinism of Newton to the often nondeterministic chaotic world of Mitchell Feigenbaum, E. N. Lorenz, Benoit Mandelbrot, and Stephen Smale has been profound. For a wonderful historical discussion of these changes we suggest the little book Mathematics and the Unexpected by Ivar Ekeland (Chicago: University of Chicago Press, 1988).

7.2 The Phase Plane

To introduce the phase plane, consider the system

$$m x'' + k x = 0 \tag{1}$$

governing the free oscillation of the simple harmonic mechanical oscillator shown in Fig. 1. Of course we can readily solve (1) and obtain the general solution $x(t) = C_1 \cos \omega t + C_2 \sin \omega t$, where $\omega = \sqrt{k/m}$ is the natural frequency or, equivalently,

$$x(t) = A \sin(\omega t + \phi), \tag{2}$$

where $A$ and $\phi$ are the amplitude and phase angle, respectively.

Figure 1. Simple harmonic mechanical oscillator.

To present this result graphically, one can plot $x$ versus $t$ and obtain any number of sine waves of different amplitude and phase, but let us proceed differently. We begin by re-expressing (1), equivalently, as the system of first-order equations

$$\frac{dx}{dt} = y, \tag{3a}$$
$$\frac{dy}{dt} = -\frac{k}{m}x, \tag{3b}$$

as is discussed in Section 3.9. The auxiliary variable $y$, defined by (3a), happens to have an important physical significance (it is the velocity), but having such significance is not necessary. Next, we deviate from the ideas presented in Section 3.9 and divide (3b) by (3a), obtaining

$$\frac{dy}{dx} = -\frac{k}{m}\,\frac{x}{y} \quad\text{or}\quad m y\,dy + k x\,dx = 0, \tag{4}$$

integration of which gives

$$\frac{1}{2} m y^2 + \frac{1}{2} k x^2 = C. \tag{5}$$

Since $y = dx/dt$, (5) is a first-order differential equation.
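As a numerical check on (5), the following minimal sketch integrates the system (3) with a classical fourth-order Runge-Kutta step and monitors the left side of (5) along the computed trajectory. The values of m and k, the step size, and the helper names `rk4_step`, `oscillator`, and `energy` are illustrative assumptions, not part of the text.

```python
import math

def rk4_step(f, state, dt):
    """Advance state = (x, y) by one classical Runge-Kutta step of size dt."""
    x, y = state
    k1 = f(x, y)
    k2 = f(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1])
    k3 = f(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1])
    k4 = f(x + dt * k3[0], y + dt * k3[1])
    return (x + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
            y + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)

m, k = 2.0, 8.0  # assumed illustrative values, so omega = sqrt(k/m) = 2

def oscillator(x, y):
    # system (3): x' = y, y' = -(k/m) x
    return (y, -(k / m) * x)

def energy(x, y):
    # left side of (5)
    return 0.5 * m * y * y + 0.5 * k * x * x

state = (1.0, 0.0)          # x(0) = 1, y(0) = 0, so x(t) = cos(2t)
c0 = energy(*state)         # the constant C for this trajectory
drift = 0.0
for _ in range(5000):       # 5000 steps of size 0.001, i.e. t from 0 to 5
    state = rk4_step(oscillator, state, 0.001)
    drift = max(drift, abs(energy(*state) - c0))

print("C =", c0, " largest deviation from C:", drift)
print("x(5) =", state[0], " cos(10) =", math.cos(10.0))
```

The deviation stays at roundoff level, which is what one expects if every computed point lies on the single ellipse (5) selected by the initial condition.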
We could solve for $y$ (i.e., $dx/dt$), separate variables, integrate again, and eventually arrive at (2) once again. Instead, let us take (5) as our end result and plot the one-parameter family of ellipses that it defines (Fig. 2), the parameter being the integration constant $C$. In this example $C$ happens to be the total energy (kinetic energy of the mass plus potential energy of the spring); $C = 0$ gives the “point ellipse” $x = y = 0$, and the greater the value of $C$, the larger the ellipse.

It is customary to speak of the $x, y$ plane as the phase plane. Each integral curve represents a possible motion of the mass, and each point on a given curve represents an instantaneous state of the mass (the horizontal coordinate being the displacement and the vertical coordinate being the velocity). Observe that the time $t$ enters only as a parameter, through the parametric representation $x = x(t)$, $y = y(t)$. So we can visualize the representative point $x(t), y(t)$ as moving along a given curve, as suggested by the arrows in Fig. 2. The direction of the arrows is implied by the fact that $y = dx/dt$, so that $y > 0$ implies that $x(t)$ is increasing and $y < 0$ implies that $x(t)$ is decreasing.

One generally calls the integral curves phase trajectories, or simply trajectories, to suggest the idea of movement of the representative point. A display of a number of such trajectories in the phase plane is called a phase portrait of the original differential equation, in this case (1). Of course, there is a trajectory through each point of the phase plane, so if we showed all possible trajectories we would simply have a black picture; the idea is to plot enough trajectories to establish the key features of the phase portrait.

What are the advantages of presenting results in the form of a phase portrait, rather than as traditional plots of $x(t)$ versus $t$?
One advantage of the phase portrait is that it requires only a “first integral” of the original second-order equation, such as equation (5) in this example, and sometimes we can obtain the first integral even when the original differential equation is nonlinear. For instance, let us complicate (1) by supposing that the spring force is not given by the linear function $F_s = kx$, but by the nonlinear function $F_s = ax + bx^3$, and suppose that $a > 0$ and $b > 0$ so that the spring is a “hard” spring: it grows stiffer as $x$ increases (Fig. 3), as does a typical rubber band. If we take $a = b = m$, say, for definiteness and for simplicity, then in place of (1) we have the nonlinear equation

$$x'' + x + x^3 = 0. \tag{6}$$

Proceeding as before, we re-express (6) as the system

$$x' = y, \tag{7a}$$
$$y' = -x - x^3. \tag{7b}$$

Division gives

$$\frac{dy}{dx} = -\frac{x + x^3}{y} \quad\text{or}\quad y\,dy + (x + x^3)\,dx = 0, \tag{8}$$

which yields the first integral

$$\frac{1}{2}y^2 + \frac{1}{2}x^2 + \frac{1}{4}x^4 = C. \tag{9}$$

Figure 2. Phase portrait of (1).

Figure 4. Phase portrait of (6) (hard spring).

In principle, if we plot these curves for various values of $C$ we can obtain the phase portrait shown in Fig. 4. More conveniently, we generated the figure by using the Maple phaseportrait command discussed at the end of this section. A comparable phase portrait plotting capability is provided in numerous other computer software systems.

To repeat, one advantage of the phase portrait presentation is that it requires only a first integral. In the present case (6) was nonlinear due to the $x^3$ term, yet its first integral (9) was readily obtained.

A second attractive feature of the phase portrait is its compactness. For instance, observe that the single phase trajectory $\Gamma$ in Fig. 2 corresponds to an entire family of oscillations of amplitude $A$, several of which are shown in Fig. 5, since any point on $\Gamma$ can be designated as the initial point ($t = 0$): if the initial point on $\Gamma$ is $(A, 0)$, then we get the curve #1 in Fig.
5; if the initial point on $\Gamma$ is a bit counterclockwise of $(A, 0)$, then we get the curve #2; and so on. Passing from the $x, t$ plane to the $x, y$ plane, the infinite family of curves shown in Fig. 5 collapses onto the single trajectory $\Gamma$ in Fig. 2. Put differently, whereas the solution (2) of equation (1) is a two-parameter family of curves in $x, t$ space (the parameters being $A$ and $\phi$), (5) is only a one-parameter family of curves in the $x, y$ plane (the parameter being $C$). That compactness can be traced to the division of (3b) by (3a), or (7b) by (7a), for that step essentially eliminates the time $t$.

Figure 5. Solutions $x(t)$ corresponding to the trajectory $\Gamma$.

To learn about nonlinear systems, it is useful to contrast the phase portraits of the linear oscillator governed by (1) and the nonlinear oscillator governed by (6), given in Figs. 2 and 4, respectively. The phase portrait in Fig. 2 is extremely simple in the sense that all the trajectories are geometrically similar, differing only in scale. That is, if a trajectory is given by $x = X(t)$, $y = Y(t)$, then $x = \kappa X(t)$, $y = \kappa Y(t)$ is also a trajectory for every possible scale factor $\kappa$, be it positive, negative, or zero. That result holds not only for the system (3) but for any constant-coefficient linear homogeneous system

$$x' = ax + by, \qquad y' = cx + dy. \tag{10}$$

In contrast, consider the phase portrait of the nonlinear equation $x'' + \alpha x + \beta x^3 = 0$ shown in Fig. 4. In that case the trajectories are not mere scalings of each other; there is distortion of shape from one to another, and that distortion is due entirely to the nonlinearity of the differential equation. The innermost trajectories approach ellipses [because as smaller and smaller motions are considered the $x^4$ term becomes more and more negligible compared to the other terms in (9)], and the outer ones become more and more distorted as the effect of the $x^4$ term grows in (9).
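The contrast between scalable and non-scalable trajectories can be checked numerically. The sketch below is only an illustration under stated assumptions: it integrates (7) and the linear case x' = y, y' = -x with a hand-written Runge-Kutta loop, doubles every computed point, and measures how much the corresponding first integral varies over the doubled points. The helper names (`rk4_orbit`, `spread`), the initial point (1, 0), and the scale factor 2 are choices made here, not taken from the text.

```python
def rk4_orbit(f, x, y, dt, n):
    """Return n + 1 points of the solution of x' = f[0], y' = f[1] by classical RK4."""
    pts = [(x, y)]
    for _ in range(n):
        k1 = f(x, y)
        k2 = f(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1])
        k3 = f(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1])
        k4 = f(x + dt * k3[0], y + dt * k3[1])
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        pts.append((x, y))
    return pts

def hard(x, y):
    # system (7): hard spring
    return (y, -x - x ** 3)

def linear(x, y):
    # system (3) with k = m
    return (y, -x)

def hard_integral(x, y):
    # left side of (9)
    return 0.5 * y * y + 0.5 * x * x + 0.25 * x ** 4

def linear_integral(x, y):
    # left side of (5) with k = m = 1
    return 0.5 * y * y + 0.5 * x * x

def spread(values):
    return max(values) - min(values)

hard_orbit = rk4_orbit(hard, 1.0, 0.0, 0.001, 10000)      # t from 0 to 10
linear_orbit = rk4_orbit(linear, 1.0, 0.0, 0.001, 10000)

# Along each computed orbit the first integral is constant (up to roundoff).
on_orbit = spread([hard_integral(x, y) for x, y in hard_orbit])

# Doubling a linear trajectory gives points that again share one value of C ...
linear_scaled = spread([linear_integral(2 * x, 2 * y) for x, y in linear_orbit])
# ... but doubling a hard-spring trajectory does not: C varies over the doubled points.
hard_scaled = spread([hard_integral(2 * x, 2 * y) for x, y in hard_orbit])

print(on_orbit, linear_scaled, hard_scaled)
```

For the hard spring the doubled points satisfy $\frac12 y^2 + \frac12 x^2 + \frac14 x^4 = 3 + 3x_0^4$, where $x_0$ is the abscissa of the original point, so the value ranges from 3 to 6 and the doubled curve is not a trajectory.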
Thus, whereas the phase portrait of the linear equation (1) amounts to a single kind of trajectory, repeated endlessly through scalings, that of the nonlinear equation (6) is made up of an infinity of different kinds of trajectories. That richness is a hallmark of nonlinear equations, as we shall see in the next example and in the sections that follow.

Before turning to the next example, let us complement the phase portrait in Fig. 4 with representative plots of $x(t)$ versus $t$. We choose the two sets of initial conditions: $x(0) = 0.5$, $x'(0) = 0$ and $x(0) = 1$, $x'(0) = 0$. The results are shown in Fig. 6, together with the corresponding solutions of the linear equation $x'' + x = 0$ (shown as dotted) for reference. Besides the expected distortion we also observe that the frequency of the oscillation is amplitude dependent for the nonlinear case: the frequency increases as the amplitude increases. In contrast, for the linear equation (1) the frequency $\omega = \sqrt{k/m} = 1$ is a constant, independent of the amplitude.

Figure 6. Effects of nonlinearity on $x(t)$.

Above, we mentioned the richness of the sets of solutions to nonlinear differential equations. A much more striking example of that richness is obtained if we reconsider the nonlinear oscillator, this time with a “soft” spring, that is, with $F_s = ax - bx^3$ ($a > 0$ and $b > 0$) as sketched in Fig. 7. Again setting $a = b = m$ we have, in place of (6),

$$x'' + x - x^3 = 0. \tag{11}$$

In place of (9) we have

$$\frac{1}{2}y^2 + \frac{1}{2}x^2 - \frac{1}{4}x^4 = C, \tag{12}$$

and in place of the phase portrait shown in Fig. 4 we obtain the strikingly different one shown in Fig. 8. We continue to study this example in Section 7.3, but even now we can make numerous interesting observations.

Figure 8. Phase portrait of (11) (soft spring).

First, whereas all of the motions revealed in the phase plane for the hard spring (Fig. 4) are qualitatively similar oscillatory motions, give or take some distortion from one to another, we see in Fig. 8 a number of qualitatively different types of motion, and these are ... separatrix. ... (13) ... Solving ... (14a), (14b) ... $(-1,0)$, $(0,0)$, $(1,0)$, (15) ... (16) ... GBDEI are given by ... (17) respectively.

Beginning at $D$, say, the representative point $P$ moves rightward on $DE$ and approaches the equilibrium point $E$. Does it reach $E$ in finite time and then remain there, or does it approach $E$ asymptotically as $t \to \infty$? To answer that question we use (16). Let the tangent line to the curve $DEI$, at $E$, be $y = m(x - 1)$. [We could determine the slope $m$ by differentiating (17), but the value of $m$ will not be important.] Then, since $ds = \sqrt{1 + (dy/dx)^2}\,dx \sim \sqrt{1 + m^2}\,dx$ as $P \to E$, we can replace $s'$ in (16) by $\sqrt{1 + m^2}\,dx/dt$, and $y$ by $m(x - 1)$, so (16) becomes

$$\sqrt{1 + m^2}\,\frac{dx}{dt} \sim \sqrt{m^2 (x-1)^2 + x^2 (x+1)^2 (x-1)^2} \sim -\sqrt{m^2 + 4}\,(x - 1), \tag{18}$$

where the negative square root has been chosen since $dx/dt > 0$ as $P \to E$ on $DE$, whereas $x - 1 < 0$ on $DE$, and where the last step in (18) is left for the exercises. The upshot is that

$$\frac{dx}{dt} \sim \gamma (1 - x) \tag{19}$$

as $P \to E$, for some finite positive constant $\gamma$. Thus,

$$\frac{dx}{1 - x} \sim \gamma\,dt, \tag{20}$$

so

$$-\ln(1 - x) \sim \gamma t + \text{constant}, \tag{21}$$

and we can now see that $t \to \infty$ as $P \to E$ (i.e., as $x \to 1$). Thus, $P$ does not reach $E$ in finite time but only asymptotically as $t \to \infty$. Similarly, if we begin at point $H$ and go backward in time, then we reach $E$ only as $t \to -\infty$.

Let us return now to the region inside of the football and consider any closed orbit $\Gamma$. As the size of $\Gamma$ shrinks to zero, $\Gamma$ tends to the elliptical (actually circular because of our choice $a = m$) shape $y^2 + x^2 = \text{constant}$, and the period of the motion tends to $2\pi$ [since the solution of the linearized problem is $x(t) = A \sin(t + \phi)$]. At the other extreme, as $\Gamma$ gets larger it approaches the pointed shape $BDEHB$.
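For this particular example the asymptotic estimate (21) can be compared with an exact expression. The following is a sketch under one assumption that is not spelled out in the surviving text: that the arc $DE$ is the part of the level curve (12) through $(1,0)$, i.e. $C = 1/4$, with $y > 0$. Then

$$\frac{1}{2}y^2 = \frac{1}{4} - \frac{1}{2}x^2 + \frac{1}{4}x^4 = \frac{1}{4}\left(1 - x^2\right)^2 \quad\Longrightarrow\quad y = \frac{1 - x^2}{\sqrt{2}} \quad (|x| < 1,\ y > 0),$$

and since $x' = y$ on that arc,

$$\frac{dx}{dt} = \frac{1 - x^2}{\sqrt{2}} \quad\Longrightarrow\quad x(t) = \tanh\!\left(\frac{t}{\sqrt{2}}\right) \quad \text{if } x(0) = 0 .$$

Hence $1 - x(t) \sim 2e^{-\sqrt{2}\,t}$ as $t \to \infty$, so that $-\ln(1 - x) \sim \sqrt{2}\,t - \ln 2$, in agreement with (21) for $\gamma = \sqrt{2}$. The same value follows from (18): the slope of this curve at $x = 1$ is $m = -\sqrt{2}$, and $\gamma = \sqrt{(m^2 + 4)/(1 + m^2)} = \sqrt{6/3} = \sqrt{2}$. In particular $x(t) < 1$ for every finite $t$.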
Bearing in mind that it takes infinite time to reach $E$ along an approach from $D$, it seems evident that the period of the $\Gamma$ motion must tend to infinity as $\Gamma$ approaches $BDEHB$; we will ask you to explore this point in the exercises. From a physical point of view, the idea is that not only is the “flow” zero at $E$ (where $x' = y' = 0$), it is very slow in the neighborhood of $E$. If $\Gamma$ is any closed trajectory that is just barely inside of $BDEHB$, then part of $\Gamma$ falls within that stagnant neighborhood of $E$ (similarly at $B$). The representative point $P$ moves very slowly there, hence the period is very large.

We reiterate that although each closed loop inside the football corresponds to a periodic motion, the closed loop $BDEHB$ does not. In fact, although $BDE$ and $EHB$ meet at $B$ and $E$, they are distinct trajectories. On $BDE$, $t$ varies from $-\infty$ at $B$ to $+\infty$ at $E$; likewise, on $EHB$, $t$ varies from $-\infty$ at $E$ to $+\infty$ at $B$. Thus, if we begin on $BDE$ we can never get to $EHB$, and vice versa.

Finally, it should be evident that every trajectory that is not within the football corresponds to a nonperiodic motion. Thus, $BDE$ and $EHB$ form a transition from nonperiodic to periodic motions.

Thus far we have studied the three differential equations (1), (6), and (11). In each case we have changed the single second-order differential equation to a system of two first-order equations by setting $x' = y$ and then studied them in the $x, y$ phase plane. More generally, we consider in this chapter systems of the form

$$x' = P(x, y), \tag{22a}$$
$$y' = Q(x, y). \tag{22b}$$

That is, $P(x, y)$ need not equal $y$, and the system need not be a restatement of a single second-order equation. Rather, it might arise directly in the form (22). For instance, suppose that two species of fish coexist in a lake, say bluegills and bass. The bluegills, with population $x$, feed on vegetation which is available in unlimited quantity, and the bass, with population $y$, feed exclusively on the bluegills.
If the two species were separated, their populations could be assumed to be governed approximately by the rate equations

$$x' = \alpha x, \qquad y' = -\beta y, \tag{23}$$

where the populations $x(t)$ and $y(t)$ are considered to be large enough so that they can be approximated as continuous rather than discrete (integer valued) variables, and $\alpha$, $\beta$ are (presumably positive) constants that reflect net birth/death rates. The species are not separated, however, so we expect the effective $\alpha$ to decrease as $y$ increases and the effective $\beta$ to decrease as $x$ increases. An approximate revised model might then be expressed as

$$x' = (\alpha - \gamma y)x, \tag{24a}$$
$$y' = -(\beta - \delta x)y, \tag{24b}$$

which system is indeed of the form (22). This ecological problem is well known as Volterra's problem, and we shall return to it later.

The system (22) is said to be autonomous because there is no explicit dependence on the independent variable (the time $t$ here, but which could have some other physical or nonphysical significance). Surely not all systems are autonomous, but that class covers a great many cases of important interest, and that is the class that is considered in phase plane analysis and in this chapter.

Because (22a,b) are autonomous, any explicit reference to $t$ (namely, the $t$ derivatives) can be suppressed by dividing one equation by the other and obtaining

$$\frac{dy}{dx} = \frac{Q(x, y)}{P(x, y)}, \tag{25}$$

where we now change our point of view and regard $y$ as a function of $x$ in the $x, y$ phase plane, rather than $x$ and $y$ as functions of $t$.

If the system (22) were not autonomous, that is, if it were of the form $x' = P(x, y, t)$ and $y' = Q(x, y, t)$, one could still make it autonomous by re-expressing it, equivalently, as

$$x' = P(x, y, z), \qquad y' = Q(x, y, z), \qquad z' = 1,$$

but the phase space is then three-dimensional, which is more complicated. In this chapter we continue to consider the autonomous case (22) and the two-dimensional $x, y$ phase plane.

Closure.
As explained immediately above, our program in this section is to show the advantages of recasting an autonomous system (22) (which could, but need not, arise from a single second-order equation by letting $x'$ be an auxiliary dependent variable $y$) in the form (25) and then studying the solutions of that equation in the two-dimensional $x, y$ phase plane. One advantage is that (25) can sometimes be solved analytically even if (22a,b) are nonlinear. Indeed, our primary interest in Chapter 7 is in the nonlinear case. We find that the phase portrait provides a remarkable overview of the system dynamics, and the hard- and soft-spring oscillator examples begin to reveal some of the phenomenological richness of nonlinear systems.

We do not suggest the use of the phase plane as a substitute for obtaining and plotting solutions of (22) in the more usual way, $x$ versus $t$ and $y$ versus $t$. Rather, we suggest that to understand a complex nonlinear system one needs to combine several approaches, and for autonomous systems the phase plane is one of the most valuable.

Finally, we ask you to observe how the phase plane discussion is more qualitative and topological than lines of approach developed in the preceding chapters. For instance, regarding Fig. 8 we distinguish the qualitatively different types of motion such as the periodic orbits within $BDEHB$, the transitional motions on the separatrix itself, and the nonperiodic motions as well.

We distinguish between the physical velocity $x'(t)$ of the mass, in the preceding examples, and the phase velocity $s'(t)$, which is the velocity of the representative point $x(t), y(t)$ in the phase plane. It is useful, conceptually, to think of the $x'(t), y'(t)$ velocity field as the velocity field of a “flow,” such as a fluid flow, in the phase plane.

Finally, we mention that in the hard- and soft-spring oscillators, (6) and (11), we meet special cases of the extremely important Duffing equation, to which we return in a later section.

Computer software.
Here is how we generate the phase portrait shown in Fig. 8 using the Maple phaseportrait command. First, enter

    with(DEtools):

and return, to gain access to the phaseportrait command. Note the colon, whereas Maple commands are usually followed by semicolons. Next, enter

    phaseportrait([y, -x + x^3], [t, x, y], t = -20..20,
      {[0,0,0.1], [0,0,0.3], [0,0,0.9], [0,0,-0.9], [0,0,-0.70710781],
       [0,0,0.70710781], [0,0,0.6], [0,0,1.25], [0,0,-1.25],
       [0,1.5,0.8838834761], [0,1.5,-0.8838834761], [0,1.4,0], [0,1.8,0],
       [0,-1.5,0.8838834761], [0,-1.5,-0.8838834761], [0,-1.4,0], [0,-1.8,0]},
      stepsize = 0.05, y = -1.8..1.8, x = -2..2, scene = [x, y]);

and return. In [y, -x + x^3] the items are the right-hand sides of the first and second differential equations, respectively; [t, x, y] are the independent variable and dependent variables; t = -20..20 is the range of integration of the differential equations; within { } are the initial t, x, y points chosen in order to generate the trajectories shown in Fig. 8. After those points the remaining items are optional: stepsize = 0.05 sets the stepsize $h$ in the Runge-Kutta-Fehlberg integration, because the default value would be (final t - initial t)/20 = (20 + 20)/20 = 2, which would give too coarse a plot (as found by experience); y = -1.8..1.8, x = -2..2 gives a limit to the $x, y$ region, with the $t$ integrations terminated once a trajectory leaves that region; scene = [x, y] specifies the plot to be a two-dimensional plot in the $x, y$ plane, the default being to give a three-dimensional plot in $t, x, y$ space.

There are additional options that we have not used, one especially useful option for phase plane work being the arrows option, which gives a lineal element grid. The elements can be barbed or not. To include thin barbed arrows, type a comma and then arrows = THIN after the last option. Thus, we would have ... scene = [x, y], arrows = THIN);. In place of THIN type SLIM, or THICK for thicker arrows. For unbarbed lineal elements, use LINE in place of these.
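For readers without Maple, the computation behind such a plot can be sketched in a few lines of Python. This is not a reimplementation of phaseportrait: it is a minimal illustration that uses a fixed-step classical Runge-Kutta loop (Maple's integrator is Runge-Kutta-Fehlberg), produces lists of points rather than a plot, and assumes that the initial values 0.70710781 and 0.8838834761 in the command above are points on the level curve of (12) that passes through (1, 0). The function names, step size, and window handling are assumptions made here.

```python
import math

def soft(x, y):
    # right-hand sides [y, -x + x^3] of the soft-spring system
    return (y, -x + x ** 3)

def level(x, y):
    # left side of (12)
    return 0.5 * y * y + 0.5 * x * x - 0.25 * x ** 4

def trajectory(x, y, dt, n, xmax=2.0, ymax=1.8):
    """Classical RK4 from (x, y); stops once the point leaves the plot window."""
    pts = [(x, y)]
    for _ in range(n):
        k1 = soft(x, y)
        k2 = soft(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1])
        k3 = soft(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1])
        k4 = soft(x + dt * k3[0], y + dt * k3[1])
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        if abs(x) > xmax or abs(y) > ymax:
            break
        pts.append((x, y))
    return pts

c_sep = level(1.0, 0.0)  # value of C on the level curve through (1, 0)

def separatrix_y(x):
    # positive root y of level(x, y) = c_sep
    return math.sqrt(2.0 * (c_sep - 0.5 * x * x + 0.25 * x ** 4))

y_at_0 = separatrix_y(0.0)    # compare with 0.70710781 in the Maple command
y_at_15 = separatrix_y(1.5)   # compare with 0.8838834761 in the Maple command

inner = trajectory(0.0, 0.3, 0.002, 10000)     # t = 0..20, stays in the window
outer = trajectory(0.0, 1.25, 0.002, 10000)    # leaves the window, so it is cut short
edge = trajectory(0.0, y_at_0, 0.002, 4000)    # t = 0..8 along the level curve C = 1/4

print(c_sep, y_at_0, y_at_15)
print(len(inner), len(outer), edge[-1])
```

Integrating forward from (0, y_at_0) only to t = 8 is deliberate in this sketch: near the point (1, 0) small numerical errors grow roughly exponentially in t, so a much longer run started exactly on that curve can drift off it.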
The order of the options is immaterial.

Observe that the separatrix must be generated separately as $BDE$, $EHB$, $AB$, $GB$, $EF$, and $IE$. To generate $BDE$, for instance, we determine the coordinates of $D$. The equation of the entire separatrix is given by (12), where $C$ is determined by using the $x, y$ pair 1, 0 (namely, the point $E$, which is a known point on the separatrix). Thus, putting $x = 1$ and $y = 0$ into the left side of (12) gives $C = 1/4$. Next, put $x = 0$ and solve for $y$, obtaining $y = 1/\sqrt{2} \approx 0.7071$ at $D$. Then, with $D$ as the initial point we need $t$ to go from $-\infty$ to $+\infty$ to generate $BDE$. By trial, we find that $-20$ to $+20$ suffices for this segment and all others; similarly, we generate $EF$ by using a point on $EF$ at $x = 1.5$, and determine $y$ at that point from the separatrix equation. That calculation gives $y = 0.8838834761$.

Notice that to generate Fig. 8 with phaseportrait we need to already know something about the phase portrait, namely the equation of the separatrix, (12), so that we can choose suitable initial points on $AB$, $GB$, $BDE$, $EHB$, $EF$, and $IE$.

Suppose that we desire only the lineal element field, over $0 < x < 4$ and $0 < y < 4$, say. We can get it from phaseportrait as follows:

    phaseportrait([y, -x + x^3], [t, x, y], t = 0..1, {[0,0,0]}, x = 0..4, y = 0..4,
      scene = [x, y], arrows = THIN, grid = [20,20]);

We have included the one initial point [0,0,0] because the ... the trajectory through [0,0,0] gives simply the single point $x = y = 0$ in the $x, y$ phase plane.

... of $x$ or $y$ versus $t$ using the scene option. For instance,

    phaseportrait([y, -x + x^3], [t, x, y], t = 0..5, ..., stepsize = 0.05, scene = [t, x]);

... $x(0) =$ ... 0.8 and $y(0) = x'(0) =$ ... 0.2 and ... 1.3.

EXERCISES 7.2

1. We stated, below (5), that if we solve (5) for $y$ (i.e., $dx/dt$), separate variables, and integrate, we obtain the general solution $x(t) = A \sin(\omega t + \phi)$ of (1). Here we ask you to carry out those steps.

2. Supply the steps missing between the first and second lines of (18).

3.
We found in Fig. 6 that for the hard-spring oscillator the frequency increases with the amplitude. Explain, in simple terms, why that result makes sense.

4. Determine the equation of the phase trajectories for the given system, and sketch several representative trajectories. Use arrows to indicate the direction of movement along those trajectories.
(a)a’ = Ys y =a (c)ai=y", yi =-ay (b)2’=ay, y y! = —2x?

5. Determine the equation of the phase trajectories and sketch enough representative trajectories to show the essential features of the phase portrait. Use arrows to indicate the direction of movement along those trajectories.
(jcisy, y= -y (cja’=y, y=a (e)a’ =u, y=z Il (b)a’=y, |=y, ja’ = (Nei =a, y' < =9r y = 4x = yl i =y

6. (Period, for soft-spring oscillator) In the paragraph below (21), we suggest that the period $T$ of the periodic motions inside of $BDEHB$ (Fig. 8) tends to $2\pi$ in the limit as the amplitude $A$ of the motion tends to zero, and to $\infty$ as $A \to 1$. Here we ask you to explore that claim with calculations. Specifically, use phaseportrait (or other software) to solve $x'' + x - x^3 = 0$ subject to the initial conditions $x(0) = A$, $x'(0) = 0$, for $A = 0.05, 0.3, 0.6, 0.9, 0.95, 0.99$. From your results, obtain the period $T$ for each case and plot $T$ versus $A$ for those values of $A$ (and additional ones if you wish). Does the claim made in the first sentence appear to be correct?

7. We stated, in our discussion of Fig. 8, that all trajectories outside of the “football” region correspond to nonperiodic motions. Explain why that is true.

8. (Graphical determination of phase velocity) (a) For the system (22), consider the special case where $P(x, y) = y$, as occurred in (3) and (7), for instance. From the accompanying sketch, show that in that case the phase velocity $s'$ can be interpreted graphically as

$$s' = a, \tag{8.1}$$

where $a$ is the perpendicular distance from $P$ to the $x$ axis.
(b) Consider a rectangular phase trajectory $ABCDA$, where the corner points have the $x, y$ coordinates $A = (-1, 1)$, $B = (3, 1)$, $C = (3, -1)$, $D = (-1, -1)$. Using (8.1), plot the graph of $x(t)$ versus $t$, from $t = 0$ through $t = 20$, if the representative point $P$ is at $A$ at $t = 0$.
(c) Consider a phase trajectory $ABC$ consisting of straight-line segments from $A = (-1, 0)$ to $B = (0, 1)$ to $C = (1, 0)$, with $P$ at $B$ at $t = 0$. Using (8.1), sketch the graph of $x(t)$ versus $t$ over $-\infty < t < \infty$. Also, give $x(t)$ analytically over $-\infty < t < 0$ and $0 < t < \infty$.
(d) Consider a straight-line phase trajectory from $A = (0, 5)$ to $B = (10, -5)$. Using (8.1), sketch the graph of $x(t)$ versus $t$ over $0 < t < \infty$, if $P$ is at $A$ at $t = 0$.
(e) Same as (d), but with $P$ at $B$ at $t = 0$.

9. (a) Reduce the equation $x'' + 2x^3 = 0$ to a system of equations by setting $x' = y$. Find the equation of the phase trajectories and sketch several of them by hand. Show that for larger and larger motions the trajectories are flatter and flatter over $-1 < x < 1$.
(b) Use the Maple phaseportrait command (or other software) to generate the phase portrait and, on the same plot, the lineal element field, using barbed arrows to show the flow direction. You will need to make decisions, with some experimentation, as to the $t$-interval, the step size, the initial points, and so on, so as to obtain good results.

10. Reduce the equation $x'' + x^2 = 0$ to a system of equations by setting $x' = y$. Find the equation of the trajectories and carefully sketch seven or eight of them, so as to show clearly the key features of the phase portrait. Pay special attention to the one through the origin, and give its equation.

11. (Volterra problem) Consider the Volterra problem (24), with $\alpha = \beta = \gamma = \delta = 1$. Determine any fixed points. Use phaseportrait (or other software) to obtain the lineal element field, with barbed arrows, over the region $0 < x < 4$ and $0 < y < 4$, say. (Of course, $x$ and $y$ need to be positive because they are populations.) On that plot, sketch a number of representative trajectories.
You should find a circulatory motion about the point (1,1). Can you tell, from the lineal element field, whether the trajectories circulate in closed orbits or whether they spiral in (or away from) that point?

...

$$\frac{dx}{dt} = f(x, y, t); \quad x(t_0) = x_0, \qquad \frac{dy}{dt} = g(x, y, t); \quad y(t_0) = y_0$$

... and that solution is unique.

More general and more powerful theorems could be given, but this one will suffice for our purposes in this chapter. For such theorems and proofs we refer you to texts on differential equations such as G. Birkhoff and G.-C. Rota, Ordinary Differential Equations, 2nd ed. (New York: John Wiley, 1969).

For instance, consider the soft-spring oscillator equation $x'' + x - x^3 = 0$ or, equivalently, the system

$$x' = y, \tag{2a}$$
$$y' = -x + x^3 \tag{2b}$$

that we studied in Section 7.2. In this case $f(x, y, t) = y$, $g(x, y, t) = -x + x^3$, $f_x = 0$, $f_y = 1$, $g_x = -1 + 3x^2$, $g_y = 0$ are continuous for all values of $x$, $y$, and $t$, so Theorem 7.3.1 assures us that no matter what initial condition is chosen there is a unique solution through it. The extent of the $t$ interval over which that solution exists is not predicted by the theorem, which is a “local” theorem like Theorem 2.4.1. But it is understood that that interval is not merely the point $t_0$ itself, for how could $dx/dt$ and $dy/dt$ make sense if $x(t)$ and $y(t)$ were defined only at a single point? Linear differential equations are simpler, and for them we have “global” theorems such as Theorem 3.9.1.

If $f$ and $g$ satisfy the conditions of Theorem 7.3.1 at $(x_0, y_0, t_0)$ in $x, y, t$ space, then there does exist a solution curve, or trajectory, through that point, and there is only one such trajectory. Geometrically, it follows that trajectories in $x, y, t$ space cannot touch or cross each other at a point of existence and uniqueness. However, what about the possibility of crossings of trajectories in the $x, y$ phase plane?
Be careful, because whereas the theorem precludes crossings in three-dimensional $x, y, t$ space, the phase plane shows only the projection of the three-dimensional trajectories onto the two-dimensional $x, y$ plane. For instance, choose any point $P_0$ on a closed orbit inside the “football” in Fig. 8 of Section 7.2. As the representative point $P$ goes round and round on that orbit it passes through $P_0$ an infinite number of times, yet that situation does not violate the theorem because if that trajectory is viewed in three-dimensional $x, y, t$ space, we see that it is actually helical, and there are no self-crossings.

The only points of serious concern in Fig. 8 are $(1, 0)$ and $(-1, 0)$. But here too there is no violation of the theorem because there is only the unique trajectory $x(t) = 1$ and $y(t) = 0$ through any initial point $(1, 0, t_0)$, namely, a straight-line trajectory which is perpendicular to the $x, y$ plane and which extends from $-\infty$ to $+\infty$ in the $t$ direction. The trajectories $DE$ and $IE$, in $x, y, t$ space, approach that line asymptotically as $t \to +\infty$, and the trajectories $FE$ and $HE$ approach it asymptotically (both in $x, y, t$ space and in the $x, y$ phase plane) as $t \to -\infty$, but they never reach it. Similarly for $(-1, 0, t_0)$.

Recall, from Section 7.2, that we proceeded to divide (2b) by (2a), obtaining

$$\frac{dy}{dx} = \frac{-x + x^3}{y} \tag{3}$$

... now formalize. ...

$$\frac{dy}{dx} = \frac{g(x, y)}{f(x, y)},$$

... we remain there for all $t$ because $x' = y' = 0$ there. ... be isolated. For example, if $f(x, y) = x$ and $g(x, y) =$ ... at $(1, 0)$, write ...

Close to $(1, 0)$ we neglect the higher-order terms and consider the linearized version

$$x' = y, \tag{6a}$$
$$y' = 2(x - 1)$$ (6b)

or, moving our coordinate system to $(1, 0)$ for convenience by setting $X = x - 1$ and $Y = y$,

$$X' = Y, \tag{7a}$$
$$Y' = 2X. \tag{7b}$$

Dividing these gives $dY/dX = 2X/Y$, with the solution $Y^2 = 2X^2 + C$; $C = 0$ gives $Y = \pm\sqrt{2}\,X$, and for $C \neq 0$ we have $Y = \pm\sqrt{2X^2 + C}$.

Figure 1. The flow near the singular point (1,0).

These curves are shown in Fig. 1.
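The coefficient 2 in (6b) and (7b) can be checked by writing out the expansion of the right-hand side of (2b) about $x = 1$. The short calculation below is a sketch of that step, using $X = x - 1$ as in (7):

$$-x + x^3 = -(1 + X) + (1 + X)^3 = -(1 + X) + \left(1 + 3X + 3X^2 + X^3\right) = 2X + 3X^2 + X^3 .$$

Dropping the higher-order terms $3X^2 + X^3$, which are small compared with $2X$ when $|X|$ is small, leaves $Y' = 2X$, that is, $y' = 2(x - 1)$. The right-hand side of (2a), namely $y$, is already linear and needs no expansion.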
The pairs crossing the $X$ axis correspond to increasingly negative values of $C$, and those crossing the $Y$ axis correspond to increasingly positive values of $C$.

Similarly, to study the singular point $(0, 0)$ observe that the right-hand sides of (2a) and (2b) are already Taylor series expansions about $x = 0$ and $y = 0$. Thus, keeping terms only up to first order gives the linearized version

$$x' = y, \tag{8a}$$
$$y' = -x, \tag{8b}$$

with $dy/dx = -x/y$ giving the family of circles $y^2 + x^2 = C$ and hence the trajectories shown in Fig. 2.

Figure 2. The flow near the singular point (0,0).

Treating the singular point $(-1, 0)$ in the same manner, we find the same behavior there as at $(1, 0)$. If we show the three results together, in Fig. 3, it is striking how those three localized phenomena appear to set up the global flow in the whole plane. That is, if we fill in what's missing, by hand, we obtain, at least in a qualitative or topological sense, the same picture as in Fig. 8 of Section 7.2.*

Figure 3. Global flow determined, qualitatively, by the singular points.

*It may appear inconsistent that the trajectories near the origin in Fig. 3 look elliptical, whereas they are circles in Fig. 2. That distortion, from circles to ellipses, is merely the result of stretching the $x$ axis relative to the $y$ axis for display purposes.

From this example we can see some things and raise some questions. We see that by virtue of our Taylor expansions of $f(x, y)$ and $g(x, y)$ about the singular point $(x_s, y_s)$ and their linearization, we are always going to end up with linearized equations of the form

$$X' = aX + bY, \tag{9a}$$
$$Y' = cX + dY \tag{9b}$$

to study, where $X = x - x_s$ and $Y = y - y_s$ are Cartesian coordinate axes located at the singular point. Thus, we might as well study the general system (9) once and for all. Evidently, for different combinations of $a, b, c, d$ there can be different types of singular points, for from Fig. 3 it seems clear that the ones at $(1, 0)$ and $(-1, 0)$ are different from the one at $(0, 0)$. How many different types are there? What are they?

7.3.3. The elementary singularities and their stability. We wish to solve (9), examine the results, and classify them into types. If we equate the right-hand sides of (9) to zero, we have the unique solution $X = Y = 0$ only if the determinant $ad - bc$ is nonzero. If that determinant vanishes, then the solution $X = Y = 0$ is nonunique, and there is either an entire line of solutions through the origin or the entire plane of solutions. For instance, if $a = b = 0$ and $c$ and $d$ are not both zero, then every point on the line $cX + dY = 0$ is a singular point of (9), and if $a = b = c = d = 0$, then every point in the plane is a singular point of (9). Wishing to study the generic case, where the origin is an isolated singular point, we will require of $a, b, c, d$ that $ad - bc \neq 0$.

If we solve (9), for instance by elimination, we find that the solution is of the form

$$X(t) = C_1 e^{\lambda_1 t} + C_2 e^{\lambda_2 t}, \qquad Y(t) = C_3 e^{\lambda_1 t} + C_4 e^{\lambda_2 t}, \tag{10a,b}$$

where $C_1, C_2, C_3, C_4$ are not independent, and where $\lambda_1, \lambda_2$ are the roots

$$\lambda = \frac{a + d \pm \sqrt{(a - d)^2 + 4bc}}{2} \tag{11}$$

of the characteristic equation

$$\lambda^2 - (a + d)\lambda + (ad - bc) = 0. \tag{12}$$

Since $ad - bc \neq 0$, zero is not among the roots. There are exactly four possibilities:

(1) purely imaginary roots (CENTER),
(2) complex conjugate roots (FOCUS),
(3) real roots of the same sign (NODE),
(4) real roots of opposite sign (SADDLE).

These cases lead to four different types of singularity: center, focus, node, and saddle, as we note within parentheses, and we will discuss these in turn. In doing so, it is important to examine each in terms of its stability, which concept we define before continuing.

A singular point $S = (x_s, y_s)$ of the autonomous system (4) is said to be stable if motions (i.e., trajectories) that start out sufficiently close to $S$ remain close to $S$. To make that intuitively stated definition mathematically precise, let $d(P_1, P_2)$ denote the distance* between any two points $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$. Further, we continue to let $P(t) = (x(t), y(t))$ denote the representative point in the phase plane corresponding to (4). Then, a singular point $S$ is stable if, given any $\epsilon > 0$ (i.e., as small as we wish), there is a $\delta > 0$ such that $d(P(t), S) < \epsilon$ for all $t > 0$ if $d(P(0), S) < \delta$. (See Fig. 4a.) If $S$ is not stable, then it is unstable.

Figure 4. Stability and asymptotic stability.

*In the Euclidean sense, the distance $d(P_1, P_2)$ is defined as $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$, but one can define distance in other ways. Here, we understand it in the Euclidean sense.

Further, we say that $S$ is not only stable but asymptotically stable if motions that start out sufficiently close to $S$ not only stay close to $S$ but actually approach $S$ as $t \to \infty$. That is, if there is a $\delta > 0$ such that $d(P(t), S) \to 0$ as $t \to \infty$ whenever $d(P(0), S) < \delta$, then $S$ is asymptotically stable. (See Fig. 4b.)

Now let us return to the four cases listed. The most insightful way to study these cases is by seeking solutions in exponential form and dealing with the “eigenvalue problem” that results. However, the eigenvalue problem is not discussed until Chapter 11, so in the present section we rely on an approach that should suffice but which is in some ways less satisfactory. In Section 11.5 we return to this problem and deal with it as an eigenvalue problem. If you are already sufficiently familiar with linear algebra, we suggest that you study that section immediately following this one.

It is convenient to use the physical example of the mechanical oscillator, with the governing equation $mx'' + px' + kx = 0$, or

$$x' = y, \tag{13a}$$
$$y' = -\frac{k}{m}x - \frac{p}{m}y, \tag{13b}$$

as a unifying example because by suitable choice of $m, p, k$ we can obtain each of the four cases, and because this application has already been discussed in Section 3.5. [Here we use $p$ instead of $c$ for the damping coefficient to avoid confusion with the $c$ in (9b).]

Purely imaginary roots. (CENTER) Let $p = 0$ so there is no damping.
Then $a = 0$, $b = 1$, $c = -k/m$, $d = 0$; (11) gives the purely imaginary roots $\lambda = \pm i\sqrt{k/m}$, and $dy/dx = -(k/m)x/y$ gives the family of ellipses

$$\frac{1}{2}my^2 + \frac{1}{2}kx^2 = C \tag{14}$$

sketched in Fig. 5a. The singular point at $(0,0)$ is called a center because it is surrounded by closed orbits corresponding to periodic motions. For instance, with $\lambda_1 = +i\omega$ (where $\omega = \sqrt{k/m}$ is the natural frequency) and $\lambda_2 = -i\omega$, (10a) gives $x(t) = C_1 \exp(i\omega t) + C_2 \exp(-i\omega t)$ or, equivalently,

$$x(t) = A\sin(\omega t + \phi). \tag{15}$$

(Here, $X = x$ and $Y = y$ because the singular point is at $x = y = 0$.)

In Fig. 5a the principal axes of the elliptical orbits coincide with the $x, y$ coordinate axes. More generally, they need not. For instance, for the system

[Equations (16a) and (16b): not recoverable from the source.]

(11) again gives purely imaginary roots, $\lambda = \pm i/\sqrt{3}$, so the solutions are harmonic oscillations with frequency $1/\sqrt{3}$, but the principal axes of the elliptical orbits are at an angle of $\sin^{-1}(1/3) = 19.47^\circ$ with respect to the $x, y$ axes as shown in Fig. 5b (see Exercise 5). [The system (16) is, of course, not a special case of (13); it is a separate example.]

We see that a center is stable but not asymptotically stable.

Complex conjugate roots. (FOCUS) This time let $p$ be positive in (13), but small enough so that $p < \sqrt{4km}$. According to the terminology introduced in Section 3.8, we say that the damping is subcritical because $p_{cr} = \sqrt{4km}$. Then $a = 0$, $b = 1$, $c = -k/m$, $d = -p/m$; (11) gives the complex conjugate roots

$$\lambda = -\frac{p}{2m} \pm i\sqrt{\frac{k}{m} - \left(\frac{p}{2m}\right)^2},$$

and (10a) gives the solution

$$x(t) = e^{-pt/2m}\left[A\cos\left(\sqrt{\frac{k}{m} - \left(\frac{p}{2m}\right)^2}\; t\right) + B\sin\left(\sqrt{\frac{k}{m} - \left(\frac{p}{2m}\right)^2}\; t\right)\right] = Ce^{-pt/2m}\sin\left(\sqrt{\frac{k}{m} - \left(\frac{p}{2m}\right)^2}\; t + \phi\right), \tag{17}$$

where $C$ and $\phi$ are arbitrary constants. As we discussed in Section 3.8, this solution differs from the undamped version (15) in two ways. First, the frequency of the sinusoid is diminished from the natural frequency $\omega = \sqrt{k/m}$ to $\sqrt{k/m - (p/2m)^2}$. Second, the $\exp(-pt/2m)$ factor modulates the amplitude, reducing it to zero as $t \to \infty$.
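The roots quoted above for the undamped and subcritically damped oscillator can be checked numerically from (11). The following is a minimal sketch, not part of the text's development; the helper name `oscillator_roots` and the sample values $m = 1$, $k = 4$, $p = 1$ are assumptions chosen only for illustration.

```python
import cmath
import math

def oscillator_roots(m, p, k):
    """Roots of (12) for system (13): a = 0, b = 1, c = -k/m, d = -p/m."""
    a, b, c, d = 0.0, 1.0, -k / m, -p / m
    root = cmath.sqrt((a - d) ** 2 + 4 * b * c)   # square root appearing in (11)
    return ((a + d) + root) / 2, ((a + d) - root) / 2

m, k = 1.0, 4.0                      # illustrative values, not from the text
p_cr = math.sqrt(4 * k * m)          # critical damping

lam1, lam2 = oscillator_roots(m, 0.0, k)   # p = 0: purely imaginary pair
lam3, lam4 = oscillator_roots(m, 1.0, k)   # 0 < p < p_cr: complex conjugate pair

# Frequency of the damped sinusoid in (17) for p = 1
damped_freq = math.sqrt(k / m - (1.0 / (2 * m)) ** 2)

print(p_cr)
print(lam1, lam2)
print(lam3, lam4, damped_freq)
```

With these values the undamped roots are $\pm 2i$ (so $\omega = 2$), and for $p = 1$ the real part of each root is $-p/2m = -0.5$, while the imaginary part agrees with the reduced frequency $\sqrt{k/m - (p/2m)^2}$.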
In terms of the phase portrait, one obtains a family of spirals such as the one shown in Fig. 6a. If we imagine the representative point $P$ moving along that curve, we see that the projection onto the $x$ axis is indeed a damped oscillation. We call the singularity at the origin a focus because trajectories "focus" to the origin as $t \to \infty$; the term spiral is also used.

In Fig. 6a the principal axes of the orbits, which would be elliptical if not for the damping, coincide with the $x, y$ coordinate axes. More generally, they need not. For instance, for the system

[Equations (18a) and (18b): not recoverable from the source.]

we obtain similar results but with the principal axes rotated clockwise by an angle of $\sin^{-1}(1/\sqrt{5}) = 26.57^\circ$ as shown in Fig. 6b.

Figure 6. A stable focus at (0,0).

In each case (Fig. 6a and 6b) we see that the focus is stable and, indeed, asymptotically stable as well. However, one can have foci that wind outward instead of inward, and these will be unstable. For instance, if we return to the solution (17) of the damped oscillator system (13), but this time imagine $p$ to be negative (without concerning ourselves with how that might be arranged physically), and smaller in magnitude than $\sqrt{4km}$ as before, then in place of the clockwise inward flow shown in Fig. 6a, we obtain the counterclockwise outward flow shown in Fig. 7, and we classify the singularity at the origin as an unstable focus. Note that a stable singular point can, additionally, be asymptotically stable or not, but an unstable singular point is simply unstable.

Figure 7. An unstable focus at (0,0).

Real roots of the same sign. (NODE) We've seen that without damping the mechanical oscillator (13) gives pure oscillations, elliptical orbits, and a center. With light damping (i.e., $0 < p < p_{cr}$) it gives damped oscillations and a stable focus. If we now increase $p$ so as to exceed $p_{cr}$, then the oscillations disappear altogether, and we have the solution form

$$x(t) = C_1 e^{\lambda_1 t} + C_2 e^{\lambda_2 t} \tag{19a}$$

with

$$\lambda_1 = -\frac{p}{2m} + \sqrt{\left(\frac{p}{2m}\right)^2 - \frac{k}{m}}, \qquad \lambda_2 = -\frac{p}{2m} - \sqrt{\left(\frac{p}{2m}\right)^2 - \frac{k}{m}}.$$

Because of the way we have numbered the $\lambda$'s, in this application we have $\lambda_2 < \lambda_1 < 0$. Since $x' = y$ we have, from (19a),

$$y(t) = \lambda_1 C_1 e^{\lambda_1 t} + \lambda_2 C_2 e^{\lambda_2 t}. \tag{19b}$$

We can see from (19) that

$$x(t) \sim C_1 e^{\lambda_1 t}, \qquad y(t) \sim \lambda_1 C_1 e^{\lambda_1 t} \tag{20}$$

and

$$y \sim \lambda_1 x \tag{21}$$

as $t \to \infty$, provided that the initial conditions do not give $C_1 = 0$. If they do give $C_1 = 0$, then

$$x(t) = C_2 e^{\lambda_2 t}, \qquad y(t) = \lambda_2 C_2 e^{\lambda_2 t} \tag{22}$$

and

$$y = \lambda_2 x \tag{23}$$

as $t \to \infty$. The resulting phase portrait is shown in Fig. 8a. We call the singularity at $(0,0)$ a node; more specifically, an improper node ("improper" is explained below). In accord with the preceding discussion, observe from Fig. 8a that in the exceptional case in which the initial point lies on the line of slope $\lambda_2$ the representative point approaches the origin along that line. In all other cases, the approach is asymptotic to the line of slope $\lambda_1$. If we let $p$ tend to $p_{cr}$, then the two lines coalesce and we obtain the portrait shown in Fig. 8b, which is, likewise, an improper node (Exercise 6).

Figure 8. A stable improper node at (0,0): distinct roots and repeated roots, respectively.

If $p$ is negative and greater in magnitude than $p_{cr}$, then we see from the expressions given above for the $\lambda$'s that $\lambda_1 > \lambda_2 > 0$, and the phase portrait is as shown in Fig. 9. The nodes shown in Fig. 8a,b are both stable (indeed, asymptotically stable), and the one shown in Fig. 9 (which is analogous to the one in Fig. 8a) is unstable.

Figure 9. An unstable improper node at (0,0): distinct roots.

Our mechanical oscillator example is but one example leading to a node. We could make up additional ones by writing down equations $x' = ax + by$ and $y' = cx + dy$ if we choose the coefficients $a, b, c, d$ so that the two $\lambda$'s are of the same sign, but the results will be one of the types shown in Fig. 8 and 9. There is, however, another type of node, which we illustrate by the problem

$$x' = ax, \tag{24a}$$
$$y' = ay. \tag{24b}$$

That is, $b = c = 0$ and $a = d$. Then $\lambda_1 = \lambda_2 = a$, and we have the solution

$$x(t) = Ae^{at}, \tag{25a}$$
$$y(t) = Be^{at}, \tag{25b}$$

where $A, B$ are arbitrary. In this case $y/x$ is not only asymptotic to a constant as $t \to \infty$, it is equal to a constant for all $t$. Thus, the phase portrait is as shown in Fig. 10a if $a < 0$, and as in Fig. 10b if $a > 0$. The former is an asymptotically stable node, and the latter is an unstable node. But this time we call them proper nodes (or stars) because every trajectory is a straight line through the singular point $(0,0)$, not just one or two of them.

Figure 10. Stable and unstable proper nodes at (0,0).

Real roots of opposite sign. (SADDLE) Consider, once again, the undamped mass/spring system governed by the equation $mx'' + kx = 0$, or

$$x' = y, \tag{26a}$$
$$y' = -\frac{k}{m}x. \tag{26b}$$

This time, imagine $k$ to be negative (without concern about how that case could occur physically) and set $-k/m \equiv h^2$. Then (26) gives $dy/dx = h^2 x/y$, so $y\,dy = h^2 x\,dx$ or

$$y^2 = h^2 x^2 + C, \tag{27}$$

which trajectories are shown in Fig. 11 for various values of $C$. In particular, $C = 0$ gives the two straight lines through the singular point, namely, $y = \pm hx$, with the flow approaching the origin on one ($y = -hx$) and moving away from it on the other ($y = +hx$). Such a singular point is called a saddle and is always unstable. The two straight-line trajectories through the saddle, along which the flow is attracted and repelled, are called the stable and unstable manifolds, respectively.

Of course, (26) is not the only example of a linear system $x' = ax + by$ and $y' = cx + dy$ with a saddle. Any such system with real roots of opposite sign will have such a singularity. For instance,

$$x' = x + 2y, \tag{28a}$$
$$y' = 8x - 5y \tag{28b}$$

has the roots $\lambda = 3$ and $\lambda = -7$. Thus, it has a saddle, and we know that two straight-line solutions can be found through the origin. To find them, try $y = \kappa x$.
Putting $y = \kappa x$ into (28) gives $x' = (1 + 2\kappa)x$ and $x' = (8 - 5\kappa)x/\kappa$, so it follows from these that we need $1 + 2\kappa = (8 - 5\kappa)/\kappa$, which equation gives the slopes $\kappa = 1$ and $\kappa = -4$. (If we would obtain $\kappa = \infty$, we would understand that, from $y = \kappa x$, to correspond to the $y$ axis.) With $\kappa = 1$ the equation $x' = (1 + 2\kappa)x = 3x$ gives $x(t)$ proportional to $\exp(3t)$ [likewise for $y(t)$ because $y = \kappa x$], and with $\kappa = -4$ it gives $x(t)$ proportional to $\exp(-7t)$. Thus, the line trajectory $y = x$ is the unstable manifold (since $x$ and $y$ grow exponentially on it), and the line trajectory $y = -4x$ is the stable manifold (since $x$ and $y$ die out exponentially on it). The same procedure, which we have just outlined and which should be clearly understood, can be used for a node as well, to find any straight-line trajectories through the node.

7.3.4. Nonelementary singularities. In this final subsection we turn from the elementary singularities to nonelementary ones, with two purposes in mind. First, one doesn't completely understand elementary singularities until one distinguishes elementary singularities from nonelementary ones and, second, nonelementary singularities do arise in applications.

Recall that

$$X' = aX + bY, \qquad Y' = cX + dY \tag{29a,b}$$

has an elementary singularity at $(0,0)$ if $ad - bc \neq 0$. Consider two examples.

EXAMPLE 1. The system

$$x' = y, \qquad y' = y \tag{30a,b}$$

has the phase trajectories $y = x + C$ and the phase portrait shown in Fig. 12. Since $ad - bc = (0)(1) - (1)(0) = 0$, the singularity of (30) at $(0,0)$ is nonelementary. It is nonisolated and, in fact, $y = 0$ is an entire line of singular points.

Figure 12. Nonelementary singularity of (30) at (0,0).

EXAMPLE 2. Consider the singularity of the system

$$x' = y, \qquad y' = 1 - \cos x \tag{31a,b}$$

at $(0,0)$. Expanding the right side of (31b) gives

$$1 - \cos x = \frac{x^2}{2} - \frac{x^4}{24} + \cdots$$

so the linearized version of (31) is $x' = 0x + 1y$ and $y' = 0x + 0y$.
Thus, $ad - bc = (0)(0) - (1)(0) = 0$ again and the singularity of (31) at the origin is nonelementary. The difficulty this time is not that the singular point is not isolated; it is. The problem is that it is of higher order, for when we linearize the expansion of $1 - \cos x$ we simply have $0x + 0y$. In not retaining at least the first nonvanishing term (namely, $x^2/2$) we have "thrown out the baby with the bathwater." To capture the local behavior of (31) near the origin, we need to retain that leading term and consider the system

$$x' = y, \qquad y' = \frac{1}{2}x^2. \tag{32a,b}$$

Dividing (32b) by (32a) and integrating gives $y^2 = \dfrac{x^3}{3} + C$, several of which trajectories are shown in Fig. 13. We see that the singularity is, indeed, isolated, but that the phase portrait is not of one of the elementary types.

Closure. In this section we establish a foundation for our use of the phase plane in studying nonlinear systems. We begin with the issue of existence and uniqueness of solutions, first in $x, y, t$ space (Theorem 7.3.1), and then in the $x, y$ phase plane. The latter leads us to introduce the concept of a singular point in the phase plane as a point at which both $x' = P(x,y) = 0$ and $y' = Q(x,y) = 0$. To study a singular point $S = (x_s, y_s)$ one focuses on the immediate neighborhood of $S$, in which neighborhood we work with the locally linearized equations $X' = aX + bY$ and $Y' = cX + dY$, where $X = x - x_s$ and $Y = y - y_s$, so $X, Y$ is a Cartesian system with its origin at $S$. Studying that linearized system, we categorize the possible "flows" into four qualitatively distinct types (the center, focus, node, and saddle) and illustrate each through the mass/spring system $mx'' + px' + kx = 0$, with suitable choices of $m, p, k$, and other examples as well. These are the so-called elementary singularities that result when $ad - bc \neq 0$. In the next section we apply these results to several nonlinear systems, where we will see the role of such singular points in establishing the key features of the overall flow in the phase plane.
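The four-way classification summarized in the Closure can be expressed as a short routine that computes the discriminant and roots of the characteristic equation (12) and reports the type. The following is a minimal sketch under stated assumptions: the function name `classify`, the tolerance `tol`, and the coarse labels (proper and improper nodes are not distinguished) are illustrative choices, not part of the text.

```python
import cmath

def classify(a, b, c, d, tol=1e-12):
    """Classify the origin of X' = aX + bY, Y' = cX + dY from the roots of (12)."""
    if abs(a * d - b * c) <= tol:
        return "nonelementary"            # ad - bc = 0: not an isolated elementary point
    disc = (a - d) ** 2 + 4 * b * c       # quantity under the square root in (11)
    root = cmath.sqrt(disc)
    lam1 = ((a + d) + root) / 2
    lam2 = ((a + d) - root) / 2
    if disc < -tol:                       # complex roots
        if abs(a + d) <= tol:
            return "center"               # purely imaginary roots
        return "stable focus" if a + d < 0 else "unstable focus"
    if lam1.real * lam2.real < 0:         # real roots of opposite sign
        return "saddle"
    return "stable node" if lam1.real < 0 else "unstable node"

# Mechanical oscillator (13) with m = 1, k = 4 (illustrative values)
print(classify(0, 1, -4, 0))    # p = 0
print(classify(0, 1, -4, -1))   # p = 1 < p_cr = 4
print(classify(0, 1, -4, -5))   # p = 5 > p_cr
# System (28) and Example 1 of Section 7.3.4
print(classify(1, 2, 8, -5))
print(classify(0, 1, 0, 1))
```

For these inputs the routine reports a center, a stable focus, a stable node, a saddle, and a nonelementary singularity, respectively, in agreement with the cases worked in this section.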
EXERCISES 7.3

[Exercises 1 through 3 are only partly legible in the source. Surviving fragments include the instruction "... and find $C_3$ and $C_4$ in terms of $C_1$ and $C_2$" and a list of systems (a) through (h), of which (h) reads $x' = \cos(x - y)$, $y' = xy - 1$; the remaining entries are not recoverable.]

4. Suppose that we reverse the $\epsilon$'s and $\delta$'s in our definition of stability, so that the definition becomes: A singular point $S$ is stable if, given any $\epsilon > 0$ (i.e., as small as we wish), there is a $\delta > 0$ such that $d(P(t), S) < \delta$ for all $t > 0$ if $d(P(0), S) < \epsilon$. Would that definition work? That is, would it satisfy the idea of motions that start out sufficiently close to $S$ remaining close to $S$? Explain.

5. In this exercise we wish to elaborate on our claim below (16) that the principal axes of the elliptical orbits are at an angle of $19.47^\circ$ with respect to the $x, y$ axes as shown in Fig. 5b.
(a) If $x, y$ and $\bar{x}, \bar{y}$ coordinate systems are at an angle $\alpha$, as shown here, show that the $x, y$ and $\bar{x}, \bar{y}$ coordinates of any given point are related according to

$$x = \bar{x}\cos\alpha - \bar{y}\sin\alpha, \tag{5.1a}$$
$$y = \bar{x}\sin\alpha + \bar{y}\cos\alpha. \tag{5.1b}$$

(b) Putting (5.1) into (16), insist upon the result being of the form

$$\bar{x}' = \beta\bar{y}, \qquad \bar{y}' = -\gamma\bar{x} \tag{5.2}$$

for some constants $\beta$ and $\gamma$ so as to yield elliptical orbits with $\bar{x}, \bar{y}$ as principal axes, and show that you obtain $\alpha = 19.47^\circ$. If you obtain another $\alpha$ as well, explain its significance.

6. We claimed that if $p = p_{cr}$, then the two straight-line trajectories in Fig. 8a coalesce, as shown in Fig. 8b. Here, if $\lambda_1 = \lambda_2 \equiv \lambda$, then the general solution of (13) is $x(t) = (C_1 + C_2 t)e^{\lambda t}$, $y(t) = x'(t) =$ etc.

7. What does Fig. 8a look like in the limit as $p \to \infty$? Sketch it.

8. Given the presence of the saddles (i.e., saddle-type singularities) at $(-1,0)$ and $(1,0)$ and the center at $(0,0)$, can you come up with any global flow patterns that are qualitatively different from the one sketched in Fig. 3? Explain. (Assume that these three are the only singularities.)

9. (Saddles and nodes) Classify the singularity at the origin, find the equations of any straight-line trajectories through the origin, and sketch the phase portrait, including flow direction arrows.
[Systems (a) through (j): not recoverable from the source.]

10. Prove that a linear system $x' = ax + by$, $y' = cx + dy$ can have one, two, or an infinite number of straight-line trajectories through the origin, but never a finite number greater than two.

11. Classify the singularity at the origin as a center, focus, node, or saddle. If it is a focus, node, or saddle, then classify it, further, as stable or unstable.
[Systems (a) through (h): not recoverable from the source.]

12. (a)-(h) Use computer software to obtain the phase portrait for the corresponding system in Exercise 11. Be sure to include any key trajectories, namely, any straight-line trajectories through the origin. From the phase portrait, classify the singularity as a center, focus, node, or saddle, state whether it is stable, asymptotically stable, or unstable, and use arrows to show the flow direction.

7.4 Applications

[The opening of this section is only partly legible in the source; the surviving fragments are reproduced below, with gaps marked by [...].]

[...] interesting applications: [...] system

$$x' = P(x,y), \qquad y' = Q(x,y). \tag{1}$$

[...] linearizing them about $S$. [...] about any point $(a,b)$, is

$$f(x,y) = f(a,b) + \frac{1}{1!}\left[f_x(a,b)(x - a) + f_y(a,b)(y - b)\right] + \frac{1}{2!}\left[f_{xx}(a,b)(x - a)^2 + 2f_{xy}(a,b)(x - a)(y - b) + f_{yy}(a,b)(y - b)^2\right] + \text{terms of third order and higher},$$

[...] of a single variable that [...] $y = b$: [...] (2), [...] first-order terms, we have

$$f(x,y) \approx f(a,b) + f_x(a,b)(x - a) + f_y(a,b)(y - b). \tag{5}$$

That step is called linearization because the result is linear in $x$ and $y$.
Geometrically, (5) amounts to approximating the $f$ surface, plotted above the $x, y$ plane, by its tangent plane at $(a, b)$, just as $f(x) \approx f(a) + f'(a)(x - a)$ amounts to approximating the graph of $f$ versus $x$ by its tangent line at $x = a$ in the case of a function of a single variable.

We now select a singular point of interest, $(x_s, y_s)$, and linearize. Using (5) to do that, we obtain

$$P(x,y) \approx P(x_s, y_s) + P_x(x_s, y_s)(x - x_s) + P_y(x_s, y_s)(y - y_s), \tag{6a}$$
$$Q(x,y) \approx Q(x_s, y_s) + Q_x(x_s, y_s)(x - x_s) + Q_y(x_s, y_s)(y - y_s). \tag{6b}$$

But $P(x_s, y_s)$ and $Q(x_s, y_s)$ are zero because $(x_s, y_s)$ is a singular point, so we have the approximate (linearized) equations

$$x' = P_x(x_s, y_s)(x - x_s) + P_y(x_s, y_s)(y - y_s), \qquad y' = Q_x(x_s, y_s)(x - x_s) + Q_y(x_s, y_s)(y - y_s). \tag{7}$$

Finally, it is convenient, though not essential, to move the origin to $S$ by letting $X = x - x_s$ and $Y = y - y_s$, and calling $P_x(x_s, y_s) \equiv a$, $P_y(x_s, y_s) \equiv b$, $Q_x(x_s, y_s) \equiv c$, and $Q_y(x_s, y_s) \equiv d$, for brevity, in which case (7) becomes

$$X' = aX + bY, \tag{8a}$$
$$Y' = cX + dY, \tag{8b}$$

which system is studied in Section 7.3. There, we classify the singularity at $X = Y = 0$ as a center, focus, node, or saddle, depending upon the roots of the characteristic equation

$$\lambda^2 - (a + d)\lambda + (ad - bc) = 0, \tag{9}$$

namely,

$$\lambda = \frac{(a + d) \pm \sqrt{(a - d)^2 + 4bc}}{2}. \tag{10}$$

From (10) we can find the $\lambda$ roots and determine whether the singular point is a center, focus, node, or saddle.

Understand that we are trying to ascertain the local behavior of the flow corresponding to the original nonlinear system (1) near the singular point $S$ by studying the simpler linearized version of (1) at $S$, namely, (8). That program begs this question: Is the nature of the singularity of the nonlinear system (1), at $S$, truly captured by its linearized version (8)?

To answer that question it is helpful to present the singularity classification, developed in Section 7.3, in a graphical form, as we have in Fig. 1. It is more convenient to deal with the two quantities $p \equiv a + d$ and $q \equiv ad - bc$ (which are the axes in Fig. 1) than with the four quantities $a, b, c, d$, since $a + d$ and $ad - bc$ determine the roots of (10) and hence the singularity type. In terms of $p$ and $q$, (10) simplifies to $\lambda = \left(p \pm \sqrt{p^2 - 4q}\right)/2$.

Figure 1. [The $p, q$ plane, with the parabola $p^2 = 4q$. Labels: saddles; stable and unstable improper nodes; stable and unstable foci; stable and unstable proper nodes; centers.]

In the figure there are five regions separated by the $p$ axis, the parabola $p^2 = 4q$, and the positive $q$ axis. [Of these boundaries, the $p$ axis can be discounted since the case $q = 0$ is ruled out of consideration in Section 7.3 because then there is a line of singular points through the origin rather than the origin being an isolated singular point. That is, our $(p,q)$ point will not fall on the boundary between saddles and improper nodes, namely, the $p$ axis.]

The Hartman-Grobman theorem tells us that if $a, b, c, d$ are such that the point $(p,q)$ is within one of those regions, then the singularity types of the nonlinear system and its linearized version are identical. For instance, if the linearized system has a saddle, then so does the nonlinear system. Essentially, we can think of retaining the higher-order terms (which we drop in linearizing the nonlinear differential equations) as equivalent to infinitesimally perturbing the values of the coefficients $a, b, c, d$ in the linearized equations and hence the values of $p$ and $q$. If the point $(p,q)$ is within one of the five regions, then the perturbed point will be within the same region, so the singularity type will be the same for the nonlinear system as for the linearized one. However, we can imagine that for a borderline case, where $(p,q)$ is on the parabola $p^2 = 4q$ or on the positive $q$ axis, such a perturbation can push the point into one of the neighboring regions, thus changing the type. In fact, that is the way it turns out.
For instance, if $(p,q)$ is on the positive $q$ axis (as occurs in the example to follow), then the nonlinear system could have an unstable focus or a center or a stable focus.

EXAMPLE 1. Oscillator with Cubic Damping. The equation $x'' + \epsilon x'^3 + x = 0$ models a harmonic oscillator with cubic damping, that is, with a damping term proportional to the velocity cubed. The equivalent system

$$x' = y, \qquad y' = -x - \epsilon y^3 \tag{11}$$

has one singular point, at $x = y = 0$. The linearized version is

$$X' = Y = 0X + 1Y, \qquad Y' = -X = -1X + 0Y \tag{12}$$

so $a = d = 0$, $b = 1$, and $c = -1$; hence $p = 0$ and $q = 1$. Thus, $(p,q) = (0,1)$, so (from Fig. 1) the linearized system (12) has a center (no surprise, since the solutions of the linearized equation $x'' + x = 0$ are simple harmonic motions). However, it turns out (Exercise 1) that the nonlinear system (11) has a stable focus.

To summarize, in general the linearized system faithfully captures the singularity type of the original nonlinear system. For the borderline cases, where $(p,q)$ is on the $p^2 = 4q$ parabola or on the positive $q$ axis, however, we have these possibilities:

LINEARIZED                  NONLINEAR
stable proper node     =>   stable focus, or stable proper node, or stable improper node
center                 =>   unstable focus, or center, or stable focus
unstable proper node   =>   unstable improper node, or unstable proper node, or unstable focus

7.4.2. Applications. Consider some physical applications.

EXAMPLE 2. Pendulum. Recall from an introductory physics course that for a rigid body undergoing pure rotation about a pivot axis the inertia about the pivot $O$ times the angular acceleration is equal to the applied torque. For the pendulum shown in Fig. 2, the inertia about $O$ is $ml^2$, the angle from the vertical is $x(t)$, the angular acceleration is $x''(t)$, and the downward gravitational force $mg$ gives a torque of $-mgl\sin x$.
If the air resistance is proportional to the velocity $lx'$, say $clx'$, then it gives an additional torque $-cl^2x'$, so the equation of motion is $ml^2x'' = -mgl\sin x - cl^2x'$ or

$$x'' + rx' + \frac{g}{l}\sin x = 0, \tag{13}$$

where $r \equiv c/m$. The $rx'$ term is a damping term.

Figure 2. Pendulum.

For definiteness, let $g/l = 1$ and consider the undamped case, where $r = 0$. For small motions we can approximate $\sin x$ by the first term of its Taylor series, $\sin x \approx x$, so that we have the simple harmonic oscillator equation $x'' + x = 0$ or

$$x' = y, \tag{14a}$$
$$y' = -x; \tag{14b}$$

(14) has a center at $x = y = 0$, and its by now familiar phase portrait is shown in Fig. 3.

Figure 3. Phase portrait for the linearized system $x'' + x = 0$.

To study larger motions, suppose that we approximate $\sin x$ by the first two terms of its Taylor series instead: $\sin x \approx x - x^3/6$. Then we have the nonlinear, but still approximate, equation of motion

$$x'' + x - \frac{1}{6}x^3 = 0. \tag{15}$$

The latter is of the same form as the equation governing the rectilinear motion of a mass restrained by a "soft spring," which is studied in Section 7.2. The system

$$x' = y, \tag{16a}$$
$$y' = -x + \frac{1}{6}x^3 \tag{16b}$$

has a center at $(0,0)$ and saddles at $(\pm\sqrt{6}, 0)$ in the $(x,y)$ phase plane, as discussed in Section 7.2, and its phase portrait is shown here in Fig. 4.

Figure 4. Phase portrait for the improved model (15).

Finally, if we retain the entire Taylor series of $\sin x$ (i.e., if we keep $\sin x$ intact), then we have the full nonlinear system (13), with $r = 0$, or

$$x' = y, \tag{17a}$$
$$y' = -\sin x, \tag{17b}$$

with singular points at $(x,y) = (n\pi, 0)$ for $n = 0, \pm 1, \pm 2, \ldots$. To classify these singularities, let us linearize equations (17) about the singular point $(n\pi, 0)$ using (7). Doing so, knowing that $\sin n\pi = 0$ and $\cos n\pi = (-1)^n$, and setting $X = x - n\pi$ and $Y = y - 0 = y$, the linearized version of (17) is

$$X' = Y = 0X + 1Y, \tag{18a}$$
$$Y' = (-1)^{n+1}X = (-1)^{n+1}X + 0Y. \tag{18b}$$

In the notation of equation (8), $a = d = 0$, $b = 1$, and $c = (-1)^{n+1}$, so $p = a + d = 0$ and $q = ad - bc = (-1)^n$. Thus, these singular points are on the $q$ axis in the $p, q$ plane.
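The values $p = 0$ and $q = (-1)^n$ can be confirmed by evaluating the partial derivatives in (7) at each singular point. The following is a minimal sketch; the function name `pendulum_linearization` is a hypothetical helper introduced only for this check, and the partial derivatives are written out by hand rather than computed symbolically.

```python
import math

def pendulum_linearization(n):
    """Coefficients a, b, c, d of (8) for x' = y, y' = -sin x at (n*pi, 0)."""
    xs = n * math.pi
    a, b = 0.0, 1.0               # P(x, y) = y:       P_x = 0,       P_y = 1
    c, d = -math.cos(xs), 0.0     # Q(x, y) = -sin x:  Q_x = -cos x,  Q_y = 0
    return a, b, c, d

for n in range(-2, 3):
    a, b, c, d = pendulum_linearization(n)
    p = a + d
    q = a * d - b * c
    # q > 0 with p = 0 lies on the positive q axis; q < 0 lies in the saddle region
    print(n, round(p, 12), round(q, 12))
```

The printed $q$ alternates between $+1$ (even $n$) and $-1$ (odd $n$), matching $q = (-1)^n$.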
For even integers $n$ they are on the positive $q$ axis and correspond to centers; for odd integers $n$ they are on the negative $q$ axis and correspond to saddles. In turn, the latter correspond to saddles of the nonlinear system (17), but the former could be centers or foci of (17), as is discussed in Section 7.4.1. The computer-generated phase portrait in Fig. 5 reveals that they are centers; we have centers at $x = 0, \pm 2\pi, \pm 4\pi, \ldots$, and saddles at $x = \pm\pi, \pm 3\pi, \ldots$, on the $x$ axis.

Figure 5. Phase portrait of the full nonlinear system $x'' + \sin x = 0$.

To understand the phase portrait, suppose (for definiteness) that the pendulum is hanging straight down ($x = 0$) initially, and that we impart an initial angular velocity $y(0)$, so that the initial point is $A$, $B$, $C$, or $D$ in Fig. 5. If we start at $A$, then we follow a closed orbit that is very close to elliptical, and the motion is very close to simple harmonic motion at frequency $\omega = 1$. If we start at $B$, the orbit is not so elliptical, there is an increase in the period, and the motion deviates somewhat from simple harmonic. If we start at $C$, then we approach the saddle at $x = \pi$ as $t \to +\infty$; that is, the pendulum approaches the inverted position as $t \to \infty$. If we impart more energy by starting at $D$, then, even though it slows down as it approaches the inverted position, it has enough energy to pass through that position and to keep going round and round indefinitely. Though the trajectory in the phase plane is not closed, the motion is nonetheless physically periodic since the positions $x$ and $x + 2n\pi$ (for any integer $n$) are physically identical.

How can we gain access to one of the other closed orbits, such as one encircling $x = 4\pi$? That's easy: we "crank" the pendulum through two rotations by hand so that while hanging straight down it is now at $x = 4\pi$. Then we impart an initial angular velocity $y(0)$.

What is the equation of the trajectories? Dividing (17b) by (17a) and integrating gives

$$\frac{1}{2}y^2 - \cos x = \text{constant} \equiv C. \tag{19}$$

Do we really need (19)? After all, we turned the phase portrait generation over to the computer. Yes, to help us choose initial conditions that will enable us to generate the separatrix (the trajectories through the saddles) on the computer. With $x = \pi$ and $y = 0$, we find that $C = 1$, so the equation of the separatrix is $y^2 = 2(1 + \cos x)$. Be careful, because the initial point $x = 0$ and $y = 2$ will not generate the entire separatrix, but only the segment through that point, from $x = -\pi$ to $x = \pi$. To generate the next segment we could use an initial point $x = 2\pi$ and $y = 2$, and so on.

COMMENT 1. Recall that Fig. 3-5 correspond to taking $\sin x \approx x$, $\sin x \approx x - x^3/6$, and retaining $\sin x$ without approximation, respectively, in (13). Thus, and not surprisingly, as we retain more terms in the Taylor series approximation of $\sin x$ about $x = 0$ we capture the flow more accurately and completely.

COMMENT 2. What happens if we include some damping? It turns out that the singularities are at $(n\pi, 0)$, as before. If $r < r_{cr}$, where $r_{cr} = 2$, then the singularities are still saddles if $n$ is odd, but if $n$ is even we now have stable foci rather than centers, as seen in Fig. 6 (for $r = 0.5$). One calls the lightly shaded region (not including the boundaries $AB$ and $CD$) the basin of attraction for the stable focus at $(2\pi, 0)$, the basin of attraction of an attracting singular point $S$ being the set of all initial points $P_0$ such that the representative point $P(t)$ tends to $S$ as $t \to \infty$ if $P(0) = P_0$. Similarly, each of the other stable foci has its own basin of attraction.

COMMENT 3. We have spoken, in this section, of a nonlinear system having the same type of singularity (or not), at a particular singular point $S$, as the system linearized about $S$. Let us use the present example to clarify that idea. By their singularities being of the same type, we mean that their phase portraits are topologically equivalent in the neighborhood of $S$.
Intuitively, that means that one can be obtained from the other by a continuous deformation, with the direction of the arrows preserved. The situation is illustrated in Fig. 7, where we show both the nonlinear (solid) and linearized (dashed) portraits in the neighborhood of the saddle at $(\pi, 0)$. [In more mathematical terms, suppose that our system $x' = P(x,y)$ and $y' = Q(x,y)$ has a singular point at the origin and that

$$P(x,y) = ax + by + \text{higher-order terms} \equiv aX + bY, \tag{20a}$$
$$Q(x,y) = cx + dy + \text{higher-order terms} \equiv cX + dY \tag{20b}$$

define $X$ and $Y$ as continuous functions of $x$ and $y$, and vice versa. Such a relationship between $x, y$ and $X, Y$ is called a homeomorphism and is what we mean by a continuous deformation.]

Figure 7. Continuous deformation of the portrait near $(\pi, 0)$. Solid lines correspond to the nonlinear system, dashed to the linearized version.

COMMENT 4. It turns out that the nonlinear pendulum equation is also prominent in connection with a superconducting device known as a Josephson junction. For discussion of the Josephson junction within the context of nonlinear dynamics, we recommend the book Nonlinear Dynamics and Chaos (Reading, MA: Addison-Wesley, 1994) by Steven H. Strogatz.

In the preceding example we were unable to classify the singularities at $(n\pi, 0)$ with certainty, for $n$ even and $r = 0$, as is discussed in the paragraph below (18). We relied on the computer-generated phase portrait, which shows them to be centers, not foci. More satisfactorily, we could have used the fact that $x'' + \sin x = 0$ is a "conservative system," which idea we now explain.

In general, suppose that the application of Newton's second law gives

$$mx'' = F(x); \tag{21}$$

that is, where the force $F$ happens to be an explicit function only of $x$, not of $x'$ or $t$. Defining $V(x)$ by $F(x) = -V'(x)$, (21) becomes $mx'' + V'(x) = 0$. Let us multiply the latter by $dx$ and integrate on $x$. Since, from the calculus,

$$x''\,dx = \frac{dx'}{dt}\,dx = \frac{dx'}{dt}\frac{dx}{dt}\,dt = x'\frac{dx'}{dt}\,dt = x'\,dx', \tag{22}$$

we obtain $mx'\,dx' + V'(x)\,dx = 0$, integration of which gives

$$\frac{1}{2}mx'^2 + V(x) = \text{constant}. \tag{23}$$

$V(x)$ is called the potential energy associated with the force $F(x)$. For a linear spring, for instance, the force is $F(x) = -kx$, and its potential is $V(x) = kx^2/2$. The upshot is that (23) tells us that the total energy (kinetic plus potential) is conserved; it remains constant over time, so we say that any system of the form (21) is conservative. The pendulum equation

$$ml^2x'' = -mgl\sin x \tag{24}$$

is of that form, and multiplying by $dx$ and integrating on $x$ gives

$$\frac{1}{2}m(lx')^2 - mgl\cos x = \text{constant}, \tag{25}$$

where $m(lx')^2/2$ is the kinetic energy and $-mgl\cos x$ is the potential energy associated with the gravitational force (with the pivot point chosen as the reference level).

For any conservative system (21), the total energy is $E(x, x') = mx'^2/2 + V(x)$. If we plot $E(x, x')$ above the $x, x'$ phase plane, then the $x, x'$ locations of maxima and minima of $E$ are found from

$$\frac{\partial E}{\partial x} = V'(x) = 0 \quad \text{and} \quad \frac{\partial E}{\partial x'} = mx' = 0, \tag{26}$$

and these are precisely the singular points of the system

$$x' = y, \qquad y' = F(x)/m = -V'(x)/m$$

corresponding to (21). To illustrate, the point $S$ beneath the minimum of $E$ (Fig. 8) is a singular point. Furthermore, since $E$ is constant on each trajectory, the phase trajectories are the projections of the closed curves of intersection of the $E$ surface and the various horizontal planes, as sketched. Evidently (and we do not claim that this intuitive discussion constitutes a rigorous proof), a trajectory $\Gamma$ very close to $S$ must be a closed orbit. The only way $\Gamma$ could fail to correspond to a periodic motion is if there is a singular point on $\Gamma$, for the flow would stop there. But if $S$ is an isolated singular point, and $\Gamma$ is small enough, then there can be no singular points on $\Gamma$, and we can conclude that $S$ must be a center. By that reasoning, we could have known that the singularities at $(n\pi, 0)$ (for $n$ even) must be centers, not foci.

Figure 8. Occurrence of a center for a conservative system.
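The conservation statement (23) can be observed numerically for the pendulum with $g/l = 1$, for which the conserved quantity is $\frac{1}{2}y^2 - \cos x$, as in (19). The following is a minimal sketch, assuming a classical fourth-order Runge-Kutta integrator; the integrator, the step size $h = 0.001$, and the initial point $(0, 1)$ are illustrative choices, not taken from the text.

```python
import math

def rk4_step(x, y, h):
    """One classical Runge-Kutta step for x' = y, y' = -sin x (system (17))."""
    def f(x, y):
        return y, -math.sin(x)
    k1x, k1y = f(x, y)
    k2x, k2y = f(x + h * k1x / 2, y + h * k1y / 2)
    k3x, k3y = f(x + h * k2x / 2, y + h * k2y / 2)
    k4x, k4y = f(x + h * k3x, y + h * k3y)
    x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
    y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
    return x, y

def energy(x, y):
    """Left-hand side of (19): kinetic plus potential energy with g/l = 1."""
    return 0.5 * y * y - math.cos(x)

x, y = 0.0, 1.0            # pendulum hanging down, modest initial angular velocity
e0 = energy(x, y)          # below the separatrix value C = 1, so the orbit is closed
max_drift = 0.0
for _ in range(20000):     # integrate to t = 20
    x, y = rk4_step(x, y, 0.001)
    max_drift = max(max_drift, abs(energy(x, y) - e0))

print(e0, max_drift)
```

The computed energy stays constant to within the integrator's truncation error, and the point never leaves the closed orbit about the origin, which is what a center (rather than a focus) requires.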
More generally, we state that conservative systems do not have foci among their singularities.

Figure 8. Occurrence of a center for a conservative system.

EXAMPLE 3. Volterra's Predator-Prey Model. The Volterra model (also known as the Lotka-Volterra model) of the ecological problem of two coexisting species, one the predator and the other its prey, is introduced in Section 7.2. Recall that if $x(t)$, $y(t)$ are the populations of prey and predator, respectively, then the governing equations are of the form

$$x' = \mu(1 - y)x, \tag{27a}$$
$$y' = -\nu(1 - x)y, \tag{27b}$$

where $\mu, \nu$ are positive empirical constants. Setting the right-hand sides of (27) equal to zero reveals that there are two singular points: $(0, 0)$ and $(1, 1)$. Linearizing (27) about $(0, 0)$ gives

$$x' = \mu x, \qquad y' = -\nu y \tag{28}$$

so $a = \mu$, $b = c = 0$, $d = -\nu$. Hence, $\lambda = \mu$ and $-\nu$, which are of opposite sign, so the singularity at the origin is a saddle. Clearly, the straight-line trajectories through $(0, 0)$ are simply the $x$ and $y$ axes since $x = 0$, $y = Ae^{-\nu t}$, and $x = Be^{\mu t}$, $y = 0$ satisfy (28) and give trajectories that pass through the origin. To linearize about $(1, 1)$ we use (7) and obtain the approximations

$$x' = -\mu(y - 1), \qquad y' = \nu(x - 1) \tag{29}$$

or, with $X = x - 1$ and $Y = y - 1$,

$$X' = 0X - \mu Y, \tag{30a}$$
$$Y' = \nu X + 0Y. \tag{30b}$$

Thus, $a = d = 0$, $b = -\mu$, $c = \nu$, so $\lambda = \pm i\sqrt{\mu\nu}$. Hence, the linearized version (30) has a center, and (27) has either a center or a focus. The phase portrait in Fig. 9 shows the singularity at $(1, 1)$ to be a center, with every trajectory being a periodic orbit, except for the two coordinate axes. (Of course, we show only the first quadrant because $x$ and $y$ are populations and hence nonnegative.) The direction of the arrows follows from (27), which reveals that $x' > 0$ for $y < 1$, and $x' < 0$ for $y > 1$ (or $y' < 0$ for $x < 1$ and $y' > 0$ for $x > 1$).

Figure 9. Phase portrait for Volterra problem (27).

COMMENT.
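The two linearizations in Example 3 can be reproduced with a few lines of arithmetic. The sketch below assumes illustrative values $\mu = 2$, $\nu = 0.5$ (not taken from the text) and uses hypothetical helper names; it forms the Jacobian of (27) at a given point and solves the characteristic equation $\lambda^2 - (a+d)\lambda + (ad - bc) = 0$.

```python
import cmath


def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from lambda^2 - (a + d) lambda + (ad - bc) = 0."""
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0


def volterra_jacobian(x, y, mu, nu):
    """Partial derivatives (a, b, c, d) of x' = mu(1 - y)x, y' = -nu(1 - x)y at (x, y)."""
    a = mu * (1.0 - y)   # d(x')/dx
    b = -mu * x          # d(x')/dy
    c = nu * y           # d(y')/dx
    d = -nu * (1.0 - x)  # d(y')/dy
    return a, b, c, d


if __name__ == "__main__":
    mu, nu = 2.0, 0.5  # assumed sample values
    # Real eigenvalues of opposite sign at (0, 0): a saddle.
    print(eigenvalues_2x2(*volterra_jacobian(0.0, 0.0, mu, nu)))
    # Purely imaginary pair at (1, 1): a center of the linearized system.
    print(eigenvalues_2x2(*volterra_jacobian(1.0, 1.0, mu, nu)))
```

The purely imaginary pair at $(1, 1)$ is exactly the borderline case in which the linearization alone cannot distinguish a center from a focus.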
Although the Volterra model is a useful starting point in the modeling process and is useful pedagogically, it is not regarded as sufficiently realistic for practical ecological applications.

7.4.3. Bifurcations. As we have stressed, our approach in this chapter is largely qualitative. Of special importance, then, is the concept of bifurcations. That is, systems generally include one or more physical parameters (such as $\mu$ and $\nu$ in Example 3). As those parameters are varied continuously, one expects the system behavior to change continuously as well. For instance, if we vary $\mu$ in Example 3, then the eccentricity of the orbits close to the center at $(1, 1)$ changes, and the overall flow field deforms, but, qualitatively, nothing dramatic happens. In other cases, there may exist certain critical values of one or more of the parameters such that the overall system behavior changes abruptly and dramatically as a parameter passes through such a critical value. We speak of such a result as a bifurcation. Let us illustrate the idea with an example.

EXAMPLE 4. Saddle-Node Bifurcation. The nonlinear system

$$x' = -rx + y, \tag{31a}$$
$$y' = \frac{x^2}{1 + x^2} - y \tag{31b}$$

arises in molecular biology, where $x(t)$ and $y(t)$ are proportional to protein and messenger RNA concentrations, and $r$ is a positive empirical constant, or parameter, associated with the "death rate" of protein in the absence of the messenger RNA [for if $y = 0$, then (31a) gives exponential decay of $x$, with rate constant $r$]. The singular points of (31) correspond to intersection points of $y = rx$ and $y = x^2/(1 + x^2)$, as shown (solid curves) in Fig. 10. Equating these gives $x = y = 0$ and also the two distinct roots

$$x_\pm = \frac{1 \pm \sqrt{1 - 4r^2}}{2r}, \qquad y_\pm = r x_\pm = \frac{1 \pm \sqrt{1 - 4r^2}}{2} \tag{32}$$

provided that $r < 1/2$. Thus, the critical slope of $y = rx$ is $r = 1/2$. If $r < 1/2$ we obtain the two intersections $S_+ = (x_+, y_+)$ and $S_- = (x_-, y_-)$; if $r = 1/2$ (dashed line in Fig. 10) these coalesce at $(1, 0.5)$, and if $r > 1/2$ they disappear and we have only the singular point at the origin.

Let us study the three singular points, for $r < 1/2$. First $(0, 0)$: we can see from (31), by inspection or Taylor series expansion, that the linearized equations are

$$x' = -rx + y, \qquad y' = -y \tag{33}$$

so $a = -r$, $b = 1$, $c = 0$, and $d = -1$. Thus, (10) gives $\lambda = -r$ and $-1$. Since both are negative, the singular point $(0, 0)$ is a stable node. In similar fashion (which calculations we leave to Exercise 6), we find that the singularity at $S_-$ is a saddle, and that the singularity at $S_+$ is a stable improper node.

As $r$ is increased, $S_-$ and $S_+$ approach each other along the curve $y = x^2/(1 + x^2)$. When $r = 1/2$ they merge and form a singularity of some other type, and when $r$ is increased beyond $1/2$ the singularity disappears altogether, leaving only the node at the origin. The bifurcation that occurs at $r = 1/2$ is an example of a "saddle-node bifurcation." From the way the singular points $S_+$ and $S_-$ approach each other along the unstable manifold of the saddle, like "beads on a string" as Strogatz puts it, we see that the bifurcation process is essentially a one-dimensional event embedded within a higher-dimensional space (two-dimensional in this case). ■

The saddle-node bifurcation illustrated above is but one type of bifurcation. A few others are discussed in the exercises and in the next section. For a more complete discussion of bifurcation theory, we recommend the book by Strogatz, referenced in Example 2.

Closure. In this section we got into the details of the phase plane analysis of autonomous nonlinear systems. Whether or not we generate the phase portrait by computer, it is essential to begin an analysis by finding any singular points and, by linearization, to determine the key features of the local flow near each singular point. That information is needed even if we turn to the computer to generate the phase portrait, as we discuss below, under "Computer software."
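The disappearance of $S_\pm$ as $r$ passes through $1/2$ can be traced numerically. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the Jacobian of (31) is worked out by hand as $[[-r, 1], [2x/(1+x^2)^2, -1]]$, and the classification uses only the trace and determinant (so it reports "node" without distinguishing proper from improper nodes).

```python
import math


def singular_points(r):
    """Singular points of (31): the origin plus S- and S+ when r < 1/2."""
    points = [(0.0, 0.0)]
    disc = 1.0 - 4.0 * r * r
    if disc > 0.0:
        for sign in (-1.0, 1.0):  # S- first, then S+
            x = (1.0 + sign * math.sqrt(disc)) / (2.0 * r)
            points.append((x, r * x))
    elif disc == 0.0:
        points.append((1.0 / (2.0 * r), 0.5))  # merged point at r = 1/2
    return points


def classify(r, x):
    """Classify a singular point of (31) from the trace and determinant of its Jacobian."""
    trace = -r - 1.0
    det = r - 2.0 * x / (1.0 + x * x) ** 2
    if det < 0.0:
        return "saddle"
    if det == 0.0:
        return "nonelementary"
    kind = "node" if trace * trace - 4.0 * det >= 0.0 else "focus"
    return ("stable " if trace < 0.0 else "unstable ") + kind


if __name__ == "__main__":
    for r in (0.3, 0.5, 1.0):  # subcritical, critical, supercritical samples
        print(r, [(round(x, 4), classify(r, x)) for x, _ in singular_points(r)])
```

Because the trace $-r - 1$ is negative for every $r > 0$, no singular point of (31) can be an unstable node or focus; only the sign of the determinant changes along the curve, which is what turns $S_-$ into a saddle.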
We also explored the correspondence between the type of a singularity of the nonlinear system and that of the linearized system and found that the type remains the same, except for the borderline cases corresponding to $p, q$ points on the positive $q$ axis or on the parabola $p^2 = 4q$ in Fig. 1. Those cases could "go either way." That … thus changing the singularity type.

Figure 10. Determining the singular points of (31).

… system

$$x' = y, \qquad y' = -\sin x - 0.5y \tag{34}$$

and to linearize (34) about $(\pi, 0)$ as

$$x' = y, \qquad y' = -(\cos\pi)(x - \pi) - 0.5y \tag{35}$$

or, with $x - \pi = X$ and $y - 0 = Y$,

$$X' = 0X + 1Y, \qquad Y' = 1X - 0.5Y. \tag{36}$$

… $X' = 0X + \kappa X$, $\kappa X' = 1X - 0.5\kappa X$, … $y = -0.02561553$. … as follows: … x = -4.5..17, y = -6..6, scene = [x,y]); (37) … simply chosen as the same ones used in Fig. 6.

EXERCISES 7.4

1. (a) In Example 1 we stated that the equation $x'' + \epsilon x'^3 + x = 0$ ($\epsilon > 0$) has a stable focus at the origin of the phase plane. Verify that claim by generating (with computer software) its phase portrait.
(b) Should the cubic damping result in the oscillation dying out more or less rapidly than for the case of linear damping, $x'' + \epsilon x' + x = 0$, for the same values of $\epsilon$? Explain.
(c) Classify the singularity at $(0, 0)$ for the case where $\epsilon < 0$, and support your claim.

2. Determine all singular points, if any, and classify each insofar as possible. =x-y, y’ = sin(z + y) (a) Show that S_. is a saddle, and find the equations of the two straight-line trajectories through it. Show that S, is a stable improper node, and find two straight-line trajectories through it. Find two straight-line trajectories through the stable node at (0,0). Use these results to sketch the phase portrait of (31). ate the phaseportrait of (31) in the rectangle 0 < x < 1.5, y! = ~x -2y Qo=o, (g)a=-2a-y, y=—242y yo=ar+c° (h)c’ = —-22--y, y' =sine O<y< 15. 7.
(Dynamic formulation ofa buckling problem) Consider the buckling of the mechanical system shown in the figure, and (a's yer —-1, yo=y-a-l Qai=a?-y, yo=Iw-y (k)a’ =a? —y?, P yi =a? +y-2 “ ae oe , ow y’ =a? -4 3. (a)—(n) Use computer software such as the Maple phaseportrait command, to generate the phase portrait of the corresponding system in Exercise 2. « 4. Is the given system conservative? Explain. o (a)a” — Qa’ + sing = 0 (c)a” + a? = 0 e ou Qa =y, yi =—3sine (m) a’ =2+2y, yi’ =—-x—siny (n)a’ = (x +1)y, In parts (a) and (b) below, let 6. (Example 4 continuation.) r ==0.3 in (31). Sy. is nonelementary. (c) For the representative supercritical case r = 1, identify and classify any singularities, and use computer software to gener- (c)a’=y, y' =(1—27)/(1 +2") (e)a'=(l—a*)y, (0,0) shouldbe a stablenode. (b) For the critical case, r = 1/2, show that the singularity at (aja'=y, y’=1-24 (b)a’=1—y’, y=l-2 (djx’ at (0,0) shouldbea stablefocus. (c) r to be any value that you wish but large enough for the damping to be supercritical. For instance, the singularity at (ba x V1 m o k +e? +e2=0 (d)a" +a’ +ar=0 5. Use computer software such as the Maple phaseportrait command, to obtain the phase portrait of equation (13), over ~2 <a < 14and ~3 < y < 3, showing enough trajectories to clearly portray the flow. Take g/l = 1, and (a)r = 0, Se, Se P consisting of two massless rigid rods of length | pinned to a (b) r to be any positive value that you wish but small enough mass 7 and a lateral spring of stiffness &. That is, when the for the damping to be subcritical. For instance, the singularity spring is neither stretched nor compressed x = 0 and the rods are aligned vertically. As we increase the downward load P nothing happens until we reach a critical value P,,, at which value x increases (to one side or the other, we can’t predict which) and the system collapses. 
(a) Application of Newton's second law of motion gives

$$m x'' - \frac{2P}{l}\,x\left[1 - \left(\frac{x}{l}\right)^2\right]^{-1/2} + kx = 0 \tag{7.1}$$

as governing the displacement $x(t)$. With $x' = y$, show that the singularity at the origin in the $x, y$ phase plane changes its type as $P$ is sufficiently increased. Discuss that change of type, show how it corresponds to the onset of buckling, and use it to show that the critical buckling load is $P_{cr} = kl/2$.
(b) Explain what the results of part (a) have to do with bifurcation theory.
(c) Use Newton's law to derive (7.1).

8. (Motion of current-carrying wire) A mutual force of attraction is exerted between parallel current-carrying wires. The infinite wire shown in the figure has current $I$, and the wire of length $l$ and mass $m$ (with leads that are perpendicular to the paper) has current $i$ in the same direction as $I$. According to the Biot-Savart law, the mutual force of attraction is $2Iil/(\text{separation}) = 2Iil/(a - x)$, where $x = 0$ is the position at which the spring force is zero, so the equation of motion of the restrained wire is

$$m x'' + k\left(x - \frac{r}{a - x}\right) = 0,$$

where $r \equiv 2Iil/k$. Thinking of $m$, $k$, $a$, and $l$ as fixed, and the currents $I$ and $i$ as variable, let us study the behavior of the system in terms of the parameter $r$. For definiteness, let $m = k = a = 1$.
(a) With $x' = y$, identify any singularities in the $x, y$ phase plane and their types, and show that they depend upon whether $r$ is less than, equal to, or greater than $1/4$. Suppose that $r < 1/4$. Find the equation of the phase trajectories and of the separatrix. Do a labeled sketch of the phase portrait.
(b) Let $r = 0.1$, say, and obtain a computer plot of the phase portrait.
(c) Next, consider the transitional case, where $r = 1/4$. Show that that case corresponds to the merging of the two singularities, and the forming of a single singularity of higher order (i.e., a nonelementary singularity). Do a labeled sketch of the phase portrait for that case.
(d) Let $r = 1/4$, and obtain a computer plot of the phase portrait.
(e) Next, consider
the case where $r > 1/4$, and sketch the phase portrait.
(f) Let $r = 0.5$, say, and obtain a computer plot of the phase portrait.
(g) Discuss this problem from the point of view of bifurcations, insofar as the parameter $r$ is concerned.

…

$$x'' - \epsilon(1 - x^2)x' + x = 0 \qquad (\epsilon > 0) \tag{1}$$

… of the beating of the heart.* Usually, in applications the parameter $\epsilon$ is positive. To study (1) in the phase plane we first re-express it as the system

$$x' = y, \tag{2a}$$
$$y' = -x + \epsilon(1 - x^2)y, \tag{2b}$$

which has one singular point: $(0, 0)$. Linearizing (2) about $(0, 0)$ gives

$$x' = y, \tag{3a}$$
$$y' = -x + \epsilon y, \tag{3b}$$

which has an unstable focus if $\epsilon < 2$ and an unstable node if $\epsilon > 2$. That result is not surprising since (3) is equivalent to $x'' - \epsilon x' + x = 0$ [equation (1) with the nonlinear $\epsilon x^2 x'$ term dropped], and the latter corresponds to a damped harmonic oscillator with negative damping.

Near the origin in the $x, y$ phase plane the flow is accurately described by (3) and is shown in Fig. 1a. As the motion increases, the neglected nonlinear term $\epsilon x^2 x'$ ceases to be negligible, and we wonder how the trajectory shown in Fig. 1a continues to develop as $t$ increases. Since the "damping coefficient" $c = -\epsilon(1 - x^2)$, in (1), is negative throughout the vertical strip $|x| < 1$, we expect the spiral to continue to grow, with distortion as the $\epsilon x^2 x'$ term becomes more prominent. Eventually, the spiral will break out of the $|x| < 1$ strip (Fig. 1b). As the representative point $(x(t), y(t))$ spends more and more time outside that strip, where $c = -\epsilon(1 - x^2) > 0$, the effect of the positive damping in $|x| > 1$ increases, relative to the effect of the negative damping in $|x| < 1$, so it is natural to wonder if the trajectory might approach some limiting closed orbit as $t \to \infty$ over which the effects of the positive and negative damping are exactly in balance. We can use the following theorem, due to N. Levinson and O. K. Smith.

Figure 1. The unstable focus at $(0, 0)$.

THEOREM 7.5.1 Existence of Limit Cycle
Let $f(x)$ be even [$f(-x) = f(x)$] and continuous for all $x$.
Let $g(x)$ be odd [$g(-x) = -g(x)$] with $g(x) > 0$ for all $x > 0$, and $g'(x)$ be continuous for all $x$. With

$$\int_0^x f(\xi)\,d\xi \equiv F(x) \quad \text{and} \quad \int_0^x g(\xi)\,d\xi \equiv G(x), \tag{4}$$

suppose that (i) $G(x) \to \infty$ as $x \to \infty$ and (ii) there is an $x_0 > 0$ such that $F(x) < 0$ for $0 < x < x_0$, $F(x) > 0$ for $x > x_0$, and $F(x)$ is monotonically increasing for $x > x_0$ with $F(x) \to \infty$ as $x \to \infty$. Then the generalized Liénard equation

$$x'' + f(x)x' + g(x) = 0 \tag{5}$$

has a single periodic solution, the trajectory of which is a closed curve encircling the origin in the $x, x'$ phase plane. All other trajectories (except the trajectory consisting of the single point at the origin) spiral toward the closed trajectory as $t \to \infty$.

*B. van der Pol, On "Relaxation Oscillations," Philosophical Magazine, Vol. 2, 1926, pp. 978-992, and B. van der Pol and J. van der Mark, The Heartbeat As a Relaxation Oscillation, and An Electrical Model of the Heart, Philosophical Magazine, Vol. 6, 1928, pp. 763-775.

Applying this theorem to the van der Pol equation (1), $f(x) = -\epsilon(1 - x^2)$ is an even function of $x$ and $F(x) = -\epsilon(x - x^3/3)$, which is less than zero for $0 < x < \sqrt{3}$, greater than zero for $x > \sqrt{3}$, and which increases monotonically to infinity as $x \to \infty$. Further, $g(x) = x$ is odd, and positive for $x > 0$, $g'(x) = 1$, and $G(x) = x^2/2 \to \infty$ as $x \to \infty$. Since the conditions of the theorem are met (for all $\epsilon > 0$), we conclude from the theorem that the van der Pol equation does admit a closed trajectory, a periodic solution, for every positive value of $\epsilon$.

Computer results (using the Maple phaseportrait command) bear out this claim. The phase portraits are shown in Fig. 2 for the representative cases $\epsilon = 0.2$, $1$, and $5$, and $x(t)$ is plotted versus $t$ in Fig. 3 for the trajectories labeled $C$. The closed trajectories labeled $\Gamma$, predicted by the theorem, are examples of limit cycles, namely, isolated closed orbits. By $\Gamma$ being isolated we mean that neighboring trajectories through points arbitrarily close to $\Gamma$ are not closed orbits.
If we start on $\Gamma$ we remain on $\Gamma$, but if we start on a neighboring trajectory, then we approach $\Gamma$ as $t \to \infty$ (unless we start at the origin, which is an equilibrium point). Thus, we classify the van der Pol limit cycle as stable (or attracting). Clearly, that particular trajectory is of the greatest importance because every other trajectory (except the point trajectory $x = y = 0$) winds onto it as $t \to \infty$.

As one might suspect from Fig. 2 and as can be proved, the van der Pol limit cycle approaches a circle of radius 2 as $\epsilon \to 0$ through positive values. When $\epsilon$ becomes zero the singularity at the origin changes from a focus to a center, and while the circle of radius 2 persists as a trajectory it is joined by the whole family of circular orbits centered at the origin. If $\epsilon$ is diminished further and becomes negative, the origin becomes a stable focus and all closed orbits disappear and give way to inward-winding spirals. Thus, $\epsilon = 0$ is a bifurcation value of $\epsilon$.

Figure 2. The van der Pol limit cycle, for $\epsilon = 0.2$, $1$, and $5$.

Observe the interesting extremes: as $\epsilon \to 0$, the steady-state oscillation (i.e., corresponding to the limit cycle) becomes a purely harmonic motion with amplitude 2. But as $\epsilon$ becomes large, the limit cycle distorts considerably and the steady-state oscillation $x(t)$ becomes "herky jerky." (In the exercises, we show that as $\epsilon \to \infty$ it even becomes discontinuous!) Such motions were dubbed relaxation oscillations by van der Pol, and these are found all around us. Just a few, mentioned in the paper by van der Pol and van der Mark, are the singing of wires in a cross wind, the scratching noise of a knife on a plate, the squeaking of a door, the intermittent discharge of a capacitor through a neon tube, the periodic reoccurrence of epidemics and economic crises, the sleeping of flowers, menstruation, and the beating of the heart. Such oscillations are characterized by a slow buildup followed by a rapid discharge, then a slow buildup, and so on.
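The attracting character of the limit cycle, and its amplitude near 2 for small $\epsilon$, can be seen in a short simulation. This is a minimal sketch, assuming $\epsilon = 0.2$, a hand-written Runge-Kutta integrator, and arbitrarily chosen transient and measurement times; none of these choices come from the text. One trajectory starts inside the cycle and one outside, and the peak value of $|x|$ is recorded after the transient has died out.

```python
def vdp_rhs(x, y, eps):
    """van der Pol system (2): x' = y, y' = -x + eps (1 - x^2) y."""
    return y, -x + eps * (1.0 - x * x) * y


def rk4_step(x, y, eps, dt):
    """Advance (x, y) by one classical fourth-order Runge-Kutta step."""
    k1x, k1y = vdp_rhs(x, y, eps)
    k2x, k2y = vdp_rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, eps)
    k3x, k3y = vdp_rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, eps)
    k4x, k4y = vdp_rhs(x + dt * k3x, y + dt * k3y, eps)
    x_new = x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
    y_new = y + dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
    return x_new, y_new


def limit_cycle_amplitude(eps, x0, y0=0.0, dt=0.01, t_transient=200.0, t_measure=20.0):
    """Peak |x| over a measurement window taken after the transient has decayed."""
    x, y = x0, y0
    for _ in range(int(t_transient / dt)):
        x, y = rk4_step(x, y, eps, dt)
    peak = 0.0
    for _ in range(int(t_measure / dt)):
        x, y = rk4_step(x, y, eps, dt)
        peak = max(peak, abs(x))
    return peak


if __name__ == "__main__":
    # Start well inside and well outside the expected cycle.
    print(limit_cycle_amplitude(0.2, 0.1))
    print(limit_cycle_amplitude(0.2, 4.0))
```

Both starting points settle onto the same closed orbit, which is the sense in which the cycle is isolated and stable; a linear oscillator would instead keep whatever amplitude it was started with.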
Thus, there are two time scales present, a "slow time" associated with the slow buildup, and a "fast time" associated with the rapid discharge. In biological oscillators such as the heart, the period of oscillation provides a biological "clock."

Understand clearly that the limit cycle phenomenon is possible only in nonlinear systems. For consider the case of small $\epsilon$, say, where we have a limit cycle that is approximately a circle of radius 2. If the system were linear, then the existence of that orbit would imply that the entire family of concentric circles would necessarily be trajectories as well, but they are not.

Besides the van der Pol example, other examples of differential equations exhibiting limit cycles are given in the exercises. In other cases a limit cycle can be unstable (repelling), in that other trajectories wind away from it, or, in exceptional cases, semistable, in that other trajectories wind toward it from the interior and away from it on the exterior, or vice versa.

7.5.2. Application to the nerve impulse and visual perception. The brain contains about $10^{12}$ (a million million) neurons, with around $10^{14}$ to $10^{15}$ interconnections. Within this complex network, information is encoded and transmitted in the form of electrical impulses. The basic building block is the individual neuron, or nerve cell, and the functioning of a single neuron as an input/output device is of deep importance and interest. Our purposes in discussing the neuron here are in connection with relaxation oscillations, and especially with the key role of nonlinearity in the design and functioning of our central nervous system.

A typical neuron consists of a cell body that contains the nucleus and from which emanate many dendrites and a single axon. The axon splits near its end into a number of terminals, as shown schematically in Fig. 4. Dendrites are on the order of a millimeter long, and axons can be as short as that or as long as a meter.
At the end of each terminal is a synapse, which is separated from a dendrite of an adjacent cell by a tiny synaptic gap. Electrical impulses, each of which is called an action potential, are generated near the cell body and travel down the axon. When an action potential arrives at a synapse, chemical signals in the form of neurotransmitter molecules are released into the synaptic gap and diffuse across that gap to a neighboring dendrite. These electrical signals to that neighboring neuron can be positive (excitatory) or negative (inhibitory). Each cell receives a great many such cues from other neurons. If the net excitation to a cell falls below some critical threshold value, then the cell will not fire; that is, it will not generate action potentials. If the net excitation is a bit above that threshold, then the cell will fire not just once, but repeatedly and at a certain frequency.

Figure 4. Typical neuron.

Let us consider briefly the generation of the nerve impulse. The nerve cell is surrounded by and also contains salt water. The salt molecules include sodium chloride (NaCl) and potassium chloride (KCl), and many of these molecules are ionized so that Na⁺, K⁺, and Cl⁻ ions are abundant both inside and outside the axon. Of these, Na⁺ and K⁺ are the key players insofar as the nerve impulse is concerned. Rather than being impermeable, the axon membrane has many tubular protein pores of two kinds: channels that can open or close and let either Na⁺ or K⁺ ions through in a passive manner, like valves, and pumps that (using energy from the metabolism of glucose and oxygen) actively eject Na⁺ ions (i.e., from inside the axon to outside) and bring in K⁺ ions. Through the action of these active and passive pores, and the physical mechanisms of diffusion and the repulsion/attraction of like/unlike charges, a differential in charge, and hence potential (voltage), is established across the axon membrane which, in the resting state, is 70 millivolts, positive outside. If the net excitation arriving from other cells sufficiently reduces that voltage, at the cell body end of the axon, then a sequence of opening and closing of pores is established, which results in a flow of Na⁺ and K⁺ ions and hence a voltage "blip," the action potential, proceeding down the axon. That wave is not like the flow of electrons in a copper wire, but rather like a water wave that results not from horizontal motion of the water, but from a differential up and down motion of water particles.

This complicated process was pieced together by Alan Hodgkin and Andrew Huxley, in 1952, and is clearly discussed in the book by David H. Hubel.* Various electrical circuit analogs have been proposed, to model the nerve impulse, by Hodgkin and Huxley and others. They are all somewhat empirical and of the "fill and flush" type, that is, where a charge builds up and is then discharged through an electrical circuit, and they are described in the little book by F. C. Hoppensteadt.† Of interest to us here is that the firing is repetitive (on the order of 100 impulses per second), and consists of a relaxation oscillation governed by the van der Pol equation (as discussed in Hoppensteadt). Further, it is known that as the excitation voltage is increased above the threshold, the magnitude of the action potential remains unchanged, but the firing frequency increases. If we plot the output (action potential) amplitude versus the input (excitation voltage) amplitude, the graph is as shown in Fig. 5.

Figure 5. Input/output relation for a neuron (output amplitude versus input amplitude).
Since the graph of output amplitude versus input amplitude is not a straight line through the origin, the process must be nonlinear, which fact is also known through the governing equation being a van der Pol equation (or other such equation, depending upon the model adopted); indeed, any process where the output amplitude is zero until a critical threshold is reached is necessarily nonlinear.

Since the individual neuron is a nonlinear device, surely the same is true for the entire central nervous system, and the natural and important question that asserts itself is "Why?" What is the functional purpose of that nonlinearity? Let us attempt an answer. We have seen that nonlinear systems are more complex than linear ones. Since our nervous system is responsible for carrying out complex tasks, it seems reasonable that the system chosen should be nonlinear. We can be more specific if we look at a single type of task, say visual perception, which is so complex that it occupies around a third of the million million neurons in the brain.

Perhaps the most striking revelation in studying visual perception is in discovering that one's visual perception is not a simple replica of the image that falls upon the retina but is an interpretation of that information, effected by visual processing that begins in the retina and continues up into the visual cortex of the brain. For instance, hold your two hands up, in front of your face, with one twice as far from your eyes as the other (about 8 and 16 inches). You should find that they look about the same size. Yet, if we replace your eyes with a camera, and take a picture, we see in the photo that one hand looks around twice as large as the other. Usually, we blame the camera for the "distortion," but the camera simply shows you the same information that is picked up by your retinas; the distortion is introduced by the brain as it interprets and reconstructs the data before presenting it to you as visual consciousness.

*Eye, Brain, and Vision (New York: W. H. Freeman and Company, 1988).
†An Introduction to the Mathematics of Neurons (Cambridge: Cambridge University Press, 1986).

The latter is but one example of a principle of visual perception known as size constancy. The idea is that whereas the actual size of a physical object is invariant, the size of its retinal image varies dramatically as it is moved nearer or further from us. Size constancy is the processing, between the retina and visual consciousness, that compensates for such variation so as to stabilize our visual world. Thus, for instance, our hands look about the same size even when the retinal image of one is twice as large as that of the other. The functional advantage of that stabilization is to relieve our conscious mind of having to figure everything out; our brain does much of the figuring out and presents us with its findings so that our conscious attention can be directed to more pressing and singular matters. Surely, size constancy requires a nonlinear perceptual system, for if we take the retinal image size as the input amplitude and the perceived size as the output amplitude, then a linear system would show us the two hands just as a camera does. (Remember that for a linear system if we double the input we double the corresponding output, if we triple the input we triple the output, and so on.)

In visual perception there are other constancy mechanisms as well, such as brightness constancy and hue constancy. To illustrate brightness constancy, consider the following simple experiment reported by Hubel in his book, cited above. We know from experience that a newspaper appears pretty much the same, whether we look at it in sunlight or in a dimly lit room: black print on white paper. Taking a newspaper and a light meter, Hubel found that the white paper reflected 20 times as much light outdoors as indoors, yet it looked about the same outdoors and indoors.
If the perceptual system were linear, the white paper should have looked 20 times as bright outdoors compared to indoors. Even more striking, he found that the black letters actually reflected twice as much light outdoors as the white paper did indoors yet, whether indoors or outdoors, the black letters always looked black and white paper always looked white.

The point, then, is that these constancy mechanisms stabilize our perceived world and relieve our conscious mind from having to deal with newspapers that look 20 times brighter outdoors than indoors, hands that "grow" and "shrink" as they are moved to and fro, and so on, so that our conscious attention can be reserved for more singular matters such as not getting hit by a bus. These mechanisms are possible only by virtue of the nonlinearity of the central nervous system, which can be traced to the building block of that system, the individual neuron.

You have no doubt heard about "the whole being greater than the sum of its parts." That idea expresses the essence of the Gestalt school of psychology which, by around 1920, supplanted the previously dominant molecular school of psychology, which had held that the whole is equal to the sum of the parts. To illustrate the Gestalt view, notice that the black dots in Fig. 6a are seen as a group of dots, not as a number of individual dots, and that the arrangement in Fig. 6b is seen as a triangle with sections removed, rather than as three bent lines. In fact, Max Wertheimer's fundamental experiment, which launched the Gestalt concept in 1912, is as follows. If parallel lines of equal length are displayed on a screen successively, it is found that if the time interval and distance between them are sufficiently small, then they are perceived not as two separate lines but as a single line that moves laterally.

Figure 6. The whole is greater than the sum of its parts.
(Today we recognize that idea as the basis of motion pictures!) In mathematical terms, the molecular idea is reminiscent of the result for a linear differential equation $L[u] = f_1 + f_2 + \cdots + f_k$ that the response $u$ to the combined input is simply the sum of the responses $u_1, u_2, \ldots, u_k$ to the individual inputs $f_1, f_2, \ldots, f_k$. We suggest here that, in effect, the contribution of the Gestaltists was to recognize the highly nonlinear nature of the perceptual system, even if they did not think of it or express it in those terms. We say more about the far-reaching effects of that nonlinearity upon human behavior in the next section.

Closure. The principal idea of this section is that of limit cycles, which occur only for nonlinear systems. The classic example of an equation with a limit cycle solution is the van der Pol equation, which we discuss, but that is by no means the only equation that exhibits a limit cycle. That limit cycle solution is said to be a self-excited oscillation because even the slightest disturbance from the equilibrium point at the origin results in an oscillation that grows and inevitably approaches the limit cycle as $t \to \infty$. The case of large $\epsilon$ is especially important in biological applications, and the corresponding limit cycle solution is a relaxation oscillation characterized by alternate $t$-intervals of slow and rapid change. Since the existence of a limit cycle is of great importance, there are numerous theorems available that help one to detect whether or not a limit cycle is present, but we include only the theorem of Levinson and Smith since it is helpful in our discussion of the van der Pol equation. Finally, we discuss the action potential occurring during the firing of a neuron, as a biological illustration of a relaxation oscillation, and we use that example to point out the nonlinear nature of the neuron and central nervous system, and the profound implications of that nonlinearity.

EXERCISES 7.5

1.
(a) Obtain computer results analogous to those presented in Fig. 2 and 3 for the case where $\epsilon = 0.1$. What value do you think the period approaches as $\epsilon \to 0$? Explain.
(b) Obtain computer results analogous to those presented in Fig. 2 and 3, for $\epsilon = 10$. NOTE: Be sure to make your $t$-integration step size small enough, namely, small compared to the time intervals of rapid change of $x(t)$.

2. Use Theorem 7.5.1 to show that the following equations admit limit cycles.
(a) [...]

3. Identify and classify any singularities of the given equation in the $x, y$ phase plane, where $x' = y$. Argue as convincingly as you can for or against the existence of any limit cycles, their shape, and their stability or instability. You should be able to tell a great deal from the equation itself, even in advance of any computer simulation.
(a) $x'' + (x^2 + x'^2 - 1)x' + x = 0$
(b) $x'' + (1 - x^2 - x'^2)x' + x = 0$

4. (Hopf bifurcation) (a) Show that the nonlinear system

$$x' = \epsilon x + y - x(x^2 + y^2), \qquad (4.1a)$$
$$y' = -x + \epsilon y - y(x^2 + y^2) \qquad (4.1b)$$

can be simplified to

$$r' = r(\epsilon - r^2), \qquad (4.2a)$$
$$\theta' = -1 \qquad (4.2b)$$

by the change of variables $x = r\cos\theta$, $y = r\sin\theta$ from the Cartesian $x, y$ variables to the polar $r, \theta$ variables. HINT: Putting $x = r\cos\theta$, $y = r\sin\theta$ into (4.1) gives differential equations each of which contains both $r'$ and $\theta'$ on the left-hand side. Suitable linear combinations of those equations give (4.2a,b), respectively. We suggest that you use the shorthand $\cos\theta = c$ and $\sin\theta = s$ for brevity.
(b) From (4.2) show that the origin in the $x, y$ plane is a stable focus if $\epsilon < 0$ and an unstable focus if $\epsilon > 0$, and show that working from (4.1), instead, one obtains the same classification.
(c) Show from (4.2) that $r(t) = \sqrt{\epsilon}$ is a trajectory (if $\epsilon > 0$) and, in fact, a stable limit cycle. NOTE: Observe that zero is a bifurcation value of $\epsilon$. As $\epsilon$ increases, a limit cycle is born as $\epsilon$ passes through the value zero, and its radius increases with $\epsilon$. This is known as a Hopf bifurcation.
(d) Modify (4.1) so that it gives an unstable limit cycle at $r = \sqrt{\epsilon}$, instead.

5. The box in the circuit shown in the figure represents an "active element" such as a semiconductor or vacuum tube, the voltage drop across which is a known function $f(i)$ of the current $i$. Thus, Kirchhoff's voltage law gives

$$L\frac{di}{dt} + f(i) + \frac{1}{C}\int i\,dt = 0.$$

(a) If $f$ is of the form $f(i) = ai^3 - bi$, show that one obtains

$$L\frac{d^2 i}{dt^2} + (3ai^2 - b)\frac{di}{dt} + \frac{1}{C}\,i = 0. \qquad (5.1)$$

(b) Show that by a suitable scaling of both the independent and dependent variables one can obtain from (5.1) the van der Pol equation

$$I'' - \epsilon(1 - I^2)I' + I = 0, \qquad (5.2)$$

where primes denote differentiation with respect to the new time variable $\tau$, where $t = \alpha\tau$ and $i = \beta I$. That is, find $\alpha$, $\beta$, and $\epsilon$ in terms of $L$, $C$, $a$, and $b$.

6. (Rayleigh's equation and van der Pol relaxation oscillation) (a) Show that if we set $x = z'$ in the van der Pol equation $x'' - \epsilon(1 - x^2)x' + x = 0$ and integrate once, we obtain $z'' - \epsilon(z' - z'^3/3) + z = C$, where $C$ is a constant. Setting $z = u + C$, to absorb the $C$, obtain Rayleigh's equation

$$u'' - \epsilon\left(1 - \frac{u'^2}{3}\right)u' + u = 0 \qquad (6.1)$$

on $u(t)$. The latter was studied by Lord Rayleigh (John William Strutt, 1842-1919) in connection with the vibration of a clarinet reed.
(b) Letting $u' = v$, reduce (6.1) to a system of two equations. Show that the only singular point of that system is (0,0), and classify its type.
(c) Choosing initial values for $u$ and $v$, use computer software to obtain phase portraits and plots of $u(t)$ versus $t$, much as we have in Fig. 2 and 3, for $\epsilon = 0.2$, 1, and 5. For each of those $\epsilon$'s, estimate the amplitude and period of the limit cycle solution.
(d) To study the relaxation oscillation ($\epsilon \to \infty$) of the van der Pol equation it is more convenient to work with the Rayleigh equation (6.1), as we shall see. With $u' = v$, let us scale the $u$ variable according to $u = \epsilon w$. Show that the resulting system is

$$\epsilon w' = v, \qquad (6.2a)$$
$$v' = \epsilon\left(v - \frac{v^3}{3}\right) - \epsilon w, \qquad (6.2b)$$

so

$$\frac{dv}{dw} = \epsilon^2\,\frac{(v - v^3/3) - w}{v}. \qquad (6.3)$$

As $\epsilon \to \infty$ we see from (6.3) that $dv/dw \to \infty$ at all points in the $v, w$ phase plane except on the curve $v - v^3/3 = w$. So for any initial point, say $P$, explain why the solution is as shown in the figure. For instance, why is the direction downward from $P$, and why does the trajectory jump from $S$ to $T$ and from $U$ to $R$? The loop $RSTUR$ is traversed repeatedly and is the limit cycle. Finally, use the figure to sketch $u(t)$ and hence the limit cycle solution $x(t)$ of the van der Pol equation (for $\epsilon \to \infty$). The result should be similar to the $\epsilon = 5$ part of Fig. 3. (HINT: Recall the expression for the phase velocity $s'$ in Exercise 8 of Section 7.2.) Finally, explain why it is convenient to work with the Rayleigh equation in order to find the relaxation oscillation of the van der Pol equation.

7. In connection with Fig. 1b we suggested that over the limit cycle the energy gain, while the representative point is in the strip $|x| < 1$, exactly balances the energy loss while the point is outside of that strip. Let us explore that idea.
(a) Multiplying (1) through by $dx$ and integrating over one cycle, show that

$$\left(\tfrac{1}{2}x'^2 + \tfrac{1}{2}x^2\right)\Big|_i^f = \epsilon\int_i^f (1 - x^2)\,x'\,dx, \qquad (7.1)$$

where $i$ and $f$ simply denote the initial point and final point, respectively. Explain, further, why (7.1) reduces to

$$\oint (1 - x^2)\,x'\,dx = 0, \qquad (7.2)$$

which expresses the balance stated above, namely, that the net work done over one cycle by the $\epsilon(1 - x^2)x'$ term in (1) is zero.
(b) For the case of small $\epsilon$ ($0 < \epsilon \ll 1$), seek the limit cycle solution in the form $x(t) \approx a\cos t$. Putting the latter into (7.2), show that one obtains $a = 2$ as the radius of the circular limit cycle, as claimed in the text. NOTE: Put differently, (7.1) is equivalent to [...] (7.3). That is, the change in the total energy $E$ over one cycle is equal to the net work done by the [...]
is zero which, once again, gives (7.2).

7.6 The Duffing Equation: Jumps and Chaos

7.6.1. Duffing equation and the jump phenomenon. Besides the van der Pol equation, also of great importance is the Duffing equation:

$$mx'' + rx' + \alpha x + \beta x^3 = F_0\cos\Omega t, \qquad (1)$$

studied by G. Duffing around 1918. Whereas primary interest in the van der Pol equation is in the unforced equation and its self-excited limit cycle oscillation, most of the interest in the Duffing equation involves the various steady-state oscillations that can arise in response to a harmonic forcing function such as $F_0\cos\Omega t$.

Physically, (1) arises in modeling the motion of a damped, forced, mechanical oscillator of mass $m$ having a nonlinear spring. That is, the spring force is not $kx$ but $\alpha x + \beta x^3$. We assume that $\alpha > 0$ but that $\beta$ can be positive (for a "hard spring") or negative (for a "soft spring").

The linear version of (1), where $\beta = 0$, is discussed in Section 3.8, and an important result there consisted of the amplitude response curves, namely, the graphs of the amplitude of the steady-state vibration versus the driving frequency $\Omega$ for various values of $F_0$. For the linear case we obtained two linearly independent homogeneous solutions and a particular solution and used linearity and superposition to form from them the general solution, which contained all possible solutions. Understand that because equation (1) is nonlinear, that approach is not applicable.

Consider first the undamped case ($r = 0$), and let $m = 1$ for simplicity, so (1) becomes

$$x'' + \alpha x + \beta x^3 = F_0\cos\Omega t. \qquad (2)$$

Further, suppose that $\beta$ is small. Since (1) is nonlinear, we expect it to have a wealth of different sorts of solutions. Of these, particular interest attaches to the so-called harmonic response

$$x(t) \approx A\cos\Omega t \qquad (3)$$

(for $\beta$ small) at the same frequency as the driving force. As shown in the exercises, pursuing an approximate solution of the form (3) yields the amplitude-frequency relation

$$\Omega^2 = \alpha + \frac{3}{4}\beta A^2 - \frac{F_0}{A}, \qquad (4)$$

which gives the response curves shown in Fig. 1.

Figure 1. Amplitude response curves; undamped.

For the linear case, where $\beta = 0$, (4) reduces to $A = F_0/(\alpha - \Omega^2)$, which result agrees with the amplitude-frequency relation found in Section 3.8. For $\beta > 0$ the curves bend to the right (shown), and for $\beta < 0$ they bend to the left (not shown). Thus, the effect of the nonlinear $\beta x^3$ term in (2) is to cause the response curves to bend to one side or the other.

Recall from Section 3.8 that for the linear system ($\beta = 0$), the amplitude $|A|$ is infinite when the system is driven at its natural frequency $\sqrt{\alpha}$. [More precisely, we saw that the solution form $x(t) = A\cos\Omega t$ simply does not work for the system $x'' + \alpha x = F_0\cos\Omega t$ if $\Omega = \sqrt{\alpha}$, but that the method of undetermined coefficients gives $x(t) = \dfrac{F_0}{2\Omega}\,t\sin\Omega t$, which growing oscillation is known as resonance.] However, because of the bending of the response curves, resonance cannot occur in the nonlinear case. That is, we see from Fig. 1 that if $\beta \neq 0$ then at each driving frequency $\Omega$ the response amplitude $|A|$ is finite.

What is the effect of including damping, of letting $r$ be positive in (1) rather than zero? In that case we need to allow for a phase shift (as in Section 3.8), and seek $x(t) \approx A\cos(\Omega t + \Phi)$ in place of (3). The result would be a modified amplitude-frequency relation, in place of (4), and a "capping off" of the response curves as shown in Fig. 2a. If $\Omega = \Omega_1$, for instance, then the response is at $P_1$ in Fig. 2b. Suppose we can vary the driving frequency $\Omega$ continuously by turning a control knob, like the volume knob on a radio. If we increase $\Omega$ slowly (remember that $\Omega$ is regarded as a constant in this analysis, so we need to increase it very slowly), then the representative point moves to the right along the response curve. But what happens when it reaches $P_2$, where the response curve has a vertical tangent?
Numerical simulation reveals that the point jumps down to $P_3$, where it can then continue moving rightward on the response curve if $\Omega$ is increased further. That is, there is a transient [...]

Figure 2. Amplitude response curves; damped.

[...] someone with an eating disorder might fast for a period of time, then jump almost instantaneously to binging, and vice versa. For a readable account of the modeling of such systems as these, see E. C. Zeeman's Catastrophe Theory (Reading, MA: Addison-Wesley, 1977).

[...] ed. (Oxford: Oxford University Press, 1987).

7.6.2. Chaos. Consider the physical system shown in Fig. 3, a box containing a slender vertical steel reed cantilevered from the "ceiling" and two magnets attached to the "floor"; $x$ is the horizontal deflection of the end of the reed. In equilibrium, the reed is stationary, with its tip at one magnet or the other. However, if we vibrate the box periodically in the horizontal direction, we can (if the magnets are not too strong) shake the reed loose from its equilibrium position and set it into motion.

This experiment was carried out by F. Moon and P. Holmes,* to study chaos. They modeled the system by the Duffing equation

$$x'' + rx' - x + x^3 = F_0\cos\Omega t, \qquad (5)$$

where $r$ is a (positive) damping coefficient and $F_0$ and $\Omega$ are the forcing function strength and frequency, respectively. The $-x + x^3$ terms approximate the force induced on the reed by the two competing magnets. It is zero at $x = 0, \pm 1$, which are therefore equilibrium points. To classify the equilibrium points, consider the unforced equation $x'' + rx' - x + x^3 = 0$ or, equivalently,

$$x' = y, \qquad (6a)$$
$$y' = -ry + x - x^3. \qquad (6b)$$

We leave it for the exercises for you to show that the origin $(x, y) = (0, 0)$ is an unstable equilibrium point, namely a saddle, and that $(\pm 1, 0)$ are stable equilibrium points (stable foci if $r < \sqrt{8}$ and stable nodes if $r > \sqrt{8}$).
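As a quick numerical cross-check of that classification (not a substitute for the exercise), the following minimal Python sketch linearizes the unforced system $x' = y$, $y' = -ry + x - x^3$ about an equilibrium $(x_e, 0)$ and reads the type off the characteristic equation $\lambda^2 + r\lambda - (1 - 3x_e^2) = 0$. The function name `classify_equilibrium` is hypothetical, and the sketch assumes $r > 0$.

```python
import math


def classify_equilibrium(x_eq, r):
    """Classify the equilibrium (x_eq, 0) of x' = y, y' = -r*y + x - x**3.

    Assumes r > 0. The Jacobian there is [[0, 1], [k, -r]] with
    k = 1 - 3*x_eq**2, so the eigenvalues satisfy lam**2 + r*lam - k = 0.
    """
    if r <= 0:
        raise ValueError("this sketch assumes positive damping r")
    k = 1.0 - 3.0 * x_eq**2
    if k > 0:
        # Product of the roots is -k < 0: real roots of opposite sign.
        return "saddle"
    disc = r * r + 4.0 * k  # discriminant of the characteristic equation
    if disc < 0:
        # Complex pair with real part -r/2 < 0.
        return "stable focus"
    if disc > 0:
        # Two distinct negative real roots (sum -r < 0, product -k > 0).
        return "stable node"
    return "borderline (repeated root)"


if __name__ == "__main__":
    print("threshold r =", math.sqrt(8.0))
    for x_eq, r in [(0.0, 0.3), (1.0, 0.3), (-1.0, 0.3), (1.0, 3.0)]:
        print(x_eq, r, classify_equilibrium(x_eq, r))
```

At $x_e = \pm 1$ the discriminant is $r^2 - 8$, which is where the threshold $r = \sqrt{8}$ in the text comes from.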
For the undriven system ($F_0 = 0$), imagine displacing the reed tip to $x = 0.5$, say, and releasing it from rest. Then it will undergo a damped motion about $x = 1$. If instead we release it from $x = -0.5$, say, it will undergo a damped motion about $x = -1$.

What will happen if we force the system ($F_0 > 0$)? We can imagine the reed undergoing a steady-state oscillation about $x = +1$ or $x = -1$, depending upon the initial conditions. To encourage physical insight, it is useful to consider the potential energy $V(x)$ associated with the magnetic force $F(x) = x - x^3$. Recalling from Section 7.4.2 that $F(x) = -V'(x)$, we have

$$V(x) = -\frac{x^2}{2} + \frac{x^4}{4}. \qquad (7)$$

Thus, in place of a reed/magnet system we can conceptualize the system more intuitively as a mass in a double well as sketched in Fig. 4. That is, its gravitational potential energy is $V(x) = mgy = mg(-x^2/2 + x^4/4)$ which, except for the scale factor $mg$, is the same as $V(x)$ for the reed/magnet system, given in (7).

*F. C. Moon and P. J. Holmes, A Magnetoelastic Strange Attractor, Journal of Sound and Vibration, Vol. 65, pp. 275-296.

If the mass sits at the bottom of a well, then if we vibrate the system horizontally we expect the mass to oscillate within that well. If we vibrate so energetically that the mass jumps from one well to the other, then the situation becomes more complicated.

Rather than carry out the physical experiment, let us simulate it numerically by using computer software to solve (5). Let us fix $r = 0.3$ and $\Omega = 1.2$, and use the initial conditions $x(0) = 1$ and $x'(0) = 0.2$, say, so that if $F_0$ is not too large we expect an oscillation near $x = 1$ (i.e., in the right-hand well).* Our plan is to use successively larger $F_0$'s and see what happens.

The results are quite striking (like those obtained by Moon and Holmes). They are presented in Fig. 5, both in the $x, y$ phase plane and as $x(t)$ versus $t$. In Fig. 5a we set $F_0 = 0.2$ and find that after a brief transient period there results a steady-state oscillation near $x = 1$, as anticipated. That oscillation is of the same period as the forcing function (namely, $2\pi/\Omega = 2\pi/1.2 \approx 5.236$) so it is called a period-1 oscillation, or harmonic oscillation. Period-1 oscillations persist up to around $F_0 = 0.27$, but for $F_0 > 0.27$ different solution types arise. For $F_0 = 0.28$ the solution is still periodic, but it takes two loops (in the $x, y$ plane) to complete one period and the period is now doubled, namely, $4\pi/\Omega$. Thus, it is called a period-2 oscillation, or a subharmonic oscillation of order 1/2. In Fig. 5b-5f we omit the transient and display only the steady-state periodic solution so as not to obscure that display. Observe, in Fig. 5b, the point where the trajectory crosses itself. That crossing does not violate the existence and uniqueness theorem given in Section 7.3.1 because equation (5) is nonautonomous and the phase plane figure shows the projection of the non-self-intersecting curve in three-dimensional $x, y, t$ space onto the $x, y$ plane.

If we increase $F_0$ further, to 0.29, the forcing is sufficiently strong so that during the transient phase the mass (reed) is driven out of the right-hand potential well and ends up in a period-4 oscillation about the left-hand well. To observe this result we need to be patient and run the calculation to a sufficiently large time, namely, beyond $t \approx 400$. This period doubling continues as $F_0$ increases from 0.29 up to around 0.30. For $F_0 > 0.30$ a period-5 oscillation results that now encompasses both stable equilibrium points (Fig. 5d). The regime $0.37 < F_0 < 0.65$ is found to be rather chaotic, with essentially random motions, as seen in Fig. 5e for the case $F_0 = 0.5$.

Reviewing these results, observe that as we increased $F_0$ the period of the motion increased until, when $F_0$ was increased above 0.37, periodicity was lost altogether and the motion became chaotic.
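The numerical experiment described above can be imitated with a few lines of code. The sketch below is one possible setup, not the computation used for Fig. 5: it integrates (5) with a fixed-step classical Runge-Kutta scheme (our choice of method and step size) and samples the state once per forcing period $2\pi/\Omega$. For a period-1 response the samples settle onto a single point; for a period-2 response they would alternate between two points, and so on. The names `rk4_step` and `strobe` are hypothetical.

```python
import math


def duffing_rhs(t, x, y, r, F0, Om):
    """Right-hand side of x' = y, y' = -r*y + x - x**3 + F0*cos(Om*t)."""
    return y, -r * y + x - x**3 + F0 * math.cos(Om * t)


def rk4_step(t, x, y, h, r, F0, Om):
    """Advance (x, y) from t to t + h by one classical Runge-Kutta step."""
    k1x, k1y = duffing_rhs(t, x, y, r, F0, Om)
    k2x, k2y = duffing_rhs(t + h / 2, x + h / 2 * k1x, y + h / 2 * k1y, r, F0, Om)
    k3x, k3y = duffing_rhs(t + h / 2, x + h / 2 * k2x, y + h / 2 * k2y, r, F0, Om)
    k4x, k4y = duffing_rhs(t + h, x + h * k3x, y + h * k3y, r, F0, Om)
    x_new = x + h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
    y_new = y + h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
    return x_new, y_new


def strobe(F0, r=0.3, Om=1.2, x0=1.0, y0=0.2, periods=200, steps=400):
    """Return the states (x, y) sampled at the end of each forcing period."""
    h = 2 * math.pi / Om / steps  # step size: a fixed fraction of one period
    x, y = x0, y0
    samples = []
    for n in range(periods):
        for k in range(steps):
            t = (n * steps + k) * h
            x, y = rk4_step(t, x, y, h, r, F0, Om)
        samples.append((x, y))
    return samples


# F0 = 0.2 with the initial conditions used in the text.
tail = strobe(0.2)[-8:]  # the last eight once-per-period samples
spread = max(abs(x - tail[-1][0]) for x, _ in tail)
print("last sampled states:", tail[-2:])
print("spread of x over the last 8 periods:", spread)
```

Rerunning `strobe` with other values of $F_0$ and counting the distinct points in the tail gives a rough, numerical indication of the period; long transients (such as the one noted for $F_0 = 0.29$) call for a larger `periods` value.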
(We can think of that motion as periodic but with infinite period.) It would be natural to expect that a further increase of $F_0$ would lead to even greater chaos (if one were to quantify degree of chaos), yet we find that if $F_0$ is increased to 0.65 we once again obtain a periodic solution, namely, the period-2 solution shown in Fig. 5f, and if $F_0$ is increased further to 0.73 then we obtain a period-1 solution (not shown).

In summary, we see that the forced Duffing equation admits a great variety of periodic solutions and chaotic ones as well, and that these different regimes correspond to different intervals on an $F_0$ axis. (We chose to hold $r$ and $\Omega$ fixed and to vary only $F_0$, but we could have varied $r$ and/or $\Omega$ as well.) It is possible to predict analytically how the solution type varies with $F_0$, $r$, and $\Omega$, but that analysis is beyond the scope of this introductory discussion.*

*These parameter values are the same as those chosen in Section 12.6 of D. W. Jordan and P. Smith (ibid). We refer you to that source for a more detailed discussion than we offer here.

[Figure 5. Panels: (a) $F_0 = 0.2$, (b) $F_0 = 0.28$, (c) $F_0 = 0.29$, (d) $F_0 = 0.37$, (e) $F_0 = 0.5$, (f) $F_0 = 0.65$.]

Having classified the response in Fig. 5e as chaotic, it behooves us to clarify what we mean by that. A reasonable working definition of chaos is behavior in a deterministic system, over time, which is aperiodic and which is sensitive to initial conditions.

By a system being deterministic we mean that the governing differential equation(s) and initial conditions imply the existence of a unique solution over subsequent time. Whether we are able to find that solution analytically, or whether we need to use computer simulation, is immaterial.
For instance, for given values of $r$, $F_0$, $\Omega$, the system consisting of equation (5), together with initial conditions $x(0)$ and $x'(0)$, is deterministic. The choice $r = 0.3$, $F_0 = 0.37$, $\Omega = 1.2$, $x(0) = 1$, and $x'(0) = 0.2$, say, implies the unique response shown in Fig. 5d. If we rerun the numerical solution or solve the problem analytically (if we could), we obtain the same solution as shown in the figure. Likewise even for the chaotic response shown in Fig. 5e.

By the response being aperiodic we simply mean that it does not approach a periodic solution or a constant.

To illustrate what is meant by sensitivity to initial conditions, let us rerun the case corresponding to Fig. 5e, but with the initial conditions changed from $x(0) = 1$ and $x'(0) = 0.2$ to $x(0) = 1$ and $x'(0) = 0.2000000001$. Observe that the results (Fig. 6) bear virtually no resemblance to those in Fig. 5e. This circumstance is of great significance because if the initial conditions are known to only six significant figures, say, then the task of predicting the response is hopeless!

Figure 6. Sensitivity to initial conditions.

Another well known example of chaos is provided by the Lorenz equations

$$x' = p(y - x), \qquad y' = (q - z)x - y, \qquad z' = xy - rz, \qquad (8)$$

where $p, q, r$ are constants. This system was studied by the mathematical meteorologist E. Lorenz in 1963 in connection with the Bénard problem, whereby one seeks the effect of heating a horizontal fluid layer from below.† That problem is of fundamental interest in meteorology because the earth, having been heated by the sun during the day, radiates heat upward into the atmosphere in the evening, thus destabilizing the atmospheric layer above it. Lorenz's contribution was in discovering the chaotic nature of the solution for certain ranges of the physical parameters $p, q, r$, thereby suggesting the impossibility of meaningful long-range weather prediction. Some discussion of (8) is left for the exercises.

*See Section 12.6 in Jordan and Smith (ibid).
†E. Lorenz, Deterministic Nonperiodic Flow, Journal of Atmospheric Sciences, Vol. 20, pp. 130-141, 1963.

Perhaps the classic problem of chaos is that of turbulence, in fluid mechanics, be it in connection with the chaotic eddies and mixing behind a truck on a highway or the turbulent breakup of a rising filament of smoke.

To appreciate the revolution in physics that has resulted from recent work on chaos, one needs to understand the euphoria that greeted the birth of Newtonian mechanics and the calculus, according to which both the past and the future are contained in the system of differential equations and initial conditions at the present instant. In the words of Ivar Ekeland,* "Past and future are seen as equivalent, since both can be read from the present. Mathematics travels back in time as easily as a wanderer walks up a frozen river." As Ekeland points out, that statement is not quite true for, as we now know and as was understood by Poincaré even a century ago, deterministic nonlinear systems can turn out to be chaotic, in which case they are useless for long-term prediction.

Closure. The common thread in this section is the Duffing equation (1). For $\alpha > 0$ we study (1) in connection with the jump phenomenon, and with $\alpha < 0$, as in (5), in connection with the reed/magnet system of Moon and Holmes, shown in Fig. 3. Numerical solution of the governing equation (5), for a sequence of increasing $F_0$ values, leads to a variety of solution types: a harmonic response, various subharmonic responses, and even chaotic responses. The approach to chaos, as we increase $F_0$, is typical in that the onset of chaos is preceded by a sequence of period doublings. We define chaos as behavior in a deterministic system (which must be nonlinear if it is to exhibit chaos) over time, which is aperiodic and so sensitive to initial conditions that accurate long-range predictions of the solution are not possible.

Computer software. To generate the responses shown in Fig. 5 and 6 we use the Maple phaseportrait command.
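For readers without Maple, the comparison behind Fig. 5e and 6 can be approximated in plain Python. The following is only a sketch under stated assumptions: it uses a fixed-step classical Runge-Kutta integrator with step $h = 0.01$ (our choice, not the method used for the figures), the function name `simulate` is hypothetical, and the numbers it prints depend on the step size and on floating-point details. What it illustrates is the two properties in the working definition of chaos: rerunning with identical data reproduces the result exactly, while changing $x'(0)$ by $10^{-10}$ eventually changes the response substantially.

```python
import math


def simulate(v0, F0=0.5, r=0.3, Om=1.2, x0=1.0, t_end=600.0, h=0.01):
    """Integrate x'' + r x' - x + x**3 = F0*cos(Om*t), x(0) = x0, x'(0) = v0.

    Returns the list of x values at t = 0, h, 2h, ..., t_end.
    """
    def f(t, x, y):
        return y, -r * y + x - x**3 + F0 * math.cos(Om * t)

    x, y = x0, v0
    xs = [x]
    for i in range(int(round(t_end / h))):
        t = i * h
        k1x, k1y = f(t, x, y)
        k2x, k2y = f(t + h / 2, x + h / 2 * k1x, y + h / 2 * k1y)
        k3x, k3y = f(t + h / 2, x + h / 2 * k2x, y + h / 2 * k2y)
        k4x, k4y = f(t + h, x + h * k3x, y + h * k3y)
        x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        xs.append(x)
    return xs


run_a = simulate(0.2)            # initial velocity as in Fig. 5e
run_b = simulate(0.2000000001)   # initial velocity perturbed by 1e-10
separation = [abs(p - q) for p, q in zip(run_a, run_b)]

print("largest |x_a - x_b| over the run:", max(separation))
print("identical rerun reproduces run_a:", simulate(0.2) == run_a)
```

Plotting `run_a` and `run_b` against time (with any plotting tool) shows the two curves tracking each other at first and then parting company, which is the behavior described in connection with Fig. 6.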
However, it is worth mentioning how we obtain the graphs in Fig. 1 and 2, because (4) does not give $A$ explicitly but only implicitly as a function of $\Omega$. For instance, suppose we wish to plot the graphs of $y$ versus $x$, over $0 < x < 3$ and $0 < y < 4$, for the functions $y(x)$ given implicitly by the equations $y - y^3 = x$ and $4y - y^3 = x$. First, enter

with(plots):

and return, to access the subsequent plotting command. Then use the implicitplot command. Enter

implicitplot({y - y^3 = x, 4*y - y^3 = x}, x = 0..3, y = 0..4, numpoints = 1000);

and return. Here, numpoints indicates the number of points to be used. To plot the single function $y - y^3 = x$, use y - y^3 = x in place of {y - y^3 = x, 4*y - y^3 = x}.

*Mathematics and the Unexpected (Chicago: Chicago University Press, 1988).

EXERCISES 7.6

1. (Derivation of the Duffing amplitude-frequency relation) We state in the text that if one seeks an approximate harmonic solution $x(t) \approx A\cos\Omega t$ to

$$x'' + \alpha x + \beta x^3 = F_0\cos\Omega t, \qquad (1.1)$$

then one obtains the amplitude-frequency relation

$$\Omega^2 = \alpha + \frac{3}{4}\beta A^2 - \frac{F_0}{A} \qquad (1.2)$$

discovered by Duffing. A modern derivation of (1.2) would probably use a so-called singular perturbation method of strained variables, but here we will pursue a simpler iterative approach which is essentially that of Duffing; namely, we replace (1.1) by the iterative scheme

$$x_{n+1}'' = -\alpha x_n - \beta x_n^3 + F_0\cos\Omega t, \qquad (1.3)$$

choose the initial iterate as $x_0(t) = A\cos\Omega t$, and then use (1.3) to find the successive iterates $x_1(t), x_2(t), \ldots$. It is surely not obvious whether that procedure will work, so it makes sense to try it out first for the simple linear case where $\beta = 0$, for which we know the exact solution.
(a) In that case ($\beta = 0$), show that if we seek a harmonic solution $x(t) = A\cos\Omega t$ of the Duffing equation (1.1) with $\beta = 0$, we obtain $A = F_0/(\alpha - \Omega^2)$, and hence the exact solution

$$x(t) = \frac{F_0}{\alpha - \Omega^2}\cos\Omega t. \qquad (1.4)$$

(b) Next, use (1.3) to generate $x_1(t), x_2(t), \ldots$, for $\beta = 0$, and show that

$$x_1(t) = \frac{\alpha A - F_0}{\Omega^2}\cos\Omega t,$$
$$x_n(t) = \left\{\left(\frac{\alpha}{\Omega^2}\right)^n A - \left[1 + \frac{\alpha}{\Omega^2} + \cdots + \left(\frac{\alpha}{\Omega^2}\right)^{n-1}\right]\frac{F_0}{\Omega^2}\right\}\cos\Omega t. \qquad (1.5)$$

That is, put $x_0(t) = A\cos\Omega t$ into the right side of (1.3) and integrate twice to obtain $x_1$. Then put that $x_1$ into the right side of (1.3) and integrate twice to obtain $x_2$, and so on. By the time you reach $x_3$, the general result shown in (1.5) should be apparent.
(c) Recalling that the geometric series $1 + x + x^2 + x^3 + \cdots$ converges to $1/(1 - x)$ if $|x| < 1$ and diverges otherwise, show that the $x_n(t)$ sequence given by (1.5) does indeed converge to the exact solution (1.4) as $n \to \infty$, provided that $|\alpha/\Omega^2| < 1$.
(d) In fact, show that if we equate the coefficients of $\cos\Omega t$ in $x_0(t) = A\cos\Omega t$ and $x_1(t) = \dfrac{\alpha A - F_0}{\Omega^2}\cos\Omega t$, we happen to obtain $A = F_0/(\alpha - \Omega^2)$, which agrees with the exact solution (1.4)!
(e) In view of the striking success in part (d), we are encouraged to expect good results even for the nonlinear case where $\beta \neq 0$. Thus, put $x_0(t) = A\cos\Omega t$ into the right side of (1.3). Then, as in (d), integrate twice to obtain $x_1(t)$, equate the coefficients of $\cos\Omega t$ in $x_1(t)$ and $x_0(t)$, and show that the result is Duffing's relation (1.2). HINT: The identity $\cos^3\theta = (3\cos\theta + \cos 3\theta)/4$ should be helpful.

2. (Computer problem regarding the Duffing jump phenomenon) For the undamped case the amplitude-frequency relation is given by (4). For the damped case it is given by

$$\left[(\alpha - \Omega^2)A + \frac{3}{4}\beta A^3\right]^2 + (r\Omega A)^2 = F_0^2. \qquad (2.1)$$

Throughout parts (a)-(g) let $F_0 = 2$, $\alpha = 1$, $\beta = 0.4$, and $r = 0.3$, for definiteness.
(a) Use (2.1) to generate a plot of $A$ versus $\Omega$ as we did in Fig. 2b. NOTE: Actually, there is no need to distinguish between $A$ and $|A|$ since, unlike (4), (2.1) contains only even powers of $A$.
(b) For $\Omega = 1$ solve (2.1) for $A$. (HINT: Using Maple, for instance, use the fsolve command.) Next, use computer software such as the Maple phaseportrait command to solve

$$x'' + rx' + \alpha x + \beta x^3 = F_0\cos\Omega t, \qquad (2.2)$$

and plot $x(t)$ versus $t$ over a sufficiently long time interval to obtain the steady-state response. Compare the amplitude of the resulting steady-state response with the value of $A$ obtained from (2.1).
(c) Same as (b) but for an $\Omega$ point specified by your instructor, to the left of the first point of vertical tangency ($\Omega = 1.71$).
(d) Same as (b), but for an $\Omega$ point specified by your instructor, to the right of the second point of vertical tangency ($\Omega \approx 2.05$).
(e) Now consider an $\Omega$ between the two points of vertical tangency, say, $\Omega = 1.8$. Solve (2.1) for the three $A$ values. Next, use computer software to solve (2.2) over a sufficiently long time interval to obtain the steady-state response. Depending upon the initial conditions that you impose, you should obtain the smallest or largest of the three $A$ values, but never the middle one. Keeping $x'(0) = 0$, determine the approximate value of $x(0)$ below which you obtain the small-amplitude response and above which you obtain the large-amplitude response.
(f) Same as (e) but for an $\Omega$ point specified by your instructor.
(g) Continuing to use the $r, \alpha, \beta, F_0$ values given above, now let $\Omega$ be slowly varying according to $\Omega = 1.9 - 0.0005t$, and solve (2.2) over $0 < t < 800$ with the initial conditions $x(0) = x'(0) = 0$. Plot the resulting $x(t)$ versus $t$ and discuss your results in terms of the ideas discussed in this section.

3. (a) Refer to (4) and Fig. 1. What is the asymptotic form of the graph of $|A|$ versus $\Omega$ as $|A| \to \infty$?
(b) In Fig. 1 we show several amplitude response curves for $\beta = 0$ and $\beta > 0$ and for several values of $F_0$. Obtain the analogous curves for the case where $\beta < 0$, either by a careful hand sketch or by computer plotting.

4. Determine the location and type of any singular points of (6).

5. (a)-(f) Obtain $x, y$ and $x, t$ plots for the cases depicted in Fig. 5. Your results should be the same as those in Fig. 5.

6. We stated that "period doubling continues as $F_0$ increases from 0.29 up to around 0.30." Find, by numerical experimentation, an $F_0$ that gives a period-8 oscillation (and, if possible, period-16) and obtain computer plots analogous to those in Fig. 5.

7. We stated that "if $F_0$ is increased further to 0.73, then we obtain a period-1 solution." Obtain computer plots for that case, analogous to those in Fig. 5.

8. We found an extreme sensitivity to initial conditions for the chaotic regime. Specifically, the plot in Fig. 6 bears little resemblance to the corresponding one in Fig. 5e, even though the initial conditions differed by only $10^{-10}$. Show that that sensitivity is not found for the non-chaotic responses, namely, for the periodic responses. Specifically, rerun the cases reported in Fig. 5d and 5f, but with $x'(0) = 0.2$ changed to 0.20001, say. Do your results appear to reproduce those in Fig. 5?

9. The equation

$$x'' + 0.38x' + \sin x = F_0\cos t \qquad (9.1)$$

is similar to the one occurring in the Moon/Holmes experiment.
(a) Describe a physical problem that would have a governing equation of motion of that form. (We have assigned numerical values to all of the physical parameters except to $F_0$, which we leave for the purpose of numerical experimentation.)
(b) We leave this problem a bit more open ended than the foregoing ones, and simply ask you to carry out an analytical and experimental study of (9.1). For instance, you might investigate the singular points of the homogeneous version of (9.1), and also run computer solutions for a range of $F_0$ values, somewhat as done for equation (5).

Chapter 7 Review

In Sections 7.2-7.5 we study the autonomous system

$$x' = P(x, y), \quad [\ldots] \qquad (1)$$

[...]

$$X' = aX + bY, \qquad Y' = cX + dY,$$

[...] centers, foci, nodes, and saddles. The Hartman-Grobman theorem assures that the linearized system faithfully captures the nature of the local flow (except possibly for the borderline cases of proper nodes and centers, as explained in Section 7.4.1).
In Section 7.4 we study applications and introduce the idea of a bifurcation, whereby the behavior of the system changes qualitatively as a system parameter passes through a critical value. We illustrate the concept with an example of a saddle-node bifurcation from molecular biology.

In Section 7.5 we study the van der Pol equation

$$x'' - \epsilon(1 - x^2)x' + x = 0, \qquad (\epsilon > 0)$$

which introduces us to the concept of limit cycles and relaxation oscillations.

Finally, we study the forced Duffing equation

$$mx'' + rx' + \alpha x + \beta x^3 = F_0\cos\Omega t$$

in two contexts. First, we consider it as modeling a mechanical oscillator, with nonlinear spring force $\alpha x + \beta x^3$, where $\alpha > 0$. Of the various possible solutions that can be obtained from different initial conditions, we study only the harmonic response, that is, the steady-state periodic response at the same frequency as the driving frequency $\Omega$. The key feature revealed is the bending of the amplitude response curves and the resulting jump phenomenon, whereby the response amplitude jumps as $\Omega$ is increased or decreased slowly through a critical value. We also consider it as modeling the "double well" reed/magnet system of Moon and Holmes. By numerical simulation, we find that if $F_0$ is not too large, then the oscillation is confined to one of the two wells. As $F_0$ is increased, there results a sequence of period doublings, giving so-called subharmonic responses, until $F_0$ becomes large enough to drive the response out of that well. Beyond a critical $F_0$ value, we then obtain a chaotic response involving both wells.

Chapter 8
Systems of Linear Algebraic Equations; Gauss Elimination

8.1 Introduction

There are many applications in science and engineering where application of the relevant physical law(s) immediately produces a set of linear algebraic equations.
For instance, the application of Kirchhoff's laws to a DC electrical circuit containing any number of resistors, batteries, and current loops immediately produces such a set of equations on the unknown currents. In other cases, the problem is stated in some other form such as one or more ordinary or partial differential equations, but the solution method eventually leads us to a system of linear algebraic equations. For instance, to find a particular solution to the differential equation

$$y''' - y'' = 3x^2 + 5\sin x \qquad (1)$$

by the method of undetermined coefficients (Section 3.7.2), we seek it in the form

$$y_p(x) = Ax^4 + Bx^3 + Cx^2 + D\sin x + E\cos x. \qquad (2)$$

Putting (2) into (1) and equating coefficients of like terms on both sides of the equation gives five linear algebraic equations on the unknown coefficients $A, B, \ldots, E$. Or, solving the so-called Laplace partial differential equation

$$\nabla^2 u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \qquad (3)$$

on the rectangle $0 < x < 1$, $0 < y < 1$ by the method of finite differences (which is studied in Section 20.5), using a mesh size $\Delta x = \Delta y = 0.05$, gives $19^2 = 361$ linear algebraic equations on the unknown values of $u$ at the 361 nodal points of the mesh. Our point here is not to get ahead of ourselves by plunging into partial differential equations, but to say that the solution of practical problems of interest in science and engineering often leads us to systems of linear algebraic equations.

Such systems often involve a great many unknowns. Thus, the question of existence (Does a solution exist?), which often sounds "too theoretical" to the practicing engineer, takes on great practical importance because a considerable computational effort is at stake.

The subject of linear algebra and matrices encompasses a great deal more than the theory and solution of systems of linear algebraic equations, but the latter is indeed a central topic and is foundational to others.
Thus, we begin this sequence of five chapters (8-12) on linear algebra with an introduction to the theory of systems of linear algebraic equations, and their solution by the method of Gauss elimination. Results obtained here are used, and built upon, in Chapters 9-12.

Chapters 9 and 10 take us from vectors in 3-space to vectors in $n$-space and generalized vector space, to matrices and determinants. Linear systems of algebraic equations are considered again, in the second half of Chapter 10, in terms of rank, inverse matrix, LU decomposition, Cramer's rule, and linear transformation. Chapter 11 introduces the eigenvalue problem, diagonalization, and quadratic forms; areas of application include systems of ordinary differential equations, vibration theory, chemical kinetics, and buckling. Chapter 12 is optional and brief and provides an extension of results in Chapters 9-11 to complex spaces.

8.2 Preliminary Ideas and Geometrical Approach

The problem of finding solutions of equations of the form

$$f(x) = 0 \qquad (1)$$

occupies a place of both practical and historical importance. Equation (1) is said to be algebraic, or polynomial, if $f(x)$ is expressible in the form $a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$, where $a_n \neq 0$ for definiteness [i.e., if $f(x)$ is a polynomial of finite degree $n$], and it is said to be transcendental otherwise.

EXAMPLE 1. The equations $6x - 5 = 0$ and $3x^4 - x^3 + 2x + 1 = 0$ are algebraic, whereas $x^3 + 2\sin x = 0$ and $e^x - 3 = 0$ are transcendental since $\sin x$ and $e^x$ cannot be expressed as polynomials of finite degree.

Besides the algebraic versus transcendental distinction, we classify (1) as linear if $f(x)$ is a first-degree polynomial,

$$a_1 x + a_0 = 0, \qquad (2)$$

and nonlinear otherwise. Thus, the first equation in Example 1 is linear, and the other three are nonlinear.
While (1) is one equation in one unknown, we often encounter problems involving more than one equation and/or more than one unknown, that is, a system of equations consisting of $m$ equations in $n$ unknowns, where $m \ge 1$ and $n \ge 1$,

\[
\begin{aligned}
f_1(x_1, \ldots, x_n) &= 0,\\
f_2(x_1, \ldots, x_n) &= 0,\\
&\;\;\vdots\\
f_m(x_1, \ldots, x_n) &= 0,
\end{aligned}
\tag{3}
\]

such as

\[
\begin{aligned}
x_1 - \sin(x_1 + 7x_2) &= 0,\\
x_1^2 + x_2 - 5x_1 + 6 &= 0.
\end{aligned}
\tag{4}
\]

In (4) it happens that $m = n$ (namely, $m = n = 2$) so that there are as many equations as unknowns. In general, however, $m$ may be less than, equal to, or greater than $n$, so we allow for $m \neq n$ in this discussion even though $m = n$ is the most important case.

In this chapter we consider only the case where (3) is linear, of the form

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= c_1, &\quad&\text{(eq.1)}\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= c_2, &&\text{(eq.2)}\\
&\;\;\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= c_m, &&\text{(eq.}m\text{)}
\end{aligned}
\tag{5}
\]

and we restrict $m$ and $n$ to be finite, and the $a_{ij}$'s and $c_i$'s to be real numbers. If all the $c_i$'s are zero then (5) is homogeneous; if they are not all zero then (5) is nonhomogeneous.

The subscript notation adopted in (5) is not essential but is helpful in holding the nomenclature to a minimum, in rendering inherent patterns more visible, and in permitting a natural transition to matrix notation. The first subscript in $a_{ij}$ indicates the equation, and the second indicates the $x_j$ variable that it multiplies. For instance, $a_{21}$ appears in the second equation and multiplies the $x_1$ variable. To avoid ambiguity we should write $a_{2,1}$ rather than $a_{21}$ so that one does not mistakenly read the subscripts as twenty-one, but we will omit commas except when such ambiguity is not easily resolved from the context.

We say that a sequence of numbers $s_1, s_2, \ldots, s_n$ is a solution of (5) if and only if each of the $m$ equations is satisfied numerically when we substitute $s_1$ for $x_1$, $s_2$ for $x_2$, and so on. If there exist one or more solutions to (5), we say that the system is consistent; if there is precisely one solution, that solution is unique; and if there is more than one, the solution is nonunique.
If, on the other hand, there are no solutions to (5), the system is said to be inconsistent. The collection of all solutions to (5) is called its solution set so, by "solving (5)" we mean finding its solution set.

Let us begin with the simple case, where $m = n = 1$:

\[ a_{11}x_1 = c_1. \tag{6} \]

In the generic case, $a_{11} \neq 0$ and (6) admits the unique solution $x_1 = c_1/a_{11}$, but if $a_{11} = 0$ there are two possibilities: if $c_1 \neq 0$ then there are no values of $x_1$ such that $0x_1 = c_1$ and (6) is inconsistent, but if $c_1 = 0$ then (6) becomes $0x_1 = 0$, and $x_1 = \alpha$ is a solution for any value of $\alpha$; that is, the solution is nonunique.

Far from being too simple to be of interest, the case where $m = n = 1$ establishes a pattern that will hold in general, for any values of $m$ and $n$. Specifically, the system (5) will admit a unique solution, no solution, or an infinity of solutions. For instance, it will never admit 4 solutions, 12 solutions, or 137 solutions.

Next, consider the case where $m = n = 2$:

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 &= c_1, &\quad&\text{(eq.1)} &\quad&\text{(7a)}\\
a_{21}x_1 + a_{22}x_2 &= c_2. &&\text{(eq.2)} &&\text{(7b)}
\end{aligned}
\]

If $a_{11}$ and $a_{12}$ are not both zero, then (eq.1) defines a straight line, say $L1$, in a Cartesian $x_1, x_2$ plane; that is, the solution set of (eq.1) is the set of all points on that line. Similarly, if $a_{21}$ and $a_{22}$ are not both zero then the solution set of (eq.2) is the set of all points on a straight line $L2$. There exist exactly three possibilities, and these are illustrated in Fig. 1. First, the lines may intersect at a point, say $P$, in which case (7) admits the unique solution given by the coordinate pair $x_1, x_2$ of the point $P$ (Fig. 1a). That is, any solution pair $x_1, x_2$ of (7) needs to be in the solution set of (eq.1) and in the solution set of (eq.2), hence at an intersection of $L1$ and $L2$. This is the generic case, and it occurs (Exercise 2) as long as

\[ a_{11}a_{22} - a_{12}a_{21} \neq 0; \tag{8} \]

(8) is the analog of the $a_{11} \neq 0$ condition for the $m = n = 1$ case discussed above. Second, the lines may be parallel and nonintersecting (Fig. 1b), in which case there is no solution. Then (7) is inconsistent; the solution set is empty. Third, the lines may coincide (Fig. 1c), in which case the coordinate pair of each point on the line is a solution. Then (7) is consistent and there are an infinite number of solutions.

Figure 1. Existence and uniqueness for the system (7).

EXAMPLE 2. The systems

\[
\begin{aligned} 2x_1 - x_2 &= 5,\\ x_1 + 3x_2 &= -1, \end{aligned}
\qquad
\begin{aligned} x_1 + 3x_2 &= 1,\\ x_1 + 3x_2 &= 0, \end{aligned}
\qquad
\begin{aligned} x_1 + 3x_2 &= 1,\\ 2x_1 + 6x_2 &= 2, \end{aligned}
\]

illustrate these three cases, respectively.

Below (7) we said "If $a_{11}$ and $a_{12}$ are not both zero...." What if they are both zero? Then if $c_1 \neq 0$ there is no solution of (7a), and hence there is no solution to the system (7). But if $c_1 = 0$, then (7a) reduces to $0 = 0$ and can be discarded, leaving just (7b). If $a_{21}$ and $a_{22}$ are not both zero, then (7b) gives a line of solutions, but if they are both zero then everything hinges on $c_2$. If $c_2 \neq 0$ there is no solution and (7) is inconsistent, but if $c_2 = 0$, so both (7a) and (7b) are simply $0 = 0$, then both $x_1$ and $x_2$ are arbitrary, and every point in the plane is a solution.

Next, consider the case where $m = n = 3$:

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 &= c_1, &\quad&\text{(eq.1)} &\quad&\text{(9a)}\\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 &= c_2, &&\text{(eq.2)} &&\text{(9b)}\\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 &= c_3. &&\text{(eq.3)} &&\text{(9c)}
\end{aligned}
\]

Continuing the geometric approach exemplified by Fig. 1, observe that if $a_{11}, a_{12}, a_{13}$ are not all zero then (eq.1) defines a plane, say $P1$, in Cartesian $x_1, x_2, x_3$ space, and similarly for (eq.2) and (eq.3). In the generic case, $P1$ and $P2$ intersect along a line $L$, and $L$ pierces $P3$ at a point $P$. Then the $x_1, x_2, x_3$ coordinates of $P$ give the unique solution of (9).

In the nongeneric case we can have no solution or an infinity of solutions in the following ways. There will be no solution if $L$ is parallel to $P3$ and hence fails to pierce it, or if any two of the planes are parallel and not coincident.
There will be an infinity of solutions if $L$ lies in $P3$ (i.e., a line of solutions), if two planes are coincident and intersect the third (again, a line of solutions), or if all three planes are coincident (this time an entire plane of solutions). The case where all of the $a_{ij}$ coefficients are zero in one or more of equations (9) is left for the exercises.

An abstract extension of such geometrical reasoning can be continued even if $m = n \ge 4$. For instance, one speaks of $a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + a_{14}x_4 = c_1$ as defining a hyperplane in an abstract four-dimensional space. In fact, perhaps we should mention that even the familiar $x_1, x_2$ plane and $x_1, x_2, x_3$ space discussed here could be abstract as well. For instance, if $x_1$ and $x_2$ are unknown currents in two loops of an electrical circuit, then what physical meaning is there to an $x_1, x_2$ plane? None, but we can introduce it, create it, to assist our reasoning.

Closure. Most of this section is devoted to a geometrical discussion of the system (5) of linear algebraic equations. A great advantage of geometrical reasoning is that it brings our visual system into play. It is estimated that at least a third of the neurons in our brain are devoted to vision, hence our visual sense is extremely sophisticated. No wonder we say "Now I see what you mean; now I get the picture." The more geometry, pictures, visual images to aid our thinking, the better!

We have not yet aimed at theorems, and have been content to lay the groundwork for the ideas of existence and uniqueness of solutions. In considering the cases where $m = n = 1$, $m = n = 2$, and $m = n = 3$, we have not meant to imply that we need to have $m = n$; all possibilities are considered in the next section. To proceed further, we need to consider the process of finding solutions, and that we do, in Section 8.3, by the method of Gauss elimination.

EXERCISES 8.2

1. True or false? If false, give a counterexample.
(a) An algebraic equation is necessarily linear.
(b) An algebraic equation is necessarily nonlinear.
(c) A transcendental equation is necessarily linear.
(d) A transcendental equation is necessarily nonlinear.
(e) A linear equation is necessarily algebraic.
(f) A nonlinear equation is necessarily algebraic.
(g) A linear equation is necessarily transcendental.
(h) A nonlinear equation is necessarily transcendental.

2. Derive the condition (8) as the necessary and sufficient condition for (7) to admit a unique solution.

3. (a) Discuss all possibilities of the existence and uniqueness of solutions of (9) from a geometrical point of view, in the event that $a_{11} = a_{12} = a_{13} = 0$, but $a_{21}, a_{22}, a_{23}$ and $a_{31}, a_{32}, a_{33}$ are not all zero.
(b) Same as (a), but with $a_{21} = a_{22} = a_{23} = 0$ as well.
(c) Same as (a), but with $a_{21} = a_{22} = a_{23} = a_{31} = a_{32} = a_{33} = 0$ as well.

8.3. Solution by Gauss Elimination

8.3.1. Motivation. In this section we continue to consider the system of $m$ linear algebraic equations

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= c_1,\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= c_2,\\
&\;\;\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= c_m,
\end{aligned}
\tag{1}
\]

in the $n$ unknowns $x_1, \ldots, x_n$, and develop the solution technique known as Gauss elimination. To motivate the ideas, we begin with an example.

EXAMPLE 1. Determine the solution set of the system

\[
\begin{aligned}
x_1 + x_2 - x_3 &= 1,\\
3x_1 + x_2 + x_3 &= 9,\\
x_1 - x_2 + 4x_3 &= 8.
\end{aligned}
\tag{2}
\]

Keep the first equation intact, and add $-3$ times the first equation to the second (as a replacement for the second equation), and add $-1$ times the first equation to the third (as a replacement for the third equation). These steps yield the new "indented" system

\[
\begin{aligned}
x_1 + x_2 - x_3 &= 1,\\
-2x_2 + 4x_3 &= 6,\\
-2x_2 + 5x_3 &= 7.
\end{aligned}
\tag{3}
\]

Next, keep the first two of these intact, and add $-1$ times the second equation to the third, and obtain

\[
\begin{aligned}
x_1 + x_2 - x_3 &= 1,\\
-2x_2 + 4x_3 &= 6,\\
x_3 &= 1.
\end{aligned}
\tag{4}
\]

Finally, multiplying the second of these by $-1/2$ to normalize the leading coefficient (to unity), gives

\[
\begin{aligned}
x_1 + x_2 - x_3 &= 1, &\quad&\text{(eq.1)}\\
x_2 - 2x_3 &= -3, &&\text{(eq.2)}\\
x_3 &= 1. &&\text{(eq.3)}
\end{aligned}
\tag{5}
\]

It is helpful to think of the original system (2) as a tangle of string that we wish to unravel. The first step is to find a loose end and that is, in effect, what the foregoing process of successive indentations has done for us. Specifically, (eq.3) in (5) is the "loose end," and with that in hand we may unravel (5) just as we would unravel a tangle: putting $x_3 = 1$ into (eq.2) gives $x_2 = -1$, and then putting $x_3 = 1$ and $x_2 = -1$ into (eq.1) gives $x_1 = 3$. Thus, we obtain the unique solution

\[ x_3 = 1, \quad x_2 = -1, \quad x_1 = 3. \tag{6} \]

COMMENT 1. From a mathematical point of view, the system (2) was a "tangle" because the equations were coupled; that is, each equation contained more than one unknown. Actually, the final system (5) is coupled too, since (eq.1) contains all three unknowns and (eq.2) contains two of them. However, the coupling in (5) is not as debilitating because (5) is in what we call triangular form. Thus, we were able to solve (eq.3) for $x_3$, put that value into (eq.2) and solve for $x_2$, and then put these values into (eq.1) and solve for $x_1$, which steps are known as back substitution.

COMMENT 2. However, the process begs this question: Is it obvious that the systems (2)-(5) all have the same solution sets so that when we solve (5) we are actually solving (2)? That is, is it not conceivable that in applying the arithmetic steps that carried us from (2) to (5) we might, inadvertently, have altered the solution set? For example, $x - 1 = 4$ has the unique solution $x = 5$, but if we innocently square both sides, the resulting equation $(x - 1)^2 = 16$ admits the two solutions $x = 5$ and $x = -3$.

The question just raised applies to linear systems in general.
It is answered in Theorem 8.3.1 that follows, but first we define two terms: "equivalent systems" and "elementary equation operations."

Two linear systems in $n$ unknowns, $x_1$ through $x_n$, are said to be equivalent if their solution sets are identical.

The following operations on linear systems are known as elementary equation operations:

1. Addition of a multiple of one equation to another. Symbolically: (eq.j) → (eq.j) + α (eq.k)
2. Multiplication of an equation by a nonzero constant. Symbolically: (eq.j) → α (eq.j)
3. Interchange of two equations. Symbolically: (eq.j) ↔ (eq.k)

Then we can state the following result.

THEOREM 8.3.1 Equivalent Systems
If one linear system is obtained from another by a finite number of elementary equation operations, then the two systems are equivalent.

Outline of Proof: The truth of this claim for elementary equation operations of types 2 and 3 should be evident, so we confine our remarks to operations of type 1. It suffices to look at the effect of one such operation. Thus, suppose that a given linear system A is altered by replacing its jth equation by its jth plus α times its kth, its other equations being kept intact. Let us call the new system A'. Surely, every solution of A will also be a solution of A' since we have merely added equal quantities to equal quantities. That is, if A' results from A by the application of an elementary equation operation of type 1, then every solution of A is also a solution of A'. Further, we can convert A' back to A by an elementary equation operation of type 1, namely, by replacing the jth equation of A' by the jth equation of A' plus −α times the kth equation of A'. Consequently, it follows from the result stated two sentences back that every solution of A' is also a solution of A. Then A and A' are equivalent, as claimed.
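Theorem 8.3.1 can be checked numerically on Example 1. The following is a minimal sketch, not part of the original text: the helper names `add_multiple`, `scale`, and `satisfies` are hypothetical, and the use of exact rational arithmetic from Python's `fractions` module is an assumption made here so that roundoff plays no role. It replays the operations that carried (2) to (5) and confirms that the solution (6) satisfies every intermediate system.

```python
from fractions import Fraction as F

def add_multiple(system, j, k, alpha):
    """Type 1 operation: (eq.j) -> (eq.j) + alpha (eq.k).
    Each equation is stored as a row [a_1, ..., a_n, c]; rows are 0-indexed."""
    new = [row[:] for row in system]
    new[j] = [a + alpha * b for a, b in zip(system[j], system[k])]
    return new

def scale(system, j, alpha):
    """Type 2 operation: (eq.j) -> alpha (eq.j), with alpha nonzero."""
    if alpha == 0:
        raise ValueError("alpha must be nonzero")
    new = [row[:] for row in system]
    new[j] = [alpha * a for a in system[j]]
    return new

def satisfies(system, x):
    """True if x satisfies every equation of the system exactly."""
    return all(sum(a * xi for a, xi in zip(row[:-1], x)) == row[-1]
               for row in system)

# System (2) of Example 1.
system_2 = [[F(1), F(1), F(-1), F(1)],
            [F(3), F(1), F(1), F(9)],
            [F(1), F(-1), F(4), F(8)]]

# (2) -> (3): add -3 (eq.1) to eq.2, and -1 (eq.1) to eq.3.
system_3 = add_multiple(add_multiple(system_2, 1, 0, F(-3)), 2, 0, F(-1))
# (3) -> (4): add -1 (eq.2) to eq.3.
system_4 = add_multiple(system_3, 2, 1, F(-1))
# (4) -> (5): multiply eq.2 by -1/2.
system_5 = scale(system_4, 1, F(-1, 2))

solution_6 = [F(3), F(-1), F(1)]  # (x1, x2, x3) from (6)
stages = [system_2, system_3, system_4, system_5]
all_ok = all(satisfies(s, solution_6) for s in stages)
print(all_ok)
```

Because each operation returns a new system rather than modifying the old one, every stage stays available for inspection, which mirrors the way the text compares the solution sets of (2) through (5).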
In Example 1, we saw that each step is an elementary equation operation: Three elementary equation operations of type 1 took us from (2) to (4), and one of type 2 took us from (4) to (5); finally, the back substitution amounted to several operations of type 1. Thus, according to Theorem 8.3.1, equivalence was maintained throughout so we can be sure that (6) is the solution set of the original system (2) (as can be verified by direct substitution).

The system in Example 1 admitted a unique solution. To see how the method of successive elimination works out when there is no solution, or a nonunique solution, let us work two more examples.

EXAMPLE 2. Inconsistent System. Consider the system

\[
\begin{aligned}
2x_1 + 3x_2 - 2x_3 &= 4,\\
x_1 - 2x_2 + x_3 &= 3,\\
7x_1 - x_3 &= 2.
\end{aligned}
\tag{7}
\]

Keep the first equation intact, add $-\tfrac{1}{2}$ times the first equation to the second (eq.2 → eq.2 − ½ eq.1), and add $-\tfrac{7}{2}$ times the first to the third (eq.3 → eq.3 − 7/2 eq.1):

\[
\begin{aligned}
2x_1 + 3x_2 - 2x_3 &= 4,\\
-\tfrac{7}{2}x_2 + 2x_3 &= 1,\\
-\tfrac{21}{2}x_2 + 6x_3 &= -12.
\end{aligned}
\tag{8}
\]

Keep the first two equations intact, and add $-3$ times the second equation to the third (eq.3 → eq.3 − 3 eq.2):

\[
\begin{aligned}
2x_1 + 3x_2 - 2x_3 &= 4,\\
-\tfrac{7}{2}x_2 + 2x_3 &= 1,\\
0 &= -15.
\end{aligned}
\tag{9}
\]

Any solution $x_1, x_2, x_3$ of (9) must satisfy each of the three equations, but there are no values of $x_1, x_2, x_3$ that can satisfy $0 = -15$. Thus, (9) is inconsistent (has no solution), and therefore (7) is as well.

COMMENT. The source of the inconsistency is the fact that whereas the left-hand side of the third equation is 2 times the left-hand side of the first equation plus 3 times the left-hand side of the second, the right-hand sides do not bear that relationship: $2(4) + 3(3) = 17 \neq 2$. [While that built-in contradiction is not obvious from (7), it eventually comes to light in the third equation in (9).]
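The elimination that carried (7) to (9) can be sketched in code. This is an illustrative sketch only: the function names `forward_eliminate` and `is_inconsistent` are hypothetical, exact fractions are an assumption made to avoid roundoff, and the routine normalizes nothing (it only indents the system, as in the text). Inconsistency shows up as a row whose coefficients are all zero while its right-hand member is not.

```python
from fractions import Fraction

def forward_eliminate(aug):
    """Indent an augmented matrix [A | c] from the top down, using
    interchanges and additions of multiples of the pivot row."""
    rows = [[Fraction(v) for v in row] for row in aug]
    m, n = len(rows), len(rows[0]) - 1
    pivot_row = 0
    for col in range(n):
        # Look for a nonzero entry in this column at or below pivot_row.
        candidates = [r for r in range(pivot_row, m) if rows[r][col] != 0]
        if not candidates:
            continue  # nothing to eliminate in this column
        r = candidates[0]
        rows[pivot_row], rows[r] = rows[r], rows[pivot_row]
        for i in range(pivot_row + 1, m):
            factor = rows[i][col] / rows[pivot_row][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[pivot_row])]
        pivot_row += 1
        if pivot_row == m:
            break
    return rows

def is_inconsistent(echelon):
    """True if some row reads 0 = (nonzero number)."""
    return any(all(a == 0 for a in row[:-1]) and row[-1] != 0
               for row in echelon)

# Augmented matrix of system (7).
system_7 = [[2, 3, -2, 4],
            [1, -2, 1, 3],
            [7, 0, -1, 2]]

echelon = forward_eliminate(system_7)
print(echelon[-1])               # last row of the indented system
print(is_inconsistent(echelon))
```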
If we modify the system (7) by changing the final 2 in (7) to 17, then the final $-12$ in (8) becomes a 3, and the final $-15$ in (9) becomes a zero:

\[
\begin{aligned}
2x_1 + 3x_2 - 2x_3 &= 4,\\
-\tfrac{7}{2}x_2 + 2x_3 &= 1,\\
0 &= 0
\end{aligned}
\tag{10}
\]

or, multiplying the first by $\tfrac{1}{2}$ and the second by $-\tfrac{2}{7}$,

\[
\begin{aligned}
x_1 + \tfrac{3}{2}x_2 - x_3 &= 2, &\quad&\text{(11a)}\\
x_2 - \tfrac{4}{7}x_3 &= -\tfrac{2}{7}, &&\text{(11b)}
\end{aligned}
\]

where we have discarded the identity $0 = 0$. Thus, by changing the $c_i$'s so as to be "compatible," the system now admits an infinity of solutions rather than none. Specifically, we can let $x_3$ (or $x_2$, it doesn't matter which) in (11b) be any value, say $\alpha$, where $\alpha$ is arbitrary. Then (11b) gives $x_2 = -\tfrac{2}{7} + \tfrac{4}{7}\alpha$, and putting these into (11a) gives $x_1 = \tfrac{17}{7} + \tfrac{1}{7}\alpha$. Thus, we have the infinity of solutions

\[ x_3 = \alpha, \quad x_2 = -\frac{2}{7} + \frac{4}{7}\alpha, \quad x_1 = \frac{17}{7} + \frac{1}{7}\alpha \tag{12} \]

for any $\alpha$. Evidently, two of the three planes intersect, giving a line that lies in the third plane, and equations (12) are parametric equations of that line!

EXAMPLE 3. Nonunique Solution. Consider the system of four equations in six unknowns ($m = 4$, $n = 6$)

\[
\begin{aligned}
2x_2 + x_3 + 4x_4 + 3x_5 + x_6 &= 2,\\
x_1 - x_2 + x_3 + 2x_6 &= 0,\\
x_1 + x_2 + 2x_3 + 4x_4 + x_5 + 2x_6 &= 3,\\
x_1 - 3x_2 - 4x_4 - 2x_5 + x_6 &= 0.
\end{aligned}
\tag{13}
\]

Wanting the top equation to begin with $x_1$, and subsequent equations to indent at the left, let us first move the top equation to the bottom (eq.1 ↔ eq.4):

\[
\begin{aligned}
x_1 - 3x_2 - 4x_4 - 2x_5 + x_6 &= 0,\\
x_1 - x_2 + x_3 + 2x_6 &= 0,\\
x_1 + x_2 + 2x_3 + 4x_4 + x_5 + 2x_6 &= 3,\\
2x_2 + x_3 + 4x_4 + 3x_5 + x_6 &= 2.
\end{aligned}
\tag{14}
\]

Add $-1$ times the first equation to the second (eq.2 → eq.2 − 1 eq.1) and third (eq.3 → eq.3 − 1 eq.1) equations:

\[
\begin{aligned}
x_1 - 3x_2 - 4x_4 - 2x_5 + x_6 &= 0,\\
2x_2 + x_3 + 4x_4 + 2x_5 + x_6 &= 0,\\
4x_2 + 2x_3 + 8x_4 + 3x_5 + x_6 &= 3,\\
2x_2 + x_3 + 4x_4 + 3x_5 + x_6 &= 2.
\end{aligned}
\tag{15}
\]

Add $-2$ times the second to the third (eq.3 → eq.3 − 2 eq.2) and $-1$ times the second to the fourth (eq.4 → eq.4 − 1 eq.2):

\[
\begin{aligned}
x_1 - 3x_2 - 4x_4 - 2x_5 + x_6 &= 0,\\
2x_2 + x_3 + 4x_4 + 2x_5 + x_6 &= 0,\\
-x_5 - x_6 &= 3,\\
x_5 &= 2.
\end{aligned}
\tag{16}
\]

Add the third to the fourth (eq.4 → eq.4 + eq.3):

\[
\begin{aligned}
x_1 - 3x_2 - 4x_4 - 2x_5 + x_6 &= 0,\\
2x_2 + x_3 + 4x_4 + 2x_5 + x_6 &= 0,\\
-x_5 - x_6 &= 3,\\
-x_6 &= 5.
\end{aligned}
\tag{17}
\]
Finally, multiply the second, third, and fourth by $\tfrac{1}{2}$, $-1$, and $-1$, respectively, to normalize the leading coefficients (eq.2 → ½ eq.2, eq.3 → −1 eq.3, eq.4 → −1 eq.4):

\[
\begin{aligned}
x_1 - 3x_2 - 4x_4 - 2x_5 + x_6 &= 0,\\
x_2 + \tfrac{1}{2}x_3 + 2x_4 + x_5 + \tfrac{1}{2}x_6 &= 0,\\
x_5 + x_6 &= -3,\\
x_6 &= -5.
\end{aligned}
\tag{18}
\]

The last two equations give $x_6 = -5$ and $x_5 = 2$, and these values can be substituted back into the second equation. In that equation we can let $x_4$ be arbitrary, say $\alpha_1$, and we can also let $x_3$ be arbitrary, say $\alpha_2$. Then that equation gives $x_2$ and, again by back substitution, the first equation gives $x_1$. The result is the infinity of solutions

\[
x_6 = -5, \quad x_5 = 2, \quad x_4 = \alpha_1, \quad x_3 = \alpha_2, \quad
x_2 = \frac{1}{2} - 2\alpha_1 - \frac{1}{2}\alpha_2, \quad
x_1 = \frac{21}{2} - 2\alpha_1 - \frac{3}{2}\alpha_2,
\tag{19}
\]

where $\alpha_1$ and $\alpha_2$ are arbitrary.

If a solution set contains $p$ independent arbitrary parameters $(\alpha_1, \ldots, \alpha_p)$, we call it (in this text) a p-parameter family of solutions. Thus, (12) and (19) are one- and two-parameter families of solutions, respectively. Each choice of values for $\alpha_1, \ldots, \alpha_p$ yields a particular solution. In (19), for instance, the choice $\alpha_1 = 1$ and $\alpha_2 = 0$ yields the particular solution $x_1 = \tfrac{17}{2}$, $x_2 = -\tfrac{3}{2}$, $x_3 = 0$, $x_4 = 1$, $x_5 = 2$, and $x_6 = -5$.

8.3.2. Gauss elimination. The method of Gauss elimination,* illustrated in Examples 1-3, can be applied to any linear system (1), whether or not the system is consistent, and whether or not the solution is unique. Though hard to tell from the foregoing hand calculations, the method is efficient and is commonly available in computer systems.

Observe that the end result of the Gauss elimination process enables us to determine, merely from the pattern of the final equations, whether or not a solution exists and is unique. For instance, we can see from the pattern of (5) that there is a unique solution, from the bottom equation in (9) that there is no solution, and from the extra double indentation in (18) that there is a two-parameter family of solutions.

As representative of the case where $m < n$, let $m = 3$ and $n = 5$.
There are four possible final patterns, and these are shown schematically in Fig. 1. For instance, the third equation in Fig. 1a could be $x_3 - 6x_4 + 2x_5 = 0$ or $x_3 + 2x_4 + 0x_5 = 4$, and the third equation in Fig. 1b could be $0 = 6$ or $0 = 0$. It may seem foolish to include the case shown in Fig. 1d because there are no $x_j$'s (all of the $a_{ij}$ coefficients being zero), but it is possible so we have included it. From these patterns we draw these conclusions: (a) there exists a two-parameter family of solutions; (b) there is no solution (the system is inconsistent) if the right-hand member of the third equation is nonzero, and a three-parameter family of solutions if the latter is zero; (c) there is no solution if either of the right-hand members of the second and third equations is nonzero, and a four-parameter family of solutions if each of them is zero; (d) there is no solution if any of the right-hand members is nonzero, and a five-parameter family of solutions if each of them is zero.

Figure 1. The final pattern, cases (a) through (d); m = 3, n = 5.

It may appear that Fig. 1 does not cover all possible cases. For instance, what about the case shown in Fig. 2? That case can be converted to the case shown in Fig. 1a simply by renaming the unknowns: let $x_3$ become $x_2$ and let $x_5$ become $x_3$. Specifically, let $x_1 \to x_1$, $x_3 \to x_2$, $x_5 \to x_3$, $x_4 \to x_4$, and $x_2 \to x_5$.

Figure 2. Is this case covered?

The case where $m \ge n$ can be studied in a similar manner, and we can draw the following general conclusions.

THEOREM 8.3.2 Existence/Uniqueness for Linear Systems
If $m < n$, the system (1) can be consistent or inconsistent. If it is consistent it cannot have a unique solution; it will have a p-parameter family of solutions, where $n - m \le p \le n$. If $m \ge n$, (1) can be consistent or inconsistent. If it is consistent it can have a unique solution or a p-parameter family of solutions, where $1 \le p \le n$.

The next theorem follows immediately from Theorem 8.3.2, but we state it separately for emphasis.

THEOREM 8.3.3 Existence/Uniqueness for Linear Systems
Every system (1) necessarily admits no solution, a unique solution, or an infinity of solutions.

Observe that a system (1) is inconsistent only if, in its Gauss-eliminated form, one or more of the equations is of the form zero equal to a nonzero number. But that can never happen if every $c_j$ in (1) is zero, that is, if (1) is homogeneous.

THEOREM 8.3.4 Existence/Uniqueness for Homogeneous Systems
Every homogeneous linear system of $m$ equations in $n$ unknowns is consistent. Either it admits the unique trivial solution or else it admits an infinity of nontrivial solutions in addition to the trivial solution. If $m < n$, then there is an infinity of nontrivial solutions in addition to the trivial solution.

In summary, not only did the method of Gauss elimination provide us with an efficient and systematic solution procedure, it also led us to important results regarding the existence and uniqueness of solutions.

*The method is attributed to Karl Friedrich Gauss (1777-1855), who is generally regarded as the foremost mathematician of the nineteenth century and often referred to as the "prince of mathematicians."

8.3.3. Matrix notation. In applying Gauss elimination, we quickly discover that writing the variables $x_1, \ldots, x_n$ over and over is inefficient, and even tends to upstage the more central role of the $a_{ij}$'s and $c_j$'s. It is therefore preferable to omit the $x_j$'s altogether and to work directly with the rectangular array

\[
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & c_1\\
a_{21} & a_{22} & \cdots & a_{2n} & c_2\\
\vdots & \vdots & & \vdots & \vdots\\
a_{m1} & a_{m2} & \cdots & a_{mn} & c_m
\end{pmatrix},
\tag{20}
\]

known as the augmented matrix of the system (1), that is, the coefficient matrix

\[
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & & \vdots\\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}
\tag{21}
\]

augmented by the column of $c_j$'s.
By matrix we simply mean a rectangular array of numbers, called elements; it is customary to enclose the elements between parentheses to emphasize that the entire matrix is regarded as a single entity. A horizontal line of elements is called a row, and a vertical line is called a column. Counting rows from the top, and columns from the left,

\[
\begin{pmatrix} a_{21} & a_{22} & \cdots & a_{2n} & c_2 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} c_1\\ \vdots\\ c_m \end{pmatrix},
\]

say, are the second row and $(n+1)$th column, respectively, of the augmented matrix (20).

In terms of the abbreviated matrix notation, the calculation in Example 1 would look like this. Original system:

\[
\begin{pmatrix}
1 & 1 & -1 & 1\\
3 & 1 & 1 & 9\\
1 & -1 & 4 & 8
\end{pmatrix}.
\]

Add $-3$ times first row to second row, and add $-1$ times first row to third row:

\[
\begin{pmatrix}
1 & 1 & -1 & 1\\
0 & -2 & 4 & 6\\
0 & -2 & 5 & 7
\end{pmatrix}.
\]

Add $-1$ times second row to third row, and multiply second row by $-\tfrac{1}{2}$:

\[
\begin{pmatrix}
1 & 1 & -1 & 1\\
0 & 1 & -2 & -3\\
0 & 0 & 1 & 1
\end{pmatrix}.
\tag{22}
\]

Thus, corresponding to the so-called elementary equation operations on members of a system of linear equations there are elementary row operations on the augmented matrix, as follows:

1. Addition of a multiple of one row to another. Symbolically: (jth row) → (jth row) + α (kth row)
2. Multiplication of a row by a nonzero constant. Symbolically: (jth row) → α (jth row)
3. Interchange of two rows. Symbolically: (jth row) ↔ (kth row)

And we say that two matrices are row equivalent if one can be obtained from the other by finitely many elementary row operations.

8.3.4. Gauss-Jordan reduction. With the Gauss elimination completed, the remaining steps consist of back substitution. In fact, those steps are elementary row operations as well. The difference is that whereas in the Gauss elimination we proceed from the top down, in the back substitution we proceed from the bottom up.

EXAMPLE 4. To illustrate, let us return to Example 1 and pick up at the end of the Gauss elimination, with (5), and complete the back substitution steps using elementary row operations. In matrix format, we begin with

\[
\begin{pmatrix}
1 & 1 & -1 & 1\\
0 & 1 & -2 & -3\\
0 & 0 & 1 & 1
\end{pmatrix}.
\tag{23}
\]

Keeping the bottom row intact, add 2 times that row to the second, and add 1 times that row to the first:

\[
\begin{pmatrix}
1 & 1 & 0 & 2\\
0 & 1 & 0 & -1\\
0 & 0 & 1 & 1
\end{pmatrix}.
\tag{24}
\]

Now keeping the bottom two rows intact, add $-1$ times the second row to the first:

\[
\begin{pmatrix}
1 & 0 & 0 & 3\\
0 & 1 & 0 & -1\\
0 & 0 & 1 & 1
\end{pmatrix},
\tag{25}
\]

which is the solution: $x_1 = 3$, $x_2 = -1$, $x_3 = 1$, as obtained in Example 1.

The entire process, of Gauss elimination plus back substitution, is known as Gauss-Jordan reduction, after Gauss and Wilhelm Jordan (1842-1899). The final result is an augmented matrix in reduced row-echelon form. That is:

1. In each row not made up entirely of zeros, the first nonzero element is a 1, a so-called leading 1.
2. In any two consecutive rows not made up entirely of zeros, the leading 1 in the lower row is to the right of the leading 1 in the upper row.
3. If a column contains a leading 1, every other element in that column is a zero.
4. All rows made up entirely of zeros are grouped together at the bottom of the matrix.

For instance, (25) is in reduced row-echelon form, as is the final matrix in the next example.

EXAMPLE 5. Let us return to Example 3 and finish the Gauss-Jordan reduction, beginning with (18):

\[
\begin{pmatrix}
1 & -3 & 0 & -4 & -2 & 1 & 0\\
0 & 1 & \tfrac{1}{2} & 2 & 1 & \tfrac{1}{2} & 0\\
0 & 0 & 0 & 0 & 1 & 1 & -3\\
0 & 0 & 0 & 0 & 0 & 1 & -5
\end{pmatrix}
\to
\begin{pmatrix}
1 & -3 & 0 & -4 & -2 & 0 & 5\\
0 & 1 & \tfrac{1}{2} & 2 & 1 & 0 & \tfrac{5}{2}\\
0 & 0 & 0 & 0 & 1 & 0 & 2\\
0 & 0 & 0 & 0 & 0 & 1 & -5
\end{pmatrix}
\]

\[
\to
\begin{pmatrix}
1 & -3 & 0 & -4 & 0 & 0 & 9\\
0 & 1 & \tfrac{1}{2} & 2 & 0 & 0 & \tfrac{1}{2}\\
0 & 0 & 0 & 0 & 1 & 0 & 2\\
0 & 0 & 0 & 0 & 0 & 1 & -5
\end{pmatrix}
\to
\begin{pmatrix}
\mathbf{1} & 0 & \tfrac{3}{2} & 2 & 0 & 0 & \tfrac{21}{2}\\
0 & \mathbf{1} & \tfrac{1}{2} & 2 & 0 & 0 & \tfrac{1}{2}\\
0 & 0 & 0 & 0 & \mathbf{1} & 0 & 2\\
0 & 0 & 0 & 0 & 0 & \mathbf{1} & -5
\end{pmatrix}.
\]

The last augmented matrix is in reduced row-echelon form. The four leading 1's are displayed in bold type, and we see that, as a result of the back substitution steps, only 0's are to be found above each leading 1. The final augmented matrix once again gives the solution (19).

8.3.5. Pivoting.
Recall that the first step in the Gauss elimination of the system

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= c_1,\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= c_2,\\
&\;\;\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= c_m,
\end{aligned}
\tag{26}
\]

is to subtract $a_{21}/a_{11}$ times the first equation from the second, $a_{31}/a_{11}$ times the first equation from the third, and so on, while keeping the first equation intact. The first equation is called the pivot equation (or, the first row is the pivot row if one is using the matrix format), and $a_{11}$ is called the pivot. That step produces an indented system of the form

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= c_1,\\
a'_{22}x_2 + \cdots + a'_{2n}x_n &= c'_2,\\
&\;\;\vdots\\
a'_{m2}x_2 + \cdots + a'_{mn}x_n &= c'_m.
\end{aligned}
\tag{27}
\]

Next, we keep the first two equations intact and use the second equation as the new pivot equation to indent the third through mth equations, and so on.

Naturally, we need each pivot to be nonzero. For instance, we need $a_{11} \neq 0$ for $a_{21}/a_{11}, a_{31}/a_{11}, \ldots$ to be defined. If a pivot is zero, interchange that equation with any one below it, such as the next equation or last equation (as we did in Example 3), until a nonzero pivot is available. Such interchange of equations is called partial pivoting. If a pivot is zero we have no choice but to use partial pivoting, but in practice even a nonzero pivot should be rejected if it is "very small," since the smaller it is the more susceptible is the calculation to the adverse effect of machine roundoff error (see Exercise 13). To be as safe as possible, one can choose the pivot equation as the one with the largest leading coefficient (relative to the other coefficients in the equation).

Closure. Beginning with a system of coupled linear algebraic equations, one can use a sequence of elementary operations to minimize the coupling between the equations while leaving the solution set intact.
Besides putting forward the important method of Gauss elimination, which is used heavily in the following chapters, we used that method to establish several important theoretical results regarding the existence and uniqueness of solutions. The Gauss elimination and Gauss-Jordan reduction discussions lead naturally to a convenient, and equivalent, formulation in matrix notation. We will return to the concept of matrices in Chapter 10, and develop it in detail.

Computer software. Chapters 8-12 cover the domain known as linear algebra. A great many calculations in linear algebra can be carried out using computer algebra systems. In Maple, for instance, a great many commands ("functions") are contained within the linalg package. A listing of these commands can be obtained by entering ?linalg. That list includes the linsolve command, which can be used to solve a system of $m$ linear algebraic equations in $n$ unknowns. To access linsolve (or any other command within the linalg package), first enter with(linalg). Then, linsolve(A,b) solves (1) for $x_1, \ldots, x_n$, where A is the coefficient matrix and b is the column of $c_j$'s. For instance, the system

\[
\begin{aligned}
x_1 - x_2 + 2x_3 - 3x_4 &= 4,\\
x_1 + 2x_2 - x_3 + 3x_4 &= 1
\end{aligned}
\tag{28}
\]

admits the two-parameter family of solutions

\[ x_4 = \alpha_1, \quad x_3 = \alpha_2, \quad x_2 = -1 - 2\alpha_1 + \alpha_2, \quad x_1 = 3 + \alpha_1 - \alpha_2, \tag{29} \]

where $\alpha_1, \alpha_2$ are arbitrary. To solve (28) using Maple, enter

with(linalg):

then return and enter

linsolve(array([[1,-1,2,-3],[1,2,-1,3]]), array([4,1]));

and return. The output is

[-_t1 + _t2 + 3, _t1 - 2 _t2 - 1, _t1, _t2]

where the entries are $x_1, \ldots, x_4$ and where _t1 and _t2 are arbitrary constants. With _t1 = $\alpha_2$ and _t2 = $\alpha_1$, this result is the same as (29). If you prefer, you could use the sequence

with(linalg):
A := array([[1,-1,2,-3],[1,2,-1,3]]):
b := array([4,1]):
linsolve(A, b);

instead. If the system is inconsistent, then either the output will be NULL, or there will be no output.

EXERCISES 8.3

1.
Derive the solution set for each of the following systems using Gauss elimination and augmented matrix format. Documenteach step (e.g., 2nd row—+2nd row + 5 times Ist row), and classify the result (e.g., unique solution, the system is inconsistent, 3-parameter family of solutions, etc.). (k) xy + 23 ed Ly, + 22 — £3 — 224 =5 Ly —- £2 + 243 4+ wy =O Qa, + to+ t3-—- Bea 4 (1) ay +24 =2 5a + y= 2 (b) 22+ y=0 Ty (m) (n) 22+ (e) 27, — wo — v3 - 5aey = 6 () 2 — ~ 3 , 7 Ug - Vy + 4xry = : (0) 224 + ey + a ‘ sa + by + Tz = 8 9x + 10y + llz = 12 xe2= 6a, + llag= Qn, + w+ (q) tr 204 29 229 =n + a3 ∙−∶ On, 409 + Uy + 2a - 23 — 2a4 = = + 23 xy 0 = pe oe XR t 29 + = ®o4+ 23 201 + 2x9 — X3 t= 1 tq +223 + rg =1 ty bay ee Uy + ~—2 23- 10 1 Ly + 2xg = —4 6 0x2 =—1 ≥ Uy Ba,+ Or»Qea= 4 _ 9 v1 + z= r3 + 224 =1 : (h) vy + #2 — 243 = 3 : Uy 2 —-343 = 1 ~ dag ; aay ~ 3x”q 322 v3 = —1 y+ 6 2 yYbur xz—-2y — 4z = —-10 (g) w+ 24y+ 32= + (i) 2a, - =2 for c = 10, and again, for ¢ =11 Wm—y-2=8 — Ug +52=c — &ty e—-ytz=1 Ly ~—2 «+ 2y4+32=5 3a + 4dy (c) w+2y=4 a @ 4+273+ 24 =4 2z + 3y +42 =8 3a — 2y = 0 d) @3 + 24 = 0 — 4tg (a) 22 —3y=1 - . aus: DA 324 ; + 225 + 225 + a, se | a = 0 —0 = 0 =0 ine G Jord . 4 duction instea 5 1 3. (a)—(q)Same as Exercise | but using computer software such as the Maple linsolve command. 408 4, Can 20 linear algebraic equations in [4 unknowns have a unique solution? Be inconsistent? Have a two-parameter family of solutions? Have a |4-parameter family of solutions? Have a |6-parameterfamily of solutions? Explain. (b) “Given thesystem 5. Let since both left-hand sides equal zero, they must equal each other. Hence we have the equation QL, lI 0, + b3x3 = 0 + GQhQq+ 4323 bya, + bot = 22, — to + Vs, Ly + @o — 4g represent any two planes through the origin in a Cartesian 1,2, ay + ay — 4g = 0, 224 ~~Lo + vg = 0, 3 space. 
For the case where the planes intersect along which equation is equivalent to the original system.” a line, show whether or not that line necessarily passes through the origin. 9, Make up an example of an inconsistent linear algebraic system of equations, with 6. If possible, adapt the methods of this section to solve the following nonlinear systems. If it is not possible, say so. (jm=2,n=4 (a) at t+205 ~ 243 ==29 zy t+ ©)+ 24 = 19 Ba? + das (b) = 67 (b) m = I, m= 4 10. (Physical example of nonexistence and nonuniqueness; DC circuit) Kirchoff’s current and voltage laws were given in Section 2.3.1. [f we apply those laws to the DC’ circuit shown, by 13 5 z+ 3y= sing +2y= (c) sing + siny =1 sing — siny + 4cosz = 1.2 sing + siny + 2cosz = 1.6 geneous (do you agree that they are homogeneous?) systems admit nontrivial solutions? Find the nontrivial solutions corresponding to each such A. u+ (c) Av (b) 22 —- y= 2y = Ay «2~2y= Az dx ~ 8y = Ay 4 (fe) c+y+ e=Ar yroeo= Ay 22 = Nz (a) “Given the system Ty — 280 = 0, 224 _ das — Q, add —2 times the first equation to the second and add ~ 4 times the second equation to the first. By these Gauss elimination steps we obtain the equivalent system 0 = 0 and 0 = O, and hence the two-parameter family of solutions zy = ay (arbitrary), £2 = Q (arbitrary).” ip~ig-tg e =O, iy —ig —ig = 0, Roig — Ryizg = 0, {| EF, Ryiy + Reig = sa Ag 8. Evaluate these excerpts from examination papers. BR we obtain the equations Xx z= Ay c+yte= Az (f) 22+ y+ 2= Ax u+2Qy+ 2= Ay G+ yr2e= Az 8 4 & benmennennnnnsnnettfemenennsnisnnsninenmatrfyrererenntrerannnanrrd —v + Qy = Ay (d) e 4 LW EC) 7. For what values of the parameter A do the following homo- y= a vw a < w/2.and0<2<2 where~7/2 <a <1/2,-1r/2<y (a) 2a + b > Ryty gis = E, (jjunction a) (junctionc) Gj (loop abcda) ((loopabcea) (Ioop (10.1) adcea) where 71, 72,23 are the three currents (measured as positive in the direction assumed in the figure). 
Ry, Re, Ry are the resistances of the three resistors, and £&is the voltage rise (from e to a) induced by the battery or other source. [Evidently, we did not need to apply the current law to both junctions since the resulting equations are identical. Similarly, it may be that not all of the loop equations are needed. But rather than try to decide which of equations (10.1) to keep and which to discard, let us merely keep them all.] We now state the problem: Obtain the solution set of equations (10.1) by Gauss elimination, If there is no solution, or if there is a nonunique solution, explain that result in physical terms. Take (QR, = Ro = R= R (RO) (b)Ry =Ry= R, Ro=2R (RHO) (c)Rh, 2R, Ry =Rg=2R (RHO) 409 8.3. Solution by Gauss Elimination The point P will move to a point (a, y), and we assumethatthe cables are stiff enoughso that« andy are very small: |x| < 1 (e) Rg = R, Ry = Rg = 0 (g) Ry — Reo = Res — 0 11. (Physical example of nonexistence and nonuniqueness; statically indeterminate structures) (a) Consider the static and |y| < 1. Let the cables obey Hooke’s law: Ty, = ky1, T. = kgd9, and T; = kgdg, where 4; is the increase in length of the jth cable due to the tension T;. Since P moves to (x, y). it follows that equilibrium of the system shown, consisting of two weightless oy (11.1) cables connected at P, at which point a vertical load F’ is applied. Requiring an equilibrium of vertical force components, and horizontal force components too, derive two linear algebraic equations on the unknown tensions T; and 7). Are there any combinations of angles 6, and @, (where 0 < 0; < = and 0 < @) < 5) such that there is either no solution Explain each step in (11.1), and show, similarly, that 11.2 (11.2) —Y, 1 V3 or a nonunique solution? Explain. (b) This time let there be three cables at angles of 45°, 60°, and 30° as shown. Again, requiring an equilibrium of vertical and v3 1 gh —--ogt dy Thus, Ty = ki dy ~ k ad y), ae k To= kodq%~Z («+vy), (11.4) +y). 
$T_3 = k_3\delta_3 \approx -\dfrac{k_3}{2}\left(\sqrt{3}\,x + y\right).$ (11.4)

Putting (11.4) into the two equilibrium equations obtained in (b) then gives two equations in the unknown displacements $x, y$. Show that that system can be solved uniquely for $x$ and $y$, and thus complete the solution for $T_1, T_2, T_3$.

horizontal forces at $P$, derive two linear algebraic equations on the unknown tensions $T_1, T_2, T_3$. Show that the equations are consistent so there is a nonunique solution. NOTE: We say that such a structure is statically indeterminate because the forces in it cannot be determined from the laws of statics alone. What information needs to be added if we are to complete the evaluation of $T_1, T_2, T_3$? What is needed is information about the relative stiffness of the cables. We pursue this to a conclusion in (c), below.

(c) [Completion of part (b)] Before the load $F$ is applied, locate an $x, y$ Cartesian coordinate system at $P$. Let $P$ be 1 foot below the "ceiling" so the coordinates of $A, B, C$ are $(-1, 1)$, $(1/\sqrt{3}, 1)$, and $(\sqrt{3}, 1)$, respectively. Now apply the load $F$.

12. (Roundoff error difficulty due to small pivots) To illustrate how small pivots can accentuate the effects of roundoff error, consider the system

$$0.005x_1 + 1.47x_2 = 1.49, \qquad 0.975x_1 + 2.32x_2 = 6.22 \tag{12.1}$$

with exact solution $x_1 = 4$ and $x_2 = 1$. Suppose that our computer carries three significant figures and then rounds off. Using the first equation as our pivot equation, Gauss elimination gives

$$\left[\begin{array}{cc|c} 0.005 & 1.47 & 1.49 \\ 0.975 & 2.32 & 6.22 \end{array}\right] \rightarrow \left[\begin{array}{cc|c} 0.005 & 1.47 & 1.49 \\ 0 & -285 & -284 \end{array}\right]$$

so $x_2 = 284/285 = 0.996$ and $x_1 = [1.49 - (1.47)(0.996)]/0.005 = (1.49 - 1.46)/0.005 = 6$. Show that if we use partial pivoting and then use the first equation of the system

$$0.975x_1 + 2.32x_2 = 6.22, \qquad 0.005x_1 + 1.47x_2 = 1.49 \tag{12.2}$$

as our pivot equation, we obtain the result $x_1 = 4.00$ and $x_2 = 1.00$ (which happens to be exactly correct).

13. (Ill-conditioned systems) Practically speaking, our numerical calculations are normally carried out on computers, be they hand-held calculators or large digital computers. Such machines carry only a finite number of significant figures and thus introduce roundoff error into most calculations. One might expect (or hope) that such slight deviations will lead to answers that are only slightly in error. For example, the solution of

$$x + y = 2, \qquad x - 1.014y = 0 \tag{13.1}$$

is $x \approx 1.007$, $y \approx 0.993$, whereas the solution of the rounded-off version

$$x + y = 2, \qquad x - 1.01y = 0$$

is very much the same, namely $x \approx 1.005$, $y \approx 0.995$. In sharp contrast, the solutions of

$$x + y = 2, \qquad x + 1.014y = 0 \tag{13.2}$$

and the rounded-off version

$$x + y = 2, \qquad x + 1.01y = 0,$$

are $x \approx 144.9$, $y \approx -142.9$ and $x = 202$, $y = -200$, respectively; (13.2) is an example of a so-called ill-conditioned system (ill-conditioned in the sense that small changes in the coefficients lead to large changes in the solution). Here, we ask the following: Explain why (13.2) is much more sensitive to roundoff than (13.1) by exploring the two cases graphically, that is, in the $x, y$ plane.

Chapter 8 Review

[...] solution set intact. [...] Chapter 10 [...] more fully. The most important results of this chapter are contained in Theorems 8.3.1-3. Finally, we also stress the value of geometrical and visual reasoning, and suggest that you keep that idea in mind as we proceed.

Chapter 9 Vector Space

9.1 Introduction

Normally, one meets vectors for the first time within some physical context: in studying mechanics, electric and magnetic fields, and so on. There, the vectors exist within two- or three-dimensional space and correspond to force, velocity, position, magnetic field, and so on. They have both magnitude and direction; they can be scaled by multiplicative factors and added according to the parallelogram law; dot and cross product operations are defined between vectors; the angle between two vectors is defined; vectors can be expanded as linear combinations of base vectors; and so on.

Alternatively, there exists a highly formalized axiomatic approach to vectors known as linear vector space or abstract vector space.
Although this generalized vector concept is essentially an outgrowth of the more primitive system of "arrow vectors" in 2-space and 3-space, described above, it extends well beyond that system in scope and applicability.

For pedagogical reasons, we break the transition from 2-space and 3-space to abstract vector space into two steps: in Sections 9.4 and 9.5 we introduce a generalization to "n-space," and in Section 9.6 we complete the extension to general vector space, including function spaces where the vectors are functions! However, we do not return to function spaces until Chapter 17, in connection with Fourier series and the Sturm-Liouville theory; in Chapters 9-12 our chief interest is in n-space.

9.2 Vectors; Geometrical Representation

Some quantities that we encounter may be completely defined by a single real number, or magnitude; the mass or kinetic energy of a given particle, and the temperature or salinity at some point in the ocean, are examples. Others are not defined solely by a magnitude but rather by a magnitude and a direction, examples being force, velocity, momentum, and acceleration. Such quantities are called vectors.

The defining features of a vector being magnitude and direction suggests the geometric representation of a vector as a directed line segment, or "arrow," where the length of the arrow is scaled according to the magnitude of the vector. For example, if the wind is blowing at 8 meters/sec from the northeast, that defines a wind-velocity vector $\mathbf{v}$, where we adopt boldface type to signify that the quantity is a vector; alternative notations include the use of an overhead arrow as in $\vec{v}$. Choosing, according to convenience, a scale of 5 meters/sec per centimeter, say, the geometric representation of $\mathbf{v}$ is as shown in Fig. 1. Denoting the magnitude, or norm, of any vector $\mathbf{v}$ as $\|\mathbf{v}\|$, we have $\|\mathbf{v}\| = 8$ for the $\mathbf{v}$ vector in Fig. 1. Observe that the location of a vector is not specified, only its magnitude and direction.
Thus, the two unlabeled arrows in Fig. 1 are equally valid alternative representations of $\mathbf{v}$. That is not to say that the physical effect of the vector will be entirely independent of its position. For example, it should be apparent that the motion of the body $B$ induced by a force $\mathbf{F}$ (Fig. 2) will certainly depend on the point of application of $\mathbf{F}$,* as will the stress field induced in $B$. Nevertheless, the two vectors in Fig. 2 are still regarded as equal, as are the three in Fig. 1.

Figure 1. Geometric representation of $\mathbf{v}$.
Figure 2. Position of a vector.

Like numbers, vectors do not become useful until we introduce rules for their manipulation, that is, a vector algebra. Having elected the arrow representation of vectors, the vector algebra that we now introduce will, likewise, be geometric.

First, we say that two vectors are equal if and only if their lengths are identical and if their directions are identical as well.

Next, we define a process of addition between any two vectors $\mathbf{u}$ and $\mathbf{v}$. The first step is to move $\mathbf{v}$ (if necessary), parallel to itself, so that its tail coincides with the head of $\mathbf{u}$. Then the sum, or resultant, $\mathbf{u} + \mathbf{v}$ is defined as the arrow from the tail of $\mathbf{u}$ to the head of $\mathbf{v}$, as in Fig. 3a. Reversing the order, $\mathbf{v} + \mathbf{u}$ is as shown in Fig. 3b. Equivalently, we may place $\mathbf{u}$ and $\mathbf{v}$ tail to tail, as in Fig. 3c. Comparing Fig. 3c with Fig. 3a and b, we see that the diagonal of the parallelogram (Fig. 3c) is both $\mathbf{u} + \mathbf{v}$ and $\mathbf{v} + \mathbf{u}$. Thus,

$$\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}, \tag{1}$$

so addition is commutative. One may show (Exercise 3) that it is associative as well,

$$(\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w}). \tag{2}$$

Next, we define any vector of zero length to be a zero vector, denoted as $\mathbf{0}$. Its length being zero, its direction is immaterial; any direction may be assigned if desired. From the definition of addition above, it follows that

$$\mathbf{u} + \mathbf{0} = \mathbf{0} + \mathbf{u} = \mathbf{u} \tag{3}$$

for each vector $\mathbf{u}$.
Corresponding to $\mathbf{u}$ we define a negative inverse "$-\mathbf{u}$" such that $\mathbf{u} + (-\mathbf{u}) = \mathbf{0}$; if $\mathbf{u}$ is any nonzero vector, then $-\mathbf{u}$ is determined uniquely, as shown in Fig. 4a; that is, it is of the same length as $\mathbf{u}$ but is directed in the opposite direction (again, $\mathbf{u}$ and $-\mathbf{u}$ have the same length; the length of $-\mathbf{u}$ is not negative). For the zero vector we have $-\mathbf{0} = \mathbf{0}$. We denote $\mathbf{u} + (-\mathbf{v})$ as $\mathbf{u} - \mathbf{v}$ ("$\mathbf{u}$ minus $\mathbf{v}$") but emphasize that it is really the addition of $\mathbf{u}$ and $-\mathbf{v}$, as in Fig. 4b.

Figure 4. $-\mathbf{u}$ and vector subtraction.

Finally, we introduce another operation, called scalar multiplication, between any vector $\mathbf{u}$ and any scalar (i.e., a real number) $\alpha$: If $\alpha \ne 0$ and $\mathbf{u} \ne \mathbf{0}$, then $\alpha\mathbf{u}$ is a vector whose length is $|\alpha|$ times the length of $\mathbf{u}$ and whose direction is the same as that of $\mathbf{u}$ if $\alpha > 0$, and the opposite if $\alpha < 0$; if $\alpha = 0$ and/or $\mathbf{u} = \mathbf{0}$, then $\alpha\mathbf{u} = \mathbf{0}$. This definition is illustrated in Fig. 5. It follows from this definition that scalar multiplication has the following algebraic properties:

$$\alpha(\beta\mathbf{u}) = (\alpha\beta)\mathbf{u}, \tag{4a}$$
$$(\alpha + \beta)\mathbf{u} = \alpha\mathbf{u} + \beta\mathbf{u}, \tag{4b}$$
$$\alpha(\mathbf{u} + \mathbf{v}) = \alpha\mathbf{u} + \alpha\mathbf{v}, \tag{4c}$$
$$1\mathbf{u} = \mathbf{u}, \tag{4d}$$

where $\alpha, \beta$ are any scalars and $\mathbf{u}, \mathbf{v}$ are any vectors.

*Students of mechanics know that the point of application of $\mathbf{F}$ affects the rotational part of the motion but not the translational part.

Observe that the parallelogram rule of vector addition is a definition so it does not need to be proved. Nevertheless, definitions are not necessarily fruitful so it is worthwhile to reflect for a moment on why the parallelogram rule has proved important and useful. Basically, if we say that "the sum of $\mathbf{u}$ and $\mathbf{v}$ is $\mathbf{w}$," and thereby pass from the two vectors $\mathbf{u}, \mathbf{v}$ to the single vector $\mathbf{w}$, it seems fair to expect some sort of equivalence to exist between the action of $\mathbf{w}$ and the joint action of $\mathbf{u}$ and $\mathbf{v}$. For example, if $\mathbf{F}_1$ and $\mathbf{F}_2$ are two forces acting on a body $B$, as shown in Fig.
6, it is known from fundamental principles of mechanics that their combined effect will be the same as that due to the single force $\mathbf{F}$, so it seems reasonable and natural to say that $\mathbf{F}$ is the sum of $\mathbf{F}_1$ and $\mathbf{F}_2$. This concept goes back at least as far as Aristotle (384-322 B.C.). Thus, while the algebra of vectors is developed here as an essentially mathematical matter, it is important to appreciate the role of physics and physical motivation.

Figure 6. Physical motivation for parallelogram addition.

In closing this section, let us remark that our foregoing discussion should not be construed to imply that objects of physical interest are necessarily vectors (as are force and velocity) or scalars [as are temperature, mass, and speed (i.e., the magnitude of the velocity vector)]. For example, in the study of mechanics one finds that more than a magnitude and a direction are needed to fully define the state of stress at a point; in fact, a "second-order tensor" is needed, a quantity that is more exotic than a vector in much the same way that a vector is more exotic than a scalar.*

*For an introduction to tensors, we recommend to the interested reader the 68-page book Tensor Analysis by H. D. Block (Columbus, OH: Charles E. Merrill, 1962).

EXERCISES 9.2

1. Trace the vectors $\mathbf{A}, \mathbf{B}, \mathbf{C}$ shown, where $\mathbf{A}$ is twice as long as $\mathbf{B}$. Then determine each of the following by graphical means.
(a) $\mathbf{A} + \mathbf{B} + \mathbf{C}$ (b) $\mathbf{B} - \mathbf{A}$ (c) $\mathbf{A} - \mathbf{C} + 3\mathbf{B}$ (d) $2(\mathbf{B} - \mathbf{A}) + 6\mathbf{C}$ (e) $\mathbf{A} + (4\mathbf{B} - \mathbf{C})$ (f) $\mathbf{A} + 2\mathbf{B} - 2\mathbf{C}$

2. In each case, $\mathbf{C}$ can be expressed as a linear combination of $\mathbf{A}$ and $\mathbf{B}$, that is, as $\mathbf{C} = \alpha\mathbf{A} + \beta\mathbf{B}$. Trace the three vectors and by graphical means determine $\alpha$ and $\beta$.
[Figure: vectors $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$ for each part]

3. Show that the associative property (2) follows from the graphical definition of vector addition.

4. Derive the following from the definitions of vector addition and scalar multiplication:
(a) property (4a) (b) property (4b) (c) property (4c) (d) property (4d)

5. (a) If $\|\mathbf{A}\| = 1$, $\|\mathbf{B}\| = 2$, and $\|\mathbf{C}\| = 5$, can $\mathbf{A} + \mathbf{B} + \mathbf{C} = \mathbf{0}$? HINT: Use the law of cosines $s^2 = q^2 + r^2 - 2qr\cos\theta$ (see the accompanying figure) or the Euclidean proposition that the length of any one side of a triangle cannot exceed the sum of the lengths of the other two sides.
(b) Repeat part (a), with "$\|\mathbf{A}\| = 1$" changed to $\|\mathbf{A}\| = 4$.

6. Use the definitions and properties given in the reading to show that $\mathbf{A} + \mathbf{B} = \mathbf{C}$ implies that $\mathbf{A} = \mathbf{C} - \mathbf{B}$.

7. (a) Show that if $\alpha\mathbf{A} + \beta\mathbf{B} = \mathbf{0}$ and $\mathbf{A}$ and $\mathbf{B}$ are not parallel, then each of $\alpha$ and $\beta$ must be 0.
(b) Vectors are often of help in deriving geometrical relationships. For example, to show that the diagonals of a parallelogram bisect each other one may proceed as follows. From the accompanying figure $\mathbf{A} + \mathbf{B} = \mathbf{C}$, $\mathbf{A} - \alpha\mathbf{D} = \beta\mathbf{C}$, and $\mathbf{A} = \mathbf{B} + \mathbf{D}$. Eliminating $\mathbf{A}$ and $\mathbf{B}$, we obtain $(2\beta - 1)\mathbf{C} = (1 - 2\alpha)\mathbf{D}$, and since $\mathbf{C}$ and $\mathbf{D}$ are not parallel, it must be true [per part (a)] that $2\beta - 1 = 1 - 2\alpha = 0$ (i.e., $\alpha = \beta = \tfrac{1}{2}$), which completes the proof. We now state the problem: Use this sort of procedure to show that a line from one vertex of a parallelogram to the midpoint of a nonadjacent side trisects a diagonal.

8. If (see the accompanying figure) the vector $\mathbf{A} + \alpha\mathbf{B}$ is placed with its tail at point $P$, show the line generated by its head as $\alpha$ varies between $-\infty$ and $+\infty$. (All ...

9. If (see the accompanying figure) $\|\mathbf{AB}\| / \|\mathbf{AC}\| = \alpha$, show that $\mathbf{OB} = \alpha\,\mathbf{OC} + (1 - \alpha)\,\mathbf{OA}$.

10. One may express linear displacement as a vector: If a particle moves from point $A$ to point $B$, the displacement vector is the directed line segment, say $\mathbf{u}$, from $A$ to $B$. For example, observe that a displacement $\mathbf{u}$ from $A$ to $B$, followed by a displacement $\mathbf{v}$ from $B$ to $C$, is equivalent to a single displacement $\mathbf{w}$ from $A$ to $C$: $\mathbf{u} + \mathbf{v} = \mathbf{w}$ [part (a) in the accompanying figure]. Reversing the order, displacements $\mathbf{v}$ and then $\mathbf{u}$ also carry us from $A$ to $C$: $\mathbf{v} + \mathbf{u} = \mathbf{w}$ [part (b) in the figure]. Thus, $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$ so that the commutativity axiom (1) is indeed satisfied. How about angular displacements? Suppose that we express the angular displacement of a rigid body about an axis as $\boldsymbol{\theta}$, where the magnitude of $\boldsymbol{\theta}$ is equal to the angle of rotation, and the orientation of $\boldsymbol{\theta}$ is along the axis of rotation, in the direction specified by the "right-hand rule." That is, if we curl the fingers of our right hand about the axis of rotation, in the direction of rotation, then the direction of $\boldsymbol{\theta}$ along the axis of rotation is the direction in which our thumb points. The problem is to show that $\boldsymbol{\theta}$, defined in this way, is not a proper vector quantity. HINT: Considering the unit cube shown below, say, show (by keeping track of the coordinates of the corner $A$) that the orientation that results from a rotation of $\pi/2$ about the $x$ axis, followed by a rotation of $\pi/2$ about the $y$ axis, is not the same as that which results when the order of the rotations is reversed. NOTE: If you have encountered angular velocity vectors (usually denoted as $\boldsymbol{\omega}$ or $\boldsymbol{\Omega}$) in mechanics, it may seem strange to you that finite rotations (assigned a vector direction by the right-hand rule) are not true vectors. The idea is that angular velocity involves infinitesimal rotations, and infinitesimal rotations (assigned a vector direction by the right-hand rule) are true vectors. This subtle point is discussed in many sources (e.g., Robert R. Long, Engineering Science Mechanics, Englewood Cliffs, NJ: Prentice Hall, 1963, pp. 31-36).

9.3 Dot Product

Continuing our discussion, we define here the angle between two vectors and a "dot product" operation between two vectors.

The angle $\theta$ between two nonzero vectors $\mathbf{u}$ and $\mathbf{v}$ will be understood to mean the ordinary angle between the two vectors when they are arranged tail to tail as in Fig. 1.
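(We will not attempt to define $\theta$ if one or both of the vectors is $\mathbf{0}$.) Of course, this definition of $\theta$ is ambiguous in that there are two such angles, an interior angle ($\le \pi$) and an exterior angle ($\ge \pi$); for definiteness, we choose $\theta$ to be the interior angle,

$$0 \le \theta \le \pi, \tag{1}$$

as in Fig. 1. Unless explicitly stated otherwise, angular measure will be understood to be in radians.

Figure 1. The angle $\theta$ between $\mathbf{u}$ and $\mathbf{v}$.

Next, we define the so-called dot product, $\mathbf{u}\cdot\mathbf{v}$, between two vectors $\mathbf{u}$ and $\mathbf{v}$ as

$$\mathbf{u}\cdot\mathbf{v} \equiv \begin{cases} \|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta & \text{if } \mathbf{u}, \mathbf{v} \ne \mathbf{0}, \\ 0 & \text{if } \mathbf{u} = \mathbf{0} \text{ or } \mathbf{v} = \mathbf{0}; \end{cases} \tag{2a,b}$$

$\|\mathbf{u}\|$, $\|\mathbf{v}\|$, and $\cos\theta$ are scalars so $\mathbf{u}\cdot\mathbf{v}$ is a scalar, too.*

By way of geometrical interpretation, observe (Fig. 2a) that $\|\mathbf{u}\|\cos\theta$ is the length of the orthogonal projection of $\mathbf{u}$ on the line of action of $\mathbf{v}$ so that $\mathbf{u}\cdot\mathbf{v} = \|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta = (\|\mathbf{v}\|)(\|\mathbf{u}\|\cos\theta)$ is the length of $\mathbf{v}$ times the length of the orthogonal projection of $\mathbf{u}$ on the line of action of $\mathbf{v}$.† Actually, that statement holds if $0 \le \theta \le \pi/2$; if $\pi/2 < \theta \le \pi$, the cosine is negative, and $\mathbf{u}\cdot\mathbf{v} = \|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta$ is the negative of the length of $\mathbf{v}$ times the length of the orthogonal projection of $\mathbf{u}$ on the line of action of $\mathbf{v}$.

Figure 2. Projection of $\mathbf{u}$ on $\mathbf{v}$.

*You may wonder why (2b) is needed since if $\mathbf{u} = \mathbf{0}$, say, then $\|\mathbf{u}\| = 0$ and $\|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta$ is apparently zero anyway. But the point is that if $\mathbf{u}$ and/or $\mathbf{v}$ are $\mathbf{0}$, then $\theta$ is undefined; hence $\cos\theta$ (and even zero times $\cos\theta$) is undefined, too.

†Alternatively, we could decompose $\mathbf{u}\cdot\mathbf{v} = \|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta = (\|\mathbf{u}\|)(\|\mathbf{v}\|\cos\theta)$; that is, as the length of $\mathbf{u}$ times the length of the orthogonal projection of $\mathbf{v}$ on the line of action of $\mathbf{u}$.

EXAMPLE 1. Work Done by a Force. In mechanics the work $W$ done when a body undergoes a linear displacement from an initial point $A$ to a final point $B$, under the action of a constant force $\mathbf{F}$ (Fig. 3), is defined as the length of the orthogonal projection of $\mathbf{F}$ on the line of displacement, positive if $\mathbf{F}$ is "assisting" the motion (i.e., if $0 \le \theta < \pi/2$, as in Fig. 3a) and negative if $\mathbf{F}$ is "opposing" the motion (i.e., if $\pi/2 < \theta \le \pi$, as in Fig. 3b), times the displacement. By the displacement we mean the length of the vector $\mathbf{AB}$ with head at $B$ and tail at $A$. But that product is precisely the dot product of $\mathbf{F}$ with $\mathbf{AB}$,

$$W = \mathbf{F}\cdot\mathbf{AB}. \tag{3}$$

Figure 3. Work done by $\mathbf{F}$.

An important special case of the dot product occurs when $\theta = \pi/2$. Then $\mathbf{u}$ and $\mathbf{v}$ are perpendicular, and

$$\mathbf{u}\cdot\mathbf{v} = \|\mathbf{u}\|\,\|\mathbf{v}\|\cos\frac{\pi}{2} = 0. \tag{4}$$

Also of importance is the case where $\mathbf{u} = \mathbf{v}$. Then, according to (2),

$$\mathbf{u}\cdot\mathbf{u} = \begin{cases} \|\mathbf{u}\|\,\|\mathbf{u}\|\cos 0 = \|\mathbf{u}\|^2 & \text{if } \mathbf{u} \ne \mathbf{0}, \\ 0 & \text{if } \mathbf{u} = \mathbf{0} \end{cases} \tag{5}$$

so that we have

$$\|\mathbf{u}\| = \sqrt{\mathbf{u}\cdot\mathbf{u}}. \tag{6}$$

EXERCISES 9.3

1. Evaluate $\mathbf{u}\cdot\mathbf{v}$ in each case. In (a) $\|\mathbf{u}\| = 5$, in (b) $\|\mathbf{u}\| = 3$, and in (c) $\|\mathbf{u}\| = 6$.
[Figure: vectors $\mathbf{u}$, $\mathbf{v}$ for parts (a)-(c)]

2. (Properties of the dot product) Prove each of the following properties of the dot product, where $\alpha, \beta$ are any scalars, and $\mathbf{u}, \mathbf{v}, \mathbf{w}$ are any vectors.
(a) $\mathbf{u}\cdot\mathbf{v} = \mathbf{v}\cdot\mathbf{u}$ (commutativity)
(b) $\mathbf{u}\cdot\mathbf{u} > 0$ for all $\mathbf{u} \ne \mathbf{0}$, $= 0$ for $\mathbf{u} = \mathbf{0}$ (nonnegativeness)
(c) $(\alpha\mathbf{u} + \beta\mathbf{v})\cdot\mathbf{w} = \alpha(\mathbf{u}\cdot\mathbf{w}) + \beta(\mathbf{v}\cdot\mathbf{w})$ (linearity)
HINT: In proving part (c), you may wish to show, first, that part (c) is equivalent to the two conditions $(\mathbf{u} + \mathbf{v})\cdot\mathbf{w} = \mathbf{u}\cdot\mathbf{w} + \mathbf{v}\cdot\mathbf{w}$ and $(\alpha\mathbf{u})\cdot\mathbf{v} = \alpha(\mathbf{u}\cdot\mathbf{v})$.

3. Using the properties given in Exercise 2, show that

$$(\mathbf{u} + \mathbf{v})\cdot(\mathbf{w} + \mathbf{x}) = \mathbf{u}\cdot\mathbf{w} + \mathbf{u}\cdot\mathbf{x} + \mathbf{v}\cdot\mathbf{w} + \mathbf{v}\cdot\mathbf{x}. \tag{3.1}$$

4. Consider the unit cube shown, where $P$ is the midpoint of the right-hand face. Evaluate each of the following using the definition (2), and (3.1) in Exercise 3. HINT: To evaluate $\mathbf{AC}\cdot\mathbf{OP}$, for instance, write $\mathbf{AC}\cdot\mathbf{OP} = (\mathbf{AD} + \mathbf{DC})\cdot(\mathbf{OD} + \mathbf{DP})$ and then use (3.1).
(a) $\mathbf{OC}\cdot\mathbf{AB}$ (b) $\mathbf{BA}\cdot\mathbf{OP}$ (c) $\mathbf{AC}\cdot\mathbf{OP}$ (d) $\mathbf{OC}\cdot\mathbf{CP}$ (e) $\mathbf{OC}\cdot\mathbf{OP}$ (f) $\mathbf{BC}\cdot\mathbf{OP}$ (g) $\mathbf{AO}\cdot\mathbf{OP}$ (h) $\mathbf{CP}\cdot\mathbf{DP}$ (i) $\mathbf{BP}\cdot\mathbf{DB}$ (j) $\mathbf{PB}\cdot\mathbf{CO}$ (k) $\mathbf{AP}\cdot\mathbf{PB}$ (l) $\mathbf{AO}\cdot\mathbf{PA}$

5. Referring to the figure in Exercise 4, use the dot product to compute the following angles. (See the hint in Exercise 4.) You may use (2), (6), and (3.1).
(a) $APO$ (b) $APB$ (c) $APC$ (d) $APD$ (e) $ABP$ (f) $ACP$ (g) $BPO$ (h) $BPC$ (i) $BPD$ (j) $BOP$ (k) $CPO$ (l) $DPO$

6. If $\mathbf{u}$ and $\mathbf{v}$ are nonzero, show that $\mathbf{w} = \|\mathbf{v}\|\mathbf{u} + \|\mathbf{u}\|\mathbf{v}$ bisects the angle between $\mathbf{u}$ and $\mathbf{v}$. (You may use any of the properties given in Exercise 2.)

9.4 n-Space

[...] "n-space."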
The idea is simple and is based on the familiar representation of points in Cartesian 1-, 2-, and 3-space as 1-, 2-, and 3-tuples of real numbers. For example, the 2-tuple $(a_1, a_2)$ denotes the point $P$ indicated in Fig. 1a, where $a_1, a_2$ are the $x, y$ coordinates, respectively. But it can also serve to denote the vector $\mathbf{OP}$ in Fig. 1b or, indeed, any equivalent vector $\mathbf{QR}$. Thus the vector is now represented as the 2-tuple $(a_1, a_2)$ rather than as an arrow, and while pictures may still be drawn, as in Fig. 1b, they are no longer essential and can be discarded if we wish, at least once the algebra of 2-tuples is established (in the next paragraph). The set of all such real 2-tuple vectors will be called 2-space and will be denoted by the symbol $\mathbb{R}^2$; that is,

$$\mathbb{R}^2 = \{(a_1, a_2) \mid a_1, a_2 \text{ real numbers}\}. \tag{1}$$

Vectors $\mathbf{u} = (u_1, u_2)$ and $\mathbf{v} = (v_1, v_2)$ in $\mathbb{R}^2$ are defined to be equal if $u_1 = v_1$ and $u_2 = v_2$; their sum is defined as*

$$\mathbf{u} + \mathbf{v} \equiv (u_1 + v_1,\; u_2 + v_2) \tag{2}$$

as can be seen from Fig. 2; the scalar multiple $\alpha\mathbf{u}$ is defined, for any scalar $\alpha$, as

$$\alpha\mathbf{u} \equiv (\alpha u_1,\; \alpha u_2); \tag{3}$$

the zero vector is

$$\mathbf{0} \equiv (0, 0); \tag{4}$$

and the negative of $\mathbf{u}$ is

$$-\mathbf{u} \equiv (-u_1,\; -u_2). \tag{5}$$

Similarly, for $\mathbb{R}^3$:

$$\mathbb{R}^3 = \{(a_1, a_2, a_3) \mid a_1, a_2, a_3 \text{ real numbers}\}, \tag{6}$$

$$\mathbf{u} + \mathbf{v} \equiv (u_1 + v_1,\; u_2 + v_2,\; u_3 + v_3), \tag{7}$$

and so on.†

*We use the $\equiv$ equal sign to mean equal to by definition.

†The space $\mathbb{R}^1$ of 1-tuples will not be of interest here.

It may not be evident that we have gained much since the arrow and n-tuple representations are essentially equivalent. But, in fact, the n-tuple format begins to "open doors." For example, the instantaneous state of the electrical circuit (consisting of a battery and two resistors) shown in Fig. 3 may be defined by the two currents $i_1$ and $i_2$ or, equivalently, by the single 2-tuple vector $(i_1, i_2)$. Thus, even though "magnitudes," "directions," and "arrow vectors" may not leap to mind in describing the system shown in Fig. 3, a vector representation is quite natural within the n-tuple framework, and that puts us in a position, in dealing with that electrical system, to make use of whatever vector theorems and techniques are available, as developed in subsequent sections and chapters.

Indeed, why stop at 3-tuples? One may introduce the set of all ordered real n-tuple vectors, even if $n$ is greater than 3. We call this n-space, and denote it as $\mathbb{R}^n$, that is,

$$\mathbb{R}^n = \{(a_1, \ldots, a_n) \mid a_1, \ldots, a_n \text{ real numbers}\}. \tag{8}$$

Consider two vectors, $\mathbf{u} = (u_1, \ldots, u_n)$ and $\mathbf{v} = (v_1, \ldots, v_n)$, in $\mathbb{R}^n$. The scalars $u_1, \ldots, u_n$ and $v_1, \ldots, v_n$ are called the components of $\mathbf{u}$ and $\mathbf{v}$. As you may well expect, based on our foregoing discussion of $\mathbb{R}^2$ and $\mathbb{R}^3$, $\mathbf{u}$ and $\mathbf{v}$ are said to be equal if $u_1 = v_1, \ldots, u_n = v_n$, and we define

$$\mathbf{u} + \mathbf{v} \equiv (u_1 + v_1, \ldots, u_n + v_n), \quad \text{(addition)} \tag{9a}$$
$$\alpha\mathbf{u} \equiv (\alpha u_1, \ldots, \alpha u_n), \quad \text{(scalar multiplication)} \tag{9b}$$
$$\mathbf{0} \equiv (0, \ldots, 0), \quad \text{(zero vector)} \tag{9c}$$
$$-\mathbf{u} \equiv (-1)\mathbf{u}, \quad \text{(negative inverse)} \tag{9d}$$
$$\mathbf{u} - \mathbf{v} \equiv \mathbf{u} + (-\mathbf{v}). \tag{9e}$$

From these definitions we may deduce the following properties:

$$\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}, \quad \text{(commutativity)} \tag{10a}$$
$$(\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w}), \quad \text{(associativity)} \tag{10b}$$
$$\mathbf{u} + \mathbf{0} = \mathbf{u}, \tag{10c}$$
$$\mathbf{u} + (-\mathbf{u}) = \mathbf{0}, \tag{10d}$$
$$\alpha(\beta\mathbf{u}) = (\alpha\beta)\mathbf{u}, \quad \text{(associativity)} \tag{10e}$$
$$(\alpha + \beta)\mathbf{u} = \alpha\mathbf{u} + \beta\mathbf{u}, \quad \text{(distributivity)} \tag{10f}$$
$$\alpha(\mathbf{u} + \mathbf{v}) = \alpha\mathbf{u} + \alpha\mathbf{v}, \quad \text{(distributivity)} \tag{10g}$$
$$1\mathbf{u} = \mathbf{u}, \tag{10h}$$
$$0\mathbf{u} = \mathbf{0}, \tag{10i}$$
$$(-1)\mathbf{u} = -\mathbf{u}, \tag{10j}$$
$$\alpha\mathbf{0} = \mathbf{0}. \tag{10k}$$

To illustrate how such n-tuples might arise, observe that the state of the electrical system shown in Fig. 4 may be defined at any instant by the four currents $i_1, i_2, i_3, i_4$, and that these may be regarded as the components of a single vector $\mathbf{i} = (i_1, i_2, i_3, i_4)$ in $\mathbb{R}^4$.

Of course, the notion of $(u_1, \ldots, u_n)$ as a point or arrow in an "n-dimensional space" can be realized graphically only if $n \le 3$; if $n > 3$, the interpretation is valid only in an abstract, or schematic, sense.
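
The definitions (9a)-(9e) translate almost line for line into code. The following is a minimal Python sketch, not part of the original text: the function names `add`, `scale`, `zero`, `negate`, and `subtract` are illustrative choices, and vectors are modeled as plain tuples. It spot-checks a few of the properties (10) on sample 4-tuples, which illustrates them but of course does not prove them.

```python
# n-tuple vector algebra following definitions (9a)-(9e).
# All function names are hypothetical; vectors are plain Python tuples.

def add(u, v):
    """(9a): componentwise addition, defined only within the same R^n."""
    if len(u) != len(v):
        raise ValueError("u + v is defined only for vectors in the same n-space")
    return tuple(a + b for a, b in zip(u, v))

def scale(alpha, u):
    """(9b): scalar multiplication."""
    return tuple(alpha * a for a in u)

def zero(n):
    """(9c): the zero vector of R^n."""
    return tuple(0 for _ in range(n))

def negate(u):
    """(9d): -u is defined as (-1)u."""
    return scale(-1, u)

def subtract(u, v):
    """(9e): u - v is defined as u + (-v)."""
    return add(u, negate(v))

# Sample vectors in R^4, used only for illustration.
u = (1, 3, 0, -2)
v = (2, 0, -5, 0)
w = (4, 3, 2, -1)

print(add(u, v))                                 # (3, 3, -5, -2)
print(add(u, v) == add(v, u))                    # (10a) commutativity
print(add(add(u, v), w) == add(u, add(v, w)))    # (10b) associativity
print(add(u, negate(u)) == zero(4))              # (10d)
print(scale(2, add(u, v)) == add(scale(2, u), scale(2, v)))  # (10g)
```

Raising an error for tuples of different lengths mirrors the fact that the sum of vectors from different spaces has not been defined.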
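However, our inability to carry out traditional Cartesian graphical constructions for $n > 3$ will be no hindrance. Indeed, part of the idea here is to move away from a dependence on graphical constructions.

Having extended the vector concept to $\mathbb{R}^n$, you may well wonder if further extension is possible. Such extension is not only possible, it constitutes an important step in modern mathematics; more about this in Section 9.6.

EXERCISES 9.4

1. If $\mathbf{t} = (5, 0, 1, 2)$, $\mathbf{u} = (2, -1, 3, 4)$, $\mathbf{v} = (4, -5, 1)$, $\mathbf{w} = (-1, -2, 5, 6)$, evaluate each of the following (as a single vector); if the operation is undefined (i.e., has not been defined here), state that. At each step cite the equation number of the definition or property being used.
(a) $2\mathbf{t} + 7\mathbf{u}$ (b) $3\mathbf{t} - 5\mathbf{u}$ (c) $4[\mathbf{u} + 5(\mathbf{w} - 2\mathbf{u})]$ (d) $4\mathbf{t}\mathbf{u} + \mathbf{w}$ (e) $-\mathbf{w} + \mathbf{t}$ (f) $2\mathbf{t}/\mathbf{u}$ (g) $\mathbf{t} + 2\mathbf{u} + 3\mathbf{w}$ (h) $\mathbf{t} - 2\mathbf{u} - 4\mathbf{v}$ (i) $\mathbf{u}(3\mathbf{t} + \mathbf{w})$ (j) $\mathbf{u}^2 + 2\mathbf{t}$ (k) $2\mathbf{t} + 7\mathbf{u} - 4$ (l) $\mathbf{u} + \mathbf{w}\mathbf{t}$ (m) $\sin\mathbf{u}$ (n) $\mathbf{w} + \mathbf{t} - 2\mathbf{u}$

2. Let $\mathbf{u} = (1, 3, 0, -2)$, $\mathbf{v} = (2, 0, -5, 0)$, and $\mathbf{w} = (4, 3, 2, -1)$.
(a) If $8\mathbf{u} - \mathbf{x} = 4(\mathbf{v} + 2\mathbf{x})$, solve for $\mathbf{x}$ (i.e., find its components).
(b) If $\mathbf{x} + \mathbf{u} + \mathbf{v} + \mathbf{w} = \mathbf{0}$, solve for $\mathbf{x}$.

3. Let $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ be as given in Exercise 2. Citing the definition or property used, at each step, solve each of the following for $\mathbf{x}$. NOTE: Besides the definitions and properties stated in this section, it should be clear that if $\mathbf{x} = \mathbf{y}$, then $\mathbf{x} + \mathbf{z} = \mathbf{y} + \mathbf{z}$ for any $\mathbf{z}$, and $\alpha\mathbf{x} = \alpha\mathbf{y}$ for any $\alpha$ (adding and multiplying equals by equals).
(a) $8\mathbf{x} + 2(\mathbf{u} - 5\mathbf{v}) = \mathbf{w}$ (b) $3\mathbf{x} = 4\mathbf{v} + (1, 0, 0, 0)$ (c) $\mathbf{u} - 4\mathbf{x} = \mathbf{0}$ (d) $\mathbf{u} + \mathbf{v} - 2\mathbf{x} = \mathbf{w}$

4. If $\mathbf{t} = (2, 1, 3)$, $\mathbf{u} = (1, 2, -4)$, $\mathbf{v} = (0, 1, 1)$, $\mathbf{w} = (-2, 1, -1)$, solve each of the following for the scalars $\alpha_1, \alpha_2, \alpha_3$. If no such scalars exist, state that.
(a) $\alpha_1\mathbf{t} + \alpha_2\mathbf{u} + \alpha_3\mathbf{v} = \mathbf{0}$ (b) $\alpha_1\mathbf{t} + \alpha_2\mathbf{v} + \alpha_3\mathbf{w} = \mathbf{0}$ (c) $\alpha_1\mathbf{t} + \alpha_2\mathbf{u} + \alpha_3\mathbf{w} = (1, 3, 2)$ (d) $\alpha_1\mathbf{t} + \alpha_2\mathbf{v} + \alpha_3\mathbf{w} = (2, 0, -1)$ (e) $\alpha_1\mathbf{u} + \alpha_2\mathbf{v} = \mathbf{0}$ (f) $\alpha_1\mathbf{u} + \alpha_2\mathbf{v} = \alpha_3\mathbf{w} - (2, 0, 0)$

5. (a) If $\mathbf{u}$ and $\mathbf{v}$ are given 4-tuples and $\mathbf{0} = (0, 0, 0, 0)$, does the vector equation $\alpha_1\mathbf{u} + \alpha_2\mathbf{v} = \mathbf{0}$ necessarily have nontrivial solutions for the scalars $\alpha_1$ and $\alpha_2$? Explain. (If the answer is "no," a counterexample will suffice.)
(b) Repeat part (a), but where $\mathbf{u}, \mathbf{v}$ are 3-tuples and $\mathbf{0} = (0, 0, 0)$.
(c) Repeat part (a), but where $\mathbf{u}, \mathbf{v}$ are 2-tuples and $\mathbf{0} = (0, 0)$.

9.5 Dot Product, Norm, and Angle for n-Space

The graphical constructions used for arrow vectors will not be possible here for $n > 3$. Thus, if $\mathbf{u} = (u_1, \ldots, u_n)$, we wish to define the norm or "length" of $\mathbf{u}$, denoted as $\|\mathbf{u}\|$, in terms of the components $u_1, \ldots, u_n$ of $\mathbf{u}$; and given another vector $\mathbf{v} = (v_1, \ldots, v_n)$, we wish to define the dot product $\mathbf{u}\cdot\mathbf{v}$ and the angle $\theta$ between $\mathbf{u}$ and $\mathbf{v}$ in terms of $u_1, \ldots, u_n$ and $v_1, \ldots, v_n$, in such a way that these definitions reduce to those given in Sections 9.2 and 9.3 in the event that $n = 2$ or 3. Our plan is to begin with the formula

$$\mathbf{u}\cdot\mathbf{v} = \|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta, \tag{1}$$

to re-express it in terms of vector components for $\mathbb{R}^2$ and $\mathbb{R}^3$, and then to generalize those forms to $\mathbb{R}^n$.

If $\mathbf{u}$ and $\mathbf{v}$ are vectors in $\mathbb{R}^2$ as shown in Fig. 1, formula (1) may be expressed in terms of the components of $\mathbf{u}$ and $\mathbf{v}$ as follows:

$$\begin{aligned} \mathbf{u}\cdot\mathbf{v} &= \|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta = \|\mathbf{u}\|\,\|\mathbf{v}\|\cos(\beta - \alpha) \\ &= \|\mathbf{u}\|\,\|\mathbf{v}\|(\cos\beta\cos\alpha + \sin\beta\sin\alpha) \\ &= (\|\mathbf{u}\|\cos\alpha)(\|\mathbf{v}\|\cos\beta) + (\|\mathbf{u}\|\sin\alpha)(\|\mathbf{v}\|\sin\beta) \\ &= u_1v_1 + u_2v_2. \end{aligned} \tag{2}$$

Figure 1. $\mathbf{u}\cdot\mathbf{v}$ in terms of components.

We state, without derivation, that the analogous result for $\mathbb{R}^3$ is

$$\mathbf{u}\cdot\mathbf{v} = u_1v_1 + u_2v_2 + u_3v_3. \tag{3}$$

Generalizing (2) and (3) to $\mathbb{R}^n$, it is eminently reasonable to define the (scalar-valued) dot product of two n-tuple vectors $\mathbf{u} = (u_1, \ldots, u_n)$ and $\mathbf{v} = (v_1, \ldots, v_n)$ as

$$\mathbf{u}\cdot\mathbf{v} \equiv u_1v_1 + u_2v_2 + \cdots + u_nv_n = \sum_{j=1}^{n} u_jv_j. \tag{4}$$

Observe that we have not proved (4); it is a definition.

Defining the dot product is the key, for now $\|\mathbf{u}\|$ and $\theta$ follow readily. Specifically, we define

$$\|\mathbf{u}\| \equiv \sqrt{\mathbf{u}\cdot\mathbf{u}} = \sqrt{\sum_{j=1}^{n} u_j^2} \tag{5}$$

in accordance with equation (6) in Section 9.3, and

$$\theta \equiv \cos^{-1}\left(\frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|}\right) \tag{6}$$

from (1), where the inverse cosine is understood to be in the interval $[0, \pi]$.* Notice the generalized Pythagorean-theorem nature of (5).

Other dot products and norms are sometimes defined for n-space, but we choose to use (4) and (5), which are known as the Euclidean dot product and Euclidean norm, respectively.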
To signify that the Euclidean dot product and norm have been adopted, we henceforth refer to the space as Euclidean n-space, rather than just n-space. We will still denote it by the symbol $\mathbb{R}^n$ (although some authors prefer the notation $\mathbb{E}^n$).

*By the "interval $[a, b]$ on a real $x$ axis," we mean the points $a \le x \le b$. Such an interval is said to be closed since it includes the two endpoints. To denote the open interval $a < x < b$, we write $(a, b)$. Similarly, $[a, b)$ means $a \le x < b$, and $(a, b]$ means $a < x \le b$. Implicit in the closed-interval notation $[a, b]$ is the finiteness of $a$ and $b$.

EXAMPLE 1. Let $\mathbf{u} = (1, 0)$ and $\mathbf{v} = (2, -2)$. Then

$$\mathbf{u}\cdot\mathbf{v} = (1)(2) + (0)(-2) = 2,$$
$$\|\mathbf{u}\| = \sqrt{(1)^2 + (0)^2} = 1,$$
$$\|\mathbf{v}\| = \sqrt{(2)^2 + (-2)^2} = 2\sqrt{2},$$
$$\theta = \cos^{-1}\left(\frac{2}{2\sqrt{2}}\right) = \frac{\pi}{4} \quad (\text{or } 45^\circ)$$

as is readily verified if we sketch $\mathbf{u}$ and $\mathbf{v}$ as arrow vectors in a Cartesian plane.

EXAMPLE 2. Let $\mathbf{u} = (2, -2, 4, -1)$ and $\mathbf{v} = (5, 9, -1, 0)$. Then,

$$\mathbf{u}\cdot\mathbf{v} = (2)(5) + (-2)(9) + (4)(-1) + (-1)(0) = -12, \tag{7}$$
$$\|\mathbf{u}\| = \sqrt{(2)^2 + (-2)^2 + (4)^2 + (-1)^2} = 5, \tag{8}$$
$$\|\mathbf{v}\| = \sqrt{(5)^2 + (9)^2 + (-1)^2 + (0)^2} = \sqrt{107}, \tag{9}$$
$$\theta = \cos^{-1}\left(\frac{-12}{5\sqrt{107}}\right) = \cos^{-1}(-0.232) \approx 1.805 \quad (\text{or } 103.4^\circ). \tag{10}$$

In this case, $n$ ($= 4$) is greater than 3 so (7) through (10) are not to be understood in any physical or graphical sense, but merely in terms of the definitions (4) to (6).

COMMENT. The dot product of $\mathbf{u} = (2, -2, 4)$ and $\mathbf{v} = (5, 9, -1, 0)$, on the other hand, is not defined since here $\mathbf{u}$ and $\mathbf{v}$ are members of different spaces, $\mathbb{R}^3$ and $\mathbb{R}^4$, respectively. It is not legitimate to augment $\mathbf{u}$ to the form $(2, -2, 4, 0)$ on the grounds that "surely adding a zero can't hurt."

There is one catch that you may have noticed: (6) serves to define a (real) $\theta$ only if the argument of the inverse cosine is less than or equal to unity in magnitude. That this is indeed true is not so obvious. Nevertheless, that

$$-1 \le \frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|} \le 1 \quad \text{or} \quad |\mathbf{u}\cdot\mathbf{v}| \le \|\mathbf{u}\|\,\|\mathbf{v}\| \tag{11}$$

does necessarily hold will be proved in a moment.
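
As a numerical cross-check of Examples 1 and 2, the definitions (4), (5), and (6) can be sketched in a few lines of Python. This is only an illustrative sketch: the names `dot`, `norm`, and `angle` are our own, the standard-library `math` module supplies the square root and inverse cosine, and the clamp before `math.acos` is an added guard against floating-point roundoff, not something the definitions require.

```python
import math

def dot(u, v):
    """Euclidean dot product (4); defined only for vectors of the same R^n."""
    if len(u) != len(v):
        raise ValueError("u . v is not defined for vectors from different spaces")
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Euclidean norm (5): the square root of u . u."""
    return math.sqrt(dot(u, u))

def angle(u, v):
    """Angle (6) in radians, in [0, pi]; undefined if either vector is zero."""
    nu, nv = norm(u), norm(v)
    if nu == 0 or nv == 0:
        raise ValueError("the angle is not defined if u or v is the zero vector")
    c = dot(u, v) / (nu * nv)
    c = max(-1.0, min(1.0, c))  # guard against roundoff pushing |c| past 1
    return math.acos(c)

# Example 1: vectors in R^2
print(dot((1, 0), (2, -2)), angle((1, 0), (2, -2)))   # 2 and pi/4 (about 0.785)

# Example 2: vectors in R^4
u = (2, -2, 4, -1)
v = (5, 9, -1, 0)
print(dot(u, v), norm(u))                             # -12 and 5.0
print(round(angle(u, v), 3), round(math.degrees(angle(u, v)), 1))  # about 1.805 rad, 103.4 degrees
```

Calling `dot((2, -2, 4), (5, 9, -1, 0))` raises an error, in keeping with the COMMENT in Example 2 that the dot product of vectors from different spaces is not defined.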
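Whereas double braces denote vector norm, the single braces in (11) denote the absolute value of the scalar $\mathbf{u}\cdot\mathbf{v}$.

9.5.2. Properties of the dot product. The dot product defined by (4) possesses the following important properties:

Commutative:
\[ \mathbf{u}\cdot\mathbf{v} = \mathbf{v}\cdot\mathbf{u}, \tag{12a} \]
Nonnegative:
\[ \mathbf{u}\cdot\mathbf{u} \;\begin{cases} > 0 & \text{for all } \mathbf{u} \ne \mathbf{0}, \\ = 0 & \text{for } \mathbf{u} = \mathbf{0}, \end{cases} \tag{12b} \]
Linear:
\[ (\alpha\mathbf{u} + \beta\mathbf{v})\cdot\mathbf{w} = \alpha(\mathbf{u}\cdot\mathbf{w}) + \beta(\mathbf{v}\cdot\mathbf{w}), \tag{12c} \]
for any scalars $\alpha$, $\beta$ and any vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$. The linearity condition (12c) is equivalent to the two conditions $(\mathbf{u} + \mathbf{v})\cdot\mathbf{w} = (\mathbf{u}\cdot\mathbf{w}) + (\mathbf{v}\cdot\mathbf{w})$ and $(\alpha\mathbf{u})\cdot\mathbf{v} = \alpha(\mathbf{u}\cdot\mathbf{v})$. Verification of these claims is left for the exercises.

EXAMPLE 3. Expand the dot product $(6\mathbf{t} - 2\mathbf{u})\cdot(\mathbf{v} + 4\mathbf{w})$. Using (12), we obtain
\[ \begin{aligned} (6\mathbf{t} - 2\mathbf{u})\cdot(\mathbf{v} + 4\mathbf{w}) &= 6[\mathbf{t}\cdot(\mathbf{v} + 4\mathbf{w})] - 2[\mathbf{u}\cdot(\mathbf{v} + 4\mathbf{w})] && \text{by (12c)} \\ &= 6[(\mathbf{v} + 4\mathbf{w})\cdot\mathbf{t}] - 2[(\mathbf{v} + 4\mathbf{w})\cdot\mathbf{u}] && \text{by (12a)} \\ &= 6(\mathbf{v}\cdot\mathbf{t}) + 24(\mathbf{w}\cdot\mathbf{t}) - 2(\mathbf{v}\cdot\mathbf{u}) - 8(\mathbf{w}\cdot\mathbf{u}) && \text{by (12c)} \end{aligned} \]
in much the same way that we obtain $(a - b)(c + d) = ac + ad - bc - bd$ in scalar arithmetic. ∎

As a consequence of (12) we are in a position to prove the promised inequality (11), namely, the Schwarz inequality*
\[ |\mathbf{u}\cdot\mathbf{v}| \le \|\mathbf{u}\|\,\|\mathbf{v}\|. \tag{13} \]

*After Hermann Amandus Schwarz (1843-1921). The names Cauchy and Bunyakovsky are also associated with this well-known inequality.

To derive this result, we start with the inequality
\[ (\mathbf{u} + \alpha\mathbf{v})\cdot(\mathbf{u} + \alpha\mathbf{v}) \ge 0, \tag{14} \]
which is guaranteed by (12b), for any scalar $\alpha$ and any vectors $\mathbf{u}$ and $\mathbf{v}$. Expanding the left-hand side and noting that $\mathbf{u}\cdot\mathbf{u} = \|\mathbf{u}\|^2$ and $\mathbf{v}\cdot\mathbf{v} = \|\mathbf{v}\|^2$, (14) becomes†
\[ \|\mathbf{u}\|^2 + 2\alpha\,\mathbf{u}\cdot\mathbf{v} + \alpha^2\|\mathbf{v}\|^2 \ge 0. \tag{15} \]

†Does a term such as $\alpha\mathbf{u}\cdot\mathbf{v}$ in (15) mean $(\alpha\mathbf{u})\cdot\mathbf{v}$ or $\alpha(\mathbf{u}\cdot\mathbf{v})$? It does not matter; by virtue of (12c) (with $\beta = 0$ and $\mathbf{w}$ changed to $\mathbf{v}$), $(\alpha\mathbf{u})\cdot\mathbf{v} = \alpha(\mathbf{u}\cdot\mathbf{v})$, so the parentheses are not needed.

Regarding $\mathbf{u}$ and $\mathbf{v}$ as fixed and $\alpha$ as variable, the left-hand side is then a quadratic function of $\alpha$. If we choose $\alpha$ so as to minimize the left-hand side, then (15) will be as close to an equality as possible and hence as informative as possible. Thus, setting $d(\text{left-hand side})/d\alpha = 0$, we obtain
\[ 2\,\mathbf{u}\cdot\mathbf{v} + 2\alpha\|\mathbf{v}\|^2 = 0 \quad \text{or} \quad \alpha = -\frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{v}\|^2}. \]
Putting this optimal value of $\alpha$ back into (15) gives us
\[ \|\mathbf{u}\|^2 - 2\frac{(\mathbf{u}\cdot\mathbf{v})^2}{\|\mathbf{v}\|^2} + \frac{(\mathbf{u}\cdot\mathbf{v})^2}{\|\mathbf{v}\|^2} \ge 0, \]
\[ \|\mathbf{u}\|^2\|\mathbf{v}\|^2 - 2(\mathbf{u}\cdot\mathbf{v})^2 + (\mathbf{u}\cdot\mathbf{v})^2 \ge 0, \]
\[ (\mathbf{u}\cdot\mathbf{v})^2 \le \|\mathbf{u}\|^2\|\mathbf{v}\|^2, \]
and taking square roots of both sides yields the Schwarz inequality (13).‡ Thus, it was not merely a matter of luck that the arguments of the inverse cosines were smaller than unity in magnitude in Examples 1 and 2; it was guaranteed in advance by the Schwarz inequality (13).

‡That the choice $\alpha = -\mathbf{u}\cdot\mathbf{v}/\|\mathbf{v}\|^2$ minimizes the left-hand side of (15) follows from the fact that $d^2(\text{left-hand side})/d\alpha^2 = 2\|\mathbf{v}\|^2 > 0$.

9.5.3. Properties of the norm. Since the norm is related to the dot product according to
\[ \|\mathbf{u}\| = \sqrt{\mathbf{u}\cdot\mathbf{u}}, \tag{16} \]
the properties (12) of the dot product should imply certain corresponding properties of the norm. These properties are as follows:

Scaling:
\[ \|\alpha\mathbf{u}\| = |\alpha|\,\|\mathbf{u}\|, \tag{17a} \]
Nonnegative:
\[ \|\mathbf{u}\| \;\begin{cases} > 0 & \text{for all } \mathbf{u} \ne \mathbf{0}, \\ = 0 & \text{for } \mathbf{u} = \mathbf{0}, \end{cases} \tag{17b} \]
Triangle inequality:
\[ \|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|. \tag{17c} \]

Equation (17a) simply says that $\alpha\mathbf{u}$ is $|\alpha|$ times as long as $\mathbf{u}$, and for arrow representations of 2-tuples or 3-tuples the triangle inequality (17c) amounts to the Euclidean proposition that the length of any one side of a triangle cannot exceed the sum of the lengths of the other two sides (Fig. 2). Less obvious, however, is the fact that (17c) holds for $n$-tuples for $n$'s > 3.

[Figure 2. Triangle inequality.]

Let us prove only (17a) and (17c) since (17b) follows readily from (16) and (12b). First, (17a):
\[ \begin{aligned} \|\alpha\mathbf{u}\| &= \sqrt{(\alpha\mathbf{u})\cdot(\alpha\mathbf{u})} && \text{by (16)} \\ &= \sqrt{\alpha\,\mathbf{u}\cdot(\alpha\mathbf{u})} && \text{by (12c) with } \beta = 0 \text{ and } \mathbf{w} = \alpha\mathbf{u} \\ &= \sqrt{\alpha\,(\alpha\mathbf{u})\cdot\mathbf{u}} && \text{by (12a)} \\ &= \sqrt{\alpha^2\,\mathbf{u}\cdot\mathbf{u}} && \text{by (12c) with } \beta = 0 \text{ and } \mathbf{w} = \mathbf{u} \\ &= |\alpha|\sqrt{\mathbf{u}\cdot\mathbf{u}} = |\alpha|\,\|\mathbf{u}\|. \end{aligned} \]
Turning to (17c), we find that
\[ \begin{aligned} \|\mathbf{u} + \mathbf{v}\|^2 &= (\mathbf{u} + \mathbf{v})\cdot(\mathbf{u} + \mathbf{v}) && \text{by (16)} \\ &= \mathbf{u}\cdot\mathbf{u} + \mathbf{v}\cdot\mathbf{u} + \mathbf{u}\cdot\mathbf{v} + \mathbf{v}\cdot\mathbf{v} \\ &= \|\mathbf{u}\|^2 + 2\,\mathbf{u}\cdot\mathbf{v} + \|\mathbf{v}\|^2 \\ &\le \|\mathbf{u}\|^2 + 2\,|\mathbf{u}\cdot\mathbf{v}| + \|\mathbf{v}\|^2 \\ &\le \|\mathbf{u}\|^2 + 2\,\|\mathbf{u}\|\,\|\mathbf{v}\| + \|\mathbf{v}\|^2 && \text{by (13)} \\ &= (\|\mathbf{u}\| + \|\mathbf{v}\|)^2 \end{aligned} \]
so that $\|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|$, as claimed.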
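The Schwarz inequality (13), the triangle inequality (17c), and the minimizing choice of $\alpha$ in (15) can be spot-checked numerically. The sketch below is only a check on one pair of vectors, not a proof; the helper names `schwarz_gap`, `triangle_gap`, and `lhs_15` are hypothetical, and the vectors are chosen purely for illustration (with $\mathbf{v}$ nonzero so that $\alpha$ is defined).

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def schwarz_gap(u, v):
    """||u|| ||v|| - |u . v|, which (13) says is never negative."""
    return norm(u) * norm(v) - abs(dot(u, v))

def triangle_gap(u, v):
    """||u|| + ||v|| - ||u + v||, which (17c) says is never negative."""
    w = [a + b for a, b in zip(u, v)]
    return norm(u) + norm(v) - norm(w)

def lhs_15(alpha, u, v):
    """Left-hand side of (15): ||u||^2 + 2 alpha (u . v) + alpha^2 ||v||^2."""
    return dot(u, u) + 2 * alpha * dot(u, v) + alpha ** 2 * dot(v, v)

u, v = (2, 1, 3, -1), (0, 4, 2, 1)
alpha_opt = -dot(u, v) / dot(v, v)   # the minimizing alpha found in the text

print(schwarz_gap(u, v) >= 0, triangle_gap(u, v) >= 0)
print(lhs_15(alpha_opt, u, v) <= lhs_15(alpha_opt + 0.1, u, v))
```

Both printed comparisons should come out true for any choice of real vectors with $\mathbf{v} \ne \mathbf{0}$; perturbing `alpha_opt` in either direction can only increase the left-hand side of (15).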
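A key step was the use of the Schwarz inequality (13), but we also used the simple inequality $\mathbf{u}\cdot\mathbf{v} \le |\mathbf{u}\cdot\mathbf{v}|$, which holds since $\mathbf{u}\cdot\mathbf{v}$ is a (positive, zero, or negative) real number; that is, if $\mathbf{u}\cdot\mathbf{v}$ is negative, then the < holds, and if $\mathbf{u}\cdot\mathbf{v}$ is zero or positive, then the = holds.

EXAMPLE 4. Let us verify the triangle inequality for a specific example, say the vectors $\mathbf{u} = (2, 1, 3, -1)$ and $\mathbf{v} = (0, 4, 2, 1)$. Then $\mathbf{u} + \mathbf{v} = (2, 5, 5, 0)$ so (17c) becomes $\sqrt{54} \le \sqrt{15} + \sqrt{21}$ or $7.348 \le 3.873 + 4.583$, which is indeed true. ∎

9.5.4. Orthogonality. If $\mathbf{u}$ and $\mathbf{v}$ are nonzero vectors such that $\mathbf{u}\cdot\mathbf{v} = 0$, then
\[ \theta = \cos^{-1}\left(\frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|}\right) = \cos^{-1}\left(\frac{0}{\|\mathbf{u}\|\,\|\mathbf{v}\|}\right) = \cos^{-1}(0) = \frac{\pi}{2} \tag{18} \]
and we say that $\mathbf{u}$ and $\mathbf{v}$ are perpendicular. [Here we have used the nonzeroness of $\mathbf{u}$ and $\mathbf{v}$ in the third equality in (18); if $\mathbf{u}$ and/or $\mathbf{v}$ were $\mathbf{0}$, we would have had $\cos^{-1}(0/\|\mathbf{u}\|\,\|\mathbf{v}\|) = \cos^{-1}(0/0)$, which is not defined.] But to equate the condition $\mathbf{u}\cdot\mathbf{v} = 0$ to perpendicularity ($\theta = \pi/2$) would not be correct since $\mathbf{u}\cdot\mathbf{v}$ will also be zero in the event that $\mathbf{u}$ and/or $\mathbf{v}$ are $\mathbf{0}$, in which case $\theta$ is not defined. Let us therefore make a distinction between perpendicularity and "orthogonality." We will say that $\mathbf{u}$ and $\mathbf{v}$ are orthogonal if
\[ \mathbf{u}\cdot\mathbf{v} = 0. \tag{19} \]
Only if $\mathbf{u}$ and $\mathbf{v}$ are both nonzero does their orthogonality imply their perpendicularity (i.e., $\theta = \pi/2$). With this definition, we see that the zero vector $\mathbf{0}$ is orthogonal to every vector including itself (Exercise 14).

Finally, we say that a set of vectors, say $\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$, is an orthogonal set if every vector in the set is orthogonal to every other one:
\[ \mathbf{u}_i\cdot\mathbf{u}_j = 0 \quad \text{if } i \ne j. \tag{20} \]

EXAMPLE 5. $\mathbf{u}_1 = (2, 3, -1, 0)$, $\mathbf{u}_2 = (1, 2, 8, 3)$, $\mathbf{u}_3 = (9, -6, 0, 1)$ is an orthogonal set because $\mathbf{u}_1\cdot\mathbf{u}_2 = \mathbf{u}_1\cdot\mathbf{u}_3 = \mathbf{u}_2\cdot\mathbf{u}_3 = 0$. ∎

EXAMPLE 6. $\mathbf{u}_1 = (1, 3)$, $\mathbf{u}_2 = (0, 0)$ is an orthogonal set because $\mathbf{u}_1\cdot\mathbf{u}_2 = 0$. ∎

9.5.5. Normalization.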
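Any nonzero vector $\mathbf{u}$ can be scaled to have unit length by multiplying it by $1/\|\mathbf{u}\|$, so we say that the vector
\[ \hat{\mathbf{u}} = \frac{1}{\|\mathbf{u}\|}\,\mathbf{u} \tag{21} \]
has been "normalized." That $\hat{\mathbf{u}}$ has unit length is readily verified:
\[ \begin{aligned} \|\hat{\mathbf{u}}\| &= \left\|\frac{1}{\|\mathbf{u}\|}\,\mathbf{u}\right\| = \left|\frac{1}{\|\mathbf{u}\|}\right|\,\|\mathbf{u}\| && \text{by (17a)} \\ &= \frac{1}{\|\mathbf{u}\|}\,\|\mathbf{u}\| = 1. && \text{by (17b)} \end{aligned} \]
A vector of unit length is called a unit vector. We will often use the caret notation $\hat{\mathbf{u}}$ for unit vectors.

EXAMPLE 7. Normalize $\mathbf{u} = (1, -1, 0, 2)$. Since $\|\mathbf{u}\| = \sqrt{\mathbf{u}\cdot\mathbf{u}} = \sqrt{6}$, we have
\[ \hat{\mathbf{u}} = \frac{1}{\sqrt{6}}(1, -1, 0, 2) = \left(\frac{1}{\sqrt{6}}, -\frac{1}{\sqrt{6}}, 0, \frac{2}{\sqrt{6}}\right). \]
∎

A set of vectors is said to be orthonormal if it is orthogonal and if each vector is normalized (i.e., is a unit vector). We will use that term so frequently that it will be useful to abbreviate it as ON, but be aware that that abbreviation is not standard usage. Thus, $\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$ is ON if $\mathbf{u}_i\cdot\mathbf{u}_j = 0$ whenever $i \ne j$ (for orthogonality), and $\mathbf{u}_j\cdot\mathbf{u}_j = 1$ for each $j$ (so $\|\mathbf{u}_j\| = 1$, so the set is normalized). The symbol
\[ \delta_{ij} = \begin{cases} 1, & i = j \\ 0, & i \ne j \end{cases} \tag{22} \]
is the Kronecker delta, after Leopold Kronecker (1823-1891). Thus, $\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$ is ON if and only if
\[ \mathbf{u}_i\cdot\mathbf{u}_j = \delta_{ij} \tag{23} \]
for $i = 1, 2, \ldots, k$ and $j = 1, 2, \ldots, k$.

EXAMPLE 8. Let
\[ \mathbf{u}_1 = (1, 0, 0, 0), \quad \mathbf{u}_2 = \left(0, \frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}}\right), \quad \mathbf{u}_3 = \left(0, \frac{1}{\sqrt{2}}, 0, -\frac{1}{\sqrt{2}}\right). \]
Then $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ is ON because $\|\mathbf{u}_1\| = \|\mathbf{u}_2\| = \|\mathbf{u}_3\| = 1$ and $\mathbf{u}_1\cdot\mathbf{u}_2 = \mathbf{u}_1\cdot\mathbf{u}_3 = \mathbf{u}_2\cdot\mathbf{u}_3 = 0$. ∎

Closure. [...] $\|\mathbf{u}\|$, and an angle [...] and $\mathbf{v}$. The definitions are designed as extensions of [...]

EXERCISES 9.5

1. Given the following vectors $\mathbf{u}$ and $\mathbf{v}$, determine $\|\mathbf{u}\|$, $\|\mathbf{v}\|$, and $\theta$ (in radians and degrees). If $\mathbf{u}$ and $\mathbf{v}$ are orthogonal, state that.
(a) $\mathbf{u} = (4, 3)$, $\mathbf{v} = (2, -1)$
(b) $\mathbf{u} = (1, 2, 3, 4)$, $\mathbf{v} = (-4, -3, -2, -1)$
(c) $\mathbf{u} = (3, 0, 1)$, $\mathbf{v} = (-2, 3, 6)$
(d) $\mathbf{u} = (2, 2, 2)$, $\mathbf{v} = (-4, -5, -6)$
(e) $\mathbf{u} = (2, 5)$, $\mathbf{v} = (10, -4)$
(f) $\mathbf{u} = (1, 2, 3, 4)$, $\mathbf{v} = (4, 3, 2, 1)$
(g) $\mathbf{u} = (3, 2, 0, -1, 1)$, $\mathbf{v} = (-5, 0, 0, 2, 4)$

2. State whether or not each of the following expressions is defined.
(a) $\|\mathbf{u}\|$
(b) $\mathbf{u}\cdot(\mathbf{v}\cdot\mathbf{w})$
(c) $\|(\mathbf{u}\cdot\mathbf{v})\mathbf{v}\|$
(d) $(\mathbf{u} + \mathbf{v})\cdot\mathbf{w}$
(e) $(\mathbf{u} + \mathbf{v})\cdot(\mathbf{u} - \mathbf{v})$
(f) $\mathbf{u} + 6(\mathbf{v}\cdot\mathbf{w})$
(g) $\cos^{-1}(2\mathbf{u} + \mathbf{v})$
(h) $\mathbf{w}/\|\mathbf{u}\|^2$
(i) $(7\mathbf{u})\cdot(2\mathbf{v})$
(j) $\|\mathbf{u} + 3\mathbf{u}^2\|$

3. Let us denote, as points in 2- and 3-space, $A = (2, 0)$, $B = (3, -1)$, $C = (5, 0)$, $D = (4, 2)$, $E = (2, 2)$, $F = (1, 3, -2)$, $G = (2, 0, 4)$, $H = (5, 4, 3)$, $I = (-3, -1, 0)$, and $J = (0, 0, 0)$. Determine, by vector methods, all interior angles and their sum, in degrees, for each of the following polygons.
(a) ABCA
(b) ABCDA
(c) ABCDEA
(d) BCDB
(e) BCDEB
(f) FGHF
(g) FGIF
(h) GHIG
(i) FGJF
(j) GHIG
(k) HIJH
(l) FIJF

4. (a)-(g) Normalize each pair of $\mathbf{u}$, $\mathbf{v}$ vectors in Exercise 1; that is, obtain $\hat{\mathbf{u}}$ and $\hat{\mathbf{v}}$.

5. If vectors $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$, represented as arrows, form a triangle such that $\mathbf{A} = \mathbf{B} + \mathbf{C}$, derive the law of cosines $C^2 = A^2 + B^2 - 2AB\cos\alpha$, where $\alpha$ is the interior angle between $\mathbf{A}$ and $\mathbf{B}$, and where $A$, $B$, $C$ are the lengths of $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$, respectively, by starting with the identity $\mathbf{C}\cdot\mathbf{C} = (\mathbf{A} - \mathbf{B})\cdot(\mathbf{A} - \mathbf{B})$.

6. (Orthogonalization) In each of the following, find scalars $\alpha$, $\beta$, $\gamma$ and vectors $\mathbf{u}_1$, $\mathbf{u}_2$, $\mathbf{u}_3$ such that $\mathbf{u}_1 = \mathbf{u}$, $\mathbf{u}_2 = \mathbf{u} + \alpha\mathbf{v}$, $\mathbf{u}_3 = \mathbf{u} + \beta\mathbf{v} + \gamma\mathbf{w}$ is a nonzero orthogonal set, that is, $\mathbf{u}_1\cdot\mathbf{u}_2 = 0$, $\mathbf{u}_1\cdot\mathbf{u}_3 = 0$, and $\mathbf{u}_2\cdot\mathbf{u}_3 = 0$, where $\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3 \ne \mathbf{0}$. If this is not possible, state that.
(a) $\mathbf{u} = (1, 3, 0)$, $\mathbf{v} = (2, 3, 0)$, $\mathbf{w} = (2, 1, -3)$
(b) $\mathbf{u} = (2, 0, -1)$, $\mathbf{v} = (1, 2, 3)$, $\mathbf{w} = (3, -2, -5)$
(c) $\mathbf{u} = (1, 0, 0)$, $\mathbf{v} = (2, 1, 0)$, $\mathbf{w} = (3, 2, -1)$
(d) $\mathbf{u} = (1, 2, 0, 1)$, $\mathbf{v} = (1, 0, 1, 1)$, $\mathbf{w} = (2, -1, 1, 1)$
(e) $\mathbf{u} = (1, 0, 0, 0)$, $\mathbf{v} = (1, 1, 0, 0)$, $\mathbf{w} = (1, 1, 1, 0)$
(f) $\mathbf{u} = (1, -1, 1, -1)$, $\mathbf{v} = (1, 2, 0, 1)$, $\mathbf{w} = (0, 2, 1, 0)$
(g) $\mathbf{u} = (1, 2)$, $\mathbf{v} = (0, 2)$, $\mathbf{w} = (1, -1)$
(h) $\mathbf{u} = (3, 0)$, $\mathbf{v} = (1, 1)$, $\mathbf{w} = (-1, 2)$

7. If $\mathbf{u} = (1, 3, -4, 2)$ and $\mathbf{v} = (2, 0, 0, 3)$, evaluate the following.
(a) $\|\mathbf{u} - \mathbf{v}\|$
(b) $\|3\mathbf{u} - 2\mathbf{v}\| + \|-\mathbf{v}\|$
(c) [...]
(d) $\|\mathbf{u}\| + \|\mathbf{v}\|$

8. Derive the following identities.
(a) $\|\mathbf{u} + \mathbf{v}\|^2 + \|\mathbf{u} - \mathbf{v}\|^2 = 2\|\mathbf{u}\|^2 + 2\|\mathbf{v}\|^2$
(b) $\|\mathbf{u} + \mathbf{v}\|^2 - \|\mathbf{u} - \mathbf{v}\|^2 = 4\,\mathbf{u}\cdot\mathbf{v}$
(c) Verify parts (a) and (b) for the case $\mathbf{u} = (2, 0, 1, 1)$ and $\mathbf{v} = (1, -3, 0, 2)$.

9. Find all nonzero vectors (if any) orthogonal to the following vectors.
(a) $(3, 0, 1)$
(b) $(2, 1, 1)$ and $(1, 2, 3)$
(c) $(1, 1, 0, -1)$
(d) $(1, 3, 4, 0)$ and $(2, -1, 0, 5)$
(e) $(6, -1, 2, 2)$, $(1, 4, 3, 0)$, and $(4, -9, -4, 2)$
(f) $(6, -1, 2, 2)$, $(1, 4, 3, 0)$, and $(4, -5, -4, 2)$
(g) $(1, -2, 0)$, $(2, 3, 1)$, and $(7, 0, 2)$
(h) $(2, 1, -1)$, $(1, 1, 1)$, and $(3, 2, 1)$

10. (Orthogonal separation) It is sometimes desired to separate a given nonzero vector $\mathbf{u}$ into the sum of two orthogonal vectors, one of which lies along a given nonzero vector $\mathbf{v}$, as sketched in part (a) of the accompanying figure. That is, $\mathbf{u} = \mathbf{u}_1 + \mathbf{u}_2$, where $\mathbf{u}_1$ is of the form $\alpha\mathbf{v}$, and $\mathbf{u}_2\cdot\mathbf{u}_1 = 0$. We call $\mathbf{u}_1$ the orthogonal projection of $\mathbf{u}$ on $\mathbf{v}$, and we call $\mathbf{u}_2$ the component of $\mathbf{u}$ orthogonal to $\mathbf{v}$.

[Figure: parts (a) and (b), orthogonal separation of u.]

(a) Show that $\mathbf{u}_1$ and $\mathbf{u}_2$ can be found, in terms of $\mathbf{u}$ and $\mathbf{v}$, as
\[ \mathbf{u}_1 = (\mathbf{u}\cdot\hat{\mathbf{v}})\hat{\mathbf{v}} \quad \text{and} \quad \mathbf{u}_2 = \mathbf{u} - \mathbf{u}_1, \tag{10.1} \]
where $\hat{\mathbf{v}} = \mathbf{v}/\|\mathbf{v}\|$. Is (10.1) valid only for 2- and 3-space, or does it hold, without modification, for $n$-space as well? Explain.
(b) Use (10.1) to carry out the separation for the cases where $\mathbf{u} = (1, 3)$ and $\mathbf{v} = (2, 4)$, and where $\mathbf{u} = (1, 3)$ and $\mathbf{v} = (-1, -2)$. Interpret your results graphically for each of these cases.
(c) Use (10.1) to carry out the separation for $\mathbf{u} = (2, 3, 1)$ and $\mathbf{v} = (0, 2, 3)$.
(d) Repeat part (c), for $\mathbf{u} = (1, 2, -1)$, $\mathbf{v} = (8, -1, 1)$.
(e) Repeat part (c), for $\mathbf{u} = (3, 0, 5, 6)$, $\mathbf{v} = (1, -2, 0, 4)$.
(f) Repeat part (c), for $\mathbf{u} = (2, 1, 0, 0, 3)$, $\mathbf{v} = (0, 0, 1, -2, 1)$.

11. (a) Prove the associative property $(\alpha\mathbf{u})\cdot\mathbf{v} = \alpha(\mathbf{u}\cdot\mathbf{v})$.
(b) Prove the distributive property $(\mathbf{u} + \mathbf{v})\cdot\mathbf{w} = \mathbf{u}\cdot\mathbf{w} + \mathbf{v}\cdot\mathbf{w}$.
(c) Prove that the linearity property (12c) is equivalent to the two properties given in parts (a) and (b).

12. (Direction cosines) The direction cosines of a vector $\mathbf{u} = (u_1, u_2, u_3)$ in 3-space are defined as $l_1 = \cos\alpha$, $l_2 = \cos\beta$, $l_3 = \cos\gamma$, where $\alpha$, $\beta$, $\gamma$ are the angles between $\mathbf{u}$ and the positive coordinate axes, as shown.
(a) Obtain general expressions for $l_1$, $l_2$, $l_3$ in terms of the components $u_1$, $u_2$, $u_3$.
(b) Evaluate $l_1$, $l_2$, $l_3$ for $\mathbf{u} = (2, -1, 5)$.
(c) Evaluate $l_1$, $l_2$, $l_3$ for $\mathbf{u} = (2, 4, 1)$.
(d) Evaluate $l_1$, $l_2$, $l_3$ for $\mathbf{u} = (4, 0, -3)$.
(e) Show that $l_1^2 + l_2^2 + l_3^2 = 1$.

13. If $\mathbf{u}\cdot\mathbf{v} = 0$ and $\mathbf{v}\cdot\mathbf{w} = 0$, does that imply that $\mathbf{u}\cdot\mathbf{w} = 0$? Prove or disprove. HINT: If a claim is true, it needs to be proved in general, that is, for all possible cases. But if it is false, it can be disproved merely by putting forward a single counterexample.

[...]nal.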
(a) $(1, 3)$, $(-6, 2)$, $(0, 0)$
(b) $(2, 3, 0)$, $(-3, 2, 1)$, $(1, 1, 1)$, $(1, -3, 1)$
(c) $(1, 0, 0, 0)$, $(0, 1, 0, 0)$, $(0, 0, 1, 0)$, $(0, 0, 0, 1)$
(d) $(1, 1, 1, 1)$, $(1, -1, 1, -1)$, $(0, 1, 0, -1)$, $(2, 0,$ [...]
(e) $(2, 1, -1, 1)$, $(1, 1, 3, 0)$, $(1, -1, 0, -1)$, ([...]$, -2, 0)$

Qa, - 2, +20,+12, =5 rg —-23 = 8 (f) vy + vg + 1223= 0 (g) a, ae Uy —~a — — 2 a3 =2 ~—223 = 5 (a) ay + 2t. Ly 2 — t3 = 8 +273 =0 (b) vy + v2 =0 Ly — (c) vi o + 223 = 0 — @ — 5243 = 0 to + 403 = 6

16. (Schwarz inequality) To make (15) as close to an equality as possible, and hence as informative as possible, we minimized the left-hand side by setting $d(\text{left-hand side})/d\alpha = 0$. That step gave $\alpha = -\mathbf{u}\cdot\mathbf{v}/\|\mathbf{v}\|^2$, and putting that result back into (15) gave the Schwarz inequality. That proof is valid for $\mathbb{R}^n$ for any $n$ ($\ge 1$). For the special case of $\mathbb{R}^2$, show that the optimal $\alpha$ is $-\mathbf{u}\cdot\mathbf{v}/\|\mathbf{v}\|^2$ by using a graphical approach; that is, using a suitable sketch. HINT: Given $\mathbf{u}$ and $\mathbf{v}$, make $\mathbf{u} + \alpha\mathbf{v}$ as short as possible.

9.6 Generalized Vector Space

9.6.1. Vector space. In Section 9.5 we generalize our vector concept from the familiar arrow vectors of 2- and 3-space to $n$-tuple vectors in abstract $n$-space, and it is $n$-space that is used in the remainder of this chapter and in Chapters 10-12. Yet, it is interesting to wonder if further generalization is possible. The answer is yes, and we will complete that story in this section. Far from being just a mathematical curiosity, the results will be essential in later chapters, when we study Fourier series, Sturm-Liouville theory, and partial differential equations.

The idea is as follows. In preceding sections we introduced the vectors and arithmetic rules for their manipulation, and then derived the various properties, such as $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$, $\mathbf{u} + \mathbf{0} = \mathbf{u}$, $\alpha(\beta\mathbf{u}) = (\alpha\beta)\mathbf{u}$, and so on. In generalizing, the essential idea is to reverse the cart and the horse. Specifically, we elevate the derived properties to axioms, or requirements, and regard the vectors as "objects," the nature of which is not restricted in advance.
They may be chosen to be $n$-tuples or whatever; all that we ask is that a plus (+) operation, a zero vector, a negative inverse, and scalar multiplication be defined such that all of the vector space axioms are satisfied. Thus:

DEFINITION 9.6.1 Vector Space
We call a (nonempty) set $S$ of "objects," which are denoted by boldface type and referred to as vectors, a vector space if the following requirements are met:

(i) An operation, which will be called vector addition and denoted as +, is defined between any two vectors in $S$ in such a way that if $\mathbf{u}$ and $\mathbf{v}$ are in $S$, then $\mathbf{u} + \mathbf{v}$ is too (i.e., $S$ is closed under addition). Furthermore,
\[ \mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}, \quad \text{(commutative)} \tag{1} \]
\[ (\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w}). \quad \text{(associative)} \tag{2} \]

(ii) $S$ contains a unique zero vector $\mathbf{0}$ such that
\[ \mathbf{u} + \mathbf{0} = \mathbf{u} \tag{3} \]
for each $\mathbf{u}$ in $S$.

(iii) For each $\mathbf{u}$ in $S$ there is a unique vector "$-\mathbf{u}$" in $S$, called the negative inverse of $\mathbf{u}$, such that
\[ \mathbf{u} + (-\mathbf{u}) = \mathbf{0}. \tag{4} \]
We denote $\mathbf{u} + (-\mathbf{v})$ as $\mathbf{u} - \mathbf{v}$ for brevity, but emphasize that it is actually the + operation between $\mathbf{u}$ and $-\mathbf{v}$.

(iv) Another operation, called scalar multiplication, is defined such that if $\mathbf{u}$ is any vector in $S$ and $\alpha$ is any scalar,* then the scalar multiple $\alpha\mathbf{u}$ is in $S$, too (i.e., $S$ is closed under scalar multiplication). Further, we require that
\[ \alpha(\beta\mathbf{u}) = (\alpha\beta)\mathbf{u}, \quad \text{(associative)} \tag{5} \]
\[ (\alpha + \beta)\mathbf{u} = \alpha\mathbf{u} + \beta\mathbf{u}, \quad \text{(distributive)} \tag{6} \]
\[ \alpha(\mathbf{u} + \mathbf{v}) = \alpha\mathbf{u} + \alpha\mathbf{v}, \quad \text{(distributive)} \tag{7} \]
\[ 1\mathbf{u} = \mathbf{u}, \tag{8} \]
if the vectors $\mathbf{u}$, $\mathbf{v}$ are in $S$, and $\alpha$, $\beta$ are scalars.

Observe that if we write $\mathbf{u} + \mathbf{v} + \mathbf{w}$, it is not clear whether we mean $(\mathbf{u} + \mathbf{v}) + \mathbf{w}$ (i.e., first add $\mathbf{u}$ and $\mathbf{v}$, and then add the result to $\mathbf{w}$) or $\mathbf{u} + (\mathbf{v} + \mathbf{w})$. However, the associative property (2) guarantees that it does not matter, so the parentheses can be omitted without ambiguity. Similarly, $\alpha\beta\mathbf{u}$ is unambiguous by virtue of (5).

EXAMPLE 1. $\mathbb{R}^n$-Space. Surely, the $n$-space $\mathbb{R}^n$, defined earlier, does constitute a vector space; after all, the axioms listed in Definition 9.6.1 come from the properties of $\mathbb{R}^n$ listed in Section 9.4.
Thus, there is no need to check to see if those axioms are satisfied. Instead, and for heuristic purposes, let us modify our addition operation from
\[ \mathbf{u} + \mathbf{v} \equiv (u_1 + v_1, \ldots, u_n + v_n) \tag{9} \]
to
\[ \mathbf{u} + \mathbf{v} \equiv (u_1 + 2v_1, \ldots, u_n + 2v_n), \tag{10} \]
and see if (10) works; that is, let us see if the vector space axioms listed under (i) in Definition 9.6.1 are still satisfied if we use (10) as our addition operation instead of (9). According to (10),
\[ \mathbf{v} + \mathbf{u} = (v_1 + 2u_1, \ldots, v_n + 2u_n) \tag{11} \]
so a comparison of (10) and (11) shows that the commutativity axiom (1) is satisfied only if $u_j + 2v_j = v_j + 2u_j$ ($j = 1, \ldots, n$), hence only if $v_j = u_j$, hence only if $\mathbf{v} = \mathbf{u}$. Since (1) does not hold for any chosen vectors $\mathbf{u}$ and $\mathbf{v}$, but only for vectors $\mathbf{u}$ and $\mathbf{v}$ that are equal, we conclude that if $\mathbf{u} + \mathbf{v}$ is defined by (10), then we do not have a vector space. Of course, it is possible that (10) violates other axioms besides (1), but one failure is sufficient to show that the set is not a legitimate vector space.

*We continue to restrict all scalars to be (finite) real numbers. Hence, we call the vector space a real vector space.

COMMENT. Observe that we have not shown that $\mathbf{u} + \mathbf{v}$ must be defined as in (9); conceivably,
\[ \mathbf{u} + \mathbf{v} \equiv (u_1^2 + v_1^2, \ldots, u_n^2 + v_n^2) \tag{12} \]
or
\[ \mathbf{u} + \mathbf{v} \equiv (u_1 - v_1, \ldots, u_n - v_n) \tag{13} \]
might work; that is, might satisfy the requirements listed under (i). Thus, understand that the plus signs on the left- and right-hand sides of (9) are not the same. The ones on the right denote the usual addition of real numbers (e.g., 2 + 5 = 7), whereas the one on the left is more exotic; it denotes a certain operation between vectors $\mathbf{u}$ and $\mathbf{v}$, which is being defined by (9), or (10), or (12), or (13). To emphasize that point we could use a different notation such as $\mathbf{u} * \mathbf{v}$, in place of $\mathbf{u} + \mathbf{v}$, as some authors do. However, having made that point let us continue to use $\mathbf{u} + \mathbf{v}$. ∎

$\mathbb{R}^n$ is but one example of a vector space. Many other useful spaces can be introduced by using objects other than $n$-tuples as the vectors.
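The failure of commutativity under the modified addition rule (10) is easy to see on a concrete pair of vectors. The sketch below is illustrative only; the function names `add_standard` and `add_modified` are hypothetical labels for rules (9) and (10), and the test vectors are arbitrary.

```python
def add_standard(u, v):
    """Componentwise addition, as in (9)."""
    return tuple(a + b for a, b in zip(u, v))

def add_modified(u, v):
    """The modified rule (10): u_j + 2 v_j in each component."""
    return tuple(a + 2 * b for a, b in zip(u, v))

u, v = (1, 0), (0, 1)
print(add_standard(u, v) == add_standard(v, u))   # True: (9) is commutative here
print(add_modified(u, v), add_modified(v, u))     # (1, 2) (2, 1): (10) is not
```

A single pair with $\mathbf{u} \ne \mathbf{v}$ is enough to serve as a counterexample; when $\mathbf{u} = \mathbf{v}$ the two orders agree, consistent with the argument in the text.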
For example, the vectors may be functions, matrices, or whatever, provided that vector addition, a zero vector, a negative inverse, and scalar multiplication are defined such that all of the vector space axioms are satisfied. For nowhere in Definition 9.6.1 is the nature of the vectors specified or in any way restricted.

EXAMPLE 2. A Function Space. This time, let the vectors be functions. Specifically, let $\mathbf{u} = u(x)$ be any continuous function defined on $0 \le x \le 1$, say. For the addition operation let
\[ \mathbf{u} + \mathbf{v} \equiv u(x) + v(x); \tag{14a} \]
that is, let $\mathbf{u} + \mathbf{v}$ be the function whose values are the ordinary sum $u(x) + v(x)$. For scalar multiplication let
\[ \alpha\mathbf{u} \equiv \alpha u(x); \tag{14b} \]
for the zero vector choose the zero function
\[ \mathbf{0} \equiv 0; \tag{14c} \]
and for the negative of $\mathbf{u}$ define
\[ -\mathbf{u} \equiv -u(x); \tag{14d} \]
that is, the function whose values are $-u(x)$.

With these definitions, we can verify that all of the vector space requirements are satisfied, so that the set $S$ of such vectors is a bona fide vector space. For instance, if $\mathbf{u} = u(x)$ and $\mathbf{v} = v(x)$ are continuous on $0 \le x \le 1$, then so is $\mathbf{u} + \mathbf{v} = u(x) + v(x)$, so $S$ is closed under addition. Further, $\mathbf{v} + \mathbf{u} = v(x) + u(x) = u(x) + v(x) = \mathbf{u} + \mathbf{v}$,* so addition satisfies the commutative property (1), and so on. This $S$ is but one example of a function space, a space in which the vectors are functions. ∎

The following theorem is useful, and its proof illustrates the axiomatic approach.

THEOREM 9.6.1 Properties of Scalar Multiplication
If $\mathbf{u}$ is any vector in a vector space $S$ and $\alpha$ is any scalar, then
\[ 0\mathbf{u} = \mathbf{0}, \tag{15a} \]
\[ (-1)\mathbf{u} = -\mathbf{u}, \tag{15b} \]
\[ \alpha\mathbf{0} = \mathbf{0}. \tag{15c} \]

Proof: These results follow from our definition of vector space. To prove (15a), one line of approach is as follows:
\[ \begin{aligned} 0\mathbf{u} + \mathbf{u} &= 0\mathbf{u} + 1\mathbf{u} && \text{by (8)} \\ &= (0 + 1)\mathbf{u} && \text{by (6)} \\ &= 1\mathbf{u} = \mathbf{u} && \text{by (8).} \end{aligned} \]
Then
\[ \begin{aligned} 0\mathbf{u} + \mathbf{u} + (-\mathbf{u}) &= \mathbf{u} + (-\mathbf{u}), \\ 0\mathbf{u} + \mathbf{0} &= \mathbf{0} && \text{by (4),} \\ 0\mathbf{u} &= \mathbf{0} && \text{by (3).} \end{aligned} \]
The remaining two, (15b) and (15c), are left for the exercises. ∎

9.6.2. Inclusion of inner product and/or norm. Observe that there is no mention of a dot product or a norm either in Definition 9.6.1 or in Examples 1 or 2.
Indeed, a vector space $S$ need not have a dot product (also called an inner product) or a norm defined for it. If it does have an inner product it is called an inner product space; if it has a norm it is called a normed vector space; and if it has both it is called a normed inner product space.

*The second equality holds because $v(x) + u(x)$ is the ordinary sum of two real numbers; e.g., 4 + 3 = 3 + 4.

If we do choose to introduce an inner product for $S$, how is it to be defined? Do you remember the idea of reversing the cart and the horse? That is how we do it. Equations (12a,b,c) in Section 9.5.2 were shown to be properties of the inner product $\mathbf{u}\cdot\mathbf{v} = u_1v_1 + \cdots + u_nv_n$. We now take those properties and elevate them to axioms, or requirements, that are to be satisfied by any inner product of any vector space. Similarly, we take the properties (17a,b,c) of the norm, in Section 9.5.3, and elevate them to axioms, or requirements, that are to be satisfied by any norm of any vector space. Let us tabulate them here:

REQUIREMENTS OF INNER PRODUCT
Commutative:
\[ \mathbf{u}\cdot\mathbf{v} = \mathbf{v}\cdot\mathbf{u}, \tag{16a} \]
Nonnegative:
\[ \mathbf{u}\cdot\mathbf{u} \;\begin{cases} > 0 & \text{for all } \mathbf{u} \ne \mathbf{0}, \\ = 0 & \text{for } \mathbf{u} = \mathbf{0}, \end{cases} \tag{16b} \]
Linear:
\[ (\alpha\mathbf{u} + \beta\mathbf{v})\cdot\mathbf{w} = \alpha(\mathbf{u}\cdot\mathbf{w}) + \beta(\mathbf{v}\cdot\mathbf{w}), \tag{16c} \]
and

REQUIREMENTS OF NORM
Scaling:
\[ \|\alpha\mathbf{u}\| = |\alpha|\,\|\mathbf{u}\|, \tag{17a} \]
Nonnegative:
\[ \|\mathbf{u}\| \;\begin{cases} > 0 & \text{for all } \mathbf{u} \ne \mathbf{0}, \\ = 0 & \text{for } \mathbf{u} = \mathbf{0}, \end{cases} \tag{17b} \]
Triangle inequality:
\[ \|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|. \tag{17c} \]

Let us illustrate.

EXAMPLE 3. $\mathbb{R}^n$-Space. If we wish to add an inner product to the vector space $\mathbb{R}^n$, we can use the choice
\[ \mathbf{u}\cdot\mathbf{v} \equiv u_1v_1 + \cdots + u_nv_n = \sum_{j=1}^{n} u_jv_j. \tag{18a} \]
We know that (18a) satisfies the requirements (16) because the latter were deduced, in Section 9.5.2, as properties that follow from (18a). A variation of (18a) that still satisfies (16) is (Exercise 6)
\[ \mathbf{u}\cdot\mathbf{v} \equiv w_1u_1v_1 + \cdots + w_nu_nv_n = \sum_{j=1}^{n} w_ju_jv_j, \tag{18b} \]
where the $w_j$'s are fixed positive constants known as "weights" because they attach more or less weight to the different components of $\mathbf{u}$ and $\mathbf{v}$. For instance, consider $\mathbb{R}^2$ and let $w_1 = 5$ and $w_2 = 3$.
Then if $\mathbf{u} = (2, -4)$ and $\mathbf{v} = (1, 6)$ we have $\mathbf{u}\cdot\mathbf{v} = 5(2)(1) + 3(-4)(6) = -62$.

Note that for (18b) to be a legitimate inner product we must have $w_j > 0$ for each $j$. For suppose, still in $\mathbb{R}^2$, that $w_1 = 3$ and $w_2 = -2$. Then, for $\mathbf{u} = (1, 5)$, say, we have $\mathbf{u}\cdot\mathbf{u} = 3(1)(1) - 2(5)(5) = -47 < 0$, in violation of (16b). Or, suppose that $w_1 = 3$ and $w_2 = 0$. Then, for $\mathbf{u} = (0, 4)$, say, we have $\mathbf{u}\cdot\mathbf{u} = 3(0)(0) + 0(4)(4) = 0$ even though $\mathbf{u} \ne \mathbf{0}$, again in violation of (16b).

Now, suppose that we wish to add a norm. If for any vector space $S$ we already have an inner product, then a legitimate norm can always be obtained from that inner product as $\|\mathbf{u}\| = \sqrt{\mathbf{u}\cdot\mathbf{u}}$, and that choice is called the natural norm. Thus, the natural norms corresponding to (18a) and (18b) are
\[ \|\mathbf{u}\| = \sqrt{\sum_{j=1}^{n} u_j^2} \quad \text{and} \quad \|\mathbf{u}\| = \sqrt{\sum_{j=1}^{n} w_ju_j^2}, \tag{19a,b} \]
respectively. However, we do not have to choose the natural norm. For instance, we could use (18a) as our inner product, and choose
\[ \|\mathbf{u}\| \equiv |u_1| + \cdots + |u_n| = \sum_{j=1}^{n} |u_j| \tag{20} \]
as our norm (Exercise 8). The latter is used by Struble in his book on differential equations,* probably because it is algebraically simpler than the Euclidean norm (19a) or the modified Euclidean norm (19b). Furthermore, he defines no inner product whatsoever. Struble calls (20) the taxicab norm since a taxicab driver judges the distance from the corner of 5th Avenue and 34th Street to the corner of 2nd Avenue and 49th Street as 18 blocks, not $\sqrt{234}$ blocks. ∎

*R. A. Struble, Nonlinear Differential Equations (New York: McGraw-Hill, 1962).

EXAMPLE 4. The Function Space of Example 2. How might we choose an inner product for the function space $S$ defined in Example 2? To motivate our choice, let us imagine approximating any given function (i.e., vector) $u(x)$ in $S$ in a piecewise-constant manner as depicted in Fig. 1. That is, divide the $x$ interval ($0 \le x \le 1$) into $n$ equal parts and define the approximating piecewise-constant function, over each subinterval, as the value of $u(x)$ at the left endpoint of that subinterval. If we represent the piecewise-constant function as the $n$-tuple $(u_1, \ldots, u_n)$, then we have, in a heuristic sense,
\[ u(x) \approx (u_1, \ldots, u_n). \tag{21} \]
Similarly, for any other function $v(x)$ in $S$,
\[ v(x) \approx (v_1, \ldots, v_n). \tag{22} \]

[Figure 1. Staircase approximation of u(x).]

The $n$-tuple vectors on the right-hand sides of (21) and (22) are members of $\mathbb{R}^n$. For that space, let us adopt the inner product
\[ (u_1, \ldots, u_n)\cdot(v_1, \ldots, v_n) \equiv \sum_{j=1}^{n} u_jv_j\,\Delta x, \tag{23} \]
that is, (18b) with all of the $w_j$ weights the same, namely, the subinterval width $\Delta x$. If we let $n \to \infty$, the "staircase approximations" approach $u(x)$ and $v(x)$, and the sum in (23) tends to the integral $\int_0^1 u(x)v(x)\,dx$. This heuristic reasoning suggests the inner product
\[ \langle u(x), v(x) \rangle \equiv \int_0^1 u(x)v(x)\,dx. \tag{24a} \]
We can denote it as $\mathbf{u}\cdot\mathbf{v}$ and call it the dot product, or we can denote it as $\langle u(x), v(x) \rangle$ and call it the inner product. For function spaces, the latter notation is somewhat standard, and is our choice in this text.

COMMENT 1. By no means do we claim our staircase idea to be a rigorous derivation of (24a). In fact, it is neither rigorous nor a derivation; it is heuristic motivation for the definition (24a). We leave it for the exercises to verify that (24a) does satisfy the requirements (16).

COMMENT 2. Just as (18b) is a legitimate generalization of (18a) (if $w_j > 0$ for $1 \le j \le n$), we expect that
\[ \langle u(x), v(x) \rangle \equiv \int_0^1 u(x)v(x)w(x)\,dx \tag{24b} \]
is a legitimate generalization of (24a) [if $w(x) > 0$ for $0 \le x \le 1$], proof of which claim is left for the exercises. The inner product (24b) is prominent when we study Fourier series and the Sturm-Liouville theory in Chapter 17.

COMMENT 3. Naturally, if we wish to define a norm as well, we could use a natural norm based on (24a) or (24b), for instance
\[ \|\mathbf{u}\| = \sqrt{\int_0^1 u^2(x)w(x)\,dx} \tag{25} \]
based on (24b).

COMMENT 4. Notice carefully that the concept of the dimension of a vector space has not yet been introduced, although it is in Section 9.10.
There, we define dimension and find that $\mathbb{R}^n$ is $n$-dimensional (which claim is probably not a great shock). Since the staircase approximation (21) becomes exact only as $n \to \infty$, it appears that our function space $S$ is infinite dimensional!

COMMENT 5. A bit of notation: the set of functions that are defined and continuous on $[0, 1]$ (i.e., $0 \le x \le 1$) is usually denoted as $C^0[0, 1]$. If not only are the functions continuous but also all derivatives through order $k$, then the set is denoted as $C^k[0, 1]$. ∎

Closure. Using $n$-space as a ladder, we complete our generalization of vector space by taking the properties of $\mathbb{R}^n$ (such as $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$) and turning them into the axioms, or requirements, to be met by any vector space. Thus, attention shifted from the objects, the vectors, to those requirements. There is no restriction on the nature of the vectors, which can be arrows, $n$-tuples, matrices, functions, or oranges. For us, the most important vector spaces are $\mathbb{R}^n$ and various function spaces; $\mathbb{R}^n$ is used in the remainder of this chapter and Chapters 10-12, and function spaces are used in Chapter 17 when we study Fourier series and Sturm-Liouville theory.

To illustrate the power of the axiomatic approach, recall the Schwarz inequality $|\mathbf{u}\cdot\mathbf{v}| \le \|\mathbf{u}\|\,\|\mathbf{v}\|$, proved in Section 9.5.2 for $\mathbb{R}^n$. That result holds for any normed inner product space with natural norm $\|\mathbf{u}\| = \sqrt{\mathbf{u}\cdot\mathbf{u}}$, for it followed from properties of $\mathbb{R}^n$, which properties are subsequently elevated to axioms for general vector space. Thus, it represents many properties rolled into one. For example, in $\mathbb{R}^n$, with the dot product (18a), it says
\[ \left|\sum_{j=1}^{n} u_jv_j\right| \le \sqrt{\sum_{j=1}^{n} u_j^2}\,\sqrt{\sum_{j=1}^{n} v_j^2}; \tag{26} \]
in the function space of Examples 2 and 4, with the inner product (24b) and norm (25), it says
\[ \left|\int_0^1 u(x)v(x)w(x)\,dx\right| \le \sqrt{\int_0^1 u^2(x)w(x)\,dx}\,\sqrt{\int_0^1 v^2(x)w(x)\,dx}, \tag{27} \]
and so on.

EXERCISES 9.6

1. Recall that $\mathbb{R}^n$ is the vector space ("real" vector space since all scalars are to be real numbers) in which the vectors are $n$-tuples $\mathbf{u} = (u_1, \ldots, u_n)$, with the definitions
\[ \mathbf{u} + \mathbf{v} = (u_1, \ldots, u_n) + (v_1, \ldots, v_n) \equiv (u_1 + v_1, \ldots, u_n + v_n), \tag{1.1} \]
\[ \mathbf{0} \equiv (0, \ldots, 0), \tag{1.2} \]
\[ -\mathbf{u} \equiv (-u_1, \ldots, -u_n), \tag{1.3} \]
\[ \alpha\mathbf{u} \equiv (\alpha u_1, \ldots, \alpha u_n). \tag{1.4} \]
If we make the following modifications, do we still have a vector space? If not, specify all requirements within Definition 9.6.1 that fail to be met.
(a) only vectors of the form $\mathbf{u} = (u, u, \ldots, u)$ admitted, where $-\infty < u < \infty$
(b) only vectors of the form $\mathbf{u} = (u, 2u, 3u, \ldots, nu)$ admitted, where $-\infty < u < \infty$
(c) only the vector $(0, \ldots, 0)$ admitted (this is an example of a zero vector space, a vector space containing only the zero vector)
(d) $\mathbf{u} + \mathbf{v} \equiv (u_1 - v_1, \ldots, u_n - v_n)$, in place of (1.1)
(e) $\mathbf{u} + \mathbf{v} \equiv (0, \ldots, 0)$ for all $\mathbf{u}$'s and $\mathbf{v}$'s, in place of (1.1)
(f) $\alpha\mathbf{u} \equiv (\alpha^2u_1, \ldots, \alpha^2u_n)$, in place of (1.4)

2. We noted in Example 1 that the definition (10) of vector addition violates axiom (1). Does it violate any others as well? Explain.

3. Prove (15b), that $(-1)\mathbf{u} = -\mathbf{u}$.

4. Prove (15c), that $\alpha\mathbf{0} = \mathbf{0}$.

5. Prove that if $\alpha\mathbf{u} = \mathbf{0}$ then $\alpha = 0$ and/or $\mathbf{u} = \mathbf{0}$.

6. Show that the inner product (18b) does satisfy the requirements (16).

7. We stated in Example 3 that if for any vector space $S$ we already have an inner product, then a legitimate norm can always be obtained from that inner product as $\|\mathbf{u}\| = \sqrt{\mathbf{u}\cdot\mathbf{u}}$, which choice is called the natural norm. Prove that claim.

8. Show that the "taxicab norm" (20) is a legitimate norm, that is, that it satisfies the requirements (17).

9. (a) Does the choice $\|\mathbf{u}\| \equiv \max_{1 \le j \le n} |u_j|$, for $\mathbb{R}^n$, satisfy the requirements (17)? Explain.
(b) How about $\|\mathbf{u}\| \equiv \min_{1 \le j \le n} |u_j|$, for $\mathbb{R}^n$?

10. Let $S$ be the set of real-valued polynomial functions, of degree $n$, defined on $a \le x \le b$. If $\mathbf{u} = a_0 + a_1x + \cdots + a_nx^n$ and $\mathbf{v} = b_0 + b_1x + \cdots + b_nx^n$ are any two such functions, and $\alpha$ is any (real) scalar, define the sum $\mathbf{u} + \mathbf{v}$ and the scalar multiple $\alpha\mathbf{u}$ as
\[ (\mathbf{u} + \mathbf{v})(x) \equiv (a_0 + b_0) + (a_1 + b_1)x + \cdots + (a_n + b_n)x^n, \]
\[ (\alpha\mathbf{u})(x) \equiv \alpha a_0 + \alpha a_1x + \cdots + \alpha a_nx^n, \]
respectively. Further, let $\mathbf{0}$ be the function $0 + 0x + \cdots + 0x^n$, and let $-\mathbf{u}$ be the function $-a_0 - a_1x - \cdots - a_nx^n$. Show that $S$ is a vector space.

11. Show that the inner product (24b) does satisfy the requirements (16).

12. (Schwarz inequality) We derive the Schwarz inequality
\[ |\mathbf{u}\cdot\mathbf{v}| \le \|\mathbf{u}\|\,\|\mathbf{v}\| \tag{12.1} \]
for $\mathbb{R}^n$ space in Section 9.5.2. The latter holds not only for $\mathbb{R}^n$ but for any normed inner product space with the natural norm $\|\mathbf{u}\| = \sqrt{\mathbf{u}\cdot\mathbf{u}}$. In this exercise we simply ask you to verify (12.1) by working out the left- and right-hand sides for these specific cases:
(a) $\mathbf{u} = (3, 1, -1, 0)$ and $\mathbf{v} = (1, 2, 5, -4)$ in $\mathbb{R}^4$, with the inner product (18a)
(b) $\mathbf{u} = (1, 2, 4, -3)$ and $\mathbf{v} = (0, 4, 1, 1)$ in $\mathbb{R}^4$, with $\mathbf{u}\cdot\mathbf{v} \equiv u_1v_1 + 5u_2v_2 + 3u_3v_3 + 2u_4v_4$
(c) $\mathbf{u} = (1, 1, 1, 1, 1)$ and $\mathbf{v} = (2, 2, 2, 2, 2)$ in $\mathbb{R}^5$, with $\mathbf{u}\cdot\mathbf{v} \equiv u_1v_1 + 2u_2v_2 + 3u_3v_3 + 4u_4v_4 + 5u_5v_5$
(d) $\mathbf{u} = 2 + [\ldots]$ and $\mathbf{v} = 3x^{[\ldots]}$ in the function space of Example 4, with the inner product $\mathbf{u}\cdot\mathbf{v} = \langle u(x), v(x) \rangle \equiv \int_0^1 u(x)v(x)\,dx$
(e) Same as (d), but with $\langle u(x), v(x) \rangle \equiv \int_0^1 u(x)v(x)(2 + 5x)\,dx$

13. (Solution space) (a) Consider a set of $m$ linear homogeneous algebraic equations in the $n$ unknowns $x_1, \ldots, x_n$ and denote each solution of the system as an $n$-tuple vector $\mathbf{x} = (x_1, \ldots, x_n)$ in $\mathbb{R}^n$. Show that the set of all such vectors, with the usual definitions [$\mathbf{u} + \mathbf{v} \equiv (u_1 + v_1, \ldots, u_n + v_n)$, $\alpha\mathbf{u} \equiv (\alpha u_1, \ldots, \alpha u_n)$, $-\mathbf{u} \equiv (-u_1, \ldots, -u_n)$, $\mathbf{0} \equiv (0, \ldots, 0)$], is a vector space. That space is called the solution space of the system.
(b) If the system is nonhomogeneous, is the set of solutions still a vector space? Explain.

14. (Solution space) Show that the solutions of a linear homogeneous differential equation (with the same definitions of $\mathbf{u} + \mathbf{v}$, $\alpha\mathbf{u}$, $-\mathbf{u}$, and $\mathbf{0}$ as in Example 2) constitute a vector space, the so-called solution space of that differential equation.

9.7 Span and Subspace

Here, we begin a sequence of closely related ideas: span, linear dependence, basis, expansion, and dimension. The concepts, definitions, and theorems hold for any vector space, but our illustrative examples are restricted to the $n$-space $\mathbb{R}^n$, this being the case of most interest in Chapters 9-12. We begin with the idea of the "span" of a set of vectors.

DEFINITION 9.7.1 Span
If $\mathbf{u}_1, \ldots, \mathbf{u}_k$ are vectors in a vector space $S$, then the set of all linear combinations of these vectors, that is, all vectors of the form
\[ \mathbf{u} = \alpha_1\mathbf{u}_1 + \cdots + \alpha_k\mathbf{u}_k, \tag{1} \]
where $\alpha_1, \ldots, \alpha_k$ are scalars, is called the span of $\mathbf{u}_1, \ldots, \mathbf{u}_k$ and is denoted as span $\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$. The set $\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$ is called the generating set of span $\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$.

Let us illustrate with some vector sets in $\mathbb{R}^2$ and $\mathbb{R}^3$ so we can support the discussion with diagrams.

EXAMPLE 1. Determine the span of the single vector
\[ \mathbf{u}_1 = (4, 2) \tag{2} \]
in $\mathbb{R}^2$. Span $\{\mathbf{u}_1\}$ is the set of all vectors that are scalar multiples of $\mathbf{u}_1$. Hence, span $\{\mathbf{u}_1\}$ is the set of all vectors on the line $L$ in Fig. 1, such as $\mathbf{u} = 2\mathbf{u}_1 = (8, 4)$, $\mathbf{v} = -\frac{1}{2}\mathbf{u}_1 = (-2, -1)$, and $\mathbf{0} = 0\mathbf{u}_1 = (0, 0)$. We say that $\mathbf{u}_1$ generates the line $L$. ∎

[Figure 1. Span{u1}.]

EXAMPLE 2. Determine the span of the two vectors
\[ \mathbf{u}_1 = (4, 2), \quad \mathbf{u}_2 = (-8, -4). \tag{3} \]
Span $\{\mathbf{u}_1, \mathbf{u}_2\}$ is, once again, the line $L$ in Fig. 1 (i.e., the set of all vectors on $L$), for both $\mathbf{u}_1$ and $\mathbf{u}_2$ lie along $L$, so any linear combination of them, $\alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2$, does too. Similarly, span $\{(4, 2), (-8, -4), (18, 9), (0, 0)\}$ is the line $L$. ∎

Observe that the line $L$, in Examples 1 and 2, is only a subset of the vector space $\mathbb{R}^2$. Observe that that subset of $\mathbb{R}^2$ is itself a vector space, a so-called "subspace" of $\mathbb{R}^2$.
For if $\mathbf{u}$ and $\mathbf{v}$ are any two vectors on $L$, then $\mathbf{u} + \mathbf{v}$ is on $L$, too, so the set is closed under addition; similarly, if $\mathbf{u}$ is on $L$, so is $\alpha\mathbf{u}$, for any scalar $\alpha$, so the set is closed under scalar multiplication; $L$ does contain the zero vector [since we can set all the $\alpha$'s in (1) equal to zero]; and for each $\mathbf{u}$ on $L$ there is a (unique) vector $-\mathbf{u}$ on $L$ such that $\mathbf{u} + (-\mathbf{u}) = \mathbf{0}$.

DEFINITION 9.7.2 Subspace
If a subset $T$ of a vector space $S$ is itself a vector space (with the same definitions as $S$ for vector addition $\mathbf{u} + \mathbf{v}$, scalar multiplication $\alpha\mathbf{u}$, zero vector $\mathbf{0}$, and negative vector $-\mathbf{u}$), then $T$ is a subspace of $S$.

Usually, a subspace of $S$ is only a part of $S$, as the line $L$ is only a part of $\mathbb{R}^2$, but since a subset of a set can be all of that set, a subspace of $S$ can be all of $S$. For instance, $\mathbb{R}^2$ is a subspace of $\mathbb{R}^2$.

THEOREM 9.7.1 Span as Subspace
If $\mathbf{u}_1, \ldots, \mathbf{u}_k$ are vectors in a vector space $S$, then span $\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$ is itself a vector space, a subspace of $S$.

For instance, the line $L$ in Fig. 1 is a subspace of $\mathbb{R}^2$. Proof of Theorem 9.7.1 is left for the exercises.

EXAMPLE 3. Is the span of
\[ \mathbf{u}_1 = (5, 1), \quad \mathbf{u}_2 = (1, 3) \tag{4} \]
all of $\mathbb{R}^2$ or only a part of $\mathbb{R}^2$? To determine the extent of span $\{\mathbf{u}_1, \mathbf{u}_2\}$, let $\mathbf{v} = (v_1, v_2)$ be any given vector in $\mathbb{R}^2$, and try to express
\[ \mathbf{v} = \alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2. \tag{5} \]
That is,
\[ (v_1, v_2) = \alpha_1(5, 1) + \alpha_2(1, 3) = (5\alpha_1, \alpha_1) + (\alpha_2, 3\alpha_2) = (5\alpha_1 + \alpha_2, \alpha_1 + 3\alpha_2). \tag{6} \]
Equating components, we obtain the linear equations
\[ \begin{aligned} 5\alpha_1 + \alpha_2 &= v_1, \\ \alpha_1 + 3\alpha_2 &= v_2 \end{aligned} \tag{7} \]
in $\alpha_1$, $\alpha_2$. Applying Gauss elimination, (7) becomes
\[ \begin{aligned} \alpha_1 + \tfrac{1}{5}\alpha_2 &= \tfrac{1}{5}v_1, \\ \alpha_2 &= \tfrac{5}{14}v_2 - \tfrac{1}{14}v_1. \end{aligned} \tag{8} \]
It is clear from the Gauss-reduced form (8) that the system is consistent (solvable for $\alpha_1$, $\alpha_2$) for every vector $\mathbf{v}$ in $\mathbb{R}^2$. Hence, we may conclude that span $\{\mathbf{u}_1, \mathbf{u}_2\}$ is all of $\mathbb{R}^2$; we say that $\{\mathbf{u}_1, \mathbf{u}_2\}$ spans $\mathbb{R}^2$. (Here we use "span" as a verb; in Definition 9.7.1 it is introduced as a noun.) Thus, every $\mathbf{v}$ in $\mathbb{R}^2$ can be expressed as a linear combination of the vectors $\mathbf{u}_1$ and $\mathbf{u}_2$.
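The solvability question in (5) through (8) can also be explored in code. The sketch below uses Cramer's rule rather than the Gauss elimination of the text, purely because it is short for a 2-by-2 system; the function name `coefficients_2d` and the test vector $(7, 7)$ are illustrative assumptions, and integer components are assumed so that exact fractions can be used.

```python
from fractions import Fraction

def coefficients_2d(u1, u2, v):
    """Solve v = a1*u1 + a2*u2 in R^2 by Cramer's rule (integer components assumed).

    Returns (a1, a2) as exact fractions, or None when the determinant
    vanishes, that is, when u1 and u2 are parallel and cannot span R^2.
    """
    det = u1[0] * u2[1] - u2[0] * u1[1]
    if det == 0:
        return None
    a1 = Fraction(v[0] * u2[1] - u2[0] * v[1], det)
    a2 = Fraction(u1[0] * v[1] - v[0] * u1[1], det)
    return a1, a2

# u1, u2 from (4); the target vector is an arbitrary illustration
print(coefficients_2d((5, 1), (1, 3), (7, 7)))      # a1 = 1, a2 = 2
# the parallel pair (4, 2), (-8, -4) generates only a line
print(coefficients_2d((4, 2), (-8, -4), (1, 0)))    # None
```

A nonzero determinant plays the same role here as the consistency of the Gauss-reduced system: when it is nonzero, coefficients exist for every $\mathbf{v}$ in $\mathbb{R}^2$.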
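As representative, let $\mathbf{v} = (6,-4)$ so $v_1 = 6$ and $v_2 = -4$. Then (8) gives $\alpha_2 = -\tfrac{13}{7}$ and $\alpha_1 = \tfrac{11}{7}$, so that (5) becomes
$$\mathbf{v} = \tfrac{11}{7}\mathbf{u}_1 - \tfrac{13}{7}\mathbf{u}_2. \tag{9}$$
To see this in graphical terms, observe from Fig. 2 that $\mathbf{v} = \mathbf{OA} + \mathbf{OB}$, where (with the aid of a scale) $\mathbf{OA} \approx 1.6\mathbf{u}_1$ and $\mathbf{OB} \approx -1.9\mathbf{u}_2$. Thus, $\mathbf{v} \approx 1.6\mathbf{u}_1 - 1.9\mathbf{u}_2$, in agreement with (9).

COMMENT. Suppose that we add $\mathbf{u}_3 = (2,2)$ to the set. It should be evident that span $\{\mathbf{u}_1,\mathbf{u}_2,\mathbf{u}_3\}$ is all of $\mathbb{R}^2$, again, since $\{\mathbf{u}_1,\mathbf{u}_2\}$ spanned $\mathbb{R}^2$ even "without any help" from $\mathbf{u}_3$. But in case this is not clear, let us go through steps analogous to steps (5) to (8):
$$\mathbf{v} = \alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \alpha_3\mathbf{u}_3 \tag{10}$$
so $(v_1,v_2) = (5\alpha_1 + \alpha_2 + 2\alpha_3,\ \alpha_1 + 3\alpha_2 + 2\alpha_3)$. Thus,
$$\begin{aligned} 5\alpha_1 + \alpha_2 + 2\alpha_3 &= v_1,\\ \alpha_1 + 3\alpha_2 + 2\alpha_3 &= v_2, \end{aligned}$$
or
$$\begin{aligned} \alpha_1 + \tfrac{1}{5}\alpha_2 + \tfrac{2}{5}\alpha_3 &= \tfrac{1}{5}v_1,\\ \alpha_2 + \tfrac{4}{7}\alpha_3 &= \tfrac{5}{14}v_2 - \tfrac{1}{14}v_1. \end{aligned} \tag{11}$$
Like (8), (11) is consistent for every $\mathbf{v}$ in $\mathbb{R}^2$ so $\{\mathbf{u}_1,\mathbf{u}_2,\mathbf{u}_3\}$ spans $\mathbb{R}^2$, as claimed. Whereas (8) had a unique solution so that the representation (5) was unique, (11) happens to have an infinity of solutions so that the representation (10) is not unique.

EXAMPLE 4. As a final example, consider the span of
$$\mathbf{u}_1 = (1,2,2), \quad \mathbf{u}_2 = (-1,0,2) \tag{12}$$
in $\mathbb{R}^3$. Setting
$$\mathbf{v} = \alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2, \tag{13}$$
we have
$$\begin{aligned} \alpha_1 - \alpha_2 &= v_1,\\ 2\alpha_1 &= v_2,\\ 2\alpha_1 + 2\alpha_2 &= v_3, \end{aligned}$$
or, after Gauss elimination,
$$\begin{aligned} \alpha_1 - \alpha_2 &= v_1,\\ \alpha_2 &= \tfrac{1}{2}v_2 - v_1,\\ 0 &= v_3 - 2v_2 + 2v_1. \end{aligned} \tag{14}$$
Now, span $\{\mathbf{u}_1,\mathbf{u}_2\}$ is the set of all possible vectors $\mathbf{v}$ given by (13), i.e., all vectors $\mathbf{v}$ for which the system (14) is consistent, i.e., all vectors $\mathbf{v} = (v_1,v_2,v_3)$ such that
$$2v_1 - 2v_2 + v_3 = 0 \tag{15}$$
[so that the last of equations (14) is $0 = 0$ rather than a contradiction].

[Figure 3. $\mathbf{u}_1$ and $\mathbf{u}_2$.]

In geometrical terms, on the other hand, span $\{\mathbf{u}_1,\mathbf{u}_2\}$ should be the subset of $\mathbb{R}^3$ consisting of the plane that passes through $\mathbf{u}_1$ and $\mathbf{u}_2$ ($\mathbf{u}_1$ and $\mathbf{u}_2$ are shown in Fig. 3). How does that fact correlate with (15)? As a matter of fact, (15) is the equation of a plane in 3-space, and that plane does pass through the origin, through the tip of $\mathbf{u}_1$ [i.e., the point $(1,2,2)$], and through the tip of $\mathbf{u}_2$ [the point $(-1,0,2)$].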
Hence, it is the plane through $\mathbf{u}_1$ and $\mathbf{u}_2$, so the analytical approach, namely, steps (13) to (15), and our geometrical interpretation are in agreement. We conclude that span $\{\mathbf{u}_1,\mathbf{u}_2\}$ is not all of $\mathbb{R}^3$; it is only the subspace of $\mathbb{R}^3$ consisting of the plane (i.e., all vectors in the plane) containing the given vectors $\mathbf{u}_1$ and $\mathbf{u}_2$.

COMMENT. Since span $\{\mathbf{u}_1,\mathbf{u}_2\}$ is a plane, would it be correct to say that span $\{\mathbf{u}_1,\mathbf{u}_2\}$ is $\mathbb{R}^2$? No, that would be incorrect; $\mathbb{R}^2$ is made up of two-tuples, while the vectors in the above-mentioned plane are three-tuples. Thus, $\mathbb{R}^2$ space is not relevant in this problem. All that can be said here is that span $\{\mathbf{u}_1,\mathbf{u}_2\}$ is the subspace of $\mathbb{R}^3$ consisting of the plane containing the vectors $\mathbf{u}_1$ and $\mathbf{u}_2$, that is, the plane defined by (15).

Closure. In leading up to the concept of bases and expansions, the two key ideas are span and linear independence. In this section we introduce the idea of span; in the next section we introduce linear dependence and linear independence. Although the concept of span holds for any vector space, such as $\mathbb{R}^8$, we suggest that you focus on the foregoing examples in two- and three-spaces, so that you can use the two- and three-dimensional drawings to promote understanding.

EXERCISES 9.7

1. Show whether the vectors
(a) (1,0,...,0), (0,1,0,...,0), ..., (0,...,0,1) span $\mathbb{R}^n$
(b) (0,0,0,1), (0,0,1,1), (0,1,1 ,2) (1,4,1 , 1) span
(c) (1,1,2,3) 1),(0,0,0,0), 1, -1),(0, 1,0, 0,4),(2,3, (1,2, span $\mathbb{R}^4$
(d) (1,3,2,2),(5,ma,0),(-1,-2,4,3) span $\mathbb{R}^4$

That solution space is a subspace of $\mathbb{R}^n$. To illustrate, consider the simple system $x_1 + 3x_2 = 0$; that is, $m = 1$, $n = 2$, $a_{11} = 1$, and $a_{12} = 3$. The solution is $x_2 = \alpha$ (arbitrary), $x_1 = -3\alpha$, or $\mathbf{x} = (x_1,x_2) = \alpha(-3,1)$ so the solution space is the span of the vector $(-3,1)$, that is, span $\{(-3,1)\}$. In this manner, determine the solution space for each of the following examples.
(f) (1,1,2), ( a) (2, 1,0), (-1,0,3) span R° (g)(2,0,3), (1, 2,4), (—5,2, -2)spanR3 (h)(1,3,0), (2,-1, 1),(1, 1,4) spanR° (a) vy — 22 +4e3 = 0 in RY (b) 21)+29 +23 —24 = O0inR* (Cc) x, — 2 + «#3= 0 (e) (1, (2,1, -1), (1, 2, —5) span IR? 0,1), (i) (—1,2,4),(5, 2, -2), (2, 0,3), (1,2,3) spanR3 (j)(0,0,0), (2,1,4), (-1, 3,5) spanR4 (k) (2,1,3), (1,-1, 2) span‘RS (1)(2,1,-1),(1,3, 1),(5,5, —1), (0,5, 3)span IR? (m)(—4,1,0), (2,2,2), (1,2,3) spanR® (n)(~3, 1,0),(1,1,1),(1,7,5)span R? (0) (1,2), (2.1) (45) span Re (e) = 0 xy — fo + 23 —-24K4 + 24 + 2¢5 = 0 inR® Ly — £2 £3 + t4 = 0 + 205 + 24 = 0 inR? Ly — £4 = 0 Uy + 2x2 (b) Sketch any three such vectors. (c) Sketch any four such vectors. (g) Uy + fo 4+ 223 _ in IR* 0 = 23 T+ 2. (a)Sketch any two vectors that span the space of all vectors in the plane of the paper. vector sets subspaces ofR?? vy, + 38a0 - (f) ay + a2 - 23 + tg = 0 (0)(1,2), (2,1) spanR? 3. Are the following ry + to + ty = 0 in R® (d) = 0 204 +25 Ly + Lo + 2x3 =0 224 + #5= 0 in R® 5. Find any two vectors in R° that span the plane (See accompanying figure.) Explain. (a) vy 209 + 4x3 t+ 524 =0 (c) =0 (b) 22, + vg — 623 = 0 (d) v1, +42q +23 =0 (f) 38a, — v2 —%3 = 0 (a) the straight line D that extends from the origin to infinity (b) the wedge-shaped region (including its boundary lines) that 6. Show whether the given sets are identical. Explain. extends to infinity in both directions (c) the upper half plane zw.> 0 (a)span {(2,-1, -1), (3, 1,0)} andspan{(2, —1,—1),(5,5, 2) (b) span {(1 23 ,(2,-1, 1)} and span{(1, 2,3), (3, 1,5)} ‘ (c) span{(4, 1, , (d)span{(1, 2, —1), (3, 0, 0)} andspan{(1, 0, 0), (1,3, 0)} (1,0,1,2 —1,1,1,0)} and span{(0,1,2, ), (f) span{(1, 0, 1 span {(2,0, —1,0), -1, 2, 3), (4,3, 2,1)} (0, (g) span{(1,0, 1,1), (2,1,1,0),(1,2, 2, 1)} and span{(2, —10,0), (1, -2,0, 1), (3,5,4, 1} (h) span{(1,2, 3, 0), (0, 1,0,2), (2, 3,0, 1)} and span {(1, 0, 3, —1),(-1,1,3,3), (1, 2,1, 1)} 7. 
Find any two ON (orthonormal) vectors in
(a) Span{0,22)), (6, -1)}
(b) span{(1,oe (2,-1,3)}
(c) span {(1,-1,0), (1,2,3)}
(d) span {(2,1,0), (0,1,2)}
(e) span {(1,1,0,1), (0,2,-1,1)}
(f) span {(-2,3,1,1), (0,2,-1,1)}

4. (Solution space) First, review Exercise 13a in Section 9.6.

8. Prove Theorem 9.7.1.

9.8 Linear Dependence

The definition of the linear dependence or independence of a set of vectors is essentially identical to Definition 3.2.1 for a set of functions, with the word "functions" changed to "vectors:"

DEFINITION 9.8.1 Linear Dependence and Linear Independence
A set of vectors $\{\mathbf{u}_1,\ldots,\mathbf{u}_k\}$ is said to be linearly dependent if at least one of them can be expressed as a linear combination of the others. If none can be so expressed, then the set is linearly independent.

Thus, we urge you to review Section 3.2 in conjunction with your study of this section. As in Chapter 3, we frequently use the abbreviations LD and LI to stand for linearly dependent and linearly independent, respectively.

EXAMPLE 1. Let $\mathbf{u}_1 = (1,0)$, $\mathbf{u}_2 = (1,1)$, and $\mathbf{u}_3 = (5,4)$. These are LD since, by inspection, we can express $\mathbf{u}_3$ as a linear combination of $\mathbf{u}_1$ and $\mathbf{u}_2$: $\mathbf{u}_3 = \mathbf{u}_1 + 4\mathbf{u}_2$. (Alternatively, we could express $\mathbf{u}_2 = \tfrac{1}{4}\mathbf{u}_3 - \tfrac{1}{4}\mathbf{u}_1$ or $\mathbf{u}_1 = -4\mathbf{u}_2 + \mathbf{u}_3$.)

EXAMPLE 2. Let $\mathbf{u}_1 = (1,0)$ and $\mathbf{u}_2 = (1,1)$. These are LI since $\mathbf{u}_1$ cannot be expressed as a "linear combination of the others," namely, as a scalar multiple of $\mathbf{u}_2$, nor can $\mathbf{u}_2$ be expressed as a scalar multiple of $\mathbf{u}_1$.

EXAMPLE 3. Let $\mathbf{u}_1 = (2,-1)$, $\mathbf{u}_2 = (0,0)$, and $\mathbf{u}_3 = (0,1)$. These are LD since we can express $\mathbf{u}_2 = 0\mathbf{u}_1 + 0\mathbf{u}_3$. (The fact that we cannot express $\mathbf{u}_1$ as a linear combination of $\mathbf{u}_2$ and $\mathbf{u}_3$, nor $\mathbf{u}_3$ as a linear combination of $\mathbf{u}_1$ and $\mathbf{u}_2$, does not alter our conclusion, for recall the words "at least one" in the definition.)

It is implicit in Definition 9.8.1 that $\mathbf{u}_1,\ldots,\mathbf{u}_k$ are all members of the same vector space; in Examples 1 to 3 that space was $\mathbb{R}^2$.
Thus, it would make no sense to ask whether $\mathbf{u}_1 = (2,5)$ and $\mathbf{u}_2 = (4,3,0,1)$ are linearly dependent or not since $\mathbf{u}_1$ is a member of $\mathbb{R}^2$ while $\mathbf{u}_2$ is a member of $\mathbb{R}^4$.

The preceding examples are simple enough to be worked by inspection. In more complicated cases, the following theorem provides a systematic approach for determining whether a given vector set is linearly dependent or linearly independent.

THEOREM 9.8.1 Test for Linear Dependence / Independence
A finite set of vectors $\{\mathbf{u}_1,\ldots,\mathbf{u}_k\}$ is LD if and only if there exist scalars $\alpha_j$, not all zero, such that
$$\alpha_1\mathbf{u}_1 + \cdots + \alpha_k\mathbf{u}_k = \mathbf{0}; \tag{1}$$
if (1) holds only if all the $\alpha_j$'s are zero, then the set is LI.

Proof is essentially the same as for Theorem 3.2.1.

EXAMPLE 4. Consider the 4-tuples
$$\mathbf{u}_1 = (2,0,1,-3), \quad \mathbf{u}_2 = (0,1,1,1), \quad \mathbf{u}_3 = (2,2,3,0). \tag{2}$$
To see if these vectors are LI or LD, appeal directly to (1):
$$\alpha_1(2,0,1,-3) + \alpha_2(0,1,1,1) + \alpha_3(2,2,3,0) = (0,0,0,0), \tag{3}$$
or $(2\alpha_1 + 2\alpha_3,\ \alpha_2 + 2\alpha_3,\ \alpha_1 + \alpha_2 + 3\alpha_3,\ -3\alpha_1 + \alpha_2) = (0,0,0,0)$. Thus,
$$\begin{aligned} 2\alpha_1 + 2\alpha_3 &= 0,\\ \alpha_2 + 2\alpha_3 &= 0,\\ \alpha_1 + \alpha_2 + 3\alpha_3 &= 0,\\ -3\alpha_1 + \alpha_2 &= 0. \end{aligned} \tag{4}$$
Applying Gauss elimination yields
$$\begin{aligned} 2\alpha_1 + 2\alpha_3 &= 0,\\ \alpha_2 + 2\alpha_3 &= 0,\\ \alpha_3 &= 0,\\ 0 &= 0. \end{aligned} \tag{5}$$
This system admits only the trivial solution, $\alpha_1 = \alpha_2 = \alpha_3 = 0$, so $\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3$ are LI.

EXAMPLE 5. Consider the 3-tuples
$$\mathbf{u}_1 = (1,0,1), \quad \mathbf{u}_2 = (1,1,1), \quad \mathbf{u}_3 = (1,1,2), \quad \mathbf{u}_4 = (1,2,1). \tag{6}$$
Working from (1), as in Example 4, we have
$$\begin{aligned} \alpha_1 + \alpha_2 + \alpha_3 + \alpha_4 &= 0,\\ \alpha_2 + \alpha_3 + 2\alpha_4 &= 0,\\ \alpha_1 + \alpha_2 + 2\alpha_3 + \alpha_4 &= 0, \end{aligned} \tag{7}$$
or, after Gauss elimination,
$$\begin{aligned} \alpha_1 + \alpha_2 + \alpha_3 + \alpha_4 &= 0,\\ \alpha_2 + \alpha_3 + 2\alpha_4 &= 0,\\ \alpha_3 &= 0. \end{aligned} \tag{8}$$
This time, there exist nontrivial solutions for the $\alpha_j$'s so the vectors $\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3, \mathbf{u}_4$ are LD. (Specifically, (8) gives $\alpha_3 = 0$, $\alpha_4 = \alpha$, $\alpha_2 = -2\alpha$, $\alpha_1 = \alpha$ where $\alpha$ is arbitrary. With $\alpha = 1$, say, (1) becomes $\mathbf{u}_1 - 2\mathbf{u}_2 + 0\mathbf{u}_3 + \mathbf{u}_4 = \mathbf{0}$.)

We conclude this section with four modest theorems, the first three being essentially the same as Theorems 3.2.4-3.2.6 for functions.
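Before turning to those theorems, note that the test of Theorem 9.8.1, as carried out by hand in Examples 4 and 5, can be sketched in code. The sketch below is one possible arrangement, not a prescribed method: it forms the homogeneous system (1) with the given vectors as columns, applies Gauss elimination in exact rational arithmetic, and declares the set LI when every unknown $\alpha_j$ acquires a pivot (so that only the trivial solution exists). The names `count_pivots` and `is_linearly_independent` are illustrative assumptions.

```python
from fractions import Fraction

def count_pivots(rows):
    """Gauss-eliminate a matrix (list of rows) and return the number of pivots."""
    rows = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(rows[0]) if rows else 0
    pivots = 0
    for col in range(n_cols):
        # Look for a row, at or below the current pivot position,
        # with a nonzero entry in this column.
        pivot_row = next(
            (r for r in range(pivots, len(rows)) if rows[r][col] != 0), None
        )
        if pivot_row is None:
            continue  # no pivot in this column: the unknown is free
        rows[pivots], rows[pivot_row] = rows[pivot_row], rows[pivots]
        for r in range(pivots + 1, len(rows)):
            factor = rows[r][col] / rows[pivots][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivots])]
        pivots += 1
    return pivots

def is_linearly_independent(vectors):
    """True when alpha_1 u_1 + ... + alpha_k u_k = 0 has only the trivial solution."""
    # Row i of the coefficient matrix holds the i-th components of all
    # vectors, so column j corresponds to the unknown alpha_j.
    coefficient_matrix = [list(components) for components in zip(*vectors)]
    return count_pivots(coefficient_matrix) == len(vectors)

example_4 = [(2, 0, 1, -3), (0, 1, 1, 1), (2, 2, 3, 0)]
example_5 = [(1, 0, 1), (1, 1, 1), (1, 1, 2), (1, 2, 1)]
```

For the vectors of Example 4 all three unknowns receive a pivot, whereas for Example 5 there are four unknowns but only three equations, so at least one unknown must remain free and the set is LD.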
THEOREM 9.8.2 Linear Dependence / Independence of Two Vectors
A set of two vectors $\{\mathbf{u}_1,\mathbf{u}_2\}$ is LD if and only if one is expressible as a scalar multiple of the other.

THEOREM 9.8.3 Linear Dependence of Sets Containing the Zero Vector
A set containing the zero vector is LD.

THEOREM 9.8.4 Equating Coefficients
Let $\{\mathbf{u}_1,\ldots,\mathbf{u}_k\}$ be LI. Then, for
$$a_1\mathbf{u}_1 + \cdots + a_k\mathbf{u}_k = b_1\mathbf{u}_1 + \cdots + b_k\mathbf{u}_k$$
to hold, it is necessary and sufficient that $a_j = b_j$ for each $j = 1,\ldots,k$. That is, the coefficients of corresponding vectors on the left- and right-hand sides must match.

THEOREM 9.8.5 Orthogonal Sets
Every finite orthogonal set of (nonzero) vectors is LI.

Proof of Theorem 9.8.5: Dot $\mathbf{u}_1$ into both sides of
$$\alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \cdots + \alpha_k\mathbf{u}_k = \mathbf{0}. \tag{9}$$
In other words,
$$\begin{aligned} \mathbf{u}_1\cdot(\alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \cdots + \alpha_k\mathbf{u}_k) &= \mathbf{u}_1\cdot\mathbf{0},\\ \alpha_1\,\mathbf{u}_1\cdot\mathbf{u}_1 + \alpha_2\,\mathbf{u}_1\cdot\mathbf{u}_2 + \cdots + \alpha_k\,\mathbf{u}_1\cdot\mathbf{u}_k &= 0,\\ \alpha_1\|\mathbf{u}_1\|^2 + 0 + \cdots + 0 &= 0. \end{aligned} \tag{10}$$
Now $\mathbf{u}_1 \neq \mathbf{0}$ implies that $\|\mathbf{u}_1\| \neq 0$ so it follows from (10) that $\alpha_1 = 0$. Similarly, dotting $\mathbf{u}_2$ into (9) gives $\alpha_2 = 0$, and so on. Since $\alpha_1 = \alpha_2 = \cdots = \alpha_k = 0$, the $\mathbf{u}_j$'s must be LI, as claimed.

EXAMPLE 6. The set $\{(2,1), (1,5)\}$ in $\mathbb{R}^2$ is LI because neither vector can be expressed as a scalar multiple of the other.

EXAMPLE 7. Let $\mathbf{u}_1 = (4,-1,1,2)$, $\mathbf{u}_2 = (3,0,2,5)$, $\mathbf{u}_3 = (0,0,0,0)$ in $\mathbb{R}^4$. The set is LD, according to Theorem 9.8.3, because it contains the zero vector $\mathbf{u}_3 = \mathbf{0}$. That is, $\mathbf{u}_3$ can be expressed as a linear combination of $\mathbf{u}_1$ and $\mathbf{u}_2$: $\mathbf{u}_3 = 0\mathbf{u}_1 + 0\mathbf{u}_2$. If the preceding sentence is not clear, rewrite the equation as $0\mathbf{u}_1 + 0\mathbf{u}_2 - 1\mathbf{u}_3 = \mathbf{0}$ and observe that the $\alpha_j$ coefficients (0, 0, and $-1$) are not all zero.

Closure. The foregoing discussion of the linear dependence / independence of vectors is essentially the same as the discussion of the linear dependence / independence of functions in Section 3.2, except that the Wronskian determinant test did not carry over.

EXERCISES 9.8

1. (a) Can a set be neither LD nor LI? Explain.
(b) Can a set be both LD and LI? Explain.

(j) (1,1,0,0), (1,-1,0,0), (0,0,
(k) (1,-3,0,2,1), (-2,6,0,-4,

2.
Show that the following sets are LD by expressing one of the vectors as a linear combination of the others.
(l) (5,4, re ee rie (o) (7,1,0), (-1,1,4), (2,3,5- 1),(1, 2),(3,4)}
(a) (1,3), (2,0), (1,3), (7,3)
(b) (1,3), (2,0), (1,2), (-1,5)
(c) (2,3,0), (1,-2,3)
(e) (0,0,2).(0,0 3),(2,-1,5), (1,2,4), (7,9,1), (2,0,-4)
(f) (2,3,0,0), (1,-5,0,2), (3,1,2,2)
(g) (1,3,2,0), (4,1,-2,2), (0,2,0,3), (4,7,1,2)
(h) (2,0,1,-1,0), (1,2,0,3,1), (4,-4,3,-9,-2)
(i) (1,3,0), (0,1,-1), (0,0,0)
10,0) (1, 9, rn a 2, 2) 1),(3,=2, (1,0, =1), (p)(1,2, D(ed) 2s) oh o {(1,on Me12),(-<3)} (a){(1,2.3), (3,2, 1), 5 .5)} (0, or (q)(3,1,0,0),(1, -2, 4, 1), (2hbo_

[Figure: panels (a) to (e), each showing vectors labeled $\mathbf{u}_1$, $\mathbf{u}_2$, $\mathbf{u}_3$.]

5. If $\mathbf{u}_1$ and $\mathbf{u}_2$ are LI, $\mathbf{u}_1$ and $\mathbf{u}_3$ are LI, and $\mathbf{u}_2$ and $\mathbf{u}_3$ are LI, does it follow that $\{\mathbf{u}_1,\mathbf{u}_2,\mathbf{u}_3\}$ is LI? Prove or disprove.

6. Prove or disprove:
(a) $\mathbf{v}$ is in span $\{\mathbf{u}_1,\ldots,\mathbf{u}_k\}$ if $\{\mathbf{v},\mathbf{u}_1,\ldots,\mathbf{u}_k\}$ is LD.
(b) $\mathbf{v}$ is not in span $\{\mathbf{u}_1,\ldots,\mathbf{u}_k\}$ if $\{\mathbf{v},\mathbf{u}_1,\ldots,\mathbf{u}_k\}$ is LI.
(c) $\mathbf{v}$ is not in span $\{\mathbf{u}_1,\ldots,\mathbf{u}_k\}$ if and only if $\{\mathbf{v},\mathbf{u}_1,\ldots,\mathbf{u}_k\}$ is LI.

7. (a) Prove Theorem 9.8.2.
(b) Prove Theorem 9.8.3.
(c) Prove Theorem 9.8.4.

9.9 Bases, Expansions, Dimension

9.9.1. Bases and expansions. In the calculus we learn that a given function $f(x)$ can be "expanded" as a linear combination of powers of $x$ (namely $1, x, x^2, \ldots$),
$$f(x) = a_0 + a_1x + a_2x^2 + \cdots. \tag{1}$$
We call $a_0, a_1, a_2, \ldots$ the "expansion coefficients," and these can be computed from $f(x)$ as $a_j = f^{(j)}(0)/j!$. Such representation of a given function is important, and examples such as
$$e^x = 1 + x + \frac{1}{2!}x^2 + \frac{1}{3!}x^3 + \cdots \quad\text{and}\quad \sin x = x - \frac{1}{3!}x^3 + \frac{1}{5!}x^5 - \cdots$$
are familiar to us. Likewise useful, in Chapters 9-12, is the expansion of a given vector $\mathbf{u}$ in terms of a set of "base vectors" $\mathbf{e}_1,\ldots,\mathbf{e}_k$:
$$\mathbf{u} = \alpha_1\mathbf{e}_1 + \cdots + \alpha_k\mathbf{e}_k. \tag{2}$$
How do we come up with such sets of base vectors and, once we know the $\mathbf{e}_j$'s and the given $\mathbf{u}$, how do we compute the expansion coefficients $\alpha_j$?
The story is simpler than for the power series of functions because whereas (1) is an infinite series and one needs to deal with the sophisticated issue of convergence, our vector expansions in Chapters 9-12 entail only a finite number of terms.

Beginning simply, consider the vector space $\mathbb{R}^2$, the set of all vectors in the plane of the paper. In particular, consider the vectors $\mathbf{e}_1$ and $\mathbf{e}_2$ shown in Fig. 1a. It should be evident (Theorem 9.8.2) that $\mathbf{e}_1$ and $\mathbf{e}_2$ are LI and that they span the space so that any given vector, such as $\mathbf{u}$ in Fig. 1b and $\mathbf{v}$ in Fig. 1c, can be expressed as a linear combination of them. For the vector $\mathbf{u}$, for example, $\mathbf{u} = \mathbf{OA} + \mathbf{OB}$; with the aid of a scale, $\mathbf{OA} \approx 1.6\mathbf{e}_1$ and $\mathbf{OB} \approx 2\mathbf{e}_2$, so that
$$\mathbf{u} \approx 1.6\mathbf{e}_1 + 2\mathbf{e}_2. \tag{3}$$
Similarly (Fig. 1c),
$$\mathbf{v} \approx 2\mathbf{e}_1 - 2.5\mathbf{e}_2, \tag{4}$$
and so on, for any given vector in the plane. Of course, the zero vector is simply $\mathbf{0} = 0\mathbf{e}_1 + 0\mathbf{e}_2$. The formulas (3) and (4) are examples of the expansion of a given vector [$\mathbf{u}$ in (3), $\mathbf{v}$ in (4)] in terms of a set of base vectors [the set $\{\mathbf{e}_1,\mathbf{e}_2\}$].

[Figure 1. Vector expansion in $\mathbb{R}^2$; panels (a), (b), (c).]

DEFINITION 9.9.1 Basis
A finite set of vectors $\{\mathbf{e}_1,\ldots,\mathbf{e}_k\}$ in a vector space $S$ is a basis for $S$ if each vector $\mathbf{u}$ in $S$ can be expressed (i.e., "expanded") uniquely in the form
$$\mathbf{u} = \alpha_1\mathbf{e}_1 + \cdots + \alpha_k\mathbf{e}_k = \sum_{j=1}^{k}\alpha_j\mathbf{e}_j. \tag{5}$$
By the expansion (5) being unique, we mean that the $\alpha_j$ expansion coefficients are uniquely determined.

THEOREM 9.9.1 Test for Basis
A finite set $\{\mathbf{e}_1,\ldots,\mathbf{e}_k\}$ in a vector space $S$ is a basis for $S$ if and only if it spans $S$ and is LI.

Proof: First, it follows from the definition of the verb span that every vector $\mathbf{u}$ in $S$ can be expanded as in (5) if and only if the set $\{\mathbf{e}_1,\ldots,\mathbf{e}_k\}$ spans $S$. Turning to the question of the uniqueness of the expansion, suppose that both expansions
$$\mathbf{u} = \alpha_1\mathbf{e}_1 + \cdots + \alpha_k\mathbf{e}_k, \tag{6}$$
$$\mathbf{u} = \beta_1\mathbf{e}_1 + \cdots + \beta_k\mathbf{e}_k \tag{7}$$
hold for any given vector $\mathbf{u}$ in $S$. Subtracting (7) from (6) gives
$$(\alpha_1 - \beta_1)\mathbf{e}_1 + \cdots + (\alpha_k - \beta_k)\mathbf{e}_k = \mathbf{0}. \tag{8}$$
Now, each of the coefficients $(\alpha_1 - \beta_1), \ldots, (\alpha_k - \beta_k)$ in (8) must be zero, in which case $\alpha_1 = \beta_1, \ldots, \alpha_k = \beta_k$ and expansions (6) and (7) are identical, if and only if the set $\{\mathbf{e}_1,\ldots,\mathbf{e}_k\}$ is LI. Hence, the expansion (5) is unique if and only if the set is LI, and this completes the proof.

The key idea revealed in the foregoing proof is that a basis needs to contain enough vectors but not too many: enough so that the set spans the space and can therefore be used to expand any given vector in the space, but not too many, in order that such expansions will be unique.

EXAMPLE 1. Consider the vectors
$$\mathbf{e}_1 = (-2,1), \quad \mathbf{e}_2 = (2,4). \tag{9}$$
As may be verified, the set (9) is LI and spans $\mathbb{R}^2$ and is therefore a basis for $\mathbb{R}^2$. Using that set to expand the vector $\mathbf{u} = (6,2)$, say, we express
$$\mathbf{u} = \alpha_1\mathbf{e}_1 + \alpha_2\mathbf{e}_2, \tag{10}$$
or $(6,2) = (-2\alpha_1, \alpha_1) + (2\alpha_2, 4\alpha_2)$. Hence,
$$\begin{aligned} -2\alpha_1 + 2\alpha_2 &= 6,\\ \alpha_1 + 4\alpha_2 &= 2. \end{aligned} \tag{11}$$
Solving (11), $\alpha_1 = -2$ and $\alpha_2 = 1$ so the expansion (10) is
$$\mathbf{u} = -2\mathbf{e}_1 + \mathbf{e}_2, \tag{12}$$
as displayed in Fig. 2a.

It is to be emphasized that the basis (9) shown in Fig. 2a is by no means the only basis for $\mathbb{R}^2$; there are slews of them. For example, it is readily verified that another is
$$\mathbf{e}'_1 = (4,-1), \quad \mathbf{e}'_2 = (-1,5), \tag{13}$$
and in this case the expansion of $\mathbf{u} = (6,2)$ is found to be
$$\mathbf{u} = \tfrac{32}{19}\mathbf{e}'_1 + \tfrac{14}{19}\mathbf{e}'_2, \tag{14}$$
as depicted in Fig. 2b.

[Figure 2. Two bases for $\mathbb{R}^2$; panels (a), (b).]

COMMENT. The difference between the expansions (12) and (14) is not at odds with the notion of uniqueness since the two expansions are with respect to different bases. In other words, (12) is the unique expansion of $\mathbf{u}$ in terms of the $\mathbf{e}_1, \mathbf{e}_2$ basis, and (14) is the unique expansion of $\mathbf{u}$ in terms of the $\mathbf{e}'_1, \mathbf{e}'_2$ basis.

9.9.2. Dimension. If we always worked in 2-space or 3-space, the concept of dimension would hardly need elaboration; for example, 3-space is three-dimensional, a plane within it is two-dimensional, and a line within it is one-dimensional.
However, having generalized our vector concept beyond 3-space, we need to clarify the idea of dimension.

DEFINITION 9.9.2 Dimension
If the greatest number of LI vectors that can be found in a vector space $S$ is $k$, where $1 \le k < \infty$, then $S$ is $k$-dimensional, and we write $\dim S = k$. If $S$ is the zero vector space (i.e., if it contains only the zero vector), we define $\dim S = 0$. If an arbitrarily large number of LI vectors can be found in $S$, we say that $S$ is infinite-dimensional.*

*Infinite-dimensional function spaces will be studied in Chapter 17.

To determine the dimension of a given vector space, it may be more convenient to use the following theorem than to work directly from Definition 9.9.2.

THEOREM 9.9.2 Test for Dimension
If a vector space $S$ admits a basis consisting of $k$ vectors, then $S$ is $k$-dimensional.

Proof: Let $\{\mathbf{e}_1,\ldots,\mathbf{e}_k\}$ be a basis for $S$. Because these vectors form a basis, they must be LI. Hence, we have at least $k$ LI vectors in $S$, and it remains to show that no more than $k$ LI vectors can be found in $S$. Suppose that vectors $\mathbf{e}'_1,\ldots,\mathbf{e}'_{k+1}$ in $S$ are LI. Each of these can be expanded in terms of the given base vectors, as
$$\begin{aligned} \mathbf{e}'_1 &= a_{11}\mathbf{e}_1 + \cdots + a_{1k}\mathbf{e}_k,\\ &\ \ \vdots\\ \mathbf{e}'_{k+1} &= a_{k+1,1}\mathbf{e}_1 + \cdots + a_{k+1,k}\mathbf{e}_k, \end{aligned} \tag{15}$$
say. Putting these expressions into the equation
$$\alpha_1\mathbf{e}'_1 + \alpha_2\mathbf{e}'_2 + \cdots + \alpha_{k+1}\mathbf{e}'_{k+1} = \mathbf{0} \tag{16}$$
and grouping terms gives
$$(\alpha_1a_{11} + \cdots + \alpha_{k+1}a_{k+1,1})\mathbf{e}_1 + \cdots + (\alpha_1a_{1k} + \cdots + \alpha_{k+1}a_{k+1,k})\mathbf{e}_k = \mathbf{0}.$$
But the set $\{\mathbf{e}_1,\ldots,\mathbf{e}_k\}$ is LI since it is a basis, so each coefficient in the preceding equation must be zero:
$$\begin{aligned} a_{11}\alpha_1 + \cdots + a_{k+1,1}\alpha_{k+1} &= 0,\\ &\ \ \vdots\\ a_{1k}\alpha_1 + \cdots + a_{k+1,k}\alpha_{k+1} &= 0. \end{aligned} \tag{17}$$
These are $k$ linear homogeneous equations in the $k+1$ unknowns $\alpha_1$ through $\alpha_{k+1}$, and such a system necessarily admits nontrivial solutions (Theorem 8.3.4). Thus, the $\alpha$'s in (17) are not all necessarily zero so the vectors $\mathbf{e}'_1,\ldots,\mathbf{e}'_{k+1}$ could not have been LI after all. Hence, it is not possible to find more than $k$ LI vectors in $S$, and this completes the proof.

The spaces of chief concern in Chapters 9-12 are the n-tuple spaces $\mathbb{R}^n$ and subspaces thereof.
For $\mathbb{R}^n$ we can say the following.

THEOREM 9.9.3 Dimension of $\mathbb{R}^n$
The dimension of $\mathbb{R}^n$ is $n$: $\dim \mathbb{R}^n = n$.

Proof: The vectors
$$\mathbf{e}_1 = (1,0,\ldots,0), \quad \mathbf{e}_2 = (0,1,0,\ldots,0), \quad \ldots, \quad \mathbf{e}_n = (0,\ldots,0,1) \tag{18}$$
constitute a basis for $\mathbb{R}^n$ because any vector $\mathbf{u} = (u_1,\ldots,u_n)$ in $\mathbb{R}^n$ can be expanded uniquely as $\mathbf{u} = u_1\mathbf{e}_1 + \cdots + u_n\mathbf{e}_n$. Since this basis contains $n$ vectors, it follows from Theorem 9.9.2 that $\mathbb{R}^n$ is $n$-dimensional.

Indeed, we might well have questioned the reasonableness of our definition of dimension if $\mathbb{R}^n$ had turned out to be other than $n$-dimensional! The ON basis (18) is called the standard basis for $\mathbb{R}^n$ (and is the n-space generalization of the "$\mathbf{i}, \mathbf{j}, \mathbf{k}$" ON basis that might be known to you from other courses).

Finally, what about the dimension of a subspace, for example, the subspace of $\mathbb{R}^n$ that is spanned by two given vectors?

THEOREM 9.9.4 Dimension of Span $\{\mathbf{u}_1,\ldots,\mathbf{u}_k\}$
The dimension of span $\{\mathbf{u}_1,\ldots,\mathbf{u}_k\}$, where the $\mathbf{u}_j$'s are not all zero, denoted as dim [span $\{\mathbf{u}_1,\ldots,\mathbf{u}_k\}$], is equal to the greatest number of LI vectors within the generating set $\{\mathbf{u}_1,\ldots,\mathbf{u}_k\}$.

Proof: Denote the generating set $\{\mathbf{u}_1,\ldots,\mathbf{u}_k\}$ as $U$. Let the greatest number of LI vectors in $U$ be $N$, where $1 \le N \le k$. It may be assumed, without loss of generality, that the members of $U$ have been numbered so that $\mathbf{u}_1,\ldots,\mathbf{u}_N$ are LI. Then each of the remaining members of $U$, namely $\mathbf{u}_{N+1},\ldots,\mathbf{u}_k$, can be expressed as a linear combination of $\mathbf{u}_1,\ldots,\mathbf{u}_N$. Surely, then, each vector in span $U$ can similarly be expressed as a linear combination of $\mathbf{u}_1,\ldots,\mathbf{u}_N$. Now $\{\mathbf{u}_1,\ldots,\mathbf{u}_N\}$ is LI and spans span $U$, so it is a basis for span $U$ (Theorem 9.9.1). According to Theorem 9.9.2, then, the dimension of span $U$ is $N$; that is, it is the same as the greatest number of LI vectors in $U$, as was to be proved.

EXAMPLE 2. Let
$$\mathbf{u}_1 = (4,0,2,0), \quad \mathbf{u}_2 = (1,1,0,-1), \quad \mathbf{u}_3 = (3,-1,2,1).$$
These vectors are, of course, members of $\mathbb{R}^4$. But since $\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3$ are only three vectors, dim [span $\{\mathbf{u}_1,\mathbf{u}_2,\mathbf{u}_3\}$] is at most three. In fact, it is not three since we see that $\mathbf{u}_1 = \mathbf{u}_2 + \mathbf{u}_3$. But $\mathbf{u}_1$ and $\mathbf{u}_2$, say, are LI since neither is a scalar multiple of the other.
Thus, there are only two LI vectors within the generating set so dim [span $\{\mathbf{u}_1,\mathbf{u}_2,\mathbf{u}_3\}$] $= 2$.

In Example 2 we determined the greatest number of LI vectors in the generating set by inspection. But what if the generating set is $\{\mathbf{u}_1,\ldots,\mathbf{u}_k\}$, where the $\mathbf{u}_j$'s are members of $\mathbb{R}^8$, and $k = 6$, say? For such a large problem we cannot expect "inspection" to work. Yet, what are we to do, test the $\mathbf{u}_j$'s for linear independence one at a time, two at a time, three at a time, and so on, until we determine the greatest number of LI vectors in $\{\mathbf{u}_1,\ldots,\mathbf{u}_k\}$? That would be quite tedious. No, we will see later, in Chapter 10, that the best way to determine the greatest number of LI vectors in a given set is to determine the "rank" of a certain matrix, and that can be done by the extremely efficient method of elementary row operations. Meanwhile, in the present section, we "get by" by keeping the examples and exercises simple enough so that we can rely on inspection. Let us return, now, to our discussion of bases and expansions.

9.9.3. Orthogonal bases. If, as in Example 1, there are many bases for a given space, then how do we decide which one to select? We will find that in most applications the most convenient basis to use is dictated by the context, so let us not worry about that now. This point is addressed in Chapter 11 as well as in the chapters on PDEs. However, we do wish to show, here, that orthogonal bases are to be preferred whenever possible. For observe from Example 1 that to expand $\mathbf{u}$ (that is, to compute the $\alpha_j$ expansion coefficients) we needed to solve the system (11) of two equations in two unknowns. Similarly, if we seek to expand a given vector in $\mathbb{R}^8$, then there will be eight base vectors (because $\mathbb{R}^8$ is eight-dimensional) and eight $\alpha_j$ expansion coefficients, and these will be found by solving a system [analogous to (11)] of eight equations in the eight unknown $\alpha_j$'s. Thus, the expansion process can be quite laborious.
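To make the cost of a nonorthogonal expansion concrete, here is a minimal sketch that sets up and solves a system like (11) by Gauss-Jordan elimination in exact arithmetic. It assumes that the supplied vectors really do form a basis of $\mathbb{R}^n$ (otherwise no pivot is found and the search raises `StopIteration`); the name `expand_in_basis` is hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def expand_in_basis(basis, u):
    """Solve sum_j alpha_j * basis[j] = u and return [alpha_1, ..., alpha_n].

    `basis` is assumed to hold n linearly independent n-tuples.
    """
    n = len(basis)
    # Augmented matrix: column j holds basis[j]; the last column holds u.
    aug = [
        [Fraction(basis[j][i]) for j in range(n)] + [Fraction(u[i])]
        for i in range(n)
    ]
    for col in range(n):
        # Bring a row with a nonzero entry in this column into pivot position.
        pivot_row = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        pivot = aug[col][col]
        aug[col] = [x / pivot for x in aug[col]]
        # Clear this column from every other row (Gauss-Jordan step).
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

# Data of Example 1: the basis (9), the basis (13), and u = (6, 2).
alphas = expand_in_basis([(-2, 1), (2, 4)], (6, 2))
alphas_primed = expand_in_basis([(4, -1), (-1, 5)], (6, 2))
```

With the data of Example 1 the first call yields coefficients equal to $-2$ and $1$, and the second yields $32/19$ and $14/19$. The elimination loop touches every row for every column, which is the coupled work that grows quickly with the number of base vectors.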
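On the other hand, suppose that $\{\mathbf{e}_1,\ldots,\mathbf{e}_k\}$ is an orthogonal basis for $S$; that is, it is not only a basis but also happens to be an orthogonal set:
$$\mathbf{e}_i\cdot\mathbf{e}_j = 0 \quad\text{if}\quad i \neq j. \tag{19}$$
Suppose that we wish to expand a given vector $\mathbf{u}$ in $S$ in terms of that basis; that is, we wish to determine the coefficients $\alpha_1,\ldots,\alpha_k$ in the expansion
$$\mathbf{u} = \alpha_1\mathbf{e}_1 + \alpha_2\mathbf{e}_2 + \cdots + \alpha_k\mathbf{e}_k. \tag{20}$$
To accomplish this, dot (20) with $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_k$, in turn. Doing so, and using (19), we obtain the linear system
$$\begin{aligned} \mathbf{u}\cdot\mathbf{e}_1 &= (\mathbf{e}_1\cdot\mathbf{e}_1)\alpha_1 + 0\alpha_2 + \cdots + 0\alpha_k,\\ \mathbf{u}\cdot\mathbf{e}_2 &= 0\alpha_1 + (\mathbf{e}_2\cdot\mathbf{e}_2)\alpha_2 + 0\alpha_3 + \cdots + 0\alpha_k,\\ &\ \ \vdots\\ \mathbf{u}\cdot\mathbf{e}_k &= 0\alpha_1 + \cdots + 0\alpha_{k-1} + (\mathbf{e}_k\cdot\mathbf{e}_k)\alpha_k, \end{aligned} \tag{21}$$
where all of the quantities $\mathbf{u}\cdot\mathbf{e}_1,\ldots,\mathbf{u}\cdot\mathbf{e}_k$, $\mathbf{e}_1\cdot\mathbf{e}_1,\ldots,\mathbf{e}_k\cdot\mathbf{e}_k$ are computable since $\mathbf{u},\mathbf{e}_1,\ldots,\mathbf{e}_k$ are known. The crucial point is that even though (21) is still $k$ equations in the $k$ unknown $\alpha_j$'s, the system is uncoupled (i.e., the only unknown in the first equation is $\alpha_1$, the only one in the second is $\alpha_2$, and so on) and readily gives
$$\alpha_1 = \frac{\mathbf{u}\cdot\mathbf{e}_1}{\mathbf{e}_1\cdot\mathbf{e}_1}, \quad \alpha_2 = \frac{\mathbf{u}\cdot\mathbf{e}_2}{\mathbf{e}_2\cdot\mathbf{e}_2}, \quad \ldots, \quad \alpha_k = \frac{\mathbf{u}\cdot\mathbf{e}_k}{\mathbf{e}_k\cdot\mathbf{e}_k}, \tag{22}$$
provided, of course, that none of the denominators vanish. But these quantities cannot vanish because $\mathbf{e}_j\cdot\mathbf{e}_j = \|\mathbf{e}_j\|^2$, which is zero if and only if $\mathbf{e}_j = \mathbf{0}$, and this cannot be because if any $\mathbf{e}_j$ were $\mathbf{0}$, then the set $\{\mathbf{e}_1,\ldots,\mathbf{e}_k\}$ would be LD (Theorem 9.8.3), and hence not a basis. Thus, if the $\{\mathbf{e}_1,\ldots,\mathbf{e}_k\}$ basis is orthogonal, the expansion of any given $\mathbf{u}$ is simply
$$\mathbf{u} = \left(\frac{\mathbf{u}\cdot\mathbf{e}_1}{\mathbf{e}_1\cdot\mathbf{e}_1}\right)\mathbf{e}_1 + \cdots + \left(\frac{\mathbf{u}\cdot\mathbf{e}_k}{\mathbf{e}_k\cdot\mathbf{e}_k}\right)\mathbf{e}_k = \sum_{j=1}^{k}\left(\frac{\mathbf{u}\cdot\mathbf{e}_j}{\mathbf{e}_j\cdot\mathbf{e}_j}\right)\mathbf{e}_j. \tag{23}$$
If, besides being orthogonal, the $\mathbf{e}_j$'s are normalized ($\|\mathbf{e}_j\| = 1$) so that they constitute an ON (orthonormal) basis, then (23) simplifies slightly to
$$\mathbf{u} = (\mathbf{u}\cdot\hat{\mathbf{e}}_1)\hat{\mathbf{e}}_1 + \cdots + (\mathbf{u}\cdot\hat{\mathbf{e}}_k)\hat{\mathbf{e}}_k = \sum_{j=1}^{k}(\mathbf{u}\cdot\hat{\mathbf{e}}_j)\hat{\mathbf{e}}_j, \tag{24}$$
where we recall that carets denote unit vectors.

EXAMPLE 3. Expand $\mathbf{u} = (4,3,-3,6)$ in terms of the orthogonal base vectors $\mathbf{e}_1 = (1,0,2,0)$, $\mathbf{e}_2 = (0,1,0,0)$, $\mathbf{e}_3 = (-2,0,1,5)$, $\mathbf{e}_4 = (-2,0,1,-1)$ of $\mathbb{R}^4$. This basis is orthogonal but not ON so we use (23) rather than (24).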
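Formula (23) translates almost directly into code. The following sketch, with hypothetical helper names `dot` and `expand_orthogonal`, first checks that the supplied vectors are mutually orthogonal and then evaluates $(\mathbf{u}\cdot\mathbf{e}_j)/(\mathbf{e}_j\cdot\mathbf{e}_j)$ for each base vector. It uses exact fractions and assumes integer components and nonzero base vectors.

```python
from fractions import Fraction
from itertools import combinations

def dot(a, b):
    """Dot product of two equal-length tuples."""
    return sum(x * y for x, y in zip(a, b))

def expand_orthogonal(basis, u):
    """Return the coefficients (u . e_j) / (e_j . e_j) of formula (23).

    Raises ValueError if the base vectors are not mutually orthogonal,
    since (23) is not valid in that case.
    """
    for e_i, e_j in combinations(basis, 2):
        if dot(e_i, e_j) != 0:
            raise ValueError("base vectors are not orthogonal")
    return [Fraction(dot(u, e), dot(e, e)) for e in basis]

# Data of Example 3.
basis = [(1, 0, 2, 0), (0, 1, 0, 0), (-2, 0, 1, 5), (-2, 0, 1, -1)]
coefficients = expand_orthogonal(basis, (4, 3, -3, 6))
```

No linear system has to be solved here: each coefficient costs only two dot products, which is the practical payoff of the uncoupling seen in (21).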
Computing $\mathbf{u}\cdot\mathbf{e}_1 = -2$, $\mathbf{e}_1\cdot\mathbf{e}_1 = 5$, and so on, (23) gives
$$\mathbf{u} = -\tfrac{2}{5}\mathbf{e}_1 + 3\mathbf{e}_2 + \tfrac{19}{30}\mathbf{e}_3 - \tfrac{17}{6}\mathbf{e}_4. \tag{25}$$
Alternatively, we could have inferred, from $\mathbf{u} = \alpha_1\mathbf{e}_1 + \cdots + \alpha_4\mathbf{e}_4$, the four equations
$$\begin{aligned} \alpha_1 - 2\alpha_3 - 2\alpha_4 &= 4,\\ \alpha_2 &= 3,\\ 2\alpha_1 + \alpha_3 + \alpha_4 &= -3,\\ 5\alpha_3 - \alpha_4 &= 6 \end{aligned} \tag{26}$$
in the four unknown $\alpha_j$'s, and solved these by Gauss elimination, but it is much easier to "cash in" on the orthogonality of the basis and to use (23).

If we choose to work with an ON basis, we can scale the $\mathbf{e}_j$'s as $\hat{\mathbf{e}}_1 = \tfrac{1}{\sqrt{5}}(1,0,2,0)$, $\hat{\mathbf{e}}_2 = (0,1,0,0)$, $\hat{\mathbf{e}}_3 = \tfrac{1}{\sqrt{30}}(-2,0,1,5)$, $\hat{\mathbf{e}}_4 = \tfrac{1}{\sqrt{6}}(-2,0,1,-1)$. Then (24) gives
$$\mathbf{u} = -\tfrac{2}{\sqrt{5}}\hat{\mathbf{e}}_1 + 3\hat{\mathbf{e}}_2 + \tfrac{19}{\sqrt{30}}\hat{\mathbf{e}}_3 - \tfrac{17}{\sqrt{6}}\hat{\mathbf{e}}_4, \tag{27}$$
which result is equivalent to (25).

Given a nonorthogonal basis there are three possibilities. First, one can use it and face up to the tedious expansion process. Second, one can "trade the nonorthogonal basis in" for an orthogonal basis using the Gram-Schmidt orthogonalization procedure, which procedure is introduced briefly in the exercises and discussed in detail in the next section. Third, one can retain the nonorthogonal basis but streamline the expansion process by computing and utilizing a set of dual, or reciprocal, vectors corresponding to the given basis, as described in the exercises.

Closure. This section is about the expansion of vectors, in a given vector space $S$, in terms of a set of base vectors. A set of vectors $\{\mathbf{e}_1,\ldots,\mathbf{e}_k\}$ in $S$ is a basis for $S$ if each vector $\mathbf{u}$ in $S$ can be expanded as a unique linear combination of the $\mathbf{e}_j$'s. We showed (Theorem 9.9.1) that $\{\mathbf{e}_1,\ldots,\mathbf{e}_k\}$ is indeed a basis for $S$ if and only if it spans $S$ (so each $\mathbf{u}$ can be expanded) and is LI (so the expansion is unique). The number of vectors in any basis for $S$ is called the dimension of $S$. For instance, $\mathbb{R}^n$ admits the standard basis (18), comprised of $n$ vectors, so $\mathbb{R}^n$ is $n$-dimensional. And the greatest number of LI vectors in a set $\{\mathbf{u}_1,\ldots,\mathbf{u}_k\}$ is the dimension of their span.
We found that the expansion process (i.e., the determination of the expansion coefficients) can be quite laborious if there are many base vectors, but is extremely simple if the basis is orthogonal, or ON, in which case the expansions are given by (23) or (24), respectively. You should remember those two formulas and be able to derive them as well. Dimension 455 Chapter 9. Vector Space 456 EXERCISES 9.9 (a) for S? 1. Show whether the following is a basis. (b) for span {e1,...,e,}? for R? (1,2) (1,0),(1,1), (a) (b) (3,2), (~1,~5) for IR? (c) (1,1) for R? (d) (2,0, ae 5, oie : A (5, —1,2),(1 2), (2,0,1),( oe 3, — 2,1), (5, 0, 0, ey 7. (Zero vector space) Show that a zero vector space (i.e., a forRY vector space consisting of the zero vector alone) has no basis. (4,3, 2, 1) for R* 1), (2, 1, -3,0), (1,2,4,5) for R* 0,1)forR4 3,0),(5,-2,3,1),(0,—6, 0,0),(1,2, i (4,2, 8. Let u, =(1, ). ee = (0,1,0), ug = (0,0,1), uy = (1,1,0), us = io ug = (1,1,1), and uy = Evaluate eachof thefollowin. for R4 (b) dim [span {u,, u2}] (c) dim [span {1,, Us, us}] 0,0),(1,1,0,0),(1,1,1,0),(1,1,1,1)for R! (i)(1,0, 1,3)forR! 8),(4s—2, 1) (1,355, 0,1),(2,0,05 (8,05 G) 4,3),(2,5,3,5),(3,7,7,8)forR! (k)(1,3,-1,2),(1,2, (1,-1,2,3),(4,1,2,3),(5,41, ,0),(1,2,4,6) (a) dim [span {u,}] (1)(2,3,5,0), 1,0),(0,0,0,0)forR¢ (0,1,0,0),(0,0, (1,0,0,0), (ny (o) (1, 1,2), (4, -2, ~1) for span {(2, —4, a ie —1)} (p)(1,1, 2),(4,-2, -1) for span{(3,—5,—6),(1,2, 1)} (q) (1,1,1), (1, -1, 2) for span {(2, 4,1), (1,7, —2)} (d) dim [span{uj, Ug,ug, U4}) (e)dim [span{uy, U2,uy}] (f) dim [span {uy, U4, Us }] g) dim [span {us, ug, u7}] (h) dim [span{uy, us, Us, U7}] (r) (1,2,3), (1,0, 4) for span{(3,2,0), (1,1, -1)} 2. Expand each vector u in terms of the orthogonal basis {e1,e2,e3} of R°, where e; = (2,1,3), eg = (1,-2,0), e3 = (6,3, —5). (a) u = (9, —2,4) (b) u =(1,0, 0) (c)u =(0,1,5) (e)u =(0,5,0) (a) dim [span{uy, us, Us }] (b) dim [span{uy,,ug, Ug}] (d)u =(3,1,1) (f)u =(1,2,3) (c) dim [span {ueg,ua, Ug}] 3. 
(a)—(f) Expand each of the u vectors in Exercise 2 in terms of the ON basis {€), é2,é3} of R®, where é;, é2, @3are normalized versions of e;, €2, e3 given in Exercise 2. 4. Expand each vector u in terms of the orthogonal baeg = sis {e,,...,e4} of R*, where ey = (2,0,-1,-5), (2,0,-1,1),e3= (0,1,0,0), eg=(1,0, 2,0). (a) u = (1,0,0, 0) (c)u =(2,5, 1,—3) (d)u =(4,3,—2, 0) (f)u = (2,-7,4,1) (g) u = (0,0, 0,9) (i) u = (0,0, 5,0) (h) u = (2,3, -2,1) G)u = (1,1,1,1) 5. Verify OEYthe{e,,...,e4} a basis for R*. Also, ie LL. vectorsgiven in Example 3 are (26) by Gauss elimination and verify that the 0, ’s thus obtained agree with those given in (25), If {e;,...,e,} it a basis (d)dim [span{us}] (e) dim [span {ug, uy}] (f) dim (span {ug, ug, Us, Ug}] (g) dim [span{uy, ug}] (h) dim [span {ug, ug, Ug, Us, UG}] 10. (a)—(f) Determine the dimension of the solution space in Exercise 4 of Section 9.7. (b)u = (0,6, 0,0) (e)u =(1,2,0,5) 9. Let uy = (1,0,0,0), ue = (1,1,0,0), ug = (1,1,1,0), uy = (1,1,1,1), us = (0,0,0,1), ug = (3, 3.3,3), Evaluate each of the following. is an orthogonal set in a vector space S, is (Gram—Schmidt orthogonalization process) Given k LI vectors V1,...,Vx, it is possible to obtain from them & ON vectors, say @1,...,@,, in span{v,,..., v4} by the GramSchmidt process, after /drgen P. Gram (1850-1916) and Erhardt Schmidt (1876-1959), by taking e; equal to vj, taking €g equal to a suitable linear combination of v1, v9, taking e3 equal to a suitable linear combination of V1, V2, V3, and so on, and then normalizing the results. The resulting ON set is as 457 9.9. Bases, Expansions, Dimension If, instead, we have a basis {e1,...,e@,} which is not ON, then, as noted in the text, the expansion process is not so simple. However, suppose that we can find a set {ef,...,e7 } such that follows: ~ é, = . 
* eg = Vi Ufvall v2 —(v2, )e ’ Ilva—(v2 €1)éi|| ee; (11.1) eu Vim So(v; 0ee; Vio Soy; j=l We now state the problem: Verify that each é; defined by (11.1) is a linear combination of v;,..., vj, and that the é,’s areON. [Inverifyingthat||@;|] = 1,besuretoshowthateach denominator in (11.1) is nonzero.] formula (11.1) in (13.2) = Dojet aje; gives The set {ef,...,e%} @5)e;. (13.3) is called the dual, or reciprocal, set corresponding to the original set {e,,...,@n}. (We will see in the last exercise in Section 10.6 that the dual set exists, is unique, and is itself a basis for R”, the so-called dual or reciprocal basis.) (b) Given thebasis e; = (1,0),e2 = (1,1) for R?, useequa- Exercise 11 to obtain an ON set from the given LI set. (a)(4,0),(2,1) tion (13.2) to determine the dual vectors ej,e5. Then use equation (13.3) to expand u = (3,1). Sketch e;, e2, ef, e3, u to scale, and verify the expansion graphically, that is, by means of the parallelogram rule of vector addition. (3, 4) (c)(1,0,0), (1,1,0), (1,1,1) (d)(1,1,0), (2, —1, 1), (1,0,3) (e)(1,1,1), (2,0,—1) (f)(1,1,1),(1,0,1). (1,1,0) (g)(1,2,1),(1,-1, 2),(=1,3,1) (h)(2,0,1), (1,1,1).(—2.0.3) (i)(2,1,1,0),(1,5,-1,2) 1 (c) Repeatpart(b),fore; = (2,1), e2 = (0,2),u wo 12), (d) Repeat part (b), fore, = (—1,1),e2 = (2,1), u = (0,4), 1), (2,3,-1,1,4) (j) (6,-1, 1,2, our vector space S to be R”. (a) If {@),...,@,} i#j. rh u= Situ -@;)G; 12. In each case use the Gram—Schmidt t=] Then show that dotting e* into u Q; = u-e; so that i=] (b) (1, ~2), 1, = { 0, through7 = k. = é; = ee is an ON basis for R”, and u is in R”, then by dottingé;, intobothsidesof theequationu = 37"_, ajé;, the given basis is (e) Given the basis e, = (1,0,0),e2 = mane = (1,1,1) for R°, use equation (13.2) to determine the dual vectors e},e5,e3. Then use equation (13.3) to expand each of the vectors u = (4,—1,5), v = (0,0,2), w = (5, —2,3). 
Be sure to see that the dual vectors get computed once and for all, for a given basis $\{\mathbf e_1,\dots,\mathbf e_n\}$; once we have got them, expansions of the form (13.3) are simple.
(f) Repeat part (e) for $\mathbf e_1=(2,0,1)$, $\mathbf e_2=(1,1,0)$, $\mathbf e_3=(1,-1,3)$, and $\mathbf u=(6,1,0)$, $\mathbf v=(1,2,4)$, $\mathbf w=(0,3,0)$.
(g) Show that if the $\{\mathbf e_1,\dots,\mathbf e_n\}$ basis does happen to be ON, then the dual vectors coalesce with the $\mathbf e_j$'s, i.e., $\mathbf e_j^*=\mathbf e_j$ for $j=1,2,\dots,n$.

9.10 Best Approximation

Let $S$ be a normed inner product vector space (i.e., a vector space with both a norm and an inner, or dot, product defined), and let the norm be the "natural norm" $\|\mathbf u\|=\sqrt{\mathbf u\cdot\mathbf u}$. We know that if $\{\mathbf e_1,\dots,\mathbf e_N\}$ is a basis for $S$, then any vector $\mathbf u$ in $S$ can be (uniquely) expanded in the form $\mathbf u=\sum_{j=1}^N c_j\mathbf e_j$. If the basis is orthogonal, then the expansion process is easy, with the $c_j$'s computed, from the given vector $\mathbf u$ and the base vectors $\mathbf e_j$, as $c_j=(\mathbf u\cdot\mathbf e_j)/(\mathbf e_j\cdot\mathbf e_j)$. And if the basis is not only orthogonal but ON, then $\mathbf u=\sum_{j=1}^N c_j\hat{\mathbf e}_j$, where $c_j=\mathbf u\cdot\hat{\mathbf e}_j$.

However, what if we do not have a "full deck"? That is, what if $\{\hat{\mathbf e}_1,\dots,\hat{\mathbf e}_N\}$ is ON, but falls short of being a basis for $S$ (i.e., $N < \dim S$)? If $\mathbf u$ happens to fall within span$\{\hat{\mathbf e}_1,\dots,\hat{\mathbf e}_N\}$, which subspace of $S$ we denote as $\mathcal T$, then it can still be expanded in terms of $\hat{\mathbf e}_1,\dots,\hat{\mathbf e}_N$, but if it is not in $\mathcal T$, then it cannot be so expanded. In the latter case the question arises, what is the best approximation of $\mathbf u$ in terms of $\hat{\mathbf e}_1,\dots,\hat{\mathbf e}_N$? In this section we answer that question in general, and illustrate the results for the case where $S$ is $\mathbb R^n$. Later in this book, when we study Fourier series and partial differential equations, our interest will be in function spaces instead.

9.10.1. Best approximation and orthogonal projection. The best approximation problem which we address is this: given a vector $\mathbf u$ in $S$, and an ON set $\{\hat{\mathbf e}_1,\dots,\hat{\mathbf e}_N\}$ in $S$, what is the best approximation
\[
\mathbf u\approx c_1\hat{\mathbf e}_1+\cdots+c_N\hat{\mathbf e}_N=\sum_{j=1}^N c_j\hat{\mathbf e}_j\,? \tag{1}
\]
That is, how do we compute the $c_j$ coefficients so as to render the error vector $\mathbf E=\mathbf u-\sum_{j=1}^N c_j\hat{\mathbf e}_j$ as small as possible? In other words, how do we choose the $c_j$'s so as to minimize the norm of the error vector $\|\mathbf E\|$? If $\|\mathbf E\|$ is a minimum, then so is $\|\mathbf E\|^2$, so let us minimize $\|\mathbf E\|^2$ (to avoid square roots), where
\[
\|\mathbf E\|^2=\mathbf E\cdot\mathbf E
=\Big(\mathbf u-\sum_{j=1}^N c_j\hat{\mathbf e}_j\Big)\cdot\Big(\mathbf u-\sum_{j=1}^N c_j\hat{\mathbf e}_j\Big)
=\mathbf u\cdot\mathbf u-2\sum_{j=1}^N c_j(\mathbf u\cdot\hat{\mathbf e}_j)+\sum_{j=1}^N c_j^2, \tag{2}
\]
and where the step
\[
\Big(\sum_{j=1}^N c_j\hat{\mathbf e}_j\Big)\cdot\Big(\sum_{j=1}^N c_j\hat{\mathbf e}_j\Big)
=(c_1\hat{\mathbf e}_1+\cdots+c_N\hat{\mathbf e}_N)\cdot(c_1\hat{\mathbf e}_1+\cdots+c_N\hat{\mathbf e}_N)
=c_1^2+\cdots+c_N^2=\sum_{j=1}^N c_j^2 \tag{3}
\]
follows from the orthonormality of the $\hat{\mathbf e}_j$'s. Defining $\mathbf u\cdot\hat{\mathbf e}_j\equiv\alpha_j$ and noting that $\mathbf u\cdot\mathbf u=\|\mathbf u\|^2$, we may express (2) as
\[
\|\mathbf E\|^2=\sum_{j=1}^N c_j^2-2\sum_{j=1}^N\alpha_j c_j+\|\mathbf u\|^2,
\]
or, completing the square, as
\[
\|\mathbf E\|^2=\sum_{j=1}^N(c_j-\alpha_j)^2+\|\mathbf u\|^2-\sum_{j=1}^N\alpha_j^2. \tag{4}
\]
Observe that $\mathbf u$ and the ON set $\{\hat{\mathbf e}_1,\dots,\hat{\mathbf e}_N\}$ are given so that $\|\mathbf u\|$ and the $\alpha_j$'s in (4) are fixed computable quantities: $\|\mathbf u\|=\sqrt{\mathbf u\cdot\mathbf u}$ and $\alpha_j=\mathbf u\cdot\hat{\mathbf e}_j$ for $j=1,2,\dots,N$. Thus, in seeking to minimize the right-hand side of (4), the only control we exercise is in our choice of the $c_j$'s. The right-hand side of (4) is greater than or equal to zero,* and so is the $\sum_{j=1}^N(c_j-\alpha_j)^2$ term containing the $c_j$'s. Thus, the best that we can do is to set $c_j=\alpha_j$ $(j=1,2,\dots,N)$. With that choice, our best approximation (1) becomes
\[
\mathbf u\approx\sum_{j=1}^N(\mathbf u\cdot\hat{\mathbf e}_j)\,\hat{\mathbf e}_j. \tag{5}
\]
Let us summarize these results.

THEOREM 9.10.1 Best Approximation
Let $\mathbf u$ be any vector in a normed inner product vector space $S$ with natural norm ($\|\mathbf u\|=\sqrt{\mathbf u\cdot\mathbf u}$), and let $\{\hat{\mathbf e}_1,\dots,\hat{\mathbf e}_N\}$ be an ON set in $S$. Then the best approximation (1) is obtained when the $c_j$'s are given by $c_j=\mathbf u\cdot\hat{\mathbf e}_j$, as indicated in (5).

EXAMPLE 1. Let $S$ be $\mathbb R^2$, $N=1$, $\hat{\mathbf e}_1=\frac{1}{13}(12,5)$, and $\mathbf u=(1,1)$, as shown in Fig. 1. Find the best approximation $\mathbf u\approx c_1\hat{\mathbf e}_1$, that is, the best approximation of $\mathbf u$ in span$\{\hat{\mathbf e}_1\}$ (which is the line $L$). Theorem 9.10.1 gives $c_1=\mathbf u\cdot\hat{\mathbf e}_1=17/13$, and hence the best approximation
\[
\mathbf u\approx\frac{17}{13}\,\hat{\mathbf e}_1=\frac{17}{169}(12,5), \tag{6}
\]
which is the vector $OA$ in Fig. 1.

COMMENT.
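Observe from the figure that the best approximation $OA$ is the orthogonal projection of $\mathbf u$ onto span$\{\hat{\mathbf e}_1\}$, which orthogonality is verified by the calculation
\[
AB\cdot\hat{\mathbf e}_1=(\mathbf u-OA)\cdot\hat{\mathbf e}_1=\Big(\mathbf u-\tfrac{17}{13}\hat{\mathbf e}_1\Big)\cdot\hat{\mathbf e}_1=\tfrac{17}{13}-\tfrac{17}{13}=0.
\]
That result makes perfect sense since if $c_1\hat{\mathbf e}_1$ is to be the best approximation to $\mathbf u$, then the distance from the tip of $\mathbf u$ to the tip of $c_1\hat{\mathbf e}_1$ (which is some point on $L$) should be as small as possible. That shortest distance is the perpendicular distance from the tip of $\mathbf u$ to the line $L$.

*This fact may not be obvious due to the minus sign in front of the last summation. But remember that the right-hand side of (4) is equal to $\|\mathbf E\|^2$, and surely $\|\mathbf E\|^2\ge 0$.

Figure 1. Best approximation of $\mathbf u$ in span$\{\hat{\mathbf e}_1\}$.

EXAMPLE 2. Let $S$ be $\mathbb R^3$, let $N=2$ with $\hat{\mathbf e}_1=(1,0,0)$ and $\hat{\mathbf e}_2=(0,1,0)$, and let $\mathbf u=(a,b,c)$, as shown in Fig. 2. Computing the coefficients in (5) as $\mathbf u\cdot\hat{\mathbf e}_1=a$ and $\mathbf u\cdot\hat{\mathbf e}_2=b$, (5) becomes
\[
\mathbf u\approx a\,\hat{\mathbf e}_1+b\,\hat{\mathbf e}_2. \tag{7}
\]
The latter is an equality if $c=0$. That is, (7) is an equality if $\mathbf u$ happens to lie in span$\{\hat{\mathbf e}_1,\hat{\mathbf e}_2\}$, but if $c\neq 0$ then the best approximation $a\hat{\mathbf e}_1+b\hat{\mathbf e}_2$ to $\mathbf u$ is the orthogonal projection of $\mathbf u$ onto span$\{\hat{\mathbf e}_1,\hat{\mathbf e}_2\}$.

Figure 2. Best approximation of $\mathbf u$ in span$\{\hat{\mathbf e}_1,\hat{\mathbf e}_2\}$.

In Examples 1 and 2, $S$ was $\mathbb R^2$ and $\mathbb R^3$, respectively, so we were able to draw useful pictures. In each case we discovered that the best approximation of $\mathbf u$ on the subspace $\mathcal T$ of $S$ spanned by $\hat{\mathbf e}_1,\dots,\hat{\mathbf e}_N$ was the orthogonal projection of $\mathbf u$ onto $\mathcal T$. Is that result true in all cases? That is, is the error vector $\mathbf E$ necessarily orthogonal to $\mathcal T$? Since the error vector is
\[
\mathbf E=\mathbf u-\sum_{j=1}^N(\mathbf u\cdot\hat{\mathbf e}_j)\,\hat{\mathbf e}_j, \tag{8}
\]
we have
\[
\mathbf E\cdot\hat{\mathbf e}_k=\Big[\mathbf u-\sum_{j=1}^N(\mathbf u\cdot\hat{\mathbf e}_j)\,\hat{\mathbf e}_j\Big]\cdot\hat{\mathbf e}_k
=\mathbf u\cdot\hat{\mathbf e}_k-(\mathbf u\cdot\hat{\mathbf e}_k)(1)=0 \tag{9}
\]
for each $k=1,2,\dots,N$, where the second equality follows from the fact that $\hat{\mathbf e}_j\cdot\hat{\mathbf e}_k=0$ if $j\neq k$ and $1$ if $j=k$. Since $\mathbf E$ is orthogonal to every one of the $\hat{\mathbf e}_j$'s, and every vector in $\mathcal T$ is a linear combination of the $\hat{\mathbf e}_j$'s, $\mathbf E$ is therefore orthogonal to every vector in $\mathcal T$. In that sense we say that the right-hand side of (5) is the orthogonal projection of $\mathbf u$ onto $\mathcal T$, and denote it as $\mathrm{proj}_{\mathcal T}\,\mathbf u$:
\[
\mathrm{proj}_{\mathcal T}\,\mathbf u=\sum_{j=1}^N(\mathbf u\cdot\hat{\mathbf e}_j)\,\hat{\mathbf e}_j. \tag{10}
\]
The idea that the best approximation of $\mathbf u$ in $\mathcal T$ is the orthogonal projection of $\mathbf u$ onto $\mathcal T$ lends a welcome geometrical interpretation to the problem of best approximation. In fact, let us rephrase Theorem 9.10.1 in terms of orthogonal projection.

THEOREM 9.10.1′ Best Approximation by Orthogonal Projection
Let $\mathbf u$ be any vector in a normed inner product vector space $S$ with natural norm ($\|\mathbf u\|=\sqrt{\mathbf u\cdot\mathbf u}$), and let $\{\hat{\mathbf e}_1,\dots,\hat{\mathbf e}_N\}$ be an ON set in $S$. Denote the subspace span$\{\hat{\mathbf e}_1,\dots,\hat{\mathbf e}_N\}$ of $S$ as $\mathcal T$. Then the best approximation of $\mathbf u$ in $\mathcal T$ (i.e., of the form $c_1\hat{\mathbf e}_1+\cdots+c_N\hat{\mathbf e}_N$) is given by the orthogonal projection of $\mathbf u$ onto $\mathcal T$, namely, by $\mathrm{proj}_{\mathcal T}\,\mathbf u$.

9.10.2. Kronecker delta. When working with ON sets it is convenient to use the Kronecker delta symbol $\delta_{jk}$, defined as
\[
\delta_{jk}=\begin{cases}1, & j=k,\\ 0, & j\neq k,\end{cases} \tag{11}
\]
and named after Leopold Kronecker (1823-1891), who contributed to algebra and the theory of equations. The subscripts $j$ and $k$ are usually positive integers. Clearly, $\delta_{jk}$ is symmetric in its indices $j$ and $k$:
\[
\delta_{jk}=\delta_{kj}. \tag{12}
\]
To illustrate the use of the Kronecker delta, suppose that $\{\hat{\mathbf e}_1,\dots,\hat{\mathbf e}_N\}$ is an ON basis for some space $S$, and that we wish to expand a given $\mathbf u$ in $S$ as
\[
\mathbf u=\sum_{j=1}^N c_j\hat{\mathbf e}_j. \tag{13}
\]
To determine the $c_j$'s, dot $\hat{\mathbf e}_k$ into both sides, where $k$ is any integer such that $1\le k\le N$, and use the fact that $\hat{\mathbf e}_j\cdot\hat{\mathbf e}_k=\delta_{jk}$ (because the $\hat{\mathbf e}_j$'s are ON):
\[
\mathbf u\cdot\hat{\mathbf e}_k=\Big(\sum_{j=1}^N c_j\hat{\mathbf e}_j\Big)\cdot\hat{\mathbf e}_k=\sum_{j=1}^N c_j(\hat{\mathbf e}_j\cdot\hat{\mathbf e}_k)=\sum_{j=1}^N c_j\delta_{jk}=c_k. \tag{14}
\]
Thus, $c_k=\mathbf u\cdot\hat{\mathbf e}_k$ for each $k=1,2,\dots,N$, so (13) becomes
\[
\mathbf u=\sum_{j=1}^N(\mathbf u\cdot\hat{\mathbf e}_j)\,\hat{\mathbf e}_j. \tag{15}
\]

Closure. Principal interest, in this brief section, is in the best approximation of a given vector $\mathbf u$ in a normed inner product vector space $S$ in terms of an ON set $\{\hat{\mathbf e}_1,\dots,\hat{\mathbf e}_N\}$ which falls short of being a basis for $S$ inasmuch as $N < \dim S$. Of course, if $N=\dim S$ so the set is a basis, then we have the equality (15), but if $N < \dim S$, then the best approximation of $\mathbf u$ is given by (5), best in the vector sense; that is, the norm of the error vector [i.e., the norm of the difference between $\mathbf u$ and its approximation] is minimized by the orthogonal projection of $\mathbf u$ onto the span of $\hat{\mathbf e}_1,\dots,\hat{\mathbf e}_N$.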
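As a numerical check on Theorem 9.10.1, the short Python sketch below recomputes Example 1 in exact rational arithmetic. The helper names `dot` and `best_approximation` are illustrative choices rather than notation from the text, and the routine assumes, without checking, that the set it is handed is ON.

```python
from fractions import Fraction


def dot(u, v):
    """Usual dot product on R^n."""
    return sum(a * b for a, b in zip(u, v))


def best_approximation(u, on_set):
    """Orthogonal projection of u onto span(on_set), as in (5) and (10).

    Assumption: on_set is orthonormal (this is not verified here).
    """
    approx = [0] * len(u)
    for e in on_set:
        c = dot(u, e)  # c_j = u . e_j, the optimal coefficient
        approx = [p + c * comp for p, comp in zip(approx, e)]
    return approx


# Example 1: e1 = (12, 5)/13 and u = (1, 1)
e1 = [Fraction(12, 13), Fraction(5, 13)]
u = [Fraction(1), Fraction(1)]

proj = best_approximation(u, [e1])          # (17/169)(12, 5)
error = [a - b for a, b in zip(u, proj)]    # error vector E = u - proj

print(proj)
print(dot(error, e1))      # 0, so E is orthogonal to e1
print(dot(error, error))   # 49/169, the squared norm of E
```

The squared error 49/169 agrees with $\|\mathbf u\|^2-\alpha_1^2=2-289/169$ from (4), so that $\|\mathbf E\|=7/13$ in Example 1.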
EXERCISES 9.10

1. We concluded from (4) that the best choice for the $c_j$'s is $c_j=\alpha_j=\mathbf u\cdot\hat{\mathbf e}_j$. Show that this same result is obtained from (2) by setting $\partial\|\mathbf E\|^2/\partial c_j=0$, and verify that the extremum thus obtained is a minimum.

2. Let $S$ be $\mathbb R^5$, and let $N=3$ with $\hat{\mathbf e}_1=\frac{1}{\sqrt5}(1,0,2,0,0)$, $\hat{\mathbf e}_2=\frac{1}{\sqrt6}(2,0,-1,0,1)$, $\hat{\mathbf e}_3=(0,0,0,1,0)$. Find the best approximation to the given $\mathbf u$ vector within span$\{\hat{\mathbf e}_1,\hat{\mathbf e}_2,\hat{\mathbf e}_3\}$, and the norm of the error vector.
(a) $(3,-2,0,0,5)$  (b) $(0,0,0,2,1)$  (c) $(3,0,1,4,1)$
(d) $(1,1,0,1,1)$  (e) $(0,2,0,0,0)$  (f) $(1,0,-3,3,1)$
(g) $(0,7,0,3,0)$  (h) $(1,2,3,4,5)$  (i) $(5,4,8,2,1)$

3. Let $S$ be $\mathbb R^4$, and let $\hat{\mathbf e}_1=\frac{1}{\sqrt3}(1,1,0,-1)$, $\hat{\mathbf e}_2=\frac{1}{\sqrt3}(1,-1,-1,0)$, $\hat{\mathbf e}_3=\frac{1}{\sqrt3}(1,0,1,1)$, $\hat{\mathbf e}_4=\frac{1}{\sqrt3}(0,1,-1,1)$. Find the best approximation to $\mathbf u=(4,-2,1,6)$ within span$\{\hat{\mathbf e}_1\}$, span$\{\hat{\mathbf e}_1,\hat{\mathbf e}_2\}$, span$\{\hat{\mathbf e}_1,\hat{\mathbf e}_2,\hat{\mathbf e}_3\}$, and span$\{\hat{\mathbf e}_1,\hat{\mathbf e}_2,\hat{\mathbf e}_3,\hat{\mathbf e}_4\}$, and in each case compute the norm of the error vector, $\|\mathbf E\|$.

4. Same as Exercise 3, but for the given $\mathbf u$ vector.
(a) $(4,1,0,-1)$  (b) $(3,-1,1,2)$  (c) $(0,0,2,5)$
(d) $(1,2,4,4)$  (e) $(0,5,3,-1)$  (f) $(2,0,-1,-1)$

5. (Bessel inequality) Beginning with (4), derive the Bessel inequality
\[
\sum_{j=1}^N(\mathbf u\cdot\hat{\mathbf e}_j)^2\le\|\mathbf u\|^2. \tag{5.1}
\]
Notice that if $\mathbf u$ happens to be in span$\{\hat{\mathbf e}_1,\dots,\hat{\mathbf e}_N\}$, or if $\dim S=N$, then (5.1) becomes an equality. In two and three dimensions that equality is actually the Pythagorean theorem, and in more than three dimensions it amounts to an abstract extension of that theorem.

6. (A different inner product) In Examples 1 and 2 we use the "usual" inner product for $\mathbb R^n$, $\mathbf u\cdot\mathbf v=u_1v_1+\cdots+u_nv_n$, but that is not the only acceptable one. In Example 3 of Section 9.6 we see that another acceptable inner product is $\mathbf u\cdot\mathbf v=w_1u_1v_1+\cdots+w_nu_nv_n$, where the $w_j$'s are fixed positive constants, or "weights."
(a) Rework Example 1 using the modified inner product
\[
\mathbf u\cdot\mathbf v=2u_1v_1+u_2v_2 \tag{6.1}
\]
and its corresponding natural norm $\|\mathbf u\|=\sqrt{2u_1^2+u_2^2}$. Show that the resulting best approximation is $\frac{29}{313}(12,5)$, which is not the same as the best approximation $\frac{17}{169}(12,5)$ given by (6). HINT: You will need to rescale $\hat{\mathbf e}_1$ so as to be a unit vector according to the new norm.
(b) Whereas the error vector $AB$ (Fig. 1) is orthogonal and perpendicular to span$\{\hat{\mathbf e}_1\}$, show that in this exercise the error vector is indeed orthogonal to span$\{\hat{\mathbf e}_1\}$, as promised in the text, but not perpendicular to it. To explain this "paradox," show that for the modified inner product the orthogonality of two nonzero vectors does not imply their perpendicularity.

7. Verify the last step in (14), that $\sum_{j=1}^N c_j\delta_{jk}=c_k$.

8. Verify the following, where $i,j,k,l$ run from 1 to $N$.
(a) […]  (b) $\sum_j\delta_{ij}\delta_{jk}=\delta_{ik}$  (c) $\sum_j\sum_k\delta_{ij}\delta_{jk}\delta_{kl}=\delta_{il}$

Chapter 9 Review

We begin with the two- and three-dimensional "arrow vector" concept that is probably already familiar to you from an introductory course in physics, where the vectors denoted forces, velocities, and so on. For such vectors, vector addition $\mathbf u+\mathbf v$, scalar multiplication ($\alpha\mathbf u$), a zero vector ($\mathbf 0$), a negative inverse [$-\mathbf u=(-1)\mathbf u$], a norm ($\|\mathbf u\|$), a dot product
\[
\mathbf u\cdot\mathbf v=\|\mathbf u\|\,\|\mathbf v\|\cos\theta, \tag{1}
\]
and the angle $\theta=\cos^{-1}\!\big(\frac{\mathbf u\cdot\mathbf v}{\|\mathbf u\|\,\|\mathbf v\|}\big)$ between $\mathbf u$ and $\mathbf v$ are all defined.

From there, we generalize to abstract $n$-space, where $\mathbf u=(u_1,\dots,u_n)$, by defining vector addition, and so on, in such a way that they agree with the corresponding arrow vector definitions when $n=2$ and $n=3$. For instance,
\[
\mathbf u\cdot\mathbf v=\sum_{j=1}^n u_jv_j, \tag{2}
\]
\[
\|\mathbf u\|=\sqrt{\mathbf u\cdot\mathbf u}=\sqrt{\sum_{j=1}^n u_j^2}, \tag{3}
\]
and
\[
\theta=\cos^{-1}\!\Big(\frac{\mathbf u\cdot\mathbf v}{\|\mathbf u\|\,\|\mathbf v\|}\Big). \tag{4}
\]
From these definitions, we derived various properties such as
\[
\mathbf u+\mathbf v=\mathbf v+\mathbf u, \quad\text{(commutative)} \tag{5}
\]
\[
(\mathbf u+\mathbf v)+\mathbf w=\mathbf u+(\mathbf v+\mathbf w), \quad\text{(associative)} \tag{6}
\]
and so on, along with the following properties of the dot product and norm.

Dot Product
Commutative: $\mathbf u\cdot\mathbf v=\mathbf v\cdot\mathbf u$,  (7a)
Nonnegative: $\mathbf u\cdot\mathbf u>0$ for all $\mathbf u\neq\mathbf 0$, and $=0$ for $\mathbf u=\mathbf 0$,  (7b)
Linear: $(\alpha\mathbf u+\beta\mathbf v)\cdot\mathbf w=\alpha(\mathbf u\cdot\mathbf w)+\beta(\mathbf v\cdot\mathbf w)$,  (7c)

Norm
Scaling: $\|\alpha\mathbf u\|=|\alpha|\,\|\mathbf u\|$,  (8a)
Nonnegative: $\|\mathbf u\|>0$ for all $\mathbf u\neq\mathbf 0$, and $=0$ for $\mathbf u=\mathbf 0$,  (8b)
Triangular Inequality: $\|\mathbf u+\mathbf v\|\le\|\mathbf u\|+\|\mathbf v\|$.  (8c)

To complete the extension to generalized vector space, we reverse the cart and the horse by elevating these various properties to the level of axioms, or requirements. That is, we let the fundamental objects, the vectors, be whatever we choose them to be, and then define addition and scalar multiplication operations, a zero vector, a negative inverse, a dot or "inner" product (if we wish), and a norm (if we wish), so that those axioms are satisfied. Our chief interest, in introducing generalized vector space, is in function spaces, but we will not work with function spaces until Chapter 17, when we study Fourier series and the Sturm-Liouville theory.

Next, we introduce the concepts of span and linear dependence, primarily so that we can develop the idea of the expansion of a given vector in a vector space $S$ in terms of a set of base vectors for $S$. We define a set of vectors $\{\mathbf e_1,\dots,\mathbf e_k\}$ to be a basis for $S$ if each vector $\mathbf u$ in $S$ can be expressed ("expanded") uniquely in the form $\mathbf u=\alpha_1\mathbf e_1+\cdots+\alpha_k\mathbf e_k$, and prove that a set $\{\mathbf e_1,\dots,\mathbf e_k\}$ is a basis for $S$ if and only if it spans $S$ and is LI (linearly independent).

In particular, orthogonal bases are especially convenient because of the ease with which one can compute the expansion coefficients $\alpha_j$. The result is
\[
\mathbf u=\Big(\frac{\mathbf u\cdot\mathbf e_1}{\mathbf e_1\cdot\mathbf e_1}\Big)\mathbf e_1+\cdots+\Big(\frac{\mathbf u\cdot\mathbf e_k}{\mathbf e_k\cdot\mathbf e_k}\Big)\mathbf e_k \tag{9}
\]
if the basis is orthogonal, and
\[
\mathbf u=(\mathbf u\cdot\hat{\mathbf e}_1)\,\hat{\mathbf e}_1+\cdots+(\mathbf u\cdot\hat{\mathbf e}_k)\,\hat{\mathbf e}_k \tag{10}
\]
if it is ON (orthonormal); (9) and (10) should be understood and remembered.

Finally, we study the question of the best approximation of a given vector $\mathbf u$ in a vector space $S$ in terms of an ON set $\{\hat{\mathbf e}_1,\dots,\hat{\mathbf e}_N\}$ which falls short of being a basis for $S$. We show that the best approximation (i.e., the one that minimizes the norm of the error vector) is
\[
\mathbf u\approx\sum_{j=1}^N(\mathbf u\cdot\hat{\mathbf e}_j)\,\hat{\mathbf e}_j \tag{11}
\]
which, in geometrical language, is the orthogonal projection of $\mathbf u$ onto the span of $\hat{\mathbf e}_1,\dots,\hat{\mathbf e}_N$.
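The expansion formula (9) is easy to try out numerically. The sketch below, in Python with exact fractions, expands $\mathbf u=(1,0,0,0)$ in an orthogonal (but not ON) basis of $\mathbb R^4$. The function name `expand_orthogonal` and the particular vectors are chosen here purely for illustration; orthogonality of the basis is checked in the code rather than taken on trust.

```python
from fractions import Fraction


def dot(u, v):
    """Usual dot product on R^n."""
    return sum(a * b for a, b in zip(u, v))


def expand_orthogonal(u, basis):
    """Expansion coefficients alpha_j = (u . e_j) / (e_j . e_j), as in (9).

    Assumption: the basis vectors are mutually orthogonal and have
    integer components (so that Fraction can divide them exactly).
    """
    return [Fraction(dot(u, e), dot(e, e)) for e in basis]


# An illustrative orthogonal basis of R^4 (not normalized)
basis = [(2, 0, -1, -5), (2, 0, -1, 1), (0, 1, 0, 0), (1, 0, 2, 0)]
u = (1, 0, 0, 0)

# Check mutual orthogonality before relying on formula (9)
assert all(dot(basis[i], basis[j]) == 0
           for i in range(len(basis)) for j in range(i))

alphas = expand_orthogonal(u, basis)   # coefficients 1/15, 1/3, 0, 1/5

# Rebuild u from its expansion to confirm that (9) reproduces it
rebuilt = [sum(a * e[i] for a, e in zip(alphas, basis))
           for i in range(len(u))]

print(alphas)
print(rebuilt)
```

If the basis vectors were first normalized, each denominator $\mathbf e_j\cdot\mathbf e_j$ would equal 1 and the same routine would reduce to the ON formula (10).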
Chapter 10 Matrices and Linear Equations

10.1 Introduction

We have already met matrices in Section 8.3.3, but they were introduced there only as a notational convenience for the implementation of Gauss elimination and Gauss-Jordan reduction. In the present chapter we focus on matrix theory itself, which theory will enable us to obtain additional important results regarding the solution of systems of linear algebraic equations.

One way to view matrix theory is to think in terms of a parallel with function theory. In our mathematical training, we first study numbers, the points on a real number axis. Then we study functions, which are mappings, or transformations, from one real axis to another. For instance, $f(x)=x^2$ maps the point $x=3$, say, on an $x$ axis to the point $f=9$ on an $f$ axis. Just as functions act upon numbers, we shall see that matrices act upon vectors and are mappings from one vector space to another. Having studied vectors, in Chapter 9, we can now turn our attention to matrices.

Historically, matrix theory did not become a part of undergraduate engineering science curricula until around 1960, when digital computers became widely available in academia.

10.2 Matrices and Matrix Algebra

A matrix is a rectangular array of quantities that are called the elements of the matrix. Normally, the elements will be real numbers, although they may occasionally be other objects such as differential operators or even matrices. Some of these cases will be met as we go along; for the present, however, let us consider the elements to be real numbers. The complex case is studied in Chapter 12.

Specifically, any matrix $\mathbf A$ may be expressed as
\[
\mathbf A=\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & & \vdots\\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix} \tag{1}
\]
where the brackets (or, in some texts, parentheses) are used to emphasize that the entire array is to be regarded as a single entity. A horizontal line of elements is called a row, and a vertical line is called a column. Counting rows from the top and columns from the left, then $a_{21},a_{22},\dots,a_{2n}$ and $a_{13},a_{23},\dots,a_{m3}$ are the second row and the third column, respectively.
