Advanced Engineering Mathematics (Michael Greenberg)
Michael D. Greenberg
SELECTED FORMULAS
FIRST-ORDER LINEAR: $y' + p(x)y = q(x)$
General solution: $y(x) = e^{-\int p(x)\,dx}\left(\int e^{\int p(x)\,dx}\,q(x)\,dx + C\right)$
If $y(a) = b$: $y(x) = e^{-\int_a^x p(\xi)\,d\xi}\left(\int_a^x e^{\int_a^\xi p(\zeta)\,d\zeta}\,q(\xi)\,d\xi + b\right)$
LEGENDRE EQUATION: $(1 - x^2)y'' - 2xy' + \lambda y = 0$
Bounded solutions $P_n(x)$ on $-1 \le x \le 1$ if $\lambda = n(n+1)$, $n = 0, 1, 2, \ldots$
BESSEL EQUATION: $x^2 y'' + x y' + (x^2 - \nu^2)y = 0$
General solution: $y(x) = A J_\nu(x) + B Y_\nu(x)$
MODIFIED BESSEL EQUATION: $x^2 y'' + x y' + (-x^2 - \nu^2)y = 0$
General solution: $y(x) = A I_\nu(x) + B K_\nu(x)$
REDUCIBLE TO A BESSEL EQUATION: $\frac{d}{dx}\left(x^a \frac{dy}{dx}\right) + b x^c y = 0$
Solution: $y(x) = x^{\nu/\alpha} Z_{|\nu|}\left(\alpha\sqrt{|b|}\,x^{1/\alpha}\right)$, with $\alpha = \frac{2}{c - a + 2}$ and $\nu = \frac{1 - a}{c - a + 2}$,
where $Z_{|\nu|}$ denotes $J_{|\nu|}$ and $Y_{|\nu|}$ if $b > 0$, and $I_{|\nu|}$ and $K_{|\nu|}$ if $b < 0$
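The first-order linear formula with the initial condition $y(a) = b$ can be checked numerically. The following is a minimal sketch, not part of the book: the helper names `simpson` and `linear_first_order` are hypothetical, the quadrature is plain composite Simpson, and the test problem $y' + y = x$, $y(0) = 1$ (exact solution $x - 1 + 2e^{-x}$) is chosen only for illustration.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson's rule on [a, b] using an even number n of subintervals."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def linear_first_order(p, q, a, b, x):
    """Evaluate y(x) for y' + p(x) y = q(x), y(a) = b, from the integral formula."""
    P = lambda s: simpson(p, a, s)  # P(s) = integral of p from a to s
    inner = simpson(lambda s: math.exp(P(s)) * q(s), a, x)
    return math.exp(-P(x)) * (inner + b)

# Illustrative problem (an assumption, not from the text): y' + y = x, y(0) = 1.
y_num = linear_first_order(lambda s: 1.0, lambda s: s, 0.0, 1.0, 2.0)
y_exact = 2.0 - 1.0 + 2.0 * math.exp(-2.0)
print(abs(y_num - y_exact))  # small quadrature error
```

The nested quadrature is deliberately naive; for production work an ODE solver or adaptive quadrature would be the usual choice.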
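MATRICES: $A^{-1} = \frac{1}{\det A}\,\operatorname{adj} A$, $(AB)^{-1} = B^{-1}A^{-1}$, $(A^T)^{-1} = (A^{-1})^T$, $(AB)^T = B^T A^T$
CARTESIAN COORDINATES: $u = u(x, y, z)$, $\mathbf{v} = v_x\mathbf{i} + v_y\mathbf{j} + v_z\mathbf{k}$
$\mathbf{R} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$, $d\mathbf{R} = dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k}$
$dA = dy\,dz$ (constant-x surface), $dx\,dz$ (constant-y surface), $dx\,dy$ (constant-z surface)
$dV = dx\,dy\,dz$
$\nabla u = \frac{\partial u}{\partial x}\mathbf{i} + \frac{\partial u}{\partial y}\mathbf{j} + \frac{\partial u}{\partial z}\mathbf{k}$
$\nabla\cdot\mathbf{v} = \frac{\partial v_x}{\partial x} + \frac{\partial v_y}{\partial y} + \frac{\partial v_z}{\partial z}$
$\nabla\times\mathbf{v} = \left(\frac{\partial v_z}{\partial y} - \frac{\partial v_y}{\partial z}\right)\mathbf{i} + \left(\frac{\partial v_x}{\partial z} - \frac{\partial v_z}{\partial x}\right)\mathbf{j} + \left(\frac{\partial v_y}{\partial x} - \frac{\partial v_x}{\partial y}\right)\mathbf{k}$
$\nabla^2 u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}$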
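The Cartesian expressions for divergence and curl lend themselves to a finite-difference check. The sketch below is illustrative only: `partial`, `divergence`, and `curl` are hypothetical helper names, the step size is an assumed default, and the two test fields (a rigid rotation about the z axis and the radial field) are chosen because their curl and divergence are known constants.

```python
def partial(f, p, i, h=1e-4):
    """Central-difference estimate of the partial derivative of scalar f at p along axis i."""
    lo, hi = list(p), list(p)
    lo[i] -= h
    hi[i] += h
    return (f(hi) - f(lo)) / (2 * h)

def divergence(v, p):
    """Sum of d(v_i)/d(x_i) over the three Cartesian axes."""
    return sum(partial(lambda r, i=i: v(r)[i], p, i) for i in range(3))

def curl(v, p):
    """Cartesian curl: (dvz/dy - dvy/dz, dvx/dz - dvz/dx, dvy/dx - dvx/dy)."""
    d = lambda comp, axis: partial(lambda r: v(r)[comp], p, axis)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

rotation = lambda r: (-r[1], r[0], 0.0)  # v = -y i + x j, curl should be 2k
radial = lambda r: (r[0], r[1], r[2])    # v = R, divergence should be 3
point = [0.3, -0.7, 1.2]
print(curl(rotation, point))
print(divergence(radial, point))
```

Both test fields are linear, so central differences reproduce the exact values up to rounding.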
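CYLINDRICAL COORDINATES: $x = r\cos\theta$, $y = r\sin\theta$, $z = z$
$u = u(r, \theta, z)$, $\mathbf{v} = v_r\hat{\mathbf{e}}_r + v_\theta\hat{\mathbf{e}}_\theta + v_z\hat{\mathbf{e}}_z$
$\mathbf{R} = r\hat{\mathbf{e}}_r + z\hat{\mathbf{e}}_z$, $d\mathbf{R} = dr\,\hat{\mathbf{e}}_r + r\,d\theta\,\hat{\mathbf{e}}_\theta + dz\,\hat{\mathbf{e}}_z$
$dA = r\,d\theta\,dz$ (constant-r surface), $dr\,dz$ (constant-θ surface), $r\,dr\,d\theta$ (constant-z surface)
$dV = r\,dr\,d\theta\,dz$
$\frac{d\hat{\mathbf{e}}_r}{d\theta} = \hat{\mathbf{e}}_\theta$, $\frac{d\hat{\mathbf{e}}_\theta}{d\theta} = -\hat{\mathbf{e}}_r$
$\nabla u = \frac{\partial u}{\partial r}\hat{\mathbf{e}}_r + \frac{1}{r}\frac{\partial u}{\partial\theta}\hat{\mathbf{e}}_\theta + \frac{\partial u}{\partial z}\hat{\mathbf{e}}_z$
$\nabla\cdot\mathbf{v} = \frac{1}{r}\frac{\partial(r v_r)}{\partial r} + \frac{1}{r}\frac{\partial v_\theta}{\partial\theta} + \frac{\partial v_z}{\partial z}$
$\nabla\times\mathbf{v} = \left(\frac{1}{r}\frac{\partial v_z}{\partial\theta} - \frac{\partial v_\theta}{\partial z}\right)\hat{\mathbf{e}}_r + \left(\frac{\partial v_r}{\partial z} - \frac{\partial v_z}{\partial r}\right)\hat{\mathbf{e}}_\theta + \frac{1}{r}\left(\frac{\partial(r v_\theta)}{\partial r} - \frac{\partial v_r}{\partial\theta}\right)\hat{\mathbf{e}}_z$
$\nabla^2 u = \frac{\partial^2 u}{\partial r^2} + \frac{1}{r}\frac{\partial u}{\partial r} + \frac{1}{r^2}\frac{\partial^2 u}{\partial\theta^2} + \frac{\partial^2 u}{\partial z^2}$
SPHERICAL COORDINATES: $x = \rho\sin\phi\cos\theta$, $y = \rho\sin\phi\sin\theta$, $z = \rho\cos\phi$
$u = u(\rho, \phi, \theta)$, $\mathbf{v} = v_\rho\hat{\mathbf{e}}_\rho + v_\phi\hat{\mathbf{e}}_\phi + v_\theta\hat{\mathbf{e}}_\theta$
$\mathbf{R} = \rho\hat{\mathbf{e}}_\rho$, $d\mathbf{R} = d\rho\,\hat{\mathbf{e}}_\rho + \rho\,d\phi\,\hat{\mathbf{e}}_\phi + \rho\sin\phi\,d\theta\,\hat{\mathbf{e}}_\theta$
$dA = \rho^2|\sin\phi|\,d\phi\,d\theta$ (constant-ρ surface), $\rho|\sin\phi|\,d\rho\,d\theta$ (constant-φ surface), $\rho\,d\rho\,d\phi$ (constant-θ surface)
$dV = \rho^2|\sin\phi|\,d\rho\,d\phi\,d\theta$
$\frac{\partial\hat{\mathbf{e}}_\rho}{\partial\rho} = \mathbf{0}$, $\frac{\partial\hat{\mathbf{e}}_\rho}{\partial\phi} = \hat{\mathbf{e}}_\phi$, $\frac{\partial\hat{\mathbf{e}}_\rho}{\partial\theta} = \sin\phi\,\hat{\mathbf{e}}_\theta$
$\frac{\partial\hat{\mathbf{e}}_\phi}{\partial\rho} = \mathbf{0}$, $\frac{\partial\hat{\mathbf{e}}_\phi}{\partial\phi} = -\hat{\mathbf{e}}_\rho$, $\frac{\partial\hat{\mathbf{e}}_\phi}{\partial\theta} = \cos\phi\,\hat{\mathbf{e}}_\theta$
$\frac{\partial\hat{\mathbf{e}}_\theta}{\partial\rho} = \mathbf{0}$, $\frac{\partial\hat{\mathbf{e}}_\theta}{\partial\phi} = \mathbf{0}$, $\frac{\partial\hat{\mathbf{e}}_\theta}{\partial\theta} = -\sin\phi\,\hat{\mathbf{e}}_\rho - \cos\phi\,\hat{\mathbf{e}}_\phi$
$\nabla u = \frac{\partial u}{\partial\rho}\hat{\mathbf{e}}_\rho + \frac{1}{\rho}\frac{\partial u}{\partial\phi}\hat{\mathbf{e}}_\phi + \frac{1}{\rho\sin\phi}\frac{\partial u}{\partial\theta}\hat{\mathbf{e}}_\theta$
$\nabla\cdot\mathbf{v} = \frac{1}{\rho^2}\frac{\partial(\rho^2 v_\rho)}{\partial\rho} + \frac{1}{\rho\sin\phi}\frac{\partial(\sin\phi\,v_\phi)}{\partial\phi} + \frac{1}{\rho\sin\phi}\frac{\partial v_\theta}{\partial\theta}$
$\nabla\times\mathbf{v} = \frac{1}{\rho\sin\phi}\left(\frac{\partial(\sin\phi\,v_\theta)}{\partial\phi} - \frac{\partial v_\phi}{\partial\theta}\right)\hat{\mathbf{e}}_\rho + \frac{1}{\rho}\left(\frac{1}{\sin\phi}\frac{\partial v_\rho}{\partial\theta} - \frac{\partial(\rho v_\theta)}{\partial\rho}\right)\hat{\mathbf{e}}_\phi + \frac{1}{\rho}\left(\frac{\partial(\rho v_\phi)}{\partial\rho} - \frac{\partial v_\rho}{\partial\phi}\right)\hat{\mathbf{e}}_\theta$
$\nabla^2 u = \frac{1}{\rho^2}\left[\frac{\partial}{\partial\rho}\left(\rho^2\frac{\partial u}{\partial\rho}\right) + \frac{1}{\sin\phi}\frac{\partial}{\partial\phi}\left(\sin\phi\frac{\partial u}{\partial\phi}\right) + \frac{1}{\sin^2\phi}\frac{\partial^2 u}{\partial\theta^2}\right]$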
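As a sanity check on the spherical volume element $dV = \rho^2|\sin\phi|\,d\rho\,d\phi\,d\theta$, integrating it over a ball of radius $R$ should reproduce $\frac{4}{3}\pi R^3$. This is a rough sketch under stated assumptions: the function name `sphere_volume` and the grid size are hypothetical, and the midpoint rule is used only for simplicity.

```python
import math

def sphere_volume(R, n=60):
    """Midpoint-rule integral of rho^2 |sin(phi)| over 0<rho<R, 0<phi<pi, 0<theta<2pi."""
    d_rho, d_phi = R / n, math.pi / n
    total = 0.0
    for i in range(n):
        rho = (i + 0.5) * d_rho
        for j in range(n):
            phi = (j + 0.5) * d_phi
            total += rho**2 * abs(math.sin(phi)) * d_rho * d_phi
    return 2 * math.pi * total  # the integrand does not depend on theta

print(sphere_volume(1.0), 4 * math.pi / 3)
```

The θ integral is done in closed form because the integrand is independent of θ; only the ρ and φ integrals are approximated.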
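AREA ELEMENT:
If $x = x(u, v)$, $y = y(u, v)$, $z = z(u, v)$: $dA = \sqrt{EG - F^2}\,du\,dv$,
where $E = x_u^2 + y_u^2 + z_u^2$, $F = x_u x_v + y_u y_v + z_u z_v$, $G = x_v^2 + y_v^2 + z_v^2$
If $z = f(x, y)$: $dA = \sqrt{1 + f_x^2 + f_y^2}\,dx\,dy$
VOLUME ELEMENT: $dV = \left|\frac{\partial(x, y, z)}{\partial(u, v, w)}\right| du\,dv\,dw = \left|\det\begin{pmatrix} x_u & x_v & x_w \\ y_u & y_v & y_w \\ z_u & z_v & z_w \end{pmatrix}\right| du\,dv\,dw$
LEIBNIZ RULE: $\frac{d}{dt}\int_{a(t)}^{b(t)} f(x, t)\,dx = \int_{a(t)}^{b(t)} \frac{\partial f}{\partial t}\,dx + b'(t)\,f(b(t), t) - a'(t)\,f(a(t), t)$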
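The Leibniz rule can be verified numerically by comparing a finite-difference derivative of the integral with the right-hand side of the rule. The sketch below is illustrative: the integrand $\sin(xt)$, the limits $a(t) = t$, $b(t) = t^2$, and all helper names are assumptions made for the example.

```python
import math

def simpson(f, a, b, n=400):
    """Composite Simpson's rule on [a, b] using an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

f = lambda x, t: math.sin(x * t)          # integrand f(x, t)
df_dt = lambda x, t: x * math.cos(x * t)  # its partial derivative in t
a, da = (lambda t: t), (lambda t: 1.0)          # lower limit and its derivative
b, db = (lambda t: t * t), (lambda t: 2.0 * t)  # upper limit and its derivative

def I(t):
    """I(t) = integral of f(x, t) over a(t) <= x <= b(t)."""
    return simpson(lambda x: f(x, t), a(t), b(t))

def leibniz_rhs(t):
    """Right-hand side of the Leibniz rule."""
    return (simpson(lambda x: df_dt(x, t), a(t), b(t))
            + db(t) * f(b(t), t) - da(t) * f(a(t), t))

t0, h = 1.5, 1e-5
lhs = (I(t0 + h) - I(t0 - h)) / (2 * h)  # central-difference estimate of dI/dt
rhs = leibniz_rhs(t0)
print(lhs, rhs)
```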
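FOURIER SERIES:
$f(x)$ $2\ell$-periodic: $f(x) = a_0 + \sum_{n=1}^{\infty}\left(a_n\cos\frac{n\pi x}{\ell} + b_n\sin\frac{n\pi x}{\ell}\right)$,
$a_0 = \frac{1}{2\ell}\int_{-\ell}^{\ell} f(x)\,dx$, $a_n = \frac{1}{\ell}\int_{-\ell}^{\ell} f(x)\cos\frac{n\pi x}{\ell}\,dx$, $b_n = \frac{1}{\ell}\int_{-\ell}^{\ell} f(x)\sin\frac{n\pi x}{\ell}\,dx$
$f(x)$ defined only on $0 < x < L$:
HRC: $f(x) = a_0 + \sum_{n=1}^{\infty} a_n\cos\frac{n\pi x}{L}$, $a_0 = \frac{1}{L}\int_0^L f(x)\,dx$, $a_n = \frac{2}{L}\int_0^L f(x)\cos\frac{n\pi x}{L}\,dx$
HRS: $f(x) = \sum_{n=1}^{\infty} b_n\sin\frac{n\pi x}{L}$, $b_n = \frac{2}{L}\int_0^L f(x)\sin\frac{n\pi x}{L}\,dx$
QRC: $f(x) = \sum_{n=1,3,\ldots}^{\infty} a_n\cos\frac{n\pi x}{2L}$, $a_n = \frac{2}{L}\int_0^L f(x)\cos\frac{n\pi x}{2L}\,dx$
QRS: $f(x) = \sum_{n=1,3,\ldots}^{\infty} b_n\sin\frac{n\pi x}{2L}$, $b_n = \frac{2}{L}\int_0^L f(x)\sin\frac{n\pi x}{2L}\,dx$
FOURIER INTEGRAL: $f(x) = \int_0^{\infty}\left[a(\omega)\cos\omega x + b(\omega)\sin\omega x\right]d\omega$ $(-\infty < x < \infty)$,
$a(\omega) = \frac{1}{\pi}\int_{-\infty}^{\infty} f(x)\cos\omega x\,dx$, $b(\omega) = \frac{1}{\pi}\int_{-\infty}^{\infty} f(x)\sin\omega x\,dx$
FOURIER TRANSFORM: $F\{f(x)\} = \hat{f}(\omega) = \int_{-\infty}^{\infty} f(x)\,e^{-i\omega x}\,dx$
$F^{-1}\{\hat{f}(\omega)\} = f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \hat{f}(\omega)\,e^{i\omega x}\,d\omega$
LAPLACE TRANSFORM: $L\{f(t)\} = F(s) = \int_0^{\infty} f(t)\,e^{-st}\,dt$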
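To illustrate the half-range sine (HRS) coefficient formula, the sketch below computes $b_n$ by quadrature for $f(x) = x$ on $0 < x < L$ and compares with the closed form $b_n = 2L(-1)^{n+1}/(n\pi)$, which follows from integration by parts. The helper names and the choice $L = 2$ are assumptions made for the example.

```python
import math

def simpson(f, a, b, n=400):
    """Composite Simpson's rule on [a, b] using an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def hrs_coefficient(f, L, n):
    """b_n = (2/L) * integral from 0 to L of f(x) sin(n pi x / L) dx."""
    return (2 / L) * simpson(lambda x: f(x) * math.sin(n * math.pi * x / L), 0.0, L)

L = 2.0
f = lambda x: x
for n in range(1, 4):
    exact = 2 * L * (-1) ** (n + 1) / (n * math.pi)
    print(n, hrs_coefficient(f, L, n), exact)
```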
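DIVERGENCE THEOREM: $\int_V \nabla\cdot\mathbf{v}\,dV = \int_S \hat{\mathbf{n}}\cdot\mathbf{v}\,dA$
GREEN’S FIRST IDENTITY: $\int_V \left(\nabla u\cdot\nabla v + u\nabla^2 v\right)dV = \int_S u\frac{\partial v}{\partial n}\,dA$
GREEN’S SECOND IDENTITY: $\int_V \left(u\nabla^2 v - v\nabla^2 u\right)dV = \int_S \left(u\frac{\partial v}{\partial n} - v\frac{\partial u}{\partial n}\right)dA$
STOKES’S THEOREM: $\int_S \hat{\mathbf{n}}\cdot\nabla\times\mathbf{v}\,dA = \oint_C \mathbf{v}\cdot d\mathbf{R}$
GREEN’S THEOREM: $\int_S \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right)dA = \oint_C P\,dx + Q\,dy$
STURM-LIOUVILLE EQUATION: $[p(x)y']' + q(x)y + \lambda w(x)y = 0$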
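SINE INTEGRAL FUNCTION: $\mathrm{Si}(x) = \int_0^x \frac{\sin t}{t}\,dt$, $\mathrm{Si}(\infty) = \frac{\pi}{2}$
EXPONENTIAL INTEGRAL FUNCTION: $E_1(x) = \int_x^{\infty} \frac{e^{-t}}{t}\,dt$ $(x > 0)$
GAMMA FUNCTION: $\Gamma(x) = \int_0^{\infty} t^{x-1}e^{-t}\,dt$ $(x > 0)$
ERROR FUNCTION: $\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt$, $\mathrm{erf}(\infty) = 1$
TRIGONOMETRIC FUNCTION IDENTITIES:
$\cos x = \frac{e^{ix} + e^{-ix}}{2}$, $\sin x = \frac{e^{ix} - e^{-ix}}{2i}$
$\cos(ix) = \cosh x$, $\sin(ix) = i\sinh x$
$\cos^2 x + \sin^2 x = 1$
$\cos(A \pm B) = \cos A\cos B \mp \sin A\sin B$
$\sin(A \pm B) = \sin A\cos B \pm \sin B\cos A$
$\cos A\cos B = [\cos(A + B) + \cos(A - B)]/2$
$\sin A\cos B = [\sin(A + B) + \sin(A - B)]/2$
$\sin A\sin B = [\cos(A - B) - \cos(A + B)]/2$
HYPERBOLIC FUNCTION IDENTITIES:
$\cosh x = \frac{e^{x} + e^{-x}}{2}$, $\sinh x = \frac{e^{x} - e^{-x}}{2}$
$\cosh(ix) = \cos x$, $\sinh(ix) = i\sin x$
$\cosh^2 x - \sinh^2 x = 1$
$\cosh(A \pm B) = \cosh A\cosh B \pm \sinh A\sinh B$
$\sinh(A \pm B) = \sinh A\cosh B \pm \sinh B\cosh A$
TAYLOR SERIES:
$f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f'''(a)}{3!}(x - a)^3 + \cdots$
$\frac{1}{1 - x} = 1 + x + x^2 + \cdots$, $|x| < 1$ (Geometric Series)
$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$, $|x| < \infty$
$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots$, $|x| < \infty$
$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots$, $|x| < \infty$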
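The integral definition of the error function can be compared against the standard library's `math.erf`. This is a small sketch, not the book's method: `simpson` and `erf_quad` are hypothetical helper names and the number of subintervals is an assumed default.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson's rule on [a, b] using an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def erf_quad(x):
    """erf(x) = (2 / sqrt(pi)) * integral from 0 to x of exp(-t^2) dt, by quadrature."""
    return 2 / math.sqrt(math.pi) * simpson(lambda t: math.exp(-t * t), 0.0, x)

for x in (0.5, 1.0):
    print(x, erf_quad(x), math.erf(x))
```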
Advanced Engineering Mathematics
SECOND EDITION
Michael D. Greenberg
Department of Mechanical Engineering
University of Delaware, Newark, Delaware
PRENTICE HALL
Upper Saddle River, New Jersey 07458
Library of Congress Cataloging-in-Publication Data
Greenberg, Michael D., date-
Advanced engineering mathematics / Michael D. Greenberg. -- 2nd ed.
p. cm.
Includes bibliographical references and index.
ISBN 0-13-321431-1
1. Engineering mathematics. I. Title.
TA330.G725 1998
515'.14--dc21 97-43585
CIP
Technical Consultant: Dr. E. Murat Sozer
Acquisition editor: George Lobell
Editorial director: Tim Bozik
Editor-in-chief: Jerome Grant
Editorial assistant: Gale Epps
Executive managing editor: Kathleen Schiaparelli
Managing editor: Linda Mihatov Behrens
Production editor: Nick Romanelli
Director of creative services: Paula Maylahn
Art manager: Gus Vibal
Art director / cover designer: Jayne Conte
Cover photos: Timothy Hursley
Marketing manager: Melody Marcus
Marketing assistant: Jennifer Pan
Assistant vice president of production and manufacturing: David Riccardi
Manufacturing buyer: Alan Fischer
© 1998, 1988 by Prentice-Hall, Inc.
Simon & Schuster / A Viacom Company
Upper Saddle River, New Jersey 07458
All rights reserved. No part of this book may be reproduced,
in any form or by any means, without permission in writing
from the publisher.
Printed in the United States of America.
ISBN 0-13-321431-1
Prentice-Hall International (UK) Limited, London
Prentice-Hall of Australia Pty. Limited, Sydney
Prentice-Hall Canada, Inc., Toronto
Prentice-Hall Hispanoamericana, S.A., Mexico
Prentice-Hall of India Private Limited, New Delhi
Prentice-Hall of Japan, Inc., Tokyo
Simon & Schuster Asia Pte. Ltd., Singapore
Editora Prentice-Hall do Brasil, Ltda., Rio de Janeiro
Contents
Part I: Ordinary Differential Equations
1 INTRODUCTION TO DIFFERENTIAL EQUATIONS
1.1 Introduction
1.2 Definitions
1.3 Introduction to Modeling
2 EQUATIONS OF FIRST ORDER
2.1 Introduction
2.2 The Linear Equation
    2.2.1 Homogeneous case
    2.2.2 Integrating factor method
    2.2.3 Existence and uniqueness for the linear equation
    2.2.4 Variation-of-parameter method
2.3 Applications of the Linear Equation
    2.3.1 Electrical circuits
    2.3.2 Radioactive decay; carbon dating
    2.3.3 Population dynamics
    2.3.4 Mixing problems
2.4 Separable Equations
    2.4.1 Separable equations
    2.4.2 Existence and uniqueness (optional)
    2.4.3 Applications
    2.4.4 Nondimensionalization (optional)
2.5 Exact Equations and Integrating Factors
    2.5.1 Exact differential equations
    2.5.2 Integrating factors
Chapter 2 Review
3 LINEAR DIFFERENTIAL EQUATIONS OF SECOND ORDER AND HIGHER
3.1 Introduction
3.2 Linear Dependence and Linear Independence
3.3
    3.3.1
    3.3.2
3.4
    3.4.1
    3.4.2
    3.4.4
    3.4.5
3.6
    3.6.1 Cauchy-Euler equation
    3.6.2 Reduction of order (optional)
    3.6.3 Factoring the operator (optional)
3.7 Solution of Nonhomogeneous Equation
    3.7.1 General solution
    3.7.2 Undetermined coefficients
    3.7.3 Variation of parameters
    3.7.4 Variation of parameters for higher-order equations (optional)
3.8 Application to Harmonic Oscillator: Forced Oscillation
    3.8.1 Undamped case
    3.8.2 Damped case
3.9 Systems of Linear Differential Equations
    3.9.1 Examples
    3.9.2 Existence and uniqueness
    3.9.3 Solution by elimination
Chapter 3 Review
4 POWER SERIES SOLUTIONS
4.1 Introduction
4.2 Power Series Solutions
    4.2.1 Review of power series
    4.2.2 Power series solution of differential equations
4.3 The Method of Frobenius
    4.3.1 Singular points
    4.3.2 Method of Frobenius
4.4 Legendre Functions
    4.4.1 Legendre polynomials
    4.4.2 Orthogonality of the $P_n$’s
    4.4.3 Generating functions and properties
4.5 Singular Integrals; Gamma Function
    4.5.1 Singular integrals
    4.5.2 Gamma function
    4.5.3 Order of magnitude
4.6
    4.6.1 $\nu \ne$ integer
    4.6.2 $\nu =$ integer
    4.6.3 General solution of Bessel equation
    4.6.4 Hankel functions (optional)
    4.6.5 Modified Bessel equation
    4.6.6 Equations reducible to Bessel equations
Chapter 4 Review
5.1 Introduction
5.2 Calculation of the Transform
5.3 Properties of the Transform
5.4 Application to the Solution of Differential Equations
5.5 Discontinuous Forcing Functions; Heaviside Step Function
5.6 Impulsive Forcing Functions; Dirac Impulse Function (Optional)
5.7 Additional Properties
Chapter 5 Review
6.1 Introduction
6.2 Euler’s Method
6.3 Improvements: Midpoint Rule and Runge-Kutta
    6.3.1 Midpoint rule
    6.3.2 Second-order Runge-Kutta
    6.3.3 Fourth-order Runge-Kutta
    6.3.4 Empirical estimate of the order (optional)
    6.3.5 Multi-step and predictor-corrector methods (optional)
6.4 Application to Systems and Boundary-Value Problems
    6.4.1 Systems and higher-order equations
    6.4.2 Linear boundary-value problems
6.5 Stability and Difference Equations
    6.5.1 Introduction
    6.5.2 Stability
    6.5.3 Difference equations (optional)
Chapter 6 Review
7.1 Introduction
7.2 The Phase Plane
7.3 Singular Points and Stability
    7.3.1 Existence and uniqueness
    7.3.2 Singular points
    7.3.3 The elementary singularities and their stability
    7.3.4 Nonelementary singularities
7.4 Applications
    7.4.1 Singularities of nonlinear systems
    7.4.2 Applications
    7.4.3 Bifurcations
7.5 Limit Cycles, van der Pol Equation, and the Nerve Impulse
    7.5.1 Limit cycles and the van der Pol equation
    7.5.2 Application to the nerve impulse and visual perception
7.6 The Duffing Equation: Jumps and Chaos
    7.6.1 Duffing equation and the jump phenomenon
    7.6.2 Chaos
Chapter 7 Review
Part II: Linear Algebra
8 SYSTEMS OF LINEAR ALGEBRAIC EQUATIONS; GAUSS ELIMINATION
8.1 Introduction
8.2 Preliminary Ideas and Geometrical Approach
8.3 Solution by Gauss Elimination
    8.3.1 Motivation
    8.3.2 Gauss elimination
    8.3.3 Matrix notation
    8.3.4 Gauss-Jordan reduction
    8.3.5 Pivoting
Chapter 8 Review
9 VECTOR SPACE
9.1 Introduction
9.2 Vectors; Geometrical Representation
9.3 Introduction of Angle and Dot Product
9.4 n-Space
9.5 Dot Product, Norm, and Angle for n-Space
    9.5.1 Dot product, norm, and angle
    9.5.2 Properties of the dot product
    9.5.3 Properties of the norm
    9.5.4 Orthogonality
    9.5.5 Normalization
9.6 Generalized Vector Space
    9.6.1 Vector space
    9.6.2 Inclusion of inner product and/or norm
9.7 Span and Subspace
9.8 Linear Dependence
9.9 Bases, Expansions, Dimension
    9.9.1 Bases and expansions
    9.9.2 Dimension
    9.9.3 Orthogonal bases
9.10 Best Approximation
    9.10.1 Best approximation and orthogonal projection
    9.10.2 Kronecker delta
Chapter 9 Review
10 MATRICES AND LINEAR EQUATIONS
10.1 Introduction
10.2 Matrices and Matrix Algebra
10.3 The Transpose Matrix
10.4 Determinants
10.5 Rank; Application to Linear Dependence and to Existence and Uniqueness for $\mathbf{Ax} = \mathbf{c}$
    10.5.1 Rank
    10.5.2 Application of rank to the system $\mathbf{Ax} = \mathbf{c}$
10.6 Inverse Matrix, Cramer’s Rule, Factorization
    10.6.1 Inverse matrix
    10.6.2 Application to a mass-spring system
    10.6.3 Cramer’s rule
    10.6.4 Evaluation of $\mathbf{A}^{-1}$ by elementary row operations
    10.6.5 LU-factorization
10.7 Change of Basis (Optional)
10.8 Vector Transformation (Optional)
Chapter 10 Review
11 THE EIGENVALUE PROBLEM
11.1 Introduction
11.2 Solution Procedure and Applications
    11.2.1 Solution and applications
    11.2.2 Application to elementary singularities in the phase plane
11.3 Symmetric Matrices
    11.3.1 Eigenvalue problem $\mathbf{Ax} = \lambda\mathbf{x}$
    11.3.2 Nonhomogeneous problem $\mathbf{Ax} = \Lambda\mathbf{x} + \mathbf{c}$ (optional)
11.4 Diagonalization
11.5 Application to First-Order Systems with Constant Coefficients (optional)
11.6 Quadratic Forms (Optional)
Chapter 11 Review
12 EXTENSION TO COMPLEX CASE (OPTIONAL)
12.1 Introduction
12.2 Complex n-Space
12.3 Complex Matrices
Chapter 12 Review
Part III: Scalar and Vector Field Theory
13 DIFFERENTIAL CALCULUS OF FUNCTIONS OF SEVERAL VARIABLES
13.1 Introduction
13.2 Preliminaries
    13.2.1 Functions
    13.2.2 Point set theory definitions
13.3 Partial Derivatives
13.4 Composite Functions and Chain Differentiation
13.5 Taylor’s Formula and Mean Value Theorem
    13.5.1 Taylor’s formula and Taylor series for f(x)
    13.5.2 Extension to functions of more than one variable
13.6 Implicit Functions and Jacobians
    13.6.1 Implicit function theorem
    13.6.2 Extension to multivariable case
    13.6.3 Jacobians
    13.6.4 Applications to change of variables
13.7 Maxima and Minima
    13.7.1 Single variable case
    13.7.2 Multivariable case
    13.7.3 Constrained extrema and Lagrange multipliers
13.8 Leibniz Rule
Chapter 13 Review
14.1 Introduction
14.2 Dot and Cross Product
14.3 Cartesian Coordinates
14.4 Multiple Products
    14.4.1 Scalar triple product
    14.4.2 Vector triple product
14.5 Differentiation of a Vector Function of a Single Variable
14.6 Non-Cartesian Coordinates (Optional)
    14.6.1 Plane polar coordinates
    14.6.2 Cylindrical coordinates
    14.6.3 Spherical coordinates
    14.6.4 Omega method
Chapter 14 Review
15
15.1 Introduction
15.2 Curves and Line Integrals
    15.2.1 Curves
    15.2.2 Arclength
    15.2.3 Line integrals
15.3 Double and Triple Integrals
    15.3.1 Double integrals
    15.3.2 Triple integrals
15.4 Surfaces
    15.4.1 Parametric representation of surfaces
    15.4.2 Tangent plane and normal
15.5 Surface Integrals
    15.5.1 Area element dA
    15.5.2 Surface integrals
15.6 Volumes and Volume Integrals
    15.6.1 Volume element dV
    15.6.2 Volume integrals
Chapter 15 Review
16 SCALAR AND VECTOR FIELD THEORY
16.1 Introduction
16.2 Preliminaries
    16.2.1 Topological considerations
    16.2.2 Scalar and vector fields
16.3 Divergence
16.4 Gradient
16.5 Curl
16.6 Combinations; Laplacian
16.7
    16.7.1 Cylindrical coordinates
    16.7.2 Spherical coordinates
16.8 Divergence Theorem
    16.8.1 Divergence theorem
    16.8.2 Two-dimensional case
    16.8.3 Non-Cartesian coordinates (optional)
16.9 Stokes’s Theorem
    16.9.1 Line integrals
    16.9.2 Stokes’s theorem
    16.9.3 Green’s theorem
    16.9.4 Non-Cartesian coordinates (optional)
16.10 Irrotational Fields
    16.10.1 Irrotational fields
    16.10.2 Non-Cartesian coordinates
Chapter 16 Review
17.1 Introduction
17.2 Even, Odd, and Periodic Functions
17.3 Fourier Series of a Periodic Function
    17.3.1 Fourier series
    17.3.2 Euler’s formulas
    17.3.3 Applications
17.4 Half- and Quarter-Range Expansions
17.5 Manipulation of Fourier Series (Optional)
17.6 Vector Space Approach
17.7 The Sturm-Liouville Theory
    17.7.1 Sturm-Liouville problem
    17.7.2 Lagrange identity and proofs (optional)
17.8 Periodic and Singular Sturm-Liouville Problems
17.9 Fourier Integral
17.10 Fourier Transform
    17.10.2 Properties and applications
17.11 Fourier Cosine and Sine Transforms, and Passage
    17.11.1 Cosine and sine transforms
Chapter 17 Review
18 DIFFUSION EQUATION
18.1 Introduction
18.2 Preliminary Concepts
    18.2.1 Definitions
    18.2.3 Diffusion equation and modeling
18.3 Separation of Variables
    18.3.1 The method of separation of variables
    18.3.2 Verification of solution (optional)
    18.3.3 Use of Sturm-Liouville theory (optional)
18.4 Fourier and Laplace Transforms (Optional)
18.5 The Method of Images (Optional)
    18.5.1 Illustration of the method
    18.5.2 Mathematical basis for the method
18.6 Numerical Solution
    18.6.1 The finite-difference method
Chapter 18 Review
19 WAVE EQUATION
19.1 Introduction
19.2 Separation of Variables; Vibrating String
    19.2.1 Solution by separation of variables
    19.2.2 Traveling wave interpretation
    19.2.3 Using Sturm-Liouville theory (optional)
19.3 Separation of Variables; Vibrating Membrane
19.4 Vibrating String; d’Alembert’s Solution
    19.4.1 d’Alembert’s solution
    19.4.2 Use of images
    19.4.3 Solution by integral transforms (optional)
20.1
20.2
20.3
    20.3.1 Plane polar coordinates
    20.3.2 Cylindrical coordinates (optional)
    20.3.3 Spherical coordinates (optional)
20.4
20.5
    20.5.1 Rectangular domains
    20.5.2 Nonrectangular domains
    20.5.3 Iterative algorithms (optional)
21
21.1
21.2
21.3
    21.3.1 Preliminary ideas
    21.3.2 Exponential function
    21.3.3 Trigonometric and hyperbolic functions
    21.3.4 Application of complex numbers to integration and the solution of differential equations
21.4
    21.4.1 Polar form
    21.4.2 Integral powers of z and de Moivre’s formula
    21.4.3 Fractional powers
    21.4.4 The logarithm of z
    21.4.5 General powers of z
    21.4.6 Obtaining single-valued functions by branch cuts
    21.4.7 More about branch cuts (optional)
22
22.1
22.2
22.3
22.4 Additional Mappings and Applications
22.5 More General Boundary Conditions
22.6 Applications to Fluid Mechanics
Chapter 22 Review
23 THE COMPLEX INTEGRAL CALCULUS
23.1 Introduction
23.2 Complex Integration
    23.2.1 Definition and properties
    23.2.2 Bounds
23.3 Cauchy’s Theorem
23.4 Fundamental Theorem of the Complex Integral Calculus
23.5 Cauchy Integral Formula
Chapter 23 Review
24 TAYLOR SERIES, LAURENT SERIES, AND THE RESIDUE THEOREM
24.1 Introduction
24.2 Complex Series and Taylor Series
    24.2.1 Complex series
    24.2.2 Taylor series
24.3 Laurent Series
24.4 Classification of Singularities
24.5 Residue Theorem
    24.5.1 Residue theorem
    24.5.2 Calculating residues
    24.5.3 Applications of the residue theorem
Chapter 24 Review
REFERENCES
APPENDICES
A Review of Partial Fraction Expansions
B Existence and Uniqueness of Solutions of Systems of Linear Algebraic Equations
C Table of Laplace Transforms
D Table of Fourier Transforms
E Table of Fourier Cosine and Sine Transforms
F Table of Conformal Maps
ANSWERS TO SELECTED EXERCISES
INDEX
Preface
Purpose and Prerequisites
This book is written primarily for a single- or multi-semester course in applied mathematics
for students of engineering or science, but it is also designed for self-study and reference.
By self-study we do not necessarily mean outside the context of a formal course. Even
within a course setting, if the text can be read independently and understood, then more
pedagogical options become available to the instructor.
The prerequisite is a full year sequence in calculus, but the book is written so as to be
usable at both the undergraduate level and also for first-year graduate students of engineering and science. The flexibility that permits this broad range of use is described below in
the section on Course Use.
Changes from the First Edition
Principal changes from the first edition are as follows:
1. Part I on ordinary differential equations. In the first edition we assumed that the
reader had previously completed a first course in ordinary differential equations. However, differential equations is traditionally the first major topic in books on advanced
engineering mathematics so we begin this edition with a seven chapter sequence on ordinary differential equations. Just as the book becomes increasingly sophisticated from
beginning to end, these seven chapters are written that way as well, with the final chapter
on nonlinear equations being the most challenging.
2. Incorporation of a computer-algebra-system. Several powerful computer environments are available, such as Maple, Mathematica, and MATLAB. We selected Maple
as a representative and user-friendly software. In addition to an Instructor’s Manual, a
brief student supplement is also available, which presents parallel discussions of Mathematica and MATLAB.
3. Revision of existing material and format. Pedagogical improvements that evolved
through eight years of class use led to a complete rewriting rather than minor modifications of the text. The end-of-section exercises are refined and expanded.
Format
The book is comprised of five parts:
I Ordinary Differential Equations
II Linear Algebra
III Multivariable Calculus and Field Theory
IV Fourier Methods and Partial Differential Equations
V Complex Variable Theory
This breakdown is explicit only in the Contents, to suggest the major groupings of the chapters. Within the text there are no formal divisions between parts, only between chapters.
Each chapter begins with an introduction and (except for the first chapter) ends with a
chapter review. Likewise, each section ends with a review called a closure, which is often
followed by a section on computer software that discusses the Maple commands that are
relevant to the material covered in that section; see, for example, pages 29~3. Subsections
are used extensively to offer the instructor more options in terms of skipping or including
material.
Course Use at Different Levels
To illustrate how the text might serve at different levels, we begin by outlining how we
have been using it for courses at the University of Delaware: a sophomore/junior level
mathematics course for mechanical engineers, and a first-year graduate level two-semester
sequence in applied mathematics for students of mechanical, civil, and chemical engineering, and materials science. We denote these courses as U, G1, and G2, respectively.
Sophomore/junior level course (U). This course follows the calculus/differential equations
sequence taught in the mathematics department. We cover three main topics:
Linear Algebra: Chapter 8, Sections 9.1-9.5 (plus a one-lecture overview of Secs. 9.7-9.9),
10.1-10.6, and 11.1-11.3. The focus is n-space and applications, such as the mass-spring
system in Sec. 10.6.2, Markov population dynamics in Sec. 11.2, and orthogonal modes of
vibration in Sec. 11.3.
Field Theory: Chapters 14 and 16. The heart of this material is Chapter 16. Having skipped
Chapter 15, we distribute a one-page “handout” on the area element formula (18) in Sec.
15.5 since that formula is needed for the surface integrals that occur in Chapter 16. Emphasis
is placed on the physical applications in the sections on the divergence theorem and
irrotational fields since those applications lead to two of the three chief partial differential
equations that will be studied in the third part of the course: the diffusion equation and the
Laplace equation.
Fourier Series and PDE’s: Sections 17.1-17.4, 18.1, 18.3, 18.6.1, 19.1-19.2.2, 20.1-20.3.1,
20.5.1-20.5.2. Solutions are by separation of variables, using only the half- and
quarter-range Fourier series, and by finite differences.
First semester of graduate level course (G1). Text coverage is as follows: Sections 4.4-4.6,
5.1-5.6, Chapter 9, Secs. 11.1-11.4, 11.6, 13.5-13.8, 14.6, 15.4-15.6, Chapter 16,
Secs. 17.3, 17.6-17.11, 18.1-18.3.1, 18.3.3-18.4, 19.1-19.2, 20.1-20.4. As in “U” we do
cover the important Chapter 16, although quickly. Otherwise, the approach complements
that in “U.” For instance, in Chapter 9, “U” focuses on n-space, but “G1” focuses on generalized vector space (Sec. 9.6), to get ready for the Sturm-Liouville theory (Section 17.7);
in Chapter 11 we emphasize the more advanced sections on diagonalization and quadratic
forms, as well as Section 11.3.2 on the eigenvector expansion method in finite-dimensional
space, so we can use that method to solve nonhomogeneous partial differential equations
in later chapters. Likewise, in covering Chapter 17 we assume that the student has worked
with Fourier series before so we move quickly, emphasizing the vector space approach (Sec.
17.6), the Sturm-Liouville theory, and the Fourier integral and transform. When we come
to partial differential equations we use Sturm-Liouville eigenfunction expansions (rather
than the half- and quarter-range formulas that suffice in “U”), integral transforms, delta
functions, and Bessel and Legendre functions. In solving the diffusion equation in “U” we
work only with the homogeneous equation and constant end conditions, but in “G1” we
discuss the nonhomogeneous equation and nonconstant end conditions, uniqueness, and so
on; these topics are discussed in the exercises.
Second semester of graduate level course (G2). In the second semester we complete the
partial differential equation coverage with the methods of images and Green’s functions,
then turn to complex variable theory, the variational calculus, and an introduction to perturbation methods. For Green’s functions we use a “handout,” and for the variational calculus
and perturbation methods we copy the relevant chapters from M.D. Greenberg, Foundations of Applied Mathematics (Englewood Cliffs, NJ: Prentice Hall, 1978). (If you are
interested in using any of these materials please contact the College Mathematics Editor
office at Prentice-Hall, Inc., One Lake Street, Upper Saddle River, NJ 07458.)
Text coverage is as follows: Chapters 21-24 on complex variable theory; then we return to PDE’s, first covering Secs. 18.5-18.6, 19.3-19.4, and 20.3.2-20.4 that were skipped
in “G1”; “handouts” on Green’s functions, perturbation methods, and the variational calculus.
Shorter courses and optional Sections. A number of sections and subsections are listed
as Optional in the Contents, as a guide to instructors in using this text for shorter or more
introductory courses. In the chapters on field theory, for example, one could work only with
Cartesian coordinates, and avoid the more difficult non-Cartesian case, by omitting those
optional sections. We could have labeled the Sturm-Liouville theory section (17.7) as
optional but chose not to, because it is such an important topic. Nonetheless, if one wishes
to omit it, as we do in “U,” that is possible, since subsequent use of the Sturm-Liouville
theory in the PDE chapters is confined to optional sections and exercises.
Let us mention Chapter 4, in particular, since its development of series solutions, the
method of Frobenius, and Legendre and Bessel functions might seem more detailed than
you have time for in your course. One minimal route is to cover only Sections 4.2.2 on
power series solutions of ordinary differential equations (ODE’s) and 4.4.1 on Legendre
polynomials, since the latter does not depend on the more detailed Frobenius material in
Section 4.3. Then one can have Legendre functions available when the Laplace equation is
studied in spherical coordinates. You might also want to cover Bessel functions but do not
want to use class time to go through the Frobenius material. In my own course (“G1”) I deal
with Bessel functions by using a “handout” that is simpler and shorter, which complements
the more thorough treatment in the text.
Exercises
Exercises are of different kinds and arranged, typically, as follows. First, and usually near
the beginning of the exercise group, are exercises that follow up on the text or fill in gaps or
relate to proofs of theorems stated in that section, thus engaging the student more fully in
the reading (e.g., Exercises 1-3 in Section 7.2, Exercise 8 in Section 16.8). Second, there
are usually numerous “drill type” exercises that ask the reader to mimic steps or calculations
that are essentially like those demonstrated in the text (e.g., there are 19 matrices to invert
by hand in Exercise 1 of Section 10.6, and again by computer software in Exercise 3).
Third, there are exercises that require the use of a computer, usually employing software
that is explained at the end of the section or in an earlier section; these vary from drill type
(e.g., Exercise 1, Section 10.6) to more substantial calculations (e.g., Exercise 15, Section
19.2). Fourth, there are exercises that involve physical applications (e.g., Exercises 8, 9, and
12 of Section 16.10, on the stream function, the entropy of an ideal gas, and integrating the
equation of motion of fluid mechanics to obtain the Bernoulli equation). And, fifth, there
are exercises intended to extend the text and increase its value as a reference book. In these,
we usually guide the student through the steps so that the exercise becomes more usable
for subsequent reference or self-study (e.g., see Exercises 17-22 of Section 18.3). Answers
to selected exercises (which are denoted in the text by underlining the exercise number)
are provided at the end of the book; a more complete set is available for instructors in the
Instructor’s Manual.
Specific Pedagogical Decisions
In Chapter 2 we consider the linear first-order equation and then the case of separable first-order equations. It is tempting to reverse the order, as some authors do, but we prefer to
elevate the linear/nonlinear distinction, which grows increasingly important in engineering
mathematics; to do that, it seems best to begin with the linear equation.
It is stated, at the beginning of Chapter 3 on linear differential equations of second
order and higher, that the reader is expected to be familiar with the theory of the existence and uniqueness of solutions of linear algebraic equations, especially the role of the
determinant of the coefficient matrix, even though this topic occurs later in the text. The
instructor is advised to handle this need either by assigning, as prerequisite reading, the brief
summary of the needed information given in Appendix B or, if a tighter blending of the
differential equation and linear algebra material is desired, by covering Sections 8.1-10.6
before continuing with Chapter 3. Similarly, it is stated at the beginning of Chapter 3 that
an elementary knowledge of the complex plane and complex numbers is anticipated. If the
class does not meet that prerequisite, then Section 21.2 should be covered before Chapter
3. Alternatively, we could have made that material the first section of Chapter 3, but it
seemed better to keep the major topics together—in this case, to keep the complex variable
material together.
Some authors prefer to cover second-order equations in one chapter and then higher-order equations in another. My opinion about that choice is that: (i) it is difficult to grasp
clearly the second-order case (especially insofar as the case of repeated roots is concerned)
without seeing the extension to higher order, and (ii) the higher-order case can be covered
readily, so that it becomes more efficient to cover both cases simultaneously.
Finally, let us explain why Chapter 8, on systems of linear algebraic equations and
Gauss elimination, is so brief. Just as one discusses the real number axis before discussing
functions that map one real axis to another, it seems best to discuss vectors before discussing matrices, which map one vector space into another. But to discuss vectors, span,
linear dependence, bases, and expansions, one needs to know the essentials regarding the
existence and uniqueness of solutions of systems of linear algebraic equations. Thus, Chapter 8 is intended merely to suffice until, having introduced matrices in Chapter 10, we can
provide a more complete discussion.
Appendices
Appendix A reviews partial fraction expansions, needed in the application of Laplace and
Fourier transforms. Appendix B summarizes the theory of the existence and uniqueness
of solutions of linear algebraic equations, especially the role of the determinant of the
coefficient matrix, and is a minimum prerequisite for Chapter 3. Appendices C through F
are tables of transforms and conformal maps.
Instructor's Manual
An Instructor's Manual will be available to instructors from the office of the Mathematics Editor, College Department, Prentice-Hall, Inc., 1 Lake Street, Upper Saddle River, NJ 07458. Besides solutions to exercises, this manual contains additional pedagogical ideas for lecture material and some additional coverage, such as the Fast Fourier Transform, that can be used as “handouts.”
Acknowledgements
I am grateful for a great deal of support in writing this second edition, but especially to
Dr. E. Murat Sozer, who acted as a Technical Consultant. Dr. Sozer's work on the LaTeX
preparation of text and the figures went far beyond his original desire to learn the LaTeX
system, and his generous help was always extraordinary in quantity and quality. Sincere
thanks also to my mathematics editor, George Lobell, for his insight and support, to Nick
Romanelli in charge of production, to Anita Z. Hoover at the University of Delaware for extensive help with the LaTeX computer system, and to these outside reviewers of the developing manuscript: Gregory Adams (Bucknell University), James M. W. Brownjohn (Nanyang Technical University), Melvin G. Davidson (Western Washington University), John H. Ellison (Grove City College), Mark Farris (Midwestern State University), Amitabha Ghosh (Rochester Institute of Technology), Evans M. Harrell, II (Georgia Tech.), Allen Hesse (Rochester Institute of Technology), Chung-yau Lam (Nanyang Technical University), Mohan Manoharan (Nanyang Technical University), James G. McDaniel (Boston University), Carruth McGehee (Louisiana State University), William Moss (Clemson University), Jean-Paul Nicol (Auburn University), John A. Pfaltzgraff (University of North Carolina, Chapel Hill), Mohammad Tavakoli (Chaffey College), David E. Weidner (University of Delaware), and Jingyi Zhu (University of Utah). I also thank these graduate students in this department for their help with working and debugging exercises: Rishikesh Bhalerao, Nadeem Faiz, Santosh Prabhu, Rahul Rao, and Zhaohui Chen.
I'm grateful to my wife, Yisraela, for her deep support and love when this task looked
like more than I could handle, and for assuming many of my responsibilities, to give me
the needed time. I dedicate this book to her.
Most of all, I am grateful to the Lord for bringing this book back to life and watching
over all aspects of its writing and production: “From whence cometh my help? My help
cometh from the Lord, who made heaven and earth.” (Psalm 121)
Michael D. Greenberg
Chapter 1
Introduction to Differential Equations
1.1 Introduction
The mathematical formulation of problems in engineering and science usually leads
to equations involving derivatives of one or more unknown functions. Such equations are called differential equations.
Consider, for instance, the motion of a body of mass m along a straight line,
which we designate as an x axis. Let the mass be subjected to a force F(t) along
that axis, where t is the time. Then according to Newton's second law of motion
$$ m \frac{d^2 x}{dt^2} = F(t), \tag{1} $$
where x(t) is the mass's displacement measured from the origin. If we prescribe
the displacement x(t) and wish to determine the force F(t) required to produce that
displacement, then the solution is simple: according to (1), we merely differentiate
the given x(t) twice and multiply by m.
However, if we know the applied force F(t) and wish to determine the displacement x(t) that results, then (1) is a “differential equation” on x(t) since it
involves the derivative, more precisely the second derivative, of the unknown function x(t) with respect to t. To solve for x we need to “undo” the differentiations.
That is, we need to integrate (1), twice in fact.
For definiteness and simplicity, suppose that $F(t) = F_0$ is a constant. Then, integrating (1) once with respect to t gives
$$ m \frac{dx}{dt} = F_0 t + A, \tag{2} $$
where A is an arbitrary constant of integration, and integrating again gives
$$ m x = \frac{F_0}{2} t^2 + At + B, $$
or,
$$ x(t) = \frac{1}{m}\left(\frac{F_0}{2} t^2 + At + B\right). \tag{3} $$
The constants of integration, A and B, can be found from (2) and (3) if the displacement x and velocity dx/dt are prescribed at the initial time (t = 0). If both $x(0)$ and $\frac{dx}{dt}(0)$ are zero, for instance, then (by setting t = 0) we find from (2) that A = 0, and then from (3) that B = 0. Thus, (3) gives the solution as $x(t) = F_0 t^2/2m$, and
this solution holds for all t > 0.
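As a quick numerical check of the double integration above, the following minimal Python sketch integrates $m\,x'' = F_0$ twice with the trapezoidal rule, starting from $x(0) = 0$ and $x'(0) = 0$, and compares the result with $x(t) = F_0 t^2/2m$. The function name and the values of m, $F_0$, and the final time are illustrative assumptions, not taken from the text.

```python
# Sketch: integrate m*x'' = F0 twice, numerically, and compare with the
# closed-form result x(t) = F0*t**2/(2*m). All numerical values below are
# illustrative assumptions.

def integrate_constant_force(m, F0, t_end, n_steps):
    """Integrate m*x'' = F0 with x(0) = 0 and x'(0) = 0 (trapezoidal rule)."""
    dt = t_end / n_steps
    x, v = 0.0, 0.0           # displacement and velocity at t = 0
    a = F0 / m                # constant acceleration
    for _ in range(n_steps):
        v_new = v + a * dt                # first integration: velocity
        x += 0.5 * (v + v_new) * dt       # second integration: displacement
        v = v_new
    return x

m, F0, t_end = 2.0, 3.0, 4.0
x_numeric = integrate_constant_force(m, F0, t_end, 1000)
x_exact = F0 * t_end**2 / (2.0 * m)
print(x_numeric, x_exact)
```

Because the velocity is linear in t here, the trapezoidal rule reproduces the closed-form displacement up to rounding error; for a force that varies with time the two would agree only approximately.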
Unfortunately, most differential equations cannot be solved this easily, that is, by merely undoing the derivatives. For instance, suppose that the mass is restrained by a coil spring that supplies a restoring force proportional to the displacement x, with constant of proportionality k (Fig. 1). Then in place of (1), the differential equation governing the motion is
$$ m \frac{d^2 x}{dt^2} = -kx + F(t) $$
or,
$$ m \frac{d^2 x}{dt^2} + kx = F(t). \tag{4} $$
After one integration, (4) becomes
$$ m \frac{dx}{dt} + k \int x(t)\,dt = \int F(t)\,dt + A, \tag{5} $$
where A is the constant of integration. Since F(t) is a prescribed function, the integral of F(t) can be evaluated, but since x(t) is the unknown, the integral of x(t) cannot be evaluated, and we cannot proceed with our solution-by-integration.
Thus, we see that solving differential equations is not merely a matter of undoing the derivatives by direct integration. Indeed, the theory and technique involved is considerable, and will occupy us for these first seven chapters.
1.2 Definitions
In this section we introduce some of the basic terminology.
Differential equation. By a differential equation we mean an equation containing one or more derivatives of the function under consideration. Here are some examples of differential equations that we will study in later chapters:
$$ m \frac{d^2 x}{dt^2} + kx = F(t), \tag{1} $$
$$ L \frac{d^2 i}{dt^2} + \frac{1}{C}\, i = \frac{dE}{dt}, \tag{2} $$
$$ \frac{d^2 \theta}{dt^2} + \frac{g}{l} \sin\theta = 0, \tag{3} $$
$$ \frac{dx}{dt} = cx, \tag{4} $$
$$ \frac{d^2 y}{dx^2} = C \sqrt{1 + \left(\frac{dy}{dx}\right)^2}, \tag{5} $$
$$ EI \frac{d^4 y}{dx^4} = -w(x). \tag{6} $$
[Figure 1. Electrical circuit, equation (2).]
Equation (1) is the differential equation governing the linear displacement x(t) of a body of mass m, subjected to an applied force F(t) and a restraining spring of stiffness k, as mentioned in the preceding section.
Equation (2) governs the current i(t) in an electrical circuit containing an inductor with inductance L, a capacitor with capacitance C, and an applied voltage source of strength E(t) (Fig. 1), where t is the time.
Equation (3) governs the angular motion $\theta(t)$ of a pendulum of length l, under the action of gravity, where g is the acceleration of gravity and t is the time (Fig. 2).
Equation (4) governs the population x(t) of a single species, where t is the time and c is a net birth/death rate constant.
Equation (5) governs the shape of a flexible cable or string, hanging under the action of gravity, where y(x) is the deflection and C is a constant that depends upon the mass density of the cable and the tension at the midpoint x = 0 (Fig. 3).
Finally, equation (6) governs the deflection y(x) of a beam subjected to a loading w(x) (Fig. 4), where E and I are physical constants of the beam material and cross section, respectively.
Ordinary and partial differential equations. We classify a differential equation as an ordinary differential equation if it contains ordinary derivatives with
respect to a single independent variable, and as a partial differential equation if
it contains partial derivatives with respect to two or more independent variables.
Thus, equations (1)-(6) are ordinary differential equations (often abbreviated as
ODE's). The independent variable is t in (1)-(4) and x in (5) and (6).
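Some representative and important partial differential equations (PDE's) are as follows:
$$ \alpha^2 \frac{\partial^2 u}{\partial x^2} = \frac{\partial u}{\partial t}, \tag{7} $$
$$ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0, \tag{8} $$
$$ c^2 \left( \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} \right) = \frac{\partial^2 u}{\partial t^2}, \tag{9} $$
$$ \frac{\partial^4 u}{\partial x^4} + 2 \frac{\partial^4 u}{\partial x^2 \partial y^2} + \frac{\partial^4 u}{\partial y^4} = 0. \tag{10} $$
[Figure 3. Hanging cable, equation (5).]
[Figure 4. Loaded beam, equation (6).]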
Equation (7) is the heat equation, governing the time-varying temperature distribution u(x, t) in a one-dimensional rod or slab; x locates the point under consideration within the material, t is the time, and $\alpha^2$ is a material property called the diffusivity.
Equation (8) is the Laplace equation, governing the steady-state temperature
distribution u(x, y, z) within a three-dimensional body; x, y, z are the coordinates
of the point within the material.
Equation (9) is the wave equation, governing the deflection u(x, y, t) of a vibrating membrane such as a drum head.
Equation (10) is the biharmonic equation, governing the stream function u(x, y)
in the case of the slow (creeping) motion of a viscous fluid such as a film of wet
paint.
Besides the possibility of having more than one independent variable, there
could be more than one dependent variable.
For instance,
$$ \begin{aligned}
\frac{dx_1}{dt} &= -(k_{21} + k_{31})\,x_1 + k_{12}\,x_2 + k_{13}\,x_3 \\
\frac{dx_2}{dt} &= k_{21}\,x_1 - (k_{12} + k_{32})\,x_2 + k_{23}\,x_3 \\
\frac{dx_3}{dt} &= k_{31}\,x_1 + k_{32}\,x_2 - (k_{13} + k_{23})\,x_3
\end{aligned} \tag{11} $$
is a set, or system, of three ODE's governing the three unknowns $x_1(t)$, $x_2(t)$, $x_3(t)$; (11) arises in chemical kinetics, where $x_1$, $x_2$, $x_3$ are concentrations of three reacting chemical species, such as in a combustion chamber, where the $k_{ij}$'s are reaction rate constants, and where the reactions are, in chemical jargon, first-order reactions.
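Similarly,
$$ \frac{\partial E_1}{\partial x} + \frac{\partial E_2}{\partial y} = \frac{\sigma(x,y)}{\epsilon}, \qquad \frac{\partial E_1}{\partial y} - \frac{\partial E_2}{\partial x} = 0 \tag{12} $$
is a system of two PDE's governing the two unknowns $E_1(x,y)$ and $E_2(x,y)$, which are the x and y components of the electric field intensity, respectively, $\sigma(x,y)$ is the charge distribution density, and $\epsilon$ is the permittivity; these are the Maxwell's equations governing two-dimensional electrostatics.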
At this point, we limit our subsequent attention to ordinary differential equations. We will not return to partial differential equations until much later on in this
book. Thus, we will generally omit the adjective “ordinary,” for brevity, and will
speak only of “differential equations” over the next several chapters.
Order. We define the order of a differential equation as the order of the highest derivative therein. Thus, (4) is of first order, (1), (2), (3), and (5) are of second
order, (6) is of fourth order, and (11) is a system of first-order ODE’s.
More generally,
$$ F\left(x, u(x), u'(x), u''(x), \ldots, u^{(n)}(x)\right) = 0 \tag{13} $$
is said to be an nth-order differential equation on the unknown u(x), where n is the order of the highest derivative present in (13). Here, we use the standard prime notation for derivatives: $u'(x)$ denotes du/dx, $u''(x)$ denotes the second derivative, ..., and $u^{(n)}(x)$ denotes the nth derivative. In the fourth-order differential equation (6), for instance, in which the dependent variable is y rather than u, $F(x, y, y', y'', y''', y'''') = EI\,y'''' + w(x)$, which happens not to contain $y$, $y'$, $y''$, or $y'''$.
Solution. A function is said to be a solution of a differential equation, over a
particular domain of the independent variable, if its substitution into the equation
reduces that equation to an identity everywhere within that domain.
EXAMPLE 1. The function $y(x) = 4\sin x - x\cos x$ is a solution of the differential equation
$$ y'' + y = 2\sin x \tag{14} $$
on the entire x axis because its substitution into (14) yields
$$ (-4\sin x + 2\sin x + x\cos x) + (4\sin x - x\cos x) = 2\sin x, $$
which is an identity for all x. Note that we said “a” solution rather than “the” solution since there are many solutions of (14):
$$ y(x) = A\sin x + B\cos x - x\cos x \tag{15} $$
is a solution for any values of the constants A and B, as is verified by substitution of (15) into (14). [In a later chapter, we will be in a position to derive the solution (15), and to show that it is the most general possible solution, that is, that every solution of (14) can be expressed in the form (15).] ■
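The substitution in Example 1 can also be checked numerically. The sketch below is only an illustration under stated assumptions: it approximates $y''$ by a central difference (step h is an assumed value) and evaluates the residual $y'' + y - 2\sin x$ of (14) for the family (15) at a few arbitrarily chosen sample points and constants A, B. A small residual is consistent with (15) being a solution; it is not a proof.

```python
import math

def y(x, A, B):
    """The family (15): y = A sin x + B cos x - x cos x."""
    return A * math.sin(x) + B * math.cos(x) - x * math.cos(x)

def residual(x, A, B, h=1e-3):
    """Approximate y'' + y - 2 sin x, using a central difference for y''."""
    y_second = (y(x + h, A, B) - 2.0 * y(x, A, B) + y(x - h, A, B)) / h**2
    return y_second + y(x, A, B) - 2.0 * math.sin(x)

# Sample constants (A, B) and sample points x are arbitrary choices.
constants = [(4.0, 0.0), (1.0, -2.0), (0.0, 0.0)]
points = [-3.0, -1.0, 0.0, 0.5, 2.0, 5.0]
worst = max(abs(residual(x, A, B)) for A, B in constants for x in points)
print(worst)
```

The case A = 4, B = 0 is the particular function used at the start of Example 1; the residual stays at the level of the finite-difference error for every choice tried.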
EXAMPLE 2. The function $y(x) = 1/x$ is a solution of the differential equation
$$ y' + y^2 = 0 \tag{16} $$
over any interval that does not contain the origin since its substitution into (16) gives
$-1/x^2 + 1/x^2 = 0$, which relation is an identity, provided that x ≠ 0.
EXAMPLE 3. Whereas (14) admits an infinity of solutions [one for each choice of A and B in (15)], the equation
$$ |y'| + |y| + 3 = 0 \tag{17} $$
evidently admits none since the two nonnegative terms and one positive term cannot possibly sum to zero for any choice of y(x). ■
In applications, however, one normally expects that if a problem is formulated
carefully then it should indeed have a solution, and that the solution should be
unique, that is, there should be one and only one. Thus, the issues of existence
(Does the equation have any solution?) and uniqueness (If it does have a solution,
is that solution unique?) are of important interest.
Initial-value problems and boundary-value problems. Besides the differential
equation to be satisfied, the unknown function is often subjected to conditions at
one or more points on the interval under consideration. Conditions specified at
a single point (often the left end point of the interval), are called initial conditions, and the differential equation together with those initial conditions is called
an initial-value problem. Conditions specified at both ends are called boundary
conditions, and the differential equation together with the boundary conditions
is called a boundary-value problem. For initial-value problems the independent
variable is often the time, though not necessarily, and for boundary-value problems
the independent variable is often a space variable.
EXAMPLE 4. Straight-Line Motion of a Mass. Consider once again the problem of predicting the straight-line motion of a body of mass m subjected to a force F(t). According to Newton's second law of motion, the governing differential equation on the displacement x(t) is $m x'' = F(t)$. Besides the differential equation, suppose that we wish to impose the conditions $x(0) = 0$ and $x'(0) = V$; that is, the initial displacement and velocity are 0 and V, respectively. Then the complete problem statement is the initial-value problem
$$ m x''(t) = F(t) \quad (0 < t < \infty); \qquad x(0) = 0, \quad x'(0) = V. \tag{18} $$
That is, x(t) is to satisfy the differential equation $m x'' = F(t)$ on the interval $0 < t < \infty$ and the initial conditions $x(0) = 0$ and $x'(0) = V$. ■
EXAMPLE 5. Deflection of a Loaded Cantilever Beam. Consider the deflection y(x) of a cantilever beam of length L, under a loading w(x) newtons per meter (Fig. 5). Using the so-called Euler beam theory, one finds that the governing problem is as follows:
$$ EI\,y'''' = -w(x) \quad (0 < x < L); \qquad y(0) = 0, \quad y'(0) = 0, \quad y''(L) = 0, \quad y'''(L) = 0, \tag{19} $$
where E and I are known physical constants. The appended conditions are boundary conditions because some are specified at one end, and some at the other end, and (19) is therefore a boundary-value problem. The physical significance of the boundary conditions is as follows: $y(0) = 0$ is true simply by virtue of our chosen placement of the origin of the x, y coordinate system; $y'(0) = 0$ follows since the beam is cantilevered out of the wall, so that its slope at x = 0 is zero; $y''(L) = 0$ and $y'''(L) = 0$ because the “moment” and “shear force,” respectively, are zero at the end of the beam. ■
[Figure 5. Loaded cantilever beam.]
Linear and nonlinear differential equations. An nth-order differential equation is said to be linear if it is expressible in the form
$$ a_0(x)\,y^{(n)}(x) + a_1(x)\,y^{(n-1)}(x) + \cdots + a_n(x)\,y(x) = f(x), \tag{20} $$
where $a_0(x), \ldots, a_n(x)$ are functions of the independent variable x alone, and nonlinear otherwise. Thus, equations (1), (2), (4), and (6) are linear, and (3) and (5) are nonlinear. If f(x) = 0, we say that (20) is homogeneous; if not, it is nonhomogeneous. If $a_0(x)$ does not vanish on the x interval of interest, then we may divide (20) by $a_0(x)$ (to normalize the leading coefficient) and re-express it as
$$ y^{(n)}(x) + p_1(x)\,y^{(n-1)}(x) + \cdots + p_n(x)\,y(x) = q(x). \tag{21} $$
We will find that the theory of linear differential equations is quite comprehensive
with regard to all of our major concerns (the existence and uniqueness of solutions,
and how to find them), especially if the coefficients $a_0(x), \ldots, a_n(x)$ are constants.
Even in the nonconstant coefficient case the theory provides substantial guidance.
Nonlinear equations are, in general, far more difficult, and the available theory
is not nearly as comprehensive as for linear equations. Whereas for linear equations solutions can generally be found either in closed form or as infinite series, for
nonlinear equations one might focus instead upon obtaining qualitative information about the solution, rather than the solution itself, or upon pursuing numerical
solutions by computer simulation, or both.
The tendency in science and engineering, until around 1960, when high-speed
digital computers became widely available, was to try to get by almost exclusively
with linear theory. For instance, consider the nonlinear equation (3), namely,
$$ \theta'' + \frac{g}{l} \sin\theta = 0, \tag{22} $$
governing the motion of a pendulum, where $\theta(t)$ is the angular displacement from the vertical and t is the time. If one is willing to limit one's attention to small motions, that is, where $\theta$ is small compared to unity (i.e., 1 radian), then one can use the approximation
$$ \sin\theta = \theta - \frac{1}{3!}\theta^3 + \frac{1}{5!}\theta^5 - \cdots \approx \theta $$
to replace the nonlinear equation (22) by the approximate “linearized” equation
$$ \theta'' + \frac{g}{l}\,\theta = 0, \tag{23} $$
which (as we shall see in Chapter 3) is readily solved.
Unfortunately, the linearized version (23) is not only less and less accurate as
larger motions are considered, it may even be incorrect in a qualitative sense as
well. That is, from a phenomenological standpoint, replacing a nonlinear differential equation by an approximate linear one may amount to “throwing out the baby
with the bathwater.”
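To make the loss of accuracy concrete, here is a minimal Python sketch that integrates both the nonlinear pendulum equation (22) and its linearization (23) with a classical fourth-order Runge-Kutta scheme and compares the two angles at a fixed time. The choice g/l = 1, the release angles, the final time, and the step count are all illustrative assumptions; the numerical method is our choice and is not discussed in this section.

```python
import math

def pendulum_angle(theta0, g_over_l, t_end, n_steps, linear):
    """Integrate theta'' + (g/l) f(theta) = 0 from rest at theta0 using RK4.

    f(theta) = theta for the linearized equation (23), sin(theta) for (22).
    Returns theta(t_end).
    """
    def accel(theta):
        return -g_over_l * (theta if linear else math.sin(theta))

    dt = t_end / n_steps
    th, om = theta0, 0.0                      # angle and angular velocity
    for _ in range(n_steps):
        k1t, k1o = om, accel(th)
        k2t, k2o = om + 0.5 * dt * k1o, accel(th + 0.5 * dt * k1t)
        k3t, k3o = om + 0.5 * dt * k2o, accel(th + 0.5 * dt * k2t)
        k4t, k4o = om + dt * k3o, accel(th + dt * k3t)
        th += dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6.0
        om += dt * (k1o + 2 * k2o + 2 * k3o + k4o) / 6.0
    return th

def relative_gap(theta0, t_end=10.0, n_steps=2000):
    """|nonlinear - linear| at t_end, scaled by the release angle."""
    nonlinear = pendulum_angle(theta0, 1.0, t_end, n_steps, linear=False)
    linearized = pendulum_angle(theta0, 1.0, t_end, n_steps, linear=True)
    return abs(nonlinear - linearized) / theta0

small_gap = relative_gap(0.05)   # small motion: the two models nearly agree
large_gap = relative_gap(2.5)    # large motion: the linearized model fails
print(small_gap, large_gap)
```

For the small release angle the two models stay close over the simulated interval, while for the large one they disagree by an amount comparable to the amplitude itself, mainly because the true period grows with amplitude and the linearized period does not.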
Thus, it is extremely important for us to keep the distinction between linear
and nonlinear clearly in mind as we proceed with our study of differential equations. Aside from Sections 2.4 and 2.5, most of our study of nonlinear equations
takes place in Chapters 6 and 7.
Closure. Notice that we have begun, in this section, to classify differential equations, that is, to categorize them by types. Thus far we have distinguished ODE’s
(ordinary differential equations) from PDE’s (partial differential equations), established the order of a differential equation, distinguished initial-value problems from
boundary-value problems, linear equations from nonlinear ones, and homogeneous
equations from nonhomogeneous ones.
Why do we classify so extensively? Because the most general differential
equation is far too difficult for us to deal with. The most reasonable program, then,
is to break the set of all possible differential equations into various categories and to
try to develop theory and solution strategies that are tailored to the specific nature
of a given category. Historically, however, the early work on differential equations,
by mathematicians such as Jakob (James) Bernoulli (1654-1705) and his brother Johann (John) (1667-1748), Joseph-Louis Lagrange
(1736-1813), Alexis-Claude Clairaut (1713-1765), and Jean le Rond d'Alembert
(1717-1783), generally involved attempts at solving specific equations rather than
the development of a general theory.
From an applications point of view, we shall find that in many cases diverse
physical phenomena are governed by the same differential equation. For example,
consider equations (1) and (2) and observe that they are actually the same equation,
to within a change in the names of the various quantities: m → L, k → 1/C,
F(t) → dE(t)/dt, and x(t) → i(t). Thus, to within these correspondences, their
solutions are identical. We speak of the mechanical system and the electrical circuit
as analogs of each other. This idea is deeper and more general than can be seen
from this one example, and the result is that if one knows a lot about mechanical
systems, for example, then one thereby knows a lot about electrical, biological, and
social systems, for example, to whatever extent they are governed by differential
equations of the same form.
Or, returning to PDE's for the moment, consider equation (7), which we introduced as the one-dimensional heat equation. Actually, (7) governs any one-dimensional diffusion process, be it the diffusion of heat by conduction, or the
diffusion of material such as a pollutant in a river. Thus, when one is studying heat
conduction one is also learning about all diffusion processes because the governing differential equation is the same. The significance of this fact can hardly be
overstated as a justification for a careful study of the mathematical field of differential equations, or as a cause for marvel at the underlying design of the physical
universe.
EXERCISES 1.2

1. Determine the order of each differential equation, and whether or not the given functions are solutions of that equation.
(a) $(y')^2 = 4y$; $y_1(x) = x^2$, $y_2(x) = 2x^2$, $y_3(x) = e^{-x}$
(b) $2yy' = 9\sin 2x$; $y_1(x) = \sin x$, $y_2(x) = 3\sin x$, $y_3(x) = e^{x}$
(c) $y'' - 9y = 0$; $y_1(x) = e^{3x} - e^{x}$, $y_2(x) = 3\sinh 3x$, $y_3(x) = 2e^{3x} - e^{-3x}$
(d) $(y')^2 - 4xy' + 4y = 0$; $y_1(x) = x^2 - x$, $y_2(x) = 2x - 1$
(e) $y'' + 9y = 0$; $y_1(x) = 4\sin 3x + 3\cos 3x$, $y_2(x) = 6\sin(3x + 2)$
(f) $y'' - y' - 2y = 6$; $y_1(x) = 5e^{2x} - 3$, $y_2(x) = -2e^{-x} - 3$
(g) $y''' - 6y'' + 12y' - 8y = 32 - 16x$; $y_1(x) = 2x - 1 + (A + Bx + Cx^2)e^{2x}$ for any constants A, B, C
(h) $y' + 2xy = 1$; $y_1(x) = A e^{-x^2} \int_0^x e^{t^2}\,dt$, $y_2(x) = e^{-x^2} \int_a^x e^{t^2}\,dt$ for any constants A and a.

2. Verify that $u(x,t) = Ax + B + (C\sin\kappa x + D\cos\kappa x)\exp(-\kappa^2\alpha^2 t)$ is a solution of (7) for any constants A, B, C, D, $\kappa$. NOTE: We will sometimes use the notation exp( ) in place of $e^{(\ )}$ because it takes up less vertical space.

3. Verify that $u(x, y, z) = A\sin ax\,\sin by\,\sinh cz$ is a solution of (8) for any constants A, a, b, c, provided that $a^2 + b^2 = c^2$.

4. (a) Verify that $u(x,t) = (Ax + B)(Ct + D) + (E\sin\kappa x + F\cos\kappa x)(G\sin\kappa ct + H\cos\kappa ct)$ is a solution of the one-dimensional wave equation
$$ c^2 \frac{\partial^2 u}{\partial x^2} = \frac{\partial^2 u}{\partial t^2}, $$
for any constants A, B, ..., H, $\kappa$.
(b) Verify that $u(x,t) = f(x - ct) + g(x + ct)$ is a solution of that equation for any twice-differentiable functions f and g.
(c) For what value(s) of the constant m is $u(x,t) = \sin(x + mt)$ a solution of that equation?

5. For what value(s) of the constant $\lambda$ will $y = \exp(\lambda x)$ be a solution of the given differential equation? If there are no such $\lambda$'s, state that.
(a) $y' + 3y = 0$
(b) $y' + 3y^2 = 0$
(c) $y'' - 3y' + 2y = 0$
(d) $y'' - 2y' + y = 0$
(e) $y'' - y' = 0$
(f) $y''' - 2y'' - y' + 2y = 0$
(g) $y'''' - 6y'' + 5y = 0$
(h) $y'' + 5yy' + y = 0$

6. First, verify that the given function is a solution of the given differential equation, for any constants A, B. Then, solve for A, B so that y satisfies the given initial or boundary conditions.
(a) $y'' + 4y = 8x^2$; $y(x) = 2x^2 - 1 + A\sin 2x + B\cos 2x$; $y(0) = 1$, $y'(0) = 0$
(b) $y'' - y = x^2$; $y(x) = -x^2 - 2 + A\sinh x + B\cosh x$; $y(0) = -2$, $y'(0) = 0$
(c) $y'' + 2y' + y = 0$; $y(x) = (A + Bx)e^{-x}$; $y(0) = 1$, $y(2) = 0$
(d) $y'' - y' = 0$; $y(x) = A + Be^{x}$; $y'(0) = 1$, $y(3) = 0$

7. Classify each equation as linear or nonlinear:
(a) $y' + e^{x}y = 4$
(b) $yy' = x + y$
(c) $e^{x}y' = x - 2y$
(d) $y' - \exp y = \sin x$
(e) $y'' + (\sin x)y = x^2$
(f) $y'' - y = \exp x$
(g) $yy''' + 4y = 3x$
(h) $y''' = y$

8. Recall that the nonlinear equation (5) governs the deflection y(x) of the flexible cable shown in Fig. 3. Supposing that the sag is small compared to the span, suggest a linearized version of (5) that can be expected to give good accuracy in predicting the shape y(x).

1.3 Introduction to Modeling
as Heat Transfer, Fluid Mechanics, and Circuit Theory. However, we wish to emphasize the close relationship between the mathematics and the underlying physics,
and to motivate the mathematics more fully. Thus, besides the purely mathematical examples in the text, we will include physical applications and some of the
underlying modeling as well.
Our intention in this section is only to illustrate the nature of the modeling
process, and we will do so through two examples. We suggest that you pay special
attention to Example 1 because we will come back to it at numerous points later on
in the text.
EXAMPLE 1. Mechanical Oscillator. Consider a block of mass m lying on a table and restrained laterally by an ordinary coil spring (Fig. 1), and denote by x the displacement of the mass (measured as positive to the right) from its “equilibrium position,” that is, when x = 0 the spring is neither stretched nor compressed. We imagine the mass to be disturbed from its equilibrium position by an initial disturbance and/or an applied force F(t), where t is the time, and we seek the differential equation governing the resulting displacement history x(t).
[Figure 1. Mechanical oscillator.]
Our first step is to identify the relevant physics which, in this case, is Newton's second law of motion. Since the motion is constrained to be along a straight line, we need consider only the forces in the x direction, and these are shown in Fig. 2: $F_s$ is the force exerted by the spring on the mass (the spring force, for brevity), $F_a$ is the aerodynamic drag, $F_f$ is the force exerted on the bottom of the mass due to its sliding friction, and F is the applied force, the driving force. How do we know if $F_s$, $F_f$, and $F_a$ act to the left or to the right? The idea is to make assumptions on the signs of the displacement x(t) and the velocity x'(t) at the instant under consideration. For definiteness, suppose that x > 0 and x' > 0. Then it follows that each of the forces $F_s$, $F_f$, and $F_a$ is directed to the left, as shown in Fig. 2. (The equation of motion that we obtain will be insensitive to those assumptions, as we shall see.) Newton's second law then gives
(mass)(x acceleration) = sum of x forces
or,
$$ m x'' = F - F_s - F_f - F_a, \tag{1} $$
and we now need to express each of the forces $F_s$, $F_f$, and $F_a$ in terms of the dependent and independent variables x and t.
[Figure 2. The forces, if x > 0 and x' > 0.]
[Figure 3. Spring force and displacement.]
Consider $F_s$ first. If one knows enough about the geometry of the spring and the material of which it is made, one can derive an expression for $F_s$ as a function of the extension x, as might be discussed in a course in Advanced Strength of Materials. In practice, however, one can proceed empirically, and more simply, by actually applying various positive (i.e., to the right, in the positive x direction) and negative forces (to the left) to the spring and measuring the resulting displacement x (Fig. 3). For a typical coil spring, the resulting graph will be somewhat as sketched in Fig. 4, where its steepening at A and B is due to the coils becoming completely compressed and completely extended, respectively. Thus, $F_s$ in (1) is the function the graph of which is shown as the curve AB. (Ignore the dashed line L for the moment.)
Next, consider the friction force $F_f$. The modeling of $F_f$ will depend upon the nature of the contact between the mass and the table, in particular, upon whether it is dry or lubricated. Let us suppose it is lubricated, so that the mass rides on a thin film of lubricant such as oil. To model $F_f$, then, we must consider the fluid mechanics of the lubricating film. The essential idea is that the stress $\tau$ (force per unit area) on the bottom of the mass is proportional to the gradient du/dy of the fluid velocity u (Fig. 5), where the constant of proportionality is the coefficient of viscosity $\mu$: $\tau = \mu\,du/dy$. But u(y) is found, in a course in Fluid Mechanics, to be a linear function, namely,
$$ u(y) = \frac{u(h) - u(0)}{h}\,y = \frac{x'(t) - 0}{h}\,y = \frac{x'(t)}{h}\,y, $$
so
$$ \tau = \mu \frac{du}{dy} = \frac{\mu}{h}\,x'(t). $$
Thus,
$$ F_f = (\text{stress } \tau)(\text{area } A \text{ of bottom of block}) = \left(\frac{\mu\,x'(t)}{h}\right)(A). $$
That is, it is of the form
$$ F_f = c\,x'(t), \tag{2} $$
for some constant c that we may consider as known. Thus, the upshot is that the friction force is proportional to the velocity. We will call c the damping coefficient because, as we will see in Chapter 3, the effect of the cx' term in the governing differential equation is to cause the motion to “damp out.”
[Figure 4. Force-displacement graph.]
Likewise, one can model the aerodynamic drag force $F_a$, but let us neglect $F_a$ on the tentative assumption that it can be shown to be small compared to the other two forces. Then (1) becomes
$$ m x''(t) + c\,x'(t) + F_s(x) = F(t). \tag{3} $$
Equation (3) is nonlinear because $F_s(x)$ is not a linear function of x, as seen from its graph AB in Fig. 4. As a final simplifying approximation, let us suppose that the x motion is small enough, say between a and b in Fig. 4, so that we can linearize $F_s$, and hence the governing differential equation, by approximating the graph of $F_s$ by its tangent line L. Since L is a straight line through the origin, it follows that we can express
$$ F_s(x) \approx kx. \tag{4} $$
We call k the spring stiffness.
Thus, the final form of our governing differential equation, or equation of motion, is the linearized approximation
$$ m x'' + c\,x' + kx = F(t), \tag{5} $$
on $0 < t < \infty$, where the constants m, c, k and the applied force F(t) are known. Equation (5) is important, and we will return to it repeatedly.
To this equation we wish to append suitable initial or boundary conditions. This particular problem is most naturally of initial-value type since we envision initiating the motion in some manner at the initial time, say t = 0, and then considering the motion that results. Thus, to (5) we append initial conditions
$$ x(0) = x_0 \quad \text{and} \quad x'(0) = x'_0, \tag{6} $$
for some (positive, negative, or zero) specified constants $x_0$ and $x'_0$. It should be plausible intuitively that we do need to specify both the initial displacement x(0) and the initial velocity x'(0) if we are to ensure a unique resulting motion. In any case, the theoretical appropriateness of the conditions (6) is covered in Chapter 3.
[Figure 5. Lubricating film.]
The differential equation (5) and initial conditions (6) comprise our resulting mathematical model of the physical system. By no means is there an exact correspondence between the model and the system since approximations were made in modeling the forces $F_s$ and $F_f$, and in neglecting $F_a$ entirely. Indeed, even our use of Newtonian mechanics, rather than relativistic mechanics, was an approximation.
[Figure 6. Other assumptions on the signs of x and x': (a) x > 0, x' < 0; (b) x < 0, x' > 0.]
This completes the modeling phase. The next step would be to solve the differential equation (5) subject to the initial conditions (6), for the motion x(t).
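Although analytical solution methods come later in the text, the model consisting of (5) and (6) can already be explored numerically. The following is a minimal sketch, assuming a classical fourth-order Runge-Kutta scheme (our choice, not a method from this section); the function name `simulate_oscillator` and all parameter values are hypothetical and chosen only for illustration.

```python
import math

def simulate_oscillator(m, c, k, force, x0, v0, t_end, n_steps):
    """Integrate m*x'' + c*x' + k*x = force(t), x(0) = x0, x'(0) = v0, by RK4.

    Returns the displacement and velocity at t = t_end.
    """
    def accel(t, x, v):
        return (force(t) - c * v - k * x) / m

    dt = t_end / n_steps
    t, x, v = 0.0, x0, v0
    for _ in range(n_steps):
        k1x, k1v = v, accel(t, x, v)
        k2x = v + 0.5 * dt * k1v
        k2v = accel(t + 0.5 * dt, x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x = v + 0.5 * dt * k2v
        k3v = accel(t + 0.5 * dt, x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x = v + dt * k3v
        k4v = accel(t + dt, x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        t += dt
    return x, v

# Free, undamped motion (c = 0, F = 0) with m = k = 1: exact solution x = cos t.
x_free, _ = simulate_oscillator(1.0, 0.0, 1.0, lambda t: 0.0, 1.0, 0.0, math.pi, 2000)

# Damped motion under a constant applied force F0 = 2: x settles toward F0/k.
x_forced, _ = simulate_oscillator(1.0, 1.0, 4.0, lambda t: 2.0, 0.0, 0.0, 40.0, 8000)
print(x_free, x_forced)
```

The two runs illustrate the roles of the inputs: the first is driven only by the initial displacement, the second only by the applied force, and in the second the damping term cx' makes the transient die out so that the displacement approaches the static value F0/k.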
COMMENT 1. Let us examine our claim that the resulting differential equation is insensitive to the assumptions made as to the signs of x and x'. In place of our assumption that x > 0 and x' > 0 at the instant under consideration, suppose we assume that x > 0 and x' < 0. Since x > 0, it follows that $F_s$ acts to the left, and since x' < 0, it follows that $F_f$ acts to the right. Then (Fig. 6a)
$$ m x'' = F - F_s + F_f, \tag{7} $$
where we continue to neglect $F_a$. The sign of the $F_f$ term is different in (7), compared with (1), because $F_f$ now acts to the right, but notice that $F_f$ now needs to be written as $F_f = c\,(-x'(t))$, rather than $c\,x'(t)$, since x' is negative. Further, $F_s$ is still kx, so (7) becomes
$$ m x'' = F(t) - kx + (-c\,x'), \tag{8} $$
which is indeed equivalent to (5), as claimed.
Next, what if x < 0 and x' > 0? This time (Fig. 6b)
$$ m x'' = F + F_s - F_f, \tag{9} $$
which differs from (1) in the sign of the $F_s$ term. But now $F_s$ needs to be written as $F_s = k\,(-x(t))$ since x is negative. Further, $F_f$ is cx', so (9) becomes
$$ m x'' = F + k(-x) - c\,x', $$
which, again, is equivalent to (5). The remaining case, where x < 0 and x' < 0, is left for the exercises.
COMMENT 2. The approximation (4) was introduced from consideration of the graph shown in Fig. 4, but it amounts to expanding $F_s(x)$ in a Taylor series about the equilibrium point x = 0, as
$$ F_s(x) = F_s(0) + F_s'(0)\,x + \frac{1}{2!}F_s''(0)\,x^2 + \cdots $$
and linearizing, that is, cutting off after the first-order term:
$$ F_s(x) \approx F_s(0) + F_s'(0)\,x = 0 + kx = kx. $$
This idea, the simplification of a differential equation by such tangent-line approximation,
is of great importance in modeling.
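The tangent-line idea in COMMENT 2 can be illustrated numerically. In the sketch below the spring law `spring_force` is a purely hypothetical stiffening spring (a linear term plus a cubic term with made-up coefficients); it is not the curve of Fig. 4. The stiffness k is estimated as the slope of the force at x = 0 by a central difference, and the relative error of the linear law kx is then evaluated at a few displacements.

```python
def spring_force(x):
    """Hypothetical nonlinear spring law, assumed only for illustration."""
    return 50.0 * x + 4000.0 * x**3

def stiffness_at_equilibrium(force, h=1e-5):
    """Estimate k = F_s'(0) by a central difference about x = 0."""
    return (force(h) - force(-h)) / (2.0 * h)

k_est = stiffness_at_equilibrium(spring_force)

# Relative error of the tangent-line law k*x at increasing displacements.
displacements = [0.01, 0.05, 0.2]
errors = [abs(k_est * x - spring_force(x)) / abs(spring_force(x))
          for x in displacements]
print(k_est, errors)
```

For this assumed spring the linear law is accurate to better than one percent at the smallest displacement and becomes badly wrong at the largest, which is the sense in which the linearization is trustworthy only for sufficiently small motions.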
COMMENT 3. The final equation for $F_s$, $F_s = kx$, is well known as Hooke's law, after
Robert Hooke (1635-1703). Hooke published his law of elastic behavior in 1676 as the
anagram ceiiinosssttuv and, two years later, the solution ut tensio sic vis: roughly, “as
the force, so is the displacement.” In view of the complexity with which we can now
deal, Hooke’s law must look quite modest, but one must appreciate it within its historical
context. In spirit, it followed Galileo Galilei (1564-1642) who, in breaking lines established
by the ancient Greeks, sought to establish a quantitative science, expressed in formulas and
mathematical terms. For example, where Aristotle explained the increasing speed of a
falling body in terms of the body moving more and more jubilantly as it approached its
natural place (the center of the earth, which was believed to coincide with the center of the
universe), Galileo sidestepped the question of cause entirely, and instead put forward the
formula $v = 9.8t$, where $v$ is the speed (in meters per second) and $t$ is the time (in seconds).
It may be argued that such departure from the less productive Greek tradition marked the
beginning of modern science.
COMMENT 4. In science and engineering it is useful to think in terms of inputs and
outputs. Here, there are three inputs, the two initial conditions and the applied force $F(t)$,
and the output is the resulting solution, or response, $x(t)$.
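The input and output view can be sketched numerically. The following Python sketch assumes that equation (5) has the form $mx'' + cx' + kx = F(t)$, as (8) indicates; the function name `simulate`, the parameter names `x0` and `v0` for the two initial conditions, and the choice of the classical fourth-order Runge-Kutta scheme are all assumptions made for illustration.

```python
import math


def simulate(m, c, k, force, x0, v0, t_end, n_steps=10000):
    """Integrate m x'' + c x' + k x = F(t) with the classical RK4 scheme.

    Inputs: the two initial conditions (x0, v0) and the forcing function.
    Output: the response x(t_end).
    """
    def deriv(t, x, v):
        # Equivalent first-order system: x' = v, v' = (F - c v - k x) / m
        return v, (force(t) - c * v - k * x) / m

    h = t_end / n_steps
    t, x, v = 0.0, x0, v0
    for _ in range(n_steps):
        k1x, k1v = deriv(t, x, v)
        k2x, k2v = deriv(t + h / 2, x + h / 2 * k1x, v + h / 2 * k1v)
        k3x, k3v = deriv(t + h / 2, x + h / 2 * k2x, v + h / 2 * k2v)
        k4x, k4v = deriv(t + h, x + h * k3x, v + h * k3v)
        x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += h
    return x


# Undamped, unforced check: with m = k = 1 the response is x(t) = cos t.
x_end = simulate(m=1.0, c=0.0, k=1.0, force=lambda t: 0.0,
                 x0=1.0, v0=0.0, t_end=math.pi)
print(x_end, math.cos(math.pi))
```

Changing any of the three inputs (`x0`, `v0`, or `force`) changes the computed response, while `m`, `c`, and `k` describe the system itself.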
The foregoing introductory example illustrates several general truths about
modeling. First, we see that it is not necessarily an easy task and generally requires a sound understanding of the underlying physics. Even in this little example
one senses that obtaining suitable expressions for $F_f$ and $F_a$ (if one does include
$F_a$) requires skillful handling of the fluid mechanics of the lubrication film and the
aerodynamics of the moving block.
Second, we see that approximations will no doubt be necessary, and the idea
is to make them judiciously. In this example we made several approximations. The
expression $u(y) = x'(t)y/h$, for instance, is probably quite accurate but is not
exact, especially near the ends of the block. Further, one can imagine that as the
motion continues, the lubricant will heat up so that the viscosity $\mu$ will decrease.
This effect is probably negligible, but we mention it in order to suggest that there
is virtually no end to the levels of complexity that one may address, or suppress,
in the modeling process. The key is to seek a level of complexity that will provide sufficient accuracy for the purpose at hand, and to seek a uniform level of
approximation. For instance, it would hardly make sense to model $F_f$ with great
sophistication and accuracy if $F_s$ is of comparable magnitude and is modeled only
crudely.
To stay on this point a bit longer, note several more approximations that were
implicit in our discussion. First, we implicitly assumed that the block is rigid,
whereas the applied forces will cause some slight distortion of its shape and dimensions; again, neglect of this effect is surely an excellent approximation when
considering the motion x(t). Second, and more serious, notice that our empirical
determination of $F_s(x)$ was based on a static test whereas, like the block, the spring
is itself in motion. Thus, there is an inertial effect for the spring, analogous to the
$mx''$ term for the mass, that we have neglected. If the mass of the spring is not negligible compared to that of the block, then that approximation may be insufficiently
accurate.
Finally, notice carefully that we neglect a particular effect, in modeling, not on
the grounds that it is small in an absolute sense, but because it is small relative to
other effects. For instance, an aerodynamic force $F_a = 0.001$ newton may seem
small numerically, but would not be negligible if $F$, $F_s$, and $F_f$ were of comparable
size.
Let us consider one more example.
Figure 8. Typical cable element.
EXAMPLE 2. Suspension Bridge Cable. To design a suspension bridge cable, one
needs to know the relationships among the deflected shape of the cable, the tension in it,
and the weight distribution that it needs to support.
In the case of a typical suspension bridge, the roadbed supported by the cables is
much heavier than the cables themselves, so let us neglect the weight of the cables, and
assume that the loading is due entirely to the roadbed. Consider the configuration shown
schematically in Fig. 7.
A cable is a distributed system, rather than one or more point masses, and for such
systems a useful modeling approach is to isolate a typical element of the system and apply
to it the relevant physical laws. In the present case a typical element is an incremental
portion of the cable lying between $x$ and $x + \Delta x$, for any $x$ between $-L/2$ and $L/2$, as
shown in Fig. 8: $\Delta s$ is the arc length, $T$ the tension in the cable, $\theta$ the inclination with
respect to the horizontal, and $\Delta W$ the load supported by the element. If the roadbed is
uniform and weighs $2w$ newtons per meter, then each of the two cables supports $w$ newtons
per meter, so $\Delta W = w\,\Delta x$.
Besides neglecting the weight of the cable itself, as mentioned above, there are three
additional approximations that are implicit in the foregoing. First, in assuming a uniform
load w per unit length, we have really assumed that the vertical support spacing d is very
small compared to the span L, so that the intermittent support forces can be distributed as
a uniform load. Second, in assuming that the tension is in the direction of the cable we
have really assumed that the cable is flexible, a term that we now need to explain. The
general state of affairs at any section, such as at the $x + \Delta x$ end of the element, is as shown
in Fig. 9, namely, there can be a shear force $V$, a tensile force $T$ through the centerline,
and a moment or “couple” $M$. ($V$ is the net effect of shearing stresses distributed over the
face, and $T$ and $M$ are the net effect of normal stresses distributed over the face.) By a
flexible string or cable, we mean one that is unable to support any shear $V$ or moment $M$;
that is, $V = M = 0$.
Figure 9. Forces and moments at an end.
For instance, if one takes a typical household string between the
thumb and forefinger of each hand one finds that it offers virtually no resistance to shearing
or bending, but considerable resistance to stretching. Thus, when we include only tensile
forces in Fig. 8 we are assuming that the cable is flexible. Of course, if we imagine taking
the suspension cables on the Golden Gate Bridge “between our fingers” we can imagine
quite a bit of resistance to shearing and bending! But the point to keep in mind is that even
those heavy cables would offer little resistance to shearing and bending by the enormous
loads to which they are actually subjected by the weight of the roadbed.
Finally, we assume that the cable is inextensible, even under the large tensions that are
anticipated.
If we can accept these assumptions we can now apply the relevant physics which,
again, is Newton’s second law of motion. But this time there is no acceleration, so the
element is in static equilibrium. Thus, the $x$ and $y$ forces must each sum to zero:
$$x:\quad T(x+\Delta x)\cos\theta(x+\Delta x) - T(x)\cos\theta(x) = 0, \tag{10a}$$
$$y:\quad T(x+\Delta x)\sin\theta(x+\Delta x) - T(x)\sin\theta(x) - w\,\Delta x = 0. \tag{10b}$$
Dividing each of these equations by $\Delta x$ and letting $\Delta x \to 0$, we obtain
$$\frac{d}{dx}(T\cos\theta) = 0, \tag{11a}$$
$$\frac{d}{dx}(T\sin\theta) = w, \tag{11b}$$
or, upon integration,
$$T\cos\theta = A, \tag{12a}$$
$$T\sin\theta = wx + B, \tag{12b}$$
where $A$, $B$ are arbitrary constants. Dividing (12b) by (12a), to eliminate the unknown
tension $T(x)$, and noting that $\tan\theta = dy/dx$, we obtain the differential equation
$$\frac{dy}{dx} = \frac{w}{A}x + \frac{B}{A} \tag{13}$$
governing the cable shape $y(x)$, which we are trying to determine.
In this case, the solution is simple enough so that we might as well complete the
solution. To solve, we merely integrate (13), obtaining
$$y(x) = \frac{w}{2A}x^2 + \frac{B}{A}x + C,$$
where $A$, $B$, $C$ are arbitrary constants. To evaluate them, we invoke the associated boundary conditions:
$$y(0) = 0 \quad \text{(by choice of location of origin)}, \tag{14a}$$
$$y'(0) = 0 \quad \text{(by symmetry about } x = 0\text{)}, \tag{14b}$$
$$y\left(\frac{L}{2}\right) = H \quad \text{(from Fig. 7)}. \tag{14c}$$
Equation (14a) gives $C = 0$, and (14b) gives $B/A = 0$, and hence $B = 0$. Thus far,
$$y(x) = \frac{w}{2A}x^2,$$
and (14c) then gives $A = wL^2/8H$. Thus, the cable’s shape is given by the parabola
$$y(x) = \frac{4H}{L^2}x^2. \tag{15}$$
Finally, the distribution of tension $T(x)$ may be found by squaring and adding (12a)
and (12b):
$$T(x) = \sqrt{w^2x^2 + A^2} = w\sqrt{x^2 + \frac{L^4}{64H^2}}. \tag{16}$$
In a sense, obtaining $y(x)$ and $T(x)$ marks the end of the analysis, and if we are content
that the expressions (15) and (16) are sufficiently accurate, then the next step would be to
use them to help in the actual bridge design. Before doing that, however, we should check
those results, for there may be errors in our analysis. Also, the approximations that we have
made may prove to be insufficiently accurate.
One of the standard ways of checking results is by means of special cases and limiting
cases, for which the correct results are known or obvious. Considering (15), we observe
first that the parabolic shape looks reasonable. Furthermore, (15) implies that $y(x) \to 0$,
over $-L/2 \le x \le L/2$, as $H \to 0$ with $L$ fixed, and also that $y(x) \to 0$ at each $x$, as
$L \to \infty$ with $H$ fixed. These results look reasonable too. Turning to (16), observe that the
tension becomes infinite throughout the cable as $H \to 0$, as expected. (Try straightening
out a loaded washline by pulling on one end!) Finally, consider the limiting case $H \to \infty$,
with $L$ fixed. In that case, (16) gives $T(L/2) \to wL/2$, which agrees with the result
obtained from a simple consideration of the physics (Exercise 2).
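The special-case and limiting-case checks described above can also be carried out numerically. The sketch below codes (15) and (16) directly; the function names and the numerical values of the span `L` and load `w` are illustrative assumptions only.

```python
import math


def cable_shape(x, L, H):
    """Parabolic cable shape, equation (15): y = 4 H x^2 / L^2."""
    return 4.0 * H * x**2 / L**2


def cable_tension(x, w, L, H):
    """Tension, equation (16): T = w sqrt(x^2 + L^4 / (64 H^2))."""
    return w * math.sqrt(x**2 + L**4 / (64.0 * H**2))


# Illustrative values (assumptions): span in meters, load in newtons per meter.
L, w = 1000.0, 5.0e4
for H in (10.0, 100.0, 1.0e6):
    end_height = cable_shape(L / 2, L, H)          # should equal H, by (14c)
    end_tension = cable_tension(L / 2, w, L, H)    # tends to w L / 2 for large H
    print(H, end_height, end_tension, w * L / 2)
```

As the sag `H` is reduced, the computed tension grows without bound, and as `H` becomes very large the end tension approaches $wL/2$, in line with the limiting cases discussed in the text.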
Closure. The purpose of this section is to illustrate the modeling process, whereby
one begins with the physical problem at hand and ends up with an equivalent mathematical statement, or model. Actually, we should say approximately equivalent
since the modeling process normally involves a number of assumptions and approximations. By no means do we claim to show how to model in general, but only
to illustrate the modeling process and the intimate connection between the physics
and the mathematics. As we proceed through this text we will attempt to keep that
connection in view, even though our emphasis will be on the mathematics.
Finally, let us note that when we speak of the physical problem and the physics
we intend those words to cover a much broader range of possibilities. For instance,
the problem might be in the realm of economics, such as predicting the behavior of
the Dow Jones Stock Index as a function of time. In that case the relevant “physical
laws” would be economic laws such as the law of supply and demand. Or, the
problem might fall in the realm of sociology, ecology, biology, chemistry, and so
on. In any case, the general modeling approach is essentially the same, independent
of the field of application.
EXERCISES 1.3

1. In Example 1 we showed that the same differential equation, (5), results, independent of whether $x > 0$ and $x' > 0$, or $x > 0$ and $x' < 0$, or $x < 0$ and $x' > 0$. Consider the last remaining case, $x < 0$ and $x' < 0$, and show that, once again, one obtains equation (5).

2. At the end of Example 2, we stated that the result $T(L/2) \to wL/2$, obtained from (16), for the limiting case where $H \to \infty$ with $L$ fixed, agrees with the result obtained from a simple consideration of the physics. Explain that statement.

3. (Catenary) In our Suspension Bridge Cable example we neglected the weight of the cable itself relative to the weight of the roadbed. At the other extreme, suppose that the weight of the roadbed (or other loading) is negligible compared to the weight of the cable. Indeed, consider a uniform flexible cable, or catenary, hanging under the action of its own weight only, as sketched in the figure. Then Fig. 8 still holds, but with $\Delta W = \mu\,\Delta s$, where $\mu$ is the weight per unit arc length of the cable.
(Figure: the cable in the $x, y$ plane, with supports at $x = -L/2$ and $x = L/2$ and sag $H$.)
(a) Proceeding somewhat as in (10)-(12), derive the governing differential equation
$$y'' = C\sqrt{1 + y'^2}, \tag{3.1}$$
where $C$ is an unknown constant.
(b) Since $y(x)$ is symmetric about $x = 0$, it suffices to consider the interval $0 \le x \le L/2$. Then we have the boundary conditions $y(0) = 0$, $y'(0) = 0$, and $y(L/2) = H$. Verify (you need not derive it) that
$$y(x) = \frac{1}{C}(\cosh Cx - 1) \tag{3.2}$$
satisfies (3.1) and the boundary conditions $y(0) = 0$ and $y'(0) = 0$. But it remains to determine $C$. Invoking the remaining boundary condition, $y(L/2) = H$, show that $C$ satisfies the equation
$$H = \frac{1}{C}\left(\cosh\frac{CL}{2} - 1\right). \tag{3.3}$$
Unfortunately, (3.3) is a transcendental equation for $C$, so that we cannot solve it explicitly. We can solve it numerically, for given values of $H$ and $L$, but you need not do that.
(c) As a partial check on these results, notice that they should reduce to the parabolic cable solution in the limiting case where the sag-to-span ratio $H/L$ tends to zero, for then the load per unit $x$ length, due to the weight of the cable, approaches a constant, as it is in Example 2, where the load is due entirely to the uniform roadbed. The problem that we pose for you is to carry out that check. HINT: Think of $L$ as fixed and $H$ tending to zero. For $H$ to approach zero, in (3.3), we need $CL/2$ to approach zero, that is, $C \to 0$. Thus, we can expand the $\cosh Cx - 1$ in (3.2) in a Maclaurin series in $C$ and retain the leading term. Show that that step gives $y(x) \sim Cx^2/2$, and the boundary condition $y(L/2) = H$ enables us to determine $C$. The result should be identical to (15).
(d) Actually, for small sag-to-span ratio we should be able to neglect the $y'^2$ term in (3.1), relative to unity, so that (3.1) can be linearized as
$$y'' = C. \tag{3.4}$$
Integrating (3.4) and using the boundary conditions $y(0) = 0$, $y'(0) = 0$, and $y(L/2) = H$, show that one obtains (15) once again.
Chapter 2
Differential Equations of First Order

2.1 Introduction
In studying algebraic equations, one considers the very simple first-degree polynomial equation $ax = b$ first, then the second-degree polynomial equation (quadratic
equation), and so on. Likewise, in the theory of differential equations it is reasonable and traditional to begin with first-order equations, and that is the subject of
this chapter. In Chapter 3 we turn to equations of second order and higher.
Recall that the general first-order equation is given by
$$F(x, y, y') = 0, \tag{1}$$
where x and y are independent and dependent variables, respectively. In spite of
our analogy with algebraic equations, first-order differential equations can fall anywhere in the spectrum of complexity, from extremely simple to hopelessly difficult.
Thus, we identify several different subclasses of (1), each of which is susceptible
to a particular solution method, and develop them in turn. Specifically, we consider
these subclasses: the linear equation $a_0(x)y' + a_1(x)y = f(x)$ in Section 2.2, “separable” equations in Section 2.4, equations that are “exact” (or can be made exact)
in Section 2.5, and various other more specialized cases within the exercises.
These subclasses are not mutually exclusive. For instance, a given equation
could be both linear and separable, in which case we could solve it by one method
or the other. Given such a choice, choose whichever you prefer. In other cases the
given equation might not fit into any of these subclasses and might be hopelessly
difficult from an analytical point of view. Thus, it will be important to complement our analytical methods with numerical solution techniques. But that is a long
way off, in Chapter 6. Analytical methods and the general theory of differential
equations will occupy us in Chapters 2 through 5.
It should be stressed that the equation types that are susceptible to the analytical solution techniques described in these chapters can also be solved analytically
by computer algebra software that is currently available, such as Maple, Mathematica, and MATLAB, and this approach is discussed here as well. One needs to
know both: the underlying theory and solution methodology on one hand, and the
available computer software on the other.
2.2 The Linear Equation
The first case that we consider is the general first-order linear differential equation
$$a_0(x)y' + a_1(x)y = f(x). \tag{1}$$
Dividing through by $a_0(x)$ [which is permissible if $a_0(x) \ne 0$ over the $x$ interval
of interest], we can re-express (1) in the more concise form
$$y' + p(x)y = q(x). \tag{2}$$
We assume that $p(x)$ and $q(x)$ are continuous over the $x$ interval of interest.
2.2.1. Homogeneous case. It is tempting to think that to solve (2) for $y(x)$ we
need to get rid of the derivative, and that we can accomplish that merely by integration. It’s true that if we integrate (2) with respect to $x$,
$$\int y'\,dx + \int py\,dx = \int q\,dx,$$
then the first term reduces nicely to $y$ (plus a constant of integration), but the catch
is that the $\int py\,dx$ term becomes a stumbling block because we cannot evaluate it
since $y(x)$ is unknown! Essentially, then, we have merely converted the differential
equation (2) to an “integral equation” — that is, one involving the integral of the
unknown function. Thus, we are no better off.
To solve (2), we begin with the simpler special case where $q(x)$ is zero,
$$y' + p(x)y = 0, \tag{3}$$
which is called the homogeneous version of (2). To solve (3), divide by $y$ (assuming that $y$ is nonzero on the $x$ interval of interest; this assumption is tentative since
$y$ is not yet known) and integrate on $x$. Using the fact that $y'\,dx = dy$, from the
calculus, we thus obtain
$$\int \frac{dy}{y} + \int p(x)\,dx = 0, \tag{4}$$
and recalling that
$$\int \frac{dx}{x} = \ln|x| + \text{constant},$$
(4) gives
$$\ln|y| = -\int p(x)\,dx + C, \tag{5}$$
where the arbitrary constant $C$ can include the arbitrary constant of integration from
the $p$ integral as well. Thus,
$$|y(x)| = e^{-\int p(x)\,dx + C} = e^{C}e^{-\int p(x)\,dx} = Be^{-\int p(x)\,dx}, \tag{6}$$
where we have set $e^C = B$ for simplicity. Since $C$ is real, $e^C$ is positive, so
$B$ is an arbitrary positive constant: $B > 0$. The integral $\int p(x)\,dx$ does indeed
exist since we have assumed that $p(x)$ is continuous. Finally, it follows from (6)
that $y(x) = \pm B\exp\left(-\int p\,dx\right)$ or,
$$y(x) = Ae^{-\int p(x)\,dx}, \tag{7}$$
if we now allow the arbitrary constant $A$ to be positive, zero, or negative.
Observe that our tentative assumption, just below (3), that $y(x) \ne 0$, is now
seen to be justified because the exponential function in (7) is strictly positive. [Of
course $y(x) = 0$ if $A = 0$, but in that simple case $y(x) = 0$ is seen to satisfy (3)
without further ado.] Summarizing, the solution of the homogeneous equation (3)
is given by (7), where $A$ is an arbitrary constant.
The presence of the arbitrary constant A, in (7), permits the solution (7) to
satisfy an initial condition, if one is specified. Thus, suppose that we seek a solution
$y(x)$ of our differential equation $y' + p(x)y = 0$ that satisfies an initial condition
$y(a) = b$, for specified values of $a$ and $b$. For this purpose, it is convenient to
re-express (7) as
$$y(x) = Ae^{-\int_a^x p(\xi)\,d\xi}, \tag{8}$$
which is equivalent to (7) since $\int p(x)\,dx$ and $\int_a^x p(\xi)\,d\xi$ differ at most by an additive constant, say $D$, and the resulting $e^{D}$ factor in (8) can be absorbed into the
arbitrary constant $A$.
A point of notation: why do we change the integration variable from $x$ to $\xi$
in (8)? Because $\int_a^x p(\xi)\,d\xi$ means that we integrate along some axis, say a $\xi$ axis,
from $a$ to $x$. Thus, $x$ is a fixed endpoint, and is different from the integration
variable that runs from $a$ to $x$. To write $\int_a^x p(x)\,dx$ runs the risk of confusing
the $x$’s inside the integral with the fixed endpoint $x$. The name of the integration
variable is immaterial, so one calls it a “dummy variable.” For instance, $\int_0^x \xi\,d\xi$,
$\int_0^x \eta\,d\eta$, and $\int_0^x \rho\,d\rho$ are all the same, namely, $x^2/2$. One often sees the letter $\xi$
used as a dummy $x$ variable because it is the Greek version of $x$. In Roman letters
it is written as xi and is pronounced as ksē. Occasionally, we may be guilty of
bad notation and write an integral as $\int_a^x f(x)\,dx$, simply to minimize the letters
used, but even then we need to remember the distinction between the $x$ in the upper
limit and the $x$’s inside the integral. In fact, this notation is typical in books on
engineering and science, where there is less focus on such points of rigor.
Now imposing the condition $y(a) = b$ on (8) gives $y(a) = b = Ae^{0} = A$, so
$A = b$ and hence
$$y(x) = be^{-\int_a^x p(\xi)\,d\xi}. \tag{9}$$
Thus, (9) satisfies the initial condition y(a) = b. To verify that it also satisfies
the differential equation (3), we use the fundamental theorem of the calculus: If
$f(x)$ is continuous on an interval $I$, namely $x_1 \le x \le x_2$, and
$$F(x) = \int_a^x f(\xi)\,d\xi \tag{10a}$$
on $I$, then
$$F'(x) = f(x) \tag{10b}$$
on $I$. Using this theorem, and chain differentiation [let the $-\int_a^x p(\xi)\,d\xi$ in the
exponent be $u$, say, and write $de^u/dx = (de^u/du)(du/dx)$], it is shown that (9)
does indeed satisfy the differential equation $y' + p(x)y = 0$ on an interval $I$ if $p(x)$
is continuous on $I$.
EXAMPLE 1. Consider the differential equation
$$y' + 2xy = 0 \tag{11}$$
on $-\infty < x < \infty$, over which interval $p(x) = 2x$ is continuous, as we have assumed.
Then (7) gives
$$y(x) = Ae^{-\int 2x\,dx} = Ae^{-x^2} \tag{12}$$
on $-\infty < x < \infty$. The graphs of the solutions, (12), are called the solution curves or integral curves corresponding to the given differential equation (11), and these are displayed
for several values of $A$ in Fig. 1. Those above and below the $x$ axis correspond to $A > 0$
($A = 1, 2, 3$) and $A < 0$ ($A = -1, -2$), respectively, and $A = 0$ gives the solution curve
$y(x) = 0$.
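As a quick numerical sanity check on Example 1, the following sketch substitutes the family $y = Ae^{-x^2}$ into the left-hand side of (11), approximating $y'$ by a central difference. The function names, the step size `h`, and the sample grid are assumptions chosen for illustration.

```python
import math


def y(x, A):
    """Solution family (12): y = A exp(-x^2)."""
    return A * math.exp(-x**2)


def residual(x, A, h=1e-5):
    """Left-hand side of (11), y' + 2 x y, with y' from a central difference."""
    dydx = (y(x + h, A) - y(x - h, A)) / (2.0 * h)
    return dydx + 2.0 * x * y(x, A)


# Largest residual over the values of A used in Fig. 1 and a grid on [-3, 3].
worst = max(abs(residual(x / 10.0, A))
            for A in (-2, -1, 0, 1, 2, 3)
            for x in range(-30, 31))
print(f"largest residual on the grid: {worst:.2e}")
```

The residual is not exactly zero only because the derivative is approximated; it shrinks as `h` is reduced, until roundoff error takes over.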
In Example 1 we used the term “solution curve.” A solution curve, or integral
curve, corresponding to a differential equation $y'(x) = f(x, y)$, is simply the graph
of a solution to that equation.
Besides several solution curves, Fig. 1 also contains a field of lineal elements
through a discrete set of points, or grid. By a lineal element at a point $(x_0, y_0)$,
corresponding to a differential equation $y'(x) = f(x, y)$, we mean a short straight-line segment through the point $(x_0, y_0)$, centered at that point and having the slope
$f(x_0, y_0)$. That is, each lineal element has the same slope as the solution curve
passing through that point and is therefore a small part of the tangent line to that
solution curve. The set of all of the lineal elements is called the direction field
corresponding to the given differential equation.
In intuitive language, the direction field shows the “flow” of solution curves.
Given a sufficiently dense (computer) plot of the direction field, one can visualize
the various solution curves, or sketch the solution curve through a given initial
point.
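The construction of a direction field can be sketched in a few lines of Python. The function name `lineal_elements`, the segment length, and the grid below are assumptions made for illustration; the sketch only computes the endpoints of each lineal element and leaves the plotting to whatever graphics tool is at hand.

```python
import math


def lineal_elements(f, xs, ys, length=0.2):
    """Return the segments ((x1, y1), (x2, y2)) of the direction field of y' = f(x, y).

    Each segment is centered at a grid point (x0, y0) and has slope f(x0, y0).
    """
    segments = []
    for x0 in xs:
        for y0 in ys:
            slope = f(x0, y0)
            # Half-length components of a segment with the required slope.
            dx = 0.5 * length / math.sqrt(1.0 + slope**2)
            dy = slope * dx
            segments.append(((x0 - dx, y0 - dy), (x0 + dx, y0 + dy)))
    return segments


# Equation (11) rewritten in the form y' = f(x, y) = -2 x y.
grid = [i * 0.5 for i in range(-4, 5)]
field = lineal_elements(lambda x, y: -2.0 * x * y, grid, grid)
print(len(field), field[0])
```

Normalizing by $\sqrt{1 + \text{slope}^2}$ gives every segment the same length, so steep and shallow elements are equally visible when drawn.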
Figure 1. The solution curves and direction field for $y' + 2xy = 0$.
EXAMPLE 2. Consider the problem
$$(x + 2)y' - xy = 0, \tag{13a}$$
$$y(0) = 3. \tag{13b}$$
Since an initial value is prescribed, let us use (9) rather than (7), with $p(x) = -x/(x + 2)$:
$$y(x) = 3e^{\int_0^x \xi\,d\xi/(\xi+2)} = 3e^{\left[\xi + 2 - 2\ln|\xi+2|\right]_0^x} = 3e^{\left[x + 2 - 2\ln|x+2| - (2 - 2\ln 2)\right]} = 3e^{x}e^{-2\ln|x+2|}e^{2\ln 2} = \frac{12e^x}{|x+2|^2} = \frac{12e^x}{x^2 + 4x + 4}, \tag{14}$$
where we have used the identity $e^{\ln f} = f$. Of course, we could have used (7) instead, and
then imposed the initial condition on that result:
$$y(x) = Ae^{\int x\,dx/(x+2)} = Ae^{x + 2 - 2\ln|x+2|} = A\frac{e^{x+2}}{(x+2)^2},$$
where $y(0) = 3 = Ae^2/4$ gives $A = 12e^{-2}$ and, once again, we obtain the solution
$$y(x) = \frac{12e^x}{(x+2)^2}. \tag{15}$$
The graph of that solution is given in Fig. 2, in which we show the direction field of
the equation (13a), as well.
Figure 2. Solution to $(x + 2)y' - xy = 0$; $y(0) = 3$, together with the direction field.
On what $x$ interval is the solution (15) valid? Recall that the solution (9) was guaranteed over the broadest interval, containing the initial point $x = a$, over which $p(x)$ is
continuous. In this case $a = 0$ and $p(x) = -x/(x + 2)$, which is undefined (infinite) at
$x = -2$, so (15) is valid at least on $-2 < x < \infty$. In fact, the interval of validity cannot be
extended any further to the left in this example, because whereas we need both $y$ and $y'$ to
have uniquely defined finite values satisfying $(x + 2)y' - xy = 0$, both $y$ and $y'$ given by
(15) are undefined at $x = -2$. Thus, the interval of validity of (15) is $-2 < x < \infty$.
With the homogeneous problem (3) solved, we now turn to the more difficult
case, the nonhomogeneous equation (2). We will show how to solve (2) by two
different methods: first, by using an integrating factor and, second, by the method
of variation of parameters.
2.2.2. Integrating factor method. To solve (2) by the integrating factor method,
we begin by multiplying both sides by a yet to be determined function $\sigma(x)$, so that
(2) becomes
$$\sigma y' + \sigma py = \sigma q. \tag{16}$$
[For (2) and (16) to be equivalent, we require $\sigma(x)$ to be nonzero on the $x$ interval
of interest, for if $\sigma(x) = 0$ at one or more points, then it does not follow from (16)
that $y' + py$ equals $q$ at those points.] The idea is to seek $\sigma(x)$ so that the left-hand
side of (16) is a derivative:
$$\sigma y' + \sigma py = \frac{d}{dx}(\sigma y), \tag{17}$$
because then (16) becomes
$$\frac{d}{dx}(\sigma y) = \sigma q, \tag{18}$$
which can be solved by integration. For instance, to solve the equation $y' + \frac{1}{x}y = 4$,
observe that if we multiply through by $x$, then we have $xy' + y = 4x$, or $(xy)' = 4x$,
which can be integrated to give $xy = 2x^2 + C$ and hence the solution $y = 2x + C/x$.
In this case $\sigma(x)$ was simply $x$. Such a function $\sigma(x)$ is called an integrating
factor, and its use is called the integrating factor method. The idea was invented by
Leonhard Euler (1707-1783), one of the greatest and most prolific mathematicians
of all time. He contributed to virtually every branch of mathematics and to the
application of mathematics to the science of mechanics.
One might say that the integrating factor method is similar to the familiar
method of solving a quadratic equation by completing the square. In completing the square, we add a suitable quantity to both sides so that the left-hand side
becomes a “perfect square,” and the equation can then be solved by taking square
roots. In the integrating factor method we multiply both sides by a suitable quantity
so that the left-hand side becomes a “perfect derivative,” and the equation can then
be solved by integration.
How do we find $\sigma(x)$, if indeed such a function exists? Writing out the right-hand side of (17) gives
$$\sigma y' + \sigma py = \sigma y' + \sigma' y,$$
which is satisfied identically if we choose $\sigma(x)$ so that
$$\sigma'(x) = p(x)\sigma(x). \tag{19}$$
But (19) is of the same form as (3), with $y$ changed to $\sigma$ and $p$ to $-p$, so its solution
follows from (7) as $\sigma(x) = Ae^{+\int p(x)\,dx}$. From (16), we see that any constant scale
factor in $\sigma$ is inconsequential, so we can take $A = 1$ without loss. Thus, the desired
integrating factor is
$$\sigma(x) = e^{\int p(x)\,dx}. \tag{20}$$
Putting this $\sigma(x)$ into (18) gives
$$\frac{d}{dx}\left(e^{\int p(x)\,dx}\,y\right) = e^{\int p(x)\,dx}\,q(x),$$
so
$$e^{\int p(x)\,dx}\,y = \int e^{\int p(x)\,dx}\,q(x)\,dx + C,$$
or
$$y(x) = e^{-\int p(x)\,dx}\left(\int e^{\int p(x)\,dx}\,q(x)\,dx + C\right), \tag{21}$$
where $C$ is an arbitrary constant of integration.
Not only does (21) satisfy (2), but it can be seen from the steps leading to (21)
that every solution to (2) must be of the form (21). Thus, we call (21) the general
solution of (2).
EXAMPLE 3. Solve
$$y' + 3y = x. \tag{22}$$
With $p(x) = 3$ and $q(x) = x$, we have
$$e^{\int p(x)\,dx} = e^{\int 3\,dx} = e^{3x},$$
so (21) gives
$$y(x) = e^{-3x}\left(\int e^{3x}x\,dx + C\right) = \frac{x}{3} - \frac{1}{9} + Ce^{-3x}, \tag{23}$$
as the general solution to (22).
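The "perfect derivative" idea behind the integrating factor can be checked numerically for Example 3. In the sketch below, the product $\sigma y$ is differentiated by a central difference and compared with $\sigma q$, as in (18); the function names, the step size, and the sample values of `C` are assumptions for illustration.

```python
import math


def sigma(x):
    """Integrating factor (20) for p(x) = 3: sigma = e^{3x}."""
    return math.exp(3.0 * x)


def y(x, C):
    """General solution (23): y = x/3 - 1/9 + C e^{-3x}."""
    return x / 3.0 - 1.0 / 9.0 + C * math.exp(-3.0 * x)


def perfect_derivative_gap(x, C, h=1e-6):
    """Difference between d/dx(sigma y) and sigma q, cf. (18), with q(x) = x."""
    product = lambda t: sigma(t) * y(t, C)
    lhs = (product(x + h) - product(x - h)) / (2.0 * h)
    return lhs - sigma(x) * x


for C in (-1.0, 0.0, 2.5):
    gap = max(abs(perfect_derivative_gap(x / 4.0, C)) for x in range(-4, 5))
    print(C, gap)
```

The gap is small for every value of `C` tried, reflecting the fact that the constant of integration drops out when $\sigma y$ is differentiated.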
If we wish to solve for $C$, in (21), so as to satisfy an initial condition $y(a) = b$,
then it is convenient to return to (20) and use the definite integral $\int_a^x p(\xi)\,d\xi$ in place
of the indefinite integral $\int p(x)\,dx$. These integrals differ by at most an arbitrary
additive constant, say $B$, and the scale factor $e^B$ that results, in the right-hand side
of (20), can be discarded without loss. Thus, let us change both $p$ integrals in (21)
to definite integrals, with $a$ as lower limit. Likewise, the $q$ integral in (21) can
be changed to a definite integral with $a$ as lower limit since this step changes the
integral by an arbitrary additive constant at most, which constant can be absorbed
by the arbitrary constant $C$. Thus, equivalent to (21), we have
$$y(x) = e^{-\int_a^x p(\xi)\,d\xi}\left(\int_a^x e^{\int_a^\xi p(\zeta)\,d\zeta}\,q(\xi)\,d\xi + C\right),$$
where $\zeta$ is zeta. If we impose on this result the initial condition $y(a) = b$, we obtain
$y(a) = b = e^{-0}(0 + C)$, where each $\xi$ integral is zero because its lower and upper
integration limits are the same. Thus, $C = b$, and
$$y(x) = e^{-\int_a^x p(\xi)\,d\xi}\left(\int_a^x e^{\int_a^\xi p(\zeta)\,d\zeta}\,q(\xi)\,d\xi + b\right). \tag{24}$$
As a partial check, notice that (24) does reduce to (9) in the event that $q(x) = 0$.
Whereas (21) was the general solution to (2), we call (24) a particular solution since it corresponds to one particular solution curve, the solution curve through
the point (a, b).
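Formula (24) lends itself to direct numerical evaluation when the integrals cannot be done by hand. The following is a minimal sketch, assuming the trapezoidal rule for both the inner and the outer integral; the function name `solve_linear_ivp`, the number of subintervals `n`, and the test problem are choices made here for illustration and are not part of the text.

```python
import math


def solve_linear_ivp(p, q, a, b, x, n=2000):
    """Evaluate formula (24) for y' + p(x) y = q(x), y(a) = b, by the trapezoidal rule.

    P accumulates the inner integral of p from a to xi; I accumulates the
    outer integral of exp(P(xi)) * q(xi) from a to x.
    """
    h = (x - a) / n
    P = 0.0          # integral of p from a to the current node
    I = 0.0          # integral of exp(P) * q from a to the current node
    xi = a
    g_prev = math.exp(P) * q(xi)
    for _ in range(n):
        x_next = xi + h
        P += 0.5 * h * (p(xi) + p(x_next))
        g_next = math.exp(P) * q(x_next)
        I += 0.5 * h * (g_prev + g_next)
        xi, g_prev = x_next, g_next
    return math.exp(-P) * (I + b)


# Check against y' + 3y = x with y(0) = 1, whose exact solution follows
# from (23) with C = 10/9.
approx = solve_linear_ivp(lambda x: 3.0, lambda x: x, a=0.0, b=1.0, x=1.0)
exact = 1.0 / 3.0 - 1.0 / 9.0 + (10.0 / 9.0) * math.exp(-3.0)
print(approx, exact)
```

Setting `q` to zero in this sketch reproduces (9), mirroring the partial check mentioned in the text.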
EXAMPLE 4. Solve
$$y' - 2xy = \sin x, \tag{25a}$$
$$y(0) = 3. \tag{25b}$$
This time an initial condition is prescribed, so it is more convenient to use (24) than (21).
With $p(x) = -2x$, $q(x) = \sin x$, $a = 0$, and $b = 3$, we have
$$e^{\int_a^x p(\xi)\,d\xi} = e^{\int_0^x (-2\xi)\,d\xi} = e^{-x^2},$$
so that (24) gives the desired solution as
$$y(x) = e^{x^2}\left(\int_0^x e^{-\xi^2}\sin\xi\,d\xi + 3\right). \tag{26}$$
The integral in (26) is said to be nonelementary in that it cannot be evaluated in
closed form in terms of the so-called elementary functions: powers of x, trigonometric
and inverse trigonometric functions, exponentials, and logarithms. Thus, we will leave it
as it is. It can be evaluated in terms of nonelementary functions, or integrated numerically,
but such discussion is beyond the scope of this example. #
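Although the integral in (26) is nonelementary, the solution is easy to evaluate numerically. The sketch below uses the composite Simpson rule for the integral and a central difference to confirm that the result satisfies (25a); the function names, the number of subintervals, and the step size are illustrative assumptions.

```python
import math


def integral_term(x, n=2000):
    """Composite Simpson approximation of the integral in (26), from 0 to x.

    n must be even.
    """
    if x == 0.0:
        return 0.0
    h = x / n
    g = lambda s: math.exp(-s**2) * math.sin(s)
    total = g(0.0) + g(x)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return total * h / 3.0


def y(x):
    """Solution (26): y = e^{x^2} (integral + 3)."""
    return math.exp(x**2) * (integral_term(x) + 3.0)


def residual(x, h=1e-4):
    """y' - 2 x y - sin x, cf. (25a), with y' from a central difference."""
    dydx = (y(x + h) - y(x - h)) / (2.0 * h)
    return dydx - 2.0 * x * y(x) - math.sin(x)


print("y(0) =", y(0.0))
for x in (-1.0, 0.5, 1.0, 1.5):
    print(x, y(x), residual(x))
```

The computed residuals are small at every sample point, which is a useful check that (26) was assembled correctly from (24).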
2.2.3. Existence and uniqueness for the linear equation. A fundamental issue, in the theory of differential equations, is whether a given differential equation
$F(x, y, y') = 0$ has a solution through a given initial point $y(a) = b$ in the $x, y$
plane. That is the question of existence. If a solution does exist, then our next
question is: is that solution unique, or is there more than one such solution? That
is the question of uniqueness. Finally, if we do indeed have a unique solution, then
over what x interval does it apply?
For the linear problem
$$y' + p(x)y = q(x), \qquad y(a) = b, \tag{27}$$
all of these questions are readily answered. Our proof of existence is said to be
constructive because we actually found a solution, (24). We are also assured that
that solution is unique because our derivation of (24) did not offer any alternatives:
each step was uniquely implied by the preceding one, as you can verify by reviewing those steps. Nonetheless, let us show how to prove uniqueness, for (27), by a
different line of approach.
Suppose that we have two solutions of (27), $y_1(x)$ and $y_2(x)$, on any $x$ interval
$I$ containing the initial point $a$. That is,
$$y_1' + p(x)y_1 = q(x), \qquad y_1(a) = b, \tag{28a}$$
and
$$y_2' + p(x)y_2 = q(x), \qquad y_2(a) = b. \tag{28b}$$
Next, denote the difference $y_1(x) - y_2(x)$ as $u(x)$, say. If we subtract (28b) from
(28a), and use the fact that $(f - g)' = f' - g'$, known from the calculus, we obtain
the “homogenized” problem
$$u' + p(x)u = 0, \qquad u(a) = 0, \tag{29}$$
on $u(x)$. But $u' + pu = 0$ implies that $\int du/u + \int p\,dx = 0$, which implies
that $\ln|u| = -\int p\,dx + C$, which implies that $|u| = \exp\left(-\int p\,dx + C\right)$ and,
finally, that $u(x) = A\exp\left(-\int p\,dx\right)$, where $A$ is an arbitrary constant. Since
$u(a) = 0$ and the exponential is nonzero, it follows that $A$ must be zero, so $u(x) =
y_1(x) - y_2(x)$ must be identically zero. Thus, $y_1(x)$ and $y_2(x)$ must be identical
on $I$. Since $y_1(x)$ and $y_2(x)$ are any two solutions of (27), the solution must be
unique.
That approach, in proving uniqueness, is somewhat standard. Namely, we
suppose that we have two solutions to the given problem, we let their difference be
$u$, say, obtain a homogenized problem on $u$, and show that $u$ must be identically
zero.
Finally, what is the interval of existence of our solution (24)? The only possible breakdown of (24) is that one or more of the integrals might not exist (be
convergent). But since $p$ is continuous, by assumption, it follows that the $\xi$ and $\zeta$
integrals of $p$ both exist and, indeed, are continuous functions of $x$ and $\xi$, respectively. Thus $\exp\left(\int_a^\xi p(\zeta)\,d\zeta\right)$ is continuous too, and since $q$ is also continuous by
assumption, then the integral of the exponential times $q$ must exist.
In summary, we have the following result.
THEOREM 2.2.1 Existence and Uniqueness, for the Linear Equation.
The linear equation $y' + p(x)y = q(x)$ does admit a solution through an initial
point $y(a) = b$ if $p(x)$ and $q(x)$ are continuous at $a$. That solution is unique, and it
exists at least on the largest $x$ interval containing $x = a$, over which $p(x)$ and $q(x)$
are continuous.
In Example 1, for instance, $p(x) = 2x$ and $q(x) = 0$ were continuous for all $x$,
so every solution was valid over $-\infty < x < \infty$. In Example 2, $p(x) = -x/(x+2)$
and $q(x) = 0$, so the broadest interval containing the initial point $x = 0$, over
which $p$ and $q$ were continuous, was $-2 < x < \infty$ and, sure enough, we found
that the solution (15) was valid over that interval but not beyond it because of the
singularity in the solution at $x = -2$. We might think of the solution as “inheriting”
that singularity from the singular behavior of $p(x) = -x/(x + 2)$ at that point.
EXAMPLE 5. A Narrow Escape. The condition of continuity of $p(x)$ and $q(x)$ is sufficient to imply the conclusions stated in the theorem, but is not necessary, as illustrated by the problem
\[ xy' + 3y = 6x^3. \tag{30} \]
The general solution of (30) is readily found, from (21), to be
\[ y(x) = x^3 + \frac{C}{x^3}. \tag{31} \]
The graphs of the solution for several values of $C$ (i.e., the solution curves) are shown in Fig. 3. Now, $p(x) = 3/x$ is continuous for all $x$ except $x = 0$, and $q(x) = 6x^2$ is continuous for all $x$, so if we append to (30) an initial condition $y(a) = b$, for some $a > 0$, then Theorem 2.2.1 tells us that the solution (31) that passes through that initial point will be valid over the interval $0 < x < \infty$ at least. For instance, if $a = 1$ and $b = 2.5$ (the point $P_1$), then $C = 1.5$, and the solution (31) is valid only over $0 < x < \infty$ since it is undefined at $x = 0$ because of the $1/x^3$ term, as can also be seen from the figure. However, if $a = 1$ and $b = 1$ (the point $P_2$) then $C = 0$, and the solution (31) is valid over the broader interval $-\infty < x < \infty$ because $C = 0$ removes the singular $1/x^3$ term in (31). That is, if the initial point happens to lie on the solution curve $y(x) = x^3$ through the origin, then the solution $y(x) = x^3$ is valid on $-\infty < x < \infty$; if not, then the solution (31) is valid on $0 < x < \infty$ if $a > 0$ and on $-\infty < x < 0$ if $a < 0$. ∎
Figure 3. Representative integral curves (31) for the equation (30).
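The claim that the one-parameter family $y = x^3 + C/x^3$ satisfies $xy' + 3y = 6x^3$ on either side of $x = 0$ can be spot-checked numerically. The following is a minimal sketch, not part of the text; the function names, the step size h, and the sample points are our own choices, and the derivative is only approximated by a central difference.

```python
def y(x, C):
    """Candidate solution y = x^3 + C / x^3 (undefined at x = 0 unless C = 0)."""
    return x**3 + C / x**3


def residual(x, C, h=1e-5):
    """Residual x*y' + 3*y - 6*x^3, with y' from a central difference."""
    dy = (y(x + h, C) - y(x - h, C)) / (2 * h)
    return x * dy + 3 * y(x, C) - 6 * x**3


def constant_from_initial_point(a, b):
    """Solve b = a^3 + C / a^3 for C (requires a != 0)."""
    return (b - a**3) * a**3


# Largest residual over a few sample points on both sides of x = 0.
worst = max(abs(residual(x, C))
            for C in (0.0, 1.5, -2.0)
            for x in (-2.0, -0.5, 0.5, 1.0, 3.0))
print("largest residual:", worst)
```

The residual is small but not exactly zero because of the finite-difference approximation; the helper constant_from_initial_point simply inverts the solution formula for a given initial point with $a \neq 0$.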
2.2.4. Variation of parameter method. A second method for the solution of the general first-order linear equation
\[ y' + p(x)y = q(x) \tag{32} \]
is the method of variation of parameters, due to the great French mathematician Joseph Louis Lagrange (1736-1813) who, like Euler, also worked on the applications of mathematics to mechanics, especially celestial mechanics.
Lagrange’s method is as follows. We begin by considering the homogeneous version of (32),
\[ y' + p(x)y = 0, \tag{33} \]
which is more readily solved. Recall that we solved it by integrating
\[ \int \frac{dy}{y} + \int p(x)\,dx = 0 \]
and obtaining the general solution
\[ y_h(x) = Ae^{-\int p(x)\,dx}. \tag{34} \]
We use the subscript $h$ because $y_h(x)$ is called the homogeneous solution of (32). That is, it is the solution of the homogeneous version (33) of the original nonhomogeneous equation (32). [In place of $y_h(x)$, some authors write $y_c(x)$ and call it the complementary solution.] Lagrange’s idea is to try varying the “parameter” $A$, the arbitrary constant in (34). Thus, we seek a solution $y(x)$ of the nonhomogeneous equation in the form
\[ y(x) = A(x)e^{-\int p(x)\,dx}. \tag{35} \]
(The general idea of seeking a solution of a differential equation in a particular form is important and is developed further in subsequent chapters.)
Putting (35) into (32) gives
\[ \left(A'e^{-\int p\,dx} + A(-p)e^{-\int p\,dx}\right) + pAe^{-\int p\,dx} = q. \tag{36} \]
Cancelling the two $A$ terms and solving for $A'$ gives
\[ A'(x) = q(x)e^{\int p(x)\,dx}, \tag{37} \]
which can be integrated to give
\[ A(x) = \int e^{\int p(x)\,dx}q(x)\,dx + C \]
and hence the general solution
\[ y(x) = A(x)e^{-\int p(x)\,dx} = e^{-\int p(x)\,dx}\left(\int e^{\int p(x)\,dx}q(x)\,dx + C\right), \tag{38} \]
which is identical to our previous result (21).
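To see the steps of Lagrange’s method on a concrete case, consider $y' + y = x$, so that $p(x) = 1$ and $q(x) = x$. This particular equation is our own choice for illustration and is not taken from the text.

\[
\begin{aligned}
&\text{homogeneous solution, as in (34):} && y_h(x) = Ae^{-x}, \\
&\text{vary the parameter, as in (35):} && y(x) = A(x)e^{-x}, \\
&\text{equation for } A', \text{ as in (37):} && A'(x) = xe^{x}, \\
&\text{integrate by parts:} && A(x) = (x - 1)e^{x} + C, \\
&\text{general solution, as in (38):} && y(x) = x - 1 + Ce^{-x}.
\end{aligned}
\]

Direct substitution confirms the result, since $y' + y = (1 - Ce^{-x}) + (x - 1 + Ce^{-x}) = x$.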
It is easy to miss how remarkable is the idea behind Lagrange’s method because it starts out looking like a foolish idea and ends up working beautifully. Why do we say that it looks foolish? Because it is completely nonspecific. To explain what we mean by that, let us put Lagrange’s idea aside, for a moment, and consider the second-order linear equation $y'' + y' - 2y = 0$, for instance. In Chapter 3 we will learn to seek solutions of such equations in the exponential form $y = e^{\lambda x}$ where $\lambda$ is a constant that needs to be determined. Putting that form into $y'' + y' - 2y = 0$ gives the equation $(\lambda^2 + \lambda - 2)e^{\lambda x} = 0$, which implies that $\lambda$ needs to satisfy the quadratic equation $\lambda^2 + \lambda - 2 = 0$, with roots $\lambda = 1$ and $\lambda = -2$. Thus, we are successful, in this example, in finding two solutions of the assumed form, $y(x) = e^{x}$ and $y(x) = e^{-2x}$. Notice how easily this idea works. It is easily implemented because most of the work has been done by deciding to look for solutions in the correct form, exponentials, rather than looking within the set of all possible functions. Similarly, if we lose our eyeglasses, the task of finding them is much easier if we know that they are somewhere on our desk, than if we know that they are somewhere in the universe.
Returning to Lagrange’s idea, observe that the form (35) is completely nonspecific. That is, every function can be expressed in that form by a suitable choice of $A(x)$. Thus, (35) seems useless in that it does not narrow down the search in the least. That’s why we say that at first glance Lagrange’s idea looks like a foolish idea.
Next, why do we say that, nevertheless, it works beautifully? Notice that the
equation (36), governing $A(x)$, is itself a first-order nonhomogeneous equation of
the same form as the original equation (32), and looks even harder than (32)—except
for the fact that the two A terms cancel, so that we obtain the simple equation (37)
that can be solved by direct integration. The cancellation of the two A terms was
not serendipitous. For suppose that A(x) is a constant. Then the A’ term in (36)
drops out, and the two A terms must cancel to zero because if A is a constant then
(35) is a solution of the homogeneous equation!
In Chapter 3 we generalize Lagrange’s method to higher-order differential
equations.
Closure. In this chapter we begin to solve differential equations. In particular, we consider the general first-order linear equation
\[ y' + p(x)y = q(x), \tag{39} \]
where $p(x)$ and $q(x)$ are given. We begin with the homogeneous case,
\[ y' + p(x)y = 0, \tag{40} \]
because it is simpler, and find its general solution
\[ y(x) = Ae^{-\int p(x)\,dx}. \tag{41} \]
If an initial condition $y(a) = b$ is appended to (40), then (41) gives the particular solution of (40) through the initial point $(a, b)$ as
\[ y(x) = be^{-\int_a^x p(\xi)\,d\xi}. \tag{42} \]
Turning next to the full nonhomogeneous equation (39), we derive the general solution
\[ y(x) = e^{-\int p(x)\,dx}\left(\int e^{\int p(x)\,dx}q(x)\,dx + C\right), \tag{43} \]
first by the integrating factor method, and then again by the method of variation of parameters. Both of these methods will come up again in subsequent sections and chapters.
If an initial condition $y(a) = b$ is appended to (39), then (43) gives the particular solution of (39) through the initial point $(a, b)$ as
\[ y(x) = e^{-\int_a^x p(\xi)\,d\xi}\left(\int_a^x e^{\int_a^{\xi} p(\zeta)\,d\zeta}q(\xi)\,d\xi + b\right), \tag{44} \]
which solution is unique, and which is valid on the broadest $x$ interval containing $x = a$, on which both $p(x)$ and $q(x)$ are continuous, and possibly even on a broader interval than that.
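The closed-form solution of the initial-value problem can also be evaluated numerically when the integrals are awkward to do by hand. The sketch below is one possible way to do so, under stated assumptions: the helper names simpson and solve_linear_ivp are hypothetical, the integrals are approximated by the composite Simpson rule, and the test problem $y' + y = x$, $y(0) = 1$ (exact solution $y = x - 1 + 2e^{-x}$) is our own choice.

```python
import math


def simpson(f, lo, hi, n=200):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    if lo == hi:
        return 0.0
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3


def solve_linear_ivp(p, q, a, b, x):
    """Evaluate y(x) = exp(-P(x)) * (integral_a^x exp(P(xi)) q(xi) dxi + b),
    where P(s) = integral_a^s p(zeta) dzeta, for y' + p y = q, y(a) = b."""
    P = lambda s: simpson(p, a, s)                     # inner integral of p
    integrand = lambda xi: math.exp(P(xi)) * q(xi)     # exponential times q
    return math.exp(-P(x)) * (simpson(integrand, a, x) + b)


# Illustrative test problem (an assumption): y' + y = x, y(0) = 1.
approx = solve_linear_ivp(lambda x: 1.0, lambda x: x, 0.0, 1.0, 2.0)
exact = 2.0 - 1.0 + 2.0 * math.exp(-2.0)
print(approx, exact)
```

The nested quadrature is deliberately naive (the inner integral is recomputed at every node of the outer one), which keeps the code close to the formula at the cost of efficiency.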
Finally, we introduce the idea of lineal elements and the direction field of a differential equation $y' = f(x, y)$.
It is noteworthy that we are successful in finding the general solution of the general first-order linear equation $y' + p(x)y = q(x)$ explicitly and in closed form. For other equation types we may not be so successful, as we shall see.
In closing, we call attention to the exercises, to follow, that introduce additional important special cases, the Bernoulli, Riccati, d’Alembert-Lagrange, and Clairaut equations. In subsequent sections we occasionally refer to those equations and to those exercises.
Computer software. There are now several powerful computer-algebra systems, such as Mathematica, MATLAB, and Maple, that can be used to implement much of the mathematics presented in this text - numerically, symbolically, and graphically. Consider the application of Maple, as a representative software, to the material in this section.
There are two types of applications involved in this section. One entails finding the general solution of a given first-order differential equation, or a particular solution satisfying a given initial condition. These can be carried out on Maple using the dsolve command (“function,” in Maple terminology).
For example, to solve the equation $(x + 2)y' - xy = 0$ of Example 2, for $y(x)$, enter
dsolve((x + 2)*diff(y(x), x) - x*y(x) = 0, y(x));
(including the semicolon) and return; dsolve is the differential equation solver, and diff is the derivative command for $y'$. [The command for $y''$ would be diff(y(x), x, x), and so on for higher derivatives.] The output is the general solution
\[ y(x) = \frac{\_C1\,\exp(x)}{x^2 + 4x + 4} \]
where _C1 is Maple notation for an arbitrary constant.
To solve the same equation, but this time with the initial condition $y(0) = 3$, enter
dsolve({(x + 2)*diff(y(x), x) - x*y(x) = 0, y(0) = 3}, y(x));
and return. The output is the particular solution
\[ y(x) = \frac{12\,\exp(x)}{x^2 + 4x + 4} \]
which agrees with our result in Example 2.
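Whatever system produces a symbolic answer, it is worth confirming that the answer satisfies both the equation and the initial condition. The following sketch does that check in plain Python, independently of Maple; the function names and the finite-difference step are our own assumptions, and the check is numerical rather than symbolic.

```python
import math


def y_general(x, c1):
    """Reported general solution of (x + 2) y' - x y = 0: c1 * exp(x) / (x + 2)^2."""
    return c1 * math.exp(x) / (x + 2) ** 2


def residual(x, c1, h=1e-6):
    """Residual (x + 2) y' - x y, with y' approximated by a central difference."""
    dy = (y_general(x + h, c1) - y_general(x - h, c1)) / (2 * h)
    return (x + 2) * dy - x * y_general(x, c1)


# The initial condition y(0) = 3 requires c1 / 4 = 3, that is, c1 = 12.
print(y_general(0.0, 12.0))
print([residual(x, 12.0) for x in (-1.0, 0.0, 1.0, 3.0)])
```

The sample points are kept to the right of $x = -2$, where the solution is singular.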
The dsolve command can cope with differential equations that contain unspecified parameters or functions. For example, to solve $y' + p(x)y = 0$, where $p(x)$ is not specified, enter
dsolve(diff(y(x), x) + p(x)*y(x) = 0, y(x));
and return. The output is the general solution
\[ y(x) = \exp\left(-\int p(x)\,dx\right)\,\_C1 \]
The second type of application entails generating the graphical display of various solution curves and/or the direction field for a given differential equation. Both of these tasks can be carried out on Maple using the phaseportrait command. For example, to obtain the plot shown as Fig. 1, enter
with(DEtools):
to access the phaseportrait command; then return and enter
phaseportrait(-2*x*y, [x,y], x = -1.5..1.5, {[0,-2],[0,-1],[0,0],[0,1],[0,2],[0,3]}, arrows = LINE);
and return. The items within the outer parentheses are as follows:
phaseportrait(right-hand side of y' = f(x,y), [variables], xrange, {initial points}, optional specification to include direction field lineal elements and choice of their line thickness);
The yrange is set automatically, but it can be specified as an additional optional item if you wish. All items following the {initial points} are optional, so if you want the yrange to be $-1 < y < 3$, say, then modify the phaseportrait command as follows:
phaseportrait(-2*x*y, [x,y], x = -1.5..1.5, {[0,-2],[0,-1],[0,0],[0,1],[0,2],[0,3]}, y = -1..3, arrows = LINE);
To run phaseportrait over and over, one needs to enter “with(DEtools):” only at the beginning of the session.
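The two ingredients of such a plot, the lineal elements of the direction field and an integral curve through a chosen initial point, can be computed without any plotting library. The sketch below is a rough Python counterpart for $y' = -2xy$, not a reproduction of what Maple does internally: the function names are hypothetical, the grid spacing is arbitrary, and the curve is marched with the classical fourth-order Runge-Kutta method, which is our own choice.

```python
import math


def f(x, y):
    """Right-hand side of y' = f(x, y) for the equation y' + 2xy = 0."""
    return -2.0 * x * y


def lineal_elements(xs, ys):
    """Direction field as a list of (x, y, slope) triples on a rectangular grid."""
    return [(x, y, f(x, y)) for x in xs for y in ys]


def rk4_curve(x0, y0, x_end, steps=300):
    """Integral curve through (x0, y0), marched with classical RK4."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    pts = [(x, y)]
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
        pts.append((x, y))
    return pts


grid = [-1.5 + 0.5 * k for k in range(7)]   # -1.5, -1.0, ..., 1.5
field = lineal_elements(grid, grid)
curve = rk4_curve(0.0, 2.0, 1.5)            # curve through the initial point (0, 2)
print(len(field), curve[-1])
```

For this equation the exact integral curve through $(0, y_0)$ is $y = y_0e^{-x^2}$, which gives a convenient check on the marched values.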
To obtain a listing of the mathematical functions and operators (or commands) available in Maple, enter ?lib and return. Within that list one would find such commands as dsolve and phaseportrait. To learn how to use a command enter a question mark, then the command name, then return. For example, type ?dsolve and return.
In the exercises that follow, and those in subsequent sections, problems are included that require the use of a computer-algebra system such as one of the systems mentioned above. These are important, and we strongly urge you to develop skill in the use of at least one such system in parallel with, but not in place of, developing understanding of the underlying mathematics presented in this text.
EXERCISES 2.2
1. Assuming that $p(x)$ and $q(x)$ are continuous, verify by direct substitution
(a) that (9) satisfies (3)
(b) that (21) satisfies (2)
2. In each case find the general solution, both using the “off-the-shelf” formula (21) and then again by actually carrying out the steps of the integrating factor method. That is, find the integrating factor $\sigma(x)$ and then carry out solution steps analogous to those in our derivation of (21). Understand the $x$ interval on which the equation is defined to be the broadest interval on which both $p(x)$ and $q(x)$ are continuous. For example, in part (a) the $x$ interval is $-\infty < x < \infty$, in part (e) it is any interval on which $\tan x$ is continuous (such as $\pi/2 < x < 3\pi/2$), and in part (f) it is either $-\infty < x < 0$ or $0 < x < \infty$ [to ensure the continuity of $p(x) = 2/x$].
(a) $y' - y = 3e^x$
(b) $y' + 4y = 8$
(c) $y' + y = x^2$
(d) $y' = y - \sin 2x$
(e) $y' - (\tan x)y = 6$
(f) $xy' + 2y = x^3$
(g) $xy' - 2y = x^3$
(h) $y' + (\cot x)y = 2\cos x$
() (v7
—5)(ay'
+3y)=2
dx,
|
Gia —6r=e!
d
(m) tSat
7
(n) oe
7, + 2(cot 20)r =
(Kyd
te
43e=0 © yA ad 4ay—4y?
=1
x
=2
x
ar
3. (a)-(n) For the equation given in Exercise 2, solve by the method of variation of parameters. That is, first find the homogeneous solution, then vary the parameter, and so on, as we did in (34)-(37) for the general equation (32).
4. (a)-(n) For the equation given in Exercise 2, find the general solution using computer software (such as Mathematica, MATLAB, or Maple). Verify your result by showing that it does satisfy the given differential equation.
5. Solve $xy' + y = 6x^2$ subject to the given initial condition using any method of this section, and state the (broadest) interval of validity of the solution. Also, sketch the graph of the solution, by hand, and label any key values.
(a) $y(1) = 0$
(b) $y(1) = 2$
(c) $y(2) = 2$
(d) $y(-3) = 1$
(e) $y(-3) = -5$
(f) $y(-2) = 8$
6. Solve $xy' + 2y = x + 2$ subject to the given initial condition using any method of this section, and state the (broadest) interval of validity of the solution. Also, sketch the graph of the solution, by hand, and label any key values.
(a) $y(2) =$
(b) $y(0) = 1$
(c) $y(-1) =$
(d) $y(1) =$
(e) $y(-2) = 0$
(f) $y(-3) = 0$
7. Find the general solution using any method of this section. The answer may be left in implicit form, rather than explicit form, if necessary. HINT: Remember that which variable is the independent variable and which is the dependent variable is a matter of viewpoint, and one can change one’s viewpoint. In these problems, consider whether it might be better to regard $x$ as a function of $y$, and recall from the calculus that $dy/dx = 1/(dx/dy)$.
1
1
~ 2 oe
(b) i
(c) (6y? —2x)%-y=0
dz
= 62 + vs
(d) (y* siny +2)
aie
8. (Direction fields) The direction field concept was discussed within Example 1. For the differential equation given, use computer software to plot the direction field over the specified rectangular region in the $x, y$ plane, as well as the integral curve through the specified point $P$. Also, if you can identify any integral curves exactly, from an inspection of the direction field, then give the equations of those curves, and verify that they do satisfy the given differential equation.
(a)yi = 2+ (2x —y)" on |z| < 4, ly] < 4; P =(2,1)
(b)yy’= y(y? - 4) on {al <4, ly) < 4; P= “O, 1)
(c)y= (3—y7)? on |a| < 2, ly] < 3; P = (0,0)
on fal < 3, jy| < 2; P = (0,0.5)
(d)y+ 2y=e*
-3)
— 1) on |z| <3, [yf <3; P= (—3,
(e)y!= 2°/(y?
(Ny+e=y
ona]<20,0<y< 20;P=(0,1)
(jy =e"y ond<x<50,0<y< 50; P=(0,10)
(h)y’=azsiny
9. (Bernoulli equation) The equation
\[ y' + p(x)y = q(x)y^n, \tag{9.1} \]
where $n$ is a constant (not necessarily an integer), is called Bernoulli’s equation, after the Swiss mathematician Jakob Bernoulli. Jakob (1654-1705), his brother Johann (1667-1748), and Johann’s son Daniel (1700-1782), are the best known of the eight members of the Bernoulli family who were prominent mathematicians and scientists.
(a) Give the general solution of (9.1) for the special cases $n = 0$ and $n = 1$.
(b) If $n$ is neither 0 nor 1, then (9.1) is nonlinear. Nevertheless, show that by transforming the dependent variable from $y(x)$ to $v(x)$ according to $v = y^{1-n}$ (for $n \neq 0, 1$), (9.1) can be converted to the equation
\[ v' + (1 - n)p(x)v = (1 - n)q(x), \tag{9.2} \]
which is linear and can be solved by the methods developed in this section. This method of solution was discovered by Gottfried Wilhelm Leibniz (1646-1716) in 1696.
10. Use the method suggested in Exercise 9(b) to find the general solution to each of the following.
(a)y’ —dy=4y?
(c)2ayy!+y?=Qa
@)y=y"
(e)y’ = ay? +2a—- 2°
(d)Jy(3y'+y)=«
(Hy!=ay?
(h)y! = (2—y)y
13. (d’Alembert-Lagrange
ear differential equation
equation) The first- order nonlin-
y = xf (p) + 9(p)
(13.1)
on y(z), whereit will be convenientto denotey’ as p, and f
and g are given functions of p, is known as a d’AlembertLagrange equation after the French mathematicians Jean le
Rond d’Alembert (1717-1783) and Joseph-Louis Lagrange
(1736-1813).
11. (Riccati equation) The equation
(a) Differentiating
y' =p(x)y?+q(x)y+r(z)
HINT: See if you can find a Y (x)
in the form az?.
(b)ay’ —2y = xy?
(g)y" = (y')? — HINT:First,lety/(a) = u(a).
(h)y’” + (y”)? =0 —HINT:First,lety”(x) = u(a).
is called Riccati’s
Y(«)=2e*
y=e ty? ~y |HINT:
(9.2) (f)y= (I - He ant)
(gy =y-
vo+(1—n)p(a)u = (1 —n)aq(2),
ifried Wilhelm Leibniz (1646-1716)
Gowa
(1.1)
equation, after the Italian mathematician
(13.1) with respect to x, show that
p—fp) =(ef) +9) Fdp
(13.2)
The Riccati equation
Observe that this nonlinear equation on p(x} can be converted
is nonlinearif p(a) is not identically zero. Recall from Exer- to a linear equation if we interchange the roles of x and p by
Jacopo Francesco Riccati (1676-1754).
cise 9 that the Bernoulli equation can always be reduced to a
linear equation by a suitable change of variables. Likewise, for
the Riccati equation, provided that any one particular solution
can be found.
Let Y(2) be anyoneparticularsolution of (11.1),asfound
by inspection, trial and error, or any other means. [Depending
now regarding x as the independent variable and p as the dependent variable. Thus, obtain from (13.2) the linear equation
f(y).
9)
p—f(p) —p—flr)
(13.3)
on p(x), g(x), and r(c), finding such a Y(x) may be easy, on x(p). Since we have divided by p ~ f(p) we must restrict
or it may prove too great a task.] Show that by changing the f(p) so thatf(p) 4 p. Solving thesimplerequation(13.3)for
dependent
variablefromy(a) tou(x) accordingto
z(p)},the solution of (13.1) is therebyobtainedin parametric
form: x = x(p) from solving (13.3),andy = a(p) f(p) + g(p)
y=¥(a)+—
Uu
(11.2) from (13.1). This result is the key idea of this exercise, and is
illustrated in parts (b)—(c). In parts (d)-(k) we consider a more
specialized result, namely, for the case where f(p) happens to
the Riccati equation (11.1) can be converted to the equation
have a “fixed point.”
u’ + [(2p(a)Y(x) + q(x)]u = —p(x),
(11.3) (b) To illustrate part (a), consider the equation y = 2ay’ + 3y’
[ie., where f(p) = 2p and g(p) = 3p], andderive a paramet-
which is linear and can be solved by the methods developed in
this section. This method of solution was discovered by Leonhard Euler (1707-1783) in 1760.
ric solution as discussed in (a).
(c) To illustrate part (a), consider the equation y = a(y! + y'*)
12. Use the method suggested in Exercise || to find the
solution discussed in (a).
fie., f(p) = p+ p? and g(p) = 0], and derive theparametric
general solution to each of the following. Nonelementary inte- (d) Suppose that f(p) has a fixed point Po, that is, such that
grals, such as [ exp (az) de, may be left as is.
(a) $y' - 4y = y^2$ HINT: $Y(x) = -4$
(b) $y' = y^2 - xy + 1$ HINT: $Y(x) = x$
(c) $(\cos x)y' = 1 - y^2$ HINT: $Y(x) = \sin x$
f( Po) = Po. [A given function f may have none, one, or any
number of fixed points. They are found as the solutions of the
equation f(p) = p.] Show that (13.1) thenhas the straight line
y = Pox + g(Po)
(13.4)
cases where the integrals that occur in the general solution of
(14.2)
y = Cx + g{C),
(e) Show that f(p) = 3p? has two fixed points, p = 0 and
where C’ is an arbitrary constant.
p = 1/3, and henceshow that theequationy = 3xp? + g(p)
(b) Recall that (13.3) does not hold if f(p) = p, but (13.2)
(13.3) are too difficult to evaluate.|
has straight-line solutions y = g(0) and y ==aa +g @
for
any given function g.
(f) Determine all particular solutions of the form (13.4), if any,
for theequationy = x (y’* —2y' + 2) + ev.
does. Letting f(p) = p in (13.2), derive the family of solutions (14.2), as well as the additional
parametrically by
particular
a = —g'(p),
y = —pg'(p)
+g(p).
(g) Same as (f), for y = ze’ — 5cos y'.
(h) Same as (f), for y = x (y”? — 2y') + by’.
solution given
(14.3a)
(14.3b)
(i) Sameas(f),for y = x (y3 ~ 3y’') —2siny’.
(j) Sameas (f),for y —x (y’? + 3) = y’.
(c) To illustrate, find the parametric solution (14.3) for the
equation y = wy’ — y’. Show that in this example (14.3)
can be gotten into the explicit form y = x?/4 by eliminating
(k) Sameas (f), for y + a (2y’ +3) = er".
the parameter p between (14.3a) and (14.3b).
14. (Clairaut equation) For the special case f(p) = p, the
d’Alembert~Lagrange equation (13.1) in the preceding exercise becomes
the family (14.2), for C = 0, +1/2, £1,£2, together with the
solution y = x7/4. (Observe, from that plot, that the particular solution y = «7/4 forms an “envelope” of the family of
straight-line solutions. Such a solution is called a singular so-
y=«up+g(p),
(14.1)
which is known as the Clairaut equation. after the French
mathematician Alexis Claude Clairaut (1713-1765). (Recall
thatp denotesy’ here.)
(a) Verify, by direct substitution into (14.1), that (14.1) admits
lution of the differential
Plot, by hand,
equation.)
(d) Instead of a hand plot, do a computer plot of y = 27/4 and
3, on
the family (14.2), for C = 0, +£0.25,+£0.5,£0.75,...,
12.
—-8<2<8-l0<y<
the family of solutions
In this section we consider representative physical applications that are governed by linear first-order equations: electrical circuits, radioactivity, population dynamics, and mixing problems, with additional applications introduced in the exercises.
2.3.1. Electrical circuits. In Section 1.3 we discussed the mathematical modeling of a mechanical oscillator. The relevant physics was Newton’s second law of
motion, which relates the net force on a body to its resulting motion. Thus, we
needed to find sufficiently accurate expressions for the forces contributed by the
individual elements within that system — the forces due to the spring, the friction
between the block and the table, and the aerodynamic drag.
In the case of electrical circuits the relevant underlying physics, analogous
to Newton’s second law for mechanical systems, is provided by Kirchhoff’s laws.
Instead of forces and displacements in a mechanical system comprised of various
elements such as masses and springs, we are interested now in voltages and currents
in an electrical system comprised of various elements such as resistors, inductors,
and capacitors.
First, by a current we mean a flow of charges: the current through a given control surface, such as the cross section of a wire, is the charge per unit time crossing
that surface. Each electron carries a negative charge of $1.6 \times 10^{-19}$ coulomb, and
each proton carries an equal positive charge. Current is measured in amperes, with
one ampere being a flow of one coulomb per second. By convention, a current is
counted as positive in a given direction if it is the flow of positive charge in that
direction. While, in general, currents can involve the flow of positive or negative
charges, in an electrical circuit the flow is of negative charges, free electrons. Thus,
when one speaks of a current of one ampere in a given direction in an electrical
circuit one really means the flow of one coulomb per second of negative charges
(electrons) in the opposite direction.
Just as heat flows due to a temperature difference, from one point to another,
an electric current flows due to a difference in the electric potential, or voltage,
measured in volts.
We will need to know the relationship between the voltage difference across a
given circuit element and the corresponding current flow. The circuit elements of
interest here are resistors, inductors, and capacitors.
For a resistor, the voltage drop $E(t)$, where $t$ is the time (in seconds), is proportional to the current $i(t)$ through it:
\[ E(t) = Ri(t), \tag{1} \]
where the constant of proportionality $R$ is called the resistance and is measured in ohms; (1) is called Ohm’s law. By a resistor we usually mean an “off-the-shelf” electrical device, often made of carbon, that offers a specified resistance such as 100 ohms, 500 ohms, and so on. But even the current-carrying wire in a circuit is itself a resistor, with its resistance directly proportional to its length and inversely proportional to its cross-sectional area, though that resistance is probably negligible compared to that of other resistors in the circuit. The standard symbolic representation of a resistor is shown in Fig. 1.
For an inductor, the voltage drop is proportional to the time rate of change of current through it:
\[ E(t) = L\frac{di(t)}{dt}, \tag{2} \]
where the constant of proportionality $L$ is called the inductance and is measured in henrys. Physically, most inductors are coils of wire, hence the symbolic representation shown in Fig. 1.
For a capacitor, the voltage drop is proportional to the charge $Q(t)$ on the capacitor:
\[ E(t) = \frac{1}{C}Q(t), \tag{3} \]
where $C$ is called the capacitance and is measured in farads. Physically, a capacitor is normally comprised of two plates separated by a gap across which no current flows, and $Q(t)$ is the charge on one plate relative to the other. Though no current flows across the gap, there will be a current $i(t)$ that flows through the circuit that links the two plates and is equal to the time rate of change of charge on the capacitor:
\[ i(t) = \frac{dQ(t)}{dt}. \tag{4} \]
Figure 1. The circuit elements. Resistor: $E_1 - E_2 = E = Ri$. Inductor: $E_1 - E_2 = E = L\,di/dt$. Capacitor: $E_1 - E_2 = E = \frac{1}{C}\int i\,dt$.
From (3) and (4) it follows that the desired voltage/current relation for a capacitor can be expressed as
\[ E(t) = \frac{1}{C}\int i(t)\,dt. \tag{5} \]
Now that we have equations (1), (2), and (5) relating the voltage drop to the
current, for our various circuit elements, how do we deal with a grouping of such
elements within a circuit? The relevant physics that we need, for that purpose,
is given by Kirchhoff’s laws, named after the German physicist Gustav Robert
Kirchhoff (1824-1887):
Kirchhoff’s current law states that the algebraic sum of the currents approaching (or leaving) any point of a circuit is zero.
Kirchhoff’s voltage law states that the algebraic sum of the voltage drops around
any loop of a circuit is zero.
To apply these ideas, consider the circuit shown in Fig. 2a, consisting of a single loop containing a resistor, an inductor, a capacitor, a voltage source (such as a battery or generator), and the necessary wiring. Let us consider the current $i(t)$ to be positive clockwise; if it actually flows counterclockwise then its numerical value will be negative. In this case Kirchhoff’s current law simply says that the current $i$ is a constant from point to point within the circuit and therefore varies only with time. That is, the current law states that at any given point $P$ in the circuit (Fig. 2b), $i_1 + (-i_2) = 0$ or, $i_1 = i_2$. Kirchhoff’s voltage law, which is really the self-evident algebraic identity
\[ (V_a - V_d) + (V_b - V_a) + (V_c - V_b) + (V_d - V_c) = 0, \tag{6} \]
gives
\[ E(t) - Ri - L\frac{di}{dt} - \frac{1}{C}\int i\,dt = 0. \tag{7} \]
Figure 2. RLC circuit.
The latter is called an integrodifferential equation because it contains both derivatives and integrals of the unknown function, but we can convert it to a differential equation in either of two ways.
First, we could differentiate with respect to $t$ to eliminate the integral sign, thereby obtaining
\[ L\frac{d^2i}{dt^2} + R\frac{di}{dt} + \frac{1}{C}i = \frac{dE(t)}{dt}. \tag{8} \]
Alternatively, we could use $Q(t)$ instead of $i(t)$ as our dependent variable, for then $\int i\,dt = Q(t)$, and (7) becomes
\[ L\frac{d^2Q}{dt^2} + R\frac{dQ}{dt} + \frac{1}{C}Q = E(t). \tag{9} \]
Either way, we obtain a linear second-order differential equation.
Since we are discussing applications of first-order linear equations here, let us treat two special cases.
EXAMPLE 1. RL Circuit. If we omit the capacitor from our circuit, then (7) reduces to the first-order linear equation*
\[ L\frac{di}{dt} + Ri = E(t). \tag{10} \]
If $E(t)$ is a continuous function of time and the current at the initial instant $t = 0$ is $i(0) = i_0$, then the solution to the initial-value problem consisting of (10) plus the initial condition $i(0) = i_0$ is given by (24) in Section 2.2, with “$p$” $= R/L$ and “$q$” $= E(t)/L$:
\[ i(t) = e^{-\int_0^t (R/L)\,d\tau}\left(\int_0^t e^{\int_0^{\tau}(R/L)\,d\mu}\,\frac{E(\tau)}{L}\,d\tau + i_0\right) \]
or
\[ i(t) = i_0e^{-Rt/L} + \frac{1}{L}\int_0^t e^{R(\tau - t)/L}E(\tau)\,d\tau \tag{11} \]
over $0 \le t < \infty$, where $\tau$ and $\mu$ have been used as dummy integration variables.
For instance, if $E(t) = \text{constant} = E_0$, then (11) gives
\[ i(t) = i_0e^{-Rt/L} + \frac{E_0}{R}\left(1 - e^{-Rt/L}\right) \tag{12} \]
or
\[ i(t) = \frac{E_0}{R} + \left(i_0 - \frac{E_0}{R}\right)e^{-Rt/L}. \tag{13} \]
As $t \to \infty$, the exponential term in (13) tends to zero, and $i(t) \to E_0/R$. Thus we call the $E_0/R$ term in (13) the steady-state solution and the $(i_0 - E_0/R)e^{-Rt/L}$ term the transient part of the solution. The approach to steady state, for several different initial conditions, is shown in Fig. 3.
Figure 3. Response $i(t)$ for the case $E(t) = \text{constant} = E_0$; approach to steady state.
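The constant-voltage response, a steady-state value $E_0/R$ plus a transient $(i_0 - E_0/R)e^{-Rt/L}$, is easy to tabulate and to check against the governing equation $L\,di/dt + Ri = E_0$. The sketch below does both; the component values and function names are illustrative assumptions, not data from the text, and the derivative is approximated by a central difference.

```python
import math


def rl_current(t, i0, E0, R, L):
    """RL response to constant E0: i(t) = E0/R + (i0 - E0/R) * exp(-R t / L)."""
    return E0 / R + (i0 - E0 / R) * math.exp(-R * t / L)


def rl_residual(t, i0, E0, R, L, h=1e-6):
    """Residual L di/dt + R i - E0, with di/dt from a central difference."""
    didt = (rl_current(t + h, i0, E0, R, L) - rl_current(t - h, i0, E0, R, L)) / (2 * h)
    return L * didt + R * rl_current(t, i0, E0, R, L) - E0


R, L, E0 = 100.0, 0.5, 12.0        # ohms, henrys, volts (illustrative values)
for i0 in (0.0, 0.12, 0.3):        # initial currents below, at, and above E0/R
    print(i0, [round(rl_current(t, i0, E0, R, L), 5) for t in (0.0, 0.005, 0.02, 1.0)])
```

With these values the time constant $L/R$ is 5 milliseconds, so by $t = 1$ second every initial condition has settled to $E_0/R = 0.12$ ampere.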
As another case, let $E(t) = E_0\sin\omega t$ and $i_0 = 0$. Then (11) gives
\[ i(t) = \frac{E_0\omega L}{R^2 + (\omega L)^2}\left(e^{-Rt/L} + \frac{R}{\omega L}\sin\omega t - \cos\omega t\right). \tag{14} \]
*It may seem curious that if we try deleting the capacitor by setting $C = 0$, then the capacitor term in (7) becomes infinite rather than zero. Physically, however, one can imagine removing the capacitor, in effect, by moving its plates together until they touch. Since the capacitance $C$ varies as the inverse of the gap dimension, then as the gap diminishes to zero $C \to \infty$, and the capacitor term in the differential equation does indeed drop out because of the $1/C$ factor.
As $t \to \infty$, the exponential term in (14) tends to zero, and we are left with the steady-state solution
\[ i(t) \to \frac{E_0\omega L}{R^2 + (\omega L)^2}\left(\frac{R}{\omega L}\sin\omega t - \cos\omega t\right) \quad (t \to \infty). \tag{15} \]
Observe that by a steady-state solution we mean that which remains after transients have died out; it is not necessarily a constant. For the case where $i(0) = i_0$ and $E(t) = E_0$ the steady-state solution is the constant $E_0/R$, and for the case where $i(0) = 0$ and $E(t) = E_0\sin\omega t$ the steady-state solution is the oscillatory function given by (15). ∎
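The sinusoidally driven response and its steady-state limit can be compared numerically. The sketch below evaluates $i(t) = \frac{E_0\omega L}{R^2 + (\omega L)^2}\left(e^{-Rt/L} + \frac{R}{\omega L}\sin\omega t - \cos\omega t\right)$ and the same expression without the exponential; the function names and the parameter values are assumptions chosen only for illustration.

```python
import math


def rl_sine_response(t, E0, R, L, w):
    """RL response to E(t) = E0 sin(w t) with i(0) = 0, transient included."""
    amp = E0 * w * L / (R**2 + (w * L) ** 2)
    return amp * (math.exp(-R * t / L) + (R / (w * L)) * math.sin(w * t) - math.cos(w * t))


def rl_sine_steady(t, E0, R, L, w):
    """What remains of the response once the exponential has died out."""
    amp = E0 * w * L / (R**2 + (w * L) ** 2)
    return amp * ((R / (w * L)) * math.sin(w * t) - math.cos(w * t))


E0, R, L, w = 5.0, 2.0, 1.0, 3.0   # illustrative values
for t in (0.0, 0.5, 2.0, 10.0):
    print(t, rl_sine_response(t, E0, R, L, w), rl_sine_steady(t, E0, R, L, w))
```

Combining the sine and cosine terms shows that the steady-state oscillation has amplitude $E_0/\sqrt{R^2 + (\omega L)^2}$, so the inductor reduces the current more strongly at higher frequencies.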
EXAMPLE 2. RC Circuit. If, instead of removing the capacitor from the circuit shown in Fig. 2, we remove the inductor (so that $L = 0$), then (8) becomes
\[ R\frac{di}{dt} + \frac{1}{C}i = \frac{dE(t)}{dt}, \tag{16} \]
which, again, is a first-order linear equation. If we also impose an initial condition $i(0) = i_0$, then
\[ i(t) = i_0e^{-t/RC} + \frac{1}{R}\int_0^t e^{(\tau - t)/RC}\,\frac{dE(\tau)}{d\tau}\,d\tau \tag{17} \]
gives the solution in terms of $i_0$ and $E(t)$. ∎
Figure 4. Schematic of the system: the inputs $[i_0, E(t)]$ enter the system, which produces the output $[i(t)]$.
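As a brief illustration of the RC solution formula, suppose the applied voltage is a ramp, $E(t) = \alpha t$, so that $dE/d\tau = \alpha$. The ramp and the constant $\alpha$ are our own assumptions, chosen because the integral can then be done in closed form:

\[
i(t) = i_0e^{-t/RC} + \frac{\alpha}{R}\int_0^t e^{(\tau - t)/RC}\,d\tau
     = i_0e^{-t/RC} + \alpha C\left(1 - e^{-t/RC}\right).
\]

As $t \to \infty$ the current tends to the constant $\alpha C$, which is consistent with the differential equation: with $di/dt = 0$, the relation $R\,di/dt + i/C = dE/dt$ reduces to $i/C = \alpha$.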
Let us use the electrical circuit problem of Example 1 to make some general remarks. We speak of the initial condition $i_0$ and the applied voltage $E(t)$ as the inputs to the system consisting of the electrical circuit, and the resulting current $i(t)$ as the output (or response), as denoted symbolically in Fig. 4. From (11), we see that if $i_0 = 0$ and $E(t) = 0$, then $i(t) = 0$: if we put nothing in we get nothing out.*
Consider the inputs and their respective responses separately. If $E(t) = 0$ and $i_0 \neq 0$, then the response
\[ i(t) = i_0e^{-Rt/L} \]
to the input $i_0$ is seen to be proportional to $i_0$: if we double $i_0$ we double its response, if we triple $i_0$ we triple its response, and so on. Similarly, if $i_0 = 0$ and $E(t)$ is not identically zero, then the response
\[ i(t) = \frac{1}{L}\int_0^t e^{R(\tau - t)/L}E(\tau)\,d\tau \]
to the input $E(t)$ is proportional to $E(t)$. This result illustrates an important general property of linear systems: the response to a particular input is proportional to that input.
*In contrast with linear initial-value problems, linear boundary-value problems can yield nonzero solutions even with zero input; that is, even if the boundary conditions are zero and the equation is homogeneous. These are called eigensolutions, and are studied in later chapters.
Further, observe from (11) that the total response $i(t)$ is the sum of the individual responses to $i_0$ and $E(t)$. This result illustrates the second key property of linear systems: the response to more than one input is the sum of the responses to the individual inputs.
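Both properties, proportionality and additivity, can be observed directly by evaluating the RL response $i(t) = i_0e^{-Rt/L} + \frac{1}{L}\int_0^t e^{R(\tau - t)/L}E(\tau)\,d\tau$ for separate and combined inputs. The sketch below is a numerical illustration only, not a proof; the function name rl_response, the default values of R and L, the sample input, and the use of the composite Simpson rule for the integral are all our own assumptions.

```python
import math


def rl_response(t, i0, E, R=2.0, L=1.0, n=400):
    """i(t) = i0 e^{-Rt/L} + (1/L) * integral_0^t e^{R(tau - t)/L} E(tau) dtau,
    with the integral approximated by the composite Simpson rule (n even)."""
    h = t / n
    g = lambda tau: math.exp(R * (tau - t) / L) * E(tau)
    s = g(0.0) + g(t)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(k * h)
    return i0 * math.exp(-R * t / L) + (s * h / 3) / L


E = lambda tau: 5.0 * math.sin(3.0 * tau)   # sample applied voltage
zero = lambda tau: 0.0                      # no applied voltage

t = 1.3
from_i0 = rl_response(t, 0.4, zero)         # response to the initial current alone
from_E = rl_response(t, 0.0, E)             # response to the applied voltage alone
both = rl_response(t, 0.4, E)               # response to both inputs together
print(both, from_i0 + from_E)               # additivity
print(rl_response(t, 0.8, zero), 2 * from_i0)   # proportionality
```

The two printed pairs agree to rounding error, as the structure of the formula suggests: each input enters through its own term, and each term is linear in that input.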
In Chapter 3 we prove these two important properties and use them in developing the theory of linear differential equations of second order and higher.
Before closing this discussion of electrical circuits, we wish to emphasize the correspondence, or analogy, between the RLC electrical circuit and the mechanical oscillator studied in Section 1.3, and governed by the equation
\[ m\frac{d^2x}{dt^2} + c\frac{dx}{dt} + kx = F(t). \tag{18} \]
For we see that both equations (8) (the current formulation) and (9) (the charge formulation) are of exactly the same form as (18). Thus, their mathematical solutions are identical, and hence their physical behavior is identical too. Consider (8), for instance. Comparing it with (18), we note the correspondence
\[ L \leftrightarrow m, \quad R \leftrightarrow c, \quad 1/C \leftrightarrow k, \quad i(t) \leftrightarrow x(t), \quad \frac{dE(t)}{dt} \leftrightarrow F(t). \tag{19} \]
Thus, given the values of $m$, $c$, $k$, and the function $F(t)$, we can construct an electrical analog circuit by setting $L = m$, $R = c$, $C = 1/k$, and $E(t) = \int F(t)\,dt$. If we also match the initial conditions by setting $i(0) = x(0)$ and $\frac{di}{dt}(0) = \frac{dx}{dt}(0)$, then the resulting current $i(t)$ will be identical to the motion $x(t)$.
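Or, we could use (9) to create a different analog, namely,
$$L \leftrightarrow m,\quad R \leftrightarrow c,\quad 1/C \leftrightarrow k,\quad Q(t) \leftrightarrow x(t),\quad E(t) \leftrightarrow F(t). \tag{20}$$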
In either case we see that, in mechanical terminology, the inductor provides "inertia" (as does the mass), the resistor provides energy dissipation (as does the friction force), and the capacitor provides a means of energy storage (as does the spring).
Our interest in such analogs is at least twofold. First, to whatever extent we understand the mechanical oscillator, we thereby also understand its electrical analog circuit, and vice versa. Second, if the system is too complex to solve analytically, we may wish to study it experimentally. If so, by virtue of the analogy we have the option of studying whichever is more convenient. For instance, it would no doubt be simpler, experimentally, to study the RLC circuit than the mechanical oscillator.
Finally, just as Hooke's law can be derived theoretically using the governing partial differential equations of the theory of elasticity, our circuit element relations (1)-(5) can be derived using the theory of electromagnetism, the governing equations of which are the celebrated Maxwell's equations. We will meet some of Maxwell's equations later on in this book, when we study scalar and vector field theory.
2.3.2. Radioactive decay; carbon dating. Another important application of first-order linear equations involves radioactive decay and carbon dating.
Radioactive materials, such as carbon-14, einsteinium-253, plutonium-241, radium-226, and thorium-234, are found to decay at a rate that is proportional to the amount of mass present. This observation is consistent with the supposition that the disintegration of a given nucleus, within the mass, is independent of the past or future disintegrations of the other nuclei, for then the number of nuclei disintegrating, per unit time, will be proportional to the total number of nuclei present:
$$\frac{dN}{dt} = -kN, \tag{21}$$
where $k$ is known as the disintegration constant, or decay rate. Actually, the graph of $N(t)$ proceeds in unit steps since $N(t)$ is integer-valued, so $N(t)$ is discontinuous and hence nondifferentiable. However, if $N$ is very large, then the steps are very small compared to $N$. Thus, we can regard $N$, approximately, as a continuous function of $t$ and can tolerate the $dN/dt$ derivative in (21). However, it is inconvenient to work with $N$ since one cannot count the number of atoms in a given mass. Thus, we multiply both sides of (21) by the atomic mass, in which case (21) becomes the simple first-order linear equation
$$\frac{dm}{dt} = -km, \tag{22}$$
where $m(t)$ is the total mass, a quantity which is more readily measured. Solving, by means of either (9) or (24) in Section 2.2, we obtain
$$m(t) = m_0 e^{-kt}, \tag{23}$$
where $m(0) = m_0$ is the initial amount of mass (Fig. 5). This result is indeed the exponential decay that is observed experimentally.
Figure 5. Exponential decay.
Since $k$ gives the rate of decay, it can be expressed in terms of the half-life $T$ of the material, the time required for any initial amount of mass $m_0$ to be reduced by half, to $m_0/2$. Then (23) gives
$$\frac{m_0}{2} = m_0 e^{-kT},$$
so $k = (\ln 2)/T$, and (23) can be re-expressed in terms of $T$ as
$$m(t) = m_0 2^{-t/T}. \tag{24}$$
Thus, if $t = T, 2T, 3T, 4T, \ldots$, then $m(t) = m_0/2, m_0/4, m_0/8, m_0/16$, and so on.
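The half-life form (24) is easy to tabulate. The sketch below is a minimal illustration; the function names and the initial mass are hypothetical, and the half-life is the value this section quotes for carbon-14. It also confirms that (23) with k = (ln 2)/T gives the same numbers as (24).

```python
import math

def mass_remaining(t, m0, half_life):
    """Mass left after time t, from (24): m(t) = m0 * 2**(-t/T)."""
    return m0 * 2.0 ** (-t / half_life)

def decay_constant(half_life):
    """Decay rate k = (ln 2)/T, which links (23) and (24)."""
    return math.log(2.0) / half_life

T = 5570.0   # half-life in years (the C-14 value quoted in this section)
m0 = 8.0     # hypothetical initial mass, in grams

# After n half-lives the mass has been halved n times.
for n in range(5):
    print(n, mass_remaining(n * T, m0, T))

# The exponential form (23) agrees with the power-of-two form (24).
k = decay_constant(T)
t = 1234.0
print(abs(m0 * math.exp(-k * t) - mass_remaining(t, m0, T)) < 1e-9)
```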
Radioactivity has had an important archeological application in connection
with dating. The basic idea behind any dating technique is to identify a physical
process that proceeds at a known rate. If we measure the state of the system now,
and we know its state at the initial time, then from these two quantities together
with the known rate of the process, we can infer how much time has elapsed; the
mathematics enables us to "travel back in time as easily as a wanderer walks up a frozen river."*
*Ivar Ekeland, Mathematics and the Unexpected. Chicago: University of Chicago Press, 1988.
Libby in the 1950’s. The essential idea is as follows. Cosmic rays consisting of
high-velocity nuclei penetrate the earth’s lower atmosphere. Collisions of these
nuclei with atmospheric gases produce free neutrons. These, in turn, collide with
nitrogen, thus changing some of the nitrogen to carbon-14, which is radioactive, and which decays to nitrogen-14 with a half-life of around 5,570 years. Thus, some of the carbon dioxide which is formed in the atmosphere contains this radioactive C-14. Plants absorb both radioactive and nonradioactive CO$_2$, and humans and animals inhale both and also eat the plants. Consequently, the countless plants and animals living today contain both C-12 and, to a much lesser extent, its radioactive isotope C-14, in a ratio that is essentially the same from one plant or animal to another.
EXAMPLE 3. Carbon Dating. Consider a wood sample that we wish to date. Since C-14 emits approximately 15 beta particles per minute per gram, we can determine how many grams of C-14 are contained in the sample by measuring the rate of beta particle emission. Suppose that we find that the sample contains 0.2 gram of C-14, whereas if it were alive today it would, based upon its weight, contain around 2.6 grams. Thus, we assume that it contained 2.6 grams of C-14 when it died. That mass of C-14 will have decayed, over the subsequent time span $t$, to 0.2 gram. Then (24) gives
$$0.2 = (2.6)\,2^{-t/5570},$$
and, solving for $t$, we determine the sample to be around $t = 5570\log_2 13 \approx 20{,}600$ years old.
However, it must be emphasized that this method (and the various others that are based upon radioactive decay) depends critically upon assumptions of uniformity. To date the wood sample studied in this example, for instance, we need to know the amount of C-14 present in the sample when the tree died, and what the decay rate was over the time period in question. To apply the method, we assume, first, that the decay rate has remained constant over the time period in question and, second, that the ratio of the amounts of C-14 to C-12 was the same when the tree died as it is today. Observe that although these assumptions are usually stated as fact they can never be proved, since it is too late for direct observation and the only evidence available now is necessarily circumstantial.
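Solving (24) for the elapsed time gives t = T log2(m0/m). The sketch below evaluates that expression for the masses used in the example; the function name is hypothetical, and the default half-life is the 5,570-year figure quoted above.

```python
import math

def sample_age(mass_now, mass_at_death, half_life=5570.0):
    """Solve mass_now = mass_at_death * 2**(-t/half_life) for t (same time unit as half_life)."""
    if not 0.0 < mass_now <= mass_at_death:
        raise ValueError("expected 0 < mass_now <= mass_at_death")
    return half_life * math.log2(mass_at_death / mass_now)

# Masses of C-14 from the example: 0.2 gram now, 2.6 grams assumed at death.
age = sample_age(0.2, 2.6)
print(round(age))   # on the order of twenty thousand years
```

The result inherits both uniformity assumptions discussed in the example: a constant half-life and a known initial mass.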
2.3.3. Population dynamics. In this application, we are again interested in the variation of a population $N(t)$ with the time $t$, not the population of atoms this time, but the population of a particular species such as fruit flies or human beings.
According to the simplest model, the rate of change $dN/dt$ is proportional to the population $N$:
$$\frac{dN}{dt} = \kappa N, \tag{25}$$
where the constant of proportionality $\kappa$ is the net birth/death rate, that is, the birth rate minus the death rate. As in our discussion of radioactive decay, we regard $N(t)$ as continuous because the unit steps in $N$ are extremely small compared to $N$ itself.
Solving (25), we obtain the exponential behavior
$$N(t) = N_0 e^{\kappa t}, \tag{26}$$
where $N(0) = N_0$ is the initial condition. If the death rate exceeds the birth rate, then $\kappa < 0$ and (26) expresses exponential decrease, with $N \to 0$ as $t \to \infty$.
That result seems fair enough. However, if $\kappa > 0$, then (26) expresses exponential growth, with $N \to \infty$ as $t \to \infty$, as displayed in Fig. 6 for several different initial conditions $N_0$. That result is unrealistic because as $N$ becomes sufficiently large other factors will undoubtedly come into play, such as insufficient food or other resources.
Figure 6. Exponential growth.
In other words, we expect that $\kappa$ will not really be a constant but will vary with $N$. In particular, we expect it to decrease as $N$ increases. As a simple model of such behavior, suppose that $\kappa$ varies linearly with $N$: $\kappa = a - bN$, with $a$ and $b$ positive, so that $\kappa$ diminishes as $N$ increases, and even becomes negative when $N$ exceeds $a/b$. Then (25) is to be replaced by the equation
$$\frac{dN}{dt} = (a - bN)N. \tag{27}$$
The latter is known as the logistic equation, or the Verhulst equation, after the Belgian mathematician P. F. Verhulst (1804-1849) who introduced it in his work on population dynamics. Due to the $N^2$ term, the equation is nonlinear, so that the solution that we developed in Section 2.2 does not apply. However, the Verhulst equation is interesting, and we will return to it.
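Although (27) is not solved analytically here, its qualitative behavior can be explored numerically. The following sketch integrates (27) with the classical fourth-order Runge-Kutta method, which is a choice made for this illustration rather than a method taken from the text; the values of a, b and the initial populations are likewise hypothetical. Populations starting below and above a/b both drift toward a/b, consistent with the net rate a - bN changing sign there.

```python
def logistic_rhs(N, a, b):
    """Right-hand side of (27): dN/dt = (a - b N) N."""
    return (a - b * N) * N

def integrate_logistic(N0, a, b, t_end, steps=2000):
    """Advance N from t = 0 to t = t_end with classical fourth-order Runge-Kutta."""
    h = t_end / steps
    N = N0
    for _ in range(steps):
        k1 = logistic_rhs(N, a, b)
        k2 = logistic_rhs(N + 0.5 * h * k1, a, b)
        k3 = logistic_rhs(N + 0.5 * h * k2, a, b)
        k4 = logistic_rhs(N + h * k3, a, b)
        N += (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return N

a, b = 1.0, 0.01          # hypothetical constants, so that a/b = 100
for N0 in (5.0, 150.0):   # one start below a/b, one above
    print(N0, integrate_logistic(N0, a, b, t_end=20.0))
```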
2.3.4. Mixing problems. In this final application we consider a mixing tank with an inflow of $Q(t)$ gallons per minute and an equal outflow, where $t$ is the time; see Fig. 7. The inflow is at a constant concentration $c_1$ of a particular solute (pounds per gallon), and the tank is constantly stirred, so that the concentration $c(t)$ within the tank is uniform. Hence, the outflow is at concentration $c(t)$. Let $v$ denote the volume within the tank, in gallons; $v$ is a constant because the inflow and outflow rates are equal.
Figure 7. Mixing tank.
To keep track of the instantaneous mass of solute $x(t)$ within the tank, let us carry out a mass balance for the "control volume" $V$ (dashed lines in the figure):
$$\text{Rate of increase of mass of solute within } V = \text{Rate in} - \text{Rate out}, \tag{28}$$
$$\frac{dx}{dt}\,\frac{\text{lb}}{\text{min}} = \left(Q(t)\,\frac{\text{gal}}{\text{min}}\right)\left(c_1\,\frac{\text{lb}}{\text{gal}}\right) - \left(Q(t)\,\frac{\text{gal}}{\text{min}}\right)\left(c(t)\,\frac{\text{lb}}{\text{gal}}\right), \tag{29}$$
or, since $c(t) = x(t)/v$,
$$\frac{dx(t)}{dt} + \frac{Q(t)}{v}\,x(t) = c_1 Q(t), \tag{30}$$
which is a first-order linear equation on $x(t)$. Alternatively, we have the first-order linear equation
$$\frac{dc(t)}{dt} + \frac{Q(t)}{v}\,c(t) = \frac{c_1 Q(t)}{v} \tag{31}$$
on the concentration $c(t)$.
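For a constant flow rate Q, the linear equation (31) has constant coefficients, and its solution through c(0) = c0 is c(t) = c1 + (c0 - c1) e^{-Qt/v}. The sketch below states that formula under the constant-Q assumption and checks it against (31) with a central difference; all names and parameter values are hypothetical.

```python
import math

def concentration(t, c0, c1, Q, v):
    """c(t) for constant flow rate Q: c1 + (c0 - c1) * exp(-Q t / v)."""
    return c1 + (c0 - c1) * math.exp(-Q * t / v)

def residual(t, c0, c1, Q, v, h=1e-6):
    """Left side minus right side of (31), with dc/dt from a central difference."""
    dc_dt = (concentration(t + h, c0, c1, Q, v)
             - concentration(t - h, c0, c1, Q, v)) / (2.0 * h)
    return dc_dt + (Q / v) * concentration(t, c0, c1, Q, v) - c1 * Q / v

# Hypothetical data: lb/gal for concentrations, gal/min for Q, gal for v.
params = dict(c0=0.1, c1=0.5, Q=3.0, v=10.0)
print(concentration(0.0, **params))          # starts at c0
print(abs(residual(2.0, **params)) < 1e-6)   # satisfies (31) to within rounding
print(concentration(1000.0, **params))       # approaches the inflow concentration c1
```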
Recall that in modeling a physical system one needs to incorporate the relevant physics such as Newton's second law or Kirchhoff's laws. In the present application, the relevant physics is provided entirely by (28). To better understand (28), suppose we rewrite it with one more term included on the right-hand side:
$$\text{Rate of increase of mass of solute within } V = \text{Rate into } V - \text{Rate out of } V + \text{Rate of creation of mass within } V. \tag{32}$$
The equation (32) is merely a matter of logic, or bookkeeping, not physics. Since (28) follows from (32) only if there is no creation (or destruction) of mass, we can now understand (28) to be a statement of the physical principle of conservation of mass, namely, that matter can neither be created nor destroyed (except under exceptional circumstances that are not present in this situation).
Closure. In this section we study applications of first-order linear equations to electrical circuit theory, to radioactivity and population dynamics, and to mixing problems. Although our RLC circuit gives rise to a second-order differential equation, we find that we can work with first-order equations if we omit either the inductor or the capacitor. We will return to the RLC circuit when we discuss second-order equations, so the background provided here, including the expressions for the voltage/current relations and Kirchhoff's two laws, will be drawn upon at that time.
The electrical circuit application also gives us an opportunity to emphasize the extremely important consequences of the linearity of the differential equation upon the relationship between the input and output. The key ideas are that for a linear system: (1) the response to a particular input is proportional to that input, and (2) the response to more than one input is the sum of the responses to the individual inputs. These ideas are developed and proved in Chapter 3.
EXERCISES 2.3
NOTE: Thus far we have assumed that $p(x)$ and $q(x)$ in $y' + p(x)y = q(x)$ are continuous, yet in applications that may not be the case. In particular, the "input" $q(x)$ may be discontinuous. In Example 1, for instance, $E(t)$ in $L\,di/dt + Ri = E(t)$ may well be discontinuous, such as
$$E(t) = \begin{cases} E_0, & 0 < t < t_1 \\ 0, & t_1 < t < \infty. \end{cases}$$
We state that in such cases, where $E(t)$ has one or more jump discontinuities, the solution (11) [more generally, (24) in Section 2.2] is still valid, and can be used in these exercises.
1. (RL circuit) For the RL circuit of Example 1, with $i_0 = 0$ and $E(t) = E_0$, determine the
(a) time required for $i(t)$ to reach 99% of its steady-state value;
(b) resistance $R$ needed to ensure that $i(t)$ will attain 99% of its steady-state value within 2 seconds, if $L = 0.001$ henry;
(c) inductance $L$ needed to ensure that $i(t)$ will attain 99% of its steady-state value within 0.5 seconds, if $R = 50$ ohm.
2. (RL circuit) For the RL circuit of Example 1, suppose that $i(0) = i_0$ and that $E(t)$ is as given below. In each case, determine $i(t)$ and identify the steady-state solution. If a steady state does not exist, then state that. Also, sketch the graph of $i(t)$ and label key values.
(a) $E(t) = \ldots$
(b) $E(t) = \ldots$
(c) $E(t) = \begin{cases} 0, & 0 < t < t_1 \\ E_0, & t_1 < t < t_2 \\ 0, & t_2 < t < \infty \end{cases}$
3. (RC circuit) (a) For the RC circuit of Example 2, suppose that $i_0 = 0$ and that $E(t) = E_0 e^{-Rt/L}$. Solve for $i(t)$ and identify the steady-state solution, treating these cases separately: $R^2C \neq L$, and $R^2C = L$. If there does not exist a steady state, then state that. Sketch the graph of $i(t)$ and label any key values.
(b) Same as (a), but with $R = C = 1$ and $E(t) = E_0 \sin t$.
4. Verify that (14) can be re-expressed as
$$i(t) = \frac{E_0\,\omega L}{R^2 + (\omega L)^2}\,e^{-Rt/L} + \frac{E_0}{\sqrt{R^2 + (\omega L)^2}}\,\sin(\omega t - \phi),$$
where $\phi$ is the (unique) angle between 0 and $\pi/2$ such that $\tan\phi = \omega L/R$; $\phi$ is called the phase angle.
5. ... the same size. How old is it? Approximately how many years did it take for its C-14 content to diminish from its initial value to 99% of that?
6. If 10 grams of some radioactive substance will be reduced to 8 grams in 60 years, in how many years will 2 grams be left? In how many years will 0.1 gram be left?
7. If 20% of a radioactive substance disappears in 70 days, what is its half-life?
8. Show that if $m_1$ and $m_2$ grams of a radioactive substance are present at times $t_1$ and $t_2$, respectively, then its half-life is
$$T = (t_2 - t_1)\,\frac{\ln 2}{\ln(m_1/m_2)}.$$
9. (Verhulst equation) Solve the Verhulst equation (27), subject to the initial condition $N(0) = N_0$, two ways:
(a) by noting that it is a Bernoulli equation;
(b) by noting that it is (also) a Riccati equation.
NOTE: The Bernoulli and Riccati equations, and their solutions, were discussed in the exercises for Section 2.2. (The Verhulst equation can also be solved by the method of separation of variables, which method is the subject of the next section.)
10. (Mixing tank) For the mixing tank governed by (31):
(a) Let $Q(t) = \text{constant} = Q$ and $c(0) = c_0$. Solve for $c(t)$.
(b) Let $Q(t) = 4$ for $0 < t < 1$ and 2 for $t > 1$, and let $v = c_1 = 1$ and $c(0) = 0$. Solve for $c(t)$. HINT: The application of (24) in Section 2.2 is not so hard when $q(x)$ in the differential equation $y' + p(x)y = q(x)$ is defined piecewise (e.g., as in Exercise 2 above), but is tricky when $p(x)$ is defined piecewise. In this exercise we suggest that you use (24) to solve for $c(t)$ first for $0 < t < 1$, with "a" = 0 and "b" = $c(0) = 0$. Then, use that solution to evaluate $c(1)$ and use (24) again, for $1 < t < \infty$, this time with "a" = 1 and "b" = $c(1)$, where $c(1)$ has already been determined.
(c) Let $Q(t) = 2$, $c_1 = 0$, $v = 1$, and $c(0) = 0.3$. Solve for $c(t)$ and thus show that although $c(t) \to 0$ as $t \to \infty$, it never actually reduces to zero, so that it is not possible to wash every molecule of solute out of the tank. Does this result make sense? Explain.
11. (Mass on an inclined plane) The equation $mx'' + cx' = mg\sin\alpha$ governs the straight-line displacement $x(t)$ of a mass $m$ along a plane that is inclined at an angle $\alpha$ with respect to the horizontal, if it slides under the action of gravity and friction. If $x(0) = 0$ and $x'(0) = 0$, solve for $x(t)$. HINT: First, integrate the equation once with respect to $t$ to reduce it to a first-order linear equation.
12. (Free fall; terminal velocity) The equation of motion of a body of mass $m$ falling vertically under the action of a downward gravitational force $mg$ and an upward aerodynamic drag force $f(v)$, is
$$mv' = mg - f(v), \tag{12.1}$$
where $v(t)$ is the velocity [so that $v'(t)$ is the acceleration]. The determination of the form of $f(v)$, for the given body shape, would require either careful wind tunnel measurements, or sophisticated theoretical and/or numerical analysis, the result being a plot of the nondimensional drag coefficient versus the nondimensional Reynolds number. All we need to know here is that for a variety of body shapes, the result of such an analysis is the determination that $f(v)$ can be approximated (over some limited range of velocities) in the form $cv^\beta$, for suitable constants $c$ and $\beta$. For low velocities (more precisely, for low Reynolds numbers) $\beta \approx 1$, and for high velocities (i.e., for high Reynolds numbers) $\beta \approx 2$.
(a) Solve (12.1), together with the initial condition $v(0) = 0$, for the case where $f(v) = cv$. What is the terminal (i.e., steady-state) velocity?
(b) Same as (a), for $f(v) = cv^2$. HINT: Read Exercise 11 in Section 2.2.
13. (Light extinction) As light passes through window glass some of it is absorbed. If $x$ is a coordinate normal to the glass (with $x = 0$ at the incident face) and $I(x)$ is the light intensity at $x$, then the fractional loss in intensity, $-dI/I$ (with the minus sign included because $dI$ will be negative), will be proportional to $dx$: $-dI/I = k\,dx$, where $k$ is a positive constant. Thus, $I(x)$ satisfies the differential equation $I'(x) = -kI(x)$. The problem: If 80% of the light penetrates a 1-inch thick slab of this glass, how thin must the glass be to let 95% penetrate? NOTE: Your answer should be numerical, not in terms of an unknown $k$.
14. (Pollution in a river) Suppose that a pollutant is discharged into a river at a steady rate $Q$ (grams/second) over a distance $L$, as sketched in the figure, and we wish to determine the distribution of pollutant in the river, that is, its concentration $c$ (grams/meter$^3$). Measure $x$ as arc length along the river, positive downstream. The river flows with velocity $U$ (meters/second) and has a cross-sectional area $A$ (meters$^2$), both of which, for simplicity, we assume to be constant. Also for simplicity, suppose that $c$ is a function of $x$ only. That is, it is a constant over each cross section of the stream. This is evidently a poor approximation near the interval $0 < x < L$, where we expect appreciable across-stream and vertical variations in $c$, but it should suffice if we are concerned mostly with the far field, that is, more than several river widths upstream or downstream of the interval $0 < x < L$. Then it can be shown that $c(x)$ is governed by the differential equation
$$kc'' - Uc' - \beta c = -\frac{Q(x)}{A}, \qquad (-\infty < x < \infty) \tag{14.1}$$
where $k$ (meters$^2$/second) is a diffusion constant, $\beta$ (grams per second per gram) is a chemical decay constant, and $Q(x)$ is the constant $Q$ over $0 < x < L$ and 0 outside that interval. [Physically, (14.1) expresses a mass balance between the input $-Q(x)/A$, the transport of pollutant by diffusion, $kc''$, the transport of pollutant by convection with the moving stream, $Uc'$, and by disappearance through chemical decay, $\beta c$.] We assume that the river is clear upstream; that is, we have the initial condition $c(-\infty) = 0$.
(a) Let $L = \infty$. Suppose that $k$ is sufficiently small so that we can neglect the diffusion term. Then (14.1) reduces to the first-order linear equation $Uc' + \beta c = Q(x)/A$. Solve for $c(x)$ and sketch its graph, labeling any key values.
(b) Repeat part (a) for the case where $L$ is finite.
15. (Newton's law of cooling) Suppose that a body initially at a uniform temperature $u_0$ is exposed to a surrounding environment that is at a lower temperature $U$. Then the outer portion of the body will cool relative to its interior, and this temperature differential within the body will cause heat to flow from the interior to the surface. If the body is a sufficiently good conductor of heat so that the heat transfer within the body is much more rapid than the rate of heat loss to the environment at the outer surface, then it can be assumed, as an approximation, that heat transfer will be so rapid that the interior temperature will adjust to the surface temperature instantaneously, and the body will be at a uniform temperature $u(t)$ at each instant $t$. Newton's law of cooling states that the time rate of change of $u(t)$ will be proportional to the instantaneous temperature difference $U - u$, so that
$$\frac{du}{dt} = k(U - u), \tag{15.1}$$
where $k$ is a constant.
(a) Solve (15.1) for $u(t)$ subject to the initial condition $u(0) = u_0$. NOTE: Actually, it is not necessary that $U < u_0$; (15.1) is equally valid if $U > u_0$. In most physical applications, however, one is interested in a hot body (such as a cup of coffee or a hot ingot) in a cooler environment.
(b) An interesting application of (15.1) occurs in connection with the determination of the time of death in a homicide. Suppose that a body is discovered at a time $T$ after death and its temperature is measured to be 90°F. We wish to solve for $T$. Suppose that the ambient temperature is $U = 70$°F and assume that $u_0 = 98.6$°F. Putting this information into the solution to (15.1) we can solve for $T$, provided that we know $k$, but we don't. Proceeding indirectly, we can infer the value of $k$ by taking one more temperature reading. Thus, suppose that we wait an hour and again measure the temperature of the body, and find that $u(T + 1) = 87$°F. Use this information to solve for $T$.
16. (Compound interest) Suppose that a sum of money earns interest at a rate $k$, compounded yearly, monthly, weekly, or even daily. If it is compounded continuously, then $dS/dt = kS$, where $S(t)$ denotes the sum at time $t$. If $S(0) = S_0$, then the solution is
$$S(t) = S_0 e^{kt}. \tag{16.1}$$
Instead, suppose that interest is compounded yearly. Then after $t$ years
$$S(t) = S_0(1 + k)^t,$$
and if the compounding is done $n$ times per year, then
$$S(t) = S_0\left(1 + \frac{k}{n}\right)^{nt}. \tag{16.2}$$
(a) Show that if we let $n \to \infty$ in (16.2), then we do recover the continuous compounding result (16.1). HINT: Recall, from the calculus, that
$$\lim_{m \to \infty}\left(1 + \frac{1}{m}\right)^m = e.$$
(b) Let $k = 0.05$ (i.e., 5% interest) and compare $S(t)/S_0$ after 1 year ($t = 1$) if interest is compounded yearly, monthly, weekly, daily, and continuously.
form
$$F(x, y, y') = 0. \tag{1}$$
If we can solve (1), by algebra, for $y'$, then we can re-express it in the form
$$y' = f(x, y), \tag{2}$$
... equation
$$xy - y = \sin y' + 4$$
... or, equivalently, that (2) can be written as
$$y' = X(x)Y(y), \tag{3}$$
... as $x \exp x$ times $\exp(2y)$, but $y' = 3x - y$ is not.
To solve (3), we divide both sides by $Y(y)$ (if $Y(y) \neq 0$) and integrate both sides with respect to $x$:
$$\int \frac{1}{Y(y)}\frac{dy}{dx}\,dx = \int X(x)\,dx, \tag{4}$$
or, since $(dy/dx)\,dx = dy$, from the differential calculus,
$$\int \frac{dy}{Y(y)} = \int X(x)\,dx. \tag{5}$$
We also know from the integral calculus that if $1/Y(y)$ is a continuous function of $y$ (over the relevant $y$ interval) and $X(x)$ is a continuous function of $x$ (over the relevant $x$ interval), then the two integrals in (5) exist, in which case (5) gives the general solution of (2).
EXAMPLE 1. Solve the equation
$$y' = -y^2. \tag{6}$$
Though not linear, (6) is separable. Separating the variables and integrating gives
$$\int \frac{dy}{y^2} = -\int dx, \tag{7}$$
$$-\frac{1}{y} + C_1 = -x + C_2, \tag{8}$$
where $C_1$ and $C_2$ are arbitrary. With $C = C_1 - C_2$, we have the general solution
$$y(x) = \frac{1}{x + C}. \tag{9}$$
If we impose an initial condition $y(0) = y_0$ then we can solve for $C$ and obtain the particular solution
$$y(x) = \frac{1}{x + 1/y_0} = \frac{y_0}{1 + y_0 x}, \tag{10}$$
which is plotted in Fig. 1 for the representative values $y_0 = 1$ and $y_0 = 2$. The solution through the initial point (0, 1) exists over $-1 < x < \infty$, the one through (0, 2) exists over $-1/2 < x < \infty$. More generally, the one through $(0, y_0)$ exists over $-1/y_0 < x < \infty$ because the denominator in (10) becomes zero at $x = -1/y_0$. We could plot (10) to the left of that point as well, but such extension of the graph would be meaningless because the point $x = -1/y_0$ serves as a "barrier": $y$ and $y'$ fail to exist there, so the solution cannot be continued beyond that point.
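A direct check of (10) is straightforward. The sketch below evaluates the particular solution, refuses to evaluate it at or beyond the barrier, and estimates y' + y^2 with a central difference, which should be close to zero wherever (10) solves (6). The function names and the sample points are illustrative assumptions.

```python
def particular_solution(x, y0):
    """Equation (10): y(x) = y0 / (1 + y0 x), valid while 1 + y0*x > 0."""
    if 1.0 + y0 * x <= 0.0:
        raise ValueError("x lies at or beyond the barrier x = -1/y0")
    return y0 / (1.0 + y0 * x)

def ode_residual(x, y0, h=1e-6):
    """y' + y**2 with y' from a central difference; near zero for a solution of (6)."""
    dy = (particular_solution(x + h, y0) - particular_solution(x - h, y0)) / (2.0 * h)
    return dy + particular_solution(x, y0) ** 2

for y0 in (1.0, 2.0):   # the two representative values plotted in Fig. 1
    print(y0, particular_solution(0.0, y0), abs(ode_residual(0.7, y0)) < 1e-6)
```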
Figure 1. Particular solutions given by (10).
EXAMPLE 2. Solve the initial-value problem
$$y' = \frac{4x}{1 + 2e^y}, \qquad y(0) = 1. \tag{11}$$
Though not linear, the differential equation is separable and can be solved accordingly:
$$\int (1 + 2e^y)\,dy = \int 4x\,dx, \tag{12}$$
$$y + 2e^y = 2x^2 + C. \tag{13}$$
Unfortunately, the latter is a transcendental equation in $y$, so we cannot solve it for $y$ explicitly as a function of $x$, as we were able to solve (8). Nevertheless, we can impose the initial condition on (13) to evaluate $C$: $1 + 2e = 0 + C$, so $C = 1 + 2e$ and the solution is given, in "implicit" form, by
$$y + 2e^y = 2x^2 + 1 + 2e. \tag{14}$$
The resulting solution is plotted in Fig. 2, along with the direction field. [Actually, we did not plot (14) in Fig. 2; we used the following Maple phaseportrait commands to solve (11) and to plot the solution:
    with(DEtools):
    phaseportrait(4*x/(1+2*exp(y)), [x,y], x=-20..20, {[0,1]}, stepsize=0.05, arrows=LINE);
where the default stepsize was too large and gave a jagged curve, so we reduced it to 0.05, and where we also included the direction field to give us a feeling for the overall "flow."]
Figure 2. The solution (14) of (11).
COMMENT 1. Observe that if we use the definite integrals
$$\int_1^y (1 + 2e^y)\,dy = \int_0^x 4x\,dx,$$
with the lower limits dictated by the initial condition $y(0) = 1$, then we bypass the need for an integration constant $C$ and its subsequent evaluation.
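Even though (14) cannot be solved for y in closed form, it determines y numerically for any given x. The sketch below uses Newton's method, which is a choice made for this illustration and not a method the text prescribes; the function name and tolerances are hypothetical. Because the left side of (14) is increasing and convex in y, the iteration converges from the starting guess shown.

```python
import math

def implicit_y(x, tol=1e-12, max_iter=200):
    """Solve y + 2*exp(y) = 2*x**2 + 1 + 2*e for y by Newton's method."""
    rhs = 2.0 * x * x + 1.0 + 2.0 * math.e
    y = math.log(rhs / 2.0)   # starting guess from the dominant term 2*exp(y)
    for _ in range(max_iter):
        g = y + 2.0 * math.exp(y) - rhs
        step = g / (1.0 + 2.0 * math.exp(y))   # derivative 1 + 2*exp(y) is positive
        y -= step
        if abs(step) < tol:
            return y
    raise RuntimeError("Newton iteration did not converge")

print(implicit_y(0.0))                          # recovers the initial condition y(0) = 1
print(implicit_y(20.0), 2.0 * math.log(20.0))   # compare with 2 ln|x| at the edge of Fig. 2
```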
COMMENT 2. What is the interval of existence of the solution? In Example 1 we were able to ascertain that interval by direct examination of the solution (10). Here, however, such examination is not possible because the solution (14) is in implicit form. It appears, from Fig. 2, that the solution exists for all $x$, but of course Fig. 2 covers only $-20 < x < 20$. Equation (14) reveals the asymptotic behavior $2e^y \sim 2x^2$, or $y \sim 2\ln|x|$ as $|x| \to \infty$, so it seems clear that the solution continues to grow smoothly as $|x|$ increases.
2.4.2. Existence and uniqueness. (Optional) In this section we have begun to solve nonlinear differential equations. Before we get too deeply involved in solution techniques, let us return to the more fundamental questions of existence and uniqueness of solutions. For the linear equation
$$y' + p(x)y = q(x) \tag{15}$$
we have Theorem 2.2.1, which tells us that (15) does admit a solution through an initial point $y(a) = b$ if $p(x)$ and $q(x)$ are continuous at $a$. That solution is unique, and it exists at least on the largest $x$ interval containing $x = a$, over which $p(x)$ and $q(x)$ are continuous. What can be said about existence and uniqueness for the more general equation $y' = f(x, y)$ (which could, of course, be linear, but, in general, is not)?
2.4. Separable Equations
THEOREM 2.4.1 Existence and Uniqueness
If $f(x, y)$ is continuous on some rectangle $R$ in the $x, y$ plane containing the point $(a, b)$, then the problem
$$y' = f(x, y); \qquad y(a) = b \tag{16}$$
has at least one solution defined on some open $x$ interval* containing $x = a$. If, in addition, $\partial f/\partial y$ is continuous on $R$, then the solution to (16) is unique on some open interval containing $x = a$.
Notice that whereas Theorem 2.2.1 predicts the minimum interval of existence and uniqueness, Theorem 2.4.1 merely ensures existence and uniqueness over some interval; it gives no clue as to how broad that interval will be. Thus, we say that Theorem 2.4.1 is a local result; it tells us that under the stipulated conditions all is well locally, in some neighborhood of $x = a$. More informative theorems could be cited, but this one will suffice here.
Let us illustrate Theorem 2.4.1 with two examples.
EXAMPLE 3. The equation
$$y' = \frac{y(y - 2)}{x(y - 1)} \tag{17}$$
is separable, and separating the variables gives
$$\int \frac{y - 1}{y(y - 2)}\,dy = \int \frac{dx}{x}. \tag{18}$$
By partial fractions (which method is reviewed in Appendix A),
$$\frac{y - 1}{y(y - 2)} = \frac{1}{2}\,\frac{1}{y} + \frac{1}{2}\,\frac{1}{y - 2}. \tag{19}$$
With this result, integration of (18) gives
$$\frac{1}{2}\ln|y| + \frac{1}{2}\ln|y - 2| = \ln|x| + C, \tag{20}$$
where $C$ is the arbitrary constant of integration. Equivalently,
$$\ln\left|\frac{y(y - 2)}{x^2}\right| = 2C, \tag{21}$$
so
$$\left|\frac{y(y - 2)}{x^2}\right| = e^{2C} \equiv B, \qquad (0 \le B < \infty) \tag{22}$$
where $B$ is introduced for convenience and is nonnegative because $\exp(2C)$ is nonnegative. Thus,
$$\frac{y(y - 2)}{x^2} = \pm B \equiv A, \qquad (-\infty < A < \infty) \tag{23}$$
where $A$ replaces the "$\pm B$." Finally, (23) gives $y^2 - 2y - Ax^2 = 0$ so, by the quadratic formula, we have the general solution
$$y(x) = 1 \pm \sqrt{1 + Ax^2} \tag{24}$$
of (17).
*By an open interval we mean $x_1 < x < x_2$, and by a closed interval we mean $x_1 \le x \le x_2$. Thus, a closed interval includes its endpoints, an open interval does not. It is common to use the notation $(x_1, x_2)$ and $[x_1, x_2]$ for open and closed intervals, respectively. Further, $(x_1, x_2]$ means $x_1 < x \le x_2$, and $[x_1, x_2)$ means $x_1 \le x < x_2$.
Figure 3. Solution curves corresponding to equation (17).
These solution curves are plotted in Fig. 3. The choice $A = 0$ gives the solution curves $y(x) = 0$ and $y(x) = 2$. As representative of solutions above the line $y = 2$, consider the initial condition $y(1) = 4$. Then (24) gives $y(1) = 4 = 1 \pm \sqrt{1 + A}$, which requires that we select the plus sign and $A = 8$, so $y(x) = 1 + \sqrt{1 + 8x^2}$. As representative of solutions below the line $y = 0$, consider the initial condition $y(1) = -3$. Then (24) gives $y(1) = -3 = 1 \pm \sqrt{1 + A}$, which requires that we select the minus sign and $A = 15$, so $y(x) = 1 - \sqrt{1 + 15x^2}$. Finally, as representative of the solutions between $y = 0$ and $y = 2$, consider the initial condition $y(2) = 3/2$, say. Then $y(2) = 3/2 = 1 \pm \sqrt{1 + 4A}$, so we choose the plus sign and $A = -3/16$, in which case (24) gives $y(x) = 1 + \sqrt{1 - 3x^2/16}$, namely, the upper branch of the ellipse
$$\frac{3x^2}{16} + (y - 1)^2 = 1.$$
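The selection of A and of the sign in (24) from an initial point can be written out mechanically. The sketch below does so for the three initial conditions just discussed and then checks, with a central difference, that the resulting curve satisfies (17) in the form y' = y(y - 2)/(x(y - 1)). The function names are hypothetical.

```python
import math

def constant_and_branch(x0, y0):
    """Return (A, sign) so that y = 1 + sign*sqrt(1 + A x^2) passes through (x0, y0)."""
    if x0 == 0.0 or y0 == 1.0:
        raise ValueError("initial points on x = 0 or y = 1 are excluded")
    A = y0 * (y0 - 2.0) / x0 ** 2       # from (23)
    sign = 1.0 if y0 > 1.0 else -1.0    # the plus branch lies above y = 1
    return A, sign

def solution(x, A, sign):
    """General solution (24)."""
    return 1.0 + sign * math.sqrt(1.0 + A * x * x)

def residual(x, A, sign, h=1e-6):
    """y' - y(y-2)/(x(y-1)) with y' from a central difference."""
    y = solution(x, A, sign)
    dy = (solution(x + h, A, sign) - solution(x - h, A, sign)) / (2.0 * h)
    return dy - y * (y - 2.0) / (x * (y - 1.0))

for x0, y0 in ((1.0, 4.0), (1.0, -3.0), (2.0, 1.5)):
    A, sign = constant_and_branch(x0, y0)
    print((x0, y0), A, sign, abs(residual(1.5, A, sign)) < 1e-6)
```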
In terms of the Existence and Uniqueness Theorem 2.4.1, observe that the conditions of the theorem are met everywhere in the $x, y$ plane except along the vertical line $x = 0$ and the horizontal line $y = 1$, and indeed we do have breakdowns in existence and uniqueness all along these lines. On $x = 0$ (the $y$ axis) there are no solutions through initial points other than $y = 0$ and $y = 2$ (lack of existence), and through each of those points there are an infinite number of solutions (lack of uniqueness). Initial points on the line $y = 1$ are a bit more subtle. We do have elliptical solution curves through each such point, yet at the initial point (on $y = 1$) the slope is infinite, so the differential equation (17) cannot be satisfied at that point. Thus, we have a breakdown in existence for each initial point on $y = 1$. Further, realize that for any such ellipse, between $y = 0$ and $y = 2$, the upper and lower halves are separate solutions. For instance, the ellipse $3x^2/16 + (y - 1)^2 = 1$, mentioned above, really amounts to the separate solutions $y(x) = 1 \pm \sqrt{1 - 3x^2/16}$, each valid over $-4/\sqrt{3} < x < 4/\sqrt{3}$.
EXAMPLE 4. Free Fall. This time consider a physical application. Suppose that a body of mass $m$ is dropped, from rest, at time $t = 0$. With its displacement $x(t)$ measured downward from the point of release, the equation of motion is $mx'' = mg$, where $g$ is the acceleration of gravity and $t$ is the time. Thus,
$$x'' = g, \qquad (0 \le t < \infty) \tag{25a}$$
$$x(0) = 0, \tag{25b}$$
$$x'(0) = 0. \tag{25c}$$
Equation (25a) is of second order, whereas this chapter is about first-order equations, but it is readily integrated twice with respect to $t$. Doing so, and invoking (25b) and (25c) gives the solution
$$x(t) = \frac{g}{2}t^2, \tag{26}$$
which result is probably familiar to you from a first course in physics.
However, instead of multiplying (25a) through by $dt$ and integrating on $t$, let us multiply it by $dx$ and integrate on $x$. Then $x''\,dx = g\,dx$ and since, from the calculus,
$$x''\,dx = \frac{dx'}{dt}\,dx = \frac{dx'}{dt}\frac{dx}{dt}\,dt = x'\frac{dx'}{dt}\,dt = x'\,dx', \tag{27}$$
$x''\,dx = g\,dx$ becomes
$$x'\,dx' = g\,dx. \tag{28}$$
Integrating (28) gives
$$\frac{1}{2}x'^2 = gx + A, \tag{29}$$
and $x(0) = x'(0) = 0$ imply that $A = 0$. Thus, we have reduced (25) to the first-order problem
$$x' = \sqrt{2g}\,x^{1/2}, \qquad (0 \le t < \infty) \tag{30a}$$
$$x(0) = 0, \tag{30b}$$
which shall now be the focus of this example. Equation (30a) is separable and readily solved. The result is the general solution
$$x(t) = \frac{1}{4}\left(\sqrt{2g}\,t + C\right)^2, \tag{31}$$
which is shown, for various values of $C$, in Fig. 4. Applying (30b) gives $C = 0$, so (31) gives $x(t) = gt^2/2$, in agreement with (26). However, from the figure we can see that although a solution exists over the full $t$ interval of interest ($t \ge 0$), that solution is not unique because other solutions satisfying both (30a) and (30b) are given by the curve $x(t) = 0$ from the origin up to any point $Q$, followed by the parabola $QR$. Physically, the solution $OQR$ corresponds to the mass levitating until time $Q$, then beginning its descent.
Surely that sounds physically impossible, but let us look at the mathematics. We cannot apply Theorem 2.2.1 because (30) is nonlinear, but we can use Theorem 2.4.1 (with $x$ and $y$ replaced by $t$ and $x$, of course). Since $f(t, x) = \sqrt{2g}\,x^{1/2}$, we see that $f$ is continuous for all $t \ge 0$ and $x \ge 0$, but $f_x(t, x) = \sqrt{g/2}\,x^{-1/2}$ is not continuous over any interval containing the initial point $t = 0$. Thus, the theorem tells us that there does exist a solution over some $t$ interval containing $t = 0$ (which turns out to be the entire positive $t$ axis), but it does not guarantee uniqueness over any such interval, and as it turns out we do not have uniqueness over any such interval.
Figure 4. Nonuniqueness of the solution to (30).
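The nonuniqueness can also be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the illustrative value g = 9.81 m/s^2, and the names `delayed_fall` and `residual` are hypothetical helpers. It evaluates the residual of (30a) for the curve that stays at zero until a release time and then follows the parabola.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value, for illustration only)

def delayed_fall(t, t_q):
    """Candidate solution: x = 0 up to t = t_q, then the parabola g (t - t_q)^2 / 2."""
    return 0.0 if t <= t_q else 0.5 * G * (t - t_q) ** 2

def residual(t, t_q, h=1e-6):
    """Central-difference estimate of x' - sqrt(2 g x), the residual of (30a), at time t."""
    dxdt = (delayed_fall(t + h, t_q) - delayed_fall(t - h, t_q)) / (2.0 * h)
    return dxdt - math.sqrt(2.0 * G * delayed_fall(t, t_q))

# t_q = 0 is the ordinary free-fall solution; t_q > 0 are the "levitating" ones.
for t_q in (0.0, 0.5, 1.0):
    worst = max(abs(residual(0.05 + 0.1 * k, t_q)) for k in range(30))
    print(f"release delayed to t = {t_q}: max |residual of (30a)| = {worst:.2e}")
```

Every choice of the delay gives a residual at the level of rounding error, so each curve satisfies both (30a) and (30b), which is the nonuniqueness described above.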
Next, consider the physics. When we multiply force by distance we get work, and
work shows up (in a system without dissipation, as in this example) as energy. Thus,
multiplying (25a) by dx and integrating converted the original force equation (Newton’s
second law) to an energy equation. That is, (29) tells us that the total energy (kinetic
plus potential) is conserved; it is constant for all time: $x'^2/2 + (-gx) = $ constant or, equivalently,
$$\tfrac{1}{2}mx'^2 + (-mgx) = A. \tag{32}$$
Kinetic energy + Potential energy = Constant.
Since $x(0) = x'(0) = 0$, the total energy $A$ is zero. When the mass falls, its kinetic energy becomes positive and its potential energy becomes negative such that their total remains zero for all $t > 0$. However, the energy equation is also satisfied if the released mass levitates for any amount of time and then falls, or if indeed it levitates for all time [that is, $x(t) = 0$ for all $t > 0$]. Thus, our additional solutions are indeed physically meaningful in that they do satisfy the requirement of conservation of energy. Observe, however, that they do not satisfy the equation of motion (25a) since the insertion of $x(t) = 0$ into that equation gives $0 = g$. Thus, the spurious additional solution $x(t) = 0$ must have entered somewhere between (25) and (30). In fact, we introduced it inadvertently when we multiplied (25a) by $dx$ because $x''\,dx = g\,dx$ is satisfied not only by $x'' = g$, but also by $dx = 0$ [i.e., by $x(t) = $ constant].
The upshot is that although the solution to (30) is nonunique, a look at our derivation of (30) shows that we should discount the solution $x(t) = 0$ of (30) since it does not also satisfy the original equation of motion $x'' = g$. In that case we are indeed left with the unique solution $x(t) = gt^2/2$, corresponding to the parabola $OP$ in Fig. 4.
It is important to understand that the solution $x(t) = 0$ of (30) is not contained within the general solution (31), for any finite choice of $C$. Such an additional solution is known as a singular solution, and brief consideration of these will be reserved for the exercises.
2.4, Separable Equations
2.4.3. Applications. Let us study two physical applications of the method of separation of variables.
EXAMPLE 5. Gravitational Attraction. Newton's law of gravitation states that the force of attraction $F$ exerted by any one point mass $M$ on any other point mass $m$ is*
$$F = G\frac{Mm}{d^2}, \tag{33}$$
where $d$ is the distance between them and $G$ ($= 6.67 \times 10^{-8}$ cm$^3$/g sec$^2$) is called the universal gravitational constant; (33) is said to be an inverse-square law since the force varies as the inverse square of the distance. (By $M$ and $m$ being point masses, we mean that their sizes are negligible compared with $d$.)
Consider the linear motion of a rocket of mass $m$ that is launched from the surface of the earth, as sketched in Fig. 5, where $M$ and $R$ are the mass and radius of the earth, respectively. From Newton's second law of motion and his law of gravitation, it follows that the equation of motion of the rocket is
$$m\frac{d^2x}{dt^2} = -G\frac{Mm}{(x + R)^2}. \tag{34}$$
Although (34) is a second-order equation, we can reduce it to one of first order by noting that
$$\frac{d^2x}{dt^2} = \frac{d}{dt}\left(\frac{dx}{dt}\right) = \frac{dv}{dt} = \frac{dv}{dx}\frac{dx}{dt} = v\frac{dv}{dx}, \tag{35}$$
*Newton derived (33) from Kepler's laws of planetary motion which, in turn, were inferred empirically from the voluminous measurements recorded by the Danish astronomer Tycho Brahe (1546–1601). Usually, in applications (not to mention homework assignments in mechanics), one is given
the force exerted on a mass and asked to determine the motion by twice integrating Newton’s second
law of motion. In deriving (33), however, Newton worked “backwards:” the motion of the planets
was supplied in sufficient detail by Kepler’s laws, and Newton used those laws to infer the force
needed to sustain that motion. It turned out to be an inverse-square force directed toward the sun.
Being aware of other such forces between masses, for example, the force that kept his shoes on the
floor, Newton then proposed the bold generalization that (33) holds not just between planets and the
sun, but between any two bodies in the universe; hence the name universal law of gravitation. Just
as it is difficult to overestimate the importance of Newton’s law of gravitation and its impact upon
science, it is also difficult to overestimate how the idea of a force acting at a distance, rather than
through physical contact, must have been incredible when first proposed.
In fact, such eminent scientists and mathematicians as Huygens, Leibniz, and John Bernoulli referred to Newton's idea of gravitation as absurd and revolting. Imagine Newton's willingness to stand
nonetheless upon the results of his mathematics, in inferring the concept of gravitation, even in the
absence of any physical mechanism or physical plausibility, and in the face of such opposition.
Remarkably, Coulomb's law subsequently stated an inverse-square type of electrical attraction or repulsion between two charges. Why these two types of force field turn out to be of the same mathematical form is not known. Equally remarkable is the fact that although the forms of the two laws are identical, the magnitudes of the forces are staggeringly different. Specifically, the ratio of the electrical force of repulsion to the gravitational force exerted on each other by two electrons (which is independent of the distance of separation) is $4.17 \times 10^{42}$.
Chapter 2. Differential Equations of First Order
where $v$ is the velocity, and where the third equality follows from the chain rule. Thus (34) becomes the first-order equation
$$v\frac{dv}{dx} = -\frac{GM}{(x + R)^2}, \tag{36}$$
which is separable and gives
$$\int v\,dv = -GM\int\frac{dx}{(x + R)^2}, \tag{37}$$
$$\frac{v^2}{2} = \frac{GM}{x + R} + C. \tag{38}$$
If the launch velocity is $v(0) = V$, then (38) gives $C = (V^2/2) - GM/R$, so
$$v = \sqrt{V^2 - \frac{2GM}{R}\,\frac{x}{x + R}} \tag{39}$$
is the desired expression of $v$ as a function of $x$.
If we wish to know $x(t)$ as well, then we can re-write (39) as
$$\frac{dx}{dt} = \sqrt{V^2 - \frac{2GM}{R}\,\frac{x}{x + R}}, \tag{40}$$
which once again is variable separable and can be solved for $x(t)$. However, let us be content with (39).
Observe from (39) that $v$ decreases monotonically with increasing $x$, from its initial value $v = V$ to $v = 0$, the latter occurring at
$$x_{\max} = \frac{V^2R^2}{2GM - V^2R}. \tag{41}$$
Subsequently, the rocket will be drawn back toward the earth and will strike it with speed $V$. [We need to choose the negative square root in (39) to track the return motion.] Equation (41) can be simplified by noting that when $x = 0$, the right-hand side of (34) must be $-mg$, where $g$ is the familiar gravitational acceleration at the earth's surface. Thus, $-mg = -GMm/R^2$, so $GM/R^2 = g$, and (41) becomes
$$x_{\max} = \frac{V^2R}{2gR - V^2}. \tag{42}$$
We see from (42) that $x_{\max}$ increases as $V$ is increased, as one would expect, and becomes infinite as $V \to \sqrt{2gR}$. Thus, the critical value $V_e = \sqrt{2gR}$ is the escape velocity. Numerically, $V_e \approx 6.9$ miles/sec.
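The numerical value quoted for the escape velocity can be reproduced with a short calculation. This is only a sketch under stated assumptions: the values g = 32.2 ft/sec^2 and R = 3960 miles are typical round figures and are not given in the text, and the function names are hypothetical.

```python
import math

FEET_PER_MILE = 5280.0
G_SURFACE = 32.2                        # ft/sec^2, assumed surface gravity
EARTH_RADIUS = 3960.0 * FEET_PER_MILE   # ft, assuming a radius of 3960 miles

def escape_velocity(g, r):
    """Critical launch speed sqrt(2 g R) at which x_max in (42) becomes infinite."""
    return math.sqrt(2.0 * g * r)

def x_max(v, g, r):
    """Maximum height from (42); infinite once v reaches the escape velocity."""
    denom = 2.0 * g * r - v ** 2
    return math.inf if denom <= 0.0 else v ** 2 * r / denom

v_e = escape_velocity(G_SURFACE, EARTH_RADIUS)
print(f"escape velocity: {v_e / FEET_PER_MILE:.2f} miles/sec")
for fraction in (0.25, 0.5, 0.9, 0.99):
    height = x_max(fraction * v_e, G_SURFACE, EARTH_RADIUS)
    print(f"V = {fraction:.2f} V_e: x_max = {height / EARTH_RADIUS:.3f} earth radii")
```

Writing the launch speed as a fraction of the escape velocity shows how sharply the maximum height grows as that fraction approaches one.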
COMMENT 1. Recall that the law of gravitation (33) applies to two point masses separated by a distance $d$, whereas the earth is hardly a point mass. Thus, it is appropriate to question the validity of (34). In principle, to find the correct attractive force exerted on the rocket by the earth we need to consider the earth as a collection of point masses $dM$, compute the force $dF$ induced by each $dM$, and add the $dF$'s vectorially to find the resultant force $F$. This calculation is carried out later, in Section 15.7, and the result, remarkably, is that the resultant $F$ acting at any point $P$ outside the earth (or any homogeneous spherical mass), per unit mass at $P$, is the same as if its entire mass $M$ were concentrated at a single point, namely, at its center! Thus, the earth might as well be thought of as a point mass, of mass $M$, located at its center, so (34) is exactly true, if we are willing to approximate the earth as a homogeneous sphere.
COMMENT 2. The steps in (35), whereby we were able to reduce our second-order equation (34) to the first-order equation (36), were not limited to this specific application. They apply whenever the force is a function of $x$ alone, for if we apply (35) to the equation
$$m\frac{d^2x}{dt^2} = f(x), \tag{43}$$
we get the separable first-order equation
$$mv\frac{dv}{dx} = f(x) \tag{44}$$
with solution
$$\frac{mv^2}{2} = \int f(x)\,dx + C \tag{45}$$
or, equivalently,
$$\left.\frac{mv^2}{2}\right|_{x_1}^{x_2} = \int_{x_1}^{x_2} f(x)\,dx. \tag{46}$$
In the language of mechanics, the right-hand side is the work done by the force $f(x)$ as the body moves from $x_1$ to $x_2$, and $mv^2/2$ is the kinetic energy. Thus, the physical significance of (46) is that it is a work-energy statement: the change in the kinetic energy of the body is equal to the work done on it.
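As a consistency check, offered here as an illustrative sketch rather than as part of the original derivation, (46) can be applied to the rocket of Example 5 by taking $f(x) = -GMm/(x+R)^2$, $x_1 = 0$ (where $v = V$), and $x_2 = x$. The dummy variable $\xi$ is introduced only to keep the integration variable distinct from the upper limit.

```latex
\frac{mv^2}{2} - \frac{mV^2}{2}
  = \int_0^x \frac{-GMm}{(\xi + R)^2}\,d\xi
  = GMm\left(\frac{1}{x + R} - \frac{1}{R}\right)
  = -\frac{GMm}{R}\,\frac{x}{x + R}
\quad\Longrightarrow\quad
v^2 = V^2 - \frac{2GM}{R}\,\frac{x}{x + R}.
```

Taking the square root recovers (39), so the work-energy statement and the separation-of-variables solution agree.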
COMMENT 3. Observe the change in viewpoint as we progressed from (34) to (36). Until the third equality in (35), we regarded $x$ and $v$ as dependent variables, that is, as functions of the independent variable $t$. But beginning with the right-hand side of that equality, we began to regard $v$ as a function of $x$. However, once we solved (36) for $v$ in terms of $x$, in (39), we replaced $v$ by $dx/dt$, and $x$ changed from independent variable to dependent variable once again. In general, then, which variable is regarded as the independent variable and which is the dependent variable is not so much figured out, as it is a decision that we make, and that decision, or viewpoint, can sometimes change, profitably, over the course of the solution. ∎
EXAMPLE 6. Verhulst Population Model. Consider the Verhulst population model
$$N'(t) = (a - bN)N; \qquad N(0) = N_0 \tag{47}$$
that was introduced in Section 2.3.3, where $N(t)$ is the population of a given species. This example emphasizes that a given equation might be solvable by a number of different methods. Though (47) is not a linear equation, it is both a Bernoulli equation and a Riccati equation, which equations were discussed in the exercises of Section 2.2. Now we see that it is also separable, since the right side is a function of $N$ [namely, $(a - bN)N$] times a function of $t$ (namely, 1). Thus,
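$$\int\frac{dN}{(a - bN)N} = \int dt. \tag{48}$$
By partial fractions,
$$\frac{1}{(a - bN)N} = -\frac{1}{b}\,\frac{1}{\left(N - \frac{a}{b}\right)N} = -\frac{1}{a}\,\frac{1}{N - \frac{a}{b}} + \frac{1}{a}\,\frac{1}{N},$$
so (48) gives
$$-\frac{1}{a}\ln\left|N - \frac{a}{b}\right| + \frac{1}{a}\ln N = t + C, \tag{49}$$
where $C$ is an arbitrary constant ($-\infty < C < \infty$). [Whether we write $\ln|N|$ or $\ln N$ in (49) is immaterial since $N > 0$.] Equivalently,
$$\left|\frac{N}{N - a/b}\right|^{1/a} = e^{t + C}, \qquad \left|\frac{N}{N - a/b}\right| = e^{at + aC} = Be^{at}, \tag{50}$$
where we have replaced $\exp(aC)$ by $B$, so $0 < B < \infty$. Thus
$$\frac{N}{N - a/b} = \pm Be^{at} \equiv Ae^{at}, \tag{51}$$
where $A$ is arbitrary ($-\infty < A < \infty$). Finally, imposing the initial condition $N(0) = N_0$ gives $A = N_0/(N_0 - a/b)$, and putting that expression into (51) and solving for $N$ gives
$$N(t) = \frac{aN_0}{(a - bN_0)e^{-at} + bN_0}. \tag{52}$$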
What can be learned of the behavior of $N(t)$ from (52)? We can see from (52) that for every initial value $N_0$ (other than $N_0 = 0$), $N(t)$ tends to the constant value $a/b$ as $t \to \infty$. [If $N_0 = 0$, then $N(t) = 0$ for all $t$, as it should, because if a species starts with no members it can hardly wax or wane.] Beyond observing that asymptotic information, it is an excellent idea to plot the results, especially now that one has such powerful and convenient computer software for that purpose. However, observe that the solution (52) contains the three parameters $a$, $b$, and $N_0$, and to use plotting software we need to choose numerical values for these parameters. If, for instance, we wish to plot $N(t)$ versus $t$ for five values of $a$, five of $b$, and five of $N_0$, then we will be generating $5^3 = 125$ curves! Thus, the point is that if we wish to do a parametric study of the solution (i.e., examine the solution for a range of values of the various parameters), then there is a serious problem with managing all of the needed plots. In Section 2.4.4 below, we offer advice on how to deal with this common and serious predicament. ∎
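Before plotting, it is worth checking (52) numerically. The sketch below is illustrative only: the parameter values are arbitrary assumptions, and `verhulst` and `ode_residual` are hypothetical helper names. It confirms that (52) satisfies the differential equation and the initial condition in (47) and approaches $a/b$ for large $t$.

```python
import math

def verhulst(t, a, b, n0):
    """Solution (52): N(t) = a N0 / ((a - b N0) exp(-a t) + b N0)."""
    return a * n0 / ((a - b * n0) * math.exp(-a * t) + b * n0)

def ode_residual(t, a, b, n0, h=1e-5):
    """Central-difference estimate of N' - (a - b N) N, the residual of (47)."""
    n = verhulst(t, a, b, n0)
    dndt = (verhulst(t + h, a, b, n0) - verhulst(t - h, a, b, n0)) / (2.0 * h)
    return dndt - (a - b * n) * n

a, b = 1.0, 0.01  # assumed illustrative parameters, so that a/b = 100
for n0 in (5.0, 100.0, 250.0):  # start below, at, and above a/b
    worst = max(abs(ode_residual(0.5 * k, a, b, n0)) for k in range(1, 20))
    print(f"N0 = {n0}: N(0) = {verhulst(0.0, a, b, n0)}, "
          f"N(50) = {verhulst(50.0, a, b, n0):.6f}, max |residual| = {worst:.1e}")
```

Each starting value, whether below, at, or above $a/b$, settles onto the same limiting population.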
2.4.4. Nondimensionalization. (Optional) One can usually reduce the number of parameters in a problem, sometimes dramatically, by a suitable scaling of the independent and dependent variables so that the new variables are nondimensional (i.e., dimensionless).
EXAMPLE 7. Example 6, Continued. To begin such a process of nondimensionalization, we list all dependent and independent variables and parameters, and their dimensions:
                Variable    Dimensions    Parameter    Dimensions
Independent:    t           time          a            1/time
Dependent:      N           number        b            1/[(time)(number)]
                                          N_0          number
(By number we mean the number of living members of the species.) How did we know that $a$ has dimensions of 1/time, and that $b$ has dimensions of 1/[(time)(number)]? From the differential equation $dN/dt = aN - bN^2$. That is, the dimensions of the term on the left are number/time, so the dimensions of $aN$ and $bN^2$ must be the same. Dimensionally, then, $aN$ = number/time, so $a$ = 1/time. Similarly, $bN^2$ = number/time, so $b$ = 1/[(time)(number)].
Next, we nondimensionalize the independent and dependent variables ($t$ and $N$) using suitable combinations of the parameters. From the parameter list, observe that $1/a$ has dimensions of time and can therefore be used as a "reference time" to nondimensionalize the independent variable $t$. That is, we can introduce a nondimensional version of $t$, say $\bar t$, by $\bar t = t/(1/a) = at$.
Next, we need to nondimensionalize the dependent variable $N$. From the parameter list, observe that $N_0$ has dimensions of number, so let us introduce a nondimensional version of $N$, say $\bar N$, by $\bar N = N/N_0$. In case the notion of nondimensionalization still seems unclear, realize that it is merely a change of variables, from $t$ and $N$ to $\bar t$ and $\bar N$; a rather simple change of variables in fact, since $\bar t$ is simply a constant times $t$, and $\bar N$ is simply a constant times $N$.
Putting $t = \bar t/a$ and $N = N_0\bar N$ into (47) gives
$$aN_0\frac{d\bar N}{d\bar t} = (a - bN_0\bar N)N_0\bar N; \qquad N_0\bar N(0) = N_0, \tag{53}$$
where the left side of the differential equation follows from chain differentiation: $\dfrac{dN}{dt} = \dfrac{dN}{d\bar N}\dfrac{d\bar N}{d\bar t}\dfrac{d\bar t}{dt} = (N_0)\left(\dfrac{d\bar N}{d\bar t}\right)(a) = aN_0\dfrac{d\bar N}{d\bar t}$. (More simply, but less rigorously, we could merely replace the $dN$ in $dN/dt$ by $N_0\,d\bar N$ and the $dt$ by $d\bar t/a$.) Simplifying (53) gives
$$\frac{d\bar N}{d\bar t} = (1 - \alpha\bar N)\bar N; \qquad \bar N(0) = 1, \tag{54}$$
where $\alpha = bN_0/a$. Thus, (54) contains only the single parameter $\alpha$. The solution of (54) is
$$\bar N(\bar t) = \frac{1}{\alpha + (1 - \alpha)e^{-\bar t}}. \tag{55}$$
The idea is that if we plot $\bar N(\bar t)$ versus $\bar t$ (rather than $N(t)$ versus $t$), then we have only the one-parameter family of solutions given by (55), where the parameter is the nondimensional quantity $\alpha = bN_0/a$. Those solutions are shown in Fig. 6 for several different values of $\alpha$. As $\bar t \to \infty$ (and hence $t \to \infty$), $\bar N \to 1/\alpha$, so $N/N_0 \to 1/(bN_0/a)$, or $N \to a/b$, as found in Example 6.
Figure 6. Nondimensional solution of Verhulst problem.
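The collapse onto a one-parameter family can be illustrated numerically. In the sketch below (the parameter sets are arbitrary assumptions, and `verhulst` and `nbar` are hypothetical helper names), two quite different dimensional problems that share the same value of $bN_0/a$ are rescaled and compared against (55).

```python
import math

def verhulst(t, a, b, n0):
    """Dimensional solution (52)."""
    return a * n0 / ((a - b * n0) * math.exp(-a * t) + b * n0)

def nbar(tbar, alpha):
    """Nondimensional solution (55)."""
    return 1.0 / (alpha + (1.0 - alpha) * math.exp(-tbar))

# Two assumed parameter sets (a, b, N0); both give alpha = b N0 / a = 0.25.
cases = [(2.0, 0.01, 50.0), (0.5, 0.1, 1.25)]
for tbar in (0.0, 0.5, 1.3, 4.0):
    # Evaluate each dimensional solution at t = tbar / a and scale by N0.
    scaled = [verhulst(tbar / a, a, b, n0) / n0 for a, b, n0 in cases]
    print(f"tbar = {tbar}: rescaled = {scaled}, formula (55) = {nbar(tbar, 0.25)}")
```

Both rescaled solutions land on the same curve, which is why a single plot per value of the nondimensional parameter suffices.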
COMMENT. The nondimensionalization of the independent and dependent variables can often be done in more than one way. In the present example, for instance, we used $N_0$ to nondimensionalize $N$: $\bar N = N/N_0$. However, $a/b$ also has the dimensions of number, so we could have defined $\bar N = N/(a/b) = bN/a$ instead. Similarly, we could have nondimensionalized $t$ differently, as $\bar t = N_0bt$, because $N_0b$ has dimensions of 1/time. Any nondimensionalization will work, and we leave these other choices for the exercises. ∎
EXAMPLE 8. Example 5, Continued. As one more example of the simplifying use of nondimensionalization, consider the initial-value problem
$$\frac{dx}{dt} = \sqrt{V^2 - \frac{2GM}{R}\,\frac{x}{x + R}}; \qquad x(0) = 0 \tag{56}$$
from Example 5. As above, we begin by listing all variables and parameters, and their
dimensions:
                Variable    Dimensions    Parameter    Dimensions
Independent:    t           time          V            length/time
Dependent:      x           length        R            length
We didn't bother with $G$ and $M$ in the parameter list because $V$ and $R$ are all we need to nondimensionalize $t$ and $x$. Specifically, $R$ has dimensions of length, so we can choose $\bar x = x/R$, and $R/V$ has dimensions of time, so we can choose $\bar t = t/(R/V)$. Putting $x = R\bar x$ and $t = R\bar t/V$ into (56) gives
$$V\frac{d\bar x}{d\bar t} = \sqrt{V^2 - \frac{2GM}{R}\,\frac{\bar x}{\bar x + 1}}; \qquad R\bar x(0) = 0 \tag{57}$$
or
$$\frac{d\bar x}{d\bar t} = \sqrt{1 - \alpha\frac{\bar x}{\bar x + 1}}; \qquad \bar x(0) = 0 \tag{58}$$
with the single parameter $\alpha = 2GM/RV^2$. Since all other quantities in the final differential equation are nondimensional, it follows that $\alpha$ must be nondimensional as well, as could be checked from the known dimensions of $G$, $M$, $R$, and $V$.
Of course, whereas we've used the generic dimensions "time" and "length," we could have used specific dimensions such as seconds and meters.
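The dimensional check mentioned in Example 8 can be written out as follows. This is a sketch added for illustration: it reads the dimensions of $G$ off (33), using force = (mass)(length)/(time)$^2$, and the bracket notation $[\,\cdot\,]$ for "dimensions of" is an assumption of this sketch.

```latex
[G] = \frac{[F]\,[d]^2}{[M]\,[m]}
    = \frac{(\text{mass})(\text{length})}{(\text{time})^2}\cdot\frac{(\text{length})^2}{(\text{mass})^2}
    = \frac{(\text{length})^3}{(\text{mass})(\text{time})^2},
\qquad
\left[\frac{2GM}{RV^2}\right]
    = \frac{(\text{length})^3}{(\text{mass})(\text{time})^2}\cdot
      \frac{\text{mass}}{(\text{length})\,(\text{length})^2/(\text{time})^2}
    = 1 .
```

So the combination $2GM/RV^2$ carries no dimensions, as claimed.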
It is common in engineering and science to nondimensionalize the governing equations and initial or boundary conditions even before beginning the solution, so as to reduce the number of parameters as much as possible. In each of the foregoing two examples we ended up with a single parameter, but the final number of parameters will vary from case to case. The nondimensional parameters that result (such as $\alpha$ in Example 6) are sometimes well known and of great importance. For instance, if one nondimensionalizes the differential equations governing fluid flow, two nondimensional parameters that arise are the Reynolds number $Re$ and the Mach number $M$. Without getting into the fluid mechanics, let it suffice to say that the Reynolds number is a measure of the relative importance of viscous effects to inertial effects: if the Reynolds number is sufficiently large one can neglect the viscous terms in the governing equations of motion, and if it is sufficiently small then one can neglect the inertial terms. Similarly, the Mach number is a measure of the importance of the effects of the compressibility of the fluid: if $M$ is sufficiently small then one can neglect those effects and consider the fluid to be incompressible. In fact, any given approximation that is made in engineering science is probably based upon whether some relevant nondimensional parameter is sufficiently large or small, for one is always neglecting one effect relative to others.
Closure. We see that the method of separation of variables is relatively simple: one separates the variables and integrates. Thus, given a specific differential equation, one is well advised to see immediately if the equation is of separable type and, if it is, to solve by separation of variables. Of course, it might turn out that one or both of the integrations are difficult, but the general rule of thumb is that there is a conservation of difficulty, so that if the integrations are difficult, then an equivalent difficulty will show up if one tries a different solution technique.
In the last part of this section we discuss the idea of nondimensionalization. The latter is not central to the topic of this section, separation of variables, but arises tangentially with regard to the efficient management of systems that contain numerous parameters, which situation makes graphical display and general understanding of the results more difficult.
Computer software. A potential difficulty with the method of separation of variables is that the integrations involved may be difficult. Using Maple, for instance, integrations can be carried out using the int command. For example, to evaluate the integral on the left side of (48), enter
int(1/((a - b*N)*N), N);
and return. The output is
$$-\frac{\ln(a - bN)}{a} + \frac{\ln(N)}{a},$$
which (to within an additive constant) is the same as the left side of equation (49).
That is not to say that all integrals can be evaluated in closed form by computer
the integral of $e^{-2x}$ from $x = 0$ to $x = \infty$. Enter
int(exp(-2*x), x = 0..infinity);
and return. The output is 1/2.
1
7 =40°+Cl
y(2)
EXERCISES
2.4
NOTE: Solutions should be expressed in explicit form if possible.
1. Use separation of variables to find the general solution.
Then, obtain the particular solution satisfying the given initial condition. Sketch the graph of the solution, showing the
key features, and label any key values.
(a)y' —327e¥=0;
(b)y’ = 62°+5;
y(0)=0
separable, in general, but it is if p and q are constants, if one
times the other. Obtain the general solution for the case where
each is a nonzero constant, for any real number n. HINT: A
difficult integral will occur. Our discussion of the Bernoulli
equation in the exercises for Section 2.2 should help you to
find a change of variables that will simplify that integration.
(e)y’ =(y° —y)e"; y(0)=2
(Hy = y>+y—6; y(5)= 10
hy =624;
y(0)= —4
yl) =e
6. Solvey’ = (6x? + 1)/(y —1),subjectto thegiveninitial
(iy! =e;
(0) =1
y
Qy = 5
«¥8)=-1
(k)y' +3y(y+ 1)sin2z= 0; (0) =1
condition.
(a)y(0)= -2
(c)y(0)= 0
(d)y(1)=3
y=Iny,
y(0)=5
(m)y’=ylny; y(0)=5
(f)y(-1) =0
7. Solve y’ = (3x? —1)/2y, subject to the given initial condition.
(n)y'+2y=y?+1; y(-3) =0
2.(a)—(n) For the equation given in Exercise
was studied in Section 2.3 and solved as a Bernoulli equation,
and also as a Riccati equation. Here we ask you to solve it by
separation of variables.
of the functionsp(x) and q(x) is zero,or if one is a constant
(dy =14y"; y(2)=5
yl
(a>0, b>0)
N(0) = No
N'(t) = (a—bN)N;
5. The Bernoulli equationy’ +p(x)y = q(x)y” is not variable
y(0) =0
(c)y' +4y= 0; y(—1)=0
(gy =yly+3);
4. The Verhulst population problem
1, use computer
(a)y(0)= —3
(b)y(0)= —1
(c)y(4)=5
software to solve for y(a). Verify, by direct substitution, that (d)y(-1) =0
(e)y(—2)= -4—s (f y(1)= —6
your solution does satisfy the given differential equation and
A function f(#1,...,2n)
functions)
8. (Homogeneous
initial condition.
said to be homogeneous of degree & if f(Avi,...,Atn)
3. The problemdu/dt = k(U —u); u(0) = up , where& and M f(a1,...,;2n) for anyA. For example,
U are constants, occurred in the exercises for Section 2.3 in
connection with Newton’s law of cooling. Solve by separation
of variables.
∕
=
y+
↨
oe
sin.
(2)
1s
=
(e)Similarly, fory! = (@~ y ~ 4)/(a@
+ y —4).
is homogeneous of degree 3 because
↕↕
(f) Devise a method of solution that will work in the excep-
State whether f is homogeneous or not. If it is, determine its
degree.
12. (Algebraic, exponential, and explosive growth) We saw, in
Section 2.3.3, that the population model
∫
∩
↨
eS
X
∑∑
↓
∫ (©+ 2y —1)/(2e
∑ yf!=
∑∶ +∑dy —1).
ferns)
aoe
(2)
(a)f(a,y) = 2?+4y?—7
dN
— =A#N
dt
52
dN
9. (Homogeneous equation) The equation
(of degree zero); see the preceding exercise. CAUTION: The
term homogeneous is also used to describe a linear differential
equation that has zero as its “forcing function” on the righthand side, as defined in Section 1.2. Thus, one needs to use
the context to determine which meaning is intended.
(a) Show, by examples, that (9.1) may, but need not, be separable.
(b) In any case, show that the change of dependent variable
w = y/a, from y(z) to w(x), reduces(9.1)to the separable
1 Fw) —w
7
L
ty +2y?
yf =e
e)y
(ey
= el/*
by’ = ar
(by
@y=--2a +:
(a, b,...,f constants)
1
w(p
— 1)NG*
(12.3)
No denotes the initial value N(0}. Observe that T’ diminishes
as p increases.
13. (Nondimensionalization) In Example 7 we nondimensionalized according to $\bar t = at$ and $\bar N = N/N_0$. Instead, nondimensionalize (47) according to $\bar t = at$ and $\bar N = bN/a$, and thus derive the solution
$$\bar N(\bar t) = \frac{\beta}{\beta + (1 - \beta)e^{-\bar t}},$$
where $\beta = bN_0/a$. Sketch the graph of $\bar N(\bar t)$ versus $\bar t$, for several different values of $\beta$, labeling any key value(s).
(11.1)
can be reduced to homogeneous form by the change of variables x = u+h,y = u-+k, where hand k are suitably chosen
constants, provided that ae ~ bd #0.
growth as a limiting case of algebraic growth, in the limit as
the exponent 7 becomes infinite. Thus, exponential growth is
powerful indeed.) If p is increased beyond | then we expect
the growth to be even more spectacular. Show thatifp > 1
then the solution exhibits explosive growth, explosive in the
sense that NV-> ooin finite time, as t > T', where
or
11. (Almost-homogeneous equation) (a) Show that
~dx Fey +f
(12.2)
(Of course, when p = 1 we then have exponential growth, as
mentioned above, so we can think — crudely ~ of exponential
(9.1)
+ey
yl = ax + by +e
(«> 0)
N(t) ~ at® as t + oo]. Show that as p -+ 0 the exponent @
tends to unity, and as p > 1 the exponent f tends to infinity.
T=
10. Use the idea contained in the preceding exercise, to find
the general solution to each of the following equations.
y
x
2y— 2
(a)y’—_
==+4+3,/=
y
p
anRNP,
(9.1) where p is a positive constant. Solve (12.2) and show that if
0 <p < 1 then the solution exhibits algebraic growth [1.c.,
is said to be homogeneous because f(y/x) is homogeneous
Ww
(12.1)
generally, consider the model
(d)f(x,y)=sin(x*+y*)
form
(«>0)
gives exponential growth, whereby N’ — oo as t + oo. More
(c) f(a, y) = 2? —y? + Tez —32y
v=1@)
∕
tional case where ae ~—bd = 0, and apply it to the case
(b)Thus,find thegeneralsolutionof y! = (22 —y ~ 6)/(a y —3).
(c)Similarly, for y’ = (1 —y)/(a + dy —3).
(d)Similarly, for y! = (a + y)/(a —y +1).
14. The initial-value problem
$$mx'' + cx' + kx = F\sin\omega t; \qquad x(0) = x_0, \quad x'(0) = x_0' \tag{14.1}$$
corresponding to a damped mechanical oscillator driven by the force $F\sin\omega t$, contains seven parameters: $m, c, k, F, \omega, x_0, x_0'$. Nondimensionalize (14.1). How many parameters are present in the nondimensionalized system?
2.5 Exact Equations and Integrating Factors
Thus far we have developed solution techniques for first-order differential equations that are linear or separable. In addition, Bernoulli, Riccati, Clairaut, homogeneous, and almost-homogeneous equations were discussed in the exercises. In this section we consider one more important case, equations that are "exact," and ones that are not exact but can be made exact.
First, let us review some information, from the calculus, about partial derivatives. Specifically, recall that the symbol $\dfrac{\partial^2 f}{\partial x\,\partial y}$ is understood to mean $\dfrac{\partial}{\partial x}\left(\dfrac{\partial f}{\partial y}\right)$. If we use the standard subscript notation instead, then this quantity would be expressed as $f_{yx}$, that is, $(f_y)_x$. Does the order of differentiation matter? That is, is $f_{yx} = f_{xy}$? It is shown in the calculus that a sufficient condition for $f_{xy}$ to equal $f_{yx}$ is that $f_x$, $f_y$, $f_{yx}$, and $f_{xy}$ all be continuous within the region in question. These conditions are met so typically in applications, that in textbooks on engineering and science $f_{xy}$ and $f_{yx}$ are generally treated as indistinguishable. Here, however, we will treat them as equal only if we explicitly assume the continuity of $f_x$, $f_y$, $f_{yx}$, and $f_{xy}$.
2.5.1. Exact differential equations. To motivate the idea of exact equations, consider the equation
$$\frac{dy}{dx} = \frac{\sin y}{2y - x\cos y} \tag{1}$$
or, rewritten in differential form,
$$\sin y\,dx + (x\cos y - 2y)\,dy = 0. \tag{2}$$
If we notice that the left-hand side is the differential of $F(x, y) = x\sin y - y^2$, then (2) is simply $dF = 0$, which can be integrated to give $F = $ constant; that is,
$$F(x, y) = x\sin y - y^2 = C, \tag{3}$$
where $C$ is an arbitrary constant of integration. Equation (3) gives the general solution to (1), in implicit form.
Really, our use of the differential form (2) begs justification since we seem to have thereby treated $dy/dx$ as a fraction of computable quantities $dy$ and $dx$, whereas it is actually the limit of a difference quotient. Such justification is possible, but it may suffice to note that the use of differentials is a matter of convenience and is not essential to the method. For instance, observe that if we write
$$\sin y + (x\cos y - 2y)\frac{dy}{dx} = 0 \tag{4}$$
in place of (2), to avoid any questionable use of differentials, then the left-hand side of (4) is the $x$ derivative (total, not partial) of $F(x, y) = x\sin y - y^2$:
$$\frac{d}{dx}F(x, y(x)) = \frac{d}{dx}\left(x\sin y - y^2\right) = \sin y + (x\cos y - 2y)\frac{dy}{dx},$$
so $dF/dx = 0$. Integrating the latter gives $F(x, y) = x\sin y - y^2 = C$, just as before.
Thus, let us continue, without concern about manipulating $dy/dx$ as though it were a fraction. Seeking to generalize the method outlined above, we consider the differential equation
$$\frac{dy}{dx} = -\frac{M(x, y)}{N(x, y)}, \tag{5}$$
where the minus sign is included so that when we re-express (5) in the differential form
$$M(x, y)\,dx + N(x, y)\,dy = 0, \tag{6}$$
then both signs on the left will be positive. It is important to be aware that in equation (5) $y$ is regarded as a function of $x$, as is clear from the presence of the derivative $dy/dx$. That is, there is a hierarchy whereby $x$ is the independent variable and $y$ is the dependent variable. But upon re-expressing (5) in the form (6) we change our viewpoint and now consider $x$ and $y$ as having the same status; now they are both independent variables.
We observe that integration of (6) is simple if $M\,dx + N\,dy$ happens to be the differential of some function $F(x, y)$, for if there does exist a function $F(x, y)$ such that
$$dF(x, y) = M(x, y)\,dx + N(x, y)\,dy, \tag{7}$$
then (6) is
$$dF(x, y) = 0, \tag{8}$$
which can be integrated to give the general solution
$$F(x, y) = C, \tag{9}$$
where $C$ is an arbitrary constant.
Given $M(x, y)$ and $N(x, y)$, suppose that there does exist an $F(x, y)$ such that $M\,dx + N\,dy = dF$. Then we say that $M\,dx + N\,dy$ is an exact differential, and that (6) is an exact differential equation. That case is of great interest because its general solution is given immediately, in implicit form, by (9).
Two questions arise. How do we determine if such an $F$ exists and, if it does, then how do we find it? The first is addressed by the following theorem.
THEOREM 2.5.1 Test for Exactness
Let $M(x, y)$, $N(x, y)$, $\partial M/\partial y$, and $\partial N/\partial x$ be continuous within a rectangle $R$ in the $x, y$ plane. Then $M\,dx + N\,dy$ is an exact differential, in $R$, if and only if
$$\frac{\partial M}{\partial y} = \frac{\partial N}{\partial x} \tag{10}$$
everywhere in $R$.
Partial Proof: Let us suppose that $M\,dx + N\,dy$ is exact, so that there is an $F$ satisfying (7). Then it must be true, according to the chain rule of the calculus, that
$$M = \frac{\partial F}{\partial x} \tag{11a}$$
and
$$N = \frac{\partial F}{\partial y}. \tag{11b}$$
Differentiating (11a) partially with respect to $y$, and (11b) partially with respect to $x$, gives
$$M_y = F_{xy}, \tag{12a}$$
and
$$N_x = F_{yx}. \tag{12b}$$
Since $M$, $N$, $M_y$, and $N_x$ have been assumed continuous in $R$, it follows from (11) and (12) that $F_x$, $F_y$, $F_{xy}$, and $F_{yx}$ are too, so $F_{xy} = F_{yx}$. Then it follows from (12) that $M_y = N_x$, which is equation (10). Because of the "if and only if" wording in the theorem, we also need to prove the reverse: that the truth of (10) implies the existence of $F$. That part of the proof can be carried out using results established in Section 16.12, and will not be given here.
Actually $R$ need not be a rectangle; it merely needs to be "simply connected," that is, a region without holes. Simple connectedness will be defined and used extensively in Chapter 16 on Field Theory.
Assuming that the conditions of the theorem are met, so that we are assured that such an $F$ exists, how do we find $F$? We can find it by integrating (11a) with respect to $x$, and (11b) with respect to $y$. Let us illustrate the method by reconsidering the example given above.
EXAMPLE 1. Consider equation (1) once again, or, in differential form,
$$\sin y\,dx + (x\cos y - 2y)\,dy = 0. \tag{13}$$
First, we identify $M = \sin y$, and $N = x\cos y - 2y$. Clearly, $M$, $N$, $M_y$, and $N_x$ are continuous in the whole plane, so we turn to the exactness condition (10): $M_y = \cos y$, and $N_x = \cos y$, so (10) is satisfied, and it follows from Theorem 2.5.1 that there does exist an $F(x, y)$ such that the left-hand side of (13) is $dF$. Next, we find $F$ from (11):
$$\frac{\partial F}{\partial x} = \sin y, \tag{14a}$$
$$\frac{\partial F}{\partial y} = x\cos y - 2y. \tag{14b}$$
Integrating (14a) partially, with respect to $x$, gives
$$F(x, y) = \int \sin y\,\partial x = x\sin y + A(y), \tag{15}$$
where the sin y integrand was treated as a constant in the integration since it was a “partial
integration” on @,holding y fixed [just as y was held fixed in computing OF /Oz in (14a)].
The constant of integration A must therefore be allowed to depend upon y since y was held
fixed and was therefore constant. If you are not convinced of this point, observe that taking
a partial z-derivative of (15) does indeed recover (14a).
Observe that initially F(a, y) was unknown. The integration of (14a) reduced the
problem from an unknown function F' of x and y to an unknown function A of y alone.
A(y), in turn,can now be determinedfrom (14b). Specifically, we put the right-handside
of (15) into the left-hand side of (14b) and obtain
$$x\cos y + A'(y) = x\cos y - 2y, \tag{16}$$
where the prime denotes $d/dy$. Cancelling terms gives $A'(y) = -2y$, so
$$A(y) = -\int 2y\,dy = -y^2 + B, \tag{17}$$
where this integration was not a “partial integration”; it was an ordinary integration on $y$
since $A'(y)$ was an ordinary derivative of $A$. Combining (17) and (15) gives
$$F(x, y) = x\sin y - y^2 + B = \text{constant}. \tag{18}$$
Finally, absorbing $B$ into the constant, and calling the result $C$, gives the general solution
$$x\sin y - y^2 = C \tag{19}$$
of (1), in implicit form.
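As a numerical cross-check of this example, the following minimal sketch (not part of the original text) uses central differences to test the exactness condition and to confirm that the function found in (18) satisfies (14a) and (14b). The helper names `d_dx`, `d_dy`, `worst_residuals`, the step size, and the sample points are assumptions chosen only for illustration.

```python
import math

def M(x, y):
    return math.sin(y)

def N(x, y):
    return x * math.cos(y) - 2.0 * y

def F(x, y):
    # Candidate from (18), with the additive constant B dropped.
    return x * math.sin(y) - y ** 2

def d_dx(f, x, y, h=1e-6):
    """Central-difference estimate of the partial derivative of f with respect to x."""
    return (f(x + h, y) - f(x - h, y)) / (2.0 * h)

def d_dy(f, x, y, h=1e-6):
    """Central-difference estimate of the partial derivative of f with respect to y."""
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)

def worst_residuals(points):
    """Largest deviations from M_y = N_x, F_x = M, and F_y = N over the sample points."""
    exactness = max(abs(d_dy(M, x, y) - d_dx(N, x, y)) for x, y in points)
    fx_minus_m = max(abs(d_dx(F, x, y) - M(x, y)) for x, y in points)
    fy_minus_n = max(abs(d_dy(F, x, y) - N(x, y)) for x, y in points)
    return exactness, fx_minus_m, fy_minus_n

# Arbitrary sample points in the plane (an assumption of this sketch).
SAMPLE_POINTS = [(-2.0, 0.5), (0.3, -1.2), (1.7, 2.4), (3.0, 0.0)]
residuals = worst_residuals(SAMPLE_POINTS)
print(residuals)
```

All three residuals should be at the level of finite-difference error. Such a check supports the hand calculation but does not replace it.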
COMMENT 1. Be aware that the partial integration notation $\int(\;)\,\partial x$ and $\int(\;)\,\partial y$ is not
standard; we use it here because we find it reasonable, and helpful in reminding us that any
$y$’s in the integrand of $\int(\;)\,\partial x$ are to be treated as constants, and likewise for any $x$’s
in $\int(\;)\,\partial y$.
COMMENT 2. From (13) all the way through (19), $x$ and $y$ have been regarded as independent variables. With (19) in hand, we can now return to our original viewpoint of $y$
being a function of $x$. We can, if possible, solve (19) by algebra for $y(x)$ [in this case it is
not possible because (19) is transcendental], plot the result, and so on. Even failing to solve (19) for
$y(x)$, we can nevertheless verify that $x\sin y - y^2 = C$ satisfies (1) by differentiating with
respect to $x$. That step gives $\sin y + x(\cos y)y' - 2yy' = 0$ or $y' = (\sin y)/(2y - x\cos y)$,
which does agree with (1).
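The same agreement can be seen numerically. The sketch below (an illustration only, not the author's method) integrates $y' = (\sin y)/(2y - x\cos y)$ with a classical fourth-order Runge–Kutta scheme and checks that $x\sin y - y^2$ stays constant along the computed solution. The initial point $y(0) = 1$, the interval $0 \le x \le 1$, and the step count are assumptions chosen so that the denominator $2y - x\cos y$ stays away from zero.

```python
import math

def slope(x, y):
    # Right-hand side of (1): y' = sin y / (2y - x cos y).
    return math.sin(y) / (2.0 * y - x * math.cos(y))

def F(x, y):
    # Left-hand side of the implicit solution (19).
    return x * math.sin(y) - y ** 2

def rk4(f, x0, y0, x_end, steps):
    """Classical fourth-order Runge-Kutta integration of y' = f(x, y) from x0 to x_end."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2.0, y + h * k1 / 2.0)
        k3 = f(x + h / 2.0, y + h * k2 / 2.0)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x += h
    return y

y_end = rk4(slope, 0.0, 1.0, 1.0, 1000)
drift = abs(F(1.0, y_end) - F(0.0, 1.0))  # should be near zero if (19) holds
print(y_end, drift)
```

Here the constant is $C = F(0, 1) = -1$, and `drift` measures how far the numerical solution strays from that level curve.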
COMMENT 3. It would be natural to wonder how this method can fail to work. That is,
whether or not $M_y = N_x$, why can’t we always successfully integrate (11) to find $F$? The
answer is to be found in (16). For suppose (16) were $2x\cos y + A'(y) = x\cos y - 2y$
instead. Then the $x\cos y$ terms would not cancel, as they did in (16), and we would have
$A'(y) = -x\cos y - 2y$, which is impossible because it expresses a relationship between $x$
and $y$, whereas $x$ and $y$ are regarded here as independent variables. Thus, the cancellation
of the $x\cos y$ terms in (16) was a consequence of the fact that $M$ and $N$ satisfied the exactness condition (10).
COMMENT 4. Though we used (14a) first, then (14b), the order is immaterial and could
have been reversed.
2.5.2. Integrating factors. It may be discouraging to realize that for any given
pair of functions $M$ and $N$ it is unlikely that the exactness condition (10) will be
satisfied. However, there is power available to us that we have not yet tapped, for
even if $M$ and $N$ fail to satisfy (10), so that the equation
$$M(x, y)\,dx + N(x, y)\,dy = 0 \tag{20}$$
is not exact, it may be possible to find a multiplicative factor $\sigma(x, y)$ so that
$$\sigma(x, y)M(x, y)\,dx + \sigma(x, y)N(x, y)\,dy = 0 \tag{21}$$
is exact. That is, we seek a function $\sigma(x, y)$ so that the revised exactness condition
$$\frac{\partial}{\partial y}(\sigma M) = \frac{\partial}{\partial x}(\sigma N) \tag{22}$$
is satisfied. Of course, we need $\sigma(x, y) \neq 0$ for (21) to be equivalent to (20).
If we can find a $\sigma(x, y)$ satisfying (22), then we call it an integrating factor of
(20) because then (21) is equivalent to $dF = 0$, for some $F(x, y)$, and $dF = 0$ can
be integrated immediately to give the solution of the original differential equation
as $F(x, y) = \text{constant}$.
How do we find such a $\sigma$? It is any (nonzero) solution of (22), that is, of
$$\sigma_y M + \sigma M_y = \sigma_x N + \sigma N_x. \tag{23}$$
Of course, (23) is a first-order partial differential equation on $\sigma$, so we have made
dubious headway: to solve our original first-order ordinary differential equation
on $y(x)$, we now need to solve the first-order partial differential equation (23) on
$\sigma(x, y)$!
However, perhaps an integrating factor $\sigma$ can be found that is a function of $x$
alone: $\sigma(x)$. Then (23) reduces to the differential equation
$$\sigma M_y = \frac{d\sigma}{dx}N + \sigma N_x \qquad\text{or}\qquad \frac{1}{\sigma}\frac{d\sigma}{dx} = \frac{M_y - N_x}{N}, \tag{24}$$
which is separable. This idea succeeds if and only if the $(M_y - N_x)/N$ ratio on the
right-hand side of (24) is a function of $x$ only, for if it did contain any $y$ dependence
then (24) would amount to the impossible situation of a function of $x$ equalling a
function of $x$ and $y$, where $x$ and $y$ are independent variables. Thus, if
$$\frac{M_y - N_x}{N} = \text{function of } x \text{ alone}, \tag{25}$$
then integration of (24) gives
$$\sigma(x) = e^{\int \frac{M_y - N_x}{N}\,dx}. \tag{26}$$
Actually, the general solution of (24) includes an arbitrary constant factor, but that
factor is inconsequential and can be taken to be 1. Also, remember that we need $\sigma$
to be nonzero and we are pleased to see, a posteriori, that the $\sigma$ given in (26) cannot
equal zero because it is an exponential function.
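If $(M_y - N_x)/N$ is not a function of $x$ alone, then an integrating factor $\sigma(x)$
does not exist, but we can try to find $\sigma$ as a function of $y$ alone: $\sigma(y)$. Then (23)
reduces to
$$\frac{d\sigma}{dy}M + \sigma M_y = \sigma N_x \qquad\text{or}\qquad \frac{1}{\sigma}\frac{d\sigma}{dy} = -\frac{M_y - N_x}{M},$$
which, again, is separable. If
$$\frac{M_y - N_x}{M} = \text{function of } y \text{ alone}, \tag{27}$$
then
$$\sigma(y) = e^{-\int \frac{M_y - N_x}{M}\,dy}. \tag{28}$$
EXAMPLE 2. Consider the equation (already expressed in differential form)
$$dx + (3x - e^{-2y})\,dy = 0. \tag{29}$$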
Then $M = 1$ and $N = 3x - e^{-2y}$, so (10) is not satisfied and (29) is not exact. Seeking an
integrating factor that is a function of $x$ alone, we find that
$$\frac{M_y - N_x}{N} = \frac{0 - 3}{3x - e^{-2y}} \neq \text{function of } x \text{ alone}, \tag{30}$$
and conclude that $\sigma(x)$ is not possible. Seeking instead an integrating factor that is a
function of $y$ alone,
$$\frac{M_y - N_x}{M} = \frac{0 - 3}{1} = -3 = \text{function of } y \text{ alone}, \tag{31}$$
so that $\sigma(y)$ is possible, and is given by
$$\sigma(y) = e^{-\int(-3)\,dy} = e^{\int 3\,dy} = e^{3y}.$$
Multiply (29) through by the integrating factor $\sigma = e^{3y}$ and obtain
$$e^{3y}\,dx + e^{3y}(3x - e^{-2y})\,dy = 0, \tag{32}$$
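which is now exact. Thus,
$$\frac{\partial F}{\partial x} = e^{3y} \qquad\text{and}\qquad \frac{\partial F}{\partial y} = e^{3y}(3x - e^{-2y}),$$
so
$$F(x, y) = \int e^{3y}\,\partial x = xe^{3y} + A(y),$$
and
$$\frac{\partial F}{\partial y} = e^{3y}(3x - e^{-2y}) = 3xe^{3y} + A'(y).$$
The latter gives
$$A'(y) = -e^{y}, \tag{33}$$
so
$$A(y) = -e^{y} + B.$$
Thus,
$$F(x, y) = xe^{3y} - e^{y} + B = \text{constant}$$
or
$$xe^{3y} - e^{y} = C, \tag{34}$$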
where $C$ is an arbitrary constant; (34) is the general solution of (29), in implicit
form.
COMMENT. Can we solve (34) for $y$? If we let $e^{y} = z$, then (34) is the cubic equation
$xz^3 - z = C$ in $z$, and there is a known solution to cubic equations. If we can solve for
$z$, then we have $y$ as $y = \ln z$. However, the solution of that cubic equation (as can be
obtained using the Maple solve command) is quite a messy expression.
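The steps of Example 2 can be checked numerically. The sketch below (an illustration under stated assumptions, not the author's procedure) confirms by central differences that $M_y - N_x = -3$ for (29), that multiplying by $\sigma = e^{3y}$ closes that gap, and that $F = xe^{3y} - e^{y}$ has the partial derivatives required by (32). The function names, step size, and sample points are hypothetical choices.

```python
import math

def M(x, y):
    return 1.0

def N(x, y):
    return 3.0 * x - math.exp(-2.0 * y)

def sigma(y):
    # Integrating factor found in Example 2.
    return math.exp(3.0 * y)

def F(x, y):
    # Left-hand side of (34).
    return x * math.exp(3.0 * y) - math.exp(y)

def d_dx(f, x, y, h=1e-6):
    return (f(x + h, y) - f(x - h, y)) / (2.0 * h)

def d_dy(f, x, y, h=1e-6):
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)

def exactness_gap(x, y, use_sigma):
    """Return M_y - N_x, optionally after multiplying M and N by sigma(y)."""
    if use_sigma:
        m = lambda s, t: sigma(t) * M(s, t)
        n = lambda s, t: sigma(t) * N(s, t)
    else:
        m, n = M, N
    return d_dy(m, x, y) - d_dx(n, x, y)

points = [(0.5, 0.4), (1.5, -0.3), (-1.0, 0.2)]  # arbitrary sample points
raw_gaps = [exactness_gap(x, y, False) for x, y in points]
scaled_gaps = [exactness_gap(x, y, True) for x, y in points]
fx_err = max(abs(d_dx(F, x, y) - sigma(y) * M(x, y)) for x, y in points)
fy_err = max(abs(d_dy(F, x, y) - sigma(y) * N(x, y)) for x, y in points)
print(raw_gaps, scaled_gaps, fx_err, fy_err)
```

The raw gaps should all be close to $-3$, in agreement with (30) and (31), while the scaled gaps and the two derivative errors should be close to zero.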
EXAMPLE 3. First-Order Linear Equation. We’ve already solved the general first-order linear equation
$$y' + p(x)y = q(x) \tag{35}$$
in Section 2.2, but let us see if we can solve it again, using the ideas of this section. First,
express (35) in the form
$$[p(x)y - q(x)]\,dx + dy = 0. \tag{36}$$
Thus, $M = p(x)y - q(x)$ and $N = 1$, so $M_y = p(x)$ and $N_x = 0$. Hence $M_y \neq N_x$, so
(36) is not exact [except in the trivial case when $p(x) = 0$]. Since
$$\frac{M_y - N_x}{N} = \frac{p(x) - 0}{1} = \text{function of } x \text{ alone},$$
$$\frac{M_y - N_x}{M} = \frac{p(x) - 0}{p(x)y - q(x)} \neq \text{function of } y \text{ alone},$$
we can find an integrating factor that is a function of $x$ alone, but not one that is a function
of $y$ alone. We leave it for the exercises to show that the integrating factor is
$$\sigma(x) = e^{\int p(x)\,dx}$$
and that the final solution (this time obtainable in explicit form) is
$$y(x) = e^{-\int p\,dx}\left(\int e^{\int p\,dx}\,q\,dx + C\right), \tag{37}$$
as found earlier, in Section 2.2.
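For one concrete case, formula (37) can be tested numerically. The sketch below assumes the illustrative choice $p(x) = 2x$, $q(x) = x$ (not taken from the text), for which $\int p\,dx = x^2$ and $\int e^{x^2}x\,dx = \tfrac{1}{2}e^{x^2}$, so that (37) gives $y = \tfrac{1}{2} + Ce^{-x^2}$. It then checks the residual $y' + p\,y - q$ by central differences.

```python
import math

def p(x):
    return 2.0 * x  # illustrative coefficient, an assumption of this sketch

def q(x):
    return x        # illustrative forcing term, an assumption of this sketch

def y_closed_form(x, C):
    # Formula (37) evaluated by hand for p = 2x, q = x.
    return 0.5 + C * math.exp(-x * x)

def residual(x, C, h=1e-5):
    """Value of y' + p(x) y - q(x), with y' estimated by a central difference."""
    dydx = (y_closed_form(x + h, C) - y_closed_form(x - h, C)) / (2.0 * h)
    return dydx + p(x) * y_closed_form(x, C) - q(x)

worst = max(abs(residual(x, C))
            for x in (-1.0, 0.0, 0.5, 2.0)
            for C in (-3.0, 0.0, 1.5))
print(worst)
```

A small value of `worst` for several values of $C$ is consistent with (37) being the general solution in this case.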
Closure. Let us summarize the main results. Given a differential equation $dy/dx = f(x, y)$,
the first step in using the method of exact differentials is to re-express it
in the differential form $M(x, y)\,dx + N(x, y)\,dy = 0$. If $M$, $N$, $M_y$, and $N_x$
are all continuous in the region of interest, check to see if the exactness condition
(10) is satisfied. If it is, then the equation is exact, and its general solution is
$F(x, y) = C$, where $F$ is found by integrating (11a) and (11b). As a check on your
work, a differential of $F(x, y) = C$ should give you back the original equation
$M\,dx + N\,dy = 0$.
If it is not exact, see if $(M_y - N_x)/N$ is a function of $x$ alone. If it is, then
an integrating factor $\sigma(x)$ can be found from (26). Multiplying the given equation
$M\,dx + N\,dy = 0$ through by that $\sigma(x)$, the new equation is exact, and you can
proceed as outlined above for an exact equation.
If $(M_y - N_x)/N$ is not a function of $x$ alone, check to see if $(M_y - N_x)/M$ is a
function of $y$ alone. If it is, then an integrating factor $\sigma(y)$ can be found from (28).
Multiplying $M\,dx + N\,dy = 0$ through by that $\sigma(y)$, the new equation is exact, and
you can proceed as outlined above for an exact equation. If $(M_y - N_x)/M$ is
not a function of $y$ alone, then the method is of no help unless an integrating factor
$\sigma$ can be found that is a function of both $x$ and $y$.
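The decision procedure summarized above can be sketched as a small numerical classifier. This is a rough illustration only: the function `classify`, its sampling grid, and its tolerance are assumptions of the sketch. It samples $M_y - N_x$ on a grid, so it can suggest which case applies but cannot prove it, and it assumes $M$ and $N$ do not vanish at the grid points.

```python
import math

def d_dx(f, x, y, h=1e-6):
    return (f(x + h, y) - f(x - h, y)) / (2.0 * h)

def d_dy(f, x, y, h=1e-6):
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)

def classify(M, N, xs=(0.5, 1.0, 1.5), ys=(0.4, 0.9, 1.3), tol=1e-5):
    """Return 'exact', 'sigma(x)', 'sigma(y)', or 'neither', following the Closure summary."""
    def gap(x, y):
        return d_dy(M, x, y) - d_dx(N, x, y)  # M_y - N_x

    if all(abs(gap(x, y)) < tol for x in xs for y in ys):
        return "exact"

    def unchanged(values):
        return max(values) - min(values) < tol

    # (M_y - N_x)/N must not change when y varies at fixed x.
    if all(unchanged([gap(x, y) / N(x, y) for y in ys]) for x in xs):
        return "sigma(x)"
    # (M_y - N_x)/M must not change when x varies at fixed y.
    if all(unchanged([gap(x, y) / M(x, y) for x in xs]) for y in ys):
        return "sigma(y)"
    return "neither"

# Example 1: exact as given.
case_1 = classify(lambda x, y: math.sin(y), lambda x, y: x * math.cos(y) - 2.0 * y)
# Example 2: an integrating factor depending on y alone.
case_2 = classify(lambda x, y: 1.0, lambda x, y: 3.0 * x - math.exp(-2.0 * y))
# Linear equation (36) with the illustrative choice p = 2x, q = x.
case_3 = classify(lambda x, y: 2.0 * x * y - x, lambda x, y: 1.0)
print(case_1, case_2, case_3)
```

The three test cases reproduce the conclusions reached by hand in Examples 1, 2, and 3.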
EXERCISES
2.5
NOTE: Solutions should be expressed in explicit form if possible.
y(0.5) = 3.1
(k) (42°y5 sin 3a + 32ty®cos32)dx + 5a*y*sin 3a dy= 0;
1. Show that the equation is exact, and obtain its general so- y(0)= 1
lution. Also, find the particular solution corresponding to the
(m) (2ye?"¥ sin xz+ e*4 cosa + 1)dx + 2xe*Y sina dy = 0;
given initial condition as well.
y(2.3) = —1.25
(a)3dzx—dy=0; (0) =6
(b)a?7dx+y'dy=0;
(c)adx+2ydy=0;
2.(a)—(m) Find the general solution of the equation given in
(9) = -1
y(1) =2
(d)4cos2udu —e~°"dv= 0;
(e)eYdx + (xeY —1)dy=0;
Exercise | using computer software, and also the particular
solution corresponding to the given initial condition.
v(0) = 6
3. Make up three different examples of exact equations.
y(—5) =6
(f) (e” + z)dy — (sin z — y)dz =
(g)(w— 2z)de —(Qe — z)\dz=0;
4, Petermine whatever conditions, if any, are needed on the
for the equation to be exact.
.,f, A, B,...,F
constants @,
0; z(0) = 0
2
=7 i
5
(h) (sin y + ycos z)dz + (sine + teosy)dy =0;
y(2)=3
(i) (sin zy + xycosxy)dz + x”“coszy dy=0;
| { (0) =
(j) (347sin 2y — 2cy)drx + (223cos2y — x?\dy
ll
Q;
(a) (av +
+“eda + (Av- +By +C)dy =0
(b) (aa? + by? +exy + dx +ey + f)dx
Cry + Dxt+ By+ F)dy =0
+ (Ax?
+By?
4
5. Find a suitable integrating factor $\sigma(x)$ or $\sigma(y)$, and use it to find the general solution of the differential equation.
(a) $3y\,dx + dy = 0$
(b) $y\,dx + x\ln x\,dy = 0$
(c) $y\ln y\,dx + (x + y)\,dy = 0$
(d) $dx + (x - e^{-y})\,dy = 0$
(e) $dx + x\,dy = 0$
(f) $(ye^{-x} + 1)\,dx + (xe^{-x})\,dy = 0$
(g) $\cos y\,dx - [2(x - y)\sin y + \cos y]\,dy = 0$
(h) $(1 - x - z)\,dx + dz = 0$
(i) $(2 + \tan^2 x)(1 + e^{-y})\,dx - e^{-y}\tan x\,dy = 0$
(j) $(5u^2\sinh 3v - 2u)\,du + 3u^3\cosh 3v\,dv = 0$
(k) $\cos x\,dx + (3\sin x + 3\cos y - \sin y)\,dy = 0$
(l) $(y\ln y + 2xy^2)\,dx + (x + x^2 y)\,dy = 0$
(m) $(3x - 2p)\,dx - x\,dp = 0$
8. Show that the equation is not exact and that an integrating factor depending on x alone or y alone does not exist. Nevertheless, find a suitable integrating factor by inspection, and
use it to obtain the general solution.
(a)eYdz + e*dy = 0
(b)y2dx ~ e8*dy= 0
(c) e’%¥dx
—tana dy = 0
9. Obtain the general solution, using the methods of this section.
dy
OF
dre
bp)E
«-y
@ ae aby
(c) dy _ 2xy—eY
dz
dy
x(e¥ —2)
siny + ycosz
@)=x =-_sinz
r? cos0
$1
() ~~ OrsinO
(@)dy _ y(2x —Iny)
dx
+ Zcosy
(n)ydz +(x? ~ x)dy =0
(0)2zy dz + (y? —x?)dy =0
10. What do the integrating factors defined by (26) and (28)
turn out to be if the equation is exact to begin with?
6. (First-order linear equation) Verify that $\sigma(x) = e^{\int p(x)\,dx}$
is an integrating factor for the general linear first-order equation (35), and use it to derive the general solution (37).
11. (a) Show that $(x^2 + y)\,dx + (y^2 + x)\,dy = 0$ is exact.
(b) More generally, is $M(x, y)\,dx + M(y, x)\,dy$ exact? Explain.
7. Show that the given equation is not exact and that an integrating factor depending on x alone or y alone does not exist.
If possible, find an integrating factor in the form $\sigma(x, y) = x^a y^b$, where $a$ and $b$ are suitably chosen constants. If such a $\sigma$
can be found, then use it to obtain the general solution of the
differential equation; if not, state that.
(a) $(3xy - 2y^2)\,dx + (2x^2 - 3xy)\,dy = 0$
(b) $(3xy + 2y^2)\,dx + (3x^2 + 4xy)\,dy = 0$
(c)(a + y*)dx+ (x —y)dy =0
(d)ydz —(xy —x)dy =0
8. Show that the equation is not exact and that an integrating
12. If $F(x, y) = C$ is the general solution (in implicit form) of
a given first-order equation, then what is the particular solution
(in implicit form) satisfying the initial condition $y(a) = b$?
13. If $M\,dx + N\,dy = 0$ and $P\,dx + Q\,dy = 0$ are exact, is
$(M + P)\,dx + (N + Q)\,dy = 0$ exact? Explain.
14. Show that for $[p(x) + q(y)]\,dx + [r(x) + s(y)]\,dy = 0$ to
be exact, it is necessary and sufficient that $q(y)\,dx + r(x)\,dy$ be
an exact differential.
15. Show that for $p(x)\,dx + q(x)r(y)\,dy = 0$ to be exact, it is
necessary and sufficient that $q(x)$ be a constant.
Chapter 2 Review
Following is a listing of the types of equations covered in this chapter.
SECTION 2.2
First-order linear: $y' + p(x)y = q(x)$.
This equation can be solved by the integrating factor method or by solving the
homogeneous equation and using variation of parameters. Its general solution is
$$y(x) = e^{-\int p(x)\,dx}\left(\int e^{\int p(x)\,dx}\,q(x)\,dx + C\right).$$
A particular solution satisfying $y(a) = b$ is
$$y(x) = e^{-\int_a^x p(\xi)\,d\xi}\left(\int_a^x e^{\int_a^\xi p(\zeta)\,d\zeta}\,q(\xi)\,d\xi + b\right).$$
Bernoulli: $y' + p(x)y = q(x)y^n$ $(n \neq 0, 1)$.
This equation can be converted to the first-order linear equation $v' + (1 - n)p(x)v = (1 - n)q(x)$ by the change of variables $v = y^{1-n}$ (Exercise 9).
Riccati: $y' = p(x)y^2 + q(x)y + r(x)$.
This equation can be solved by setting $y = Y(x) + \dfrac{1}{u}$, if a particular solution
$Y(x)$ of the Riccati equation can be found (Exercise 11).
d’Alembert–Lagrange: $y = xf(y') + g(y')$ $[f(y') \neq y']$.
By letting $y' = p$ be a new independent variable, one can obtain a linear first-order
equation on $x(p)$ (Exercise 13).
Clairaut: $y = xy' + g(y')$.
This equation admits the family of straight-line solutions $y = Cx + g(C)$ and,
in general, a singular solution as well (Exercise 14).
SECTION 2.4
Separable: $y' = X(x)Y(y)$.
General solution obtained by integrating
$$\int \frac{dy}{Y(y)} = \int X(x)\,dx.$$
Homogeneous: $y' = f\left(\dfrac{y}{x}\right)$.
Can be made separable by setting $v = y/x$ (Exercise 9).
Almost Homogeneous: $y' = \dfrac{ax + by + c}{dx + ey + f}$ $(ae - bd \neq 0)$.
Can be made homogeneous by setting $x = u + h$, $y = v + k$ (Exercise 11).
SECTION 2.5
Exact: $M(x, y)\,dx + N(x, y)\,dy = 0$ $(M_y = N_x)$.
General solution $F(x, y) = C$ found by integrating $F_x = M$, $F_y = N$. If
$M_y \neq N_x$, can make exact by means of an integrating factor $\sigma(x)$ if $(M_y - N_x)/N$
is a function of $x$ only, or by an integrating factor $\sigma(y)$ if $(M_y - N_x)/M$ is a
function of $y$ only.
Chapter 3
Linear Differential Equations
of Second Order and Higher
PREREQUISITES: In this chapter on linear differential equations, we encounter
systems of linear algebraic equations, and it is presumed that the reader is familiar
with the theory of the existence and uniqueness of solutions to such equations,
especially as regards the role of the determinant of the coefficient matrix. That
material is covered in Chapters 8-10, but the essential results that are needed for the
present chapter are summarized briefly in Appendix B. Thus, either Sections 8.1–10.6 or Appendix B is a prerequisite for this chapter. Also presumed is a familiarity
with the complex plane and the algebra of complex numbers. That material is
covered in Section 21.2 which, likewise, is a prerequisite for Chapter 3.
3.1
Introduction
As we prepare to move from first-order equations to those of higher order, this is
a good time to pause for an overview that looks back to Chapter 2 and ahead to
Chapters 3-7. If, as you proceed through Chapters 3-7, you lose sight of the forest
for the trees, we urge you to come back to this overview.
LINEAR EQUATIONS
First order:
$$y' + p(x)y = q(x). \tag{1}$$
General solution found [(2.1) in Section 2.2] in explicit form. Existence and
uniqueness of solution of initial-value problem [with $y(a) = b$] guaranteed
over a predetermined interval, based upon the continuity of $p(x)$ and $q(x)$.
Solution of initial-value problem expressible as a superposition of responses
to the two inputs [the initial value $b$ and the forcing function $q(x)$] with each
response being proportional to that input: for example, if we double the input
we double the output.
Second order and higher:
$$a_0(x)\frac{d^n y}{dx^n} + a_1(x)\frac{d^{n-1} y}{dx^{n-1}} + \cdots + a_{n-1}(x)\frac{dy}{dx} + a_n(x)y = f(x). \tag{2}$$
Constant coefficients (the $a_j$’s are constants) and homogeneous ($f = 0$):
This is the simplest case. We will see (Section 3.4) that the general solution
can be found in terms of exponential functions, and perhaps powers of x
times exponential functions.
Constant coefficients and nonhomogeneous:
Additional solution is needed due to the forcing function f(x) and can be
found by the method of undetermined coefficients (Section 3.7.2) or the
method of variation of parameters (Sections 3.7.3 and 3.7.4). Still simple.
An alternative approach, the Laplace transform, is given in Chapter 5.
Nonconstant coefficients:
Essentially, the only simple case is the Cauchy–Euler equation (Section
3.6.1). Other cases are so much more difficult that we give up on finding
closed form solutions and use power series methods (Chapter 4). Two particularly important cases are the Legendre (Section 4.4) and Bessel (Section
4.6) equations, which will be needed later in the chapters on partial differential equations.
NONLINEAR EQUATIONS
First order:
$$y' = f(x, y). \tag{3}$$
No solution available for the general case. Need to identify subcategories
that are susceptible to special solution techniques. The most important of
these subcategories are separable equations (Section 2.4) and exact equations
(Section 2.5), and these methods give solutions in implicit form. Several important but more specialized
cases are given in the exercises: the Bernoulli,
Riccati, d’Alembert–Lagrange, and Clairaut equations in Section 2.2, and
“homogeneous” equations in Section 2.4. The idea of the response being
a superposition of responses, as it is for the linear equation, is not applicable for nonlinear equations. The subcategories and special cases mentioned
above by no means cover all possible equations of the form $y' = f(x, y)$,
so that many first-order nonlinear equations simply are not solvable by any
known means. A powerful alternative to analytical methods [i.e., methods
designed to obtain an analytical expression for $y(x)$] is to seek a solution in
numerical form, with the help of a computational algorithm and a computer,
and these methods are discussed in Chapter 6.
Second order and higher:
Some nonlinear equations of first order can be solved analytically, as we
have seen, but for nonlinear equations of higher order analytical solution is
generally out of the question, and we rely instead upon a blend of numerical
solution (Chapter 6) and qualitative methods, such as the phase plane method
described in Chapter 7.
To get started, we limit our attention in the next several sections to the homogeneous version of the linear equation (2), namely, where $f(x) = 0$, because
that case is simpler and because to solve the nonhomogeneous case we will need to
solve the homogeneous version first, anyhow.
To attach physical significance to the distinction between homogeneous and
nonhomogeneous equations, it may help to recall from Section 1.3 that the differential equation governing a mechanical oscillator is
$$m\frac{d^2x}{dt^2} + c\frac{dx}{dt} + kx = F(t), \tag{4}$$
where $m$, $c$, $k$ are the mass, damping coefficient, and spring stiffness, respectively,
and $F(t)$ is the applied force. (In this case, of course, the variables happen to be $x$
and $t$ rather than $y$ and $x$.) If $F(t) = 0$, then (4) governs the unforced, or “free,”
vibration of the mass $m$. Likewise, for any linear differential equation, if all terms
containing the unknown and its derivatives are moved to the left-hand side, then
whatever is left on the right-hand side is regarded as a “forcing function.” From a
physical point of view then, when we consider the homogeneous case in the next
several sections, we are really limiting our attention to unforced systems.
A brief outline of this chapter follows:
3.2 Linear Dependence and Linear Independence. The concept of a general
solution to a linear differential equation requires the idea of linear dependence and
linear independence, so these ideas are introduced first.
3.3 Homogeneous Equation: General Solution. Here we establish the concept
of a general solution to the homogeneous equation (2), but do not yet show how to
obtain it.
3.4 Solution of Homogeneous Equation: Constant Coefficients. It is shown
how to find the general solution in the form of a linear combination of solutions
that are either exponentials or powers of x times exponentials.
3.5 Application to Harmonic Oscillator: Free Oscillation. The foregoing concepts and methods are applied to an extremely important physical application: the
free oscillation of a harmonic oscillator.
3.6 Solution of Homogeneous Equation: Nonconstant Coefficients.
Nonconstant-coefficient equations can be solved in closed form only in exceptional
cases. The most important such case is the Cauchy—Euler equation, and that case
occupies most of this section.
3.7 Solution of Nonhomogeneous Equation. It is shown how to find the additional solution, due to the forcing function, by the methods of undetermined coefficients and variation of parameters.
3.8 Application to Harmonic Oscillator: Forced Oscillation. We return to
the example of the harmonic oscillator, begun in Section 3.5, and obtain and discuss
the solution for the forced oscillation.
3.9 Systems of Linear Differential Equations. We consider linear systems
of n coupled first-order differential equations on n unknowns and show how to obtain uncoupled nth-order differential equations on each of the n unknowns, which
equations can then be solved by the methods described in the preceding sections of
this chapter.
3.2
Linear Dependence and Linear Independence
Asked how many different paints he had, a painter replied five: red, blue, green,
yellow, and purple. However, it could be argued that the count was inflated since
only three (for instance red, blue, and yellow) are independent: the green can be
obtained from a certain proportion of the blue and the yellow, and the purple can be
obtained from the red and the blue. Similarly, in studying linear differential equations, we will need to determine how many “different,” or “independent,” functions
are contained within a given set of functions. The concept is made precise as follows. We begin by defining a linear combination of a set of functions $f_1, \ldots, f_n$
as any function of the form $\alpha_1 f_1 + \cdots + \alpha_n f_n$, where the $\alpha_j$’s are constants. For
instance, $2\sin x - 7\cos x$ is a linear combination of $\sin x$ and $\cos x$.
DEFINITION 3.2.1 Linear Dependence and Linear Independence
A set of functions $\{u_1, \ldots, u_n\}$ is said to be linearly dependent on an interval $I$
if at least one of them can be expressed as a linear combination of the others on $I$.
If none can be so expressed, then the set is linearly independent.
If we do not specify the interval $I$, then it will be understood to be the entire
$x$ axis. NOTE: Since the terms linearly dependent and linearly independent will
appear repeatedly, it will be convenient to abbreviate them in this book as LD and
LI, respectively, but be aware that this notation is not standard outside of this text.
EXAMPLE 1. The set $\{x^2, e^x, e^{-x}, \sinh x\}$ is seen to be LD (linearly dependent)
because we can express $\sinh x$ as a linear combination of the others:
$$\sinh x = \frac{e^x - e^{-x}}{2} = \frac{1}{2}e^x - \frac{1}{2}e^{-x} + 0x^2. \tag{1}$$
In fact, we could express $e^x$ as a linear combination of the others too, for solving (1) for $e^x$
gives $e^x = 2\sinh x + e^{-x} + 0x^2$. Likewise, we could express $e^{-x} = e^x - 2\sinh x + 0x^2$.
We cannot express $x^2$ as a linear combination of the others [since we cannot solve (1) for
$x^2$], but the set is LD nonetheless, because we only need to be able to express “at least
one” member as a linear combination of the others. NOTE: The hyperbolic sine and cosine
functions, $\sinh x$ and $\cosh x$, were studied in the calculus, but if these functions and their
graphs and properties are not familiar to you, you may wish to turn to the review in Section
3.4.1.
The foregoing example was simple enough to be worked by inspection. In
more complicated cases, the following theorem provides a test for determining
whether a given set is LD or LI.
THEOREM 3.2.1 Test for Linear Dependence/Independence
A finite set of functions $\{u_1, \ldots, u_n\}$ is LD on an interval $I$ if and only if there
exist scalars $\alpha_j$, not all zero, such that
$$\alpha_1 u_1(x) + \alpha_2 u_2(x) + \cdots + \alpha_n u_n(x) = 0 \tag{2}$$
identically on $I$. If (2) is true only if all the $\alpha$’s are zero, then the set is LI on $I$.
Proof: Because of the “if and only if” we need to prove the statement in both directions. First, suppose that the set is LD. Then, according to the definition of linear
dependence, one of the functions, say $u_j$, can be expressed as a linear combination
of the others:
$$u_j(x) = \alpha_1 u_1(x) + \cdots + \alpha_{j-1}u_{j-1}(x) + \alpha_{j+1}u_{j+1}(x) + \cdots + \alpha_n u_n(x), \tag{3}$$
which equation can be rewritten as
$$\alpha_1 u_1(x) + \cdots + \alpha_{j-1}u_{j-1}(x) + (-1)u_j(x) + \alpha_{j+1}u_{j+1}(x) + \cdots + \alpha_n u_n(x) = 0. \tag{4}$$
Even if all the other $\alpha$’s are zero, the coefficient $\alpha_j$ of $u_j(x)$ in (4) is nonzero,
namely, $-1$, so there do exist scalars $\alpha_1, \ldots, \alpha_n$, not all zero, such that (2) holds.
Conversely, suppose that (2) holds with the $\alpha$’s not all zero. If $\alpha_k$, for instance, is nonzero, then (2) can be divided by $\alpha_k$ and solved for $u_k(x)$ as a linear
combination of the other $u$’s, in which case $\{u_1, \ldots, u_n\}$ is LD.
EXAMPLE 2. To determine if the set $\{1, x, x^2\}$ is LD or LI using Theorem 3.2.1, write
equation (2),
$$\alpha_1 + \alpha_2 x + \alpha_3 x^2 = 0, \tag{5}$$
and see if the truth of (5) requires all the $\alpha$’s to be zero. Since (5) needs to hold for all $x$’s
in the interval (which we take to be $-\infty < x < \infty$), let us write it for $x = 0, 1, 2$, say, to
generate three equations on the three $\alpha$’s:
$$\alpha_1 = 0, \qquad \alpha_1 + \alpha_2 + \alpha_3 = 0, \qquad \alpha_1 + 2\alpha_2 + 4\alpha_3 = 0. \tag{6}$$
Solution of (6) gives $\alpha_1 = \alpha_2 = \alpha_3 = 0$, so the set is LI.
In fact, (5) really amounts to an infinite number of linear algebraic equations on the
three $\alpha$’s since there is no limit to the number of $x$ values that could be chosen. However,
three different $x$ values sufficed to establish that all of the $\alpha$’s must be zero.
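The system (6) has only the zero solution because its coefficient determinant is nonzero. The sketch below (an illustration only; the helper names `det` and `sample_matrix` are hypothetical) builds that coefficient matrix from the sample points $x = 0, 1, 2$ and evaluates its determinant by cofactor expansion.

```python
def det(matrix):
    """Determinant by cofactor expansion along the first row (adequate for small matrices)."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * matrix[0][j] * det(minor)
    return total

def sample_matrix(x_values):
    """Row i holds the coefficients of (alpha_1, alpha_2, alpha_3) when (5) is written at x_values[i]."""
    return [[x ** k for k in range(3)] for x in x_values]

system_6 = sample_matrix([0, 1, 2])
print(system_6)       # [[1, 0, 0], [1, 1, 1], [1, 2, 4]]
print(det(system_6))  # 2, nonzero, so alpha_1 = alpha_2 = alpha_3 = 0
```

Choosing the same $x$ value twice would give two identical rows and a zero determinant, which is why the three sample points must be different.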
Alternative to writing out (2) for $n$ specific $x$ values, to generate $n$ equations
on $\alpha_1, \ldots, \alpha_n$, it is more common to generate $n$ such equations by writing (2) and
its first $n - 1$ derivatives (assuming, of course, that $u_1, \ldots, u_n$ are $n - 1$ times
differentiable on $I$),
$$\begin{aligned}
\alpha_1 u_1(x) + \cdots + \alpha_n u_n(x) &= 0,\\
\alpha_1 u_1'(x) + \cdots + \alpha_n u_n'(x) &= 0,\\
&\;\;\vdots\\
\alpha_1 u_1^{(n-1)}(x) + \cdots + \alpha_n u_n^{(n-1)}(x) &= 0.
\end{aligned} \tag{7}$$
Let us denote the determinant of the coefficients as
$$W[u_1, \ldots, u_n](x) = \begin{vmatrix} u_1(x) & \cdots & u_n(x)\\ u_1'(x) & \cdots & u_n'(x)\\ \vdots & & \vdots\\ u_1^{(n-1)}(x) & \cdots & u_n^{(n-1)}(x) \end{vmatrix}, \tag{8}$$
which is known as the Wronskian determinant of $u_1, \ldots, u_n$, or simply the Wronskian of $u_1, \ldots, u_n$, after the Polish mathematician Josef M. H. Wronski (1778–1853). The Wronskian $W$ is itself a function of $x$.
From the theory of linear algebraic equations, we know that if there is any
value of $x$ in $I$, say $x_0$, such that $W[u_1, \ldots, u_n](x_0) \neq 0$, then it follows from (7)
with $x$ set equal to $x_0$, that all the $\alpha$’s must be zero, so the set $\{u_1, \ldots, u_n\}$
is LI.
THEOREM 3.2.2 Wronskian Condition for Linear Independence
If, for a set of functions $\{u_1, \ldots, u_n\}$ having derivatives through order $n - 1$ on an
interval $I$, $W[u_1, \ldots, u_n](x)$ is not identically zero on $I$, then the set is LI on $I$.
Be careful not to read into Theorem 3.2.2 a converse, namely, that if
$W[u_1, \ldots, u_n](x)$ is identically zero on $I$ (which we write as $W = 0$), then the
set is LD on $I$. In fact, the latter is not true, as shown by the following example.
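EXAMPLE 3. Consider the set $\{u_1, u_2\}$, where
$$u_1(x) = \begin{cases} x^2, & x \le 0\\ 0, & x \ge 0, \end{cases} \qquad u_2(x) = \begin{cases} 0, & x \le 0\\ x^2, & x \ge 0. \end{cases} \tag{9}$$
(Sketch their graphs.) Then (2) becomes
$$\alpha_1 x^2 + \alpha_2(0) = 0 \quad \text{for } x \le 0, \qquad \alpha_1(0) + \alpha_2 x^2 = 0 \quad \text{for } x \ge 0.$$
The first implies that $\alpha_1 = 0$, and the second implies that $\alpha_2 = 0$. Hence $\{u_1, u_2\}$ is LI.
Yet, $W[u_1, u_2](x) = \begin{vmatrix} x^2 & 0\\ 2x & 0 \end{vmatrix} = 0$ on $x \le 0$, and $W[u_1, u_2](x) = \begin{vmatrix} 0 & x^2\\ 0 & 2x \end{vmatrix} = 0$ on
$x \ge 0$, so $W[u_1, u_2](x) = 0$ for all $x$.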
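The point of this example, a vanishing Wronskian for a set that is nonetheless LI, can be checked directly. The sketch below (illustrative only) codes the two piecewise functions and their derivatives by hand, evaluates the $2 \times 2$ Wronskian on a grid, and then writes (2) at $x = -1$ and $x = 1$ to show that only $\alpha_1 = \alpha_2 = 0$ is possible. The grid and the two sample points are assumptions of the sketch.

```python
def u1(x):
    return x * x if x <= 0 else 0.0

def du1(x):
    return 2.0 * x if x <= 0 else 0.0

def u2(x):
    return 0.0 if x <= 0 else x * x

def du2(x):
    return 0.0 if x <= 0 else 2.0 * x

def wronskian(x):
    # 2x2 determinant | u1 u2 ; u1' u2' |
    return u1(x) * du2(x) - du1(x) * u2(x)

grid = [k / 4.0 for k in range(-8, 9)]  # x = -2, -1.75, ..., 2
w_values = [wronskian(x) for x in grid]

# Writing (2) at x = -1 and x = 1 gives a 2x2 system for (alpha_1, alpha_2)
# whose coefficient determinant is computed here.
sample_det = u1(-1.0) * u2(1.0) - u2(-1.0) * u1(1.0)
print(max(abs(w) for w in w_values), sample_det)
```

The Wronskian is zero at every grid point, yet `sample_det` equals 1, so the sampled system forces both coefficients to vanish and the set is LI.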
However, our interest in linear dependence and independence, in this chapter,
is not going to be in connection with sets of randomly chosen functions, but with
sets of functions which have in common that they are solutions of a given linear
homogeneous differential equation. In that case, it can be shown that the inverse of
Theorem 3.2.2 is true: that is, if W = 0, then the set is LD. Thus, for that case we
have the following stronger theorem which, for our subsequent purposes, will be
more important to us than Theorem 3.2.2.
THEOREM 3.2.3 A Necessary and Sufficient Condition for Linear Dependence
If $u_1, \ldots, u_n$ are solutions of an $n$th-order linear homogeneous differential equation
$$\frac{d^n y}{dx^n} + p_1(x)\frac{d^{n-1}y}{dx^{n-1}} + \cdots + p_{n-1}(x)\frac{dy}{dx} + p_n(x)y = 0, \tag{10}$$
where the coefficients $p_j(x)$ are continuous on an interval $I$, then $W[u_1, \ldots, u_n](x) = 0$
on $I$ is both necessary and sufficient for the linear dependence of the set
$\{u_1, \ldots, u_n\}$ on $I$.
EXAMPLE 4. It is readily verified that each of the functions $1, e^x, e^{-x}$ satisfies the
equation $y''' - y' = 0$. Since their Wronskian is
$$W[1, e^x, e^{-x}](x) = \begin{vmatrix} 1 & e^x & e^{-x}\\ 0 & e^x & -e^{-x}\\ 0 & e^x & e^{-x} \end{vmatrix} = 2 \neq 0,$$
it follows from Theorem 3.2.3 that the set $\{1, e^x, e^{-x}\}$ is LI. Another set of solutions of
$y''' - y' = 0$ is $e^x, e^{-x}, \cosh x$. Their Wronskian is
$$W[e^x, e^{-x}, \cosh x](x) = \begin{vmatrix} e^x & e^{-x} & \cosh x\\ e^x & -e^{-x} & \sinh x\\ e^x & e^{-x} & \cosh x \end{vmatrix} = 0,$$
so the set $\{e^x, e^{-x}, \cosh x\}$ is LD.
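Both determinants in this example can be evaluated numerically at a few points. The sketch below (an illustration; `det3` and the sample points are hypothetical choices) enters each row of derivatives by hand and expands the $3 \times 3$ determinants by cofactors.

```python
import math

def det3(m):
    """3x3 determinant by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def wronskian_first_set(x):
    # Rows: the functions 1, e^x, e^-x, then their first and second derivatives.
    ex, emx = math.exp(x), math.exp(-x)
    return det3([[1.0, ex, emx],
                 [0.0, ex, -emx],
                 [0.0, ex, emx]])

def wronskian_second_set(x):
    # Rows: the functions e^x, e^-x, cosh x, then their first and second derivatives.
    ex, emx = math.exp(x), math.exp(-x)
    ch, sh = math.cosh(x), math.sinh(x)
    return det3([[ex, emx, ch],
                 [ex, -emx, sh],
                 [ex, emx, ch]])

points = (-2.0, 0.0, 1.5)
first = [wronskian_first_set(x) for x in points]
second = [wronskian_second_set(x) for x in points]
print(first, second)
```

Up to rounding, the first list contains the value 2 at every point and the second contains 0, in agreement with the hand calculation.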
In connection with Theorem 3.2.3, it would be natural to wonder if $W$ could be
zero for some $x$’s and nonzero for others. Subject to the conditions of that theorem,
it can be shown (Exercise 5) that
$$W(x) = W(\xi)\exp\left[-\int_\xi^x p_1(t)\,dt\right], \tag{11}$$
where $\xi$ is any point in the interval and $p_1$ is the coefficient of the next-to-highest
derivative in (10), and where we have written $W[u_1, \ldots, u_n](x)$ as $W(x)$, and
$W[u_1, \ldots, u_n](\xi)$ as $W(\xi)$, for brevity. Due to the French mathematician Joseph
Liouville (1809–1882), and known as Liouville’s formula, (11) shows that under
the conditions of Theorem 3.2.3 the Wronskian is either everywhere zero or everywhere nonzero, for the exponential function is positive for all finite values of its
argument and the constant $W(\xi)$ is either zero or not. This fact is illustrated by the
two Wronskians in Example 4.
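Liouville's formula (11) can be tested on a simple case. The sketch below assumes the illustrative equation $y'' - 3y' + 2y = 0$ (not taken from the text), with solutions $e^x$ and $e^{2x}$ and $p_1 = -3$, and compares the directly computed Wronskian with the right-hand side of (11). The integral is evaluated by Simpson's rule so that the same code would accept a nonconstant $p_1$. All names and sample points are assumptions of the sketch.

```python
import math

def p1(t):
    return -3.0  # coefficient of y' in y'' - 3y' + 2y = 0

def wronskian(x):
    # u1 = e^x, u2 = e^(2x); W = u1 u2' - u1' u2
    u1, du1 = math.exp(x), math.exp(x)
    u2, du2 = math.exp(2.0 * x), 2.0 * math.exp(2.0 * x)
    return u1 * du2 - du1 * u2

def simpson(f, a, b, n=200):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0

def liouville(x, xi):
    # Right-hand side of (11).
    return wronskian(xi) * math.exp(-simpson(p1, xi, x))

pairs = [(0.0, 1.0), (-0.5, 0.7), (1.2, -0.3)]  # (xi, x) pairs, chosen arbitrarily
rel_errors = [abs(liouville(x, xi) - wronskian(x)) / abs(wronskian(x))
              for xi, x in pairs]
print(rel_errors)
```

Here $W(x) = e^{3x}$, which is nowhere zero, in line with the statement that the Wronskian of solutions is either everywhere zero or everywhere nonzero.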
Finally, it is useful to cite the following three simple results, proofs of which
are left for the exercises.
THEOREM 3.2.4 Linear Dependence/Independence of Two Functions
A set of two functions, $\{u_1, u_2\}$, is LD if and only if one is expressible as a scalar
multiple of the other.
THEOREM 3.2.5 Linear Dependence of Sets Containing the Zero Function
If a set $\{u_1, \ldots, u_n\}$ contains the zero function [that is, $u_j(x) = 0$ for some $j$],
then the set is LD.
THEOREM 3.2.6 Equating Coefficients
Let $\{u_1, \ldots, u_n\}$ be LI on an interval $I$. Then, for
$$a_1 u_1(x) + \cdots + a_n u_n(x) = b_1 u_1(x) + \cdots + b_n u_n(x)$$
to hold on $I$, it is necessary and sufficient that $a_j = b_j$ for each $j = 1, \ldots, n$. That
is, the coefficients of corresponding terms on the left- and right-hand sides must
match.
EXAMPLE 5. The set $\{x, \sin x\}$ is LI on $-\infty < x < \infty$ according to Theorem 3.2.4
because $x$ is surely not expressible as a constant times $\sin x$ (for $x/\sin x$ is not a constant),
nor is $\sin x$ expressible as a constant times $x$.
EXAMPLE 6. We’ve seen that $\{1, e^x, e^{-x}\}$ is LI on $-\infty < x < \infty$. Thus, if we meet
the equation
$$a + be^x + ce^{-x} = 6 - 2e^{-x}, \tag{12}$$
then it follows from Theorem 3.2.6 that we must have $a = 6$, $b = 0$, $c = -2$, for if we
rewrite (12) as
$$(a - 6)(1) + be^x + (c + 2)e^{-x} = 0,$$
then it follows from the linear independence of $1, e^x, e^{-x}$ that $a - 6 = 0$, $b = 0$, $c + 2 = 0$;
that is, $a = 6$, $b = 0$, $c = -2$.
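The conclusion of this example can be reproduced numerically by writing (12) at three sample points and solving the resulting $3 \times 3$ system. The sketch below is an illustration only: the sample points $x = -1, 0, 1$, the helper names, and the use of Cramer's rule are choices made here, not part of the text.

```python
import math

def det3(m):
    """3x3 determinant by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(A, rhs):
    """Solve a 3x3 linear system by Cramer's rule (assumes det3(A) is nonzero)."""
    d = det3(A)
    solution = []
    for col in range(3):
        replaced = [row[:col] + [rhs[r]] + row[col + 1:] for r, row in enumerate(A)]
        solution.append(det3(replaced) / d)
    return solution

xs = [-1.0, 0.0, 1.0]
A = [[1.0, math.exp(x), math.exp(-x)] for x in xs]  # coefficients of a, b, c in (12)
rhs = [6.0 - 2.0 * math.exp(-x) for x in xs]        # right-hand side of (12)
a, b, c = solve3(A, rhs)
print(a, b, c)
```

Up to rounding, the computed values agree with $a = 6$, $b = 0$, $c = -2$. The nonzero determinant of `A` plays the same role here as the linear independence of $1, e^x, e^{-x}$ in the argument above.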
Closure. We have introduced the concept of linear dependence and linear independence as preliminary to our development of the theory of linear differential
equations, which follows next. Following the definitions of these terms, we gave
three theorems for the testing of a given set of functions to determine if they are LI
or LD. Of these, Theorem 3.2.3 will be most useful to us in the sections to follow
because it applies to sets of functions that arise as solutions of a given differential
equation.
In case you have trouble remembering which of the conditions $W = 0$ and
$W \neq 0$ corresponds to linear dependence and which to linear independence, think
of it this way. If we randomly make up a determinant, the chances are that its
value is nonzero; that is the generic case. Likewise, if we randomly select a set of
functions out of the set of all possible functions, the generic case is for them to be
unrelated, namely, LI. The generic cases go together ($W \neq 0$ corresponding to
linear independence) and the nongeneric cases go together ($W = 0$ corresponding
to linear dependence).
The concept of linear dependence and independence will prove to be important
to us later as well, when we study n-dimensional vector spaces and the expansion
of a given vector in terms of a set of base vectors.
EXERCISES 3.2

1. (a) Can a set be neither LD nor LI? Explain.
(b) Can a set be both LD and LI? Explain.

2. Show that the following sets are LD by expressing one of the functions as a linear combination of the others.
(a) $\{1,\ x+2,\ 3x-5\}$
(b) $\{x^2,\ x^2+x,\ x^2+x+1,\ x-1\}$
(c) {at +a° 41,24 -2? +1, 27-2? -1}
(d) $\{e^x,\ e^{2x},\ \sinh x,\ \cosh x\}$
(e) {sinh 3z, e*, e3”, e®*,e3*}
(f) $\{e^x,\ e^{2x},\ xe^x,\ (7x-2)e^x\}$
(g) $\{0,\ x,\ x^3\}$
(h) $\{x,\ 2x,\ x^2\}$

3. Show whether the given set is LD or LI. HINT: In most cases, the brief discussion of determinants given in Appendix B will suffice; in others, you will need to use known properties of determinants given in Section 10.4. Also, note that the Maple command for calculating determinants (the elements of which need not be constants) is given at the end of Section 10.4.
(a) $\{1,\ x,\ x^2,\ \ldots,\ x^n\}$
(b) $\{e^{a_1x},\ e^{a_2x},\ \ldots,\ e^{a_nx}\}$
(c) $\{1,\ 1+x,\ 1+x^2\}$
(e) $\{\sin x,\ \cos x,\ \sinh x\}$
(g) $\{1,\ \sin 3x\}$
(i) $\{x,\ 1+x,\ e^x\}$

4. Verify that each of the given functions is a solution of the given differential equation, and then use Theorem 3.2.3 to determine if the set is LD or LI. As a check, use Theorem 3.2.4 if that theorem applies.
(a) $y''' - 6y'' + 11y' - 6y = 0$, $\{e^x,\ e^{2x},\ e^{3x}\}$
(b) $y'' + 4y = 0$, $\{\sin 2x,\ \cos 2x\}$
(c) $y''' - 6y'' + 9y' - 4y = 0$, $\{e^x,\ xe^x,\ e^{4x}\}$
(d) $y''' - 6y'' + 9y' - 4y = 0$, $\{e^x,\ xe^x,\ (1-x)e^x\}$
(e) $y''' - y'' - 2y' = 0$, $\{1,\ e^{-x},\ e^{2x}\}$
(f) $y'''' - 5y'' + 4y = 0$, $\{e^x,\ e^{-x},\ e^{2x},\ e^{-2x}\}$
(g) $x^2y'' - 3xy' + 3y = 0$, $\{x,\ x^3\}$, on $x > 0$
(h) $x^2y'' - 3xy' + 4y = 0$, $\{x^2,\ x^2\ln x\}$, on $x > 0$
(i) $x^3y''' + xy' - y = 0$, $\{x,\ x\ln x,\ x(\ln x)^2\}$, on $x > 0$
(j) $y'' - 4y' + 4y = 0$, $\{e^{2x},\ xe^{2x}\}$

5. (Liouville's formula) (a) Derive Liouville's formula, (11), for the special case where $n = 2$, by writing out $W'(x)$, showing that
$$W'(x) = -p_1(x)W(x), \tag{5.1}$$
and integrating the latter to obtain (11).
(b) Derive (11) for the general case (i.e., where $n$ need not equal 2), by showing that $W'(x)$ is the sum of $n$ determinants where the $j$th one is obtained from the $W$ determinant by differentiating the $j$th row and leaving the other rows unchanged. Show that each of these $n$ determinants, except the $n$th one, has two identical rows and hence vanishes, so that
$$W'(x) = \begin{vmatrix} u_1(x) & \cdots & u_n(x) \\ u_1'(x) & \cdots & u_n'(x) \\ \vdots & & \vdots \\ u_1^{(n-2)}(x) & \cdots & u_n^{(n-2)}(x) \\ u_1^{(n)}(x) & \cdots & u_n^{(n)}(x) \end{vmatrix}. \tag{5.2}$$
In the last row, substitute $u_j^{(n)}(x) = -p_1(x)u_j^{(n-1)}(x) - \cdots - p_n(x)u_j(x)$ from (10), again omit vanishing determinants, and again obtain (5.1) and hence the solution (11). HINT: You may use the various properties of determinants, given in Section 10.4.

6. (a) Prove Theorem 3.2.4.
(b) Prove Theorem 3.2.5.
(c) Prove Theorem 3.2.6.

7. If $u_1$ and $u_2$ are LI, $u_1$ and $u_3$ are LI, and $u_2$ and $u_3$ are LI, does it follow that $\{u_1, u_2, u_3\}$ is LI? Prove or disprove. HINT: If a proposition is false it can be disproved by a single counterexample, but if it is true then a single example does not suffice as proof.

8. Verify that $x^2$ and $x^3$ are solutions of $x^2y'' - 4xy' + 6y = 0$ on $-\infty < x < \infty$. Also verify, from Theorem 3.2.4, that they are LI on that interval. Does the fact that their Wronskian $W[x^2, x^3](x) = x^4$ vanishes at $x = 0$, together with their linear independence on $-\infty < x < \infty$, violate Theorem 3.2.3? Explain.
3.3. Homogeneous Equation: General Solution
3.3.1. General solution. We studied the first-order linear homogeneous equation
$$y' + p(x)y = 0 \tag{1}$$
in Chapter 2, where $p(x)$ is continuous on the $x$ interval of interest, $I$, and found
the solution to be
$$y(x) = Ce^{-\int p(x)\,dx}, \tag{2}$$
where $C$ is an arbitrary constant. If we append to (1) an initial condition $y(a) = b$,
where $a$ is a point in $I$, then we obtain, from (2),
$$y(x) = b\,e^{-\int_a^x p(\xi)\,d\xi} \tag{3}$$
as was shown in Section 2.2.
The solution (2) is really a family of solutions because of the arbitrary constant
$C$. We showed that (2) contains all solutions of (1), so we called it a general
solution of (1). In contrast, (3) was only one member of that family, so we called it
a particular solution.
Now we turn to the nth-order linear equation
$$\frac{d^ny}{dx^n} + p_1(x)\frac{d^{n-1}y}{dx^{n-1}} + \cdots + p_{n-1}(x)\frac{dy}{dx} + p_n(x)y = 0, \tag{4}$$
and once again we are interested in general and particular solutions. By a general
solution of (4), on an interval $I$, we mean a family of solutions that contains every
solution of (4) on that interval, and by a particular solution of (4), we mean any
one member of that family of solutions.
We begin with a fundamental existence and uniqueness theorem.*
THEOREM 3.3.1 Existence and Uniqueness for Initial-Value Problem
If $p_1(x), \ldots, p_n(x)$ are continuous on a closed interval $I$, then the initial-value
problem consisting of the differential equation
$$\frac{d^ny}{dx^n} + p_1(x)\frac{d^{n-1}y}{dx^{n-1}} + \cdots + p_{n-1}(x)\frac{dy}{dx} + p_n(x)y = 0, \tag{5a}$$
together with initial conditions
$$y(a) = b_1, \quad y'(a) = b_2, \quad \ldots, \quad y^{(n-1)}(a) = b_n, \tag{5b}$$
where the initial point $a$ is in $I$, has a solution on $I$, and that solution is unique.

*For a more complete sequence of theorems, and their proofs, we refer the interested reader to the
little book by J. C. Burkill, The Theory of Ordinary Differential Equations (Edinburgh: Oliver and
Boyd, 1956) or to William E. Boyce and Richard C. DiPrima, Elementary Differential Equations and
Boundary Value Problems, 5th ed. (New York: Wiley, 1992).
Notice how the initial conditions listed in (5b) are perfect (not too few and not
too many) in narrowing the general solution of (5a) down to a unique particular
solution, for (5a) gives $y^{(n)}(x)$ as a linear combination of $y^{(n-1)}(x), \ldots, y(x)$, the
derivative of (5a) gives $y^{(n+1)}(x)$ as a linear combination of $y^{(n)}(x), \ldots, y(x)$,
and so on. Thus, knowing $y(a), \ldots, y^{(n-1)}(a)$ we can use the differential equation
(5a) and its derivatives to compute $y^{(n)}(a)$, $y^{(n+1)}(a)$, and so on, and therefore to
develop the Taylor series of $y(x)$ about the point $a$; that is, to determine $y(x)$.
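The Taylor series idea just described can be tried numerically. The following is only a minimal sketch under a simplifying assumption that is not in the text: the equation is second order with constant coefficients, $y'' + a_1y' + a_2y = 0$, so that differentiating the equation $k$ times gives the simple recurrence $y^{(k+2)}(a) = -a_1y^{(k+1)}(a) - a_2y^{(k)}(a)$. The function names `taylor_derivatives` and `taylor_value` are hypothetical, chosen for illustration.

```python
from math import factorial, sin

def taylor_derivatives(a1, a2, y0, yp0, order):
    """Return [y(a), y'(a), ..., y^(order)(a)] for y'' + a1*y' + a2*y = 0.

    Assumes constant coefficients a1, a2 (an illustrative special case).
    """
    derivs = [y0, yp0]
    for k in range(order - 1):
        # Differentiating the equation k times: y^(k+2) = -a1*y^(k+1) - a2*y^(k)
        derivs.append(-a1 * derivs[k + 1] - a2 * derivs[k])
    return derivs

def taylor_value(derivs, a, x):
    """Evaluate the truncated Taylor series about the point a."""
    return sum(d * (x - a) ** k / factorial(k) for k, d in enumerate(derivs))

# y'' + y = 0 with y(0) = 0, y'(0) = 1, whose exact solution is sin x.
derivs = taylor_derivatives(0.0, 1.0, 0.0, 1.0, order=5)
approx = taylor_value(derivs, 0.0, 0.5)
exact = sin(0.5)
```

For this sample problem the derivatives at $0$ cycle through $0, 1, 0, -1$, which are the familiar coefficients of the sine series, and the fifth-order polynomial already agrees closely with $\sin 0.5$.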
Let us leave the initial-value problem (5) now, and turn our attention to determining the nature of the general solution of the nth-order linear homogeneous
equation (4). We begin by re-expressing (4) in the compact form
$$L[y] = 0, \tag{6}$$
where
$$L = \frac{d^n}{dx^n} + p_1(x)\frac{d^{n-1}}{dx^{n-1}} + \cdots + p_{n-1}(x)\frac{d}{dx} + p_n(x) \tag{7}$$
is called an $n$th-order differential operator and
$$L[y] = \left(\frac{d^n}{dx^n} + p_1(x)\frac{d^{n-1}}{dx^{n-1}} + \cdots + p_n(x)\right)[y] = \frac{d^n}{dx^n}y(x) + p_1(x)\frac{d^{n-1}}{dx^{n-1}}y(x) + \cdots + p_n(x)y(x) \tag{8}$$
defines the action of $L$ on any $n$-times differentiable function $y$. $L[y]$ is itself
a function of $x$, with values $L[y](x)$. For instance, if $n = 2$, $p_1(x) = \sin x$,
$p_2(x) = 5x$, and $y(x) = x^2$, then $L[y](x) = (x^2)'' + (\sin x)(x^2)' + 5x(x^2) = 2 + 2x\sin x + 5x^3$.
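The key property of the operator $L$ defined by (8) is that
$$L[\alpha u + \beta v] = \alpha L[u] + \beta L[v] \tag{9}$$
for any ($n$-times differentiable) functions $u$, $v$ and any constants $\alpha$, $\beta$. Let us verify
(9) for the representative case where $L$ is of second order:
$$\begin{aligned}
L[\alpha u + \beta v] &= \left(\frac{d^2}{dx^2} + p_1\frac{d}{dx} + p_2\right)(\alpha u + \beta v) \\
&= (\alpha u + \beta v)'' + p_1(\alpha u + \beta v)' + p_2(\alpha u + \beta v) \\
&= \alpha u'' + \beta v'' + p_1\alpha u' + p_1\beta v' + p_2\alpha u + p_2\beta v \\
&= \alpha\left(u'' + p_1u' + p_2u\right) + \beta\left(v'' + p_1v' + p_2v\right) \\
&= \alpha L[u] + \beta L[v].
\end{aligned} \tag{10}$$
Similarly for $n \geq 3$.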
Recall that the differential equation (4) was classified as linear. Likewise, the
corresponding operator L given by (8) is said to be a linear differential operator.
The key and defining feature of a linear differential operator is the linearity property
(9), which will be of great importance to us.
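As a quick numerical illustration of the linearity property (9), the sketch below uses the sample second-order operator mentioned earlier, with $p_1(x) = \sin x$ and $p_2(x) = 5x$. It is not a general implementation: each function is supplied by hand as a triple (function, first derivative, second derivative), and the names `apply_L` and `combine`, the test functions $u = x^2$, $v = e^x$, and the constants $\alpha = 3$, $\beta = -2$ are all illustrative assumptions.

```python
from math import sin, exp

def apply_L(y, dy, d2y):
    """Return the function x -> L[y](x) = y'' + (sin x) y' + 5x y."""
    return lambda x: d2y(x) + sin(x) * dy(x) + 5 * x * y(x)

def combine(a, f, b, g):
    """Return the function x -> a*f(x) + b*g(x)."""
    return lambda x: a * f(x) + b * g(x)

# u = x^2 and v = e^x, each with its first and second derivative.
u = (lambda x: x**2, lambda x: 2 * x, lambda x: 2.0)
v = (exp, exp, exp)
alpha, beta = 3.0, -2.0

# alpha*u + beta*v, differentiated term by term.
w = tuple(combine(alpha, f, beta, g) for f, g in zip(u, v))

x0 = 0.7
lhs = apply_L(*w)(x0)                                     # L[alpha u + beta v]
rhs = alpha * apply_L(*u)(x0) + beta * apply_L(*v)(x0)    # alpha L[u] + beta L[v]
Lu_expected = 2 + 2 * x0 * sin(x0) + 5 * x0**3            # worked value for y = x^2
```

The two sides agree to rounding error at the sample point, and `apply_L(*u)(x0)` reproduces the value $2 + 2x\sin x + 5x^3$ computed by hand for $y = x^2$.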
In fact, (9) holds not just for two functions $u$ and $v$, but for any finite number
of functions, say $u_1, \ldots, u_k$. That is,
$$L[\alpha_1u_1 + \cdots + \alpha_ku_k] = \alpha_1L[u_1] + \cdots + \alpha_kL[u_k] \tag{11}$$
for any functions $u_1, \ldots, u_k$ and any constants $\alpha_1, \ldots, \alpha_k$. (Of course it should be
understood, whether we say so explicitly or not, that $u_1, \ldots, u_k$ must be $n$ times
differentiable since they are being operated on by the $n$th-order differential operator
$L$.) To prove (11) we apply (9) step by step. For instance, if $k = 3$ we have
$$\begin{aligned}
L[\alpha_1u_1 + \alpha_2u_2 + \alpha_3u_3] &= L[\alpha_1u_1 + 1\,(\alpha_2u_2 + \alpha_3u_3)] \\
&= \alpha_1L[u_1] + 1\,L[\alpha_2u_2 + \alpha_3u_3] && \text{from (9)} \\
&= \alpha_1L[u_1] + \alpha_2L[u_2] + \alpha_3L[u_3] && \text{from (9) again.}
\end{aligned}$$
From (11) we have the following superposition theorem:
THEOREM 3.3.2 Superposition of Solutions of (4)
If $y_1, \ldots, y_k$ are solutions of (4), then $C_1y_1 + \cdots + C_ky_k$ is too, for any constants
$C_1, \ldots, C_k$.
Proof: It follows from (11) that
$$L[C_1y_1 + \cdots + C_ky_k] = C_1L[y_1] + \cdots + C_kL[y_k] = C_1(0) + \cdots + C_k(0) = 0.$$
EXAMPLE 1. Superposition. It is readily verified, by direct substitution, that $y_1 = e^{3x}$
and $y_2 = e^{-3x}$ are solutions of the equation
$$y'' - 9y = 0. \tag{12}$$
(We are not yet concerned with how to find such solutions; we will come to that in the next
section.) Thus $y = C_1e^{3x} + C_2e^{-3x}$ is also a solution, as can be verified by substituting it
into (12), for any constants $C_1$, $C_2$. ■
To emphasize that the theorem does not hold for nonlinear or nonhomogeneous
equations, we offer two counter-examples:
EXAMPLE 2. It can be verified that $y_1 = 1$ and $y_2 = x^2$ are solutions of the nonlinear
equation $x^3y'' - yy' = 0$, yet their linear combination $4 + 3x^2$ is not. ■
EXAMPLE 3. It can be verified that $y_1 = 4e^{3x} - 2$ and $y_2 = e^{3x} - 2$ are solutions of
the nonhomogeneous equation $y'' - 9y = 18$, yet their sum $5e^{3x} - 4$ is not. ■
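The two counter-examples can be checked by plugging the functions into the equations and looking at the residual (left side minus right side). The sketch below does this at a single sample point, with all derivatives entered by hand; the helper names `res_nonlinear` and `res_nonhomogeneous` and the sample point `x0 = 1.5` are illustrative assumptions, not part of the text.

```python
from math import exp

def res_nonlinear(y, dy, d2y, x):
    """Residual x^3 y'' - y y' of the nonlinear equation in Example 2."""
    return x**3 * d2y(x) - y(x) * dy(x)

def res_nonhomogeneous(y, d2y, x):
    """Residual y'' - 9y - 18 of the nonhomogeneous equation in Example 3."""
    return d2y(x) - 9 * y(x) - 18

x0 = 1.5

# Example 2: y1 = 1 and y2 = x^2 are solutions; 4*y1 + 3*y2 = 4 + 3x^2 is not.
r_y1 = res_nonlinear(lambda x: 1.0, lambda x: 0.0, lambda x: 0.0, x0)
r_y2 = res_nonlinear(lambda x: x**2, lambda x: 2 * x, lambda x: 2.0, x0)
r_combo = res_nonlinear(lambda x: 4 + 3 * x**2, lambda x: 6 * x, lambda x: 6.0, x0)

# Example 3: y1 = 4e^{3x} - 2 and y2 = e^{3x} - 2 are solutions; their sum is not.
s_y1 = res_nonhomogeneous(lambda x: 4 * exp(3 * x) - 2, lambda x: 36 * exp(3 * x), x0)
s_y2 = res_nonhomogeneous(lambda x: exp(3 * x) - 2, lambda x: 9 * exp(3 * x), x0)
s_sum = res_nonhomogeneous(lambda x: 5 * exp(3 * x) - 4, lambda x: 45 * exp(3 * x), x0)
```

The residuals of the individual solutions vanish (to rounding error), while the combination in Example 2 leaves a nonzero residual and the sum in Example 3 overshoots the right-hand side by exactly 18.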
We can now prove the following fundamental result.
THEOREM 3.3.3 General Solution of (4)
Let $p_1(x), \ldots, p_n(x)$ be continuous on an open interval $I$. Then the $n$th-order
linear homogeneous differential equation (4) admits exactly $n$ LI solutions; that is,
at least $n$ and no more than $n$. If $y_1(x), \ldots, y_n(x)$ is such a set of LI solutions on
$I$, then the general solution of (4), on $I$, is
$$y(x) = C_1y_1(x) + \cdots + C_ny_n(x), \tag{13}$$
where $C_1, \ldots, C_n$ are arbitrary constants.
Proof: To show that (4) has no more than $n$ LI solutions, suppose that $y_1(x), \ldots, y_m(x)$
are solutions of (4), where $m > n$. Let $\xi$ be some point in $I$. The $n$ linear
algebraic equations
$$\begin{aligned}
c_1y_1(\xi) + \cdots + c_my_m(\xi) &= 0 \\
&\;\,\vdots \\
c_1y_1^{(n-1)}(\xi) + \cdots + c_my_m^{(n-1)}(\xi) &= 0
\end{aligned} \tag{14}$$
in the $m$ unknown $c$'s have nontrivial solutions because $m > n$. Choosing such a
nontrivial set of $c$'s, define
$$v(x) \equiv c_1y_1(x) + \cdots + c_my_m(x), \tag{15}$$
and observe first that
$$L[v] = L[c_1y_1 + \cdots + c_my_m] = c_1L[y_1] + \cdots + c_mL[y_m] = c_1(0) + \cdots + c_m(0) = 0, \tag{16}$$
where $L$ is the differential operator in (4) and, second, that $v(\xi) = v'(\xi) = \cdots = v^{(n-1)}(\xi) = 0$.
One function $v(x)$ that satisfies $L[v] = 0$ and $v(\xi) = \cdots = v^{(n-1)}(\xi) = 0$ is simply $v(x) = 0$.
By the uniqueness part of Theorem 3.3.1 it is
the only such function, so $v(x) = 0$. Recalling that the $c$'s in $v(x) = c_1y_1(x) + \cdots + c_my_m(x) = 0$
are not all zero, it follows that $y_1(x), \ldots, y_m(x)$ must be LD.
Thus, (4) cannot have more than $n$ LI solutions.
To show that there are indeed $n$ LI solutions of (4), let us put forward $n$ such
solutions. According to the existence part of Theorem 3.3.1, there must be solutions
$y_1(x), \ldots, y_n(x)$ of (4) satisfying the initial conditions
$$\begin{aligned}
y_1(a) = \alpha_{11}, \quad y_1'(a) = \alpha_{12}, \quad &\ldots, \quad y_1^{(n-1)}(a) = \alpha_{1n}, \\
&\;\,\vdots \\
y_n(a) = \alpha_{n1}, \quad y_n'(a) = \alpha_{n2}, \quad &\ldots, \quad y_n^{(n-1)}(a) = \alpha_{nn},
\end{aligned} \tag{17}$$
where $a$ is any chosen point in $I$ and the $\alpha$'s are any chosen numbers such that their
determinant is nonzero. (For instance, one such set of $\alpha$'s is given by $\alpha_{ii} = 1$ for
each $i = 1$ through $n$ and $\alpha_{ij} = 0$ whenever $i \neq j$.) According to Theorem 3.2.3,
$y_1(x), \ldots, y_n(x)$ must be LI since their Wronskian is nonzero at $x = a$. Thus,
there are indeed $n$ LI solutions of (4).
Finally, every solution of (4) must be expressible as a linear combination of
those $n$ LI solutions, as in (13), for otherwise there would be more than $n$ LI solutions
of (4). ■
Any such set of n LI solutions is called a basis, or fundamental set, of solutions of the differential equation.
EXAMPLE 4. Suppose we begin writing solutions of the equation $y'' - 9y = 0$,
from Example 1: $e^{3x}$, $5e^{3x}$, $e^{-3x}$, $2e^{3x} + e^{-3x}$, $\sinh 3x$, $\cosh 3x$, $e^{3x} - 4\cosh 3x$, and so
on. (That each is a solution is easily verified.) From among these we can indeed find
two that are LI, but no more than two. For instance, $\{e^{3x}, e^{-3x}\}$, $\{e^{3x}, 2e^{3x} + e^{-3x}\}$,
$\{e^{3x}, \sinh 3x\}$, $\{\sinh 3x, \cosh 3x\}$, $\{\sinh 3x, e^{-3x}\}$ are bases, so the general solution can
be expressed in any of these ways:
$$y(x) = C_1e^{3x} + C_2e^{-3x}, \tag{18a}$$
$$y(x) = C_1e^{3x} + C_2\left(2e^{3x} + e^{-3x}\right), \tag{18b}$$
$$y(x) = C_1e^{3x} + C_2\sinh 3x, \tag{18c}$$
$$y(x) = C_1\sinh 3x + C_2\cosh 3x, \tag{18d}$$
and so on. Each of these is a general solution of $y'' - 9y = 0$, and all of them are equivalent.
For instance, (18a) implies (18d) because
$$\begin{aligned}
y(x) &= C_1e^{3x} + C_2e^{-3x} \\
&= C_1(\cosh 3x + \sinh 3x) + C_2(\cosh 3x - \sinh 3x) \\
&= (C_1 + C_2)\cosh 3x + (C_1 - C_2)\sinh 3x \\
&= C_1'\cosh 3x + C_2'\sinh 3x,
\end{aligned}$$
where $C_1' = C_1 + C_2$ and $C_2' = C_1 - C_2$ are arbitrary constants. ■
EXAMPLE 5. Solve the initial-value problem
$$y''' + y' = 0 \tag{19a}$$
$$y(0) = 3, \quad y'(0) = 5, \quad y''(0) = -4. \tag{19b}$$
A general solution of (19a) is
$$y(x) = C_1\cos x + C_2\sin x + C_3, \tag{20}$$
because $\cos x$, $\sin x$, and $1$ are LI solutions of (19a). Imposing (19b) gives
$$\begin{aligned}
y(0) &= 3 = C_1 + C_3, \\
y'(0) &= 5 = C_2, \\
y''(0) &= -4 = -C_1,
\end{aligned}$$
which equations admit the unique solution $C_1 = 4$, $C_2 = 5$, $C_3 = -1$. Thus,
$$y(x) = 4\cos x + 5\sin x - 1$$
is the unique solution to the initial-value problem (19). ■
3.3.2. Boundary-value problems. It must be remembered that the existence and
uniqueness theorem, Theorem 3.3.1, is for initial-value problems. Though most of
our interest is in initial-value problems, one also encounters problems of boundary-value
type, where conditions are specified at two points, normally the ends of the
interval $I$. Not only are boundary-value problems not covered by Theorem 3.3.1,
it is striking that boundary-value problems need not have unique solutions. In fact,
they may have no solution, a unique solution, or a nonunique solution, as shown
by the following example.
EXAMPLE 6. Boundary-Value Problem. It is readily verified that the differential equation
$$y'' + y = 0 \tag{21}$$
admits a general solution
$$y(x) = C_1\cos x + C_2\sin x. \tag{22}$$
Consider three different sets of boundary values.
Case 1: $y(0) = 2$, $y(\pi) = 1$. Then
$$y(0) = 2 = C_1 + 0, \qquad y(\pi) = 1 = -C_1 + 0,$$
which has no solution for $C_1$, $C_2$, so the boundary-value problem has no solution for $y(x)$.
Case 2: $y(0) = 2$, $y(\pi/2) = 3$. Then
$$y(0) = 2 = C_1 + 0, \qquad y(\pi/2) = 3 = 0 + C_2,$$
so $C_1 = 2$, $C_2 = 3$, and the boundary-value problem has the unique solution $y(x) = 2\cos x + 3\sin x$.
Case 3: $y(0) = 2$, $y(\pi) = -2$. Then
$$y(0) = 2 = C_1 + 0, \qquad y(\pi) = -2 = -C_1 + 0,$$
so $C_1 = 2$, and $C_2$ is arbitrary, and the boundary-value problem has the nonunique solution
(indeed, the infinity of solutions) $y(x) = 2\cos x + C_2\sin x$, where $C_2$ is an arbitrary
constant. ■
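The three cases in Example 6 amount to asking whether a $2 \times 2$ linear system for $C_1$, $C_2$ has no solution, one solution, or infinitely many. The following sketch classifies a boundary-value problem for $y'' + y = 0$ with conditions $y(x_a) = y_a$, $y(x_b) = y_b$ in that way. The function name `classify_bvp` and the tolerance `tol` are assumptions made for illustration; the tolerance is needed because, for instance, `sin(pi)` is only approximately zero in floating point.

```python
from math import cos, sin, pi

def classify_bvp(xa, ya, xb, yb, tol=1e-12):
    """Classify y'' + y = 0, y(xa) = ya, y(xb) = yb using y = C1 cos x + C2 sin x.

    Returns ("unique", (C1, C2)), ("nonunique", None), or ("none", None).
    """
    # Determinant of [[cos xa, sin xa], [cos xb, sin xb]]
    det = cos(xa) * sin(xb) - sin(xa) * cos(xb)
    if abs(det) > tol:
        # Cramer's rule
        c1 = (ya * sin(xb) - yb * sin(xa)) / det
        c2 = (cos(xa) * yb - cos(xb) * ya) / det
        return "unique", (c1, c2)
    # Singular system: the rows are proportional, so the system is consistent
    # only if the right-hand sides are in the same proportion.
    consistent = (abs(cos(xa) * yb - cos(xb) * ya) < tol
                  and abs(sin(xa) * yb - sin(xb) * ya) < tol)
    return ("nonunique" if consistent else "none"), None

case1 = classify_bvp(0.0, 2.0, pi, 1.0)
case2 = classify_bvp(0.0, 2.0, pi / 2, 3.0)
case3 = classify_bvp(0.0, 2.0, pi, -2.0)
```

With the boundary data of Cases 1 to 3 the sketch reports no solution, the unique constants $C_1 = 2$, $C_2 = 3$, and a nonunique solution, respectively.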
$\ldots, y_n(x)$, thanks to
the linearity of $L$, namely, the property of $L$ that
$$L[\alpha u + \beta v] = \alpha L[u] + \beta L[v]$$
of n LI solutions {y,..
for brevity) for that equation.
Theorems 3.3.1 and 3.3.3 are especially important.
EXERCISES 3.3
NOTE: If not specified, the interval $I$ is understood to be the
entire $x$ axis.
1. Show whether or not each of the following is a general solution to the equation given.
(a)y” —3y' + 2y= 0;
(b) y” — 3y' + 2y =0;
(ce)
yy!
—2y=0;
dy" ~y!—2y=0;
(e) y'"” + dy’ = 0;
(f)
yl"
+
4y!
= 0;
Cie”
—e”)
+ Cre*
Ci(e~* + e?*)
Cre~*+ Coe**
Cy cos 2a + Cy sin 22
Cy + Cocos 2x + Cs sin 22
(g)yy"—2y"ty! =0; (Cy+ Coa+ C32?)e*
(hyy!" =~2Qy"+ yl = 0;
+ys0;
Gy" yy
(C1 + Caw) eF + Cy
+Cgae®
Cre*+ Cye7®
2. Show whether or not each of the following is a basis for the
given equation.
(a)y"”—9y = 0;
(b)yy”-9y=0;
(c)y"—-y=0;
(d) yl"
_ 3y"
{ e°*cosh 3z, sinh 3x }
{e%*,
cosh3a}
{sinh 3a, 2cosh 3x}
a 3y/ -y=
0;
{e*,ce*,
(e)y"—3y"=0; {1,a,e%*}
ae"
}
sinc}
(ayy + 2y' + 38y=0; |
3. Are the following general solutions of x7y" + xy’ —4y = 0 (b)y” + 2y' + 3y=0; y
(c)y" +2y' —y = 0;
on0 < a@< 00? On —oo < & < co? Explain.
{cosa,sinx,acosz,a
(fyyy” + 2y" +y=0;
(d)ay!" + ay! —y = 0;
(a)Cia?
(b) C12?
(ce)
22y”—y' —y = 0;
(f)(sina)y“"+ vy" =0;
y/"(2)= -9
++ ya ma
(c) Cy(a? +27 2) + Cy(2? −
10. Verify that (22) is indeed a general solution of (21).
{e*,e~*}
(c) {x,
vIn|z|}
(d) {w+ eln|2|,v - eln|2|}
5. Show whether or not the following is a general solution of
_ 4ylvt)
yo)
4 56yo”)
_ 14y")
_ 196y”
4 A9y!"
_ 36y’
+4
Ce"
a
Cye7*
at
Cae?"
+
Cye7?*
+Cyc?+Ce
+Cye~®
(b)Cye™
C7 sinh x + Cg cosh 2x
+
Cre3™
ae
Cee
3
+Cye™+
+Cye®*
6. Show that $y_1 = 1$ and $y_2 = 2$ are solutions of
$(y^3 - 6y^2 + 11y - 6)\,y'' = 0$. Is $y = y_1 + y_2 = 1 + 2 = 3$
a solution as well? Does your result contradict the sentence
preceding Example 2? Explain.
7.
7
(a)y(0)= 0, y(2)=0
(b)y(0)= 0, y(2r) = -3
(c)y(1)= 1, y(2)=
(d)(0) = 0, y(5)= 1
l44dy = 0,
(a)
∶
∩
11. Consider the boundary-value problem consisting of the
differential equation $y'' + y = 0$ plus the boundary conditions
given. Does the problem have any solutions? If so, find them.
Is the solution unique? HINT: A general solution of the differ-
(a) {x, a}
(b)
y'(
y(2)=y'(2
(e) $y'(0) = 0$, $y'(\pi) = 0$
(f) $y'(0) = 0$, $y'(6\pi) =$
(g) $y'(0) = 0$, $y'(2\pi) = 38$
Show that each of the functions $y_1 = 3x^2 - x$ and
$y_2 = x^2 - x$ is a solution of the equation $x^2y'' - 2y = 2x$. Is
the linear combination $C_1y_1 + C_2y_2$ a solution as well, for all
choices of the constants $C_1$ and $C_2$?
12. Consider the boundary-value problem consisting of the
differential equation $y'''' + 2y'' + y = 0$ plus the boundary
conditions given. Does the problem have any solutions? If
so, find them. Is the solution unique? HINT: A general solution
of the differential equation is $y = (C_1 + C_2x)\cos x + (C_3 + C_4x)\sin x$.
(a) $y(0) = y'(0) = 0$, $y(\pi) = 0$, $y'(\pi) = 2$
8. (Taylor series method) Use the Taylor series method described
below Theorem 3.3.1 to solve each of the following
initial-value problems for $y(x)$, up to and including terms of
fifth order. NOTE: The term $f^{(n)}(a)(x - a)^n/n!$ in the Taylor
series of $f(x)$ about $x = a$ is said to be of $n$th-order.
ay"
+y=0;
y(0)=4, y/(0)
(b)y(0)
= y/(0)
= y"(0)=0, ute) =1
(c)y(0) = y"(0)= 0, y(r) =0= y(n) =0
(d)y(0)= y"(0) = 0, y(m)
= " mm)
=3
13. Prove that the linearity property (10) is equivalent to the
two properties
$$L[u + v] = L[u] + L[v], \tag{13.1a}$$
$$L[\alpha u] = \alpha L[u]. \tag{13.1b}$$
=3
(b)y"”~dy=0; y(0)= —1,y’(0)=
(c)y" + 5y'+6y=0; y(0)= 2, y/(0)
=
(d)y+ sy =0; y(0)=1, y'(0)= 0
(e)y" te “y=0; y(0)=2, y/(0)=
(f) yy”— 3y = 0;
(5) =
4, y'(5) = 5
That is, show that the truth of (10) implies the truth of (13.1),
and conversely.
HINT: Expand
about $x = 5$.
(g) $y'' + 3y' - y = 0$; $y(1) = 2$, $y'(1) = 0$ HINT: Expand about $x = 1$.
(h) $y''' - y' + 2y = 0$; $y(0) = 0$, $y'(0) = 0$, $y''(0) = 1$
(i) $y''' - 6y = 0$; $y(0) = 0$, $y'(0) = 3$, $y''(0) = -2$
9. Does the problem stated have a unique solution? No solution? A nonunique solution? Explain.
14. We showed that (11) holds for the case $k = 3$, but did not
prove it in general. Here, we ask you to prove (11) for any integer
$k \geq 1$. HINT: It is suggested that you use mathematical
induction, whereby a proposition $P(k)$, for $k \geq 1$, is proved
by first showing that it holds for $k = 1$, and then showing that
if it holds for $k$ then it must also hold for $k + 1$. In the present
example, the proposition $P(k)$ is the equation (11).
15. (Example 4, Continued) (a) Verify that each of (18a)
through (18d) is a general solution of $y'' - 9y = 0$.
(b) It seems reasonable that if $C_1$, $C_2$ are arbitrary constants,
and if we call
$$C_1 + C_2 \equiv C_1' \quad \text{and} \quad C_1 - C_2 \equiv C_2', \tag{15.1}$$
then $C_1'$, $C_2'$ are arbitrary too, as we claimed at the end of
Example 4. Actually, for that claim to be true we need to be
able to show that corresponding to any chosen values $C_1'$, $C_2'$
the equations (15.1) on $C_1$, $C_2$ are consistent; that is, that
they admit one or more solutions for $C_1$, $C_2$. Show that (15.1)
is indeed consistent.
(c) Show that if, instead, we had $C_1 + C_2 = C_1'$ and
$2C_1 + 2C_2 = C_2'$, where $C_1$, $C_2$ are arbitrary constants, then
it is not true that $C_1'$, $C_2'$ are arbitrary too.
3.4. Solution of Homogeneous Equation: Constant Coefficients
Knowing that the general solution of an nth-order homogeneous differential equation is an arbitrary linear combination of n LI (linearly independent) solutions, the
question is: How do we find those solutions? That question will occupy us for the
remainder of this chapter and for the next three chapters as well. In this section we
consider the constant-coefficient case,
$$\frac{d^ny}{dx^n} + a_1\frac{d^{n-1}y}{dx^{n-1}} + \cdots + a_{n-1}\frac{dy}{dx} + a_ny = 0; \tag{1}$$
that is, where the $a_j$ coefficients are constants, not functions of $x$. This case
is said to be “elementary” in the sense that the solutions will always be found
among the elementary functions (powers of x, trigonometric functions, exponentials, and logarithms), but it is also elementary in the sense that it is the simplest
case: nonconstant-coefficient equations are generally much harder, and nonlinear
equations are much harder still.
Fortunately, the constant-coefficient case is not only the simplest, it is also of
great importance in science and engineering. For instance, the equations
$$mx'' + cx' + kx = 0 \qquad \text{and} \qquad Li'' + Ri' + \frac{1}{C}\,i = 0,$$
governing mechanical and electrical oscillators, where primes denote derivatives
with respect to the independent variable $t$, are both of the type (1) because $m$, $c$, $k$
and $L$, $R$, $C$ are constants; they do not vary with $t$.
3.4.1. Euler's formula and review of the circular and hyperbolic functions.
We are going to be involved with the exponential function $e^z$, where $z = x + iy$
is complex and $i = \sqrt{-1}$. The first point to appreciate is that we cannot figure out
how to evaluate $e^{x+iy}$ from our knowledge of the function $e^x$ where $x$ is real. That
is, $e^{x+iy}$ is a "new object," and its values are a matter of definition, not a matter of
figuring out. To motivate that definition, let us proceed as follows:
$$\begin{aligned}
e^z = e^{x+iy} &= e^xe^{iy} = e^x\left[1 + iy + \frac{(iy)^2}{2!} + \frac{(iy)^3}{3!} + \frac{(iy)^4}{4!} + \cdots\right] \\
&= e^x\left[\left(1 - \frac{y^2}{2!} + \frac{y^4}{4!} - \cdots\right) + i\left(y - \frac{y^3}{3!} + \frac{y^5}{5!} - \cdots\right)\right].
\end{aligned} \tag{2}$$
Recognizing the two series as the Taylor series representations of $\cos y$ and $\sin y$,
respectively, we obtain
$$e^{x+iy} = e^x(\cos y + i\sin y), \tag{3}$$
which is known as Euler's formula, after the great Swiss mathematician Leonhard
Euler (1707-1783), whose many contributions to mathematics included the systematic
development of the theory of linear constant-coefficient differential equations.
We say that (3) defines $e^{x+iy}$ since it gives $e^{x+iy}$ in the standard Cartesian
form $a + ib$, where the real part $a$ is $e^x\cos y$ and the imaginary part $b$ is $e^x\sin y$.
Observe carefully that we cannot defend certain steps in (2). Specifically, the second
equality seems to be the familiar formula $e^{a+b} = e^ae^b$, but the latter is for real
numbers $a$ and $b$, whereas $iy$ is not real. Likewise, the third equality rests upon the
Taylor series formula $e^u = 1 + u + \dfrac{u^2}{2!} + \cdots$ that is derived in the calculus for the
case where $u$ is real, but $iy$ is not real. The point to understand, then, is that the
steps in (2) are merely heuristic; trying to stay as close to real-variable theory as
possible, we arrive at (3). Once (3) is obtained, we throw out (2) and take (3) as our
(i.e., Euler's) definition of $e^{x+iy}$. Of course, there are an infinite number of ways
one can define a given quantity, but some are more fruitful than others. To fully
appreciate why Euler's definition is the perfect one for $e^{x+iy}$, one needs to study
complex-variable theory, as we will in later chapters. For the present, we merely
propose that the steps in (2) make (3) a reasonable choice as a definition of $e^z$.
As a special case of (3), let $x = 0$. Then (3) becomes
$$e^{iy} = \cos y + i\sin y. \tag{4a}$$
For instance, $e^{\pi i} = \cos\pi + i\sin\pi = -1 + 0i = -1$, and $e^{2-3i} = e^2(\cos 3 - i\sin 3)
= 7.39(-0.990 - 0.141i) = -7.32 - 1.04i$. Since (4a) holds for all $y$, it must hold
also with $y$ changed to $-y$:
$$e^{-iy} = \cos(-y) + i\sin(-y),$$
and since $\cos(-y) = \cos y$ and $\sin(-y) = -\sin y$, it follows that
$$e^{-iy} = \cos y - i\sin y. \tag{4b}$$
Conversely, we can express $\cos y$ and $\sin y$ as linear combinations of the complex
exponentials $e^{iy}$ and $e^{-iy}$, for adding (4a) and (4b) and subtracting them gives
$\cos y = (e^{iy} + e^{-iy})/2$ and $\sin y = (e^{iy} - e^{-iy})/(2i)$. Let us frame these formulas, for emphasis and reference:
$$e^{iy} = \cos y + i\sin y, \qquad e^{-iy} = \cos y - i\sin y \tag{5a,b}$$
and
$$\cos y = \frac{e^{iy} + e^{-iy}}{2}, \qquad \sin y = \frac{e^{iy} - e^{-iy}}{2i}. \tag{6a,b}$$
Observe that all four of these formulas come from the single formula (4a). (Of
course there is nothing essential about the name of the variable in these formulas.
For instance, $e^{ix} = \cos x + i\sin x$, $e^{i\theta} = \cos\theta + i\sin\theta$, and so on.)
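The numerical values quoted for $e^{\pi i}$ and $e^{2-3i}$ can be reproduced directly from definition (3). The sketch below is illustrative only: `euler_exp` is a hypothetical helper that builds $e^{x+iy}$ from $e^x$, $\cos y$, and $\sin y$, and the result is compared with Python's built-in complex exponential `cmath.exp`, which is assumed here to implement the same definition.

```python
import cmath
from math import cos, sin, exp, pi

def euler_exp(x, y):
    """e^(x+iy) computed from Euler's definition (3): e^x (cos y + i sin y)."""
    return exp(x) * complex(cos(y), sin(y))

z1 = euler_exp(0.0, pi)       # e^{i*pi}
z2 = euler_exp(2.0, -3.0)     # e^{2-3i}
z2_builtin = cmath.exp(2 - 3j)

# Formulas (6a,b): cosine and sine recovered from complex exponentials.
y = 0.8
cos_from_exp = (cmath.exp(1j * y) + cmath.exp(-1j * y)) / 2
sin_from_exp = (cmath.exp(1j * y) - cmath.exp(-1j * y)) / (2j)
```

Rounded to two decimals, `z2` gives $-7.32 - 1.04i$, matching the value in the text, and the expressions built from (6a,b) return $\cos 0.8$ and $\sin 0.8$ with negligible imaginary parts.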
There is a similarity between (5) and (6), relating the cosine and sine to the
complex exponentials, and the analogous formulas relating the hyperbolic cosine and
hyperbolic sine to real exponentials. If we recall the definitions
$$\cosh y = \frac{e^y + e^{-y}}{2}, \qquad \sinh y = \frac{e^y - e^{-y}}{2} \tag{7a,b}$$
of the hyperbolic cosine and hyperbolic sine, we find, by addition and subtraction
of these formulas, that
$$e^y = \cosh y + \sinh y, \qquad e^{-y} = \cosh y - \sinh y. \tag{8a,b}$$
Compare (5) with (8), and (6) with (7). The graphs of $\cosh x$, $\sinh x$, $e^x$, and $e^{-x}$
are given in Fig. 1.
Using (6) and (7) we obtain the properties
$$\cos^2y + \sin^2y = 1, \tag{9}$$
$$\cosh^2y - \sinh^2y = 1. \tag{10}$$
From a geometric point of view, if we parametrize a curve $C$ by the relations
$$x = \cos\tau, \qquad y = \sin\tau \tag{11}$$
over $0 \leq \tau < 2\pi$, say, then it follows from (9) that $x^2 + y^2 = 1$, so that $C$ is a
circle. And if we parametrize $C$ by
$$x = \cosh\tau, \qquad y = \sinh\tau \tag{12}$$
instead, then it follows from (10) that $x^2 - y^2 = 1$, so $C$ is a hyperbola. Thus, one
refers to $\cos x$ and $\sin x$ as circular functions and to $\cosh x$ and $\sinh x$ as hyperbolic
functions, the hyperbolic cosine and the hyperbolic sine, respectively.

Figure 1. $\cosh x$, $\sinh x$, $e^x$, and $e^{-x}$.
Besides (9) and (10), various useful identities, such as
$$\sin(A + B) = \sin A\cos B + \sin B\cos A, \tag{13a}$$
$$\cos(A + B) = \cos A\cos B - \sin A\sin B, \tag{13b}$$
$$\sinh(A + B) = \sinh A\cosh B + \sinh B\cosh A, \tag{13c}$$
$$\cosh(A + B) = \cosh A\cosh B + \sinh A\sinh B, \tag{13d}$$
can be derived from (6) and (7), as well as the derivative formulas
$$\frac{d}{dx}\cos x = -\sin x, \quad \frac{d}{dx}\sin x = \cos x, \quad \frac{d}{dx}\cosh x = \sinh x, \quad \frac{d}{dx}\sinh x = \cosh x. \tag{14}$$
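We shall be interested specifically in the function $e^{\lambda x}$ and its derivatives with
respect to $x$, where $\lambda$ is a constant that may be complex, say $\lambda = a + ib$. We know
from the calculus that
$$\frac{d}{dx}e^{\lambda x} = \lambda e^{\lambda x} \tag{15}$$
when $\lambda$ is a real constant. Does (15) hold when $\lambda$ is complex? To answer the
question, use Euler's formula (3) to express
$$e^{\lambda x} = e^{(a+ib)x} = e^{ax}(\cos bx + i\sin bx).$$
Thus,
$$\begin{aligned}
\frac{d}{dx}e^{\lambda x} &= \frac{d}{dx}\left(e^{ax}\cos bx\right) + i\frac{d}{dx}\left(e^{ax}\sin bx\right) \\
&= \left(ae^{ax}\cos bx - be^{ax}\sin bx\right) + i\left(ae^{ax}\sin bx + be^{ax}\cos bx\right) \\
&= e^{ax}(a + ib)(\cos bx + i\sin bx) \\
&= \lambda e^{ax}(\cos bx + i\sin bx) = \lambda e^{\lambda x},
\end{aligned}$$
so the familiar formula (15) does hold even for complex $\lambda$.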
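Formula (15) for complex $\lambda$ can also be spot-checked numerically by comparing a finite-difference approximation of the derivative with $\lambda e^{\lambda x}$. This is only a sanity check at one point, not a proof; the helper name `d_exp`, the step size `h`, and the sample values $\lambda = -2 + 3i$, $x = 0.4$ are arbitrary choices made for illustration.

```python
import cmath

def d_exp(lam, x, h=1e-6):
    """Central-difference approximation of d/dx e^(lam*x)."""
    return (cmath.exp(lam * (x + h)) - cmath.exp(lam * (x - h))) / (2 * h)

lam = complex(-2.0, 3.0)   # sample complex constant lambda = a + ib
x0 = 0.4

numeric = d_exp(lam, x0)            # approximate derivative
exact = lam * cmath.exp(lam * x0)   # right side of (15)
error = abs(numeric - exact)
```

The discrepancy `error` is at the level expected from the finite-difference step, which is consistent with (15) holding for complex $\lambda$.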
There is one more fact about the exponential function that we will be needing,
namely, that the exponential function $e^z$ cannot be zero for any choice of $z$; that is,
it has no zeros, for
$$|e^z| = |e^{x+iy}| = |e^x(\cos y + i\sin y)| = |e^x|\,|\cos y + i\sin y| = e^x|\cos y + i\sin y| = e^x.$$
The fourth equality follows from the fact that the real exponential is everywhere
positive, and the fifth equality from the fact that $|a + ib|$ is the square root of the
sum of the squares of $a$ and $b$, and $\cos^2y + \sin^2y = 1$. Finally, we know that
$e^x > 0$ for all $x$, so $|e^z| > 0$ for all $z$, and hence $e^z \neq 0$ for all $z$, as claimed.
3.4.2. Exponential solutions. To guide our search for solutions of (1), it is a
good idea to begin with the simplest case, $n = 1$:
$$\frac{dy}{dx} + a_1y = 0, \tag{16}$$
the general solution of which is
$$y(x) = Ce^{-a_1x}, \tag{17}$$
where $C$ is an arbitrary constant. One can derive (17) by noticing that (16) is a
first-order linear equation and using the general solution developed in Section 2.2,
or by using the fact that (16) is separable.
Observing that (17) is of exponential form, it is natural to wonder if higher-order
equations admit exponential solutions too. Consider the second-order equation
$$y'' + a_1y' + a_2y = 0, \tag{18}$$
where $a_1$ and $a_2$ are real numbers, and let us seek a solution in the form
$$y(x) = e^{\lambda x}. \tag{19}$$
If (19) is to be a solution of (18), then it must be true that
$$\lambda^2e^{\lambda x} + a_1\lambda e^{\lambda x} + a_2e^{\lambda x} = 0, \tag{20}$$
or
$$\left(\lambda^2 + a_1\lambda + a_2\right)e^{\lambda x} = 0, \tag{21}$$
where (20) holds, according to (15), even if the not-yet-determined constant $\lambda$ turns
out to be complex. For (19) to be a solution of (18) on some interval $I$, we need
(21) to be satisfied on $I$. That is, we need the left side of (21) to vanish identically
on $I$. Since $e^{\lambda x}$ is not identically zero on any interval $I$ for any choice of $\lambda$, we
need $\lambda$ to be such that
$$\lambda^2 + a_1\lambda + a_2 = 0. \tag{22}$$
This equation and its left-hand side are called the characteristic equation and
characteristic polynomial, respectively, corresponding to the differential equation
(18). In general, (22) gives two distinct roots, say $\lambda_1$ and $\lambda_2$, which can be found
from the quadratic formula as
$$\lambda = \frac{-a_1 \pm \sqrt{a_1^2 - 4a_2}}{2}.$$
(The nongeneric case of repeated roots, which occurs if $a_1^2 - 4a_2$ vanishes, is discussed separately, below.)
Thus, our choice of the exponential form (19) has been successful. Indeed, we
have found two solutions of that form, $e^{\lambda_1x}$ and $e^{\lambda_2x}$. Next, from Theorem 3.3.2 it
follows [thanks to the linearity of (18)] that if $e^{\lambda_1x}$ and $e^{\lambda_2x}$ are solutions of (18)
then so is any linear combination of them,
$$y(x) = C_1e^{\lambda_1x} + C_2e^{\lambda_2x}. \tag{23}$$
Theorem 3.3.3 guarantees that (23) is a general solution of (18) if $e^{\lambda_1x}$ and $e^{\lambda_2x}$
are LI on $I$, and Theorem 3.2.4 tells us that they are indeed LI since neither one is
expressible as a scalar multiple of the other. Thus, by seeking solutions in the form
(19) we were successful in finding the general solution (23) of (18).
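The recipe leading to (23) is mechanical: form the characteristic equation (22), solve it with the quadratic formula, and use the two roots as the exponents. A minimal sketch follows, assuming distinct roots; it uses `cmath.sqrt` so that a negative discriminant yields complex roots rather than an error. The function name `characteristic_roots` and the sample coefficient pairs are illustrative choices.

```python
import cmath

def characteristic_roots(a1, a2):
    """Roots of lambda^2 + a1*lambda + a2 = 0, returned as complex numbers."""
    disc = cmath.sqrt(a1 * a1 - 4 * a2)
    return (-a1 + disc) / 2, (-a1 - disc) / 2

def char_poly(lam, a1, a2):
    """Characteristic polynomial lambda^2 + a1*lambda + a2."""
    return lam * lam + a1 * lam + a2

real_roots = characteristic_roots(-1, -6)      # y'' - y' - 6y = 0
imag_roots = characteristic_roots(0, 9)        # y'' + 9y = 0
complex_roots = characteristic_roots(4, 7)     # y'' + 4y' + 7y = 0

residuals = [abs(char_poly(r, a1, a2))
             for (a1, a2), roots in [((-1, -6), real_roots),
                                     ((0, 9), imag_roots),
                                     ((4, 7), complex_roots)]
             for r in roots]
```

For the three sample equations the roots come out as $3, -2$; $\pm 3i$; and $-2 \pm i\sqrt{3}$, and each root makes the characteristic polynomial vanish to rounding error.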
EXAMPLE 1. For the equation
$$y'' - y' - 6y = 0, \tag{24}$$
the characteristic equation is $\lambda^2 - \lambda - 6 = 0$, with roots $\lambda = -2, 3$, so
$$y(x) = C_1e^{-2x} + C_2e^{3x} \tag{25}$$
is a general solution of (24). ■
EXAMPLE 2. For the equation
$$y'' - 9y = 0, \tag{26}$$
the characteristic equation is $\lambda^2 - 9 = 0$, with roots $\lambda = \pm 3$, so a general solution of (26) is
$$y(x) = C_1e^{3x} + C_2e^{-3x}. \tag{27}$$
COMMENT 1. As discussed in Example 4 of Section 3.3, an infinite number of forms of
the general solution to (26) are equivalent to (27), such as
$$y(x) = C_1\cosh 3x + C_2\sinh 3x, \tag{28}$$
$$y(x) = C_1\sinh 3x + C_2\left(5e^{-3x} - 2\cosh 3x\right), \tag{29}$$
$$y(x) = C_1\left(e^{3x} + 4\sinh 3x\right) + C_2\left(\cosh 3x - 7\sinh 3x\right), \tag{30}$$
and so on. Of these, one would normally choose either (27) or (28). What is wrong with
(29) and (30)? Nothing, except that they are ugly; $e^{3x}$ and $e^{-3x}$ make a "handsome couple,"
and $\cosh 3x$ and $\sinh 3x$ do too, but the choices in (29) and (30) seem ugly and purposeless.
COMMENT 2. If (27) and (28) are equivalent, does it matter whether we choose one or
the other? No, since they are equivalent. However, one may be more convenient than the
other insofar as the application of initial or boundary conditions.
For instance, suppose we append to (26) the initial conditions $y(0) = 4$, $y'(0) = -5$.
Applying these to (27) gives
$$\begin{aligned}
y(0) &= 4 = C_1 + C_2, \\
y'(0) &= -5 = 3C_1 - 3C_2,
\end{aligned} \tag{31}$$
so $C_1 = 7/6$, $C_2 = 17/6$, and $y(x) = (7e^{3x} + 17e^{-3x})/6$. Applying these initial conditions
to (28), instead, gives
$$\begin{aligned}
y(0) &= 4 = C_1, \\
y'(0) &= -5 = 3C_2,
\end{aligned} \tag{32}$$
so $C_1 = 4$, $C_2 = -5/3$, and $y(x) = 4\cosh 3x - (5/3)\sinh 3x$. Whereas our final results
are equivalent, we see that (32) was more readily solved than (31). Thus, $\cosh 3x$ and
$\sinh 3x$ make a slightly better choice in this case than $e^{3x}$ and $e^{-3x}$, namely, when initial
conditions are given at $x = 0$.
Or, suppose we consider $I$ to be $0 < x < \infty$ and impose the boundary conditions that
$y(0) = 6$, and that $y(x)$ is to be bounded as $x \to \infty$. That is, rather than impose a numerical
value on $y$ at infinity, we impose a boundedness condition, that $|y(x)| < M$ for all $x$,
for some constant $M$. Applying these conditions to (27) we see, from the boundedness
condition, that we need $C_1 = 0$ since otherwise the $e^{3x}$ will give unbounded growth.
Next, $y(0) = 6 = C_2$, and hence the solution is $y(x) = 6e^{-3x}$. Notice how easily the
boundedness condition was applied to (27).
If we use (28) instead, the solution is harder since both $\cosh 3x$ and $\sinh 3x$ grow
unboundedly as $x \to \infty$. We can't afford to set both $C_1 = 0$ and $C_2 = 0$ in (28)
since then we would have $y(x) = 0$, which does indeed satisfy both the equation (26) and
the boundedness condition, but cannot satisfy the remaining condition $y(0) = 6$.
However, perhaps the growth in $\cosh 3x$ and $\sinh 3x$ can be made to cancel. That is, write
$$\begin{aligned}
y(x) &= C_1\cosh 3x + C_2\sinh 3x \\
&= C_1\left(\frac{e^{3x} + e^{-3x}}{2}\right) + C_2\left(\frac{e^{3x} - e^{-3x}}{2}\right) \\
&= \frac{C_1 + C_2}{2}\,e^{3x} + \frac{C_1 - C_2}{2}\,e^{-3x},
\end{aligned} \tag{33}$$
so for boundedness we need $C_1 + C_2 = 0$ (and hence $C_2 = -C_1$). Then (33) gives
$y(x) = C_1e^{-3x}$ and $y(0) = 6$ gives $C_1 = 6$ and $y(x) = 6e^{-3x}$, as before. Thus, in the
case of a boundedness boundary condition at infinity we see that the exponential form (27)
is more convenient than the hyperbolic form (28).
To summarize, when confronted with a choice, such as between (27) and (28), look
ahead to the application of any initial or boundary conditions to see if one form will be
more convenient than the other. ■
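The equivalence of the exponential form (27) and the hyperbolic form (28) under the initial conditions $y(0) = 4$, $y'(0) = -5$ can be confirmed numerically. The sketch below solves for the constants in each form and compares the two resulting functions at a few sample points. The helper names `exp_constants` and `hyp_constants` are hypothetical, and the closed-form expressions inside them are simply the solutions of the two small linear systems for the constants.

```python
from math import exp, cosh, sinh

def exp_constants(y0, yp0):
    """Constants in y = C1 e^{3x} + C2 e^{-3x}: C1 + C2 = y0, 3C1 - 3C2 = yp0."""
    return (y0 + yp0 / 3) / 2, (y0 - yp0 / 3) / 2

def hyp_constants(y0, yp0):
    """Constants in y = C1 cosh 3x + C2 sinh 3x: C1 = y0, 3C2 = yp0."""
    return y0, yp0 / 3

E1, E2 = exp_constants(4.0, -5.0)
H1, H2 = hyp_constants(4.0, -5.0)

def y_exp(x):
    return E1 * exp(3 * x) + E2 * exp(-3 * x)

def y_hyp(x):
    return H1 * cosh(3 * x) + H2 * sinh(3 * x)

# Largest difference between the two forms over a few sample points.
max_diff = max(abs(y_exp(x) - y_hyp(x)) for x in (0.0, 0.3, 1.0, -0.5))
```

The constants come out as $7/6$, $17/6$ for the exponential form and $4$, $-5/3$ for the hyperbolic form, and the two functions agree to rounding error, as the text asserts.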
EXAMPLE 3. For
$$y'' + 9y = 0, \tag{34}$$
the characteristic equation is $\lambda^2 + 9 = 0$, with roots $\lambda = \pm 3i$, so a general solution of (34) is
$$y(x) = C_1e^{i3x} + C_2e^{-i3x}. \tag{35}$$
COMMENT 1. Just as the general solution of $y'' - 9y = 0$ was expressible in terms of
the real exponentials $e^{3x}$, $e^{-3x}$ or the hyperbolic functions $\cosh 3x$, $\sinh 3x$, the general
solution of (34) is expressible in terms of the complex exponentials $e^{i3x}$, $e^{-i3x}$ or in terms
of the circular functions $\cos 3x$, $\sin 3x$, for we can use Euler's formula to re-express (35) as
$$\begin{aligned}
y(x) &= C_1(\cos 3x + i\sin 3x) + C_2(\cos 3x - i\sin 3x) \\
&= (C_1 + C_2)\cos 3x + i(C_1 - C_2)\sin 3x.
\end{aligned} \tag{36}$$
Since $C_1$ and $C_2$ are arbitrary constants, we can simplify this result by letting $C_1 + C_2$ be
a new constant $A$, and letting $i(C_1 - C_2)$ be a new constant $B$, so we have, from (36), the
form
$$y(x) = A\cos 3x + B\sin 3x, \tag{37}$$
where $A$, $B$ are arbitrary constants. As in Example 1, we note that (35) and (37) are but
two out of an infinite number of equivalent forms.
COMMENT 2. You may be concerned that if $y(x)$ is a physical quantity such as the displacement of a mass or the current in an electrical circuit, then it should be real, whereas the right side of (35) seems to be complex. To explore this point, let us solve a complete problem, the differential equation (34) plus a representative set of initial conditions, say $y(0) = 7$, $y'(0) = 3$, and see if the final answer is real or not. Imposing the initial conditions on (35),
$$y(0) = 7 = C_1 + C_2, \qquad y'(0) = 3 = i3C_1 - i3C_2,$$
so $C_1 = (7-i)/2$ and $C_2 = (7+i)/2$. Putting these values into (35), we see from (36) that
$$y(x) = \tfrac{1}{2}\left[(7-i)+(7+i)\right]\cos 3x + \tfrac{i}{2}\left[(7-i)-(7+i)\right]\sin 3x = 7\cos 3x + \sin 3x,$$
which is indeed real. Put differently, if the differential equation and initial conditions represent some physical system, then the mathematics "knows all about" the physics; it is built in, and we need not be anxious. ■
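The point of Comment 2 can also be checked numerically. The following is a minimal sketch, not part of the text's development; the function names `solve_constants` and `y_exponential` are our own, chosen for illustration. It solves the two initial-condition equations for the complex constants and confirms that the complex-exponential form (35) returns a real value.

```python
import cmath

def solve_constants(y0, dy0, omega=3.0):
    """Solve C1 + C2 = y0 and i*omega*(C1 - C2) = dy0 for the complex C1, C2 in (35)."""
    diff = dy0 / (1j * omega)          # this is C1 - C2
    c1 = (y0 + diff) / 2
    c2 = (y0 - diff) / 2
    return c1, c2

def y_exponential(x, c1, c2, omega=3.0):
    """Evaluate the complex-exponential form C1 e^{i omega x} + C2 e^{-i omega x}."""
    return c1 * cmath.exp(1j * omega * x) + c2 * cmath.exp(-1j * omega * x)

c1, c2 = solve_constants(7, 3)         # initial conditions y(0) = 7, y'(0) = 3
value = y_exponential(0.4, c1, c2)     # sample point x = 0.4, chosen arbitrarily
print(c1, c2, value)
```

The imaginary part of `value` is zero to rounding error, and its real part agrees with $7\cos 3x + \sin 3x$ at the sample point.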
Having already made the point that the general solution can always be expressed in various different (but equivalent) forms, we will generally adopt the exponential form when the exponentials are real, and the circular function form when they are complex. This decision is one of personal preference.
EXAMPLE 4. The equation
$$y'' + 4y' + 7y = 0 \qquad (38)$$
has the characteristic equation $\lambda^2 + 4\lambda + 7 = 0$, with distinct roots $\lambda = -2 \pm i\sqrt{3}$, so a general solution of (38) is
$$y(x) = C_1 e^{(-2+i\sqrt{3})x} + C_2 e^{(-2-i\sqrt{3})x} = e^{-2x}\left(C_1 e^{i\sqrt{3}x} + C_2 e^{-i\sqrt{3}x}\right) = e^{-2x}\left(A\cos\sqrt{3}x + B\sin\sqrt{3}x\right). \qquad (39)$$
That is, first we factor out the common factor $e^{-2x}$, then we re-express the complex exponentials in terms of the circular functions.
If we impose initial conditions $y(0) = 1$, $y'(0) = 0$, say, we find that $A = 1$ and $B = 2/\sqrt{3}$, so
$$y(x) = e^{-2x}\left(\cos\sqrt{3}x + \frac{2}{\sqrt{3}}\sin\sqrt{3}x\right).$$
According to Theorem 3.3.1, that solution is unique. ■
3.4.3. Higher-order equations (n > 2). Examples 1–4 are representative of the four possible cases for second-order equations having distinct roots of the characteristic equation: if the roots are both real then the solution is expressible as a linear combination of two real exponentials (Example 1); if they are both real and equal and opposite in sign, then the solution is expressible either as exponentials or as a hyperbolic cosine and a hyperbolic sine (Example 2); if they are not both real then they will be complex conjugates. If those complex conjugates are purely imaginary, then the solution is expressible as a linear combination of two complex exponentials or as a sine and a cosine (Example 3); if they are not purely imaginary, then the solution is expressible as a real exponential times a linear combination of two complex exponentials or of a sine and a cosine (Example 4).
Turning to higher-order equations (n > 2), our attention focuses on the characteristic equation
$$\lambda^n + a_1\lambda^{n-1} + \cdots + a_{n-1}\lambda + a_n = 0. \qquad (40)$$
If $n = 1$, then (40) becomes $\lambda + a_1 = 0$ which, of course, has the root $\lambda = -a_1$ on the real axis. If $n = 2$, then (40) becomes $\lambda^2 + a_1\lambda + a_2 = 0$, and to be assured of the existence of solutions we need to extend our number system from a real axis to a complex plane. If the roots are indeed complex (and both $a_1$ and $a_2$ are real) they will necessarily occur as a complex conjugate pair, as in Example 4.
One might wonder if a further extension of the number system, beyond the complex plane, is required to assure the existence of solutions to (40) for $n \ge 3$. However, it turns out that the complex plane continues to suffice. The characteristic equation (40) necessarily admits $n$ roots. As for the case $n = 2$, they need not be distinct and they need not be real, but if there are complex roots then they necessarily occur in complex conjugate pairs (if all of the $a_j$'s are real).
In this subsection we limit attention to the case where there are $n$ distinct roots of (40), which we denote as $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then each of the exponentials $e^{\lambda_1 x}, \ldots, e^{\lambda_n x}$ is a solution of (1) and, by Theorem 3.3.3,
$$y(x) = C_1 e^{\lambda_1 x} + \cdots + C_n e^{\lambda_n x} \qquad (41)$$
is a general solution of (1) if and only if the set of exponentials is LI.
THEOREM 3.4.1 Linear Independence of a Set of Exponentials
Let $\lambda_1, \ldots, \lambda_n$ be any numbers, real or complex. The set $\{e^{\lambda_1 x}, \ldots, e^{\lambda_n x}\}$ is LI (on any given interval $I$) if and only if the $\lambda$'s are distinct.
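Proof: Recall from Theorem 3.2.2 that if the Wronskian determinant
$$W\left[e^{\lambda_1 x}, \ldots, e^{\lambda_n x}\right](x) = \begin{vmatrix} e^{\lambda_1 x} & \cdots & e^{\lambda_n x} \\ \lambda_1 e^{\lambda_1 x} & \cdots & \lambda_n e^{\lambda_n x} \\ \vdots & & \vdots \\ \lambda_1^{n-1} e^{\lambda_1 x} & \cdots & \lambda_n^{n-1} e^{\lambda_n x} \end{vmatrix} \qquad (42)$$
is not identically zero on $I$, then the set is LI on $I$. According to the properties of determinants (Section 10.4), we can factor $e^{\lambda_1 x}$ out of the first column, $e^{\lambda_2 x}$ out of the second, and so on, so that we can re-express $W$ as
$$W(x) = e^{(\lambda_1 + \cdots + \lambda_n)x} \begin{vmatrix} 1 & \cdots & 1 \\ \lambda_1 & \cdots & \lambda_n \\ \vdots & & \vdots \\ \lambda_1^{n-1} & \cdots & \lambda_n^{n-1} \end{vmatrix}. \qquad (43)$$
The exponential function on the right-hand side is nonzero. Further, the determinant is of Vandermonde type (Section 10.4), a key property of which is that it is nonzero if the $\lambda$'s are distinct, as indeed has been assumed here. Thus, $W$ is nonzero (on any interval), so the given set is LI.
Conversely, if the $\lambda$'s are not distinct, then surely the set is LD because at least two of its members are identical. ■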
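The Vandermonde property used in the proof can be explored numerically. The sketch below is ours, not the text's; it assumes the standard closed form $\prod_{i<j}(\lambda_j - \lambda_i)$ for the determinant in (43), and the helper names `determinant`, `vandermonde`, and `product_of_differences` are hypothetical. It compares a directly computed determinant against that product.

```python
from itertools import combinations

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting (works for complex entries)."""
    a = [[complex(v) for v in row] for row in matrix]
    n = len(a)
    det = 1 + 0j
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) == 0:
            return 0j                      # a zero column below the diagonal: singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                     # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def vandermonde(lams):
    """Row p holds the p-th powers of the lambdas, p = 0, ..., n-1, as in (43)."""
    return [[lam ** p for lam in lams] for p in range(len(lams))]

def product_of_differences(lams):
    """Closed form of the Vandermonde determinant: product of (lam_j - lam_i) over i < j."""
    prod = 1 + 0j
    for i, j in combinations(range(len(lams)), 2):
        prod *= lams[j] - lams[i]
    return prod

distinct = [2, -1 + 5 ** 0.5, -1 - 5 ** 0.5]   # the roots found in Example 5
repeated = [1, 1, 3]                            # a set with a repeated value
print(determinant(vandermonde(distinct)), product_of_differences(distinct))
print(determinant(vandermonde(repeated)))
```

For distinct values the two numbers agree and are nonzero; with a repeated value the determinant vanishes, consistent with the converse statement in the proof.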
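Consider the following examples.
EXAMPLE 5. The equation
$$y''' - 8y' + 8y = 0 \qquad (44)$$
has the characteristic equation $\lambda^3 - 8\lambda + 8 = 0$. Trial and error reveals that $\lambda = 2$ is one root. Hence we can factor $\lambda^3 - 8\lambda + 8$ as $(\lambda - 2)p(\lambda)$, where $p(\lambda)$ is a quadratic function of $\lambda$. To find $p(\lambda)$ we divide $\lambda - 2$ into $\lambda^3 - 8\lambda + 8$, by long division, and obtain $p(\lambda) = \lambda^2 + 2\lambda - 4$ which, in turn, can be factored as $[\lambda - (-1 + \sqrt{5})][\lambda - (-1 - \sqrt{5})]$. Thus, $\lambda$ equals 2 and $-1 \pm \sqrt{5}$, so
$$y(x) = C_1 e^{2x} + C_2 e^{(-1+\sqrt{5})x} + C_3 e^{(-1-\sqrt{5})x} = C_1 e^{2x} + e^{-x}\left(C_2 e^{\sqrt{5}x} + C_3 e^{-\sqrt{5}x}\right) \qquad (45)$$
is a general solution of (44).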
COMMENT. Alternative to long division, we can find $p(\lambda)$ by writing $\lambda^3 - 8\lambda + 8 = (\lambda - 2)(a\lambda^2 + b\lambda + c) = a\lambda^3 + (b - 2a)\lambda^2 + (c - 2b)\lambda - 2c$ and determining $a$, $b$, $c$ so that coefficients of like powers of $\lambda$ match on both sides of the equation. ■
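The long division in Example 5 can be mechanized by synthetic division, which is the same arithmetic written compactly. The following is a minimal sketch under that substitution; the function name `synthetic_division` is our own, and coefficients are assumed to be listed from the highest power down.

```python
def synthetic_division(coeffs, root):
    """Divide a polynomial by (lambda - root).

    coeffs lists the coefficients from the highest power down.
    Returns (quotient coefficients, remainder).
    """
    carried = []
    acc = 0
    for c in coeffs:
        acc = acc * root + c      # Horner step: multiply by the root, add the next coefficient
        carried.append(acc)
    return carried[:-1], carried[-1]

# lambda^3 + 0*lambda^2 - 8*lambda + 8 divided by (lambda - 2)
quotient, remainder = synthetic_division([1, 0, -8, 8], 2)
print(quotient, remainder)        # quotient [1, 2, -4] means lambda^2 + 2*lambda - 4
```

A zero remainder confirms that $\lambda = 2$ is a root, and the quotient reproduces $p(\lambda) = \lambda^2 + 2\lambda - 4$.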
EXAMPLE 6. The equation
$$y''' - y = 0 \qquad (46)$$
has the characteristic equation $\lambda^3 - 1 = 0$, which surely has the root $\lambda = 1$. Thus, $\lambda^3 - 1$ is $\lambda - 1$ times a quadratic function of $\lambda$, which function can be found, as above, by long division. Thus, we obtain
$$(\lambda - 1)(\lambda^2 + \lambda + 1) = 0,$$
so $\lambda$ equals 1 and $(-1 \pm \sqrt{3}\,i)/2$. Hence
$$y(x) = C_1 e^{x} + C_2 e^{(-1+\sqrt{3}i)x/2} + C_3 e^{(-1-\sqrt{3}i)x/2} = C_1 e^{x} + e^{-x/2}\left(C_2 e^{i\sqrt{3}x/2} + C_3 e^{-i\sqrt{3}x/2}\right) = C_1 e^{x} + e^{-x/2}\left(C_2' \cos\frac{\sqrt{3}}{2}x + C_3' \sin\frac{\sqrt{3}}{2}x\right), \qquad (47)$$
where $C_1$, $C_2'$, $C_3'$ are arbitrary constants. (Of course, we don't really need the primes in the final answer.) ■
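EXAMPLE 7. The equation
$$y^{(v)} - 7y''' + 12y' = 0 \qquad (48)$$
has the characteristic equation $\lambda^5 - 7\lambda^3 + 12\lambda = 0$ or $\lambda(\lambda^4 - 7\lambda^2 + 12) = 0$. The $\lambda$ factor gives the root $\lambda = 0$. The quartic factor is actually a quadratic in $\lambda^2$, so the quadratic equation gives $\lambda^2 = 4$ and $\lambda^2 = 3$. Thus, $\lambda$ equals $0, \pm 2, \pm\sqrt{3}$, so
$$y(x) = C_1 + C_2 e^{2x} + C_3 e^{-2x} + C_4 e^{\sqrt{3}x} + C_5 e^{-\sqrt{3}x} \qquad (49)$$
is a general solution of (48). ■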
EXAMPLE 8. The equation
$$y'''' + ky = 0 \qquad (50)$$
arises in studying the deflected shape $y(x)$ of a beam on an elastic foundation, where $k$ is a known positive physical constant. Since the characteristic equation $\lambda^4 + k = 0$ gives $\lambda^4 = -k$, to find $\lambda$ we need to evaluate $(-k)^{1/4}$. The general result is that $z^{1/n}$, for any complex number $z = a + ib$ and any integer $n > 1$, has $n$ values in the complex plane. These values are equally spaced on a circle of radius $r^{1/n}$, where $r = \sqrt{a^2 + b^2}$, centered at the origin of the complex plane, as is explained in Section 22.4. For our present purpose, let it suffice to merely give the result: $\lambda = (-k)^{1/4} = \pm k^{1/4}\,\dfrac{1+i}{\sqrt{2}}$ and $\pm k^{1/4}\,\dfrac{1-i}{\sqrt{2}}$, so
$$y(x) = C_1 e^{k^{1/4}(1+i)x/\sqrt{2}} + C_2 e^{k^{1/4}(1-i)x/\sqrt{2}} + C_3 e^{-k^{1/4}(1+i)x/\sqrt{2}} + C_4 e^{-k^{1/4}(1-i)x/\sqrt{2}}$$
$$\phantom{y(x)} = e^{k^{1/4}x/\sqrt{2}}\left(C_1' \cos\frac{k^{1/4}}{\sqrt{2}}x + C_2' \sin\frac{k^{1/4}}{\sqrt{2}}x\right) + e^{-k^{1/4}x/\sqrt{2}}\left(C_3' \cos\frac{k^{1/4}}{\sqrt{2}}x + C_4' \sin\frac{k^{1/4}}{\sqrt{2}}x\right) \qquad (51)$$
is a general solution of (50). ■
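The four values of $(-k)^{1/4}$ quoted in Example 8 can be generated from the polar form of $-k$. The sketch below is illustrative only: the helper name `nth_roots` is ours, and the value $k = 16$ is an arbitrary choice made so that $k^{1/4} = 2$.

```python
import cmath

def nth_roots(z, n):
    """Return the n values of z**(1/n), equally spaced on a circle of radius |z|**(1/n)."""
    r, theta = cmath.polar(z)
    return [cmath.rect(r ** (1.0 / n), (theta + 2 * cmath.pi * m) / n) for m in range(n)]

k = 16.0                      # sample value of the positive constant k (an assumption)
roots = nth_roots(-k, 4)
for lam in roots:
    print(lam, abs(lam ** 4 + k))   # the second number is the residual of lambda^4 + k = 0
```

Each printed root has modulus $k^{1/4}$ and has the form $\pm k^{1/4}(1 \pm i)/\sqrt{2}$, in agreement with the result used to write (51).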
3.4.4. Repeated roots. Thus far we have considered only the generic case, where the $n$th-order characteristic equation (40) admits $n$ distinct roots $\lambda_1, \ldots, \lambda_n$. To complete our discussion, we need to consider the case where one or more of the roots is repeated. We say that a root $\lambda_j$ of (40) is repeated if (40) contains the factor $\lambda - \lambda_j$ more than once. More specifically, we say that $\lambda_j$ is a root of order $k$ if (40) contains the factor $\lambda - \lambda_j$ $k$ times. For instance, if the characteristic equation for some given sixth-order equation can be factored as $(\lambda + 2)(\lambda - 5)^3(\lambda - 1)^2 = 0$, then the roots $\lambda = 5$ and $\lambda = 1$ are repeated; $\lambda = 5$ is a root of order 3 and $\lambda = 1$ is a root of order 2. We can say that
$$y(x) = C_1 e^{-2x} + C_2 e^{5x} + C_3 e^{x}$$
is a solution for any constants $C_1$, $C_2$, $C_3$, but the latter falls short of being a general solution of the sixth-order differential equation since it is not a linear combination of six LI solutions. The problem, in such a case of repeated roots, is how to find the missing solutions. Evidently, they will not be of the form $e^{\lambda x}$, for if they were then we would have found them when we sought $y(x)$ in that form.
We will use a simple example to show how to obtain such "missing solutions," and will then state the general result as a theorem.
EXAMPLE 9. Reduction of Order. The equation
$$y'' + 2y' + y = 0 \qquad (52)$$
has the characteristic equation $\lambda^2 + 2\lambda + 1 = (\lambda + 1)^2 = 0$, so $\lambda = -1$ is a root of order 2. Thus, we have the solution $Ae^{-x}$ but are missing a second linearly independent solution, which is needed if we are to obtain a general solution of (52).
To find the missing solution, we use Lagrange's method of reduction of order, which works as follows. Suppose that we know one solution, say $y_1(x)$, of a given linear homogeneous differential equation, and we seek one or more other linearly independent solutions. If $y_1(x)$ is a solution then, of course, so is $Ay_1(x)$, where $A$ is an arbitrary constant. According to the method of reduction of order, we let $A$ vary and seek $y(x)$ in the form $y(x) = A(x)y_1(x)$. Putting that form into the given differential equation results in another differential equation on the unknown $A(x)$, but that equation inevitably will be simpler than the original differential equation on $y$, as we shall see.
In the present example, $y_1(x)$ is $e^{-x}$, so to find the missing solution we seek
$$y(x) = A(x)e^{-x}. \qquad (53)$$
From (53), $y' = (A' - A)e^{-x}$ and $y'' = (A'' - 2A' + A)e^{-x}$, and putting these expressions into (52) gives
$$(A'' - 2A' + A + 2A' - 2A + A)e^{-x} = 0, \qquad (54)$$
so that $A(x)$ must satisfy the second-order differential equation obtained by equating the coefficient of $e^{-x}$ in (54) to zero, namely, $A'' - 2A' + A + 2A' - 2A + A = 0$. The cancellation of the three $A$ terms in that equation is not a coincidence, for if $A(x)$ were a constant [in which case the $A'$ and $A''$ terms in (54) would drop out] then the terms on the left-hand side of (54) would have to cancel to zero because $Ae^{-x}$ is a solution of the original homogeneous differential equation if $A$ is a constant. Thanks to that (inevitable) cancellation, the differential equation governing $A(x)$ will be of the form
$$A'' + \alpha A' = 0, \qquad (55)$$
for some constant $\alpha$, and this second-order equation can be reduced to the first-order equation $v' + \alpha v = 0$ by setting $A' = v$; hence the name reduction of order for the method. In fact, not only do the $A$ terms cancel, as they must, the $A'$ terms happen to cancel as well, so in place of (55) we have the even simpler equation
$$A'' = 0 \qquad (56)$$
on $A(x)$. Integration gives $A(x) = C_1 + C_2 x$, so that (53) becomes
$$y(x) = C_1 e^{-x} + C_2 x e^{-x}. \qquad (57)$$
The $C_1 e^{-x}$ term merely reproduces that which was already known (recall the second sentence of this example), and the $C_2 x e^{-x}$ term is the desired missing solution. Since the two are LI, (57) is a general solution of (52). ■
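As a quick check on Example 9, one can substitute candidate functions into the left-hand side of (52). The sketch below is ours; the function name `residual` and the sample points are arbitrary, and the derivatives are entered by hand from the product rule. It shows that $xe^{-x}$ satisfies (52), whereas $x^2 e^{-x}$ does not, which is consistent with a root of order 2 contributing exactly two solutions.

```python
import math

def residual(y, dy, d2y, x):
    """Left-hand side of (52), y'' + 2y' + y, evaluated at x."""
    return d2y(x) + 2 * dy(x) + y(x)

# Candidate 1: y = x e^{-x}, with derivatives from the product rule.
y1   = lambda x: x * math.exp(-x)
dy1  = lambda x: (1 - x) * math.exp(-x)
d2y1 = lambda x: (x - 2) * math.exp(-x)

# Candidate 2: y = x^2 e^{-x}, which is NOT a solution of (52).
y2   = lambda x: x * x * math.exp(-x)
dy2  = lambda x: (2 * x - x * x) * math.exp(-x)
d2y2 = lambda x: (x * x - 4 * x + 2) * math.exp(-x)

for x in (0.0, 1.0, 2.5):
    print(x, residual(y1, dy1, d2y1, x), residual(y2, dy2, d2y2, x))
```

The first residual is zero at every sample point; the second works out to $2e^{-x}$, which never vanishes.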
Similarly, suppose we have an eighth-order equation, the characteristic equation of which can be factored as $(\lambda - 2)^3(\lambda + 1)^4(\lambda + 5)$, say, so that 2 is a root of order 3 and $-1$ is a root of order 4. If we take the solution $Ae^{2x}$ associated with the root $\lambda = 2$, and apply reduction of order by seeking $y$ in the form $A(x)e^{2x}$, then we obtain $A''' = 0$ and $A(x) = C_1 + C_2 x + C_3 x^2$ and hence the "string" of solutions $C_1 e^{2x}$, $C_2 x e^{2x}$, $C_3 x^2 e^{2x}$ coming from the root $\lambda = 2$. Likewise, if we take the solution $Ae^{-x}$ associated with the root $\lambda = -1$, and apply reduction of order, we obtain $A(x) = C_4 + C_5 x + C_6 x^2 + C_7 x^3$ and hence the string of solutions $C_4 e^{-x}$, $C_5 x e^{-x}$, $C_6 x^2 e^{-x}$, $C_7 x^3 e^{-x}$ coming from the root $\lambda = -1$, so that we have a general solution
$$y(x) = (C_1 + C_2 x + C_3 x^2)e^{2x} + (C_4 + C_5 x + C_6 x^2 + C_7 x^3)e^{-x} + C_8 e^{-5x} \qquad (58)$$
of the original differential equation. [To verify that this is indeed a general solution one would need to show that the eight solutions contained within (58) are LI, as could be done by working out the Wronskian $W$ and showing that $W \ne 0$.]
EXAMPLE 10. For
$$y'''' - y'' = 0 \qquad (59)$$
the characteristic equation $\lambda^4 - \lambda^2 = 0$ gives $\lambda = 0, 0, 1, -1$ and hence the solution $y(x) = A + Be^{x} + Ce^{-x}$. The latter falls short of being a general solution of (59) because the repeated root $\lambda = 0$ gave the single solution $A$. To find the missing solution by reduction of order we could vary the parameter $A$ and seek $y(x) = A(x)$, but surely there can be no gain in that step since it merely amounts to a name change, from $y(x)$ to $A(x)$. This situation will always occur when the repeated root is zero, but in that case we can achieve a reduction of order more directly. In the case of (59) we can set $y'' = p$. Then the fourth-order equation (59) is reduced to the second-order equation $p'' - p = 0$, so
$$p(x) = Ae^{x} + Be^{-x}.$$
But $y'' = p$, so
$$y'(x) = \int p\,dx = Ae^{x} - Be^{-x} + C.$$
Hence
$$y(x) = \int \left(Ae^{x} - Be^{-x} + C\right)dx = Ae^{x} + Be^{-x} + Cx + D$$
is the general solution of (59). Observe that the pattern is the same: the repeated root $\lambda = 0$ gives the solution $(C_1 + C_2 x)e^{0x}$, where $C_1$ is $D$ and $C_2$ is $C$. ■
We organize these results as the following theorem.
THEOREM 3.4.2 Repeated Roots of Characteristic Equation
If $\lambda_1$ is a root of order $k$ of the characteristic equation (40), then $e^{\lambda_1 x}, xe^{\lambda_1 x}, \ldots, x^{k-1}e^{\lambda_1 x}$ are $k$ LI solutions of the differential equation (1).
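Proof: Denote (1) in operator form as $L[y] = 0$, where
$$L = \frac{d^n}{dx^n} + a_1\frac{d^{n-1}}{dx^{n-1}} + \cdots + a_{n-1}\frac{d}{dx} + a_n. \qquad (60)$$
Then
$$L\left[e^{\lambda x}\right] = \left(\lambda^n + a_1\lambda^{n-1} + \cdots + a_{n-1}\lambda + a_n\right)e^{\lambda x},$$
or
$$L\left[e^{\lambda x}\right] = (\lambda - \lambda_1)^k p(\lambda)e^{\lambda x}, \qquad (61)$$
where $p(\lambda)$ is a polynomial in $\lambda$, of degree $n - k$. Since (61) holds for all $\lambda$, we can set $\lambda = \lambda_1$ in that formula. Doing so, the right-hand side of (61) vanishes, so that $L[e^{\lambda_1 x}] = 0$ and hence $e^{\lambda_1 x}$ is a solution of $L[y] = 0$. Our object, now, is to show that $xe^{\lambda_1 x}, \ldots, x^{k-1}e^{\lambda_1 x}$ are solutions as well.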
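To proceed, differentiate (61) with respect to $\lambda$ ($\lambda$, not $x$):
$$\frac{d}{d\lambda}L\left[e^{\lambda x}\right] = \frac{d}{d\lambda}\left[(\lambda - \lambda_1)^k p(\lambda)e^{\lambda x}\right] = k(\lambda - \lambda_1)^{k-1}p(\lambda)e^{\lambda x} + (\lambda - \lambda_1)^k \frac{d}{d\lambda}\left[p(\lambda)e^{\lambda x}\right]. \qquad (62)$$
The left-hand side of (62) calls for $e^{\lambda x}$ to be differentiated first with respect to $x$, according to the operator $L$ defined in (60), and then with respect to $\lambda$. Since we can interchange the order of the differentiations, we can express the left-hand side as $L\left[\frac{d}{d\lambda}e^{\lambda x}\right]$, that is, as $L\left[xe^{\lambda x}\right]$. Thus, one differentiation of (61) with respect to $\lambda$ gives
$$L\left[xe^{\lambda x}\right] = k(\lambda - \lambda_1)^{k-1}p(\lambda)e^{\lambda x} + (\lambda - \lambda_1)^k \frac{d}{d\lambda}\left(p(\lambda)e^{\lambda x}\right). \qquad (63)$$
Setting $\lambda = \lambda_1$ in (63) gives $L\left[xe^{\lambda_1 x}\right] = 0$. Hence, not only is $e^{\lambda_1 x}$ a solution, so is $xe^{\lambda_1 x}$. Repeated differentiation with respect to $\lambda$ reveals that $x^2 e^{\lambda_1 x}, \ldots, x^{k-1}e^{\lambda_1 x}$ are solutions as well, as was to be proved.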
That's as far as we can go because at that point one more differentiation would give a leading term of $k!(\lambda - \lambda_1)^0 p(\lambda)e^{\lambda x}$ plus terms with factors of $(\lambda - \lambda_1), (\lambda - \lambda_1)^2, \ldots, (\lambda - \lambda_1)^k$ on the right-hand side. The latter terms vanish for $\lambda = \lambda_1$, but the leading term does not because $p(\lambda_1) \ne 0$ (because $\lambda - \lambda_1$ is not among the factors of $p$) and $e^{\lambda_1 x} \ne 0$.
Verification that the solutions $e^{\lambda_1 x}, xe^{\lambda_1 x}, \ldots, x^{k-1}e^{\lambda_1 x}$ are LI is left for the exercises.
EXAMPLE 11. As a final example, consider the equation
$$y'''' - 8y''' + 26y'' - 40y' + 25y = 0 \qquad (64)$$
with characteristic equation $\lambda^4 - 8\lambda^3 + 26\lambda^2 - 40\lambda + 25 = 0$ and repeated complex roots $\lambda = 2+i,\ 2+i,\ 2-i,\ 2-i$. It follows that
$$y(x) = (C_1 + C_2 x)e^{(2+i)x} + (C_3 + C_4 x)e^{(2-i)x} = e^{2x}\left[\left(C_1 e^{ix} + C_3 e^{-ix}\right) + x\left(C_2 e^{ix} + C_4 e^{-ix}\right)\right] = e^{2x}\left[(A\cos x + B\sin x) + x(C\cos x + D\sin x)\right]$$
is a general solution of (64). ■
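The repeated roots quoted in Example 11 can be confirmed by multiplying out the factors. The following sketch is ours (the helper name `poly_mul` is hypothetical, and coefficients are listed from the highest power down); it builds the quadratic with roots $2 \pm i$ and squares it.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (highest power first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# (lambda - (2+i)) (lambda - (2-i)) = lambda^2 - 4*lambda + 5
quadratic = poly_mul([1, -(2 + 1j)], [1, -(2 - 1j)])

# Each root has order 2, so the characteristic polynomial is the square of that quadratic.
quartic = poly_mul([1, -4, 5], [1, -4, 5])
print(quadratic, quartic)
```

The quartic's coefficients match those of the characteristic equation of (64), so each of $2 \pm i$ is indeed a root of order 2.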
3.4.5. Stability. An important consideration in applications, especially feedback control systems, is whether or not a system is "stable." Normally, stability has to do with the behavior of a system over time, so let us change the name of the independent variable from $x$ to $t$ in (1):
$$\frac{d^n y}{dt^n} + a_1\frac{d^{n-1}y}{dt^{n-1}} + \cdots + a_{n-1}\frac{dy}{dt} + a_n y = 0, \qquad (65)$$
and let us denote the general solution of (65) as $y(t) = C_1 y_1(t) + \cdots + C_n y_n(t)$. We say that the system described by (65) (be it mechanical, electrical, economic, or whatever) is stable if all of its solutions are bounded, that is, if there exists a constant $M_j$ for each solution $y_j(t)$ such that $|y_j(t)| < M_j$ for all $t > 0$. If the system is not stable, then it is unstable.
THEOREM 3.4.3 Stability
For the system described by (65) to be stable, it is necessary and sufficient that the characteristic equation of (65) have no roots to the right of the imaginary axis in the complex plane and that any roots on the imaginary axis be nonrepeated.
Proof: Let $\lambda = a + ib$ be any nonrepeated root of the characteristic equation; we call $a$ the real part of $\lambda$ and write $\mathrm{Re}\,\lambda = a$, and $b$ the imaginary part of $\lambda$ and write $\mathrm{Im}\,\lambda = b$. Such a root will contribute a solution $e^{(a+ib)t} = e^{at}(\cos bt + i\sin bt)$. Since the magnitude (modulus, to be more precise) of a complex number $x + iy$ is defined as $|x + iy| = \sqrt{x^2 + y^2}$, and the magnitude of the product of complex numbers is the product of their magnitudes, we see that $|e^{(a+ib)t}| = |e^{at}(\cos bt + i\sin bt)| = |e^{at}|\,|\cos bt + i\sin bt| = e^{at}\sqrt{\cos^2 bt + \sin^2 bt} = e^{at}$, so that solution will be bounded if and only if $a \le 0$, that is, if $\lambda$ does not lie to the right of the imaginary axis.
Next, let $\lambda = a + ib$ be a repeated root of order $k$, with $a \ne 0$. Such a root will contribute solutions of the form $t^p e^{(a+ib)t} = t^p e^{at}(\cos bt + i\sin bt)$, for $p = 0, \ldots, k-1$, with magnitude $t^p e^{at}$. Surely the latter grows unboundedly if $a > 0$ because both factors do, but its behavior is less obvious if $a < 0$ since then the $t^p$ factor grows and the $e^{at}$ decays. To see which one "wins," one can rewrite the product as $t^p/e^{-at}$ and then apply l'Hôpital's rule $p$ times. Doing so, one finds that the ratio tends to zero as $t \to \infty$. [Recall that l'Hôpital's rule applies to indeterminate forms of the type $0/0$ or $\infty/\infty$, not $(\infty)(0)$; that is why we first rewrite $t^p e^{at}$ in the form $t^p/e^{-at}$.] The upshot is that such solutions are bounded if $\lambda = a + ib$ lies in the left half plane ($a < 0$), and unbounded if it lies in the right half plane ($a > 0$). If $\lambda$ lies on the imaginary axis ($a = 0$), then $|t^p e^{(a+ib)t}| = |t^p e^{ibt}| = |t^p(\cos bt + i\sin bt)| = t^p$, which grows unboundedly for $p \ge 1$.
Our conclusion is that all solutions are bounded if and only if no roots lie to the right of the imaginary axis and no repeated roots lie on the imaginary axis, as was to be proved. ■
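The criterion of Theorem 3.4.3 translates directly into a test on the roots and their orders. The sketch below is our own illustration; the function name `is_stable` and the tolerance `tol` (used to decide when a computed real part counts as zero) are assumptions, not part of the theorem.

```python
def is_stable(roots_with_order, tol=1e-12):
    """Apply Theorem 3.4.3 to an iterable of (root, order) pairs.

    Unstable if any root lies to the right of the imaginary axis,
    or if a root on the imaginary axis is repeated.
    """
    for root, order in roots_with_order:
        real_part = complex(root).real
        if real_part > tol:
            return False                      # root in the right half plane
        if abs(real_part) <= tol and order > 1:
            return False                      # repeated root on the imaginary axis
    return True

print(is_stable([(3j, 1), (-3j, 1)]))             # Example 3: roots +3i, -3i
print(is_stable([(-1, 2)]))                       # Example 9: root -1 of order 2
print(is_stable([(0, 2), (1, 1), (-1, 1)]))       # Example 10: roots 0, 0, 1, -1
print(is_stable([(2 + 1j, 2), (2 - 1j, 2)]))      # Example 11: roots 2+i and 2-i, each of order 2
```

Examples 3 and 9 are reported stable; Examples 10 and 11 are not, the former because of the root $\lambda = 1$ (and the repeated root at the origin), the latter because its roots lie in the right half plane.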
One is often interested in being able to determine whether the system is stable or not without actually evaluating the $n$ roots of the characteristic equation (40). There are theorems that provide information about stability based directly upon the $a_j$ coefficients in (40). One such theorem is stated below. Another, the Routh-Hurwitz criterion, is given in the exercises to Section 10.4.
THEOREM 3.4.4 Coefficients of Mixed Sign
If the coefficients in (40) are real and of mixed sign (there is at least one positive and at least one negative), then there is at least one root $\lambda$ with $\mathrm{Re}\,\lambda > 0$, so the system is unstable.
These theorems are not as important as they were before the availability of computer software that can determine the roots of (40) numerically and with great ease. For instance, using Maple, one can obtain all roots of the equation
$$x^5 + 3x^4 - 2x^3 + x^2 + x + 5 = 0$$
simply by using the fsolve command. Enter
fsolve(x^5 + 3*x^4 - 2*x^3 + x^2 + x + 5 = 0, x, complex);
and return. This gives the following printout of the five solutions:
-3.6339286, -.58045036 - .79731249I, -.58045036 + .79731249I,
.89741468 - .78056850I, .89741468 + .78056850I
In this example, observe that there are, indeed, roots with positive real parts, as predicted by Theorem 3.4.4, so the system is unstable.
For equations of fourth degree or lower, such software works even if one or more of the coefficients are unspecified, in which case the roots are given in terms of those parameters.
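Without access to Maple, the printed roots can still be checked by substituting them back into the polynomial. The following is a sketch of such a check in Python, not a replacement for fsolve; the helper names `horner` and `has_mixed_signs` are ours, and the roots are copied from the printout above.

```python
def horner(coeffs, z):
    """Evaluate a polynomial (coefficients from the highest power down) at z."""
    acc = 0
    for c in coeffs:
        acc = acc * z + c
    return acc

def has_mixed_signs(coeffs):
    """Hypothesis of Theorem 3.4.4: at least one positive and one negative coefficient."""
    return any(c > 0 for c in coeffs) and any(c < 0 for c in coeffs)

coeffs = [1, 3, -2, 1, 1, 5]       # x^5 + 3x^4 - 2x^3 + x^2 + x + 5
printed_roots = [
    -3.6339286,
    complex(-0.58045036, -0.79731249),
    complex(-0.58045036, 0.79731249),
    complex(0.89741468, -0.78056850),
    complex(0.89741468, 0.78056850),
]

residuals = [abs(horner(coeffs, r)) for r in printed_roots]
unstable = any(complex(r).real > 0 for r in printed_roots)
print(max(residuals), has_mixed_signs(coeffs), unstable)
```

The residuals are small (limited only by the eight printed digits), the coefficients are of mixed sign, and two of the roots have positive real part, in agreement with Theorem 3.4.4.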
Closure. In this section we limited our attention to linear homogeneous differential equations with constant coefficients, a case of great importance in applications. Seeking solutions in exponential form, we found the characteristic equation to be central. According to the fundamental theorem of algebra, such equations always have at least one root, so we are guaranteed of finding at least one exponential solution of the differential equation. If the $n$ roots $\lambda_1, \ldots, \lambda_n$ are distinct, then each root $\lambda_j$ contributes a solution $e^{\lambda_j x}$, and their superposition gives a general solution of (1) in the form
$$y(x) = C_1 e^{\lambda_1 x} + \cdots + C_n e^{\lambda_n x}. \qquad (66)$$
If any root $\lambda_j$ is repeated, say of order $k$, then it contributes not only the solution $e^{\lambda_j x}$, but the $k$ LI solutions $e^{\lambda_j x}, xe^{\lambda_j x}, \ldots, x^{k-1}e^{\lambda_j x}$ to the general solution. Thus, in the generic case of distinct roots, the general solution of (1) is of the form (66); in the nongeneric case of repeated roots, the solution also contains one or more terms which are powers of $x$ times exponentials.
It should be striking how simple is the solution process for linear constant-coefficient homogeneous equations, with the only difficulty being algebraic: the need to find the roots of the characteristic equation. The reason for this simplicity is that most of the work was done simply in deciding to look for solutions in the right place, within the set of exponential functions. Also, observe that although in a fundamental sense the solving of a differential equation in some way involves integration, the methods discussed in this section required no integrations, in contrast to most of the methods of solution of first-order equations in Chapter I.
In the final (optional) section we introduced the concept of stability, and in Theorem 3.4.3 we related the stability of the physical system to the placement of the roots of the characteristic equation in the complex plane.
Computer software. To obtain a general solution of $y''' - 9y' = 0$ using Maple, use the command
dsolve({diff(y(x), x, x, x) - 9*diff(y(x), x) = 0}, y(x));
and to solve the ODE subject to the initial conditions $y(0) = 5$, $y'(0) = 2$, $y''(0) = -4$, use the command
dsolve({diff(y(x), x, x, x) - 9*diff(y(x), x) = 0, y(0) = 5, D(y)(0) = 2, D(D(y))(0) = -4}, y(x));
In place of diff(y(x), x, x, x) we could use diff(y(x), x$3), for brevity.
EXERCISES 3.4
1. Use whichever of equations (5)—(8) are needed to derive
(0) y)
— 2y" —3y =0
theserelationsbetweenthecircular andhyperbolicfunctions: (p)yo) + 6y” + 8y =0
(a)cos (iz) = coshz
(c) cosh (tz) = cosa
(b) sin (tz) = isinhe
(d) sinh (iz) = ising
2. Use equations (6) and/or (7) to derive or verify
lowing equations, and a particular solution satisfying the given
conditions, if such conditions are given.
(f) equation(13d)
(e)equation(13c)
3. Theorem 3.4.2 states that e*!*, ze™*,...
Prove that claim.
(x)yt" = 2y
yo +2y =0
5. (a)—(r) Solve the corresponding problem in Exercise 4
Cee
eee
6. (Repeated roots) Find-a general solution of each of the fol-
° equation Oo
(a) equation (130)
io scuation Ok )
c) equa fone
(q)y(iv)) _+Ty”
+Ly =0 −
Wt og tt
,e*—1e*1*are LI.
y(-3)=5,y/(-3)=-1
(@y"=0;
(b) a + 6y’ as 3} 0;
ae
i
y(1) =
()y”
=0;
y(0)=3,
4. (Nonrepeated
roots)Find a generalsolutionof eachof the (@)y'” +5y”=0;
following equations, anda particular solution satisfying the
givenconditions,
if suchconditionsaregiven.
(a) y" + 5y! =0
"WW
(c)y”+y'
=0; y(0)=3,y'(0) =0
dy" -3y'+2y=0; y(l)=1, y(1)=0
=1,y(1)=
—4y’—5y=0;
ey”
(e)
y"—
4y'—~5y
=0; yA)
y(1)=
1, y'(1) =0
tt
5,4
—3y"
+ 3y'
—y =0
~y"” -y
+y =0
ap mn
() y - 2y a ss _ 0
y(0)=2,
y(O)=4,
y/(0)=5
y/(0)=-1
v(0) ~ .
v0)
()e q
y Tine mins _ 0.
a)
~~
_ 0.
∙−
_ 3
yu 0) ~ 0.
mw
nn
(j) yO)
+By” + 16y= 0
(Hy!ty! Ry=0; y(-1)=2,y(-1)=5
sy
(fy
a Oe)
(g)y" —4y'+5y=0;
(h)yy”—2y'+3y=0;
y(0) =6
Vo wt) ey
ee at
"
Mn
0; (0) = y/(0)= y"(0) = y"(0) =
y=
(kK)
y(0) =0, y(0)=3
7. (a)~(k) Solve the correspondingproblem in Exercise 6
usingcomputersoftware.
8. If the roots of the characteristic equation are as follows,
then find the original differential equationand also a general
Dy
=O |WO
Yoo yeay
YuV;
= 1, 4 0)=0,YM)
=U,
Y
= o wsonaFie
=1,
=
2,6
(a)
y+ y” —2y=0
(m)
(n)yy) ~y =0
y"(0)=0
(h) y
+ 3y"
= 0
=0
ty!
(iv) ty!
(i)
!
4
!
y” 0) = 1
y(0)=0,
(€) yl" + by" + 3y' +y =0
(g) yl"
(b) y” — y’ =0
y(0)=—-5,
y(0)=1,
(c)4 —21,4423
—2i
(b)21,
(d)—2,3,5
(f) 1,1, -2
(e) 2,3, —1
(g)4,4,4,i,-i
du yu
(h)1,-1,2+%,2—i
(i)0,0,0,0,7,9
G@iltil+ijl—i1—i
da
=0.
(11.2)
9. (Complexaj;'s)Find a generalsolution of each of the fol- Sl ve (11.2) for u, put that wuon the right-hand side of
lowing equations. NOTE: Normally, the a; coefficients in
(1) are real, but the results of this section hold even if they
are not (except for Theorem 3.4.4, which explicitly requires
that the coefficients be real). However, be aware that if the
y
— A»y = u, which is again of first order, and solve the
dx
latter for y. Show that if A,, A2 are distinct, then the result is
given by (23), whereas if they are repeated,then the result is
y(x) = (Cy + Cox)e*?”.
(c) Solve y” — 3y’ + 2y = 0 by factoring the operator as
necessarily occur in complex conjugate pairs. For instance,
(D ~1)(D —2)y = 0. Solvethelatterby themethodoutlined
A?4 2d + 1 = Ohastheroots\ = (/2 — 1)i, -(/2 + 1)i
-u=0
in (b): Setting(D —2)y = u, solve (D -lju=u'
(b)y” —3iy’ —2y =0
(a)y” —2iy’+y =0
for u(x). Then,knowingu(x), solve (D —2)y = u, namely,
a; coefficients
are not all real, then complex
(c)y"+ty’-y=0
roots do not
(d)y” —2iy’ —y=0
y' —2y=u(x),fory(a).
HINT: Verify, and use, the fact that (d) Same as (c), for y” — 4y = 0.
(e)Sameas (c),for y” + 4y’ + 3y = 0.
Vi = (1+ 3)/v2.
(f) Sameas (c),for y + 2y’ +y = 0.
Hy" +diy” —y' =0
Same as (c), for y” + 4y’ + 4y = 0.
(g)
HINT: Verify, and use, the fact that
NOTE: Similarly for higher-order equations. For instance,
f—j=4(1i)/V2
(h)yl!"—(1+ 2i)y” + (i +i)y’ —2014+
dy= 0 HINT: One y” — 2y” — y' + 24y= (D - 2)(D+ 1)(D —-1)y = O can
be solved by setting (D + 1)(D — 1)y = uw and solving
root is found, by inspection, to be A =
(D — 2)u = 0 for u(x); then set (D — 1)y = v and solve
(e) y’ — iy = 0
10. (a)—(h) Solve the corresponding
oroblem in Exercise 9
using computer software.
11. (Solution by factorization of the operator) We motivated
the idea of seeking solutions to (1) in the form e** by ob-
ane
that the general solution of the first-order equation
y' + ayy = 0 is an exponential, Ce~*,
and wondering if
higher-order equations might admit exponential solutions too.
A more compelling approach is as follows. Having already
seen that the first-order equation admits an exponential solution, consider the second-order equation (18).
(a) Show that (18) can be written, equivalently,
(D —Ar)(D —d2)y=0,
where D denotes d/dz,
as
and A, and \» are the two roots of
By theleft-handsideof (11.1),we mean
(D —1)((D —2)y). Thatis, firstlet theoperatortotheleft
of y (namely, D — A») act on y, then let the operator to the left
of (D —Az)y (namely,D —\;) acton that.
homogeneous differential equation with constant coefficients
can be reduced to the solution of a sequence of n first-order
linear equations.
12. Use computer software to obtain the roots of the given
characteristic equation, state whether the system is stable or
unstable, and explain why. If Theorem 3.4.4 applies, then
show that your results are consistent with the predictions of
that theorem.
(a) 8 —8d? + 26’ -—2 =0
(11.1) (b)A?+ 3A? + 2A +2 =0
d? +a;\+ a2 = 0. NOTE: In (11.1) we accomplish a factorization of the original differential operator L = D?-+a,D+az
as (D—,)(D—z).
(D+ 1)v = u for v(x); finally, solve (D — 1)y = v for
y(xz). The upshotis that the solution of an nth-order linear
(b) To solve (11.1), let (D — Az)y = u, so that (18) reduces to
the first-order equation
+ AS+31? +21 42=0
(c) M44
+(4+4=0
(d)M+ A452
(e)AS + 45 4.544 243—AZ+A4-3 =0
(f) X48+ OA54+5A4A+21 4+717 +(14+3=0
(g)AS+ AS+ 5AS+ 43 + 447 4+8A 4+4=0
(hyAS+ AF+ 5AA+ 2A3 + TAZ+A43 =0
(i) AB—AS + AS+5AA + QAP+ 72 +143 =0
(j) ABE AT AS + AE4 SASF 21 + 717 +A43=0
3.5 Application to Harmonic Oscillator: Free Oscillation
In Section 1.3 we discussed the modeling of the mechanical oscillator reproduced here in Fig. 1. Neglecting air resistance, the block of mass $m$ is subjected to a restoring force due to the spring, a "drag" force due to the friction between the block and the lubricated table top, and an applied force $f(t)$. (By a restoring force, we mean that the force opposes the stretch or compression in the spring.) Most of that discussion focused on the modeling of the spring force and friction force, and we derived the approximate equation of motion
$$mx'' + cx' + kx = f(t), \qquad (1)$$
where $c$ is the damping coefficient and $k$ is the spring stiffness. Besides the differential equation, let us regard the initial displacement and initial velocity,
$$x(0) = x_0 \quad\text{and}\quad x'(0) = x_0', \qquad (2)$$
respectively, as specified values.
In this section we consider the solution for the case where $f(t) = 0$:
$$mx'' + cx' + kx = 0. \qquad (3)$$
This is the so-called unforced, or free, oscillation. According to Theorem 3.3.1, the solution $x(t)$ to (3) and (2) does exist and is unique. To find it, we seek $x(t) = e^{\lambda t}$ and obtain the characteristic equation $m\lambda^2 + c\lambda + k = 0$, with roots
$$\lambda = \frac{-c \pm \sqrt{c^2 - 4mk}}{2m}. \qquad (4)$$
Consider first the case where there is no damping, so $c = 0$ and (3) becomes
$$mx'' + kx = 0. \qquad (5)$$
That is, the friction is small enough so that it can be neglected altogether. Then (4) gives $\lambda = \pm i\sqrt{k/m}$, and the solution of (5) is
$$x(t) = Ae^{i\omega t} + Be^{-i\omega t}, \qquad (6)$$
where $\omega = \sqrt{k/m}$ is the so-called natural frequency of the system, in rad/sec. Or, equivalent to (6) and favored in this text,
$$x(t) = C\cos\omega t + D\sin\omega t. \qquad (7)$$
In fact, there is another useful form of the general solution, namely,
$$x(t) = E\sin(\omega t + \phi), \qquad (8)$$
where the integration constants $E$ and $\phi$ can be determined in terms of $C$ and $D$ as follows. To establish the equivalence of (8) and (7), recall the trigonometric identity $\sin(A + B) = \sin B\cos A + \sin A\cos B$. Then
$$E\sin(\omega t + \phi) = E\sin\phi\cos\omega t + E\cos\phi\sin\omega t,$$
which is identical to $C\cos\omega t + D\sin\omega t$ if
$$C = E\sin\phi \quad\text{and}\quad D = E\cos\phi. \qquad (9a,b)$$
Squaring and adding equations (9), and also dividing one by the other, gives
$$E = \sqrt{C^2 + D^2} \quad\text{and}\quad \phi = \tan^{-1}\frac{C}{D}, \qquad (10a,b)$$
respectively, as the connection between the equivalent forms (7) and (8). It will be important to be completely comfortable with the equivalence between the solution forms (6), (7), and (8). Both the square root and the $\tan^{-1}$ in (10) are multi-valued. We will understand the square root as the positive one and the $\tan^{-1}$ to lie between $-\pi$ and $\pi$. Specifically, it follows from (9), with $E > 0$, that if $C > 0$ and $D > 0$ then $0 < \phi < \pi/2$, if $C > 0$ and $D < 0$ then $\pi/2 < \phi < \pi$, if $C < 0$ and $D > 0$ then $-\pi/2 < \phi < 0$, and if $C < 0$ and $D < 0$ then $-\pi < \phi < -\pi/2$.
For instance, consider $6\cos t - 2\sin t$. Then $E = \sqrt{36 + 4} = \sqrt{40}$ and $\phi = \tan^{-1}\left(\frac{6}{-2}\right)$. A calculator or typical computer software will interpret $\tan^{-1}(\ )$ as $-\pi/2 < \tan^{-1}(\ ) < \pi/2$, namely, in the first or fourth quadrant. Not able to distinguish $(+6)/(-2)$ from $(-6)/(+2)$, it will give $\tan^{-1}\left(-\frac{6}{2}\right) = -1.25$ rad, which is incorrect. The correct value is in the second quadrant, namely, $\phi = \pi - 1.25 = 1.89$ rad. Thus, $6\cos t - 2\sin t = \sqrt{40}\sin(t + 1.89)$.
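The quadrant difficulty just described is what the two-argument arctangent is designed to avoid. The sketch below is our own (the function name `amplitude_phase` is hypothetical); it uses Python's standard `math.atan2`, which takes the numerator and denominator separately and returns an angle between $-\pi$ and $\pi$, matching the convention adopted for (10b).

```python
import math

def amplitude_phase(C, D):
    """Convert C cos(wt) + D sin(wt) to E sin(wt + phi), per (9) and (10)."""
    E = math.hypot(C, D)        # positive square root of C^2 + D^2
    phi = math.atan2(C, D)      # angle whose sine is C/E and cosine is D/E
    return E, phi

E, phi = amplitude_phase(6, -2)
print(E, phi)

# Spot check at an arbitrary time t: both forms should agree.
t = 0.7
print(6 * math.cos(t) - 2 * math.sin(t), E * math.sin(t + phi))
```

For $C = 6$, $D = -2$ this places $\phi$ in the second quadrant, as required, rather than at the fourth-quadrant value returned by a one-argument inverse tangent.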
Whereas $C$ and $D$ in (7) have no special physical significance, $E$ and $\phi$ in (8) are the amplitude and phase angle of the vibration, respectively (Fig. 2a).
Figure 2. (a) Graphical significance of $\omega$, $\phi$. (b) Undamped free oscillation.
Although (8) is advantageous conceptually, in that the amplitude $E$ and phase angle $\phi$ are physically and graphically meaningful, it is a bit easier to apply the initial conditions to (7):
$$x(0) = x_0 = C, \qquad x'(0) = x_0' = \omega D,$$
so $C = x_0$, $D = x_0'/\omega$, and the solution is
$$x(t) = x_0\cos\omega t + \frac{x_0'}{\omega}\sin\omega t, \qquad (11)$$
a plot of which is shown in Fig. 2b for representative initial conditions $x_0$ and $x_0'$.
Before continuing, consider the relationship between the mathematics and the physics. For example, the frequency $\omega = \sqrt{k/m}$ increases with $k$, decreases with $m$, and is independent of the initial conditions $x_0$ and $x_0'$, and hence of the amplitude (which, according to (11) and (10), is $\sqrt{x_0^2 + (x_0'/\omega)^2}$). Do these results make sense? Probably the increase of $\omega$ with $k$ fits with our experience with springs, and its decrease with $m$ makes sense as well. However, one might well expect the frequency to vary with the amplitude of the vibration. We will come back to this point in Chapter 7, where we consider more realistic nonlinear models.
Now suppose there is some damping, $c > 0$. From (4) we see that there are three cases of interest. If we define the critical damping as $c_{cr} = \sqrt{4mk} = 2\sqrt{mk}$, then the solution is qualitatively different depending upon whether $c < c_{cr}$ (the "underdamped" case), $c = c_{cr}$ (the "critically damped" case), or $c > c_{cr}$ (the "overdamped" case).
Underdamped vibration ($c < c_{cr}$). In this case (4) gives two complex conjugate roots
$$\lambda = \frac{1}{2m}\left(-c \pm \sqrt{c^2 - c_{cr}^2}\right) = \frac{1}{2m}\left(-c \pm i\sqrt{c_{cr}^2 - c^2}\right) = -\frac{c}{2m} \pm i\sqrt{\omega^2 - \left(\frac{c}{2m}\right)^2},$$
so a general solution of (3) is
$$x(t) = e^{-\frac{c}{2m}t}\left[A\cos\sqrt{\omega^2 - \left(\frac{c}{2m}\right)^2}\,t + B\sin\sqrt{\omega^2 - \left(\frac{c}{2m}\right)^2}\,t\right], \qquad (12)$$
where $A$ and $B$ can be determined from the initial conditions (2). Of course, we could express the bracketed part in the form (8) if we like.
Comparing (7) and (12), observe that the damping has two effects. First, it introduces the $e^{-(c/2m)t}$ factor, which causes the oscillation to "damp out" as $t \to \infty$, as illustrated in Fig. 3. That is, the amplitude tends to zero as $t \to \infty$. Second, it reduces the frequency from the natural frequency $\omega$ to $\sqrt{\omega^2 - (c/2m)^2}$; that is, it makes the system more sluggish, as seems intuitively reasonable. (It might appear from Fig. 2b and 3 that the damping increases the frequency, but that appearance is only because we have compressed the $t$ scale in Fig. 3.)
Figure 3. Underdamped free oscillation.
Critically damped vibration ($c = c_{cr}$). As $c$ is increased further, the system becomes so sluggish that when $c$ attains the critical value $c_{cr}$ the oscillation ceases altogether. In this case (4) gives the repeated root $\lambda = -c/2m$, of order two, so
$$x(t) = (A + Bt)\,e^{-\frac{c}{2m}t}. \tag{13}$$
Although the $t$ in $A + Bt$ grows unboundedly, the exponential function decays more powerfully (as discussed within the proof of Theorem 3.4.3) and the solution (13) decays without oscillation, as shown in the $c = c_{cr}$ part of Fig. 4.
Overdamped vibration ($c > c_{cr}$). As $c$ increases beyond $c_{cr}$, (4) once again gives two distinct roots, but now they are both real and negative (because the $\sqrt{c^2 - 4mk}$ is smaller than $c$), so
$$x(t) = e^{-\frac{c}{2m}t}\left[A\cosh\sqrt{\left(\frac{c}{2m}\right)^2 - \omega^2}\;t + B\sinh\sqrt{\left(\frac{c}{2m}\right)^2 - \omega^2}\;t\right], \tag{14}$$
where $A$ and $B$ can be determined from the initial conditions (2). Indeed, if one or both roots were positive then we could have exponential growth, which would make no sense, physically. If that did happen we should expect that either there is an error in our mathematics or that our mathematical modeling of the phenomenon is grossly inaccurate.
Figure 4. Critically damped and overdamped cases.
A representative plot of that solution is shown in the $c > c_{cr}$ part of Fig. 4. For the sake of comparison we have used the same initial conditions to generate the three plots in Figures 3 and 4. Though one can use positive and negative exponentials within the parentheses in (14), in place of the hyperbolic cosine and sine, the latter are more convenient for the application of the initial conditions since the sinh is zero at $t = 0$ and so is the derivative of the cosh.
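The three regimes (12), (13), and (14) can be collected in one routine. This is a rough sketch under stated assumptions: the function names, the symbol `v0` for $x_0'$, and the test values are illustrative, and the constants $A$, $B$ are obtained here by imposing $x(0) = x_0$, $x'(0) = x_0'$ on each form.

```python
import math

def damped_free_response(t, m, c, k, x0, v0):
    """x(t) for m x'' + c x' + k x = 0 with x(0) = x0, x'(0) = v0."""
    a = c / (2.0 * m)            # decay rate c/2m
    disc = k / m - a * a         # omega^2 - (c/2m)^2
    if disc > 0:                 # underdamped, form (12)
        wd = math.sqrt(disc)
        A, B = x0, (v0 + a * x0) / wd
        return math.exp(-a * t) * (A * math.cos(wd * t) + B * math.sin(wd * t))
    if disc == 0:                # critically damped, form (13); exact float test
        A, B = x0, v0 + a * x0
        return math.exp(-a * t) * (A + B * t)
    s = math.sqrt(-disc)         # overdamped, form (14)
    A, B = x0, (v0 + a * x0) / s
    return math.exp(-a * t) * (A * math.cosh(s * t) + B * math.sinh(s * t))

def ode_residual(t, m, c, k, x0, v0, h=1e-4):
    """Central-difference estimate of m x'' + c x' + k x at time t."""
    f = lambda s: damped_free_response(s, m, c, k, x0, v0)
    d1 = (f(t + h) - f(t - h)) / (2.0 * h)
    d2 = (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)
    return m * d2 + c * d1 + k * f(t)

# m = 1, k = 4 gives c_cr = 2*sqrt(mk) = 4; try c below, at, and above it
r_under = ode_residual(0.7, 1.0, 1.0, 4.0, 1.0, 0.5)
r_crit = ode_residual(0.7, 1.0, 4.0, 4.0, 1.0, 0.5)
r_over = ode_residual(0.7, 1.0, 6.0, 4.0, 1.0, 0.5)
print(r_under, r_crit, r_over)
```

The residuals are only finite-difference estimates, so they are small rather than exactly zero. In practice an exact comparison with the critical value is fragile in floating point, and a tolerance would normally replace the `disc == 0` test.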
This completes our solution of equation (3), governing the free oscillation of the mechanical oscillator shown in Fig. 1. It should be emphasized that Fig. 1 is intended only as a schematic equivalent of the actual physical system. For instance, suppose the actual system consists of a beam cantilevered downward, with a mass $m$ at its end, as shown in Fig. 5a. We assume the mass of the beam to be negligible compared to $m$. It is known from Euler beam theory that if we apply a constant force $F$, as shown in Fig. 5b, then the end deflection $x$ is given by
$$x = \frac{FL^3}{3EI},$$
where $L$ is the length of the beam and $EI$ is its "stiffness" ($E$ is Young's modulus of the material and $I$ is a cross-sectional moment of inertia). Re-expressing the latter as $F = (3EI/L^3)x$, we see that it is of the form $F = kx$, as for a linear spring of stiffness $k$. Thus, insofar as the modeling and analysis is concerned, the physical beam system is equivalent to the mass-spring arrangement shown in Fig. 5c, where $k_{eq} = 3EI/L^3$ is the stiffness of the equivalent spring and where there is no friction between the block and the table top. The governing equation of motion is
$$mx'' + k_{eq}x = 0. \tag{15}$$
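To make the equivalent-spring idea concrete, here is a brief sketch that computes $k_{eq} = 3EI/L^3$ and the resulting natural frequency $\sqrt{k_{eq}/m}$ for (15). The numerical values of `E`, `I`, `L`, and `m` are hypothetical, chosen only to give round numbers in SI units.

```python
import math

def equivalent_stiffness(E, I, L):
    """Equivalent spring stiffness k_eq = 3 E I / L^3 of the end-loaded cantilever."""
    return 3.0 * E * I / L**3

def natural_frequency(k, m):
    """Natural frequency sqrt(k/m) in rad/s for m x'' + k x = 0."""
    return math.sqrt(k / m)

# Hypothetical data: E in Pa, I in m^4, L in m, m in kg
E, I, L, m = 200e9, 1e-8, 0.5, 2.0
k_eq = equivalent_stiffness(E, I, L)      # N/m
w = natural_frequency(k_eq, m)            # rad/s
print(k_eq, w, w / (2.0 * math.pi))       # last value is in cycles per second
```

Because $k_{eq}$ varies as $1/L^3$, halving the beam length in this model raises the stiffness by a factor of eight.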
Just as we neglected the mass of the beam, compared to $m$, likewise let us neglect the mass of the spring compared to $m$. (How to account for that mass, approximately, is discussed in the exercises.) It should be noted that, in addition, we are neglecting the rotational motion of the mass, in Fig. 5b, since we have already limited ourselves to the case of small deflections of the beam.
Figure 5. Equivalent mechanical systems.
Finally, it has already been pointed out, in Section 2.3, that the force-driven mechanical oscillator is analogous to the voltage-driven RLC electrical circuit reproduced here in Fig. 6, under the equivalence
$$L \leftrightarrow m, \quad R \leftrightarrow c, \quad \frac{1}{C} \leftrightarrow k, \quad i(t) \leftrightarrow x(t), \quad \frac{dE(t)}{dt} \leftrightarrow F(t), \tag{16}$$
so whatever results we have obtained in this section for the mechanical oscillator apply equally well to the electrical oscillator shown in Fig. 6, according to the equivalence given above.
Figure 6. Electrical oscillator; RLC circuit.
Closure. In this section we have considered the free oscillations of the mechanical harmonic oscillator. We found that for the undamped case the solution is a pure sine wave with an amplitude and phase shift that depend upon the initial conditions; that is, the solution is "harmonic." In the presence of light damping (i.e., for $c < c_{cr}$), the solution suffers exponential decay and a reduction in frequency, these effects becoming more pronounced as $c$ is increased. When $c$ reaches a critical value $c_{cr}$ the oscillation ceases altogether, and as $c$ is increased further the exponential decay is increasingly pronounced.
It should be emphasized that by the damped harmonic oscillator we mean a
system that can be modeled by a linear equation of the form $mx'' + cx' + kx = 0$.
In most applications, however, the restoring force can be regarded as a linear function of $x$ (namely, $kx$) only for motions that are sufficiently small departures from
an equilibrium configuration; if the motion is not sufficiently small, then one must
deal with a more difficult nonlinear differential equation. Thus, for the harmonic
oscillator, damped or not, we are able to generate simple closed form solutions, as
we have done in this section. For nonlinear oscillators one often gives up on the
possibility of finding closed form analytical solutions and relies instead on numerical simulation, as will be discussed in Chapter 6. To illustrate how such nonlinear
oscillators arise in applications, we have included several such examples in the
exercises that follow.
In terms of formulas,
the equivalence
of the three forms (6), (7), and (8)
should be clearly understood and remembered. In a given application we will use
whichever of these seems most convenient, usually (7).
EXERCISES 3.5
1. ... is, evaluate $E$, $\phi$, $\omega$.
(a) $6\cos t + \sin t$
(c) $5\cos 2t - 12\sin 2t$
(d) $-2\cos 3t + 2\sin 3t$
(e) $\cos 5t - \sin 5t$
(f) $x_0\cos\omega t + \dfrac{x_0'}{\omega}\sin\omega t$, from (11)
2. We emphasized the equivalence of the solution forms (6), (7), and (8), and discussed the equations (10a,b) that relate $C$ and $D$ in (7) to $E$ and $\phi$ in (8). Of course, we could have used the cosine in place of the sine, and expressed
$$x(t) = G\cos(\omega t + \psi) \tag{2.1}$$
instead. Derive formulas analogous to (10a,b), expressing $G$ and $\psi$ in terms of $C$ and $D$.
3. Apply the initial conditions (2) to the general solution (12), and solve for the integration constants $A$ and $B$ in terms of $m$, $c$, $\omega$, $x_0$ and $x_0'$.
4. Apply the initial conditions (2) to the general solution (14), and solve for the integration constants $A$ and $B$ in terms of $m$, $c$, $\omega$, $x_0$ and $x_0'$.
5. Consider an undamped harmonic oscillator governed by the equation $mx'' + kx = 0$, with initial conditions $x(0) = x_0$, $x'(0) = x_0'$. One might expect the frequency of oscillation to depend on the initial displacement $x_0$. Does it? Explain.
6. We mentioned in the text that the oscillation ceases altogether when $c$ is increased to $c_{cr}$ or beyond. Let us make that statement more precise: for $c \ge c_{cr}$ the graph of $x(t)$ has at most one "flat spot" (on $0 < t < \infty$), that is, where $x' = 0$.
(a) Prove that claim.
(b) Make up a case (i.e., give numerical values of $m$, $c$, $k$, $x_0$, $x_0'$) where there is no flat spot on $0 < t < \infty$.
(c) Make up a case where there is one flat spot on $0 < t < \infty$.
7. (Logarithmic decrement) For the underdamped case ($c < c_{cr}$), let $x_n$ and $x_{n+1}$ denote any two successive maxima of $x(t)$.
(a) Show that the ratio $r_n = x_n/x_{n+1}$ is a constant, say $r$; that is, $x_1/x_2 = x_2/x_3 = \cdots = r$.
(b) Further, show that the natural logarithm of $r$, called the logarithmic decrement $\delta$, is given by $\delta = \ldots$
8. (Grandfather clock) Consider a pendulum governed by the equation of motion $ml\theta'' + mg\sin\theta = 0$, or
$$\theta'' + \frac{g}{l}\sin\theta = 0, \tag{8.1}$$
where $g$ is the acceleration of gravity. (See the figure.) If $|\theta| \ll 1$ (where $\ll$ means much smaller than), then $\sin\theta \approx \theta$, and the nonlinear equation of motion (8.1) can be simplified to the linear equation
$$\theta'' + \frac{g}{l}\theta = 0, \tag{8.2}$$
or, if we allow for some inevitable amount of damping due to friction and air resistance,
$$\theta'' + \epsilon\theta' + \frac{g}{l}\theta = 0, \tag{8.3}$$
where $0 < \epsilon \ll 1$. Now imagine the pendulum to be part of a grandfather's clock. If a ratchet mechanism converts each oscillation to one second of recorded time, how does the clock maintain its accuracy even as it runs down, that is, even when its amplitude of oscillation has diminished to a small fraction of its initial value? Explain.
9. (Correction for the mass of the spring) Recall that our model of the mechanical oscillator neglects the effect of the mass of the spring on the grounds that it is sufficiently small compared to that of the mass $m$. In this exercise we seek to improve our model so as to account, if only approximately, for the mass of the spring. In doing so, we consider the undamped case, for which the equation of motion is $mx'' + kx = 0$.
(a) Multiplying that equation by $dx$ and integrating, derive the "first integral"
$$\frac{1}{2}mx'^2 + \frac{1}{2}kx^2 = C, \tag{9.1}$$
which states that the total energy, the kinetic energy of the mass plus the potential energy of the spring, is a constant.
(b) Let the mass of the spring be $m_s$. Suppose that the velocity of the elements within the spring at any time $t$ varies linearly from 0 at the fixed end to $x'(t)$ at its attachment to the mass $m$. Show, subject to that assumption, that the kinetic energy in the spring is $\frac{1}{6}m_s x'^2(t)$. Improving (9.1) to the form
$$\frac{1}{2}\left(m + \frac{1}{3}m_s\right)x'^2 + \frac{1}{2}kx^2 = C, \tag{9.2}$$
obtain, by differentiation with respect to $t$, the improved equation of motion
$$\left(m + \frac{1}{3}m_s\right)x'' + kx = 0. \tag{9.3}$$
Thus, as a correction, to take into account the mass of the spring, we merely replace the mass $m$ in $mx'' + kx = 0$ by an "effective mass" $m + \frac{1}{3}m_s$, which incorporates the effect of the mass of the spring. NOTE: This analysis is approximate in that it assumes the velocity distribution within the spring, whereas that distribution itself needs to be determined, which determination involves the solution of a partial differential equation of wave type, as studied in a later chapter.
(c) In obtaining an effective mass of the form $m + \alpha m_s$, why is it reasonable that $\alpha$ turns out to be less than 1?
10. (Piston oscillator) Let a piston of mass $m$ be placed at the midpoint of a closed cylinder of cross-sectional area $A$ and length $2L$, as sketched. Assume that the pressure $p$ on either side of the piston satisfies Boyle's law (namely, that the pressure times the volume is constant), and let $p_0$ be the pressure on both sides when $x = 0$.
(a) If the piston is disturbed from its equilibrium position $x = 0$, show that the governing equation of motion is
$$mx'' + 2p_0AL\,\frac{x}{L^2 - x^2} = 0. \tag{10.1}$$
(b) Is (10.1) linear or nonlinear? Explain.
(c) Expand the $x/(L^2 - x^2)$ term in a Taylor series about $x = 0$, up to the third-order term. Keeping only the leading term, derive the linearized version
$$mx'' + \frac{2p_0A}{L}x = 0 \tag{10.2}$$
of (10.1), which is restricted to the case of small oscillations, that is, where the amplitude of oscillation is small compared to $L$.
(d) From (10.2), determine the frequency of oscillation, in cycles per second.
(e) Is the resulting linearized model equivalent to the vibration of a mass/spring system, with an equivalent spring stiffness of $k_{eq} = 2p_0A/L$? Explain.
11. (Lateral vibration of a bead on a string) Consider a mass $m$, such as a bead, restrained by strings (of negligible mass), in each of which there is a tension $\tau_0$, as shown in Fig. a. We seek the frequency of small lateral oscillations of $m$. A lateral displacement $x$ (Fig. b) will cause the length of each string to increase from $l_0$ to $l(x) = \sqrt{l_0^2 + x^2}$. Suppose that the tension $\tau$ is found, empirically, to increase with $l$, from its initial value $\tau_0$, as shown in Fig. c.
(a) Show that the governing equation of lateral motion is
$$mx'' + 2\tau\!\left(\sqrt{l_0^2 + x^2}\right)\frac{x}{\sqrt{l_0^2 + x^2}} = 0, \tag{11.1}$$
where $\tau\!\left(\sqrt{l_0^2 + x^2}\right)$ is a function, not a product.
(b) Is (11.1) linear or nonlinear? Explain.
(c) Expand the $\tau\!\left(\sqrt{l_0^2 + x^2}\right)x/\sqrt{l_0^2 + x^2}$ term in a Taylor series about $x = 0$, up to the third-order term. [You should find that the coefficients of these terms involve $l_0$, $\tau_0$, and $\tau'(l_0)$.]
(d) Linearize the equation of motion by retaining only the leading term of that Taylor series, show that the equivalent spring stiffness is $k_{eq} = 2\tau_0/l_0$, and that the frequency of small oscillations is $\dfrac{1}{2\pi}\sqrt{\dfrac{2\tau_0}{ml_0}}$ cycles/sec.
12. (Oscillating platform) A uniform horizontal platform of mass $m$ is supported by counter-rotating cylinders a distance $L$ apart (see figure). The friction force $f$ exerted on the platform by each cylinder is proportional to the normal force $N$ between the platform and the cylinder, with constant of proportionality (coefficient of sliding friction) $\mu$: $f = \mu N$. Show that if the platform is disturbed from its equilibrium position ($x = 0$), then it will undergo a lateral oscillation of frequency $\omega = \sqrt{2\mu g/L}$ rad/sec, where $g$ is the acceleration of gravity. HINT: Derive the equation of motion governing the lateral displacement $x$ of the midpoint of the platform relative to a point midway between the cylinders.
13. (Tilted pendulum) Consider a rod of length $l$ with a point mass $m$ at its end, where the mass of the rod is negligible compared to $m$. The rod is welded at a right angle to another, which rotates without friction about an axis that is tilted by an angle of $\alpha$ with respect to the vertical (see figure). Let $\theta$ denote the angle of rotation of the pendulum, with respect to its equilibrium position (where $m$ is at its lowest possible point, namely, in the plane of the paper).
(a) Derive the governing equation of motion
$$\theta'' + \frac{g}{l}\sin\alpha\,\sin\theta = 0. \tag{13.1}$$
As a partial check of this result, observe that for $\alpha = \pi/2$ (13.1) does reduce to the equation of motion of the ordinary pendulum (see Exercise 8). HINT: Write down an equation of conservation of energy (kinetic plus potential energy equal a constant), and differentiate it with respect to the time $t$.
(b) What is the frequency of small amplitude oscillations, in rad/sec? In cycles/sec?
3.6 Solution of Homogeneous Equation: Nonconstant Coefficients
We return to the nth-order linear homogeneous equation
$$a_0(x)\frac{d^n y}{dx^n} + \cdots + a_n(x)\,y = 0, \tag{1}$$
... homogeneous equation, given in Section 3.3, holds whether the coefficients are constant ... the coefficients in (1) are not all constants. Only in special cases are we able to find ... (Chapter 4) or pursue a numerical approach (Chapter 6).
3.6.1. Cauchy-Euler equation. If (1) is of the special form
$$x^n\frac{d^n y}{dx^n} + c_1x^{n-1}\frac{d^{n-1}y}{dx^{n-1}} + \cdots + c_{n-1}x\frac{dy}{dx} + c_ny = 0, \tag{2}$$
where the $c_j$'s are constants, it is called a Cauchy-Euler equation, and is also called an equidimensional equation.
Of most importance to us will be the case where $n = 2$, so let us consider that case first, namely,
$$x^2y'' + c_1xy' + c_2y = 0, \tag{3}$$
and let us consider the $x$ interval to be $0 < x < \infty$; the case of negative $x$'s will be treated separately, below. Suppose we try to solve (3) by seeking $y$ in the form $y = e^{\lambda x}$, where $\lambda$ is a yet-to-be-determined constant, which form proved successful for the constant-coefficient case. Then $y' = \lambda e^{\lambda x}$ and $y'' = \lambda^2e^{\lambda x}$, so (3) becomes
$$\lambda^2x^2e^{\lambda x} + \lambda c_1xe^{\lambda x} + c_2e^{\lambda x} = 0. \tag{4}$$
If we cancel the (nonzero) exponentials we obtain a quadratic equation for $\lambda$, solution of which gives $\lambda$ as a function of $x$. However, $\lambda$ was supposed to be a constant, so we have a contradiction, and the method fails. (Specifically, if $\lambda$ turns out to be a function of $x$, then $y' = \lambda e^{\lambda x}$ and $y'' = \lambda^2e^{\lambda x}$, above, were incorrect.) Said differently, the $x^2e^{\lambda x}$, $xe^{\lambda x}$, $e^{\lambda x}$ terms in (4) are LI and cannot be made to cancel identically to zero on any given $x$ interval.
The reason we have discussed this fruitless approach is to emphasize that it is incorrect, and to caution against using it. By contrast, if the equation were of constant-coefficient type, say $y'' + c_1y' + c_2y = 0$, then $y = e^{\lambda x}$ would work because $y = e^{\lambda x}$, $y' = \lambda e^{\lambda x}$, $y'' = \lambda^2e^{\lambda x}$ are LD, so the combination $y'' + c_1y' + c_2y$ could be made to cancel to zero by suitable choice of $\lambda$.
Although the form $e^{\lambda x}$ will not work for Cauchy-Euler equations, the form
$$y = x^\lambda \tag{5}$$
will, because $y = x^\lambda$, $xy' = \lambda x^\lambda$, $x^2y'' = \lambda(\lambda - 1)x^\lambda$, ... are LD since each is a constant times $x^\lambda$. Putting (5) into the second-order equation (3) gives
$$\left[\lambda(\lambda - 1) + c_1\lambda + c_2\right]x^\lambda = 0.$$
Since $x^\lambda \ne 0$, we require of $\lambda$ that
$$\lambda^2 - (1 - c_1)\lambda + c_2 = 0,$$
so
$$\lambda = \frac{1 - c_1 \pm \sqrt{(1 - c_1)^2 - 4c_2}}{2}. \tag{6}$$
We distinguish three cases, depending upon whether the discriminant $\Delta = (1 - c_1)^2 - 4c_2$ is positive, zero, or negative:
$\Delta > 0$: Distinct real roots. If $\Delta > 0$, then (6) gives two distinct real roots, say $\lambda_1$ and $\lambda_2$, so we have the general solution to (3) as
$$y(x) = Ax^{\lambda_1} + Bx^{\lambda_2}. \tag{7}$$
EXAMPLE 1. To solve
$$x^2y'' - 2xy' - 10y = 0, \tag{8}$$
seek $y = x^\lambda$. That form gives $\lambda^2 - 3\lambda - 10 = 0$, with roots $\lambda = -2$ and 5, so the general solution of (8) is
$$y(x) = \frac{A}{x^2} + Bx^5. \;\blacksquare$$
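The root formula (6) is easy to mechanize. The sketch below is illustrative only: the helper name `cauchy_euler_roots` is an assumption, and `cmath.sqrt` is used so that the same call also covers a negative discriminant.

```python
import cmath

def cauchy_euler_roots(c1, c2):
    """Roots of lambda^2 - (1 - c1)*lambda + c2 = 0, i.e. formula (6),
    for the equation x^2 y'' + c1 x y' + c2 y = 0.

    Returns (lambda_1, lambda_2, discriminant); the roots are complex
    numbers whose imaginary parts vanish when the discriminant is >= 0.
    """
    disc = (1 - c1) ** 2 - 4 * c2
    s = cmath.sqrt(disc)
    return ((1 - c1) + s) / 2, ((1 - c1) - s) / 2, disc

# Example 1: x^2 y'' - 2 x y' - 10 y = 0, so c1 = -2 and c2 = -10
lam1, lam2, disc = cauchy_euler_roots(-2, -10)
print(lam1, lam2, disc)
```

For Example 1 the discriminant is 49, and the two returned roots are 5 and -2 (as complex numbers with zero imaginary part), in agreement with the text.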
$\Delta = 0$: Repeated real roots. In this case (6) gives the repeated root $\lambda = \dfrac{1 - c_1}{2} \equiv \lambda_1$. Thus we have the solution $Ax^{\lambda_1}$, but are missing a second linearly independent solution, which is needed if we are to obtain a general solution of (3). Evidently, the missing solution is not of the form $x^\lambda$, or we would have found it when we sought $y(x) = x^\lambda$.
To find the missing solution we use Lagrange's method of reduction of order, as we did in Section 3.3.3 for constant-coefficient differential equations with repeated roots of the characteristic equation. That is, we let $A$ vary, and seek
$$y(x) = A(x)x^{\lambda_1}. \tag{9}$$
Putting (9) into (3) gives (we leave the details to the reader, as Exercise 3)
$$xA'' + A' = 0.$$
Next, set $A' = p$, say, to reduce the order:
$$xp' + p = 0, \tag{10}$$
so $p = D/x$ and $A(x) = D\ln x + C$, where $C$, $D$ are arbitrary constants (Exercise 4). Finally, putting the latter back into (9) gives the general solution of (3) as
$$y(x) = (C + D\ln x)\,x^{\lambda_1}. \tag{11}$$
EXAMPLE 2. To solve
$$x^2y'' + 7xy' + 9y = 0, \tag{12}$$
seek $y = x^\lambda$. That form gives $\lambda^2 + 6\lambda + 9 = 0$, with the repeated root $\lambda = -3$, so the general solution of (12) is
$$y(x) = (A + B\ln x)\,x^{-3}. \;\blacksquare$$
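$\Delta < 0$: Complex roots. In this case (6) gives the distinct complex-conjugate roots
$$\lambda = \frac{1 - c_1}{2} \pm i\,\frac{\sqrt{4c_2 - (1 - c_1)^2}}{2} \equiv \alpha \pm i\beta, \tag{13}$$
so we have the general solution of (3) as
$$y(x) = Ax^{\alpha + i\beta} + Bx^{\alpha - i\beta} = x^\alpha\left(Ax^{i\beta} + Bx^{-i\beta}\right). \tag{14}$$
However, since we normally prefer real solution forms, let us use the identity $u = e^{\ln u}$ to re-express (14) as*
$$\begin{aligned}
y(x) &= x^\alpha\left(Ae^{\ln x^{i\beta}} + Be^{\ln x^{-i\beta}}\right) = x^\alpha\left(Ae^{i\beta\ln x} + Be^{-i\beta\ln x}\right) \\
&= x^\alpha\left\{A\left[\cos(\beta\ln x) + i\sin(\beta\ln x)\right] + B\left[\cos(\beta\ln x) - i\sin(\beta\ln x)\right]\right\} \\
&= x^\alpha\left[(A + B)\cos(\beta\ln x) + i(A - B)\sin(\beta\ln x)\right].
\end{aligned} \tag{15}$$
Or, letting $A + B = C$ and $i(A - B) = D$,
$$y(x) = x^\alpha\left[C\cos(\beta\ln x) + D\sin(\beta\ln x)\right]. \tag{16}$$
EXAMPLE 3. To solve
$$x^2y'' - 2xy' + 4y = 0, \tag{17}$$
seek $y = x^\lambda$. That form gives $\lambda^2 - 3\lambda + 4 = 0$, so $\lambda = \frac{3}{2} \pm i\frac{\sqrt{7}}{2}$. Hence
$$\begin{aligned}
y(x) &= Ax^{3/2 + i\sqrt{7}/2} + Bx^{3/2 - i\sqrt{7}/2} = x^{3/2}\left(Ax^{i\sqrt{7}/2} + Bx^{-i\sqrt{7}/2}\right) \\
&= x^{3/2}\left(Ae^{i\frac{\sqrt{7}}{2}\ln x} + Be^{-i\frac{\sqrt{7}}{2}\ln x}\right) \\
&= x^{3/2}\left[C\cos\left(\frac{\sqrt{7}}{2}\ln x\right) + D\sin\left(\frac{\sqrt{7}}{2}\ln x\right)\right]. \;\blacksquare
\end{aligned}$$
*It is important to appreciate that the $x^{\alpha \pm i\beta}$ quantities, in (14), are "new objects" for us, for we have not yet (in this book) defined a real number $x$ raised to a complex power (unless $x$ happens to be $e$, in which case the already-discussed Euler's formulas apply). Staying as close as possible to familiar real variable results, let us write
$$x^{\alpha + i\beta} = x^\alpha x^{i\beta} = x^\alpha\left(e^{\ln x}\right)^{i\beta} = x^\alpha e^{i\beta\ln x}$$
and similarly for $x^{\alpha - i\beta}$. None of these three equalities is justifiable, since they rely on the formulas $x^{a+b} = x^ax^b$, $u = e^{\ln u}$, and $\ln x^c = c\ln x$, which assume $x$, $a$, $b$, $c$ to be real, but we hereby understand them to hold by definition. Observe that complex quantities and complex functions keep forcing themselves upon us. Therefore, it behooves us to establish a general theory of complex functions, rather than deal with these issues one by one. We will do exactly that, but not until much later in the text.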
Recall that we have limited our discussion of (3) to the case where $x > 0$. The reason for that limitation is as follows. For a function $y(x)$ to be a solution of (3) on an $x$ interval $I$, we first of all need each of $y$, $y'$, and $y''$ to exist on $I$; that is, to be defined there. The function $\ln x$ and its derivatives are not defined at $x = 0$, nor is $\ln x$ defined (as a real-valued function) for $x < 0$. The functions $x^{\lambda_1}$, $x^{\lambda_2}$ in (7), $x^{\lambda_1}$ in (11), and $x^\alpha$ in (16) cause similar problems. To deal with the case where $x < 0$, it is more convenient to make the change of variable $x = -\xi$ in (3), so that $\xi$ will be positive. Letting $y(x) = y(-\xi) \equiv Y(\xi)$,*
$$\frac{dy}{dx} = \frac{dY}{d\xi}\frac{d\xi}{dx} = -\frac{dY}{d\xi}, \qquad \frac{d^2y}{dx^2} = \frac{d}{d\xi}\left(-\frac{dY}{d\xi}\right)\frac{d\xi}{dx} = \frac{d^2Y}{d\xi^2}, \tag{18}$$
so (3) becomes
$$\xi^2\frac{d^2Y}{d\xi^2} + c_1\xi\frac{dY}{d\xi} + c_2Y = 0, \qquad (\xi > 0) \tag{19}$$
which is the same as (3)! Thus, its solutions are the same, but with $x$ changed to $\xi$. For the case of distinct real roots, for instance,
$$y(x) = Ax^{\lambda_1} + Bx^{\lambda_2}$$
for $x > 0$, and
$$y(x) = Y(\xi) = Y(-x) = A(-x)^{\lambda_1} + B(-x)^{\lambda_2}$$
for $x < 0$. Observe that both of these forms are accounted for by the single expression
$$y(x) = A|x|^{\lambda_1} + B|x|^{\lambda_2}.$$
Similarly for the other cases (repeated real roots and complex roots). Let us state these results, for reference, as a theorem.
*Why do we change the name of the dependent variable from $y$ to $Y$? Because they are different functions. To illustrate, suppose $y(x) = 5 + x^3$. Then $Y(\xi) = 5 + (-\xi)^3 = 5 - \xi^3$. For instance, if the argument of $y$ is 2, then $y$ is 13, but when the argument of $Y$ is 2, then $Y$ is $-3$.
THEOREM 3.6.1 Second-Order Cauchy-Euler Equation
The general solution of the second-order Cauchy-Euler equation
$$x^2y'' + c_1xy' + c_2y = 0, \tag{20}$$
on any $x$ interval not containing the origin, is
$$y(x) = \begin{cases} A|x|^{\lambda_1} + B|x|^{\lambda_2} \\ (A + B\ln|x|)\,|x|^{\lambda_1} \\ |x|^\alpha\left[A\cos(\beta\ln|x|) + B\sin(\beta\ln|x|)\right] \end{cases} \tag{21}$$
if the roots $\lambda_1$, $\lambda_2$ of $\lambda^2 + (c_1 - 1)\lambda + c_2 = 0$ are real and distinct, real and repeated, or complex ($\lambda = \alpha \pm i\beta$), respectively.
Of course, if the $x$ interval is to the right of the origin, then the absolute value signs in (21) can be dropped.
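The three cases of (21) can be checked numerically on either side of the origin. The following is a hedged sketch: `cauchy_euler_basis` and `residual` are illustrative names, the residual is only a central-difference estimate, and the exact test `disc == 0` is reliable here only because the sample coefficients are small integers.

```python
import math

def cauchy_euler_basis(c1, c2):
    """Two real basis solutions of x^2 y'' + c1 x y' + c2 y = 0, per (21)."""
    disc = (1 - c1) ** 2 - 4 * c2
    if disc > 0:                                   # real, distinct roots
        r = math.sqrt(disc)
        l1, l2 = ((1 - c1) + r) / 2, ((1 - c1) - r) / 2
        return (lambda x: abs(x) ** l1,
                lambda x: abs(x) ** l2)
    if disc == 0:                                  # real, repeated root
        l1 = (1 - c1) / 2
        return (lambda x: abs(x) ** l1,
                lambda x: math.log(abs(x)) * abs(x) ** l1)
    alpha, beta = (1 - c1) / 2, math.sqrt(-disc) / 2   # complex roots
    return (lambda x: abs(x) ** alpha * math.cos(beta * math.log(abs(x))),
            lambda x: abs(x) ** alpha * math.sin(beta * math.log(abs(x))))

def residual(y, c1, c2, x, h=1e-4):
    """Central-difference estimate of x^2 y'' + c1 x y' + c2 y at x."""
    d1 = (y(x + h) - y(x - h)) / (2 * h)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / (h * h)
    return x * x * d2 + c1 * x * d1 + c2 * y(x)

# Coefficients (c1, c2) of Examples 1, 2, and 3
ex1 = cauchy_euler_basis(-2, -10)
ex2 = cauchy_euler_basis(7, 9)
ex3 = cauchy_euler_basis(-2, 4)
print(residual(ex1[0], -2, -10, 1.3), residual(ex2[1], 7, 9, -1.3),
      residual(ex3[1], -2, 4, -1.3))
```

Evaluating at a negative `x` exercises the absolute-value form of the theorem; the printed residuals should be small compared with the sizes of the individual terms.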
To close our discussion of the Cauchy-Euler equation, consider the higher-order case, $n > 2$. For simplicity, we consider $x > 0$; as for the second-order case treated above, $x < 0$ can be handled simply by changing all $x$'s in the solution to $|x|$.
EXAMPLE 4. Consider the third-order Cauchy-Euler equation
$$x^3y''' - 3x^2y'' + 7xy' - 8y = 0. \tag{22}$$
Seeking $y(x) = x^\lambda$ gives
$$\lambda^3 - 6\lambda^2 + 12\lambda - 8 = 0,$$
with the roots $\lambda = 2, 2, 2$. Thus we have the solution $y(x) = Ax^2$, but we need two more linearly independent solutions. To find them, we use reduction of order and seek $y(x) = A(x)x^2$. Putting that form into (22) gives the equation
$$x^2A''' + 3xA'' + A' = 0$$
on $A(x)$, which can be reduced to the second-order equation
$$x^2p'' + 3xp' + p = 0 \tag{23}$$
by letting $A' = p$. The latter is again of Cauchy-Euler type, and letting $p(x) = x^\lambda$ gives $\lambda = -1, -1$, so that
$$p(x) = (B + C\ln x)\frac{1}{x}.$$
Since $A' = p$,
$$A(x) = \int p\,dx = B\ln x + \frac{C}{2}(\ln x)^2 + D,$$
and
$$y(x) = \left[C_1 + C_2\ln x + C_3(\ln x)^2\right]x^2 \tag{24}$$
is the desired general solution of (22). $\blacksquare$
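For higher orders, the polynomial in $\lambda$ obtained by seeking $y = x^\lambda$ can be assembled mechanically, since $x^k y^{(k)} = \lambda(\lambda - 1)\cdots(\lambda - k + 1)\,x^\lambda$. The sketch below assumes the equation is supplied as a list `c` in which `c[k]` multiplies $x^k y^{(k)}$; that convention and the function name are choices made for this illustration.

```python
def cauchy_euler_char_poly(c):
    """Characteristic polynomial of sum_k c[k] * x^k * y^(k) = 0.

    Returns a list p where p[j] is the coefficient of lambda^j.
    """
    n = len(c) - 1
    poly = [0.0] * (n + 1)
    ff = [1.0]   # falling factorial lambda(lambda-1)...(lambda-k+1); equals 1 for k = 0
    for k in range(n + 1):
        for j, a in enumerate(ff):
            poly[j] += c[k] * a
        # multiply the falling factorial by (lambda - k) for the next term
        nxt = [0.0] * (len(ff) + 1)
        for j, a in enumerate(ff):
            nxt[j + 1] += a
            nxt[j] -= k * a
        ff = nxt
    return poly

# Example 4: x^3 y''' - 3 x^2 y'' + 7 x y' - 8 y = 0
p = cauchy_euler_char_poly([-8.0, 7.0, -3.0, 1.0])
value_at_2 = sum(coef * 2.0 ** j for j, coef in enumerate(p))
print(p, value_at_2)
```

For Example 4 this reproduces $\lambda^3 - 6\lambda^2 + 12\lambda - 8$, which vanishes at $\lambda = 2$.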
Comparing the latter result with the solution (11) for the second-order Cauchy-Euler equation with repeated root $\lambda_1$, we might well suspect that if any Cauchy-Euler equation has a repeated root $\lambda_1$ of order $k$, then that root contributes the form
$$\left[C_1 + C_2\ln x + C_3(\ln x)^2 + \cdots + C_k(\ln x)^{k-1}\right]x^{\lambda_1} \tag{25}$$
to the general solution. We state, without proof, that that is indeed the case.
EXAMPLE 5. As a summary example, suppose that upon seeking solutions of a given eighth-order Cauchy-Euler equation in the form $y(x) = x^\lambda$ we obtain the roots
$$\lambda = -2.4,\; 1.7,\; 1.7,\; 1.7,\; -3 + 4i,\; -3 + 4i,\; -3 - 4i,\; -3 - 4i.$$
Then the general solution is
$$y(x) = C_1x^{-2.4} + \left[C_2 + C_3(\ln x) + C_4(\ln x)^2\right]x^{1.7} + (C_5 + C_6\ln x)\,x^{-3+4i} + (C_7 + C_8\ln x)\,x^{-3-4i}, \tag{26}$$
or (Exercise 5),
$$\begin{aligned}
y(x) = {} & C_1x^{-2.4} + \left[C_2 + C_3(\ln x) + C_4(\ln x)^2\right]x^{1.7} \\
& + \left\{\left[C_9\cos(4\ln x) + C_{10}\sin(4\ln x)\right] + \ln x\left[C_{11}\cos(4\ln x) + C_{12}\sin(4\ln x)\right]\right\}x^{-3}.
\end{aligned} \tag{27}$$
Although such high-order Cauchy-Euler equations are uncommon, we include this example to illustrate the general solution pattern for any order $n \ge 2$. $\blacksquare$
This concludes our discussion of the Cauchy—Euler equation. We will meet
Cauchy—Euler equations again in the next chapter, in connection with power series
solutions of differential equations, and again when we study the partial differential
equations governing such phenomena as heat conduction, electric potential, and
certain types of fluid flow, in later chapters.
3.6.2. Reduction of order. (Optional) We have already used Lagrange's method of reduction of order to find "missing solutions," for constant-coefficient equations and for Cauchy-Euler equations as well. Here, we focus not on constant-coefficient or Cauchy-Euler equations, but on the method itself and indicate its more general application to any linear homogeneous differential equation.
For definiteness, consider the second-order case,
$$y'' + a_1(x)y' + a_2(x)y = 0. \tag{28}$$
Chapter 3. Linear Differential Equations of Second Order and Higher
Suppose that one solution is known to us, say $Y(x)$, and that a second linearly independent solution is sought. If $Y(x)$ is a solution, then so is $AY(x)$ for any constant $A$. The idea behind Lagrange's method of reduction of order is to seek the missing solution in the form
$$y(x) = A(x)Y(x), \tag{29}$$
where $A(x)$ is to be determined.
The method is similar to Lagrange's method of variation of parameters, introduced in Section 2.2, but its purpose is different. The latter was used to find the general solution of the nonhomogeneous equation $y' + p(x)y = q(x)$ from the solution $y_h(x) = Ae^{-\int p(x)\,dx}$ of the homogeneous equation $y' + p(x)y = 0$, by varying the parameter $A$ and seeking $y(x) = A(x)e^{-\int p(x)\,dx}$. Reduction of order is similar in that we vary the parameter $A$ in $y = AY(x)$, but different in that it is used to find a missing solution of a homogeneous equation from a known solution $Y(x)$ of that homogeneous equation.
We begin by emphasizing that at first glance the form (29) seems to be without promise. To explain that statement, observe that the search for a pair of lost glasses can be expected to be long and arduous if we merely know that they are somewhere in North America, but shorter and easier to whatever extent we are able to narrow the domain of the search. If, for instance, we know that they are somewhere on our desk, then the search is short and simple. Likewise, when we solve a constant-coefficient equation by seeking $y$ in the form $e^{\lambda x}$ then the search is short and simple since, first, solutions will indeed be found within that subset and, second, because that subset is tiny compared to the set of all possible functions, just as one's desk is tiny compared to North America. Similarly, when we solve a Cauchy-Euler equation by seeking $y$ in the form $x^\lambda$.
With this idea in mind, observe that the form (29) does not narrow our search in the slightest, since it includes all functions! That is, any given function $f(x)$ can be expressed as $A(x)Y(x)$ simply by choosing $A(x)$ to be $f(x)/Y(x)$.
Proceeding nonetheless, we put (29) into (28) and obtain the differential equation
$$A''Y + (2Y' + a_1Y)A' + (Y'' + a_1Y' + a_2Y)A = 0 \tag{30}$$
on $A(x)$. At first glance it appears that this differential equation on $A(x)$ is probably even harder than the original equation, (28), on $y(x)$. However, and this is the heart of Lagrange's idea, all of the undifferentiated $A$ terms must cancel, because if $A$ were a constant (in which case the $A'$ and $A''$ terms would all drop out), then the remaining terms would have to cancel to zero because $AY(x)$ is a solution of (28). Thus, the coefficient of $A$ in (30) is zero, so (30) becomes
$$A''Y + (2Y' + a_1Y)A' = 0, \tag{31}$$
the order of which can now be reduced from two to one by letting $A' = p$:
$$\frac{dp}{dx} + \left(\frac{2Y' + a_1Y}{Y}\right)p = 0. \tag{32}$$
Integrating the latter gives
$$p(x) = Be^{-\int\frac{2Y' + a_1Y}{Y}\,dx} = Be^{-2\ln Y - \int a_1\,dx} = BY(x)^{-2}e^{-\int a_1(x)\,dx}.$$
Finally, integration of $A' = p$ gives
$$A(x) = \int p(x)\,dx = B\int Y(x)^{-2}e^{-\int a_1(x)\,dx}\,dx + C,$$
so (29) becomes
$$y(x) = \left[B\int Y(x)^{-2}e^{-\int a_1(x)\,dx}\,dx + C\right]Y(x). \tag{33}$$
The $CY(x)$ term merely reproduces the solution that was already known; the missing solution is provided by the other term, $BY(x)\int Y(x)^{-2}e^{-\int a_1(x)\,dx}\,dx$. That this solution and the original solution $Y(x)$ are necessarily LI is left for Exercise 6.
Incidentally, the result (33) could also be written using definite integrals if one prefers, as
$$y(x) = \left[B\int_\alpha^x Y(\xi)^{-2}e^{-\int_\beta^\xi a_1(\eta)\,d\eta}\,d\xi + C\right]Y(x), \tag{34}$$
where the lower limits $\alpha$ and $\beta$ are arbitrary numbers, for the effect of changing $\alpha$ is simply to add some constant to the $\xi$ integral, and that constant times $B$ can be absorbed by the arbitrary constant $C$. Likewise, changing $\beta$ simply adds some constant, say $P$, to the $\eta$ integral, and the resulting $e^{-P}$ factor can be absorbed by the arbitrary constant $B$.
EXAMPLE 6. Legendre's equation. The equation
$$(1 - x^2)y'' - 2xy' + 2y = 0, \qquad (-1 < x < 1) \tag{35}$$
is known as Legendre's equation, after the French mathematician Adrien Marie Legendre (1752-1833). It is studied in Chapter 4, and used in later chapters when it arises in physical applications.
Observe that (35) admits the simple solution $Y(x) = x$. To find a second solution, and hence a general solution, we can seek $y(x) = A(x)x$ and follow the steps outlined above. Rather, let us simply use the derived result (33). First, we divide (35) by $1 - x^2$ to reduce it to the form (28), so that we can identify $a_1(x)$. Thus, with $a_1(x) = -\dfrac{2x}{1 - x^2}$ and $Y(x) = x$, (33) gives
$$y(x) = \left[B\int\frac{e^{\int 2x\,dx/(1 - x^2)}}{x^2}\,dx + C\right]x = \left[B\int\frac{dx}{x^2(1 - x^2)} + C\right]x = \left[B\left(-\frac{1}{x} + \frac{1}{2}\ln\frac{1 + x}{1 - x}\right) + C\right]x,$$
or, equivalently,
$$y(x) = C_1x + C_2\left(1 - \frac{x}{2}\ln\frac{1 + x}{1 - x}\right). \;\blacksquare$$
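A quick numerical sanity check of the second solution in Example 6 is sketched below. The finite-difference residual is only an approximation, and the function names and sample points are arbitrary choices for this illustration.

```python
import math

def y1(x):
    """Known solution Y(x) = x of Legendre's equation (35)."""
    return x

def y2(x):
    """Second solution 1 - (x/2) ln((1 + x)/(1 - x)), valid for -1 < x < 1."""
    return 1.0 - 0.5 * x * math.log((1.0 + x) / (1.0 - x))

def legendre_residual(y, x, h=1e-4):
    """Central-difference estimate of (1 - x^2) y'' - 2 x y' + 2 y at x."""
    d1 = (y(x + h) - y(x - h)) / (2.0 * h)
    d2 = (y(x + h) - 2.0 * y(x) + y(x - h)) / (h * h)
    return (1.0 - x * x) * d2 - 2.0 * x * d1 + 2.0 * y(x)

r1 = legendre_residual(y1, 0.3)
r2a = legendre_residual(y2, 0.3)
r2b = legendre_residual(y2, -0.6)
print(r1, r2a, r2b)
```

Both residual estimates for `y2` should be near zero, and since `y2(0)` equals 1 while `y1(0)` equals 0, neither function is a constant multiple of the other.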
In this example we were able to evaluate the integrals that occurred. In other
cases we may not be able to, even with the help of computer software, and may
therefore need to leave the answer in integral form.
3.6.3. Factoring the operator. (Optional) We have been considering the nth-order linear homogeneous equation
$$L[y] = \left[\frac{d^n}{dx^n} + a_1(x)\frac{d^{n-1}}{dx^{n-1}} + \cdots + a_n(x)\right]y = 0,$$
or
$$\left(D^n + a_1D^{n-1} + \cdots + a_n\right)y = 0, \tag{36}$$
where $D = \dfrac{d}{dx}$, $D^2 = DD = \dfrac{d}{dx}\dfrac{d}{dx} = \dfrac{d^2}{dx^2}$, and so on.
Suppose, first, that (36) is of constant-coefficient type (i.e., each $a_j$ is a constant), and that the characteristic polynomial $\lambda^n + a_1\lambda^{n-1} + \cdots + a_n$ can be factored as $(\lambda - \lambda_1)(\lambda - \lambda_2)\cdots(\lambda - \lambda_n)$, where one or more of the roots $\lambda_j$ may be repeated. Then the differential operator $L = D^n + a_1D^{n-1} + \cdots + a_n$ can be factored in precisely the same way, as $(D - \lambda_1)(D - \lambda_2)\cdots(D - \lambda_n)$, where we understand $(D - \lambda_1)(D - \lambda_2)\cdots(D - \lambda_n)y$ to mean that first $y$ is operated on by $D - \lambda_n$, then the result of that step is operated on by $D - \lambda_{n-1}$, and so on. That is, we begin at the right and proceed to the left. Further, it is readily verified that the sequential order of the $D - \lambda_j$ factors is immaterial, that is, they commute. If $n = 2$, for instance,
$$(D - \lambda_1)(D - \lambda_2)y = (D - \lambda_1)(y' - \lambda_2y) = D(y' - \lambda_2y) - \lambda_1(y' - \lambda_2y) = y'' - (\lambda_1 + \lambda_2)y' + \lambda_1\lambda_2y$$
and
$$(D - \lambda_2)(D - \lambda_1)y = (D - \lambda_2)(y' - \lambda_1y) = D(y' - \lambda_1y) - \lambda_2(y' - \lambda_1y) = y'' - (\lambda_2 + \lambda_1)y' + \lambda_2\lambda_1y$$
are the same.
By factoring $L$, we are able to reduce the solution of
$$(D - \lambda_1)(D - \lambda_2)\cdots(D - \lambda_n)y = 0 \tag{37}$$
to the solution of a sequence of $n$ first-order equations, each of which is of the form
$$y' - py = q \quad\text{or}\quad (D - p)y = q, \tag{38}$$
where $p$ is a constant and $q(x)$ is known. From Section 2.2, we know that the solution of (38) is
$$y(x) = e^{px}\left(\int e^{-px}q(x)\,dx + A\right), \tag{39}$$
where $A$ is an arbitrary constant.
Let us illustrate with an example.
EXAMPLE 7. The equation
$$y''' - 3y'' + 4y = 0 \tag{40}$$
admits the characteristic roots $\lambda = -1, 2, 2$, so we can factor (40) as
$$(D+1)(D-2)(D-2)\,y = 0. \tag{41}$$
We begin the solution procedure by setting
$$(D-2)(D-2)\,y = u, \tag{42}$$
so that (41) becomes $(D+1)u = 0$, with the solution $u(x) = Ae^{-x}$. Putting the latter into (42) gives
$$(D-2)(D-2)\,y = Ae^{-x}, \tag{43}$$
in which we set
$$(D-2)\,y = v. \tag{44}$$
Then (43) becomes $(D-2)v = Ae^{-x}$, with the solution
$$v(x) = e^{2x}\left(\int e^{-2x}Ae^{-x}\,dx + B\right) = -\frac{A}{3}e^{-x} + Be^{2x}.$$
Finally, putting the latter into (44) gives
$$(D-2)\,y = -\frac{A}{3}e^{-x} + Be^{2x},$$
with the solution
$$y(x) = e^{2x}\left[\int e^{-2x}\left(-\frac{A}{3}e^{-x} + Be^{2x}\right)dx + C\right]$$
or, equivalently,
$$y(x) = C_1e^{-x} + (C_2 + C_3x)\,e^{2x},$$
which is the same solution as obtained by methods discussed in earlier sections. Notice, in particular, that the presence of the repeated root $\lambda = 2, 2$ presented no additional difficulty. ■
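The right-to-left action of the factors can be mimicked in a few lines of Python. This is only an illustrative sketch: it represents a function as a dictionary that maps (k, r) to the coefficient of x^k e^{rx}, a representation chosen here for convenience rather than taken from the text, and the helper names `apply_factor` and `apply_factors` are hypothetical. It checks, for sample values of the constants, that (D + 1)(D - 2)(D - 2) annihilates C1 e^{-x} + (C2 + C3 x) e^{2x}, and that reordering the factors gives the same result.

```python
from collections import defaultdict

def apply_factor(f, lam):
    """Apply (D - lam) to f, where f maps (k, r) to the coefficient of x**k * exp(r*x)."""
    out = defaultdict(float)
    for (k, r), c in f.items():
        # d/dx [x^k e^{rx}] = k x^{k-1} e^{rx} + r x^k e^{rx}
        if k > 0:
            out[(k - 1, r)] += c * k
        out[(k, r)] += c * (r - lam)
    # Drop terms whose coefficient cancelled to zero.
    return {key: val for key, val in out.items() if abs(val) > 1e-12}

def apply_factors(f, lams):
    """Apply (D - lams[0])(D - lams[1])...(D - lams[-1]) to f; the rightmost factor acts first."""
    for lam in reversed(lams):
        f = apply_factor(f, lam)
    return f

# y = C1 e^{-x} + (C2 + C3 x) e^{2x} with sample constants 1.5, -2.0, 0.75
y = {(0, -1): 1.5, (0, 2): -2.0, (1, 2): 0.75}
residual = apply_factors(y, [-1, 2, 2])    # (D + 1)(D - 2)(D - 2) y
reordered = apply_factors(y, [2, -1, 2])   # same factors, different order
```

Both `residual` and `reordered` come out as empty dictionaries, that is, the zero function. By contrast, a term such as x^2 e^{2x}, which is not in the solution family, is not annihilated.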
Although the factorization method reduces an nth-order equation to a sequence
of n first-order equations, it is quite different from the method of reduction of order
described above in Section 3.6.2.
Thus far we have limited our discussion of factorization to the constant-coefficient
case. The nonconstant-coefficient case is more difficult. To appreciate the difficulty, consider the equation
$$y'' - x^2y = (D^2 - x^2)\,y = 0. \tag{45}$$
If we can factor $D^2 - x^2$ as $(D+x)(D-x)$, then we can solve (45) by the method outlined above. However,
$$\begin{aligned}
(\underline{D}+x)(D-\underline{x})\,y &= (D+x)(y' - xy) = D(y' - xy) + x(y' - xy) \\
&= y'' - xy' - y + xy' - x^2y \\
&= y'' - (x^2+1)\,y,
\end{aligned} \tag{46}$$
so $(D+x)(D-x) = D^2 - (x^2+1)$ is not the same as $D^2 - x^2$. The problem is that the differential operator on the left-hand side of (46) acts not only on $y$ but also on itself, in the sense that an additional term is contributed to the final result, namely $-y$, through the action of the underlined $D$ on the underlined $x$. Observe, further, that $D+x$ and $D-x$ do not commute since $(D+x)(D-x) = D^2 - (x^2+1)$, whereas $(D-x)(D+x) = D^2 - (x^2-1)$.
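The failure to commute is easy to observe on a concrete polynomial. The sketch below is an illustration only, with hypothetical helper names; it stores a polynomial as a coefficient list [c0, c1, c2, ...], applies D + x and D - x in both orders to y = 1 + 2x + 3x^2, and compares the two results.

```python
def deriv(p):
    """Derivative of a polynomial given as coefficients [c0, c1, c2, ...]."""
    return [k * p[k] for k in range(1, len(p))] or [0]

def times_x(p):
    """Multiply a polynomial by x."""
    return [0] + list(p)

def combine(p, q, sign):
    """Return p + sign*q, padding the shorter coefficient list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [a + sign * b for a, b in zip(p, q)]

def d_plus_x(p):
    return combine(deriv(p), times_x(p), +1)   # (D + x) p

def d_minus_x(p):
    return combine(deriv(p), times_x(p), -1)   # (D - x) p

y = [1, 2, 3]                                  # y = 1 + 2x + 3x^2
left = d_plus_x(d_minus_x(y))                  # (D + x)(D - x) y = y'' - (x^2 + 1) y
right = d_minus_x(d_plus_x(y))                 # (D - x)(D + x) y = y'' - (x^2 - 1) y
difference = combine(left, right, -1)          # left - right
```

Here `difference` equals the coefficients of -2y (padded with zeros), in agreement with D^2 - (x^2 + 1) minus D^2 - (x^2 - 1) being the operator -2.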
Thus, the following practical question arises: given a nonconstant-coefficient operator, can it be factored and, if so, how?
Limiting our attention to equations of second order (which, arguably, is the most important case in applications), suppose that $a_1(x)$ and $a_2(x)$ are given, and that we seek $a(x)$ and $b(x)$ so that
$$y'' + a_1(x)y' + a_2(x)y = [D - a(x)][D - b(x)]\,y. \tag{47}$$
Writing out the right-hand side,
$$y'' + a_1y' + a_2y = (D-a)(y' - by) = y'' - (a+b)y' + (ab - b')\,y. \tag{48}$$
Since this equation needs to hold for all (twice-differentiable) functions $y$, $a$ and $b$ must satisfy the conditions (Exercise 13)
$$a + b = -a_1, \tag{49a}$$
$$ab - b' = a_2, \tag{49b}$$
or, isolating $a$ and $b$ (Exercise 14),
$$a' = a^2 + a_1a + (a_2 - a_1'), \tag{50a}$$
$$b' = -b^2 - a_1b - a_2. \tag{50b}$$
Each of these equations is a special case of the nonlinear Riccati equation
$$y' = p(x)y^2 + q(x)y + r(x), \tag{51}$$
which was discussed in Exercise 11 of Section 2.2.
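For reference, the elimination that leads from (49a,b) to the two Riccati-type equations can be sketched as follows. It uses only (49a,b) and the added assumption that $a_1$ is differentiable.

```latex
% From (49a): b = -a_1 - a, hence b' = -a_1' - a'. Put this into (49b):
% From (49a): a = -a_1 - b. Put this into (49b):
\begin{align*}
a(-a_1 - a) - (-a_1' - a') &= a_2
  \quad\Longrightarrow\quad a' = a^2 + a_1 a + (a_2 - a_1'), \\
(-a_1 - b)\,b - b' &= a_2
  \quad\Longrightarrow\quad b' = -b^2 - a_1 b - a_2 .
\end{align*}
```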
Thus, from a global point of view, it is interesting to observe that the class of second-order equations with nonconstant coefficients is, in a sense, equivalent in difficulty to the class of nonlinear first-order Riccati equations. We saw, in Exercise 11 of Section 2.2, that in the exceptional case where a particular solution $Y(x)$ of (51) can be found, perhaps by inspection, the nonlinear equation (51) can be converted to the linear equation
$$v' + [2p(x)Y(x) + q(x)]\,v = -p(x) \tag{52}$$
by the change of variables
$$y = Y(x) + \frac{1}{v}. \tag{53}$$
Thus, just as we are able to solve the Riccati equation only in exceptional cases,
we are able to factor second-order nonconstant coefficient equations (and solve
them readily) only in exceptional cases. In general, then, nonconstant-coefficient
differential equations are hard in that we are unable to find closed form solutions.
EXAMPLE 8. Consider the equation
$$y'' - (x^2+1)\,y = 0. \tag{54}$$
Here $a_1(x) = 0$ and $a_2(x) = -(x^2+1)$, so (50a,b) are
$$a' = a^2 - x^2 - 1, \tag{55a}$$
$$b' = -b^2 + x^2 + 1. \tag{55b}$$
In this case we are lucky enough to notice the particular solution $a(x) = -x$ of (55a). Putting this result into (49a) then gives $b(x) = x$. [Equivalently, we could have noticed the particular solution $b(x) = x$ of (55b) and then obtained $a(x) = -x$ from (49a).] Thus, we have the factorization
$$y'' - (x^2+1)\,y = (D+x)(D-x)\,y = 0. \tag{56}$$
Proceeding as outlined above, we are able (Exercise 15) to derive the general solution
$$y(x) = Ae^{x^2/2} + Be^{x^2/2}\int e^{-x^2}\,dx. \tag{57}$$
Going one step further, suppose that initial conditions $y(0) = 0$ and $y'(0) = 1$ are prescribed and that we wish to evaluate $A$ and $B$. First, we re-express (57) in the equivalent and more convenient form
$$y(x) = Ae^{x^2/2} + Be^{x^2/2}\int_0^x e^{-\xi^2}\,d\xi. \tag{58}$$
We could have used any lower integration limit, but 0 will be most convenient because the initial conditions are at $x = 0$. Then
$$y(0) = 0 = A, \qquad y'(0) = 1 = B,$$
where we have used the fundamental theorem of the calculus (Section 2.2) to differentiate the integral term. Thus, $A = 0$ and $B = 1$, so
$$y(x) = e^{x^2/2}\int_0^x e^{-\xi^2}\,d\xi \tag{59}$$
is the desired particular solution. ■
The integral in (59) is nonelementary in that it cannot be evaluated in closed form in terms of the elementary functions. But it arises often enough so that it has been used to define a new function, the so-called error function
$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-\xi^2}\,d\xi, \tag{60}$$
where the $2/\sqrt{\pi}$ is included to normalize $\operatorname{erf}(x)$ to unity as $x \to \infty$ since (as will be shown in a later chapter)
$$\int_0^\infty e^{-\xi^2}\,d\xi = \frac{\sqrt{\pi}}{2}. \tag{61}$$
The graph of $\operatorname{erf}(x)$ is shown in Fig. 1 for $x > 0$. For $x < 0$ we rely on the fact that $\operatorname{erf}(-x) = -\operatorname{erf}(x)$ (Exercise 18); for instance, $\operatorname{erf}(-\infty) = -\operatorname{erf}(\infty) = -1$.
Figure 1. The error function erf(x).
Since $e^{-x^2}$ is (to within a scale factor) a normal probability distribution, one way in which the error function arises is in the study of phenomena that are governed by normal distributions. For instance, we will encounter the error function when we study the movement of heat by the physical process of conduction.
Thus, our solution (59) can be re-expressed as $y(x) = \frac{\sqrt{\pi}}{2}\,e^{x^2/2}\operatorname{erf}(x)$. Just as we know the values of $\sin x$, its Taylor series, and its various properties, likewise we know the values of $\operatorname{erf}(x)$, its Taylor series, and its various properties, so we should feel comfortable with $\operatorname{erf}(x)$ and regard it henceforth as a known function. Though not included among the so-called "elementary functions," it is one of many "special functions" that are now available in the engineering science and mathematics literature.
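Because erf is available in most numerical libraries, the erf form of the solution is easy to check numerically. The sketch below is an illustration only: it uses `math.erf` from the Python standard library, and the helper names and the finite-difference step sizes are choices made here, not part of the text. It estimates y'' - (x^2 + 1) y by central differences and also checks the initial conditions y(0) = 0 and y'(0) = 1.

```python
import math

def y(x):
    """Solution written with the error function: (sqrt(pi)/2) e^{x^2/2} erf(x)."""
    return 0.5 * math.sqrt(math.pi) * math.exp(0.5 * x * x) * math.erf(x)

def residual(x, h=1e-4):
    """Central-difference estimate of y'' - (x^2 + 1) y at x."""
    ypp = (y(x + h) - 2.0 * y(x) + y(x - h)) / (h * h)
    return ypp - (x * x + 1.0) * y(x)

def slope_at_zero(h=1e-6):
    """Central-difference estimate of y'(0)."""
    return (y(h) - y(-h)) / (2.0 * h)
```

The residual is zero only up to the discretization and rounding error of the difference quotient, so it should be compared against a small tolerance rather than against exactly zero.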
Closure. We have seen, in this section, that nonconstant-coefficient equations can be solved in closed form only in exceptional cases. The most important of these is the case of the Cauchy-Euler equation
$$x^n\frac{d^ny}{dx^n} + c_1x^{n-1}\frac{d^{n-1}y}{dx^{n-1}} + \cdots + c_{n-1}x\frac{dy}{dx} + c_ny = 0.$$
Recall that a constant-coefficient equation necessarily admits at least one solution in the form $e^{\lambda x}$, and that in the case of a repeated root of order $k$ the solutions corresponding to that root can be found by reduction of order to be $(C_1 + C_2x + \cdots + C_kx^{k-1})\,e^{\lambda x}$. Analogously, a Cauchy-Euler equation necessarily admits at least one solution in the form $x^{\lambda}$, and in the case of a repeated root of order $k$ the solutions corresponding to that root can be found by reduction of order to be $[C_1 + C_2\ln x + \cdots + C_k(\ln x)^{k-1}]\,x^{\lambda}$.
In fact, it turns out that the connection between constant-coefficient equations and Cauchy-Euler equations is even closer than that inasmuch as any given Cauchy-Euler equation can be reduced to a constant-coefficient equation by a change of independent variable according to $x = e^t$. Discussion of that point is reserved for the exercises.
Beyond our striking success with the Cauchy-Euler equation, other successes for nonconstant-coefficient equations are few and far between. For instance, we might be able to obtain one solution by inspection and others, from it, by reduction of order. Or, we might, in exceptional cases, be successful in factoring the differential operator but, again, such successes are exceptional. Thus, other lines of approach will be needed for nonconstant-coefficient equations, and they are developed in Chapters 4 and 6.
EXERCISES
3.6
tionby seekingy(a) = x. That is, derivethesolution,rather (g) wy" + day! + 2y= 0;
dition, find the particular solution corresponding to the initial
conditions, if such conditions are given, and state the interval
(i) dary” + By =0;
y=
(j) 22y" + ay’ + 4y =0
(a)cy’ +y=0
(k) ay! + 2ry' —2y = 0;
HINT: Letz+2=
(I) (@+ 2)*y"—-y=0
of validity of thatsolution.
(b)zy’-y=0;
(c) ay” + y' = 0
(d)ay” ~dy'=0;
yG
—2y'=0;
(m)ay” —y"
y(0) _
=0;
y(2)=5
y(1)=0,
(n) ny"
y/(1)=3
(e)a*y"
+ay'-Sy=0; y(2)=1, y/(2)=2
(0)2?y" + cy’ —Ky =0
(p)why+ay’—y=0
t.
132
ult -
whereD acting on y(x) meansd/dz and D acting on Y(t)
meansd/dt.
(q)ay!” +2ey!—2y = 0
(r) x?yfa +ay"—y'
(s) no yl”
+ Gay
(t) atyll!
+ bar”
(u) ety!
(vy) ay!
" “+ Tay’
-
as ys
3a7y""
a 3024!
+ Gay!"
_ Qyl!
=0
a 0;
y(1)
_ say’
_ Bay!
+ dy
=0
+y=
0
(b) The results (8.2) suggest that the formula
a*D¥y = D(D—-1)---(D-k+Y,
—_5,
yl) =9") = y"(1)=0
(8.3)
holds for all positive integers k. Prove that (8.3) is indeed
2. (a)—(v) For the corresponding problem in Exercise 1, use
computer software to obtain a general solution,
as well as a
particular solution if initial conditions are given.
3. Putting (9) into (3), show that the equation cA” + A’ = 0
results, as claimed below equation (9).
correct. HINT: Use mathematical induction. That is, assume
that (8.3) holds for any given positive & and, by differentiating
both sides with respect to x, show that
at! pkt+1y—D(D—1)---(D—k)Y,
(8.4)
4. Solve (10), and derive the general solution A(z) = Dlnz+
C' stated below equation (10).
which is the same as (8.3) but with k changed to k + 1. Thus,
5. Fill in the steps between (26) and (27).
(c)Finally,replacingeachx*D*y in (8.1)by thecorrespond-
it must be true that (8.3) holds for all positive integers k.
ing right-hand side of (8.3), state why the resulting differential
on }(t) will be of constant-coefficient type.
equation
(34)]arenecessarilyLI. You mayassumethata;(2)is continuous on the x interval of interest. HINT: Recall the fundamen- 9. (Electric potential) The electric potential ® within an antal theorem of the calculus (given in Section 2.2).
nular region such as a long metal pipe of inner radius ry and
outer
radius ro, satisfies the differential equation
7. It was stated in the Closure that any given Cauchy—Euler
equation can be reduced to a constant-coefficient equation by
the change of variables x = e'. In this exercise we ask you to
(ry <r <re)
try that idea for some specific cases; in the next exercise we ask
6. Prove that the two solutions within (33) [or, equivalently,
for a generalproofof theitalicized claim. Let y(x(t)) = Y(#),
Solve for the potential distribution ®(r) subject to these
andlet y’ andY’ denotedy/dz anddY/dt, respectively.
(a) Show that the change of variables « = e! reduces the boundary conditions:
Cauchy—Euler equation 27y’’ —zy’ — 3y =0 to the constantcoefficient equation ¥"” — 2Y’ ~ 3Y = 0. Thus, show that
1b
(1) =O, (rg)=%
(a)
Y(t) = Ae! + Be**. Since ¢ = Inz, showthaty(a) =
Agw} + Ba,
(b) Same as (a), for w*y" +ay' ~4y = 0.
(c) Sameas (a), for 2”aw + ay’ +4y =0.
(d) Same as (a), for 27y" + 3ay' + y = 0.
to Exercise
radius r, and outer radius r2, is governed by the differential
du
+e
yen
Ppn-t
ob
7. Consider
Cn—1 xD
the gen-
+
Cr)
Y
= ().
(8.1)
whereD = d/dax.Leta = e’, anddefiney(x(t)) = Y(t).
(a) Using chain differentiation,
tDy = DY,
az’D*y = D(D
du
"a—>77 +2—
dp =0.
eral Cauchy—Euler equation
(2"D"
10. (Steady-state temperature distribution) The steady-state
temperature distribution uwwithin a hollow sphere, of inner
equation
(e)Sameas(a),for x?y"”+ ry’ —9y = 0.
(f) Sameas(a),for 27y" + y = 0.
(g)Sameas(a),for 27y” + Qey’ —2y = 0.
(h)Sameas (a),for 42*y’””— y = 0.
8. First, read the introduction
(b) = (r1)=0, (ro) =o»
Solve for u(r) subjectto theseboundaryconditions:
(a)u(7;) =u,
du, ,
(b) —(r1) = 3,
dr
ulre) = Ue
u(re) = 0
d
u
uy, at" 2) =0
u(ry)
(c)
(r)==tu,
o)u
FOR OPTIONAL
SECTION 3.6.2
show that
EXERCISES
— LY,
11. Use the given solution y;(z) of the differential equation to
find the general solution by the method of reduction of order
(leaving the second solution in integral form if necessary).
a’ D*y = D(D —1)(D- 2)¥,
(8.2)
3.7. Solution of Nonhomogeneous Equation — 133
ay”
w(x) =2
+ay -y=0;
(b)ay” + ay -y=0;
(c)8xy"—azy'+y=0;
(d)(a? ~ Ly” —2y =0;
(a) ey! —Qay' + 2y = 0
yil(e)=2
(b)x*y”+ zy’ +9y = 0
(c)ay” +xy!—9y= 0
w(x) =2
(d)w?y" + Say! + 4y = 0
yi(x) = 27-1
(e)2y"+ay’—2y=0; yi(e)=a?+2
12. (a)~(e) Obtain a general solution of the corresponding
differential equation in Exercise |1, using computer software.
EXERCISES FOR OPTIONAL SECTION 3.6.3
18. From its integral definition, (60), show that erf(—z) =
~erf(x).
19. (Integral representations) The notion of an integral representation
13. State, clearly and convincingly, logic by which (49a,b)
follow from (48).
14. Fill in the steps between (49a,b) and (50a,b).
15. Provide the steps that are missing between the equation
(56) and its solution (57).
Ing =
16. If a(x) and ag(x) are constants,then the factorization
(47) should be simple. Show that the Riccati equations (50a,b)
on a and 6 do indeed give the same results for a and b as can
be obtained by more elementary means.
17. In general, the Riccati-type equations (50a,b) are hard.
However,
we should
be able
to solve
them if the given
as used in (60) to define the error
of a function,
function erf(x), might be unfamiliar to you. If so, it might
help to point out that even the elementary functions can be
introduced in that manner. For example, one can define the
logarithm In x as
[
dt
—,
1
t
(«>0)
(19.1)
from which formula the values of Ing can be derived (by
numerical integration), and its various properties derived as
well.
(a) To illustrate the latter claim, use (19.1) to derive the well
nonconstant-coefficient
equationy” + a(x)y’ + a2(x)y = 0 known property In z* = alnz of the logarithm.
is a Cauchy—Euler equation because that case is simple. Thus, (b) Likewise, use (19.1) to derive the property In(zy) =
use the method of factoring the operator for theseequations:
Inz+Iny.
case
Ly] = f(x),
(1)
nonzeroforcing function f(a).
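$$m\frac{d^2x}{dt^2} + c\frac{dx}{dt} + kx = F(t) \tag{2}$$
governed by the equations
$$L\frac{d^2i}{dt^2} + R\frac{di}{dt} + \frac{1}{C}\,i = \frac{dE(t)}{dt} \tag{3}$$
on the current $i(t)$, and
$$L\frac{d^2Q}{dt^2} + R\frac{dQ}{dt} + \frac{1}{C}\,Q = E(t) \tag{4}$$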
on the charge $Q(t)$ on the capacitor, the forcing functions are the time derivative of the applied voltage $E(t)$, and the applied voltage $E(t)$, respectively.
As one more example, we give (without derivation) the differential equation
$$EI\frac{d^4y}{dx^4} + ky = w(x) \tag{5}$$
governing the deflection $y(x)$ of a beam that rests upon an elastic foundation, under a load distribution $w(x)$ (i.e., load per unit $x$ length), as sketched in Fig. 1.
Figure 1. Beam on elastic foundation.
$E$, $I$, and $k$ are known physical constants: $E$ is the Young's modulus of the beam material, $I$ is the inertia of the beam's cross section, and $k$ is the spring stiffness per unit length (force per unit length per unit length) of the foundation. Thus, in this case the forcing function is $w(x)$, the applied load distribution. [Derivation of (5) involves the so-called Euler beam theory and is part of a first course in solid mechanics.]
3.7.1. General solution. Remember that the general solution of the homogeneous equation L[y] = 0 is a family of solutions that contains every solution of
that equation, over the interval of interest. Likewise, by the general solution of the
nonhomogeneous equation L[y] = f, we mean a family of solutions that contains
every solution of that equation, over the interval of interest.
Like virtually all of the concepts and methods developed in this chapter, the concepts that follow rest upon the assumed linearity of the differential equation (1), in particular, upon the fact that if $L$ is linear then
$$L[\alpha u(x) + \beta v(x)] = \alpha L[u(x)] + \beta L[v(x)] \tag{6}$$
for any two functions $u, v$ ($n$-times differentiable, of course, if $L$ is an $n$th-order operator) and any constants $\alpha, \beta$. Indeed, recall the analogous result for any number of functions:
$$L[\alpha_1u_1(x) + \cdots + \alpha_ku_k(x)] = \alpha_1L[u_1(x)] + \cdots + \alpha_kL[u_k(x)] \tag{7}$$
for any functions $u_1, \ldots, u_k$ and constants $\alpha_1, \ldots, \alpha_k$.
To begin, we suppose that $y_h(x)$ is a general solution of the homogeneous version of (1), $L[y] = 0$, and that $y_p(x)$ is any particular solution of (1): $L[y_p(x)] = f(x)$. That is, $y_p(x)$ is any function which, when put into the left-hand side of (1), gives $f(x)$. We will refer to $y_h(x)$ and $y_p(x)$ as homogeneous and particular solutions of (1), respectively. [Some authors write $y_c(x)$ in place of $y_h(x)$, and call it the complementary solution.]
THEOREM 3.7.1 General Solution of $L[y] = f$
If $y_h(x)$ and $y_p(x)$ are homogeneous and particular solutions of (1), respectively, on an interval $I$, then a general solution of (1), on $I$, is
$$y(x) = y_h(x) + y_p(x). \tag{8}$$
Proof: That (8) satisfies (1) follows from the linearity of (1):
$$L[y_h(x) + y_p(x)] = L[y_h(x)] + L[y_p(x)] = 0 + f(x) = f(x),$$
where the first equality follows from (6), with $\alpha = \beta = 1$, and $u, v$ equal to $y_h$ and $y_p$, respectively.
To see that it is a general solution, let $y$ be any solution of (1). Again using the linearity of $L$, we have
$$L[y - y_p] = L[y] - L[y_p] = f - f = 0,$$
so that the most general $y - y_p$ is a linear combination of a fundamental set of solutions of the homogeneous version of (1), namely $y_h$. Hence $y = y_h + y_p$ is a general solution of (1). ■
Thus, to solve the nonhomogeneous equation (1) we need to augment the homogeneous solution $y_h(x)$ by adding to it any particular solution $y_p(x)$.
Often, in applications, $f(x)$ is not a single term but a linear combination of terms: $f(x) = f_1(x) + \cdots + f_k(x)$. In the equation $L[y] = 5x^2 - 2\sin x + 6$, for instance, we can identify $f_1(x) = 5x^2$, $f_2(x) = -2\sin x$, and $f_3(x) = 6$.
THEOREM 3.7.2 General Solution of $L[y] = f_1 + \cdots + f_k$
If $y_h(x)$ is a general solution of $L[y] = 0$ on an interval $I$, and $y_{p1}(x), \ldots, y_{pk}(x)$ are particular solutions of $L[y] = f_1, \ldots, L[y] = f_k$ on $I$, respectively, then a general solution of $L[y] = f_1 + \cdots + f_k$ on $I$ is
$$y(x) = y_h(x) + y_{p1}(x) + \cdots + y_{pk}(x). \tag{9}$$
Proof: That (9) satisfies (1) follows from (7), with all the $\alpha$'s equal to 1:
$$L[y_h + y_{p1} + \cdots + y_{pk}] = L[y_h] + L[y_{p1}] + \cdots + L[y_{pk}] = f_1 + \cdots + f_k, \tag{10}$$
as was to be verified.
To see that it is a general solution of (1), let $y$ be any solution of (1). Then
$$L[y - y_{p1} - \cdots - y_{pk}] = L[y] - L[y_{p1}] - \cdots - L[y_{pk}] = f - f_1 - \cdots - f_k = 0,$$
so the most general $y - y_{p1} - \cdots - y_{pk}$ is a general solution $y_h$ of the homogeneous version of (1). ■
This result is a superposition principle. It tells us that the response $y_p$ to a superposition of inputs (the forcing functions $f_1, \ldots, f_k$) is the superposition of their individual outputs $(y_{p1}, \ldots, y_{pk})$.
The upshot is that to solve a nonhomogeneous equation we need to find both the homogeneous solution $y_h$ and a particular solution $y_p$. Having already developed methods for determining $y_h$, at least for certain types of equations, we now need to present methods for determining particular solutions $y_p$, and that is the subject of Sections 3.7.2 and 3.7.3 that follow.
3.7.2. Undetermined coefficients. The method of undetermined coefficients is a procedure for determining a particular solution to the linear equation
$$L[y] = f(x) = f_1(x) + \cdots + f_k(x), \tag{11}$$
subject to two conditions:
(i) Besides being linear, $L$ is of constant-coefficient type.
(ii) Repeated differentiation of each $f_j(x)$ term in (11) produces only a finite number of LI (linearly independent) terms.
To explain the latter condition, consider $f_j(x) = 2xe^{-x}$. The sequence consisting of this term and its successive derivatives is
$$2xe^{-x} \;\longrightarrow\; \{2xe^{-x},\; 2e^{-x} - 2xe^{-x},\; -4e^{-x} + 2xe^{-x},\; \ldots\},$$
and we can see that this sequence contains only the two LI functions $e^{-x}$ and $xe^{-x}$. Thus, $f_j(x) = 2xe^{-x}$ satisfies condition (ii).
As a second example, consider $f_j(x) = x^2$. This term generates the sequence
$$x^2 \;\longrightarrow\; \{x^2, 2x, 2, 0, 0, \ldots\},$$
which contains only the three LI functions $x^2$, $x$, and 1. Thus, $f_j(x) = x^2$ satisfies condition (ii).
The term $f_j(x) = 1/x$, however, generates the sequence
$$1/x \;\longrightarrow\; \{1/x,\; -1/x^2,\; 2/x^3,\; -6/x^4,\; \ldots\},$$
which contains an infinite number of LI terms $(1/x, 1/x^2, 1/x^3, \ldots)$. Thus, $f_j(x) = 1/x$ does not satisfy condition (ii).
If the term $f_j(x)$ does satisfy condition (ii), then we will call the finite set of LI terms generated by it, through repeated differentiation, the family generated by $f_j(x)$. (That terminology is for convenience here and is not standard.) Thus, the family generated by $2xe^{-x}$ is comprised of $e^{-x}$ and $xe^{-x}$, and the family generated by $3x^2$ is comprised of $x^2$, $x$, and 1.
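For forcing terms built from powers of x times exponentials, the family can be generated mechanically by repeated differentiation. The sketch below is an illustration under stated assumptions: a term x^k e^{rx} is stored under the key (k, r), the function names are hypothetical, and `max_derivatives` is an arbitrary cutoff chosen here. Terms such as 1/x, which fail condition (ii), lie outside this representation altogether.

```python
def differentiate(f):
    """Differentiate f, where f maps (k, r) to the coefficient of x**k * exp(r*x)."""
    out = {}
    for (k, r), c in f.items():
        if k > 0:
            out[(k - 1, r)] = out.get((k - 1, r), 0) + c * k
        if r != 0:
            out[(k, r)] = out.get((k, r), 0) + c * r
    return {key: val for key, val in out.items() if val != 0}

def family(f, max_derivatives=20):
    """Collect the distinct x**k * exp(r*x) terms met in f and its first few derivatives."""
    terms = set(f)
    for _ in range(max_derivatives):
        f = differentiate(f)
        if not f:            # the sequence has reached 0, 0, 0, ...
            break
        terms |= set(f)
    return terms

family_a = family({(1, -1): 2})   # forcing term 2x e^{-x}
family_b = family({(2, 0): 3})    # forcing term 3x^2
```

Here `family_a` contains the keys for x e^{-x} and e^{-x}, and `family_b` contains the keys for x^2, x, and 1.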
Let us now illustrate the method of undetermined coefficients.
EXAMPLE 1. Consider the differential equation
$$y'''' - y'' = 3x^2 - \sin 2x. \tag{12}$$
First, we see that $L$ is linear, with constant coefficients, so condition (i) is satisfied. Next, we identify $f_1(x)$, $f_2(x)$, and their generated sequences as
$$f_1(x) = 3x^2 \;\longrightarrow\; \{3x^2, 6x, 6, 0, 0, \ldots\}, \tag{13a}$$
$$f_2(x) = -\sin 2x \;\longrightarrow\; \{-\sin 2x, -2\cos 2x, 4\sin 2x, \ldots\}. \tag{13b}$$
Thus, $f_1$ and $f_2$ do generate the finite families
$$f_1(x) = 3x^2 \;\longrightarrow\; \{x^2, x, 1\}, \tag{14a}$$
$$f_2(x) = -\sin 2x \;\longrightarrow\; \{\sin 2x, \cos 2x\}, \tag{14b}$$
so condition (ii) is satisfied.
To find a particular solution $y_{p1}$ corresponding to $f_1$, tentatively seek it as a linear combination of the terms in (14a):
$$y_{p1}(x) = Ax^2 + Bx + C, \tag{15}$$
where $A, B, C$ are the so-called undetermined coefficients. Next, we write down the homogeneous solution of (12),
$$y_h(x) = C_1 + C_2x + C_3e^{x} + C_4e^{-x}, \tag{16}$$
and check each term in $y_{p1}$ [i.e., in (15)] for duplication with terms in $y_h$. Doing so, we find that the $Bx$ and $C$ terms in (15) duplicate (to within constant scale factors) the $C_2x$ and $C_1$ terms, respectively, in (16). The method then calls for multiplying the entire family, involved in the duplication, by the lowest positive integer power of $x$ needed to eliminate all such duplication. Thus, we revise (15) as
$$y_{p1}(x) = x(Ax^2 + Bx + C) = Ax^3 + Bx^2 + Cx, \tag{17}$$
but find that the $Cx$ term in (17) is still "in violation" in that it duplicates the $C_2x$ term in (16). Thus, try
$$y_{p1}(x) = x^2(Ax^2 + Bx + C) = Ax^4 + Bx^3 + Cx^2. \tag{18}$$
This time we are done, since all duplication has now been eliminated.
Next, we put the final revised form (18) into the equation $y'''' - y'' = 3x^2$ [i.e., $L[y] = f_1(x)$] and obtain
$$24A - 12Ax^2 - 6Bx - 2C = 3x^2. \tag{19}$$
Finally, equating coefficients of like terms gives
$$\begin{aligned}
x^2 &: & -12A &= 3 \\
x &: & -6B &= 0 \\
1 &: & 24A - 2C &= 0,
\end{aligned} \tag{20}$$
so that $A = -1/4$, $B = 0$, $C = -3$. Thus
$$y_{p1}(x) = -\frac{1}{4}x^4 - 3x^2. \tag{21}$$
Next, we need to find $y_{p2}$ corresponding to $f_2$. To do so, we seek it as a linear combination of the terms in (14b):
$$y_{p2}(x) = D\sin 2x + E\cos 2x. \tag{22}$$
Checking each term in (22) for duplication with terms in $y_h$, we see that there is no such duplication. Thus, we accept the form (22), put it into the equation $y'''' - y'' = -\sin 2x$ [i.e., $L[y] = f_2(x)$], and obtain
$$20D\sin 2x + 20E\cos 2x = -\sin 2x. \tag{23}$$
Equating coefficients of like terms gives $20D = -1$ and $20E = 0$, so that $D = -1/20$ and $E = 0$. Thus,
$$y_{p2}(x) = -\frac{1}{20}\sin 2x. \tag{24}$$
Finally, a general solution of (12) is, according to Theorem 3.7.2,
$$y(x) = y_h(x) + y_p(x) = y_h(x) + y_{p1}(x) + y_{p2}(x),$$
namely,
$$y(x) = C_1 + C_2x + C_3e^{x} + C_4e^{-x} - \frac{1}{4}x^4 - 3x^2 - \frac{1}{20}\sin 2x. \tag{25}$$
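A direct substitution check of the general solution is quick to script. In the sketch below the second and fourth derivatives are written out by hand, and the constants C1 to C4 are arbitrary sample values; both are choices made for this illustration. The residual y'''' - y'' - (3x^2 - sin 2x) should vanish up to rounding at every x.

```python
import math

C = (0.3, -1.1, 0.5, 2.0)   # sample values for C1, C2, C3, C4 (arbitrary)

def d2y(x):
    """Second derivative of C1 + C2 x + C3 e^x + C4 e^{-x} - x^4/4 - 3x^2 - sin(2x)/20."""
    _, _, c3, c4 = C
    return c3 * math.exp(x) + c4 * math.exp(-x) - 3 * x**2 - 6 + math.sin(2 * x) / 5

def d4y(x):
    """Fourth derivative of the same function."""
    _, _, c3, c4 = C
    return c3 * math.exp(x) + c4 * math.exp(-x) - 6 - 4 * math.sin(2 * x) / 5

def residual(x):
    """y'''' - y'' - (3x^2 - sin 2x); zero if the solution satisfies the equation."""
    return d4y(x) - d2y(x) - (3 * x**2 - math.sin(2 * x))
```

Note that C1 and C2 drop out of both derivatives, which reflects the fact that 1 and x are homogeneous solutions.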
COMMENT 1. We obtained (20) by "equating coefficients of like terms" in (19). That step amounted to using the concept of linear independence, namely, noting that $1, x, x^2$ are LI (on any given $x$ interval) and then using Theorem 3.2.6. Alternatively, we could have rewritten (19) as
$$(24A - 2C)1 + (-6B)x + (-12A - 3)x^2 = 0 \tag{26}$$
and used the linear independence of $1, x, x^2$ to infer that the coefficient of each term must be zero, which step once again gives (20).
COMMENT 2. The key point in the analysis is that the system (20), consisting of three linear algebraic equations in the three unknowns $A, B, C$, is consistent. Similarly for the system $20D = -1$, $20E = 0$ governing $D$ and $E$. The guarantee provided by the method
$$\begin{aligned}
x^2 &: & 0 &= 3 \\
x &: & 0 &= 0 \\
1 &: & -2A &= 0,
\end{aligned}$$
Let us summarize.
STEPS IN THE METHOD
OF UNDETERMINED
COEFFICIENTS:
Verify that condition (i) is satisfied.
Identify the f;(x)’s and verify that each one satisfies condition (ii).
I].
correspondingto f;(a) [(15)in Example 1].
ample 1].
by the lowest positive integer power of x necessary to remove all such duplication between those terms and the terms in yp, [(18) in Example 1].
side of theequationL[y] = f1, andequatecoefficientsof like terms.
coefficients. That step completes our determination of yp1(2).
Repeat steps 4-8 for $y_{p2}, \ldots, y_{pk}$.
Then the general solution of $L[y] = f_1 + \cdots + f_k$ is given, according to Theorem 3.7.2, by $y(x) = y_h(x) + y_p(x) = y_h(x) + y_{p1}(x) + \cdots + y_{pk}(x)$.
$$y'' - 9y = 4 + 5\sinh 3x, \tag{27}$$
which is indeed linear and of constant-coefficient type. Since
$$f_1(x) = 4 \;\longrightarrow\; \{4, 0, 0, \ldots\},$$
$$f_2(x) = 5\sinh 3x \;\longrightarrow\; \{5\sinh 3x,\; 15\cosh 3x,\; 45\sinh 3x,\; \ldots\},$$
we see that these terms generate the finite families
$$f_1(x) = 4 \;\longrightarrow\; \{1\},$$
$$f_2(x) = 5\sinh 3x \;\longrightarrow\; \{\sinh 3x, \cosh 3x\},$$
so we tentatively seek
$$y_{p1}(x) = A. \tag{28}$$
Since
$$y_h(x) = C_1e^{3x} + C_2e^{-3x}, \tag{29}$$
there is no duplication between the term in (28) and those in (29). Putting (28) into $y'' - 9y = 4$ gives $-9A = 4$, so $A = -4/9$. Thus,
$$y_{p1}(x) = -\frac{4}{9}. \tag{30}$$
Next, we tentatively seek
$$y_{p2}(x) = B\sinh 3x + C\cosh 3x. \tag{31}$$
At first glance, it appears that there is no duplication between any of the terms in (31) and those in (29). However, since $\sinh 3x$ and $\cosh 3x$ are linear combinations of $e^{3x}$ and $e^{-3x}$, we do indeed have duplication. Said differently, each of the $\sinh 3x$ and $\cosh 3x$ terms is a solution of the homogeneous equation. Thus, we need to multiply the right-hand side of (31) by $x$ and revise $y_{p2}$ as
$$y_{p2}(x) = x(B\sinh 3x + C\cosh 3x). \tag{32}$$
Now that $y_{p2}$ is in a satisfactory form we put that form into $y'' - 9y = 5\sinh 3x$ [i.e., $L[y] = f_2(x)$] and obtain the equation
$$(3C + 3C)\sinh 3x + (3B + 3B)\cosh 3x + (9B - 9B)\,x\sinh 3x + (9C - 9C)\,x\cosh 3x = 5\sinh 3x.$$
Equating coefficients of like terms gives $C = 5/6$ and $B = 0$, so
$$y_{p2}(x) = \frac{5}{6}\,x\cosh 3x. \tag{33}$$
It follows then, from Theorem 3.7.2, that a general solution of (27) is
$$y(x) = C_1e^{3x} + C_2e^{-3x} - \frac{4}{9} + \frac{5}{6}\,x\cosh 3x. \tag{34}$$
Naturally, one's final result can (and should) be checked by direct substitution into the original differential equation.
COMMENT. Suppose that in addition to the differential equation (27), initial conditions $y(0) = 0$, $y'(0) = 2$ are specified. Imposing these conditions on (34) gives $C_1 + C_2 = 4/9$ and $3C_1 - 3C_2 + 5/6 = 2$, so that $C_1 = 5/12$, $C_2 = 1/36$, and hence the particular solution
$$y(x) = \frac{5}{12}e^{3x} + \frac{1}{36}e^{-3x} - \frac{4}{9} + \frac{5}{6}\,x\cosh 3x. \tag{35}$$
Do not be concerned that we call (35) a particular solution even though each of the two exponential terms in (35) is a homogeneous solution because if we put (35) into the left-hand side of (27) it does give the right-hand side of (27); thus, it is a particular solution of (27). ■
As a word of caution, suppose the differential equation is
$$y'' - 3y' + 2y = 2\sinh x,$$
with homogeneous solution $C_1e^{x} + C_2e^{2x}$. Observe that $2\sinh x = e^{x} - e^{-x}$ contains an $e^{x}$ term, which corresponds to one of the homogeneous solutions. To bring this duplication into the light, we should re-express the differential equation as
$$y'' - 3y' + 2y = e^{x} - e^{-x}$$
before beginning the method of undetermined coefficients. Then, the assumed particular solution due to $f_1(x) = e^{x}$ will be $y_{p1}(x) = Axe^{x}$ and the assumed particular solution due to $f_2(x) = -e^{-x}$ will be $y_{p2}(x) = Be^{-x}$. We find that $A = -1$ and $B = -1/6$ so the general solution is
$$y(x) = C_1e^{x} + C_2e^{2x} - xe^{x} - \frac{1}{6}e^{-x}.$$
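As a brief check of the two coefficient values, the substitutions can be written out as follows. This is a sketch of the routine algebra, not a reproduction of a derivation given in the text.

```latex
\begin{align*}
y_{p1} = Axe^{x}:&\quad y_{p1}'' - 3y_{p1}' + 2y_{p1}
   = A(x+2)e^{x} - 3A(x+1)e^{x} + 2Axe^{x} = -Ae^{x} = e^{x}
   \;\Longrightarrow\; A = -1, \\
y_{p2} = Be^{-x}:&\quad y_{p2}'' - 3y_{p2}' + 2y_{p2}
   = (1 + 3 + 2)Be^{-x} = 6Be^{-x} = -e^{-x}
   \;\Longrightarrow\; B = -\tfrac{1}{6}.
\end{align*}
```

Had the trial form $Ae^{x}$ been used without the factor of $x$, the left-hand side would vanish identically and no value of $A$ could match $e^{x}$.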
Closing this discussion of the method of undetermined coefficients, let us reconsider condition (ii), that repeated differentiation of each term in the forcing function must produce only a finite number of LI terms. How broad is the class of functions that satisfy that condition? If a forcing function $f$ satisfies that condition, then it must be true that coefficients $a_j$ exist, not all of them zero, such that
$$a_0f^{(N)} + a_1f^{(N-1)} + \cdots + a_{N-1}f' + a_Nf = 0 \tag{36}$$
over the $x$ interval under consideration. From our discussion of the solution of such constant-coefficient equations we know that solutions $f$ of (36) must be of the form $Cx^{m}e^{(\alpha + i\beta)x}$ or a linear combination of such terms. Such functions are so common in applications that condition (ii) is not as restrictive as it may seem.
3.7.3. Variation of parameters. Although easy to apply, the method of undetermined coefficients is limited by the two conditions cited above: that $L$ be of constant-coefficient type, and that repeated differentiation of each $f_j(x)$ forcing term produces only a finite number of LI terms.
The method of variation of parameters, due to Lagrange, is more powerful in that it is not subject to those restrictions. As with automobile engines, we can expect more power to come at a higher price and, as we shall see, Lagrange's method is indeed the more difficult to apply.
In fact, we have already presented the method, in Section 2.2, for the general linear first-order equation
$$y' + p(x)y = q(x), \tag{37}$$
and we urge you to review that discussion. The idea was to seek a particular solution $y_p$ by varying the parameter $A$ (i.e., the constant of integration) in the homogeneous solution
$$y_h(x) = Ae^{-\int p(x)\,dx}.$$
Thus, we sought
$$y_p(x) = A(x)\,e^{-\int p(x)\,dx},$$
put that form into (37) and solved for $A(x)$.
Likewise, if an $n$th-order linear differential equation $L[y] = f$ has a homogeneous solution
$$y_h(x) = C_1y_1(x) + \cdots + C_ny_n(x), \tag{38}$$
then according to the method of variation of parameters we seek a particular solution in the form
$$y_p(x) = C_1(x)y_1(x) + \cdots + C_n(x)y_n(x); \tag{39}$$
that is, we "vary the parameters" (constants of integration) $C_1, \ldots, C_n$ in (38).
Let us carry out the procedure for the linear second-order equation
$$L[y] = y'' + p_1(x)y' + p_2(x)y = f(x), \tag{40}$$
where the coefficients $p_1(x)$ and $p_2(x)$ are assumed to be continuous on the $x$ interval of interest, say $I$. We suppose that
$$y_h(x) = C_1y_1(x) + C_2y_2(x) \tag{41}$$
is a known general solution of the homogeneous equation on $I$, and seek
$$y_p(x) = C_1(x)y_1(x) + C_2(x)y_2(x). \tag{42}$$
Needing $y_p'$ and $y_p''$ to substitute into (40), we differentiate (42):
$$y_p' = C_1y_1' + C_2y_2' + C_1'y_1 + C_2'y_2. \tag{43}$$
Looking ahead, $y_p''$ will include $C_1, C_2, C_1', C_2', C_1'', C_2''$ terms, so that (40) will become a nonhomogeneous second-order differential equation in $C_1$ and $C_2$, which can hardly be expected to be simpler than the original equation (40)! However, it will be only one equation in the two unknowns $C_1, C_2$, so we are free to impose another condition on $C_1, C_2$ to complete, and simplify, the system.
An especially convenient condition to impose will be
$$C_1'y_1 + C_2'y_2 = 0, \tag{44}$$
for this condition will knock out the $C_1', C_2'$ terms in (43), so that $y_p''$ will contain only first-order derivatives of $C_1$ and $C_2$. Then (43) reduces to $y_p' = C_1y_1' + C_2y_2'$, so
$$y_p'' = C_1y_1'' + C_2y_2'' + C_1'y_1' + C_2'y_2', \tag{45}$$
and (40) becomes
$$C_1(y_1'' + p_1y_1' + p_2y_1) + C_2(y_2'' + p_1y_2' + p_2y_2) + C_1'y_1' + C_2'y_2' = f. \tag{46}$$
The two parenthetic groups vanish by virtue of $y_1$ and $y_2$ being solutions of the homogeneous equation $L[y] = 0$, so (46) simplifies to $C_1'y_1' + C_2'y_2' = f$. That result, together with (44), gives us the equations
$$\begin{aligned}
y_1C_1' + y_2C_2' &= 0, \\
y_1'C_1' + y_2'C_2' &= f
\end{aligned} \tag{47}$$
on $C_1', C_2'$. The latter will be solvable for $C_1', C_2'$, uniquely, if the determinant of the coefficients does not vanish on $I$. In fact, we recognize that determinant as the Wronskian of $y_1$ and $y_2$,
$$W[y_1, y_2](x) = \begin{vmatrix} y_1(x) & y_2(x) \\ y_1'(x) & y_2'(x) \end{vmatrix},$$
and the latter is necessarily nonzero on $I$ by Theorem 3.2.3 because $y_1$ and $y_2$ are LI solutions of $L[y] = 0$.
Solving (47) by Cramer's rule gives
$$C_1'(x) = \frac{\begin{vmatrix} 0 & y_2 \\ f & y_2' \end{vmatrix}}{\begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix}} = \frac{W_1(x)}{W(x)}, \qquad
C_2'(x) = \frac{\begin{vmatrix} y_1 & 0 \\ y_1' & f \end{vmatrix}}{\begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix}} = \frac{W_2(x)}{W(x)}, \tag{48}$$
where $W_1, W_2$ simply denote the determinants in the numerators. Integrating these equations and putting the results into (42) gives
$$y_p(x) = \left[\int^x \frac{W_1(\xi)}{W(\xi)}\,d\xi\right]y_1(x) + \left[\int^x \frac{W_2(\xi)}{W(\xi)}\,d\xi\right]y_2(x), \tag{49}$$
or, more compactly,
$$y_p(x) = \int^x \frac{W_1(\xi)\,y_1(x) + W_2(\xi)\,y_2(x)}{W(\xi)}\,d\xi. \tag{50}$$
EXAMPLE 3. To solve
$$y'' - 4y = 8e^{2x}, \tag{51}$$
yn(x) = Ce?
(52)
+ Cye~**
of the homogeneousequation,so that we may takey;(xz) = e?*,y2(x) = e~?*. Then
W(x) = yy —yiye = —4,Wi(r) = —f(x)yo(x) = —8e?®e72*
= —8,andWo(x) =
f(x)yi (x) = 8e?*e?*= 2e4*,so(49)gives
Yp(x) = (f°
2at
er? 4 (f°
= (2x+ A)e?®
+ (->
4a
—2e46 as) ent
+)
2x
e2* = Qre?*—> + Ae?*+ Be~?*,(53)
where A, B are the arbitrary constants of integration. We can omit the A, B terms in (53)
because they give terms (Ae?* and Be~?*) that merely duplicate those already present in
the homogeneous solution y;,. That will always be the case: we can omit the constants
of integration in the two integrals in (49). In the present example we can even drop the
—e? /2 term in the right side of (53) since it too is a homogeneous solution [and can be
absorbed in the Ce?" term in (55)]. Thus, we write
Yp(x) = Qre7*,
(54)
y(x) = yr(z) + yp(x) = Cye?* + Coe ?* + 2re7*
(55)
Finally,
gives a general solution of (51). @
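When the integrals in (49) cannot be done by hand, they can be evaluated numerically. The sketch below is one possible implementation, not the author's: it uses a composite Simpson rule written out here, a lower limit of integration x0 = 0 chosen for convenience, and hypothetical function names. It is applied to the data of Example 3 so that the result can be compared with the closed form.

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson's rule for the integral of g from a to b (n must be even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def particular_solution(x, f, y1, y2, dy1, dy2, x0=0.0):
    """Evaluate formula (49) numerically, integrating from x0 to x."""
    W = lambda s: y1(s) * dy2(s) - dy1(s) * y2(s)          # Wronskian
    c1 = simpson(lambda s: -f(s) * y2(s) / W(s), x0, x)    # integral of W1/W
    c2 = simpson(lambda s: f(s) * y1(s) / W(s), x0, x)     # integral of W2/W
    return c1 * y1(x) + c2 * y2(x)

# Data of Example 3: y'' - 4y = 8 e^{2x}, with y1 = e^{2x}, y2 = e^{-2x}
f = lambda s: 8 * math.exp(2 * s)
y1 = lambda s: math.exp(2 * s)
y2 = lambda s: math.exp(-2 * s)
dy1 = lambda s: 2 * math.exp(2 * s)
dy2 = lambda s: -2 * math.exp(-2 * s)

x = 0.8
numeric = particular_solution(x, f, y1, y2, dy1, dy2)
# With lower limit 0 the two integrals are 2x and -(e^{4x} - 1)/2, so
exact = 2 * x * math.exp(2 * x) - math.exp(2 * x) / 2 + math.exp(-2 * x) / 2
```

The value differs from 2x e^{2x} only by multiples of e^{2x} and e^{-2x}, which are homogeneous solutions; a different lower limit x0 would change only those multiples.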
3.7.4. Variation of parameters for higher-order equations. (Optional) For
higher-order equations the idea is essentially the same. For the third-order equation
Ly} = y""+pi(x)y” +po(x)y'+ ps(2)y= f(x),
(56)
yr(x) = Cryi(x) + Coye(x) + Cay3(x)
(57)
for instance, if
is known, then we seek
Yp(x) = Ci(a)yi (a) + Co(x)yo(a) + C3(a) y3(x).
(58)
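Looking ahead, when we put (58) into (56) we will have one equation in the three unknown functions $C_1$, $C_2$, $C_3$, so we can impose two additional conditions on them. Differentiating (58) gives
\[ y_p' = C_1 y_1' + C_2 y_2' + C_3 y_3' + C_1' y_1 + C_2' y_2 + C_3' y_3, \tag{59} \]
so we set
\[ C_1' y_1 + C_2' y_2 + C_3' y_3 = 0 \tag{60} \]
to suppress higher-order derivatives of $C_1$, $C_2$, $C_3$. Then (59) reduces to
\[ y_p' = C_1 y_1' + C_2 y_2' + C_3 y_3', \tag{61} \]
and another differentiation gives
\[ y_p'' = C_1 y_1'' + C_2 y_2'' + C_3 y_3'' + C_1' y_1' + C_2' y_2' + C_3' y_3'. \tag{62} \]
Again, to suppress higher-order derivatives of $C_1$, $C_2$, $C_3$, set
\[ C_1' y_1' + C_2' y_2' + C_3' y_3' = 0. \tag{63} \]
Then (62) reduces to
\[ y_p'' = C_1 y_1'' + C_2 y_2'' + C_3 y_3'', \tag{64} \]
so
\[ y_p''' = C_1 y_1''' + C_2 y_2''' + C_3 y_3''' + C_1' y_1'' + C_2' y_2'' + C_3' y_3''. \tag{65} \]
Finally, putting (65), (64), (61), and (58) into (56) gives
\[ C_1\left(y_1''' + p_1 y_1'' + p_2 y_1' + p_3 y_1\right) + C_2\left(y_2''' + p_1 y_2'' + p_2 y_2' + p_3 y_2\right) + C_3\left(y_3''' + p_1 y_3'' + p_2 y_3' + p_3 y_3\right) + C_1' y_1'' + C_2' y_2'' + C_3' y_3'' = f, \tag{66} \]
or
\[ C_1' y_1'' + C_2' y_2'' + C_3' y_3'' = f \tag{67} \]
since each of the three parenthetic groups in (66) vanishes because $y_1$, $y_2$, $y_3$ are homogeneous solutions.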
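Equations (60), (63), and (67) are three linear algebraic equations in $C_1'$, $C_2'$, $C_3'$:
\[ \begin{aligned} y_1 C_1' + y_2 C_2' + y_3 C_3' &= 0, \\ y_1' C_1' + y_2' C_2' + y_3' C_3' &= 0, \\ y_1'' C_1' + y_2'' C_2' + y_3'' C_3' &= f. \end{aligned} \tag{68} \]
The determinant of the coefficients is the Wronskian of $y_1$, $y_2$, $y_3$,
\[ W[y_1,y_2,y_3](x) = \begin{vmatrix} y_1 & y_2 & y_3 \\ y_1' & y_2' & y_3' \\ y_1'' & y_2'' & y_3'' \end{vmatrix}, \tag{69} \]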
which is nonzero on the interval I because yj, y2, y3 are LI on I by the assumption
that (57) is a general solution of the homogeneous equation L[y] = 0. Solving (68)
by means of Cramer’s rule gives
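\[ C_1' = \frac{1}{W(x)}\begin{vmatrix} 0 & y_2 & y_3 \\ 0 & y_2' & y_3' \\ f & y_2'' & y_3'' \end{vmatrix} = \frac{W_1(x)}{W(x)}, \quad C_2' = \frac{1}{W(x)}\begin{vmatrix} y_1 & 0 & y_3 \\ y_1' & 0 & y_3' \\ y_1'' & f & y_3'' \end{vmatrix} = \frac{W_2(x)}{W(x)}, \quad C_3' = \frac{1}{W(x)}\begin{vmatrix} y_1 & y_2 & 0 \\ y_1' & y_2' & 0 \\ y_1'' & y_2'' & f \end{vmatrix} = \frac{W_3(x)}{W(x)}. \tag{70} \]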
Finally, integrating these equations and putting the results into (58) gives the
particular solution
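\[ y_p(x) = \left[\int^x \frac{W_1(\xi)}{W(\xi)}\,d\xi\right] y_1(x) + \left[\int^x \frac{W_2(\xi)}{W(\xi)}\,d\xi\right] y_2(x) + \left[\int^x \frac{W_3(\xi)}{W(\xi)}\,d\xi\right] y_3(x). \tag{71} \]
EXAMPLE 4. Consider the nonhomogeneous Cauchy-Euler equation
\[ x^3 y''' + x^2 y'' - 2xy' + 2y = \frac{2}{x} \qquad (0 < x < \infty). \tag{72} \]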
Observe that we cannot use the method of undetermined coefficients in this case because
the differential operator is not of constant-coefficient type, and also because the forcing
function does not generate only a finite number of linearly-independent derivatives.
To use (71), we need to know $y_1$, $y_2$, $y_3$, and $f$. It is readily found that the homogeneous solution is
\[ y_h(x) = C_1\frac{1}{x} + C_2 x + C_3 x^2, \tag{73} \]
so we can take $y_1 = 1/x$, $y_2 = x$, $y_3 = x^2$. But be careful: $f(x)$ is not $2/x$ because (72) is not yet in the form of (56). That is, (72) must be divided by $x^3$ so that the coefficient of $y'''$ becomes 1, as in (56). Doing so, it follows that $f(x) = 2/x^4$.
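We can now evaluate the determinants needed in (71):
\[ \begin{aligned} W(x) &= \begin{vmatrix} x^{-1} & x & x^2 \\ -x^{-2} & 1 & 2x \\ 2x^{-3} & 0 & 2 \end{vmatrix} = \frac{6}{x}, & W_1(x) &= \begin{vmatrix} 0 & x & x^2 \\ 0 & 1 & 2x \\ 2x^{-4} & 0 & 2 \end{vmatrix} = \frac{2}{x^2}, \\ W_2(x) &= \begin{vmatrix} x^{-1} & 0 & x^2 \\ -x^{-2} & 0 & 2x \\ 2x^{-3} & 2x^{-4} & 2 \end{vmatrix} = -\frac{6}{x^4}, & W_3(x) &= \begin{vmatrix} x^{-1} & x & 0 \\ -x^{-2} & 1 & 0 \\ 2x^{-3} & 0 & 2x^{-4} \end{vmatrix} = \frac{4}{x^5}, \end{aligned} \tag{74} \]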
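so (71) gives
\[ y_p(x) = \left(\int^x \frac{1}{3\xi}\,d\xi\right)\frac{1}{x} + \left(\int^x -\frac{1}{\xi^3}\,d\xi\right) x + \left(\int^x \frac{2}{3\xi^4}\,d\xi\right) x^2 = \frac{\ln x}{3x} + \frac{1}{2x} - \frac{2}{9x} = \frac{\ln x}{3x} + \frac{5}{18x}. \tag{75} \]
The $5/(18x)$ term can be dropped because it is a homogeneous solution, so
\[ y_p(x) = \frac{\ln x}{3x}, \tag{76} \]
as can be verified by direct substitution into (56). ■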
Generalization of the method to the nth-order equation
\[ y^{(n)} + p_1(x)y^{(n-1)} + \cdots + p_{n-1}(x)y' + p_n(x)y = f(x) \tag{77} \]
is straightforward and the result is
\[ y_p(x) = \left[\int^x \frac{W_1(\xi)}{W(\xi)}\,d\xi\right] y_1(x) + \cdots + \left[\int^x \frac{W_n(\xi)}{W(\xi)}\,d\xi\right] y_n(x), \tag{78} \]
where the $y_j$'s are $n$ LI homogeneous solutions, $W$ is the Wronskian of $y_1, \ldots, y_n$, and $W_j$ is identical to $W$, but with the $j$th column replaced by a column of zeros except for the bottom element, which is $f$.
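Formula (78) lends itself to a direct numerical sketch. The code below is one possible illustration, applied to the data of Example 4; it is not the text's procedure. It builds $W$ and $W_j$ by cofactor expansion, integrates $W_j/W$ with a Simpson rule from an assumed lower limit a = 1 (any positive value would do, changing the result only by homogeneous terms), and forms the sum in (78). The names `det`, `simpson`, `wronskian_matrix`, `c_prime`, and `yp` are hypothetical helpers.

```python
import math

def det(m):
    """Determinant by cofactor expansion along the first row (fine for small n)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def simpson(g, a, b, n=200):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def wronskian_matrix(x):
    """Rows hold y_j, y_j', y_j'' for y1 = 1/x, y2 = x, y3 = x^2 (Example 4)."""
    return [[1 / x, x, x * x],
            [-1 / x**2, 1.0, 2 * x],
            [2 / x**3, 0.0, 2.0]]

def f(x):
    """Forcing function after dividing (72) by x^3."""
    return 2 / x**4

def c_prime(j, x):
    """C_j'(x) = W_j(x)/W(x); W_j is W with column j replaced by (0, ..., 0, f)."""
    m = wronskian_matrix(x)
    n = len(m)
    mj = [row[:] for row in m]
    for i in range(n):
        mj[i][j] = f(x) if i == n - 1 else 0.0
    return det(mj) / det(m)

def yp(x, a=1.0):
    """Formula (78) with definite integrals from the assumed lower limit a."""
    ys = wronskian_matrix(x)[0]
    return sum(simpson(lambda s, j=j: c_prime(j, s), a, x) * ys[j]
               for j in range(len(ys)))

if __name__ == "__main__":
    x = 2.0
    # With a = 1 the integrals give ln(x)/(3x) + 5/(18x) - x/2 + 2x^2/9.
    closed = math.log(x) / (3 * x) + 5 / (18 * x) - x / 2 + 2 * x * x / 9
    print(yp(x), closed)
```

Cofactor expansion is used only because the matrices are tiny; for larger n an elimination-based determinant would be the more sensible choice.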
Closure. In this section we have discussed the nonhomogeneous equation $L[y] = f$, where $L$ is an nth-order linear differential operator.
In Section 3.7.1 we provided the theoretical framework, which was based upon the linearity of $L$. Specifically, Theorem 3.7.1 showed that the general solution of $L[y] = f$ can be formed as the sum of a homogeneous solution $y_h(x)$ and a particular solution $y_p(x)$; $y_h(x)$ is a general solution of $L[y] = 0$, so it contains the $n$ constants of integration, and $y_p(x)$ is any particular solution of the full equation $L[y] = f$. Further, we showed that if $f$ is broken down as $f = f_1 + \cdots + f_k$, then a particular solution is given by the sum of particular solutions $y_{p1}, \ldots, y_{pk}$ corresponding to $f_1, \ldots, f_k$, respectively.
EXERCISES
3.7
1. Show whether or not the given forcing function satisfies condition (ii), below equation (11). If so, give a finite family of LI functions generated by it.
(a) 2? cosx
(c)lnz
(b) cos z sinh 2z
(d)2? Ine
(f) =
(e)sina/a
(h)(x —1)/(x+ 2)
(g)e®*
(i) tan x
(j) e®cos 3x
(k) we7* sinh x
(1)cos x cos 22
(m) sin zsin 2z sin 3x
(n) e*/(x +1)
2. Obtain a general solution using the method of undetermined
coefficients.
4, Obtain a general solution using the method of variation of
parameters.
(a)y!+2y= 4e?*
(b)y’-y=
ae? +1
(c)cy’ -y= 23
(d)ay’+y=I1/e
(e) °y' +a27y=1
(x>0)
(x >0)
(hy —y= 8«
(g)y” —y = 8e*
(h)y"”~2y'+y = 62?
(i) y —2Qy’+ y = Qe
(Dy +y =4sing
(k)y"”+4y!+4y= 207?"
(1)6y” — 5y’ +y = 2?
(x >0)
(n)27y" —vy! —3y = 4a (a <0)
(m)
xy"
+
zy!
Co)y"+y"
(p) yl!
(g)y" —y!= 5sin2x
(h) y” +y! = 4ve* + 3sinaz
_ by!"
-y'
+
_
Any
=]
-y=x
ly’
_ 6y
= eit
(i) y” + y = 3sin 22 — 5 + 2a?
5. (a)—(p) Use computer software to solve the corresponding
problem in Exercise 4.
(k) y" +y = 6cosz +2
6. In the method of variation of parameters we used indefinite
(m)y” —2y' + y = ae
integrals in those formulas, instead, if we choose. Specifically,
Q@y"+y' -%&=23-e*
(1) y"" 4
2y!
—_ ae + 4e2t
(n) y” —4y = 5(cosh2x ~ x)
(0)y"
_ y! = 2ret
(p) y/” ~ y! = 25cos 2¢
(q) yy!"~ y” = 6a + 2coshz
+ y” —Qy = 327 ~1
(ry
(s)yy" —y = 5(a+ cosz)
3. (a)—(s) Use computer software to solve the corresponding
problem in Exercise 2.
integrals [in (49) and (78)]. However, we could use definite
show that, in place. of (49),
* Wa(€)
ale)=i We |ne)+|a, W(E) as| yo(x)
is also correct, for any choice of
the constants a1, @2 (although
normally one would choose a, and az to be the same).
work,
Cry + Coy2 = 6,
3.8 Application to Harmonic Oscillator: Forced Oscillation
The free oscillation of the harmonic oscillator (Fig. 1) was studied in Section 3.5. Now that we know how to find particular solutions, we can return to the harmonic oscillator and consider the case of forced oscillations, governed by the second-order, linear, constant-coefficient, nonhomogeneous equation
\[ mx'' + cx' + kx = f(t). \tag{1} \]
Figure 1. Mechanical oscillator.
In particular, we consider the important case where the forcing function is harmonic,
\[ f(t) = F_0\cos\Omega t. \tag{2} \]
3.8.1. Undamped case. To begin, consider the undamped case ($c = 0$),
\[ mx'' + kx = F_0\cos\Omega t. \tag{3} \]
The homogeneous solution of (3) is
\[ x_h(t) = A\cos\omega t + B\sin\omega t, \tag{4} \]
where $\omega = \sqrt{k/m}$ is the natural frequency (i.e., the frequency of the free oscillation), and the forcing function $F_0\cos\Omega t$ generates the family $\{\cos\Omega t, \sin\Omega t\}$. Thus, to find a particular solution of (3) by the method of undetermined coefficients, seek
\[ x_p(t) = C\cos\Omega t + D\sin\Omega t. \tag{5} \]
Two cases present themselves. In the generic case, the driving frequency $\Omega$ is different from the natural frequency $\omega$, so the terms in (5) do not duplicate any of those in (4) and we can accept (5) without modification. In the exceptional, or "singular," case where $\Omega$ is equal to $\omega$, the terms in (5) repeat those in (4), so we need to modify (5) by multiplying the right side of (5) by $t$. For reasons that will become clear below, these cases are known as nonresonance and resonance, respectively.
Nonresonant oscillation. Putting (5) into the left side of (3) gives
\[ (\omega^2 - \Omega^2)\,C\cos\Omega t + (\omega^2 - \Omega^2)\,D\sin\Omega t = \frac{F_0}{m}\cos\Omega t. \tag{6} \]
Since $\Omega \neq \omega$ by assumption, it follows from (6), by equating the coefficients of $\cos\Omega t$ and $\sin\Omega t$ on the left and right sides, that $C = (F_0/m)/(\omega^2 - \Omega^2)$ and $D = 0$. Thus
\[ x_p(t) = \frac{F_0/m}{\omega^2 - \Omega^2}\cos\Omega t, \tag{7} \]
so a general solution of (3) is
\[ x(t) = x_h(t) + x_p(t) = A\cos\omega t + B\sin\omega t + \frac{F_0/m}{\omega^2 - \Omega^2}\cos\Omega t. \tag{8} \]
In a sense we are done, and if we wish to impose any prescribed initial conditions $x(0)$ and $x'(0)$, then we could use those conditions to evaluate the constants $A$ and $B$ in (8). Then, for any desired numerical values of $m$, $k$, $F_0$, and $\Omega$ we could plot $x(t)$ versus $t$ and see what the solution looks like. However, in science and engineering one is interested not only in obtaining answers, but also in understanding phenomena, so the question is: How can we extract, from (8), an understanding of the phenomenon? To answer that question, let us first rewrite (8) in the equivalent form
\[ x(t) = E\sin(\omega t + \phi) + \frac{F_0/m}{\omega^2 - \Omega^2}\cos\Omega t \tag{9} \]
since then we can see it more clearly as a superposition of two harmonic solutions, of different amplitude, frequency, and phase.
The homogeneous solution $E\sin(\omega t + \phi)$ in (9), the "free vibration," was already discussed in Section 3.5. [Alternative to $E\sin(\omega t + \phi)$, we could use the form $E\cos(\omega t + \phi)$, whichever one prefers; it doesn't matter.] Thus, consider the particular solution, or "forced response," given by (7) and the last term in (9). It is natural to regard $m$ and $k$ (and hence $\omega$) as fixed, and $F_0$ and $\Omega$ as controllable quantities or parameters. That the response (7) is merely proportional to $F_0$ is no surprise, for it follows from the linearity of the differential operator in (3). We also see, from (7), that the response is at the same frequency as the forcing function, $\Omega$. More interesting is the variation of the amplitude $(F_0/m)/(\omega^2 - \Omega^2)$ with $\Omega$, which is sketched in Fig. 2. The change in sign, as $\Omega$ increases through $\omega$, is awkward since it prevents us from interpreting the plotted quantity as a pure magnitude.
Figure 2. Magnitude of response (undamped case).
Thus, let us re-express (7) in the equivalent form
\[ x_p(t) = \frac{F_0/m}{|\omega^2 - \Omega^2|}\cos(\Omega t + \Phi), \tag{10} \]
where the phase angle $\Phi$ is 0 for $\Omega < \omega$ and $\pi$ for $\Omega > \omega$ [since $\cos(\Omega t + \pi) = -\cos\Omega t$ gives the desired sign change for $\Omega > \omega$]. The resulting amplitude- and phase-response curves are shown in Fig. 3. From Fig. 3a, observe that as the driving frequency approaches the natural frequency the amplitude tends to infinity! [Of course, we must remember that right at $\Omega = \omega$ our particular solution is invalid since (6) is then $(0)\cos\Omega t + (0)\sin\Omega t = (F_0/m)\cos\Omega t$, which cannot be satisfied.] Further, as $\Omega \to \infty$ the amplitude tends to zero. Finally, we see from Fig. 3b that the response is in-phase ($\Phi = 0$) with the forcing function for $\Omega < \omega$, but for all $\Omega > \omega$ it is 180° out-of-phase. This discontinuous jump is striking since only an infinitesimal change in $\Omega$ (from just below $\omega$ to just above it) produces a discontinuous change in the response.
Also of considerable interest phenomenologically is the possibility of what is known as beats, but we will postpone that discussion until we have had a look at the special case of resonance.
Figure 3. Amplitude- and phase-response curves (undamped case): (a) amplitude; (b) phase.
Resonant oscillation. For the special case where $\Omega = \omega$ (that is, where we force the system precisely at its natural frequency), the terms in (5) duplicate those in (4) so, according to the method of undetermined coefficients, we need to revise $x_p$ as
\[ x_p(t) = t\,(C\cos\omega t + D\sin\omega t). \tag{11} \]
Since the duplication has thereby been removed, we accept (11). Putting that form into (3), we find that $C = 0$ and $D = F_0/(2m\omega)$, so
\[ x_p(t) = \frac{F_0}{2m\omega}\,t\sin\omega t, \tag{12} \]
which is shown in Fig. 4. In this special case the response is not a harmonic oscillation but a harmonic function times $t$, which factor causes the magnitude to tend to infinity as $t \to \infty$. This result is known as resonance. Of course, the magnitude does not grow unboundedly in a real application since the mathematical model of the system (the governing differential equation) will become inaccurate for sufficiently large amplitudes, parts will break, and so on.
Figure 4. Resonant oscillation [envelope $\pm F_0 t/(2m\omega)$].
Resonance is sometimes welcome and sometimes unwelcome. That is, sometimes we wish to amplify a given input, and can do so by "tuning" the system to be at or near resonance, as when we tune a radio circuit to a desired broadcast frequency. And other times we wish to suppress inputs, as a well designed automobile suspension suppresses, rather than amplifies, the inputs from a bumpy road.
Beats. Isn't it striking that the response $x(t)$ is the sum of two harmonics [given by (5)] for all $\Omega \neq \omega$, yet it is of the different form (12) for the single case $\Omega = \omega$? One might wonder whether the resonant case is really of any importance at all since one can never get $\Omega$ to exactly equal $\omega$. It is therefore of interest to look at the solution $x(t)$ as $\Omega$ approaches $\omega$. To do so, let us use the simple initial conditions $x(0) = 0$ and $x'(0) = 0$, for definiteness, in which case we can evaluate $A$ and $B$ in (8), and obtain
\[ x(t) = \frac{F_0/m}{\omega^2 - \Omega^2}\,(\cos\Omega t - \cos\omega t), \tag{13} \]
or, recalling the trigonometric identity $\cos A - \cos B = 2\sin\frac{B+A}{2}\sin\frac{B-A}{2}$,
\[ x(t) = \frac{2F_0/m}{\omega^2 - \Omega^2}\,\sin\left(\frac{\omega + \Omega}{2}\right)t\;\sin\left(\frac{\omega - \Omega}{2}\right)t. \tag{14} \]
Now, suppose that $\Omega$ is close to (but not equal to) the natural frequency $\omega$. Then the frequency of the second sinusoid in (14) is very small compared to that of the first, so the $\sin\left(\frac{\omega - \Omega}{2}\right)t$ factor amounts, essentially, to a slow "amplitude modulation" of the relatively high frequency $\sin\left(\frac{\omega + \Omega}{2}\right)t$ factor. This phenomenon is known as beats, and is seen in Fig. 5, where we have plotted the solution (14) for four representative cases: in Fig. 5a $\Omega$ is not close to $\omega$, and there is no discernible beat phenomenon, but as $\Omega$ is increased the beat phenomenon becomes well established, as seen in Fig. 5b, 5c, and 5d. [We have shown the "envelope" $\sin\left(\frac{\omega - \Omega}{2}\right)t$ as dotted.]
We can now see that the resonance phenomenon at $\Omega = \omega$ is not an isolated behavior but is a limiting case as $\Omega \to \omega$. That is, resonance (Fig. 4) is actually a limit of the sequence shown in Fig. 5, as $\Omega \to \omega$. Rather than depend only on these suggestive graphical results, we can proceed analytically as well. Specifically, we can take the limit of the response (13) as $\Omega \to \omega$ and, with the help of l'Hôpital's rule, we do obtain (12)!
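The approach of the beat solution to the resonant solution can also be seen numerically. The sketch below is an illustration under assumed parameter values ($\omega = 1$, $F_0 = 1$, $m = 1$, chosen only for convenience): it evaluates the two equivalent forms (13) and (14) of the response and compares them with the resonant response (12) as $\Omega$ is moved toward $\omega$. The function names are hypothetical.

```python
import math

def x_sum_form(t, Omega, omega=1.0, F0=1.0, m=1.0):
    """Response (13): difference of two cosines, zero initial conditions."""
    return (F0 / m) / (omega**2 - Omega**2) * (math.cos(Omega * t) - math.cos(omega * t))

def x_product_form(t, Omega, omega=1.0, F0=1.0, m=1.0):
    """Response (14): fast sinusoid modulated by a slow sinusoid."""
    return (2 * F0 / m) / (omega**2 - Omega**2) \
        * math.sin((omega + Omega) * t / 2) * math.sin((omega - Omega) * t / 2)

def x_resonant(t, omega=1.0, F0=1.0, m=1.0):
    """Resonant response (12)."""
    return F0 / (2 * m * omega) * t * math.sin(omega * t)

if __name__ == "__main__":
    t = 3.0
    for Omega in (0.9, 0.99, 0.999, 0.999999):
        gap = abs(x_sum_form(t, Omega) - x_resonant(t))
        print(f"Omega = {Omega}: |(13) - (12)| = {gap:.3e}")
```

At a fixed time the gap shrinks as $\Omega \to \omega$; at any fixed $\Omega \neq \omega$ the two responses eventually separate for large enough $t$, since (13) stays bounded while (12) grows.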
With our foregoing discussion of the undamped forced harmonic oscillator in mind, we cannot overstate that we are by no means dealing only with the solving of equations but with the phenomena thereby being described. To understand phenomena, we normally need to do several things: we do need to solve the equations that model the phenomena (analytically or, if that is too hard, numerically), but we also need to study, interpret, and understand the results. Such study normally includes the generation of suitably chosen graphical displays (such as our Fig. 2, 3, and 4), the isolation of special cases [such as our initial consideration of the case where there is no damping; $c = 0$ in (1)], and perhaps the examination of various limiting cases (such as the limit $\Omega \to \omega$ in the present example). Emphasis in this book is on the mathematics, with the detailed study of the relevant physics left for applications courses such as Fluid Mechanics, Electromagnetic Field Theory, and so on, but we will occasionally try to show not only the connections between the mathematics and the physics but also the process whereby we determine those connections.
Figure 5. Beats, and approach to resonance.
3.8.2. Damped case. We now reconsider the harmonically driven oscillator, this time with a $cx'$ damping term included ($c > 0$):
\[ mx'' + cx' + kx = F_0\cos\Omega t. \tag{15} \]
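Recall from Section 3.5 that the homogeneous solution is
\[ x_h(t) = \begin{cases} e^{-ct/2m}\left[A\cos\sqrt{\omega^2 - \left(\frac{c}{2m}\right)^2}\,t + B\sin\sqrt{\omega^2 - \left(\frac{c}{2m}\right)^2}\,t\right] \\[4pt] e^{-ct/2m}\,(A + Bt) \\[4pt] e^{-ct/2m}\left[A\cosh\sqrt{\left(\frac{c}{2m}\right)^2 - \omega^2}\,t + B\sinh\sqrt{\left(\frac{c}{2m}\right)^2 - \omega^2}\,t\right] \end{cases} \tag{16} \]
for the underdamped ($c < c_{cr}$), critically damped ($c = c_{cr}$), and overdamped ($c > c_{cr}$) cases, respectively, and where $\omega = \sqrt{k/m}$ and $c_{cr} = 2\sqrt{mk}$.
This time, when we write
\[ x_p(t) = C\cos\Omega t + D\sin\Omega t, \tag{17} \]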
according to the method of undetermined coefficients, there is no duplication between terms in (17) and (16), even if $\Omega = \omega$, because of the $\exp(-ct/2m)$ factors in (16), so we can accept (17) without modification. Putting (17) into (15) and equating coefficients of the $\cos\Omega t$ terms on both sides of the equation, and similarly for the coefficients of the $\sin\Omega t$ terms, enables us to solve for $C$ and $D$. The result (Exercise 3a) is that
\[ x_p(t) = \frac{(F_0/m)(\omega^2 - \Omega^2)}{(\omega^2 - \Omega^2)^2 + (c\Omega/m)^2}\cos\Omega t + \frac{F_0 c\Omega/m^2}{(\omega^2 - \Omega^2)^2 + (c\Omega/m)^2}\sin\Omega t, \tag{18} \]
or (Exercise 3b), equivalently,
\[ x_p(t) = E\cos(\Omega t + \Phi), \tag{19a} \]
where the amplitude $E$ and phase $\Phi$ are
\[ E = \frac{F_0/m}{\sqrt{(\omega^2 - \Omega^2)^2 + (c\Omega/m)^2}}, \tag{19b} \]
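\[ \Phi = \tan^{-1}\frac{c\Omega/m}{\Omega^2 - \omega^2}, \tag{19c} \]
with the $\tan^{-1}$ understood to lie between $-\pi$ and 0, as is required for (19a) to agree with the signs of the two coefficients in (18) (as $c \to 0$ this gives $\Phi \to 0$ for $\Omega < \omega$ and $\Phi \to -\pi$, which is equivalent to the $\pi$ in (10), for $\Omega > \omega$).
Of particular interest are the response curves, the graphs of the amplitude $E$ and the phase $\Phi$ with respect to the driving frequency $\Omega$. The former is given in Fig. 6 for various values of the damping coefficient $c$, and the latter is left for the exercises.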
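The steady-state formulas can be spot-checked numerically. The following sketch uses arbitrary assumed values of $m$, $c$, $k$, $F_0$, $\Omega$: it computes the coefficients $C$ and $D$ of (18) and the amplitude $E$ of (19b), then substitutes $x_p = C\cos\Omega t + D\sin\Omega t$ into the left side of (15) using exact derivatives. The function names `steady_state` and `residual` are illustrative only.

```python
import math

def steady_state(m, c, k, F0, Omega):
    """Return (C, D, E): the coefficients in (18) and the amplitude (19b)."""
    a = k / m - Omega**2          # omega^2 - Omega^2
    b = c * Omega / m             # c*Omega/m
    denom = a * a + b * b
    C = (F0 / m) * a / denom
    D = (F0 / m) * b / denom
    E = (F0 / m) / math.sqrt(denom)
    return C, D, E

def residual(m, c, k, F0, Omega, t):
    """m x'' + c x' + k x - F0 cos(Omega t) for x = C cos(Omega t) + D sin(Omega t)."""
    C, D, _ = steady_state(m, c, k, F0, Omega)
    cs, sn = math.cos(Omega * t), math.sin(Omega * t)
    x = C * cs + D * sn
    xd = Omega * (-C * sn + D * cs)
    xdd = -Omega**2 * x
    return m * xdd + c * xd + k * x - F0 * cs

if __name__ == "__main__":
    params = dict(m=2.0, c=0.5, k=8.0, F0=3.0, Omega=1.7)   # assumed sample values
    C, D, E = steady_state(**params)
    print("C, D, E =", C, D, E)
    print("residual at t = 0.3:", residual(t=0.3, **params))
```

The amplitude satisfies $E = \sqrt{C^2 + D^2}$, and at $\Omega = 0$ it reduces to the static deflection $F_0/k$.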
From Fig. 6 we see that true resonance is possible only in the case of no damping ($c = 0$), which case is an idealization since in reality there is inevitably some damping present. Analytically, we see the same thing: (19b) shows that the amplitude $E$ can become infinite only if $c = 0$, and that occurs only for $\Omega = \omega$. However, for $c > 0$ there is still a peaking of the amplitude, even if that peak is now finite, at a driving frequency $\Omega$ which diminishes from $\omega$ as $c$ increases, and which is 0 for all $c > c_{cr}$. Further, the peak magnitude (located by the dotted curve) diminishes from $\infty$ to $F_0/k$ as $c$ is increased from 0 to $c_{cr}$, and remains $F_0/k$ for all $c > c_{cr}$. What is the significance of the $F_0/k$ value? For $\Omega = 0$ the differential equation becomes $mx'' + cx' + kx = F_0$, and the method of undetermined coefficients gives $x_p(t) = \text{constant} = F_0/k$, which is merely the static deflection of the mass under the steady force $F_0$.
Even if true resonance is possible only for the undamped case ($c = 0$), the term resonance is often used to refer to the dramatic peaking of the amplitude response curves if $c$ is not too large.
The general solution, of course, is the sum
\[ x(t) = x_h(t) + x_p(t) = x_h(t) + E\cos(\Omega t + \Phi), \tag{20} \]
where $E$ and $\Phi$ are given by (19b,c) and $x_h(t)$ is given by the suitable right-hand side of (16), according to whether the system is underdamped, critically damped, or overdamped. If we impose initial conditions $x(0)$ and $x'(0)$ on (20), then we can solve for the integration constants $A$ and $B$ within $x_h(t)$.
Notice carefully that the $x_h(t)$ part of the solution inevitably tends to zero as $t \to \infty$ because of the $\exp(-ct/2m)$ factor, no matter how small $c$ is, as long as $c > 0$. Thus, we call $x_h(t)$ in (20) the transient part of the solution and we call $x_p(t)$ the steady-state part since $x(t) \to E\cos(\Omega t + \Phi)$ as $t \to \infty$. The transient part depends upon the initial conditions, whereas the steady-state part does not. A representative underdamped case is shown in Fig. 7, where we see the approach to the steady-state oscillation $x_p(t)$.
Figure 7. A representative response $x(t)$ (solid); approach to the steady-state oscillation $x_p(t)$ (dotted).
Closure. In this section we considered the forced vibration of a harmonic oscillator, that is, a system governed by the differential equation $mx'' + cx' + kx = f(t)$, for the case of the harmonic excitation $f(t) = F_0\cos\Omega t$. Thus, besides a homogeneous solution we needed to find a particular solution, and that was done by the method of undetermined coefficients. The particular solution is especially important physically since even an infinitesimal amount of damping will cause the homogeneous solution to tend to zero as $t \to \infty$, so that the particular solution becomes the steady-state response. To understand the physical significance of that response we attached importance to the amplitude- and phase-response curves and discussed the phenomena of resonance and beats. Our discussion in this section has been limited in that we have considered only the case of harmonic excitation, whereas in applications $f(t)$ surely need not be harmonic. However, that case is important enough to deserve this special section. When we study the Laplace transform method in Chapter 5, we will be able to return to problems such as this one and obtain solutions for virtually any forcing function $f(t)$.
EXERCISES
3.8
1. Applying the initial conditions $x(0) = 0$ and $x'(0) = 0$ to (8), derive (13). Show that the same result can be obtained if we start with the form (9) instead of (8).
2. Derive (12) from (11).
3. (a) Derive (18).
(b) Derive (19a,b,c).
4. The amplitude- and phase-response curves shown in Fig. 3 correspond to the equation $mx'' + kx = F_0\cos\Omega t$. Obtain the equations of the analogous response curves for the equation $mx'' + kx = F_0\sin\Omega t$, and give labeled sketches of the two curves.
5. Figure 6 shows the amplitude-response curves ($E$ versus $\Omega$) corresponding to (19b), for various values of $c$.
(a) What happens to the graph as $c \to \infty$? Is $E(\Omega)$ continuous on $0 \le \Omega < \infty$ for $c = \infty$? Explain.
(b) From (19c), obtain the phase-response curves ($\Phi$ versus $\Omega$), either by a careful freehand sketch or using a computer, for various values of $c$, being sure to include the important case $c = 0$. What happens to the graph as $c \to \infty$?
6. In Fig. 7 we show the approach of a representative response curve (solid) to the steady-state oscillation (dotted), for an underdamped system.
(a) Do the same (with a computer plot) for a critically damped case. The values of $m$, $c$, $k$, $F_0$, $\Omega$, $x(0)$, $x'(0)$ are up to you, but the idea is to demonstrate graphically the approach to $x_p(t)$ clearly, as we have in Fig. 7.
(b) Same as (a), for an overdamped system, where $c = 4c_{cr}$, say.
7. Show that taking the limit of the response (13) as $\Omega \to \omega$, with the help of l'Hôpital's rule, does give (12), as claimed two paragraphs below (14).
8. Observe from Fig. 6 that the amplitude $E$ tends to zero as $\Omega \to \infty$. Explain (physically, mathematically, or both) why that result makes sense.
9. (a) What choice of initial conditions $x(0)$ and $x'(0)$ will reduce the solution (20) to just the particular solution, $x(t) = E\cos(\Omega t + \Phi)$?
(b) Using a sketch of a representative $x_p(t)$ such as the dotted curve in Fig. 7, show the graphical significance of those special values of $x(0)$ and $x'(0)$.
10. Imagine the experimental means that would be required to apply a force $F_0\cos\Omega t$ to a mass. It doesn't sound so hard if the mass is stationary, but imagine trying to apply such a force to a moving mass! In many physical applications, such as earthquake-induced vibration, the driving force is applied indirectly, by "shaking" the wall, rather than being applied directly to the mass. Specifically, for the system shown in the figure, use Newton's second law to show that if the wall is displaced laterally according to $\delta(t) = \delta_0\cos\Omega t$, then the equation of motion of the mass $m$ is $mx'' + kx = F_0\cos\Omega t$, where $F_0 = k\delta_0$. Here, $x$ and $\delta$ are measured relative to fixed points in space. NOTE: Observe that such an experiment is more readily performed since it is easier to apply a harmonic displacement $\delta(t)$ than a harmonic force; for instance, one could use a slider-crank mechanism (which converts circular motion to harmonic linear motion). Note further that a displacement input is precisely what an automobile suspension is subjected to when we drive over a bumpy road.
11. For the mechanical oscillator governed by the differential equation $mx'' + cx' + kx = F(t)$, obtain computer plots of the amplitude- and phase-response curves ($E$ versus $\Omega$ and $\Phi$ versus $\Omega$), for the case where $F(t) = 25\sin\Omega t$, for these six values of the damping coefficient $c$: 0, $0.25c_{cr}$, $0.5c_{cr}$, $c_{cr}$, $2c_{cr}$, $4c_{cr}$, where
(a) $m = 1$, $k = 1$
(b) $m = 2$, $k = 5$
(c) $m = 2$, $k = 10$
(d) $m = 4$, $k = 2$
(e) $m = 4$, $k = 10$
12. (Complex function method) Let $L$ be a linear constant-coefficient differential operator, and consider the equation
\[ L[x] = F_0\cos\Omega t. \tag{12.1} \]
According to the method of undetermined coefficients, we can find a particular solution $x_p(t)$ by seeking $x_p(t) = A\cos\Omega t + B\sin\Omega t$ (or, in exceptional cases, $t$ to an integer power times that). A slightly simpler line of approach that is sometimes used is as follows. Consider, in place of (12.1),
\[ L[w] = F_0 e^{i\Omega t}. \tag{12.2} \]
Equation (12.2) is simpler than (12.1) in that to find a particular solution we need only one term, $w_p(t) = Ae^{i\Omega t}$. (If $z = a + ib$ is any complex number, it is standard to call $\mathrm{Re}\,z = a$ and $\mathrm{Im}\,z = b$ the real part and the imaginary part of $z$, respectively.) Because, according to Euler's formula, $e^{i\Omega t} = \cos\Omega t + i\sin\Omega t$, it follows that $\mathrm{Re}\,e^{i\Omega t} = \cos\Omega t$ and $\mathrm{Im}\,e^{i\Omega t} = \sin\Omega t$. Since the forcing function in (12.1) is the real part of the forcing function in (12.2), it seems plausible that $x_p(t)$ should be the real part of $w_p(t)$. Thus, we have the following method: to find a particular solution to (12.1) consider instead the simpler equation (12.2). Solve for $w_p(t)$ by seeking $w_p(t) = Ae^{i\Omega t}$, and then recover the desired $x_p(t)$ from $x_p(t) = \mathrm{Re}\,w_p(t)$.
(a) Prove that the method described above works. HINT: The key is the linearity of $L$, so that if $w = u + iv$, then $L[w] = L[u + iv] = L[u] + iL[v]$.
(b)-(k) Use the method to obtain a particular solution to the given equation:
(b) $mx'' + cx' + kx = F_0\cos\Omega t$
(c) $mx'' + cx' + kx = F_0\sin\Omega t$
(d) $x' + 3x = 5\cos 2t$
(e) $x' - x = 4\sin 3t$
(f) $x'' - x' + x = \cos 2t$
(g)2" + 5a’ +a = 3sindt
(h) $x'' - 2x' + x = 6\cos 5t$
Qe" +e" +e'+a = 3sint
Ga” +a +x =3cost
(k) 2!" + 2a" + 4a = 9sin 6t
13. (Electrical circuit) Recall from Section 2.3 that the equations governing the current $i(t)$ in the circuit shown, and the charge $Q(t)$ on the capacitor are
\[ L\frac{d^2 i}{dt^2} + R\frac{di}{dt} + \frac{1}{C}\,i = \frac{dE(t)}{dt} \tag{13.1} \]
and
\[ L\frac{d^2 Q}{dt^2} + R\frac{dQ}{dt} + \frac{1}{C}\,Q = E(t), \tag{13.2} \]
respectively, where $L$, $R$, $C$, $E$, $i$, and $Q$ are measured in henrys, ohms, farads, volts, amperes, and coulombs, respectively.
(a) Let $L = 2$, $R = 4$, and $C = 0.05$. Solve for $Q(t)$ subject to the initial conditions $Q(0) = Q'(0) = 0$, where $E(t) = 100$. Identify the steady-state solution. Give a computer plot of the solution for $Q(t)$ over a sufficiently long time period to clearly show the approach of $Q$ to its steady state. (Naturally, all plots should be suitably labeled.)
(b) Same as (a), but for $C = 0.08$.
(c) Same as (a), but for $C = 0.2$.
(d) Same as (a), but for $E(t) = 10e^{-t}$.
(e) Same as (a), but for $E(t) = 10\,(1 - e^{-t})$.
(f) Same as (a), but for E(t) = 50 (1 + e~°**).
a large set of differential equations. A realistic model could easily contain 100
differential equations on 100 unknowns.
If there are two or more unknowns, then we are involved not with a single differential equation but with a system of such equations. For instance, according to the well known Lotka-Volterra model of predator-prey population dynamics, the populations $x(t)$ and $y(t)$ of predator and prey are governed by the system of two equations
\[ \begin{aligned} x' &= -\alpha x + \beta xy, \\ y' &= \gamma y - \delta xy, \end{aligned} \tag{1} \]
where $\alpha$, $\beta$, $\gamma$, $\delta$ are empirical constants and $t$ is the time. This particular system happens to be nonlinear because of the $xy$ products; we will return to it in Chapter 7 when we study nonlinear systems. The present chapter is devoted exclusively to linear differential equations.
By definition, a linear first-order system of $n$ equations in the $n$ unknowns $x_1(t), \ldots, x_n(t)$ is of the form
\[ \begin{aligned} a_{11}(t)x_1' + \cdots + a_{1n}(t)x_n' + b_{11}(t)x_1 + \cdots + b_{1n}(t)x_n &= f_1(t), \\ &\;\;\vdots \\ a_{n1}(t)x_1' + \cdots + a_{nn}(t)x_n' + b_{n1}(t)x_1 + \cdots + b_{nn}(t)x_n &= f_n(t), \end{aligned} \tag{2} \]
where the forcing functions $f_j(t)$ and the coefficients $a_{jk}(t)$ and $b_{jk}(t)$ are prescribed, and where it is convenient to use a double-subscript notation: $a_{jk}(t)$ denotes the coefficient of $x_k'(t)$ in the $j$th equation, and $b_{jk}(t)$ denotes the coefficient of $x_k(t)$ in the $j$th equation. We call (2) a first-order system because the highest derivatives are of first order. If the highest derivatives were of second order, we would call it a second-order system, and so on. A linear second-order system of $n$ equations in the $n$ unknowns $x_1(t), \ldots, x_n(t)$ would be of the same form as (2), but with each left-hand side being a linear combination of the second-, first-, and zeroth-order derivatives of the unknowns.
The system (2) is a generalization of the linear first-order equation $y' + p(x)y = q(x)$ in the one unknown $y(x)$ studied in Chapter 2. There, and in most of Chapter 3, we favored $x$ as the generic independent variable and $y$ as the generic dependent variable, but in this section the independent variable in most of our applications happens to be the time $t$, so we will use $t$ as the independent variable.
As in the case of a single differential equation, by a solution of a system of differential equations (be they linear or not), in the unknowns $x_1(t), \ldots, x_n(t)$ over some $t$ interval $I$, we mean a set of functions $x_1(t), \ldots, x_n(t)$ that reduce those equations to identities over $I$.
3.9.1. Examples. Let us begin by giving a few examples of how such systems arise in applications.
EXAMPLE 1. RL Circuit. Consider the circuit shown in Fig. 1, comprised of three loops. We wish to obtain the differential equations governing the various currents in the circuit. There are two ways to proceed that are different but equivalent, and which correspond to the current labeling shown in Fig. 2a and 2b (in which we have omitted the circuit elements, for simplicity). First consider the former. If the current approaching the junction $p$ from the "west" is designated as $i_1$ and the current leaving to the east is $i_2$, then it follows from Kirchhoff's current law (namely, that the algebraic sum of the currents approaching or leaving any point of a circuit is zero) that the current to the south must be $i_1 - i_2$. Similarly, if we designate the current leaving the junction $q$ to the east as $i_3$, then the current leaving to the south must be $i_2 - i_3$. With the current approaching $r$ from the north and east being $i_2 - i_3$ and $i_3$, it follows that the current leaving to the west must be $i_2$. Similarly, the current leaving $s$ to the west must be $i_1$.
Figure 1. Circuit of Example 1.
Next, apply Kirchhoff's voltage law (namely, that the algebraic sum of the voltage drops around each loop of the circuit must be zero) to each loop, recalling from Section 2.3 that the voltage drops across inductors, resistors, and capacitors (of which there are none in this particular circuit) are $L\,\frac{di}{dt}$, $Ri$, and $\frac{1}{C}\int i\,dt$, respectively. For the left-hand loop that step gives $L_1\frac{di_1}{dt} + R_1(i_1 - i_2) - E_1(t) = 0$, where the last term (corresponding to the applied voltage $E_1$) is counted as negative because it amounts to a voltage rise (according to the polarity denoted by the ± signs in Fig. 1) rather than a drop. Thus, we have for the left, middle, and right loops,
\[ \begin{aligned} L_1 i_1' + R_1(i_1 - i_2) &= E_1(t), \\ L_2 i_2' + R_2(i_2 - i_3) + R_1(i_2 - i_1) &= E_2(t), \\ L_3 i_3' + R_3 i_3 + R_2(i_3 - i_2) &= E_3(t), \end{aligned} \tag{3} \]
respectively, or,
\[ \begin{aligned} L_1 i_1' + R_1 i_1 - R_1 i_2 &= E_1(t), \\ L_2 i_2' - R_1 i_1 + (R_1 + R_2)i_2 - R_2 i_3 &= E_2(t), \\ L_3 i_3' - R_2 i_2 + (R_2 + R_3)i_3 &= E_3(t), \end{aligned} \tag{4} \]
where $E_1(t)$, $E_2(t)$, $E_3(t)$ are prescribed. It must be remembered that the currents do not need to flow in the directions assumed by the arrows: after all, they are the unknowns. If any of them turn out to be negative (at any given instant $t$), that merely means that they are flowing in the direction opposite to that tentatively assumed in Fig. 2a.
Alternatively, one can use the idea of "loop currents," as denoted in Fig. 2b. In that case the south-flowing currents in $R_1$ and $R_2$ are the net currents $i_1 - i_2$ and $i_2 - i_3$, respectively, just as in Fig. 2a. Either way, the result is the linear first-order system (4). ■
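To make the coupling in system (4) concrete, here is a small numerical sketch, offered as an illustration rather than a solution method developed in the text. It solves (4) for the derivatives and integrates with a classical fourth-order Runge-Kutta step. All numerical values (unit inductances and resistances, a constant $E_1 = 1$ with $E_2 = E_3 = 0$, zero initial currents) are assumptions chosen for convenience, and the names `rl_rhs` and `integrate` are hypothetical.

```python
def rl_rhs(i, E, L, R):
    """System (4) solved for the derivatives (i1', i2', i3')."""
    i1, i2, i3 = i
    E1, E2, E3 = E
    L1, L2, L3 = L
    R1, R2, R3 = R
    return ((E1 - R1 * i1 + R1 * i2) / L1,
            (E2 + R1 * i1 - (R1 + R2) * i2 + R2 * i3) / L2,
            (E3 + R2 * i2 - (R2 + R3) * i3) / L3)

def integrate(i0, E_of_t, L, R, t_end, dt=0.01):
    """Classical RK4 integration of (4) from t = 0 to t_end."""
    def shifted(state, slope, h):
        return tuple(s + h * k for s, k in zip(state, slope))
    i, t = tuple(i0), 0.0
    for _ in range(int(round(t_end / dt))):
        k1 = rl_rhs(i, E_of_t(t), L, R)
        k2 = rl_rhs(shifted(i, k1, dt / 2), E_of_t(t + dt / 2), L, R)
        k3 = rl_rhs(shifted(i, k2, dt / 2), E_of_t(t + dt / 2), L, R)
        k4 = rl_rhs(shifted(i, k3, dt), E_of_t(t + dt), L, R)
        i = tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                  for s, a, b, c, d in zip(i, k1, k2, k3, k4))
        t += dt
    return i

if __name__ == "__main__":
    L = (1.0, 1.0, 1.0)                      # assumed inductances
    R = (1.0, 1.0, 1.0)                      # assumed resistances
    E = lambda t: (1.0, 0.0, 0.0)            # assumed constant applied voltages
    print(integrate((0.0, 0.0, 0.0), E, L, R, t_end=80.0))
```

For constant applied voltages the currents settle to the values obtained by setting the derivatives in (4) to zero; with the assumed data those are $i_3 = 1$, $i_2 = 2$, $i_1 = 3$. Note that none of the three equations can be advanced without the others, which is the coupling discussed next.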
It is important to see that the system (4) is coupled. That is, the individual equations contain more than one unknown so that we cannot separate them and solve the first for $i_1$ (for instance), the second for $i_2$, and the third for $i_3$. Put differently, the currents $i_1$, $i_2$, $i_3$ are interrelated. It is only natural for systems of differential equations to be coupled since the coupling is the mathematical expression of the relatedness of the dependent variables. On the other hand, if we write differential equations governing the current $i(t)$ in a circuit and the price of tea in China, $p(t)$, we would hardly expect those equations to be coupled and, indeed, it would hardly make sense to group them as part of the same system.
EXAMPLE 2. LC Circuit. For the circuit shown in Fig. 3, the same reasoning as above gives the integro-differential equations
+
lof
ran [iat
a fi
d
+,Le (i 1tg)
Cy} *
w+
mr ce
da
dt?"
Gt
or)
;
= E(t),
)=0
”
on the currents $i_1(t)$ and $i_2(t)$ or, differentiating to eliminate the integral signs,
∑
−−1 ↕−
a
Lit
+
↕
Ca
Sta
=
(6)
Whereas (4) was a first-order system, (6) is of second order. ■
Figure 3. LC circuit.
EXAMPLE 3. Mass-Spring System. This time consider a mechanical system, shown in Fig. 4 and comprised of masses and springs. The masses rest on a frictionless table and are subjected to applied forces $F_1(t)$, $F_2(t)$, respectively. When the displacements $x_1$ and $x_2$ are zero, the springs are neither stretched nor compressed, and we seek the equations of motion of the system, that is, the differential equations governing $x_1(t)$ and $x_2(t)$.
The relevant physics is Newton's second law of motion, and Hooke's law for each of the three springs, as were discussed in Section 1.3. To proceed, it is useful to make a concrete assumption on $x_1$ and $x_2$. Specifically, suppose that at the instant $t$ we have $x_1 > x_2 > 0$, as assumed in Fig. 5 (which figure, in the study of mechanics, is called a free-body diagram). Then the left spring is stretched by $x_1$ so it exerts a force to the left, on $m_1$, equal (according to Hooke's law) to $k_1 x_1$. The middle spring is compressed by $x_1 - x_2$ so it exerts a force $k_{12}(x_1 - x_2)$ to the left on $m_1$ and to the right on $m_2$, and the right spring is compressed by $x_2$ and exerts a force $k_2 x_2$ to the left on $m_2$, as shown in the figure. With the help of the information given in Fig. 5, Newton's second law for each of the two masses gives
\[ \begin{aligned} m_1 x_1'' &= -k_1 x_1 - k_{12}(x_1 - x_2) + F_1(t), \\ m_2 x_2'' &= -k_2 x_2 + k_{12}(x_1 - x_2) + F_2(t) \end{aligned} \tag{7} \]
as the desired equations of motion or, rearranging terms,
\[ \begin{aligned} m_1 x_1'' + (k_1 + k_{12})x_1 - k_{12}x_2 &= F_1(t), \\ m_2 x_2'' - k_{12}x_1 + (k_2 + k_{12})x_2 &= F_2(t). \end{aligned} \tag{8} \]
Figure 5. Free-body diagram of the masses.
Figure 6. Revised free-body diagram for $m_1$.
COMMENT. Our assumption that x; > v2 > O was only for definiteness; the resulting
equations (8) are insensitive to whatever such assumption is made. For instance, suppose
[>
)
that we assume, instead, that 22 > x2; > 0. Then the middle spring is stretched by x2 — 2),
so the free-body diagram of m, changes to that shown in Fig. 6, and Newton’s law for m,
gives mya? = —kyx, + kyo(aq — 1) + F(t), which is seen to be equivalentto the first
of equations (7). Similarly
for m2. #
3.9.2. Existence and uniqueness. The fundamental theorem regarding existence and uniqueness is as follows.*

THEOREM 3.9.1 Existence and Uniqueness for Linear First-Order Systems
Let the functions $a_{11}(t), a_{12}(t), \dots, a_{nn}(t)$ and $f_1(t), \dots, f_n(t)$ be continuous on a closed interval $I$. And let numbers $b_1, \dots, b_n$ be given such that

$$x_1(a) = b_1, \quad x_2(a) = b_2, \quad \dots, \quad x_n(a) = b_n, \tag{9}$$

where $a$ is a given point in $I$. Then the system

$$x_1' = a_{11}(t)x_1 + a_{12}(t)x_2 + \cdots + a_{1n}(t)x_n + f_1(t),$$
$$\vdots$$
$$x_n' = a_{n1}(t)x_1 + a_{n2}(t)x_2 + \cdots + a_{nn}(t)x_n + f_n(t), \tag{10}$$

subject to the initial conditions (9), has a unique solution on the entire interval $I$.

*There is a subtle point that is worth noting, namely, that (10) is not quite of the same form as the general first-order linear system (2) in that its left-hand sides are simply $x_1', \dots, x_n'$, rather than linear combinations of those terms. (What follows presumes that you have already studied the sections on matrices, rank, and Gauss-Jordan reduction.) The idea is that (2) can be reduced to the form (10) by elementary row operations, as in the Gauss-Jordan reduction of linear algebraic equations, unless the rank of the $\{a_{jk}(t)\}$ matrix is less than $n$. In that case, such operations would yield at least one equation, at the bottom of the system, which has no derivatives in it. If not all of the coefficients of the undifferentiated $x_j$ terms in that equation are zero, then one could use that equation to solve for one of the $x_j$'s in terms of the others and use that result to reduce the system by one unknown and one equation; if all of the coefficients of the undifferentiated $x_j$ terms in that equation are zero, then that equation would either be $0 = 0$, which could be discarded, or zero equal to some nonzero prescribed function of $t$, which would cause the system to have no solution. To avoid these singular cases, it is conventional to use the form (10), rather than (2), in the existence and uniqueness theorem.
Observe that we have added the word “entire” for emphasis, for recall from Section 2.4 that the Existence and Uniqueness Theorem 2.4.1 for the nonlinear initial-value problem $y'(x) = f(x, y)$ with initial condition $y(a) = b$ is a local one; it guarantees the existence of a unique solution on some interval $|x - a| < h$, but it does not tell us how big $h$ can be. In contrast, Theorem 3.9.1 tells us that the solution exists and is unique over the entire interval $I$ over which the specified conditions are met.
EXAMPLE 4. Example 1, Revisited. Suppose that we add initial conditions, say $i_1(0) = b_1$, $i_2(0) = b_2$, $i_3(0) = b_3$, to the system (4) governing the RL circuit of Example 1. If the $E_j(t)$'s are continuous on $0 \le t \le T$ and the $L_j$'s are nonzero [so we can divide through by them in reducing (4) to the form of (10)] then, according to Theorem 3.9.1, the initial-value problem has a solution on $0 \le t \le T$, and it is unique. ■
It would appear that Theorem 3.9.1 does not apply to the system (8) of Example 3 because the latter is of second order rather than first. However, and this is
important, higher-order systems can be reduced to first-order ones by introducing
artificial, or auxiliary, dependent variables.
EXAMPLE 5. Reduce the second-order system (8) to a first-order system. The idea is to introduce artificial dependent variables $u$ and $v$ according to $x_1' = u$ and $x_2' = v$ because then the second-order derivatives $x_1''$ and $x_2''$ become first-order derivatives $u'$ and $v'$, respectively. Thus, (8) can be re-expressed, equivalently, as the first-order system

$$x_1'(t) = u,$$
$$u'(t) = -\frac{k_1 + k_{12}}{m_1}\,x_1 + \frac{k_{12}}{m_1}\,x_2 + \frac{1}{m_1}F_1(t),$$
$$x_2'(t) = v,$$
$$v'(t) = \frac{k_{12}}{m_2}\,x_1 - \frac{k_2 + k_{12}}{m_2}\,x_2 + \frac{1}{m_2}F_2(t). \tag{11}$$

To see that this system is of the form (10), let “$x_1$” $= x_1$, “$x_2$” $= u$, “$x_3$” $= x_2$, and “$x_4$” $= v$. Then $a_{11} = a_{13} = a_{14} = 0$, $a_{12} = 1$, $f_1(t) = 0$, $a_{21} = -(k_1 + k_{12})/m_1$, $a_{22} = a_{24} = 0$, $a_{23} = k_{12}/m_1$, $f_2(t) = F_1(t)/m_1$, and so on. All of the $a_{jk}(t)$ coefficients are constants and hence continuous for all $t$. Let the forcing functions $F_1(t)$ and $F_2(t)$ be continuous on $0 \le t < \infty$.
Thus, according to Theorem 3.9.1, if we prescribe initial conditions $x_1(0)$, $u(0)$, $x_2(0)$, $v(0)$, then the initial-value problem consisting of (11), together with those initial conditions, will have a unique solution for $x_1(t)$, $u(t)$, $x_2(t)$, $v(t)$. Equivalently, the initial-value problem consisting of (8), together with prescribed initial values $x_1(0)$, $x_1'(0)$, $x_2(0)$, $x_2'(0)$, will have a unique solution for $x_1(t)$, $x_2(t)$. ■
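One practical payoff of the reduction to first-order form is that a system such as (11) can be handed directly to a standard numerical integrator. The following is a minimal sketch, not part of the text: it assumes the parameter values $m_1 = m_2 = k_1 = k_{12} = k_2 = 1$ and $F_1 = F_2 = 0$, and the function names `rhs`, `rk4_step`, and `integrate` are hypothetical. With $x_1(0) = x_2(0) = 1$ and $u(0) = v(0) = 0$ the two masses move together, each equation reduces to $x'' = -x$, and the exact solution $\cos t$ is available for comparison.

```python
import math

def rhs(state, m1=1.0, m2=1.0, k1=1.0, k12=1.0, k2=1.0):
    """Right-hand side of the first-order system (11), assuming F1 = F2 = 0."""
    x1, u, x2, v = state
    du = (-(k1 + k12) * x1 + k12 * x2) / m1
    dv = (k12 * x1 - (k2 + k12) * x2) / m2
    return (u, du, v, dv)

def rk4_step(state, h):
    """Advance the state (x1, u, x2, v) by one classical Runge-Kutta step."""
    def shifted(s, slope, c):
        return tuple(si + c * ki for si, ki in zip(s, slope))
    s1 = rhs(state)
    s2 = rhs(shifted(state, s1, h / 2))
    s3 = rhs(shifted(state, s2, h / 2))
    s4 = rhs(shifted(state, s3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, s1, s2, s3, s4))

def integrate(state, t_end, steps):
    """Integrate from t = 0 to t = t_end using a fixed number of steps."""
    h = t_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

if __name__ == "__main__":
    x1, u, x2, v = integrate((1.0, 0.0, 1.0, 0.0), 1.0, 1000)
    print(x1, math.cos(1.0))  # the two values should agree closely
```

The state is ordered as $(x_1, u, x_2, v)$ to match the relabeling “$x_1$”, ..., “$x_4$” used above.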
Consider one more example on auxiliary variables.
EXAMPLE 6. Consider the third-order equation

$$x''' + 2t\,x'' - x' + (\sin t)\,x = \cos t, \tag{12}$$

which is a system, a system of $n$ equations in $n$ unknowns, where $n = 1$. To reduce it to a system of first-order equations, introduce auxiliary variables $u, v$ according to $x' = u$ and $x'' = u' = v$. Then

$$x' = u,$$
$$u' = v,$$
$$v' = -(\sin t)\,x + u - 2t\,v + \cos t \tag{13}$$

is the desired equivalent first-order system, where the last of the three equations follows from the first two together with (12). ■
3.9.3. Solution by elimination. We now give a method of solution of systems of
linear differential equations for the special case of constant coefficients, a method
of elimination that is well suited to systems that are small enough for us to carry
out the steps by hand.
We introduce the method with an example after first recalling (from Section 3.3) the idea of a linear differential operator,

$$L = a_0(t)\frac{d^n}{dt^n} + a_1(t)\frac{d^{n-1}}{dt^{n-1}} + \cdots + a_n(t) = a_0(t)D^n + a_1(t)D^{n-1} + \cdots + a_n(t),$$

where $D$ denotes $d/dt$, $D^2$ denotes $d^2/dt^2$, and so on. By $L[x]$ we mean the function $a_0\,\dfrac{d^n x}{dt^n} + a_1\,\dfrac{d^{n-1}x}{dt^{n-1}} + \cdots + a_n x$. We say that $L$ is of order $n$ (if $a_0$ is not identically zero) and that it “acts” on $x$, or “operates” on $x$. Further, by $L_1L_2[x]$ we mean $L_1[L_2[x]]$; that is, first the operator immediately to the left of $x$ acts on $x$, then the operator to the left of that acts on the result. Two operators, say $L_1$ and $L_2$, are said to be equal if $L_1[x] = L_2[x]$ for all functions $x(t)$ (that are sufficiently differentiable for $L_1$ and $L_2$ to act on them). Finally, in general, differential operators do not commute: $L_1L_2 \ne L_2L_1$. For instance, if $L_1 = D$ and $L_2 = tD$, then $L_1L_2[x] = D(tDx) = D(tx') = tx'' + x'$, whereas $L_2L_1[x] = tD(Dx) = tDx' = tx''$. However, they do commute if their $a_j$ coefficients are constants. For instance,

$$(2D - 1)(D + 3)x = (2D - 1)(x' + 3x) = 2x'' + 6x' - x' - 3x$$

and

$$(D + 3)(2D - 1)x = (D + 3)(2x' - x) = 2x'' - x' + 6x' - 3x$$

are identical for all functions $x(t)$, so $(2D - 1)(D + 3) = (D + 3)(2D - 1)$.
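The commutation claims can be checked concretely by letting the operators act on a polynomial test function. The sketch below is illustrative only: it represents a polynomial in $t$ by its coefficient list `[c0, c1, c2, ...]`, and the helper names `D`, `tD`, `lin`, and `trim` are hypothetical. Agreement on one test function does not prove equality of operators, but disagreement on one test function does prove inequality.

```python
def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def D(p):
    """Differentiate the polynomial c0 + c1*t + c2*t**2 + ..."""
    return [k * c for k, c in enumerate(p)][1:] or [0]

def tD(p):
    """Apply the variable-coefficient operator t*D."""
    return [0] + D(p)

def lin(a, b):
    """Return the constant-coefficient operator a*D + b."""
    def op(p):
        dp = D(p)
        n = max(len(dp), len(p))
        dp = dp + [0] * (n - len(dp))
        q = list(p) + [0] * (n - len(p))
        return trim([a * d + b * c for d, c in zip(dp, q)])
    return op

if __name__ == "__main__":
    x = [0, 0, 0, 1]                    # test function x(t) = t**3
    print(trim(D(tD(x))))               # D(tD x)  = t x'' + x'
    print(trim(tD(D(x))))               # tD(D x)  = t x''
    print(lin(2, -1)(lin(1, 3)(x)))     # (2D - 1)(D + 3) x
    print(lin(1, 3)(lin(2, -1)(x)))     # (D + 3)(2D - 1) x
```

For $x = t^3$ the first two results differ by $3t^2 = x'$, in agreement with $L_1L_2[x] - L_2L_1[x] = x'$, whereas the last two coincide.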
EXAMPLE 7. To solve the system

$$x' - x - y = 3t, \tag{14a}$$
$$x' + y' - 5x - 2y = 5, \tag{14b}$$

it is convenient to begin by re-expressing it as

$$(D - 1)x - y = 3t, \tag{15a}$$
$$(D - 5)x + (D - 2)y = 5, \tag{15b}$$

or

$$L_1[x] + L_2[y] = 3t, \tag{16a}$$
$$L_3[x] + L_4[y] = 5, \tag{16b}$$

where $L_1 = D - 1$, $L_2 = -1$, and so on. To solve by the method of elimination, let us operate on (16a) with $L_3$ and on (16b) with $L_1$, giving

$$L_3L_1[x] + L_3L_2[y] = L_3[3t], \tag{17a}$$
$$L_1L_3[x] + L_1L_4[y] = L_1[5], \tag{17b}$$

where we have used the linearity of $L_3$ in writing $L_3[L_1[x] + L_2[y]]$ as $L_3L_1[x] + L_3L_2[y]$ in obtaining (17a) and, similarly, the linearity of $L_1$ in obtaining (17b). Subtracting one equation from the other, and cancelling the $x$ terms because $L_3L_1 = L_1L_3$, enables us to eliminate $x$ and to obtain the equation

$$(L_1L_4 - L_3L_2)[y] = L_1[5] - L_3[3t] \tag{18}$$

on $y$ alone. At this point we can return to the non-operator form, with $L_1L_4 - L_3L_2 = (D - 1)(D - 2) - (D - 5)(-1) = D^2 - 2D - 3$ and $L_1[5] - L_3[3t] = (D - 1)(5) - (D - 5)(3t) = 15t - 8$. Thus,

$$y'' - 2y' - 3y = 15t - 8, \tag{19}$$

which admits the general solution

$$y(t) = Ae^{3t} + Be^{-t} - 5t + 6. \tag{20}$$
To find $x(t)$, we can proceed in the same manner. This time, operate on (16a) with $L_4$ and on (16b) with $L_2$:

$$L_4L_1[x] + L_4L_2[y] = L_4[3t], \tag{21a}$$
$$L_2L_3[x] + L_2L_4[y] = L_2[5], \tag{21b}$$

and subtraction gives

$$(L_4L_1 - L_2L_3)[x] = L_4[3t] - L_2[5], \tag{22}$$

or

$$x'' - 2x' - 3x = 8 - 6t, \tag{23}$$

with general solution

$$x(t) = Ce^{3t} + Ee^{-t} + 2t - 4. \tag{24}$$

(We avoid using $D$ as an integration constant because $D = d/dt$ here.)
It might appear that $A$, $B$, $C$, $E$ are all arbitrary, but don't forget that $x$ and $y$ are related through (14), so these constants might be related as well. In fact, putting (20) and (24) into (14a) gives, after cancellation of terms,

$$(2C - A)e^{3t} - (2E + B)e^{-t} = 0, \tag{25}$$

and the linear independence of $e^{3t}$ and $e^{-t}$ requires that $A = 2C$ and $B = -2E$. Putting (20) and (24) into (14b) gives this same result.
Thus, the general solution of (14) is

$$x(t) = Ce^{3t} + Ee^{-t} + 2t - 4, \tag{26a}$$
$$y(t) = 2Ce^{3t} - 2Ee^{-t} - 5t + 6. \tag{26b}$$

COMMENT 1. With hindsight, it would have been easier to eliminate $y$ first and solve for $x$ since we could have put that $x$ [namely, as given by (26a)] into (14a) and solved that equation for $y$. That step would have produced (26b) directly.
COMMENT 2. Notice that (14) is not of the “standard” form (10) because (14b) has both $x'$ and $y'$ in it. While we need it to be in that form to apply Theorem 3.9.1, we do not need the system to be of that form to apply the method of elimination. ■
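A result such as (26) is easy to check by substituting it back into (14a) and (14b). The short sketch below does this numerically; it is only an illustration, the function names `solution` and `residuals` are hypothetical, and the derivatives are entered by hand from (26) rather than computed symbolically.

```python
import math

def solution(t, C, E):
    """Return x, x', y, y' from the general solution (26) for constants C, E."""
    e3, em = math.exp(3 * t), math.exp(-t)
    x = C * e3 + E * em + 2 * t - 4
    dx = 3 * C * e3 - E * em + 2
    y = 2 * C * e3 - 2 * E * em - 5 * t + 6
    dy = 6 * C * e3 + 2 * E * em - 5
    return x, dx, y, dy

def residuals(t, C, E):
    """Left side minus right side of (14a) and of (14b)."""
    x, dx, y, dy = solution(t, C, E)
    r_a = dx - x - y - 3 * t              # (14a): x' - x - y = 3t
    r_b = dx + dy - 5 * x - 2 * y - 5     # (14b): x' + y' - 5x - 2y = 5
    return r_a, r_b

if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0):
        print(residuals(t, C=1.5, E=-2.0))  # both entries should be near zero
```

Both residuals vanish (to rounding error) for any choice of $C$ and $E$, which is consistent with there being two independent arbitrary constants.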
A review of the steps in the elimination process reveals that the operators $L_1, \dots, L_4$ might just as well have been constants, by the way we have manipulated them. In fact, a useful way to organize the procedure is to use Cramer's rule (Section 10.6). For instance, if we have two differential equations

$$L_1[x] + L_2[y] = f_1(t), \tag{27a}$$
$$L_3[x] + L_4[y] = f_2(t), \tag{27b}$$

we can, heuristically, use Cramer's rule to write

$$x = \frac{\begin{vmatrix} f_1 & L_2 \\ f_2 & L_4 \end{vmatrix}}{\begin{vmatrix} L_1 & L_2 \\ L_3 & L_4 \end{vmatrix}} = \frac{L_4[f_1] - L_2[f_2]}{L_1L_4 - L_2L_3}, \tag{28a}$$

$$y = \frac{\begin{vmatrix} L_1 & f_1 \\ L_3 & f_2 \end{vmatrix}}{\begin{vmatrix} L_1 & L_2 \\ L_3 & L_4 \end{vmatrix}} = \frac{L_1[f_2] - L_3[f_1]}{L_1L_4 - L_2L_3}. \tag{28b}$$

Of course, the division by an operator on the right-hand sides of (28a,b) is not defined, so we need to put the $L_1L_4 - L_2L_3$ back up on the left-hand side, where it came from. That step gives

$$(L_1L_4 - L_2L_3)[x] = L_4[f_1] - L_2[f_2], \tag{29a}$$
$$(L_1L_4 - L_2L_3)[y] = L_1[f_2] - L_3[f_1], \tag{29b}$$

which equations correspond to (22) and (18), respectively, in Example 7. Again, this approach is heuristic, but it does give the correct result and is readily applied, and extended to systems of three equations in three unknowns, four in four unknowns, and so on.
What might possibly go wrong with our foregoing solution of (27)? In the application of Cramer's rule to linear algebraic equations, the case where the determinant in the denominator vanishes is singular, and either there are no solutions (the system is “inconsistent”) or there is an infinite number of them (the system is “redundant”). Likewise, the system (27) is singular if $L_1L_4 - L_2L_3$ is zero and is either inconsistent (with no solution) or redundant (with infinitely many linearly independent solutions). For instance, the system

$$Dx + 2Dy = 1, \tag{30a}$$
$$2Dx + 4Dy = 3 \tag{30b}$$

has $L_1L_4 - L_2L_3 = 4D^2 - 4D^2 = 0$ and has no solution since the left-hand sides are in the ratio 1:2, whereas the right-hand sides are in the ratio 1:3. However, if we change the 3 to a 2, then the new system still has $L_1L_4 - L_2L_3 = 0$ but is now consistent. Indeed, then the second equation is merely twice the first and can be discarded, leaving the single equation $Dx + 2Dy = 1$ in the two unknowns $x(t)$ and $y(t)$. We can choose one of these arbitrarily and use $Dx + 2Dy = 1$ to solve for the other, so there are infinitely many linearly independent solutions. Understand that the singular nature of (30), and the modified system, is intrinsic to those systems and is not a fault of the method of elimination.
In the generic case, however, $L_1L_4 - L_2L_3 \ne 0$ and we can solve (29a) and (29b) for $x(t)$ and $y(t)$, respectively. It can be shown* that the number of independent arbitrary integration constants is the same as the degree of the determinantal polynomial $L_1L_4 - L_2L_3$. In Example 7, for instance, $L_1L_4 - L_2L_3 = D^2 - 2D - 3$ is of second degree, so we could have known in advance that there would be two independent arbitrary constants.
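For constant-coefficient operators, forming $L_1L_4 - L_2L_3$ is ordinary polynomial arithmetic in $D$, so both the degree count and the singular case can be checked mechanically. The following sketch is an illustration under that assumption; operators are stored as coefficient lists in ascending powers of $D$, and the names `poly_mul`, `poly_sub`, and `determinantal_polynomial` are hypothetical.

```python
def poly_mul(p, q):
    """Multiply two polynomials in D given as ascending coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_sub(p, q):
    """Subtract q from p and drop trailing zero coefficients."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    out = [a - b for a, b in zip(p, q)]
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out

def determinantal_polynomial(L1, L2, L3, L4):
    """Return L1*L4 - L2*L3 for constant-coefficient operators."""
    return poly_sub(poly_mul(L1, L4), poly_mul(L2, L3))

if __name__ == "__main__":
    # Example 7: L1 = D - 1, L2 = -1, L3 = D - 5, L4 = D - 2
    print(determinantal_polynomial([-1, 1], [-1], [-5, 1], [-2, 1]))
    # System (30): L1 = D, L2 = 2D, L3 = 2D, L4 = 4D
    print(determinantal_polynomial([0, 1], [0, 2], [0, 2], [0, 4]))
```

The first call returns the coefficients of $D^2 - 2D - 3$ (degree two, hence two arbitrary constants), and the second returns the zero polynomial, flagging (30) as singular.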
*See pages 144-150 in the classic treatise by E. L. Ince, Ordinary Differential Equations (New York: Dover, 1956).

EXAMPLE 8. Mass-Spring System in Fig. 4. Let us study the two-mass system shown in Fig. 4, and let $m_1 = m_2 = k_1 = k_{12} = k_2 = 1$ and $F_1(t) = F_2(t) = 0$, for definiteness. Then equations (8) become

$$(D^2 + 2)\,x_1 - x_2 = 0, \tag{31a}$$
$$-x_1 + (D^2 + 2)\,x_2 = 0. \tag{31b}$$

With $L_1 = L_4 = D^2 + 2$ and $L_2 = L_3 = -1$, and $f_1(t) = f_2(t) = 0$, (29a,b) become

$$(D^4 + 4D^2 + 3)\,x_1 = 0, \tag{32a}$$
$$(D^4 + 4D^2 + 3)\,x_2 = 0, \tag{32b}$$

so (Exercise 2)

$$x_1(t) = A\cos t + B\sin t + C\cos\sqrt{3}\,t + E\sin\sqrt{3}\,t, \tag{33a}$$
$$x_2(t) = F\cos t + G\sin t + H\cos\sqrt{3}\,t + I\sin\sqrt{3}\,t. \tag{33b}$$

To determine any relationships among the constants $A, B, \dots, I$, we put (33) into (31a) [or (31b), the result would be the same] and find that

$$(A - F)\cos t + (B - G)\sin t - (C + H)\cos\sqrt{3}\,t - (E + I)\sin\sqrt{3}\,t = 0,$$

from which we learn that $F = A$, $G = B$, $H = -C$, and $I = -E$, so the general solution of (31) is

$$x_1(t) = A\cos t + B\sin t + C\cos\sqrt{3}\,t + E\sin\sqrt{3}\,t, \tag{34a}$$
$$x_2(t) = A\cos t + B\sin t - C\cos\sqrt{3}\,t - E\sin\sqrt{3}\,t. \tag{34b}$$

The determinantal polynomial was of fourth degree and, as asserted above, there are four independent arbitrary integration constants. There are important things to say about the result expressed in (34):
COMMENT 1. It will be more illuminating to re-express (34) in the form

$$x_1(t) = G\sin(t + \phi) + H\sin(\sqrt{3}\,t + \psi), \tag{35a}$$
$$x_2(t) = G\sin(t + \phi) - H\sin(\sqrt{3}\,t + \psi), \tag{35b}$$

where the four constants $G$, $H$, $\phi$, $\psi$ are determined from the initial conditions $x_1(0)$, $x_1'(0)$, $x_2(0)$, and $x_2'(0)$. While neither $x_1(t)$ nor $x_2(t)$ is a pure sinusoid, each is a superposition of two pure sinusoids, the frequencies of which are characteristics of the system (i.e., independent of the initial conditions). Those frequencies, $\omega = 1$ rad/sec and $\omega = \sqrt{3}$ rad/sec, are the natural frequencies of the system. If the initial conditions are such that $H = 0$, then the motion is of the form

$$x_1(t) = G\sin(t + \phi), \qquad x_2(t) = G\sin(t + \phi); \tag{36}$$

that is, the two masses swing in unison at the lower frequency $\omega = 1$. Such a motion is called a low mode motion because it is at the lower of the two natural frequencies. If instead the initial conditions are such that $G = 0$, then

$$x_1(t) = H\sin(\sqrt{3}\,t + \psi), \qquad x_2(t) = -H\sin(\sqrt{3}\,t + \psi); \tag{37}$$

the masses swing in opposition, at the higher frequency $\omega = \sqrt{3}$ rad/sec, so the latter is called a high mode motion. For instance, the initial conditions $x_1(0) = x_2(0) = 1$ and $x_1'(0) = x_2'(0) = 0$ give (Exercise 7) the purely low mode motion

$$x_1(t) = \sin(t + \pi/2) = \cos t, \qquad x_2(t) = \sin(t + \pi/2) = \cos t, \tag{38}$$

and the conditions $x_1(0) = 1$, $x_2(0) = -1$, and $x_1'(0) = x_2'(0) = 0$ give the purely high mode motion

$$x_1(t) = \sin(\sqrt{3}\,t + \pi/2) = \cos\sqrt{3}\,t, \qquad x_2(t) = -\sin(\sqrt{3}\,t + \pi/2) = -\cos\sqrt{3}\,t. \tag{39}$$

If, instead, $x_1(0) = 1$ and $x_2(0) = x_1'(0) = x_2'(0) = 0$, say, then both $G$ and $H$ will be nonzero and the motion will be a linear combination of the low and high modes.
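The natural frequencies and the mode structure in (35) can be confirmed with a few lines of arithmetic. The sketch below is illustrative only: it treats $r^4 + 4r^2 + 3 = 0$ as a quadratic in $s = r^2$, and then substitutes (35) into (31a,b) using second derivatives written out by hand. The names `natural_frequencies` and `mode_residuals` are hypothetical.

```python
import math

def natural_frequencies():
    """Solve r**4 + 4*r**2 + 3 = 0 as a quadratic in s = r**2; return omega values."""
    disc = math.sqrt(4 * 4 - 4 * 1 * 3)
    s_roots = [(-4 + disc) / 2, (-4 - disc) / 2]   # s = -1 and s = -3
    return sorted(math.sqrt(-s) for s in s_roots)   # r = +/- i*omega

def mode_residuals(t, G, H, phi, psi):
    """Substitute (35a,b) into (31a) and (31b) and return the two residuals."""
    low = G * math.sin(t + phi)                      # unison (low mode) part
    high = H * math.sin(math.sqrt(3.0) * t + psi)    # opposition (high mode) part
    x1, x2 = low + high, low - high
    ddx1 = -low - 3 * high                           # second derivative of x1
    ddx2 = -low + 3 * high                           # second derivative of x2
    return ddx1 + 2 * x1 - x2, -x1 + ddx2 + 2 * x2

if __name__ == "__main__":
    print(natural_frequencies())                 # omega = 1 and sqrt(3)
    print(mode_residuals(0.7, 1.2, -0.4, 0.3, 1.1))
```

The residuals vanish for every choice of $G$, $H$, $\phi$, $\psi$, since in (31a) the low-mode terms cancel as $(-1 + 2 - 1)$ and the high-mode terms as $(-3 + 2 + 1)$.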
COMMENT 2. Why is the frequency corresponding to the masses swinging in opposition higher than that corresponding to the masses swinging in unison? Remember from the single-mass case studied in Section 3.5 that the natural frequency in that case is $\sqrt{k/m}$; that is, the stiffer the system (the larger the value of $k$), the higher the frequency. For the two-mass system, observe that in the low mode the middle spring is completely inactive, whereas in the high mode it is being stretched and compressed. Thus, there is more stiffness encountered in the high mode, so the high mode frequency is higher.
COMMENT 3. Just as the free vibration of a single mass is governed by one differential equation, $mx'' + kx = 0$, and has a single mode of vibration with natural frequency $\omega = \sqrt{k/m}$, a two-mass system is governed by two differential equations and its general vibration is a linear combination of two modes (unison and opposition in this example), each with its own natural frequency. Similarly, the free vibration of an $n$-mass system will be governed by $n$ differential equations, and its general vibration will be a linear combination of $n$ distinct modes, each with its own pattern and natural frequency. In the limit, we can think of a continuous system, such as a beam, as an infinite-mass system, an infinite number of tiny masses connected together. In that limiting case, in place of an infinite number of ordinary differential equations we obtain a partial differential equation on the deflection $y(x, t)$, solution of which yields the general solution as a linear combination of an infinite number of discrete modes of vibration. In applications it is important to know the natural frequencies of a given system because if it is driven by a harmonic forcing function, then it will have a large, perhaps catastrophic, response if the driving frequency is close to one of the natural frequencies.
COMMENT 4. Finally, we note that molecules and atoms can be modeled as mass-spring systems, and the spectrum of the natural frequencies is of great importance in determining their allowable energy levels. ■
We will have more to say about the foregoing example later, when we study
matrix theory and the eigenvalue problem.
Observe that once a system of linear constant coefficient equations is converted by the process of elimination to a set of uncoupled equations such as (32a,b), the homogeneous solutions of those equations can be sought in the usual exponential form. In fact, one can do that even at the outset, without first going through the process of elimination. For instance, to solve (31a,b) one can start out by seeking a solution in the form $x_1(t) = \xi_1 e^{rt}$ and $x_2(t) = \xi_2 e^{rt}$. Putting those forms into (31a,b) gives what is known as an eigenvalue problem on the unknown constants $\xi_1$, $\xi_2$ and $r$. That discussion is best reserved for the chapters on matrix theory and linear algebra, as an important application of the eigenvalue problem, so we will not pursue it in the present section.
Closure. Systems of ordinary differential equations arise in the modeling of physical systems that involve more than one dependent variable. For instance, in modeling an ecological system such as the fish populations in a given lake, the dependent
variables might be the populations of each fish species, as a function of the independent variable t. Surely these populations are interrelated (for instance, one species
might be the primary food supply for another species), so the governing differential
equations will be coupled. It is precisely the coupling that produces the interest in
this section because if they are not coupled then we can solve them, individually,
by the methods developed in preceding sections.
Our first step was to give the basic existence and uniqueness theorem. That
theorem guaranteed both existence and uniqueness, under rather mild conditions
of continuity, over an interval that is known in advance. The theorem applied to
first-order systems, but we showed that systems of higher order can be converted to
first-order systems by suitable introduction of auxiliary dependent variables.
Then we outlined a method of elimination for systems with constant coefficients. Elimination is similar to the steps in the solution of linear algebraic equations by Gauss elimination, where the coefficients of the unknowns are operators
rather than numbers. The correct result can even be obtained by using Cramer’s
rule, provided that the determinantal operator in the denominator does not vanish,
and provided that we move that operator back “upstairs” —as we did in converting
(28) to (29). If the operator does vanish, then the problem is singular and there will
be no solution or infinitely many linearly independent solutions.
In subsequent chapters on matrix theory we shall return to systems of linear differential equations with constant coefficients and develop additional solution
techniques that are based upon the so-called eigenvalue problem.
Computer software. Often, one can solve systems of differential equations using computer-algebra systems. For instance, to find the general solution of the system

$$(D + 1)x + 2y = 0,$$
$$3x + (D + 2)y = 0$$

using Maple, enter

dsolve({diff(x(t),t) + x(t) + 2*y(t) = 0, 3*x(t) + diff(y(t),t) + 2*y(t) = 0}, {x(t), y(t)});

and return. The result is the general solution

{y(t) = _C2 exp(t) + 3/2 _C1 exp(-4t), x(t) = -_C2 exp(t) + _C1 exp(-4t)}

If we wish to include initial conditions $x(0) = 3$, $y(0) = 2$, use instead the command

dsolve({diff(x(t),t) + x(t) + 2*y(t) = 0, 3*x(t) + diff(y(t),t) + 2*y(t) = 0, x(0) = 3, y(0) = 2}, {x(t), y(t)});

The result is the particular solution

{y(t) = -exp(t) + 3 exp(-4t), x(t) = exp(t) + 2 exp(-4t)}

Alternatively, one can first define the two equations and then call them in the dsolve command. That is, enter

deq1 := diff(x(t),t) + x(t) + 2*y(t) = 0:
deq2 := 3*x(t) + diff(y(t),t) + 2*y(t) = 0:

The colon at the end of each line indicates that it is a definition, not a command. Commands are followed by semicolons. Now, enter the dsolve command:

dsolve({deq1, deq2, x(0) = 3, y(0) = 2}, {x(t), y(t)});

and return. The result is

{y(t) = -exp(t) + 3 exp(-4t), x(t) = exp(t) + 2 exp(-4t)}
EXERCISES
3.9
1. Derive the solution (20) of (19).
2. Derive the solutions (33a,b) of (32a,b).
3. Derive the system of differential equations governing the displacements $x_j(t)$, using the assumption that $x_1 > x_2 > x_3 > 0$. Repeat the derivation assuming instead that $x_3 > x_2 > x_1 > 0$ and again, assuming that $x_1 > x_3 > x_2 > 0$, and show that the resulting equations are the same, independent of these different assumptions.
4. (a),(b),(c) Derive the system of differential equations governing the currents $i_j(t)$, but you need not solve them. State any physical laws that you use.
170
Chapter 3. Linear Differential Equations of Second Order and Higher
(n)
(b)
(a)
a” —a+3y=0
y +a+y=4
(0) 2! +a+y = 24
y! + 38a~ y = -8
(p)
al!
_L
3a"
yl"
~y"
=
—
y
+6
(q) (2D? + 3)z + (2D + 1)y = 4e**—7
Dz+(D—2)y
=2
6. (a)-(q) Find the general solution of the corresponding
problem in Exercise 5 using computer software. Separately,
make up any set of initial conditions, and use the computer to
find the particular solution corresponding to those initial conditions.
Ok b
5. Obtain the general solution by the method of elimination
either step-by-step or using the Cramer’s rule shortcut.
(a) (D—1)z+Dy=0
(D+l1)x+(2D+2)y=0
(b) (D—l1)a+2Dy=0
7. (Mass-spring system of Examples 3 and 8) (a) Derive the
particular
solutions (38) and (39) from the general solution
(35) by applying the given sets of initial conditions.
(b) Evaluate G,H,¢,y
for the initial conditions x,(0)
1,a2(0} = 24(0) = x74(0)= 0, and show thatboth modes
’
are present in the solution. Obtain a computer plot of x, (t)
and a(t), over 0 < t < 20 (so as to show several cycles).
(c) Sameas (b),for 2(0) = 1,21(0) = x2(0) = 4(0) = 0.
(d) Same as (b), for 21(0) = x2(0) = 0,24 (0) = 2,25(0)
3.
(e) Same as (b), for x, (0)
(D+ 1)x+4Dy =0
(c) Dz +(D~l)y=5
x3(0) = -1.
(d) a t+y=ytt
x(t) andy(t), reactto form a third substance,with concentration 2(¢). The reactionis governedby the system2’ + ax =
™D+l)a+(D+l)y=0
xv—3y' = -27 +2
(e) 2’ =sint—y
yi =-9r+4
(f) 2’ =x ~8y
yo/ =r-n-~y—3t
(g) vw=22+6y—-t4+7
y’ = 22—2y
(h) Qe +y +at+y=P-1
ety
t+art+y=0
(i) 2 +y'+a2-y=e
av’ +2y +2 —-2%=1-t
(j) av —3x +y =4sin 2¢
32 +y' —~y=6
(kK)a” =a~dy
y=
20 y
Q) a” =a —2y
y" = 2a —4y
(m) «” —-x2+2y=0
Wa+y"+dy=1-t?
8. (Chemical kinetics) Two substances, with concentrations
0,2’ = Gy andx +y+2
=~, where a, G,7¥ are known pos-
itive constants.Solve for a(t), y(t), z(t), subjectto theinitial
conditionsz(0) = 2’(0) =0 for thesecases:
(a)a #B
(b) a
( HINT: Apply l’H6pital’s
rule to your answer to
part (a).
9. (Motion of a charged mass) Consider a particle of mass $m$, carrying an electrical charge $q$, and moving in a uniform magnetic field of strength $B$. The field is in the positive $z$ direction. The equations of motion of the particle are

$$mx'' = qBy', \qquad my'' = -qBx', \qquad mz'' = 0, \tag{9.1}$$

where $x(t)$, $y(t)$, $z(t)$ are the $x$, $y$, $z$ displacements as a function of the time $t$.
(a) Find the general solution of (9.1) for $x(t)$, $y(t)$, $z(t)$. How many independent arbitrary constants of integration are there?
(b) Show that by a suitable choice of initial conditions the mo-
(a) 2’ ~x2@+y=t
and centeredat any desired point xo, yo. Propose such a set of
initial conditions.
()
(D—-lA)a+y=t
(D2 —l)e+(D+1)y=t+1
(c)
(D+
(c) Besides
a circular
motion in a constant z plane, are any
othertypesof motion possible? Explain.
10. Show that the given system is singular (i.e., either inconsistent or redundant). If it has no solutions show that; if it has
solutions find them.
Le
— Dy = ot
(D? — 1a —(D? — D)y =0
(q) (D
D +1)x+ Dy = et
- D)y = 3t
(D? —1)z+(D?
a + (
Chapter 3 Review
A differential equation is far more tractable, insofar as analytical solution is concerned, if it is linear than if it is nonlinear. We could see a hint of that even in
Chapter 2, where we were able to derive the general solution of the general first-order linear equation $y' + p(x)y = q(x)$ but had success with nonlinear equations
only in special cases. In fact, for linear equations of any order (first, second, or
higher) a number of important results follow.
The most important is that for an $n$th-order linear differential equation $L[y] = f(x)$, with constant coefficients or not, a general solution is expressible as the sum of a general solution $y_h(x)$ to the homogeneous equation $L[y] = 0$, and any particular solution $y_p(x)$ to the full equation $L[y] = f$:

$$y(x) = y_h(x) + y_p(x).$$

In turn, $y_h(x)$ is expressible as an arbitrary linear combination of any $n$ LI (linearly independent) solutions of $L[y] = 0$:

$$y_h(x) = C_1 y_1(x) + \cdots + C_n y_n(x).$$
Thus, linear independence is introduced early, in Section 3.2, and theorems are
provided for testing a given set of functions to see if they are LI or not. We then
show how to find the solutions $y_1(x), \dots, y_n(x)$ for the following two extremely important cases: for constant-coefficient equations and for Cauchy-Euler equations.
For constant-coefficient equations the idea is to seek $y_h(x)$ in the exponential form $e^{\lambda x}$. Putting that form into $L[y] = 0$ gives an $n$th-degree polynomial equation on $\lambda$, called the characteristic equation. Each nonrepeated root $\lambda_j$ contributes a solution $e^{\lambda_j x}$, and each repeated root $\lambda_j$ of order $k$ contributes $k$ solutions $e^{\lambda_j x}, xe^{\lambda_j x}, \dots, x^{k-1}e^{\lambda_j x}$.
For Cauchy-Euler equations the form $e^{\lambda x}$ does not work. Rather, the idea is to seek $y_h(x)$ in the power form $x^{\lambda}$. Each nonrepeated root $\lambda_j$ contributes a solution $x^{\lambda_j}$, and each repeated root $\lambda_j$ of order $k$ contributes $k$ solutions $x^{\lambda_j}, (\ln x)\,x^{\lambda_j}, \dots, (\ln x)^{k-1}x^{\lambda_j}$.
Two different methods are put forward for finding particular solutions, the method of undetermined coefficients and Lagrange's method of variation of parameters. Undetermined coefficients is easier to apply but is subject to the conditions that
(i) besides being linear, $L$ must be of constant-coefficient type, and
(ii) repeated differentiation of each term in $f$ must produce only a finite number of LI terms.
Variation of parameters, on the other hand, merely requires $L$ to be linear. According to the method, we vary the parameters (i.e., the constants of integration in $y_h$) $C_1, \dots, C_n$, and seek $y_p(x) = C_1(x)y_1(x) + \cdots + C_n(x)y_n(x)$. Putting that form into the given differential equation gives one condition on the $C_j(x)$'s. That condition is augmented by $n - 1$ additional conditions that are designed to preclude the presence of derivatives of the $C_j(x)$'s that are of order higher than first.
In Section 3.8 we study the harmonic oscillator, both damped and undamped,
both free and driven. Of special interest are the concepts of natural frequency for
the undamped case, critical damping, amplitude- and frequency-response curves,
resonance, and beats. This application is of great importance in engineering and
science and should be understood thoroughly.
Finally, Section 3.9 is devoted to systems of linear differential equations. We
give an existence/uniqueness theorem and show how to solve systems by elimina-
tion.
Chapter 4
Power Series Solutions
PREREQUISITES: This chapter presumes a familiarity with the complex plane and
the algebra of complex numbers, material which is covered in Section 21.2.
4.1
Introduction
In Chapter 2 we presented a number of methods for obtaining analytical closed
form solutions of first-order differential equations, some of which methods could
be applied even to nonlinear equations. In Chapter 3 we studied equations of second
order and higher, and found them to be more difficult. Restricting our discussion to
linear equations, even then we were successful in developing solutions only for the
(important) cases of equations with constant coefficients and Cauchy-Euler equations. We also found that we can solve nonconstant-coefficient equations if we can
factor the differential operator, but such factorization can be accomplished only in
exceptional cases.
In Chapter 4 we continue to restrict our discussion to linear equations, but we
now study nonconstant-coefficient equations. That case is so much more difficult
than the constant-coefficient case that we do two things: we consider only second-order equations, and we give up on the prospect of finding solutions in closed form
and seek solutions in the form of infinite series.
To illustrate the idea that is developed in the subsequent sections, consider the simple example

$$\frac{dy}{dx} + y = 0. \tag{1}$$

To solve by the series method, we seek a solution in the form of a power series expansion about any desired point $x = x_0$, $y(x) = \sum_{n=0}^{\infty} a_n (x - x_0)^n$, where the $a_n$ coefficients are to be determined so that the assumed form satisfies the given differential equation (1), if possible. If we choose $x_0 = 0$ for simplicity, then

$$y(x) = \sum_{n=0}^{\infty} a_n x^n = a_0 + a_1 x + a_2 x^2 + \cdots \tag{2a}$$

and

$$\frac{dy}{dx} = \frac{d}{dx}\left(a_0 + a_1 x + a_2 x^2 + \cdots\right) = a_1 + 2a_2 x + 3a_3 x^2 + \cdots. \tag{2b}$$

Putting (2a,b) into (1) gives

$$\left(a_1 + 2a_2 x + 3a_3 x^2 + \cdots\right) + \left(a_0 + a_1 x + a_2 x^2 + \cdots\right) = 0, \tag{3}$$

or, rearranging terms,

$$(a_1 + a_0) + (2a_2 + a_1)\,x + (3a_3 + a_2)\,x^2 + \cdots = 0. \tag{4}$$
If we realize that the right side of (4) is really $0 + 0x + 0x^2 + \cdots$, then, by equating coefficients of like powers of $x$ on both sides of (4), we obtain $a_1 + a_0 = 0$, $2a_2 + a_1 = 0$, $3a_3 + a_2 = 0$, and so on. Thus,

$$a_1 = -a_0,$$
$$a_2 = -a_1/2 = -(-a_0)/2 = a_0/2, \tag{5}$$
$$a_3 = -a_2/3 = -(a_0/2)/3 = -a_0/6,$$

and so on, where $a_0$ remains arbitrary. Thus, we have

$$y(x) = a_0\left(1 - x + \frac{1}{2}x^2 - \frac{1}{6}x^3 + \cdots\right) \tag{6}$$

as the general solution to (1). Here, $a_0$ is the constant of integration; we could rename it $C$, for example, if we wish. Thus, we have the solution, not in closed form but as a power series. In this simple example we are able to “sum the series” into closed form, that is, to identify it as the Taylor series of $e^{-x}$, so that our general solution is really $y(x) = Ce^{-x}$. However, for nonconstant-coefficient differential equations we are generally not so fortunate, and must leave the solution in series form.
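The pattern in (5) is the recurrence $(n + 1)\,a_{n+1} + a_n = 0$, which is easy to iterate mechanically. The sketch below is an illustration only, with hypothetical function names `series_coefficients` and `partial_sum`; it generates the coefficients from that recurrence and compares a partial sum of (6) with the closed form $e^{-x}$.

```python
import math

def series_coefficients(n_terms, a0=1.0):
    """Generate a_0, ..., a_{n_terms-1} from (n + 1) a_{n+1} + a_n = 0."""
    coeffs = [a0]
    for n in range(n_terms - 1):
        coeffs.append(-coeffs[n] / (n + 1))
    return coeffs

def partial_sum(coeffs, x):
    """Evaluate sum of a_n * x**n by Horner's rule."""
    total = 0.0
    for a in reversed(coeffs):
        total = total * x + a
    return total

if __name__ == "__main__":
    a = series_coefficients(20)
    print(a[:4])                                  # leading coefficients of (6) with a0 = 1
    print(partial_sum(a, 0.5), math.exp(-0.5))    # the two values should agree closely
```

Such a numerical agreement illustrates the result but does not settle the questions of rigor raised next; those concern whether the termwise operations are legitimate in the first place.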
As simple as the above steps appear, there are several questions that need to be
addressed before we can have confidence in the result given by (6):
(i) In (2b) we differentiated an infinite series term by term. That is, we interchanged the order of the differentiation and the summation and wrote
$$\frac{d}{dx}\sum a_n x^n = \sum \frac{d}{dx}\left(a_n x^n\right). \tag{7}$$
That step looks reasonable, but observe that it amounts to an interchange in
the order of two operations, the summation and the differentiation, and it is
possible that reversing their order might give different results. For instance,
do we get the same results if we put toothpaste on our toothbrush and then
brush, or if we brush and then put toothpaste on the brush?
(ii) Re-expressing (3) in the form of (4) is based on a supposition that we can
add series term by term:
$$\sum A_n + \sum B_n = \sum (A_n + B_n). \tag{8}$$
Again, that step looks reasonable, but is it necessarily correct?
(iii) Finally, inferring (5) from (4) is based on a supposition that if
$$\sum A_n x^n = \sum B_n x^n \tag{9}$$
for all $x$ in some interval of interest, then it must be true that $A_n = B_n$ for
each $n$. Though reasonable, does it really follow that for the sums to be the
same the corresponding individual terms need to be the same?
Thus, there are some technical questions that we need to address, and we do
that in the next section. Our approach, in deriving (6), was heuristic, not rigorous,
since we did not attend to the issues mentioned above. We can sidestep the several questions of rigor that arose in deriving the series (6) if, instead, we verify, a
posteriori, that (6) does satisfy the given differential equation (1). However, that
procedure begs exactly the same questions: termwise differentiation of the series,
termwise addition of series, and equating the coefficients of like powers of x on
both sides of the equation.
Here is a brief outline of this chapter:
4.2 Power Series Solutions. In Section 4.2, we review infinite series, power
series, and Taylor series, then we show how to find solutions to the equation $y'' + p(x)y' + q(x)y = 0$
in the form of a power series about a chosen point $x_0$ if $p(x)$
and $q(x)$ are sufficiently well-behaved at $x_0$.
4.3 The Method of Frobenius. If $p(x)$ and $q(x)$ are not sufficiently well-behaved
at $x_0$, then the singular behavior of $p$ and/or $q$ gets passed on in some
form to the solutions of the differential equation; hence those solutions cannot be
found in power series form. Yet, if $p(x)$ and $q(x)$ are not too singular at $x_0$, then solutions can still be found, but in a more general form, a so-called Frobenius series.
Section 4.3 puts forward the theoretical base for such solutions and the procedure
whereby to obtain them.
4.4 Legendre Functions. This section focuses on a specific important example,
the Legendre equation $(1 - x^2)y'' - 2xy' + \lambda y = 0$, where $\lambda$ is a constant.
4.5 Singular Integrals; Gamma Function. Singular integrals are defined and
their convergence is discussed. An important singular integral, the gamma function,
is introduced and studied.
4.6 Bessel Functions. Besides the Legendre equation, we need to study the
extremely important Bessel equation, $x^2y'' + xy' + (x^2 - \nu^2)y = 0$, where $\nu$ is a
constant, but preparatory to that study we first need to introduce singular integrals
and the gamma function, which will be needed again in Chapter 5 in any case.
4.2 Power Series Solutions
4.2.1. Review of power series. Whereas a finite sum,
$$\sum_{k=1}^{N} a_k = a_1 + a_2 + \cdots + a_N, \tag{1}$$
is well-defined thanks to the commutative and associative laws of addition, an infinite sum, or infinite series,
$$\sum_{k=1}^{\infty} a_k = a_1 + a_2 + a_3 + \cdots, \tag{2}$$
is not. For example, is the series $\sum_{k=1}^{\infty} (-1)^{k-1} = 1 - 1 + 1 - 1 + \cdots$ equal to
$(1 - 1) + (1 - 1) + \cdots = 0 + 0 + \cdots = 0$? Is it (by grouping differently)
$1 - (1 - 1) - (1 - 1) - \cdots = 1 - 0 - 0 - \cdots = 1$? In fact, besides grouping
the numbers in different ways we could rearrange their order as well. The point,
then, is that (2) is not self-explanatory; it needs to be defined. We need to decide, or
be told, how to do the calculation. To give the traditional definition of (2), we first
define the sequence of partial sums of the series (2) as
$$s_1 = a_1, \qquad s_2 = a_1 + a_2, \qquad s_3 = a_1 + a_2 + a_3, \tag{3}$$
and so on:
$$s_n = \sum_{k=1}^{n} a_k, \tag{4}$$
where $a_k$ is called the $k$th term of the series. If the limit of the sequence $s_n$ exists,
as $n \to \infty$, and equals some number $s$, then we say that the series (2) is convergent,
and that it converges to $s$; otherwise it is divergent. That is, an infinite series is
defined as the limit (if that limit exists) of its sequence of partial sums:
$$\sum_{k=1}^{\infty} a_k \equiv \lim_{n\to\infty} \sum_{k=1}^{n} a_k = \lim_{n\to\infty} s_n = s. \tag{5}$$
That definition, known as ordinary convergence, is not the only one possible. For
instance, another definition, due to Cesaro, is discussed in the exercises. However,
ordinary convergence is the traditional definition and is the one that is understood
unless specifically stated otherwise.
Recall from the calculus that by $\lim_{n\to\infty} s_n = s$, in (5), we mean that to each
number $\epsilon > 0$, no matter how small, there exists an integer $N$ such that $|s - s_n| < \epsilon$
for all $n > N$. (Logically, the words “no matter how small” are unnecessary, but
we include them for emphasis.) In general, the smaller the chosen $\epsilon$, the larger the
$N$ that is needed, so that $N$ is a function of $\epsilon$.
The significance of the limit concept cannot be overstated, for in mathematics
it is often as limits of “old things” that we introduce “new things.” For instance,
the derivative is introduced as the limit of a difference quotient, the Riemann integral is introduced as the limit of a sequence of Riemann sums, infinite series are
introduced as limits of sequences of partial sums, and so on.
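The definition (5) and the dependence of $N$ on $\epsilon$ can be seen numerically. The sketch below is illustrative only: the helper names are hypothetical, the geometric series $\sum_{k=1}^{\infty} (1/2)^k = 1$ is chosen as an assumed test case, and `smallest_N` can only inspect the finitely many partial sums that were computed, so it suggests rather than proves convergence.

```python
def partial_sums(term, n_max):
    """Return [s_1, ..., s_n_max], where s_n = term(1) + ... + term(n)."""
    sums, total = [], 0.0
    for k in range(1, n_max + 1):
        total += term(k)
        sums.append(total)
    return sums

def smallest_N(sums, s, eps):
    """Smallest N such that |s - s_n| < eps for every computed n > N."""
    N = len(sums)
    # Walk backward from the last partial sum while the tolerance still holds.
    while N > 0 and abs(s - sums[N - 1]) < eps:
        N -= 1
    return N

geometric = partial_sums(lambda k: 0.5**k, 60)   # converges to s = 1
for eps in (1e-1, 1e-3, 1e-6):
    print(eps, smallest_N(geometric, 1.0, eps))  # N grows as eps shrinks

alternating = partial_sums(lambda k: (-1.0)**(k - 1), 8)
print(alternating)   # oscillates between 1 and 0, so no limit exists
```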
To illustrate the definition of convergence given above, consider two simple
examples. The series $1 + 1 + 1 + \cdots$ diverges because $s_n = n$ fails to approach
a limit as $n \to \infty$. However, for a series to diverge its partial sums need not grow
unboundedly. For instance, the series $1 - 1 + 1 - 1 + \cdots$, mentioned above, diverges
because its sequence of partial sums (namely, $1, 0, 1, 0, 1, \ldots$) fails to approach a
limit. Of course, determining whether a series is convergent or divergent is usually
much harder than for these examples. Ideally, one would like a theorem that gives
necessary and sufficient conditions for convergence. Here is such a theorem.
THEOREM 4.2.1 Cauchy Convergence Theorem
An infinite series is convergent if and only if its sequence of partial sums $s_n$ is a
Cauchy sequence, that is, if to each $\epsilon > 0$ (no matter how small) there corresponds
an integer $N(\epsilon)$ such that $|s_m - s_n| < \epsilon$ for all $m$ and $n$ greater than $N$.
Unfortunately, this theorem is difficult to apply, so one develops (in the calculus) an array of theorems (i.e., tests for convergence/divergence), each of which is
more specialized (and hence less powerful) than the Cauchy convergence theorem,
but easier to apply. For instance, if in Theorem 4.2.1 we set $m = n - 1$, then the
stated condition becomes: to each $\epsilon > 0$ (no matter how small) there corresponds
an integer $N(\epsilon)$ such that $|s_m - s_n| = |a_n| < \epsilon$ for all $n > N$. The latter is
equivalent to saying that $a_n \to 0$ as $n \to \infty$. Thus, we have the specialized, but
readily applied, theorem that for the series $\sum_{1}^{\infty} a_n$ to converge, it is necessary (but
not sufficient) that $a_n \to 0$ as $n \to \infty$. From this theorem it follows immediately
that the series $1 + 1 + 1 + \cdots$ and $1 - 1 + 1 - 1 + \cdots$, cited above, both diverge
because in each case the terms do not tend to zero.
Let us now focus on the specific needs of this chapter, power series, that is,
series of the form
$$\sum_{n=0}^{\infty} a_n (x - x_0)^n = a_0 + a_1(x - x_0) + a_2(x - x_0)^2 + \cdots, \tag{6}$$
where the $a_n$'s are numbers called the coefficients of the series, $x$ is a variable,
and $x_0$ is a fixed point called the center of the series. We say that the expansion is
“about the point $x_0$.” In a later chapter we study complex series, but in this chapter
we restrict all variables and constants to be real. Notice that the quantity $(x - x_0)^n$
on the left side of (6) is the indeterminate form $0^0$ when $n = 0$ and $x = x_0$; that
form must be interpreted as 1 if the leading term of the series is to be $a_0$, as desired.
The terms in (6) are now functions of $x$ rather than numbers, so that the series
may converge at some points on the $x$ axis and diverge at others. At the very least
(6) converges at $x = x_0$ since then it reduces to the single term $a_0$.
THEOREM 4.2.2 Interval of Convergence of Power Series
The power series (6) converges at $x = x_0$. If it converges at other points as well,
then those points necessarily comprise an interval $|x - x_0| < R$ centered at $x_0$ and,
possibly, one or both endpoints of that interval (Fig. 1), where $R$ can be determined
from either of the formulas
$$R = \frac{1}{\lim_{n\to\infty} \left|\dfrac{a_{n+1}}{a_n}\right|} \qquad \text{or} \qquad R = \frac{1}{\lim_{n\to\infty} \sqrt[n]{|a_n|}}, \tag{7a,b}$$
if the limits in the denominators exist and are nonzero. If the limits in (7a,b) are
zero, then (6) converges for all $x$ (i.e., for every finite $x$, no matter how large), and
we say that “$R = \infty$.” If the limits fail to exist by virtue of being infinite, then
$R = 0$ and (6) converges only at $x_0$.
Figure 1. Interval of convergence of power series.
We call $|x - x_0| < R$ the interval of convergence, and $R$ the radius of
convergence. If a power series converges to a function $f$ on some interval, we say
that it represents $f$ on that interval, and we call $f$ its sum function.
EXAMPLE 1. Consider $\sum_{0}^{\infty} n!\, x^n$, so $a_n = n!$ and $x_0 = 0$. Then (7a) is easier to apply
than (7b), and gives
$$R = 1 \Big/ \lim_{n\to\infty} \left|\frac{(n+1)!}{n!}\right| = 1 \Big/ \lim_{n\to\infty} (n + 1) = 1/\infty = 0,$$
so the series converges only at $x = x_0 = 0$. ∎
EXAMPLE 2. Consider $\sum_{0}^{\infty} (-1)^n [(x + 5)/2]^n$. Then $a_n = (-1)^n/2^n$, $x_0 = -5$,
and (7a) gives
$$R = 1 \Big/ \lim_{n\to\infty} \left|\frac{(-1)^{n+1}}{2^{n+1}} \cdot \frac{2^n}{(-1)^n}\right| = 1 \Big/ \lim_{n\to\infty} \frac{1}{2} = 1 \Big/ \left(\frac{1}{2}\right) = 2,$$
so the series converges in $|x + 5| < 2$ and diverges in $|x + 5| > 2$. For $|x + 5| = 2$ ($x = -7, -3$) the
theorem gives no information. However, we see that for $x = -7$ and $-3$ the terms do not
tend to zero as $n \to \infty$, so the series diverges for $x = -7$ and $-3$. ∎
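EXAMPLE 3. Consider $\sum_{0}^{\infty} \dfrac{(x - 1)^n}{(n + 1)^n}$. Then $a_n = (n + 1)^{-n}$, $x_0 = 1$, and (7b) gives
$$R = 1 \Big/ \lim_{n\to\infty} \sqrt[n]{(n + 1)^{-n}} = 1 \Big/ \lim_{n\to\infty} \frac{1}{n + 1} = 1/0 = \infty,$$
so the series converges for all
$x$; that is, the interval of convergence is $|x - 1| < \infty$. ∎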
EXAMPLE 4. Consider the series
$$\sum_{n=0}^{\infty} \frac{(x - 3)^{2n}}{5^n}. \tag{8}$$
This series is not of the form (6) because the powers of $x - 3$ proceed in steps of 2. However,
if we set $X = (x - 3)^2$, then we have the standard form $\sum_{0}^{\infty} a_n X^n$, with $a_n = 1/5^n$ and
$$\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = \lim_{n\to\infty} \frac{5^n}{5^{n+1}} = \frac{1}{5}.$$
Thus, $R = 5$, and the series converges in $|X| < 5$ (i.e., $|x - 3| < \sqrt{5}$), and diverges
in $|X| > 5$ (i.e., $|x - 3| > \sqrt{5}$). ∎
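The ratio formula (7a) can be probed numerically by evaluating $|a_n/a_{n+1}|$ at a single large $n$. The sketch below does this for Examples 1, 2, and 4; the helper name `ratio_estimate` and the choice $n = 50$ are assumptions made for illustration, and a finite-$n$ ratio only suggests the limit, it does not establish it.

```python
from fractions import Fraction
from math import factorial

def ratio_estimate(a, n):
    """Finite-n stand-in for (7a): |a_n / a_{n+1}|, which tends to R when the limit exists."""
    return abs(Fraction(a(n)) / Fraction(a(n + 1)))

# Example 1: a_n = n!, so |a_n / a_{n+1}| = 1/(n+1), which tends to 0 (R = 0).
ex1 = ratio_estimate(lambda n: factorial(n), 50)

# Example 2: a_n = (-1)^n / 2^n, so the ratio equals 2 for every n (R = 2).
ex2 = ratio_estimate(lambda n: Fraction((-1)**n, 2**n), 50)

# Example 4 in the variable X = (x - 3)^2: a_n = 1/5^n (R = 5 in X).
ex4 = ratio_estimate(lambda n: Fraction(1, 5**n), 50)

print(float(ex1), ex2, ex4)
```

For Example 4 the value 5 is the radius in $X$; the corresponding radius in $x$ is $\sqrt{5}$, as noted in the example.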
Recall from our introductory example, in Section 4.1, that several questions
arose regarding the manipulation of power series. The following theorem answers
those questions and, therefore, will be needed when we apply the power series
method of solution.
THEOREM 4.2.3 Manipulation of Power Series
(a) Termwise differentiation (or integration) permissible. A power series may be
differentiated (or integrated) termwise (i.e., term by term) within its interval of
convergence $I$. The series that results has the same interval of convergence $I$ and
represents the derivative (or integral) of the sum function of the original series.
(b) Termwise addition (or subtraction or multiplication) permissible. Two power
series (about the same point $x_0$) may be added (or subtracted or multiplied) termwise
within their common interval of convergence $I$. The series that results has the same
interval of convergence $I$ and represents the sum (or difference or product) of their
two sum functions.
(c) If two power series are equal, then their corresponding coefficients must be
equal. That is, for
$$\sum_{0}^{\infty} a_n (x - x_0)^n = \sum_{0}^{\infty} b_n (x - x_0)^n \tag{9}$$
to hold in some common interval of convergence, it must be true that $a_n = b_n$ for
each $n$. In particular, if
$$\sum_{0}^{\infty} a_n (x - x_0)^n = 0 \tag{10}$$
in some interval, then each $a_n$ must be zero.
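Part (a) means that if $f(x) = \sum_{0}^{\infty} a_n (x - x_0)^n$ within $I$, then
$$f'(x) = \frac{d}{dx} \sum_{0}^{\infty} a_n (x - x_0)^n = \sum_{0}^{\infty} \frac{d}{dx}\left[a_n (x - x_0)^n\right] = \sum_{1}^{\infty} n a_n (x - x_0)^{n-1} \tag{11}$$
and
$$\int_a^b f(x)\,dx = \sum_{0}^{\infty} a_n \int_a^b (x - x_0)^n\,dx = \sum_{0}^{\infty} a_n \frac{(b - x_0)^{n+1} - (a - x_0)^{n+1}}{n + 1} \tag{12}$$
within $I$, where $a$, $b$ are any two points within $I$.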
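Part (b) means that if $f(x) = \sum_{0}^{\infty} a_n (x - x_0)^n$ and $g(x) = \sum_{0}^{\infty} b_n (x - x_0)^n$
on $I$, then
$$f(x) + g(x) = \sum_{0}^{\infty} (a_n + b_n)(x - x_0)^n, \tag{13}$$
and, with $z \equiv x - x_0$ for brevity,
$$\begin{aligned}
f(x)g(x) &= \left(\sum_{0}^{\infty} a_n z^n\right)\left(\sum_{0}^{\infty} b_n z^n\right) = (a_0 + a_1 z + \cdots)(b_0 + b_1 z + \cdots) \\
&= a_0\left(b_0 + b_1 z + b_2 z^2 + \cdots\right) + a_1 z\left(b_0 + b_1 z + b_2 z^2 + \cdots\right) + a_2 z^2\left(b_0 + b_1 z + b_2 z^2 + \cdots\right) + \cdots \\
&= a_0 b_0 + (a_0 b_1 + a_1 b_0) z + \cdots = \sum_{0}^{\infty} (a_0 b_n + a_1 b_{n-1} + \cdots + a_n b_0) z^n
\end{aligned} \tag{14}$$
within $I$. The series on the right-hand side of (14) is known as the Cauchy product
of the two series. Of course, if the two convergence intervals have different radii,
then the common interval means the smaller of the two.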
In summary, we see that convergent power series can be manipulated in essentially the same way as if they were finite-degree polynomials.
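The Cauchy product in (14) can be carried out directly on truncated coefficient lists. The following sketch assumes a hypothetical helper `cauchy_product` and, as an illustrative test case not taken from the text, multiplies the Maclaurin series of $e^{z}$ and $e^{-z}$, whose product should be the series of the constant function 1.

```python
from fractions import Fraction
from math import factorial

def cauchy_product(a, b):
    """Coefficients c_n = a_0 b_n + a_1 b_{n-1} + ... + a_n b_0, as in (14).

    Only the first min(len(a), len(b)) coefficients are returned, since
    later ones would need terms that the truncated inputs do not contain.
    """
    n_terms = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(n_terms)]

N = 8
exp_pos = [Fraction(1, factorial(n)) for n in range(N)]        # series of e^z
exp_neg = [Fraction((-1)**n, factorial(n)) for n in range(N)]  # series of e^{-z}

print(cauchy_product(exp_pos, exp_neg))   # coefficients of e^z * e^{-z} = 1
```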
The last items to address, before coming to the power series method of solution
of differential equations, are Taylor series and analyticity. Recall from the calculus
that the Taylor series of a given function $f(x)$ about a chosen point $x_0$, which we
denote here as $\mathrm{TS}\, f|_{x_0}$, is defined as the infinite series
$$\mathrm{TS}\, f|_{x_0} = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \cdots = \sum_{n=0}^{\infty} \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n, \tag{15}$$
where $0! = 1$. The purpose of Taylor series is to represent the given function, so the
fundamental question is: does it? Does the Taylor series really converge to $f(x)$ on
some $x$ interval, in which case we can write, in place of (15),
$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n\,? \tag{16}$$
For that to be the case we need three conditions to be met:
(i) First, we need $f$ to have a Taylor series (15) about that point. Namely, $f$ must
be infinitely differentiable at $x_0$ so that all of the coefficients $f^{(n)}(x_0)/n!$ in
(15) exist.
(ii) Second, we need the resulting series in (15) to converge in some interval
$|x - x_0| < R$, for $R > 0$.
(iii) Third, we need the sum of the Taylor series to equal $f$ in the interval, so
that the Taylor series represents $f$ over that interval, which is, after all, our
objective.
The third condition might seem strange, for how could the Taylor series of
$f(x)$ converge, but to something other than $f(x)$? Such cases can indeed be put
forward, but they are somewhat pathological and not likely to be encountered in
applications.
If a function is represented in some nonzero interval $|x - x_0| < R$ by its
Taylor series [i.e., $\mathrm{TS}\, f|_{x_0}$ exists, and converges to $f(x)$ there], then $f$ is said to be
analytic at $x_0$. If a function is not analytic at $x_0$, then it is singular there.
Most functions encountered in applications are analytic for all $x$, or for all $x$
with the exception of one or more points called singular points of $f$. (Of course,
the points are not singular, the function is.) For instance, polynomial functions,
$\sin x$, $\cos x$, $e^{x}$, and $e^{-x}$ are analytic for all $x$. On the other hand, $f(x) = 1/(x - 1)$
is analytic for all $x$ except $x = 1$, where $f$ and all of its derivatives are undefined,
fail to exist. The function $f(x) = \tan x = \sin x/\cos x$ is analytic for all $x$ except
$x = n\pi/2$ ($n = \pm 1, \pm 3, \ldots$), where it is undefined because $\cos x$ vanishes in the
denominator. The function $f(x) = x^{4/3}$ is analytic for all $x$ except $x = 0$, for even though
$f(0)$ and $f'(0)$ exist, the subsequent derivatives $f''(0), f'''(0), \ldots$ do not (Fig. 2).
In fact, $f(x) = x^{\alpha}$ is singular at $x = 0$ for any noninteger value of $\alpha$.
Figure 2. $f(x) = x^{4/3}$ and its first two derivatives.
Observe that there is a subtle difficulty here. We know how to test a given
Taylor series for convergence since a Taylor series is a power series, and Theorem
4.2.2 on power series convergence even gives formulas for determining the radius
of convergence $R$. But how can we determine if the sum function (i.e., the function
to which the series converges) is the same as the original function $f$? We won’t be
able to answer this question until we study complex variable theory, in later chapters. However, we repeat that the cases where the Taylor series of $f$ converges, but
not to $f$, are exceptional and will not occur in the present chapter, so it will suffice
to understand analyticity at $x_0$ to correspond to the convergence of the Taylor series in some nonzero interval about $x_0$. In fact, it is also exceptional for $f$ to have
a Taylor series about a point (i.e., be infinitely differentiable at that point) and to
have that Taylor series fail to converge in some nonzero interval about $x_0$. Thus, as
a rule of thumb that will suffice until we study complex variable theory, we will test
a function for analyticity at a given point simply by seeing if it is infinitely differentiable at that point.
4.2.2. Power series solution of differential equations. We can now state the
following basic theorem.
THEOREM 4.2.4 Power series solution
If $p$ and $q$ are analytic at $x_0$, then every solution of
$$y'' + p(x)y' + q(x)y = 0 \tag{17}$$
is too, and can therefore be found in the form
$$y(x) = \sum_{0}^{\infty} a_n (x - x_0)^n. \tag{18}$$
Further, the radius of convergence of every solution (18) is at least as large as the
smaller of the radii of convergence of $\mathrm{TS}\, p|_{x_0}$ and $\mathrm{TS}\, q|_{x_0}$.
Although we will not prove this theorem, we shall explain why one can expect
it to be true. Since $p$ and $q$ are analytic at the chosen point $x_0$, they admit convergent
Taylor series about $x_0$, so that we can write (17) as
$$y'' + \left[p(x_0) + p'(x_0)(x - x_0) + \cdots\right] y' + \left[q(x_0) + q'(x_0)(x - x_0) + \cdots\right] y = 0. \tag{19}$$
Locally, near $x_0$, we can approximate (19) as
$$y'' + p(x_0)y' + q(x_0)y = 0,$$
all solutions of which are either exponential or $x$ times an exponential, and are
therefore analytic and representable in the form (18), as claimed.
In many applications, $p(x)$ and $q(x)$ are rational functions, that is, one polynomial
in $x$ divided by another. Let $F(x) = N(x)/D(x)$ be any rational function,
where the numerator and denominator polynomials are $N(x)$ and $D(x)$, respectively, and where any common factors have been canceled. It will be shown, when
we study complex variable theory, that $F(x)$ is singular only at those points in the
complex plane where $D = 0$, at the zeros of $D$, so that a Taylor expansion of $F$
about a point $x_0$ on the $x$ axis will have a radius of convergence which, we know
in advance, will be equal to the distance from $x_0$ on the $x$ axis to the nearest zero
of $D$ in the complex plane. For instance, if $F(x) = (2 + 3x)/[(4 + x)(9 + x^2)]$,
then $D$ has zeros at $-4$ and $\pm 3i$. Thus, if we expand $F$ about $x = 2$, say, then the
radius of convergence will be the distance from 2 to the nearest zero, which is $+3i$
(or, equally, $-3i$), namely, $\sqrt{13}$ (Fig. 3). If, instead, we expand about $x = -6$, say,
then the radius of convergence will be 2, the distance from $-6$ to the zero of $D$ at
$-4$.
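The nearest-zero rule just described reduces to a one-line distance computation once the zeros of $D$ are known. The sketch below takes the zeros $-4$ and $\pm 3i$ from the example above as given; the function name `radius_from_singularities` is hypothetical.

```python
import math

def radius_from_singularities(x0, singular_points):
    """Distance from the real expansion point x0 to the nearest singular point."""
    return min(abs(complex(s) - x0) for s in singular_points)

zeros_of_D = [-4, 3j, -3j]   # zeros of D(x) = (4 + x)(9 + x^2)

print(radius_from_singularities(2, zeros_of_D), math.sqrt(13))   # expansion about x = 2
print(radius_from_singularities(-6, zeros_of_D))                 # expansion about x = -6
```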
Figure 3. Disks of convergence in $z$ plane ($z = x + iy$).
EXAMPLE 5. Solve
$$y'' + y = 0 \tag{20}$$
by the power series method. Of course, this equation is elementary. We know the solution
and do not need the power series method to find it. Let us use it nevertheless, as a first
example, to illustrate the method.
We can choose the point of expansion $x_0$ in (18) as any point at which both $p(x)$ and
$q(x)$ in (17) are analytic. In the present example, $p(x) = 0$ and $q(x) = 1$ are analytic for
all $x$, so we can choose the point of expansion $x_0$ to be whatever we like. Let $x_0 = 0$, for
simplicity. Then Theorem 4.2.4 assures us that all solutions can be found in the form
$$y(x) = \sum_{0}^{\infty} a_n x^n, \tag{21a}$$
and their series will have infinite radii of convergence. Within that (infinite) interval of
convergence we have, from Theorem 4.2.3(a),
$$y'(x) = \sum_{1}^{\infty} n a_n x^{n-1}, \tag{21b}$$
$$y''(x) = \sum_{2}^{\infty} n(n - 1) a_n x^{n-2}, \tag{21c}$$
so (20) becomes
$$\sum_{n=2}^{\infty} n(n - 1) a_n x^{n-2} + \sum_{n=0}^{\infty} a_n x^n = 0. \tag{22}$$
We would like to combine the two sums, but to do that we need the exponents of $x$ to be the
same, whereas they are $n - 2$ and $n$. To have the same exponents, let us set $n - 2 = m$ in the
first sum, just as one might make a change of variables in an integral. In the integral, one
would then need to change the integration limits, if necessary, consistent with the change
of variables; here, we need to do likewise with the summation limits. With $n - 2 = m$,
$n = \infty$ corresponds to $m = \infty$, and $n = 2$ corresponds to $m = 0$, so (22) becomes
$$\sum_{m=0}^{\infty} (m + 2)(m + 1) a_{m+2} x^m + \sum_{n=0}^{\infty} a_n x^n = 0. \tag{23}$$
Next, and we shall explain this step in a moment, let $m = n$ in the first sum in (23):
$$\sum_{n=0}^{\infty} (n + 2)(n + 1) a_{n+2} x^n + \sum_{n=0}^{\infty} a_n x^n = 0 \tag{24}$$
or, with the help of Theorem 4.2.3(b),
$$\sum_{n=0}^{\infty} \left[(n + 2)(n + 1) a_{n+2} + a_n\right] x^n = 0. \tag{25}$$
Finally, it follows from Theorem 4.2.3(c) that each coefficient in (25) must be zero:
$$(n + 2)(n + 1) a_{n+2} + a_n = 0. \qquad (n = 0, 1, 2, \ldots) \tag{26}$$
Before using (26), let us explain our setting $m = n$ in (23) since that step might seem
to contradict the preceding change of variables $n - 2 = m$. The point to appreciate is
that $m$ in the first sum in (23) is a dummy index just as $t$ is a dummy variable in $\int_0^1 t^2\,dt$.
(We shall use the word index for a discrete variable; $m$ takes on only integer values, not a
continuous range of values.) Just as $\int_0^1 t^2\,dt = \int_0^1 r^2\,dr = \int_0^1 x^2\,dx = \cdots = \frac{1}{3}$, the sums
in (23) are insensitive to whether the dummy index is $m$ or $n$:
$$\sum_{m=0}^{\infty} (m + 2)(m + 1) a_{m+2} x^m = 2a_2 + 6a_3 x + 12a_4 x^2 + \cdots$$
and
$$\sum_{n=0}^{\infty} (n + 2)(n + 1) a_{n+2} x^n = 2a_2 + 6a_3 x + 12a_4 x^2 + \cdots$$
are identical, even though the summation indices are different.
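Equation (26) is known as a recursion (or recurrence) formula on the unknown
coefficients since it gives us the $n$th coefficient in terms of preceding ones. Specifically,
$$a_{n+2} = -\frac{1}{(n + 2)(n + 1)}\, a_n, \qquad (n = 0, 1, 2, \ldots) \tag{27}$$
so that
$$\begin{aligned}
n = 0:&\quad a_2 = -\frac{1}{(2)(1)}\, a_0 = -\frac{1}{2!}\, a_0, \\
n = 1:&\quad a_3 = -\frac{1}{(3)(2)}\, a_1 = -\frac{1}{3!}\, a_1, \\
n = 2:&\quad a_4 = -\frac{1}{(4)(3)}\, a_2 = \frac{1}{(4)(3)(2)}\, a_0 = \frac{1}{4!}\, a_0, \\
n = 3:&\quad a_5 = -\frac{1}{(5)(4)}\, a_3 = \frac{1}{5!}\, a_1,
\end{aligned} \tag{28}$$
and so on, where $a_0$ and $a_1$ remain arbitrary and are, in effect, the integration constants.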
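Putting these results into (21a) gives
$$\begin{aligned}
y(x) &= a_0 + a_1 x - \frac{1}{2!} a_0 x^2 - \frac{1}{3!} a_1 x^3 + \frac{1}{4!} a_0 x^4 + \frac{1}{5!} a_1 x^5 - \cdots \\
&= a_0\left(1 - \frac{1}{2!} x^2 + \frac{1}{4!} x^4 - \cdots\right) + a_1\left(x - \frac{1}{3!} x^3 + \frac{1}{5!} x^5 - \cdots\right)
\end{aligned} \tag{29}$$
or
$$y(x) = a_0 y_1(x) + a_1 y_2(x), \tag{30}$$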
where $y_1(x)$ and $y_2(x)$ are the series within the first and second pairs of parentheses, respectively. From their series, we recognize $y_1(x)$ as $\cos x$ and $y_2(x)$ as $\sin x$ but, in general,
we can’t expect to identify the power series in terms of elementary functions because normally we reserve the power series method for nonelementary equations (except for pedagogical
examples such as this). Thus, let us continue to call the series “$y_1(x)$” and “$y_2(x)$.”
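The recursion formula (27) lends itself to direct computation. The sketch below is illustrative only (the function name `coefficients` is not from the text): it generates the coefficients of $y_1$ by taking $a_0 = 1$, $a_1 = 0$ and those of $y_2$ by taking $a_0 = 0$, $a_1 = 1$, which can then be compared with the Maclaurin coefficients of $\cos x$ and $\sin x$.

```python
from fractions import Fraction
from math import factorial

def coefficients(a0, a1, n_terms):
    """a_0..a_{n_terms-1} from the recursion (27): a_{n+2} = -a_n / ((n+2)(n+1))."""
    a = [Fraction(a0), Fraction(a1)]
    for n in range(n_terms - 2):
        a.append(-a[n] / ((n + 2) * (n + 1)))
    return a[:n_terms]

y1 = coefficients(1, 0, 8)   # a_0 = 1, a_1 = 0: the series called y_1(x)
y2 = coefficients(0, 1, 8)   # a_0 = 0, a_1 = 1: the series called y_2(x)
print(y1)
print(y2)
```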
We don’t need to check the series for convergence because Theorem 4.2.4 guarantees
that they will converge for all $x$. We should, however, check to see if $y_1$, $y_2$ are LI (linearly
independent), so that (30) is a general solution of (20). To do so, it suffices to evaluate the
Wronskian $W[y_1, y_2](x)$ at a single point, say $x = 0$:
$$W[y_1, y_2](0) = \begin{vmatrix} y_1(0) & y_2(0) \\ y_1'(0) & y_2'(0) \end{vmatrix} = \begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix} = 1, \tag{31}$$
which is nonzero. It follows from that result and Theorem 3.2.3 that $y_1$, $y_2$ are LI on the
entire $x$ axis, so (30) is indeed a general solution of (20) for all $x$. Actually, since there are
only two functions it would have been easier to apply Theorem 3.2.4: $y_1$, $y_2$ are LI because
neither one is a constant multiple of the other.
COMMENT 1. To evaluate $y_1(x)$ or $y_2(x)$ at a given $x$, we need to add enough terms of the
series to achieve the desired accuracy. For small values of $x$ (i.e., for $x$’s that are close to the
point of expansion $x_0$, which in this case is 0) just a few terms may suffice. For example,
the first four terms of $y_1(x)$ give $y_1(0.5) = 0.877582$, whereas the exact value is 0.877583
(to six decimal places). As $x$ increases, more and more terms are needed for comparable
accuracy. The situation is depicted graphically in Fig. 4, where we plot the partial sums
$3 and sg, along with the sum function y;(2) (i.e., cos). Observe that the larger n is, the
broader is the $x$ interval over which the $n$-term approximation $s_n$ stays close to the sum
function. However, whereas $y_1(x)$ is oscillatory and remains between $-1$ and $+1$, $s_n(x)$
is a polynomial, and therefore it eventually tends to $+\infty$ or $-\infty$ as $x$ increases ($-\infty$ if $n$
is even and $+\infty$ if $n$ is odd). Observe that if we do need to add a great many terms, then
it is useful to have an expression for the general term in the series. In this example it is not
hard to establish that
$$y_1(x) = \sum_{0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!}, \qquad y_2(x) = \sum_{0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n + 1)!}. \tag{32}$$
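The accuracy remarks in Comment 1 can be checked with a short computation based on the general term in (32). This is a sketch under the assumption that "first four terms" means the terms through $x^6$; the function name `cos_partial_sum` and the sample points 2.0 and 5.0 are illustrative choices.

```python
import math

def cos_partial_sum(x, n_terms):
    """n-term partial sum of y_1(x) = sum (-1)^n x^(2n) / (2n)!, from (32)."""
    return sum((-1)**k * x**(2 * k) / math.factorial(2 * k) for k in range(n_terms))

# Four terms at x = 0.5, rounded to six decimals, next to the exact value.
print(round(cos_partial_sum(0.5, 4), 6), round(math.cos(0.5), 6))

# The same four terms become a poor approximation farther from x_0 = 0.
for x in (0.5, 2.0, 5.0):
    print(x, cos_partial_sum(x, 4), math.cos(x))
```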
COMMENT 2. Besides obtaining the values of the solutions $y_1(x)$ and $y_2(x)$, one is
usually interested in determining some of their properties. Some properties can be obtained
directly from the series. For instance, in this case we can see that $y_1(-x) = y_1(x)$ and
$y_2(-x) = -y_2(x)$ (so that the graphs of $y_1$ and $y_2$ are symmetric and antisymmetric,
respectively, about $x = 0$), and that $y_1'(x) = -y_2(x)$ and $y_2'(x) = y_1(x)$. The differential
equation is also a source of information.
Figure 4. Partial sums of $y_1(x)$, compared with $y_1(x)$.
COMMENT 3. We posed (20) without initial conditions. Suppose that we wish to impose
the initial conditions $y(0) = 4$ and $y'(0) = -1$. Then, from (30),
$$\begin{aligned}
y(0) &= 4 = a_0 y_1(0) + a_1 y_2(0), \\
y'(0) &= -1 = a_0 y_1'(0) + a_1 y_2'(0).
\end{aligned} \tag{33}$$
From the series representations of $y_1$ and $y_2$ in (30), we see that $y_1(0) = 1$, $y_2(0) = 0$,
$y_1'(0) = 0$, and $y_2'(0) = 1$, so we can solve (33) for $a_0$ and $a_1$: $a_0 = 4$ and $a_1 = -1$, hence
the desired particular solution is $y(x) = 4y_1(x) - y_2(x)$, on $-\infty < x < \infty$. ∎
We can now see more clearly how to select the point of expansion $x_0$, besides
selecting it to be a point at which $p$ and $q$ in (17) are analytic. We have emphasized
that the series solutions are especially convenient when the calculation point $x$ is
close to $x_0$, for then only a few terms of the series may suffice for the desired
accuracy. Thus, if our interest is limited to the interval $6 < x < 10$, say, then it
would make sense to seek series solutions about a point somewhere in that interval,
such as a midpoint or endpoint, rather than about some distant point such as $x = 0$.
In the event that initial conditions are supplied at some point $x_i$, then it is
especially helpful to let $x_0$ be $x_i$ because when we apply the initial conditions we
will need to know the values of $y_1(x_i)$, $y_2(x_i)$, $y_1'(x_i)$, and $y_2'(x_i)$, as we did in
(33). If $x_0$ is other than $x_i$, then each of these evaluations requires the summing of
an infinite series, whereas if it is chosen as $x_i$ then these evaluations are trivial (as
in Comment 3, above).
EXAMPLE 6. Solve the initial-value problem
$$(x - 1)y'' + y' + 2(x - 1)y = 0, \qquad y(4) = 5, \quad y'(4) = 0 \tag{34}$$
on the interval $4 \le x < \infty$. To get (34) into the standard form $y'' + p(x)y' + q(x)y = 0$,
we divide by $x - 1$ (which is permissible since $x - 1 \ne 0$ on the interval of interest):
$$y'' + \frac{1}{x - 1}\, y' + 2y = 0, \tag{35}$$
so $p(x) = 1/(x - 1)$ and $q(x) = 2$. These are analytic for all $x$ except $x = 1$, where $p(x)$ is
undefined. In particular, they are analytic at the initial point $x = 4$, so let us choose $x_0 = 4$
and seek
$$y(x) = \sum_{0}^{\infty} a_n (x - 4)^n. \tag{36}$$
To proceed we can use either form (34) or (35). Since we are expanding each term in the
differential equation about $x = 4$, we need to expand $x - 1$ and $2(x - 1)$ if we use (34), or
the $1/(x - 1)$ factor if we use (35). The former is easier since
$$x - 1 = 3 + (x - 4) \tag{37}$$
is merely a two-term Taylor series, whereas (Exercise 6)
$$\frac{1}{x - 1} = \sum_{0}^{\infty} \frac{(-1)^n}{3^{n+1}} (x - 4)^n \tag{38}$$
is an infinite series. Thus, let us use (34). Putting (36) and its derivatives and (37) into (34)
gives
$$[3 + (x - 4)] \sum_{2}^{\infty} n(n - 1) a_n (x - 4)^{n-2} + \sum_{1}^{\infty} n a_n (x - 4)^{n-1} + 2[3 + (x - 4)] \sum_{0}^{\infty} a_n (x - 4)^n = 0 \tag{39}$$
or, absorbing the $3 + (x - 4)$ terms into the series that they multiply and setting $z = x - 4$
for compactness,
$$\sum_{2}^{\infty} 3n(n - 1) a_n z^{n-2} + \sum_{2}^{\infty} n(n - 1) a_n z^{n-1} + \sum_{1}^{\infty} n a_n z^{n-1} + \sum_{0}^{\infty} 6 a_n z^n + \sum_{0}^{\infty} 2 a_n z^{n+1} = 0. \tag{40}$$
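To adjust all $z$ exponents to $n$, let $n - 2 = m$ in the first sum, $n - 1 = m$ in the second
and third, and $n + 1 = m$ in the last:
$$\sum_{0}^{\infty} 3(m + 2)(m + 1) a_{m+2} z^m + \sum_{1}^{\infty} (m + 1) m\, a_{m+1} z^m + \sum_{0}^{\infty} (m + 1) a_{m+1} z^m + \sum_{0}^{\infty} 6 a_n z^n + \sum_{1}^{\infty} 2 a_{m-1} z^m = 0. \tag{41}$$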
Next, we change all of the $m$ indices to $n$. Then we have $z^n$ in each sum, but we
cannot yet combine the five sums because the lower summation limits are not all the same;
three are 0 and two are 1. We can handle that problem as follows. The lower limit in the
second sum can simply be changed from 1 to 0 because the zeroth term is zero anyhow
(due to the $m$ factor). And the lower limit in the last sum can be changed to 0 if we simply
agree that the $a_{-1}$ that occurs in the zeroth term be zero by definition. Then (41) becomes
$$\sum_{0}^{\infty} 3(n + 2)(n + 1) a_{n+2} z^n + \sum_{0}^{\infty} (n + 1) n\, a_{n+1} z^n + \sum_{0}^{\infty} (n + 1) a_{n+1} z^n + \sum_{0}^{\infty} 6 a_n z^n + \sum_{0}^{\infty} 2 a_{n-1} z^n = 0 \tag{42}$$
or
$$\sum_{0}^{\infty} \left[3(n + 2)(n + 1) a_{n+2} + (n + 1)^2 a_{n+1} + 6 a_n + 2 a_{n-1}\right] z^n = 0, \tag{43}$$
with $a_{-1} \equiv 0$.
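Setting the square-bracketed coefficients to zero for each $n$ [Theorem 4.2.3(c)] then
gives the recursion formula
$$3(n + 2)(n + 1) a_{n+2} + (n + 1)^2 a_{n+1} + 6 a_n + 2 a_{n-1} = 0$$
or
$$a_{n+2} = -\frac{n + 1}{3(n + 2)}\, a_{n+1} - \frac{2}{(n + 2)(n + 1)}\, a_n - \frac{2}{3(n + 2)(n + 1)}\, a_{n-1} \tag{44}$$
for $n = 0, 1, 2, \ldots$. Thus,
$$\begin{aligned}
n = 0:&\quad a_2 = -\frac{1}{6} a_1 - a_0 - 0 = -a_0 - \frac{1}{6} a_1, \\
n = 1:&\quad a_3 = -\frac{2}{9} a_2 - \frac{1}{3} a_1 - \frac{1}{9} a_0 = -\frac{2}{9}\left(-a_0 - \frac{1}{6} a_1\right) - \frac{1}{3} a_1 - \frac{1}{9} a_0 = \frac{1}{9} a_0 - \frac{8}{27} a_1, \\
n = 2:&\quad a_4 = -\frac{1}{4} a_3 - \frac{1}{6} a_2 - \frac{1}{18} a_1 = -\frac{1}{4}\left(\frac{1}{9} a_0 - \frac{8}{27} a_1\right) - \frac{1}{6}\left(-a_0 - \frac{1}{6} a_1\right) - \frac{1}{18} a_1 = \frac{5}{36} a_0 + \frac{5}{108} a_1,
\end{aligned} \tag{45}$$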
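and so on, where $a_0$ and $a_1$ remain arbitrary. Putting these expressions for the $a_n$’s back
into (36) then gives
$$\begin{aligned}
y(x) &= a_0 + a_1 (x - 4) + \left(-a_0 - \frac{1}{6} a_1\right)(x - 4)^2 + \left(\frac{1}{9} a_0 - \frac{8}{27} a_1\right)(x - 4)^3 + \left(\frac{5}{36} a_0 + \frac{5}{108} a_1\right)(x - 4)^4 + \cdots \\
&= a_0\left[1 - (x - 4)^2 + \frac{1}{9}(x - 4)^3 + \frac{5}{36}(x - 4)^4 + \cdots\right] \\
&\quad + a_1\left[(x - 4) - \frac{1}{6}(x - 4)^2 - \frac{8}{27}(x - 4)^3 + \frac{5}{108}(x - 4)^4 + \cdots\right] \\
&= a_0 y_1(x) + a_1 y_2(x),
\end{aligned} \tag{46}$$
where $y_1(x)$, $y_2(x)$ are the functions represented by the bracketed series. To test $y_1$, $y_2$ for
linear independence it is simplest to use Theorem 3.2.4: $y_1$, $y_2$ are LI because neither one
is a constant multiple of the other. Thus, $y(x) = a_0 y_1(x) + a_1 y_2(x)$ is a general solution
of $(x - 1)y'' + y' + 2(x - 1)y = 0$.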
Imposing the initial conditions is easy because the expansions are about the initial
point $x = 4$:
$$\begin{aligned}
y(4) &= 5 = a_0 y_1(4) + a_1 y_2(4) = a_0(1) + a_1(0), \\
y'(4) &= 0 = a_0 y_1'(4) + a_1 y_2'(4) = a_0(0) + a_1(1),
\end{aligned} \tag{47}$$
so $a_0 = 5$ and $a_1 = 0$, and
$$y(x) = 5y_1(x) = 5\left[1 - (x - 4)^2 + \frac{1}{9}(x - 4)^3 + \frac{5}{36}(x - 4)^4 + \cdots\right] \tag{48}$$
is the desired particular solution.
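Because hand evaluation of the recursion (44) is error-prone, it can help to generate the coefficients mechanically. The sketch below does so with exact fractions; the function name `example6_coefficients` is hypothetical, and the convention $a_{-1} = 0$ is the one adopted in (43).

```python
from fractions import Fraction

def example6_coefficients(a0, a1, n_terms):
    """Coefficients of sum a_n (x - 4)^n from the recursion (44), with a_{-1} = 0."""
    a = [Fraction(a0), Fraction(a1)]
    for n in range(n_terms - 2):
        a_prev = a[n - 1] if n >= 1 else Fraction(0)   # a_{-1} = 0 by definition
        a.append(-Fraction(n + 1, 3 * (n + 2)) * a[n + 1]
                 - Fraction(2, (n + 2) * (n + 1)) * a[n]
                 - Fraction(2, 3 * (n + 2) * (n + 1)) * a_prev)
    return a[:n_terms]

y1 = example6_coefficients(1, 0, 5)   # bracketed series multiplying a_0 in (46)
y2 = example6_coefficients(0, 1, 5)   # bracketed series multiplying a_1 in (46)
particular = [5 * c for c in y1]      # y(4) = 5, y'(4) = 0 give a_0 = 5, a_1 = 0
print(y1)
print(y2)
print(particular)
```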
COMMENT. Recall that Theorem 4.2.4 guaranteed that the power series solution would
have a radius of convergence $R$ at least as large as 3, namely, the distance from the center
of the expansion ($x_0 = 4$) to the singularity in $1/(x - 1)$ at $x = 1$. For comparison, let us
determine $R$ from our results. In this example it is difficult to obtain a general expression
for $a_n$. (Indeed, we didn’t attempt to; we were content to develop the first several terms
of the series, knowing that we could obtain as many more as we wish, from the recursion
formula.) Can we obtain $R$ without an explicit expression for $a_n$? Yes, we can use the
recursion formula (44), which tells us that $a_{n+2} \sim -\frac{1}{3} a_{n+1}$ as $n \to \infty$ or, equivalently,
$a_{n+1} \sim -\frac{1}{3} a_n$. Then, from (7a),
$$R = \frac{1}{\lim_{n\to\infty} \left|\dfrac{a_{n+1}}{a_n}\right|} = \frac{1}{\left|-\frac{1}{3}\right|} = 3.$$
Thus, if we were hoping to obtain the solution over the entire interval $4 \le x < \infty$ we
are disappointed to find that the power series converges only over $1 < x < 7$, and hence
only over the $4 \le x < 7$ part of the problem domain. Does this result mean that the
solution is singular at $x = 7$ and can’t be continued beyond, or that it doesn’t exist beyond
$x = 7$? No, the convergence is simply being limited by the singularity at $x = 1$, which lies
outside of the problem domain $4 \le x < \infty$. For further discussion of this point, see
Exercise 12. ∎
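The asymptotic claim $a_{n+1} \sim -\frac{1}{3} a_n$ can be probed numerically by running the recursion (44) to moderately large $n$. This is only a sketch: the function name `coefficient_ratio` and the sample values of $n$ are illustrative, and a ratio at finite $n$ suggests, but does not prove, the limiting value.

```python
from fractions import Fraction

def coefficient_ratio(a0, a1, n):
    """a_{n+1} / a_n for the series of Example 6, generated by the recursion (44)."""
    a = [Fraction(a0), Fraction(a1)]
    for k in range(n):
        a_prev = a[k - 1] if k >= 1 else Fraction(0)   # a_{-1} = 0 by definition
        a.append(-Fraction(k + 1, 3 * (k + 2)) * a[k + 1]
                 - Fraction(2, (k + 2) * (k + 1)) * a[k]
                 - Fraction(2, 3 * (k + 2) * (k + 1)) * a_prev)
    return a[n + 1] / a[n]

# Initial data a_0 = 5, a_1 = 0, as in (48).
for n in (10, 20, 40, 80):
    r = coefficient_ratio(5, 0, n)
    print(n, float(r), 1 / abs(float(r)))   # the last column is the estimate of R
```

Exact fractions are used so that roundoff cannot be mistaken for a trend in the ratios.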
Closure. In Section 4.2.1 we reviewed the basic concepts of series and power series
and, in Theorem 4.2.3, we listed the properties of power series that are needed to
solve differential equations. In Section 4.2.2 we provided a basis for the power
series solution method of Theorem 4.2.4 and then showed, by two examples, how
to implement it.
It is best to use summation notation, as we did in Examples 5 and 6, because
it is more concise and leads to the recursion relation. (But that notation is not
essential to the method; for example, we did not use it in our introductory example
in Section 4.1.) The recursion relation is important because it permits the calculation
of as many coefficients of the series as we desire, and because it can be used in
determining the radius of convergence of the resulting series solutions.
The method may be outlined as follows:
(1) Write the differential equation in the standard form $y'' + p(x)y' + q(x)y = 0$ to identify $p(x)$ and $q(x)$ and their singularities (if any).
(2) Choose an expansion point $x_0$ at which $p$ and $q$ are analytic. If initial conditions are given at some point, it is suggested that that point be used as $x_0$.
(3) The least possible radius of convergence can be predicted as the distance (in the complex plane) from $x_0$ to the nearest singular point of $p$ and $q$.
(4) Seeking $y(x)$ in the form of a power series about $x_0$, put that form into the differential equation, and also expand all coefficients of $y''$, $y'$, $y$ about $x_0$ as well.
(5) By changing dummy indices of summation and lower summation limits, as necessary, obtain a form where each summation has the same exponent on $x - x_0$ and the same summation limits.
(6) Combine the sums into a single sum.
(7) Set the coefficient of $(x - x_0)^n$, within that sum, to zero; that step gives the recursion formula.
(8) From the recursion formula, obtain as many of the coefficients as desired and hence the solution form $y(x) = Ay_1(x) + By_2(x)$, where $A$, $B$ are arbitrary constants and $y_1(x)$, $y_2(x)$ are power series. If possible, obtain expressions for the general term in each of those series.
(9) Verify that $y_1$, $y_2$ are LI.
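As a small illustration of steps (3), (7), and (8), the following Python sketch treats the simple equation $y'' + y = 0$ about $x_0 = 0$, for which step (7) gives the recursion $(n+2)(n+1)a_{n+2} + a_n = 0$. The function names (`series_coefficients`, `partial_sum`, `guaranteed_radius`) are hypothetical helpers introduced only for this sketch; they are not part of any library.

```python
from fractions import Fraction
from math import cos, sin


def series_coefficients(a0, a1, n_terms):
    """Return a_0 .. a_{n_terms-1} of y = sum a_n x^n for y'' + y = 0.

    Step (7) for this equation gives (n + 2)(n + 1) a_{n+2} + a_n = 0.
    """
    coeffs = [Fraction(a0), Fraction(a1)]
    for n in range(n_terms - 2):
        coeffs.append(-coeffs[n] / ((n + 2) * (n + 1)))
    return coeffs[:n_terms]


def partial_sum(coeffs, x):
    """Evaluate the truncated power series at x (floating point)."""
    return sum(float(c) * x ** k for k, c in enumerate(coeffs))


def guaranteed_radius(x0, singular_points):
    """Step (3): distance in the complex plane from x0 to the nearest singular point."""
    return min(abs(complex(x0) - complex(s)) for s in singular_points)


# Step (8): two series, one multiplying A = a_0 and one multiplying B = a_1.
y1 = series_coefficients(1, 0, 12)  # partial sum of cos x
y2 = series_coefficients(0, 1, 12)  # partial sum of sin x
print(y1[:5], y2[:5])
print(partial_sum(y1, 0.5) - cos(0.5), partial_sum(y2, 0.5) - sin(0.5))
```

For an equation whose coefficients are singular at $x = 1$ and which is expanded about $x_0 = 4$, `guaranteed_radius(4, [1])` returns 3, which is the interval $1 < x < 7$ found in Example 6.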
Computer software. One can use computer software to generate Taylor series and
also to obtain power series solutions to differential equations. Using Maple, for
instance, the relevant commands are taylor and dsolve.
For example, to obtain the Taylor series of $1/(a - x)$ about $x = 0$, up to terms of third order, where $a$ is a constant, enter
    taylor(1/(a - x), x = 0, 4);
and return. The result is
$$\frac{1}{a} + \frac{1}{a^2}x + \frac{1}{a^3}x^2 + \frac{1}{a^4}x^3 + O(x^4)$$
where the $O(x^4)$ denotes that there are more terms, of order 4 and higher.
To obtain a power series solution to $y'' + y = 0$ about the point $x = 0$, enter
    dsolve(diff(y(x), x, x) + y(x) = 0, y(x), type = series);
and return. The result is
$$y(x) = y(0) + D(y)(0)\,x - \frac{1}{2}y(0)\,x^2 - \frac{1}{6}D(y)(0)\,x^3 + \frac{1}{24}y(0)\,x^4 + \frac{1}{120}D(y)(0)\,x^5 + O(x^6)$$
where $D(y)(0)$ means $y'(0)$. The default is to expand about $x = 0$ and to go as far as the fifth-order term. If we want an expansion about $x = 4$, say, and through the seventh-order term, enter
    Order := 8;
to set the order, then return and enter
    dsolve({diff(y(x), x, x) + y(x) = 0, y(4) = a, D(y)(4) = b},
    y(x), type = series);
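and return. The initial conditions serve to specify the point about which the expansion is desired. The result is
$$y(x) = a + b(x-4) - \frac{1}{2}a(x-4)^2 - \frac{1}{6}b(x-4)^3 + \frac{1}{24}a(x-4)^4 + \frac{1}{120}b(x-4)^5 - \frac{1}{720}a(x-4)^6 - \frac{1}{5040}b(x-4)^7 + O\left((x-4)^8\right).$$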
4.2
EXERCISES
1. Use (7a) or (7b) to determine the radius of convergence $R$ of the given power series.
3. Work out the Taylor series of the given function, about the given point $x_0$, and use (7a) or (7b) to determine its radius of convergence.
na”
(a) ¥
(b)e"*,
(d)sinz,
(f)cosz,
a=1
(a)e",
w%o=7
(c)sinz,
«9 = 7/2
(e)cosz,
Lg =5
(g) cose,
Gi)2’,
(h) lng,
«w=3
ro = 2
(k) cos(x —2),
2
xo = 0
(m) 7 aa
2%=-2
% =7/2
wo=7
w=1
(j)22° - 4,
xo =0
() Toi
to =0
(n) sin(32'°),
xy =0
4. Use computer software to obtain the first 12 nonzero terms
in the Taylor series expansion of the given function f, about
the given point xo, and obtain a computer plot of f and the
2. Determine the radius of convergence $R$ of the Taylor series expansion of the given rational function, about the specified point $x_0$, using the ideas given in the paragraph preceding Example 5. Also, prepare a sketch analogous to those in Fig. 3.
(a) Pe
1
[t=
(b) 2 79°
{c)
we
(a) e+
ro
z+
0
Ly =:
Qe +1
so
ae
,
fp SO
26
Bo = -2
(x +1)°
(0 e+
2a +40
1-2
a
vta-2
(g)
reget
dg
4
(a) $f(x) = e^x$, $x_0 = 0$, $I$: $0 < x < 4$
(b) $f(x) = \sin x$, $x_0 = 0$, $I$: $0 < x < 10$
(c) $f(x) = \ln x$, $x_0 = 1$, $I$: $0 < x < 2$
(d) $f(x) = 1/(1 - x)$, $x_0 = 0$, $I$: $-1 < x < 1$
(e) $f(x) = 1/x$, $x_0 = 2$, $I$: $0 < x < 4$
(f) $f(x) = 1/(1 + x^2)$, $x_0 = 0$, $I$: $-1 < x < 1$
(g) $f(x) = 4/(4 + x + x^2)$, $x_0 = 0$, $I$: $-1.3 < x < 0.36$
1
to = —4
a? —3e+1
(e) x? +3042
partial sums $s_3(x)$, $s_6(x)$, $s_9(x)$, and $s_{12}(x)$ over the given interval $I$.
5. (Geometric series) (a) Show that
$$\frac{1}{1-x} = 1 + x + x^2 + \cdots + x^{n-1} + \frac{x^n}{1-x} \tag{5.1}$$
is an identity for all $x \ne 1$ and any positive integer $n$, by multiplying through by $1 - x$ (which is nonzero since $x \ne 1$) and simplifying.
(b) The identity (5.1) can be used to study the Taylor series known as the geometric series $\sum_{k=0}^{\infty} x^k$ since, according to (5.1), its partial sum $s_n(x)$ is
$$s_n(x) = \sum_{k=0}^{n-1} x^k = \frac{1 - x^n}{1 - x}. \qquad (x \ne 1) \tag{5.2}$$
Show, from (5.2), that the sequence $s_n(x)$ converges, as $n \to \infty$, for $|x| < 1$, and diverges for $|x| > 1$.
(c) Determine, by any means, the convergence or divergence of the geometric series for the points at the ends of the interval of convergence, $x = \pm 1$. NOTE: The formula (5.2) is quite striking because it reduces $s_n(x)$ to the closed form $(1 - x^n)/(1 - x)$, direct examination of which gives not only the interval of convergence but also the sum function $1/(1 - x)$. It is rare that one can reduce $s_n(x)$ to closed form.
6. (a) Derive the Taylor series of $1/(x - 1)$ about $x = 4$ using the Taylor series formula (16), and show that your result agrees with (38).
(b) Show that the same result is obtained (more readily) by writing
$$\frac{1}{x-1} = \frac{1}{3 + (x-4)} = \frac{1}{3}\,\frac{1}{1 + \dfrac{x-4}{3}} \tag{6.1}$$
and using the geometric series formula
$$\frac{1}{1-t} = \sum_{k=0}^{\infty} t^k \qquad (|t| < 1) \tag{6.2}$$
from Exercise 5, with $t = -(x - 4)/3$. Further, deduce the $x$ interval of convergence of the result from the convergence condition $|t| < 1$ in (6.2).
(k)y+ why! + y=0, 2% =0
(1)y+avy! + 2“y =0, Lo=0
(m) y” +(x—-1)*y=0, 2 =2
8. (a)-(m) Use computer software to obtain the general solution, in power series form, for the corresponding problem given in Exercise 7, about the given expansion point.
9. (Airy equation) For the Airy equation,
$$y'' - xy = 0, \qquad (-\infty < x < \infty) \tag{9.1}$$
derive the power series solution
$$y(x) = a_0 y_1(x) + a_1 y_2(x) = a_0\left(1 + \sum_{k=1}^{\infty} \frac{x^{3k}}{(2\cdot 3)(5\cdot 6)\cdots((3k-1)(3k))}\right) + a_1\left(x + \sum_{k=1}^{\infty} \frac{x^{3k+1}}{(3\cdot 4)(6\cdot 7)\cdots((3k)(3k+1))}\right) \tag{9.2}$$
and verify that it is a general solution. NOTE: These series are not summable in closed form in terms of elementary functions; thus, certain linear combinations of $y_1$ and $y_2$ are adopted as a usable pair of LI solutions. In particular, it turns out to be convenient (for reasons that are not obvious) to use the Airy functions $Ai(x)$ and $Bi(x)$, which satisfy these initial conditions: $Ai(0) = 0.35502$, $Ai'(0) = -0.25881$ and $Bi(0) = 0.61493$, $Bi'(0) = 0.44829$.
10. Use computer software to obtain power series solutions of the following initial-value problems, each defined on $0 < x < \infty$, through terms of eighth order, and obtain a computer plot of $s_2(x)$, $s_4(x)$, $s_6(x)$, and $s_8(x)$.
7. For each of the following differential equations do the fol- (ay +4y'+y=0,
y(0)=1, y’(0)=0
lowing: Identifyp(x) and g(x) and, from them,determinethe (b)y"+a°y=0, y(0)=2, y/(0) =0
least possible guaranteed radius of convergence of power se- (c)y”—~ay'+y=0, y(0)=0, y(0)=
ries solutions about the specified point x9; seeking a power (d)(l+a)y”+y=0,
y(0)=2, y'(0)="0
series solution form, about that point, obtain the recursion for- (e)(3+zx)y"+y'+y=0,
y(0)=0, y'(0)=
y(0)=1,
(0) =
mula and the first four nonvanishing terms in the power series (f) (1+ 27)y"+y=0,
for yy(a) andyo(@);verify thaty,, y2 are LL
11. From the given recursion formula alone, determine the
radius of convergenceof the corresponding power series solu(a)y"+ 2y'+y=0,
a =0
tions.
(b) y" + 2y'=0,
x =0
(a) (n+ 3)(n + 2ange — (N+ 1)%an41+ Nan = 0
(c)y” +2y'=0,
wo =3
(day +y'+y=0,
to=—-5
+2y=0,a =1
(e)ay”—2y/
(f) ay"
-—y=
0,
to
c9
=0,
(h)y"+y' +(L+a+27)y
a =0
w
a x)\y=0,
()
y”
a
= 0,
v=
0
(n
+1 La An+2
+
5nn
+1
+
Qn
(c) (n+ 1)anze2 +(2n?+ ans
= 9
(g)vy” + (3 +a2)y +ary = 0,
(i)y"
(b)
(d) (n + l)dn42
=
tg
(€) NAnte
—3
=
0
— 3(na
2)an
7
An—1
=
0
—4an = 0
= 0
+ 4NAn+1 + 3an = 0
(f) n2@n42— 3(2 + 2)?@n41+ 38an—1= 0
12. In the Comment at the end of Example 6 we wondered what the divergence of the series solution over $7 < x < \infty$ implied about the nature of the solution over that part of the domain. To gain insight, we propose studying a simple problem with similar features. Specifically, consider the problem
$$(x - 1)y' + y = 0, \tag{12.1}$$
on the interval $4 < x < \infty$.
(a) Solve (12.1) analytically, and show that the solution is
(12.2)
over $4 < x < \infty$. Sketch the graph of (12.2), showing it as a solid curve over the domain $4 < x < \infty$, and dotted over $-\infty < x < 4$.
(b) Solve (12.1), instead, by seeking $y(x) = \sum_0^{\infty} a_n (x - 4)^n$.
(c) Show that the solution obtained in (b) is, in fact, the Taylor expansion of (12.2) about $x = 4$ and that it converges only in $|x - 4| < 3$ so that it represents the solution (12.2) only over the $4 < x < 7$ part of the domain, even though the solution (12.2) exists and is perfectly well-behaved over $7 < x < \infty$.
13. Rework Example 5 without using the $\sum$ summation notation. That is, just write out the series, as we did in the introductory example of Section 4.1. Keep powers of $x$ up to and including fifth order, $x^5$, and show that your result agrees (up to terms of fifth order) with that given in (29).
14. Rework Example 6 without using the $\sum$ summation notation. That is, just write out the series as we did in the introductory example of Section 4.1. Keep powers of $x - 4$ up to and including fourth order, $(x - 4)^4$, and show that your result agrees (up to terms of fourth order) with that given in (46).
15. (Cesàro summability) Although (5) gives the usual definition of infinite series, it is not the only possible one nor the only one used. For example, according to Cesàro summability, which is especially useful in the theory of Fourier series, one defines
$$\sum_{n=1}^{\infty} a_n \equiv \lim_{N \to \infty} \frac{s_1 + s_2 + \cdots + s_N}{N}, \tag{15.1}$$
that is, the limit of the arithmetic means of the partial sums. It can be shown that if a series converges to $s$ according to "ordinary convergence" [equation (5)], then it will also converge to the same value in the Cesàro sense. Yet, there are series that diverge in the ordinary sense but that converge in the Cesàro sense. Show that for the geometric series (see Exercise 5),
$$\frac{s_1 + s_2 + \cdots + s_N}{N} = \frac{1}{1-x} - \frac{x(1 - x^N)}{N(1-x)^2}$$
for all $x \ne 1$, and use that result to show that the Cesàro definition gives divergence for all $|x| > 1$ and for $x = 1$, and convergence for $|x| < 1$, as does ordinary convergence, but that for $x = -1$ it gives convergence to 1/2, whereas according to ordinary convergence the series diverges for $x = -1$.

4.3. The Method of Frobenius

We continue to consider series solutions of the equation
$$y'' + p(x)y' + q(x)y = 0. \tag{1}$$
By the power series method of Section 4.2, equation (1) admits two LI solutions in the form of power series expansions about any point $x_0$ at which both $p$ and $q$ are analytic. We call such a point $x_0$ an ordinary point of the equation (1). Typically, $p$ and $q$ are analytic everywhere on the $x$ axis except perhaps at one or more singular points, so that all points of the $x$ axis, except perhaps a few, are ordinary points. In that case one can readily select such an $x_0$ and develop two LI power series solutions about that point.
Nevertheless, in the present section we examine singular points more closely,
and show that one can often obtain modified series solutions about singular points.
Why should we want to develop a method of finding series solutions about a singular point when we can stay away from the singular point and expand about an
ordinary point? There are at least two reasons, which are explained later in this
section.
Proceeding, we begin by classifying singular points as follows:
DEFINITION 4.3.1 Regular and Irregular Singular Points of (1)
Let $x_0$ be a singular point of $p$ and/or $q$. We classify it as a regular or irregular singular point of equation (1) as follows: $x_0$ is
(a) a regular singular point of (1) if $(x - x_0)p(x)$ and $(x - x_0)^2 q(x)$ are analytic at $x_0$,
(b) an irregular singular point of (1) if it is not a regular singular point.
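EXAMPLE 1. Consider $x(x - 1)^2 y'' - 3y' + 5y = 0$ or, dividing by $x(x - 1)^2$ to put the equation in the standard form $y'' + p(x)y' + q(x)y = 0$,
$$y'' - \frac{3}{x(x-1)^2}\,y' + \frac{5}{x(x-1)^2}\,y = 0. \tag{2}$$
Thus, $p(x) = -3/[x(x - 1)^2]$ and $q(x) = 5/[x(x - 1)^2]$. These are analytic for all $x$ except for $x = 0$ and $x = 1$, so every $x$ is an ordinary point except for those points. Let us classify those two singular points:
$$x_0 = 0: \quad (x - x_0)p(x) = (x - 0)\left(\frac{-3}{x(x-1)^2}\right) = -\frac{3}{(x-1)^2}, \tag{3a}$$
$$\phantom{x_0 = 0:} \quad (x - x_0)^2 q(x) = (x - 0)^2\left(\frac{5}{x(x-1)^2}\right) = \frac{5x}{(x-1)^2}, \tag{3b}$$
$$x_0 = 1: \quad (x - x_0)p(x) = (x - 1)\left(\frac{-3}{x(x-1)^2}\right) = -\frac{3}{x(x-1)}, \tag{3c}$$
$$\phantom{x_0 = 1:} \quad (x - x_0)^2 q(x) = (x - 1)^2\left(\frac{5}{x(x-1)^2}\right) = \frac{5}{x}. \tag{3d}$$
To classify the singular point at $x = 0$, consider (3a) and (3b). Since the right-hand sides of (3a) and (3b) are analytic* at 0, we classify $x = 0$ as a regular singular point of (2). (The fact that those right-hand sides are singular elsewhere, at $x = 1$, is irrelevant.) To classify the singular point at $x = 1$, we turn to (3c) and (3d). Whereas the right-hand side of (3d) is analytic at $x = 1$, the right-hand side of (3c) is not, so we classify the singular point at $x = 1$ as an irregular singular point of (2). ■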
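When $p$ and $q$ are rational functions, as in Example 1, the test in Definition 4.3.1 reduces to counting pole orders: $(x - x_0)p(x)$ is analytic at $x_0$ exactly when $p$ has a pole of order at most 1 there, and $(x - x_0)^2 q(x)$ is analytic exactly when $q$ has a pole of order at most 2. The sketch below encodes only that special case; `classify_point` is a hypothetical helper name, and the pole orders are supplied by hand rather than computed. It does not cover nonrational coefficients.

```python
def classify_point(p_pole_order, q_pole_order):
    """Classify x0 for y'' + p y' + q y = 0 when p, q are rational near x0.

    p_pole_order, q_pole_order: order of the pole of p and q at x0
    (0 means the function is analytic there).
    """
    if p_pole_order <= 0 and q_pole_order <= 0:
        return "ordinary"
    if p_pole_order <= 1 and q_pole_order <= 2:
        return "regular singular"
    return "irregular singular"


# Example 1: p = -3/[x (x-1)^2], q = 5/[x (x-1)^2].
# Pole orders read off from the denominators: (order of p, order of q).
example_1 = {0: (1, 1), 1: (2, 2), 2: (0, 0)}
for x0, (p_order, q_order) in example_1.items():
    print(x0, classify_point(p_order, q_order))
```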
*Recall the rule of thumb given in the last sentence of Section 4.2.1, that we will classify a function as analytic at a given point if it is infinitely differentiable at that point.

EXAMPLE 2. Consider the case
$$y'' + \sqrt{x}\,y = 0. \qquad (0 < x < \infty) \tag{4}$$
Then $p(x) = 0$ and $q(x) = \sqrt{x}$, and these are analytic (infinitely differentiable) for all $x > 0$, but not at $x = 0$ because $q(x)$ is not even once differentiable there, let alone infinitely differentiable. To classify the singular point at $x = 0$, observe that $(x - x_0)p(x) = (x)(0) = 0$ is analytic at $x = 0$, but $(x - x_0)^2 q(x) = x^2\sqrt{x} = x^{5/2}$ is not; it is twice differentiable there (those derivatives being zero), but all higher derivatives are undefined at $x = 0$. Thus, $x = 0$ is an irregular singular point of (4). (See Exercise 2.) ■
4.3.2. Method of Frobenius. To develop the method of Frobenius, we require that the singular point about which we expand be a regular singular point. Before stating theorems and working examples, let us motivate the idea behind the method.
We consider the equation
$$y'' + p(x)y' + q(x)y = 0 \tag{5}$$
to have a regular singular point at the origin (and perhaps other singular points as well). There is no loss of generality in assuming it to be at the origin since, if it is at $x = x_0 \ne 0$, we can always make a change of variable $\xi = x - x_0$ to move it to the origin in terms of the new variable $\xi$ (Exercise 3). Until stated otherwise, let us assume that the interval of interest is $x > 0$.
We begin by multiplying equation (5) by $x^2$ and rearranging terms as
$$x^2 y'' + x\,[xp(x)]\,y' + [x^2 q(x)]\,y = 0. \tag{6}$$
Since $x = 0$ is a regular singular point, it follows that $xp$ and $x^2 q$ can be expanded about the origin in convergent Taylor series, so we can write
$$x^2 y'' + x\,(p_0 + p_1 x + \cdots)\,y' + (q_0 + q_1 x + \cdots)\,y = 0. \tag{7}$$
Locally, in the neighborhood of $x = 0$, we can approximate (7) as
$$x^2 y'' + p_0 x y' + q_0 y = 0, \tag{8}$$
which is a Cauchy-Euler equation. As such, (8) has at least one solution in the form $x^r$, for some constant $r$. Returning to (7), it is reasonable to expect that equation, likewise, to have at least one solution that behaves like $x^r$ (for the same value of $r$) in the neighborhood of $x = 0$. More completely, we expect it to have at least one solution of the form
$$y(x) = x^r\,(a_0 + a_1 x + a_2 x^2 + \cdots), \tag{9}$$
where the power series factor is needed to account for the deviation of $y(x)$, away from $x = 0$, from its asymptotic behavior $y(x) \sim a_0 x^r$ as $x \to 0$. That is, in place of the power series expansion
$$y(x) = \sum_0^{\infty} a_n x^n \tag{10}$$
that is guaranteed to work when $x = 0$ is an ordinary point of (5), it appears that we should seek $y(x)$ in the more general form
$$y(x) = x^r \sum_0^{\infty} a_n x^n = \sum_0^{\infty} a_n x^{n+r} \tag{11}$$
if $x = 0$ is a regular singular point. Is (11) really different from (10)? Yes, because whereas (10) necessarily represents an analytic function, (11) represents a nonanalytic function because of the $x^r$ factor (unless $r$ is a nonnegative integer).
Let us try the solution form (11) in an example.
EXAMPLE 3. The equation
$$6x^2 y'' + 7x y' - (1 + x^2)\,y = 0, \qquad (0 < x < \infty) \tag{12}$$
has a regular singular point at $x = 0$ because whereas $p(x) = 7x/(6x^2) = 7/(6x)$ and $q(x) = -(1 + x^2)/(6x^2)$ are singular at $x = 0$, $xp(x) = 7/6$ and $x^2 q(x) = -(1 + x^2)/6$ are analytic there. Let us seek $y(x)$ in the form (11). Putting that form into (12) gives
$$6x^2 \sum_0^{\infty} (n+r)(n+r-1)a_n x^{n+r-2} + 7x \sum_0^{\infty} (n+r)a_n x^{n+r-1} - (1 + x^2)\sum_0^{\infty} a_n x^{n+r} = 0 \tag{13}$$
or
$$\sum_0^{\infty} 6(n+r)(n+r-1)a_n x^{n+r} + \sum_0^{\infty} 7(n+r)a_n x^{n+r} - \sum_0^{\infty} a_n x^{n+r} - \sum_0^{\infty} a_n x^{n+r+2} = 0. \tag{14}$$
Letting $n + r + 2 = m + r$ in the last sum, (14) becomes
$$\sum_0^{\infty} \left[6(n+r)(n+r-1) + 7(n+r) - 1\right] a_n x^{n+r} - \sum_{m=2}^{\infty} a_{m-2}\, x^{m+r} = 0. \tag{15}$$
Changing the lower limit in the last sum to 0, with the understanding that $a_{-2} = a_{-1} = 0$, and changing the $m$'s to $n$'s, we can finally combine the sums as
$$\sum_0^{\infty} \left\{\left[6(n+r)^2 + n + r - 1\right] a_n - a_{n-2}\right\} x^{n+r} = 0, \tag{16}$$
where we have also simplified the square bracket in (15) to $6(n+r)^2 + n + r - 1$. From (16), we infer the recursion formula
$$\left[6(n+r)^2 + n + r - 1\right] a_n - a_{n-2} = 0 \tag{17}$$
for each $n = 0, 1, 2, \ldots$. In particular, $n = 0$ gives
$$(6r^2 + r - 1)\,a_0 - a_{-2} = 0 \tag{18}$$
or, since $a_{-2} = 0$,
$$(6r^2 + r - 1)\,a_0 = 0. \tag{19}$$
Here we take $a_0$ to be the first nonvanishing coefficient of the series. Proceeding, with $a_0 \ne 0$, it follows from (19) that
$$6r^2 + r - 1 = 0, \tag{20}$$
so $r = -1/2$ and $r = 1/3$.
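First, set $r = -1/2$. The corresponding recursion formula (17) is then
$$a_n = \frac{1}{6\left(n - \frac{1}{2}\right)^2 + n - \frac{3}{2}}\; a_{n-2} \tag{21}$$
for $n = 1, 2, \ldots$, since the $n = 0$ case has already been carried out:
$$\begin{aligned}
n = 1:&\quad a_1 = a_{-1} = 0,\\
n = 2:&\quad a_2 = \tfrac{1}{14}\,a_0,\\
n = 3:&\quad a_3 = \tfrac{1}{39}\,a_1 = 0,\\
n = 4:&\quad a_4 = \tfrac{1}{76}\,a_2 = \frac{1}{(76)(14)}\,a_0,\\
n = 5:&\quad a_5 = \tfrac{1}{125}\,a_3 = 0,\\
n = 6:&\quad a_6 = \tfrac{1}{186}\,a_4 = \frac{1}{(186)(76)(14)}\,a_0,
\end{aligned} \tag{22}$$
and so on. From these results we have the solution
$$y(x) = a_0 x^{-1/2}\left[1 + \frac{1}{14}x^2 + \frac{1}{(76)(14)}x^4 + \frac{1}{(186)(76)(14)}x^6 + \cdots\right] = a_0\, y_1(x), \tag{23}$$
where $a_0$ remains arbitrary.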
Next, set $r = 1/3$. The corresponding recursion formula (17) is then
$$a_n = \frac{1}{6\left(n + \frac{1}{3}\right)^2 + n - \frac{2}{3}}\; a_{n-2}, \tag{24}$$
and proceeding as we did for $r = -1/2$, we obtain the solution (Exercise 4)
$$y(x) = a_0 x^{1/3}\left[1 + \frac{1}{34}x^2 + \frac{1}{3944}x^4 + \frac{1}{970224}x^6 + \cdots\right] = a_0\, y_2(x), \tag{25}$$
where $a_0$ remains arbitrary. [Of course, the $a_0$'s in (23) and (25) have nothing to do with each other; they are independent arbitrary constants.] According to Theorem 3.2.4, the solutions $y_1$ and $y_2$ are LI because neither is a scalar multiple of the other, so a general solution to (12) is $y(x) = C_1 y_1(x) + C_2 y_2(x)$, where $y_1$ and $y_2$ are given in (23) and (25).
What are the regions of convergence of the series in (23) and (25)? Though we don't have the general terms for those series, we can use their recursion formulas, (21) and (24), respectively, to study their convergence. Consider the series in (23) first. Its recursion formula is (21) or, equivalently,
$$a_{n+2} = \frac{1}{6\left(n + 2 - \frac{1}{2}\right)^2 + n + 2 - \frac{3}{2}}\; a_n. \tag{26}$$
We need to realize that the $a_{n+2}$ on the left side is really the next coefficient after $a_n$, the "$a_{n+1}$" in Theorem 4.2.2, since every other term in the series is missing (because $a_1 = a_3 = a_5 = \cdots = 0$). Thus, (26) gives
$$\lim_{n \to \infty}\left|\frac{a_{n+2}}{a_n}\right| = \lim_{n \to \infty} \frac{1}{6\left(n + \frac{3}{2}\right)^2 + n + \frac{1}{2}} = 0, \tag{27}$$
and it follows from Theorem 4.2.2 that $R = \infty$; the series converges for all $x$. Of course, the $x^{-1/2}$ factor in (23) "blows up" at $x = 0$, so (23) is valid in $0 < x < \infty$, which is the full interval specified in (12).
Similarly, we can show that the series in (25) converges for all $x$, so (25) is valid over the full interval $0 < x < \infty$. ■
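The coefficients in (23) and (25) can be reproduced mechanically from the recursion formula (17). The following Python sketch does so with exact rational arithmetic, taking $a_0 = 1$ and $a_{-2} = a_{-1} = 0$ as in the text; the function name `frobenius_coeffs` is a hypothetical helper chosen for this illustration.

```python
from fractions import Fraction


def frobenius_coeffs(r, n_max):
    """Return [a_0, ..., a_{n_max}] from (17) with a_0 = 1, a_{-1} = a_{-2} = 0.

    Recursion (17): [6 (n + r)^2 + (n + r) - 1] a_n = a_{n-2}.
    """
    r = Fraction(r)
    a = {-2: Fraction(0), -1: Fraction(0), 0: Fraction(1)}
    for n in range(1, n_max + 1):
        bracket = 6 * (n + r) ** 2 + (n + r) - 1
        a[n] = a[n - 2] / bracket
    return [a[n] for n in range(n_max + 1)]


series_1 = frobenius_coeffs(Fraction(-1, 2), 6)  # series factor in (23)
series_2 = frobenius_coeffs(Fraction(1, 3), 6)   # series factor in (25)
print(series_1)
print(series_2)

# Ratio a_{n+2} / a_n used in (27); it shrinks like 1 / (6 n^2).
long_series = frobenius_coeffs(Fraction(-1, 2), 40)
print(float(long_series[40] / long_series[38]))
```

Because the bracket in (17) grows like $6n^2$, the ratios printed at the end tend to zero, in agreement with (27).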
With Example 3 completed, we can come back to the important question that we posed near the beginning of Section 4.3.1: "Why should we want to develop a method of finding series solutions about a singular point when we can stay away from the singular point and expand about an ordinary point?" Observe that our Frobenius-type solution $y(x) = C_1 y_1(x) + C_2 y_2(x)$, with $y_1(x)$ and $y_2(x)$ given by (23) and (25), was valid on the full interval $0 < x < \infty$. Furthermore, it even showed us the singular behavior at the origin explicitly:
$$y(x) \sim C_1 x^{-1/2}\left(1 + \frac{1}{14}x^2 + \cdots\right) + C_2 x^{1/3}\left(1 + \frac{1}{34}x^2 + \cdots\right) \tag{28}$$
as $x \to 0$. In contrast, if we had avoided the singular point $x = 0$ and pursued power series expansions about an ordinary point, say $x = 2$, then the resulting solution would have been valid only in $0 < x < 4$, and it would not have explicitly displayed the $1/\sqrt{x}$ singular behavior at $x = 0$.
Let us review the ideas presented above, and get ready to state the main theorem. If $x = 0$ is a regular singular point of the equation $y'' + p(x)y' + q(x)y = 0$, which we rewrite as
$$x^2 y'' + x\,[xp(x)]\,y' + [x^2 q(x)]\,y = 0, \tag{29}$$
then $xp(x)$ and $x^2 q(x)$ admit Taylor series representations $xp(x) = p_0 + p_1 x + \cdots$ and $x^2 q(x) = q_0 + q_1 x + \cdots$. Locally then, near $x = 0$, (29) can be approximated as
$$x^2 y'' + p_0 x y' + q_0 y = 0, \tag{30}$$
which is a Cauchy-Euler equation. Cauchy-Euler equations, we recall, always admit at least one solution in the form $x^r$, and this fact led us to seek solutions $y(x)$ to (29) that behave like $y(x) \sim x^r$ as $x \to 0$, or
$$y(x) = x^r \sum_0^{\infty} a_n x^n = \sum_0^{\infty} a_n x^{n+r}, \tag{31}$$
where the $\sum_0^{\infty} a_n x^n$ factor is to account for the deviation of $y$ from the local behavior $y(x) \sim x^r$ away from $x = 0$. Putting (31) into
$$x^2 y'' + x\,(p_0 + p_1 x + \cdots)\,y' + (q_0 + q_1 x + \cdots)\,y = 0 \tag{32}$$
gives
$$\sum_0^{\infty} (n+r)(n+r-1)a_n x^{n+r} + (p_0 + p_1 x + \cdots)\sum_0^{\infty} (n+r)a_n x^{n+r} + (q_0 + q_1 x + \cdots)\sum_0^{\infty} a_n x^{n+r} = 0, \tag{33}$$
and equating coefficients of the various powers of $x$ to zero gives
$$x^r: \quad [r(r-1) + p_0 r + q_0]\,a_0 = 0, \tag{34a}$$
$$x^{r+1}: \quad [(r+1)r + p_0(r+1) + q_0]\,a_1 + (p_1 r + q_1)\,a_0 = 0, \tag{34b}$$
$$x^{r+2}: \quad [(r+2)(r+1) + p_0(r+2) + q_0]\,a_2 + (\text{etc})\,a_1 + (\text{etc})\,a_0 = 0, \tag{34c}$$
$$x^{r+3}: \quad [(r+3)(r+2) + p_0(r+3) + q_0]\,a_3 + (\text{etc})\,a_2 + (\text{etc})\,a_1 + (\text{etc})\,a_0 = 0, \tag{34d}$$
and so on, where we've used "etc's" for brevity since we're most interested, here, in showing the form of the equations. Assuming, without loss of generality, that $a_0 \ne 0$, (34a) gives
$$r^2 + (p_0 - 1)r + q_0 = 0, \tag{35}$$
which quadratic equation for $r$ is called the indicial equation; in Example 3 the indicial equation was equation (20). Let the roots be $r_1$ and $r_2$. Setting $r = r_1$ in (34b,c,d,...) gives a system of linear algebraic equations to find $a_1, a_2, \ldots$ in terms of $a_0$, with $a_0$ remaining arbitrary. Next, we set $r = r_2$ in (34b,c,d,...) and again try to solve for $a_1, a_2, \ldots$ in terms of $a_0$. If all goes well, those steps should produce two LI solutions of the differential equation $y'' + p(x)y' + q(x)y = 0$.
The process is straightforward and was carried out successfully in Example 3. Can anything go wrong? Yes. One potential difficulty is that the indicial equation might have repeated roots ($r_1 = r_2$), in which case the procedure gives only one solution. To seek guidance as to how to find a second LI solution, realize that the same situation occurred for the simplified problem, the Cauchy-Euler equation (30): if, seeking $y(x) = x^r$ in (30), we obtain a repeated root for $r$, then a second solution can be found (by the method of reduction of order) to be of the form $x^r$ times $\ln x$. Similarly, for $y'' + p(x)y' + q(x)y = 0$, as we shall see in the theorem below [specifically, (41b)].
The other possible difficulty, which is more subtle, occurs if the roots differ by a nonzero integer. For observe that if we denote the bracketed coefficient of $a_0$ in (34a) as $F(r)$, then the coefficient of $a_1$ in (34b) is $F(r + 1)$, that of $a_2$ in (34c) is $F(r + 2)$, and so on. To illustrate, suppose that $r_1 = r_2 + 1$, so that the roots differ by 1. Then not only will $F(r)$ vanish in (34a) when we are using $r = r_2$, but so will $F(r + 1)$ in (34b) [though not $F(r + 2)$ in (34c), nor $F(r + 3)$ in (34d), etc.], in which case (34b) becomes $0\,a_1 + (p_1 r_2 + q_1)a_0 = 0$. If $p_1 r_2 + q_1$ happens not to be zero then the equation (34b) cannot be satisfied, and there is no set of $a_n$'s that satisfy the system (34). Thus, for the algebraically smaller root $r_2$ (e.g., $-6$ is algebraically smaller than 2), no solution is found. But if $p_1 r_2 + q_1$ does equal zero, then (34b) becomes $0\,a_1 = 0$ and $a_1$ (in addition to $a_0$) remains arbitrary. Then (34c,d,...) give $a_2, a_3, \ldots$ as linear combinations of $a_0$ and $a_1$, and one obtains a general solution
$$y(x) = a_0 x^{r_2}\,(\text{a power series}) + a_1 x^{r_2}\,(\text{a different power series}) = a_0 y_1(x) + a_1 y_2(x), \tag{36}$$
where $a_0$, $a_1$ are arbitrary and $y_1$, $y_2$ are LI.
If, however, $r_2$ gives no solution, then we can turn to $r_1$. For $r_1$ the difficulty cited in the preceding paragraph does not occur, and the method produces a single solution "$y_2(x)$."
If, instead, $r_1 = r_2 + 2$, say, then the same sort of difficulty shows up, but not until (34c). Similarly, if $r_1 = r_2 + 3$, $r_1 = r_2 + 4$, and so on.
The upshot is that if $r_1$, $r_2$ differ by a nonzero integer, then the algebraically smaller root $r_2$ leads either to no solution or to a general solution. In either case, the larger root $r_1$ leads to one solution.
The theorem is as follows.
THEOREM 4.3.1 Regular Singular Point; Frobenius Solution
Let $x = 0$ be a regular singular point of the differential equation
$$y'' + p(x)y' + q(x)y = 0, \qquad (x > 0) \tag{37}$$
with $xp(x) = p_0 + p_1 x + \cdots$ and $x^2 q(x) = q_0 + q_1 x + \cdots$ having radii of convergence $R_1$, $R_2$ respectively. Let $r_1$, $r_2$ be the roots of the indicial equation
$$r^2 + (p_0 - 1)r + q_0 = 0, \tag{38}$$
where $r_1 \ge r_2$ if the roots are real. (Otherwise they are complex conjugates.) Seeking $y(x)$ in the form
$$y(x) = x^r \sum_0^{\infty} a_n x^n = \sum_0^{\infty} a_n x^{n+r}, \qquad (a_0 \ne 0) \tag{39}$$
with $r = r_1$ inevitably leads to a solution
$$y_1(x) = x^{r_1} \sum_0^{\infty} a_n x^n, \qquad (a_0 \ne 0) \tag{40}$$
where $a_1, a_2, \ldots$ are known multiples of $a_0$, which remains arbitrary. For definiteness, we choose $a_0 = 1$ in (40). The form of the second LI solution, $y_2(x)$, depends on $r_1$ and $r_2$ as follows:
(i) $r_1$ and $r_2$ distinct and not differing by an integer. (Complex conjugate roots belong to this case.) Then with $r = r_2$, (39) yields
$$y_2(x) = x^{r_2} \sum_0^{\infty} b_n x^n, \qquad (b_0 \ne 0) \tag{41a}$$
where the $b_n$'s are generated by the same recursion relation as the $a_n$'s, but with $r = r_2$ instead of $r = r_1$; $b_1, b_2, \ldots$ are known multiples of $b_0$, which is arbitrary. For definiteness, we choose $b_0 = 1$ in (41a).
(ii) Repeated roots, $r_1 = r_2 \equiv r$. Then $y_2(x)$ can be found in the form
$$y_2(x) = y_1(x)\ln x + x^r \sum_1^{\infty} c_n x^n. \tag{41b}$$
(iii) $r_1 - r_2$ equal to an integer. Then the smaller root $r_2$ leads to both solutions, $y_1(x)$ and $y_2(x)$, or to neither. In either case, the larger root $r_1$ gives the single solution (40). In the latter case, $y_2(x)$ can be found in the form
$$y_2(x) = \kappa\, y_1(x)\ln x + x^{r_2} \sum_0^{\infty} d_n x^n, \tag{41c}$$
where the constant $\kappa$ may turn out to be zero, in which case there is no logarithmic term in (41c).
The radius of convergence of each of the series in (40) and (41) is at least as large as the smaller of $R_1$, $R_2$.
If (37) is on $x < 0$ rather than $x > 0$, then the foregoing is valid, provided that each $x^r$, $x^{r_1}$, $x^{r_2}$ and $\ln x$ is changed to $|x|^r$, $|x|^{r_1}$, $|x|^{r_2}$ and $\ln|x|$, respectively.
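Deciding which case of Theorem 4.3.1 applies requires only $p_0$ and $q_0$. The sketch below solves the indicial equation (38) and reports the case; `indicial_roots` and `frobenius_case` are hypothetical helper names, and the tolerance `tol` is an assumption made here to absorb floating-point rounding.

```python
import cmath


def indicial_roots(p0, q0):
    """Roots r1, r2 of r^2 + (p0 - 1) r + q0 = 0, with Re(r1) >= Re(r2)."""
    b = p0 - 1
    root = cmath.sqrt(b * b - 4 * q0)
    return (-b + root) / 2, (-b - root) / 2


def frobenius_case(p0, q0, tol=1e-9):
    """Return 'i', 'ii', or 'iii' following the cases of Theorem 4.3.1."""
    r1, r2 = indicial_roots(p0, q0)
    diff = r1 - r2
    if abs(diff) < tol:
        return "ii"   # repeated roots
    if abs(diff.imag) < tol and abs(diff.real - round(diff.real)) < tol:
        return "iii"  # real roots differing by a nonzero integer
    return "i"        # distinct roots, difference not an integer


# Example 3: x p(x) = 7/6, x^2 q(x) = -(1 + x^2)/6, so p0 = 7/6, q0 = -1/6.
print(indicial_roots(7 / 6, -1 / 6), frobenius_case(7 / 6, -1 / 6))
# Example 4: x p(x) = -1 - x, x^2 q(x) = 1, so p0 = -1, q0 = 1.
print(indicial_roots(-1, 1), frobenius_case(-1, 1))
```

Note that the classification says nothing about whether the logarithmic term in (41c) is actually present; that still has to be determined from the recursion.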
Outline of Proof of (ii): Our discussion preceding this theorem contained an outline of the proof of case (i), and also some discussion of case (iii). Here, let us focus on case (ii) and, again, outline the main ideas behind a proof. We consider the case of repeated roots, where $r_1 = r_2 \equiv r$. Since $y_1(x)$, given by (40), is a solution, then so is $y(x) = Ay_1(x)$, where $A$ is arbitrary. To find $y_2(x)$, let us use reduction of order; that is, seek $y_2(x) = A(x)y_1(x)$, where $y_1(x)$ is known and $A(x)$ is to be found. Putting that form into (37) gives
$$A'' y_1 + A'(2y_1' + p\,y_1) + A\,(y_1'' + p\,y_1' + q\,y_1) = 0. \tag{42}$$
Since $y_1$ satisfies (37), the last term in (42) is zero, so (42) becomes
$$A'' y_1 + A'(2y_1' + p\,y_1) = 0. \tag{43}$$
Replacing $A''$ by $dA'/dx$, multiplying through by $dx$, and dividing by $y_1$ and $A'$ gives
$$\frac{dA'}{A'} + 2\,\frac{dy_1}{y_1} + p\,dx = 0. \tag{44}$$
Integrating,
$$\ln|A'| + 2\ln|y_1| + \int p(x)\,dx = \text{constant, say } \ln C, \text{ for } C > 0,$$
so
$$\ln\frac{|A'|\,y_1^2}{C} = -\int p(x)\,dx,$$
and
$$|A'(x)| = C\,\frac{e^{-\int p(x)\,dx}}{y_1^2(x)} = C\,\frac{e^{-\int \left(\frac{p_0}{x} + p_1 + p_2 x + \cdots\right)dx}}{\left[x^r(1 + a_1 x + \cdots)\right]^2} = C\,\frac{e^{-p_0 \ln x}\, e^{(-p_1 x - \cdots)}}{x^{2r}(1 + 2a_1 x + \cdots)}, \tag{45}$$
where we write $\ln x$ rather than $\ln|x|$ since $x > 0$ here, by assumption.
Since $\exp\left(-\int p(x)\,dx\right) > 0$, we see from the first equality in (45) that $A'(x)$ is either everywhere positive or everywhere negative on the interval. Thus, we can drop the absolute value signs around $A'(x)$ if we now allow $C$ to be positive or negative. Further, $e^{-p_0 \ln x} = e^{\ln x^{-p_0}} = x^{-p_0}$, and $e^{(-p_1 x - \cdots)}/(1 + 2a_1 x + \cdots)$ is analytic at $x = 0$ and can be expressed in Taylor series form as $1 + \kappa_1 x + \kappa_2 x^2 + \cdots$, so
$$A'(x) = C\,\frac{1}{x^{2r + p_0}}\,(1 + \kappa_1 x + \kappa_2 x^2 + \cdots). \tag{46}$$
For $r$ to be a double root of the indicial equation (38), it is necessary that $2r + p_0 = 1$, in which case integration of (46) gives
$$A(x) = C\,(\ln x + \kappa_1 x + \cdots). \tag{47}$$
Finally, setting $C = 1$ with no loss of generality, we have the form
$$\begin{aligned}
y_2(x) = A(x)y_1(x) &= (\ln x + \kappa_1 x + \cdots)\,y_1(x)\\
&= y_1(x)\ln x + (\kappa_1 x + \cdots)\,x^r(1 + a_1 x + \cdots)\\
&= y_1(x)\ln x + x^r \sum_1^{\infty} c_n x^n
\end{aligned} \tag{48}$$
as given in (41b). ■
In short, the Frobenius method is guaranteed to provide two LI solutions to (37) if $x = 0$ is a regular singular point of that equation. If $x = 0$ is an irregular singular point, the theorem does not apply. That case is harder, and we have no analogous theory to guide us.
EXAMPLE 4. Case (ii). Solve the equation
$$x^2 y'' - (x + x^2)\,y' + y = 0; \qquad (0 < x < \infty) \tag{49}$$
that is, find two LI solutions. The only singular point of (49) is $x = 0$, and it is a regular singular point. Seeking
$$y(x) = \sum_0^{\infty} a_n x^{n+r}, \qquad (a_0 \ne 0) \tag{50}$$
substitution of that form into (49) gives
$$\sum_0^{\infty} (n+r)(n+r-1)a_n x^{n+r} - \sum_0^{\infty} (n+r)a_n x^{n+r} - \sum_0^{\infty} (n+r)a_n x^{n+r+1} + \sum_0^{\infty} a_n x^{n+r} = 0. \tag{51}$$
Set $n + 1 = m$ in the third sum, change the lower limit from $n = 0$ to $m = 1$, extend that limit back to 0 by defining $a_{-1} \equiv 0$, change the $m$'s back to $n$'s, and combine the four sums. Those steps give
$$\sum_0^{\infty} \left\{\left[(n+r)(n+r-1) - (n+r) + 1\right]a_n - (n+r-1)\,a_{n-1}\right\} x^{n+r} = 0$$
and hence the recursion formula
$$\left[(n+r)(n+r-1) - (n+r) + 1\right]a_n - (n+r-1)\,a_{n-1} = 0 \tag{52}$$
for $n = 0, 1, 2, \ldots$. For $n = 0$, (52) becomes $(r^2 - 2r + 1)a_0 - (r - 1)a_{-1} = 0$. Since $a_{-1} = 0$ and $a_0 \ne 0$, the latter gives the indicial equation
$$r^2 - 2r + 1 = 0, \tag{53}$$
with repeated roots $r = 1, 1$. Thus, this example illustrates case (ii). Putting $r = 1$ into (52), we obtain the recursion formula
$$a_n = \frac{1}{n}\,a_{n-1} \tag{54}$$
for $n = 1, 2, \ldots$. Thus, $a_1 = a_0$, $a_2 = \frac{1}{2}a_1 = \frac{1}{2}a_0$, $a_3 = \frac{1}{3}a_2 = \frac{1}{3!}a_0$, and we can see that $a_n = \frac{1}{n!}a_0$, so
$$y(x) = x\left(a_0 + a_0 x + \frac{a_0}{2!}x^2 + \cdots\right) = a_0 \sum_0^{\infty} \frac{x^{n+1}}{n!} = a_0\, y_1(x). \tag{55}$$
Although this series can be summed in closed form, let us keep working with the series form in this example.
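Theorem 4.3.1 tells us that $y_2$ can be found in the form (41b), where $r = 1$ and the $c_n$'s are to be determined. Putting that form into (49) gives
$$\begin{aligned}
x^2 y_2'' - (x + x^2)y_2' + y_2 ={}& \left[x^2 y_1'' - (x + x^2)y_1' + y_1\right]\ln x + 2x y_1' - (2 + x)\,y_1\\
&+ \sum_1^{\infty} n(n+1)c_n x^{n+1} + \sum_1^{\infty} c_n x^{n+1} - \sum_1^{\infty} (n+1)c_n x^{n+1} - \sum_1^{\infty} (n+1)c_n x^{n+2} = 0,
\end{aligned} \tag{56}$$
where $y_1(x) = \sum_0^{\infty} x^{n+1}/n!$. The square-bracketed terms in (56) cancel to zero because $y_1$ is a solution of (49). If we move $2xy_1' - (2 + x)y_1$ to the right-hand side, and write out the various series, then (56) becomes
$$\left(c_1 x^2 + 4c_2 x^3 + 9c_3 x^4 + \cdots\right) - \left(2c_1 x^3 + 3c_2 x^4 + \cdots\right) = -x^2 - x^3 - \frac{1}{2}x^4 - \cdots, \tag{57}$$
and equating coefficients of like powers of $x$, on the left- and right-hand sides, gives
$$\begin{aligned}
x^2:&\quad c_1 = -1,\\
x^3:&\quad 4c_2 - 2c_1 = -1,\\
x^4:&\quad 9c_3 - 3c_2 = -\tfrac{1}{2},
\end{aligned} \tag{58}$$
and so on. Thus, $c_1 = -1$, $c_2 = -\frac{3}{4}$, $c_3 = -\frac{11}{36}, \ldots$ and
$$y_2(x) = y_1(x)\ln x + \sum_1^{\infty} c_n x^{n+1} = \left(x + x^2 + \frac{1}{2!}x^3 + \cdots\right)\ln x - x^2 - \frac{3}{4}x^3 - \frac{11}{36}x^4 - \cdots. \tag{59}$$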
If, instead, we retain the summation notation, then in place of (57) we obtain, after manipulation and simplification,
$$\sum_1^{\infty} \left(n^2 c_n - n\,c_{n-1}\right) x^{n+1} = -\sum_1^{\infty} \frac{x^{n+1}}{(n-1)!} \tag{60}$$
[where $c_0 \equiv 0$ because there is no $c_0$ term in (41b)] and hence the recursion formula
$$n^2 c_n - n\,c_{n-1} = -\frac{1}{(n-1)!}$$
or, more conveniently,
$$c_n = \frac{1}{n}\,c_{n-1} - \frac{1}{n\,n!}. \qquad (c_0 = 0) \tag{61}$$
Solving (61) gives
$$c_1 = -1, \qquad c_2 = -\frac{1}{2!}\left(1 + \frac{1}{2}\right) = -\frac{3}{4}, \qquad c_3 = -\frac{1}{3!}\left(1 + \frac{1}{2} + \frac{1}{3}\right) = -\frac{11}{36}, \tag{62}$$
and so on. These results agree with those obtained from (58), but the gain, here, is that (61) can give us as many $c_n$'s as we wish. In fact, by carrying (62) further we can see that
$$c_n = -\frac{1}{n!}\left(1 + \frac{1}{2} + \cdots + \frac{1}{n}\right) \tag{63}$$
for any $n = 1, 2, \ldots$. [The price that we paid for that gain was that we needed to manipulate the series by shifting some of the summation indices and summation limits in order to obtain (60).]
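The agreement between the recursion (61) and the closed form (63) can be checked with exact arithmetic. In the sketch below, `c_by_recursion` and `c_closed_form` are hypothetical helper names introduced for this check; the first iterates (61) starting from $c_0 = 0$, and the second evaluates (63) directly.

```python
from fractions import Fraction
from math import factorial


def c_by_recursion(n_max):
    """Return [c_0, ..., c_{n_max}] from (61): c_n = c_{n-1}/n - 1/(n n!), c_0 = 0."""
    c = [Fraction(0)]
    for n in range(1, n_max + 1):
        c.append(c[n - 1] / n - Fraction(1, n * factorial(n)))
    return c


def c_closed_form(n):
    """Closed form (63): c_n = -(1 + 1/2 + ... + 1/n) / n!."""
    harmonic = sum(Fraction(1, k) for k in range(1, n + 1))
    return -harmonic / factorial(n)


c = c_by_recursion(10)
print(c[1], c[2], c[3])
print(all(c[n] == c_closed_form(n) for n in range(1, 11)))
```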
COMMENT 1. In this example we were able to sum the series in (55) and obtain $y_1(x)$ in the closed form
$$y_1(x) = xe^x. \tag{64}$$
In such a case it is more convenient to seek $y_2$ by actually carrying out the reduction-of-order process that lay behind the general form (41b). Thus, seek $y_2(x) = A(x)y_1(x)$. The steps were already carried out in the preceding outline of the proof of Theorem 4.3.1, so we can immediately use (45). With $C = 1$,
$$A'(x) = \frac{e^{-\int p(x)\,dx}}{y_1^2(x)} = \frac{e^{-\int \left(-\frac{1}{x} - 1\right)dx}}{x^2 e^{2x}} = \frac{e^{\ln x + x}}{x^2 e^{2x}} = \frac{e^{-x}}{x}, \tag{65}$$
so
$$A(x) = \int \frac{e^{-x}}{x}\,dx = \int \frac{1}{x}\left(1 - x + \frac{x^2}{2!} - \frac{x^3}{3!} + \cdots\right)dx = \ln x + \sum_1^{\infty} \frac{(-1)^n x^n}{n\,n!} \tag{66}$$
and
$$y_2(x) = A(x)y_1(x) = \left[\ln x + \sum_1^{\infty} \frac{(-1)^n x^n}{n\,n!}\right] xe^x, \tag{67}$$
which expression is found, upon expanding the $e^x$, to agree with (59).
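The claimed agreement between (67) and (59) amounts to an identity between series coefficients: multiplying $\sum_{k \ge 1} (-1)^k x^k/(k\,k!)$ by $xe^x = \sum_{m \ge 0} x^{m+1}/m!$ should reproduce the coefficients $c_n$ of $x^{n+1}$ in (59). The sketch below checks this with a Cauchy product in exact arithmetic; `product_coeff` and `c_closed_form` are hypothetical helper names, and the second simply restates (63).

```python
from fractions import Fraction
from math import factorial


def product_coeff(n):
    """Coefficient of x^(n+1) in [sum_{k>=1} (-1)^k x^k/(k k!)] * [sum_{m>=0} x^(m+1)/m!].

    The term x^k from the first series pairs with x^(n-k+1) from the second.
    """
    return sum(
        Fraction((-1) ** k, k * factorial(k) * factorial(n - k))
        for k in range(1, n + 1)
    )


def c_closed_form(n):
    """Closed form (63) for the coefficients of the non-logarithmic part of (59)."""
    return -sum(Fraction(1, k) for k in range(1, n + 1)) / factorial(n)


print([product_coeff(n) for n in range(1, 4)])
print(all(product_coeff(n) == c_closed_form(n) for n in range(1, 13)))
```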
COMMENT 2. As a matter of fact, we can leave the integral in (66) intact because it is a tabulated function. Specifically, the integral of $e^{-x}/x$, though nonelementary, comes up often enough for it to have been defined, in the literature, as a special function:
$$E_1(x) \equiv \int_x^{\infty} \frac{e^{-t}}{t}\,dt \qquad (x > 0) \tag{68}$$
is known as the exponential integral. Among its various properties is the expansion
$$E_1(x) = -\gamma - \ln x - \sum_1^{\infty} \frac{(-1)^n x^n}{n\,n!}, \qquad (x > 0) \tag{69}$$
where $\gamma = 0.5772157$ is Euler's constant. Using the $E_1(x)$ function, we can express
$$y_2(x) = A(x)\,xe^x = \left(\int_a^x \frac{e^{-t}}{t}\,dt\right) xe^x = \left(\int_a^{\infty} \frac{e^{-t}}{t}\,dt - \int_x^{\infty} \frac{e^{-t}}{t}\,dt\right) xe^x = \left[E_1(a) - E_1(x)\right] xe^x, \tag{70}$$
for any $a > 0$. The $E_1(a)xe^x$ term is merely $E_1(a)$ times $y_1(x)$, so it can be dropped with no loss. Further, the factor $-1$ in front of the $E_1(x)xe^x$ can likewise be dropped. Thus, in this example we were able to obtain both solutions in closed form, $y_1(x) = xe^x$ and $y_2(x) = E_1(x)xe^x$.
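The expansion (69) is easy to evaluate numerically. The sketch below sums it for moderate $x > 0$ and checks the defining property $E_1'(x) = -e^{-x}/x$, which follows from (68), by a centered finite difference. The function name `e1_series`, the truncation at 40 terms, and the step size `h` are choices made for this illustration only; the series is not a good way to compute $E_1$ for large $x$.

```python
from math import exp, factorial, log

EULER_GAMMA = 0.5772156649015329  # Euler's constant, to double precision


def e1_series(x, n_terms=40):
    """Evaluate E_1(x) from (69): -gamma - ln x - sum_{n>=1} (-1)^n x^n / (n n!)."""
    if x <= 0:
        raise ValueError("the expansion (69) is stated for x > 0")
    total = sum((-1) ** n * x ** n / (n * factorial(n)) for n in range(1, n_terms + 1))
    return -EULER_GAMMA - log(x) - total


# Finite-difference check of E_1'(x) = -exp(-x)/x at x = 2.
x, h = 2.0, 1e-5
derivative = (e1_series(x + h) - e1_series(x - h)) / (2 * h)
print(e1_series(1.0), derivative, -exp(-x) / x)
```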
COMMENT 3. Observe that the Taylor series
$$xp(x) = x\left(-\frac{x + x^2}{x^2}\right) = -1 - x, \qquad x^2 q(x) = x^2\left(\frac{1}{x^2}\right) = 1 \tag{71}$$
both terminate, hence they surely converge with $R_1 = \infty$ and $R_2 = \infty$, respectively. Thus, Theorem 4.3.1 guarantees that $\sum_0^{\infty} a_n x^n$ in (40) and $\sum_1^{\infty} c_n x^n$ in (41b) will likewise have infinite radii of convergence. Of course the $\ln x$ in (41b) tends to $-\infty$ as $x \to 0$, but nevertheless our solutions $y_1$ and $y_2$ are indeed valid on the full interval $0 < x < \infty$. ■
EXAMPLE 5. Case (iii). Solve the equation
$$xy'' + y = 0; \tag{72}$$
that is, find two LI solutions. The only singular point of (72) is $x = 0$, and it is a regular singular point. Seeking
$$y(x) = \sum_{0}^{\infty} a_nx^{n+r}, \qquad (a_0 \neq 0)$$
(72) becomes
$$\sum_{0}^{\infty} (n+r)(n+r-1)a_nx^{n+r-1} + \sum_{0}^{\infty} a_nx^{n+r} = 0. \tag{73}$$
Set $n - 1 = m$ in the first sum, in which case the lower summation limit becomes $-1$, then change the $m$'s back to $n$'s. In the second sum change the lower limit to $-1$, with the understanding that $a_{-1} = 0$. Then (73) becomes
$$\sum_{n=-1}^{\infty} \left[(n+r+1)(n+r)a_{n+1} + a_n\right]x^{n+r} = 0,$$
so we have the recursion formula
$$(n+r+1)(n+r)a_{n+1} + a_n = 0, \qquad (a_{-1} = 0,\ a_0 \neq 0) \tag{74}$$
for $n = -1, 0, 1, 2, \ldots$. Setting $n = -1$, and using $a_{-1} = 0$ and $a_0 \neq 0$, gives the indicial equation
$$r(r-1) = 0, \tag{75}$$
with roots $r_1 = 1$ and $r_2 = 0$. These differ by an integer, so that the problem is of case (iii) type. Let us try the smaller root first. That root will, according to Theorem 4.3.1, lead to both solutions or to neither. With $r = r_2 = 0$, (74) becomes $(n+1)na_{n+1} + a_n = 0$. Having already used $n = -1$, to obtain (75), we next set $n = 0$. That step gives $0 + a_0 = 0$, so that $a_0 = 0$, which contradicts the assumption that $a_0 \neq 0$. Thus, $r = r_2 = 0$ yields no solutions. Thus, we will use the larger root, $r = r_1 = 1$, to obtain one solution, and then (41c) to obtain the other.
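With $r = r_1 = 1$, (74) gives
$$a_{n+1} = -\frac{1}{(n+2)(n+1)}\,a_n. \tag{76}$$
Working out the first several $a_n$'s from (76), we find that
$$a_n = \frac{(-1)^n}{(n+1)(n!)^2}\,a_0,$$
so
$$y(x) = a_0\sum_{0}^{\infty} \frac{(-1)^n}{(n+1)(n!)^2}\,x^{n+1} = a_0y_1(x),$$
where
$$y_1(x) = \sum_{0}^{\infty} \frac{(-1)^n}{(n+1)(n!)^2}\,x^{n+1} = x - \frac{x^2}{2} + \frac{x^3}{12} - \frac{x^4}{144} + \cdots. \tag{77}$$
[Remember, throughout, that $0! = 1$ and $(-1)^0 = 1$.]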
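The step from the recursion (76) to the closed-form coefficients in (77) can be checked with exact rational arithmetic. The following is only a sketch; the helper name `frobenius_coeffs` and the normalization $a_0 = 1$ are assumptions made for illustration.

```python
from fractions import Fraction
from math import factorial

def frobenius_coeffs(count):
    """a_0, ..., a_{count-1} generated from (76) with r = 1 and a_0 = 1."""
    a = [Fraction(1)]
    for n in range(count - 1):
        a.append(-a[n] / ((n + 2) * (n + 1)))
    return a

coeffs = frobenius_coeffs(6)
# Closed form quoted in (77): a_n = (-1)^n / ((n + 1) (n!)^2)
closed = [Fraction((-1) ** n, (n + 1) * factorial(n) ** 2) for n in range(6)]
print(coeffs)
print(coeffs == closed)
```

Running the recursion a few steps by machine is a cheap way to catch an index slip before committing to a general-term formula.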
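To find $y_2$, we use (41c) and seek
$$y_2(x) = \kappa y_1(x)\ln x + \sum_{0}^{\infty} d_nx^n. \tag{78}$$
Putting (78) into (72), and multiplying through by $x$, gives
$$\kappa x^2y_1''\ln x + 2\kappa xy_1' - \kappa y_1 + \sum_{0}^{\infty} n(n-1)d_nx^n + \kappa xy_1\ln x + \sum_{0}^{\infty} d_nx^{n+1} = 0. \tag{79}$$
Cancelling the $\ln x$ terms [because $y_1$ satisfies (72)], re-expressing the last sum in (79) as $\sum_0^\infty d_nx^{n+1} = \sum_1^\infty d_{m-1}x^m = \sum_0^\infty d_{n-1}x^n$, where $d_{-1} = 0$, and putting (77) in for the $y_1$ terms, (79) becomes
$$\sum_{0}^{\infty} \left[n(n-1)d_n + d_{n-1}\right]x^n = \kappa y_1 - 2\kappa xy_1' = \kappa\sum_{0}^{\infty} \frac{(-1)^n\left[1 - 2(n+1)\right]}{(n+1)(n!)^2}\,x^{n+1} = -\kappa\sum_{1}^{\infty} \frac{(-1)^{n-1}(2n-1)}{n\left[(n-1)!\right]^2}\,x^n, \tag{80}$$
where, to obtain the last equality, we let $n + 1 = m$ and then set $m = n$. Equating coefficients of like powers of $x$ gives the recursion formula
$$n(n-1)d_n + d_{n-1} = -\kappa\,\frac{(-1)^{n-1}(2n-1)}{n\left[(n-1)!\right]^2} \tag{81}$$
for $n = 1, 2, \ldots$. [We can begin with $n = 1$ because equating the constant terms on both sides of (80) merely gives $0 = 0$.] Letting $n = 1, 2, \ldots$ gives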
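$$\begin{aligned}
n = 1:&\quad d_0 = -\kappa,\\
n = 2:&\quad d_2 = \tfrac{3}{4}\kappa - \tfrac{1}{2}d_1,\\
n = 3:&\quad d_3 = -\tfrac{7}{36}\kappa + \tfrac{1}{12}d_1,\\
n = 4:&\quad d_4 = \tfrac{35}{1728}\kappa - \tfrac{1}{144}d_1,
\end{aligned} \tag{82}$$
and so on, where $d_1$ remains arbitrary. Thus, the series in (78) is
$$\sum_{0}^{\infty} d_nx^n = \kappa\left(-1 + \frac{3}{4}x^2 - \frac{7}{36}x^3 + \frac{35}{1728}x^4 - \cdots\right) + d_1\left(x - \frac{1}{2}x^2 + \frac{1}{12}x^3 - \frac{1}{144}x^4 + \cdots\right). \tag{83}$$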
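The coefficients in (82) follow mechanically from the recursion (81), so they are easy to reproduce with exact fractions. The sketch below is illustrative only: the function name `d_coeffs` and its default choices $\kappa = 1$, $d_1 = 0$ are assumptions. Choosing instead $\kappa = 0$, $d_1 = 1$ isolates the series that multiplies $d_1$ in (83).

```python
from fractions import Fraction
from math import factorial

def d_coeffs(count, kappa=Fraction(1), d1=Fraction(0)):
    """d_0, ..., d_{count-1} from recursion (81); d_1 is a free parameter."""
    def rhs(n):
        # Right-hand side of (81): -kappa (-1)^(n-1) (2n-1) / (n [(n-1)!]^2)
        return -kappa * Fraction((-1) ** (n - 1) * (2 * n - 1),
                                 n * factorial(n - 1) ** 2)
    d = [rhs(1), d1]          # n = 1 in (81) reads 0*d_1 + d_0 = rhs(1)
    for n in range(2, count):
        d.append((rhs(n) - d[n - 1]) / (n * (n - 1)))
    return d

print(d_coeffs(6))                                      # kappa = 1, d_1 = 0
print(d_coeffs(5, kappa=Fraction(0), d1=Fraction(1)))   # the d_1 series alone
```

The second call reproduces the coefficients of $y_1(x)$ in (77), which is the reason $d_1$ can be set to zero without loss.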
The series multiplying $d_1$, on the right side of (83), is identical to $y_1(x)$, given by (77), so we can set $d_1 = 0$ without loss. With $d_1 = 0$, we see that the entire right side of (78) is scaled by $\kappa$, which has remained arbitrary, so there is no loss in setting $\kappa = 1$.
Thus, $y_2(x)$ is given by (78), wherein $y_1(x)$ is given by (77) and the $d_n$'s by (81), with $d_1$ taken to be zero and $\kappa = 1$. ■
EXAMPLE 6. Case (iii). Solve
$$4x^2y'' + 4xy' - y = 0 \tag{84}$$
by the method of Frobenius. This has been a long and arduous section so we will only outline the solution to (84). Seeking a Frobenius expansion $y(x) = \sum_0^\infty a_nx^{n+r}$ about the regular singular point $x = 0$, we obtain the indicial equation $4r^2 - 1 = 0$, so $r = \pm 1/2$, which corresponds to case (iii) of Theorem 4.3.1. We find that the larger root $r_1 = 1/2$ leads to the one-term solution $y(x) = a_0x^{1/2}$ (i.e., $a_1 = a_2 = \cdots = 0$), and that the smaller root $r_2 = -1/2$ leads to $y(x) = a_0x^{-1/2} + a_1x^{1/2}$ (i.e., $a_2 = a_3 = \cdots = 0$), which is the general solution. We did not, in (84), specify the $x$ interval of interest. Suppose that it is $x < 0$. Then a general solution of (84) is $y(x) = a_0|x|^{-1/2} + a_1|x|^{1/2}$, and that solution is valid on the entire interval $x < 0$.
In fact, (84) is an elementary equation, a Cauchy-Euler equation, so we could have solved it more easily. But we wanted to show that it can nonetheless be solved by the Frobenius method, and that that method does indeed give the correct one-term solutions. ■
One final point: what if the indicial equation gives complex roots $r = \alpha \pm i\beta$? This issue came up in Section 3.6.1 as well, for the Cauchy-Euler equation. Our treatment here is virtually the same as in Section 3.6.1 and is left for Exercise 10.
Closure. The Frobenius theory, embodied in Theorem 4.3.1, enables us to find a general solution to any second-order linear ordinary differential equation with a regular singular point at $x = 0$, in the form of generalized power series expansions about that point, possibly with $\ln x$ included. There are exactly three possible cases. Let the roots of the indicial equation (38) be $r_1, r_2$, where $r_1 \geq r_2$ if they are real. If the roots are distinct and do not differ by an integer (which includes the case where the roots are complex), then LI solutions are given by (40) and (41a); if the roots are repeated, then LI solutions are given by (40) and (41b); and if $r_1 - r_2$ is a nonzero integer, then LI solutions are given by (40) and (41c). Theorem 4.3.1 is by no means of theoretical interest alone, since applications, especially the solution by separation of variables of the classical partial differential equations of mathematical physics and engineering (such as the diffusion, Laplace, and wave equations), often lead to nonconstant-coefficient second-order linear differential equations with regular singular points, such as the well known Legendre and Bessel equations. We devote Sections 4.4 and 4.6 to those two important cases.
Computer software. It is fortunate that computer-algebra systems can even generate Frobenius-type solutions, fortunate because the hand calculations can be quite tedious, as our examples have shown. Thus, we urge you to study the theory in this
0, y(x), type = series);
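$$y(x) = \_C1\,x\left(1 - \frac{1}{2}x + \frac{1}{12}x^2 - \frac{1}{144}x^3 + \frac{1}{2880}x^4 - \frac{1}{86400}x^5 + O(x^6)\right) + \_C2\left[\ln(x)\left(-x + \frac{1}{2}x^2 - \frac{1}{12}x^3 + \frac{1}{144}x^4 - \frac{1}{2880}x^5 + O(x^6)\right) + \left(1 - \frac{3}{4}x^2 + \frac{7}{36}x^3 - \frac{35}{1728}x^4 + \frac{101}{86400}x^5 + O(x^6)\right)\right]$$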
EXERCISES 4.3
1. For each equation, identify all singular points (if any), and
classify each as regular or irregular. For each regular singular
point use Theorem 4.3.1 to determine the minimum possible
radii of convergence of the series that will result in (40) and
(41) (but you need not work out those series).
(ayy”—a*y'+ ary=0
(x?-3)y"—y=0
(c)
(e)(a + 1)°y”— 4y' + (x +1)y
(fy” + (Unaz)y’+ 2y =0
y(x(t))= Y(t)is
=0
-1)(w+3)?y"+y'+y=0
(g)(aw
(h) ry" +(sinz)y’ — (cosx)y = 0
(i) x(a* + 2)y" by = (0)
(j)(a*7 Uy +ay'—x?y =0
—y=0
=1)2y'
=1)y"+(2?
(24
(k)
(1)(xt —1)8y"”~ 3(@+
(m) (ty! \' ~5y = 0
1)?y'+ e(@+ ly =90
(n)[x3(a~1)y’]' +2y=0
(0)207y""
—xy!+Ty=0
(p)cy" +dy’ = 0
(q)*y" —3y= 0
(r) 227y" + fry
y" + fey = 0 (x > 0) hasan irregularsingularpoint at
x = 0, becauseof the/Z.
(a) Show that if we change the independent variable from
z to t, say, according to /z = t, then the equation on
(b) ry” — (cosx)y’ + 5y = 0
(d)
a(x?
+3)y"
+y=0
regular singular point, by suitable change of variables, so that
the Frobenius theory can be applied. The purpose of this exercise is to present such a case. We noted, in Example 3, that
=0
2. Sometimes one can change an irregular singular point to a
y(t) - “¥'(t)+4PY()<0. (t>0) 9 21)
(b) Show that (2.1) has a regular singular
point
at ¢ =
0
(which point corresponds to x = Q).
(c) Obtain a genera! solution of (2.1) by the Frobenius method.
(If possible, give the general term of any series obtained.)
Putting t = \/z in that result, obtain the corresponding gen-
eralsolutionof y” + xy
= 0. Is thatgeneralsolutionfor
y(x) of Frobenius form? Explain.
(d) Use computer software to find a general solution.
3. In each case, there is a regular singular point at the left
end of the stated x interval: call that point zp. Merely introduce a change of independent variable, from z to ¢, according
to v ~ wv = ¢, and obtain the new differential equation on
y(x(t)) = Y(t). You neednotsolve thatequation.
4.3. The Method of Frobenius
(a (e@—Ly"+y'-y=0,
(b)(2? Dy" +y=0,
(<a<o)
(1<e<oo)
(c)(a + 3)y” —2(a + 3)y’ —dy = 0,
(d)(a~5)?y""
+2(a—5)y’—y= 0,
9. (a) The equation
(-3 <4 < 0)
(5 <a < oo)
4. Derive the series solution (25).
5. Make up a differential equation that will have as the roots
of its indicial equation
(a 1,4
(b)3,3
(c)1/2,2
(e)2+31
(f)-1,-1 — (g)-2/3,5
(i)(1$ 24)/3Gj)5/4,8/3
(d)-1/2,1/2
(hy)
-1 +i
6. In each case verify that ¢ = 0 is a regular singular point,
and use the method of Frobenius to obtain a general solution
(a? —x)y"”+ (4a- 2)y' + 2y=0,
(O<a<1)
1)
has been “rigged” to have,as solutions, 1/2 and 1/(1 —2).
Solve (9.1) by the method of Frobenius, and show that you do
indeed obtain those two solutions.
(b) You may have wondered how we made up the equation
(9.1) so as to have the two desired solutions.
Here, we ask
you to make up a linear homogenous second-order differential
equationthathastwoprescribedLI solutions#(z) andG(z).
10. (Complexroots)Sincep(x) andq(x) arereal-valuedfunc-
tions, v9 and qo are real. Thus, if the indicial equation (38) has
complex
roots they will be complex conjugates, r = a+i(, so
y(z) = Ayi(x) + Byo(c) of the given differentialequation,
on the interval0 < x < oo. That is, determiney(x) and case (i) of Theorem 3.4.1 applies, and the method of Frobenius
yo(x). On what interval can you be certain that your solution will give a general solution of the form
is valid? HINT: See Theorem 4.3.1.
(a)2ay" + y' + 2°y =0
y(z)= Ayi(x) +Byo(z)
(b)ry” +y' —xy =0
= Are
(c)ay” +y' + 28y =0
Ce)zy" +y' +2y = 0
(e)xy” +ay’ —y=0
(f) ry
wo
xy!
_
2y
=(0
(g)oy! +ay' —(1+2z)y =0
(m)z?y"” — (2+ 32)y =0
rts
= x [cos(Glnz) isin
y(x) = Cx
(q)16x?y" + 8ry' —3y =0
(10.2)
(cos(GBInz) 79° ene”
+Dzx®(cos(In z) 339°dnx”
(r)16x7y""+ Bay’ —(3+a)y =0
+sin(8lna)
(s)2*y" + zy’ + (sinz)y = 0
(10.3)
75° ene"),
are the real and imaginary parts of a,,, respecwhere c,d,
tively: Qn = Cn +idy.
(c) Find a general solution of the form (10.3) for the equation
ry2,
8. Use the method of Frobenius to obtain a general solution
ff
+e(l+a)y
to the equationxy” + cy’ = 0 on x > 0, where c is a real That is, determine a, @ and c,,d,
constant. You may need to treat different cases, depending
upon c.
(Ginz)],
—sin(GInz) 7p draw”)
(p)2zy" + e*y'+y =0
7. (a)—(x) Use computer software to obtain a general solution
of the corresponding differential equation in Exercise 6.
byw”.
show that (10.1) [with },, replaced by @,, according to the
result found in part (a) above] can be re-expressed in terms of
real functions as
(n) bay" + y' + 8x7y = 0
(o) ry" +e"y =0
(t)(xy) —9y'+zy = 0
(u)(zy’)’ -y =0
(v)(xy’)’ —2y'-y =0
0
(a) Show that the 6,,’s will be the complex conjugates of the
Qn’S: bn = Gn.
(b) Recalling, from Section 3.6.1, that
(h)c?7y”+ ay’ -y =0
(i)zy" +2y + (1+2)y=0
(j)3zy" + y' +y =0
(kK)
c(L +x)y" +y = 0
(I)¢?(2+ ay" —~y=0
(10.1)
ane” + Bat
say.
ty =0.
in (10.3), through n = 3,
(d)The sameas(c), for 2*y" + ay’ +(1 - x)y = 0.
4.4 Legendre Functions
4.4.1. Legendre polynomials. The differential equation
$$(1 - x^2)y'' - 2xy' + \lambda y = 0, \tag{1}$$
where $\lambda$ is a constant, is known as Legendre's equation, after the French mathematician Adrien-Marie Legendre (1752-1833). The $x$ interval of interest is $-1 < x < 1$, and (1) has regular singular points at each endpoint, $x = \pm 1$. In this section we study aspects of the Legendre equation and its solutions that will be needed in applications in later chapters. There, we will be interested in power series solutions about the ordinary point $x = 0$,
$$y(x) = \sum_{k=0}^{\infty} a_kx^k. \tag{2}$$
Putting (2) into (1) leads to the recursion formula (Exercise 1)
$$a_{k+2} = \frac{k(k+1) - \lambda}{(k+1)(k+2)}\,a_k. \qquad (k = 0, 1, 2, \ldots) \tag{3}$$
Setting $k = 0, 1, 2, \ldots$, in turn, shows that $a_0$ and $a_1$ are arbitrary, and that subsequent $a_k$'s can be expressed, alternately, in terms of $a_0$ and $a_1$:
$$a_2 = -\frac{\lambda}{2}\,a_0, \qquad a_3 = \frac{2 - \lambda}{6}\,a_1, \qquad a_4 = -\frac{(6 - \lambda)\lambda}{24}\,a_0,$$
and so on, and we have the general solution
$$y(x) = a_0\left[1 - \frac{\lambda}{2}x^2 - \frac{(6-\lambda)\lambda}{24}x^4 - \cdots\right] + a_1\left[x + \frac{2-\lambda}{6}x^3 + \frac{(12-\lambda)(2-\lambda)}{120}x^5 + \cdots\right] = a_0y_1(x) + a_1y_2(x) \tag{4}$$
of (1). To determine the radii of convergence of the two series in (4) we can use the recursion formula (3) and Theorem 4.2.2, provided that we realize that the $a_{k+2}$ on the left side of (3) is really the next coefficient after $a_k$, the "$a_{k+1}$" in Theorem 4.2.2, since every other term in each series is missing. Thus, (3) gives
$$\lim_{k\to\infty}\left|\frac{a_{k+2}}{a_k}\right| = \lim_{k\to\infty}\left|\frac{k(k+1) - \lambda}{(k+1)(k+2)}\right| = 1, \tag{5}$$
and it follows from Theorem 4.2.2 that $R = 1$, so each series converges in $-1 < x < 1$.
In physical applications of the Legendre equation, such as finding the steady-state temperature distribution within a sphere subjected to a known temperature distribution on its surface, one needs solutions that are bounded on $-1 < x < 1$. [$F(x)$ being bounded on an interval $I$ means that there exists a finite constant $M$ such that $|F(x)| < M$ for all $x$ in $I$. If $F(x)$ is not bounded then it is unbounded.] However, for arbitrary values of the parameter $\lambda$ the functions $y_1(x)$ and $y_2(x)$ given in (4) grow unboundedly as $x \to \pm 1$, as illustrated in Fig. 1 for $\lambda = 1$. If you studied Section 4.3, then you could investigate why that is so by developing a Frobenius-type solution about the right endpoint $x = 1$, which is a regular singular point of (1). Doing so, you would find a $\ln(1 - x)$ term, within the solutions, which is infinite at $x = 1$. Similarly, a Frobenius solution about $x = -1$ would reveal a $\ln(1 + x)$ term, which is infinite at $x = -1$. Evidently, $y_1(x)$ and $y_2(x)$, above, contain linear combinations of $\ln(1 - x)$ and $\ln(1 + x)$ [of course, one cannot see them explicitly in (4) because (4) is an expansion about $x = 0$, not $x = 1$ or $x = -1$] so they grow unboundedly as $x \to \pm 1$.
Figure 1. $y_1(x)$ and $y_2(x)$ in (4), for $\lambda = 1$.
Nonetheless, for certain specific values of $\lambda$ one series or the other, in (4), will terminate and thereby be bounded on the interval since it is then a finite degree polynomial! Specifically, if $\lambda$ happens to be such that
$$\lambda = n(n+1) \tag{6}$$
for any integer $n = 0, 1, 2, \ldots$, then we can see from (3) that one of the two series terminates at $k = n$: if $\lambda = n(n+1)$, where $n$ is even, then the even-powered series terminates at $k = n$ (because $a_{n+2} = a_{n+4} = \cdots = 0$). For example, if $n = 2$ and $\lambda = 2(2+1) = 6$, then the $6 - \lambda$ factor in every term after the second, in the even-powered series, causes all those terms to vanish, so the series terminates as $1 - 3x^2$. Similarly, if $\lambda = n(n+1)$, where $n$ is odd, then the odd-powered series terminates at $k = n$. The first five such $\lambda$'s, and their corresponding polynomial solutions of (1), are shown in the second and third columns of Table 1.
Table 1. The first five Legendre polynomials.
n | $\lambda = n(n+1)$ | Polynomial Solution | Legendre Polynomial $P_n(x)$
0 | 0 | $1$ | $P_0(x) = 1$
1 | 2 | $x$ | $P_1(x) = x$
2 | 6 | $1 - 3x^2$ | $P_2(x) = \frac{1}{2}(3x^2 - 1)$
3 | 12 | $x - \frac{5}{3}x^3$ | $P_3(x) = \frac{1}{2}(5x^3 - 3x)$
4 | 20 | $1 - 10x^2 + \frac{35}{3}x^4$ | $P_4(x) = \frac{1}{8}(35x^4 - 30x^2 + 3)$
These polynomial solutions can, of course, be scaled by any desired numerical factor. Scaling them to equal unity at $x = 1$, by convention, they become the so-called Legendre polynomials. Thus, the Legendre polynomial $P_n(x)$ is a polynomial solution of the Legendre equation
$$(1 - x^2)y'' - 2xy' + n(n+1)y = 0, \tag{7}$$
scaled so that $P_n(1) = 1$. In fact, it can be shown that they are given explicitly by the formula
$$P_n(x) = \frac{1}{2^n n!}\frac{d^n}{dx^n}\left[(x^2 - 1)^n\right], \tag{8}$$
which result is known as Rodrigues's formula.
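The termination argument above is easy to mechanize: with $\lambda = n(n+1)$, run the recursion (3) along the series of the same parity as $n$ until it stops, then rescale so that the value at $x = 1$ is unity. The sketch below does this with exact fractions; the function name and the coefficient-list representation (lowest power first) are assumptions made for illustration.

```python
from fractions import Fraction

def legendre_from_series(n):
    """Coefficients [c_0, ..., c_n] of P_n(x), built from (3) with lambda = n(n+1)."""
    lam = n * (n + 1)
    c = [Fraction(0)] * (n + 1)
    c[n % 2] = Fraction(1)        # start the even series at a_0, the odd one at a_1
    for k in range(n % 2, n - 1, 2):
        c[k + 2] = Fraction(k * (k + 1) - lam, (k + 1) * (k + 2)) * c[k]
    scale = sum(c)                # value of the unscaled polynomial at x = 1
    return [ck / scale for ck in c]

for n in range(5):
    print(n, legendre_from_series(n))
```

For $n = 4$ the unscaled polynomial is $1 - 10x^2 + \frac{35}{3}x^4$, whose value at $x = 1$ is $8/3$; dividing by that gives the entry in the last column of Table 1.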
4.4.2. Orthogonality of the $P_n$'s. For reasons that will not be evident until we study vector spaces (Section 9.6), the integral formula
$$\int_{-1}^{1} P_j(x)P_k(x)\,dx = 0 \qquad (j \neq k) \tag{9}$$
is known as the orthogonality relation. By virtue of (9), we say that $P_j(x)$ and $P_k(x)$ are orthogonal to each other, provided that they are different Legendre polynomials (i.e., $j \neq k$).
Proof of (9) is not difficult. Noting that the $(1 - x^2)y'' - 2xy'$ terms in (7) can be combined as $[(1 - x^2)y']'$, we begin by considering $\int_{-1}^{1} [(1 - x^2)P_j']'P_k\,dx$ and integrating by parts until all the derivatives have been transferred from $P_j$ to $P_k$:
$$\begin{aligned}
\int_{-1}^{1} \left[(1 - x^2)P_j'\right]'P_k\,dx &= (1 - x^2)P_j'P_k\Big|_{-1}^{1} - \int_{-1}^{1} (1 - x^2)P_j'P_k'\,dx\\
&= 0 - (1 - x^2)P_k'P_j\Big|_{-1}^{1} + \int_{-1}^{1} P_j\left[(1 - x^2)P_k'\right]'dx.
\end{aligned} \tag{10}$$
The next to last term is zero because of the $1 - x^2$ factor, just as the boundary term following the first equal sign is zero. Since $P_j$ and $P_k$ are solutions of the Legendre equation
$$\left[(1 - x^2)y'\right]' + n(n+1)y = 0 \tag{11}$$
for $n = j$ and $k$, respectively, we can use (11) to re-express (10) as
$$-j(j+1)\int_{-1}^{1} P_jP_k\,dx = -k(k+1)\int_{-1}^{1} P_jP_k\,dx \tag{12}$$
or
$$\left[k(k+1) - j(j+1)\right]\int_{-1}^{1} P_j(x)P_k(x)\,dx = 0. \tag{13}$$
Since $j \neq k$, it follows from (13) that $\int_{-1}^{1} P_jP_k\,dx = 0$, as was to be proved.
We will see later that (9) is but a special case of the more general orthogonality relation found in the Sturm-Liouville theory, which theory will be essential to us when we solve partial differential equations.
4.4.3. Generating function and properties. Besides (9), another important property of Legendre polynomials is expressed by the formula
$$(1 - 2xr + r^2)^{-1/2} = \sum_{0}^{\infty} P_n(x)r^n. \qquad (|x| \leq 1,\ |r| < 1) \tag{14}$$
That is, if we regard the left side of (14) as a function of $r$ and expand it in a Taylor series about $r = 0$, then the coefficient of $r^n$ turns out to be $P_n(x)$. Thus, $(1 - 2xr + r^2)^{-1/2}$ is called the generating function for the $P_n$'s (Exercise 4).
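A quick numerical spot check of (14) is sketched below. It evaluates the $P_n(x)$ numerically with the three-term recursion that this section states as (16), sums the series $\sum P_n(x)r^n$ up to a cutoff, and compares with the closed form on the left of (14). The function names, the cutoff of 200 terms, and the sample point $(x, r) = (0.3, 0.5)$ are arbitrary illustrative choices.

```python
def legendre_values(x, n_max):
    """[P_0(x), ..., P_{n_max}(x)] from n P_n = (2n-1) x P_{n-1} - (n-1) P_{n-2}."""
    p = [1.0, x]
    for n in range(2, n_max + 1):
        p.append(((2 * n - 1) * x * p[n - 1] - (n - 1) * p[n - 2]) / n)
    return p[: n_max + 1]

def generating_sum(x, r, n_max=200):
    """Partial sum of the right-hand side of (14)."""
    return sum(pn * r ** n for n, pn in enumerate(legendre_values(x, n_max)))

x, r = 0.3, 0.5
closed_form = (1 - 2 * x * r + r * r) ** -0.5
print(closed_form, generating_sum(x, r))
```

Because $|P_n(x)| \leq 1$ on $-1 \leq x \leq 1$, the neglected tail is bounded by a geometric series in $r$, so the cutoff matters little unless $|r|$ is close to 1.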
Equation (14) is the source of considerable additional information about the $P_n$'s. For instance, by changing $x$ to $-x$ in (14) it can be seen that
$$P_n(-x) = (-1)^nP_n(x). \tag{15}$$
Now, if $f(-x) = f(x)$, then the graph of $f$ is symmetric about $x = 0$ and we say that $f$ is an even function of $x$. If, instead, $f(-x) = -f(x)$, then the graph of $f$ is antisymmetric about $x = 0$ and we say that $f$ is an odd function of $x$. Noting that the $(-1)^n$ is $+1$ if $n$ is an even integer and $-1$ if $n$ is an odd integer, we see from (15) that $P_n(x)$ is an even function of $x$ if $n$ is an even integer, and an odd function of $x$ if $n$ is an odd integer, as is seen to be true for the $P_n$'s that are shown in Fig. 2.
Figure 2. Graphs of the first five Legendre polynomials.
Also, by taking $\partial/\partial r$ of (14) one can show (Exercise 6) that
$$nP_n(x) = (2n - 1)xP_{n-1}(x) - (n - 1)P_{n-2}(x), \qquad (n = 2, 3, \ldots) \tag{16}$$
which is a recursion relation giving $P_n$ in terms of $P_{n-1}$ and $P_{n-2}$. Or by taking $\partial/\partial x$ of (14) instead, one can show (Exercise 7) that
$$P_n'(x) - 2xP_{n-1}'(x) + P_{n-2}'(x) = P_{n-1}(x). \qquad (n = 2, 3, \ldots) \tag{17}$$
Finally, squaring both sides of (14) and integrating on $x$ from $-1$ to $+1$, and using the orthogonality relation (9), one can show that
$$\int_{-1}^{1} \left[P_n(x)\right]^2dx = \frac{2}{2n + 1}, \qquad (n = 0, 1, 2, \ldots) \tag{18}$$
which result is a companion to (9); it covers the case where $j = k$ ($= n$, say). We will need (9) and (18) in later chapters.
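Both (9) and (18) can be confirmed exactly for small indices, since the integrals involved are integrals of polynomials. The sketch below builds exact coefficient lists for $P_n$ from the recursion (16) and integrates term by term; the function names and the list-of-coefficients representation are assumptions for illustration only.

```python
from fractions import Fraction

def legendre_poly(n):
    """Exact coefficients (lowest power first) of P_n, via recursion (16)."""
    p_prev, p = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return p_prev
    for m in range(2, n + 1):
        shifted = [Fraction(0)] + p                      # coefficients of x * P_{m-1}
        padded = p_prev + [Fraction(0)] * (len(shifted) - len(p_prev))
        nxt = [((2 * m - 1) * s - (m - 1) * q) / m for s, q in zip(shifted, padded)]
        p_prev, p = p, nxt
    return p

def inner(j, k):
    """Exact value of the integral of P_j(x) P_k(x) over -1 <= x <= 1."""
    a, b = legendre_poly(j), legendre_poly(k)
    total = Fraction(0)
    for i, ai in enumerate(a):
        for m, bm in enumerate(b):
            if (i + m) % 2 == 0:                         # odd powers integrate to zero
                total += ai * bm * Fraction(2, i + m + 1)
    return total

for j in range(4):
    print([inner(j, k) for k in range(4)])
```

The printed matrix is diagonal, with diagonal entries $2/(2n+1)$, which is exactly the content of (9) and (18) for $n \leq 3$.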
Closure. Our principal application of Legendre's equation and Legendre polynomials, in this text, is in connection with the solution of the Laplace equation in spherical coordinates. There, we need to know how to expand a given function in terms of the Legendre polynomials $P_0(x), P_1(x), P_2(x), \ldots$, and the theory behind such expansions is covered in Section 17.6 on the Sturm-Liouville theory.
To help put that idea into perspective, recall from a first course in physics or mechanics or calculus that one can expand any given vector in 3-space in terms of orthogonal (perpendicular) vectors "i, j, k." That fact is of great importance and was probably used extensively in those courses. Remarkably, we will be able to generalize the idea of vectors so as to regard functions as vectors. It will turn out that the set of Legendre polynomials $P_0, P_1, \ldots$ constitute an infinite orthogonal set of vectors such that virtually any given function defined on $-1 < x < 1$ can be expanded in terms of them, just as any given "arrow vector" in 3-space can be expanded in terms of i, j, k. In the present section we have not gotten that far, but the results obtained here will be used later, when we finish the story.
For a more extensive treatment of Legendre functions, Bessel functions, and the various other important special functions of mathematical physics, see, for instance, D. E. Johnson and J. R. Johnson, Mathematical Methods in Engineering and Physics (Englewood Cliffs, NJ: Prentice Hall, 1982). Even short of a careful study of the other special functions, such as those associated with the names Bessel, Hermite, Laguerre, Chebyshev, and Mathieu, we recommend browsing through a book like Johnson and Johnson so as to at least become aware of these functions and the circumstances under which they arise.
Computer software. In Maple, $P_n(x)$ is denoted as P(n,x). To obtain $P_7(x)$, say, enter
with(orthopoly):
and return, then
P(7,x);
and return. The result is $\frac{429}{16}x^7 - \frac{693}{16}x^5 + \frac{315}{16}x^3 - \frac{35}{16}x$.
EXERCISES 4.4
1. Putting (2) into (1), derive the recursion formula (3).
(b) Verify (17) forn = 2.
2. Obtain (4) using computer software.
(c) Verify (17) form = 3.
3.
e v.
Use Rodrigues’s
formula,
(8), to reproduce
the first five
Legendre polynomials, cited in Table |
4. Expanding the left-hand side of (14) in a Taylor series in
r, about r = 0, through 7%,say,De
that the coefficients of
r°,...,r3 areindeedPo(x),..., P3(ax),respectively.
Squaring (14) and integrating
8. (a) Derive (18) as follows.
from —1 to 1, obtain
1
57 eal
[—e-f
m=)
Sir
P,(2)) da.
n=Q
(8.1)
relation
orthogonality
the
using
and
side,
left
Integrating the
Show thosestepsand explain (9) to simplify the right side, obtain
5. We stated that by changing 2 to —x in (14) it can be seen
thatP,(—a) = (-1)"P,,(x).
your reasoning.
to(HE)-E{[mora
a2
6. (a) We stated that by taking 0/0r of (14) one obtains (16).
Show those steps and explain your reasoning.
(b) Verify (16) forn = 2.
(c) Verify (16) for n = 3.
7. (a) We stated that by taking 0/0c of (14) one obtains (17).
Show those steps and explain your reasoning.
n=)
ai
Finally, expanding the left-hand side in a Taylor series in r,
show that (18) follows.
(b) Verify (18), by working out the integral, for the cases
n=
0,1, and 2.
9, (Integral representation of P,,) It can be shown that
Lea
x
Qi(e)=C § In(==) - i 4+Dyx.— (1t4b)
= 4 to (x + f/x? — 1 cos t)" dt,
Pale)
(9.1) By convention, choose Co = 1, Dp = 0, Cy = land D, = 0,
(n = 0,1,2,...)
so that
which is called Laplace’s integral form for P,,(a). Here,
we ask you to verify (9.1) for the cases n = 0,1, and 2, by
working out the integral for those cases.
10. We sought power series expansions of (1) about the ordinary point 2 = 0 and, for the case where \ = n(n+1), we obtained a bounded solution (namely, the Legendre polynomial
P,,(z)] and an unboundedsolution. Instead,seek a Frobeniustype solution about the regular singular points z = 1 and
xz= —1,for the case where
{fajn=0
(A=0)
(c)n=2
(A=2)
(b)n=1
1
11. (Legendre functions of second kind) For the Legendre
equation (7) on the interval
(11.5a)
= ↕in(2t®)-1
Orla)
(2) =
= —ln
~I.
(11.5)
∏
∶ ∏∏
as the P,,’s. Thus, with Qo and Q, in hand we can use (16) to
obtain Qs, Qs, and so on. Do that: show that
Q2(z) =
(A=1)
l+2
Qo(x) = 5 in (; = 2),
3a" ~ 1
ri
In (
l+ez
+)
3
9%
(11.6)
andobtainQ3(z) aswell.
< x2 < 1, we obtained the
-1
boundedsolution y(x) = P(x). In this exercisewe seek a
secondLI solution,denotedas Q,(x) and called the Legen- 12. (Electric field induced by two charges) Given a positive
dre function of the second kind. Then the general solution
of (7) can be expressed as
charges lie on the z axis, it follows
y(z) = AP, (2) + BQn(2).
(a) For the special case n
charge Q and a negative charge —Q, a distance 2a apart, let us
introduce a coordinate system as in the figure below. Since the
(11.1)
that the electric
field that
they induce will be symmetric about the z axis.
= 0, solve (7) and show that a
secondLI solution is In[(1 + 2)/(1 —x)]. Scaling this solution by 1/2, we define
fl 4.
Qo(x) = sin (722),
(11.2)
l—-«ax
Sketch the graph of Qo(z) on -1
< a < 1, and notice
that|Qo(z)| + co asa > +1.
(b) More generally, consider any nonnegative integer n. With
only P,,(z) in hand, seek a second solution (by reductionof
order)in the form y(x) = A(x)P,(x), and show thatQ,(2)
is given by
Qr(z)
=
|
aa)
e
lt
ie
)(Pa)?
+e DyPple).
(a) Specifically, the electric potential (i.c., the voltage) ® in-
ducedby achargeq is ®= (1/47req)(q/r),wherethephysical
(11.3) constant €9 is the permittivity of free space and r is the dis(c) Evaluating the integral in (11.3), show that the first two tance from the charge to the field point. Thus the potential
Qn's are
induced at the field point P shown in the figure is
Qo(a)
=
1
300
In (;
Ll+a2
-
|
+ Do,
Cl {.4a)
P=
=
L (2-2).
4még \ Pe
p-
(12.1)
Chapter 4. Power Series Solutions
though a@is not tending to zero and @ to infinity, the field
induced by that molecule is approximately equal to that of an
idealized dipole of strength pz= 2@a, at points sufficiently far
away (ie., for p/a >> 1).
(c) As a different limit of interest, imagine the point P as
fixed, and this time let @become arbitrarily large. Show, from
Show that (12.1) gives
Arey
a? + p? —2apcos
1
a? + p? + 2apcos;|
1
trap
2Q
20
(12.2), that we obtain
a\”
>
(2)
@)~
®&(p,
Pr(cos@) (p>a)
1
Aireg
2Q
—z pcos
a
=
1
2Q
—Zz
4rreg a?
(12.5)
as a — oo. Notice that if, as @is increased, Q is increased
such that Q/a? is held constant, then theelectric field intensity
E (which, we recall from a course in physics, is the negative
(12.2) of the derivative of the potential) is @constant:
(b) With the point P fixed, imagine letting a become arbitrarE=
ily small. Show, from (12.2), that we obtain
(12.6)
(2)" P,(cos@).(p<a)
da
1
cos @
®(p,¢)~ Arey 2(a p
(12.3)
constant, then (12.3) becomes
[LE cos@
p?
Ameg2a?’
dz
asa > 0. Thus, ®(p,¢) — 0 as a -+ 0, as makes sense,
because the positive and negative charges cancel each other
as they are moved together. However, observe that if, as a
is decreased, Q is increased such that the product Qa is held
&(p,)~ dren
1 Q
that is, we have a uniform
field. Thus, a uniform electric
field
can be thought of as resulting from moving apart two charges,
+Q and —Q, and at the sametime increasingtheir strengthQ
such that Q/a?
is held constant as a —->oo. Similarly,
in fluid
mechanics, a uniform fluid velocity field can be thought of as
resulting from moving a fluid “source” of strength +@ and a
fluid “sink” of strength —Q apart in such a way that Q/a? is
held constant as @—+00, where 2a is their separationdistance,
as sketched schematically in the figure.
(12.4)
where pp= 2Qa is called the dipole moment, and the charge
configuration is said to constitute an electric dipole. If, for instance, a molecule is comprised of equal and opposite charges,
+Q and —Q, displaced by a very small distance 2a, then, even
nc
ep
@
+ Q
re
—__
I
ee
AA
®
~ Q
For example, if
=[iire%ds,n=fber*
del
va,
[3 = f
Ig = for Jxe* dz,
dz/(x — 1),
w
then $I_1$ is singular due to the infinite limit, $I_2$ is singular because the integrand tends to $\infty$ as $x \to 0$, $I_3$ is singular because the integrand is unbounded (tends to $-\infty$ as $x \to 1$ from the left, and tends to $+\infty$ as $x \to 1$ from the right), and $I_4$ is regular.
Most of our interest will be in integrals that are singular by virtue of an infinite upper limit (illustrated by $I_1$) and/or a singularity in the integrand at the left endpoint (illustrated by $I_2$), so we limit this brief discussion to those cases. Other cases are considered in the exercises.
Consider the first type, $\int_a^\infty f(x)\,dx$. Analogous to our definition
$$\sum_{n=0}^{\infty} a_n \equiv \lim_{N\to\infty}\sum_{n=0}^{N} a_n$$
of an infinite series, we define
$$I = \int_a^{\infty} f(x)\,dx \equiv \lim_{X\to\infty}\int_a^{X} f(x)\,dx. \tag{3}$$
If the limit exists, we say that $I$ is convergent; if not, it is divergent.
Recall, from our review of infinite series in Section 4.2, that the necessary and sufficient condition for the convergence of an infinite series is given by the Cauchy convergence theorem, but that theorem is difficult to apply. Thus, in the calculus, we studied a wide variety of specialized but more easily applied methods and theorems. For instance, one proves, in the calculus, that the p-series,
$$\sum_{n=1}^{\infty} \frac{1}{n^p}, \tag{4}$$
converges if $p > 1$ and diverges if $p \leq 1$, the case $p = 1$ giving the well known (and divergent) harmonic series $\sum_1^\infty \frac{1}{n}$. That is, the terms need to die out fast enough, as $n$ increases, for the series to converge. As $p$ is increased, they die out faster and faster, and the borderline case is $p = 1$, with convergence requiring $p > 1$.
Then one establishes one or more comparison tests. For instance: If $S_1 = \sum_0^\infty a_n$ and $S_2 = \sum_0^\infty b_n$ are series of finite positive terms, and $a_n \sim Kb_n$ as $n \to \infty$ for some finite constant $K$, then $S_1$ and $S_2$ both converge or both diverge. (The lower limits are inconsequential insofar as convergence/divergence is concerned and have been taken to be 0 merely for definiteness.)
For instance, to determine the convergence or divergence of the series $S = \sum_1^\infty \frac{2n+3}{n^4+5}$, we observe that $\frac{2n+3}{n^4+5} \sim \frac{2}{n^3}$ as $n \to \infty$. Now, $\sum_1^\infty \frac{2}{n^3}$ is convergent because it is a p-series with $p = 3 > 1$, and by the comparison test stated above it follows that $S$ is convergent too.
Our development for determining the convergence/divergence of singular integrals is analogous to the development described above for infinite series. Analogous to the p-series, we study the horizontal p-integral,
$$I = \int_a^{\infty} \frac{1}{x^p}\,dx, \qquad (a > 0) \tag{5}$$
where $p$ is a constant. (The name "horizontal p-integral" is not standard, and is explained below.) The latter integral is simple enough so that we can determine its convergence/divergence by direct evaluation. Then we can use that result, in conjunction with comparison tests, to determine the convergence/divergence of more complicated integrals. Proceeding,
$$I = \lim_{X\to\infty}\int_a^{X} \frac{1}{x^p}\,dx = \begin{cases} \displaystyle\lim_{X\to\infty} \frac{x^{1-p}}{1-p}\bigg|_a^{X}, & (p \neq 1)\\[2ex] \displaystyle\lim_{X\to\infty} \ln x\,\Big|_a^{X}. & (p = 1) \end{cases} \tag{6}$$
Now, $\lim_{X\to\infty}\ln X$ is infinite and hence does not exist, and similarly for $\lim_{X\to\infty}X^{1-p}$ if $p < 1$, whereas the latter does exist if $p > 1$. Thus,
THEOREM 4.5.1 Horizontal p-Integral
The horizontal p-integral, (5), converges if $p > 1$ and diverges if $p \leq 1$.
That result is easy to remember since the p-series, likewise, converges if $p > 1$ and diverges if $p \leq 1$. Graphically, the idea is that $p$ needs to be positive enough (namely, $p > 1$) so that the infinitely long sliver of area (shaded in Fig. 1) is squeezed thin enough to have a finite area.
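To see Theorem 4.5.1 at work numerically, the closed-form antiderivative in (6) can simply be evaluated for larger and larger upper limits $X$. This is a minimal sketch; the function name and the choice $a = 1$ are illustrative assumptions.

```python
import math

def horizontal_p_integral(p, upper, a=1.0):
    """Integral of x**(-p) from a to upper, from the antiderivatives in (6)."""
    if p == 1:
        return math.log(upper / a)
    return (upper ** (1 - p) - a ** (1 - p)) / (1 - p)

for p in (0.5, 1.0, 2.0):
    # Values for X = 1e2, 1e4, 1e8: growth for p <= 1, saturation for p > 1
    print(p, [horizontal_p_integral(p, 10.0 ** k) for k in (2, 4, 8)])
```

For $p = 2$ the values settle toward $a^{1-p}/(p-1) = 1$, whereas for $p = 1$ they grow only logarithmically, which is why the borderline case is easy to misjudge from a table of partial values.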
We state the following comparison tests without proof.
Figure 1. The effect, on $1/x^p$, of varying $p$.
THEOREM 4.5.2 Comparison Tests
Let $I_1 = \int_a^\infty f(x)\,dx$ and $I_2 = \int_a^\infty g(x)\,dx$, where $f(x)$ and $g(x)$ are positive (and bounded) on $a \leq x < \infty$.
(a) If there exist constants $K$ and $X$ such that $f(x) \leq Kg(x)$ for all $x \geq X$, then the convergence of $I_2$ implies the convergence of $I_1$, and the divergence of $I_1$ implies the divergence of $I_2$.
(b) If $f(x) \sim Cg(x)$ as $x \to \infty$, for some finite constant $C$, then $I_1$ and $I_2$ both converge or both diverge.
Of course, $K$ must be finite. Actually, (b) is implied by (a), but we have included it explicitly since it is a simpler statement and is easier to use. Note that $C$ cannot be zero because the notation $f(x) \sim 0$ makes no sense. That is, $f(x) \sim g(x)$ as $x \to x_0$ means that $f(x)/g(x) \to 1$ as $x \to x_0$, and $f(x)/0$ cannot possibly tend to 1.
1.
EXAMPLE
Consider J = |
"9 2a +3
22 +3
fe
- r dx. Since
g
et4+5
et+5
2
— as x — oo, and
x3
fo” dx/x® is a convergent p-integral (p = 4 > 1), it follows from Theorem 4.5.2(b) that I
is convergent.
COMMENT. If, instead,theintegrandwere(22 + 3)/(x* +5), thentheintegralwouldbe
p-integral (p = 1).
divergentbecause(2¢+3)/(a? +5) ~ 2/a, and[5° dx/x is a divergent
It wouldbe incorrectto arguethattheintegralconvergesbecause(22 + 3)/(2? +5) + 0
as x — oo. Tending to zero is not enough; the integrand must tend to zero fast enough. 4
Since the integrand of the integral in question might not be positive, as was assumed in Theorem 4.5.2, the following theorem is useful.
THEOREM 4.5.3 Absolute Convergence
If $\int_a^\infty |f(x)|\,dx$ converges, then so does $\int_a^\infty f(x)\,dx$, and we say that the latter converges absolutely.
EXAMPLE
2.Consider
[ =|q sa*+1
sing
»OO
positive. We have
Now, I
dz, the integrand of which is not everywhere
sin
|
1
1
<
ow
322 4+1|~ 3e2 +1
3a?
as
.
ISBros
7
7)
dx/x* is a convergent p-integral (p = 2 > 1). Thus, by the asymptotic relation
in (7)andTheorem4.5.2(b),[;° dz/(3a? + 1) converges.Next,by theinequalityin (7)
andTheorem4.5.2(a),f>~|sina/(3a* + 1)| dx converges.Finally, by Theorem4.5.3,[
converges.
EXAMPLE
3.
ConsiderJ = {5°2!e~°°l"dx.
It might appear that this integral diverges because of the dramatic growth of the $x^{100}$, in spite of the $e^{-0.01x}$ decay. Let us see. Writing
$$x^{100}e^{-0.01x} = \frac{x^{100}}{e^{0.01x}} = \frac{x^{100}}{1 + (0.01x) + \dfrac{(0.01x)^2}{2!} + \cdots} < \frac{x^{100}}{(0.01x)^{102}/102!} = \frac{(102!)\,10^{204}}{x^2}, \tag{8}$$
we see, by comparison with the p-integral, with $p = 2$, that $I$ converges.
EXAMPLE 4. Observe that
$$I = \int_2^\infty \frac{dx}{x\ln x} = \int_2^\infty \frac{d(\ln x)}{\ln x} = \lim_{X\to\infty} \ln(\ln x)\Big|_2^X = \infty, \tag{9}$$
so $I$ is divergent. This example illustrates just how weakly $\ln x \to \infty$ as $x \to \infty$, for the integral of $1/x$ is borderline divergent ($p = 1$), and the $\ln x$ in the denominator does not even provide enough help, as $x \to \infty$, to produce borderline convergence! ■
So much for the case where the upper limit is $\infty$. The other case that we consider is that in which the integrand "blows up" (i.e., tends to $+\infty$ or $-\infty$) at a finite endpoint, say the left endpoint $x = a$. If the integrand $f(x)$ blows up as $x \to a$, then in the same spirit as (3) we define
$$I = \int_a^b f(x)\,dx \equiv \lim_{\epsilon\to 0}\int_{a+\epsilon}^b f(x)\,dx, \tag{10}$$
where $\epsilon \to 0$ through positive values.
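We first consider the so-called vertical p-integral
$$I = \int_0^b \frac{dx}{x^p}. \qquad (b > 0) \tag{11}$$
According to (10),
$$I = \int_0^b \frac{dx}{x^p} = \lim_{\epsilon\to 0}\int_\epsilon^b \frac{dx}{x^p} = \begin{cases} \lim_{\epsilon\to 0} \left.\dfrac{x^{1-p}}{1-p}\right|_\epsilon^b, & (p \ne 1) \\[2mm] \lim_{\epsilon\to 0} \ln x\,\Big|_\epsilon^b. & (p = 1) \end{cases} \tag{12}$$
Now, $\lim_{\epsilon\to 0}\ln\epsilon$ is infinite ($-\infty$) and hence does not exist and, similarly for $\lim_{\epsilon\to 0}\epsilon^{1-p}$ if $p > 1$, whereas the latter limit does exist if $p < 1$. Thus,
THEOREM 4.5.4 Vertical p-Integral
The vertical p-integral, (11), converges if $p < 1$ and diverges if $p \ge 1$.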
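The limit in (12) can be watched directly. The sketch below is illustrative only: the function name and the choice $b = 1$ are assumptions made for the demonstration. It evaluates the truncated integral from $\epsilon$ to $b$ in closed form for shrinking $\epsilon$.

```python
import math


def truncated_vertical_p_integral(p: float, eps: float, b: float = 1.0) -> float:
    """Integral of x^(-p) from eps to b, i.e. the quantity inside the limit in (12)."""
    if p == 1.0:
        return math.log(b / eps)
    return (b ** (1.0 - p) - eps ** (1.0 - p)) / (1.0 - p)


if __name__ == "__main__":
    for p in (0.5, 1.0, 1.5):
        values = [truncated_vertical_p_integral(p, eps) for eps in (1e-3, 1e-6, 1e-12)]
        print(f"p = {p}: {values}")
```

For $p = 0.5$ the values level off near 2, whereas for $p = 1$ and $p = 1.5$ they keep growing as $\epsilon$ shrinks, in line with Theorem 4.5.4.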
Recall that as $p$ is increased, the horizontal sliver of area (shaded in Fig. 1) is squeezed thinner and thinner. For $p > 1$ it is thin enough to have finite area. However, the effect near $x = 0$ is the opposite: increasing $p$ causes the singularity at $x = 0$ to become stronger, and the vertical column of area (shaded in Fig. 2) to become thicker. Thus, to squeeze the vertical column thin enough for it to have finite area, we need to make $p$ small enough; namely, we need $p < 1$.
Figure 2. The effect, on $1/x^p$, of varying $p$.
The motivation behind the terms "horizontal" and "vertical" p-integrals should now be apparent; the former involves the horizontal sliver shown (shaded) in Fig. 1, and the vertical p-integral involves the vertical sliver shown (shaded) in Fig. 2.
Next, we add the following comparison test:
THEOREM 4.5.5 Comparison Test
Let $I = \int_0^b f(x)\,dx$, where $0 < b < \infty$. If $f(x) \sim K/x^p$ as $x \to 0$ for some constants $K$ and $p$, and $f(x)$ is continuous on $0 < x \le b$, then $I$ converges if $p < 1$ and diverges if $p \ge 1$.
EXAMPLE 5. Test the integral $\int_0^5 (\sin 2x / x^{3/2})\,dx$ for convergence/divergence. Evidently, the integrand blows up as $x \to 0$ and needs to be examined there more closely. Recalling the Taylor series $\sin 2x = (2x) - (2x)^3/3! + (2x)^5/5! - \cdots$, we see that $\sin 2x \sim 2x$ as $x \to 0$ [as can be verified, if you wish, by applying l'Hôpital's rule to show that $\sin(2x)/2x \to 1$ as $x \to 0$], so
$$\frac{\sin 2x}{x^{3/2}} \sim \frac{2x}{x^{3/2}} = \frac{2}{x^{1/2}}. \tag{13}$$
Thus, according to Theorem 4.5.5, with $p = 1/2$, the integral is convergent. ■
Example 5 concludes our introduction to singular integrals, and we are now
prepared to study the gamma function.
4.5.2. Gamma function. The integral
$$\Gamma(x) \equiv \int_0^\infty t^{x-1}e^{-t}\,dt \qquad (x > 0) \tag{14}$$
is nonelementary; that is, it cannot be evaluated in closed form in terms of the so-
called elementary functions. Since it arises frequently, it has been given a name,
the gamma function, and has been studied extensively.
Observe that the integral is singular for two reasons: first, the upper limit is $\infty$ and, second, the integrand blows up as $t \to 0$ if the exponent $x - 1$ is negative. To determine its convergence or divergence, we can separate the two singularities by breaking the integral into the sum of an integral from $t = 0$ to $t = \tau$, say, for any $\tau > 0$, plus another from $\tau$ to $\infty$.* In the first, we have $t^{x-1}e^{-t} \sim t^{x-1} = 1/t^{1-x}$ as $t \to 0$, and by Theorem 4.5.5 we see that we have convergence if $1 - x < 1$ (i.e., $x > 0$), and divergence if $x \le 0$. In the part from $\tau$ to $\infty$, we have convergence no matter how large $x$ is, due to the $e^{-t}$. Thus, the integral in (14) is convergent only if $x > 0$; hence the parenthetic stipulation in (14).

*That is, if $f(t)$ is unbounded as $t \to 0$, then the integral
$$\int_0^\infty f(t)\,dt \equiv \lim_{\substack{\epsilon\to 0\\ T\to\infty}} \int_\epsilon^T f(t)\,dt = \lim_{\epsilon\to 0}\int_\epsilon^\tau f(t)\,dt + \lim_{T\to\infty}\int_\tau^T f(t)\,dt$$
exists if and only if each of the last two integrals exists.
An important property of the gamma function can be derived from the definition (14) by integration by parts. With "$u$" $= t^{x-1}$ and "$dv$" $= e^{-t}\,dt$,
$$\Gamma(x) = -t^{x-1}e^{-t}\Big|_0^\infty + (x-1)\int_0^\infty t^{x-2}e^{-t}\,dt. \tag{15}$$
The integral in (15) converges [and is $\Gamma(x-1)$] only if $x > 1$ (rather than $x > 0$, because the exponent on $t$ is now $x - 2$), in which case the boundary term vanishes. Thus, (15) becomes
$$\Gamma(x) = (x-1)\,\Gamma(x-1). \qquad (x > 1) \tag{16}$$
The latter is a recursion formula because it gives $\Gamma$ at one point in terms of $\Gamma$ at another point. In fact, if we compute $\Gamma(x)$, by numerical integration, over a unit interval such as $0 < x \le 1$, then (16) enables us to compute $\Gamma(x)$ for all $x > 1$. For example,
$$\Gamma(3.2) = 2.2\,\Gamma(2.2) = (2.2)(1.2)\,\Gamma(1.2) = (2.2)(1.2)(0.2)\,\Gamma(0.2), \tag{17}$$
and one can find $\Gamma(0.2)$ in a table. (Actually, tabulations are normally given over the interval $1 < x \le 2$ because accurate integration is difficult if $x$ is close to 0. In fact, tables are no longer essential since the gamma function is available within most computer libraries.) Note, in particular, that if $n$ is a positive integer, then
$$\Gamma(n+1) = n\,\Gamma(n) = n(n-1)\,\Gamma(n-1) = \cdots = n(n-1)(n-2)\cdots(1)\,\Gamma(1), \tag{18}$$
and since
$$\Gamma(1) = \int_0^\infty e^{-t}\,dt = 1, \tag{19}$$
(18) becomes
$$\Gamma(n+1) = n!. \tag{20}$$
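The recursion (16) and its consequences (17) and (20) can be checked with a library gamma function. This is a minimal sketch, assuming Python's standard `math.gamma` for the values on the unit interval; the helper name `gamma_via_recursion` is illustrative and not from the text.

```python
import math


def gamma_via_recursion(x: float) -> float:
    """Gamma(x) for x > 0: step down with (16) until the argument lies in (0, 1]."""
    factor = 1.0
    while x > 1.0:
        x -= 1.0          # Gamma(x) = (x - 1) * Gamma(x - 1)
        factor *= x
    return factor * math.gamma(x)


if __name__ == "__main__":
    # Equation (17): Gamma(3.2) = (2.2)(1.2)(0.2) Gamma(0.2)
    print(gamma_via_recursion(3.2), 2.2 * 1.2 * 0.2 * math.gamma(0.2))
    # Equation (20): Gamma(n + 1) = n!
    for n in range(1, 6):
        print(n, gamma_via_recursion(n + 1.0), math.factorial(n))
```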
Thus, the gamma function can be evaluated analytically at positive integer values of its argument. Another $x$ at which the integration can be carried out is $x = 1/2$, and the result is
$$\Gamma\!\left(\tfrac{1}{2}\right) = \sqrt{\pi}. \tag{21}$$
Derivation of (21) is interesting and is left for the exercises.
Recall that (14) defines $\Gamma(x)$ only for $x > 0$; for $x \le 0$ the integral diverges and (14) is meaningless. What is a reasonable way to extend the definition of $\Gamma(x)$ to negative $x$? Recall that if we know $\Gamma(x)$, then we can compute $\Gamma(x+1)$ from the recursion formula $\Gamma(x+1) = x\,\Gamma(x)$. Instead of using this formula to step to the right [for instance, recall that knowing $\Gamma(0.2)$ we were able, in (17), to compute $\Gamma(3.2)$], we can turn it around and use it to step to the left. Thus, let us define
$$\Gamma(x) \equiv \frac{\Gamma(x+1)}{x} \qquad \text{for } x < 0. \tag{22}$$
For example,
$$\Gamma(-2.6) = \frac{\Gamma(-1.6)}{-2.6} = \frac{\Gamma(-0.6)}{(-2.6)(-1.6)} = \frac{\Gamma(0.4)}{(-2.6)(-1.6)(-0.6)}, \tag{23}$$
where $\Gamma(0.4)$ is known because its argument is positive. The resulting graph of $\Gamma(x)$ is shown in Fig. 3.
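The leftward stepping in (22) translates directly into a short recursive function. The sketch below is illustrative: the name `gamma_extended` is an assumption, and `math.gamma` is used only for positive arguments, where the integral (14) applies.

```python
import math


def gamma_extended(x: float) -> float:
    """Gamma(x) for all x except 0, -1, -2, ..., using (14) for x > 0 and (22) for x < 0."""
    if x > 0.0:
        return math.gamma(x)
    if x == math.floor(x):
        raise ValueError("Gamma is undefined at 0, -1, -2, ...")
    return gamma_extended(x + 1.0) / x   # equation (22)


if __name__ == "__main__":
    stepped = gamma_extended(-2.6)
    direct = math.gamma(0.4) / ((-2.6) * (-1.6) * (-0.6))   # the worked example above
    print(stepped, direct)
```

Python's `math.gamma` already accepts negative non-integer arguments, so it can serve as an independent check on the stepping procedure.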
In summary then, $\Gamma(x)$ is defined for all $x \ne 0, -1, -2, \ldots$ by the integral (14) together with the leftward-stepping recursion formula (16). The singularity of $\Gamma(x)$ at $x = 0$ propagates to $x = -1, -2, \ldots$ by virtue of that formula. Especially notable is the fact that $\Gamma(x) = (x-1)!$ at $x = 1, 2, 3, \ldots$, and for this reason $\Gamma(x)$ is often referred to as the generalized factorial function.
A great many integrals are not themselves gamma function integrals but can be evaluated by making suitable changes of variables so as to reduce them to gamma function integrals.
EXAMPLE 6. Evaluate $I = \int_0^\infty t^{2/3}e^{-\sqrt{t}}\,dt$. Setting $\sqrt{t} = u$, we obtain
$$I = \int_0^\infty (u^2)^{2/3}e^{-u}\,2u\,du = 2\int_0^\infty u^{7/3}e^{-u}\,du = 2\,\Gamma\!\left(\tfrac{10}{3}\right), \tag{24}$$
where $\Gamma(10/3)$ can be obtained from tables or a computer. ■
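As a rough cross-check of (24), one can integrate the transformed integrand numerically and compare with $2\,\Gamma(10/3)$. This is only a sketch under stated assumptions: a hand-written composite Simpson rule, and the infinite range truncated at $u = 60$, where the factor $e^{-u}$ makes the neglected tail negligible.

```python
import math


def simpson(func, a: float, b: float, n: int = 20000) -> float:
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


def integrand_u(u: float) -> float:
    """Integrand of (24) after the substitution sqrt(t) = u."""
    return 2.0 * u ** (7.0 / 3.0) * math.exp(-u)


def example6_numeric() -> float:
    """Numerical value of the integral, truncated at u = 60 (an assumed cutoff)."""
    return simpson(integrand_u, 0.0, 60.0)


def example6_closed_form() -> float:
    """Closed form 2 * Gamma(10/3) from (24)."""
    return 2.0 * math.gamma(10.0 / 3.0)


if __name__ == "__main__":
    print(example6_numeric(), example6_closed_form())
```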
4.5.3. Order of magnitude. In some of the foregoing examples it was important to assess the relative magnitudes of two given functions. In Example 3, for instance, the $x^{100}$ grows as $x \to \infty$ while the $e^{-0.01x}$ decays. Which one "wins," and by what margin, determines whether the integral converges or diverges.
Of particular interest are the relative growth and decay of the exponential, algebraic, and logarithmic functions as $x \to \infty$ and $x \to 0$, and we state the following elementary results as a theorem, both for emphasis and for reference.
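THEOREM 4.5.6 Relative Growth and Decay
For any choice of positive real numbers $\alpha$ and $\beta$,
$$x^\alpha e^{-\beta x} \to 0 \quad \text{as } x \to \infty, \tag{25a}$$
$$(\ln x)/x^\alpha \to 0 \quad \text{as } x \to \infty, \tag{25b}$$
$$x^\alpha \ln x \to 0 \quad \text{as } x \to 0. \tag{25c}$$
Figure 3. Gamma function, $\Gamma(x)$.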
Proof of (25a) can proceed as a generalization of (8) or by using l'Hôpital's rule, and (25b,c) can be proved by l'Hôpital's rule. To prove (25c), for example, observe that $x^\alpha \ln x \to (0)(-\infty)$, which result is indeterminate. To use l'Hôpital's rule we need to have $0/0$ or $\infty/\infty$. Thus, express $x^\alpha \ln x$ as $(\ln x)/x^{-\alpha}$, which tends to $-\infty/\infty$ as $x \to 0$. Then l'Hôpital's rule gives
$$\lim_{x\to 0}\frac{\ln x}{x^{-\alpha}} = \lim_{x\to 0}\frac{1/x}{-\alpha x^{-\alpha-1}} = \lim_{x\to 0}\left(-\frac{x^\alpha}{\alpha}\right) = 0.$$
We say that $x^\alpha$ exhibits algebraic growth as $x \to \infty$, and we see from (25a) that algebraic growth $x^\alpha$ is no match for exponential decay $e^{-\beta x}$, no matter how large $\alpha$ is and no matter how small $\beta$ is! Of course, it follows from (25a) that $x^{-\alpha}e^{\beta x} \to \infty$ as $x \to \infty$: algebraic decay is no match for exponential growth.
Just as exponential growth is extremely strong, logarithmic growth is extremely weak, for (25b) shows that $x^\alpha$ dominates $\ln x$ as $x \to \infty$, no matter how small $\alpha$ is. Similarly as $x \to 0$: $x^{-\alpha} \to \infty$ and $\ln x \to -\infty$ (recall that $\ln x$ is zero at $x = 1$, increases without bound as $x \to \infty$, and decreases without bound as $x \to 0$; sketch it), and (25c), rewritten as $(\ln x)/x^{-\alpha} \to 0$, shows that $x^{-\alpha} \to \infty$ faster than $\ln x \to -\infty$, no matter how small $\alpha$ is.
Crudely then, we can think of $\ln x$ as being of the order of $x$ to an infinitesimal positive power as $x \to \infty$, and of the order of $x$ to an infinitesimal negative power as $x \to 0$. In contrast, one can think, crudely, of $e^x$ as being of the order of $x$ to an arbitrarily large positive power as $x \to \infty$, and $e^{-x}$ as being of the order of $x$ to an arbitrarily large negative power as $x \to \infty$.
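To see how late the exponential decay in (25a) can take over, it helps to work with logarithms so that nothing overflows. The following sketch uses the values $\alpha = 100$ and $\beta = 0.01$ from Example 3; the function name is illustrative.

```python
import math


def log_magnitude(x: float, alpha: float = 100.0, beta: float = 0.01) -> float:
    """Natural log of x^alpha * exp(-beta * x), computed without overflow."""
    return alpha * math.log(x) - beta * x


if __name__ == "__main__":
    # The product peaks where alpha / x = beta, i.e. at x = alpha / beta = 10^4.
    for x in (1e2, 1e4, 1e5, 1e6):
        print(f"x = {x:.0e}: ln(x^100 e^(-0.01 x)) = {log_magnitude(x):.1f}")
```

The logarithm is still rising at $x = 10^2$, peaks at $x = 10^4$, and is hugely negative by $x = 10^6$, which is the sense in which the exponential "wins" in the end.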
When considering the relative strength of functions as $x$ tends to some value $x_0$, constant scale factors are of no consequence no matter how large or small they may be. For instance, $(87\ln x)/x^{0.01} \to 0$ as $x \to \infty$ just as $(\ln x)/x^{0.01}$ does. Thus, in place of the asymptotic notation $f(x) \sim g(x)$ as $x \to x_0$, which means that $f(x)/g(x) \to 1$ as $x \to x_0$, we will sometimes use the "big oh" notation
$$f(x) = O(g(x)) \quad \text{as } x \to x_0 \tag{26a}$$
to mean that*
$$f(x) \sim C g(x) \quad \text{as } x \to x_0 \tag{26b}$$
for some finite nonzero constant $C$. For instance, whereas
$$f(x) = \frac{\sqrt{1+\sqrt{3}}}{\Gamma(1+\sqrt{5})\,\sqrt{x}} + 4.673\ln x \sim \frac{\sqrt{1+\sqrt{3}}}{\Gamma(1+\sqrt{5})}\,x^{-1/2}$$
as $x \to 0$, it is simpler to write $f(x) = O(x^{-1/2})$ as $x \to 0$. That is, the scale factor $C = \sqrt{1+\sqrt{3}}/\Gamma(1+\sqrt{5})$ can be omitted insofar as the order of magnitude of $f$ is concerned. In words, we say that $f$ is big oh of $x^{-1/2}$ as $x \to 0$. Of course, $x_0$ can be any point in (26); often, but not always, $x_0$ is 0 or $\infty$.

*Actually, the notation (26a) means that $f(x)/g(x)$ is bounded as $x \to x_0$. Though our usage is more restricted, it is consistent with the definition just given, for if (26b) holds, then surely $f(x)/g(x)$ is bounded as $x \to x_0$. Though more restricted, our definition (26b) of (26a) is sufficient for our purposes and is easier to understand and use.
As one more illustration of the big oh notation, observe from the Taylor series
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots \tag{28}$$
that each of the following is true:
$$\sin x = O(x), \tag{29a}$$
$$\sin x = x + O(x^3), \tag{29b}$$
$$\sin x = x - \frac{x^3}{6} + O(x^5), \tag{29c}$$
and so on, as $x \to 0$. For instance, l'Hôpital's rule shows that $(\sin x)/x \to 1$ as $x \to 0$, so $\sin x \sim x$; hence (26b) holds, with $C = 1$, so (29a) is correct. Similarly, l'Hôpital's rule shows that $(\sin x - x)/x^3 \to -1/6$, so $\sin x - x \sim -x^3/6$; hence $\sin x - x = O(x^3)$ or $\sin x = x + O(x^3)$, so (29b) is correct.
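The limits behind (29a) and (29b) are easy to observe numerically. This is a small illustrative sketch; the function names are not from the text.

```python
import math


def leading_ratio(x: float) -> float:
    """(sin x) / x, which should tend to C = 1 in (26b)."""
    return math.sin(x) / x


def scaled_remainder(x: float) -> float:
    """(sin x - x) / x^3, which should tend to -1/6."""
    return (math.sin(x) - x) / x**3


if __name__ == "__main__":
    for x in (1e-1, 1e-2, 1e-3):
        print(x, leading_ratio(x), scaled_remainder(x))
```

Much smaller values of $x$ are best avoided in floating point, since the subtraction $\sin x - x$ then loses most of its significant digits.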
The big oh notation is especially useful in working with series. For instance, (29b) states that if we retain only the leading term of the Taylor series (28), then the error thereby incurred is of order $O(x^3)$. Put differently, the portion omitted,
$$-\frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots,$$
is simply $O(x^3)$.
Closure. In Section 4.5.1, we define singular integrals as integrals in which something is infinite: one or both integration limits and/or the integrand. We make such integrals meaningful by defining them as limits of regular integrals. Just as the convergence and divergence of infinite series is a long story, so is the convergence and divergence of singular integrals, but our aim here is to consider only types that will arise in this text. Though the convergence of singular integrals of the type $\int_a^\infty f(x)\,dx$ and of infinite series $\sum_{n=1}^\infty a_n$ bear a strong resemblance (e.g., the p-series and horizontal p-integral both converge for $p > 1$ and diverge for $p \le 1$), one should by no means expect all results about infinite series to merely carry over. For instance, for series convergence it is necessary (but not sufficient) that $a_n \to 0$ as $n \to \infty$, but it is not necessary for the convergence of $\int_a^\infty f(x)\,dx$ that $f(x) \to 0$ as $x \to \infty$. For instance, we state without proof that $\int_0^\infty \sin(x^2)\,dx$ converges, even though $\sin(x^2)$ does not tend to zero as $x \to \infty$.
In Section 4.5.2, we introduce a specific and useful singular integral, the gamma function, and obtain its recursion formula and some of its values. The exercises indicate some of its many applications.
In the final section, 4.5.3, our aim is to clarify the relative orders of magnitude of exponential, algebraic, and logarithmic functions. It is important for you to be familiar with the results listed in Theorem 4.5.6, just as you are familiar with the relative weights of cannonballs and feathers. We also introduce a simple big oh notation which is especially useful in Chapter 6 on the numerical integration of differential equations.
Computer software. Many integrals can be evaluated by symbolic computer software. With Maple, for instance, the relevant command is int. To evaluate the integral $I$ in Example 6, for instance, enter
    int(t^(2/3)*exp(-t^(1/2)), t = 0..infinity);
and return. The result is
$$\frac{112}{81}\,\frac{\pi\sqrt{3}}{\Gamma(2/3)},$$
which looks different from the result obtained in Example 6 but is actually equivalent to that result (Exercise 12). To evaluate the latter, enter
    evalf(");
and return. The result is 5.556316963.
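The equivalence of the two closed forms can also be confirmed numerically outside Maple. The sketch below uses Python's standard `math.gamma` as a stand-in; the function names are illustrative, and the agreement rests on the reflection formula $\Gamma(1/3)\,\Gamma(2/3) = \pi/\sin(\pi/3)$, which is (17.2) of the exercises with $x = 1/3$.

```python
import math


def example6_value() -> float:
    """2 * Gamma(10/3), the form obtained in Example 6."""
    return 2.0 * math.gamma(10.0 / 3.0)


def maple_form_value() -> float:
    """(112/81) * pi * sqrt(3) / Gamma(2/3), the form quoted from Maple."""
    return (112.0 / 81.0) * math.pi * math.sqrt(3.0) / math.gamma(2.0 / 3.0)


if __name__ == "__main__":
    print(example6_value(), maple_form_value())
```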
EXERCISES
4.5
1. If ~> Oand @ > 0 show that, no matter how large a is and
(h) fx
COSZar5
no matter how small G is,
|
of
2. If a > 0, showthatno matterhow small a is,
) [
(a) (Inz)/xe* +0
(b)a*/Ina->O0
asx — oo
ast
∙∙
0
3. Show whether the given integral converges or diverges. As
usual, be sure to explain your reasoning.
(a)
&
a
c
~ Jo
(e)i 0
(f)
0
nec’
HINT:ShowthatIna < 2/4 for all suffi-
ciently large x, by showing that (Inz)/a!/4
2 Ined
J/g
+ 0 as x -+ 00.
HINT: Make the change of variables
av
Ve
7 =
∞dx/x?
o
.
.
.
.
integral,andstatewhethertheresultingintegralis singularor
Caen
vt +100
dav
not.
(a)I
Pa
ee
a=
Va.
of BES
er
2 Ine dz
a) |
converge? Explain.
C
6. Enter the indicated change of variables in the given singular
(e) "© sin? a dx
g
4, Show whether the given integral converges or diverges.
5. For what p’s, if any,does [
∞∶ at 2
“aw
dz
9 wt +2
a [
r2 COSL
1/€ andusethehint given in part(a).
‘ 2
0,
(b)
dx
a
(b)
~~ dx
4
HINT: Let€é = a —1.
end
a
asx —oo
(a) e%e7 8" +0
(b) a7 %e8* -3 00 «asa 4 00
TY
g
b *daLe
°) o Va’
∕
1
es =
;dx
E
wt+2
“cosa
aoe 1
dx
g
too
2
(d)
°° cos x dx
,
ve
11. Evaluate as many of the integrals in Exercise 10 as possible using computer software.
ie
’
7. For what range of a’s (such as 0 < a < 2,a@ > 4, no a’s,
etc.) does the given integral converge? Explain.
°°
a
(a) [
dz
(6) 0, vit]
2
o%sined
df
bP
) [| ue
c
7)
4 0 de
‘b
ve +3
12. In Example 6 we obtained the value $2\,\Gamma(10/3)$. Using
Maple, instead, show that the result is $(112/81)\,\pi\sqrt{3}/\Gamma(2/3)$.
x sin
sina
dx
« da
. 05
4\e 7,
Then, use any formulas given in this section or in these exercises to show that the two results are equivalent.
13. Deduce, from the formula given in Exercise
[O(a), that
I'(z) ~ 1/e as x tendsto zero throughpositivevalues.
x®dz
14, (Beta function) Derive the result
8. Evaluate, using a suitable recursion formula and the known
value $\Gamma(1/2) = \sqrt{\pi}$. Repeat the evaluation using computer
1
B(p,q) =
software.
“
(a)(3.5)
(b)P(—3.5)
—(c)F(6.5)
9. Derive (21), that $\Gamma(1/2) = \sqrt{\pi}$.
(1/2)
=2|
J0
(d)(0.5)
HINT: Show that
0
show that
=4
o
f
cau
meee
Jo
(14.1)
(14.2)
zr?'e~*dz,
[(p) = [
en du,
= af
CigI(p)
0
D(p+q)’
for p > 0, q > 0; B(p,q) is known as the beta function.
HINT: Putting « = wu?in
so that
[P(i/2)]? =4f
= P(p)l'(q)
a? (1 —2)tdr
0
e
0
ew”du
~(ua? +07)
du dv.
Regarding the latter as a double integral in a Cartesian u,v
plane, change from z, v to polar coordinates r, 0. The resulting double integral should be easier to evaluate.
10. Show by suitable change of variables that
0
da |
wete
0
v@le-"
dy, (14.3)
Regarding the latter as a double integral in a Cartesian u,v
plane, change from w, v to polar coordinates r, #. Making one
more change of variables in each integral, the r integral gives
['(p + q) and the @integral gives B(p, q).
15. Derive, from (14.1) above, the alternative forms:
(a)
B(p,9) =|
pet
8OO
Gaore
@
(15.1)
(p>0, q>0)
p
0
n!
7
ro
(p > 0)
HINT: Seta = ¢/(1 +t) in (14.1).
nf2
B(p,q) = 2f
(b)
(b) z™ (Ina)"dx = (—1)" oy
Jo
(m+ 1)r*t
vy
—x
4
vail
cos??~! 9 sin??~! @dO
(15.2)
(p>0, q >0)
(m,n nonnegative integers)
~
0
16. Using any results from the preceding two exercises, show
that
ie
(a)
JO
1
+1 q4+l1
(16.1)
cos#sin?0d0= =B(2 2° i) 2
(p>-1,
a
¢>-1)
(b) /
om[2
0
tan?0d0= /
om{2
0
cot?@d0
ia
lt+p1-p
— 38 (=.
L i °
x
=)
a
|
29 Jo Vcos@ — cos 4
~ Deosbe
2
T/4
dt,
0
(18.2)
where T' is the period and @)is the maximum swing. We
expect T' to depend on 09, so we denote it as T'(99). For the
(16.2) case0) = 7/2,showthat
for —1 < p< 1. HINT: You may use (17.1), below.
°
(c)
adr
(1+ aye
[
1
atl
cb-a-1
52 ( b ?
b
r (24)
_l
fora > -1,b>0,be-a>1.
b
nl P(1/4)
T(n/2) =
)
(16.3)
(#=¢-4)
T(c)
↕
∶
↨
9 T/A)
NOTE: You mayuseresultsfromtheprecedingexercises.
(b) At first glance,it appearsfrom (18.2)thatT'(@9)> 0 as
9 — 0. Is that assessment correct? Explain.
17. It can be shown,from theresiduetheoremof thecomplex 19. Let F(x) = 4/(1 + a7) = 4-42?
integralcalculus,that
G(e)
2+ 32 (x)
7x —2+1 (a)
= pgp
PMO yg
tM
0
ger!
ine"
u
Gao
(0<a<1)
(17.1) J(z)i = Net
(a) F(z)
Using this result, (15.1), and (14.1), show that
T(e)P(1
aa @) => ne
7
(0 <ac<
1)
(17.2)
18. (Period of oscillation of pendulum) Conservation of energy
dictates that the angular displacement $\theta(t)$ of a pendulum, of
length $l$ and mass $m$, satisfies the differential equation
1 (4)
=m
(l@)+mg(l—Icos@) =
j
) + mg
3a,, andK(z) Fin
each:
F(x)
(d) ae
=4—
32 Verif the truthof
y
asa 30
=O(1)
asx 0
(b)F(z) =4+O0(2?)
3)
4 4at—.-..,
2x —3lnz
ae
4x2
~ Ae
as
(f) H(z) = O(r)
(g) H(z) =O(1)
(h)I(z) =O(a7!)
+ O (x*)
ase
asx
3
0
0
ee
asx — 00
asx 70
asa —oo
l—Icos@).
(18.1
"
°
)=mg(i—leos).(18-1)
Fa)
O(n) asa40
(a)From(18.1),showthat
j) J(z) =O(x)
(k) J(xz) =O(1)
(1)K(z) =O(1)
4.6
ast — co
asx 0
asx—-0
Bessel Functions
The differential equation
$$x^2y'' + xy' + (x^2 - \nu^2)y = 0, \tag{1}$$
where $\nu$ is a nonnegative real number, is known as Bessel's equation of order $\nu$. The equation was studied by Friedrich Wilhelm Bessel (1784-1846), director of the astronomical observatory at Königsberg, in connection with his work on planetary motion. Outside of planetary motion, the equation appears prominently in a wide range of applications such as steady and unsteady diffusion in cylindrical regions, and one-dimensional wave propagation and diffusion in variable cross-section media, and it is one of the most important differential equations in mathematical physics. Dividing through by the leading coefficient $x^2$, we see from $p(x) = 1/x$ and $q(x) = (x^2 - \nu^2)/x^2$ that there is one singular point, $x = 0$, and that it is a regular singular point because $xp(x) = 1$ and $x^2q(x) = x^2 - \nu^2$ are analytic at $x = 0$.
4.6.1. $\nu \ne$ integer. Consider the case where the parameter $\nu$ is not an integer. Seeking a Frobenius solution about the regular singular point $x = 0$,
$$y(x) = \sum_{k=0}^\infty a_k x^{k+r}, \qquad (a_0 \ne 0) \tag{2}$$
gives (Exercise 1)
$$\sum_{k=0}^\infty \left\{\left[(k+r)^2 - \nu^2\right]a_k + a_{k-2}\right\}x^{k+r} = 0, \tag{3}$$
where $a_0 \ne 0$ and $a_{-2} = a_{-1} \equiv 0$. Equating to zero the coefficient of each power of $x$ in (3) gives
$$k = 0: \quad (r^2 - \nu^2)\,a_0 = 0, \tag{4a}$$
$$k = 1: \quad \left[(r+1)^2 - \nu^2\right]a_1 = 0, \tag{4b}$$
$$k \ge 2: \quad \left[(r+k)^2 - \nu^2\right]a_k + a_{k-2} = 0. \tag{4c}$$
Since $a_0 \ne 0$, (4a) gives the indicial equation $r^2 - \nu^2 = 0$, with the distinct roots $r = \pm\nu$. First, let $r = +\nu$. Then (4b) gives $a_1 = 0$ and (4c) gives the recursion relation
$$a_k = -\frac{1}{k(k+2\nu)}\,a_{k-2}. \qquad (k \ge 2) \tag{5}$$
From (5), together with the fact that $a_1 = 0$, it follows that $a_1 = a_3 = a_5 = \cdots = 0$ and that
$$a_{2k} = \frac{(-1)^k}{2^{2k}\,k!\,(\nu+k)(\nu+k-1)\cdots(\nu+1)}\,a_0. \tag{6}$$
If $\nu$ were an integer, then the ever-growing product $(\nu+k)(\nu+k-1)\cdots(\nu+1)$ could be simplified into closed form as $(\nu+k)!/\nu!$. But since $\nu$ is not an integer, we seek to accomplish such simplification by means of the generalized factorial function, the gamma function. If we recall the gamma function recursion formula $\Gamma(x) = (x-1)\,\Gamma(x-1)$, then
$$\Gamma(\nu+k+1) = (\nu+k)\,\Gamma(\nu+k) = (\nu+k)(\nu+k-1)\,\Gamma(\nu+k-1) = \cdots = (\nu+k)(\nu+k-1)\cdots(\nu+1)\,\Gamma(\nu+1),$$
which gives $(\nu+k)(\nu+k-1)\cdots(\nu+1) = \Gamma(\nu+k+1)/\Gamma(\nu+1)$. With this replacement, (6) becomes
$$a_{2k} = \frac{(-1)^k\,\Gamma(\nu+1)}{2^{2k}\,k!\,\Gamma(\nu+k+1)}\,a_0, \tag{7}$$
so we have the solution
$$y(x) = a_0 2^\nu\,\Gamma(\nu+1)\sum_{k=0}^\infty \frac{(-1)^k}{k!\,\Gamma(\nu+k+1)}\left(\frac{x}{2}\right)^{2k+\nu}. \tag{8}$$
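The passage from the recursion (5) to the closed form (7) can be spot-checked by generating the coefficients both ways. This is a minimal sketch with illustrative function names; $\nu = 0.3$ and $a_0 = 1$ are arbitrary test values, not taken from the text.

```python
import math


def coefficients_by_recursion(nu: float, kmax: int, a0: float = 1.0) -> list:
    """a_0, ..., a_kmax for the root r = +nu, using a_1 = 0 and the recursion (5)."""
    a = [0.0] * (kmax + 1)
    a[0] = a0
    for k in range(2, kmax + 1):
        a[k] = -a[k - 2] / (k * (k + 2.0 * nu))
    return a


def coefficient_closed_form(nu: float, k: int, a0: float = 1.0) -> float:
    """a_{2k} from the gamma-function form (7)."""
    return ((-1) ** k * math.gamma(nu + 1.0) * a0
            / (2 ** (2 * k) * math.factorial(k) * math.gamma(nu + k + 1.0)))


if __name__ == "__main__":
    nu = 0.3
    a = coefficients_by_recursion(nu, 10)
    for k in range(6):
        print(k, a[2 * k], coefficient_closed_form(nu, k))
```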
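Dropping the $a_0 2^\nu\,\Gamma(\nu+1)$ scale factor, we call the resulting solution the Bessel function of the first kind, of order $\nu$:
$$J_\nu(x) = \left(\frac{x}{2}\right)^\nu \sum_{k=0}^\infty \frac{(-1)^k}{k!\,\Gamma(\nu+k+1)}\left(\frac{x}{2}\right)^{2k}. \tag{9}$$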
To obtain a second linearly independent solution, we turn to the other indicial root, $r = -\nu$. There is no need to retrace all of our steps; all we need to do is to change $\nu$ to $-\nu$ everywhere on the right side of (9). Denoting the result as $J_{-\nu}(x)$, the Bessel function of the first kind, of order $-\nu$, we have
$$J_{-\nu}(x) = \left(\frac{x}{2}\right)^{-\nu} \sum_{k=0}^\infty \frac{(-1)^k}{k!\,\Gamma(k-\nu+1)}\left(\frac{x}{2}\right)^{2k}. \tag{10}$$
Both series, (9) and (10), converge for all $x$, as follows from Theorem 4.3.1 with $R_1 = R_2 = \infty$ or from Theorem 4.2.2 and the recursion formula (5). The leading terms of the series in (9) and (10) are constants times $x^\nu$ and $x^{-\nu}$, respectively, so neither of the solutions $J_\nu(x)$ and $J_{-\nu}(x)$ is a scalar multiple of the other. Thus, they are LI, and we conclude that
$$y(x) = AJ_\nu(x) + BJ_{-\nu}(x) \tag{11}$$
is a general solution of (1).
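Writing (9) and (10) as
$$J_\nu(x) = x^\nu\left[\frac{1}{\Gamma(\nu+1)\,2^\nu} - \frac{x^2}{\Gamma(\nu+2)\,2^{\nu+2}} + \cdots\right], \tag{12}$$
$$J_{-\nu}(x) = x^{-\nu}\left[\frac{1}{\Gamma(1-\nu)\,2^{-\nu}} - \frac{x^2}{\Gamma(2-\nu)\,2^{2-\nu}} + \cdots\right], \tag{13}$$
and noting that the power series within the square brackets tend to $1/[\Gamma(\nu+1)2^\nu]$ and $1/[\Gamma(1-\nu)2^{-\nu}]$, respectively, as $x \to 0$, we see that $J_\nu(x) \sim \{1/[\Gamma(\nu+1)2^\nu]\}\,x^\nu$ and $J_{-\nu}(x) \sim \{1/[\Gamma(1-\nu)2^{-\nu}]\}\,x^{-\nu}$ as $x \to 0$. It is simpler and more concise to use the big oh notation introduced in Section 4.5.3, and say that $J_\nu(x) = O(x^\nu)$ and $J_{-\nu}(x) = O(x^{-\nu})$ as $x \to 0$. Thus, the $J_\nu(x)$'s tend to zero and the $J_{-\nu}(x)$'s tend to infinity as $x \to 0$. As representative, we have plotted $J_{1/2}(x)$ and $J_{-1/2}(x)$ in Fig. 1. In fact, for the half-integer values $\nu = \pm 1/2, \pm 3/2, \pm 5/2, \ldots$ the series in (9) and (10) can be shown to represent elementary functions. For instance (Exercise 5),
$$J_{1/2}(x) = \sqrt{\frac{2}{\pi x}}\,\sin x, \qquad J_{-1/2}(x) = \sqrt{\frac{2}{\pi x}}\,\cos x. \tag{14a,b}$$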
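The closed forms (14a,b) offer a convenient test of the series (9) and (10). The sketch below sums the series directly with `math.gamma`; the function name and the truncation at 40 terms are assumptions made for illustration, and the routine is meant only for noninteger $\nu$ and $x > 0$.

```python
import math


def bessel_j_series(nu: float, x: float, terms: int = 40) -> float:
    """J_nu(x) from the series (9); passing a negative noninteger nu gives (10)."""
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k / (math.factorial(k) * math.gamma(nu + k + 1.0))
                  * (x / 2.0) ** (2 * k))
    return (x / 2.0) ** nu * total


if __name__ == "__main__":
    for x in (0.5, 2.0, 5.0):
        amplitude = math.sqrt(2.0 / (math.pi * x))
        print(x,
              bessel_j_series(0.5, x), amplitude * math.sin(x),    # (14a)
              bessel_j_series(-0.5, x), amplitude * math.cos(x))   # (14b)
```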
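4.6.2. $\nu =$ integer. If $\nu$ is a positive integer $n$, then $\Gamma(\nu+k+1) = \Gamma(n+k+1) = (n+k)!$ in (9), so we have from (9) the solution
$$J_n(x) = \sum_{k=0}^\infty \frac{(-1)^k}{k!\,(k+n)!}\left(\frac{x}{2}\right)^{2k+n} \tag{15}$$
of (1). For instance,
$$J_0(x) = 1 - \frac{x^2}{2^2} + \frac{x^4}{2^4(2!)^2} - \frac{x^6}{2^6(3!)^2} + \cdots, \tag{16a}$$
$$J_1(x) = \frac{x}{2} - \frac{x^3}{2^3\,2!} + \frac{x^5}{2^5\,2!\,3!} - \frac{x^7}{2^7\,3!\,4!} + \cdots. \tag{16b}$$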
We need to be careful with (10) because if $\nu = n$, then the $\Gamma(k-n+1)$ in (10) is, we recall from Section 4.5.2, undefined when its argument is zero or a negative integer, namely, for $k = 0, 1, \ldots, n-1$. One could say that $\Gamma(k-n+1)$ is infinite for those $k$'s, so $1/\Gamma(k-n+1)$ equals zero for $k = 0, 1, \ldots, n-1$, and it equals $1/(k-n)!$ for $k = n, n+1, \ldots$, in which case (10) becomes
$$J_{-n}(x) = \sum_{k=n}^\infty \frac{(-1)^k}{k!\,(k-n)!}\left(\frac{x}{2}\right)^{2k-n}. \tag{17}$$
[The resulting equation (17) is correct, but our reasoning was not rigorous since $\Gamma(k-n+1)$ is undefined at $k = 0, 1, \ldots, n-1$, rather than "$\infty$." A rigorous line of approach is suggested in Exercise 10.] Replacing the dummy summation index $k$ by $m$ according to $k - n = m$,
$$J_{-n}(x) = \sum_{m=0}^\infty \frac{(-1)^{m+n}}{(m+n)!\,m!}\left(\frac{x}{2}\right)^{2m+n}.$$
If $(-1)^n$ is factored out, the series that remains is the same as that given in (15), so that
$$J_{-n}(x) = (-1)^n J_n(x). \tag{18}$$
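Relation (18) can be confirmed by summing (15) and (17) independently. This is a minimal sketch; the function names and the 40-term truncation are illustrative choices.

```python
import math


def bessel_j_n(n: int, x: float, terms: int = 40) -> float:
    """J_n(x) for integer n >= 0 from the series (15)."""
    return sum((-1) ** k / (math.factorial(k) * math.factorial(k + n))
               * (x / 2.0) ** (2 * k + n)
               for k in range(terms))


def bessel_j_minus_n(n: int, x: float, terms: int = 40) -> float:
    """J_{-n}(x) from the series (17), in which the sum starts at k = n."""
    return sum((-1) ** k / (math.factorial(k) * math.factorial(k - n))
               * (x / 2.0) ** (2 * k - n)
               for k in range(n, n + terms))


if __name__ == "__main__":
    x = 1.7
    for n in (1, 2, 3):
        print(n, bessel_j_minus_n(n, x), (-1) ** n * bessel_j_n(n, x))
```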
The result is that $J_n(x)$ and $J_{-n}(x)$ are linearly dependent, since (18) tells us that one is a scalar multiple of the other. Thus, whereas $J_\nu(x)$ and $J_{-\nu}(x)$ are LI and give the general solution (11) if $\nu$ is not an integer, we have only one linearly independent solution thus far for the case where $\nu = n$, namely, $y_1(x) = J_n(x)$ given by (15). To obtain a second LI solution $y_2(x)$ we rely on Theorem 4.3.1.
Let us begin with $n = 0$. Then we have the case of repeated indicial roots, $r = \pm n = \pm 0$, which corresponds to case (ii) of that theorem. Accordingly, we seek $y_2(x)$ in the form
$$y_2(x) = J_0(x)\ln x + \sum_{k=1}^\infty c_k x^k. \tag{19}$$
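Doing so, we can evaluate the $c_k$'s, and we obtain
$$y_2(x) = J_0(x)\ln x + \left(\frac{x}{2}\right)^2 - \left(1+\frac{1}{2}\right)\frac{1}{(2!)^2}\left(\frac{x}{2}\right)^4 + \left(1+\frac{1}{2}+\frac{1}{3}\right)\frac{1}{(3!)^2}\left(\frac{x}{2}\right)^6 - \cdots, \tag{20}$$
which is called $\mathbf{Y}_0(x)$, the Neumann function of order zero. Thus, Theorem 4.3.1 leads us to the two LI solutions $y_1(x) = J_0(x)$ and $y_2(x) = \mathbf{Y}_0(x)$, so we can use them to form a general solution of (1). However, following Weber, it proves to be convenient and standard to use, in place of $\mathbf{Y}_0(x)$, a linear combination of $J_0(x)$ and $\mathbf{Y}_0(x)$, namely,
$$y_2(x) = \frac{2}{\pi}\left[\mathbf{Y}_0(x) + (\gamma - \ln 2)\,J_0(x)\right] \equiv Y_0(x), \tag{21}$$
where
$$Y_0(x) = \frac{2}{\pi}\left(\ln\frac{x}{2} + \gamma\right)J_0(x) + \frac{2}{\pi}\left[\frac{x^2}{2^2} - \left(1+\frac{1}{2}\right)\frac{x^4}{2^4(2!)^2} + \left(1+\frac{1}{2}+\frac{1}{3}\right)\frac{x^6}{2^6(3!)^2} - \cdots\right] \tag{22}$$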
is Weber's Bessel function of the second kind, of order zero; $\gamma = 0.5772157$ is known as Euler's constant and is sometimes written as $C$, and $Y_0(x)$ is sometimes written as $N_0(x)$. The graphs of $J_0(x)$ and $Y_0(x)$ are shown in Fig. 2. Important features are that $J_0(x)$ and $Y_0(x)$ look a bit like damped cosine and sine functions, except that $Y_0(x)$ tends to $-\infty$ as $x \to 0$. Specifically, we see from (16a) and (22) that
$$J_0(x) \sim 1, \qquad Y_0(x) \sim \frac{2}{\pi}\ln x \tag{23a,b}$$
as $x \to 0$, and it can be shown (Exercise 6) that
$$J_0(x) \sim \sqrt{\frac{2}{\pi x}}\,\cos\left(x - \frac{\pi}{4}\right), \qquad Y_0(x) \sim \sqrt{\frac{2}{\pi x}}\,\sin\left(x - \frac{\pi}{4}\right) \tag{24a,b}$$
as $x \to \infty$. Indeed, we can see from (24) why the Weber Bessel function $Y_0$ is a nicer companion for $J_0$ than the Neumann Bessel function $\mathbf{Y}_0$, for
$$\mathbf{Y}_0(x) \sim \frac{1}{\sqrt{\pi x}}\left[\left(\frac{\pi}{2} - \gamma + \ln 2\right)\sin x - \left(\frac{\pi}{2} + \gamma - \ln 2\right)\cos x\right] \tag{24c}$$
as $x \to \infty$; surely (24b) makes a nicer companion for (24a) than does (24c).
Figure 2. $J_0$ and $Y_0$.
It might appear, from Fig. 2 and (24), that the zeros of $J_0$ and $Y_0$ [i.e., the roots of $J_0(x) = 0$ and $Y_0(x) = 0$] are equally spaced, but they are not; they approach an equal spacing only as $x \to \infty$. For instance, the first several zeros of $J_0$ are 2.405, 5.520, 8.654, 11.792, 14.931. Their differences are 3.115, 3.134, 3.138, 3.139, and these are seen to rapidly approach a constant [namely, $\pi$, the spacing between the zeros of $\cos(x - \pi/4)$ in (24a)]. The zeros of the various Bessel functions turn out to be important, and they are tabulated to many significant figures.
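The quoted zeros of $J_0$ and their spacings can be reproduced from the series (16a) with a simple bisection search. This is only a sketch: the function names, the 60-term truncation, and the unit-length bracketing intervals are assumptions chosen for this demonstration, and tabulated values should be preferred in practice.

```python
import math


def j0(x: float, terms: int = 60) -> float:
    """J_0(x) from the series (16a), building each term from the previous one."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= -(x / 2.0) ** 2 / ((k + 1) ** 2)
    return total


def bisect(func, lo: float, hi: float, tol: float = 1e-12) -> float:
    """Root of func in [lo, hi], assuming func changes sign on the interval."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_lo < 0.0) == (f_mid < 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def first_zeros_of_j0() -> list:
    """First five positive zeros, each bracketed by an interval of length 1."""
    return [bisect(j0, a, a + 1.0) for a in (2.0, 5.0, 8.0, 11.0, 14.0)]


if __name__ == "__main__":
    zeros = first_zeros_of_j0()
    print([round(z, 3) for z in zeros])
    print([round(b - a, 3) for a, b in zip(zeros, zeros[1:])], math.pi)
```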
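For $n = 1, 2, \ldots$ the indicial roots $r = \pm n$ differ by an integer, which corresponds to case (iii) of Theorem 4.3.1. Using that theorem, and the ideas given above for $Y_0$, we obtain Weber's Bessel function of the second kind, of order $n$,
$$Y_n(x) = \frac{2}{\pi}\left[\left(\ln\frac{x}{2} + \gamma\right)J_n(x) - \frac{1}{2}\sum_{k=0}^{n-1}\frac{(n-k-1)!}{k!}\left(\frac{x}{2}\right)^{2k-n} + \frac{1}{2}\sum_{k=0}^\infty \frac{(-1)^{k+1}\left[\phi(k) + \phi(k+n)\right]}{k!\,(k+n)!}\left(\frac{x}{2}\right)^{2k+n}\right], \tag{25}$$
which formula holds for $n = 0, 1, 2, \ldots$; $\phi(0) \equiv 0$ and $\phi(k) \equiv 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{k}$ for $k \ge 1$.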
4.6.3. General solution of Bessel equation. Thus, we have two different general solution forms for (1), depending on whether $\nu$ is an integer or not: $y(x) = AJ_\nu(x) + BJ_{-\nu}(x)$ if $\nu$ is not an integer, and $y(x) = AJ_n(x) + BY_n(x)$ if $\nu = n = 0, 1, 2, \ldots$. It turns out that if we define
$$Y_\nu(x) \equiv \frac{(\cos\nu\pi)\,J_\nu(x) - J_{-\nu}(x)}{\sin\nu\pi} \tag{26}$$
for noninteger $\nu$, then the limit of $Y_\nu(x)$ as $\nu \to n$ ($n = 0, 1, 2, \ldots$) gives the same result as (25). Furthermore, $J_\nu(x)$ and $Y_\nu(x)$ are LI (Exercise 1), so the upshot is that we can express the general solution of (1) as
$$y(x) = AJ_\nu(x) + BY_\nu(x) \tag{27}$$
for all values of $\nu$, with $Y_\nu$ defined by (25) and (26) for integer and noninteger values of $\nu$, respectively. The graphs of several $J_n$'s and $Y_n$'s are shown in Fig. 3.
For reference, we cite the following asymptotic behavior:
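$$J_n(x) \sim \frac{1}{n!}\left(\frac{x}{2}\right)^n, \qquad (n = 0, 1, 2, \ldots) \tag{28a}$$
$$Y_n(x) \sim \begin{cases} \dfrac{2}{\pi}\ln x, & (n = 0) \\[2mm] -\dfrac{(n-1)!}{\pi}\left(\dfrac{2}{x}\right)^n, & (n = 1, 2, \ldots) \end{cases} \tag{28b}$$
as $x \to 0$, and
$$J_n(x) \sim \sqrt{\frac{2}{\pi x}}\,\cos(x - \omega_n), \qquad (n = 0, 1, 2, \ldots) \tag{29a}$$
$$Y_n(x) \sim \sqrt{\frac{2}{\pi x}}\,\sin(x - \omega_n), \qquad (n = 0, 1, 2, \ldots) \tag{29b}$$
as $x \to \infty$, where $\omega_n = (2n+1)\pi/4$. Observe the sort of conservation expressed in (28a,b): as $n$ increases, the $Y_n$'s develop stronger singularities ($\ln x$, $x^{-1}$, $x^{-2}$, ...) while the $J_n$'s develop stronger zeros ($1$, $x$, $x^2$, ...). (We say that $x^3$ has a stronger zero at the origin than $x^2$, for example, because $x^3/x^2 \to 0$ as $x \to 0$.) Finally, we call attention to the interlacing of the zeros of $J_n$ and $Y_n$. All of these features can be seen in Fig. 3.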
In summary, our key result is the general solution (27) with $J_\nu$ given by (9), $Y_\nu$ by (25) for integer $\nu$ and by (26) for noninteger $\nu$, and with $J_{-\nu}$ in (26) given by (10).
4.6.4. Hankel functions. (Optional) Recall that the harmonic oscillator equation $y'' + y = 0$ has two preferred bases: $\cos x$, $\sin x$ and $e^{ix}$, $e^{-ix}$. Usually, the former is used because those functions are real valued, but sometimes the complex exponentials are more convenient. The connection between them is given by Euler's formulas:
$$e^{ix} = \cos x + i\sin x, \qquad e^{-ix} = \cos x - i\sin x.$$
Analogous to the complex basis $e^{ix}$, $e^{-ix}$ for the equation $y'' + y = 0$, a complex-valued basis is defined for the Bessel equation (1):
$$H_\nu^{(1)}(x) \equiv J_\nu(x) + iY_\nu(x), \tag{30a}$$
$$H_\nu^{(2)}(x) \equiv J_\nu(x) - iY_\nu(x). \tag{30b}$$
These are called the Hankel functions of the first and second kind, respectively, of order $\nu$. Thus, alternatively to (27), we have the general solution
$$y(x) = AH_\nu^{(1)}(x) + BH_\nu^{(2)}(x) \tag{31}$$
of (1).
As a result of (29a,b), the Hankel functions $H_n^{(1)}(x)$, $H_n^{(2)}(x)$ have the pure complex exponential behavior
$$H_n^{(1)}(x) \sim \sqrt{\frac{2}{\pi x}}\,e^{i(x-\omega_n)}, \tag{32a}$$
$$H_n^{(2)}(x) \sim \sqrt{\frac{2}{\pi x}}\,e^{-i(x-\omega_n)} \tag{32b}$$
as $x \to \infty$.
The Hankel functions are particularly useful in the study of wave propagation.
4.6.5. Modified Bessel equation. Besides the Bessel equation of order $\nu$, one also encounters the modified Bessel equation of order $\nu$, $x^2y'' + xy' + (-x^2 - \nu^2)\,y = 0$, where the only difference is the minus sign in front of the second $x^2$ term. Let us limit our attention, for brevity, to the case where $\nu$ is an integer $n$, so we have
$$x^2y'' + xy' + (-x^2 - n^2)\,y = 0. \tag{33}$$
The change of variables $t = ix$ (or $x = -it$) converts (33) to the Bessel equation
$$t^2Y'' + tY' + (t^2 - n^2)\,Y = 0, \tag{34}$$
where $y(x) = y(-it) \equiv Y(t)$ and the primes on $Y$ denote $d/dt$. Since a general solution to (34) is $Y(t) = AJ_n(t) + BY_n(t)$ we have, immediately, the general solution
$$y(x) = AJ_n(ix) + BY_n(ix) \tag{35}$$
of (33). From (15),
$$J_n(ix) = \sum_{k=0}^\infty \frac{(-1)^k}{k!\,(k+n)!}\left(\frac{ix}{2}\right)^{2k+n} = i^n \sum_{k=0}^\infty \frac{1}{k!\,(k+n)!}\left(\frac{x}{2}\right)^{2k+n}, \tag{36}$$
so we can absorb the $i^n$ into $A$ and be left with the real-valued solution
$$I_n(x) \equiv i^{-n}J_n(ix) = \sum_{k=0}^\infty \frac{1}{k!\,(k+n)!}\left(\frac{x}{2}\right)^{2k+n}, \tag{37}$$
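The identity in (37) can be checked by evaluating the series (15) at the imaginary argument $ix$ with complex arithmetic. This is an illustrative sketch; the function names and the 40-term truncation are assumptions, not part of the text.

```python
import math


def bessel_j_n(n: int, z: complex, terms: int = 40) -> complex:
    """J_n(z) from the series (15), evaluated for a complex argument z."""
    return sum((-1) ** k / (math.factorial(k) * math.factorial(k + n))
               * (z / 2.0) ** (2 * k + n)
               for k in range(terms))


def modified_i_n(n: int, x: float, terms: int = 40) -> float:
    """I_n(x) from the real series on the right-hand side of (37)."""
    return sum(1.0 / (math.factorial(k) * math.factorial(k + n))
               * (x / 2.0) ** (2 * k + n)
               for k in range(terms))


def i_n_from_j_n(n: int, x: float) -> complex:
    """i^(-n) J_n(ix), the left-hand side of (37)."""
    return (1j) ** (-n) * bessel_j_n(n, 1j * x)


if __name__ == "__main__":
    for n in range(4):
        print(n, i_n_from_j_n(n, 1.3), modified_i_n(n, 1.3))
```

The imaginary part of `i_n_from_j_n` comes out as zero up to rounding, which is the sense in which $I_n$ is the real-valued solution singled out in the text.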
known as the modified Bessel function of the first kind, and order $n$. In place of $Y_n(ix)$ it is standard to introduce, as a second real-valued solution, the modified Bessel function of the second kind, and order $n$,
$$K_n(x) \equiv \frac{\pi}{2}\,i^{n+1}\left[J_n(ix) + iY_n(ix)\right]. \tag{38}$$
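For instance,
$$I_0(x) = 1 + \frac{x^2}{2^2} + \frac{x^4}{2^4(2!)^2} + \frac{x^6}{2^6(3!)^2} + \cdots, \tag{39a}$$
$$K_0(x) = -\left(\ln\frac{x}{2} + \gamma\right)I_0(x) + \frac{x^2}{2^2} + \left(1+\frac{1}{2}\right)\frac{x^4}{2^4(2!)^2} + \left(1+\frac{1}{2}+\frac{1}{3}\right)\frac{x^6}{2^6(3!)^2} + \cdots, \tag{39b}$$
and the graphs of these functions are plotted in Fig. 4.
As a general solution of (33) we have
$$y(x) = AI_n(x) + BK_n(x). \tag{40}$$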
Whereas the Bessel functions are oscillatory, the modified Bessel functions are not. To put the various Bessel functions in perspective, observe that the relationship between the $J$, $Y$ solutions of the Bessel equation and the $I$, $K$ solutions of the modified Bessel equation is analogous to that between the solutions $\cos x$, $\sin x$ of the harmonic oscillator equation $y'' + y = 0$ and the $\cosh x$, $\sinh x$ solutions of the "modified harmonic oscillator" equation $y'' - y = 0$. For instance, just as $\cos(ix) = \cosh(x)$ and $\sin(ix) = i\sinh(x)$, $I_n(x)$ and $K_n(x)$ are linear combinations of $J_n(ix)$ and $Y_n(ix)$.
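Finally, the asymptotic behavior of $I_n$ and $K_n$ is as follows:
$$I_n(x) \sim \frac{1}{n!}\left(\frac{x}{2}\right)^n, \qquad (n = 0, 1, 2, \ldots) \tag{41a}$$
$$K_n(x) \sim \begin{cases} -\ln x, & (n = 0) \\[2mm] \dfrac{(n-1)!}{2}\left(\dfrac{2}{x}\right)^n, & (n = 1, 2, \ldots) \end{cases} \tag{41b}$$
as $x \to 0$, and
$$I_n(x) \sim \frac{1}{\sqrt{2\pi x}}\,e^x, \qquad (n = 0, 1, 2, \ldots) \tag{42a}$$
$$K_n(x) \sim \sqrt{\frac{\pi}{2x}}\,e^{-x} \qquad (n = 0, 1, 2, \ldots) \tag{42b}$$
as $x \to \infty$. As $n$ increases, the $I_n$'s develop stronger zeros at $x = 0$ ($1$, $x$, $x^2$, ...) while the $K_n$'s develop stronger singularities there ($\ln x$, $x^{-1}$, $x^{-2}$, ...).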
4.6.6. Equations reducible to Bessel equations. The results discussed in Sections
4.6.1-4.6.5 are all the more important because there are many equations which, although they are not Bessel equations or modified Bessel equations, can be reduced
to Bessel or modified Bessel equations by changes of variables and then solved in
closed form in terms of Bessel or modified Bessel functions.
EXAMPLE 1. Solve
$$x y'' + y' + \kappa^2 x y = 0. \tag{43}$$
Equation (43) is not quite a Bessel equation of order zero because of the $\kappa^2$. Let us try to
absorb the $\kappa^2$ by a change of variable. Specifically, scale $x$ as $t = \alpha x$, where $\alpha$ is to be
determined. Then
$$\frac{d}{dx} = \frac{d}{dt}\frac{dt}{dx} = \alpha\frac{d}{dt}, \quad\text{so}\quad \frac{d^2}{dx^2} = \alpha^2\frac{d^2}{dt^2},$$
and (43) becomes, after division by $\alpha$,
$$t\,Y'' + Y' + \frac{\kappa^2}{\alpha^2}\, t\, Y = 0, \tag{44}$$
where $y(x) = y(t/\alpha) \equiv Y(t)$ and primes on $Y$ denote $d/dt$. Thus, we can absorb the $\kappa^2$ in (44) by choosing $\alpha = \kappa$.
Then (44) is a Bessel equation of order zero with general solution $Y(t) = A J_0(t) + B Y_0(t)$,
so
$$y(x) = A J_0(t) + B Y_0(t) = A J_0(\kappa x) + B Y_0(\kappa x) \tag{45}$$
is a general solution of (43).
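The result of Example 1 can be spot-checked numerically. The sketch below is illustrative only: it assumes the standard series $J_0(x) = \sum_k (-1)^k (x/2)^{2k}/(k!)^2$, approximates the derivatives by central differences, and evaluates the left side of $x y'' + y' + \kappa^2 x y = 0$ for $y(x) = J_0(\kappa x)$. The names `bessel_j0` and `residual` are not from the text.

```python
import math

def bessel_j0(x, terms=40):
    """J_0(x) from the series sum_k (-1)^k (x/2)^(2k) / (k!)^2."""
    return sum((-1) ** k * (x / 2) ** (2 * k) / math.factorial(k) ** 2
               for k in range(terms))

def residual(kappa, x, h=1e-3):
    """Left side of x y'' + y' + kappa^2 x y for y(x) = J_0(kappa x),
    with y' and y'' replaced by central-difference approximations."""
    y = lambda s: bessel_j0(kappa * s)
    d1 = (y(x + h) - y(x - h)) / (2 * h)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / h ** 2
    return x * d2 + d1 + kappa ** 2 * x * y(x)

# The residual should be tiny compared with the individual terms,
# limited only by the finite-difference error of order h^2.
print(residual(kappa=3.0, x=1.5))
```

A small residual is consistent with, but of course does not prove, the claim that $J_0(\kappa x)$ solves the equation; the scaling argument in the example is the actual justification.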
More generally, the equation
$$\frac{d}{dx}\left(x^{a}\frac{dy}{dx}\right) + b\,x^{c}\,y = 0, \tag{46}$$
where $a, b, c$ are real numbers, can be transformed to a Bessel equation by transforming both independent and dependent variables. Because of the powers of $x$ in
(46), it seems promising to change variables from $x, y(x)$ to $t, u(t)$ according to
the forms $t = Ax^{B}$, $u = x^{C}y$, and to try to find $A, B, C$ so that the new equation,
on $u(t)$, is a Bessel equation. It turns out that that plan works and one finds that
under the change of variables
$$t = \alpha\sqrt{|b|}\;x^{1/\alpha} \quad\text{and}\quad u = x^{-\nu/\alpha}\,y \tag{47}$$
equation (46) becomes the Bessel equation of order $\nu$,
$$t^2\frac{d^2u}{dt^2} + t\frac{du}{dt} + \left(t^2 - \nu^2\right)u = 0, \tag{48}$$
(or, if $b < 0$, the modified Bessel equation, in which the sign of the $t^2$ term is reversed)
if we choose
$$\alpha = \frac{2}{c - a + 2} \quad\text{and}\quad \nu = \frac{1 - a}{c - a + 2}. \tag{49}$$
[The latter is meaningless if $c - a + 2 = 0$, but in that case (46) is merely a Cauchy-Euler equation.] Thus, if $Z_\nu$ denotes any Bessel function solution of (48), then
putting (47) into $u(t) = Z_\nu(t)$ and solving for $y$ gives the solution
$$y(x) = x^{\nu/\alpha}\, Z_\nu\!\left(\alpha\sqrt{|b|}\;x^{1/\alpha}\right) \tag{50}$$
of (46). If $b > 0$, then $Z_\nu$ denotes $J_\nu$ and $Y_\nu$, and if $b < 0$, then $Z_\nu$ denotes $I_\nu$
and $K_\nu$ (though we gave formulas for $I_\nu$ and $K_\nu$ only for the case where $\nu$ is an
integer).
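The bookkeeping in (49) and (50) is mechanical, so it can be captured in a few lines of code. The following is a minimal sketch under the assumption that the equation has already been put in the form $(x^a y')' + b x^c y = 0$; the function name `bessel_reduction` is hypothetical. Exact rational arithmetic is used so that values such as $4/5$ and $2/5$ come out exactly.

```python
from fractions import Fraction

def bessel_reduction(a, b, c):
    """For (x^a y')' + b x^c y = 0, return (alpha, nu, kind), where
    alpha = 2/(c - a + 2), nu = (1 - a)/(c - a + 2), and kind names the
    pair of Bessel functions that Z_nu stands for in
    y = x^(nu/alpha) Z_nu(alpha sqrt|b| x^(1/alpha))."""
    a, c = Fraction(a), Fraction(c)
    denom = c - a + 2
    if denom == 0:
        raise ValueError("c - a + 2 = 0: the equation is of Cauchy-Euler type")
    if b == 0:
        raise ValueError("b = 0: no Bessel reduction is needed")
    alpha = 2 / denom
    nu = (1 - a) / denom
    kind = "J, Y" if b > 0 else "I, K"
    return alpha, nu, kind

# y'' + 3 sqrt(x) y = 0  corresponds to  a = 0, b = 3, c = 1/2
print(bessel_reduction(0, 3, Fraction(1, 2)))
# x y'' + 3 y' + y = 0   corresponds to  a = 3, b = 1, c = 2
print(bessel_reduction(3, 1, 2))
```

The two calls correspond to the equations treated in Examples 2 and 3 of this section.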
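EXAMPLE 2. Solve the equation
$$y'' + 3\sqrt{x}\,y = 0. \qquad (0 < x < \infty) \tag{51}$$
Comparing (51) with (46), we see that $a = 0$, $b = 3$, and $c = 1/2$, so $\alpha = 2/(1/2 - 0 + 2) = 4/5$ and $\nu = 1/(1/2 - 0 + 2) = 2/5$. Thus, (50) gives
$$y(x) = x^{1/2}\, Z_{2/5}\!\left(\tfrac{4}{5}\sqrt{3}\,x^{5/4}\right) \tag{52}$$
and
$$y(x) = \sqrt{x}\left[A\,J_{2/5}\!\left(\tfrac{4}{5}\sqrt{3}\,x^{5/4}\right) + B\,Y_{2/5}\!\left(\tfrac{4}{5}\sqrt{3}\,x^{5/4}\right)\right] \tag{53}$$
is a general solution of (51).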
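EXAMPLE 3. Solve
$$x y'' + 3y' + y = 0, \qquad (0 < x < \infty) \tag{54}$$
or
$$y'' + \frac{3}{x}y' + \frac{1}{x}y = 0. \tag{55}$$
Writing out (46) as $x^{a}y'' + a x^{a-1}y' + b x^{c}y = 0$ or
$$y'' + \frac{a}{x}y' + b\,x^{c-a}y = 0, \tag{56}$$
and comparing (56) and (55) term by term gives $a = 3$, $b = 1$, and $c - a = -1$, so $c = 2$.
Hence, $\alpha = 2/(2 - 3 + 2) = 2$ and $\nu = (1 - 3)/(2 - 3 + 2) = -2$, so (50) becomes
$$y(x) = x^{-1} Z_{-2}\!\left(2\sqrt{1}\,x^{1/2}\right) = x^{-1} Z_{2}\!\left(2\sqrt{x}\right) \tag{57}$$
and
$$y(x) = \frac{1}{x}\left[A\,J_2\!\left(2\sqrt{x}\right) + B\,Y_2\!\left(2\sqrt{x}\right)\right] \tag{58}$$
is a general solution of (54).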
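As a sanity check on Example 3, one can substitute the $J_2$ part of the solution, $y(x) = J_2(2\sqrt{x})/x$, back into $x y'' + 3y' + y = 0$ numerically. The sketch below assumes the standard series $J_n(x) = \sum_k (-1)^k (x/2)^{2k+n}/[k!\,(k+n)!]$ for integer $n \ge 0$ and uses central differences for the derivatives; the helper names are illustrative, not from the text.

```python
import math

def bessel_j(n, x, terms=40):
    """J_n(x), integer n >= 0, from sum_k (-1)^k (x/2)^(2k+n) / (k! (k+n)!)."""
    return sum((-1) ** k * (x / 2) ** (2 * k + n)
               / (math.factorial(k) * math.factorial(k + n))
               for k in range(terms))

def y(x):
    """Candidate solution with A = 1, B = 0: y = J_2(2 sqrt(x)) / x."""
    return bessel_j(2, 2 * math.sqrt(x)) / x

def residual(x, h=1e-3):
    """Left side of x y'' + 3 y' + y with central-difference derivatives."""
    d1 = (y(x + h) - y(x - h)) / (2 * h)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / h ** 2
    return x * d2 + 3 * d1 + y(x)

for x in (0.5, 2.0, 5.0):
    # Each residual should be near zero, up to finite-difference error.
    print(x, residual(x))
```

The $Y_2$ part is not checked here because its series involves a logarithmic term; a library implementation would be the natural tool for that.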
NOTE: In the second equality of (57) we changed the $Z_{-2}$ to $Z_2$. More generally, if
the $\nu$ that we compute in (49) turns out to be negative we can always change the $Z_\nu$
in (50) to $Z_{|\nu|}$, for if $\nu$ is a negative integer $-n$, then the $Z_\nu$ in (50) gives $J_{-n}$ and
$Y_{-n}$; but (18) told us that $J_{-n}$ is identical to $J_n$, to within a constant scale factor,
and it can likewise be shown that $Y_{-n}$ is identical to $Y_n$, to within a constant scale
factor [namely, $Y_{-n}(x) = (-1)^n Y_n(x)$]. And if $\nu$ is negative but not an integer,
then the $Z_\nu$ in (50) gives $J_\nu$ and $J_{-\nu}$, and that is equivalent to $Z_{-\nu}$ giving $J_{-\nu}$ and
$J_\nu$.
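EXAMPLE 4. Solve
$$(x y')' - 4x^3 y = 0. \tag{59}$$
We see from (46) that $a = 1$, $b = -4$, $c = 3$, so $\alpha = 1/2$, $\nu = 0$ and
$$y(x) = x^{0}\, Z_0\!\left(\tfrac{1}{2}\sqrt{|-4|}\;x^{2}\right),$$
so
$$y(x) = A\,I_0\!\left(x^2\right) + B\,K_0\!\left(x^2\right) \tag{60}$$
is a general solution of (59).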
Closure. In this section we studied the Bessel equation
$$x^2 y'' + x y' + \left(x^2 - \nu^2\right) y = 0 \tag{61}$$
and the modified Bessel equation
$$x^2 y'' + x y' + \left(-x^2 - \nu^2\right) y = 0. \tag{62}$$
For heuristic purposes, it is useful to keep in mind the similarity between the Bessel
equation and the harmonic oscillator equation
$$y'' + y = 0, \tag{63}$$
and between the modified Bessel equation and the "modified harmonic oscillator"
equation
$$y'' - y = 0. \tag{64}$$
For large $x$, the left side of (61), after division by $x^2$, becomes
$$y'' + \frac{1}{x}y' + \left(1 - \frac{\nu^2}{x^2}\right)y \approx y'' + \frac{1}{x}y' + y, \tag{65}$$
so we expect qualitative similarity between the solutions of (61) and those of (63).
In fact, the solutions $J_\nu(x)$ and $Y_\nu(x)$ of (61) do tend to harmonic functions as
$x \to \infty$, like the cosine and sine solutions of (63), and the $y'/x$ term in (65) causes
some damping of those harmonic functions, by a factor of $1/\sqrt{x}$. Thus, the general
solution
$$y(x) = A J_\nu(x) + B Y_\nu(x) \tag{66}$$
of (61) is similar, qualitatively, to that of (63). Further, just as one can use pure
complex exponential solutions of (63) according to the Euler definitions, one can
introduce the Hankel functions in essentially the same way, and write the general
solution of (61), alternatively, as
$$y(x) = A H_\nu^{(1)}(x) + B H_\nu^{(2)}(x). \tag{67}$$
Likewise, for the modified Bessel equation (62), the left side of which (after division by $x^2$) becomes
$$y'' + \frac{1}{x}y' - \left(1 + \frac{\nu^2}{x^2}\right)y \approx y'' + \frac{1}{x}y' - y \tag{68}$$
for large $x$, we find nonoscillatory solutions analogous to the hyperbolic cosine and
sine solutions of (64).
So much for large $x$. As $x \to 0$, the $Y_\nu$ solutions of (61) are unbounded, as are
the $K_\nu$ solutions of (62).
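The claim that the Bessel functions behave like damped harmonic functions for large $x$ can be illustrated numerically. The sketch below compares $J_0(x)$, computed from its power series, with the standard leading-order large-$x$ form $\sqrt{2/(\pi x)}\cos(x - \pi/4)$. That explicit form is quoted here as a known result rather than derived, and the function names are illustrative. The series is adequate in double precision only for moderate $x$ (roughly $x \lesssim 25$) because of cancellation between large alternating terms.

```python
import math

def bessel_j0(x, terms=80):
    """J_0(x) from the series sum_k (-1)^k (x/2)^(2k) / (k!)^2.
    Cancellation limits accuracy for large x; fine for x up to about 25."""
    return sum((-1) ** k * (x / 2) ** (2 * k) / math.factorial(k) ** 2
               for k in range(terms))

def j0_large_x(x):
    """Leading-order large-x form: a cosine damped by the factor 1/sqrt(x)."""
    return math.sqrt(2 / (math.pi * x)) * math.cos(x - math.pi / 4)

for x in (5.0, 10.0, 20.0):
    # The two columns should agree more closely as x increases.
    print(x, bessel_j0(x), j0_large_x(x))
```

The agreement improving with $x$ reflects the fact that the neglected $\nu^2/x^2$ term in (65) becomes less important as $x$ grows.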
Computer software. As a general rule of thumb, if we can derive a solution to a
given differential equation by hand, we can probably obtain it using computer software. For instance, if, using Maple, we attempt to solve the nonconstant-coefficient
differential equation (54) by the command
dsolve(x*diff(y(x), x, x) + 3*diff(y(x), x) + y(x) = 0, y(x));
we do obtain the same general solution as was obtained here in Example 3.
EXERCISES 4.6
1. Putting the solution form (2) into the Bessel equation (1), derive the recursion relation (3).
2. … Frobenius. Show that your two LI solutions can be expressed in closed form as given in (14a,b).
3. Show that with $Y_0(x)$ defined by (21), the asymptotic behavior given in (24b) follows from (24a) and (24c).
4. (Recursion formulas) It can be shown that
$$\frac{d}{dx}\left[x^{\nu}Z_\nu(x)\right] = \begin{cases} x^{\nu}Z_{\nu-1}(x), & (Z = J, Y, I, H^{(1)}, H^{(2)}) \\ -x^{\nu}Z_{\nu-1}(x), & (Z = K) \end{cases} \tag{4.1}$$
and
$$\frac{d}{dx}\left[x^{-\nu}Z_\nu(x)\right] = \begin{cases} -x^{-\nu}Z_{\nu+1}(x), & (Z = J, Y, K, H^{(1)}, H^{(2)}) \\ x^{-\nu}Z_{\nu+1}(x), & (Z = I) \end{cases} \tag{4.2}$$
where the "$Z = \cdots$" means that the formula holds with $Z$ equal to each of the itemized functions. In particular, the formula
$$\frac{d}{dx}Z_0(x) = \begin{cases} -Z_1(x), & (Z = J, Y, K, H^{(1)}, H^{(2)}) \\ Z_1(x), & (Z = I) \end{cases} \tag{4.3}$$
corresponding to (4.2) for the case $\nu = 0$, is useful in evaluating certain Bessel function integrals by integration by parts.
(a) Verify (4.1) and (4.2) for the case where $Z$ is $J$ (i.e., $J_\nu$, $J_{-\nu}$, and $J_n$) using the series given in the text for those functions.
(b) From the formulas given above, show that
$$Z_{\nu+1}(x) = \frac{2\nu}{x}Z_\nu(x) - Z_{\nu-1}(x), \qquad (Z = J, Y, H^{(1)}, H^{(2)}) \tag{4.4}$$
$$I_{\nu+1}(x) = -\frac{2\nu}{x}I_\nu(x) + I_{\nu-1}(x), \tag{4.5}$$
$$K_{\nu+1}(x) = \frac{2\nu}{x}K_\nu(x) + K_{\nu-1}(x). \tag{4.6}$$
(c) Use computer software to differentiate $x^3J_3(x)$, $x^2Y_2(x)$, $x^{-5}I_5(x)$, $x^{-2}K_2(x)$, $J_0(x)$, $Y_0(x)$, $I_0(x)$, and $K_0(x)$, and show that the results are in accord with the formulas (4.1)-(4.3).
5. (Half-integer formulas)
(a) Putting $\nu = 1/2$ in (9) and (10), show that they give
$$J_{1/2}(x) = \sqrt{\frac{2}{\pi x}}\,\sin x \tag{5.1}$$
and
$$J_{-1/2}(x) = \sqrt{\frac{2}{\pi x}}\,\cos x. \tag{5.2}$$
(b) Derive, from (4.4), the recursion formula
$$J_{n+1/2}(x) = \frac{2n-1}{x}J_{n-1/2}(x) - J_{n-3/2}(x). \tag{5.3}$$
(c) It follows from (5.1)-(5.3) that all $J_{n+1/2}$'s are expressible in closed form in terms of $\sin x$, $\cos x$, and powers of $x$. Derive those expressions for $J_{3/2}(x)$ and $J_{-3/2}(x)$.
6. (a) (Normal form) By making the change of variables $y(x) = \sigma(x)v(x)$, from $y$ to $v$, in $y'' + p(x)y' + q(x)y = 0$, show that the first derivative term can be eliminated by choosing $\sigma$ such that $2\sigma' + p\sigma = 0$. Show that the result is the normal form (i.e., canonical or simplest form)
$$v'' + \left(q - \frac{p'}{2} - \frac{p^2}{4}\right)v = 0, \tag{6.1}$$
where
$$\sigma(x) = e^{-\frac{1}{2}\int p(x)\,dx}. \tag{6.2}$$
(b) (Large-x behavior of Bessel functions) For the Bessel equation (1), show that $\sigma(x) = 1/\sqrt{x}$ and that (6.1) is
$$v'' + \left(1 - \frac{\nu^2 - 1/4}{x^2}\right)v = 0. \tag{6.3}$$
NOTE: If we write $1 - (\nu^2 - 1/4)/x^2 \approx 1$ for large $x$, then (6.3) becomes $v'' + v \approx 0$, so we expect that $v(x) \approx A\cos(x + \phi)$ or, equivalently, $A\sin(x + \phi)$, where $A$ and $\phi$ are arbitrary constants. Thus,
$$y(x) = \sigma(x)v(x) \approx \frac{A}{\sqrt{x}}\cos(x + \phi) \quad\text{or}\quad \frac{A}{\sqrt{x}}\sin(x + \phi), \tag{6.4}$$
which forms are the same as those given by (24a) and (24b). Thus, we expect every solution of (1) to behave according to (6.4) as $x \to \infty$. Evaluating the constants $A$ and $\phi$ corresponding to a particular solution, such as $J_0(x)$ or $Y_0(x)$, is complicated and will not be discussed here.
7. Recall from Example 1 that $J_n(\kappa x)$ satisfies the differential equation $x^2y'' + xy' + (\kappa^2x^2 - n^2)y = 0$ or, equivalently,
$$(xy')' + \left(\kappa^2 x - \frac{n^2}{x}\right)y = 0. \qquad (n = 0, 1, 2, \ldots) \tag{7.1}$$
Let the $x$ interval be $0 < x < c$, and suppose that $\kappa$ is chosen so that $J_n(\kappa c) = 0$; i.e., $\kappa c$ is any of the zeros of $J_n(x) = 0$. The purpose of this exercise is to derive the formula
$$\int_0^c \left[J_n(\kappa x)\right]^2 x\,dx = \frac{c^2}{2}\left[J_{n+1}(\kappa c)\right]^2, \tag{7.2}$$
which will be needed when we show how to use the Sturm-Liouville theory to expand functions in terms of Bessel functions. In turn, that concept will be needed later in our study of partial differential equations. To derive (7.2), we suggest the following steps.
(a) Multiplying (7.1) by $2xy'$ and integrating on $x$ from 0 to $c$, obtain
$$\left.(xy')^2\right|_0^c + 2\int_{x=0}^{x=c}\left(\kappa^2x^2 - n^2\right)y\,dy = 0. \tag{7.3}$$
(b) Show that with $y = J_n(\kappa x)$, the $(xy')^2$ term is zero at $x = 0$ for $n = 0, 1, 2, \ldots$, and that at $x = c$ it is $c^2\kappa^2\left[J_{n+1}(\kappa c)\right]^2$. HINT: It follows from (4.2) that $J_n'(x) = -J_{n+1}(x) + \frac{n}{x}J_n(x)$.
(c) Thus, show that (7.3) reduces to
$$c^2\kappa^2\left[J_{n+1}(\kappa c)\right]^2 + 2\kappa^2\int_{x=0}^{x=c}x^2y\,dy - \left.n^2y^2\right|_{x=0}^{x=c} = 0. \tag{7.4}$$
(d) Show that the $n^2y^2$ term is zero for any $n = 0, 1, 2, \ldots$, integrate the remaining integral by parts and show that the resulting boundary term is zero, and thus obtain the desired result (7.2).
8. (Generating function for $J_n$) Just as there is a "generating function" for the Legendre polynomials [see (9) in Section 4.4], there is one for the Bessel functions. Specifically, it can be shown that
$$e^{\frac{x}{2}\left(t - \frac{1}{t}\right)} = \sum_{n=-\infty}^{\infty}J_n(x)\,t^n, \tag{8.1}$$
and the left-hand side is called the generating function for the $J_n$'s.
(a) We do not ask you to derive (8.1) but only to verify the $n = 0$ term. That is, expanding $e^{xt/2}$ and $e^{-x/2t}$ in Maclaurin series (one in ascending powers of $t$ and one in powers of $1/t$) and multiplying these series together, show that the coefficient of $t^0$ is $J_0(x)$.
(b) Equation (8.1) is useful for deriving various properties of the $J_n$'s. For example, taking $\partial/\partial x$ of both sides, show that
$$\frac{d}{dx}J_n(x) = \frac{1}{2}\left[J_{n-1}(x) - J_{n+1}(x)\right]. \qquad (n = 1, 2, \ldots) \tag{8.2}$$
(c) Similarly, taking $\partial/\partial t$ of both sides, show that
$$J_{n+1}(x) = \frac{x}{2(n+1)}\left[J_n(x) + J_{n+2}(x)\right]. \tag{8.3}$$
(d) Using computer software, differentiate $J_0(x)$ and $J_1(x)$ and show that the results agree with (8.2).
9. (Integral representation of $J_n$) Besides the generating function (preceding exercise), another source of information about the $J_n$'s is the integral representation
$$J_n(x) = \frac{1}{\pi}\int_0^{\pi}\cos\left(n\theta - x\sin\theta\right)d\theta. \tag{9.1}$$
Verify (9.1) for the case $n = 0$ by using the Taylor series of $\cos t$, where $t = n\theta - x\sin\theta$, and integrating term by term. HINT: You may use any of the formulas given in the exercises to Section 4.5.
10. To derive (17) from (10), we argued that $1/\Gamma(k - n + 1) = 0$ for $k = 0, 1, \ldots, n - 1$, on the grounds that for those $k$'s $\Gamma(k - n + 1)$ is infinite. However, while it is true that the gamma function becomes infinite as its argument approaches $0, -1, -2, \ldots$, it is not rigorous to say that it is infinite at those points; it is simply undefined there. Here, we ask you to verify that the $k = 0, 1, \ldots, n - 1$ terms are zero so that the correct lower limit in (17) is, indeed, $k = n$. For definiteness, let $\nu = 3$ and $r = -\nu = -3$. (You should then be able to generalize the result for the case of any positive integer $n$, but we do not ask you to do that; $\nu = 3$ will suffice.) HINT: Rather than work from (17), go back to the formulas (4a,b,c).
11. It was stated, below (26), that $J_\nu$ and $Y_\nu$ are LI. Prove that
claim. HINT: Use (25) for $\nu = n$ and (26) for $\nu \ne n$.
12. Each differential equation is given on $0 < x < \infty$. Use
(50) to obtain a general solution. Gamma functions that appear
need not be evaluated.
(a)y" +42°y= 0
(c) cy” —2y' + cy =0
(e)y"+ Vey =0
(g)cy" +3y'—xy =0
(i)day”+2y!+cy =0
(k)ay” +y' —9a7y= 0
(b)ay” ~2y/—2?y= 0
(d)4y" + Say = 0
(f)y" —cy =0
(h)day”+y =0
(j)ty" +dy!~day=0
() y’ + 2y =0
13. (a)-(l) Solve the corresponding problems in Exercise 12, this time using computer software.
14. (a) Use (50) to find a general solution of
$$x y'' + 3y' + 9xy = 0. \qquad (0 < x < \infty)$$
(b) Find a particular solution satisfying the boundary conditions $y(0) = 6$, $y'(0) = 0$.
(c) Show that there is no particular solution satisfying the initial conditions $y(0) = 6$, $y'(0) = 2$. Does that result contradict the result stated in Theorem 3.3.1? Explain.
15. Use (50) to solve $y'' + 4y = 0$, and show that your solution agrees with the known elementary solution. You may use any results given in these exercises.
16. (Lateral vibration of hanging rope) Consider a flexible rope or chain that hangs from the ceiling under the sole action of gravity (see the accompanying sketch). If we pull the rope to one side and let go, it will oscillate from side to side in a complicated pattern which amounts to a superposition of many different modes, each having a specific shape $Y(x)$ and temporal frequency $\omega$. It can be shown (from Newton's second law of motion) that each shape $Y(x)$ is governed by the differential equation
$$\left[\rho g(l - x)Y'\right]' + \rho\omega^2 Y = 0, \qquad (0 < x < l) \tag{16.1}$$
where $\rho$ is the mass per unit length and $g$ is the acceleration of gravity.
(a) Derive the general solution
$$Y(x) = A\,J_0\!\left(\frac{2\omega}{\sqrt{g}}\sqrt{l - x}\right) + B\,Y_0\!\left(\frac{2\omega}{\sqrt{g}}\sqrt{l - x}\right) \tag{16.2}$$
of (16.1). HINT: It may help to first make the change of variables $l - x = \xi$. NOTE: Observe from (16.2) that the displacement $Y$ will be unbounded at the free end $x = l$ because of the logarithmic singularity in $Y_0$ when its argument is zero (namely, when $x = l$). Mathematically, that singularity can be traced to the vanishing of the coefficient $\rho g(l - x)$ in (16.1) at $x = l$, which vanishing introduces a regular singular point of (16.1) at $x = l$ and results in the logarithmic singularity in the solution (16.2). Physically, observe that the coefficient $\rho g(l - x)$ in (16.1) represents the tension in the rope. The greater the tension the smaller the displacement (as anyone who has strung a clothesline knows). Hence the vanishing of the tension $\rho g(l - x)$ at the free end leads to the mathematical possibility of unbounded displacements there. In posing suitable boundary conditions, it is appropriate to preclude such unbounded displacements there by prescribing the boundary condition that $Y(l)$ be bounded; that is, a "boundedness condition." Imposing that condition implies that $B = 0$, so that the solution (16.2) reduces to $Y(x) = A\,J_0\!\left(\frac{2\omega}{\sqrt{g}}\sqrt{l - x}\right)$.
(b) As a second boundary condition, set $Y(0) = 0$. That condition does not lead to the evaluation of $A$ (which remains arbitrary); rather, it permits us to find the allowable temporal frequencies $\omega$. If the first three zeros of $J_0(x)$ are $x = 2.405$, $5.520$, and $8.654$, evaluate the first three frequencies $\omega$ (in terms of $g$ and $l$) and the corresponding mode shapes $Y(x)$ (to within the arbitrary scale factor $A$). Sketch those mode shapes by hand over $0 < x < l$.
(c) Use computer software to obtain the zeros quoted above (2.405, 5.520, 8.654), and to obtain computer plots of the three mode shapes. (Set $A = 1$, say.)
Chapter 4 Review
In this chapter we present methods for the solution of second-order homogeneous
differential equations with nonconstant coefficients.
The most important general results are Theorems 4.2.4 and 4.3.1, which guarantee specific forms of series solutions about ordinary and regular singular points,
respectively. About an ordinary point one can find two LI power series solutions
and hence the general solution. About a regular singular point, say $x = 0$, one can
find two LI solutions, by the method of Frobenius, in terms of power series and
power series modified by the multiplicative factors $|x|^r$ and/or $\ln|x|$, where $r$ is
found by solving a quadratic equation known as the indicial equation. The combination of these forms is dictated by whether the roots $r$ are repeated or distinct and,
if distinct, whether they differ by an integer or not. Note that the $|x|^r$ and $\ln|x|$
factors introduce singularities in the solutions (unless $r$ is a nonnegative integer).
Besides these general results, we meet these special functions:
Exponential integral (Section 4.3): $E_1(x) = \displaystyle\int_x^{\infty}\frac{e^{-t}}{t}\,dt, \quad (x > 0)$
Gamma function (Section 4.5): $\Gamma(x) = \displaystyle\int_0^{\infty}t^{x-1}e^{-t}\,dt, \quad (x > 0)$
and study these important differential equations and find solutions for them:
Legendre equation (Section 4.4): $(1 - x^2)y'' - 2xy' + \lambda y = 0$
Solutions that are bounded on $-1 \le x \le 1$ exist only if $\lambda = n(n + 1)$ for $n = 0, 1, 2, \ldots$,
and these are the Legendre polynomials $P_n(x)$:
$$P_0(x) = 1, \quad P_1(x) = x, \quad P_2(x) = \tfrac{1}{2}\left(3x^2 - 1\right), \ \ldots.$$
Bessel equation (Section 4.6): $x^2y'' + xy' + (x^2 - \nu^2)y = 0$
General solution:
$$y(x) = \begin{cases} A J_\nu(x) + B Y_\nu(x) \\ C H_\nu^{(1)}(x) + D H_\nu^{(2)}(x) \end{cases}$$
where $J_\nu$, $Y_\nu$ are Bessel functions of the first and second kind, respectively,
of order $\nu$, and $H_\nu^{(1)}$, $H_\nu^{(2)}$ are the Hankel functions of the first and second kind, respectively,
of order $\nu$.
Modified Bessel equation (Section 4.6): $x^2y'' + xy' + (-x^2 - \nu^2)y = 0$
For brevity, we consider only the case where $\nu = n$ is an integer.
General solution:
$$y(x) = A I_n(x) + B K_n(x),$$
where $I_n$, $K_n$ are modified Bessel functions of the first and second kinds, respectively, of order $n$.
NOTE: We suggest that to place the many Bessel function results in perspective
it is helpful to see the Bessel equation $x^2y'' + xy' + (x^2 - \nu^2)y = 0$ and the modified Bessel equation $x^2y'' + xy' + (-x^2 - \nu^2)y = 0$ as analogous to the harmonic
oscillator equation $y'' + y = 0$ and the "modified harmonic oscillator equation"
$y'' - y = 0$. For instance, the oscillatory $J_\nu(x)$ and $Y_\nu(x)$ solutions of the Bessel
equation are analogous to the oscillatory $\cos x$ and $\sin x$ solutions of $y'' + y = 0$,
and the complex Hankel function solutions $H_\nu^{(1)}(x)$ and $H_\nu^{(2)}(x)$ are analogous to
the complex $e^{ix}$ and $e^{-ix}$ solutions. Similarly, the nonoscillatory $I_\nu(x)$ and $K_\nu(x)$
solutions of the modified Bessel equation are analogous to the nonoscillatory $e^{x}$
and $e^{-x}$ solutions of the equation $y'' - y = 0$.
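Equations reducible to Bessel equations (Section 4.6): The equation
$$\frac{d}{dx}\left(x^{a}\frac{dy}{dx}\right) + b\,x^{c}\,y = 0,$$
where $a, b, c$ are real numbers, has solutions
$$y(x) = x^{\nu/\alpha}\,Z_{|\nu|}\!\left(\alpha\sqrt{|b|}\;x^{1/\alpha}\right),$$
where
$$\alpha = \frac{2}{c - a + 2}, \qquad \nu = \frac{1 - a}{c - a + 2}.$$
$Z_{|\nu|}$ denotes $J_{|\nu|}$ and $Y_{|\nu|}$ if $b > 0$, and $I_{|\nu|}$ and $K_{|\nu|}$ if $b < 0$.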
Chapter 5
Laplace Transform
5.1 Introduction
The Laplace transform is an example of an integral transform, namely, a relation
of the form
$$F(s) = \int_a^b f(t)\,K(t, s)\,dt, \tag{1}$$
which transforms a given function $f(t)$ into another function $F(s)$; $K(t, s)$ is
called the kernel of the transform, and $F(s)$ is known as the transform of $f(t)$.
Thus, whereas a function sends one number into another [for example, the function
$f(x) = x^2$ sends the point $x = 3$ on an $x$ axis into the point $f = 9$ on an $f$ axis],
(1) sends one function into another, namely, it sends $f(t)$ into $F(s)$. Probably
the most well known integral transform is the Laplace transform, where $a = 0$,
$b = \infty$, and $K(t, s) = e^{-st}$. In that case (1) takes the form
$$F(s) = \int_0^{\infty} f(t)\,e^{-st}\,dt. \tag{2}$$
The parameter $s$ can be complex, but we limit it to real values in this chapter.
Besides the notation $F(s)$ used in (2), the Laplace transform of $f(t)$ is also denoted
as $L\{f(t)\}$ or as $\bar{f}(s)$, and in a given application we will use whichever of these
three notations seems best.
The basic idea behind any transform is that the given problem can be solved
more readily in the "transform domain." To illustrate, consider the use of the natural
logarithm in numerical calculation. While the addition of two numbers is arithmetically simple, their multiplication can be quite laborious; for example, try working
out 2.761359 × 8.247504 by hand. Thus, given two positive numbers $u$ and $v$, suppose we wish to compute their product $y = uv$. Taking the logarithm of both sides
gives $\ln y = \ln uv$. But $\ln uv = \ln u + \ln v$, so we have $\ln y = \ln u + \ln v$. Thus,
whereas the original problem was one of multiplication, the problem in the "transform domain" is merely one of addition. The idea, then, is to look up $\ln u$ and $\ln v$
in a table and to add these two values. With the sum in hand, we again enter the table, this time using it in the reverse direction to find the antilog, $y$. (Of course, with
pocket calculators and computers available logarithm tables are no longer needed,
as they were fifty years ago, but the transform nature of the logarithm remains the
same, whether we use tables or not.)
Similarly, the logarithm reduces exponentiation to multiplication since if $y = u^v$, then $\ln y = \ln(u^v) = v \ln u$, and it reduces division to subtraction.
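The logarithm-as-transform idea can be made concrete in a few lines. The sketch below is only an illustration of the three steps (transform, solve the easier problem, invert), with `math.log` and `math.exp` playing the role of the table lookups; the function names are not from the text.

```python
import math

def multiply_via_logs(u, v):
    """Multiply two positive numbers by adding their logarithms."""
    log_sum = math.log(u) + math.log(v)   # work in the "transform domain"
    return math.exp(log_sum)              # antilog: back to the original domain

def power_via_logs(u, v):
    """Compute u**v for u > 0 by multiplying v and ln u."""
    return math.exp(v * math.log(u))

u, v = 2.761359, 8.247504
# The two numbers on each line should agree to within rounding error.
print(multiply_via_logs(u, v), u * v)
print(power_via_logs(u, 3.0), u ** 3.0)
```

The Laplace transform method for differential equations follows the same pattern: transform, solve an algebraic equation, then invert.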
Analogously, given a linear ordinary differential equation with constant coefficients, we see that if we take a Laplace transform of all terms in the equation then
we obtain a linear algebraic equation on the transform X(s) of the unknown function x(t). That equation can be solved for X(s) by simple algebra and the solution
x(t) obtained from a Laplace transform table. The method is especially attractive
for nonhomogeneous differential equations with forcing functions which are step
functions or impulse functions; we study those cases in Section 5.5.
Observe that we have departed from our earlier usage of $x$ as the independent
variable. Here we use $t$ and consider the interval $0 \le t < \infty$ because in most
(though not all) applications of the Laplace transform the independent variable is
the time $t$, with $0 \le t < \infty$.
A brief outline of this chapter follows:
5.2 Calculation of the Transform. In this section we study the existence of the
transform, and its calculation.
5.3 Properties of the Transform. Three properties of the Laplace transform are
discussed: linearity of the transform and its inverse, the transform of derivatives,
and the convolution theorem. These are crucial in the application of the method to
the solution of ordinary differential equations, homogeneous or not.
5.4 Application to the Solution of Differential Equations. Here, we demonstrate the principal application of the Laplace transform, namely, to the solution of
linear ordinary differential equations.
5.5 Discontinuous Forcing Functions; Heaviside Step Function. Discontinuous forcing functions are common in engineering and science. In this section we
introduce the Heaviside step function and demonstrate its use.
5.6 Impulsive Forcing Function; Dirac Impulse Function. Likewise common
are impulsive forcing functions such as the force imparted to a mass by a hammer
blow. In this section we introduce the Dirac delta function to model such impulsive
actions.
5.7 Additional Properties. There are numerous useful properties of the transform beyond the three discussed in Section 5.3. A number of these are given here,
as a sequence of theorems.
5.2 Calculation of the Transform
The first question to address is whether the transform $F(s)$ of a given function $f(t)$
exists, that is, whether the integral
$$F(s) = \int_0^{\infty} f(t)\,e^{-st}\,dt \tag{1}$$
converges. Before giving an existence theorem, we define two terms.
First, we say that $f(t)$ is of exponential order as $t \to \infty$ if there exist real
constants $K$, $c$, and $T$ such that
$$|f(t)| \le K e^{ct} \tag{2}$$
for all $t \ge T$. That is, the set of functions of exponential order is the set of functions that do not grow faster than exponentially, which includes most functions of
engineering interest.
EXAMPLE 1. Is $f(t) = \sin t$ of exponential order? Yes: $|\sin t| \le 1$ for all $t$, so (2)
holds with $K = 1$, $c = 0$, and $T = 0$. Of course, these values are not uniquely chosen, for
(2) holds also with $K = 7$, $c = 12$, and $T = 100$, for instance.
EXAMPLE 2. Is $f(t) = t^2$ of exponential order? l'Hôpital's rule gives
$$\lim_{t\to\infty}\frac{t^2}{e^{ct}} = \lim_{t\to\infty}\frac{2t}{c\,e^{ct}} = \lim_{t\to\infty}\frac{2}{c^2e^{ct}} = 0$$
if $c > 0$. Choose $c = 1$, say. Then, from the definition of limit, there must be a $T$ such that
$t^2/e^{t} < 0.06$, say, for all $t \ge T$. Thus, $|f(t)| = t^2 < 0.06\,e^{t}$ for all $t \ge T$, hence $f(t)$ is
of exponential order.
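The existence argument in Example 2 guarantees that a suitable $T$ exists without producing one. As a small illustration (not part of the text), the sketch below finds such a $T$ for the bound $t^2/e^{t} < 0.06$ by bisection. It relies on the fact that $t^2e^{-t}$ is decreasing for $t > 2$, since its derivative is $(2t - t^2)e^{-t}$; the function names are illustrative.

```python
import math

def ratio(t):
    """t^2 / e^t, the quantity bounded in Example 2 (with c = 1)."""
    return t ** 2 * math.exp(-t)

def find_T(bound=0.06, lo=2.0, hi=50.0, tol=1e-10):
    """Bisection on [lo, hi] for the point where ratio(t) falls to `bound`.
    Valid because ratio is decreasing for t > 2, ratio(lo) > bound > ratio(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ratio(mid) > bound:
            lo = mid
        else:
            hi = mid
    return hi  # ratio(hi) <= bound, and ratio keeps decreasing beyond hi

T = find_T()
print(T, ratio(T))
```

Any larger value of $T$ works equally well, which is consistent with the remark in Example 1 that the constants in (2) are not unique.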
On the other hand, the function $f(t) = e^{t^2}$ is not of exponential order because
$$\lim_{t\to\infty}\frac{e^{t^2}}{e^{ct}} = \lim_{t\to\infty}e^{t^2 - ct} = \infty, \tag{3}$$
no matter how large $c$ is.
We say that $f(t)$ is piecewise continuous on $a \le t \le b$ if there exist a finite
number of points $t_1, t_2, \ldots, t_N$ such that $f(t)$ is continuous on each open subinterval $a < t < t_1$, $t_1 < t < t_2$, ..., $t_N < t < b$, and has a finite limit as $t$ approaches
each endpoint from the interior of that subinterval. For instance, the function $f(t)$
shown in Fig. 1 is piecewise continuous on the interval $0 \le t \le 4$. The values of
$f$ at the endpoints $a, t_1, t_2, \ldots, b$ are not relevant to whether or not $f$ is piecewise
continuous; hence we have not even indicated those values in Fig. 1. For instance,
the limit of $f$ as $t$ tends to 2 from the left exists and is 5, and the limit of $f$ as $t$
tends to 2 from the right exists and is 10, so the value of $f$ at $t = 2$ does not matter.
Thus, piecewise continuity allows for the presence of jump discontinuities.
We can now provide a theorem that gives sufficient conditions on f(¢) for the
existence of its Laplace transform F'(s).
THEOREM 5.2.1 Existence of the Laplace Transform
Let $f(t)$ satisfy these conditions: (i) $f(t)$ is piecewise continuous on $0 \le t \le A$,
for every $A > 0$, and (ii) $f(t)$ is of exponential order as $t \to \infty$, so that there exist
real constants $K$, $c$, and $T$ such that $|f(t)| \le Ke^{ct}$ for all $t \ge T$. Then the Laplace
transform of $f(t)$, namely, $F(s)$ given by (1), exists for all $s > c$.
Proof: We need to show only that the singular integral in (1) is convergent. Breaking it up as
$$\int_0^{\infty} f(t)e^{-st}\,dt = \int_0^{T} f(t)e^{-st}\,dt + \int_T^{\infty} f(t)e^{-st}\,dt, \tag{4}$$
the first integral on the right exists since the integrand is piecewise continuous on
the finite interval $0 \le t \le T$. In the second integral, $|f(t)e^{-st}| = |f(t)|e^{-st} \le Ke^{-(s-c)t}$.
Now, $\int_T^{\infty} Ke^{-(s-c)t}\,dt$ is convergent for $s > c$, so $\int_T^{\infty} f(t)e^{-st}\,dt$ is
absolutely convergent and hence, by Theorem 4.5.3, convergent.
Being thus assured by Theorem 5.2.1 that the transform $F(s)$ exists for a large
and useful class of functions, we proceed to illustrate the evaluation of $F(s)$ for
several elementary functions, say $f(t) = 1$, $e^{at}$, $\sin at$, where $a$ is a real number,
and $1/\sqrt{t}$.
EXAMPLE 3. If $f(t) = 1$, then the conditions of Theorem 5.2.1 are met for any $c \ge 0$,
so according to Theorem 5.2.1, $F(s)$ should exist for all $s > 0$. Let us see.
$$F(s) = \int_0^{\infty} e^{-st}\,dt = \lim_{B\to\infty}\left.\frac{e^{-st}}{-s}\right|_0^{B} = \frac{1}{s}, \tag{5}$$
where the limit does indeed exist for all $s > 0$. Such restriction on $s$ will cause no difficulty
in applications.
EXAMPLE 4. If $f(t) = e^{at}$, the conditions of Theorem 5.2.1 are met for any $c \ge a$ so
according to the theorem, $F(s)$ should exist for all $s > a$. In fact,
$$F(s) = \int_0^{\infty} e^{at}e^{-st}\,dt = \lim_{B\to\infty}\left.\frac{e^{-(s-a)t}}{-(s-a)}\right|_0^{B} = \frac{1}{s - a}, \tag{6}$$
where the limit does indeed exist for all $s > a$.
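EXAMPLE 5. If $f(t) = \sin at$, then the conditions of Theorem 5.2.1 are met for any
$c \ge 0$ so $F(s)$ should exist for all $s > 0$. In fact, integrating by parts twice gives
$$F(s) = \int_0^{\infty}\sin at\;e^{-st}\,dt = \lim_{B\to\infty}\left\{\left[\sin at\,\frac{e^{-st}}{-s}\right]_0^{B} + \frac{a}{s}\left[\cos at\,\frac{e^{-st}}{-s}\right]_0^{B} - \frac{a^2}{s^2}\int_0^{B}\sin at\;e^{-st}\,dt\right\}$$
$$= (0 - 0) + \frac{a}{s}\left(0 + \frac{1}{s}\right) - \frac{a^2}{s^2}F(s), \tag{7}$$
where the limit exists if $s > 0$. The latter can be solved for $F(s)$, and gives
$$F(s) = \frac{a}{s^2 + a^2} \qquad (s > 0) \tag{8}$$
as the transform of $\sin at$.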
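The closed forms $1/s$, $1/(s - a)$, and $a/(s^2 + a^2)$ can be checked against direct numerical integration of the defining integral. The following is a rough sketch, not a general-purpose routine: it truncates the infinite integral at a finite upper limit (an assumption that is harmless here because $e^{-st}$ decays quickly for the chosen $s$) and applies the composite Simpson rule. The name `laplace_numeric` and its parameters are illustrative.

```python
import math

def laplace_numeric(f, s, upper=60.0, n=60000):
    """Composite Simpson approximation of int_0^upper f(t) e^(-s t) dt.
    `n` must be even; `upper` stands in for infinity."""
    h = upper / n
    total = f(0.0) + f(upper) * math.exp(-s * upper)
    for i in range(1, n):
        t = i * h
        weight = 4 if i % 2 else 2   # Simpson weights 4, 2, 4, 2, ...
        total += weight * f(t) * math.exp(-s * t)
    return total * h / 3

s, a = 2.0, 3.0
# Each line prints the numerical value next to the closed form.
print(laplace_numeric(lambda t: 1.0, s), 1 / s)
print(laplace_numeric(lambda t: math.exp(0.5 * t), s), 1 / (s - 0.5))
print(laplace_numeric(lambda t: math.sin(a * t), s), a / (s ** 2 + a ** 2))
```

Trying a value of $s$ at or below the convergence threshold (for instance $s \le 0.5$ for $e^{0.5t}$) makes the truncated integral grow with `upper`, which is the numerical face of the restrictions $s > 0$ and $s > a$.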
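COMMENT. An alternative approach, which requires a knowledge of the algebra of complex numbers (Section 21.2), is as follows:
$$\int_0^{\infty}\sin at\;e^{-st}\,dt = \int_0^{\infty}\left(\operatorname{Im}e^{iat}\right)e^{-st}\,dt = \operatorname{Im}\int_0^{\infty}e^{-(s-ia)t}\,dt = \operatorname{Im}\lim_{B\to\infty}\left.\frac{e^{-(s-ia)t}}{-(s-ia)}\right|_0^{B}$$
$$= \operatorname{Im}\frac{1}{s - ia} = \operatorname{Im}\frac{s + ia}{s^2 + a^2} = \frac{a}{s^2 + a^2}, \tag{9}$$
as before, where the fourth equality follows because
$$\left|e^{-(s-ia)B}\right| = \left|e^{-sB}\right|\left|e^{iaB}\right| = e^{-sB} \to 0 \tag{10}$$
as $B \to \infty$, if $s > 0$. In (10) we have used the fact that $|e^{iaB}| = |\cos aB + i\sin aB| = \sqrt{\cos^2 aB + \sin^2 aB} = 1$.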
EXAMPLE 6. If $f(t) = 1/\sqrt{t}$, then
$$F(s) = \int_0^{\infty}t^{-1/2}e^{-st}\,dt = \int_0^{\infty}\left(\frac{\tau}{s}\right)^{-1/2}e^{-\tau}\,\frac{d\tau}{s} = \frac{1}{\sqrt{s}}\int_0^{\infty}\tau^{-1/2}e^{-\tau}\,d\tau, \tag{11}$$
where we have used the substitution $st = \tau$. Having studied the gamma function in Section
4.5, we see that the final integral is $\Gamma(1/2) = \sqrt{\pi}$, so
$$F(s) = \sqrt{\frac{\pi}{s}} \tag{12}$$
if $s > 0$. Observe that $f(t) = 1/\sqrt{t}$ does not satisfy the conditions of
Theorem 5.2.1 because it is not piecewise continuous on $0 \le t < \infty$ since it does not
have a limit as $t \to 0$. Nonetheless, the singularity at $t = 0$ is not strong enough to cause
divergence of the integral in (11), and hence the transform exists. Thus, remember that all
we need is convergence of the integral in (1); the conditions in Theorem 5.2.1 are sufficient,
not necessary.
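Because the integrand $t^{-1/2}e^{-st}$ is singular at $t = 0$, a naive quadrature rule applied directly to it behaves poorly. One simple remedy, sketched below as an illustration rather than as the text's method, is the substitution $t = u^2$, which turns the integral into $2\int_0^{\infty}e^{-su^2}\,du$ with a smooth integrand. The function name and the truncation point `upper` are assumptions of this sketch.

```python
import math

def transform_inv_sqrt(s, upper=12.0, n=12000):
    """Approximate int_0^inf t^(-1/2) e^(-s t) dt via t = u^2, i.e.
    2 * int_0^upper exp(-s u^2) du, using the composite Simpson rule (n even)."""
    h = upper / n
    g = lambda u: math.exp(-s * u * u)
    total = g(0.0) + g(upper)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(i * h)
    return 2 * total * h / 3

for s in (0.5, 1.5, 4.0):
    # Numerical value next to the closed form sqrt(pi / s).
    print(s, transform_inv_sqrt(s), math.sqrt(math.pi / s))
```

The agreement illustrates the point of the example: the transform exists even though $1/\sqrt{t}$ fails the piecewise continuity hypothesis of Theorem 5.2.1.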
From these examples we could begin to construct a Laplace transform table,
with $f(t)$ in one column and its transform $F(s)$ in another. Such a table is supplied in Appendix C. More extensive ones are available,* and one can also obtain
transforms and their inverses directly using computer software.
Tables can be used in either direction. For example, just as the transform of $e^{at}$
is $1/(s - a)$, it is also true that the unique function whose transform is $1/(s - a)$ is
$e^{at}$. Operationally, we say that
$$L\{e^{at}\} = \frac{1}{s - a} \quad\text{and}\quad L^{-1}\left\{\frac{1}{s - a}\right\} = e^{at}, \tag{13}$$
where $L$ is the Laplace transform operator defined by
$$L\{f(t)\} = \int_0^{\infty}f(t)\,e^{-st}\,dt, \tag{14}$$
and $L^{-1}$ is the inverse Laplace transform operator. It turns out that $L^{-1}$ is, like
$L$, an integral operator, namely
$$L^{-1}\{F(s)\} = \frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty}F(s)\,e^{st}\,ds, \tag{15}$$
where $\gamma$ is a sufficiently positive real number. The latter is an integration in a
complex $s$ plane, and to carry out such integrations one needs to study the complex
integral calculus. If, for instance, we would put $1/(s - a)$ into the integrand in (15),
for $F(s)$, and carry out the integration, we would obtain $e^{at}$. We will return to (15)
near the end of this text, when we study the complex integral calculus, but we will
not use it in our present discussion; instead, we will rely on tables (and computer
software) to obtain inverses. In fact, there are entire books on the Laplace transform
that do not even contain the inversion formula (15). Our purpose in presenting it
here is to show that the inverse operator is, like $L$, an integral operator, and to close
the operational "loop": $L\{f(t)\} = F(s)$, and then $L^{-1}\{F(s)\} = f(t)$.
What can we say about the existence and uniqueness of the inverse transform?
Although we do not need to go into them here, there are conditions that $F(s)$ must
satisfy if the inversion integral in (15) is to exist, that is, to converge. Thus, if one writes
a function $F(s)$ at random, it may not have an inverse; there may be no function
$f(t)$ whose transform is that particular $F(s)$. For instance, there is no function
$f(t)$ whose transform is $s^2$ because its inversion integral is divergent. But suppose
that we can, indeed, find an inverse for a given $F(s)$. Is that inverse necessarily
unique, or might there be more than one function $f(t)$ having the same transform?
Strictly speaking, there is always more than one. For instance, not only does the function
$f(t) = 1$ have the transform $F(s) = 1/s$ (as found in Example 3), but so does the
function
$$g(t) = \begin{cases} 1, & 0 \le t < 3 \\ 500, & t = 3 \\ 1, & 3 < t < \infty \end{cases}$$
have the transform $G(s) = 1/s$ because the integrands in $\int_0^{\infty}g(t)e^{-st}\,dt$ and
$\int_0^{\infty}f(t)e^{-st}\,dt$ differ only at the single point $t = 3$. Since there is no area under a
single point (of finite height), $G(s)$ and $F(s)$ are identical: $G(s) = F(s) = 1/s$.
*See, for example, A. Erdélyi (ed.), Tables of Integral Transforms, Vol. 1 (New York: McGraw-Hill, 1954).
Clearly, one can construct an infinite number of functions, each having $1/s$
as its transform, but in a practical sense these functions differ only superficially.
In fact, it is known from Lerch's theorem* that the inverse transform is unique
to within an additive null function, a function $N(t)$ such that $\int_0^{T}N(t)\,dt = 0$ for
every $T > 0$, so we can be content that the inverse transform is essentially unique.
Closure. Theorem 5.2.1 guarantees the existence of the Laplace transform of a
given function $f(t)$, subject to the (sufficient but not necessary) conditions that $f$
be piecewise continuous on $0 \le t < \infty$ and of exponential order as $t \to \infty$. We
proceed to demonstrate the evaluation of the transforms of several simple functions,
and discuss the building up of a transform table. Regarding the use of such a table,
one needs to know whether the inverse transform found in the table is necessarily
unique, and we use Lerch's theorem to show that for practical purposes we can,
indeed, regard inverses as unique.
Computer software. On Maple, the laplace and invlaplace commands give transforms and inverses, respectively, provided that we enter readlib(laplace) first. To
illustrate, the commands
readlib(laplace):
laplace(1 + t^(-1/2), t, s);
give the transform of $1 + t^{-1/2}$ as
$$\frac{1}{s} + \frac{\sqrt{\pi}}{\sqrt{s}} \tag{16}$$
and the command
invlaplace(a/(s^2 + a^2), s, t); (17)
gives the inverse transform of $a/(s^2 + a^2)$ as $\sin at$.
EXERCISES 5.2
1. Showwhetheror not the givenfunctionis of exponential (g)cos¢?
order, If it is, determine a suitable set of values for A, c, and
T in (2).
(a)5e#
(d) cosh3t
(hy£109
(k) 6t + e' cost
(j) cosh t*
(i) 1/(t + 2)
(L)£1000
-
(b) ~10e7**
(e) sinh t?
©
(c) sinh 2t
(f) e*' sint
2. If f(t) is of exponential order, does it follow that df/dt is
too? HINT: Consider f(t) = sine’.
*See, for example, D. V. Widder, The Laplace Transform (Princeton, NJ: Princeton University
Press, 1941).
Chapter 5. Laplace Transform
3. If f(t) and g(t) are each of exponential order, does it follow that f(g(t)) is too? HINT: Consider the case where $f(t) = e^{t}$ and $g(t) = t^2$.
4. In Example 6 we state that the result (12) holds if s > 0. Show why that condition is needed.
5. Does $L\{t^{-3/2}\}$ exist? Explain.
6. Does $L\{t^{-2/3}\}$ exist? Explain.
7. Derive L{cos at} two ways: using integration by parts and using the fact that $\cos at = \mathrm{Re}\, e^{iat}$. (See Example 5.)
8. Derive $L\{e^{at} \sin bt\}$ two ways: using integration by parts and using $\sin bt = \mathrm{Im}\, e^{ibt}$.
9. Derive $L\{e^{at} \cos bt\}$ two ways: using integration by parts and using $\cos bt = \mathrm{Re}\, e^{ibt}$.
10. Derive by integration the Laplace transform for each of the following entries in Appendix C:
(a) entry 5
(b) entry 6
(c) entry 7
(d) entry 8
11. Derive entry 11 in Appendix C two ways:
(a) by writing the transform as $\frac{1}{2i}\int_0^\infty t\,e^{-(s-ia)t}\,dt - \frac{1}{2i}\int_0^\infty t\,e^{-(s+ia)t}\,dt$ or, if you prefer, by writing the transform as the single integral $\mathrm{Im}\int_0^\infty t\,e^{-(s-ia)t}\,dt$, and using integration by parts;
(b) by differentiating both sides of the known transform
$$\int_0^\infty \sin at\; e^{-st}\,dt = \frac{a}{s^2 + a^2}$$
(which we derived in Example 5) with respect to s, assuming the validity of the interchange
$$\frac{d}{ds}\int_0^\infty \sin at\; e^{-st}\,dt = \int_0^\infty \frac{\partial}{\partial s}\left(\sin at\; e^{-st}\right)dt$$
in the order of integration and differentiation.
12. Use the idea in Exercise 11(b) to derive
(a) entry (12) from entry (4)
(b) entry (7) from entry (1)
(c) entry (13) from entry (5)
(d) entry (14) from entry (6)
(e) entry (15) from entry (2)
13. Show that $L\{e^{at}\} = 1/(s - a)$ holds even if a = Re a + i Im a is complex, provided that s > Re a.
14. Use computer software to verify the given entry in Appendix C in both directions. That is, show that the transform of f(t) is F(s), and also show that the inverse of F(s) is f(t).
(a) 1-3
(b) 4-7
(c) 8-10
(d) 11-13
(e) 14-16
(f) 17-19
(g) 20-22
5.3 Properties of the Transform
When we studied the integral calculus we might have evaluated a few simple integrals directly from the definition of the Riemann integral, but for the most part we
learned how to evaluate integrals using a number of properties. For instance, we
used linearity (whereby $\int_a^b [\alpha u(x) + \beta v(x)]\,dx = \alpha \int_a^b u(x)\,dx + \beta \int_a^b v(x)\,dx$ for
any constants α, β, and any functions u, v, if the two integrals on the right exist),
integration by parts, and the fundamental theorem of the calculus (which enabled us
to generate a long list of integrals from an already known long list of derivatives).
Our plan for the Laplace transform is not much different; we work out a handful
of transforms by direct integration, and then rely on a variety of properties of the
transform and its inverse to extend that list substantially. There are many such properties, but in this section we present only the handful that will be essential when we
apply the Laplace transform method to the solution of differential equations, in the
next section. Additional properties are discussed in the final section of this chapter.
We begin with the linearity property of the transform and its inverse.
THEOREM 5.3.1 Linearity of the Transform
If u(t) and v(t) are any two functions such that the transforms L{u(t)} and L{v(t)}
both exist, then
$$L\{\alpha u(t) + \beta v(t)\} = \alpha L\{u(t)\} + \beta L\{v(t)\} \tag{1}$$
for any constants α, β.
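Proof: We have
$$
\begin{aligned}
L\{\alpha u(t) + \beta v(t)\} &= \int_0^\infty [\alpha u(t) + \beta v(t)]\,e^{-st}\,dt \\
&= \lim_{B\to\infty} \int_0^B [\alpha u(t) + \beta v(t)]\,e^{-st}\,dt \\
&= \lim_{B\to\infty} \left[\alpha \int_0^B u(t)\,e^{-st}\,dt + \beta \int_0^B v(t)\,e^{-st}\,dt\right] \\
&= \alpha \lim_{B\to\infty} \int_0^B u(t)\,e^{-st}\,dt + \beta \lim_{B\to\infty} \int_0^B v(t)\,e^{-st}\,dt \\
&= \alpha \int_0^\infty u(t)\,e^{-st}\,dt + \beta \int_0^\infty v(t)\,e^{-st}\,dt \\
&= \alpha L\{u(t)\} + \beta L\{v(t)\},
\end{aligned} \tag{2}
$$
where the third equality follows from the linearity property of Riemann integration,
and the fourth equality amounts to the result, from the calculus, that
$\lim[\alpha f(B) + \beta g(B)] = \alpha \lim f(B) + \beta \lim g(B)$ as $B \to B_0$, if the latter two
limits exist. ■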
EXAMPLE 1. To evaluate the transform of $6 - 5e^{4t}$, for example, we need merely know
the transforms of the simpler functions 1 and $e^{4t}$, for $L\{6 - 5e^{4t}\} = 6L\{1\} - 5L\{e^{4t}\}$.
Now, L{1} = 1/s for s > 0, and $L\{e^{4t}\} = 1/(s - 4)$ for s > 4, so
$$L\{6 - 5e^{4t}\} = 6\,\frac{1}{s} - 5\,\frac{1}{s-4} = \frac{s - 24}{s(s-4)}$$
for s > 4.
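Example 1 can be checked numerically. The sketch below is an illustration only (the helper `laplace_numeric`, the truncation point, and the sample value s = 6 are assumptions): it evaluates the transform of $6 - 5e^{4t}$ directly, evaluates it again term by term using linearity, and compares both with the closed form $(s - 24)/[s(s - 4)]$.

```python
import math

def laplace_numeric(f, s, upper=40.0, n=40000):
    """Simpson approximation of the integral of f(t) e^{-st} over [0, upper]."""
    h = upper / n  # n must be even for Simpson's rule
    total = 0.0
    for i in range(n + 1):
        t = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * f(t) * math.exp(-s * t)
    return total * h / 3

s = 6.0  # any s > 4 keeps the integral convergent

# Transform of the combination, computed in one go
direct = laplace_numeric(lambda t: 6 - 5 * math.exp(4 * t), s)

# Same transform assembled from the two simpler transforms (linearity)
by_linearity = (6 * laplace_numeric(lambda t: 1.0, s)
                - 5 * laplace_numeric(lambda t: math.exp(4 * t), s))

closed_form = (s - 24) / (s * (s - 4))
print(direct, by_linearity, closed_form)
```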
THEOREM 5.3.2 Linearity of the Inverse Transform
For any U(s) and V(s) such that the inverse transforms $L^{-1}\{U(s)\} = u(t)$ and
$L^{-1}\{V(s)\} = v(t)$ exist,
$$L^{-1}\{\alpha U(s) + \beta V(s)\} = \alpha L^{-1}\{U(s)\} + \beta L^{-1}\{V(s)\} \tag{3}$$
for any constants α, β.
Proof: Equation (3) follows either upon taking $L^{-1}$ of both sides of (1) or from the
linearity property of the integral in the inversion formula [equation (15) in Section
5.2]. ■
EXAMPLE 2. Asked to evaluate the inverse of $F(s) = 3/(s^2 + 3s - 10)$, we turn
to Appendix C but do not find this F(s) in the column of transforms. However, we can
simplify F(s) by using partial fractions. Accordingly, we express
$$\frac{3}{s^2 + 3s - 10} = \frac{A}{s+5} + \frac{B}{s-2} = \frac{(A + B)s + (-2A + 5B)}{s^2 + 3s - 10}. \tag{4}$$
To make the latter an identity we equate the coefficients of $s^1$ and $s^0$ in the numerators
of the left- and right-hand sides: $s^1$ gives 0 = A + B, and $s^0$ gives 3 = -2A + 5B, so
A = -3/7 and B = 3/7. Then
$$L^{-1}\left\{\frac{3}{s^2 + 3s - 10}\right\} = L^{-1}\left\{\frac{-3/7}{s+5} + \frac{3/7}{s-2}\right\} = -\frac{3}{7}L^{-1}\left\{\frac{1}{s+5}\right\} + \frac{3}{7}L^{-1}\left\{\frac{1}{s-2}\right\} = -\frac{3}{7}e^{-5t} + \frac{3}{7}e^{2t}, \tag{5}$$
where the second equality follows from Theorem 5.3.2, and the last equality follows from
entry 2 in Appendix C.
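The coefficient matching in Example 2 amounts to a 2 x 2 linear system, which can be solved exactly. The sketch below assumes that A is the coefficient of 1/(s + 5) and B that of 1/(s - 2), which is what the numerator (A + B)s + (-2A + 5B) implies; the variable names and sample points are illustrative.

```python
from fractions import Fraction

# Coefficient equations from (4):
#   s^1:   A +  B = 0
#   s^0: -2A + 5B = 3
a11, a12, b1 = Fraction(1), Fraction(1), Fraction(0)
a21, a22, b2 = Fraction(-2), Fraction(5), Fraction(3)

# Cramer's rule for the 2x2 system
det = a11 * a22 - a12 * a21
A = (b1 * a22 - a12 * b2) / det
B = (a11 * b2 - b1 * a21) / det

def original(s):
    """Left side of (4)."""
    return Fraction(3) / (s * s + 3 * s - 10)

def expanded(s):
    """Partial-fraction form A/(s + 5) + B/(s - 2)."""
    return A / (s + 5) + B / (s - 2)

# Exact rational comparison at a few sample points away from the poles
samples = [Fraction(3), Fraction(7, 2), Fraction(10)]
identity_holds = all(original(s) == expanded(s) for s in samples)
print(A, B, identity_holds)
```

With these values the inverse is $-\tfrac{3}{7}e^{-5t} + \tfrac{3}{7}e^{2t}$, in agreement with the final expression in (5).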
COMMENT. Actually, we could have used entry 9 in Appendix C, which says that
$$L^{-1}\left\{\frac{b}{(s-a)^2 + b^2}\right\} = e^{at} \sin bt, \tag{6}$$
for if we equate $(s - a)^2 + b^2 = s^2 - 2as + a^2 + b^2$ to $s^2 + 3s - 10$ we see that a = -3/2
and b = ±7i/2. Choose b = +7i/2, say (b = -7i/2 will give the same result). Then
$$L^{-1}\left\{\frac{3}{s^2 + 3s - 10}\right\} = \frac{3}{7i/2}\,L^{-1}\left\{\frac{7i/2}{(s + 3/2)^2 + (7i/2)^2}\right\} = \frac{6}{7i}\,e^{-3t/2} \sin\left(\frac{7it}{2}\right) = \frac{6}{7i}\,e^{-3t/2}\,\frac{e^{i(7it/2)} - e^{-i(7it/2)}}{2i} = \frac{3}{7}\left(-e^{-5t} + e^{2t}\right), \tag{7}$$
where the first equality is true by linearity and the second follows from (6). This result is
the same as the one found above by partial fractions. This example illustrates the fact that
we can often invert a given transform in more than one way. ■
If we are going to apply the Laplace transform method to differential equations,
we need to know how to take transforms of derivatives.
THEOREM 5.3.3 Transform of the Derivative
Let f(t) be continuous and f'(t) be piecewise continuous on $0 < t < t_0$ for every
finite $t_0$, and let f(t) be of exponential order as $t \to \infty$ so that there are constants
K, c, T such that $|f(t)| < Ke^{ct}$ for all t > T. Then L{f'(t)} exists for all s > c,
and
$$L\{f'(t)\} = s\,L\{f(t)\} - f(0). \tag{8}$$
Proof: Since $L\{f'(t)\} = \lim_{B\to\infty} \int_0^B f'(t)\,e^{-st}\,dt$, consider the integral
$$I = \int_0^B f'(t)\,e^{-st}\,dt = \int_0^{t_1} f'(t)\,e^{-st}\,dt + \cdots + \int_{t_n}^B f'(t)\,e^{-st}\,dt, \tag{9}$$
where $t_1, \ldots, t_n$ are the points, in 0 < t < B, at which f' is discontinuous. Integrating by parts gives
$$I = f(t)\,e^{-st}\Big|_0^{t_1} + \cdots + f(t)\,e^{-st}\Big|_{t_n}^{B} + s\int_0^{t_1} f(t)\,e^{-st}\,dt + \cdots + s\int_{t_n}^B f(t)\,e^{-st}\,dt. \tag{10}$$
By virtue of the continuity of f, the boundary terms at $t_1, \ldots, t_n$ cancel in pairs so
that, after recombining the integrals in (10), we have
$$I = f(B)\,e^{-sB} - f(0) + s\int_0^B f(t)\,e^{-st}\,dt. \tag{11}$$
Since f is of exponential order as $t \to \infty$ it follows that $f(B)\,e^{-sB} \to 0$ as
$B \to \infty$. Thus,
$$L\{f'(t)\} = \lim_{B\to\infty}\left[f(B)\,e^{-sB} - f(0) + s\int_0^B f(t)\,e^{-st}\,dt\right] = 0 - f(0) + s\,L\{f(t)\}, \tag{12}$$
as was to be proved. ■
The foregoing result can be used to obtain the transforms of higher derivatives
as well. For example, if f'(t) satisfies the conditions imposed on f in Theorem
5.3.3, then replacement of f by f' in (8) gives
$$L\{f''\} = s\,L\{f'\} - f'(0) = s\,[s\,L\{f\} - f(0)] - f'(0).$$
If, besides f', f also satisfies the conditions of Theorem 5.3.3, so that the L{f}
term on the right side exists, then
$$L\{f''\} = s^2 L\{f\} - s f(0) - f'(0). \tag{13}$$
Similarly,
$$L\{f'''\} = s^3 L\{f\} - s^2 f(0) - s f'(0) - f''(0), \tag{14}$$
if f'', f', and f satisfy the conditions of Theorem 5.3.3, and so on for the transforms
of higher-order derivatives.
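As a sanity check on the derivative formulas (8) and (13), the sketch below compares both sides numerically for the sample function f(t) = cos 2t. The choice of f, the value s = 1.5, and the helper `laplace_numeric` are assumptions made for illustration; the derivatives are entered by hand.

```python
import math

def laplace_numeric(f, s, upper=40.0, n=40000):
    """Simpson approximation of the integral of f(t) e^{-st} over [0, upper]."""
    h = upper / n  # n must be even for Simpson's rule
    total = 0.0
    for i in range(n + 1):
        t = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * f(t) * math.exp(-s * t)
    return total * h / 3

f = lambda t: math.cos(2 * t)
df = lambda t: -2 * math.sin(2 * t)    # f'
d2f = lambda t: -4 * math.cos(2 * t)   # f''

s = 1.5
F = laplace_numeric(f, s)

# Formula (8): L{f'} = s L{f} - f(0)
lhs_first = laplace_numeric(df, s)
rhs_first = s * F - f(0)

# Formula (13): L{f''} = s^2 L{f} - s f(0) - f'(0)
lhs_second = laplace_numeric(d2f, s)
rhs_second = s**2 * F - s * f(0) - df(0)

print(lhs_first, rhs_first)
print(lhs_second, rhs_second)
```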
The last of the major properties of the Laplace transform that we discuss in this
section is the Laplace convolution theorem.
THEOREM 5.3.4 Laplace Convolution Theorem
If L{f(t)} = F(s) and L{g(t)} = G(s) both exist for s > c, then
$$L^{-1}\{F(s)G(s)\} = \int_0^t f(\tau)\,g(t - \tau)\,d\tau \tag{15}$$
or, equivalently,
$$L\left\{\int_0^t f(\tau)\,g(t - \tau)\,d\tau\right\} = F(s)G(s) \tag{16}$$
for s > c.
Proof: Since (15) and (16) are equivalent statements, it suffices to prove just one,
say (16). By definition,
$$L\left\{\int_0^t f(\tau)\,g(t - \tau)\,d\tau\right\} = \int_0^\infty \left\{\int_0^t f(\tau)\,g(t - \tau)\,d\tau\right\} e^{-st}\,dt. \tag{17}$$
Regarding the latter as an iterated integral over a 45° wedge in a τ, t plane as shown
in Fig. 1, let us invert the order of integration in (17). Recalling from the calculus
the equivalent notations
for iterated integrals (where a, b, c, d are constants, say), inverting the order of
integration in (17) gives
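$$
\begin{aligned}
L\left\{\int_0^t f(\tau)\,g(t - \tau)\,d\tau\right\} &= \int_0^\infty \int_\tau^\infty f(\tau)\,g(t - \tau)\,e^{-st}\,dt\,d\tau \\
&= \int_0^\infty f(\tau)\,d\tau \int_\tau^\infty g(t - \tau)\,e^{-st}\,dt \\
&= \int_0^\infty f(\tau)\,d\tau \int_0^\infty g(\mu)\,e^{-s(\mu + \tau)}\,d\mu \\
&= \int_0^\infty f(\tau)\,e^{-s\tau}\,d\tau \int_0^\infty g(\mu)\,e^{-s\mu}\,d\mu.
\end{aligned} \tag{19}
$$
The last product is simply F(s) times G(s), so the theorem is proved. ■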
The integral on the right side of (15) is called the Laplace convolution of f
and g and is denoted as f * g. It too is a function of t:
$$(f * g)(t) = \int_0^t f(\tau)\,g(t - \tau)\,d\tau. \tag{20}$$
CAUTION: Be sure to see that the inverse of the product, $L^{-1}\{F(s)G(s)\}$, is not
simply the algebraic product of the inverses, f(t)g(t); rather, it is (according to
Theorem 5.3.4) their convolution, (f * g)(t).
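EXAMPLE 3. In Example 2 we inverted $F(s) = 3/(s^2 + 3s - 10)$ in two different
ways. Let us now obtain the inverse by still another method, the convolution theorem:
$$L^{-1}\left\{\frac{3}{s^2 + 3s - 10}\right\} = 3L^{-1}\left\{\frac{1}{s-2}\,\frac{1}{s+5}\right\} = 3L^{-1}\left\{\frac{1}{s-2}\right\} * L^{-1}\left\{\frac{1}{s+5}\right\} = 3e^{2t} * e^{-5t} = 3\int_0^t e^{2\tau} e^{-5(t - \tau)}\,d\tau = \frac{3}{7}\left(e^{2t} - e^{-5t}\right), \tag{21}$$
which is the same result as obtained in Example 2. ■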
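The convolution in Example 3, and the warning that the inverse of a product is not the product of the inverses, can both be illustrated numerically. The sketch below is a minimal illustration: the helper `convolve` (Simpson's rule on the convolution integral) and the sample time t = 1 are assumptions.

```python
import math

def convolve(f, g, t, n=2000):
    """Simpson approximation of (f * g)(t) = integral_0^t f(tau) g(t - tau) dtau."""
    if t == 0:
        return 0.0
    h = t / n  # n must be even for Simpson's rule
    total = 0.0
    for i in range(n + 1):
        tau = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * f(tau) * g(t - tau)
    return total * h / 3

f = lambda t: math.exp(2 * t)    # inverse of 1/(s - 2)
g = lambda t: math.exp(-5 * t)   # inverse of 1/(s + 5)

t = 1.0
via_convolution = 3 * convolve(f, g, t)
closed_form = 3 / 7 * (math.exp(2 * t) - math.exp(-5 * t))
naive_product = 3 * f(t) * g(t)  # ordinary product of the inverses; NOT the inverse

print(via_convolution, closed_form, naive_product)
```

The first two numbers agree, while the ordinary product is far off, which is the point of the CAUTION following Theorem 5.3.4.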
Observe that in equation (15) it surely doesn't matter if we write F(s)G(s) or
G(s)F(s) because ordinary multiplication is commutative. Yet it is not clear that
the results are the same, $\int_0^t f(\tau)\,g(t - \tau)\,d\tau$ in one case and $\int_0^t g(\tau)\,f(t - \tau)\,d\tau$
in the other. Nonetheless, these results are indeed the same, proof of which claim
is left as an exercise. In fact, although the convolution is not an ordinary product it
does share several of the properties of ordinary multiplication:
f * g = g * f, (commutative) (22a)
f * (g * h) = (f * g) * h, (associative) (22b)
f * (g + h) = f * g + f * h, (distributive) (22c)
f * 0 = 0. (22d)
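Properties (22a) and (22c) can be spot-checked numerically, and the same experiment shows one way in which convolution differs from ordinary multiplication: 1 * 1 equals t, not 1. The sketch below is illustrative only; the helper `convolve`, the sample functions, and the time t = 2 are assumptions.

```python
import math

def convolve(f, g, t, n=2000):
    """Simpson approximation of (f * g)(t) = integral_0^t f(tau) g(t - tau) dtau."""
    if t == 0:
        return 0.0
    step = t / n  # n must be even for Simpson's rule
    total = 0.0
    for i in range(n + 1):
        tau = i * step
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * f(tau) * g(t - tau)
    return total * step / 3

f = lambda u: u
g = lambda u: math.exp(u)
k = lambda u: math.sin(u)
one = lambda u: 1.0

t = 2.0
fg = convolve(f, g, t)
gf = convolve(g, f, t)                       # commutativity (22a)
exact_fg = math.exp(t) - t - 1               # closed form of t * e^t

dist_lhs = convolve(f, lambda u: g(u) + k(u), t)       # f * (g + k)
dist_rhs = convolve(f, g, t) + convolve(f, k, t)       # f * g + f * k  (22c)

one_star_one = convolve(one, one, t)         # equals t, unlike the product 1 x 1

print(fg, gf, exact_fg, dist_lhs, dist_rhs, one_star_one)
```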
Closure. The properties studied in this section (linearity, the transform of a
derivative, and the convolution theorem) should be thoroughly understood. All
are used in the next section, where we use the Laplace transform method to solve
differential equations. The convolution property, in particular, should be studied
carefully.
The convolution theorem is useful in both directions. If we have a transform
H(s) that is difficult to invert, it may be possible to factor H as F(s)G(s), where
F and G are more easily inverted. If so, then h(t) is given, according to (15), as
the convolution of f(t) and g(t). Furthermore, we may need to find the transform
of an integral that is in convolution form. If so, then the transform is given easily
by (16).
Finally, we mention that convolution integrals arise in hereditary systems, systems whose behavior at time t depends not only on the state of the system at that
instant but also on its past history. Examples occur in the study of viscoelasticity
and population dynamics.
EXERCISES 5.3
1. Find the inverse of the given transform two different ways: using partial fractions and using the convolution theorem. Cite any entries used from Appendix C.
(a) 3/[s(s + 8)]
(b) 1/(3s^2 + 5s - 2)
(c) 1/(s^2 - a^2)
(d) 5/[(s + 1)(3s + 2)]
(e) 1/(s^2 + s)
(f) 2/(2s^2 - s - 1)
2. (a)-(f) Find the inverse of the corresponding transform in Exercise 1 using computer software.
3. Use entry 9 in Appendix C to evaluate the inverse of each. If necessary, use entry 10 as well. NOTE: See the Comment in Example 2.
(a) ie + 88)
(b) 1/(s^2 - 3s + 3)
(c) 1/(s^2 - s)
(d) 1/(s^2 - s - 2)
(e) s/(s^2 - 2s + 2)
(f) (s + 1)/(s^2 + 4s + 6)
(g) (s + 1)/(s^2 - s)
(h) (2s - 1)/(s^2 - 6s + 5)
4. Use (8) together with mathematical induction to verify the general formula
$$L\{f^{(n)}\} = s^n L\{f\} - s^{n-1} f(0) - s^{n-2} f'(0) - \cdots - f^{(n-1)}(0),$$
which is valid if $f^{(n-1)}, \ldots, f'$, and f satisfy the conditions of Theorem 5.3.3.
5. Prove
(a) equation (22a)
(b) equation (22b)
(c) equation (22c)
(d) equation (22d)
6. Prove that L{f * g * h} = F(s)G(s)H(s) or, equivalently, that $L^{-1}\{F(s)G(s)H(s)\} = f * g * h$. NOTE: Does f * g * h mean (f * g) * h or f * (g * h)? According to the associative property (22b) it doesn't matter: they are equal.
7. To illustrate the result stated in Exercise 6, find the inverse of $1/s^3$ as $L^{-1}\left\{\frac{1}{s}\,\frac{1}{s}\,\frac{1}{s}\right\} = 1 * 1 * 1$, and show that the result agrees with that given directly in Appendix C.
8. Factoring
$$\frac{2as}{(s^2 + a^2)^2} = 2\,\frac{a}{s^2 + a^2}\,\frac{s}{s^2 + a^2},$$
it follows from the convolution theorem and entries 3 and 4 of Appendix C that
$$L^{-1}\left\{\frac{2as}{(s^2 + a^2)^2}\right\} = 2 \sin at * \cos at.$$
Evaluate this convolution and show that the result agrees with that given directly by entry 11.
9. Verify (8) and (13) directly, for each given f(t), by working out the left- and right-hand sides and showing that they are equal. You may use the table in Appendix C to evaluate L{f''(t)}, L{f'(t)}, and L{f(t)}.
(a) e®!
(b) ew" 4.2
(c) t^2 + 5t - 1
(d) sinh 4t
(e)cosh3t + 5¢ = () ar? ~ cos 2t
10. Evaluate the transform of each:
(a) ye '<" gin Or dr
Ben
(c) f(t
(b) f cos 3(¢ — r) dr
(d) fi
Sr adr
cosh
3(t
_ r)
dr
sinh4(t—7)dr
(f)fy 7%?
(©) an dr
11. We emphasized that $L^{-1}\{F(s)G(s)\}$ equals the convolution of f and g; in general, it does not merely equal the product f(t)g(t). Show that $L^{-1}\{F(s)G(s)\} \ne f(t)g(t)$ for each given pair of functions.
(a) f(t) = t, g(t) = e^t
(b) f(t) = sin t, g(t) = 4
(c) f(t)=t, gti =?
(d) f(t) = cos t, g(t) = t + 6
5.4 Application to the Solution of Differential Equations
initial conditions at t = 0.
EXAMPLE 1.
oscillator shown in Fig. 1
that the displacement x(t) then satisfies the equation
$$m x'' + kx = f(t) = F_0. \tag{1}$$
Figure 1. Mechanical oscillator.
term in (1) by the Laplace kernel $e^{-st}$
use L to denote that step:
$$L\{m x'' + kx\} = L\{F_0\}. \tag{2}$$
By the linearity of
(Theorem 5
mL{x''(t)} + kL{x(
(3)
L{x(t)} = X(
X(s). Doing so gives
$$X(s) = \frac{s\,x(0) + x'(0)}{s^2 + \omega^2} + \frac{F_0}{m\,s\,(s^2 + \omega^2)}, \tag{5}$$
where $\omega = \sqrt{k/m}$ is the natural frequency.
With the solving for X(s) completed, we now invert (5) to obtain x(t):
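$$x(t) = L^{-1}\left\{\frac{s\,x(0) + x'(0)}{s^2 + \omega^2} + \frac{F_0}{m\,s\,(s^2 + \omega^2)}\right\} = x(0)\,L^{-1}\left\{\frac{s}{s^2 + \omega^2}\right\} + x'(0)\,L^{-1}\left\{\frac{1}{s^2 + \omega^2}\right\} + \frac{F_0}{m}\,L^{-1}\left\{\frac{1}{s\,(s^2 + \omega^2)}\right\}, \tag{6}$$
where the second equality follows from the linearity of the $L^{-1}$ operator (Theorem 5.3.2).
Appendix C gives
$$L^{-1}\left\{\frac{s}{s^2 + \omega^2}\right\} = \cos \omega t \quad\text{and}\quad L^{-1}\left\{\frac{1}{s^2 + \omega^2}\right\} = \frac{\sin \omega t}{\omega}, \tag{7}$$
but the third inverse in (6) is not found in the table. We could evaluate it with the help of
partial fractions, but it is easier to use the convolution theorem:
$$L^{-1}\left\{\frac{1}{s}\,\frac{1}{s^2 + \omega^2}\right\} = L^{-1}\left\{\frac{1}{s}\right\} * L^{-1}\left\{\frac{1}{s^2 + \omega^2}\right\} = 1 * \frac{\sin \omega t}{\omega} = \int_0^t (1)\,\frac{\sin \omega (t - \tau)}{\omega}\,d\tau = \frac{1 - \cos \omega t}{\omega^2}, \tag{8}$$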
so (6), (7), and (8) give the desired particular solution as
$$x(t) = x(0) \cos \omega t + \frac{x'(0)}{\omega} \sin \omega t + \frac{F_0}{k}\,(1 - \cos \omega t). \tag{9}$$
For instance, if x(0) = x'(0) = 0, then $x(t) = (F_0/k)(1 - \cos \omega t)$ as depicted in Fig. 2.
Does it seem correct that the constant force $F_0$ should cause an oscillation? Yes, for imagine rotating the apparatus 90° so that the mass hangs down. Then we can think of $F_0$ as
the downward gravitational force on m. In static equilibrium, the mass will hang down an
amount $x = F_0/k$. If we release it from x = 0, it will fall and then oscillate about the
equilibrium position $x = F_0/k$, as shown in Fig. 2.
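A quick way to gain confidence in solution (9) is to substitute it back into the differential equation. The sketch below does this numerically with a central-difference second derivative; the parameter values m, k, F0, x0, v0 are arbitrary assumptions chosen only for the check.

```python
import math

# Illustrative parameter values (assumptions, not from the text)
m, k, F0 = 2.0, 8.0, 4.0
x0, v0 = 1.0, 0.5            # x(0) and x'(0)
w = math.sqrt(k / m)         # natural frequency

def x(t):
    """Solution (9): x(0) cos wt + (x'(0)/w) sin wt + (F0/k)(1 - cos wt)."""
    return (x0 * math.cos(w * t)
            + (v0 / w) * math.sin(w * t)
            + (F0 / k) * (1 - math.cos(w * t)))

def residual(t, h=1e-3):
    """m x'' + k x - F0, with x'' estimated by a central difference."""
    x_dd = (x(t + h) - 2 * x(t) + x(t - h)) / h**2
    return m * x_dd + k * x(t) - F0

# The residual should vanish (up to discretization error) at every time
max_residual = max(abs(residual(0.1 * i)) for i in range(1, 50))

# Initial conditions: x(0) exactly, x'(0) by a central difference
slope_at_zero = (x(1e-6) - x(-1e-6)) / 2e-6

print(max_residual, x(0.0), slope_at_zero)
```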
Figure 2. Release from rest.
COMMENT 1. Recall that f * g = g * f, so we can write the convolution integral either
as $\int_0^t f(\tau)\,g(t - \tau)\,d\tau$ or as $\int_0^t g(\tau)\,f(t - \tau)\,d\tau$; that is, we can let the argument of f be
τ and the argument of g be t - τ, or vice versa, whichever we choose. In (8) we chose the
τ argument for 1 and the t - τ argument for (sin ωt)/ω. (Of course, if we change all the
t's in 1 to τ's we still have 1 because there are no t's in 1.) Alternatively, we could have
expressed the inverse in (8) as
$$\frac{\sin \omega t}{\omega} * 1 = \int_0^t \frac{\sin \omega \tau}{\omega}\,(1)\,d\tau = \frac{1 - \cos \omega t}{\omega^2},$$
as obtained in (8).
COMMENT 2. Observe that (9) is the particular solution satisfying the initial conditions
x = x(0) and x' = x'(0) at t = 0. If those quantities are not prescribed, we can replace
them by arbitrary constants, and then (9) amounts to the general solution of $m x'' + kx = F_0$. Thus, the method gives either a particular solution or a general solution, whichever is
desired.
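COMMENT 3. If, instead of the specific forcing function $f(t) = F_0$ we allow f(t) to be
an unspecified function, then we have, in place of (5),
$$X(s) = \frac{s\,x(0) + x'(0)}{s^2 + \omega^2} + \frac{F(s)}{m\,(s^2 + \omega^2)} \tag{10}$$
and, in place of (9),
$$x(t) = x(0) \cos \omega t + \frac{x'(0)}{\omega} \sin \omega t + \frac{1}{m}\,L^{-1}\left\{\frac{F(s)}{s^2 + \omega^2}\right\}. \tag{11}$$
Using the convolution theorem to write
$$L^{-1}\left\{F(s)\,\frac{1}{s^2 + \omega^2}\right\} = L^{-1}\{F(s)\} * L^{-1}\left\{\frac{1}{s^2 + \omega^2}\right\} = f(t) * \frac{\sin \omega t}{\omega} \tag{12}$$
gives
$$x(t) = x(0) \cos \omega t + \frac{x'(0)}{\omega} \sin \omega t + \frac{1}{m\omega}\int_0^t \sin \omega \tau\; f(t - \tau)\,d\tau \tag{13}$$
as the solution. ■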
With Example 1 completed, there are several observations that can be made
about the method. First, consider the general second-order equation
$$x'' + a x' + b x = f(t), \tag{14}$$
where a, b are constants, although the following discussion applies to higher-order
equations as well. If we solve (14) by the methods of Chapter 3, then we need both
homogeneous and particular solutions. To find the homogeneous solution we need
to factor the characteristic polynomial $\lambda^2 + a\lambda + b$ or, equivalently, to find the
roots of the characteristic equation $\lambda^2 + a\lambda + b = 0$. Solving (14) by the Laplace
transform instead, we obtain, and need to invert,
$$X(s) = \frac{(s + a)\,x(0) + x'(0)}{s^2 + as + b} + \frac{F(s)}{s^2 + as + b}. \tag{15}$$
Whether we invert these terms by partial fractions or by some other method, their
inversion depends, essentially, on our being able to factor the $s^2 + as + b$ denominator. That polynomial is none other than the characteristic polynomial corresponding
to (14). Thus, whether we solve (14) by seeking exponential solutions for the homogeneous equation and then seeking a particular solution, or by using the Laplace
transform, we need to face up to the same task, finding the roots of the characteristic
equation.
Second, observe that if we invert the $F(s)/(s^2 + as + b)$ term in (15) by the
convolution theorem, then we convolve the inverse of F(s), namely, f(t), with
the inverse of $1/(s^2 + as + b)$. Therefore, if we use the convolution theorem, then
there is no need to evaluate the transform F(s) of f(t) when transforming the given
differential equation.
Third, observe how the initial conditions become "built in," when we take the
transform of the differential equation. Thus, there is no need to apply them at the
end.
Fourth, recall that Laplace transforms come with restrictions on s. For instance, L{1} = 1/s for s > 0. However, such restrictions in no way impede the
solution steps in using the Laplace transform method, and once we invert X(s) to
obtain x(t) they are no longer relevant.
Fifth, we need to realize that when we apply the Laplace transform method to
a differential equation, we take the transform of the unknown and one or more of its
derivatives, but since we don’t yet know the solution we don’t yet know whether or
not these functions are transformable. The procedure, then, is to assume that they
are transformable in order to proceed, and to verify that they are once the solution
is in hand.
Finally, and most important, understand that the power of the Laplace transform, in solving linear constant-coefficient differential equations, is in its ability
to convert such an equation to a linear algebraic equation on X(s), which ability flows from the fact that the transform of f'(t) is merely a multiple of F(s)
plus a constant (and therefore similarly for f'', f''', ...). Indeed, the transform
$L\{f(t)\} = \int_0^\infty f(t)\,e^{-st}\,dt$ was designed so as to have this property. That is, the
"kernel" $e^{-st}$ was designed so as to imbue the transform with that property.
EXAMPLE 2. Solve the initial-value problem
$$y'''' - y = 0; \quad y(0) = 1,\; y'(0) = y''(0) = y'''(0) = 0 \tag{16}$$
for y(x). That the independent and dependent variables are x, y, rather than x, t, is immaterial to the application of the Laplace transform; the transform of y(x) is now $Y(s) = \int_0^\infty y(x)\,e^{-sx}\,dx$. Taking the transform of (16) gives
$$\left[s^4 Y(s) - s^3 y(0) - s^2 y'(0) - s\,y''(0) - y'''(0)\right] - Y(s) = 0. \tag{17}$$
Putting the initial conditions into (17), and solving for Y(s), gives
$$Y(s) = \frac{s^3}{s^4 - 1}. \tag{18}$$
To invert the latter, we can use partial fractions:
$$\frac{s^3}{s^4 - 1} = \frac{A}{s+1} + \frac{B}{s-1} + \frac{C}{s+i} + \frac{D}{s-i} = \frac{(s-1)(s^2+1)A + (s+1)(s^2+1)B + (s-i)(s^2-1)C + (s+i)(s^2-1)D}{s^4 - 1}. \tag{19}$$
Equating coefficients of like powers of s in the numerators gives the linear equations
$s^3$: 1 = A + B + C + D,
$s^2$: 0 = -A + B - iC + iD,
$s$: 0 = A + B - C - D,
$1$: 0 = -A + B + iC - iD,
solution of which (for instance by Gauss elimination) gives A = B = C = D = 1/4.
Thus,
$$y(x) = \frac{1}{4}e^{-x} + \frac{1}{4}e^{x} + \frac{1}{4}e^{-ix} + \frac{1}{4}e^{ix} = \frac{1}{2}\,(\cosh x + \cos x) \tag{20}$$
is the desired particular solution. ■
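The algebra of Example 2 can be spot-checked with complex arithmetic. Because all four partial-fraction coefficients come out equal to 1/4, the check below does not depend on which letter is attached to which pole; it only assumes the poles are s = 1, -1, i, -i. Function names and sample points are illustrative.

```python
import cmath
import math

A = B = C = D = 0.25
poles = (-1, 1, -1j, 1j)          # roots of s^4 - 1 = 0

def Y(s):
    """Transform (18): s^3 / (s^4 - 1)."""
    return s**3 / (s**4 - 1)

def partial_fractions(s):
    """Sum of coefficient / (s - pole) over the four poles."""
    return sum(c / (s - p) for c, p in zip((A, B, C, D), poles))

def y_exponentials(x):
    """Inverse of each partial fraction: coefficient * exp(pole * x)."""
    return sum(c * cmath.exp(p * x) for c, p in zip((A, B, C, D), poles))

def y_closed(x):
    """Closed form (20)."""
    return 0.5 * (math.cosh(x) + math.cos(x))

pf_error = abs(Y(2.0) - partial_fractions(2.0))
sum_error = abs(y_exponentials(1.3) - y_closed(1.3))

# Each exp(p x) satisfies y'''' = y because p^4 = 1 for every pole
fourth_powers_ok = all(abs(p**4 - 1) < 1e-12 for p in poles)

print(pf_error, sum_error, fourth_powers_ok, y_closed(0.0))
```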
EXAMPLE 3. Solve the first-order initial-value problem
$$x' + px = q(t); \quad x(0) = x_0 \tag{21}$$
for x(t), where p is a constant and q(t) is any prescribed forcing function. Application of
the Laplace transform gives
$$X(s) = \frac{x_0}{s + p} + \frac{Q(s)}{s + p} \tag{22}$$
and hence the particular solution
$$x(t) = x_0 e^{-pt} + \int_0^t e^{-p(t - \tau)}\,q(\tau)\,d\tau = e^{-pt}\left[x_0 + \int_0^t q(\tau)\,e^{p\tau}\,d\tau\right]. \tag{23}$$
COMMENT. Alternatively, let us begin by integrating the differential equation on t, from
0 to t:
$$x(\tau)\Big|_0^t + p\int_0^t x(\tau)\,d\tau = \int_0^t q(\tau)\,d\tau \tag{24}$$
or, since $x(0) = x_0$,
$$x(t) + p\int_0^t x(\tau)\,d\tau = x_0 + \int_0^t q(\tau)\,d\tau. \tag{25}$$
Of course, (25) is not the solution of the differential equation because the unknown x(t) is
under the integral sign. Thus, (25) is an example of an integral equation. Although we will
not study integral equations systematically in this text, it will be useful to at least introduce
them. Observe, first, that (25) is equivalent to both the differential equation and the initial
condition, for they led to (25); conversely, the derivative of (25) gives back x' + px = q(t)
[if q(t) is continuous], and putting t = 0 in (25) gives back $x(0) = x_0$. That is, unlike the
differential equation, the integral equation version has the initial condition "built in."
Further, we can solve (25) by the Laplace transform conveniently because each integral is of convolution type: the first is 1 * x(t), and the second is 1 * q(t). Thus, taking
a Laplace transform of (25), and noting that L{1 * x(t)} = L{1}L{x(t)} = (1/s)X(s)
and L{1 * q(t)} = (1/s)Q(s), gives
$$X(s) + p\,\frac{1}{s}\,X(s) = \frac{x_0}{s} + \frac{1}{s}\,Q(s), \tag{26}$$
which, once again, gives (22) and hence the solution (23). ■
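Solution (23) is easy to evaluate numerically for any forcing function. As a minimal sketch (the helper name `solution`, the parameter values, and the constant forcing q(t) = 1 are assumptions made so that a closed form is available for comparison), the convolution integral is approximated by Simpson's rule:

```python
import math

def solution(t, x0, p, q, n=2000):
    """x(t) = x0 e^{-pt} + integral_0^t e^{-p(t - tau)} q(tau) dtau, via Simpson's rule."""
    if t == 0:
        return x0
    h = t / n  # n must be even for Simpson's rule
    total = 0.0
    for i in range(n + 1):
        tau = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * math.exp(-p * (t - tau)) * q(tau)
    return x0 * math.exp(-p * t) + total * h / 3

x0, p = 3.0, 2.0
q = lambda t: 1.0        # constant forcing, chosen so the integral has a closed form
t = 1.5

numeric = solution(t, x0, p, q)
# For q = 1 the integral evaluates to (1 - e^{-pt}) / p
exact = x0 * math.exp(-p * t) + (1 - math.exp(-p * t)) / p

print(numeric, exact)
```

Swapping in a different `q` requires no other change, which mirrors the remark that the convolution form avoids computing Q(s) explicitly.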
Closure. In this section we describe the application of the Laplace transform to the
solution of linear differential equations with constant coefficients, homogeneous
or nonhomogeneous. In a sense, the method is of comparable difficulty to the
solution methods studied in Chapter 3 in that one still needs to be able to factor the
characteristic polynomial, which can be difficult if the equation is of high order.
However, the Laplace transform method has a number of advantages. First, the
method reduces a linear differential equation to a linear algebraic equation. Second,
the hardest part, namely the inversion of the transform of the unknown, can often
be accomplished with the help of tables or computer software, as well as with
several additional theorems that are given in the final section of this chapter. Third,
any initial conditions that are given become built in, in the process of taking the
transform of the differential equation, so they do not need to be applied separately,
at the end, as they were in Chapter 3.
We also saw, in the final example, that the Laplace transform is convenient to
use in solving integral equations (equations in which the unknown function appears
under an integral sign), provided that the integrals therein are Laplace convolution
integrals; additional discussion of this idea is left for the exercises. In fact, it might
be noted that the Laplace transform itself, $F(s) = \int_0^\infty f(t)\,e^{-st}\,dt$, is really an
integral equation for f(t) if F(s) is known. Although that integral equation was
studied by Laplace, it was Simeon-Denis Poisson (1781-1840) who discovered
the solution $f(t) = \frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty} F(s)\,e^{st}\,ds$, namely, the Laplace inversion formula.
Poisson was one of the great nineteenth century analysts and a professor at the
Ecole Polytechnique.
Also left for the exercises is discussion of the application of the method to a
limited class of nonconstant differential equations.
EXERCISES
5.4
(a)a’ +20 = at”
(b)3a’ + a = Ge”;
(e)v" +52' = 10
(at
—~a =1+t4+Fh
(g)a" ~ 3a'+2a=0;
2(0) =3, 2/(0)=1
(h)x" ~ 4a’ -—5e=2+e7';
x(0)=2'(0) =0
(i)a” —a'~1l2a=t;
2x(0)= —-1,2'(0)=0
(ja +62'+9¢=1;
2(0) =0, 2/(0) = -2
(k)a ~ 2a’ + 22 = ~2t; 2«(0)= 0, 2/(0)= —5
() a” —22'+32=5;
2(0)=1, 2'(0)=-1
(mo —a2"422’ =t?; ;
£
(nha +a" ~ Qe’ =1L+ets 4
(o)a'" + 5a” =t4;
2(0)= a'(
(p)a!” + 3a" + 8a! + a2= e%!
(ya —a" —a'+x2=0;
2(0) = 2, 2/(0) = 2"(0) =0
that those stepsgive
mx(t) —may ~ maxgt+k f
(s)a)
+ 3a!" = 0;
z"(0) = 2'"(0) = 3
(3.2)
Show that, by interchanging the order of integration, the double integrals can be reduced to single integrals, so that the integral equation (3.2) can be simplified to the form
ma(t) - mx ~ mayt +k fale —7)a(r) dr
a(Q) = 2'(0) = 0,
(b) Taking a Laplace
transform of (3.3), obtain
(thao) + 80” + 16x = 4
(ua!)
—¢=1;
x(0) = 2'(0) = x"(0) =0,
~2=4;
2(0) =2'(0) =2"(0) =1,
x!"(0)
=4
(va
=
(w)a)
x"'(0)
x(r) dr dt’
= fs f f(r) dr dt’.
= Qsint
(r) of)
x"(0)
f
0
F(s)
m/(s?+w?)’
which is the same as equation (10).
4. Convert the initial-value problem
—16” = —32: x(0) = 0, 2’(0) = 2,
—
x’"(0)
—
(o=
van)
0
2. (a) Show that for a constant-coefficient linear homogeneous
differential equation of order n, the Laplace transformX(s)
of thesolution x(t) is necessarily of the form
X(s) = P(s)/Q(s),
(2.1)
whereQ(s) and P(s) are polynomials in s, with Q of de-
gree n and P of degree less than n.
“ PUTK)net
x(t)=So
a
a
*
Tn, then
(0<t<o)
to an integral equation, analogous to (3.3) in Exercise 3. Then,
solve thatintegralequationfor x(t) by using theLaplace transform.
5. (Variable-coefficient equation) Consider the problem
te’ +2’ +tz=0
(b)Showthatif Q(s) = 0 hasn distinctrootsry, ...
k=1
mex"+ca'+ka = f(t)
z(0) = xo, x'(0) = 2
x(0) = 1, «'(0) =0,
0<¢t
(0S ¢ <00)
(2.2) where our special interest lies in seeing whether or not we can
solve (5.1) by the Laplace transform method even though the
differential equation has nonconstant coefficients.
3. Our purpose, in this exercise, is to follow up on Example
3 in showing a connection betweendifferential equations and (a) Take the Laplace transform of the differential
integral equations, and in considering the solution of certain Note thatthetransformsof tv’(t) andt x(t),
integral equations by the Laplace transform method.
200
L{te"(t)}
=|
(a) Convert the initial-value problem
ma" +kr = f(t),
O<t
(OS #< 00)
x(0) = xo, 2x'(0)= 2%
(5.1)
tae
equation.
* dt,
0
(3.1)
to an integral equation, as follows. Integratethe differential
equation from 0 to ¢ twice. Using the initial conditions, show
L{ta(t)} = f tre “dt,
Jo
present a difficulty in that we cannot express them in terms of
X(s) theway we can expressL{z’'(t)} = sX(s) —x(0) and
L{a"(t)} = s*X(s) — sa(0) — x'(0). Nevertheless,these
terms can be handled as follows. Observe that
L{ta"(t)} = /
∶
ta en
0
de
∑∕
dt = -{
fo
¢
}
d (ve *') dt
ds
obtained the solution in power series form. Of course, that
powerseriesis theTaylor seriesof the Bessel functionJo(t).
NOTE: Observe that rather than pulling an s out of the square
ds Jo
rootin (5.5),andthenexpanding1/\/1 + (1/s?) in powersof
¢
ds [s?.X(s)—sx(0) —2'(0)]
X(s) = C(1 ~ $s? +--+).However,positivepowersof s are
d
1/s*, we could have expanded (5.4) directly in powers of s as
not invertible, so this form is of no use. [We will see, in Theo-
[s?X(s) —s],
rem 5.7.6, that to be invertible a transform must tendto zero as
(5.2) s —>00. Positive powers of s do not satisfy this condition, but
if we assumethatthe unknownz(t) is sufficiently well be- negative powers do.] Also, observe that the degree to which
ds
haved for the third equality (where we have interchanged the nonconstant-coefficient differential equations are harder than
order of two limit processes, the s differentiation and the ¢ constant-coefficient ones can be glimpsed from the fact that
integration)to bejustified. Handling theL{t x(t)} termin the coefficients proportional to ¢cause the equation on X(s) to be
same way, show that application of the Laplace transform to a first-order differential equation; coefficients proportional to
t®will causetheequationon X(s) to be a second-orderdiffer-
(5.1) leads to the equation
ential equation, and so on.
(s? +1)
on X(s).
ds
4 sx
=0
(5.3)
6. It is found that the integral equation
Note that whereas the Laplace transform method
C(T)
reduces constant-coefficient differential equations to linear al-
gebraic equationson X(s), here the nonconstantcoefficients
result in the equation on X(s) being itself a linear differential
equation! However, it is a simple one. Solving (5.3), show
that
C
X(s) =
_ [
7 0:0744u?/T? 5(1,) a
(6.1)
is an approximate relation between the frequency spectrum
p{v) and the specific heatC(T) of a crystal, whereT is the
temperature.Solve for p(v) if
(b)C(T) = Te“1/F
(a)O(T) =T
(5.4) HINT: By a suitable change of variables, the integral can be
s+1.
(b) From Appendix C, we find the inverseas z(t) = CJo(t),
where Jp is the Bessel function of the first kind, of order zero. Appying the initial condition once again gives
made to be a Laplace transform.
7. We have seen that two crucial properties of the Laplace
transformare its linearity and the propertythatL{f’(t)}
=
x(0) = 1 = CJo(0) = C, so C = 1, and the desiredso- sf(s) — f(0); thatis, the transformof thederivative is of the
lution of (5.1)is a(t) = Jo(t). Here, however,we ask you simple form L{f’(t)} = af(s) + 6. With thesepropertiesin
to proceed as though you don’t know about Bessel functions.
Specifically,
mind, consider the general integral transform
re-express (5.4) as
d
X(s)=s/1+(1/s?)
—-“—=C(1
set):
2 8°
8
(5.5)
where the last equality amounts to the Taylor expansion of
V1 +rin the quantity r, about r = 0, where r = 1/s?. Carry
that expansion further; invert the resulting series term by term
(assuming that that step is valid), and thus show that
t?
1 ¢!
1 ¢
a(t)=C}l-—
+m
u(t)
met
(apzat ~ (Bn?
2
|
Setting7(0) = 1 gives C’ = 1, and the result is thatwe have
f(t) at
F(s)=/ K(t,s)
{equation (1) in Section 5.1] from a “design”
1.1)
point of view:
how to choose the limits c, d and the kernel A(t, s) to achieve
these properties. Since 0 < t < oo, it is reasonable to choose
c = Oandd = o. Further, (7.1) automatically satisfies the lin-
earitypropertyL{au(t)+Gv(t)} = ab{u(t)}+GL{v(t)} be-
cause the right side of (7.1) is an integral and integrals satisfy
the property of linearity. Thus, we simply ask you for a logical
derivationof thechoice K(t, s) = e~**so thatL{ f’(t)} is of
theformaf(s) + 6.
5.5 Discontinuous Forcing Functions; Heaviside Step Function
Although we show in Section 5.2 that a given function has a Laplace transform if
it is piecewise continuous on 0 < t < A for every A and of exponential order as
$t \to \infty$, we have thus far avoided functions with discontinuities. In applications,
however, systems are often subjected to discontinuous forcing functions. For instance, a circuit might be subjected to an applied voltage that is held constant at 12
volts for a minute and then shut off (i.e., reduced to zero for all subsequent time).
In this section we study systems with forcing functions that are discontinuous, although we still assume that they are piecewise continuous on 0 < t < A for every
A and of exponential order as $t \to \infty$, so that they are Laplace transformable.
We begin by defining the Heaviside step function* or unit step function
(Fig. 1a),
$$H(t) = \begin{cases} 0, & t < 0 \\ 1, & t \ge 0 \end{cases} \tag{1}$$
which is a basic building block for our discussion. The value of H(t) at t = 0 (i.e.,
at the jump discontinuity) is generally inconsequential in applications. We have
chosen H(0) = 1 somewhat arbitrarily, and do not show the value of H(t) at t = 0
in Fig. 1a to suggest that it is unimportant in this discussion.
Figure 1. Unit step function.
Since H(t) is a unit step at t = 0, H(t - a) is a unit step shifted to t = a, as
shown in Fig. 1b. In fact, the step function is useful in building up more complicated cases. We begin with the rectangular pulse shown in Fig. 2. Denoting that
function as P(t; a, b), we have
$$P(t; a, b) = H(t - a) - H(t - b). \tag{2}$$
More generally, observe that any piecewise continuous function
$$f(t) = \begin{cases} f_1(t), & 0 < t < t_1 \\ f_2(t), & t_1 < t < t_2 \\ \quad\vdots & \\ f_n(t), & t_n < t < \infty \end{cases} \tag{3}$$
defined on 0 < t < ∞ (which is the interval of interest in Laplace transform applications) can be given by the single expression
$$f(t) = f_1(t)\,P(t; 0, t_1) + \cdots + f_{n-1}(t)\,P(t; t_{n-1}, t_n) + f_n(t)\,H(t - t_n). \tag{4}$$
*Oliver Heaviside (1850-1925), initially a telegraph and telephone engineer, is best known for his contributions to vector field theory and to the development of a systematic Laplace transform methodology for the solution of differential equations. Note the spelling: Heaviside, not Heavyside.
In $0 < t < t_1$, for instance, each P function in (4) is zero except for the first, which equals unity in that interval; also, $H(t - t_n)$ is zero there, so (4) gives $f(t) = f_1(t)$. Similarly in $t_1 < t < t_2$, ..., and $t_{n-1} < t < t_n$. In $t_n < t < \infty$, each P function is zero and $H(t - t_n)$ is unity, so (4) gives $f(t) = f_n(t)$ there.
Figure 2. Rectangular pulse.
Note that (3) does not define f(t) at the endpoints $0, t_1, \ldots, t_n$. The Laplace transform of f will be the same no matter what those values are (assuming that they are finite) since the transform is an integral, an integral represents area, and there is no area under a finite number of points. Thus, those values will be inconsequential; hence we don't even specify them in (3).
EXAMPLE 1. The function
$$f(t) = \begin{cases} 2 + t^2, & 0 < t < 2 \\ 6, & 2 < t < 3 \\ 2/(2t - 5), & 3 < t < \infty \end{cases} \tag{5}$$
shown in Fig. 3, can be expressed, according to (4), as
$$f(t) = (2 + t^2)\,[H(t) - H(t - 2)] + 6\,[H(t - 2) - H(t - 3)] + \frac{2}{2t - 5}\,H(t - 3). \tag{6}$$
Figure 3. f(t) of Example 1.
Actually, since the interval is 0 < t < ∞ we cannot distinguish between H(t) and unity, so we could replace the H(t) in the first term by 1. ■
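As a quick sanity check on the building-block idea in (4), the following sketch compares the branch-by-branch definition (5) with the single expression (6) at a set of sample points. The function names and the choice of sample points are illustrative assumptions, not part of the text.

```python
def H(t):
    """Heaviside step with H(0) = 1, matching definition (1)."""
    return 1.0 if t >= 0 else 0.0

def P(t, a, b):
    """Rectangular pulse (2): H(t - a) - H(t - b)."""
    return H(t - a) - H(t - b)

def f_piecewise(t):
    """Example 1 written branch by branch, as in (5)."""
    if t < 2:
        return 2 + t**2
    if t < 3:
        return 6.0
    return 2 / (2 * t - 5)

def f_single(t):
    """Example 1 written as the single expression (6)."""
    return ((2 + t**2) * P(t, 0, 2)
            + 6 * P(t, 2, 3)
            + 2 / (2 * t - 5) * H(t - 3))

# Sample points avoid the breakpoints and t = 2.5, where 2/(2t - 5) is
# undefined in floating point even though H(t - 3) multiplies it by zero.
samples = [0.1 * k + 0.05 for k in range(100)]
max_gap = max(abs(f_piecewise(t) - f_single(t)) for t in samples)
print(max_gap)
```

Note the practical caveat in the comment: on paper the factor H(t − 3) switches the last term off for t < 3, but a program still evaluates 2/(2t − 5) first, so the point t = 2.5 has to be avoided or guarded.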
EXAMPLE 2. Ramp function. The function
$$f(t) = \begin{cases} 0, & 0 < t < a \\ t - a, & a < t < \infty \end{cases} \tag{7}$$
shown in Fig. 4 is called a ramp function and, according to (4), it can be expressed as f(t) = (t − a)H(t − a). ■
Figure 4. The ramp function of Example 2.
Before considering applications, observe that
$$L\{H(t - a)\} = \int_0^\infty H(t - a)\,e^{-st}\,dt = \int_a^\infty e^{-st}\,dt = \frac{e^{-as}}{s},$$
so the Laplace transform of H(t − a) is
$$L\{H(t - a)\} = \frac{e^{-as}}{s}. \tag{8}$$
Also important to us is the result
$$L\{H(t - a)\,f(t - a)\} = e^{-as}F(s) \tag{9a}$$
or, equivalently,
$$L^{-1}\{e^{-as}F(s)\} = H(t - a)\,f(t - a) \tag{9b}$$
for any (Laplace-transformable) function f(t). Proof is as follows:
$$L\{H(t - a)f(t - a)\} = \int_0^\infty H(t - a)f(t - a)\,e^{-st}\,dt = \int_a^\infty f(t - a)\,e^{-st}\,dt = \int_0^\infty f(\tau)\,e^{-s(\tau + a)}\,d\tau = e^{-as}\int_0^\infty f(\tau)\,e^{-s\tau}\,d\tau = e^{-as}F(s), \tag{10}$$
where the third equality follows the change of variables t − a = τ. In words, H(t − a)f(t − a) is the function f(t) delayed by a time interval a, as illustrated in Fig. 5 for the function f(t) = sin t.
Figure 5. Delay significance of H(t − a)f(t − a).
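The results (8) and (9a) can be checked numerically. The sketch below approximates the Laplace integral with a plain trapezoidal rule on a truncated interval; the helper `laplace`, the truncation length, and the test values a = 1.5, s = 1.2 are assumptions made only for illustration.

```python
import math

def H(t):
    """Heaviside step with H(0) = 1."""
    return 1.0 if t >= 0 else 0.0

def laplace(f, s, t_max=40.0, n=100_000):
    """Trapezoidal approximation of the integral of f(t) e^{-st} over [0, t_max]."""
    h = t_max / n
    total = 0.5 * (f(0.0) + f(t_max) * math.exp(-s * t_max))
    for k in range(1, n):
        t = k * h
        total += f(t) * math.exp(-s * t)
    return total * h

a, s = 1.5, 1.2

# (8): L{H(t - a)} = e^{-as} / s
lhs_step = laplace(lambda t: H(t - a), s)
rhs_step = math.exp(-a * s) / s

# (9a) with f(t) = sin t, for which F(s) = 1 / (s^2 + 1)
lhs_shift = laplace(lambda t: H(t - a) * math.sin(t - a), s)
rhs_shift = math.exp(-a * s) / (s**2 + 1)

print(lhs_step, rhs_step)
print(lhs_shift, rhs_shift)
```

The step-function check is less accurate than the delayed-sine check because the trapezoidal rule loses accuracy at a jump; the delayed sine is continuous at t = a.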
EXAMPLE 3. LC Circuit. We saw in Section 2.3 that the differential equation governing the charge Q(t) on the capacitor in the circuit shown (Fig. 6) is LQ'' + RQ' + (1/C)Q = E(t). Let R = 0 and let E(t) be the rectangular pulse shown in Fig. 2, of magnitude $E_0$. If $Q(0) = Q_0$ and Q'(0) = 0, then we have the initial-value problem
$$LQ'' + \frac{1}{C}Q = E_0\,[H(t - 2) - H(t - 5)], \tag{11a}$$
$$Q(0) = Q_0, \qquad Q'(0) = 0 \tag{11b}$$
on Q(t). [Since the current i(t) is dQ/dt, Q'(0) = 0 means that i(0) = 0, so we can think of a switch being open until time t = 0, and closed at that instant.] We wish to solve (11) for Q(t). Be careful: we will need to distinguish the inductance L from the Laplace transform L by the context.
Figure 6. RLC circuit.
Taking a Laplace transform of (11a), and using (11b) and (8), gives
$$L\left(s^2\bar{Q}(s) - sQ_0\right) + \frac{1}{C}\bar{Q}(s) = E_0\,\frac{e^{-2s} - e^{-5s}}{s} \tag{12}$$
so
$$\bar{Q}(s) = Q_0\,\frac{s}{s^2 + \omega^2} + \frac{E_0}{L}\,\frac{e^{-2s} - e^{-5s}}{s\,(s^2 + \omega^2)}, \tag{13}$$
where $\omega = 1/\sqrt{LC}$. [Generally, we use the notation L{f(t)} = F(s), but in Q(t) the Q is already capitalized, so we use $L\{Q(t)\} = \bar{Q}(s)$ instead.] To invert (13), we begin with
$$L^{-1}\left\{\frac{1}{s\,(s^2 + \omega^2)}\right\} = 1 * \frac{\sin\omega t}{\omega} = \int_0^t \frac{\sin\omega\tau}{\omega}\,d\tau = \frac{1 - \cos\omega t}{\omega^2}. \tag{14}$$
Then, using (14) and (9b) and $L^{-1}\{s/(s^2 + \omega^2)\} = \cos\omega t$ from Appendix C, we have
$$Q(t) = Q_0\cos\omega t + E_0 C\,\{H(t - 2)\,[1 - \cos\omega(t - 2)] - H(t - 5)\,[1 - \cos\omega(t - 5)]\}, \tag{15}$$
which is shown in Fig. 7 for the representative case where $Q_0 = E_0 = L = C = 1$.
Figure 7. Q(t) given by (15).
COMMENT 1. Most striking is the way the use of the Heaviside notation and the Laplace transform have enabled us to solve for Q(t) on the entire t domain (0 < t < ∞). In contrast, if we rely on the methods of Chapter 3 we need to break (11) into three separate problems:
$$0 < t < 2: \quad LQ'' + (1/C)Q = 0, \quad Q(0) = Q_0, \quad Q'(0) = 0 \tag{16a}$$
$$2 < t < 5: \quad LQ'' + (1/C)Q = E_0, \quad Q(2) = ?, \quad Q'(2) = ? \tag{16b}$$
$$5 < t < \infty: \quad LQ'' + (1/C)Q = 0, \quad Q(5) = ?, \quad Q'(5) = ? \tag{16c}$$
First, we solve (16a) for Q(t) on 0 < t < 2. The final values from that solution, Q(2) and Q'(2), then serve as the initial conditions for the next problem, (16b). Then we solve for Q(t) on 2 < t < 5 and use the final values from that solution, Q(5) and Q'(5), as the initial conditions for the next problem, (16c). Clearly, this approach is more tedious than the Laplace transform approach that led to (15).
COMMENT 2. A fundamental question comes to mind: Does the discontinuous nature of the input E(t) result in the output Q(t) being discontinuous as well? We can see from the graph in Fig. 7 that the answer is no. The continuity of Q(t) may be surprising from the solution form (15), because of the presence of the two Heaviside functions in (15). However, the jump discontinuity implied by the H(t − 2) is eliminated by its 1 − cos ω(t − 2) factor since the latter vanishes at t = 2. Similarly, the jump discontinuity that is implied by the H(t − 5) is eliminated by its 1 − cos ω(t − 5) factor since the latter vanishes at t = 5. ■
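One way to gain confidence in the single-formula solution (15), and in the continuity claim of Comment 2, is to integrate the initial-value problem (11) directly and compare. The sketch below uses a hand-written classical Runge-Kutta step for the representative case Q_0 = E_0 = L = C = 1; the step count, the end time t = 8, and all function names are illustrative assumptions.

```python
import math

def H(t):
    """Heaviside step with H(0) = 1."""
    return 1.0 if t >= 0 else 0.0

L_ind, C, Q0, E0 = 1.0, 1.0, 1.0, 1.0        # representative case used for Fig. 7
w = 1.0 / math.sqrt(L_ind * C)

def Q_formula(t):
    """Closed-form solution (15)."""
    return (Q0 * math.cos(w * t)
            + E0 * C * (H(t - 2) * (1 - math.cos(w * (t - 2)))
                        - H(t - 5) * (1 - math.cos(w * (t - 5)))))

def rhs(t, q, i):
    """First-order system for (11a): q' = i, i' = (E(t) - q/C) / L."""
    E = E0 * (H(t - 2) - H(t - 5))
    return i, (E - q / C) / L_ind

def rk4(t_end, n):
    """Integrate from t = 0 with Q(0) = Q0, Q'(0) = 0 using n RK4 steps."""
    h = t_end / n
    q, i = Q0, 0.0
    for k in range(n):
        t = k * h
        k1 = rhs(t, q, i)
        k2 = rhs(t + h / 2, q + h / 2 * k1[0], i + h / 2 * k1[1])
        k3 = rhs(t + h / 2, q + h / 2 * k2[0], i + h / 2 * k2[1])
        k4 = rhs(t + h, q + h * k3[0], i + h * k3[1])
        q += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        i += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return q

q_num = rk4(8.0, 8000)
print(q_num, Q_formula(8.0))
# Continuity of (15) across the switching times t = 2 and t = 5
print(Q_formula(2 - 1e-9), Q_formula(2 + 1e-9))
print(Q_formula(5 - 1e-9), Q_formula(5 + 1e-9))
```

The numerical integrator sees the same discontinuous forcing but never needs the three-problem bookkeeping of (16); the small mismatch that remains comes from the steps that straddle t = 2 and t = 5.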
To better understand how a discontinuous input can produce a continuous output, consider the following simplified situation, the equation
$$Q''(t) = H(t - a) \tag{17}$$
with discontinuous right-hand side. Integrating (17) gives
$$Q'(t) = (t - a)\,H(t - a) + A, \tag{18}$$
because the derivative of the right-hand side is indeed H(t − a), as can be seen from Fig. 4. Integrating again, we obtain
$$Q(t) = \frac{(t - a)^2}{2}\,H(t - a) + At + B \tag{19}$$
and these results are shown in Fig. 8 (for the case where A = B = 0, say).
The idea is that a differential equation is solved by a process which, essentially, involves integration, and integration is a smoothing process! For observe that whereas Q''(t) = H(t − a) is discontinuous at t = a, Q'(t) = (t − a)H(t − a) is continuous but with a "kink," and $Q(t) = (t - a)^2 H(t - a)/2$ is continuous and smooth (differentiable) as well.
Figure 8. The smoothing effect of integration.
EXAMPLE 4. RC Circuit. In Example 3 we took R = 0 in the circuit shown in Fig. 6, and considered the resulting LC circuit. Here, let us take L = 0 instead, and consider the resulting RC circuit, governed by the first-order equation RQ' + (1/C)Q = E(t). Further, let Q(0) = 0 and let E(t) = 50t on 0 < t < 2 and E(t) = 40 on 2 < t < ∞ (sketch it). According to (4) then, E(t) = 50t[1 − H(t − 2)] + 40H(t − 2). Let R = C = 1, for simplicity. Then the initial-value problem on Q(t) is
$$Q' + Q = 50t + (40 - 50t)\,H(t - 2), \tag{20a}$$
$$Q(0) = 0. \tag{20b}$$
Laplace transforming (20a),
$$s\bar{Q}(s) + \bar{Q}(s) = \frac{50}{s^2} + L\{(40 - 50t)\,H(t - 2)\}, \tag{21}$$
where
$$\begin{aligned} L\{(40 - 50t)\,H(t - 2)\} &= L\{[-60 - 50(t - 2)]\,H(t - 2)\} \\ &= -60\,L\{H(t - 2)\} - 50\,L\{(t - 2)\,H(t - 2)\} \\ &= -60\,\frac{e^{-2s}}{s} - 50\,e^{-2s}L\{t\} = -60\,\frac{e^{-2s}}{s} - 50\,\frac{e^{-2s}}{s^2}. \end{aligned} \tag{22}$$
Putting (22) into (21) and solving for $\bar{Q}(s)$ gives
$$\bar{Q}(s) = \frac{50}{s^2(s + 1)} - 60\,\frac{e^{-2s}}{s(s + 1)} - 50\,\frac{e^{-2s}}{s^2(s + 1)}, \tag{23}$$
which we now need to invert. Taking one term at a time,
$$L^{-1}\left\{\frac{1}{s(s + 1)}\right\} = L^{-1}\left\{\frac{1}{s}\cdot\frac{1}{s + 1}\right\} = 1 * e^{-t} = \int_0^t e^{-\tau}\,d\tau = 1 - e^{-t}, \tag{24a}$$
$$L^{-1}\left\{\frac{1}{s^2(s + 1)}\right\} = L^{-1}\left\{\frac{1}{s^2}\cdot\frac{1}{s + 1}\right\} = t * e^{-t} = \int_0^t (t - \tau)\,e^{-\tau}\,d\tau = t - 1 + e^{-t}, \tag{24b}$$
so that, using (9b) for the two terms containing $e^{-2s}$,
$$Q(t) = 50\,(t - 1 + e^{-t}) + H(t - 2)\left(90 - 50t + 10\,e^{-(t - 2)}\right).$$
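A closed form such as the one in Example 4 is easy to test by substituting it back into the differential equation. The sketch below takes the closed form to be Q(t) = 50(t − 1 + e^{−t}) + H(t − 2)(90 − 50t + 10e^{−(t−2)}), which is what inverting (23) with (24a), (24b) and (9b) gives, and checks the residual of (20a) with a central difference. The sample points and step size are assumptions chosen for illustration.

```python
import math

def H(t):
    """Heaviside step with H(0) = 1."""
    return 1.0 if t >= 0 else 0.0

def E(t):
    """Forcing of Example 4: 50t on 0 < t < 2, then 40."""
    return 50 * t * (1 - H(t - 2)) + 40 * H(t - 2)

def Q(t):
    """Closed form obtained by inverting (23) with (24a), (24b) and (9b)."""
    return (50 * (t - 1 + math.exp(-t))
            + H(t - 2) * (90 - 50 * t + 10 * math.exp(-(t - 2))))

def residual(t, h=1e-5):
    """Q' + Q - E(t), with Q' approximated by a central difference."""
    dQ = (Q(t + h) - Q(t - h)) / (2 * h)
    return dQ + Q(t) - E(t)

samples = [0.5, 1.0, 1.9, 2.1, 3.0, 6.0]     # all away from the switch at t = 2
worst = max(abs(residual(t)) for t in samples)
jump = abs(Q(2 + 1e-9) - Q(2 - 1e-9))         # continuity across t = 2
print(worst, jump, Q(0.0))
```

The same three checks (residual of the equation, initial condition, continuity at the switching time) apply to any solution written with Heaviside functions.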
EXERCISES
5.5
1. Use (4) to give a single expression for f(t), and give a labeled sketch of its graph, as well. From that expression, evaluate the transform F(s) of f(t).
(a) f(t) = t on 0 < t < 2, 4 − t on 2 < t < 4, and 0 on t > 4
(b) f(t) =e7"fon0 <t<1,00nt>1
(i)H(t—3)[H(t—2)—H(t—1)]
3. Evaluate in terms of Heaviside functions. You may use
these results for the definite and indefinite integrals of the
Heaviside function:
(c) f(t) = 2on0<t<5,-30n5<t<7,lont>7
(d)f(t) =t®?
-ton0<t<1,-6o0nt>1
(e)f(t)
(f) f(t)
t > 20
(g)f(t)
(h)f(t)
=2-—ton0<t<2,2t-6on2
<t<5,tont>5
= & on0 <t < 10,34?—2ton 10 < t < 20, 5t on and
=sinton0 <t < 57, 0ont > 57
= coston0 <t<7,-lont>
7
2. Draw a labeled sketch of the graph of each function.
(a)H(t —le’?
(b) H(t − 2π) cos (t − 2π)
(c) (1 + t)H(t − 2)
(d) (2 + t)[H(t − 2) − H(t − 3)]
(e) t[H(t − 1) − H(t − 2) + H(t − 3)]
(f) $t^2$[2H(t − 1) − H(t − 3) − H(t − 4)]
(g) [H(t − π/2) − H(t − π)] sin t
(h) 1 + H(t − 1) + H(t − 2) + H(t − 3) + H(t − 4)
0, t<0 _
I H(r)dr ={ t too =tHO.
t
t
|
H(r) dr = tH(t) + constant.
2
9 [((2)—
(b)fs ve
~2)dr
(c)fl [1—H(r —5)| dr
(d)[) (H(7—a) —H(r ~b)]dr
(e) fe |[H(7—2)—H(r ~3)|dr
(f)fo"H(r —1)dr
(b > a)
(3.1)
(3.2)
(d)0on0<t<5,100n5<t<7,00nt>7
(g)fy H(r ~t)dr
(e)0 fort 4 5, 100 fort = 5
(h) ¢* H(t — 1)
(f)1—e'on0
(i)sint* [H(t —1) ~ H(t — 2)|
(j) e!
(g)Qon0
« H(t —5)
<t <6,0ont >6
3<t<4,0ont>4
(k) Ll»H(t —1)
<t<3,1on
< 2,20n2
<t<l1,lonl<t
4. (a)-(k) Evaluate the integral in the corresponding part of Exercise 3 using computer software such as the Maple int command.
5. Solve x' − x = f(t), where x(0) = 0, by the methods of this section, where f(t) is:
(a) H(t − 1)
(b) $e^{-t}$H(t − 3)
(c) t on 0 < t < 2, 2 on t > 2
6. (a)-(j) Same as Exercise 5, but using computer software such as the Maple dsolve command.
7. (a)-(j) Same as Exercise 5, but for x'' − x = f(t), x(0) = x'(0) = 0.
5.6 Impulsive Forcing Functions; Dirac Impulse Function (Optional)
Besides forcing functions that are discontinuous, we are interested in ones that are impulsive; that is, sharply focused in time. For instance, consider the forced mechanical oscillator governed by the differential equation
$$mx'' + cx' + kx = f(t), \tag{1}$$
where f(t) is the force applied to the mass. If the force is due to a hammer blow, for instance, initiated at time t = 0, then we expect f(t) to be somewhat as sketched in Fig. 1a. However, we do not know the functional form of f(t) corresponding to such an event as a hammer blow, so the problem that we pose is how to proceed with the solution of (1) without knowing f. Of course we can solve (1) in terms of f, but eventually we need to know f to find the response x(t).
In working with impulsive forces one normally tries to avoid dealing with the detailed shape of f and tries to limit one's concern to a global quantity known as the impulse I of f, the area under its graph. The idea is that if ε really is small, then the response x(t), while sensitive to I, should be rather insensitive to the detailed shape of f. That is, if we vary the shape of f but keep its area the same, then we expect little change in the response x(t). This idea suggests that we replace the unknown f by a simple rectangular pulse having the correct impulse as shown in Fig. 1b: f(t) = I/ε for 0 < t < ε, and 0 for t > ε. With f thus simplified we can proceed to solve for the response x(t). But even so, the solution still depends upon ε, and the latter is probably not known, just as the actual shape of f is not known. Thus, we adopt one more idealization: we suppose that since ε is very small, we might as well take the limit of the solution as ε → 0, to eliminate ε.
Figure 1. Impulsive force at t = 0.
Let us denote such a rectangular pulse having a unit impulse (I = 1) as D(t; ε):
$$D(t; \epsilon) = \begin{cases} 1/\epsilon, & 0 \le t \le \epsilon \\ 0, & t > \epsilon, \end{cases} \tag{2}$$
where we use D (after the physicist P. A. M. Dirac, who developed the idea of impulsive forces in 1929). As ε → 0, D becomes taller and narrower as shown in Fig. 2, in such a way as to maintain its unit area. Of course, the limit
$$\lim_{\epsilon \to 0} D(t; \epsilon) = \begin{cases} \infty, & t = 0 \\ 0, & t > 0 \end{cases} \tag{3}$$
does not exist, because ∞ is not an acceptable value, but Dirac showed that it is nevertheless useful to think of that limiting case as representing an idealized point unit impulse focused at t = 0.
Figure 2. Letting ε → 0 in (2).
To explain, we first prove that
$$\lim_{\epsilon \to 0} \int_0^\infty g(\tau)\,D(\tau; \epsilon)\,d\tau = g(0) \tag{4}$$
for any function g that is continuous at the origin. To begin our proof, write
$$\lim_{\epsilon \to 0} \int_0^\infty g(\tau)\,D(\tau; \epsilon)\,d\tau = \lim_{\epsilon \to 0} \int_0^\epsilon g(\tau)\,\frac{1}{\epsilon}\,d\tau. \tag{5}$$
Suppose that g is continuous on 0 ≤ τ ≤ b for some positive b. We can assume that ε < b because we are letting ε → 0. Thus, g is continuous on the integration interval 0 ≤ τ ≤ ε, so the mean value theorem of the integral calculus tells us that there is a point $\tau_1$ in [0, ε] (i.e., the closed interval 0 ≤ τ ≤ ε) such that $\int_0^\epsilon g(\tau)\,d\tau = g(\tau_1)\,\epsilon$. Thus, (5) gives
$$\lim_{\epsilon \to 0} \int_0^\infty g(\tau)\,D(\tau; \epsilon)\,d\tau = \lim_{\epsilon \to 0} \frac{g(\tau_1)\,\epsilon}{\epsilon} = \lim_{\epsilon \to 0} g(\tau_1) = g(0), \tag{6}$$
where the last equality holds since $\tau_1$ is in the interval [0, ε], and ε is going to zero. Finally, since b is arbitrarily small, we only need the continuity of g at τ = 0. This completes our proof of (4).
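The limit in (4) can be watched numerically. The sketch below integrates g(τ)D(τ; ε) with a midpoint rule for a shrinking sequence of ε values; the test function g and the list of ε values are arbitrary choices made for illustration.

```python
import math

def D(t, eps):
    """Unit-area rectangular pulse (2): height 1/eps on 0 <= t <= eps."""
    return 1.0 / eps if 0.0 <= t <= eps else 0.0

def sift(g, eps, n=2000):
    """Midpoint-rule value of the integral of g(tau) D(tau; eps) over [0, eps]."""
    h = eps / n
    return sum(g((k + 0.5) * h) * D((k + 0.5) * h, eps) for k in range(n)) * h

def g(tau):
    """Any function continuous at the origin will do; here g(0) = 3."""
    return 2.0 + math.cos(3.0 * tau)

errors = [abs(sift(g, eps) - g(0.0)) for eps in (1.0, 0.1, 0.01, 0.001)]
print(errors)
```

The errors shrink as ε does, which is the content of (6): the integral is the average of g over [0, ε], and that average tends to g(0).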
For brevity, it is customary to dispense with calling attention to the ε limit and to express (4) as
$$\int_0^\infty g(\tau)\,\delta(\tau)\,d\tau = g(0), \tag{7}$$
where δ(τ) is known as the Dirac delta function, or unit impulse function. We can think of δ(τ) as being zero everywhere except at the origin and infinite at the origin, in such a way as to have unit area, but it must be noted that that definition is not satisfactory within the framework of ordinary function theory. To create a legitimate place for the delta function, one needs to extend the concept of function. That was done by L. Schwartz, and the result is known as the theory of distributions, but that theory is well beyond our present scope.
Let us illustrate the application of the delta function with an example.
EXAMPLE 1. Consider (1), with m = k = 1 and c = 0; let f(t) correspond to a hammer blow as sketched in Fig. 1a, and let x(0) = x'(0) = 0, so that before the blow the mass is at rest. The solution of the problem
$$x'' + x = f(t), \qquad x(0) = x'(0) = 0 \tag{8}$$
is found, for instance, by using the Laplace transform, to be
$$x(t) = \int_0^t \sin(t - \tau)\,f(\tau)\,d\tau. \tag{9}$$
As outlined above, the idea is to replace f(τ) by a rectangular pulse I D(τ; ε) having the same area I as f(τ) and then to take the limit as ε → 0:
$$x(t) = \lim_{\epsilon \to 0} \int_0^t \sin(t - \tau)\,I\,D(\tau; \epsilon)\,d\tau = \lim_{\epsilon \to 0} \int_0^\epsilon \sin(t - \tau)\,\frac{I}{\epsilon}\,d\tau = I \lim_{\epsilon \to 0} \frac{\cos(t - \epsilon) - \cos t}{\epsilon} = I \sin t, \tag{10}$$
where the last equality follows from l'Hôpital's rule.
Alternatively and more simply, let f(τ) = Iδ(τ) in (9), where the scale factor I is needed since the delta function is a unit impulse whereas we want the impulse to be I. Then property (7) of the delta function gives
$$x(t) = \int_0^t \sin(t - \tau)\,I\,\delta(\tau)\,d\tau = I \sin(t - \tau)\Big|_{\tau = 0} = I \sin t, \tag{11}$$
as obtained previously in (10). You may be concerned that we have applied (7) even though the upper integration limits in (7) and (9) are not the same. However, in (5) we see that the ∞ was immediately changed to ε, and then we let ε tend to zero. Thus, (7) holds for any positive upper limit; we used ∞ just for definiteness. ■
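To see how quickly the rectangular-pulse response approaches the delta-function response I sin t, the sketch below evaluates the integral (9) with f = I·D(τ; ε) by a midpoint rule. The function name `x_pulse`, the evaluation time t = 1, and the ε values are illustrative assumptions.

```python
import math

def x_pulse(t, eps, impulse=1.0, n=2000):
    """Evaluate (9) for f = impulse * D(tau; eps) using the midpoint rule."""
    upper = min(t, eps)                    # the pulse vanishes for tau > eps
    h = upper / n
    total = sum(math.sin(t - (k + 0.5) * h) for k in range(n)) * h
    return impulse * total / eps

t = 1.0
for eps in (0.5, 0.05, 0.005):
    print(eps, x_pulse(t, eps), math.sin(t))
```

For t > ε the integral can also be done by hand, giving I[cos(t − ε) − cos t]/ε as in (10), so the printed values should differ from sin t by roughly ε cos(t)/2.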
Let us review the idea. Since, generally, we know neither the exact shape nor the duration of an impulsive forcing function f, we do two things to solve for the response. We replace f by an equivalent rectangular pulse (i.e., having the same impulse, or area, as f), solve for x(t), and then we let the width of the pulse, ε, tend to zero. Equivalently and more simply, we take f to be a Dirac delta function and evaluate the resulting integral using the fundamental property (7) of the delta function. The latter procedure is more efficient because one no longer needs to take the limit of the integral as ε → 0; the limit was already carried out, once and for all, in our derivation of (4).
Since δ(t) is focused at t = 0, it follows that δ(t − a) is focused at t = a, and (7) generalizes to
$$\int_0^\infty g(\tau)\,\delta(\tau - a)\,d\tau = g(a). \tag{12}$$
278
Chapter 5. Laplace Transform
Here we continue to use the 0, ∞ limits, but it should be understood that the result is g(a) for any limits A, B (with B > A) such that the point of action of the delta function is contained within the interval of integration. If the point τ = a falls outside the interval, then the integral is zero. Thus, for reference, we give the following more complete result:*
$$\int_A^B g(\tau)\,\delta(\tau - a)\,d\tau = \begin{cases} g(a), & A \le a < B \\ 0, & a < A \ \text{or} \ a \ge B. \end{cases} \tag{13}$$
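EXAMPLE 2. RC Circuit. Recall from Section 5.5 that the charge Q(t) on the capacitor of the RC circuit is governed by the differential equation RQ' + (1/C)Q = E(t). Let E(t) be an impulsive voltage, with impulse I acting at t = T, and let $Q(0) = Q_0$. We wish to solve for Q(t). Expressing E(t) = Iδ(t − T), the initial-value problem is
$$Q' + \kappa Q = I\,\delta(t - T), \qquad Q(0) = Q_0, \tag{14}$$
where κ = 1/(RC). Taking the Laplace transform of (14) gives $s\bar{Q} - Q_0 + \kappa\bar{Q} = I\,L\{\delta(t - T)\}$ so
$$\bar{Q} = \frac{Q_0}{s + \kappa} + \frac{I}{s + \kappa}\,L\{\delta(t - T)\}, \tag{15}$$
and
$$\begin{aligned} Q(t) &= Q_0 e^{-\kappa t} + I\,e^{-\kappa t} * \delta(t - T) \\ &= Q_0 e^{-\kappa t} + I \int_0^t e^{-\kappa(t - \tau)}\,\delta(\tau - T)\,d\tau \\ &= Q_0 e^{-\kappa t} + \begin{cases} 0, & t < T \\ I\,e^{-\kappa(t - T)}, & t > T \end{cases} \\ &= Q_0 e^{-\kappa t} + I\,H(t - T)\,e^{-\kappa(t - T)}, \end{aligned} \tag{16}$$
where the third equality follows from (13). ■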
Observe that we do not need to know the transform of the delta function in Example 2; we merely call its transform L{δ(t − T)}, and inversion by the convolution theorem gives us back the δ(t − T) that we started with. Nonetheless, for reference, let us work out its Laplace transform. According to (12),
$$L\{\delta(t - a)\} = \int_0^\infty \delta(t - a)\,e^{-st}\,dt = e^{-st}\Big|_{t = a} = e^{-as}. \tag{17}$$
In particular, L{δ(t)} = 1.
*Following (12), we state that the result is g(a) if the delta function acts within the integration interval. How then do we interpret the integral when a is at an endpoint (A or B)? We've met that case in equation (7). Since the D(τ; ε) sequence (Fig. 2) is defined on [0, ε], the delta function acts essentially to the right of τ = 0, hence within the interval of integration, and the result of the integration is g(0). To be consistent, let us suppose that the D sequence is always to the right of the point τ = a. Then the integral in (13) will be g(a) if A ≤ a < B and 0 if a < A or a ≥ B.
Since this section is about the Laplace transform, the independent variable has been the time t, so the delta function has represented actions that are focused in time. But the argument of the delta function need not be time. For instance, if w(x) is the load distribution on a beam (Fig. 3a), in pounds per unit length, then δ(x − a) represents a point unit load (i.e., one pound) at x = a (Fig. 3b).
Let us close this discussion with a comment on the delta function idealization from a modeling point of view. Consider a metal plate, extending over −∞ < x < ∞ and 0 < y < ∞, loaded by pressing a metal coin against it, at the origin, with a force P (Fig. 4a). If one is to determine (from the theory of elasticity) the stress distribution within the plate, one needs to know the load distribution w(x) along the edge of the plate (namely, the x axis). Because the coin will flatten slightly, at the point of contact, the load w(x) will be distributed over a short interval, say from x = −ε to x = ε. However, the function w(x) is not known a priori and its determination is part of the problem. Whether one needs to determine the exact w(x) distribution or if it suffices to represent it simply as an idealized point force of magnitude P, w(x) = Pδ(x), depends upon whether one is interested in the "near field" or the "far field." By the near field we mean that part of the plate within several ε lengths of the point of the load application, for instance, within the dashed semicircle shown in Fig. 4b. The far field is the region beyond. If we are concerned only with the far field, then it should suffice to use
$$w(x) = P\,\delta(x), \tag{18}$$
but if concerned with the near field then the approximation (18) will lead to large errors. A ball bearing manufacturer, for instance, is primarily interested in the near field induced by a loaded ball bearing due to concern regarding wear and surface damage. Within the theory of elasticity, the insensitivity of the far field to the detailed shape of w(x) [given that the area under the w(x) graph is held fixed] is an example of Saint Venant's principle.
Computer software. The Maple name for δ(t) is Dirac(t).
Closure. We introduce the delta function out of a need to deal effectively with impulsive forcing functions, functions that are highly focused in time or space. Often we know neither the precise form of such a function nor the precise interval of application. If that interval is short enough one can model the force as an idealized point force, represented mathematically as a delta function δ(t). One is not so much interested in the numerical values of δ(t) [indeed, one says that δ(0) = ∞] as in the effect of integration upon a delta function, and that effect is expressed by (13), which we regard as the most important formula in this section.
Figure 4. Delta function idealization.
EXERCISES
5.6
1. Solve for a(t), on0 <t < cw,
is nonzero only at ¢ = 0, so there is no difference between
f(t)6(t) and f(0)d(z).
(a)a” ~ a = 6(f- 2); (0) = 2'(0) =0
(b)2” ~ de = 6d(t~ 1); 2(0) = 0, x'(0) = -3
(c) a” ~ 3a 4+2¢=2+d6(t-5);
2(0)=2'(0)=0
(a)
(d)a”+a’ =14+6(t—2);
2(0)=0,(0)=3
(e)a" +2a" +e2=100(t-5);
x
(f) 20” —a’ = d(t- 1) -—d(t- 2);
\ =2(0) =0
x(0)= 2’(0) =0
Ga"
(O)=
—4e" = 36(t- 1);
—
(0)
=
(2.5)
the delta and Heaviside functions. Alternatively, we can write
that relation as
(3.1)
2(0) = 2/(0) =a “(0 )=0,
1
(k)a” ~—
5a" +42 = 6d(t- 2);
2!"
5(r)dr=H(t)
3. The result (2.5), above, reveals the close relation between
(g)a” —3a’ + 2x = 1006(t- 3); 2(0) = 4, 2/(0)=0
(h)a’” = 26(¢-5);
2(0)= 2’(0) = x”(0) = 0
(i) a” + 32" 4+22’ = d(t-—5); 2(0) = 2'(0) = x"(0) =0
a"
[
0
(0) = x'(0) = x”(0) = The latter follows from (2.5) only in a formal sense, but is
(I)cl”—«=d(t-1); 2(0)
=2(0)=x"(0)
= 2""(0)
=0
2. Show that the delta function has these properties, where
« is a nonzeroreal constant,and the function f(t) is contin-
uous at the origin. NOTE: Recall that the delta function is
defined by its integral behavior. Thus, by an equation such as
quite useful, along with (2.2)-(2.5).
a(t)
= d(t) we meanthat
d(~—t)
i
g(t)5(—t)
dt=i
—OO
g(t)6(t)dt
(2.1)
OO
for every function g(t) that is continuous at the origin. The
rightsideof (2.1)is g(0), so to showthat6(—t)= d(t
i), in part
(a),you needto verify thatthe left side of (2.1) is g(0) too.
(a)
5(—t) = d(t)
(b)
b(xt)=ig 6(t) («#0)
@—fepsty=y
TOPOFAO
0,
f(0)=0
(2.2)
For instance, suppose
we wish to verify thatx(t)= H(t—1)sin
initial-valueproblem2” +a2=6(t-1);
Differentiatingw(t) gives
=
=
=
(t — 1) satisfiesthe
2«(0)= 2’(0) =0.
A(t —1)cos(t—1)
A’(t—1)sin(t—1)+
6(t—1)sin(t —1) + H(t—1)cos(t—1)
0+ A(t—1)cos(t—1),
(3.2)
and
H(t — 1)sin(t—- 1)
H'(t—1)cos(t—1)—
6(t —1)cos(t —1) —H(t —1)sin(t —1)
o(t —1) —H(t —1)sin(t - 1)
(3.3)
sox +x doesgive 6(t—1). In thesecondequality in (3.2)we
used(3.1),and in thethird we used(2.4):6(¢— 1) sin (t —1)
= 6(t —1)sinO = 0. In thesecondequalityin (3.3)we used
z(t)
=
i]
(2.3) (3.1), and in
the third we used (2.4): 6(¢ — 1) cos(t-
1) =
6(t —1)cosO = 6(t —1). Further,we seethat(0) = 0 and,
gy
For instance, (3¢ + 2)d(t) = 2d(t), (siné)d(t) = 0,
(3t+ 2)6(¢~ 1)= 5d(¢—1),and(t?+ ¢—2)d(t—1) = 0.
Formally, the first part of (2.4) makes sense as follows: 6(t)
from (3.2), that 2'(0) = 0. Here is the problem: In the same
manner as above, verify the following solutions that are given
in the Answers to the Selected Exercises.
(a) exercise
I(a)
(b) exercise 1(d)
(c) exercise I(g
5.7 Additional Properties
In Section 5.3 we establish the linearity of the transform and its inverse, the transform of the derivative f'(t), and the Laplace convolution theorem, results that we
deem essential in applying the Laplace transform to the solution of differential
equations. In this final section of Chapter 5 we present several additional useful
properties of the Laplace transform.
THEOREM 5.7.1 s-Shift
If L{f(t)} = F(s) exists for $s > s_0$, then for any real constant a,
$$L\{e^{-at}f(t)\} = F(s + a) \tag{1}$$
for $s + a > s_0$ or, equivalently,
$$L^{-1}\{F(s + a)\} = e^{-at}f(t). \tag{2}$$
Proof:
$$L\{e^{-at}f(t)\} = \int_0^\infty e^{-at}f(t)\,e^{-st}\,dt = \int_0^\infty f(t)\,e^{-(s + a)t}\,dt = F(s + a). \quad ■$$
EXAMPLE
=
1. DetermineL {t¥e*!}.FromAppendixC, L {t?} = 6/s* so it follows
from Theorem 5.7.1 that
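(3)
to note that
$$\begin{aligned} L^{-1}\left\{\frac{2s + 1}{s^2 + 2s + 4}\right\} &= L^{-1}\left\{\frac{2(s + 1) - 1}{(s + 1)^2 + 3}\right\} \\ &= 2\,L^{-1}\left\{\frac{s + 1}{(s + 1)^2 + 3}\right\} - L^{-1}\left\{\frac{1}{(s + 1)^2 + 3}\right\} \\ &= 2e^{-t}\cos\sqrt{3}\,t - \frac{e^{-t}}{\sqrt{3}}\sin\sqrt{3}\,t, \end{aligned}$$
where in the last step we use entries 3 and 4 in Appendix C and Theorem 5.7.1. ■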
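An inversion obtained with the s-shift theorem can be checked by transforming the answer forward again. The sketch below assumes, reading the damaged source, that the transform being inverted is (2s + 1)/(s² + 2s + 4), and compares it with a numerical Laplace transform of 2e^{−t} cos √3 t − (e^{−t}/√3) sin √3 t. The `laplace` helper (trapezoidal rule on a truncated interval) and the sample s values are assumptions for illustration.

```python
import math

def laplace(f, s, t_max=40.0, n=100_000):
    """Trapezoidal approximation of the integral of f(t) e^{-st} over [0, t_max]."""
    h = t_max / n
    total = 0.5 * (f(0.0) + f(t_max) * math.exp(-s * t_max))
    for k in range(1, n):
        t = k * h
        total += f(t) * math.exp(-s * t)
    return total * h

ROOT3 = math.sqrt(3.0)

def f(t):
    """Candidate inverse: 2 e^{-t} cos(sqrt(3) t) - e^{-t} sin(sqrt(3) t) / sqrt(3)."""
    return (2.0 * math.exp(-t) * math.cos(ROOT3 * t)
            - math.exp(-t) * math.sin(ROOT3 * t) / ROOT3)

def F(s):
    """Transform assumed to be the one being inverted."""
    return (2.0 * s + 1.0) / (s**2 + 2.0 * s + 4.0)

gaps = [abs(laplace(f, s) - F(s)) for s in (0.5, 2.0)]
print(gaps)
```

Completing the square, s² + 2s + 4 = (s + 1)² + 3, is what exposes the shift s → s + 1 that Theorem 5.7.1 handles.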
THEOREM 5.7.2 t-Shift
If L{f(t)} = F(s) exists for $s > s_0$, then for any constant a > 0
$$L\{H(t - a)\,f(t - a)\} = e^{-as}F(s) \tag{6}$$
for $s > s_0$ or, equivalently,
$$L^{-1}\{e^{-as}F(s)\} = H(t - a)\,f(t - a). \tag{7}$$
Equations (6) and (7) are already given in Section 5.5, where we studied the Heaviside step function, but we repeat them here because the t-shift results seem a natural companion for the s-shift results given in Theorem 5.7.1.
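THEOREM 5.7.3 Multiplication by 1/s
If L{f(t)} = F(s) exists for $s > s_0$, then
$$L\left\{\int_0^t f(\tau)\,d\tau\right\} = \frac{F(s)}{s} \tag{8}$$
for $s > \max\{0, s_0\}$ or, equivalently,
$$L^{-1}\left\{\frac{F(s)}{s}\right\} = \int_0^t f(\tau)\,d\tau. \tag{9}$$
Proof: This theorem is but a special case of the convolution theorem. Specifically, $\int_0^t f(\tau)\,d\tau = 1 * f$ so, according to that theorem,
$$L\left\{\int_0^t f(\tau)\,d\tau\right\} = L\{1 * f\} = L\{1\}\,L\{f(t)\} = \frac{1}{s}\,F(s), \tag{10}$$
as asserted. ■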
EXAMPLE 3. To evaluate $L^{-1}\{1/[s(s^2 + 1)]\}$, for example, we identify F(s) as $1/(s^2 + 1)$. Since $f(t) = L^{-1}\{1/(s^2 + 1)\} = \sin t$,
$$L^{-1}\left\{\frac{1}{s(s^2 + 1)}\right\} = \int_0^t \sin\tau\,d\tau = 1 - \cos t. \tag{11}$$
Alternatively, we could have used partial fractions. ■
Next, we obtain two useful theorems by differentiating and integrating the definition
$$F(s) = \int_0^\infty f(t)\,e^{-st}\,dt \tag{12}$$
with respect to s. First, we state without proof that if the integral in (12) converges for $s > s_0$, then
$$\frac{dF(s)}{ds} = \frac{d}{ds}\int_0^\infty f(t)\,e^{-st}\,dt = \int_0^\infty \frac{\partial}{\partial s}\left[f(t)\,e^{-st}\right]dt = -\int_0^\infty t\,f(t)\,e^{-st}\,dt = -L\{t\,f(t)\}, \tag{13}$$
for $s > s_0$, and
$$\int_a^b F(s)\,ds = \int_a^b \int_0^\infty f(t)\,e^{-st}\,dt\,ds = \int_0^\infty f(t)\left(\int_a^b e^{-st}\,ds\right)dt \tag{14}$$
for $b > a > s_0$. The key step in (13) is the second equality, where we have inverted the order of the integration with respect to t and the differentiation with respect to s. In (14), the key is again the second equality, where we have inverted the order of integration with respect to t and the integration with respect to s.
In particular, if a = s and b = ∞, then (14) becomes
$$\int_s^\infty F(s')\,ds' = \int_0^\infty f(t)\left(\int_s^\infty e^{-s't}\,ds'\right)dt = \int_0^\infty \frac{f(t)}{t}\,e^{-st}\,dt = L\left\{\frac{f(t)}{t}\right\} \tag{15}$$
for all $s > s_0$.
For reference, we state the results (13) and (15) as theorems.
THEOREM 5.7.4 Differentiation with Respect to s
If L{f(t)} = F(s) exists for $s > s_0$, then
$$L\{t\,f(t)\} = -\frac{dF(s)}{ds} \tag{16}$$
for $s > s_0$ or, equivalently,
$$L^{-1}\left\{\frac{dF(s)}{ds}\right\} = -t\,f(t). \tag{17}$$
EXAMPLE 4. From the known transform
$$L\{\sin at\} = \frac{a}{s^2 + a^2} \qquad (s > 0) \tag{18}$$
we may use (16) to infer the additional results
$$L\{t \sin at\} = -\frac{d}{ds}\,\frac{a}{s^2 + a^2} = \frac{2as}{(s^2 + a^2)^2}, \qquad (s > 0) \tag{19}$$
$$L\{t^2 \sin at\} = -\frac{d}{ds}\,\frac{2as}{(s^2 + a^2)^2} = 2a\,\frac{3s^2 - a^2}{(s^2 + a^2)^3}, \qquad (s > 0) \tag{20}$$
and so on. ■
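The transforms (19) and (20) obtained by differentiating with respect to s can be compared against direct numerical integration of the defining integral. The sketch below uses a trapezoidal `laplace` helper on a truncated interval; the helper and the sample values a = 2, s = 1.5 are assumptions made for illustration.

```python
import math

def laplace(f, s, t_max=40.0, n=100_000):
    """Trapezoidal approximation of the integral of f(t) e^{-st} over [0, t_max]."""
    h = t_max / n
    total = 0.5 * (f(0.0) + f(t_max) * math.exp(-s * t_max))
    for k in range(1, n):
        t = k * h
        total += f(t) * math.exp(-s * t)
    return total * h

a, s = 2.0, 1.5

# (19): L{t sin at} = 2as / (s^2 + a^2)^2
num_19 = laplace(lambda t: t * math.sin(a * t), s)
exact_19 = 2 * a * s / (s**2 + a**2) ** 2

# (20): L{t^2 sin at} = 2a (3s^2 - a^2) / (s^2 + a^2)^3
num_20 = laplace(lambda t: t**2 * math.sin(a * t), s)
exact_20 = 2 * a * (3 * s**2 - a**2) / (s**2 + a**2) ** 3

print(num_19, exact_19)
print(num_20, exact_20)
```

Each further factor of t costs one more differentiation of F(s) and one more sign change, which is why the pattern in (19) and (20) continues "and so on."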
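THEOREM 5.7.5 Integration with Respect to s
If there is a real number $s_0$ such that L{f(t)} = F(s) exists for $s > s_0$, and $\lim_{t \to 0} f(t)/t$ exists, then
$$L\left\{\frac{f(t)}{t}\right\} = \int_s^\infty F(s')\,ds' \tag{21}$$
for $s > s_0$ or, equivalently,
$$L^{-1}\left\{\int_s^\infty F(s')\,ds'\right\} = \frac{f(t)}{t}. \tag{22}$$
EXAMPLE 5. To evaluate
$$L^{-1}\left\{\ln\frac{s - a}{s - b}\right\} \tag{23}$$
where a and b are real numbers, note that
$$\int_s^{s_1}\left(\frac{1}{s' - a} - \frac{1}{s' - b}\right)ds' = \ln\frac{s' - a}{s' - b}\,\Bigg|_s^{s_1} = \ln\frac{s_1 - a}{s_1 - b} - \ln\frac{s - a}{s - b} \tag{24}$$
for any $s_1$. Letting $s_1 \to \infty$ and recalling that ln 1 = 0, (24) gives
$$\ln\frac{s - a}{s - b} = \int_s^\infty\left(-\frac{1}{s' - a} + \frac{1}{s' - b}\right)ds'. \tag{25}$$
Thus, identify F(s) in (21) as −1/(s − a) + 1/(s − b) (which does exist for $s > \max\{a, b\} \equiv s_0$). Then
$$f(t) = L^{-1}\left\{-\frac{1}{s - a} + \frac{1}{s - b}\right\} = e^{bt} - e^{at}. \tag{26}$$
Furthermore,
$$\lim_{t \to 0}\frac{f(t)}{t} = \lim_{t \to 0}\frac{e^{bt} - e^{at}}{t} = b - a \tag{27}$$
does exist, the last equality following from l'Hôpital's rule, so (22) gives the desired inverse as
$$L^{-1}\left\{\ln\frac{s - a}{s - b}\right\} = \frac{e^{bt} - e^{at}}{t}. \tag{28}$$
■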
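The inverse found in Example 5 can be tested by transforming (e^{bt} − e^{at})/t numerically and comparing with ln[(s − a)/(s − b)]. In the sketch below the values a = 1, b = −1, s = 3 are arbitrary choices satisfying s > max{a, b}, and the integrand at t = 0 is set to its limit b − a from (27); the `laplace` helper is an assumption for illustration.

```python
import math

def laplace(f, s, t_max=40.0, n=100_000):
    """Trapezoidal approximation of the integral of f(t) e^{-st} over [0, t_max]."""
    h = t_max / n
    total = 0.5 * (f(0.0) + f(t_max) * math.exp(-s * t_max))
    for k in range(1, n):
        t = k * h
        total += f(t) * math.exp(-s * t)
    return total * h

a, b, s = 1.0, -1.0, 3.0

def f_over_t(t):
    """(e^{bt} - e^{at}) / t, with the limiting value b - a used at t = 0."""
    if t == 0.0:
        return b - a
    return (math.exp(b * t) - math.exp(a * t)) / t

numeric = laplace(f_over_t, s)
exact = math.log((s - a) / (s - b))
print(numeric, exact)
```

The existence of the limit (27) is what keeps the integrand finite at t = 0, which is why Theorem 5.7.5 lists it as a hypothesis.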
THEOREM 5.7.6 Large s Behavior of F(s)
Let f(t) be piecewise continuous on $0 < t < t_0$ for each finite $t_0$ and of exponential order as t → ∞. Then
(i) F(s) → 0 as s → ∞,
(ii) sF(s) is bounded as s → ∞.
Proof: Since f(t) is of exponential order as t → ∞, there exist real constants K and c, with K > 0, such that $|f(t)| < Ke^{ct}$ for all $t \ge t_0$ for some sufficiently large $t_0$. And since f(t) is piecewise continuous on $0 < t < t_0$, there must be a finite constant M such that |f(t)| < M on $0 < t < t_0$. Then
$$\begin{aligned} |F(s)| &= \left|\int_0^\infty f(t)\,e^{-st}\,dt\right| \le \int_0^\infty |f(t)|\,e^{-st}\,dt = \int_0^{t_0} |f(t)|\,e^{-st}\,dt + \int_{t_0}^\infty |f(t)|\,e^{-st}\,dt \\ &< \int_0^{t_0} M e^{-st}\,dt + \int_{t_0}^\infty K e^{-(s - c)t}\,dt = M\,\frac{1 - e^{-st_0}}{s} + K\,\frac{e^{-(s - c)t}}{-(s - c)}\,\Bigg|_{t_0}^\infty \\ &< \frac{M}{s} + \frac{K}{s - c} \end{aligned} \tag{29}$$
for all s > c. It follows from this result that F(s) → 0 as s → ∞, and that sF(s) is bounded as s → ∞. ■
For instance, for each of the entries 1-7 in Appendix C we do have F(s) → 0 and sF(s) bounded as s → ∞. For entry 8 we do too, unless −1 < p < 0, in which case F(s) → 0 but sF(s) is not bounded. However, in this case $f(t) = t^p$ is not piecewise continuous since f(t) → ∞ as t → 0 if p is negative.
THEOREM 5.7.7 Initial-Value Theorem
Let f be continuous and f' be piecewise continuous on $0 \le t \le t_0$ for each finite $t_0$, and let f and f' be of exponential order as t → ∞. Then
$$\lim_{s \to \infty}\,[sF(s)] = f(0). \tag{30}$$
Proof: With the stated assumptions on f and f', it follows from Theorem 5.3.3 that
$$L\{f'(t)\} = s\,L\{f(t)\} - f(0). \tag{31}$$
Since f' satisfies the conditions of Theorem 5.7.6, it follows that L{f'(t)} → 0 as s → ∞. Thus, letting s → ∞ in (31) gives the result stated in (30). ■
Normally, we invert F(s) and obtain f(t). However, if it is only f(0) that we desire, not f(t), we do not need to invert F(s); all we need to do, according to (30), is to determine the limit of sF(s) as s → ∞.
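As a small illustration of reading f(0) directly off the transform, the sketch below takes f(t) = 3 cos 2t + e^{−t}, whose transform F(s) = 3s/(s² + 4) + 1/(s + 1) follows from standard table entries, and tabulates sF(s) for increasing s. The choice of f is an assumption made only for this example.

```python
def F(s):
    """Transform of f(t) = 3 cos 2t + e^{-t}."""
    return 3.0 * s / (s**2 + 4.0) + 1.0 / (s + 1.0)

f0 = 3.0 + 1.0                      # f(0) = 3 cos 0 + e^0

for s in (1e1, 1e2, 1e3, 1e4):
    print(s, s * F(s))              # should approach f0 as s grows

gap = abs(1e6 * F(1e6) - f0)
print(gap)
```

No inversion is needed: only the large-s behavior of sF(s) enters, exactly as (30) states.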
As our final item, we show how to transform a periodic function, which is important in applications. First, we define what is meant by a periodic function. If there exists a positive constant T such that
$$f(t + T) = f(t) \tag{32}$$
for all t ≥ 0, then we say that f is periodic, with period T.
EXAMPLE 6. The function sin t is periodic with period 2π since sin (t + 2π) = sin t cos 2π + sin 2π cos t = sin t, for all t. ■
EXAMPLE 7. The function f shown in Fig. 1 is, by inspection, seen to be periodic with period T = 4, for if we "stamp out" the segment ABC repeatedly, then we generate the graph of f. ■
Figure 1. Periodic function f.
Notice that if f is periodic with period T, then it is also periodic with period 2T, 3T, 4T, and so on. For instance, it follows from (32) that
$$f(t + 2T) = f((t + T) + T) = f(t + T) = f(t)$$
so that if f is periodic with period T then it is also periodic with period 2T. If there is a smallest period, it is called the fundamental period. Thus, sin t in Example 6 is periodic with period 2π, 4π, 6π, ..., so its fundamental period is 2π; f(t) in Example 7 is periodic with period 4, 8, 12, ..., so its fundamental period is 4. In contrast, f(t) = 3 (i.e., a constant) is periodic with period T for every T > 0. Thus, there is no smallest period, and hence this f does not have a fundamental period.
To evaluate the Laplace transform of a periodic function f(t), with period T (which is normally taken to be the fundamental period of f, if f has a fundamental period), it seems like a good start to break up the integral on t as
$$L\{f(t)\} = \int_0^\infty f(t)\,e^{-st}\,dt = \int_0^T f(t)\,e^{-st}\,dt + \int_T^{2T} f(t)\,e^{-st}\,dt + \cdots. \tag{33}$$
Next, let τ = t in the first integral on the right side of (33), τ = t − T in the second, τ = t − 2T in the third, and so on. Thus,
$$L\{f(t)\} = \int_0^T f(\tau)\,e^{-s\tau}\,d\tau + \int_0^T f(\tau + T)\,e^{-s(\tau + T)}\,d\tau + \int_0^T f(\tau + 2T)\,e^{-s(\tau + 2T)}\,d\tau + \cdots, \tag{34}$$
but f(τ + T) = f(τ), f(τ + 2T) = f(τ), and so on, so (34) becomes
$$L\{f(t)\} = \left(1 + e^{-sT} + e^{-2sT} + \cdots\right)\int_0^T f(\tau)\,e^{-s\tau}\,d\tau. \tag{35}$$
Unfortunately, this expression is an infinite series. However, observe that
$$1 + e^{-sT} + e^{-2sT} + \cdots = 1 + \left(e^{-sT}\right) + \left(e^{-sT}\right)^2 + \cdots \tag{36}$$
is a geometric series $1 + z + z^2 + \cdots$, with $z = e^{-sT}$, and the latter is known to have the sum 1/(1 − z) if |z| < 1. Since $|z| = |e^{-sT}| = e^{-sT} < 1$ if s > 0, we can sum the parenthetic series in (35) as $1/(1 - e^{-sT})$.
Finally, if we ask that f be piecewise continuous on 0 ≤ t ≤ T, to ensure the existence of the integral in (35), then we can state the result as follows.
THEOREM 5.7.8 Transform of Periodic Function
If f is periodic with period T on 0 ≤ t < ∞ and piecewise continuous on one period, then
$$L\{f(t)\} = \frac{1}{1 - e^{-sT}}\int_0^T f(t)\,e^{-st}\,dt \tag{37}$$
for s > 0.
The point, of course, is that (37) requires integration only over one period, rather than over 0 < t < ∞, and gives the transform in closed form rather than as an infinite series.
EXAMPLE
8. If f is the sawtooth wave shown in Fig. 2, then 1 = 2, and
[
f(t)e
dt = [
so
Lb
)}=
1—(1+2s8)e7?8
2te“
0
dt = git
2| 1—(14
1
{f(t
‘
~
L-en
+ ese
(38)
§
2|
2s)e7*8
as
g2
2
g2
2
e738
4
_
s
1 ~e72s
e
(
39
)
for s > 0.
A more interesting question is the reverse: What is the inverse of
es
2
4
se
gs1—~e7s
where we pretend no advance knowledge of the sawtooth wave in Fig. 2, or even the
knowledge that f(t) is periodic? The key is to proceed in reverse — that is, to expand
the 1/(1 — e~*5)in a geometric series in powers of e~**.Thus
F(s)=-
- =e"
i Ee |
(lL+e7% +e7* +--+.)
I Nw
-
o>
+
mr
i
+
&
is
f
~~
=z=
Figure 3. Partial sumsof (42).
Assuming that the series can be inverted termwise,
$$f(t) = 2t - 4\left[H(t-2) + H(t-4) + H(t-6) + \cdots\right]. \tag{42}$$
The first few partial sums,
$$f_1(t) = 2t, \qquad f_2(t) = 2t - 4H(t-2), \qquad f_3(t) = 2t - 4H(t-2) - 4H(t-4),$$
are sketched in Fig. 3, and it is easy to infer that (42) gives the periodic sawtooth wave shown in Fig. 2.
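The inference from the partial sums can be checked numerically. The sketch below evaluates (42) with a finite number of step functions and compares it with a period-2 sawtooth. The helper names are hypothetical, and the convention H(0) = 1 is an assumption made only so that the comparison is defined at the jump points. Note that `n_terms` counts step functions, so the text's $f_1$ corresponds to `n_terms = 0`.

```python
def heaviside(t):
    """Unit step; the value at t = 0 is taken as 1 (an assumed convention)."""
    return 1.0 if t >= 0 else 0.0

def partial_sum(t, n_terms):
    """2t - 4[H(t-2) + H(t-4) + ...] with n_terms step functions retained."""
    return 2 * t - 4 * sum(heaviside(t - 2 * k) for k in range(1, n_terms + 1))

def sawtooth(t):
    """Period-2 sawtooth equal to 2t on 0 <= t < 2."""
    return 2 * (t % 2)

if __name__ == "__main__":
    for t in (0.5, 1.9, 3.25, 5.5):
        print(t, partial_sum(t, 10), sawtooth(t))
```

With n_terms step functions retained, the partial sum matches the sawtooth only for t < 2(n_terms + 1); beyond that it keeps growing like 2t.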
Figure 4. The staircase $4[H(t-2) + H(t-4) + \cdots]$.
COMMENT. Observe that the presence of $1 - e^{-sT}$ in the denominator of a transform does not suffice to imply that the inverse is periodic. For example, the inverse of $4e^{-2s}/\{s(1 - e^{-2s})\}$, in (40), is the nonperiodic “staircase” shown in Fig. 4. It is only when this staircase is subtracted from 2t that a periodic function results.
This completes our discussion of the Laplace transform. Just as we used it, in
this chapter, to solve linear ordinary differential equation initial-value problems, in
later chapters we will use it to solve linear partial differential equation initial-value
problems.
Closure. It would be difficult to pick out the one or two most important results
in this section, since there are eight theorems, all of comparable importance. Most
of these theorems are included as entries within Appendix C.
EXERCISES
5.7
1. Invert each of the following by any method. Cite any items from Appendix C and any theorems that you use. If it is applicable, verify the initial-value theorem, namely, that sF(s) → f(0) as s → ∞; if it is not, then state why it is not.
(a) - | ae
(s? + a*)?
(d)
s*
~ (s+ 1)3
(b)
: 5
(s? — a’)?
(e)
1
s+
(c)
>“ ;
(s — 2)
(f)
y _-
(s—a)s/?
es
e
e728
(@)or
(h)(s+ 46
ou(s+1)? |
(j)In € + =)
(k) cH 2y4
(H)In (1 - =
eee
ee
4284
(m 8° aS +1
iyn(S**)
@—
(vy)
5x8
the solution is
u(t)
=
of [1
e~ (t-0))
−
(yy)
OY
s-e)
= sint, show that (37)
_ 0)
(6.1)
↔
>
∙
2. (a)-(x) Invert the transform given in the corresponding part of Exercise 1, using computer software.
does indeed give
A(t
A(t —1)
fh
≤
3. (a) In the simple case where f(t)
_
+[1—e~@?)|H(t —2) -
+
(w)=
(F472
we
—[l—e"-)]
(0)s*(s?so
—28 —2)
∶
HINT: It would be wasteful to determine F(s) because the solution can be expressed as a convolution integral involving f(t) directly. In that integral, express f(t) as an infinite series of Heaviside functions.
$$L\{\sin t\} = \frac{1}{s^2 + 1}.$$
(b) Sketch and label the graph of x(t) over 0 < t < 3, say, for the case where $x_0 = 0$. Is x(t) periodic? If not, is there a value of $x_0$ such that x(t) is periodic? Explain.
(b) In the case where f(t) = cos t, show that (37) does indeed give
$$L\{\cos t\} = \frac{s}{s^2 + 1}.$$
4. (Scale changes) Show that if $L\{f(t)\} = F(s)$, then
(a) $L\{f(at)\} = \dfrac{1}{a}F\!\left(\dfrac{s}{a}\right)$
(b) $L^{-1}\{F(as)\} = \dfrac{1}{a}f\!\left(\dfrac{t}{a}\right)$
7. Solve $x' + x = f(t)$ by the Laplace transform, where $x(0) = x_0$ and f(t) is the periodic function shown. HINT: Read Exercise 6.
5. Determine the Laplace transform of the function f(t) that
is periodic and defined on one period as follows.
(a)sint,
O<t<7
(c)sin2t,
(e)
t,
O<t<a7
0,
O<t<1
−
1<t<2
2,
0O<t<l
(g)¢ 4, 1<t<2
1,
2<t<38
(b)
1,
O<t<2
0,
2<t<3
(djem',
(f)
t,
t-2,
−
(a)
I
O<t<2
O0O<t<1
~
1<t<2
2
(b)
Ae
∶
∶∶∶
2
4
6
8
t
6. (a) Solve $x' + x = f(t)$ by the Laplace transform, where $x(0) = x_0$ and f(t) is the square wave shown, and show that
8. Solve $x'' + x = f(t)$ by the Laplace transform, where $x(0) = x_0$, $x'(0) = x_0'$, and f(t) is the square wave shown in Exercise 6. Evaluate x(5) if $x_0 = x_0' = 1$.
Chapter 5 Review
The Laplace transform has a variety of uses, but its chief application is in the solution of linear ordinary and partial differential equations. In this chapter
our focus is on its use in solving linear ordinary differential equations with constant
coefficients. The power of the method is that it reduces such a differential equation, homogeneous or not, to a linear algebraic one. The hardest part, the inversion,
is often accomplished with the help of tables, a considerable number of theorems,
and computer software. Also, any initial conditions that are given become built in,
in the process of transforming the differential equation, so they do not need to be
applied at the end.
Chief properties given in Section 5.3 are:
Linearity of the transform and its inverse
$$L\{\alpha u(t) + \beta v(t)\} = \alpha L\{u(t)\} + \beta L\{v(t)\},$$
$$L^{-1}\{\alpha U(s) + \beta V(s)\} = \alpha L^{-1}\{U(s)\} + \beta L^{-1}\{V(s)\},$$
Transform of derivatives
$$L\{f'\} = sF(s) - f(0), \qquad L\{f''\} = s^2F(s) - sf(0) - f'(0),$$
Convolution Theorem
$$L\{(f*g)(t)\} = F(s)G(s) \quad\text{or}\quad L^{-1}\{F(s)G(s)\} = (f*g)(t),$$
where
$$(f*g)(t) = \int_0^t f(\tau)\,g(t-\tau)\,d\tau$$
is the Laplace convolution of f and g.
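In Sections 5.5 and 5.6 we introduce the step and impulse functions $H(t-a)$ and $\delta(t-a)$, defined by
$$H(t-a) = \begin{cases} 0, & t < a \\ 1, & t \ge a \end{cases}$$
and
$$\int_A^B g(t)\,\delta(t-a)\,dt = \begin{cases} g(a), & A \le a < B \\ 0, & a < A \ \text{or}\ a \ge B, \end{cases}$$
to model piecewise-defined and impulsive forcing functions.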
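Finally, in Section 5.7 we derive additional properties:
s-shift
$$L\{e^{-at}f(t)\} = F(s+a) \quad\text{or}\quad L^{-1}\{F(s+a)\} = e^{-at}f(t).$$
t-shift
$$L\{H(t-a)f(t-a)\} = e^{-as}F(s) \quad\text{or}\quad L^{-1}\{e^{-as}F(s)\} = H(t-a)f(t-a).$$
Multiplication by 1/s
$$L\left\{\int_0^t f(\tau)\,d\tau\right\} = \frac{F(s)}{s} \quad\text{or}\quad L^{-1}\left\{\frac{F(s)}{s}\right\} = \int_0^t f(\tau)\,d\tau.$$
Differentiation with respect to s
$$L\{tf(t)\} = -\frac{dF(s)}{ds} \quad\text{or}\quad L^{-1}\left\{\frac{dF(s)}{ds}\right\} = -tf(t).$$
Integration with respect to s
$$L\left\{\frac{f(t)}{t}\right\} = \int_s^\infty F(s')\,ds' \quad\text{or}\quad L^{-1}\left\{\int_s^\infty F(s')\,ds'\right\} = \frac{f(t)}{t}.$$
Large s behavior of F(s)
$$F(s) \to 0 \ \text{as}\ s \to \infty, \qquad sF(s)\ \text{bounded as}\ s \to \infty.$$
Transform of periodic function of period T
$$L\{f(t)\} = \frac{1}{1 - e^{-sT}}\int_0^T f(t)e^{-st}\,dt.$$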
NOTE: The preceding list is intended as an overview so, for brevity, the various
conditions under which these results hold have been omitted.
Chapter 6
Quantitative Methods: Numerical Solution of Differential Equations
6.1 Introduction
Following the introduction in Chapter 1, Chapters 2-5 cover both the underlying
theory of differential equations and analytical solution techniques as well. That
is, the objective thus far has been to find an analytical solution —in closed form if
possible or as an infinite series if necessary. Unfortunately, a great many differential
equations encountered in applications, and most nonlinear equations in particular,
are simply too difficult for us to find analytical solutions.
Thus, in Chapters 6 and 7 our approach is fundamentally different, and complements the analytical approach adopted in Chapters 2-5: in Chapter 6 we develop quantitative methods, and in Chapter 7 our view is essentially qualitative. More specifically, in this chapter we “discretize” the problem and seek, instead of an analytical solution, the numerical values of the dependent variable at a discrete set of values of the independent variable so that the result is a table or graph, with those values determined approximately (but accurately), rather than exactly.
Perhaps the greatest drawback to numerical simulation is that whereas an analytical solution explicitly displays the dependence of the dependent variable(s) on the various physical parameters (such as spring stiffnesses, driving frequencies, electrical resistances, inductances, and so on), one can carry out a numerical solution only for a specific set of values of the system parameters. Thus, parametric studies (i.e., studies of the qualitative and quantitative effects of the various parameters upon the solution) can be tedious and unwieldy, and it is useful to reduce the number of parameters as much as possible (by nondimensionalization, as discussed in Section 2.4.4) before embarking upon a numerical study.
The numerical solution of differential equations covers considerable territory so the present chapter is hardly complete. Rather, we aim at introducing the fundamental ideas, concepts, and potential difficulties, as well as specific methods that are accurate and readily implemented. We do mention computer software that carries out these computations automatically, but our present aim is to provide enough information so that you will be able to select a specific method and program it. In contrast, in Chapter 7, where we look more at qualitative issues, we rely heavily upon available software.
6.2 Euler’s Method
In this section and the two that follow, we study the numerical solution of the first-order initial-value problem
$$y' = f(x, y); \qquad y(a) = b \tag{1}$$
on y(x).
To motivate the first and simplest of these methods, Euler’s method, consider the problem
$$y' = y + 2x - x^2, \qquad y(0) = 1 \qquad (0 < x < \infty) \tag{2}$$
with the exact solution (Exercise 1)
$$y(x) = x^2 + e^x. \tag{3}$$
Of course, in practice one wouldn’t solve (2) numerically because we can solve it analytically and obtain the solution (3), but we will use (2) as an illustration.
In Fig. 1 we display the direction field defined by $f(x, y) = y + 2x - x^2$, as well as the exact solution (3). In graphical terms, Euler’s method amounts to using the direction field as a road map in developing an approximate solution to (2). Beginning at the initial point P, namely (0, 1), we move in the direction dictated by the lineal element at that point. As seen from the figure, the farther we move along that line, the more we expect our path to deviate from the exact solution. Thus, the idea is not to move very far. Stopping at x = 0.5, say, for the sake of illustration, we revise our direction according to the slope of the lineal element at that point Q. Moving in that new direction until x = 1, we revise our direction at R, and so on, moving in x increments of 0.5. We call the x increment the step size and denote it as h. In Fig. 1, h is 0.5.
Let us denote the y values at Q, R, ... as $y_1, y_2, \ldots$. They are computed as $y_1 = y_0 + f(x_0, y_0)h$, $y_2 = y_1 + f(x_1, y_1)h$, ..., where $(x_0, y_0)$ is the initial point P. Expressed as a numerical algorithm, the Euler method is therefore as follows:
$$y_{n+1} = y_n + f(x_n, y_n)h, \qquad (n = 0, 1, 2, \ldots) \tag{4}$$
where f is the function on the right side of the given differential equation (1), $x_0 = a$, $y_0 = b$, h is the chosen step size, and $x_n = x_0 + nh$.
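Algorithm (4) translates almost line for line into code. The following is a minimal sketch in Python (the language, the function name `euler`, and the helper `f_test` are choices made here for illustration; the text does not prescribe a language), applied to the test problem (2) with h = 0.5.

```python
def euler(f, x0, y0, h, n_steps):
    """Apply y_{n+1} = y_n + f(x_n, y_n) h for n_steps steps; return [(x_n, y_n)]."""
    points = [(x0, y0)]
    y = y0
    for n in range(n_steps):
        x = x0 + n * h                      # x_n = x_0 + n h
        y = y + f(x, y) * h                 # Euler update (4)
        points.append((x0 + (n + 1) * h, y))
    return points

def f_test(x, y):
    """Right side of the test problem (2): y' = y + 2x - x^2."""
    return y + 2 * x - x**2

if __name__ == "__main__":
    for x, y in euler(f_test, 0.0, 1.0, 0.5, 3):
        print(f"x = {x:.1f}   y = {y:.4f}")
```

Computing each $x_n$ as $x_0 + nh$, rather than by repeated addition of h, avoids accumulating roundoff in the independent variable.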
Euler’s method is also known as the tangent-line method because the first straight-line segment of the approximate solution is tangent to the exact solution y(x) at P, and each subsequent segment emanating from $(x_n, y_n)$ is tangent to the solution curve through that point.
Figure 1. Direction field motivation of Euler’s method, for the initial-value problem (2).
Apparently, the greater the step size the less accurate the results, in general. For instance, the first point Q deviates more and more from the exact solution as the step size is increased, that is, as the segment PQ is extended. Conversely, we expect the approximate solution to approach the exact solution curve as h is reduced. This expectation is supported by the results shown in Table 1 for the initial-value problem (2), obtained by Euler’s method with step sizes of h = 0.5, 0.1, and 0.02; we have included the exact solution y(x), given by (3), for comparison.

Table 1. Comparison of numerical solution of (2) using Euler’s method, with the exact solution.
 x  | h = 0.5 | h = 0.1 | h = 0.02 | y(x)
0.5 | 1.5000  | 1.7995  | 1.8778   | 1.8987
1.0 | 2.6250  | 3.4344  | 3.6578   | 3.7183
1.5 | 4.4375  | 6.1095  | 6.5975   | 6.7317

With h = 0.5, for instance,
$$y_1 = y_0 + (y_0 + 2x_0 - x_0^2)h = 1 + (1 + 0 - 0)(0.5) = 1.5,$$
$$y_2 = y_1 + (y_1 + 2x_1 - x_1^2)h = 1.5 + (1.5 + 1 - 0.25)(0.5) = 2.625,$$
$$y_3 = y_2 + (y_2 + 2x_2 - x_2^2)h = 2.625 + (2.625 + 2 - 1)(0.5) = 4.4375.$$
With h = 0.1, the values tabulated at x = 0.5, 1.0, 1.5 are $y_5$, $y_{10}$, $y_{15}$, with the intermediate computed y values omitted for brevity.
Scanning each row of the tabulation, we can see that the approximate solution appears to be converging to the exact solution as h → 0 (though we cannot be certain from such results no matter how small we make h), and that the convergence is not very rapid, for even with h = 0.02 the computed value at x = 1.5 is in error by 2%.
As strictly computational as this sounds, two important theoretical questions present themselves: Does the method converge to the exact solution as h → 0 and, if so, how fast? By a method being convergent we mean that for any fixed x value in the x interval of interest the sequence of y values, obtained using smaller and smaller step size h, tends to the exact solution y(x) as h → 0.
Let us see whether the Euler method is convergent. Observe that there are two
sources of error in the numerical solution. One is the tangent-line approximation
upon which the method is based, and the other is the accumulation of numerical
roundoff errors within the computing machine since a machine can carry only a finite number of significant figures, after which it rounds off (or chops off, depending
upon the machine). In discussing convergence, one ignores the presence of such
roundoff error and considers it separately. Thus, in this discussion we imagine our computer to be perfect, carrying an infinite number of significant figures.
Local truncation error. Although we are interested in the accumulation of error after a great many steps have been carried out, to reach any given x, it seems best to begin by investigating the error incurred in a single step, from $x_{n-1}$ to $x_n$ (or from $x_n$ to $x_{n+1}$, it doesn’t matter). We need to distinguish between the exact and approximate solutions so let us denote the exact solution at $x_n$ as $y(x_n)$ and the approximate numerical solution at $x_n$ as $y_n$. These are given by the Taylor series
$$y(x_n) = y(x_{n-1}) + y'(x_{n-1})(x_n - x_{n-1}) + \frac{y''(x_{n-1})}{2}(x_n - x_{n-1})^2 + \cdots = y(x_{n-1}) + y'(x_{n-1})h + \frac{y''(x_{n-1})}{2}h^2 + \cdots \tag{5}$$
and the Euler algorithm
$$y_n = y_{n-1} + f(x_{n-1}, y_{n-1})h, \tag{6}$$
respectively. It is important to understand that the Euler method (6) amounts to retaining only the first two terms of the Taylor series in (5). Thus, it replaces the actual function by its tangent-line approximation.
We suppose that $y(x_{n-1})$ and $y_{n-1}$ are identical, and we ask how large the error $e_n = y(x_n) - y_n$ is after making that single step, from $x_{n-1}$ to $x_n$. We can get an expression for $e_n$ by subtracting (6) from (5), but the right side will be an infinite series. Thus, it is more convenient to use, in place of the (infinite) Taylor series (5), the (finite) Taylor’s formula with remainder,
$$y(x_n) = y(x_{n-1}) + y'(x_{n-1})h + \frac{y''(\xi)}{2}h^2, \tag{7}$$
where ξ is some point in the interval $[x_{n-1}, x_n]$. Now, subtracting (6) from (7), and noting that $y'(x_{n-1}) = f[x_{n-1}, y(x_{n-1})] = f(x_{n-1}, y_{n-1})$ because of our supposition that $y(x_{n-1}) = y_{n-1}$, gives
$$e_n = \frac{y''(\xi)}{2}h^2. \tag{8}$$
The latter expression for $e_n$ is awkward to apply since we don’t know ξ, except that $x_{n-1} \le \xi \le x_n$.* However, (8) is of more interest in that it shows how the single-step error $e_n$ varies with h. Specifically, since $x_{n-1} \le \xi \le x_{n-1} + h$, we see that ξ tends to $x_{n-1}$, so that
$$e_n \sim \frac{y''(x_{n-1})}{2}h^2 = O(h^2) \tag{9}$$
as h → 0. [The big oh notation is defined in Section 4.5, and (9) simply means that $e_n \sim Ch^2$ as h → 0 for some nonzero constant C.]
*It also appears that we do not know the y'' function, but it follows from (2) that $y'' = y' + 2 - 2x = (y + 2x - x^2) + 2 - 2x = y + 2 - x^2$.
Since the error $e_n$ is due to truncation of the Taylor series it is called the truncation error, or more specifically the local truncation error, because it is the truncation error incurred in a single step.
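The $h^2$ dependence of the single-step error can be observed directly for the test problem (2), where the exact solution (3) is available. The sketch below (function names are illustrative assumptions) takes one Euler step starting from the exact value, so that $y(x_{n-1}) = y_{n-1}$ as supposed above, and compares the resulting errors for h and h/2; a ratio near 4 is what $e_n \sim Ch^2$ predicts.

```python
import math

def exact(x):
    """Exact solution (3) of the test problem (2)."""
    return x**2 + math.exp(x)

def f_test(x, y):
    """Right side of (2)."""
    return y + 2 * x - x**2

def single_step_error(x_prev, h):
    """e_n = y(x_n) - y_n for one Euler step started from the exact value at x_prev."""
    y_prev = exact(x_prev)
    y_euler = y_prev + f_test(x_prev, y_prev) * h
    return exact(x_prev + h) - y_euler

def error_ratio(x_prev, h):
    """Ratio of single-step errors for step sizes h and h/2."""
    return single_step_error(x_prev, h) / single_step_error(x_prev, h / 2)

if __name__ == "__main__":
    print(single_step_error(0.0, 0.1), error_ratio(0.0, 0.1))
```

For this problem $y'' = 2 + e^x$, so (8) bounds the single-step error from x = 0 with h = 0.1 between $1.5h^2$ and $\tfrac12(2 + e^{0.1})h^2$.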
Accumulated truncation error and convergence. Of ultimate interest, however, is the truncation error that has accumulated over all of the preceding steps since that error is the difference between the exact solution and the computed solution at any given $x_n$. We denote it as $E_n = y(x_n) - y_n$ and call it the accumulated truncation error. If it seems strange that we have defined both the local and accumulated truncation errors as $y(x_n) - y_n$, it must be remembered that the former is that which results from a single step (from $x_{n-1}$ to $x_n$) whereas the latter is that which results from the entire sequence of steps (from $x_0$ to $x_n$).
We can estimate $E_n$ at a fixed x location (at least insofar as its order of magnitude) as the local truncation error $e_n$ times the number of steps n. Since $e_n = O(h^2)$, this idea gives
$$E_n = O(h^2)\cdot n = O(h^2)\,\frac{nh}{h} = O(h^2)\,\frac{x_n}{h} = O(h)\cdot x_n = O(h), \tag{10}$$
where the last step follows from the fact that the selected location $x_n$ is held fixed as h → 0. Since the big oh notation is insensitive to scale factors, the $x_n$ factor can be absorbed by the O(h) so
$$E_n = O(h), \tag{11}$$
which result tells us how fast the numerical solution converges to the exact solution (at any fixed x location) as h → 0. Namely, $E_n \sim Ch$ for some nonzero constant C. To illustrate, consider the results shown in Table 1, and consider x = 1.5, say, in particular. According to $E_n \sim Ch$, if we reduce h by a factor of five, from 0.1 to 0.02, then likewise we should reduce the error by a factor of five. We find that (6.7317 − 6.1095)/(6.7317 − 6.5975) ≈ 4.6, which is indeed close to five. We can’t expect it to be exactly five for two reasons: First, (11) holds only as h → 0, whereas we have used h = 0.1 and 0.02. Second, we obtained the values in Table 1 using a computer, and a computer introduces an additional error, due to roundoff, which has not been accounted for in our derivation of (11). Probably it is negligible in this example.
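The error-ratio check described above is easy to automate. The following sketch (the helper names `euler_value` and `accumulated_error` are hypothetical) recomputes the Euler solution of (2) at x = 1.5 for two step sizes and forms the ratio of the accumulated truncation errors, using the exact solution (3) for reference.

```python
import math

def exact(x):
    """Exact solution (3)."""
    return x**2 + math.exp(x)

def f_test(x, y):
    """Right side of (2)."""
    return y + 2 * x - x**2

def euler_value(f, x0, y0, h, x_end):
    """Euler approximation to y(x_end), using round((x_end - x0)/h) steps of size h."""
    n_steps = round((x_end - x0) / h)
    y = y0
    for n in range(n_steps):
        y += f(x0 + n * h, y) * h
    return y

def accumulated_error(h, x_end=1.5):
    """E_n = y(x_n) - y_n at x_n = x_end for the test problem (2)."""
    return exact(x_end) - euler_value(f_test, 0.0, 1.0, h, x_end)

if __name__ == "__main__":
    ratio = accumulated_error(0.1) / accumulated_error(0.02)
    print(f"error ratio for h = 0.1 versus 0.02: {ratio:.2f}")
```

For a first-order method the ratio should approach 5 (the factor by which h was reduced) as both step sizes shrink.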
While (11) can indeed be proved rigorously, be aware that our reasoning in (10) was only heuristic. To understand the shortcoming of our logic, consider the diagram in Fig. 2, where we show only two steps, for simplicity.
Figure 2. The global truncation error.
Our reasoning, in writing $E_n = O(h^2)\cdot n$ in (10), is that the accumulated truncation error $E_n$ is (at least insofar as order of magnitude) the sum of the n single-step errors. However, that is not quite true. We see from Fig. 2 that $E_2$ is $e_2 + \beta$, not the sum of the single-step errors $e_2 + e_1$, and β is not identical to $e_1$. The difference between β and $e_1$ is the result of the slightly different slopes of $L_1$ and $L_2$ acting over the short distance h, and that difference can be shown to be a higher-order effect that does not invalidate the final result that $E_n = O(h)$, provided that f is well enough behaved (for example, if $f$, $f_x$, and $f_y$ are all continuous on the x, y region of interest).
In summary, (11) shows that the Euler method (4) is convergent because the accumulated truncation error tends to zero as h → 0. More generally if, for a given method, $E_n = O(h^p)$ as h → 0, then the method is convergent if p > 0, and we say that it is of order p. Thus, the Euler method is a first-order method.
Although convergent and easy to implement, Euler’s method is usually too inaccurate for serious computation because it is only a first-order method. That is, since the accumulated truncation error is proportional to h to the first power, we need to make h extremely small if the error is to be extremely small. Why can’t we do that? Why can’t we merely let $h = 10^{-8}$, say? There are two important reasons. One is that with $h = 10^{-8}$, it would take $10^8$ steps to generate the Euler solution over a unit x interval. That number of steps might simply be impractically large in terms of computation time and expense.
Second, besides the truncation error that we have discussed there is also machine roundoff error, and that error can be expected to grow with the number of calculations. Thus, as we diminish the step size h and increase the number of steps, to reduce the truncation error, we inflict a roundoff error penalty that diminishes the intended increase in accuracy. In fact, we can anticipate the existence of an optimal h value so that to decrease h below that value is counterproductive. Said differently, a given level of accuracy may prove unobtainable because of the growth in the roundoff error as h is reduced. Further discussion of this point is contained in the exercises.
Finally, there is an important practical question not yet addressed: How do we know how small to choose h? We will have more to say about this later, but for now let us give a simple procedure, namely, reducing h until the results settle down to the desired accuracy. For instance, suppose we solve (2) by Euler’s method using h = 0.5, as a first crack. Pick any fixed point x in the interval of interest, such as x = 1.5. The computed solution there is 4.4375. Now reduce h, say to 0.1, and run the program again. The result this time, at x = 1.5, is 6.1095. Since those results differ considerably, reduce h again, say to 0.02, and run the program again. Simply repeat that procedure until the solution at x = 1.5 settles down to the desired number of significant figures. Accept the results of the final run, and discard the others. (Of course, one will not have an exact solution to compare with as we did in Table 1.)
The foregoing idea is merely a rule of thumb, and is the same idea that we use
in computing an infinite series: keep adding more and more terms until successive
partial sums agree with the desired number of significant figures.
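The rule of thumb above can be sketched as a loop that keeps reducing h until two successive results at the chosen point agree to within a tolerance. Halving h at each pass, the tolerance value, and the names `euler_value` and `refine_until_settled` are assumptions made for this illustration; the text reduces h by arbitrary amounts.

```python
import math

def f_test(x, y):
    """Right side of the test problem (2)."""
    return y + 2 * x - x**2

def euler_value(f, x0, y0, h, x_end):
    """Euler approximation to y(x_end), using round((x_end - x0)/h) steps of size h."""
    n_steps = round((x_end - x0) / h)
    y = y0
    for n in range(n_steps):
        y += f(x0 + n * h, y) * h
    return y

def refine_until_settled(f, x0, y0, x_end, h_start, tol, max_halvings=20):
    """Halve h until successive results at x_end differ by less than tol."""
    h = h_start
    previous = euler_value(f, x0, y0, h, x_end)
    for _ in range(max_halvings):
        h /= 2
        current = euler_value(f, x0, y0, h, x_end)
        if abs(current - previous) < tol:
            return current, h          # accept the final run, discard the others
        previous = current
    raise RuntimeError("results did not settle within max_halvings")

if __name__ == "__main__":
    y, h = refine_until_settled(f_test, 0.0, 1.0, 1.5, 0.5, 1e-3)
    print(f"settled value {y:.4f} with h = {h:g}; exact {1.5**2 + math.exp(1.5):.4f}")
```

Because Euler’s method is only first order, many halvings are needed even for a modest tolerance, which illustrates the cost concern raised earlier.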
Closure. The Euler method is embodied in (4). It is easy to implement, either using a hand-held calculator or programming it to be run on a computer. The method
is convergent but only of first order and hence is not very accurate. Thus, it is
important to develop more accurate methods, and we do that in the next section.
We also use our discussion of the Euler method to introduce the concept of
the local and accumulated truncation errors e, and E,, respectively, which are
and of order one.
EXERCISES
6.2
1. Derive the particular solution (3) of the initial-value problem (2).
2. Use the Euler method to compute, by hand, $y_1$, $y_2$, and $y_3$ for the specified initial-value problem using h = 0.2.
ay’ =—-y; y(0)=1
(b)y' = 2zy; (0) =0
(e)y!= 2xe~¥;y(1)= -1
(iy
=5e-2/y,
Qy= Very,
Yndh
y(0)=0
analytically, and that idea is the focus of this exercise. Specif-
ically, consider y’ = Ay, whereA is a given constant. Then
y(1)=2
y(0)=4
(6.1) becomes
y(0)=3
Yn+l = (1 + Ah)yn.
3. Program and run Euler’s method for the initial-value problem y' = f(x, y), with y(0) = 1 and h = 0.1, through $y_{10}$. Print $y_1, \ldots, y_{10}$ and the exact solution $y(x_1), \ldots, y(x_{10})$ as well. (Six significant figures will suffice.) Evaluate $E_{10}$. Use the f(x, y) specified below.
(a) Derive the solution
$$y_n = C(1 + Ah)^n$$
(a)2x
ee
glta+y
(b)—6y?
(c)a+y
(h)—ytan
(i)ett
+1)/2
(e)(y?
(A)dwen¥
4. (a)-(h) Program and run Euler’s method for the initial-value problem y' = f(x, y) (with f given in the corresponding part of Exercise 3), and print out the result at x = 0.5. Use h = 0.1, then 0.05, then 0.01, then 0.005, then 0.001, and compute the accumulated truncation error at x = 0.5 for each case. Is the rate of decrease of the accumulated truncation error, as h decreases, consistent with the fact that Euler’s method is a first-order method? Explain.
(6.1)
as y’ = f(x,y) is a differential equationgoverning y(z). If
f is simple enough it may be possible to solve (6.1) for yp
(fy! =a?—y?; y(3)=5
(g)y' = wsiny;
Yn+1 = Yn + (aan
Besides being a numerical
sequentially, forn = 0,1,2,....
algorithm for the calculation of the y,,’s, (6.1) is an example
of a difference equation governing the sequence of y,,’s, just
(c)y! = 3x7y?; (0) = 0
(dy =1+2ry?; y(1)= —2
(h)y’=tan(r+y);
6. We have seen that by discretizing the problem, we can
approximate the solution y(x) of a differential equation
y’ = f(z, y) byadiscrete variable y,, by solving
(6.2)
(6.3)
of (6.2), where C’ is the initial value yo, if one is specified.
(b) Show that as h → 0 (6.3) does converge to the solution $Ce^{Ax}$ of the original equation y' = Ay. HINT: Begin by expressing $(1 + Ah)^n$ as $e^{\ln (1+Ah)^n}$. NOTE: Thus, for the simple differential equation y' = Ay we have been able to prove the convergence of the Euler method by actually solving (6.2) for $y_n$, in closed form, then taking the limit of that result as h → 0.
(c) Use computer software to obtain the solution (6.3) of the
difference equation (6.2). On Maple, for instance, use the
rsolve command,
5. Thus far we have taken the step h to be positive, and therefore developed a solution to the right of the initial point. Is Euler’s method valid if we use a negative step, h < 0, and hence develop a solution to the left? Explain.
7. In this section we have taken the step size h to be a constant from one step to the next. Is there any reason why we could not vary h from one step to the next? Explain.
6.3 Improvements: Midpoint Rule and Runge-Kutta
Our objective in this section is to develop more accurate methods than the first-order Euler method, namely, higher-order methods. In particular, we are aiming at the widely used fourth-order Runge-Kutta method, which is an excellent general-purpose differential equation solver. To bridge the gap between these two methods, we begin with some general discussion about how to develop higher-order methods.
6.3.1. Midpoint rule. To derive more accurate differential equation solvers, Taylor series (better yet, Taylor’s formula with remainder) offers a useful line of approach. To illustrate, consider the Taylor’s formula with remainder,
$$y(x) = y(a) + y'(a)(x - a) + \frac{y''(\xi)}{2}(x - a)^2, \tag{1}$$
where ξ is some point in [a, x]. If we let $x = x_{n+1}$, $a = x_n$, and $x - a = x_{n+1} - x_n = h$, then (1) becomes
$$y(x_{n+1}) = y(x_n) + y'(x_n)h + \frac{y''(\xi)}{2}h^2. \tag{2}$$
Since y' = f(x, y), we can replace the $y'(x_n)$ in (2) by $f(x_n, y(x_n))$. Also, the last term in (2) can be expressed more simply as $O(h^2)$ so we have
$$y(x_{n+1}) = y(x_n) + f(x_n, y(x_n))h + O(h^2). \tag{3}$$
If we neglect the $O(h^2)$ term and call attention to the approximation thereby incurred by replacing the exact values $y(x_{n+1})$ and $y(x_n)$ by the approximate values $y_{n+1}$ and $y_n$, respectively, then we have the Euler method
$$y_{n+1} = y_n + f(x_n, y_n)h. \tag{4}$$
Since the term that we dropped in passing from (3) to (4) was $O(h^2)$, the local truncation error is $O(h^2)$, and the accumulated truncation error is O(h).
One way to obtain a higher-order method is to retain more terms in the Taylor’s formula. For instance, begin with
$$y(x_{n+1}) = y(x_n) + y'(x_n)h + \frac{y''(x_n)}{2}h^2 + \frac{y'''(\xi)}{6}h^3 \tag{5}$$
in place of (2) or, since $y'' = \frac{d}{dx}f(x, y(x)) = f_x + f_y y' = f_x + f_y f$,
$$y(x_{n+1}) = y(x_n) + f(x_n, y(x_n))h + \frac{1}{2}\left[f_x(x_n, y(x_n)) + f_y(x_n, y(x_n))\,f(x_n, y(x_n))\right]h^2 + O(h^3). \tag{6}$$
If we truncate (6) by dropping the $O(h^3)$ term, and change $y(x_{n+1})$ and $y(x_n)$ to $y_{n+1}$ and $y_n$, respectively, then we have the method
$$y_{n+1} = y_n + f(x_n, y_n)h + \frac{1}{2}\left[f_x(x_n, y_n) + f_y(x_n, y_n)\,f(x_n, y_n)\right]h^2 \tag{7}$$
with a local truncation error that is $O(h^3)$ and an accumulated truncation error that is $O(h^2)$; that is, we now have a second-order method.
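A minimal sketch of the Taylor series method (7) follows. It is applied to the test problem y' = y + 2x − x², y(0) = 1 of Section 6.2, for which $f_x = 2 - 2x$ and $f_y = 1$; the function names, and the need to supply the partial derivatives by hand, are features of this illustration rather than anything prescribed by the text.

```python
import math

def taylor2(f, fx, fy, x0, y0, h, n_steps):
    """Second-order Taylor method (7); fx and fy are the partial derivatives of f."""
    y = y0
    for n in range(n_steps):
        x = x0 + n * h
        slope = f(x, y)
        y += slope * h + 0.5 * (fx(x, y) + fy(x, y) * slope) * h**2
    return y

def f_test(x, y):
    return y + 2 * x - x**2     # right side of the test problem

def fx_test(x, y):
    return 2 - 2 * x            # partial derivative of f with respect to x

def fy_test(x, y):
    return 1.0                  # partial derivative of f with respect to y

if __name__ == "__main__":
    approx = taylor2(f_test, fx_test, fy_test, 0.0, 1.0, 0.1, 5)
    print(approx, 0.5**2 + math.exp(0.5))
```

Each step calls f, fx, and fy once, which is the threefold cost per step relative to Euler’s method.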
Why do we say that (7) is a second-order method? Following the same heuristic reasoning as in Section 6.2, the accumulated truncation error $E_n$ is of the order of the local truncation error times the number of steps so
$$E_n = O(h^3)\cdot n = O(h^3)\,\frac{nh}{h} = O(h^3)\,\frac{x_n}{h} = O(h^2)\cdot x_n = O(h^2),$$
as claimed. In fact, as a simple rule of thumb it can be said that if the local truncation error is $O(h^p)$, with p > 1, then the accumulated truncation error is $O(h^{p-1})$, and one has a (p − 1)th-order method.
Although the second-order convergence of (7) is an improvement over the first-order convergence of Euler’s method, the attractiveness of (7) is diminished by an approximately threefold increase in the computing time per step since it requires three function evaluations ($f$, $f_x$, $f_y$) per step whereas Euler’s method requires only one (f). It’s true that to carry out one step of Euler’s method we need to evaluate f, multiply that by h, and add the result to $y_n$, but we can neglect the multiplication by h and addition of $y_n$ on the grounds that a typical f(x, y) involves many more arithmetic steps than that. Thus, as a rule of thumb, one compares the computation time per step of two methods by comparing only the number of function evaluations per step.
Not satisfied with (7) because it requires three function evaluations, let us return to Taylor’s formula (5). If we replace h by −h, that amounts to making a backward step so the term on the left will be $y(x_{n-1})$ instead of $y(x_{n+1})$. Making those changes, and also copying (5), for comparison, we have
$$y(x_{n-1}) = y(x_n) - y'(x_n)h + \frac{y''(x_n)}{2}h^2 - \frac{y'''(\zeta)}{6}h^3, \tag{8a}$$
$$y(x_{n+1}) = y(x_n) + y'(x_n)h + \frac{y''(x_n)}{2}h^2 + \frac{y'''(\eta)}{6}h^3, \tag{8b}$$
respectively, where ζ is some point in $[x_{n-1}, x_n]$ and η is some point in $[x_n, x_{n+1}]$. Now we can eliminate the bothersome y'' terms by subtracting (8a) from (8b). Doing so gives
$$y(x_{n+1}) - y(x_{n-1}) = 2y'(x_n)h + \frac{y'''(\eta) + y'''(\zeta)}{6}h^3$$
or
$$y(x_{n+1}) = y(x_{n-1}) + 2f(x_n, y(x_n))h + O(h^3).$$
Finally, if we drop the $O(h^3)$ term and change $y(x_{n+1})$, $y(x_{n-1})$, $y(x_n)$ to $y_{n+1}$, $y_{n-1}$, $y_n$, respectively, we have
$$y_{n+1} = y_{n-1} + f(x_n, y_n)(2h), \tag{9}$$
which method is known as the midpoint rule. Like (7), the midpoint rule is a
second-order method but, unlike (7), it requires only one function evaluation per
step. It is an example of a multi-step method because it uses information from
more than one of the preceding points —namely, from two: the nth and (n — 1)th.
Thus, it is a two-step method whereas Euler’s method and the Taylor series method
given by (7) are single-step methods.
A disadvantage of the midpoint rule (and other multi-step methods) is that it is not self-starting. That is, the first step gives $y_1$ in terms of $x_0$, $y_0$, $y_{-1}$, but $y_{-1}$ is not defined. Thus, (9) applies only for n ≥ 1, and to get the method started we need to compute $y_1$ by a different method. For instance, we could use Euler’s method to obtain $y_1$ and then switch over to the midpoint rule (9). Of course, if we do that we should do it not in a single Euler step but in many so as not to degrade the subsequent second-order accuracy.
EXAMPLE 1. Consider the same “test problem” as in Section 6.2,
$$y' = y + 2x - x^2, \qquad y(0) = 1, \qquad (0 < x < \infty) \tag{10}$$
with the exact solution $y(x) = x^2 + e^x$. Let us use the midpoint rule with h = 0.1. To get it started, carry out ten steps of Euler’s method with h = 0.01. The result of those steps is the approximate solution 1.11358 at x = 0.1, which we now take as $y_1$. Then proceeding with the midpoint rule we obtain from (9)
$$y_2 = y_0 + 2(y_1 + 2x_1 - x_1^2)h = 1 + 2(1.11358 + 0.2 - 0.01)(0.1) = 1.26072,$$
$$y_3 = y_1 + 2(y_2 + 2x_2 - x_2^2)h = 1.11358 + 2(1.26072 + 0.4 - 0.04)(0.1) = 1.43772,$$
and so on. The results are shown in Table 1 and contrasted with the less accurate Euler results using the same step size, h = 0.1.
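The procedure of Example 1 can be sketched as follows: ten small Euler steps supply $y_1$, after which the two-step formula (9) takes over. The function names and the `starter_substeps` parameter are assumptions introduced for this illustration.

```python
def f_test(x, y):
    """Right side of the test problem (10)."""
    return y + 2 * x - x**2

def euler_value(f, x0, y0, h, n_steps):
    """Value reached after n_steps Euler steps of size h."""
    y = y0
    for n in range(n_steps):
        y += f(x0 + n * h, y) * h
    return y

def midpoint_rule(f, x0, y0, h, n_steps, starter_substeps=10):
    """Midpoint rule (9); y_1 comes from Euler steps of size h/starter_substeps."""
    ys = [y0, euler_value(f, x0, y0, h / starter_substeps, starter_substeps)]
    for n in range(1, n_steps):
        x_n = x0 + n * h
        ys.append(ys[n - 1] + f(x_n, ys[n]) * (2 * h))   # y_{n+1} = y_{n-1} + 2h f(x_n, y_n)
    return ys

if __name__ == "__main__":
    for n, y in enumerate(midpoint_rule(f_test, 0.0, 1.0, 0.1, 5)):
        print(f"x = {0.1 * n:.2f}   y = {y:.5f}")
```

Only one new evaluation of f is needed per step, since the method reuses the stored value $y_{n-1}$.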
Before leaving the midpoint rule, it is interesting to interpret the improvement in accuracy, from Euler to midpoint, graphically. If we solve
$$y(x_{n+1}) \approx y(x_n) + y'(x_n)h \qquad \text{(Euler)} \tag{11}$$
and
$$y(x_{n+1}) \approx y(x_{n-1}) + 2y'(x_n)h \qquad \text{(midpoint)} \tag{12}$$
for $y'(x_n)$, we have
$$y'(x_n) \approx \frac{y(x_{n+1}) - y(x_n)}{h} \qquad \text{(Euler)} \tag{13}$$
and
$$y'(x_n) \approx \frac{y(x_{n+1}) - y(x_{n-1})}{2h}, \qquad \text{(midpoint)} \tag{14}$$
which are difference quotient approximations of the derivative $y'(x_n)$. In Fig. 1, we can interpret (14) and (13) as approximating $y'(x_n)$ by the slopes of the chords AC and BC, respectively, while the exact $y'(x_n)$ is the slope of the tangent line TL at $x_n$. We can see from the figure that AC gives a more accurate approximation than BC.

Table 1. Comparison of Euler, midpoint rule, and exact solutions of the initial-value problem (10), with h = 0.1.
 x   | Euler   | Midpoint | Exact
0.10 | 1.10000 | 1.11358  | 1.11517
0.20 | 1.22900 | 1.26072  | 1.26140
0.30 | 1.38790 | 1.43772  | 1.43986
0.40 | 1.57769 | 1.65026  | 1.65182
0.50 | 1.79946 | 1.89577  | 1.89872

Figure 1. Graphical interpretation of midpoint rule versus Euler.
6.3.2. Second-order Runge-Kutta. The Runge-Kutta methods are developed somewhat differently. Observe that the low-order Euler method $y_{n+1} = y_n + f(x_n, y_n)h$ amounts to an extrapolation away from the initial point $(x_n, y_n)$ using the slope $f(x_n, y_n)$ at that point. Expecting an average slope to give greater accuracy, one might try the algorithm
$$y_{n+1} = y_n + \frac{1}{2}\left[f(x_n, y_n) + f(x_{n+1}, y_{n+1})\right]h, \tag{15}$$
which uses an average of the slopes at the initial and final points. Unfortunately, the formula (15) does not give $y_{n+1}$ explicitly since $y_{n+1}$ appears not only on the left-hand side but also in the argument of $f(x_{n+1}, y_{n+1})$. Intuition tells us that we should still do well if we replace that $y_{n+1}$ by an estimated value, say, the Euler estimate $y_{n+1} = y_n + f(x_n, y_n)h$. Then the revised version of (15) is
$$y_{n+1} = y_n + \frac{1}{2}\left\{f(x_n, y_n) + f\left[x_{n+1},\, y_n + f(x_n, y_n)h\right]\right\}h. \tag{16}$$
Thus, guided initially by intuition we can put the idea on a rigorous basis by considering a method of the form
$$y_{n+1} = y_n + \left\{a f(x_n, y_n) + b f\left[x_n + \alpha h,\, y_n + \beta f(x_n, y_n)h\right]\right\}h \tag{17}$$
and choosing the adjustable parameters a, b, α, β so as to make the order of the method (17) as high as possible; α, β determine the second slope location, and a, b determine the “weights” of the two slopes. That is, we seek a, b, α, β so that the left- and right-hand sides of
$$y(x_{n+1}) \approx y(x_n) + \left\{a f[x_n, y(x_n)] + b f\left[x_n + \alpha h,\, y(x_n) + \beta f(x_n, y(x_n))h\right]\right\}h \tag{18}$$
agree to as high a degree in $h$ as possible. Thus, expand the left-hand side (LHS)
and right-hand side (RHS) of (18) in Taylor series in $h$:
$$\text{LHS} = y(x_n) + y'(x_n)h + \frac{y''(x_n)}{2}h^2 + \cdots = y + fh + \tfrac{1}{2}(f_x + f_y f)h^2 + \cdots, \tag{19}$$
where $y$ means $y(x_n)$ and the arguments of $f, f_x, f_y$ are $x_n, y(x_n)$.
Similarly (Exercise 9),
$$\text{RHS} = y + (a + b)fh + (\alpha f_x + \beta f f_y)bh^2 + \cdots. \tag{20}$$
Matching the $h$ terms requires that
$$a + b = 1. \tag{21a}$$
Matching the $h^2$ terms, for any function $f$, requires that
$$\alpha b = \tfrac{1}{2} \quad \text{and} \quad \beta b = \tfrac{1}{2}. \tag{21b}$$
The outcome then is that any method (17), with $a, b, \alpha, \beta$ chosen so as to satisfy
(21), has a local truncation error that is $O(h^3)$ and is therefore a second-order
method [subject to mild conditions on $f$ such as the continuity of $f$ and its first-
and second-order partial derivatives so that we can justify the steps in deriving (19)
and (20)]. These are the Runge-Kutta methods of second order.¹
For instance, with $a = b = 1/2$ and $\alpha = \beta = 1$ we have
$$y_{n+1} = y_n + \tfrac{1}{2}\left\{f(x_n, y_n) + f\left[x_{n+1},\, y_n + f(x_n, y_n)h\right]\right\}h, \tag{22}$$
which is usually expressed in the computationally convenient form
$$y_{n+1} = y_n + \tfrac{1}{2}(k_1 + k_2), \qquad k_1 = hf(x_n, y_n), \qquad k_2 = hf(x_{n+1}, y_n + k_1). \tag{23}$$
¹The Runge-Kutta method was originated by Carl D. Runge (1856-1927), a German physicist and
mathematician, and extended by the German aerodynamicist and mathematician M. Wilhelm Kutta
(1867-1944). Kutta is well known for the Kutta-Joukowski formula for the lift on an airfoil, and for
the "Kutta condition" of classical airfoil theory.
To understand this result, note that Euler's method would give $y_{n+1} = y_n + f(x_n, y_n)h$.
If we denote that Euler estimate as $y_{n+1}^{\text{Euler}}$ then (22) can be expressed as
$$y_{n+1} = y_n + \frac{f(x_n, y_n) + f\left(x_{n+1}, y_{n+1}^{\text{Euler}}\right)}{2}\,h.$$
That is, we take a tentative step using Euler's method, then we average the slopes at
the initial point $x_n, y_n$ and at the Euler estimate of the final point $x_{n+1}, y_{n+1}^{\text{Euler}}$, and
then make another Euler step, this time using the improved (average) slope. For
this reason (23) is also known as the improved Euler method.
A different choice, $a = 0$, $b = 1$, $\alpha = \beta = 1/2$, gives what is known as the
modified Euler method.
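The two-slope family (17) can be sketched in a few lines of Python. This is only an illustration: the function name `rk2_step` and its keyword arguments are invented here, and the right-hand side is the test problem $y' = y + 2x - x^2$, $y(0) = 1$ with $h = 0.1$. The defaults give the improved Euler method (23); passing $a = 0$, $b = 1$, $\alpha = \beta = 1/2$ gives the modified Euler method.

```python
def rk2_step(f, x, y, h, a=0.5, b=0.5, alpha=1.0, beta=1.0):
    """One step of the two-slope method (17).

    Second-order accuracy requires a + b = 1 and alpha*b = beta*b = 1/2,
    as in (21). The defaults correspond to the improved Euler method (23).
    """
    s1 = f(x, y)                                # slope at the initial point
    s2 = f(x + alpha * h, y + beta * s1 * h)    # slope at the second location
    return y + (a * s1 + b * s2) * h


def f(x, y):
    """Right-hand side of the test problem y' = y + 2x - x^2."""
    return y + 2.0 * x - x ** 2


h = 0.1

# Improved Euler: a = b = 1/2, alpha = beta = 1
x, y = 0.0, 1.0
improved = [y]
for _ in range(2):
    y = rk2_step(f, x, y, h)
    x += h
    improved.append(y)

# Modified Euler: a = 0, b = 1, alpha = beta = 1/2
x, y = 0.0, 1.0
modified = [y]
for _ in range(2):
    y = rk2_step(f, x, y, h, a=0.0, b=1.0, alpha=0.5, beta=0.5)
    x += h
    modified.append(y)

print("improved Euler:", improved)
print("modified Euler:", modified)
```

Both parameter choices satisfy (21), so both are second order; they differ only in where the second slope is sampled and how the two slopes are weighted.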
EXAMPLE 2. Let us proceed through the first two steps of the improved Euler method
(23) for the same test problem as was used in Example 1,
$$y' = y + 2x - x^2; \qquad y(0) = 1, \qquad (0 \le x < \infty) \tag{24}$$
with $h = 0.1$; a more detailed illustration is given in Section 6.3.3 below. Here, $f(x, y) = y + 2x - x^2$.
$n = 0$:
$k_1 = hf(x_0, y_0) = 0.1[1 + 0 - (0)^2] = 0.1$,
$k_2 = hf(x_1, y_0 + k_1) = 0.1[(1 + 0.1) + 2(0.1) - (0.1)^2] = 0.129$,
$y_1 = y_0 + \tfrac{1}{2}(k_1 + k_2) = 1 + 0.5(0.1 + 0.129) = 1.1145$;
$n = 1$:
$k_1 = hf(x_1, y_1) = 0.1[1.1145 + 2(0.1) - (0.1)^2] = 0.13045$,
$k_2 = hf(x_2, y_1 + k_1) = 0.1[(1.1145 + 0.13045) + 2(0.2) - (0.2)^2] = 0.160495$,
$y_2 = y_1 + \tfrac{1}{2}(k_1 + k_2) = 1.1145 + 0.5(0.13045 + 0.160495) = 1.2600$,
compared with the values $y(x_1) = y(0.1) = 1.1152$ and $y(x_2) = y(0.2) = 1.2614$
obtained from the known exact solution $y(x) = x^2 + e^x$. ■
6.3.3. Fourth-order Runge-Kutta. Using this idea of a weighted average of
slopes at various points in the $x, y$ plane, with the weights and locations determined
so as to maximize the order of the method, one can derive higher-order Runge-Kutta
methods as well, although the derivations are quite tedious. One of the most
commonly used is the fourth-order Runge-Kutta method:
$$\begin{aligned}
y_{n+1} &= y_n + \tfrac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4),\\
k_1 &= hf(x_n, y_n), & k_2 &= hf\left(x_n + \tfrac{h}{2},\, y_n + \tfrac{1}{2}k_1\right),\\
k_3 &= hf\left(x_n + \tfrac{h}{2},\, y_n + \tfrac{1}{2}k_2\right), & k_4 &= hf(x_{n+1}, y_n + k_3),
\end{aligned} \tag{25}$$
which we give without derivation. Here the effective slope used is a weighted average
of the slopes at the four points $(x_n, y_n)$, $(x_n + h/2, y_n + k_1/2)$, $(x_n + h/2, y_n + k_2/2)$,
and $(x_{n+1}, y_n + k_3)$ in the $x, y$ plane, an average because the sum of
the coefficients 1/6, 2/6, 2/6, 1/6 that multiply the $k$'s is 1. Similarly, the sum of
the coefficients 1/2, 1/2 in the second-order version (23) is 1 as well.
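The algorithm (25) translates almost line for line into code. The following Python sketch is illustrative only: the names `rk4_step` and `rk4_solve` are invented here, and the check uses the test problem (24), $y' = y + 2x - x^2$, $y(0) = 1$, whose exact solution $y = x^2 + e^x$ was quoted in Example 2.

```python
import math


def rk4_step(f, x, y, h):
    """One step of the fourth-order Runge-Kutta method (25)."""
    k1 = h * f(x, y)
    k2 = h * f(x + h / 2, y + k1 / 2)
    k3 = h * f(x + h / 2, y + k2 / 2)
    k4 = h * f(x + h, y + k3)
    # Weights 1/6, 2/6, 2/6, 1/6 sum to 1, so this is a weighted average slope.
    return y + (k1 + 2 * k2 + 2 * k3 + k4) / 6


def rk4_solve(f, x0, y0, h, n_steps):
    """Return the lists of x_n and y_n for n = 0, ..., n_steps."""
    xs, ys = [x0], [y0]
    for n in range(n_steps):
        ys.append(rk4_step(f, xs[-1], ys[-1], h))
        xs.append(x0 + (n + 1) * h)   # avoid accumulating roundoff in x
    return xs, ys


def f(x, y):
    """Right-hand side of the test problem (24)."""
    return y + 2.0 * x - x ** 2


xs, ys = rk4_solve(f, 0.0, 1.0, 0.1, 5)
for x, y in zip(xs, ys):
    print(f"x = {x:.1f}   RK4 = {y:.7f}   exact = {x ** 2 + math.exp(x):.7f}")
```

Each step costs four evaluations of $f$, compared with two for the second-order method (23).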
EXAMPLE 3. As a summarizing illustration, we solve another "test problem,"
$$y' = -y; \qquad y(0) = 1 \tag{26}$$
by each of the methods considered, using a step size of $h = 0.05$ and single precision
arithmetic (on the computer used that amounts to carrying eight significant figures; double
precision would carry 16). The results are given in Table 2, together with the exact solution
$y(x) = e^{-x}$ for comparison; 0.529E+2, for instance, means $0.529 \times 10^2$. The value of $y_1$
for the midpoint rule was obtained by Euler's method with a reduced step size of 0.0025.
To illustrate the fourth-order Runge-Kutta calculation, let us proceed through the first
step:
$n = 0$:
$k_1 = hf(x_0, y_0) = -hy_0 = -0.05(1) = -0.05$,
$k_2 = hf(x_0 + \tfrac{h}{2}, y_0 + \tfrac{1}{2}k_1) = -h(y_0 + \tfrac{1}{2}k_1) = -0.05(1 - 0.025) = -0.04875$,
$k_3 = hf(x_0 + \tfrac{h}{2}, y_0 + \tfrac{1}{2}k_2) = -h(y_0 + \tfrac{1}{2}k_2) = -0.05(1 - 0.024375) = -0.04878125$,
$k_4 = hf(x_1, y_0 + k_3) = -h(y_0 + k_3) = -0.05(1 - 0.04878125) = -0.047560938$,
$y_1 = y_0 + \tfrac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4) = 0.95122943$,
which final result does agree with the corresponding entry in Table 2. Actually, there is a
discrepancy of 2 in the last digit, but an error of that size is not unreasonable in view of the
fact that the machine used carried only eight significant figures.
Most striking is the excellent accuracy of the fourth-order Runge-Kutta method, with
six significant figure accuracy over the entire calculation.
x| Euler| Midpoint| 2nd-order Runge-Kutta| 4th-order Runge-Kutta| Exact = e^{-x}
0.00| 0.10000000E+1| 0.10000000E+1| 0.10000000E+1| 0.10000000E+1| 0.10000000E+1
0.05| 0.94999999E+0| 0.95116991E+0| 0.95125002E+0| 0.95122945E+0| 0.95122945E+0
0.10| 0.90249997E+0| 0.90488303E+0| 0.90487659E+0| 0.90483743E+0| 0.90483743E+0
0.15| 0.85737497E+0| 0.86068159E+0| 0.86076385E+0| 0.86070800E+0| 0.86070800E+0
0.20| 0.81450623E+0| 0.81881487E+0| 0.81880164E+0| 0.81873077E+0| 0.81873077E+0
0.25| 0.77378094E+0| 0.77880013E+0| 0.77888507E+0| 0.77880079E+0| 0.77880079E+0
0.30| 0.73509192E+0| 0.74093485E+0| 0.74091440E+0| 0.74081820E+0| 0.74081820E+0
2.00| 0.12851217E+0| 0.13573508E+0| 0.13545239E+0| 0.13533530E+0| 0.13533528E+0
2.05| 0.12208656E+0| 0.12853225E+0| 0.12884909E+0| 0.12873492E+0| 0.12873492E+0
2.10| 0.11598223E+0| 0.12288185E+0| 0.12256770E+0| 0.12245644E+0| 0.12245644E+0
2.15| 0.11018312E+0| 0.11624406E+0| 0.11659253E+0| 0.11648417E+0| 0.11648415E+0
2.20| 0.10467397E+0| 0.11125745E+0| 0.11090864E+0| 0.11080316E+0| 0.11080315E+0
2.25| 0.99440269E-1| 0.10511832E+0| 0.10550185E+0| 0.10539923E+0| 0.10539922E+0
2.30| 0.94468258E-1| 0.10074562E+0| 0.10035863E+0| 0.10025885E+0| 0.10025885E+0
5.00| 0.59205294E-2| 0.12618494E-1| 0.67525362E-2| 0.67379479E-2| 0.67379437E-2
5.05| 0.56245029E-2| 0.25511871E-3| 0.64233500E-2| 0.64093345E-2| 0.64093322E-2
5.10| 0.53432779E-2| 0.12592983E-1| 0.61102118E-2| 0.60967477E-2| 0.60967444E-2
5.15| 0.50761141E-2| -0.10041796E-2| 0.58123390E-2| 0.57994057E-2| 0.57994043E-2
5.20| 0.48223082E-2| 0.12693400E-1| 0.55289874E-2| 0.55165654E-2| 0.55165626E-2
5.25| 0.45811930E-2| -0.22735195E-2| 0.52594491E-2| 0.52475194E-2| 0.52475161E-2
5.30| 0.43521333E-2| 0.12920752E-1| 0.50030509E-2| 0.49915947E-2| 0.49915928E-2
9.70| 0.47684727E-4| 0.64383668E+0| 0.61541170E-4| 0.61283507E-4| 0.61283448E-4
9.75| 0.45300490E-4| -0.67670959E+0| 0.58541038E-4| 0.58294674E-4| 0.58294663E-4
9.80| 0.43035467E-4| 0.71150762E+0| 0.55687164E-4| 0.55451608E-4| 0.55451590E-4
9.85| 0.40883693E-4| -0.74786037E+0| 0.52972413E-4| 0.52747200E-4| 0.52747171E-4
9.90| 0.38839509E-4| 0.78629363E+0| 0.50390008E-4| 0.50174691E-4| 0.50174654E-4
9.95| 0.36897534E-4| -0.82648975E+0| 0.47933496E-4| 0.47727641E-4| 0.47727597E-4
10.00| 0.35052657E-4| 0.86894262E+0| 0.45596738E-4| 0.45399935E-4| 0.45399931E-4

COMMENT. We see that the midpoint rule and the second-order Runge-Kutta method
yield comparable results initially, but the midpoint rule eventually develops an error that
oscillates in sign, from step to step, and grows in magnitude. The reason for this strange
(and incorrect) behavior will be studied in Section 6.5. ■
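The oscillating, growing error of the midpoint rule is easy to reproduce. The sketch below is illustrative only: it assumes the two-step midpoint formula $y_{n+1} = y_{n-1} + 2hf(x_n, y_n)$, obtains $y_1$ from 20 Euler substeps of size 0.0025 as described in Example 3, and runs in double precision, so the digits will not match the single-precision values of Table 2 exactly. The function names are invented for the illustration.

```python
import math


def f(x, y):
    """Right-hand side of the test problem (26): y' = -y."""
    return -y


def euler_start(f, x0, y0, h, substeps):
    """Advance from x0 to x0 + h using `substeps` small Euler steps."""
    x, y = x0, y0
    dh = h / substeps
    for _ in range(substeps):
        y += f(x, y) * dh
        x += dh
    return y


def midpoint_rule(f, x0, y0, h, n_steps, substeps=20):
    """Two-step midpoint rule y_{n+1} = y_{n-1} + 2 h f(x_n, y_n)."""
    ys = [y0, euler_start(f, x0, y0, h, substeps)]   # not self-starting
    for n in range(1, n_steps):
        x_n = x0 + n * h
        ys.append(ys[n - 1] + 2.0 * h * f(x_n, ys[n]))
    return ys


h = 0.05
ys = midpoint_rule(f, 0.0, 1.0, h, n_steps=200)
for n in (1, 2, 100, 101, 199, 200):
    print(f"x = {n * h:5.2f}   midpoint = {ys[n]: .6e}   exact = {math.exp(-n * h):.6e}")
```

Near $x = 10$ the computed values alternate in sign from one step to the next and are several orders of magnitude larger than $e^{-x}$, which is the behavior noted in the COMMENT.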
Of course, in real applications we do not have the exact solution to compare
with the numerical results. In that case, how do we know whether or not our results
are sufficiently accurate? A useful rule of thumb, mentioned in Section 6.2, is
to redo the entire calculation, each time with a smaller step size, until the results
“settle down” to the desired number of significant digits.
Thus far we have taken $h$ to be a constant, for simplicity, but there is no reason
why it cannot be varied from one step to the next. In fact, there may be a compelling
reason to do so. For instance, consider the equation $y' + y = \tanh 20x$ on $-10 <
x < 10$. The function $\tanh 20x$ is almost a constant, except near the origin, where
it varies dramatically, approximately from $-1$ to $+1$. Thus, we need a very fine step
size $h$ near the origin for good accuracy, but to use that $h$ over the entire $x$ interval
would be wasteful in terms of computer time and expense.
One can come up with a rational scheme for varying the step size to maintain
a consistent level of accuracy, but such refinements are already available within existing
software. For example, the default numerical differential equation solver in
Maple is a "fourth-fifth order Runge-Kutta-Fehlberg method" denoted as RKF45
in the literature. According to RKF45, a tentative step is made, first using a fourth-order
Runge-Kutta method, and then again using a fifth-order Runge-Kutta method.
If the two results agree to a prespecified number of significant digits, then the fifth-order
result is accepted. If they agree to more than that number of significant digits,
then $h$ is increased and the next step is made. If they agree to less than that number
of significant digits, then $h$ is decreased and the step is repeated.
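The accept/increase/decrease logic can be illustrated without the RKF45 coefficients. The Python sketch below is not RKF45: as a stand-in it compares one fourth-order Runge-Kutta step of size $h$ with two steps of size $h/2$ (step doubling) and uses their difference as the error indicator. The function names, the tolerance, the cap `h_max`, and the initial condition $y(-10) = -1$ are all assumptions made for this illustration.

```python
import math


def rk4_step(f, x, y, h):
    """One step of the fourth-order Runge-Kutta method (25)."""
    k1 = h * f(x, y)
    k2 = h * f(x + h / 2, y + k1 / 2)
    k3 = h * f(x + h / 2, y + k2 / 2)
    k4 = h * f(x + h, y + k3)
    return y + (k1 + 2 * k2 + 2 * k3 + k4) / 6


def adaptive_rk4(f, x0, y0, x_end, h0, tol, h_max):
    """Variable-step integration using step doubling as the error indicator."""
    x, y, h = x0, y0, h0
    xs, ys, hs = [x], [y], []
    while x_end - x > 1e-12:
        h = min(h, h_max, x_end - x)
        coarse = rk4_step(f, x, y, h)                 # one step of size h
        half = rk4_step(f, x, y, h / 2)               # two steps of size h/2
        fine = rk4_step(f, x + h / 2, half, h / 2)
        err = abs(fine - coarse)
        if err > tol:
            h /= 2            # results disagree: decrease h and repeat the step
            continue
        x, y = x + h, fine    # results agree: accept the more accurate value
        xs.append(x)
        ys.append(y)
        hs.append(h)
        if err < tol / 32:
            h *= 2            # agreement is better than needed: increase h
    return xs, ys, hs


def f(x, y):
    """y' + y = tanh(20 x), written as y' = f(x, y)."""
    return math.tanh(20.0 * x) - y


xs, ys, hs = adaptive_rk4(f, -10.0, -1.0, 10.0, h0=0.5, tol=1e-8, h_max=0.5)
print("steps taken:", len(hs))
print("smallest h:", min(hs), "  largest h:", max(hs))
print("y(10) =", ys[-1])
```

The cap `h_max` keeps the step from growing so large in the flat region that the solver could jump across the rapid transition near the origin without sampling it.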
6.3.4. Empirical estimate of the order. (Optional) The relative accuracies achieved
by the different methods, as seen from the results in Table 2, strikingly reveal the
importance of the order of the method. Thus, it is important to know how to verify
the order of whatever method we use, if only as a partial check on the programming.
Recall that by a method being of order $p$ we mean that at any chosen $x$ the
error behaves as $Ch^p$ for some constant $C$:
$$y_{\text{exact}} - y_{\text{comp}} \sim Ch^p \tag{27}$$
as $h \to 0$. Suppose we wish to check the order of a given method. Select a test
problem such as the one in Example 3, and use the method to compute $y$ at any $x$
point such as $x = 1$, for two different $h$'s, say $h_1$ and $h_2$. Letting $y_{\text{comp}}^{(1)}$ and $y_{\text{comp}}^{(2)}$
denote the $y$'s computed at $x = 1$ using step sizes of $h_1$ and $h_2$, respectively, we
have
$$y_{\text{exact}} - y_{\text{comp}}^{(1)} \approx Ch_1^p, \qquad y_{\text{exact}} - y_{\text{comp}}^{(2)} \approx Ch_2^p.$$
Dividing one equation by the other, to cancel the unknown $C$, and solving for $p$,
gives
$$p \approx \frac{\ln\left[\dfrac{y_{\text{exact}} - y_{\text{comp}}^{(1)}}{y_{\text{exact}} - y_{\text{comp}}^{(2)}}\right]}{\ln\left[\dfrac{h_1}{h_2}\right]}. \tag{28}$$
For instance, for Euler's method applied to the test problem (26) with $h_1 = 0.1$ and $h_2 = 0.05$,
the results at $x = 1$ are
$h_1 = 0.1$, $y_{\text{comp}}^{(1)} = 0.348678440100$,
$h_2 = 0.05$, $y_{\text{comp}}^{(2)} = 0.358485922409$,
and since $y_{\text{exact}}(1) = 0.367879441171$, (28) gives $p = 1.03$, which is respectably
close to 1. We should be able to obtain a more accurate estimate of $p$ by using
smaller $h$'s; doing so gives $p \approx 1.01$. Using those same step sizes, we also obtain
$p = 2.05$, 2.02, and 4.03 for the midpoint rule, second-order Runge-Kutta,
and fourth-order Runge-Kutta methods, respectively.
Why not use even smaller $h$'s to determine $p$ more accurately? Two difficulties
arise. One is that as the $h$'s are decreased the computed solutions become more and
more accurate, and the $y_{\text{exact}} - y_{\text{comp}}^{(1)}$ and $y_{\text{exact}} - y_{\text{comp}}^{(2)}$ differences in (28) are
known to fewer and fewer significant figures, due to cancelation. This is especially
true for a high-order method. The other difficulty is that (27) applies to the truncation
error alone so, implicit in our use of (27), is the assumption that roundoff errors
are negligible. If we make $h$ too small, that assumption may become invalid. For
both of these reasons it is important to use extended precision for such calculations,
as we have for the preceding calculations.
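The estimate (28) is straightforward to automate. The following Python sketch is illustrative only (the function names are invented here); it applies (28) to the test problem $y' = -y$, $y(0) = 1$ at $x = 1$ with $h_1 = 0.1$ and $h_2 = 0.05$, for Euler's method, the improved Euler method (23), and the fourth-order Runge-Kutta method (25), in ordinary double precision.

```python
import math


def euler_step(f, x, y, h):
    return y + h * f(x, y)


def rk2_step(f, x, y, h):
    """Improved Euler method (23)."""
    k1 = h * f(x, y)
    k2 = h * f(x + h, y + k1)
    return y + (k1 + k2) / 2


def rk4_step(f, x, y, h):
    """Fourth-order Runge-Kutta method (25)."""
    k1 = h * f(x, y)
    k2 = h * f(x + h / 2, y + k1 / 2)
    k3 = h * f(x + h / 2, y + k2 / 2)
    k4 = h * f(x + h, y + k3)
    return y + (k1 + 2 * k2 + 2 * k3 + k4) / 6


def solve_to(step, f, x0, y0, x_end, h):
    """Integrate from x0 to x_end with constant step h; return y(x_end)."""
    n_steps = round((x_end - x0) / h)
    y = y0
    for n in range(n_steps):
        y = step(f, x0 + n * h, y, h)
    return y


def estimate_order(step, f, x0, y0, x_end, y_exact, h1, h2):
    """Empirical order p from formula (28)."""
    e1 = y_exact - solve_to(step, f, x0, y0, x_end, h1)
    e2 = y_exact - solve_to(step, f, x0, y0, x_end, h2)
    return math.log(abs(e1 / e2)) / math.log(h1 / h2)


def f(x, y):
    return -y


exact = math.exp(-1.0)
p_euler = estimate_order(euler_step, f, 0.0, 1.0, 1.0, exact, 0.1, 0.05)
p_rk2 = estimate_order(rk2_step, f, 0.0, 1.0, 1.0, exact, 0.1, 0.05)
p_rk4 = estimate_order(rk4_step, f, 0.0, 1.0, 1.0, exact, 0.1, 0.05)
print("Euler:", p_euler, "  RK2:", p_rk2, "  RK4:", p_rk4)
```

A programming slip in one of the step functions typically shows up here as an estimated order well below the expected value, which is why the check is useful.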
6.3.5. Multi-step and predictor-corrector methods. (Optional) We've already
called attention to the multi-step nature of the midpoint rule. Our purpose in
this optional section is to give a brief overview of a class of multi-step methods
known as Adams-Bashforth methods, obtained by integrating $y' = f(x, y)$
from $x_n$ to $x_{n+1}$:
$$\int_{x_n}^{x_{n+1}} y'\,dx = \int_{x_n}^{x_{n+1}} f(x, y(x))\,dx \tag{29}$$
or
$$y(x_{n+1}) = y(x_n) + \int_{x_n}^{x_{n+1}} f(x, y(x))\,dx. \tag{30}$$
Figure 2. Adams-Bashforth interpolation of $f$ for the case $m = 2$.
To evaluate the integral, we fit $f(x, y(x))$ with a polynomial of degree $m$, which is
readily integrated. The polynomial interpolates (i.e., takes on the same values as)
$f(x, y(x))$ at the $m + 1$ points $x_{n-m}, \ldots, x_{n-1}, x_n$ as illustrated in Fig. 2 for the
case $m = 2$. As the simplest case, let $m = 0$. Then the zeroth degree polynomial
approximation of $f(x, y(x))$ on $[x_n, x_{n+1}]$ is $f(x, y(x)) \approx f_n$, where $f_n$ denotes
$f(x_n, y_n)$, and (30) gives the familiar Euler method $y_{n+1} = y_n + f_n h$. Omitting
the steps in this overview we state that with $m = 3$ one obtains the fourth-order
Adams-Bashforth method
$$y_{n+1} = y_n + (55f_n - 59f_{n-1} + 37f_{n-2} - 9f_{n-3})\frac{h}{24} \tag{31a}$$
with a local truncation error of
$$(e_n)_{AB} = \frac{251}{720}\,y^{(v)}(\xi)\,h^5 \tag{31b}$$
for some $\xi$ in $[x_{n-3}, x_n]$. We can see that (31a) is not self-starting;
it applies only for $n = 3, 4, \ldots$,
so the first three steps (for $n = 0, 1, 2$) need to be carried out by
some other method.
Suppose that instead of interpolating $f$ at $x_{n-m}, \ldots, x_{n-1}, x_n$ we interpolate
at $x_{n-m+1}, \ldots, x_n, x_{n+1}$. With $m = 3$, again, one obtains the fourth-order
Adams-Moulton method
$$y_{n+1} = y_n + (9f_{n+1} + 19f_n - 5f_{n-1} + f_{n-2})\frac{h}{24} \tag{32a}$$
with a local truncation error of
$$(e_n)_{AM} = -\frac{19}{720}\,y^{(v)}(\xi)\,h^5, \tag{32b}$$
where the $\xi$'s in (31b) and (32b) are different, in general.
Although both methods are fourth order, the Adams-Moulton method is more
accurate because the constant factor in $(e_n)_{AM}$ is roughly thirteen times smaller
than the constant in $(e_n)_{AB}$. This increase in accuracy occurs because the points
$x_{n-m+1}, \ldots, x_n, x_{n+1}$ are more centered on the interval of integration
(from $x_n$ to $x_{n+1}$) than the points $x_{n-m}, \ldots, x_{n-1}, x_n$.
On the other hand, the term $f_{n+1} = f(x_{n+1}, y_{n+1})$ in (32a) is awkward because
the argument $y_{n+1}$ is not yet known!
[If $f$ is linear in $y$, then (32a) can be solved for $y_{n+1}$ by simple algebra, and the
awkwardness disappears.] Thus, the method (32a) is said to be of closed type,
whereas (31a) and all of our preceding methods have been of open type.
To handle this difficulty, it is standard practice to solve closed formulas by
iteration. Using superscripts to indicate the iterate, (32a) becomes
$$y_{n+1}^{(k+1)} = y_n + \left[9f\left(x_{n+1}, y_{n+1}^{(k)}\right) + 19f_n - 5f_{n-1} + f_{n-2}\right]\frac{h}{24}. \tag{33}$$
To start the iteration, we compute $y_{n+1}^{(0)}$ from a predictor formula, with subsequent
corrections made by the corrector formula (33). It is recommended that
the predictor and corrector formulas be of the same order (certainly, the corrector
should never be of lower order than the predictor) with the corrector applied
only once. Thus, the Adams-Bashforth and Adams-Moulton methods constitute a
natural predictor-corrector pair with "AB" as the predictor and "AM" as the corrector.
Why might we choose the fourth-order AB-AM predictor-corrector over
the Runge-Kutta method of the same order or vice versa? On the negative side,
AB-AM is not self-starting, it requires the storage of $f_{n-3}$, $f_{n-2}$, and $f_{n-1}$, and is
more tedious to program. On the other hand, it involves only two function evaluations
per step (namely, $f_n$ and $f_{n+1}$) if the corrector is applied only once, whereas
Runge-Kutta involves four. Thus, if $f(x, y)$ is reasonably complicated then we can
expect AB-AM to be almost twice as fast. In large-scale computing, the savings
can be significant.
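A minimal Python sketch of the fourth-order AB-AM predictor-corrector pair, built from (31a) and (33), is given here for illustration. The function names and the `corrections` parameter are invented for this sketch, and the choice of fourth-order Runge-Kutta to generate the three starting values is an assumption (any sufficiently accurate starter would do). The check uses the test problem $y' = -y$, $y(0) = 1$.

```python
import math


def rk4_step(f, x, y, h):
    """Fourth-order Runge-Kutta step (25), used only to start the method."""
    k1 = h * f(x, y)
    k2 = h * f(x + h / 2, y + k1 / 2)
    k3 = h * f(x + h / 2, y + k2 / 2)
    k4 = h * f(x + h, y + k3)
    return y + (k1 + 2 * k2 + 2 * k3 + k4) / 6


def ab4_am4(f, x0, y0, h, n_steps, corrections=1):
    """Fourth-order Adams-Bashforth predictor with Adams-Moulton corrector."""
    xs = [x0 + i * h for i in range(n_steps + 1)]
    ys = [y0]
    # The method is not self-starting: y1, y2, y3 come from another method.
    for n in range(min(3, n_steps)):
        ys.append(rk4_step(f, xs[n], ys[n], h))
    fs = [f(x, y) for x, y in zip(xs, ys)]     # stored slopes f_0, ..., f_3
    for n in range(3, n_steps):
        # Predictor (31a): uses only stored slopes.
        y_new = ys[n] + h / 24 * (55 * fs[n] - 59 * fs[n - 1]
                                  + 37 * fs[n - 2] - 9 * fs[n - 3])
        # Corrector (33): each pass evaluates f once at the current iterate.
        for _ in range(corrections):
            y_new = ys[n] + h / 24 * (9 * f(xs[n + 1], y_new) + 19 * fs[n]
                                      - 5 * fs[n - 1] + fs[n - 2])
        ys.append(y_new)
        fs.append(f(xs[n + 1], y_new))         # slope stored for later steps
    return xs, ys


def f(x, y):
    return -y


xs, ys = ab4_am4(f, 0.0, 1.0, 0.05, 20)
print("y(1) =", ys[-1], "  exact =", math.exp(-1.0))
```

With `corrections=1` each step beyond the starting phase costs two evaluations of $f$: one inside the corrector and one to store $f_{n+1}$ for subsequent steps.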
Closure. Motivated to seek higher-order methods than the first-order Euler method,
we use a Taylor series approach to obtain the second-order midpoint rule. Though
more accurate, a disadvantage of the midpoint rule is that it is not self-starting.
Pursuing a different approach, we look at the possibility of using a weighted average of
slopes at various points in the $x, y$ plane, with the weights and locations determined
so as to maximize the order of the method. We thereby derive the second-order
Runge-Kutta method and present, without derivation, the fourth-order Runge-Kutta
method. The latter is widely used because it is accurate and self-starting.
Because of the importance of the order of a given method, we suggest that the
order be checked empirically using a test problem with a known exact solution.
The resulting approximate expression for the order is given by (28).
In the final section we return to the idea of multistep methods and present a
brief overview of the Adams-Bashforth methods, derived most naturally from an
approximate integral approach. Though not self-starting, the fourth-order Adams-Bashforth
method (31a) is faster than the Runge-Kutta method of the same order
because it requires only one function evaluation per step (namely, $f_n$; the $f_{n-1}$,
$f_{n-2}$, and $f_{n-3}$ terms are stored from previous steps). A further refinement consists
of predictor-corrector variations of the Adams-Bashforth methods. However,
we stress that such refinements become worthwhile only if the scope of the computational
effort becomes large enough to justify the additional inconvenience caused
by such features as the absence of self-starting and predictor-corrector iteration.
Otherwise, one might as well stick to a simple and accurate method such as fourth-order
Runge-Kutta.
Computer software. Computer-software systems such as Maple include numerical
differential equation solvers. In Maple one can use the dsolve command together
with a numeric option. The default numerical solution method is the RKF45
method mentioned above. Note that with the numeric option of dsolve one does
not specify a step size $h$ since that choice is controlled within the program and,
in general, is varied from step to step to maintain a certain level of accuracy. To
specify the absolute error tolerance one can use an additional option called abserr,
which is formatted as abserr = Float(1,2-digits) and which means 1 times 10 to the
one- or two-digit exponent. For instance, to solve
$$y' = -y; \qquad y(0) = 1$$
for $y(x)$ with an absolute error tolerance of $1 \times 10^{-5}$, and to print the results at
$x = 2, 10$, enter
    with(DEtools):
and return. Then enter
    dsolve({diff(y(x),x) = -y(x), y(0) = 1},
    value = array([2,10]), abserr = Float(1,-5));
and return. The printed result is
    [x, y(x)]
    2.     .1353337989380555
    10.    .00004501989255717160
For comparison, the exact values are y(2) = exp(-2) = 0.1353352832 and
y(10) = exp(-10) = 0.0000453999, respectively.

EXERCISES 6.3
1. Evaluate $y_1$ and $y_2$ by hand, by the second-order and fourth-order Runge-Kutta methods, with $h = 0.02$. Obtain the exact
values $y(x_1)$ and $y(x_2)$ as well.
(a) $y' = 3000xy^{-2}$; $y(0) = \ldots$
(b) $y' = 40xe^{-y}$; $y(0) = \ldots$
(c) $y' = x + y$; $y(-1) = 5$
(d) $y' = -y\tan x$; $y(1) = -\ldots$
(e) $y' = (y^2 + 1)/4$; $y(0) = \ldots$
(f) $y' = -2y\sin x$; $y(2) = \ldots$
2. (a)-(f) Program the second- and fourth-order Runge-Kutta
methods and use them to solve the initial-value problem given
in the corresponding part of Exercise 1 but with the initial condition
$y(0) = 1$. Use $h = 0.05$. Print out all computed values
of $y$, up to $x = 0.5$, as well as the exact solution.
3. Empirically determine the order of the given method. Use (28), with $h = 0.1$ and
0.05, say. Do the evaluation at two different locations, such as
$x = 1$ and $x = 2$. (The order should not depend upon $x$ so
your results at the two points should be almost identical.)
(a) Euler's method
(b) Second-order Runge-Kutta method
(c) Fourth-order Runge-Kutta method
4. (Liquid level) Liquid is pumped into a tank of horizontal cross-sectional
area $A$ (m$^2$) at a rate $Q$ (liters/sec), and is
drained by a valve at its base as sketched in the figure.
According to Bernoulli's principle, the efflux velocity $v(t)$ is
approximately $\sqrt{2gx(t)}$, where $g$ is the acceleration of gravity. Thus, a mass balance gives
$$Ax'(t) = Q(t) - Bv(t) \tag{4.1}$$
where $B$ is the cross-sectional area of the efflux pipe. For
definiteness, suppose that $A = 1$ and $B\sqrt{2g} = 0.01$ so
$$x' = Q(t) - 0.01\sqrt{x}. \tag{4.2}$$
We wish to know the depth $x(t)$ at the end of 10 minutes
($t = 600$ sec), 20 minutes, ..., up to one hour. Program the
computer solution of (4.2) by the second-order Runge-Kutta
method for the following cases, and use it to solve for those
$x$ values: $x(600), x(1200), \ldots, x(3600)$. (Using the rule of
thumb given below Example 3, reduce $h$ until those results
settle down to four significant digits.)
(a) $Q(t) = 0.02$; $x(0) = 0$
(b) $Q(t) = 0.02$; $x(0) = \ldots$
(c) $Q(t) = 0.02$; $x(0) = \ldots$
(d) $Q(t) = 0.02$; $x(0) = 6$
(e) $Q(t) = 0.02(1 - e^{\ldots})$; $x(0) = 0$
(f) $Q(t) = 0.02(1 - e^{-0.004t})$; $x(0) = \ldots$
(g) $Q(t) = 0.02t$; $x(0) = 0$
(h) $Q(t) = 0.02(1 + \sin 0.1t)$; $x(0) = 0$
NOTE: Surely, we will need $h$ to be small compared to the
period $20\pi$ of $Q(t)$ in part (h).
5. (a)-(h) (Liquid level) Same as Exercise 4, but use fourth-order
Runge-Kutta instead of second order.
6. (a)-(h) (Liquid level) Same as Exercise 4, but use computer
software to do the numerical solution of the differential
equation. In Maple, for instance, the dsolve command uses the
fourth-fifth order RKF45 method.
7. (Liquid level) (a) For the case where $Q(t)$ is a constant,
derive the general solution of (4.2) in Exercise 4 as
$$Q - 0.01\sqrt{x} - Q\ln\left(Q - 0.01\sqrt{x}\right) = 0.00005t + C, \tag{7.1}$$
where $C$ is the constant of integration.
(b) Evaluate $C$ in (7.1) if $Q = 0.02$ and $x(0) = 0$. Then, solve
(7.1) for $x(t)$ at $t = 600, 1200, \ldots, 3600$. NOTE: Unfortunately,
(7.1) is in implicit rather than explicit form, but you
can use computer software to solve. In Maple, for instance,
the relevant command is fsolve.
8. Suppose that we have a convergent method, with $E_n \sim Ch^p$
as $h \to 0$. Someone offers to improve the method by either
halving $C$ or by doubling $p$. Which would you choose? Explain.
9. Expand the right-hand side of (18) in a Taylor series in $h$
and show that the result is as given in (20). HINT: To expand
the $f(x_n + \alpha h, y + \beta fh)$ term you need to use chain differentiation.
10. (a) Program the fourth-order Runge-Kutta method (25) and
use it to run the test problem (10) and to compute $y$ at $x = 1$
using $h = 0.05$ and then $h = 0.02$. From those values and
the known exact solution, empirically verify that the method is
fourth order.
(b) To see what harm a programming error can cause, change
the $x_n + h/2$ in the formula for $k_2$ to $x_n$, repeat the two evaluations
of $y$ at $x = 1$ using $h = 0.05$ and $h = 0.02$, and empirically
determine the order of the method. Is it still a fourth-order method?
(c) Instead of introducing the programming error suggested in
part (b), suppose we change the coefficient of $k_3$ in $y_{n+1} =
y_n + \tfrac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4)$ from 2 to 3. Do you think the
method will still be convergent? Explain.
11. (Rectangular, trapezoidal, and Simpson's rule) Consider
the special case where $f$ in $y' = f$ is a function of $x$ only.
Integrating $y' = f(x)$ from $x_n$ to $x_{n+1}$, we have
$$y(x_{n+1}) = y(x_n) + \int_{x_n}^{x_{n+1}} f(x)\,dx. \tag{11.1}$$
If we fit $f(x)$, over $[x_n, x_{n+1}]$, with a zeroth-degree polynomial
(i.e., a constant) that interpolates $f$ at $x_n$, then we have
$f(x) \approx f(x_n)$, and (11.1) gives $y(x_{n+1}) \approx y(x_n) + f(x_n)h$
and hence the Euler method $y_{n+1} = y_n + f(x_n)h$.
(a) Show that if we fit $f(x)$, over $[x_n, x_{n+1}]$, with a first-degree
polynomial (a straight line) that interpolates $f$ at $x_n$ and $x_{n+1}$,
then $f(x) \approx f(x_n) + [f(x_{n+1}) - f(x_n)](x - x_n)/h$. Putting
that approximation into (11.1), derive the approximation
$$y(x_{n+1}) = y(x_n) + \tfrac{1}{2}[f(x_n) + f(x_{n+1})]h, \tag{11.2}$$
and show that (for the special case where $f$ is a function
of $x$ only) (11.2) is identical to the second-order Runge-Kutta
method (23).
(b) Show that if we fit $f(x)$, over $[x_n, x_{n+1}]$, with a second-degree
polynomial (a parabola) that interpolates $f$ at $x_n$,
$x_n + h/2$, and $x_{n+1} = x_n + h$, and put that approximation
into (11.1), then one obtains
$$y(x_{n+1}) = y(x_n) + \tfrac{1}{6}[f(x_n) + 4f(x_n + h/2) + f(x_{n+1})]h, \tag{11.3}$$
and show that (for the case where $f$ is a function of $x$ only)
(11.3) is identical to the fourth-order Runge-Kutta method
(25). NOTE: These three results amount to the well-known
rectangular, trapezoidal, and Simpson's rules of numerical
integration for a single interval of size $h$. If we sum over all of
the intervals, they take the forms
$$\begin{aligned}
\int_a^b f(x)\,dx &\approx [f(a) + f(a + h) + \cdots + f(b - h)]\,h,\\
\int_a^b f(x)\,dx &\approx [f(a) + 2f(a + h) + 2f(a + 2h) + \cdots + 2f(b - h) + f(b)]\,\frac{h}{2},\\
\int_a^b f(x)\,dx &\approx [f(a) + 4f(a + h) + 2f(a + 2h) + 4f(a + 3h) + \cdots + 4f(b - h) + f(b)]\,\frac{h}{3},
\end{aligned} \tag{11.4}$$
respectively. In passing from (11.3) to the last line
of (11.4) we have replaced $h/2$ by $h$ everywhere in
(11.3), so that the revised (11.3) reads $y_{n+2} = y_n +
[f(x_n) + 4f(x_{n+1}) + f(x_{n+2})]h/3$, where $x_n = a + nh$. For
the rectangular and trapezoidal rules the number of subdivisions,
$(b - a)/h$, can be even or odd, but for Simpson's rule
it must be even. The order of the error for these integration
methods is $O(h)$, $O(h^2)$, and $O(h^4)$, respectively.
12. (a) Using $m = 1$, derive from (30) the Adams-Bashforth
method
$$y_{n+1} = y_n + (3f_n - f_{n-1})\frac{h}{2}. \tag{12.1}$$
(b) Determine the order of the method (12.1) empirically by
using it to solve the test problem (10), at $x = 1$, with two
different step sizes, and then using (28).
13. This exercise is to take you through the fourth-order AB-AM
predictor-corrector scheme.
(a) For the problem $y' = 2xy$, $y(0) = 1$, compute $y_1, y_2, y_3$
from the exact solution, with $h = 0.1$, and use those as starting
values to determine $y_4$, by hand, by means of the fourth-order
AB-AM predictor-corrector scheme given by (31a) and (33).
Apply the corrector three times.
(b) Continuing in the same way, determine $y_5$.

Consider the coupled initial-value problems
$$u'(x) = f(x, u, v); \qquad u(a) = u_0 \tag{1a}$$
$$v'(x) = g(x, u, v); \qquad v(a) = v_0. \tag{1b}$$
Euler's method is applied to each of the problems (1a) and (1b) as follows:
$$\begin{aligned}
u_{n+1} &= u_n + f(x_n, u_n, v_n)h,\\
v_{n+1} &= v_n + g(x_n, u_n, v_n)h
\end{aligned} \tag{2}$$
for $n = 0, 1, 2, \ldots$.
Equations (2) are coupled (since each involves both $u$ and $v$), but the values
$u_n$ and $v_n$ on their right-hand sides are already known from the preceding step.
EXAMPLE 1. Consider the system
$$\begin{aligned}
u' &= x + v; & u(0) &= 0\\
v' &= uv^2; & v(0) &= 1.
\end{aligned} \tag{3}$$
The latter looks fairly simple, but it is not. It is nonlinear because of the $uv^2$ term. Turning
to numerical solution using the Euler method (2), let $h = 0.1$, say, and let us go through
the first couple of steps. First, $u_0 = 0$ and $v_0 = 1$ from the initial conditions. Then,
$n = 0$:
$u_1 = u_0 + (x_0 + v_0)h = 0 + (0 + 1)(0.1) = 0.1$,
$v_1 = v_0 + u_0v_0^2h = 1 + (0)(1)^2(0.1) = 1$.
$n = 1$:
$u_2 = u_1 + (x_1 + v_1)h = 0.1 + (0.1 + 1)(0.1) = 0.21$,
$v_2 = v_1 + u_1v_1^2h = 1 + (0.1)(1)^2(0.1) = 1.01$,
and so on. ■
The same approach applies if the system contains more than two equations.
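The Euler scheme (2) for a pair of coupled equations can be sketched as follows. This Python fragment is illustrative only (the function name `euler_system` is invented here); it uses the system $u' = x + v$, $v' = uv^2$ with $u(0) = 0$, $v(0) = 1$ and $h = 0.1$ from Example 1.

```python
def euler_system(f, g, x0, u0, v0, h, n_steps):
    """Euler's method (2) for u' = f(x, u, v), v' = g(x, u, v)."""
    x, u, v = x0, u0, v0
    history = [(x, u, v)]
    for n in range(n_steps):
        # Both right-hand sides are evaluated with the values from step n
        # before either variable is overwritten.
        u, v = u + f(x, u, v) * h, v + g(x, u, v) * h
        x = x0 + (n + 1) * h
        history.append((x, u, v))
    return history


def f(x, u, v):
    return x + v


def g(x, u, v):
    return u * v ** 2


history = euler_system(f, g, 0.0, 0.0, 1.0, 0.1, 2)
for x, u, v in history:
    print(f"x = {x:.1f}   u = {u:.6f}   v = {v:.6f}")
```

The simultaneous tuple assignment matters: updating $u$ first and then using the new $u$ in the $v$ update would be a different (and unintended) scheme.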
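Next, we show how to adapt the fourth-order Runge-Kutta method to the
system (1). Recall that for the single equation
$$y' = f(x, y); \qquad y(a) = y_0 \tag{4}$$
the algorithm is
$$\begin{aligned}
y_{n+1} &= y_n + \tfrac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4),\\
k_1 &= hf(x_n, y_n), & k_2 &= hf\left(x_n + \tfrac{h}{2},\, y_n + \tfrac{1}{2}k_1\right),\\
k_3 &= hf\left(x_n + \tfrac{h}{2},\, y_n + \tfrac{1}{2}k_2\right), & k_4 &= hf(x_{n+1}, y_n + k_3).
\end{aligned} \tag{5}$$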
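For the system (1) it becomes
$$\begin{aligned}
u_{n+1} &= u_n + \tfrac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4),\\
v_{n+1} &= v_n + \tfrac{1}{6}(l_1 + 2l_2 + 2l_3 + l_4),\\
k_1 &= hf(x_n, u_n, v_n),\\
l_1 &= hg(x_n, u_n, v_n),\\
k_2 &= hf\left(x_n + \tfrac{h}{2},\, u_n + \tfrac{1}{2}k_1,\, v_n + \tfrac{1}{2}l_1\right),\\
l_2 &= hg\left(x_n + \tfrac{h}{2},\, u_n + \tfrac{1}{2}k_1,\, v_n + \tfrac{1}{2}l_1\right),\\
k_3 &= hf\left(x_n + \tfrac{h}{2},\, u_n + \tfrac{1}{2}k_2,\, v_n + \tfrac{1}{2}l_2\right),\\
l_3 &= hg\left(x_n + \tfrac{h}{2},\, u_n + \tfrac{1}{2}k_2,\, v_n + \tfrac{1}{2}l_2\right),\\
k_4 &= hf(x_{n+1}, u_n + k_3, v_n + l_3),\\
l_4 &= hg(x_{n+1}, u_n + k_3, v_n + l_3).
\end{aligned} \tag{6}$$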
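EXAMPLE 2. Let us apply (6) to the same system as in Example 1,
$$\begin{aligned}
u' &= x + v; & u(0) &= 0\\
v' &= uv^2; & v(0) &= 1,
\end{aligned} \tag{7}$$
again with $h = 0.1$.
$n = 0$:
$k_1 = h(x_0 + v_0) = (0.1)(0 + 1) = 0.1$,
$l_1 = hu_0v_0^2 = (0.1)(0)(1)^2 = 0$,
$k_2 = h[(x_0 + \tfrac{h}{2}) + (v_0 + \tfrac{1}{2}l_1)] = (0.1)[(0 + 0.05) + (1 + 0)] = 0.105$,
$l_2 = h(u_0 + \tfrac{1}{2}k_1)(v_0 + \tfrac{1}{2}l_1)^2 = (0.1)(0 + 0.05)(1 + 0)^2 = 0.005$,
$k_3 = h[(x_0 + \tfrac{h}{2}) + (v_0 + \tfrac{1}{2}l_2)] = (0.1)[(0 + 0.05) + (1 + 0.0025)] = 0.10525$,
$l_3 = h(u_0 + \tfrac{1}{2}k_2)(v_0 + \tfrac{1}{2}l_2)^2 = (0.1)(0 + 0.0525)(1 + 0.0025)^2 = 0.005276$,
$k_4 = h[x_1 + (v_0 + l_3)] = (0.1)[0.1 + (1 + 0.005276)] = 0.110528$,
$l_4 = h(u_0 + k_3)(v_0 + l_3)^2 = (0.1)(0 + 0.10525)(1 + 0.005276)^2 = 0.010636$,
$u_1 = u_0 + \tfrac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4) = 0.105171$,
$v_1 = v_0 + \tfrac{1}{6}(l_1 + 2l_2 + 2l_3 + l_4) = 1.005198$.
$n = 1$:
$k_1 = 0.110520$, $l_1 = 0.010627$,
$k_2 = 0.116051$, $l_2 = 0.016382$,
$k_3 = 0.116339$, $l_3 = 0.016760$,
$k_4 = 0.122196$, $l_4 = 0.023134$,
$u_2 = 0.221420$, $v_2 = 1.021872$,
and so on for $n = 2, 3, \ldots$.
We suggest that you fill in the details for the calculation of the
values $k_1, \ldots, v_2$ shown above for $n = 1$. ■
Of course, the idea is to carry out such calculations on a computer, not by
hand. The calculations shown in Examples 1 and 2 are merely intended to clarify
the methods.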
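The system form (6) of the fourth-order Runge-Kutta method can be sketched in Python as follows. The function name `rk4_system_step` is invented for this illustration; the system is $u' = x + v$, $v' = uv^2$ with $u(0) = 0$, $v(0) = 1$ and $h = 0.1$.

```python
def rk4_system_step(f, g, x, u, v, h):
    """One step of the fourth-order Runge-Kutta method (6) for two equations."""
    k1 = h * f(x, u, v)
    l1 = h * g(x, u, v)
    k2 = h * f(x + h / 2, u + k1 / 2, v + l1 / 2)
    l2 = h * g(x + h / 2, u + k1 / 2, v + l1 / 2)
    k3 = h * f(x + h / 2, u + k2 / 2, v + l2 / 2)
    l3 = h * g(x + h / 2, u + k2 / 2, v + l2 / 2)
    k4 = h * f(x + h, u + k3, v + l3)
    l4 = h * g(x + h, u + k3, v + l3)
    u_next = u + (k1 + 2 * k2 + 2 * k3 + k4) / 6
    v_next = v + (l1 + 2 * l2 + 2 * l3 + l4) / 6
    return u_next, v_next


def f(x, u, v):
    return x + v


def g(x, u, v):
    return u * v ** 2


h = 0.1
u1, v1 = rk4_system_step(f, g, 0.0, 0.0, 1.0, h)
u2, v2 = rk4_system_step(f, g, h, u1, v1, h)
print(f"u1 = {u1:.6f}   v1 = {v1:.6f}")
print(f"u2 = {u2:.6f}   v2 = {v2:.6f}")
```

Note that each pair $k_j, l_j$ must be computed before moving on to $k_{j+1}, l_{j+1}$, because both equations need the intermediate values of both variables.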
What about higher-order equations? The key is to re-express an nth-order
equation as an equivalent system of n first-order equations.
EXAMPLE 3. The problem
$$y''' - xy'' + y' - 2y^3 = \sin x; \qquad y(1) = 2,\quad y'(1) = 0,\quad y''(1) = -3 \tag{8}$$
can be converted to an equivalent system of three first-order equations as follows. Define
$y' = u$ and $y'' = v$ (hence $u' = v$). Then (8) can be re-expressed in the form
$$\begin{aligned}
y' &= u; & y(1) &= 2\\
u' &= v; & u(1) &= 0\\
v' &= \sin x + 2y^3 - u + xv; & v(1) &= -3.
\end{aligned} \tag{9a,b,c}$$
Of the three differential equations in (9), the first two merely serve to introduce the auxiliary
dependent variables $u$ and $v$, and since $v'$ is $y'''$ the third one is a restated version of the
given equation $y''' - xy'' + y' - 2y^3 = \sin x$. Equation (9a) is the $y$ equation, so the initial
condition is on $y(1)$, namely, $y(1) = 2$, as given in (8). Equation (9b) is the $u$ equation,
so the initial condition is on $u(1)$, and we have $u(1) = y'(1) = 0$, from (8). Similarly, for
equation (9c).
The system (9) can now be solved by the Euler or fourth-order Runge-Kutta methods
or any other such algorithm. To illustrate, let us carry out the first two steps using Euler's
method, taking $h = 0.2$, say.
$n = 0$:
$y_1 = y_0 + u_0h = 2 + (0)(0.2) = 2$,
$u_1 = u_0 + v_0h = 0 + (-3)(0.2) = -0.6$,
$v_1 = v_0 + (\sin x_0 + 2y_0^3 - u_0 + x_0v_0)h
= -3 + [\sin 1 + 2(2)^3 - 0 + (1)(-3)](0.2) = -0.231706$.
$n = 1$:
$y_2 = y_1 + u_1h = 2 + (-0.6)(0.2) = 1.88$,
$u_2 = u_1 + v_1h = -0.6 + (-0.231706)(0.2) = -0.646341$,
$v_2 = v_1 + (\sin x_1 + 2y_1^3 - u_1 + x_1v_1)h
= -0.231706 + [\sin 1.2 + 2(2)^3 - (-0.6) + (1.2)(-0.231706)](0.2) = 3.219093$,
and so on for $n = 2, 3, \ldots$.
COMMENT. Observe that at each step we compute $y$, $u$, and $v$, yet we are not really
interested in the auxiliary variables $u$ and $v$. Perhaps we could just compute $y_1, y_2, \ldots$ and
not the $u, v$ values? No; equations (9) are coupled so we need to bring all three variables
along together. Of course, we don't need to print or plot $u$ and $v$, but we do need to compute
them. ■
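The reduction of a higher-order equation to a first-order system maps naturally onto a vector-valued right-hand side. The Python sketch below is illustrative only (the names `rhs` and `euler_vector` are invented here); it applies Euler's method with $h = 0.2$ to the system $y' = u$, $u' = v$, $v' = \sin x + 2y^3 - u + xv$ with $y(1) = 2$, $u(1) = 0$, $v(1) = -3$, taking $x_1 = x_0 + h = 1.2$ in the second step.

```python
import math


def rhs(x, state):
    """Right-hand sides of the first-order system equivalent to (8)."""
    y, u, v = state
    return (u, v, math.sin(x) + 2.0 * y ** 3 - u + x * v)


def euler_vector(rhs, x0, state0, h, n_steps):
    """Euler's method applied componentwise to a first-order system."""
    x, state = x0, tuple(state0)
    out = [(x, state)]
    for n in range(n_steps):
        slopes = rhs(x, state)    # all slopes use the values from step n
        state = tuple(s + h * d for s, d in zip(state, slopes))
        x = x0 + (n + 1) * h
        out.append((x, state))
    return out


steps = euler_vector(rhs, 1.0, (2.0, 0.0, -3.0), 0.2, 2)
for x, (y, u, v) in steps:
    print(f"x = {x:.1f}   y = {y:.6f}   u = {u:.6f}   v = {v:.6f}")
```

Only the first component of the state is the solution $y$ of the original third-order problem; the other two are carried along because the equations are coupled.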
EXAMPLE 4. Examples 1 and 2 involve a system of first-order equations, and Example
3 involves a single higher-order equation. As a final example, consider a combination of the
two such as the initial-value problem
$$\begin{aligned}
u'' - 3xuv &= \sin x; & u(0) &= 4, & u'(0) &= -1\\
v'' + 2u - v &= 5x; & v(0) &= 7, & v'(0) &= 0.
\end{aligned} \tag{10}$$
The idea is exactly the same as before. We need to recast (10) as a system of first-order
initial-value problems. We can do so by introducing auxiliary dependent variables $w$ and $z$
according to $u' = w$ and $v' = z$. Then (10) becomes
$$\begin{aligned}
u' &= w; & u(0) &= 4\\
w' &= \sin x + 3xuv; & w(0) &= -1\\
v' &= z; & v(0) &= 7\\
z' &= 5x - 2u + v; & z(0) &= 0
\end{aligned} \tag{11}$$
which system can now be solved by Euler's method or any other such numerical differential
equation solver. ■
6.4.2. Linear boundary-value problems. Our discussion is based mostly upon
the following example.
EXAMPLE 5. Consider the third-order boundary-value problem
$$y''' - x^2y = -x^4; \qquad y(0) = 0,\quad y'(0) = 0,\quad y(2) = 4. \tag{12}$$
To solve numerically, we begin by recasting (12) as the first-order system:
$$\begin{aligned}
y' &= u, & y(0) &= 0,\quad y(2) = 4\\
u' &= v, & u(0) &= 0\\
v' &= x^2y - x^4.
\end{aligned} \tag{13a,b,c}$$
However, we cannot apply the numerical integration techniques that we have discussed
because the problem (13c) does not have an initial condition so we cannot get the solution
started. Whereas (13c) is missing an initial condition on $v$, (13a) has an extra condition,
the right end condition $y(2) = 4$, but that condition is of no help in developing a numerical
integration scheme that develops a solution beginning at $x = 0$.
Nevertheless, the linearity of (12) saves the day and permits us to work with an initial-value
version instead. Specifically, suppose that we solve (numerically) the four initial-value
problems
$$\begin{aligned}
L[Y_1] &= 0, & Y_1(0) &= 1, & Y_1'(0) &= 0, & Y_1''(0) &= 0,\\
L[Y_2] &= 0, & Y_2(0) &= 0, & Y_2'(0) &= 1, & Y_2''(0) &= 0,\\
L[Y_3] &= 0, & Y_3(0) &= 0, & Y_3'(0) &= 0, & Y_3''(0) &= 1,\\
L[Y_p] &= -x^4, & Y_p(0) &= 0, & Y_p'(0) &= 0, & Y_p''(0) &= 0,
\end{aligned} \tag{14}$$
whereL = d3/dx? —x? is thedifferentialoperatorin (12). The nineinitial conditions
in the first three of these problems were chosen so as to have a nonzero determinant so
that Y;, ¥Y2,Y3 comprise
a fundamental
set of solutions
(i.e., a linearly
independent
set of
solutions) of the homogeneousequation L[Y| = 0. The three initial conditions on the
particular solution Y, were chosen as zero for simplicity; any values will do since any
particular solution will do. Suppose we imagine that the four initial-value problems in (14)
have now been solved by the methods discussed above. Then ¥;, Yo,¥3, Y, are known
functionsof x over the interval of interest[0,2],and we have thegeneralsolution
$$
y(x) = C_1 Y_1(x) + C_2 Y_2(x) + C_3 Y_3(x) + Y_p(x) \tag{15}
$$
of $L[y] = -x^4$. Finally, we evaluate the integration constants $C_1, C_2, C_3$ by imposing the boundary conditions given in (12):
$$
\begin{aligned}
y(0) &= 0 = C_1 + 0 + 0 + 0, \\
y'(0) &= 0 = 0 + C_2 + 0 + 0, \\
y(2) &= 4 = C_1 Y_1(2) + C_2 Y_2(2) + C_3 Y_3(2) + Y_p(2).
\end{aligned}
\tag{16}
$$
Solving (16) gives $C_1 = C_2 = 0$ and $C_3 = [4 - Y_p(2)]/Y_3(2)$, so we have the desired solution of (12) as
$$
y(x) = \frac{4 - Y_p(2)}{Y_3(2)}\, Y_3(x) + Y_p(x). \tag{17}
$$
In fact, since $C_1 = C_2 = 0$, the functions $Y_1(x)$ and $Y_2(x)$ have dropped out, so we don't need to calculate them. All we need are $Y_3(x)$ and $Y_p(x)$, and these are found by the numerical integration of the initial-value problems
$$
\begin{aligned}
Y_3' &= U_3, & Y_3(0) &= 0, \\
U_3' &= V_3, & U_3(0) &= 0, \\
V_3' &= x^2 Y_3, & V_3(0) &= 1,
\end{aligned}
\tag{18}
$$
and
$$
\begin{aligned}
Y_p' &= U_p, & Y_p(0) &= 0, \\
U_p' &= V_p, & U_p(0) &= 0, \\
V_p' &= x^2 Y_p - x^4, & V_p(0) &= 0,
\end{aligned}
\tag{19}
$$
respectively.
COMMENT. Remember that whereas initial-value problems have unique solutions (if the functions involved are sufficiently well behaved), boundary-value problems can have no solution, a unique solution, or even an infinite number of solutions. How do these possibilities work out in this example? The clue is that (17) fails if $Y_3(2)$ turns out to be zero. The situation is seen more clearly from (16), where all of the possibilities come into view. Specifically, if $Y_3(2) \neq 0$, then we can solve uniquely for $C_3$, and we have a unique solution, given by (17). If $Y_3(2)$ does vanish, then there are two possibilities, as seen from (16): if $Y_p(2) \neq 4$, then there is no solution, and if $Y_p(2) = 4$, then there are an infinite number of solutions of (12), namely,
$$
y(x) = C_3 Y_3(x) + Y_p(x), \tag{20}
$$
where $C_3$ remains arbitrary. ■
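The procedure of Example 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`rk4_system`, `homogeneous`, `particular`, `y_bvp`) and the choice of 400 Runge-Kutta steps are ours, not part of the text. It integrates (18) and (19) by the fourth-order Runge-Kutta method and then applies (17). As a check, one can verify by substitution that $y = x^2$ satisfies $y''' - x^2 y = -x^4$ with $y(0) = y'(0) = 0$, $y(2) = 4$.

```python
def rk4_system(f, x0, y0, x_end, n):
    """Integrate y' = f(x, y) from x0 to x_end in n RK4 steps; y is a list."""
    h = (x_end - x0) / n
    x, y = x0, list(y0)
    for _ in range(n):
        k1 = f(x, y)
        k2 = f(x + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
        k3 = f(x + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
        k4 = f(x + h, [yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        x += h
    return y


def homogeneous(x, s):
    """Right-hand side of system (18); s = [Y3, U3, V3]."""
    return [s[1], s[2], x**2 * s[0]]


def particular(x, s):
    """Right-hand side of system (19); s = [Yp, Up, Vp]."""
    return [s[1], s[2], x**2 * s[0] - x**4]


def y_bvp(x, n=400):
    """Evaluate y(x) from (17); n steps per integration is an arbitrary choice."""
    Y3_2 = rk4_system(homogeneous, 0.0, [0.0, 0.0, 1.0], 2.0, n)[0]
    Yp_2 = rk4_system(particular, 0.0, [0.0, 0.0, 0.0], 2.0, n)[0]
    C3 = (4.0 - Yp_2) / Y3_2  # breaks down if Y3(2) = 0, as the COMMENT notes
    Y3_x = rk4_system(homogeneous, 0.0, [0.0, 0.0, 1.0], x, n)[0]
    Yp_x = rk4_system(particular, 0.0, [0.0, 0.0, 0.0], x, n)[0]
    return C3 * Y3_x + Yp_x


for x in (0.5, 1.0, 1.5, 2.0):
    print(x, y_bvp(x), x**2)  # numerical value beside the exact x**2
```

Re-integrating from $x = 0$ for every evaluation point is wasteful but keeps the sketch short; storing the intermediate Runge-Kutta values would avoid the repetition.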
We see that boundary-value problems are more difficult than initial-value problems. From Example 5 we see that a nonhomogeneous nth-order linear boundary-value problem generally involves the solution of n + 1 initial-value problems, although in Example 5 (in which n = 3) we were lucky and did not need to solve for two of the four unknowns, $Y_1$ and $Y_2$.
Nonlinear boundary-value problems are more difficult still, because we cannot use the idea of finding a fundamental set of solutions plus a particular solution and thus forming a general solution, as we did in Example 5; that idea is based upon linearity. One viable line of approach comes under the heading of shooting methods. For instance, to solve the nonlinear boundary-value problem
$$
y'' + \sin y = 3x; \quad y(0) = 0, \quad y(5) = 2 \tag{21}
$$
we can solve the initial-value problem
$$
\begin{aligned}
y' &= u; & y(0) &= 0, \\
u' &= 3x - \sin y; & u(0) &= u_0
\end{aligned}
\tag{22}
$$
iteratively. That is, we can guess at the initial condition $u_0$ [which is the initial slope $y'(0)$] and solve (22) for $y(x)$ and $u(x)$. Next, we compare the computed value of $y(5)$ with the boundary condition $y(5) = 2$ (which we have not yet used). If the computed value is too high, then we return to (22), reduce the value of $u_0$, and solve again. Comparing the new computed value of $y(5)$ with the prescribed value $y(5) = 2$, we again revise our value of $u_0$. If these revisions are done in
a rational way, one can imagine obtaining a convergent scheme. Such a scheme
is called a shooting method because of the obvious analogy with the shooting of
a projectile, with the intention of having the projectile strike the ground at some
distant prescribed point.
Thus, we can see the increase in difficulty as we move away from linear initial-value problems. For a linear boundary-value problem of order n we need to solve
not one problem but n + 1 of them. For a nonlinear boundary-value problem we
need to solve an infinite sequence of them, in principle; in practice, we need to
carry out only enough iterations to produce the desired accuracy.
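The shooting idea for (21) and (22) might be sketched as follows. The text leaves the revision rule for $u_0$ open; bisection is used here as one "rational way" of revising the guess, and that choice, the bracket $[-20, 0]$, the 500 Runge-Kutta steps, and the function names are all assumptions of this sketch. The bracket is reasonable because $|\sin y| \le 1$ implies that $y(5)$ differs from $5u_0 + 62.5$ by at most 12.5.

```python
import math


def y_at_5(u0, n=500):
    """Integrate (22) by RK4 from x = 0 to x = 5 and return the computed y(5)."""
    h = 5.0 / n
    x, y, u = 0.0, 0.0, u0

    def f(x, y, u):
        return u, 3.0 * x - math.sin(y)

    for _ in range(n):
        k1 = f(x, y, u)
        k2 = f(x + h / 2, y + h / 2 * k1[0], u + h / 2 * k1[1])
        k3 = f(x + h / 2, y + h / 2 * k2[0], u + h / 2 * k2[1])
        k4 = f(x + h, y + h * k3[0], u + h * k3[1])
        y += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        u += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        x += h
    return y


def shoot(target=2.0, lo=-20.0, hi=0.0, iterations=60):
    """Bisect on the initial slope u0 until the computed y(5) matches the target."""
    if not (y_at_5(lo) < target < y_at_5(hi)):
        raise ValueError("bracket does not straddle the target")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if y_at_5(mid) > target:  # computed y(5) too high: reduce u0
            hi = mid
        else:                     # computed y(5) too low: increase u0
            lo = mid
    return 0.5 * (lo + hi)


u0 = shoot()
print("initial slope u0 =", u0, " computed y(5) =", y_at_5(u0))
```

Each bisection step costs one full initial-value solve, which is the "sequence of problems" referred to above; a secant or Newton update would typically need fewer solves but offers no guarantee of bracketing.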
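Closure. In Section 6.4.1 we extend the Euler and fourth-order Runge-Kutta solution methods to cover systems of equations and higher-order equations. In that discussion it is more convenient to use n-dimensional vector notation because of its compactness, but that notation is not introduced until Chapters 9 and 10. Nonetheless, let us indicate the result, if only for the Euler method, for future reference. The idea is that we can express the system
$$
\begin{aligned}
y_1'(x) &= f_1\big(x, y_1(x), \ldots, y_n(x)\big); & y_1(a) &= y_{10}, \\
&\ \ \vdots \\
y_n'(x) &= f_n\big(x, y_1(x), \ldots, y_n(x)\big); & y_n(a) &= y_{n0}
\end{aligned}
\tag{23}
$$
in the vector form
$$
\mathbf{y}'(x) = \mathbf{f}\big(x, \mathbf{y}(x)\big); \quad \mathbf{y}(a) = \mathbf{y}_0, \tag{24}
$$
where the boldface letters denote n-dimensional column vectors:
$$
\mathbf{y}(x) = \begin{bmatrix} y_1(x) \\ \vdots \\ y_n(x) \end{bmatrix}, \quad
\mathbf{y}'(x) = \begin{bmatrix} y_1'(x) \\ \vdots \\ y_n'(x) \end{bmatrix}, \quad
\mathbf{f}\big(x, \mathbf{y}(x)\big) = \begin{bmatrix} f_1\big(x, \mathbf{y}(x)\big) \\ \vdots \\ f_n\big(x, \mathbf{y}(x)\big) \end{bmatrix},
\tag{25}
$$
and where $f_j(x, \mathbf{y}(x))$ is simply a shorthand notation for $f_j(x, y_1(x), \ldots, y_n(x))$. Then the Euler algorithm corresponding to (24) is
$$
\mathbf{y}_{n+1} = \mathbf{y}_n + \mathbf{f}(x_n, \mathbf{y}_n)\, h. \tag{26}
$$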
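A minimal sketch of the vector Euler algorithm (26), with the vector stored as a plain Python list. The function name `euler_system` and the test system $y_1' = y_2$, $y_2' = -y_1$ (exact solution $y_1 = \sin x$, $y_2 = \cos x$) are illustrative assumptions, not taken from the text.

```python
import math


def euler_system(f, x0, y0, h, steps):
    """Apply y_{n+1} = y_n + f(x_n, y_n) h, where y is a list of components."""
    x, y = x0, list(y0)
    for _ in range(steps):
        slopes = f(x, y)                        # f evaluated at (x_n, y_n)
        y = [yi + h * si for yi, si in zip(y, slopes)]
        x += h
    return y


def f(x, y):
    """Illustrative system: y1' = y2, y2' = -y1."""
    return [y[1], -y[0]]


y1, y2 = euler_system(f, 0.0, [0.0, 1.0], h=0.001, steps=1000)
print(y1, math.sin(1.0))   # Euler value and exact value at x = 1
print(y2, math.cos(1.0))
```

The only change from the scalar Euler method is that the update is applied component by component, with every slope evaluated at the old point.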
In Section 6.4.2 we turn to boundary-value problems, but only linear ones. Given an nth-order linear boundary-value problem $L[y] = f(x)$ on an interval $[a, b]$ plus n boundary conditions, the idea is to solve the problems
$$
\begin{aligned}
L[Y_1] &= 0; & Y_1(a) &= 1, \ Y_1'(a) = \cdots = Y_1^{(n-1)}(a) = 0, \\
&\ \ \vdots \\
L[Y_n] &= 0; & Y_n(a) &= Y_n'(a) = \cdots = Y_n^{(n-2)}(a) = 0, \ Y_n^{(n-1)}(a) = 1, \\
L[Y_p] &= f; & Y_p(a) &= Y_p'(a) = \cdots = Y_p^{(n-1)}(a) = 0
\end{aligned}
\tag{27}
$$
for $Y_1(x), \ldots, Y_n(x), Y_p(x)$ and to form the general solution as
$$
y(x) = C_1 Y_1(x) + \cdots + C_n Y_n(x) + Y_p(x). \tag{28}
$$
Finally, application of the original boundary conditions to (28) yields n linear algebraic equations for $C_1, \ldots, C_n$, which equations will have a unique solution, no solution, or an infinity of solutions.
Computer software. No new software is needed for the methods described in this section. For instance, we can use the Maple command dsolve, with the numeric option, to solve the problem
$$
\begin{aligned}
u' &= x + v; & u(0) &= 0, \\
v' &= -5uv; & v(0) &= 1
\end{aligned}
$$
and to print the results at x = 1, 2, and 3. First, enter
with(DEtools):
and return. Then enter
dsolve({diff(u(x), x) = x + v(x), diff(v(x), x) = -5*u(x)*v(x),
u(0) = 0, v(0) = 1}, {u(x), v(x)}, type = numeric,
value = array([1, 2, 3]));
and return. The printed result is
[x, u(x), v(x)]
1.   1.032499017614234   .07285274036469075
2.   2.544584704578166   .00001413488345836790
3.   5.044585755162072   -.3131443346304622 × 10^(exponent illegible)
The only differences between the command above and the one given at the end of Section 6.3 are that here we have entered two differential equations, two initial conditions, and two dependent variables, and we have omitted the abserr option.
Observe that to solve a differential equation, or system of differential equations, numerically, we must first express the equations as a system of first-order equations, as illustrated in Example 4. However, to use the Maple dsolve command we can leave the original higher-order equations intact.
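For readers without Maple, the same system can be integrated with a short hand-written fourth-order Runge-Kutta routine. The sketch below is not the algorithm that dsolve uses; the step size h = 0.001 and the names `rk4`, `rhs`, and `table` are assumptions made for illustration. Its output should be comparable to the values printed above for u and v at x = 1 and 2.

```python
def rk4(f, x, y, h, steps):
    """Advance the list-valued solution y of y' = f(x, y) by `steps` RK4 steps."""
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, [a + h / 2 * b for a, b in zip(y, k1)])
        k3 = f(x + h / 2, [a + h / 2 * b for a, b in zip(y, k2)])
        k4 = f(x + h, [a + h * b for a, b in zip(y, k3)])
        y = [a + h / 6 * (p + 2 * q + 2 * r + s)
             for a, p, q, r, s in zip(y, k1, k2, k3, k4)]
        x += h
    return x, y


def rhs(x, y):
    """u' = x + v, v' = -5 u v, with y = [u, v]."""
    return [x + y[1], -5.0 * y[0] * y[1]]


x, state = 0.0, [0.0, 1.0]       # u(0) = 0, v(0) = 1
table = {}
for target in (1, 2, 3):
    x, state = rk4(rhs, x, state, h=0.001, steps=1000)  # march one unit in x
    table[target] = tuple(state)
    print(target, state[0], state[1])
```

Because v is essentially zero beyond x = 2, the equation for u reduces to u' = x there, so u(3) - u(2) should be very close to 2.5; that gives a quick sanity check that does not rely on the Maple output.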
EXERCISES
6.4
1. In Example 2 we gave ky, l1, ka, la, ky, lg, ka, ly forn = 1,
(d)Sameas(a),butwithx(0) = y(0) = 0,2(0) = 10.
and the resulting
5. We re-expressed (8) and (10) as the equivalent systems of
values of up and ve, but did not show the
calculations. Provide thosecalculations, as we did for the step first-order initial-value problems (9) and (11), respectively. Do
n= 0.
the same for the problem given. You need go no further.
2. As we did in Example |, work out y,, 21, by hand. Use three (a)ma" +ca'+ka=
2'(0) = 25
f(t); 2(0)=20,
methods: Euler, second-order
Runge-Kutta,
and fourth-order
Runge-Kutta, and take & = 0.2. These problems are rigged
so as to have simple closed-form solutions, some of which are
given in brackets. Compare your results with the exact solution.
(a) yo=z
2(0)=0
2(2)=0
zi=—y,
2(0)=0
() y=-2/y,
yA)=5,
(e)y’" —Qsiny’ = 32;
y(-2)=7,
y/Q)=-1
dy" +y' —4y=32; y(0)=2, y/(0)=
y"(—2)=0
y" a+5y =0;
(b) y' = 42; y(2)=5
zg =-y;
(c)y”—ayy’=sing;
y/(-2)=4
y”"(1)=0
y/(A)=2,
y(l)=3,
(f)y' +2y=cos2z;
2’(0)=—-1
2(0)=2,
(g) 2’ +22 —3y =10cos3t;
y(0)=1
z=-y;,
(b)Li” +Ri +(1/C)i=E'(t); i(0)=%o,i(0)=%,
yO)=1 [y(x)=e7*]
2(3) =0,2'(3) =8
e"+ty-—z=g(a);
2”(1)=3
z(l)=2’(1)=0,
(i) 2” —8az=sint;
y(1) =6
Ee
Tee
2(1)=2
~doy=e*
[z(x) =e7*]
(d)y= 2e27/y;y(l)=1 [y(z)
=27]
gi =y/2*;
2(1)=1
[2(z) =a]
ghs—y2?;
c(l)=1
[2(x) = 1/2]
y(0)=4,y/(0)=3
(h) yy"+ay'z= f(z); y(3)=2, y/(3)=-1
y!(3)=6
(e:)yo=(et+y)z—-1,yQ)=1 [y(2)=]
Gy
3. (a)-(e) First, read Exercise 2. Use computer software to solve the initial-value problem given in the corresponding part of Exercise 2, for y(x) and z(x) at x = 3, 5, and 10, and compare those results with the exact solution at those points.
=yz; (0) = 1,
2 = —xy + 2;
2(0) =
6. Use computer software to solve the given system numerically, and print out the solution for y(x) and z(x) at x = 1, 2.
(a) y" − aoa =5ax; y(0) = 2, y'(0)= —1
2(0)=1
=-3,
z!
+ yz
4. (a) Just as (2) and (6) give the Euler and fourth-order
(b)
Runge-
Kutta algorithms for the second-order system (1), write down
the analogous Euler, second-order Runge—Kutta, and fourthorder Runge-Kutta algorithms for the third-order system
x(t) = f(t,2,y, 2), x(a) = x
y'(t)= g(t,2,y,2), y(a)= yo
ai(t)= f(t,e,y,2z),.2(a).= 29.
(c)yl=2!+a;
(d) yz"
=a;
=2"(1)=0
21)=1, 2/(1)
(4.1) 7. Complete the solution of Example 5 by using computer soft-
ware to solve (18) for ¥3(x) and (19) for Y, (az),atx = 2,4, 6,
and then using (17) to determine y(z)-at those points.
Use the Euler and second-order Runge—Kutta algorithms to 8. Use the method explained in Example 5 to reduce the given
work out 21,41, 21 and 29,yg, 22, by hand, for the case where
linear boundary-value problem to a system of linear initialfig,hare y — 1,z,t +a + 3(2 — y + 1), respectively,with value problems. Then complete the solution and solve for the
the initial conditions(0) = —3,y(0) = 0, z(0) = 2 using specified quantity, either using computer software or by programming any of the numerical solution methods that we have
studied. Obtain accuracy to five significant figures, and indicate why you believe that you have achieved that accuracy.
h=0.3.
6.5. Stability and Difference Equations
If you believethatthereis no solution or thatit exists but is
nonunique, then give your reasoning.
HINT: You can specify
(c) y” ~ [In(a + 1)]y/ —y = 2sin 38a4+1;
y(O) = 3,
y(2)=—1.
homogeneousinitial conditions for the Y, problem, as we did Determine y(a) at « = 0.5,1.0, 1.5.
y(5)=2
y(0)=1,
in Example 5, but be aware that you do not have to use homo- —(q) yi +y—-ay=03;
1,2,3,4.
=
ata
y(a)
petermine
reduceyour
to
able
be
thatyoumay
and
geneousconditions,
(ec)y” +ay = 203 y(t)=y'(1)=0,
(2)
laborby a moreoptimal choice of thoseconditions.
y(0)=1,
(a)y" —2ay'+y=3sine;
y(1)
Determine
(b)y+ (cosz)y=0;
Determiney(2).
6.5
y(0)=1,
Determiney(2).
y(2)=3.
y(x) ata = 4,5.
Determine
6.5 Stability and Difference Equations
6.5.1. Introduction. In progressing from the simple Euler method to the more sophisticated higher-order methods our aim was improvement in accuracy. However, there are cases where the results obtained not only fail to be sufficiently accurate but are grossly incorrect, as illustrated in the two examples to follow. The second one introduces the idea of stability, and in Section 6.5.2 we concentrate on that topic.
EXAMPLE 1. The initial-value problem
$$
y' - 2y = -6e^{-4x}; \quad y(0) = 1 \tag{1}
$$
has the exact solution $y(x) = \exp(-4x)$. If we solve it by the fourth-order Runge-Kutta method for the step sizes h = 0.1, 0.05, and 0.01, we obtain in Table 1 the results shown, at
Table 1. Runge-Kutta solution of (1).
x  | h = 0.1       | h = 0.05      | h = 0.01      | Exact
1  |  0.179006 E-1 |  0.182893 E-1 |  0.183153 E-1 | 0.183156 E-1
4  | -0.167842 E+0 | -0.106538 E-1 | -0.146405 E-3 | 0.112535 E-6
8  | -0.500286 E+3 | -0.317586 E+2 | -0.436763 E+0 | 0.126642 E-13
12 | -0.149120 E+7 | -0.946704 E+5 | -0.130197 E+4 | 0.142516 E-20
= 3.
o'(5)
=04,
ty=a;yl)=2,v(1)
(Dy+ay!
y(10)=
the representative points x = 1, 4, 8, and 12. Since the Runge-Kutta method is convergent, the results should converge to the exact solution at any given x as h tends to zero, but that convergence is hard to see in the tabulated results except for x = 1. In fact, it is doubtful that we could ever come close to the exact values at x = 8 or 12 since we might need to make h so small that roundoff errors might come to dominate before the accumulated truncation error is sufficiently reduced.
More central to the purpose of this example is to see that with h fixed the results diverge dramatically from the exact solution as x increases so as to become grossly incorrect. We cannot blame this strange and unexpected result on complications due to nonlinearity because (1) is linear.
To understand the source of the difficulty, note that the general solution of the differential equation is $y(x) = \exp(-4x) + C\exp(2x)$, where C is an arbitrary constant. The initial condition implies that C = 0, leaving the particular solution $y(x) = \exp(-4x)$. In Fig. 1 we show several solution curves for values of C close to and equal to zero, and we can see the rapid divergence of neighboring curves from the solution $y(x) = \exp(-4x)$. Thus, the explanation of the difficulties found in the tabulated numerical results is that even a very small numerical error shifts us from the exact solution curve to a neighboring curve, which then diverges from the true solution. ■
Figure 1. Solution curves for the equation $y' - 2y = -6e^{-4x}$.
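The behavior reported in Table 1 can be explored with a short script. This is a sketch in double-precision Python, so the digits will not match the book's table exactly (the arithmetic of the original computation is not stated); the function name `rk4_example1` is an assumption. What it should show is the same qualitative picture: close agreement at x = 1 and a numerical solution that is carried away along a neighboring curve $C\exp(2x)$ as x grows.

```python
import math


def rk4_example1(h, x_end):
    """Fourth-order Runge-Kutta for y' = 2y - 6 exp(-4x), y(0) = 1."""

    def f(x, y):
        return 2.0 * y - 6.0 * math.exp(-4.0 * x)

    x, y = 0.0, 1.0
    for _ in range(round(x_end / h)):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h / 2 * k1)
        k3 = f(x + h / 2, y + h / 2 * k2)
        k4 = f(x + h, y + h * k3)
        y += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        x += h
    return y


for x_end in (1, 4, 8, 12):
    exact = math.exp(-4.0 * x_end)
    print(x_end, rk4_example1(0.1, x_end), rk4_example1(0.01, x_end), exact)
```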
EXAMPLE 2. In Example 3 of Section 6.3 we solved the equation $y' = -y$, with initial condition $y(0) = 1$, by several methods, from the simple Euler method to the more accurate and sophisticated fourth-order Runge-Kutta method, and we gave the results in Table 2. Since the midpoint rule and the second-order Runge-Kutta methods are both of second order we expected their accuracy to be comparable. Indeed they were initially, but the midpoint rule eventually developed an error that oscillated in sign from step to step and grew in magnitude (see Table 2 in Section 6.3). Let us solve the similar problem
$$
y' = -2y; \quad y(0) = 1 \tag{2}
$$
by the midpoint rule, with h = 0.05. Since the midpoint rule is not self-starting, we use ten Euler steps from x = 0 to x = 0.05 before switching over to the midpoint rule. We have plotted the results in Fig. 2, along with the exact solution, $y(x) = \exp(-2x)$. Once again, we see that the midpoint rule results follow the exact solution initially, but they develop an error that oscillates in sign and grows such that the results are soon hopelessly incorrect. This numerical difficulty is different from the one found above in Example 1, for rather than being due to an extreme sensitivity to initial conditions, it is associated with machine roundoff error and is an example of numerical instability. ■
Figure 2. Illustration of numerical instability associated with the midpoint rule, for the initial-value problem (2).
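The computation described in Example 2 can be sketched as follows: ten Euler substeps carry the solution from x = 0 to x = h, and the midpoint rule takes over from there. The function name and the choice of 200 steps (out to x = 10) are assumptions of this sketch.

```python
import math


def midpoint_rule(A, h, n_steps, euler_substeps=10):
    """Midpoint rule y_{n+1} = y_{n-1} + 2 h A y_n for y' = A y, y(0) = 1.

    The starting value y_1 is produced by `euler_substeps` Euler steps of
    size h / euler_substeps, since the midpoint rule is not self-starting.
    """
    y_prev = 1.0
    y_curr = y_prev
    for _ in range(euler_substeps):
        y_curr += (h / euler_substeps) * A * y_curr
    ys = [y_prev, y_curr]
    for _ in range(n_steps - 1):
        ys.append(ys[-2] + 2.0 * h * A * ys[-1])
    return ys          # ys[n] approximates y(n h)


h = 0.05
ys = midpoint_rule(A=-2.0, h=h, n_steps=200)
for n in (20, 100, 199, 200):
    print(n * h, ys[n], math.exp(-2.0 * n * h))
```

Near x = 1 the computed values still track $\exp(-2x)$, but by x = 10 consecutive values alternate in sign and are many orders of magnitude larger than the exact solution.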
6.5.2. Stability. Let us analyze the phenomenon of numerical instability that we encountered in Example 2. Recall that we denote the exact solution of a given initial-value problem as $y(x_n)$ and the numerical solution as $y_n$. Actually, the latter is not quite the same as the computer printout because of the inevitable presence of machine roundoff errors. Thus, let us distinguish further between the numerical solution $y_n$ that would be generated on a perfect computer, and the solution $y_n^*$ that is generated on a real machine and which includes the effects of numerical roundoff (that is, the truncation of numbers after a certain number of significant figures). It is useful to decompose the total error, at any nth step, as
$$
\begin{aligned}
\text{Total error} &= y(x_n) - y_n^* \\
&= [y(x_n) - y_n] + [y_n - y_n^*] \\
&= [\text{accum. truncation error}] + [\text{accum. roundoff error}].
\end{aligned}
\tag{3}
$$
We ask two things of a method: first, that the accumulated truncation error tend to zero at any fixed x as the step size h tends to zero and, second, that the accumulated roundoff error remain small compared to the exact solution. The first is the issue of convergence, discussed earlier in this chapter, and the second is the issue of stability, our present concern.
We have already noted that the midpoint rule can produce the strange behavior shown in Fig. 2, so let us study the application of that method to the standard "test problem,"
$$
y' = Ay; \quad y(0) = 1, \tag{4}
$$
where it is useful to include the constant A as a parameter. The midpoint rule generates $y_n$ according to the algorithm
$$
y_{n+1} = y_{n-1} + 2h f(x_n, y_n) = y_{n-1} + 2Ah\, y_n; \quad y_0 = 1 \tag{5}
$$
for n = 0, 1, 2, ....
To determine whether a solution algorithm, in this case (5), is stable, it is customary to "inject" a roundoff error at any step in the solution, say at n = 0, and to see how much the perturbed solution differs from the exact solution as n increases, assuming that no further roundoff errors occur. Thus, in place of (5), consider the perturbed problem
$$
y^*_{n+1} = y^*_{n-1} + 2Ah\, y^*_n; \quad y^*_0 = 1 - \epsilon, \tag{6}
$$
say, where $\epsilon$ is the (positive or negative) roundoff error in the initial condition. Defining the error $e_n \equiv y_n - y_n^*$, and subtracting (6) from (5), gives
$$
e_{n+1} = e_{n-1} + 2Ah\, e_n, \tag{7}
$$
with the initial condition $e_0 = \epsilon$, as governing the evolution of $e_n$. We call (7) a difference equation. Just as certain differential equations can be solved by seeking solutions in the form $y(x) = e^{\lambda x}$, the appropriate form for the difference equation (7) is
$$
e_n = \rho^n, \tag{8}
$$
where $\rho$ is to be determined. Putting this expression into (7) gives
$$
\rho^{n+1} - 2Ah\rho^n - \rho^{n-1} = 0 \tag{9}
$$
or
$$
\left(\rho^2 - 2Ah\rho - 1\right)\rho^{n-1} = 0. \tag{10}
$$
Since $\rho^{n-1}$ is not zero, it follows from (10) that we must have $\rho^2 - 2Ah\rho - 1 = 0$, so we have the two roots
$$
\rho = Ah + \sqrt{1 + A^2h^2} \quad \text{and} \quad \rho = Ah - \sqrt{1 + A^2h^2}. \tag{11}
$$
By considerations analogous to those for differential equations, we have
$$
e_n = C_1\left(Ah + \sqrt{1 + A^2h^2}\right)^n + C_2\left(Ah - \sqrt{1 + A^2h^2}\right)^n \tag{12}
$$
as the general solution of (7). If we let $h \to 0$, then
$$
\left(Ah + \sqrt{1 + A^2h^2}\right)^n \sim (Ah + 1)^n = e^{n\ln(1 + Ah)} \sim e^{nAh} = e^{Ax_n}, \tag{13}
$$
where we have used the identity $a = e^{\ln a}$, the Taylor expansion $\ln(1 + x) = x - x^2/2 + \cdots \sim x$, and the fact that $x_n = nh$. Similarly,
$$
\left(Ah - \sqrt{1 + A^2h^2}\right)^n \sim (Ah - 1)^n = (-1)^n(1 - Ah)^n = (-1)^n e^{n\ln(1 - Ah)} \sim (-1)^n e^{-nAh} = (-1)^n e^{-Ax_n}, \tag{14}
$$
so (12) becomes
$$
e_n = C_1 e^{Ax_n} + C_2(-1)^n e^{-Ax_n} \tag{15}
$$
as $h \to 0$. Since there are two arbitrary constants, $C_1$ and $C_2$, two initial conditions are appropriate, whereas we have attached only the single condition $e_0 = \epsilon$ in (7). With no great loss of generality let us specify as a second initial condition $e_1 = 0$. Imposing these conditions on (15), we have
$$
e_0 = \epsilon = C_1 + C_2, \qquad e_1 = 0 = C_1 e^{Ax_1} - C_2 e^{-Ax_1}.
$$
Finally, solving for $C_1$ and $C_2$, and inserting these values into (15), gives
$$
e_n = \frac{\epsilon}{2\cosh Ax_1}\left[e^{A(x_n - x_1)} + (-1)^n e^{-A(x_n - x_1)}\right]. \tag{16}
$$
To infer from (16) whether the method is stable or not, we consider the cases A > 0 and A < 0 separately. If A > 0, then the second term in (16) decays to zero, and even though the first term grows exponentially, it remains small compared to the exact solution $y(x_n) = \exp(Ax_n)$ as n increases because $\epsilon$ is very small (for example, on the order of $10^{-10}$). We conclude, formally, that if A > 0 then the midpoint rule is stable.
On the other hand, if A < 0, then the second term starts out quite small, due to the $\epsilon$ factor, but grows exponentially with x, and oscillates due to the $(-1)^n$, whereas the exact solution is $\exp(Ax)$. This is precisely the sort of behavior that was observed in Example 2 (where A was -2), and we conclude that if A < 0, then the midpoint rule is unstable.
Since the stability of the midpoint rule depends upon the sign of A in the test equation $y' = Ay$ (stability for A > 0 and instability for A < 0), we say that the midpoint rule is only weakly stable. If, instead, a method is stable independent of the sign of A, then we classify it as strongly stable.
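A quick numerical look at the two roots in (11) makes the weak stability concrete. In this sketch (the function name and the sample values A = ±2, h = 0.05 are illustrative assumptions) the first root stays close to $e^{Ah}$, the growth factor of the exact solution over one step, while the magnitude of the second root exceeds 1 exactly when A < 0.

```python
import math


def midpoint_roots(A, h):
    """Roots (11) of rho**2 - 2*A*h*rho - 1 = 0."""
    s = math.sqrt(1.0 + (A * h) ** 2)
    return A * h + s, A * h - s


h = 0.05
for A in (2.0, -2.0):
    rho1, rho2 = midpoint_roots(A, h)
    print("A =", A,
          " rho1 =", rho1, " exp(A h) =", math.exp(A * h),
          " |rho2| =", abs(rho2))
```

Since the product of the two roots is -1, one of them always has magnitude greater than 1; the method is troublesome only when that root is the extraneous one, which is the case A < 0.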
Having found that the midpoint rule when applied to the equation $y' = Ay$ is stable for A > 0 and unstable for A < 0, what about the stability of the midpoint rule when it is applied to an equation $y' = f(x, y)$ that is more complicated? Observing that A is the partial derivative of Ay (i.e., the right-hand side of $y' = Ay$) with respect to y, we expect, as a rule of thumb, the midpoint rule to be stable if $\partial f/\partial y > 0$ and unstable if $\partial f/\partial y < 0$ over the x, y domain of interest. For instance, if $y' = e^{xy}$ on x > 0, then we can expect the midpoint rule to be stable because $\partial(e^{xy})/\partial y = xe^{xy} > 0$ on x > 0, but if $y' = e^{-xy}$ on x > 0, then we can expect the midpoint rule to be unstable on x > 0 because $\partial(e^{-xy})/\partial y = -xe^{-xy} < 0$ on x > 0.
Besides arriving at the above-stated conclusions as to the stability of the midpoint rule for the test equation $y' = Ay$, we can now understand the origin of the instability, for notice that the difference equations $y_{n+1} - 2Ah\,y_n - y_{n-1} = 0$ and $e_{n+1} - 2Ah\,e_n - e_{n-1} = 0$, governing $y_n$ and $e_n$, are identical. Thus, analogous to (15) we must have
$$
y_n = B_1 e^{Ax_n} + B_2(-1)^n e^{-Ax_n} \tag{17}
$$
for arbitrary constants $B_1, B_2$, as h tends to zero. The first of these terms coincides with the exact solution of the original equation $y' = Ay$, and the second term (which gives rise to the instability if A < 0) is an extraneous solution that enters because we have replaced the original first-order differential equation by a second-order difference equation (second-order because the difference between the subscripts n + 1 and n - 1 is 2). Single-step methods (e.g., Euler and Runge-Kutta) are strongly stable (i.e., independent of the sign of A) because the resulting difference equation is only first order so there are no extraneous solutions. Thus, we can finally see why, in Example 3 of Section 6.3, the midpoint rule proved unstable but the other methods were stable.
Understand that these stability claims are based upon analyses in which we let h tend to zero, whereas in practice h is, of course, finite. To illustrate what can happen as h is varied, let us solve
$$
y' = -1000(y - x^3) + 3x^2; \quad y(0) = 0 \tag{18}
$$
by Euler's method. The exact solution is simply $y(x) = x^3$ so that at x = 1, for instance, we have y(1) = 1. By comparison, the values computed by Euler's method are as given in Table 2.
Even from this limited data we can see that we do have the stability claimed above for the single-step Euler method, but only when h is made sufficiently small. To understand this behavior, consider the relevant test equation $y' = -1000y$, namely, $y' = Ay$, where $A = \partial[-1000(y - x^3) + 3x^2]/\partial y = -1000$. Then Euler's method for that test equation is $y_{n+1} = y_n - 1000h\,y_n$. Similarly, $y^*_{n+1} = y^*_n - 1000h\,y^*_n$. Subtracting these two equations, we find that the roundoff error $e_n = y_n - y_n^*$ satisfies the simple difference equation
$$
e_{n+1} = (1 - 1000h)\,e_n. \tag{19}
$$
Table 2. Finite-h stability.
h      | Computed y(1)
0.2500 | 2.3737566 × 10^5
0.1000 | 8.7725049 × 10^14
0.0100 | Exponential overflow
0.0010 | 0.99999726
0.0001 | 0.99999970
Letting n = 0, 1, 2, ... in (19) reveals that the solution of (19) is
$$
e_n = (1 - 1000h)^n e_0, \tag{20}
$$
where $e_0$ is the initial roundoff error. If we take the limit as $h \to 0$, then
$$
e_n = (1 - 1000h)^n e_0 = e_0 e^{n\ln(1 - 1000h)} \sim e_0 e^{-1000nh} = e_0 e^{-1000x_n}, \tag{21}
$$
which is small compared to the exact solution $y_n = e^{-1000x_n}$ because of the $e_0$ factor, so the method is stable. This result is in agreement with the numerical results given in Table 2: as $h \to 0$ the scheme is stable. However, in a real calculation h is, of course, finite and it appears, from the tabulation, that there is some critical value, say $h_{cr}$, such that the guaranteed stability is realized only if $h < h_{cr}$. To see this, let us retain (20) rather than let $h \to 0$. It is seen from (20) that if $|1 - 1000h| < 1$, then $e_n \to 0$ as $n \to \infty$, and if $|1 - 1000h| > 1$, then $e_n \to \infty$ as $n \to \infty$. Thus, for stability we need $|1 - 1000h| < 1$ or $-1 < 1 - 1000h < 1$. The right-hand inequality imposes no restriction on h because it is true for all h's (provided that h is positive, as is normally the case), and the left-hand inequality is true only for h < 0.002. Hence $h_{cr} = 0.002$ in this example, and this result is consistent with the tabulated results, which show instability for the h's greater than that value, and stability for the h's smaller than that value. Thus, when we say that the Euler method is strongly stable, what we should really say is that it is strongly stable for sufficiently small h. Likewise for the Runge-Kutta and other single-step methods.
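The finite-h threshold can be checked directly. The sketch below applies Euler's method to (18) for several step sizes on either side of $h_{cr} = 0.002$; the function name and the particular list of step sizes are assumptions of this illustration, and the run is in double precision, so large values appear where a machine with a smaller exponent range might overflow.

```python
def euler_y1(h):
    """Euler's method for y' = -1000 (y - x**3) + 3 x**2, y(0) = 0, marched to x = 1."""
    n = round(1.0 / h)
    y = 0.0
    for k in range(n):
        x = k * h
        y += h * (-1000.0 * (y - x**3) + 3.0 * x**2)
    return y


for h in (0.25, 0.1, 0.0025, 0.001, 0.0001):
    # |1 - 1000 h| is the factor by which (19) multiplies the error each step
    print(h, abs(1.0 - 1000.0 * h), euler_y1(h))
```

For h = 0.25 the four Euler steps can be followed by hand: they give 0, 3.953125, -952.890625, and 237375.65625, which matches the first entry of Table 2.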
6.5.3. Difference equations. (Optional) Difference equations are important in their own right, and the purpose of this Section 6.5.3 is not only to clarify some of the steps in Section 6.5.2, but also to take this opportunity to present some basics regarding the theory and solution of such equations.
To begin, we define a difference equation of order N as a relation involving $y_n, y_{n+1}, \ldots, y_{n+N}$. As we have seen, one way in which difference equations arise is in the numerical solution of differential equations. For instance, if we discretize the differential equation $y' = -y$ and solve by Euler's method or the midpoint rule, then in place of the differential equation we have the first- and second-order difference equations $y_{n+1} = y_n - hy_n = (1 - h)y_n$ and $y_{n+1} = y_{n-1} - 2hy_n$, or
$$
y_{n+1} - (1 - h)y_n = 0 \tag{22}
$$
and
$$
y_{n+1} + 2hy_n - y_{n-1} = 0, \tag{23}
$$
respectively. In case it is not clear that (23) is of second order, we could let n - 1 = m and obtain $y_{m+2} + 2hy_{m+1} - y_m = 0$ instead, which equation is more clearly of second order. That is, the order is always the difference between the largest and smallest subscripted indices.
Analogous to differential equation terminology, we say that (22) and (23) are linear because they are of the form
$$
a_0(n)\,y_{n+N} + a_1(n)\,y_{n+N-1} + \cdots + a_N(n)\,y_n = f(n), \tag{24}
$$
homogeneous because f(n) is zero in each case, and of constant-coefficient type because their $a_j$'s are constants rather than functions of n. By a solution of (24) is meant any sequence $y_n$ that reduces (24) to a numerical identity for each n under consideration, such as n = 0, 1, 2, ....
The theory of difference equations is analogous to that of differential equations. For instance, just as one seeks solutions to a linear homogeneous differential equation with constant coefficients in the form $y(x) = e^{\lambda x}$, one seeks solutions to a linear homogeneous difference equation with constant coefficients in the form
$$
y_n = \rho^n \tag{25}
$$
as we did in Section 6.5.2. [In case these forms don't seem analogous, observe that $e^{\lambda x} = (e^{\lambda})^x$ is a constant to the power x, just as $\rho^n$ is a constant to the power n.] Putting (25) into such an Nth-order difference equation gives an Nth-degree polynomial equation on $\rho$, the characteristic equation corresponding to the given difference equation, and if the N roots $(\rho_1, \ldots, \rho_N)$ are distinct, then
$$
y_n = C_1\rho_1^n + \cdots + C_N\rho_N^n, \tag{26}
$$
where the $C_j$'s are arbitrary constants, can be shown to be a general solution of the difference equation in the sense that every solution is of the form (26) for some specific choice of the $C_j$'s. For an Nth-order linear differential equation, N initial conditions (y and its first N - 1 derivatives at the initial point) are appropriate for narrowing a general solution down to a particular solution. Likewise for a linear difference equation N initial conditions are appropriate, namely, the first N values $y_0, y_1, \ldots, y_{N-1}$.
EXAMPLE 3. Solve the difference equation
$$
y_{n+1} - 4y_n = 0. \tag{27}
$$
Since (27) is linear, homogeneous and of constant-coefficient type, seek solutions in the form (25). Putting that form into (27) gives
$$
\rho^{n+1} - 4\rho^n = (\rho - 4)\rho^n = 0 \tag{28}
$$
so that if $\rho \neq 0$ then $\rho - 4 = 0$, $\rho = 4$, and the general solution of (27) is
$$
y_n = C(4)^n. \tag{29}
$$
For example, if an initial condition $y_0 = 3$ is specified, then $y_0 = 3 = C(4)^0 = C$ gives C = 3 and hence the particular solution $y_n = 3(4)^n$.
Actually, (27) is simple enough to solve more directly since (for n = 0, 1, ...) it gives $y_1 = 4y_0$, $y_2 = 4y_1 = 4^2y_0$, $y_3 = 4y_2 = 4^3y_0$, and so on, so one can see that $y_n = y_0(4)^n$ or, if $y_0$ is not specified, $y_n = C(4)^n$, which is the same as (29). ■
EXAMPLE 4. Solve the difference equation
$$
y_{n+2} - y_{n+1} - 6y_n = 0. \tag{30}
$$
Seeking solutions in the form (25) gives the characteristic equation $\rho^2 - \rho - 6 = 0$ with roots -2 and 3 so the general solution of (30) is
$$
y_n = C_1(-2)^n + C_2(3)^n. \tag{31}
$$
If initial conditions are prescribed, say $y_0 = 4$ and $y_1 = -13$, then
$$
y_0 = 4 = C_1 + C_2, \qquad y_1 = -13 = -2C_1 + 3C_2
$$
give $C_1 = 5$ and $C_2 = -1$. ■
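The closed-form result of Example 4 is easy to confirm by marching the recurrence itself. In this sketch (function names are our own) the difference equation (30) is rewritten as $y_{n+2} = y_{n+1} + 6y_n$ and compared with $5(-2)^n - 3^n$; integer arithmetic is used, so the comparison is exact.

```python
def iterate(y0, y1, n_max):
    """Generate y_0, ..., y_{n_max} from y_{n+2} = y_{n+1} + 6 y_n."""
    ys = [y0, y1]
    while len(ys) <= n_max:
        ys.append(ys[-1] + 6 * ys[-2])
    return ys


def closed_form(n, C1=5, C2=-1):
    """General solution (31) with the constants found in Example 4."""
    return C1 * (-2) ** n + C2 * 3 ** n


ys = iterate(4, -13, 10)
print(ys[:6])
print([closed_form(n) for n in range(6)])
```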
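If the characteristic polynomial has a pair of complex conjugate roots, say $\rho_1 = \alpha + i\beta$ and $\rho_2 = \alpha - i\beta$, then the solution can still be expressed in real form, for if we express $\rho_1$ and $\rho_2$ in polar form (as is explained in Section 22.2 on complex numbers) as
$$
\rho_1 = re^{i\theta} \quad \text{and} \quad \rho_2 = re^{-i\theta}, \tag{32}
$$
where $r = \sqrt{\alpha^2 + \beta^2}$ and $\theta = \tan^{-1}(\beta/\alpha)$, then
$$
\begin{aligned}
C_1\rho_1^n + C_2\rho_2^n &= C_1r^ne^{in\theta} + C_2r^ne^{-in\theta} \\
&= r^n\left(C_1e^{in\theta} + C_2e^{-in\theta}\right) \\
&= r^n\left[C_1(\cos n\theta + i\sin n\theta) + C_2(\cos n\theta - i\sin n\theta)\right] \\
&= r^n\left(C_3\cos n\theta + C_4\sin n\theta\right),
\end{aligned}
\tag{33}
$$
where $C_3 = C_1 + C_2$ and $C_4 = i(C_1 - C_2)$ are arbitrary constants.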
EXAMPLE 5. Solve the difference equation
$$
y_{n+2} + 4y_n = 0. \tag{34}
$$
The characteristic roots are $\pm 2i$ so (32) becomes $\rho_1 = 2e^{i\pi/2}$ and $\rho_2 = 2e^{-i\pi/2}$, and (33) gives the general solution
$$
y_n = 2^n\left(A\cos\frac{n\pi}{2} + B\sin\frac{n\pi}{2}\right), \tag{35}
$$
where A, B are arbitrary constants. ■
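As a check on (35), the sketch below marches (34) in the form $y_{n+2} = -4y_n$ from sample starting values and compares with the closed form. Setting n = 0 and n = 1 in (35) gives $A = y_0$ and $B = y_1/2$; the starting values $y_0 = 1$, $y_1 = 3$ are arbitrary choices made for illustration.

```python
import math


def closed_form(n, A, B):
    """General solution (35): 2**n (A cos(n pi/2) + B sin(n pi/2))."""
    return 2 ** n * (A * math.cos(n * math.pi / 2) + B * math.sin(n * math.pi / 2))


y0, y1 = 1.0, 3.0          # illustrative initial values
A, B = y0, y1 / 2.0        # from (35) at n = 0 and n = 1
ys = [y0, y1]
for n in range(8):
    ys.append(-4.0 * ys[n])  # y_{n+2} = -4 y_n

for n, y in enumerate(ys):
    print(n, y, closed_form(n, A, B))
```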
As we have seen, one way in which difference equations arise is in the numerical solution of differential equations, wherein the continuous process (described by the differential equation) is approximated by a discrete one (described by the difference equation). However, they also arise directly in modeling discrete processes. To illustrate, let $p_n$ be the principal in a savings account at the end of the nth year. We say that the process is discrete in that p is a function of the integer variable n rather than a continuous time variable t. If the account earns an annual interest of I percent, then the growth of principal from year to year is governed by the difference equation
$$
p_{n+1} = \left(1 + \frac{I}{100}\right)p_n, \tag{36}
$$
which is of the same form as (22).
In fact, discrete processes governed by nonlinear difference equations are part of the modern theory of dynamical systems, in which theory the phenomenon of chaos plays a prominent role. Let us close with a brief mention of one such discrete process that is familiar to those who study dynamical systems and chaos. Let $x_n, y_n$ be a point in a Cartesian x, y plane, and let its polar coordinates be r and $\theta_n$. Consider a simple process, or mapping, which sends that point into a point $x_{n+1}, y_{n+1}$ at the same radius but at an incremented angle $\theta_n + \alpha$. Then, recalling the identity $\cos(A + B) = \cos A\cos B - \sin A\sin B$, we can express $x_{n+1} = r\cos(\theta_n + \alpha) = r\cos\theta_n\cos\alpha - r\sin\theta_n\sin\alpha = x_n\cos\alpha - y_n\sin\alpha$ and, recalling the identity $\sin(A + B) = \sin A\cos B + \sin B\cos A$, we can express $y_{n+1} = r\sin(\theta_n + \alpha) = r\sin\theta_n\cos\alpha + r\cos\theta_n\sin\alpha = y_n\cos\alpha + x_n\sin\alpha$. Thus, the process is described by the system of linear difference equations
$$
\begin{aligned}
x_{n+1} &= (\cos\alpha)\,x_n - (\sin\alpha)\,y_n, \\
y_{n+1} &= (\sin\alpha)\,x_n + (\cos\alpha)\,y_n.
\end{aligned}
\tag{37}
$$
Surely, if one plots such a sequence of points it will fall on the circle of radius r centered at the origin. Suppose that we now modify the process by including two quadratic $x_n^2$ terms, so that we have the nonlinear system
$$
\begin{aligned}
x_{n+1} &= (\cos\alpha)\,x_n - (\sin\alpha)\left(y_n - x_n^2\right), \\
y_{n+1} &= (\sin\alpha)\,x_n + (\cos\alpha)\left(y_n - x_n^2\right)
\end{aligned}
\tag{38}
$$
interesting by virtue of the nonlinearity. For a discussion of the main results, we
highly recommend the little book Mathematics and the Unexpected, by Ivar Ekeland (Chicago: University of Chicago Press, 1988).
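Both mappings are simple to iterate numerically. In the sketch below the angle $\alpha = 1$, the starting point (0.3, 0.1), and the function names are illustrative assumptions. The linear map (37) should keep every iterate on the circle through the starting point, whereas the quadratic map (38) does not; a few of its iterates are printed for comparison.

```python
import math


def rotate(x, y, alpha):
    """One step of the linear map (37): rotation through the angle alpha."""
    return (math.cos(alpha) * x - math.sin(alpha) * y,
            math.sin(alpha) * x + math.cos(alpha) * y)


def quadratic_map(x, y, alpha):
    """One step of the nonlinear map (38): y_n is replaced by y_n - x_n**2."""
    w = y - x * x
    return (math.cos(alpha) * x - math.sin(alpha) * w,
            math.sin(alpha) * x + math.cos(alpha) * w)


alpha = 1.0                 # illustrative angle
start = (0.3, 0.1)          # illustrative starting point

p = start
for _ in range(1000):
    p = rotate(p[0], p[1], alpha)
print("radius after 1000 rotations:", math.hypot(*p), " initial:", math.hypot(*start))

q = start
for n in range(5):
    q = quadratic_map(q[0], q[1], alpha)
    print(n + 1, q, math.hypot(*q))
```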
Closure. This section is primarily about the concept of stability in the numerical
solution of differential equations.A schemeis stableif theroundoff error remains
small compared to the exact solution. Normally, one establishes the stability or
instability of a method with respect to the simple test equation y’ = Ay. Assuming
that roundoff enters in the initial condition and that the computer is perfect thereafter, one can derive a difference
equation governing the roundoff
error e,, and
solve it analytically to see if e, remains small. Doing so, we show that the midpoint rule is only weakly stable: stable if A > 0 and unstable if A < 0. As a rule of
humb, we suggest that for a given differential equation y’ = f(x, y) we can expect
themidpoint rule to be stableif Of /Oy > 0 and unstableif Of /Oy < 0 over the
x,y region of interest.
To explain the source of the instability in the midpoint rule, we observe that
he exact solution (17) of themidpoint rule difference equationcorresponding to the
est equation y’ = Ay contains two terms, one thatcorresponds to the exact solution
of y! = Ay and the other extraneous. The latter enters because the midpoint rule
difference equation is of second order, whereas the differential equation is only of
first order, and it is that extraneous term that leads to the instability. Single-step
methods such as the Euler and Runge-Kutta methods, however, are strongly stable,
provided that / is sufficiently small.
Observe that the only multi-step method that we examine is the midpoint rule;
we neither show nor claim that all multi-step methods exhibit such instability. For
instance, it is left for the exercises to show that the multi-step fourth-order Adams-Moulton method is strongly stable (for sufficiently small h). Thus, the idea is that
the extraneous terms in the solution, that arise because the difference equation is of
a higher order than the differential equation, can, but need not, cause trouble.
We close the section with a brief study of difference equations, independent of
any connection with differential equations and stability, since they are important in
their own right in the modeling of discrete systems. We stress how analogous are
the theories governing differential and difference equations that are linear, homogeneous, and with constant coefficients.
Computer software. Just as many differential equations can be solved analytically
by computer-algebra systems, so can many difference equations. Using Maple, for
instance, the relevant command is rsolve. For instance, to solve the difference
equation $y_{n+2} - y_{n+1} - 6y_n = 0$ (from Example 4), enter
    rsolve(y(n + 2) - y(n + 1) - 6*y(n) = 0, y(n));
and return. The result is
    $\left(\tfrac{3}{5}y(0) - \tfrac{1}{5}y(1)\right)(-2)^n + \left(\tfrac{2}{5}y(0) + \tfrac{1}{5}y(1)\right)3^n,$
the correct solution for any initial values y(0) and y(1). Of course, we could re-express the latter as
    $C_1(-2)^n + C_2(3)^n$
have entered
y(m));
and would have obtained
19
5 (3)
n
Loos,
+ (3)"
as the desired particular solution.
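The general solution $C_1(-2)^n + C_2 3^n$ of $y_{n+2} - y_{n+1} - 6y_n = 0$ can be checked independently of Maple. The sketch below iterates the recurrence directly and compares it with the closed form, with the constants fixed by y(0) and y(1) through $C_1 = (3y_0 - y_1)/5$ and $C_2 = (2y_0 + y_1)/5$ (obtained by solving $C_1 + C_2 = y_0$, $-2C_1 + 3C_2 = y_1$). The function names and the sample initial values are assumptions of this example, not values from the text.

```python
from fractions import Fraction

def iterate(y0, y1, n_max):
    """Return [y_0, ..., y_n_max] from y[n+2] = y[n+1] + 6*y[n], using exact arithmetic."""
    ys = [Fraction(y0), Fraction(y1)]
    for n in range(n_max - 1):
        ys.append(ys[n + 1] + 6 * ys[n])
    return ys

def closed_form(y0, y1, n):
    """C1*(-2)**n + C2*3**n with C1, C2 chosen to match y(0) = y0, y(1) = y1."""
    c1 = (3 * Fraction(y0) - Fraction(y1)) / 5
    c2 = (2 * Fraction(y0) + Fraction(y1)) / 5
    return c1 * (-2) ** n + c2 * 3 ** n

if __name__ == "__main__":
    ys = iterate(4, -3, 10)  # hypothetical initial values
    print(all(ys[n] == closed_form(4, -3, n) for n in range(11)))
```

Exact rational arithmetic is used so that the comparison is an equality test rather than a tolerance check.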
EXERCISES
6.5
1. If the given initial-value problem were to be solved by the
fourth-order Runge-Kutta method (and we are not asking you
to do that), do you think accurate results could be obtained?
Explain. The x domain is 0 < x < ∞.
(ay =2y—82r+4; y(0)=0
(b)yo’=y—2e7*; y(0)=1
(c)y'=y+5e7**; y(0)= -1
(yy =14+3(y—2);
y(0)=0
2. It is natural to wonder how well we would fare trying to
solve (1) using computer software. Using the Maple dsolve
command with the abserr option, see if you can obtain accurate results at the points x = 1, 4, 8, 12 listed in Table 1.
3. One can see if a computed solution exhibits instability, as did the solution obtained by the midpoint rule and plotted in Fig. 2, when we have the exact solution to compare it with. In practice, of course, we don't have the exact solution to compare with; if we did, then we would not be solving numerically in the first place. Thus, when a computed solution exhibits an oscillatory behavior how do we know that it is incorrect; perhaps the exact solution has exactly that oscillatory behavior? One way to find out is to rerun the solution with h halved. If the oscillatory behavior is part of the exact solution, then the new results will oscillate every two steps rather than every step. Using this idea, run the case shown in Fig. 2 twice, for h = 0.05 and h = 0.025, and comment on whether the results indicate a true instability or not.
4. We derived the solution (12) of the difference equation (7) in the text. Verify, by direct substitution, that (12) does satisfy (7) for any choice of the arbitrary constants C_1 and C_2.
5. In (13) we showed that $(Ah + \sqrt{1 + A^2h^2})^n \sim e^{Ax}$ as h → 0, yet it would appear that $(Ah + \sqrt{1 + A^2h^2})^n \sim (\sqrt{1})^n = 1$. Explain the apparent contradiction.
6. The purpose of this exercise is to explore the validity of the rule of thumb that we gave regarding the solution of the equation y' = f(x, y) by the midpoint rule, namely, that the method should be stable if ∂f/∂y > 0 and unstable if ∂f/∂y < 0 over the region of interest. Specifically, in each case apply the rule of thumb and draw what conclusions you can about the stability of the midpoint rule solution of the given problem. Then, program and run the midpoint rule with h = 0.05, say, over the given x interval. Discuss the numerical results and whether the rule of thumb was correct. (Since the midpoint rule is not self-starting, use ten Euler steps from x = 0 to x = 0.05 to get the method started.)
(a) y' = e^{…}; y(0) = 1, 0 < x < 10
(c) y' = (4 - x)y; y(0) = 1, 0 < x < 5
(d) y' = (x - 1)y; y(0) = 2, 0 < x < 4
7. We stated in the text that the results in Table 2 are consistent with a critical h value of 0.002 because the calculations change from unstable to stable as h decreases from 0.01 to 0.001. Program and carry out the Euler calculation of the solution to the initial-value problem (18) using h = 0.0021 and 0.0019, out to around x = 1, and see if these h values continue to bracket the change from unstable to stable. (You may try to bracket h_cr even more tightly if you wish.)
8. (Stability of second-order Runge-Kutta methods) In Sec-
(by!=e
334
tion 6.3 we derived the general second-order Runge —Kutta
method, which includes these as special cases: the improved
Euler method,
∩∏
he,
Yn) +f
− {f(tn,
∶↕ ∫
Yn)|}
Yn F Af(tn,
[enti
we need to examine the 1” term in (9.3) more closely. Specifically, seeking p4 in the power series form pg = 1+aa+-:-,
put that form into (9.2). Equating coefficients of a on both
sides of that equation through first-order terms, show that
a = 24. Thus, in place of (9.3) we have the more informative
statement
;
(8.1)
Yn~ Ca(l + 24a)" = Cy(1+Ah)" ~ Cye4** (9.4)
and the modified Euler method,
Yn+1
= Un + Aflan
+
h
>
Un +
h.
gf
(tn,
Yn).
as n — oo. Show why the final step in (9.4) is true. Since
the right-hand side of (9.4) is identical to the exact solution
(8.2)
(a) For the test equation y’ = Ay, show that the improved
Euler method is strongly stable for sufficiently small # (e., as
h — 0). For the case where A < 0, show that that stability
achievedonly if h < Ae, = 2/|Al.
is
(b) For the test equation y’ = Ay, show that the modified
Euler method is strongly stable for sufficiently small A (.e., as
h — 0). For the case where A < 0, show that that stability
achievedonly if h < he, = 2/|Al.
(55 fn
a
59fn—1
+ 37 fn—2
~~Ofn—3)
:
(9.1)
where fi
= f(%n,Yn).
10. (Strong stability of the multi-step Adams-Moulton method)
This exercise is to show that the
“AB” method is strongly stable for sufficiently small A (i.e., as
h + 0) even though it is a multi-step method.
(a) Consider the test equation y’ = Ay, where the constant
A can be positive or negative;that is, let f(z,y)
= Ay be
a solution of the fourth-order difference equation (9.1) in the
form yn = p”, show that p must satisfy the fourth-degree
characteristic equation
p' —(1+ 55a)p?+ 59ap*—387ap+9a=0,
from Section
Recall,
(9.2)
Adams— Moulton
6.3, the fourth-order
method,
h
∏
h
Un + 94
or negative, we conclude that the AB method is strongly stable
for sufficiently small h.
is
9. (Strong stability of the multi-step Adams—Bashforthmethod)
Recall, from Section 6.3, the fourth-order Adams—Bashforth
method
Un+1.=
y(z) of the given differential equation,whetherA is positive
∙
+ 19 fn
(9fn+i
∩∏ + 54
∶
_
+ fn—2)
5fn-1
along the same lines as outlined
Proceeding
. (10.1)
in Exercise
9,
show that the AM method is stongly stable for sufficiently
small h.
11. Derive the general solution of the given difference equation. If initial conditions are specified, then find the corresponding particular solution. In each casen = 0,1,2,....
(Q)Yn+1—4Yn= 9; Yo= 5
w=s
youl,
=9;
(b) nee
-Yn
(C) Ynt2
+ Un+r — 6Yn = 9;
(e) Yn+2
7
Yo=R 9, y=
3Un+1
+ 2Yn
(f) Yn+3 — Yn+2 —4Yn
1
= Q;
Yo = 3,
+4 yn = 0;
2
y= 1
(d)Yn-e —4Yn41+ 3Yn = 90; yous,
n=
Yo = 3,
7
n=
5,
Y=
9
(g)
Yn+4
7
5Yn+2
+ 4un
(h)
Yn+4
_
6Yn+2
“b
8YUn
=
0
=
6
12. (a)~(h) Use computer software to obtain the general so-
lution of the corresponding problem in Exercise 11, and the
wherea = Ah/24.
(b) Notice that as 2 tends to zero so does a, and (9.2) reduces
particular solution as well, if initial conditions
are given.
to p* —p®= 0, with theroots= 0,0,0,1. Thus, if we denote 13. (Repeated roots) Recall from the theory of linear homothe roots of (9.2) as py). 7.) 4, then we sée that the first three
of these tend to zero and the last to unity as h —- 0, and the
the characteristic
general solution for y, behaves as
Yn = Cipt + Copy + Cap + Capt ~ Cyl”
geneous differential equations with constant coefficients that
if, when we seek y(x) = e**, \ is a root of multiplicity k of
equation,
then it gives rise to the solutions
+ Cya*!)
y(a) = (Cy + Cow+ +++
(9.3)
as h -+ 0. Since p” tends to zero, unity, or infinity, depending
upon whether p is smaller than, equal to, or greater than unity,
e%, whereCi,...,Ck
are arbitrary constants. An analogous result holds for difference equations. Specifically, verify that the characteristic
equationof Yni2 — 2byn41,+ b’yn = O has the root 6 with
multiplicity 2, and that y, = (Cy + Con)b”.
Chapter 6 Review — 335
14. Show that if yl)
and yo) are solutions of the second-
ay(M)Yn+1 + ag(n)y,
= 0, and Y;, is any particular
order linear homogeneousdifferenceequationaq(n)Yn+2+
solution
of the nonhomogeneousequation ao(n)Yn4e + @i(M)Yn4i +
2
1
∏∶∶ Crys∏ ) ls Coy ∟ + Y;, is∶ a solution
of the nonhomogeneous equation.
15. (Nonhomogeneous difference equations) For the given
mogeneous difference equation. (First, read Exercise
=7
(2)Yngi~38Yn
(b) Yaa — 2Yn = 3sinn
(C)Yn+1—Yn = 2+ cosn
(QD Ynte
~ 5Ynpa
(f) Yn+-2 7 An
solution. Finally, give the general solution of the given nonho-
(D) Ynta
16.
= 6n*
~ TYn+2
(a)—(h)
+ 6Yn = 2n? — 6n —1
2Yn =n?
(©)Yn4-2 -F Unb
7 Yn =e"
equation,first find the general solution of the homogeneous (2) Yn+2
equation.Then adapt the methodof undeterminedcoefficients
-1
+ 124,
+6
software
to obtain
the general
solution of the corresponding problem in Exercise 15.
. in the case of
differential equation solver.
=n
Use computer
Chapter 6 Review
Decomposing the error as
14.)
One's interest in higher-order methods is not just a matter of accuracy because,
in principle, one could rely exclusively on the simple and easily programmed Euler
method, and make h small enough to achieve any desired accuracy. There are two
problems with that idea. First, as h is decreased the number of steps increases and
one can expect the numerical roundoff error to grow, so that it may not be possible
to achieve the desired accuracy. Second, there is the question of economy. For
instance, while the fourth-order Runge-Kutta method is about four
times as slow as the Euler method (because it requires four function evaluations per
step compared to one for the Euler method), the gain in accuracy that it affords is
so great that we can use a step size much more than four times that needed by the
Euler method for the same accuracy, thereby resulting in greater economy.
Naturally, higher-order methods are more complex and hence more tedious to
program. Thus, we strongly urge (in Section 6.3.4) the empirical estimation of the
order, if only as a check on the programming and implementation of the method.
In Section 6.4 we showed that the methods developed for the single equation
y’ = f(x,y) can be used to solve systems of equations and higher-order equations
as well. There we also study boundary-value problems, and find them to be significantly more difficult than initial-value problems. However, we show how to use
the principle of superposition to convert a boundary-value problem to one or more
problems of initial-value type, provided that the problem is linear.
Finally, in Section 6.5 we look at “what can go wrong,” mostly insofar as
numerical instability due to the growth of roundoff error, and an analytical approach
is put forward for predicting whether a given method is stable. Actually, stability
depends not only on the solution algorithm but also on the differential equation, and
our analyses are for the simple test equation y' = Ay rather than for the general
case y' = f(x, y). We find that whereas the differential equation y' = Ay is of
first order, the difference equation expressed by the algorithm is of higher order if
the method is of multi-step type. Thus, it has among its solutions the exact solution
(as h → 0) and one or more extraneous solutions as well. It is those extraneous
solutions that can cause instability. For instance, the midpoint rule is found to be
stable if A > 0 and unstable if A < 0; we classify it as weakly stable because its
stability depends upon the sign of A. However, the fourth-order Adams-Bashforth
and Adams-Moulton methods are stable, even though they are multi-step methods,
because the extraneous solutions do not grow. Single-step methods such as Euler
and those of Runge-Kutta type do not give rise to extraneous solutions and are
stable.
Finally, we stress that even if a method is stable as h → 0, h needs to be
reduced below some critical value for that stability to be manifested.
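The role of a critical step size can be seen in the simplest case. For the Euler method applied to y' = Ay with A < 0, each step multiplies y_n by (1 + Ah), so the computed solution decays only if |1 + Ah| < 1, that is, if h < 2/|A|. The sketch below is a minimal illustration; the value A = -1000 is a hypothetical choice (giving a critical step of 0.002) and is not claimed to be the initial-value problem (18) of the text.

```python
def euler_test_equation(A, h, n_steps, y0=1.0):
    """Euler's method y[n+1] = y[n] + h*A*y[n] applied to y' = A*y."""
    y = y0
    for _ in range(n_steps):
        y += h * A * y
    return y

def critical_step(A):
    """Step size below which |1 + A*h| < 1 when A < 0."""
    return 2.0 / abs(A)

if __name__ == "__main__":
    A = -1000.0  # hypothetical test value
    print(critical_step(A))
    for h in (0.0021, 0.0019):
        print(h, euler_test_equation(A, h, 500))
```

Step sizes just above and just below the critical value give computed solutions that respectively blow up and decay, even though the exact solution decays in both cases.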
Chapter 7
Qualitative Methods:
Phase Plane and Nonlinear
Differential Equations
7.1
Introduction
This is the final chapter on ordinary differential equations, although we do return
to the subject in Chapter 11, where we reconsider systems of linear differential
equations using matrix methods.
Interest in nonlinear differential equations is virtually as old as the subject of
differential equations itself, which dates back to Newton, but little progress was
made until the late 1880's, when the great mathematician and astronomer Henri
Poincaré (1854-1912) took up a systematic study of the subject in connection
with celestial mechanics. Realizing that nonlinear equations are rarely solvable
analytically, and not yet having the benefit of computers to generate solutions numerically, he sidestepped the search for solutions altogether and instead sought to
answer fundamental questions about the qualitative and topological nature of solutions of nonlinear differential equations without actually finding them.
The entire chapter reflects either his methods, such as the use of the so-called
“phase plane” and focusing attention upon the “singular points” of the equation,
or the spirit of his approach. In addition, however, we can now rely heavily upon
computer simulation. Thus, our approach in this chapter is a blend of a qualitative,
topological, and geometric approach, with quantitative results obtained readily with
computer software.
Though Poincaré's work was motivated primarily by problems of celestial mechanics, the subject began to attract broader attention during and following World
War II, especially in connection with nonlinear control theory. In the postwar years,
interest was stimulated further by the publication in English of N. Minorsky's Nonlinear Mechanics (Ann Arbor, MI: J. W. Edwards) in 1947. With that and other
books, such as A. Andronov and C. Chaikin's Theory of Oscillations (Princeton:
Princeton University Press, 1949) and J. J. Stoker's Nonlinear Vibrations (New
York: Interscience, 1950) available as texts, the subject appeared in university
curricula by the end of the 1950's. With that base, and the availability of digital
computers by then, the subject of nonlinear dynamics, generally known now as dynamical
systems, has blossomed into one of the most active research areas, with
applications well beyond celestial mechanics and engineering, to biological systems, the social sciences, economics, and chemistry. The shift from the orderly
determinism of Newton to the often nondeterministic chaotic world of Mitchell
Feigenbaum, E. N. Lorenz, Benoit Mandelbrot, and Stephen Smale has been profound. For a wonderful historical discussion of these changes we suggest the little
book Mathematics and the Unexpected by Ivar Ekeland (Chicago: University of
Chicago Press, 1988).
7.2
The Phase Plane
To introduce the phase plane, consider the system
    $m x'' + kx = 0$    (1)
governing the free oscillation of the simple harmonic mechanical oscillator shown
in Fig. 1. Of course we can readily solve (1) and obtain the general solution
$x(t) = C_1 \cos \omega t + C_2 \sin \omega t$, where $\omega = \sqrt{k/m}$ is the natural frequency, or,
equivalently,
    $x(t) = A \sin(\omega t + \phi),$    (2)
where A and φ are the amplitude and phase angle, respectively. To present this
result graphically, one can plot x versus t and obtain any number of sine waves of
different amplitude and phase, but let us proceed differently.
Figure 1. Simple harmonic mechanical oscillator.
We begin by re-expressing (1), equivalently, as the system of first-order equations
    $\dfrac{dx}{dt} = y,$    (3a)
    $\dfrac{dy}{dt} = -\dfrac{k}{m}\,x,$    (3b)
as is discussed in Section 3.9. The auxiliary variable y, defined by (3a), happens to
have an important physical significance (it is the velocity), but having such significance is not necessary. Next, we deviate from the ideas presented in Section 3.9
and divide (3b) by (3a), obtaining
    $\dfrac{dy}{dx} = -\dfrac{k}{m}\,\dfrac{x}{y}$  or  $my\,dy + kx\,dx = 0,$    (4)
integration of which gives
    $\dfrac{1}{2}my^2 + \dfrac{1}{2}kx^2 = C.$    (5)
Since y = dx/dt, (5) is a first-order differential equation. We could solve for y
(i.e., dx/dt), separate variables, integrate again, and eventually arrive at (2) once
again. Instead, let us take (5) as our end result and plot the one-parameter family
of ellipses that it defines (Fig. 2), the parameter being the integration constant C.
In this example C happens to be the total energy (kinetic energy of the mass plus
potential energy of the spring); C = 0 gives the “point ellipse” x = y = 0 and the
greater the value of C, the larger the ellipse.
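A quick numerical check of (5) can make the picture concrete. The sketch below evaluates x(t) from (2) and y = dx/dt at several times and confirms that the left side of (5) takes the same value at each of them, so that the representative point stays on one ellipse. The numerical values of k, m, A, and the phase angle are hypothetical choices for this illustration.

```python
import math

def state(t, A, phi, k, m):
    """x(t) = A*sin(w*t + phi) from (2), together with y = dx/dt."""
    w = math.sqrt(k / m)
    x = A * math.sin(w * t + phi)
    y = A * w * math.cos(w * t + phi)
    return x, y

def energy(x, y, k, m):
    """Left side of (5): (1/2)*m*y**2 + (1/2)*k*x**2."""
    return 0.5 * m * y * y + 0.5 * k * x * x

if __name__ == "__main__":
    k, m, A, phi = 4.0, 1.0, 2.0, 0.3  # hypothetical values
    for t in (0.0, 0.7, 1.9, 5.0):
        x, y = state(t, A, phi, k, m)
        print(t, x, y, energy(x, y, k, m))
```

With these values every printed energy equals (1/2)kA^2 up to rounding, which is the constant C for that trajectory.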
It is customary to speak of the x, y plane as the phase plane. Each integral
curve represents a possible motion of the mass, and each point on a given curve
represents an instantaneous state of the mass (the horizontal coordinate being the
displacement and the vertical coordinate being the velocity). Observe that the time
t enters only as a parameter, through the parametric representation x = x(t), y =
y(t). So we can visualize the representative point x(t), y(t) as moving along a
given curve as suggested by the arrows in Fig. 2. The direction of the arrows is
implied by the fact that y = dx/dt, so that y > 0 implies that x(t) is increasing
and y < 0 implies that x(t) is decreasing. One generally calls the integral curves
phase trajectories, or simply trajectories, to suggest the idea of movement of the
representative point. A display of a number of such trajectories in the phase plane
is called a phase portrait of the original differential equation, in this case (1). Of
course, there is a trajectory through each point of the phase plane, so if we showed
all possible trajectories we would simply have a black picture; the idea is to plot
enough trajectories to establish the key features of the phase portrait.
What are the advantages of presenting results in the form of a phase portrait,
rather than as traditional plots of x(t) versus t? One advantage of the phase portrait
is that it requires only a “first integral” of the original second-order equation such
as equation (5) in this example, and sometimes we can obtain the first integral even
when the original differential equation is nonlinear. For instance, let us complicate
(1) by supposing that the spring force is not given by the linear function $F_s = kx$,
but by the nonlinear function $F_s = ax + bx^3$, and suppose that a > 0 and b > 0 so
that the spring is a “hard” spring: it grows stiffer as x increases (Fig. 3), as does a
typical rubber band. If we take a = b = m, say, for definiteness and for simplicity,
then in place of (1) we have the nonlinear equation
    $x'' + x + x^3 = 0.$    (6)
Proceeding as before, we re-express (6) as the system
    $x' = y,$    (7a)
    $y' = -x - x^3.$    (7b)
Division gives
    $\dfrac{dy}{dx} = -\dfrac{x + x^3}{y}$  or  $y\,dy + (x + x^3)\,dx = 0,$    (8)
which yields the first integral
    $\dfrac{1}{2}y^2 + \dfrac{1}{2}x^2 + \dfrac{1}{4}x^4 = C.$    (9)
Figure 2. Phase portrait of (1).
Figure 4. Phase portrait of (6) (hard spring).
In principle, if we plot these curves for various values of C we can obtain the phase
portrait shown in Fig. 4. More conveniently, we generated the figure by using the
Maple phaseportrait command discussed at the end of this section. A comparable
phase portrait plotting capability is provided in numerous other computer software
systems.
To repeat, one advantage of the phase portrait presentation is that it requires
only a first integral. In the present case (6) was nonlinear due to the x^3 term, yet its
first integral (9) was readily obtained.
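One way to gain confidence in the first integral (9) is to integrate the system x' = y, y' = -x - x^3 numerically and monitor the left side of (9) along the computed trajectory. The following is a minimal sketch using a hand-written classical fourth-order Runge-Kutta step; the step size, number of steps, and initial point are assumptions of this example.

```python
def rhs(x, y):
    """Right-hand sides of x' = y, y' = -x - x**3 (hard spring with a = b = m)."""
    return y, -x - x ** 3

def rk4_step(x, y, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1x, k1y = rhs(x, y)
    k2x, k2y = rhs(x + 0.5 * h * k1x, y + 0.5 * h * k1y)
    k3x, k3y = rhs(x + 0.5 * h * k2x, y + 0.5 * h * k2y)
    k4x, k4y = rhs(x + h * k3x, y + h * k3y)
    return (x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0,
            y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0)

def first_integral(x, y):
    """Left side of (9): y**2/2 + x**2/2 + x**4/4."""
    return 0.5 * y * y + 0.5 * x * x + 0.25 * x ** 4

def max_drift(x0, y0, h=0.01, n_steps=2000):
    """Largest deviation of the first integral from its initial value."""
    c0 = first_integral(x0, y0)
    x, y, worst = x0, y0, 0.0
    for _ in range(n_steps):
        x, y = rk4_step(x, y, h)
        worst = max(worst, abs(first_integral(x, y) - c0))
    return worst

if __name__ == "__main__":
    print(max_drift(1.0, 0.0))
```

The drift reported is due only to the discretization error of the integrator; the exact trajectory stays on a single level curve of (9).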
A second attractive feature of the phase portrait is its compactness. For instance, observe that the single phase trajectory Γ in Fig. 2 corresponds to an entire
family of oscillations of amplitude A, several of which are shown in Fig. 5, since
any point on Γ can be designated as the initial point (t = 0): if the initial point
on Γ is (A, 0), then we get the curve #1 in Fig. 5; if the initial point on Γ is a bit
counterclockwise of (A, 0) then we get the curve #2; and so on. Passing from the
x, t plane to the x, y plane, the infinite family of curves shown in Fig. 5 collapses
onto the single trajectory Γ in Fig. 2. Put differently, whereas the solution (2) of
equation (1) is a two-parameter family of curves in x, t space (the parameters being A and φ), (5) is only a one-parameter family of curves in the x, y plane (the
parameter being C). That compactness can be traced to the division of (3b) by (3a)
or (7b) by (7a), for that step essentially eliminates the time t.
To learn about nonlinear systems, it is useful to contrast the phase portraits of
the linear oscillator governed by (1) and the nonlinear oscillator governed by (6),
and given in Figs. 2 and 4, respectively. The phase portrait in Fig. 2 is extremely
simple in the sense that all the trajectories are geometrically similar, differing only
in scale. That is, if a trajectory is given by x = X(t), y = Y(t), then x = κX(t),
y = κY(t) is also a trajectory for every possible scale factor κ, be it positive,
negative, or zero. That result holds not only for the system (3) but for any constant-coefficient linear homogeneous system
    $x' = ax + by,$
    $y' = cx + dy.$    (10)
Figure 5. Solutions x(t) corresponding to the trajectory Γ.
In contrast, consider the phase portrait of the nonlinear equation $x'' + x + x^3 = 0$
shown in Fig. 4. In that case the trajectories are not mere scalings of each
other; there is distortion of shape from one to another, and that distortion is due
entirely to the nonlinearity of the differential equation. The innermost trajectories
approach ellipses [because as smaller and smaller motions are considered the x^4 term
becomes more and more negligible compared to the other terms in (9)], and the
outer ones become more and more distorted as the effect of the x^4 term grows in
(9).
Thus, whereas the phase portrait of the linear equation (1) amounts to a single
kind of trajectory, repeated endlessly through scalings, that of the nonlinear equation (6) is made up of an infinity of different kinds of trajectories. That richness is
a hallmark of nonlinear equations, as we shall see in the next example and in the
sections that follow.
Before turning to the next example, let us complement the phase portrait in
Fig. 4 with representative plots of x(t) versus t. We choose the two sets of initial
conditions: x(0) = 0.5, x'(0) = 0 and x(0) = 1, x'(0) = 0. The results are
shown in Fig. 6, together with the corresponding solutions of the linear equation
x'' + x = 0 (shown as dotted) for reference. Besides the expected distortion we
also observe that the frequency of the oscillation is amplitude dependent for the
nonlinear case: the frequency increases as the amplitude increases. In contrast, for
the linear equation (1) the frequency $\omega = \sqrt{k/m} = 1$ is a constant, independent
of the amplitude.
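The amplitude dependence of the frequency can be quantified from the first integral (9). For a closed trajectory that crosses y = 0 at x = x_max, (9) gives $y^2 = (x_{max}^2 - x^2)\,[1 + (x_{max}^2 + x^2)/2]$, and the substitution x = x_max sin θ in the quarter-period integral of dx/y leads to $T = 4\int_0^{\pi/2} d\theta / \sqrt{1 + x_{max}^2(1 + \sin^2\theta)/2}$. The sketch below evaluates that integral by a simple midpoint quadrature; the function name and the number of quadrature points are assumptions of this example.

```python
import math

def hard_spring_period(x_max, n=2000):
    """Period of x'' + x + x**3 = 0 for the closed trajectory through (x_max, 0)."""
    d_theta = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta  # midpoint of the i-th subinterval
        total += 1.0 / math.sqrt(1.0 + 0.5 * x_max ** 2 * (1.0 + math.sin(theta) ** 2))
    return 4.0 * total * d_theta

if __name__ == "__main__":
    for amp in (1e-6, 0.5, 1.0):
        T = hard_spring_period(amp)
        print(amp, T, 2.0 * math.pi / T)  # amplitude, period, frequency
```

As the amplitude tends to zero the period tends to 2π, the value for the linear equation x'' + x = 0, and it shortens as the amplitude grows, in line with the frequency increase noted above.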
Above, we mentioned the richness of the sets of solutions to nonlinear differential equations. A much more striking example of that richness is obtained if we
reconsider the nonlinear oscillator, this time with a “soft” spring, that is, with
$F_s = ax - bx^3$ (a > 0 and b > 0) as sketched in Fig. 7. Again setting a = b = m
we have, in place of (6),
    $x'' + x - x^3 = 0.$    (11)
In place of (9) we have
    $\dfrac{1}{2}y^2 + \dfrac{1}{2}x^2 - \dfrac{1}{4}x^4 = C,$    (12)
and in place of the phase portrait shown in Fig. 4 we obtain the strikingly different one shown in Fig. 8. We continue to study this example in Section 7.3, but
even now we can make numerous interesting observations. First, whereas all of
the motions revealed in the phase plane for the hard spring (Fig. 4) are qualitatively similar oscillatory motions (give or take some distortion from one to another),
we see in Fig. 8 a number of qualitatively different types of motion, and these are
Figure 8. Phase portrait of (11) (soft spring).
Figure 6. Effects of nonlinearity on x(t).
separatrix.
ds ∙
−∙
↕
dt
≥
+4
12
7
(13)
ple. Solving
(14a)
(14b)
(—1,0),
(0,0),
(1,0),
(15)
+1 are
because
(16)
function x(¢).
GBDEI
are given by
yor
(17)
respectively. Beginning at D, say, the representative point P moves rightward on
DE and approaches the equilibrium point E. Does it reach E in finite time and
then remain there, or does it approach E asymptotically as t → ∞? To answer that
question we use (16). Let the tangent line to the curve DEI, at E, be y = m(x - 1).
[We could determine the slope m by differentiating (17), but the value of m will not
be important.] Then, since $ds = \sqrt{1 + (dy/dx)^2}\,dx \sim \sqrt{1 + m^2}\,dx$ as P → E,
we can replace s' in (16) by $\sqrt{1 + m^2}\,dx/dt$, and y by m(x - 1), so (16) becomes
    $\sqrt{1 + m^2}\,\dfrac{dx}{dt} \sim \sqrt{m^2(x - 1)^2 + x^2(x + 1)^2(x - 1)^2} \sim -\sqrt{m^2 + 4}\,(x - 1),$    (18)
where the negative square root has been chosen since dx/dt > 0 as P → E on
DE, whereas x - 1 < 0 on DE, and where the last step in (18) is left for the
exercises. The upshot is that
    $\dfrac{dx}{dt} \sim \gamma(1 - x)$    (19)
as P → E, for some finite positive constant γ. Thus,
    $\dfrac{dx}{1 - x} \sim \gamma\,dt,$    (20)
so
    $-\ln(1 - x) \sim \gamma t + \text{constant},$    (21)
and we can now see that t → ∞ as P → E (i.e., as x → 1). Thus, P does not
reach E in finite time but only asymptotically as t → ∞. Similarly, if we begin at
point H and go backward in time, then we reach E only as t → -∞.
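The asymptotic approach to E can also be checked against a closed-form expression. With C = 1/4 in (12) the upper branch between x = -1 and x = 1 is $y = (1 - x^2)/\sqrt{2}$, and one can verify by substitution that $x(t) = \tanh(t/\sqrt{2})$ satisfies x' = y on that branch, with t = 0 placed at the point where x = 0. This closed form is not taken from the text; it is offered here only as a hedged cross-check. It gives $-\ln(1 - x) \sim \sqrt{2}\,t + \text{constant}$, the form of (21) with γ = √2.

```python
import math

SQRT2 = math.sqrt(2.0)

def x_on_branch(t):
    """x(t) = tanh(t/sqrt(2)) on the separatrix branch from (-1, 0) to (1, 0)."""
    return math.tanh(t / SQRT2)

def y_on_branch(t):
    """y = dx/dt = (1 - x**2)/sqrt(2) on the same branch."""
    x = x_on_branch(t)
    return (1.0 - x * x) / SQRT2

def one_minus_x(t):
    """1 - tanh(u) written as 2/(exp(2u) + 1) to avoid cancellation for large t."""
    return 2.0 / (math.exp(SQRT2 * t) + 1.0)

def gamma_estimate(t1, t2):
    """Slope of -ln(1 - x) versus t between t1 and t2; compare with (21)."""
    return (math.log(one_minus_x(t1)) - math.log(one_minus_x(t2))) / (t2 - t1)

if __name__ == "__main__":
    print(gamma_estimate(10.0, 20.0), SQRT2)
    print(one_minus_x(50.0) > 0.0)  # x is still short of 1 at t = 50
```

However large a finite t is chosen, 1 - x(t) remains positive, which is the statement that E is reached only as t → ∞.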
Let us return now to the region inside of the football and consider any closed
orbit Γ. As the size of Γ shrinks to zero, Γ tends to the elliptical (actually circular
because of our choice a = m) shape y^2 + x^2 = constant, and the period of the motion tends to 2π [since the solution of the linearized problem is x(t) = A sin(t + φ)]. At the other extreme, as Γ gets larger it approaches the pointed shape BDEHB.
Bearing in mind that it takes infinite time to reach E along an approach from D, it
seems evident that the period of the Γ motion must tend to infinity as Γ approaches
BDEHB; we will ask you to explore this point in the exercises. From a physical
point of view, the idea is that not only is the “flow” zero at E (where x' = y' = 0),
it is very slow in the neighborhood of E. If Γ is any closed trajectory that is just
barely inside of BDEHB, then part of Γ falls within that stagnant neighborhood
of E (similarly at B). The representative point P moves very slowly there, hence
the period is very large. We reiterate that although each closed loop inside the
football corresponds to a periodic motion, the closed loop BDEHB does not. In
fact, although BDE and EHB meet at B and E they are distinct trajectories. On
BDE, t varies from -∞ at B to +∞ at E; likewise, on EHB t varies from -∞
at E to +∞ at B. Thus, if we begin on BDE we can never get to EHB, and vice
versa.
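The growth of the period as a closed orbit approaches the separatrix can be explored numerically. For a closed trajectory of (11) that crosses y = 0 at x = x_max with 0 < x_max < 1, the first integral (12) gives $y^2 = (x_{max}^2 - x^2)\,[1 - (x_{max}^2 + x^2)/2]$, and the substitution x = x_max sin θ gives $T = 4\int_0^{\pi/2} d\theta / \sqrt{1 - x_{max}^2(1 + \sin^2\theta)/2}$. The sketch below evaluates this with a midpoint quadrature; the function name, the sample amplitudes, and the number of quadrature points are assumptions of this example.

```python
import math

def soft_spring_period(x_max, n=20000):
    """Period of x'' + x - x**3 = 0 for the closed orbit through (x_max, 0), 0 < x_max < 1."""
    if not 0.0 < x_max < 1.0:
        raise ValueError("closed orbits require 0 < x_max < 1")
    d_theta = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta  # midpoint of the i-th subinterval
        total += 1.0 / math.sqrt(1.0 - 0.5 * x_max ** 2 * (1.0 + math.sin(theta) ** 2))
    return 4.0 * total * d_theta

if __name__ == "__main__":
    for amp in (1e-4, 0.5, 0.9, 0.99, 0.999):
        print(amp, soft_spring_period(amp))
```

The computed period starts near 2π for small orbits and increases without apparent bound as x_max approaches 1, consistent with the argument above.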
Finally, it should be evident that every trajectory that is not within the football
corresponds to a nonperiodic motion. Thus, BDE and EHB form a transition
from nonperiodic to periodic motions.
Thus far we have studied the three differential equations (1), (6), and (11).
In each case we have changed the single second-order differential equation to a
system of two first-order equations by setting x' = y and then studied them in the
x, y phase plane. More generally, we consider in this chapter systems of the form
    $x' = P(x, y),$    (22a)
    $y' = Q(x, y).$    (22b)
That is, P(x, y) need not equal y, and the system need not be a restatement of a
single second-order equation. Rather, it might arise directly in the form (22).
For instance, suppose that two species of fish coexist in a lake, say bluegills and
bass. The bluegills, with population x, feed on vegetation which is available in unlimited quantity, and the bass, with population y, feed exclusively on the bluegills.
If the two species were separated, their populations could be assumed to be governed approximately by the rate equations
    $x' = \alpha x, \qquad y' = -\beta y,$    (23)
where the populations x(t) and y(t) are considered to be large enough so that they
can be approximated as continuous rather than discrete (integer valued) variables,
and α, β are (presumably positive) constants that reflect net birth/death rates. The
species are not separated, however, so we expect the effective α to decrease as y
increases and the effective β to decrease as x increases. An approximate revised
model might then be expressed as
    $x' = (\alpha - \gamma y)\,x,$    (24a)
    $y' = -(\beta - \delta x)\,y,$    (24b)
which system is indeed of the form (22). This ecological problem is well known as
Volterra's problem, and we shall return to it later.
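For the system (24), dividing the second equation by the first and separating variables gives $(\alpha/y - \gamma)\,dy = -(\beta/x - \delta)\,dx$, so that $\delta x - \beta \ln x + \gamma y - \alpha \ln y$ is constant along each trajectory. The sketch below integrates (24) with a hand-written Runge-Kutta step and checks that this quantity stays constant to within the integration error. The parameter values and initial populations are hypothetical and are not taken from the text.

```python
import math

# Hypothetical parameter values, chosen only for illustration.
ALPHA, BETA, GAMMA, DELTA = 1.0, 0.5, 0.1, 0.02

def rhs(x, y):
    """Right-hand sides of (24a,b)."""
    return (ALPHA - GAMMA * y) * x, -(BETA - DELTA * x) * y

def rk4_step(x, y, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1x, k1y = rhs(x, y)
    k2x, k2y = rhs(x + 0.5 * h * k1x, y + 0.5 * h * k1y)
    k3x, k3y = rhs(x + 0.5 * h * k2x, y + 0.5 * h * k2y)
    k4x, k4y = rhs(x + h * k3x, y + h * k3y)
    return (x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0,
            y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0)

def invariant(x, y):
    """delta*x - beta*ln(x) + gamma*y - alpha*ln(y), constant on each trajectory."""
    return DELTA * x - BETA * math.log(x) + GAMMA * y - ALPHA * math.log(y)

def simulate(x0, y0, h=0.01, n_steps=2000):
    """Return the list of (x, y) points visited, starting from (x0, y0)."""
    pts = [(x0, y0)]
    for _ in range(n_steps):
        pts.append(rk4_step(pts[-1][0], pts[-1][1], h))
    return pts

if __name__ == "__main__":
    pts = simulate(40.0, 9.0)
    c0 = invariant(40.0, 9.0)
    print(max(abs(invariant(x, y) - c0) for x, y in pts))
```

Because the invariant is conserved, the computed trajectory in the x, y plane is a closed curve around the equilibrium point x = β/δ, y = α/γ.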
The system (22) is said to be autonomous because there is no explicit dependence on the independent variable (the time t here, but which could have some other
physical or nonphysical significance). Surely not all systems are autonomous, but
that class covers a great many cases of important interest, and that is the class that
is considered in phase plane analysis and in this chapter. Because (22a,b) are autonomous, any explicit reference to t (namely, the t derivatives) can be suppressed
by dividing one equation by the other and obtaining
    $\dfrac{dy}{dx} = \dfrac{Q(x, y)}{P(x, y)},$    (25)
where we now change our point of view and regard y as a function of x in the
x, y phase plane, rather than x and y as functions of t. If the system (22) were not
autonomous, that is, if it were of the form x' = P(x, y, t) and y' = Q(x, y, t), one
could still make it autonomous by re-expressing it, equivalently, as
    $x' = P(x, y, z),$
    $y' = Q(x, y, z),$
    $z' = 1,$
but then the phase space is three-dimensional rather than two-dimensional, and the analysis
is more complicated. In this chapter we continue to consider the autonomous case
(22) and the two-dimensional x, y phase plane.
Closure. As explained immediately above, our program in this section is to show
the advantages of recasting an autonomous system (22) (which could, but need not,
arise from a single second-order equation by letting x' be an auxiliary dependent
variable y) in the form (25) and then study the solutions of that equation in the two-dimensional x, y phase plane. One advantage is that (25) can sometimes be solved
analytically even if (22a,b) are nonlinear. Indeed, our primary interest in Chapter
7 is in the nonlinear case. We find that the phase portrait provides a remarkable
overview of the system dynamics, and the hard- and soft-spring oscillator examples begin to reveal some of the phenomenological richness of nonlinear systems.
We do not suggest the use of the phase plane as a substitute for obtaining and plotting solutions of (22) in the more usual way, x versus t and y versus t. Rather,
we suggest that to understand a complex nonlinear system one needs to combine
several approaches, and for autonomous systems the phase plane is one of the most
valuable. Finally, we ask you to observe how the phase plane discussion is more
qualitative and topological than lines of approach developed in the preceding chapters. For instance, regarding Fig. 8 we distinguish the qualitatively different types
of motion such as the periodic orbits within BDEHB, the transitional motions on
the separatrix itself, and the nonperiodic motions as well.
We distinguish between the physical velocity x'(t) of the mass, in the preceding examples, and the phase velocity s'(t), which is the velocity of the representative point x(t), y(t) in the phase plane. It is useful, conceptually, to think of the
x'(t), y'(t) velocity field as the velocity field of a “flow” such as a fluid flow in the
phase plane.
Finally, we mention that in the hard- and soft-spring oscillators, (6) and (11),
we meet special cases of the extremely important Duffing equation, to which we
return in a later section.
Computer software. Here is how we generate the phase portrait shown in Fig. 8
using the Maple phaseportrait command. First, enter
    with(DEtools):
and return, to gain access to the phaseportrait command. Note the colon, whereas
Maple commands are followed by semicolons. Next, enter
    phaseportrait([y, -x + x^3], [t, x, y], t = -20..20, {[0,0,0.1], [0,0,0.3],
    [0,0,0.6], [0,0,0.9], [0,0,-0.9], [0,0,0.70710781], [0,0,-0.70710781],
    [0,0,1.25], [0,0,-1.25], [0,1.5,0.8838834761], [0,1.5,-0.8838834761],
    [0,1.4,0], [0,1.8,0], [0,-1.5,0.8838834761], [0,-1.5,-0.8838834761],
    [0,-1.4,0], [0,-1.8,0]}, stepsize = 0.05, y = -1.8..1.8, x = -2..2,
    scene = [x, y]);
and return. In [y, -x + x^3] the items are the right-hand sides of the first and
second differential equations, respectively; [t,x,y] are the independent variable
and dependent variables; t = -20..20 is the range of integration of the differential
equations; within { } are the initial t,x,y points chosen in order to generate the
trajectories shown in Fig. 8. After those points the remaining items are optional:
stepsize = 0.05 sets the stepsize h in the Runge-Kutta-Fehlberg integration because
the default value would be (final t - initial t)/20 = (20 + 20)/20 = 2, which
would give too coarse a plot (as found by experience); y = -1.8..1.8, x = -2..2
gives a limit to the x, y region, with the t integrations terminated once a trajectory
leaves that region; scene = [x,y] specifies the plot to be a two-dimensional plot
in the x, y plane, the default being to give a three-dimensional plot in t,x,y space.
There are additional options that we have not used, one especially useful option for
phase plane work being the arrow option, which gives a lineal element grid. The
elements can be barbed or not. To include thin barbed arrows, type a comma and
then arrows = THIN after the last option. Thus, we would have ...scene = [x,y],
arrows = THIN);. In place of THIN type SLIM, or THICK for thicker arrows. For
unbarbed lineal elements, use LINE in place of these. The order of the options is
immaterial.
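For readers without Maple, the numerical integration behind a single trajectory can be sketched in a few lines of Python. This is only an illustration, not the phaseportrait algorithm: it swaps in the classical fixed-step fourth-order Runge-Kutta method for Maple's Runge-Kutta-Fehlberg scheme, and the names soft_spring, rk4_trajectory, and first_integral are hypothetical helpers introduced here. It integrates forward from the initial point [0,0,0.3] with stepsize 0.05 and monitors the quantity y^2/2 + x^2/2 - x^4/4, which is constant along exact trajectories of x' = y, y' = -x + x^3.

```python
def soft_spring(x, y):
    """Right-hand sides of x' = y, y' = -x + x**3 (the soft-spring system)."""
    return y, -x + x**3


def rk4_trajectory(rhs, x0, y0, t_end, h):
    """Integrate an autonomous system from t = 0 to t_end with fixed step h.

    Uses the classical fourth-order Runge-Kutta method (a stand-in for
    Maple's adaptive Runge-Kutta-Fehlberg scheme). Returns the list of
    (x, y) phase-plane points, including the initial point.
    """
    steps = int(round(t_end / h))
    x, y = x0, y0
    points = [(x, y)]
    for _ in range(steps):
        k1x, k1y = rhs(x, y)
        k2x, k2y = rhs(x + 0.5 * h * k1x, y + 0.5 * h * k1y)
        k3x, k3y = rhs(x + 0.5 * h * k2x, y + 0.5 * h * k2y)
        k4x, k4y = rhs(x + h * k3x, y + h * k3y)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        points.append((x, y))
    return points


def first_integral(x, y):
    """y^2/2 + x^2/2 - x^4/4, constant along exact trajectories."""
    return 0.5 * y**2 + 0.5 * x**2 - 0.25 * x**4


# Forward half (t = 0..20) of the orbit through the initial point [0, 0, 0.3].
orbit = rk4_trajectory(soft_spring, 0.0, 0.3, 20.0, 0.05)
drift = max(abs(first_integral(x, y) - first_integral(0.0, 0.3)) for x, y in orbit)
```

A small value of drift suggests that the chosen stepsize is adequate for this orbit; plotting the points in orbit (with any plotting tool) gives one closed curve inside the separatrix.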
Observe that the separatrix must be generated separately as BDE, EHB, AB, GB, EF, and IE. To generate BDE, for instance, we determine the coordinates
of D. The equation of the entire separatrix is given by (12), where C is determined
by using the x, y pair 1,0 (namely, the point E, which is a known point on the
separatrix). Thus, putting x = 1 and y = 0 into the left side of (12) gives C = 1/4.
Next, put x = 0 and solve for y, obtaining y = 1/√2 = 0.7071067812 at D. Then,
with D as the initial point we need t to go from -∞ to +∞ to generate BDE.
By trial, we find that -20 to +20 suffices for this segment and all others; similarly,
we generate EF by using a point on EF at x = 1.5, and determine y at that point
from the separatrix equation. That calculation gives y = 0.8838834761.
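The two initial points quoted above can be checked with a short calculation. The sketch below assumes that (12) has the form y^2/2 + x^2/2 - x^4/4 = C, which is what integrating y dy = (-x + x^3) dx gives and which is consistent with C = 1/4 at the point (1,0); the function names are hypothetical.

```python
import math


def first_integral(x, y):
    """Left side of the assumed form of (12): y^2/2 + x^2/2 - x^4/4."""
    return 0.5 * y**2 + 0.5 * x**2 - 0.25 * x**4


# The separatrix passes through the singular point (1, 0), which fixes C.
C = first_integral(1.0, 0.0)


def separatrix_y(x):
    """Positive y on the separatrix at a given x (where the radicand is >= 0)."""
    return math.sqrt(2.0 * (C - 0.5 * x**2 + 0.25 * x**4))


y_at_D = separatrix_y(0.0)    # point D, on the y axis
y_on_EF = separatrix_y(1.5)   # point used to generate EF
```

The values y_at_D and y_on_EF, together with their negatives, supply the y entries of the separatrix initial points in the phaseportrait command.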
Notice that to generate Fig. 8 with phaseportrait we need to already know
something about the phase portrait, namely the equation of the separatrix, (12), so that
we can choose suitable initial points on AB, GB, BDE, EHB, EF, and IE.
Suppose that we desire only the lineal element field, over 0 < x < 4 and
0 < y < 4, say. We can get it from phaseportrait as follows:
phaseportrait([y, -x + x^3], [t, x, y], t = 0..1, {[0,0,0]}, x = 0..4, y = 0..4,
scene = [x,y], arrows = THIN, grid = [20,20]);
because the trajectory through [0,0,0] gives simply the single point x = y = 0 in
the x, y phase plane. We have included the one initial point [0,0,0] because the
of x or y versus t using the scene option. For instance,
phaseportrait([y, -x + x^3], [t, x, y], t = 0..5,
stepsize = 0.05, scene = [t,x]);
0.2 and y(0)= a'(0)=
y(0) = w(0) =
u 1.3.
EXERCISES
0.8 and
7.2
1. We stated, below (5), that if we solve (5) for y (i.e., dx/dt),
separate variables, and integrate, we obtain the general solution x(t) = A sin(ωt + φ) of (1). Here we ask you
to carry out those steps.
2. Supply the steps missing between the first and second lines
of (18).
3. We found in Fig. 6 that for the hard-spring oscillator the frequency increases with the amplitude. Explain, in simple terms,
why that result makes sense.
4. Determine the equation of the phase trajectories for the
given system, and sketch several representative trajectories.
Use arrows to indicate the direction of movement along those
trajectories.
(a)a’ = Ys y =a
(c)ai=y",
yi =-ay
(b)2’=ay,
From your results, obtain the period T for each case and plot
T versus A for those values of A (and additional ones if you
wish). Does the claim made in the first sentence appear to be
correct?
7, We stated, in our discussion of Fig. 8, that all trajectories
outside of the “football” region correspond to nonperiodic motions. Explain why that is true.
8. (Graphical determination of phase velocity) (a) For the
system (22), consider the special case where P(x,y) = y,
as occurred in (3) and (7), for instance. From the accompanying
sketch, show that in that case the phase velocity s' can be
y
y! = —2x?
5. Determine the equation of the phase trajectories and sketch
enough representative trajectories to show the essential features of the phase portrait. Use arrows to indicate the direction
of movement along those trajectories.
(jcisy,
y= -y
(cja’=y,
y=a
(e)a’ =u,
y=z
Il
(b)a’=y,
|=y,
ja’
=
(Nei =a,
y'
< =9r
y = 4x
6. (Period, for soft-spring oscillator) In the paragraph below (21), we suggest that the period T of the periodic motions inside of BDEHB (Fig. 8) tends to 2π in the limit as
the amplitude A of the motion tends to zero, and to ∞ as
A → 1. Here we ask you to explore that claim with calculations. Specifically, use phaseportrait (or other software)
to solve x'' + x - x³ = 0 subject to the initial conditions
x(0) = A, x'(0) = 0, for A = 0.05, 0.3, 0.6, 0.9, 0.95, 0.99.
interpreted graphically as
s' = a,   (8.1)
where a is the perpendicular distance from P to the x axis.
(b) Consider a rectangular phase trajectory ABCDA, where
the corner points have the x, y coordinates A = (-1,1),
B = (3,1), C = (3,-1), D = (-1,-1). Using (8.1), plot
the graph of x(t) versus t, from t = 0 through t = 20, if the
representative point P is at A at t = 0.
(c) Consider a phase trajectory ABC consisting of straight-line
segments from A = (-1,0) to B = (0,1) to C = (1,0),
with P at B at t = 0. Using (8.1), sketch the graph of x(t)
versus t over -∞ < t < ∞. Also, give x(t) analytically over
-∞ < t < 0 and 0 < t < ∞.
(d) Consider a straight-line phase trajectory from A = (0,5)
to B = (10,-5). Using (8.1), sketch the graph of x(t) versus
t over 0 < t < ∞, if P is at A at t = 0.
(e) Same as (d), but with P at B at t = 0.
9. (a) Reduce the equation x'' + 2x³ = 0 to a system of equations
by setting x' = y. Find the equation of the phase trajectories
and sketch several of them by hand. Show that for larger
and larger motions the trajectories are flatter and flatter over
-1 < x < 1.
(b) Use the Maple phaseportrait command (or other software)
to generate the phase portrait and, on the same plot, the lineal
element field, using barbed arrows to show the flow direction.
You will need to make decisions, with some experimentation,
as to the t-interval, the step size, the initial points, and so on,
so as to obtain good results.
10. Reduce the equation x'' + x² = 0 to a system of equations
by setting x' = y. Find the equation of the trajectories and
carefully sketch seven or eight of them, so as to show clearly
the key features of the phase portrait. Pay special attention to
the one through the origin, and give its equation.
11. (Volterra problem) Consider the Volterra problem (24),
with α = β = γ = δ = 1. Determine any fixed points.
Use phaseportrait (or other software) to obtain the lineal element
field, with barbed arrows, over the region 0 < x < 4
and 0 < y < 4, say. (Of course, x and y need to be positive
because they are populations.) On that plot, sketch a number
of representative trajectories. You should find a circulatory
motion about the point (1,1). Can you tell, from the lineal element field, whether the trajectories
circulate in closed orbits
or whether they spiral in (or away from) that point?
tions.
dx/dt = f(x, y, t);   x(t0) = x0,
dy/dt = g(x, y, t);   y(t0) = y0
t0 and that
solution is unique.
More general and more powerful theorems could be given, but this one will
suffice for our purposes in this chapter. For such theorems and proofs we refer
you to texts on differential equations such as G. Birkhoff and G.-C. Rota, Ordinary
Differential Equations, 2nd ed. (New York: John Wiley, 1969).
For instance, consider the soft-spring oscillator equation x'' + x - x³ = 0 or,
equivalently, the system
x' = y,   (2a)
y' = -x + x³   (2b)
that we studied in Section 7.2. In this case f(x,y,t) = y, g(x,y,t) = -x + x³,
f_x = 0, f_y = 1, g_x = -1 + 3x², g_y = 0 are continuous for all values of x, y, and
t, so Theorem 7.3.1 assures us that no matter what initial condition is chosen there
is a unique solution through it. The extent of the t interval over which that solution
exists is not predicted by the theorem, which is a “local” theorem like Theorem
2.4.1. But it is understood that that interval is not merely the point t0 itself, for how
could dx/dt and dy/dt make sense if x(t) and y(t) were defined only at a single
point? Linear differential equations are simpler, and for them we have “global”
theorems such as Theorem 3.9.1.
If f and g satisfy the conditions of Theorem 7.3.1 at (x0, y0, t0) in x,y,t space,
then there does exist a solution curve, or trajectory, through that point, and there is
only one such trajectory. Geometrically, it follows that trajectories in x,y,t space
cannot touch or cross each other at a point of existence and uniqueness. However, what about the possibility of crossings of trajectories in the x, y phase plane?
Be careful, because whereas the theorem precludes crossings in three-dimensional
x,y,t space, the phase plane shows only the projection of the three-dimensional
trajectories onto the two-dimensional x, y plane. For instance, choose any point P0
on a closed orbit inside the “football” in Fig. 8 of Section 7.2. As the representative
point P goes round and round on that orbit it passes through P0 an infinite number
of times, yet that situation does not violate the theorem because if that trajectory
is viewed in three-dimensional x,y,t space, we see that it is actually helical, and
there are no self-crossings. The only points of serious concern in Fig. 8 are (1,0)
and (-1,0). But here too there is no violation of the theorem because there is only
the unique trajectory x(t) = 1 and y(t) = 0 through any initial point (1,0,t0),
namely, a straight-line trajectory which is perpendicular to the x,y plane and
which extends from -∞ to +∞ in the t direction. The trajectories DE and IE, in
x,y,t space, approach that line asymptotically as t → +∞, and the trajectories
FE and HE approach it asymptotically (both in x,y,t space and in the x, y phase
plane) as t → -∞, but they never reach it. Similarly for (-1,0,t0).
Recall, from Section 7.2, that we proceeded to divide (2b) by (2a), obtaining
dy/dx = (-x + x³)/y   (3)
Chapter 7. Qualitative Methods: Phase Plane and Nonlinear Differential Equations
now formalize.
dy/dx = f(x,y)/g(x,y),
we remain there for all t because x' = y' = 0 there.
be isolated. For example, if f(x,y) = x and g(x,y) =
at (1,0), write
a
/
Y;
ae
Close to (1,0) we neglect the higher-order terms and consider the linearized version
x' = y,   (6a)
y' = 2(x - 1)   (6b)
or, moving our coordinate system to (1,0) for convenience by setting X = x - 1
and Y = y,
X' = Y,   (7a)
Y' = 2X.   (7b)
Dividing these gives dY/dX = 2X/Y, with the solution Y² = 2X² + C; C = 0
gives Y = ±√2 X, and for C ≠ 0 we have Y = ±√(2X² + C). These curves
are shown in Fig. 1. The pairs crossing the X axis correspond to increasingly
negative values of C, and those crossing the Y axis correspond to increasingly
positive values of C.
Figure 1. The flow near the singular point (1,0).
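As a check on the linearization near (1,0), the right-hand side of (2b) can be expanded directly in terms of X = x - 1. The following worked expansion is a sketch that uses only that substitution and the system (2); it shows which terms are discarded when only the first-order term 2X is kept.

```latex
\begin{aligned}
y' &= -x + x^{3} = -(1+X) + (1+X)^{3} \\
   &= -1 - X + 1 + 3X + 3X^{2} + X^{3} \\
   &= 2X + 3X^{2} + X^{3} \;\approx\; 2X \qquad (|X| \ll 1).
\end{aligned}
```

The neglected terms 3X² + X³ are small compared with 2X only close to X = 0, which is why the linearized picture is a local one.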
Similarly, to study the singular point (0,0) observe that the right-hand sides of
(2a) and (2b) are already Taylor series expansions about x = 0 and y = 0. Thus,
keeping terms only up to first order gives the linearized version
x' = y,   (8a)
y' = -x,   (8b)
with dy/dx = -x/y giving the family of circles y² + x² = C and hence the
trajectories shown in Fig. 2.
Treating the singular point (-1,0) in the same manner, we find the same behavior there as at (1,0). If we show the three results together, in Fig. 3, it is striking
how those three localized phenomena appear to set up the global flow in the whole
plane. That is, if we fill in what’s missing, by hand, we obtain (at least in a
qualitative or topological sense) the same picture as in Fig. 8 of Section 7.2.*
Figure 2. The flow near the singular point (0,0).
From this example we can see some things and raise some questions. We see
that by virtue of our Taylor expansions of f(x,y) and g(x,y) about the singular
point (x_s, y_s) and their linearization, we are always going to end up with linearized
equations of the form
X' = aX + bY,   (9a)
Y' = cX + dY   (9b)
to study, where X = x - x_s and Y = y - y_s are Cartesian coordinate axes located
at the singular point. Thus, we might as well study the general system (9) once
and for all. Evidently, for different combinations of a, b, c, d there can be different
types of singular points, for from Fig. 3 it seems clear that the ones at (1,0) and
(-1,0) are different from the one at (0,0). How many different types are there?
What are they?
*It may appear inconsistent that the trajectories near the origin in Fig. 3 look elliptical, whereas
they are circles in Fig. 2. That distortion, from circles to ellipses, is merely the result of stretching
the x axis relative to the y axis for display purposes.
Figure 3. Global flow determined, qualitatively, by the singular points.
7.3.3. The elementary singularities and their stability. We wish to solve (9),
examine the results, and classify them into types. If we equate the right-hand sides
of (9) to zero, we have the unique solution X = Y = 0 only if the determinant
ad - bc is nonzero. If that determinant vanishes, then the solution X = Y = 0
is nonunique, and there is either an entire line of solutions through the origin or
the entire plane of solutions. For instance, if a = b = 0 and c and d are not both
zero, then every point on the line cX + dY = 0 is a singular point of (9), and
if a = b = c = d = 0, then every point in the plane is a singular point of (9).
Wishing to study the generic case, where the origin is an isolated singular point,
we will require of a, b, c, d that ad - bc ≠ 0.
If we solve (9), for instance by elimination, we find that the solution is of the
form
$X(t) = C_1 e^{\lambda_1 t} + C_2 e^{\lambda_2 t}, \qquad Y(t) = C_3 e^{\lambda_1 t} + C_4 e^{\lambda_2 t},$   (10a,b)
where C1, C2, C3, C4 are not independent, and where λ1, λ2 are the roots
$\lambda = \dfrac{a + d \pm \sqrt{(a-d)^2 + 4bc}}{2}$   (11)
of the characteristic equation
$\lambda^2 - (a + d)\lambda + (ad - bc) = 0.$   (12)
Since ad - bc ≠ 0, zero is not among the roots. There are exactly four possibilities:
(1) purely imaginary roots (CENTER),
(2) complex conjugate roots (FOCUS),
(3) real roots of the same sign (NODE),
(4) real roots of opposite sign (SADDLE).
These cases lead to four different types of singularity: center, focus, node, and
saddle, as we note within parentheses, and we will discuss these in turn. In doing
so, it is important to examine each in terms of its stability, which concept we define
before continuing.
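The four-way classification can be expressed directly in terms of a, b, c, d. The following Python sketch is illustrative only; the function name classify and the tolerance tol are assumptions introduced here. It uses the fact that the quantity under the square root in (11), (a - d)² + 4bc, equals (a + d)² - 4(ad - bc), and that the product of the roots of (12) is ad - bc.

```python
def classify(a, b, c, d, tol=1e-12):
    """Classify the singular point of X' = aX + bY, Y' = cX + dY at the origin.

    Returns 'center', 'focus', 'node', or 'saddle' when ad - bc != 0,
    and 'nonelementary' when ad - bc vanishes (to within tol).
    """
    det = a * d - b * c
    if abs(det) <= tol:
        return "nonelementary"
    trace = a + d
    disc = trace * trace - 4 * det      # same as (a - d)**2 + 4*b*c
    if disc < 0:
        # Complex roots: purely imaginary exactly when the real part (a + d)/2 is zero.
        return "center" if abs(trace) <= tol else "focus"
    # Real roots: their product is ad - bc, so same sign <=> det > 0.
    return "node" if det > 0 else "saddle"


examples = {
    "(8): x' = y, y' = -x": classify(0, 1, -1, 0),
    "(7): X' = Y, Y' = 2X": classify(0, 1, 2, 0),
}
```

Applied to the linearized systems (8) and (7) of the soft-spring example, this reproduces the center at (0,0) and the saddle at (1,0) that were found above by sketching trajectories.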
A singular point S = (x_s, y_s) of the autonomous system (4) is said to be stable
if motions (i.e., trajectories) that start out sufficiently close to S remain close to
S. To make that intuitively stated definition mathematically precise, let d(P1, P2)
denote the distance* between any two points P1 = (x1, y1) and P2 = (x2, y2).
Further, we continue to let P(t) = (x(t), y(t)) denote the representative point in
the phase plane corresponding to (4). Then, a singular point S is stable if, given
any ε > 0 (i.e., as small as we wish) there is a δ > 0 such that d(P(t), S) < ε for
all t > 0 if d(P(0), S) < δ. (See Fig. 4a.) If S is not stable, then it is unstable.
*In the Euclidean sense, the distance d(P1, P2) is defined as √((x1 - x2)² + (y1 - y2)²), but one
can define distance in other ways. Here, we understand it in the Euclidean sense.
Figure 4. Stability and asymptotic stability.
Further, we say that S is not only stable but asymptotically stable if motions
that start out sufficiently close to S not only stay close to S but actually approach
S as t → ∞. That is, if there is a δ > 0 such that d(P(t), S) → 0 as t → ∞
whenever d(P(0), S) < δ, then S is asymptotically stable. (See Fig. 4b.)
Now let us return to the four cases listed. The most insightful way to study
these cases is by seeking solutions in exponential form and dealing with the “eigenvalue problem” that results. However, the eigenvalue problem is not discussed until
Chapter 11, so in the present section we rely on an approach that should suffice but
which is in some ways less satisfactory. In Section 11.5 we return to this problem
and deal with it as an eigenvalue problem. If you are already sufficiently familiar
with linear algebra, we suggest that you study that section immediately following
this one.
It is convenient to use the physical example of the mechanical oscillator, with
the governing equation mx'' + px' + kx = 0, or
x' = y,   (13a)
$y' = -\dfrac{k}{m}x - \dfrac{p}{m}y$   (13b)
as a unifying example because by suitable choice of m, p, k we can obtain each of
the four cases, and because this application has already been discussed in Section
3.5. [Here we use p instead of c for the damping coefficient to avoid confusion with
the c in (9b).]
Purely imaginary roots. (CENTER) Let p = 0 so there is no damping. Then
a = 0, b = 1, c = -k/m, d = 0; (11) gives the purely imaginary roots
λ = ±i√(k/m), and dy/dx = -(k/m)x/y gives the family of ellipses
$\tfrac{1}{2}my^2 + \tfrac{1}{2}kx^2 = C$   (14)
sketched in Fig. 5a. The singular point at (0,0) is called a center because it is
surrounded by closed orbits corresponding to periodic motions. For instance, with
λ1 = +iω (where ω = √(k/m) is the natural frequency) and λ2 = -iω, (10a) gives
x(t) = C1 exp(iωt) + C2 exp(-iωt) or, equivalently,
x(t) = A sin(ωt + φ).   (15)
(Here, X = x and Y = y because the singular point is at x = y = 0.) In Fig. 5a the
principal axes of the elliptical orbits coincide with the x, y coordinate axes. More
generally, they need not. For instance, for the system
z=
,
8
vB,
4
+ au
ill V2
|
Er
3
34
y=->7
(16a)
L6b
(166)
(a)
ya
(11) again gives purely imaginary roots, λ = ±i/√3, so the solutions are harmonic
oscillations with frequency 1/√3, but the principal axes of the elliptical orbits are
at an angle of sin⁻¹(1/3) = 19.47° with respect to the x, y axes as shown in
Fig. 5b (see Exercise 5). [The system (16) is, of course, not a special case of (13);
it is a separate example.]
We see that a center is stable but not asymptotically stable.
Complex conjugate roots. (FOCUS) This time let p be positive in (13), but small
enough so that p < √(4km). According to the terminology introduced in Section
3.8, we say that the damping is subcritical because p_cr = √(4km). Then a = 0,
b = 1, c = -k/m, d = -p/m; (11) gives the complex conjugate roots
$\lambda = -\dfrac{p}{2m} \pm \sqrt{\left(\dfrac{p}{2m}\right)^2 - \dfrac{k}{m}} = -\dfrac{p}{2m} \pm i\sqrt{\dfrac{k}{m} - \left(\dfrac{p}{2m}\right)^2},$
and (10a) gives the solution
$x(t) = e^{-pt/2m}\left[A\cos\left(\sqrt{\dfrac{k}{m} - \left(\dfrac{p}{2m}\right)^2}\;t\right) + B\sin\left(\sqrt{\dfrac{k}{m} - \left(\dfrac{p}{2m}\right)^2}\;t\right)\right] = Ce^{-pt/2m}\sin\left(\sqrt{\dfrac{k}{m} - \left(\dfrac{p}{2m}\right)^2}\;t + \phi\right),$   (17)
where C and φ are arbitrary constants. As we discussed in Section 3.8, this solution
differs from the undamped version (15) in two ways. First, the frequency of the sinusoid is diminished from the natural frequency ω = √(k/m) to √(k/m - (p/2m)²),
and second, the exp(-pt/2m) factor modulates the amplitude, reducing it to zero as t → ∞.
In terms of the phase portrait, one obtains a family of spirals such as the one
shown in Fig. 6a. If we imagine the representative point P moving along that
curve, we see that the projection onto the x axis is indeed a damped oscillation. We
call the singularity at the origin a focus because trajectories “focus” to the origin as
t → ∞; the term spiral is also used.
In Fig. 6a the principal axes of the orbits, which would be elliptical if not for
the damping, coincide with the x, y coordinate axes. More generally, they need not.
For instance, for the system
:
26.57
1
oO
Figure 6. A stable focus at (0,0).
7
g! = 32 + gu
(18a)
1
4
y= ho
4y
(18)
we obtain similar results but with the principal axes rotated clockwise by an angle
of sin⁻¹(1/√5) = 26.57° as shown in Fig. 6b.
In each case (Fig. 6a and 6b) we see that the focus is stable and, indeed, asymptotically stable as well. However, one can have foci that wind outward instead of
inward, and these will be unstable. For instance, if we return to the solution (17) of
the damped oscillator system (13), but this time imagine p to be negative (without
concerning ourselves with how that might be arranged physically), and smaller in
magnitude than √(4km) as before, then in place of the clockwise inward flow shown
in Fig. 6a, we obtain the counterclockwise outward flow shown in Fig. 7, and we
classify the singularity at the origin as an unstable focus. Note that a stable singular point can, additionally, be asymptotically stable or not, but an unstable singular
point is simply unstable.
Figure 7. An unstable focus at (0,0).
Real roots of the same sign. (NODE) We’ve seen that without damping the mechanical oscillator (13) gives pure oscillations, elliptical orbits, and a center. With
light damping (i.e., 0 < p < p_cr) it gives damped oscillations and a stable focus.
If we now increase p so as to exceed p_cr, then the oscillations disappear altogether,
and we have the solution form
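$x(t) = C_1 e^{\lambda_1 t} + C_2 e^{\lambda_2 t}$   (19a)
with
$\lambda_1 = -\dfrac{p}{2m} + \sqrt{\left(\dfrac{p}{2m}\right)^2 - \dfrac{k}{m}}, \qquad \lambda_2 = -\dfrac{p}{2m} - \sqrt{\left(\dfrac{p}{2m}\right)^2 - \dfrac{k}{m}}.$
Because of the way we have numbered the λ's, we have λ2 < λ1 < 0. Since x' = y
in this application we have, from (19a),
$y(t) = \lambda_1 C_1 e^{\lambda_1 t} + \lambda_2 C_2 e^{\lambda_2 t}.$   (19b)
We can see from (19) that
$x(t) \sim C_1 e^{\lambda_1 t}, \qquad y(t) \sim \lambda_1 C_1 e^{\lambda_1 t}$   (20)
and
$y \sim \lambda_1 x$   (21)
as t → ∞, provided that the initial conditions do not give C1 = 0. If they do give
C1 = 0, then
$x(t) = C_2 e^{\lambda_2 t}, \qquad y(t) = \lambda_2 C_2 e^{\lambda_2 t}$   (22)
and
$y = \lambda_2 x$   (23)
as t → ∞.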
The resulting phase portrait is shown in Fig. 8a. We call the singularity at
(0,0) a node; more specifically, an improper node (“improper” is explained below). In accord with the preceding discussion, observe from Fig. 8a that in the
exceptional case in which the initial point lies on the line of slope λ2 the representative
point approaches the origin along that line. In all other cases, the approach is
asymptotic to the line of slope λ1. If we let p tend to p_cr, then the two lines coalesce
and we obtain the portrait shown in Fig. 8b, which is, likewise, an improper node
(Exercise 6).
Figure 8. A stable improper node at (0,0): distinct roots and repeated roots, respectively.
If p is negative and greater in magnitude than p_cr, then we see from the expressions given above for the λ's that λ1 > λ2 > 0, and the phase portrait is
as shown in Fig. 9. The nodes shown in Fig. 8a,b are both stable (indeed, asymptotically
stable), and the one shown in Fig. 9 (which is analogous to the one in Fig. 8a) is
unstable.
Figure 9. An unstable improper node at (0,0): distinct roots.
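The progression from center to focus to node as the damping p is varied can be illustrated numerically. The sketch below is a minimal illustration, assuming the values m = k = 1 (chosen here only for convenience); the function name oscillator_roots is hypothetical. It evaluates the roots of (12) for the oscillator system (13), for which a = 0, b = 1, c = -k/m, d = -p/m.

```python
import cmath
import math


def oscillator_roots(m, p, k):
    """Roots of (12) for system (13), where a = 0, b = 1, c = -k/m, d = -p/m."""
    half = -p / (2 * m)
    radical = cmath.sqrt((p / (2 * m)) ** 2 - k / m)
    return half + radical, half - radical


m, k = 1.0, 1.0                      # illustrative values only
p_cr = math.sqrt(4 * k * m)          # critical damping, here 2.0

lam_center = oscillator_roots(m, 0.0, k)            # p = 0
lam_focus = oscillator_roots(m, 0.5 * p_cr, k)      # 0 < p < p_cr
lam_node = oscillator_roots(m, 2.0 * p_cr, k)       # p > p_cr
lam_unstable = oscillator_roots(m, -2.0 * p_cr, k)  # p < -p_cr
```

With these values the roots are purely imaginary for p = 0, complex with negative real part for light damping, real with λ2 < λ1 < 0 for p > p_cr, and real with λ1 > λ2 > 0 for p < -p_cr, in line with the discussion above.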
Our mechanical oscillator example is but one example leading to a node. We
could make up additional ones by writing down equations x' = ax + by and y' =
cx + dy if we choose the coefficients a, b, c, d so that the two λ's are of the same
sign, but the results will be one of the types shown in Fig. 8 and 9. There is,
however, another type of node, which we illustrate by the problem
x' = ax,   (24a)
y' = ay.   (24b)
That is, b = c = 0 and a = d. Then λ1 = λ2 = a, and we have the solution
$x(t) = Ae^{at},$   (25a)
$y(t) = Be^{at},$   (25b)
where A, B are arbitrary. In this case y/x is not only asymptotic to a constant as
t → ∞, it is equal to a constant for all t. Thus, the phase portrait is as shown in
Fig. 10a if a < 0, and as in Fig. 10b if a > 0. The former is an asymptotically
stable node, and the latter is an unstable node. But this time we call them proper
nodes (or stars) because every trajectory is a straight line through the singular point
(0,0), not just one or two of them.
Figure 10. Stable and unstable proper nodes at (0,0).
Real roots of opposite sign. (SADDLE) Consider, once again, the undamped
mass/spring system governed by the equation mx'' + kx = 0, or
x' = y,   (26a)
$y' = -\dfrac{k}{m}x.$   (26b)
This time, imagine k to be negative (without concern about how that case could
occur physically) and set -k/m = h². Then (26) gives dy/dx = h²x/y so
y² = h²x² + C,   (27)
which trajectories are shown in Fig. 11 for various values of C. In particular,
C = 0 gives the two straight lines through the singular point, namely, y = ±hx,
with the flow approaching the origin on one (y = -hx) and moving away from
it on the other (y = +hx). Such a singular point is called a saddle and is always
unstable. The two straight-line trajectories through the saddle, along which the flow
is attracted and repelled, are called the stable and unstable manifolds, respectively.
Of course, (26) is not the only example of a linear system x' = ax + by and
y' = cx + dy with a saddle. Any such system with real roots of opposite sign will
have such a singularity. For instance,
x' = x + 2y,   (28a)
y' = 8x - 5y   (28b)
has the roots λ = 3 and λ = -7. Thus, it has a saddle, and we know that two
straight-line solutions can be found through the origin. To find them, try y = κx.
Putting y = κx into (28) gives x' = (1 + 2κ)x and x' = (8 - 5κ)x/κ so it follows
from these that we need 1 + 2κ = (8 - 5κ)/κ, which equation gives the slopes
κ = 1 and κ = -4. (If we would obtain κ = ∞, we would understand that, from
y = κx, to correspond to the y axis.) With κ = 1 the equation x' = (1 + 2κ)x = 3x
gives x(t) proportional to exp(3t) [likewise for y(t) because y = κx], and with
κ = -4 it gives x(t) proportional to exp(-7t). Thus, the line trajectory y = x
is the unstable manifold (since x and y grow exponentially on it), and the line
trajectory y = -4x is the stable manifold (since x and y die out exponentially on
it).
The same procedure, which we have just outlined and which should be clearly
understood, can be used for a node as well, to find any straight-line trajectories
through the node.
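The substitution y = κx can be carried out for a general linear system and checked against (28). The sketch below is illustrative; the function name straight_line_trajectories is hypothetical, and the case b = 0 (where a vertical line must be handled separately) is deliberately left out.

```python
import math


def straight_line_trajectories(a, b, c, d):
    """Return [(kappa, rate), ...] for lines y = kappa*x of x' = ax + by, y' = cx + dy.

    Substituting y = kappa*x gives x' = (a + b*kappa)*x and
    kappa*x' = (c + d*kappa)*x, hence b*kappa**2 + (a - d)*kappa - c = 0.
    On each line x(t) is proportional to exp(rate*t), rate = a + b*kappa.
    Assumes b != 0 (otherwise the vertical line kappa = infinity must be
    treated separately).
    """
    if b == 0:
        raise ValueError("b = 0: handle the vertical line x = 0 separately")
    disc = (a - d) ** 2 + 4 * b * c
    if disc < 0:
        return []          # center or focus: no straight-line trajectories
    root = math.sqrt(disc)
    kappas = [(-(a - d) + root) / (2 * b), (-(a - d) - root) / (2 * b)]
    return [(kap, a + b * kap) for kap in kappas]


lines = straight_line_trajectories(1, 2, 8, -5)   # system (28)
```

For (28) the quadratic is 2κ² + 6κ - 8 = 0, so lines holds the slope κ = 1 with growth rate 3 (the unstable manifold) and the slope κ = -4 with rate -7 (the stable manifold). Note that the rates a + bκ are the roots λ of (12).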
7.3.4. Nonelementary singularities. In this final subsection we turn from the
elementary singularities to nonelementary ones, with two purposes in mind. First,
one doesn’t completely understand elementary singularities until one distinguishes
elementary singularities from nonelementary ones and, second, nonelementary singularities do arise in applications.
Recall that
X' = aX + bY,   Y' = cX + dY   (29a,b)
has an elementary singularity at (0,0) if ad - bc ≠ 0. Consider two examples.
EXAMPLE 1. The system
x' = y,   y' = y   (30a,b)
has the phase trajectories y = x + C and the phase portrait shown in Fig. 12. Since
ad - bc = (0)(1) - (1)(0) = 0, the singularity of (30) at (0,0) is nonelementary. It is
nonisolated and, in fact, y = 0 is an entire line of singular points. ■
Figure 12. Nonelementary singularity of (30) at (0,0).
EXAMPLE 2. Consider the singularity of the system
x' = y,   y' = 1 - cos x   (31a,b)
at (0,0). Expanding the right side of (31b) gives 1 - cos x = x²/2 - x⁴/24 + ··· so
the linearized version of (31) is x' = 0x + 1y and y' = 0x + 0y. Thus, ad - bc =
(0)(0) - (1)(0) = 0 again and the singularity of (31) at the origin is nonelementary. The
difficulty this time is not that the singular point is not isolated; it is. The problem is that it
is of higher order, for when we linearize the expansion of 1 - cos x we simply have 0x + 0y.
In not retaining at least the first nonvanishing term (namely, x²/2) we have “thrown out the
baby with the bathwater.” To capture the local behavior of (31) near the origin, we need to
retain that leading term and consider the system
x' = y,   y' = x²/2.   (32a,b)
Dividing (32b) by (32a) and integrating gives y² = x³/3 + C, several of which trajectories
are shown in Fig. 13. We see that the singularity is, indeed, isolated, but that the phase
portrait is not of one of the elementary types. ■
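The integration step in Example 2 can be written out explicitly. The following is a short worked derivation, a sketch that uses only the system (32); the remark about the case C = 0 is an observation added here, not a statement from the text.

```latex
\begin{aligned}
\frac{dy}{dx} &= \frac{y'}{x'} = \frac{x^{2}/2}{y}
  \quad\Longrightarrow\quad 2y\,dy = x^{2}\,dx, \\
y^{2} &= \frac{x^{3}}{3} + C .
\end{aligned}
```

For C = 0 this gives y = ±x^{3/2}/√3, defined only for x ≥ 0, so the trajectory through the origin has a cusp there rather than the straight-line or elliptical shape seen near an elementary singularity.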
Closure. In this section we establish a foundation for our use of the phase plane in
studying nonlinear systems. We begin with the issue of existence and uniqueness
of solutions, first in x,y,t space (Theorem 7.3.1), and then in the x, y phase plane.
The latter leads us to introduce the concept of a singular point in the phase plane
as a point at which both x' = P(x,y) = 0 and y' = Q(x,y) = 0. To study a
singular point S = (x_s, y_s) one focuses on the immediate neighborhood of S, in
which neighborhood we work with the locally linearized equations X' = aX + bY
and Y' = cX + dY, where X = x - x_s and Y = y - y_s, so X, Y is a Cartesian
system with its origin at S. Studying that linearized system, we categorize the
possible “flows” into four qualitatively distinct types (the center, focus, node, and
saddle) and illustrate each through the mass/spring system mx'' + px' + kx = 0,
with suitable choices of m, p, k, and other examples as well. These are the so-called elementary singularities that result when ad - bc ≠ 0. In the next section
we apply these results to several nonlinear systems, where we will see the role of
such singular points in establishing the key features of the overall flow in the phase
plane.
lated?
(e
oy
y’
a
(c) x! i
I
(e) a’ II
lI
(g) « II
y’ tI
yl!
and find C3 and C4 in terms of C1 and C2.
0
20—-Y
vty)
cy
c+
y?
rt+y
zy—4
xz— 2y
(b)
(d)
(f)
a’ = 2x — dy
yo=au-y
g’ = siny
y=aty
vo =1l—eY
y =1-—2*—2xsiny
(h) x’ = cos(x —y)
y =xy-1
4. Suppose that we reverse the ε’s and δ’s in our definition of
stability, so that the definition becomes: A singular point S is
stable if, given any ε > 0 (i.e., as small as we wish), there is a
δ > 0 such that d(P(t),S) < δ for all t > 0 if d(P(0),S) < ε.
Would that definition work? That is, would it satisfy the idea
of motions that start out sufficiently close to S remaining close
to S? Explain.
5. In this exercise we wish to elaborate on our claim below
7.4. Applications
gle of 19.47° with respect to the x, y axes as shown in Fig. 5b.
(a) If x, y and x̄, ȳ coordinate systems are at an angle α,
as shown here, show that the x, y and x̄, ȳ coordinates of any
9. (Saddles and nodes) Classify
yo=dety
(c) a = a+ 2y
yo=a~2y
(e) we= 3a+y
yo ame by
(g)
ow = 2u-+y
yl = a + Qy
v= Tcosa
(i) vw=at+y
— Ysina,
y = Tsina+Yycosa
(5. 1a)
(5.1b}
(b) Putting (5.1) into (16), insist upon the result being of the
form
x̄' = βȳ,   ȳ' = -γx̄   (5.2)
for some constants β and γ so as to yield elliptical orbits with
x̄, ȳ as principal axes, and show that you obtain α = 19.47°.
If you obtain another α as well, explain its significance.
6. We claimed that if p = p_cr, then the two straight-line
trajectories in Fig. 8a coalesce, as shown in Fig. 8b. Here,
λ1 = λ2 = λ, then the general solution of (13) is
x(t) = (C1 + C2 t)e^{λt},   y(t) = x'(t) = etc.
7. What does Fig. 8a look like in the limit as p → ∞? Sketch
it.
8. Given the presence of the saddles (i.e., saddle-type singularities) at (—1,0) and (1,0) and the center at (0,0), can you
come up with any global flow patterns that are qualitatively
different from the one sketched in Fig. 3? Explain. (Assume
that these three are the only singularities.)
7.4
Applications
at the origin,
find the equations of any straight-line trajectories through the
origin, and sketch the phase portrait, including flow direction
arrows,
(a) a =a+y
given point are related according to
the singularity
—359
yl = ©+ Qy
(b) vi =y
y! = a —dy
(d) a! = —a + 3y
yo=r-y
(f) a = -38e+y
yo=-e-y
(h) a’ = a+ dy
yo = ba+y
i ~3e+y
Gj) a =
yo =a-3y
10. Prove that a linear system x' = ax + by, y' = cx + dy can
have one, two, or an infinite number of straight-line trajectories through the origin, but never a finite number greater than
two.
11. Classify the singularity at the origin as a center, focus,
node, or saddle. If it is a focus, node, or saddle, then classify
it, further, as stable or unstable.
(a) a = a-dy
(b) 2’ = 224+ 3y
(c) vw =aut+y
(d) 2 =a+d3y
yo=aty
y’ =a —4y
(e) x
y!
(g) a
~ yf
=
=
=
=
—2e — 3y
3u + 2y
2e-y
a + By
y =2-y
yo=ru-y
(f) ai = -ae+y
yo = -@- 2y
(h) a’ = -24-y
yo = —w-3y
12. (a)-(h) Use computer software to obtain the phase portrait for the corresponding system in Exercise 11. Be sure to
include any key trajectories —namely, any straight-line trajectories through the origin. From the phase portrait, classify the
singularity as a center, focus, node, or saddle, state whether it
is stable, asymptotically stable, or unstable, and use arrows to
show the flow direction.
Chapter 7. Qualitative Methods: Phase Plane and Nonlinear Differential Equations
interesting applications:
system
$x' = P(x, y)$,
$y' = Q(x, y)$.
linearizing them about S’.
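about any point $(a, b)$, is
$f(x, y) = f(a, b) + \frac{1}{1!}\left[f_x(a, b)(x - a) + f_y(a, b)(y - b)\right]$
$\quad + \frac{1}{2!}\left[f_{xx}(a, b)(x - a)^2 + 2 f_{xy}(a, b)(x - a)(y - b) + f_{yy}(a, b)(y - b)^2\right]$
$\quad$ + terms of third order and higher,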
of a single variable that
y==b:
(2),
first-order
terms, we have
That step is called linearization because the result is linear in $x$ and $y$. Geometrically, (5) amounts to approximating the $f$ surface, plotted above the $x, y$ plane, by its tangent plane at $(a, b)$, just as $f(x) \approx f(a) + f'(a)(x - a)$ amounts to approximating the graph of $f$ versus $x$ by its tangent line at $x = a$ in the case of a function of a single variable.
interest, $(x_s, y_s)$, and linearize. Using (5) to do that, we obtain
$P(x, y) \approx P(x_s, y_s) + P_x(x_s, y_s)(x - x_s) + P_y(x_s, y_s)(y - y_s)$,   (6a)
$Q(x, y) \approx Q(x_s, y_s) + Q_x(x_s, y_s)(x - x_s) + Q_y(x_s, y_s)(y - y_s)$.   (6b)
But $P(x_s, y_s)$ and $Q(x_s, y_s)$ are zero because $(x_s, y_s)$ is a singular point, so we have the approximate (linearized) equations
$x' = P_x(x_s, y_s)(x - x_s) + P_y(x_s, y_s)(y - y_s)$,
$y' = Q_x(x_s, y_s)(x - x_s) + Q_y(x_s, y_s)(y - y_s)$.   (7)
Finally, it is convenient, though not essential, to move the origin to $S$ by letting $X = x - x_s$ and $Y = y - y_s$, and calling $P_x(x_s, y_s) = a$, $P_y(x_s, y_s) = b$, $Q_x(x_s, y_s) = c$, and $Q_y(x_s, y_s) = d$, for brevity, in which case (7) becomes
$X' = aX + bY$,   (8a)
$Y' = cX + dY$,   (8b)
which system is studied in Section 7.3. There, we classify the singularity at $X = Y = 0$ as a center, focus, node, or saddle, depending upon the roots of the characteristic equation
$\lambda^2 - (a + d)\lambda + (ad - bc) = 0$,   (9)
namely,
$\lambda = \dfrac{(a + d) \pm \sqrt{(a - d)^2 + 4bc}}{2}$.   (10)
Once $a, b, c, d$ are known, we can find the $\lambda$ roots and determine whether the singular point is a center, focus,
node, or saddle.
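The coefficients $a, b, c, d$ in (8) are just the four partial derivatives of $P$ and $Q$ at the singular point, so they can be estimated numerically when differentiating by hand is inconvenient. The following is a minimal sketch, not part of the text: the function name `jacobian`, the step size `h`, and the use of central differences are all illustrative assumptions.

```python
import math

def jacobian(P, Q, xs, ys, h=1e-6):
    """Central-difference estimate of (a, b, c, d) = (P_x, P_y, Q_x, Q_y) at (xs, ys).

    P and Q are callables P(x, y), Q(x, y); h is an assumed step size.
    """
    a = (P(xs + h, ys) - P(xs - h, ys)) / (2 * h)
    b = (P(xs, ys + h) - P(xs, ys - h)) / (2 * h)
    c = (Q(xs + h, ys) - Q(xs - h, ys)) / (2 * h)
    d = (Q(xs, ys + h) - Q(xs, ys - h)) / (2 * h)
    return a, b, c, d

# Illustration: the undamped pendulum x' = y, y' = -sin x.
P = lambda x, y: y
Q = lambda x, y: -math.sin(x)

coeffs_at_origin = jacobian(P, Q, 0.0, 0.0)      # approximately (0, 1, -1, 0)
coeffs_at_pi = jacobian(P, Q, math.pi, 0.0)      # approximately (0, 1, 1, 0)
```

The two coefficient sets differ only in the sign of $c$, which is what separates a center-type linearization from a saddle for this system.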
Understand that we are trying to ascertain the local behavior of the flow corresponding to the original nonlinear system (1) near the singular point $S$ by studying the simpler linearized version of (1) at $S$, namely, (8). That program raises this question: Is the nature of the singularity of the nonlinear system (1), at $S$, truly captured by its linearized version (8)? To answer that question it is helpful to present the singularity classification, developed in Section 7.3, in a graphical form, as we have in Fig. 1. It is more convenient to deal with the two quantities $p = a + d$ and $q = ad - bc$ (which are the axes in Fig. 1) than with the four quantities $a, b, c, d$ since $a + d$ and $ad - bc$ determine the roots of (10) and hence the singularity type. In terms of $p$ and $q$, (10) simplifies to $\lambda = (p \pm \sqrt{p^2 - 4q})/2$.
[Figure 1. Singularity types in the $p, q$ plane. Region labels: saddles; stable and unstable improper nodes; stable and unstable proper nodes; stable and unstable foci; centers. Curve shown: the parabola $p^2 = 4q$.]
In the figure there are five regions separated by the $p$ axis, the parabola $p^2 = 4q$, and the positive $q$ axis. [Of these boundaries, the $p$ axis can be discounted since the case $q = 0$ is ruled out of consideration in Section 7.3 because then there is a line of singular points through the origin rather than the origin being an isolated singular point. That is, our $(p, q)$ point will not fall on the boundary between saddles and improper nodes, namely, the $p$ axis.]
The Hartman-Grobman theorem tells us that if $a, b, c, d$ are such that the point $(p, q)$ is within one of those regions, then the singularity types of the nonlinear system and its linearized version are identical. For instance, if the linearized system has a saddle, then so does the nonlinear system. Essentially, we can think of retaining the higher-order terms (which we drop in linearizing the nonlinear differential equations) as equivalent to infinitesimally perturbing the values of the coefficients $a, b, c, d$ in the linearized equations and hence the values of $p$ and $q$. If the point $(p, q)$ is within one of the five regions, then the perturbed point will be within the same region, so the singularity type will be the same for the nonlinear system as for the linearized one.
However, we can imagine that for a borderline case, where $(p, q)$ is on the parabola $p^2 = 4q$ or on the positive $q$ axis, such a perturbation can push the point into one of the neighboring regions, thus changing the type. In fact, that is the way it turns out. For instance, if $(p, q)$ is on the positive $q$ axis (as occurs in the example to follow), then the nonlinear system could have an unstable focus or a center or a
stable focus.
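The classification by $p$ and $q$ described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `classify`, the tolerance `tol` used to detect the borderline curves, and the returned label strings are not from the text. The second return value flags the borderline cases in which the linearized type need not carry over to the nonlinear system.

```python
def classify(a, b, c, d, tol=1e-12):
    """Return (label, borderline) for X' = aX + bY, Y' = cX + dY.

    Uses p = a + d and q = ad - bc; tol is an assumed tolerance for
    deciding that (p, q) lies on a boundary curve.
    """
    p = a + d
    q = a * d - b * c
    disc = p * p - 4.0 * q          # discriminant in lambda = (p +/- sqrt(p^2 - 4q))/2
    if abs(q) <= tol:
        return "not an isolated singular point (q = 0)", True
    if q < 0:
        return "saddle", False
    # From here on q > 0.
    if abs(p) <= tol:
        return "center", True        # positive q axis: borderline
    stability = "stable" if p < 0 else "unstable"
    if abs(disc) <= tol:
        # Repeated root: the point lies on the parabola p^2 = 4q (borderline).
        return stability + " node, repeated root (p^2 = 4q)", True
    if disc > 0:
        return stability + " improper node", False
    return stability + " focus", False

example_center = classify(0.0, 1.0, -1.0, 0.0)
example_saddle = classify(0.0, 1.0, 1.0, 0.0)
```

For instance, `classify(0, 1, -1, 0)` reports a center with the borderline flag set, which is exactly the situation of the example that follows.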
EXAMPLE 1. Oscillator with Cubic Damping. The equation $x'' + \epsilon x'^3 + x = 0$ models a harmonic oscillator with cubic damping, that is, with a damping term proportional to the velocity cubed. The equivalent system
$x' = y$,
$y' = -x - \epsilon y^3$   (11)
has one singular point, at $x = y = 0$. The linearized version is
$X' = Y = 0X + 1Y$,
$Y' = -X = -1X + 0Y$,   (12)
so $a = d = 0$, $b = 1$, and $c = -1$; hence $p = 0$ and $q = 1$. Thus, $(p, q) = (0, 1)$ so (from Fig. 1) the linearized system (12) has a center (no surprise, since the solutions of the linearized equation $x'' + x = 0$ are simple harmonic motions). However, it turns out (Exercise 1) that the nonlinear system (11) has a stable focus.
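One way to see the stable focus of (11) numerically is to watch the distance from the origin shrink along a trajectory. The sketch below is an assumption-laden illustration, not the text's method: it uses a hand-written classical Runge-Kutta step (`rk4_step`), and the names `cubic_damped` and `radius_history`, the step size, and the initial point are all chosen here for convenience.

```python
def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step for state' = f(state)."""
    k1 = f(state)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = f([s + dt * k for s, k in zip(state, k3)])
    return [s + dt * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def cubic_damped(eps):
    """Right-hand side of system (11): x' = y, y' = -x - eps*y^3."""
    return lambda s: [s[1], -s[0] - eps * s[1] ** 3]

def radius_history(eps=0.5, x0=1.0, y0=0.0, dt=0.01, t_end=100.0, samples=5):
    """Distance from the origin at the start and at `samples` equally spaced times."""
    f = cubic_damped(eps)
    state = [x0, y0]
    n = int(round(t_end / dt))
    every = n // samples
    radii = [(x0 * x0 + y0 * y0) ** 0.5]
    for i in range(1, n + 1):
        state = rk4_step(f, state, dt)
        if i % every == 0:
            radii.append((state[0] ** 2 + state[1] ** 2) ** 0.5)
    return radii

damped = radius_history(eps=0.5)    # radii shrink: trajectory spirals inward
undamped = radius_history(eps=0.0)  # radii stay near 1: closed circular orbit
```

The shrinkage is consistent with $\frac{d}{dt}\left[\tfrac{1}{2}(x^2 + y^2)\right] = -\epsilon y^4 \le 0$, which follows directly from (11).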
To summarize, in general the linearized system faithfully captures the singularity type of the original nonlinear system. For the borderline cases, where $(p, q)$ is on the $p^2 = 4q$ parabola or on the positive $q$ axis, however, we have these possibilities:
LINEARIZED                   NONLINEAR
stable proper node      =>   stable focus, or stable proper node, or stable improper node
center                  =>   unstable focus, or center, or stable focus
unstable proper node    =>   unstable improper node, or unstable proper node, or unstable focus
7.4.2. Applications. Consider some physical applications.
EXAMPLE 2. Pendulum. Recall from an introductory physics course that for a rigid body undergoing pure rotation about a pivot axis the inertia about the pivot $O$ times the angular acceleration is equal to the applied torque. For the pendulum shown in Fig. 2, the inertia about $O$ is $ml^2$, the angle from the vertical is $x(t)$, the angular acceleration is $x''(t)$, and the downward gravitational force $mg$ gives a torque of $-mgl \sin x$. If the air resistance is proportional to the velocity $lx'$, say $clx'$, then it gives an additional torque $-cl^2 x'$, so the equation of motion is $ml^2 x'' = -mgl \sin x - cl^2 x'$ or
$x'' + r x' + \dfrac{g}{l} \sin x = 0$,   (13)
where $r = c/m$. The $rx'$ term is a damping term. For definiteness, let $g/l = 1$ and consider the undamped case, where $r = 0$.
For small motions we can approximate $\sin x$ by the first term of its Taylor series, $\sin x \approx x$, so that we have the simple harmonic oscillator equation $x'' + x = 0$ or
$x' = y$,   (14a)
$y' = -x$.   (14b)
Figure 2. Pendulum.
The system (14) has a center at $x = y = 0$, and its by now familiar phase portrait is shown in Fig. 3.
To study larger motions, suppose that we approximate $\sin x$ by the first two terms of its Taylor series instead: $\sin x \approx x - x^3/6$. Then we have the nonlinear, but still approximate, equation of motion
$x'' + x - \dfrac{1}{6} x^3 = 0$.   (15)
The latter is of the same form as the equation governing the rectilinear motion of a mass restrained by a “soft spring,” which is studied in Section 7.2. The system
$x' = y$,   (16a)
$y' = -x + \dfrac{1}{6} x^3$   (16b)
has a center at $(0, 0)$ and saddles at $(\pm\sqrt{6}, 0)$ in the $(x, y)$ phase plane, as discussed in Section 7.2, and its phase portrait is shown here in Fig. 4.
Figure 3. Phase portrait for the linearized system $x'' + x = 0$.
Finally, if we retain the entire Taylor series of $\sin x$ (i.e., if we keep $\sin x$ intact), then we have the full nonlinear system (13), with $r = 0$, or
$x' = y$,   (17a)
$y' = -\sin x$,   (17b)
with singular points at $(x, y) = (n\pi, 0)$ for $n = 0, \pm 1, \pm 2, \ldots$. To classify these singularities, let us linearize equations (17) about the singular point $(n\pi, 0)$ using (7). Doing so, knowing that $\sin n\pi = 0$ and $\cos n\pi = (-1)^n$, and setting $X = x - n\pi$ and $Y = y - 0 = y$, the linearized version of (17) is
$X' = Y = 0X + 1Y$,   (18a)
$Y' = (-1)^{n+1} X = (-1)^{n+1} X + 0Y$.   (18b)
Figure 4. Phase portrait for the improved model (15).
In the notation of equation (8), $a = d = 0$, $b = 1$, and $c = (-1)^{n+1}$ so $p = a + d = 0$ and $q = ad - bc = (-1)^n$. Thus, these singular points are on the $q$ axis in the $p, q$ plane. For even integers $n$ they are on the positive $q$ axis and correspond to centers; for odd integers $n$ they are on the negative $q$ axis and correspond to saddles. In turn, the latter correspond to saddles of the nonlinear system (17), but the former could be centers or foci of (17), as is discussed in Section 7.4.1. The computer-generated phase portrait in Fig. 5 reveals that they are centers; we have centers at $x = 0, \pm 2\pi, \pm 4\pi, \ldots$, and saddles at $x = \pm\pi, \pm 3\pi, \ldots$, on the $x$ axis.
Figure 5. Phase portrait of the full nonlinear system $x'' + \sin x = 0$.
To understand the phase portrait, suppose (for definiteness) that the pendulum is hanging straight down ($x = 0$) initially, and that we impart an initial angular velocity $y(0)$, so that the initial point is $A$, $B$, $C$, or $D$ in Fig. 5. If we start at $A$, then we follow a closed orbit that is very close to elliptical, and the motion is very close to simple harmonic motion at frequency $\omega = 1$. If we start at $B$, the orbit is not so elliptical, there is an increase in the period, and the motion deviates somewhat from simple harmonic. If we start at $C$, then we approach the saddle at $x = \pi$ as $t \to \infty$; that is, the pendulum approaches the inverted position as $t \to \infty$. If we impart more energy by starting at $D$, then, even though it slows down as it approaches the inverted position, it has enough energy to pass through that position and to keep going round and round indefinitely. Though the trajectory in the phase plane is not closed, the motion is nonetheless physically periodic since the positions $x$ and $x + 2n\pi$ (for any integer $n$) are physically identical.
How can we gain access to one of the other closed orbits such as $\Gamma$? That’s easy: we “crank” the pendulum through two rotations by hand so that while hanging straight down it is now at $x = 4\pi$. Then we impart an initial angular velocity $y(0)$.
What is the equation of the trajectories? Dividing (17b) by (17a) and integrating gives
$\frac{1}{2} y^2 - \cos x = \text{constant} \equiv C$.   (19)
Do we really need (19)? After all, we turned the phase portrait generation over to the computer. Yes, to help us choose initial conditions that will enable us to generate the separatrix (the trajectories through the saddles) on the computer. With $x = \pi$ and $y = 0$, we find that $C = 1$, so the equation of the separatrix is $y^2 = 2(1 + \cos x)$. Be careful, because the initial point $x = 0$ and $y = 2$ will not generate the entire separatrix, but only the segment through that point, from $x = -\pi$ to $x = \pi$. To generate the next segment we could use an initial point $x = 2\pi$ and $y = 2$, and so on.
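The role of (19) in picking initial conditions can be sketched as follows. This is a minimal illustration, not taken from the text: the function names `energy`, `separatrix_y`, and `motion_type`, and the tolerance `tol`, are assumptions made here. The constant $C$ of (19) is compared with its separatrix value $C = 1$.

```python
import math

def energy(x, y):
    """The constant C in (19): (1/2) y^2 - cos x."""
    return 0.5 * y * y - math.cos(x)

def separatrix_y(x):
    """Upper branch of the separatrix y^2 = 2(1 + cos x)."""
    return math.sqrt(max(0.0, 2.0 * (1.0 + math.cos(x))))

def motion_type(x0, y0, tol=1e-12):
    """Compare C at the initial point with the separatrix value C = 1."""
    C = energy(x0, y0)
    if abs(C - 1.0) <= tol:
        return "on the separatrix"
    return "closed orbit (oscillation)" if C < 1.0 else "rotation (over the top)"

# Initial points on successive separatrix segments, as suggested in the text:
segment_starts = [(2.0 * math.pi * k, separatrix_y(2.0 * math.pi * k)) for k in range(3)]
```

Starting from straight down, an initial angular velocity below 2 gives an oscillation and one above 2 gives a rotation, which matches the distinction drawn above between the closed orbits and the trajectory through $D$.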
COMMENT 1. Recall that Figs. 3-5 correspond to taking $\sin x \approx x$, $\sin x \approx x - x^3/6$, and retaining $\sin x$ without approximation, respectively, in (13). Thus, and not surprisingly, as we retain more terms in the Taylor series approximation of $\sin x$ about $x = 0$ we capture the flow more accurately and completely.
COMMENT 2. What happens if we include some damping? It turns out that the singularities are at $(n\pi, 0)$, as before. If $r < r_{cr}$, where $r_{cr} = 2$, then the singularities are still saddles if $n$ is odd, but if $n$ is even we now have stable foci rather than centers as seen in Fig. 6 (for $r = 0.5$). One calls the lightly shaded region (not including the boundaries $AB$ and $CD$) the basin of attraction for the stable focus at $(2\pi, 0)$, the basin of attraction of an attracting singular point $S$ being the set of all initial points $P_0$ such that the representative point $P(t)$ tends to $S$ as $t \to \infty$ if $P(0) = P_0$. Similarly, each of the other stable foci has its own basin of attraction.
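The idea of a basin of attraction can be explored numerically by integrating the damped pendulum from a chosen initial point and seeing which focus $(2k\pi, 0)$ the trajectory settles on. The following sketch is an illustration only: the names `settle` and `basin_index`, the Runge-Kutta integrator, the step size, and the integration time are assumptions made here, with $g/l = 1$ and $r = 0.5$ as in the text.

```python
import math

def settle(x0, y0, r=0.5, dt=0.01, t_end=200.0):
    """Integrate x' = y, y' = -sin x - r*y with classical RK4; return the final (x, y)."""
    def rhs(x, y):
        return y, -math.sin(x) - r * y

    x, y = x0, y0
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(x, y)
        k2 = rhs(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1])
        k3 = rhs(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1])
        k4 = rhs(x + dt * k3[0], y + dt * k3[1])
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        y += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return x, y

def basin_index(x0, y0, r=0.5):
    """Integer k such that the trajectory from (x0, y0) ends near the focus (2*k*pi, 0)."""
    x_final, _ = settle(x0, y0, r)
    return round(x_final / (2.0 * math.pi))

k_gentle_push = basin_index(0.0, 1.0)   # a gentle push from straight down
```

Sweeping `basin_index` over a grid of initial points would give a rough picture of regions like the shaded one described above; initial points on or very near the boundaries $AB$ and $CD$ are sensitive to numerical error and should not be trusted.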
COMMENT 3. We have spoken, in this section, of a nonlinear system having the same type of singularity (or not), at a particular singular point $S$, as the system linearized about $S$. Let us use the present example to clarify that idea. By their singularities being of the same type, we mean that their phase portraits are topologically equivalent in the neighborhood of $S$. Intuitively, that means that one can be obtained from the other by a continuous deformation, with the direction of the arrows preserved. The situation is illustrated in Fig. 7, where we show both the nonlinear (solid) and linearized (dashed) portraits in the neighborhood of the saddle at $(\pi, 0)$. [In more mathematical terms, suppose that our system $x' = P(x, y)$ and $y' = Q(x, y)$ has a singular point at the origin and that
$P(x, y) = ax + by + \text{higher-order terms} \equiv aX + bY$,   (20a)
$Q(x, y) = cx + dy + \text{higher-order terms} \equiv cX + dY$   (20b)
define $X$ and $Y$ as continuous functions of $x$ and $y$, and vice versa. Such a relationship between $x, y$ and $X, Y$ is called a homeomorphism and is what we mean by a continuous deformation.]
Figure 7. Continuous deformation of the portrait near $(\pi, 0)$. Solid lines correspond to the nonlinear system, dashed to the linearized version.
COMMENT 4. It turns out that the nonlinear pendulum equation is also prominent in
connection with a superconducting device known as a Josephson junction. For discussion
of the Josephson junction within the context of nonlinear dynamics, we recommend the
book Nonlinear Dynamics and Chaos (Reading, MA: Addison-Wesley, 1994) by Steven
H. Strogatz.
In the preceding example we were unable to classify the singularities at $(n\pi, 0)$ with certainty, for $n$ even and $r = 0$, as is discussed in the paragraph below (18). We relied on the computer-generated phase portrait, which shows them to be centers, not foci. More satisfactorily, we could have used the fact that $x'' + \sin x = 0$ is a “conservative system,” which idea we now explain.
In general, suppose that the application of Newton’s second law gives
$m x'' = F(x)$;   (21)
that is, where the force $F$ happens to be an explicit function only of $x$, not of $x'$ or $t$. Defining $V(x)$ by $F(x) = -V'(x)$, (21) becomes $m x'' + V'(x) = 0$. Let us multiply the latter by $dx$ and integrate on $x$. Since, from the calculus,
$x''\,dx = \dfrac{dx'}{dt}\,dx = \dfrac{dx'}{dt}\,\dfrac{dx}{dt}\,dt = x'\,\dfrac{dx'}{dt}\,dt = x'\,dx'$,   (22)
we obtain $m x'\,dx' + V'(x)\,dx = 0$, integration of which gives
$\frac{1}{2} m x'^2 + V(x) = \text{constant}$.   (23)
$V(x)$ is called the potential energy associated with the force $F(x)$. For a linear spring, for instance, the force is $F(x) = -kx$, and its potential is $V(x) = kx^2/2$. The upshot is that (23) tells us that the total energy (kinetic plus potential) is conserved; it remains constant over time, so we say that any system of the form (21) is conservative. The pendulum equation
$ml^2 x'' = -mgl \sin x$   (24)
is of that form, and multiplying by $dx$ and integrating on $x$ gives
$\frac{1}{2} m (l x')^2 - mgl \cos x = \text{constant}$,   (25)
where $m(lx')^2/2$ is the kinetic energy and $-mgl \cos x$ is the potential energy associated with the gravitational force (with the pivot point chosen as the reference level).
For any conservative system (21), the total energy is $E(x, x') = m x'^2/2 + V(x)$. If we plot $E(x, x')$ above the $x, x'$ phase plane, then the $x, x'$ locations of maxima and minima of $E$ are found from
$\dfrac{\partial E}{\partial x} = V'(x) = 0$ and $\dfrac{\partial E}{\partial x'} = m x' = 0$,   (26)
and these are precisely the singular points of the system
$x' = y$,
$y' = F(x)/m = -V'(x)/m$
corresponding to (21). To illustrate, the point $S$ beneath the minimum of $E$ (Fig. 8) is a singular point. Furthermore, since $E$ is constant on each trajectory, the phase trajectories are the projections of the closed curves of intersection of the $E$ surface and the various horizontal planes, as sketched. Evidently (and we do not claim that this intuitive discussion constitutes a rigorous proof), a trajectory $\Gamma$ very close to $S$ must be a closed orbit. The only way $\Gamma$ could fail to correspond to a periodic motion is if there is a singular point on $\Gamma$, for the flow would stop there. But if $S$ is an isolated singular point, and $\Gamma$ is small enough, then there can be no singular points on $\Gamma$, and we can conclude that $S$ must be a center. By that reasoning, we could have known that the singularities at $(n\pi, 0)$ (for $n$ even) must be centers, not foci. More generally, we state that conservative systems do not have foci among their singularities.
EXAMPLE 3. Volterra’s Predator-Prey Model. The Volterra model (also known as the Lotka-Volterra model) of the ecological problem of two coexisting species, one the predator and the other its prey, is introduced in Section 7.2. Recall that if $x(t)$, $y(t)$ are the populations of prey and predator, respectively, then the governing equations are of the form
$x' = \mu(1 - y)x$,   (27a)
$y' = -\nu(1 - x)y$,   (27b)
where $\mu, \nu$ are positive empirical constants. Setting the right-hand sides of (27) equal to zero reveals that there are two singular points: $(0, 0)$ and $(1, 1)$.
Figure 8. Occurrence of a center for a conservative system.
Linearizing (27) about $(0, 0)$ gives
$x' = \mu x$, $y' = -\nu y$   (28)
so $a = \mu$, $b = c = 0$, $d = -\nu$. Hence, $\lambda = \mu$ and $-\nu$, which are of opposite sign, so the singularity at the origin is a saddle. Clearly, the straight-line trajectories through $(0, 0)$ are simply the $x$ and $y$ axes since $x = 0$, $y = Ae^{-\nu t}$, and $x = Be^{\mu t}$, $y = 0$ satisfy (28) and give trajectories that pass through the origin.
To linearize about $(1, 1)$ we use (7) and obtain the approximations
$x' \approx -\mu(y - 1)$, $y' \approx \nu(x - 1)$   (29)
or, with $X = x - 1$ and $Y = y - 1$,
$X' = -\mu Y = 0X - \mu Y$,   (30a)
$Y' = \nu X = \nu X + 0Y$.   (30b)
Thus, $a = d = 0$, $b = -\mu$, $c = \nu$, so $\lambda = \pm i\sqrt{\mu\nu}$. Hence, the linearized version (30) has a center, and (27) has either a center or a focus.
The phase portrait in Fig. 9 shows the singularity at $(1, 1)$ to be a center, with every trajectory being a periodic orbit, except for the two coordinate axes. (Of course, we show only the first quadrant because $x$ and $y$ are populations and hence nonnegative.) The direction of the arrows follows from (27), which reveals that $x' > 0$ for $y < 1$, and $x' < 0$ for $y > 1$ (or $y' < 0$ for $x < 1$ and $y' > 0$ for $x > 1$).
Figure 9. Phase portrait for Volterra problem (27).
COMMENT. Although the Volterra model is a useful starting point in the modeling process
and is useful pedagogically, it is not regarded as sufficiently realistic for practical ecological
applications.
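That the singularity of (27) at $(1, 1)$ is a center rather than a focus can also be checked with a conserved quantity. The function $H(x, y) = \nu(x - \ln x) + \mu(y - \ln y)$ is not given in the text; it is introduced here as an assumption that can be verified directly, since differentiating along (27) gives $dH/dt = \nu(x - 1)\mu(1 - y) - \mu(y - 1)\nu(1 - x) = 0$. The sketch below (function names, parameter values, and step size are all illustrative) confirms numerically that $H$ stays constant along a trajectory.

```python
import math

def volterra_rhs(x, y, mu, nu):
    """Right-hand side of (27): x' = mu(1 - y)x, y' = -nu(1 - x)y."""
    return mu * (1.0 - y) * x, -nu * (1.0 - x) * y

def first_integral(x, y, mu, nu):
    """Assumed conserved quantity H = nu(x - ln x) + mu(y - ln y), for x, y > 0."""
    return nu * (x - math.log(x)) + mu * (y - math.log(y))

def max_drift(x0, y0, mu=1.0, nu=0.5, dt=0.01, t_end=50.0):
    """Largest change in H along an RK4 trajectory starting at (x0, y0)."""
    x, y = x0, y0
    h0 = first_integral(x, y, mu, nu)
    worst = 0.0
    for _ in range(int(round(t_end / dt))):
        k1 = volterra_rhs(x, y, mu, nu)
        k2 = volterra_rhs(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1], mu, nu)
        k3 = volterra_rhs(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1], mu, nu)
        k4 = volterra_rhs(x + dt * k3[0], y + dt * k3[1], mu, nu)
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        y += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        worst = max(worst, abs(first_integral(x, y, mu, nu) - h0))
    return worst

drift = max_drift(0.5, 0.5)   # expected to be tiny (integration error only)
```

Because $H$ has a strict minimum at $(1, 1)$ and is constant on trajectories, the nearby trajectories are its closed level curves, in the same spirit as the energy argument for conservative systems given earlier.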
7.4.3. Bifurcations. As we have stressed, our approach in this chapter is largely qualitative. Of special importance, then, is the concept of bifurcations. That is, systems generally include one or more physical parameters (such as $\mu$ and $\nu$ in Example 3). As those parameters are varied continuously, one expects the system behavior to change continuously as well. For instance, if we vary $\mu$ in Example 3, then the eccentricity of the orbits close to the center at $(1, 1)$ changes, and the overall flow field deforms, but, qualitatively, nothing dramatic happens. In other cases, there may exist certain critical values of one or more of the parameters such that the overall system behavior changes abruptly and dramatically as a parameter passes through such a critical value. We speak of such a result as a bifurcation. Let us illustrate the idea with an example.
EXAMPLE 4. Saddle-Node Bifurcation. The nonlinear system
$x' = -rx + y$,   (31a)
$y' = \dfrac{x^2}{1 + x^2} - y$   (31b)
arises in molecular biology, where $x(t)$ and $y(t)$ are proportional to protein and messenger RNA concentrations, and $r$ is a positive empirical constant, or parameter, associated with the “death rate” of protein in the absence of the messenger RNA [for if $y = 0$, then (31a) gives exponential decay of $x$, with rate constant $r$].
The singular points of (31) correspond to intersection points of $y = rx$ and $y = x^2/(1 + x^2)$, as shown (solid curves) in Fig. 10. Equating these gives $x = y = 0$ and also the two distinct roots
$x_\pm = \dfrac{1 \pm \sqrt{1 - 4r^2}}{2r}$, $y_\pm = r x_\pm = \dfrac{1 \pm \sqrt{1 - 4r^2}}{2}$,
provided that $r < 1/2$. Thus, the critical slope of $y = rx$ is $r = 1/2$. If $r < 1/2$ we obtain the two intersections $S_+ = (x_+, y_+)$ and $S_- = (x_-, y_-)$; if $r = 1/2$ (dashed line in Fig. 10) these coalesce at $(1, 0.5)$, and if $r > 1/2$ they disappear and we have only the singular point at the origin.
Let us study the three singular points, for $r < 1/2$. First $(0, 0)$: we can see from (31), by inspection or Taylor series expansion, that the linearized equations are
$x' = -rx + y$, $y' = -y$   (33)
so $a = -r$, $b = 1$, $c = 0$, and $d = -1$. Thus, (10) gives $\lambda = -r$ and $-1$. Since both are negative, the singular point $(0, 0)$ is a stable node.
In similar fashion (which calculations we leave to Exercise 6), we find that the singularity at $S_-$ is a saddle, and that the singularity at $S_+$ is a stable improper node. As $r$ is increased, $S_-$ and $S_+$ approach each other along the curve $y = x^2/(1 + x^2)$. When $r = 1/2$ they merge and form a singularity of some other type, and when $r$ is increased beyond 1/2 the singularity disappears altogether, leaving only the node at the origin. The bifurcation that occurs at $r = 1/2$ is an example of a “saddle-node bifurcation.” From the way the singular points $S_+$ and $S_-$ approach each other along the unstable manifold of the saddle, like “beads on a string” as Strogatz puts it, we see that the bifurcation process is essentially a one-dimensional event embedded within a higher-dimensional space (two-dimensional in this case).
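The saddle-node scenario for (31) can be traced numerically by computing the singular points and their $p, q$ values for several $r$. The sketch below is illustrative only: the function names and label strings are assumptions made here. It uses the Jacobian of (31), for which $a = -r$, $b = 1$, $c = 2x/(1 + x^2)^2$, $d = -1$, so that $p = -(r + 1)$ and $q = r - 2x/(1 + x^2)^2$.

```python
import math

def singular_points(r):
    """Singular points of (31): the origin, plus S_- and S_+ when r <= 1/2."""
    points = [(0.0, 0.0)]
    disc = 1.0 - 4.0 * r * r
    if disc > 0.0:
        for sign in (-1.0, 1.0):              # S_- first, then S_+
            x = (1.0 + sign * math.sqrt(disc)) / (2.0 * r)
            points.append((x, r * x))
    elif disc == 0.0:                         # merged (nonelementary) point
        x = 1.0 / (2.0 * r)
        points.append((x, r * x))
    return points

def p_and_q(r, x):
    """p = a + d and q = ad - bc for the linearization of (31) at abscissa x."""
    c = 2.0 * x / (1.0 + x * x) ** 2
    return -r - 1.0, r - c

def describe(r, x, tol=1e-12):
    p, q = p_and_q(r, x)
    if abs(q) <= tol:
        return "nonelementary (q = 0)"
    if q < 0:
        return "saddle"
    # p < 0 always for (31), so any node or focus here is stable.
    return "stable improper node" if p * p - 4.0 * q > 0 else "stable focus or repeated-root node"

report = {r: [(pt, describe(r, pt[0])) for pt in singular_points(r)]
          for r in (0.3, 0.5, 0.6)}
```

For $r = 0.3$ this gives $S_- = (1/3, 0.1)$ with $q = -0.24$ (a saddle) and $S_+ = (3, 0.9)$ with $q = 0.24$ (a stable improper node); at $r = 0.5$ the two merge at $(1, 0.5)$ with $q = 0$, and for $r = 0.6$ only the origin remains.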
The saddle-node bifurcation illustrated above is but one type of bifurcation.
A few others are discussed in the exercises and in the next section. For a more
complete discussion of bifurcation theory, we recommend the book by Strogatz,
referenced in Example 2.
Closure. In this section we got into the details of the phase plane analysis of
autonomous nonlinear systems. Whether or not we generate the phase portrait by
computer, it is essential to begin an analysis by finding any singular points and,
by linearization,
to determine the key features of the local flow near each singular
point. That information is needed even if we turn to the computer to generate the
phase portrait, as we discuss below, under “Computer
software.”
[Figure 10 labels: the line $y = rx$, the curve $y = x^2/(1 + x^2)$, and the point $(1, 0.5)$.]
We also explored the correspondence between the type of a singularity of the nonlinear system and that of the linearized system and found that the type remains the same, except for the borderline cases corresponding to $p, q$ points on the positive $q$ axis or on the parabola $p^2 = 4q$ in Fig. 1. Those cases could “go either way.” That
thus changing the singularity type.
Figure 10. Determining the singular points of (31).
system
$x' = y$,
$y' = -\sin x - 0.5y$   (34)
and to linearize (34) about $(\pi, 0)$ as
$x' = y$,
$y' = -(\cos \pi)(x - \pi) - 0.5y$   (35)
or, with $x - \pi = X$ and $y - 0 = Y$,
$X' = 0X + 1Y$,
$Y' = 1X - 0.5Y$.
$X' = 0X + \kappa X$,
$\kappa X' = 1X - 0.5\kappa X$,
$y = -0.02561553$.
as follows:
x = -4.5..17, y = -6..6, scene = [x,y]);
(36)
(37)
simply chosen as the same ones used in Fig. 6.
EXERCISES
7.4
1. (a) In Example 1 we stated that the equation $x'' + \epsilon x'^3 + x = 0$ ($\epsilon > 0$) has a stable focus at the origin of the phase plane. Verify that claim by generating (with computer software) its phase portrait.
(b) Should the cubic damping result in the oscillation dying out more or less rapidly than for the case of linear damping, $x'' + \epsilon x' + x = 0$, for the same values of $\epsilon$? Explain.
(c) Classify the singularity at $(0, 0)$ for the case where $\epsilon < 0$, and support your claim.
2. Determine all singular points, if any, and classify each insofar as possible.
=x-y,
y’ = sin(z
+ y)
(a) Show that $S_-$ is a saddle, and find the equations of the two straight-line trajectories through it. Show that $S_+$ is a stable improper node, and find two straight-line trajectories through it. Find two straight-line trajectories through the stable node at $(0, 0)$. Use these results to sketch the phase portrait of (31).
ate the phaseportrait of (31) in the rectangle 0 < x < 1.5,
y! = ~x -2y
Qo=o,
(g)a=-2a-y,
y=—242y
yo=ar+c°
(h)c’ = —-22--y,
y' =sine
O<y<
15.
7. (Dynamic formulation ofa buckling problem) Consider the
buckling of the mechanical system shown in the figure, and
(a's yer —-1, yo=y-a-l
Qai=a?-y,
yo=Iw-y
(k)a’ =a? —y?,
P
yi =a? +y-2
“
ae
oe
,
ow
y’ =a? -4
3. (a)—(n) Use computer software such as the Maple
phase-
portrait command, to generate the phase portrait of the corresponding system in Exercise 2.
«
4. Is the given system conservative? Explain.
o
(a)a” — Qa’ + sing = 0
(c)a” + a? = 0
e
ou
Qa =y,
yi =—3sine
(m) a’ =2+2y,
yi’ =—-x—siny
(n)a’ = (x +1)y,
In parts (a) and (b) below, let
6. (Example 4 continuation.)
$r = 0.3$ in (31).
Sy. is nonelementary.
(c) For the representative supercritical case r = 1, identify and
classify any singularities, and use computer software to gener-
(c)a’=y, y' =(1—27)/(1
+2")
(e)a'=(l—a*)y,
(0,0) shouldbe a stablenode.
(b) For the critical case, r = 1/2, show that the singularity at
(aja'=y,
y’=1-24
(b)a’=1—y’,
y=l-2
(djx’
at (0,0) shouldbea stablefocus.
(c) r to be any value that you wish but large enough for the
damping to be supercritical. For instance, the singularity at
(ba
x
V1
m
o
k
+e? +e2=0
(d)a" +a’ +ar=0
5. Use computer software such as the Maple phaseportrait command, to obtain the phase portrait of equation (13), over $-2 < x < 14$ and $-3 < y < 3$, showing enough trajectories to clearly portray the flow. Take $g/l = 1$, and
(a) $r = 0$,
Se,
Se
P
(b) $r$ to be any positive value that you wish but small enough for the damping to be subcritical. For instance, the singularity

consisting of two massless rigid rods of length $l$ pinned to a mass $m$ and a lateral spring of stiffness $k$. That is, when the spring is neither stretched nor compressed $x = 0$ and the rods are aligned vertically. As we increase the downward load $P$ nothing happens until we reach a critical value $P_{cr}$, at which value $x$ increases (to one side or the other, we can’t predict which) and the system collapses.
(a) Application of Newton’s second law of motion gives
$m x'' - \dfrac{2P}{l}\left[1 - \left(\dfrac{x}{l}\right)^2\right]^{-1/2} x + kx = 0$   (7.1)
as governing the displacement $x(t)$. With $x' = y$, show that the singularity at the origin in the $x, y$ phase plane changes its type as $P$ is sufficiently increased. Discuss that change of type, show how it corresponds to the onset of buckling, and use it to show that the critical buckling load is $P_{cr} = kl/2$.
(b) Explain what the results of part (a) have to do with bifurcation theory.
(c) Use Newton’s law to derive (7.1).
8. (Motion of current-carrying wire) A mutual force of attraction is exerted between parallel current-carrying wires. The infinite wire shown in the figure has current $I$, and the wire of length $l$ and mass $m$ (with leads that are perpendicular to the paper) has current $i$ in the same direction as $I$. According to the Biot-Savart law, the mutual force of attraction is $2Iil/(\text{separation}) = 2Iil/(a - x)$, where $x = 0$ is the position at which the spring force is zero, so the equation of motion of the restrained wire is
$m x'' + k\left(x - \dfrac{r}{a - x}\right) = 0$, where $r = \dfrac{2Iil}{k}$.
Thinking of $m, k, a$, and $l$ as fixed, and the currents $I$ and $i$ as variable, let us study the behavior of the system in terms of the parameter $r$. For definiteness, let $m = k = a = 1$.
(a) With $x' = y$, identify any singularities in the $x, y$ phase plane and their types, and show that they depend upon whether $r$ is less than, equal to, or greater than 1/4. Suppose that $r < 1/4$. Find the equation of the phase trajectories and of the separatrix. Do a labeled sketch of the phase portrait.
(b) Let $r = 0.1$, say, and obtain a computer plot of the phase portrait.
(c) Next, consider the transitional case, where $r = 1/4$. Show that that case corresponds to the merging of the two singularities, and the forming of a single singularity of higher order (i.e., a nonelementary singularity). Do a labeled sketch of the phase portrait for that case.
(d) Let $r = 1/4$, and obtain a computer plot of the phase portrait.
(e) Next, consider the case where $r > 1/4$, and sketch the phase portrait.
(f) Let $r = 0.5$, say, and obtain a computer plot of the phase portrait.
(g) Discuss this problem from the point of view of bifurcations, insofar as the parameter $r$ is concerned.
($\epsilon > 0$)   (1)
of the beating of the heart.* Usually, in applications the parameter $\epsilon$ is positive.
To study (1) in the phase plane we first re-express it as the system
$x' = y$,   (2a)
$y' = -x + \epsilon(1 - x^2)y$,   (2b)
which has one singular point: $(0, 0)$. Linearizing (2) about $(0, 0)$ gives
$x' = y$,   (3a)
$y' = -x + \epsilon y$,   (3b)
which has an unstable focus if $\epsilon < 2$ and an unstable node if $\epsilon > 2$. That result is not surprising since (3) is equivalent to $x'' - \epsilon x' + x = 0$ [equation (1) with the nonlinear $\epsilon x^2 x'$ term dropped], and the latter corresponds to a damped harmonic oscillator with negative damping. Near the origin in the $x, y$ phase plane the flow is accurately described by (3) and is shown in Fig. 1a. As the motion increases, the neglected nonlinear term $\epsilon x^2 x'$ ceases to be negligible, and we wonder how the trajectory shown in Fig. 1a continues to develop as $t$ increases. Since the “damping coefficient” $c = -\epsilon(1 - x^2)$, in (1), is negative throughout the vertical strip $|x| < 1$, we expect the spiral to continue to grow, with distortion as the $\epsilon x^2 x'$ term becomes more prominent. Eventually, the spiral will break out of the $|x| < 1$ strip (Fig. 1b). As the representative point $(x(t), y(t))$ spends more and more time outside that strip, where $c = -\epsilon(1 - x^2) > 0$, the effect of the positive damping in $|x| > 1$ increases, relative to the effect of the negative damping in $|x| < 1$, so it is natural to wonder if the trajectory might approach some limiting closed orbit as $t \to \infty$ over which the effects of the positive and negative damping are exactly in balance.
Figure 1. The unstable focus at $(0, 0)$.
We can use the following theorem, due to N. Levinson and O. K. Smith.
THEOREM 7.5.1 Existence of Limit Cycle
Let $f(x)$ be even [$f(-x) = f(x)$] and continuous for all $x$. Let $g(x)$ be odd [$g(-x) = -g(x)$] with $g(x) > 0$ for all $x > 0$, and $g'(x)$ be continuous for all $x$. With
$\int_0^x f(\xi)\,d\xi \equiv F(x)$ and $\int_0^x g(\xi)\,d\xi \equiv G(x)$,   (4)
suppose that (i) $G(x) \to \infty$ as $x \to \infty$ and (ii) there is an $x_0 > 0$ such that $F(x) < 0$ for $0 < x < x_0$, $F(x) > 0$ for $x > x_0$, and $F(x)$ is monotonically increasing for $x > x_0$ with $F(x) \to \infty$ as $x \to \infty$. Then the generalized Liénard equation
$x'' + f(x)x' + g(x) = 0$   (5)
has a single periodic solution, the trajectory of which is a closed curve encircling the origin in the $x, x'$ phase plane. All other trajectories (except the trajectory consisting of the single point at the origin) spiral toward the closed trajectory as $t \to \infty$.
*B. van der Pol, On “Relaxation Oscillations,” Philosophical Magazine, Vol. 2, 1926, pp. 978-992, and B. van der Pol and J. van der Mark, The Heartbeat As a Relaxation Oscillation, and An Electrical Model of the Heart, Philosophical Magazine, Vol. 6, 1928, pp. 763-775.
Applying this theorem to the van der Pol equation (1), $f(x) = -\epsilon(1 - x^2)$ is an even function of $x$ and $F(x) = -\epsilon(x - x^3/3)$, which is less than zero for $0 < x < \sqrt{3}$, greater than zero for $x > \sqrt{3}$, and which increases monotonely to infinity as $x \to \infty$. Further, $g(x) = x$ is odd, and positive for $x > 0$, $g'(x) = 1$, and $G(x) = x^2/2 \to \infty$ as $x \to \infty$. Since the conditions of the theorem are met (for all $\epsilon > 0$), we conclude from the theorem that the van der Pol equation does admit a closed trajectory, a periodic solution, for every positive value of $\epsilon$.
Computer results (using the Maple phaseportrait command) bear out this claim. The phase portraits are shown in Fig. 2 for the representative cases $\epsilon = 0.2$, 1, and 5, and $x(t)$ is plotted versus $t$ in Fig. 3 for the trajectories labeled $C$. The closed trajectories labeled $\Gamma$, predicted by the theorem, are examples of limit cycles, namely, isolated closed orbits. By $\Gamma$ being isolated we mean that neighboring trajectories through points arbitrarily close to $\Gamma$ are not closed orbits. If we start on $\Gamma$ we remain on $\Gamma$, but if we start on a neighboring trajectory, then we approach $\Gamma$ as $t \to \infty$ (unless we start at the origin, which is an equilibrium point). Thus, we classify the van der Pol limit cycle as stable (or attracting). Clearly, that particular trajectory is of the greatest importance because every other trajectory (except the point trajectory $x = y = 0$) winds onto it as $t \to \infty$.
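The attracting character of the limit cycle can be checked with a short simulation in place of the Maple command. The following is a sketch under stated assumptions, not the text's procedure: the function name `vdp_amplitude`, the hand-written Runge-Kutta integrator, the step size, and the run lengths are all illustrative choices. It integrates system (2) and reports the largest $|x|$ over the final stretch of the run, once transients have died out.

```python
def vdp_amplitude(eps, x0, y0=0.0, dt=0.01, t_end=200.0, t_tail=20.0):
    """Largest |x| over the last t_tail time units of x' = y, y' = -x + eps(1 - x^2)y."""
    def rhs(x, y):
        return y, -x + eps * (1.0 - x * x) * y

    n = int(round(t_end / dt))
    n_tail = int(round(t_tail / dt))
    x, y, amp = x0, y0, 0.0
    for i in range(n):
        # One classical fourth-order Runge-Kutta step.
        k1 = rhs(x, y)
        k2 = rhs(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1])
        k3 = rhs(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1])
        k4 = rhs(x + dt * k3[0], y + dt * k3[1])
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        y += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        if i >= n - n_tail:
            amp = max(amp, abs(x))
    return amp

amp_from_inside = vdp_amplitude(0.2, 0.1)   # start near the origin, inside the cycle
amp_from_outside = vdp_amplitude(0.2, 4.0)  # start well outside the cycle
```

Both starting points should yield nearly the same final amplitude, close to 2 for small $\epsilon$, which is what one expects if trajectories on either side wind onto the same closed orbit.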
As one might suspect from Fig. 2 and as can be proved, the van der Pol limit cycle approaches a circle of radius 2 as $\epsilon \to 0$ through positive values. When $\epsilon$ becomes zero the singularity at the origin changes from a focus to a center, and while the circle of radius 2 persists as a trajectory it is joined by the whole family of circular orbits centered at the origin. If $\epsilon$ is diminished further and becomes negative, the origin becomes a stable focus and all closed orbits disappear and give way to inward-winding spirals. Thus, $\epsilon = 0$ is a bifurcation value of $\epsilon$.
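The claim that the limit cycle is nearly a circle of radius 2 for small $\epsilon$ can be checked numerically. The following is a minimal sketch, not the Maple computation used for the figures: it assumes the van der Pol equation in the form $x'' - \epsilon(1 - x^2)x' + x = 0$, written as the system $x' = y$, $y' = \epsilon(1 - x^2)y - x$, and uses a hand-written fourth-order Runge-Kutta step. The function names, step size, and run length are illustrative choices.

```python
def van_der_pol_rhs(x, y, eps):
    """Right-hand side of the system x' = y, y' = eps*(1 - x^2)*y - x."""
    return y, eps * (1.0 - x * x) * y - x


def rk4_step(x, y, eps, h):
    """Advance (x, y) by one classical Runge-Kutta step of size h."""
    k1x, k1y = van_der_pol_rhs(x, y, eps)
    k2x, k2y = van_der_pol_rhs(x + 0.5 * h * k1x, y + 0.5 * h * k1y, eps)
    k3x, k3y = van_der_pol_rhs(x + 0.5 * h * k2x, y + 0.5 * h * k2y, eps)
    k4x, k4y = van_der_pol_rhs(x + h * k3x, y + h * k3y, eps)
    x_new = x + h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
    y_new = y + h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
    return x_new, y_new


def steady_state_amplitude(eps, x0, y0, t_end=200.0, h=0.01, tail=20.0):
    """Integrate to t_end and return max |x| over the final `tail` time units."""
    n_steps = int(round(t_end / h))
    n_tail = int(round(tail / h))
    x, y = x0, y0
    amplitude = 0.0
    for i in range(n_steps):
        x, y = rk4_step(x, y, eps, h)
        if i >= n_steps - n_tail:
            amplitude = max(amplitude, abs(x))
    return amplitude


# One start inside the limit cycle and one outside it, both with eps = 0.2.
amp_from_inside = steady_state_amplitude(0.2, 0.1, 0.0)
amp_from_outside = steady_state_amplitude(0.2, 4.0, 0.0)
print(amp_from_inside, amp_from_outside)
```

Both starting points should settle onto an oscillation whose amplitude is close to 2, one winding outward and the other inward, which is the attracting behavior described above.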
Observe the interesting extremes: as $\epsilon \to 0$, the steady-state oscillation (i.e., corresponding to the limit cycle) becomes a purely harmonic motion with amplitude 2. But as $\epsilon$ becomes large, the limit cycle distorts considerably and the steady-state oscillation $x(t)$ becomes “herky jerky.” (In the exercises, we show that as $\epsilon \to \infty$ it even becomes discontinuous!) Such motions were dubbed relaxation oscillations by van der Pol, and these are found all around us. Just a few, mentioned in the paper by van der Pol and van der Mark, are the singing of wires in a cross wind, the scratching noise of a knife on a plate, the squeaking of a door, the intermittent discharge of a capacitor through a neon tube, the periodic recurrence of epidemics and economic crises, the sleeping of flowers, menstruation, and the beating of the heart. Such oscillations are characterized by a slow buildup followed by a rapid discharge, then a slow buildup, and so on. Thus, there are two time scales present, a “slow time” associated with the slow buildup, and a “fast time” associated with the rapid discharge. In biological oscillators such as the heart, the period of oscillation provides a biological “clock.”
Figure 2. The van der Pol limit cycle, for $\epsilon = 0.2$, 1, and 5.
Understand clearly that the limit cycle phenomenon is possible only in nonlinear systems. Consider the case of small $\epsilon$, say, where we have a limit cycle that is approximately a circle of radius 2. If the system were linear, then the existence of that orbit would imply that the entire family of concentric circles would necessarily be trajectories as well, but they are not.
Besides the van der Pol example, other examples of differential equations exhibiting limit cycles are given in the exercises. In other cases a limit cycle can be unstable (repelling) in that other trajectories wind away from it, or semistable in exceptional cases, in that other trajectories wind toward it from the interior and away from it on the exterior, or vice versa.
7.5.2. Application to the nerve impulse and visual perception. The brain contains about $10^{12}$ (a million million) neurons, with around $10^{14}$ to $10^{15}$ interconnections. Within this complex network, information is encoded and transmitted in the form of electrical impulses. The basic building block is the individual neuron, or nerve cell, and the functioning of a single neuron as an input/output device is of deep importance and interest. Our purposes in discussing the neuron here are in connection with relaxation oscillations, and especially with the key role of nonlinearity in the design and functioning of our central nervous system.
A typical neuron is composed of a cell body that contains the nucleus and that emanates many dendrites and a single axon. The axon splits near its end into a number of terminals as shown schematically in Fig. 4. Dendrites are on the order of a millimeter long, and axons can be as short as that or as long as a meter. At the end of each terminal is a synapse, which is separated from a dendrite of an adjacent cell by a tiny synaptic gap. Electrical impulses, each of which is called an action potential, are generated near the cell body and travel down the axon. When an action potential arrives at a synapse, chemical signals in the form of neurotransmitter molecules are released into the synaptic gap and diffuse across that gap to a neighboring dendrite. These electrical signals to that neighboring neuron can be positive (excitatory) or negative (inhibitory). Each cell receives a great many such cues from other neurons. If the net excitation to a cell falls below some critical threshold value, then the cell will not fire, that is, it will not generate action potentials. If the net excitation is a bit above that threshold, then the cell will fire not just once, but repeatedly and at a certain frequency.
Let us consider briefly the generation of the nerve impulse. The nerve cell is surrounded by and also contains salt water. The salt molecules include sodium chloride (NaCl) and potassium chloride (KCl), and many of these molecules are ionized so that Na⁺, K⁺, and Cl⁻ ions are abundant both inside and outside the axon. Of these, Na⁺ and K⁺ are the key players insofar as the nerve impulse is concerned. Rather than being impermeable, the axon membrane has many tubular protein pores of two kinds: channels that can open or close and let either Na⁺ or K⁺ ions through in a passive manner, like valves, and pumps that (using energy from the metabolism of glucose and oxygen) actively eject Na⁺ ions (i.e., from inside the axon to outside) and bring in K⁺ ions. Through the action of these active and passive pores, and the physical mechanisms of diffusion and the repulsion/attraction of like/unlike charges, a differential in charge, and hence potential (voltage), is established across the axon membrane which, in the resting state, is 70 millivolts, positive outside.
Figure 4. Typical neuron.
If the net excitation arriving from other cells sufficiently reduces that voltage, at the cell body end of the axon, then a sequence of opening and closing of pores is established, which results in a flow of Na⁺ and K⁺ ions and hence a voltage “blip,” the action potential, proceeding down the axon. That wave is not like the flow of electrons in a copper wire, but rather like a water wave that results not from horizontal motion of the water, but from a differential up and down motion of water particles. This complicated process was pieced together by Alan Hodgkin and Andrew Huxley, in 1952, and is clearly discussed in the book by David H. Hubel.* Various electrical circuit analogs have been proposed, to model the nerve impulse, by Hodgkin and Huxley and others. They are all somewhat empirical and of the “fill and flush” type, that is, where a charge builds up and is then discharged through an electrical circuit, and they are described in the little book by F. C. Hoppensteadt.†
Of interest to us here is that the firing is repetitive (on the order of 100 impulses per second), and consists of a relaxation oscillation governed by the van der Pol equation (as discussed in Hoppensteadt). Further, it is known that as the excitation voltage is increased above the threshold, the magnitude of the action potential remains unchanged, but the firing frequency increases. If we plot the output (action potential) amplitude versus the input (excitation voltage) amplitude, the graph is as shown in Fig. 5. Since the graph of output amplitude versus input amplitude is not a straight line through the origin, the process must be nonlinear, which fact is also known through the governing equation being a van der Pol equation (or other such equation, depending upon the model adopted); indeed, any process where the output amplitude is zero until a critical threshold is reached is necessarily nonlinear. Since the individual neuron is a nonlinear device, surely the same is true for the entire central nervous system, and the natural and important question that asserts itself is “Why?” What is the functional purpose of that nonlinearity?
Figure 5. Input/output relation for a neuron.
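The statement that a threshold response cannot be linear can be illustrated with a small check of the scaling property (doubling the input must double the output). This is only a toy sketch: the functions `linear_response` and `threshold_response`, the threshold of 1.0, and the output level of 1.0 are hypothetical stand-ins for the input/output curve described above, not a model of a real neuron.

```python
def linear_response(u, gain=2.0):
    """A straight line through the origin: output proportional to input."""
    return gain * u


def threshold_response(u, threshold=1.0, level=1.0):
    """Zero output below the threshold, a fixed output amplitude at or above it."""
    return level if u >= threshold else 0.0


def is_homogeneous(response, inputs, scale=2.0, tol=1e-12):
    """Check response(scale*u) == scale*response(u) for every sample input u."""
    return all(abs(response(scale * u) - scale * response(u)) <= tol
               for u in inputs)


sample_inputs = [0.25, 0.6, 1.5]
print(is_homogeneous(linear_response, sample_inputs))     # scaling holds
print(is_homogeneous(threshold_response, sample_inputs))  # scaling fails
```

For the threshold response, the input 0.6 gives zero output while the doubled input 1.2 gives the full output, so doubling the input does not double the output.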
Let us attempt an answer. We have seen that nonlinear systems are more complex than linear ones. Since our nervous system is responsible for carrying out complex tasks, it seems reasonable that the system chosen should be nonlinear. We can be more specific if we look at a single type of task, say visual perception, which is so complex that it occupies around a third of the million million neurons in the brain.
Perhaps the most striking revelation in studying visual perception is in discovering that one’s visual perception is not a simple replica of the image that falls upon the retina but is an interpretation of that information, effected by visual processing that begins in the retina and continues up into the visual cortex of the brain. For instance, hold your two hands up, in front of your face, with one twice as far from your eyes as the other (about 8 and 16 inches). You should find that they look about the same size. Yet, if we replace your eyes with a camera, and take a picture, we see in the photo that one hand looks around twice as large as the other. Usually, we blame the camera for the “distortion,” but the camera simply shows you the same information that is picked up by your retinas; the distortion is introduced by the brain as it interprets and reconstructs the data before presenting it to you as visual consciousness.
*Eye, Brain, and Vision (New York: W. H. Freeman and Company, 1988).
†An Introduction to the Mathematics of Neurons (Cambridge: Cambridge University Press, 1986).
The latter is but one example of a principle of visual perception known as size constancy. The idea is that whereas the actual size of a physical object is invariant, the size of its retinal image varies dramatically as it is moved nearer or further from us. Size constancy is the processing, between the retina and visual consciousness, that compensates for such variation so as to stabilize our visual world. Thus, for instance, our hands look about the same size even when the retinal image of one is twice as large as that of the other. The functional advantage of that stabilization is to relieve our conscious mind of having to figure everything out; our brain does much of the figuring out and presents us with its findings so that our conscious attention can be directed to more pressing and singular matters.
Surely, size constancy requires a nonlinear perceptual system, for if we take the retinal image size as the input amplitude and the perceived size as the output amplitude, then a linear system would show us the two hands just as a camera does. (Remember that for a linear system if we double the input we double the corresponding output, if we triple the input we triple the output, and so on.)
In visual perception there are other constancy mechanisms as well, such as brightness constancy and hue constancy. To illustrate brightness constancy, consider the following simple experiment reported by Hubel in his book, cited above. We know from experience that a newspaper appears pretty much the same, whether we look at it in sunlight or in a dimly lit room: black print on white paper. Taking a newspaper and a light meter, Hubel found that the white paper reflected 20 times as much light outdoors as indoors, yet it looked about the same outdoors and indoors. If the perceptual system were linear, the white paper should have looked 20 times as bright outdoors compared to indoors. Even more striking, he found that the black letters actually reflected twice as much light outdoors as the white paper did indoors yet, whether indoors or outdoors, the black letters always looked black and white paper always looked white.
The point, then, is that these constancy mechanisms stabilize our perceived world and relieve our conscious mind from having to deal with newspapers that look 20 times brighter outdoors than indoors, hands that “grow” and “shrink” as they are moved to and fro, and so on, so that our conscious attention can be reserved for more singular matters such as not getting hit by a bus. These mechanisms are possible only by virtue of the nonlinearity of the central nervous system, which can be traced to the basic building block of that system, the individual neuron.
You have no doubt heard about “the whole being greater than the sum of its parts.” That idea expresses the essence of the Gestalt school of psychology which, by around 1920, supplanted the previously dominant molecular school of psychology, which had held that the whole is equal to the sum of the parts. To illustrate the Gestalt view, notice that the black dots in Fig. 6a are seen as a group of dots, not as a number of individual dots, and that the arrangement in Fig. 6b is seen as a triangle with sections removed, rather than as three bent lines. In fact, Max Wertheimer’s fundamental experiment, which launched the Gestalt concept in 1912, is as follows. If parallel lines of equal length are displayed on a screen successively, it is found that if the time interval and distance between them are sufficiently small, then they are perceived not as two separate lines but as a single line that moves laterally. (Today we recognize that idea as the basis of motion pictures!)
Figure 6. The whole is greater than the sum of its parts.
In mathematical terms, the molecular idea is reminiscent of the result for a linear differential equation $L[u] = f_1 + f_2 + \cdots + f_k$ that the response $u$ to the combined input is simply the sum of the responses $u_1, u_2, \ldots, u_k$ to the individual inputs $f_1, f_2, \ldots, f_k$. We suggest here that, in effect, the contribution of the Gestaltists was to recognize the highly nonlinear nature of the perceptual system, even if they did not think of it or express it in those terms. We say more about the far-reaching effects of that nonlinearity upon human behavior in the next section.
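The superposition result referred to above can be stated in one line. In this sketch, $u_j$ is assumed to denote a response to the single input $f_j$, that is, $L[u_j] = f_j$, and $L$ is assumed to be a linear operator.

```latex
\begin{equation*}
L[u_1 + u_2 + \cdots + u_k]
  = L[u_1] + L[u_2] + \cdots + L[u_k]
  = f_1 + f_2 + \cdots + f_k .
\end{equation*}
```

The first equality is exactly the linearity of $L$; it is this step that fails for a nonlinear operator, which is why the response of a nonlinear system to a combined input is not the sum of the individual responses.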
Closure. The principal idea of this section is that of limit cycles, which occur only for nonlinear systems. The classic example of an equation with a limit cycle solution is the van der Pol equation, which we discuss, but that is by no means the only equation that exhibits a limit cycle. That limit cycle solution is said to be a self-excited oscillation because even the slightest disturbance from the equilibrium point at the origin results in an oscillation that grows and inevitably approaches the limit cycle as $t \to \infty$. The case of large $\epsilon$ is especially important in biological applications, and the corresponding limit cycle solution is a relaxation oscillation characterized by alternate $t$-intervals of slow and rapid change. Since the existence of a limit cycle is of great importance, there are numerous theorems available that help one to detect whether or not a limit cycle is present, but we include only the theorem of Levinson and Smith since it is helpful in our discussion of the van der Pol equation.
Finally, we discuss the action potential occurring during the firing of a neuron, as a biological illustration of a relaxation oscillation, and we use that example to point out the nonlinear nature of the neuron and central nervous system, and the profound implications of that nonlinearity.
EXERCISES 7.5

1. (a) Obtain computer results analogous to those presented in Fig. 2 and 3 for the case where $\epsilon = 0.1$. What value do you think the period approaches as $\epsilon \to 0$? Explain.
(b) Obtain computer results analogous to those presented in Fig. 2 and 3, for $\epsilon = 10$. NOTE: Be sure to make your $t$-integration step size small enough, namely, small compared to the time intervals of rapid change of $x(t)$.

2. Use Theorem 7.5.1 to show that the following equations admit limit cycles.
(a)x ~ (= a eta Ot Ge 0 hog=

3. Identify and classify any singularities of the given equation in the $x, y$ phase plane, where $x' = y$. Argue as convincingly as you can for or against the existence of any limit cycles, their shape, and their stability or instability. You should be able to tell a great deal from the equation itself, even in advance of any computer simulation.
(a) $x'' + (x^2 + x'^2 - 1)x' + x = 0$
(b) $x'' + (1 - x^2 - x'^2)x' + x = 0$

4. (Hopf bifurcation) (a) Show that the nonlinear system
$$x' = \epsilon x + y - x(x^2 + y^2), \tag{4.1a}$$
$$y' = -x + \epsilon y - y(x^2 + y^2) \tag{4.1b}$$
can be simplified to
$$r' = r(\epsilon - r^2), \tag{4.2a}$$
$$\theta' = -1 \tag{4.2b}$$
by the change of variables $x = r\cos\theta$, $y = r\sin\theta$ from the Cartesian $x, y$ variables to the polar $r, \theta$ variables. HINT: Putting $x = r\cos\theta$, $y = r\sin\theta$ into (4.1) gives differential equations each of which contains both $r'$ and $\theta'$ on the left-hand side. Suitable linear combinations of those equations give (4.2a,b), respectively. We suggest that you use the shorthand $\cos\theta = c$ and $\sin\theta = s$ for brevity.
(b) From (4.2) show that the origin in the $x, y$ plane is a stable focus if $\epsilon < 0$ and an unstable focus if $\epsilon > 0$, and show that working from (4.1), instead, one obtains the same classification.
(c) Show from (4.2) that $r(t) = \sqrt{\epsilon}$ is a trajectory (if $\epsilon > 0$) and, in fact, a stable limit cycle. NOTE: Observe that zero is a bifurcation value of $\epsilon$. As $\epsilon$ increases, a limit cycle is born as $\epsilon$ passes through the value zero, and its radius increases with $\epsilon$. This is known as a Hopf bifurcation.
(d) Modify (4.1) so that it gives an unstable limit cycle at $r = \sqrt{\epsilon}$, instead.

5. The box in the circuit shown in the figure represents an “active element” such as a semiconductor or vacuum tube, the voltage drop across which is a known function $f(i)$ of the current $i$. Thus, Kirchhoff’s voltage law gives $L\,\dfrac{di}{dt} + f(i) + \dfrac{1}{C}\displaystyle\int i\,dt = 0$.
(a) If $f$ is of the form $f(i) = ai^3 - bi$, show that one obtains
$$L i'' + (3ai^2 - b)\,i' + \frac{1}{C}\,i = 0. \tag{5.1}$$
(b) Show that by a suitable scaling of both the independent and dependent variables one can obtain from (5.1) the van der Pol equation
$$I'' - \epsilon(1 - I^2)I' + I = 0, \tag{5.2}$$
where primes denote differentiation with respect to the new time variable $\tau$, where $t = \alpha\tau$ and $i = \beta I$. That is, find $\alpha$, $\beta$, and $\epsilon$ in terms of $L$, $C$, $a$, and $b$.

6. (Rayleigh’s equation and van der Pol relaxation oscillation) (a) Show that if we set $x = z'$ in the van der Pol equation $x'' - \epsilon(1 - x^2)x' + x = 0$ and integrate once, we obtain $z'' - \epsilon(z' - z'^3/3) + z = C$, where $C$ is a constant. Setting $z = u + C$, to absorb the $C$, obtain Rayleigh’s equation
$$u'' - \epsilon\left(1 - \frac{u'^2}{3}\right)u' + u = 0 \tag{6.1}$$
on $u(t)$. The latter was studied by Lord Rayleigh (John William Strutt, 1842-1919) in connection with the vibration of a clarinet reed.
(b) Letting $u' = v$, reduce (6.1) to a system of two equations. Show that the only singular point of that system is $(0, 0)$, and classify its type.
(c) Choosing initial values for $u$ and $v$, use computer software to obtain phase portraits and plots of $u(t)$ versus $t$, much as we have in Fig. 2 and 3, for $\epsilon = 0.2$, 1, and 5. For each of those $\epsilon$’s, estimate the amplitude and period of the limit cycle solution.
(d) To study the relaxation oscillation ($\epsilon \to \infty$) of the van der Pol equation it is more convenient to work with the Rayleigh equation (6.1), as we shall see. With $u' = v$, let us scale the $u$ variable according to $u = \epsilon w$. Show that the resulting system is
$$\epsilon w' = v, \tag{6.2a}$$
$$v' = \epsilon\left(v - \frac{v^3}{3}\right) - \epsilon w, \tag{6.2b}$$
so
$$\frac{dv}{dw} = \epsilon^2\,\frac{v - v^3/3 - w}{v}. \tag{6.3}$$
As $\epsilon \to \infty$ we see from (6.3) that $dv/dw \to \infty$ at all points in the $v, w$ phase plane except on the curve $v - v^3/3 = w$. So for any initial point, say $P$, explain why the solution is as shown in the figure. For instance, why is the direction downward from $P$, and why does the trajectory jump from $S$ to $T$ and from $U$ to $R$? The loop $RSTUR$ is traversed repeatedly and is the limit cycle. Finally, use the figure to sketch $u(t)$ and hence the limit cycle solution $x(t)$ of the van der Pol equation (for $\epsilon \to \infty$). The result should be similar to the $\epsilon = 5$ part of Fig. 3. (HINT: Recall the expression for the phase velocity $s'$ in Exercise 8 of Section 7.2.) Finally, explain why it is convenient to work with the Rayleigh equation in order to find the relaxation oscillation of the van der Pol equation.
7. In connection with Fig. 1b we suggested that over the limit cycle the energy gain, while the representative point is in the strip $|x| < 1$, exactly balances the energy loss while the point is outside of that strip. Let us explore that idea.
(a) Multiplying (1) through by $dx$ and integrating over one cycle, show that
$$\left.\left(\tfrac{1}{2}x'^2 + \tfrac{1}{2}x^2\right)\right|_i^f = \epsilon\int_i^f (1 - x^2)\,x'\,dx, \tag{7.1}$$
where $i$ and $f$ simply denote the initial point and final point, respectively. Explain, further, why (7.1) reduces to
$$\int_i^f (1 - x^2)\,x'\,dx = 0, \tag{7.2}$$
which expresses the balance stated above, namely, that the net work done over one cycle by the $\epsilon(1 - x^2)x'$ term in (1) is zero.
(b) For the case of small $\epsilon$ ($0 < \epsilon \ll 1$), seek the limit cycle solution in the form $x(t) \approx a\cos t$. Putting the latter into (7.2), show that one obtains $a = 2$ as the radius of the circular limit cycle, as claimed in the text. NOTE: Put differently, (7.1) is equivalent to
$$E\,\Big|_i^f = \epsilon\int_i^f (1 - x^2)\,x'\,dx, \tag{7.3}$$
where $E = \tfrac{1}{2}x'^2 + \tfrac{1}{2}x^2$. That is, the change in the total energy $E$ over one cycle is equal to the net work done by the $\epsilon(1 - x^2)x'$ term. Over one cycle of a periodic solution that change is zero which, once again, gives (7.2).
7.6 The Duffing Equation: Jumps and Chaos
7.6.1. Duffing equation and the jump phenomenon. Besides the van der Pol
equation, also of great importance is the Duffing equation:
$$m x'' + r x' + \alpha x + \beta x^3 = F_0 \cos \Omega t, \tag{1}$$
studied by G. Duffing around 1918. Whereas primary interest in the van der Pol equation is in the unforced equation and its self-excited limit cycle oscillation, most of the interest in the Duffing equation involves the various steady-state oscillations that can arise in response to a harmonic forcing function such as $F_0 \cos \Omega t$.
Physically, (1) arises in modeling the motion of a damped, forced, mechanical oscillator of mass $m$ having a nonlinear spring. That is, the spring force is not $kx$ but $\alpha x + \beta x^3$. We assume that $\alpha > 0$ but that $\beta$ can be positive (for a “hard spring”) or negative (for a “soft spring”).
The linear version of (1), where $\beta = 0$, is discussed in Section 3.8, and an important result there consisted of the amplitude response curves, namely, the graphs of the amplitude of the steady-state vibration versus the driving frequency $\Omega$ for various values of $F_0$. For the linear case we obtained two linearly independent homogeneous solutions and a particular solution and used linearity and superposition to form from them the general solution, which contained all possible solutions. Understand that because equation (1) is nonlinear, that approach is not applicable.
Consider first the undamped case ($r = 0$), and let $m = 1$ for simplicity, so (1) becomes
$$x'' + \alpha x + \beta x^3 = F_0 \cos \Omega t. \tag{2}$$
Further, suppose that $\beta$ is small. Since (1) is nonlinear, we expect it to have a wealth of different sorts of solutions. Of these, particular interest attaches to the so-called harmonic response
$$x(t) \approx A \cos \Omega t \quad (\text{for } \beta \text{ small}) \tag{3}$$
at the same frequency as the driving force. As shown in the exercises, pursuing an approximate solution of the form (3) yields the amplitude-frequency relation
$$\Omega^2 = \alpha + \tfrac{3}{4}\beta A^2 - \frac{F_0}{A}, \tag{4}$$
which gives the response curves shown in Fig. 1. For the linear case, where $\beta = 0$, (4) reduces to $A = F_0/(\alpha - \Omega^2)$, which result agrees with the amplitude-frequency relation found in Section 3.8. For $\beta > 0$ the curves bend to the right (shown), and for $\beta < 0$ they bend to the left (not shown). Thus, the effect of the nonlinear $\beta x^3$ term in (2) is to cause the response curves to bend to one side or the other.
Figure 1. Amplitude response curves; undamped.
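One way to arrive at an amplitude-frequency relation of this kind is sketched below. It is not necessarily the route taken in the exercises: it assumes the trial form $x = A\cos\Omega t$ is substituted into the undamped equation $x'' + \alpha x + \beta x^3 = F_0\cos\Omega t$, and that the third-harmonic term produced by the cubic is simply neglected.

```latex
\begin{align*}
x'' + \alpha x + \beta x^3
  &= (\alpha - \Omega^2)A\cos\Omega t + \beta A^3\cos^3\Omega t \\
  &= \left[(\alpha - \Omega^2)A + \tfrac{3}{4}\beta A^3\right]\cos\Omega t
     + \tfrac{1}{4}\beta A^3\cos 3\Omega t,
  \qquad \text{using } \cos^3\theta = \tfrac{3}{4}\cos\theta + \tfrac{1}{4}\cos 3\theta . \\
\intertext{Matching the coefficients of $\cos\Omega t$ with the forcing $F_0\cos\Omega t$ and dropping the $\cos 3\Omega t$ term gives}
(\alpha - \Omega^2)A + \tfrac{3}{4}\beta A^3 &= F_0
  \quad\Longrightarrow\quad
  \Omega^2 = \alpha + \tfrac{3}{4}\beta A^2 - \frac{F_0}{A}.
\end{align*}
```

Setting $\beta = 0$ in the matched equation recovers $A = F_0/(\alpha - \Omega^2)$, the linear result.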
Recall from Section 3.8 that for the linear system ($\beta = 0$), the amplitude $|A|$ is infinite when the system is driven at its natural frequency $\sqrt{\alpha}$. [More precisely, we saw that the solution form $x(t) = A \cos \Omega t$ simply does not work for the system $x'' + \alpha x = F_0 \cos \Omega t$ if $\Omega = \sqrt{\alpha}$, but that the method of undetermined coefficients gives $x(t) = \dfrac{F_0}{2\Omega}\, t \sin \Omega t$, which growing oscillation is known as resonance.] However, because of the bending of the response curves, resonance cannot occur in the nonlinear case. That is, we see from Fig. 1 that if $\beta \neq 0$ then at each driving frequency $\Omega$ the response amplitude $|A|$ is finite.
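The finiteness of the response at the natural frequency can be made concrete with a short calculation. The sketch below assumes the undamped amplitude-frequency relation in the form $\Omega^2 = \alpha + \tfrac{3}{4}\beta A^2 - F_0/A$; the function names and the sample values $\alpha = 1$, $\beta = 0.1$, $F_0 = 0.5$ are illustrative assumptions, not values from the text.

```python
import math


def omega_squared(A, alpha, beta, F0):
    """Driving frequency squared that corresponds to amplitude A (A != 0)."""
    return alpha + 0.75 * beta * A * A - F0 / A


def amplitude_at_natural_frequency(beta, F0):
    """Amplitude A on the response curve where Omega^2 = alpha.

    Setting Omega^2 = alpha in the relation leaves 0.75*beta*A^3 = F0,
    so A is the real cube root of F0 / (0.75*beta); beta must be nonzero.
    """
    ratio = F0 / (0.75 * beta)
    return math.copysign(abs(ratio) ** (1.0 / 3.0), ratio)


alpha, beta, F0 = 1.0, 0.1, 0.5

# Nonlinear case: a finite amplitude even when driven at Omega = sqrt(alpha).
A_natural = amplitude_at_natural_frequency(beta, F0)
print(A_natural, omega_squared(A_natural, alpha, beta, F0))

# Linear case (beta = 0): the relation reduces to A = F0 / (alpha - Omega^2).
A_linear = F0 / (alpha - 0.5)          # amplitude for Omega^2 = 0.5
print(omega_squared(A_linear, alpha, 0.0, F0))
```

Tabulating `omega_squared` over a range of positive and negative `A` values and plotting $|A|$ against $\Omega$ would trace out bent response curves of the kind described for Fig. 1.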
What is the effect of including damping, of letting $r$ be positive in (1) rather than zero? In that case we need to allow for a phase shift (as in Section 3.8), and seek $x(t) \approx A\cos(\Omega t + \Phi)$ in place of (3). The result would be a modified amplitude-frequency relation, in place of (4), and a “capping off” of the response curves as shown in Fig. 2a.
If $\Omega = \Omega_1$, for instance, then the response is at $P_1$ in Fig. 2b. Suppose we can vary the driving frequency $\Omega$ continuously by turning a control knob, like the volume knob on a radio. If we increase $\Omega$ slowly (remember that $\Omega$ is regarded as a constant in this analysis, so we need to increase it very slowly), then the representative point moves to the right along the response curve. But what happens when it reaches $P_2$, where the response curve has a vertical tangent? Numerical simulation reveals that the point jumps down to $P_3$, where it can then continue moving rightward on the response curve if $\Omega$ is increased further. That is, there is a transient
Figure 2. Amplitude response curves; damped.
Chapter 7. Qualitative Methods: Phase Plane and Nonlinear Differential Equations
scope.”
ed. (Oxford: Oxford University Press, 1987).
someone with an eating disorder might fast for a period of time, then jump almost
instantaneously to binging, and vice versa. For a readable account of the modeling
of such systems as these, see E. C. Zeeman’s Catastrophe Theory (Reading, MA:
Addison-Wesley, 1977).
7.6.2. Chaos. Consider the physical system shown in Fig. 3, a box containing a slender vertical steel reed cantilevered from the “ceiling” and two magnets attached to the “floor”; $x$ is the horizontal deflection of the end of the reed. In equilibrium, the reed is stationary, with its tip at one magnet or the other. However, if we vibrate the box periodically in the horizontal direction, we can (if the magnets are not too strong) shake the reed loose from its equilibrium position and set it into motion. This experiment was carried out by F. Moon and P. Holmes,* to study chaos. They modeled the system by the Duffing equation
$$x'' + r x' - x + x^3 = F_0 \cos \Omega t, \tag{5}$$
where $r$ is a (positive) damping coefficient and $F_0$ and $\Omega$ are the forcing function strength and frequency, respectively. The $-x + x^3$ terms approximate the force induced on the reed by the two competing magnets. It is zero at $x = 0, \pm 1$, which are therefore equilibrium points. To classify the equilibrium points, consider the unforced equation $x'' + r x' - x + x^3 = 0$ or, equivalently,
$$x' = y, \tag{6a}$$
$$y' = -r y + x - x^3. \tag{6b}$$
We leave it for the exercises for you to show that the origin $(x, y) = (0, 0)$ is an unstable equilibrium point, namely a saddle, and that $(\pm 1, 0)$ are stable equilibrium points (stable foci if $r < \sqrt{8}$ and stable nodes if $r > \sqrt{8}$). For the undriven system ($F_0 = 0$), imagine displacing the reed tip to $x = 0.5$, say, and releasing it from rest. Then it will undergo a damped motion about $x = 1$. If instead we release it from $x = -0.5$, say, it will undergo a damped motion about $x = -1$.
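The classification quoted above can be cross-checked from the linearization of the unforced system. The sketch below assumes the system $x' = y$, $y' = -ry + x - x^3$, whose Jacobian at an equilibrium $(x_e, 0)$ is $\begin{pmatrix} 0 & 1 \\ 1 - 3x_e^2 & -r \end{pmatrix}$, with characteristic equation $\lambda^2 + r\lambda - (1 - 3x_e^2) = 0$. The helper names and the sample damping values are illustrative assumptions, and the labels describe the linearized system only.

```python
import cmath


def jacobian_eigenvalues(x_eq, r):
    """Eigenvalues of the Jacobian of x' = y, y' = -r*y + x - x^3 at (x_eq, 0)."""
    c = 1.0 - 3.0 * x_eq ** 2               # lower-left Jacobian entry
    disc = cmath.sqrt(r * r + 4.0 * c)      # square root of the discriminant
    return (-r + disc) / 2.0, (-r - disc) / 2.0


def classify(x_eq, r, tol=1e-12):
    """Name the type of the linearized equilibrium from its eigenvalues."""
    lam1, lam2 = jacobian_eigenvalues(x_eq, r)
    if abs(lam1.imag) > tol:                # complex pair: spiral behavior
        if abs(lam1.real) <= tol:
            return "center (linearized)"
        return "stable focus" if lam1.real < 0 else "unstable focus"
    if lam1.real * lam2.real < 0:           # real eigenvalues of opposite sign
        return "saddle"
    return "stable node" if max(lam1.real, lam2.real) < 0 else "unstable node"


print(classify(0.0, 0.3))    # origin
print(classify(1.0, 0.3))    # light damping, r < sqrt(8)
print(classify(-1.0, 3.0))   # heavy damping, r > sqrt(8)
```

At $(\pm 1, 0)$ the discriminant is $r^2 - 8$, which is where the dividing value $r = \sqrt{8}$ between foci and nodes comes from.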
What will happen if we force the system ($F_0 > 0$)? We can imagine the reed undergoing a steady-state oscillation about $x = +1$ or $x = -1$, depending upon the initial conditions. To encourage physical insight, it is useful to consider the potential energy $V(x)$ associated with the magnetic force $F(x) = x - x^3$. Recalling from Section 7.4.2 that $F(x) = -V'(x)$, we have
$$V(x) = -\frac{x^2}{2} + \frac{x^4}{4}. \tag{7}$$
Thus, in place of a reed/magnet system we can conceptualize the system more intuitively as a mass in a double well as sketched in Fig. 4. That is, its gravitational potential energy is $V(x) = mgy = mg(-x^2/2 + x^4/4)$ which, except for the scale factor $mg$, is the same as $V(x)$ for the reed/magnet system, given in (7).
*F. C. Moon and P. J. Holmes, A Magnetoelastic Strange Attractor, Journal of Sound and Vibration, Vol. 65, pp. 275-296.
If the mass sits at the bottom of a well, then if we vibrate the system horizontally we expect the mass to oscillate within that well. If we vibrate so energetically that the mass jumps from one well to the other, then the situation becomes more complicated.
Rather than carry out the physical experiment, let us simulate it numerically by using computer software to solve (5). Let us fix $r = 0.3$ and $\Omega = 1.2$, and use the initial conditions $x(0) = 1$ and $x'(0) = 0.2$, say, so that if $F_0$ is not too large we expect an oscillation near $x = 1$ (i.e., in the right-hand well).* Our plan is to use successively larger $F_0$’s and see what happens. The results are quite striking (like those obtained by Moon and Spencer). They are presented in Fig. 5, both in the $x, y$ phase plane and as $x(t)$ versus $t$.
In Fig. 5a we set $F_0 = 0.2$ and find that after a brief transient period there results a steady-state oscillation near $x = 1$, as anticipated. That oscillation is of the same period as the forcing function (namely, $2\pi/\Omega = 2\pi/1.2 \approx 5.236$) so it is called a period-1 oscillation, or harmonic oscillation. Period-1 oscillations persist up to around $F_0 = 0.27$, but for $F_0 > 0.27$ different solution types arise.
For $F_0 = 0.28$ the solution is still periodic, but it takes two loops (in the $x, y$ plane) to complete one period and the period is now doubled, namely, $4\pi/\Omega$. Thus, it is called a period-2 oscillation, or a subharmonic oscillation of order 1/2. In Fig. 5b-5f we omit the transient and display only the steady-state periodic solution so as not to obscure that display. Observe, in Fig. 5b, the point where the trajectory crosses itself. That crossing does not violate the existence and uniqueness theorem given in Section 7.3.1 because equation (5) is nonautonomous and the phase plane figure shows the projection of the non-self-intersecting curve in three-dimensional $x, y, t$ space onto the $x, y$ plane.
If we increase $F_0$ further, to 0.29, the forcing is sufficiently strong so that during the transient phase the mass (reed) is driven out of the right-hand potential well and ends up in a period-4 oscillation about the left-hand well. To observe this result we need to be patient and run the calculation to a sufficiently large time, namely, beyond $t \approx 400$. This period doubling continues as $F_0$ increases from 0.29 up to around 0.30. For $F_0 > 0.30$ a period-5 oscillation results that now encompasses both stable equilibrium points (Fig. 5d).
The regime $0.37 < F_0 < 0.65$ is found to be rather chaotic, with essentially random motions, as seen in Fig. 5e for the case $F_0 = 0.5$.
Reviewing these results, observe that as we increased $F_0$ the period of the motion increased until, when $F_0$ was increased above 0.37, periodicity was lost altogether and the motion became chaotic. (We can think of that motion as periodic but with infinite period.) It would be natural to expect that a further increase of $F_0$ would lead to even greater chaos (if one were to quantify degree of chaos), yet we find that if $F_0$ is increased to 0.65 we once again obtain a periodic solution, namely, the period-2 solution shown in Fig. 5f, and if $F_0$ is increased further to 0.73 then we obtain a period-1 solution (not shown).
In summary, we see that the forced Duffing equation admits a great variety of periodic solutions and chaotic ones as well, and that these different regimes correspond to different intervals on an $F_0$ axis. (We chose to hold $r$ and $\Omega$ fixed and to vary only $F_0$, but we could have varied $r$ and/or $\Omega$ as well.) It is possible to predict analytically how the solution type varies with $F_0$, $r$, and $\Omega$, but that analysis is beyond the scope of this introductory discussion.*
*These parameter values are the same as those chosen in Section 12.6 of D. W. Jordan and P. Smith (ibid). We refer you to that source for a more detailed discussion than we offer here.
[Figure 5. Phase plane ($x, y$) trajectories and $x(t)$ versus $t$ for (a) $F_0 = 0.2$, (b) $F_0 = 0.28$, (c) $F_0 = 0.29$, (d) $F_0 = 0.37$, (e) $F_0 = 0.5$, (f) $F_0 = 0.65$.]
Having classified the response in Fig. 5e as chaotic, it behooves us to clarify what we mean by that. A reasonable working definition of chaos is behavior in a deterministic system, over time, which is aperiodic and which is sensitive to initial conditions.
By a system being deterministic we mean that the governing differential equation(s) and initial conditions imply the existence of a unique solution over subsequent time. Whether we are able to find that solution analytically, or whether we need to use computer simulation, is immaterial. For instance, for given values of $r$, $F_0$, $\Omega$, the system consisting of equation (5), together with initial conditions $x(0)$ and $x'(0)$, is deterministic. The choice $r = 0.3$, $F_0 = 0.37$, $\Omega = 1.2$, $x(0) = 1$, and $x'(0) = 0.2$, say, implies the unique response shown in Fig. 5d. If we rerun the numerical solution or solve the problem analytically (if we could), we obtain the same solution as shown in the figure. Likewise even for the chaotic response shown in Fig. 5e.
By the response being aperiodic we simply mean that it does not approach a
periodic solution or a constant.
To illustrate what is meant by sensitivity to initial conditions, let us rerun the
case corresponding to Fig. 5e, but with the initial conditions changed from $x(0) = 1$
and $x'(0) = 0.2$ to $x(0) = 1$ and $x'(0) = 0.2000000001$. Observe that the results
(Fig. 6) bear virtually no resemblance to those in Fig. 5e. This circumstance is of
great significance because if the initial conditions are known to only six significant
figures, say, then the task of predicting the response is hopeless!
[Figure 6. Sensitivity to initial conditions.]
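The rerun described above can be sketched numerically. The following is a minimal sketch, not the Maple computation used for the figures. It assumes that equation (5) has the form $x'' + r x' - x + x^3 = F_0 \cos\Omega t$ (equation (5) is not restated here, so that form is an assumption), it uses a fixed-step fourth-order Runge-Kutta method, and the function names and step size are hypothetical choices.

```python
import math

def duffing_rhs(t, x, v, r=0.3, F0=0.5, omega=1.2):
    """Right side of the first-order system for the ASSUMED form of (5):
    x'' + r x' - x + x^3 = F0 cos(omega t), written as x' = v, v' = ..."""
    return v, -r * v + x - x**3 + F0 * math.cos(omega * t)

def integrate(x0, v0, t_end=100.0, dt=0.01, **params):
    """Classical fourth-order Runge-Kutta; returns the list of x values."""
    steps = int(round(t_end / dt))
    t, x, v = 0.0, x0, v0
    xs = [x]
    for _ in range(steps):
        k1x, k1v = duffing_rhs(t, x, v, **params)
        k2x, k2v = duffing_rhs(t + dt / 2, x + dt / 2 * k1x, v + dt / 2 * k1v, **params)
        k3x, k3v = duffing_rhs(t + dt / 2, x + dt / 2 * k2x, v + dt / 2 * k2v, **params)
        k4x, k4v = duffing_rhs(t + dt, x + dt * k3x, v + dt * k3v, **params)
        x += dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        v += dt / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += dt
        xs.append(x)
    return xs

def max_separation(dv0=1e-10, **kwargs):
    """Largest |x_a(t) - x_b(t)| for two runs whose x'(0) differ by dv0."""
    a = integrate(1.0, 0.2, **kwargs)
    b = integrate(1.0, 0.2 + dv0, **kwargs)
    return max(abs(p - q) for p, q in zip(a, b))
```

One way to explore the idea is to compare `max_separation(F0=0.5, t_end=200.0)` with `max_separation(F0=0.37, t_end=200.0)`; the step size and time span may need adjusting before the comparison is meaningful.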
Another well known example of chaos is provided by the Lorenz equations
$x' = p(y - x),$
$y' = (q - z)x - y,$
$z' = xy - rz,$    (8)
where $p, q, r$ are constants. This system was studied by the mathematical meteorologist E. Lorenz in 1963 in connection with the Bénard problem, whereby one
seeks the effect of heating a horizontal fluid layer from below.† That problem is of
fundamental interest in meteorology because the earth, having been heated by the
sun during the day, radiates heat upward into the atmosphere in the evening, thus
destabilizing the atmospheric layer above it. Lorenz's contribution was in discovering the chaotic nature of the solution for certain ranges of the physical parameters
$p, q, r$, thereby suggesting the impossibility of meaningful long-range weather prediction. Some discussion of (8) is left for the exercises.
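As a small illustration of (8), the sketch below codes the right-hand sides, reading (8) as $x' = p(y - x)$, $y' = (q - z)x - y$, $z' = xy - rz$, and lists the equilibrium points obtained by setting all three to zero. The function names are hypothetical, and the parameter values mentioned afterward are commonly quoted ones, not values taken from this text.

```python
import math

def lorenz_rhs(x, y, z, p, q, r):
    """Right-hand sides of the Lorenz equations (8)."""
    return p * (y - x), (q - z) * x - y, x * y - r * z

def equilibria(q, r):
    """Points where all three right-hand sides vanish.
    From x' = 0: y = x.  From y' = 0: x = 0 or z = q - 1.
    From z' = 0 with z = q - 1: x^2 = r (q - 1)."""
    pts = [(0.0, 0.0, 0.0)]
    if r * (q - 1) > 0:
        s = math.sqrt(r * (q - 1))
        pts += [(s, s, q - 1.0), (-s, -s, q - 1.0)]
    return pts
```

For instance, `equilibria(28, 8/3)` returns the origin together with two symmetric points, and `lorenz_rhs` evaluated at each of them (with any `p`) is zero up to roundoff.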
*See Section 12.6 in Jordan and Smith (ibid).
†E. Lorenz, Deterministic Nonperiodic Flow, Journal of the Atmospheric Sciences, Vol. 20, pp.
130-141, 1963.
Perhaps the classic problem of chaos is that of turbulence, in fluid mechanics,
be it in connection with the chaotic eddies and mixing behind a truck on a highway
or the turbulent breakup of a rising filament of smoke.
To appreciate the revolution in physics that has resulted from recent work on
chaos, one needs to understand the euphoria that greeted the birth of Newtonian
mechanics and the calculus, according to which both the past and the future are
contained in the system of differential equations and initial conditions at the present
instant. In the words of Ivar Ekeland,* “Past and future are seen as equivalent, since
both can be read from the present. Mathematics travels back in time as easily as
a wanderer walks up a frozen river.” As Ekeland points out, that statement is not
quite true for, as we now know and as was understood by Poincaré even a century
ago, deterministic nonlinear systems can turn out to be chaotic, in which case they
are useless for long-term prediction.
Closure. The common thread in this section is the Duffing equation (1). For
a@> 0 we study (1) in connection with the reed/magnet system of Moon and
Holmes, shown in Fig. 3. Numerical solution of the governing equation (5), for
a sequence of increasing $F_0$ values, leads to a variety of solution types: a harmonic
response, various subharmonic responses, and even chaotic responses. The approach to chaos, as we increase $F_0$, is typical in that the onset of chaos is preceded
by a sequence of period doublings.
We define chaos as behavior in a deterministic system (which must be nonlinear if it is to exhibit chaos) over time, which is aperiodic and so sensitive to initial
conditions that accurate long-range predictions of the solution are not possible.
Computer software. To generate the responses shown in Fig. 5 and 6 we use
the Maple phaseportrait command. However, it is worth mentioning how we obtain
the graphs in Fig. 1 and 2, because (4) does not give $A$ explicitly but only implicitly
as a function of $\Omega$. For instance, suppose we wish to plot the graphs of $y$ versus
$x$, over $0 < x < 3$ and $0 < y < 4$, for the functions $y(x)$ given implicitly by the
equations $y - y^3 = x$ and $4y - y^3 = x$. First, enter
with(plots):
and return, to access the subsequent plotting command. Then use the implicitplot
command. Enter
implicitplot({y - y^3 = x, 4*y - y^3 = x}, x = 0..3, y = 0..4, numpoints = 1000);
and return. Here, numpoints indicates the number of points to be used. To plot the
single function $y - y^3 = x$, use y - y^3 = x in place of {y - y^3 = x, 4*y - y^3 = x}.
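For readers without Maple, the same two curves can be traced without any implicit-plot routine, because each relation gives $x$ explicitly in terms of $y$. The sketch below is a Python substitute for the Maple command, not a reproduction of it: it samples $y$, computes $x = cy - y^3$ for a coefficient $c$ (1 or 4 here), and keeps the points that fall in the plotting window. The function name and defaults are hypothetical.

```python
def curve_points(coeff, y_min=0.0, y_max=4.0, x_min=0.0, x_max=3.0, n=1000):
    """Points (x, y) on coeff*y - y**3 = x inside the window, found by sampling y."""
    pts = []
    for i in range(n + 1):
        y = y_min + (y_max - y_min) * i / n
        x = coeff * y - y**3          # the relation solved for x
        if x_min <= x <= x_max:
            pts.append((x, y))
    return pts
```

The returned lists can be handed to any plotting tool; `curve_points(1)` and `curve_points(4)` correspond to the two relations in the text.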
*Mathematics and the Unexpected (Chicago: Chicago University Press, 1988).
EXERCISES 7.6
1. (Derivation of the Duffing amplitude-frequency relation) We state in the text that if one seeks an approximate harmonic solution $x(t) \approx A\cos\Omega t$ to
$x'' + \alpha x + \beta x^3 = F_0 \cos\Omega t,$    (1.1)
then one obtains the amplitude-frequency relation
$\Omega^2 = \alpha + \frac{3}{4}\beta A^2 - \frac{F_0}{A}$    (1.2)
discovered by Duffing. A modern derivation of (1.2) would probably use a so-called singular perturbation method of strained variables, but here we will pursue a simpler iterative approach which is essentially that of Duffing; namely, we replace (1.1) by the iterative scheme
$x_{n+1}'' = -\alpha x_n - \beta x_n^3 + F_0 \cos\Omega t,$    (1.3)
choose the initial iterate as $x_0(t) = A\cos\Omega t$, and then use (1.3) to find the successive iterates $x_1(t), x_2(t), \ldots$. It is surely not obvious whether that procedure will work, so it makes sense to try it out first for the simple linear case where $\beta = 0$, for which we know the exact solution.
(a) In that case ($\beta = 0$), show that if we seek a harmonic solution $x(t) = A\cos\Omega t$ of the Duffing equation (1.1) with $\beta = 0$, we obtain $A = F_0/(\alpha - \Omega^2)$, and hence the exact solution
$x(t) = \frac{F_0}{\alpha - \Omega^2}\cos\Omega t.$    (1.4)
(b) Next, use (1.3) to generate $x_1(t), x_2(t), \ldots$, for $\beta = 0$, and show that
$x_1(t) = \frac{\alpha A - F_0}{\Omega^2}\cos\Omega t,$
$x_n(t) = \left\{ \left(\frac{\alpha}{\Omega^2}\right)^n A - \frac{F_0}{\Omega^2}\left[ 1 + \frac{\alpha}{\Omega^2} + \cdots + \left(\frac{\alpha}{\Omega^2}\right)^{n-1} \right] \right\} \cos\Omega t.$    (1.5)
That is, put $x_0(t) = A\cos\Omega t$ into the right side of (1.3) and integrate twice to obtain $x_1$. Then put that $x_1$ into the right side of (1.3) and integrate twice to obtain $x_2$, and so on. By the time you reach $x_3$, the general result shown in (1.5) should be apparent.
(c) Recalling that the geometric series $1 + x + x^2 + x^3 + \cdots$ converges to $1/(1 - x)$ if $|x| < 1$ and diverges otherwise, show that the $x_n(t)$ sequence given by (1.5) does indeed converge to the exact solution (1.4) as $n \to \infty$, provided that $|\alpha/\Omega^2| < 1$.
(d) In fact, show that if we equate the coefficients of $\cos\Omega t$ in $x_0(t) = A\cos\Omega t$ and $x_1(t) = \frac{\alpha A - F_0}{\Omega^2}\cos\Omega t$, we happen to obtain $A = F_0/(\alpha - \Omega^2)$, which agrees with the exact solution (1.4)!
(e) In view of the striking success in part (d), we are encouraged to expect good results even for the nonlinear case where $\beta \neq 0$. Thus, put $x_0(t) = A\cos\Omega t$ into the right side of (1.3) and integrate twice to obtain $x_1(t)$. Then, as in (d), equate the coefficients of $\cos\Omega t$ in $x_1(t)$ and $x_0(t)$, and show that the result is Duffing's relation (1.2). HINT: The identity $\cos^3\theta = (3\cos\theta + \cos 3\theta)/4$ should be helpful.
2. (Computer problem regarding the Duffing jump phenomenon) For the undamped case the amplitude-frequency relation is given by (4). For the damped case it is given by
$\left[ (\alpha - \Omega^2) A + \frac{3}{4}\beta A^3 \right]^2 + (r\Omega A)^2 = F_0^2.$    (2.1)
Throughout parts (a)-(g) let $F_0 = 2$, $\alpha = 1$, $\beta = 0.4$, and $r = 0.3$, for definiteness.
(a) Use (2.1) to generate a plot of $A$ versus $\Omega$ as we did in Fig. 2b. NOTE: Actually, there is no need to distinguish between $A$ and $|A|$ since, unlike (4), (2.1) contains only even powers of $A$.
(b) For $\Omega = 1$ solve (2.1) for $A$. (HINT: Using Maple, for instance, use the fsolve command.) Next, use computer software such as the Maple phaseportrait command to solve
$x'' + r x' + \alpha x + \beta x^3 = F_0 \cos\Omega t,$    (2.2)
and plot $x(t)$ versus $t$ over a sufficiently long time interval to obtain the steady-state response. Compare the amplitude of the resulting steady-state response with the value of $A$ obtained from (2.1).
(c) Same as (b) but for an $\Omega$ point specified by your instructor, to the left of the first point of vertical tangency ($\Omega \approx 1.71$).
(d) Same as (b), but for an $\Omega$ point specified by your instructor, to the right of the second point of vertical tangency ($\Omega \approx 2.05$).
(e) Now consider an $\Omega$ between the two points of vertical tangency, say, $\Omega = 1.8$. Solve (2.1) for the three $A$ values. Next, use computer software to solve (2.2) over a sufficiently long time interval to obtain the steady-state response. Depending upon the initial conditions that you impose, you should obtain the smallest or largest of the three $A$ values, but never the middle one. Keeping $x'(0) = 0$, determine the approximate value of $x(0)$ below which you obtain the small-amplitude response and above which you obtain the large-amplitude response.
(f) Same as (e) but for an $\Omega$ point specified by your instructor.
(g) Continuing to use the $r, \alpha, \beta, F_0$ values given above, now let $\Omega$ be slowly varying according to $\Omega = 1.9 - 0.0005t$, and solve (2.2) over $0 < t < 800$ with the initial conditions $x(0) = x'(0) = 0$. Plot the resulting $x(t)$ versus $t$ and discuss your results in terms of the ideas discussed in this section.
3. (a) Refer to (4) and Fig. 1. What is the asymptotic form of the graph of $|A|$ versus $\Omega$ as $|A| \to \infty$?
(b) In Fig. 1 we show several amplitude response curves for $\beta = 0$ and $\beta > 0$ and for several values of $F_0$. Obtain the analogous curves for the case where $\beta < 0$, either by a careful hand sketch or by computer plotting.
4. Determine the location and type of any singular points of (6).
5. (a)-(f) Obtain $x, y$ and $x, t$ plots for the cases depicted in Fig. 5. Your results should be the same as those in Fig. 5.
6. We stated that “period doubling continues as $F_0$ increases from 0.29 up to around 0.30.” Find, by numerical experimentation, an $F_0$ that gives a period-8 oscillation (and, if possible, period-16) and obtain computer plots analogous to those in Fig. 5.
7. We stated that “if $F_0$ is increased further to 0.73, then we obtain a period-1 solution.” Obtain computer plots for that case, analogous to those in Fig. 5.
8. We found an extreme sensitivity to initial conditions for the chaotic regime. Specifically, the plot in Fig. 6 bears little resemblance to the corresponding one in Fig. 5e, even though the initial conditions differed by only $10^{-10}$. Show that that sensitivity is not found for the non-chaotic responses, namely, for the periodic responses. Specifically, rerun the cases reported in Fig. 5d and 5f, but with $x'(0) = 0.2$ changed to 0.20001, say. Do your results appear to reproduce those in Fig. 5?
9. The equation
$x'' + 0.38x' + \sin x = F_0 \cos t$    (9.1)
is similar to the one occurring in the Moon/Holmes experiment.
(a) Describe a physical problem that would have a governing equation of motion of that form. (We have assigned numerical values to all of the physical parameters except to $F_0$, which we leave for the purpose of numerical experimentation.)
(b) We leave this problem a bit more open ended than the foregoing ones, and simply ask you to carry out an analytical and experimental study of (9.1). For instance, you might investigate the singular points of the homogeneous version of (9.1), and also run computer solutions for a range of $F_0$ values, somewhat as done for equation (5).
Chapter 7 Review
In Sections 7.2-7.5 we study the autonomous system
$x' = P(x, y), \quad y' = Q(x, y),$    (1)
$X' = aX + bY,$
$Y' = cX + dY,$
centers, foci, nodes, and saddles. The Hartman-Grobman
theorem assures us that the
linearized system faithfully captures the nature of the local flow (except possibly
for the borderline cases of proper nodes and centers, as explained in Section 7.4.1).
In Section 7.4 we study applications and introduce the idea of a bifurcation,
whereby the behavior of the system changes qualitatively as a system parameter
passes through a critical value. We illustrate the concept with an example of a
saddle-node bifurcation from molecular biology.
In Section 7.5 we study the van der Pol equation
$x'' - \epsilon(1 - x^2)x' + x = 0, \quad (\epsilon > 0)$
which introduces us to the concept of limit cycles and relaxation oscillations.
Finally, we study the forced Duffing equation
$m x'' + r x' + \alpha x + \beta x^3 = F_0 \cos\Omega t$
in two contexts. First, we consider it as modeling a mechanical oscillator, with
nonlinear spring force $\alpha x + \beta x^3$, where $\alpha > 0$. Of the various possible solutions
that can be obtained from different initial conditions, we study only the harmonic
response, that is, the steady-state periodic response at the same frequency as
the driving frequency $\Omega$. The key feature that is revealed is the bending of
the amplitude response curves and the resulting jump phenomenon, whereby the
response amplitude jumps as $\Omega$ is increased or decreased slowly through a critical
value.
We also consider it as modeling the “double well” reed/magnet system of
Moon and Holmes. By numerical simulation, we find that if $F_0$ is not too large,
then the oscillation is confined to one of the two wells. As $F_0$ is increased, there
results a sequence of period doublings, giving so-called subharmonic responses,
until $F_0$ becomes large enough to drive the response out of that well. Beyond a
critical $F_0$ value, we then obtain a chaotic response involving both wells.
Chapter 8
Systems of Linear Algebraic Equations; Gauss Elimination
8.1. Introduction
There are many applications in science and engineering where application of the
relevant physical law(s) immediately produces a set of linear algebraic equations.
For instance, the application of Kirchhoff's laws to a DC electrical circuit containing
any number of resistors, batteries, and current loops immediately produces such a
set of equations on the unknown currents. In other cases, the problem is stated in
some other form such as one or more ordinary or partial differential equations, but
the solution method eventually leads us to a system of linear algebraic equations.
For instance, to find a particular solution to the differential equation
$y''' - y'' = 3x^2 + 5\sin x$    (1)
by the method of undetermined coefficients (Section 3.7.2), we seek it in the form
$y_p(x) = Ax^4 + Bx^3 + Cx^2 + D\sin x + E\cos x.$    (2)
Putting (2) into (1) and equating coefficients of like terms on both sides of the equation gives five linear algebraic equations on the unknown coefficients $A, B, \ldots, E$.
Or, solving the so-called Laplace partial differential equation
$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$    (3)
on the rectangle $0 < x < 1$, $0 < y < 1$ by the method of finite differences (which
is studied in Section 20.5), using a mesh size $\Delta x = \Delta y = 0.05$, gives $19^2 = 361$
linear algebraic equations on the unknown values of $u$ at the 361 nodal points of
the mesh. Our point here is not to get ahead of ourselves by plunging into partial
differential equations, but to say that the solution of practical problems of interest
in science and engineering often leads us to systems of linear algebraic equations.
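To make the first of these illustrations concrete, here is a worked version of the substitution, offered as a sketch and reading (1) as $y''' - y'' = 3x^2 + 5\sin x$ and (2) as $y_p = Ax^4 + Bx^3 + Cx^2 + D\sin x + E\cos x$.

```latex
\begin{aligned}
y_p''  &= 12Ax^2 + 6Bx + 2C - D\sin x - E\cos x,\\
y_p''' &= 24Ax + 6B - D\cos x + E\sin x,\\
y_p''' - y_p'' &= -12A\,x^2 + (24A - 6B)\,x + (6B - 2C) + (D + E)\sin x + (E - D)\cos x .
\end{aligned}
```

Equating this with $3x^2 + 5\sin x$ term by term gives the five linear algebraic equations $-12A = 3$, $24A - 6B = 0$, $6B - 2C = 0$, $D + E = 5$, $E - D = 0$, whose solution is $A = -1/4$, $B = -1$, $C = -3$, $D = E = 5/2$.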
Such systems often involve a great many unknowns. Thus, the question of existence
(Does a solution exist?), which often sounds “too theoretical” to the practicing
engineer, takes on great practical importance because a considerable computational
effort is at stake.
The subject of linear algebra and matrices encompasses a great deal more than
the theory and solution of systems of linear algebraic equations, but the latter is
indeed a central topic and is foundational to others. Thus, we begin this sequence
of five chapters (8-12) on linear algebra with an introduction to the theory of
systems of linear algebraic equations, and their solution by the method of Gauss
elimination. Results obtained here are used, and built upon, in Chapters 9-12.
Chapters 9 and 10 take us from vectors in 3-space to vectors in n-space and
generalized vector space, to matrices and determinants. Linear systems of algebraic equations are considered again, in the second half of Chapter 10, in terms
of rank, inverse matrix, LU decomposition, Cramer's rule, and linear transformation. Chapter 11 introduces the eigenvalue problem, diagonalization, and quadratic
forms; areas of application include systems of ordinary differential equations, vibration theory, chemical kinetics, and buckling. Chapter 12 is optional and brief
and provides an extension of results in Chapters 9-11 to complex spaces.
8.2. Preliminary Ideas and Geometrical Approach
The problem of finding solutions of equations of the form
$f(x) = 0$    (1)
occupies a place of both practical and historical importance. Equation (1) is said to
be algebraic, or polynomial, if $f(x)$ is expressible in the form $a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$,
where $a_n \neq 0$ for definiteness [i.e., if $f(x)$ is a polynomial of finite
degree $n$], and it is said to be transcendental otherwise.
EXAMPLE 1. The equations $6x - 5 = 0$ and $3x^4 - x^2 + 2x + 1 = 0$ are algebraic,
whereas $x^3 + 2\sin x = 0$ and $e^x - 3 = 0$ are transcendental since $\sin x$
and $e^x$ cannot be
expressed as polynomials of finite degree. ■
Besides the algebraic versus transcendental distinction, we classify (1) as linear
if $f(x)$ is a first-degree polynomial,
$a_1 x + a_0 = 0,$    (2)
and nonlinear otherwise. Thus, the first equation in Example 1 is linear, and the
other three are nonlinear.
While (1) is one equation in one unknown, we often encounter problems involving more than one equation and/or more than one unknown, that is, a system
of equations consisting of $m$ equations in $n$ unknowns, where $m \geq 1$ and $n \geq 1$,
$f_1(x_1, \ldots, x_n) = 0,$
$f_2(x_1, \ldots, x_n) = 0,$
$\vdots$
$f_m(x_1, \ldots, x_n) = 0,$    (3)
such as
vy ~ sin (a, + 7x2) = 0,
cae+ wo - bay +6 = 0.
(4)
In (4) it happens that $m = n$ (namely, $m = n = 2$) so that there are as many
equations as unknowns. In general, however, $m$ may be less than, equal to, or
greater than $n$ so we allow for $m \neq n$ in this discussion even though $m = n$ is the
most important case.
In this chapter we consider only the case where (3) is linear, of the form
$a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = c_1,$    (eq.1)
$a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = c_2,$    (eq.2)
$\vdots$    (5)
$a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = c_m,$    (eq.m)
and we restrict $m$ and $n$ to be finite, and the $a_{ij}$'s and $c_i$'s to be real numbers. If
all the $c_i$'s are zero then (5) is homogeneous; if they are not all zero then (5) is
nonhomogeneous.
The subscript notation adopted in (5) is not essential but is helpful in holding the nomenclature to a minimum, in rendering inherent patterns more visible,
and in permitting a natural transition to matrix notation. The first subscript in $a_{ij}$
indicates the equation, and the second indicates the $x_j$ variable that it multiplies.
For instance, $a_{21}$ appears in the second equation and multiplies the $x_1$ variable. To
avoid ambiguity we should write $a_{2,1}$ rather than $a_{21}$ so that one does not mistakenly
read the subscripts as twenty-one, but we will omit commas except when such
ambiguity is not easily resolved from the context.
We say that a sequence of numbers $s_1, s_2, \ldots, s_n$ is a solution of (5) if and
only if each of the $m$ equations is satisfied numerically when we substitute $s_1$ for
$x_1$, $s_2$ for $x_2$, and so on. If there exist one or more solutions to (5), we say that the
system is consistent; if there is precisely one solution, that solution is unique; and
if there is more than one, the solution is nonunique. If, on the other hand, there
are no solutions to (5), the system is said to be inconsistent. The collection of all
solutions to (5) is called its solution set so, by “solving (5)” we mean finding its
solution set.
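The definition of a solution translates directly into a check. The sketch below is one possible way to code it, with hypothetical names: `a` is the list of coefficient rows $a_{ij}$, `c` the list of right-hand sides $c_i$, and `s` a candidate sequence $s_1, \ldots, s_n$. Exact rational arithmetic is used so that "satisfied numerically" means exact equality.

```python
from fractions import Fraction

def is_solution(a, c, s):
    """True if s satisfies every equation sum_j a[i][j]*s[j] == c[i] of system (5)."""
    return all(
        sum(Fraction(aij) * Fraction(sj) for aij, sj in zip(row, s)) == Fraction(ci)
        for row, ci in zip(a, c)
    )
```

For the illustrative system $x_1 + x_2 = 3$, $x_1 - x_2 = 1$ (chosen here, not taken from the text), `is_solution([[1, 1], [1, -1]], [3, 1], [2, 1])` is true, while the candidate `[1, 2]` fails the second equation.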
Let us begin with the simple case, where $m = n = 1$:
$a_{11}x_1 = c_1.$    (6)
In the generic case, $a_{11} \neq 0$ and (6) admits the unique solution $x_1 = c_1/a_{11}$, but
if $a_{11} = 0$ there are two possibilities: if $c_1 \neq 0$ then there are no values of $x_1$ such
that $0x_1 = c_1$ and (6) is inconsistent, but if $c_1 = 0$ then (6) becomes $0x_1 = 0$, and
$x_1 = \alpha$ is a solution for any value of $\alpha$; that is, the solution is nonunique.
Far from being too simple to be of interest, the case where $m = n = 1$ establishes a pattern that will hold in general, for any values of $m$ and $n$. Specifically,
the system (5) will admit a unique solution, no solution, or an infinity of solutions.
For instance, it will never admit 4 solutions, 12 solutions, or 137 solutions.
Next, consider the case where $m = n = 2$:
$a_{11}x_1 + a_{12}x_2 = c_1,$    (eq.1)    (7a)
$a_{21}x_1 + a_{22}x_2 = c_2.$    (eq.2)    (7b)
If $a_{11}$ and $a_{12}$ are not both zero, then (eq.1) defines a straight line, say $L1$, in a
Cartesian $x_1, x_2$ plane; that is, the solution set of (eq.1) is the set of all points on
that line. Similarly, if $a_{21}$ and $a_{22}$ are not both zero then the solution set of (eq.2)
is the set of all points on a straight line $L2$. There exist exactly three possibilities,
and these are illustrated in Fig. 1. First, the lines may intersect at a point, say $P$, in
which case (7) admits the unique solution given by the coordinate pair $x_1, x_2$ of the
point $P$ (Fig. 1a). That is, any solution pair $x_1, x_2$ of (7) needs to be in the solution
set of (eq.1) and in the solution set of (eq.2), hence at an intersection of $L1$ and $L2$.
This is the generic case, and it occurs (Exercise 2) as long as
$a_{11}a_{22} - a_{12}a_{21} \neq 0;$    (8)
(8) is the analog of the $a_{11} \neq 0$ condition for the $m = n = 1$ case discussed above.
Second, the lines may be parallel and nonintersecting (Fig. 1b), in which case
there is no solution. Then (7) is inconsistent; the solution set is empty.
Third, the lines may coincide (Fig. 1c), in which case the coordinate pair of
each point on the line is a solution. Then (7) is consistent and there are an infinite
number of solutions.
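The three possibilities can be told apart arithmetically. The following sketch, with a hypothetical function name, tests condition (8) first; when the quantity in (8) vanishes it checks whether the right-hand sides are compatible with the proportional left-hand sides. The compatibility test via the two extra cross products and the separate treatment of the case where every coefficient is zero are choices made for this sketch and are not taken from the text.

```python
from fractions import Fraction

def classify_2x2(a11, a12, c1, a21, a22, c2):
    """Return ('unique', (x1, x2)), ('none', None), or ('infinite', None) for system (7)."""
    a11, a12, c1, a21, a22, c2 = (Fraction(v) for v in (a11, a12, c1, a21, a22, c2))
    det = a11 * a22 - a12 * a21            # the quantity in condition (8)
    if det != 0:
        x1 = (c1 * a22 - a12 * c2) / det   # intersection point P of the two lines
        x2 = (a11 * c2 - c1 * a21) / det
        return "unique", (x1, x2)
    if a11 == a12 == a21 == a22 == 0:      # both equations read 0 = c
        return ("infinite" if c1 == c2 == 0 else "none"), None
    # left sides proportional: consistent only if the right sides match too
    compatible = a11 * c2 - a21 * c1 == 0 and a12 * c2 - a22 * c1 == 0
    return ("infinite" if compatible else "none"), None
```

For example, `classify_2x2(1, 3, 1, 1, 3, 0)` corresponds to two parallel lines and `classify_2x2(1, 3, 1, 2, 6, 2)` to two coincident lines.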
EXAMPLE 2. The systems
$2x_1 - x_2 = 5, \quad x_1 + 3x_2 = -1;$
$x_1 + 3x_2 = 1, \quad x_1 + 3x_2 = 0;$
$x_1 + 3x_2 = 1, \quad 2x_1 + 6x_2 = 2$
illustrate these three cases, respectively. ■
[Figure 1. Existence and uniqueness for the system (7): (a) lines $L1$ and $L2$ intersecting at $P$; (b) parallel lines; (c) coincident lines.]
Below (7) we said “If $a_{11}$ and $a_{12}$ are not both zero...” What if they are both
zero? Then if $c_1 \neq 0$ there is no solution of (7a), and hence there is no solution
to the system (7). But if $c_1 = 0$, then (7a) reduces to $0 = 0$ and can be discarded,
leaving just (7b). If $a_{21}$ and $a_{22}$ are not both zero, then (7b) gives a line of solutions,
but if they are both zero then everything hinges on $c_2$. If $c_2 \neq 0$ there is no solution
and (7) is inconsistent, but if $c_2 = 0$, so both (7a) and (7b) are simply $0 = 0$, then
both $x_1$ and $x_2$ are arbitrary, and every point in the plane is a solution.
Next, consider the case where $m = n = 3$:
$a_{11}x_1 + a_{12}x_2 + a_{13}x_3 = c_1,$    (eq.1)    (9a)
$a_{21}x_1 + a_{22}x_2 + a_{23}x_3 = c_2,$    (eq.2)    (9b)
$a_{31}x_1 + a_{32}x_2 + a_{33}x_3 = c_3.$    (eq.3)    (9c)
Continuing the geometric approach exemplified by Fig. 1, observe that if $a_{11}, a_{12}, a_{13}$
are not all zero then (eq.1) defines a plane, say $P1$, in Cartesian $x_1, x_2, x_3$ space,
and similarly for (eq.2) and (eq.3). In the generic case, $P1$ and $P2$ intersect along
a line $L$, and $L$ pierces $P3$ at a point $P$. Then the $x_1, x_2, x_3$ coordinates of $P$ give
the unique solution of (9).
In the nongeneric case we can have no solution or an infinity of solutions in
the following ways. There will be no solution if $L$ is parallel to $P3$ and hence fails
to pierce it, or if any two of the planes are parallel and not coincident. There will
be an infinity of solutions if $L$ lies in $P3$ (i.e., a line of solutions), if two planes are
coincident and intersect the third (again, a line of solutions), or if all three planes
are coincident (this time an entire plane of solutions).
The case where all of the $a_{ij}$ coefficients are zero in one or more of equations
(9) is left for the exercises.
An abstract extension of such geometrical reasoning can be continued even if
$m = n \geq 4$. For instance, one speaks of $a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + a_{14}x_4 = c_1$
as defining a hyperplane in an abstract four-dimensional space. In fact, perhaps we
should mention that even the familiar $x_1, x_2$ plane and $x_1, x_2, x_3$ space discussed
here could be abstract as well. For instance, if $x_1$ and $x_2$ are unknown currents in
two loops of an electrical circuit, then what physical meaning is there to an $x_1, x_2$
plane? None, but we can introduce it, create it, to assist our reasoning.
Closure. Most of this section is devoted to a geometrical discussion of the system (5) of linear algebraic equations. A great advantage of geometrical reasoning
is that it brings our visual system into play. It is estimated that at least a third of
the neurons in our brain are devoted to vision, hence our visual sense is extremely
sophisticated. No wonder we say “Now I see what you mean; now I get the picture.” The more geometry, pictures, visual images to aid our thinking, the better!
We have not yet aimed at theorems, and have been content to lay the groundwork
for the ideas of existence and uniqueness of solutions. In considering the cases
where $m = n = 1$, $m = n = 2$, and $m = n = 3$, we have not meant to imply
that we need to have $m = n$; all possibilities are considered in the next section. To
proceed further, we need to consider the process of finding solutions, and that we
do, in Section 8.3, by the method of Gauss elimination.
EXERCISES 8.2
1. True or false? If false, give a counterexample.
(a) An algebraic equation is necessarily linear.
(b) An algebraic equation is necessarily nonlinear.
(c) A transcendental equation is necessarily linear.
(d) A transcendental equation is necessarily nonlinear.
(e) A linear equation is necessarily algebraic.
(f) A nonlinear equation is necessarily algebraic.
(g) A linear equation is necessarily transcendental.
(h) A nonlinear equation is necessarily transcendental.
2. Derive the condition (8) as the necessary and sufficient condition for (7) to admit a unique solution.
3. (a) Discuss all possibilities of the existence and uniqueness of solutions of (9) from a geometrical point of view, in
the event that $a_{11} = a_{12} = a_{13} = 0$, but $a_{21}, a_{22}, a_{23}$ and $a_{31}, a_{32}, a_{33}$ are not all zero.
(b) Same as (a), but with $a_{21} = a_{22} = a_{23} = 0$ as well.
(c) Same as (a), but with $a_{21} = a_{22} = a_{23} = a_{31} = a_{32} = a_{33} = 0$ as well.
8.3. Solution by Gauss Elimination
8.3.1. Motivation. In this section we continue to consider the system of $m$ linear
algebraic equations
$a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = c_1,$
$a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = c_2,$
$\vdots$    (1)
$a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = c_m,$
in the $n$ unknowns $x_1, \ldots, x_n$, and develop the solution technique known as Gauss
elimination. To motivate the ideas, we begin with an example.
EXAMPLE 1. Determine the solution set of the system
$x_1 + x_2 - x_3 = 1,$
$3x_1 + x_2 + x_3 = 9,$    (2)
$x_1 - x_2 + 4x_3 = 8.$
Keep the first equation intact, and add $-3$ times the first equation to the second (as a
replacement for the second equation), and add $-1$ times the first equation to the third (as a
replacement for the third equation). These steps yield the new “indented” system
$x_1 + x_2 - x_3 = 1,$
$-2x_2 + 4x_3 = 6,$    (3)
$-2x_2 + 5x_3 = 7.$
Next, keep the first two of these intact, and add $-1$ times the second equation to the third,
and obtain
$x_1 + x_2 - x_3 = 1,$
$-2x_2 + 4x_3 = 6,$    (4)
$x_3 = 1.$
Finally, multiplying the second of these by $-1/2$ to normalize the leading coefficient (to
unity), gives
$x_1 + x_2 - x_3 = 1,$    (eq.1)
$x_2 - 2x_3 = -3,$    (eq.2)    (5)
$x_3 = 1.$    (eq.3)
It is helpful to think of the original system (2) as a tangle of string that we wish to unravel.
The first step is to find a loose end and that is, in effect, what the foregoing process of
successive indentations has done for us. Specifically, (eq.3) in (5) is the “loose end,” and
with that in hand we may unravel (5) just as we would unravel a tangle: putting $x_3 = 1$
into (eq.2) gives $x_2 = -1$, and then putting $x_3 = 1$ and $x_2 = -1$ into (eq.1) gives $x_1 = 3$.
Thus, we obtain the unique solution
$x_3 = 1, \quad x_2 = -1, \quad x_1 = 3.$    (6)
COMMENT 1. From a mathematical point of view, the system (2) was a “tangle” because
the equations were coupled; that is, each equation contained more than one unknown.
Actually, the final system (5) is coupled too, since (eq.1) contains all three unknowns and
(eq.2) contains two of them. However, the coupling in (5) is not as debilitating because
(5) is in what we call triangular form. Thus, we were able to solve (eq.3) for $x_3$, put that
value into (eq.2) and solve for $x_2$, and then put these values into (eq.1) and solve for $x_1$,
which steps are known as back substitution.
COMMENT 2. However, the process begs this question: Is it obvious that the systems
(2)-(5) all have the same solution sets so that when we solve (5) we are actually solving
(2)? That is, is it not conceivable that in applying the arithmetic steps that carried us from
(2) to (5) we might, inadvertently, have altered the solution set? For example, $x - 1 = 4$ has
the unique solution $x = 5$, but if we innocently square both sides, the resulting equation
$(x - 1)^2 = 16$ admits the two solutions $x = 5$ and $x = -3$. ■
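Back substitution, as described in Comment 1, can be sketched in a few lines. The function below is a hypothetical helper, not part of the text: it takes the coefficients of a triangular system row by row (with zeros below the diagonal and nonzero diagonal entries) together with the right-hand sides, and solves the last equation first.

```python
from fractions import Fraction

def back_substitute(u, c):
    """Solve an upper-triangular system u x = c (nonzero diagonal), last equation first."""
    n = len(c)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        # contribution of the unknowns already found (x[i+1], ..., x[n-1])
        known = sum(Fraction(u[i][j]) * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(c[i]) - known) / Fraction(u[i][i])
    return x
```

Taking the triangular system (5) to be $x_1 + x_2 - x_3 = 1$, $x_2 - 2x_3 = -3$, $x_3 = 1$, the call `back_substitute([[1, 1, -1], [0, 1, -2], [0, 0, 1]], [1, -3, 1])` carries out the same three steps as the example.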
The question just raised applies to linear systems in general. It is answered
in Theorem 8.3.1 that follows, but first we define two terms: “equivalent systems”
and “elementary equation operations.”
Two linear systems in $n$ unknowns, $x_1$ through $x_n$, are said to be equivalent if
their solution sets are identical.
The following operations on linear systems are known as elementary equation
operations:
1. Addition of a multiple of one equation to another
Symbolically: (eq.j) → (eq.j) + $\alpha$ (eq.k)
2. Multiplication of an equation by a nonzero constant
Symbolically: (eq.j) → $\alpha$ (eq.j)
3. Interchange of two equations
Symbolically: (eq.j) ↔ (eq.k)
Then we can state the following result.
THEOREM 8.3.1 Equivalent Systems
If one linear system is obtained from another by a finite number of elementary
equation operations, then the two systems are equivalent.
Outline of Proof: The truth of this claim for elementary equation operations of
types 2 and 3 should be evident, so we confine our remarks to operations of type
1. It suffices to look at the effect of one such operation. Thus, suppose that a given
linear system $A$ is altered by replacing its $j$th equation by its $j$th plus $\alpha$ times its
$k$th, its other equations being kept intact. Let us call the new system $A'$. Surely,
every solution of $A$ will also be a solution of $A'$ since we have merely added equal
quantities to equal quantities. That is, if $A'$ results from $A$ by the application of an
elementary equation operation of type 1, then every solution of $A$ is also a solution
of $A'$.
Further, we can convert $A'$ back to $A$ by an elementary equation operation of
type 1, namely, by replacing the $j$th equation of $A'$ by the $j$th equation of $A'$ plus
$-\alpha$ times the $k$th equation of $A'$. Consequently, it follows from the italicized result
(two sentences back) that every solution of $A'$ is also a solution of $A$. Then $A$ and
$A'$ are equivalent, as claimed. ■
In Example 1, we saw that each step is an elementary equation operation:
Three elementary equation operations of type 1 took us from (2) to (4), and one of
type 2 took us from (4) to (5); finally, the back substitution amounted to several operations
of type 1. Thus, according to Theorem 8.3.1, equivalence was maintained
throughout so we can be sure that (6) is the solution set of the original system (2)
(as can be verified by direct substitution).
The system in Example 1 admitted a unique solution. To see how the method
of successive elimination works out when there is no solution, or a nonunique solution,
let us work two more examples.
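The three elementary equation operations can be written as small functions acting on a list of equations, each stored as its coefficients followed by its right-hand side. This is a sketch with hypothetical names and zero-based equation indices; the system (2) of Example 1 is taken here to be $x_1 + x_2 - x_3 = 1$, $3x_1 + x_2 + x_3 = 9$, $x_1 - x_2 + 4x_3 = 8$.

```python
from fractions import Fraction

def add_multiple(rows, j, k, alpha):
    """Type 1: (eq.j) -> (eq.j) + alpha (eq.k)."""
    rows[j] = [p + Fraction(alpha) * q for p, q in zip(rows[j], rows[k])]

def scale(rows, j, alpha):
    """Type 2: (eq.j) -> alpha (eq.j), with alpha nonzero."""
    if alpha == 0:
        raise ValueError("alpha must be nonzero")
    rows[j] = [Fraction(alpha) * p for p in rows[j]]

def interchange(rows, j, k):
    """Type 3: (eq.j) <-> (eq.k)."""
    rows[j], rows[k] = rows[k], rows[j]

# The steps of Example 1, applied to system (2) as read above.
system = [[1, 1, -1, 1], [3, 1, 1, 9], [1, -1, 4, 8]]
add_multiple(system, 1, 0, -3)        # eq.2 -> eq.2 - 3 eq.1
add_multiple(system, 2, 0, -1)        # eq.3 -> eq.3 - 1 eq.1
add_multiple(system, 2, 1, -1)        # eq.3 -> eq.3 - 1 eq.2
scale(system, 1, Fraction(-1, 2))     # eq.2 -> (-1/2) eq.2
```

After these four operations `system` holds the rows of the triangular system (5). Storing each equation as one row anticipates the matrix notation introduced later, but nothing here depends on it.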
EXAMPLE 2.
InconsistentSystem.Considerthesystem
224
+
−
32Q
223
=
4,
Ly ~ 2a9 + 2x3= 3,
-—
7X,
@3 =
(7)
2.
Keep the first equation intact, add 5 times the first equation to the second (eq.2 —+eq.2
~$ eq.1), and add ~f times the first to the third (eq.3 > eq.3 —t eq. |):
224
+
3x2
a
223
=
— fu. +2e3=
_
Bre
>
6x3
=
4,
1,
(8)
—12.
Keep the first two equations intact, and add —3 times the second equation to the third (eq.3
8.3, Solution by Gauss Elimination
— eq.3 —3 eq.2):
4,
2a, + 3x9 - 2a3=
(9)
1,
— 9% + 223=
Q=—15.
Any solution 21,2, %3 of (9) must satisfy cach of the three equations, but there are no
that can satisfy 0 = —15. Thus, (9) is inconsistent (has no solution),
values of «1, 29,3
and therefore (7) is as well.
COMMENT. The source of the inconsistency is the fact that whereas the left-hand side of the third equation is 2 times the left-hand side of the first equation plus 3 times the left-hand side of the second, the right-hand sides do not bear that relationship: $2(4) + 3(3) = 17 \neq 2$. [While that built-in contradiction is not obvious from (7), it eventually comes to light in the third equation in (9).] If we modify the system (7) by changing the final 2 in (7) to 17, then the final −12 in (8) becomes a 3, and the final −15 in (9) becomes a zero:
$$
\begin{aligned}
2x_1 + 3x_2 - 2x_3 &= 4,\\
-\tfrac{7}{2}x_2 + 2x_3 &= 1,\\
0 &= 0
\end{aligned} \tag{10}
$$
or, multiplying the first by $\tfrac{1}{2}$ and the second by $-\tfrac{2}{7}$,
$$
\begin{aligned}
x_1 + \tfrac{3}{2}x_2 - x_3 &= 2,\\
x_2 - \tfrac{4}{7}x_3 &= -\tfrac{2}{7},
\end{aligned} \tag{11a,b}
$$
where we have discarded the identity 0 = 0. Thus, by changing the $c_j$'s so as to be “compatible,” the system now admits an infinity of solutions rather than none. Specifically, we can let $x_3$ (or $x_2$, it doesn’t matter which) in (11b) be any value, say $\alpha$, where $\alpha$ is arbitrary. Then (11b) gives $x_2 = -\tfrac{2}{7} + \tfrac{4}{7}\alpha$, and putting these into (11a), $x_1 = \tfrac{17}{7} + \tfrac{1}{7}\alpha$. Thus, we have the infinity of solutions
$$
x_3 = \alpha, \qquad x_2 = -\tfrac{2}{7} + \tfrac{4}{7}\alpha, \qquad x_1 = \tfrac{17}{7} + \tfrac{1}{7}\alpha \tag{12}
$$
for any $\alpha$. Evidently, two of the three planes intersect, giving a line that lies in the third plane, and equations (12) are parametric equations of that line! ■
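The elimination in Example 2 can also be checked mechanically. What follows is only a minimal sketch, not part of the text's method: the function name `forward_eliminate` and the list-of-rows layout (each row holding the three coefficients followed by the right-hand side) are assumptions made for illustration, and exact fractions are used so that no roundoff enters.

```python
from fractions import Fraction

def forward_eliminate(aug):
    """Return a copy of the augmented array in indented (echelon) form.

    Only two kinds of steps are used: interchanging two rows, and adding
    a multiple of one row to a row below it.
    """
    m = [[Fraction(v) for v in row] for row in aug]
    rows, cols = len(m), len(m[0]) - 1      # last column holds the right-hand sides
    r = 0                                   # index of the current pivot row
    for col in range(cols):
        # first row at or below r with a nonzero entry in this column
        p = next((i for i in range(r, rows) if m[i][col] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, rows):
            f = m[i][col] / m[r][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == rows:
            break
    return m

# System (7): rows are [coefficient of x1, of x2, of x3, right-hand side].
system_7 = [[2, 3, -2, 4],
            [1, -2, 1, 3],
            [7, 0, -1, 2]]
reduced = forward_eliminate(system_7)
print(reduced[2])        # all coefficients zero, right-hand side nonzero

# The "compatible" variant from the COMMENT: final 2 replaced by 17.
modified = [row[:] for row in system_7]
modified[2][3] = 17
print(forward_eliminate(modified)[2])
```

For system (7) the multipliers chosen by this sketch are the same ones used above (1/2, 7/2, and 3), so the intermediate rows agree with (8) and (9).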
EXAMPLE 3. Nonunique Solution. Consider the system of four equations in six unknowns (m = 4, n = 6)
$$
\begin{aligned}
2x_2 + x_3 + 4x_4 + 3x_5 + x_6 &= 2,\\
x_1 - x_2 + x_3 + 2x_6 &= 0,\\
x_1 + x_2 + 2x_3 + 4x_4 + x_5 + 2x_6 &= 3,\\
x_1 - 3x_2 - 4x_4 - 2x_5 + x_6 &= 0.
\end{aligned} \tag{13}
$$
Wanting the top equation to begin with $x_1$, and subsequent equations to indent at the left, let us first move the top equation to the bottom (eq.1 ↔ eq.4):
$$
\begin{aligned}
x_1 - 3x_2 - 4x_4 - 2x_5 + x_6 &= 0,\\
x_1 - x_2 + x_3 + 2x_6 &= 0,\\
x_1 + x_2 + 2x_3 + 4x_4 + x_5 + 2x_6 &= 3,\\
2x_2 + x_3 + 4x_4 + 3x_5 + x_6 &= 2.
\end{aligned} \tag{14}
$$
Add −1 times the first equation to the second (eq.2 → eq.2 − 1 eq.1) and third (eq.3 → eq.3 − 1 eq.1) equations:
$$
\begin{aligned}
x_1 - 3x_2 - 4x_4 - 2x_5 + x_6 &= 0,\\
2x_2 + x_3 + 4x_4 + 2x_5 + x_6 &= 0,\\
4x_2 + 2x_3 + 8x_4 + 3x_5 + x_6 &= 3,\\
2x_2 + x_3 + 4x_4 + 3x_5 + x_6 &= 2.
\end{aligned} \tag{15}
$$
Add −2 times the second to the third (eq.3 → eq.3 − 2 eq.2) and −1 times the second to the fourth (eq.4 → eq.4 − 1 eq.2):
$$
\begin{aligned}
x_1 - 3x_2 - 4x_4 - 2x_5 + x_6 &= 0,\\
2x_2 + x_3 + 4x_4 + 2x_5 + x_6 &= 0,\\
-x_5 - x_6 &= 3,\\
x_5 &= 2.
\end{aligned} \tag{16}
$$
Add the third to the fourth (eq.4 → eq.4 + eq.3):
$$
\begin{aligned}
x_1 - 3x_2 - 4x_4 - 2x_5 + x_6 &= 0,\\
2x_2 + x_3 + 4x_4 + 2x_5 + x_6 &= 0,\\
-x_5 - x_6 &= 3,\\
-x_6 &= 5.
\end{aligned} \tag{17}
$$
Finally, multiply the second, third, and fourth by $\tfrac{1}{2}$, −1, and −1, respectively, to normalize the leading coefficients (eq.2 → ½ eq.2, eq.3 → −1 eq.3, eq.4 → −1 eq.4):
$$
\begin{aligned}
x_1 - 3x_2 - 4x_4 - 2x_5 + x_6 &= 0,\\
x_2 + \tfrac{1}{2}x_3 + 2x_4 + x_5 + \tfrac{1}{2}x_6 &= 0,\\
x_5 + x_6 &= -3,\\
x_6 &= -5.
\end{aligned} \tag{18}
$$
The last two equations give $x_6 = -5$ and $x_5 = 2$, and these values can be substituted back into the second equation. In that equation we can let $x_4$ be arbitrary, say $\alpha_1$, and we can also let $x_3$ be arbitrary, say $\alpha_2$. Then that equation gives $x_2$ and, again by back substitution, the first equation gives $x_1$. The result is the infinity of solutions
$$
\begin{aligned}
x_6 &= -5, \qquad x_5 = 2, \qquad x_4 = \alpha_1, \qquad x_3 = \alpha_2,\\
x_2 &= \tfrac{1}{2} - 2\alpha_1 - \tfrac{1}{2}\alpha_2, \qquad
x_1 = \tfrac{21}{2} - 2\alpha_1 - \tfrac{3}{2}\alpha_2,
\end{aligned} \tag{19}
$$
where $\alpha_1$ and $\alpha_2$ are arbitrary. ■
If a solution set contains p independent arbitrary parameters ($\alpha_1, \ldots, \alpha_p$), we call it (in this text) a p-parameter family of solutions. Thus, (12) and (19) are one- and two-parameter families of solutions, respectively. Each choice of values for $\alpha_1, \ldots, \alpha_p$ yields a particular solution. In (19), for instance, the choice $\alpha_1 = 1$ and $\alpha_2 = 0$ yields the particular solution $x_1 = \tfrac{17}{2}$, $x_2 = -\tfrac{3}{2}$, $x_3 = 0$, $x_4 = 1$, $x_5 = 2$, and $x_6 = -5$.
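A two-parameter family such as (19) is easy to check by direct substitution into the original system (13). The sketch below does that for a few parameter choices; the names `A`, `c`, `family`, and `satisfies` are illustrative assumptions, and the coefficient rows are those of (13) with a 0 entered for each missing unknown.

```python
from fractions import Fraction as F

# Coefficients of x1..x6 and right-hand sides of system (13), one row per equation.
A = [[0, 2, 1, 4, 3, 1],
     [1, -1, 1, 0, 0, 2],
     [1, 1, 2, 4, 1, 2],
     [1, -3, 0, -4, -2, 1]]
c = [2, 0, 3, 0]

def family(a1, a2):
    """Return [x1, ..., x6] from (19) for the parameter values a1, a2."""
    x1 = F(21, 2) - 2 * a1 - F(3, 2) * a2
    x2 = F(1, 2) - 2 * a1 - F(1, 2) * a2
    return [x1, x2, a2, a1, 2, -5]

def satisfies(x):
    """True if x satisfies every equation of (13) exactly."""
    return all(sum(a * xi for a, xi in zip(row, x)) == ci
               for row, ci in zip(A, c))

for a1, a2 in [(0, 0), (1, 0), (F(-3, 4), 5)]:
    assert satisfies(family(a1, a2))

print(family(1, 0))   # the particular solution for alpha_1 = 1, alpha_2 = 0
```

Passing the check for a handful of parameter values does not by itself prove that (19) is the complete solution set; that conclusion rests on the equivalence of the systems (13) through (18).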
8.3.2. Gauss elimination. The method of Gauss elimination,* illustrated in Examples 1-3, can be applied to any linear system (1), whether or not the system is consistent, and whether or not the solution is unique. Though hard to tell from the foregoing hand calculations, the method is efficient and is commonly available in computer systems.
*The method is attributed to Karl Friedrich Gauss (1777-1855), who is generally regarded as the foremost mathematician of the nineteenth century and often referred to as the “prince of mathematicians.”
Observe that the end result of the Gauss elimination process enables us to determine, merely from the pattern of the final equations, whether or not a solution exists and is unique. For instance, we can see from the pattern of (5) that there is a unique solution, from the bottom equation in (9) that there is no solution, and from the extra double indentation in (18) that there is a two-parameter family of solutions.
As representative of the case where m < n, let m = 3 and n = 5. There are four possible final patterns, and these are shown schematically in Fig. 1. For instance, the third equation in Fig. 1a could be $x_3 - 6x_4 + 2x_5 = 0$ or $x_3 + 2x_4 + 0x_5 = 4$, and the given third equation in Fig. 1b could be 0 = 6 or 0 = 0. It may seem foolish to include the case shown in Fig. 1d because there are no $x_j$’s (all of the $a_{ij}$ coefficients being zero), but it is possible so we have included it. From these patterns we draw these conclusions: (a) there exists a two-parameter family of solutions; (b) there is no solution (the system is inconsistent) if the right-hand member of the third equation is nonzero, and a three-parameter family of solutions if the latter is zero; (c) there is no solution if either of the right-hand members of the second and third equations is nonzero, and a four-parameter family of solutions if each of them is zero; (d) there is no solution if any of the right-hand members is nonzero, and a five-parameter family of solutions if each of them is zero.
[Figure 1, panels (a)-(d); m = 3, n = 5.]
It may appear that Fig. 1 does not cover all possible cases. For instance, what about the case shown in Fig. 2? That case can be converted to the case shown in Fig. 1a simply by renaming the unknowns: let $x_3$ become $x_2$ and let $x_5$ become $x_3$. Specifically, let $x_1 \to x_1$, $x_3 \to x_2$, $x_5 \to x_3$, $x_4 \to x_4$, and $x_2 \to x_5$.
[Figure 2; caption fragment: “... covered?”]
The case where m ≥ n can be studied in a similar manner, and we can draw the following general conclusions.
THEOREM 8.3.2 Existence/Uniqueness for Linear Systems
If m < n, the system (1) can be consistent or inconsistent. If it is consistent it cannot have a unique solution; it will have a p-parameter family of solutions, where n − m ≤ p ≤ n. If m ≥ n, (1) can be consistent or inconsistent. If it is consistent it can have a unique solution or a p-parameter family of solutions, where 1 ≤ p ≤ n.
The next theorem follows immediately from Theorem 8.3.2, but we state it separately for emphasis.
THEOREM 8.3.3 Existence/Uniqueness for Linear Systems
Every system (1) necessarily admits no solution, a unique solution, or an infinity of solutions.
Observe that a system (1) is inconsistent only if, in its Gauss-eliminated form, one or more of the equations is of the form zero equal to a nonzero number. But that can never happen if every $c_j$ in (1) is zero, that is, if (1) is homogeneous.
THEOREM 8.3.4 Existence/Uniqueness for Homogeneous Systems
Every homogeneous linear system of m equations in n unknowns is consistent. Either it admits the unique trivial solution or else it admits an infinity of nontrivial solutions in addition to the trivial solution. If m < n, then there is an infinity of nontrivial solutions in addition to the trivial solution.
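The three possibilities named in Theorem 8.3.3 can be read off from the Gauss-eliminated form by machine as well as by eye. The following is a rough sketch under stated assumptions: the function name `classify` and the returned strings are invented for illustration, the input is an array whose last column holds the right-hand sides, and the number of parameters is taken to be n minus the number of leading (pivot) entries found during elimination.

```python
from fractions import Fraction

def classify(aug):
    """Classify a linear system given as rows [a_1, ..., a_n, c]."""
    m = [[Fraction(v) for v in row] for row in aug]
    rows, n = len(m), len(m[0]) - 1
    r = 0                                    # number of pivots found so far
    for col in range(n):
        p = next((i for i in range(r, rows) if m[i][col] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, rows):
            f = m[i][col] / m[r][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == rows:
            break
    # Rows r, r+1, ... now have all-zero coefficients; a nonzero right-hand
    # side in any of them is an equation of the form 0 = nonzero.
    if any(m[i][n] != 0 for i in range(r, rows)):
        return "no solution"
    p = n - r
    return "unique solution" if p == 0 else f"{p}-parameter family of solutions"

print(classify([[2, 3, -2, 4], [1, -2, 1, 3], [7, 0, -1, 2]]))    # system (7)
print(classify([[2, 3, -2, 4], [1, -2, 1, 3], [7, 0, -1, 17]]))   # (7) with 2 -> 17
print(classify([[1, 2, 3, 0], [4, 5, 6, 0]]))                     # homogeneous, m < n
```

The last call uses a made-up homogeneous system with m = 2 and n = 3; consistent with Theorem 8.3.4, it cannot come back as "no solution" or "unique solution".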
In summary, not only did the method of Gauss elimination provide us with an
efficient and systematic solution procedure, it also led us to important results regarding the existence and uniqueness of solutions.
8.3.3. Matrix notation. In applying Gauss elimination, we quickly discover that writing the variables $x_1, \ldots, x_n$ over and over is inefficient, and even tends to upstage the more central role of the $a_{ij}$’s and $c_j$’s. It is therefore preferable to omit the $x_j$’s altogether and to work directly with the rectangular array
$$
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & c_1\\
a_{21} & a_{22} & \cdots & a_{2n} & c_2\\
\vdots & \vdots & & \vdots & \vdots\\
a_{m1} & a_{m2} & \cdots & a_{mn} & c_m
\end{bmatrix}, \tag{20}
$$
known as the augmented matrix of the system (1), that is, the coefficient matrix
$$
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & & \vdots\\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix} \tag{21}
$$
augmented by the column of $c_j$’s. By matrix we simply mean a rectangular array of numbers, called elements; it is customary to enclose the elements between parentheses to emphasize that the entire matrix is regarded as a single entity. A horizontal line of elements is called a row, and a vertical line is called a column. Counting rows from the top, and columns from the left,
$$
a_{21} \quad a_{22} \quad \cdots \quad a_{2n} \quad c_2
\qquad \text{and} \qquad
\begin{matrix} c_1\\ \vdots\\ c_m \end{matrix}
$$
say, are the second row and (n+1)th column, respectively, of the augmented matrix (20).
In terms of the abbreviated matrix notation, the calculation in Example 1 would look like this.
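Original system:
$$
\begin{bmatrix}
1 & 1 & -1 & 1\\
3 & 1 & 1 & 9\\
1 & -1 & 4 & 8
\end{bmatrix}
$$
Add −3 times first row to second row, and add −1 times first row to third row:
$$
\begin{bmatrix}
1 & 1 & -1 & 1\\
0 & -2 & 4 & 6\\
0 & -2 & 5 & 7
\end{bmatrix}
$$
Add −1 times second row to third row, and multiply second row by $-\tfrac{1}{2}$:
$$
\begin{bmatrix}
1 & 1 & -1 & 1\\
0 & 1 & -2 & -3\\
0 & 0 & 1 & 1
\end{bmatrix}. \tag{22}
$$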
Thus, corresponding to the so-called elementary equation operations on members of a system of linear equations there are elementary row operations on the augmented matrix, as follows:
1. Addition of a multiple of one row to another:
Symbolically: (jth row) → (jth row) + α(kth row)
2. Multiplication of a row by a nonzero constant:
Symbolically: (jth row) → α(jth row)
3. Interchange of two rows:
Symbolically: (jth row) ↔ (kth row)
And we say that two matrices are row equivalent if one can be obtained from the other by finitely many elementary row operations.
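The three elementary row operations translate almost word for word into code. The sketch below is illustrative only: the function names `add_multiple`, `scale`, and `interchange` are assumptions, rows are numbered from 1 to match the symbolic statements above, and the starting array is the one obtained by undoing the stated row operations from the array that leads to (22).

```python
from fractions import Fraction

def add_multiple(m, j, k, alpha):
    """(jth row) -> (jth row) + alpha (kth row)."""
    m[j - 1] = [a + alpha * b for a, b in zip(m[j - 1], m[k - 1])]

def scale(m, j, alpha):
    """(jth row) -> alpha (jth row), with alpha nonzero."""
    if alpha == 0:
        raise ValueError("alpha must be nonzero")
    m[j - 1] = [alpha * a for a in m[j - 1]]

def interchange(m, j, k):
    """(jth row) <-> (kth row)."""
    m[j - 1], m[k - 1] = m[k - 1], m[j - 1]

# Augmented array for the calculation of Example 1 (assumed starting point).
m = [[Fraction(v) for v in row] for row in
     [[1, 1, -1, 1],
      [3, 1, 1, 9],
      [1, -1, 4, 8]]]

add_multiple(m, 2, 1, -3)        # add -3 times first row to second row
add_multiple(m, 3, 1, -1)        # add -1 times first row to third row
add_multiple(m, 3, 2, -1)        # add -1 times second row to third row
scale(m, 2, Fraction(-1, 2))     # multiply second row by -1/2
print(m)                         # rows of the array (22)
```

Since each function changes the array in place, the sequence of calls reads as a log of the operations performed, which is the kind of step-by-step documentation the exercises ask for.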
8.3.4. Gauss-Jordan reduction. With the Gauss elimination completed, the remaining steps consist of back substitution. In fact, those steps are elementary row operations as well. The difference is that whereas in the Gauss elimination we proceed from the top down, in the back substitution we proceed from the bottom up.
EXAMPLE 4. To illustrate, let us return to Example 1 and pick up at the end of the Gauss elimination, with (5), and complete the back substitution steps using elementary row operations. In matrix format, we begin with
$$
\begin{bmatrix}
1 & 1 & -1 & 1\\
0 & 1 & -2 & -3\\
0 & 0 & 1 & 1
\end{bmatrix}. \tag{23}
$$
Keeping the bottom row intact, add 2 times that row to the second, and add 1 times that row to the first:
$$
\begin{bmatrix}
1 & 1 & 0 & 2\\
0 & 1 & 0 & -1\\
0 & 0 & 1 & 1
\end{bmatrix}. \tag{24}
$$
Now keeping the bottom two rows intact, add −1 times the second row to the first:
$$
\begin{bmatrix}
1 & 0 & 0 & 3\\
0 & 1 & 0 & -1\\
0 & 0 & 1 & 1
\end{bmatrix}, \tag{25}
$$
which is the solution: $x_1 = 3$, $x_2 = -1$, $x_3 = 1$ as obtained in Example 1. ■
The entire process, of Gauss elimination plus back substitution, is known as Gauss-Jordan reduction, after Gauss and Wilhelm Jordan (1842-1899). The final result is an augmented matrix in reduced row-echelon form. That is:
1. In each row not made up entirely of zeros, the first nonzero element is a 1, a so-called leading 1.
2. In any two consecutive rows not made up entirely of zeros, the leading 1 in the lower row is to the right of the leading 1 in the upper row.
3. If a column contains a leading 1, every other element in that column is a zero.
4. All rows made up entirely of zeros are grouped together at the bottom of the matrix.
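For instance, (25) is in reduced row-echelon form, as is the final matrix in the next example.
EXAMPLE 5. Let us return to Example 3 and finish the Gauss-Jordan reduction, beginning with (18):
$$
\begin{bmatrix}
1 & -3 & 0 & -4 & -2 & 1 & 0\\
0 & 1 & \tfrac{1}{2} & 2 & 1 & \tfrac{1}{2} & 0\\
0 & 0 & 0 & 0 & 1 & 1 & -3\\
0 & 0 & 0 & 0 & 0 & 1 & -5
\end{bmatrix}
\to
\begin{bmatrix}
1 & -3 & 0 & -4 & -2 & 0 & 5\\
0 & 1 & \tfrac{1}{2} & 2 & 1 & 0 & \tfrac{5}{2}\\
0 & 0 & 0 & 0 & 1 & 0 & 2\\
0 & 0 & 0 & 0 & 0 & 1 & -5
\end{bmatrix}
$$
$$
\to
\begin{bmatrix}
1 & -3 & 0 & -4 & 0 & 0 & 9\\
0 & 1 & \tfrac{1}{2} & 2 & 0 & 0 & \tfrac{1}{2}\\
0 & 0 & 0 & 0 & 1 & 0 & 2\\
0 & 0 & 0 & 0 & 0 & 1 & -5
\end{bmatrix}
\to
\begin{bmatrix}
\mathbf{1} & 0 & \tfrac{3}{2} & 2 & 0 & 0 & \tfrac{21}{2}\\
0 & \mathbf{1} & \tfrac{1}{2} & 2 & 0 & 0 & \tfrac{1}{2}\\
0 & 0 & 0 & 0 & \mathbf{1} & 0 & 2\\
0 & 0 & 0 & 0 & 0 & \mathbf{1} & -5
\end{bmatrix}.
$$
The last augmented matrix is in reduced row-echelon form. The four leading 1’s are displayed in bold type, and we see that, as a result of the back substitution steps, only 0’s are to be found above each leading 1. The final augmented matrix once again gives the solution (19). ■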
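The reduction of Example 5 can be reproduced with a short routine. The sketch below is one possible arrangement rather than the procedure of the text: the function name `rref` is an assumption, and instead of eliminating from the top down and then from the bottom up, it clears the entries above and below each leading 1 as soon as that leading 1 is created. The final array is the same either way.

```python
from fractions import Fraction

def rref(aug):
    """Return (reduced row-echelon form, list of pivot column indices)."""
    m = [[Fraction(v) for v in row] for row in aug]
    rows, cols = len(m), len(m[0]) - 1       # last column: right-hand sides
    pivots = []
    r = 0
    for col in range(cols):
        p = next((i for i in range(r, rows) if m[i][col] != 0), None)
        if p is None:
            continue                         # no leading 1 in this column
        m[r], m[p] = m[p], m[r]              # interchange two rows
        lead = m[r][col]
        m[r] = [a / lead for a in m[r]]      # scale to obtain a leading 1
        for i in range(rows):
            if i != r and m[i][col] != 0:
                f = m[i][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(col)
        r += 1
        if r == rows:
            break
    return m, pivots

# Augmented array of system (13), in the original order of the equations.
system_13 = [[0, 2, 1, 4, 3, 1, 2],
             [1, -1, 1, 0, 0, 2, 0],
             [1, 1, 2, 4, 1, 2, 3],
             [1, -3, 0, -4, -2, 1, 0]]
reduced, pivots = rref(system_13)
free = [j for j in range(6) if j not in pivots]
print(pivots)   # zero-based columns that contain a leading 1
print(free)     # zero-based columns of the unknowns left arbitrary
```

The unknowns whose columns contain no leading 1 are the ones assigned arbitrary values, so the length of `free` is the number of parameters in the family of solutions.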
8.3.5. Pivoting. Recall that the first step in the Gauss elimination of the system
$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= c_1,\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= c_2,\\
&\;\;\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= c_m
\end{aligned} \tag{26}
$$
is to subtract $a_{21}/a_{11}$ times the first equation from the second, $a_{31}/a_{11}$ times the first equation from the third, and so on, while keeping the first equation intact. The first equation is called the pivot equation (or, the first row is the pivot row if one is using the matrix format), and $a_{11}$ is called the pivot. That step produces an indented system of the form
$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= c_1,\\
a'_{22}x_2 + \cdots + a'_{2n}x_n &= c'_2,\\
&\;\;\vdots\\
a'_{m2}x_2 + \cdots + a'_{mn}x_n &= c'_m.
\end{aligned} \tag{27}
$$
Next, we keep the first two equations intact and use the second equation as the new pivot equation to indent the third through mth equations, and so on.
Naturally, we need each pivot to be nonzero. For instance, we need $a_{11} \neq 0$ for $a_{21}/a_{11}, a_{31}/a_{11}, \ldots$ to be defined. If a pivot is zero, interchange that equation with any one below it, such as the next equation or last equation (as we did in Example 3), until a nonzero pivot is available. Such interchange of equations is called partial pivoting. If a pivot is zero we have no choice but to use partial pivoting, but in practice even a nonzero pivot should be rejected if it is “very small,” since the smaller it is the more susceptible is the calculation to the adverse effect of machine roundoff error (see Exercise 12). To be as safe as possible, one can choose the pivot equation as the one with the largest leading coefficient (relative to the other coefficients in the equation).
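The effect of a very small pivot is easy to observe in ordinary floating-point arithmetic. The sketch below is a hypothetical illustration: the function `solve`, the 2-by-2 system with the tiny coefficient `eps`, and the selection rule (take the row whose entry in the pivot column has the largest magnitude, a simpler rule than the relative one just described) are all assumptions chosen to keep the example short.

```python
def solve(a, c, pivot=True):
    """Solve a square system by Gauss elimination and back substitution.

    With pivot=False a zero pivot causes a ZeroDivisionError; with
    pivot=True rows are interchanged so that the pivot is the entry of
    largest magnitude in its column among the remaining rows.
    """
    n = len(a)
    m = [list(row) + [ci] for row, ci in zip(a, c)]    # augmented array
    for k in range(n):
        if pivot:
            p = max(range(k, n), key=lambda i: abs(m[i][k]))
            m[k], m[p] = m[p], m[k]
        for i in range(k + 1, n):
            factor = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= factor * m[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):                       # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

# eps*x1 + x2 = 1 and x1 + x2 = 2; both unknowns are very nearly 1.
eps = 1e-20
a = [[eps, 1.0], [1.0, 1.0]]
c = [1.0, 2.0]
print(solve(a, c, pivot=False))   # x1 is lost entirely to roundoff
print(solve(a, c, pivot=True))    # both values close to 1
```

Without the interchange, the multiplier 1/eps is so large that the original second equation is swamped during the subtraction, and the computed $x_1$ comes out as 0 instead of a value near 1.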
Closure. Beginning with a system of coupled linear algebraic equations, one can
use a sequence of elementary operations to minimize the coupling between the
equations while leaving the solution set intact. Besides putting forward the important method of Gauss elimination, which is used heavily in the following chapters,
we used that method to establish several important theoretical results regarding the
existence and uniqueness of solutions.
The Gauss elimination and Gauss-Jordan reduction discussions lead naturally to a convenient, and equivalent, formulation in matrix notation. We will return to the concept of matrices in Chapter 10, and develop it in detail.
Computer software. Chapters 8-12 cover the domain known as linear algebra. A great many calculations in linear algebra can be carried out using computer algebra systems. In Maple, for instance, a great many commands (“functions”) are contained within the linalg package. A listing of these commands can be obtained by entering ?linalg. That list includes the linsolve command, which can be used to solve a system of m linear algebraic equations in n unknowns. To access linsolve (or any other command within the linalg package), first enter with(linalg). Then, linsolve(A,b) solves (1) for $x_1, \ldots, x_n$, where A is the coefficient matrix and b is the column of $c_j$’s. For instance, the system
$$
\begin{aligned}
x_1 - x_2 + 2x_3 - 3x_4 &= 4,\\
x_1 + 2x_2 - x_3 + 3x_4 &= 1
\end{aligned} \tag{28}
$$
admits the two-parameter family of solutions
$$
x_4 = \alpha_1, \qquad x_3 = \alpha_2, \qquad x_2 = -1 - 2\alpha_1 + \alpha_2, \qquad x_1 = 3 + \alpha_1 - \alpha_2, \tag{29}
$$
where $\alpha_1, \alpha_2$ are arbitrary. To solve (28) using Maple, enter
with(linalg):
then return and enter
linsolve(array([[1,-1,2,-3],[1,2,-1,3]]), array([4,1]));
and return. The output is
[-_t1 + _t2 + 3, _t1 - 2 _t2 - 1, _t1, _t2]
where the entries are $x_1, \ldots, x_4$ and where _t1 and _t2 are arbitrary constants. With _t1 = $\alpha_2$ and _t2 = $\alpha_1$, this result is the same as (29). If you prefer, you could use the sequence
with(linalg):
A := array([[1,-1,2,-3],[1,2,-1,3]]):
b := array([4,1]):
linsolve(A, b);
instead. If the system is inconsistent, then either the output will be NULL, or there will be no output.
EXERCISES 8.3
1. Derive the solution set for each of the following systems using Gauss elimination and augmented matrix format. Document each step (e.g., 2nd row → 2nd row + 5 times 1st row), and classify the result (e.g., unique solution, the system is inconsistent, 3-parameter family of solutions, etc.).
(k)
xy
+ 23
ed
Ly, + 22 — £3 — 224 =5
Ly —- £2 + 243 4+ wy =O
Qa, + to+
t3-—- Bea 4
(1)
ay +24 =2
5a +
y= 2
(b) 22+
y=0
Ty
(m)
(n) 22+
(e) 27, — wo — v3 - 5aey = 6
() 2
—
~
3
,
7
Ug
-
Vy
+ 4xry
=
:
(0)
224
+
ey
+
a
‘
sa + by +
Tz = 8
9x + 10y + llz
= 12
xe2=
6a, + llag=
Qn, +
w+
(q)
tr
204
29
229
=n
+
a3
∙−∶
On,
409 +
Uy + 2a - 23 — 2a4 =
=
+ 23
xy
0
=
pe oe
XR
t
29
+
=
®o4+
23
201 + 2x9 — X3
t=
1
tq +223 + rg =1
ty bay
ee
Uy +
~—2
23-
10
1
Ly + 2xg = —4
6
0x2 =—1
≥
Uy
Ba,+ Or»Qea=
4
_
9
v1 +
z=
r3 + 224 =1
:
(h) vy + #2 — 243 =
3
:
Uy
2 —-343 =
1
~ dag
;
aay ~ 3x”q
322
v3 = —1
y+
6
2
yYbur
xz—-2y — 4z = —-10
(g) w+ 24y+ 32= +
(i) 2a, -
=2
for c = 10, and again, for ¢ =11
Wm—y-2=8
— Ug
+52=c
— &ty
e—-ytz=1
Ly ~—2
«+ 2y4+32=5
3a + 4dy
(c) w+2y=4
a
@ 4+273+ 24 =4
2z + 3y +42 =8
3a — 2y = 0
d)
@3 + 24 = 0
—
4tg
(a) 22 —3y=1
-
.
aus:
DA
324
;
+
225
+
225
+
a,
se |
a
=
0
—0
= 0
=0
ine G
Jord
.
4
duction instea
5
1
3. (a)-(q) Same as Exercise 1 but using computer software such as the Maple linsolve command.
4. Can 20 linear algebraic equations in 14 unknowns have a unique solution? Be inconsistent? Have a two-parameter family of solutions? Have a 14-parameter family of solutions? Have a 16-parameter family of solutions? Explain.
(b) “Given thesystem
5. Let
since both left-hand sides equal zero, they must equal each
other. Hence we have the equation
QL,
lI 0,
+ b3x3 = 0
+ GQhQq+ 4323
bya, + bot
= 22, — to + Vs,
Ly + @o — 4g
represent any two planes through the origin in a Cartesian
1,2,
ay + ay — 4g = 0,
224 ~~Lo + vg = 0,
3 space. For the case where the planes intersect along
which equation is equivalent to the original system.”
a line, show whether or not that line necessarily passes through
the origin.
9, Make up an example of an inconsistent linear algebraic system of equations, with
6. If possible, adapt the methods of this section to solve the
following nonlinear systems. If it is not possible, say so.
(jm=2,n=4
(a)
at t+205 ~ 243 ==29
zy t+ ©)+ 24 = 19
Ba? + das
(b)
= 67
(b) m = I, m= 4
10. (Physical example of nonexistence and nonuniqueness;
DC circuit) Kirchoff’s current and voltage laws were given in
Section 2.3.1. [f we apply those laws to the DC’ circuit shown,
by
13
5
z+ 3y=
sing +2y=
(c) sing + siny
=1
sing — siny + 4cosz = 1.2
sing + siny + 2cosz = 1.6
geneous (do you agree that they are homogeneous?) systems
admit nontrivial solutions? Find the nontrivial solutions corre-
sponding to each such A.
u+
(c)
Av
(b)
22 —- y=
2y = Ay
«2~2y=
Az
dx ~ 8y = Ay
4
(fe) c+y+
e=Ar
yroeo=
Ay
22 = Nz
(a) “Given the system
Ty — 280 = 0,
224
_
das
—
Q,
add —2 times the first equation to the second and add ~ 4 times
the second equation to the first. By these Gauss elimination
steps we obtain the equivalent system 0 = 0 and 0 = O, and
hence the two-parameter family of solutions zy = ay (arbitrary), £2 = Q (arbitrary).”
ip~ig-tg
e
=O,
iy —ig —ig = 0,
Roig — Ryizg = 0,
{| EF,
Ryiy + Reig =
sa Ag
8. Evaluate these excerpts from examination papers.
BR
we obtain the equations
Xx
z= Ay
c+yte=
Az
(f) 22+
y+
2= Ax
u+2Qy+
2= Ay
G+ yr2e=
Az
8 4 &
benmennennnnnsnnettfemenennsnisnnsninenmatrfyrererenntrerannnanrrd
—v + Qy = Ay
(d)
e
4 LW
EC)
7. For what values of the parameter A do the following homo-
y=
a
vw
a
< w/2.and0<2<2
where~7/2 <a <1/2,-1r/2<y
(a) 2a +
b
>
Ryty
gis
=
E,
(jjunction a)
(junctionc)
Gj
(loop abcda)
((loopabcea)
(Ioop
(10.1)
adcea)
where $i_1, i_2, i_3$ are the three currents (measured as positive in the direction assumed in the figure), $R_1, R_2, R_3$ are the resistances of the three resistors, and E is the voltage rise (from e to a) induced by the battery or other source. [Evidently, we did not need to apply the current law to both junctions since the resulting equations are identical. Similarly, it may be that not all of the loop equations are needed. But rather than try to decide which of equations (10.1) to keep and which to discard, let us merely keep them all.] We now state the problem: Obtain the solution set of equations (10.1) by Gauss elimination. If there is no solution, or if there is a nonunique solution, explain that result in physical terms. Take
(QR,
= Ro = R=
R
(RO)
(b)Ry =Ry= R, Ro=2R (RHO)
(c)Rh, 2R, Ry =Rg=2R (RHO)
The point P will move to a point (a, y), and we assumethatthe
cables are stiff enoughso that« andy are very small: |x| < 1
(e) Rg = R, Ry = Rg = 0
(g)
Ry
— Reo =
Res
— 0
11. (Physical example of nonexistence and nonuniqueness;
statically indeterminate structures) (a) Consider the static
and |y| < 1. Let the cables obey Hooke’s law: Ty, = ky1,
T. = kgd9, and T; = kgdg, where 4; is the increase in length
of the jth cable due to the tension T;. Since P moves to (x, y).
it follows that
equilibrium of the system shown, consisting of two weightless
oy
(11.1)
cables connected at P, at which point a vertical load F’ is applied. Requiring an equilibrium of vertical force components,
and horizontal force components too, derive two linear algebraic equations on the unknown tensions T; and 7). Are there
any combinations of angles 6, and @, (where 0 < 0; < =
and 0 < @) < 5) such that there is either no solution
Explain each step in (11.1), and show, similarly, that
11.2
(11.2)
—Y,
1
V3
or a
nonunique solution? Explain.
(b) This time let there be three cables at angles of 45°, 60°,
and 30° as shown. Again, requiring an equilibrium of vertical
and
v3
1
gh —--ogt
dy
Thus,
Ty
=
ki dy
~
k
ad y),
ae
k
To= kodq%~Z («+vy),
(11.4)
+y).
Ts= kgds& -(V3a
Putting (11.4) into the two equilibrium equations obtained in
(b) then gives two equations in the unknown displacements
x,y. Show that that system can be solved uniquely for x and
y, and thus complete the solution for 7}, T2, T3.
horizontal forces at P, derive two linear algebraic equations on
the unknown tensions 7, 75,73. Show that the equations are
consistent so there is a nonunique solution. NOTE: We say that
such a structure is statically indeterminate because the forces
in it cannot be determined from the laws of statics alone. What
information needs to be added if we are to complete the evaluation of 7),
2,73?
What is needed is information about the
relative stiffness of the cables. We pursue this to a conclusion
in (c), below.
(c) [Completion of part (b)| Before the load £ is applied, locate an x, y Cartesian coordinate system at P. Let P be | foot
below the “‘ceiling” so thecoordinatesof A, B,C are (—1,1),
(1/V3, 1), and (./3, 1), respectively. Now apply the load F’,
12. (Roundoff error difficulty due to small pivots) To illustrate
how small pivots can accentuate the effects of roundoff error,
consider the system
$$
\begin{aligned}
0.005x_1 + 1.47x_2 &= 1.49,\\
0.975x_1 + 2.32x_2 &= 6.22
\end{aligned} \tag{12.1}
$$
with exact solution $x_1 = 4$ and $x_2 = 1$. Suppose that our
computer carries three significant figures and then rounds off.
Using the first equation as our pivot equation, Gauss elimination gives
1.47 1.49 |
2.382 6.22
| 0.005
|
9
1.47
—285
1.49
—284
so @
=
284/285
=
0.996 and «,
=
[1.49 -
(1.47)(0.996)|/0.005
= (1.49~ 1.46)/0.005= 6. Showthat
off version
system
0.975a,
+ 2.32¢9
I 6.22,
0.0052, + L.47x = 1.49
(12.2)
r+
y =2,
a ~ 1.014y
=0
Chapter 8
solution set intact.
Chapter 10.
y=2,
= 0
2 : LO14y - 0
(13.2)
and the rounded-off version
v+
x+
1Oly
y = 2,
= 0,
vm 144.9, y & -142.9 and « = 202, y = —200, respectively; (13.2) is an example of a so-called ill-conditioned system (ill-conditioned in the sense that small changes in the coefficients
lead to large changes in the solution).
(13.1)the following:
is v & 1.007, y = 0.993, whereas the solution of the rounded-
10ly
is very much the same, namely « ~ 1.005, y = 0.995. In
sharp contrast,the solutions of
as our pivot equation, we obtain the result ¢, = 4.00 and
vq = 1.00 (which happens to be exactly correct).
13. (Ill-conditioned systems) Practically speaking, our numerical calculations are normally carried out on computers, be they hand-held calculators or large digital computers. Such machines carry only a finite number of significant figures and thus introduce roundoff error into most calculations. One might expect (or hope) that such slight deviations will lead to answers that are only slightly in error. For example, the solution of
t+
z-
if we use partial pivoting and then use the first equation of the
Here, we ask
Explain why (13.2) is much more sensitive to
roundoff than (13.1) by exploring the two cases graphically,
that is, in the x, y plane.
Chapter 8 Review
more fully.
The most important results of this chapter are contained in Theorems 8.3.1-3.
Finally, we also stress the value of geometrical and visual reasoning, and suggest
that you keep that idea in mind as we proceed.
Chapter 9
Vector Space
9.1
Introduction
Normally, one meets vectors for the first time within some physical context — in
studying mechanics, electric and magnetic fields, and so on. There, the vectors exist
within two- or three-dimensional space and correspond to force, velocity, position,
magnetic field, and so on. They have both magnitude and direction; they can be
scaled by multiplicative factors, added according to the parallelogram law; dot
and cross product operations are defined between vectors; the angle between two
vectors is defined; vectors can be expanded as linear combinations
of base vectors,
and so on.
Alternatively, there exists a highly formalized axiomatic approach to vectors
known as linear vector space or abstract vector space. Although this generalized
vector concept is essentially an outgrowth of the more primitive system of “arrow
vectors” in 2-space and 3-space, described
above, it extends well beyond that sys-
tem in scope and applicability.
For pedagogical reasons, we break the transition from 2-space and 3-space to
abstract vector space into two steps: in Sections 9.4 and 9.5 we introduce a generalization to “n-space,” and in Section 9.6 we complete the extension to general
vector space, including function spaces where the vectors are functions! However,
we do not return to function spaces until Chapter 17, in connection with Fourier
series and the Sturm—Liouville theory; in Chapters 9—12 our chief interest is in
n-space.
9.2
Vectors; Geometrical
Representation
Some quantities that we encounter may be completely defined by a single real
number, or magnitude; the mass or kinetic energy of a given particle, and the
temperature or salinity at some point in the ocean, are examples. Others are not
defined solely by a magnitude but rather by a magnitude and a direction, examples being force, velocity, momentum, and acceleration. Such quantities are called vectors.
The defining features of a vector being magnitude and direction suggests the geometric representation of a vector as a directed line segment, or “arrow,” where the length of the arrow is scaled according to the magnitude of the vector. For example, if the wind is blowing at 8 meters/sec from the northeast, that defines a wind-velocity vector v, where we adopt boldface type to signify that the quantity is a vector; alternative notations include the use of an overhead arrow as in $\vec{v}$. Choosing, according to convenience, a scale of 5 meters/sec per centimeter, say, the geometric representation of v is as shown in Fig. 1. Denoting the magnitude, or norm, of any vector v as ||v||, we have ||v|| = 8 for the v vector in Fig. 1.
Observe that the location of a vector is not specified, only its magnitude and direction. Thus, the two unlabeled arrows in Fig. 1 are equally valid alternative representations of v. That is not to say that the physical effect of the vector will be entirely independent of its position. For example, it should be apparent that the motion of the body B induced by a force F (Fig. 2) will certainly depend on the point of application of F,* as will the stress field induced in B. Nevertheless, the two vectors in Fig. 2 are still regarded as equal, as are the three in Fig. 1.
Like numbers, vectors do not become useful until we introduce rules for their manipulation, that is, a vector algebra. Having elected the arrow representation of vectors, the vector algebra that we now introduce will, likewise, be geometric.
First, we say that two vectors are equal if and only if their lengths are identical and if their directions are identical as well.
Next, we define a process of addition between any two vectors u and v. The first step is to move v (if necessary), parallel to itself, so that its tail coincides with the head of u. Then the sum, or resultant, u + v is defined as the arrow from the tail of u to the head of v, as in Fig. 3a. Reversing the order, v + u is as shown in Fig. 3b. Equivalently, we may place u and v tail to tail, as in Fig. 3c. Comparing Fig. 3c with Fig. 3a and b, we see that the diagonal of the parallelogram (Fig. 3c) is both u + v and v + u. Thus,
$$
\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}, \tag{1}
$$
so addition is commutative. One may show (Exercise 3) that it is associative as well,
$$
(\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w}). \tag{2}
$$
[Figure 1. Geometric representation of v. Scale: 8 m/sec/cm.]
[Figure 2. Position of a vector.]
Next, we define any vector of zero length to be a zero vector, denoted as 0. Its length being zero, its direction is immaterial; any direction may be assigned if desired. From the definition of addition above, it follows that
$$
\mathbf{u} + \mathbf{0} = \mathbf{0} + \mathbf{u} = \mathbf{u} \tag{3}
$$
for each vector u.
Corresponding to u we define a negative inverse “−u” such that if u is any nonzero vector, then −u is determined uniquely, as shown in Fig. 4a; that is, it is of the same length as u but is directed in the opposite direction (again, u and −u have the same length; the length of −u is not negative). For the zero vector we have −0 = 0. We denote u + (−v) as u − v (“u minus v”) but emphasize that it is really the addition of u and −v, as in Fig. 4b.
*Students of mechanics know that the point of application of F affects the rotational part of the motion but not the translational part.
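Finally, we introduce another operation, called scalar multiplication, between any vector u and any scalar (i.e., a real number) $\alpha$: If $\alpha \neq 0$ and $\mathbf{u} \neq \mathbf{0}$, then $\alpha\mathbf{u}$ is a vector whose length is $|\alpha|$ times the length of u and whose direction is the same as that of u if $\alpha > 0$, and the opposite if $\alpha < 0$; if $\alpha = 0$ and/or $\mathbf{u} = \mathbf{0}$, then $\alpha\mathbf{u} = \mathbf{0}$. This definition is illustrated in Fig. 5. It follows from this definition that scalar multiplication has the following algebraic properties:
$$
\alpha(\beta\mathbf{u}) = (\alpha\beta)\mathbf{u}, \tag{4a}
$$
$$
(\alpha + \beta)\mathbf{u} = \alpha\mathbf{u} + \beta\mathbf{u}, \tag{4b}
$$
$$
\alpha(\mathbf{u} + \mathbf{v}) = \alpha\mathbf{u} + \alpha\mathbf{v}, \tag{4c}
$$
$$
1\mathbf{u} = \mathbf{u}, \tag{4d}
$$
[Figure 4. −u and vector subtraction.]
where $\alpha, \beta$ are any scalars and u, v are any vectors.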
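Properties (1) and (4a)-(4d) can be spot-checked numerically. The sketch below goes beyond this section in one respect, which should be read as an assumption: it represents each arrow by a pair of Cartesian components, whereas the definitions above are purely geometric. The function names and the choice of axes for the wind example (x toward the east, y toward the north) are likewise illustrative.

```python
import math

def add(u, v):
    """Sum of two arrows given by their (x, y) components."""
    return (u[0] + v[0], u[1] + v[1])

def scale(alpha, u):
    """Scalar multiple alpha*u."""
    return (alpha * u[0], alpha * u[1])

def norm(u):
    """Length of the arrow u."""
    return math.hypot(u[0], u[1])

u, v = (3.0, 1.0), (-1.0, 2.0)
alpha, beta = 2.0, -0.5

assert add(u, v) == add(v, u)                                             # (1)
assert scale(alpha, scale(beta, u)) == scale(alpha * beta, u)             # (4a)
assert scale(alpha + beta, u) == add(scale(alpha, u), scale(beta, u))     # (4b)
assert scale(alpha, add(u, v)) == add(scale(alpha, u), scale(alpha, v))   # (4c)
assert scale(1.0, u) == u                                                 # (4d)

# Length of alpha*u is |alpha| times the length of u.
assert math.isclose(norm(scale(beta, u)), abs(beta) * norm(u))

# Wind at 8 meters/sec from the northeast, i.e., directed toward the southwest.
wind = scale(8.0, (-math.sqrt(0.5), -math.sqrt(0.5)))
print(norm(wind))
```

The sample values were chosen to be exactly representable in binary floating point so that the equalities hold exactly; for arbitrary values a tolerance-based comparison such as `math.isclose` would be the safer check.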
Observe that the parallelogram rule of vector addition is a definition so it does not need to be proved. Nevertheless, definitions are not necessarily fruitful so it is worthwhile to reflect for a moment on why the parallelogram rule has proved important and useful. Basically, if we say that “the sum of u and v is w,” and thereby pass from the two vectors u, v to the single vector w, it seems fair to expect some sort of equivalence to exist between the action of w and the joint action of u and v. For example, if $\mathbf{F}_1$ and $\mathbf{F}_2$ are two forces acting on a body B, as shown in Fig. 6, it is known from fundamental principles of mechanics that their combined effect will be the same as that due to the single force F, so it seems reasonable and natural to say that F is the sum of $\mathbf{F}_1$ and $\mathbf{F}_2$. This concept goes back at least as far as Aristotle (384-322 B.C.). Thus, while the algebra of vectors is developed here as an essentially mathematical matter, it is important to appreciate the role of physics and physical motivation.
In closing this section, let us remark that our foregoing discussion should not be construed to imply that objects of physical interest are necessarily vectors (as are force and velocity) or scalars [as are temperature, mass, and speed (i.e., the magnitude of the velocity vector)]. For example, in the study of mechanics one finds that more than a magnitude and a direction are needed to fully define the state of stress at a point; in fact, a “second-order tensor” is needed, a quantity that is more exotic than a vector in much the same way that a vector is more exotic than a scalar.*
[Figure 6. Physical motivation for parallelogram addition.]
*For an introduction to tensors, we recommend to the interested reader the 68-page book Tensor Analysis by H. D. Block (Columbus, OH: Charles E. Merrill, 1962).
9.2
EXERCISES
1. Trace the vectors A, B, C, shown, where A is twice as long as B. Then determine each of the
following by graphical means.
(a) A + B + C
(b) B - A
(c) A - C + 3B
(d) 2(B - A) + 6C
(e) A + (4B - C)
(f) A + 2B - 2C
2. In each case, C can be expressed as a linear combination of A and B, that is, as
$\mathbf{C} = \alpha\mathbf{A} + \beta\mathbf{B}$. Trace the three vectors and by graphical means determine $\alpha$ and $\beta$.
[Figure for Exercise 2: labeled panels showing the vectors A, B, C; not reproduced.]
3. Show that the associative property (2) follows from the graphical definition of vector addition.
4. Derive the following from the definitions of vector addition and scalar multiplication:
(a) property (4a)
(b) property (4b)
(c) property (4c)
(d) property (4d)
5. (a) If $\|\mathbf{A}\| = 1$, $\|\mathbf{B}\| = 2$, and $\|\mathbf{C}\| = 5$, can A + B + C = 0?
HINT: Use the law of cosines $s^2 = q^2 + r^2 - 2qr\cos\theta$ (see the accompanying figure) or the
Euclidean proposition that the length of any one side of a triangle cannot exceed the sum of
the lengths of the other two sides.
(b) Repeat part (a), with “$\|\mathbf{A}\| = 1$” changed to $\|\mathbf{A}\| = 4$.
6. Use the definitions and properties given in the reading to show that A + B = C implies that
A = C - B.
7. (a) Show that if $\alpha\mathbf{A} + \beta\mathbf{B} = \mathbf{0}$ and A and B are not parallel, then each of $\alpha$ and $\beta$ must be 0.
(b) Vectors are often of help in deriving geometrical relationships. For example, to show that the
diagonals of a parallelogram bisect each other one may proceed as follows. From the accompanying
figure $\mathbf{A} + \mathbf{B} = \mathbf{C}$, $\mathbf{A} - \alpha\mathbf{D} = \beta\mathbf{C}$, and $\mathbf{A} = \mathbf{B} + \mathbf{D}$. Eliminating A and B, we obtain
$(2\beta - 1)\mathbf{C} = (1 - 2\alpha)\mathbf{D}$, and since C and D are not parallel, it must be true [per part (a)]
that $2\beta - 1 = 1 - 2\alpha = 0$ (i.e., $\alpha = \beta = \tfrac{1}{2}$), which completes the proof. We now state the
problem: Use this sort of procedure to show that a line from one vertex of a parallelogram to the
midpoint of a nonadjacent side trisects a diagonal.
8. If (see the accompanying figure) the vector $\mathbf{A} + \alpha\mathbf{B}$ is placed with its tail at point P, show
the line generated by its head as $\alpha$ varies between $-\infty$ and $+\infty$.
9. If (see the accompanying figure) $\|\mathbf{AB}\| / \|\mathbf{AC}\| = \alpha$, show that
$\mathbf{OB} = \alpha\,\mathbf{OC} + (1 - \alpha)\,\mathbf{OA}$.
Chapter 9. Vector Space
10. One may express linear displacement as a vector: If a particle moves from point A to point B,
the displacement vector is the directed line segment, say u, from A to B. For example, observe that
a displacement u from A to B, followed by a displacement v from B to C, is equivalent to a single
displacement w from A to C: u + v = w [part (a) in the accompanying figure]. Reversing the order,
displacements v and then u also carry us from A to C: v + u = w [part (b) in the figure]. Thus,
u + v = v + u so that the commutativity axiom (1) is indeed satisfied. How about angular
displacements? Suppose that we express the angular displacement of a rigid body about an axis
as $\boldsymbol{\theta}$, where the magnitude of $\boldsymbol{\theta}$ is equal to the angle of rotation, and the orientation
of $\boldsymbol{\theta}$ is along the axis of rotation, in the direction specified by the “right-hand rule.” That
is, if we curl the fingers of our right hand about the axis of rotation, in the direction of rotation,
then the direction of $\boldsymbol{\theta}$ along the axis of rotation is the direction in which our thumb points. The
problem is to show that $\boldsymbol{\theta}$, defined in this way, is not a proper vector quantity. HINT: Considering
the unit cube shown below, say, show (by keeping track of the coordinates of the corner A) that
the orientation that results from a rotation of $\pi/2$ about the x axis, followed by a rotation of
$\pi/2$ about the y axis, is not the same as that which results when the order of the rotations is
reversed. NOTE: If you have encountered angular velocity vectors (usually denoted as $\boldsymbol{\omega}$ or
$\boldsymbol{\Omega}$) in mechanics, it may seem strange to you that finite rotations (assigned a vector direction
by the right-hand rule) are not true vectors. The idea is that angular velocity involves
infinitesimal rotations, and infinitesimal rotations (assigned a vector direction by the right-hand
rule) are true vectors. This subtle point is discussed in many sources (e.g., Robert R. Long,
Engineering Science Mechanics, Englewood Cliffs, NJ: Prentice Hall, 1963, pp. 31-36).
9.3
Dot Product
Continuing our discussion, we define here the angle between two vectors and a “dot
product” operation between two vectors. The angle $\theta$ between two nonzero vectors
u and v will be understood to mean the ordinary angle between the two vectors
when they are arranged tail to tail as in Fig. 1. (We will not attempt to define $\theta$ if
one or both of the vectors is 0.) Of course, this definition of $\theta$ is ambiguous in that
there are two such angles, an interior angle ($\le \pi$) and an exterior angle ($\ge \pi$); for
definiteness, we choose $\theta$ to be the interior angle,
$$0 \le \theta \le \pi, \tag{1}$$
as in Fig. 1. Unless explicitly stated otherwise, angular measure will be understood to be in radians.
Figure 1. The angle $\theta$ between u and v.
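Next, we define the so-called dot product, u·v, between two vectors u and v
as
$$\mathbf{u}\cdot\mathbf{v} \equiv \begin{cases} \|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta & \text{if } \mathbf{u},\mathbf{v} \neq \mathbf{0},\\ 0 & \text{if } \mathbf{u}=\mathbf{0} \text{ or } \mathbf{v}=\mathbf{0}; \end{cases} \tag{2a,b}$$
$\|\mathbf{u}\|$, $\|\mathbf{v}\|$, and $\cos\theta$ are scalars so u·v is a scalar, too.*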
By way of geometrical interpretation, observe (Fig. 2a) that $\|\mathbf{u}\|\cos\theta$ is the
length of the orthogonal projection of u on the line of action of v so that
$\mathbf{u}\cdot\mathbf{v} = \|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta = (\|\mathbf{v}\|)(\|\mathbf{u}\|\cos\theta)$ is the length of v times the length of the
orthogonal projection of u on the line of action of v.† Actually, that statement holds
if $0 \le \theta \le \pi/2$; if $\pi/2 < \theta \le \pi$, the cosine is negative, and $\mathbf{u}\cdot\mathbf{v} = \|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta$
is the negative of the length of v times the length of the orthogonal projection of u
on the line of action of v.
EXAMPLE 1. Work Done by a Force. In mechanics the work W done when a body
undergoes a linear displacement from an initial point A to a final point B, under the action
of a constant force F (Fig. 3), is defined as the length of the orthogonal projection of F on
the line of displacement, positive if F is “assisting” the motion (i.e., if $0 \le \theta < \pi/2$, as in
Fig. 3a) and negative if F is “opposing” the motion (i.e., if $\pi/2 < \theta \le \pi$, as in Fig. 3b),
times the displacement. By the displacement we mean the length of the vector AB with
head at B and tail at A. But that product is precisely the dot product of F with AB,
$$W = \mathbf{F}\cdot\mathbf{AB}. \tag{3}$$
Figure 2. Projection of u on v.
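The work formula can be sketched numerically. The following is a minimal illustration, assuming the force magnitude, displacement length, and angle are given directly; the function name `work_from_angle` and the sample values (a 10-unit force, a 3-unit displacement) are hypothetical and do not come from the text.

```python
import math

def work_from_angle(force_mag, disp_mag, theta):
    """Work as the dot product of F with AB, via ||F|| ||AB|| cos(theta).

    theta is the interior angle between F and AB, in radians (0 <= theta <= pi).
    If either vector is the zero vector the dot product is defined to be 0.
    """
    if force_mag == 0 or disp_mag == 0:
        return 0.0
    return force_mag * disp_mag * math.cos(theta)

# Hypothetical sample values: ||F|| = 10, ||AB|| = 3.
assisting = work_from_angle(10.0, 3.0, math.pi / 3)       # theta < pi/2: positive work
opposing = work_from_angle(10.0, 3.0, 2 * math.pi / 3)    # theta > pi/2: negative work
perpendicular = work_from_angle(10.0, 3.0, math.pi / 2)   # theta = pi/2: (nearly) zero work
```

The sign of the result follows the sign of the cosine, which matches the “assisting” and “opposing” cases described in the example.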
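An important special case of the dot product occurs when $\theta = \pi/2$. Then u
and v are perpendicular, and
$$\mathbf{u}\cdot\mathbf{v} = \|\mathbf{u}\|\,\|\mathbf{v}\|\cos\frac{\pi}{2} = 0. \tag{4}$$
Also of importance is the case where u = v. Then, according to (2),
$$\mathbf{u}\cdot\mathbf{u} = \begin{cases} \|\mathbf{u}\|\,\|\mathbf{u}\|\cos 0 = \|\mathbf{u}\|^2 & \text{if } \mathbf{u}\neq\mathbf{0},\\ 0 & \text{if } \mathbf{u}=\mathbf{0}, \end{cases} \tag{5}$$
so that we have
$$\|\mathbf{u}\| = \sqrt{\mathbf{u}\cdot\mathbf{u}}. \tag{6}$$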
*You may wonder why (2b) is needed since if u = 0, say, then $\|\mathbf{u}\| = 0$ and $\|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta$ is
apparently zero anyway. But the point is that if u and/or v are 0, then $\theta$ is undefined; hence $\cos\theta$
(and even zero times $\cos\theta$) is undefined, too.
†Alternatively, we could decompose $\mathbf{u}\cdot\mathbf{v} = \|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta = (\|\mathbf{u}\|)(\|\mathbf{v}\|\cos\theta)$; that is, as the
length of u times the length of the orthogonal projection of v on the line of action of u.
Figure 3. Work done by F.
EXERCISES
9.3
1. Evaluate u·v in each case. In (a) $\|\mathbf{u}\| = 5$, in (b) $\|\mathbf{u}\| = 3$, and in (c) $\|\mathbf{u}\| = 6$.
[Figure for Exercise 1: panels (a)-(c); not reproduced.]
2. (Properties of the dot product) Prove each of the following properties of the dot product,
where $\alpha$, $\beta$ are any scalars, and u, v, w are any vectors.
(a) $\mathbf{u}\cdot\mathbf{v} = \mathbf{v}\cdot\mathbf{u}$ (commutativity)
(b) $\mathbf{u}\cdot\mathbf{u} > 0$ for all $\mathbf{u} \neq \mathbf{0}$, $= 0$ for $\mathbf{u} = \mathbf{0}$ (nonnegativeness)
(c) $(\alpha\mathbf{u} + \beta\mathbf{v})\cdot\mathbf{w} = \alpha(\mathbf{u}\cdot\mathbf{w}) + \beta(\mathbf{v}\cdot\mathbf{w})$ (linearity)
HINT: In proving part (c), you may wish to show, first, that part (c) is equivalent to the two
conditions $(\mathbf{u} + \mathbf{v})\cdot\mathbf{w} = \mathbf{u}\cdot\mathbf{w} + \mathbf{v}\cdot\mathbf{w}$ and $(\alpha\mathbf{u})\cdot\mathbf{v} = \alpha(\mathbf{u}\cdot\mathbf{v})$.
3. Using the properties given in Exercise 2, show that
$$(\mathbf{u}+\mathbf{v})\cdot(\mathbf{w}+\mathbf{x}) = \mathbf{u}\cdot\mathbf{w} + \mathbf{u}\cdot\mathbf{x} + \mathbf{v}\cdot\mathbf{w} + \mathbf{v}\cdot\mathbf{x}. \tag{3.1}$$
4. Consider the unit cube shown, where P is the midpoint of the right-hand face. Evaluate each of
the following using the definition (2), and (3.1) in Exercise 3. HINT: To evaluate AC·OP, for
instance, write AC·OP = (AD + DC)·(OD + DP) and then use (3.1).
(a) OC·AB
(b) BA·OP
(c) AC·OP
(d) OC·CP
(e) OC·OP
(f) BC·OP
(g) AO·OP
(h) CP·DP
(i) BP·DB
(j) PB·CO
(k) AP·PB
(l) AO·PA
5. Referring to the figure in Exercise 4, use the dot product to compute the following angles.
(See the hint in Exercise 4.) You may use (2), (6), and (3.1).
(a) APO
(b) APB
(c) APC
(d) APD
(e) ABP
(f) ACP
(g) BPO
(h) BPC
(i) BPD
(j) BOP
(k) CPO
(l) DPO
6. If u and v are nonzero, show that $\mathbf{w} = \|\mathbf{v}\|\mathbf{u} + \|\mathbf{u}\|\mathbf{v}$ bisects the angle between u and v.
(You may use any of the properties given in Exercise 2.)
9.4
n-Space
“n-space.”
The idea is simple and is based on the familiar representation of points in
Cartesian 1-, 2-, and 3-space as 1-, 2-, and 3-tuples of real numbers. For example,
the 2-tuple $(a_1, a_2)$ denotes the point P indicated in Fig. 1a, where $a_1$, $a_2$ are the
x, y coordinates, respectively. But it can also serve to denote the vector OP in
Fig. 1b or, indeed, any equivalent vector QR.
Thus the vector is now represented as the 2-tuple $(a_1, a_2)$ rather than as an
arrow, and while pictures may still be drawn, as in Fig. 1b, they are no longer
essential and can be discarded if we wish, at least once the algebra of 2-tuples is
established (in the next paragraph). The set of all such real 2-tuple vectors will be
called 2-space and will be denoted by the symbol $\mathbb{R}^2$; that is,
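$$\mathbb{R}^2 = \{(a_1, a_2) \mid a_1, a_2 \text{ real numbers}\}. \tag{1}$$
Vectors $\mathbf{u} = (u_1, u_2)$ and $\mathbf{v} = (v_1, v_2)$ in $\mathbb{R}^2$ are defined to be equal if $u_1 = v_1$
and $u_2 = v_2$; their sum is defined as*
$$\mathbf{u} + \mathbf{v} \equiv (u_1 + v_1,\, u_2 + v_2) \tag{2}$$
as can be seen from Fig. 2; the scalar multiple $\alpha\mathbf{u}$ is defined, for any scalar $\alpha$, as
$$\alpha\mathbf{u} \equiv (\alpha u_1,\, \alpha u_2); \tag{3}$$
the zero vector is
$$\mathbf{0} \equiv (0, 0); \tag{4}$$
and the negative of u is
$$-\mathbf{u} \equiv (-u_1,\, -u_2). \tag{5}$$
Similarly, for $\mathbb{R}^3$:
$$\mathbb{R}^3 = \{(a_1, a_2, a_3) \mid a_1, a_2, a_3 \text{ real numbers}\}, \tag{6}$$
$$\mathbf{u} + \mathbf{v} \equiv (u_1 + v_1,\, u_2 + v_2,\, u_3 + v_3), \tag{7}$$
and so on.†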
It may not be evident that we have gained much since the arrow and n-tuple
representations are essentially equivalent. But, in fact, the n-tuple format begins
to “open doors.” For example, the instantaneous state of the electrical circuit (consisting of a
battery and two resistors) shown in Fig. 3 may be defined by the two
currents $i_1$ and $i_2$ or, equivalently, by the single 2-tuple vector $(i_1, i_2)$. Thus, even
though “magnitudes,” “directions,” and “arrow vectors” may not leap to mind in describing the
system shown in Fig. 3, a vector representation is quite natural within
the n-tuple framework, and that puts us in a position, in dealing with that electrical
system, to make use of whatever vector theorems and techniques are available, as
developed in subsequent sections and chapters.
*We use the $\equiv$ equal sign to mean equal to by definition.
†The space $\mathbb{R}^1$ of 1-tuples will not be of interest here.
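Indeed, why stop at 3-tuples? One may introduce the set of all ordered real
n-tuple vectors, even if n is greater than 3. We call this n-space, and denote it as
$\mathbb{R}^n$, that is,
$$\mathbb{R}^n = \{(a_1, \ldots, a_n) \mid a_1, \ldots, a_n \text{ real numbers}\}. \tag{8}$$
Consider two vectors, $\mathbf{u} = (u_1, \ldots, u_n)$ and $\mathbf{v} = (v_1, \ldots, v_n)$, in $\mathbb{R}^n$. The scalars
$u_1, \ldots, u_n$ and $v_1, \ldots, v_n$ are called the components of u and v. As you may well
expect, based on our foregoing discussion of $\mathbb{R}^2$ and $\mathbb{R}^3$, u and v are said to be
equal if $u_1 = v_1, \ldots, u_n = v_n$, and we define
$$\mathbf{u} + \mathbf{v} \equiv (u_1 + v_1, \ldots, u_n + v_n), \quad \text{(addition)} \tag{9a}$$
$$\alpha\mathbf{u} \equiv (\alpha u_1, \ldots, \alpha u_n), \quad \text{(scalar multiplication)} \tag{9b}$$
$$\mathbf{0} \equiv (0, \ldots, 0), \quad \text{(zero vector)} \tag{9c}$$
$$-\mathbf{u} \equiv (-1)\mathbf{u}, \quad \text{(negative inverse)} \tag{9d}$$
$$\mathbf{u} - \mathbf{v} \equiv \mathbf{u} + (-\mathbf{v}). \tag{9e}$$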
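From these definitions we may deduce the following properties:
$$\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}, \quad \text{(commutativity)} \tag{10a}$$
$$(\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w}), \quad \text{(associativity)} \tag{10b}$$
$$\mathbf{u} + \mathbf{0} = \mathbf{u}, \tag{10c}$$
$$\mathbf{u} + (-\mathbf{u}) = \mathbf{0}, \tag{10d}$$
$$\alpha(\beta\mathbf{u}) = (\alpha\beta)\mathbf{u}, \quad \text{(associativity)} \tag{10e}$$
$$(\alpha + \beta)\mathbf{u} = \alpha\mathbf{u} + \beta\mathbf{u}, \quad \text{(distributivity)} \tag{10f}$$
$$\alpha(\mathbf{u} + \mathbf{v}) = \alpha\mathbf{u} + \alpha\mathbf{v}, \quad \text{(distributivity)} \tag{10g}$$
$$1\mathbf{u} = \mathbf{u}, \tag{10h}$$
$$0\mathbf{u} = \mathbf{0}, \tag{10i}$$
$$(-1)\mathbf{u} = -\mathbf{u}, \tag{10j}$$
$$\alpha\mathbf{0} = \mathbf{0}. \tag{10k}$$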
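The componentwise definitions of addition, scalar multiplication, zero vector, and negative can be sketched in code and used to spot-check a few of the listed properties on sample data. This is only an illustration with integer 4-tuples; the helper names (`add`, `scale`, `zero`, `neg`, `sub`) and the chosen scalars are assumptions for the sketch, and passing a spot check is of course not a proof.

```python
def add(u, v):
    """Componentwise sum; only defined for tuples of equal length."""
    if len(u) != len(v):
        raise ValueError("addition is defined only for n-tuples of equal length")
    return tuple(a + b for a, b in zip(u, v))

def scale(alpha, u):
    """Scalar multiple alpha * u, component by component."""
    return tuple(alpha * a for a in u)

def zero(n):
    """Zero vector of n-space."""
    return (0,) * n

def neg(u):
    """Negative of u, defined as (-1) * u."""
    return scale(-1, u)

def sub(u, v):
    """Difference u - v, defined as u + (-v)."""
    return add(u, neg(v))

# Sample 4-tuples and scalars (illustrative values only).
u, v, w = (1, 3, 0, -2), (2, 0, -5, 0), (4, 3, 2, -1)
alpha, beta = 3, -2

checks = {
    "commutativity": add(u, v) == add(v, u),
    "associativity": add(add(u, v), w) == add(u, add(v, w)),
    "zero": add(u, zero(4)) == u,
    "negative": add(u, neg(u)) == zero(4),
    "scalar associativity": scale(alpha, scale(beta, u)) == scale(alpha * beta, u),
    "distributivity (scalars)": scale(alpha + beta, u) == add(scale(alpha, u), scale(beta, u)),
    "distributivity (vectors)": scale(alpha, add(u, v)) == add(scale(alpha, u), scale(alpha, v)),
    "zero scalar": scale(0, u) == zero(4),
    "scaled zero vector": scale(alpha, zero(4)) == zero(4),
}
```

Integer components are used so that the equality tests are exact; with floating-point components a tolerance would be needed.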
To illustrate how such n-tuples might arise, observe that the state of the electrical system
shown in Fig. 4 may be defined at any instant by the four currents
$i_1, i_2, i_3, i_4$, and that these may be regarded as the components of a single vector
$\mathbf{i} = (i_1, i_2, i_3, i_4)$ in $\mathbb{R}^4$.
Of course, the notion of $(u_1, \ldots, u_n)$ as a point or arrow in an “n-dimensional
space” can be realized graphically only if $n \le 3$; if $n > 3$, the interpretation is valid
only in an abstract, or schematic, sense. However, our inability to carry out traditional
Cartesian graphical constructions for $n > 3$ will be no hindrance. Indeed,
part of the idea here is to move away from a dependence on graphical constructions.
Having extended the vector concept to $\mathbb{R}^n$, you may well wonder if further extension is possible. Such extension is not only possible, it constitutes an important
step in modern mathematics;
more about this in Section 9.6.
EXERCISES
9.4
1. If t = (5, 0, 1, 2), u = (2, -1, 3, 4), v = (4, -5, 1), w = (-1, -2, 5, 6), evaluate each of the
following (as a single vector); if the operation is undefined (i.e., has not been defined
here), state that. At each step cite the equation number of the definition or property being used.
(a) 2t + 7u
(b) 3t - 5u
(c) 4[u + 5(w - 2u)]
(d) 4tu + w
(e) -w + t
(f) 2t/u
(g) t + 2u + 3w
(h) t - 2u - 4v
(i) u(3t + w)
(j) $\mathbf{u}^2 + 2\mathbf{t}$
(k) 2t + 7u - 4
(l) u + wt
(m) sin u
(n) w + t - 2u
2. Let u = (1, 3, 0, -2), v = (2, 0, -5, 0), and w = (4, 3, 2, -1).
(a) If 8u - x = 4(v + 2x), solve for x (i.e., find its components).
(b) If x + u + v + w = 0, solve for x.
3. Let u, v, and w be as given in Exercise 2. Citing the definition or property used, at each step,
solve each of the following for x. NOTE: Besides the definitions and properties stated in
this section, it should be clear that if x = y, then x + z = y + z
for any z, and $\alpha\mathbf{x} = \alpha\mathbf{y}$ for any $\alpha$ (adding and multiplying equals by equals.)
(a) 8x + 2(u - 5v) = w
(b) $3\mathbf{x} = 4\,\mathbf{0} + (1, 0, 0, 0)$
(c) u - 4x = 0
(d) u + v - 2x = w
4. If t = (2, 1, 3), u = (1, 2, -4), v = (0, 1, 1), w = (-2, 1, -1), solve each of the following for
the scalars $\alpha_1, \alpha_2, \alpha_3$. If no such scalars exist, state that.
(a) $\alpha_1\mathbf{t} + \alpha_2\mathbf{u} + \alpha_3\mathbf{v} = \mathbf{0}$
(b) $\alpha_1\mathbf{t} + \alpha_2\mathbf{v} + \alpha_3\mathbf{w} = \mathbf{0}$
(c) $\alpha_1\mathbf{t} + \alpha_2\mathbf{u} + \alpha_3\mathbf{w} = (1, 3, 2)$
(d) $\alpha_1\mathbf{t} + \alpha_2\mathbf{v} + \alpha_3\mathbf{w} = (2, 0, -1)$
(e) $\alpha_1\mathbf{u} + \alpha_2\mathbf{v} = \mathbf{0}$
(f) $\alpha_1\mathbf{u} + \alpha_2\mathbf{v} = \alpha_3\mathbf{w} - (2, 0, 0)$
5. (a) If u and v are given 4-tuples and 0 = (0, 0, 0, 0), does
the vector equation $\alpha_1\mathbf{u} + \alpha_2\mathbf{v} = \mathbf{0}$ necessarily have nontrivial
solutions for the scalars $\alpha_1$ and $\alpha_2$? Explain. (If the answer is
“no,” a counterexample will suffice.)
(b) Repeat part (a), but where u, v are 3-tuples and 0 = (0, 0, 0).
(c) Repeat part (a), but where u, v are 2-tuples and 0 = (0, 0).
9.5
Dot Product, Norm, and Angle for n-Space
[...] vectors will not be possible here for $n > 3$. Thus, if $\mathbf{u} = (u_1, \ldots, u_n)$, we wish
to define the norm or “length” of u, denoted as $\|\mathbf{u}\|$, [...] $u_1, \ldots, u_n$ of u; and given
another vector $\mathbf{v} = (v_1, \ldots, v_n)$, [...] $u_1, \ldots, u_n$ and $v_1, \ldots, v_n$. [...]
in Sections 9.2 and 9.3 in the event that n = 2 or 3. [...] formula
$$\mathbf{u}\cdot\mathbf{v} = \|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta, \tag{1}$$
to re-express it in terms of vector components for $\mathbb{R}^2$ and $\mathbb{R}^3$, and then to generalize
those forms to $\mathbb{R}^n$.
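If u and v are vectors in $\mathbb{R}^2$ as shown in Fig. 1, formula (1) may be expressed
in terms of the components of u and v as follows:
$$\begin{aligned}
\mathbf{u}\cdot\mathbf{v} &= \|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta \\
&= \|\mathbf{u}\|\,\|\mathbf{v}\|\cos(\beta - \alpha) \\
&= \|\mathbf{u}\|\,\|\mathbf{v}\|(\cos\beta\cos\alpha + \sin\beta\sin\alpha) \\
&= (\|\mathbf{u}\|\cos\alpha)(\|\mathbf{v}\|\cos\beta) + (\|\mathbf{u}\|\sin\alpha)(\|\mathbf{v}\|\sin\beta) \\
&= u_1 v_1 + u_2 v_2.
\end{aligned} \tag{2}$$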
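The chain of equalities leading to the component form in 2-space can be checked numerically. The sketch below assumes that u makes angle alpha and v makes angle beta with the positive x axis (as the derivation suggests); the helper `polar` and the sample lengths and angles are hypothetical choices for illustration.

```python
import math

def polar(length, angle):
    """Components of a 2-space arrow vector with the given length and polar angle."""
    return (length * math.cos(angle), length * math.sin(angle))

# Hypothetical sample data: ||u|| = 2 at angle alpha, ||v|| = 1.5 at angle beta.
alpha, beta = 0.3, 1.1
u = polar(2.0, alpha)
v = polar(1.5, beta)

geometric = 2.0 * 1.5 * math.cos(beta - alpha)   # ||u|| ||v|| cos(theta), theta = beta - alpha
componentwise = u[0] * v[0] + u[1] * v[1]        # u1*v1 + u2*v2
```

Both numbers agree up to rounding, which is just the cosine difference identity used in the middle step of the derivation.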
We state, without derivation, that the analogous result for $\mathbb{R}^3$ is
$$\mathbf{u}\cdot\mathbf{v} = u_1 v_1 + u_2 v_2 + u_3 v_3. \tag{3}$$
Figure 1. u·v in terms of components.
Generalizing (2) and (3) to $\mathbb{R}^n$, it is eminently reasonable to define the (scalar-valued)
dot product of two n-tuple vectors $\mathbf{u} = (u_1, \ldots, u_n)$ and $\mathbf{v} = (v_1, \ldots, v_n)$ as
$$\mathbf{u}\cdot\mathbf{v} \equiv u_1 v_1 + u_2 v_2 + \cdots + u_n v_n = \sum_{j=1}^{n} u_j v_j. \tag{4}$$
Observe that we have not proved (4); it is a definition.
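Defining the dot product is the key, for now $\|\mathbf{u}\|$ and $\theta$ follow readily. Specifically,
we define
$$\|\mathbf{u}\| \equiv \sqrt{\mathbf{u}\cdot\mathbf{u}} = \sqrt{\sum_{j=1}^{n} u_j^2} \tag{5}$$
in accordance with equation (6) in Section 9.3, and
$$\theta \equiv \cos^{-1}\left(\frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|}\right) \tag{6}$$
from (1), where the inverse cosine is understood to be in the interval $[0, \pi]$.* Notice
the generalized Pythagorean-theorem nature of (5).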
Other dot products and norms are sometimes defined for n-space, but we
choose to use (4) and (5), which are known as the Euclidean dot product and
Euclidean norm, respectively. To signify that the Euclidean dot product and norm
have been adopted, we henceforth refer to the space as Euclidean n-space, rather
than just n-space. We will still denote it by the symbol $\mathbb{R}^n$ (although some authors
prefer the notation $\mathbb{E}^n$).
*By the “interval [a, b] on a real x axis,” we mean the points $a \le x \le b$. Such an interval is said
to be closed since it includes the two endpoints. To denote the open interval $a < x < b$, we write
(a, b). Similarly, [a, b) means $a \le x < b$, and (a, b] means $a < x \le b$. Implicit in the closed-interval
notation [a, b] is the finiteness of a and b.
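EXAMPLE 1. Let u = (1, 0) and v = (2, -2). Then
$$\mathbf{u}\cdot\mathbf{v} = (1)(2) + (0)(-2) = 2,$$
$$\|\mathbf{u}\| = \sqrt{(1)^2 + (0)^2} = 1,$$
$$\|\mathbf{v}\| = \sqrt{(2)^2 + (-2)^2} = 2\sqrt{2},$$
$$\theta = \cos^{-1}\left(\frac{2}{2\sqrt{2}}\right) = \frac{\pi}{4} \quad (\text{or } 45^\circ)$$
as is readily verified if we sketch u and v as arrow vectors in a Cartesian plane.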
EXAMPLE 2. Let u = (2, -2, 4, -1) and v = (5, 9, -1, 0). Then,
$$\mathbf{u}\cdot\mathbf{v} = (2)(5) + (-2)(9) + (4)(-1) + (-1)(0) = -12, \tag{7}$$
$$\|\mathbf{u}\| = \sqrt{(2)^2 + (-2)^2 + (4)^2 + (-1)^2} = 5, \tag{8}$$
$$\|\mathbf{v}\| = \sqrt{(5)^2 + (9)^2 + (-1)^2 + (0)^2} = \sqrt{107}, \tag{9}$$
$$\theta = \cos^{-1}\left(\frac{-12}{5\sqrt{107}}\right) = \cos^{-1}(-0.232) \approx 1.805 \quad (\text{or } 103.4^\circ). \tag{10}$$
In this case, n (= 4) is greater than 3 so (7) through (10) are not to be understood in
any physical or graphical sense, but merely in terms of the definitions (4) to (6).
COMMENT. The dot product of u = (2, -2, 4) and v = (5, 9, -1, 0), on the other hand, is
not defined since here u and v are members of different spaces, $\mathbb{R}^3$ and $\mathbb{R}^4$, respectively. It
is not legitimate to augment u to the form (2, -2, 4, 0) on the grounds that “surely adding
a zero can't hurt.”
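The Euclidean dot product, norm, and angle for n-tuples can be sketched as three short functions. This is a minimal illustration, not a library API: the names `dot`, `norm`, and `angle` are chosen here for convenience, and the clamp inside `angle` is an added safeguard against floating-point rounding rather than part of the definitions.

```python
import math

def dot(u, v):
    """Euclidean dot product: sum of products of corresponding components."""
    if len(u) != len(v):
        raise ValueError("dot product is undefined for vectors from different spaces")
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Euclidean norm: square root of u . u."""
    return math.sqrt(dot(u, u))

def angle(u, v):
    """Angle between nonzero u and v, in radians, in the interval [0, pi]."""
    nu, nv = norm(u), norm(v)
    if nu == 0 or nv == 0:
        raise ValueError("angle is undefined if either vector is the zero vector")
    c = dot(u, v) / (nu * nv)
    c = max(-1.0, min(1.0, c))  # guard against rounding slightly outside [-1, 1]
    return math.acos(c)

# The vectors of the two worked examples.
theta_1 = angle((1, 0), (2, -2))
theta_2 = angle((2, -2, 4, -1), (5, 9, -1, 0))
theta_2_degrees = math.degrees(theta_2)
```

Calling `dot((2, -2, 4), (5, 9, -1, 0))` raises an error instead of silently padding with a zero, in line with the comment that the two vectors belong to different spaces.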
There is one catch that you may have noticed: (6) serves to define a (real) $\theta$
only if the argument of the inverse cosine is less than or equal to unity in magnitude.
That this is indeed true is not so obvious. Nevertheless, that
$$\left|\frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|}\right| \le 1 \quad \text{or} \quad |\mathbf{u}\cdot\mathbf{v}| \le \|\mathbf{u}\|\,\|\mathbf{v}\| \tag{11}$$
does necessarily hold will be proved in a moment. Whereas double braces denote
vector norm, the single braces in (11) denote the absolute value of the scalar u·v.
9.5.2. Properties of the dot product. The dot product defined by (4) possesses the
following important properties:
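$$\text{Commutative:} \quad \mathbf{u}\cdot\mathbf{v} = \mathbf{v}\cdot\mathbf{u}, \tag{12a}$$
$$\text{Nonnegative:} \quad \mathbf{u}\cdot\mathbf{u} \;\begin{cases} > 0 & \text{for all } \mathbf{u}\neq\mathbf{0},\\ = 0 & \text{for } \mathbf{u}=\mathbf{0}, \end{cases} \tag{12b}$$
$$\text{Linear:} \quad (\alpha\mathbf{u} + \beta\mathbf{v})\cdot\mathbf{w} = \alpha(\mathbf{u}\cdot\mathbf{w}) + \beta(\mathbf{v}\cdot\mathbf{w}), \tag{12c}$$
for any scalars $\alpha$, $\beta$ and any vectors u, v, w. The linearity condition (12c) is equivalent
to the two conditions $(\mathbf{u} + \mathbf{v})\cdot\mathbf{w} = (\mathbf{u}\cdot\mathbf{w}) + (\mathbf{v}\cdot\mathbf{w})$ and $(\alpha\mathbf{u})\cdot\mathbf{v} = \alpha(\mathbf{u}\cdot\mathbf{v})$.
Verification of these claims is left for the exercises.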
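EXAMPLE 3. Expand the dot product (6t - 2u)·(v + 4w). Using (12), we obtain
$$\begin{aligned}
(6\mathbf{t} - 2\mathbf{u})\cdot(\mathbf{v} + 4\mathbf{w}) &= 6[\mathbf{t}\cdot(\mathbf{v} + 4\mathbf{w})] - 2[\mathbf{u}\cdot(\mathbf{v} + 4\mathbf{w})] && \text{by (12c)} \\
&= 6[(\mathbf{v} + 4\mathbf{w})\cdot\mathbf{t}] - 2[(\mathbf{v} + 4\mathbf{w})\cdot\mathbf{u}] && \text{by (12a)} \\
&= 6(\mathbf{v}\cdot\mathbf{t}) + 24(\mathbf{w}\cdot\mathbf{t}) - 2(\mathbf{v}\cdot\mathbf{u}) - 8(\mathbf{w}\cdot\mathbf{u}) && \text{by (12c)}
\end{aligned}$$
in much the same way that we obtain (a - b)(c + d) = ac + ad - bc - bd in scalar
arithmetic.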
As a consequence of (12) we are in a position to prove the promised inequality
(11), namely, the Schwarz inequality*
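$$|\mathbf{u}\cdot\mathbf{v}| \le \|\mathbf{u}\|\,\|\mathbf{v}\|. \tag{13}$$
To derive this result, we start with the inequality
$$(\mathbf{u} + \alpha\mathbf{v})\cdot(\mathbf{u} + \alpha\mathbf{v}) \ge 0, \tag{14}$$
which is guaranteed by (12b), for any scalar $\alpha$ and any vectors u and v. Expanding
the left-hand side and noting that $\mathbf{u}\cdot\mathbf{u} = \|\mathbf{u}\|^2$ and $\mathbf{v}\cdot\mathbf{v} = \|\mathbf{v}\|^2$, (14) becomes†
$$\|\mathbf{u}\|^2 + 2\alpha\,\mathbf{u}\cdot\mathbf{v} + \alpha^2\|\mathbf{v}\|^2 \ge 0. \tag{15}$$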
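Regarding u and v as fixed and $\alpha$ as variable, the left-hand side is then a quadratic
function of $\alpha$. If we choose $\alpha$ so as to minimize the left-hand side, then (15) will
be as close to an equality as possible and hence as informative as possible. Thus,
setting d(left-hand side)/d$\alpha$ = 0, we obtain
$$2\,\mathbf{u}\cdot\mathbf{v} + 2\alpha\|\mathbf{v}\|^2 = 0 \quad \text{or} \quad \alpha = -\frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{v}\|^2}.$$
Putting this optimal value of $\alpha$ back into (15) gives us
$$\|\mathbf{u}\|^2 - 2\frac{(\mathbf{u}\cdot\mathbf{v})^2}{\|\mathbf{v}\|^2} + \frac{(\mathbf{u}\cdot\mathbf{v})^2}{\|\mathbf{v}\|^2} \ge 0,$$
$$\|\mathbf{u}\|^2\|\mathbf{v}\|^2 - 2(\mathbf{u}\cdot\mathbf{v})^2 + (\mathbf{u}\cdot\mathbf{v})^2 \ge 0,$$
$$(\mathbf{u}\cdot\mathbf{v})^2 \le \|\mathbf{u}\|^2\|\mathbf{v}\|^2,$$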
*After Hermann Amandus Schwarz (1843-1921). The names Cauchy and Bunyakovsky are also
associated with this well-known inequality.
†Does a term such as $\alpha\mathbf{u}\cdot\mathbf{v}$ in (15) mean $(\alpha\mathbf{u})\cdot\mathbf{v}$ or $\alpha(\mathbf{u}\cdot\mathbf{v})$? It does not matter; by virtue of
(12c) (with $\beta = 0$ and w changed to v), $(\alpha\mathbf{u})\cdot\mathbf{v} = \alpha(\mathbf{u}\cdot\mathbf{v})$, so the parentheses are not needed.
and taking square roots of both sides yields the Schwarz inequality (13).*
Thus, it was not merely a matter of luck that the arguments of the inverse
cosines were smaller than unity in magnitude in Examples 1 and 2; it was guaranteed in advance by the Schwarz inequality (13).
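The minimization step behind the Schwarz inequality can be traced numerically. The sketch below evaluates the quadratic in alpha for one pair of 4-tuples (the vectors of Example 2) and confirms that the stationary value of alpha gives the smallest left-hand side; the function names are hypothetical and the check is an illustration for one data set, not a proof.

```python
import math

def dot(u, v):
    """Euclidean dot product of two equal-length tuples."""
    return sum(a * b for a, b in zip(u, v))

def quadratic(u, v, alpha):
    """||u||^2 + 2*alpha*(u.v) + alpha^2*||v||^2, i.e. (u + alpha v).(u + alpha v)."""
    return dot(u, u) + 2 * alpha * dot(u, v) + alpha ** 2 * dot(v, v)

u = (2, -2, 4, -1)
v = (5, 9, -1, 0)

alpha_opt = -dot(u, v) / dot(v, v)          # stationary point of the quadratic
min_value = quadratic(u, v, alpha_opt)      # equals ||u||^2 - (u.v)^2 / ||v||^2

schwarz_holds = abs(dot(u, v)) <= math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
```

Because the minimum value is nonnegative, rearranging it gives the squared form of the inequality, and any other choice of alpha yields a larger (less informative) left-hand side.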
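9.5.3. Properties of the norm. Since the norm is related to the dot product according to
$$\|\mathbf{u}\| = \sqrt{\mathbf{u}\cdot\mathbf{u}}, \tag{16}$$
the properties (12) of the dot product should imply certain corresponding properties
of the norm. These properties are as follows:
$$\text{Scaling:} \quad \|\alpha\mathbf{u}\| = |\alpha|\,\|\mathbf{u}\|, \tag{17a}$$
$$\text{Nonnegative:} \quad \|\mathbf{u}\| \;\begin{cases} > 0 & \text{for all } \mathbf{u}\neq\mathbf{0},\\ = 0 & \text{for } \mathbf{u}=\mathbf{0}, \end{cases} \tag{17b}$$
$$\text{Triangle Inequality:} \quad \|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|. \tag{17c}$$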
Equation (17a) simply says that au is |a| times as long as u, and for arrow representations of 2-tuples or 3-tuples the triangle inequality (17c) amounts to the
Euclidean proposition that the length of any one side of a triangle cannot exceed
the sum of the lengths of the other two sides (Fig. 2).
Less obvious, however, is
the fact that (17c) holds for n-tuples for n’s > 3.
Let us prove only (17a) and (17c) since (17b) follows
readily from (16) and
(12b). First, (17a):
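$$\begin{aligned}
\|\alpha\mathbf{u}\| &= \sqrt{(\alpha\mathbf{u})\cdot(\alpha\mathbf{u})} && \text{by (16)} \\
&= \sqrt{\alpha\,\mathbf{u}\cdot(\alpha\mathbf{u})} && \text{by (12c) with } \beta = 0 \text{ and } \mathbf{w} = \alpha\mathbf{u} \\
&= \sqrt{\alpha\,(\alpha\mathbf{u})\cdot\mathbf{u}} && \text{by (12a)} \\
&= \sqrt{\alpha^2\,\mathbf{u}\cdot\mathbf{u}} && \text{by (12c) with } \beta = 0 \text{ and } \mathbf{w} = \mathbf{u} \\
&= |\alpha|\sqrt{\mathbf{u}\cdot\mathbf{u}} \\
&= |\alpha|\,\|\mathbf{u}\|.
\end{aligned}$$
Figure 2. Triangle inequality.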
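Turning to (17c), we find that
$$\begin{aligned}
\|\mathbf{u} + \mathbf{v}\|^2 &= (\mathbf{u} + \mathbf{v})\cdot(\mathbf{u} + \mathbf{v}) && \text{by (16)} \\
&= \mathbf{u}\cdot\mathbf{u} + \mathbf{v}\cdot\mathbf{u} + \mathbf{u}\cdot\mathbf{v} + \mathbf{v}\cdot\mathbf{v} \\
&= \|\mathbf{u}\|^2 + 2\,\mathbf{u}\cdot\mathbf{v} + \|\mathbf{v}\|^2 \\
&\le \|\mathbf{u}\|^2 + 2\,|\mathbf{u}\cdot\mathbf{v}| + \|\mathbf{v}\|^2 \\
&\le \|\mathbf{u}\|^2 + 2\,\|\mathbf{u}\|\,\|\mathbf{v}\| + \|\mathbf{v}\|^2 && \text{by (13)} \\
&= (\|\mathbf{u}\| + \|\mathbf{v}\|)^2
\end{aligned}$$
so that
$$\|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|,$$
as claimed. A key step was the use of the Schwarz inequality (13), but we also used
the simple inequality $\mathbf{u}\cdot\mathbf{v} \le |\mathbf{u}\cdot\mathbf{v}|$, which holds since u·v is a (positive, zero, or
negative) real number; that is, if u·v is negative, then the < holds, and if u·v is
zero or positive, then the = holds.
*That the choice $\alpha = -\mathbf{u}\cdot\mathbf{v}/\|\mathbf{v}\|^2$ minimizes the left-hand side of (15) follows from the fact
that $d^2(\text{left-hand side})/d\alpha^2 = 2\|\mathbf{v}\|^2 > 0$.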
EXAMPLE 4. Let us verify the triangle inequality for a specific example, say the vectors
u = (2, 1, 3, -1) and v = (0, 4, 2, 1). Then u + v = (2, 5, 5, 0) so (17c) becomes
$$\sqrt{54} \le \sqrt{15} + \sqrt{21}$$
or 7.348 ≤ 3.873 + 4.583, which is indeed true.
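The same check can be run in a few lines, together with a spot check of the scaling property of the norm. This is a small sketch using the vectors of Example 4; the scalar -3 used for the scaling check is an arbitrary illustrative choice.

```python
import math

def norm(u):
    """Euclidean norm of a tuple of real components."""
    return math.sqrt(sum(a * a for a in u))

u = (2, 1, 3, -1)
v = (0, 4, 2, 1)
u_plus_v = tuple(a + b for a, b in zip(u, v))

lhs = norm(u_plus_v)          # ||u + v||
rhs = norm(u) + norm(v)       # ||u|| + ||v||
triangle_ok = lhs <= rhs

alpha = -3                    # arbitrary scalar for the scaling check
scaled_norm = norm(tuple(alpha * a for a in u))
scaling_ok = abs(scaled_norm - abs(alpha) * norm(u)) < 1e-12
```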
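9.5.4. Orthogonality. If u and v are nonzero vectors such that u·v = 0, then
$$\theta = \cos^{-1}\left(\frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|}\right) = \cos^{-1}\left(\frac{0}{\|\mathbf{u}\|\,\|\mathbf{v}\|}\right) = \cos^{-1}(0) = \frac{\pi}{2} \tag{18}$$
and we say that u and v are perpendicular. [Here we have used the nonzeroness
of u and v in the third equality in (18); if u and/or v were 0, we would have had
$\cos^{-1}(0/\|\mathbf{u}\|\,\|\mathbf{v}\|) = \cos^{-1}(0/0)$, which is not defined.]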
But to equate the condition u·v = 0 to perpendicularity ($\theta = \pi/2$) would not
be correct since u·v will also be zero in the event that u and/or v are 0, in which
case $\theta$ is not defined. Let us therefore make a distinction between perpendicularity
and “orthogonality.” We will say that u and v are orthogonal if
$$\mathbf{u}\cdot\mathbf{v} = 0. \tag{19}$$
Only if u and v are both nonzero does their orthogonality imply their perpendicularity (i.e., $\theta = \pi/2$). With this definition, we see that the zero vector 0 is
orthogonal to every vector including itself (Exercise 14).
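Finally, we say that a set of vectors, say $\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$, is an orthogonal set if
every vector in the set is orthogonal to every other one:
$$\mathbf{u}_i\cdot\mathbf{u}_j = 0 \quad \text{if } i \neq j. \tag{20}$$
EXAMPLE 5. $\mathbf{u}_1 = (2, 3, -1, 0)$, $\mathbf{u}_2 = (1, 2, 8, 3)$, $\mathbf{u}_3 = (9, -6, 0, 1)$ is an orthogonal
set because $\mathbf{u}_1\cdot\mathbf{u}_2 = \mathbf{u}_1\cdot\mathbf{u}_3 = \mathbf{u}_2\cdot\mathbf{u}_3 = 0$.
EXAMPLE 6. $\mathbf{u}_1 = (1, 3)$, $\mathbf{u}_2 = (0, 0)$ is an orthogonal set because $\mathbf{u}_1\cdot\mathbf{u}_2 = 0$.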
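Checking whether a set is orthogonal amounts to testing every distinct pair. The sketch below does this with `itertools.combinations`; the function name `is_orthogonal_set` is a hypothetical helper, and exact comparison with zero is used only because the sample components are integers (floating-point data would call for a tolerance).

```python
from itertools import combinations

def dot(u, v):
    """Euclidean dot product of two equal-length tuples."""
    return sum(a * b for a, b in zip(u, v))

def is_orthogonal_set(vectors):
    """True if every pair of distinct vectors in the list has dot product zero."""
    return all(dot(a, b) == 0 for a, b in combinations(vectors, 2))

example_5 = [(2, 3, -1, 0), (1, 2, 8, 3), (9, -6, 0, 1)]
example_6 = [(1, 3), (0, 0)]        # contains the zero vector, still orthogonal
not_orthogonal = [(1, 0), (1, 1)]   # illustrative counterexample
```

Note that the second set passes even though it contains the zero vector, which is exactly the distinction drawn between orthogonality and perpendicularity.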
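9.5.5. Normalization. Any nonzero vector u can be scaled to have unit length by
multiplying it by $1/\|\mathbf{u}\|$ so we say that the vector
$$\hat{\mathbf{u}} = \frac{1}{\|\mathbf{u}\|}\,\mathbf{u} \tag{21}$$
has been “normalized.” That $\hat{\mathbf{u}}$ has unit length is readily verified:
$$\begin{aligned}
\|\hat{\mathbf{u}}\| &= \left\|\frac{1}{\|\mathbf{u}\|}\,\mathbf{u}\right\| = \left|\frac{1}{\|\mathbf{u}\|}\right|\,\|\mathbf{u}\| && \text{by (17a)} \\
&= \frac{1}{\|\mathbf{u}\|}\,\|\mathbf{u}\| = 1. && \text{by (17b)}
\end{aligned}$$
A vector of unit length is called a unit vector. We will often use the caret notation
$\hat{\mathbf{u}}$ for unit vectors.
EXAMPLE 7. Normalize u = (1, -1, 0, 2). Since $\|\mathbf{u}\| = \sqrt{\mathbf{u}\cdot\mathbf{u}} = \sqrt{6}$, we have
$$\hat{\mathbf{u}} = \frac{1}{\sqrt{6}}\,(1, -1, 0, 2).$$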
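A set of vectors is said to be orthonormal if it is orthogonal and if each vector
is normalized (i.e., is a unit vector). We will use that term so frequently that it
will be useful to abbreviate it as ON, but be aware that that abbreviation is not
[...] orthogonality), and $\mathbf{u}_j\cdot\mathbf{u}_j = 1$ for each j (so $\|\mathbf{u}_j\| = 1$, so the set is normalized).
The symbol
$$\delta_{ij} \equiv \begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases} \tag{22}$$
is known as the Kronecker delta, after Leopold Kronecker (1823-1891). Thus, $\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$ is ON if and only if
$$\mathbf{u}_i\cdot\mathbf{u}_j = \delta_{ij} \tag{23}$$
for i = 1, 2, ..., k and j = 1, 2, ..., k.
EXAMPLE 8. Let
$$\mathbf{u}_1 = (1, 0, 0, 0), \quad \mathbf{u}_2 = \left(0, \frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}}\right), \quad \mathbf{u}_3 = \left(0, \frac{1}{\sqrt{2}}, 0, -\frac{1}{\sqrt{2}}\right).$$
Then $\|\mathbf{u}_1\| = \|\mathbf{u}_2\| = \|\mathbf{u}_3\| = 1$ and $\mathbf{u}_1\cdot\mathbf{u}_2 = \mathbf{u}_1\cdot\mathbf{u}_3 = \mathbf{u}_2\cdot\mathbf{u}_3 = 0$, so the set is ON.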
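Normalization and the orthonormality test can be sketched together. The code below scales a vector to unit length and then checks a set against the Kronecker-delta condition, comparing each pairwise dot product with 1 (same index) or 0 (different indices). The helper names and the tolerance `tol` are assumptions made for this sketch, and the sample set takes one reading of the sign pattern of the third vector in the example, which is itself an assumption.

```python
import math

def dot(u, v):
    """Euclidean dot product of two equal-length tuples."""
    return sum(a * b for a, b in zip(u, v))

def normalize(u):
    """Return u scaled to unit length; the zero vector cannot be normalized."""
    n = math.sqrt(dot(u, u))
    if n == 0:
        raise ValueError("the zero vector cannot be normalized")
    return tuple(a / n for a in u)

def is_orthonormal(vectors, tol=1e-12):
    """Check u_i . u_j = 1 if i == j and 0 otherwise, within a rounding tolerance."""
    for i, a in enumerate(vectors):
        for j, b in enumerate(vectors):
            delta = 1.0 if i == j else 0.0
            if abs(dot(a, b) - delta) > tol:
                return False
    return True

u_hat = normalize((1, -1, 0, 2))   # divides each component by sqrt(6)

r = 1 / math.sqrt(2)
on_set = [(1, 0, 0, 0), (0, r, 0, r), (0, r, 0, -r)]
```

A tolerance is needed here because components such as 1/sqrt(2) are not exactly representable in floating point.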
lu |, and an angle
The definitions are designed as extensions of
and v.
EXERCISES
9.5
1. Given the following vectors u and v, determine $\|\mathbf{u}\|$, $\|\mathbf{v}\|$, and $\theta$ (in radians and
degrees). If u and v are orthogonal, state that.
(a) u = (4, 3), v = (2, -1)
(b) u = (1, 2, 3, 4), v = (-4, -3, -2, -1)
(c) u = (3, 0, 1), v = (-2, 3, 6)
(d) u = (2, 2, 2), v = (-4, -5, -6)
(e) u = (2, 5), v = (10, -4)
(f) u = (1, 2, 3, 4), v = (4, 3, 2, 1)
(g) u = (3, 2, 0, -1, 1), v = (-5, 0, 0, 2, 4)
2. State whether or not each of the following expressions is defined.
(a) $\|\mathbf{u}\|$
(b) u·(v·w)
(c) $\|(\mathbf{u}\cdot\mathbf{v})\mathbf{v}\|$
(d) (u + v)·w
(e) (u + v)·(u - v)
(f) u + 6(v·w)
(g) $\cos^{-1}(2\mathbf{u} + \mathbf{v})$
(h) $\mathbf{w}/\|\mathbf{u}\|^2$
(i) (7u)·(2v)
(j) $\|\mathbf{u} + 3\mathbf{u}^2\|$
3. Let us denote, as points in 2- and 3-space, A = (2, 0), B = (3, -1), C = (5, 0), D = (4, 2),
E = (2, 2), F = (1, 3, -2), G = (2, 0, 4), H = (5, 4, 3), I = (-3, -1, 0), J = (0, 0, 0). Determine,
by vector methods, all interior angles and their sum, in degrees, for each of the following polygons.
(a) ABCA
(b) ABCDA
(c) ABCDEA
(d) BCDB
(e) BCDEB
(f) FGHF
(g) FGIF
(h) GHIG
(i) FGJF
(j) GHJG
(k) HIJH
(l) FIJF
4. (a)-(g) Normalize each pair of u, v vectors in Exercise 1; that is, obtain $\hat{\mathbf{u}}$ and $\hat{\mathbf{v}}$.
5. If vectors A, B, C, represented as arrows, form a triangle such that A = B + C, derive the law of
cosines $C^2 = A^2 + B^2 - 2AB\cos\alpha$, where $\alpha$ is the interior angle between A
and B, and where A, B, C are the lengths of A, B, C, respectively, by starting with the identity
C·C = (A - B)·(A - B).
6. (Orthogonalization) In each of the following, find scalars $\alpha$, $\beta$, $\gamma$ and vectors
$\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3$ such that $\mathbf{u}_1 = \mathbf{u}$, $\mathbf{u}_2 = \mathbf{u} + \alpha\mathbf{v}$, $\mathbf{u}_3 = \mathbf{u} + \beta\mathbf{v} + \gamma\mathbf{w}$ is a nonzero orthogonal set, that is,
$\mathbf{u}_1\cdot\mathbf{u}_2 = 0$, $\mathbf{u}_1\cdot\mathbf{u}_3 = 0$, and $\mathbf{u}_2\cdot\mathbf{u}_3 = 0$, where $\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3 \neq \mathbf{0}$. If this is not possible, state that.
(a) u = (1, 3, 0), v = (2, 3, 0), w = (2, 1, -3)
(b) u = (2, 0, -1), v = (1, 2, 3), w = (3, -2, -5)
(c) u = (1, 0, 0), v = (2, 1, 0), w = (3, 2, -1)
(d) u = (1, 2, 0, 1), v = (1, 0, 1, 1), w = (2, -1, 1, 1)
(e) u = (1, 0, 0, 0), v = (1, 1, 0, 0), w = (1, 1, 1, 0)
(f) u = (1, -1, 1, -1), v = (1, 2, 0, 1), w = (0, 2, 1, 0)
(g) u = (1, 2), v = (0, 2), w = (1, -1)
(h) u = (3, 0), v = (1, 1), w = (-1, 2)
7. If u = (1, 3, -4, 2) and v = (2, 0, 0, 3), evaluate the following.
(a) $\|\mathbf{u} - \mathbf{v}\|$
(b) $\|3\mathbf{u} - 2\mathbf{v}\| + \|-\mathbf{v}\|$
(c) [illegible in source]
(d) $\|\mathbf{u}\| + \|\mathbf{v}\|$
8. Derive the following identities.
(a) $\|\mathbf{u} + \mathbf{v}\|^2 + \|\mathbf{u} - \mathbf{v}\|^2 = 2\|\mathbf{u}\|^2 + 2\|\mathbf{v}\|^2$
(b) $\|\mathbf{u} + \mathbf{v}\|^2 - \|\mathbf{u} - \mathbf{v}\|^2 = 4\,\mathbf{u}\cdot\mathbf{v}$
(c) Verify parts (a) and (b) for the case u = (2, 0, 1, 1) and v = (1, -3, 0, 2).
9. Find all nonzero vectors (if any) orthogonal to the following vectors.
(a) (3, 0, 1)
(b) (2, 1, 1) and (1, 2, 3)
(c) (1, 1, 0, -1)
(d) (1, 3, 4, 0) and (2, -1, 0, 5)
(e) (6, -1, 2, 2), (1, 4, 3, 0), and (4, -9, -4, 2)
(f) (6, -1, 2, 2), (1, 4, 3, 0), and (4, -5, -4, 2)
(g) (1, -2, 0), (2, 3, 1), and (7, 0, 2)
(h) (2, 1, -1), (1, 1, 1), and (3, 2, 1)
10. (Orthogonal separation) It is sometimes desired to separate a given nonzero vector u into the
sum of two orthogonal [...] nonzero vector v, as sketched in part (a) of the accompanying
figure. That is, $\mathbf{u} = \mathbf{u}_1 + \mathbf{u}_2$, where $\mathbf{u}_1$ is of the form $\alpha\mathbf{v}$, and
$\mathbf{u}_2\cdot\mathbf{u}_1 = 0$. We call $\mathbf{u}_1$ the orthogonal projection of u on v,
and we call $\mathbf{u}_2$ the component of u orthogonal to v.
(a) Show that $\mathbf{u}_1$ and $\mathbf{u}_2$ can be found, in terms of u and v, as
$$\mathbf{u}_1 = (\mathbf{u}\cdot\hat{\mathbf{v}})\hat{\mathbf{v}} \quad \text{and} \quad \mathbf{u}_2 = \mathbf{u} - \mathbf{u}_1, \tag{10.1}$$
where $\hat{\mathbf{v}} = \mathbf{v}/\|\mathbf{v}\|$. Is (10.1) valid only for 2- and 3-space, or does it hold, without
modification, for n-space as well? Explain.
(b) Use (10.1) to carry out the separation for the cases where u = (1, 3) and v = (2, 4), and where
u = (1, 3) and v = (-1, -2). Interpret your results graphically for each of these cases.
(c) Use (10.1) to carry out the separation for u = (2, 3, 1) and v = (0, 2, 3).
(d) Repeat part (c), for u = (1, 2, -1), v = (8, -1, 1).
(e) Repeat part (c), for u = (3, 0, 5, 6), v = (1, -2, 0, 4).
(f) Repeat part (c), for u = (2, 1, 0, 0, 3), v = (0, 0, 1, -2, 1).
11. (a) Prove the associative property $(\alpha\mathbf{u})\cdot\mathbf{v} = \alpha(\mathbf{u}\cdot\mathbf{v})$.
(b) Prove the distributive property (u + v)·w = u·w + v·w.
(c) Prove that the linearity property (12c) is equivalent to the
two properties given in parts (a) and (b).
12. (Direction cosines) The direction cosines of a vector $\mathbf{u} = (u_1, u_2, u_3)$ in 3-space are defined
as $l_1 = \cos\alpha$, $l_2 = \cos\beta$, $l_3 = \cos\gamma$, where $\alpha$, $\beta$, $\gamma$ are the angles between u and the
positive coordinate axes, as shown.
(a) Obtain general expressions for $l_1, l_2, l_3$ in terms of the components $u_1, u_2, u_3$.
(b) Evaluate $l_1, l_2, l_3$ for u = (2, -1, 5).
(c) Evaluate $l_1, l_2, l_3$ for u = (2, 4, 1).
(d) Evaluate $l_1, l_2, l_3$ for u = (4, 0, -3).
(e) Show that $l_1^2 + l_2^2 + l_3^2 = 1$.
13. If u·v = 0 and v·w = 0, does that imply that u·w = 0?
Prove or disprove. HINT: If a claim is true, it needs to be
proved in general, that is, for all possible cases. But if it is
false, it can be disproved merely by putting forward a single
counterexample.
nal. (a) (1,3), (—6,2), (0,0)
Qa, -
(b)(2,3,0), (~3,2,1), (1,1,1), (1,-3, 1)
2, +20,+12, =5
1)
(0,0,0,
(0,0,1,0),
(0,1,0,0),
(1,0,0,0),
(c)
(d) (1,1, 1,1),
G, —1,1,-1),
(0,1,0,~1),
(2,0,
rg —-23 = 8
(fF)vy + vg + 1223= 0
(e)(2,1,-1,1),(1,1,3,0),
(1,-1,0,-1),(21,1,i
—2,0)
(g)
a,
ae Uy
—~a —
— 2
a3 =2
~—223 = 5
(a) ay + 2t.
Ly
2
— t3 = 8
+273 =0
(b) vy + v2
=0
Ly —
(c)
vi
o + 223 = 0
— @
— 5243 =
0
to + 403 = 6
16. (Schwarz inequality) To make (15) as close to an equality as possible, and hence as
informative as possible, we minimized the left-hand side by setting d(left-hand side)/d$\alpha$ = 0.
That step gave $\alpha = -\mathbf{u}\cdot\mathbf{v}/\|\mathbf{v}\|^2$, and putting that result back
into (15) gave the Schwarz inequality. That proof is valid for
$\mathbb{R}^n$ for any n ($\ge 1$). For the special case of $\mathbb{R}^2$, show that the
optimal $\alpha$ is $-\mathbf{u}\cdot\mathbf{v}/\|\mathbf{v}\|^2$ by using a graphical approach; that
is, using a suitable sketch. HINT: Given u and v, make $\mathbf{u} + \alpha\mathbf{v}$
as short as possible.
9.6
Generalized Vector Space
9.6.1. Vector space. In Section 9.5 we generalize our vector concept from the familiar arrow vectors of 2- and 3-space to n-tuple vectors in abstract n-space, and it
is n-space that is used in the remainder of this chapter and in Chapters 10-12. Yet,
it is interesting to wonder if further generalization is possible. The answer is yes,
and we will complete that story in this section. Far from being just a mathematical curiosity, the results will be essential in later chapters, when we study Fourier
series, Sturm—Liouville theory, and partial differential equations.
The idea is as follows. In preceding sections we introduced the vectors and
arithmetic rules for their manipulation, and then derived the various properties,
such as $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$, $\mathbf{u} + \mathbf{0} = \mathbf{u}$, $\alpha(\beta\mathbf{u}) = (\alpha\beta)\mathbf{u}$, and so on. In generalizing,
the essential idea is to reverse the cart and the horse. Specifically, we elevate the
derived properties to axioms, or requirements, and regard the vectors as “objects,”
the nature of which is not restricted in advance. They may be chosen to be n-tuples
or whatever; all that we ask is that a plus (+) operation, a zero vector, a negative
inverse, and scalar multiplication be defined such that all of the vector space axioms
are satisfied. Thus:
DEFINITION 9.6.1 Vector Space
We call a (nonempty) set S of “objects,” which are denoted by boldface type and
referred to as vectors, a vector space if the following requirements are met:
(i) An operation, which will be called vector addition and denoted as +, is defined
between any two vectors in S in such a way that if u and v are in S,
then u + v is too (i.e., S is closed under addition). Furthermore,
$$\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}, \quad \text{(commutative)} \tag{1}$$
$$(\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w}). \quad \text{(associative)} \tag{2}$$
(ii) S contains a unique zero vector 0 such that
$$\mathbf{u} + \mathbf{0} = \mathbf{u} \tag{3}$$
for each u in S.
(iii) For each u in S there is a unique vector “-u” in S, called the negative
inverse of u, such that
$$\mathbf{u} + (-\mathbf{u}) = \mathbf{0}. \tag{4}$$
We denote u + (-v) as u - v for brevity, but emphasize that it is actually
the + operation between u and -v.
(iv) Another operation, called scalar multiplication, is defined such that if u is
any vector in S and $\alpha$ is any scalar,* then the scalar multiple $\alpha\mathbf{u}$ is in S, too
(i.e., S is closed under scalar multiplication). Further, we require that
$$\alpha(\beta\mathbf{u}) = (\alpha\beta)\mathbf{u}, \quad \text{(associative)} \tag{5}$$
$$(\alpha + \beta)\mathbf{u} = \alpha\mathbf{u} + \beta\mathbf{u}, \quad \text{(distributive)} \tag{6}$$
$$\alpha(\mathbf{u} + \mathbf{v}) = \alpha\mathbf{u} + \alpha\mathbf{v}, \quad \text{(distributive)} \tag{7}$$
$$1\mathbf{u} = \mathbf{u}, \tag{8}$$
if the vectors u, v are in S, and $\alpha$, $\beta$ are scalars.
Observe that if we write u + v + w, it is not clear whether we mean (u + v) + w
(i.e., first add u and v, and then add the result to w) or u + (v + w). However, the
associative property (2) guarantees that it does not matter, so the parentheses can
be omitted without ambiguity. Similarly, $\alpha\beta\mathbf{u}$ is unambiguous by virtue of (5).
*We continue to restrict all scalars to be (finite) real numbers. Hence, we call the vector space a
real vector space.
EXAMPLE 1. R^n-Space. Surely, the n-space R^n, defined earlier, does constitute a
vector space; after all, the axioms listed in Definition 9.6.1 come from the properties of R^n
listed in Section 9.4. Thus, there is no need to check to see if those axioms are satisfied.
Instead, and for heuristic purposes, let us modify our addition operation from
$\mathbf{u} + \mathbf{v} = (u_1 + v_1, \ldots, u_n + v_n)$   (9)
to
$\mathbf{u} + \mathbf{v} = (u_1 + 2v_1, \ldots, u_n + 2v_n)$,   (10)
and see if (10) works; that is, let us see if the vector space axioms listed under (i) in
Definition 9.6.1 are still satisfied if we use (10) as our addition operation instead of (9).
According to (10),
$\mathbf{v} + \mathbf{u} = (v_1 + 2u_1, \ldots, v_n + 2u_n)$,   (11)
so a comparison of (10) and (11) shows that the commutativity axiom (1) is satisfied only
if u_j + 2v_j = v_j + 2u_j (j = 1, ..., n), hence only if v_j = u_j, hence only if v = u. Since
(1) does not hold for any chosen vectors u and v, but only for vectors u and v that are
equal, we conclude that if u + v is defined by (10), then we do not have a vector space. Of
course, it is possible that (10) violates other axioms besides (1), but one failure is sufficient
to show that the set is not a legitimate vector space.
COMMENT. Observe that we have not shown that u + v must be defined as in (9); conceivably,
utv=s (ui
tui,...,u2
+n)
(12)
or
$\mathbf{u} + \mathbf{v} = (u_1 - v_1, \ldots, u_n - v_n)$   (13)
might work; that is, might satisfy the requirements listed under (i). Thus, understand that
the plus signs on the left- and right-hand sides of (9) are not the same. The ones on the
right denote the usual addition of real numbers (e.g., 2 + 5 = 7), whereas the one on the
left is more exotic; it denotes a certain operation between vectors u and v, which is being
defined by (9), or (10), or (12), or (13). To emphasize that point we could use a different
notation such as u * v, in place of u + v, as some authors do. However, having made that
point let us continue to use u + v. ■
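The commutativity check in Example 1 can be tried numerically. The following is a minimal sketch, assuming vectors are stored as Python tuples; the function names `add_usual`, `add_modified`, and `is_commutative_for` are illustrative and do not come from the text.

```python
# Sketch: test the commutativity axiom (1) for two candidate addition rules
# on n-tuples. All names here are illustrative assumptions.

def add_usual(u, v):
    """Usual addition (9): component-wise u_j + v_j."""
    return tuple(uj + vj for uj, vj in zip(u, v))

def add_modified(u, v):
    """Modified addition (10): component-wise u_j + 2*v_j."""
    return tuple(uj + 2 * vj for uj, vj in zip(u, v))

def is_commutative_for(add, u, v):
    """True if add(u, v) equals add(v, u) for this particular pair."""
    return add(u, v) == add(v, u)

u = (1, 2, 3)
v = (4, 5, 6)

print(is_commutative_for(add_usual, u, v))     # usual rule: order does not matter
print(is_commutative_for(add_modified, u, v))  # modified rule: fails when u != v
print(is_commutative_for(add_modified, u, u))  # modified rule: holds only when v == u
```

A single failing pair is enough to rule out (10) as a vector addition, which mirrors the remark that one failed axiom suffices.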
R^n is but one example of a vector space. Many other useful spaces can be
introduced by using objects other than n-tuples as the vectors. For example, the
vectors may be functions, matrices, or whatever, provided that vector addition, a
zero vector, a negative inverse, and scalar multiplication are defined such that all of
the vector space axioms are satisfied. For nowhere in Definition 9.6.1 is the nature
of the vectors specified or in any way restricted.
EXAMPLE 2. A Function Space. This time, let the vectors be functions. Specifically,
let u = u(x) be any continuous function defined on 0 ≤ x ≤ 1, say. For the addition
operation let
$\mathbf{u} + \mathbf{v} = u(x) + v(x)$;   (14a)
that is, let u + v be the function whose values are the ordinary sum u(x) + v(x). For scalar
multiplication let
$\alpha\mathbf{u} = \alpha u(x)$;   (14b)
for the zero vector choose the zero function
$\mathbf{0} = 0$;   (14c)
and for the negative of u define
$-\mathbf{u} = -u(x)$;   (14d)
that is, the function whose values are −u(x).
With these definitions, we can verify that all of the vector space requirements are
satisfied, so that the set S of such vectors is a bona fide vector space. For instance, if
u = u(x) and v = v(x) are continuous on 0 ≤ x ≤ 1, then so is u + v = u(x) + v(x), so
S is closed under addition. Further, v + u = v(x) + u(x) = u(x) + v(x) = u + v,* so
addition satisfies the commutative property (1), and so on.
This S is but one example of a function space, a space in which the vectors are
functions. ■
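To make the definitions (14a) to (14d) concrete, here is a minimal sketch in which each vector is a Python callable on [0, 1]. The helper names `add`, `scale`, `zero`, and `neg`, and the sample functions, are assumptions chosen for illustration. Checking the axioms at a few sample points is only a spot check, not a proof.

```python
# Sketch: functions on [0, 1] treated as vectors, following (14a)-(14d).
# Helper names and the sample functions are illustrative assumptions.

def add(u, v):
    """(14a): the function whose values are u(x) + v(x)."""
    return lambda x: u(x) + v(x)

def scale(alpha, u):
    """(14b): the function whose values are alpha * u(x)."""
    return lambda x: alpha * u(x)

def zero():
    """(14c): the zero function."""
    return lambda x: 0.0

def neg(u):
    """(14d): the function whose values are -u(x)."""
    return lambda x: -u(x)

u = lambda x: x * x      # a continuous function on [0, 1]
v = lambda x: 1.0 + x    # another continuous function on [0, 1]

samples = [j / 10 for j in range(11)]  # spot-check points in [0, 1]

commutes = all(add(u, v)(x) == add(v, u)(x) for x in samples)       # axiom (1)
has_zero = all(add(u, zero())(x) == u(x) for x in samples)          # axiom (3)
has_inverse = all(add(u, neg(u))(x) == 0.0 for x in samples)        # axiom (4)
unit_scalar = all(scale(1, u)(x) == u(x) for x in samples)          # axiom (8)

print(commutes, has_zero, has_inverse, unit_scalar)
```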
The following theorem is useful, and its proof illustrates the axiomatic approach.
THEOREM 9.6.1 Properties of Scalar Multiplication
If u is any vector in a vector space S and α is any scalar, then
$0\mathbf{u} = \mathbf{0}$,   (15a)
$(-1)\mathbf{u} = -\mathbf{u}$,   (15b)
$\alpha\mathbf{0} = \mathbf{0}$.   (15c)
Proof: These results follow from our definition of vector space. To prove (15a),
one line of approach is as follows:
$0\mathbf{u} + \mathbf{u} = 0\mathbf{u} + 1\mathbf{u}$   by (8)
$= (0 + 1)\mathbf{u}$   by (6)
$= 1\mathbf{u}$
$= \mathbf{u}$   by (8).
Then
$0\mathbf{u} + \mathbf{u} + (-\mathbf{u}) = \mathbf{u} + (-\mathbf{u})$,
$0\mathbf{u} + \mathbf{0} = \mathbf{0}$   by (4),
$0\mathbf{u} = \mathbf{0}$   by (3).
The remaining two, (15b) and (15c), are left for the exercises. ■
9.6.2. Inclusion of inner product and/or norm. Observe that there is no mention
of a dot product or a norm either in Definition 9.6.1 or in Examples 1 or 2. Indeed,
a vector space S need not have a dot product (also called an inner product) or a
norm defined for it. If it does have an inner product it is called an inner product
space, if it has a norm it is called a normed vector space; and if it has both it is
called a normed inner product space.
*The second equality holds because v(x) + u(x) is the ordinary sum of two real numbers; e.g.,
4 + 3 = 3 + 4.
If we do choose to introduce an inner product for S, how is it to be defined?
Do you remember the idea of reversing the cart and the horse? That is how we do
it. Equations (12a,b,c) in Section 9.5.2 were shown to be properties of the inner
product u · v = u_1 v_1 + ··· + u_n v_n. We now take those properties and elevate
them to axioms, or requirements, that are to be satisfied by any inner product of
any vector space.
Similarly, we take the properties (17a,b,c) of the norm, in Section 9.5.3, and
elevate them to axioms, or requirements, that are to be satisfied by any norm of any
vector space.
Let us tabulate them here:
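REQUIREMENTS OF INNER PRODUCT
Commutative:   $\mathbf{u} \cdot \mathbf{v} = \mathbf{v} \cdot \mathbf{u}$,   (16a)
Nonnegative:   $\mathbf{u} \cdot \mathbf{u} > 0$ for all $\mathbf{u} \ne \mathbf{0}$, and $\mathbf{u} \cdot \mathbf{u} = 0$ for $\mathbf{u} = \mathbf{0}$,   (16b)
Linear:   $(\alpha\mathbf{u} + \beta\mathbf{v}) \cdot \mathbf{w} = \alpha(\mathbf{u} \cdot \mathbf{w}) + \beta(\mathbf{v} \cdot \mathbf{w})$,   (16c)
and
REQUIREMENTS OF NORM
Scaling:   $\|\alpha\mathbf{u}\| = |\alpha|\,\|\mathbf{u}\|$,   (17a)
Nonnegative:   $\|\mathbf{u}\| > 0$ for all $\mathbf{u} \ne \mathbf{0}$, and $\|\mathbf{u}\| = 0$ for $\mathbf{u} = \mathbf{0}$,   (17b)
Triangle Inequality:   $\|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|$.   (17c)
Let us illustrate.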
EXAMPLE 3. R^n-Space. If we wish to add an inner product to the vector space R^n, we
can use the choice
$\mathbf{u} \cdot \mathbf{v} = u_1 v_1 + \cdots + u_n v_n = \sum_{j=1}^{n} u_j v_j$.   (18a)
We know that (18a) satisfies the requirements (16) because the latter were deduced, in
Section 9.5.2, as properties that follow from (18a). A variation of (18a) that still satisfies
(16) is (Exercise 6)
$\mathbf{u} \cdot \mathbf{v} = w_1 u_1 v_1 + \cdots + w_n u_n v_n = \sum_{j=1}^{n} w_j u_j v_j$,   (18b)
where the w_j's are fixed positive constants known as “weights” because they attach more
or less weight to the different components of u and v. For instance, consider R^2 and let
w_1 = 5 and w_2 = 3. Then if u = (2, −4) and v = (1, 6) we have u · v = 5(2)(1) +
3(−4)(6) = −62.
Note that for (18b) to be a legitimate inner product we must have w_j > 0 for each j.
For suppose, still in R^2, that w_1 = 3 and w_2 = −2. Then, for u = (1, 5), say, we have
u · u = 3(1)(1) − 2(5)(5) = −47 < 0, in violation of (16b). Or, suppose that w_1 = 3 and
w_2 = 0. Then, for u = (0, 4), say, we have u · u = 3(0)(0) + 0(4)(4) = 0 even though
u ≠ 0, again in violation of (16b).
Now, suppose that we wish to add a norm. If for any vector space S we already have
an inner product, then a legitimate norm can always be obtained from that inner product
as $\|\mathbf{u}\| = \sqrt{\mathbf{u} \cdot \mathbf{u}}$, and that choice is called the natural norm. Thus, the natural norms
corresponding to (18a) and (18b) are
$\|\mathbf{u}\| = \sqrt{\sum_{j=1}^{n} u_j^2}$ and $\|\mathbf{u}\| = \sqrt{\sum_{j=1}^{n} w_j u_j^2}$,   (19a,b)
respectively.
However, we do not have to choose the natural norm. For instance, we could use (18a)
as our inner product, and choose
$\|\mathbf{u}\| = |u_1| + \cdots + |u_n| = \sum_{j=1}^{n} |u_j|$   (20)
as our norm (Exercise 8). The latter is used by Struble in his book on differential equations,*
probably because it is algebraically simpler than the Euclidean norm (19a) or the
modified Euclidean norm (19b). Furthermore, he defines no inner product whatsoever.
Struble calls (20) the taxicab norm since a taxicab driver judges the distance from the corner
of 5th Avenue and 34th Street to the corner of 2nd Avenue and 49th Street as 18 blocks,
not √234 blocks. ■
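The arithmetic in Example 3 is easy to reproduce. The sketch below assumes vectors and weights are given as tuples; the names `weighted_inner`, `natural_norm`, and `taxicab_norm` are illustrative. It evaluates the weighted inner product (18b) for the three weight choices discussed above and compares the taxicab norm (20) with the Euclidean norm (19a) for the street-grid trip.

```python
import math

# Sketch of the weighted inner product (18b), its natural norm (19b), and the
# taxicab norm (20). Function names are illustrative assumptions.

def weighted_inner(u, v, w):
    """Sum of w_j * u_j * v_j, as in (18b)."""
    return sum(wj * uj * vj for wj, uj, vj in zip(w, u, v))

def natural_norm(u, w):
    """sqrt(u . u) for the weighted inner product; requires every w_j > 0."""
    return math.sqrt(weighted_inner(u, u, w))

def taxicab_norm(u):
    """Sum of |u_j|, as in (20)."""
    return sum(abs(uj) for uj in u)

# Weights 5 and 3: a legitimate inner product.
print(weighted_inner((2, -4), (1, 6), (5, 3)))   # -62

# A negative weight lets u . u go negative, violating (16b).
print(weighted_inner((1, 5), (1, 5), (3, -2)))   # -47

# A zero weight gives u . u = 0 for a nonzero u, also violating (16b).
print(weighted_inner((0, 4), (0, 4), (3, 0)))    # 0

# Trip from (5th Ave, 34th St) to (2nd Ave, 49th St): 3 avenues and 15 streets.
trip = (5 - 2, 49 - 34)
print(taxicab_norm(trip))                        # 18
print(math.sqrt(sum(d * d for d in trip)))       # sqrt(234), roughly 15.3
```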
*R. A. Struble, Nonlinear Differential Equations (New York: McGraw-Hill, 1962).
EXAMPLE 4. The Function Space of Example 2. How might we choose an inner
product for the function space S defined in Example 2? To motivate our choice, let us
imagine approximating any given function (i.e., vector) u(x) in S in a piecewise-constant
manner as depicted in Fig. 1. That is, divide the x interval (0 ≤ x ≤ 1) into n equal
parts and define the approximating piecewise-constant function, over each subinterval, as
the value of u(x) at the left endpoint of that subinterval. If we represent the piecewise-constant
function as the n-tuple (u_1, ..., u_n), then we have, in a heuristic sense,
$u(x) \approx (u_1, \ldots, u_n)$.   (21)
Similarly, for any other function v(x) in S,
$v(x) \approx (v_1, \ldots, v_n)$.   (22)
[Figure 1. Staircase approximation of u(x).]
The n-tuple vectors on the right-hand sides of (21) and (22) are members of R^n. For that
space, let us adopt the inner product
$(u_1, \ldots, u_n) \cdot (v_1, \ldots, v_n) = \sum_{j=1}^{n} u_j v_j \,\Delta x$,   (23)
that is, (18b) with all of the w_j weights the same, namely, the subinterval width Δx. If we
let n → ∞, the “staircase approximations” approach u(x) and v(x), and the sum in (23)
tends to the integral $\int_0^1 u(x)v(x)\,dx$. This heuristic reasoning suggests the inner product
$\langle u(x), v(x) \rangle = \int_0^1 u(x)v(x)\,dx$.   (24a)
We can denote it as u · v and call it the dot product, or we can denote it as ⟨u(x), v(x)⟩
and call it the inner product. For function spaces, the latter notation is somewhat standard,
and is our choice in this text.
COMMENT 1. By no means do we claim our staircase idea to be a rigorous derivation of
(24a). In fact, it is neither rigorous nor a derivation: it is heuristic motivation for the
definition (24a). We leave it for the exercises to verify that (24a) does satisfy the requirements
(16).
COMMENT 2. Just as (18b) is a legitimate generalization of (18a) (if w_j > 0 for 1 ≤ j ≤ n),
we expect that
$\langle u(x), v(x) \rangle = \int_0^1 u(x)v(x)w(x)\,dx$   (24b)
is a legitimate generalization of (24a) [if w(x) > 0 for 0 ≤ x ≤ 1], proof of which claim
is left for the exercises. The inner product (24b) is prominent when we study Fourier series
and the Sturm-Liouville theory in Chapter 17.
COMMENT 3. Naturally, if we wish to define a norm as well, we could use a natural norm
based on (24a) or (24b), for instance
$\|\mathbf{u}\| = \sqrt{\int_0^1 u^2(x)w(x)\,dx}$   (25)
based on (24b).
COMMENT 4. Notice carefully that the concept of the dimension of a vector space has not
yet been introduced, although it is in Section 9.10. There, we define dimension and find
that R^n is n-dimensional (which claim is probably not a great shock). Since the staircase
approximation (21) becomes exact only as n → ∞, it appears that our function space S is
infinite dimensional!
COMMENT 5. A bit of notation: the set of functions that are defined and continuous
on [0, 1] (i.e., 0 ≤ x ≤ 1) is usually denoted as C^0[0, 1]. If not only are the functions
continuous but also all derivatives through order k, then the set is denoted as C^k[0, 1]. ■
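The staircase motivation for (24a) can be illustrated numerically. The following is a minimal sketch, assuming the sample functions u(x) = x and v(x) = x^2 (chosen here for illustration, not taken from the text), for which the integral in (24a) is exactly 1/4. The function name `staircase_inner` is hypothetical.

```python
# Sketch: the discrete inner product (23), built from left-endpoint samples,
# approaches the integral inner product (24a) as n grows.
# The sample functions and the name staircase_inner are illustrative assumptions.

def staircase_inner(u, v, n):
    """Sum of u_j * v_j * dx over n equal subintervals of [0, 1],
    where u_j and v_j are the values at the left endpoint of subinterval j."""
    dx = 1.0 / n
    return sum(u(j * dx) * v(j * dx) * dx for j in range(n))

u = lambda x: x
v = lambda x: x * x

exact = 0.25  # integral of x * x^2 = x^3 over [0, 1]

for n in (10, 100, 1000):
    approx = staircase_inner(u, v, n)
    print(n, approx, abs(approx - exact))
```

The error shrinks roughly in proportion to 1/n, which is consistent with the heuristic (not rigorous) nature of the staircase argument.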
Closure. Using n-space as a ladder, we complete our generalization of vector
space by taking the properties of R^n (such as u + v = v + u) and turning them
into the axioms, or requirements, to be met by any vector space. Thus, attention
shifted from the objects, the vectors, to those requirements. There is no restriction
on the nature of the vectors, which can be arrows, n-tuples, matrices, functions,
or oranges. For us, the most important vector spaces are R^n and various function
spaces; R^n is used in the remainder of this chapter and Chapters 10-12, and function
spaces are used in Chapter 17 when we study Fourier series and Sturm-Liouville
theory.
To illustrate the power of the axiomatic approach, recall the Schwarz inequality
|u · v| ≤ ||u|| ||v||, proved in Section 9.5.2 for R^n. That result holds for any normed
inner product space with natural norm ||u|| = √(u · u) for it followed from properties
of R^n, which properties are subsequently elevated to axioms for general vector
space. Thus, it represents many properties rolled into one. For example, in R^n,
with the dot product (18a) it says
$\left| \sum_{j=1}^{n} u_j v_j \right| \le \sqrt{\sum_{j=1}^{n} u_j^2}\,\sqrt{\sum_{j=1}^{n} v_j^2}$;   (26)
in the function space of Examples 2 and 4, with the inner product (24b) and norm
(25) it says
$\left| \int_0^1 u(x)v(x)w(x)\,dx \right| \le \sqrt{\int_0^1 u^2(x)w(x)\,dx}\,\sqrt{\int_0^1 v^2(x)w(x)\,dx}$,   (27)
and so on.
EXERCISES 9.6
1. Recall that R^n is the vector space (“real” vector space since all scalars are to be real
numbers) in which the vectors are n-tuples u = (u_1, ..., u_n), with the definitions
$\mathbf{u} + \mathbf{v} = (u_1, \ldots, u_n) + (v_1, \ldots, v_n) = (u_1 + v_1, \ldots, u_n + v_n)$,   (1.1)
$\mathbf{0} = (0, \ldots, 0)$,   (1.2)
$-\mathbf{u} = (-u_1, \ldots, -u_n)$,   (1.3)
$\alpha\mathbf{u} = (\alpha u_1, \ldots, \alpha u_n)$.   (1.4)
If we make the following modifications, do we still have a vector space? If not, specify all
requirements within Definition 9.6.1 that fail to be met.
(a) only vectors of the form u = (u, u, ..., u) admitted, where −∞ < u < ∞
(b) only vectors of the form u = (u, 2u, 3u, ..., nu) admitted, where −∞ < u < ∞
(c) only the vector (0, ..., 0) admitted (this is an example of a zero vector space, a vector
space containing only the zero vector)
(d) u + v = (u_1 − v_1, ..., u_n − v_n), in place of (1.1)
(e) u + v = (0, ..., 0) for all u's and v's, in place of (1.1)
(f) αu = (α^2 u_1, ..., α^2 u_n), in place of (1.4)
2. We noted in Example 1 that the definition (10) of vector addition violates axiom (1).
Does it violate any others as well? Explain.
3. Prove (15b), that (−1)u = −u.
4. Prove (15c), that α0 = 0.
5. Prove that if αu = 0 then α = 0 and/or u = 0.
6. Show that the inner product (18b) does satisfy the requirements (16).
7. We stated in Example 3 that if for any vector space S we already have an inner product,
then a legitimate norm can always be obtained from that inner product as ||u|| = √(u · u),
which choice is called the natural norm. Prove that claim.
8. Show that the “taxicab norm” (20) is a legitimate norm; that is, that it satisfies the
requirements (17).
9. (a) Does the choice $\|\mathbf{u}\| = \max_{1 \le j \le n} |u_j|$, for R^n, satisfy the requirements (17)? Explain.
(b) How about $\|\mathbf{u}\| = \min_{1 \le j \le n} |u_j|$, for R^n?
10. Let S be the set of real-valued polynomial functions, of degree n, defined on a ≤ x ≤ b.
If u = a_0 + a_1 x + ··· + a_n x^n and v = b_0 + b_1 x + ··· + b_n x^n are any two such functions,
and α is any (real) scalar, define the sum u + v and the scalar multiple αu as
$(\mathbf{u} + \mathbf{v})(x) = (a_0 + b_0) + (a_1 + b_1)x + \cdots + (a_n + b_n)x^n$,
$(\alpha\mathbf{u})(x) = \alpha a_0 + \alpha a_1 x + \cdots + \alpha a_n x^n$,
respectively. Further, let 0 be the function 0 + 0x + ··· + 0x^n, and let −u be the function
−a_0 − a_1 x − ··· − a_n x^n. Show that S is a vector space.
11. Show that the inner product (24b) does satisfy the requirements (16).
12. (Schwarz inequality) We derive the Schwarz inequality
$|\mathbf{u} \cdot \mathbf{v}| \le \|\mathbf{u}\|\,\|\mathbf{v}\|$   (12.1)
for R^n space in Section 9.5.2. The latter holds not only for R^n but for any normed inner
product space with the natural norm ||u|| = √(u · u). In this exercise we simply ask you to
verify (12.1) by working out the left- and right-hand sides for these specific cases:
(a) u = (3, 1, −1, 0) and v = (1, 2, 5, −4) in R^4, with the inner product (18a)
(b) u = (1, 2, 4, −3) and v = (0, 4, 1, 1) in R^4, with
u · v = u_1 v_1 + 5u_2 v_2 + 3u_3 v_3 + 2u_4 v_4
(c) u = (1, 1, 1, 1, 1) and v = (2, 2, 2, 2, 2) in R^5, with
u · v = u_1 v_1 + 2u_2 v_2 + 3u_3 v_3 + 4u_4 v_4 + 5u_5 v_5
(d) u = 2 + [...] and v = 3x^2 in the function space of Example 4, with the inner product
$\mathbf{u} \cdot \mathbf{v} = \langle u(x), v(x) \rangle = \int_0^1 u(x)v(x)\,dx$
(e) Same as (d), but with $\langle u(x), v(x) \rangle = \int_0^1 u(x)v(x)(2 + 5x)\,dx$
13. (Solution space) (a) Consider a set of m linear homogeneous algebraic equations in the
n unknowns x_1, ..., x_n and denote each solution of the system as an n-tuple vector
x = (x_1, ..., x_n) in R^n. Show that the set of all such vectors, with the usual definitions
[u + v = (u_1 + v_1, ..., u_n + v_n), αu = (αu_1, ..., αu_n), −u = (−u_1, ..., −u_n),
0 = (0, ..., 0)], is a vector space. That space is called the solution space of the system.
(b) If the system is nonhomogeneous, is the set of solutions still a vector space? Explain.
14. (Solution space) Show that the solutions of a linear homogeneous differential equation
(with the same definitions of u + v, αu, −u, and 0 as in Example 2) constitute a vector
space, the so-called solution space of that differential equation.
9.7 Span and Subspace
Here, we begin a sequence of closely related ideas: span, linear dependence, basis,
expansion, and dimension. The concepts, definitions, and theorems hold for any
vector space, but our illustrative examples are restricted to the n-space R^n, this
being the case of most interest in Chapters 9-12.
We begin with the idea of the “span” of a set of vectors.
DEFINITION 9.7.1 Span
If u_1, ..., u_k are vectors in a vector space S, then the set of all linear combinations
of these vectors, that is, all vectors of the form
$\mathbf{u} = \alpha_1\mathbf{u}_1 + \cdots + \alpha_k\mathbf{u}_k$,   (1)
where α_1, ..., α_k are scalars, is called the span of u_1, ..., u_k and is denoted as
span{u_1, ..., u_k}.
The set {u_1, ..., u_k} is called the generating set of span{u_1, ..., u_k}.
Let us illustrate with some vector sets in R^2 and R^3 so we can support the
discussion with diagrams.
EXAMPLE 1. Determine the span of the single vector
$\mathbf{u}_1 = (4, 2)$   (2)
in R^2. Then span{u_1} is the set of all vectors that are scalar multiples of u_1. Hence,
span{u_1} is the set of all vectors on the line L in Fig. 1, such as u = 2u_1 = (8, 4),
v = −(1/2)u_1 = (−2, −1), and 0 = 0u_1 = (0, 0). We say that u_1 generates the line L. ■
[Figure 1. Span{u_1}.]
EXAMPLE 2. Determine the span of the two vectors
$\mathbf{u}_1 = (4, 2), \quad \mathbf{u}_2 = (-8, -4)$.   (3)
Span{u_1, u_2} is, once again, the line L in Fig. 1 (i.e., the set of all vectors on L), for
both u_1 and u_2 lie along L, so any linear combination of them, α_1 u_1 + α_2 u_2, does too.
Similarly, span{(4, 2), (−8, −4), (18, 9), (0, 0)} is the line L. ■
Observe that the line L, in Examples 1 and 2, is only a subset of the vector
space R^2. Observe that that subset of R^2 is itself a vector space, a so-called “subspace”
of R^2. For if u and v are any two vectors on L, then u + v is on L, too, so
the set is closed under addition; similarly, if u is on L, so is αu, for any scalar α, so
the set is closed under scalar multiplication; L does contain the zero vector [since
we can set all the α's in (1) equal to zero]; and for each u on L there is a (unique)
vector −u on L such that u + (−u) = 0.
DEFINITION 9.7.2 Subspace
If a subset T of a vector space S is itself a vector space (with the same definitions
as S for vector addition u + v, scalar multiplication αu, zero vector 0, and negative
vector −u), then T is a subspace of S.
Usually, a subspace of S is only a part of S, as the line L is only a part of R^2,
but since a subset of a set can be all of that set, a subspace of S can be all of S. For
instance, R^2 is a subspace of R^2.
THEOREM 9.7.1 Span as Subspace
If u_1, ..., u_k are vectors in a vector space S, then span{u_1, ..., u_k} is itself a
vector space, a subspace of S.
For instance, the line L in Fig. 1 is a subspace of R^2. Proof of Theorem 9.7.1
is left for the exercises.
EXAMPLE 3. Is the span of
$\mathbf{u}_1 = (5, 1), \quad \mathbf{u}_2 = (1, 3)$   (4)
all of R^2 or only a part of R^2? To determine the extent of span{u_1, u_2}, let v = (v_1, v_2)
be any given vector in R^2, and try to express
$\mathbf{v} = \alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2$.   (5)
That is,
$(v_1, v_2) = \alpha_1(5, 1) + \alpha_2(1, 3) = (5\alpha_1, \alpha_1) + (\alpha_2, 3\alpha_2) = (5\alpha_1 + \alpha_2,\ \alpha_1 + 3\alpha_2)$.   (6)
Equating components, we obtain the linear equations
$5\alpha_1 + \alpha_2 = v_1$,
$\alpha_1 + 3\alpha_2 = v_2$   (7)
in α_1, α_2. Applying Gauss elimination, (7) becomes
$\alpha_1 + \tfrac{1}{5}\alpha_2 = \tfrac{1}{5}v_1$,
$\alpha_2 = \tfrac{5}{14}v_2 - \tfrac{1}{14}v_1$.   (8)
It is clear from the Gauss-reduced form (8) that the system is consistent (solvable for
α_1, α_2) for every vector v in R^2. Hence, we may conclude that span{u_1, u_2} is all of
R^2; we say that {u_1, u_2} spans R^2. (Here we use “span” as a verb; in Definition 9.7.1 it is
introduced as a noun.)
Thus, every v in R^2 can be expressed as a linear combination of the vectors u_1 and u_2. As
representative, let v = (6, −4) so v_1 = 6 and v_2 = −4. Then (8) gives α_2 = −13/7 and
α_1 = 11/7, so that (5) becomes
$\mathbf{v} = \tfrac{11}{7}\mathbf{u}_1 - \tfrac{13}{7}\mathbf{u}_2$.   (9)
To see this in graphical terms, observe from Fig. 2 that v = OA + OB, where (with the
aid of a scale) OA ≈ 1.6u_1 and OB ≈ −1.9u_2. Thus, v ≈ 1.6u_1 − 1.9u_2, in agreement
with (9).
COMMENT. Suppose that we add u_3 = (2, 2) to the set. It should be evident that
span{u_1, u_2, u_3} is all of R^2, again, since {u_1, u_2} spanned R^2 even “without any help”
from u_3. But in case this is not clear, let us go through steps analogous to steps (5) to (8):
$\mathbf{v} = \alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \alpha_3\mathbf{u}_3$   (10)
so (v_1, v_2) = (5α_1 + α_2 + 2α_3, α_1 + 3α_2 + 2α_3). Thus,
$5\alpha_1 + \alpha_2 + 2\alpha_3 = v_1$,
$\alpha_1 + 3\alpha_2 + 2\alpha_3 = v_2$,
or
$\alpha_1 + \tfrac{1}{5}\alpha_2 + \tfrac{2}{5}\alpha_3 = \tfrac{1}{5}v_1$,
$\alpha_2 + \tfrac{4}{7}\alpha_3 = \tfrac{5}{14}v_2 - \tfrac{1}{14}v_1$.   (11)
Like (8), (11) is consistent for every v in R^2 so {u_1, u_2, u_3} spans R^2, as claimed. Whereas
(8) had a unique solution so that the representation (5) was unique, (11) happens to have an
infinity of solutions so that the representation (10) is not unique. ■
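The coefficients in (9) can be confirmed with exact rational arithmetic. The sketch below solves the 2-by-2 system (7) by Cramer's rule rather than by the Gauss elimination used in the text; that substitution, and the function name `expansion_coefficients`, are choices made here for brevity.

```python
from fractions import Fraction

# Sketch: solve alpha1*u1 + alpha2*u2 = v in R^2 exactly.
# Cramer's rule is used here in place of the text's Gauss elimination;
# the function name is an illustrative assumption.

def expansion_coefficients(u1, u2, v):
    """Return (alpha1, alpha2) with alpha1*u1 + alpha2*u2 == v.
    Requires u1 and u2 to be non-parallel (nonzero determinant)."""
    det = u1[0] * u2[1] - u2[0] * u1[1]
    if det == 0:
        raise ValueError("u1 and u2 are parallel; they do not span R^2")
    alpha1 = Fraction(v[0] * u2[1] - u2[0] * v[1], det)
    alpha2 = Fraction(u1[0] * v[1] - v[0] * u1[1], det)
    return alpha1, alpha2

u1, u2, v = (5, 1), (1, 3), (6, -4)
alpha1, alpha2 = expansion_coefficients(u1, u2, v)
print(alpha1, alpha2)                # 11/7 -13/7
print(float(alpha1), float(alpha2))  # close to the graphical estimates 1.6 and -1.9

# Reassemble v from the coefficients as a check.
rebuilt = tuple(alpha1 * a + alpha2 * b for a, b in zip(u1, u2))
print(rebuilt == v)
```

Because the determinant 5(3) − 1(1) = 14 is nonzero, a solution exists for every v, which is another way of seeing that {u_1, u_2} spans R^2.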
EXAMPLE 4. As a final example, consider the span of
$\mathbf{u}_1 = (1, 2, 2), \quad \mathbf{u}_2 = (-1, 0, 2)$   (12)
in R^3. Setting
$\mathbf{v} = \alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2$,   (13)
we have
$\alpha_1 - \alpha_2 = v_1$,
$2\alpha_1 = v_2$,
$2\alpha_1 + 2\alpha_2 = v_3$,
or, after Gauss elimination,
$\alpha_1 - \alpha_2 = v_1$,
$\alpha_2 = \tfrac{1}{2}v_2 - v_1$,
$0 = v_3 - 2v_2 + 2v_1$.   (14)
Now, span{u_1, u_2} is the set of all possible vectors v given by (13), i.e., all vectors v for
which the system (14) is consistent, i.e., all vectors v = (v_1, v_2, v_3) such that
$2v_1 - 2v_2 + v_3 = 0$   (15)
[so that the last of equations (14) is 0 = 0 rather than a contradiction].
[Figure 3. u_1 and u_2.]
In geometrical terms, on the other hand, span{u_1, u_2} should be the subset of R^3
consisting of the plane that passes through u_1 and u_2 (u_1 and u_2 are shown in Fig. 3).
How does that fact correlate with (15)? As a matter of fact, (15) is the equation of a plane
in 3-space, and that plane does pass through the origin, through the tip of u_1 [i.e., the point
(1, 2, 2)], and through the tip of u_2 [the point (−1, 0, 2)]. Hence, it is the plane through u_1
and u_2, so the analytical approach, namely, steps (13) to (15), and our geometrical
interpretation are in agreement.
We conclude that span{u_1, u_2} is not all of R^3; it is only the subspace of R^3 consisting
of the plane (i.e., all vectors in the plane) containing the given vectors u_1 and u_2.
COMMENT. Since span{u_1, u_2} is a plane, would it be correct to say that span{u_1, u_2}
is R^2? No, that would be incorrect; R^2 is made up of two-tuples, while the vectors in the
above-mentioned plane are three-tuples. Thus, R^2 space is not relevant in this problem.
All that can be said here is that span{u_1, u_2} is the subspace of R^3 consisting of the plane
containing the vectors u_1 and u_2, that is, the plane defined by (15). ■
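The plane equation (15) can be cross-checked with a normal vector. The sketch below uses the cross product of u_1 and u_2, which is a different route from the Gauss elimination in the text; the function names `cross`, `dot`, and `in_span` are illustrative. Integer inputs are assumed so that the membership test is exact.

```python
# Sketch: membership test for span{u1, u2} in R^3 via a normal vector.
# The cross product stands in for the text's elimination steps (13)-(15);
# names are illustrative, and u1, u2 are assumed non-parallel.

def cross(a, b):
    """Cross product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Ordinary dot product."""
    return sum(x * y for x, y in zip(a, b))

def in_span(v, u1, u2):
    """True if v lies in the plane through the origin containing u1 and u2."""
    return dot(cross(u1, u2), v) == 0

u1 = (1, 2, 2)
u2 = (-1, 0, 2)

normal = cross(u1, u2)
print(normal)  # (4, -4, 2), a multiple of the coefficients (2, -2, 1) in (15)

print(in_span((0, 2, 4), u1, u2))   # u1 + u2 lies in the plane
print(in_span((1, 0, 0), u1, u2))   # (1, 0, 0) does not satisfy (15)
```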
Closure. In leading up to the concept of bases and expansions, the two key ideas are
span and linear independence. In this section we introduce the idea of span; in the
next section we introduce linear dependence and linear independence. Although
the concept of span holds for any vector space, such as R®°,we suggest that you
focus on the foregoing examples in two- and three-spaces, so that you can use the
two- and three-dimensional drawings to promote understanding.
EXERCISES 9.7
1. Show whether the vectors
That solution space is a subspace of IR”. To illustrate, con-
IR”
span
(a)(1,0,...,0), (0,1,0,...,0),...,(0,...,0,1)
(b)(0,0,0,1), (0,0,1,1), (0,1,1 ,2) (1,4,1 , 1) span 4
= 1,n = 2,
sider the simple system a + 322 = 0; thatis,m
a1, = 1, and ay. = 3. The solution is zg = a(arbitrary),
(1,1,2,3)
1),(0,0,0,0),
1, -1),(0, 1,0,
0,4),(2,3,
(c)(1,2,
span R*
vy = ~3a, or x = (21,22) = a(—3, 1) so thesolutionspace
is thespanofthe vector(—3,1),thatis, span{(—3,1)}. In this
(d)
(1,3,2,2),(5,ma,0),(—1,—2,4,3) span
R!
manner,determine the solution space for each of the following
examples.
(f) (1,1,2), ( a)
(2, 1,0),
(-1,0,3) span R°
(g)(2,0,3), (1, 2,4), (—5,2, -2)spanR3
(h)(1,3,0), (2,-1, 1),(1, 1,4) spanR°
(a) vy — 22 +4e3 = 0 in RY
(b) 21)+29 +23 —24 = O0inR*
(Cc) x, — 2 + «#3= 0
(e) (1,
(2,1, -1), (1, 2, —5) span IR?
0,1),
(i) (—1,2,4),(5, 2, -2), (2, 0,3),
(1,2,3) spanR3
(j)(0,0,0), (2,1,4), (-1, 3,5) spanR4
(k) (2,1,3), (1,-1, 2) span‘RS
(1)(2,1,-1),(1,3,
1),(5,5, —1),
(0,5,
3)span
IR?
(m)(—4,1,0), (2,2,2), (1,2,3) spanR®
(n)(~3,
1,0),(1,1,1),(1,7,5)span
R?
(0)
(1,2),
(2.1)
(45)
span
Re
(e)
= 0
xy — fo + 23 —-24K4
+ 24 + 2¢5 = 0 inR®
Ly — £2
£3 + t4 = 0
+ 205 + 24 = 0 inR?
Ly
— £4 = 0
Uy + 2x2
(b) Sketch any three such vectors.
(c) Sketch any four such vectors.
(g)
Uy
+
fo
4+ 223
_
in IR*
0
=
23
T+
2. (a)Sketch any two vectors that span the space of all vectors
in the plane of the paper.
vector sets subspaces ofR??
vy, + 38a0 -
(f) ay + a2 - 23 + tg = 0
(0)(1,2), (2,1) spanR?
3. Are the following
ry + to + ty = 0 in R®
(d)
= 0
204
+25
Ly + Lo + 2x3
=0
224 + #5= 0 in R®
5. Find any two vectors in R° that span the plane
(See accom-
panying figure.) Explain.
(a) vy
209
+ 4x3
t+
524
=0
(c)
=0
(b) 22, + vg — 623 = 0
(d) v1, +42q +23 =0
(f) 38a, — v2 —%3 = 0
(a) the straight line D that extends from the origin to infinity
(b) the wedge-shaped region (including its boundary lines) that
6. Show whether the given sets are identical. Explain.
extends to infinity in both directions
(c) the upper half plane zw.> 0
(a)span {(2,-1, -1), (3, 1,0)} andspan{(2, —1,—1),(5,5, 2)
(b) span {(1 23 ,(2,-1, 1)} and span{(1, 2,3), (3, 1,5)}
‘
(c) span{(4, 1, ,
(d)span{(1, 2, —1), (3, 0, 0)} andspan{(1, 0, 0), (1,3, 0)}
(1,0,1,2 —1,1,1,0)} and
span{(0,1,2, ),
(f) span{(1, 0, 1
span {(2,0, —1,0),
-1, 2, 3), (4,3, 2,1)}
(0,
(g) span{(1,0, 1,1), (2,1,1,0),(1,2, 2, 1)} and
span{(2, —10,0), (1, -2,0, 1), (3,5,4, 1}
(h) span{(1,2, 3, 0), (0, 1,0,2), (2, 3,0, 1)} and
span {(1, 0, 3, —1),(-1,1,3,3),
(1, 2,1, 1)}
7. Find any two ON (orthonormal) vectors in
(a)Span{0,22)), (6, -1)}
4, (Solution space) First. review Exercise |3a in Section 9.6.
(b)span{(1,oe (2,-1,3)}
(c) span{(1,
(1,2, 3)}
-1,0),
(f) span{(~2,3, 1, 1),(0,2, -1,1)}
(d)span{(2,1,0), (0, 1,2)}
(e)span{(1,1,0, 1),(0,2, —1,1)}
8. Prove Theorem 9.7.1.
9.8 Linear Dependence
The definition of the linear dependence or independence of a set of vectors is essentially
identical to Definition 3.2.1 for a set of functions, with the word “functions”
changed to “vectors”:
DEFINITION 9.8.1 Linear Dependence and Linear Independence
A set of vectors {u_1, ..., u_k} is said to be linearly dependent if at least one of
them can be expressed as a linear combination of the others. If none can be so
expressed, then the set is linearly independent.
Thus, we urge you to review Section 3.2 in conjunction with your study of this
section. As in Chapter 3, we frequently use the abbreviations LD and LI to stand
for linearly dependent and linearly independent, respectively.
EXAMPLE 1. Let u_1 = (1, 0), u_2 = (1, 1), and u_3 = (5, 4). These are LD since,
by inspection, we can express u_3 as a linear combination of u_1 and u_2: u_3 = u_1 + 4u_2.
(Alternatively, we could express u_2 = (1/4)u_3 − (1/4)u_1 or u_1 = −4u_2 + u_3.) ■
EXAMPLE 2. Let u_1 = (1, 0) and u_2 = (1, 1). These are LI since u_1 cannot be
expressed as a “linear combination of the others,” namely, as a scalar multiple of u_2, nor
can u_2 be expressed as a scalar multiple of u_1. ■
EXAMPLE 3. Let u_1 = (2, −1), u_2 = (0, 0), and u_3 = (0, 1). These are LD since we
can express u_2 = 0u_1 + 0u_3. (The fact that we cannot express u_1 as a linear combination
of u_2 and u_3, nor u_3 as a linear combination of u_1 and u_2, does not alter our conclusion,
for recall the words “at least one” in the definition.) ■
It is implicit in Definition 9.8.1 that u_1, ..., u_k are all members of the same
vector space; in Examples 1 to 3 that space was R^2. Thus, it would make no sense
to ask whether u_1 = (2, 5) and u_2 = (4, 3, 0, 1) are linearly dependent or not since
u_1 is a member of R^2 while u_2 is a member of R^4.
The preceding examples are simple enough to be worked by inspection. In
more complicated cases, the following theorem provides a systematic approach for
determining whether a given vector set is linearly dependent or linearly independent.
THEOREM 9.8.1 Test for Linear Dependence / Independence
A finite set of vectors {u_1, ..., u_k} is LD if and only if there exist scalars α_j, not
all zero, such that
$\alpha_1\mathbf{u}_1 + \cdots + \alpha_k\mathbf{u}_k = \mathbf{0}$;   (1)
if (1) holds only if all the α_j's are zero, then the set is LI.
Proof is essentially the same as for Theorem 3.2.1.
EXAMPLE 4. Consider the 4-tuples
$\mathbf{u}_1 = (2, 0, 1, -3), \quad \mathbf{u}_2 = (0, 1, 1, 1), \quad \mathbf{u}_3 = (2, 2, 3, 0)$.   (2)
To see if these vectors are LI or LD, appeal directly to (1):
$\alpha_1(2, 0, 1, -3) + \alpha_2(0, 1, 1, 1) + \alpha_3(2, 2, 3, 0) = (0, 0, 0, 0)$,   (3)
or (2α_1 + 2α_3, α_2 + 2α_3, α_1 + α_2 + 3α_3, −3α_1 + α_2) = (0, 0, 0, 0). Thus,
$2\alpha_1 + 2\alpha_3 = 0$,
$\alpha_2 + 2\alpha_3 = 0$,
$\alpha_1 + \alpha_2 + 3\alpha_3 = 0$,
$-3\alpha_1 + \alpha_2 = 0$.   (4)
Applying Gauss elimination yields
$2\alpha_1 + 2\alpha_3 = 0$,
$\alpha_2 + 2\alpha_3 = 0$,
$\alpha_3 = 0$,
$0 = 0$.   (5)
This system admits only the trivial solution, α_1 = α_2 = α_3 = 0, so u_1, u_2, u_3 are LI. ■
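The test of Theorem 9.8.1 can be automated. The sketch below is one possible approach, not the text's own procedure: it row-reduces the vectors themselves (stored as rows) with exact fractions and compares the number of pivots, the rank, with the number of vectors. The set is LI exactly when the two agree, which is equivalent to (1) having only the trivial solution. The function names `rank` and `is_linearly_independent` are illustrative.

```python
from fractions import Fraction

# Sketch: decide LI versus LD by Gaussian elimination with exact arithmetic.
# Row-reducing the vectors as rows is a variation on the text's elimination
# on the alpha system; function names are illustrative assumptions.

def rank(vectors):
    """Number of pivots found when row-reducing the given vectors."""
    rows = [[Fraction(x) for x in vec] for vec in vectors]
    r = 0  # index of the next pivot row
    for c in range(len(rows[0])):
        # Find a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate column c from all rows below the pivot row.
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def is_linearly_independent(vectors):
    """True if the only solution of (1) is the trivial one."""
    return rank(vectors) == len(vectors)

example4 = [(2, 0, 1, -3), (0, 1, 1, 1), (2, 2, 3, 0)]
print(is_linearly_independent(example4))   # the 4-tuples of Example 4

example1 = [(1, 0), (1, 1), (5, 4)]
print(is_linearly_independent(example1))   # the LD set of Example 1
```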
EXAMPLE 5. Consider the 3-tuples
$\mathbf{u}_1 = (1, 0, 1), \quad \mathbf{u}_2 = (1, 1, 1), \quad \mathbf{u}_3 = (1, 1, 2), \quad \mathbf{u}_4 = (1, 2, 1)$.   (6)
Working from (1), as in Example 4, we have
$\alpha_1 + \alpha_2 + \alpha_3 + \alpha_4 = 0$,
$\alpha_2 + \alpha_3 + 2\alpha_4 = 0$,
$\alpha_1 + \alpha_2 + 2\alpha_3 + \alpha_4 = 0$,   (7)
or, after Gauss elimination,
$\alpha_1 + \alpha_2 + \alpha_3 + \alpha_4 = 0$,
$\alpha_2 + \alpha_3 + 2\alpha_4 = 0$,
$\alpha_3 = 0$.   (8)
This time, there exist nontrivial solutions for the α_j's so the vectors u_1, ..., u_4 are LD.
(Specifically, (8) gives α_3 = 0, α_4 = α, α_2 = −2α, α_1 = α where α is arbitrary. With
α = 1, say, (1) becomes u_1 − 2u_2 + 0u_3 + u_4 = 0.) ■
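The dependence relation found in Example 5 can be verified directly. The sketch below takes u_1 = (1, 0, 1), u_2 = (1, 1, 1), u_3 = (1, 1, 2), u_4 = (1, 2, 1), the ordering consistent with system (7), and checks that the coefficients (α, −2α, 0, α) give the zero vector for several values of α. The helper names `linear_combination` and `coeffs_for` are illustrative.

```python
# Sketch: confirm that alpha*u1 - 2*alpha*u2 + 0*u3 + alpha*u4 = 0 in Example 5.
# Helper names are illustrative assumptions.

def linear_combination(coeffs, vectors):
    """Return sum_j coeffs[j] * vectors[j], computed component by component."""
    n = len(vectors[0])
    return tuple(sum(c * vec[i] for c, vec in zip(coeffs, vectors))
                 for i in range(n))

vectors = [(1, 0, 1), (1, 1, 1), (1, 1, 2), (1, 2, 1)]

def coeffs_for(alpha):
    """The one-parameter family of solutions read off from (8)."""
    return (alpha, -2 * alpha, 0, alpha)

for alpha in (1, -3, 7):
    print(alpha, linear_combination(coeffs_for(alpha), vectors))
```

Every nonzero α gives a nontrivial set of coefficients, so one such relation is already enough to conclude that the set is LD.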
We conclude this section with four modest theorems, the first three being essentially
the same as Theorems 3.2.4-3.2.6 for functions.
THEOREM 9.8.2 Linear Dependence / Independence of Two Vectors
A set of two vectors {u_1, u_2} is LD if and only if one is expressible as a scalar
multiple of the other.
THEOREM 9.8.3 Linear Dependence of Sets Containing the Zero Vector
A set containing the zero vector is LD.
THEOREM 9.8.4 Equating Coefficients
Let {u_1, ..., u_k} be LI. Then, for
$a_1\mathbf{u}_1 + \cdots + a_k\mathbf{u}_k = b_1\mathbf{u}_1 + \cdots + b_k\mathbf{u}_k$
to hold, it is necessary and sufficient that a_j = b_j for each j = 1, ..., k. That is, the
coefficients of corresponding vectors on the left- and right-hand sides must match.
THEOREM 9.8.5 Orthogonal Sets
Every finite orthogonal set of (nonzero) vectors is LI.
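Proof of Theorem 9.8.5: Dot u_1 into both sides of
$\alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \cdots + \alpha_k\mathbf{u}_k = \mathbf{0}$.   (9)
In other words,
$\mathbf{u}_1 \cdot (\alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \cdots + \alpha_k\mathbf{u}_k) = \mathbf{u}_1 \cdot \mathbf{0}$,
$\alpha_1\,\mathbf{u}_1 \cdot \mathbf{u}_1 + \alpha_2\,\mathbf{u}_1 \cdot \mathbf{u}_2 + \cdots + \alpha_k\,\mathbf{u}_1 \cdot \mathbf{u}_k = 0$,
$\alpha_1\|\mathbf{u}_1\|^2 + 0 + \cdots + 0 = 0$.   (10)
Now u_1 ≠ 0 implies that ||u_1|| ≠ 0 so it follows from (10) that α_1 = 0. Similarly,
dotting u_2 into (9) gives α_2 = 0, and so on. Since α_1 = α_2 = ··· = α_k = 0, the
u_j's must be LI, as claimed. ■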
EXAMPLE 6. The set {(2, 1), (1, 5)} in R^2 is LI because neither vector can be
expressed as a scalar multiple of the other. ■
EXAMPLE 7. Let
$\mathbf{u}_1 = (4, -1, 1, 2), \quad \mathbf{u}_2 = (3, 0, 2, 5), \quad \mathbf{u}_3 = (0, 0, 0, 0)$
in R^4. The set is LD, according to Theorem 9.8.3, because it contains the zero vector u_3 =
0. That is, u_3 can be expressed as a linear combination of u_1 and u_2: u_3 = 0u_1 + 0u_2.
If the preceding sentence is not clear, rewrite the equation as 0u_1 + 0u_2 − 1u_3 = 0 and
observe that the α_j coefficients (0, 0, and −1) are not all zero. ■
Closure. The foregoing discussion of the linear dependence / independence of
vectors is essentially the same as the discussion of the linear dependence/ independence of functions in Section 3.2, except that the Wronskian determinant test did
not carry over.
EXERCISES
9.8
1. (a)Can a set be neitherLD nor LI? Explain.
(b)Can a set be both LD and LI? Explain.
G) (1,1, 0,0), (1, -1, 0, 0), (0,0,
(k) (1, ~3, 0,2, 1), (—2,6,0, ~4, =
2. Show that the following sets are LD by expressing one of
the
a asa es combination of the others,
(D (5,4, re
ee
rie
(0)(7,1,0),(—1,1,4),(2,3,5-
1),(1, 2),(3,4)}
(a)
(1,3),(2,0),
(1,3),(7,3)
(b)(1,3),
(2,0),
(1,2),
(-1,5)
(c)(2,3,0),(1,-2,3)
(e)(0,0,2).(0,0 3),(2,-1,5), (1,2,4),(7,9,1),(2,0,-4)
(f)(2,3,0,0),(1, ~5,0,
2),(3,1,2,2)
(g)(1,3,2,0),(4,1, -2, 2), (0,2,0,3),(4,7,1,2)
(h)(2.0,1.—1,0),(1,2,0,3, 1),(4,~4,3,~9,—2)
(i)(1, 3,0), (0,1,—1),(0, 0,0)
10,0) (1, 9,
rn
a
2,
2)
1),(3,=2,
(1,0,
=1),
(p)(1,2,
D(ed) 2s) oh
o {(1,on Me12),(—<3)}
(a){(1,2.3), (3,2,
1), 5 .5)}
∩
(0,
or
(q)(3,1,0,0),(1, —2,
4, 1), (2hbo_~ >
≤
−
∕
∟
(a)
u
Uy
Bw _
ar a
no
↕
448
(¢)
(d)
5. If u_1 and u_2 are LI, u_1 and u_3 are LI, and u_2 and u_3 are LI,
does it follow that {u_1, u_2, u_3} is LI? Prove or disprove.
Uo
Uo
iby
ug
6. Prove or disprove:
(a) v is in span{u_1, ..., u_k} if {v, u_1, ..., u_k} is LD.
(b) v is not in span{u_1, ..., u_k} if {v, u_1, ..., u_k} is LI.
(c) v is not in span{u_1, ..., u_k} if and only if {v, u_1, ..., u_k} is LI.
7. (a) Prove Theorem 9.8.2.
(b) Prove Theorem 9.8.3.
(c) Prove Theorem 9.8.4.
9.9 Bases, Expansions, Dimension
9.9.1. Bases and expansions. In the calculus we learn that a given function $f(x)$ can be "expanded" as a linear combination of powers of $x$ (namely $1, x, x^2, \ldots$),
$$f(x) = a_0 + a_1 x + a_2 x^2 + \cdots. \tag{1}$$
We call $a_0, a_1, a_2, \ldots$ the "expansion coefficients," and these can be computed from $f(x)$ as $a_j = f^{(j)}(0)/j!$. Such representation of a given function is important, and examples such as $e^x = 1 + x + \frac{1}{2!}x^2 + \frac{1}{3!}x^3 + \cdots$ and $\sin x = x - \frac{1}{3!}x^3 + \frac{1}{5!}x^5 - \cdots$ are familiar to us.
Likewise useful, in Chapters 9-12, is the expansion of a given vector $\mathbf{u}$ in terms of a set of "base vectors" $\mathbf{e}_1, \ldots, \mathbf{e}_k$:
$$\mathbf{u} = \alpha_1\mathbf{e}_1 + \cdots + \alpha_k\mathbf{e}_k. \tag{2}$$
How do we come up with such sets of base vectors and, once we know the $\mathbf{e}_j$'s and the given $\mathbf{u}$, how do we compute the expansion coefficients $\alpha_j$? The story is simpler than for the power series of functions because whereas (1) is an infinite series and one needs to deal with the sophisticated issue of convergence, our vector expansions in Chapters 9-12 entail only a finite number of terms.
Beginning simply, consider the vector space $\mathbb{R}^2$, the set of all vectors in the plane of the paper. In particular, consider the vectors $\mathbf{e}_1$ and $\mathbf{e}_2$ shown in Fig. 1a. It should be evident (Theorem 9.8.2) that $\mathbf{e}_1$ and $\mathbf{e}_2$ are LI and that they span the space so that any given vector, such as $\mathbf{u}$ in Fig. 1b and $\mathbf{v}$ in Fig. 1c, can be expressed as a linear combination of them. For the vector $\mathbf{u}$, for example, $\mathbf{u} = \mathbf{OA} + \mathbf{OB}$; with the aid of a scale, $\mathbf{OA} = 1.6\mathbf{e}_1$ and $\mathbf{OB} = 2\mathbf{e}_2$, so that
$$\mathbf{u} = 1.6\mathbf{e}_1 + 2\mathbf{e}_2. \tag{3}$$
Similarly (Fig. 1c),
$$\mathbf{v} = 2\mathbf{e}_1 - 2.5\mathbf{e}_2, \tag{4}$$
and so on, for any given vector in the plane. Of course, the zero vector is simply $\mathbf{0} = 0\mathbf{e}_1 + 0\mathbf{e}_2$.
The formulas (3) and (4) are examples of the expansion of a given vector [$\mathbf{u}$ in (3), $\mathbf{v}$ in (4)] in terms of a set of base vectors [the set $\{\mathbf{e}_1, \mathbf{e}_2\}$].
DEFINITION 9.9.1 Basis
A finite set of vectors $\{\mathbf{e}_1,\ldots,\mathbf{e}_k\}$ in a vector space $S$ is a basis for $S$ if each vector $\mathbf{u}$ in $S$ can be expressed (i.e., "expanded") uniquely in the form
$$\mathbf{u} = \alpha_1\mathbf{e}_1 + \cdots + \alpha_k\mathbf{e}_k = \sum_{j=1}^{k} \alpha_j\mathbf{e}_j. \tag{5}$$
By the expansion (5) being unique, we mean that the $\alpha_j$ expansion coefficients are uniquely determined.
THEOREM 9.9.1 Test for Basis
A finite set $\{\mathbf{e}_1,\ldots,\mathbf{e}_k\}$ in a vector space $S$ is a basis for $S$ if and only if it spans $S$ and is LI.
Figure 1. Vector expansion in $\mathbb{R}^2$.
Proof: First, it follows from the definition of the verb span that every vector $\mathbf{u}$ in $S$ can be expanded as in (5) if and only if the set $\{\mathbf{e}_1,\ldots,\mathbf{e}_k\}$ spans $S$. Turning to the question of the uniqueness of the expansion, suppose that both expansions
$$\mathbf{u} = \alpha_1\mathbf{e}_1 + \cdots + \alpha_k\mathbf{e}_k, \tag{6}$$
$$\mathbf{u} = \beta_1\mathbf{e}_1 + \cdots + \beta_k\mathbf{e}_k \tag{7}$$
hold for any given vector $\mathbf{u}$ in $S$. Subtracting (7) from (6) gives
$$(\alpha_1 - \beta_1)\mathbf{e}_1 + \cdots + (\alpha_k - \beta_k)\mathbf{e}_k = \mathbf{0}. \tag{8}$$
Now, each of the coefficients $(\alpha_1 - \beta_1), \ldots, (\alpha_k - \beta_k)$ in (8) must be zero, so that $\alpha_1 = \beta_1, \ldots, \alpha_k = \beta_k$ and expansions (6) and (7) are identical, if and only if the set $\{\mathbf{e}_1,\ldots,\mathbf{e}_k\}$ is LI. Hence, the expansion (5) is unique if and only if the set is LI, and this completes the proof.
The key idea revealed in the foregoing proof is that a basis needs to contain enough vectors but not too many: enough so that the set spans the space and can therefore be used to expand any given vector in the space, but not too many, in order that such expansions will be unique.
EXAMPLE 1. Consider the vectors
$$\mathbf{e}_1 = (-2,1), \qquad \mathbf{e}_2 = (2,4). \tag{9}$$
As may be verified, the set (9) is LI and spans $\mathbb{R}^2$ and is therefore a basis for $\mathbb{R}^2$. Using that set to expand the vector $\mathbf{u} = (6,2)$, say, we express
$$\mathbf{u} = \alpha_1\mathbf{e}_1 + \alpha_2\mathbf{e}_2, \tag{10}$$
or $(6,2) = (-2\alpha_1, \alpha_1) + (2\alpha_2, 4\alpha_2)$. Hence,
$$\begin{aligned} -2\alpha_1 + 2\alpha_2 &= 6, \\ \alpha_1 + 4\alpha_2 &= 2. \end{aligned} \tag{11}$$
Solving (11), $\alpha_1 = -2$ and $\alpha_2 = 1$ so the expansion (10) is
$$\mathbf{u} = -2\mathbf{e}_1 + \mathbf{e}_2, \tag{12}$$
as displayed in Fig. 2a.
It is to be emphasized that the basis (9) shown in Fig. 2a is by no means the only basis for $\mathbb{R}^2$; there are slews of them. For example, it is readily verified that another is
$$\mathbf{e}_1' = (4,-1), \qquad \mathbf{e}_2' = (-1,5), \tag{13}$$
and in this case the expansion of $\mathbf{u} = (6,2)$ is found to be
$$\mathbf{u} = \tfrac{32}{19}\mathbf{e}_1' + \tfrac{14}{19}\mathbf{e}_2', \tag{14}$$
as depicted in Fig. 2b.
Figure 2. Two bases for $\mathbb{R}^2$.
COMMENT. The difference between the expansions (12) and (14) is not at odds with the notion of uniqueness since the two expansions are with respect to different bases. In other words, (12) is the unique expansion of $\mathbf{u}$ in terms of the $\mathbf{e}_1, \mathbf{e}_2$ basis, and (14) is the unique expansion of $\mathbf{u}$ in terms of the $\mathbf{e}_1', \mathbf{e}_2'$ basis.
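The two expansions of $\mathbf{u} = (6,2)$ in Example 1 can be checked numerically. The following is a minimal sketch, not part of the text: the function name `expand_2d` is hypothetical, and Cramer's rule is simply one convenient way to solve a 2-by-2 system such as (11).

```python
from fractions import Fraction


def expand_2d(e1, e2, u):
    """Return (a1, a2) such that u = a1*e1 + a2*e2 in the plane.

    Solves the 2-by-2 system by Cramer's rule with exact rational
    arithmetic. Raises ValueError if e1 and e2 are LD (no basis).
    """
    det = e1[0] * e2[1] - e2[0] * e1[1]
    if det == 0:
        raise ValueError("e1 and e2 are LD, so they do not form a basis")
    a1 = Fraction(u[0] * e2[1] - e2[0] * u[1], det)
    a2 = Fraction(e1[0] * u[1] - u[0] * e1[1], det)
    return a1, a2


u = (6, 2)
# Basis (9) of Example 1
print(expand_2d((-2, 1), (2, 4), u))
# Basis (13) of Example 1
print(expand_2d((4, -1), (-1, 5), u))
```

The same vector gets different coefficients in the two bases, which is the point of the COMMENT above: uniqueness holds only with respect to a fixed basis.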
9.9.2. Dimension. If we always worked in 2-space or 3-space, the concept of dimension would hardly need elaboration; for example, 3-space is three-dimensional, a plane within it is two-dimensional, and a line within it is one-dimensional. However, having generalized our vector concept beyond 3-space, we need to clarify the idea of dimension.
DEFINITION 9.9.2 Dimension
If the greatest number of LI vectors that can be found in a vector space $S$ is $k$, where $1 \le k < \infty$, then $S$ is $k$-dimensional, and we write
$$\dim S = k.$$
If $S$ is the zero vector space (i.e., if it contains only the zero vector), we define $\dim S = 0$. If an arbitrarily large number of LI vectors can be found in $S$, we say that $S$ is infinite-dimensional.*
To determine the dimension of a given vector space, it may be more convenient
to use the following theorem than to work directly from Definition 9.9.2.
THEOREM 9.9.2 Test for Dimension
If a vector space $S$ admits a basis consisting of $k$ vectors, then $S$ is $k$-dimensional.
Proof: Let $\{\mathbf{e}_1,\ldots,\mathbf{e}_k\}$ be a basis for $S$. Because these vectors form a basis, they must be LI. Hence, we have at least $k$ LI vectors in $S$, and it remains to show that no more than $k$ LI vectors can be found in $S$. Suppose that vectors $\mathbf{e}_1',\ldots,\mathbf{e}_{k+1}'$ in $S$ are LI. Each of these can be expanded in terms of the given base vectors, as
$$\begin{aligned} \mathbf{e}_1' &= a_{11}\mathbf{e}_1 + \cdots + a_{1k}\mathbf{e}_k, \\ &\;\;\vdots \\ \mathbf{e}_{k+1}' &= a_{k+1,1}\mathbf{e}_1 + \cdots + a_{k+1,k}\mathbf{e}_k, \end{aligned} \tag{15}$$
say. Putting these expressions into the equation
$$\alpha_1\mathbf{e}_1' + \alpha_2\mathbf{e}_2' + \cdots + \alpha_{k+1}\mathbf{e}_{k+1}' = \mathbf{0} \tag{16}$$
and grouping terms gives
$$(\alpha_1 a_{11} + \cdots + \alpha_{k+1}a_{k+1,1})\,\mathbf{e}_1 + \cdots + (\alpha_1 a_{1k} + \cdots + \alpha_{k+1}a_{k+1,k})\,\mathbf{e}_k = \mathbf{0}.$$
But the set $\{\mathbf{e}_1,\ldots,\mathbf{e}_k\}$ is LI since it is a basis, so each coefficient in the preceding equation must be zero:
$$\begin{aligned} a_{11}\alpha_1 + \cdots + a_{k+1,1}\alpha_{k+1} &= 0, \\ &\;\;\vdots \\ a_{1k}\alpha_1 + \cdots + a_{k+1,k}\alpha_{k+1} &= 0. \end{aligned} \tag{17}$$
These are $k$ linear homogeneous equations in the $k+1$ unknowns $\alpha_1$ through $\alpha_{k+1}$, and such a system necessarily admits nontrivial solutions (Theorem 8.3.4). Thus, the $\alpha$'s in (17) are not all necessarily zero so the vectors $\mathbf{e}_1',\ldots,\mathbf{e}_{k+1}'$ could not have been LI after all. Hence, it is not possible to find more than $k$ LI vectors in $S$, and this completes the proof.
*Infinite-dimensional function spaces will be studied in Chapter 17.
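The spaces of chief concern in Chapters 9-12 are the $n$-tuple spaces $\mathbb{R}^n$ and subspaces thereof. For $\mathbb{R}^n$ we can say the following.
THEOREM 9.9.3 Dimension of $\mathbb{R}^n$
The dimension of $\mathbb{R}^n$ is $n$: $\dim \mathbb{R}^n = n$.
Proof: The vectors
$$\begin{aligned} \mathbf{e}_1 &= (1,0,\ldots,0), \\ \mathbf{e}_2 &= (0,1,0,\ldots,0), \\ &\;\;\vdots \\ \mathbf{e}_n &= (0,\ldots,0,1) \end{aligned} \tag{18}$$
constitute a basis for $\mathbb{R}^n$ because any vector $\mathbf{u} = (u_1,\ldots,u_n)$ in $\mathbb{R}^n$ can be expanded uniquely as $\mathbf{u} = u_1\mathbf{e}_1 + \cdots + u_n\mathbf{e}_n$. Since this basis contains $n$ vectors, it follows from Theorem 9.9.2 that $\mathbb{R}^n$ is $n$-dimensional.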
Indeed, we might well have questioned the reasonableness of our definition of dimension if $\mathbb{R}^n$ had turned out to be other than $n$-dimensional! The ON basis (18) is called the standard basis for $\mathbb{R}^n$ (and is the $n$-space generalization of the $\hat{\mathbf{i}}, \hat{\mathbf{j}}, \hat{\mathbf{k}}$ ON basis that might be known to you from other courses).
Finally, what about the dimension of a subspace, for example, the subspace of $\mathbb{R}^n$ that is spanned by two given vectors?
THEOREM 9.9.4 Dimension of Span $\{\mathbf{u}_1,\ldots,\mathbf{u}_k\}$
The dimension of span $\{\mathbf{u}_1,\ldots,\mathbf{u}_k\}$, where the $\mathbf{u}_j$'s are not all zero, denoted as $\dim[\text{span}\{\mathbf{u}_1,\ldots,\mathbf{u}_k\}]$, is equal to the greatest number of LI vectors within the generating set $\{\mathbf{u}_1,\ldots,\mathbf{u}_k\}$.
Proof: Denote the generating set $\{\mathbf{u}_1,\ldots,\mathbf{u}_k\}$ as $U$. Let the greatest number of LI vectors in $U$ be $N$, where $1 \le N \le k$. It may be assumed, without loss of generality, that the members of $U$ have been numbered so that $\mathbf{u}_1,\ldots,\mathbf{u}_N$ are LI. Then each of the remaining members of $U$, namely $\mathbf{u}_{N+1},\ldots,\mathbf{u}_k$, can be expressed as a linear combination of $\mathbf{u}_1,\ldots,\mathbf{u}_N$. Surely, then, each vector in span $U$ can similarly be expressed as a linear combination of $\mathbf{u}_1,\ldots,\mathbf{u}_N$. Now $\{\mathbf{u}_1,\ldots,\mathbf{u}_N\}$ is LI and spans span $U$, so it is a basis for span $U$ (Theorem 9.9.1). According to Theorem 9.9.2, then, the dimension of span $U$ is $N$; that is, it is the same as the greatest number of LI vectors in $U$, as was to be proved.
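EXAMPLE 2. Let
$$\mathbf{u}_1 = (4,0,2,0), \qquad \mathbf{u}_2 = (1,1,0,-1), \qquad \mathbf{u}_3 = (3,-1,2,1).$$
These vectors are, of course, members of $\mathbb{R}^4$. But since $\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3$ are only three vectors, $\dim[\text{span}\{\mathbf{u}_1,\mathbf{u}_2,\mathbf{u}_3\}]$ is at most three. In fact, it is not three since we see that $\mathbf{u}_1 = \mathbf{u}_2 + \mathbf{u}_3$. But $\mathbf{u}_1$ and $\mathbf{u}_2$, say, are LI since neither is a scalar multiple of the other. Thus, there are only two LI vectors within the generating set so $\dim[\text{span}\{\mathbf{u}_1,\mathbf{u}_2,\mathbf{u}_3\}] = 2$.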
In Example 2 we determined the greatest number of LI vectors in the generating set by inspection. But what about a generating set $\{\mathbf{u}_1,\ldots,\mathbf{u}_k\}$ where the $\mathbf{u}_j$'s are members of $\mathbb{R}^8$, and $k = 6$, say? For such a large problem we cannot expect "inspection" to work. Yet, what are we to do, test the $\mathbf{u}_j$'s for linear independence one at a time, two at a time, three at a time, and so on, until we determine the greatest number of LI vectors in $\{\mathbf{u}_1,\ldots,\mathbf{u}_k\}$? That would be quite tedious. No, we will see later, in Chapter 10, that the best way to determine the greatest number of LI vectors in a given set is to determine the "rank" of a certain matrix, and that can be done by the extremely efficient method of elementary row operations. Meanwhile, in the present section, we "get by" by keeping the examples and exercises simple enough so that we can rely on inspection.
Let us return, now, to our discussion of bases and expansions.
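For readers who want to check such counts by machine before reaching Chapter 10, the following is a rough sketch of the row-operation idea mentioned above. It is an illustration only: the function name `max_li_count` is hypothetical, and the test vectors are those of Example 2, taken here as $\mathbf{u}_1 = (4,0,2,0)$, $\mathbf{u}_2 = (1,1,0,-1)$, $\mathbf{u}_3 = (3,-1,2,1)$.

```python
from fractions import Fraction


def max_li_count(vectors):
    """Greatest number of LI vectors in the given list of n-tuples.

    Stacks the vectors as rows and reduces them to echelon form by
    elementary row operations (exact rational arithmetic); the count
    of pivot rows is the answer.
    """
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(ncols):
        # Look for a row at or below position `rank` with a nonzero entry.
        pivot = next(
            (r for r in range(rank, len(rows)) if rows[r][col] != 0), None
        )
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Eliminate the entries below the pivot.
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


u1, u2, u3 = (4, 0, 2, 0), (1, 1, 0, -1), (3, -1, 2, 1)
print(max_li_count([u1, u2, u3]))
```

By Theorem 9.9.4 the returned count is also the dimension of the span of the given vectors.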
9.9.3. Orthogonal bases. If, as in Example 1, there are many bases for a given space, then how do we decide which one to select? We will find that in most applications the most convenient basis to use is dictated by the context, so let us not worry about that now. This point is addressed in Chapter 11 as well as in the chapters on PDEs.
However, we do wish to show, here, that orthogonal bases are to be preferred whenever possible. Observe from Example 1 that to expand $\mathbf{u}$ (that is, to compute the $\alpha_j$ expansion coefficients) we needed to solve the system (11) of two equations in two unknowns. Similarly, if we seek to expand a given vector in $\mathbb{R}^8$, then there will be eight base vectors (because $\mathbb{R}^8$ is eight-dimensional) and eight $\alpha_j$ expansion coefficients, and these will be found by solving a system [analogous to (11)] of eight equations in the eight unknown $\alpha_j$'s. Thus, the expansion process can be quite laborious.
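On the other hand, suppose that $\{\mathbf{e}_1,\ldots,\mathbf{e}_k\}$ is an orthogonal basis for $S$; that is, it is not only a basis but also happens to be an orthogonal set:
$$\mathbf{e}_i\cdot\mathbf{e}_j = 0 \quad\text{if}\quad i \ne j. \tag{19}$$
Suppose that we wish to expand a given vector $\mathbf{u}$ in $S$ in terms of that basis; that is, we wish to determine the coefficients $\alpha_1,\ldots,\alpha_k$ in the expansion
$$\mathbf{u} = \alpha_1\mathbf{e}_1 + \alpha_2\mathbf{e}_2 + \cdots + \alpha_k\mathbf{e}_k. \tag{20}$$
To accomplish this, dot (20) with $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_k$, in turn. Doing so, and using (19), we obtain the linear system
$$\begin{aligned} \mathbf{u}\cdot\mathbf{e}_1 &= (\mathbf{e}_1\cdot\mathbf{e}_1)\alpha_1 + 0\alpha_2 + \cdots + 0\alpha_k, \\ \mathbf{u}\cdot\mathbf{e}_2 &= 0\alpha_1 + (\mathbf{e}_2\cdot\mathbf{e}_2)\alpha_2 + 0\alpha_3 + \cdots + 0\alpha_k, \\ &\;\;\vdots \\ \mathbf{u}\cdot\mathbf{e}_k &= 0\alpha_1 + \cdots + 0\alpha_{k-1} + (\mathbf{e}_k\cdot\mathbf{e}_k)\alpha_k, \end{aligned} \tag{21}$$
where all of the quantities $\mathbf{u}\cdot\mathbf{e}_1,\ldots,\mathbf{u}\cdot\mathbf{e}_k$, $\mathbf{e}_1\cdot\mathbf{e}_1,\ldots,\mathbf{e}_k\cdot\mathbf{e}_k$ are computable since $\mathbf{u},\mathbf{e}_1,\ldots,\mathbf{e}_k$ are known. The crucial point is that even though (21) is still $k$ equations in the $k$ unknown $\alpha_j$'s, the system is uncoupled (i.e., the only unknown in the first equation is $\alpha_1$, the only one in the second is $\alpha_2$, and so on) and readily gives
$$\alpha_1 = \frac{\mathbf{u}\cdot\mathbf{e}_1}{\mathbf{e}_1\cdot\mathbf{e}_1}, \quad \alpha_2 = \frac{\mathbf{u}\cdot\mathbf{e}_2}{\mathbf{e}_2\cdot\mathbf{e}_2}, \quad \ldots, \quad \alpha_k = \frac{\mathbf{u}\cdot\mathbf{e}_k}{\mathbf{e}_k\cdot\mathbf{e}_k}, \tag{22}$$
provided, of course, that none of the denominators vanish. But these quantities cannot vanish because $\mathbf{e}_j\cdot\mathbf{e}_j = \|\mathbf{e}_j\|^2$, which is zero if and only if $\mathbf{e}_j = \mathbf{0}$, and this cannot be because if any $\mathbf{e}_j$ were $\mathbf{0}$, then the set $\{\mathbf{e}_1,\ldots,\mathbf{e}_k\}$ would be LD (Theorem 9.8.3), and hence not a basis.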
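Thus, if the $\{\mathbf{e}_1,\ldots,\mathbf{e}_k\}$ basis is orthogonal, the expansion of any given $\mathbf{u}$ is simply
$$\mathbf{u} = \left(\frac{\mathbf{u}\cdot\mathbf{e}_1}{\mathbf{e}_1\cdot\mathbf{e}_1}\right)\mathbf{e}_1 + \cdots + \left(\frac{\mathbf{u}\cdot\mathbf{e}_k}{\mathbf{e}_k\cdot\mathbf{e}_k}\right)\mathbf{e}_k = \sum_{j=1}^{k}\left(\frac{\mathbf{u}\cdot\mathbf{e}_j}{\mathbf{e}_j\cdot\mathbf{e}_j}\right)\mathbf{e}_j. \tag{23}$$
If, besides being orthogonal, the $\mathbf{e}_j$'s are normalized ($\|\mathbf{e}_j\| = 1$) so that they constitute an ON (orthonormal) basis, then (23) simplifies slightly to
$$\mathbf{u} = (\mathbf{u}\cdot\hat{\mathbf{e}}_1)\,\hat{\mathbf{e}}_1 + \cdots + (\mathbf{u}\cdot\hat{\mathbf{e}}_k)\,\hat{\mathbf{e}}_k = \sum_{j=1}^{k}(\mathbf{u}\cdot\hat{\mathbf{e}}_j)\,\hat{\mathbf{e}}_j, \tag{24}$$
where we recall that carets denote unit vectors.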
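EXAMPLE 3. Expand $\mathbf{u} = (4,3,-3,6)$ in terms of the orthogonal base vectors $\mathbf{e}_1 = (1,0,2,0)$, $\mathbf{e}_2 = (0,1,0,0)$, $\mathbf{e}_3 = (-2,0,1,5)$, $\mathbf{e}_4 = (-2,0,1,-1)$ of $\mathbb{R}^4$. This basis is orthogonal but not ON so we use (23) rather than (24). Computing $\mathbf{u}\cdot\mathbf{e}_1 = -2$, $\mathbf{e}_1\cdot\mathbf{e}_1 = 5$, and so on, (23) gives
$$\mathbf{u} = -\frac{2}{5}\mathbf{e}_1 + 3\mathbf{e}_2 + \frac{19}{30}\mathbf{e}_3 - \frac{17}{6}\mathbf{e}_4. \tag{25}$$
Alternatively, we could have inferred, from $\mathbf{u} = \alpha_1\mathbf{e}_1 + \cdots + \alpha_4\mathbf{e}_4$, the four equations
$$\begin{aligned} \alpha_1 - 2\alpha_3 - 2\alpha_4 &= 4, \\ \alpha_2 &= 3, \\ 2\alpha_1 + \alpha_3 + \alpha_4 &= -3, \\ 5\alpha_3 - \alpha_4 &= 6 \end{aligned} \tag{26}$$
on the four unknown $\alpha_j$'s, and solved these by Gauss elimination, but it is much easier to "cash in" on the orthogonality of the basis and to use (23). If we choose to work with an ON basis, we can scale the $\mathbf{e}_j$'s as $\hat{\mathbf{e}}_1 = \frac{1}{\sqrt{5}}(1,0,2,0)$, $\hat{\mathbf{e}}_2 = (0,1,0,0)$, $\hat{\mathbf{e}}_3 = \frac{1}{\sqrt{30}}(-2,0,1,5)$, $\hat{\mathbf{e}}_4 = \frac{1}{\sqrt{6}}(-2,0,1,-1)$. Then (24) gives
$$\mathbf{u} = -\frac{2}{\sqrt{5}}\hat{\mathbf{e}}_1 + 3\hat{\mathbf{e}}_2 + \frac{19}{\sqrt{30}}\hat{\mathbf{e}}_3 - \frac{17}{\sqrt{6}}\hat{\mathbf{e}}_4, \tag{27}$$
which result is equivalent to (25).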
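The orthogonal-basis formula (23) is short enough to state as code. The sketch below is only an illustration under the assumption of integer components (so that exact fractions can be used); the names `dot` and `expand_orthogonal` are hypothetical. It is applied to the data of Example 3.

```python
from fractions import Fraction


def dot(a, b):
    """Usual dot product of two n-tuples."""
    return sum(x * y for x, y in zip(a, b))


def expand_orthogonal(u, basis):
    """Coefficients alpha_j = (u . e_j) / (e_j . e_j), as in (22)-(23).

    Assumes `basis` is an orthogonal basis with integer components;
    orthogonality is checked, and a zero base vector is rejected.
    """
    for i, ei in enumerate(basis):
        for ej in basis[i + 1:]:
            if dot(ei, ej) != 0:
                raise ValueError("base vectors are not mutually orthogonal")
    coeffs = []
    for e in basis:
        ee = dot(e, e)
        if ee == 0:
            raise ValueError("a base vector is the zero vector")
        coeffs.append(Fraction(dot(u, e), ee))
    return coeffs


u = (4, 3, -3, 6)
basis = [(1, 0, 2, 0), (0, 1, 0, 0), (-2, 0, 1, 5), (-2, 0, 1, -1)]
alphas = expand_orthogonal(u, basis)
print(alphas)

# Rebuild u from the expansion as a consistency check.
rebuilt = tuple(sum(a * e[i] for a, e in zip(alphas, basis)) for i in range(4))
print(rebuilt == u)
```

No linear system is solved here; each coefficient is obtained from two dot products, which is why orthogonal bases are preferred.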
Given a nonorthogonal basis there are three possibilities. First, one can use it
and face up to the tedious expansion process. Second, one can “trade the nonorthogonal basis in” for an orthogonal basis using the Gram-Schmidt orthogonalization
procedure, which procedure is introduced briefly in the exercises and discussed in
detail in the next section. Third, one can retain the nonorthogonal basis but streamline the expansion process by computing and utilizing a set of dual, or reciprocal,
vectors corresponding to the given basis, as described in the exercises.
Closure. This section is about the expansion of vectors, in a given vector space $S$, in terms of a set of base vectors. A set of vectors $\{\mathbf{e}_1,\ldots,\mathbf{e}_k\}$ in $S$ is a basis for $S$ if each vector $\mathbf{u}$ in $S$ can be expanded as a unique linear combination of the $\mathbf{e}_j$'s. We showed (Theorem 9.9.1) that $\{\mathbf{e}_1,\ldots,\mathbf{e}_k\}$ is indeed a basis for $S$ if and only if it spans $S$ (so each $\mathbf{u}$ can be expanded) and is LI (so the expansion is unique). The number of vectors in any basis for $S$ is called the dimension of $S$. For instance, $\mathbb{R}^n$ admits the standard basis (18), comprised of $n$ vectors, so $\mathbb{R}^n$ is $n$-dimensional. And the greatest number of LI vectors in a set $\{\mathbf{u}_1,\ldots,\mathbf{u}_k\}$ is the dimension of their span.
We found that the expansion process (i.e., the determination of the expansion coefficients) can be quite laborious if there are many base vectors, but is extremely simple if the basis is orthogonal, or ON, in which case the expansions are given by (23) or (24), respectively. You should remember those two formulas and be able to derive them as well.
EXERCISES
9.9
(a) for S?
1. Show whether the following is a basis.
(b) for span {e1,...,e,}?
for R?
(1,2)
(1,0),(1,1),
(a)
(b) (3,2), (~1,~5) for IR?
(c) (1,1) for R?
(d) (2,0,
ae 5,
oie
:
A
(5, —1,2),(1
2), (2,0,1),(
oe
3,
—
2,1),
(5,
0,
0,
ey
7. (Zero vector space) Show that a zero vector space (i.e., a
forRY
vector space consisting of the zero vector alone) has no basis.
(4,3, 2, 1) for R*
1),
(2,
1, -3,0), (1,2,4,5) for R*
0,1)forR4
3,0),(5,-2,3,1),(0,—6,
0,0),(1,2,
i (4,2,
8. Let u, =(1,
). ee = (0,1,0), ug = (0,0,1), uy =
(1,1,0), us = io
ug = (1,1,1), and uy =
Evaluate eachof thefollowin.
for R4
(b) dim [span {u,, u2}]
(c) dim [span {1,, Us, us}]
0,0),(1,1,0,0),(1,1,1,0),(1,1,1,1)for R!
(i)(1,0,
1,3)forR!
8),(4s—2,
1) (1,355,
0,1),(2,0,05
(8,05
G)
4,3),(2,5,3,5),(3,7,7,8)forR!
(k)(1,3,-1,2),(1,2,
(1,-1,2,3),(4,1,2,3),(5,41, ,0),(1,2,4,6) (a) dim [span {u,}]
(1)(2,3,5,0),
1,0),(0,0,0,0)forR¢
(0,1,0,0),(0,0,
(1,0,0,0),
(ny
(o) (1, 1,2),
(4,
-2,
~1)
for span
{(2,
—4,
a
ie
—1)}
(p)(1,1, 2),(4,-2, -1) for span{(3,—5,—6),(1,2, 1)}
(q) (1,1,1), (1, -1, 2) for span {(2, 4,1), (1,7, —2)}
(d) dim [span{uj, Ug,ug, U4})
(e)dim [span{uy, U2,uy}]
(f) dim [span {uy, U4, Us }]
g) dim [span {us, ug, u7}]
(h) dim [span{uy, us, Us, U7}]
(r) (1,2,3), (1,0, 4) for span{(3,2,0), (1,1, -1)}
2. Expand each vector $\mathbf{u}$ in terms of the orthogonal basis $\{\mathbf{e}_1,\mathbf{e}_2,\mathbf{e}_3\}$ of $\mathbb{R}^3$, where $\mathbf{e}_1 = (2,1,3)$, $\mathbf{e}_2 = (1,-2,0)$, $\mathbf{e}_3 = (6,3,-5)$.
(a) u = (9, —2,4)
(b) u =(1,0, 0)
(c)u =(0,1,5)
(e)u =(0,5,0)
(a) dim [span{uy, us, Us }]
(b) dim [span{uy,,ug, Ug}]
(d)u =(3,1,1)
(f)u =(1,2,3)
(c) dim [span {ueg,ua, Ug}]
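3. (a)-(f) Expand each of the $\mathbf{u}$ vectors in Exercise 2 in terms of the ON basis $\{\hat{\mathbf{e}}_1,\hat{\mathbf{e}}_2,\hat{\mathbf{e}}_3\}$ of $\mathbb{R}^3$, where $\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3$ are normalized versions of $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$ given in Exercise 2.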
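4. Expand each vector $\mathbf{u}$ in terms of the orthogonal basis $\{\mathbf{e}_1,\ldots,\mathbf{e}_4\}$ of $\mathbb{R}^4$, where $\mathbf{e}_1 = (2,0,-1,-5)$, $\mathbf{e}_2 = (2,0,-1,1)$, $\mathbf{e}_3 = (0,1,0,0)$, $\mathbf{e}_4 = (1,0,2,0)$.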
(a) u = (1,0,0, 0)
(c)u =(2,5,
1,—3)
(d)u =(4,3,—2,
0)
(f)u = (2,-7,4,1)
(g) u = (0,0, 0,9)
(i) u = (0,0, 5,0)
(h) u = (2,3, -2,1)
G)u = (1,1,1,1)
5. Verify OEYthe{e,,...,e4}
a basis for R*. Also,
ie
LL.
vectorsgiven in Example 3 are
(26) by Gauss elimination
and ver-
ify that the 0, ’s thus obtained agree with those given in (25),
If {e;,...,e,}
it a basis
(d)dim [span{us}]
(e) dim [span {ug, uy}]
(f) dim (span {ug, ug, Us, Ug}]
(g) dim [span{uy, ug}]
(h) dim [span {ug, ug, Ug, Us, UG}]
10. (a)—(f) Determine the dimension of the solution space in
Exercise 4 of Section 9.7.
(b)u = (0,6, 0,0)
(e)u =(1,2,0,5)
9. Let uy = (1,0,0,0), ue = (1,1,0,0), ug = (1,1,1,0),
uy = (1,1,1,1), us = (0,0,0,1), ug = (3, 3.3,3), Evaluate
each of the following.
is an orthogonal set in a vector space S, is
11. (Gram-Schmidt orthogonalization process) Given $k$ LI vectors $\mathbf{v}_1,\ldots,\mathbf{v}_k$, it is possible to obtain from them $k$ ON vectors, say $\hat{\mathbf{e}}_1,\ldots,\hat{\mathbf{e}}_k$, in span$\{\mathbf{v}_1,\ldots,\mathbf{v}_k\}$ by the Gram-Schmidt process, after Jørgen P. Gram (1850-1916) and Erhardt Schmidt (1876-1959), by taking $\mathbf{e}_1$ equal to $\mathbf{v}_1$, taking $\mathbf{e}_2$ equal to a suitable linear combination of $\mathbf{v}_1, \mathbf{v}_2$, taking $\mathbf{e}_3$ equal to a suitable linear combination of $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$, and so on, and then normalizing the results. The resulting ON set is as
457
9.9. Bases, Expansions, Dimension
If, instead, we have a basis {e1,...,e@,} which is not ON,
then, as noted in the text, the expansion process is not so simple. However, suppose that we can find a set {ef,...,e7 }
such that
follows:
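$$\hat{\mathbf{e}}_1 = \frac{\mathbf{v}_1}{\|\mathbf{v}_1\|}, \qquad \hat{\mathbf{e}}_2 = \frac{\mathbf{v}_2 - (\mathbf{v}_2\cdot\hat{\mathbf{e}}_1)\,\hat{\mathbf{e}}_1}{\|\mathbf{v}_2 - (\mathbf{v}_2\cdot\hat{\mathbf{e}}_1)\,\hat{\mathbf{e}}_1\|}, \qquad \ldots, \qquad \hat{\mathbf{e}}_j = \frac{\mathbf{v}_j - \sum_{i=1}^{j-1}(\mathbf{v}_j\cdot\hat{\mathbf{e}}_i)\,\hat{\mathbf{e}}_i}{\left\|\mathbf{v}_j - \sum_{i=1}^{j-1}(\mathbf{v}_j\cdot\hat{\mathbf{e}}_i)\,\hat{\mathbf{e}}_i\right\|} \tag{11.1}$$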
We now state the problem: Verify that each $\hat{\mathbf{e}}_j$ defined by (11.1) is a linear combination of $\mathbf{v}_1,\ldots,\mathbf{v}_j$, and that the $\hat{\mathbf{e}}_j$'s are ON. [In verifying that $\|\hat{\mathbf{e}}_j\| = 1$, be sure to show that each denominator in (11.1) is nonzero.]
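As a computational companion to this exercise, here is a small floating-point sketch of the Gram-Schmidt process in the form of (11.1): subtract from each $\mathbf{v}_j$ its components along the previously found unit vectors, then normalize. The function names and the tolerance `tol` are assumptions made for illustration; this is not a proof that the denominators are nonzero, only a check that they are numerically nonzero for the input given.

```python
import math


def dot(a, b):
    """Usual dot product of two n-tuples."""
    return sum(x * y for x, y in zip(a, b))


def gram_schmidt(vectors, tol=1e-12):
    """Return an ON list built from LI input vectors, following (11.1)."""
    on_vectors = []
    for v in vectors:
        w = [float(x) for x in v]
        # Remove the components of v along each earlier unit vector.
        for e in on_vectors:
            c = dot(v, e)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        norm = math.sqrt(dot(w, w))
        if norm < tol:
            raise ValueError("input vectors are not LI")
        on_vectors.append([wi / norm for wi in w])
    return on_vectors


# Illustrative LI set in 3-space.
for e in gram_schmidt([(1, 0, 0), (1, 1, 0), (1, 1, 1)]):
    print(e)
```

A zero (or near-zero) denominator signals that the current vector lies in the span of the earlier ones, which is exactly the case excluded by the LI hypothesis.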
formula (11.1) in
(13.2)
= Dojet aje;
gives
The set {ef,...,e%}
@5)e;.
(13.3)
is called the dual, or reciprocal, set
corresponding to the original set {e,,...,@n}.
(We will see
in the last exercise in Section 10.6 that the dual set exists, is
unique, and is itself a basis for R”, the so-called dual or reciprocal basis.)
(b) Given thebasis e; = (1,0),e2 = (1,1) for R?, useequa-
Exercise 11 to obtain an ON set from the given LI set.
(a)(4,0),(2,1)
tion (13.2) to determine
the dual vectors ej,e5.
Then use
equation (13.3) to expand u = (3,1). Sketch e;, e2, ef, e3, u
to scale, and verify the expansion graphically, that is, by means
of the parallelogram rule of vector addition.
(3, 4)
(c)(1,0,0), (1,1,0), (1,1,1)
(d)(1,1,0), (2, —1,
1), (1,0,3)
(e)(1,1,1), (2,0,—1)
(f)(1,1,1),(1,0,1). (1,1,0)
(g)(1,2,1),(1,-1, 2),(=1,3,1)
(h)(2,0,1), (1,1,1).(—2.0.3)
(i)(2,1,1,0),(1,5,-1,2)
1
(c) Repeatpart(b),fore; = (2,1), e2 = (0,2),u wo 12),
(d) Repeat
part (b), fore, = (—1,1),e2 = (2,1), u = (0,4),
1), (2,3,-1,1,4)
(j) (6,-1, 1,2,
our vector space S to be R”.
(a) If {@),...,@,}
i#j.
rh
u= Situ
-@;)G;
12. In each case use the Gram—Schmidt
t=]
Then show that dotting e* into u
Q; = u-e; so that
i=]
(b) (1, ~2),
1,
= { 0,
through7 = k.
=
é; =
ee
is an ON basis for R”, and u is in R”, then
by dottingé;, intobothsidesof theequationu = 37"_, ajé;,
the given basis is
(e) Given the basis e, = (1,0,0),e2
= mane
=
(1,1,1) for R°, use equation (13.2) to determine the dual
vectors e},e5,e3. Then use equation (13.3) to expand each
of the vectors u = (4,—1,5), v = (0,0,2), w = (5, —2,3).
Be sure to see that the dual vectors get computed once and for
all, for a given basis {e1,...,@,};
once we have got them,
expansions of the form (13.3) are simple.
(f) Repeat part (e) fore; = (2,0,1),e2 = (1,1,0),e3 Il
(1,~-1,3), and u = (6,1,0), v = (1, 2,4), w = (0,3,0).
(g) Show that if the {e],...,e@n } basis does happen to be ON,
then the dual vectors coalesce with the e;’s, i.e. e7 = e; for
Th
u=
So(u
j=l
, ej )ej.
(13.1)
j=1,2,....n.
9.10 Best Approximation
Let $S$ be a normed inner product vector space (i.e., a vector space with both a norm and an inner, or dot, product defined), and let the norm be the "natural norm" $\|\mathbf{u}\| = \sqrt{\mathbf{u}\cdot\mathbf{u}}$. We know that if $\{\mathbf{e}_1,\ldots,\mathbf{e}_k\}$ is a basis for $S$, then any vector $\mathbf{u}$ in $S$ can be (uniquely) expanded in the form $\mathbf{u} = \sum_{j=1}^{k} c_j\mathbf{e}_j$. If the basis is orthogonal, then the expansion process is easy, with the $c_j$'s computed, from the given vector $\mathbf{u}$ and the base vectors $\mathbf{e}_j$, as $c_j = (\mathbf{u}\cdot\mathbf{e}_j)/(\mathbf{e}_j\cdot\mathbf{e}_j)$. And if the basis is not only orthogonal but ON, then $\mathbf{u} = \sum_{j=1}^{k} c_j\hat{\mathbf{e}}_j$, where $c_j = \mathbf{u}\cdot\hat{\mathbf{e}}_j$.
However, what if we do not have a "full deck"? That is, what if $\{\hat{\mathbf{e}}_1,\ldots,\hat{\mathbf{e}}_N\}$ is ON, but falls short of being a basis for $S$ (i.e., $N < \dim S$)? If $\mathbf{u}$ happens to fall within span$\{\hat{\mathbf{e}}_1,\ldots,\hat{\mathbf{e}}_N\}$, which subspace of $S$ we denote as $\mathcal{T}$, then it can still be expanded in terms of $\hat{\mathbf{e}}_1,\ldots,\hat{\mathbf{e}}_N$, but if it is not in $\mathcal{T}$, then it cannot be so expanded. In the latter case the question arises, what is the best approximation of $\mathbf{u}$ in terms of $\hat{\mathbf{e}}_1,\ldots,\hat{\mathbf{e}}_N$? In this section we answer that question in general, and illustrate the results for the case where $S$ is $\mathbb{R}^n$. Later in this book, when we study Fourier series and partial differential equations, our interest will be in function spaces instead.
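9.10.1. Best approximation and orthogonal projection. The best approximation problem which we address is this: given a vector $\mathbf{u}$ in $S$, and an ON set $\{\hat{\mathbf{e}}_1,\ldots,\hat{\mathbf{e}}_N\}$ in $S$, what is the best approximation
$$\mathbf{u} \approx c_1\hat{\mathbf{e}}_1 + \cdots + c_N\hat{\mathbf{e}}_N = \sum_{j=1}^{N} c_j\hat{\mathbf{e}}_j\,? \tag{1}$$
That is, how do we compute the $c_j$ coefficients so as to render the error vector $\mathbf{E} = \mathbf{u} - \sum_{j=1}^{N} c_j\hat{\mathbf{e}}_j$ as small as possible? In other words, how do we choose the $c_j$'s so as to minimize the norm of the error vector $\|\mathbf{E}\|$? If $\|\mathbf{E}\|$ is a minimum, then so is $\|\mathbf{E}\|^2$, so let us minimize $\|\mathbf{E}\|^2$ (to avoid square roots), where
$$\|\mathbf{E}\|^2 = \mathbf{E}\cdot\mathbf{E} = \Bigl(\mathbf{u} - \sum_{j=1}^{N} c_j\hat{\mathbf{e}}_j\Bigr)\cdot\Bigl(\mathbf{u} - \sum_{j=1}^{N} c_j\hat{\mathbf{e}}_j\Bigr) = \mathbf{u}\cdot\mathbf{u} - 2\sum_{j=1}^{N} c_j\,(\mathbf{u}\cdot\hat{\mathbf{e}}_j) + \sum_{j=1}^{N} c_j^2, \tag{2}$$
and where the step
$$\Bigl(\sum_{j=1}^{N} c_j\hat{\mathbf{e}}_j\Bigr)\cdot\Bigl(\sum_{j=1}^{N} c_j\hat{\mathbf{e}}_j\Bigr) = (c_1\hat{\mathbf{e}}_1 + \cdots + c_N\hat{\mathbf{e}}_N)\cdot(c_1\hat{\mathbf{e}}_1 + \cdots + c_N\hat{\mathbf{e}}_N) = c_1^2 + \cdots + c_N^2 = \sum_{j=1}^{N} c_j^2 \tag{3}$$
follows from the orthonormality of the $\hat{\mathbf{e}}_j$'s.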
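Defining $\mathbf{u}\cdot\hat{\mathbf{e}}_j \equiv \alpha_j$ and noting that $\mathbf{u}\cdot\mathbf{u} = \|\mathbf{u}\|^2$ we may express (2) as
$$\|\mathbf{E}\|^2 = \sum_{j=1}^{N} c_j^2 - 2\sum_{j=1}^{N} \alpha_j c_j + \|\mathbf{u}\|^2,$$
or, completing the square, as
$$\|\mathbf{E}\|^2 = \sum_{j=1}^{N} (c_j - \alpha_j)^2 + \|\mathbf{u}\|^2 - \sum_{j=1}^{N} \alpha_j^2. \tag{4}$$
Observe that $\mathbf{u}$ and the ON set $\{\hat{\mathbf{e}}_1,\ldots,\hat{\mathbf{e}}_N\}$ are given so that $\|\mathbf{u}\|$ and the $\alpha_j$'s in (4) are fixed computable quantities: $\|\mathbf{u}\| = \sqrt{\mathbf{u}\cdot\mathbf{u}}$ and $\alpha_j = \mathbf{u}\cdot\hat{\mathbf{e}}_j$ for $j = 1,2,\ldots,N$. Thus, in seeking to minimize the right-hand side of (4), the only control we exercise is in our choice of the $c_j$'s. The right-hand side of (4) is greater than or equal to zero,* and so is the $\sum_{j=1}^{N}(c_j - \alpha_j)^2$ term containing the $c_j$'s. Since that term is a sum of squares, it is smallest (namely, zero) when every $c_j - \alpha_j$ vanishes. Thus, the best that we can do is to set $c_j = \alpha_j$ $(j = 1,2,\ldots,N)$. With that choice, our best approximation (1) becomes
$$\mathbf{u} \approx \sum_{j=1}^{N} (\mathbf{u}\cdot\hat{\mathbf{e}}_j)\,\hat{\mathbf{e}}_j. \tag{5}$$
Let us summarize these results.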
THEOREM 9.10.1 Best Approximation
Let $\mathbf{u}$ be any vector in a normed inner product vector space $S$ with natural norm ($\|\mathbf{u}\| = \sqrt{\mathbf{u}\cdot\mathbf{u}}$), and let $\{\hat{\mathbf{e}}_1,\ldots,\hat{\mathbf{e}}_N\}$ be an ON set in $S$. Then the best approximation (1) is obtained when the $c_j$'s are given by $c_j = \mathbf{u}\cdot\hat{\mathbf{e}}_j$, as indicated in (5).
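The completed-square identity (4), on which the theorem rests, can be spot-checked numerically. The sketch below is an illustration only: the function names are hypothetical, and the ON set and vector are made-up data in 3-space, not taken from the text.

```python
def dot(a, b):
    """Usual dot product of two n-tuples."""
    return sum(x * y for x, y in zip(a, b))


def squared_error_direct(u, on_set, c):
    """||E||^2 computed directly from E = u - sum_j c_j e_j."""
    approx = [sum(cj * e[i] for cj, e in zip(c, on_set)) for i in range(len(u))]
    err = [ui - ai for ui, ai in zip(u, approx)]
    return dot(err, err)


def squared_error_formula(u, on_set, c):
    """||E||^2 from (4): sum (c_j - a_j)^2 + ||u||^2 - sum a_j^2."""
    alpha = [dot(u, e) for e in on_set]
    return (
        sum((cj - aj) ** 2 for cj, aj in zip(c, alpha))
        + dot(u, u)
        - sum(aj * aj for aj in alpha)
    )


on_set = [(1, 0, 0), (0, 1, 0)]   # illustrative ON set, N = 2
u = (2, -1, 3)                    # illustrative vector, not in the span
best = [dot(u, e) for e in on_set]

for c in (best, [3, 0], [0, 0]):
    print(c, squared_error_direct(u, on_set, c), squared_error_formula(u, on_set, c))
```

For every choice of the $c_j$'s the two computations agree, and the error is smallest for the choice $c_j = \mathbf{u}\cdot\hat{\mathbf{e}}_j$.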
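EXAMPLE 1. Let $S$ be $\mathbb{R}^2$, $N = 1$, $\hat{\mathbf{e}}_1 = \frac{1}{13}(12,5)$, and $\mathbf{u} = (1,1)$, as shown in Fig. 1. Find the best approximation $\mathbf{u} \approx c_1\hat{\mathbf{e}}_1$, that is, the best approximation of $\mathbf{u}$ in span$\{\hat{\mathbf{e}}_1\}$ (which is the line $L$). Theorem 9.10.1 gives $c_1 = \mathbf{u}\cdot\hat{\mathbf{e}}_1 = 17/13$, and hence the best approximation
$$\mathbf{u} \approx \frac{17}{13}\hat{\mathbf{e}}_1, \tag{6}$$
which is the vector $\mathbf{OA}$ in Fig. 1.
COMMENT. Observe from the figure that the best approximation $\mathbf{OA}$ is the orthogonal projection of $\mathbf{u}$ onto span$\{\hat{\mathbf{e}}_1\}$, which orthogonality is verified by the calculation $\mathbf{AB}\cdot\hat{\mathbf{e}}_1 = (\mathbf{u} - \mathbf{OA})\cdot\hat{\mathbf{e}}_1 = (\mathbf{u} - \frac{17}{13}\hat{\mathbf{e}}_1)\cdot\hat{\mathbf{e}}_1 = \frac{17}{13} - \frac{17}{13} = 0$. That result makes perfect sense since if $c_1\hat{\mathbf{e}}_1$ is to be the best approximation to $\mathbf{u}$, then the distance from the tip of $\mathbf{u}$ to the tip of $c_1\hat{\mathbf{e}}_1$ (which is some point on $L$) should be as small as possible. That shortest distance is the perpendicular distance from the tip of $\mathbf{u}$ to the line $L$.
Figure 1. Best approximation of $\mathbf{u}$ in span$\{\hat{\mathbf{e}}_1\}$.
*This fact may not be obvious due to the minus sign in front of the last summation. But remember that the right-hand side of (4) is equal to $\|\mathbf{E}\|^2$, and surely $\|\mathbf{E}\|^2 \ge 0$.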
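The numbers in Example 1 can be reproduced exactly with rational arithmetic. This is a minimal sketch under the reading $\hat{\mathbf{e}}_1 = \frac{1}{13}(12,5)$ and $\mathbf{u} = (1,1)$; the function name `best_approximation` is hypothetical.

```python
from fractions import Fraction


def dot(a, b):
    """Usual dot product of two n-tuples."""
    return sum(x * y for x, y in zip(a, b))


def best_approximation(u, on_set):
    """Return (coefficients, approximation, error vector) per (5).

    `on_set` is assumed to be ON with respect to the usual dot product.
    """
    coeffs = [dot(u, e) for e in on_set]
    approx = [sum(c * e[i] for c, e in zip(coeffs, on_set)) for i in range(len(u))]
    error = [ui - ai for ui, ai in zip(u, approx)]
    return coeffs, approx, error


e1 = (Fraction(12, 13), Fraction(5, 13))
u = (1, 1)
coeffs, approx, error = best_approximation(u, [e1])
print(coeffs)          # the single coefficient c_1
print(approx)          # the vector OA
print(dot(error, e1))  # the error AB dotted with e1
```

The last printed quantity is the dot product $\mathbf{AB}\cdot\hat{\mathbf{e}}_1$ discussed in the COMMENT, so its vanishing confirms the orthogonality of the error to the line $L$.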
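EXAMPLE 2. Let $S$ be $\mathbb{R}^3$, let $N = 2$ with $\hat{\mathbf{e}}_1 = (1,0,0)$ and $\hat{\mathbf{e}}_2 = (0,1,0)$, and let $\mathbf{u} = (a,b,c)$, as shown in Fig. 2. Computing the coefficients in (5) as $\mathbf{u}\cdot\hat{\mathbf{e}}_1 = a$ and $\mathbf{u}\cdot\hat{\mathbf{e}}_2 = b$, (5) becomes
$$\mathbf{u} \approx a\hat{\mathbf{e}}_1 + b\hat{\mathbf{e}}_2. \tag{7}$$
The latter is an equality if $c = 0$. That is, (7) is an equality if $\mathbf{u}$ happens to lie in span$\{\hat{\mathbf{e}}_1,\hat{\mathbf{e}}_2\}$, but if $c \ne 0$ then the best approximation $a\hat{\mathbf{e}}_1 + b\hat{\mathbf{e}}_2$ to $\mathbf{u}$ is the orthogonal projection of $\mathbf{u}$ onto span$\{\hat{\mathbf{e}}_1,\hat{\mathbf{e}}_2\}$.
Figure 2. Best approximation of $\mathbf{u}$ in span$\{\hat{\mathbf{e}}_1,\hat{\mathbf{e}}_2\}$.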
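In Examples 1 and 2, $S$ was $\mathbb{R}^2$ and $\mathbb{R}^3$, respectively, so we were able to draw useful pictures. In each case we discovered that the best approximation of $\mathbf{u}$ on the subspace $\mathcal{T}$ of $S$ spanned by $\hat{\mathbf{e}}_1,\ldots,\hat{\mathbf{e}}_N$ was the orthogonal projection of $\mathbf{u}$ onto $\mathcal{T}$. Is that result true in all cases? That is, is the error vector $\mathbf{E}$ necessarily orthogonal to $\mathcal{T}$? Since the error vector is
$$\mathbf{E} = \mathbf{u} - \sum_{j=1}^{N} (\mathbf{u}\cdot\hat{\mathbf{e}}_j)\,\hat{\mathbf{e}}_j, \tag{8}$$
we have
$$\mathbf{E}\cdot\hat{\mathbf{e}}_k = \Bigl(\mathbf{u} - \sum_{j=1}^{N} (\mathbf{u}\cdot\hat{\mathbf{e}}_j)\,\hat{\mathbf{e}}_j\Bigr)\cdot\hat{\mathbf{e}}_k = \mathbf{u}\cdot\hat{\mathbf{e}}_k - (\mathbf{u}\cdot\hat{\mathbf{e}}_k)(1) = 0 \tag{9}$$
for each $k = 1,2,\ldots,N$, where the second equality follows from the fact that $\hat{\mathbf{e}}_j\cdot\hat{\mathbf{e}}_k = 0$ if $j \ne k$ and $1$ if $j = k$.
Since $\mathbf{E}$ is orthogonal to every one of the $\hat{\mathbf{e}}_j$'s, it is therefore orthogonal to every vector in $\mathcal{T}$. In that sense we say that the right-hand side of (5) is the orthogonal projection of $\mathbf{u}$ onto $\mathcal{T}$, and denote it as $\text{proj}_{\mathcal{T}}\,\mathbf{u}$:
$$\text{proj}_{\mathcal{T}}\,\mathbf{u} = \sum_{j=1}^{N} (\mathbf{u}\cdot\hat{\mathbf{e}}_j)\,\hat{\mathbf{e}}_j. \tag{10}$$
The idea that the best approximation of $\mathbf{u}$ in $\mathcal{T}$ is the orthogonal projection of $\mathbf{u}$ onto $\mathcal{T}$ lends a welcome geometrical interpretation to the problem of best approximation. In fact, let us rephrase Theorem 9.10.1 in terms of orthogonal projection.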
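THEOREM 9.10.1′ Best Approximation by Orthogonal Projection
Let $\mathbf{u}$ be any vector in a normed inner product vector space $S$ with natural norm ($\|\mathbf{u}\| = \sqrt{\mathbf{u}\cdot\mathbf{u}}$), and let $\{\hat{\mathbf{e}}_1,\ldots,\hat{\mathbf{e}}_N\}$ be an ON set in $S$. Denote the subspace span$\{\hat{\mathbf{e}}_1,\ldots,\hat{\mathbf{e}}_N\}$ of $S$ as $\mathcal{T}$. Then the best approximation of $\mathbf{u}$ in $\mathcal{T}$ (i.e., of the form $c_1\hat{\mathbf{e}}_1 + \cdots + c_N\hat{\mathbf{e}}_N$) is given by the orthogonal projection of $\mathbf{u}$ onto $\mathcal{T}$, namely, by $\text{proj}_{\mathcal{T}}\,\mathbf{u}$.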
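9.10.2. Kronecker delta. When working with ON sets it is convenient to use the Kronecker delta symbol $\delta_{jk}$, defined as
$$\delta_{jk} = \begin{cases} 1, & j = k \\ 0, & j \ne k \end{cases} \tag{11}$$
and named after Leopold Kronecker (1823-1891), who contributed to algebra and the theory of equations. The subscripted $j$ and $k$ are usually positive integers. Clearly, $\delta_{jk}$ is symmetric in its indices $j$ and $k$:
$$\delta_{jk} = \delta_{kj}. \tag{12}$$
To illustrate the use of the Kronecker delta, suppose that $\{\hat{\mathbf{e}}_1,\ldots,\hat{\mathbf{e}}_N\}$ is an ON basis for some space $S$, and that we wish to expand a given $\mathbf{u}$ in $S$ as
$$\mathbf{u} = \sum_{j=1}^{N} c_j\hat{\mathbf{e}}_j. \tag{13}$$
To determine the $c_j$'s, dot $\hat{\mathbf{e}}_k$ into both sides, where $k$ is any integer such that $1 \le k \le N$, and use the fact that $\hat{\mathbf{e}}_j\cdot\hat{\mathbf{e}}_k = \delta_{jk}$ (because the $\hat{\mathbf{e}}_j$'s are ON):
$$\mathbf{u}\cdot\hat{\mathbf{e}}_k = \Bigl(\sum_{j=1}^{N} c_j\hat{\mathbf{e}}_j\Bigr)\cdot\hat{\mathbf{e}}_k = \sum_{j=1}^{N} c_j\,(\hat{\mathbf{e}}_j\cdot\hat{\mathbf{e}}_k) = \sum_{j=1}^{N} c_j\,\delta_{jk} = c_k. \tag{14}$$
Thus, $c_k = \mathbf{u}\cdot\hat{\mathbf{e}}_k$ for each $k = 1,2,\ldots,N$ so (13) becomes
$$\mathbf{u} = \sum_{j=1}^{N} (\mathbf{u}\cdot\hat{\mathbf{e}}_j)\,\hat{\mathbf{e}}_j. \tag{15}$$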
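The "sifting" step at the end of (14), in which the sum over $j$ of $c_j\delta_{jk}$ collapses to the single term $c_k$, is easy to see in a few lines of code. The sketch below is illustrative only; the function names and the sample ON pair in the plane are assumptions, not taken from the text.

```python
from fractions import Fraction


def kronecker_delta(j, k):
    """Kronecker delta as defined in (11)."""
    return 1 if j == k else 0


def sift(c, k):
    """Sum over j of c_j * delta_jk, with 1-based indices as in the text."""
    return sum(cj * kronecker_delta(j, k) for j, cj in enumerate(c, start=1))


def dot(a, b):
    """Usual dot product of two n-tuples."""
    return sum(x * y for x, y in zip(a, b))


c = [7, -2, 5]
print([sift(c, k) for k in (1, 2, 3)])

# An illustrative ON pair in the plane: its dot products form delta_jk.
e = [(Fraction(3, 5), Fraction(4, 5)), (Fraction(-4, 5), Fraction(3, 5))]
print([[dot(e[j], e[k]) for k in range(2)] for j in range(2)])
```

Only the term with $j = k$ survives the sum, which is all that the last equality in (14) asserts.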
Closure. Principal interest, in this brief section, is in the best approximation of a given vector $\mathbf{u}$ in a normed inner product vector space $S$ in terms of an ON set $\{\hat{\mathbf{e}}_1,\ldots,\hat{\mathbf{e}}_N\}$ which falls short of being a basis for $S$ inasmuch as $N < \dim S$. Of course, if $N = \dim S$ so the set is a basis, then we have the equality (15), but if $N < \dim S$, then the best approximation of $\mathbf{u}$ is given by (5), best in the vector sense; that is, the norm of the error vector [i.e., the norm of the difference between
onto the span of 6,...
+ UnVpn)and the corresponding natural norm
EXERCISES
9.10
1. We concluded from (4) that the best choice for the $c_j$'s is $c_j = \alpha_j = \mathbf{u}\cdot\hat{\mathbf{e}}_j$. Show that this same result is obtained from (2) by setting $\partial\|\mathbf{E}\|^2/\partial c_j = 0$, and verify that the extremum thus obtained is a minimum.
Notice that if $\mathbf{u}$ happens to be in span$\{\hat{\mathbf{e}}_1,\ldots,\hat{\mathbf{e}}_N\}$, or if $\dim S = N$, then (5.1) becomes an equality. In two and three dimensions that equality is actually the Pythagorean theorem, and in more than three dimensions it amounts to an abstract extension of that theorem.
2. LetS beR®,andlet N = 3 with é; = al,
é& =
(2,0, -1,0, 1), é3 = (0,0,0,1,0).
0,2,0,0),
Find the best 6. (A different inner product) In Examples | and 2 we use the
“usual”
approximation to the given u vector within span {6}, 69, és},
and the norm of the error vector.
(a)(3,—2,0,0,5) (b)(0,0,0,2,1)— (c)(3,0,1,4,1)
(d) (1, 1,0, 1,1)
(e) (0, 2,0, 0,0)
(g)(0,7,0,3,0)
UV
(f) (1,0, —3,3, 1)
(h) (1,2,3,4,5)
(i) (5,4, 8,2,1)
u:v
a1 = (11,0, -1), & = Yall ~1,~1,0),
the best approximation
span{6,2},
to u
=
(6.1)
+ Ueve and its corresponding
natural norm
\/2u?
+ uz. Showthattheresultingbestapproxima-
e; so as to be a unit vector according to the new norm.
(Fig. 1) is orthogonal and
perpendicular to span {@,}, show that in this exercise the error vector is indeed orthogonal to span {61}, as promised in
the text, but not perpendicular to it. To explain this “paradox,”
show that for the modified inner product the orthogonality of
two nonzero vectors does not imply their perpendicularity.
(4, —2,1,6)
span{61,@2,@3}, and (b) Whereas the error vector AB
4. Same as Exercise 3, but for the given u vector.
(a)(4,1,0,~1)
— (b)(3,-1,1,2)—(c) (0,0,2,5)
(d)(1,2,4,4)
(e)(0,5,3,-1)— (f)(2,0,-1,-1)
7. Verify
the
with (4), derive the Bessel
Y= (u-é;)”< full’.
last stepin (14),that 3
= CAOjk = Che
8. Verify the following, where (7, 7,4,/)run from 1 to N.
inequality
j=l
HW tnvn,
tion ay (12, 5), which is not the same as the best approxima-
theerrorvector,||E]].
N
= WyUyVYyHe
tion 35(12 ,5) given by (6). HINT: You will need to rescale
span {€1, é2 83, G4}. and in each case compute the norm of
5. (Bessel inequality) Beginning
= 2u,v1
llul] =
@é3
= J5(1,0,1,1), @4=4g(0,1,-1,1).
Find
= uyv, +--+ + Unvn, but
where the w,;’s are fixed positive constants, or “weights.”
(a) Rework Example [ using the modified inner product
3. LetS be R*, andlet
within span{é;},
inner product for IR", u-v
that is not the only acceptable one. In Example 3 of Section
9.6 we see that another acceptable inner product is
(a) »
(5.1)
i=}
(c}
S°
j
(b) S° dig0jk = Sin
Oj = 1
\~
k
ij
Oj h Ont
j
Sans Out
Chapter 9 Review
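We begin with the two- and three-dimensional "arrow vector" concept that is probably already familiar to you from an introductory course in physics, where the vectors denoted forces, velocities, and so on. For such vectors, vector addition ($\mathbf{u} + \mathbf{v}$), scalar multiplication ($\alpha\mathbf{u}$), a zero vector ($\mathbf{0}$), a negative inverse [$-\mathbf{u} = (-1)\mathbf{u}$], a norm ($\|\mathbf{u}\|$), a dot product
$$\mathbf{u}\cdot\mathbf{v} = \|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta, \tag{1}$$
and the angle $\theta = \cos^{-1}\left(\dfrac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|}\right)$ between $\mathbf{u}$ and $\mathbf{v}$ are all defined.
From there, we generalize to abstract $n$-space, where $\mathbf{u} = (u_1,\ldots,u_n)$, by defining vector addition, and so on, in such a way that they agree with the corresponding arrow vector definitions when $n = 2$ and $n = 3$. For instance,
$$\mathbf{u}\cdot\mathbf{v} = \sum_{j=1}^{n} u_j v_j, \tag{2}$$
$$\|\mathbf{u}\| = \sqrt{\mathbf{u}\cdot\mathbf{u}} = \sqrt{\sum_{j=1}^{n} u_j^2}, \tag{3}$$
and
$$\theta = \cos^{-1}\left(\frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|}\right). \tag{4}$$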
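From these definitions, we derived various properties such as
$$\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}, \qquad \text{(commutative)} \tag{5}$$
$$(\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w}), \qquad \text{(associative)} \tag{6}$$
and so on, along with the following properties of the dot product and norm.
Dot Product
Commutative: $\mathbf{u}\cdot\mathbf{v} = \mathbf{v}\cdot\mathbf{u}$, (7a)
Nonnegative: $\mathbf{u}\cdot\mathbf{u} > 0$ for all $\mathbf{u} \ne \mathbf{0}$, and $\mathbf{u}\cdot\mathbf{u} = 0$ for $\mathbf{u} = \mathbf{0}$, (7b)
Linear: $(\alpha\mathbf{u} + \beta\mathbf{v})\cdot\mathbf{w} = \alpha(\mathbf{u}\cdot\mathbf{w}) + \beta(\mathbf{v}\cdot\mathbf{w})$, (7c)
Norm
Scaling: $\|\alpha\mathbf{u}\| = |\alpha|\,\|\mathbf{u}\|$, (8a)
Nonnegative: $\|\mathbf{u}\| > 0$ for all $\mathbf{u} \ne \mathbf{0}$, and $\|\mathbf{u}\| = 0$ for $\mathbf{u} = \mathbf{0}$, (8b)
Triangle Inequality: $\|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|$. (8c)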
To complete the extension to generalized vector space, we reverse the cart and
the horse by elevating these various properties to the level of axioms, or requirements. That is, we let the fundamental objects, the vectors, be whatever we choose
them to be, and then define addition and scalar multiplication operations, a zero
vector, a negative inverse, a dot or “inner” product (if we wish), and a norm (if we
wish), so that those axioms are satisfied. Our chief interest, in introducing generalized vector space, is in function spaces, but we will not work with function spaces
until Chapter 17, when we study Fourier series and the Sturm-Liouville theory.
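Next, we introduce the concepts of span and linear dependence, primarily so that we can develop the idea of the expansion of a given vector in a vector space S in terms of a set of base vectors for S. We define a set of vectors $\{\mathbf{e}_1, \ldots, \mathbf{e}_k\}$ to be a basis for S if each vector $\mathbf{u}$ in S can be expressed ("expanded") uniquely in the form $\mathbf{u} = \alpha_1\mathbf{e}_1 + \cdots + \alpha_k\mathbf{e}_k$, and prove that a set $\{\mathbf{e}_1, \ldots, \mathbf{e}_k\}$ is a basis for S if and only if it spans S and is LI (linearly independent). In particular, orthogonal bases are especially convenient because of the ease with which one can compute the expansion coefficients $\alpha_j$. The result is
$$\mathbf{u} = \left(\frac{\mathbf{u}\cdot\mathbf{e}_1}{\mathbf{e}_1\cdot\mathbf{e}_1}\right)\mathbf{e}_1 + \cdots + \left(\frac{\mathbf{u}\cdot\mathbf{e}_k}{\mathbf{e}_k\cdot\mathbf{e}_k}\right)\mathbf{e}_k \tag{9}$$
if the basis is orthogonal, and
$$\mathbf{u} = (\mathbf{u}\cdot\hat{\mathbf{e}}_1)\,\hat{\mathbf{e}}_1 + \cdots + (\mathbf{u}\cdot\hat{\mathbf{e}}_k)\,\hat{\mathbf{e}}_k \tag{10}$$
if it is ON (orthonormal); (9) and (10) should be understood and remembered.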
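To make the orthogonal-basis expansion concrete, here is a minimal Python sketch that computes each coefficient as the ratio of dot products $(\mathbf{u}\cdot\mathbf{e}_j)/(\mathbf{e}_j\cdot\mathbf{e}_j)$ and then rebuilds $\mathbf{u}$ from them. The function name `expand_orthogonal` and the sample basis of 3-space are assumptions chosen for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def expand_orthogonal(u, basis):
    """Expansion coefficients of u in an orthogonal (not necessarily ON) basis."""
    return [dot(u, e) / dot(e, e) for e in basis]

# Hypothetical orthogonal basis of 3-space (mutually orthogonal, not normalized).
basis = [(1.0, 1.0, 0.0), (1.0, -1.0, 0.0), (0.0, 0.0, 2.0)]
u = (3.0, 1.0, 4.0)

coeffs = expand_orthogonal(u, basis)

# Rebuild u as the sum of coefficient times basis vector.
rebuilt = [0.0, 0.0, 0.0]
for c, e in zip(coeffs, basis):
    rebuilt = [r + c * ej for r, ej in zip(rebuilt, e)]

print(coeffs, rebuilt)
```

Each coefficient needs only two dot products. If the basis were not orthogonal, finding the coefficients would in general require solving a system of linear equations.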
Finally, we study the question of the best approximation of a given vector $\mathbf{u}$ in a vector space S in terms of an ON set $\{\hat{\mathbf{e}}_1, \ldots, \hat{\mathbf{e}}_N\}$ which falls short of being a basis for S. We show that the best approximation (i.e., the one that minimizes the norm of the error vector) is
$$\mathbf{u} \approx \sum_{j=1}^{N} (\mathbf{u}\cdot\hat{\mathbf{e}}_j)\,\hat{\mathbf{e}}_j \tag{11}$$
which, in geometrical language, is the orthogonal projection of $\mathbf{u}$ onto the span of $\hat{\mathbf{e}}_1, \ldots, \hat{\mathbf{e}}_N$.
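The best-approximation formula (11) can be sketched in Python as follows. The ON set (two vectors in 3-space), the vector `u`, and the comparison vector `other` are hypothetical choices for this example. The code compares the error norm of the projection with that of one other vector in the same span, which illustrates the minimizing property without proving it.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def best_approximation(u, on_set):
    """Orthogonal projection of u onto the span of an ON set, as in (11)."""
    approx = [0.0] * len(u)
    for e in on_set:
        c = dot(u, e)  # coefficient u . e_j
        approx = [a + c * ej for a, ej in zip(approx, e)]
    return approx

# Hypothetical ON set of two vectors in 3-space; it is not a basis.
s = 1.0 / math.sqrt(2.0)
on_set = [(s, s, 0.0), (0.0, 0.0, 1.0)]
u = (3.0, 1.0, 4.0)

approx = best_approximation(u, on_set)
error = [a - b for a, b in zip(u, approx)]
error_norm = norm(error)

# Another vector in the same span, for comparison.
other = [2.5, 2.5, 4.0]
other_error_norm = norm([a - b for a, b in zip(u, other)])

print(approx, error_norm, other_error_norm)
```

The error vector of the projection is orthogonal to every vector in the ON set, which is the geometric content of the term "orthogonal projection."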
Chapter 10
Matrices and Linear Equations
10.1. Introduction
We have already met matrices in Section 8.3.3, but they were introduced there
only as a notational convenience for the implementation of Gauss elimination and
Gauss-Jordan reduction. In the present chapter we focus on matrix theory itself,
which will enable us to obtain additional important results regarding the
solution of systems of linear algebraic equations.
One way to view matrix theory is to think in terms of a parallel with function
theory. In our mathematical training, we first study numbers, the points on a real
number axis. Then we study functions, which are mappings, or transformations,
from one real axis to another. For instance, $f(x) = x^2$ maps the point $x = 3$, say,
on an $x$ axis to the point $f = 9$ on an $f$ axis. Just as functions act upon numbers,
we shall see that matrices act upon vectors and are mappings from one vector space
to another. Having studied vectors, in Chapter 9, we can now turn our attention to
matrices.
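The parallel between functions and matrices can be sketched in a few lines of Python. The sketch assumes the usual row-by-column rule for a matrix acting on a vector, which the text has not yet defined at this point. The name `apply_matrix` and the sample matrix are hypothetical.

```python
def f(x):
    """A function maps a number to a number."""
    return x ** 2

def apply_matrix(A, x):
    """A matrix (list of rows) maps a vector to a vector.

    Assumes the usual row-by-column rule: component i of the result
    is the dot product of row i of A with x.
    """
    if any(len(row) != len(x) for row in A):
        raise ValueError("each row of A must have as many entries as x")
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

# f maps the point x = 3 to the point f = 9.
print(f(3))

# A hypothetical 2-by-3 matrix maps a vector in 3-space to a vector in 2-space.
A = [[1.0, 0.0, 2.0],
     [0.0, 3.0, -1.0]]
x = [1.0, 2.0, 3.0]
y = apply_matrix(A, x)
print(y)
```

Here the input vector has three components and the output has two, so this particular matrix maps 3-space into 2-space, just as `f` maps one real axis to another.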
Historically, matrix theory did not become a part of undergraduate engineering science curricula until around 1960, when digital computers became widely
available in academia.
10.2. Matrices and Matrix Algebra
A matrix is a rectangular array of quantities that are called the elements of the matrix. Normally, the elements will be real numbers, although they may occasionally
be other objects such as differential operators or even matrices. Some of these cases
will be met as we go along; for the present, however, let us consider the elements
to be real numbers. The complex case is studied in Chapter 12.
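Specifically, any matrix $\mathbf{A}$ may be expressed as
$$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \tag{1}$$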
where the brackets (or, in some texts, parentheses) are used to emphasize that the entire array is to be regarded as a single entity. A horizontal line of elements is called a row, and a vertical line is called a column. Counting rows from the top and columns from the left, then
$a_{21} \quad a_{22} \quad \cdots \quad a_{2n}$
and
$a_{13}$
$a_{23}$