Mathematics for Engineers
Written by Bradley Gallant
Last modified August, 2024
Introduction
The intention of this document is to act as an all-in-one mathematical resource for the first half of
an engineering bachelor’s degree. The content in this document is a selection of material that is
commonly used in the first two years of most engineering degrees, and is primarily based on my own
experience. Therefore, the reader should be aware that it is possible that necessary mathematics
for introductory engineering courses is not included in this document, and it is also possible that
some of the content included will not be necessary for all. Should the reader wish to utilize
this document, they should confer with their professor in taking inventory of what mathematical
knowledge will be needed to succeed in the respective course.
This document assumes the reader is familiar with grade-school algebra, trigonometry, and precalculus, and is designed to guide the reader through Calculus III, with some additional material
on Linear Algebra and Ordinary Differential Equations. I have written this content with the intent
of being as easy to understand as possible, and in a relatively informal manner as a result.
Contents
Part One: Calculus I
1 Differential Calculus
1.1 Differentiation as a Concept
1.2 Definition of the Derivative
1.3 Properties of Derivatives
1.4 Differentiation Formulas
1.4.1 Power Rule
1.4.2 Product Rule
1.4.3 Quotient Rule
1.4.4 Chain Rule
1.4.5 Trigonometric Functions
1.4.6 Inverse Trigonometric Functions
1.5 Implicit Differentiation
1.6 Linear Approximation, Differentials, and Deriving the Chain Rule
2 Integral Calculus
2.1 Integration as a Concept
2.2 Definite Integration
2.3 Indefinite Integration
2.4 The Fundamental Theorem of Calculus
2.4.1 FTOC Part One
2.4.2 FTOC Part Two
2.5 Basic Integral Properties & Formulas
2.6 U-Substitution
2.7 Applying Integrals to Physics
Part Two: Calculus II
3 Additional Integration Techniques
3.1 Integration by Parts
3.2 Trigonometric Integration
3.3 Integration with Trigonometric Substitutions
3.4 Integration with Partial Fractions
3.5 Improper Integrals
Part Three: Calculus III
4 Review of Vector Basics
4.1 Introduction to Vectors
4.2 Vector Addition
4.3 Vectors in 3D Space and the Unit Vector
4.4 Vector Multiplication
4.4.1 Dot Products
4.4.2 Cross Products
4.4.3 Scalar Triple Products
5 Rudimentary Multivariable and Vector Calculus
5.1 Partial Differentiation
5.2 Multiple Integrals
5.3 Multiple Integrals in Polar/Cylindrical Coordinates
5.4 Applying Calculus to Vectors
5.5 The Gradient Vector
5.6 Line and Surface Integrals
5.7 Curl and Divergence
Part Four: Miscellaneous Mathematics
6 Ordinary Differential Equations
6.1 Introduction to Differential Equations
6.2 Solving First Order Linear and Homogeneous Differential Equations
6.3 Solving by Integration Factors
6.4 Solving by Separation of Variables
7 Linear Algebra
7.1 Defining Linearity
7.2 Introduction to Linear Transformations
7.3 Linear Systems of Equations as Linear Transformations
7.4 Dot Products Revisited
7.5 The Determinant
7.6 Cross Products Revisited
Appendix
A Additional Material
A.1 Prerequisite Material
A.2 Material for Topics Covered in This Document
B Proof of Various Derivative/Integral Properties & Formulas
B.1 Traditional Proofs for Derivatives and Integrals
B.2 Tree of Proofs for Differential Calculus
C Resource Contributions
D Development Testing
Part One: Calculus I
1 Differential Calculus
Chapter 1 Significance
Up to this point in your studies of mathematics, finding the slope of functions has only
been a possibility when a function is linear – where the slope is constant for all x. In this
chapter, we will learn how to find the slope of a variety of functions for any value of x.
This capability is crucial in studying physics, as it will allow for the analysis of dynamic
scenarios.
1.1 Differentiation as a Concept
Section 1.1 Overview
In this section, we will introduce the basic idea of differentiation. By the end of this
section, the reader should be able to:
• Explain what the slope/rate of change of a function is
• Explain what the instantaneous rate of change of a function is
• Explain what the derivative operator does
Recall from your high school mathematics studies that slope is a value that determines the rate
at which a function changes when its independent variable (usually x or t) changes. Put simply,
slope is the rate of change of a function. You are likely most familiar with this concept in the
context of lines, such as the formula
y = mx + b
where m is a constant slope. The fact that y is always changing at the same constant rate is what
makes this equation a line. The important thing about slopes is that they are always the measure
of a rate. If y is dollars, and x is apples, then m is a measure of dollars per apple. Slopes/rates
are very important in physics when you consider motion. If y is position in meters, and x is time
in seconds, then m is meters per second, which is velocity. Likewise if y is velocity in meters per
second, and x is still time, then m is meters per second per second, which is acceleration. This
means if we’re given a position vs. time graph where the curve is linear, we can easily determine
what the velocity is by using the slope formula for lines, which is
\[ \frac{y_2 - y_1}{x_2 - x_1} \]
where $(x_1, y_1)$ is some initial point, and $(x_2, y_2)$ is some final point. It doesn’t matter which points on the linear function we choose, the slope is always the same, and thus so is the velocity of the object. But what if we are given a position function that doesn’t have a constant slope? This would mean the velocity is no longer constant, and the result would be a curved graph such as the one shown in Figure 1.1.
[Figure 1.1: Position (y) vs. time (x) with non-constant velocity]
The formula for slope will no longer tell us what the velocity of the object is for any given x, as the slope is now at all points changing, and is never constant. This is where differentiation comes in. Differentiation is the
use of an operator called the derivative, and it gives us the slope of a function as a function of
the independent variable – that is, it tells us the instantaneous rate of change of a function.
Instead of dealing with functions where m is the same for all x, we now must learn to analyze
functions where the value of m depends on x. This could look something like
y = m(x) x + b
The big question of this chapter is “How do we find what m(x) is?”. Mechanical physics largely
deals with how systems change when other variables change, such as time, position, work, etc.
Being able to answer this big question will prove very useful, as rarely in the natural world do we
see linear relationships between physical quantities.
Section 1.1 Summary
• Differentiation is the use of the derivative operator
• A derivative tells us the instantaneous rate of change of a function
• The instantaneous rate of change of a function is the slope at a specific point
1.2 Definition of the Derivative
Section 1.2 Overview
In this section, we will learn how to denote derivatives, how the derivative operator is
defined, and how to use this definition to compute basic derivatives. By the end of this
section, the reader should be able to:
• Interpret derivative notation
• Explain how a derivative finds the instantaneous rate of change of a function at a
point
• Evaluate derivatives for simple polynomials
Notation
Before defining the derivative operator, let’s begin by familiarizing ourselves with how derivatives
are notationally represented. If we have some function, say y = f(x), we can denote the derivative
of y or f(x) in a couple of different ways. The first is called Leibniz notation, and is shown below.
\[ \frac{d}{dx}\,y = \frac{dy}{dx} = \frac{d}{dx}\,f(x) = \frac{df}{dx} \]
These are four different ways of denoting the derivative of y or f(x) with Leibniz notation. The
commonality between them is that they all contain
\[ \frac{d}{dx} \]
This is just an operator, meaning the ‘d’s present are not variables and you cannot cancel them
out. These ‘d’s are operators themselves, and they’re called differentials. They’re actually very
similar to the ‘∆’ operator; the difference is that they denote a very small change in a variable. This
makes a lot of sense when you consider that derivatives are the representation of slope and connect
this notation to the formula for slope in a linear function.
\[ \text{slope} = \frac{\Delta y}{\Delta x} = \frac{dy}{dx} \]
If differentials represent very small changes in a variable, then this notation implies that a derivative
is using the linear slope formula, but where $(x_1, y_1)$ and $(x_2, y_2)$ are very, very close to each other. As
you’ll see in the next section, this is precisely what a derivative is. We can read the above Leibniz
notation as “the derivative of y with respect to x”. The function or variable in the numerator is
what is being differentiated, and the variable in the denominator is what we’re differentiating with
respect to. In the context of dy/dx, differentiating with respect to x means we want to see how y
changes when x changes.
The next notation is known as prime notation or Lagrange’s notation, and is shown below.
\[ y' = f'(x) \]
In prime notation, apostrophes are used to denote the derivative of a variable. Note that unlike
Leibniz notation, prime notation doesn’t specify the changing independent variable. However,
it can almost always be determined by considering the context. If y is a function of x, then it
is implied that y ′ describes how y changes when x changes, and it can be read as “y prime”.
When prime notation is used with function notation (the right side of the above equation), the
independent variable is readily apparent, and thus it is very easy to determine what the variable
being differentiated with respect to is. f ′ (x) would be read as “f prime of x”.
The last notation to be familiar with is dot notation (also known as Newton’s notation), which is the
least used of the three notations we’ve discussed. When it is used, though, it is often in physical
applications, so we ought to know what it means. Dot notation is unique in that it is only used to
denote time derivatives, meaning time will always be the changing independent variable. Dot notation is denoted as
ẏ
This notation mainly appears in physics since time is, of course, a very important part of physics.
All of the notations shown so far are different ways of representing the same thing. As long as
y = f(x), the following is true:
\[ \frac{d}{dx}\,y = \frac{dy}{dx} = \frac{d}{dx}\,f(x) = \frac{df}{dx} = y' = f'(x) \]
or in dot notation
\[ \frac{dy}{dt} = \dot{y} = \frac{d}{dt}\,f(t) = \dot{f}(t) \]
We can also take the derivative of a derivative, which can be referred to as the second derivative,
the second order derivative, the derivative to the second degree, etc. Likewise, taking the derivative
of a second order derivative would be considered the third order derivative, and so on. For y = f (x)
(or y = f (t) for dot notation), the notation for second and third order derivatives respectively are:
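\[ \frac{d^2 y}{dx^2} = \frac{d^2}{dx^2}\,f(x) = y'' = f''(x) \]
\[ \frac{d^3 y}{dx^3} = \frac{d^3}{dx^3}\,f(x) = y''' = f'''(x) \]
or in dot notation
\[ \frac{d^2 y}{dt^2} = \frac{d^2}{dt^2}\,f(t) = \ddot{y} = \ddot{f}(t) \]
\[ \frac{d^3 y}{dt^3} = \frac{d^3}{dt^3}\,f(t) = \dddot{y} = \dddot{f}(t) \]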
Once you go beyond the third or fourth derivative of a function, you typically don’t see dot notation
used, and prime notation switches to an alternate representation. Suppose we needed to find the 5th order
derivative of y. Instead of adding five apostrophes to y, we would denote it as
\[ y''''' = y^{(5)} \neq y^5 \]
Note that the parentheses denote that 5 is the order of a derivative, not an exponent. If we wanted
to represent the nth order derivative of y, we could do so as shown.
\[ \frac{d^n y}{dx^n} = y^{(n)} = \frac{d^n}{dx^n}\,f(x) = f^{(n)}(x) \]
Fun fact: Leibniz notation and Newton’s notation are named after Gottfried Leibniz and Isaac Newton, respectively. These
two mathematicians both developed calculus during the late 17th century, and arrived at the same conclusions despite never
working together. The timing of their discoveries kicked off a century-long debate regarding who “invented” calculus first.
Today it is generally accepted that both Leibniz and Newton made their discoveries independently, with Newton beginning
his work prior to Leibniz, but with Leibniz formally publishing his work earlier than Newton.
Defining the Derivative
Now that we know how derivatives are symbolically represented, we need to determine how we can find the slope of a function at an instant. Suppose we want to find the slope of an arbitrary function, f(x), at an arbitrary point, P(x, y). We will begin the analysis by choosing a point $P_0$ that has position on the x-axis x + h, as shown in Figure 1.2a. If we draw a line between these two points (called a secant line) and find its slope (Figure 1.2b), it is clear that this slope does not match the slope of f at P. To see why, compare the inclination of the secant line to the local inclination of f at P.
Now suppose we move $P_0$ closer to P and again find its secant line (Figure 1.2c). While the inclination of the secant line is still too large, we can tell that it is a closer match. It follows that the closer $P_0$ is to P, the closer the slope of the secant line will be to the slope of f at P. The natural course of thought is to let $P_0 = P$; however, this will result in an undefined slope, as ∆x = 0 will be in the denominator of the slope formula. Therefore, we can use the limit operator to allow $P_0$ to become infinitely close to P, or in other words, let h (the horizontal distance between the two points) tend to zero.
To do so, we will first write the coordinates of P and $P_0$ in terms of x and h.
\[ P = (x, y) = \bigl(x, f(x)\bigr) \]
\[ P_0 = (x_0, y_0) = \bigl(x + h, f(x + h)\bigr) \]
Now we use these points in the slope formula.
\[ \text{slope} = \frac{\Delta y}{\Delta x} = \frac{f(x + h) - f(x)}{(x + h) - x} = \frac{f(x + h) - f(x)}{h} \]
[Figure 1.2: Using the slope formula to find the slope of secant lines between P and $P_0$. (a) f(x) with points P and $P_0$ shown; (b) Secant line between P and $P_0$; (c) Secant line when $P_0$ is moved closer to P]
As previously stated, we will use the limit operator to let h tend toward zero.
\[ f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \tag{1.1} \]
This is the formal definition of the derivative. This formula uses the slope formula with a
point on f at x, and another infinitely close point. In each of the figures from this section, the
blue curve was $f(x) = x^2$, so let’s use this definition to find the instantaneous rate of change of $x^2$.
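\[
\begin{aligned}
f'(x) &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \\
&= \lim_{h \to 0} \frac{(x + h)^2 - x^2}{h} \\
&= \lim_{h \to 0} \frac{x^2 + 2xh + h^2 - x^2}{h} \\
&= \lim_{h \to 0} \frac{2xh + h^2}{h} \\
&= \lim_{h \to 0} (2x + h) \\
&= 2x + \lim_{h \to 0} h \\
&= 2x + 0 \\
&= 2x
\end{aligned}
\]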
In these figures, P was the point (2, 4), so let’s find the derivative of $x^2$ at x = 2.
\[ f'(x) = 2x \quad\Rightarrow\quad f'(2) = 2(2) = 4 \]
Thus the slope or the instantaneous rate of change of $x^2$ at x = 2 is 4. If f was the position of
an object as a function of time, this would tell us the velocity of that object at two seconds (or
any unit of time). Unfortunately, we won’t be able to use the definition of a derivative for a lot of
functions, so we’ll need to make use of some formulas which we’ll get to in upcoming sections.
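As a quick numerical illustration of Eq. (1.1), the sketch below evaluates the secant slope of f(x) = x^2 at x = 2 for shrinking values of h. The helper name `difference_quotient` and the particular step sizes are illustrative choices, not something defined elsewhere in this document.

```python
def difference_quotient(f, x, h):
    """Slope of the secant line through (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h


def square(x):
    """The example function f(x) = x^2 used in this section."""
    return x ** 2


# Shrink h and watch the secant slope at x = 2 approach f'(2) = 4.
slopes = [difference_quotient(square, 2.0, 10.0 ** -k) for k in range(1, 6)]
for k, slope in enumerate(slopes, start=1):
    print(f"h = 1e-{k}: secant slope = {slope:.6f}")
```

Algebraically the secant slope here is 2x + h, so the printed values should settle toward 4 as h shrinks, up to floating-point rounding.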
Section 1.2 Summary
• Taking the derivative “with respect to x” means the change of a function when x
changes is being evaluated
• Leibniz notation uses the differential operator d/dx, where the ‘d’s represent infinitely small deltas (∆) and are not variables
• Prime notation uses apostrophes to denote derivatives
• Dot notation places dots above functions, and is used only for time derivatives
• The instantaneous rate of change of a function can be found by using the slope formula for two infinitely close points
• The definition of a derivative is given by Eq. (1.1)
1.3 Properties of Derivatives
Section 1.3 Overview
Before we move on to using formulas for differentiation, we need to establish a couple of
basic properties of derivatives. These properties will allow us to differentiate a greater
pool of functions once we obtain general formulas for doing so. By the end of this section,
the reader should be able to:
• State the sum rule
• State the constant coefficient rule
The first two properties are a result of the derivative being a linear operator. Simply put, linear operators are operators that preserve scalar multiplication and addition. This results in the
following equations.
\[ \frac{d}{dx}\bigl[f(x) \pm g(x)\bigr] = \frac{d}{dx}\,f(x) \pm \frac{d}{dx}\,g(x) \tag{1.2} \]
This is more formally known as the sum rule (see Appendix B for the proof), and it states that if you are differentiating
a function with multiple terms, you can differentiate each term separately and then group them
back together at the end. Similarly, we can use the property of linearity to say that,
\[ \frac{d}{dx}\bigl[c \cdot f(x)\bigr] = c \cdot \frac{d}{dx}\,f(x) \tag{1.3} \]
where c is any constant
This is known as the constant factor rule (see Appendix B for the proof), and it tells us that if you have some constant
coefficient on a differentiable function, you can factor the coefficient out in front of the derivative
and differentiate the function on its own. To clarify what we mean by constant, c must not
change when the variable we’re differentiating with respect to changes. If you’re differentiating
with respect to x, c must be the same value for all x. If you’re differentiating with respect to t, c
must be the same value for all t, and so on. With this in mind, it is important we dispel any
misunderstandings about the properties of derivatives:
\[ \frac{d}{dx}\bigl[f(x) \cdot g(x)\bigr] \neq f'(x) \cdot g'(x) \]
\[ \frac{d}{dx}\left[\frac{f(x)}{g(x)}\right] \neq \frac{f'(x)}{g'(x)} \]
This is to say that if you have two functions being multiplied/divided by one another, you
cannot evaluate the derivative of each and then multiply/divide them by each other. There are rules
for these two cases, and we will explore them in Product Rule and Quotient Rule.
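The properties above can be spot-checked numerically. The following is a minimal sketch, assuming the sample functions f(t) = t^2 and g(t) = t^3, the point x = 1.5, and the constant c = 5 (all arbitrary choices made for illustration). The helper `central_difference` is a hypothetical numerical stand-in for the limit in Eq. (1.1).

```python
def central_difference(func, x, step=1e-6):
    """Numerical estimate of func'(x) from two nearby points."""
    return (func(x + step) - func(x - step)) / (2 * step)


def f(t):
    return t ** 2


def g(t):
    return t ** 3


x = 1.5
c = 5.0

# Sum rule, Eq. (1.2): differentiate the sum vs. sum the derivatives.
sum_lhs = central_difference(lambda t: f(t) + g(t), x)
sum_rhs = central_difference(f, x) + central_difference(g, x)

# Constant factor rule, Eq. (1.3): the constant can be pulled out front.
const_lhs = central_difference(lambda t: c * f(t), x)
const_rhs = c * central_difference(f, x)

# The tempting shortcut for products does NOT work.
prod_true = central_difference(lambda t: f(t) * g(t), x)
prod_naive = central_difference(f, x) * central_difference(g, x)

print(f"sum rule:        {sum_lhs:.6f} vs {sum_rhs:.6f}")
print(f"constant factor: {const_lhs:.6f} vs {const_rhs:.6f}")
print(f"product:         {prod_true:.6f} vs naive {prod_naive:.6f}")
```

The first two pairs should agree to within rounding, while the last pair should clearly differ, which is why separate product and quotient rules are needed.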
Section 1.3 Summary
• The sum rule states that a function composed of several terms by addition/subtraction can be differentiated by “distributing” the derivative operator
• The constant factor rule states that a function being multiplied by a constant
coefficient can be differentiated by multiplying the derivative of the function and the
constant coefficient
• Functions that are composed of several functions by multiplication/division
cannot be differentiated by differentiating each function independently and multiplying/dividing the results
1.4 Differentiation Formulas
Section 1.4 Overview
While we can evaluate derivatives using Eq. (1.1), this process is often complicated and
tedious. In order to efficiently differentiate functions, we can use a variety of general
formulas that tell us how to differentiate various categories of functions. This section
will cover the general formulas for differentiating the following functions:
• Polynomials
• Trigonometric functions and their inverses
• Exponential and logarithmic functions
• Product and quotient functions
• Composite functions
1.4.1 Power Rule
Now that we have learned what a derivative is, how it is defined, and some properties of derivatives,
it’s time to talk about how to easily evaluate them. While we can use the definition to evaluate
some derivatives, it is a long and tedious process. Luckily, we have formulas to make things easier.
The most common, and thus the most important, is the power rule, which is shown below.
\[ \frac{d}{dx}\,x^n = n x^{n-1} \tag{1.4} \]
If we want to take the derivative of some variable raised to a constant power with respect to
that variable, we can “bring down” the exponent as a coefficient and then reduce the existing
exponent by one. The proof of the power rule can be viewed in Appendix B. Let’s get a better
idea of how to use the power rule with a quick example.
Example 1.1
Differentiate $x^2$ with respect to x using the power rule
For this example, n = 2, so we can plug that into Eq. (1.4) to get:
\[ \frac{d}{dx}\,x^2 = 2x^{2-1} = 2x \]
In Definition of the Derivative, we discussed what it meant to differentiate with respect to a specific
variable. Let’s explore how this terminology changes the way we evaluate derivatives.
Example 1.2
Differentiate $3t^4$ with respect to t
First, we notice that this is a function of t, not x, and the problem specifies that we differentiate with respect to t. This tells us to evaluate the rate of change of the given function
when t changes. We can evaluate the derivative as if it were a function of x and we were
differentiating with respect to x, so nothing really changes.
Keep in mind that we have a constant coefficient in our function here, so we will have to
put Eq. (1.3) to use:
\[ \frac{d}{dt}\,3t^4 = 3\,\frac{d}{dt}\,t^4 = 3(4t^3) = 12t^3 \]
Now let’s take a look at a problem that may appear to be a little tricky at first.
Example 1.3
Differentiate c with respect to x
The problem asks us to differentiate with respect to x, but our function does not depend
on x (at least as far as we can tell). Consider what the problem really asks us. How does c
change when x changes? Not at all! It does not matter whether x is 1,000,000 or 0.0000001,
c always has the same value since it is not a function of x. As a result, the graph of c vs. x
will be a horizontal line at y = c, and the slope should be zero for all x. Let’s use the power
rule to see if we are correct.
Recall that any nonzero number to the power of 0 is just 1:
\[
\begin{aligned}
\frac{d}{dx}\,c &= \frac{d}{dx}\,c\,(1) \\
&= c\,\frac{d}{dx}\,(x^0) \\
&= c\,(0)(x^{-1}) \\
&= 0
\end{aligned}
\]
While this does mostly confirm our belief, you’ll notice this method fails for x = 0. A more
effective approach would be to recognize that y = c is a line, and thus has a constant slope.
This means differentiating isn’t necessary, as ∆y/∆x will yield the answer we’re after. The
numerator would simply be c − c, and thus the slope would be zero regardless of any choice
of x. Therefore, the derivative of any constant is zero, and we can add a new formula to our
toolbox:
\[ \frac{d}{dx}\,c = 0 \tag{1.5} \]
where c is any constant number
We can use Eq. (1.5) when we are asked to differentiate any function that isn’t dependent on the
variable we’re differentiating with respect to.
Example 1.4
Differentiate $2t^6$ with respect to x
Again, think about what the problem really asks us. How does $2t^6$ change when x changes?
As far as we know, t is not a function of x, so when x changes, $2t^6$ does not change at all.
\[ \frac{d}{dx}\,2t^6 = (2t^6)\,\frac{d}{dx}\,1 = (2t^6)(0) = 0 \]
What we are really asking for here is the partial derivative of $2t^6$ with respect to x. We will briefly
discuss partial derivatives in Part Three, but they will not appear often in introductory
mechanics courses.
Sometimes you will also have to differentiate general functions, i.e. functions where coefficients
are arbitrary constants such as a, b, c, etc. The process for differentiating these functions does not
change; we just treat the arbitrary constants as if they were any given numerical constant. Let’s
work a quick example to clear up any misconceptions.
Example 1.5
Differentiate the function $ax^2 + bx + c$ with respect to x
As far as we know, a, b, and c are all constant coefficients, so we can treat them as any
numerical constant with Eq. (1.3).
\[
\begin{aligned}
\frac{d}{dx}\bigl[ax^2 + bx + c\bigr] &= a\,\frac{d}{dx}\,x^2 + b\,\frac{d}{dx}\,x + c\,\frac{d}{dx}\,1 \\
&= 2ax + b + 0 \\
&= 2ax + b
\end{aligned}
\]
Let’s do one more example, this time putting most of the properties and formulas we have discussed
to use.
Example 1.6
Find h′(z) for
\[ h(z) = 3z^4 - z^2 + 5 - \frac{3}{z^2} \]
The only thing that looks new here is $3/z^2$. But recall that
\[ \frac{1}{a^b} = a^{-b} \]
which makes it clear that the power rule is applicable.
\[
\begin{aligned}
h'(z) &= \frac{d}{dz}\left[3z^4 - z^2 + 5 - \frac{3}{z^2}\right] \\
&= \frac{d}{dz}\,3z^4 - \frac{d}{dz}\,z^2 + \frac{d}{dz}\,5 - \frac{d}{dz}\,3z^{-2} \\
&= (3)(4)z^3 - (2)z + 0 - (3)(-2)z^{-3} \\
&= 12z^3 - 2z + 6z^{-3} \\
&= 12z^3 - 2z + \frac{6}{z^3}
\end{aligned}
\]
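A result like the one in Example 1.6 can be sanity-checked by comparing it against a numerical slope estimate. The sketch below does this at a few arbitrarily chosen sample points. `central_difference` is a hypothetical helper standing in for the limit in Eq. (1.1), not a formula from this document.

```python
def h(z):
    """The function from Example 1.6."""
    return 3 * z**4 - z**2 + 5 - 3 / z**2


def h_prime(z):
    """The derivative obtained in Example 1.6 with the power rule."""
    return 12 * z**3 - 2 * z + 6 / z**3


def central_difference(func, x, step=1e-6):
    """Numerical estimate of func'(x) from two nearby points."""
    return (func(x + step) - func(x - step)) / (2 * step)


# z = 0 is excluded on purpose: neither h nor h' is defined there.
for z in (0.5, 1.0, 2.0):
    print(f"z = {z}: formula = {h_prime(z):.6f}, "
          f"numeric = {central_difference(h, z):.6f}")
```

If the two columns disagree by more than rounding error, that is usually a sign of an algebra slip, such as a dropped negative sign on the $z^{-2}$ term.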
Before moving on, a quick point needs to be made about what it means for a function to be
differentiable. A function (say, f(x)) is considered differentiable if the derivative of f exists for
all of f’s domain, and f(x) is differentiable at x = a if f′(a) exists. If f has some discontinuity
at x = a, then f would not be differentiable at that point. If we consider both the function from
Example 1.6 and its derivative, we can see that h(z) is not defined for z = 0, since $3/0^2$ is not
defined. Likewise, the derivative, h′(z), is also not defined for z = 0 since $6/0^3$ is not defined. Thus
h(z) is not differentiable for z = 0. Furthermore, consider the function g(x) = |x|. This function
is continuous everywhere, but it has a “sharp” turn at x = 0, where the slope jumps from −1 to 1.
Although |x| is defined for x = 0, g(x) is not differentiable for x = 0. The importance of the
differentiability of a function is ensuring that you are not differentiating a function at a
discontinuity or a sharp corner, mainly regarding piecewise functions. For this physics course,
this will not be very common, but it’s something to keep in mind, especially for your calculus classes.
To briefly explain how you would take the derivative of |x|, you would do so as a piecewise function,
which is demonstrated below.
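\[
|x| = \begin{cases} -x, & \text{if } x < 0 \\ x, & \text{if } x \geq 0 \end{cases}
\quad\Rightarrow\quad
\frac{d}{dx}\,|x| = \begin{cases} \frac{d}{dx}\bigl[-x\bigr], & \text{if } x < 0 \\ \frac{d}{dx}\bigl[x\bigr], & \text{if } x > 0 \end{cases}
= \begin{cases} -1, & \text{if } x < 0 \\ 1, & \text{if } x > 0 \end{cases}
\]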
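To see why no single slope exists for |x| at x = 0, the short sketch below computes the secant slope approaching zero from each side. The helper name `one_sided_slope` and the step size are illustrative assumptions.

```python
def one_sided_slope(f, x, h):
    """Secant slope from x to x + h; h > 0 looks right, h < 0 looks left."""
    return (f(x + h) - f(x)) / h


right_slope = one_sided_slope(abs, 0.0, 1e-6)   # approach from x > 0
left_slope = one_sided_slope(abs, 0.0, -1e-6)   # approach from x < 0

print(f"slope from the left:  {left_slope}")
print(f"slope from the right: {right_slope}")
```

Because the two one-sided slopes do not agree, the limit in Eq. (1.1) does not exist at x = 0, which matches the piecewise result leaving x = 0 out.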
Practice Problems: 1-14, 17-19
Differentiating Sine, Cosine, & Logarithmic Functions
In upcoming subsections, we will derive several formulas for differentiation. However, these formulas will require the use of some pre-existing formulas, given below. For the proofs of these formulas,
see Appendix B.
\[ \frac{d}{dx}\,\cos x = -\sin x \tag{1.6} \]
\[ \frac{d}{dx}\,\sin x = \cos x \tag{1.7} \]
\[ \frac{d}{dx}\,\log_a(x) = \frac{1}{x \ln(a)}, \qquad 0 < a \neq 1 \tag{1.8} \]
Examples demonstrating the use of these equations are present in the following subsections.
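Until the proofs in Appendix B are consulted, Eqs. (1.6) through (1.8) can at least be checked numerically. The sketch below is one way to do so, assuming an arbitrary sample point x = 2 and an arbitrary base a = 10. `central_difference` is a hypothetical helper that approximates the limit in Eq. (1.1).

```python
import math


def central_difference(func, x, step=1e-6):
    """Numerical estimate of func'(x) from two nearby points."""
    return (func(x + step) - func(x - step)) / (2 * step)


x = 2.0
a = 10.0  # any base with 0 < a and a != 1 would do

# Each entry pairs a numerical slope with the value the formula predicts.
checks = {
    "Eq. (1.6)": (central_difference(math.cos, x), -math.sin(x)),
    "Eq. (1.7)": (central_difference(math.sin, x), math.cos(x)),
    "Eq. (1.8)": (central_difference(lambda t: math.log(t, a), x),
                  1 / (x * math.log(a))),
}

for name, (numeric, formula) in checks.items():
    print(f"{name}: numeric = {numeric:.6f}, formula = {formula:.6f}")
```

Note that the trigonometric formulas assume x is measured in radians, which is also what Python's `math` module uses.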
1.4.2 Product Rule
In Properties of Derivatives, I stated that
\[ \frac{d}{dx}\bigl[f(x) \cdot g(x)\bigr] \neq f'(x) \cdot g'(x) \]
and said that there was a rule for this case. That rule is the product rule, and it states that
\[ \frac{d}{dx}\bigl[f(x) \cdot g(x)\bigr] = f'g + fg' \tag{1.9} \]
This means if you have a function that is the product of two functions, the derivative is the
derivative of the first times the second, plus the first times the derivative of the second. You’ll see
this is fairly simple with an example, so let’s do one.
Example 1.7
Find h′(x) for
\[ h(x) = 2x^2 \sin x \]
First, notice that we will need to take the derivative of a trig function in this example, so
we’ll have to use Eq. (1.7). h(x) is clearly the product of two functions, so let’s set each equal
to f or g and plug it into Eq. (1.9):
\[ f = 2x^2 \quad\Rightarrow\quad f' = 4x \]
\[ g = \sin x \quad\Rightarrow\quad g' = \cos x \]
\[
\begin{aligned}
h'(x) &= f'g + fg' \\
&= (4x)(\sin x) + (2x^2)(\cos x) \\
&= 4x \sin x + 2x^2 \cos x
\end{aligned}
\]
That’s really all there is to it. Define each function, find their derivatives, and then plug them
into Eq. (1.9). With enough practice, you’ll be able to solve these in your head.
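The answer to Example 1.7 can be compared against a numerical slope estimate as a check. This is a minimal sketch with arbitrarily chosen sample points. `central_difference` is a hypothetical helper, not part of the product rule itself.

```python
import math


def central_difference(func, x, step=1e-6):
    """Numerical estimate of func'(x) from two nearby points."""
    return (func(x + step) - func(x - step)) / (2 * step)


def h(x):
    """h(x) = 2x^2 sin x from Example 1.7."""
    return 2 * x**2 * math.sin(x)


def h_prime(x):
    """f'g + f g' with f = 2x^2 and g = sin x, per Eq. (1.9)."""
    return 4 * x * math.sin(x) + 2 * x**2 * math.cos(x)


sample_points = (0.5, 1.0, 3.0)
errors = [abs(h_prime(x) - central_difference(h, x)) for x in sample_points]
for x, err in zip(sample_points, errors):
    print(f"x = {x}: |formula - numeric| = {err:.2e}")
```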
1.4.3 Quotient Rule
Returning to Properties of Derivatives, I also said that
\[ \frac{d}{dx}\left[\frac{f(x)}{g(x)}\right] \neq \frac{f'(x)}{g'(x)} \]
and that there was also a formula for this case. That formula is the quotient rule:
\[ \frac{d}{dx}\left[\frac{f(x)}{g(x)}\right] = \frac{f'g - fg'}{g^2} \tag{1.10} \]
As you can see, the quotient rule is very similar to the product rule. All you need to do is define
each function, find their derivatives, and plug them into this formula.
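Example 1.8
Find the derivative of
\[ \frac{\ln x}{\sin x} \]
We’ll need to use both Eq. (1.7) and Eq. (1.8). We’ll do the same thing we did in Example
1.7 by defining f and g, then find their derivatives and plug everything into Eq. (1.10):
\[ f = \ln x \quad\Rightarrow\quad f' = \frac{1}{x \ln e} = \frac{1}{x} \]
\[ g = \sin x \quad\Rightarrow\quad g' = \cos x \]
\[
\begin{aligned}
\frac{d}{dx}\left[\frac{\ln x}{\sin x}\right] &= \frac{f'g - fg'}{g^2} \\
&= \frac{\frac{1}{x}(\sin x) - (\ln x)(\cos x)}{\sin^2 x} \\
&= \frac{1}{x \sin x} - \frac{\ln x \cos x}{\sin^2 x} \\
&= \frac{1}{x \sin x} - \frac{\ln x \cot x}{\sin x} \\
&= \frac{1}{\sin x}\left(\frac{1}{x} - \ln x \cot x\right) \\
&= \csc x \left(x^{-1} - \ln x \cot x\right)
\end{aligned}
\]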
I did a lot of simplifying here. The answer we got right after we plugged our functions into the
quotient rule is still correct, but some professors will want your answer in its “simplest” form.
Also recall our discussion from Power Rule about the differentiability of functions. The function
we differentiated is not differentiable for many x ∈ R (values of x in the set of all real numbers),
since ln x has a domain restriction of x > 0. csc x and cot x are also undefined
at x = πn, n ∈ Z (values of n in the set of all integers), and lastly, $x^{-1}$ is
undefined at x = 0.
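Since Example 1.8 involves several simplification steps, it can be reassuring to confirm that the unsimplified and simplified answers agree with each other and with a numerical slope. The sketch below does this at a few arbitrary points where the function is defined. `central_difference` is a hypothetical helper for approximating the limit in Eq. (1.1).

```python
import math


def central_difference(func, x, step=1e-6):
    """Numerical estimate of func'(x) from two nearby points."""
    return (func(x + step) - func(x - step)) / (2 * step)


def q(x):
    """The function ln(x) / sin(x) from Example 1.8."""
    return math.log(x) / math.sin(x)


def unsimplified(x):
    """Result straight out of the quotient rule, Eq. (1.10)."""
    return ((1 / x) * math.sin(x) - math.log(x) * math.cos(x)) / math.sin(x) ** 2


def simplified(x):
    """csc(x) * (1/x - ln(x) cot(x)), writing csc and cot via sin and tan."""
    return (1 / math.sin(x)) * (1 / x - math.log(x) / math.tan(x))


# Points chosen with x > 0 and away from multiples of pi.
sample_points = (0.5, 2.0, 4.0)
for x in sample_points:
    print(f"x = {x}: unsimplified = {unsimplified(x):.6f}, "
          f"simplified = {simplified(x):.6f}, "
          f"numeric = {central_difference(q, x):.6f}")
```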
Practice Problems: 1-6
1.4.4 Chain Rule
The chain rule is a formula for taking the derivative of composite functions. Recall that composite
functions are functions that follow the general form,
\[ h(x) = f\bigl(g(x)\bigr) = (f \circ g)(x) \]
with the latter being an alternate notation for composite functions. The chain rule – in both
notations – is then:
\[ \frac{d}{dx}\Bigl[f\bigl(g(x)\bigr)\Bigr] = f'\bigl(g(x)\bigr) \cdot g'(x) \tag{1.11} \]
or
\[ \frac{d}{dx}\,(f \circ g)(x) = (f' \circ g)(x) \cdot g'(x) \]
In plain English, if we have a function that is an “outside” function (f (x)) of an “inside” function
(g(x)), the derivative is the derivative of the outside, of the inside, times the derivative of the
inside. This will become clearer with an example.
Example 1.9
Find h′(x) for
\[ h(x) = \sin(x^2) \]
It’s pretty easy to see that h(x) is a composite function, so let’s define the outside and inside
functions.
\[ h(x) = f\bigl(g(x)\bigr) \]
\[ f(x) = \sin x \quad\Rightarrow\quad f'(x) = \cos x \]
\[ g(x) = x^2 \quad\Rightarrow\quad g'(x) = 2x \]
Using our formula, we get:
\[
\begin{aligned}
h'(x) &= f'\bigl(g(x)\bigr) \cdot g'(x) \\
&= f'(x^2) \cdot 2x \\
&= \cos(x^2) \cdot 2x \\
&= 2x \cos(x^2)
\end{aligned}
\]
Let’s modify this problem a little bit and see how it’s done.
Example 1.10
Find h′(x) for
\[ h(x) = \sin^2 x \]
For this problem, we’ve flipped f(x) and g(x).
\[ h(x) = f\bigl(g(x)\bigr) \]
\[ f(x) = x^2 \quad\Rightarrow\quad f'(x) = 2x \]
\[ g(x) = \sin x \quad\Rightarrow\quad g'(x) = \cos x \]
Plugging into Eq. (1.11):
\[
\begin{aligned}
h'(x) &= f'\bigl(g(x)\bigr) \cdot g'(x) \\
&= f'(\sin x) \cdot \cos x \\
&= 2(\sin x) \cdot \cos x \\
&= 2 \sin x \cos x \\
&= \sin 2x
\end{aligned}
\]
Notice that I simplified with a trig identity on the last line in this example. The unsimplified
answer is still correct, I just wanted to expose you to this common trig identity.
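Examples 1.9 and 1.10 differ only in which function is on the outside, and it is easy to mix the two up. The sketch below checks both answers numerically at an arbitrary point, x = 0.7. `central_difference` is a hypothetical helper approximating the limit in Eq. (1.1).

```python
import math


def central_difference(func, x, step=1e-6):
    """Numerical estimate of func'(x) from two nearby points."""
    return (func(x + step) - func(x - step)) / (2 * step)


# (label, function, derivative claimed by the chain rule)
examples = [
    ("sin(x^2)", lambda x: math.sin(x**2), lambda x: 2 * x * math.cos(x**2)),
    ("sin^2(x)", lambda x: math.sin(x) ** 2, lambda x: math.sin(2 * x)),
]

x = 0.7
results = [(label, central_difference(func, x), claimed(x))
           for label, func, claimed in examples]

for label, numeric, formula in results:
    print(f"{label}: numeric = {numeric:.6f}, chain rule = {formula:.6f}")
```

Swapping the two claimed derivatives makes the columns disagree, which is a quick way to catch an inside/outside mix-up.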
We can use the chain rule to derive the formula for the derivatives of exponential functions, so let’s
do that and add it to our list of equations.
Example 1.11
Differentiate $a^x$
The solution here isn’t very intuitive, but keep in mind that we are using the chain rule
for this example, so we need to try and turn this into a composite function. This can
be accomplished with the use of a method called logarithmic differentiation. Logarithmic
differentiation is an incredibly powerful tool, as it allows us to take advantage of the properties
of logarithms in order to use various differentiation formulas on functions where those
formulas can’t be applied in the function’s natural state. We’ll start off by letting
\[ y = a^x \]
To use logarithmic differentiation, we’re going to take the natural log of both sides of this
equation. It should be noted that given the formulas we’ve gone over so far, we don’t need
to choose a base of e for our logarithm. The choice of e is due to the fact that the formula
for the derivative of logarithmic functions [Eq. (1.8)] has a natural log in its denominator.
This means if we choose base e, then that logarithm will come out to one, which will make
this derivation a little faster.
\[ \ln y = \ln(a^x) \]
Recall from your algebra studies that a property of logarithms is that
\[ \log_a(b^n) = n\log_a(b) \]
Meaning we can take the exponent of what’s inside the logarithm and apply it as a coefficient
instead. Applying this property, we obtain
\[ \ln y = x\ln(a) \]
The magic happens when we differentiate both sides.
\[ \frac{d}{dx}\ln y = \frac{d}{dx}\big[x\ln(a)\big] \]
Let’s focus on the left side for now. Remember that we let y equal $a^x$, meaning y is a function
of x, and thus ln y is a composite function. Let’s label each function as f and g like previous
examples and use the chain rule to compute the left side of this equation. Notice we’ll need
to use Eq. (1.8).
\[ \ln y = f\big(g(x)\big) \]
\[ f(x) = \ln x \quad\Rightarrow\quad f'(x) = \frac{1}{x\ln e} = \frac{1}{x} \]
\[ g(x) = y = a^x \quad\Rightarrow\quad g'(x) = y' = \frac{d}{dx}a^x \]
For now we’re going to keep things in terms of y so it’s not as messy. Plugging our values
of f and g into the Chain Rule formula, we get
\[ \frac{d}{dx}\ln y = \frac{1}{y} \cdot y' = \frac{y'}{y} \]
Going back to the right side of our equation, ln a can be factored out of the derivative since
it is a constant (because a is a constant).
\[ \frac{d}{dx}\big[x\ln a\big] = \ln a\,\frac{d}{dx}x = \ln a \]
Thus the equation becomes
\[ \frac{y'}{y} = \ln a \]
Since $y = a^x$, $y'$ is what we’re after here, so we can solve for $y'$ to get our answer.
\[ y' = y\ln a \]
Now we just put this back into terms of x and we get
\[ \frac{d}{dx}a^x = a^x\ln a, \qquad a > 0 \tag{1.12} \]
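As a quick numerical check of Eq. (1.12), the sketch below compares $a^x \ln a$ with a finite-difference slope for a few bases. It is only an illustration under assumed values: the helper names, the bases tested, and the sample point x = 1.5 are not from the text.

```python
import math


def central_difference(func, x, step=1e-6):
    """Estimate func'(x) with a symmetric difference quotient (illustrative helper)."""
    return (func(x + step) - func(x - step)) / (2 * step)


def exponential_derivative(a, x):
    """Eq. (1.12): d/dx a^x = a^x * ln(a), for a > 0."""
    return a ** x * math.log(a)


x = 1.5  # arbitrary sample point
for a in (0.5, 2.0, math.e, 10.0):
    numeric = central_difference(lambda t: a ** t, x)
    formula = exponential_derivative(a, x)
    print(f"a = {a:.4f}: numeric = {numeric:.6f}, formula = {formula:.6f}")
```

Note that for a = e the factor ln a equals one, so the formula reduces to the familiar statement that $e^x$ is its own derivative.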
This approach can be used to prove several of the formulas we have learned so far by very simple means. You can view these proofs in the Tree of Proofs for Differential Calculus, along with
the additional proofs needed in order to use logarithmic differentiation. Now that you have completed this subsection, you possess all the knowledge necessary to understand each proof, so I’d
recommend checking it out.
Practice Problems: 1-8, 11, 13
1.4.5 Trigonometric Functions
In this section, we are going to find the derivatives of the remaining trig functions. Recall
Eqs. (1.6) & (1.7):
\[ \frac{d}{dx}\cos x = -\sin x \qquad\text{and}\qquad \frac{d}{dx}\sin x = \cos x \]
We can use these two derivatives along with the three rules we just learned about (product,
quotient, and chain rule) to figure out what the derivatives of the four remaining trig functions
are (remember the reciprocal trig functions). Let’s do the first two together, and then it would be
really good practice for you to try the other two on your own before looking at the solution. These
proofs aren’t as difficult as the one we did in Example 1.10, especially once you have seen how we
tackle the first two.
Example 1.12
Find the derivative of tan x
This might seem like an odd problem at first since tan x doesn’t seem to be the product
or quotient of two functions, and it doesn’t appear to be a composite function either. But
recall that,
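\[ \tan x = \frac{\sin x}{\cos x} \]
Now that we can see tan x is in fact the quotient of two functions, let’s apply the quotient
rule.
\[ f = \sin x \quad\Rightarrow\quad f' = \cos x \]
\[ g = \cos x \quad\Rightarrow\quad g' = -\sin x \]
\[
\begin{aligned}
\frac{d}{dx}\tan x &= \frac{f'g - fg'}{g^2} \\
                   &= \frac{(\cos x)(\cos x) - (\sin x)(-\sin x)}{(\cos x)^2} \\
                   &= \frac{\cos^2 x + \sin^2 x}{\cos^2 x}
\end{aligned}
\]
Recall that,
\[ \cos^2 x + \sin^2 x = 1 \qquad\text{and}\qquad \sec x = \frac{1}{\cos x} \]
\[ \frac{d}{dx}\tan x = \frac{1}{\cos^2 x} = \sec^2 x \]
\[ \frac{d}{dx}\tan x = \sec^2 x \tag{1.13} \]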
Example 1.13
Find the derivative of sec x
Just like before, we have to turn sec x into a function that we can use one of our three new
rules on. Hopefully you already see that since sec x is the reciprocal trig function of cos x,
sec x is a composite function. It is also the quotient of two functions (1 and cos x), so you
could use the quotient rule if you wanted. Let’s do both so you can decide which one you
like more. Starting with the chain rule:
\[ \sec x = \frac{1}{\cos x} = (\cos x)^{-1} \]
\[ f(x) = x^{-1} \quad\Rightarrow\quad f'(x) = -x^{-2} \]
\[ g(x) = \cos x \quad\Rightarrow\quad g'(x) = -\sin x \]
\[
\begin{aligned}
\frac{d}{dx}\sec x &= (f' \circ g)(x) \cdot g'(x) \\
                   &= -(\cos x)^{-2}(-\sin x) \\
                   &= \frac{1}{\cos x} \cdot \frac{\sin x}{\cos x} \\
                   &= (\sec x)(\tan x)
\end{aligned}
\]
\[ \frac{d}{dx}\sec x = \sec x\tan x \tag{1.14} \]
Now for the quotient rule:
\[ \sec x = \frac{1}{\cos x} \]
\[ f = 1 \quad\Rightarrow\quad f' = 0 \]
\[ g = \cos x \quad\Rightarrow\quad g' = -\sin x \]
\[
\begin{aligned}
\frac{d}{dx}\sec x &= \frac{f'g - fg'}{g^2} \\
                   &= \frac{0 + \sin x}{\cos^2 x} \\
                   &= \frac{1}{\cos x} \cdot \frac{\sin x}{\cos x} \\
                   &= \sec x\tan x
\end{aligned}
\]
Alright, you’ve seen these two done, so now you should try to find the derivatives of csc x and cot x
on your own. The solutions are below, so you can check your answer.
Example 1.14
Find the derivative of csc x
\[
\begin{aligned}
\frac{d}{dx}\csc x &= \frac{d}{dx}(\sin x)^{-1} \\
                   &= -(\sin x)^{-2}(\cos x) \\
                   &= -\frac{1}{\sin x} \cdot \frac{\cos x}{\sin x} \\
                   &= -(\csc x)(\cot x)
\end{aligned}
\]
\[ \frac{d}{dx}\csc x = -\csc x\cot x \tag{1.15} \]
Example 1.15
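Find the derivative of cot x
\[
\begin{aligned}
\frac{d}{dx}\cot x &= \frac{d}{dx}\left[\frac{\cos x}{\sin x}\right] \\
                   &= \frac{(-\sin x)(\sin x) - (\cos x)(\cos x)}{\sin^2 x} \\
                   &= -\frac{\sin^2 x + \cos^2 x}{\sin^2 x} \\
                   &= -\frac{1}{\sin^2 x} \\
                   &= -\csc^2 x
\end{aligned}
\]
\[ \frac{d}{dx}\cot x = -\csc^2 x \tag{1.16} \]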
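If you attempted csc x and cot x on your own, one way to check any of the four results in Eqs. (1.13)-(1.16) without looking at the algebra is to compare each formula with a finite-difference slope. The sketch below does that at a single sample point; the helper name, the dictionary layout, and the point x = 0.7 are assumptions made for illustration.

```python
import math


def central_difference(func, x, step=1e-6):
    """Estimate func'(x) with a symmetric difference quotient (illustrative helper)."""
    return (func(x + step) - func(x - step)) / (2 * step)


# Each entry pairs a function with the derivative formula claimed for it.
trig_derivatives = {
    "tan": (math.tan, lambda x: 1 / math.cos(x) ** 2),                       # sec^2 x
    "sec": (lambda x: 1 / math.cos(x), lambda x: math.tan(x) / math.cos(x)),  # sec x tan x
    "csc": (lambda x: 1 / math.sin(x),
            lambda x: -math.cos(x) / math.sin(x) ** 2),                       # -csc x cot x
    "cot": (lambda x: 1 / math.tan(x), lambda x: -1 / math.sin(x) ** 2),      # -csc^2 x
}

x = 0.7  # arbitrary point where all four functions are defined
errors = {}
for name, (func, formula) in trig_derivatives.items():
    numeric = central_difference(func, x)
    errors[name] = abs(numeric - formula(x))
    print(f"{name}: numeric = {numeric:.6f}, formula = {formula(x):.6f}")
```

The same check fails loudly if a sign is dropped, which is the most common slip in these four derivations.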
Practice Problems: 4-9
1.4.6 Inverse Trigonometric Functions
To finish off our applications of differentiation to trigonometric functions, we need to cover the
derivatives of the inverse trig functions. Those are:
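\[ \frac{d}{dx}\cos^{-1} x = -\frac{1}{\sqrt{1 - x^2}} \tag{1.17} \]
\[ \frac{d}{dx}\sec^{-1} x = \frac{1}{|x|\sqrt{x^2 - 1}} \tag{1.18} \]
\[ \frac{d}{dx}\sin^{-1} x = \frac{1}{\sqrt{1 - x^2}} \tag{1.19} \]
\[ \frac{d}{dx}\csc^{-1} x = -\frac{1}{|x|\sqrt{x^2 - 1}} \tag{1.20} \]
\[ \frac{d}{dx}\tan^{-1} x = \frac{1}{1 + x^2} \tag{1.21} \]
\[ \frac{d}{dx}\cot^{-1} x = -\frac{1}{1 + x^2} \tag{1.22} \]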
The use of these equations is no different than the ones used in preceding examples, so we will not
demonstrate their use here. As with all the equations in this document, you can find their proofs
in INSERT PROOF.
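Although the text does not work an example with these formulas, they can be spot-checked numerically. The sketch below is an illustration only. It assumes the conventions $\sec^{-1} x = \cos^{-1}(1/x)$, $\csc^{-1} x = \sin^{-1}(1/x)$, and $\cot^{-1} x = \pi/2 - \tan^{-1} x$, which are common but not stated in this document, and the sample points are arbitrary.

```python
import math


def central_difference(func, x, step=1e-6):
    """Estimate func'(x) with a symmetric difference quotient (illustrative helper)."""
    return (func(x + step) - func(x - step)) / (2 * step)


# (function, claimed derivative) pairs for Eqs. (1.17)-(1.22).
inverse_trig = {
    "arccos": (math.acos, lambda x: -1 / math.sqrt(1 - x * x)),
    "arcsec": (lambda x: math.acos(1 / x),
               lambda x: 1 / (abs(x) * math.sqrt(x * x - 1))),
    "arcsin": (math.asin, lambda x: 1 / math.sqrt(1 - x * x)),
    "arccsc": (lambda x: math.asin(1 / x),
               lambda x: -1 / (abs(x) * math.sqrt(x * x - 1))),
    "arctan": (math.atan, lambda x: 1 / (1 + x * x)),
    "arccot": (lambda x: math.pi / 2 - math.atan(x), lambda x: -1 / (1 + x * x)),
}

# arcsin/arccos need |x| < 1; arcsec/arccsc need |x| > 1 (a negative x exercises |x|).
sample_points = {
    "arccos": 0.4, "arcsin": 0.4, "arctan": 0.4,
    "arccot": 0.4, "arcsec": -2.0, "arccsc": -2.0,
}

errors = {}
for name, (func, formula) in inverse_trig.items():
    x = sample_points[name]
    errors[name] = abs(central_difference(func, x) - formula(x))
    print(f"{name} at x = {x}: error = {errors[name]:.2e}")
```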
Practice Problems: 1-5
Section 1.4 Summary
• The power rule [Eq. (1.4)] can be used to differentiate polynomials
• Eqs. (1.6), (1.7), and Eqs. (1.13) - (1.16) can be used to differentiate trigonometric
functions
• Eqs. (1.17) - (1.22) can be used to differentiate inverse trigonometric functions
• The product and quotient rule [Eqs. (1.9) & (1.10), respectively] can be used to
differentiate functions that are the products/quotients of several functions
• The chain rule [Eq. (1.11)] can be used to differentiate composite functions, which
are functions composed of nested functions
1.5 Implicit Differentiation
Section 1.5 Overview
In this section, we will introduce a way of utilizing the Chain Rule to differentiate implicit
functions.
Recall that equations can be categorized by being explicit or implicit. $y = x^2$ can be written
explicitly and implicitly like so:
Explicit: $y = x^2$
Implicit: $y - x^2 = 0$
We can see in explicit form, we don’t need to rearrange any equations in order to find y. But in
implicit form, we do. Put simply, explicit equations are those that explicitly define y, whereas
implicit equations are those that imply the definition of y, hence their names.
A common way of demonstrating the utility of implicit differentiation is by finding $y'$ in the equation
of a circle. Suppose we are given the implicit equation $x^2 + y^2 = r^2$ and are asked to find $y'$. With
our current understanding of differentiation, our first instinct is to solve for y, then differentiate
the expression. With implicit differentiation, we leave the equation as-is and differentiate by using
the chain rule.
We will start by differentiating both sides of the equation:
\[ \frac{d}{dx}\big[x^2 + y^2\big] = \frac{d}{dx}\big[r^2\big] \]
\[ \frac{d}{dx}\big[x^2\big] + \frac{d}{dx}\big[y^2\big] = 0 \]
Note that r is a constant representing the radius of a circle, so the derivative on the right side of
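the equation is zero. For the left side, we can use the chain rule to find the derivative of $y^2$,
remembering that y is a function of x.
\[ 2x + 2yy' = 0 \]
Now we solve for $y'$.
\[ y' = -\frac{x}{y} \]
To get our answer in terms of x, we solve for y in the original implicit equation (taking the
upper half of the circle, where y > 0).
\[ x^2 + y^2 = r^2 \quad\Rightarrow\quad y = \sqrt{r^2 - x^2} \quad\Rightarrow\quad y' = -\frac{x}{\sqrt{r^2 - x^2}} \]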
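Finding $y'$ through implicit differentiation is a simpler process in this case, as well as many other
cases. To demonstrate this, I have left the traditional explicit differentiation solution next to the
implicit differentiation solution for easy comparison.
Explicit differentiation:
\[ x^2 + y^2 = r^2 \quad\Rightarrow\quad y = \sqrt{r^2 - x^2} \]
\[
\begin{aligned}
y' &= \frac{d}{dx}\big(r^2 - x^2\big)^{1/2} \\
   &= \frac{1}{2}\big(r^2 - x^2\big)^{-1/2}\,\frac{d}{dx}\big(r^2 - x^2\big) \\
   &= \frac{1}{2}\big(r^2 - x^2\big)^{-1/2}(-2x) \\
   &= -\frac{x}{\sqrt{r^2 - x^2}}
\end{aligned}
\]
Implicit differentiation:
\[ x^2 + y^2 = r^2 \quad\Rightarrow\quad y = \sqrt{r^2 - x^2} \]
\[ \frac{d}{dx}\big[x^2\big] + \frac{d}{dx}\big[y^2\big] = \frac{d}{dx}\big[r^2\big] \]
\[ 2x + 2yy' = 0 \]
\[ y' = -\frac{x}{y} = -\frac{x}{\sqrt{r^2 - x^2}} \]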
While implicit differentiation took an additional line, the computations themselves are much simpler.
This technique won’t show up very often, but there are occasions where it can save you time, so
it’s good to keep in the back of your head.
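The circle example can also be checked with numbers. The sketch below assumes a radius of 5 and the point (3, 4) on the upper half of the circle, both picked only for illustration, and confirms that the implicit result −x/y, the explicit result, and a finite-difference slope all agree.

```python
import math


def upper_half_circle(x, r):
    """Explicit form of the upper half of x^2 + y^2 = r^2."""
    return math.sqrt(r * r - x * x)


def slope_implicit(x, y):
    """Result of implicit differentiation: y' = -x / y."""
    return -x / y


def slope_explicit(x, r):
    """Result of explicit differentiation: y' = -x / sqrt(r^2 - x^2)."""
    return -x / math.sqrt(r * r - x * x)


def central_difference(func, x, step=1e-6):
    """Estimate func'(x) with a symmetric difference quotient (illustrative helper)."""
    return (func(x + step) - func(x - step)) / (2 * step)


r, x = 5.0, 3.0                 # assumed sample values
y = upper_half_circle(x, r)     # y = 4 on the upper half of the circle

print("implicit:", slope_implicit(x, y))
print("explicit:", slope_explicit(x, r))
print("numeric: ", central_difference(lambda t: upper_half_circle(t, r), x))
```

On the lower half of the circle y is negative, and the implicit form −x/y still gives the right slope there, whereas the explicit formula above would need its sign flipped.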
1.6 Linear Approximation, Differentials, and Deriving the Chain Rule
Section 1.6 Overview
We close this chapter with a quick discussion on differentials and some confusing notation.
An understanding of differentials will be important for the upcoming chapter.
Recall the formula for the slope of a line, which states that
\[ m = \frac{\Delta y}{\Delta x} \]
This can be rearranged to yield
\[ \Delta y = m\Delta x \]
We can interpret this equation as telling us that if we have some change in x, ∆x, we get a
corresponding change in y, ∆y = m∆x. However, this formula is strictly for functions with
constant slope, so we will need to alter this equation for variable-slope functions.
We’ll start with some arbitrary function, f(x), and we’ll draw a line that is tangent to f at some
arbitrary $x_0$. While how we get this tangent line isn’t very important for our current focus, we
can do so with the following equation:
\[ y - f(x_0) = f'(x_0)(x - x_0) \]
where $(x_0, y_0)$, with $y_0 = f(x_0)$, is the point where the line and f intersect. This is simply
the point-slope form of a line, but where the slope is found from the derivative of f. If you’re
unsure of why this line is tangent, the simple explanation is that the function f presumably has a
different value of $f'(x)$ at any small distance from $x_0$. That means for any step away from $x_0$,
f will change by a different amount than if $f'(x)$ were constant, like it is with the line. Thus, with
any small movement from $x_0$, f(x) will have a different value than the line. That is, the line only
intersects with f at that one point in the “neighborhood” of $x_0$, which is what a tangent line is.
What we mean by “neighborhood” is that the line is only tangent to f in the area around $x_0$ and
the line may intersect f at some point further away from $x_0$, like it does in the top right corner
of Figure 1.3a.
[Figure 1.3: (a) Arbitrary function f with its tangent line at $x_0$. (b) At values close to the point
of intersection, the tangent line can act as an approximation for f.]
As we can see in Figure 1.3b, the tangent line is a decent approximation of f when we input values
of x that are close to x0 . As we move further from x0 , the gap between f and the tangent line
grows, making the approximation less accurate. This concept is called linear approximation,
and the idea is that when we zoom in on a function it begins to look linear (use your PDF viewer
to zoom in on the figures to see for yourself).
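To put numbers on this idea, the sketch below builds the tangent line to a sample function and shows how the gap between the function and the line shrinks as we stay closer to $x_0$. The function $f(x) = x^2$ and the point $x_0 = 3$ are assumptions for illustration; they are not the function plotted in Figure 1.3.

```python
def f(x):
    # Assumed sample function (not the one plotted in Figure 1.3).
    return x ** 2


def f_prime(x):
    # Its derivative, from the power rule.
    return 2 * x


def tangent_line(x, x0):
    """Point-slope form solved for y: y = f(x0) + f'(x0) * (x - x0)."""
    return f(x0) + f_prime(x0) * (x - x0)


x0 = 3.0
for delta_x in (1.0, 0.5, 0.1, 0.01):
    x = x0 + delta_x
    exact_change = f(x) - f(x0)               # true change in f
    estimated_change = f_prime(x0) * delta_x  # change predicted by the tangent line
    gap = f(x) - tangent_line(x, x0)
    print(f"dx = {delta_x}: exact = {exact_change:.4f}, "
          f"estimate = {estimated_change:.4f}, gap = {gap:.6f}")
```

For this particular function the gap works out to exactly $(\Delta x)^2$, so shrinking the step by a factor of 10 shrinks the error by a factor of 100.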
Okay, now that we understand tangent lines and linear approximation, let’s use this concept to
find what the change in f is, ∆f, when we input some change in x, ∆x. We can start by using
the tangent line and the formula for the slope of a line:
\[ \Delta y = m\Delta x \quad\Rightarrow\quad \Delta f \approx f'(x_0)\Delta x \]
As we know from linear approximations, the smaller ∆x is, the more accurate our approximation
for ∆f will be. If we make ∆x infinitely small, our approximation will become infinitely accurate,
or rather, we will get an exact answer. We denote the infinitely small ∆x as dx, and this quantity
is called a differential. Note that since the size of ∆f is dependent on ∆x, ∆f also becomes
infinitely small, and we denote it as df.
\[ df = f'(x)\,dx \tag{1.23} \]
This equation says that when we change x by a differential quantity dx, we get a differential change
in f equal to the derivative of f multiplied by dx. The differential operator is the same idea as
the delta (∆) operator, but it implies that the interval is infinitely small. In the next chapter,
we will learn how to sum an infinite number of df ’s over an interval of x to get the total change in
f , ∆f .
For now, there are some important implications of this equation. First of all, notice what happens
when we divide both sides of Eq. (1.23) by dx.
\[ \frac{df}{dx} = f'(x) \]
This result implies that we can treat the notation dy/dx as a fraction, but you’ll probably be
told by your calculus professors that doing so is treason. The problem is that differentials do not
hold the properties of algebraic quantities due to the fact that they’re technically a limit, and so
treating them like a fraction is not mathematically rigorous. However, in lower level mathematics,
treating dy/dx like a fraction almost always works, and it is an incredibly common practice in
physics, engineering, and even in some higher level math courses. We will see in future chapters
that treating dy/dx like a fraction is often quite convenient.
We will close this section off by using Eq. (1.23) to derive the chain rule. In case you have forgotten,
the chain rule is as follows:
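\[ \frac{d}{dx}f\big(g(x)\big) = f'\big(g(x)\big) \cdot g'(x) \qquad\text{or}\qquad \frac{d}{dx}f\big(g(x)\big) = \frac{df}{dg}\frac{dg}{dx} \]
With the second way of denoting the chain rule, we can use the idea of dy/dx being a fraction like
so:
\[ \frac{df}{dg}\frac{dg}{dx} = \frac{df}{dx}\frac{dg}{dg} = \frac{df}{dx} \]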
That is, we can move the differentials around such that the dg’s cancel out. Again, this isn’t
formal, but it works. We can take the derivation a step further however. Consider the composite
function f (g), where g is a function of x. If we apply Eq. (1.23), we get
\[ df = f'(g)\,dg \]
Then applying the equation again to g, we get
\[ dg = g'(x)\,dx \]
Substituting this expression for dg into the equation for df,
\[ df = f'(g) \cdot g'(x)\,dx \]
Now we divide both sides by dx,
\[ \frac{df}{dx} = f'(g) \cdot g'(x) \]
which gives us the chain rule.
2 Integral Calculus
Chapter 2 Significance
In Differential Calculus, we studied the derivative operator, which analyzes the rate of
change of functions. Now, we will begin studying the inverse operator of the derivative:
the integral. Instead of analyzing the rate at which a function changes, the integral
analyzes the total change of a function over a given interval, or as a function of an
independent variable.
2.1 Integration as a Concept
Section 2.1 Overview
This section will introduce the concept of the integral, and give a brief look into why it
is useful to engineers. Before beginning this section, please watch the video linked below,
as this section will use it as a basis for introducing integration.
The Odd Number Rule: watch from 8:03 to 13:37 (the “Moving” chapter)
Luckily, The Odd Number Rule already uses physics as an example to explain this topic, so
immediately you can understand what this section is about and why it is important. An integral
can be thought of as the area under a curve, and it can also be thought of as the inverse operator to
differentiation – or in other words – an anti-derivative. We will explore both topics in later sections,
but before that, let’s further consider the example given in the video.
Michael wanted to find out how far away we were from him at various times given he knew
our velocity, or at least how our velocity was changing. He did so by finding the area under our
velocity curve to get our change in position since we started moving. The height of each rectangle
was a measure of velocity, while the width of each rectangle was an interval of time, and since
he assumed that velocity was constant over this time interval, the distance traveled during that
period can be calculated by multiplying the two quantities (distance = speed × time). Because
these two quantities correspond to the height and width of a given rectangle, that product also
happens to be the area (area = height × width).
A similar result would be obtained if we did this for an acceleration vs. time graph. In this case,
the unit for the area of each rectangle would be m/s² × s = m/s. Thus, if an integral tells us the
area under a curve, then:
• The integral of acceleration with respect to time is velocity
• The integral of velocity with respect to time is position
Note that the example given in The Odd Number Rule used a linear velocity curve. Because
of this, the shape bounded by the curve and a time interval is just a triangle, meaning that
approximating its area does not make much sense when we know how to calculate the precise area
($A = \tfrac{1}{2}bh$). The broader purpose of that example was to demonstrate the principle while keeping
matters simple. In reality, we will need to calculate the area under smooth curves, for which we
will almost never have nice geometric formulas.
2.2 Definite Integration
Section 2.2 Overview
As mentioned in Integration as a Concept, there are two kinds of integrals: definite and
indefinite. This section will define definite integration. By the end of this section, the
reader should be able to:
• Approximate the area under a curve over a given interval
• Explain what definite integration is and how it finds the exact area under a curve
over some interval
For our purposes, definite integrals can be thought of as the use of an operator that tells us the
area bounded by a curve, an axis, and two bounds. Know that the integral is much more than this.
Treating it as an area operator allows us to define definite integration by easy-to-understand
means. Let us begin by finding the area beneath $f(x) = x^2$ over the interval 0 ≤ x ≤ 5
[Figure 2.1]. We will take a similar approach to The Odd Number Rule by dividing the area
beneath f into small rectangles, adding their areas, and then comparing them to what the area of
the shaded region really is. For now, you will just have to trust me when I say that the answer we
are after is 125/3 ≈ 41.67 square units.
[Figure 2.1: Graph of $f(x) = x^2$]
Let’s begin by picking a number of rectangles into which we want to divide this area. For no
particular reason, let’s say we choose 5. To make calculations a little easier, we’ll give all of these
rectangles an equal width. Let’s call the number of rectangles n, the starting point on our x-axis
a, and the ending point on our x-axis b. The length of the interval of the shaded region on the
x-axis is given by b − a, and if we want to divide that into n equal pieces then the width of each
rectangle, ∆x, will be that length (b − a) divided by n. Thus,
\[ \Delta x = \frac{b - a}{n} = \frac{5 - 0}{5} = 1 \]
This division of the x-axis is shown in Figure 2.2a.
Now that we have the widths of our rectangles, we need the heights. Recall from The Odd
Number Rule that these will not all be the same heights and will actually be the value of our
function at different values of x. It would be convenient to choose the values of x in intervals of
∆x, so let’s do that. The following is the sequence of x values that will be used to determine
the height of each rectangle.

$x_i$:    $x_1$   $x_2$   $x_3$   $x_4$   $x_5$
value:    0       1       2       3       4

Here, $x_1$ refers to the first item in the sequence, 0, $x_2$ refers to the second item, 1, and so on.
The height of each rectangle will then be $f(x_i)$, where $x_i$ is the left endpoint of the ith
rectangle. Note that these values correspond to the left side of each rectangle because we started
with $x_1 = 0$, and ended with $x_5 = 4$. We could have chosen $x_1 = 1$ and $x_5 = 5$, which would
make each x value correspond to the right endpoint of each rectangle. In a proper calculus class,
this idea is explored more extensively in order to improve the accuracy of predictions. For our
purposes, it doesn’t matter what endpoints we choose.
[Figure 2.2: Dividing the area below $x^2$ into rectangles in order to approximate the area under
the curve. (a) The width of each rectangle, ∆x, visualized. (b) Height and area of each rectangle.]
Notice that we said we would use five rectangles, yet only four are shown in Figure 2.2b. This is
because $x_1 = 0$, and so $f(x_1) = f(0) = 0^2 = 0$; there are five rectangles, but one just has a
height of zero. Now that we know the height and width of each rectangle, we can see how close the
sum of their areas is to the area under f. This calculation can be represented with a summation since
we’re incrementally multiplying the height of every rectangle by the base of each rectangle; all we
need to do is denote each individual rectangle with an index, i. Since the sequence of our x values
begins at i = 1, we need to start our summation index at 1 as well. And since this index iterates
through each of our rectangles, it only needs to go to n, 5 in this case.
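\[
\begin{aligned}
A \approx \sum_{i=1}^{n} b_i h_i &= \sum_{i=1}^{5} \Delta x\, f(x_i) \\
  &= \Delta x f(x_1) + \Delta x f(x_2) + \Delta x f(x_3) + \Delta x f(x_4) + \Delta x f(x_5) \\
  &= \Delta x\big[f(x_1) + f(x_2) + f(x_3) + f(x_4) + f(x_5)\big] \\
  &= 1\big[0^2 + 1^2 + 2^2 + 3^2 + 4^2\big] \\
  &= 0 + 1 + 4 + 9 + 16 \\
  &= 30
\end{aligned}
\]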
As can be seen from Figure 2.2b and from our calculated estimate, this procedure will result in an
underestimate. But recall from The Odd Number Rule that Michael demonstrated how the use of
more rectangles with smaller width results in a better approximation. Figure 2.3 shows the area
approximation for the use of ten subintervals, and it is immediately clear that our approximation
is closer to the area under the curve. The full computation is outlined below.
[Figure 2.3: Area approximation on the interval 0 ≤ x ≤ 5 with n = 10]
\[ \Delta x = \frac{b - a}{n} = \frac{5 - 0}{10} = \frac{1}{2} \]
⇓

$x_i$:    $x_1$   $x_2$   $x_3$   $x_4$   $x_5$   $x_6$   $x_7$   $x_8$   $x_9$   $x_{10}$
value:    0       0.5     1       1.5     2       2.5     3       3.5     4       4.5
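\[
\begin{aligned}
A \approx \sum_{i=1}^{10} b_i h_i &= \sum_{i=1}^{10} \Delta x\, f(x_i) = \Delta x \sum_{i=1}^{10} f(x_i) \\
  &= \Delta x\big[f(x_1) + f(x_2) + f(x_3) + f(x_4) + f(x_5) + f(x_6) + f(x_7) + f(x_8) + f(x_9) + f(x_{10})\big] \\
  &= \frac{1}{2}\big[0^2 + 0.5^2 + 1^2 + 1.5^2 + 2^2 + 2.5^2 + 3^2 + 3.5^2 + 4^2 + 4.5^2\big] \\
  &= 35.625
\end{aligned}
\]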
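The two hand computations above follow the same recipe, so they are easy to hand off to a short function. The sketch below is one possible way to write it; the name `left_riemann_sum` is an assumption, and the loop index runs from 0 to n − 1 rather than 1 to n, so that `a + i * dx` lands on the left endpoint of each rectangle.

```python
def left_riemann_sum(func, a, b, n):
    """Approximate the area under func on [a, b] using n left-endpoint rectangles."""
    dx = (b - a) / n  # width of each rectangle
    # i = 0 .. n-1 corresponds to x_1 .. x_n in the text (left endpoints).
    heights = (func(a + i * dx) for i in range(n))
    return dx * sum(heights)


def square(x):
    return x ** 2


print(left_riemann_sum(square, 0, 5, 5))   # five rectangles, as in the first computation
print(left_riemann_sum(square, 0, 5, 10))  # ten rectangles, as in the second computation
```

The two printed values should match the hand-computed results of 30 and 35.625.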
Recalling that the answer we are looking for is 125/3 ≈ 41.67, our approximation with 10 subintervals did get us a little bit closer. However, despite doubling the number of subintervals, the
approximation only got us 13.5% closer (from 72% of the true area to 85.5%). We continue to see
diminishing returns as we increase the number of subintervals, as shown in Figure 2.4.
[Figure 2.4: Area approximations for increased values of n visualized. n = 2: A ≈ 15.6;
n = 5: A ≈ 30; n = 10: A ≈ 35.6; n = 50: A ≈ 40.43; n = 100: A ≈ 41.04]
As we can see, increasing n leads to better approximations for the area beneath the curve, but
for sufficiently large values of n, increasing it further doesn’t get us much closer to the answer
we’re after. However, if we allow n to approach infinity (meaning we would use infinitely many
rectangles), we should see our approximation approach the precise area under the curve in this
interval. This brings us to the formal definition of a definite integral:
\[ \int_a^b f(x)\,dx = \lim_{n\to\infty} \sum_{i=1}^{n} f(x_i)\,\Delta x \qquad\text{where}\quad \Delta x = \frac{b - a}{n} \tag{2.1} \]
This equation can look somewhat daunting at first, so let’s break down all the different symbols
and terms.
• $\int_a^b$ tells us we want to find the signed area (we’ll talk about signed area vs area later) under
some curve between points a and b
• f(x) tells us which curve that is
• dx tells us what variable we’ll be integrating with respect to, just like with differentiation.
This means that the values of a and b on the $\int$ sign are values of x.
• $\lim_{n\to\infty}$ tells us we’re going to let n get infinitely large
• $\sum_{i=1}^{n}$ tells us we’re going to be adding up terms, starting with index one, and increasing the
index by one until we reach n. Since we’re letting n approach infinity, we’ll be adding up an
infinite number of terms
• $f(x_i)$ is part of the term that will be getting indexed and is the height of each rectangle
• ∆x is the width of each rectangle, and since it is equal to $\frac{b-a}{n}$ with n approaching infinity, it
will be an infinitely small width
dx can be thought of as an infinitely small ∆x. Since the definition of the definite integral has a limit
that brings n to infinity and n is in the denominator of ∆x, the larger n gets, the smaller ∆x gets.
And if n is infinitely large, ∆x is infinitely small, and thus we denote it by dx. All a definite integral
is really saying is that we are taking values of f (x) in infinitely many subintervals/increments and
multiplying them by an infinitely small range on the independent axis, then adding up the infinite
number of terms. This doesn’t always mean area though – area is just an easy and useful way of
representing what definite integration is. The infinite summation present in this definition means
that in order to use the definition of a definite integral, we would have to spend eternity performing
addition. So this definition isn’t going to be quite as useful as the definition of a derivative. But
luckily, we have some tricks up our sleeves that we will get to soon.
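We cannot actually add infinitely many terms, but we can watch what the sum in Eq. (2.1) does as n grows. The sketch below is an illustration with assumed names and an assumed list of n values; it reuses the left-endpoint rule from earlier in this section on $f(x) = x^2$ over 0 ≤ x ≤ 5 and prints the remaining gap to 125/3.

```python
def left_riemann_sum(func, a, b, n):
    """Finite version of the sum in Eq. (2.1), using left endpoints."""
    dx = (b - a) / n
    return dx * sum(func(a + i * dx) for i in range(n))


def square(x):
    return x ** 2


exact_area = 125 / 3  # the value quoted in the text for this interval

for n in (2, 5, 10, 50, 100, 1000, 100000):
    approximation = left_riemann_sum(square, 0, 5, n)
    gap = exact_area - approximation
    print(f"n = {n:>6}: A = {approximation:.5f}, gap = {gap:.5f}")
```

The gap keeps shrinking toward zero but never reaches it for any finite n, which is the behavior the limit in the definition captures.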
Section 2.2 Summary
• The area under a curve can be approximated by dividing the area into several
rectangles
• The more rectangles used to approximate the area under a curve, the better the
approximation is
• Definite integrals use an infinite number of infinitely small rectangles to calculate
the precise area under a curve over an interval
2.3 Indefinite Integration
Section 2.3 Overview
As mentioned in preceding sections, there are two primary categories of integration.
In the last section, we covered definite integration, and now we will cover indefinite
integration. By the end of this section, the reader should be able to:
• Explain how definite integration is different from indefinite integration
• Explain what an indefinite integral is
• State the basic properties of indefinite integrals
• Evaluate integrals of polynomial functions
We now need to discuss indefinite integrals. Indefinite integrals can be thought of as
anti-derivatives, and they are denoted as such.
\[ \int f(x)\,dx \]
Notice that this is very similar in appearance to a definite integral, shown in Eq. (2.1). The key
difference is that we no longer have the bounds a and b on the bottom/top of the integral
symbol. For now, let’s forget about definite integrals or calculating area under curves. Let’s just
say we have this new fancy operator that “undoes” differentiation and it has nothing to do with
finding the area under curves. We are going to call this operator the anti-derivative, and we’ll
sometimes denote it as
\[ F(x) = \int f(x)\,dx \]
This means that if we are given some function, f(x), and asked to find F(x), what we’re really being
asked is “What function did we differentiate to obtain f(x)?”. This will become more clear with
an example.
Example 2.1
Suppose Trevor is taking a calculus exam where one question asks him to differentiate a function. If the answer to this question is 2x, what was the function
Trevor differentiated?
If we want to find out what function Trevor differentiated, we should consider what kind of
function his answer is to determine the derivative formula he used. His answer is a one-term
polynomial, so he must have used the power rule. Recall the power rule states that
\[ \frac{d}{dx}x^n = nx^{n-1} \]
This example asks us what f(x) is, given
\[ \frac{d}{dx}f(x) = 2x \]
We can start by putting 2x in the general polynomial form,
\[ 2x = 2x^1 = Ax^n \quad\Rightarrow\quad n = 1,\; A = 2 = n + 1 \quad\Rightarrow\quad 2x = (n + 1)x^n,\; n = 1 \]
Since the power rule tells us to take the exponent of f(x) and bring it down to the coefficient,
the exponent of f(x) must have been n + 1, since that’s the coefficient of Trevor’s answer.
\[ \frac{d}{dx}f(x) = (n + 1)x^n \quad\Rightarrow\quad f(x) = x^{n+1} \]
Notice that the equation on the left side of the arrow is just a reindexed version of the power
rule. If we let η = n + 1, then n = η − 1 and we get
\[ \frac{d}{dx}f(x) = \eta x^{\eta - 1} \]
Which is the version of the power rule we’re more familiar with.
Returning to the right side of the arrow, we can plug in n = 1 to obtain
\[ f(x) = x^{n+1} = x^{1+1} = x^2 \]
To double check that f(x) is the function Trevor must have differentiated, we can find $f'(x)$.
If $f'(x) = 2x$, then we’re on the right track.
\[ \frac{d}{dx}f(x) = \frac{d}{dx}x^2 = 2x^{2-1} = 2x \]
Which is the same as Trevor’s answer, so we’re done, right? Nope! Recall Eq. (1.5), which
states that the derivative of any constant is zero. Consider that $x^2 = x^2 + 0$. Since the
derivative of a constant is zero, there could have been one or several constants in our function
f(x) that got “deleted” during the differentiation process. For example, consider the
following derivatives:
\[ \frac{d}{dx}\big(x^2 + 1\big) \qquad \frac{d}{dx}\big(x^2 + 5\big) \qquad \frac{d}{dx}\big(x^2 + 1095\big) \qquad \frac{d}{dx}\big(x^2 + \pi e\big) \qquad \frac{d}{dx}\big(x^2 + \sqrt{3}\,i\big) \]
All of these have the same derivative of 2x. The answer to the question is then,
\[ \text{If}\quad \frac{d}{dx}f(x) = 2x \quad\text{then}\quad f(x) = x^2 + c \]
where c is any constant. Or in the context of integrals,
\[ \int 2x\,dx = x^2 + c \]
c is called the constant of integration and it accounts for possible constants that were lost
during the differentiation process.
Before we come up with a power rule for anti-derivatives, we need to establish two simple properties.
Like derivatives, integrals are linear operators, which means that
\[ \int \big[f(x) \pm g(x)\big]\,dx = \int f(x)\,dx \pm \int g(x)\,dx \tag{2.2} \]
\[ \int kf(x)\,dx = k\int f(x)\,dx \tag{2.3} \]
where k is a constant real number.
If we know indefinite integrals “undo” differentiation, we can try integrating both sides of the
power rule (Eq. (1.4)). Note we’re going to use η instead of n because we’ll need to do some
reindexing and want our final formula to use n.
\[ \int \frac{d}{dx}x^{\eta}\,dx = \int \eta x^{\eta - 1}\,dx \]
Since indefinite integrals and derivatives cancel each other out, we obtain
\[ x^{\eta} + c_0 = \int \eta x^{\eta - 1}\,dx \]
Don’t forget the constant of integration. We’re denoting it as $c_0$ for reasons that will be explained
shortly. By Eq. (2.3), this is simply
\[ x^{\eta} + c_0 = \eta\int x^{\eta - 1}\,dx \]
Now we can isolate the indefinite integral to obtain
\[ \int x^{\eta - 1}\,dx = \frac{x^{\eta} + c_0}{\eta} \]
Notice that if we break this into two fractions we get
\[ \frac{x^{\eta} + c_0}{\eta} = \frac{x^{\eta}}{\eta} + \frac{c_0}{\eta} \]
Recall that $x^{\eta}$ is just the general form of a polynomial term. This means η represents a constant
value. If η and $c_0$ are constant values, then the quotient of the two will also be constant. Instead
of writing $\frac{c_0}{\eta}$, we can simply label it as c. Integral constants are usually just denoted by c, even if
it was multiplied/divided/added/subtracted by a constant value. This way, we don’t need to keep
track of a bunch of constants, but rather lump them together as one single constant.
\[ \int x^{\eta - 1}\,dx = \frac{x^{\eta}}{\eta} + c \]
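It would be more useful if our formula was for functions of the form $x^n$ instead of the form $x^{\eta - 1}$.
We can manipulate our current formula with a simple reindex. Let n = η − 1, and we obtain
\[
\begin{aligned}
n = \eta - 1 \quad&\Rightarrow\quad \eta = n + 1 \\
&\Rightarrow\quad \int x^{\eta - 1}\,dx = \int x^n\,dx \\
&\Rightarrow\quad \frac{x^{\eta}}{\eta} + c = \frac{x^{n+1}}{n+1} + c \\
&\Rightarrow\quad \int x^n\,dx = \frac{x^{n+1}}{n+1} + c
\end{aligned}
\]
And the power rule for integration is then
\[ \int x^n\,dx = \frac{x^{n+1}}{n+1} + c, \qquad n \neq -1 \tag{2.4} \]
The restriction n ≠ −1 is needed because we divided by η = n + 1, which cannot be zero.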
Let’s do an example problem using these new equations to ensure that we know how to use them.
Example 2.2
Compute
Z
1
4x3 − 8x2 + x − 7 dx
3
With integrals, whatever is between
R
and dx is what we want to integrate. We can first use
Eq. (2.2) to split this polynomial into individual integrals.
Z
Z
Z
Z
Z
1
1
4x3 − 8x2 + x − 7 dx = 4x3 dx + −8x2 dx +
x dx + −7 dx
3
3
Then use Eq. (2.3) to factor out the coefficients.
Z
Z
Z
Z
1
1Z
4x3 − 8x2 + x − 7 dx = 4 x3 dx − 8 x2 dx +
x dx − 7 1 dx
3
3
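Notice that we can rewrite $\int 1\,dx$ as $\int x^0\,dx$, just as we did in Example 1.3. This will make
it easier to use Eq. (2.4) for that term. Applying Eq. (2.4) to each term, we get
\[
\begin{aligned}
\int 4x^3 - 8x^2 + \frac{1}{3}x - 7\,dx &= 4\int x^3\,dx - 8\int x^2\,dx + \frac{1}{3}\int x\,dx - 7\int x^0\,dx \\
&= 4\left(\frac{x^{3+1}}{3+1} + c_1\right) - 8\left(\frac{x^{2+1}}{2+1} + c_2\right) + \frac{1}{3}\left(\frac{x^{1+1}}{1+1} + c_3\right) - 7\left(\frac{x^{0+1}}{0+1} + c_4\right) \\
&= 4\left(\frac{x^4}{4} + c_1\right) - 8\left(\frac{x^3}{3} + c_2\right) + \frac{1}{3}\left(\frac{x^2}{2} + c_3\right) - 7\left(\frac{x^1}{1} + c_4\right) \\
&= x^4 + 4c_1 - \frac{8}{3}x^3 - 8c_2 + \frac{1}{6}x^2 + \frac{1}{3}c_3 - 7x - 7c_4 \\
&= x^4 - \frac{8}{3}x^3 + \frac{1}{6}x^2 - 7x + 4c_1 - 8c_2 + \frac{1}{3}c_3 - 7c_4
\end{aligned}
\]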
Again, we do not need to keep track of all of our constants as we did above. We can lump
each of the constants and their coefficients together as c. My only reason for not doing so
to begin with was to further demonstrate that c is still a constant and to show why we do
it this way. As you can see, it would be quite tedious if we didn’t. Going forward, we won’t
be doing all of this arithmetic with these constants, and it will be implied that c represents
any and all integration constants as one single constant. We now have our answer,
\[
\int 4x^3 - 8x^2 + \frac{1}{3}x - 7\,dx = x^4 - \frac{8}{3}x^3 + \frac{1}{6}x^2 - 7x + c
\]
We can easily check our work by taking the derivative of this answer. If it’s correct, we
should get back exactly what’s inside of the indefinite integral on the left.
\[
\frac{d}{dx}\left(x^4 - \frac{8}{3}x^3 + \frac{1}{6}x^2 - 7x + c\right) = 4x^3 - 8x^2 + \frac{1}{3}x - 7 + 0
\]
This is exactly the same as the function we integrated, so our answer is correct.
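The same check can also be done numerically. The sketch below is only an illustration in plain Python: the names `integrand`, `antiderivative`, and `central_difference`, the step size `h`, and the sample points are all assumptions of mine, not part of the text. It estimates the slope of the answer with a difference quotient and compares it to the integrand.

```python
def integrand(x):
    """Integrand of Example 2.2: 4x^3 - 8x^2 + x/3 - 7."""
    return 4 * x**3 - 8 * x**2 + x / 3 - 7


def antiderivative(x, c=0.0):
    """Answer from Example 2.2, with an arbitrary constant c."""
    return x**4 - (8 / 3) * x**3 + x**2 / 6 - 7 * x + c


def central_difference(func, x, h=1e-5):
    """Approximate the derivative of func at x (h is an assumed step size)."""
    return (func(x + h) - func(x - h)) / (2 * h)


if __name__ == "__main__":
    for x in (-2.0, 0.0, 0.5, 3.0):
        slope = central_difference(antiderivative, x)
        print(f"x = {x:5.2f}   slope of answer = {slope:.5f}   integrand = {integrand(x):.5f}")
```

Changing `c` should not change the slopes, which matches the idea that the constant differentiates to zero.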
One last example before moving on; this one will be very simple.
Example 2.3
Compute
\[
\frac{d}{dx}\int \tan^{-1}x\,dx
\]
This may initially appear to be a difficult problem, as we currently don’t have any rules or
formulas that tell us how to integrate trig functions, let alone inverse trig functions. But
recall that
\[
\int \tan^{-1}x\,dx
\]
is the anti-derivative of $\tan^{-1}x$, that is to say, it “undoes” or is the inverse operator of
differentiation. If we’re being asked to take the derivative of an anti-derivative, the two
operations will cancel each other out, and we’ll only be left with the integrand (what’s
between the integral symbol and $dx$). Thus,
\[
\frac{d}{dx}\int \tan^{-1}x\,dx = \tan^{-1}x
\]
Though it is implied in the name, we’ll add this general rule to our list of equations.
\[
\frac{d}{dx}\int f(x)\,dx = f(x) \tag{2.5}
\]
Section 2.3 Summary
• Indefinite integrals act as the inverse operator to derivatives (or rather, they act
as anti-derivatives), while definite integrals find the area under curves over intervals
• Integrals possess many of the same properties as derivatives, such as the sum rule
and the constant factor rule
• Indefinite integrals can be computed using Eq. (2.4)
Practice Problems: 1-12
2.4 The Fundamental Theorem of Calculus
Section 2.4 Overview
The Fundamental Theorem of Calculus (FTOC) is – as the name suggests – one of the
most important principles in calculus, as it unites many of the concepts studied up to
this point under one theorem. By the end of this section, the reader should be able to:
• Explain both parts of the Fundamental Theorem of Calculus
• Use the Fundamental Theorem of Calculus to evaluate definite integrals
The Fundamental Theorem of Calculus can be split into two parts. The first part deals with
the relationship between definite integrals and derivatives, while the second part deals with the
relationship between definite and indefinite integrals. This gives us a very easy way of computing
definite integrals (i.e. area under curves), without having to use the infinitely long equation given
by Eq. (2.1).
However, I am going to dive into this in a different order than normally seen in calculus textbooks.
Rather than giving an in-depth look at part one and then part two, I’m going to very briefly show
you part one and then dive into part two. Once we understand part two, we’ll use it to better
understand part one.
2.4.1 FTOC Part One
Part one of the Fundamental Theorem of Calculus states that
Fundamental Theorem of Calculus Part I
If $f(x)$ is continuous on the closed interval $[a, b]$, then
\[
F(x) = \int_a^x f(t)\,dt
\]
is continuous on the closed interval $[a, b]$ and differentiable on the open interval $(a, b)$ and
\[
F'(x) = f(x)
\]
which is to say
\[
\frac{d}{dx}\int_a^x f(t)\,dt = f(x) \tag{2.6}
\]
For now, we’re going to leave this here and come back to it once we have simpler means of showing
that it’s true and explaining what it means.
2.4.2 FTOC Part Two
Recall from The Odd Number Rule that Michael was trying to calculate the area under our
velocity curve to determine our distance from him. We’re going to revisit this example in a bit
more detail, so let’s start off with a rough model of how we might figure that out. The formula
for position given constant velocity over some time interval is (uppercase letters used to avoid
confusion between variables and functions):
\[
X_f = X_i + V\,\Delta t
\]
Where X is position (f and i subscripts representing the final and initial values, respectively), V is
a constant velocity, and ∆t is an interval of time. This equation works by multiplying a constant
velocity by an interval of time to find how much distance was traveled during that time (distance
= speed × time), and then adds that distance to the position of the object at the beginning of that
time interval. We know from The Odd Number Rule that we run into issues when the velocity
isn’t constant, so let’s change our formula into a more dynamic function of time.
\[
X_f = X(t_f) \approx X(t_i) + X'(t_i)(t_f - t_i)
\]
Notice that since $V$ is velocity and $X$ is position, $V = X'(t)$, because velocity is the rate of change
in position with respect to time. This formula is now an approximation because it still assumes that
the velocity is constant on the interval of $\Delta t$. If you don’t see why, remember that we’re plugging a
constant value, $t_i$, into a function $X'(t)$, which gives a constant value and thus a constant velocity.
We know from The Odd Number Rule that we can make this approximation more accurate by
measuring the velocity at a high frequency instead of just once and then multiplying each of those
velocities by small intervals of time. The more often we measure, the better our approximation
is. We also know from Definite Integration that this is precisely what a definite integral does –
multiplying an infinite number of function values (which is velocity in our case) by infinitely small
widths (time intervals in our case). If we apply the definition of a definite integral to a velocity-time
curve, we get
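\[
\begin{aligned}
\lim_{n\to\infty}\sum_{k=0}^{n} X'(t_k)\,\Delta t &= \lim_{n\to\infty}\left(V_0\,\Delta t + V_1\,\Delta t + V_2\,\Delta t + \cdots + V_{n-1}\,\Delta t + V_n\,\Delta t\right) \\
&= \int_{t_i}^{t_f} V(t)\,dt \\
&= \int_{t_i}^{t_f} X'(t)\,dt
\end{aligned}
\]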
If this definite integral tells us exactly how far an object has moved, then we can substitute
it into our position formula to get an exact answer for where an object is, relative to its starting
position, at $t_f$.
\[
X_f \approx X_i + V\,\Delta t
\]
\[
\downarrow
\]
\[
X_f = X_i + \int_{t_i}^{t_f} X'(t)\,dt
\]
Now consider how $X(t)$ relates to $X'(t)$. If we let $f(t) = X'(t)$, then what is $X(t)$ in terms of
$f(t)$? It’s the anti-derivative! If we call $F(t)$ the anti-derivative or indefinite integral of $f(t)$, then
we can make a more general statement about how definite integrals relate to their anti-derivative
counterparts.
\[
X(t_f) = X(t_i) + \int_{t_i}^{t_f} X'(t)\,dt
\]
\[
\Downarrow
\]
\[
\int_{t_i}^{t_f} X'(t)\,dt = X(t_f) - X(t_i)
\]
\[
\downarrow
\]
\[
\int_a^b f(t)\,dt = F(b) - F(a)
\]
This means if we want to compute the definite integral of a function, we can find its indefinite
integral at the bounds of the definite integral and subtract them from one another! This is precisely
what part two of the Fundamental Theorem of Calculus tells us, and it is more formally expressed
below.
Fundamental Theorem of Calculus Part II
If a function f is continuous on the closed interval [a,b], then
\[
\int_a^b f(x)\,dx = F(b) - F(a) \tag{2.7}
\]
Where
\[
F(x) = \int f(x)\,dx
\]
Note that this also confirms that the definite integral of a rate yields the total change in whatever the rate is measuring. For example (with respect to time for both), integrating velocity
yields change in position, and integrating flow rate yields change in volume. Before revisiting the
Fundamental Theorem of Calculus Part I, there is some notation you should be aware of.
\[
F(x)\,\Big|_a^b = F(b) - F(a)
\]
The way to read “$F(x)\,\big|_a^b$” is “$F(x)$ evaluated from $a$ to $b$”. It will also sometimes be shown in the
following variations:
\[
\begin{aligned}
F(b) - F(a) &= F(x)\,\Big|_a^b \\
&= \Big[F(x)\Big]_a^b \\
&= F(x)\,\Big]_a^b \\
&= F(x)\,\Big|_{x=a}^{x=b}
\end{aligned}
\]
One more piece of notation you should be familiar with:
If
\[
y = f(x)
\]
Then
\[
y\,\Big|_{x=a} = f(a)
\]
The evaluated line with only one value – normally at the bottom – means “evaluated at” instead
of “evaluated from”. This is used when function notation isn’t being used, but you need a way of
expressing that some value is being plugged into a function. Revisiting the Fundamental Theorem
of Calculus Part I, which states that
\[
\frac{d}{dx}\int_a^x f(t)\,dt = f(x)
\]
We can use Eq. (2.7) to say
\[
\int_a^x f(t)\,dt = F(x) - F(a), \qquad F(t) = \int f(t)\,dt
\]
Thus, the derivative will be
\[
\begin{aligned}
\frac{d}{dx}\int_a^x f(t)\,dt &= \frac{d}{dx}\Big(F(x) - F(a)\Big) \\
&= \frac{d}{dx}F(x) - \frac{d}{dx}F(a) \\
&= f(x) - 0 \\
&= f(x)
\end{aligned}
\]
Note that the derivative of F (a) is zero because when we plug a into F (x), it becomes a constant
and therefore has a rate of change of zero.
To summarize what part one is telling us, consider applying it to a velocity curve. If $f(t)$ is a
measure of velocity over time, with $a$ and $x$ being two arbitrary time-values, then the anti-derivative
values $F(x)$ and $F(a)$ give two positions at two different points in time. Differentiating with respect to
$x$, a variable value of time, gives velocity as a function of the time-value $x$, which is the same
function as $f(t)$. Therefore, the Fundamental Theorem of Calculus Part I is a statement that establishes
the inverse relationship between integration and differentiation.
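If you would like to see part one in action, here is a rough numerical sketch. It assumes a midpoint-rule approximation for the integral and a difference quotient for the derivative; the helper names (`midpoint_integral`, `accumulated_area`, `central_difference`) and the choice of $f(t) = \cos t$ with $a = 0$ are mine, purely for illustration.

```python
import math


def midpoint_integral(f, a, b, n=2000):
    """Approximate the integral of f from a to b with n midpoint rectangles."""
    width = (b - a) / n
    return sum(f(a + (k + 0.5) * width) for k in range(n)) * width


def accumulated_area(f, a):
    """Return F, where F(x) approximates the integral of f(t) dt from a to x."""
    return lambda x: midpoint_integral(f, a, x)


def central_difference(func, x, h=1e-3):
    """Approximate the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)


if __name__ == "__main__":
    F = accumulated_area(math.cos, 0.0)
    for x in (0.5, 1.0, 2.0):
        print(f"x = {x}: F'(x) ~ {central_difference(F, x):.5f}, f(x) = {math.cos(x):.5f}")
```

The two printed columns should agree closely, and changing the lower bound `a` should not change the slope, since $F(a)$ is a constant.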
Section 2.4 Summary
• The Fundamental Theorem of Calculus Part I states that integration and differentiation are inverse operators
• The Fundamental Theorem of Calculus Part II states that definite integrals can be
computed by evaluating the anti-derivative at the bounds of the definite integral
• The definite integral of f from a to b is equal to the anti-derivative of f evaluated
at b minus the anti-derivative of f evaluated at a. This formula is equivalent to
the statement: “The definite integral of velocity over time interval ∆t = b − a gives
the change in position during ∆t and is equal to the position at t = b minus the
position at t = a.”
2.5 Basic Integral Properties & Formulas
Section 2.5 Overview
In preceding sections, we have defined what definite integrals and indefinite integrals are,
and how they relate to one another. In this section, we will develop general properties
and formulas for evaluating integrals. By the end of this section, the reader should be
able to:
• Use the Fundamental Theorem of Calculus Part II to compute definite integrals
• Evaluate both definite and indefinite integrals of trigonometric, exponential, and
logarithmic functions
• Recognize boundary-related properties of definite integrals
Since integration is the inverse operation of differentiation, rather than memorizing a bunch of
integration formulas, it is more useful to memorize all of the differentiation formulas and recognize
that they can be used in a “backward” fashion to find their integration counterparts. We’ll do two
short examples to better demonstrate what this means.
Example 2.4
Find
\[
\int \cos x\,dx
\]
Notice that if we integrate both sides of Eq. (1.7), the indefinite integral and the derivative
undo one another, and will result in the indefinite integral of $\cos x$.
\[
\begin{aligned}
\frac{d}{dx}\sin x &= \cos x \\
\int \frac{d}{dx}\sin x\,dx &= \int \cos x\,dx \\
\sin x + c &= \int \cos x\,dx
\end{aligned}
\]
\[
\Downarrow
\]
\[
\int \cos x\,dx = \sin x + c
\]
Some uses of this strategy will require us to do a small amount of algebra as shown in this next
example.
Example 2.5
Find
\[
\int a^x\,dx
\]
We can start by integrating both sides of Eq. (1.12)
\[
\begin{aligned}
\frac{d}{dx}a^x &= a^x \ln a \\
\int \frac{d}{dx}a^x\,dx &= \int a^x \ln a\,dx \\
a^x + c &= \ln a \int a^x\,dx
\end{aligned}
\]
Recall that $a$ is a constant, which is why we can factor $\ln a$ outside of the integral. We can
see that integrating both sides did not immediately result in a formula for the indefinite
integral of $a^x$, but we can divide both sides by $\ln a$ to fix that.
\[
\int a^x\,dx = \frac{a^x}{\ln a} + c
\]
Again, $\ln a$ is a constant, so $\ln a$ gets lumped in with $c$ instead of showing $\frac{c}{\ln a}$.
This process can be used on most of the differentiation formulas we’ve discussed so far, but we
aren’t going to run through them all. Again, it’s better to memorize the differentiation formula
and think backwards; however, I will still put all of the formulas here in case you need them.
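\begin{align}
\int a^x\,dx &= \frac{a^x}{\ln a} + c \tag{2.8} \\
\int \sec x\tan x\,dx &= \sec x + c \tag{2.9} \\
\int \frac{1}{x}\,dx &= \ln|x| + c \tag{2.10} \\
\int \csc x\cot x\,dx &= -\csc x + c \tag{2.11} \\
\int \sin x\,dx &= -\cos x + c \tag{2.12} \\
\int \csc^2 x\,dx &= -\cot x + c \tag{2.13} \\
\int \sec^2 x\,dx &= \tan x + c \tag{2.14} \\
\int \frac{1}{|x|\sqrt{x^2-1}}\,dx &= \sec^{-1}x + c \tag{2.15} \\
\int \frac{1}{\sqrt{1-x^2}}\,dx &= -\cos^{-1}x + c \tag{2.16} \\
\int \frac{1}{|x|\sqrt{x^2-1}}\,dx &= -\csc^{-1}x + c \tag{2.17} \\
\int \cos x\,dx &= \sin x + c \tag{2.18} \\
\int \frac{1}{1+x^2}\,dx &= \tan^{-1}x + c \tag{2.19} \\
\int \frac{1}{\sqrt{1-x^2}}\,dx &= \sin^{-1}x + c \tag{2.20} \\
\int \frac{1}{1+x^2}\,dx &= -\cot^{-1}x + c \tag{2.21}
\end{align}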
Now that we have many formulas for indefinite integrals, we can use the Fundamental Theorem of
Calculus Part II to solve a wide variety of definite integrals. Recall that in Definite Integration we
were trying to approximate the area underneath the curve $f(x) = x^2$ on the interval $0 \le x \le 5$,
and I said you would have to trust me when I told you that the answer was 125/3. We now have
all the tools to show where that answer comes from.
Example 2.6
Find the area under the curve $y = x^2$, from $x = 0$ to $x = 5$ where x and y are
measured in meters.
Since we’re calculating area, we’ll need to use a definite integral, with 0 and 5 being our
bounds.
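\[
A = \int_0^5 x^2\,dx
\]
We know from the Fundamental Theorem of Calculus Part II that this is equivalent to the
indefinite integral of $x^2$ evaluated from 0 to 5. Using Eq. (2.4),
\[
\begin{aligned}
\int x^2\,dx &= \frac{x^3}{3} + c \\
A &= \left[\frac{x^3}{3} + c\right]_0^5 \\
&= \left(\frac{5^3}{3} + c\right) - \left(\frac{0^3}{3} + c\right) \\
&= \frac{125}{3} + c - 0 - c \\
&= \frac{125}{3}
\end{aligned}
\]
Since the units on the axes are meters,
\[
A = \frac{125}{3}\ \text{m}^2
\]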
Notice that the constants of integration canceled each other out. This will be the case for
all definite integrals, so you do not need to include them when evaluating definite integrals.
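To connect this back to the rectangle picture from Definite Integration, here is a small sketch that compares a Riemann sum against $F(5) - F(0)$. I am assuming right-endpoint rectangles of equal width, and the function names (`riemann_sum`, `square`, `square_antiderivative`) are just illustrative choices.

```python
def riemann_sum(f, a, b, n):
    """Right-endpoint Riemann sum of f on [a, b] using n equal-width rectangles."""
    dx = (b - a) / n
    return sum(f(a + i * dx) for i in range(1, n + 1)) * dx


def square(x):
    """The curve from Example 2.6."""
    return x**2


def square_antiderivative(x):
    """An anti-derivative of x^2 from Eq. (2.4), taking c = 0."""
    return x**3 / 3


if __name__ == "__main__":
    exact = square_antiderivative(5) - square_antiderivative(0)
    for n in (10, 100, 1000, 10000):
        approx = riemann_sum(square, 0, 5, n)
        print(f"n = {n:6d}   sum = {approx:.6f}   error = {abs(approx - exact):.6f}")
    print(f"F(5) - F(0) = {exact:.6f}")
```

As `n` grows, the sum should creep toward the value given by the Fundamental Theorem of Calculus Part II, without ever needing the constant $c$.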
In all of our area calculations so far, we’ve only dealt with area above the x-axis. If you consider
the definition of a definite integral, given by Eq. (2.1), you’ll see that where f (x) is negative, so
is the signed area, or the net area. In other words, the area below the x-axis will be negative.
However negative area isn’t really a thing, so we refer to it as the signed area. If we split a curve
into portions above and below the x-axis, we can then add each of the signed areas together to
obtain the net area. If you need to find actual area, you can do so by breaking the integral into
separate sections as will be shown in this next example.
Example 2.7
Evaluate
\[
\int_0^3 (x-1)^2 - 1\,dx
\]
We must be careful here, as our integrand contains a composite function, $(x-1)^2$. We do
not (currently) have a chain rule equivalent for integration, but there is a simple way that
we can solve this without one.
\[
\begin{aligned}
(x-1)^2 - 1 &= (x-1)(x-1) - 1 \\
&= x^2 - 2x + 1 - 1 \\
&= x^2 - 2x
\end{aligned}
\]
Now the integrand is in a form we know how to deal with.
\[
\begin{aligned}
\int_0^3 x^2 - 2x\,dx &= \left[\frac{x^3}{3} - x^2\right]_0^3 \\
&= \left(\frac{3^3}{3} - 3^2\right) - \left(\frac{0^3}{3} - 0^2\right) \\
&= (9 - 9) - (0 - 0) \\
&= 0 - 0 \\
&= 0
\end{aligned}
\]
Clearly we’ve run into a situation where there is a negative signed area and positive signed
area of equal magnitudes, making for a net area of zero. Let’s take a look at the graph.
[Figure: graph of the integrand for $0 \le x \le 3.5$, with region $A_1$ below the x-axis and region $A_2$ above it.]
As you can see, the region below the x-axis, $A_1$, will have a negative signed area and the
region above, $A_2$, will have a positive signed area. It just so happens that our selected
bounds mean these areas have equal magnitudes, so they sum up to zero. This is a good
opportunity to show a property of definite integrals. The net area can be given by,
\[
A_{net} = \int_0^3 x^2 - 2x\,dx = A_1 + A_2
\]
And the two regions are given by
\[
\begin{aligned}
A_1 &= \int_0^2 x^2 - 2x\,dx \\
A_2 &= \int_2^3 x^2 - 2x\,dx
\end{aligned}
\qquad\Rightarrow\qquad
\int_0^3 x^2 - 2x\,dx = \int_0^2 x^2 - 2x\,dx + \int_2^3 x^2 - 2x\,dx
\]
The right side of the arrow implies that when you have matching integrands with bounds
forming one continuous interval, you can combine the definite integrals into one integral.
This can also be done in reverse. You can break up an integral by breaking apart its interval
into several subintervals.
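Here is a short sketch of the difference between net area and actual area for this example. It uses the anti-derivative found above with the constant dropped; the function names and the split point $x = 2$ (where $x^2 - 2x$ crosses the x-axis) are spelled out as my own illustration rather than anything prescribed by the text.

```python
def antiderivative(x):
    """An anti-derivative of x^2 - 2x (constant of integration omitted)."""
    return x**3 / 3 - x**2


def definite_integral(a, b):
    """Signed area of x^2 - 2x between a and b, computed as F(b) - F(a)."""
    return antiderivative(b) - antiderivative(a)


if __name__ == "__main__":
    a1 = definite_integral(0, 2)  # region below the x-axis
    a2 = definite_integral(2, 3)  # region above the x-axis
    print("A1 (signed):", a1)
    print("A2 (signed):", a2)
    print("net area   :", a1 + a2)
    print("actual area:", abs(a1) + abs(a2))
```

Taking absolute values of each piece before adding is what turns the signed areas into an actual area.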
The property shown in Example 2.7 is formally expressed below.
\[
\int_a^c f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx \tag{2.22}
\]
where $a \le b \le c$
This can be easily verified by a quick application of the Fundamental Theorem of Calculus Part
II to the right hand side of this equation.
If
\[
F(x) = \int f(x)\,dx
\]
Then
\[
\begin{aligned}
\int_a^b f(x)\,dx + \int_b^c f(x)\,dx &= \big(F(b) - F(a)\big) + \big(F(c) - F(b)\big) \\
&= F(b) - F(a) + F(c) - F(b) \\
&= F(c) - F(a) + F(b) - F(b) \\
&= F(c) - F(a) + 0 \\
&= F(c) - F(a) \\
&= \int_a^c f(x)\,dx
\end{aligned}
\]
Two other properties for definite integrals are
\[
\int_a^a f(x)\,dx = 0 \tag{2.23}
\]
\[
\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx \tag{2.24}
\]
The first is a rather trivial property, as the area under a curve between the same two points is
clearly going to be zero. But just to give a more symbolic explanation, consider $\Delta x$ from the
definition of a definite integral, Eq. (2.1).
\[
\Delta x = \frac{b-a}{n} \quad\Rightarrow\quad a = b \quad\Rightarrow\quad \Delta x = \frac{0}{n} = 0
\]
\[
\int_a^a f(x)\,dx = \lim_{n\to\infty}\sum_{i=1}^{n} f(x_i)(0) = 0
\]
And as for the second, we can again consider how this changes ∆x.
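\[
\int_a^b f(x)\,dx = \lim_{n\to\infty}\sum_{i=1}^{n} f(x_i)\left(\frac{b-a}{n}\right)
\]
\[
\Downarrow
\]
\[
\begin{aligned}
\int_b^a f(x)\,dx &= \lim_{n\to\infty}\sum_{i=1}^{n} f(x_i)\left(\frac{a-b}{n}\right) \\
&= \lim_{n\to\infty}\sum_{i=1}^{n} f(x_i)\left(-\frac{b-a}{n}\right) \\
&= -\lim_{n\to\infty}\sum_{i=1}^{n} f(x_i)\left(\frac{b-a}{n}\right) \\
&= -\int_a^b f(x)\,dx
\end{aligned}
\]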
There are two more simple identities you should be aware of, which will be demonstrated in the
next two examples. I’d recommend following along for the first one, and then trying the second
one on your own.
Example 2.8
Evaluate
\[
\int_{-\pi/2}^{\pi/2} \cos x\,dx
\]
This isn’t a challenging integral, but you can make the calculations slightly easier by taking
a look at the graph.
[Figure: graph of $y = \cos x$ from $x = -\frac{\pi}{2}$ to $x = \frac{\pi}{2}$, with the area under the curve split into regions $A_1$ and $A_2$.]
We’ll start by breaking the area into two separate areas. Recall that cosine is an even
function, which is to say it follows the general form
\[
f(-x) = f(x)
\]
\[
\downarrow
\]
\[
\cos(-x) = \cos(x)
\]
This means all the y values left and right of the y-axis will be the same, and since our bounds
are equidistant from the y-axis, the areas of these two regions will be the same. We can also
easily express these areas (individually) as a definite integral.
\[
A_{net} = A_1 + A_2, \qquad A_1 = A_2 = A \quad\Rightarrow\quad A_{net} = 2A
\]
\[
\begin{aligned}
A = \int_0^{\pi/2} \cos x\,dx \quad\Rightarrow\quad A_{net} &= 2\left[\int_0^{\pi/2} \cos x\,dx\right] \\
&= 2\Big[\sin x\Big]_0^{\pi/2} \\
&= 2\left[\sin\frac{\pi}{2} - \sin 0\right] \\
&= 2\big[1 - 0\big] \\
&= 2
\end{aligned}
\]
The identity used in Example 2.8 is more formally written below.
\[
\int_{-a}^{a} f(x)\,dx = 2\int_0^a f(x)\,dx \tag{2.25}
\]
if $f(-x) = f(x)$ for all $x$ in $[-a, a]$
Unfortunately we don’t currently have the tools to easily show this, but the next section will enable
us to do so. The next identity is similar, so I recommend attempting to solve it on your own.
Example 2.9
Evaluate without the use of calculus,
\[
\int_{-\pi/2}^{\pi/2} \sin x\,dx
\]
If we can’t use calculus to evaluate this integral, then we ought to think about it geometrically.
[Figure: graph of $y = \sin x$ from $x = -\frac{\pi}{2}$ to $x = \frac{\pi}{2}$, with region $A_1$ below the x-axis and region $A_2$ above it.]
In this case, our integrand is an odd function, which means
\[
f(-x) = -f(x)
\]
\[
\downarrow
\]
\[
\sin(-x) = -\sin(x)
\]
This means that all y values left of the y-axis will be the same values as those to the right of
the y-axis, but negative. So rather than the net area being twice the area of one region, the
negative area of A1 will cancel out the positive area of A2 . Thus the integral will be zero.
But just to verify:
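\[
\begin{aligned}
\int_{-\pi/2}^{\pi/2} \sin x\,dx &= -\cos x\,\Big|_{-\pi/2}^{\pi/2} \\
&= \left[-\cos\frac{\pi}{2}\right] - \left[-\cos\left(-\frac{\pi}{2}\right)\right] \\
&= (-0) - (-0) \\
&= 0
\end{aligned}
\]
This identity is more formally written as
\[
\int_{-a}^{a} f(x)\,dx = 0 \tag{2.26}
\]
if $f(-x) = -f(x)$ for all $x$ in $[-a, a]$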
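Both symmetry identities are easy to sanity-check numerically. The sketch below assumes a midpoint-rule approximation; `midpoint_integral` and `symmetric_integral_even` are hypothetical helper names, and cosine and sine on $[-\frac{\pi}{2}, \frac{\pi}{2}]$ are used only because they are the examples above.

```python
import math


def midpoint_integral(f, a, b, n=2000):
    """Approximate the integral of f from a to b with n midpoint rectangles."""
    width = (b - a) / n
    return sum(f(a + (k + 0.5) * width) for k in range(n)) * width


def symmetric_integral_even(f, a, n=2000):
    """Eq. (2.25) for an even f: integrate over [0, a] and double the result."""
    return 2 * midpoint_integral(f, 0.0, a, n)


if __name__ == "__main__":
    a = math.pi / 2
    print("cos over [-a, a]     :", midpoint_integral(math.cos, -a, a))
    print("2 * (cos over [0, a]):", symmetric_integral_even(math.cos, a))
    print("sin over [-a, a]     :", midpoint_integral(math.sin, -a, a))
```

The first two numbers should match each other (even function), and the last should be essentially zero (odd function), up to rounding.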
That covers most of the properties you’ll see. Let’s just do one last example using one of the
formulas we learned earlier in this section.
Example 2.10
Evaluate
\[
\int \frac{\sin x}{\cos^2 x}\,dx
\]
We don’t have an equation to deal with this as it currently exists, so we’ll need to try and
manipulate it. Notice we can factor a cosine function out of the denominator.
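\[
\int \frac{\sin x}{\cos^2 x}\,dx = \int \frac{1}{\cos x}\,\frac{\sin x}{\cos x}\,dx
\]
\[
\frac{1}{\cos x} = \sec x, \qquad \frac{\sin x}{\cos x} = \tan x \quad\Rightarrow\quad \int \frac{\sin x}{\cos^2 x}\,dx = \int \sec x\tan x\,dx
\]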
The integrand is now something we can work with, as sec x tan x is the derivative of sec x.
This means the anti-derivative will just be sec x plus a constant.
\[
\frac{d}{dx}\sec x = \sec x\tan x
\]
\[
\Downarrow
\]
\[
\int \sec x\tan x\,dx = \sec x + c
\]
\[
\Downarrow
\]
\[
\int \frac{\sin x}{\cos^2 x}\,dx = \sec x + c
\]
Section 2.5 Summary
• Indefinite integral formulas can be obtained by integrating known differentiation
formulas
• Definite integrals can be broken into separate integrals by splitting the bounds of
integration [see Eq. (2.22)]
• Symmetry about the y-axis (even functions) or about the origin (odd functions) can be
used to simplify the computations of definite integrals
Practice Problems: 1 - 12
2.6 U-Substitution
Section 2.6 Overview
U-substitution is a method of integrating composite functions, just as the chain rule
differentiates composite functions. This method of integration will allow us to integrate
functions of greater complexity than preceding sections. By the end of this section, the
reader should be able to:
• Make strategic choices of substitutions in order to integrate composite functions
The formal definition of U-substitution is given by
\[
\int f\big(g(x)\big)\cdot g'(x)\,dx = \int f(u)\,du \tag{2.27}
\]
where $u = g(x)$ and $du = g'(x)\,dx$
How to use this formula will become more apparent once you see how it’s done.
Example 2.11
Compute
\[
\int \frac{x}{x^2+1}\,dx
\]
We can see that none of the integration formulas we’ve learned so far can be applied to this
problem. However, we can rearrange this so that it is more clearly in the form of Eq. (2.27)
\[
\int \frac{x}{x^2+1}\,dx = \int \frac{1}{x^2+1}\,(x\,dx)
\]
You’ll hopefully see how this relates to our new formula. We can write f and g as
\[
\begin{aligned}
f(x) &= \frac{1}{x} & g(x) &= x^2 + 1 \\
f\big(g(x)\big) &= \frac{1}{x^2+1} & g'(x) &= 2x \\
& & g'(x)\,dx &= 2x\,dx
\end{aligned}
\]
There is a small issue here. Once we plug our values into Eq. (2.27), we’ll find that it doesn’t
match our original integrand.
\[
\int f\big(g(x)\big)\cdot g'(x)\,dx = \int \frac{1}{x^2+1}\,(2x\,dx) \neq \int \frac{x}{x^2+1}\,dx
\]
However, if we divide the result given by Eq. (2.27) by two, then our integrands will be
equal.
\[
\frac{1}{2}\int \frac{1}{x^2+1}\,(2x\,dx) = \int \frac{x}{x^2+1}\,dx
\]
Now we can make our substitutions. Recall that those substitutions are
\[
u = g(x) = x^2 + 1
\]
\[
du = g'(x)\,dx = 2x\,dx
\]
Which results in
\[
\int \frac{x}{x^2+1}\,dx = \frac{1}{2}\int \frac{1}{u}\,du
\]
All we’ve done here is changed the variable of integration into a dummy variable that makes
for an integrand that’s easy to handle. We can evaluate this integral using Eq. (2.10).
\[
\frac{1}{2}\int \frac{1}{u}\,du = \frac{1}{2}\ln|u| + c
\]
Now we reverse the substitution to get our answer in terms of x instead of u.
\[
\begin{aligned}
\frac{1}{2}\ln|u| + c &= \frac{1}{2}\ln\left|x^2+1\right| + c \\
&= \ln\sqrt{x^2+1} + c
\end{aligned}
\]
We used a property of logarithms in the last step. The answer is the same without it, so if
you don’t understand it, don’t worry about it.
While this formula is useful, it is often a lot of work to follow it. A much more efficient way of utilizing
u-substitution is to identify the “inside” function and check if the derivative of that function is
present in the numerator. Why this works will become clear with an example. We’ll perform the
same integral as the last example to clearly demonstrate how this method differs.
Example 2.12
Compute
\[
\int \frac{x}{x^2+1}\,dx
\]
We know that the “inside” function is $x^2 + 1$, and we can see that the x in the numerator
prevents this integral from being easy. The derivative of the inside function results in a
polynomial with the same degree as the numerator, so we’re going to use the derivative of u
to get rid of the x.
\[
u = x^2 + 1 \quad\Rightarrow\quad \frac{du}{dx} = 2x
\]
What I am about to do is a bit odd, and it brings up a complicated question.
\[
\frac{du}{dx} = 2x \quad\Rightarrow\quad du = 2x\,dx
\]
We learned in Differential Calculus that d/dx is just an operator, but here I treated it like a
fraction. So is dy/dx a fraction? Not really... but sometimes you can treat it like one. dy/dx
is a ratio, which is different than a fraction. Consider ∆y/∆x. This does not mean ∆y
pieces out of a whole ∆x. It means for a change of ∆y on the y-axis, we get a change of ∆x
on the x-axis, or vice versa. However, sometimes treating dy/dx like a fraction will help us
get the right answer, especially in low-level mathematics such as basic calculus; it’s just
a very informal way of doing so and lacks rigour. If you pursue high-level math courses, you
will be able to rigorously define the cases for which doing this is perfectly legal. For now,
just know that treating dy/dx like a fraction will be okay most, if not all, of the time for
the level of mathematics we’re doing. As a measure of safety, check with a professor before
performing this kind of pseudo-arithmetic in cases you haven’t seen it done before. From
here we can solve for dx.
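\[
du = 2x\,dx \quad\Rightarrow\quad dx = \frac{du}{2x}
\]
Now we take our original integral and substitute our new values in
\[
u = x^2 + 1 \quad\text{and}\quad dx = \frac{du}{2x}
\]
\[
\Downarrow
\]
\[
\begin{aligned}
\int \frac{x}{x^2+1}\,dx &= \int \frac{x}{u}\left(\frac{du}{2x}\right) \\
&= \frac{1}{2}\int \frac{x}{xu}\,du \\
&= \frac{1}{2}\int \frac{1}{u}\,du \\
&= \frac{1}{2}\ln|u| + c \\
&= \frac{1}{2}\ln\left|x^2+1\right| + c \\
&= \ln\sqrt{x^2+1} + c
\end{aligned}
\]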
So even though we did something that isn’t technically correct, it worked out, and it’ll work
out for the integrals you’ll need to do as well.
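If the fraction trick makes you uneasy, you can always check a u-substitution result the same way we checked Example 2.2: differentiate the answer and see whether the integrand comes back. The sketch below does this numerically; the function names, step size, and sample points are illustrative assumptions on my part.

```python
import math


def integrand(x):
    """Integrand of Examples 2.11 and 2.12: x / (x^2 + 1)."""
    return x / (x**2 + 1)


def antiderivative(x, c=0.0):
    """Answer from the examples: (1/2) ln(x^2 + 1) + c."""
    return 0.5 * math.log(x**2 + 1) + c


def central_difference(func, x, h=1e-5):
    """Approximate the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)


if __name__ == "__main__":
    for x in (-3.0, -0.5, 0.0, 1.0, 4.0):
        slope = central_difference(antiderivative, x)
        print(f"x = {x:5.2f}   slope of answer = {slope:.6f}   integrand = {integrand(x):.6f}")
```

The same check works for either written form of the answer, since $\frac{1}{2}\ln(x^2+1)$ and $\ln\sqrt{x^2+1}$ are the same function.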
Let’s do one more example with indefinite integrals, and then quickly show how these can be applied
to definite integrals.
Example 2.13
Compute
\[
\int \tan x\,dx
\]
This doesn’t seem to be a composite function, but recall in Example 1.12 we ran into a
similar issue, and we solved that issue by using the fact that
\[
\tan x = \frac{\sin x}{\cos x}
\]
Now it’s a bit more clear that this is a composite function. Also notice that if we take the
derivative of the denominator, we may be able to cancel out the numerator like we did in
the last example. So let’s start by letting $u = \cos x$.
\[
u = \cos x \quad\Rightarrow\quad du = -\sin x\,dx \quad\Rightarrow\quad dx = -\frac{du}{\sin x}
\]
Substituting these values into our integral,
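\[
\begin{aligned}
\int \tan x\,dx &= \int \frac{\sin x}{\cos x}\,dx \\
&= \int \frac{\sin x}{u}\left(\frac{-du}{\sin x}\right) \\
&= -\int \frac{\sin x}{u\sin x}\,du \\
&= -\int \frac{1}{u}\,du \\
&= -\ln|u| + c \\
&= -\ln\left|\cos x\right| + c \\
&= \ln\left|\frac{1}{\cos x}\right| + c \\
&= \ln\left|\sec x\right| + c
\end{aligned}
\]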
For definite integrals, the process is pretty much the same; we just have to modify our integration
bounds a tiny bit.
Example 2.14
Compute
\[
\int_0^2 x\sqrt{1+2x^2}\,dx
\]
We’ll start off as we usually would by choosing our substitution, taking the derivative, and
then plugging it all into our original integral.
\[
u = 1 + 2x^2 \quad\Rightarrow\quad du = 4x\,dx \quad\Rightarrow\quad dx = \frac{du}{4x}
\]
Before we plug this in, we need to consider our bounds. The bounds we have been given are
values of x, so if we switch to u, we need the bounds to be values of u, not x. There are two
ways of doing this. The first way is to plug the x bounds into our equation for u to obtain
bounds that are values of u instead of x.
\[
\begin{aligned}
x_1 = 0 \quad&\Rightarrow\quad u_1 = 1 + 2(0)^2 = 1 \\
x_2 = 2 \quad&\Rightarrow\quad u_2 = 1 + 2(2)^2 = 9
\end{aligned}
\]
Now we can say
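\[
\begin{aligned}
\int_0^2 x\sqrt{1+2x^2}\,dx &= \frac{1}{4}\int_1^9 \sqrt{u}\,du \\
&= \frac{1}{4}\left[\frac{2}{3}u^{3/2}\right]_1^9 \\
&= \frac{1}{6}\left[(9)^{3/2} - (1)^{3/2}\right] \\
&= \frac{13}{3}
\end{aligned}
\]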
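Or, instead we can turn our x values into dummy u values and plug our values of x back in
at the end.
\[
\begin{aligned}
\int_0^2 x\sqrt{1+2x^2}\,dx &= \frac{1}{4}\int_{u_1}^{u_2} \sqrt{u}\,du \\
&= \frac{1}{4}\left[\frac{2}{3}u^{3/2}\right]_{u_1}^{u_2} \\
&= \frac{1}{6}\left[\left(1+2x^2\right)^{3/2}\right]_0^2 \\
&= \frac{1}{6}\left[9^{3/2} - 1^{3/2}\right] \\
&= \frac{13}{3}
\end{aligned}
\]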
Instead of figuring out what the bounds were as values of u, we just briefly changed the
bounds to unknown values of u, and then switched back to values of x once we reversed the
substitution after integrating. Either way will work; one just requires slightly less work.
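As a sanity check that both ways of handling the bounds land on the same number, here is a rough sketch. It assumes a midpoint-rule approximation for the integrals, and all helper names (`midpoint_integral`, `original_integrand`, `u_of_x`, `antiderivative_in_x`) are mine, not the text's.

```python
import math


def midpoint_integral(f, a, b, n=10000):
    """Approximate the integral of f from a to b with n midpoint rectangles."""
    width = (b - a) / n
    return sum(f(a + (k + 0.5) * width) for k in range(n)) * width


def original_integrand(x):
    """Integrand of Example 2.14 in terms of x."""
    return x * math.sqrt(1 + 2 * x**2)


def u_of_x(x):
    """The substitution u = 1 + 2x^2, used to convert the bounds."""
    return 1 + 2 * x**2


def antiderivative_in_x(x):
    """(1/6)(1 + 2x^2)^(3/2): the result after reversing the substitution."""
    return (1 + 2 * x**2) ** 1.5 / 6


if __name__ == "__main__":
    direct = midpoint_integral(original_integrand, 0, 2)
    converted_bounds = 0.25 * midpoint_integral(math.sqrt, u_of_x(0), u_of_x(2))
    back_substituted = antiderivative_in_x(2) - antiderivative_in_x(0)
    print("direct numerical integral in x:", direct)
    print("u-integral with u bounds      :", converted_bounds)
    print("back-substituted, x bounds    :", back_substituted)
```

Note that the lower bound contributes $1^{3/2}$ in both approaches, because $u = 1$ when $x = 0$.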
Before we move on, we now have the tools to show why Eq. (2.25) and Eq. (2.26) are true.
Example 2.15
Show that if $f(x)$ is integrable and $f(-x) = f(x)$ on $[-a, a]$, then
\[
\int_{-a}^{a} f(x)\,dx = 2\int_0^a f(x)\,dx
\]
We can start by breaking this down into two separate integrals with Eq. (2.22).
\[
\int_{-a}^{a} f(x)\,dx = \int_{-a}^{0} f(x)\,dx + \int_0^a f(x)\,dx
\]
Let’s play around with the first term of the result and see if we can get it into the same form
as the second term. Since we know $f(-x) = f(x)$, we can say
\[
\int_{-a}^{0} f(x)\,dx = \int_{-a}^{0} f(-x)\,dx
\]
We now have a composite function, so let’s use u-substitution.
\[
\begin{aligned}
u = -x \quad&\Rightarrow\quad du = -dx \quad\Rightarrow\quad dx = -du \\
x_1 = -a \quad&\Rightarrow\quad u_1 = a \\
x_2 = 0 \quad&\Rightarrow\quad u_2 = 0
\end{aligned}
\]
Now we have
\[
\begin{aligned}
\int_{-a}^{0} f(x)\,dx &= \int_{-a}^{0} f(-x)\,dx \\
&= \int_{a}^{0} f(u)\,(-du) \\
&= -\int_a^0 f(u)\,du
\end{aligned}
\]
By the Fundamental Theorem of Calculus Part II, this becomes
\[
\begin{aligned}
-\int_a^0 f(u)\,du &= -\big(F(0) - F(a)\big) \\
&= F(a) - F(0)
\end{aligned}
\]
Notice by the Fundamental Theorem of Calculus Part II that this is equivalent to the second
term
\[
\int_0^a f(x)\,dx = F(a) - F(0)
\]
Therefore
\[
\int_{-a}^{0} f(x)\,dx = \int_0^a f(x)\,dx
\]
\[
\Downarrow
\]
\[
\begin{aligned}
\int_{-a}^{a} f(x)\,dx &= \int_0^a f(x)\,dx + \int_0^a f(x)\,dx \\
&= 2\int_0^a f(x)\,dx
\end{aligned}
\]
Example 2.16
Show that if $f(x)$ is integrable and $f(-x) = -f(x)$ on $[-a, a]$, then
\[
\int_{-a}^{a} f(x)\,dx = 0
\]
The strategy here is the same as the last example, so I’ll run through it fairly quickly.
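\[
\int_{-a}^{a} f(x)\,dx = \int_{-a}^{0} f(x)\,dx + \int_0^a f(x)\,dx
\]
\[
\begin{aligned}
\int_{-a}^{0} f(x)\,dx &= -\int_0^{-a} f(x)\,dx \\
&= \int_0^{-a} -f(x)\,dx \\
&= \int_0^{-a} f(-x)\,dx
\end{aligned}
\]
\[
u = -x \quad\Rightarrow\quad dx = -du \quad\Rightarrow\quad u_1 = 0,\ u_2 = a
\]
\[
\int_0^{-a} f(-x)\,dx = -\int_0^a f(u)\,du
\]
\[
\begin{aligned}
\Rightarrow\quad \int_{-a}^{a} f(x)\,dx &= \int_0^a f(x)\,dx - \int_0^a f(u)\,du \\
&= \big(F(a) - F(0)\big) - \big(F(a) - F(0)\big) \\
&= 0
\end{aligned}
\]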
Section 2.6 Summary
• U-substitution rewrites integrals in terms of a sub-function u in order to put them
into a form we can integrate with preceding equations.
• A choice of u should be made with consideration to the resulting du. This result
often dictates whether or not the choice of u will lead to a solvable integral.
Practice Problem Set 1: 1-6
Practice Problem Set 2: 1-4
2.7 Applying Integrals to Physics
Section 2.7 Overview
When dealing with physical problems involving integration, it is often the case that only
one value of c will give the correct answer. Instead of having infinitely many answers
from one family of answers, we will often only have one correct answer. By the end of
this section, the reader should be able to:
• Apply integration to basic kinematics
• Solve for constants of integration
As stated, when integration is used in the real world, it is often the case that only one choice
of an integration constant will accurately model a physical scenario. Finding that constant is simple and is
demonstrated in the following example.
Example 2.17
The velocity of a car on the highway is given by $v(t) = t^2 + t + 1$. If the car’s
position at $t = 0$ is $r_0 = 5$ meters, find the position of the car as a function of
time.
We know how to find position given a velocity function of time, it’s just that last part about
$r_0$ that we’ve never seen before. That value of $r_0$ dictates which value of $c$ will give us
the right answer. Changes in the integration constant will raise/lower the entire position
function, so we only want the one that includes the point $(t, r) = (0, 5)$. This is very easy to do, so
let’s start off as we normally would. We know that the integral of velocity with respect to
time is position, so integrating $v(t)$ will give us $r(t)$.
\[
\begin{aligned}
r(t) &= \int v(t)\,dt \\
&= \int t^2 + t + 1\,dt \\
&= \frac{1}{3}t^3 + \frac{1}{2}t^2 + t + c
\end{aligned}
\]
Remember, only one value of c will get us the function we’re after. Since we know r(t) must
be equal to five when t is zero, we can simply plug zero into our function, set it equal to
five, and solve for c.
\[
\begin{aligned}
r(0) &= 0 + 0 + 0 + c \\
&= 0 + c = 5 \quad\Rightarrow\quad c = 5
\end{aligned}
\]
So the position function we’re after is
\[
r(t) = \frac{1}{3}t^3 + \frac{1}{2}t^2 + t + 5
\]
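The "plug in the known point and solve for $c$" step can be sketched in a few lines. This is only an illustration of the idea for this particular example: `general_position` and `solve_constant` are hypothetical names, and the approach assumes the general anti-derivative has already been found by hand.

```python
def velocity(t):
    """Velocity from Example 2.17: v(t) = t^2 + t + 1."""
    return t**2 + t + 1


def general_position(t, c):
    """General anti-derivative of v(t): t^3/3 + t^2/2 + t + c."""
    return t**3 / 3 + t**2 / 2 + t + c


def solve_constant(t0, r0):
    """Pick c so that general_position(t0, c) equals the known position r0."""
    return r0 - general_position(t0, 0.0)


if __name__ == "__main__":
    c = solve_constant(t0=0.0, r0=5.0)
    print("c    =", c)
    print("r(0) =", general_position(0.0, c))
    print("r(3) =", general_position(3.0, c))
```

The known point does not have to be at $t = 0$; any single (time, position) pair pins down the constant in the same way.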
While we have now covered all of the calculus one will need for an introductory physics course,
I would like to show you one application of calculus in deriving a formula you will soon come
to know very well. Recall in The Fundamental Theorem of Calculus we used the formula
\[
x(t) = x_i + vt
\]
Which tells us the position of an object moving at constant velocity at a variable time $t$ given an initial
position. An equation that works in the same manner is
\[
v(t) = v_i + at
\]
Which tells us the velocity of a constantly accelerating object at a variable time $t$ given an initial
velocity. Note that the equation given for position doesn’t work if the object is accelerating, as
that means the velocity is always changing. However, we can integrate both sides of the velocity
function to get a formula for the displacement of a constantly accelerating object.
\[
\int_{t_i}^{t_f} v(t)\,dt = \int_{t_i}^{t_f} v_i + at\,dt
\]
On the left hand side, we’re integrating a velocity function with respect to time. From the Fundamental Theorem of Calculus Part II we know that this integral will be the anti-derivative of
velocity with respect to time, which of course is position. The result will then be position evaluated
at the initial and final times.
\[
\begin{aligned}
\int_{t_i}^{t_f} v(t)\,dt &= x(t)\,\Big|_{t_i}^{t_f} \\
&= x(t_f) - x(t_i) \\
&= x_f - x_i
\end{aligned}
\]
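As for the right side of the equation, remember that $v_i$ is a given value, so it doesn’t change and
can be treated as a constant. Additionally, the velocity function given only works for constant
acceleration, so $a$ is constant as well.
\[
\begin{aligned}
\int_{t_i}^{t_f} v_i + at\,dt &= \left[v_i t + \frac{1}{2}at^2\right]_{t_i}^{t_f} \\
&= v_i(t_f - t_i) + \frac{1}{2}a\left(t_f^2 - t_i^2\right) \\
&= v_i(t_f - t_i) + \frac{1}{2}a(t_f - t_i)^2
\end{aligned}
\]
The last step takes $t_i = 0$, which is the time at which $v(t) = v_i + at$ gives the velocity $v_i$; only
then does $t_f^2 - t_i^2$ equal $(t_f - t_i)^2$.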
Note that tf − ti = ∆t, thus
\[
v_i (t_f - t_i) + \frac{1}{2} a (t_f - t_i)^2 = v_i \Delta t + \frac{1}{2} a \Delta t^2
\]
Putting both sides of the equation together, we obtain
\[
x_f - x_i = v_i \Delta t + \frac{1}{2} a \Delta t^2
\]
Solving for xf ,
\[
x_f = \frac{1}{2} a \Delta t^2 + v_i \Delta t + x_i
\]
This is a formula you’ll use often. Now you know where it comes from! Note that this equation
will often show up with varying notation, such as
\[
x = \frac{1}{2} a t^2 + v_i t + x_i
\]
The idea here is the same, it just reduces the notational baggage.
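If you would like to convince yourself that the formula really is the integral of the velocity function, the sketch below compares the two numerically. It is only an illustration: the function names and the sample values (an initial position of 2, an initial velocity of 3, an acceleration of −9.8, and a time of 1.5) are assumptions chosen for the example, and time is measured from $t = 0$.

```python
def position(t, x_i, v_i, a):
    """Closed-form result: x = (1/2) a t^2 + v_i t + x_i."""
    return 0.5 * a * t**2 + v_i * t + x_i


def integrate_velocity(t_f, x_i, v_i, a, steps=100_000):
    """Add up v(t) * dt for v(t) = v_i + a t from 0 to t_f (midpoint sum), starting at x_i."""
    dt = t_f / steps
    total = 0.0
    for k in range(steps):
        t_mid = (k + 0.5) * dt          # middle of the k-th time slice
        total += (v_i + a * t_mid) * dt  # velocity times a small time step
    return x_i + total


# Hypothetical values, chosen only for illustration
closed_form = position(1.5, 2.0, 3.0, -9.8)
numeric = integrate_velocity(1.5, 2.0, 3.0, -9.8)
print(closed_form, numeric)  # the two values should agree to many decimal places
```

The midpoint sum is just the "add up many thin rectangles" picture of the integral, so agreement between the two numbers is a sanity check on the derivation, not a proof.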
Section 2.7 Summary
• When using integrals in physical applications, only one choice of integration constant will yield an equation that accurately models a physical scenario
• Known conditions can be used to solve for the correct integration constant. For
example, if acceleration is being integrated, and the velocity at t = 0 is known,
then zero can be plugged into the integration result and c can be solved for.
Part Two: Calculus II
3 Additional Integration Techniques
Chapter 3 Significance
In Part One, we covered how to integrate polynomials, composite functions, and a selection of trigonometric functions. Now, we must find ways to integrate other common types
of functions, such as product functions, quotient functions, functions involving trigonometry that cannot be solved with differentiation formulas, etc. Unfortunately, due to the
integral’s infinite series definition, we cannot find nice general formulas for these kinds
of functions.
As a result, we will need to establish clever techniques that will allow us to integrate
these kinds of functions. In this chapter, we will introduce four techniques that may appear in second-level physics courses. Additionally, we will briefly cover three important
applications of integration that will enable us to drastically simplify various use cases of
integration in physics.
3.1 Integration by Parts
Section 3.1 Overview
Integration by parts is an integration technique that will allow us to analytically integrate
some integrands that are products of functions. It is analogous to the product rule for
differentiation. The objectives of this section are to teach the reader to:
• Identify when integration by parts can be used
• Perform integration by parts for indefinite integrals
• Perform integration by parts for definite integrals
Recall from Part One that we learned about the formula for differentiating the products of functions
shown below.
\[
\frac{d}{dx} \left[ f(x) \cdot g(x) \right] = f'g + fg'
\]
We can use this to create a formula for integrating the product of two functions as well. Let us
begin by letting u = f (x) and v = g(x) where both f and g are differentiable at x. We'll start by
integrating both sides with respect to x.
\[
(uv)' = u'v + uv'
\]
\[
\Downarrow
\]
\[
\int (uv)' \, dx = \int u'v \, dx + \int uv' \, dx
\]
Now substitute prime notation with Leibniz notation:
\[
\int \frac{d}{dx}(uv) \, dx = \int \frac{du}{dx} v \, dx + \int u \frac{dv}{dx} \, dx
\]
Notice all dx’s present will cancel. If you have not viewed Volume I and are troubled by the
treatment of differentials as variables, see Example 30 for the discussion that covers this topic.
\[
\int d(uv) = \int v \, du + \int u \, dv
\]
The left hand side of the equation only has a one in its integrand, so we obtain
\[
uv = \int v \, du + \int u \, dv
\]
If what we did on the left side of the equation is confusing to you, let $w = uv$. The differential
$d(uv) = dw$, thus the integral $\int dw = w + c = uv + c$. Note that we did not include a constant
of integration as there are other integrals present in our equation. The implication being that the
constant of integration from the left side of the equation will be included in the constants on the
right side.
Since we’re looking for a formula for the integral of products, we should solve for one of the two
terms on the right side of the equation. Which one we choose does not matter. Therefore,
\[
\int u \, dv = uv - \int v \, du \qquad (3.1)
\]
Keep in mind that dv represents the derivative of v or g(x) and that the dx is not present because
the dx from the integral canceled the dx from the derivative. So our formula in terms of f and g
is:
\[
\int f(x) g'(x) \, dx = f(x) g(x) - \int g(x) f'(x) \, dx
\]
Since our formula tells us only the integral of a function and another function's derivative, how it'll
be used may seem unclear. So, let's start with a simple example.
Example 3.1
Find
\[
\int x e^x \, dx
\]
The function we need to integrate is clearly a product of two functions, $x$ and $e^x$. We need
to figure out which of the two will be $u$ and which will be $dv$; that is, which will be $f$ and
which will be $g'$. We will need to be strategic about our choice here, so to best illustrate why
we will start with the correct choice of $u$ and $dv$, and then show what would have happened
with the opposite selections.
For reasons that will hopefully be apparent in the near future, we're going to let $u = x$, and
$dv = e^x \, dx$.
\[
\int x e^x \, dx = \int (x)(e^x \, dx) = \int u \, dv
\]
Notice the importance of including $dx$ in our choice of $dv$ as opposed to letting $dv = e^x$.
Had we not, the integrand wouldn't match our formula and would instead look like so:
\[
\int x e^x \, dx = \int (x)(e^x) \, dx = \int u \, dv \, dx \neq \int u \, dv
\]
Now that we have our choices of u and dv defined, we can use them to find the other variables
Eq. (3.1) requires, du and v.
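\[
u = x \quad \Rightarrow \quad \frac{du}{dx} = 1 \quad \Rightarrow \quad du = dx
\]
\[
dv = e^x \, dx \quad \Rightarrow \quad \int dv = \int e^x \, dx \quad \Rightarrow \quad v = e^x
\]
Now we plug all four variables into Eq. (3.1).
\[
\int u \, dv = uv - \int v \, du \quad \Rightarrow \quad \int x e^x \, dx = x e^x - \int e^x \, dx
\]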
The integral present in the right side of the equation is something we know how to compute,
so this gives us our answer.
\[
\int x e^x \, dx = x e^x - e^x + c = e^x (x - 1) + c
\]
Notice we waited until the end to include the constant of integration despite integrating to
obtain v. As usual, it's implied that the constants of integration are summed together and
represented by c.
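A quick way to check an answer like this is to differentiate it and see whether the integrand comes back. The sketch below does that numerically with a central difference; the function names, the step size `h`, and the sample points are my own choices for illustration, not part of the method itself.

```python
import math


def antiderivative(x):
    """Result of Example 3.1 with c = 0: e^x (x - 1)."""
    return math.exp(x) * (x - 1)


def integrand(x):
    """The function we set out to integrate: x e^x."""
    return x * math.exp(x)


def central_difference(f, x, h=1e-5):
    """Numerical estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Compare the slope of the answer against the integrand at a few sample points
sample_points = [-2.0, -0.5, 0.0, 1.0, 2.5]
max_error = max(
    abs(central_difference(antiderivative, x) - integrand(x)) for x in sample_points
)
print(max_error)  # should be tiny if the antiderivative is correct
```

This does not replace working the problem by hand, but it catches sign errors and dropped factors very quickly.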
As mentioned in the previous example, our choices of u and dv matter. To demonstrate why, we’ll
run through the previous example again, but with opposite choices of u and dv.
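\[
u = e^x \quad \Rightarrow \quad \frac{du}{dx} = e^x \quad \Rightarrow \quad du = e^x \, dx
\]
\[
dv = x \, dx \quad \Rightarrow \quad \int dv = \int x \, dx \quad \Rightarrow \quad v = \frac{1}{2} x^2
\]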
Just like before, we plug our four values into Eq. (3.1).
\[
\int x e^x \, dx = \frac{1}{2} x^2 e^x - \frac{1}{2} \int x^2 e^x \, dx
\]
As you can probably tell, the integral present on the right is still a product of two functions;
meaning we still can’t use any of our pre-existing tools to compute it. With this it becomes clear
that the primary goal in integration by parts is to make one of the two functions in the integrand
"go away". Since differentiation reduces the power of polynomial terms, it is almost always wise to
select a polynomial term as your choice of u, as that is the variable integration by parts requires
you to differentiate. There isn’t a perfect formula that will tell you which substitutions to make
when using this method, so it is incredibly important you practice this technique as it will help
you with making these decisions.
Stemming off of this previous example, let's look at how we can compute integrals where one
iteration of integration by parts isn't going to be enough.
Example 3.2
Find
\[
\int x^2 e^x \, dx
\]
Hopefully our last discussion has made it clear what our choices for u and dv should be here.
Let's write them below and find the other two variables we need. I'm going to write these in
a slightly different manner than last time as it’s a lot less cluttered and makes it very easy
to identify each variable when plugging them into our formula. If you need to see the work
I’m leaving out, go back over Example 3.1, as the methods are identical.
\[
\begin{aligned}
u &= x^2 & dv &= e^x \, dx \\
du &= 2x \, dx & v &= e^x
\end{aligned}
\]
\[
\Downarrow
\]
\[
\int x^2 e^x \, dx = x^2 e^x - 2 \int x e^x \, dx
\]
Despite making the appropriate substitutions, our result still contains an integral that is the
product of two functions. Isn’t this exactly what I told you to avoid? Not quite. You’ll notice
when I demonstrated the wrong choices, the power of our polynomial term was higher than
it originally was. Now it’s lower than it was before. This means if I perform integration by
parts again on that term, we’ll reduce it to a result with an integral we can easily compute.
In fact, we already have. The integral is exactly the same as the integral in Example 3.1,
just multiplied by two. Plugging our answer from Example 3.1 into our equation yields:
\[
\int x^2 e^x \, dx = x^2 e^x - 2 \left[ \int x e^x \, dx \right] = x^2 e^x - 2 e^x (x - 1) + c = e^x \left( x^2 - 2x + 2 \right) + c
\]
Sometimes you’ll need to use integration by parts a couple of times before you arrive at a
result with a ready-to-solve integral.
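The pattern in Examples 3.1 and 3.2 can be automated: every pass of Eq. (3.1) with $u$ equal to the polynomial and $dv = e^x \, dx$ trades $\int p(x) e^x \, dx$ for $p(x) e^x - \int p'(x) e^x \, dx$, so repeating until the polynomial differentiates to zero gives $\left( p - p' + p'' - \dots \right) e^x + c$. The sketch below carries this out on coefficient lists. The representation (a list `[a0, a1, a2, ...]` meaning $a_0 + a_1 x + a_2 x^2 + \dots$) and the function names are assumptions made for this illustration.

```python
def derivative(coeffs):
    """Differentiate a polynomial given as [a0, a1, a2, ...]."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def integrate_poly_times_exp(coeffs):
    """Return q (as coefficients) such that the integral of p(x) e^x is q(x) e^x + c.

    Each loop iteration corresponds to one application of integration by parts
    with u = current polynomial and dv = e^x dx.
    """
    result = [0] * len(coeffs)
    sign = 1
    term = list(coeffs)
    while term:                      # stop once the polynomial has been differentiated away
        for k, c in enumerate(term):
            result[k] += sign * c
        term = derivative(term)
        sign = -sign                 # signs alternate: p - p' + p'' - ...
    return result


print(integrate_poly_times_exp([0, 1]))     # p = x    gives [-1, 1],    i.e. x - 1
print(integrate_poly_times_exp([0, 0, 1]))  # p = x^2  gives [2, -2, 1], i.e. x^2 - 2x + 2
```

Both outputs match the hand-worked answers $e^x (x - 1)$ and $e^x (x^2 - 2x + 2)$.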
Not all uses of integration by parts will be so straightforward, so let's do two examples where we'll
need to be a little bit more creative.
Example 3.3
Find
\[
\int e^x \cos x \, dx
\]
Right away we can see this is the product of two functions, which is a clear indicator we’ll
likely need integration by parts. However, notice that if we differentiate or integrate either
of these terms, $e^x$ will always be $e^x$ and $\cos x$ will oscillate between $\pm \sin x$ and $\pm \cos x$ when
we apply integration by parts repeatedly. This is our first sign of trouble, and also tells us
it doesn't matter which substitutions we make. Thus, we will let $u = e^x$ and $dv = \cos x \, dx$.
\[
\begin{aligned}
u &= e^x & dv &= \cos x \, dx \\
du &= e^x \, dx & v &= \sin x
\end{aligned}
\]
\[
\Downarrow
\]
\[
\int e^x \cos x \, dx = e^x \sin x - \int e^x \sin x \, dx
\]
As before, the remaining integral is not immediately solvable and integrating by parts again
doesn't seem to solve this issue. However, look what happens when we do integrate by parts
again. We'll denote our new substitutions as $\bar{u}$ and $d\bar{v}$ to avoid confusion with the previous
substitutions.
\[
\begin{aligned}
\bar{u} &= e^x & d\bar{v} &= \sin x \, dx \\
d\bar{u} &= e^x \, dx & \bar{v} &= -\cos x
\end{aligned}
\]
\[
\Downarrow
\]
\[
\int e^x \cos x \, dx = e^x \sin x - \left[ -e^x \cos x - \int -e^x \cos x \, dx \right] = e^x \sin x + e^x \cos x - \int e^x \cos x \, dx
\]
We now have $\int e^x \cos x \, dx$ on both sides of our equation, which means we can add this
integral to both sides and then solve for it.
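\[
\int e^x \cos x \, dx = e^x \sin x + e^x \cos x - \int e^x \cos x \, dx
\]
\[
\Downarrow
\]
\[
2 \int e^x \cos x \, dx = e^x \sin x + e^x \cos x + c
\]
\[
\Downarrow
\]
\[
\int e^x \cos x \, dx = \frac{1}{2} \left( e^x \sin x + e^x \cos x \right) + c = \frac{e^x}{2} \left( \sin x + \cos x \right) + c
\]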
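Since the trick of solving for the integral algebraically can feel a little suspicious the first time, here is a numerical sanity check. It approximates $\int_0^2 e^x \cos x \, dx$ with the composite Simpson's rule (a standard numerical method, not something this document has covered) and compares it with our antiderivative evaluated at the bounds. The interval $[0, 2]$, the number of subintervals, and the function names are assumptions chosen for this sketch.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)  # weights 4, 2, 4, 2, ...
    return total * h / 3


def integrand(x):
    return math.exp(x) * math.cos(x)


def antiderivative(x):
    """Result of Example 3.3 with c = 0."""
    return 0.5 * math.exp(x) * (math.sin(x) + math.cos(x))


numeric = simpson(integrand, 0.0, 2.0)
exact = antiderivative(2.0) - antiderivative(0.0)
print(numeric, exact)  # the two values should agree closely
```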
Example 3.4
Find
\[
\int \log_a x \, dx
\]
This is not a straightforward problem, as this integral doesn't really have two functions
in it. However, we can express $\log_a x$ as $1 \cdot \log_a x$. Since we will need to integrate $dv$, it would
not be wise to let $dv = \log_a x \, dx$. However, we do know how to differentiate this logarithm, so
letting $u = \log_a x$ presents no problems.
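\[
\begin{aligned}
u &= \log_a x & dv &= 1 \, dx \\
du &= \frac{1}{x \ln a} \, dx & v &= x
\end{aligned}
\]
\[
\Downarrow
\]
\[
\begin{aligned}
\int \log_a x \, dx &= x \log_a x - \int \frac{x}{x \ln a} \, dx \\
&= x \log_a x - \int \frac{1}{\ln a} \, dx \\
&= x \log_a x - \frac{1}{\ln a} \int dx \\
&= x \log_a x - \frac{x}{\ln a} + c \\
&= x \left( \log_a x - \frac{1}{\ln a} \right) + c
\end{aligned}
\]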
For our last example we’ll briefly show how to compute definite integrals with integration by parts,
although all we really need to do is apply the Fundamental Theorem of Calculus.
Example 3.5
Find
\[
\int_0^{\pi} x \cos 2x \, dx
\]
Nothing new here as far as technique. Determine the optimal substitutions, find the remaining variables.
\[
\begin{aligned}
u &= x & dv &= \cos 2x \, dx \\
du &= dx & v &= \frac{1}{2} \sin 2x
\end{aligned}
\]
Notice we needed to perform u-substitution to find v.
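\[
\begin{aligned}
\int_0^{\pi} x \cos 2x \, dx &= \left[ \frac{1}{2} x \sin 2x \right]_0^{\pi} - \frac{1}{2} \int_0^{\pi} \sin 2x \, dx \\
&= \left[ \frac{1}{2} x \sin 2x + \frac{1}{4} \cos 2x \right]_0^{\pi} \\
&= \left( 0 + \frac{1}{4} \right) - \left( 0 + \frac{1}{4} \right) \\
&= 0
\end{aligned}
\]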
As you can see, it's not much different from indefinite integration by parts.
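For definite integrals, Eq. (3.1) becomes a boundary term $[uv]_a^b$ minus the leftover integral $\int_a^b v \, du$. The sketch below evaluates those two pieces separately for Example 3.5, using a simple midpoint sum for the leftover integral. The helper name `by_parts_definite`, its parameters, and the step count are hypothetical choices for this illustration.

```python
import math


def by_parts_definite(u, v, du_dx, a, b, steps=20_000):
    """Approximate the integral of u dv on [a, b] as [u v] from a to b minus the integral of v du.

    u and v are functions of x, and du_dx is the derivative of u.
    The leftover integral is estimated with a midpoint sum.
    """
    boundary = u(b) * v(b) - u(a) * v(a)
    h = (b - a) / steps
    remaining = 0.0
    for k in range(steps):
        x_mid = a + (k + 0.5) * h
        remaining += v(x_mid) * du_dx(x_mid) * h
    return boundary - remaining


# Example 3.5: u = x, dv = cos(2x) dx, so du = dx and v = (1/2) sin(2x)
result = by_parts_definite(
    u=lambda x: x,
    v=lambda x: 0.5 * math.sin(2 * x),
    du_dx=lambda x: 1.0,
    a=0.0,
    b=math.pi,
)
print(result)  # should be zero up to rounding error
```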
Like u-substitution, there is no formula that tells you what choices to make for your substitutions.
As you practice this technique, you will grow a natural sense for what choices of u and dv will
make the problem solvable.
Section 3.1 Summary
• Integration by parts is a technique to integrate the products of some functions.
• Integration by parts is given by Eq. (3.1) and requires you to substitute u and dv
for each function in the integrand.
• The choice for u will need to be differentiated, and dv will need to be integrated.
The results of each operation will be multiplied and then integrated as part of this
procedure.
• To determine what choices will result in a solvable integral, consider whether or
not one function will reduce to one when differentiated one or several times. Additionally, ensure that your choices yield a product v du such that you are able to
integrate that product with pre-existing integration techniques.
• Integrands that contain functions that either oscillate (sine or cosine) or are only
changed by a constant factor (eax ) will require several iterations of integration by
parts. Once the remaining integral in Eq. (3.1) contains the starting integral (i.e.
the integral you’re trying to solve for), you can solve for it algebraically as shown
in Example 3.3.
Practice problems: 1-3, 5
3.2 Trigonometric Integration
Section 3.2 Overview
The focus of this section is integrals involving trigonometric functions that are not immediately solvable with previous techniques. This section will:
• Show how a selection of trigonometric identities can be used to evaluate integrals
that are not solvable with previous methods.
• Teach the reader how to identify which identities to use in various scenarios.
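\[
\begin{aligned}
\int \cos x \, dx &= \sin x + c & (3.2) & \qquad & \int \sec x \tan x \, dx &= \sec x + c & (3.3) \\
\int \sin x \, dx &= -\cos x + c & (3.4) & \qquad & \int \csc x \cot x \, dx &= -\csc x + c & (3.5) \\
\int \sec^2 x \, dx &= \tan x + c & (3.6) & \qquad & \int \csc^2 x \, dx &= -\cot x + c & (3.7)
\end{aligned}
\]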
While we have covered how to integrate the above trigonometric functions in Chapter 2, we need
to address how to integrate functions which do not follow these forms in more depth. You’ll recall
from Chapter 2 that we used u-substitution to do this, however in this section the trigonometric
functions will not be solvable by u-substitution alone. The trigonometric identities below will help
us manipulate trigonometric functions into forms we know how to integrate.
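\[
\begin{aligned}
\cos^2 \theta + \sin^2 \theta &= 1 & (3.1) & \qquad & 1 + \tan^2 \theta &= \sec^2 \theta & (3.2) \\
\cot^2 \theta + 1 &= \csc^2 \theta & (3.3) & \qquad & \cos^2 \theta &= \frac{1}{2} \left( 1 + \cos 2\theta \right) & (3.4) \\
\sin^2 \theta &= \frac{1}{2} \left( 1 - \cos 2\theta \right) & (3.5) & \qquad & \sin 2\theta &= 2 \sin \theta \cos \theta & (3.6)
\end{aligned}
\]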
Note that Identity 3.2 and Identity 3.3 are derived from Identity 3.1 by dividing both sides by $\cos^2 \theta$
and $\sin^2 \theta$, respectively. Additionally, the definitions of the reciprocal trigonometric functions will
also often be helpful. If you need a refresher on what those definitions are, they are listed below.
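\[
\sec \theta = \frac{1}{\cos \theta} \qquad \csc \theta = \frac{1}{\sin \theta} \qquad \cot \theta = \frac{1}{\tan \theta} = \frac{\cos \theta}{\sin \theta}
\]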
There is no formula that tells you how to use these equations, identities, and definitions in order
to integrate trigonometric functions. The best way to learn how to solve these types of problems is
to practice. They will initially take some trial and error, but once you do some practice problems,
you’ll begin to get a sense for what approach should be used to solve a problem. There are some
general guidelines you can follow in order to choose the right solution method, however, I believe
that your time is better spent improving your sense for these problems instead of memorizing a
list of procedures. If you insist on memorizing those procedures, you can find them here.
In summary, the procedures used in the following examples are going to seem random. But as
you practice these problems more, you will begin to grow a sense of when and why to use various
formulas in solving these integrals.
Example 3.6
Evaluate
\[
\int \sin^3 x \cos^3 x \, dx
\]
First, we notice that this integrand is indeed not solvable with the sole use of one of the
equations provided at the start of this section, so we need to manipulate the integrand. We
can begin by expanding $\sin^3 x$.
\[
\int \sin^3 x \cos^3 x \, dx = \int \sin x \sin^2 x \cos^3 x \, dx
\]
This might seem like an arbitrary move at first, but we can now use Identity 3.1. By this
identity, $\sin^2 x = 1 - \cos^2 x$, so
\[
\int \sin x \sin^2 x \cos^3 x \, dx = \int \sin x \left( 1 - \cos^2 x \right) \cos^3 x \, dx
\]
Again, this may seem like an arbitrary move, but this integrand is now solvable by u-substitution. If we let $u = \cos x$, then the $\sin x$ at the beginning of the integrand will cancel
out, and we will only be left with a polynomial function in terms of $u$.
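\[
u = \cos x \quad \Rightarrow \quad du = -\sin x \, dx \quad \Rightarrow \quad dx = -\frac{du}{\sin x}
\]
\[
\begin{aligned}
\int \sin x \left( 1 - \cos^2 x \right) \cos^3 x \, dx &= \int \sin x \left( 1 - u^2 \right) u^3 \left( -\frac{du}{\sin x} \right) \\
&= -\int \left( 1 - u^2 \right) u^3 \, du \\
&= -\int u^3 - u^5 \, du \\
&= -\left( \frac{1}{4} u^4 - \frac{1}{6} u^6 \right) + c \\
&= -\frac{1}{4} \cos^4 x + \frac{1}{6} \cos^6 x + c
\end{aligned}
\]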
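This problem is a good way of demonstrating that there is more than one way to solve these
types of problems. For example, we could have expanded $\cos^3 x$ instead of $\sin^3 x$. The result
looks like:
\[
\int \sin^3 x \cos^3 x \, dx = \int \sin^3 x \cos^2 x \cos x \, dx = \int \sin^3 x \left( 1 - \sin^2 x \right) \cos x \, dx
\]
\[
u = \sin x \quad \Rightarrow \quad du = \cos x \, dx \quad \Rightarrow \quad dx = \frac{du}{\cos x}
\]
\[
\begin{aligned}
\int \sin^3 x \left( 1 - \sin^2 x \right) \cos x \, dx &= \int u^3 \left( 1 - u^2 \right) du \\
&= \int u^3 - u^5 \, du \\
&= \frac{1}{4} u^4 - \frac{1}{6} u^6 + c \\
&= \frac{1}{4} \sin^4 x - \frac{1}{6} \sin^6 x + c
\end{aligned}
\]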
While these two functions are not identical, a quick plot of the two functions shows that
they differ only by an additive constant, and thus any difference between the solutions is
compensated by the arbitrary constant. If this fact is unclear to you, consider the discussion
regarding arbitrary constants from Applying Integrals to Physics. If we were to use either
of the two above equations to model some physical scenario, perhaps the position of an
electron, the constant of integration would be different for each solution. When we apply
each constant to its respective solution, the equations become identical to one another.
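If you don't have a plotting tool handy, the same observation can be made by evaluating both answers at a handful of points and looking at the difference. In the sketch below, the function names and sample points are my own choices; the value $1/12$ comes from comparing the two answers at $x = 0$, where the sine version is $0$ and the cosine version is $-\frac{1}{4} + \frac{1}{6} = -\frac{1}{12}$.

```python
import math


def via_cos(x):
    """Answer from the substitution u = cos x (with c = 0)."""
    return -math.cos(x) ** 4 / 4 + math.cos(x) ** 6 / 6


def via_sin(x):
    """Answer from the substitution u = sin x (with c = 0)."""
    return math.sin(x) ** 4 / 4 - math.sin(x) ** 6 / 6


# The difference should be the same number no matter which x we pick
differences = [via_sin(x) - via_cos(x) for x in (0.0, 0.3, 1.0, 2.0, 4.5)]
print(differences)  # every entry should be (approximately) 1/12
```

A constant difference is exactly what the arbitrary constant $c$ absorbs.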
Example 3.7
Evaluate
\[
\int \sin^2 (2x) \, dx
\]
While we could expand this integrand with the use of Identity 3.1, the result would be no
easier to evaluate. Instead, we will use Identity 3.5, as this will yield an integral we can
evaluate.
\[
\int \sin^2 (2x) \, dx = \int \frac{1}{2} \left( 1 - \cos 4x \right) dx
\]
The first term is easily solvable, and the second requires the basic substitution u = 4x.
\[
\int \frac{1}{2} \left( 1 - \cos 4x \right) dx = \frac{1}{2} \left( x - \frac{1}{4} \sin 4x \right) + c = \frac{1}{2} x - \frac{1}{8} \sin 4x + c
\]
Example 3.8
Evaluate
\[
\int \sin^2 x \cos^2 x \, dx
\]
Here, Identity 3.1, Identity 3.4, and Identity 3.5 will not turn this integrand into one that
we know how to evaluate. However, we can use Identity 3.6 to get our integrand into a form
where Identity 3.5 can be used. By Identity 3.6,
\[
\sin x \cos x = \frac{1}{2} \sin 2x
\]
Applying this to the integrand, we obtain
\[
\int \sin^2 x \cos^2 x \, dx = \int \left( \sin x \cos x \right)^2 dx = \int \left( \frac{1}{2} \sin 2x \right)^2 dx = \frac{1}{4} \int \sin^2 (2x) \, dx
\]
Now we use Identity 3.5. Note that we’ll need to make the substitution u = 4x in the
following steps.
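\[
\frac{1}{4} \int \sin^2 (2x) \, dx = \frac{1}{4} \int \frac{1}{2} \left( 1 - \cos 4x \right) dx = \frac{1}{8} \left( x - \frac{1}{4} \sin 4x \right) + c = \frac{1}{8} x - \frac{1}{32} \sin 4x + c
\]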
The previous three examples demonstrate some of the guidelines for solving these problems. When
integrating a function containing sine or cosine to some odd power, Identity 3.1 can be used to
expand the odd power into the product of an even power and a power of one (see Example 3.6).
The remaining sine/cosine term will cancel with u-substitution, resulting in a polynomial function
of u. In Example 3.7 and Example 3.8, we saw that when the integrand contains sine or cosine with
even powers, then Identity 3.1 will not be useful. Instead, we use some combination of Identity
3.4, Identity 3.5, or Identity 3.6.
Again, I feel that it is more efficient to practice these problems and naturally obtain a feel for
which identities are useful for certain problems than it is to memorize lists of these guidelines.
Example 3.9
Evaluate
\[
\int \tan^2 x \, dx
\]
This integral is actually quite simple. All we need to do is utilize Identity 3.2.
\[
\int \tan^2 x \, dx = \int \sec^2 x - 1 \, dx
\]
The first term in the integrand is solvable by Eq. (3.6).
\[
\int \sec^2 x - 1 \, dx = \tan x - x + c
\]
Example 3.10
Evaluate
\[
\int \cot^4 x \, dx
\]
Begin by using Identity 3.3.
\[
\int \cot^4 x \, dx = \int \left( \csc^2 x - 1 \right) \cot^2 x \, dx
\]
We can not yet use u-substitution, so we need to further manipulate the integrand. If we
distribute cot2 x, we will get two integrals that are separately solvable.
\[
\int \left( \csc^2 x - 1 \right) \cot^2 x \, dx = \int \csc^2 x \cot^2 x - \cot^2 x \, dx = \int \csc^2 x \cot^2 x \, dx - \int \cot^2 x \, dx
\]
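Solving for the first integral, we can make the substitution $u = \cot x$.
\[
u = \cot x \quad \Rightarrow \quad du = -\csc^2 x \, dx \quad \Rightarrow \quad dx = -\frac{du}{\csc^2 x}
\]
\[
\int \csc^2 x \cot^2 x \, dx = -\int u^2 \, du = -\frac{1}{3} u^3 + c_1 = -\frac{1}{3} \cot^3 x + c_1
\]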
For the second integral, we take a similar approach to Example 3.9 by using Identity 3.3 and
Eq. (3.7).
\[
\int \cot^2 x \, dx = \int \csc^2 x - 1 \, dx = -\cot x - x + c_2
\]
Summing the results yields
\[
\int \cot^4 x \, dx = -\frac{1}{3} \cot^3 x + \cot x + x + c
\]
These problems can be very tricky at first, so it's very important that you practice a lot of these
problems. Think of the integration formulas, identities, and definitions as tools that help you
integrate trigonometric functions. The best way to become skillful in using a tool is to practice
using it.
Lastly, definite integration of these types of functions is no different than shown in U-Substitution.
For a refresher, see Example 2.14.
Section 3.2 Summary
• The integration formulas discussed in Part One can be used in combination with
Identity 3.1 - Identity 3.6 in order to manipulate and evaluate integrals involving
trigonometry.
• The best way to determine which formulas should be used for a given problem is
to practice these types of problems.
Practice Problems: 1-3, 5, 6
3.3 Integration with Trigonometric Substitutions
Section 3.3 Overview
In this section, we will learn to use preceding trigonometric identities in combination with
substitution techniques in order to evaluate integrals. The primary goal of this section
is to show how trigonometric substitutions can simplify integration and discuss how to
deal with the results.
Let us begin with a simple example:
\[
\int \frac{1}{\sqrt{1 - x^2}} \, dx
\]
Some of you may already recognize this integral from Eq. (2.18), however, let's pretend we are not
aware of this formula for the sake of example. Watch what happens when we let $x = \sin \theta$. First,
we notice that
\[
x = \sin \theta \quad \Rightarrow \quad \sqrt{1 - x^2} = \sqrt{1 - \sin^2 \theta}
\]
We know from Identity 3.1 that $1 - \sin^2 \theta = \cos^2 \theta$, so
\[
x = \sin \theta \quad \Rightarrow \quad \frac{1}{\sqrt{1 - x^2}} = \frac{1}{\cos \theta}
\]
While this is much easier to handle, we must recognize that we are now left with
\[
\int \frac{1}{\cos \theta} \, dx
\]
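The issue is that our function is in terms of $\theta$, but our integral is with respect to $x$. From U-Substitution, we know how to handle this. We simply differentiate our substitution. Thus,
\[
x = \sin \theta \quad \Rightarrow \quad dx = \cos \theta \, d\theta \quad \Rightarrow \quad \int \frac{1}{\sqrt{1 - x^2}} \, dx = \int \frac{1}{\cos \theta} \left( \cos \theta \, d\theta \right)
\]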
This integral is now ready to be solved.
\[
\int \frac{1}{\cos \theta} \left( \cos \theta \, d\theta \right) = \int \frac{\cos \theta}{\cos \theta} \, d\theta = \int d\theta = \theta + c
\]
From our initial substitution,
\[
x = \sin \theta \quad \Rightarrow \quad \theta = \sin^{-1} x
\]
Which gives us
\[
\int \frac{1}{\sqrt{1 - x^2}} \, dx = \sin^{-1} x + c
\]
Take a moment to appreciate what we have just done here. From our work, we can say that
the anti-derivative of $1 / \sqrt{1 - x^2}$ is $\sin^{-1} x$, which means we have just proven the formula for the
derivative of $\sin^{-1} x$ using nothing but U-substitution and some basic trigonometry. We can use
this approach to prove the derivative formulas for the rest of the inverse trigonometric functions
too, but I will leave that to you.
Note that we also could have substituted $x = \cos \theta$. The result would have been $-\cos^{-1} x + c$, which
differs from $\sin^{-1} x + c$ only by a constant ($\pi / 2$) that is absorbed by $c$. So, in sum, trigonometric substitutions are substitutions that allow
us to manipulate our integrands into forms where we can use trigonometric identities. Like
trigonometric integration, these can be tricky at first. The best way to get comfortable with them
is by practice. Let's rework this problem, but with a slight modification.
Example 3.11
Evaluate
\[
\int \frac{1}{\sqrt{1 - 16x^2}} \, dx
\]
This integral is nearly the same as the one we just did, only this time we have a coefficient
on $x^2$. The process will be the same, but our substitution will have to be $16x^2 = \sin^2 \theta$.
That way we can use trigonometric identities to reduce the integrand. Since the technique
is the same as the last problem, I’ll run through this fairly quickly.
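\[
16x^2 = \sin^2 \theta \quad \Rightarrow \quad x = \frac{1}{4} \sin \theta \quad \Rightarrow \quad \theta = \sin^{-1} (4x), \qquad dx = \frac{1}{4} \cos \theta \, d\theta
\]
\[
\begin{aligned}
\int \frac{1}{\sqrt{1 - 16x^2}} \, dx &= \int \frac{1}{\sqrt{1 - \sin^2 \theta}} \, \frac{1}{4} \cos \theta \, d\theta \\
&= \frac{1}{4} \int \frac{\cos \theta}{\cos \theta} \, d\theta \\
&= \frac{1}{4} \theta + c \\
&= \frac{1}{4} \sin^{-1} (4x) + c
\end{aligned}
\]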
These integrals are not always so straightforward. Often, we will get a messy answer that requires
us to use trigonometry in order to clean up.
Example 3.12
Evaluate
\[
\int \sqrt{1 - 25x^2} \, dx
\]
This problem has an identical start to the last two. We need to let $25x^2 = \sin^2 \theta$ (or $\cos^2 \theta$)
so that the radical will reduce to a trig function.
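\[
25x^2 = \sin^2 \theta \quad \Rightarrow \quad x = \frac{1}{5} \sin \theta \quad \Rightarrow \quad \theta = \sin^{-1} (5x), \qquad dx = \frac{1}{5} \cos \theta \, d\theta
\]
\[
\int \sqrt{1 - 25x^2} \, dx = \int \sqrt{1 - \sin^2 \theta} \, \frac{1}{5} \cos \theta \, d\theta = \frac{1}{5} \int \cos \theta \cos \theta \, d\theta = \frac{1}{5} \int \cos^2 \theta \, d\theta
\]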
Now we use Identity 3.4
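\[
\frac{1}{5} \int \cos^2 \theta \, d\theta = \frac{1}{5} \int \frac{1}{2} \left( 1 + \cos (2\theta) \right) d\theta = \frac{1}{10} \int 1 + \cos (2\theta) \, d\theta = \frac{1}{10} \left( \theta + \frac{1}{2} \sin (2\theta) \right) + c
\]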
While this may seem like a nice answer, recall that $\theta = \sin^{-1} (5x)$, which means our answer
is
\[
\frac{1}{10} \left[ \sin^{-1} (5x) + \frac{1}{2} \sin \left( 2 \sin^{-1} (5x) \right) \right] + c
\]
which is rather messy. To fix this, let us return to our answer in terms of θ and use Identity
3.6. Doing so yields
\[
\frac{1}{10} \left( \theta + \sin \theta \cos \theta \right) + c
\]
We already know that $\theta = \sin^{-1} (5x)$, and we also know that $x = \frac{1}{5} \sin \theta$, which means
$\sin \theta = 5x$. This allows us to convert most of our answer into clean terms of $x$, but we're still
stuck with $\cos \theta$. However, we can use trigonometry to solve for $\cos \theta$ in terms of $x$. This
process is shown below:
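\[
\sin \theta = \frac{5x}{1} = \frac{\text{Opposite}}{\text{Hypotenuse}}
\]
[Figure: right triangle with angle $\theta$, hypotenuse 1, opposite side $5x$, and adjacent side $A$]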
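To find the adjacent side, $A$, we just use the Pythagorean theorem.
\[
A^2 + (5x)^2 = 1 \quad \Rightarrow \quad A = \sqrt{1 - 25x^2}
\]
\[
\cos \theta = \frac{\text{Adjacent}}{\text{Hypotenuse}} \quad \Rightarrow \quad \cos \theta = \frac{\sqrt{1 - 25x^2}}{1} = \sqrt{1 - 25x^2}
\]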
Plugging this all into our answer, we get
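\[
\theta = \sin^{-1} (5x), \qquad \sin \theta = 5x, \qquad \cos \theta = \sqrt{1 - 25x^2}
\]
\[
\Rightarrow \quad \frac{1}{10} \left( \theta + \sin \theta \cos \theta \right) + c = \frac{1}{10} \left[ \sin^{-1} (5x) + 5x \sqrt{1 - 25x^2} \right] + c
\]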
which gives us our final answer. These problems can appear long and complicated, but
they become much easier once you do a few of them.
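With this many steps, it is worth checking the final answer numerically. The sketch below compares a midpoint sum of $\sqrt{1 - 25x^2}$ on $[0, 0.1]$ with our antiderivative evaluated at the same bounds. The interval (chosen to stay well inside the domain $|x| < 1/5$), the step count, and the function names are assumptions made for this illustration.

```python
import math


def integrand(x):
    return math.sqrt(1 - 25 * x**2)


def antiderivative(x):
    """Result of Example 3.12 with c = 0."""
    return (math.asin(5 * x) + 5 * x * math.sqrt(1 - 25 * x**2)) / 10


def midpoint_sum(f, a, b, steps=20_000):
    """Approximate the integral of f on [a, b] by summing thin rectangles."""
    h = (b - a) / steps
    return sum(f(a + (k + 0.5) * h) for k in range(steps)) * h


numeric = midpoint_sum(integrand, 0.0, 0.1)
from_formula = antiderivative(0.1) - antiderivative(0.0)
print(numeric, from_formula)  # the two values should agree closely
```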
3.4 Integration with Partial Fractions
3.5 Improper Integrals
Part Three: Calculus III
4 Review of Vector Basics
Chapter 4 Significance
Vectors are a type of mathematical object that are particularly useful in physical applications, as they allow us to analyze physical phenomena one dimension at a time. In
the context of calculus, vectors will allow us to compute integrals over two- and three-dimensional paths or surfaces, as well as analyze how fields change with position or time
– a crucial ability in the study of electricity and magnetism.
As this document is intended for second-level physics/engineering courses, it is likely that
you are already somewhat familiar with vectors. If this is the case, I recommend reading
the section summaries to gauge whether or not you need to review any of the material
included in this chapter.
4.1 Introduction to Vectors
Section 4.1 Overview
This section will introduce the fundamental concepts and equations of two-dimensional
vectors. During this section, the reader should aim to be able to:
• Describe what a vector is in a conceptual sense.
• Explain what the difference between a scalar quantity and a vector quantity is.
• Calculate a vector’s components based off of its direction.
• Calculate a vector’s magnitude based off of its components.
• Calculate a vector’s angle based off of its components.
• Understand why vectors are useful in physics.
• Understand conceptually and symbolically what it means to scale a vector.
This chapter will take a brief detour from calculus and instead focus on a topic that you likely
already have some familiarity with. Our reason for doing so is to learn how calculus can be applied
to vectors. Vectors can be interpreted in a variety of ways, such as arrows in space that have
various lengths and directions, or as ordered lists of numbers and other mathematical objects. The
interpretation that is most applicable to physics is the first and will therefore be the focus of this
chapter and the next chapter.
Consider a scenario where someone or something is applying a force on an object. While it is useful to know
how strong that force is, its direction is also important.
Vectors are a way of packaging the strength of the force
and its direction into one object.
[Figure 4.1: Three two-dimensional vectors, $\vec{u}$, $\vec{v}$, and $\vec{w}$, with different magnitudes and directions]
In Figure 4.1 we can
see that each vector has a different length and points in
a different direction. The length of a vector is called its
magnitude, and can be denoted as shown below.
Magnitude of vector ⃗v = ∥⃗v∥ = |⃗v| = v
As we can see, there are a few ways of denoting magnitude. In the first, two vertical bars are placed
on each side of the vector, while only one vertical bar is used on each side in the second notation.
While the latter can be easily confused with absolute value, you can normally deduce from context
which is being referenced. Notice that vectors are also denoted with an arrow above them. This
arrow makes it clear that the quantity is a vector quantity. In physics, it's very common to drop the
arrow when talking about a vector's magnitude, as it reduces the amount of notational baggage.
A number without an arrow above it is considered a scalar quantity, and as the name suggests, they scale vectors. They are the numbers you are more familiar with, as they only
give magnitude and not direction. When we
multiply a vector by a scalar quantity, the vector will either shrink or grow as shown in Figure 4.2. Note that when we scaled $\vec{w}$, it only grew or shrank in size; that is, its magnitude changed,
but its direction remained the same. We can express the direction of a vector by either giving
its angle on the unit circle, or we can determine the vector's unit vector, which we will get into
later. Knowing the angle of a vector allows us to break it down into components with the use of
trigonometry.
[Figure 4.2: $\vec{w}$ from Figure 4.1 scaled by three (top) and one half (bottom)]
As shown in Figure 4.3, if we know the magnitude and the direction of a vector, we can
break it down into a right triangle where the
hypotenuse has length equal to the magnitude of the vector. The legs of this triangle
are the components of the vector. Notice
that when we consider a vector's angle to
be the angle it makes with the horizontal,
the components of the vector each span one
dimension. That is, one leg exists in the x-direction and the other exists in the y-direction. Each of these legs are referred to as the x and
y-components, respectively.
[Figure 4.3: vector $\vec{u}$ with magnitude $\|\vec{u}\|$ at angle $\theta = \frac{\pi}{4}$ rad $= 45^\circ$, drawn with its components $u_x$ and $u_y$]
To determine the magnitudes of these components, $u_x$ and $u_y$, we can
use trigonometry. By the definitions for $\cos \theta$ and $\sin \theta$,
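\[
\cos \theta = \frac{\text{adjacent}}{\text{hypotenuse}} = \frac{u_x}{\|\vec{u}\|} \quad \Rightarrow \quad u_x = \|\vec{u}\| \cos \theta
\]
\[
\sin \theta = \frac{\text{opposite}}{\text{hypotenuse}} = \frac{u_y}{\|\vec{u}\|} \quad \Rightarrow \quad u_y = \|\vec{u}\| \sin \theta
\]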
The ability to decompose vectors into x and y-components will prove to be useful in upcoming
sections. To see why, consider an object that has several forces, all with different magnitudes and
directions, acting on it. Decomposing each vector will allow us to analyze the combination of forces
one dimension at a time, thereby reducing the problem to two one-dimensional problems instead
of one two-dimensional problem.
\[
v_x = \|\vec{v}\| \cos \theta \qquad (4.1)
\]
\[
v_y = \|\vec{v}\| \sin \theta \qquad (4.2)
\]
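As a small illustration of Eq. (4.1) and Eq. (4.2), the sketch below decomposes a vector into its components. The function name `components` and the sample vector (magnitude 2 at $\theta = \pi/4$) are assumptions for the example; note that Python's trigonometric functions expect the angle in radians.

```python
import math


def components(magnitude, angle_rad):
    """Return (v_x, v_y) for a vector of the given magnitude and angle (in radians)."""
    v_x = magnitude * math.cos(angle_rad)  # Eq. (4.1)
    v_y = magnitude * math.sin(angle_rad)  # Eq. (4.2)
    return v_x, v_y


# Hypothetical vector: magnitude 2 at 45 degrees above the horizontal
u_x, u_y = components(2.0, math.pi / 4)
print(u_x, u_y)  # both components are approximately sqrt(2)
```

If your angle is given in degrees, convert it first with `math.radians`.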
The decomposition of vectors also gives us a way to
represent them symbolically, without explicitly stating
their magnitudes or directions. Instead, we list their x
and y-components. In Figure 4.4, we have four vectors
originating from the origin with their components listed
in column matrices. We can think of the components as
instructions for where to move from the origin to create
the vector. For example, for the vector in the first quadrant, its components tell us to start at the origin, move
three to the right, then move four up. We then draw an
arrow from the origin to that point in space, (3, 4) in
this case, to get our vector. For a detailed visualization
of this process, see Animation 4.1.
[Figure 4.4: Representing vectors by their components. The four vectors shown are $\begin{pmatrix} -4 \\ 5 \end{pmatrix}$, $\begin{pmatrix} 3 \\ 4 \end{pmatrix}$, $\begin{pmatrix} -1 \\ -3 \end{pmatrix}$, and $\begin{pmatrix} 5 \\ -3 \end{pmatrix}$]
The use of column matrices is not the only way to symbolically represent a vector. The vector $\begin{pmatrix} 3 \\ 4 \end{pmatrix}$ can be
represented in any of the following ways:
\[
\vec{v} = \begin{pmatrix} 3 \\ 4 \end{pmatrix} \qquad \vec{v} = \begin{bmatrix} 3 \\ 4 \end{bmatrix} \qquad \vec{v} = \langle 3, 4 \rangle \qquad \vec{v} = 3 \hat{\imath} + 4 \hat{\jmath} \qquad \vec{v} = 5 \angle 53^\circ
\]
[Animation 4.1: Using vector components as instructions on where to move from the origin to create a vector]
The first two representations from above are both vectors as column matrices, and there is no difference
between the brackets and parentheses. These two representations are commonly used in linear
algebra, and they’ll be the ones I use most often in this document. The third is commonly used
in calculus courses and is similar to the first two, but it instead uses angle brackets and lists the
components similar to how points in space are represented. The fourth representation is commonly
used in physics and engineering courses, though it requires some additional understanding about
vectors, so we’ll hold off on using it or explaining it for now. The final representation only shows
the magnitude of the vector along with its angle and is generally a more informal representation.
Displaying column vectors in paragraphs can be a bit problematic due to their height. As a result,
when included in paragraphs, they are often represented as transposed matrices, which look like
so:
\[ \vec{v} = (3 \;\; 4)^T \qquad \vec{v} = [3 \;\; 4]^T \]
All you need to know is that the ‘T’ is the transpose operator, and when it is applied to a column matrix, it has the effect of tipping it on its side. Applying it to the resulting row matrix stands it back up. In symbolic terms,
\[ (3 \;\; 4)^T = \begin{pmatrix} 3 \\ 4 \end{pmatrix} \]
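If we are only given the components of a vector, we can use the Pythagorean theorem to find its magnitude. From Figure 4.3b,
\[ \lVert \vec{u} \rVert^2 = u_x^2 + u_y^2 \quad\Rightarrow\quad \lVert \vec{u} \rVert = \sqrt{u_x^2 + u_y^2} \]
\[ \lVert \vec{v} \rVert = \sqrt{v_x^2 + v_y^2} \tag{4.3} \]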
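Figure 4.3b also tells us how to find a vector’s angle using its components. Recall that,
\[ \tan\theta = \frac{\text{opposite}}{\text{adjacent}} \]
Since we already know both the opposite and adjacent sides of the triangle, uy and ux respectively, we can easily find a formula for θ.
\[ \tan\theta = \frac{\text{opposite}}{\text{adjacent}} = \frac{u_y}{u_x} \quad\Rightarrow\quad \theta = \tan^{-1}\left(\frac{u_y}{u_x}\right) \]
\[ \theta_v = \tan^{-1}\left(\frac{v_y}{v_x}\right) \tag{4.4} \]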
This equation will tell us what angle, with respect to the horizontal, a vector is oriented at given its components. However, keep in mind that the output of tan−1 is restricted to the range −90◦ < θ < 90◦, even though two different angles between 0◦ and 360◦ share the same tangent. To see why this matters, consider a vector ⃗a that has two positive components, and a second vector ⃗b whose components have the same magnitudes, but are both negative. When the components of each vector are divided by one another, they result in the same value, since a negative divided by a negative is positive. Using Eq. (4.4) on each vector will yield the same angle, despite the true angles being separated by 180◦. With this in mind, ensure that the angle obtained from Eq. (4.4) lies in the same quadrant as your vector. If it doesn’t, then you will need to add or subtract 180◦. For a visualization of how a vector’s components change its angle, see Animation 4.2. Let’s do a few examples to solidify our understanding of vectors.
Animation 4.2: The angle of a vector depends on the values of its components
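The quadrant issue described above can be seen numerically. The sketch below assumes Python and uses its standard `math.atan2` function, which looks at the signs of both components and so picks the correct quadrant on its own. This is an alternative to the add-or-subtract-180◦ fix described in the text, not something the text itself uses. The function names are illustrative.

```python
import math

def magnitude(vx, vy):
    """Eq. (4.3): length of a 2D vector from its components."""
    return math.hypot(vx, vy)

def angle_deg(vx, vy):
    """Quadrant-aware angle in degrees, in the range (-180, 180]."""
    return math.degrees(math.atan2(vy, vx))

# Eq. (4.4) applied directly to a third-quadrant vector (-3, -4)
naive = math.degrees(math.atan(-4 / -3))   # lands in the first quadrant
full = angle_deg(-3, -4)                   # lands in the third quadrant

print(magnitude(3, 4), round(naive, 2), round(full, 2))
```

The two angles differ by exactly 180◦, which is the correction the text asks you to apply by hand.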
Example 4.1
The vector ⃗v has components vx = 3 and vy = 4. Find ∥⃗v∥ and θv.
Starting with the magnitude, remember that the components of this vector form a triangle with the vector itself, so we can find the magnitude of ⃗v with the Pythagorean theorem. Using Eq. (4.3) we obtain,
\[ \lVert \vec{v} \rVert = \sqrt{v_x^2 + v_y^2} = \sqrt{(3)^2 + (4)^2} = \sqrt{9 + 16} = \sqrt{25} = 5 \]
For the angle, we again can use the triangle created by ⃗v. By Eq. (4.4),
\[ \theta_v = \tan^{-1}\left(\frac{v_y}{v_x}\right) = \tan^{-1}\left(\frac{4}{3}\right) = 53.13^\circ \]
Example 4.2
A force ⃗F = 50 N ∠ 30° acts upon a box as shown below. What force(s), in vector form, must we apply to prevent the box from being moved by ⃗F if we are to apply (a) two forces ⃗u and ⃗v with angles θu = 180° and θv = −90° respectively, (b) only one force, ⃗w? What is the magnitude and direction of ⃗w?
(Diagram: a box with the force ⃗F applied at 30◦ from the horizontal)
(a) Let us begin by considering what conditions must be met for this object to remain stationary. If an object is initially at rest, then the only way to prevent that object from being moved by a force is to apply a force with equal magnitude and opposite direction. Since we are required to find two forces to accomplish this, and because we know the directions of these two forces are along the x and y-axis, it would be wise to decompose ⃗F into its two components. This will allow us to use each of our two forces to counteract one component of ⃗F. Below is a diagram of this scenario to help illustrate this process.
(Diagram: ⃗F at 30◦ with its components ⃗Fx and ⃗Fy, together with the opposing forces ⃗u and ⃗v)
Now we can redraw the force diagram with only these four forces taken into account.
(Diagram: ⃗Fx, ⃗Fy, ⃗u, and ⃗v acting on the box)
We’ll start by considering only the horizontal forces at play here. Since we know ⃗u has a direction opposite to ⃗Fx, its magnitude will need to be equal to the magnitude of ⃗Fx.
\[ \lVert \vec{u} \rVert = \lVert \vec{F}_x \rVert = F \cos\theta = 50\,\text{N} \cos 30^\circ = 43.3\,\text{N} \]
For ⃗v, since it is directed in the opposite direction of ⃗Fy, it must have the same magnitude.
\[ \lVert \vec{v} \rVert = \lVert \vec{F}_y \rVert = F \sin\theta = 50\,\text{N} \sin 30^\circ = 25\,\text{N} \]
When putting these values into vector form, we must be careful about whether the values are positive or negative. Recall from Figure 4.4 that we can determine the direction of a vector by placing it at the origin of a coordinate grid, then moving along each axis according to the vector’s components. For ⃗u, if we insert a positive value for the x-component, it will point to the right, but we need that vector to point to the left. Therefore the x-component will need to be negative. Since ⃗u only points in the x-direction, its y-component will be zero. A similar line of reasoning can be used to express ⃗v in vector form. Below are several ways to express each.
\[ \vec{u} = \begin{pmatrix} -43.3 \\ 0 \end{pmatrix} \text{N} = 43.3\,\text{N} \angle 180^\circ = \langle -43.3, 0 \rangle\,\text{N} \]
\[ \vec{v} = \begin{pmatrix} 0 \\ -25 \end{pmatrix} \text{N} = 25\,\text{N} \angle {-90^\circ} = \langle 0, -25 \rangle\,\text{N} \]
(b) There are a few ways we could find ⃗w, but let’s first draw a picture. We again know that ⃗w needs to have an equal magnitude and opposite direction to ⃗F, so our vector will look something like this:
(Diagram: ⃗w drawn opposite to ⃗F, which sits at 30◦ from the horizontal)
We could simply keep the magnitude of ⃗F and add 180◦ to the angle of ⃗F, which would yield ⃗w = 50 N ∠ 210◦. However, I would like to demonstrate that we actually have already found the vector we need in part (a). Since vectors ⃗u and ⃗v only occupy the x and y-axis respectively, we can combine them by using each of their components to form one new vector that will have a magnitude of 50 N and an angle of 210◦. Thus our vector ⃗w will be,
\[ \vec{w} = \begin{pmatrix} -43.3 \\ -25 \end{pmatrix} \text{N} \]
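This will act as our first clue for how to add vectors together, as
\[ \vec{u} + \vec{v} = \begin{pmatrix} -43.3 \\ 0 \end{pmatrix} \text{N} + \begin{pmatrix} 0 \\ -25 \end{pmatrix} \text{N} \overset{?}{=} \begin{pmatrix} -43.3 \\ -25 \end{pmatrix} \text{N} \]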
We will leave this question unanswered for the moment, as it will be the topic of the next section. However, I encourage you to plot two vectors in the coordinate plane and consider the geometric interpretations for adding vectors. For now, let’s calculate the magnitude and direction of ⃗w.
\[ \lVert \vec{w} \rVert = \sqrt{w_x^2 + w_y^2} = \sqrt{(-43.3\,\text{N})^2 + (-25\,\text{N})^2} = 50\,\text{N} \]
\[ \tan^{-1}\left(\frac{w_y}{w_x}\right) = \tan^{-1}\left(\frac{-25\,\text{N}}{-43.3\,\text{N}}\right) = 30^\circ \quad\Rightarrow\quad \theta_w = 30^\circ + 180^\circ = 210^\circ \]
It is important to remember that the output of tan−1 is restricted to the range −90◦ < θ < 90◦, which is why we need to add 180◦ when our vector lies in the second or third quadrant.
Notice from Example 4.2 that multiplying a vector’s components by negative one had the effect of flipping the vector by 180◦. Recall from the beginning of this section that we can multiply vectors by scalars (“normal” numbers) to shrink or grow them in size/magnitude. While scaling a vector generally cannot change a vector’s direction, there is one exception to this. Watch Animation 4.3 and make an attempt at answering when and why scaling a vector changes its direction.
Animation 4.3: A vector being scaled
\[ \alpha\vec{v} = \begin{pmatrix} \alpha v_x \\ \alpha v_y \end{pmatrix} \tag{4.5} \]
Eq. (4.5) gives the general formula for scaling a vector. If |α| > 1, the vector grows larger, and if |α| < 1 the vector shrinks. However, if α < 0, then the vector is not only resized, but it is also flipped 180◦. This is because its components, which tell you the point where the tip of a vector lands, are moved into the opposite quadrant, where the x and y-values have the opposite signs. Therefore, the tip now lands on the opposite side of the origin, and the vector is flipped 180◦.
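A short sketch of Eq. (4.5), again assuming Python with vectors stored as plain tuples (an illustrative choice, not the document’s notation):

```python
def scale(alpha, v):
    """Multiply every component of v by the scalar alpha, as in Eq. (4.5)."""
    return tuple(alpha * c for c in v)

v = (3.0, 4.0)
doubled = scale(2, v)      # same direction, twice the length
halved = scale(0.5, v)     # same direction, half the length
flipped = scale(-1, v)     # same length, rotated by 180 degrees

print(doubled, halved, flipped)
```

Only the negative scalar moves the tip into the opposite quadrant, which matches the behavior described above.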
Section 4.1 Summary
• Vectors can be thought of as arrows in space which have length proportional to
their magnitude and are oriented at a specified angle.
• Vectors can be decomposed into components, the x-components and the y-components, which can be calculated with Eq. (4.1) and Eq. (4.2).
• Vectors can be generated by drawing arrows from the origin to the point in space
created by their components.
• Using the triangle created by a vector and its components, we can find its magnitude
with Eq. (4.3) and angle on the unit circle with Eq. (4.4).
• Vectors can be multiplied by scalar quantities to shrink, stretch, or flip a vector
180◦ . To compute a scaled vector, simply multiply the vector components by the
scalar quantity as shown in Eq. (4.5).
4.2 Vector Addition
Section 4.2 Overview
This section will cover the addition and subtraction of vectors. The objectives of this
section are to:
• Teach the reader to add and subtract vectors geometrically
• Teach the reader to add and subtract vectors symbolically
Figure 4.5: Adding vectors geometrically by arranging them tip-to-tail
Animation 4.4: Geometric representation of vector addition
Suppose we have two vectors, ⃗u = (3 2)T and ⃗v = (2 4)T, and that we want to add these two vectors. What does the addition of two vectors mean? Recall that vectors can be pictured as arrows pointing from the origin to the point in space created by their components. The addition of ⃗u and ⃗v tells us to start at the origin, move three to the right and two up (the components of ⃗u), then move to the right two and up four (the components of ⃗v). We can now draw a new vector, ⃗w, from the origin to the point where ⃗v lands in Figure 4.5a. This new vector is the sum of ⃗u and ⃗v, or in other words, ⃗w = ⃗u + ⃗v. Since we’re using each vector’s components as instructions for where to move on the coordinate plane, the x-component of ⃗w is just the sum of the x-components of ⃗u and ⃗v, and likewise for the y-component. In vector form, the sum of two vectors is given as
\[ \vec{u} + \vec{v} = \begin{pmatrix} u_x + v_x \\ u_y + v_y \end{pmatrix} \]
The subtraction of vectors is nothing more than the addition of vectors with negative components, and is shown in Figure 4.6a. This vector is also what we get when we draw ⃗u and ⃗v tail-to-tail, then draw an arrow from the tip of ⃗v to the tip of ⃗u, shown in Figure 4.6b. Note that this vector displays the difference between the two vectors, which is exactly what subtraction is.
\[ \vec{u} \pm \vec{v} = \begin{pmatrix} u_x \pm v_x \\ u_y \pm v_y \end{pmatrix} \tag{4.6} \]
Figure 4.6: Finding ⃗u − ⃗v geometrically by (a) adding negative ⃗v (b) arranging ⃗u and ⃗v tail-to-tail and drawing a vector from the tip of ⃗v to the tip of ⃗u
To clear any misconceptions, while the two subtraction vectors shown in Figure 4.6 are located in
different places, they are the exact same vectors. From a mathematical perspective, the location
of a vector does not matter. It is only when we start using vectors to represent real-world objects,
like forces, that their location makes a difference. Using Eq. (4.6) is demonstrated below, and you
can use these calculations to verify the results from the figures of this section. As a reminder, ⃗u
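and ⃗v are still (3 2)T and (2 4)T, respectively.
\[ \begin{aligned}
\vec{u} - \vec{v} &= \vec{u} + (-\vec{v}) \\
&= \begin{pmatrix} 3 \\ 2 \end{pmatrix} + \left( -\begin{pmatrix} 2 \\ 4 \end{pmatrix} \right) \\
&= \begin{pmatrix} 3 \\ 2 \end{pmatrix} + \begin{pmatrix} -2 \\ -4 \end{pmatrix} \\
&= \begin{pmatrix} 3 - 2 \\ 2 - 4 \end{pmatrix} \\
&= \begin{pmatrix} 1 \\ -2 \end{pmatrix}
\end{aligned} \]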
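The component-wise rule of Eq. (4.6) can be sketched in a few lines. Python and the helper names `add` and `subtract` are assumptions made for illustration. The vectors are the ⃗u and ⃗v used throughout this section.

```python
def add(u, v):
    """Component-wise sum, the '+' case of Eq. (4.6)."""
    return tuple(a + b for a, b in zip(u, v))

def subtract(u, v):
    """Component-wise difference, the '-' case of Eq. (4.6)."""
    return tuple(a - b for a, b in zip(u, v))

u, v = (3, 2), (2, 4)
total = add(u, v)
difference = subtract(u, v)

# Starting at the tip of v and following (u - v) lands on the tip of u
back_at_u = add(v, difference)
print(total, difference, back_at_u)
```

The last line is a quick check of the tail-to-tail picture: the difference vector is the arrow that takes you from the tip of ⃗v to the tip of ⃗u.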
Recall that in Example 4.2 we considered whether or not the sum of vectors ⃗u and ⃗v from part (a) was the same as the vector ⃗w that was found in part (b). As a brief exercise, find ⃗u + ⃗v geometrically (line the vectors up tip-to-tail and draw an arrow from the tail of the first vector to the tip of the last vector). How does it compare to ⃗F? If done correctly, you should be left with a vector that is equal in magnitude but opposite in direction, which is precisely what we were after.
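Example 4.3
⃗x = (4 −3)T and ⃗y = (6 5)T. Find (a) ⃗x + ⃗y (b) ⃗x − ⃗y.
(a) All we need to do here is plug our two vectors into Eq. (4.6).
\[ \vec{x} + \vec{y} = \begin{pmatrix} 4 \\ -3 \end{pmatrix} + \begin{pmatrix} 6 \\ 5 \end{pmatrix} = \begin{pmatrix} 4 + 6 \\ -3 + 5 \end{pmatrix} = \begin{pmatrix} 10 \\ 2 \end{pmatrix} \]
(b) For subtraction, remember that we can still use Eq. (4.6), we just need to add −⃗y.
\[ \vec{x} - \vec{y} = \begin{pmatrix} 4 \\ -3 \end{pmatrix} - \begin{pmatrix} 6 \\ 5 \end{pmatrix} = \begin{pmatrix} 4 - 6 \\ -3 - 5 \end{pmatrix} = \begin{pmatrix} -2 \\ -8 \end{pmatrix} \]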
Section 4.2 Summary
• Two vectors can be added by putting them tip-to-tail and then drawing a vector
from the tail of the first vector to the tip of the second vector.
• Symbolically, two vectors can be added by adding the x-components of each vector
to obtain the x-component of the sum vector, and likewise for the y-component.
This is given by Eq. (4.6).
• To subtract two vectors, simply scale the second vector by negative one, then follow
the steps for vector addition.
4.3 Vectors in 3D Space and the Unit Vector
Section 4.3 Overview
Up until now we have only dealt with vectors in two-dimensional space. However, vectors become most useful when working in three-dimensional space. Throughout this section, the reader should aim to be able to:
• Visualize vectors in 3D space.
• Find the magnitude of 3D vectors.
• Conceptually explain what a unit vector is.
• Calculate the unit vector of 2D and 3D vectors.
• Use unit vectors to convey a 3D vector’s direction.
Figure 4.7: A three-dimensional vector with its components shown
From Figure 4.7 we can see that the only difference between vectors in 3D space and vectors in 2D space is that they now have a dimension that lets them rise from the page. Just like with 2D vectors, we can use their components as instructions for where to move from the origin, and then draw an arrow from the origin to the point in space created by the vector’s components. Recall when we added 2D vectors, we were essentially just following instructions for two vectors at the same time. The same is true in 3D space, and thus the formula for adding 3D vectors is the same as for 2D vectors, just with an additional component.
\[ \vec{u} \pm \vec{v} = \begin{pmatrix} u_x \pm v_x \\ u_y \pm v_y \\ u_z \pm v_z \end{pmatrix} \tag{4.7} \]
The same is true for scaling a 3D vector; all we do is multiply each component by the scalar.
\[ \alpha\vec{v} = \begin{pmatrix} \alpha v_x \\ \alpha v_y \\ \alpha v_z \end{pmatrix} \tag{4.8} \]
However, finding the magnitude of a 3D vector may not appear to be quite as straightforward. If we draw a line in the xy-plane below the vector in Figure 4.7, we’ll find that 3D vectors create two triangles, one in the xy-plane and one in a plane perpendicular to the xy-plane, as shown in Figure 4.8. What we’re after here is the length of line segment OC, which is the magnitude of the vector ⃗v. From the two triangles formed, we can express each segment as follows
Figure 4.8: 3D vectors create two triangles which can be used to determine magnitude
\[ OA = a \qquad AB = b \qquad BC = c \qquad OB = L \qquad OC = \lVert \vec{v} \rVert \]
OB is not a component of ⃗v, however knowing its length will be important in finding the magnitude
of ⃗v. We can begin by using triangle OAB to find L in terms of a and b, then, using the derived
expression for L, we can use triangle OBC to find the length of OC and thus the magnitude of ⃗v.
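\[ a^2 + b^2 = L^2, \qquad L^2 + c^2 = \lVert \vec{v} \rVert^2 \quad\Rightarrow\quad \lVert \vec{v} \rVert^2 = (a^2 + b^2) + c^2 \quad\Rightarrow\quad \lVert \vec{v} \rVert = \sqrt{a^2 + b^2 + c^2} \]
Thus the magnitude of a 3D vector is given by:
\[ \lVert \vec{v} \rVert = \sqrt{v_x^2 + v_y^2 + v_z^2} \tag{4.9} \]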
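The two-triangle argument can be mirrored step by step in code. This is only a sketch, assuming Python. The name `magnitude_3d` and the sample components are illustrative.

```python
import math

def magnitude_3d(a, b, c):
    """Magnitude of (a, b, c) found with two right triangles, as in Figure 4.8."""
    L = math.hypot(a, b)       # triangle OAB: length of OB in the xy-plane
    return math.hypot(L, c)    # triangle OBC: length of OC, the magnitude

two_step = magnitude_3d(2, 2, 1)
direct = math.sqrt(2**2 + 2**2 + 1**2)   # Eq. (4.9) applied in one go
print(two_step, direct)
```

Both routes give the same number, which is the point of the derivation: the intermediate length L drops out of the final formula.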
The three equations we have covered are the same as in 2D space; however, the direction of a vector is not as straightforward. This is because instead of having just one angle, we now have two, which are illustrated in Figure 4.9. While there is nothing stopping us from describing a vector’s direction in terms of these two angles, it is most common to do so with a special kind of vector: the unit vector. A unit vector is a vector that only gives information on a vector’s direction. This is accomplished by finding a vector that points in the same direction as the vector in question, but with magnitude one, hence unit vector.
Figure 4.9: 3D vectors create two angles, one in the xy-plane and one in a plane perpendicular to the xy-plane.
Unit vectors are denoted like so:
\[ \text{unit vector of } \vec{v} = \hat{v} \]
To find a formula for unit vectors, we can consider how we define it. Suppose we have an arbitrary vector ⃗v with magnitude ∥⃗v∥. If v̂ is a vector that has the same direction as ⃗v, but only magnitude one, then we can scale v̂ by the magnitude of ⃗v to create ⃗v. That is, we can represent ⃗v in terms of v̂, which is shown in Animation 4.5. Symbolically, this looks like
\[ \vec{v} = \lVert \vec{v} \rVert \hat{v} \]
We can then solve for v̂ to get our formula.
\[ \hat{v} = \frac{\vec{v}}{\lVert \vec{v} \rVert} \tag{4.10} \]
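Eq. (4.10) translates almost directly into code. The sketch below assumes Python. The guard against the zero vector is an addition of mine, since dividing by a magnitude of zero is undefined and the text does not discuss that case.

```python
import math

def unit(v):
    """Return v divided by its magnitude, as in Eq. (4.10)."""
    mag = math.sqrt(sum(c * c for c in v))
    if mag == 0:
        # Assumption: the zero vector has no direction, so refuse to normalize it
        raise ValueError("the zero vector has no unit vector")
    return tuple(c / mag for c in v)

v_hat = unit((3.0, 4.0))
length_of_v_hat = math.sqrt(sum(c * c for c in v_hat))
print(v_hat, length_of_v_hat)
```

Scaling `v_hat` by the original magnitude (5 here) recovers the original vector, which is the relationship ⃗v = ∥⃗v∥v̂ that the formula was derived from.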
Recall from Introduction to Vectors that when we listed various forms of representing vectors, we stated there was one notation that required a bit more knowledge of vectors. That notation involved unit vectors, and was
\[ \vec{v} = 3\hat{\imath} + 4\hat{\jmath} \]
Animation 4.5: A vector ⃗v with magnitude 10 and its unit vector shown. The unit vector is then scaled by 10 in order to become an exact copy of ⃗v.
Figure 4.10: î, ĵ, and k̂ shown in 3D space.
There are three fundamental unit vectors, referred to as the standard basis vectors, which are represented as î, ĵ, and k̂. They are unit vectors that point in the x, y, and z directions, respectively, and are shown in Figure 4.10. Symbolically,
\[ \hat{\imath} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \qquad \hat{\jmath} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \qquad \hat{k} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \]
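We can scale each of these vectors and add them together to represent any vector in 3D space as shown below.
\[ \begin{aligned}
a\hat{\imath} + b\hat{\jmath} + c\hat{k} &= a \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + b \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + c \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \\
&= \begin{pmatrix} a \\ 0 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ b \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ c \end{pmatrix} \\
&= \begin{pmatrix} a + 0 + 0 \\ 0 + b + 0 \\ 0 + 0 + c \end{pmatrix} \\
&= \begin{pmatrix} a \\ b \\ c \end{pmatrix}
\end{aligned} \]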
On occasion x̂, ŷ, and ẑ will be used instead of î, ĵ, and k̂. Using these basis vectors to represent
vectors is very common in lower level physics courses, however it will not get much use in this
document as I feel column vectors are much easier to quickly read as well as offer a very smooth
transition into understanding Vector Multiplication, which we will get into shortly. Column vectors
are also used in solving linear systems of equations in Linear Algebra.
Example 4.4
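Let ⃗x = (3 7 4)T and ⃗y = (4 −5 8)T. Find ⃗x − ⃗y and the resulting magnitude.
First we use Eq. (4.7) to find ⃗x − ⃗y.
\[ \vec{x} - \vec{y} = \begin{pmatrix} 3 \\ 7 \\ 4 \end{pmatrix} - \begin{pmatrix} 4 \\ -5 \\ 8 \end{pmatrix} = \begin{pmatrix} 3 - 4 \\ 7 + 5 \\ 4 - 8 \end{pmatrix} = \begin{pmatrix} -1 \\ 12 \\ -4 \end{pmatrix} \]
Then we use Eq. (4.9) to find the magnitude of ⃗x − ⃗y.
\[ \lVert \vec{x} - \vec{y} \rVert = \sqrt{(-1)^2 + (12)^2 + (-4)^2} = \sqrt{161} \]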
Example 4.5
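Find û for ⃗u = (2 2 1)T
By Eq. (4.10), we need to find this vector’s magnitude and then divide ⃗u by that value.
\[ \lVert \vec{u} \rVert = \sqrt{2^2 + 2^2 + 1^2} = 3 \]
\[ \hat{u} = \frac{\vec{u}}{\lVert \vec{u} \rVert} = \frac{1}{3} \begin{pmatrix} 2 \\ 2 \\ 1 \end{pmatrix} = \begin{pmatrix} 2/3 \\ 2/3 \\ 1/3 \end{pmatrix} \]
If we want to double check our answer, we can find the magnitude of û. If we did the problem correctly, it should be one.
\[ \lVert \hat{u} \rVert = \sqrt{\left(\frac{2}{3}\right)^2 + \left(\frac{2}{3}\right)^2 + \left(\frac{1}{3}\right)^2} = \sqrt{\frac{9}{9}} = 1 \]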
Section 4.3 Summary
• Vectors in 3D space have the same properties as vectors in 2D space, just with an
additional dimension.
• A unit vector is a vector that has magnitude one and is used to convey a vector’s
direction.
• î, ĵ, and k̂ are called the standard unit vectors and each point in the x, y, and
z-direction respectively.
• A vector’s unit vector can be found by dividing the vector by its magnitude, which
is given by Eq. (4.10).
It should be mentioned that vectors are not limited to any number of dimensions. Once they
exceed three dimensions, they lose their physical representations and become more like ordered
lists. However, the formulas follow the same trends. While we will not need them in this document,
I’ve listed the formulas for n-dimensional vectors below.
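If
\[ \vec{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \quad\text{and}\quad \vec{y} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} \]
then:
\[ \lVert \vec{x} \rVert = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2} \qquad \alpha\vec{x} = \begin{pmatrix} \alpha x_1 \\ \alpha x_2 \\ \vdots \\ \alpha x_n \end{pmatrix} \qquad \vec{x} \pm \vec{y} = \begin{pmatrix} x_1 \pm y_1 \\ x_2 \pm y_2 \\ \vdots \\ x_n \pm y_n \end{pmatrix} \]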
The same is true for what we will discuss in this last section, and we’ll write out the general
formulas at the very end.
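Because the n-dimensional formulas treat vectors as ordered lists, they map naturally onto sequences in code. The following is a sketch only, assuming Python and tuples of any length. The function names and the length check are illustrative choices.

```python
import math

def norm(x):
    """Magnitude of an n-dimensional vector."""
    return math.sqrt(sum(c * c for c in x))

def scale(alpha, x):
    """Scalar multiple of an n-dimensional vector."""
    return tuple(alpha * c for c in x)

def add(x, y):
    """Component-wise sum; both vectors must have the same dimension."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    return tuple(a + b for a, b in zip(x, y))

x, y = (1, 2, 3, 4), (4, 3, 2, 1)
print(add(x, y), scale(2, x), norm((1, 1, 1, 1)))
```

Nothing in these functions depends on n being 2 or 3, which is the sense in which the formulas "follow the same trends" in higher dimensions.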
4.4 Vector Multiplication
Section 4.4 Overview
We have covered how to add and subtract vectors, as well as multiplying vectors by
scalars, but now we must learn to multiply vectors by other vectors. There are two types
of vector multiplication that we will focus on in this section, dot products and cross
products. As a small bonus, we will end our discussion with a shortcut for computing a
special case of multiplying three vectors. Throughout this section, the reader should aim
to learn:
• How to compute a dot product and what it represents.
• How to compute a cross product and what it represents.
• How to compute a scalar triple product.
4.4.1 Dot Products
The first vector product we will be looking at is called the dot product, and it is represented like
so:
Dot product of ⃗u and ⃗v = ⃗u · ⃗v
Rather than throwing a formula in front of you that tells you how to compute a dot product, we’re
going to go through a simple process that will give you a better idea of what a dot product really
is. The dot product of two vectors, ⃗u and ⃗v, is defined as the magnitude of the projection of ⃗u
onto ⃗v multiplied by the magnitude of ⃗v.
⃗u · ⃗v = (Magnitude of the projection of ⃗u onto ⃗v) × (Magnitude of ⃗v)
Before we get into exactly what a projection is and what this means, let’s first notice that we’re multiplying two magnitudes, so our end result will be a scalar quantity (magnitudes are always scalars).
In order to convey what a projection is, imagine you are standing in front of a wall with a flashlight. Suppose you point the flashlight perpendicular to the wall, and then you place some object, like your arm or a plank of wood, in between the flashlight and the wall such that the
object is at some angle with the wall. The shadow on the wall is the projection of the object onto the wall. To visualize this with vectors, we’ll let ⃗u be the object, and ⃗v be the wall. From Figure 4.11, we can see that the projection of ⃗u onto ⃗v is just the x-component of ⃗u. In a more general sense, the projection vector is the part of ⃗u that points parallel to ⃗v. Notice that as θ, the angle of the object with respect to the wall in our mental image, increases, the projection vector gets smaller. Similarly, when θ gets smaller, the projection vector grows. This behavior is expected since the projection vector here is the x-component of ⃗u, which is given by ∥⃗u∥ cos θ. The key factor here is cos θ, which decreases as θ moves from 0◦ to 90◦. With that in mind, consider what happens when the angle between the vectors is greater than or equal to 90◦. We’ll answer that question in a moment, but for now, let’s get our first formula for the dot product of two vectors.
Figure 4.11: The projection or “shadow” of ⃗u (the object) onto ⃗v (the wall) visualized.
From Eq. (4.1), we know how to find the magnitude of the projection vector as, again, it’s just the x-component of ⃗u. Remember that the definition of a dot product is the magnitude of the projection vector, multiplied by the magnitude of the vector that’s being projected onto. In Figure 4.11, the magnitude of the projection vector is ∥⃗u∥ cos θ, and the magnitude of the vector being projected upon, ⃗v, is simply ∥⃗v∥.
\[ \vec{u} \cdot \vec{v} = \lVert \vec{u} \rVert \cos\theta \, \lVert \vec{v} \rVert = \lVert \vec{u} \rVert \lVert \vec{v} \rVert \cos\theta \]
\[ \vec{u} \cdot \vec{v} = \lVert \vec{u} \rVert \lVert \vec{v} \rVert \cos\theta \tag{4.11} \]
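To make Eq. (4.11) concrete, here is a minimal sketch assuming Python. The magnitudes and angles are made-up illustrative values, and `dot_from_angle` is a hypothetical helper name.

```python
import math

def dot_from_angle(mag_u, mag_v, theta_deg):
    """Eq. (4.11): dot product from two magnitudes and the angle between them."""
    return mag_u * mag_v * math.cos(math.radians(theta_deg))

# Illustrative values: |u| = 5, |v| = 2
aligned = dot_from_angle(5, 2, 0)     # the whole of u lies along v
tilted = dot_from_angle(5, 2, 60)     # only half of u lies along v
print(round(aligned, 6), round(tilted, 6))
```

As the angle opens up from 0◦ to 60◦ the "shadow" of ⃗u halves, and the dot product halves with it.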
It is important to note that θ here is the angle between the two vectors. Because ⃗v has an angle
of zero degrees, it happens that θ = θu , the angle of ⃗u in Figure 4.11. However, the computation
doesn’t change when ⃗v is moved from the horizontal. In Figure 4.12, ⃗u forms a triangle with its
projection onto ⃗v (highlighted in blue). If we know the magnitude of ⃗u (hypotenuse), and we
want to determine the magnitude of the projection (adjacent), we can use the properties of right
triangles to calculate it.
\[ \cos\theta = \frac{\text{adjacent}}{\text{hypotenuse}} = \frac{\text{Magnitude of projection}}{\lVert \vec{u} \rVert} \quad\Rightarrow\quad \text{Magnitude of projection} = \lVert \vec{u} \rVert \cos\theta \]
As an interesting side note, this computation also yields the x-component of ⃗u if we were to measure angles from the direction of ⃗v instead of from the horizontal axis. From this interpretation, it’s as if we were rotating the entire xy-plane so that the new x-axis was positioned along ⃗v. This idea will be explored in more detail in Linear Algebra, so if you intend on reading that chapter, keep this in mind. In summary, the computation for the dot product will still be ∥⃗u∥∥⃗v∥ cos θ even though neither vector has angle θ.
Figure 4.12: The projection of ⃗u onto ⃗v visualized when ⃗v makes an angle greater than zero with the horizontal.
Before moving on, I want to quickly expose you to some notation for the projection vector:
\[ \text{Projection vector of } \vec{a} \text{ onto } \vec{b} = \operatorname{proj}_{\vec{b}} \vec{a} \]
\[ \text{Projection vector of } \vec{b} \text{ onto } \vec{a} = \operatorname{proj}_{\vec{a}} \vec{b} \]
Animation 4.6: The projection of ⃗u onto ⃗v visualized with a changing angle between the two vectors.
An important detail is that this function returns the projection as a vector, not as a scalar. Since
we only need the magnitude of the projection, we aren’t going to get into how to calculate the
projection vector. All you should take away from this is that this is how projection vectors are
denoted and how to correctly interpret this notation.
Animation 4.6 shows how changing the angle between two vectors changes the projection. As you can see, the projection becomes zero when the angle between the vectors is 90◦, which makes sense since cos 90◦ = 0. Thus the dot product of any two perpendicular vectors is zero. Similarly, since cos θ is negative when θ is between 90◦ and 180◦, the dot product is negative in that range of angles too. This may seem a little confusing since we said the dot product is defined as the magnitude of the projection vector multiplied by the magnitude of the vector that is being projected upon, and because magnitudes are generally positive values. However, it is convention for the magnitude of the projection vector to be considered negative if the projection does not fall on the vector being projected upon. To put it another way, in Animation 4.6, the magnitude of the projection vector, and thus the dot product, is negative when proj⃗v ⃗u is pointing in the opposite direction of ⃗v.
An important property of dot products is the commutative property, which is,
\[ \vec{u} \cdot \vec{v} \overset{?}{=} \vec{v} \cdot \vec{u} \]
In all of the figures from this subsection, we’ve projected ⃗u onto ⃗v, so let’s find the projection of ⃗v onto ⃗u and see if the dot product is the same. Recall that we can find the magnitude of the projection by considering the triangle formed by ⃗v, the projection vector, and the dashed line connecting those two vectors. By the same reasoning as Figure 4.12, the magnitude of the projection vector is given by ∥⃗v∥ cos θ. So, when we multiply this by the magnitude of ⃗u, we get:
\[ \vec{v} \cdot \vec{u} = (\lVert \vec{v} \rVert \cos\theta) \lVert \vec{u} \rVert = \lVert \vec{u} \rVert \lVert \vec{v} \rVert \cos\theta = \vec{u} \cdot \vec{v} \]
Figure 4.13: The projection of ⃗v onto ⃗u visualized
So we end up with the exact same thing. A very important disclaimer here,
\[ \operatorname{proj}_{\vec{b}} \vec{a} \neq \operatorname{proj}_{\vec{a}} \vec{b} \]
It is only when we multiply the magnitude of either projection by the magnitude of the other
vector that the result is the same. That is, order does not matter for dot products, but order does
matter for projections.
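The distinction between the two projections and the single dot product can be checked with a few numbers. This sketch assumes Python and uses made-up magnitudes and an angle chosen only for illustration.

```python
import math

# Illustrative values: |u| = 5, |v| = 2, separated by 60 degrees
mag_u, mag_v = 5.0, 2.0
theta = math.radians(60)

shadow_u_on_v = mag_u * math.cos(theta)   # magnitude of the projection of u onto v
shadow_v_on_u = mag_v * math.cos(theta)   # magnitude of the projection of v onto u

dot_uv = shadow_u_on_v * mag_v            # u . v
dot_vu = shadow_v_on_u * mag_u            # v . u

print(round(shadow_u_on_v, 6), round(shadow_v_on_u, 6))
print(round(dot_uv, 6), round(dot_vu, 6))
```

The two shadows have different lengths, yet multiplying each by the magnitude of the other vector gives the same product, which is the commutative property in action.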
Our last topic on dot products in this chapter will be a derivation of a formula that allows us to
compute a dot product with only the components of a vector. This is almost always the more
convenient route, as we will rarely have the angle between two vectors in hand. This derivation
really provides no conceptual understanding as to why the formula works, but we will answer this
question in Linear Algebra. Begin the analysis with the use of Identity 4.1.
Identity 4.1
\[ a^2 + b^2 - 2ab\cos\theta = c^2 \]
(Diagram: a triangle with legs a and b meeting at angle θ, and side c opposite θ)
Hopefully you’ll immediately see why the Law of Cosines is useful here. If not, the third term on
the left side of the equation contains ab cos θ, and the angle θ describes the angle between legs a
and b of the triangle. This is exactly the formula for the dot product of two vectors, ⃗a and ⃗b,
which are formed by legs a and b respectively. So, we will let ⃗a = ⃗u, ⃗b = ⃗v, and the point where
they meet be the origin. Additionally, we will let both these vectors be oriented such that they
point outward from this assigned origin.
With these choices of ⃗u and ⃗v, the vector formed by leg c is a vector that connects the tip of ⃗u to the tip of ⃗v. We know from Vector Addition that a vector pointing from the tip of one vector to the tip of another vector is the subtraction of the two vectors. The difference between letting ⃗c be ⃗u − ⃗v or ⃗v − ⃗u is just whether it points toward leg a or leg b and doesn’t matter in this case. With that in mind, we’ll make the arbitrary choice of letting ⃗c = ⃗u − ⃗v. We can then take these three vectors to make our triangle, where each leg has a length equal to the magnitude of the vector corresponding to that leg. This triangle is shown in Figure 4.14.
Figure 4.14: Two arbitrary vectors, ⃗u and ⃗v, separated by angle θ are used to create a triangle by letting the leg opposite to θ be the subtraction of the two vectors. The length of each leg is simply the magnitude of the corresponding vector.
To begin the derivation, we will express our vectors in vector form.
\[ \vec{a} = \vec{u} = \begin{pmatrix} u_x \\ u_y \end{pmatrix} \qquad \vec{b} = \vec{v} = \begin{pmatrix} v_x \\ v_y \end{pmatrix} \qquad \vec{c} = \vec{u} - \vec{v} = \begin{pmatrix} u_x \\ u_y \end{pmatrix} - \begin{pmatrix} v_x \\ v_y \end{pmatrix} = \begin{pmatrix} u_x - v_x \\ u_y - v_y \end{pmatrix} \]
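Next, we will plug our vectors into Identity 4.1.
\[ a^2 + b^2 - 2ab\cos\theta = c^2 \]
\[ \Downarrow \]
\[ \lVert \vec{u} \rVert^2 + \lVert \vec{v} \rVert^2 - 2 \lVert \vec{u} \rVert \lVert \vec{v} \rVert \cos\theta = \lVert \vec{u} - \vec{v} \rVert^2 \]
\[ \Downarrow \]
\[ \lVert \vec{u} \rVert^2 + \lVert \vec{v} \rVert^2 - 2(\vec{u} \cdot \vec{v}) = \lVert \vec{u} - \vec{v} \rVert^2 \tag{1} \]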
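Now, we will find ∥⃗u − ⃗v∥2 in terms of the components of ⃗u and ⃗v.
\[ \begin{aligned}
\lVert \vec{u} - \vec{v} \rVert^2 &= \left( \sqrt{(u_x - v_x)^2 + (u_y - v_y)^2} \right)^2 \\
&= (u_x - v_x)^2 + (u_y - v_y)^2 \\
&= u_x^2 - 2u_x v_x + v_x^2 + u_y^2 - 2u_y v_y + v_y^2
\end{aligned} \]
We then pair the squares of the vector components together to get the magnitudes of each vector.
\[ \begin{aligned}
&= u_x^2 + u_y^2 + v_x^2 + v_y^2 - 2u_x v_x - 2u_y v_y \\
&= \lVert \vec{u} \rVert^2 + \lVert \vec{v} \rVert^2 - 2(u_x v_x + u_y v_y)
\end{aligned} \]
With this we obtain:
\[ \lVert \vec{u} - \vec{v} \rVert^2 = \lVert \vec{u} \rVert^2 + \lVert \vec{v} \rVert^2 - 2(u_x v_x + u_y v_y) \tag{2} \]
Since the right side of (1) is identical to the left side of (2), we have the following equation:
\[ \lVert \vec{u} \rVert^2 + \lVert \vec{v} \rVert^2 - 2(\vec{u} \cdot \vec{v}) = \lVert \vec{u} \rVert^2 + \lVert \vec{v} \rVert^2 - 2(u_x v_x + u_y v_y) \]
Now we can solve for ⃗u · ⃗v.
\[ -2(\vec{u} \cdot \vec{v}) = -2(u_x v_x + u_y v_y) \quad\Rightarrow\quad \vec{u} \cdot \vec{v} = u_x v_x + u_y v_y \]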
This gives us our equation for computing dot products without the use of any angle. So, to find the dot product of two vectors, we multiply their components in pairs (x-x, y-y, etc.), then sum the products.
\[ \begin{pmatrix} u_x \\ u_y \\ u_z \end{pmatrix} \cdot \begin{pmatrix} v_x \\ v_y \\ v_z \end{pmatrix} = u_x v_x + u_y v_y + u_z v_z \tag{4.12} \]
While we only derived this formula in two dimensions, the process for three dimensions is the
same. The difference is the triangle is tilted into or out of the page so it spans three dimensions.
Then we add z-components to all of our vectors and perform the exact same series of steps. In the
case of the dot product of two n-dimensional vectors:
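\[ \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} \cdot \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix} = a_1 b_1 + a_2 b_2 + \ldots + a_n b_n = \sum_{i=1}^{n} a_i b_i \]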
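The component formula and the angle formula should always agree, and a quick numerical check makes that tangible. The sketch below assumes Python. The vectors are arbitrary illustrative values, and `math.atan2` is used to get each vector’s angle so that Eq. (4.11) can be evaluated independently of Eq. (4.12).

```python
import math

def dot(u, v):
    """Sum of pairwise component products, as in Eq. (4.12) and its n-dimensional form."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(a * b for a, b in zip(u, v))

u, v = (3.0, 4.0), (5.0, 0.0)

by_components = dot(u, v)

# Independent check with Eq. (4.11): magnitudes and the angle between the vectors
theta = math.atan2(u[1], u[0]) - math.atan2(v[1], v[0])
by_angle = math.hypot(*u) * math.hypot(*v) * math.cos(theta)

print(by_components, round(by_angle, 6))
```

The component route needed no trigonometry at all, which is why the text calls it the more convenient of the two.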
Before doing some examples, I’ll list some key properties for dot products. The proofs for these are very straightforward and can be done by using Eq. (4.12) on vectors with arbitrary constants for components. I encourage you to work through them as some extra practice.
\[ \vec{u} \cdot \vec{v} = \vec{v} \cdot \vec{u} \tag{4.13} \]
\[ \vec{v} \cdot \vec{v} = \lVert \vec{v} \rVert^2 \tag{4.14} \]
\[ (\alpha\vec{u}) \cdot \vec{v} = \alpha (\vec{u} \cdot \vec{v}) \tag{4.15} \]
\[ \vec{u} \cdot (\vec{v} + \vec{w}) = (\vec{u} \cdot \vec{v}) + (\vec{u} \cdot \vec{w}) \tag{4.16} \]
Example 4.6
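Find ⃗x · ⃗y for ⃗x = (4 3)T and ⃗y = (6 1)T. What is θ, the angle between the two vectors?
Starting off with the dot product of ⃗x and ⃗y,
\[ \vec{x} \cdot \vec{y} = \begin{pmatrix} 4 \\ 3 \end{pmatrix} \cdot \begin{pmatrix} 6 \\ 1 \end{pmatrix} = (4)(6) + (3)(1) = 27 \]
As for the angle, we can start with Eq. (4.11) and then solve for cos θ. Since Eq. (4.12) gives us a second way of computing dot products, we won’t run into any issues with circular reasoning.
\[ \vec{x} \cdot \vec{y} = \lVert \vec{x} \rVert \lVert \vec{y} \rVert \cos\theta \quad\Rightarrow\quad \cos\theta = \frac{\vec{x} \cdot \vec{y}}{\lVert \vec{x} \rVert \lVert \vec{y} \rVert} = \frac{27}{\sqrt{4^2 + 3^2}\,\sqrt{6^2 + 1^2}} = \frac{27}{5\sqrt{37}} \]
Now we solve for theta.
\[ \theta = \cos^{-1}\left(\frac{27}{5\sqrt{37}}\right) = 27.4^\circ \]
To double check our answer, we can easily find the angles of ⃗x and ⃗y, then find their difference.
\[ \theta_x = \tan^{-1}\left(\frac{3}{4}\right) = 36.9^\circ \qquad \theta_y = \tan^{-1}\left(\frac{1}{6}\right) = 9.46^\circ \]
\[ \Downarrow \]
\[ \theta = 36.9^\circ - 9.46^\circ = 27.4^\circ \]
So our answer checks out.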
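The procedure of Example 4.6 (compute the dot product from components, then solve Eq. (4.11) for the angle) can be wrapped into one small function. This is a sketch assuming Python. The name `angle_between_deg` is illustrative, and the clamping step is my own addition to guard against floating-point rounding pushing the cosine slightly outside [−1, 1].

```python
import math

def angle_between_deg(u, v):
    """Angle between two nonzero vectors, in degrees, via Eq. (4.11) and Eq. (4.12)."""
    dot = sum(a * b for a, b in zip(u, v))
    mag_u = math.sqrt(sum(a * a for a in u))
    mag_v = math.sqrt(sum(b * b for b in v))
    cos_theta = dot / (mag_u * mag_v)
    cos_theta = max(-1.0, min(1.0, cos_theta))   # guard against rounding error
    return math.degrees(math.acos(cos_theta))

theta = angle_between_deg((4, 3), (6, 1))
right_angle = angle_between_deg((1, 0), (0, 1))
print(round(theta, 1), round(right_angle, 1))
```

The first call uses the vectors from Example 4.6, so its result can be compared with the angle found by hand. The second shows perpendicular vectors, whose dot product is zero.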
Before we move on to cross products, I think it’s important we give an example of when dot products are useful, as when you say its definition out loud, it sounds rather arbitrary.
Recall that work is a measure of the transfer of energy. When an elevator lifts you from the bottom
floor to the top floor, you gain potential energy, and that energy came from electrical energy. The
measure of energy transferred would be the amount of electrical energy that was transferred into
potential energy, which is the work done on you by the elevator. Recall that in your very early education you were told
Work = Force × Distance
You probably already know that this isn’t the full truth, as work is actually the product of distance
and the part of the force that points in the same direction as the distance. Suppose you were holding
a box at a constant height while moving forward at an increasing horizontal velocity. The upward
force you apply to the box does no work on the box, since the force doesn’t move the box upward; it
only prevents the box from falling. That is, the energy of the box doesn’t change in regard to the
vertical components of the system. However, the horizontal force you’re applying to the box that’s
pushing it forward does change the energy of the box, since you’re causing the box to increase
in velocity. If we want to calculate how much work you’re doing on the box, then we must only
consider the portion of the force that points in the same direction as the distance. Sounds a lot
like the projection vector, doesn’t it? That’s because that’s exactly what’s going on.
Figure 4.15: A force ⃗F applied to a box that moves along the displacement ⃗d.
In Figure 4.15, a force that pushes up and to the left is applied to a box; however, the box moves
right and down, as denoted by ⃗d (the downward component of the distance being attributed to some
arbitrary force not depicted). In order to find the portion of ⃗F that points in the same direction
as ⃗d, we have to project ⃗F onto ⃗d. Once we do that, we multiply the magnitude of that portion of
⃗F by the magnitude of ⃗d to obtain the work done on the box by ⃗F. This is precisely what a dot
product is, thus
$$W = \vec{F} \cdot \vec{d}$$
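To make the work formula concrete, here is a small Python sketch with made-up numbers. The force and displacement values are hypothetical (they are not taken from Figure 4.15); they are only chosen so that the force points up and to the left while the box moves right and down, as in the discussion above.

```python
import math

def dot(u, v):
    """Dot product from components."""
    return sum(ui * vi for ui, vi in zip(u, v))

force = (-3.0, 4.0)          # newtons: pushes left and up (hypothetical values)
displacement = (5.0, -2.0)   # meters: box moves right and down (hypothetical values)

# Work from components: W = F . d
work = dot(force, displacement)

# Same quantity from magnitudes and the angle between the two vectors
angle = math.atan2(force[1], force[0]) - math.atan2(displacement[1], displacement[0])
work_from_angle = math.hypot(*force) * math.hypot(*displacement) * math.cos(angle)

print(work)             # -23.0 (joules)
print(work_from_angle)  # approximately the same value
```

The result is negative because the part of the force along ⃗d points opposite to the motion, so this force removes energy from the box.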
Work is a prime example of when we need a tool that easily tells us the product of two similarly directed vector components. Dot products show up in many other applications to physics and
calculus as well. Later in Rudimentary Multivariable and Vector Calculus, we’ll explore a type of
integral that utilizes the dot product in order to integrate over different paths and surfaces, which
is an incredibly powerful capability – especially in physics.
Subsection 4.4.1 Summary
• A dot product of vectors ⃗a and ⃗b is defined as the scalar product of the magnitude
of the projection of ⃗a onto ⃗b and the magnitude of ⃗b.
• The vector projection of ⃗a onto ⃗b is the portion of ⃗a that points parallel to ⃗b. By
convention, when a projection vector points opposite to the vector being projected
upon, the magnitude of the projection vector is negative.
• A dot product can be calculated by Eq. (4.11), where θ is the angle between the
two vectors.
• A dot product can also be calculated by Eq. (4.12) when the angle between the
two vectors is not known.
While this isn’t as important as the above items, if you’re interested in a deeper understanding of the dot product, you should also keep in mind the idea of finding the
projection vector by rotating the coordinate axis such that the x-axis is parallel with
the vector being projected upon (⃗v in Figure 4.12). This idea will be the foundation of
understanding why Eq. (4.12) gives you the dot product of two vectors in Linear Algebra.
Before moving on to cross products, I want to list some formulas regarding projection vectors.
You probably won’t need to know these and we won’t be using them in this document, but they’re
here for you if you ever need them. The first is the formula for finding the projection vector:
$$\operatorname{proj}_{\vec{a}} \vec{b} = \frac{\vec{a} \cdot \vec{b}}{\vec{a} \cdot \vec{a}}\,\vec{a} = \frac{\vec{a} \cdot \vec{b}}{\|\vec{a}\|^2}\,\vec{a} = \frac{\vec{a} \cdot \vec{b}}{\|\vec{a}\|}\,\hat{a}$$
We can express this formula in a variety of ways, and above are the most common. The second
takes advantage of the property ⃗v · ⃗v = ∥⃗v∥². The next formula gives you the magnitude of the
projection vector:
$$\left\|\operatorname{proj}_{\vec{a}} \vec{b}\right\| = \operatorname{comp}_{\vec{a}} \vec{b} = \frac{\vec{a} \cdot \vec{b}}{\|\vec{a}\|}$$
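The two projection formulas translate directly into code. Below is a minimal Python sketch; the function names `proj` and `comp` simply mirror the notation above, and the sample vectors are an arbitrary choice for illustration.

```python
import math

def dot(u, v):
    """Dot product from components."""
    return sum(ui * vi for ui, vi in zip(u, v))

def comp(a, b):
    """Signed length of the projection of b onto a: (a . b) / ||a||."""
    return dot(a, b) / math.sqrt(dot(a, a))

def proj(a, b):
    """Projection vector of b onto a: ((a . b) / (a . a)) a."""
    scale = dot(a, b) / dot(a, a)
    return tuple(scale * ai for ai in a)

a, b = (4.0, 3.0), (6.0, 1.0)   # sample vectors chosen for illustration

p = proj(a, b)
# Whatever is left of b after removing the projection should be perpendicular to a
leftover = tuple(bi - pi for bi, pi in zip(b, p))

print(p)                  # roughly (4.32, 3.24)
print(comp(a, b))         # roughly 5.4
print(dot(leftover, a))   # roughly 0
```

The last line checks the geometric meaning: the projection captures all of ⃗b that is parallel to ⃗a, so the remainder has a dot product of (numerically almost) zero with ⃗a.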
4.4.2 Cross Products
Unfortunately, truly understanding what a cross product is requires some basic linear algebra.
Covering that material here would be too long of a detour, so what I’m going to do is explain the
cross product to the best of my ability without the use of any linear algebra. However, we will
revisit both dot and cross products in Linear Algebra, where we will be able to better understand
each operation. So, for now, there will be some blank spots in your conceptual understanding of the
cross product. I highly encourage you to revisit the topic in Linear Algebra, as I believe
that when we just memorize formulas and procedures, learning mathematics loses its benefits.
Every theorem, formula, and identity is like a tool. You can use tools without knowing exactly
how they work, but this limits your ability to apply them on your own. When you understand
how the tool works, you’re able to solve problems that are completely new to you, which is a vital
skill in the fields of mathematics, physics, and engineering.
The cross product of ⃗u and ⃗v is defined as a vector perpendicular to both ⃗u and ⃗v, and whose
magnitude is equal to the signed area of the parallelogram bounded by each vector. The cross
product is denoted as:
Cross product of ⃗u and ⃗v = ⃗u × ⃗v
Notice from the definition that, unlike the dot product, the cross product returns a vector, not a
scalar.
The parallelogram bounded by two vectors can be obtained by making duplicates of each vector, and
then sliding them to the tip of the accompanying vector (see Figure 4.16a). To find this area, we’ll
first recall that the area of a parallelogram is its height (vy) multiplied by its base (∥⃗u∥).
Letting θ denote the angle between the two vectors, the area of this parallelogram is
$$A = (\|\vec{v}\| \sin\theta)\,\|\vec{u}\| = \|\vec{u}\|\|\vec{v}\| \sin\theta$$
Figure 4.16: The visualization of the parallelogram formed by two vectors. Panel (a) shows ⃗u, ⃗v,
and the angle θ between them; panel (b) also labels the base ∥⃗u∥ and the height ∥⃗v∥ sin θ.
However, this is a scalar quantity, and our definition states that a cross product returns a vector.
Therefore, this formula only provides the magnitude of the cross product.
$$\|\vec{u} \times \vec{v}\| = \|\vec{u}\|\|\vec{v}\| \sin\theta \tag{4.17}$$
In component form, we can use the determinant of the matrix formed by each vector. This part
requires a little bit of knowledge of linear algebra to understand, but for now just know that, from
its definition, the determinant of a square matrix (same number of rows as columns) will tell us
the area of the parallelogram formed by two 2D vectors when we enter the components of each vector
in the rows of the matrix (note that the same is true if you insert the vectors as the columns of the
matrix).
$$\begin{vmatrix} a & c \\ b & d \end{vmatrix} = ad - bc \tag{4.18}$$
$$\begin{vmatrix} a & d & g \\ b & e & h \\ c & f & i \end{vmatrix} = \begin{vmatrix} e & h \\ f & i \end{vmatrix} a - \begin{vmatrix} b & h \\ c & i \end{vmatrix} d + \begin{vmatrix} b & e \\ c & f \end{vmatrix} g = (ei - hf)a + (hc - bi)d + (bf - ec)g \tag{4.19}$$
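Eq. (4.18) and Eq. (4.19) give the formulas for finding the determinants of 2 × 2 and 3 × 3 matrices,
respectively. For a 2 × 2 matrix, the determinant can be obtained by multiplying the terms along
the line that starts at the top left corner and ends in the bottom right corner (ad), and then
subtracting the product of the terms along the line that starts at the top right corner and ends at
the bottom left corner (bc), like so:
$$\begin{vmatrix} a & c \\ b & d \end{vmatrix}$$
We draw two imaginary diagonal lines and multiply the terms as we go. Starting from the top,
the product of the terms on the line going from left to right (shown in blue) is added. Then, the
product of the terms on the line going from right to left (shown in red) is subtracted. For 3 × 3
matrices, the formula can be quite a handful to memorize, so there are two methods that make it
easier. The first is the most common, and is called the method of cofactors. This method breaks the
3 × 3 matrix into three 2 × 2 matrices by pairing each entry of the first row with the 2 × 2
determinant that remains after crossing out that entry’s row and column, like so:
$$\begin{vmatrix} a & d & g \\ b & e & h \\ c & f & i \end{vmatrix} \rightarrow \begin{vmatrix} e & h \\ f & i \end{vmatrix} a \qquad \begin{vmatrix} a & d & g \\ b & e & h \\ c & f & i \end{vmatrix} \rightarrow \begin{vmatrix} b & h \\ c & i \end{vmatrix} d \qquad \begin{vmatrix} a & d & g \\ b & e & h \\ c & f & i \end{vmatrix} \rightarrow \begin{vmatrix} b & e \\ c & f \end{vmatrix} g$$
$$\begin{vmatrix} a & d & g \\ b & e & h \\ c & f & i \end{vmatrix} = \begin{vmatrix} e & h \\ f & i \end{vmatrix} a - \begin{vmatrix} b & h \\ c & i \end{vmatrix} d + \begin{vmatrix} b & e \\ c & f \end{vmatrix} g$$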
We then use Eq. (4.18) to calculate the three 2 × 2 determinants. However, this can still be difficult
to remember, which brings us to the second method, which is what I prefer. The second method
requires us to rewrite the first two columns of the matrix to the right of the matrix. Then, we start
from the top of the first column and draw a diagonal line down and to the right, ending two columns
over (blue lines). We do this for the first three columns. After that, we move to the last column and
do what we just did, but in reverse: we start at the top of the column and move down and to the left,
ending at the bottom of the column two over (red lines). This is done for the last three columns.
Then, just like with 2 × 2 determinants, we add the products of the terms on the lines going down and
right (blue lines), and subtract the products of the terms on the lines going down and left (red
lines). Since every product contains exactly one entry from the first row, we collect those like
terms and factor out the respective entries.
$$\begin{array}{ccc|cc} a & d & g & a & d \\ b & e & h & b & e \\ c & f & i & c & f \end{array}$$
$$\begin{vmatrix} a & d & g \\ b & e & h \\ c & f & i \end{vmatrix} = (aei) + (dhc) + (gbf) - (dbi) - (ahf) - (gec) = (ei - hf)a + (hc - bi)d + (bf - ec)g$$
This is much better shown with an example, so let’s quickly do one. However, it must be noted
that the determinant of a 3 × 3 matrix does not tell you the area of a parallelogram bounded
by two vectors. It tells you the volume of the three-dimensional analog of a parallelogram, a
parallelepiped. We’ll show how we use 3 × 3 determinants to find the area of a parallelogram
bounded by two vectors after this short example.
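Example 4.7
Find the determinant of
$$A = \begin{pmatrix} 3 & 2 & 9 \\ 4 & 8 & 1 \\ 6 & 5 & 7 \end{pmatrix}$$
The determinant of A can be denoted as det(A), and we will use the second method outlined
above to compute it. Be sure to use the diagrams to ensure you understand where the
groupings of terms come from.
$$\det(A) = \begin{vmatrix} 3 & 2 & 9 \\ 4 & 8 & 1 \\ 6 & 5 & 7 \end{vmatrix} \begin{matrix} 3 & 2 \\ 4 & 8 \\ 6 & 5 \end{matrix}$$
$$= 3\big[(8 \cdot 7) - (1 \cdot 5)\big] + 2\big[(1 \cdot 6) - (4 \cdot 7)\big] + 9\big[(4 \cdot 5) - (8 \cdot 6)\big]$$
$$= 3\big[51\big] + 2\big[-22\big] + 9\big[-28\big]$$
$$= -143$$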
We won’t worry about what negative determinants mean quite yet; the important thing is that we
know how to compute them.
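Both methods for 3 × 3 determinants described above are easy to check against each other in code. Here is a minimal Python sketch; the function names `det2`, `det3_cofactor`, and `det3_diagonals` are hypothetical, and the entry letters follow the layout used in Eq. (4.18) and Eq. (4.19).

```python
def det2(m):
    """2 x 2 determinant, Eq. (4.18): rows are (a c) and (b d)."""
    (a, c), (b, d) = m
    return a * d - b * c

def det3_cofactor(m):
    """3 x 3 determinant by the method of cofactors along the first row."""
    (a, d, g), (b, e, h), (c, f, i) = m
    return (a * det2([[e, h], [f, i]])
            - d * det2([[b, h], [c, i]])
            + g * det2([[b, e], [c, f]]))

def det3_diagonals(m):
    """3 x 3 determinant by the diagonal-lines method (second method above)."""
    (a, d, g), (b, e, h), (c, f, i) = m
    down_right = a * e * i + d * h * c + g * b * f   # added products
    down_left = d * b * i + a * h * f + g * e * c    # subtracted products
    return down_right - down_left

A = [[3, 2, 9],
     [4, 8, 1],
     [6, 5, 7]]

print(det3_cofactor(A))    # -143
print(det3_diagonals(A))   # -143
```

Note that the diagonal-lines shortcut only works for 3 × 3 matrices, while the cofactor idea extends to larger square matrices.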
Now that we have a general idea of what the determinant tells us and how to compute it, we
can show how to compute cross products from the components of the vectors. We will first begin with
the cross product between two 2D vectors. Recall the definition of the cross product of ⃗u and ⃗v,
which is a vector with magnitude equal to the area of the parallelogram formed by ⃗u and ⃗v, and
direction perpendicular to both ⃗u and ⃗v. Since inserting the components of two vectors as the rows
of a matrix and taking the determinant of that matrix gives the area of the parallelogram formed by
the two vectors, it follows that
$$\|\vec{u} \times \vec{v}\| = \begin{vmatrix} u_x & u_y \\ v_x & v_y \end{vmatrix}$$
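As an illustration that the determinant and Eq. (4.17) describe the same area, the Python sketch below computes both for one pair of 2D vectors. The vectors are arbitrary sample values, and `det2` is a hypothetical helper implementing Eq. (4.18).

```python
import math

def det2(rows):
    """2 x 2 determinant, Eq. (4.18): rows are (a c) and (b d)."""
    (a, c), (b, d) = rows
    return a * d - b * c

u, v = (5.0, 1.0), (2.0, 4.0)   # sample vectors chosen for illustration

# Angle measured counterclockwise from u to v
theta = math.atan2(v[1], v[0]) - math.atan2(u[1], u[0])

area_from_angle = math.hypot(*u) * math.hypot(*v) * math.sin(theta)  # Eq. (4.17)
area_from_det = det2([u, v])                                         # rows are u and v

print(area_from_angle)    # approximately 18
print(area_from_det)      # 18.0
print(det2([v, u]))       # -18.0: swapping the rows flips the sign
```

The sign flip in the last line is the "signed" part of the signed area, and it is what later shows up as the direction of the cross product.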
However, this does not tell us the direction. The direction of the cross product vector follows the
right-hand-rule.
Figure 4.17: Diagram depicting the right-hand-rule. Credit: Dan the Tutor
The right-hand-rule can be used to find the direction of ⃗u × ⃗v by taking your right hand, pointing
your index finger in the direction of ⃗u, pointing your middle finger in the direction of ⃗v, and then
sticking your thumb out (see Figure 4.17). The direction of your thumb will tell you the direction of
the cross product. For example, consider the vectors in Figure 4.16. With your right hand, point your
index finger in the direction of ⃗u, and your middle finger in the direction of ⃗v. Your thumb will
point away from the page/screen, meaning the direction of the cross product vector is in the positive
z-direction. Using the same process, find the direction of ⃗v × ⃗u. With your right hand, point your
index finger in the direction of ⃗v, and your middle finger in the direction of ⃗u. Your thumb should
now point towards the page/screen, meaning the direction of the cross product vector is now in the
negative z-direction. The magnitude of each cross product will be the same, since the parallelogram
does not change; however, the direction of the cross product vector will change. This is due to the
nature of the determinant, which we will explain in Linear Algebra. Summarizing this discussion,
$$\vec{u} \times \vec{v} = -(\vec{v} \times \vec{u}) \tag{4.20}$$
$$\begin{pmatrix} u_x \\ u_y \end{pmatrix} \times \begin{pmatrix} v_x \\ v_y \end{pmatrix} = \begin{vmatrix} u_x & u_y \\ v_x & v_y \end{vmatrix} \hat{n} \tag{4.21}$$
n̂ is a unit normal vector to ⃗u and ⃗v
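Note that “normal” in this context means perpendicular. Thus, a unit normal vector is a unit
vector that is normal to some reference vector/plane/surface. For computing the cross product of
3D vectors, the process is a little more straightforward.
$$\begin{pmatrix} u_x \\ u_y \\ u_z \end{pmatrix} \times \begin{pmatrix} v_x \\ v_y \\ v_z \end{pmatrix} = \begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ u_x & u_y & u_z \\ v_x & v_y & v_z \end{vmatrix} \tag{4.22}$$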
Eq. (4.22) gives the cross product of two 3D vectors as the determinant of a matrix whose rows
contain the components of each vector. However, the top row contains the basis vectors and is
mostly just a notational trick. Note that this formula also works for 2D vectors.
$$\vec{u} \times \vec{v} = \begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ u_x & u_y & 0 \\ v_x & v_y & 0 \end{vmatrix} = 0\hat{\imath} + 0\hat{\jmath} + (u_x v_y - u_y v_x)\hat{k} = \begin{vmatrix} u_x & u_y \\ v_x & v_y \end{vmatrix} \hat{k}$$
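Lastly, we can establish the cross products of the three basis vectors.
$$\hat{\imath} \times \hat{\jmath} = \begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{vmatrix} = \hat{k}$$
$$\hat{\jmath} \times \hat{k} = \begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} = \hat{\imath}$$
$$\hat{k} \times \hat{\imath} = \begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{vmatrix} = \hat{\jmath}$$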
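Example 4.8
For ⃗x = (4 5 1)ᵀ and ⃗y = (−3 4 −7)ᵀ, find (a) ⃗x × ⃗y (b) ⃗y × ⃗x
(a) Using Eq. (4.22),
$$\vec{x} \times \vec{y} = \begin{pmatrix} 4 \\ 5 \\ 1 \end{pmatrix} \times \begin{pmatrix} -3 \\ 4 \\ -7 \end{pmatrix} = \begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 4 & 5 & 1 \\ -3 & 4 & -7 \end{vmatrix}$$
$$= \big[(5)(-7) - (1)(4)\big]\hat{\imath} + \big[(1)(-3) - (4)(-7)\big]\hat{\jmath} + \big[(4)(4) - (5)(-3)\big]\hat{k}$$
$$= -39\hat{\imath} + 25\hat{\jmath} + 31\hat{k} = \begin{pmatrix} -39 \\ 25 \\ 31 \end{pmatrix}$$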
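(b) We can use Eq. (4.20) to find ⃗y × ⃗x, but we will use Eq. (4.22) to demonstrate the
validity of Eq. (4.20).
$$\vec{y} \times \vec{x} = \begin{pmatrix} -3 \\ 4 \\ -7 \end{pmatrix} \times \begin{pmatrix} 4 \\ 5 \\ 1 \end{pmatrix} = \begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ -3 & 4 & -7 \\ 4 & 5 & 1 \end{vmatrix}$$
$$= \big[(1)(4) - (5)(-7)\big]\hat{\imath} + \big[(4)(-7) - (1)(-3)\big]\hat{\jmath} + \big[(5)(-3) - (4)(4)\big]\hat{k}$$
$$= 39\hat{\imath} - 25\hat{\jmath} - 31\hat{k} = \begin{pmatrix} 39 \\ -25 \\ -31 \end{pmatrix} = -(\vec{x} \times \vec{y})$$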
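Example 4.8 can be reproduced with a few lines of code. The sketch below is a plain Python version of the component formula obtained by expanding Eq. (4.22) along its top row; the function names `cross` and `dot` are hypothetical helpers.

```python
def dot(u, v):
    """Dot product from components."""
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    """Cross product of two 3D vectors, from expanding the determinant in Eq. (4.22)."""
    ux, uy, uz = u
    vx, vy, vz = v
    return (uy * vz - uz * vy,
            uz * vx - ux * vz,
            ux * vy - uy * vx)

x = (4, 5, 1)
y = (-3, 4, -7)

x_cross_y = cross(x, y)
y_cross_x = cross(y, x)

print(x_cross_y)   # (-39, 25, 31)
print(y_cross_x)   # (39, -25, -31)

# The result is perpendicular to both inputs, so both dot products are zero
print(dot(x_cross_y, x), dot(x_cross_y, y))   # 0 0

# Basis vectors: i x j = k
print(cross((1, 0, 0), (0, 1, 0)))   # (0, 0, 1)
```

The zero dot products tie this back to the definition: the cross product is perpendicular to both of the vectors that produced it.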
As in Dot Products, I would like to provide some context for why cross products are useful. A
common use of cross products is in calculating torque. Torque is the rotational analogue of force, or
in simpler terms, torque is rotational force. Consider the use of a wrench to tighten a bolt. You
apply a force at some point on the handle of the wrench, and this causes the bolt to rotate. However,
the amount of torque acting on the bolt is dependent on where along the handle you apply the force.
Forces applied closer to the bolt on the handle will result in less torque, while forces applied
further from the bolt on the handle will produce more torque.
Figure 4.18: The torque produced by a wrench depends on the distance between the force and the axis
of rotation, as well as the magnitude of the force component perpendicular to the distance vector.
(The figure labels ⃗r, ⃗F, ⃗F⊥, and ⃗F∥.)
The formula for torque is given by
$$\vec{\tau} = \vec{r} \times \vec{F}, \qquad \|\vec{\tau}\| = \|\vec{r}\|\|\vec{F}\| \sin\theta$$
where ⃗τ is torque, ⃗r is the distance vector from the axis of rotation to the point of application of
force ⃗F, and θ is the angle between ⃗r and ⃗F. The angle at which the force is applied to the
handle will also determine the amount of torque applied to the bolt. To see why, consider the force
components of ⃗F shown in Figure 4.18. The parallel component of ⃗F, ⃗F∥, makes no contribution to
rotating the wrench. Therefore, it is only the force component perpendicular to the radius vector
that contributes to the torque on the bolt. Or in other words,
$$\tau = r\,(F \sin\theta)$$
Since only the perpendicular component is multiplied by the radius, the cross product is the
perfect tool for calculating torque. Furthermore, the direction of the torque vector gives the
direction of rotation. Using the right-hand-rule, the torque vector points into the page/screen.
By convention, this indicates clockwise motion. If you imagine flipping ⃗F such that ⃗F⊥ is pointed
in the opposite direction and again use the right-hand-rule, the torque vector points out of the
page/screen, indicating counterclockwise motion. The convention for direction of rotation is not
important here. What you should take away from this is that the cross product is useful in physical
applications involving the multiplication of perpendicular components, just as the dot product was
with parallel components. The cross product’s uses go beyond this, as it also provides a method
of finding a vector that is perpendicular to two other vectors.
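Here is a small Python sketch of the torque calculation with made-up numbers. The wrench length and force values are hypothetical (they are not read from Figure 4.18); the wrench handle is assumed to lie along the positive x-axis, with the bolt at the origin.

```python
import math

def cross(u, v):
    """Cross product of two 3D vectors (component form of Eq. (4.22))."""
    ux, uy, uz = u
    vx, vy, vz = v
    return (uy * vz - uz * vy,
            uz * vx - ux * vz,
            ux * vy - uy * vx)

r = (0.25, 0.0, 0.0)     # meters: from the bolt to where the force is applied (hypothetical)
F = (10.0, -20.0, 0.0)   # newtons: partly along the handle, partly perpendicular (hypothetical)

tau = cross(r, F)

# Only the perpendicular part of F (here, its y-component) contributes
tau_from_perpendicular = r[0] * F[1]

print(tau)                      # (0.0, 0.0, -5.0)
print(tau_from_perpendicular)   # -5.0 (newton-meters)
```

The 10 N component along the handle drops out entirely, and the negative z-component means the torque vector points into the page/screen, which matches the clockwise case described above.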
Subsection 4.4.2 Summary
• The cross product is defined as a perpendicular vector whose magnitude is equal
to the area of the parallelogram bounded by two vectors
• The magnitude of a cross product is given by Eq. (4.17) and Eq. (4.21)
• The direction of the cross product of two 2D vectors can be found using the right-hand-rule
• The cross product of two 3D vectors can be found using Eq. (4.22)
• The determinant is used in the calculation of cross products because it gives the
area of the parallelogram formed by two vectors
4.4.3 Scalar Triple Products
To finish off this chapter, we will briefly explore the scalar triple product. The scalar triple product
is not its own category of vector multiplication, but instead is a combination of the dot and cross
products. It does not appear all that often in lower-division engineering courses, so we will not
put too much significance on this subsection (hence the lack of a subsection overview/summary).
We are just going to take a quick look at what it geometrically represents, and how to compute
it.
The triple scalar product of three vectors, ⃗u, ⃗v, and ⃗w, is defined as the signed volume of the
parallelepiped formed by the three vectors. Note that a parallelepiped is the 3D analogue of a
parallelogram, and is essentially a cube with skewed angles (see Figure 4.19). From this
definition, we know that the scalar triple product takes in three vectors, and outputs a scalar.
Symbolically, scalar triple products are given as
Triple scalar product of ⃗u, ⃗v, and ⃗w = ⃗u · (⃗v × ⃗w)
Figure 4.19: The parallelepiped formed by three vectors, ⃗a, ⃗b, and ⃗c. Credit: Nykamp DQ, “The
scalar triple product.” From Math Insight. http://mathinsight.org/scalar_triple_product
To see why this formula gives the volume of the parallelepiped, first consider the formula for the
volume of one. This formula is given as
Volume = (Area of base)(Height)
In Figure 4.19, the base is the parallelogram formed by vectors ⃗a and ⃗b. Since the cross product is
defined as the normal vector to ⃗a and ⃗b, whose magnitude is the signed area of the parallelogram,
⃗a × ⃗b gives the first portion of the volume equation, but as a vector. So,
Area of base = ∥⃗a × ⃗b∥
The height of the parallelepiped is given by the component of ⃗c that is normal to ⃗a and ⃗b, which
is given by ∥⃗c∥ cos ϕ in Figure 4.19. Thus,
Height = ∥⃗c∥ cos ϕ
⇓
Volume = ∥⃗a × ⃗b∥∥⃗c∥ cos ϕ
This is equivalent to projecting ⃗c onto ⃗a × ⃗b, then multiplying the length of the projection vector
by the length of ⃗a × ⃗b, which is what a dot product is. Thus,
Volume = ⃗c · (⃗a × ⃗b)
While the scalar triple product can be computed by finding ⃗v × ⃗w and then finding the dot product
of the result and ⃗u, there is a faster way of computing it. Recall in Cross Products that we learned
that the determinant of a 2 × 2 matrix yields the area of the parallelogram formed by the vectors
inserted into its rows/columns. In the same regard, the determinant of a 3 × 3 matrix yields the
volume of the parallelepiped formed by the three vectors inserted into the rows/columns of the
matrix. Therefore,
$$\vec{u} \cdot (\vec{v} \times \vec{w}) = \begin{vmatrix} u_x & u_y & u_z \\ v_x & v_y & v_z \\ w_x & w_y & w_z \end{vmatrix} \tag{4.23}$$
The process for computing this determinant is no different than shown in Cross Products. If you
need a refresher on how to compute the determinant of a 3 × 3 matrix, see Example 4.7.
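Eq. (4.23) can be checked numerically by computing the scalar triple product both ways. The Python sketch below reuses the rows of the matrix from Example 4.7 as the three vectors; the function names are hypothetical helpers.

```python
def dot(u, v):
    """Dot product from components."""
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    """Cross product of two 3D vectors (component form of Eq. (4.22))."""
    ux, uy, uz = u
    vx, vy, vz = v
    return (uy * vz - uz * vy,
            uz * vx - ux * vz,
            ux * vy - uy * vx)

def det3(rows):
    """3 x 3 determinant by cofactor expansion along the first row."""
    (p, q, r), (s, t, w), (x, y, z) = rows
    return p * (t * z - w * y) - q * (s * z - w * x) + r * (s * y - t * x)

u = (3, 2, 9)
v = (4, 8, 1)
w = (6, 5, 7)

triple_from_products = dot(u, cross(v, w))   # u . (v x w)
triple_from_det = det3([u, v, w])            # Eq. (4.23)

print(triple_from_products)   # -143
print(triple_from_det)        # -143
```

Both values match the determinant found in Example 4.7; the volume of the parallelepiped itself is the absolute value, 143.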
5 Rudimentary Multivariable and Vector Calculus
5.1 Partial Differentiation
5.2 Multiple Integrals
5.3 Multiple Integrals in Polar/Cylindrical Coordinates
5.4 Applying Calculus to Vectors
5.5 The Gradient Vector
5.6 Line and Surface Integrals
5.7 Curl and Divergence
Part Four: Miscellaneous Mathematics
Introduction
This chapter is a compilation of material that likely isn’t required for the level of courses this
document is intended for. However, this material shows up in textbooks, leads to a much
stronger understanding of material from previous chapters, and/or provides you with tools that
can help you solve challenging problems by simpler means.
(Reddit question that I answered before posting, ignore) This question originates from 3Blue1Brown’s
video "Dot products and duality" (link in comments – watch from 7:14 to 11:17 if you want complete
context).
The overall goal of this video is to describe dot products as linear transformations. Since dot
products take in two vectors and output a number, this transformation should take a vector and
return a number. The big idea of the video is that for the dot product ⃗v · ⃗u, there exists some linear
transformation
$$L(\vec{v}) = U\vec{v} = \begin{pmatrix} u_x & u_y \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$
that describes said dot product. Since this dot product is (in one way) defined as the magnitude
of the projection of ⃗v onto ⃗u times the magnitude of ⃗u, we can think of the span of ⃗u as a 1D
number line “living” in 2D space. When we take the dot product of ⃗v and ⃗u, we project ⃗v from
2D space onto this 1D space.
(Diagram: ⃗u and ⃗v drawn from the origin on axes running from 0 to 6.)
In the diagram above, the blue ticked line represents the 1D space created by the span of ⃗u, and
the dashed line indicates where ⃗v lands on this 1D space when it is transformed. The overall result
of L(⃗v) is the point on the number line where ⃗v lands multiplied by the point where ⃗u is on the 1D
space, which yields ⃗v · ⃗u. I realize order doesn’t matter in terms of computation, but to keep a
consistent conceptual understanding of dot products, I will treat them as if order does matter in
the following regard:
$$\vec{b} \cdot \vec{a} \quad\Rightarrow\quad \vec{b} \text{ is being projected onto } \vec{a}$$
I have not formally studied linear algebra, so my error in reasoning could be hiding here, but
the way I understand it is that for the matrix representing a (2D) linear transformation, the first
column is where î lands after the transformation, and the second column is where ĵ lands after the
transformation.
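$$\begin{pmatrix} 2 & 3 \\ 4 & 1 \end{pmatrix} \qquad \text{(first column: where } \hat{\imath} \text{ lands; second column: where } \hat{\jmath} \text{ lands)}$$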
So the transformation matrix above has the effect of moving î to (2, 4) and ĵ to (3, 1). Thus if we
want to find the transformation matrix U, then we need to find where î and ĵ land on the 1D space
after the dot product transformation. The video simplifies this process by instead projecting î and
ĵ onto û.
(Diagram: î, ĵ, and û drawn from the origin on axes running from 0 to 1.4.)
To find where î lands on the 1D space, we can take advantage of the symmetry involved (∥î∥ =
∥û∥ = 1) in that the magnitude of the projection of û onto î is equal to the magnitude of the
projection of î onto û (see final image for proof). From the above figure, we can see that when û is
projected onto the x-axis, the resulting magnitude is simply ûx (denoting the horizontal component
of the unit vector of ⃗u). Since the two mentioned projections have equal resulting magnitudes, it
follows that î lands at value ûx in the 1D space.
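The same reasoning can be applied to find where ĵ lands in the 1D space, which is ûy. Thus for
the case of ⃗v · û, U = (ûx ûy). That is,
$$\vec{v} \cdot \hat{u} = \begin{pmatrix} \hat{u}_x & \hat{u}_y \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \qquad \text{where } \vec{v} = \begin{pmatrix} x \\ y \end{pmatrix}$$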
However, we obviously do not want a transformation that only works for unit vectors. In 3Blue1Brown’s
video, he states that when we scale û, the transformation matrix U is just multiplied by this scale
factor, but does not offer much reason as to why. Since the transformation matrix was obtained
by taking advantage of symmetry (equal magnitudes), it doesn’t seem obvious to me how we can
find the transformation matrix for any vector ⃗u that does not have a magnitude of one (i.e. isn’t
a unit vector).
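Based on this belief, it seems the only way to find U for a non-unit vector ⃗u is if the entries of
the transformation matrix tell us not where î and ĵ land, but where scaled versions of those basis
vectors land, which seems like redefining how we represent transformation matrices. Additionally,
shouldn’t any scaling of the basis vectors be done by ⃗v by ⃗u?
$$\operatorname{comp}_{\vec{a}} \vec{b} = \operatorname{comp}_{\vec{b}} \vec{a} \quad \text{if } \|\vec{a}\| = \|\vec{b}\| \text{ and } \theta \neq \frac{\pi}{2} n, \qquad \text{where } \operatorname{comp}_{\vec{a}} \vec{b} = \left\|\operatorname{proj}_{\vec{a}} \vec{b}\right\|$$
Proof. Let ⃗a and ⃗b be two non-orthogonal vectors such that ∥⃗a∥ = ∥⃗b∥, and let θ denote the angle
between ⃗a and ⃗b.
$$\operatorname{proj}_{\vec{a}} \vec{b} = \frac{\vec{b} \cdot \vec{a}}{\|\vec{a}\|^2}\,\vec{a} = \frac{\vec{b} \cdot \vec{a}}{\|\vec{a}\|^2}\,\|\vec{a}\|\,\hat{a}$$
$$\Downarrow$$
$$\left\|\operatorname{proj}_{\vec{a}} \vec{b}\right\| = \frac{\vec{b} \cdot \vec{a}}{\|\vec{a}\|^2}\,\|\vec{a}\| = \frac{\vec{b} \cdot \vec{a}}{\|\vec{a}\|}$$
$$\Downarrow$$
$$\operatorname{comp}_{\vec{a}} \vec{b} = \frac{\vec{b} \cdot \vec{a}}{\|\vec{a}\|} = \frac{\|\vec{a}\|\|\vec{b}\| \cos\theta}{\|\vec{a}\|} = \frac{\|\vec{a}\|^2 \cos\theta}{\|\vec{a}\|} = \|\vec{a}\| \cos\theta$$
$$\operatorname{comp}_{\vec{b}} \vec{a} = \frac{\vec{a} \cdot \vec{b}}{\|\vec{b}\|} = \frac{\|\vec{a}\|\|\vec{b}\| \cos\theta}{\|\vec{b}\|} = \frac{\|\vec{a}\|^2 \cos\theta}{\|\vec{a}\|} = \|\vec{a}\| \cos\theta$$
$$\therefore\ \operatorname{comp}_{\vec{a}} \vec{b} = \operatorname{comp}_{\vec{b}} \vec{a} \quad \text{if } \|\vec{a}\| = \|\vec{b}\| \qquad \blacksquare$$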
6 Ordinary Differential Equations
6.1 Introduction to Differential Equations
What Are Differential Equations?
To introduce differential equations, think back to your grade-school mathematics education. We
started with learning the basics of arithmetic, such as how to add, multiply, and exponentiate, the
order in which those operations should be carried out, etc. Once we got the hang of that, we
started replacing numbers with unknown mathematical objects (numbers represented by x, for
example). That may have looked something like
$$x^2 - x - 2 = 0, \qquad x = \,?$$
In differential equations, we do something similar. Only this time, instead of using arithmetic
operators, we use operations of calculus, and instead of the unknown mathematical objects being
numbers, now they are functions. This may look something like:
$$y'' - y' - 2y = 0, \qquad y(t) = \,?$$
This problem can be put into words like so:
What function, y(t), has the property of its second derivative, minus its first derivative, minus
two times itself being equal to zero?
For those curious, the general solution is
$$y(t) = c_1 e^{-t} + c_2 e^{2t}$$
where c1 and c2 are any constants. This can be verified by finding its first and second derivatives,
then plugging them into the left side of the above equation and testing whether or not the result is
zero. By general solution, we mean any function that satisfies this differential equation can be
expressed in terms of the solution I have provided. That is, any solution can be obtained by
choosing specific values of c1 and c2.
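The verification described above can also be done numerically. The sketch below is a minimal Python check that plugs the general solution and its hand-computed derivatives into the left side of the differential equation; the function name `left_side` and the sample constants are hypothetical choices for illustration.

```python
import math

def left_side(c1, c2, t):
    """Evaluate y'' - y' - 2y for y(t) = c1*exp(-t) + c2*exp(2t)."""
    y = c1 * math.exp(-t) + c2 * math.exp(2 * t)
    dy = -c1 * math.exp(-t) + 2 * c2 * math.exp(2 * t)    # first derivative
    d2y = c1 * math.exp(-t) + 4 * c2 * math.exp(2 * t)    # second derivative
    return d2y - dy - 2 * y

# Try a few arbitrary constants and times; each result should be zero
# up to floating-point rounding
for c1, c2 in [(1.0, 0.0), (0.0, 1.0), (-2.5, 3.0)]:
    for t in [0.0, 0.5, 1.0]:
        print(c1, c2, t, left_side(c1, c2, t))
```

Seeing (approximately) zero for several choices of c1 and c2 is consistent with every member of the family being a solution, though it is the algebra, not the numerical check, that proves it.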
Differential Equations in Physics
The question asked in the preceding discussion may seem arbitrary at first. Why would any sane
person ever want to know this? Well, similar questions show up in physics all the time. Consider
the following example:
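Figure 6.1: Mass suspended from a spring. The left side shows the spring at length L0 with the mass
at the equilibrium position y = 0; the right side shows the spring at length L with the mass
displaced by y.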
Suppose a ball of mass m is suspended from the ceiling by a spring, as shown in Figure 6.1.
Hooke’s law states that the force exerted on the ball by the spring is proportional to how far the
spring is stretched. That is,
Fs = kx
where k is the spring constant and x is the length by which the spring is stretched beyond its
unstretched length. Throughout this example, “the length of the spring” refers to this stretched
length. The spring constant tells us how much more force will be applied for each unit of distance
the spring is stretched. For example, a spring constant of 5 N/m means that for each meter the spring
is stretched, an extra 5 N of force is applied to the ball.
The weight of the ball causes the spring to stretch until the force of the spring is equal and
opposite to the force of gravity. Let us call the length of the spring at the equilibrium position
L0, and also let this position be y = 0. Lastly, we establish a coordinate system by letting the
downward direction be positive. The reason for this is that downward motion of the ball increases the
length of the spring, and upward motion decreases the length of the spring. Consider the free-body
diagram (FBD) of the spring-mass system at the equilibrium position shown in Figure 6.2. Referring
back to Figure 6.1, we can see that, at equilibrium, the spring has length L0. Thus, the equilibrium
spring force, Fs0, is given as kL0. Additionally, the weight of the ball is Fg = mg. We then sum the
forces, and since the ball is in equilibrium, they balance out to a sum of zero.
Figure 6.2: FBD of mass. The spring force Fs0 acts upward and the weight Fg acts downward on the mass m.
$$\sum F = F_g - F_{s0} = 0$$
Expanding each force gives
$$mg = kL_0$$
Now consider the right side of Figure 6.1. Here, the spring has been stretched an additional
distance of y past the equilibrium position, yielding a total spring length of L = L0 + y. The
free-body diagram is similar to Figure 6.2, but the spring force will be Fs = kL, since L is the
length of the spring instead of L0. Summing the forces yields
$$\sum F = F_g - F_s = ma$$
$$mg - kL = ma$$
$$mg - k(L_0 + y) = ma$$
$$mg - kL_0 - ky = ma$$
First, note that the sum of forces is no longer equal to zero because the spring was stretched
beyond the equilibrium position. Additionally, in the preceding paragraphs, we found that mg = kL0.
We can plug this into our equation to find
mg − mg − ky = ma
−ky = ma
ma + ky = 0
Since acceleration is the second derivative of position, we arrive at the differential equation
my ′′ + ky = 0
If we could find y(t), then we would have a function that tells us the position of the ball relative
to its equilibrium position at all times.
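To give a feel for what y(t) looks like, the following Python sketch steps the equation my′′ + ky = 0 forward in time numerically and compares the result against y(t) = y0 cos(√(k/m) t), which is the known solution when the ball is released from rest at y = y0. The mass, spring constant, and starting position are hypothetical values, and the simple time-stepping scheme is only an illustration, not a method developed in this document.

```python
import math

def simulate_spring(m, k, y0, v0, t_end, dt=1e-4):
    """Step m*y'' + k*y = 0 forward in small time steps and return y(t_end)."""
    y, v = y0, v0
    steps = round(t_end / dt)
    for _ in range(steps):
        a = -(k / m) * y   # acceleration from m*a = -k*y
        v += a * dt        # update velocity first...
        y += v * dt        # ...then position, using the new velocity
    return y

m, k = 0.5, 8.0      # kg and N/m (hypothetical values)
y0, v0 = 0.1, 0.0    # released from rest, 0.1 m below equilibrium (hypothetical)
t_end = 1.0          # seconds

y_numeric = simulate_spring(m, k, y0, v0, t_end)
y_exact = y0 * math.cos(math.sqrt(k / m) * t_end)

print(y_numeric)
print(y_exact)   # the two values should agree closely
```

The cosine shape reflects what we expect physically: the ball bobs up and down around the equilibrium position.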
Differential equations show up not only in dynamics, but also electrical circuits, heat transfer, fluid
dynamics, materials mechanics, and more.
Important Notes and Definitions
Differential equations are often very complicated to solve analytically (i.e. with pen and paper).
While the examples of differential equations given here are fairly simple, it would only take small
tweaks before solving them analytically would require elaborate methods and tedious procedures.
In the real world, differential equations are often solved numerically with the help of computers.
With that in mind, we will not get too deep into differential equations here. Our focus will be on
relatively simple first order linear and ordinary differential equations.
The order of a differential equation is determined by the highest derivative present in an equation.
This is much like the degree of a polynomial, which is determined by the highest power of x in the
polynomial. Thus, the differential equation
$$my'' + ky = 0$$
is a second order differential equation, as the highest order derivative present is a second order
derivative, whereas the differential equation
$$q' + \frac{1}{RC}\,q = \frac{\varepsilon}{R}$$
is a first order differential equation, as the highest order derivative present is a first order
derivative. For those who may be interested, this differential equation gives the charging of a
capacitor in an RC circuit over time.
When we say ordinary differential equations, we are talking about the kind of differentiation
involved. An ordinary derivative is the derivative of a single-variable function. A derivative that
is not ordinary is a partial derivative. Partial differential equations are a field of study in their
own right and will not be discussed here. All of the differential equations shown so far have been
either first or second order ordinary differential equations.
Additionally, a differential equation is linear when its derivatives (including the zeroth derivative,
y itself) all appear to the power of one. This is analogous to the fact that y = mx + b is linear, but
y = mx² + b is not. The general forms of linear first and second order ordinary differential
equations, respectively, are
y′ + p(t)y = q(t)
y′′ + p(t)y′ + q(t)y = r(t)
The equation
y′ + p(t)y² = q(t)
for example, would not be linear since it contains y².
Lastly, linear differential equations are referred to as homogeneous when the right-hand side (the
term that does not involve y or its derivatives) is equal to zero. If we take the two general forms
of linear differential equations above, we can make them homogeneous like so:
y′ + p(t)y = 0
y′′ + p(t)y′ + q(t)y = 0
A differential equation whose right-hand side is not zero is nonhomogeneous.
6.2 Solving First Order Linear and Homogeneous Differential Equations
In this section, we will focus on solving some of the most basic differential equations, which are
first order linear and homogeneous equations. These equations are of the form
$$y' + p(t)y = 0 \tag{6.1}$$
Let us first begin with the most basic form of this equation:
$$y' + y = 0 \tag{6.2}$$
We can rewrite this as
$$y' = -y \tag{6.3}$$
Some may be able to immediately think of a solution without any further manipulation. If not,
reconsider the equation as the following question:
What function, y(t), has a derivative equal to negative one times itself?
First, we know that y = eᵗ is a function that has the special property of y′ = y. However, we need
y′ = −y. We could fix this fairly easily with the chain rule, as
$$\frac{d}{dt} e^{-t} = e^{-t}(-1) = -e^{-t}$$
So y = e⁻ᵗ is a solution. But how can we find this solution analytically? We don’t want to guess
and check each time we need to solve an equation like this. Notice what happens if we divide both
sides of Eq. (6.3) by y to get
$$\frac{y'}{y} = -1 \tag{6.4}$$
This should look very familiar to those who viewed the Tree of Proofs for Differential Calculus.
For those who haven’t, consider the following:
$$\frac{d}{dt} \ln y(t) = \frac{1}{y}\,y' = \frac{y'}{y} \quad\Rightarrow\quad \frac{y'}{y} = \frac{d}{dt}\Big[\ln y\Big]$$
Thus, we can rewrite Eq. (6.4) as
$$\frac{d}{dt}\Big[\ln y\Big] = -1$$
Now we can integrate both sides to get
$$\ln y = -t + c$$
Recall from previous chapters that when dealing with several integration constants, we often lump
them together. This is especially common in differential equations, as c will generally refer to any
constants that may appear in a solution as one constant. We can now exponentiate both sides
of our equation to get
$$e^{\ln y} = e^{-t+c}$$
$$y = e^{-t+c} = e^{-t} e^{c}$$
Note that $e^c$ is a constant, since $e$ and $c$ are both numbers. Thus, we rewrite $e^c$ as $c$. This may
be uncomfortable, as $e^a \neq a$, but remember, when we actually solve for $c$, this won’t change our
solution. In short, rewriting $e^c$ as $c$ will change what we get for $c$, but it won’t change the solution
itself. We now arrive at the general solution:
$$y(t) = ce^{-t}$$
which describes all solutions to Eq. (6.2). We can verify by plugging this solution into Eq. (6.2)
like so:
$$y = ce^{-t}$$
$$y' = -ce^{-t}$$
$$y' + y = -ce^{-t} + ce^{-t} = 0$$
When using differential equations to model the real world, only one solution from the above family
of solutions will work, just like when we used integrals to model kinematics. We pick out that solution
the same way we did with integrals: if we are given a certain condition, like $y(0) = 1$, then we can find $c$. Such
problems are called initial value problems or IVPs. In this particular case,
$$y(0) = 1 \quad\Rightarrow\quad ce^{0} = 1 \quad\Rightarrow\quad c = 1$$
So the solution to this IVP would be
$$y = e^{-t}$$
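As a quick sanity check on this IVP, the sketch below steps the equation y′ = −y forward from y(0) = 1 with the forward Euler method and compares the result at t = 1 with e^(−1). Euler's method is not used in these notes; it is brought in here only as an independent numerical check, and the function name `euler_decay` and the step count are illustrative choices.

```python
import math

def euler_decay(t_end, steps):
    """Approximate y(t_end) for y' = -y, y(0) = 1 using forward Euler."""
    h = t_end / steps          # step size
    y = 1.0                    # initial condition y(0) = 1
    for _ in range(steps):
        y += h * (-y)          # y_new = y + h * y', with y' = -y
    return y

approx = euler_decay(1.0, 100_000)
exact = math.exp(-1.0)         # the analytic solution y = e^(-t) at t = 1
print(f"Euler: {approx:.6f}, exact: {exact:.6f}")
```

The two values should agree to several decimal places, and the agreement improves as the number of steps grows.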
We can use this method of solving first order linear and homogeneous differential equations to find
a general formula for solutions to Eq. (6.1).
$$y' + p(t)y = 0$$
$$\frac{y'}{y} = -p(t)$$
$$\frac{d}{dt}\Big[\ln y\Big] = -p(t)$$
$$\ln y = -\int p(t)\,dt$$
$$y = e^{-\int p(t)\,dt}$$
$$y' + p(t)y = 0 \quad\Rightarrow\quad y(t) = e^{-\int p(t)\,dt} \tag{6.5}$$
Example 6.1
Solve the following IVP:
$$5ty' - 4t^3 y = 0, \qquad y(0) = 2$$
Before getting started, it’s very important that we notice the above differential equation is not
in the form of Eq. (6.1). However, we can easily get it in that form by dividing both sides
by $5t$. So, the differential equation is:
$$y' - \frac{4}{5}t^2 y = 0$$
We then use the same method as we did in the preceding discussion:
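$$y' - \frac{4}{5}t^2 y = 0$$
$$\frac{y'}{y} = \frac{4}{5}t^2$$
$$\frac{d}{dt}\Big[\ln y\Big] = \frac{4}{5}t^2$$
$$\ln y = \frac{4}{15}t^3 + c$$
$$y = ce^{4t^3/15}$$
Applying our initial condition, we solve for c.
$$y(0) = ce^{0} = c = 2 \quad\Rightarrow\quad y(t) = 2e^{4t^3/15}$$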
As an additional exercise, plug this solution into the original differential equation and verify
that the result is zero.
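If you would like to check your work on that exercise numerically, the sketch below plugs y(t) = 2e^(4t³/15) into the left side of the original equation, 5ty′ − 4t³y, at a few sample times. The derivative is approximated with a central difference, so the printed residuals should be close to zero rather than exactly zero. The function names and sample times are illustrative assumptions, and a numerical check like this does not replace the algebra.

```python
import math

def y(t):
    """Candidate solution from Example 6.1: y(t) = 2 e^(4 t^3 / 15)."""
    return 2.0 * math.exp(4.0 * t**3 / 15.0)

def y_prime(t, h=1e-6):
    """Central-difference approximation of y'(t)."""
    return (y(t + h) - y(t - h)) / (2.0 * h)

def residual(t):
    """Left side of the original equation, 5 t y' - 4 t^3 y."""
    return 5.0 * t * y_prime(t) - 4.0 * t**3 * y(t)

for t in (0.0, 0.5, 1.0, 1.5):
    print(f"t = {t}: residual = {residual(t):.2e}")

print("y(0) =", y(0.0))   # the initial condition requires y(0) = 2
```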
As we saw in the last example, this method can be used on differential equations of the
form
$$p(t)y' + q(t)y = 0$$
as we can rearrange to get
$$y' + \frac{q(t)}{p(t)}\,y = 0$$
Whether or not this method is suitable depends on how easy the integral of $q(t)/p(t)$ is.
6.3 Solving by Integration Factors
The method of integration factors ties in quite nicely with the method discussed in the previous
section. This method works for first order linear ordinary differential equations, but they need not
be homogeneous. That is, it solves equations of the form:
y ′ + p(t)y = q(t)
(6.6)
There isn’t a nice natural way of deriving this method, so for now, you’ll have to follow along until
you see how this works. Let us define a function µ(t) that we’ll call the integration factor. This
function is defined as
$$\mu(t) = e^{\int p(t)\,dt} \tag{6.7}$$
This function may seem random, but consider what happens when we multiply both sides of
Eq. (6.6) by µ(t):
$$y' + p(t)y = q(t)$$
$$\mu(t)\Big[y' + p(t)y\Big] = \mu(t)q(t)$$
$$\mu(t)y' + \mu(t)p(t)y = \mu(t)q(t)$$
$$\mu y' + \mu p y = \mu q \tag{6.8}$$
Note that in the last step, all we did was get rid of the function notation to clean up the equation
a little bit. Let us return to Eq. (6.7) and consider what µ′ is. Using the chain rule, we find that
$$\mu' = e^{\int p\,dt} \cdot \frac{d}{dt}\int p\,dt = \mu \cdot p = \mu p$$
In Eq. (6.8), the coefficient of $y$ is $\mu p$, which we just showed equals $\mu'$. This means we can rewrite Eq. (6.8) as
µy ′ + µ′ y = µq
(6.9)
The left side of Eq. (6.9) is exactly what the product rule gives when differentiating $\mu y$, so we can again rewrite it as
$$\frac{d}{dt}[\mu y] = \mu q$$
Integrating both sides yields
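$$\int \frac{d}{dt}[\mu y]\,dt = \int \mu q\,dt$$
$$\mu y = \int \mu q\,dt$$
$$y = \frac{1}{\mu}\int \mu q\,dt$$
$$y' + p(t)y = q(t) \quad\Rightarrow\quad y(t) = \frac{1}{\mu(t)}\int \mu(t)q(t)\,dt = \frac{1}{e^{\int p(t)\,dt}}\int e^{\int p(t)\,dt}\,q(t)\,dt \tag{6.10}$$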
In summary, the integration factor given by Eq. (6.7) takes advantage of the chain rule and the product rule, and allows
us to turn these differential equations into integration problems. Just as with the previous section,
the viability of this method depends on how difficult $\int p\,dt$ and $\int \mu q\,dt$ are to compute. Let’s do
some examples to better demonstrate how this method is used.
Example 6.2
Solve the following IVP:
y ′ + ay = b ,
y(0) = 0
Before getting started, let’s make sure this equation fits the method of integration factors.
This equation is linear, since $y$ and its derivatives appear only to the first power.
It is, of course, a first order equation, since its highest
derivative is the first derivative, and since this method works for both homogeneous and nonhomogeneous
equations, we aren’t concerned with whether or not $b = 0$. Thus, we can use the method of
integration factors on this equation.
Luckily, this equation only has constant coefficients, so integration should be very straightforward. We begin by defining µ(t) with Eq. (6.7) as
$$\mu(t) = e^{\int a\,dt} = e^{at}$$
since $p(t) = a$ in this differential equation. We leave out the constant of integration as we’ll
need to integrate later in the problem. That is, we’ll wait to include any constants until the
very end. Now we multiply both sides of our differential equation by µ(t).
$$e^{at}y' + ae^{at}y = be^{at}$$
We recognize that the left side of the equation is the result of differentiating $e^{at}y$, giving us
$$\left(e^{at}y\right)' = be^{at}$$
Now all we need to do is integrate both sides and solve for y. Note that we’ll need to use
u-substitution to integrate the right hand side.
$$\left(e^{at}y\right)' = be^{at}$$
$$\int \left(e^{at}y\right)'\,dt = \int be^{at}\,dt$$
$$e^{at}y = \frac{b}{a}e^{at} + c$$
$$y = \frac{b}{a} + ce^{-at}$$
Now we apply our initial condition, y(0) = 0 to solve for c.
$$y(0) = \frac{b}{a} + c = 0 \quad\Rightarrow\quad c = -\frac{b}{a}$$
$$y = \frac{b}{a} - \frac{b}{a}e^{-at} = \frac{b}{a}\left(1 - e^{-at}\right)$$
We can verify our solution like so:
$$y = \frac{b}{a} - \frac{b}{a}e^{-at}$$
$$y' = be^{-at}$$
$$y' + ay = be^{-at} + a\left(\frac{b}{a} - \frac{b}{a}e^{-at}\right) = be^{-at} + b - be^{-at} = b$$
So, our solution checks out and our final answer is
$$y = \frac{b}{a}\left(1 - e^{-at}\right)$$
Note that since this is an IVP, it is also important that y(0) = 0, but it is easily seen that
this is indeed the case.
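When the integrals in Eq. (6.10) are awkward to do by hand, the same idea can be carried out numerically. The sketch below is one possible way to do that, not a method from these notes: it uses definite integrals starting at t = 0 (so that µ(0) = 1 and the constant is fixed by y(0)), and approximates both integrals with the trapezoid rule. The function name `integrating_factor_solve`, the step count, and the sample values a = 2 and b = 3 are all illustrative assumptions.

```python
import math

def integrating_factor_solve(p, q, y0, t_end, steps=20_000):
    """Approximate y(t_end) for y' + p(t) y = q(t) with y(0) = y0.

    Uses mu(t) = exp(int_0^t p ds) and y(t) = (y0 + int_0^t mu q ds) / mu(t),
    with both integrals accumulated by the trapezoid rule.
    """
    h = t_end / steps
    int_p = 0.0        # running value of int_0^t p(s) ds
    int_mu_q = 0.0     # running value of int_0^t mu(s) q(s) ds
    mu_prev = 1.0      # mu(0) = e^0 = 1
    t = 0.0
    for _ in range(steps):
        t_next = t + h
        int_p += 0.5 * h * (p(t) + p(t_next))
        mu_next = math.exp(int_p)
        int_mu_q += 0.5 * h * (mu_prev * q(t) + mu_next * q(t_next))
        mu_prev, t = mu_next, t_next
    return (y0 + int_mu_q) / mu_prev

# Example 6.2 with sample constants: y' + a y = b, y(0) = 0.
a, b = 2.0, 3.0
numeric = integrating_factor_solve(lambda t: a, lambda t: b, 0.0, 1.0)
closed_form = (b / a) * (1.0 - math.exp(-a * 1.0))
print(f"numeric: {numeric:.6f}, closed form: {closed_form:.6f}")
```

Because `p` and `q` are passed in as functions, the same routine can be pointed at any first order linear equation, including ones whose integrals have no convenient closed form.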
As a brief note for the last example, we mentioned in Introduction to Differential Equations that
the differential equation for a charging capacitor in an RC circuit was
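$$q' + \frac{1}{RC}\,q = \frac{\varepsilon}{R}$$
This equation is of the same form as the one in Example 6.2 where
$$a = \frac{1}{RC} \quad\text{and}\quad b = \frac{\varepsilon}{R}$$
If we plug these values into our solution from that example, we get
$$q(t) = C\varepsilon\left(1 - e^{-t/RC}\right)$$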
where q is charge, C is capacitance, ε is the voltage of the voltage source (like a battery), t is time,
and R is resistance. Note that this equation assumes q(0) = 0. This equation is typically used in
any introductory electromagnetism course. Now you know where it comes from!
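To get a feel for the charging formula, the sketch below evaluates q(t) = Cε(1 − e^(−t/RC)) at a few multiples of the product RC, which is commonly called the time constant of the circuit. The component values are hypothetical and chosen only for illustration; the function name `capacitor_charge` is likewise an assumption.

```python
import math

def capacitor_charge(t, R, C, emf):
    """Charge q(t) = C * emf * (1 - e^(-t / (R C))), assuming q(0) = 0."""
    return C * emf * (1.0 - math.exp(-t / (R * C)))

# Hypothetical component values, for illustration only.
R = 1_000.0     # resistance in ohms
C = 1e-6        # capacitance in farads
emf = 5.0       # source voltage in volts
tau = R * C     # time constant in seconds

for multiple in (0, 1, 3, 5):
    q = capacitor_charge(multiple * tau, R, C, emf)
    print(f"t = {multiple} RC: q / (C * emf) = {q / (C * emf):.4f}")
```

The printed ratios show the charge starting at zero and approaching its final value Cε as t grows, which is what the 1 − e^(−t/RC) factor predicts.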
While Example 6.2 used a constant p and a constant q, the procedure is no different when these are
functions of t. The integration is just a little more intensive. We’ll do one example, with a right-hand
side that depends on t, to make it abundantly clear how this method is used.
Example 6.3
Solve the following IVP:
$$y' - y = e^t, \qquad y(0) = 1$$
This equation matches the form that the method of integration factors is suited for, so let’s
start by defining $\mu(t)$.
$$\mu(t) = e^{-\int dt} = e^{-t}$$
$$y' - y = e^t$$
$$e^{-t}y' - e^{-t}y = e^{-t}e^t$$
$$\left(e^{-t}y\right)' = 1$$
$$\int \left(e^{-t}y\right)'\,dt = \int dt$$
$$e^{-t}y = t + c$$
$$y = e^t(t + c)$$
$$y(0) = c = 1 \quad\Rightarrow\quad y(t) = e^t(t + 1)$$
6.4 Solving by Separation of Variables
The last method we cover for solving first order equations is separation of variables, and it is
the simplest method of those presented thus far. Separation of variables applies when an equation can
be written in the form:
$$N(y)\frac{dy}{dt} = M(t) \tag{6.11}$$
To solve, we separate the equation such that each side only contains one variable, including differentials. So Eq. (6.11) becomes
N (y) dy = M (t) dt
Then we integrate both sides to get
$$\int N(y)\,dy = \int M(t)\,dt \tag{6.12}$$
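As a brief illustration of Eq. (6.12), here is a short worked example. It is not taken from the original notes; the equation and initial condition are chosen only because the integrals are easy. Take N(y) = y and M(t) = t, with y(0) = 1, and lump the integration constants together as before:

```latex
\begin{aligned}
y\,\frac{dy}{dt} &= t, \qquad y(0) = 1 \\
\int y\,dy &= \int t\,dt \\
\tfrac{1}{2}y^2 &= \tfrac{1}{2}t^2 + c \\
y(0) = 1 \;&\Rightarrow\; c = \tfrac{1}{2} \\
y(t) &= \sqrt{t^2 + 1}
\end{aligned}
```

The positive square root is taken because y(0) = 1 is positive. Differentiating gives y′ = t/√(t² + 1), so y·y′ = t, as required.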
7 Linear Algebra
Before we get started with studying linear algebra, I want to speak to a certain group of people.
I want to speak to the people that previewed this chapter, saw matrices, and got scared. I, like
you, was once terrified of matrices and avoided them at all costs. The reason for this being that
when I first learned about matrices, their operations and uses seemed incredibly arbitrary. They
felt like some really weird corner of math that teachers created just to annoy students. That is
not the case, and I intend to cover this material in a way that motivates a need for matrices
and gives a clear example of why their operations are defined the way they are.
This chapter is heavily influenced by 3Blue1Brown’s Essence of Linear Algebra series on YouTube.
Much of what we cover here is covered in that series, so if you think you’d benefit from watching
these sections as videos, you can find a link below.
7.1 Defining Linearity
At a few points in these notes, we have mentioned certain things being “linear”. For example, in
Properties of Derivatives, we said that the derivative was a linear operator, and so it adhered to
the properties:
$$\frac{d}{dx}\Big[f(x) \pm g(x)\Big] = \frac{d}{dx}f(x) \pm \frac{d}{dx}g(x)$$
and
$$\frac{d}{dx}\Big[c \cdot f(x)\Big] = c \cdot \frac{d}{dx}f(x)$$
or, in Introduction to Differential Equations we said that equations of the form
p(t)y ′ + q(t)y = r(t)
were linear differential equations. Now we define a bit more precisely what it means for something
to be linear in math.
To define what linearity is, suppose we have some function F (X). This function can be whatever
we want it to be, and it doesn’t have to be like the functions we are familiar with using. That is,
X doesn’t have to be a number like we normally see with f (x). X could instead be a function,
and F would then be a function that does something to the input function X to return F (X). To
give a specific example, we could define F as
$$F(X) = \frac{dX}{dt}$$
In this case, F takes in functions and outputs their first derivative. We say F is linear if it adheres
to the following properties:
1. F (X + Y ) = F (X) + F (Y )
2. F (αX) = αF (X)
That is, if we add two valid inputs of F together and then run them through F, the result is the
same as running each input through F separately, then adding the results. Likewise, if we scale
a valid input X by some constant factor α and run it through F, the result is the
same as multiplying the output F(X) by that same factor. In Tree of Proofs for Differential
Calculus, we prove that
$$\frac{d}{dx}\Big[f(x) \pm g(x)\Big] = \frac{d}{dx}f(x) \pm \frac{d}{dx}g(x)$$
and
$$\frac{d}{dx}\Big[c \cdot f(x)\Big] = c \cdot \frac{d}{dx}f(x)$$
and therefore, the derivative is a linear operator. For differential equations, we will let
F (X) = p(t)X ′ + q(t)X
If we then input the sum of two functions, $X + Y$, into $F$, we get
$$\begin{aligned}
F(X + Y) &= p(t)(X + Y)' + q(t)(X + Y) \\
&= p(t)X' + p(t)Y' + q(t)X + q(t)Y \\
&= \Big[p(t)X' + q(t)X\Big] + \Big[p(t)Y' + q(t)Y\Big] \\
&= F(X) + F(Y)
\end{aligned}$$
so the first property of linearity holds. As for the second,
$$\begin{aligned}
F(\alpha X) &= p(t)(\alpha X)' + q(t)(\alpha X) \\
&= \alpha p(t)X' + \alpha q(t)X \\
&= \alpha\Big[p(t)X' + q(t)X\Big] \\
&= \alpha F(X)
\end{aligned}$$
and thus, equations of this form are linear. As an exercise, prove that the following differential
equation is not linear:
$$F(X) = X^2 X' + X$$
At its root, linear algebra is the study of mathematical objects that adhere to the properties of
linearity.
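The first linearity property can also be probed numerically. The sketch below builds two operators on functions, one of the linear form p(t)X′ + q(t)X and the nonlinear one from the exercise, and measures how far F(X + Y) is from F(X) + F(Y) at a single time. The choices p(t) = t, q(t) = 2, X = sin, Y = exp, and t = 0.7 are arbitrary assumptions, derivatives are approximated by central differences, and a check at one point is evidence rather than a proof.

```python
import math

def derivative(f, h=1e-5):
    """Return a function approximating f' by central differences."""
    return lambda t: (f(t + h) - f(t - h)) / (2.0 * h)

def linear_op(X):
    """F(X) = p(t) X' + q(t) X with the sample choices p(t) = t, q(t) = 2."""
    dX = derivative(X)
    return lambda t: t * dX(t) + 2.0 * X(t)

def nonlinear_op(X):
    """F(X) = X^2 X' + X, the operator from the exercise."""
    dX = derivative(X)
    return lambda t: X(t) ** 2 * dX(t) + X(t)

def additivity_gap(F, X, Y, t):
    """|F(X + Y) - (F(X) + F(Y))| evaluated at time t."""
    total = lambda s: X(s) + Y(s)
    return abs(F(total)(t) - (F(X)(t) + F(Y)(t)))

t0 = 0.7
gap_linear = additivity_gap(linear_op, math.sin, math.exp, t0)
gap_nonlinear = additivity_gap(nonlinear_op, math.sin, math.exp, t0)
print(f"linear operator gap:    {gap_linear:.2e}")
print(f"nonlinear operator gap: {gap_nonlinear:.2e}")
```

The gap for the linear operator should be at the level of rounding error, while the gap for the nonlinear operator is clearly nonzero, which is a hint about where the proof in the exercise should look.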
7.2 Introduction to Linear Transformations
7.3 Linear Systems of Equations as Linear Transformations
7.4 Dot Products Revisited
7.5 The Determinant
7.6 Cross Products Revisited
Appendix
A Additional Material
A.1 Prerequisite Material
The items listed below all appear at various points of this document. If you need a refresher, I’d
recommend briefly reading some of this content and doing some example problems.
• Paul’s Online Notes: Summation Notation
• Paul’s Online Notes: Limits
• Paul’s Online Notes: Trig Functions
• Paul’s Online Notes: Logarithm Functions
• Paul’s Online Notes: Vectors (vectors are covered very early on in nearly every intro. physics
course, so you may be able to wait until they are covered in your class)
A.2 Material for Topics Covered in This Document
If you need further detail on some of the topics covered in this document, I strongly recommend
Paul’s Online Notes. If you have been completing the practice problems as recommended, then
you should already be somewhat familiar with his website. See the content of his offered courses
to find the specific material you are looking for.
B Proof of Various Derivative/Integral Properties & Formulas
Here you will find proofs for many of the numbered formulas in this document. This section is
separated into two subsections. The first will provide you with all of the traditional proofs for
many of these formulas or properties, while the second will provide my own sequence of proofs for
differentiation formulas. The intention of this second subsection is to act as a guide to proving
many of these properties and formulas by very simple means. Note that the second subsection
won’t be super useful until you’ve completed the differentiation chapter, as the proof sequence is
different from the sequence in which these formulas were taught.
B.1 Traditional Proofs for Derivatives and Integrals
Derivatives
• Paul’s Online Notes: Proof of Sum/Difference of Two Functions
• Paul’s Online Notes: Proof of Constant Times a Function
• Paul’s Online Notes: Proof of the Derivative of a Constant
• Paul’s Online Notes: Power Rule (Proof 2 or 3 recommended, Proof 3 shown in next subsection)
• Paul’s Online Notes: Product, Quotient, & Chain Rule (Bottom of page; Individual anchors
could not be found)
• Paul’s Online Notes: Derivatives Of Trig Functions
• Paul’s Online Notes: Derivatives Of Exponential And Logarithm Functions
• Paul’s Online Notes: Derivatives Of Inverse Trig Functions
• Paul’s Online Notes: Proofs Of Derivative Applications Facts
Integration:
• Paul’s Online Notes: Proof of: $\int k f(x)\,dx = k\int f(x)\,dx$ where $k$ is any number
• Paul’s Online Notes: Proof of: $\int f(x) \pm g(x)\,dx = \int f(x)\,dx \pm \int g(x)\,dx$
• Paul’s Online Notes: Proof Of Various Integral Properties (Includes all other integral formulas/properties; Individual anchors could not be found)
B.2 Tree of Proofs for Differential Calculus
I rarely see students curious as to why a certain formula or property is true, and I feel this is
because proofs are often either too complicated, or they use a series of steps that seem so random
that you’d never think of them yourself and find them difficult to remember as a result. This led
me to create my own section proving/demonstrating many of the differentiation formulas used in
this document in the simplest ways I could think of.
These proofs use logarithmic differentiation (discussed in Example 1.10), which enables them to be
extremely simple in comparison to the traditional methods you’d normally see. The downside is
that the use of logarithmic differentiation means the formulas are proven in a very different order
from how you would normally see these formulas taught. That being said, I recommend that you
complete Differential Calculus up to Chain Rule before viewing this section.
Before we get to proving many of the differentiation formulas we covered, we need to write out
some definitions and use them to establish a few facts. This will enable us to use logarithmic
differentiation in a more rigorous manner. Once we get that out of the way, you’ll probably be
able to do many of the proofs on your own, as logarithmic differentiation really makes these proofs
extremely simple. Note that these proofs are heavily based on properties of limits and logs (which
are essentially just exponent properties). I chose not to include the proofs of these properties, as
they are typically shown prior to Calculus I, and thus it seems fair to assume that they are true.
However, if you have never verified these properties, I encourage you to do so.
On the next page, I have created a diagram to show how each proof enables the proof of another
formula. You can either view each proof in the sequence they appear in this document, or use the
embedded anchors to follow the logic to whatever formula you would like to see proven. I hope
you can appreciate the simplicity of these proofs and are able to use this knowledge in your future
calculus studies.
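[Diagram: Tree of Proofs for Differential Calculus. Nodes: Definition of $e$; Definition of the Derivative; $\lim_{x \to 0}\frac{e^x - 1}{x}$; $\frac{d}{dx}f(g(x))$; $\frac{d}{dx}\big[c \cdot f(x)\big]$; $\frac{d}{dx}\big[f(x) \pm g(x)\big]$; $\frac{d}{dx}a^x$; $\frac{d}{dx}\big[f(x) \cdot g(x)\big]$; $\frac{d}{dx}\log_a(x)$; $\frac{d}{dx}\frac{f(x)}{g(x)}$; $\frac{d}{dx}e^x$; $\frac{d}{dx}\ln x$; $\frac{d}{dx}x^n$]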
Definition of Euler’s Number
$$e = \lim_{x \to 0}\,(1 + x)^{1/x}$$
Definition of the Derivative
$$\frac{d}{dx}f(x) = \lim_{h \to 0}\frac{f(x + h) - f(x)}{h}$$
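Proof that $\lim_{x \to 0}\frac{e^x - 1}{x} = 1$
Let $y = e^x - 1$
$$y = e^x - 1$$
$$y + 1 = e^x$$
$$\ln(1 + y) = \ln(e^x) = x$$
$$\lim_{x \to 0} y = \lim_{x \to 0}\left(e^x - 1\right) = 0 \quad\Rightarrow\quad \lim_{x \to 0}\frac{e^x - 1}{x} = \lim_{y \to 0}\frac{e^x - 1}{x}$$
$$\begin{aligned}
\lim_{y \to 0}\frac{e^x - 1}{x} &= \lim_{y \to 0}\frac{e^{\ln(1 + y)} - 1}{\ln(1 + y)} \\
&= \lim_{y \to 0}\frac{y}{\ln(1 + y)} \\
&= \lim_{y \to 0}\left[\frac{\ln(1 + y)}{y}\right]^{-1} \\
&= \lim_{y \to 0}\left[\ln\,(1 + y)^{1/y}\right]^{-1} \\
&= \left[\ln \lim_{y \to 0}\,(1 + y)^{1/y}\right]^{-1} \\
&= \big[\ln e\big]^{-1} \\
&= 1^{-1} \\
&= 1
\end{aligned}$$
$$\therefore\ \lim_{x \to 0}\frac{e^x - 1}{x} = 1$$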
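For readers who like to see a limit before trusting it, the sketch below tabulates both quantities used in this proof for shrinking values of x: (1 + x)^(1/x), which should approach e, and (e^x − 1)/x, which should approach 1. The helper name `limit_table` and the sample values of x are illustrative assumptions, and a table of values is only a plausibility check, not a substitute for the proof.

```python
import math

def limit_table(xs):
    """Return (x, (1 + x)^(1/x), (e^x - 1)/x) for each x in xs."""
    return [(x, (1.0 + x) ** (1.0 / x), (math.exp(x) - 1.0) / x) for x in xs]

rows = limit_table([1e-1, 1e-3, 1e-5])
for x, euler_estimate, ratio in rows:
    print(f"x = {x:g}: (1+x)^(1/x) = {euler_estimate:.6f}, (e^x - 1)/x = {ratio:.6f}")
print(f"math.e = {math.e:.6f}")
```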
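Proof that $\frac{d}{dx}\big[c \cdot f(x)\big] = c \cdot \frac{d}{dx}f(x)$
Let $g(x) = cf(x)$ where $f$ is differentiable at $x$ and $c$ is an arbitrary constant
$$\begin{aligned}
\frac{d}{dx}g(x) &= \lim_{h \to 0}\frac{g(x + h) - g(x)}{h} \\
&= \lim_{h \to 0}\frac{cf(x + h) - cf(x)}{h} \\
&= \lim_{h \to 0}\frac{c\big[f(x + h) - f(x)\big]}{h} \\
&= c \cdot \lim_{h \to 0}\frac{f(x + h) - f(x)}{h} \\
&= c \cdot \frac{d}{dx}f(x)
\end{aligned}$$
$$g(x) = cf(x) \quad\Rightarrow\quad \frac{d}{dx}\big[c \cdot f(x)\big] = c \cdot \frac{d}{dx}f(x)$$
$$\therefore\ \frac{d}{dx}\big[c \cdot f(x)\big] = c \cdot \frac{d}{dx}f(x)$$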
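Proof that $\frac{d}{dx}\big[f(x) \pm g(x)\big] = \frac{d}{dx}f(x) \pm \frac{d}{dx}g(x)$
Let $H(x) = f(x) + k\,g(x)$ where $f$ and $g$ are differentiable at $x$
and $k$ is an arbitrary constant
$$\begin{aligned}
\frac{d}{dx}H(x) &= \lim_{h \to 0}\frac{H(x + h) - H(x)}{h} \\
&= \lim_{h \to 0}\frac{f(x + h) + k\,g(x + h) - f(x) - k\,g(x)}{h} \\
&= \lim_{h \to 0}\frac{f(x + h) - f(x) + k\,g(x + h) - k\,g(x)}{h} \\
&= \lim_{h \to 0}\frac{f(x + h) - f(x) + k\big[g(x + h) - g(x)\big]}{h} \\
&= \lim_{h \to 0}\left[\frac{f(x + h) - f(x)}{h} + k\,\frac{g(x + h) - g(x)}{h}\right] \\
&= \lim_{h \to 0}\frac{f(x + h) - f(x)}{h} + k\lim_{h \to 0}\frac{g(x + h) - g(x)}{h} \\
&= \frac{d}{dx}f(x) + k\,\frac{d}{dx}g(x)
\end{aligned}$$
$$k = 1 \quad\Rightarrow\quad \frac{d}{dx}\big[f(x) + g(x)\big] = f'(x) + g'(x)$$
$$k = -1 \quad\Rightarrow\quad \frac{d}{dx}\big[f(x) - g(x)\big] = f'(x) - g'(x)$$
$$\Rightarrow\quad \frac{d}{dx}\big[f(x) \pm g(x)\big] = \frac{d}{dx}f(x) \pm \frac{d}{dx}g(x)$$
$$\therefore\ \frac{d}{dx}\big[f(x) \pm g(x)\big] = \frac{d}{dx}f(x) \pm \frac{d}{dx}g(x)$$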
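Proof that $\frac{d}{dx}f\big(g(x)\big) = f'\big(g(x)\big) \cdot g'(x)$
Let $y = f(u)$ and $u = g(x)$ where $g$ is differentiable at $x$ and $f$ is differentiable at $g(x)$
$$\begin{aligned}
\frac{d}{dx}f\big(g(x)\big) &= \frac{d}{dx}y(u) \\
&= \lim_{\Delta x \to 0}\frac{\Delta y}{\Delta x} \\
&= \lim_{\Delta x \to 0}\frac{\Delta y}{\Delta x} \cdot \frac{\Delta u}{\Delta u} \\
&= \lim_{\Delta x \to 0}\frac{\Delta y}{\Delta u} \cdot \frac{\Delta u}{\Delta x} \\
&= \left[\lim_{\Delta x \to 0}\frac{\Delta y}{\Delta u}\right]\left[\lim_{\Delta x \to 0}\frac{\Delta u}{\Delta x}\right]
\end{aligned}$$
$$\Delta u = u(x + \Delta x) - u(x) \quad\Rightarrow\quad \lim_{\Delta x \to 0}\Delta u = u(x) - u(x) = 0$$
$$\lim_{\Delta x \to 0}\Delta u = 0 \quad\Rightarrow\quad \lim_{\Delta x \to 0}\frac{\Delta y}{\Delta u} = \lim_{\Delta u \to 0}\frac{\Delta y}{\Delta u}$$
$$\begin{aligned}
\left[\lim_{\Delta x \to 0}\frac{\Delta y}{\Delta u}\right]\left[\lim_{\Delta x \to 0}\frac{\Delta u}{\Delta x}\right] &= \left[\lim_{\Delta u \to 0}\frac{\Delta y}{\Delta u}\right]\left[\lim_{\Delta x \to 0}\frac{\Delta u}{\Delta x}\right] \\
&= \left[\lim_{\Delta u \to 0}\frac{y(u + \Delta u) - y(u)}{\Delta u}\right]\left[\lim_{\Delta x \to 0}\frac{u(x + \Delta x) - u(x)}{\Delta x}\right] \\
&= y'(u) \cdot u'(x) \\
&= f'\big(g(x)\big) \cdot g'(x)
\end{aligned}$$
$$\frac{d}{dx}f\big(g(x)\big) = \left[\lim_{\Delta x \to 0}\frac{\Delta y}{\Delta u}\right]\left[\lim_{\Delta x \to 0}\frac{\Delta u}{\Delta x}\right] \quad\Rightarrow\quad \frac{d}{dx}f\big(g(x)\big) = f'\big(g(x)\big) \cdot g'(x)$$
$$\therefore\ \frac{d}{dx}f\big(g(x)\big) = f'\big(g(x)\big) \cdot g'(x)$$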
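Proof that $\frac{d}{dx}e^x = e^x$
Let $f(x) = e^x$
$$\begin{aligned}
\frac{d}{dx}f(x) &= \lim_{h \to 0}\frac{f(x + h) - f(x)}{h} \\
&= \lim_{h \to 0}\frac{e^{x + h} - e^x}{h} \\
&= \lim_{h \to 0}\frac{e^x e^h - e^x}{h} \\
&= \lim_{h \to 0}\frac{e^x\left(e^h - 1\right)}{h} \\
&= e^x \cdot \lim_{h \to 0}\frac{e^h - 1}{h} \\
&= e^x \cdot 1 \\
&= e^x
\end{aligned}$$
$$\therefore\ \frac{d}{dx}e^x = e^x$$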
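Proof that $\frac{d}{dx}\ln x = \frac{1}{x}$
Let $y = \ln x$
$$y = \ln x$$
$$e^y = e^{\ln x} = x$$
$$\frac{d}{dx}e^y = \frac{d}{dx}x$$
$$(e^y)(y') = 1$$
$$\left(e^{\ln x}\right)(y') = 1$$
$$(x)(y') = 1$$
$$y' = \frac{1}{x}$$
$$y' = \frac{d}{dx}\ln x \quad\Rightarrow\quad \frac{d}{dx}\ln x = \frac{1}{x}$$
$$\therefore\ \frac{d}{dx}\ln x = \frac{1}{x}$$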
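Proof that $\frac{d}{dx}x^n = nx^{n-1}$
Let $y = x^n$ where $n$ is any constant number
$$y = x^n$$
$$\ln y = \ln\left(x^n\right)$$
$$\ln y = n\ln x$$
$$\frac{d}{dx}\ln y = \frac{d}{dx}\big[n\ln x\big]$$
$$\frac{y'}{y} = n\,\frac{d}{dx}\ln x$$
$$\begin{aligned}
y' &= ny\left(\frac{1}{x}\right) \\
&= n\left(\frac{x^n}{x}\right) \\
&= n\left(x^{n-1}\right)
\end{aligned}$$
$$y' = \frac{d}{dx}x^n \quad\Rightarrow\quad \frac{d}{dx}x^n = nx^{n-1}$$
$$\therefore\ \frac{d}{dx}x^n = nx^{n-1}$$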
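Proof that $\frac{d}{dx}\big[f(x) \cdot g(x)\big] = f'g + fg'$
Let $y = f(x) \cdot g(x)$ where $f$ and $g$ are differentiable at $x$
$$y = f \cdot g$$
$$\ln y = \ln\,(f \cdot g)$$
$$\ln y = \ln(f) + \ln(g)$$
$$\frac{d}{dx}\ln y = \frac{d}{dx}\big[\ln(f) + \ln(g)\big]$$
$$\frac{y'}{y} = \frac{d}{dx}\ln(f) + \frac{d}{dx}\ln(g)$$
$$\begin{aligned}
y' &= y\left(\frac{f'}{f} + \frac{g'}{g}\right) \\
&= f \cdot g\left(\frac{f'}{f} + \frac{g'}{g}\right) \\
&= \frac{f' \cdot f \cdot g}{f} + \frac{g' \cdot f \cdot g}{g} \\
&= f'g + fg'
\end{aligned}$$
$$y' = \frac{d}{dx}\big[f(x) \cdot g(x)\big] \quad\Rightarrow\quad \frac{d}{dx}\big[f(x) \cdot g(x)\big] = f'g + fg'$$
$$\therefore\ \frac{d}{dx}\big[f(x) \cdot g(x)\big] = f'g + fg'$$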
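Proof that $\frac{d}{dx}\frac{f(x)}{g(x)} = \frac{f'g - fg'}{g^2}$
Let $y = \frac{f(x)}{g(x)}$ where $f$ and $g$ are differentiable at $x$
$$y = \frac{f}{g}$$
$$\ln y = \ln\left(\frac{f}{g}\right)$$
$$\ln y = \ln(f) - \ln(g)$$
$$\frac{d}{dx}\ln y = \frac{d}{dx}\big[\ln(f) - \ln(g)\big]$$
$$\frac{y'}{y} = \frac{d}{dx}\ln(f) - \frac{d}{dx}\ln(g)$$
$$\begin{aligned}
y' &= y\left(\frac{f'}{f} - \frac{g'}{g}\right) \\
&= \frac{f}{g}\left(\frac{f'}{f} - \frac{g'}{g}\right) \\
&= \frac{f' \cdot f}{f \cdot g} - \frac{g' \cdot f}{g^2} \\
&= \frac{f'}{g} - \frac{fg'}{g^2} \\
&= \frac{f'}{g} \cdot \frac{g}{g} - \frac{fg'}{g^2} \\
&= \frac{f'g - fg'}{g^2}
\end{aligned}$$
$$y' = \frac{d}{dx}\frac{f(x)}{g(x)} \quad\Rightarrow\quad \frac{d}{dx}\frac{f(x)}{g(x)} = \frac{f'g - fg'}{g^2}$$
$$\therefore\ \frac{d}{dx}\frac{f(x)}{g(x)} = \frac{f'g - fg'}{g^2}$$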
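Proof that $\frac{d}{dx}\log_a(x) = \frac{1}{x\ln a}$
Let $y = \log_a(x)$ where $a$ is an arbitrary constant greater than zero
$$y = \log_a(x)$$
$$a^y = a^{\log_a(x)} = x$$
$$\ln a^y = \ln x$$
$$y\ln a = \ln x$$
$$y = \frac{\ln x}{\ln a}$$
$$\begin{aligned}
\frac{dy}{dx} &= \frac{1}{\ln a} \cdot \frac{d}{dx}\ln x \\
&= \frac{1}{\ln a} \cdot \frac{1}{x} \\
&= \frac{1}{x\ln a}
\end{aligned}$$
$$\frac{dy}{dx} = \frac{d}{dx}\log_a(x) \quad\Rightarrow\quad \frac{d}{dx}\log_a(x) = \frac{1}{x\ln a}$$
$$\therefore\ \frac{d}{dx}\log_a(x) = \frac{1}{x\ln a}$$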
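Proof that $\frac{d}{dx}a^x = a^x\ln a$
Let $y = a^x$ where $a$ is an arbitrary constant greater than zero
$$y = a^x$$
$$\ln y = \ln\left(a^x\right)$$
$$\ln y = x\ln a$$
$$\frac{d}{dx}\ln y = \frac{d}{dx}\big[x\ln a\big]$$
$$\frac{y'}{y} = \ln a \cdot \frac{d}{dx}x$$
$$y' = y\ln a \cdot 1$$
$$y' = a^x\ln a$$
$$y' = \frac{d}{dx}a^x \quad\Rightarrow\quad \frac{d}{dx}a^x = a^x\ln a$$
$$\therefore\ \frac{d}{dx}a^x = a^x\ln a$$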
C Resource Contributions
Grant Sanderson (aka 3Blue1Brown):
Contribution: Mr. Sanderson’s Essence of Linear Algebra series was a great help when first
learning about linear transformations and in realizing their connections to vector products.
Additionally, he is the initial creator of Manim, the Python package used to create the
animations in this document.
https://www.youtube.com/@danthetutor2624 (RHR)
D Development Testing
• Eq. (1.1)
• Eq. (1.2)
• Eq. (1.3)
• Eq. (1.4)
• Eq. (1.5)
• Eq. (1.6)
• Eq. (1.7)
• Eq. (1.8)
• Eq. (1.9)
• Eq. (1.10)
• Eq. (1.11)
• Eq. (1.12)
• Eq. (1.13)
• Eq. (1.14)
• Eq. (1.15)
• Eq. (1.16)
• Eq. (1.17)
• Eq. (1.18)
• Eq. (1.19)
• Eq. (1.20)
• Eq. (1.21)
• Eq. (1.22)
• Eq. (1.23)
• Eq. (2.1)
• Eq. (2.2)
• Eq. (2.3)
• Eq. (2.4)
• Eq. (2.5)
• Eq. (2.6)
• Eq. (2.7)
• Eq. (2.8)
• Eq. (2.9)
• Eq. (2.10)
• Eq. (2.11)
• Eq. (2.12)
• Eq. (2.13)
• Eq. (2.14)
• Eq. (2.15)
• Eq. (2.16)
• Eq. (2.17)
• Eq. (2.18)
• Eq. (2.19)
• Eq. (2.20)
• Eq. (2.21)
• Eq. (2.22)
• Eq. (2.23)
• Eq. (2.24)
• Eq. (2.25)
• Eq. (2.26)
• Eq. (2.27)
• Eq. (3.1)
• Eq. (3.2)
• Eq. (3.3)
• Eq. (3.4)
• Eq. (3.5)
• Eq. (3.6)
• Eq. (3.7)
• Eq. (4.1)
• Eq. (4.2)
• Eq. (4.3)
• Eq. (4.4)
• Eq. (4.5)
• Eq. (4.6)
• Eq. (4.7)
• Eq. (4.8)
• Eq. (4.9)
• Eq. (4.10)
• Eq. (4.11)
• Eq. (4.12)
• Eq. (4.13)
• Eq. (4.14)
• Eq. (4.15)
• Eq. (4.16)
• Eq. (4.17)
• Eq. (4.18)
• Eq. (4.19)
• Eq. (4.20)
• Eq. (4.21)
• Eq. (4.22)
• Eq. (4.23)
• Eq. (6.1)
• Eq. (6.2)
• Eq. (6.3)
• Eq. (6.4)
• Eq. (6.5)
• Eq. (6.6)
• Eq. (6.7)
• Eq. (6.8)
• Eq. (6.9)
• Eq. (6.10)
• Eq. (6.11)
• Eq. (6.12)
• Example 1.1
• Example 1.2
• Example 1.3
• Example 1.4
• Example 1.5
• Example 1.6
• Example 1.7
• Example 1.8
• Example 1.9
• Example 1.10
• Example 1.11
• Example 1.12
• Example 1.13
• Example 1.14
• Example 1.15
• Example 2.1
• Example 2.2
• Example 2.3
• Example 2.4
• Example 2.5
• Example 2.6
• Example 2.7
• Example 2.8
• Example 2.9
• Example 2.10
• Example 2.11
• Example 2.12
• Example 2.13
• Example 2.14
• Example 2.15
• Example 2.16
• Example 2.17
• Example 3.1
• Example 3.2
• Example 3.3
• Example 3.4
• Example 3.5
• Example 3.6
• Example 3.7
• Example 3.8
• Example 3.9
• Example 3.10
• Example 3.11
• Example 3.12
• Example 4.1
• Example 4.2
• Example 4.3
• Example 4.4
• Example 4.5
• Example 4.6
• Example 4.7
• Example 4.8
• Example 6.1
• Example 6.2
• Example 6.3
• Figure 4.15
• Figure 6.1
• Identity 3.1
• Identity 3.2
• Identity 3.3
• Identity 3.4
• Identity 3.5
• Identity 3.6
• Identity 4.1
• Animation 4.1
• Animation 4.3
• Animation 4.4
• Animation 4.5