UoA Maths 108
Maths 108: Exam Review
Lectures 1-32
UoA 2019
Hello! This document is an extension of the Maths 108 test review document we wrote
up for the midterm test. Like the test review document, it consists of a list of what we have
covered in this class, along with examples to look at.
Just like the test review, this document is not meant to serve as a replacement for your
coursebook! Instead, it is meant to give you a set of examples and summarise what we
have been doing in the course thus far. A good way to approach this document is to read
through it and, each time you come across something confusing, use your resources (office
hours, email, talking to friends, Piazza, the coursebook, the textbooks) to review the related
concepts and figure out what is going on.
As well, this document only covers lectures 13-32. The exam, however, is comprehensive
and covers everything in your coursebook: that is, the exam covers lectures 1-32! If
you want to review topics 1 and 2, (and you should!), check out the test review document
we put up earlier.
1 Solving Systems of Linear Equations [Lectures 13-15]
After reading and watching these lectures, we are hoping that you can do the following
tasks:
• Take a system of linear equations and transform it into an augmented matrix.
• Use row operations to transform an augmented matrix into reduced row-echelon form.
• Use the reduced row-echelon form of an augmented matrix to find all of the solutions
to its corresponding system of linear equations.
1.1 Fundamentals
A linear equation in n variables x1, . . ., xn is an equation of the form a1 x1 + a2 x2 + . . . +
an xn = c, for constants a1, . . ., an, c ∈ R. For example, 2x + 3y = 4 and 3x + 2y − z = 11 are
both linear equations. A system of linear equations is just a collection of multiple linear
equations, like

    2x + y = 1
     x + y = 2.

Given a system of linear equations, a solution is any way to replace the variables of that
system with real numbers, so that all of the equations are satisfied. For example,
(x, y) = (−1, 3) is a solution to the system above, because plugging in x = −1, y = 3 into
our equations satisfies both of them!
Given a system of linear equations, we often want to find all of the possible solutions
to that system of equations. In these lectures, we came up with a multi-step process for
finding these solutions! The key ingredients in this process are the following:
1. We can create the augmented matrix corresponding to a system of linear equations
   by writing down the coefficients of that system of equations in a matrix. For example,
   the system

       2x + y = 1
        x + y = 2

   corresponds to the augmented matrix

       [ 2 1 | 1 ]
       [ 1 1 | 2 ]

   We put the coefficients of the variables on the left and put the constants on the right, and
   draw a vertical line between them so that we can quickly tell the difference.
2. Given an augmented matrix, we can perform row operations on that augmented matrix
   without changing the solutions to the underlying system of linear equations! There
   are three row operations available to us:

   • Switch two rows. For example, we could switch rows 1 and 2 in the augmented
     matrix above:

         [ 2 1 | 1 ]  --switch R1 and R2-->  [ 1 1 | 2 ]
         [ 1 1 | 2 ]                         [ 2 1 | 1 ]

   • Add a multiple of one row to another. For example, we could add −2R1 to R2:

         [ 1 1 | 2 ]  --add −2R1 to R2-->  [ 1  1 |  2 ]
         [ 2 1 | 1 ]                       [ 0 −1 | −3 ]

   • Multiply each entry in a row by a nonzero constant. For example, we could
     multiply R2 by −1:

         [ 1  1 |  2 ]  --multiply R2 by −1-->  [ 1 1 | 2 ]
         [ 0 −1 | −3 ]                          [ 0 1 | 3 ]
3. The goal of these row operations is to transform this matrix into something called
   reduced row-echelon form, or RREF for short. Augmented matrices in reduced
   row-echelon form have the following form:

   • The first nonzero entry in each row is a 1; we call these 1’s leading 1’s.
   • All other entries in the same column as a leading 1 are zero.
   • Each leading 1 is to the left of any leading 1 below it.

   For example, if we take the augmented matrix

       [ 1 1 | 2 ]
       [ 0 1 | 3 ]

   we were working on earlier and add −R2 to R1, we get

       [ 1 0 | −1 ]
       [ 0 1 |  3 ]

   The first nonzero entry in each row is a 1, there are 0’s above and below these 1’s, and
   each leading 1 is to the left of the leading 1 below it; so this is in reduced row-echelon
   form!

   We call a variable of a system of linear equations leading if there is a leading 1 in the
   corresponding column of its augmented matrix’s reduced row-echelon form, and we
   call a variable free if it is not leading.
4. Finally, once the augmented matrix is in reduced row-echelon form, we turn it back
   into a set of linear equations:

       [ 1 0 | −1 ]        x + 0y = −1
       [ 0 1 |  3 ]  −→    0x +  y =  3

   In this form, it’s really easy to read off the solutions to our original system: we just
   want x = −1, y = 3!
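The four ingredients above can also be sketched as a short program. Here is a minimal pure-Python row-reduction routine; the function name `rref` and the use of exact fractions (to avoid round-off) are our own choices, not part of the course:

```python
from fractions import Fraction

def rref(matrix):
    """Return the reduced row-echelon form of a matrix given as a list of rows."""
    m = [[Fraction(x) for x in row] for row in matrix]
    rows, cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(cols):
        # Find a row at or below pivot_row with a nonzero entry in this column.
        pivot = next((r for r in range(pivot_row, rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[pivot_row], m[pivot] = m[pivot], m[pivot_row]               # switch two rows
        m[pivot_row] = [x / m[pivot_row][col] for x in m[pivot_row]]  # make a leading 1
        for r in range(rows):
            if r != pivot_row and m[r][col] != 0:                     # clear the column
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == rows:
            break
    return m

# The augmented matrix of 2x + y = 1, x + y = 2 reduces to x = -1, y = 3:
print(rref([[2, 1, 1], [1, 1, 2]]) == [[1, 0, -1], [0, 1, 3]])  # → True
```

This is one implementation of the flowchart described in the next subsection; it applies all three row operations in the same order we use by hand.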
1.2 Algorithms and an Example
So: we know what we want to do (take an augmented matrix, get its reduced row-echelon
form, interpret the answers). To describe the “how,” I’ve created some flowcharts. The
first one is a loop:

• Is there a row without a leading 1 that has nonzero entries in it? If no, we’re done!
• If yes: let Ri be the topmost row without a leading 1, and let Cj be the leftmost
  column containing nonzero entries other than leading 1’s. Using row operations, make
  the entry in (i, j) a 1. Call this 1 a leading 1.
• Make all other entries in column Cj zero by adding multiples of Ri to them. Then
  return to the first step.

Figure 1: A flowchart for finding the RREF of an augmented matrix.
The second flowchart interprets the result:

• Is there a row that looks like [ 0 . . . 0 | c ] for some c ≠ 0? If yes, there are no
  solutions to your system of linear equations.
• If no, count your free variables:
  – 0 free variables: there is exactly one solution to your system of linear equations.
  – 1 free variable: you have a line of solutions.
  – 2 free variables: you have a plane of solutions.
  – 3+ free variables: you have a d-dimensional space of solutions, where d is the
    number of free variables you have.

Figure 2: A flowchart for how to use RREFs to solve systems of linear equations.
These charts summarize the processes we’ve described in class for finding the RREF of
a matrix and interpreting it! To illustrate how these processes work in action, we consider
an example problem here:
Example. Consider the system of linear equations

    4x + 3y + 2z = 1
     x +  y +  z = 1
          y + 2z = 3

What are the solutions to this system of linear equations? How many solutions does this
system of linear equations have?

Answer. We start by converting this system of linear equations to an augmented matrix:

    4x + 3y + 2z = 1          [ 4 3 2 | 1 ]
     x +  y +  z = 1   −→     [ 1 1 1 | 1 ]
          y + 2z = 3          [ 0 1 2 | 3 ]
We then use our first flowchart to tell us how to find the reduced row-echelon form of this
matrix:

    [ 4 3 2 | 1 ]
    [ 1 1 1 | 1 ]
    [ 0 1 2 | 3 ]

Find R1 and C1, and place a leading 1 in (1, 1) by swapping R1 and R2:

    [ 1 1 1 | 1 ]
    [ 4 3 2 | 1 ]
    [ 0 1 2 | 3 ]

Make the other entries in C1 zero by adding −4R1 to R2:

    [ 1  1  1 |  1 ]
    [ 0 −1 −2 | −3 ]
    [ 0  1  2 |  3 ]

Find R2 and C2, and place a leading 1 in (2, 2) by scaling R2 by −1:

    [ 1 1 1 | 1 ]
    [ 0 1 2 | 3 ]
    [ 0 1 2 | 3 ]

Make the other entries in C2 zero by adding −R2 to R1 and to R3:

    [ 1 0 −1 | −2 ]
    [ 0 1  2 |  3 ]
    [ 0 0  0 |  0 ]

We now try to find more nonzero rows without leading 1’s; none exist, so we’re done!
We then consult our second flowchart to figure out what this means:
• We do not have any rows where all of the entries to the left of the vertical break are
0, but where the entry to the right is nonzero. So we are not in the “no solutions”
case.
• We now count our free variables. The variables x and y are leading, because there are
leading 1’s in their columns; this leaves z, which is our one free variable.
• According to our flowchart, this means we have a line of solutions!
Finally, if we want the equations for this line, we can just translate our augmented matrix
back into a set of equations:
    [ 1 0 −1 | −2 ]         x     −  z = −2
    [ 0 1  2 |  3 ]  −→         y + 2z =  3
    [ 0 0  0 |  0 ]
This gives us a set of general equations for our line. If we want a set of parametric equations,
we can set each free variable equal to its own parameter (in this case, set z = t) and rewrite
all of our other equations in terms of that parameter:
z = t,
x − t = −2 ⇒ x = t − 2,
y + 2t = 3 ⇒ y = 3 − 2t.
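A quick way to check answers like this is to plug the parametric solution back into the original system for a few values of t; here is a small script of our own to do that:

```python
def satisfies_system(t):
    """Check (x, y, z) = (t - 2, 3 - 2t, t) against the original system."""
    x, y, z = t - 2, 3 - 2 * t, t
    return (4 * x + 3 * y + 2 * z == 1
            and x + y + z == 1
            and y + 2 * z == 3)

# Every point on the line should satisfy all three equations.
print(all(satisfies_system(t) for t in range(-5, 6)))  # → True
```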
2 Matrix Operations and Applications [Lectures 16-19]
After reading and watching these lectures, we’re hoping that you can do the following tasks:
• Add matrices together, multiply matrices by scalars, multiply matrices together, and
take the transpose of a matrix.
• Find the inverse of a matrix, and use it to find solutions to a system of linear
equations.
• Know what the identity matrix is.
• Know how to take the determinant of 2 × 2, 3 × 3 and n × n matrices in general.
• Know several useful properties of the determinant.
2.1 Matrix Arithmetic
A matrix of size m × n is a grid of numbers with m rows and n columns. For example,

    [ 1 2 3 ]
    [ 2 3 1 ]

is a 2 × 3 matrix. We defined several operations on matrices in class! Most of
these operations were very intuitive. Addition, for example, was pretty straightforward:
given two matrices A, B of the same size, we could create a new matrix A + B by just
adding each cell in A to the corresponding cell in B. For example,
    [ 1 2 3 ]   [ 0 1 0 ]   [ 1+0 2+1 3+0 ]   [ 1 3 3 ]
    [ 2 3 1 ] + [ 1 0 1 ] = [ 2+1 3+0 1+1 ] = [ 3 3 2 ].
1 2
You can’t add matrices of different sizes; that is,
+ 1 2 4 DNE.
2 1
4
Scalar multiplication was similarly nice; given any m × n matrix A and any real number
x, we could form the matrix xA by multiplying each coordinate of A by x; for example,
    2 · [ 1 3 3 ] = [ 2 6 6 ]
        [ 3 3 2 ]   [ 6 6 4 ].
As well, given a matrix A we can form its transpose AT by “switching” its rows and
its columns; that is, given a matrix A, we can form the matrix AT by making a matrix
whose first row is A’s first column, whose second row is A’s second column, and so on/so
forth until we run out of columns. This is perhaps best illustrated by an example:
    [ 1 2 3 ]T   [ 1 4 7 ]
    [ 4 5 6 ]  = [ 2 5 8 ]
    [ 7 8 9 ]    [ 3 6 9 ]
Not every matrix operation is intuitive, though. Matrix multiplication, as we saw in
class, is a pretty strange thing! We defined it as follows: suppose that A is an m × n matrix
and B is an n × l matrix. Then AB, the product of A and B, is defined, and in particular
is an m × l matrix! If we let rA,1, . . ., rA,m denote the m rows of
A and cB,1, . . ., cB,l denote the l columns of B, then

         [ rA,1 · cB,1   rA,1 · cB,2   rA,1 · cB,3   . . .   rA,1 · cB,l ]
         [ rA,2 · cB,1   rA,2 · cB,2   rA,2 · cB,3   . . .   rA,2 · cB,l ]
    AB = [ rA,3 · cB,1   rA,3 · cB,2   rA,3 · cB,3   . . .   rA,3 · cB,l ]
         [     ..            ..            ..         . .        ..      ]
         [ rA,m · cB,1   rA,m · cB,2   rA,m · cB,3   . . .   rA,m · cB,l ]
In other words, to find the entry that goes in (i, j), take the dot product of the i-th row of
A and the j-th column of B.
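This dot-product rule translates directly into code. Here is a minimal pure-Python sketch (the function name `mat_mul` is our own):

```python
def mat_mul(A, B):
    """Multiply an m×n matrix A by an n×l matrix B (both lists of rows)."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must match rows of B")
    # Entry (i, j) is the dot product of row i of A with column j of B.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2, 1], [2, 3, -2], [0, 1, 1]]
D = [[-1, 1], [1, -1], [3, 4]]
print(mat_mul(A, D))  # → [[4, 3], [-5, -9], [4, 3]]
```

The printed result is exactly the AD computed by hand in the example below.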
This probably looks scary, but in practice it’s not too bad. Here’s an example of this in
action:
Example. If

        [ 1 2  1 ]           [ −1  1 ]
    A = [ 2 3 −2 ]   and D = [  1 −1 ],
        [ 0 1  1 ]           [  3  4 ]

find AD.
Answer. We use our definition as described above:

         [ 1 2  1 ] [ −1  1 ]   [ rA,1 · cB,1   rA,1 · cB,2 ]
    AD = [ 2 3 −2 ] [  1 −1 ] = [ rA,2 · cB,1   rA,2 · cB,2 ]
         [ 0 1  1 ] [  3  4 ]   [ rA,3 · cB,1   rA,3 · cB,2 ]

       [ (1, 2, 1) · (−1, 1, 3)     (1, 2, 1) · (1, −1, 4)  ]
     = [ (2, 3, −2) · (−1, 1, 3)    (2, 3, −2) · (1, −1, 4) ]
       [ (0, 1, 1) · (−1, 1, 3)     (0, 1, 1) · (1, −1, 4)  ]

       [ −1 + 2 + 3    1 − 2 + 4 ]   [  4  3 ]
     = [ −2 + 3 − 6    2 − 3 − 8 ] = [ −5 −9 ]
       [  0 + 1 + 3    0 − 1 + 4 ]   [  4  3 ]
Notice that to use our definition, we need the number of columns in A to be equal to
the number of rows in B. If A and B do not have the right sizes to use the definition
above, we say that their product is undefined; so, for instance,

    [ 1 2 3 ] [ 1 2 3 ]
    [ 2 1 2 ] [ 2 3 4 ]  DNE.
Matrix multiplication has a number of interesting properties. One is that most of
the time, the order of multiplication matters: that is, AB and BA are often very
different! For example,

    [ 0 1 ] [ 0 0 ]   [ 1 0 ]
    [ 0 0 ] [ 1 0 ] = [ 0 0 ],

but if we switch the order we can see that

    [ 0 0 ] [ 0 1 ]   [ 0 0 ]
    [ 1 0 ] [ 0 0 ] = [ 0 1 ]

is quite different! Another property is that much like how R has
a “multiplicative identity” in 1 (that is, a number we can multiply by anything without
changing that thing), we have an identity for matrices as well: in general, for any n we
define the n × n identity matrix In as

         [ 1 0 0 . . . 0 ]
         [ 0 1 0 . . . 0 ]
    In = [ 0 0 1 . . . 0 ]
         [ ..  ..  . .   ]
         [ 0 0 0 . . . 1 ]
This is an n × n matrix with ones on the main diagonal (i.e. 1’s in every cell (i, i))
and zeroes everywhere else. This matrix has the property that for any m × n matrix A,
Im · A = A · In = A. For example, if

        [ 1 2 3 ]
    A = [ 4 5 6 ],
        [ 7 8 9 ]

then

           [ 1 0 0 ] [ 1 2 3 ]   [ (1, 0, 0) · (1, 4, 7)   (1, 0, 0) · (2, 5, 8)   (1, 0, 0) · (3, 6, 9) ]
    I3 A = [ 0 1 0 ] [ 4 5 6 ] = [ (0, 1, 0) · (1, 4, 7)   (0, 1, 0) · (2, 5, 8)   (0, 1, 0) · (3, 6, 9) ]
           [ 0 0 1 ] [ 7 8 9 ]   [ (0, 0, 1) · (1, 4, 7)   (0, 0, 1) · (2, 5, 8)   (0, 0, 1) · (3, 6, 9) ]

         [ 1 2 3 ]
       = [ 4 5 6 ] = A.
         [ 7 8 9 ]

2.2 Matrix Inverses
Given a matrix A, we call A square if it has as many rows as columns; in other words,
if it is size n × n for some n. Given any square matrix A, we say that A is invertible if
there is some matrix A−1 such that AA−1 = In = A−1 A; we call A−1 the inverse of A.
Not all matrices are invertible, but many are! In class, we described a process for finding
the inverse of a matrix A:
1. First, we construct the augmented matrix [A|I].
2. Then, we apply row operations to this augmented matrix to turn the left-hand side
into reduced row-echelon form.
3. If at the end of this process the left-hand side is the identity matrix I, then the
right-hand side is A−1 .
4. Otherwise, if the left-hand side is not the identity matrix after reducing it to its
reduced row-echelon form, then A−1 does not exist.
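The [A|I] procedure above can be sketched in code as well. This is a minimal pure-Python version; the function name `inverse` and the exact-fraction arithmetic are our own choices:

```python
from fractions import Fraction

def inverse(A):
    """Invert a square matrix via Gauss-Jordan elimination on [A | I].
    Returns None when A is not invertible."""
    n = len(A)
    # Step 1: build the augmented matrix [A | I].
    aug = [[Fraction(A[i][j]) for j in range(n)]
           + [Fraction(1 if i == j else 0) for j in range(n)]
           for i in range(n)]
    # Step 2: row-reduce the left-hand side.
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None                      # left side cannot reach the identity
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    # Step 3: the right-hand side is now the inverse.
    return [row[n:] for row in aug]

C = [[5, 1, 0], [4, 5, 2], [5, 3, 1]]
print(inverse(C) == [[-1, -1, 2], [6, 5, -10], [-13, -10, 21]])  # → True
print(inverse([[2, 1], [8, 4]]))  # → None (not invertible)
```

The two printed checks match the worked examples below: the 3 × 3 matrix C is invertible, while the 2 × 2 matrix [2 1; 8 4] is not.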
We look at an example here:

Example. Does the matrix

        [ 5 1 0 ]
    C = [ 4 5 2 ]
        [ 5 3 1 ]

have an inverse?
Answer. We run our process here. First, we form [C|I]; then, we perform row operations
until the left-hand side is in RREF:

            [ 5 1 0 | 1 0 0 ]
    [C|I] = [ 4 5 2 | 0 1 0 ]
            [ 5 3 1 | 0 0 1 ]

Add −R1 to R3, then add −R2 to R1:

    [ 1 −4 −2 |  1 −1 0 ]
    [ 4  5  2 |  0  1 0 ]
    [ 0  2  1 | −1  0 1 ]

Add −4R1 to R2:

    [ 1 −4 −2 |  1 −1 0 ]
    [ 0 21 10 | −4  5 0 ]
    [ 0  2  1 | −1  0 1 ]

Add −10R3 to R2:

    [ 1 −4 −2 |  1 −1   0 ]
    [ 0  1  0 |  6  5 −10 ]
    [ 0  2  1 | −1  0   1 ]

Add 4R2 to R1 and −2R2 to R3:

    [ 1 0 −2 |  25  19 −40 ]
    [ 0 1  0 |   6   5 −10 ]
    [ 0 0  1 | −13 −10  21 ]

Add 2R3 to R1:

    [ 1 0 0 |  −1  −1   2 ]
    [ 0 1 0 |   6   5 −10 ]
    [ 0 0 1 | −13 −10  21 ]

The left-hand side is in reduced row-echelon form, and in fact is the identity; therefore
the right-hand side

    [  −1  −1   2 ]
    [   6   5 −10 ]
    [ −13 −10  21 ]

is C⁻¹!
To make sure we haven’t made any errors in our calculations, we check that CC⁻¹ is in
fact equal to I here:

           [ 5 1 0 ] [  −1  −1   2 ]   [ (5, 1, 0) · (−1, 6, −13)   (5, 1, 0) · (−1, 5, −10)   (5, 1, 0) · (2, −10, 21) ]
    CC⁻¹ = [ 4 5 2 ] [   6   5 −10 ] = [ (4, 5, 2) · (−1, 6, −13)   (4, 5, 2) · (−1, 5, −10)   (4, 5, 2) · (2, −10, 21) ]
           [ 5 3 1 ] [ −13 −10  21 ]   [ (5, 3, 1) · (−1, 6, −13)   (5, 3, 1) · (−1, 5, −10)   (5, 3, 1) · (2, −10, 21) ]

       [ −5 + 6 + 0      −5 + 5 + 0      10 − 10 + 0  ]   [ 1 0 0 ]
     = [ −4 + 30 − 26    −4 + 25 − 20    8 − 50 + 42  ] = [ 0 1 0 ]
       [ −5 + 18 − 13    −5 + 15 − 10    10 − 30 + 21 ]   [ 0 0 1 ]

Success!
If this matrix was not invertible, we would not have gotten the identity on the left at
the end. For example, B = [ 2 1 ; 8 4 ] is not invertible, because

    [B|I] = [ 2 1 | 1 0 ]
            [ 8 4 | 0 1 ]

Multiplying R1 by 1/2 gives

    [ 1 1/2 | 1/2 0 ]
    [ 8  4  |  0  1 ]

and then adding −8R1 to R2 gives

    [     1           1/2     |     1/2       0     ]   [ 1 1/2 | 1/2 0 ]
    [ 8 − 8 · 1   4 − 8 · 1/2 | 0 − 8 · 1/2  1 − 8 · 0 ] = [ 0  0  | −4  1 ],

a matrix whose left-hand side is in RREF but is not the identity I2.
Inverses of matrices can be used to solve systems of linear equations! Notice that if we
have n linear equations in n unknowns, like for example
    5x +  y      = 1,
    4x + 5y + 2z = 1,
    5x + 3y +  z = 1,

we can rewrite this as

    [ 5 1 0 ] [ x ]   [ 1 ]
    [ 4 5 2 ] [ y ] = [ 1 ].
    [ 5 3 1 ] [ z ]   [ 1 ]
In general, if you have a system of linear equations in n variables, if you let A be the
matrix of coefficients of those variables, x be the vector consisting of all of those variables,
and b be the vector of the constants each equation is equal to, you can always express that
system of linear equations as Ax = b, just like we’ve done here.
Returning to this example: earlier in these notes, we said that if

        [ 5 1 0 ]
    C = [ 4 5 2 ]
        [ 5 3 1 ]

then C is invertible, and in particular

          [  −1  −1   2 ]
    C⁻¹ = [   6   5 −10 ].
          [ −13 −10  21 ]

Therefore, if we want to solve the equation

      [ x ]   [ a ]
    C [ y ] = [ b ],
      [ z ]   [ c ]

we can just multiply both sides by C⁻¹. With (a, b, c) = (1, 1, 1), this gives

    [ x ]           [  −1  −1   2 ] [ 1 ]   [ (−1, −1, 2) · (1, 1, 1)   ]   [ −1 − 1 + 2   ]   [  0 ]
    [ y ] = C⁻¹b = [   6   5 −10 ] [ 1 ] = [ (6, 5, −10) · (1, 1, 1)   ] = [ 6 + 5 − 10   ] = [  1 ]
    [ z ]           [ −13 −10  21 ] [ 1 ]   [ (−13, −10, 21) · (1, 1, 1)]   [ −13 − 10 + 21]   [ −2 ]

In other words, we’ve solved our system of linear equations, and found that
(x, y, z) = (0, 1, −2)!
This process works in general: if you have a system of linear equations of the form
Ax = b, then if A−1 exists, you get exactly one solution to this system, and it’s A−1 b! In
general, this is not the fastest way to solve a system of linear equations, and it only applies
when you have A−1 ; if A−1 does not exist, then you cannot use this method, and should go
back to our earlier methods using the RREF to find a solution. But if someone has given
you A−1 for free, then this is a faster way to solve systems of linear equations!
In particular, this means that the inverse is connected to finding solutions to systems of
linear equations in certain ways:
• Let A be a square matrix. If A−1 exists, then Ax = b has exactly one solution for
every b.
• This also applies in the other direction: if A is a square matrix and Ax = b has
exactly one solution for some b, then A−1 exists.
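Continuing the code sketch from earlier: once A⁻¹ is known, solving Ax = b is just a matrix-vector product. The helper name `mat_vec` is our own, and the inverse below is the one computed in the worked example above:

```python
def mat_vec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# C^{-1} from the worked example, applied to b = (1, 1, 1):
C_inv = [[-1, -1, 2], [6, 5, -10], [-13, -10, 21]]
b = [1, 1, 1]
print(mat_vec(C_inv, b))  # → [0, 1, -2]
```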
2.3 The Determinant
Finally, the last operation we described for matrices was the determinant. The
determinant det(A) is something only defined for square matrices (so the determinant of
the 2 × 3 matrix

    [ 1 2 4 ]
    [ 2 5 0 ]

does not exist), and we defined it as follows:

• For a 1 × 1 matrix [a], det([a]) = a.

• For a 2 × 2 matrix [ a b ; c d ], we have

      det [ a b ] = ad − bc.
          [ c d ]
• For a 3 × 3 matrix [ a b c ; d e f ; g h i ], we have

      det [ a b c ]
          [ d e f ] = aei + bfg + cdh − afh − bdi − ceg.
          [ g h i ]

  A nice way to remember this 3 × 3 determinant formula is to write the
  matrix next to itself, and then look at the six diagonals:

      a b c a b c
      d e f d e f
      g h i g h i

  The three diagonals that run down-and-right (starting from a, b and c in the top row)
  correspond to the three terms you add, and the three diagonals that run up-and-right
  (starting from g, h and i in the bottom row) are the three terms you subtract in the
  formula above.
For larger matrices, like 4 × 4 and on up, most of the formulas you could memorize get
very messy very quickly. So instead we came up with some properties that can help you
calculate the determinant of a large matrix quickly:
• Given a square matrix A, we know how our row operations from earlier affect the
determinant of A:
– If we multiply a row of A by a constant c, this multiplies the determinant by c.
– If we switch two rows in A, this multiplies the determinant by −1.
– If we add a multiple of one row to another row in A, this does nothing to the
determinant.
• We say that a matrix A is upper-triangular if the only cells in A that contain
  nonzero values are those on or above the main diagonal; that is, upper-triangular
  matrices are ones that look like

      [ 1 0 2 ]
      [ 0 2 3 ].
      [ 0 0 0 ]
• The determinant of a matrix that is upper-triangular is the product of the entries on
its diagonal.
Accordingly, this gives us a nice blueprint for how to find the determinant of any square
matrix A:
• Take A and perform row operations on it to transform it into an upper-triangular
matrix B.
• Calculate the determinant of B by multiplying the entries on its diagonal!
• Use this to find the determinant of A by correcting for the row operations you
  performed: that is, for each swap you did to A, make sure to multiply det(B) by −1 to
  cancel out the earlier −1, and for each time you multiplied a row in A by a constant
  c, make sure to multiply det(B) by 1/c.
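This blueprint can be sketched as a short program that tracks the sign corrections as it eliminates. The function name `det` and the use of exact fractions are our own choices:

```python
from fractions import Fraction

def det(A):
    """Determinant via reduction to upper-triangular form."""
    m = [[Fraction(x) for x in row] for row in A]
    n = len(m)
    sign = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)            # a whole column of zeros: determinant 0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign                  # each row swap flips the sign
        for r in range(col + 1, n):       # adding multiples of rows changes nothing
            f = m[r][col] / m[col][col]
            m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    product = Fraction(sign)
    for i in range(n):                    # product of the diagonal entries
        product *= m[i][i]
    return product

print(det([[7, 2], [2, 1]]))                   # → 3
print(det([[9, 8, 7], [6, 5, 4], [3, 2, 1]]))  # → 0
```

Both printed values agree with the hand calculations in the example below.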
We calculate a few examples here:
Example. Find the determinants of the following matrices:

                                          [ 1 2 0 0 0 ]
        [ 7 2 ]       [ 9 8 7 ]           [ 2 1 2 0 0 ]
    A = [ 2 1 ],  B = [ 6 5 4 ],      D = [ 0 2 1 2 0 ]
                      [ 3 2 1 ]           [ 0 0 2 1 2 ]
                                          [ 0 0 0 2 1 ]
Answer. For A and B, we just use the formulas for the determinants of 2 × 2 and 3 × 3
matrices:

    det [ 7 2 ] = 7 · 1 − 2 · 2 = 3,
        [ 2 1 ]

    det [ 9 8 7 ]
        [ 6 5 4 ] = 9·5·1 + 8·4·3 + 7·6·2 − 9·4·2 − 8·6·1 − 7·5·3
        [ 3 2 1 ]
                  = 45 + 96 + 84 − 72 − 48 − 105 = 0.
For D, we use row operations to transform this matrix into a triangular matrix:

    [ 1 2 0 0 0 ]
    [ 2 1 2 0 0 ]
    [ 0 2 1 2 0 ]
    [ 0 0 2 1 2 ]
    [ 0 0 0 2 1 ]

Add −2R1 to R2:

    [ 1  2 0 0 0 ]
    [ 0 −3 2 0 0 ]
    [ 0  2 1 2 0 ]
    [ 0  0 2 1 2 ]
    [ 0  0 0 2 1 ]

Add (2/3)R2 to R3:

    [ 1  2  0  0 0 ]
    [ 0 −3  2  0 0 ]
    [ 0  0 7/3 2 0 ]
    [ 0  0  2  1 2 ]
    [ 0  0  0  2 1 ]

Add −(6/7)R3 to R4:

    [ 1  2  0   0  0 ]
    [ 0 −3  2   0  0 ]
    [ 0  0 7/3  2  0 ]
    [ 0  0  0 −5/7 2 ]
    [ 0  0  0   2  1 ]

Add (14/5)R4 to R5:

    [ 1  2  0   0   0  ]
    [ 0 −3  2   0   0  ]
    [ 0  0 7/3  2   0  ]
    [ 0  0  0 −5/7  2  ]
    [ 0  0  0   0 33/5 ]

The determinant of the final matrix is just the product of the entries on its diagonal,
i.e. 1 · (−3) · (7/3) · (−5/7) · (33/5) = 33, because it is upper-triangular. Therefore, because
adding multiples of rows to other rows does not change the determinant, the determinant
of the original matrix is also 33.
The determinant has some nice properties:
• A square matrix A has determinant 0 if and only if A is not invertible.
• If a square matrix A has two identical rows or columns, or a row of all zeroes, then
det(A) = 0.
• If A, B are both square n × n matrices, then det(AB) = det(A) det(B).
• det(AT ) = det(A).
3 Cross Product [Lecture 20]
After reading and watching these lectures, we’re hoping that you can do the following tasks:
• Take the cross product of two vectors in R3 .
• Know several properties about the cross product.
3.1 Definition and Properties
The cross product is an operation that takes in two vectors in R3 and outputs another
vector in R3 . It is defined as follows: for any (u1 , u2 , u3 ), (v1 , v2 , v3 ) ∈ R3 , we have
(u1 , u2 , u3 ) × (v1 , v2 , v3 ) = (u2 v3 − u3 v2 , u3 v1 − u1 v3 , u1 v2 − u2 v1 ).
For example,
(1, 2, 3) × (0, 4, 5) = (2 · 5 − 3 · 4, 3 · 0 − 1 · 5, 1 · 4 − 2 · 0) = (−2, −5, 4).
The cross product has several useful properties:
• Given any two vectors u, v, u × v is orthogonal to both u and v.
• As well, ||u × v|| is equal to the area of the parallelogram spanned by u and v.
• Finally, ||u × v|| = ||u||||v|| sin(θ), where θ is the angle between u and v.
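These formulas are easy to check numerically. Here is a small pure-Python sketch (the helper names `cross` and `dot` are our own):

```python
def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

u, v = (1, 2, 3), (0, 4, 5)
w = cross(u, v)
print(w)                     # → (-2, -5, 4)
print(dot(w, u), dot(w, v))  # → 0 0  (orthogonal to both u and v)
```

The zero dot products confirm the first property above: u × v is orthogonal to both u and v.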
4 Differentiation [Lectures 21-23]
After reading and watching these lectures, we’re hoping that you can do the following tasks:
• Know the definition of the derivative and how to use it.
• Find the derivatives of various basic functions.
• Take the derivatives of more complex functions via the product and chain rules.
• Use implicit differentiation to find dy/dx of a relation involving x and y.
• Find the tangent line to a curve.

4.1 Calculating Derivatives
The derivative of a function f at some point x, denoted as either f′(x) or (d/dx)f(x) or
dy/dx if we’re looking at the graph y = f(x), is the following limit:

    f′(x) = lim_{h→0} ( f(x + h) − f(x) ) / h

So, for example, the derivative of f(x) = x² at x = 2 is just

    lim_{h→0} ( (2 + h)² − 2² ) / h = lim_{h→0} ( 4 + 4h + h² − 4 ) / h
                                    = lim_{h→0} ( 4h + h² ) / h = lim_{h→0} ( 4 + h ) = 4.
Geometrically, we think of the derivative of a function f at some point x as measuring the
“slope” of our function at that point. We can visualize this by drawing the tangent line
to a function f (x) at some point a, which has the equation
y − f (a) = f 0 (a) · (x − a).
So, if we return to our example above where f (x) = x2 , we can see that a tangent line to
f (x) at x = 2 would have the equation
y − f (2) = f 0 (2)(x − 2)
⇒
y − 4 = 4(x − 2).
Drawing this line next to the graph of y = f (x) shows that we are indeed capturing the
idea of the “slope” of our function at x = 2:
[Graph: y = x² with its tangent line y − 4 = 4(x − 2) at x = 2.]
While elegant, this limit definition of the derivative can take a while to use. Accordingly,
we’ve calculated the derivatives of several simple functions:

• (d/dx) eˣ = eˣ
• (d/dx) ln(x) = 1/x
• (d/dx) xⁿ = n xⁿ⁻¹, n ≠ 0
• (d/dx) c = 0
• (d/dx) cos(x) = − sin(x)
• (d/dx) sin(x) = cos(x)
We also have a set of rules that let us take the derivative of more complicated functions:

• Differentiation is linear: given any two functions f(x), g(x) and constants a, b, we
  have (d/dx)(af(x) + bg(x)) = af′(x) + bg′(x).
• Product rule: given any two functions f(x), g(x), we have (d/dx)(f(x) · g(x)) = f′(x) ·
  g(x) + f(x) · g′(x).
• Chain rule: given any two functions f(x), g(x), we have (d/dx)(f(g(x))) = f′(g(x)) · g′(x).
We look at a few quick examples of these rules in action:

Example. Find the derivatives of p(x) = e^√x, q(x) = sin(x) cos(x) and r(x) = x² · ln(x² + 1).

Answer. For p(x), we want to use the chain rule; this is because p(x) consists of functions
composed with each other, and the chain rule is the only rule that deals with this! So: let’s
write p(x) = f(g(x)), where f(x) = eˣ and g(x) = √x. Then, the chain rule tells us that
p′(x) = (d/dx)(f(g(x))) = f′(g(x)) · g′(x). We know from above that f′(x) = (d/dx)eˣ = eˣ, and
that g′(x) = (d/dx)√x = (d/dx)x^(1/2) = (1/2)x^(−1/2) = 1/(2√x); therefore, we have

    p′(x) = f′(g(x)) · g′(x) = e^√x · 1/(2√x) = e^√x / (2√x).
For q(x), we want to use the product rule, because q(x) consists of the product of two
functions. If we let f(x) = sin(x), g(x) = cos(x) then q(x) = f(x)g(x); so the product rule
says that q′(x) = (d/dx)(f(x) · g(x)) = f′(x)g(x) + f(x)g′(x). Because f′(x) = (d/dx) sin(x) =
cos(x) and g′(x) = (d/dx) cos(x) = − sin(x), this tells us that

    q′(x) = f′(x)g(x) + f(x)g′(x) = cos²(x) − sin²(x).
For r(x), it might seem harder to decide which rule to use, but it’s actually not that bad: if
we look at r(x), we just need to decide whether it looks more like f(g(x)) or f(x)g(x)!
In this case, it’s not clear how we would write this as f(g(x)), as there’s not an obvious
“outside” function that we’re applying to some inside function. However, it’s very easy to
see how we’d write this as a product: we can write r(x) = f(x) · g(x), where f(x) = x² and
g(x) = ln(x² + 1). This is how differentiation always works; you’ll always have exactly one
rule that can work, and all you have to do is figure out what that rule is and then apply it!

If we do that here, then f′(x) = (d/dx)x² = 2x, while g′(x) = (d/dx) ln(x² + 1) is trickier; here
we have to use the chain rule, because we have one function (ln(x)) being applied to another
(x² + 1)! In particular, if we let h(x) = ln(x), j(x) = x² + 1, then h′(x) = 1/x, j′(x) = 2x, and
therefore the chain rule tells us that g′(x) = h′(j(x)) · j′(x) = 2x/(x² + 1).

Plugging this into our product rule work earlier tells us that

    r′(x) = f′(x)g(x) + f(x)g′(x) = 2x ln(x² + 1) + 2x³/(x² + 1).
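A good way to sanity-check derivatives like these is to compare them against a central-difference approximation; here is a small script of our own that checks the answer for r(x) at one point:

```python
import math

def central_diff(f, x, h=1e-6):
    """Numerical derivative of f at x via a central difference."""
    return (f(x + h) - f(x - h)) / (2 * h)

r = lambda x: x**2 * math.log(x**2 + 1)
r_prime = lambda x: 2 * x * math.log(x**2 + 1) + 2 * x**3 / (x**2 + 1)

# The symbolic answer and the numerical estimate should agree closely.
print(abs(central_diff(r, 2.0) - r_prime(2.0)) < 1e-5)  # → True
```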
4.2 Implicit Differentiation
Sometimes we will want to find the tangent line to an equation like

    2x(x² + y²) = 3x² − y²

even though we cannot easily solve for y and make this into a function of x! In this situation,
we use implicit differentiation to try to find dy/dx. The idea here is the following: take any
expression involving the variables x and y, like sin(x) or e^(xy) or y² − 2x + 1 or ln(y).

• If this expression has the form f(x) (in other words, it only involves x), define
  (d/dx)f(x) = f′(x). In other words, take the derivative like normal.
• If this expression has the form f(y) (in other words, it only involves y), define
  (d/dx)f(y) = f′(y) · dy/dx. In other words, take the derivative like normal, but stick this
  dy/dx on the outside.
• If it has both x’s and y’s, use the chain and product rules to break it into smaller
  pieces.
Given an equation in two variables x, y, implicit differentiation is the following process:

• Apply d/dx to both sides of this equation, as described above.
• This gives you an equation with x’s, y’s, and dy/dx’s. Solve for dy/dx in terms of the
  other variables by putting the dy/dx’s on one side and all of the other terms on the
  other side.
• You now have an expression for dy/dx in terms of x and y: in other words, you have
  implicitly differentiated your equation!
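One way to check an implicit-differentiation answer numerically is to write the curve as F(x, y) = 0 and use the fact that dy/dx = −Fx/Fy, estimating the partial derivatives by finite differences. The helper names below are our own:

```python
def partial(F, x, y, axis, h=1e-6):
    """Finite-difference partial derivative of F(x, y) in the given axis."""
    if axis == 0:
        return (F(x + h, y) - F(x - h, y)) / (2 * h)
    return (F(x, y + h) - F(x, y - h)) / (2 * h)

# The curve 2x(x^2 + y^2) = 3x^2 - y^2, written as F(x, y) = 0.
F = lambda x, y: 2 * x * (x**2 + y**2) - 3 * x**2 + y**2

def implicit_slope(x, y):
    return -partial(F, x, y, 0) / partial(F, x, y, 1)

x0, y0 = 1.0, 3 ** -0.5
print(abs(F(x0, y0)) < 1e-9)  # → True  (the point lies on the curve)
print(abs(implicit_slope(x0, y0) - (-1 / (3 * 3 ** 0.5))) < 1e-6)  # → True
```

The second check agrees with the slope −1/(3√3) computed by hand in the example below.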
To give an example: let’s look at the curve 2x(x² + y²) = 3x² − y² from earlier.
If we apply d/dx to both sides, we get

       (d/dx)( 2x · (x² + y²) ) = (d/dx)( 3x² − y² )
    ⇒  (d/dx)(2x) · (x² + y²) + (2x) · (d/dx)(x² + y²) = (d/dx)(3x²) − (d/dx)(y²)
    ⇒  2(x² + y²) + (2x)( (d/dx)(x²) + (d/dx)(y²) ) = 6x − 2y (dy/dx)
    ⇒  2(x² + y²) + (2x)( 2x + 2y (dy/dx) ) = 6x − 2y (dy/dx)
    ⇒  6x² + 2y² + 4xy (dy/dx) = 6x − 2y (dy/dx).

Now, we solve for dy/dx:

       6x² + 2y² + 4xy (dy/dx) = 6x − 2y (dy/dx)
    ⇒  4xy (dy/dx) + 2y (dy/dx) = 6x − 6x² − 2y²
    ⇒  (4xy + 2y) (dy/dx) = 6x − 6x² − 2y²
    ⇒  dy/dx = (6x − 6x² − 2y²) / (4xy + 2y)
To check that our answer makes sense, let’s try graphing a tangent line to this curve at
the point (1, 1/√3). Plugging this point into our equation for dy/dx yields

    dy/dx = ( 6·1 − 6·1² − 2·(1/√3)² ) / ( 4·1·(1/√3) + 2·(1/√3) ) = ( −2/3 ) / ( 6/√3 ) = −1/(3√3),

which tells us that a tangent line to our curve at (1, 1/√3) has equation

    y − y₀ = (dy/dx)(x − x₀)   ⇒   y − 1/√3 = −( 1/(3√3) )(x − 1).
Graphing this line verifies that it is indeed a tangent line.
5 Differentiation Applications [Lectures 24-26]
After reading and watching these lectures, we’re hoping that you can do the following tasks:
• Know what it means for a function to be increasing, decreasing, concave up, or concave
down, or to have an inflection point, a critical point, or a relative maximum or minimum.
• Visually identify all of the above properties.
• Use the derivative to find where a function has any of the above properties.
5.1 Definitions
The derivative can help us visualize and draw functions! It does this in many ways:
• Given a function f , we say that f is increasing on the interval (a, b) if for any
x < y ∈ (a, b), we have f (x) < f (y). Similarly, we say that f is decreasing on (a, b)
if for any x < y ∈ (a, b) we have f (x) > f (y).
[Figure: an increasing graph and a decreasing graph.]
• The derivative can tell us when this happens! It turns out that f is increasing on
  (a, b) if f′(x) > 0 for every x ∈ (a, b), and f is decreasing on (a, b) if f′(x) < 0 for
  every x ∈ (a, b).
• We say that f is concave up on (a, b) if f′′(x) > 0 on (a, b); similarly, f is concave
  down on (a, b) if f′′(x) < 0 on (a, b). Visually, concave up graphs look like they’re
  curving upwards (think cups, rockets taking off, the parabola y = x²) and concave
  down graphs look like they’re curving downwards (think waterfalls, the path made by
  throwing a ball in the air, y = −x².)
[Figure: a concave-up graph and a concave-down graph.]
• We say that x is a critical point if f′(x) = 0 or f′(x) does not exist.
• We say that a point x ∈ (a, b) is a maximum on (a, b) if f (x) ≥ f (y), for any
y ∈ (a, b). We say that x is a relative maximum if there is some interval containing
x such that x is a maximum on that interval. Similarly x is a minimum on (a, b) if
f (x) ≤ f (y) for any y ∈ (a, b), and is a relative minimum if there is some interval
containing x in which x is a minimum.
• The derivative can help us find these objects! If x is a relative maximum or minimum,
  then x is a critical point. Not all critical points are relative maxima or minima,
  but all relative maxima and minima are critical points.
• We say that a point a is a point of inflection if f′′(x) changes from positive to negative,
  or vice-versa, at x = a.

[Figure: a graph with an inflection point.]
We can use these properties to draw remarkably accurate graphs of functions! We look
at an example here:
Example. Draw the graph of f (x) = x(x − 9)(x − 24) , labeling all critical points, relative
maxima and minima, inflection points, and identifying where the function is concave up
and where it is concave down.
Answer. Our process for drawing a graph is as follows:
• We start by finding all of the places where our function crosses the x-axis and y-axis:
in other words, we find all of the values of x for which f (x) = 0, and also what f (0)
is.
• Then, we find f′(x), and find out where it is positive and negative. We use this to
  identify all of the critical points of f and identify which are minima and which are
  maxima; we also use this to determine where f is increasing and where f is decreasing.
• We finish by finding f′′(x), and determine where this is positive and where this is
  negative; we use this to find the inflection points of f, and determine where f is
  concave up and concave down.
The first of these tasks is pretty straightforward. We know that f(x) = x(x − 9)(x − 24);
f is already factored into its roots, so we can see that f(x) = 0 whenever x is 0, 9 or 24.
Similarly, we know that f(0) = 0(0 − 9)(0 − 24) = 0, so we know where our function crosses
the y-axis.
Now, to get some more information, we look at f'(x). Because

    f(x) = x(x − 9)(x − 24) = x³ − 33x² + 216x
    ⇒ f'(x) = 3x² − 66x + 216 = 3(x² − 22x + 72) = 3(x − 4)(x − 18),
we can see that x = 4, 18 are the two critical points of our function f. Moreover, because

    interval        3(x − 4)   (x − 18)   3(x − 4)(x − 18)
    x ∈ (−∞, 4)     (−)        (−)        (−) · (−) = (+)
    x ∈ (4, 18)     (+)        (−)        (+) · (−) = (−)
    x ∈ (18, ∞)     (+)        (+)        (+) · (+) = (+)
we can see that our function is increasing on (−∞, 4), decreasing on (4, 18), and then
increasing again on (18, ∞). Finally, because we switch from increasing to decreasing at 4,
our function has a relative maximum there; and because we switch from decreasing to
increasing at 18, we have a relative minimum there.
This gives us some more information about our function! In particular, we can plot the
points
f (4) = 4(4 − 9)(4 − 24) = 400, f (18) = 18(18 − 9)(18 − 24) = −972
and get that our function looks like something that goes through the following points,
increasing until x = 4, decreasing until x = 18, and then increasing again:
(Graph: the curve rising through (0, 0) to a relative maximum at (4, 400), falling through
(9, 0) to a relative minimum at (18, −972), then rising through (24, 0).)
Finally, we look at f''(x):

    f''(x) = d/dx f'(x) = d/dx (3x² − 66x + 216) = 6x − 66 = 6(x − 11).
This is negative for all x < 11 and positive for all x > 11; therefore our function is concave
down on (−∞, 11) and concave up on (11, ∞), with an inflection point at x = 11.
At x = 11 we have f (11) = −286, so this gives us one last point to plot.
Now, we draw! Specifically, we draw a nice concave-down curve through the points
(0, 0), (4, 400), (11, −286) and a nice concave-up curve through the points (11, −286), (18, −972), (24, 0):
(Graph: the finished sketch, concave down until the inflection point at (11, −286) and
concave up afterward.)
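If you want to sanity-check a sketch like this, a few lines of Python will confirm the numbers. This is a minimal standard-library sketch (the helper `deriv`, a central finite difference, is our own and not anything from the course): it verifies the roots, critical points, inflection behaviour, and plotted values of f(x) = x(x − 9)(x − 24).

```python
# Sanity-checking the sketch numerically, using only the standard library.
# The helper deriv (a central finite difference) is our own invention.

def f(x):
    return x * (x - 9) * (x - 24)

def deriv(g, x, h=1e-5):
    # central finite-difference approximation of g'(x)
    return (g(x + h) - g(x - h)) / (2 * h)

# roots: f crosses the x-axis at 0, 9 and 24
assert all(f(r) == 0 for r in (0, 9, 24))

# critical points: f'(4) = f'(18) = 0 (up to floating-point error)
assert abs(deriv(f, 4)) < 1e-3 and abs(deriv(f, 18)) < 1e-3

# the plotted values at the relative max, relative min and inflection point
assert round(f(4)) == 400 and round(f(18)) == -972 and round(f(11)) == -286

# f'' changes sign at x = 11: concave down just before, concave up just after
f2 = lambda x: deriv(lambda t: deriv(f, t), x, h=1e-4)
assert f2(10) < 0 < f2(12)
```

A check like this catches arithmetic slips (a wrong root, a sign error in f') before you commit them to a drawing.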
6 Integration [Lectures 27-30]
After reading and watching these lectures, we’re hoping that you can do the following tasks:
• Know the definition of the antiderivative, and the antiderivative of several basic functions.
• Use integration by substitution and integration by parts to find more complicated
integrals.
• Know how to take a definite integral by using the fundamental theorem of calculus.
• Find the area between a curve y = f (x) and the x-axis.
6.1 The Indefinite Integral
Given a function f, we say that F is an antiderivative of f if F'(x) = f(x). In this sense,
taking an antiderivative is exactly what it sounds like: it's just "undoing" the derivative!
A given function may have many antiderivatives. For instance, f(x) = cos(x) has
F(x) = sin(x) + 9 as an antiderivative, because d/dx (sin(x) + 9) = cos(x); but it also has
sin(x) − 42, sin(x), and sin(x) + π as antiderivatives, because the constant term
doesn't matter when we apply d/dx!
Accordingly, we define the indefinite integral of a function f as "the antiderivative
up to a constant": that is, if f(x) is a function and F(x) is any antiderivative of f(x), we
write

    ∫ f(x) dx = F(x) + C,    C ∈ R

to denote the indefinite integral of f(x).
We know the indefinite integrals of several basic functions:
• ∫ xⁿ dx = xⁿ⁺¹/(n + 1) + C, for n ≠ −1
• ∫ eˣ dx = eˣ + C
• ∫ ln(x) dx = x ln(x) − x + C
• ∫ cos(x) dx = sin(x) + C
• ∫ (1/x) dx = ln(|x|) + C
• ∫ sin(x) dx = −cos(x) + C
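One nice way to internalise this table is to remember that every row can be checked by differentiating the right-hand side. The standard-library Python sketch below does exactly that numerically (the helper `deriv`, a central finite difference, is our own name):

```python
import math

# Each row of the antiderivative table, checked by differentiating F and
# comparing against f at a few sample points. Standard library only.

def deriv(F, x, h=1e-6):
    # central finite-difference approximation of F'(x)
    return (F(x + h) - F(x - h)) / (2 * h)

pairs = [
    (lambda x: x**4 / 4,            lambda x: x**3),        # the x^n rule, n = 3
    (lambda x: math.exp(x),         lambda x: math.exp(x)),
    (lambda x: x * math.log(x) - x, lambda x: math.log(x)),
    (lambda x: math.sin(x),         lambda x: math.cos(x)),
    (lambda x: math.log(abs(x)),    lambda x: 1 / x),
    (lambda x: -math.cos(x),        lambda x: math.sin(x)),
]

for F, f in pairs:
    for x in (0.5, 1.3, 2.7):   # positive sample points, since ln needs x > 0
        assert abs(deriv(F, x) - f(x)) < 1e-5
```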
We also have some techniques for integrating more complicated functions! One technique
is integration by substitution, which you can think of as “reverse chain rule.” It’s the
following formula:
    ∫ f'(g(x)) · g'(x) dx = ∫ d/dx (f(g(x))) dx = f(g(x)) + C,    C ∈ R.
Basically, this technique says that if we can recognize the function we're integrating as
having the form f'(g(x)) · g'(x) for some f, g, then we're automatically done! We just get
that the integral is f(g(x)) + C, and that's quite nice.
Sometimes people use "u-substitution" notation, where to evaluate the integral

    ∫ f'(g(x)) · g'(x) dx

they define u = g(x), which means that du/dx = g'(x), and therefore that du = g'(x) dx.
Substituting these u's in for x's gives us

    ∫ f'(g(x)) · g'(x) dx = ∫ f'(u) du = f(u) + C = f(g(x)) + C;
in other words it does the exact same thing as the notation above. Pick your favorite!
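Either notation can also be checked numerically. The sketch below (standard library only; `midpoint_integral`, a midpoint Riemann sum, is our own helper, not a course function) integrates 2x · cos(x²), which has the form g'(x) · f'(g(x)) with g(x) = x² and f'(x) = cos(x), and confirms the answer substitution predicts, namely sin(x²) evaluated at the endpoints:

```python
import math

# The integrand 2x*cos(x**2) has the substitution form g'(x)*f'(g(x)) with
# g(x) = x**2 and f'(x) = cos(x), so its integral over [0, 1] should equal
# f(g(1)) - f(g(0)) = sin(1) - sin(0) = sin(1).

def midpoint_integral(f, a, b, n=100_000):
    # midpoint-rule Riemann sum with n equal subintervals
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

integrand = lambda x: 2 * x * math.cos(x**2)

assert abs(midpoint_integral(integrand, 0, 1) - math.sin(1)) < 1e-6
```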
Not all functions can be written in the form f'(g(x)) · g'(x), though. For those other
kinds of functions, we have integration by parts (which we can think of as "reverse product
rule"). It's the following formula:

    ∫ f'(x) · g(x) dx = f(x)g(x) − ∫ f(x)g'(x) dx.
Basically, this technique says that if we can write the function we're integrating as a product
f'(x)g(x), then we can "trade" the problem of finding ∫ f'(x)g(x) dx for the problem of
finding ∫ f(x)g'(x) dx. This can make our problems a lot easier: if we make g(x) something
like ln(x) or x, for instance, its derivative g'(x) is a lot nicer! Conversely, if
we make f'(x) something easy to integrate, like eˣ or sin(x), then f(x) won't be any worse,
and in theory we will have exchanged a tricky integral for an easier one.
This makes a lot more sense in practice, so let’s look at some examples:
Example. Using the techniques of integration by parts and integration by substitution (i.e.
the reverse chain and product rules), find each of the following indefinite integrals:

    1. ∫ (x + 1) sin(x + 1) dx
    2. ∫ (x + 1) sin((x + 1)²) dx
    3. ∫ sin(x)/cos(x) dx
    4. ∫ (ln(x))² dx
Answer. 1. The first thing we need to do to evaluate ∫ (x + 1) sin(x + 1) dx is figure out
which technique we want to try. At first glance, reverse chain rule (i.e. integration by
substitution) looks good, in that we have some composition going on here: we'd be
tempted to make f'(x) = sin(x) and g(x) = x + 1. However, the thing on the outside
is not g'(x) = 1; it's x + 1! So our integral actually doesn't look like it's of the form
∫ g'(x) f'(g(x)) dx, and as a result we're better off trying something else.
Let's try parts, then! If we were to apply integration by parts to ∫ (x + 1) sin(x + 1) dx,
we'd want to write this in the form ∫ f'(x)g(x) dx, where f'(x) is something whose
integral we know and isn't too bad, while g(x) is something that hopefully gets simpler
when we differentiate. This motivates us to choose g(x) = x + 1, because g'(x) = 1 is
indeed a lot simpler; this leaves us with f'(x) = sin(x + 1), which has the reasonable
integral f(x) = −cos(x + 1).
Integration by parts, then, tells us that

    ∫ (x + 1) sin(x + 1) dx = ∫ f'(x) · g(x) dx = f(x)g(x) − ∫ f(x)g'(x) dx
                            = (−cos(x + 1))(x + 1) − ∫ (−cos(x + 1))(1) dx
                            = −(x + 1) cos(x + 1) + ∫ cos(x + 1) dx
                            = −(x + 1) cos(x + 1) + sin(x + 1) + C,    C ∈ R.
We can check that −(x + 1) cos(x + 1) + sin(x + 1) + C is indeed an antiderivative of
(x + 1) sin(x + 1) by using the product rule:

    d/dx (−(x + 1) cos(x + 1) + sin(x + 1) + C)
        = −cos(x + 1) + (−(x + 1))(−sin(x + 1)) + cos(x + 1)
        = (x + 1) sin(x + 1).
2. The integral ∫ (x + 1) sin((x + 1)²) dx looks like a much better integration by sub-
stitution candidate! As before, we think of sin((x + 1)²) as the "f'(g(x))" part,
with f'(x) = sin(x) and g(x) = (x + 1)²; this now means that f(x) = −cos(x),
g'(x) = 2(x + 1), and therefore that

    ∫ (x + 1) sin((x + 1)²) dx = (1/2) ∫ 2(x + 1) sin((x + 1)²) dx
                               = (1/2) ∫ f'(g(x)) · g'(x) dx
                               = (1/2) f(g(x)) + C
                               = −(1/2) cos((x + 1)²) + C,    C ∈ R.
As always, we check this antiderivative by taking a derivative, using the chain rule:

    d/dx (−(1/2) cos((x + 1)²) + C) = −(1/2)(−sin((x + 1)²)) · d/dx ((x + 1)²)
                                    = −(1/2)(−sin((x + 1)²))(2(x + 1))
                                    = (x + 1) sin((x + 1)²).
3. If we look at ∫ sin(x)/cos(x) dx, it looks like substitution is not a bad guess: we certainly
have some composition going on with the 1/cos(x) part, and if we indeed make
f'(x) = 1/x, g(x) = cos(x), then g'(x) = −sin(x) does indeed give us the remaining parts of
the function we're integrating, up to the sign! Therefore, because f'(x) = 1/x forces
f(x) = ln(|x|), we have

    ∫ sin(x)/cos(x) dx = −∫ (−sin(x)) · (1/cos(x)) dx
                       = −∫ f'(g(x)) · g'(x) dx
                       = −f(g(x)) + C
                       = −ln(|cos(x)|) + C,    C ∈ R.
We can check that this is indeed an antiderivative of tan(x) = sin(x)/cos(x) by using the
chain rule, splitting into cases to handle the absolute value:

    d/dx (−ln(|cos(x)|) + C) = −(1/|cos(x)|) · d/dx (|cos(x)|)
        = { −(1/cos(x)) · d/dx (cos(x)),       if cos(x) ≥ 0
          { −(1/(−cos(x))) · d/dx (−cos(x)),   if cos(x) < 0
        = { −(1/cos(x)) · (−sin(x)),           if cos(x) ≥ 0
          { −(1/(−cos(x))) · sin(x),           if cos(x) < 0
        = sin(x)/cos(x) in either case.
4. If we were to try integration by substitution on ∫ (ln(x))² dx, we'd have to make
f'(g(x)) = (ln(x))², and this doesn't really leave anything left for the g'(x) part! So,
let's not try this, and try parts instead! In particular, this means we probably think
of this as ∫ ln(x) ln(x) dx.
This makes our choice for f'(x) and g(x) pretty simple: we make f'(x) = ln(x)
and g(x) = ln(x), because we don't really have any other choices! This means that
f(x) = x ln(x) − x, as we saw in class earlier, while g'(x) = 1/x. As a result, we have

    ∫ ln(x) ln(x) dx = ∫ f'(x) · g(x) dx = f(x)g(x) − ∫ f(x)g'(x) dx
                     = (x ln(x) − x) ln(x) − ∫ (x ln(x) − x) · (1/x) dx
                     = x(ln(x))² − x ln(x) − ∫ (ln(x) − 1) dx
                     = x(ln(x))² − x ln(x) − (x ln(x) − x − x) + C
                     = x(ln(x))² − 2x ln(x) + 2x + C,    C ∈ R.

We can check that x(ln(x))² − 2x ln(x) + 2x + C is indeed an antiderivative of (ln(x))²
by using the product and chain rules:

    d/dx (x(ln(x))² − 2x ln(x) + 2x + C)
        = (ln(x))² + x · 2 ln(x) · (1/x) − 2 ln(x) − 2 + 2
        = (ln(x))².
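All four worked answers can also be checked the same way we checked them by hand: differentiate each antiderivative and compare with the original integrand. A standard-library Python sketch (the helper `deriv`, a central finite difference, is our own name):

```python
import math

# Each (antiderivative, integrand) pair from the example, checked by
# differentiating the first function and comparing with the second.

def deriv(F, x, h=1e-6):
    # central finite-difference approximation of F'(x)
    return (F(x + h) - F(x - h)) / (2 * h)

checks = [
    # 1. the antiderivative of (x+1)sin(x+1)
    (lambda x: -(x + 1) * math.cos(x + 1) + math.sin(x + 1),
     lambda x: (x + 1) * math.sin(x + 1)),
    # 2. the antiderivative of (x+1)sin((x+1)^2)
    (lambda x: -0.5 * math.cos((x + 1) ** 2),
     lambda x: (x + 1) * math.sin((x + 1) ** 2)),
    # 3. the antiderivative of sin(x)/cos(x)
    (lambda x: -math.log(abs(math.cos(x))),
     lambda x: math.sin(x) / math.cos(x)),
    # 4. the antiderivative of (ln x)^2
    (lambda x: x * math.log(x) ** 2 - 2 * x * math.log(x) + 2 * x,
     lambda x: math.log(x) ** 2),
]

for F, f in checks:
    for x in (0.4, 1.1, 2.0):   # points where every integrand is defined
        assert abs(deriv(F, x) - f(x)) < 1e-5
```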
6.2 Definite Integration
The definite integral of a function f(x) from a to b, written

    ∫_a^b f(x) dx,

is the signed area between the curve y = f(x) and the x-axis, where we think of area
above the x-axis as being positive and area below the x-axis as being negative. So, for
instance, the definite integral

    ∫_0^{2π} sin(x) dx = 0,
because the area above the x-axis from 0 to π is "canceled out" by the area below the x-axis
from π to 2π:

(Graph: one arch of sin(x) above the axis on [0, π], marked +, and a matching arch below
the axis on [π, 2π].)
In general, though, we calculate definite integrals by using the Fundamental Theorem
of Calculus: if f(x) is a function and F(x) is any antiderivative of f(x), then

    ∫_a^b f(x) dx = F(b) − F(a).

So, for example, because −cos(x) is an antiderivative of sin(x), we have

    ∫_0^{2π} sin(x) dx = (−cos(2π)) − (−cos(0)) = −1 − (−1) = 0,
which algebraically verifies the fact we saw geometrically a moment ago.
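The same check can be run numerically, assuming nothing beyond the standard library (`midpoint_integral`, a midpoint Riemann sum, is our own helper): a Riemann sum for ∫_0^{2π} sin(x) dx agrees with F(2π) − F(0) for F(x) = −cos(x), and both are essentially zero.

```python
import math

# The fundamental theorem of calculus, checked numerically: a midpoint
# Riemann sum for the integral of sin over [0, 2*pi] should agree with
# F(2*pi) - F(0) for the antiderivative F(x) = -cos(x).

def midpoint_integral(f, a, b, n=100_000):
    # midpoint-rule Riemann sum with n equal subintervals
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

riemann = midpoint_integral(math.sin, 0, 2 * math.pi)
ftc = (-math.cos(2 * math.pi)) - (-math.cos(0))

assert abs(ftc) < 1e-12          # F(2*pi) - F(0) = -1 - (-1) = 0
assert abs(riemann - ftc) < 1e-6 # the Riemann sum agrees
```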
One useful application of the definite integral is finding the unsigned area between
a curve and the x-axis: i.e. the area where we count the regions below and above the x-axis
as both positive, without any of this canceling-out stuff! To find this, you just want to
integrate |f(x)|, as the absolute-value signs reflect all of the parts of our curve that were
below the x-axis up above it! In other words, if we let A denote the unsigned area between
the x-axis and the curve y = f(x) from a to b, then

    A = ∫_a^b |f(x)| dx.

To integrate something like |f(x)|, it helps to break up the region you're integrating over
into the places where f(x) ≥ 0 and the places where f(x) ≤ 0, so that you can replace
|f(x)| with f(x) or −f(x) on each piece.
We illustrate this idea with an example:
Example. Find ∫_{−1}^{1} f(x) dx for f(x) = x/(x² + 3). Then, find the area between the
x-axis and the curve y = f(x) from x = −1 to x = 1.
Answer. We start by finding the indefinite integral ∫ x/(x² + 3) dx. This looks like an
integration by substitution problem, as it has some composition going on; indeed, if we let
f'(x) = 1/x, g(x) = x² + 3, g'(x) = 2x, we have f(x) = ln(|x|) and therefore that

    ∫ x/(x² + 3) dx = (1/2) ∫ 2x/(x² + 3) dx = (1/2) ∫ g'(x) f'(g(x)) dx
                    = (1/2) f(g(x)) + C
                    = (1/2) ln(|x² + 3|) + C.
Therefore, by the fundamental theorem of calculus, we have

    ∫_{−1}^{1} x/(x² + 3) dx = ((1/2) ln(|x² + 3|) + C)|_{x=1} − ((1/2) ln(|x² + 3|) + C)|_{x=−1}
                             = ((1/2) ln(4) + C) − ((1/2) ln(4) + C)
                             = 0.
This answers the first part of our problem. To answer the second, we need to figure out
∫_{−1}^{1} |x/(x² + 3)| dx. These absolute value signs are pretty irritating, so we want to
get rid of them! To do this, we need to figure out where our curve is above the x-axis and
where it is below the x-axis.
Notice that for any x, x² + 3 > 0; so the only part of x/(x² + 3) relevant to determining the
sign of our function is the numerator, which is x. Therefore this entire function is positive
when x > 0 and negative when x < 0!
As a result, we can see that |x/(x² + 3)| = x/(x² + 3) when x ≥ 0, and
|x/(x² + 3)| = −x/(x² + 3) when x < 0; as a result, we can write
    Area = ∫_{−1}^{1} |x/(x² + 3)| dx = ∫_{−1}^{0} (−x/(x² + 3)) dx + ∫_{0}^{1} (x/(x² + 3)) dx
         = −((1/2) ln(|x² + 3|)|_{x=0} − (1/2) ln(|x² + 3|)|_{x=−1})
           + ((1/2) ln(|x² + 3|)|_{x=1} − (1/2) ln(|x² + 3|)|_{x=0})
         = −((1/2) ln(3) − (1/2) ln(4)) + ((1/2) ln(4) − (1/2) ln(3))
         = ln(4) − ln(3).
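As a sanity check, we can redo both halves of this example numerically. The sketch below (standard library only; `midpoint_integral`, a midpoint Riemann sum, is our own helper) confirms that the signed integral vanishes while the unsigned area is ln(4) − ln(3):

```python
import math

# Both halves of the example, numerically: the signed integral of
# f(x) = x/(x^2 + 3) over [-1, 1] is 0 (the function is odd), while the
# unsigned area is ln(4) - ln(3).

def midpoint_integral(f, a, b, n=100_000):
    # midpoint-rule Riemann sum with n equal subintervals
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

f = lambda x: x / (x**2 + 3)

signed = midpoint_integral(f, -1, 1)
area = midpoint_integral(lambda x: abs(f(x)), -1, 1)

assert abs(signed) < 1e-9
assert abs(area - (math.log(4) - math.log(3))) < 1e-6
```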
7 Functions of Two Variables [Lectures 31-32]
After reading and watching these lectures, we’re hoping that you can do the following tasks:
• Find the level curves of a function f (x, y), and use them to visualize the graph z =
f (x, y).
• Calculate the partial derivatives d/dx f(x, y) and d/dy f(x, y).
• Find the tangent plane to a function f (x, y) at a point.
7.1 Functions of Two Variables
A function of two variables f : D → R, for any subset D of R², is any rule that takes in
pairs (x, y) ∈ R² of real numbers and outputs another real number. We think of D above
as the domain of this function. As before, given a rule f like f(x, y) = y/(x + 1), we often want
to find sets on which our rule is a function (and in particular is defined); for this example,
for instance, f is defined on the set {(x, y) | x, y ∈ R, x ≠ −1}.
Pretty much all of the functions of this sort that we interact with in this course are those
that we make by sticking together various elementary functions that we know: for example,
f(x, y) = sin(xy), g(x, y) = x² + y², h(x, y) = eˣ − y are all functions of two variables.
Given a function f of two variables, we often want to graph that function! These graphs
take place in three dimensions, i.e. R³, and consist of plotting all of the points (x, y, z) such
that f(x, y) = z. For example, here are a few surfaces:
Figure 3: Left to right: the hemisphere z = √(1 − x² − y²), the paraboloid z = x² + y², and
the monkey saddle z = x³ − 3xy².
Sometimes, we will want to visualize a surface even when we don’t have access to
computer programs! To do this, we use level curves, which are defined as follows: given a
function f (x, y), a level curve at height h is the set of all points (x, y) such that f (x, y) = h.
We think of this as what happens when we “slice” through the graph z = f (x, y) at height
z = h. If we take enough of these cross-sections, I claim that we get a nice visual image of
what our surface will look like!
For example, let f(x, y) = x² − y². I've drawn the level curves h = x² − y² of this
function below, for values of h ranging from 4 to −4:
(Figure: nine level-curve plots of h = x² − y², for h = 4, 3, 2, 1, 0, −1, −2, −3, −4.)
With some imagination, you can think about what it would look like if these level curves
were drawn in 3D space, each one at its corresponding height h, like this:
(Figure: the nine level curves from above, drawn in 3D at their corresponding heights z = h.)
Indeed, if you fill in the gaps you can see the surface we’ve drawn here (a hyperbolic
paraboloid!)
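Checking whether a point sits on a given level curve is just a function evaluation, which we can sketch in a few lines of Python (standard library only):

```python
# A point (x, y) lies on the level curve of f at height h exactly when
# f(x, y) = h. We spot-check a few points against the level curves of
# f(x, y) = x**2 - y**2 discussed above.

f = lambda x, y: x**2 - y**2

assert f(2, 0) == 4               # (2, 0) is on the h = 4 curve
assert f(2.5, 1.5) == 4           # ...and so is (2.5, 1.5)
assert f(0, 2) == -4              # (0, 2) is on the h = -4 curve
assert f(3, 3) == 0 == f(3, -3)   # the h = 0 level set is the pair y = ±x
```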
7.2 Partial Derivatives
Given a function f(x, y), we define the partial derivative with respect to x of f(x, y),
denoted d/dx f(x, y), as the following: take f(x, y), think of x as a variable and y as a constant,
and take the derivative as normal with respect to x. So, for example,

    d/dx (x + y) = 1 + 0 = 1,
    d/dx sin(xy) = cos(xy) · d/dx (xy) = cos(xy) · y,
    d/dx y² = 0.
Similarly, we define the partial derivative with respect to y of f(x, y), denoted d/dy f(x, y),
as the following: take f(x, y), think of y as the variable and x as a constant, and take the
derivative with respect to y! So, for example,

    d/dy (x + y) = 0 + 1 = 1,
    d/dy sin(xy) = cos(xy) · d/dy (xy) = cos(xy) · x,
    d/dy y² = 2y.
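Numerically, a partial derivative is just an ordinary derivative with the other variable frozen, which makes the examples above easy to check. A standard-library Python sketch (`partial_x` and `partial_y` are our own helper names):

```python
import math

# Partial derivatives via finite differences in one argument at a time.
# We test the worked examples d/dx sin(xy) = y*cos(xy) and
# d/dy sin(xy) = x*cos(xy).

def partial_x(f, x, y, h=1e-6):
    # finite-difference d/dx, treating y as a constant
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def partial_y(f, x, y, h=1e-6):
    # finite-difference d/dy, treating x as a constant
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

f = lambda x, y: math.sin(x * y)

for (x, y) in [(0.5, 1.0), (1.2, -0.7), (2.0, 3.0)]:
    assert abs(partial_x(f, x, y) - y * math.cos(x * y)) < 1e-5
    assert abs(partial_y(f, x, y) - x * math.cos(x * y)) < 1e-5
```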
Earlier, we used the derivative to make a tangent line to the graph y = f(x). We can
use the partial derivatives here to make a tangent plane to the graph z = f(x, y) at the
point (a, b, c) in a very similar way, using the equation below:

    z − c = (d/dx f(x, y))|_{(x,y,z)=(a,b,c)} · (x − a) + (d/dy f(x, y))|_{(x,y,z)=(a,b,c)} · (y − b)
We consider one last example to illustrate this idea:
Example. Find the tangent plane to the graph of f(x, y) = x² + y² at the point (1, 1, 2).
Answer. We calculate:

    d/dx (x² + y²) = 2x,    d/dy (x² + y²) = 2y.

Plugging this into the tangent plane formula, with (a, b, c) = (1, 1, 2), yields

    z − 2 = (d/dx f(x, y))|_{(x,y,z)=(1,1,2)} · (x − 1) + (d/dy f(x, y))|_{(x,y,z)=(1,1,2)} · (y − 1)
    ⇒ z − 2 = 2(x − 1) + 2(y − 1).
Success!
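We can also confirm the tangency numerically: the plane and the surface should agree exactly at (1, 1, 2), and the gap between them should shrink quadratically as we approach the point, which is what "tangent" means. A small standard-library sketch:

```python
# The tangent plane z = 2 + 2(x - 1) + 2(y - 1) versus f(x, y) = x^2 + y^2
# near the point of tangency (1, 1, 2).

f = lambda x, y: x**2 + y**2
plane = lambda x, y: 2 + 2 * (x - 1) + 2 * (y - 1)

# exact agreement at the point of tangency
assert f(1, 1) == plane(1, 1) == 2

# nearby, the gap is second-order small: halving the step quarters the error
err = lambda t: abs(f(1 + t, 1 + t) - plane(1 + t, 1 + t))
assert err(0.1) < 0.05
assert abs(err(0.05) / err(0.1) - 0.25) < 1e-6
```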
We can graph this to visually confirm that we've drawn a tangent plane. (Figure: the surface
z = x² + y² with its tangent plane touching at (1, 1, 2).)