Mathematics 1
M. Anthony
MT105a, 279005a
2011
Undergraduate study in
Economics, Management,
Finance and the Social Sciences
This subject guide is for a Level 1 course (also known as a ‘100 course’) offered as part of
the University of London International Programmes in Economics, Management, Finance
and the Social Sciences. This is equivalent to Level 4 within the Framework for Higher
Education Qualifications in England, Wales and Northern Ireland (FHEQ).
For more information about the University of London International Programmes
undergraduate study in Economics, Management, Finance and the Social Sciences, see:
www.londoninternational.ac.uk/current_students/programme_resources/lse/index.shtml
This guide was prepared for the University of London International Programmes by:
Martin Anthony, Department of Mathematics, London School of Economics and
Political Science.
This is one of a series of subject guides published by the University. We regret that due to
pressure of work the author is unable to enter into any correspondence relating to, or arising
from, the guide. If you have any comments on this subject guide, favourable or unfavourable,
please use the form at the back of this guide.
University of London International Programmes
Publications Office
Stewart House
32 Russell Square
London WC1B 5DN
United Kingdom
Website: www.londoninternational.ac.uk
Published by: University of London
© University of London 2010
Reprinted with minor revisions 2011
The University of London asserts copyright over all material in this subject guide except where
otherwise indicated. All rights reserved. No part of this work may be reproduced in any form,
or by any means, without permission in writing from the publisher.
We make every effort to contact copyright holders. If you think we have inadvertently used
your copyright material, please let us know.
Contents
1 General introduction
  1.1 Studying mathematics
  1.2 Mathematics in the social sciences
  1.3 Aims and objectives
  1.4 Learning outcomes
  1.5 How to use the subject guide
  1.6 Recommended books
    1.6.1 Main text
    1.6.2 Other recommended texts
  1.7 Online study resources
    1.7.1 The VLE
    1.7.2 Making use of the Online Library
  1.8 Examination advice
  1.9 The use of calculators
2 Basics
  Essential reading
  Further reading
  2.1 Introduction
  2.2 Basic notations
  2.3 Simple algebra
  2.4 Sets
  2.5 Numbers
  2.6 Functions
  2.7 Inverse functions
  2.8 Composition of functions
  2.9 Powers
  2.10 Graphs
  2.11 Quadratic equations and curves
  2.12 Polynomial functions
  2.13 Simultaneous equations
  2.14 Supply and demand functions
  2.15 Exponentials
  2.16 The natural logarithm
  2.17 Trigonometrical functions
  2.18 Further applications of functions
  Learning outcomes
  Sample examination/practice questions
  Answers to activities
  Answers to Sample examination/practice questions
3 Differentiation
  Essential reading
  Further reading
  3.1 Introduction
  3.2 The definition and meaning of the derivative
  3.3 Standard derivatives
  3.4 Rules for calculating derivatives
  3.5 Optimisation
  3.6 Curve sketching
  3.7 Marginals
  3.8 Profit maximisation
  Learning outcomes
  Sample examination/practice questions
  Answers to activities
  Answers to Sample examination/practice questions
4 Integration
  Essential reading
  Further reading
  4.1 Introduction
  4.2 Integration
  4.3 Definite integrals
  4.4 Integration by substitution
    4.4.1 The method
    4.4.2 Examples
    4.4.3 The substitution method for definite integrals
  4.5 Integration by parts
  4.6 Partial fractions
  4.7 Applications of integration
  Learning outcomes
  Sample examination/practice questions
  Answers to activities
  Answers to Sample examination/practice questions
5 Functions of several variables
  Essential reading
  Further reading
  5.1 Introduction
  5.2 Functions of several variables
  5.3 Partial derivatives
  5.4 The chain rule
  5.5 Implicit partial differentiation
  5.6 Optimisation
  5.7 Applications of optimisation
  5.8 Constrained optimisation
  5.9 Applications of constrained optimisation
  5.10 The meaning of the Lagrange multiplier
  Learning outcomes
  Sample examination/practice questions
  Answers to activities
  Answers to Sample examination/practice questions
6 Matrices and linear equations
  Essential reading
  Further reading
  6.1 Introduction
  6.2 Vectors
  6.3 Matrices
    6.3.1 What is a matrix?
    6.3.2 Matrix addition and scalar multiplication
    6.3.3 Matrix multiplication
    6.3.4 The identity matrix
  6.4 Linear equations
  6.5 Elementary row operations
  6.6 Applications of matrices and linear equations
  Learning outcomes
  Sample examination/practice questions
  Answers to activities
  Answers to Sample examination/practice questions
7 Sequences and series
  Essential reading
  Further reading
  7.1 Introduction
  7.2 Sequences
  7.3 Arithmetic progressions
  7.4 Geometric progressions
  7.5 Compound interest
  7.6 Compound interest and the exponential function
  7.7 Series
    7.7.1 Arithmetic series
    7.7.2 Geometric series
  7.8 Finding a formula for a sequence
  7.9 Limiting behaviour
  7.10 Financial applications
  Learning outcomes
  Sample examination/practice questions
  Answers to activities
  Answers to Sample examination/practice questions
A Sample examination paper
B Comments on the Sample examination paper
Chapter 1
General introduction
1.1 Studying mathematics
The study of mathematics can be very rewarding. It is particularly satisfying to solve a
problem and know that it is solved. Unlike many of the other subjects you will study,
there is always a right answer in mathematics problems. Of course, part of the
excitement of the social sciences arises from the fact that there may be no single ‘right
answer’ to a problem: it is stimulating to participate in debate and discussion, to defend
or re-think (and possibly change) your position.
It would be wrong to think that, in contrast, mathematics is very dry and mechanical.
It can be as much of an art as a science. Although there may be only one right (final)
answer, there could be a number of different ways of obtaining that answer, some more
complex than others. Thus, a given problem will have only one ‘answer’, but many
‘solutions’ (by which we mean routes to finding the answer). Generally, a mathematician
likes to find the simplest solution possible to a given problem, but that does not mean
that any other solution is wrong. (There may be different, equally simple, solutions.)
With mathematical questions, you first have to work out precisely what it is that the
question is asking, and then try to find a method (hopefully a nice, simple one) which
will solve the problem. This second step involves some degree of creativity, especially at
an advanced level. You must realise that you can hardly be expected to look at every
mathematics problem and write down a beautiful and concise solution, leading to the
correct answer, straight away. Of course, some problems are like this (for example,
‘Calculate 2 + 2’!), but for other types of problem you should not be afraid to try
various different techniques, some of which may fail. In this sense, there is a certain
amount of ‘trial and error’ in solving some mathematical problems. This really is the
way a lot of mathematics is done. For obvious reasons, teachers, lecturers and textbooks
rarely give that impression: they present the solution right there on the page or the
blackboard, with no indication of the time a student might be expected to spend
thinking — or of the dead-end paths he or she might understandably follow — before a
solution can be found. It is a good idea to have scrap paper to work with so that you
can try out various methods of solution. (It is very inhibiting only to have in front of
you the crisp sheet of paper on which you want to write your final, elegant, solutions.
Mathematics is not done that way.) You must not get frustrated if you can’t solve a
problem immediately. As you proceed through the subject, gathering more experience,
you will develop a feel for which techniques are likely to be useful for particular
problems. You should not be afraid to try different techniques, some of which may not
work, if you cannot immediately recognise which technique to use.
1.2 Mathematics in the social sciences
Many students find mathematics difficult and are tempted to ask why they have to
endure the agony and anguish of learning and understanding difficult mathematical
concepts and techniques. Hopefully you will not feel this way, but if you do, be assured
that all the techniques you struggle to learn in this subject will be useful in the end for
their applications in economics, management, and many other disciplines. Some of these
applications will be illustrated in this subject guide and in the textbooks. In fact, as the
textbook discussions illustrate, far from making things difficult and complicated,
mathematics makes problems in economics, management and related fields
‘manageable’. It’s not just about working out numbers; using mathematical models,
qualitative (and not simply quantitative) results can be obtained. (See Anthony and
Biggs (1996) Chapter 1, for instance.)
1.3 Aims and objectives
There is a certain amount of enjoyment to be derived from mathematics for its own
sake. It is a beautiful subject, with its own concise, precise and powerful language. To
many people, however, the main attraction of mathematics is its breadth of useful
applications.
The main aim of this subject is to equip you with the mathematical tools for the study
of economics, management, accounting, banking and related disciplines.
This half course may not be taken with 173 Algebra or 174 Calculus.
1.4 Learning outcomes
At the end of this half course and having completed the Essential reading and activities,
you should have:
• used the concepts, terminology, methods and conventions covered in the half course
to solve mathematical problems in this subject
• the ability to solve unseen mathematical problems involving understanding of these
concepts and application of these methods
• seen how mathematical techniques can be used to solve problems in economics and
related subjects.
1.5 How to use the subject guide
This subject guide is absolutely not a substitute for the textbooks. It is only what its
name suggests: a guide to the study and reading you should undertake. In each of the
subsequent chapters, brief discussions of the syllabus topics are presented, together with
pointers to recommended readings from the textbooks. It is essential that you use
textbooks. Generally, it is a good idea to read the texts as you work through a chapter
of the guide.
It is most useful to read what the guide says about a particular topic, then do the
necessary reading, then come back and re-read what the guide says to make sure you
fully understand the topic. Textbooks are also an invaluable source of examples for you
to attempt.
You should not necessarily spend the same amount of time on each chapter of the guide:
some chapters cover much more material than others. I have divided the guide into
chapters in order to group together topics on particular central themes, rather than to
create units of equal length.
The discussions of some topics in this guide are rather more thorough than others.
Often, this is not because those topics are more significant, but because the textbook
treatments are not as extensive as they might be.
Within each chapter of the guide you will encounter ‘Learning activities’. You should
carry out these activities as you encounter them: they are designed to help you
understand the topic under discussion. Solutions to them are at the end of the chapters,
but do make a serious attempt at them before consulting the solutions.
To help your time management, the chapters and topics of the subject are converted
below into approximate percentages of total time. However, this is purely for
indicative purposes. Some of you will know the basics quite well and need to spend less
time on the earlier material, while others might have to work hard to comprehend the
very basic topics before proceeding onto the more advanced.
Chapter  Title                            % Time
2        Basics                           20
3        Differentiation                  20
4        Integration                      15
5        Functions of several variables   20
6        Matrices and linear equations    15
7        Sequences and series             10
At the end of each chapter, you will find a list of ‘Learning outcomes’. This indicates
what you should be able to do having studied the topics of that chapter. At the end of
each chapter, there are ‘Sample examination questions’: some of these are really only
samples of parts of exam questions.
1.6 Recommended books
The main recommended text is the book by Anthony and Biggs. This covers all of the
required material and uses the same notations as this guide. But if you need more help
with the material of the second chapter (‘Basics’), you might find it useful to consult
some other texts (such as the one by Booth), which treat this basic material more
slowly.
1.6.1 Main text
• Anthony, M. and N. Biggs, Mathematics for economics and finance. (Cambridge,
UK: Cambridge University Press, 1996) [ISBN 9780521559133]. Recommended for
purchase.
1.6.2 Other recommended texts
Please note that as long as you read the Essential reading you are then free to read
around the subject area in any text, paper or online resource. To help you read
extensively, you have free access to the VLE and University of London Online Library
(see below). Other useful texts for this course include:
• Binmore, K. and J. Davies, Calculus. (Cambridge, UK: Cambridge University
Press, 2001) [ISBN 9780521775410].
• Booth, D.J. Foundation mathematics. (Harlow: Prentice Hall, 1998) Third
Edition. [ISBN 9780201342949].
• Bradley, T. Essential mathematics for economics and business. (Chichester:
Wiley, 2008) Third Edition. [ISBN 9780470018569].
• Dowling, Edward T. Introduction to mathematical economics. Schaum’s Outline
Series. (New York; London: McGraw-Hill, 2000) Third Edition. [ISBN
9780071358965].
• Ostaszewski, A. Mathematics in economics: models and methods. (Oxford, UK:
Blackwell, 1993) [ISBN 9780631180562].
Each chapter of Anthony and Biggs has a large section of fully worked examples, and a
selection of exercises for the reader to attempt.
The book by Binmore and Davies contains all the calculus you will need, and a lot
more, although it is at times a bit more advanced than you will need.
If you find you have considerable difficulty with some of the earlier basic topics in this
subject, then you should consult the book by Booth (or a similar one: there are many at
that level). This book takes a slower-paced approach to these more basic topics. It
would not be suitable as a main text, however, since it only covers the easier parts of
the subject.
The book by Bradley covers most of the material, and has plenty of worked examples.
Dowling’s book contains lots of worked examples. It is, however, less concerned with
explaining the techniques. It would not be suitable as your main text, but it is a good
source of additional examples.
Ostaszewski is at a slightly higher level than is needed for most of the subject, but it is
very suitable for a number of the topics, and provides many examples.
There are many other books which cover the material of this subject, but those listed
above are the ones I shall refer to explicitly.
Detailed reading references in this subject guide refer to the editions of the set
textbooks listed above. New editions of one or more of these textbooks may have been
published by the time you study this course. You can use a more recent edition of any
of the books; use the detailed chapter and section headings and the index to identify
relevant readings. Also check the virtual learning environment (VLE) regularly for
updated guidance on readings.
It is important to understand how you should use the textbooks. As I mentioned above,
there are no great debates in mathematics at this level: you should not, therefore, find
yourself in passionate disagreement with a passage in a mathematics text! However, try
not to find yourself in passive agreement with it either. It is so very easy to read a
mathematics text and agree with it, without engaging with it. Always have a pen
and scrap paper to hand, to make notes and to work through, for yourself, the examples
an author presents. The single most important point to be made about learning
mathematics is that, to learn it properly, you have to do it. Do work through the
worked examples in a textbook and do attempt the exercises. This is the real way to
learn mathematics. In the examination, you are hardly likely to encounter a question
you have seen before, so you must have practised enough examples to ensure that you
know your techniques well enough to be able to cope with new problems.
1.7 Online study resources
In addition to the subject guide and the Essential reading, it is crucial that you take
advantage of the study resources that are available online for this course, including the
VLE and the Online Library.
You can access the VLE, the Online Library and your University of London email
account via the Student Portal at:
http://my.londoninternational.ac.uk
You should receive your login details in your study pack. If you have not, or you have
forgotten your login details, please email uolia.support@london.ac.uk quoting your
student number.
1.7.1 The VLE
The VLE, which complements this subject guide, has been designed to enhance your
learning experience, providing additional support and a sense of community. It forms an
important part of your study experience with the University of London and you should
access it regularly.
The VLE provides a range of resources for EMFSS courses:
• Self-testing activities: Doing these allows you to test your own understanding of
subject material.
• Electronic study materials: The printed materials that you receive from the
University of London are available to download, including updated reading lists
and references.
• Past examination papers and Examiners’ commentaries: These provide advice on
how each examination question might best be answered.
• A student discussion forum: This is an open space for you to discuss interests and
experiences, seek support from your peers, work collaboratively to solve problems
and discuss subject material.
• Videos: There are recorded academic introductions to the subject, interviews and
debates and, for some courses, audio-visual tutorials and conclusions.
• Recorded lectures: For some courses, where appropriate, the sessions from previous
years’ Study Weekends have been recorded and made available.
• Study skills: Expert advice on preparing for examinations and developing your
digital literacy skills.
• Feedback forms.
Some of these resources are available for certain courses only, but we are expanding our
provision all the time and you should check the VLE regularly for updates.
1.7.2 Making use of the Online Library
The Online Library contains a huge array of journal articles and other resources to help
you read widely and extensively.
To access the majority of resources via the Online Library you will either need to use
your University of London Student Portal login details, or you will be required to
register and use an Athens login:
http://tinyurl.com/ollathens
The easiest way to locate relevant content and journal articles in the Online Library is
to use the Summon search engine.
If you are having trouble finding an article listed in a reading list, try removing any
punctuation from the title, such as single quotation marks, question marks and colons.
For further advice, please see the online help pages:
www.external.shl.lon.ac.uk/summon/about.php
1.8 Examination advice
Important: the information and advice given here are based on the examination
structure used at the time this guide was written. Please note that subject guides may
be used for several years. Because of this we strongly advise you to always check both
the current Regulations for relevant information about the examination, and the VLE
where you should be advised of any forthcoming changes. You should also carefully
check the rubric/instructions on the paper you actually sit and follow those instructions.
Remember, it is important to check the VLE for:
• up-to-date information on examination and assessment arrangements for this course
• where available, past examination papers and Examiners’ commentaries for the
course which give advice on how each question might best be answered.
A Sample examination paper may be found at the end of this subject guide. You will
see that from 2009–10, all of the questions on the paper are compulsory. Any further
changes to exam format will be announced on the VLE.
It is worth making a few comments about exam technique. Perhaps the most important,
though obvious, point is that you do not have to answer the questions in any particular
order; choose the order that suits you best. Some students will want to do easy
questions first to boost their confidence, while others will prefer to get the difficult ones
out of the way. It is entirely up to you.
Another point, often overlooked by students, is that you should always include your
working. This means two things.
First, do not simply write down your answer in the exam script, but explain your
method of obtaining it (that is, what I called the ‘solution’ earlier).
Secondly, include your rough working. You should do this for two reasons:
• If you have just written down the answer without explaining how you obtained
it, then you have not convinced the Examiner that you know the techniques,
and it is the techniques that are important in this subject. (The Examiners
want you to get the right answers, of course, but it is more important that you
prove you know what you are doing: that is what is really being examined.)
• If you have not completely solved a problem, you may still be awarded marks
for a partial, incomplete, or slightly wrong, solution; if you have written down
a wrong answer and nothing else, no marks can be awarded. (You may have
carried out a lengthy calculation somewhere on scrap paper where you made a
silly arithmetical error. Had you included this calculation in the exam answer
book, you would probably not have been heavily penalised for the arithmetical
error.) It is useful, also, to let the Examiner know what you are thinking. For
example, if you know you have obtained the wrong answer to a problem, but
you can’t see how to correct it, say so!
As mentioned above, you will find that, wherever appropriate, there are sample exam
questions at the end of the chapters. These are an indication of the types of question
that might appear in future exams. But they are just an indication. The Examiners
want to test that you know and understand a number of mathematical methods and, in
setting an exam paper, they are trying to test whether you do indeed know the
methods, understand them, and are able to use them, and not merely whether you
vaguely remember them. Because of this, you will quite possibly encounter some
questions in your exam which seem unfamiliar. Of course, you will only be examined on
material in the syllabus. Furthermore, you should not assume that your exam will be
almost identical to the previous year’s: for instance, just because there was a question,
or a part of a question, on a certain topic last year, you should not assume there will be
one on the same topic this year. For this reason, you cannot guarantee passing if you
have concentrated only on a very small fraction of the topics in the subject. This may
all sound a bit harsh, but it has to be emphasised.
1.9 The use of calculators
You will not be permitted to use calculators of any type in the examination. This is not
something that you should panic about: the Examiners are interested in assessing that
you understand the key methods and techniques, and will set questions which do not
require the use of a calculator.
In this guide, I will perform some calculations for which a calculator would be needed,
but you will not have to do this in the exam questions. Look carefully at the answers to
the sample exam questions to see how to deal with calculations. For example, if the
answer to a problem is √2, then leave the answer like that: there is no need to express
this number as a decimal (for which one would need a calculator or a very good
memory!).
Chapter 2
Basics
Essential reading
(For full publication details, see Chapter 1.)
• Anthony and Biggs (1996) Chapters 1, 2, and 7.
Further reading
• Binmore and Davies (2001) Chapter 2, Sections 2.1–2.6.
• Booth (1998) Modules 1, 3, 4, 6–8, 11–15.
• Bradley (2008) Sections 1.1–1.6, 2.1, 3.1.1, 4.1–4.3.
• Dowling (2000) Chapters 1 and 2.
• Ostaszewski (1993) Chapter 1 (though this is more advanced than is required at
this stage of the subject), Chapter 5, Sections 5.5 and 5.12.
2.1 Introduction
This chapter discusses some of the very basic aspects of the subject, aspects on which
the rest of the subject builds. It is essential to have a firm understanding of these topics
before the more advanced topics can be understood.
Most things in economics and related disciplines — such as demand, sales, price,
production levels, costs and so on — are interrelated. Therefore, in order to come to
rational decisions on appropriate values for many of these parameters it is of
considerable benefit to form mathematical models or functional relationships between
them. It should be noted at the outset that, in general, the economic models used are
typically only approximations to reality, as indeed are all models. They are, nonetheless,
very useful in decision-making. Before we can attempt such modelling, however, we
need some mathematical basics.
This chapter contains a lot of material, but much of it will be revision. If you find any
of the sections difficult, please refer to the texts indicated for further explanation and
examples.
2.2 Basic notations
Although there is a high degree of standardisation of notation within mathematical
texts, some differences do occur. The notation given here is indicative of what is used in
the rest of this guide and in most of the texts.1 You should endeavour to familiarise
yourself with as many of the common notations as possible. For example, |a| means ‘the
absolute value of a’, which equals a if a is non-negative (that is, if a ≥ 0), and equals
−a otherwise. For instance, |6| = 6 and | − 2.5| = 2.5. (This is sometimes termed ‘the
modulus of a’. Roughly speaking, the absolute value of a number is obtained just by
ignoring any minus sign the number has.) As another example, multiplication is
sometimes denoted by a dot, as in a · b rather than a × b. Beware of confusing
multiplication and the use of a dot to indicate a decimal point. Even more commonly,
one simply uses ab to denote the multiplication of a and b. Also, you should be aware of
implied multiplications, as in 2(3) = 6.
Some other useful notations are those for sums, products and factorials. We denote the
sum
\[ x_1 + x_2 + \cdots + x_n \]
of the numbers $x_1, x_2, \ldots, x_n$ by
\[ \sum_{i=1}^{n} x_i. \]
The ‘Σ’ indicates that numbers are being summed, and the ‘i = 1’ and n below and
above the Σ show that it is the numbers $x_i$, as i runs from 1 to n, that are being
summed together. Sometimes we will be interested in adding up only some of the
numbers. For example,
\[ \sum_{i=2}^{n-1} x_i \]
would denote the sum $x_2 + x_3 + \cdots + x_{n-1}$, which is the sum of all the numbers except
the first and last.
We denote their product $x_1 \times x_2 \times \cdots \times x_n$ (the result of multiplying all the numbers
together) by
\[ \prod_{i=1}^{n} x_i. \]
For a positive whole number, n, n! (‘n factorial’) is the product of all the numbers from
1 up to n. For example, $4! = 1 \cdot 2 \cdot 3 \cdot 4 = 24$. By convention 0! is taken to be 1. The
factorial can be expressed using the product notation:
\[ n! = \prod_{i=1}^{n} i. \]
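The sum, product and factorial notations can also be checked numerically. The following is a minimal Python sketch; the list `x` holds illustrative values chosen for this sketch, and the helper `factorial` is a hypothetical name, not something defined in the guide.

```python
import math

# Illustrative values (an assumption for this sketch)
x = [1, 3, -1, 5]

# Sum of x_1, ..., x_n; Python lists are 0-indexed, so x_i is x[i - 1]
total = sum(x)

# Sum of x_2, ..., x_{n-1}: everything except the first and last entries
inner = sum(x[1:-1])

# Product of x_1, ..., x_n
product = math.prod(x)

def factorial(n):
    """n! as the product of 1, 2, ..., n; the empty product gives 0! = 1."""
    return math.prod(range(1, n + 1))
```

Note that the empty product being 1 is what makes the convention 0! = 1 fall out of the product formula without a special case.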
Example 2.1 Suppose that $x_1 = 1$, $x_2 = 3$, $x_3 = -1$, $x_4 = 5$. Then
\[ \sum_{i=1}^{4} x_i = 1 + 3 + (-1) + 5 = 8 \quad\text{and}\quad \sum_{i=2}^{4} x_i = 3 + (-1) + 5 = 7. \]
We also have, for example,
\[ \prod_{i=1}^{4} x_i = 1(3)(-1)(5) = -15 \quad\text{and}\quad \prod_{i=1}^{3} x_i = 1(3)(-1) = -3. \]
1 You may consult Booth (1998) or a large number of other basic maths texts, for further information
on basic notations.
Activity 2.1 Suppose that $x_1 = 3$, $x_2 = 1$, $x_3 = 4$, $x_4 = 6$. Find
\[ \sum_{i=1}^{4} x_i \quad\text{and}\quad \prod_{i=1}^{4} x_i. \]
2.3 Simple algebra
You should try to become confident and capable in handling simple algebraic
expressions and equations. You should be proficient in:
collecting up terms: e.g.
\[ 2a + 3b - a + 5b = a + 8b, \]
multiplication of variables: e.g.
\[ (-a)(b) + (a)(-b) - 3(a)(b) + (-2a)(-4b) = -ab - ab - 3ab + 8ab = 3ab, \]
expansion of bracketed terms: e.g.
\[ (2x - 3y)(x + 4y) = 2x^2 - 3xy + 8xy - 12y^2 = 2x^2 + 5xy - 12y^2. \]
You should also be able to factorise quadratic equations, something discussed later in
this chapter.
Activity 2.2 Expand $(x^2 - 1)(x + 2)$.
2.4 Sets
A set may be thought of as a collection of objects.2 A set is usually described by listing
or describing its members inside curly brackets. For example, when we write
A = {1, 2, 3}, we mean that the objects belonging to the set A are the numbers 1, 2, 3
(or, equivalently, the set A consists of the numbers 1, 2 and 3). Equally (and this is
what we mean by ‘describing’ its members), this set could have been written as
A = {n | n is a whole number and 1 ≤ n ≤ 3}.
2 See Anthony and Biggs (1996) Section 2.1.
Here, the symbol | stands for ‘such that’. Often, the symbol ‘:’ is used instead, so that
we might write
A = {n : n is a whole number and 1 ≤ n ≤ 3}.
As another example, the set
B = {x | x is a reader of this guide}
has as its members all of you (and nothing else). When x is an object in a set A, we
write x ∈ A and say ‘x belongs to A’ or ‘x is a member of A’.
The set which has no members is called the empty set and is denoted by ∅. The empty
set may seem like a strange concept, but it has its uses.
We say that the set S is a subset of the set T , and we write S ⊆ T , if every member of
S is a member of T . For example, {1, 2, 5} ⊆ {1, 2, 4, 5, 6, 40}. (Be aware that some
texts use ⊂ where we use ⊆.)
Given two sets A and B, the union A ∪ B is the set whose members belong to A or B
(or both A and B): that is,
A ∪ B = {x | x ∈ A or x ∈ B}.
Example 2.2
If A = {1, 2, 3, 5} and B = {2, 4, 5, 7}, then A ∪ B = {1, 2, 3, 4, 5, 7}.
Similarly, we define the intersection: A ∩ B to be the set whose members belong to both
A and B:3
A ∩ B = {x | x ∈ A and x ∈ B}.
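These set operations map directly onto Python's built-in `set` type. The sketch below uses the sets from Example 2.2 and the subset example from the text; the sets `C` and `D` are hypothetical and chosen only to illustrate intersection.

```python
# Example 2.2's sets, written as Python sets
A = {1, 2, 3, 5}
B = {2, 4, 5, 7}

union = A | B                  # members of A or B (or both)

# Subset test, using the example from the text
S = {1, 2, 5}
T = {1, 2, 4, 5, 6, 40}
is_subset = S <= T             # True when every member of S is a member of T

# Intersection, shown on two hypothetical sets
C = {0, 2, 4, 6}
D = {4, 5, 6, 7}
common = C & D                 # members of both C and D

# Membership, and an intersection that turns out to be the empty set
three_in_A = 3 in A
nothing_shared = {1, 2} & {3, 4}
```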
Activity 2.3 Suppose A = {1, 2, 3, 5} and B = {2, 4, 5, 7}. Find A ∩ B.
2.5 Numbers
There are some standard notations for important sets of numbers.4 The set R of real
numbers may be thought of as the points on a line. Each such number can be described
by a decimal representation.
Given two real numbers a and b, we define the intervals
[a, b] = {x ∈ R | a ≤ x ≤ b}
(a, b] = {x ∈ R | a < x ≤ b}
(a, b) = {x ∈ R | a < x < b}
[a, b) = {x ∈ R | a ≤ x < b}
[a, ∞) = {x ∈ R | x ≥ a}
(a, ∞) = {x ∈ R | x > a}
(−∞, b] = {x ∈ R | x ≤ b}
(−∞, b) = {x ∈ R | x < b}.
3 See Anthony and Biggs (1996) for examples of union and intersection.
4 See Anthony and Biggs (1996) Section 2.1.
The symbol ∞ means ‘infinity’, but it is not a real number, merely a notational
convenience. You should notice that when a square bracket, ‘[’ or ‘]’, is used to denote
an interval, the number beside the bracket is included in the interval, whereas if a round
bracket, ‘(’ or ‘)’, is used, the adjacent number is not in the interval. For example, [2, 3]
contains the number 2, but (2, 3] does not.
The set {. . . , −3, −2, −1, 0, 1, 2, 3, . . .} of integers is denoted by Z.
The positive integers are also known as natural numbers and the set of these, i.e.
{1, 2, 3, . . . }, is denoted by N.
Having defined R, we can define the set $R^2$ of ordered pairs (x, y) of real numbers. Thus
$R^2$ is the set usually depicted as the set of points in a plane, x and y being the
coordinates of a point with respect to a pair of axes. For instance, (−1, 3/2) is an
element of $R^2$ lying to the left of and above (0, 0), which is known as the origin.
2.6 Functions
Given two sets A and B, a function from A to B is a rule which assigns to each member
of A precisely one member of B.5 For example, if A and B are both the set Z, the rule
which says ‘add 2’ is a function. Normally we express this function by a formula: if we
call the function f , we can write the rule which defines f as f (x) = x + 2. Two very
important functions in economics are the supply and demand functions for a good.6
These are discussed later in this chapter.
It is often helpful to think of a function as a machine which converts an input into an
output, as shown in Figure 2.1.
Figure 2.1: A function as a machine (input x, function f, output f(x))
2.7 Inverse functions
As the one-way arrows in Figure 2.1 indicate, a function is a one-way relationship: the
function f takes a number x as input and it returns another number, f (x). Suppose you
were told that the output, f (x), was a number y, and you wanted to know what the
input was. In some cases, this is easy. For example, if f (x) = x + 2 and the output f (x)
is the number y, then we must have y = f (x) = x + 2. Solving for x in terms of y, we
find that x = y − 2. In other words, there is only one possible input x which could have
produced output y for this function, namely x = y − 2. In a situation such as this,
where for each and every y there is exactly one x such that f(x) = y, we say that the
function f has an inverse function.7 The inverse function is denoted by $f^{-1}$, and it is
the rule for reversing f. Formally, $f^{-1}$ is defined by
\[ x = f^{-1}(y) \iff f(x) = y. \]
(The symbol $\iff$ means ‘if and only if’ or ‘is equivalent to’.) When f(x) = x + 2, we
have seen that
\[ y = f(x) \iff x = y - 2, \]
so the inverse function (which takes as input a number y and returns the number x such
that f(x) = y) is given by
\[ f^{-1}(y) = y - 2. \]
(This could also be written as $f^{-1}(x) = x - 2$ or $f^{-1}(z) = z - 2$; there is nothing special
about the symbol used to denote the variable, i.e. the input to the function.8)
5 See Anthony and Biggs (1996) Section 2.2.
6 See Anthony and Biggs (1996) Section 1.2.
It should be emphasised that not every function has an inverse. For instance, the
function $f(x) = x^2$, from R to R, has no inverse. To see this, we can simply observe that
there is not exactly one number x such that f(x) = y, where y = 1; for, both when x = 1
and x = −1, $f(x) = x^2 = 1$. (Of course, this observation is true for any positive number
y.) So, in this case, we cannot definitively answer the question ‘If f(x) = 1, what is x?’.
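The idea of an inverse as ‘the rule for reversing f’ can be illustrated with a short Python sketch. The function names `f`, `f_inverse` and `square` are hypothetical labels for this illustration; the first two implement f(x) = x + 2 and its inverse from the text.

```python
def f(x):
    """The function f(x) = x + 2 from the text."""
    return x + 2

def f_inverse(y):
    """Its inverse, f^{-1}(y) = y - 2."""
    return y - 2

# Applying f and then f_inverse returns the starting value
round_trips = [f_inverse(f(x)) for x in (-3, 0, 2.5)]

def square(x):
    """g(x) = x^2 on the real numbers, which has no inverse."""
    return x * x

# Two different inputs give the same output, so the input cannot be recovered
same_output = square(1) == square(-1)
```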
Activity 2.4 If f(x) = 3x + 2, find a formula for $f^{-1}(x)$.
2.8 Composition of functions
If we are given two functions f and g, then we can apply them consecutively to obtain
what is known as the composite function h, given by the rule
h(x) = f (g(x)).
The composite function h is denoted h = f g and is often described in words as ‘g
followed by f ’ or as ‘f after g’.9 It is also sometimes denoted by f ◦ g.
Example 2.3 Suppose that $f(x) = x + 1$ and $g(x) = x^4$. Then the composite
function h = fg is given by
\[ fg(x) = f(g(x)) = f(x^4) = x^4 + 1. \]
On the other hand, the function k = gf is given by
\[ gf(x) = g(f(x)) = g(x + 1) = (x + 1)^4. \]
Note, then, that in general, the compositions fg and gf are different.
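Example 2.3 can be reproduced in a few lines of Python. This is only a sketch: `compose` is a hypothetical helper written for this illustration, and the order of its arguments mirrors the notation fg, meaning ‘g followed by f’.

```python
def compose(outer, inner):
    """Return the composite function 'inner followed by outer'."""
    return lambda x: outer(inner(x))

def f(x):
    return x + 1

def g(x):
    return x ** 4

fg = compose(f, g)   # fg(x) = f(g(x)) = x^4 + 1
gf = compose(g, f)   # gf(x) = g(f(x)) = (x + 1)^4
```

Evaluating both at x = 2 gives different values, which is one concrete way to see that fg and gf are different functions.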
7 See Anthony and Biggs (1996) Section 2.2.
8 See Anthony and Biggs (1996) Section 2.2, for discussion of ‘dummy variables’.
9 See Anthony and Biggs (1996) Section 2.3.
Activity 2.5 If $f(x) = \sqrt{x}$ and $g(x) = x^2 + 1$, find a formula for the composition fg.
2.9 Powers
When n is a positive integer, the nth power10 of the number a, $a^n$, is simply the product
of n copies of a, that is,
\[ a^n = \underbrace{a \times a \times \cdots \times a}_{n \text{ times}}. \]
The number n is called the power, exponent or index. We have the power rules (or rules
of exponents):
\[ a^x a^y = a^{x+y}, \qquad (a^x)^y = a^{xy}, \]
whenever x and y are positive integers. The power $a^0$ is defined to be 1. When n is a
positive integer, $a^{-n}$ means $1/a^n$. For example, $3^{-2}$ is $1/3^2 = 1/9$. The power rules hold
when x and y are any integers, positive, negative or zero. When n is a positive integer,
$a^{1/n}$ is the ‘positive nth root of a’; this is the number x such that $x^n = a$. Formally,
suppose n is a positive integer and let S be the set of all non-negative real numbers.
Then the function $f(x) = x^n$ from S to S has an inverse function $f^{-1}$. We can think of
$f^{-1}$ as the definition of raising a number to the power of 1/n: explicitly, $f^{-1}(y) = y^{1/n}$.
Of course, $a^{1/2}$ is usually denoted by $\sqrt{a}$, and is the square root of a. When m and n are
integers and n is positive, $a^{m/n}$ is $\left(a^{1/n}\right)^m$. So, the power rules still apply.
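As a brief worked illustration of these rules, consider the number 8 with fractional and negative exponents. The numbers are chosen here purely for convenience (they are not taken from the guide), and no calculator is needed:

```latex
\[
8^{2/3} = \left(8^{1/3}\right)^{2} = 2^{2} = 4,
\qquad
8^{-2/3} = \frac{1}{8^{2/3}} = \frac{1}{4},
\qquad
8^{1/3} \cdot 8^{2/3} = 8^{1/3 + 2/3} = 8^{1} = 8.
\]
```

The first equality uses the definition of $a^{m/n}$, the second the definition of a negative power, and the third the rule $a^x a^y = a^{x+y}$.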
2.10 Graphs
In this section, we consider the graphs of functions. The graphing of functions is very
important in its own right, and familiarity with graphs of common functions and the
ability to produce graphs systematically is a necessary and important aspect of the
subject.
Figure 2.2: The x and y-axes
10 See Anthony and Biggs (1996) Section 7.1.
The graph11 of a function f (x) is the set of all points in the plane of the form (x, f (x)).
Sketches of graphs can be very useful. To sketch a graph, we start with the x-axis and
y-axis, as in Figure 2.2. (This figure only shows the region in which x and y are both
non-negative, but the x-axis extends to the left and the y-axis extends downwards.)
Figure 2.3: Plotting a point on a graph
We then plot all points of the form (x, f (x)). Thus, at x units from the origin (the point
where the axes cross), we plot a point whose height above the x-axis (that is, whose
y-coordinate) is f (x). This is shown in Figure 2.3. The graph is sometimes described as
the graph y = f (x) to signify that the y-coordinate represents the function value f (x).
Joining together all points of the form (x, f (x)) results in a curve, called the graph of
f (x). This is often described as the curve with equation y = f (x). Figure 2.4 gives an
example of what this curve might look like.
Figure 2.4: The curve y = f(x)
11 See Anthony and Biggs (1996) Section 2.4.
These figures indicate what is meant by the graph of a function, but you should not
imagine that the correct way to sketch a graph is to plot a few points of the form
(x, f (x)) and join them up; this approach rarely works well and more sophisticated
techniques are needed. (Many of these will be discussed later.)
We shall discuss the graphs of some standard important functions as we progress. We
start with the easiest of all: the graph of a linear function. In the next section we look
at the graphs of quadratic functions. The linear functions are those of the form
f (x) = mx + c and their graphs are straight lines, with gradient, or slope, m, which
cross the y-axis at the point (0, c). Figure 2.5 illustrates the graph of the function
f (x) = 2x + 3 and Figure 2.6 the graph of the function f (x) = −x + 2.
Figure 2.5: The graph of the line y = 2x + 3
Figure 2.6: The graph of the line y = −x + 2
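The roles of m and c in f(x) = mx + c can be checked with a small Python sketch. The helper `line` is a hypothetical name introduced for this illustration; the two functions are the ones plotted in Figures 2.5 and 2.6.

```python
def line(m, c):
    """Return the linear function f(x) = m*x + c."""
    return lambda x: m * x + c

f = line(2, 3)      # f(x) = 2x + 3
g = line(-1, 2)     # g(x) = -x + 2

# The graph crosses the y-axis at (0, c)
f_intercept = f(0)
g_intercept = g(0)

# The gradient is the change in y divided by the change in x between any two points
f_gradient = (f(5) - f(1)) / (5 - 1)
g_gradient = (g(4) - g(0)) / (4 - 0)
```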
Activity 2.6 Sketch the curves y = x + 3 and y = −3x − 2.
2.11 Quadratic equations and curves
A common problem is to find the set of solutions of a quadratic equation12
\[ ax^2 + bx + c = 0, \]
where we may as well assume that $a \neq 0$, because if a = 0 the equation reduces to a
linear one. (Note that, by a solution, we mean a value of x for which the equation is
true.) In some cases the quadratic expression can be factorised, which means that it can
be written as the product of two linear terms (of the form x − a for some a). For
example $x^2 - 6x + 5 = (x - 1)(x - 5)$, so the equation $x^2 - 6x + 5 = 0$ becomes
$(x - 1)(x - 5) = 0$. Now the only way that two numbers can multiply to give 0 is if at
least one of the numbers is 0, so we can conclude that x − 1 = 0 or x − 5 = 0; that is,
the equation has two solutions, 1 and 5. Although factorisation may be difficult, there is
a general technique for determining the solutions to a quadratic equation, as follows.13
Suppose we have the quadratic equation $ax^2 + bx + c = 0$, where $a \neq 0$. Then:
if $b^2 - 4ac < 0$, the equation has no real solutions;
if $b^2 - 4ac = 0$, the equation has exactly one solution, $x = -\dfrac{b}{2a}$;
if $b^2 - 4ac > 0$, the equation has two solutions,
\[ x_1 = \frac{-b - \sqrt{b^2 - 4ac}}{2a} \quad\text{and}\quad x_2 = \frac{-b + \sqrt{b^2 - 4ac}}{2a}. \]
For example, consider the quadratic equation $x^2 - 2x + 3 = 0$; here we have a = 1,
b = −2, c = 3. The quantity $b^2 - 4ac$ (called the discriminant) is $(-2)^2 - 4(1)(3) = -8$,
which is negative, so this equation has no solution. (Technically, it has no real solutions.
It does have solutions in ‘complex numbers’, but this is outside the scope of this
subject.) This is less mysterious than it may seem. We can write the equation as
$(x - 1)^2 + 2 = 0$. Rewriting the left-hand side of the equation in this form is known as
‘completing the square’. Now, the square of a number is always greater than or equal to
0, so the quantity on the left of this equation is always at least 2 and is therefore never
equal to 0. The above formulae for the solutions to a quadratic equation are obtained
using the technique of completing the square.14
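The three cases of the discriminant translate directly into a short routine. The following Python sketch uses a hypothetical function name, `solve_quadratic`, and works in floating-point arithmetic, so the test for a discriminant of exactly 0 is only reliable when the coefficients are small integers, as they are here.

```python
import math

def solve_quadratic(a, b, c):
    """Return the real solutions of a*x^2 + b*x + c = 0 (a != 0), in increasing order."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic equation")
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        return []                          # no real solutions
    if discriminant == 0:
        return [-b / (2 * a)]              # exactly one solution
    root = math.sqrt(discriminant)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])
```

Applied to the two equations discussed above, `solve_quadratic(1, -6, 5)` returns the solutions 1 and 5, and `solve_quadratic(1, -2, 3)` returns an empty list because the discriminant is −8.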
It is instructive to look at the graphs of quadratic functions and to understand the
connection between these and the solutions to quadratic equations. First, let’s look at
the graph of a typical quadratic function $y = ax^2 + bx + c$. Figure 2.7 shows the curves
one obtains for two typical quadratics $ax^2 + bx + c$. For the first, a is positive and for
the second a is negative. We have omitted the x and y axes in these figures; it is the
shape of the graph that we want to emphasise first. Note that the first graph has a
‘U’-shape and that the second is the same sort of shape, upside-down. To be more
formal, the curves are parabolae.
12 See Anthony and Biggs (1996) Section 2.4.
13 See Anthony and Biggs (1996) Section 2.4.
14 See Anthony and Biggs (1996) Section 2.4, if you haven’t already.
Figure 2.7: Typical quadratic curves $y = ax^2 + bx + c$ (one with a > 0, one with a < 0)
Figure 2.8 is the graph of the quadratic function $f(x) = x^2 - 6x + 5$.
Figure 2.8: The graph of the quadratic $y = x^2 - 6x + 5$
Note that, since the number in front of the $x^2$ term (what we called a above) is positive,
the curve is of the first type displayed in Figure 2.7. What we want to emphasise with
this specific example is the positioning of the curve with respect to the axes. There is a
fairly straightforward way to determine where the curve crosses the y-axis. Since the
y-axis has equation x = 0, to find the y-coordinate of this crossing (or intercept), all we
have to do is substitute x = 0 into the function. Since $f(0) = 0^2 - 6(0) + 5 = 5$, the point
where the curve crosses the y-axis is (0, 5). (Generally, the point where the graph of a
function f(x) crosses the y-axis is (0, f(0)).) The other important points on the diagram
are the points where the curve crosses the x-axis. Now, the curve has equation y = f(x),
and the x-axis has equation y = 0, so the curve crosses (or meets) the x-axis when
y = f(x) = 0. (This argument, so far, is completely general: to find where the graph of
f(x) crosses the x-axis, we solve the equation f(x) = 0. In general, this may have no
solution, one solution or a number of solutions, depending on the function.) Thus, we
have to solve the equation $x^2 - 6x + 5 = 0$. We did this earlier, and the solutions are
x = 1 and x = 5. It follows that the curve crosses the x-axis at (1, 0) and (5, 0).
Figure 2.9 shows the graph of another quadratic, $f(x) = x^2 - 2x + 3$.
Figure 2.9: The graph of the quadratic $y = x^2 - 2x + 3$
Notice that this one does not cross the x-axis. This is because the quadratic equation
$x^2 - 2x + 3 = 0$ (which we met earlier) has no solutions. You might ask what the
coordinates of the lowest point of the ‘U’ are. Later, we shall encounter a general
technique for answering such questions. For the moment, we can determine the point by
using the observation, made earlier, that the function is $(x - 1)^2 + 2$. Now, $(x - 1)^2 \geq 0$
and is equal to 0 only when x = 1, so the lowest value of the function is 2, which occurs
when x = 1; that is, the lowest point of the ‘U’ is the point (1, 2). You can obtain quite
a lot of information about quadratic curves without using very sophisticated techniques.
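The same reasoning can be applied to the earlier quadratic $x^2 - 6x + 5$. The following worked step is an illustration added here (it does not appear in the guide) and completes the square in the same way:

```latex
\[
x^2 - 6x + 5 = (x^2 - 6x + 9) - 9 + 5 = (x - 3)^2 - 4 .
\]
```

Since $(x - 3)^2 \geq 0$, with equality only when x = 3, the lowest point of that curve is (3, −4). Setting $(x - 3)^2 = 4$ gives x − 3 = ±2, that is x = 1 or x = 5, which agrees with the crossing points found by factorisation.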
Activity 2.7 Sketch the curve $y = x^2 + 4x + 3$. Where does it cross the x-axis?
2.12 Polynomial functions
Linear and quadratic functions are examples of a more general type of function: the
polynomial functions. A polynomial function is one of the form
\[ f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0. \]
The right-hand side is known simply as a polynomial. The number $a_i$ is known as the
coefficient of $x^i$. If $a_n \neq 0$ then n, the largest power of x in the polynomial, is known as
the degree of the polynomial. Thus, the linear functions are precisely the polynomials of
degree 1 and the quadratics are the polynomials of degree 2. Polynomials of degree 3 are
known as cubics. The simplest is the function $f(x) = x^3$, the graph of which is shown in
Figure 2.10. Notice that the graph of the function $f(x) = x^3$ only crosses the x-axis at
the origin, (0, 0). That is, the equation f(x) = 0 has just one solution. (We say that the
function has just one zero.) In general, a polynomial function of degree n has at most n
zeroes. For example, since
\[ x^3 - 7x + 6 = (x - 1)(x - 2)(x + 3), \]
the function $f(x) = x^3 - 7x + 6$ has three zeroes; namely, 1, 2, −3. Unfortunately, there
is no general straightforward formula (as there is for quadratics) for the solutions to
f(x) = 0 for polynomials f of degree larger than 2.
Figure 2.10: The graph of the curve $y = x^3$
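The zeroes of the cubic above can be confirmed by direct evaluation. The Python sketch below evaluates a polynomial from its list of coefficients using Horner's method, a standard evaluation scheme that the guide itself does not discuss; the function name `evaluate` and the search range −5 to 5 are assumptions made for this illustration.

```python
def evaluate(coefficients, x):
    """Evaluate a_n x^n + ... + a_1 x + a_0 by Horner's method.

    `coefficients` lists a_n, a_{n-1}, ..., a_0 (highest power first).
    """
    value = 0
    for a in coefficients:
        value = value * x + a
    return value

cubic = [1, 0, -7, 6]          # x^3 + 0x^2 - 7x + 6

# Search a small range of whole numbers for zeroes of the cubic
zeroes = [x for x in range(-5, 6) if evaluate(cubic, x) == 0]
```

A search like this only finds whole-number zeroes in the chosen range; it is a check on a known factorisation, not a general method for solving cubics.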
Activity 2.8 Factorise $f(x) = x^3 + 4x^2 + 3x$.
2.13 Simultaneous equations
An important type of problem arises when we have several equations which we have to
solve ‘simultaneously’.15 This means that we must find the intersection of the solution
sets of the individual equations. We have already met an example of this: when we want
to find the points (if any) where the curve y = f (x) meets the x-axis, we are essentially
solving two equations simultaneously. The first is y = f (x) and the second (the equation
of the x-axis) is y = 0. This is one way of thinking about how the equation f (x) = 0
arises. We shall spend a lot of time later looking at problems in which the aim is to
solve simultaneously more than two equations. For the moment, we shall just illustrate
with a simple example.
Any two lines which are not parallel cross each other exactly once, but how do we find
the crossing point? Let’s consider the lines with equations y = 2x − 1 and y = x + 2.
These are not parallel, since the gradient of the first is 2, whereas the gradient of the
second is 1. Figure 2.11 shows the two lines. Our aim is to determine the coordinates of
the crossing point C.
Figure 2.11: The lines y = 2x − 1 and y = x + 2
To find C, let us suppose that C = (X, Y). Then, since C lies on the line with equation
y = 2x − 1, we must have Y = 2X − 1. But C also lies on the line y = x + 2, so
Y = X + 2. Therefore the coordinates X and Y of C satisfy the following two
equations, simultaneously:
Y = 2X − 1
Y = X + 2.
It follows that
Y = 2X − 1 = X + 2.
From 2X − 1 = X + 2 we obtain X = 3. Then, to obtain Y, we use either the fact that
Y = 2X − 1, obtaining Y = 5, or we can use the equation Y = X + 2, obtaining (of
course) the same answer. It follows that C = (3, 5).
15 See Anthony and Biggs (1996) Sections 1.3 and 2.4.
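The method used to find C works for any two non-parallel lines. Here is a minimal Python sketch, assuming integer (or `Fraction`) gradients and intercepts so that the arithmetic stays exact; the function name `intersect` is hypothetical.

```python
from fractions import Fraction

def intersect(m1, c1, m2, c2):
    """Crossing point of y = m1*x + c1 and y = m2*x + c2, as exact fractions."""
    if m1 == m2:
        raise ValueError("parallel lines: no single crossing point")
    # Setting m1*X + c1 = m2*X + c2 and solving for X
    x = Fraction(c2 - c1, m1 - m2)
    y = m1 * x + c1
    return x, y

C = intersect(2, -1, 1, 2)     # the lines y = 2x - 1 and y = x + 2
```

Exact fractions are used instead of floating-point numbers because, as with the advice on leaving answers unevaluated, the crossing point of two lines is often not a whole number.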
Activity 2.9 Find the point of intersection of the lines with equations y = 2x − 3
and $y = 2 - \frac{1}{2}x$.
2.14 Supply and demand functions
Supply and demand functions16 describe the relationship between the price of a good,
the quantity supplied to the market by the manufacturer, and the amount the
consumers wish to buy. The demand function $q^D$ of the price p describes the demand
quantity: $q^D(p)$ is the quantity which would be sold if the price were p. Similarly, the
supply function $q^S$ is such that $q^S(p)$ is the amount supplied when the market price is p.
In the simplest models of the market, it is assumed that the supply and demand
functions are linear; in other words, their graphs are straight lines. For example, it
could be that $q^D(p) = 4 - p$ and $q^S(p) = 2 + p$. Note that the graph of the demand
function is a downward-sloping straight line, whereas the graph of the supply function is
upward-sloping. This is to be expected, since, for example, as the price of a good
increases, the consumers are prepared to buy less of the good, and so the demand
function decreases as price increases.
16 See Anthony and Biggs (1996) Section 1.2.
Sometimes, the supply and demand relationships are expressed through equations. For
instance, in the example just given we could equally well have described the relationship
between demand quantity and price by saying that the demand equation is q + p = 4.
The graphs of the demand function and supply function are known, respectively, as the
demand curve and the supply curve.
There is another way to view the relationship between price and quantity demanded,
where we ask how much the consumers (as a group; that is, on aggregate) would be
willing to pay for each unit of a good, given that a quantity q is available. From this
viewpoint we are expressing p in terms of q, instead of the other way round. We write
$p^D(q)$ for the value of p corresponding to a given q, and we call $p^D$ the inverse demand
function. It is, as the name suggests, the inverse function to the demand function. For
example, with $q^D(p) = 4 - p$, we have q = 4 − p and so p = 4 − q; thus, $p^D(q) = 4 - q$.
In a similar way, when we solve for the price in terms of the supply quantity, we obtain
the inverse supply function $p^S(q)$.
The market is in equilibrium17 when the consumers have as much of the commodity as
they want and the suppliers sell as much as they want. This occurs when the quantity
supplied matches the quantity demanded, or, supply equals demand. To find the
equilibrium price $p^*$, we solve $q^D(p) = q^S(p)$ and then to determine the equilibrium
quantity $q^*$ we compute $q^* = q^D(p^*)$ (or $q^* = q^S(p^*)$). (Generally there might be more
than one equilibrium, but not when the supply and demand are linear.) Geometrically,
the equilibrium point(s) occur where the demand curve and supply curve intersect.
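For linear supply and demand, the equilibrium can be computed in two lines. The sketch below assumes a demand function of the form q = a − bp and a supply function of the form q = c + dp; this parametrisation and the name `linear_equilibrium` are assumptions made for the illustration, which is applied to the example functions $q^D(p) = 4 - p$ and $q^S(p) = 2 + p$ given earlier.

```python
from fractions import Fraction

def linear_equilibrium(a, b, c, d):
    """Equilibrium for demand q = a - b*p and supply q = c + d*p.

    Returns (p_star, q_star) as exact fractions. Assumes b + d != 0.
    """
    # a - b*p = c + d*p  =>  p = (a - c) / (b + d)
    p_star = Fraction(a - c, b + d)
    q_star = a - b * p_star
    return p_star, q_star

# The example functions from the text: demand 4 - p, supply 2 + p
p_star, q_star = linear_equilibrium(4, 1, 2, 1)
```

For these functions the equilibrium price is 1 and the equilibrium quantity is 3, and substituting the price into the supply function gives the same quantity.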
Activity 2.10 Suppose the demand function is $q^D(p) = 20 - 2p$ and that the supply
function is $q^S(p) = \frac{2}{3}p - 4$. Find the equilibrium price $p^*$ and equilibrium quantity
$q^*$.
Not all supply and demand equations are linear. Consider the following example.
Example 2.4 Suppose that we have demand curve
\[ q = 250 - 4p - p^2, \]
and supply curve
\[ q = 2p^2 - 3p - 40. \]
Let us find the equilibrium price and quantity, and sketch the curves for 1 ≤ p ≤ 10.
To find the equilibrium, the simplest approach is to set the demand quantity equal
to the supply quantity, giving $250 - 4p - p^2 = 2p^2 - 3p - 40$. To solve this, we
convert it into a quadratic equation in the standard form (that is, one of the form
$ax^2 + bx + c = 0$, though clearly here we shall use p rather than x). We have
$3p^2 + p - 290 = 0$. Using the formula for the solutions of a quadratic equation, we
have solutions
\[ p = \frac{-1 \pm \sqrt{1 - 4(3)(-290)}}{2(3)} = \frac{-1 \pm \sqrt{3481}}{6} = \frac{-1 \pm 59}{6}, \]
that is, p = −10 and p = 29/3. (A calculator has been used here, so, given that
calculators are not permitted in the exam, this precise example would not appear in
an exam. This type of example, with easier arithmetic, could, however, do so: see the
Sample exam questions at the end of this chapter.) You could also have solved this
equation using factorisation. It is not so easy in this case, but we might have been
able to spot the factorisation
\[ 3p^2 + p - 290 = (3p - 29)(p + 10), \]
which leads to the same answers. Clearly only the second of these two solutions is
economically meaningful. So the equilibrium price is p = 29/3. To find the
equilibrium quantity, we can use either the supply or demand equations, and we
obtain
\[ q = 250 - 4\left(\frac{29}{3}\right) - \left(\frac{29}{3}\right)^2 = \frac{1061}{9}. \]
17 See Anthony and Biggs (1996) Section 1.3.
We now turn our attention to sketching the curves. The demand curve
$q = 250 - 4p - p^2$ is a quadratic with a negative squared term, and hence has an
upside-down ‘U’ shape. It crosses the q-axis at (0, 250). It crosses the p-axis where
$250 - 4p - p^2 = 0$. In standard form, this quadratic equation is $-p^2 - 4p + 250 = 0$
and it has solutions
\[ \frac{-(-4) \pm \sqrt{(-4)^2 - 4(-1)(250)}}{2(-1)} = \frac{4 \pm \sqrt{1016}}{-2} \approx -17.937, \; 13.937. \]
(Again, we use a calculator here, but in the exam such difficult computations would
not be required.) With this information, we now know that the curve is as in the
following sketch.
[Figure: sketch of the demand curve $q = 250 - 4p - p^2$]
For the supply curve, we have $q = 2p^2 - 3p - 40$, which is a ‘U’-shaped parabola.
This curve crosses the q-axis at (0, −40). It crosses the p-axis when
$2p^2 - 3p - 40 = 0$. This equation has solutions
\[ \frac{3 \pm \sqrt{9 - 4(2)(-40)}}{4} = \frac{3 \pm \sqrt{329}}{4} \approx 5.285, \; -3.785 \]
and the curve therefore looks like the following.
[Figure: sketch of the supply curve $q = 2p^2 - 3p - 40$]
The question asks us to sketch the curves for the range 1 ≤ p ≤ 10. Sketching both
on the same diagram we obtain:
[Figure: sketch of both curves for 1 ≤ p ≤ 10]
Note that the equilibrium point (29/3, 1061/9) is where the two curves intersect.
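The arithmetic in Example 2.4 can be verified exactly, without decimal approximations, using Python's `fractions` module. This is only a checking sketch; the function names `demand` and `supply` are labels chosen for this illustration.

```python
from fractions import Fraction

def demand(p):
    """Demand curve q = 250 - 4p - p^2 from Example 2.4."""
    return 250 - 4 * p - p ** 2

def supply(p):
    """Supply curve q = 2p^2 - 3p - 40 from Example 2.4."""
    return 2 * p ** 2 - 3 * p - 40

p_star = Fraction(29, 3)       # the economically meaningful solution
q_demand = demand(p_star)
q_supply = supply(p_star)
```

Both functions give 1061/9 at p = 29/3. They also agree at the rejected solution p = −10, which is what one would expect, since both values of p solve the same quadratic equation.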
Activity 2.11 Suppose the market demand function is given by $p = 4 - q - q^2$ and
that the market supply function is $p = 1 + 4q + q^2$. Sketch both these functions on
the same graph.
2.15 Exponentials
An exponential-type function is one of the form $f(x) = a^x$ for some number a. (Do not
confuse it with the function which raises a number to the power a. An exponential-type
function has the form $f(x) = a^x$, whereas the ‘ath power function’ has the form
$f(x) = x^a$.)
Figure 2.12: The graph of the function $f(x) = a^x$, when a > 1
There are some important points to notice about $f(x) = a^x$ and its graph, for a > 0.
First of all, $a^x$ is always positive, for every x. Furthermore, if a > 1 then $a^x$ becomes
larger and larger, without bound, as x increases. We say that $a^x$ tends to infinity as x
tends to infinity. Also, for such an a, as x becomes more and more negative, the
function $a^x$ gets closer and closer to 0. In other words, $a^x$ tends to 0 as x tends to
‘minus infinity’. This behaviour can be seen in Figure 2.12 for the case in which a is a
number larger than 1. If 0 < a < 1 the behaviour is quite different; the resulting graph is of
the form shown in Figure 2.13. (You can perhaps see why it has this shape by noting
that $a^x = (1/a)^{-x}$.)
Figure 2.13: The graph of the function $f(x) = a^x$, when 0 < a < 1
Some very important properties of exponential-type functions, exactly like the power
laws, hold. In particular,
\[ a^{r+s} = a^r a^s, \qquad (a^r)^s = a^{rs}. \]
Another property is that, regardless of a, $a^0$ is equal to 1, and the point (0, 1) is the
only place where the graph of $a^x$ crosses the y-axis.
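These properties can be spot-checked numerically. The Python sketch below uses hypothetical values of a, r and s, compares the two laws up to floating-point rounding, and samples $a^x$ at a few points for a = 2 and a = 1/2 to show the contrasting behaviour described above.

```python
import math

a, r, s = 2.0, 1.5, -0.5      # hypothetical values chosen for illustration

# The two laws, compared up to floating-point rounding
law_sum = math.isclose(a ** (r + s), a ** r * a ** s)
law_product = math.isclose((a ** r) ** s, a ** (r * s))

# Growth when a > 1 and decay when 0 < a < 1, sampled at a few points
points = (-2, 0, 2, 10)
growth = [2.0 ** x for x in points]
decay = [0.5 ** x for x in points]
```

In both lists the value at x = 0 is 1 and every value is positive; the first list increases with x while the second decreases towards 0.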
We now define the exponential function. This is the most important exponential-type
function. It is defined to be $f(x) = e^x$, where e is the special number 2.71828 . . . (The
function $e^x$ is also sometimes written as exp(x).) The most important facts about $e^x$ to
remember from this section are the shape of its graph, and its properties. The graph is
shown in Figure 2.14. We shall see in the next chapter one reason why the number e is
so special.
Figure 2.14: The graph of the function $f(x) = e^x$
2.16 The natural logarithm
Formally, the natural logarithm18 of a positive number x, denoted ln x (or, sometimes,
log x), is the number y such that $e^y = x$. In other words, the natural logarithm function
is the inverse of the exponential function $e^x$ (regarded as a function from the set of all
real numbers to the set of positive real numbers). Sometimes ln x is called the logarithm
to base e. The reason for this is that we can, more generally, consider the inverse of the
exponential-type function $a^x$. This inverse function is called the logarithm to base a and
we use the notation $\log_a x$. Thus, $\log_a x$ is the answer to the question ‘What is the
number y such that $a^y = x$?’.
The two most common logarithms, other than the natural logarithm, are logarithms to
base 2 and 10. For example, since $2^3 = 8$, we have $\log_2 8 = 3$. It may seem awkward to
have to think of a logarithm as the inverse of an exponential-type function, but it is
really not that strange. Confronted with the question ‘What is $\log_a x$?’, we simply turn
it around so that it becomes, as above, ‘What is the number y such that $a^y = x$?’.
There is often some confusion caused by the notations used for logarithms. Some texts
use log to mean natural logarithm, whereas others use it to mean $\log_{10}$. In this guide,
we shall use ln to mean natural logarithm and we shall avoid altogether the use of ‘log’
without a subscript indicating its base.
18 See Anthony and Biggs (1996) Section 7.4.
Figure 2.15: The graph of the natural logarithm function, $\ln x$
Figure 2.15 shows the graph of the natural logarithm. Note that it only makes sense to
define $\ln x$ for positive $x$. All the important properties of the natural logarithm follow
from those of the exponential function. For example, $\ln 1 = 0$. Why? Because $\ln 1$ is, by
its definition, the number $y$ such that $e^y = 1$. The only such $y$ is $y = 0$.
The other very important properties of $\ln x$ (which follow from properties of the
exponential function[19]) are:
\[ \ln(ab) = \ln a + \ln b, \quad \ln\left(\frac{a}{b}\right) = \ln a - \ln b, \quad \ln(a^b) = b \ln a. \]
These relationships are fairly simple and you will get used to them as you practise.
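The logarithm properties above can be checked numerically. The following is a minimal sketch using Python's standard `math` module; the values of `a` and `b` are arbitrary illustrative choices, not taken from the guide.

```python
import math

# Illustrative values only: any positive a and b would do.
a, b = 3.0, 7.0

# ln(ab) = ln a + ln b
product_rule_holds = math.isclose(math.log(a * b), math.log(a) + math.log(b))
# ln(a/b) = ln a - ln b
quotient_rule_holds = math.isclose(math.log(a / b), math.log(a) - math.log(b))
# ln(a^b) = b ln a
power_rule_holds = math.isclose(math.log(a ** b), b * math.log(a))

# ln 1 = 0, and ln undoes exp (ln is the inverse of the exponential function).
ln_one = math.log(1.0)
round_trip = math.log(math.exp(2.5))

# log to base 2 of 8, matching the example 2^3 = 8.
log2_of_8 = math.log2(8)

print(product_rule_holds, quotient_rule_holds, power_rule_holds)
print(ln_one, round_trip, log2_of_8)
```

Because floating-point arithmetic is approximate, the comparisons use `math.isclose` rather than exact equality.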
2.17 Trigonometrical functions
The trigonometrical functions, sin x, cos x, tan x (the sine function, cosine function and
tangent function) are very important in mathematics and they will occur later in this
subject. We shall not give the definition of these functions here. If you are unfamiliar
with them, consult the texts.[20]
It is important to realise that, throughout this subject, angles are measured in radians
rather than degrees. The conversion is as follows: 180 degrees equals π radians, where π
is the number 3.141 . . . It is good practice not to expand π or multiples of π as
decimals, but to leave them in terms of the symbol π. For example, since 60 degrees is
one third of 180 degrees, it follows that, in radians, 60 degrees is π/3.
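The conversion rule can be written as a one-line function. This is only a sketch; the function name `degrees_to_radians` is our own, and Python's built-in `math.radians` does the same job.

```python
import math

def degrees_to_radians(degrees):
    """Convert an angle in degrees to radians, using 180 degrees = pi radians."""
    return degrees * math.pi / 180

sixty_degrees = degrees_to_radians(60)

# Compare with pi/3, as in the text, and with the library routine.
print(sixty_degrees, math.pi / 3, math.radians(60))
```

Note that the program necessarily works with a decimal approximation of π; in written work you should still leave answers in terms of the symbol π.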
[19] See Anthony and Biggs (1996) Section 7.4.
[20] See, for example, Dowling (2000) Section 20.5 or Booth (1998) Module 13.
The graphs of the sine function, sin x, and the cosine function, cos x, are shown in
Figures 2.16 and 2.17. Note that these functions are periodic: they repeat themselves
every 2π steps. (For example, the graph of the sine function between 2π and 4π has
exactly the same shape as the graph of the function between 0 and 2π.)
Figure 2.16: The graph of the function sin x
Figure 2.17: The graph of the function cos x
Note also that the graph of $\cos x$ is a ‘shift’ of the graph of $\sin x$, obtained by shifting
the $\sin x$ graph by $\pi/2$ to the left. Mathematically, this is equivalent to the fact that
$\cos x = \sin(x + \pi/2)$.
The tangent function, $\tan x$, is defined in terms of the sine and cosine functions, as
follows:
\[ \tan x = \frac{\sin x}{\cos x}. \]
Note that the sine and cosine functions always take a value between 1 and −1. Table
2.1 gives some important values of the trigonometrical functions.
θ       sin θ    cos θ    tan θ
0       0        1        0
π/6     1/2      √3/2     1/√3
π/4     1/√2     1/√2     1
π/3     √3/2     1/2      √3
π/2     1        0        undefined
Table 2.1: Important values for the trigonometrical functions.
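The entries of Table 2.1 can be compared with the values a computer produces. The sketch below uses Python's `math` module; the dictionary `table` and the helper `row_matches` are our own names, introduced only for this illustration.

```python
import math

# Exact values from Table 2.1 as (sin, cos, tan), keyed by the angle in radians.
# The row for pi/2 is left out because tan(pi/2) is undefined.
table = {
    0.0:         (0.0,              1.0,              0.0),
    math.pi / 6: (1 / 2,            math.sqrt(3) / 2, 1 / math.sqrt(3)),
    math.pi / 4: (1 / math.sqrt(2), 1 / math.sqrt(2), 1.0),
    math.pi / 3: (math.sqrt(3) / 2, 1 / 2,            math.sqrt(3)),
}

def row_matches(theta, expected):
    """Check sin, cos and tan at theta against the tabulated values."""
    s, c, t = expected
    return (math.isclose(math.sin(theta), s, abs_tol=1e-12)
            and math.isclose(math.cos(theta), c, abs_tol=1e-12)
            and math.isclose(math.tan(theta), t, abs_tol=1e-12))

all_rows_match = all(row_matches(theta, values) for theta, values in table.items())

# At pi/2 the sine is 1 and the cosine is 0 (up to rounding error).
sin_half_pi = math.sin(math.pi / 2)
cos_half_pi = math.cos(math.pi / 2)

print(all_rows_match, sin_half_pi, cos_half_pi)
```

The computed value of cos(π/2) is a tiny non-zero number rather than exactly 0, because π itself is stored only approximately.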
Technically, the tangent function is not defined at π/2. This means that no meaning can
be given to tan(π/2). To see why, note that tan x = sin x/ cos x, but cos(π/2) = 0, and
we cannot divide by 0. You might wonder what happens to $\tan x$ around $x = \pi/2$. The
graph of $\tan x$ can be found in the textbooks.[21]
There are some useful results about the trigonometrical functions, with which you
should familiarise yourself. First, for all $x$,
\[ (\cos x)^2 + (\sin x)^2 = 1. \]
(We use $\sin^2 x$ to mean $(\sin x)^2$, and similarly for $\cos^2 x$.) Then there are the
double-angle formulae, which state that:
\[ \sin(2x) = 2 \sin x \cos x, \quad \cos(2x) = \cos^2 x - \sin^2 x. \]
Note that, since $\cos^2 x + \sin^2 x = 1$, the double-angle formula for $\cos(2x)$ may be written
in two other useful ways:
\[ \cos(2x) = 2\cos^2 x - 1 = 1 - 2\sin^2 x. \]
The double-angle formulae arise from two more general results. It is the case that for
any angles θ and φ, we have
sin(θ + φ) = sin θ cos φ + cos θ sin φ
and
cos(θ + φ) = cos θ cos φ − sin θ sin φ.
The double-angle formulae follow from these when we take θ and φ to be equal to each
other.
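A numerical spot-check of these identities is sketched below. The sample angles and the tolerance are arbitrary choices of ours, and a check at a few points is of course an illustration rather than a proof.

```python
import math

def double_angle_identities_hold(x, tol=1e-12):
    """Check cos^2 + sin^2 = 1 and the double-angle formulae at the angle x."""
    s, c = math.sin(x), math.cos(x)
    checks = [
        (c**2 + s**2, 1.0),
        (math.sin(2 * x), 2 * s * c),
        (math.cos(2 * x), c**2 - s**2),
        (math.cos(2 * x), 2 * c**2 - 1),
        (math.cos(2 * x), 1 - 2 * s**2),
    ]
    return all(abs(lhs - rhs) < tol for lhs, rhs in checks)

def addition_formulae_hold(theta, phi, tol=1e-12):
    """Check the formulae for sin(theta + phi) and cos(theta + phi)."""
    rhs_sin = math.sin(theta) * math.cos(phi) + math.cos(theta) * math.sin(phi)
    rhs_cos = math.cos(theta) * math.cos(phi) - math.sin(theta) * math.sin(phi)
    return (abs(math.sin(theta + phi) - rhs_sin) < tol
            and abs(math.cos(theta + phi) - rhs_cos) < tol)

sample_angles = [0.0, 0.3, math.pi / 6, 1.0, 2.5, -4.0]  # radians, chosen arbitrarily
identities_ok = all(double_angle_identities_hold(x) for x in sample_angles)
additions_ok = all(addition_formulae_hold(t, p)
                   for t in sample_angles for p in sample_angles)

print(identities_ok, additions_ok)
```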
Let $S$ be the interval $[-\pi/2, \pi/2]$. Then, regarded as a function from $S$ to the interval
$[-1, 1]$, $\sin x$ has an inverse function, which we denote by $\sin^{-1}$; thus, for $-1 \le y \le 1$,
$\sin^{-1}(y)$ is the angle $x$ (in radians) such that $-\pi/2 \le x \le \pi/2$ and $\sin x = y$. In a
similar manner, the function $\cos x$ from the interval $[0, \pi]$ to $[-1, 1]$ has an inverse,
which we denote by $\cos^{-1}$: so, for $-1 \le y \le 1$, $\cos^{-1} y$ is the angle $x$ (in radians) such
that $0 \le x \le \pi$ and $\cos x = y$. Similarly, the function $\tan x$, regarded as a function from
the interval $(-\pi/2, \pi/2)$ to $\mathbb{R}$ has an inverse, denoted by $\tan^{-1}$. Some texts use the
notation arcsin for $\sin^{-1}$, arccos for $\cos^{-1}$, and arctan for $\tan^{-1}$.
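In Python's `math` module the inverse trigonometrical functions are called `asin`, `acos` and `atan`, and they return angles in the intervals described above. The short sketch below (with illustrative inputs of our own choosing) also shows why the restriction to an interval matters.

```python
import math

half = 0.5
angle_from_sin = math.asin(half)   # the angle in [-pi/2, pi/2] whose sine is 1/2
angle_from_cos = math.acos(half)   # the angle in [0, pi] whose cosine is 1/2
angle_from_tan = math.atan(1.0)    # the angle in (-pi/2, pi/2) whose tangent is 1

# The angle 2.0 lies outside [-pi/2, pi/2], so asin(sin 2.0) does not return 2.0:
# it returns the angle inside the interval that has the same sine, namely pi - 2.0.
x = 2.0
recovered = math.asin(math.sin(x))

print(angle_from_sin, angle_from_cos, angle_from_tan, recovered)
```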
[21] See, for example, Binmore and Davies (2001) p. 57.
2.18 Further applications of functions
We have already seen that supply and demand can usefully be modelled using very
simple functional relationships, such as linear functions. We now discuss a few more
applications.
Suppose that the demand equation for a good is of the form $p = ax + b$ where $x$ is the
quantity produced. Then, at equilibrium, the quantity $x$ is the amount supplied and
sold, and hence the total revenue $TR$ at equilibrium is price times quantity, which is
\[ TR = (ax + b)x = ax^2 + bx, \]
a quadratic function which may be maximised either by completing the square, or by
using the techniques of calculus (discussed later).
Another very important function in applications is the total cost function of a firm. In
the simplest model of a cost function, a firm has a fixed cost $F$, which remains fixed
independent of production or sales, and it has variable costs which, for the sake of
simplicity, we will assume for the moment vary proportionally with production. That is,
the variable cost is of the form $Vx$ for some constant $V$, where $x$ represents the
production level. The total cost $TC$ is then the sum of these two: $TC = F + Vx$. For a
limited range of $x$ this very simplistic relationship often holds well, but more complicated
models (for instance, involving quadratic and exponential functions) often occur.
Combining the total cost and revenue functions on one graph enables us to perform
break-even analysis. The break-even output is that for which total cost equals total
revenue. In simplified, linear, models the break-even point (should it exist) is unique.
When non-linear relationships are used, a number of break-even points are possible.
Example 2.5 Let us find the break-even points in the case where the total cost
function is $TC = 7 + 2x + x^2$ and the total revenue function is $TR = 10x$. To find
the break-even points, we need to solve $TC = TR$; that is, $7 + 2x + x^2 = 10x$ or
$x^2 - 8x + 7 = 0$. Splitting this into factors, $(x - 7)(x - 1) = 0$, so $x = 7$ or $x = 1$.
(Alternatively, the formula for the solutions of a quadratic equation could be used.)
Note that there are two break-even points.
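The alternative route mentioned in Example 2.5, using the formula for the solutions of a quadratic equation, might be sketched in Python as follows. The function `solve_quadratic` is our own illustrative helper, not something defined in the guide.

```python
import math

def solve_quadratic(a, b, c):
    """Return the real solutions of a*x^2 + b*x + c = 0 in increasing order.

    An empty list is returned when b^2 - 4ac < 0 (no real solutions).
    """
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        return []
    root = math.sqrt(discriminant)
    # A set removes the duplicate when the two solutions coincide.
    return sorted({(-b - root) / (2 * a), (-b + root) / (2 * a)})

# TC = TR means 7 + 2x + x^2 = 10x, that is, x^2 - 8x + 7 = 0.
break_even = solve_quadratic(1, -8, 7)
print(break_even)  # [1.0, 7.0]
```

In an economic setting, any negative solution returned by such a function would still need to be discarded by hand, since a negative quantity has no meaning.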
Activity 2.12 Find the break-even points in the case where the total cost function
is $TC = 2 + 5x + x^2$ and the total revenue function is $TR = 12 + 8x$.
Learning outcomes
At the end of this chapter and the relevant reading, you should be able to:
determine inverse functions and composite functions
sketch graphs of simple functions
sketch quadratic curves and solve quadratic equations
solve basic simultaneous equations
find equilibria from supply and demand functions, and sketch these
find break-even points
explain what is meant by exponential-type functions and be able to sketch their graphs
use properties such as $a^{x+y} = a^x a^y$ and $(a^x)^y = a^{xy}$
explain what is meant by the exponential function $e^x$
describe the natural logarithm ($\ln x$), logarithms to base $a$ ($\log_a x$) and their properties
describe the functions sin x, cos x, tan x and their properties, key values, and graphs
explain what is meant by inverse trigonometrical functions
You do not need to know about complex (or imaginary) numbers (as you might see
discussed in some texts when, in a quadratic equation, $b^2 - 4ac < 0$).
Sample examination/practice questions
The material in this chapter of the guide is essential to what follows. Many exam
questions involve this material, but additionally involve other topics, such as calculus.
We give just three examples of exam-type questions which make use only of the
material in this chapter.
Question 2.1
Suppose the market demand function is given by
\[ p = 4 - q - q^2 \]
and that the market supply function is
\[ p = 1 + 4q + q^2. \]
Determine the equilibrium price and quantity.
Question 2.2
Suppose that the demand relationship for a product is p = 6/(q + 1) and that the
supply relationship is p = q + 2. Determine the equilibrium price and quantity.
Question 2.3
Suppose that the demand equation for a good is $q = 8 - p^2 - 2p$ and that the supply
equation is $q = p^2 + 2p - 3$. Sketch the supply and demand curves on the same diagram,
and determine the equilibrium price.
Answers to activities
Feedback to activity 2.1
$\sum_{i=1}^{4} x_i$ is $3 + 1 + 4 + 6 = 14$. The product $\prod_{i=1}^{4} x_i$ equals $3 \times 1 \times 4 \times 6 = 72$.
Feedback to activity 2.2
$x^3 + 2x^2 - x - 2$.
Feedback to activity 2.3
A ∩ B is the set of objects in both sets, and so it is {2, 5}.
Feedback to activity 2.4
If $y = f(x) = 3x + 2$ then we may solve this for $x$ by noting that $x = (y - 2)/3$. It
follows that $f^{-1}(y) = (y - 2)/3$ or, equivalently, $f^{-1}(x) = (x - 2)/3$.
Feedback to activity 2.5
$(fg)(x) = f(g(x)) = \sqrt{g(x)} = \sqrt{x^2 + 1}$.
Feedback to activity 2.6
The curve y = x + 3 is a straight line with gradient 1, passing through the y-axis at the
point (0, 3). Therefore, sketching it we obtain Figure 2.18.
Figure 2.18: Graph of the straight line y = x + 3.
The curve y = −3x − 2 is a straight line with gradient −3 (and hence sloping
downwards), passing through the y-axis at (0, −2). The graph of this curve is shown in
Figure 2.19.
Figure 2.19: Graph of the straight line y = −3x − 2.
Feedback to activity 2.7
The graph $y = x^2 + 4x + 3$ is a quadratic, with a positive $x^2$ term, and hence it has the
parabolic ‘U’-shape. To locate its position, we find where it crosses the axes. It crosses
the $y$-axis when $x = 0$, and hence at $(0, 3)$. To find where it crosses the $x$-axis (if at all),
we need to solve $y = 0$; that is, $x^2 + 4x + 3 = 0$. There are two ways we can do this. We
could spot that this factorises as $(x + 3)(x + 1) = 0$, so that the solutions are $x = -3$
and $x = -1$. Alternatively, we can use the formula for the solutions of a quadratic
equation, with $a = 1$, $b = 4$, $c = 3$. This gives
\[ \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = (-4 \pm \sqrt{4})/2 = -3, \; -1. \]
With this information, we can sketch the curve (see Figure 2.20).
Figure 2.20: Graph of the curve $y = x^2 + 4x + 3$.
Feedback to activity 2.8
We note first that $x$ is a factor, and so we have $x^3 + 4x^2 + 3x = x(x^2 + 4x + 3)$. To
factorise the quadratic, we can simply spot the factorisation $x^2 + 4x + 3 = (x + 1)(x + 3)$
or, alternatively, we can solve the quadratic equation $x^2 + 4x + 3 = 0$, which has
solutions $-1, -3$, meaning that $x^2 + 4x + 3 = (x - (-1))(x - (-3)) = (x + 1)(x + 3)$. It
follows that the factorisation we require is $x(x + 1)(x + 3)$.
Figure 2.21: Graph of the curves y = 2x − 3 and y = 2 − (1/2)x.
Feedback to activity 2.9
To find the intersection of the two lines we solve the equations
y = 2x − 3, y = 2 − (1/2)x simultaneously. There is more than one way to do so, but
perhaps the easiest is to write 2x − 3 = 2 − (1/2)x, from which we obtain (5/2)x = 5
and hence x = 2. The y-coordinate of the intersection can then be found from either one
of the initial equations: for example, y = 2x − 3 = 2(2) − 3 = 1. It follows that the
intersection point is (2, 1). Figure 2.21 shows the two curves. (If this were an exam
question, it would not be essential to include the sketch as part of your answer. I’m
doing so just to help you understand what’s going on.)
Feedback to activity 2.10
To find the equilibrium, we solve simultaneously the demand and supply equations; that
is, we set demand $q^D$ equal to supply $q^S$. Since $q^D = 20 - 2p$ and $q^S = (2/3)p - 4$, we
set $20 - 2p = (2/3)p - 4$ and hence $(8/3)p = 24$, and $p = 9$. To find the equilibrium
quantity, we can use either the supply or the demand equation. Using the demand
equation gives $q = 20 - 2(9) = 2$ (and of course we will get the same answer using the
supply equation). Figure 2.22 shows the demand and supply curves (which are, of
course, straight lines in this case).
Figure 2.22: Graph of the curves $q^D = 20 - 2p$ and $q^S = (2/3)p - 4$.
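The equilibrium calculation can be reproduced with exact fractions, which avoids any rounding of the coefficient 2/3. This is a sketch only; the function names `demand` and `supply` are ours.

```python
from fractions import Fraction

def demand(p):
    """Demand equation: quantity demanded is 20 - 2p."""
    return 20 - 2 * p

def supply(p):
    """Supply equation: quantity supplied is (2/3)p - 4."""
    return Fraction(2, 3) * p - 4

# Setting 20 - 2p = (2/3)p - 4 gives (8/3)p = 24, so p = 24 / (8/3).
p_star = Fraction(24) / Fraction(8, 3)
q_star = demand(p_star)

# Both equations give the same quantity at the equilibrium price.
print(p_star, q_star, supply(p_star))  # 9 2 2
```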
Feedback to activity 2.11
Note that, here, $p$ is given in terms of $q$, whereas in the example preceding this activity,
the relationships were given the other way round, by which I mean that the quantities
were expressed as functions of the prices. This is not something you should get confused
about. We can think about price as a function of quantity or quantity as a function of
price. In this problem, $q$ is treated as the independent variable and $p$ as the dependent
variable, so we will sketch $p$ against $q$, with the vertical axis being the $p$-axis and the
horizontal axis the $q$-axis. (This is in contrast to the previous question, where $q$ was the
vertical coordinate and $p$ the horizontal.)
Consider the demand curve, with equation $p = 4 - q - q^2$. This is an up-turned
‘U’-shape. It crosses the $p$-axis (when $q = 0$) at $(0, 4)$ and it crosses the $q$-axis when
$4 - q - q^2 = 0$. Solving this quadratic in the usual way, we obtain
\[ q = \frac{1 \pm \sqrt{1 - 4(-1)(4)}}{-2} = \frac{1 \pm \sqrt{17}}{-2} = -2.562, \; 1.562. \]
The supply curve, with equation $p = 1 + 4q + q^2$, crosses the $p$-axis at $(0, 1)$. It crosses
the $q$-axis when $1 + 4q + q^2 = 0$, which is when
\[ q = \frac{-4 \pm \sqrt{4^2 - 4(1)(1)}}{2} = \frac{-4 \pm \sqrt{12}}{2} = -0.268, \; -3.732. \]
Sketching both curves on the same diagram, we obtain Figure 2.23.
Figure 2.23: Graphs of the curves $p = 4 - q - q^2$ and $p = 1 + 4q + q^2$.
Feedback to activity 2.12
We solve $TC = TR$, which is $2 + 5x + x^2 = 12 + 8x$. Writing this equation in the
standard way, it becomes $x^2 - 3x - 10 = 0$. We can factorise this as $(x - 5)(x + 2) = 0$,
showing that the solutions are $5, -2$. Or, we can use the formula for the solutions of a
quadratic, with $a = 1$, $b = -3$, $c = -10$. Either way we see that there are two possible
break-even points, $x = 5$ or $x = -2$. But the second of these has no economic
significance, since it represents a negative quantity. We therefore deduce that the
break-even point is $x = 5$.
Answers to Sample examination/practice questions
Answer to question 2.1
To find the equilibrium quantity, we solve
\[ 4 - q - q^2 = 1 + 4q + q^2, \]
which is
\[ 2q^2 + 5q - 3 = 0. \]
Using the formula for the solutions of a quadratic, we have
\[ q = \frac{-5 \pm \sqrt{5^2 - 4(2)(-3)}}{4} = \frac{-5 \pm \sqrt{49}}{4} = \frac{-5 \pm 7}{4} = -3, \; \frac{1}{2}. \]
So the equilibrium quantity is the economically meaningful solution, namely $q = 1/2$.
The corresponding equilibrium price is
\[ p = 4 - (1/2) - (1/2)^2 = 4 - 1/2 - 1/4 = \frac{13}{4}. \]
Answer to question 2.2
We solve
\[ \frac{6}{q + 1} = q + 2. \]
Multiplying both sides by $q + 1$, we obtain
\[ 6 = (q + 1)(q + 2) = q^2 + 3q + 2, \]
so
\[ q^2 + 3q - 4 = 0. \]
This factorises as $(q - 1)(q + 4) = 0$ and so has solutions 1 and $-4$. Thus the
equilibrium quantity is 1, the positive solution. The equilibrium price, which can be
obtained from either one of the equations, is $p = 6/(1 + 1) = 3$. (Here, I have used the
demand equation.)
Answer to question 2.3
Consider first the demand curve. This is a negative quadratic and so has an upturned
‘U’ shape. It crosses the $p$-axis when $8 - p^2 - 2p = 0$ or, equivalently, when
$p^2 + 2p - 8 = 0$. This factorises as $(p + 4)(p - 2) = 0$, so the solutions are $p = 2, -4$.
Alternatively, we can use the formula for the solutions to a quadratic:
\[ p = \frac{-2 \pm \sqrt{2^2 - 4(1)(-8)}}{2} = \frac{-2 \pm \sqrt{36}}{2} = \frac{-2 \pm 6}{2} = -4, \; 2. \]
(No calculator needed!) The supply curve is $q = p^2 + 2p - 3 = (p + 3)(p - 1)$, which
crosses the $p$-axis at $-3$ and $1$, and has a ‘U’ shape. We notice also that the demand
curve crosses the $q$-axis at $q = 8$ and the supply curve crosses the $q$-axis at $q = -3$. The
curves therefore look as in Figure 2.24.
Figure 2.24: Graph of the curves $q = 8 - p^2 - 2p$ and $q = p^2 + 2p - 3$.
The equilibrium price is given by
\[ 8 - p^2 - 2p = p^2 + 2p - 3, \]
or $2p^2 + 4p - 11 = 0$. This has solutions
\[ p = \frac{-4 \pm \sqrt{4^2 - 4(2)(-11)}}{4} = \frac{-4 \pm \sqrt{104}}{4} = -1 \pm \frac{1}{2}\sqrt{26}. \]
We know that $\sqrt{26} > 2$, so the solution $-1 + \sqrt{26}/2$ is positive and is therefore the
equilibrium price. (The other solution is obviously negative.) (Note: in an exam, you
should leave the answer like this, since you will not have a calculator to work the
answer out as a decimal expansion.)
Chapter 3
Differentiation
Essential reading
(For full publication details, see Chapter 1.)
Anthony and Biggs (1996) Chapters 6, 7 and 8.
Further reading
Binmore and Davies (2001) Chapter 2, Sections 2.7–2.10 and Chapter 4, Sections
4.2 and 4.3.
Booth (1998) Chapter 5, Modules 19 and 20.
Bradley (2008) Chapter 6.
Dowling (2000) Chapters 3 and 4.
3.1 Introduction
In this extremely important chapter we introduce the topic of calculus, one of the most
useful and powerful techniques in applied mathematics. In this chapter we focus on the
process of ‘differentiation’ of a function. The derivative (the result of differentiation) has
numerous applications in economics and related fields. It provides a rigorous
mathematical way to measure how fast a quantity is changing, and it also gives us the
main technique for finding the maximum or minimum value of a function.
3.2 The definition and meaning of the derivative
The derivative is a measure of the instantaneous rate of change of a function[1]
$f : \mathbb{R} \to \mathbb{R}$. The idea is to compare the value of the function at $x$ with its value at $x + h$,
where $h$ is a small quantity. The change in the value of $f$ is $f(x + h) - f(x)$, and this,
when divided by the change $h$ in the ‘input’, measures the average rate of change over
the interval from $x$ to $x + h$. Informally speaking, the instantaneous rate of change is
the quantity this average rate of change approaches as $h$ gets smaller and smaller.
[1] See Anthony and Biggs (1996) Section 6.1.
A good analogy can be made with the speed of a car. Imagine that a car is driving
along a straight road and that $f(x)$ represents its distance, in metres, from the starting
point at time $x$, in seconds, from the start. (It would be more normal to use the symbol
$t$ as the variable here rather than $x$, but you will be aware from earlier that $f(x)$ and
$f(t)$ convey the same information: it does not matter which symbol is used for the
variable.) Let’s suppose that at time $x = 10$, the distance $f(10)$ from the start is 150
and that at time $x = 11$, the distance from the start is 170. Then the average speed
between times 10 and 11 is $(170 - 150)/(11 - 10) = 20$ metres per second. However, this
need not be the same as the instantaneous speed at 10, since the car may accelerate or
decelerate in the time interval from 10 to 11. Conceivably, then, the instantaneous speed
at 10 could well be higher or lower than 20. To obtain better approximations to this
instantaneous speed, we should measure average speed over smaller and smaller time
intervals. In other words, we should measure the average speed from 10 to $10 + h$ and
see what happens as $h$ gets smaller and smaller. That is, we compute the limit of
\[ \frac{f(10 + h) - f(10)}{(10 + h) - 10} = \frac{f(10 + h) - f(10)}{h}, \]
as $h$ tends to 0.
We now give the definition of the derivative. The derivative (or instantaneous rate of
change) of $f$ at a number $a$ (or ‘at the point $a$’) is the number which is the limit of
\[ \frac{f(a + h) - f(a)}{h}, \]
as $h$ tends towards 0. It is not appropriate at this level to say formally what we mean
by a limit, but the idea is quite simple: we say that $g(x)$ tends to the limit $L$ as $x$ tends
to $c$ if the distance between $g(x)$ and $L$ can be made as small as we like provided $x$ is
sufficiently close to $c$. The derivative of $f$ at $a$ is denoted $f'(a)$. (We are assuming here
that the limit exists: if it does not, then we say that the derivative does not exist at $a$.
But we do not need to worry in this subject about the existence and non-existence of
derivatives: these are matters for consideration in a more advanced course of study.)
Now, if the derivative exists at all $a$, then for each $a$ we have a derivative $f'(a)$ and we
simply call the function $f'$ the derivative of $f$. (Please don’t be confused by this
distinction between the derivative of $f$ and the derivative of $f$ at a point. For example,
suppose that for each $a$, $f'(a) = 2a$. Then the derivative of $f$ is the function $f'$ given by
$f'(x) = 2x$.)
Let’s look at an example to make sure we understand the meaning of the derivative at a
point a. (We will see soon that the type of numerical calculation we’re about to
undertake is not necessary in most cases, once we have learned some techniques for
determining derivatives.)
Example 3.1 Suppose that $f(x) = 2^x$. Let’s try to determine the derivative $f'(1)$
by working out the average rates of change
\[ \frac{f(1 + h) - f(1)}{h}, \]
for successively smaller values of $h$. (As just mentioned, we will later see an easier
way.) Table 3.1 shows some of the values of
\[ \frac{f(1 + h) - f(1)}{h} = \frac{2^{1+h} - 2}{h}. \]
h          (2^{1+h} − 2)/h
0.5        1.656854
0.1        1.435469
0.01       1.39111
0.001      1.386775
0.0001     1.386342
Table 3.1: Average rates of change for $f(x) = 2^x$.
It can be seen that as $h$ approaches 0 these numbers seem to be approaching a
number around 1.386. So we might guess that $f'(1)$ is around 1.386. (In fact, it turns
out that the exact value of $f'(1)$ is $2 \ln 2 = 1.38629436\ldots$)
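The entries of Table 3.1 can be generated with a few lines of Python. This is a minimal sketch; the helper name `average_rate` is our own choice.

```python
import math

def f(x):
    """The function of Example 3.1, f(x) = 2^x."""
    return 2 ** x

def average_rate(func, a, h):
    """Average rate of change of func over the interval from a to a + h."""
    return (func(a + h) - func(a)) / h

# Average rates of change at a = 1 for successively smaller h.
for h in [0.5, 0.1, 0.01, 0.001, 0.0001]:
    print(h, round(average_rate(f, 1, h), 6))

# The exact derivative quoted in the text, for comparison.
exact = 2 * math.log(2)
print(exact)
```

Extending the list of values of `h` is one way to attempt the activity that follows. Very small values of `h` (say below about 1e-12) start to give worse answers because of rounding error in the subtraction, which is a limitation of the computer arithmetic rather than of the mathematics.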
Activity 3.1 Calculate some more values of
\[ \frac{f(1 + h) - f(1)}{h}, \]
for even smaller values of $h$.
A geometrical interpretation of the derivative can be given. The ratio
\[ \frac{f(a + h) - f(a)}{h}, \]
is the gradient of the line joining the points $(a, f(a))$ and $(a + h, f(a + h))$. As $h$ tends
to 0, this line becomes tangent to the curve at $(a, f(a))$; that is, it just touches the
curve at that point. The derivative $f'(a)$ may therefore also be thought of as the
gradient of the tangent to the curve $y = f(x)$ at the point $(a, f(a))$.
An alternative notation for $f'(x)$ is $\frac{df}{dx}$.
Derivatives can be calculated using the definition given above, in what is known as
differentiation from first principles, but this is often cumbersome and you will not need
to do this in an examination. We give one example by way of illustration, but we
emphasise that you are not expected to carry out such calculations.
Example 3.2 Suppose $f(x) = x^2$. In order to work out the derivative we calculate
as follows:
\[ \frac{f(x + h) - f(x)}{h} = \frac{(x + h)^2 - x^2}{h} = \frac{(x^2 + 2xh + h^2) - x^2}{h} = 2x + h. \]
The first term is independent of $h$ and the second term approaches 0 as $h$
approaches 0, so the derivative is the function given by $f'(x) = 2x$.
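The simplification in Example 3.2 can be seen in exact arithmetic. In the sketch below, Python's `fractions` module is used so that no rounding occurs; the point $x = 3$ and the values of $h$ are arbitrary choices of ours.

```python
from fractions import Fraction

def difference_quotient(x, h):
    """Compute ((x + h)^2 - x^2) / h exactly."""
    return ((x + h) ** 2 - x ** 2) / h

x = Fraction(3)
for h in [Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)]:
    # The two printed quantities agree: the quotient equals 2x + h exactly.
    print(h, difference_quotient(x, h), 2 * x + h)
```

As `h` shrinks, the quotient approaches $2x = 6$, in line with $f'(x) = 2x$.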
3.3 Standard derivatives
In practice, to determine derivatives (that is, to differentiate), we have a set of standard
derivatives together with rules for combining these. The standard derivatives (which
you should memorise) are listed in Table 3.2.
f(x)       f'(x)
x^k        k x^(k−1)
e^x        e^x
ln x       1/x
sin x      cos x
cos x      −sin x
Table 3.2: Standard derivatives.
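Each row of Table 3.2 can be spot-checked numerically. The sketch below approximates derivatives with a symmetric (central) difference quotient, $(f(x+h) - f(x-h))/(2h)$, which is a slight variant of the one-sided quotient used in the definition and is chosen here only because it is more accurate for a given $h$. The power $k = 5$ and the point $x_0 = 1.3$ are arbitrary choices.

```python
import math

def central_difference(func, x, h=1e-6):
    """Approximate func'(x) by the symmetric quotient (f(x+h) - f(x-h)) / (2h)."""
    return (func(x + h) - func(x - h)) / (2 * h)

k = 5  # illustrative power for the x^k row

# Pairs (f, claimed derivative f') taken from Table 3.2.
standard_derivatives = [
    (lambda x: x ** k, lambda x: k * x ** (k - 1)),
    (math.exp,         math.exp),
    (math.log,         lambda x: 1 / x),
    (math.sin,         math.cos),
    (math.cos,         lambda x: -math.sin(x)),
]

x0 = 1.3  # any positive point would do (ln x needs x > 0)
max_error = max(abs(central_difference(func, x0) - derivative(x0))
                for func, derivative in standard_derivatives)
print(max_error)
```

The largest discrepancy is tiny, which is consistent with the table, though a numerical check at one point is not a proof.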
We mentioned in Chapter 2 that the number $e$ is very special. We can now see one
reason why. We see from above that the derivative of the exponential function $e^x$ is just
itself, that is $e^x$. This is not the case for any other exponential-type function $a^x$. For
example (as we shall see), the derivative of $2^x$ is not $2^x$, but $2^x (\ln 2)$.
The standard derivatives could also be stated in the ‘d/dx’ notation, as
$\frac{d}{dx}(e^x) = e^x$ etc.
Example 3.3 The derivative of $x^5$ is $5x^4$.
Activity 3.2 What is the derivative of $\frac{1}{x}$?
3.4 Rules for calculating derivatives
To calculate the derivatives of functions other than the standard ones just given, it is
useful to use the following rules.[2]
The sum rule: If $h(x) = f(x) + g(x)$ then
\[ h'(x) = f'(x) + g'(x). \]
The product rule: If $h(x) = f(x)g(x)$ then
\[ h'(x) = f'(x)g(x) + f(x)g'(x). \]
The quotient rule: If $h(x) = f(x)/g(x)$ and $g(x) \neq 0$ then
\[ h'(x) = \frac{f'(x)g(x) - f(x)g'(x)}{g(x)^2}. \]
[2] See Anthony and Biggs (1996) Section 6.2.
Example 3.4 Let $f(x) = x^3 e^x$. Then, by the product rule,
\[ f'(x) = (x^3)' e^x + x^3 (e^x)' = 3x^2 e^x + x^3 e^x. \]
Activity 3.3 Find the derivative of $x^2 \sin x$.
Activity 3.4 Find the derivative of $f(x) = (x^2 + 1) \ln x$.
Example 3.5 Let $f(x) = \frac{\ln x}{x}$. Then, by the quotient rule,
\[ f'(x) = \frac{(1/x)x - (1) \ln x}{x^2} = \frac{1 - \ln x}{x^2}. \]
Activity 3.5 Determine the derivative of $\sin x / x$.
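The answers to Examples 3.4 and 3.5 can be sanity-checked against a numerical approximation of the derivative. This sketch again uses a symmetric difference quotient (an approximation technique of our choosing, not part of the guide's method), and the test point $x_0 = 2$ is arbitrary.

```python
import math

def central_difference(func, x, h=1e-6):
    """Approximate func'(x) by the symmetric quotient (f(x+h) - f(x-h)) / (2h)."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Example 3.4: f(x) = x^3 e^x, with f'(x) = 3x^2 e^x + x^3 e^x by the product rule.
def f_product(x):
    return x ** 3 * math.exp(x)

def f_product_prime(x):
    return 3 * x ** 2 * math.exp(x) + x ** 3 * math.exp(x)

# Example 3.5: f(x) = ln(x)/x, with f'(x) = (1 - ln x)/x^2 by the quotient rule.
def f_quotient(x):
    return math.log(x) / x

def f_quotient_prime(x):
    return (1 - math.log(x)) / x ** 2

x0 = 2.0
product_error = abs(central_difference(f_product, x0) - f_product_prime(x0))
quotient_error = abs(central_difference(f_quotient, x0) - f_quotient_prime(x0))
print(product_error, quotient_error)
```

The same pattern can be used to check your own answers to the activities: code your candidate derivative and compare it with the numerical approximation at a few points.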
Another, very important, rule is the composite function rule,[3] or chain rule, which may
be stated as follows:
If $f(x) = s(r(x))$, then $f'(x) = s'(r(x)) r'(x)$.
If you can write a function in this way, as the composition of $s$ and $r$, then the
composite function rule will tell you the derivative.
Example 3.6 Let $f(x) = \sqrt{x^3 + 2}$. Then $f(x) = s(r(x))$ where
\[ s(x) = \sqrt{x} = x^{1/2} \quad \text{and} \quad r(x) = x^3 + 2. \]
Now, we have
\[ s'(x) = \frac{1}{2} x^{1/2 - 1} = \frac{1}{2} x^{-1/2} \quad \text{and} \quad r'(x) = 3x^2, \]
so, by the composite function rule,
\[ f'(x) = s'(r(x)) r'(x) = \frac{1}{2}(x^3 + 2)^{-1/2}(3x^2) = \frac{3x^2}{2\sqrt{x^3 + 2}}. \]
Example 3.7 Suppose $f(x) = (ax + b)^n$. Then, by the composite function rule,
\[ f'(x) = n(ax + b)^{n-1} \times (ax + b)' = an(ax + b)^{n-1}. \]
[3] See Anthony and Biggs (1996) Section 6.4.
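As with the product and quotient rules, results obtained from the composite function rule can be checked numerically. The sketch below tests Examples 3.6 and 3.7; the constants `A`, `B`, `N` and the point `x0` are hypothetical values picked for illustration, and the symmetric difference quotient is our choice of approximation.

```python
import math

def central_difference(func, x, h=1e-6):
    """Approximate func'(x) by the symmetric quotient (f(x+h) - f(x-h)) / (2h)."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Example 3.6: f(x) = sqrt(x^3 + 2), f'(x) = 3x^2 / (2 sqrt(x^3 + 2)).
def f_root(x):
    return math.sqrt(x ** 3 + 2)

def f_root_prime(x):
    return 3 * x ** 2 / (2 * math.sqrt(x ** 3 + 2))

# Example 3.7: f(x) = (ax + b)^n, f'(x) = a n (ax + b)^(n-1).
A, B, N = 2.0, 1.0, 4  # hypothetical constants, chosen only for this check

def f_power(x):
    return (A * x + B) ** N

def f_power_prime(x):
    return A * N * (A * x + B) ** (N - 1)

x0 = 1.5
root_error = abs(central_difference(f_root, x0) - f_root_prime(x0))
power_error = abs(central_difference(f_power, x0) - f_power_prime(x0))
print(root_error, power_error)
```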
Example 3.8 Suppose that $f(x) = \ln(g(x))$. Then, by the composite function rule,
\[ f'(x) = \frac{1}{g(x)} g'(x) = \frac{g'(x)}{g(x)}. \]
(This result is often useful in integration, something we discuss later.)
Activity 3.6 Find the derivative of $(3x + 7)^{15}$.
Activity 3.7 Differentiate $f(x) = \sqrt{x^2 + 1}$.
Activity 3.8 Differentiate $g(x) = \ln(x^2 + 2x + 5)$.
Differentiation can sometimes be simplified by taking logarithms, as the following
example demonstrates.
Example 3.9 We differentiate $f(x) = 2^x$ by observing that
$\ln(f(x)) = \ln(2^x) = x \ln 2$. Now, the derivative of $\ln(f(x))$ is, by the composite
function rule (chain rule), equal to $f'(x)/f(x)$, so, on differentiating both sides of
$\ln(f(x)) = x \ln 2$, we obtain
\[ \frac{f'(x)}{f(x)} = (x \ln 2)' = \ln 2, \]
and so
\[ f'(x) = (\ln 2) f(x) = (\ln 2) 2^x. \]
In particular, $f'(1) = 2 \ln 2$, as alluded to in an earlier example in this chapter.
Activity 3.9 By taking logarithms first, find the derivative $f'(x)$ when $f(x) = x^x$.
3.5 Optimisation
Critical points
The derivative is very useful for finding the maximum or minimum value of a function,
that is, for optimisation.[4] Recall that the derivative $f'(x)$ may be interpreted as a
measure of the rate of change of $f$ at $x$. It follows from this that we can tell whether a
function is increasing or decreasing at a given point, simply by working out its
derivative at that point.
[4] See Anthony and Biggs (1996) Chapter 8.
If $f'(x) > 0$, then $f$ is increasing at $x$.
If $f'(x) < 0$, then $f$ is decreasing at $x$.
At a point $c$ for which $f'(c) = 0$ the function $f$ is neither increasing nor decreasing: in
this case we say that $c$ is a critical point (or stationary point) of $f$. It must be stressed
that a function can have more than one kind of critical point. A critical point could be
a local maximum, which is a $c$ such that for all $x$ close to $c$, $f(x) \le f(c)$;
a local minimum, which is a $c$ such that for all $x$ close to $c$, $f(x) \ge f(c)$; or
an inflexion point, which is neither a local maximum nor a local minimum.
In the first of the following three figures, c = 1 is a local maximum of the function
whose graph is sketched, and in the second c = 1 is a local minimum of the function
sketched. In the third figure, c = 1 is a critical point, but not a maximum or a
minimum; in other words, it is an inflexion point.
Deciding the nature of a given critical point
We can decide the nature of a given critical point by considering what happens to the
derivative $f'$ in a region around the critical point. Suppose, for example, that $c$ is a
critical point of $f$ and that $f'$ is positive for values just less than $c$, zero at $c$, and
negative for values just greater than $c$. Then, $f$ is increasing just before $c$ and
decreasing just after $c$, so $c$ is a local maximum. Similarly, if the derivative $f'$ changed
sign from negative to positive around the point $c$ then we can deduce that $c$ is a local
minimum. At an inflexion point, the derivative would not change sign: it would be
either non-negative on each side of the critical point, or non-positive on each side. Thus
a critical point can be classified by considering the sign of $f'$ on either side of the point.
There is another way of classifying critical points. Let’s think about a local maximum
point $c$ as described above. Note that the derivative $f'$ is decreasing at $c$ (since it goes
from positive, through 0, to negative), so the derivative of the derivative $f'$ is negative
at $c$. We call the derivative of $f'$ the second derivative of $f$, and denote it by $f''(x)$ or
$\frac{d^2 f}{dx^2}$.
In other words, then, $f''(c) < 0$. It can be proved that a sufficient condition for $f$ to
have a local maximum at a critical point $c$ is $f''(c) < 0$. (By saying that this is a
‘sufficient condition’, we mean that if $f''(c) < 0$ then $c$ is a maximum; you should
understand that even if $c$ is a maximum, it need not be the case that $f''(c) < 0$.) There
is a similar condition for a minimum, in which the corresponding condition is $f''(c) > 0$.
Summarising, we have,
if $f'(a) = 0$ and $f''(a) < 0$, then $x = a$ is a local maximum of $f$;
if $f'(b) = 0$ and $f''(b) > 0$, then $x = b$ is a local minimum of $f$.
These observations together form the second-order conditions for the nature of a critical
point. If a critical point $c$ is an inflexion point, then the condition $f''(c) = 0$ must hold
(since the point is neither a local maximum nor a local minimum). However, as
mentioned above, if $f''$ is zero at a critical point then we cannot conclude that the point
is an inflexion point. For example, if $f(x) = x^6$ then $f''(0) = 0$, but $f$ does not have an
inflexion point at 0; it has a local minimum there.
Example 3.10 Let’s find the critical points of the function
\[ f(x) = 2x^3 - 9x^2 + 1, \]
and determine the natures of these points. The derivative is
\[ f'(x) = 6x^2 - 18x = 6x(x - 3). \]
The solutions to $f'(x) = 0$ are 0 and 3 and these are therefore the critical points. To
determine their nature we could examine the sign of $f'$ in the vicinity of each point,
or we could check the sign of $f''(x)$ at each. For completeness of exposition, we shall
do both here, but in practice you only need to carry out one of these tests.
First, let’s examine the sign of $f'(x)$ in the vicinity of $x = 0$. We have
$f'(x) = 6x(x - 3)$, which is positive for $x < 0$ (since it is then the product of two
negative numbers). For $x$ just greater than 0, $x > 0$ and $x - 3 < 0$, so that $f'(x) < 0$.
(Note: we are interested only in the signs of $f'(x)$ just to either side of the critical
point, in its immediate vicinity.) Thus, at $x = 0$, $f'$ changes sign from positive to
negative and hence $x = 0$ is a local maximum. Now for the other critical point.
When $x$ is just less than 3, $6x(x - 3) < 0$ and when $x > 3$, $6x(x - 3) > 0$; thus, since
$f'$ changes from negative to positive around the point, $x = 3$ is a local minimum.
Alternatively, we note that $f''(x) = 12x - 18$. Since $f''(0) < 0$, $x = 0$ is a local
maximum. Since $f''(3) > 0$, $x = 3$ is a local minimum.
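Both tests used in Example 3.10 can be mirrored in a short program. The sketch below hard-codes the function and its first two derivatives as worked out in the example; the helper names `classify_by_second_derivative` and `classify_by_sign_change`, and the small offset `delta`, are our own.

```python
def f(x):
    return 2 * x ** 3 - 9 * x ** 2 + 1

def f_prime(x):
    return 6 * x ** 2 - 18 * x       # = 6x(x - 3)

def f_double_prime(x):
    return 12 * x - 18

def classify_by_second_derivative(c):
    """Second-order test at a critical point c (a point where f'(c) = 0)."""
    second = f_double_prime(c)
    if second < 0:
        return "local maximum"
    if second > 0:
        return "local minimum"
    return "inconclusive"            # f''(c) = 0 tells us nothing by itself

def classify_by_sign_change(c, delta=1e-3):
    """Compare the sign of f' just to the left and just to the right of c."""
    left, right = f_prime(c - delta), f_prime(c + delta)
    if left > 0 and right < 0:
        return "local maximum"
    if left < 0 and right > 0:
        return "local minimum"
    return "no sign change"

for c in [0, 3]:
    print(c, f(c), classify_by_second_derivative(c), classify_by_sign_change(c))
```

Both tests agree with the example: 0 is a local maximum and 3 is a local minimum.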
Identifying local and global maxima
Now we turn to the problem of optimisation. Suppose we want to find the maximum
value of a function f (x). Such a wish only makes sense if the function has a maximum
value; in other words, it does not take unboundedly large values. This value will occur
at a local maximum point, but there may be several local maximum points. The global
maximum is where the function attains its absolute maximum value (if such a value
exists) and we can think of the local maximum points as giving the maximum value of
the function in their vicinity. It should be emphasised that not all functions will have a
global maximum. For instance, the function $f(x) = 2x^3 - 9x^2 + 1$ considered in the
example above has no global maximum because the values $f(x)$ get increasingly large,
without bound, for large positive values of $x$. Even though this function does, as we
have seen, possess a local maximum, it does not have a global maximum.
If f does indeed have a global maximum, then we can find it as follows. We proceed by
determining all the local maximum points of f , using the techniques outlined above,
and then we calculate the corresponding values f (x) and compare these to find the
largest. (Of course, if there is only one local maximum, then it is the global maximum.)
The analogous procedure is carried out if we want to find the global minimum value (if
the function has one): we find the minimum points and, among these, find which gives
the smallest value of f . These techniques are, like many other things in this subject,
best illustrated by examples.
Example 3.11 To find the maximum value of the function $f(x) = xe^{-x^2}$, we first
calculate the derivative, using the product rule to get
$$f'(x) = e^{-x^2} - (2x)xe^{-x^2} = e^{-x^2}(1 - 2x^2).$$
There are two solutions of $f'(x) = 0$, namely $x = 1/\sqrt{2}$ and $x = -1/\sqrt{2}$. (Note that
$e^{-x^2}$ is never equal to 0.) In other words, these values of x give the critical points, or
stationary points. To determine their nature we could examine the sign of $f'$ in the
vicinity of each point, or we could check the sign of $f''(x)$ at each. For completeness
of exposition, we shall do both here, but in practice you only need to use one
method.
First, let’s examine the sign of $f'(x)$ as x goes from just less than $-1/\sqrt{2}$ to just
greater than $-1/\sqrt{2}$. For $x < -1/\sqrt{2}$, $1 - 2x^2 < 0$ and so $f'(x) < 0$, while for x just
greater than $-1/\sqrt{2}$, $1 - 2x^2 > 0$ and $f'(x) > 0$. It follows that $-1/\sqrt{2}$ is a local
minimum. In a similar way, one can check (and you should do this) that for x
just less than $1/\sqrt{2}$, the derivative is positive and for x just greater than $1/\sqrt{2}$, the
derivative is negative, so that we may deduce $1/\sqrt{2}$ is a local maximum.
Alternatively, we may calculate $f''(x)$, using the product rule to get
$$f''(x) = -2xe^{-x^2}(1 - 2x^2) - 4xe^{-x^2}.$$
Now, $f''(-1/\sqrt{2}) > 0$, so this point is a local minimum, and $f''(1/\sqrt{2}) < 0$, so this
point is a local maximum.
Now, we are trying to find the maximum value of f. This is when $x = 1/\sqrt{2}$, and the
maximum value is
$$f\left(\frac{1}{\sqrt{2}}\right) = \frac{1}{\sqrt{2}}\,e^{-1/2} = \frac{1}{\sqrt{2e}}.$$
(Note: this function does indeed have a global maximum and a global minimum.
This might not be obvious, but it follows from the fact that for very large positive x
or very ‘large’ negative x, $xe^{-x^2}$ is extremely small in size.)
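As a numerical sanity check on Example 3.11, the sketch below (an illustration only; the helper names and the grid of test points are assumptions made here) evaluates the derivative around $x = 1/\sqrt{2}$ and compares the value of f there with $1/\sqrt{2e}$.

```python
import math


def f(x):
    """The function from Example 3.11: f(x) = x * exp(-x^2)."""
    return x * math.exp(-x * x)


def f_prime(x):
    """Derivative from the product rule: exp(-x^2) * (1 - 2x^2)."""
    return math.exp(-x * x) * (1 - 2 * x * x)


x_max = 1 / math.sqrt(2)

# The derivative should vanish at the critical point,
# be positive just to its left and negative just to its right.
slope_at_max = f_prime(x_max)
slope_left = f_prime(x_max - 0.01)
slope_right = f_prime(x_max + 0.01)

# The maximum value should agree with the closed form 1 / sqrt(2e).
max_value = f(x_max)
closed_form = 1 / math.sqrt(2 * math.e)

# Illustrative grid on [-5, 5]; no grid value should exceed the maximum.
grid = [k / 100 for k in range(-500, 501)]
largest_on_grid = max(f(x) for x in grid)

print(slope_left > 0, slope_right < 0)
print(max_value, closed_form, largest_on_grid)
```

The grid check is only suggestive, since a finite grid cannot prove that a maximum is global; the argument in the note above is what settles that.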
Activity 3.10 Find the critical points of $f(x) = x^3 - 6x^2 + 11x - 6$ and classify
the nature of each such point (that is, determine whether the point is a local
maximum, local minimum, or inflexion).
If we are trying to find the maximum value of a function f (x) on an interval [a, b], then
it will occur either at a or at b, or at a critical point c in between a and b. Suppose, for
instance, there was just one critical point c in the open interval (a, b), and that this was
a local maximum. To be sure that it gives the maximum value on the interval, we
should compare the value of the function at c with the values at a and b. To sum up, it
is possible, when maximising on an interval, that the maximum value is actually at an
end-point of the interval, and we should check whether this is so. (The same argument
applies to minimising.)
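The end-point check described above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the helper `max_on_interval` and the intervals $[-1, 5]$ and $[-1, 4]$ are chosen here for demonstration, while the function and its critical points 0 and 3 come from the earlier example.

```python
def max_on_interval(f, critical_points, a, b):
    """Return (x, f(x)) where f is largest among the end-points a, b
    and the supplied critical points lying strictly between them."""
    candidates = [a, b] + [c for c in critical_points if a < c < b]
    best = max(candidates, key=f)
    return best, f(best)


def f(x):
    return 2 * x**3 - 9 * x**2 + 1


# On [-1, 5] the largest value is at the end-point x = 5, not at the
# local maximum x = 0.
print(max_on_interval(f, [0, 3], -1, 5))

# On [-1, 4] the local maximum at x = 0 does give the largest value.
print(max_on_interval(f, [0, 3], -1, 4))
```

The first call shows why the end-points cannot be skipped: the local maximum value f(0) = 1 is smaller than f(5) = 26.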
3.6 Curve sketching
Another useful application of differentiation is in curve sketching. The aim in sketching
the curve described by an equation y = f (x) is to indicate the behaviour of the curve
and the coordinates of key points. Curve sketching is a very different business from
simply plotting a few points and joining them up: there’s no room in this subject for
such unsophisticated methods, and such ‘plotting’ is an inadequate substitute for
proper curve sketching!
Given the equation y = f (x) of a curve we wish to sketch, we have to determine key
information about the curve. The main questions we should ask are as follows. Where
does the curve cross the x-axis (if at all)? Where does it cross the y-axis? Where are the
critical points (or stationary points, if you prefer that name)? What are the natures of
the critical points? What is the behaviour of the curve for large positive values of x and
‘large’ negative values of x (where ‘large’ means large in absolute value)?
To outline a general technique, we take these in turn.
Where it crosses the x-axis: The x-axis has equation y = 0 and the curve has
equation y = f (x), so the curve crosses the x-axis at the points (x, 0) for which
f (x) = 0. Thus we solve the equation f (x) = 0. This may have many solutions or
none at all. (For instance, if $f(x) = \sin x$ there are infinitely many solutions,
whereas if $f(x) = x^2 + 1$ there are none.)
Where it crosses the y-axis: The y-axis has equation x = 0 and the curve has
equation y = f (x), so the curve crosses the y-axis at the single point (0, f (0)).
Finding the critical points: We’ve seen how to do this already. We solve the
equation $f'(x) = 0$.
The natures of the critical points: This means determining whether each one
is a local maximum, local minimum, or inflexion point, and the methods for doing
this have been discussed earlier in this chapter.
Limiting behaviour: We have to determine what happens to f (x) as x tends to
infinity and as x tends to minus infinity; in other words, we have to ask how f (x)
behaves for x far to the right on the x-axis and for x far to the left on the negative
side of the axis.
As far as the last point is concerned, there are two standard results here which are
useful.
First, the behaviour of a polynomial function is determined solely by its leading term,
the one with the highest power of x. This term dominates for x of large absolute value.
A useful observation is that if n is even, then
$x^n \to \infty$ as $x \to \infty$ and also as $x \to -\infty$,
while if n is odd,
$x^n \to \infty$ as $x \to \infty$ and $x^n \to -\infty$ as $x \to -\infty$.
(To say, for example, that f (x) → ∞ as x → ∞ means that the values of f (x) are, for x
large enough, greater than any value we want. For example, it means that there is some
number X such that for all x > X, f (x) > 1000000; and that, for some value Y , we
have f (x) > 100000000 for all x > Y , and so on. In words, we say that ‘f (x) tends to
infinity as x tends to infinity’.) Thus, for example, if $f(x) = -x^3 + 5x^2 - 7x + 2$, then
we examine the leading term, $-x^3$. As $x \to \infty$, this tends to $-\infty$ and as $x \to -\infty$ it
tends to $\infty$. So this is the behaviour of f.
Secondly, whenever we have a function which is the product of an exponential and a
power, the exponential dominates. Thus, for example, $x^2 e^{-x} \to 0$ as $x \to \infty$ (even
though $x^2 \to \infty$).
Example 3.12 Let’s do a really easy example. Consider the quadratic function
$f(x) = 2x^2 - 7x + 5$. We already know a lot about sketching such curves (from the
previous chapter), but let’s apply the scheme suggested above. This curve crosses the
x-axis when $2x^2 - 7x + 5 = 0$. The solutions to this equation (which can be found by
using the formula or by factorising) are $x = 1$ and $x = 5/2$. The curve crosses the
y-axis at (0, 5). The derivative is $f'(x) = 4x - 7$, so there is a critical point at
$x = 7/4$. The second derivative is $f''(x) = 4$, which is positive, so this critical point
is a minimum. The value of f at the critical point is $f(7/4) = -9/8$. As $x \to \infty$,
$f(x) \to \infty$ and as $x \to -\infty$, $f(x) \to \infty$. From this it follows that the graph of f is as
in Figure 3.1.
Figure 3.1: Graph of the quadratic function $f(x) = 2x^2 - 7x + 5$.
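The key facts gathered in Example 3.12 can be recomputed with a short Python sketch. The variable names are illustrative assumptions; the arithmetic follows the quadratic formula and the derivative $f'(x) = 4x - 7$ used in the example.

```python
import math


def f(x):
    return 2 * x**2 - 7 * x + 5


a, b, c = 2, -7, 5

# x-axis crossings from the quadratic formula (discriminant is 49 - 40 = 9).
disc = b * b - 4 * a * c
x_crossings = sorted([(-b - math.sqrt(disc)) / (2 * a),
                      (-b + math.sqrt(disc)) / (2 * a)])

# y-axis crossing is (0, f(0)).
y_crossing = f(0)

# The critical point solves f'(x) = 4x - 7 = 0, that is x = -b / (2a).
critical_x = -b / (2 * a)
critical_value = f(critical_x)

print(x_crossings, y_crossing, critical_x, critical_value)
```

These five numbers, together with the limiting behaviour, are all that is needed to place the parabola.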
Example 3.13 We considered the function $f(x) = 2x^3 - 9x^2 + 1$ earlier. We saw
that it has a local maximum at $x = 0$ and a local minimum at $x = 3$. The
corresponding values of f(x) are $f(0) = 1$ and $f(3) = -26$. The curve crosses the
y-axis when $y = f(0) = 1$. It crosses the x-axis when $2x^3 - 9x^2 + 1 = 0$. Now, this is
not an easy equation to solve! However, we can get some idea of the points where it
crosses the x-axis by considering the shape of the curve. Note that as $x \to \infty$,
$f(x) \to \infty$ and as $x \to -\infty$, $f(x) \to -\infty$. Also, we have $f(0) > 0$ and $f(3) < 0$.
These observations imply that the graph must cross the x-axis somewhere to the left
of 0 (since it must move from negative y-values to a positive y-value), it must cross
again somewhere between 0 and 3 (since $f(0) > 0$ and $f(3) < 0$) and it must cross
again at some point greater than 3 (because $f(x) \to \infty$ as $x \to \infty$ and hence must
be positive from some point).
We therefore have the following sketch (in which I have shown the correct x-axis
crossings):
Activity 3.11 Sketch the graph of $f(x) = x^3 - 6x^2 + 11x - 6$. (Note: this is the
function considered in Activity 3.10.)
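The sign-change reasoning of Example 3.13 can be turned into a numerical search for the three x-axis crossings of $2x^3 - 9x^2 + 1$. The sketch below uses bisection, a technique the guide does not discuss; the bracketing intervals $[-1, 0]$, $[0, 3]$ and $[3, 5]$ are assumptions chosen here because f changes sign on each of them.

```python
def f(x):
    return 2 * x**3 - 9 * x**2 + 1


def bisect(f, lo, hi, tol=1e-10):
    """Locate a root of f in [lo, hi], assuming f(lo) and f(hi)
    have opposite signs, by repeatedly halving the interval."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("f must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid              # the sign change is in the left half
        else:
            lo, f_lo = mid, f_mid  # the sign change is in the right half
    return (lo + hi) / 2


# Brackets suggested by f(-1) < 0 < f(0), f(0) > 0 > f(3), f(3) < 0 < f(5).
brackets = [(-1, 0), (0, 3), (3, 5)]
roots = [bisect(f, lo, hi) for lo, hi in brackets]
print(roots)
```

The same approach could be tried on the cubic in Activity 3.11, although there the crossings can be found exactly by factorising.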
3.7 Marginals
We now turn our attention to economic applications of the derivative. In this section we
consider ‘marginals’ and in the next section we approach the problem of profit
maximisation using the derivative. Suppose that a firm manufactures chocolate bars
and knows that in order to produce q chocolate bars it will have to pay out C(q) dollars
in wages, materials, overheads and so on. We say that C is the firm’s Cost function.
This is often called the Total Cost, and we shall often use the corresponding notation
TC. The cost TC(0) of producing no units (which is generally positive since a firm has
certain costs in merely existing) is called the Fixed Cost, sometimes denoted FC. The
difference between the cost and the fixed cost is known as the Variable Cost, VC. Other
important measures are the Average Cost, defined by AC = TC/q, and the Average
Variable Cost AVC = VC/q. We have the relationship
$$TC = FC + VC.$$
An increase in production by one chocolate bar is relatively small, and may be
described as ‘marginal’. The corresponding increase in total cost is $TC(q + 1) - TC(q)$.
Now, we know that the derivative of a function f at a point a is the limit of
$$\frac{f(a + h) - f(a)}{h}$$
as h tends to 0. This means that, if h is small, then this quantity is approximately equal
to $f'(a)$. Hence, for small h,
$$f(a + h) - f(a) \simeq hf'(a),$$
where ‘$\simeq$’ means ‘is approximately equal to’.
If the production level q is large, so that 1 unit is small compared with q, then we may
take f to be the total cost function TC(q) and take h = 1 to see that the cost incurred
in producing one extra item, namely $TC(q + 1) - TC(q)$, is given approximately by
$$TC(q + 1) - TC(q) \simeq 1 \times (TC)'(q) = (TC)'(q).$$
It is for this reason that we define the Marginal Cost function to be the derivative of the
total cost function TC. This marginal cost is often denoted MC. The marginal cost
should not be confused with the average cost, which is given by AC(q) = TC(q)/q. In
general, the marginal cost and the average cost are different.
In the traditional language of economics, the derivative of a function F is often referred
to as the marginal of F. For example, if TR(q) is the total revenue function, which
describes the total revenue the firm makes when selling q items, then its derivative is
called the Marginal Revenue, denoted MR.
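To see how closely the marginal cost approximates the cost of one extra unit, the following Python sketch compares the two for an illustrative quadratic cost function. The particular function and the sample output levels are assumptions made for this illustration.

```python
def total_cost(q):
    # Hypothetical cost function, used only for illustration.
    return 1000 + 2 * q + 0.06 * q**2


def marginal_cost(q):
    # Derivative of total_cost with respect to q.
    return 2 + 0.12 * q


relative_gaps = []
for q in (10, 100, 1000):
    extra = total_cost(q + 1) - total_cost(q)  # cost of one more unit
    mc = marginal_cost(q)                      # derivative at q
    relative_gaps.append(abs(extra - mc) / mc)
    print(q, extra, mc)

print(relative_gaps)
```

For this cost function the two quantities always differ by 0.06, so the relative gap shrinks as q grows, which is the sense in which the approximation is good when 1 unit is small compared with q.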
Activity 3.12 A firm has total cost function
$$TC(q) = 50000 + 25q + 0.001q^2.$$
Find the fixed cost FC, and the marginal cost MC. What is the marginal cost when
the output is 100? What is the marginal cost when the output is 10000?
3.8 Profit maximisation
The revenue or total revenue, TR, a firm makes is simply the amount of money
generated by selling the good it manufactures. In general, this is the price of the good
times the number of units sold. To calculate it for a specific firm, we need more
information (as we shall see below). We have denoted the total cost function by TC. In
this notation, the profit, $\Pi$, is the total revenue minus the total cost,
$$\Pi = TR - TC.$$
If we consider revenue and cost as functions of q, then $\Pi = \Pi(q)$ is given as a function
of q by
$$\Pi(q) = TR(q) - TC(q).$$
To find which value, or values, of q give a maximum profit, we look for critical points of
$\Pi$ by solving $\Pi'(q) = 0$. But, since
$$\Pi'(q) = (TR)'(q) - (TC)'(q),$$
this means that the optimal value of q satisfies $(TR)'(q) = (TC)'(q)$. In other words,
at a profit-maximising output, marginal revenue equals marginal cost.5
5 See Anthony and Biggs (1996) Sections 8.1, 9.2 and 9.3 for a general discussion.
The firm is said to be a monopoly if it is the only supplier of the good it manufactures.
This means that if the firm manufactures q units of its good, then the selling price at
equilibrium is given by the inverse demand function, $p = p^D(q)$. Why is this? Well, the
inverse demand function $p^D(q)$ tells us what price the consumers will be willing to pay
to buy a total of q units of the good. But if the firm is a monopoly and it produces q
units, then it is only these q units that are on the market. In other words, the ‘q’ in the
inverse demand function (the total amount of the good on the market) is the same as
the ‘q’ that the firm produces. So the selling price is defined by the inverse demand
function, as a function of the production level q of the firm. The revenue is then given,
as a function of q, by $TR(q) = qp^D(q)$.
Example 3.14 A monopoly has cost function $TC(q) = 1000 + 2q + 0.06q^2$ and its
demand curve has equation $q + 10p = 500$. What value of q maximises the profit?
To answer this, we first have to determine the revenue as a function of q. Since the
firm is a monopoly, we know that $TR(q) = qp^D(q)$. From the equation for the
demand curve, $q + 10p = 500$, we obtain $p = 50 - 0.1q$, so $p^D(q) = 50 - 0.1q$
and $TR(q) = q(50 - 0.1q)$.
The profit is therefore given by
$$\Pi(q) = TR(q) - TC(q) = q(50 - 0.1q) - (1000 + 2q + 0.06q^2) = 48q - 0.16q^2 - 1000.$$
The equation $\Pi'(q) = 0$ is $48 - 0.32q = 0$, which has solution $q = 150$. To verify that
this does indeed give a maximum profit, we note that $\Pi''(q) = -0.32 < 0$.
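The calculation in Example 3.14 can be reproduced numerically. The Python sketch below is an illustration only: the function names are assumptions, and the range of integer outputs scanned at the end is an arbitrary choice used to confirm that no other output does better.

```python
def inverse_demand(q):
    # From q + 10p = 500.
    return 50 - 0.1 * q


def total_revenue(q):
    return q * inverse_demand(q)


def total_cost(q):
    return 1000 + 2 * q + 0.06 * q**2


def profit(q):
    return total_revenue(q) - total_cost(q)


# Solve the first-order condition 48 - 0.32q = 0.
q_star = 48 / 0.32
max_profit = profit(q_star)

# Marginal revenue and marginal cost, differentiated by hand.
marginal_revenue = 50 - 0.2 * q_star
marginal_cost = 2 + 0.12 * q_star

# Scan integer outputs 0..500 as a rough check on the maximum.
best_on_scan = max(range(0, 501), key=profit)

print(q_star, max_profit, marginal_revenue, marginal_cost, best_on_scan)
```

At the optimal output the marginal revenue and marginal cost coincide, as the general argument at the start of this section says they must.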
Consider this last example a little further. The profit function $\Pi(q)$ is a quadratic
function with a negative $q^2$ term, so we well know what it looks like: its graph will be as
follows.
Notice that the profit is negative to start with (because there is no revenue when
nothing is produced, but there is a cost of producing nothing, namely the fixed cost of
1000). As production is increased, profit starts to rise, and becomes positive. The point
at which profit just starts to become positive (that is, where it first equals 0), is called
the breakeven point. So, the breakeven point is the smallest positive value of q such
that Π(q) = 0. When the firm is producing the breakeven quantity, it is breaking even
in the sense that its revenue matches its costs. In this specific example, we can calculate
the breakeven point by solving the equation
$$-0.16q^2 + 48q - 1000 = 0.$$
This has the solutions 22.524 and 277.475. It is clearly the first of these that we want,
as the second, higher, value of q is where the profit, having increased to a maximum and
then decreased, becomes 0 again. So the breakeven point is 22.524.
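The breakeven calculation amounts to applying the quadratic formula and keeping the smaller positive root. A minimal Python sketch (variable names are assumptions made here) is:

```python
import math

# Coefficients of the profit function -0.16 q^2 + 48 q - 1000.
a, b, c = -0.16, 48.0, -1000.0

disc = b * b - 4 * a * c  # 2304 - 640 = 1664
roots = sorted([(-b + math.sqrt(disc)) / (2 * a),
                (-b - math.sqrt(disc)) / (2 * a)])

# The breakeven point is the smallest positive output with zero profit.
breakeven = min(r for r in roots if r > 0)

print(roots, breakeven)
```

The two roots lie symmetrically about the profit-maximising output q = 150, as expected for a quadratic.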
Activity 3.13 A firm is a monopoly with cost function
$$TC(q) = q + 0.02q^2.$$
The demand equation for its product is $q + 20p = 300$. Work out (a) the inverse
demand function; (b) the profit function; (c) the optimal value $q_m$ and the maximum
profit; (d) the corresponding price.
Learning outcomes
At the end of this chapter and the relevant reading, you should be able to:
explain what is meant by the derivative
state the standard derivatives
calculate derivatives using sum, product, quotient, and composite function (chain)
rules
calculate derivatives by taking logarithms
establish the nature of the critical/stationary points of a function
use the derivative to help sketch functions
explain and use the terminology surrounding ‘marginals’ in economics, and be able
to find fixed costs and marginal costs, given a total cost function
explain what is meant by the breakeven point and be able to determine this
make use of the derivative in order to minimise or maximise functions, including
profit functions
You do not need to be able to differentiate from first principles (that is, by using the
formal definition of the derivative).
Sample examination/practice questions
Question 3.1
Differentiate $y = (1 + 2x - e^x)$ and find when the derivative is zero.
Question 3.2
Differentiate the following functions.
(a) $y = x^3 + \exp(3x^2)$.
(b) $y = \dfrac{3x + 5}{x^2 + 3x + 2}$.
(c) $y = \ln(3x^2) + 3x + \dfrac{1}{\sqrt{1 + x}}$.
Question 3.3
Assume that the price/demand relationship for a particular good is given by
p = 10 − 0.005q
where p is the price ($) per unit and q is the demand per unit of time. Also assume that
the fixed costs are $100 and the average variable cost per unit is 4 + 0.01q.
(a) What is the maximum profit obtainable from this product?
(b) What level of production is required to break even?
(c) What are the marginal cost and marginal revenue functions?
Question 3.4
The demand function relating price p and quantity x, for a particular product, is given
by
p = 5 exp(−x/2).
Find the amount of production, x, which will maximise revenue from selling the good,
and state the value of the resulting revenue. Produce a rough sketch graph of the
marginal revenue function for 0 ≤ x ≤ 6.
Question 3.5
Suppose you have a nineteenth-century painting currently worth $2000, and that its
value will increase steadily at $500 per year, so that the amount realised by selling the
painting after t years will be 2000 + 500t. An economic model shows that the optimum
time to sell is the value of t for which the function
$$P(t) = (2000 + 500t)e^{-0.1t}$$
is maximised. Given this, find the optimum time to sell, and verify that it is optimal.
Question 3.6
A monopolist’s average cost function is given by
$$AC = 10 + \frac{20}{Q} + Q.$$
Her demand equation is
$$P + 2Q = 20,$$
where P and Q are price and quantity, respectively. Find expressions for the total
revenue and for the profit, as functions of Q. Determine the value of Q which maximises
the total revenue. Determine also the value of Q maximising profit.
Question 3.7
A firm’s average cost function is given by
$$\frac{300}{Q} - 10 + Q,$$
and the demand function is given by $Q + 5P = 850$, where P and Q are price and
quantity, respectively.
Supposing that the firm is a monopoly, find expressions for the total revenue and for the
profit, as functions of Q.
Determine the value of Q which maximises the total revenue and the value of Q which
maximises profit.
Answers to activities
Feedback to activity 3.1
Taking h = 0.00001, for example,
$$\frac{2^{1+h} - 2}{h}$$
is 1.386299 and for h = 0.000001 it is 1.386295. These are even closer to the true value
$2 \ln 2 = 1.38629436\ldots$.
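The convergence of the difference quotient towards $2 \ln 2$ can be observed directly. In the Python sketch below, the list of step sizes is an assumption chosen for illustration; very much smaller steps would eventually suffer from floating-point rounding.

```python
import math


def difference_quotient(h):
    """(2^(1+h) - 2) / h, the quotient from activity 3.1."""
    return (2 ** (1 + h) - 2) / h


exact = 2 * math.log(2)
steps = [1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6]
errors = [abs(difference_quotient(h) - exact) for h in steps]

for h, err in zip(steps, errors):
    print(h, difference_quotient(h), err)
```

Each tenfold reduction in h reduces the error by roughly a factor of ten over this range.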
Feedback to activity 3.2
The function $1/x$ can be written as $x^{-1}$. Its derivative is therefore
$$(-1)x^{-1-1} = -x^{-2} = -1/x^2.$$
Feedback to activity 3.3
By the product rule,
$$(x^2 \sin x)' = 2x \sin x + x^2 \cos x.$$
Feedback to activity 3.4
By the product rule,
$$((x^2 + 1)\ln x)' = 2x \ln x + (x^2 + 1)\frac{1}{x} = 2x \ln x + x + \frac{1}{x}.$$
Feedback to activity 3.5
By the quotient rule,
$$\left(\frac{\sin x}{x}\right)' = \frac{\cos x\,(x) - (\sin x)(1)}{x^2} = \frac{x \cos x - \sin x}{x^2}.$$
Feedback to activity 3.6
By the chain rule (composite function rule),
$$\left((3x + 7)^{15}\right)' = 15(3x + 7)^{14}(3) = 45(3x + 7)^{14}.$$
Feedback to activity 3.7
$\sqrt{x^2 + 1} = (x^2 + 1)^{1/2}$. By the chain rule,
$$\left((x^2 + 1)^{1/2}\right)' = \frac{1}{2}(x^2 + 1)^{-1/2}(2x) = \frac{x}{\sqrt{x^2 + 1}}.$$
Feedback to activity 3.8
By the chain rule,
$$\left(\ln(x^2 + 2x + 5)\right)' = \frac{1}{x^2 + 2x + 5}\,(x^2 + 2x + 5)' = \frac{2x + 2}{x^2 + 2x + 5}.$$
Feedback to activity 3.9
If $f(x) = x^x$ then $\ln f(x) = x \ln x$ and so
$$\frac{f'(x)}{f(x)} = x\,\frac{1}{x} + (1)\ln x = 1 + \ln x,$$
from which we obtain
$$f'(x) = (1 + \ln x)f(x) = (1 + \ln x)x^x.$$
Feedback to activity 3.10
The derivative of $f(x) = x^3 - 6x^2 + 11x - 6$ is $f'(x) = 3x^2 - 12x + 11$. The stationary
points, the solutions to $3x^2 - 12x + 11 = 0$, are $(12 - \sqrt{12})/6$ and $(12 + \sqrt{12})/6$. The
second derivative is $6x - 12$, which is negative at the first stationary point and positive
at the second. We therefore have a local maximum at $(12 - \sqrt{12})/6$ and a local
minimum at $(12 + \sqrt{12})/6$. The corresponding values of f are $2\sqrt{3}/9$ and $-2\sqrt{3}/9$.
(Approximately, $2\sqrt{3}/9$ is 0.3849.)
Feedback to activity 3.11
To sketch $f(x) = x^3 - 6x^2 + 11x - 6$, we first note from the previous activity that f has
a local maximum at $(12 - \sqrt{12})/6$ and a local minimum at $(12 + \sqrt{12})/6$. The
corresponding values of f are $2\sqrt{3}/9$ and $-2\sqrt{3}/9$. (Approximately, $2\sqrt{3}/9$ is 0.3849.)
As x tends to infinity, so does f(x) and as x tends to $-\infty$, $f(x) \to -\infty$. The curve
crosses the y-axis at (0, −6). To find where it crosses the x-axis, we have to solve
$x^3 - 6x^2 + 11x - 6 = 0$. One solution to this is easily found (by guesswork) to be x = 1,
and so (x − 1) is a factor. Therefore, for some numbers a, b, c,
$$x^3 - 6x^2 + 11x - 6 = (x - 1)(ax^2 + bx + c).$$
Straight away, we see that a = 1 and c = 6. To find b, we could notice that the coefficient
of $x^2$ on the right-hand side is −a + b, which should be −6, so that −1 + b = −6 and b = −5. Hence
$$x^3 - 6x^2 + 11x - 6 = (x - 1)(x^2 - 5x + 6) = (x - 1)(x - 2)(x - 3)$$
and we see there are three solutions: x = 1, 2, 3. Piecing all this information together,
we can sketch the curve as follows:
Feedback to activity 3.12
The total cost function is $50000 + 25q + 0.001q^2$. The fixed cost is obtained by setting
q = 0, giving FC = 50000. The marginal cost is MC = 25 + 0.002q. The marginal cost
is $25.2 if the output is 100, but it rises to $45 if the output is 10000.
Feedback to activity 3.13
(a) The inverse demand function is
$$p^D(q) = \frac{300 - q}{20} = 15 - 0.05q.$$
(b) The profit function is
$$\Pi(q) = qp^D(q) - TC(q) = q(15 - 0.05q) - (q + 0.02q^2) = 14q - 0.07q^2.$$
(c) We have $\Pi'(q) = 14 - 0.14q$, so q = 100 is a critical point. The second derivative of $\Pi$
is $\Pi''(q) = -0.14$, which is negative, so the critical point is a local maximum. The value
of the profit there is $\Pi(100) = 1400 - 700 = 700$, whereas $\Pi(0) = 0$ and $\Pi(200) = 0$.
Since the maximum profit in the interval [0, 200] must be either at a local maximum or
an end-point, it follows that the maximum profit is 700, obtained when q = 100.
(d) The price when q = 100 is $p^D(100) = 15 - (0.05)(100) = 10$.
Answers to Sample examination/practice questions
Answer to question 3.1
The derivative is $\frac{dy}{dx} = 2 - e^x$, and this is equal to 0 when $e^x = 2$; that is, when $x = \ln 2$.
Answer to question 3.2
For the first, we have, using the composite function rule on the exponential term,
$$\frac{dy}{dx} = 3x^2 + 6x\exp(3x^2).$$
For the second, using the quotient rule,
$$\frac{dy}{dx} = \frac{3(x^2 + 3x + 2) - (2x + 3)(3x + 5)}{(x^2 + 3x + 2)^2} = \frac{-3x^2 - 10x - 9}{(x^2 + 3x + 2)^2}.$$
Lastly,
$$\frac{d}{dx}\left(\ln(3x^2) + 3x + \frac{1}{\sqrt{1 + x}}\right) = \frac{d}{dx}\left(\ln(3x^2) + 3x + (1 + x)^{-1/2}\right) = \frac{6x}{3x^2} + 3 - \frac{1}{2}(1 + x)^{-3/2} = \frac{2}{x} + 3 - \frac{1}{2(1 + x)^{3/2}}.$$
Answer to question 3.3
(a) The average variable cost is $AVC = 4 + 0.01q$, so the variable cost is $VC = 4q + 0.01q^2$.
So the total cost function is $TC = 4q + 0.01q^2 + FC$, where FC is the fixed cost; that
is, $TC = 4q + 0.01q^2 + 100$. Given that the firm is a monopoly, the revenue is
$TR = pq = (10 - 0.005q)q = 10q - 0.005q^2$. The profit function is
$$\Pi = TR - TC = 10q - 0.005q^2 - (4q + 0.01q^2 + 100) = 6q - 0.015q^2 - 100.$$
To find the maximum, we solve $\Pi' = 0$, which is $6 - 0.03q = 0$, so q = 200. To check it
gives a maximum, we check that the second derivative is negative, which is true since
$\Pi'' = -0.03$.
(b) The breakeven point is the (least) value of q for which $\Pi(q) = 0$. So we solve
$6q - 0.015q^2 - 100 = 0$. We can rewrite this as
$$0.015q^2 - 6q + 100 = 0,$$
which has solutions
$$\frac{6 \pm \sqrt{6^2 - 4(0.015)(100)}}{2(0.015)} = \frac{6 \pm \sqrt{36 - 6}}{0.03} = \frac{100}{3}\left(6 \pm \sqrt{30}\right).$$
Since $\sqrt{30} < 6$ (because $30 < 6^2 = 36$), there are two positive solutions. The breakeven
point is the smaller of these, which is $100(6 - \sqrt{30})/3$.
(c) The marginal cost and marginal revenue functions are the derivatives of the total
cost and total revenue. Thus,
$$MC = \frac{d}{dq}\left(4q + 0.01q^2 + 100\right) = 4 + 0.02q$$
and
$$MR = \frac{d}{dq}\left(10q - 0.005q^2\right) = 10 - 0.01q.$$
Answer to question 3.4
The total revenue obtained from the sale of the good is simply (price times quantity)
$TR = 5xe^{-x/2}$. To find the maximum, we differentiate:
$$(TR)' = 5e^{-x/2} + 5x\left(-\frac{1}{2}\right)e^{-x/2} = \frac{5}{2}e^{-x/2}(2 - x),$$
and this is zero only if x = 2. We can see that the derivative changes sign from positive
to negative on passing through x = 2, and so this is a local maximum. The value of the
revenue there is $5(2)e^{-2/2} = 10/e$. When x = 0, TR = 0 and there are no other solutions to
TR = 0, so the graph of TR will not cross the x-axis again. Sketching the curve between 0
and 6, we obtain the following.
Answer to question 3.5
By routine application of the rules for differentiation we get
$$P'(t) = 500e^{-0.1t} + (2000 + 500t)(-0.1)e^{-0.1t} = e^{-0.1t}(300 - 50t).$$
Since this is zero when t = 6, that is a critical point of P. Differentiating again we get
$$P''(t) = (-0.1)e^{-0.1t}(300 - 50t) + e^{-0.1t}(-50) = e^{-0.1t}(5t - 80).$$
It follows that $P''(6) < 0$, so the critical point t = 6 is a local maximum.
The fact that t = 6 is indeed the maximum in $[0, \infty)$ can be verified by common-sense
arguments. We know that t = 6 is the only critical point of P(t), and that it is a local
maximum. It follows that P(t) must decrease steadily for t > 6, because if at any stage
it started to increase again, it would have to pass through a critical point first.
(Alternatively, to see that this local maximum is a global maximum, it could be noted
that $P(t) \to 0$ as $t \to \infty$. This is because the exponential part $e^{-0.1t}$ tends to 0 and
exponentials ‘dominate’ polynomials, so that even if we multiply this by (2000 + 500t),
the result still tends to 0.)
Answer to question 3.6
First, since the total cost is Q times the average cost, we have
$$TC = Q\left(10 + \frac{20}{Q} + Q\right) = 10Q + 20 + Q^2.$$
The monopolist’s demand equation is $P + 2Q = 20$, so when the quantity produced is
Q, the selling price will be $P = 20 - 2Q$ and hence the total revenue will be
$$TR = QP = Q(20 - 2Q) = 20Q - 2Q^2.$$
So the profit function is
$$\Pi(Q) = 20Q - 2Q^2 - (10Q + 20 + Q^2) = 10Q - 3Q^2 - 20.$$
To maximise TR, we set $(TR)' = 0$, which is $20 - 4Q = 0$, giving Q = 5. This does
indeed maximise revenue because $(TR)'' = -4 < 0$. For the profit function, we have
$\Pi'(Q) = 10 - 6Q$, and this is 0 when Q = 5/3. Again, this gives a maximum because
$\Pi''(Q) = -6 < 0$.
Answer to question 3.7
The total cost is
$$TC = Q(AC) = Q\left(\frac{300}{Q} - 10 + Q\right) = 300 - 10Q + Q^2.$$
The inverse demand function is given by $P = 170 - Q/5$, so the total revenue is
$$TR = \left(170 - \frac{Q}{5}\right)Q = 170Q - 0.2Q^2.$$
The derivative $(TR)'$ is $170 - 0.4Q$, which is 0 when Q = 425, this giving a maximum
because $(TR)'' = -0.4 < 0$. The profit function is
$$\Pi(Q) = TR - TC = 170Q - 0.2Q^2 - (300 - 10Q + Q^2) = 180Q - 1.2Q^2 - 300.$$
To maximise the profit, we set $\Pi' = 0$, which is $180 - 2.4Q = 0$, so Q = 75. This is a
maximum because $\Pi''(Q) = -2.4 < 0$.
Chapter 4
Integration
Essential reading
(For full publication details, see Chapter 1.)
Anthony and Biggs (1996) Chapters 25 and 26.
Further reading
Binmore and Davies (2001) Chapter 10, Sections 10.2–10.10.
Booth (1998) Chapter 6.
Bradley (2008) Modules 22–24.
Dowling (2000) Chapters 14 and 15.
4.1 Introduction
The next topic in calculus is integration. This is perhaps one of the most difficult topics
in this subject, and I encourage you to practise on lots of examples.
4.2 Integration
Integration is, in essence, the reverse process to differentiation and has a number of
applications in economics and related subjects. (It is also essential for solving
differential equations, but this topic is not part of this subject.) We start with indefinite
integration.1
1 See Anthony and Biggs (1996) Section 25.3.
Suppose the function f is given, and the function F is such that $F'(x) = f(x)$. Then we
say that F is an anti-derivative of f. (Sometimes the word primitive is used instead of
‘anti-derivative’.) For example, $x^4/4$ is an anti-derivative of $x^3$, and so is $x^4/4 + 5$. Any
two anti-derivatives of a given function f differ only by a constant. The general form of
the anti-derivative of f is called the indefinite integral of f(x), and denoted by
$$\int f(x)\,dx.$$
Often we call it simply the integral of f . It is of the form F (x) + c, where F is any
particular anti-derivative of f and c denotes any constant, known as a constant of
integration. Thus, for example, we write
$$\int x^3\,dx = \frac{x^4}{4} + c.$$
The process of finding the indefinite integral of f is usually known as integrating f , and
f is known as the integrand.
Sometimes the variable of integration will be x; other times it will be some other
symbol, but this makes no real difference. For instance,
$$\int x^3\,dx = \frac{x^4}{4} + c$$
and
$$\int t^3\,dt = \frac{t^4}{4} + c.$$
Note, however, that if we are to integrate a function of t, then the integral must contain
a dt and if we are to integrate a function of x, then the integral must contain a dx.
Just as for differentiation, we shall have a list of standard integrals 2 and some rules for
combining these. The main standard integrals (which you should memorise) are listed in
Table 4.1.
$f(x)$                  $\int f(x)\,dx$
$x^n$ ($n \neq -1$)     $\dfrac{x^{n+1}}{n+1} + c$
$1/x$                   $\ln |x| + c$
$e^x$                   $e^x + c$
$\sin x$                $-\cos x + c$
$\cos x$                $\sin x + c$
Table 4.1: Standard integrals.
Note that the integral of 1/x is ln |x| + c rather than ln x + c, because if x is negative,
then the derivative of ln |x| is the derivative of ln(−x), which is just 1/x.
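The claim that $\ln |x|$ has derivative $1/x$ for negative x as well as positive x can be checked numerically. The following Python sketch uses a central difference to estimate the derivative; the helper names, the step size and the sample points are assumptions made for this illustration.

```python
import math


def log_abs(x):
    """ln|x|, defined for every non-zero x."""
    return math.log(abs(x))


def central_difference(g, x, h=1e-6):
    """Estimate g'(x) by (g(x + h) - g(x - h)) / (2h)."""
    return (g(x + h) - g(x - h)) / (2 * h)


points = [-5.0, -2.0, -0.5, 0.5, 2.0, 5.0]
slopes = {x: central_difference(log_abs, x) for x in points}

for x in points:
    print(x, slopes[x], 1 / x)
```

For the negative sample points the estimated slope is negative and matches $1/x$, which $\ln x$ alone could not provide since it is undefined there.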
Example 4.1 The integral of $\sqrt{x}$, which of course can be written as $x^{1/2}$, is
$2x^{3/2}/3 + c$, according to the first rule (taking n = 1/2).
Activity 4.1 Integrate the function $x^5$.
Two important rules of integrals are easily verified:
$$\int (f(x) + g(x))\,dx = \int f(x)\,dx + \int g(x)\,dx,$$
for any functions f and g, and
$$\int kf(x)\,dx = k\int f(x)\,dx,$$
for any constant k.
2 See Anthony and Biggs (1996) Section 25.5.
Activity 4.2 If a function f has derivative $x^2 + 2\sin x$, and $f(0) = 1$, what is the
function?
4.3 Definite integrals
Let f be a function with an anti-derivative F . The definite integral 3 of the function f
over the interval [a, b] is defined to be
$$\int_a^b f(x)\,dx = F(b) - F(a).$$
Note that any anti-derivative G(x) of f is of the form
G(x) = F (x) + c,
for some constant c, so that
G(b) − G(a) = (F (b) + c) − (F (a) + c) = F (b) − F (a).
Thus, whichever anti-derivative of f is chosen, the quantity on the right-hand side of
the definition is the same. In calculations the notation $[F(x)]_a^b$ is often used as a
shorthand for $F(b) - F(a)$.
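The definition can be illustrated numerically. The Python sketch below takes $f(x) = x^3$ with anti-derivative $F(x) = x^4/4$ on the interval [0, 2] (choices made here for illustration), confirms that adding a constant to F leaves $F(b) - F(a)$ unchanged, and compares the result with a midpoint sum. The midpoint sum relies on the interpretation of a definite integral as an area, which is an assumption not made in the text at this point.

```python
def f(x):
    return x**3


def F(x):
    """An anti-derivative of f."""
    return x**4 / 4


def midpoint_sum(f, a, b, n=1000):
    """Approximate the area under f on [a, b] with n midpoint rectangles."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


a, b = 0.0, 2.0
exact = F(b) - F(a)

# A different anti-derivative, F(x) + 7, gives the same difference.
shifted = (F(b) + 7) - (F(a) + 7)

approx = midpoint_sum(f, a, b)

print(exact, shifted, approx)
```

The agreement between `exact` and `approx` is only approximate, and improves as the number of rectangles n is increased.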
Example 4.2 The definite integral of $x^4$ over [0, 1] is
$$\int_0^1 x^4\,dx = \left[\frac{x^5}{5}\right]_0^1 = \frac{1}{5} - 0 = \frac{1}{5}.$$
Activity 4.3 Calculate $\int_0^2 x^2\,dx$.
Activity 4.4 Determine $\int_{-1}^{1} e^t\,dt$.
4.4 Integration by substitution
4.4.1 The method
We now turn our attention to a useful integration technique which may be thought of as
doing for integration what the composite function rule does for differentiation. This is
the technique of integration by substitution.4 We illustrate with a simple example.
3 See Anthony and Biggs (1996) Section 25.4.
4 See Anthony and Biggs (1996) Sections 26.1 and 26.2.
Example 4.3 Suppose that we are asked to find the indefinite integral
$$\int (3x + 5)^{12}\,dx.$$
We note that if we substitute $u = 3x + 5$ the integrand becomes $u^{12}$, which we know
how to integrate. But we have changed the variable of integration. Originally we
integrated with respect to x, signified by dx in the integral; now we must integrate
with respect to u. We must relate dx to du, for the notation $\int u^{12}\,dx$ has no
meaning. (Recall that a function of u must be accompanied by du.) Making the
substitution $u = 3x + 5$ is the same as saying that
$$x = \frac{1}{3}u - \frac{5}{3},$$
so $dx/du = 1/3$. Thus (and although this looks strange, it can be justified),
$dx = \frac{1}{3}\,du$. So, when we replace $3x + 5$ by u we should replace dx by $(1/3)\,du$, giving
$$\int (3x + 5)^{12}\,dx = \int u^{12}\,\frac{1}{3}\,du = \frac{1}{3}\,\frac{u^{13}}{13} + c.$$
We need the answer in terms of x, the original variable. Since $u = 3x + 5$ the integral
is
$$\frac{1}{39}(3x + 5)^{13} + c.$$
The general rule is: when we change the variable by putting x = x(u) in the integral of
f(x) with respect to x, we must replace dx by (dx/du)du. In other words, to determine
$$\int f(x)\,dx$$
we can work out
$$\int f(x(u))\,x'(u)\,du,$$
and then substitute back so that the answer is expressed in terms of x.
In practice we overlook the distinction between x as a function of u and the inverse
function, in which u is regarded as a function of x, relying on the fact that du/dx is
equal to 1/(dx/du). This allows us to write ‘shorthand’ statements like
u = 3x + 5,
therefore du = 3 dx,
which determines both du/dx and dx/du.
As we have formally described a change of variable here, it involves expressing the
variable of integration, x, in terms of a new variable u. Another way of expressing this
is to say that we have made a substitution: we have substituted u for x. In practice, in
some problems it will be natural to express the new variable u as a function of the
original variable x, as in the example above and in some other problems it will be more
natural to express x as a function of u. The first approach is probably more common in
the type of integrals studied in this subject.
4.4.2 Examples
One key difficulty students have with the substitution method is not knowing which
substitution to attempt. This is something you become more proficient at with practice,
but it should be borne in mind that there need not be one correct substitution: for a
number of problems, more than one substitution might work. In determining what
substitution, if any, to try, one approach is to look at the integral and ask yourself what
is complicating it. For instance, consider the example given above, where we have to
integrate $(3x + 5)^{12}$. If we simply had $x^{12}$ rather than $(3x + 5)^{12}$, then the
problem would be easy. It is for this reason that we seek to transform the integral into a
straightforward power, $u^{12}$, and the way to do this is to set $u = 3x + 5$. Fortunately, this
works, because the subsequent step of replacing dx by something involving du does not
further complicate matters, only introducing a multiplicative factor of 1/3. Here are a
few more examples where the obvious substitution works, and one where it does not
quite.
Example 4.4 Consider $\int x(3x + 5)^7\,dx$. This is a slightly more complicated
integral than the one we worked on above, but it is still the case that what makes
the integral difficult is the $(3x + 5)^7$. So, we try the substitution $u = 3x + 5$. Then
$du = 3\,dx$, so $dx = (1/3)\,du$ and the integral becomes $\int x u^7 (1/3)\,du$. But there’s
something wrong here: the integrand involves both variables $x$ and $u$, whereas what
we want is an integral involving only the new variable $u$. But, fear not, because,
from $u = 3x + 5$, we know that $x = (u - 5)/3$. So, the integral is
\[
\begin{aligned}
\int \frac{1}{3}\,\frac{(u - 5)}{3}\,u^7\,du &= \frac{1}{9}\int (u^8 - 5u^7)\,du \\
&= \frac{1}{9}\left(\frac{u^9}{9} - 5\,\frac{u^8}{8}\right) + c \\
&= \frac{u^9}{81} - \frac{5}{72}u^8 + c \\
&= \frac{(3x + 5)^9}{81} - \frac{5}{72}(3x + 5)^8 + c,
\end{aligned}
\]
not an answer you might easily have guessed(!), but which is obtained without too
much difficulty using the substitution method.
Example 4.5 Let’s think about $\int x(2x^2 + 7)^8\,dx$. The complicating part of the
integral is $2x^2 + 7$, so we try the substitution $u = 2x^2 + 7$. With this, $du = 4x\,dx$, so
that $dx = (1/(4x))\,du$ and the integral becomes
\[ \int x u^8\,\frac{1}{4x}\,du = \frac{1}{4}\int u^8\,du. \]
Note that the $x$ cancels with the $1/(4x)$ factor emerging from expressing $dx$ in terms
of $du$. So, here, there is no need to express $x$ in the integrand in terms of $u$. This
integral is now quite straightforward. It evaluates to $u^9/36 + c$, so the answer is
$(2x^2 + 7)^9/36 + c$.
Example 4.6 Consider the fairly similar-looking integral $\int x^3(2x^2 + 7)^8\,dx$. Again,
we try the substitution $u = 2x^2 + 7$. We have, as before, $dx = (1/(4x))\,du$ and so the
integral becomes
\[ \int x^3 u^8\,\frac{1}{4x}\,du = \frac{1}{4}\int x^2 u^8\,du. \]
Here, the $x$ terms in the integrand do not entirely cancel. But we know that
$x^2 = (u - 7)/2$, so the integral simplifies as
\[
\begin{aligned}
\frac{1}{4}\int \frac{(u - 7)}{2}\,u^8\,du &= \frac{1}{8}\int (u^9 - 7u^8)\,du \\
&= \frac{1}{8}\left(\frac{u^{10}}{10} - \frac{7u^9}{9}\right) + c \\
&= \frac{(2x^2 + 7)^{10}}{80} - \frac{7(2x^2 + 7)^9}{72} + c.
\end{aligned}
\]
Example 4.7 The integral
\[ \int \frac{x + 1}{x^2 + 2x + 7}\,dx \]
is a different sort of integral, but
there is a method that often (though not always) works. We let $u$, the new variable,
be the denominator (that is, the bottom line) of the integrand, $u = x^2 + 2x + 7$.
Then, $du = (2x + 2)\,dx$, so $dx = du/(2x + 2)$ and the integral is
\[ \int \frac{x + 1}{u}\,\frac{du}{2x + 2} = \frac{1}{2}\int \frac{du}{u}. \]
Notice how the $x + 1$ on the numerator of the integrand and the $2x + 2$ factor cancel
with each other to give just the constant factor 1/2. Now the integral is
\[ \frac{1}{2}\int \frac{du}{u} = \frac{1}{2}\ln|u| + c = \frac{1}{2}\ln|x^2 + 2x + 7| + c. \]
In fact, $x^2 + 2x + 7$ is positive for all $x$, so we do not need the absolute value signs,
and the answer is simply $(1/2)\ln(x^2 + 2x + 7) + c$.
Example 4.8 The integral
\[ \int \frac{x}{x^2 + 2x + 7}\,dx \]
looks similar, but we shall see that
the same substitution does not enable us to determine the integral. (Can you
anticipate why?) Putting $u = x^2 + 2x + 7$ as before, the integral becomes
\[ \int \frac{x}{u}\,\frac{du}{2x + 2} = \int \frac{x}{2x + 2}\,\frac{du}{u}. \]
Here, we do not get the same sort of cancellation as before. To express $x/(1 + x)$ in
terms of $u$ would be difficult, and would lead to a ‘messy’ integral which we could
not easily determine. The reason that the substitution does not work here is that the
numerator is not a multiple of the derivative $(2x + 2)$ of the bottom line.
In the context of these last two examples, it is worth mentioning a general rule:
\[ \int \frac{f'(x)}{f(x)}\,dx = \ln|f(x)| + c. \]
This follows on making the substitution $u = f(x)$. Noting that $du = f'(x)\,dx$, the
integral is exactly $\int (1/u)\,du$, which is $\ln|u| + c$, equal to $\ln|f(x)| + c$.
Activity 4.5
Use the substitution $u = x^2$ to determine $\int x e^{x^2}\,dx$.
Activity 4.6
Determine $\int x(2x^2 + 2)^{1/2}\,dx$ by using substitution.
Activity 4.7
Determine the integral
\[ \int x\sqrt{x - 1}\,dx, \]
by using the substitution $u = x - 1$. Now determine it using the substitution
$u = \sqrt{x - 1}$. (You should, of course, get the same answer. The point I’m emphasising
here is that there can be more than one appropriate substitution.)
4.4.3 The substitution method for definite integrals
In the case of a definite integral there is no need to revert to the original variable before
evaluating the anti-derivative: we simply use the appropriate values of the new variable.
If we change from the variable x to the variable u, and the interval of integration for x
was [a, b], the interval for u will be [α, β], where α and β are the values of u which
correspond to x = a and x = b respectively. Formally,
\[ \int_{x=a}^{x=b} f(x)\,dx = \int_{u=\alpha}^{u=\beta} f(x(u))\,x'(u)\,du, \]
where x(α) = a and x(β) = b. This result holds provided that u increases or decreases
from α to β as x goes from a to b.
Example 4.9
Making the substitution $u = 3x + 5$,
\[
\begin{aligned}
\int_0^1 (3x + 5)^2\,dx &= \int_5^8 \frac{1}{3}u^2\,du \\
&= \frac{1}{3}\left[\frac{u^3}{3}\right]_5^8 \\
&= \frac{1}{9}(8^3 - 5^3).
\end{aligned}
\]
Here, we have used the fact that since $u = 3x + 5$, the values of $u$ corresponding to
$x = 0$ and $x = 1$ are 5 and 8.
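The change of limits can be checked numerically by integrating both sides. The sketch below uses composite Simpson's rule, a standard numerical method that is not part of this guide; the function name simpson and the number of subintervals are illustrative choices.

```python
def simpson(f, a, b, n=100):
    # Composite Simpson's rule on [a, b]; n must be even.
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Left-hand side: integrate in the original variable x over [0, 1].
lhs = simpson(lambda x: (3 * x + 5) ** 2, 0.0, 1.0)
# Right-hand side: integrate in u over [5, 8], with dx replaced by (1/3) du.
rhs = simpson(lambda u: u ** 2 / 3, 5.0, 8.0)
# Closed form obtained in Example 4.9.
closed_form = (8 ** 3 - 5 ** 3) / 9

print(lhs, rhs, closed_form)
```

All three values should agree up to rounding, since Simpson's rule is exact for quadratic integrands.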
Activity 4.8
Determine $\int_0^1 \frac{x + 2}{x^2 + 4x + 5}\,dx$.
4.5 Integration by parts
The technique of integration by parts 5 may be thought of as resulting from the product
rule for differentiation, which tells us that the derivative of $u(x)v(x)$ is
$u'(x)v(x) + u(x)v'(x)$. Hence the anti-derivative of $u'(x)v(x) + u(x)v'(x)$ is $u(x)v(x)$, or
equivalently
\[ \int u'(x)v(x)\,dx + \int u(x)v'(x)\,dx = u(x)v(x). \]
Rearranging, we get the rule for integration by parts:
\[ \int u'(x)v(x)\,dx = u(x)v(x) - \int u(x)v'(x)\,dx. \]
Thus, we can express an integral of the form $\int u'(x)v(x)\,dx$ as a known function
($u(x)v(x)$) minus another integral. The second integral may be easier than the first.
And this is the point of this rather complicated-looking rule: we might get a simpler
integral as a result of replacing one ‘part’ $u'(x)$ by its integral $u(x)$ and the other ‘part’
$v(x)$ by its derivative $v'(x)$.
Often, this rule is written in the shorthand form
\[ \int f\,dg = fg - \int g\,df. \]
Example 4.10
Consider the integral
\[ \int x \ln x\,dx. \]
Taking $u'(x) = x$ and $v(x) = \ln x$ in the integration by parts rule, we have
\[
\begin{aligned}
\int u'(x)v(x)\,dx &= u(x)v(x) - \int u(x)v'(x)\,dx \\
&= \frac{1}{2}x^2 \ln x - \int \frac{1}{2}x^2\,\frac{1}{x}\,dx \\
&= \frac{1}{2}x^2 \ln x - \int \frac{1}{2}x\,dx \\
&= \frac{1}{2}x^2 \ln x - \frac{1}{4}x^2 + c.
\end{aligned}
\]
5 See Anthony and Biggs (1996) Section 26.3.
Example 4.11 You might have wondered why, although $\ln x$ is a very important
function, it does not feature in our list of standard integrals. The reason is that the
integral of $\ln x$ is not particularly easy to remember. There is a rather ‘sneaky’ way
of finding it, using integration by parts. The integral $\int \ln x\,dx$ may be thought of as
the integral of $1 \times \ln x$. Taking $u'(x) = 1$ and $v(x) = \ln x$ (so $u(x) = x$ and
$v'(x) = 1/x$) and using integration by parts, we have
\[ \int \ln x\,dx = x \ln x - \int x\,\frac{1}{x}\,dx = x \ln x - \int 1\,dx = x \ln x - x + c. \]
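The integration by parts rule can be illustrated numerically on a definite integral. The sketch below takes the integral of x ln x over the interval [1, 2] (the interval is an arbitrary choice made for illustration) and compares the left-hand side of the rule, the right-hand side, and the closed form from Example 4.10. The midpoint rule used here is a standard numerical method, not something from the guide.

```python
import math

def midpoint_rule(f, a, b, n=10_000):
    # Midpoint Riemann sum with n equal subintervals.
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

a, b = 1.0, 2.0

def u(x):
    return x ** 2 / 2        # u(x), an anti-derivative of u'(x) = x

def v(x):
    return math.log(x)       # v(x) = ln x

def v_prime(x):
    return 1 / x             # v'(x)

# Left-hand side of the rule: the integral of u'(x) v(x) = x ln x.
left = midpoint_rule(lambda x: x * v(x), a, b)
# Right-hand side: [u v] evaluated between a and b, minus the integral of u v'.
right = (u(b) * v(b) - u(a) * v(a)) - midpoint_rule(lambda x: u(x) * v_prime(x), a, b)

def closed_form(x):
    # Anti-derivative found in Example 4.10 (constant omitted).
    return 0.5 * x ** 2 * math.log(x) - 0.25 * x ** 2

closed = closed_form(b) - closed_form(a)
print(left, right, closed)
```

The three printed numbers should agree to several decimal places.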
Activity 4.9
Use integration by parts to find $\int x e^x\,dx$.
4.6 Partial fractions
This is a way of rewriting integrands of the form p(x)/q(x), where p and q are
polynomials, in a simpler form which makes them easier to integrate.6 Here is an
example.
Example 4.12
Consider
\[ \int \frac{x}{x^2 - x - 2}\,dx. \]
The integrand is of the form $p(x)/q(x)$, where $p(x) = x$ and $q(x) = x^2 - x - 2$.
Further, $q(x)$ factorises as $(x + 1)(x - 2)$. We claim that we can find constants $A_1$
and $A_2$ such that
\[ \frac{x}{x^2 - x - 2} = \frac{x}{(x + 1)(x - 2)} = \frac{A_1}{x + 1} + \frac{A_2}{x - 2}. \]
Multiplying through by $(x + 1)(x - 2)$, we obtain
\[ A_1(x - 2) + A_2(x + 1) = x. \]
Taking $x = -1$ gives $-3A_1 + 0A_2 = -1$, so that $A_1 = 1/3$. Taking $x = 2$ gives
$3A_2 = 2$ and $A_2 = 2/3$. There is another way of working out the numbers $A_1$, $A_2$,
using the ‘cover-up’ rule. To calculate $A_1$, for example, we put $x = -1$ in the original
expression
\[ \frac{x}{[(x + 1)](x - 2)}, \]
omitting (‘covering up’) the term in square brackets: that is,
\[ A_1 = \frac{-1}{(-1 - 2)} = \frac{1}{3}. \]
Similarly, we can calculate $A_2$ as
\[ A_2 = \frac{2}{(2 + 1)} = \frac{2}{3}. \]
The identity
\[ \frac{x}{x^2 - x - 2} = \frac{1/3}{x + 1} + \frac{2/3}{x - 2} \]
is called an expansion in partial fractions. It can easily be checked by multiplying
out.
Now we can determine the integral, as follows.
\[
\begin{aligned}
\int \frac{x}{x^2 - x - 2}\,dx &= \int \frac{x}{(x + 1)(x - 2)}\,dx \\
&= \frac{1}{3}\int \left(\frac{1}{x + 1} + \frac{2}{x - 2}\right) dx \\
&= \frac{1}{3}\ln(x + 1) + \frac{2}{3}\ln(x - 2) + c.
\end{aligned}
\]
To be precise, we should use $\ln|x + 1|$ and $\ln|x - 2|$ since we cannot calculate the
logarithm of a negative number.
6 See Anthony and Biggs (1996) Section 26.4.
Generally, the method of partial fractions involves rewriting expressions of the form
p(x)/q(x), where p and q are polynomials, as a sum of simpler terms. In particular, if
p(x) is linear (that is, of the form ax + b) and q(x) is a quadratic with two different
roots, then the method of partial fractions applies. Suppose that
$q(x) = (x - a_1)(x - a_2)$, where $a_1 \neq a_2$. Then it is possible to write
\[ \frac{p(x)}{q(x)} = \frac{p(x)}{(x - a_1)(x - a_2)} = \frac{A_1}{x - a_1} + \frac{A_2}{x - a_2} \qquad (*) \]
for some numbers $A_1$ and $A_2$. Cross-multiplying equation $(*)$, we get
\[ p(x) = A_1(x - a_2) + A_2(x - a_1). \]
The numbers $A_1$ and $A_2$ may be found by substituting $x = a_1$, $x = a_2$ in turn into this
identity. When $p(x)/q(x)$ is expressed in this way, it is possible to evaluate the integral,
because we have
\[
\begin{aligned}
\int \frac{p(x)}{q(x)}\,dx &= \int \left(\frac{A_1}{x - a_1} + \frac{A_2}{x - a_2}\right) dx \\
&= A_1 \ln|x - a_1| + A_2 \ln|x - a_2| + c.
\end{aligned}
\]
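For a linear numerator p(x) = ax + b, the substitution step described above can be written as a short routine. This is a minimal sketch: the function name partial_fraction_coefficients and its argument order are hypothetical, chosen here only for illustration, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction

def partial_fraction_coefficients(a, b, a1, a2):
    """Return (A1, A2) such that
    (a*x + b) / ((x - a1)(x - a2)) = A1/(x - a1) + A2/(x - a2)."""
    if a1 == a2:
        raise ValueError("the roots a1 and a2 must be different")
    a, b, a1, a2 = map(Fraction, (a, b, a1, a2))
    # Putting x = a1 in p(x) = A1(x - a2) + A2(x - a1) leaves only the A1 term,
    # and putting x = a2 leaves only the A2 term.
    A1 = (a * a1 + b) / (a1 - a2)
    A2 = (a * a2 + b) / (a2 - a1)
    return A1, A2

# Example 4.12: x / ((x + 1)(x - 2)), so p(x) = x, a1 = -1 and a2 = 2.
print(partial_fraction_coefficients(1, 0, -1, 2))
```

For Example 4.12 this returns the fractions 1/3 and 2/3, matching the values of A1 and A2 found above.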
Activity 4.10
Find $\int \frac{dx}{x^2 + 4x + 3}$.
4.7 Applications of integration
We have seen that marginals are derivatives; for instance, the marginal cost M C is the
derivative (with respect to quantity) of the total cost function T C. This means that if
we are given the marginal cost and we want to find the total cost function then we have
to reverse the previous procedure, by integrating. However, we shall also need some
additional information, since we know that when we integrate we have a constant of
integration which has to be determined. Often this information is provided to us by the
fixed cost, which, you will recall, is the cost T C(0) of producing no units. The following
example illustrates this.
Example 4.13 Suppose that the marginal cost function is given by $MC(q) = 2e^{0.5q}$
and that the fixed cost is 20. Then the total cost function is the integral of the
marginal cost:
\[ TC(q) = \int 2e^{0.5q}\,dq = 4e^{0.5q} + c, \]
for some constant $c$. To determine $c$, we use the fact that when $q = 0$, the cost
$TC(0)$ must equal the fixed cost, 20; thus, $4e^0 + c = 20$, $4 + c = 20$ and so $c = 16$,
and $TC(q) = 4e^{0.5q} + 16$. (An extremely common mistake in a problem of this type is
to assume that the constant equals the fixed cost, which, as we see in this example,
need not be the case. Beware!)
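The warning about the constant can be made concrete in a few lines. In this sketch the function names and the default argument are illustrative; the point is that c is computed from the condition TC(0) = fixed cost rather than set equal to the fixed cost.

```python
import math

def marginal_cost(q):
    return 2 * math.exp(0.5 * q)

def total_cost(q, fixed_cost=20.0):
    # The anti-derivative of MC(q) = 2e^{0.5q} is 4e^{0.5q} + c.
    # Impose TC(0) = fixed_cost: 4e^0 + c = fixed_cost, so c = fixed_cost - 4.
    c = fixed_cost - 4 * math.exp(0)
    return 4 * math.exp(0.5 * q) + c

print(total_cost(0))   # should reproduce the fixed cost
print(total_cost(2))
```

With a fixed cost of 20 the constant works out as 16, not 20, as in the example.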
Activity 4.11 Find the total cost function if the marginal cost is $q + 5q^2 + e^q$ and
the fixed cost is 10.
Learning outcomes
At the end of this chapter and the relevant reading, you should be able to:
explain what is meant by an (indefinite) integral and a definite integral
state and use the standard integrals
use integration by substitution
use integration by parts
integrate using partial fractions
calculate functions from their marginals
Sample examination/practice questions
Question 4.1
Determine the following integral:
\[ \int \frac{x}{x^2 + 5x + 6}\,dx. \]
Question 4.2
Evaluate $\int_1^2 x^2(x - 1)^{1/2}\,dx$ using an appropriate substitution.
Question 4.3
Determine $\int \frac{2x + 3}{x^2 + 3x + 2}\,dx$.
Question 4.4
Determine $\int_1^e \frac{\ln x\,\sqrt{\ln x}}{x}\,dx$.
Question 4.5
A company produces only product XYZ. When producing $Q$ units the marginal cost
$MC$ is given by
\[ MC = 1 - \frac{1}{(Q + 1)^2}. \]
If the average cost per unit when producing 4 units is 3.05, what is the total cost of
producing 5 units of XYZ?
Question 4.6
A company’s marginal cost function is
\[ MC = 32 + 18q - 12q^2. \]
Its fixed cost is 43. Determine the firm’s total cost function, average cost function, and
variable cost.
Question 4.7
The marginal revenue function for a commodity is given by
\[ MR = 10 - 2x^2, \]
and the total cost function for the commodity is
\[ TC = x^2 + 4x + 2, \]
where $x$ is the number of units produced. Find the revenue function, and determine the
maximal profit.
Question 4.8
For a particular company, the marginal cost is a function of output as follows:
\[ MC = 10 - q + q^2. \]
Determine the extra cost which is incurred when production is increased from 2 to 4.
Question 4.9
A firm’s marginal cost function is
\[ \frac{20}{\sqrt{Q}}\,e^{\sqrt{Q}} + Q^3 + \frac{1}{Q + 1}. \]
The firm’s fixed costs are 20. Determine the total cost function.
Answers to activities
Feedback to activity 4.1
This is one of the standard integrals, and the answer is $x^6/6 + c$.
Feedback to activity 4.2
We know that the function is an anti-derivative of $x^2 + 2\sin x$. Integrating this, by the
standard integrals and the rules just seen, we have
\[ \int (x^2 + 2\sin x)\,dx = \frac{x^3}{3} - 2\cos x + c, \]
where $c$ is a constant of integration. But we know more about $f$: we know that
$f(0) = 1$, so we know that
\[ f(x) = \frac{x^3}{3} - 2\cos x + c, \]
where the constant $c$ is such that $f(0) = 1$. Now, substituting $x = 0$ into the expression
for $f$, we have
\[ f(0) = 0 - 2\cos 0 + c = -2 + c, \]
and for this to equal 1, $c$ must be 3. Therefore
\[ f(x) = \frac{x^3}{3} - 2\cos x + 3. \]
Feedback to activity 4.3
The integral $\int x^2\,dx$ is $x^3/3 + c$, so
\[ \int_0^2 x^2\,dx = \left[\frac{x^3}{3}\right]_0^2 = \frac{2^3}{3} - 0 = \frac{8}{3}. \]
Feedback to activity 4.4
The fact that this involves variable $t$ rather than $x$ should not confuse us. We have
\[ \int_{-1}^{1} e^t\,dt = \left[e^t\right]_{-1}^{1} = e^1 - e^{-1} = e - \frac{1}{e}. \]
Feedback to activity 4.5
With $u = x^2$ we have $du = 2x\,dx$ and so $dx = du/(2x)$. Therefore
\[
\begin{aligned}
\int x e^{x^2}\,dx &= \int x e^u\,\frac{du}{2x} \\
&= \frac{1}{2}\int e^u\,du \\
&= \frac{1}{2}e^u + c \\
&= \frac{1}{2}e^{x^2} + c.
\end{aligned}
\]
A slightly quicker approach to making this substitution is to note that since $du = 2x\,dx$
and the integral already has $x\,dx$, we have
\[ \int x e^{x^2}\,dx = \frac{1}{2}\int e^u\,du. \]
It amounts to the same thing.
Feedback to activity 4.6
We make the substitution $u = 2x^2 + 2$. We have $du = 4x\,dx$, so the integral reduces to
\[ \int \frac{1}{4}u^{1/2}\,du = \frac{1}{4}\,\frac{2u^{3/2}}{3} + c = \frac{1}{6}(2x^2 + 2)^{3/2} + c. \]
Feedback to activity 4.7
With $u = x - 1$, we have $du = dx$ and so, on noting that $x = u + 1$, the integral becomes
\[
\begin{aligned}
\int x\sqrt{u}\,dx &= \int (u + 1)\sqrt{u}\,du \\
&= \int (u + 1)u^{1/2}\,du \\
&= \int (u^{3/2} + u^{1/2})\,du \\
&= \frac{2}{5}u^{5/2} + \frac{2}{3}u^{3/2} + c \\
&= \frac{2}{5}(x - 1)^{5/2} + \frac{2}{3}(x - 1)^{3/2} + c.
\end{aligned}
\]
Now we try the second suggested substitution. Setting $u = \sqrt{x - 1}$ we have
\[ \frac{du}{dx} = \frac{1}{2\sqrt{x - 1}} = \frac{1}{2u}, \]
and so $dx = 2u\,du$. The integral becomes $\int x u (2u)\,du$ and we need to replace $x$ by its
expression in terms of $u$. We have $u = \sqrt{x - 1}$ and so $u^2 = x - 1$, from which we obtain
$x = u^2 + 1$. So the integral is
\[
\begin{aligned}
\int (u^2 + 1)u(2u)\,du &= 2\int (u^4 + u^2)\,du \\
&= 2\left(\frac{u^5}{5} + \frac{u^3}{3}\right) + c \\
&= \frac{2}{5}\left(\sqrt{x - 1}\right)^5 + \frac{2}{3}\left(\sqrt{x - 1}\right)^3 + c.
\end{aligned}
\]
This is (of course!) the same as the answer obtained using the first substitution. But
which is easier? Well, the details of the substitution are easier for the first, but the
actual integration of the transformed integral is easier for the second (because it does
not involve fractional powers). On balance, probably the first substitution is easier. But
both are correct! Do not think there’s necessarily only one way to solve a problem.
Feedback to activity 4.8
We use the substitution $u = x^2 + 4x + 5$. Then $du = (2x + 4)\,dx$ and so the integral is
\[ \int_5^{10} \frac{(1/2)\,du}{u} = \left[\frac{1}{2}\ln|u|\right]_5^{10} = \frac{1}{2}\ln(10) - \frac{1}{2}\ln(5) = \frac{1}{2}\ln(2). \]
Feedback to activity 4.9
Recall that the integration by parts rule is
\[ \int u'(x)v(x)\,dx = u(x)v(x) - \int u(x)v'(x)\,dx. \]
We need to integrate $x e^x$. If we were to take $u' = x$ and $v = e^x$ then $uv'$ would be
$(1/2)x^2 e^x$, so the integral on the right of the integration by parts equation would be
even more difficult than the one we started with. There is, though, another possibility:
we can take $u' = e^x$ and $v = x$, in which case $u = e^x$ and $v' = 1$. Then we have
\[ \int x e^x\,dx = x e^x - \int 1 \cdot e^x\,dx = x e^x - e^x + c. \]
Feedback to activity 4.10
The integrand is
\[ \frac{1}{x^2 + 4x + 3} = \frac{1}{(x + 1)(x + 3)}, \]
and the partial fractions rule says that
\[ \frac{1}{(x + 1)(x + 3)} = \frac{A}{x + 1} + \frac{B}{x + 3}, \]
for some numbers $A$ and $B$. Multiplying both sides by $(x + 1)(x + 3)$, we obtain
\[ 1 = A(x + 3) + B(x + 1). \]
Taking $x = -3$ gives $-2B = 1$, so $B = -1/2$. Taking $x = -1$ gives $2A = 1$, so $A = 1/2$.
The integral is therefore
\[ \frac{1}{2}\int \left(\frac{1}{x + 1} - \frac{1}{x + 3}\right) dx = \frac{1}{2}\ln|x + 1| - \frac{1}{2}\ln|x + 3| + c. \]
Feedback to activity 4.11
We have
\[ TC(q) = \int MC\,dq = \int (q + 5q^2 + e^q)\,dq = \frac{q^2}{2} + \frac{5q^3}{3} + e^q + c. \]
We know that $TC(0) = FC = 10$, so $0 + 0 + e^0 + c = 10$; in other words, $1 + c = 10$ and
$c = 9$. Therefore the total cost function is
\[ TC = \frac{q^2}{2} + \frac{5q^3}{3} + e^q + 9. \]
Answers to Sample examination/practice questions
Answer to question 4.1
Noting that
\[ \frac{x}{x^2 + 5x + 6} = \frac{x}{(x + 2)(x + 3)}, \]
we use partial fractions. We have
\[ \frac{x}{(x + 2)(x + 3)} = \frac{A}{x + 2} + \frac{B}{x + 3}, \]
for some numbers $A$ and $B$. Multiplying both sides by both factors in the usual way, we
have
\[ A(x + 3) + B(x + 2) = x. \]
Taking $x = -3$ gives $-B = -3$, so $B = 3$. Taking $x = -2$ we get $A = -2$. Hence
\[
\begin{aligned}
\int \frac{x}{x^2 + 5x + 6}\,dx &= \int \left(\frac{-2}{x + 2} + \frac{3}{x + 3}\right) dx \\
&= -2\ln|x + 2| + 3\ln|x + 3| + c.
\end{aligned}
\]
Answer to question 4.2
Let us try $u = x - 1$. We have $du = dx$. When $x = 1$, $u = 0$ and when $x = 2$, $u = 1$.
Furthermore, since $x = u + 1$, we may write $x^2$ as $(u + 1)^2$. The integral therefore
becomes
\[
\begin{aligned}
\int_0^1 (u + 1)^2 u^{1/2}\,du &= \int_0^1 (u^2 + 2u + 1)u^{1/2}\,du \\
&= \int_0^1 (u^{5/2} + 2u^{3/2} + u^{1/2})\,du \\
&= \left[\frac{2}{7}u^{7/2} + \frac{4}{5}u^{5/2} + \frac{2}{3}u^{3/2}\right]_0^1 \\
&= \frac{2}{7} + \frac{4}{5} + \frac{2}{3} \\
&= \frac{184}{105}.
\end{aligned}
\]
Answer to question 4.3
For this integral, the substitution $u = x^2 + 3x + 2$ gives
\[ \int \frac{du}{u} = \ln|u| + c = \ln|x^2 + 3x + 2| + c. \]
An alternative approach, however, is to use partial fractions, because the denominator
factorises as $(x + 1)(x + 2)$. Partial fractions tells us that for some numbers $A$ and $B$,
\[ \frac{2x + 3}{(x + 1)(x + 2)} = \frac{A}{x + 1} + \frac{B}{x + 2}. \]
In the usual way, we have
\[ A(x + 2) + B(x + 1) = 2x + 3, \]
for all $x$. Taking $x = -2$ reveals that $-B = -1$, so $B = 1$; and taking $x = -1$, we
obtain $A = 1$. Therefore the integral equals
\[ \int \left(\frac{1}{x + 1} + \frac{1}{x + 2}\right) dx = \ln|x + 1| + \ln|x + 2| + c. \]
This is the same answer as we obtained using substitution, because
\[ \ln|x + 1| + \ln|x + 2| = \ln|(x + 1)(x + 2)| = \ln|x^2 + 3x + 2|. \]
Note that here (again), there is more than one way to solve the problem.
Answer to question 4.4
With $u = \ln x$, we have $du = (1/x)\,dx$. Also, the values $x = 1$ and $x = e$ correspond to
$u = 0$ and $u = 1$. So
\[ \int_1^e \frac{\ln x\,\sqrt{\ln x}}{x}\,dx = \int_0^1 u\sqrt{u}\,du = \int_0^1 u^{3/2}\,du = \left[\frac{2}{5}u^{5/2}\right]_0^1 = \frac{2}{5}. \]
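Answer to question 4.5
We know that
\[
\begin{aligned}
TC &= \int MC\,dQ \\
&= \int \left(1 - \frac{1}{(Q + 1)^2}\right) dQ \\
&= \int \left(1 - (Q + 1)^{-2}\right) dQ \\
&= Q + (Q + 1)^{-1} + c.
\end{aligned}
\]
So, for some constant $c$,
\[ TC = Q + \frac{1}{Q + 1} + c. \]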
Now, we know that the average cost when $Q = 4$ is 3.05, so the total cost when $Q = 4$ is
$4(3.05) = 12.2$. But
\[ TC(4) = 4 + (1/5) + c = 4.2 + c, \]
so we must have $c = 8$. So $TC = Q + 1/(Q + 1) + 8$. When $Q = 5$ the total cost is
therefore $5 + 1/6 + 8 = 79/6$.
It’s useful, perhaps, to point out how not to answer this question. A naive approach
might be to argue as follows: the total cost at $Q = 4$ is 12.2, and the marginal cost
when $Q = 4$ is $1 - (1/5^2) = 24/25$. Since the marginal cost is the cost of producing one
additional item, the cost of producing 5 is therefore $12.2 + 24/25 = 13.16$. This is
incorrect. Why? The reason is that the marginal cost gives, approximately, the cost of
producing one more item, but that for this approximation to be good, increasing
production by one item must be a relatively small increase. But increasing from 4 to 5
is a big relative change in production. If we were increasing from 400 to 401 (say), the
approximation would be better. (Recall that the formal mathematical definition of
marginal cost is that it is the derivative of the total cost, and that this is approximately
the cost of producing one additional item.)
Answer to question 4.6
We have
\[ TC = \int MC\,dq = \int (32 + 18q - 12q^2)\,dq = 32q + 9q^2 - 4q^3 + c, \]
and we know that the fixed cost, which is $TC(0)$, is 43, so
\[ 43 = 0 + 0 + 0 + c, \]
and hence
\[ TC = 32q + 9q^2 - 4q^3 + 43. \]
Then, the average cost is
\[ AC = \frac{TC}{q} = 32 + 9q - 4q^2 + \frac{43}{q}, \]
and
\[ VC = TC - FC = (32q + 9q^2 - 4q^3 + 43) - 43 = 32q + 9q^2 - 4q^3. \]
Answer to question 4.7
The total revenue is given by
\[ TR = \int MR\,dx = \int (10 - 2x^2)\,dx = 10x - \frac{2}{3}x^3 + c. \]
But what should $c$ be? Well, think about it: what’s the revenue from selling 0 items?
It’s 0, of course, so $TR(0) = 0$ and hence $c = 0$. Therefore
\[ TR = 10x - \frac{2}{3}x^3. \]
The profit function is
\[ \Pi = TR - TC = 10x - \frac{2}{3}x^3 - (x^2 + 4x + 2) = 6x - \frac{2}{3}x^3 - x^2 - 2. \]
Setting $\Pi'(x) = 0$ we obtain
\[ 6 - 2x^2 - 2x = 0, \]
or, equivalently,
\[ x^2 + x - 3 = 0. \]
Solving this, we obtain $x = (-1 \pm \sqrt{13})/2$ and clearly, for it to have economic
significance, it is the positive solution $(-1 + \sqrt{13})/2$ that is relevant. The second
derivative $\Pi''(x)$ is $-4x - 2$, which is negative here, so this gives a maximum. The
maximum value of the profit is obtained by substituting this value into the profit
function. This turns out to be 2.64536.
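The final substitution is tedious by hand, so a short calculation may be helpful. This is a sketch only; the names profit, profit_second_derivative and x_star are illustrative.

```python
import math

def profit(x):
    # Profit function from the answer: 6x - (2/3)x^3 - x^2 - 2.
    return 6 * x - (2 / 3) * x ** 3 - x ** 2 - 2

def profit_second_derivative(x):
    return -4 * x - 2

# Positive root of x^2 + x - 3 = 0.
x_star = (-1 + math.sqrt(13)) / 2

print(x_star)
print(profit_second_derivative(x_star))  # negative, so x_star gives a maximum
print(round(profit(x_star), 5))
```

The rounded profit agrees with the value 2.64536 quoted in the answer.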
Answer to question 4.8
What we need here is $TC(4) - TC(2)$. Since
\[ TC = \int MC\,dq, \]
we have
\[
\begin{aligned}
TC(4) - TC(2) &= \int_2^4 MC\,dq \\
&= \int_2^4 (10 - q + q^2)\,dq \\
&= \left[10q - \frac{q^2}{2} + \frac{q^3}{3}\right]_2^4 \\
&= \left(10(4) - \frac{4^2}{2} + \frac{4^3}{3}\right) - \left(10(2) - \frac{2^2}{2} + \frac{2^3}{3}\right) \\
&= \frac{98}{3}.
\end{aligned}
\]
(Alternatively, you could determine $TC$ by indefinite integration to start with, and then
calculate $TC(4) - TC(2)$. Of course, the constant won’t be known since we aren’t told
the fixed costs. But this does not matter since, in working out the difference
$TC(4) - TC(2)$, the constant will cancel.)
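The arithmetic in the last step can be confirmed exactly with rational numbers. In the sketch below the function name antiderivative is illustrative, and the constant of integration is omitted because it cancels in the difference.

```python
from fractions import Fraction

def antiderivative(q):
    # An anti-derivative of MC = 10 - q + q^2, with the constant omitted.
    q = Fraction(q)
    return 10 * q - q ** 2 / 2 + q ** 3 / 3

extra_cost = antiderivative(4) - antiderivative(2)
print(extra_cost)
```

This prints the fraction 98/3, in agreement with the answer above.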
Answer to question 4.9
We have
\[ TC = \int MC\,dQ = \int \left(\frac{20}{\sqrt{Q}}\,e^{\sqrt{Q}} + Q^3 + \frac{1}{Q + 1}\right) dQ. \]
Now, to determine the integral of $e^{\sqrt{Q}}/\sqrt{Q}$, we use the substitution $u = \sqrt{Q}$. We have
$du = (1/(2\sqrt{Q}))\,dQ$ and
\[ \int \frac{e^{\sqrt{Q}}}{\sqrt{Q}}\,dQ = \int 2e^u\,du = 2e^u + c = 2e^{\sqrt{Q}} + c. \]
So,
\[ TC = 40e^{\sqrt{Q}} + \frac{Q^4}{4} + \ln|Q + 1| + c. \]
Now,
\[ 20 = FC = TC(0) = 40e^0 + 0 + \ln(1) + c = 40 + c, \]
so $c = -20$ and
\[ TC = 40e^{\sqrt{Q}} + \frac{Q^4}{4} + \ln|Q + 1| - 20 = 40e^{\sqrt{Q}} + \frac{Q^4}{4} + \ln(Q + 1) - 20. \]
(We’ve used the fact that $Q$, as a quantity, is non-negative to observe that
$|Q + 1| = Q + 1$.)
Chapter 5
Functions of several variables
Essential reading
(For full publication details, see Chapter 1.)
Anthony and Biggs (1996) Chapters 11, 13 and Sections 21.2, 22.2.
Further reading
Binmore and Davies (2001) Chapter 3, Sections 3.1 and 3.2; Chapter 4, Section
4.6; and Chapter 6, Sections 6.6 and 6.8. (This book gives a very general approach
to the classification of critical points, more complex than is required just for the
two-variable case, as discussed in this chapter.)
Bradley (2008) Chapter 7.
Dowling (2000) Chapters 5 and 6.
5.1 Introduction
In this chapter we study how the technique of differentiation, and its applications, can
be extended to functions depending on more than one variable. This is one of the most
important ideas for the application of mathematics in economics, management, finance,
and many other fields.
5.2 Functions of several variables
A function f may be thought of as a ‘machine’, which accepts an input x and produces
an output f (x). In this chapter we look at functions for which the input consists of a
pair of numbers (x, y).1 (The theory extends in an obvious way to the general case when
the input consists of n numbers (x1 , x2 , . . . , xn ).) Such a function is called a function of
two variables. (The extension to a function of n variables, for n ≥ 2, will be clear.
Although the title of this chapter is ‘Functions of several variables’, we shall mainly
work with functions of two variables.) Such functions occur often in economics and
other fields in which we might wish to apply mathematical techniques. An important
example is the production function of a firm, q(k, l), which describes the amount of
product the firm produces when using k units of capital and l of labour. Another
important class of such functions is the class of utility functions. A utility function
describes the preferences of a consumer: it enables us to compare the worth to the
consumer of different combinations of two goods. These applications will be discussed
further later on in this chapter.
1 See Anthony and Biggs (1996) Section 11.1.
5.3 Partial derivatives
Suppose the quantity Z = f (x, y) is a function of two independent variables x, y. (To
say that x and y are independent simply means that they do not depend on one
another: each may be chosen independently of the other to form an input (x, y) to the
function f (x, y).) Then we may think of Z = f (x, y) as defining a surface in
3-dimensional space: we do this by visualising all the points (x, y, f (x, y)) in three
dimensions. In other words, imagine that for each point (x, y) in the (x, y)-plane, we
plot a point at a z-distance f (x, y) above the point. If we do this for all (x, y) we obtain
a surface (much in the same way as plotting the points (x, f (x)) for a one-variable
function f gives the graph of the function, which is a curve in two dimensions).
For example, the following diagram shows part of the surface corresponding to the
function $f(x, y) = 2x^2 + 2y^2$. The surface is ‘bowl’-shaped, with the bottom of the bowl
at coordinates (0, 0, 0).
Activity 5.1 What sort of surface do you think is described by the equation
$z = \sqrt{x^2 + y^2}$?
Although curve-sketching (which is sketching the graph of a one-variable function) is
important in this course, there is no need for you to be able to describe or sketch such
surfaces for functions of two or more variables. However, I have discussed this topic
because it might help in your understanding of what follows. (Of course, when dealing
with functions of more than two variables, the surfaces we obtain are in more than three
dimensions, and here our geometrical intuition is of little use.)
For a fixed $y = y_0$, the rate of change of $Z = f(x, y)$ with respect to $x$ at $x = x_0$ is
denoted
\[ \frac{\partial f}{\partial x}(x_0, y_0) \quad\text{or}\quad f_x(x_0, y_0). \]
We then have a function, $f_x$, which is the derivative of $f(x, y)$ when $y$ is regarded as a
constant. This is the partial derivative with respect to $x$.2 Sometimes the notations
$\partial Z/\partial x$ and $Z_x$ are also used for this partial derivative. (Note the ‘curly-d’, $\partial$, rather
than the normal d one encounters in the notation $df/dx$ for the derivative of a function,
$f(x)$, of one variable.)
So what does the partial derivative mean? If you imagined yourself walking across the
surface z = f (x, y), passing through the point (x0 , y0 ) and heading in a direction
parallel to the x-axis (so that your y-value remains at y0 ), then fx (x0 , y0 ) would be the
instantaneous rate at which the height of the surface increased. That is, fx (x0 , y0 ) is the
instantaneous rate of change of the function f when we keep y fixed at y0 and change x.
(Thus, it is the derivative of the single-variable function f (x, y0 ).)
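We can define $f_y = \partial f/\partial y$ similarly. $\partial f/\partial x$, $\partial f/\partial y$ are sometimes denoted $f_1$, $f_2$. Thus,
the notations
\[ \frac{\partial f}{\partial x}, \quad f_x, \quad \frac{\partial Z}{\partial x}, \quad Z_x, \quad f_1 \]
are all notations for the partial derivative with respect to $x$ of $Z = f(x, y)$. Partial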
derivatives of important functions in economics often have special names. For instance,
if q(k, l) is the production function of a firm, then ∂q/∂l is known as the marginal
product of labour.
Calculating partial derivatives is only slightly more difficult than calculating standard
derivatives. To calculate the partial derivative of a function f (x, y) with respect to x,
you just treat y as if it were a fixed number and differentiate with respect to x.
Example 5.1
Let $f(x, y) = x^2 y + 5xy^3 + y^2$. Then
\[ \frac{\partial f}{\partial x} = 2xy + 5y^3 \quad\text{and}\quad \frac{\partial f}{\partial y} = x^2 + 15xy^2 + 2y. \]
Activity 5.2
Suppose $f(x, y) = x^3 y - \frac{x}{y}$. Find the partial derivatives of $f$.
Of course, these definitions can be extended to functions of more than two variables,
defining the partial derivative with respect to each variable.
We may go on to define the partial derivatives with respect to $x$ and $y$ of the functions
$f_x$ and $f_y$, obtaining the second-order partial derivatives
\[ f_{xx}, \quad f_{xy}, \quad f_{yx}, \quad f_{yy}. \]
These are also denoted by
\[ \frac{\partial^2 f}{\partial x^2}, \quad \frac{\partial^2 f}{\partial x\,\partial y}, \quad \frac{\partial^2 f}{\partial y\,\partial x}, \quad \frac{\partial^2 f}{\partial y^2} \]
or by
\[ f_{11}, \quad f_{12}, \quad f_{21}, \quad f_{22}. \]
For all suitably well-behaved functions (and we shall only encounter such functions)
$f_{xy} = f_{yx}$.
2 See Anthony and Biggs (1996) Section 11.2.
The derivatives ∂f /∂x and ∂f /∂y are often called first-order derivatives. But we shall
mainly continue simply to call them the partial derivatives.
Example 5.2
Suppose
\[ Z = f(x, y) = x^2 y + y^3 x. \]
Then
\[ Z_x = 2xy + y^3, \qquad Z_y = x^2 + 3y^2 x, \]
\[ Z_{xx} = 2y, \qquad Z_{xy} = Z_{yx} = 2x + 3y^2, \qquad Z_{yy} = 6yx. \]
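The idea of 'treating the other variable as a fixed number' can be mimicked numerically: vary one input slightly while holding the other constant. The sketch below does this for the function of Example 5.2 at the point (1, 2), which is an arbitrary choice made for illustration; the helper names and the step size h are also illustrative.

```python
def f(x, y):
    return x ** 2 * y + y ** 3 * x

def partial_x(g, x, y, h=1e-6):
    # Vary x only, keeping y fixed.
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def partial_y(g, x, y, h=1e-6):
    # Vary y only, keeping x fixed.
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x0, y0 = 1.0, 2.0
# Formulas from Example 5.2, evaluated at (x0, y0).
exact_fx = 2 * x0 * y0 + y0 ** 3
exact_fy = x0 ** 2 + 3 * y0 ** 2 * x0

print(partial_x(f, x0, y0), exact_fx)
print(partial_y(f, x0, y0), exact_fy)
```

Each numerical estimate should be very close to the corresponding formula value.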
Activity 5.3
Find the second-order derivatives of the function $f(x, y) = x^3 y - \frac{x}{y}$.
(This was the function of the preceding activity.)
Activity 5.4 Find all the partial derivatives and second-order partial derivatives of
the function $f(x, y) = x^{3/4} y^{1/4}$.
5.4 The chain rule
Sometimes a function of one variable is defined with reference to a function of two
variables. For instance, suppose that the production level q of a firm depends on capital
k and labour l through the function q(k, l). Suppose also that both k and l change over
a period of time in some known way, so that we have formulas for k(t) and l(t), where t
is a parameter measuring time. For example, we might have
k(t) = 3 + 2t,
l(t) = 10 − 0.2t,
which means that k increases linearly while l decreases linearly, as functions of time. If
we know the production function q in terms of k and l, then we can also work out the
level of production in terms of t, and so we can see how the production level will change
as a result of changing t. For example, if q is the function q(k, l) = kl, and k and l
change as above, we get the formula
\[ kl = (3 + 2t)(10 - 0.2t) = 30 + 19.4t - 0.4t^2, \]
for the output in terms of t.
More generally, suppose we are given a function f of two variables (x, y), both of which
are themselves functions of t. We can think of this situation as defining a composite
function F (t) = f (x(t), y(t)). In the case of a single variable we have a rule, the
‘composite function rule’, or chain rule, which enables us to work out the derivative of a
composite function. There is a similar rule here, (also) known as the chain rule:
\[ \frac{dF}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}. \]
Sometimes, in this context, we call dF/dt the total derivative of F with respect to t (to
distinguish it from the partial derivatives of F with respect to x and y).
Example 5.3 Suppose, as above, $f(x, y) = xy$, $x(t) = 3 + 2t$ and $y(t) = 10 - 0.2t$.
Then, using the chain rule,
\[ \frac{dF}{dt} = y \times 2 + x \times (-0.2) = 2(10 - 0.2t) + (-0.2)(3 + 2t) = 19.4 - 0.8t. \]
We can check the result explicitly, because we know (from above) that
$F(t) = 30 + 19.4t - 0.4t^2$ and hence $dF/dt = 19.4 - 0.8t$, which is of course the same
answer.
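The same check can be done numerically. The following is a minimal sketch, not part of the guide: the function names and the step size `h` are illustrative choices. It compares the chain-rule value of dF/dt with a central-difference estimate and with the explicit formula 19.4 − 0.8t.

```python
# Numerical check of Example 5.3 (illustrative sketch; names are our own).

def x(t):
    return 3 + 2 * t

def y(t):
    return 10 - 0.2 * t

def F(t):
    # Composite function F(t) = f(x(t), y(t)) with f(x, y) = x * y.
    return x(t) * y(t)

def chain_rule_derivative(t):
    # dF/dt = (df/dx)(dx/dt) + (df/dy)(dy/dt), where df/dx = y and df/dy = x.
    return y(t) * 2 + x(t) * (-0.2)

def central_difference(func, t, h=1e-6):
    # Numerical estimate of the derivative of func at t.
    return (func(t + h) - func(t - h)) / (2 * h)

for t in [0.0, 1.0, 5.0]:
    # The three printed values should agree up to rounding error.
    print(t, chain_rule_derivative(t), central_difference(F, t), 19.4 - 0.8 * t)
```

The three columns should agree to several decimal places for every t tried.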
Activity 5.5 Suppose that $f(x, y) = x^2 y$ and that $x(t) = 2 + 3t$ and $y(t) = t^2 + 1$. If
$F(t) = f(x(t), y(t))$, find the derivative dF/dt.
5.5 Implicit partial differentiation
An equation $g(x, y) = c$ can, in some cases, be solved to give ‘y as a function of x’. For
example, if $g(x, y)$ is $x^2 - y$ then the equation $g(x, y) = 0$ is
\[ x^2 - y = 0, \quad\text{which gives}\quad y = x^2. \]
In general, we say that an equation $g(x, y) = 0$ defines y implicitly as a function of x if
there is a function y(x) which satisfies the equation for a range of values of x.³ It is
often difficult or impossible to solve the equation $g(x, y) = c$ and find a formula for
y(x). But we can often still find the derivative dy/dx, even if we don’t have an explicit
expression for y in terms of x. In fact, dy/dx can be found simply in terms of the partial
derivatives of g, by the formula
\[ \frac{dy}{dx} = -\frac{\partial g/\partial x}{\partial g/\partial y}. \]
Be careful not to forget the minus sign!
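A short justification of this formula can be sketched as follows. It is not given in the guide, and it assumes that g has continuous partial derivatives and that ∂g/∂y is non-zero at the point in question. Regard g(x, y(x)) as a function of x alone and differentiate both sides of g(x, y(x)) = c using the chain rule of the previous section:

```latex
\[
\frac{d}{dx}\, g\bigl(x, y(x)\bigr)
  = \frac{\partial g}{\partial x}\cdot 1 + \frac{\partial g}{\partial y}\,\frac{dy}{dx}
  = \frac{d}{dx}\, c = 0
\quad\Longrightarrow\quad
\frac{dy}{dx} = -\frac{\partial g/\partial x}{\partial g/\partial y}.
\]
```

As a quick check, for $g(x, y) = x^2 - y$ the formula gives $dy/dx = -(2x)/(-1) = 2x$, which agrees with differentiating $y = x^2$ directly.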
³ See Anthony and Biggs (1996) Sections 12.1 and 12.2.
Example 5.4 Suppose the quantity y is defined as a function of x through the
equation
\[ x^2 y^3 - 6x^3 y^2 + 2xy = 1. \]
Let’s find a general expression for the derivative dy/dx. The equation defining y
implicitly as a function of x is of the form $g(x, y) = 1$ where
$g(x, y) = x^2 y^3 - 6x^3 y^2 + 2xy$. According to the formula given above,
\[ \frac{dy}{dx} = -\frac{\partial g/\partial x}{\partial g/\partial y}. \]
Now,
\[ \frac{\partial g}{\partial x} = 2xy^3 - 18x^2 y^2 + 2y \quad\text{and}\quad \frac{\partial g}{\partial y} = 3x^2 y^2 - 12x^3 y + 2x, \]
so
\[ \frac{dy}{dx} = \frac{18x^2 y^2 - 2xy^3 - 2y}{3x^2 y^2 - 12x^3 y + 2x}. \]
Suppose we wanted to calculate this derivative when x = 1/2. Clearly we first need
to find the corresponding value of y. Putting x = 1/2 into the defining equation
$x^2 y^3 - 6x^3 y^2 + 2xy = 1$, we obtain
\[ \frac{1}{4}y^3 - \frac{3}{4}y^2 + y = 1, \]
or
\[ y^3 - 3y^2 + 4y - 4 = 0. \]
This equation factorises as
\[ (y - 2)(y^2 - y + 2) = 0. \]
(Check this!) The quadratic $y^2 - y + 2$ has negative discriminant and so has no real
zeros. It follows that when x = 1/2, y = 2. Substituting these values into the
expression for dy/dx, we see that the derivative when x = 1/2 is 6.
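The arithmetic in this example can be confirmed exactly. The sketch below is an illustration rather than part of the guide: it uses Python's `fractions.Fraction` to avoid rounding, and the function names `g`, `g_x`, `g_y` and `dy_dx` are our own labels for the quantities in the example.

```python
from fractions import Fraction

def g(x, y):
    # Left-hand side of the defining equation g(x, y) = 1.
    return x**2 * y**3 - 6 * x**3 * y**2 + 2 * x * y

def g_x(x, y):
    # Partial derivative of g with respect to x.
    return 2 * x * y**3 - 18 * x**2 * y**2 + 2 * y

def g_y(x, y):
    # Partial derivative of g with respect to y.
    return 3 * x**2 * y**2 - 12 * x**3 * y + 2 * x

def dy_dx(x, y):
    # Implicit differentiation formula: dy/dx = -(dg/dx) / (dg/dy).
    return -g_x(x, y) / g_y(x, y)

x0, y0 = Fraction(1, 2), Fraction(2)
print(g(x0, y0))      # the point satisfies the equation if this prints 1
print(dy_dx(x0, y0))  # value of the derivative at x = 1/2, y = 2
```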
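The theory can be extended. Suppose that $g(x, y, z) = c$ defines z implicitly as a
function of x and y. Then
\[ \frac{\partial z}{\partial x} = -\frac{\partial g/\partial x}{\partial g/\partial z} \quad\text{and}\quad \frac{\partial z}{\partial y} = -\frac{\partial g/\partial y}{\partial g/\partial z}. \]
5.6 Optimisation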
We now study the maximisation and minimisation of a function of two variables.⁴ As we
have seen, we can think of the equation $z = f(x, y)$ as the equation of a surface in three
dimensions. Earlier in this chapter we plotted the function $f(x, y) = 2x^2 + 2y^2$. We saw
that the resulting surface is bowl-shaped and it has a lowest point at (0, 0, 0). This is an
example of a local minimum.
⁴ See Anthony and Biggs (1996) Chapter 13.
Here is another example. The function $f(x, y) = e^{-(x^2+y^2)}$ has a surface of the following
shape:
[Figure: surface plot of $z = e^{-(x^2+y^2)}$, a single hill with its peak above the origin.]
It has a local maximum when x = y = 0. (The maximum is at the top of the hill on the
surface.)
The local maxima and minima of a function f(x, y) occur at points where the partial
derivatives ∂f/∂x, ∂f/∂y are both equal to 0. Such points are called critical points or
stationary points. A critical point which is neither a local maximum nor a local
minimum is a saddle point. There are various types of saddle point. For example,
consider the function $f(x, y) = x^2 - y^2$. The surface described by this is indeed
‘saddle-shaped’, with a saddle point at (0, 0), as the following diagram shows.
[Figure: surface plot of $z = x^2 - y^2$, with a saddle point at the origin.]
We do not analyse saddle points in detail in this course, but in the applications of
optimisation, it is important to be able to be sure that a critical point is a maximum or
minimum and not a saddle point.
Having determined the critical points, one then uses the following test to determine
whether a critical point is a local maximum, a local minimum or a saddle point. (In
other words, we determine its nature.) Note that when applying this test, all the
derivatives are evaluated at the critical point.
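Suppose that (a, b) is a critical point of f.
If $\dfrac{\partial^2 f}{\partial x^2}\dfrac{\partial^2 f}{\partial y^2} - \left(\dfrac{\partial^2 f}{\partial x\,\partial y}\right)^2 > 0$ and $\dfrac{\partial^2 f}{\partial x^2} < 0$, it is a maximum.
If $\dfrac{\partial^2 f}{\partial x^2}\dfrac{\partial^2 f}{\partial y^2} - \left(\dfrac{\partial^2 f}{\partial x\,\partial y}\right)^2 > 0$ and $\dfrac{\partial^2 f}{\partial x^2} > 0$, it is a minimum.
If $\dfrac{\partial^2 f}{\partial x^2}\dfrac{\partial^2 f}{\partial y^2} - \left(\dfrac{\partial^2 f}{\partial x\,\partial y}\right)^2 < 0$, it is a saddle point.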
This is much more complicated than the corresponding one-variable test, in which we
have a maximum if $d^2 f/dx^2 < 0$ at the critical point, and a minimum if $d^2 f/dx^2 > 0$. It
is not enough just to check the sign of $\partial^2 f/\partial x^2$ or $\partial^2 f/\partial y^2$ (or both): we also need to
check that
\[ \frac{\partial^2 f}{\partial x^2}\frac{\partial^2 f}{\partial y^2} - \left(\frac{\partial^2 f}{\partial x\,\partial y}\right)^2 > 0, \]
which you will sometimes see written as
\[ \frac{\partial^2 f}{\partial x^2}\frac{\partial^2 f}{\partial y^2} > \left(\frac{\partial^2 f}{\partial x\,\partial y}\right)^2. \]
You should note that when
\[ \frac{\partial^2 f}{\partial x^2}\frac{\partial^2 f}{\partial y^2} - \left(\frac{\partial^2 f}{\partial x\,\partial y}\right)^2 = 0, \]
this test fails to classify the nature of the critical point. In this case, some other
technique must be used. For example, if $f(x, y) = x^3 - y^3$ then f has a critical point at
(0, 0), but the test fails to classify it. (In fact, however, it can be seen that (0, 0) is a
saddle point, for $f(x, 0) = x^3$ and this takes both negative and positive values in any
small region around the point (0, 0). Since f(0, 0) = 0, this means that (0, 0) is neither a
maximum nor a minimum.)
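The test described above can be written as a small decision procedure. This is an illustrative sketch only: the function name `classify_critical_point` and its return strings are our own, and the inputs are assumed to be the second-order partial derivatives already evaluated at the critical point.

```python
def classify_critical_point(f_xx, f_yy, f_xy):
    """Apply the second-order test to derivative values at a critical point."""
    d = f_xx * f_yy - f_xy ** 2
    if d > 0 and f_xx < 0:
        return "maximum"
    if d > 0 and f_xx > 0:
        return "minimum"
    if d < 0:
        return "saddle point"
    # d == 0: the test gives no information and another technique is needed.
    return "test fails"

# Second derivatives at (0, 0) for three functions used in this section.
print(classify_critical_point(4, 4, 0))    # f = 2x^2 + 2y^2
print(classify_critical_point(2, -2, 0))   # f = x^2 - y^2
print(classify_critical_point(0, 0, 0))    # f = x^3 - y^3
```

Note that when d > 0 the value of f_xx cannot be zero, so the first two branches cover every case with d > 0.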
Example 5.5 Consider
\[ f(x, y) = 160x - 3x^2 - 2xy - 2y^2 + 120y - 18. \]
Let us find the critical points of f(x, y) and determine their nature. Now,
\[ \frac{\partial f}{\partial x} = 160 - 6x - 2y \quad\text{and}\quad \frac{\partial f}{\partial y} = -2x - 4y + 120. \]
At a critical point, both of these must be 0. So we solve
\[ 6x + 2y = 160 \quad\text{and}\quad 2x + 4y = 120, \]
obtaining x = y = 20. So there is precisely one critical point: the point (20, 20). To
determine its nature, consider the second-order partial derivatives. We have
\[ \frac{\partial^2 f}{\partial x^2} = -6, \quad \frac{\partial^2 f}{\partial x\,\partial y} = -2 \quad\text{and}\quad \frac{\partial^2 f}{\partial y^2} = -4. \]
(Here, these values are constants. Were they functions of x, y, we would now
substitute x = y = 20 to obtain the values of the second-order partial derivatives at
the critical point.) So we have
\[ \frac{\partial^2 f}{\partial x^2}\frac{\partial^2 f}{\partial y^2} - \left(\frac{\partial^2 f}{\partial x\,\partial y}\right)^2 = (-6)(-4) - (-2)^2 = 20 > 0 \quad\text{and}\quad \frac{\partial^2 f}{\partial x^2} = -6 < 0. \]
Therefore the point (20, 20) gives a maximum of f(x, y), and this maximum value is
f(20, 20) = 2782.
Activity 5.6 Show that the function
\[ f(x, y) = 6 + 4x - 3x^2 + 4y + 2xy - 3y^2 \]
has one critical point, and classify the critical point.
Activity 5.7 Let us consider three of the functions we used as examples earlier.
Find and classify the nature of the critical points of the following functions:
\[ f(x, y) = 2x^2 + 2y^2, \qquad g(x, y) = e^{-(x^2+y^2)}, \qquad h(x, y) = x^2 - y^2. \]
Here is a more difficult example.
Example 5.6 Find the critical points of the function
\[ f(x, y) = x^4 + 2x^2 y + 2y^2 + 2y, \]
and determine, for each, whether it is a local maximum, a local minimum, or a
saddle point.
The partial derivatives are
\[ f_x = 4x^3 + 4xy \quad\text{and}\quad f_y = 2x^2 + 4y + 2. \]
We solve $f_x = 0$ and $f_y = 0$. Now, $f_x = 0$ means $x(x^2 + y) = 0$, so x = 0 or $y = -x^2$.
Suppose x = 0. Then, from $f_y = 0$ we have 4y + 2 = 0, so y = −1/2. Now suppose
$y = -x^2$. From $f_y = 0$ we have $2x^2 - 4x^2 + 2 = 0$, which means x = ±1. So the
critical points are (0, −1/2), (1, −1) and (−1, −1). The second derivatives are
\[ f_{xx} = 12x^2 + 4y, \quad f_{xy} = 4x \quad\text{and}\quad f_{yy} = 4. \]
At (0, −1/2), $f_{xx} f_{yy} - f_{xy}^2 = (-2)(4) - 0 = -8 < 0$, so this is a saddle point.
At (1, −1), $f_{xx} f_{yy} - f_{xy}^2 = (8)(4) - 4^2 = 16 > 0$ and $f_{xx} = 8 > 0$, so this is a local minimum.
At (−1, −1), $f_{xx} f_{yy} - f_{xy}^2 = (8)(4) - (-4)^2 = 16 > 0$ and $f_{xx} = 8 > 0$, so this is a local minimum.
It is easy to make mistakes in such examples and not find all of the critical points. For
instance, we might note that the fact that $f_x = 0$ means $4x^3 = -4xy$ and hence,
cancelling x, $x^2 = -y$. But you have to be careful: we can only cancel x if x ≠ 0. So the
possibility x = 0 has to also be considered. This is why we argue, correctly, that $f_x = 0$
means $x(x^2 + y) = 0$, so we have the two possibilities: x = 0 or $y = -x^2$.
Note that in this example we have used the notation $f_{xx}$ and so on. This is often easier
and quicker to write than the $\partial^2 f/\partial x^2$ notations.
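One way to guard against slips in an example like Example 5.6 is to substitute each claimed critical point back into the first-order conditions. The following sketch does this for the function of that example; the function names are illustrative and the derivative formulas are those stated in the example.

```python
# Substitute the claimed critical points of Example 5.6 back into f_x and f_y.

def f_x(x, y):
    return 4 * x**3 + 4 * x * y

def f_y(x, y):
    return 2 * x**2 + 4 * y + 2

def discriminant(x, y):
    # f_xx * f_yy - f_xy^2 with f_xx = 12x^2 + 4y, f_yy = 4 and f_xy = 4x.
    return (12 * x**2 + 4 * y) * 4 - (4 * x) ** 2

for point in [(0, -0.5), (1, -1), (-1, -1)]:
    # Both first derivatives should vanish; the sign of the last value
    # feeds into the second-order test.
    print(point, f_x(*point), f_y(*point), discriminant(*point))
```

A check of this kind confirms that each listed point is a critical point, but it cannot reveal a critical point that was missed, such as the one lost by cancelling x.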
5.7 Applications of optimisation
There are very many problems in management, economics and other areas which
concern the optimisation of functions of several variables, as the following examples
illustrate.
Example 5.7 A data processing company employs both senior and junior
programmers. A particular large project will cost
\[ C(x, y) = 2000 + 2x^3 - 12xy + y^2 \]
dollars, where x and y represent the number of junior and senior programmers used
respectively. How many employees of each kind should be assigned to the project in
order to minimise its cost? What is this minimum cost?
To minimise the cost, we set $C_x = 0$ and $C_y = 0$, obtaining $6x^2 - 12y = 0$ (or
$x^2 - 2y = 0$) and $-12x + 2y = 0$. From the second equation, we have y = 6x.
Substituting this into the equation $x^2 - 2y = 0$, we get $x^2 - 2(6x) = x(x - 12) = 0$,
so x = 12 (or 0) and y = 72 (or 0). Ignoring the (0, 0) case for the moment and
checking the nature of the critical point x = 12, y = 72 we find that
\[ \frac{\partial^2 C}{\partial x^2} = 12x = 144, \quad \frac{\partial^2 C}{\partial x\,\partial y} = -12 \quad\text{and}\quad \frac{\partial^2 C}{\partial y^2} = 2. \]
Thus, at the point (12, 72),
\[ \frac{\partial^2 C}{\partial x^2}\frac{\partial^2 C}{\partial y^2} - \left(\frac{\partial^2 C}{\partial x\,\partial y}\right)^2 = (144)(2) - (-12)^2 = 144 > 0 \quad\text{and}\quad \frac{\partial^2 C}{\partial x^2} > 0. \]
Hence we do have a minimum at x = 12, y = 72. When x = 12 and y = 72, the cost is
\[ C = 2000 + 3456 - 10368 + 5184 = 272. \]
(It is clear that (0, 0) is not the required solution, since there the cost is 2000, which
is larger than this value.)
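The second-order test only establishes a local minimum. As a rough sanity check, one can scan whole-number staffing levels directly. This is a sketch under our own assumptions: the search ranges 0 to 50 for x and 0 to 200 for y are arbitrary illustrative bounds, not part of the example.

```python
def cost(x, y):
    # Project cost from Example 5.7, with x junior and y senior programmers.
    return 2000 + 2 * x**3 - 12 * x * y + y**2

# Brute-force scan over non-negative whole numbers of programmers
# (the upper limits are illustrative choices).
best = min((cost(x, y), x, y) for x in range(0, 51) for y in range(0, 201))
print(best)  # (minimum cost found, x, y)
```

Over this grid the cheapest combination coincides with the critical point found by calculus.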
We now describe the problem of maximising profit for a firm making two products, X
and Y. Generally, if $p_X$ and $p_Y$ are the selling prices of one unit of X and one unit of Y,
then the total revenue obtained by producing amounts x and y is
\[ TR(x, y) = x p_X + y p_Y. \]
There are a number of ways in which the prices $p_X$ and $p_Y$ may be related to the
quantities x and y: they could be fixed constants, for instance, or both could depend on
both of x and y (which would be the case if the goods were related, for example if they
were CDs and cassettes).⁵ The joint total cost function TC(x, y) will tell us how much it
costs the manufacturer to produce x units of X and y of Y. Then, the profit function is
\[ \Pi(x, y) = TR(x, y) - TC(x, y) = x p_X + y p_Y - TC(x, y), \]
and we maximise this function of x and y using the techniques described above.⁶
⁵ See Anthony and Biggs (1996) Chapter 13, for a discussion.
⁶ See Anthony and Biggs (1996) Chapter 13, for many examples.
Example 5.8 Suppose that a firm is the only firm producing X and Y (in other
words, it has a monopoly on the goods) and that the demand for X is given by
\[ x = 2 - 2p_X + p_Y, \]
and the demand for Y is given by
\[ y = 13 + p_X - 2p_Y. \]
(Note that if the price of X is fixed and the price of Y is increased, then the demand
for X will rise and the demand for Y will fall. This is the behaviour one might
expect if X and Y are two different types of chocolate bar, for instance.) Suppose
also that the joint total cost function is $TC(x, y) = 5 + x^2 - xy + y^2$. We may
rearrange the equations to find expressions for $p_X$ and $p_Y$. Multiplying the first
equation by 2 and adding it to the second, we obtain
\[ 2x + y = 2(2 - 2p_X + p_Y) + 13 + p_X - 2p_Y = 17 - 4p_X + p_X = 17 - 3p_X, \]
from which we get $p_X$ as a function of x and y:
\[ p_X(x, y) = \frac{1}{3}(17 - 2x - y). \]
Using this expression for $p_X$, together with the first equation, we can obtain a
similar expression for $p_Y$:
\[ p_Y = x - 2 + 2p_X = x - 2 + \frac{2}{3}(17 - 2x - y) = \frac{1}{3}(28 - x - 2y). \]
The profit function in this case is
\[ \begin{aligned} \Pi(x, y) &= x p_X + y p_Y - TC(x, y) \\ &= \frac{x}{3}(17 - 2x - y) + \frac{y}{3}(28 - x - 2y) - (5 + x^2 - xy + y^2) \\ &= -5 + \frac{17}{3}x + \frac{28}{3}y - \frac{5}{3}x^2 - \frac{5}{3}y^2 + \frac{1}{3}xy, \end{aligned} \]
and we would now maximise this profit in the manner described above.
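The algebra in Example 5.8 can be checked mechanically before moving on to the maximisation. The sketch below is illustrative only: the function names are ours, exact fractions are used to avoid rounding, and the sample points (1, 1) and (4, 2) are arbitrary. It confirms that the inverse demand functions really do invert the demand equations and that the expanded profit expression agrees with revenue minus cost.

```python
from fractions import Fraction

def demand(p_x, p_y):
    # Demand equations for X and Y as given in Example 5.8.
    return 2 - 2 * p_x + p_y, 13 + p_x - 2 * p_y

def inverse_demand(x, y):
    # Prices as functions of quantities, as derived in the example.
    return Fraction(17 - 2 * x - y, 3), Fraction(28 - x - 2 * y, 3)

def profit(x, y):
    # Revenue minus joint total cost.
    p_x, p_y = inverse_demand(x, y)
    return x * p_x + y * p_y - (5 + x**2 - x * y + y**2)

def profit_expanded(x, y):
    # The simplified expression for the profit function.
    return (-5 + Fraction(17, 3) * x + Fraction(28, 3) * y
            - Fraction(5, 3) * x**2 - Fraction(5, 3) * y**2 + Fraction(1, 3) * x * y)

for x, y in [(1, 1), (4, 2)]:
    print(demand(*inverse_demand(x, y)) == (x, y), profit(x, y) == profit_expanded(x, y))
```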
Activity 5.8 Finish the problem started in this example. That is, find the values of
x and y that maximise the profit function
\[ \Pi(x, y) = -5 + \frac{17}{3}x + \frac{28}{3}y - \frac{5}{3}x^2 - \frac{5}{3}y^2 + \frac{1}{3}xy. \]
5.8 Constrained optimisation
Suppose that f(x, y) has to be minimised or maximised subject to the constraint
g(x, y) = 0. This means we want to find the maximum (or minimum) value of the
function f at points (x, y) which satisfy the condition g(x, y) = 0. Then we may use the
method of Lagrange multipliers.⁷ To find the optimal (maximal or minimal) points of
f(x, y) subject to g(x, y) = 0, we first find the critical points of the three-variable
function
\[ L(x, y, \lambda) = f(x, y) - \lambda g(x, y). \]
⁷ See Anthony and Biggs (1996) Chapters 21 and 22.
The function L is known as the Lagrangean (sometimes spelt Lagrangian) and λ is
known as the Lagrange multiplier. (Some texts use f + λg rather than f − λg, but there
are good reasons to use f − λg, and this is the approach we recommend.) In other
words, we find the points at which the first-order conditions
\[ \frac{\partial L}{\partial x} = 0, \quad \frac{\partial L}{\partial y} = 0 \quad\text{and}\quad \frac{\partial L}{\partial \lambda} = 0 \]
are satisfied. Then the theory of Lagrange multipliers asserts that the required optimal
points of f, subject to the constraint, are to be found among these critical points.
Example 5.9 Consider the function f (mentioned earlier)
\[ f(x, y) = 160x - 3x^2 - 2xy - 2y^2 + 120y - 18. \]
Let us find the maximum value of f subject to the constraint
\[ x + y = 34. \]
We write this constraint as g(x, y) = x + y − 34 = 0. Consider then the Lagrangean
\[ \begin{aligned} L(x, y, \lambda) &= f(x, y) - \lambda g(x, y) \\ &= 160x - 3x^2 - 2xy - 2y^2 + 120y - 18 - \lambda(x + y - 34). \end{aligned} \]
We have
\[ \frac{\partial L}{\partial x} = 160 - 6x - 2y - \lambda, \quad \frac{\partial L}{\partial y} = -2x - 4y + 120 - \lambda, \quad \frac{\partial L}{\partial \lambda} = -(x + y - 34). \]
The point (x, y, λ) is a critical point of L if and only if
\[ 160 - 6x - 2y - \lambda = 0, \quad -2x - 4y + 120 - \lambda = 0, \quad x + y - 34 = 0. \]
(Notice that ∂L/∂λ = 0 recovers the constraint g(x, y) = 0.) To solve these, we
adopt a strategy that very often works: we eliminate λ from the first two equations,
determining a relationship between x and y which we then substitute into the third
equation. Explicitly, we have from the first equation that λ = 160 − 6x − 2y, and
from the second equation we have λ = −2x − 4y + 120. These two expressions for λ
must be equal, so 160 − 6x − 2y = −2x − 4y + 120, so that y = 2x − 20. Then the
third equation becomes x + (2x − 20) − 34 = 0, or 3x = 54. Hence x = 18 and
y = 2x − 20 = 16. So we get a constrained maximum value of f of f(18, 16) = 2722.
Earlier, we saw that if no constraint is imposed, then the maximum is at the point
(20, 20). But that point fails to satisfy the constraint in this problem.
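Because the constraint in Example 5.9 is linear, the answer can be cross-checked by substituting y = 34 − x into f and examining the resulting one-variable function. The sketch below is an independent check rather than the Lagrange method itself; scanning whole-number values of x from 0 to 34 is our own illustrative choice, and it works here because the optimum happens to be a whole number.

```python
def f(x, y):
    return 160 * x - 3 * x**2 - 2 * x * y - 2 * y**2 + 120 * y - 18

# On the constraint x + y = 34 we can write y = 34 - x and scan integer x.
best_value, best_x = max((f(x, 34 - x), x) for x in range(0, 35))
best_y = 34 - best_x

# The two expressions for the multiplier from the first-order conditions
# should agree at the optimum.
lam_from_x = 160 - 6 * best_x - 2 * best_y
lam_from_y = -2 * best_x - 4 * best_y + 120

print(best_x, best_y, best_value, lam_from_x, lam_from_y)
```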
Activity 5.9 Use the Lagrange multiplier method to find the values of x, y which
minimise $x^2 + y^2$ subject to the constraint x + y = 1.
5.9 Applications of constrained optimisation
Constrained optimisation is very useful in management, economics and finance. Two
standard types of problem in economics are utility maximisation subject to a budget
constraint, and problems concerning the output of a firm and its capital/labour costs.
Suppose a consumer likes to consume two goods, X and Y . A utility function u(x, y) is
a way of deciding between alternative bundles (that is, combinations) of the two goods.
For example, if u(21, 5) > u(20, 7), then the consumer would prefer to have the bundle
consisting of 21 of X and 5 of Y rather than that comprising 20 of X and 7 of Y .
(Usually, this is all that utility functions tell us. They enable us to rank bundles; that
is, to determine whether one bundle is preferable to another. In general, we should not,
for example, infer from a fact such as u(21, 5) = 2u(20, 7) that the bundle (21, 5) is
‘twice as good’ as (20, 7).) The consumer’s basic problem is to find the ‘best’ (that is,
highest utility-giving) bundle that he or she can afford. Supposing they have a budget
M for X and Y and that the prices of X and Y are $p_X$ and $p_Y$, then the consumer can
only afford bundles (x, y) satisfying $x p_X + y p_Y \le M$. We assume that the quantities can
be bought in fractional amounts, so that we don’t need to consider only values of x and
y that are whole numbers. It’s clear that if the consumer really regards X and Y as
‘goods’ (rather than ‘bads’!) then he or she should spend all of their budget. So the
consumer wants to maximise u(x, y) subject to the budget constraint $x p_X + y p_Y = M$.
This is now a standard constrained optimisation problem.
Example 5.10 Suppose there are two goods with prices $p_X = 2$ and $p_Y = 5$, the
income is M = 40, and the utility function is
\[ u(x, y) = x^{1/3} y^{1/2}. \]
The budget constraint is
\[ 2x + 5y = 40, \]
and the Lagrangean is
\[ L(x, y, \lambda) = x^{1/3} y^{1/2} - \lambda(2x + 5y - 40). \]
We have to solve the three equations
\[ \frac{1}{3}x^{-2/3} y^{1/2} - 2\lambda = 0, \quad \frac{1}{2}x^{1/3} y^{-1/2} - 5\lambda = 0 \quad\text{and}\quad 2x + 5y = 40, \]
for the three unknowns x, y, λ. We employ our standard strategy of using the first
two equations to eliminate λ and find a relationship between x and y. From the first
two equations we get
\[ \lambda = \frac{1}{6}x^{-2/3} y^{1/2} \quad\text{and}\quad \lambda = \frac{1}{10}x^{1/3} y^{-1/2}. \]
Equating these two different expressions for λ, we clearly have, in particular, that
\[ \frac{1}{6}x^{-2/3} y^{1/2} = \frac{1}{10}x^{1/3} y^{-1/2}. \]
This does not look particularly simple, but multiplying both sides by $30x^{2/3} y^{1/2}$
gives 5y = 3x, so it reduces to the simple equation y = 3x/5. Substituting this in the
budget constraint gives
\[ 2x + 5\left(\frac{3}{5}x\right) = 40, \quad\text{that is}\quad 5x = 40. \]
From this we get the optimum values
\[ x^* = 8 \quad\text{and}\quad y^* = 24/5. \]
The corresponding value of λ is $\lambda^* = \sqrt{1/120}$. We don’t really need this to answer
the problem, but as we shall see soon, it can be useful.
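The optimum of Example 5.10 can be confirmed by searching along the budget line. This is an illustrative sketch, not the Lagrange method: the grid step of 0.01 and the function names are our own choices. For each x between 0 and 20 it spends the rest of the budget on Y and picks the x with the highest utility.

```python
import math

def u(x, y):
    # Utility function of Example 5.10.
    return x ** (1 / 3) * y ** (1 / 2)

def y_on_budget(x):
    # Amount of Y affordable after buying x units of X: 2x + 5y = 40.
    return (40 - 2 * x) / 5

# Candidate values of x from 0 to 20 in steps of 0.01 (illustrative grid).
candidates = [i / 100 for i in range(0, 2001)]
best_x = max(candidates, key=lambda x: u(x, y_on_budget(x)))
best_y = y_on_budget(best_x)

# Multiplier from the first first-order condition, for comparison with sqrt(1/120).
lam = (1 / 6) * best_x ** (-2 / 3) * best_y ** (1 / 2)
print(best_x, best_y, lam, math.sqrt(1 / 120))
```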
Now let us give an example of the type of constrained optimisation problem
encountered when one considers a firm.
Example 5.11 A firm’s weekly output is given by the production function
$q(k, l) = k^{3/4} l^{1/4}$, and the unit costs for capital and labour are v = 1 and w = 5 per
week, so that the total cost incurred in using k units of capital and l of labour is
k + 5l. Find the minimum cost of producing a weekly output of 5000 and the
corresponding values of k and l.
The problem to be solved is the constrained optimisation problem
\[ \text{minimise } k + 5l \quad\text{subject to}\quad k^{3/4} l^{1/4} = 5000. \]
The Lagrangean for the problem is
\[ L(k, l, \lambda) = k + 5l - \lambda(k^{3/4} l^{1/4} - 5000), \]
and the optimal values of k and l are the solutions to the three equations
\[ \frac{\partial L}{\partial k} = 1 - \frac{3}{4}\lambda k^{-1/4} l^{1/4} = 0, \]
\[ \frac{\partial L}{\partial l} = 5 - \frac{1}{4}\lambda k^{3/4} l^{-3/4} = 0, \]
\[ \frac{\partial L}{\partial \lambda} = -(k^{3/4} l^{1/4} - 5000) = 0. \]
The first two equations imply that
\[ \lambda = \frac{4}{3}k^{1/4} l^{-1/4} = 20k^{-3/4} l^{3/4}, \]
which simplifies to l = k/15. Substituting this information into the third equation
gives
\[ k^{3/4}\left(\frac{1}{15}\right)^{1/4} k^{1/4} = 5000 \quad\text{so that}\quad k = 5000(15)^{1/4}. \]
Then, $l = k/15 = 5000(15)^{-3/4}$ and the minimum cost is
\[ k + 5l = 5000(15)^{1/4} + 5(5000)(15)^{-3/4} = 100000(15)^{-3/4}, \]
which is approximately 13120.
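The numbers in Example 5.11 can be checked directly. The sketch below is illustrative, with hypothetical helper names `labour_needed` and `cost_at`: it evaluates the claimed optimum, confirms that it produces the required output, and then compares the cost with two nearby input mixes that also produce 5000 units.

```python
# Claimed optimum of Example 5.11.
k = 5000 * 15 ** 0.25
l = k / 15

output = k ** 0.75 * l ** 0.25   # should be 5000 up to rounding
cost = k + 5 * l                 # claimed minimum cost

def labour_needed(capital):
    # Labour required to produce 5000 units with the given capital,
    # obtained by solving capital^(3/4) * l^(1/4) = 5000 for l.
    return (5000 / capital ** 0.75) ** 4

def cost_at(capital):
    # Total cost when output is held at 5000 units.
    return capital + 5 * labour_needed(capital)

print(output, cost)
print(cost_at(0.99 * k), cost_at(k), cost_at(1.01 * k))
```

Both neighbouring input mixes should cost slightly more than the claimed optimum, as expected at a minimum.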
Here is another type of problem concerning a firm, which this time involves not capital
and labour costs, but raw material costs.
Example 5.12 A firm manufactures a good from two raw materials, X and Y. The
quantity of its good which is produced from x units of X and y of Y is given by
$Q(x, y) = x^{1/4} y^{3/4}$. If the firm spends no more than $1280 each week on the raw
materials, what is its maximum possible weekly production, given that one unit of X
costs $16 and one unit of Y costs $1?
The problem here is to maximise Q(x, y) subject to the constraint that the amount
spent on raw materials is at most $1280. Clearly, the optimal values of x and y will
satisfy the constraint 16x + y = 1280. The Lagrangean is
\[ L(x, y, \lambda) = x^{1/4} y^{3/4} - \lambda(16x + y - 1280), \]
and the equations to solve are
\[ \frac{\partial L}{\partial x} = \frac{1}{4}x^{-3/4} y^{3/4} - 16\lambda = 0, \]
\[ \frac{\partial L}{\partial y} = \frac{3}{4}x^{1/4} y^{-1/4} - \lambda = 0, \]
\[ \frac{\partial L}{\partial \lambda} = 1280 - 16x - y = 0. \]
As in the previous example, we eliminate λ from the first two equations to obtain a
relationship between the key variables x and y. From the first equation,
\[ \lambda = \frac{1}{64}x^{-3/4} y^{3/4}, \]
and from the second,
\[ \lambda = \frac{3}{4}x^{1/4} y^{-1/4}. \]
We therefore have
\[ \frac{1}{64}x^{-3/4} y^{3/4} = \frac{3}{4}x^{1/4} y^{-1/4}, \]
which, on moving all the y terms to the left and the x terms to the right (by
cross-multiplication), simplifies to y = 48x. Then, the third equation implies
64x = 1280, so that x = 20 and y = 48x = 960. The maximum quantity is therefore
\[ Q(20, 960) = (20)^{1/4}(960)^{3/4}. \]
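A quick search along the budget line gives an independent check of Example 5.12. This is a sketch under our own assumptions: only whole-number values of x from 0 to 80 are scanned, which is enough here because the optimum is a whole number, and the remaining budget is spent on Y.

```python
def Q(x, y):
    # Production function of Example 5.12.
    return x ** 0.25 * y ** 0.75

# Spend the whole budget: y = 1280 - 16x, scanning whole-number values of x.
best_q, best_x = max((Q(x, 1280 - 16 * x), x) for x in range(0, 81))
best_y = 1280 - 16 * best_x

print(best_x, best_y, best_q)
```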
5.10 The meaning of the Lagrange multiplier
The Lagrange multiplier has a useful interpretation in many applications. Formally, let
us suppose that the constrained optimisation problem is to maximise a function f(x, y)
subject to a constraint of the form h(x, y) = a where a is a constant. Then, provided f
and h are ‘well-behaved’, the value of the Lagrange multiplier is the rate of change of
the maximum value of f with respect to a. Explicitly, if $x^*(a)$ and $y^*(a)$ are the
values of x and y that maximise f when the constraint is h(x, y) = a, then the maximum
value of f is $f^*(a) = f(x^*(a), y^*(a))$, and it turns out that the value $\lambda^*(a)$ of the Lagrange
multiplier satisfies:
\[ \lambda^*(a) = \frac{\partial f^*}{\partial a}. \]
An interesting example of this occurs in the problem of the consumer maximising his or
her utility.⁸ Here, the problem is to maximise a utility function, $u(x_1, x_2)$, subject to a
budget constraint $p_1 x_1 + p_2 x_2 = M$. The utility of the optimum bundle $x^* = (x_1^*, x_2^*)$ is
$u(x_1^*, x_2^*)$. Since each $x_i^*$ is a function of the prices $p_1$, $p_2$ and the income M, so also is
the optimal utility. Using the notation $x_i^* = q_i(p_1, p_2, M)$, we have
\[ u(x^*) = u(q_1(p_1, p_2, M), q_2(p_1, p_2, M)) = V(p_1, p_2, M), \]
say. The function V is called the indirect utility function. It specifies the individual
consumer’s optimal utility when the prices are $p_1$, $p_2$ and the income is M. The partial
derivative ∂V/∂M is the marginal utility of income. It tells us what change in optimal
utility will result from a small change in income, given that prices remain constant. The
value, $\lambda^*$, of the Lagrange multiplier satisfies:
\[ \lambda^* = \frac{\partial V}{\partial M}. \]
⁸ See Anthony and Biggs (1996) Sections 22.4 and 22.5.
Example 5.13 We’ll now work through an example in detail to show that the
above theory works. Suppose a consumer has utility function $u(x_1, x_2) = x_1^{1/3} x_2^{1/2}$,
and budget constraint $p_1 x_1 + p_2 x_2 = M$. Then the Lagrangean is
\[ L(x_1, x_2, \lambda) = x_1^{1/3} x_2^{1/2} - \lambda(p_1 x_1 + p_2 x_2 - M). \]
The equations we need to solve are
\[ \frac{\partial L}{\partial x_1} = \frac{1}{3}x_1^{-2/3} x_2^{1/2} - \lambda p_1 = 0, \]
\[ \frac{\partial L}{\partial x_2} = \frac{1}{2}x_1^{1/3} x_2^{-1/2} - \lambda p_2 = 0, \]
\[ \frac{\partial L}{\partial \lambda} = -(p_1 x_1 + p_2 x_2 - M) = 0. \]
From the first two equations, we have
\[ \lambda = \frac{1}{3p_1}x_1^{-2/3} x_2^{1/2} = \frac{1}{2p_2}x_1^{1/3} x_2^{-1/2}, \]
which gives
\[ x_2 = \frac{3p_1}{2p_2}x_1. \]
By the third equation, $p_1 x_1 + p_2 x_2 = M$, so
\[ p_1 x_1 + p_2\,\frac{3p_1}{2p_2}x_1 = M, \]
and hence
\[ \frac{5p_1}{2}x_1 = M \quad\text{so that}\quad x_1 = \frac{2M}{5p_1}. \]
Thus, the optimising values of $x_1$, $x_2$ are
\[ x_1^* = q_1(p_1, p_2, M) = \frac{2M}{5p_1} \quad\text{and}\quad x_2^* = q_2(p_1, p_2, M) = \frac{3p_1}{2p_2}x_1^* = \frac{3M}{5p_2}. \]
The indirect utility function is
\[ V(p_1, p_2, M) = (x_1^*)^{1/3}(x_2^*)^{1/2} = \left(\frac{2M}{5p_1}\right)^{1/3}\left(\frac{3M}{5p_2}\right)^{1/2} = \frac{2^{1/3}\,3^{1/2} M^{5/6}}{5^{5/6}\,p_1^{1/3}\,p_2^{1/2}}. \]
It follows that
\[ \frac{\partial V}{\partial M} = \frac{5}{6}\,\frac{2^{1/3}\,3^{1/2} M^{-1/6}}{5^{5/6}\,p_1^{1/3}\,p_2^{1/2}}. \]
According to the theory presented above, this should equal the Lagrange multiplier.
Now, by the equations arising from the first-order conditions, we have
\[ \lambda^* = \frac{1}{3p_1}(x_1^*)^{-2/3}(x_2^*)^{1/2}, \]
and you can verify for yourself that this is exactly the same as ∂V/∂M.
How would we use this theory? Well, having worked through a particular constrained
optimisation problem, say a utility maximisation problem, this interpretation of the
Lagrange multiplier can help us estimate the change in the maximum utility if a small
change in income is made. For example, suppose there are two goods with prices
$p_1 = 2$, $p_2 = 5$, and the utility function is $x_1^{1/3} x_2^{1/2}$ (as above). When the income is
M = 40, working through a Lagrangean calculation (as we did earlier in this chapter),
we would find that the maximum utility is u(8, 24/5), which is about 4.38, and that the
value of the Lagrange multiplier is $\sqrt{1/120}$. Now suppose the consumer’s income rises
to 42. What would the new maximum utility be? We could work through a complete
Lagrangean calculation again, but there is no need if we are happy with an approximate
answer. A good approximate answer is obtained by using the fact that $\lambda^* = \sqrt{1/120}$.
Since
\[ \lambda^* = \frac{\partial V}{\partial M}, \]
and M is increased from 40 to 42, the change in M is ∆M = 2, and the change in
maximum utility is given approximately as:
\[ \Delta V \simeq \lambda^* \Delta M = \sqrt{1/120} \times 2, \]
which is approximately 0.18. So when the income increases from 40 to 42 the maximum
utility increases approximately from 4.38 to 4.56.
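To see how good this approximation is, the estimate can be compared with the exact maximum utility at M = 42, computed from the optimal bundle found in Example 5.13. The following is an illustrative sketch; the function names `indirect_utility` and `multiplier` are our own.

```python
def indirect_utility(p1, p2, m):
    # V(p1, p2, M): utility of the optimal bundle x1* = 2M/(5 p1), x2* = 3M/(5 p2).
    x1 = 2 * m / (5 * p1)
    x2 = 3 * m / (5 * p2)
    return x1 ** (1 / 3) * x2 ** (1 / 2)

def multiplier(p1, p2, m):
    # Lagrange multiplier at the optimum, from the first first-order condition.
    x1 = 2 * m / (5 * p1)
    x2 = 3 * m / (5 * p2)
    return x1 ** (-2 / 3) * x2 ** (1 / 2) / (3 * p1)

v40 = indirect_utility(2, 5, 40)
lam = multiplier(2, 5, 40)

estimate = v40 + lam * 2            # approximation using the multiplier
exact = indirect_utility(2, 5, 42)  # full recalculation at M = 42

print(v40, lam, estimate, exact)
```

The estimate and the exact value should differ only in the third decimal place, which illustrates why the multiplier is a convenient shortcut for small changes in income.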
Learning outcomes
At the end of this chapter and the relevant reading, you should be able to:
explain the concept of a function of many variables
calculate partial derivatives
use the chain rule for partial differentiation to find total derivatives
find and classify stationary/critical points
solve optimisation and constrained optimisation problems
interpret the meaning of the Lagrange multiplier, λ.
You do not need to know how to verify the nature of a critical point obtained using the
Lagrange multiplier method. (Thus, if, for example, a question asks you to maximise
subject to a constraint and there turns out to be just one point satisfying the
Lagrangean first-order conditions, then you may assume that the point is indeed a
maximum.)
Sample examination/practice questions
Question 5.1
Suppose that a firm has production function $q(k, l) = Ak^{\alpha} l^{1-\alpha}$ where A > 0 and
0 < α < 1. Show that the marginal product of labour ∂q/∂l is positive, and that it is a
decreasing function of l when k is fixed.
Question 5.2
A monopoly manufactures two goods, X and Y, with demand functions
\[ x = 12 - p_X \quad\text{and}\quad y = 18 - p_Y. \]
The firm’s cost function is $C(x, y) = x^2 + y^2 + 2xy$. Find the maximum profit
achievable, and the quantities produced of each of X and Y in order to achieve this.
Question 5.3
A firm manufactures two products, X and Y, and sells these in related markets.
Suppose that the firm is the only producer of X and Y and that the inverse demand
functions for X and Y are
\[ p_X = 13 - 2x - y \quad\text{and}\quad p_Y = 13 - x - 2y. \]
Determine the production levels that maximise profit, given that the cost function is
C(x, y) = x + y.
Question 5.4
Use the technique of Lagrange multipliers to find the values of x and y which maximise
the function $3\sqrt{x} + 4\sqrt{y}$, subject to the constraint x + y = 100.
Question 5.5
A firm manufactures a good from two raw materials, X and Y. The quantity of the
good which is produced from x units of X and y of Y is given by
\[ Q(x, y) = (\sqrt{x} + 2\sqrt{y})^2. \]
Each unit of X costs the firm $2 and each unit of Y costs $1. Find the minimum cost of
producing 100 units of the manufactured good.
Question 5.6
A consumer buys two goods, X and Y . The price of one unit of X is $1 and the price of
one unit of Y is $16. The consumer’s utility function, which describes how she values x
units of X and y units of Y , is given by
\[ u(x, y) = x^{3/4} y^{1/4}. \]
She has a budget of $1280 in total each year to spend on X and Y . Using the method of
Lagrange multipliers, find the values of x, y which will maximise the consumer’s utility
function u(x, y) subject to the constraint on her budget. Use the value of the Lagrange
multiplier to estimate the increase in the maximum obtainable utility if the consumer’s
budget for the goods rises to $1282.
Question 5.7
A firm has production function $q(k, l) = 50k^{2/3} l^{1/3}$, and unit capital and labour costs of
6 and 4, respectively, so that the total cost incurred when using k units of capital and l
of labour is 6k + 4l. What is the maximum weekly output achievable if the firm spends
no more than 1000 a week on capital and labour?
Question 5.8
A student has a part-time job in a restaurant. For this she is paid $8 per hour. Her
utility function for earning $I and spending S hours studying is
\[ u(I, S) = I^{1/4} S^{3/4}. \]
(The utility function is a measure of the ‘usefulness’ or ‘worth’ to the student of a
certain combination of money and study time.) The total amount of time she spends
each week working in the restaurant and studying is 100 hours. How should she divide
up her time in order to maximise her utility?
Answers to activities
Feedback to activity 5.1
Think about what would happen if we were to slice the surface parallel to the
(x, y)-plane at a height c > 0 above the plane. Then we would have the set of points
with z-coordinate c. But this means, since $z = \sqrt{x^2 + y^2}$, that all the points (x, y, c) in
this cross-section would satisfy $\sqrt{x^2 + y^2} = c$, so $x^2 + y^2 = c^2$. This last equation is the
equation of a circle, centred on the origin and of radius c. So slicing through the surface
at z = c and examining the shape of the section obtained, it is a circle of radius c. It
follows that the surface is a cone. It would look something like the following.
[Figure: a cone with its vertex at the origin, opening upwards.]
This is quite tricky. However, you do not need to be able to determine the shape of
surfaces. I’ve included this activity simply to help you think about them geometrically.
Feedback to activity 5.2
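We have $f(x, y) = x^3 y - xy^{-1}$, so
\[ \frac{\partial f}{\partial x} = 3x^2 y - y^{-1} = 3x^2 y - \frac{1}{y}, \]
\[ \frac{\partial f}{\partial y} = x^3 - x\left(-\frac{1}{y^2}\right) = x^3 + \frac{x}{y^2}. \]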
Feedback to activity 5.3
We already have the first-order derivatives. We calculate the second-order derivatives as
follows. First,
\[ \frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\left(3x^2 y - y^{-1}\right) = 6xy, \]
and
\[ \frac{\partial^2 f}{\partial y^2} = \frac{\partial}{\partial y}\left(x^3 + xy^{-2}\right) = -2xy^{-3} = -\frac{2x}{y^3}. \]
We can calculate the remaining second-order derivative in two ways: we can either
differentiate ∂f/∂x with respect to y, or we can differentiate ∂f/∂y with respect to x.
We need only do one of these, but let’s just check they both give the same answer.
\[ \frac{\partial}{\partial y}\left(3x^2 y - \frac{1}{y}\right) = 3x^2 + \frac{1}{y^2}, \]
and
\[ \frac{\partial}{\partial x}\left(x^3 + \frac{x}{y^2}\right) = 3x^2 + \frac{1}{y^2}, \]
so (as expected) they are equal, and
\[ \frac{\partial^2 f}{\partial x\,\partial y} = 3x^2 + \frac{1}{y^2} = \frac{\partial^2 f}{\partial y\,\partial x}. \]
Feedback to activity 5.4
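We find that
\[ \frac{\partial f}{\partial x} = \frac{3}{4}x^{-1/4} y^{1/4} \quad\text{and}\quad \frac{\partial f}{\partial y} = \frac{1}{4}x^{3/4} y^{-3/4}. \]
Then,
\[ \frac{\partial^2 f}{\partial x^2} = -\frac{3}{16}x^{-5/4} y^{1/4}, \quad \frac{\partial^2 f}{\partial y^2} = -\frac{3}{16}x^{3/4} y^{-7/4}, \]
and
\[ \frac{\partial^2 f}{\partial x\,\partial y} = \frac{3}{16}x^{-1/4} y^{-3/4} = \frac{\partial^2 f}{\partial y\,\partial x}. \]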
Feedback to activity 5.5
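By the chain rule,
\[ \begin{aligned} \frac{dF}{dt} &= \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} \\ &= 2xy(3) + x^2(2t) \\ &= 6(2 + 3t)(t^2 + 1) + 2t(2 + 3t)^2 \\ &= 36t^3 + 36t^2 + 26t + 12. \end{aligned} \]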
Of course, one could also directly find an expression for F in terms of t and differentiate
this (giving the same answer).
Feedback to activity 5.6
We have $f(x, y) = 6 + 4x - 3x^2 + 4y + 2xy - 3y^2$ and to find the critical point(s) we solve
\[ \frac{\partial f}{\partial x} = 4 - 6x + 2y = 0, \]
\[ \frac{\partial f}{\partial y} = 4 + 2x - 6y = 0. \]
We therefore have to solve simultaneously the equations 6x − 2y = 4 and 6y − 2x = 4.
The first says y = 3x − 2 which, using the second, means that 6(3x − 2) − 2x = 4 or
16x = 16. Therefore x = 1, and y = 3x − 2 = 1. So there is a single critical point, (1, 1).
To determine its nature we need the second-order derivatives. We have
\[ \frac{\partial^2 f}{\partial x^2} = -6, \quad \frac{\partial^2 f}{\partial x\,\partial y} = 2 \quad\text{and}\quad \frac{\partial^2 f}{\partial y^2} = -6, \]
and it is clear that
\[ \frac{\partial^2 f}{\partial x^2}\frac{\partial^2 f}{\partial y^2} - \left(\frac{\partial^2 f}{\partial x\,\partial y}\right)^2 > 0 \quad\text{and}\quad \frac{\partial^2 f}{\partial x^2} < 0, \]
so the critical point is a local maximum.
Feedback to activity 5.7
To find the critical points of f we solve
\[ \frac{\partial f}{\partial x} = 4x = 0 \quad\text{and}\quad \frac{\partial f}{\partial y} = 4y = 0, \]
so there is just one critical point, (0, 0). The second-order derivatives are
\[ \frac{\partial^2 f}{\partial x^2} = 4, \quad \frac{\partial^2 f}{\partial x\,\partial y} = 0 \quad\text{and}\quad \frac{\partial^2 f}{\partial y^2} = 4, \]
so
\[ \frac{\partial^2 f}{\partial x^2}\frac{\partial^2 f}{\partial y^2} - \left(\frac{\partial^2 f}{\partial x\,\partial y}\right)^2 > 0 \quad\text{and}\quad \frac{\partial^2 f}{\partial x^2} > 0, \]
hence (0, 0) is a minimum. (Actually, this is clear anyway, without using any fancy
calculus: we know that $x^2 \ge 0$ and equals 0 only when x = 0, and similarly for $y^2$, so
f(x, y) ≥ 0 for all (x, y), and f(x, y) = 0 only when (x, y) = (0, 0), so we see that this
gives a minimum. Easy as this is, the point here is to demonstrate the technique, which
will work when matters are less obvious.)
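For g, we have
\[ \frac{\partial g}{\partial x} = -2xe^{-(x^2+y^2)} = 0 \quad\text{and}\quad \frac{\partial g}{\partial y} = -2ye^{-(x^2+y^2)} = 0, \]
so (since the exponential term is always positive), there is just one critical point, (0, 0).
The second-order derivatives are
\[ \frac{\partial^2 g}{\partial x^2} = -2e^{-(x^2+y^2)} + 4x^2 e^{-(x^2+y^2)} = (4x^2 - 2)e^{-(x^2+y^2)}, \]
and, similarly,
\[ \frac{\partial^2 g}{\partial y^2} = (4y^2 - 2)e^{-(x^2+y^2)}. \]
Also,
\[ \frac{\partial^2 g}{\partial x\,\partial y} = 4xye^{-(x^2+y^2)}. \]
We now need to evaluate these at the critical point, so we substitute x = y = 0,
obtaining
\[ \frac{\partial^2 g}{\partial x^2}(0, 0) = -2, \quad \frac{\partial^2 g}{\partial x\,\partial y}(0, 0) = 0 \quad\text{and}\quad \frac{\partial^2 g}{\partial y^2}(0, 0) = -2, \]
so
\[ \frac{\partial^2 g}{\partial x^2}\frac{\partial^2 g}{\partial y^2} - \left(\frac{\partial^2 g}{\partial x\,\partial y}\right)^2 > 0 \quad\text{and}\quad \frac{\partial^2 g}{\partial x^2} < 0, \]
hence (0, 0) is a maximum.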
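Finally, for the function $h(x, y) = x^2 - y^2$, the critical points are given by
\[ \frac{\partial h}{\partial x} = 2x = 0 \quad\text{and}\quad \frac{\partial h}{\partial y} = -2y = 0, \]
so again there is a unique critical point, (0, 0). The second derivatives are
\[ \frac{\partial^2 h}{\partial x^2} = 2, \quad \frac{\partial^2 h}{\partial x\,\partial y} = 0 \quad\text{and}\quad \frac{\partial^2 h}{\partial y^2} = -2, \]
so
\[ \frac{\partial^2 h}{\partial x^2}\frac{\partial^2 h}{\partial y^2} - \left(\frac{\partial^2 h}{\partial x\,\partial y}\right)^2 = -4 < 0, \]
and hence (0, 0) is a saddle point in this case.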
Feedback to activity 5.8
It’s clear that the problem of maximising Π is the same as that of maximising
f (x, y) = 3Π(x, y), the advantage of working with f being that the constants involved
are simpler. Now,
\[
f(x, y) = -15 + 17x + 28y - 5x^2 - 5y^2 + xy.
\]
We solve
\[
\frac{\partial f}{\partial x} = 17 - 10x + y = 0 \quad\text{and}\quad
\frac{\partial f}{\partial y} = 28 - 10y + x = 0.
\]
That is,
\[
10x - y = 17 \quad\text{and}\quad -x + 10y = 28.
\]
Multiplying the first equation by 10 gives 100x − 10y = 170. Adding this to the second
equation gives 99x = 198, so x = 2, and y = 10x − 17 = 3. There is therefore one critical
point, (2, 3). We should check that this does indeed maximise profit. As usual, we use
the second derivatives. The second derivatives of f are
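\[
\frac{\partial^2 f}{\partial x^2} = -10, \quad
\frac{\partial^2 f}{\partial x\,\partial y} = 1 \quad\text{and}\quad
\frac{\partial^2 f}{\partial y^2} = -10.
\]
So,
\[
\frac{\partial^2 f}{\partial x^2}\,\frac{\partial^2 f}{\partial y^2}
- \left(\frac{\partial^2 f}{\partial x\,\partial y}\right)^2 > 0
\quad\text{and}\quad
\frac{\partial^2 f}{\partial x^2} < 0,
\]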
and hence (2, 3) is a maximum. Therefore the optimal production levels are x = 2 and
y = 3.
Feedback to activity 5.9
The constraint can be written as x + y − 1 = 0 and so the Lagrangian is
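\[
L(x, y, \lambda) = x^2 + y^2 - \lambda(x + y - 1).
\]
The first-order conditions are
\[
\frac{\partial L}{\partial x} = 2x - \lambda = 0, \quad
\frac{\partial L}{\partial y} = 2y - \lambda = 0 \quad\text{and}\quad
\frac{\partial L}{\partial \lambda} = 1 - x - y = 0.
\]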
From the first two, we have λ = 2x = 2y, so x = y. The third equation then gives
x + x = 1, so x = 1/2 and y = 1/2.
Answers to Sample examination/practice questions
Answer to question 5.1
We have
\[
\frac{\partial q}{\partial l} = A(1 - \alpha)\,k^{\alpha}\,l^{-\alpha}.
\]
This is clearly positive, since we are given that A > 0 and 1 − α > 0. Furthermore, as l
increases then, for fixed k, \(l^{-\alpha}\) decreases, since α > 0. It follows that the marginal
product of labour decreases with l.
There is another way of verifying that it decreases. The rate of change of ∂q/∂l with
respect to l is its derivative, in other words, the second derivative \(\partial^2 q/\partial l^2\). By the usual
rules we get
\[
\begin{aligned}
\frac{\partial^2 q}{\partial l^2}
&= \frac{\partial}{\partial l}\left(A(1 - \alpha)\,k^{\alpha}\,l^{-\alpha}\right) \\
&= A(1 - \alpha)(-\alpha)\,k^{\alpha}\,l^{-\alpha-1} \\
&= -A\alpha(1 - \alpha)\,k^{\alpha}\,l^{-\alpha-1}.
\end{aligned}
\]
Because A > 0, α > 0 and 1 − α > 0, this is negative, from which it follows that the
marginal product of labour is a decreasing function of l.
Answer to question 5.2
The profit function is
\[
\Pi(x, y) = TR - TC = x p_X + y p_Y - (x^2 + y^2 + 2xy).
\]
Now, we want this to be written as a function of the variables x and y, so \(p_X\) and \(p_Y\)
have to be rewritten in terms of x and y. Since \(p_X = 12 - x\) and \(p_Y = 18 - y\), we have
\[
\begin{aligned}
\Pi(x, y) &= x(12 - x) + y(18 - y) - (x^2 + y^2 + 2xy) \\
&= 12x + 18y - 2x^2 - 2y^2 - 2xy.
\end{aligned}
\]
To find the critical points, we solve
\[
\frac{\partial \Pi}{\partial x} = 12 - 4x - 2y = 0 \quad\text{and}\quad
\frac{\partial \Pi}{\partial y} = 18 - 4y - 2x = 0.
\]
That is,
2x + y = 6 and x + 2y = 9.
Multiplying the first equation by 2 and subtracting the second shows 3x = 3, so x = 1.
Corresponding to this, y = 4. There is therefore one critical point, (1, 4). We should
check that this does indeed maximise profit. As usual, we use the second derivatives.
The second derivatives of Π are
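\[
\frac{\partial^2 \Pi}{\partial x^2} = -4, \quad
\frac{\partial^2 \Pi}{\partial x\,\partial y} = -2 \quad\text{and}\quad
\frac{\partial^2 \Pi}{\partial y^2} = -4.
\]
So,
\[
\frac{\partial^2 \Pi}{\partial x^2}\,\frac{\partial^2 \Pi}{\partial y^2}
- \left(\frac{\partial^2 \Pi}{\partial x\,\partial y}\right)^2 > 0
\quad\text{and}\quad
\frac{\partial^2 \Pi}{\partial x^2} < 0,
\]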
hence (1, 4) is a maximum. Therefore the optimal production levels are x = 1 and y = 4,
and the maximum achievable profit is Π(1, 4) = 42.
Answer to question 5.3
The profit function is
\[
\begin{aligned}
\Pi(x, y) &= TR - TC \\
&= x p_X + y p_Y - (x + y) \\
&= x(13 - 2x - y) + y(13 - x - 2y) - (x + y) \\
&= 12x + 12y - 2x^2 - 2y^2 - 2xy.
\end{aligned}
\]
To find the critical points, we solve
\[
\frac{\partial \Pi}{\partial x} = 12 - 4x - 2y = 0 \quad\text{and}\quad
\frac{\partial \Pi}{\partial y} = 12 - 4y - 2x = 0.
\]
That is,
2x + y = 6 and x + 2y = 6.
Multiplying the first equation by 2 and subtracting the second shows 3x = 6, so x = 2.
Corresponding to this, y = 2. There is therefore one critical point, (2, 2). We should
check that this does indeed maximise profit. As usual, we use the second derivatives.
The second derivatives of Π are
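\[
\frac{\partial^2 \Pi}{\partial x^2} = -4, \quad
\frac{\partial^2 \Pi}{\partial x\,\partial y} = -2 \quad\text{and}\quad
\frac{\partial^2 \Pi}{\partial y^2} = -4.
\]
So,
\[
\frac{\partial^2 \Pi}{\partial x^2}\,\frac{\partial^2 \Pi}{\partial y^2}
- \left(\frac{\partial^2 \Pi}{\partial x\,\partial y}\right)^2 > 0
\quad\text{and}\quad
\frac{\partial^2 \Pi}{\partial x^2} < 0,
\]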
hence (2, 2) is a maximum. Therefore the optimal production levels are x = 2 and y = 2.
Answer to question 5.4
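The function to be optimised is \(3\sqrt{x} + 4\sqrt{y}\) and the constraint equation g(x, y) = 0 is
x + y − 100 = 0. The Lagrangean is therefore
\[
L(x, y, \lambda) = 3\sqrt{x} + 4\sqrt{y} - \lambda(x + y - 100).
\]
We now solve
\[
\begin{aligned}
\frac{\partial L}{\partial x} &= \frac{3}{2\sqrt{x}} - \lambda = 0, \\
\frac{\partial L}{\partial y} &= \frac{2}{\sqrt{y}} - \lambda = 0, \\
\frac{\partial L}{\partial \lambda} &= 100 - x - y = 0.
\end{aligned}
\]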
From the first two equations, we obtain two expressions for λ,
\[
\lambda = \frac{3}{2\sqrt{x}} = \frac{2}{\sqrt{y}},
\]
so \(\sqrt{y} = 4\sqrt{x}/3\) and hence y = 16x/9. Now the third equation tells us
\[
x + \frac{16x}{9} = 100,
\]
or 25x/9 = 100 and therefore the optimal values of x and y are x = 36 and
y = 16(36)/9 = 64.
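As a rough numerical check of this answer, one can evaluate the objective at every whole-number point on the constraint line. The following Python sketch assumes the problem is a maximisation (the question itself is not reproduced in this section); the names `objective` and `best_x` are illustrative only.

```python
import math


def objective(x):
    """Value of 3*sqrt(x) + 4*sqrt(y), with y = 100 - x forced by the constraint."""
    return 3 * math.sqrt(x) + 4 * math.sqrt(100 - x)


# Scan the whole-number points on the constraint line x + y = 100.
best_x = max(range(0, 101), key=objective)
print(best_x, 100 - best_x, objective(best_x))  # prints: 36 64 50.0
```

A scan of this kind only samples the constraint line, so it supports the Lagrangean calculation rather than replacing it.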
Answer to question 5.5
Be careful here about what is the constraint and what is the function to be optimised.
Reading the question carefully, you will see that we have to minimise the cost. Now, the
cost will be 2x + y since X costs $2 and Y costs $1 per unit. What’s the constraint? We
have to produce 100 units, so Q(x, y) = 100, which is \((\sqrt{x} + 2\sqrt{y})^2 = 100\). (At this
point, you could notice that this is equivalent to the simpler constraint \(\sqrt{x} + 2\sqrt{y} = 10\),
but let’s imagine we haven’t been quite that clever and proceed without this
observation.) So, the Lagrangean is
\[
L(x, y, \lambda) = 2x + y - \lambda\left((\sqrt{x} + 2\sqrt{y})^2 - 100\right).
\]
The first-order conditions are
\[
\begin{aligned}
\frac{\partial L}{\partial x} &= 2 - \lambda\,\frac{\sqrt{x} + 2\sqrt{y}}{\sqrt{x}} = 0, \\
\frac{\partial L}{\partial y} &= 1 - \lambda\,\frac{2(\sqrt{x} + 2\sqrt{y})}{\sqrt{y}} = 0, \\
\frac{\partial L}{\partial \lambda} &= 100 - (\sqrt{x} + 2\sqrt{y})^2 = 0.
\end{aligned}
\]
This is more complicated than the previous examples, but we shall employ exactly the
same technique, namely using the first two equations to eliminate λ and find a
relationship between x and y. From the first two equations, we obtain two expressions
for λ,
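\[
\lambda = \frac{2\sqrt{x}}{\sqrt{x} + 2\sqrt{y}} = \frac{\sqrt{y}}{2(\sqrt{x} + 2\sqrt{y})},
\]
which means (cancelling the \(\sqrt{x} + 2\sqrt{y}\) factor),
\[
2\sqrt{x} = \frac{\sqrt{y}}{2},
\]
so \(\sqrt{y} = 4\sqrt{x}\) and hence y = 16x. Now the third equation tells us
\[
100 = (\sqrt{x} + 2\sqrt{y})^2 = (\sqrt{x} + 2\sqrt{16x})^2 = (\sqrt{x} + 8\sqrt{x})^2 = 81x,
\]
so x = 100/81 and y = 16x = 1600/81. The corresponding minimum cost is
\[
2x + y = 2\left(\frac{100}{81}\right) + \frac{1600}{81} = \frac{1800}{81} = \frac{200}{9}.
\]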
Answer to question 5.6
The budget constraint is x + 16y = 1280 and so the Lagrangean is
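\[
L(x, y, \lambda) = x^{3/4} y^{1/4} - \lambda(x + 16y - 1280).
\]
The first-order conditions are
\[
\begin{aligned}
\frac{\partial L}{\partial x} &= \frac{3}{4}\,x^{-1/4} y^{1/4} - \lambda = 0, \\
\frac{\partial L}{\partial y} &= \frac{1}{4}\,x^{3/4} y^{-3/4} - 16\lambda = 0, \\
\frac{\partial L}{\partial \lambda} &= 1280 - x - 16y = 0.
\end{aligned}
\]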
From the first two equations, we obtain two expressions for λ,
\[
\lambda = \frac{3}{4}\,x^{-1/4} y^{1/4} = \frac{1}{64}\,x^{3/4} y^{-3/4},
\]
which means that y = x/48. Now the third equation gives
\[
x + 16\left(\frac{x}{48}\right) = \frac{4}{3}\,x = 1280,
\]
so x = 960 and y = 960/48 = 20.
Now, with the optimal values of x, y, the value of λ can be calculated using either of the
two expressions we obtained above. From the first of these, we have
\[
\lambda = \frac{3}{4}\,(960)^{-1/4}(20)^{1/4}
= \frac{3}{4}\left(\frac{20}{960}\right)^{1/4}
= \frac{3}{4(48)^{1/4}}.
\]
If the income rises by $2, then the maximum utility will increase by approximately 2λ,
which is \(3/\left(2(48)^{1/4}\right)\).
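The interpretation of λ as the approximate gain in maximum utility per extra dollar can be checked numerically. The sketch below is an illustration only: `best_utility` is a hypothetical helper that assumes the relation y = x/48 from the first-order conditions continues to hold for a general income M, which gives x = 3M/4 and y = M/64.

```python
def best_utility(M):
    """Maximum of x^(3/4) * y^(1/4) subject to x + 16y = M.

    Assumes y = x/48 (from the first-order conditions), so x = 3M/4, y = M/64.
    """
    x = 3 * M / 4
    y = M / 64
    return x ** 0.75 * y ** 0.25


lam = 3 / (4 * 48 ** 0.25)  # multiplier found in the text
actual = best_utility(1282) - best_utility(1280)  # true change in maximum utility
estimate = 2 * lam  # approximation from the multiplier
print(actual, estimate)
```

For this particular utility function the maximum utility is proportional to M, so the two printed numbers agree up to rounding; in general 2λ is only an approximation.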
Answer to question 5.7
The problem is to maximise \(q(k, l) = 50k^{2/3} l^{1/3}\) subject to spending no more than 1000
on capital and labour. Clearly the optimal strategy is to spend all of the 1000 available
(since extra capital and extra labour both increase the output q), so the constraint
equation is 6k + 4l = 1000 and the Lagrangean is
\[
L(k, l, \lambda) = 50k^{2/3} l^{1/3} - \lambda(6k + 4l - 1000).
\]
We now solve
\[
\begin{aligned}
\frac{\partial L}{\partial k} &= \frac{100}{3}\,k^{-1/3} l^{1/3} - 6\lambda = 0, \\
\frac{\partial L}{\partial l} &= \frac{50}{3}\,k^{2/3} l^{-2/3} - 4\lambda = 0, \\
\frac{\partial L}{\partial \lambda} &= 1000 - 6k - 4l = 0.
\end{aligned}
\]
From the first two equations, we obtain two expressions for λ,
\[
\lambda = \frac{100}{18}\,k^{-1/3} l^{1/3} = \frac{50}{12}\,k^{2/3} l^{-2/3},
\]
so
\[
l^{1/3} l^{2/3} = \frac{18}{100}\cdot\frac{50}{12}\,k^{1/3} k^{2/3},
\]
which simplifies to l = 3k/4. The third equation gives
\[
6k + 4\left(\frac{3}{4}\,k\right) = 9k = 1000,
\]
so k = 1000/9 and l = (3/4)k = 3000/36 = 250/3. The corresponding output is
\[
50\left(\frac{1000}{9}\right)^{2/3}\left(\frac{250}{3}\right)^{1/3}.
\]
Answer to question 5.8
A little care needs to be taken in determining the constraint. Since the number of hours
spent working in the restaurant is I/8, the income divided by the hourly rate, the
constraint ‘total time is 100’ is
I/8 + S = 100.
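The Lagrangean is therefore
\[
L(I, S, \lambda) = I^{1/4} S^{3/4} - \lambda\left(\frac{I}{8} + S - 100\right),
\]
and the equations to solve are
\[
\begin{aligned}
\frac{\partial L}{\partial I} &= \frac{1}{4}\,I^{-3/4} S^{3/4} - \frac{\lambda}{8} = 0, \\
\frac{\partial L}{\partial S} &= \frac{3}{4}\,I^{1/4} S^{-1/4} - \lambda = 0, \\
\frac{\partial L}{\partial \lambda} &= 100 - \frac{I}{8} - S = 0.
\end{aligned}
\]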
The first two equations, on elimination of λ, yield I = 8S/3, and substituting this into
the third equation, we obtain S = 75. Thus the optimal division of time is 75 hours of
study and 25 hours of restaurant work (generating I = 200 dollars income).
Chapter 6
Matrices and linear equations
Essential reading
(For full publication details, see Chapter 1.)
Anthony and Biggs (1996) Chapters 14, 15 and 16.
Further reading
Booth (1998) Chapter 2, Module 5.
Bradley (2008) Section 9.2–9.3.
Dowling (2000) Chapter 10.
6.1 Introduction
This chapter of the guide deals with matrices. The main use of matrices in this subject
is in solving systems of simultaneous linear equations.
6.2 Vectors
An n-vector¹ v is a list of n numbers, written either as a row-vector
\[
(v_1, v_2, \ldots, v_n),
\]
or a column-vector
\[
\begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix}.
\]
(Sometimes the commas are omitted in the notation for a row vector.) The numbers v1 ,
v2 , and so on are known as the components, entries or coordinates of v. The zero vector
is the vector with all of its entries equal to 0.
Vectors are useful in geometry, where they represent directions. However, this is not
something that will concern us in this course. Vectors are a particularly useful form of
notation in economic aspects of management. Suppose for simplicity we have a market
selling two items, which we shall call ‘grommets’ and ‘widgets’. If a consumer has 5
grommets and 3 widgets, we may say that the consumer has the (commodity) bundle
(5, 3). More generally, if there are n goods in a commodity market (goods
\(X_1, X_2, \ldots, X_n\), say) and a consumer has \(x_1\) units of \(X_1\), \(x_2\) of \(X_2\), and so on, we say
that she has the bundle \(x = (x_1\ x_2\ \ldots\ x_n)\).
¹ See Anthony and Biggs (1996) Section 14.1.
We can define addition of two n-vectors by the rule
(w1 , w2 , . . . , wn ) + (v1 , v2 , . . . , vn ) = (w1 + v1 , w2 + v2 , . . . , wn + vn ).
(The rule is described here for row vectors but the obvious counterpart holds for column
vectors.) Also, we can multiply a vector by any single number α (usually called a scalar
in this context) by the following rule:
α(v1 , v2 , . . . , vn ) = (αv1 , αv2 , . . . , αvn ).
For example,
(1, −2, −3) + (4, 5, 7) = (5, 3, 4) and 4(1, −2, 5) = (4, −8, 20).
The operations of addition and multiplication by a scalar may be combined. For
example,
3(2, 1, 3) + 2(1, 1, 1) = (6, 3, 9) + (2, 2, 2) = (8, 5, 11).
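The two rules above translate directly into code. The following minimal Python sketch represents vectors as tuples; the function names `add` and `scale` are illustrative choices, not notation from the guide.

```python
def add(v, w):
    """Add two vectors entry by entry; they must have the same number of entries."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same number of entries")
    return tuple(a + b for a, b in zip(v, w))


def scale(alpha, v):
    """Multiply every entry of the vector v by the scalar alpha."""
    return tuple(alpha * a for a in v)


print(add((1, -2, -3), (4, 5, 7)))                  # (5, 3, 4)
print(scale(4, (1, -2, 5)))                         # (4, -8, 20)
print(add(scale(3, (2, 1, 3)), scale(2, (1, 1, 1))))  # (8, 5, 11)
```

The three printed results reproduce the worked examples in the text.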
The dot product of two n-vectors x = (x1 , x2 , . . . , xn ) and y = (y1 , y2 , . . . , yn ) is denoted
by x · y and is calculated as follows:
x · y = x1 y1 + x2 y2 + x3 y3 + · · · + xn yn .
Thus, x · y is the number obtained when we multiply together the first entries of x, y,
multiply together the second entries, and so on, and add these n products together.
Example 6.1
Let x = (1, 0, 2) and y = (5, 2, 3). Then
x · y = 1(5) + 0(2) + 2(3) = 11.
Warning: The dot product of two vectors is a number, not a vector. This is a mistake
many students make and you must avoid it. (A common error is to assume that the dot
product of two n-vectors is the n-vector whose entries are obtained by multiplying
together the corresponding entries of the two vectors. This is not the case. The dot
product is the sum of these products.) As mentioned, we do not have a way of
‘multiplying’ together two vectors to get a vector. (However, as we see later, two vectors
may be multiplied together to give a matrix.)
Let us return to our economic model. If the consumer has 5 grommets and 3 widgets
and the price of a grommet is p1 and the price of a widget is p2 , then this bundle, (5, 3),
would have cost the consumer an amount (5p1 + 3p2 ) to buy. More generally, if the
consumer has 36 dollars to spend on grommets and widgets, then the bundle (x1 , x2 )
can be bought provided its cost, in dollars, which is p1 x1 + p2 x2 , is no greater than 36.
In this way, we obtain the consumer’s budget constraint, p1 x1 + p2 x2 ≤ 36. A bundle
(x1 , x2 ) is affordable if, and only if, it satisfies the budget constraint. The general case is
only slightly more complex. Suppose we have n goods X1 , X2 , · · · , Xn and the cost of
one unit of Xi is pi . Suppose a consumer has an amount M to spend on these goods.
Then the budget constraint is ‘cost of bundle ≤ M ’, which is
p1 x1 + p2 x2 + · · · + pn xn ≤ M . But the quantity p1 x1 + p2 x2 + · · · + pn xn is the dot
product of the price vector p = (p1 , p2 , · · · , pn ) and the bundle x = (x1 , x2 , · · · , xn ).
Thus the budget constraint is p · x ≤ M and bundle x can be purchased provided it
satisfies the budget constraint.
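The dot product and the budget constraint p · x ≤ M can be sketched in a few lines of Python. The prices (4, 3) used below are hypothetical values chosen only for illustration; the guide leaves the prices as symbols.

```python
def dot(x, y):
    """Dot product: the sum of the products of corresponding entries (a number)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of entries")
    return sum(a * b for a, b in zip(x, y))


def affordable(prices, bundle, budget):
    """True if the bundle satisfies the budget constraint p . x <= M."""
    return dot(prices, bundle) <= budget


print(dot((1, 0, 2), (5, 2, 3)))      # Example 6.1: 11

prices = (4, 3)                       # hypothetical prices of grommets and widgets
print(affordable(prices, (5, 3), 36))  # cost 29, so True
print(affordable(prices, (6, 5), 36))  # cost 39, so False
```

Note that `dot` returns a single number, in line with the warning above that the dot product is not a vector.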
Activity 6.1 Let x = (1, 2, 3) and y = (3, 2, 1). Find x + y and x · y.
6.3 Matrices
6.3.1 What is a matrix?
A matrix² is an array of numbers
\[
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}.
\]
We denote this array by the single letter A, or by \((a_{ij})\), and we say that A has m rows
and n columns, or that it is an m × n matrix. We also say that A is a matrix of size
m × n. If m = n, the matrix is said to be square. The number \(a_{ij}\) is known as the
(i, j)th entry of A. The row vector \((a_{i1}, a_{i2}, \ldots, a_{in})\) is row i of A, or the ith row of A,
and the column vector
\[
\begin{pmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{pmatrix}
\]
is column j of A, or the jth column of A.
It is useful to think of row and column vectors as matrices. For example, we may think
of the row vector (1, 2, 4) as being equal to the 1 × 3 matrix (1 2 4). (Indeed, the only
visible difference is that the vector has commas and the matrix does not, merely a
notational difference.)
6.3.2 Matrix addition and scalar multiplication
Matrices are useful because they provide a compact notation, and because we can ‘do
algebra’ with them³. If A and B are two matrices of the same size then we define A + B
to be the matrix whose elements are the sums of the corresponding elements in A and
B. Formally, the (i, j)th entry of the matrix A + B is \(a_{ij} + b_{ij}\) where \(a_{ij}\) and \(b_{ij}\) are the
(i, j)th entries of A and B, respectively. Also, if c is a number, we define cA to be the
matrix whose elements are c times those of A; that is, cA has (i, j)th entry \(c\,a_{ij}\). For
example,
\[
\begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}
+ \begin{pmatrix} 3 & 4 \\ 1 & 2 \end{pmatrix}
= \begin{pmatrix} 4 & 6 \\ 3 & 3 \end{pmatrix},
\]
and
\[
3\begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}
= \begin{pmatrix} 3 & 6 \\ 6 & 3 \end{pmatrix}.
\]
² See Anthony and Biggs (1996) Section 15.1.
³ See Anthony and Biggs (1996) Section 15.1.
6.3.3 Matrix multiplication
Suppose A and B are matrices such that the number (say n) of columns of A is equal to
the number of rows of B. We define the product⁴ C = AB to be the matrix whose
elements are
\[
c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj}.
\]
Although this formula looks daunting, it is quite easy to use in practice. What it says is
that the element in row i and column j of the product is obtained by taking each entry
of row i of A in turn and multiplying it by the corresponding entry of column j of B, then
adding these n products together. In other words, the entry is the dot product of row i
of A and column j of B.
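Example 6.2 In the following product the element in row 1 and column 2 of the
product matrix (indicated in bold type) is found by using, as described above, the
row and column printed in bold type:
\[
\begin{pmatrix} \mathbf{1} & \mathbf{3} \\ 2 & 1 \end{pmatrix}
\begin{pmatrix} 1 & \mathbf{2} & 1 \\ 2 & \mathbf{3} & 5 \end{pmatrix}
= \begin{pmatrix} 7 & \mathbf{11} & 16 \\ 4 & 7 & 7 \end{pmatrix}.
\]
The entry is 11 because
\[
1 \times 2 + 3 \times 3 = 11.
\]
The other elements of the product can be worked out in the same way.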
It must be stressed that when A has n columns then B must have n rows if AB is to be
defined. In any other case, the product is not defined. If A is an m × n matrix and B is
an n × p matrix, it follows that AB is an m × p matrix.
The definition of matrix multiplication allows us to use some familiar algebraic rules,
but care is needed. Among the rules which we can use are:
\[
A(BC) = (AB)C \quad\text{and}\quad A(B + C) = AB + AC.
\]
On the other hand, it is most important to note that AB and BA are not usually equal.
Indeed it is quite possible that one of the products is defined but the other is not. Even
if both are defined, they are generally not equal.⁵
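The definition of the product, and the warning that AB and BA need not agree, can be illustrated with a short Python sketch. Matrices are represented here as lists of rows, and the function name `matmul` is an assumption made for the example.

```python
def matmul(A, B):
    """Product AB; entry (i, j) is the dot product of row i of A and column j of B."""
    n = len(B)  # number of rows of B
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must equal rows of B")
    p = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]


# Example 6.2: a 2 x 2 matrix times a 2 x 3 matrix gives a 2 x 3 matrix.
A = [[1, 3], [2, 1]]
B = [[1, 2, 1], [2, 3, 5]]
print(matmul(A, B))  # [[7, 11, 16], [4, 7, 7]]

# Here BA is not defined (B has 3 columns, A has 2 rows), so matmul(B, A) raises.
# Even for square matrices the two products generally differ:
C = [[1, 2], [2, 1]]
D = [[3, 4], [1, 2]]
print(matmul(C, D))  # [[5, 8], [7, 10]]
print(matmul(D, C))  # [[11, 10], [5, 4]]
```

The size check mirrors the requirement that A has as many columns as B has rows.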
⁴ See Anthony and Biggs (1996) Section 15.2.
⁵ See Anthony and Biggs (1996) Section 15.2.
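Activity 6.2 If
\[
A = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 5 \end{pmatrix}
\quad\text{and}\quad
B = \begin{pmatrix} 2 & 1 \\ 0 & -1 \\ 1 & 2 \end{pmatrix},
\quad\text{show that}\quad
AB = \begin{pmatrix} 5 & 5 \\ 11 & 11 \end{pmatrix}.
\]
6.3.4 The identity matrix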
A useful matrix is the identity matrix:
\[
I = \begin{pmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix},
\]
which has the number 1 in each of the positions on the ‘main diagonal’, and 0 elsewhere.
Note that I is a square matrix. Note that there is an identity matrix of any size n × n.
The identity matrix has the property that, whenever A is an n × n matrix, we have
\[
IA = AI = A.
\]
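The defining property of the identity matrix is easy to check on a small case. In this Python sketch the helper names `identity` and `matmul` and the sample matrix are illustrative assumptions.

```python
def identity(n):
    """The n x n identity matrix: 1 on the main diagonal, 0 elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Product of two matrices given as lists of rows (sizes assumed compatible)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


A = [[1, 2], [3, 4]]  # an arbitrary 2 x 2 matrix
I = identity(2)
print(matmul(I, A) == A and matmul(A, I) == A)  # True
```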
6.4 Linear equations
A system of m linear equations in n unknowns x1 , x2 , . . . , xn is a set of m equations of
the form
\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1, \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2, \\
&\ \,\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m.
\end{aligned}
\]
The numbers \(a_{ij}\) are usually known as the coefficients of the system. We say that
\((x_1^*, x_2^*, \ldots, x_n^*)\) is a solution of the system if all m equations hold true when \(x_1 = x_1^*\),
\(x_2 = x_2^*\) and so on. Sometimes a system of linear equations is known as a set of
simultaneous equations; such terminology emphasises that a solution is an assignment of
values to each of the n unknowns such that each and every equation holds with this
assignment.
In order to deal with large systems of linear equations we usually write them in matrix
form. First we observe that vectors are just special cases of matrices: a row vector or list
of n numbers is simply a matrix of size 1 × n, and a column vector is a matrix of size
n × 1. The rule for multiplying matrices tells us how to calculate the product Ax of an
m × n matrix A and an n × 1 column vector x. According to the rule, Ax is
\[
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}
=
\begin{pmatrix}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\
\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n
\end{pmatrix}.
\]
Note that Ax is a column vector with m rows, these being the left-hand sides of our
system of linear equations. If we define another column vector b, whose components are
the right-hand sides bi , the system is equivalent to the matrix equation
Ax = b.
We often use the phrase linear system to mean ‘system of linear equations’ and we say
that a linear system is square if the number of equations is the same as the number of
unknowns; that is, if the matrix A is square.
6.5 Elementary row operations
An elementary way of solving the system
\[
\begin{aligned}
3x + 2y &= 2, \qquad (6.1) \\
5x + y &= 2, \qquad (6.2)
\end{aligned}
\]
of linear equations is to ‘eliminate’ one of the variables, as follows. We can eliminate y
by multiplying equation (6.2) by 2 and subtracting equation (6.1) from this new
equation. Explicitly, multiplying the second equation by 2 gives the equation
10x + 2y = 4,
and so, using equation (6.1),
(10x + 2y) − (3x + 2y) = 4 − 2.
That is, 2 × (6.2) − (6.1) gives
\[
7x = 2 \quad\text{so that}\quad x = \frac{2}{7},
\]
and substituting this value for x back into either equation yields y = 4/7. This
technique generalises to larger systems of equations and leads to what is often called the
Gaussian Elimination, or Gauss-Jordan, or Gaussian, method for solving systems of
equations. This method works even when the number of equations and the number of
unknowns are different.
It is a simple observation that the set of solutions of a system of linear equations is
unaltered by the following three operations, since the restrictions on the variables
x1 , x2 , · · · given by the new equations imply the restrictions given by the old ones (that
is, we can undo the manipulations made on the old system):
multiply both sides of an equation by a non-zero constant,
add a multiple of one equation to another,
interchange two equations.
These observations form the motivation behind a method⁶ to solve linear equations.
⁶ See Anthony and Biggs (1996) Chapters 16 and 17.
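To solve a linear system Ax = b using the new method we first form the augmented
matrix (Ab), which is A with column b tagged on. For example, if the system is
\[
\begin{pmatrix} 1 & 2 & 1 \\ 2 & 2 & 0 \\ 3 & 5 & 4 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
= \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix},
\]
then the augmented matrix is the 3 × 4 matrix
\[
(Ab) = \begin{pmatrix} 1 & 2 & 1 & 1 \\ 2 & 2 & 0 & 2 \\ 3 & 5 & 4 & 1 \end{pmatrix}.
\]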
We use this form because of the important fact that elementary operations on the
equations of the system correspond to the same operations on the rows of the
augmented matrix. For that reason we shall now refer to them as elementary row
operations. The method now proceeds as follows: we use a sequence of elementary row
operations on the augmented matrix until we have changed it into a matrix of the form
\[
(Cd) = \begin{pmatrix} 1 & * & * & * \\ 0 & 1 & * & * \\ 0 & 0 & 1 & * \end{pmatrix},
\]
which is said to be in echelon form or reduced form. Here, the ∗ symbols merely
indicate the presence of some numbers. Note that in an echelon matrix, the first
non-zero entry in each row is 1 (we call this the leading 1), and the position of the
leading 1 moves to the right as we go down the rows.
To see why this will be useful, let us carry out the procedure for the linear system given
above. In doing so, I shall explain which row operations are being applied at each step,
but it is not generally necessary to give such detail when you are answering a question
of this type. We start with the augmented matrix
\[
\begin{pmatrix} 1 & 2 & 1 & 1 \\ 2 & 2 & 0 & 2 \\ 3 & 5 & 4 & 1 \end{pmatrix}.
\]
We eliminate the second and third entries of the first column by subtracting multiples
of the first row. To cancel the 2 in the second row, we subtract twice the first row from
the second. We may conveniently denote this as R2 → R2 − 2R1 , meaning that row 2
changes to what was row 2, minus twice row 1. We shall also, to eliminate the 3 in the
third row and first column, perform the operation R3 → R3 − 3R1 (that is, we subtract
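3 times the first row from the third). This gives us the following transformation:
\[
\begin{pmatrix} 1 & 2 & 1 & 1 \\ 2 & 2 & 0 & 2 \\ 3 & 5 & 4 & 1 \end{pmatrix}
\to
\begin{pmatrix} 1 & 2 & 1 & 1 \\ 0 & -2 & -2 & 0 \\ 0 & -1 & 1 & -2 \end{pmatrix}.
\]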
Now, clearly we can simplify this by dividing the second row throughout by the number
−2. (That is, we perform the operation \(R_2 \to R_2/(-2)\).) So the next transformation is
\[
\begin{pmatrix} 1 & 2 & 1 & 1 \\ 0 & -2 & -2 & 0 \\ 0 & -1 & 1 & -2 \end{pmatrix}
\to
\begin{pmatrix} 1 & 2 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & -1 & 1 & -2 \end{pmatrix}.
\]
Now we want to delete the −1 in the third row and second column. To do so, we add
row 2 to row 3. (Note that we would not want to use row 1 to cancel at this stage,
because if we did we would lose the 0 we have worked to obtain in the first column of the
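third row.) The next step is therefore to perform the row operation \(R_3 \to R_3 + R_2\) to get
\[
\begin{pmatrix} 1 & 2 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & -1 & 1 & -2 \end{pmatrix}
\to
\begin{pmatrix} 1 & 2 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 2 & -2 \end{pmatrix}.
\]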
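Now, to reduce finally to echelon form, we simply divide the last row by 2, i.e. we
perform the row operation \(R_3 \to R_3/2\), to get
\[
\begin{pmatrix} 1 & 2 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 2 & -2 \end{pmatrix}
\to
\begin{pmatrix} 1 & 2 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & -1 \end{pmatrix}.
\]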
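I’ve been very careful to explain (for your benefit) what the row operations were at each
stage, but as I mentioned above, we need not include all this detail. The reduction
above can be written simply as
\[
\begin{aligned}
\begin{pmatrix} 1 & 2 & 1 & 1 \\ 2 & 2 & 0 & 2 \\ 3 & 5 & 4 & 1 \end{pmatrix}
&\to \begin{pmatrix} 1 & 2 & 1 & 1 \\ 0 & -2 & -2 & 0 \\ 0 & -1 & 1 & -2 \end{pmatrix} \\
&\to \begin{pmatrix} 1 & 2 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & -1 & 1 & -2 \end{pmatrix} \\
&\to \begin{pmatrix} 1 & 2 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 2 & -2 \end{pmatrix} \\
&\to \begin{pmatrix} 1 & 2 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & -1 \end{pmatrix}.
\end{aligned}
\]
Now, the initial system of equations has the same set of solutions as the system
\[
\begin{pmatrix} 1 & 2 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
= \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}.
\]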
This system of equations is
\[
\begin{aligned}
x_1 + 2x_2 + x_3 &= 1, \\
x_2 + x_3 &= 0, \\
x_3 &= -1.
\end{aligned}
\]
But it is easy to solve these equations by working backwards from the third equation to
the first one. Immediately, we have \(x_3 = -1\). The second equation then gives \(x_2 = 1\),
and then the first gives
\[
x_1 = 1 - 2x_2 - x_3 = 0.
\]
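The whole procedure (reduce the augmented matrix to echelon form, then work backwards) can be sketched in Python. This is an illustrative implementation, not the guide's own: the function name `solve_by_row_operations` is hypothetical, exact fractions are used to avoid rounding, and the sketch assumes a square system with exactly one solution, which is the only case this course requires.

```python
from fractions import Fraction


def solve_by_row_operations(A, b):
    """Reduce (A b) to echelon form with row operations, then back-substitute.

    Assumes a square system with exactly one solution.
    """
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Interchange rows if needed so that the pivot entry is non-zero.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("system does not have a unique solution")
        M[col], M[pivot] = M[pivot], M[col]
        # Divide the row by its pivot entry to create the leading 1.
        lead = M[col][col]
        M[col] = [v / lead for v in M[col]]
        # Subtract multiples of this row from the rows below it.
        for r in range(col + 1, n):
            factor = M[r][col]
            M[r] = [v - factor * p for v, p in zip(M[r], M[col])]
    # Back-substitution, working from the last equation to the first.
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        x[i] = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
    return x


solution = solve_by_row_operations([[1, 2, 1], [2, 2, 0], [3, 5, 4]], [1, 2, 1])
print([str(v) for v in solution])  # ['0', '1', '-1']
```

The sketch always uses the first available non-zero entry as the pivot, so its intermediate matrices may differ from a hand calculation that chooses more convenient operations; the final solution is the same.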
We give another example.
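Example 6.3 Use the method of elementary row operations to solve the following
system of equations.
\[
\begin{aligned}
3x_1 - 3x_2 + 5x_3 &= 6, \\
x_1 + 7x_2 + 5x_3 &= 4, \\
5x_1 + 10x_2 + 15x_3 &= 9.
\end{aligned}
\]
Solution: The augmented matrix corresponding to the system of equations is
\[
\begin{pmatrix} 3 & -3 & 5 & 6 \\ 1 & 7 & 5 & 4 \\ 5 & 10 & 15 & 9 \end{pmatrix},
\]
which we reduce to echelon form using elementary row operations, as follows.
\[
\begin{aligned}
\begin{pmatrix} 3 & -3 & 5 & 6 \\ 1 & 7 & 5 & 4 \\ 5 & 10 & 15 & 9 \end{pmatrix}
&\to \begin{pmatrix} 1 & 7 & 5 & 4 \\ 3 & -3 & 5 & 6 \\ 5 & 10 & 15 & 9 \end{pmatrix} \\
&\to \begin{pmatrix} 1 & 7 & 5 & 4 \\ 0 & -24 & -10 & -6 \\ 0 & -25 & -10 & -11 \end{pmatrix} \\
&\to \begin{pmatrix} 1 & 7 & 5 & 4 \\ 0 & 1 & 5/12 & 1/4 \\ 0 & -25 & -10 & -11 \end{pmatrix} \\
&\to \begin{pmatrix} 1 & 7 & 5 & 4 \\ 0 & 1 & 5/12 & 1/4 \\ 0 & 0 & 5/12 & -19/4 \end{pmatrix} \\
&\to \begin{pmatrix} 1 & 7 & 5 & 4 \\ 0 & 1 & 5/12 & 1/4 \\ 0 & 0 & 1 & -57/5 \end{pmatrix}.
\end{aligned}
\]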
This last matrix, in echelon form, represents the system
\[
\begin{aligned}
x_1 + 7x_2 + 5x_3 &= 4, \\
x_2 + \frac{5}{12}\,x_3 &= \frac{1}{4}, \\
x_3 &= -\frac{57}{5}.
\end{aligned}
\]
Its solution (which is the same as that of the original system) may be determined by
back-substitution:
\[
x_3 = -\frac{57}{5}, \quad
x_2 = \frac{1}{4} - \frac{5}{12}\,x_3 = 5 \quad\text{and}\quad
x_1 = 4 - 7x_2 - 5x_3 = 26.
\]
Activity 6.3 Use the method of elementary row operations to solve the following
system of equations.
\[
\begin{aligned}
x_1 + x_2 + x_3 &= 6, \\
2x_1 + 4x_2 + x_3 &= 5, \\
2x_1 + 3x_2 + x_3 &= 6.
\end{aligned}
\]
Once we think we have the solution to a system of equations, it is quite straightforward
to check whether it is indeed correct: all we have to do is substitute the supposed
solution into the original equations, and check that each equation holds true.
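Such a check amounts to computing each left-hand side and comparing it with the right-hand side. The Python sketch below does this for Example 6.3; the function name `is_solution` is an illustrative choice, and exact fractions are used so that the comparison is not affected by rounding.

```python
from fractions import Fraction


def is_solution(A, x, b):
    """True if every equation of the system Ax = b holds for the given values x."""
    return all(sum(a * xi for a, xi in zip(row, x)) == rhs
               for row, rhs in zip(A, b))


# Example 6.3: coefficients, right-hand sides and the solution found above.
A = [[3, -3, 5], [1, 7, 5], [5, 10, 15]]
b = [6, 4, 9]
x = [26, 5, Fraction(-57, 5)]
print(is_solution(A, x, b))          # True
print(is_solution(A, [1, 1, 1], b))  # False
```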
6.6 Applications of matrices and linear equations
Matrices are extremely useful in management and economics. We illustrate with two
examples.
Example 6.4 A company manufactures three goods, X, Y and Z, each of which is
made from three types of input, A, B and C. Each unit of X requires 1 unit of A, 7
units of B and 3 units of C. Each unit of Y requires 4 units of A, 3 units of B and 1
unit of C. Furthermore, one unit of Z requires 2 units of A, 4 units of B and 2 units
of C. In a particular day’s production the company uses up 105 units of A, 135 units
of B and 55 units of C.
(a) Create a matrix equation to represent the usage of A, B and C in the day’s
production of x, y and z units of X, Y and Z respectively.
(b) Using matrix algebra, determine the values of x, y and z.
Solution: (a) Consider first the total amount of A used in the production of the
goods X, Y, Z. Since each unit of X requires 1 unit of A, the total amount of A used
in the production of X is 1 × x = x. Similarly, the total amount used in producing Y
is 4y and the amount used in producing Z is 2z. Therefore the total amount of A
used is x + 4y + 2z. On the other hand, we know that this is 105, since this figure is
given in the problem. Therefore
x + 4y + 2z = 105.
Similarly, considering in turn the total amounts of B and C used, we have
7x + 3y + 4z = 135 and 3x + y + 2z = 55.
These three equations expressed in matrix form become
\[
\begin{pmatrix} 1 & 4 & 2 \\ 7 & 3 & 4 \\ 3 & 1 & 2 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
= \begin{pmatrix} 105 \\ 135 \\ 55 \end{pmatrix}.
\qquad (6.3)
\]
Note that we have been very careful here and have thought about what the
underlying equations are. A hurried approach might be to try to write a matrix
equation directly by looking at the numbers given in the question. It is
tempting to look at the problem and ‘read off’ the equation
\[
\begin{pmatrix} 1 & 7 & 3 \\ 4 & 3 & 1 \\ 2 & 4 & 2 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
= \begin{pmatrix} 105 \\ 135 \\ 55 \end{pmatrix}.
\]
However, this is wrong! The matrix just written is the so-called ‘transpose’ of the
correct one (that is, it is obtained from the correct one by interchanging rows and
columns). The moral of this digression is: think, and be careful!
(b) To answer this part of the problem, we need to solve the matrix equation (6.3) to
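determine x, y, z. We use the standard technique. First, we write down the
augmented matrix,
\[
\begin{pmatrix} 1 & 4 & 2 & 105 \\ 7 & 3 & 4 & 135 \\ 3 & 1 & 2 & 55 \end{pmatrix},
\]
which we then reduce to echelon form using elementary row operations, as follows.
\[
\begin{aligned}
\begin{pmatrix} 1 & 4 & 2 & 105 \\ 7 & 3 & 4 & 135 \\ 3 & 1 & 2 & 55 \end{pmatrix}
&\to \begin{pmatrix} 1 & 4 & 2 & 105 \\ 0 & -25 & -10 & -600 \\ 0 & -11 & -4 & -260 \end{pmatrix} \\
&\to \begin{pmatrix} 1 & 4 & 2 & 105 \\ 0 & 1 & 2/5 & 24 \\ 0 & -11 & -4 & -260 \end{pmatrix} \\
&\to \begin{pmatrix} 1 & 4 & 2 & 105 \\ 0 & 1 & 2/5 & 24 \\ 0 & 0 & 2/5 & 4 \end{pmatrix} \\
&\to \begin{pmatrix} 1 & 4 & 2 & 105 \\ 0 & 1 & 2/5 & 24 \\ 0 & 0 & 1 & 10 \end{pmatrix}.
\end{aligned}
\]
Therefore,
\[
x + 4y + 2z = 105, \quad y + \frac{2}{5}\,z = 24 \quad\text{and}\quad z = 10,
\]
so we have z = 10, y = 20 and x = 5.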
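The warning in part (a) about the transpose can be made concrete. In the Python sketch below the requirements are first stored per good, exactly as the question states them, and then transposed so that each row corresponds to one input. The variable names are illustrative assumptions.

```python
# Units of inputs (A, B, C) needed per unit of each good, as stated in the question.
requirements = {
    "X": (1, 7, 3),
    "Y": (4, 3, 1),
    "Z": (2, 4, 2),
}

# One row per good: this is the tempting but incorrect coefficient matrix.
by_good = [requirements[g] for g in ("X", "Y", "Z")]

# Transposing gives one row per input, which is the matrix in equation (6.3).
by_input = [list(column) for column in zip(*by_good)]

output = (5, 20, 10)  # x, y, z found above
usage = [sum(a * q for a, q in zip(row, output)) for row in by_input]
print(by_input)  # [[1, 4, 2], [7, 3, 4], [3, 1, 2]]
print(usage)     # [105, 135, 55]
```

Using `by_good` in place of `by_input` would give usage figures that do not match the 105, 135 and 55 units stated in the problem.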
Example 6.5 The supply function for a commodity takes the form
\[
q^S(p) = ap^2 + bp + c,
\]
for some constants a, b, c. When p = 1, the quantity supplied is 5; when p = 2, the
quantity supplied is 12; when p = 3, the quantity supplied is 23. Find the constants
a, b, c.
Solution: The given information means that
\[
q^S(1) = 5, \quad q^S(2) = 12 \quad\text{and}\quad q^S(3) = 23,
\]
that is,
\[
\begin{aligned}
a(1^2) + b(1) + c &= 5, \\
a(2^2) + b(2) + c &= 12, \\
a(3^2) + b(3) + c &= 23.
\end{aligned}
\]
So we have the following system of linear equations for a, b, c:
\[
\begin{aligned}
a + b + c &= 5, \\
4a + 2b + c &= 12, \\
9a + 3b + c &= 23.
\end{aligned}
\]
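We solve this in the usual way, by reducing the augmented matrix to echelon form.
\[
\begin{aligned}
\begin{pmatrix} 1 & 1 & 1 & 5 \\ 4 & 2 & 1 & 12 \\ 9 & 3 & 1 & 23 \end{pmatrix}
&\to \begin{pmatrix} 1 & 1 & 1 & 5 \\ 0 & -2 & -3 & -8 \\ 0 & -6 & -8 & -22 \end{pmatrix} \\
&\to \begin{pmatrix} 1 & 1 & 1 & 5 \\ 0 & -2 & -3 & -8 \\ 0 & 0 & -1 & -2 \end{pmatrix} \\
&\to \begin{pmatrix} 1 & 1 & 1 & 5 \\ 0 & 1 & 3/2 & 4 \\ 0 & 0 & 1 & 2 \end{pmatrix}.
\end{aligned}
\]
Therefore,
\[
\begin{aligned}
a + b + c &= 5, \\
b + \frac{3}{2}\,c &= 4, \\
c &= 2,
\end{aligned}
\]
so that c = 2, b = 1 and a = 2. The supply function is therefore given explicitly by
\[
q^S(p) = 2p^2 + p + 2.
\]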
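A quick way to confirm the constants is to evaluate the resulting supply function at the three given prices. This short Python sketch uses a hypothetical function name, `supply`, with the constants found above as defaults.

```python
def supply(p, a=2, b=1, c=2):
    """Quantity supplied at price p for the quadratic supply function a*p^2 + b*p + c."""
    return a * p ** 2 + b * p + c


print([supply(p) for p in (1, 2, 3)])  # [5, 12, 23]
```

The three values match the quantities stated in the example.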
Learning outcomes
At the end of this chapter and the relevant reading, you should be able to:
explain what is meant by a vector, either a row or column vector
add vectors, multiply a vector by a number (or scalar), and compute the dot
product of two vectors
explain what is meant by a matrix
add matrices and multiply a matrix by a number
multiply matrices
explain what is meant by the n × n identity matrix
solve linear systems using row operations
apply matrices and linear equations to problems in management
You do not need to know about determinants and the methods for linear equations
based on them, such as Cramer’s rule. You need not know about matrix inverses and
their calculation, inconsistent systems, or systems with infinitely many solutions.
Sample examination/practice questions
Question 6.1
Express the following set of equations in matrix form and hence solve them using a
matrix method:
\[
\begin{aligned}
x + y + z &= 1, \\
2x - y + z &= -1, \\
x + 3y - z &= 7.
\end{aligned}
\]
Question 6.2
A high class dressmaker makes three types of dresses. She makes cheap ‘everyday’
dresses, medium-priced ‘cocktail’ dresses and expensive ‘ballroom’ dresses. The making
of the dresses involves the ‘inputs’ of fabric, labour, fastenings and machine time. Table
6.1 shows the units of input required per dress for each dress type.

                ‘Everyday’   ‘Cocktail’   ‘Ballroom’
Fabric               5            6            8
Labour              20           25           30
Fastenings          15           20           22
Machine time         7            9           12

Table 6.1: Units of input required per dress for each dress type.
The dressmaker makes a combination of the three dress types which uses exactly 270
units of fabric, 1050 units of labour and 790 units of fastenings. How many of each type
of dress does she make? What is the corresponding machine time used?
Question 6.3
The function f (x) is given by
\[
f(x) = \frac{a}{1 + x^2} + bx + c,
\]
for some constants a, b, c. Given that f (0) = 8, f (1) = 3 and f (2) = −8/5, find a system
of linear equations for a, b, c. Solve this system using a matrix method.
Answers to activities
Feedback to activity 6.1
x + y = (4, 4, 4) and x · y = 10.
Feedback to activity 6.2
This is quite straightforward. We know that, since A is a 2 × 3 matrix and B is a 3 × 2
matrix, the product AB can be formed and will be a 2 × 2 matrix. There are
therefore four entries to compute. The top left entry is computed by forming the dot
product of the first row, (1 2 3), of A and the first column,
\(\begin{pmatrix} 2 \\ 0 \\ 1 \end{pmatrix}\), of B,
and is therefore
\[
(1 \times 2) + (2 \times 0) + (3 \times 1) = 5.
\]
The other entries are computed in a similar way. For example, the bottom right entry of
AB (indicated in bold in the following equation) is the dot product of the row and
column indicated in bold:
\[
AB = \begin{pmatrix} 1 & 2 & 3 \\ \mathbf{3} & \mathbf{2} & \mathbf{5} \end{pmatrix}
\begin{pmatrix} 2 & \mathbf{1} \\ 0 & \mathbf{-1} \\ 1 & \mathbf{2} \end{pmatrix}
= \begin{pmatrix} 5 & 5 \\ 11 & \mathbf{11} \end{pmatrix}.
\]
Feedback to activity 6.3
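We reduce the augmented matrix as follows:
\[
\begin{aligned}
\begin{pmatrix} 1 & 1 & 1 & 6 \\ 2 & 4 & 1 & 5 \\ 2 & 3 & 1 & 6 \end{pmatrix}
&\to \begin{pmatrix} 1 & 1 & 1 & 6 \\ 0 & 2 & -1 & -7 \\ 2 & 3 & 1 & 6 \end{pmatrix} \\
&\to \begin{pmatrix} 1 & 1 & 1 & 6 \\ 0 & 1 & -1/2 & -7/2 \\ 0 & 1 & -1 & -6 \end{pmatrix} \\
&\to \begin{pmatrix} 1 & 1 & 1 & 6 \\ 0 & 1 & -1/2 & -7/2 \\ 0 & 0 & -1/2 & -5/2 \end{pmatrix} \\
&\to \begin{pmatrix} 1 & 1 & 1 & 6 \\ 0 & 1 & -1/2 & -7/2 \\ 0 & 0 & 1 & 5 \end{pmatrix}.
\end{aligned}
\]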
You should check which operations have been used at each step. For example, going
from the first to the second matrix, we have subtracted twice the first row from the
second.
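The final augmented matrix represents the system
\[
\begin{aligned}
x_1 + x_2 + x_3 &= 6, \\
x_2 - \tfrac{1}{2} x_3 &= -\tfrac{7}{2}, \\
x_3 &= 5.
\end{aligned}
\]
As we have seen, the solution of this system can be found by ‘working backwards’:
\[ x_3 = 5, \quad x_2 = \tfrac{1}{2} x_3 - \tfrac{7}{2} = -1 \quad \text{and} \quad x_1 = -x_2 - x_3 + 6 = 2. \]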
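The reduction and back-substitution above can be mirrored in a few lines of code. What follows is only a minimal sketch, not part of the guide: the function name `solve_by_elimination` is our own, exact fractions are used to avoid rounding, and the system is assumed to be square with a unique solution.

```python
from fractions import Fraction

def solve_by_elimination(augmented):
    """Reduce an augmented matrix to echelon form, then back-substitute.

    Assumes a square system with exactly one solution.
    """
    rows = [[Fraction(v) for v in row] for row in augmented]
    n = len(rows)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Scale the pivot row so that its leading entry is 1.
        lead = rows[col][col]
        rows[col] = [v / lead for v in rows[col]]
        # Subtract multiples of the pivot row from the rows below it.
        for r in range(col + 1, n):
            factor = rows[r][col]
            rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    # Back-substitution, working upwards from the last row.
    solution = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(rows[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = rows[i][n] - known
    return solution

# The augmented matrix of Activity 6.3.
activity_6_3 = [[1, 1, 1, 6], [2, 4, 1, 5], [2, 3, 1, 6]]
print(solve_by_elimination(activity_6_3))
```

The order of row operations chosen by the code need not match the order used by hand, but any valid sequence of row operations leads to the same solution.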
Answers to Sample examination/practice questions
Answer to question 6.1
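In matrix form, the system of equations is
\[
\begin{pmatrix} 1 & 1 & 1 \\ 2 & -1 & 1 \\ 1 & 3 & -1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
= \begin{pmatrix} 1 \\ -1 \\ 7 \end{pmatrix}.
\]
The augmented matrix corresponding to the system of equations is
\[ \begin{pmatrix} 1 & 1 & 1 & 1 \\ 2 & -1 & 1 & -1 \\ 1 & 3 & -1 & 7 \end{pmatrix}. \]
We reduce this with row operations as follows:
\[
\begin{aligned}
\begin{pmatrix} 1 & 1 & 1 & 1 \\ 2 & -1 & 1 & -1 \\ 1 & 3 & -1 & 7 \end{pmatrix}
&\to \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & -3 & -1 & -3 \\ 0 & 2 & -2 & 6 \end{pmatrix} \\
&\to \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 2 & -2 & 6 \\ 0 & 3 & 1 & 3 \end{pmatrix} \\
&\to \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & -1 & 3 \\ 0 & 3 & 1 & 3 \end{pmatrix} \\
&\to \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & -1 & 3 \\ 0 & 0 & 4 & -6 \end{pmatrix} \\
&\to \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & -1 & 3 \\ 0 & 0 & 1 & -3/2 \end{pmatrix}.
\end{aligned}
\]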
This last matrix, in echelon form, represents the system
\[
\begin{aligned}
x + y + z &= 1, \\
y - z &= 3, \\
z &= -\tfrac{3}{2}.
\end{aligned}
\]
Its solution (which is the same as that of the original system) may be determined by back-substitution:
\[ z = -\tfrac{3}{2}, \quad y = 3 + z = \tfrac{3}{2} \quad \text{and} \quad x = 1 - y - z = 1. \]
Answer to question 6.2
Let \(x\) be the number of Everyday dresses, \(y\) the number of Cocktail dresses and \(z\) the number of Ballroom dresses made. The fact that 270 units of fabric are used means (from the information given in the table) that
\[ 5x + 6y + 8z = 270. \]
Considering, in turn, labour and fastenings, we obtain the additional two equations
\[ 20x + 25y + 30z = 1050 \quad \text{and} \quad 15x + 20y + 22z = 790. \]
(We are not given any constraint on the machine time, so there is no fourth equation
corresponding to this.) We therefore have to solve the system
5x + 6y + 8z = 270,
20x + 25y + 30z = 1050,
15x + 20y + 22z = 790.
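We reduce the augmented matrix
\[ \begin{pmatrix} 5 & 6 & 8 & 270 \\ 20 & 25 & 30 & 1050 \\ 15 & 20 & 22 & 790 \end{pmatrix} \]
to echelon form as follows:
\[
\begin{aligned}
\begin{pmatrix} 5 & 6 & 8 & 270 \\ 20 & 25 & 30 & 1050 \\ 15 & 20 & 22 & 790 \end{pmatrix}
&\to \begin{pmatrix} 5 & 6 & 8 & 270 \\ 0 & 1 & -2 & -30 \\ 0 & 2 & -2 & -20 \end{pmatrix} \\
&\to \begin{pmatrix} 5 & 6 & 8 & 270 \\ 0 & 1 & -2 & -30 \\ 0 & 0 & 2 & 40 \end{pmatrix} \\
&\to \begin{pmatrix} 1 & 6/5 & 8/5 & 54 \\ 0 & 1 & -2 & -30 \\ 0 & 0 & 1 & 20 \end{pmatrix}.
\end{aligned}
\]
So we have
\[ x + \tfrac{6}{5} y + \tfrac{8}{5} z = 54, \quad y - 2z = -30 \quad \text{and} \quad z = 20, \]
from which it follows that
\[ z = 20, \quad y = 10 \quad \text{and} \quad x = 10. \]
The dressmaker must therefore have made 10 Everyday dresses, 10 Cocktail dresses and 20 Ballroom dresses.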
The corresponding machine time used is
7(10) + 9(10) + 12(20) = 400.
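It is good practice to substitute a solution back into the original problem. The short sketch below (our own illustration; the dictionary and function names are not from the guide) recomputes the total of each input used by 10 Everyday, 10 Cocktail and 20 Ballroom dresses, using the figures of Table 6.1.

```python
# Units of input per dress from Table 6.1, in the order (Everyday, Cocktail, Ballroom).
inputs_per_dress = {
    "fabric": (5, 6, 8),
    "labour": (20, 25, 30),
    "fastenings": (15, 20, 22),
    "machine time": (7, 9, 12),
}
dresses = (10, 10, 20)  # the solution (x, y, z) found above

def total_used(input_name):
    """Total units of one input used by the given numbers of dresses."""
    return sum(u * n for u, n in zip(inputs_per_dress[input_name], dresses))

for name in inputs_per_dress:
    print(name, total_used(name))
```

The first three totals should reproduce the constraints given in the question, and the fourth is the machine time.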
Answer to question 6.3
We have
\[ f(x) = \frac{a}{1 + x^2} + bx + c, \]
and \(f(0) = 8\), \(f(1) = 3\) and \(f(2) = -8/5\). So, substituting \(x = 0, 1, 2\) in turn, we obtain the equations
\[
\begin{aligned}
a + c &= 8, \\
\frac{a}{2} + b + c &= 3, \\
\frac{a}{5} + 2b + c &= -\frac{8}{5}.
\end{aligned}
\]
Multiplying the second equation by 2 and the third by 5 (just to make it easier to deal
with), we obtain the system
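\[
\begin{aligned}
a + c &= 8, \\
a + 2b + 2c &= 6, \\
a + 10b + 5c &= -8.
\end{aligned}
\]
Reducing the augmented matrix to echelon form, we have
\[
\begin{aligned}
\begin{pmatrix} 1 & 0 & 1 & 8 \\ 1 & 2 & 2 & 6 \\ 1 & 10 & 5 & -8 \end{pmatrix}
&\to \begin{pmatrix} 1 & 0 & 1 & 8 \\ 0 & 2 & 1 & -2 \\ 0 & 10 & 4 & -16 \end{pmatrix} \\
&\to \begin{pmatrix} 1 & 0 & 1 & 8 \\ 0 & 2 & 1 & -2 \\ 0 & 0 & -1 & -6 \end{pmatrix} \\
&\to \begin{pmatrix} 1 & 0 & 1 & 8 \\ 0 & 1 & 1/2 & -1 \\ 0 & 0 & 1 & 6 \end{pmatrix}.
\end{aligned}
\]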
Therefore,
\[ a + c = 8, \quad b + \frac{c}{2} = -1, \quad c = 6. \]
It follows that
\[ c = 6, \quad b = -4 \quad \text{and} \quad a = 2. \]
The unknown function is therefore
\[ f(x) = \frac{2}{1 + x^2} - 4x + 6. \]
Chapter 7
Sequences and series
Essential reading
(For full publication details, see Chapter 1.)
Anthony and Biggs (1996) Chapters 3 and 4.
Further reading
Bradley (2008) Chapter 5.
Dowling (2000) Chapter 17.
7.1 Introduction
In this chapter we turn our attention to sequences, series and their applications. This is
a rather small topic, but it is sufficiently different from the other topics to merit a
separate chapter. A more complete investigation of sequences and series would involve
the study of difference equations, but this is not part of this subject. (Difference
equations are, however, covered in 05B Mathematics 2.)
7.2 Sequences
A sequence¹ of numbers \(y_0, y_1, y_2, \ldots\) is an infinite and ordered list of numbers with one term, \(y_t\), corresponding to each non-negative integer, \(t\). We call \(y_{t-1}\) the \(t\)th term of the sequence. Notice that, in our notation, the first term is \(y_0\) and \(y_t\) is actually the \((t+1)\)st term of the sequence. (Be careful not to be confused by this, as some texts differ.) For example, \(y_t\) could represent the price of a commodity \(t\) years from now, or the balance in a bank account \(t\) years from now. Often, a sequence is defined explicitly by a formula. For instance, the formula \(y_t = t^2\) generates the sequence
\[ y_0 = 0, \quad y_1 = 1, \quad y_2 = 4, \quad y_3 = 9, \quad y_4 = 16, \ \ldots \]
and the sequence \(3, 5, 7, 9, \ldots\) may be described by the formula
\[ y_t = 2t + 3 \quad (t \ge 0). \]
¹ See Anthony and Biggs (1996) Section 3.1.
7.3 Arithmetic progressions
The arithmetic progression with first term \(a\) and common difference \(d\) has its terms given by the formula \(y_t = a + dt\). For example, the arithmetic progression with first term 5 and common difference 3 is \(5, 8, 11, 14, \ldots\). Note that \(y_t\) is obtained from \(y_{t-1}\) by adding the common difference \(d\). In symbols, \(y_t = y_{t-1} + d\).
7.4 Geometric progressions
Another very important type of sequence is the geometric progression. The geometric progression with first term \(a\) and common ratio \(x\) is given by the formula \(y_t = ax^t\). Notice that successive terms are related through the relationship \(y_t = xy_{t-1}\). For example, the geometric progression with first term 3 and common ratio 1/2 is given by \(y_t = 3(1/2)^t\); that is, the sequence is \(3, 3/2, 3/4, 3/8, \ldots\).
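The two formulas can be tried out directly. The following is a small illustrative sketch (the function names are our own), which regenerates the two example progressions above; fractions are used so that the terms 3/2, 3/4, ... are exact.

```python
from fractions import Fraction

def arithmetic_term(a, d, t):
    """Term y_t = a + d*t of an arithmetic progression (t = 0 gives the first term)."""
    return a + d * t

def geometric_term(a, x, t):
    """Term y_t = a * x**t of a geometric progression (t = 0 gives the first term)."""
    return a * x ** t

# First term 5, common difference 3.
print([arithmetic_term(5, 3, t) for t in range(4)])
# First term 3, common ratio 1/2.
print([str(geometric_term(3, Fraction(1, 2), t)) for t in range(4)])
```

Note that `t` starts at 0, in line with the convention of this guide that the first term is \(y_0\).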
7.5 Compound interest
Perhaps the simplest occurrence of geometric progressions in economics is in the study of compound interest.² Suppose that we have a savings account for which the annual percentage interest rate is constant at 8%. What this means is that if we have \(\$P\) in the account at the beginning of a year then, at the end of that year, the account balance is increased by 8% of \(\$P\). In other words, the balance increases to \(\$(P + 0.08P)\). Generally, if the annual percentage rate of interest is \(R\)%, then the interest rate is \(r = R/100\) and in the course of one year, a balance of \(\$P\) becomes \(\$(P + rP) = \$(1 + r)P\). One year after that, the balance becomes \(\$(1 + r)[(1 + r)P]\), which is \(\$(1 + r)^2 P\). Continuing in this way, we can see that if \(P\) dollars are deposited in an account where interest is paid annually at rate \(r\), and if no money is taken from or added to the account, then after \(t\) years we have a balance of \(P(1 + r)^t\) dollars. This process is known as compounding (or compound interest), because interest is paid on interest previously added to the account.
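To see that the formula \(P(1 + r)^t\) really is the result of adding interest year after year, one can compare the two calculations numerically. This is only a sketch with illustrative figures (a deposit of 100 dollars at 8% for 3 years); the function names are our own.

```python
def balance_after(principal, rate, years):
    """Balance after `years` annual compoundings at interest rate `rate` (0.08 means 8%)."""
    return principal * (1 + rate) ** years

def balance_year_by_year(principal, rate, years):
    """The same balance, obtained by adding the interest one year at a time."""
    balance = principal
    for _ in range(years):
        balance += rate * balance
    return balance

print(round(balance_after(100, 0.08, 3), 2))
print(round(balance_year_by_year(100, 0.08, 3), 2))
```

The two printed values should agree, which is exactly the statement that interest is paid on interest previously added.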
Activity 7.1 Suppose that 1000 dollars is invested in an account that pays interest
at a fixed rate of 7%, paid annually. How much is there in the account after 4 years?
7.6 Compound interest and the exponential function
When we looked at the exponential function in Chapter 2 of the guide, you might well have asked where on earth this strange number \(e\) came from. It does seem strange, so let me try to justify it by giving you another definition of the exponential function.³ In order to do this, we have to have some idea of what is meant by the limit of a function.
² See Anthony and Biggs (1996) Sections 4.3 and 7.3.
³ See Anthony and Biggs (1996) Section 7.2 or Ostaszewski (1993) Section 11.1, for a discussion of this approach to the exponential function.
Consider the function \(f(y) = 1/y\). As \(y\) gets larger and larger, \(f(y)\) gets closer and closer to 0. This idea of ‘getting closer and closer’ to a given number is the essence of what we mean by a limit. We say that \(f(y)\) tends to 0 as \(y\) tends to infinity, or that 0 is the limit of \(f(y)\) as \(y\) tends to infinity. The notation used for this is
\[ f(y) \to 0 \ \text{as} \ y \to \infty, \quad \text{or} \quad \lim_{y \to \infty} f(y) = 0, \]
where the symbol \(\infty\) stands for ‘infinity’. Do not think that \(\infty\) is a number; it is merely a convenient notation. A rigorous, formal approach to the exponential function is to define \(e\) to be the limit
\[ e = \lim_{y \to \infty} \left(1 + \frac{1}{y}\right)^y. \]
Then, for any \(x\), we define \(e^x\) to be the limit
\[ e^x = \lim_{y \to \infty} \left(1 + \frac{x}{y}\right)^y. \]
This way of thinking about \(e\) is useful when we consider compound interest. What happens if interest is added more frequently than once a year? Suppose, for example, that instead of 8% interest paid at the end of the year, we have 4% interest added twice-yearly, once at the middle of the year and once at the end. If $100 is invested, the amount after one year will be
\[ 100(1 + 0.04)^2 = 108.16 \]
dollars, which is slightly more than the $108 which results from the single annual addition. If the interest is added quarterly (so that 2% is added four times a year), the amount after one year will be
\[ 100(1 + 0.02)^4 = 108.24 \]
dollars (approximately). In general, when the year is divided into \(m\) equal periods, the rate is \(r/m\) over each period, and the balance after one year is
\[ P\left(1 + \frac{r}{m}\right)^m, \]
where \(P\) is the initial deposit. Taking \(m\) larger and larger (formally, letting \(m\) tend to infinity) we find ourselves in the situation of continuous compounding. Now, from above,
\[ \lim_{m \to \infty} \left(1 + \frac{r}{m}\right)^m = e^r, \]
so the balance after one year is \(Pe^r\). If invested for a further year, we would have \(Pe^r e^r = P(e^r)^2 = Pe^{2r}\). After \(t\) years of continuous compounding, the balance of the account would be \(Pe^{rt}\).
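The approach of \(P(1 + r/m)^m\) to \(Pe^r\) can be observed numerically. Below is a minimal sketch using the figures of the text (\(P = 100\), \(r = 0.08\)); the function name and the particular values of \(m\) tried are our own choices.

```python
import math

def balance_one_year(principal, rate, periods):
    """Balance after one year when interest at rate/periods is added `periods` times."""
    return principal * (1 + rate / periods) ** periods

# Annual, twice-yearly, quarterly, monthly and daily compounding.
for m in (1, 2, 4, 12, 365):
    print(m, round(balance_one_year(100, 0.08, m), 4))

# Continuous compounding, P * e**r.
print("continuous", round(100 * math.exp(0.08), 4))
```

The printed balances increase with \(m\) but never exceed the continuous-compounding value, which acts as their limit.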
7.7 Series
Let us continue with the story of our investor. It is natural to investigate how the balance varies if the investor adds a certain amount to the account each year. Suppose that she adds \(\$P\) to the account at the beginning of each year, so that at the beginning of the first year the balance is \(\$P\). At the beginning of the second year the balance, in dollars, will be \(P(1 + r) + P\); this represents the money from the first year with interest added, and the new, further, deposit of \(\$P\). Convince yourself that, continuing in this way, the balance at the beginning of year \(t\) is, in dollars,
\[ P + P(1 + r) + \cdots + P(1 + r)^{t-2} + P(1 + r)^{t-1}. \]
How can we calculate this expression? Note that it is the sum of the first \(t\) terms (that is, term 0 to term \(t - 1\)) of the geometric progression with first term \(P\) and common ratio \((1 + r)\). Before coming back to this, we shall discuss such things in a more general setting.
Given a sequence \(y_0, y_1, y_2, y_3, \ldots\), a finite series is a sum of the form
\[ y_0 + y_1 + \cdots + y_{t-1}, \]
the first \(t\) terms added together, for some number \(t\). There are two important results about series, concerning the cases where the corresponding sequence is an arithmetic progression (in which case the series is called an arithmetic series) and where it is a geometric progression (in which case the series is called a geometric series).
7.7.1 Arithmetic series
The main result here is that if \(y_t = a + dt\) describes an arithmetic progression and \(S_t\) is the sequence
\[ S_t = y_0 + y_1 + y_2 + \cdots + y_{t-1}, \]
then
\[ S_t = \frac{t(2a + (t - 1)d)}{2}. \]
There is a useful way of remembering this result. Notice that \(S_t\) may be rewritten as
\[ S_t = t\,\frac{(a + (a + (t - 1)d))}{2} = t\,\frac{(y_0 + y_{t-1})}{2}, \]
so that we have the following easily remembered result: an arithmetic series has a sum equal to the number of terms, \(t\), times the value of the average of the first and last terms, \((y_0 + y_{t-1})/2\). Equivalently, the average value \(S_t/t\) of the \(t\) terms is the average, \((y_0 + y_{t-1})/2\), of the first and last terms.
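A formula like this is easy to test against a direct sum. The sketch below (our own function names, with the progression 5, 8, 11, 14, ... from Section 7.3 as illustration) compares the two.

```python
def arithmetic_series(a, d, t):
    """Sum of the first t terms y_0, ..., y_{t-1} of y_t = a + d*t, by the formula."""
    return t * (2 * a + (t - 1) * d) / 2

def arithmetic_series_direct(a, d, t):
    """The same sum, obtained by adding the terms one at a time."""
    return sum(a + d * s for s in range(t))

# First four terms of 5, 8, 11, 14, ...
print(arithmetic_series(5, 3, 4), arithmetic_series_direct(5, 3, 4))

# 'Number of terms times the average of the first and last terms'.
first, last = 5, 5 + 3 * 3
print(4 * (first + last) / 2)
```

All three printed numbers should coincide, illustrating both forms of the result.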
Activity 7.2 Find the sum of the first n terms of an arithmetic series whose first
term is 1 and whose common difference is 5.
7.7.2 Geometric series
We now look at geometric series. It is easily checked (by multiplying out the expression) that, for any \(x\),
\[ (1 - x)(1 + x + x^2 + \cdots + x^{t-1}) = 1 - x^t. \]
So, if \(x \ne 1\) and \(y_t = ax^t\), then the geometric series
\[ S_t = y_0 + y_1 + \cdots + y_{t-1} = a + ax + ax^2 + \cdots + ax^{t-1} \]
is given by
\[ S_t = \frac{a(1 - x^t)}{1 - x}. \]
Example 7.1 In our earlier discussion on savings accounts, we came across the expression
\[ P + P(1 + r) + \cdots + P(1 + r)^{t-2} + P(1 + r)^{t-1}. \]
We now see that this is a geometric series with \(t\) terms, first term \(P\) and common ratio \(1 + r\). Therefore it equals
\[ P\,\frac{1 - (1 + r)^t}{1 - (1 + r)} = \frac{P}{r}\left((1 + r)^t - 1\right). \]
Activity 7.3 Find an expression for
\[ 2 + 2(3) + 2(3^2) + 2(3^3) + \cdots + 2(3)^n. \]
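The geometric series formula, and the savings expression of Example 7.1, can be checked in the same spirit. This is a sketch only: the function names are ours, and the deposit of 100 at 5% over 10 years is an illustrative choice, not a figure from the guide. Exact fractions are used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction

def geometric_series(a, x, t):
    """Sum a + a*x + ... + a*x**(t-1) by the closed formula (requires x != 1)."""
    if x == 1:
        raise ValueError("the formula needs x != 1; the sum is then simply a * t")
    return a * (1 - x ** t) / (1 - x)

def geometric_series_direct(a, x, t):
    """The same sum, obtained by adding the t terms one at a time."""
    return sum(a * x ** s for s in range(t))

# Savings example: deposit P each year at interest rate r (illustrative values).
P, r, years = 100, Fraction(5, 100), 10
by_formula = geometric_series(P, 1 + r, years)
by_adding = geometric_series_direct(P, 1 + r, years)
print(by_formula == by_adding, float(by_formula))
```

With \(x = 1 + r\) the denominator \(1 - x\) is \(-r\), which is how the form \((P/r)((1 + r)^t - 1)\) arises.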
7.8 Finding a formula for a sequence
Often we can use results on series to determine an exact formula for the members of a
sequence of numbers. The following example illustrates this.
Example 7.2 Suppose a sequence of numbers is constructed as follows. The first number, \(y_0\), is 1, and each other number in the sequence is obtained from the previous number by multiplying by 2 and adding 1 (so that \(y_t = 2y_{t-1} + 1\), for \(t \ge 1\)). What is the general expression for \(y_t\) in terms of \(t\)?
We can see that
\[
\begin{aligned}
y_1 &= 2y_0 + 1 = 2(1) + 1 = 2 + 1, \\
y_2 &= 2y_1 + 1 = 2(2 + 1) + 1 = 2^2 + 2 + 1, \\
y_3 &= 2y_2 + 1 = 2(2^2 + 2 + 1) + 1 = 2^3 + 2^2 + 2 + 1, \\
y_4 &= 2y_3 + 1 = 2(2^3 + 2^2 + 2 + 1) + 1 = 2^4 + 2^3 + 2^2 + 2 + 1.
\end{aligned}
\]
In general, it would appear that
\[ y_t = 2^t + 2^{t-1} + \cdots + 2^2 + 2 + 1. \]
But this is just a geometric series: perhaps this is clearer if we write it as
\[ y_t = 1 + 2 + 2^2 + \cdots + 2^{t-1} + 2^t, \]
from which it is clear that this is the sum of the first \(t + 1\) terms of the geometric progression with first term 1 and common ratio 2. Thus, using the formula for the sum of a geometric series, we have
\[ y_t = \frac{1 - 2^{t+1}}{1 - 2} = 2^{t+1} - 1. \]
7.9 Limiting behaviour
When \(x\) is greater than 1, as \(t\) increases, \(x^t\) will eventually become greater than any given number, and we say that \(x^t\) tends to infinity as \(t\) tends to infinity.⁴ We write this in symbols as
\[ x^t \to \infty \ \text{as} \ t \to \infty \quad \text{or} \quad \lim_{t \to \infty} x^t = \infty. \]
On the other hand, when \(x < 1\) and \(x > -1\), we have
\[ x^t \to 0 \ \text{as} \ t \to \infty \quad \text{or} \quad \lim_{t \to \infty} x^t = 0. \]
We notice that, while \(x^t\) gets closer and closer to 0 for all values of \(x\) in the range \(-1 < x < 1\), its behaviour depends to some extent on whether \(x\) is positive or negative. When \(x\) is negative, the terms are alternately positive and negative, and we say that the approach to zero is oscillatory. For example, when \(x = -0.2\), the sequence \(x^t\) is
\[ 1, \ -0.2, \ 0.04, \ -0.008, \ 0.0016, \ -0.00032, \ 0.000064, \ -0.0000128, \ \ldots \]
for \(t \ge 0\). When \(x\) is less than \(-1\), the sequence is again oscillatory, but it does not approach any limit, the terms being alternately large-positive and large-negative. In this case, we say that \(x^t\) oscillates increasingly.
As an application of this, let us consider again the geometric series
\[ S_t = a + ax + ax^2 + \cdots + ax^{t-1}. \]
We have, using the formula for a geometric series, that
\[ S_t = \frac{a(1 - x^t)}{1 - x}. \]
If \(-1 < x < 1\) then \(x^t \to 0\) as \(t \to \infty\). This means that \(S_t\) approaches the number
\[ \frac{a(1 - 0)}{1 - x} = \frac{a}{1 - x} \]
as \(t\) increases. In other words,
\[ S_t \to \frac{a}{1 - x} \quad \text{as} \ t \to \infty. \]
We call this limit the sum to infinity of the sequence given by \(y_t = ax^t\). Note that a geometric sequence has a sum to infinity which is finite only if the common ratio is strictly between \(-1\) and 1.
⁴ See Anthony and Biggs (1996) Section 3.3.
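The way the partial sums \(S_t\) settle down to \(a/(1 - x)\) can be watched numerically. The following sketch uses the progression with first term 3 and common ratio 1/2 from Section 7.4 as an illustration; the function names, and the decision to raise an error outside \(-1 < x < 1\), are our own.

```python
def partial_sum(a, x, t):
    """Sum of the first t terms of the geometric progression y_t = a * x**t (x != 1)."""
    return a * (1 - x ** t) / (1 - x)

def sum_to_infinity(a, x):
    """Limit of the partial sums, a / (1 - x), valid only for -1 < x < 1."""
    if not -1 < x < 1:
        raise ValueError("the sum to infinity is finite only when -1 < x < 1")
    return a / (1 - x)

for t in (1, 5, 10, 20, 40):
    print(t, partial_sum(3, 0.5, t))
print("limit", sum_to_infinity(3, 0.5))
```

Trying a negative ratio such as \(x = -0.2\) in `partial_sum` shows the oscillatory approach described above: successive partial sums lie alternately above and below the limit.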
Example 7.3 Consider the sequence with \(y_t = 1/2^t\) for \(t \ge 0\). The sum of the first \(t\) terms of this sequence would be given by
\[ S_t = 1 + \frac{1}{2} + \frac{1}{2^2} + \cdots + \frac{1}{2^{t-1}}. \]
Using the formula for the sum of a geometric series, we then have
\[ S_t = 2\left[1 - \left(\frac{1}{2}\right)^t\right], \]
and that \(S_t \to 2\) as \(t \to \infty\).
Activity 7.4 Find an expression for
\[ S_t = \frac{2}{3} + \left(\frac{2}{3}\right)^2 + \left(\frac{2}{3}\right)^3 + \cdots + \left(\frac{2}{3}\right)^t, \]
and determine the limit of \(S_t\) as \(t\) tends to infinity.
7.10 Financial applications
A number of problems in financial mathematics can be solved using arithmetic and
geometric series. Here is an example.
Example 7.4 John has opened a savings account with a bank, and they pay a fixed interest rate of 5% per annum, with the interest paid once a year, at the end of the year. He opened the savings account with a payment of $100 on 1 January 2003, and will be making deposits of $200 yearly, on the same date. What will his savings be after he has made \(N\) of these additional deposits? (Your answer will be an expression involving \(N\).)
If \(y_N\) is the required amount, then we have
\[ y_1 = (1.05)100 + 200, \]
and then
\[ y_2 = (1.05)y_1 + 200 = 100(1.05)^2 + 200(1.05) + 200, \]
so that, in general, we can spot the pattern and observe that
\[
\begin{aligned}
y_N &= 100(1.05)^N + 200(1.05)^{N-1} + 200(1.05)^{N-2} + \cdots + 200(1.05) + 200 \\
&= 100(1.05)^N + 200\left[1 + (1.05) + (1.05)^2 + \cdots + (1.05)^{N-2} + (1.05)^{N-1}\right] \\
&= 100(1.05)^N + 200\,\frac{1 - (1.05)^N}{1 - (1.05)} \\
&= 100(1.05)^N + 4000\left[(1.05)^N - 1\right],
\end{aligned}
\]
where we have used the formula for the sum of a geometric series.
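The pattern spotted in Example 7.4 can be confirmed by letting a program carry out the yearly steps \(y_N = (1.05)y_{N-1} + 200\) and comparing the result with the closed formula. This is an illustrative sketch only; the function names are not from the guide.

```python
def savings_by_recurrence(n):
    """Balance after n additional deposits, computed one year at a time."""
    balance = 100.0                      # the opening payment
    for _ in range(n):
        balance = 1.05 * balance + 200   # add 5% interest, then deposit 200
    return balance

def savings_by_formula(n):
    """Closed formula 100(1.05)^n + 4000((1.05)^n - 1) obtained in Example 7.4."""
    return 100 * 1.05 ** n + 4000 * (1.05 ** n - 1)

for n in (1, 2, 5, 10):
    print(n, round(savings_by_recurrence(n), 2), round(savings_by_formula(n), 2))
```

The two columns should agree for every \(n\); the same comparison works for any recurrence of the form \(y_t = xy_{t-1} + c\), such as the one in Example 7.2.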
Learning outcomes
At the end of this chapter and the relevant reading, you should be able to:
explain what is meant by arithmetic and geometric progressions, and calculate the
sum of finite arithmetic and geometric series
explain compound interest and calculate balances under compound interest
apply sequences and series in management and finance
analyse the long-term behaviour of series and sequences
Sample examination/practice questions
Question 7.1
A geometric progression has a sum to infinity of 3 and has second term, \(y_1\), equal to 2/3. Show that there are two possible values of the common ratio \(x\) and find the corresponding values of the first term \(a\).
Question 7.2
Suppose we have an initial amount, \(A_0\), to invest and we add an additional investment \(F\) at the end of each subsequent year. All investments earn interest at a rate of \(i\)% per annum, paid at the end of each year.
(a) Use the formula for the sum of a geometric series to derive a formula for the value of the investment, \(A_n\), after \(n\) years.
(b) An investor puts $10000 into an investment account that yields interest of 10% per annum. The investor adds an additional $5000 at the end of each year. How much will there be in the account at the end of five years? Show that if the investor has to wait \(N\) years until the balance is at least 80000, then
\[ N \ge \frac{\ln(13/6)}{\ln(1.1)}. \]
Question 7.3
An amount of $1000 is invested and attracts interest at a rate equivalent to 10% per
annum. Find expressions for the total after one year if the interest is compounded:
(a) annually,
(b) quarterly,
(c) monthly,
(d) daily. (Assume the year is not a leap year.)
What would be the total after one year if the interest is 10% compounded continuously?
Question 7.4
Suppose \(y_t = 1/2^{2t}\). Find the limit, as \(t \to \infty\), of
\[ S_t = y_0 + y_1 + \cdots + y_{t-1}. \]
Answers to activities
Feedback to activity 7.1
The required amount is \(1000(1 + 0.07)^4 = 1310.80\) dollars.
Feedback to activity 7.2
We have
\[ S_n = \frac{1}{2}\,n\,(2(1) + (n - 1)5) = \frac{n}{2}(5n - 3) = \frac{5}{2}n^2 - \frac{3}{2}n. \]
Feedback to activity 7.3
Noting that there are \(n + 1\) terms in the series, and that it is the sum of a geometric progression with first term 2 and common ratio 3, the expression is
\[ \frac{2(1 - 3^{n+1})}{1 - 3} = 3^{n+1} - 1. \]
Feedback to activity 7.4
\(S_t\) is the sum of the first \(t\) terms of a geometric progression with first term 2/3 and common ratio 2/3, so
\[ S_t = \frac{2}{3}\,\frac{1 - (2/3)^t}{1 - (2/3)} = 2\left[1 - \left(\frac{2}{3}\right)^t\right]. \]
As \(t \to \infty\), \((2/3)^t \to 0\) and so \(S_t \to 2\).
Answers to Sample examination/practice questions
Answer to question 7.1
We know that the sum to infinity is given by the formula \(a/(1 - x)\) and that \(y_1 = ax\). Therefore, the given information is
\[ \frac{a}{1 - x} = 3 \quad \text{and} \quad ax = \frac{2}{3}. \]
From the first equation, \(a = 3(1 - x)\) and the second equation then gives \(3(1 - x)x = 2/3\), from which we obtain the quadratic equation \(9x^2 - 9x + 2 = 0\). This has the two solutions \(x = 2/3\) and \(x = 1/3\). The corresponding values of the first term \(a\) can then be found, using \(a = 3(1 - x)\), to be 1 and 2, respectively. So, as suggested by the question, there are two geometric progressions that have the required sum to infinity and second term.
Answer to question 7.2
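(a) After 1 year, at the beginning of the second, the amount \(A_1\) in the account is
\[ A_1 = A_0\left(1 + \frac{i}{100}\right) + F, \]
because the initial amount \(A_0\) has attracted interest at a rate of \(i/100\) and \(F\) has been added. Similar considerations show that
\[
\begin{aligned}
A_2 &= \left(1 + \frac{i}{100}\right)A_1 + F \\
&= \left(1 + \frac{i}{100}\right)\left[A_0\left(1 + \frac{i}{100}\right) + F\right] + F \\
&= A_0\left(1 + \frac{i}{100}\right)^2 + F\left(1 + \frac{i}{100}\right) + F,
\end{aligned}
\]
and
\[
\begin{aligned}
A_3 &= \left(1 + \frac{i}{100}\right)A_2 + F \\
&= \left(1 + \frac{i}{100}\right)\left[A_0\left(1 + \frac{i}{100}\right)^2 + F\left(1 + \frac{i}{100}\right) + F\right] + F \\
&= A_0\left(1 + \frac{i}{100}\right)^3 + F\left(1 + \frac{i}{100}\right)^2 + F\left(1 + \frac{i}{100}\right) + F.
\end{aligned}
\]
In general, if we continued, we could see that \(A_n\) is given by
\[ A_0\left(1 + \frac{i}{100}\right)^n + \underbrace{F\left(1 + \frac{i}{100}\right)^{n-1} + F\left(1 + \frac{i}{100}\right)^{n-2} + \cdots + F\left(1 + \frac{i}{100}\right) + F}_{n \text{ terms}}. \]
Now, looking at the \(n\) terms involving \(F\), we use the formula for the sum of a geometric progression to get
\[ F\,\frac{1 - \left(1 + \frac{i}{100}\right)^n}{1 - \left(1 + \frac{i}{100}\right)} = \frac{100F}{i}\left[\left(1 + \frac{i}{100}\right)^n - 1\right], \]
so that
\[ A_n = A_0\left(1 + \frac{i}{100}\right)^n + \frac{100F}{i}\left[\left(1 + \frac{i}{100}\right)^n - 1\right]. \]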
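For (b), we use the formula just obtained, with \(A_0 = 10000\), \(i = 10\), \(F = 5000\) and \(n = 5\), and we see that
\[
\begin{aligned}
A_5 &= 10000\left(1 + \frac{10}{100}\right)^5 + \frac{100(5000)}{10}\left[\left(1 + \frac{10}{100}\right)^5 - 1\right] \\
&= 10000(1.1)^5 + 50000\left[(1.1)^5 - 1\right] \\
&= 46630.60
\end{aligned}
\]
dollars.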
Now, for the balance to be at least 80000 dollars after \(N\) years, we need \(A_N \ge 80000\), which means
\[ 10000(1.1)^N + 50000\left[(1.1)^N - 1\right] \ge 80000. \]
This is equivalent, after a little manipulation, to
\[ 60000(1.1)^N \ge 130000, \]
or \((1.1)^N \ge 13/6\). To solve this, we can take logarithms and see that we need
\[ N \ln(1.1) \ge \ln(13/6), \]
so
\[ N \ge \frac{\ln(13/6)}{\ln(1.1)}, \]
as required.
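Both parts of this answer can be checked numerically. The sketch below is our own illustration (the function name `investment_value` is hypothetical): it evaluates the formula for \(A_n\) and then finds the smallest whole number of years satisfying the inequality for \(N\).

```python
import math

def investment_value(a0, f, i, n):
    """A_n = A_0 (1 + i/100)^n + (100 F / i) [(1 + i/100)^n - 1]."""
    growth = (1 + i / 100) ** n
    return a0 * growth + (100 * f / i) * (growth - 1)

# Part (b): value after five years.
print(round(investment_value(10000, 5000, 10, 5), 2))

# Smallest whole number of years with a balance of at least 80000.
bound = math.log(13 / 6) / math.log(1.1)
smallest_n = math.ceil(bound)
print(bound, smallest_n)
```

Since the bound is not a whole number, the investor in practice has to wait until the end of the next full year, which is what `math.ceil` computes.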
Answer to question 7.3
We use the fact that if the interest is paid in \(m\) equally spaced instalments, then the total after one year is \(1000\left(1 + \frac{r}{m}\right)^m\), where \(r = 0.1\) and \(m = 1, 4, 12, 365\) in the four cases. Therefore the answers to the first four parts of the problem are as follows:
(a) \(1000(1 + 0.1) = 1100\).
(b) \(1000\left(1 + \frac{0.1}{4}\right)^4 = 1000(1.025)^4\).
(c) \(1000\left(1 + \frac{0.1}{12}\right)^{12}\).
(d) \(1000\left(1 + \frac{0.1}{365}\right)^{365}\).
For the last part, we use the fact that under continuous compounding at rate \(r\), an amount \(P\) grows to \(Pe^r\) after one year, so the answer here is \(1000e^{0.1}\).
Answer to question 7.4
Note that \(1/2^{2t} = 1/4^t = (1/4)^t\), so this is a geometric series where the common ratio is 1/4. The first term is 1, and there are \(t\) terms, so
\[ S_t = \frac{1 - (1/4)^t}{1 - (1/4)} = \frac{4}{3}\left[1 - \left(\frac{1}{4}\right)^t\right]. \]
As \(t \to \infty\), \((1/4)^t \to 0\) and so \(S_t \to 4/3\).
Appendix A
Sample examination paper
Important note: This Sample examination paper reflects the examination and
assessment arrangements for this course in the academic year 2009–10. The format and
structure of the examination may have changed since the publication of this subject
guide. You can find the most recent examination papers on the VLE where all changes
to the format of the examination are posted.
The purpose of this Sample examination paper is to give you an idea of the format of
the paper and the type of questions that might be asked. Note that, unlike most of your
other subjects, this is a two-hour paper rather than a three-hour paper and that all of
the questions on the paper should be attempted.
Not all questions in Section A will necessarily carry the same number of marks. (These
are compulsory questions, so you should do them all anyway.)
Sample examination paper
Time: 2 hours
This examination has two sections, Section A and Section B.
Attempt ALL questions in Section A.
Attempt BOTH questions in Section B.
Section A represents 60% of the available marks, and each question from Section B
represents 20% of the available marks.
SECTION A
Answer all six questions from this section (60 marks in total).
1. A firm is the only producer of a particular good, and the demand equation for the good is \(2p + q = 20\), where \(p\) denotes the selling price and \(q\) is the quantity produced by the firm. The firm’s fixed costs are 12 and its marginal cost function is \(MC = 2 + q\). Find an expression for the profit function, \(\Pi(q)\). Sketch the graph of \(\Pi\) for \(q\) between 0 and 8. Determine the value of the production, \(q\), which maximises the firm’s profit.
2. Use a matrix method to find the numbers \(x, y, z\) that satisfy
\[ x + y + z = 4, \quad 2x - y + 2z = 5 \quad \text{and} \quad -x + 2y + 3z = 3. \]
3. Determine the integrals
\[ \int x(x + 2)^{5/2}\,dx \quad \text{and} \quad \int \frac{e^x}{e^{2x} - 1}\,dx. \]
4. Use the Lagrange multiplier method to determine the values of \(x\) and \(y\) that minimise \(2x + y\) subject to the constraint \(x^2 y^3 = 27\).
5. Find the critical points of the function \(f(x, y) = x^3 - y^3 - 3xy\). Determine, for each, whether it is a local maximum, a local minimum, or a saddle point.
6. A graduate starts work with a company for an initial salary of \(\$S\). His contract is such that, each year, his salary will increase by $500, in addition to (that is, on top of) a percentage increase of 5%. (So, for example, if \(S = 2000\) then his salary, once he has worked for one year, will rise to \(2000 + 0.05(2000) + 500 = 2600\).) Determine an expression, in as simple a form as possible, for his salary once he has worked in the company for \(N\) full years (in other words, the salary he will be paid in year \(N + 1\) of his employment).
SECTION B
Answer both questions from this section (20 marks each).
7. (a) A firm is the only supplier of two goods, \(X\) and \(Y\), and the demand equations for these goods are
\[ x = 50 - \frac{1}{4}p_X, \quad y = 76 - \frac{1}{2}p_Y, \]
where \(p_X\) and \(p_Y\) are the prices of \(X\) and \(Y\), and where \(x, y\) denote (respectively) the quantities of \(X\) and \(Y\). The firm has a joint total cost function
\[ TC = 6x^2 + 4xy + 4y^2 + 10. \]
Determine an expression, in terms of \(x\) and \(y\), for the firm’s profit. Determine also the values of production \(x\) and \(y\) that will give maximum profit to the firm.
(b) A firm produces a good from a raw material that costs 1 dollar per unit. Using \(r\) units of raw material, the firm can produce \(r^\alpha\) units of its good, where \(\alpha\) is a fixed number such that \(0 < \alpha < 1\). The selling price of the good is 2 dollars per unit. Show that the maximum profit the firm can achieve is
\[ 2^{1/(1-\alpha)}\,\alpha^{1/(1-\alpha)}\left(\frac{1}{\alpha} - 1\right). \]
8. A firm has a weekly production function given by
\[ q(k, l) = k^{1/4} l^{1/4}, \]
where \(k\) and \(l\) denote, respectively, the capital and labour employed. Each unit of capital costs $1 a week and each unit of labour costs $16 a week. Suppose that, when producing any given amount, the firm minimises its total expenditure on capital and labour.
(i) Show that when the weekly production level is \(q\), this minimum total expenditure on capital and labour, \(C(q)\), is \(8q^2\) per week.
(ii) Determine the value of the Lagrange multiplier, \(\lambda^*\), corresponding to the optimising values of \(k\) and \(l\). Verify that \(\lambda^* = C'(q)\).
Suppose that the firm pays 1 dollar in all other variable costs (raw materials, and so on) for each unit produced, and that the selling price of the good is fixed at 33 dollars per unit.
(iii) Find the weekly level of production \(q\) that maximises the firm’s profit.
Appendix B
Comments on the Sample examination paper
1. From the demand equation,
\[ p = 10 - \frac{q}{2}, \]
so the firm’s total revenue is
\[ TR = pq = 10q - \frac{1}{2}q^2. \]
For the firm’s total costs we integrate the marginal cost to get
\[ TC = \int MC\,dq = \int (2 + q)\,dq = 2q + \frac{1}{2}q^2 + c, \]
where, to find \(c\), we note that \(TC(0) = FC = 12\), so \(c = 12\). Consequently, the firm’s profit function is
\[ \Pi(q) = TR - TC = \left(10q - \frac{1}{2}q^2\right) - \left(2q + \frac{1}{2}q^2 + 12\right) = -q^2 + 8q - 12. \]
To maximise this, we now solve \(\Pi'(q) = 0\), which is
\[ 8 - 2q = 0, \]
giving \(q = 4\). This quantity does indeed maximise the firm’s profit because
\[ \Pi''(q) = -2 < 0. \]
To sketch this function for values of \(q\) between 0 and 8 we note that \(\Pi(0) = -12\), \(\Pi(8) = -12\) and that \(\Pi(q) = 0\) gives
\[ q^2 - 8q + 12 = 0 \iff (q - 6)(q - 2) = 0 \iff q = 2, 6, \]
so, along with the information above about the turning point, the graph of \(\Pi(q)\) is a downward-opening parabola through \((0, -12)\), \((2, 0)\), \((6, 0)\) and \((8, -12)\), with its maximum at \((4, 4)\).
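The key points of the sketch can be tabulated quickly. The following is a rough check only, with the profit function rebuilt from the revenue and cost expressions derived above; the function name is our own.

```python
def profit(q):
    """Profit = total revenue minus total cost, using the expressions derived above."""
    revenue = 10 * q - q ** 2 / 2
    cost = 2 * q + q ** 2 / 2 + 12
    return revenue - cost

# Values of the profit at the whole-number outputs between 0 and 8.
for q in range(0, 9):
    print(q, profit(q))

best_q = max(range(0, 9), key=profit)
print("largest tabulated profit at q =", best_q)
```

The table only samples whole-number values of \(q\), so it supports, but does not replace, the calculus argument that the maximum is at \(q = 4\).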
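2. The augmented matrix is
\[ \begin{pmatrix} 1 & 1 & 1 & 4 \\ 2 & -1 & 2 & 5 \\ -1 & 2 & 3 & 3 \end{pmatrix}. \]
Using row operations to reduce, we have
\[
\begin{aligned}
\begin{pmatrix} 1 & 1 & 1 & 4 \\ 2 & -1 & 2 & 5 \\ -1 & 2 & 3 & 3 \end{pmatrix}
&\to \begin{pmatrix} 1 & 1 & 1 & 4 \\ 0 & -3 & 0 & -3 \\ 0 & 3 & 4 & 7 \end{pmatrix} \\
&\to \begin{pmatrix} 1 & 1 & 1 & 4 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 4 & 4 \end{pmatrix} \\
&\to \begin{pmatrix} 1 & 1 & 1 & 4 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \end{pmatrix},
\end{aligned}
\]
so we have \(z = 1\), \(y = 1\) and \(x = 4 - y - z = 2\).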
Be careful that in answering a question like this, you use only valid row operations. For
example, multiplying two rows together or subtracting a fixed number from each entry
of a row, is not permissible.
Methods such as Cramer’s rule can be successfully used to solve this question (although
these techniques are not part of the formal Mathematics 1 syllabus).
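3. By parts,
\[
\begin{aligned}
\int x(x + 2)^{5/2}\,dx &= \frac{2}{7}x(x + 2)^{7/2} - \int \frac{2}{7}(x + 2)^{7/2}\,dx \\
&= \frac{2}{7}x(x + 2)^{7/2} - \frac{4}{63}(x + 2)^{9/2} + c.
\end{aligned}
\]
Alternatively, we can substitute \(u = x + 2\). This gives \(du = dx\) and
\[
\begin{aligned}
\int x(x + 2)^{5/2}\,dx &= \int (u - 2)u^{5/2}\,du \\
&= \int \left(u^{7/2} - 2u^{5/2}\right)du \\
&= \frac{2}{9}u^{9/2} - \frac{4}{7}u^{7/2} + c \\
&= \frac{2}{9}(x + 2)^{9/2} - \frac{4}{7}(x + 2)^{7/2} + c.
\end{aligned}
\]
Next, for the second integral, let \(u = e^x\). We have \(du = e^x\,dx\) so the integral is
\[ \int \frac{1}{u^2 - 1}\,du = \int \frac{1}{(u - 1)(u + 1)}\,du. \]
Now, partial fractions can be used to see that
\[
\begin{aligned}
\int \frac{1}{(u - 1)(u + 1)}\,du &= \int \left(\frac{1}{2}\,\frac{1}{u - 1} - \frac{1}{2}\,\frac{1}{u + 1}\right)du \\
&= \frac{1}{2}\ln(u - 1) - \frac{1}{2}\ln(u + 1) + c \\
&= \frac{1}{2}\ln(e^x - 1) - \frac{1}{2}\ln(e^x + 1) + c.
\end{aligned}
\]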
4. The Lagrangean is
\[ L(x, y, \lambda) = 2x + y - \lambda(x^2 y^3 - 27). \]
The first order conditions are
\[
\begin{aligned}
2 - 2\lambda x y^3 &= 0, \\
1 - 3\lambda x^2 y^2 &= 0, \\
x^2 y^3 &= 27.
\end{aligned}
\]
The first two of these equations imply that
\[ \frac{1}{x y^3} = \frac{1}{3x^2 y^2}, \]
so \(y = 3x\). Then, using the final equation (the constraint), we have \(x^2(3x)^3 = 27\), which is \(x^5 = 1\), so \(x = 1\) and \(y = 3\). There is no need to check that these values do indeed give a minimum: that method is not part of this syllabus, and there is no credit for applying it, so it is not something to spend time on. (It is not as straightforward as some candidates seemed to think. The second order test for constrained problems does not simply involve the second derivatives of \(L\) with respect to \(x\) and \(y\).)
5. The partial derivatives are \(f_x = 3x^2 - 3y\) and \(f_y = -3y^2 - 3x\). We solve \(f_x = f_y = 0\). Now, \(f_x = 0\) means \(y = x^2\). Then, from \(f_y = 0\) we have \((x^2)^2 + x = 0\). This is \(x(x^3 + 1) = 0\). So \(x = 0\) or \(x = -1\). Corresponding values of \(y\) are 0 and 1. So there are two critical points: \((0, 0)\) and \((-1, 1)\). (It is also possible to solve for \(y\) first, and then find the corresponding values of \(x\).) The second derivatives are
\[ f_{xx} = 6x, \quad f_{xy} = -3 \quad \text{and} \quad f_{yy} = -6y. \]
At \((0, 0)\) we have \(f_{xx}f_{yy} - f_{xy}^2 < 0\), so this is a saddle point. At \((-1, 1)\) we have \(f_{xx}f_{yy} - f_{xy}^2 > 0\) and \(f_{xx} < 0\) so this is a local maximum.
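The second-order test used here is mechanical enough to write down as a short routine. The sketch below is our own illustration for this particular function (the names `gradient` and `classify` are hypothetical); it first confirms that both points are critical and then applies the test.

```python
def gradient(x, y):
    """First partial derivatives (f_x, f_y) of f(x, y) = x**3 - y**3 - 3*x*y."""
    return (3 * x ** 2 - 3 * y, -3 * y ** 2 - 3 * x)

def classify(x, y):
    """Second-order test for f(x, y) = x**3 - y**3 - 3*x*y at a critical point."""
    fxx, fxy, fyy = 6 * x, -3, -6 * y
    discriminant = fxx * fyy - fxy ** 2
    if discriminant < 0:
        return "saddle point"
    if discriminant > 0:
        return "local maximum" if fxx < 0 else "local minimum"
    return "test inconclusive"

for point in [(0, 0), (-1, 1)]:
    print(point, gradient(*point), classify(*point))
```

If the quantity \(f_{xx}f_{yy} - f_{xy}^2\) were zero the test would give no information, which is why the routine has a separate ‘inconclusive’ outcome.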
6. Let \(y_N\) be the salary after \(N\) full years. (Make it clear what you are trying to determine.) Then,
\[
\begin{aligned}
y_0 &= S, \\
y_1 &= (1.05)S + 500, \\
y_2 &= (1.05)\left[(1.05)S + 500\right] + 500 \\
&= (1.05)^2 S + (1.05)500 + 500, \\
y_3 &= (1.05)\left[(1.05)^2 S + (1.05)500 + 500\right] + 500 \\
&= (1.05)^3 S + (1.05)^2 500 + (1.05)500 + 500,
\end{aligned}
\]
and, in general,
\[ y_N = (1.05)^N S + (1.05)^{N-1}500 + (1.05)^{N-2}500 + \cdots + 500. \]
This simplifies (noting the geometric progression) to
\[ y_N = (1.05)^N S + 500\,\frac{1 - (1.05)^N}{1 - 1.05} = (1.05)^N S + 10000\left[(1.05)^N - 1\right]. \]
Some students will use recurrence (or difference) equations (a technique which is not
formally part of Mathematics 1) here, and this is perfectly acceptable.
Also note that we cannot assume that S = 2000 here. The question only uses this as an
example value to help explain what is going on.
7. (a) We have \(p_X = 200 - 4x\) and \(p_Y = 152 - 2y\), so that the profit is
\[
\begin{aligned}
\Pi &= x p_X + y p_Y - TC \\
&= x(200 - 4x) + y(152 - 2y) - (6x^2 + 4xy + 4y^2 + 10) \\
&= 200x - 10x^2 - 6y^2 + 152y - 4xy - 10.
\end{aligned}
\]
We then solve \(\Pi_x = \Pi_y = 0\), which is
\[ 200 - 20x - 4y = 0 \quad \text{and} \quad -12y + 152 - 4x = 0, \]
equivalent to
\[ 5x + y = 50 \quad \text{and} \quad x + 3y = 38, \]
which has solution \(x = 8\), \(y = 10\).
We note that
\[ \Pi_{xx}\Pi_{yy} - \Pi_{xy}^2 = (-20)(-12) - (-4)^2 > 0, \]
and \(\Pi_{xx} = -20 < 0\), so it is a maximum.
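As a quick sanity check on the algebra (a sketch only, with function names of our own choosing), one can confirm that both first-order conditions hold at \(x = 8\), \(y = 10\) and that the profit is lower at nearby production levels.

```python
def profit(x, y):
    """Profit function obtained above for part 7(a)."""
    return 200 * x - 10 * x ** 2 - 6 * y ** 2 + 152 * y - 4 * x * y - 10

def profit_x(x, y):
    """Partial derivative of the profit with respect to x."""
    return 200 - 20 * x - 4 * y

def profit_y(x, y):
    """Partial derivative of the profit with respect to y."""
    return 152 - 12 * y - 4 * x

print(profit_x(8, 10), profit_y(8, 10), profit(8, 10))

# Profit at the four neighbouring whole-number production plans.
for dx, dy in [(1, 0), (-1, 0), (0, 1), (0, -1)]:
    print((8 + dx, 10 + dy), profit(8 + dx, 10 + dy))
```

Comparing with a handful of neighbours is not a proof of a maximum; that is the job of the second-order conditions checked in the text.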
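7. (b) The total revenue is \(2q\) where \(q\) is the quantity produced, and the cost is \(r\). Since \(q = r^\alpha\), the profit is therefore \(\Pi(r) = 2r^\alpha - r\). To maximise profit we solve \(\Pi'(r) = 2\alpha r^{\alpha - 1} - 1 = 0\). This gives \(r = r^* = (2\alpha)^{1/(1-\alpha)}\). Note that \(\Pi''(r) = 2\alpha(\alpha - 1)r^{\alpha - 2} < 0\), since \(0 < \alpha < 1\), hence we do indeed maximise profit. Now, the corresponding profit is
\[ \Pi(r^*) = 2(2\alpha)^{\alpha/(1-\alpha)} - (2\alpha)^{1/(1-\alpha)}. \]
Since \((2\alpha)^{\alpha/(1-\alpha)} = (2\alpha)^{1/(1-\alpha)}/(2\alpha)\), this simplifies to
\[ 2^{1/(1-\alpha)}\,\alpha^{1/(1-\alpha)}\left(\frac{1}{\alpha} - 1\right), \]
as required.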
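8. The weekly production function is
\[ q(k, l) = k^{1/4} l^{1/4}, \]
and the associated cost function is
\[ k + 16l, \]
dollars per week. For (i), we have to minimise the firm’s expenditure on these costs
subject to the constraint that the firm is producing an amount $q$. The Lagrangean is
therefore
\[ L(k, l, \lambda) = k + 16l - \lambda\left(k^{1/4} l^{1/4} - q\right). \]
We now solve
\[ \frac{\partial L}{\partial k} = 1 - \frac{1}{4}\lambda k^{-3/4} l^{1/4} = 0, \]
\[ \frac{\partial L}{\partial l} = 16 - \frac{1}{4}\lambda k^{1/4} l^{-3/4} = 0, \]
\[ \frac{\partial L}{\partial \lambda} = -\left(k^{1/4} l^{1/4} - q\right) = 0. \]
From the first two equations,
\[ \frac{\lambda}{4} = \frac{k^{3/4}}{l^{1/4}} = 16\,\frac{l^{3/4}}{k^{1/4}}, \qquad (\ast) \]
so $k = 16l$. Now the third equation becomes
\[ (16l)^{1/4} l^{1/4} = q, \]
or $2l^{1/2} = q$ and therefore the optimal values of $k$ and $l$ are
\[ l = \left(\frac{q}{2}\right)^2 = \frac{q^2}{4} \quad\text{and}\quad k = 16\left(\frac{q^2}{4}\right) = 4q^2. \]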
B. Comments on the Sample examination paper
The corresponding minimum cost is
\[ k + 16l = 4q^2 + 16\left(\frac{q^2}{4}\right) = 4q^2 + 4q^2 = 8q^2, \]
and so, when producing an amount $q$, the total expenditure on capital and labour is
$C(q) = 8q^2$ dollars per week.
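The result of part (i) can be sanity-checked numerically. In the Python sketch below, the constraint $k^{1/4} l^{1/4} = q$ is rewritten as $k = q^4 / l$, so the cost along the constraint depends on $l$ alone. The sketch confirms that the stated inputs produce $q$, cost $8q^2$, and are cheaper than other feasible choices of $l$. The function names and the sample value of $q$ are illustrative assumptions.

```python
def optimal_inputs(q):
    """Inputs from the worked solution: k = 4q^2 and l = q^2 / 4."""
    return 4 * q**2, q**2 / 4


def output(k, l):
    """Production function q(k, l) = k^(1/4) * l^(1/4)."""
    return k**0.25 * l**0.25


def cost_on_constraint(l, q):
    """Cost k + 16l when k is chosen so that output equals q, i.e. k = q^4 / l."""
    return q**4 / l + 16 * l


if __name__ == "__main__":
    q = 3.0  # arbitrary output level, chosen for illustration
    k_opt, l_opt = optimal_inputs(q)
    print("output at the stated inputs:", output(k_opt, l_opt))
    print("cost at the stated inputs:", k_opt + 16 * l_opt, "and 8q^2 =", 8 * q**2)
    # Moving l away from q^2 / 4 while staying on the constraint raises the cost.
    for factor in (0.5, 0.9, 1.1, 2.0):
        assert cost_on_constraint(factor * l_opt, q) > cost_on_constraint(l_opt, q)
```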
For (ii), the Lagrange multiplier is, from (∗), given by
\[ \frac{\lambda}{4} = \frac{k^{3/4}}{l^{1/4}} = \left(\frac{k^3}{l}\right)^{1/4}, \]
and so the value corresponding to the optimising values of $k$ and $l$ is
\[ \frac{\lambda^*}{4} = \left(\frac{(4q^2)^3}{q^2/4}\right)^{1/4} = \left(4^4 q^4\right)^{1/4} = 4q, \]
that is, we have $\lambda^* = 16q$. We also have
\[ C'(q) = 16q, \]
and so we can see that $\lambda^* = C'(q)$ as required.
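The identity $\lambda^* = C'(q)$ in part (ii) can also be illustrated numerically. The Python sketch below computes the multiplier from the expression $\lambda / 4 = (k^3 / l)^{1/4}$ at the optimal inputs and compares it with a central-difference estimate of the derivative of $C(q) = 8q^2$. The function names, the step size and the sample values of $q$ are assumptions made for illustration.

```python
def min_cost(q):
    """Minimum cost C(q) = 8 q^2 from part (i)."""
    return 8 * q**2


def multiplier(q):
    """Lagrange multiplier from lambda / 4 = (k^3 / l)^(1/4) at k = 4q^2, l = q^2 / 4."""
    k, l = 4 * q**2, q**2 / 4
    return 4 * (k**3 / l) ** 0.25


def marginal_cost_estimate(q, h=1e-6):
    """Central-difference estimate of C'(q)."""
    return (min_cost(q + h) - min_cost(q - h)) / (2 * h)


if __name__ == "__main__":
    for q in (0.5, 1.0, 2.0, 5.0):
        assert abs(multiplier(q) - marginal_cost_estimate(q)) < 1e-4
    print("multiplier matches the estimated marginal cost")
```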
For (iii), given a quantity, $q$, the firm’s weekly revenue is $33q$ dollars and its weekly
costs are given by
\[ 8q^2 + q + FC, \]
dollars for capital and labour, all other variable costs and any fixed costs respectively.
This means that the firm’s weekly profit function is given by
\[ \Pi(q) = 33q - (8q^2 + q + FC) = -8q^2 + 32q - FC, \]
dollars. To maximise this, we set $\Pi'(q) = 0$, which is
\[ -16q + 32 = 0, \]
giving $q = 2$. This level of production does indeed maximise the firm’s profit because
$\Pi''(q) = -16 < 0$.
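A simple grid search gives an independent check of part (iii) and shows that the profit-maximising output does not depend on the fixed cost. In the Python sketch below, the function names, the grid step, the upper bound on $q$ and the sample fixed costs are all assumptions made for illustration.

```python
def weekly_profit(q, fixed_cost):
    """Profit 33q - (8q^2 + q + FC)."""
    return 33 * q - (8 * q**2 + q + fixed_cost)


def best_q_on_grid(fixed_cost, step=0.01, q_max=5.0):
    """Return the grid point in [0, q_max] with the largest profit."""
    n = int(round(q_max / step))
    grid = [i * step for i in range(n + 1)]
    return max(grid, key=lambda q: weekly_profit(q, fixed_cost))


if __name__ == "__main__":
    for fixed_cost in (0, 10, 100):  # arbitrary fixed costs
        print("FC =", fixed_cost, "best q on grid:", best_q_on_grid(fixed_cost))
```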
Comment form
We welcome any comments you may have on the materials which are sent to you as part of your
study pack. Such feedback from students helps us in our effort to improve the materials produced
for the International Programmes.
If you have any comments about this guide, either general or specific (including corrections,
non-availability of Essential readings, etc.), please take the time to complete and return this form.
Title of this subject guide:
Name:
Address:
Email:
Student number:
For which qualification are you studying?
Comments:
Please continue on additional sheets if necessary.
Date:
Please send your comments on this form (or a photocopy of it) to:
Publishing Manager, International Programmes, University of London, Stewart House, 32 Russell Square, London WC1B 5DN, UK.