Study Advice Service
Mathematics in Decision Making
an introduction for business studies/management students (levels 5 and 6)
Background
This summary sheet is derived from experiences in teaching Management Decision Making at
level 5 and level 6 and identifies common problem areas for students.
Decision making is a multidimensional activity, in that it uses both quantitative and qualitative
information to aid the process. One approach to this multidimensionality is the three-phased
approach (Jennings & Wattam, 1998), which identifies three key steps to complete in the
process of making decisions about problems and opportunities:
Problem Identification → Solution Development → Solution Selection
In applying this framework to management decisions, we need to understand and model
information in a variety of ways. This involves handling both qualitative information (text, values,
opinions, evidence etc) and quantitative information (data, trends, percentages, probabilities
etc).
This short guide is a summary of the key issues to consider in managing quantitative models in
particular. Most examples and activities you will come across use time series data (data which
is dependent upon the passage of time). This is usually described by the x variable
(independent) and the data against which this varies is the y variable (dependent). Note,
however, you may come across different data sets which do not use time as the independent
variable. The procedures for handling such problems are largely the same.
Smoothing Data
Periodic or random variations in data can be smoothed by weighting the importance of the most
recent forecast and the most recent actual data observed in different ways. The typical
nomenclature in this approach uses the Greek symbol:
 - α (alpha) – used to denote a smoothing constant (typically given a value between 0.1 and 0.5)
For example:

New Forecast = (chosen fraction * Actual Data value) + ((1 − chosen fraction) * Old Forecast value)

where the chosen fraction (or smoothing constant) is α. Hence:

a) New Forecast = (α * Actual Data value) + ((1 − α) * Old Forecast value)

or, in subscript notation:

F4 = αA3 + (1 − α)F3

where:
F4 = next (needed) forecast
A3 = most recent actual data
F3 = most recent forecast
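As an illustration, the smoothing update above can be sketched in Python. The sales figures, the α value and the starting forecast below are made-up examples, not data from this guide:

```python
# Simple exponential smoothing: New Forecast = alpha*Actual + (1 - alpha)*Old Forecast
def smooth(actuals, alpha, initial_forecast):
    """Return the running one-step-ahead forecasts for a series of actual values."""
    forecasts = [initial_forecast]
    for a in actuals:
        # each new forecast blends the latest actual with the previous forecast
        forecasts.append(alpha * a + (1 - alpha) * forecasts[-1])
    return forecasts

actuals = [100, 108, 95, 110]   # hypothetical sales figures
f = smooth(actuals, alpha=0.3, initial_forecast=100)
print(f[-1])                    # forecast for the next period
```

With α = 0.3, each new forecast moves 30% of the way towards the latest actual value, which is why larger α values react faster to changes but smooth less.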
NOTE – It is very important to understand that there is a hierarchy of brackets (i.e. ( and )) used
in calculations involving quantitative methods. You always start with the innermost brackets:
e.g. the part with A is worked out first, then the part with B, then the part with C and then the part with D.
ANSWER = (D*(C+1-(B*10-(A+20))))
Unless you consider this carefully when using Excel, the calculations and formulae used in its
cells will not generate the correct answer. For example:
ANSWER = (A*B+1-R)
will not give the same result as:
ANSWER = (A*((B+1)-R))
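You can check directly how much difference bracket placement makes; A, B and R below are arbitrary values chosen only for illustration:

```python
# Bracket placement changes the result; the values are arbitrary examples.
A, B, R = 2, 3, 5

answer1 = (A * B + 1 - R)        # A*B is worked out first, then +1, then -R
answer2 = (A * ((B + 1) - R))    # innermost bracket (B+1) is worked out first

print(answer1, answer2)          # the two answers differ
```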
There are also additional equations for smoothing quantitative data forecasts using more
smoothing constants (such as β (beta) and γ (gamma) values).
Shorthand and symbols when dealing with lots of numbers
As this focus on quantitative methods tends to ask you to consider lots of data, another general
symbol you will come across is Σ (sigma).
Σ – this means 'SUM OF' – whatever follows it, you add up.
For example, if your data set was called SET 1 and included the values (1, 45, 45, 34, 6, 52),
then Σ (SET 1) would be equal to:
Σ (SET 1) = 1 + 45 + 45 + 34 + 6 + 52 = 183
You will also come across symbols such as:
≈ and ≅ – mean approximately equal to (e.g. A ≈ B, if A equals 34.23 and B equals 34.27)
≥ – means whatever is on the left hand side is greater than or equal to whatever is on the right
hand side of the inequality symbol e.g. (10 ≥ 9 is OK, as is 10 ≥ 10, but not 10 ≥ 11)
≤ – means whatever is on the left hand side is less than or equal to whatever is on the right hand
side of the inequality symbol e.g. (9 ≤ 10 is OK, as is 9 ≤ 9, but not 9 ≤ 8)
ȳ or x̄ – means the mean (or average) of your y or x values
∞ – means 'infinity'
X1, X2 etc – using subscripts such as numbers indicates that the data all belong to one data set
(or variable)
ye – this is the estimate of y, typically indicating the forecast you have made as a result of
modelling some quantitative data (y is usually the dependent variable)
R2 or r2 – this is the symbol given to the statistical calculation called the 'coefficient of
determination'. It can be found from a dataset (x, y) directly, as the square of the correlation
coefficient:

r = (nΣxy − ΣxΣy) / √((nΣx² − (Σx)²)(nΣy² − (Σy)²))

It can also be found from the equation:

r2 = Σ(ye − ȳ)² / Σ(y − ȳ)²

The nearer your value for r2 is to 1, the better your fit. The nearer it is to 0, the poorer your best
fit line is.
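A minimal sketch of the second r² formula, using a small hypothetical dataset and a made-up set of forecasts:

```python
# Coefficient of determination: r2 = sum((ye - ybar)^2) / sum((y - ybar)^2)
y  = [2.0, 4.0, 6.0, 8.0]    # hypothetical observed values
ye = [2.2, 3.8, 6.1, 7.9]    # hypothetical forecasts from some fitted line

ybar = sum(y) / len(y)       # mean of the observed y values

# explained variation divided by total variation
r2 = sum((f - ybar) ** 2 for f in ye) / sum((v - ybar) ** 2 for v in y)
print(round(r2, 3))
```

Here the forecasts sit very close to the observed values, so r² comes out close to 1.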
Methods of handling quantitative data for forecasting (1)
Handling datasets for forecasting purposes usually means looking for trends in the data and
then isolating those trends, so as to be able to rearrange them to give a value for some time in
the future (if your independent variable is time (t)). How you isolate them follows a common
methodology, but how you rearrange them needs careful consideration of which mathematical
relationship best describes the data. For example, the easiest way to do this with two datasets
(x and y) is via a mathematical method called linear regression.
NOTE – in choosing such a method, you are assuming that the relationship between x and y is
linear. This means that equal-sized changes in x always produce equal-sized changes in y,
whatever the value of x, assuming the relationship in the data is only between x and y. Non-linear
relationships between x and y do not obey this rule. You can also think of this in terms of linear
relationships being described by a straight line, whereas non-linear relationships are described
by curves.
Linear regression further assumes that your dependent values (y) contain an underlying trend
(U), a dominant trend (T), perhaps a seasonal trend (S), and random variation (R). Usually
values for R are ignored. This process identifies the values of U, T and S for any given value of
x, to provide a forecast ye. It uses the method of least squares to do this. This method
determines the equation of a best fit line, which minimizes the overall distance between the line
and all known data points.
The equation of a straight line is:
y = mx +c
Where:
y = dependent value
x = independent value
m = gradient of line (this is also described by T)
c = value at which the line crosses the y axis (this is also described by U)
So, the method of least squares analysis tries to find the values of m and c that best describe
the most accurate fitting line between data points. In the examples below, graph (a) has a better
‘best fit line’ than that also shown in graph (b).
Graph (a): the actual data plotted (y, dependent variable, against x, independent variable), with
the best fit line drawn through the points.
The equation of the straight line in graph (a) is: y = 1.12x + 1.28. This is the equation of the best
fit line for the data plotted. This means that the distance between the values in this dataset
(actual data plotted) and the best fit line for x and y is minimized. Hence it is the best fitting line,
when compared with the poor fitting line drawn in graph (b). This poor best fit line has the
equation y=1.45x+1.28 and it is easy to see that despite the similarity in the equations, the best
fit line in graph (a) is superior.
Graph (b): the same original dataset, with both the best fitting line and a poor fitting line drawn
for comparison.
The values of m and c can be determined directly from your original x and y values, using these
formulae:

m = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²)

c = (ΣyΣx² − ΣxΣxy) / (nΣx² − (Σx)²)

In these formulae, 'n' denotes the number of values you have in your original dataset. A
measure of how good a fit your 'best fit line' actually is can be found from the equation for r²
given earlier. Hence, a typical dataset in x and y could look like this (taken from Carlson, 2002):
x        y        xy        x²        y²
1        2        2         1         4
2        5        10        4         25
3        6        18        9         36
4        10       40        16        100
5        16       80        25        256
6        21       126       36        441
7        25       175       49        625
8        26       208       64        676
9        32       288       81        1024
10       35       350       100       1225
Σx = 55  Σy = 178  Σxy = 1297  Σx² = 385  Σy² = 4412

Hence for this dataset:

m = (10(1297) − (55)(178)) / (10(385) − (55)(55)) = 3.85 (to 2 decimal places)
c = ((178)(385) − (55)(1297)) / (10(385) − (55)(55)) = −3.40 (to 2 decimal places)
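A quick check of these formulae in Python, using the column sums from the table above (the second form of the intercept used here, c = (Σy − mΣx)/n, is algebraically equivalent to the formula given):

```python
# Least squares gradient and intercept from the column sums:
#   m = (n*Sxy - Sx*Sy) / (n*Sxx - Sx^2),   c = (Sy - m*Sx) / n
n, Sx, Sy, Sxy, Sxx = 10, 55, 178, 1297, 385   # sums from the table above

m = (n * Sxy - Sx * Sy) / (n * Sxx - Sx ** 2)  # gradient of the best fit line
c = (Sy - m * Sx) / n                          # intercept on the y axis
print(round(m, 2), round(c, 2))
```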
Sometimes in datasets you may observe a regular repeating change in the value of y. This
could be a seasonal pattern (the S values, which do not necessarily refer to the more familiar
use of the word 'season'). It is better to think of it as a regular repeating cycle which recurs every
nth point. For example, if a company's sales were related strongly to the season of the year,
then this cycle would have a period of 4 (i.e. every 5th point would repeat itself, so only 4 points
are needed to describe the entire seasonal change).
You can moderate your forecasts by accommodating such regular periodic changes.
Methods of handling quantitative data for forecasting (2)
You may be asked to consider datasets (x, y) which do not seem to have a clear linear
relationship. These are called non-linear relationships and three common types are shown
below in graphs (c), (d) and (e). You make this judgement by inspecting your dataset and
considering whether you would expect to see such a relationship in your data. It is important to
note that equations which describe curves are part of a family of curves, which can look very
similar to each other. For example, if you magnified the edge of a circle sufficiently, it would
come to resemble a straight line. So you must exercise care when choosing to use a non-linear
equation for forecasting.
Some of the more familiar types of non-linear equations used are described in graphs (c), (d)
and (e).
In graph (c), for example, the curve has the equation y = c(m^x) (or c multiplied by m raised to
the power of x).
In graph (d), the curve has the equation y = c(x^m) (or c multiplied by x raised to the power of
m). Note that the shape of this curve changes with the value of m.
In graph (e), the curve has the equation y = c·exp(−mx) (or c multiplied by the exponential of
minus m times x). There are lots of other non-linear relationships, but these are not covered by
this study guide.
These equations can be used in the same way as with forecasting via the linear regression and
least squares analysis, but there are a few more intermediate steps.
These steps are the transformations needed to convert a non-linear relationship into a linear
relationship. To do this you need to use logarithms and the laws of the logarithms.
Graph (c): y (dependent variable) plotted against x (independent variable), showing a steeply
rising curve of the form y = cm^x.
For example, graph (c) above has the general equation y = cm^x. This can be expanded using
the laws of logarithms to derive:

y = cm^x
OR
log y = log (cm^x)
OR
log y = log c + log (m^x)
OR
(1) log y = log c + x log m

If you compare this with the general equation of a straight line (y = mx + c) you can see that
equation (1) now looks like a straight line equation, if we let:
Y = log y
C = log c
M = log m
Then:
Y = Mx + C
This has transformed the non-linear equation into a linear equation, to which we can apply the
standard least squares method. Remember that in this case you will have a column of log y
(which replaces y in your calculations to find M and C). However, when you have derived your
values from least squares analysis, you must remember that these actually refer to M and C, so
you need to transform them back to the original values in the original equation y = cm^x.
Hence, suppose you find M to equal 0.45 and C to equal 1.4; transforming these back to find m
and c means taking the anti-logarithms of M and C. If you are using the log function (log base
10), this is simply:
m = 10^M and c = 10^C
OR: the actual gradient of your best fitting curve (your T value) is 10 raised to the power of M,
and the intercept on the y axis (the U value) is 10 raised to the power of C.
This same procedure can be applied to the equations described by graphs (d) and (e). That is
to say:
 - Identify the generic equation you wish to use to model the quantitative data
 - Transform that equation into a comparable linear format
 - Determine values of M and C (as appropriate) via least squares analysis
 - Transform values of M and C (as appropriate) back to find the true values of m and c (as
appropriate) in the original equation of the curve you have chosen to use.
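The four steps above can be sketched in Python for the graph (c) model. The dataset below is hypothetical and constructed to follow y = 2 × 1.5^x exactly, so the fit should recover m = 1.5 and c = 2:

```python
import math

# Fit y = c * m**x by transforming to: log10(y) = log10(c) + x*log10(m)
xs = [1, 2, 3, 4, 5]
ys = [2 * 1.5 ** x for x in xs]            # hypothetical data: y = 2 * 1.5^x

Y = [math.log10(v) for v in ys]            # transformed dependent values

n = len(xs)
Sx, SY = sum(xs), sum(Y)
SxY = sum(x * v for x, v in zip(xs, Y))
Sxx = sum(x * x for x in xs)

M = (n * SxY - Sx * SY) / (n * Sxx - Sx ** 2)  # gradient in log space = log10(m)
C = (SY - M * Sx) / n                          # intercept in log space = log10(c)

m, c = 10 ** M, 10 ** C                        # transform back to the curve's m and c
print(round(m, 4), round(c, 4))
```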
Graph (d): y (dependent variable) plotted against x (independent variable), showing a curve of
the form y = cx^m.
For example, graph (d) above has the general equation y = cx^m. This can be expanded using
the laws of logarithms to derive:

y = cx^m
OR
log y = log (cx^m)
OR
log y = log c + log (x^m)
OR
(2) log y = log c + m log x

If you compare (2) with the general equation of a straight line (y = mx + c) you can see that it
now looks like a straight line equation, if we let:
Y = log y
X = log x
C = log c
Then:
Y = mX + C
This has transformed the non-linear equation into a linear equation, to which we can apply the
standard least squares method. Remember that in this case you will have a column of log y
(which replaces y in your calculations to find m and C), as well as a column of log x (which
replaces your column of x). However, when you have derived your values from least squares
analysis, remember that m is already solved, but c actually refers to C, so you need to
transform it back to the original value in the original equation y = cx^m.
Hence, suppose you find m to equal 0.45 and C to equal 1.4; transforming back to find c means
taking the anti-logarithm of C. If you are using the log function (base 10), this is simply:
c = 10^C
OR: the intercept on the y axis (the U value), c, is 10 raised to the power of C.
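The same procedure can be sketched for the graph (d) model, again on hypothetical data constructed to follow y = 3x² exactly, so the fit should recover m = 2 and c = 3:

```python
import math

# Fit y = c * x**m via: log10(y) = log10(c) + m*log10(x)
xs = [1, 2, 3, 4, 5]
ys = [3 * x ** 2 for x in xs]              # hypothetical data: y = 3 * x^2

X = [math.log10(x) for x in xs]            # both variables are transformed here
Y = [math.log10(y) for y in ys]

n = len(X)
SX, SY = sum(X), sum(Y)
SXY = sum(a * b for a, b in zip(X, Y))
SXX = sum(a * a for a in X)

m = (n * SXY - SX * SY) / (n * SXX - SX ** 2)  # m needs no back-transformation
C = (SY - m * SX) / n
c = 10 ** C                                    # only the intercept is transformed back
print(round(m, 4), round(c, 4))
```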
Graph (e): y (dependent variable) plotted against x (independent variable), showing a decaying
curve of the form y = c·exp(−mx).
For example, graph (e) above has the general equation y = c·exp(−mx), or y = ce^(−mx). This
can be expanded using the laws of natural logarithms to derive:

y = ce^(−mx)
OR
ln y = ln (ce^(−mx))
OR
ln y = ln c + ln (e^(−mx))
OR
(3) ln y = ln c + (−mx)

This happens because ln(exp(X)) always equals X (i.e. the natural logarithm is the inverse of
the exponential function). If you compare this with the general equation of a straight line
(y = mx + c) you can see that equation (3) now looks like a straight line equation, if we let:
Y = ln y
C = ln c
Then:
Y = (−m)x + C
This has transformed the non-linear equation into a linear equation, to which we can apply the
standard least squares method. Remember that in this case you will have a column of ln y
(which replaces y in your calculations). However, when you have derived your values from least
squares analysis, remember that m is already solved (as the negative of the gradient), but c
actually refers to C, so you need to transform it back to the original value in the original
equation y = ce^(−mx).
Hence, suppose you find m to equal 0.45 and C to equal 1.4; transforming back to find c means
taking the exponential of C (the inverse of the natural logarithm). If you are using the ln function,
this is simply:
c = exp(C)
OR: the intercept on the y axis (the U value), c, has a value of exp(C).
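And a sketch for the graph (e) model, using natural logarithms this time. The data below is hypothetical and follows y = 0.8e^(−0.2x) exactly, so the fit should recover m = 0.2 and c = 0.8:

```python
import math

# Fit y = c * exp(-m*x) via: ln(y) = ln(c) - m*x
xs = [0, 5, 10, 15, 20]
ys = [0.8 * math.exp(-0.2 * x) for x in xs]    # hypothetical decaying data

Y = [math.log(y) for y in ys]                  # natural log, not log base 10

n = len(xs)
Sx, SY = sum(xs), sum(Y)
SxY = sum(x * v for x, v in zip(xs, Y))
Sxx = sum(x * x for x in xs)

slope = (n * SxY - Sx * SY) / (n * Sxx - Sx ** 2)  # this equals -m
C = (SY - slope * Sx) / n                          # this equals ln(c)

m, c = -slope, math.exp(C)                         # transform back
print(round(m, 4), round(c, 4))
```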
As with linear regression forecasts, you can also moderate your non-linear forecasts with
seasonality values if appropriate.
Worked example
Consider the dataset below of average monthly temperature (in degrees Celsius) for the last 41
months in Sao Paulo. Assuming there are no other variables or external factors affecting the
observed data, except the time of the year, forecast the temperature for the next year.
Month  Temperature   Month  Temperature   Month  Temperature   Month  Temperature
1      3             13     5             25     8             37     28
2      4             14     5.8           26     8.5           38     29
3      5.5           15     5.9           27     10            39     32
4      9             16     9             28     11            40     35
5      15            17     16            29     17.5          41     47
6      18            18     21            30     22
7      20            19     25            31     27.5
8      21            20     30            32     38
9      17            21     30            33     45
10     15            22     25            34     37
11     10            23     19            35     35
12     8             24     15            36     32
When this data is plotted on an x-y graph (below), the following relationship is displayed. It is
difficult to be certain at this juncture which forecasting model (linear or non-linear) to use to
predict the temperature for the next 12 months; this would require you to research the context
of the data. For the purposes of this guide, we will assume the relationship is non-linear and,
based upon the three non-linear equations discussed above, we will use the generic equation
(4) y = c·exp(+mx), or y = ce^(+mx). (Note in this case, as the curve is upward sloping, the m
value is given a positive sign.)
Graph: Temperature in Sao Paulo (degrees Celsius) plotted against month (1–41).
Using the expansion of equation (4), we will use the transformation ln y = ln c + (+mx), to
convert our data into a linear form we can manipulate. Hence our dataset would look like:
x (Month)  y (Temperature)  ln y     x·ln y    (ln y)²   x²
1          3.00             1.10     1.10      1.21      1.00
2          4.00             1.39     2.77      1.92      4.00
3          5.50             1.70     5.11      2.91      9.00
4          9.00             2.20     8.79      4.83      16.00
5          15.00            2.71     13.54     7.33      25.00
6          18.00            2.89     17.34     8.35      36.00
7          20.00            3.00     20.97     8.97      49.00
8          21.00            3.04     24.36     9.27      64.00
9          17.00            2.83     25.50     8.03      81.00
10         15.00            2.71     27.08     7.33      100.00
11         10.00            2.30     25.33     5.30      121.00
12         8.00             2.08     24.95     4.32      144.00
13         5.00             1.61     20.92     2.59      169.00
14         5.80             1.76     24.61     3.09      196.00
15         5.90             1.77     26.62     3.15      225.00
16         9.00             2.20     35.16     4.83      256.00
17         16.00            2.77     47.13     7.69      289.00
18         21.00            3.04     54.80     9.27      324.00
19         25.00            3.22     61.16     10.36     361.00
20         30.00            3.40     68.02     11.57     400.00
21         30.00            3.40     71.43     11.57     441.00
22         25.00            3.22     70.82     10.36     484.00
23         19.00            2.94     67.72     8.67      529.00
24         15.00            2.71     64.99     7.33      576.00
25         8.00             2.08     51.99     4.32      625.00
26         8.50             2.14     55.64     4.58      676.00
27         10.00            2.30     62.17     5.30      729.00
28         11.00            2.40     67.14     5.75      784.00
29         17.50            2.86     83.00     8.19      841.00
30         22.00            3.09     92.73     9.55      900.00
31         27.50            3.31     102.74    10.98     961.00
32         38.00            3.64     116.40    13.23     1024.00
33         45.00            3.81     125.62    14.49     1089.00
34         37.00            3.61     122.77    13.04     1156.00
35         35.00            3.56     124.44    12.64     1225.00
36         32.00            3.47     124.77    12.01     1296.00
37         28.00            3.33     123.29    11.10     1369.00
38         29.00            3.37     127.96    11.34     1444.00
39         32.00            3.47     135.16    12.01     1521.00
40         35.00            3.56     142.21    12.64     1600.00
41         47.00            3.85     157.86    14.82     1681.00
SUM        814.70           113.83   2626.12   336.27    23821.00
(Σx = 861)
Hence from this table of data values, we can determine m and C to be:

m = (41(2626.12) − (861)(113.83)) / (41(23821) − (861)(861))
C = ((113.83)(23821) − (861)(2626.12)) / (41(23821) − (861)(861))

Hence:
m = 0.0411 (to 4 decimal places)
C = 1.9140 (to 4 decimal places)
However, as these refer to m and C, and we need m and c, we must transform the C value to
obtain the true value for our non-linear equation. Hence:
c = exp(C)
Therefore:
m = 0.0411 (to 4 decimal places)
c = 6.7822 (to 4 decimal places)
Therefore the final non-linear equation is given as: (5) y = 6.7822e^(0.0411x)
Hence the forecasts for the next 12 months are found by inserting additional values of x into
equation (5), which generates table (1) below:
Month  Temperature   Month  Temperature   Month  Temperature   Month  Temperature   Month  Temperature
1      3             13     5             25     8             37     28            49     50.81
2      4             14     5.8           26     8.5           38     29            50     52.95
3      5.5           15     5.9           27     10            39     32            51     55.17
4      9             16     9             28     11            40     35            52     57.48
5      15            17     16            29     17.5          41     47            53     59.89
6      18            18     21            30     22            42     38.11
7      20            19     25            31     27.5          43     39.71
8      21            20     30            32     38            44     41.38
9      17            21     30            33     45            45     43.11
10     15            22     25            34     37            46     44.92
11     10            23     19            35     35            47     46.80
12     8             24     15            36     32            48     48.77
Table 1 (months 1–41 are the observed data; months 42–53 are forecasts from equation (5))
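The trend forecasts in table (1) come from substituting each new value of x into equation (5); a short sketch:

```python
import math

# Trend forecast from the fitted curve of equation (5): y = 6.7822 * exp(0.0411 * x)
def trend(x):
    return 6.7822 * math.exp(0.0411 * x)

# Forecasts for months 42-45, as in table (1): 38.11, 39.71, 41.38, 43.11
for month in range(42, 46):
    print(month, round(trend(month), 2))
```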
If you plot these forecasts with the original data, you notice that whilst the trend (T) and
underlying trend (U) seem appropriate, we are still missing the seasonal variation (S) in the
forecasts. This is the next step.
Seasonality can be addressed in two ways. The first is called the additive method; the second is
called the multiplicative method. In both methods you work out how much difference there is
between your trend (T) forecast (in table 1) and the real data you have. You then adjust your (T)
forecast by a suitable amount to end up with a final forecast.
Your adjustments are based on the two calculations of:
Final forecast = T + S (Additive method)
Final forecast = T * S (Multiplicative method)
We shall use the first of these (the additive method) to continue with this worked example.
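The final-forecast step can be sketched with the January figures from the worked example below; the multiplicative ratio shown is a made-up value, for illustration only:

```python
# Additive method, using the worked example's January values:
# trend forecast T = 7.07, seasonal adjustment S = -6.15
T = 7.07
S_additive = -6.15
final_additive = T + S_additive
print(round(final_additive, 2))          # close to the guide's 0.91 (which uses unrounded values)

# Multiplicative method uses a seasonal ratio instead (hypothetical ratio here):
S_ratio = 0.42                           # e.g. January running at 42% of trend
final_multiplicative = T * S_ratio
print(round(final_multiplicative, 2))
```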
Graph (f): Temperature in Sao Paulo (degrees Celsius) against month, with the best fit line
drawn and the March data points (n = 3, 15, 27) marked below it.
When including seasonality (S) in forecasts, one approach is to consider whether the
seasonality (i.e. the change observed in the actual data as it moves around the best fit line) is a
regular constant change, or whether that change is increasing or decreasing with the passage
of time. By looking at the original graph of the temperature in Sao Paulo over time, we could
argue that the regular ups and downs observed in this data seem to repeat every 12th data
point (hence the seasonality has a period of 12), and that these changes at every 12th point
appear to be more or less the same when compared with a rising trend (T).
For example, if you look at graph (f), the 3rd data point (n = 3), just after the bottom of the
seasonal cycle, always seems to be a little below the trend (T) value for this time. Hence the
temperature for each March (n = 3, 15, 27, 39) always seems to be about the same distance
below the trend line (T).
We therefore need to modify the forecast values we determined for (T) above in table 1 by the
amount of difference for each point in the seasonal cycle. In this example this is worked out by
the simple calculation of (Actual − Forecast (T)). Hence in table (2) below, the difference
between the actual data observed and the trend forecast is worked out. As this cycle of change
in the data repeats itself with a period of 12, we can take the average of the differences for the
same points in the cycle. This is shown in the column of table 2 headed 'Mean difference'; the
first value is the mean of the values (−4.07, −6.57, −10.95, −3.03).
Your final forecast is then the addition of the (T) forecast and the seasonal (S) modification.
Hence the first final forecast value is found to be (T) + (S), which equals:
Final forecast for January (n = 1) = 7.07 + (−6.15) = 0.91 (using unrounded values)
If you repeat this calculation for all your months, including your new forecasts, you end up with
table 2 and graph (g).
(The ln y, x·ln y, (ln y)² and x² columns of table 2 are identical to those in the earlier table and
are not repeated here.)

Month  Temperature  Trend (T)  Difference  Mean difference (S)  Final forecast
1      3.00         7.07       -4.07       -6.15                0.91
2      4.00         7.36       -3.36       -6.05                1.31
3      5.50         7.67       -2.17       -5.27                2.40
4      9.00         7.99       1.01        -3.41                4.59
5      15.00        8.33       6.67        3.65                 11.98
6      18.00        8.68       9.32        4.95                 13.62
7      20.00        9.04       10.96       8.13                 17.18
8      21.00        9.42       11.58       12.96                22.38
9      17.00        9.82       7.18        13.26                23.08
10     15.00        10.23      4.77        7.53                 17.76
11     10.00        10.66      -0.66       2.43                 13.09
12     8.00         11.11      -3.11       -1.36                9.75
13     5.00         11.57      -6.57       -6.15                5.42
14     5.80         12.06      -6.26       -6.05                6.01
15     5.90         12.56      -6.66       -5.27                7.29
16     9.00         13.09      -4.09       -3.41                9.68
17     16.00        13.64      2.36        3.65                 17.29
18     21.00        14.21      6.79        4.95                 19.16
19     25.00        14.81      10.19       8.13                 22.94
20     30.00        15.43      14.57       12.96                28.39
21     30.00        16.08      13.92       13.26                29.34
22     25.00        16.75      8.25        7.53                 24.28
23     19.00        17.45      1.55        2.43                 19.89
24     15.00        18.19      -3.19       -1.36                16.83
25     8.00         18.95      -10.95      -6.15                12.79
26     8.50         19.74      -11.24      -6.05                13.70
27     10.00        20.57      -10.57      -5.27                15.30
28     11.00        21.44      -10.44      -3.41                18.03
29     17.50        22.34      -4.84       3.65                 25.99
30     22.00        23.27      -1.27       4.95                 28.22
31     27.50        24.25      3.25        8.13                 32.38
32     38.00        25.27      12.73       12.96                38.23
33     45.00        26.33      18.67       13.26                39.59
34     37.00        27.43      9.57        7.53                 34.96
35     35.00        28.58      6.42        2.43                 31.02
36     32.00        29.78      2.22        -1.36                28.42
37     28.00        31.03      -3.03       -6.15                24.88
38     29.00        32.33      -3.33       -6.05                26.28
39     32.00        33.69      -1.69       -5.27                28.41
40     35.00        35.10      -0.10       -3.41                31.70
41     47.00        36.58      10.42       3.65                 40.23
42     -            38.11      -           4.95                 43.06
43     -            39.71      -           8.13                 47.84
44     -            41.38      -           12.96                54.34
45     -            43.11      -           13.26                56.37
46     -            44.92      -           7.53                 52.45
47     -            46.80      -           2.43                 49.24
48     -            48.77      -           -1.36                47.41
49     -            50.81      -           -6.15                44.66
50     -            52.95      -           -6.05                46.90
51     -            55.17      -           -5.27                49.89
52     -            57.48      -           -3.41                54.08
53     -            59.89      -           3.65                 63.55
Table 2
Graph (g): Forecast and actual temperature in Sao Paulo (degrees) against month, showing the
original data, the best fit line and the final forecast.
Graph (g) therefore shows a good fit between your final forecasts and the original data,
suggesting that the model should be appropriate and reasonably accurate for future forecasts.
To determine how accurate your forecasts are, you now need to determine the errors in them.
This is examined below.
As an aside, if you were to apply the multiplicative method to the (T) forecasts in this example,
rather than working out (Actual − Forecast (T)) as above, you would instead determine:
Seasonal proportional change = (Actual / Forecast (T)) × 100%
Hence, as with the calculations above, you end up with a proportional change (a % change to
be made to each (T) forecast) so as to arrive at the final forecast needed. This method
therefore allows you to consider increasing or decreasing rates of change in the seasonality
affecting the observed data.
Methods of handling quantitative data for forecasting (3)
In applying any statistical or numerical method to a range of data, it is helpful to know if your
forecasts are good, bad or indifferent. The value of r² has already been introduced, but there
are other measures of goodness that can be used to give information about how effective your
analysis has been. Some of these are listed below.
Mean Error (ME) – This is the average difference between your forecasts and the data you
have observed. It has the form:
ME = Σ(ye − y) / n
where n = number of data values of y you have
Mean Square Error (MSE) – This is the average squared difference between your forecasts and
the data you have observed. It has the form:
MSE = Σ(ye − y)² / n
where n = number of data values of y you have
It is a better measurement of goodness in your forecasts than ME, as squaring eliminates
negative values in the difference between your forecast and the observed data.
Mean Absolute Deviation (MAD) – This is the average absolute difference between your
forecasts and the data you have observed. It has the form:
MAD = Σ│ye − y│ / n
where n = number of data values of y you have
The straight bars either side of the expression mean you take the absolute difference (ignoring
any negative signs). It is a better measurement of goodness than ME for the same reason: it
eliminates negative values in the difference between your forecast and the observed data.
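All three measures can be computed in a few lines; the observed and forecast values below are hypothetical:

```python
# ME, MSE and MAD for a small hypothetical set of forecasts vs observations.
y  = [10.0, 12.0, 9.0, 11.0]    # observed data
ye = [11.0, 11.0, 10.0, 10.0]   # forecasts

n = len(y)
errors = [f - v for f, v in zip(ye, y)]      # forecast minus observed

ME  = sum(errors) / n                        # mean error
MSE = sum(e * e for e in errors) / n         # mean square error
MAD = sum(abs(e) for e in errors) / n        # mean absolute deviation
print(ME, MSE, MAD)
```

Note that ME comes out as 0 here even though every forecast is wrong (the positive and negative errors cancel), which is exactly why MSE and MAD are the better measures.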
Worked example
If we apply the MSE calculation to the worked example in table 2, we need to calculate the sum
of the squared differences between our forecasts and the original data, then divide this by the
number of data values we have (n = 41). Hence:
MSE = 566.55 / 41
MSE = 13.82 (to 2 decimal places)
Hence, taking the square root of this value, each forecast value is, on average, within about
3.72 degrees of the original data.
In terms of calculating the coefficient of determination:
r² = Σ(ye − ȳ)² / Σ(y − ȳ)²
Using the data in table 2, we can find this to be:
r² = 4611.709 / 5596.76
r² = 0.8239
In other words, the model used has accounted for approximately 82% of the observed variance
in the data, representing a good fit.
References:
Carlson, G.A. (2002), 'Least Squares Analysis', accessed at
http://www.stchas.edu/faculty/gcarlson/physics/docs/LeastSquaresAnalysis.pdf on 24 March
2006.
Jennings, D. and Wattam, S. (1998), 'Decision Making: An Integrated Approach', FT Prentice
Hall, UK.
The information in this leaflet can be made available on request.
Telephone 01482 466199.
© 03/2008