Gold Medal essay

Introduction:
This investigation seeks a mathematical model that relates the gold-medal-winning heights in the men's
high jump at the Olympic Games to the year in which each Games was held. The Olympic Games follow a
four-year cycle, so each Games takes place four years after the previous one. The relationship between the
winning heights and the years was modelled analytically with the help of technology such as Autograph
and other graphing software.
Table 1 shows the winning heights, in centimetres, of the men's high jump gold medallists at the Olympic
Games from 1932 to 1980. Over this period the winning heights ranged from 197 cm to 236 cm. It is worth
noting that the Olympic Games did not take place on two occasions, in 1940 and in 1944. Table 1 contains
the original data that will be used to build the model:
Table 1:
Year    Height (cm)
1932    197
1936    203
1948    198
1952    204
1956    212
1960    216
1964    218
1968    224
1972    223
1976    225
1980    236
Table 2 expresses each year as the number of years after 1896 (for example, 1932 corresponds to x = 36),
according to my adjusted axis. I chose this origin because I wanted the x-coordinates to be positive as far
as possible. The height y, which is the dependent variable, is measured in cm. The two axes are therefore
x for the year and y for the height.
Table 2:
Years since 1896 (x)    Height in cm (y)
36                      197
40                      203
52                      198
56                      204
60                      212
64                      216
68                      218
72                      224
76                      223
80                      225
84                      236
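The change of axis used for Table 2 can be sketched in a few lines. This is only an illustration: the names `ORIGIN_YEAR` and `years_since_origin` are my own, and the origin of 1896 is taken from the example in the text (1932 corresponds to 36).

```python
# Origin of the adjusted x-axis, taken from the text (1932 -> x = 36).
ORIGIN_YEAR = 1896

def years_since_origin(year, origin=ORIGIN_YEAR):
    """Return the adjusted x value for a calendar year."""
    return year - origin

# Olympic years from Table 1 (no Games in 1940 or 1944).
games = [1932, 1936, 1948, 1952, 1956, 1960, 1964, 1968, 1972, 1976, 1980]
xs = [years_since_origin(year) for year in games]
print(xs)
```

Passing a different `origin` gives the x values for any other choice of starting year.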
Graph 1:
In this kind of model it is neither possible nor meaningful for the variable y to take negative values,
since a height cannot be below 0 cm. In the high jump, the athletes are expected to clear at least a few
centimetres, and in practice a couple of hundred centimetres, as is the case in most Games. The heights in
this model are therefore whole numbers. By contrast, x could take negative values, which would represent
years before 1896; the x values are therefore integers. In this assignment, however, no negative values
are used, because all the years considered fall after 1896.
Developing and Evaluating the Analytical Model:
The scatter plot above shows that the data contain minima, maxima and inflection points, so the points do
not follow a straight line but zigzag. There is an upward trend from 1932 to 1936, followed by a downward
trend from 1936 to 1948. The trend is upward again from 1948 to 1968, declines from 1968 to 1972, and
turns upward once more from 1972 onwards. The relationship is not quadratic, because the data do not
follow a parabolic path: there is more than one maximum or minimum. For the same reason the data cannot
be modelled by a logarithmic or exponential function, since these functions have no maxima or minima. A
trigonometric function, another widely used type of function, is also unsuitable because the trend is not
periodic. A piecewise function might well have worked in this case, but in this assignment polynomial
functions will be examined.
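The trend description above can be checked with a short sketch that looks for points higher or lower than both of their neighbours. The function name `turning_points` is my own; the data are copied from Table 1.

```python
# Winning heights (cm) by Olympic year, copied from Table 1.
data = [(1932, 197), (1936, 203), (1948, 198), (1952, 204), (1956, 212),
        (1960, 216), (1964, 218), (1968, 224), (1972, 223), (1976, 225),
        (1980, 236)]

def turning_points(points):
    """Return (maxima, minima): years where the trend reverses."""
    maxima, minima = [], []
    # Walk over each interior point together with its two neighbours.
    for (_, prev), (year, here), (_, nxt) in zip(points, points[1:], points[2:]):
        if here > prev and here > nxt:
            maxima.append(year)
        elif here < prev and here < nxt:
            minima.append(year)
    return maxima, minima

maxima, minima = turning_points(data)
print("local maxima:", maxima)
print("local minima:", minima)
```

More than one local maximum or minimum is what rules out a single parabola, and it is also why a polynomial of at least third order is considered.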
A third-order polynomial of the form f(x) = ax^3 + bx^2 + cx + d was chosen as my first model function.
Four equations are required to calculate the values of the four parameters a, b, c and d. A year and its
corresponding height are substituted into each equation, and the four equations are then solved with the
help of matrices. To improve the accuracy of the model, I selected four data points that are spread fairly
evenly across Table 2. The four equations are presented in Table 3 below:
Table 3:
Years since 1896 (x)    Equation
36                      197 = a(36)^3 + b(36)^2 + c(36) + d
56                      204 = a(56)^3 + b(56)^2 + c(56) + d
68                      218 = a(68)^3 + b(68)^2 + c(68) + d
72                      224 = a(72)^3 + b(72)^2 + c(72) + d
In the above equations, the years 1932 and 1952 were selected to represent the lower half of the curve,
and the years 1964 and 1968 were selected to represent the upper half, covering points at and after the
inflection in the graph (which lies somewhere around 1964).
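In order to solve the equations, a matrix equation was set up in the following manner:
AX = B
A^-1 AX = A^-1 B
X = A^-1 B
where A is the 4 x 4 matrix of powers of x, X is the column of unknown coefficients and B is the column of
heights.
Substituting numbers:
[36^3  36^2  36  1] [a]   [197]
[56^3  56^2  56  1] [b] = [204]
[68^3  68^2  68  1] [c]   [218]
[72^3  72^2  72  1] [d]   [224]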
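As a rough check on the matrix step, the same system can be solved by elimination instead of on a GDC. This is only a sketch under stated assumptions: the helper `solve_linear_system` is my own, the four points are (36, 197), (56, 204), (68, 218) and (72, 224) with x in years after 1896, and exact fractions are used so that no rounding enters. Gauss-Jordan elimination is swapped in for the inverse-matrix calculation; both give the same X whenever A is invertible.

```python
from fractions import Fraction

def solve_linear_system(a_rows, b_col):
    """Solve A X = B by Gauss-Jordan elimination using exact fractions."""
    n = len(b_col)
    # Build the augmented matrix [A | B].
    aug = [[Fraction(v) for v in row] + [Fraction(b)]
           for row, b in zip(a_rows, b_col)]
    for col in range(n):
        # Swap in a row with a non-zero pivot.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot becomes 1.
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

# Assumed fitting points: x in years after 1896, y in cm.
points = [(36, 197), (56, 204), (68, 218), (72, 224)]
A = [[x**3, x**2, x, 1] for x, _ in points]
B = [y for _, y in points]

a, b, c, d = solve_linear_system(A, B)

def cubic(x):
    """Evaluate the fitted cubic a x^3 + b x^2 + c x + d."""
    return a * x**3 + b * x**2 + c * x + d

for x, y in points:
    print(x, y, float(cubic(x)))
print([float(v) for v in (a, b, c, d)])
```

The coefficients printed depend on which four points and which year origin are used, so they should be compared with a quoted function only when both match.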
Therefore the following function describes the mathematical relationship between the winning Olympic
high jump heights and the corresponding years:
Function 1: f(x) ≈ -0.000414x^3 + 0.0859x^2 - 4.91x + 284.3
The Graphics Display Calculator (GDC) has automatically rounded the numbers to 4 significant figures.
The function has been plotted below as Graph 2:
Graph 2:
Graph 2 gives the impression that the analytical model fits the given data quite well. The curve follows
the general trend of the data, in which the height increases with time. Four of the points lie on the
curve; these are the points that were used to create the model function. Three of the points lie above the
curve and the other four lie below it. Overall, the points are distributed fairly evenly above and below
the curve.
One limitation of this model is that after 1980 the curve continues to rise. This is not realistic,
because there is a limit to how high a human being can jump; the heights cannot increase indefinitely. It
can therefore safely be presumed that this function is valid only for the years up to 1980.
Another limitation appears when the curve is examined before 1932. Going back in time from 1932, the
model curve rises, which would mean that earlier winning heights were greater, but that was not the case.
The function is therefore valid only for the years from 1932 onwards. Table 3 below shows the residual
for each year. If Δy is 0, the point lies on the curve. The larger the magnitude of Δy, the greater the
distance between the point and the curve. A negative Δy means that the point lies below the curve,
whereas a positive Δy means that the point lies above it. The residuals range in magnitude from 0 cm to
12.7 cm, and their standard deviation is 5.934 cm. In some years, namely 1936, 1972, 1976 and 1980, the
errors are larger than the standard deviation, which suggests that a better fit may exist.
Table 3:
Years since 1896 (x)    Height y1 (cm)    Height predicted by curve y2 (cm)    Δy = y1 - y2 (cm)
36                      197               197.0                                0.0
40                      203               196.5                                6.5
52                      198               200.8                                -2.8
56                      204               204.0                                0.0
60                      212               208.0                                4.0
64                      216               212.6                                3.4
68                      218               218.0                                0.0
72                      224               224.0                                0.0
76                      223               230.6                                -7.6
80                      225               237.7                                -12.7
84                      236               245.3                                -9.3
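The residual column and its standard deviation can be reproduced with the sketch below. It assumes that the quoted figure is a sample standard deviation (dividing by n - 1), which is what `statistics.stdev` computes; on that assumption the result agrees with the 5.934 quoted in the text. The variable names are my own.

```python
from statistics import stdev

# (x, observed height y1, height predicted by the curve y2), from the table above.
rows = [(36, 197, 197.0), (40, 203, 196.5), (52, 198, 200.8),
        (56, 204, 204.0), (60, 212, 208.0), (64, 216, 212.6),
        (68, 218, 218.0), (72, 224, 224.0), (76, 223, 230.6),
        (80, 225, 237.7), (84, 236, 245.3)]

# Residual Δy = y1 - y2: positive means the data point lies above the curve.
residuals = [round(y1 - y2, 1) for _, y1, y2 in rows]

# Sample standard deviation of the residuals (divides by n - 1).
spread = stdev(residuals)

print(residuals)
print(round(spread, 3))
```

Using `statistics.pstdev` instead would divide by n and give a slightly smaller value, so the choice of formula matters when comparing against a calculator.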
Using technology, I created a natural exponential fit:
This curve closely resembles my analytical model, but it does not pass through the same four data points
that the analytical model passes through. This can be seen in Graph 3 below:
Graph 3:
Although the curve follows the general path of the data, it does not match the crests (1932-1948 and
1948-1964), the troughs (1936-1952 and 1968-1976) or the inflection point (around 1956). The data may be
fitted better by a polynomial function of a higher order. Since there appear to be approximately two
crests, two troughs and one inflection point, a fifth- or sixth-order polynomial is likely to be a good
fit. A fifth-order polynomial will be analysed in this paper. The polynomial function below was found
using regression technology, and all numbers have been rounded to four significant figures.
Function 3: f(x) = 0.00005655x^5 - 0.01037x^4 + 0.9877x^3 - 51.52x^2 + 1396x - 0.0001518
This function has been plotted on Graph 4 using technology.
Graph 4:
The green curve in the above graph denotes the fifth-order polynomial function and the pink curve denotes
the analytical cubic model. The fifth-order polynomial can be seen as a better fit, because it passes
through some of the data points (the years 1936, 1956 and 1980) and lies closer to others (the years 1948,
1960, 1972 and 1976) than the cubic function does.
Predictions Based on Analytical Model:
Using the cubic analytical model (Function 1), the winning heights can be estimated for 1940 and 1944,
the two years in which the Games were not held. Here x counts years after 1890, as in Tables 4 and 5, so
1940 corresponds to x = 50 and 1944 corresponds to x = 54.
For 1940, which was not in the original data, x = 50 is substituted into Function 1:
f(x) ≈ -0.0001302x^3 + 0.04870x^2 - 3.645x + 273.9
f(50) ≈ -0.0001302(50)^3 + 0.04870(50)^2 - 3.645(50) + 273.9
f(50) ≈ 197.1
For 1944, x = 54 is substituted into Function 1:
f(x) ≈ -0.0001302x^3 + 0.04870x^2 - 3.645x + 273.9
f(54) ≈ -0.0001302(54)^3 + 0.04870(54)^2 - 3.645(54) + 273.9
f(54) ≈ 198.5
According to this model, the men's gold-medal-winning height would have been about 197 cm in 1940 and
about 199 cm in 1944. Both figures seem realistic and achievable, since an increase of 2 cm from one
Games to the next is plausible.
Using the same analytical cubic function (Function 1), the winning heights can also be predicted for the
Games of 1984 (x = 94) and 2016 (x = 126), again counting years after 1890.
For 1984, x = 94 is substituted into Function 1:
f(x) ≈ -0.0001302x^3 + 0.04870x^2 - 3.645x + 273.9
f(94) ≈ -0.0001302(94)^3 + 0.04870(94)^2 - 3.645(94) + 273.9
f(94) ≈ 253.3
For 2016, x = 126 is substituted into Function 1:
f(x) ≈ -0.0001302x^3 + 0.04870x^2 - 3.645x + 273.9
f(126) ≈ -0.0001302(126)^3 + 0.04870(126)^2 - 3.645(126) + 273.9
f(126) ≈ 327.2
According to this model, the winning height would have been 253 cm in 1984 and would be 327 cm in 2016.
Neither figure seems realistic. The winning height cannot keep increasing after 1980 as the curve does,
so the model is unreliable after 1980, just as it is before 1932.
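The four predictions can be cross-checked without relying on rounded coefficients. The sketch below evaluates the cubic through the four fitted data points directly, using Lagrange interpolation with exact fractions. It assumes that x counts years after 1890, so that the fitted years 1932, 1952, 1964 and 1968 become 42, 62, 74 and 78; the function name `lagrange_eval` is my own. On these assumptions the results agree, to rounding, with the values quoted above.

```python
from fractions import Fraction

# Assumption: x counts years after 1890, so these are the fitted data
# points for 1932, 1952, 1964 and 1968 on that axis (heights in cm).
points = [(42, 197), (62, 204), (74, 218), (78, 224)]

def lagrange_eval(points, x):
    """Evaluate the polynomial through the given points at x, exactly."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        # Build the i-th Lagrange basis term: it equals yi at xi and 0 at
        # every other fitted x value.
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if j != i:
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total

for year in (1940, 1944, 1984, 2016):
    x = year - 1890
    print(year, x, float(lagrange_eval(points, x)))
```

Because the curve is pinned only between 1932 and 1968, the values for 1984 and 2016 are extrapolations, which is why they drift so far from realistic heights.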
Interpreting Analytical Model for Additional Data:
Graph 6 below includes additional data for the years 1896-1932 (excluding 1900, 1916 and 1924) and for
1980-2008. This data was plotted against the cubic analytical model (Function 1).
Graph 6
Graph 6 shows that the cubic function is not a good fit for the additional data, because the winning
heights before 1932 and after 1980 do not lie on the trend curve. In Table 4, the values of Δy are much
larger than those in Table 3. Because the curve rises at both ends of the extended range, most of the
points lie below it and most of the Δy values are therefore negative. The residuals range in magnitude
from 0 cm to 71.8 cm. The standard deviation is 24.43 cm, which is approximately four times that for the
years 1932-1980. The Δy values are much larger than the standard deviation for 1896, 1904, 1908, 1992,
1996, 2000, 2004 and 2008, which suggests that a better fit may exist.
Table 4:
Years after 1890 (x)    Height y1 (cm)    Height predicted by curve y2 (cm)    Δy = y1 - y2 (cm)
6                       190               253.7                                -63.7
14                      180               232.0                                -52.0
18                      191               223.3                                -32.3
22                      193               215.8                                -22.8
30                      193               204.8                                -11.8
38                      194               198.5                                -4.5
42                      197               197.0                                0.0
46                      203               196.5                                6.5
58                      198               200.8                                -2.8
62                      204               204.0                                0.0
66                      212               208.0                                4.0
70                      216               212.6                                3.4
74                      218               218.0                                0.0
78                      224               224.0                                0.0
82                      223               230.6                                -7.6
86                      225               237.7                                -12.7
90                      236               245.3                                -9.3
94                      235               253.3                                -18.3
98                      238               261.8                                -23.8
102                     234               270.5                                -36.5
106                     239               279.5                                -40.5
110                     235               288.8                                -53.8
114                     236               298.3                                -62.3
118                     236               307.8                                -71.8
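The list of years whose residual exceeds the standard deviation can be checked with the sketch below. It assumes the Δy values of Table 4, converts each x value to a calendar year by adding 1890, and again takes the quoted figure to be a sample standard deviation. The variable names are my own.

```python
from statistics import stdev

# (year, Δy in cm): the residual column of Table 4, with year = x + 1890.
residuals = [
    (1896, -63.7), (1904, -52.0), (1908, -32.3), (1912, -22.8),
    (1920, -11.8), (1928, -4.5), (1932, 0.0), (1936, 6.5),
    (1948, -2.8), (1952, 0.0), (1956, 4.0), (1960, 3.4),
    (1964, 0.0), (1968, 0.0), (1972, -7.6), (1976, -12.7),
    (1980, -9.3), (1984, -18.3), (1988, -23.8), (1992, -36.5),
    (1996, -40.5), (2000, -53.8), (2004, -62.3), (2008, -71.8),
]

# Sample standard deviation of all 24 residuals.
spread = stdev([dy for _, dy in residuals])

# Years whose residual is larger in magnitude than the standard deviation.
outliers = [year for year, dy in residuals if abs(dy) > spread]

print(round(spread, 2))
print(outliers)
```

The flagged years all lie at the two ends of the extended range, which is where the original cubic was never fitted.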
From 1896 to 1908 there is a trough in the data points. From 1908 to 1936 there is an upward trend, after
which the heights dip once again until 1952. From 1952 to 1968 there is another upward trend, and another
trough is seen from 1968 to 1976. From 1980 onwards the values are almost constant, with minor
fluctuations (for example in the years 1988 and 1996). In order to fit these trends in the data, the model
needs to be changed. The cubic model was therefore modified so that the curve now passes through four
points spread across the extended data set.
Table 5:
Years after 1890 (x)    Equation
6                       190 = a(6)^3 + b(6)^2 + c(6) + d
58                      198 = a(58)^3 + b(58)^2 + c(58) + d
94                      235 = a(94)^3 + b(94)^2 + c(94) + d
118                     236 = a(118)^3 + b(118)^2 + c(118) + d
To include the lower half of the data set, the year 1896 was chosen. To include a point in the middle of
the data set, the year 1948 was used. The years 1984 and 2008 were chosen to include the upper half of the
data set.
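To solve the equations, a matrix equation was set up in the following fashion:
AX = B
A^-1 AX = A^-1 B
X = A^-1 B
where A is the 4 x 4 matrix of powers of x, X is the column of unknown coefficients and B is the column of
heights.
Substituting numbers:
[6^3    6^2    6    1] [a]   [190]
[58^3   58^2   58   1] [b] = [198]
[94^3   94^2   94   1] [c]   [235]
[118^3  118^2  118  1] [d]   [236]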
Therefore the mathematical relationship between the winning Olympic high jump heights and the years is
defined by the function:
Function 4: f(x) ≈ -0.0002354x^3 + 0.04713x^2 - 1.979x + 200.2
The Graphics Display Calculator (GDC) has automatically rounded the numbers to 4 significant figures.
This function has been plotted on Graph 7 using technology.
Graph 7:
In Graph 7, the pink line (Equation 2) represents Function 4, which is the refined cubic function for the
additional data, and the blue line (Equation 1) represents the original cubic function. Function 4 is a
much better fit than Function 1, because it follows the general trend of the data as it increases and then
gradually decreases. However, this function has its limitations as well. It implies that the winning
heights were greater before 1896 and that they will keep decreasing after 2008, and both of these are
highly unlikely.
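Both the fit and the limitation of Function 4 can be examined numerically. The sketch below assumes the coefficients quoted for Function 4, with the linear term read as -1.979x, and x counted in years after 1890. It first evaluates the function at the four fitted points (heights taken from Table 4) and then finds where the curve turns by solving f'(x) = 0 with the quadratic formula. The names `f4`, `fitted` and `turning_xs` are my own.

```python
import math

# Function 4 coefficients as quoted; the linear term is assumed to be -1.979x.
A, B, C, D = -0.0002354, 0.04713, -1.979, 200.2

def f4(x):
    """Evaluate Function 4 using Horner's scheme."""
    return ((A * x + B) * x + C) * x + D

# The four fitted points: x in years after 1890, heights (cm) from Table 4.
fitted = [(6, 190), (58, 198), (94, 235), (118, 236)]
for x, y in fitted:
    print(x, y, round(f4(x), 1))

# Turning points solve f'(x) = 3A x^2 + 2B x + C = 0.
qa, qb, qc = 3 * A, 2 * B, C
root_disc = math.sqrt(qb * qb - 4 * qa * qc)
turning_xs = sorted([(-qb + root_disc) / (2 * qa), (-qb - root_disc) / (2 * qa)])
turning_years = [1890 + x for x in turning_xs]
print(turning_years)
```

With a negative leading coefficient, the curve falls towards its local minimum (near x = 26) as x increases, rises to its local maximum (near x = 107) and falls again afterwards. That is why the model suggests greater heights before 1896 and declining heights after 2008.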
Conclusion:
This IA attempted to find an analytical curve and a regression curve that model the winning heights
against the years from 1896 to 2008. A cubic function (Function 1) fitted the original data given in
Table 1. However, it only followed the general trend and did not capture the finer features (minima,
maxima and inflections) of the data. To model these, a fifth-order regression polynomial (Function 3) was
used, which was a better fit for the overall trend. For the extended data, the original cubic function
did not fit very well, so a modified cubic polynomial (Function 4) was generated, which followed the
general trend of the data better.