
Determining mean, variance and other statistical properties of a distribution
ModelRisk offers a number of functions that will return statistical properties associated with a
probability distribution. These can be immensely useful in risk analysis modelling for a wide range of
reasons.
The distribution must be defined in ModelRisk as an Object to use these functions. Thus, for
example:
=VoseMean(VoseGammaObject(2,3))
returns a value of 6, the mean of the Gamma(2,3) distribution.
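The value of 6 can be checked by hand from the standard closed-form moments of the Gamma distribution. The following Python sketch is an illustration only and is not part of ModelRisk; the function name gamma_moments is hypothetical, and it assumes that Gamma(2,3) means shape 2 and scale 3.

```python
import math

def gamma_moments(alpha: float, beta: float) -> dict:
    """Closed-form moments of a Gamma distribution (assumed: shape alpha, scale beta)."""
    return {
        "mean": alpha * beta,
        "variance": alpha * beta ** 2,
        "skewness": 2.0 / math.sqrt(alpha),
        "kurtosis": 3.0 + 6.0 / alpha,  # non-excess kurtosis
    }

moments = gamma_moments(2, 3)
print(moments["mean"])  # 6
```

Because these are exact formulae in the parameters, no simulation is needed to obtain them.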
ModelRisk has many such functions for: variance; skewness; kurtosis; probability density (mass);
cumulative probability; quantiles; log likelihoods and information criteria for fitted distributions; raw
moments, etc. Clicking ModelRisk’s ‘View Function’ will pull up an appropriate window to illustrate
what each function is doing.
ModelRisk uses Objects to avoid any ambiguity about the purpose of the function defining the
distribution. The software does not support more than one distribution object being defined in a cell
to ensure that all object-related functions can be properly interpreted.
Crystal Ball software from Oracle Corporation does not have any functions that return statistical
properties of its input variables (assumptions). Palisade Corporation has attempted to provide the
same features in their @RISK software. Unfortunately, they do not implement specific Object
functions, which produces ambiguous and potentially disastrous results. For example, the formula above
can be written in @RISK as:
=RiskTheoMean(RiskGamma(2,3))
which correctly returns the mean of the Gamma distribution. However, there are many other
combinations in which a function like RiskTheoMean may or may not give the
correct answer depending on the circumstances. This is why the @RISK user's manual states the rule
for how the function works:
“RiskTheoMean(cellref or distribution function) returns the mean value of the first distribution function in the
formula in cellref, or the entered distribution function.”
For example, writing:
Cell A1: =RiskNormal(10,1)
Cell A2: =RiskTheoMean(A1)
correctly returns the mean value of 10.
But:
Cell A1: =RiskNormal(10,1) + RiskNormal(3,1)
Cell A2: =RiskTheoMean(A1)
returns the value 3, which is the mean of the second distribution.
In fact, RiskTheoMean and all the other @RISK statistical functions will use the last distribution
in a formula, rather than the first one as the manual's own rule specifies.
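The disagreement between the documented rule and the reported behaviour can be made concrete with a small Python sketch. It is an illustration only and does not reproduce how @RISK parses formulas; the regular expression and the function name normal_means are hypothetical and handle only simple RiskNormal(mu, sigma) calls with numeric arguments.

```python
import re

# Hypothetical pattern: matches only calls such as RiskNormal(10,1) with numeric arguments.
NORMAL_CALL = re.compile(r"RiskNormal\(\s*([-\d.]+)\s*,\s*([-\d.]+)\s*\)")

def normal_means(formula: str) -> list:
    """Return the mean parameter of every RiskNormal(mu, sigma) call, in order of appearance."""
    return [float(mu) for mu, _sigma in NORMAL_CALL.findall(formula)]

means = normal_means("=RiskNormal(10,1) + RiskNormal(3,1)")
first_rule = means[0]    # what the documented "first distribution" rule describes
last_rule = means[-1]    # what the article reports the function returning
true_mean = sum(means)   # the mean of a sum is the sum of the means
print(first_rule, last_rule, true_mean)
```

Neither rule yields the mean of the cell's actual value, which is the sum of the two means.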
The essential fault is that @RISK uses the same function both to define a
distribution and to generate random values from it. This leads to highly unpredictable results. For
example, if @RISK evaluates the last distribution, it is difficult to be sure what it would do with:
Cell A1: =RiskNormal(10,RiskGamma(2,3))
Cell A2: =RiskTheoVariance(A1)
In fact, it takes a random sample from the Gamma distribution, squares it, and returns this as the
variance of the Normal distribution. This ambiguity is impossible with ModelRisk. For example, if we
have:
Cell A1: =VoseNormalObject(10,VoseGamma(2,3))
Cell A2: =VoseVariance(A1)
The VoseVariance function will only operate on an Object (the Normal distribution), and only one
distribution Object can appear in a cell at a time without generating an error, so the function can
only return the variance of the Normal distribution. That variance is the square of a sample from the
Gamma distribution, because VoseGamma is a random generation function and its value is used as the
standard deviation.
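The separation between a distribution definition and a random generation function can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not ModelRisk's implementation: the names NormalObject and gamma_sample are hypothetical, and the second Normal parameter is assumed to be the standard deviation.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class NormalObject:
    """A distribution definition: it holds parameters and never draws samples itself."""
    mu: float
    sigma: float  # assumed to be the standard deviation

    def variance(self) -> float:
        return self.sigma ** 2

def gamma_sample(alpha: float, beta: float, rng: random.Random) -> float:
    """A random generation function: returns one number, not a distribution."""
    return rng.gammavariate(alpha, beta)

rng = random.Random(1)
sigma = gamma_sample(2, 3, rng)          # evaluated first, yielding a plain number
normal = NormalObject(10, sigma)         # the only distribution object in play
print(normal.variance() == sigma ** 2)   # True
```

Because the variance method is defined only on the object type, there is exactly one distribution it can refer to, whatever expressions were used to compute that object's parameters.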
Accuracy and user-friendliness of ModelRisk functions
In most situations there exist formulae that relate the distribution’s parameters to its moments,
probability density function, etc. which the software should then use. Occasionally the moments
(mean, variance, etc.) of a distribution can be undefined, meaning that there is no formula to
evaluate them. If this occurs, the appropriate ModelRisk function returns the message “Undefined”.
The equivalent @RISK functions return “#VALUE!”, which leaves one to wonder what the problem
actually is.
However, if the distribution is truncated in any way, these formulae no longer apply and one must
generally resort to numerical integration methods to determine the moments. A good test of
whether the numerical methods work is to truncate an Exponential distribution because it has the
very particular property that cutting off its left-hand side at some value x leaves exactly the
same-shaped distribution (i.e. the same variance, skewness and kurtosis) except that the mean has
increased by x. The following graphs illustrate what happens when one uses @RISK’s and
ModelRisk’s functions on an Exponential distribution with mean 100 and truncated at various values
of x. In each case, the correct answer is a horizontal line:
[Graphs: one panel for ModelRisk, one panel for @RISK]
Figure 1: Accuracy of ModelRisk vs. @RISK for calculation of statistical moments
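The Exponential truncation test described above can be reproduced independently with a short Python sketch. It uses Simpson's rule over a finite range as one possible numerical integration method; it is not the algorithm used by either product, and the function name, the cut-off of 60 means and the number of intervals are assumptions chosen for illustration.

```python
import math

def truncated_exponential_moments(mean: float, x: float, n: int = 20000) -> tuple:
    """Mean and variance of an Exponential(mean) left-truncated at x, by Simpson's rule."""
    upper = x + 60.0 * mean          # assumed cut-off: the mass beyond it is about e^-60
    h = (upper - x) / n              # n must be even for Simpson's rule

    def density(t: float) -> float:
        # Truncated density: original density divided by the surviving mass exp(-x/mean)
        return math.exp(-(t - x) / mean) / mean

    def integrate(g) -> float:
        total = g(x) + g(upper)
        for i in range(1, n):
            total += (4 if i % 2 else 2) * g(x + i * h)
        return total * h / 3.0

    m1 = integrate(lambda t: t * density(t))
    m2 = integrate(lambda t: (t - m1) ** 2 * density(t))
    return m1, m2

for x in (0.0, 50.0, 200.0):
    m, v = truncated_exponential_moments(100.0, x)
    print(x, round(m - x, 4), round(v, 2))
```

For every truncation point, the mean minus x should stay close to 100 and the variance close to 10000, which is the horizontal line the graphs are expected to show.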
A more challenging example can be created with the Pareto distribution, which has the longest tail
of any distribution family. The following example uses a Pareto(5,1) distribution:
[Graphs: one panel for ModelRisk, one panel for @RISK]
Figure 2: A Pareto(5,1) distribution. There is only a 10^-15 probability that a value drawn from this
distribution will exceed 1000, so right-truncation at 1000 has no practical effect on the distribution’s moments.
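The tail probability quoted in the caption can be verified from the Pareto survival function. The Python sketch below is an illustration only; it assumes that Pareto(5,1) means shape 5 and minimum 1, which is the reading consistent with the quoted probability, and the function names are hypothetical.

```python
def pareto_survival(x: float, theta: float, a: float) -> float:
    """P(X > x) for a Pareto with shape theta and minimum a (assumed parameterisation)."""
    return 1.0 if x <= a else (a / x) ** theta

def pareto_mean_right_truncated(theta: float, a: float, upper: float) -> float:
    """Mean of the Pareto restricted to [a, upper]; closed form, requires theta != 1."""
    partial = theta * a ** theta * (a ** (1 - theta) - upper ** (1 - theta)) / (theta - 1)
    return partial / (1.0 - pareto_survival(upper, theta, a))

tail = pareto_survival(1000, 5, 1)
full_mean = 5 * 1 / (5 - 1)                       # theta * a / (theta - 1) = 1.25
shift = full_mean - pareto_mean_right_truncated(5, 1, 1000)
print(tail, shift)
```

The tail probability is (1/1000)^5, and the change in the mean caused by right-truncation at 1000 is of the order of 10^-12, far below anything visible on a graph.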
The distribution is left-truncated at x and right-truncated at (x+1000). The following graphs show the
change in the calculated moments as a function of x. The graphs should all show a change in the
moments (vertical axis) of zero when x=0.