Determining mean, variance and other statistical properties of a distribution

ModelRisk offers a number of functions that return statistical properties of a probability distribution. These can be very useful in risk analysis modelling for a wide range of reasons. To use these functions, the distribution must be defined in ModelRisk as an Object. For example:

=VoseMean(VoseGammaObject(2,3))

returns a value of 6, the mean of the Gamma(2,3) distribution.

ModelRisk has many such functions, covering variance, skewness, kurtosis, probability density (mass), cumulative probability, quantiles, log likelihoods and information criteria for fitted distributions, raw moments, etc. Clicking ModelRisk’s ‘View Function’ opens a window that illustrates what each function is doing.

ModelRisk uses Objects to avoid any ambiguity about the purpose of the function defining the distribution. To ensure that all object-related functions can be properly interpreted, the software does not support more than one distribution Object being defined in a cell.

Crystal Ball software from Oracle Corporation does not have any functions that return statistical properties of its input variables (assumptions).

Palisade Corporation has attempted to provide the same features in its @RISK software. Unfortunately, @RISK does not implement specific Object functions, which produces ambiguous and potentially disastrous results. For example, the formula above can be written in @RISK as:

=RiskTheoMean(RiskGamma(2,3))

which correctly returns the mean of the Gamma distribution. However, there are many other combinations in which one could use a function like RiskTheoMean that may or may not give the correct answer, depending on the circumstances.
This is why the @RISK user's manual states the rules on how the function works:

“RiskTheoMean(cellref or distribution function) returns the mean value of the first distribution function in the formula in cellref, or the entered distribution function.”

For example, writing:

Cell A1: =RiskNormal(10,1)
Cell A2: =RiskTheoMean(A1)

correctly returns the mean value of 10. But:

Cell A1: =RiskNormal(10,1) + RiskNormal(3,1)
Cell A2: =RiskTheoMean(A1)

returns the value 3, which is the mean of the second distribution. In fact RiskTheoMean, and all the other @RISK statistical functions, use the last distribution in a formula rather than the first one that the manual's own rule specifies.

The essential fault is that @RISK uses the same function both to define a distribution and to generate random values from it. This leads to highly unpredictable results. For example, if @RISK evaluates the last distribution, it is difficult to be sure what it would do with:

Cell A1: =RiskNormal(10,RiskGamma(2,3))
Cell A2: =RiskTheoVariance(A1)

In fact, it takes a random sample from the Gamma distribution, squares it, and returns this as the variance of the Normal distribution.

This ambiguity is impossible with ModelRisk. For example, if we have:

Cell A1: =VoseNormalObject(10,VoseGamma(2,3))
Cell A2: =VoseVariance(A1)

the VoseVariance function will only operate on an Object (here the Normal distribution), and only one distribution Object can appear in a cell at a time without generating an error. The function can therefore only return the variance of the Normal distribution, whose standard deviation is a sample from the Gamma distribution because VoseGamma is a random generation function.

Accuracy and user-friendliness of ModelRisk functions

In most situations there exist formulae that relate the distribution’s parameters to its moments, probability density function, etc., which the software should then use.
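The closed-form relationships between parameters and moments, and the separation between describing a distribution and sampling from it, can be sketched outside the spreadsheet. The Python below is an illustration only and is not ModelRisk's implementation: the class name GammaObject and its methods are hypothetical, and it assumes that Gamma(2,3) means shape 2 and scale 3, which is consistent with the mean of 6 quoted earlier.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class GammaObject:
    """Describes a Gamma(shape, scale) distribution (hypothetical name).

    The object only describes the distribution; random values are produced
    solely through the explicit sample() call, so a moment query can never
    be confused with a request for a random draw.
    """
    shape: float
    scale: float

    def mean(self) -> float:
        # Closed form: shape * scale
        return self.shape * self.scale

    def variance(self) -> float:
        # Closed form: shape * scale^2
        return self.shape * self.scale ** 2

    def skewness(self) -> float:
        # Closed form: 2 / sqrt(shape)
        return 2.0 / math.sqrt(self.shape)

    def sample(self, rng: random.Random) -> float:
        # random.gammavariate takes (shape, scale), so its mean is shape * scale
        return rng.gammavariate(self.shape, self.scale)


gamma = GammaObject(2.0, 3.0)
print(gamma.mean())      # 6.0, the Gamma(2,3) mean quoted in the text
print(gamma.variance())  # 18.0
```

Keeping sample() separate from the moment methods mirrors the Object idea described above: asking for the mean never triggers a random draw, and a random draw is never mistaken for a distribution definition.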
Occasionally the moments (mean, variance, etc.) of a distribution can be undefined, meaning that there is no formula to evaluate them. If this occurs, the appropriate ModelRisk function returns the message “Undefined”. The equivalent @RISK functions return “#VALUE!”, which leaves one to wonder what the problem actually is.

However, if the distribution is truncated in any way, the formulae relating its parameters to its moments no longer apply, and one must generally resort to numerical integration methods to determine the moments. A good test of whether the numerical methods work is to truncate an Exponential distribution, because it has the very particular property that cutting off its left-hand side at some value x leaves exactly the same-shaped distribution (i.e. the same variance, skewness and kurtosis) except that the mean has increased by x. The following graphs illustrate what happens when one uses @RISK’s and ModelRisk’s functions on an Exponential distribution with mean 100, truncated at various values of x. In each case, the correct answer is a horizontal line:

[Graph panels: ModelRisk, @RISK]

Figure 1: Accuracy of ModelRisk vs @RISK for calculation of statistical moments

A more challenging example can be created with the Pareto distribution, which has the longest tail of any distribution family. The following example uses a Pareto(5,1) distribution:

[Graph panels: ModelRisk, @RISK]

Figure 2: A Pareto(5,1) distribution. There is only a 10^-15 probability that a value drawn from this distribution will exceed 1000, so right-truncation at 1000 has no practical effect on the distribution’s moments.

The distribution is left-truncated at x and right-truncated at (x+1000). The following graphs show the change in the calculated moments as a function of x. The graphs should all show a change in the moments (vertical axis) of zero when x=0.
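The Exponential test described above can be reproduced independently of either product. The sketch below is a rough check under stated assumptions, not the method used by ModelRisk or @RISK: it integrates with composite Simpson's rule, cuts the upper tail off 60 means beyond x, and uses hypothetical function names. The last function checks the tail probability quoted for Figure 2, assuming Pareto(5,1) means shape 5 and minimum 1.

```python
import math


def simpson(f, a, b, n=20000):
    """Composite Simpson's rule on [a, b] with n subintervals (made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def truncated_exponential_moments(mean, x, tail_widths=60.0):
    """Mean and variance of an Exponential(mean) left-truncated at x.

    Both are found by numerical integration of the untruncated density over
    [x, x + tail_widths * mean], renormalised by the retained probability.
    """
    def pdf(t):
        return math.exp(-t / mean) / mean

    upper = x + tail_widths * mean  # ignored tail is ~exp(-60) of retained mass
    mass = simpson(pdf, x, upper)
    m1 = simpson(lambda t: t * pdf(t), x, upper) / mass
    var = simpson(lambda t: (t - m1) ** 2 * pdf(t), x, upper) / mass
    return m1, var


def pareto_exceedance(shape, minimum, x):
    """P(X > x) for a Pareto distribution with the given shape and minimum."""
    return (minimum / x) ** shape if x > minimum else 1.0


for x in (0.0, 50.0, 200.0, 1000.0):
    m1, var = truncated_exponential_moments(100.0, x)
    # m1 - x should stay near 100 and var near 10000 for every x
    print(x, m1 - x, var)

print(pareto_exceedance(5.0, 1.0, 1000.0))  # approximately 1e-15
```

If the integration is working, the mean minus x and the variance stay flat as x changes, which is the horizontal line the text says the graphs should show. For very large x the untruncated density underflows in floating point, so a more careful implementation would integrate the shifted density instead.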