TRENDRAN

To test for the presence of trend in a regular or irregular time series, using a variety of non-parametric test-statistics. The test-statistics are:
  Number of runs above and below the median
  Number of positive differences
  Number of runs up and down
Significance is determined using both normal approximations and randomization.

Calling statement
trendran c1 ;
  nran k1 ;
  statistics c1-c3.

Input
c1  A column of numeric data.
Missing values : Allowed. Observations with missing values are simply ignored.

Subcommands
nran  Specify the number of randomizations, k1.
statistics  Specify three columns in which to store simulated test-statistics.
  1st column : number of runs above and below the median
  2nd column : number of positive differences
  3rd column : number of runs up and down

Outputs
For each of the three test-statistics, we present the
  Observed value of the test-statistic
  Expected value, standard error and p-value for the test-statistic using a (large-sample) normal approximation
  Randomization p-values

Null hypothesis : The observed time series is a random series.
Alternative hypothesis : There is trend (long-term dependence) within the series.

Test-statistic
The first test-statistic is the number of runs above and below the median, M (note: values equal to the median are treated as being above the median).
The second test-statistic is the number of positive differences, P.
The third test-statistic is the number of runs up and down, U.

Randomization : We randomize the order of the data, since under the null hypothesis the ordering is random.

Speed of macro : FAST

References
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.

Standard procedures
As well as randomization output, the macro produces p-values using normal approximations (Manly, 1997).
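The three test-statistics and the randomization step described above can be sketched in Python. This is a minimal illustration, not the macro's own code: the function names are hypothetical, zero differences are dropped when counting runs up and down (an assumption), the observed ordering is counted as one member of the reference set (an assumption), and the two-sided p-value is taken as twice the smaller one-sided p-value, capped at 1 (an assumption).

```python
import random
from statistics import median


def runs_about_median(x):
    """M: runs above and below the median; values equal to the median count as above."""
    med = median(x)
    above = [v >= med for v in x]
    return 1 + sum(a != b for a, b in zip(above, above[1:]))


def positive_differences(x):
    """P: number of positive differences between successive observations."""
    return sum(b > a for a, b in zip(x, x[1:]))


def runs_up_down(x):
    """U: runs up and down. Zero differences are dropped (an assumption)."""
    ups = [b > a for a, b in zip(x, x[1:]) if b != a]
    return 1 + sum(s != t for s, t in zip(ups, ups[1:])) if ups else 0


def randomization_pvalues(x, statistic, nran=999, seed=None):
    """Return (observed, p_low, p_high, p_two_sided) from nran random reorderings."""
    rng = random.Random(seed)
    observed = statistic(x)
    shuffled = list(x)
    # Assumption: the observed ordering counts as one member of the reference set.
    low = high = 1
    for _ in range(nran):
        rng.shuffle(shuffled)
        s = statistic(shuffled)
        low += s <= observed   # randomized value as small as, or smaller than, observed
        high += s >= observed  # randomized value as large as, or larger than, observed
    p_low = low / (nran + 1)
    p_high = high / (nran + 1)
    # Assumption: two-sided p-value is twice the smaller one-sided value, capped at 1.
    return observed, p_low, p_high, min(1.0, 2 * min(p_low, p_high))


if __name__ == "__main__":
    series = [12, 11, 13, 9, 10, 8, 7, 9, 6, 5, 6, 4]  # hypothetical data
    for name, stat in [("M", runs_about_median),
                       ("P", positive_differences),
                       ("U", runs_up_down)]:
        print(name, randomization_pvalues(series, stat, nran=999, seed=1))
```

A small p_low for M or U indicates fewer runs than expected under random ordering; a small p_high indicates more runs than expected.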
For long series, a normal approximation to the distribution of M is reasonable under the null hypothesis, with
  mean = 2r(n-r)/n + 1,
  variance = 2r(n-r){2r(n-r)-n}/{n^2(n-1)},
where n is the length of the series and r is the number of observations below the median.

For long series, a normal approximation to the distribution of P is reasonable under the null hypothesis, with
  mean = m/2,
  variance = m/12,
where m is the number of differences after zeros have been removed.

For long series, a normal approximation to the distribution of U is reasonable under the null hypothesis, with
  mean = (2m+1)/3,
  variance = (16m-13)/90,
where m is the number of differences.

There is no in-built Minitab command for this kind of procedure, but the closest command is % trend c1, which tests for evidence of trend in c1 using parametric models.

WORKED EXAMPLE FOR TRENDRAN

Name of dataset
EXTINCTION

Description
The data are estimated extinction rates for marine genera from the late Permian period until the present, listed in chronological order. There are 48 geological stages… The data are from an irregular time series; times are not presented here, because there is some doubt as to their accuracy.

Source
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.

Original source
RAUP, D.M. (1987), Mass extinctions: a Discussion, Palaeontology, 30, pp. 1-13.

Data
Number of observations = 48
Number of variables = 1

Extinction rate
22 23 61 7 14 26 30 7 14 60 21 10 45 7 11 29 22 18 23 16 7 40 28 46 19 18 15 9 11 26 2 13 6 8 5 11 4 13 3 48 11 9 6 6 7 7 2 13 16

Plot
[Plot of Extinction rate (vertical axis) against Index (horizontal axis)]

Worksheet
C1  Data

Aim of analysis
To investigate whether there is a trend in extinction rates over time.

Randomization procedure
Welcome to Minitab, press F1 for help.
MTB > Retrieve "N:\resampling\Examples\Extinction.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Extinction.MTW
# Worksheet was saved on 07/08/01 10:57:19
Results for: Extinction.MTW
MTB > % N:\resampling\library\trendran c1;
SUBC> nran 999 ;
SUBC> statistics c3-c5.
Executing from file: N:\resampling\library\trendran.MAC

Some tests for detecting trend in a single time series

Test 1 : Runs above and below the median test
Data Display (WRITE)
Observed number of runs                                   16
Expected number of runs                                   25.00
Standard deviation for number of runs                     3.427
Two-sided p-value using normal approximation              0.0131
One-sided randomization p-value, H1: trend                0.9970
One-sided randomization p-value, H1: rapid oscillation    0.0060
Two-sided randomization p-value                           0.0120

Test 2 : Sign test
Data Display (WRITE)
Observed number of positive differences                   23
Total observed number of non-zero distances               47
Expected number of positive differences                   23.50
Standard deviation for number of positive differences     1.979
Two-sided p-value using normal approximation              1.0000
One-sided randomization p-value, H1: decreasing trend     0.6350
One-sided randomization p-value, H1: increasing trend     0.5570
Two-sided randomization p-value                           1.0000

Test 3 : Runs up and down test
Data Display (WRITE)
Observed number of runs                                   28
Expected number of runs                                   31.67
Standard deviation for number of runs                     2.866
Two-sided p-value using normal approximation              0.2691
One-sided randomization p-value, H1: trend                0.8950
One-sided randomization p-value, H1: rapid oscillation    0.1690
Two-sided randomization p-value                           0.3380

Modified worksheet
C3  Column containing 999 M statistics, one for each randomized dataset
C4  Column containing 999 P statistics, one for each randomized dataset
C5  Column containing 999 U statistics, one for each randomized dataset

Discussion

Method                             Randomization p-values (2-sided)    P-values by normal
                                   This example    Manly               approximation
Runs above and below the median    0.012           0.008               0.013
Positive differences               1.000           1.000               1.000
Runs up and down                   0.338           0.360               0.260

Our randomization p-values
agree closely with those of Manly (1997). They differ somewhat from the p-values obtained using the normal approximation, but the differences are small. Overall, only the first of the tests shows any evidence of trend. For runs above and below the median, the evidence for trend is reasonably strong (two-sided randomization p-value 0.012). Manly (1997) suggests that there is clear trend within the data, and that the second and third tests have failed to pick this up because they concentrate too much on small-scale behaviour.
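As a check on the normal-approximation figures in the output above, the large-sample means and standard deviations can be evaluated directly. The sketch below is illustrative only and its function names are hypothetical. It takes n = 48 and m = 47 from the worked example, and assumes r = 24 observations below the median (consistent with the reported expected value of 25.00). The reported two-sided p-values appear to be matched when a continuity correction of 0.5 is applied; that correction is an inference from the output, not something the macro documentation states.

```python
from math import erfc, sqrt


def runs_median_moments(n, r):
    """Mean and s.d. of M for a series of length n with r observations below the median."""
    mean = 2 * r * (n - r) / n + 1
    var = 2 * r * (n - r) * (2 * r * (n - r) - n) / (n ** 2 * (n - 1))
    return mean, sqrt(var)


def positive_diff_moments(m):
    """Mean and s.d. of P, where m is the number of non-zero differences."""
    return m / 2, sqrt(m / 12)


def runs_up_down_moments(m):
    """Mean and s.d. of U, where m is the number of differences."""
    return (2 * m + 1) / 3, sqrt((16 * m - 13) / 90)


def two_sided_p(observed, mean, sd):
    """Two-sided normal p-value with a 0.5 continuity correction (an assumption)."""
    z = max(abs(observed - mean) - 0.5, 0.0) / sd
    return min(1.0, erfc(z / sqrt(2)))  # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z))


if __name__ == "__main__":
    # n = 48 and m = 47 come from the worked example; r = 24 is an assumption.
    cases = [("M", 16, runs_median_moments(48, 24)),
             ("P", 23, positive_diff_moments(47)),
             ("U", 28, runs_up_down_moments(47))]
    for name, observed, (mean, sd) in cases:
        p = two_sided_p(observed, mean, sd)
        print(f"{name}: expected {mean:.2f}, s.d. {sd:.3f}, two-sided p {p:.4f}")
```

The printed values can be compared with the "Expected", "Standard deviation" and "Two-sided p-value using normal approximation" rows of the three tests.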