TRENDRAN
To test for the presence of trend in a regular or irregular time series, using a variety of non-parametric
test-statistics. The test-statistics are:
- Number of runs above and below the median
- Number of positive differences
- Number of runs up and down
Significance is determined using both normal approximations and randomisation.
Calling statement
trendran c1 ;
nran k1 ;
statistics c1-c3.
Input
c1
A column of numeric data.
Missing values : Allowed. Observations with missing values are simply ignored.
Subcommands
nran
Specify the number of randomizations to perform (999 in the worked example).
statistics
Specify three columns in which to store simulated test-statistics.
1st column : number of runs above and below the median
2nd column : number of positive differences
3rd column : number of runs up and down
Outputs
For each of the three test-statistics, we present the:
- Observed value of the test-statistic
- Expected value, standard error and p-value for the test-statistic using a (large-sample) normal
  approximation
- Randomization p-values
Null hypothesis : The observed time series is a random series.
Alternative hypothesis : There is trend (long-term dependence) within the series.
Test-statistic
The first test-statistic which we use is the number of runs above and below the median (note: values equal
to the median are assumed to be above the median), M.
The second test-statistic is the number of positive differences, P.
The third test-statistic is the number of runs up and down, U.
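As an illustration only, the three test-statistics can be sketched in Python as follows. This is not the
macro's own code: the function names are hypothetical, and dropping zero differences before counting runs
up and down is an assumption (the text only states that zeros are removed for P).

```python
from statistics import median


def count_runs(labels):
    """Number of maximal blocks of identical consecutive labels."""
    if not labels:
        return 0
    return 1 + sum(1 for a, b in zip(labels, labels[1:]) if a != b)


def trend_statistics(series):
    """Return (M, P, U) for a sequence of numbers with missing values already dropped."""
    med = median(series)
    # Values equal to the median are treated as above it, as stated in the text.
    above = [x >= med for x in series]
    m_stat = count_runs(above)

    # Signs of successive differences; zero differences are dropped (assumption for U).
    signs = [b > a for a, b in zip(series, series[1:]) if b != a]
    p_stat = sum(signs)          # number of positive differences
    u_stat = count_runs(signs)   # number of runs up and down
    return m_stat, p_stat, u_stat


# A steadily increasing series has few runs and only positive differences.
print(trend_statistics([1, 2, 3, 4, 5, 6]))
```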
Randomization : We randomize the order of the data, since under the null hypothesis this ordering will
be random.
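The randomization step can be sketched in Python as below. This is a minimal sketch rather than the macro's
implementation: the function names are hypothetical, and two conventions are assumptions, namely counting
the observed value as one of the nran + 1 orderings and taking the two-sided p-value as twice the smaller
one-sided value (capped at 1). Both conventions are consistent with the figures printed in the worked
example.

```python
import random


def positive_differences(series):
    """Example statistic: number of positive successive differences."""
    return sum(b > a for a, b in zip(series, series[1:]))


def randomization_p_values(series, statistic, n_random=999, seed=None):
    """Return (observed, lower-tail p, upper-tail p, two-sided p)."""
    rng = random.Random(seed)
    observed = statistic(series)
    shuffled = list(series)
    simulated = []
    for _ in range(n_random):
        rng.shuffle(shuffled)  # random reordering of the data under the null hypothesis
        simulated.append(statistic(shuffled))
    # The observed ordering is counted as one of the n_random + 1 orderings (assumption).
    lower = (1 + sum(s <= observed for s in simulated)) / (n_random + 1)
    upper = (1 + sum(s >= observed for s in simulated)) / (n_random + 1)
    two_sided = min(1.0, 2 * min(lower, upper))
    return observed, lower, upper, two_sided


result = randomization_p_values(list(range(20)), positive_differences, seed=1)
print(result)
```

Any of the three test-statistics can be passed as the statistic argument; the simulated values correspond
to what the statistics subcommand stores.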
Speed of macro : FAST
References
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London.
Standard procedures
As well as randomization output, the macro produces p-values using normal approximations (Manly,
1997).
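A two-sided p-value from a normal approximation can be sketched in Python as follows. The function name is
hypothetical, and the continuity correction of 0.5 is an assumption: the source does not state it, but with
it the sketch reproduces the normal-approximation p-values printed in the worked example (0.0131, 1.0000
and 0.2691).

```python
import math


def normal_two_sided_p(observed, mean, sd, continuity=0.5):
    """Two-sided normal p-value with a continuity correction (assumed, not stated in the source)."""
    distance = max(abs(observed - mean) - continuity, 0.0)
    z = distance / sd
    # For a standard normal Z, P(|Z| >= z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


# Runs above and below the median in the worked example: observed 16, mean 25.00, sd 3.427.
print(round(normal_two_sided_p(16, 25.0, 3.427), 4))
```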
For long series, a normal approximation to the distribution of M is reasonable under the null hypothesis,
with
mean = 2r(n-r)/n + 1,
variance = 2r(n-r){2r(n-r)-n}/{n^2(n-1)},
where n is the length of the series and r is the number of observations below the median.
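As a check on the moments of M, the short Python sketch below evaluates them with the mean written as
2r(n-r)/n + 1 and with r taken as the number of observations below the median. The function name is
hypothetical, and r = 24 is inferred rather than stated in the source: it is the value that reproduces the
expected value 25.00 and standard deviation 3.427 reported in the worked example for n = 48.

```python
import math


def runs_median_moments(n, r):
    """Mean and standard deviation of M for n observations, r of them below the median."""
    product = 2 * r * (n - r)
    mean = product / n + 1
    variance = product * (product - n) / (n ** 2 * (n - 1))
    return mean, math.sqrt(variance)


mean, sd = runs_median_moments(48, 24)  # r = 24 is inferred, not given in the source
print(f"{mean:.2f} {sd:.3f}")  # 25.00 3.427
```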
For long series, a normal approximation to the distribution of P is reasonable under the null hypothesis,
with
mean = m/2,
variance = m/12,
where m is the number of differences after zeros have been removed.
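The moments of P are simple enough to check by hand, but a small Python sketch (with a hypothetical
function name) makes the link to the worked example explicit: with m = 47 non-zero differences it gives the
expected value 23.50 and standard deviation 1.979 shown in the output.

```python
import math


def positive_differences_moments(m):
    """Mean and standard deviation of P for m non-zero differences."""
    return m / 2, math.sqrt(m / 12)


mean, sd = positive_differences_moments(47)
print(f"{mean:.2f} {sd:.3f}")  # 23.50 1.979
```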
For long series, a normal approximation to the distribution of U is reasonable under the null hypothesis,
with
mean = (2m+1)/3,
variance = (16m-13)/90,
where m is the number of differences.
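The same kind of check for U, again as a sketch with a hypothetical function name: with m = 47 differences
the formulas give the expected value 31.67 and standard deviation 2.866 shown in the worked example.

```python
import math


def runs_up_down_moments(m):
    """Mean and standard deviation of U for m differences."""
    return (2 * m + 1) / 3, math.sqrt((16 * m - 13) / 90)


mean, sd = runs_up_down_moments(47)
print(f"{mean:.2f} {sd:.3f}")  # 31.67 2.866
```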
There is no in-built Minitab command for this kind of procedure, but the closest command is
% trend c1,
which tests for evidence of trend in c1 using parametric models.
WORKED EXAMPLE FOR TRENDRAN
Name of dataset
EXTINCTION
Description
The data are estimated extinction rates for marine genera from the late Permian period until the present,
listed in chronological order. There are 48 geological stages…
The data form an irregular time series; the times are not presented here because there is some doubt as to
their accuracy.
Source
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London.
Original source
RAUP, D.M. (1987), Mass extinctions: a Discussion, Palaeontology, 30, pp. 1-13.
Data
Number of observations = 48
Number of variables = 1
Extinction rate
22 23 61
7 14 26
30
7 14
60
21
10
45
7
11
29
22
18
23
16
7
40 28 46
19 18 15
9 11 26
2
13
6
8
5
11
4
13
3
48
11
9
6
6
7
7
2
13
16
Plot
[Figure: extinction rate (vertical axis, 0 to 60) plotted against observation index (horizontal axis).]
Worksheet
C1 : Data
Aim of analysis
To investigate whether there is a trend in extinction rates over time.
Randomization procedure
Welcome to Minitab, press F1 for help.
MTB > Retrieve "N:\resampling\Examples\Extinction.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Extinction.MTW
# Worksheet was saved on 07/08/01 10:57:19
Results for: Extinction.MTW
MTB > % N:\resampling\library\trendran c1;
SUBC> nran 999 ;
SUBC> statistics c3-c5.
Executing from file: N:\resampling\library\trendran.MAC
Some tests for detecting trend in a single time series
Test 1 : Runs above and below the median test
Data Display (WRITE)
Observed number of runs 16
Expected number of runs 25.00
Standard deviation for number of runs 3.427
Two-sided p-value using normal approximation 0.0131
One-sided randomization p-value, H1: trend 0.9970
One-sided randomization p-value, H1: rapid oscillation 0.0060
Two-sided randomization p-value 0.0120
Test 2 : Sign test
Data Display (WRITE)
Observed number of positive differences 23
Total observed number of non-zero distances 47
Expected number of positive differences 23.50
Standard deviation for number of positive differences 1.979
Two-sided p-value using normal approximation 1.0000
One-sided randomization p-value, H1: decreasing trend 0.6350
One-sided randomization p-value, H1: increasing trend 0.5570
Two-sided randomization p-value 1.0000
Test 3 : Runs up and down test
Data Display (WRITE)
Observed number of runs 28
Expected number of runs 31.67
Standard deviation for number of runs 2.866
Two-sided p-value using normal approximation 0.2691
One-sided randomization p-value, H1: trend 0.8950
One-sided randomization p-value, H1: rapid oscillation 0.1690
Two-sided randomization p-value 0.3380
Modified worksheet
C3 : Column containing 999 M statistics, one for each randomized dataset
C4 : Column containing 999 P statistics, one for each randomized dataset
C5 : Column containing 999 U statistics, one for each randomized dataset
Discussion
Method                            Randomization p-values (2-sided)    P-values by normal
                                  This example      Manly             approximation
Runs above and below the median   0.012             0.008             0.013
Positive differences              1.000             1.000             1.000
Runs up and down                  0.338             0.360             0.269
Our randomization p-values agree closely with those of Manly (1997). They differ somewhat from the
p-values obtained using the normal approximation, but the differences are small; the largest is for the
runs up and down test (0.3380 by randomization against 0.2691 by normal approximation in the output
above). Overall, only the first of the tests shows any evidence of trend: for runs above and below the
median, the two-sided randomization p-value of 0.012 is reasonably strong evidence. Manly (1997) suggests
that there is clear trend within the data, and that the 2nd and 3rd tests have failed to pick this up
because they concentrate too much on small-scale behaviour.