Trends, time series and forecasting
Paul Fryers
East Midlands KIT
Overview
 Introduction
 Understanding trends and time series
– Seasonality
– Transformations
 Methods for analysing time series
– Regression
– Moving averages
– Autocorrelation
 Overview of forecasting
 Forecasting methods

– Extrapolation of regression
– Holt’s method
 Uses for forecasting

– Setting and monitoring targets
– Estimating current values
 General methodological points
What is a time series?
 A set of well-defined measures collected through time:
– Mortality
– Diagnoses
– Temperature
– Rainfall
– Share price
– Sunspots
– Ice cream sales
– Air passengers
– Road accidents
What is special about time series data?
 There is an implicit order to the data with a first, second, third,..., nth
value
 Previous observations may be important determinants of later
observations
– this has implications for analysis
 Trend and/or seasonal effects may be present
– a trend is a tendency for observations to fall or rise over time
– seasonal effects are regular repeating patterns of rises or falls
 Different techniques are needed for analysis of historical data and for
producing forecasts
Continuous time: electrocardiogram trace
Monthly emphysema deaths
Understanding trends and time series
 First plot the data
 Is the time series consistent?
– Look for step changes in level or trend
 Is there any visual evidence of any pattern or trend?
 Is there evidence of a regular ‘seasonal’ pattern?
 If there is a trend, is it linear? (probably not!)
Is the time series consistent? – change in trend
Chart: graph of an indicator over time demonstrating a clear change in trend.
Is the time series consistent? – step changes
Chart: number of recorded 'violence against the person' crimes in England & Wales, 1980 to 2006/07. Step changes are visible at the change to the Home Office counting rules for recorded crime and at the introduction of the National Crime Recording Standard (NCRS).
Handling inconsistency
 Usually, we will simply break the time series at the point where the trend
changes, or the step change occurs
 Analyse only the data since that point, or analyse the different parts of
the time series separately
 Or use a method/software that will do that automatically, eg by weighting
more recent points more heavily
 We may be able to adjust or transform the data prior to a step change but
only if we understand the reason for the change and are confident that
the adjustment makes the data consistent
– eg adjusting for a coding change (ICD coding, definition of
unemployment, etc.)
 But it’s not always clear cut...
Is the time series consistent? – step changes?
Chart: mortality rate per 100,000 population-years at risk, 1990 to 2010.
Is the time series consistent? – outlier
Chart: mortality rate per 100,000 population-years at risk, 1990 to 2010, showing an outlying value.
Handling outliers
 Normally, we ignore outliers, ie exclude them from the analysis
– this can be a nuisance for some analyses
 But again, it’s not always clear cut:
– we need to identify plausible reasons for the outlier/s (eg known
issues with data collection, or a specific factor that has influenced
the outcome)
Is there any visual evidence of any pattern or trend?
Chart: graph of an indicator over time demonstrating a steady increasing trend.
Is there any visual evidence of any pattern or trend?
Chart: graph of an indicator over time demonstrating an increasing trend with greater variability.
Is there any visual evidence of any pattern or trend?
Chart: graph of an indicator over time demonstrating a steady decreasing trend.
Is there any visual evidence of any pattern or trend?
Chart: graph of an indicator over time demonstrating no change over time.
Is there any visual evidence of any pattern or trend?
Chart: graph of an indicator over time demonstrating a seasonal pattern.
Is there any visual evidence of any pattern or trend?
Chart: graph of an indicator (mortality rate per 100,000 population-years at risk) showing a seasonal pattern plus a rising trend, June 1990 to July 2009.
Handling seasonality
 Seasonality can be additive or multiplicative
– ie each different period in the cycle has an extra factor added to (or
subtracted from) or multiplied by the overall average level
 We can adjust the data by applying the inverse factor to each period (a minimal sketch follows)
 Often easier to use an integrated method that adjusts for the seasonality within the analysis
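For illustration only (not from the original slides), a minimal Python sketch of a multiplicative adjustment is given below: each period's factor is estimated as the ratio of that period's average to the overall average, and each observation is divided by its period's factor. The monthly_counts data and the 12-month cycle are assumed purely for the example.

# Minimal sketch of a multiplicative seasonal adjustment (illustrative only).
# Assumes a whole number of 12-month cycles in `monthly_counts`.

def seasonal_factors(values, period=12):
    overall_mean = sum(values) / len(values)
    factors = []
    for offset in range(period):
        season = values[offset::period]              # every value for this point in the cycle
        factors.append((sum(season) / len(season)) / overall_mean)
    return factors

def deseasonalise(values, period=12):
    factors = seasonal_factors(values, period)
    # Divide each observation by its period's factor (the 'inverse factor' adjustment)
    return [v / factors[i % period] for i, v in enumerate(values)]

# Example: two years of invented monthly counts with a winter peak
monthly_counts = [120, 110, 100, 90, 80, 70, 70, 75, 85, 95, 105, 115,
                  130, 118, 108, 96, 86, 76, 74, 80, 90, 100, 112, 124]
adjusted = deseasonalise(monthly_counts)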
Is the trend linear?
Chart: graph of an indicator over time demonstrating a steady decreasing trend.
Example of a falling rate – straight line
Chart: graph of an indicator over time demonstrating a steady decreasing trend, with a straight line fitted.
Example of a falling rate – exponential curve
Chart: graph of an indicator over time demonstrating a steady decreasing trend, with an exponential curve fitted.
Transformations – non-linear trends
 In many cases, it is meaningless for the forecasts to fall below zero
 In public health we are most commonly dealing with counts, rates or
proportions
 We routinely transform the data in order to ‘make the data linear’ and
constrain them to be no less than zero
 By default, we should use a log-transformation for counts or rates, fitting
an exponential curve which assumes a constant rate of change, rather
than a constant numerical increase or decrease
 We should use a logit-transformation for proportions (or percentages),
which constrains the variable to be between 0 and 1 (or 0% and 100%)
Transformations – falling exponential curve
 A rapidly falling trend
 The indicator looks to be heading rapidly towards zero, but the log transformation ensures that it stays positive: the rate or count is 'tending towards' zero but can never quite get there
 It represents a constant rate of change (i.e. reducing by x% each year rather than reducing by a set amount each year)
 This should be the default option for analysis of counts or rates
Chart: graph of an indicator over time demonstrating a constant rate of decrease (exponential curve).
Transformations – rising exponential curve
 A rapidly increasing trend
 For a count or rate, mathematically it is preferable to use an exponential curve, but we need to beware of other practical constraints: there will usually be some practical limit to a count or rate
 If the continued rise in the count or rate is implausible then it is better to use a linear model or logit...
Chart: graph of an indicator over time demonstrating a constant rate of increase (exponential curve).
Transformations – log-transform counts and rates
 Fitting an exponential curve (a minimal sketch follows):
Equation of curve: y = a × b^t, or equivalently ln(y) = ln(a) + ln(b)·t
where
y = value of the variable being studied
a = intercept on the y-axis (nominal value of the indicator at time 0)
t = time value
b = 'gradient' (amount y is multiplied by for each increase of 1 in time)
 ln(0) = –∞, ln(∞) = ∞
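As an illustration of the log-transformation (the time values and rates below are invented for the example), an exponential curve can be fitted by ordinary least squares regression of ln(y) on t and then back-transforming:

import numpy as np

# Fit y = a * b**t by least squares on the log scale: ln(y) = ln(a) + ln(b)*t
t = np.arange(10)                                   # time values 0..9 (illustrative)
y = np.array([250, 230, 205, 190, 170, 155, 140, 130, 118, 108], dtype=float)

ln_b, ln_a = np.polyfit(t, np.log(y), 1)            # slope and intercept on the log scale
a, b = np.exp(ln_a), np.exp(ln_b)

fitted = a * b ** t                                 # back-transform to the original scale
print(f"constant rate of change: {(b - 1) * 100:.1f}% per unit time")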
Transformations – logistic curve
 Proportions cannot go below zero or above 1
 The tails are equivalent: e.g. proportion surviving = 1 – proportion dying
 Particularly important for proportions that span a large range, from under 0.5 to nearly 1, e.g. percentage achievement on QOF scores
 For proportions or percentages close to zero, the logit is equivalent to the log
 For proportions always close to 1, we could subtract from 1 and use the log
Chart: graph of a proportion over time demonstrating a logistic curve.
Transformations – logit-transform proportions
 The logit function:
logit(y) = ln(y/(1–y)) = ln(y) – ln(1–y)
 logit(0) = –∞
logit(½) = 0
logit(1) = ∞
 We transform proportions by applying the logit function, then fit a regression line to the transformed data (a minimal sketch follows)
 For rates or counts which have a practical limit, if we have a sound basis
for estimating that realistic maximum then we could do so and treat the
rate or count as a proportion of that upper limit
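A minimal Python sketch of the logit approach, using invented proportions purely for illustration: transform with the logit, fit a straight line, then back-transform with the inverse logit to obtain the logistic curve.

import numpy as np

def logit(p):
    return np.log(p / (1 - p))

def inv_logit(x):                                   # back-transform to a proportion in (0, 1)
    return 1 / (1 + np.exp(-x))

t = np.arange(8)                                    # time values (illustrative)
p = np.array([0.55, 0.62, 0.70, 0.76, 0.82, 0.86, 0.90, 0.92])  # e.g. QOF-style achievement

slope, intercept = np.polyfit(t, logit(p), 1)       # straight line on the logit scale
fitted = inv_logit(intercept + slope * t)           # logistic curve on the original scale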
Methods for analysing time series
 Regression
– Most common method: simply fit a line or curve to the data,
treating ‘time’ as any other explanatory variable
– Gives equal weight to all points in the time series
– Assumes points are independent, identically distributed observations
– The gradient has confidence intervals: if the CIs don't include zero, the
gradient is significant (a minimal sketch follows this list)
 Two other concepts that are used as the basis for analysing time series:
– Moving average
– Autocorrelation
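A minimal sketch of regression on time with a confidence interval for the gradient; the years and rates below are invented, and scipy's linregress is just one of several ways to obtain the slope and its standard error.

import numpy as np
from scipy import stats

# Treat time as an ordinary explanatory variable and fit a straight line
years = np.arange(2005, 2015)                       # illustrative annual time series
rate = np.array([98, 95, 91, 90, 86, 84, 80, 79, 75, 73], dtype=float)

res = stats.linregress(years, rate)
df = len(years) - 2
t_crit = stats.t.ppf(0.975, df)
ci_low, ci_high = res.slope - t_crit * res.stderr, res.slope + t_crit * res.stderr

# If the interval excludes zero, the gradient is statistically significant
print(f"gradient: {res.slope:.2f} per year (95% CI {ci_low:.2f} to {ci_high:.2f})")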
Linear regression
Confidence intervals for the gradient
Moving average
 Familiar as a method of presenting data
– For annual data, rather than presenting data for 2004, 2005, 2006,
2007 and 2008, we may present three-year figures: 2004-06,
2005-07 and 2006-08
– Smoothes out fluctuations in the data, making trends easier to see
 Also called ‘rolling averages’
 Moving averages of different periods can be used to highlight different
features of a time series (chart example follows, with a minimal sketch after it)
 BUT!!!
Moving averages must not be used as the basis for regression, time series
analysis or forecasting as they are not independent observations (they
share their data with their neighbours)
 [Note: time series methods such as Holt’s Method and Box-Jenkins
(ARIMA) models use moving averages within the analysis, but the data
from which the model is derived should not be moving averages]
Monthly emphysema deaths
3-point moving average, highlighting seasonality
13-point moving average, highlighting trend
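For presentation purposes only, a rolling average is straightforward to compute; a minimal Python sketch with invented annual counts is given below. As stressed above, the resulting values share data with their neighbours and must not be fed into regression or forecasting.

def moving_average(values, window=3):
    # Average of each run of `window` consecutive values (for presentation only,
    # never as input to regression or forecasting - the points are not independent)
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

annual_counts = [210, 190, 230, 205, 250, 220, 265, 240]    # illustrative data
print(moving_average(annual_counts, window=3))               # 3-year rolling averages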
Autocorrelation
 In time series, observations can often be predicted by combinations of
previous observations
 If the observations are correlated with their immediate predecessors, we
can calculate the Pearson correlation coefficient between them
 This is called autocorrelation of lag 1
 Observations can also be correlated with predecessors from further back
in the time series – autocorrelation of lag k, where k is the number of
observations back in the series (a minimal sketch follows)
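A minimal sketch of lag-k autocorrelation as the Pearson correlation between the series and a lagged copy of itself; the series below is invented for illustration.

import numpy as np

def autocorrelation(values, lag=1):
    # Pearson correlation between the series and itself shifted by `lag` observations
    x = np.asarray(values, dtype=float)
    return np.corrcoef(x[:-lag], x[lag:])[0, 1]

series = [120, 118, 125, 130, 128, 135, 140, 138, 145, 150]   # illustrative data
print(autocorrelation(series, lag=1))   # correlation with immediate predecessors
print(autocorrelation(series, lag=2))   # autocorrelation of lag 2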
Forecasting
 Why do we need to forecast?
 Extrapolating
 Forecasting methods
– Examples
– Holt’s Method
– Interval forecasts
– How far back and how far forward?
 Using forecasts
– to set and monitor progress against targets
– to estimate current health outcomes/indicators
Why do we need to forecast?
 To inform planning by estimating future needs
– the health of the population tends to change slowly and react slowly
to public health interventions so we need to look ahead
 To anticipate future major events
– e.g. outbreaks
 To set and monitor progress against targets
– where are we likely to be on current progress?
– are we on track to meet targets?
 To estimate current health outcomes
– our most recent data tend to be a year or more out of date so if we
want to know where we are now or even where we were last year we
have to forecast
Forecasting from past trends
 If we have time series for a health outcome, health service output
indicator or risk factor, we can use this to forecast future values
 eg:
– mortality rates
– teenage pregnancy rates
– hospital activity rates
– prevalence estimates
 Assumes:
– consistent definitions and measurement, past and future
– either that nothing significant changes, or that changes/
improvements continue at the same rate
Extrapolating from regression lines
 A common method is to fit a regression line (or curve) to the historic data
and extrapolate it to the future
 This is OK for a short time into the future as long as the historic data are
stable, ie changing at a steady rate
 But:
The regression line is fitted across the whole of the historic data, and
gives equal weight to all points: e.g. the value for last year is given the
same weight as one from 20 years ago – it doesn’t give the best estimate
of ‘current trends’
 We cannot give realistic confidence intervals for future values (‘prediction
intervals’ or ‘forecast intervals’)
Forecasting methods
 There is a range of methods which are intended for forecasting, eg
moving average methods, autocorrelation methods, Box-Jenkins methods
 These methods take into account fluctuations from year to year, trends
(ie gradual changes over time) and seasonal variations
 They tend to give greater weight to more recent values, hence ‘start from
where we are’
 They give confidence intervals for forecasts, which tend to get wider as
we move further into the future
 The most useful methods for public health applications tend to be Holt’s
Method (which includes a trend component) and Holt-Winters (which
adds a seasonal component)
 Note, as with regression analysis, the points in the time series must be
independent of each other: rolling averages must never be used for
forecasting
Teenage conceptions – England
Chart: under-18 conception rate per 1,000 females aged 15-17, England, 1992 to 2010, showing the observed rate, target, forecast rate and forecast confidence interval.
http://www.empho.org.uk/pages/viewResource.aspx?id=11285
Teenage conceptions – London GOR
Chart: under-18 conception rate per 1,000 females aged 15-17, London GOR, 1992 to 2010, showing the observed rate, target, forecast rate and forecast confidence interval.
http://www.empho.org.uk/pages/viewResource.aspx?id=11285
Teenage conceptions – Newham
Chart: under-18 conception rate per 1,000 females aged 15-17, Newham LB, 1992 to 2010, showing the observed rate, target, forecast rate and forecast confidence interval.
http://www.empho.org.uk/pages/viewResource.aspx?id=11285
Teenage conceptions – Harrow
Chart: under-18 conception rate per 1,000 females aged 15-17, Harrow LB, 1992 to 2010, showing the observed rate, target, forecast rate and forecast confidence interval.
http://www.empho.org.uk/pages/viewResource.aspx?id=11285
Alcohol-related admission rates – Bassetlaw PCT
Chart: Bassetlaw alcohol-related admission rate (directly standardised rate per 100,000 population), 2003 to 2014, showing observed values, forecast and 95% confidence interval.
Data provided to Nottinghamshire and Bassetlaw PCTs for WCC trajectories
Fractured neck of femur admission rates –
Nottinghamshire PCT
Chart: admission rates for fractured neck of femur, Nottinghamshire PCT (directly standardised rate per 100,000 population), showing actual and fitted values with lower and upper confidence intervals.
Data provided to Nottinghamshire and Bassetlaw PCTs for WCC trajectories
Emergency admission rates for acute coronary
syndrome – East Midlands – males
Report to East Midlands Cardiac & Stroke Network
Emergency admission rates for acute coronary
syndrome – East Midlands – females
Report to East Midlands Cardiac & Stroke Network
Holt’s Method
 Holt’s exponential smoothing (aka double exponential smoothing) is a
moving average method
 There are two equations involved in fitting the model:
L_t = a·x_t + (1 – a)·(L_{t–1} + T_{t–1})
T_t = g·(L_t – L_{t–1}) + (1 – g)·T_{t–1}
where
x_t is the observed value at time t
L_t is the forecast at time t (the 'level' parameter)
T_t is the estimated slope at time t (the 'trend' parameter)
a is the first smoothing constant, used to smooth the level
g is the second smoothing constant, used to smooth the trend
 The model is fitted iteratively from the start of the time series, usually
setting L_1 initially to x_1 and T_1 to x_2 – x_1
 A software package optimises the constants a and g so that the squared
differences between the observed values and the forecasts are minimised (a minimal sketch follows)
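A minimal Python sketch of the two equations above, using invented rates and fixed smoothing constants; in practice a and g would be optimised to minimise the squared one-step-ahead errors, and a package such as ForecastPro, Stata or R would also produce forecast intervals.

def holt_fit(x, a, g):
    # Holt's (double) exponential smoothing: level L and trend T, as in the equations above
    level, trend = x[0], x[1] - x[0]                 # usual starting values: L_1 = x_1, T_1 = x_2 - x_1
    one_step_forecasts = []
    for obs in x[1:]:
        one_step_forecasts.append(level + trend)     # forecast made before seeing `obs`
        new_level = a * obs + (1 - a) * (level + trend)
        trend = g * (new_level - level) + (1 - g) * trend
        level = new_level
    return level, trend, one_step_forecasts

def holt_forecast(level, trend, steps):
    # h-step-ahead point forecasts extend the last level along the last trend
    return [level + h * trend for h in range(1, steps + 1)]

rates = [96, 94, 91, 90, 87, 85, 82, 81, 78, 76]     # illustrative annual rates
level, trend, fitted = holt_fit(rates, a=0.4, g=0.2) # a and g fixed here, normally optimised
print(holt_forecast(level, trend, steps=3))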
Holt’s Method in practice
 Several statistical packages will do this:
– ForecastPro – not free but very easy to use
– Stata – not free and needs code, but PHE has a corporate licence
– R – open source software which requires code
– Excel – you can put the equations into Excel but have to optimise the
parameters manually
 If you use Stata, R or Excel, you need to put some effort into optimising
the parameters, which requires some expertise and time
 ForecastPro has very clever optimisation routines, which always seem to
result in sensible forecasts and forecast intervals
 BUT!!!
Every forecast should be graphed and checked – even the most expert of
automated ‘expert systems’ cannot and should not be totally relied on
Interval forecasts not point forecasts
 When we forecast the future we give a single figure for each forecast that
is our best estimate of the future value
Chart: Teenage Conception Rates, 1992-2012, East Midlands (rate per 100,000 resident population), showing observed values, forecasts and target.
Interval forecasts not point forecasts
 When we forecast the future we give a single figure for each forecast that
is our best estimate of the future value
 However, of course, there is uncertainty about that prediction
 Forecast intervals give an indication of the degree of uncertainty, and are
far more valuable than the actual point forecasts
 These forecast intervals are calculated by the forecasting software
Chart: Teenage Conception Rates, 1992-2012, East Midlands (rate per 100,000 resident population), showing observed values, forecasts, 95% forecast interval and target.
How far back and how far forward?
 As discussed earlier, if the graph shows a distinct change in trend or step
change then we should ignore the data before the current trend
– If we use Holt’s Method or similar, it is less critical because the
method tends to give more weight to recent data and largely ignores
earlier points, but if it is clear from the graph, it is still wise to use
only the data which exhibit the current trend
– If the change is very recent, then we probably don’t have a sound
basis for forecasting – this would be reflected in the forecast intervals
(covered later)
 How far ahead can we forecast – the ‘forecast horizon’?
– A commonly quoted rule of thumb is that you can forecast around half as
far forward as you have data going back; however, it depends on the
stability of the series, and common sense should be applied
– The question is less critical if you present forecast intervals: these will
become extremely wide as you get further into the future,
demonstrating that the forecasts are meaningless
Using forecasting to estimate ‘current’ rates
 By ‘current’ we normally mean ‘the average of the last three years for
which data are available’
 For deaths, the ‘current’ values, used for analysing our current mortality
rates, for example, are based on 2010-2012 data, i.e. data from between
4½ and 1½ years ago: on average 3 years out of date
 For small areas, even with 3 years’ data, we still have very few deaths or
cases to work with and hence our baseline can be pretty arbitrary
 We may be able to use forecasting methodology to help with both of
these problems:
If we forecast 2014 values based on a time series from 2000 to 2012 then
we have
a) a more robust baseline, based on 13 years’ data not 3
b) a baseline which reflects ‘now’ rather than 3 years ago
 Forecasts of ‘current’ periods can give us robust ‘underlying’ values or
rates
Example – rapidly changing rates
 Circulatory disease death rates are falling dramatically
 The 2004-06 average rate was 91 deaths per 100,000 population-years
 The 2008 forecast was 74
 In 2008, by taking the average of 2004-06 as our 'current' rate we were
potentially overestimating the impact of interventions by 23%
Chart: Circulatory Disease Mortality Rates, 1993-2010, Doncaster and England (rate per 100,000 resident population), showing the England rate, Doncaster rate, forecasts, 95% forecast interval, 'current' values, upper limit and England forecast.
Summary – key points
 Look at a graph of the data, and think about the data you are working
with, considering whether there are reasons why past trends may not be
a sound basis for future changes
 Decide how far back you should start
 Transform data to ensure that the shape of the graph and any logical
limits on variability (e.g. >0, <100%) are reflected in the mathematical
assumptions
 Use regression to analyse past changes
 Use forecasting methods such as Holt’s Method (or Holt-Winters for
seasonal data) to make predictions of future rates with realistic forecast
intervals
 Ensure that data are independent of one another: no rolling averages
 Always graph the results, to ensure that the maths hasn’t had an off day
Contact
 Paul Fryers
paul.fryers@phe.gov.uk