1 Chapter 1 INTRODUCTION

Chapter 1
INTRODUCTION
1.1
World Natural Rubber Industry
Natural rubber (NR) is obtained from the latex-yielding tree Hevea brasiliensis.
According to the United Nations Conference on Trade and Development (UNCTAD),
NR's elasticity, toughness and resilience make it a commercially important
component in manufacturing a variety of products in the transportation, industrial,
consumer and medical sectors. In 2010, more than 90 percent of the global supply of NR
was produced by the members of the Association of Natural Rubber Producing Countries
(ANRPC): Thailand, Indonesia, Malaysia, India, Vietnam, China, Sri Lanka, the
Philippines and Cambodia. ANRPC members supplied 9.422 million metric tons (MT) of NR
to the world market in 2010.
Malaysia is one of the main NR producers and consumers. It was the third-ranked NR
producer with 970,000 MT in 2010, more than 10 percent of global output of NR
(Table 1.1). According to the ANRPC, the country is expected to increase its
production to 1.05 million MT in 2011. Furthermore, it is estimated that Malaysia
consumed five percent of the total NR supply in 2010. The Malaysian market is
considered one of the main commercial hubs for the natural rubber industry (Burger and
Smit 2002). The price of Standard Malaysian Rubber No. 20 (SMR20) is analyzed in this
study because it is the most commonly traded type of NR.
NR is a vital source of income for the region, especially in Thailand, Malaysia,
Indonesia, Vietnam, India and Cambodia. The International Rubber Research and
Development Board (IRRDB) estimated that approximately 20 million families depend on
this commodity for their basic income. Most farms are small, with an average of two
hectares (4.94 acres) of land planted with Hevea. Large estates consist of a minimum
of 300 hectares (741.31 acres) and are commonly owned by states and private
enterprises. In Malaysia, 94.5% of total NR production comes from smallholdings, while
the estate sector accounts for only 5.5%.
Table 1.1 Supply of Natural Rubber in 2009 and 2010
Country        Production ('000 MT)     Percentage Change
               2009         2010
Thailand       3,164        3,072       -3%
Indonesia      2,440        2,843       14%
Malaysia         857          970       12%
India            820          845        3%
Vietnam          724          750        3%
China            643          647        1%
Sri Lanka        137          148        7%
Philippines       98          102        4%
Cambodia          35           45       22%
Total          8,918        9,422        5%
Source of Data: The Association of Natural Rubber Producing Countries
Futures contracts are used in trading NR between producing countries and consuming
countries and regions such as China, Singapore, Europe, Japan and the United States
through brokers (Romprasert 2009). The establishment of futures markets gives local
traders and producers options in trading, such as when to buy, sell, or stock NR.
Therefore, a better understanding of what determines prices is necessary to make sound
decisions in hedging risk and speculating in this commodity futures market. Because
prices vary, hedging protects investments against loss in the futures market by means
of balancing or compensating contracts.
1.2 The Price of Natural Rubber
Natural rubber is technically an intermediate good, used together with other inputs
to produce a variety of final consumer goods such as medical gloves, condoms, general
rubber goods (GRG), and tires (Table 1.2). NR is predominantly demanded by the
manufacturing sector; specifically, the tire and automotive industries play a key role
in the growth of NR (Burger and Smit 1989). Today, more than 65 percent of NR is used
in the tire and automobile industries (ANRPC 2010). Table 1.2 indicates that the share
of NR in the tire and GRG sectors has been decreasing, which is attributable to the
substitution of synthetic rubber for increasingly costly natural rubber.
Table 1.2 Percentages of NR Used in Tire and GRG Sectors, 2003 to 2010
Period     NR-Share in tire sector (%)     NR-Share in GRG sector (%)
2003       76                              79
2004       76                              79
2005       76                              79
2006       73                              78
2007       72                              78
2008       73                              77
2009       71                              76
2010       68                              75
Source of Data: The Association of Natural Rubber Producing Countries
Similar to most agricultural commodities in the world market, the price of NR
fluctuates over time (Figure 3.1). From January 2000 to mid-2008, the price of natural
rubber increased around 4.5 percent annually. In the fourth quarter of 2008, the price of
SMR20 plunged due to the global economic downturn, which led to lower demand for
NR. Since Hevea is primarily grown in developing countries, the price of NR has an
important bearing on poverty in the region.
Global economic factors are viewed as the main determinants of NR prices. For
example, the United States economy grew at a 2.6 percent rate in the third quarter of
2010, slightly faster than in the second quarter. This growth gave the economy a
significant boost and signaled a jump in global NR prices due to stronger U.S.
consumer demand. The economic recovery in the United States and elsewhere continues to
boost commodity prices through speculative investments (ANRPC 2010).
1.3 Problem Statement
With large price volatility, it is important to forecast the future spot prices of
natural rubber accurately using statistical methods. Current production decisions
depend heavily on the prevailing futures prices (Feder et al. 1980; Allen 1994).
Accurate price forecasting has therefore become even more critical for the producers,
traders and consumers who are involved in the NR industry.
Several econometric techniques are used in price forecasting. Agricultural economists
have done extensive work forecasting the prices of many specific agricultural
commodities, yet few studies focus on forecasting NR prices. Romprasert (2009) applied
several forecasting models, including regression analysis, exponential smoothing,
Holt’s linear exponential smoothing, and Box-Jenkins, to study the futures price of
Thailand natural rubber ribbed smoked sheet no. 3 (RSS3). Khin et al. (2011) used
autoregressive integrated moving average (ARIMA) and multivariate autoregressive
moving average (MARMA) models to forecast the futures prices of SMR20 over the period
January 1990 to December 2008. Burger et al. (2002) used a vector error correction
(VEC) model to analyze the relationship between the price of NR and the exchange rate
during the Asian financial crisis in 1997.
Three econometric models, ARIMA, vector autoregressive (VAR) and VEC, are employed in
this analysis. The ARIMA model is a form of extrapolation in which the past behavior
of prices and of other variables (captured in the error terms) is used to forecast
future values. ARIMA models have no explanatory variables other than the past history
of the variable in question; that is, ARIMA is a univariate method of forecasting. The
VAR model involves N variables and is technically an extension of the univariate
autoregression to multivariate time series. Economically, the occurrence of an event
may be caused by multiple time series variables. Furthermore, these variables may be
not only contemporaneously correlated with each other but also correlated with each
other's past values. Considering multiple time series jointly therefore uses
additional information in determining the dynamic relationships among the series over
time (Stock and Watson 2011).
1.4 Objectives and Overview
The objective of the thesis is to develop and evaluate the forecasting accuracy of
univariate (ARIMA) and multivariate (VAR and VEC) time series models for NR prices.
The models are based on the average monthly spot price of SMR20, which is the most
commonly used specification in the industry. Economically, several variables influence
prices. Most importantly, the average monthly spot price of crude oil and the average
monthly exchange rate between the Malaysian ringgit (MYR) and the United States dollar
(USD) are used in the multivariate analysis. Before proceeding with the examination,
all variables are converted into log form and tested for stationarity. If a series is
non-stationary, it is differenced. Tests for cointegration are then conducted if the
variables in the forecasting models are integrated of the same order. If cointegration
is found, a VEC model is used for the multivariate time series instead of a VAR model.
Once the specific models are determined, their forecasting ability is evaluated by
comparing the forecasting errors of each model. In essence, this thesis studies the
historical data of NR to arrive at suitable econometric models, which can be used to
perform short-term forecasts of NR prices.
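The log transformation and differencing steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `log_series` and `difference` and the price values are hypothetical and are not taken from the thesis data.

```python
import math


def log_series(values):
    """Natural log of each (strictly positive) observation."""
    return [math.log(v) for v in values]


def difference(series, order=1):
    """Apply first differencing `order` times."""
    out = list(series)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


# Hypothetical monthly prices in USD/MT, for illustration only.
prices = [1000.0, 1100.0, 1210.0, 1150.0]
log_prices = log_series(prices)
growth = difference(log_prices)  # log differences approximate monthly growth rates
print([round(g, 4) for g in growth])
```

Differencing a log series once yields approximate period-to-period growth rates, which is one reason the log form is convenient before testing for stationarity.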
1.5 Contributions of the Study
Along with palm oil, rice, and cassava, NR is one of the most important commodities in
Southeast Asia since it plays a major role in improving socioeconomic conditions
throughout the region. The incomes of millions of families depend on the price of
natural rubber. Consequently, the instability of the price of NR poses a significant
risk to producers, traders, consumers and others who are involved in the production of
NR.
Therefore, this study provides valuable information for improving, planning, and
making decisions in NR production. As one of the producers and distributors of natural
rubber in Cambodia, I can adopt these forecasting methods to develop a market strategy
for working with local farmers and improving the well-being of NR producers in
Cambodia. Furthermore, this study can be introduced to the Department of Rubber
Development in Cambodia, where certain policies can be implemented to maximize the
welfare of rubber farmers.
Chapter 2
LITERATURE REVIEW
2.1 Introduction
Forecasting consists of using prior data to predict future values. Since natural
rubber is a storable intermediate good, current production depends heavily on future
prices. In essence, forecasts provide complementary information, which aids in
executing policies and making decisions about future events.
2.2 Methodology Reviews
The ARIMA model deals with univariate time series data and combines autoregressive
(AR) and moving average (MA) components. An AR process depends on a weighted sum of
its past values and a random disturbance term, while an MA process depends on a
weighted sum of current and lagged random disturbances. If a time series is not
stationary, it can be differenced once or more to become stationary; the original
series is then said to be integrated. Therefore, the stationary process in an ARIMA
model is a combination of lagged past values and lagged random disturbances, as well
as a current disturbance term (Pindyck and Rubinfeld 1998).
In terms of short-term forecasting, univariate time series models frequently
outperform sophisticated structural models (Harvey and Todd 1983). Khin (2010)
developed multiple forecasting models, including ARIMA, to predict the short-term
future price of natural rubber in Malaysia, generating both ex ante and ex post
short-term forecasts. Khin et al. (2008) fit the historical prices of SMR20 over the
period January 1990 to December 2006 to a time series model and concluded, using
Box-Jenkins methods, that the data suit an ARIMA(1,1,1) model. The outcomes showed
that the forecast values were statistically satisfactory. Harvey and Todd (1983)
suggested that univariate time series forecasting was reasonably accurate in
predicting diesel prices in the short term, yet long-term prices were difficult to
forecast. Forecasts and statistical results are somewhat inconsistent when few data
points are available and when the forecast horizon is extended (Bailey and Gupta
1999). Accurate forecasting requires a long series; at least 50 observations are
generally needed to obtain good results (Meyler, Kenny and Quinn 1998). Zant (1994)
studied the Indian rubber market, focusing on the explanation of short-run price and
stock information over the period 1978 to 1990. He concluded that the performance on
historic time series was not impressive due to enormous price fluctuations.
Theoretically, several economic variables influence the prices of NR. According to the
United Nations Conference on Trade and Development (UNCTAD), synthetic rubber, a close
substitute for NR in manufacturing most industrial products, can be processed from
crude oil. Therefore, it is important to include the price of crude oil in the
analysis in order to minimize forecasting errors. Fluctuation in the exchange rates of
NR producing countries is also one of the main factors influencing NR prices (ANRPC
2010). Sims (1980) used and promoted VAR models as a method of estimating economic
relationships. Accordingly, exchange rates and oil prices will be used as additional
predictors in the VAR or VEC model.
A set of interacting variables leads to the use of multivariate models, in which
multiple variables and their interactions are taken into consideration. Burger et al.
(2002) indicated that exchange rates have affected the price of NR. The price of crude
oil and the exchange rate between the MYR and the USD play a key role in changing the
price of NR in Malaysia. Frank and Garcia (2010) adopted VAR and VEC procedures to
determine the linkages among several commodities, oil, and exchange rates. Their study
suggested that agricultural commodity markets depend more on the exchange rate and to
a lesser extent on oil prices, but both will be used in this thesis.
Chapter 3
FORECASTING DATA
3.1 The Data
The time series of the average monthly spot prices of SMR20, the average monthly spot
prices of crude oil, the average monthly exchange rate between the MYR and the USD,
and the estimated end-of-month sales of motor vehicles in the United States are used
in this empirical analysis. The data are drawn from three sources: the Malaysia Rubber
Board (MRB), Bank Negara Malaysia (the Central Bank of Malaysia) and the United States
Department of Commerce. The data consist of four time series with 141 observations
each, covering January 2000 to September 2011. The prices of SMR20 and crude oil are
calculated monthly average spot prices, given in USD per MT and USD per barrel,
respectively. The estimated sales of motor vehicles in the U.S. are given as total
end-of-month retail sales. For convenience, I denote the price of SMR20, the price of
crude oil, total sales of motor vehicles in the U.S., and the average monthly exchange
rate between the MYR and the USD as PSMR20, PCO, TSMV and EXM, respectively.
3.2 Forecasting Periods
In order to evaluate the out-of-sample forecasting ability of the various models,
some observations at the end of the sample period are not used in estimating the
models. Thus, there are two periods in the analysis: an in-sample period (January 2000
to December 2009) and an out-of-sample period (January 2010 to September 2011). The
series from the in-sample period is used to estimate the forecasting models, and the
out-of-sample forecasts are then checked against actual data. Figure 3.1 shows that
the price of SMR20 plummeted during the fourth quarter of 2008. The global economic
downturn, which led to a decline in demand for commodities including NR, was likely
the primary reason for this large decline. Auto sales are included, along with oil
prices and exchange rates, to account for macroeconomic factors.
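The split into estimation and hold-out periods can be illustrated with a short Python sketch. The helper names `month_labels` and `split_sample` are hypothetical; only the dates and the observation counts come from the text.

```python
def month_labels(start_year, start_month, count):
    """Return `count` consecutive (year, month) pairs."""
    labels = []
    year, month = start_year, start_month
    for _ in range(count):
        labels.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return labels


def split_sample(labels, last_in_sample):
    """Split at `last_in_sample` (inclusive) into estimation and hold-out parts."""
    cut = labels.index(last_in_sample) + 1
    return labels[:cut], labels[cut:]


months = month_labels(2000, 1, 141)  # January 2000 through September 2011
in_sample, out_sample = split_sample(months, (2009, 12))
print(len(in_sample), len(out_sample), out_sample[0], out_sample[-1])
```

Holding out the final observations in this way means the out-of-sample forecast errors are computed on data the models never saw during estimation.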
Figure 3.1 Average Monthly Spot Prices of SMR20 (Jan. 2000 to Sep. 2011)
[Figure: line plot; vertical axis USD/MT (0 to 6,000), horizontal axis Time]
Source: Malaysia Rubber Board
3.3 Descriptive Statistics
Descriptive statistics are presented in Table 3.1, where Panels A and B show the
statistics for the periods January 2000 to December 2009 and January 2010 to September
2011, respectively. Over the period January 2000 to December 2009, the mean of PSMR20
is 1,406.52 USD/MT and the standard deviation is 703.40 USD/MT. The minimum value of
PSMR20 was 503.80 USD/MT, while the maximum reached 3,183.10 USD/MT. The coefficient
of variation is slightly over 50 percent, which indicates that the price of SMR20
varied widely over this period. Between January 2010 and September 2011, mean-adjusted
volatility was less than one-half of what it was prior to January 2010.
Table 3.1 Descriptive Statistics for In-Sample of PSMR20
Panel A: Period January 2000 to December 2009 (120 observations)
Statistic                      PSMR20
Mean (USD/MT)                  1,406.52
SD (USD/MT)                      703.40
Min (USD/MT)                     503.80
Max (USD/MT)                   3,183.10
Coef. of Variation (%)            50.01
Panel B: Period January 2010 to September 2011 (21 observations)
Statistic                      PSMR20
Mean (USD/MT)                  3,962.25
SD (USD/MT)                      880.80
Min (USD/MT)                   2,861.00
Max (USD/MT)                   5,560.80
Coef. of Variation (%)            22.33
Source: Malaysia Rubber Board and Malaysia Central Bank
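The coefficient of variation reported in Table 3.1 is the standard deviation expressed as a percentage of the mean. A minimal Python sketch, using the means and standard deviations from the table and a hypothetical helper name, is:

```python
def coefficient_of_variation(mean, sd):
    """Standard deviation as a percentage of the mean."""
    return 100.0 * sd / mean


# Means and standard deviations (USD/MT) as reported in Table 3.1.
cv_in_sample = coefficient_of_variation(1406.52, 703.40)
cv_out_sample = coefficient_of_variation(3962.25, 880.80)
print(round(cv_in_sample, 2), round(cv_out_sample, 2))
```

Because the measure is scaled by the mean, it allows volatility to be compared across the two periods even though the average price level is much higher after January 2010.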
Chapter 1 briefly noted that the trends of PSMR20 and PCO are closely related. The
correlation between the two variables over the period January 2000 to December 2010
is strong, at 0.94. This result is consistent with demand theory, in which the prices
of closely related goods tend to move together. The negative relationship between the
log of PSMR20 and the log of EXM indicates that an appreciation of the Malaysian
ringgit is associated with a higher price of SMR20. The positive correlation between
the log of PSMR20 and the log of TSMV suggests that strong demand for motor vehicles
is associated with price increases in SMR20.
Table 3.2 Correlation Between Log of PSMR20 and Logs of PCO, EXM and TSMV
                    Log of PSMR20
Log of PCO           0.94
Log of EXM          -0.79
Log of TSMV          0.20
Source: Malaysia Rubber Board and Malaysia Central Bank
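Correlations such as those in Table 3.2 can be computed with the Pearson coefficient applied to the logged series. The sketch below uses hypothetical numbers, not the thesis data, and the function names are illustrative.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def log_correlation(x, y):
    """Pearson correlation between the natural logs of two positive series."""
    return pearson([math.log(v) for v in x], [math.log(v) for v in y])


# Hypothetical values for illustration only.
rubber = [1000.0, 1500.0, 2200.0, 3000.0]   # USD/MT
oil = [30.0, 45.0, 66.0, 90.0]              # USD/barrel, proportional to rubber
fx = [3.8, 3.6, 3.4, 3.2]                   # MYR per USD, falling as rubber rises
print(round(log_correlation(rubber, oil), 2), round(log_correlation(rubber, fx), 2))
```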
Furthermore, examining plots of the data is important; they provide visual evidence as
to whether there are any structural breaks, outliers or data errors. One can also
detect a significant seasonal pattern from a time series plot. Visual plots also
suggest potential relationships among the time series. There appears to be a strong
positive relationship between PSMR20 and PCO, which is supported by economic theory
since synthetic rubber and natural rubber are the key components in producing
commercial tires. NR is traded in USD, and its price normally rises when the
currencies of NR producing countries strengthen (ANRPC 2010). An appreciation of the
Malaysian ringgit from late 2006 to mid-2008 led to an increase in PSMR20.
Furthermore, NR prices are generally expected to follow the path of the crude oil
market (ANRPC 2010). Correspondingly, Figure 3.2 shows that crude oil prices follow
the same trend as SMR20 prices. Frank and Garcia (2010) suggested that commodity
prices depend on the exchange rate. The U.S. index dropped to its lowest point in
2008, which coincides with the peak price of crude oil in the same period (Figure
3.2). The plot also reveals that the series all appear to behave like random walks
with no seasonal variation. Figure 3.3 suggests that PSMR20 and TSMV seem to move
together. Therefore, adding the demand for motor vehicles should help explain the
variation in PSMR20.
Figure 3.2 Log Forms of the Variables (Jan. 2000 to Dec. 2009)
Source: Malaysia Rubber Board and Malaysia Central Bank
Figure 3.3 Log Forms of the Variables (Jan. 2010 to Sep. 2011)
Source: Malaysia Rubber Board and the United States Department of Commerce
Chapter 4
EMPIRICAL METHODOLOGY
4.1 Preliminary Procedures
4.1.1 Unit Root Processes
It is important for the time series data to be stationary. A stationary time series is
one whose mean and variance are constant and whose covariances do not depend on time
(Stock and Watson 2011). A nonstationary time series is transformed by differencing to
induce stationarity.
There are several methods to test for stationarity, including the Dickey-Fuller (DF)
test (Dickey and Fuller 1981) and the Phillips-Perron test (Phillips and Perron 1988).
The Phillips-Perron method suffers from severe size distortions when there are negative
moving average errors. In this analysis, the DF test is performed since it is generally
the most reliable and is easy to implement and interpret (Stock and Watson 2011).
Consider a first-order autoregressive model, AR(1), which can be written as follows:
$$\mathrm{PSMR20}_t = \beta_0 + \beta_1\,\mathrm{PSMR20}_{t-1} + u_t \tag{4.1}$$
In the Dickey-Fuller (DF) test for a unit root, the null hypothesis is $\beta_1 = 1$,
which indicates that $\mathrm{PSMR20}_t$ is nonstationary and has an autoregressive
root of 1. The alternative is $\beta_1 < 1$, which implies that the time series
$\mathrm{PSMR20}_t$ is stationary. In practice, the Dickey-Fuller test is implemented
by subtracting $\mathrm{PSMR20}_{t-1}$ from both sides of equation (4.1) to yield:
$$\Delta\mathrm{PSMR20}_t = \beta_0 + \delta\,\mathrm{PSMR20}_{t-1} + u_t \tag{4.2}$$
where $\Delta\mathrm{PSMR20}_t = \mathrm{PSMR20}_t - \mathrm{PSMR20}_{t-1}$ and
$\delta = \beta_1 - 1$. The null hypothesis is now $\delta = 0$ (unit root) against
the alternative $\delta < 0$ (stationary).
The DF test applies only to an AR(1) process. In some cases, AR(1) is not a good model
for capturing all the serial correlation in the time series, so a higher-order
autoregression is considered. In that situation, testing for a unit root requires the
augmented Dickey-Fuller (ADF) test. The ADF critical values are used in the analysis.
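The Dickey-Fuller regression in equation (4.2) can be sketched with ordinary least squares in plain Python. This is an illustration under stated assumptions, not the procedure used in the thesis: the function name `dickey_fuller_stat` is hypothetical, the series is simulated, and no lagged differences are included (so it corresponds to the DF test rather than the ADF test).

```python
import math
import random


def dickey_fuller_stat(y):
    """OLS of (y_t - y_{t-1}) on a constant and y_{t-1}.

    Returns (delta_hat, t_stat). The t statistic must be compared with
    Dickey-Fuller critical values, not with the usual Student-t table.
    """
    x = y[:-1]
    dy = [b - a for a, b in zip(y, y[1:])]
    n = len(x)
    mx, md = sum(x) / n, sum(dy) / n
    sxx = sum((v - mx) ** 2 for v in x)
    delta = sum((v - mx) * (d - md) for v, d in zip(x, dy)) / sxx
    beta0 = md - delta * mx
    rss = sum((d - beta0 - delta * v) ** 2 for v, d in zip(x, dy))
    se = math.sqrt(rss / (n - 2) / sxx)
    return delta, delta / se


# Simulated stationary AR(1) series with coefficient 0.5 (illustration only).
random.seed(0)
stationary = [0.0]
for _ in range(300):
    stationary.append(0.5 * stationary[-1] + random.gauss(0.0, 1.0))

delta_hat, t_stat = dickey_fuller_stat(stationary)
print(round(delta_hat, 3), round(t_stat, 2))
```

For a regression with an intercept, the large-sample 5 percent Dickey-Fuller critical value is about -2.86; a t statistic below that value rejects the unit root null.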
4.1.2 Checking for Seasonality
After seven years, Hevea trees are mature and can yield latex all year round. However,
in some regions the trees are not tapped during the period when new leaves grow, which
normally occurs in February and March. The latex is processed into dried block or
sheet rubber, which can be stored in a warehouse for a long period of time, so NR is
considered a storable good and its supply is considered non-seasonal (Barlow 1978).
However, it is possible that the demand for rubber is seasonal. The price of SMR20 is
therefore checked for seasonality in the empirical analysis. The existence of a
consistent pattern in a time series reveals seasonality (Goetz and Weber 1986).
Inspecting the plots of the autocorrelation function (ACF) and partial autocorrelation
function (PACF) is an effective method of checking for seasonality in a time series.
For instance, in monthly data with a seasonal pattern, the ACF plot spikes
consistently every 12 observations.
More formally, seasonality can be handled by including seasonal dummy variables in the
regression models. If the ACF and PACF plots indicate the existence of seasonality,
seasonal dummy variables are incorporated in the analysis.
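As an illustration of how monthly seasonal dummy variables might be constructed, the following Python sketch builds eleven 0/1 indicators. It assumes, beyond what the text states, that the regression contains an intercept and that January serves as the omitted base month; the function name is hypothetical.

```python
def seasonal_dummies(months, base_month=1):
    """One 0/1 column per calendar month except `base_month`.

    Omitting one month avoids perfect collinearity with the intercept.
    """
    kept = [m for m in range(1, 13) if m != base_month]
    return [[1 if month == m else 0 for m in kept] for month in months]


# Two years of monthly observations: 1..12 repeated twice.
months = [(i % 12) + 1 for i in range(24)]
rows = seasonal_dummies(months)
print(rows[0], rows[1])
```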
4.2 Univariate Time Series Model
4.2.1 Autoregressive Integrated Moving Average (ARIMA) Model
In order to attain a forecast with minimal errors, roughly seven characteristics of a
good ARIMA model are taken into consideration (Pankratz 1983). First, a good model is
parsimonious: it includes only the smallest number of coefficients needed to explain
the available data, which provides a strong practical orientation in developing a
model. Second, a good autoregressive (AR) model is stationary, which implies that the
time series has a constant mean and variance through time. Third, a good moving
average (MA) model is invertible, which requires that the MA coefficients satisfy
certain conditions (Table 4.1). Fourth, a good model has high-quality estimated AR and
MA coefficients at the estimation stage; the coefficients must be statistically
significantly different from zero. Fifth, a good model has statistically independent
residuals. Sixth, the residuals of the estimated model should be normally distributed.
Certain statistical tests will be used to analyze the residuals. Finally, a good model
has sufficiently small forecast errors: it fits the past data well and produces
acceptable forecasts of future values. Correspondingly, once the fitted models are
determined, an out-of-sample forecast is performed and the resulting forecast errors
are generated and discussed.
As discussed in Chapters 2 and 3, a global recession started in mid-2008, and price
variation was excessive during that period. Therefore, in order to minimize the
forecasting errors, dummy variables are included to account for the global recession
in the fourth quarter of 2008.
Table 4.1 Summary of Invertibility Conditions for MA Coefficients
Model Type                Invertibility Conditions
ARMA(p,0)                 Always invertible
MA(1) or ARMA(p,1)        $|\theta_1| < 1$
MA(2) or ARMA(p,2)        $|\theta_2| < 1$
                          $\theta_2 + \theta_1 < 1$
                          $\theta_2 - \theta_1 < 1$
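The conditions in Table 4.1 can be checked mechanically once the MA coefficients have been estimated. The following sketch applies the conditions exactly as listed in the table; the function name `ma_is_invertible` is hypothetical and the coefficient values are for illustration only.

```python
def ma_is_invertible(thetas):
    """Check the Table 4.1 invertibility conditions for up to two MA terms."""
    if len(thetas) == 0:
        return True  # pure AR model: always invertible
    if len(thetas) == 1:
        return abs(thetas[0]) < 1
    if len(thetas) == 2:
        theta1, theta2 = thetas
        return abs(theta2) < 1 and theta2 + theta1 < 1 and theta2 - theta1 < 1
    raise ValueError("Table 4.1 only covers models with at most two MA terms")


print(ma_is_invertible([0.5]), ma_is_invertible([0.9, 0.3]))
```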
4.2.1.1 Identification and Estimation
The analysis and model are based on the observations of $\ln(\mathrm{PSMR20}_t)$ over
the period January 2000 to December 2009, while the observations from January 2010 to
September 2011 are reserved for out-of-sample forecasting.
Box-Jenkins forecasting models essentially involve examining the patterns of the
ACF and PACF. The estimated ACF and PACF are used as a guide in choosing one or
more ARIMA models that might fit the available data. These tools are considered
important in the identification stage since they evaluate the statistical relationship
between observations in a univariate time series.
The plot of the ACF is useful in identifying the trend of a time series. For instance,
a time series is considered non-stationary if its ACF plot decays extremely slowly;
if, however, the ACF cuts off fairly quickly, the time series should be identified as
stationary.
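Let us take a closer look at the ACF by considering the time series values
$\mathrm{PSMR20}_b, \mathrm{PSMR20}_{b+1}, \ldots, \mathrm{PSMR20}_n$, for which the
sample autocorrelation at lag k, denoted by $r_k$, is
$$r_k = \frac{\sum_{t=b}^{n-k} (\mathrm{PSMR20}_t - \overline{\mathrm{PSMR20}})(\mathrm{PSMR20}_{t+k} - \overline{\mathrm{PSMR20}})}{\sum_{t=b}^{n} (\mathrm{PSMR20}_t - \overline{\mathrm{PSMR20}})^2} \tag{4.3}$$
where $\overline{\mathrm{PSMR20}}$ is the sample mean of the series.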
An estimated autocorrelation coefficient $r_k$ measures the direction and strength of
the relationship between time series observations separated by a lag of k.
Theoretically, $r_k$ always falls between -1 and 1. A value of $r_k$ close to 1
indicates a strong positive linear relationship, and a value close to -1 indicates a
strong negative linear relationship. When $r_k = 0$, there is no indication of
correlation between time series observations separated by a lag of k time units.
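A minimal Python sketch of the sample autocorrelation in equation (4.3) follows. It assumes the common form in which the denominator is summed over the full sample, which keeps each $r_k$ between -1 and 1; the function name and the toy series are illustrative only.

```python
def sample_acf(z, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of the series z."""
    n = len(z)
    mean = sum(z) / n
    denom = sum((v - mean) ** 2 for v in z)  # summed over the full sample
    acf = []
    for k in range(1, max_lag + 1):
        num = sum((z[t] - mean) * (z[t + k] - mean) for t in range(n - k))
        acf.append(num / denom)
    return acf


r = sample_acf([1.0, 2.0, 3.0, 4.0, 5.0], 2)
print([round(v, 2) for v in r])
```

Plotting these values against the lag k gives the ACF plot that is inspected at the identification stage.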
Table 4.2 Characteristics of the ACF and PACF for AR and MA Processes
Process | ACF | PACF
AR(p) | Tails off as exponential decay or damped sine wave; decays toward zero | Cuts off after lag p; cuts off to zero (lag length of last spike equals AR order of process)
MA(q) | Cuts off after lag q; cuts off to zero (lag length of last spike equals MA order of process) | Tails off as exponential decay or damped sine wave
ARMA(p,q) | Tails off after lag (q,p); tails off toward zero | Tails off after lag (p,q); tails off toward zero
Source: Excerpt from Weit, 2005 p.109 and Pankratz, 1983 p.55.
Table 4.3 Detailed Characteristics of Five Common Stationary Processes
Process | ACF | PACF
AR(1) | Exponential decay: (i) on the positive side if $\phi_1 > 0$; (ii) alternating in sign, starting on the negative side, if $\phi_1 < 0$. | Spike at lag 1, then cuts off to zero: (i) spike is positive if $\phi_1 > 0$; (ii) spike is negative if $\phi_1 < 0$.
AR(2) | A mixture of exponential decays or a damped sine wave. The exact pattern depends on the signs and sizes of $\phi_1$ and $\phi_2$. | Spike at lags 1 and 2, then cuts off to zero.
MA(1) | Spike at lag 1, then cuts off to zero: (i) spike is positive if $\theta_1 < 0$; (ii) spike is negative if $\theta_1 > 0$. | Damps out exponentially: (i) alternating in sign, starting on the positive side, if $\theta_1 < 0$; (ii) on the negative side, if $\theta_1 > 0$.
MA(2) | Spike at lags 1 and 2, then cuts off to zero. | A mixture of exponential decays or a damped sine wave. The exact pattern depends on the signs and sizes of $\theta_1$ and $\theta_2$.
ARMA(p,q) | Exponential decay from lag 1: (i) sign of $r_1$ = sign of $(\phi_1 - \theta_1)$; (ii) all one sign if $\phi_1 > 0$; (iii) alternating in sign if $\phi_1 < 0$. | Exponential decay from lag 1: (i) $\phi_{11} = r_1$; (ii) all one sign if $\theta_1 > 0$; (iii) alternating in sign if $\theta_1 < 0$.
Source: Excerpt from Pankratz, 1983 p.55.
Autoregressive (AR) Processes
Assuming that the time series is stationary, the next step is to establish an
appropriate lag structure for the ARIMA model by following the guidelines in Tables
4.2 and 4.3. First, consider an AR model, which expresses the current value
$\mathrm{PSMR20}_t$ as a function of the past values $\mathrm{PSMR20}_{t-1},
\mathrm{PSMR20}_{t-2}, \ldots, \mathrm{PSMR20}_{t-p}$. The AR model can be written as:
$$\mathrm{PSMR20}_t = \delta + \phi_1 \mathrm{PSMR20}_{t-1} + \phi_2 \mathrm{PSMR20}_{t-2} + \cdots + \phi_p \mathrm{PSMR20}_{t-p} + e_t \tag{4.4}$$
$\mathrm{PSMR20}_t$ is the dependent variable at time t, and $\mathrm{PSMR20}_{t-1},
\mathrm{PSMR20}_{t-2}, \ldots, \mathrm{PSMR20}_{t-p}$ are the independent variables at
lags t-1, t-2, ..., t-p, respectively. The unknown parameters $\phi_1, \phi_2, \ldots,
\phi_p$ relate to $\mathrm{PSMR20}_{t-1}, \mathrm{PSMR20}_{t-2}, \ldots,
\mathrm{PSMR20}_{t-p}$, respectively.
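To make the AR(p) equation concrete, the sketch below computes a one-step-ahead prediction from given coefficients, with the unknown shock replaced by its expected value of zero. The function name and the numerical values are hypothetical and are not estimates from the thesis.

```python
def ar_one_step(history, delta, phis):
    """One-step-ahead AR(p) prediction: delta + sum_i phi_i * y_{t-i}."""
    p = len(phis)
    if len(history) < p:
        raise ValueError("need at least p past observations")
    recent = history[::-1][:p]  # y_{t-1}, y_{t-2}, ..., y_{t-p}
    return delta + sum(phi * y for phi, y in zip(phis, recent))


# Hypothetical AR(2): delta = 0.5, phi_1 = 0.5, phi_2 = 0.2.
forecast = ar_one_step([1.0, 2.0, 3.0], delta=0.5, phis=[0.5, 0.2])
print(round(forecast, 2))
```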
Moving Average (MA) Processes
The number of lags for the moving average (MA) terms is determined by following the
guidelines in Tables 4.2 and 4.3. Moving average terms take into account the impact of
the current random shock $e_t$ and the past random shocks $e_{t-1}, e_{t-2}, \ldots,
e_{t-q}$. A qth-order moving average model, MA(q), can be written as
$$\mathrm{PSMR20}_t = e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2} + \cdots + \theta_q e_{t-q} \tag{4.5}$$
$\mathrm{PSMR20}_t$ is the dependent variable at time t, and $e_t, e_{t-1}, \ldots,
e_{t-q}$ are the error at time t and the errors in the previous periods t-1, t-2, ...,
t-q, respectively. The unknown parameters $\theta_1, \theta_2, \ldots, \theta_q$
relate to $e_{t-1}, e_{t-2}, \ldots, e_{t-q}$, respectively.
Mixed Autoregressive Moving Average (ARIMA) Processes
A mixed ARIMA process has a theoretical ACF with both AR and MA characteristics, which
can again be identified by following the guidelines in Tables 4.2 and 4.3. Once the
proper lag lengths for the AR and MA terms have been determined, the two models are
combined into a single ARIMA model of order (p,d,q), which can be written as
$$\mathrm{PSMR20}_t = \delta + \phi_1 \mathrm{PSMR20}_{t-1} + \phi_2 \mathrm{PSMR20}_{t-2} + \cdots + \phi_p \mathrm{PSMR20}_{t-p} + e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2} + \cdots + \theta_q e_{t-q} \tag{4.6}$$
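The mixed model in equation (4.6) can be illustrated by recovering the shocks recursively and then forming a one-step-ahead forecast. This is a sketch under simplifying assumptions that are not from the thesis: pre-sample values of the series and of the shocks are set to zero, the coefficients are given rather than estimated, and all names are hypothetical.

```python
def _lagged_sum(coefs, series, t):
    """sum_i coefs[i] * series[t-1-i], skipping lags before the sample starts."""
    return sum(c * series[t - 1 - i] for i, c in enumerate(coefs) if t - 1 - i >= 0)


def arma_residuals(y, delta, phis, thetas):
    """Recover the shocks e_t implied by the ARMA equation for given coefficients."""
    e = []
    for t in range(len(y)):
        fitted = delta + _lagged_sum(phis, y, t) + _lagged_sum(thetas, e, t)
        e.append(y[t] - fitted)
    return e


def arma_one_step(y, delta, phis, thetas):
    """Forecast the next value, setting the future shock to zero."""
    e = arma_residuals(y, delta, phis, thetas)
    t = len(y)
    return delta + _lagged_sum(phis, y, t) + _lagged_sum(thetas, e, t)


# Hypothetical ARMA(1,1) with delta = 0, phi_1 = 0.5, theta_1 = 0.4.
series = [1.0, 2.0]
shocks = arma_residuals(series, 0.0, [0.5], [0.4])
forecast = arma_one_step(series, 0.0, [0.5], [0.4])
print([round(v, 2) for v in shocks], round(forecast, 2))
```

Estimation software typically chooses the coefficients that make these recovered shocks as small as possible in some sense; the recursion itself is the same.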
4.2.1.2 Model Diagnostic
Having identified an ARIMA model and having satisfactorily estimated its
parameters, a model is examined for improvement. If there is evidence of autocorrelation
or statistical insignificance, one needs to go back to the identification stage and modify
the model.
A number of diagnostic tools are available for ensuring that an ARIMA model is
statistically adequate. A first check is simply to plot the autocorrelogram of the
residuals of the fitted model. The residuals of an estimated ARIMA model should
resemble a white noise process if the model is correctly specified; the plotted
autocorrelations should die out immediately from lag one on. Any significant
autocorrelations may indicate model misspecification.
Secondly, a statistically acceptable model has random shocks, $e_t$, that are
statistically independent. The residual autocorrelations $r_k$ ($k = 1, 2, \ldots, K$)
are supposed to be uncorrelated and approximately normally distributed as N(0, 1/n).
The chosen models will be assessed for autocorrelation in the residuals. The null
hypothesis of no residual autocorrelation is tested against the alternative hypothesis
that there is at least one nonzero autocorrelation:
$$H_0: r_k = 0 \text{ for all } k = 1, 2, 3, \ldots, K$$
$$H_1: r_k \neq 0 \text{ for at least one } k = 1, 2, 3, \ldots, K$$
To test these hypotheses, one can adopt a diagnostic chi-square test, known as the
Ljung-Box test, on the autocorrelations of the residuals in order to check the adequacy of
the model. The Ljung-Box statistic is
$$Q^* = n(n+2) \sum_{k=1}^{K} (n-k)^{-1} r_k^2(\hat{a}) \sim \chi^2_{K-m} \tag{4.7}$$
where n is the number of observations used to fit the model, K is the number of
autocorrelations included in the test, m is the number of estimated ARMA parameters, and
$r_k^2(\hat{a})$ is the squared sample autocorrelation of the residuals at lag k.
The Q* statistic approximately follows the chi-squared distribution with K - m degrees of
freedom. Thus, if Q* exceeds the chi-squared critical value, the null hypothesis is
rejected, which indicates that the residuals of the estimated model are autocorrelated.
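As an illustration of Equation (4.7), the statistic can be computed directly from a residual series. The following Python sketch is only an illustration under stated assumptions: the function names and the toy residual series are hypothetical, and it is not the EViews routine used in this study.

```python
def sample_autocorr(x, k):
    """Sample autocorrelation r_k of the series x at lag k (k >= 1)."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    return num / denom


def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic Q* = n(n+2) * sum_{k=1..K} r_k^2 / (n - k)."""
    n = len(residuals)
    total = sum(
        sample_autocorr(residuals, k) ** 2 / (n - k)
        for k in range(1, max_lag + 1)
    )
    return n * (n + 2) * total


# Hypothetical residuals that alternate in sign, i.e. strongly autocorrelated.
toy_residuals = [1.0, -1.0] * 4
q_star = ljung_box_q(toy_residuals, max_lag=1)  # 8.75 for this toy series

# 5% critical value of the chi-squared distribution with 1 degree of freedom
# (no ARMA parameters are estimated in this toy case, so m = 0).
CHI2_5PCT_1DF = 3.841
reject_no_autocorrelation = q_star > CHI2_5PCT_1DF
```

For residuals from a fitted ARMA model, the critical value would instead be taken from the chi-squared distribution with K - m degrees of freedom.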
Third, one must test whether the standardized residuals are normally distributed, based
on the third and fourth moments, by comparing the skewness and the
kurtosis of the series with those of the normal distribution. Skewness is a measure of
the symmetry of the histogram. The skewness of a symmetrical distribution such as the
normal distribution is zero. The kurtosis is a measure of the thickness of the tails of the
distribution. The kurtosis of a normal distribution is 3. If the distribution has thicker
tails than the normal distribution, its kurtosis will exceed three. One common test for
normality is the Jarque-Bera (JB) test. Under the null hypothesis of normality, the
Jarque-Bera statistic is distributed chi-square with 2 degrees of freedom.
$$H_0: E[(u_t^s)^3] = 0 \text{ (skewness) and } E[(u_t^s)^4] = 3 \text{ (kurtosis)}$$
$$H_1: E[(u_t^s)^3] \neq 0 \text{ or } E[(u_t^s)^4] \neq 3$$
If JB is large, the null hypothesis is rejected, which indicates that the residuals are
not normally distributed. Thus, a model is statistically adequate when its residuals
behave as white noise, that is, when they are not autocorrelated and are normally distributed.
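The two moment conditions can be combined into a single statistic. The sketch below assumes the textbook form JB = (n/6)[S^2 + (K - 3)^2/4], where S and K are the sample skewness and kurtosis; this formula is not written out in the text, and the function name and data are hypothetical.

```python
def jarque_bera(x):
    """Return (skewness, kurtosis, JB) using sample moments about the mean."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return skew, kurt, jb


# Hypothetical standardized residuals: symmetric (skewness 0) but thin-tailed.
toy = [-1.0, 1.0] * 10
skew, kurt, jb = jarque_bera(toy)

# 5% critical value of the chi-squared distribution with 2 degrees of freedom.
CHI2_5PCT_2DF = 5.991
reject_normality = jb > CHI2_5PCT_2DF
```

The same comparison against 5.991 applies to any JB value reported by a statistical package at the 5% level.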
If a model is not statistically acceptable because its random shocks are not
statistically independent or its residuals are non-normally distributed, one must
reformulate the model by returning to the identification stage and repeating the process
until an acceptable or best model is found.
4.3 Multivariate Time Series Model
4.3.1. Vector Autoregressive (VAR) Model
A univariate time series analysis involves only one variable. In
contrast, a vector autoregression (VAR) model involves N variables. Technically, it is
an extension of the univariate model to multivariate time series. In most cases, an
event is driven by multiple time series variables. Furthermore, not only are these
variables contemporaneously correlated with each other, but their past values may also
be correlated with each other. Considering multiple time series jointly
makes use of this additional information in determining the dynamic relationships over
time among the series (Stock and Watson 2011).
Based on two endogenous variables, namely PSMR20t and EXMt, the standard
form of a VAR model with k lags of PSMR20t and n lags of EXMt can be expressed as:
$$PSMR20_t = a_1 PSMR20_{t-1} + \dots + a_k PSMR20_{t-k} + a_{k+1} EXM_{t-1} + \dots + a_{k+n} EXM_{t-n} + e_{PSMR20,t} \tag{4.8}$$
$$EXM_t = b_1 PSMR20_{t-1} + \dots + b_k PSMR20_{t-k} + b_{k+1} EXM_{t-1} + \dots + b_{k+n} EXM_{t-n} + e_{EXM,t} \tag{4.9}$$
Both PSMR20t and EXMt depend on their own lags and on the lags of the other variable,
which captures the interrelationship between them.
F-tests and information criteria are useful tools in selecting an appropriate lag
length for a VAR model (Stock and Watson 2011). The F-statistic tests the joint null
hypothesis that all of the coefficients being tested are equal to zero against the
alternative that at least one of the coefficients is not zero. The F-statistic, however,
may overestimate the true lag order (Stock and Watson 2011). Therefore, the Bayes
Information Criterion (BIC) and Akaike Information Criterion (AIC), which weigh the fit
of the model against the number of estimated coefficients, will also be employed in
defining the appropriate lag length for the VAR model.
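$$BIC(p) = \ln[\det(\hat{\Sigma}_u)] + k(kp+1) \frac{\ln(T)}{T} \tag{4.10}$$
$$AIC(p) = \ln[\det(\hat{\Sigma}_u)] + k(kp+1) \frac{2}{T} \tag{4.11}$$
Here $\hat{\Sigma}_u$ is the estimated covariance matrix of the VAR errors, k denotes the
number of variables in the VAR, p is the number of lags and T is the number of
observations. The lag length will be selected with the statistical package EViews.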
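To make Equations (4.10) and (4.11) concrete, the following Python sketch evaluates both criteria for a few candidate lag lengths and picks the minimum. The log-determinant values, the sample size of 120 and the function names are hypothetical placeholders, not estimates from this study.

```python
import math


def var_bic(log_det_sigma, k, p, T):
    """BIC(p) = ln det(Sigma_u) + k(kp + 1) ln(T) / T."""
    return log_det_sigma + k * (k * p + 1) * math.log(T) / T


def var_aic(log_det_sigma, k, p, T):
    """AIC(p) = ln det(Sigma_u) + k(kp + 1) * 2 / T."""
    return log_det_sigma + k * (k * p + 1) * 2.0 / T


# Hypothetical ln det(Sigma_u) for VARs with 1, 2 and 3 lags.
candidates = {1: -6.50, 2: -6.55, 3: -6.57}
k, T = 2, 120  # two variables, 120 observations (assumed)

bic = {p: var_bic(ld, k, p, T) for p, ld in candidates.items()}
aic = {p: var_aic(ld, k, p, T) for p, ld in candidates.items()}

best_bic = min(bic, key=bic.get)  # lag length with the smallest BIC
best_aic = min(aic, key=aic.get)  # lag length with the smallest AIC
```

Because ln(T) exceeds 2 for any T above about 7, the BIC penalizes additional lags more heavily than the AIC.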
4.3.2 Cointegration and Vector Error Correction (VEC) Model
Up to this stage, the variables in the VAR analysis have been assumed to be stationary.
If the time series are nonstationary, the analysis may lead to spurious results. In order
to apply VAR models, the data must first be made stationary by sufficient differencing.
Nonstationary variables may, however, be cointegrated: two or more nonstationary
variables are cointegrated when a linear combination of them is stationary. Suppose that
PSMR20t and EXMt are cointegrated; then they
share a common stochastic trend (Stock and Watson 2011).
Heij et al. (2004) listed four steps in the analysis of cointegration and VEC models. In
Step 1, test whether the trend of the time series variables is deterministic or
stochastic, using the unit root tests discussed in Section 4.1. If the
null hypothesis of stochastic trends is not rejected, the variables are nonstationary and
the analysis continues with Step 2. In Step 2, test for the presence of cointegration by
using the Johansen trace test. This involves choosing the relevant VEC model equation,
starting with constant terms where trends are included. If the null hypothesis of no
cointegration is not rejected, continue to Step 4; if it is rejected, continue to Step
3. In Step 3, estimate the VEC model with the number of cointegration relations indicated
by the Johansen trace test. Finally, in Step 4, make the variables stationary by taking
first differences and then estimate the VAR model for the differenced variables.
If cointegration has been detected among the variables in a
multivariate model, a long-term equilibrium relation exists between the
series. In that case, a VEC model is estimated instead of a VAR model in order to avoid
misspecification errors in the analysis. The regression equations for two endogenous
variables, namely PSMR20t and EXMt, both I(1) and with lag order 1, can be written as:
$$\Delta PSMR20_t = \alpha_{PSMR20} (PSMR20_{t-1} - \beta EXM_{t-1}) + e_{PSMR20,t} \tag{4.14}$$
$$\Delta EXM_t = \alpha_{EXM} (PSMR20_{t-1} - \beta EXM_{t-1}) + e_{EXM,t} \tag{4.15}$$
In these expressions, $\alpha_{PSMR20} = -a_{12} a_{21} / (1 - a_{22})$, $\beta = (1 - a_{22}) / a_{21}$ and
$\alpha_{EXM} = a_{21}$, where the $a_{ij}$ are the coefficients of the underlying first-order VAR.
Equations (4.14) and (4.15) also show that PSMR20t and EXMt change
in response to $PSMR20_{t-1} - \beta EXM_{t-1}$, which is the previous period's deviation from
long-run equilibrium (Enders 2004).
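The adjustment mechanism in Equations (4.14) and (4.15) can be traced with a small numerical sketch. The adjustment speeds, the cointegrating coefficient and the starting values below are hypothetical, and the random shocks are set to zero so that only the error-correction effect is visible.

```python
def vec_step(p_prev, x_prev, alpha_p, alpha_x, beta):
    """One period of error correction with the shocks set to zero.

    Returns the changes in the two variables implied by last period's
    deviation from the long-run relation p = beta * x.
    """
    deviation = p_prev - beta * x_prev
    return alpha_p * deviation, alpha_x * deviation


# Hypothetical starting point: p lies 2 units above its long-run value.
p, x = 10.0, 4.0
d_p, d_x = vec_step(p, x, alpha_p=-0.5, alpha_x=0.25, beta=2.0)
p_next, x_next = p + d_p, x + d_x

# Deviation from long-run equilibrium after the adjustment.
deviation_next = p_next - 2.0 * x_next
```

With these particular numbers the deviation is eliminated in one period; with smaller adjustment speeds it would shrink gradually instead.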
4.4 Forecast and Forecasting Evaluation
There are two periods in this analysis, an in-sample and an out-of-sample period. The
in-sample period is January 2000 to December 2009, which is
used to specify and estimate the ARIMA and VAR or VEC models.
In order to evaluate the out-of-sample forecasting ability, the observations
at the end of the sample are not included in estimating the models. The out-of-sample
period is January 2010 to September 2011. Out-of-sample forecasts can be generated in two
ways, dynamic and static. Dynamic forecasts use the models
to predict multiple steps ahead, while static forecasts predict one step
ahead.
This study will consider the case of dynamic forecasts. The estimated models
produce forecasts up to six steps ahead (six future months). These forecasts are used to
aid in trading NR. The models specified on the in-sample period are
used to produce the multiple-step-ahead forecasts. The approach is to estimate the
model recursively and forecast ahead a specific number of observations (Tashman 2000).
Technically, the model first computes the one-step-ahead forecast, uses this forecast to
compute the two-step-ahead forecast, and continues until the desired forecast horizon has
been reached. For example, suppose the time series are available from January 2000 to
September 2011 and one wishes to forecast six steps ahead (i.e., October 2011 to March
2012). In that case, one could initially estimate the model over the period January 2000
to December 2009 and forecast six steps ahead (i.e., January 2010 to June 2010). Then, one
re-estimates the model over the period January 2000 to January 2010 and forecasts another
six steps ahead (i.e., February 2010 to July 2010). The process is repeated until
there is no out-of-sample observation left in the available data. With 21 out-of-sample
months, this produces 21 one-step-ahead forecasts, 20 two-step-ahead forecasts, 19
three-step-ahead forecasts, 18 four-step-ahead forecasts, 17 five-step-ahead forecasts,
and 16 six-step-ahead forecasts.
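The forecast counts above follow from the recursive scheme itself. The short Python sketch below, with a hypothetical function name, counts how many h-step-ahead forecasts have a matching actual value when the forecast origin is moved forward one month at a time through a 21-month holdout.

```python
def forecast_counts(n_holdout, max_horizon):
    """Count the h-step-ahead forecasts whose target lies inside the holdout."""
    counts = {h: 0 for h in range(1, max_horizon + 1)}
    # origin = number of holdout months already added to the estimation sample;
    # origin 0 corresponds to the model estimated through December 2009.
    for origin in range(n_holdout):
        for h in range(1, max_horizon + 1):
            if origin + h <= n_holdout:  # an actual value exists to compare with
                counts[h] += 1
    return counts


# January 2010 to September 2011 is 21 months; horizons of 1 to 6 months.
counts = forecast_counts(n_holdout=21, max_horizon=6)
```

In general the number of usable h-step forecasts is the holdout length minus h plus one.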
Tests of forecast accuracy are based on the difference between the forecast
value of the dependent variable at time t and its actual value
at time t. The closer together the two are, the smaller the forecast error and the better
the forecast. There are several measures of forecasting accuracy, most of which are
based on an average of the errors between the actual and forecast values. The
forecasting error is denoted as:
$$e_t = PSMR20_t - PSMR20_t^F \tag{4.16}$$
where $e_t$ is the error between the actual and forecast price at time t, $PSMR20_t$ is
the actual price and $PSMR20_t^F$ is the forecast price at time t.
The mean absolute error (MAE), the root mean squared error (RMSE), and the Theil
inequality coefficient (TIC) will be used as tools in evaluating the
forecast performance.
$$MAE = \frac{1}{F} \sum_{t=1}^{F} |e_t| \tag{4.17}$$
$$RMSE = \sqrt{\frac{1}{F} \sum_{t=1}^{F} (e_t)^2} \tag{4.18}$$
$$TIC = \frac{\sqrt{\frac{1}{F} \sum_{t=1}^{F} (e_t)^2}}{\sqrt{\frac{1}{F} \sum_{t=1}^{F} (e_t^N)^2}} = \frac{RMSE}{RMSE_N} \tag{4.19}$$
F is the number of out-of-sample observations and N denotes the naïve model of
no change (estimated from the in-sample period).
The MAE and RMSE depend on the scale of the dependent variable and should
be used as relative measures to compare forecasts for the same series. They have been
used extensively in applied and theoretical work on forecasting. The
smaller the MAE and RMSE, the better the forecasting ability of the model. The
RMSE will always be at least as large as the MAE; the two are equal only when all errors
have exactly the same magnitude. The TIC is the ratio of the RMSE of the estimated model
to that of the naïve model; a value close to zero indicates a nearly perfect fit, and a
value below one indicates that the estimated model outperforms the naïve model. The
advantage of the Theil statistic is that it is a unit-free measure, while the MAE and
RMSE depend on the units of the dependent variable.
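The three accuracy measures can be sketched in a few lines of Python. The prices below are hypothetical, and the naïve no-change forecast is assumed here to be the previous period's actual value, which is one common reading of a no-change model rather than a definition given in the text.

```python
import math


def mae(errors):
    """Mean absolute error."""
    return sum(abs(e) for e in errors) / len(errors)


def rmse(errors):
    """Root mean squared error."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def tic(model_errors, naive_errors):
    """Theil coefficient as the ratio RMSE(model) / RMSE(naive)."""
    return rmse(model_errors) / rmse(naive_errors)


# Hypothetical actual prices, model forecasts and previous-period actuals.
actual = [100.0, 110.0, 105.0, 120.0]
model_forecast = [102.0, 108.0, 109.0, 116.0]
previous_actual = [95.0, 100.0, 110.0, 105.0]  # naive no-change forecasts

model_errors = [a - f for a, f in zip(actual, model_forecast)]
naive_errors = [a - f for a, f in zip(actual, previous_actual)]

model_mae = mae(model_errors)
model_rmse = rmse(model_errors)
model_tic = tic(model_errors, naive_errors)
```

A ratio below one means the model's forecasts have a smaller RMSE than the no-change forecasts over the same period.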
Chapter 5
EMPIRICAL RESULTS AND DISCUSSION
5.1 Introduction
The main objective of this research paper is to forecast the price of SMR20,
which is the most commonly traded type of NR. In this chapter, the empirical results of
the methods outlined in Chapter 4 are presented. The forecasting performance of each
model is then analyzed based on the statistical criteria.
5.2 Unit Root Test Results
Table 5.1 Results of Unit Root Tests with Trend (Jan. 2000 to Dec. 2009)
                Coefficient of the Unit Root Test
Variables       Level                        First Difference
LPSMR20t        DF-1 lag       -0.69         DF-0 lag        -8.09*
LPCOt           DF-1 lag       -1.29         DF-0 lag        -8.27*
LEXMt           DF-1 lag       -1.33         DF-0 lag        -8.39*
LTSMVt          ADF-12 lag     -1.62*        ADF-11 lags     -11.74*
Note: *, **, *** indicate significance at the 1%, 5%, and 10% levels, respectively.
Table 5.2 The Augmented Dickey-Fuller t-Statistic For Unit Root Test
Deterministic Regressors      10%       5%        1%
Intercept only                -2.57     -2.86     -3.43
Intercept and time trend      -3.12     -3.41     -3.99
The number of lags in the ADF test is chosen based on the Schwarz Information
Criterion so that the highest lag significantly different from zero is taken into account.
With an ADF critical value of approximately -3.41 at the 5% level (time trend included),
the tests fail to reject the null hypothesis that LPSMR20t, LPCOt, LEXMt, and LTSMVt in
levels have a unit root, according to the results in Table 5.1.
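The decision rule can be written out explicitly using the statistics in Table 5.1 and the 5% critical value with trend from Table 5.2. The Python sketch below is illustrative only; the variable and function names are hypothetical, and applying the same critical value to the first-differenced series is an assumption made here for simplicity.

```python
CRITICAL_5PCT_TREND = -3.41  # Table 5.2, intercept and time trend, 5% level

# Test statistics from Table 5.1.
level_stats = {"LPSMR20": -0.69, "LPCO": -1.29, "LEXM": -1.33, "LTSMV": -1.62}
diff_stats = {"LPSMR20": -8.09, "LPCO": -8.27, "LEXM": -8.39, "LTSMV": -11.74}


def rejects_unit_root(stat, critical=CRITICAL_5PCT_TREND):
    """The unit-root null is rejected when the statistic is below the critical value."""
    return stat < critical


# A series is treated as I(1) when the level has a unit root but the
# first difference does not.
integrated_of_order_one = {
    name: (not rejects_unit_root(level_stats[name]))
    and rejects_unit_root(diff_stats[name])
    for name in level_stats
}
```

Under this rule all four series are classified as I(1).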
Then the four series, LPSMR20t, LPCOt, LEXMt, and LTSMVt, are first
differenced, and the DF and ADF tests are performed on the first-differenced data.
Table 5.1 shows that the DF and ADF statistics are significant, so the null hypothesis of
a unit root is rejected for all the series in first differences. This suggests that
LPSMR20t, LPCOt, LEXMt, and LTSMVt are integrated of order one, I(1). Therefore, the ARIMA
and VAR or VECM analysis is carried out in first differences, which are denoted as
DLPSMR20t, DLPCOt, DLEXMt, and DLTSMVt.
5.3 Seasonality
The plot of the level prices of SMR20 in Figure 3.1 shows no indication
of a seasonal pattern. The plots of the ACF and PACF in Figures 5.3
and 5.4 hint at seasonal patterns, yet those lags are not significant at
the 5% level. To determine whether there is seasonality in the price of SMR20,
seasonal dummy variables are included in the analysis; if their coefficients are
statistically significant, the dummy variables are important in explaining the variation
of the dependent variable.
5.4 Univariate Model Specifications
The analysis and model are based on the observations of LPSMR20t over the
period January 2000 to December 2009. Having established that the series is
stationary after first differencing (d = 1), the next step is to establish appropriate
lag lengths for an ARIMA model.
Box-Jenkins forecasting models essentially involve examining the
patterns of the autocorrelation function (ACF) and the partial autocorrelation function
(PACF). The estimated ACF and PACF are used as a guideline in choosing one or more
ARIMA models that might fit the available data. These tools are important in
the identification stage since they measure the statistical relationship between
observations in a univariate time series.
5.4.1 Univariate Results
Figures 5.1 and 5.2 show the estimated ACF and PACF of the log of the
original time series, LPSMR20, over the period January 2000 to December 2009. The
estimated ACF falls to zero slowly, while the estimated PACF has a spike at lag 1 and
then cuts off; this pattern corresponds to a nonstationary series.
In Figures 5.3 and 5.4, the ACF decays, starting on the positive side and then
switching to the negative side, while the PACF has a spike at lag 1 and then cuts off to
zero. This pattern is associated with the autoregressive model of order 1, AR(1),
according to the guidelines in Tables 4.3 and 4.4. The following equation represents the
AR(1) model:
$$\Delta LPSMR20_t = \delta + \phi_1 \Delta LPSMR20_{t-1} + e_t \tag{5.1}$$
Figures 5.3 and 5.4 also show that the ACF has spikes at lags 1 and 2
followed by a mixture of exponential decays. This pattern is associated with the
moving average model of order 2, MA(2), with negative coefficients, according to the
guideline in Table 4.5. The following equation represents the MA(2) model:
$$\Delta LPSMR20_t = \alpha - \theta_1 e_{t-1} - \theta_2 e_{t-2} + e_t \tag{5.2}$$
With p = 1 and q = 2, ARIMA(1,1,2) is chosen, according to the guideline in
Table 4.5. Thus, ARIMA(1,1,2) can be expressed as
$$\Delta LPSMR20_t = \gamma + \phi_1 \Delta LPSMR20_{t-1} - \theta_1 e_{t-1} - \theta_2 e_{t-2} + e_t \tag{5.3}$$
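To show how a model of the form in Equation (5.3) generates dynamic forecasts, the Python sketch below iterates the equation forward with future shocks set to their expected value of zero and cumulates the forecast differences back into log levels. The sign convention follows Equation (5.3); the function name and all coefficient and starting values are hypothetical, not the estimates reported in this chapter.

```python
def arima112_forecast(last_level, last_diff, e_last, e_prev,
                      gamma, phi1, theta1, theta2, steps):
    """Dynamic forecasts of the log level from an ARIMA(1,1,2) recursion.

    diff_t = gamma + phi1 * diff_{t-1} - theta1 * e_{t-1} - theta2 * e_{t-2}
    with unknown future shocks replaced by zero.
    """
    forecasts = []
    level, diff = last_level, last_diff
    shocks = [e_last, e_prev]  # most recent residual first
    for _ in range(steps):
        diff = gamma + phi1 * diff - theta1 * shocks[0] - theta2 * shocks[1]
        level += diff
        forecasts.append(level)
        shocks = [0.0, shocks[0]]  # the new period's shock is unknown, so zero
    return forecasts


# Hypothetical pure AR(1) case: each forecast change is half the previous one.
ar_path = arima112_forecast(1.0, 0.2, 0.0, 0.0,
                            gamma=0.0, phi1=0.5, theta1=0.0, theta2=0.0, steps=3)

# Hypothetical pure MA(2) case: the residuals matter for two steps only.
ma_path = arima112_forecast(0.0, 0.0, 0.1, 0.2,
                            gamma=0.0, phi1=0.0, theta1=1.0, theta2=1.0, steps=3)
```

In the moving average case the forecast level stops changing after two steps, which reflects the two-period memory of an MA(2) process.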
Figure 5.1 Autocorrelation of LPSMR20
Figure 5.2 Partial Autocorrelation of LPSMR20
Figure 5.3 Autocorrelation of DLPSMR20
Figure 5.4 Partial Autocorrelation of DLPSMR20
Table 5.3 ARIMA and VEC of PSMR20t
ARIMA
(2,1,1)
Specification
Regressors
AR(1)
AR(2)
MA(1)
1.13***
(11.93)
-0.22**
(2.41)
0.98***
(81.37)
(1)
ARIMA
(2,1,1)
Seasonal
Dummy
Intercept
January
February
March
April
May
June
July
August
September
October
November
0.30***
(7.82)
0.02***
(7.42)
(3)
ARIMA
(1,1,2)
Seasonal
Dummy
ARIMA
(2,1,2)
ARIMA
(2,1,2)
Seasonal
Dummy
-0.11
(0.21)
0.14
(0.88)
-0.90***
(25.11)
-0.56
(1.33)
1.44***
(4.52)
-0.52*
(1.80)
1.48***
(5.08)
-0.56**
(2.09)
0.37
(0.7)
1.17***
(11.39)
0.82*
(1.94)
- 1.32***
(3.77)
-1.33**
(4.05)
0.19*
(1.94)
0.21*
(1.69)
0.34
(0.99)
0.35
(1.09)
MA(2)
D2008Q4
(2)
ARIMA
(1,1,2)
-0.31***
(6.67)
-0.34***
(8.18)
0.01
(0.42)
0.06**
(2.12)
0.03
(1.15)
-0.01
(0.19)
0.01
(0.05)
0.01
(0.34)
0.02
(0.60)
-0.10
(0.35)
0.01
(0.33)
0.01
(0.04)
0.02
(0.79)
0.01
(0.64)
0.02***
(2.79)
-0.27***
(7.34)
0.01
(0.45)
0.05**
(2.05)
0.03
(1.00)
-0.01
(0.36)
0.01
(0.12)
0.01
(0.21)
0.02
(0.57)
-0.01
(0.36)
0.01
(0.31)
0.00
(0.02)
0.02
(0.77)
0.01
(0.46)
0.02***
(8.88)
0.01
(0.28)
0.06**
(2.07)
0.03
(1.09)
0.03
(0.20)
0.01
(0.18)
0.01
(0.40)
0.02
(0.77)
0.01
(0.16)
0.01
(0.49)
0.01
(0.20)
0.02
(0.77)
0.01
(0.41)
Note: ***, **, and * denote significance at the 1%, 5% and 10% levels, respectively.
The values of the t-statistics are in parentheses.
Table 5.4 Statistical Results of ARIMA and VEC of PSMR20t
ARIMA
(2,1,1)
(1)
ARIMA
(2,1,1)
Seasonal
Dummy
0.36
(2)
(3)
ARIMA
ARIMA
(2,1,2)
(2,1,2)
Seasonal
Dummy
0.40
0.39
0.39
ARIMA
(1,1,2)
Seasonal
Dummy
0.37
-2.17
-2.56
-2.17
-2.55
-2.18
-2.59
-2.55
-2.67
-2.55
-2.69
-2.59
0.96
0.03
-.60
-0.56
0.95
1.08
0.37
0.61
1.03
0.99
0.98
Q(16) =
7.37
3.70
Q(16) =
4.81
7.57
Q(16) =
5.53
5.41
Q(16) =
5.09
9.00
Q(16) =
9.01
1.89
Q(16) =
7.44
1.37
0.6422
0.6219
0.5129
0.4868
0.5785
0.5546
0.5583
0.5344
1.0705
1.0549
0.8714
0.8560
Specification
Adjusted R2
0.40
BIC
-2.70
AIC
Inverted AR
Roots
Inverted MA
Roots
Ljung-Box
Q-statistic
Jarque_Bera
RMSE
MAE
-0.98
ARIMA
(1,1,2)
As discussed in Chapters 2 and 4, due to the global recession in 2008, the prices
of NR plummeted in the fourth quarter of 2008. Therefore, a dummy variable,
D2008Q4, is included in the regression analysis in order to account for the large
price change in that quarter.
Besides ARIMA(1,1,2), several other
ARIMA models can be constructed from the characteristics of the ACF and PACF plots
in Figures 5.3 and 5.4. Table 5.3 reports these alternative ARIMA models,
which yield different statistical results. One can compare all the models and determine
which stand out in terms of their statistical results and satisfy the seven stages
discussed in Chapter 4.
Tables 5.3 and 5.4 show that the seasonal dummy variables are not statistically
significant. Also, including the seasonal dummies causes the coefficients of the models
to become statistically insignificant. Therefore, it appears that there is no seasonality
in the series PSMR20t.
Having determined that seasonality is absent from the series, let us analyze the
statistical results of the ARIMA models. First, for a model to be stationary, the
inverted AR roots must be less than one in absolute value. Tables 5.3 and 5.4 show that
the inverted AR roots for all models are less than 1, which implies that all models
satisfy the stationarity condition. Second, the MA terms must satisfy the invertibility
condition, under which the inverted MA roots must be less than 1 in absolute value. In
that case, only the ARIMA(2,1,1), ARIMA(1,1,2) and ARIMA(2,1,2) specifications meet the
requirement. Next, the model should have well-estimated coefficients, which
means that all the coefficients on the right-hand side must be statistically significant.
In that regard, only ARIMA(1,1,2) and ARIMA(2,1,1) have all coefficients statistically
significant at the 5% or 1% level.
The process of elimination leaves only two models: ARIMA(1,1,2) and
ARIMA(2,1,1). The Ljung-Box statistics of ARIMA(2,1,1) and ARIMA(1,1,2) are not
statistically significant, according to the Ljung-Box values reported in Table 5.4, so
the tests fail to reject the null hypothesis of no remaining autocorrelation in the
residuals of the models. In that case, the prices of SMR20 can be fitted by either an
ARIMA(2,1,1) or an ARIMA(1,1,2) model. With JB statistics of 3.70 and 5.41, the tests
fail to reject the null hypothesis, which indicates that the residuals can be considered
normally distributed at the 5% level. Finally, the MAE and RMSE for ARIMA(1,1,2) are
smaller than those for ARIMA(2,1,1), which indicates that ARIMA(1,1,2) fits the available
data (the past) well enough to satisfy the analysis. The ACF and PACF graphs in Figures
5.3 and 5.4 also show that the models are satisfactory.
5.5 Multivariate Results
The empirical results of the multivariate time series models, the VAR and VEC models,
are presented in this section. The ADF test results in Table 5.1 fail
to reject the null hypothesis of a stochastic trend in LPSMR20t, LPCOt and LEXMt. The
ADF tests on the first-differenced series DLPSMR20t, DLPCOt, DLEXMt, and
DLTSMVt show that these series are stationary. Therefore, the level series
are integrated of order 1.
5.5.1 Cointegration and VEC model
Applying the Johansen test for cointegration, the trace statistic rejects
the null hypothesis of no cointegration at the 5% level (49.83 > 47.86); hence, there is
at least one cointegration relation between the variables. The null hypothesis of at most
one cointegration relation (r = 1) is not rejected at the 5% level (11.80 <
29.80) (Table 5.5). The tests confirm that the variables are cointegrated.
Table 5.5 Results of Cointegration Tests
Hypothesized No. of CE(s)    Eigenvalue    Trace Statistic    5% Critical Value
None                         0.130080      49.82959           47.85613
At most 1                    0.068932      11.80386           29.79707
At most 2                    0.030729      3.590188           15.49471
At most 3                    0.000001      0.000974           3.441466
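The trace test is read sequentially, starting from the hypothesis of no cointegration and stopping at the first hypothesis that is not rejected. The Python sketch below applies that rule to the values in Table 5.5; the function name is hypothetical, and the code only reproduces the decision logic, not the Johansen estimation itself.

```python
# (hypothesized number of cointegration relations, trace statistic, 5% critical value)
trace_table = [
    (0, 49.82959, 47.85613),
    (1, 11.80386, 29.79707),
    (2, 3.590188, 15.49471),
    (3, 0.000974, 3.441466),
]


def cointegration_rank(rows):
    """Return the first hypothesized rank that the trace test fails to reject."""
    for rank, trace_stat, critical in rows:
        if trace_stat < critical:
            return rank
    return len(rows)  # every hypothesis rejected


rank = cointegration_rank(trace_table)
```

With the values in Table 5.5 the rule stops at one cointegration relation.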
Based on the discussion in Chapter 4, a VEC model is used when the
variables are cointegrated. According to the AIC and BIC criteria, a VEC model
with one lag for each variable is estimated. Therefore, one lag is included in the
cointegration equations, as suggested by EViews, and incorporated in a VEC model for the
price of SMR20 (Table 5.6).
Table 5.6 AIC and BIC Criterions From EViews
Lag    LogL        LR          FPE          AIC           SC            HQ
0      593.0104    NA          4.85E-09     -10.63082     -10.55759     -10.60111
1      621.318     54.57483    3.42e-09*    -10.97870*    -10.68578*    -10.85987*
2      625.5368    7.905638    3.73E-09     -10.89256     -10.37994     -10.6846
3      631.625     11.07929    3.94E-09     -10.84009     -10.10778     -10.54301
4      637.2915    10.00575    4.19E-09     -10.78003     -9.82803      -10.39383
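The starred entries in Table 5.6 mark the lag length with the smallest value of each criterion. The Python sketch below reproduces that selection from the tabulated AIC, SC and HQ values; the dictionary layout and names are only an illustrative arrangement of the table.

```python
# Criterion values from Table 5.6, indexed by lag length 0 to 4.
criteria = {
    "AIC": [-10.63082, -10.97870, -10.89256, -10.84009, -10.78003],
    "SC": [-10.55759, -10.68578, -10.37994, -10.10778, -9.82803],
    "HQ": [-10.60111, -10.85987, -10.6846, -10.54301, -10.39383],
}

# For each criterion, choose the lag with the minimum value.
selected_lag = {
    name: min(range(len(values)), key=values.__getitem__)
    for name, values in criteria.items()
}
```

All three criteria select one lag, in agreement with the lag length used for the VEC model.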
The existence of cointegration between the variables suggests a long-term
relationship among them. Therefore, the long-run equilibrium relation is
estimated by EViews, as shown in Table 5.7.
Table 5.7 The Estimated Long-Run Relationship for One Cointegration Vector
Variable        LPSMR20    LPCO       LEXM         PTSMV      Constant
Coefficients    1          2.67***    -11.30***    4.63***    -3.28***
                           (6.08)     (2.72)       (3.46)     (5.15)
Note: ***, **, and * denote significance at the 1%, 5% and 10% levels, respectively.
In Table 5.7, all the coefficients are statistically significant at the 1% level. The
results suggest that a 1% increase in the price of crude oil leads to a 2.67% increase in
the price of NR. This interpretation is consistent with economic theory since NR and SR
are complement goods. Yet, a 1% increase in the exchange rate between MYR and USD
causes the price of NR to drop by 11.30%. If the currencies of producing countries
appreciate against the dollar, the goods being traded are undervalued. However, the
price of NR would rise by 4.63% if the demand for motor vehicles in the United
States increased by 1%. This reflects the theory of supply and demand:
when the quantity demanded goes up, the price increases.
Having determined the estimated long-run relationship between LPSMR20t,
LPCOt, LEXMt and LTSMVt, the VEC model can be constructed and expressed as in
Equation 5.4 and Table 5.8.
$$\Delta LPSMR20_t = \alpha_1 (LPSMR20_{t-1} + 2.67 LPCO_{t-1} - 11.30 LEXM_{t-1} + 4.63 PTSMV_{t-1} - 3.28) + \gamma_{11} \Delta LPSMR20_{t-1} + \gamma_{12} \Delta LPCO_{t-1} + \gamma_{13} \Delta LEXM_{t-1} + \beta_1 D2008Q4 + e_{1t} \tag{5.4}$$
LPSMR20t is the log of the price of SMR20 at time t, $\alpha_1$ is the coefficient on the
error correction term, and $\gamma_{11}$, $\gamma_{12}$, and $\gamma_{13}$ are the coefficients to be
estimated on the autoregressive terms. $\beta_1$ is the coefficient on the exogenous dummy
variable for the global recession in late 2008. $e_{1t}$ is the innovation, which is assumed
to be serially uncorrelated.
Table 5.8 reveals that the coefficients of the lagged value of DLPCO and of D2008Q4
are statistically significant at the 1% level.
Table 5.8 VEC(1,1) Model for PSMR20t (Jan. 2000 to Dec. 2009)
Dependent variable DLPSMR20t
Regressors
Cointegration
Equation (Table 5.6)
Adjusted R2
-0.003
(1.40)
-0.049
(1.57)
-0.022
(1.29)
-1.880***
(2.91)
-0.108
(0.65)
-0.234***
(4.81)
0.022
(1.44)
0.37
Jarque_Bera
4.75
DLPSMR20t-1
DLPCO t-1
DLEXM t-1
DLTSMV t-1
D2008Q4
Intercept
RMSE
0.3896
MAE
0.3624
Note: ***, **, and * denote significance at the 1%, 5% and 10% levels, respectively.
The values of the t-statistics are in parentheses.
5.6 Evaluations and Forecast Accuracy
There are two fitted models in this study, ARIMA(1,1,2) and VEC(1,1), which are
estimated over the period January 2000 to December 2009. Tables 5.9 and 5.10 below
present forecast statistics for the SMR20 price series over the out-of-sample period
January 2010 to September 2011. Once again, the estimated models are used to forecast
six steps ahead; the models are then recursively re-estimated, stepping forward one month
at a time, and again used to forecast six steps ahead until September 2011.
Table 5.9 Forecast Statistics for ARIMA of LPSMR20t (Jan. 2010 to Sep. 2011)
Steps          MAE       RMSE      TIC       Number of Obs.
1              0.5520    0.5887    0.0564    21
2              0.5512    0.5885    0.0563    20
3              0.5642    0.6025    0.0575    19
4              0.5546    0.5889    0.0562    18
5              0.5510    0.5895    0.0562    17
6              0.5423    0.5829    0.0558    16
Average 1-6    0.5526    0.5902    0.0564
Table 5.10 Forecast Statistics for VEC of LPSMR20t (Jan. 2010 to Sep. 2011)
Steps          MAE       RMSE      TIC       Number of Obs.
1              0.3574    0.4280    0.0598    21
2              0.3903    0.4254    0.0413    20
3              0.3883    0.4233    0.0411    19
4              0.3878    0.4224    0.0410    18
5              0.3863    0.4208    0.0408    17
6              0.3838    0.4182    0.0406    16
Average 1-6    0.3823    0.4230    0.0441
Tables 5.9 and 5.10 indicate that the forecast statistics from the multivariate model
VEC(1,1) clearly outperform those of the univariate model ARIMA(1,1,2). The RMSE of the
ARIMA forecasts is higher than that of the VEC forecasts at every horizon. The RMSE for
ARIMA varies from 0.5829% to 0.6025%, compared to approximately 0.4230% for the VEC
model. The average RMSEs of ARIMA(1,1,2) and VEC(1,1) are 0.5902% and 0.4230%,
respectively. Once again, out of the two models, VEC(1,1) provides better forecasting due
to its lower RMSE. Also, the TIC statistics are consistently below unity, which indicates
that the estimated models outperform the simple naïve model; their values close to zero
indicate nearly perfect fits, so the forecasting performance of the estimated models is
satisfactory.
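The averages quoted from Tables 5.9 and 5.10 can be checked directly from the horizon-by-horizon RMSE values. The short Python sketch below does so; the list and variable names are illustrative only.

```python
# RMSE by forecast horizon (1 to 6 steps), from Tables 5.9 and 5.10.
arima_rmse = [0.5887, 0.5885, 0.6025, 0.5889, 0.5895, 0.5829]
vec_rmse = [0.4280, 0.4254, 0.4233, 0.4224, 0.4208, 0.4182]


def mean(values):
    return sum(values) / len(values)


avg_arima = round(mean(arima_rmse), 4)  # 0.5902, as in Table 5.9
avg_vec = round(mean(vec_rmse), 4)      # 0.423, as in Table 5.10

# True when the VEC model has the lower RMSE at every horizon.
vec_wins_every_horizon = all(v < a for v, a in zip(vec_rmse, arima_rmse))
```

On these figures the VEC model has the lower RMSE at each of the six horizons, not only on average.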
Table 5.11 Six-Step Forecast for PSMR20t and Model Evaluations
Date          Actual Price    ARIMA (1,1,2)    VEC (1,1)
Jun-10        2,888.90        3,285.26         3,333.56
Jul-10        2,884.70        3,351.71         3,501.57
Aug-10        3,065.40        3,419.51         3,672.53
Sep-10        3,369.00        3,488.68         3,843.05
Oct-10        3,774.10        3,559.25         3,927.45
Nov-10        4,170.60        3,631.25         3,963.79
Dec-10        4,584.40        3,704.71         3,999.08
Jan-11        5,230.40        3,779.65         4,232.70
Feb-11        5,560.80        3,856.11         4,351.63
Mar-11        4,767.40        3,934.11         4,466.22
Apr-11        5,007.30        4,013.69         4,614.74
May-11        4,491.20        4,094.88         4,676.30
Jun-11        4,527.90        4,177.71         4,709.78
Jul-11        4,535.80        4,262.22         4,888.39
Aug-11        4,587.70        4,348.44         4,993.83
Sep-11        4,489.70        3,285.26         3,333.56
Avg. RMSE                     0.5887           0.4287
Avg. MAE                      0.5520           0.3574
Avg. TIC                      0.0564           0.0598
The approach to six-step-ahead forecasting is discussed in Chapter 4. Table
5.11 compares the dynamic six-step-ahead forecasts of each
model with the actual values of PSMR20t. Table 5.12 shows the percentage difference
between the forecast and actual values at time t. The large differences between the
forecast and actual values indicate that the models do not perform well when the time
horizon is extended. Also, the RMSE and MAE for the VEC(1,1) model are smaller
than those for ARIMA(1,1,2), which signifies that the multivariate forecasting model
performs better in terms of statistical results.
For one-step-ahead forecasting, all the models do a good job of forecasting the
prices in terms of proximity between actual and forecast values (Figures 5.5 and 5.6);
however, all the models fail to predict the sharp turns in the periods
April 2010 to June 2010 and February 2011 to June 2011. Also, the percentage
differences between the actual values and the multivariate model, VEC(1,1), are in most
months smaller than those for the univariate model, ARIMA(1,1,2).
Table 5.12 Percentage Difference Between the Forecast and Actual Values
Date      ARIMA (1,1,2)    VEC (1,1)
Jun-10    -11.47%          -14.48%
Jul-10    -13.89%          -15.56%
Aug-10    -9.34%           -14.23%
Sep-10    -10.50%          -9.01%
Oct-10    7.56%            -1.83%
Nov-10    14.66%           5.83%
Dec-10    20.79%           13.54%
Jan-11    29.17%           23.54%
Feb-11    32.03%           23.88%
Mar-11    19.12%           8.72%
Apr-11    21.43%           10.81%
May-11    10.63%           -2.75%
Jun-11    9.56%            -3.28%
Jul-11    7.89%            -3.84%
Aug-11    7.09%            -6.55%
Sep-11    3.15%            -11.23%
Figure 5.5 One-Step Ahead Forecast of ARIMA(1,1,2) (actual and forecast prices in USD/MT by date, Jan-10 to Sep-11)
Figure 5.6 One-Step Ahead Forecast of VEC(1,1) (actual and forecast prices in USD/MT by date, Jan-10 to Sep-11)
Chapter 6
CONCLUSIONS AND IMPLICATIONS
6.1 Conclusion
The ARIMA(1,1,2) and VEC(1,1) models provide satisfactory
outcomes in terms of statistical results and predicted values for out-of-sample
forecasting. The average RMSE and MAE for the univariate ARIMA model are
larger than those for the multivariate VEC model. Furthermore, the predicted values
from the one-step-ahead out-of-sample forecasts are close to the actual values.
The results from the empirical analysis indicate that there is a positive
relationship between the price of NR and the price of crude oil, which is consistent with
economic theory since the two commodities are considered complementary goods.
Furthermore, with increasing demand for motor vehicles, the price of NR climbs.
This reflects the theory of supply and demand. Yet, the
price of NR declines if the currencies of producing countries appreciate against the
US dollar. Also, the multivariate model outperforms the univariate model in terms of
statistical results and forecast values.
6.2 Remarks and Future Research
This paper demonstrates that simple forecasting techniques provide valuable
information for improving planning and decision-making. Currently, as one of
the producers and distributors of NR in Cambodia, I can adopt these methods of
forecasting to develop a market strategy to work with local farmers and improve the
well-being of rubber producers. While several models and techniques were employed in this
thesis, there is room to further improve the models in order to introduce
this study to the Department of Rubber Development in Cambodia, where certain policies
can be implemented to maximize the welfare of rubber farmers.
REFERENCES
Allen, P.G. (1994). Economic forecasting in agriculture. International Journal of
Forecasting, 10(1), 81-135. Retrieved from
http://www.sciencedirect.com/science/article/pii/0169207094900523
ANRPC. (n.d.) Natural rubber trends and statistic. Retrieved March 15, 2011, from
http://www.anrpc.org/html/archive.aspx
Bailey, C.D. & Gupta S. (1999). Judgment in learning-curve forecasting: a laboratory
study. Journal of Forecasting. 18(1), 39-57. Retrieved from
http://onlinelibrary.wiley.com/doi/10.1002/(SICI)1099131X(199901)18:1%3C39::AID-FOR683%3E3.0.CO;2-N/abstract
Burger, K., & Smit, H. P. (1989). Long term and short term analysis of the natural rubber
market. Review of World Economics, 125(4), 718-747. Retrieved from
http://www.springerlink.com/content/d537678331523827/
Burger, K., Smit, H. P., & Vogelvang, B. (2002). Exchange rates and natural rubber prices,
the effect of the Asian crisis. 10th EAAE Congress: Exploring Diversity in the
European agri-food system. Retrieved from
http://ageconsearch.umn.edu/bitstream/24958/1/cp02bu31.pdf
Dickey, D. A., & Fuller, W. A. (1979). Distribution of the estimators for autoregressive
time series with a unit root. Journal of the American Statistical Association,
74(366), 427-431. Retrieved from
http://www.deu.edu.tr/userweb/onder.hanedar/dosyalar/1979.pdf
Enders, W. (2004). Applied econometric time series (2nd Edition). New York, NY:
Wiley.
Feder, G., Just, R. E., & Schmitz, A. (1980). Futures markets and the theory of the firm
under price uncertainty. The Quarterly Journal of Economics, 94(2), 317-328.
Retrieved from http://qje.oxfordjournals.org/content/94/2/317
Frank, J., & Garcia, P. (2010). How strong are the linkages among agricultural, oil, and
exchange rate markets? In Proceedings of the NCCC-134 Conference on Applied
Commodity Price Analysis, Forecasting, and Market Risk
Management. Retrieved from
http://www.ucema.edu.ar/conferencias/download/2010/11.06.pdf
Harvey, A.C., & Todd, P.H.J. (1983). Forecasting economic time series with structural
and Box-Jenkins models (with discussion). Journal of Business and Economic
Statistics, 1(4), 299-315. Retrieved from http://www.jstor.org/pss/1391661
Heij, C., de Boer, P., Franses, P.H., Kloek, T., & van Dijk, H.K. (2004). Econometric methods
with applications in business and economics. New York, NY: Oxford University
Press.
Khin, A. A. (2011). Econometrics forecasting models for short term natural rubber prices.
Saarbrücken, Germany: Lambert Academic Publishing.
Meyler, A., Kenny, G., & Quinn, T. (1998). Forecasting Irish inflation using ARIMA
models. Central Bank of Ireland Technical Paper. Retrieved from
http://mpra.ub.uni-muenchen.de/11359/1/cbi_3RT98_inflationarima.pdf
Pankratz, A. (1983). Forecasting with univariate Box-Jenkins models. Hoboken, NJ: John
Wiley & Sons, Inc.
Pindyck, R.S., & Rubinfeld, D. L. (1998). Econometric models and economic forecasts.
New York, NY: Irwin/McGraw-Hill.
Phillips, P.C.B., & Perron, P. (1988). Testing for a unit root in time series regression.
Biometrika, 75(2), 335-346. Retrieved from
ftp://ftp.icesi.edu.co/jcalonso/Analisis%20de%20Series%20de%20tiempo/Paper/PP1988.pdf
Romprasert, S. (2009). Forecasting model of RSS3 price in futures market. Kasetsart
University Journal of Economics, 16(1), 54-74. Retrieved from
http://journal.eco.ku.ac.th/upload/document/thai/20090958015458.pdf
Sims, C.A. (1980). Macroeconomics and reality. Econometrica, 48(1), 1-48. Retrieved from
http://www.ekonometria.wne.uw.edu.pl/uploads/Main/macroeconomics_and_reality.pdf
Stock, J.H., & Watson, M.W. (2011). Introduction to Econometrics (3rd Edition). Boston,
MA: Pearson.
Tashman, L. (2000). Out-of-sample tests of forecasting accuracy: an analysis and review.
International Journal of Forecasting, 16(4), 437-450. Retrieved from
http://www.sciencedirect.com/science/article/pii/S0169207000000650
UNCTAD. (n.d.). Market Information in the Commodities Area: The Main Uses of NR.
Retrieved August 11, 2011, from
http://unctad.org/infocomm/anglais/rubber/uses.htm
Zant, W. (1994). Price and stock information with rational expectations in the Indian
natural rubber market. Discussion Paper, Tinbergen Institute, Amsterdam.
Retrieved from ftp://zappa.ubvu.vu.nl/19930020.pdf