The 10th International PSU Engineering Conference

May 14-15, 2012
Comparison of Support Vector Machine Kernel Functions
for Unsmoked Sheet Rubber Price Forecasting
Abstract
The development of a natural rubber price forecasting model is an interesting and important topic for all stakeholders in the automobile tire industry. Natural rubber, particularly unsmoked sheet rubber, accounts for roughly 6 to 36 percent of an automobile tire. However, the factors that influence the natural rubber price are volatile and form non-stationary datasets, which makes forecasting a challenging problem. The selection of a forecasting technique appropriate for this type of data therefore deserves careful attention. The support vector machine (SVM) technique has the potential to serve as a powerful tool for building a forecasting model on non-stationary unsmoked sheet rubber price data. In addition, an SVM-based forecasting model requires the selection of a suitable kernel function and constraint parameter. The work presented in this paper examines the effect of four kernel function types, namely linear, polynomial, sigmoid, and radial basis function. The unsmoked sheet rubber price is forecast against monitored time series of forward contract rubber prices, crude oil prices, and foreign currency exchange rates. The effect of the constraint parameter is also studied. The experiments show that the radial basis kernel function yields the highest recognition rate for unsmoked sheet rubber price forecasting.
Keywords: Price forecasting model, Unsmoked sheet rubber, Support vector machine, Kernel function, Radial basis function
1. Introduction
In the automobile tire industry, the price of unsmoked sheet rubber (USS rubber) is the initial raw material cost of the supply chain. Precise multiple-period-ahead price forecasts are therefore a major input to strategic planning, which gives a business a wide range of advantages in its operating activities [1], for instance in pricing management, inventory management, treasury management, and investment budgeting. As a result, the USS rubber price forecasting model has become one of the challenging applications of modern time series forecasting.
Statistical forecasting models have long been widely used for time series natural rubber price forecasting, e.g. Box and Jenkins's Auto-Regressive Integrated Moving Average (ARIMA) technique, Holt-Winters smoothing without a seasonal component, Simple Exponential Smoothing (SES), and Ordinary Least Squares (OLS). They are, however, general univariate models developed under the assumption that the forecast time series is linear and stationary [2-5]. Moreover, they are applicable only to short-term forecasting.
In recent years, Artificial Neural Networks (ANNs), whose main domain is pattern classification and pattern recognition, have been introduced to the time series forecasting field [6-8]. ANNs perform better than statistical models on natural rubber price forecasting, specifically for more irregular series, for multivariate data, and for multiple-period-ahead forecasting [9-12]. Although ANNs show better performance, they have several disadvantages: (i) dependency on a large number of parameters, e.g. network size, learning parameters, and the choice of initial weights; (ii) the possibility of being trapped in local minima, resulting in very slow convergence; and (iii) over-fitting on training data, resulting in poor generalization ability.
Later, the Support Vector Machine (SVM) emerged as a new and powerful technique for learning from data in many fields. SVM is particularly effective at solving classification and regression problems [13-16]. Its main advantage is that it minimizes structural risk, as opposed to the empirical risk minimization employed by the ANN technique. The SVM technique also copes well with non-stationary data that show no obvious trend and no seasonality, which matches the data patterns of the USS rubber price and its influential factors. The aim of this paper is to investigate the effect of various kernel function types on predicting the USS rubber price at Hatyai's central rubber market. Forecasting performance is evaluated by the recognition rate of the upward or downward trend of the USS rubber market price. The effect of the choice of the SVM regularization parameter on prediction error is also investigated.
2. Support Vector Machine
The Support Vector Machine is a kernel-based classifier and represents a major development in machine learning algorithms. It extends the generalized portrait algorithm developed by Vladimir Vapnik to nonlinear models [17]. Support vector machines (SVMs) are a group of supervised learning methods that can be applied to classification (pattern recognition) and regression (function approximation). The goal of SVM modeling is to find the maximum margin hyperplane that best separates the classes: the hyperplane whose distance from the nearest training examples of each cluster of vectors is maximal. Cases with one category of the target variable lie on one side of the plane and cases with the other category lie on the other side. The hyperplane obtained is therefore called the optimal separating hyperplane (OSH). The training examples that are closest to the maximum margin hyperplane are called support vectors.
A unique feature of SVMs is that they are resistant to the over-fitting problem. This is because SVM implements the structural risk minimization principle, whereas ANN implements the empirical risk minimization principle. The latter seeks to minimize the misclassification error, or deviation from the correct solution, on the training data, while the former minimizes an upper bound on the generalization error. Furthermore, SVMs possess the well-known ability of being universal approximators of any multivariate function to any desired degree of accuracy. For this reason, they are of particular interest for modeling unknown, partially known, highly nonlinear, or complex systems, plants, or processes [18].
2.1 Support Vector Classification Model
If the data is linearly separable, a hyperplane
separating binary decision classes in the two attribute
case can be represented as the following equation:
y = w_0 + w_1 x_1 + w_2 x_2        (1)
where y is the outcome, the x_i are the attribute values, and the three weights w_i are learned by the learning algorithm. The maximum margin hyperplane can be represented in terms of the support vectors by the following equation:
y = b + Σ_i α_i y_i (x(i) · x)        (2)
where y_i is the class value of training example x(i), the vector x represents a test example, the vectors x(i) are the support vectors, and · denotes the dot product. In this equation, b and the α_i are parameters that determine the hyperplane. Finding the support vectors and determining the parameters b and α_i is equivalent to solving a linearly constrained quadratic programming problem.
If the data is not linearly separable, as in this case, SVM transforms the inputs into a high-dimensional feature space. This is done by using a kernel function as follows:

y = b + Σ_i α_i y_i K(x(i), x)        (3)
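To make equation (3) concrete, the following minimal sketch reconstructs a trained SVM's decision values directly from its support vectors, dual coefficients α_i y_i, and bias b. It assumes scikit-learn (the paper does not name its software), and the data and parameter values are purely illustrative.

```python
# Equation (3) is exactly what a trained kernel SVM evaluates.
# scikit-learn's SVC stores alpha_i * y_i in dual_coef_, the support
# vectors x(i) in support_vectors_, and the bias b in intercept_.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
clf = SVC(kernel="rbf", gamma=0.5, C=10).fit(X, y)

x_test = X[:5]
# Manual evaluation of y = b + sum_i alpha_i y_i K(x(i), x):
K = rbf_kernel(clf.support_vectors_, x_test, gamma=0.5)  # K(x(i), x)
manual = clf.dual_coef_ @ K + clf.intercept_
assert np.allclose(manual.ravel(), clf.decision_function(x_test))
```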
2.2 Parameter and Kernel Function
SVM is a kernel-based algorithm. A kernel is a function that transforms the input data to a high-dimensional space where the problem is solved. Kernel functions can be linear or nonlinear. In the simplest pattern recognition tasks, SVMs use a linear separating hyperplane to create a classifier with a maximal margin. To do so, the learning problem for the SVM is cast as a constrained optimization problem with a quadratic cost function and linear constraints, i.e. a classic quadratic programming problem. SVM models have a cost parameter, C, that controls the trade-off between allowing training errors and forcing rigid margins: it creates a soft margin that permits some misclassifications and thereby allows some flexibility in separating the categories, as shown in Figure 1. Increasing the value of C increases the cost of misclassified points and forces the creation of a more accurate model that may not generalize well.
Figure 1. Trading off error by parameter C.
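The following sketch illustrates this trade-off on synthetic data (not the paper's dataset), again assuming scikit-learn's SVC. On overlapping classes, larger C typically raises training accuracy while the margin relies on fewer support vectors, at the risk of poorer generalization.

```python
# Illustrative only: how the soft-margin parameter C changes the fit.
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=300, centers=2, cluster_std=3.0,
                  random_state=1)
for C in (1, 10, 100, 1000):          # the C range used in this study
    clf = SVC(kernel="linear", C=C).fit(X, y)
    print(f"C={C:>5}: train acc={clf.score(X, y):.3f}, "
          f"support vectors={clf.n_support_.sum()}")
```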
In cases where the given classes cannot be linearly separated in the original input space, SVMs transform the original input space into a higher-dimensional feature space. This transformation can be achieved by various nonlinear mappings such as the polynomial, sigmoid tanh, and radial basis functions, as shown in Figure 2. After a suitable nonlinear transformation, problems that are nonlinearly separable in the input space can become linearly separable in the feature space [19]. The kernel functions used in this study are the following:
Linear:        K(x_i, x_j) = <x_i · x_j>
Polynomial:    K(x_i, x_j) = (<x_i · x_j> + 1)^d
Gaussian RBF:  K(x_i, x_j) = exp(-γ ||x_i - x_j||^2)
Sigmoid tanh:  K(x_i, x_j) = tanh(α <x_i · x_j> + β)
Figure 2. Mapping data from low to high dimension
by kernel function.
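As a concrete reading of these formulas, the sketch below translates each kernel directly into Python. The hyperparameter values d, γ, α, and β are placeholders for illustration, not values taken from the paper.

```python
# Direct translations of the four kernel formulas above.
import numpy as np

def linear(xi, xj):
    return np.dot(xi, xj)

def polynomial(xi, xj, d=3):
    return (np.dot(xi, xj) + 1) ** d

def gaussian_rbf(xi, xj, gamma=0.5):
    return np.exp(-gamma * np.sum((xi - xj) ** 2))

def sigmoid_tanh(xi, xj, alpha=0.01, beta=-1.0):
    return np.tanh(alpha * np.dot(xi, xj) + beta)

xi, xj = np.array([1.0, 2.0]), np.array([0.5, -1.0])
print(linear(xi, xj), polynomial(xi, xj),
      gaussian_rbf(xi, xj), sigmoid_tanh(xi, xj))
```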
3. Research Methodology
In this study, we experimented with four influential factors against the USS rubber price at Hatyai's central rubber market. The factors are the daily price-differential time series of the forward contract rubber (RSS#3) price at the Tokyo market (TOCOM) and the crude oil price at the New York market (NYMEX); the remaining factors are the foreign currency exchange rates of the Thai Baht against the US Dollar and the Japanese Yen. We used only data from working days of Hatyai's central rubber market. A total of 1,333 daily records covering the period from January 2005 to December 2010 was used to classify the upward/downward trend of the daily rubber market price. The available dataset was divided into training and testing sets. We experimented with four different kernel function types, namely linear, sigmoid tanh, radial basis function, and polynomial. The objective is to investigate the effect of kernel type and determine the most suitable kernel function for this particular dataset.
In the first set of experiments, we collected and rearranged all daily price changes from the previous day for the input and output data. These data are used as training and testing datasets, prepared under two different classification assumptions: a 5-class scheme and an 8-class scheme. These correspond, respectively, to ignoring and to taking into account how the behavior of the input data affects the range of USS rubber price changes. The purpose is to compare the effectiveness of each generated SVM model; a hypothetical labeling sketch is given below.
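Since the paper does not report the boundaries that define the price-change classes, the following sketch shows one hypothetical way the 5-class labeling could be implemented; the thresholds are invented for illustration only.

```python
# Hypothetical 5-class labeling of daily USS price changes.
import numpy as np
import pandas as pd

def label_5_class(price: pd.Series) -> pd.Series:
    change = price.diff()                       # daily price change
    bins = [-np.inf, -1.0, -0.25, 0.25, 1.0, np.inf]  # placeholder edges
    labels = ["big_down", "down", "flat", "up", "big_up"]
    return pd.cut(change, bins=bins, labels=labels)

prices = pd.Series([52.0, 52.4, 51.1, 51.2, 53.0])  # toy data
print(label_5_class(prices))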
The 1,333 daily records were allocated to training and testing data under two assumptions. The first assumption tests the efficiency of a model on its own training data and is called the "Known test set": all daily records serve as the training dataset, and 15% of the training records are randomly selected as the testing dataset. The second assumption evaluates the generated model on data kept separate from the training dataset and is called the "Unknown test set"; here the data is split 80:20 between training and testing. A sketch of both protocols is given below.
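A minimal sketch of the two protocols, assuming the feature matrix X and class labels y have already been assembled:

```python
# The two test-set protocols described above.
from sklearn.model_selection import train_test_split

def known_test_split(X, y, seed=0):
    # "Known test set": train on ALL data, then score on a random 15%
    # subset of that same training data.
    _, X_test, _, y_test = train_test_split(X, y, test_size=0.15,
                                            random_state=seed)
    return (X, y), (X_test, y_test)

def unknown_test_split(X, y, seed=0):
    # "Unknown test set": held-out data the model never trains on,
    # split 80:20.
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.20,
                                              random_state=seed)
    return (X_tr, y_tr), (X_te, y_te)
```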
In the experimental stage, the four different kernel functions, namely linear, sigmoid tanh, radial basis, and polynomial, are used to generate models. The constraint parameter C is also varied. The training datasets arranged under the 5-class and 8-class assumptions are trained with each of the four kernel functions over a range of C values (C = 1, 10, 100, and 1,000), so that 64 forecasting models are generated in total. The accuracy of each generated model is then evaluated by the Mean Square Error (MSE) method over the two types of testing data assumption; a sketch of the training grid follows.
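A sketch of the kernel-by-C training grid described above, again assuming scikit-learn's SVC. Run once per class scheme (5-class, 8-class) and per test protocol (known, unknown), it yields the 4 x 4 x 2 x 2 = 64 models the paper reports.

```python
# Train one SVM per (kernel, C) pair and record test accuracy.
from sklearn.svm import SVC

KERNELS = ["linear", "sigmoid", "rbf", "poly"]
C_VALUES = [1, 10, 100, 1000]

def train_grid(X_train, y_train, X_test, y_test):
    results = {}
    for kernel in KERNELS:
        for C in C_VALUES:
            clf = SVC(kernel=kernel, C=C).fit(X_train, y_train)
            results[(kernel, C)] = clf.score(X_test, y_test)  # accuracy
    return results
```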
4. Result and Discussion
In the first set of experiments, all arranged vectors of the 1,333 records were used to train the forecasting models. They were trained with the four types of SVM kernel mapping function over the range of the regularization parameter C, generating 64 forecasting models from the training datasets under the 5-class and 8-class assumptions. The prediction performance, measured as the percentage of accuracy on the test set, is shown in Figures 3 and 4. We found that the polynomial kernel took a long time to train at every value of C. The sigmoid kernel performed equally at C = 100 and C = 1,000. The linear kernel and RBF at all C values, and the sigmoid kernel at C = 1 and C = 10, are suitable for this type of data for classifying the upward or downward trend of the USS rubber price.
Figure 3. % Accuracy on Known and Unknown
Test Set (5 Classes)
Figure 4. % Accuracy on Known and Unknown
Test Set (8 Classes)
Figure 5 presents the results of Figure 3 as a bar chart, illustrating the efficiency of the generated models. These models were built under the 5-class assumption with all daily records serving as the training dataset, while 15% of the training records were randomly selected as the testing dataset (the "Known test set"). The experiment demonstrated that the radial basis function (RBF) at C = 1,000 obtained the best result (55.03%). The linear kernel at C = 10 and the sigmoid kernel at C = 1 gave lower accuracy percentages of 31.50% and 17.47%, respectively. In the other setting, the experiment used the same training data, but the testing dataset was separated from the training dataset in a ratio of 80:20 (the "Unknown test set"). Here the RBF at C = 1 obtained the best result (33.22%), as shown in Figure 6. The linear kernel at C = 100 and the sigmoid kernel at C = 10 gave lower accuracy percentages of 28.26% and 18.66%, respectively.
Figure 5. % Accuracy on Known Test Set
(5 Classes)
Figure 6. % Accuracy on Unknown Test Set
(5 Classes)
Figure 7 presents the results of Figure 4 as a bar chart, showing the experiment over the 8-class assumption training set. The RBF with regularization parameter C = 1,000 performed better than the others on the known testing dataset (65.16%); its accuracy was also better than that of the corresponding 5-class model (55.03%). The linear kernel at C = 10 and the sigmoid kernel at C = 10 gave lower accuracy percentages of 39.59% and 38.77%, respectively. Meanwhile, on the unknown testing dataset, the RBF at C = 100 matched the model trained with RBF at C = 1,000 for the best result (48.24%), as shown in Figure 8. This accuracy is also better than that of the 5-class model (33.22%). The linear kernel at C = 1,000 and the sigmoid kernel at C = 10 gave lower accuracy percentages of 35.45% and 37.80%, respectively.
Figure 7. % Accuracy on Known Test Set
(8 Classes)
Figure 8. % Accuracy on Unknown Test Set
(8 Classes)

5. Conclusion
This paper investigated the performance of the support vector machine for predicting the upward/downward trend of the USS rubber price at Hatyai's central rubber market, examining the choice of kernel function type and regularization parameter. The experiments demonstrated that the radial basis function gives the best forecasting result: it provided the best performance in reducing the prediction error under both the concerned and non-concerned assumptions about factor behavior. Furthermore, the SVM model generated under the concerned-behavior assumption was more accurate, while C = 100 and C = 1,000 produced equally good results. However, the percentage of prediction accuracy is still not satisfactory. This outcome encourages us to analyze a more suitable way to set up the classification. Further improvement in prediction performance may be achieved by input feature reclassification: the classes should be redefined with more attention to how the behavior of each historical input feature affects the range of USS rubber price changes. The optimization of the constraint value (regularization parameter) is also of interest and is currently under investigation.
References
[1] Gusuma, P. 2010. Financial Management in
Industrial estate, Bangkok Press Center, Bangkok,
Thailand.
[2] Suppanunta, R. 2009. Forecasting Model of RSS3
Price in Future Market. Kasetsart University
Journal of Economics, 16(1): 54-74.
[3] Suwimon, T. 2008. Short-term Price Forecasting
of RSS3 in Agricultural Future Exchange of
Thailand, Chiang Mai University, Chiang Mai,
Thailand.
[4] Pagagrong, T. 2002. The Price forecasting of
Agricultural Products in Agricultural Future
Exchange of Thailand, Kasetsart University,
Bangkok, Thailand.
[5] Aat, P. 2007. Direction and Alteration of Thai
Natural Rubber Industries in the Next Five Years.
Chamber of Commerce University Journal of
Academy, 27(3): 91-119.
[6] Wu, S.Jr., Han, J., Annambhotla, S. and Bryant, S.
2005. Artificial Neural Networks for Forecasting
Watershed Runoff and Stream Flows. Journal of
Hydrologic Engineering, 5: 216-222.
[7] Haofei, Z., Guoping, X., Fangting, Y. and Han, Y.
2007. A Neural Network Model Based on the
Multi-stage Optimization Approach for Short-term
Food Price Forecasting in China. Expert Systems
with Applications, 33: 347-356.
[8] Zou, H.F., Xia, G.P., Yang, F.T. and Wang, H.Y.
2007. An Investigation and Comparison of
Artificial Neural Network and Time Series Models
for Chinese Food Grain Price Forecasting.
Neurocomputing, 70: 2913-2923.
[9] Porntip, C. 2005. The Comparison of Rubber Price
Forecasting Using Box and Jenkins, Transfer
Function, and Artificial Neural Networks. King
Mongkut's University of Technology Ladkrabang,
Bangkok, Thailand.
[10] Panida, S. and Jitian, X. 2008. Developing
Forecasts for Thai Rubber Latex Prices: Non-Neural
Network Training and Neural Network Training
Approaches, School of Computer and Security
Science, Edith Cowan University, Perth,
Western Australia.
[11]Pongsiri, S., Pranee, N. and Suda, T. 2007. Time
Series Forecasting Using a Combined ARIMA and
Artificial Neural Network Model. Proceedings of
National Academic Research Conference,
Bangkok, Thailand, 2007: 1-7.
[12]Jarumon, N., Payong, M. and Srimaj, W. 2009.
The Comparison Study Techniques on Time
Series Prediction of Rubber Price using Artificial
Neural Networks, Polynomial Regression and
Support Vector Regression. The 14th National
Graduate Research Conference, King Mongkut’s
University of Technology North Bangkok,
Thailand, Sep. 10-11, 2009.
[13]Samsudin, R., Shabri, A. and Saad, P. 2010. A
Comparison of Time Series Forecasting using
Support Vector Machine and Artificial Neural
Network Model. Journal of Applied Sciences,
10(11): 950-958.
[14]Wei, Z.L. and Wen, J.W. 2005. Potential
Assessment of the “Support Vector Machine”
Method in Forecasting Ambient Air Pollutant
Trends. Proceedings of the Chemosphere, China,
2005: 693-701.
[15]Wei, H., Yoshiteru, N. and Shou, Y.W. 2005.
Forecasting Stock Market Movement Direction
with Support Vector Machine. Proceedings of the
Computer & Operations Research, China, 2005:
2513-2522.
[16]Rohit, C. and Kumkum, G. 2008. A Hybrid
Machine Learning System for Stock Market
Forecasting. Proceedings of the World Academy
of Science, Engineering and Technology, 2008:
315-318.
[17]Vapnik, V. 1995. The Nature of Statistical
Learning Theory, Springer-Verlag, New York.
[18]Kecman, V. 2005. Support Vector Machines – An
Introduction, StudFuzz, Springer-Verlag Berlin
Heidelberg.
[19]Wang, L. 2005. Support Vector Machines:
Theory and Applications (Studies in Fuzziness and
Soft Computing), Springer, New York.