Time Series Prediction

Time Series Prediction Using Support Vector Machine: A Survey
By Ma Yongning
Outline

Introduction to time series prediction

Classic techniques

Introduction to SVM

Variants of SVM

Hybrid techniques

Application

Conclusion
Time Series Prediction
The objective of time series prediction is to find a
function f(x) such that $\hat{x}$, the predicted value of the time
series at a future point in time, is unbiased and
consistent.
The goal of time series prediction is to estimate some
future value based on current and past data samples.
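One common way to set up "estimate a future value from current and past samples" is to slice the series into sliding windows. The sketch below is illustrative only; the function name `make_windows` and the parameter `order` (number of past samples per input) are assumptions, not names from the slides.

```python
def make_windows(series, order):
    """Turn a series into (past window, next value) training pairs."""
    inputs, targets = [], []
    for t in range(order, len(series)):
        inputs.append(series[t - order:t])  # the `order` most recent samples
        targets.append(series[t])           # the value to be predicted
    return inputs, targets


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
windows, targets = make_windows(series, order=3)
for window, target in zip(windows, targets):
    print(window, "->", target)
```

Any regressor f(x), whether an AR filter or an SVR, can then be trained on these pairs.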
ANN Approach –
Autoregression Filter



Linear, stationary process
The current term of the series can be estimated by a
linear weighted sum of previous terms in the series
A number of techniques exist for computing AR
coefficients. The two main categories are least squares
and the Burg method
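As a rough illustration of the least-squares route (not the Burg method), the sketch below fits a second-order AR model x[t] ≈ a1·x[t-1] + a2·x[t-2] by solving the 2x2 normal equations directly. The order 2, the function name `fit_ar2`, and the synthetic noise-free series are assumptions chosen to keep the example small.

```python
def fit_ar2(series):
    """Least-squares estimate of (a1, a2) in x[t] ~ a1*x[t-1] + a2*x[t-2]."""
    s11 = s12 = s22 = r1 = r2 = 0.0
    for t in range(2, len(series)):
        p1, p2, target = series[t - 1], series[t - 2], series[t]
        s11 += p1 * p1
        s12 += p1 * p2
        s22 += p2 * p2
        r1 += p1 * target
        r2 += p2 * target
    # Solve [[s11, s12], [s12, s22]] @ [a1, a2] = [r1, r2] by Cramer's rule.
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("series too short or degenerate for an AR(2) fit")
    a1 = (r1 * s22 - s12 * r2) / det
    a2 = (s11 * r2 - s12 * r1) / det
    return a1, a2


# Synthetic series generated by a known AR(2) recursion (no noise).
demo = [1.0, 2.0]
for _ in range(20):
    demo.append(0.5 * demo[-1] + 0.3 * demo[-2])

a1, a2 = fit_ar2(demo)
next_value = a1 * demo[-1] + a2 * demo[-2]  # one-step-ahead prediction
print(a1, a2, next_value)
```

On real, noisy data the recovered coefficients are only estimates, and a higher order would need a general linear solver.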
ANN Approach – Kalman Filter


The Kalman filter, also known as linear quadratic
estimation (LQE), is an algorithm that uses a series
of measurements observed over time,
containing noise (random variations) and
other inaccuracies, and produces
estimates of unknown variables
Assumes a linear, stationary process and
a known model
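To make the predict/update cycle concrete, here is a minimal scalar sketch. It assumes the simplest possible known model (the state stays the same up to process noise, and each measurement is the state plus measurement noise). The function name `kalman_1d`, the variances, and the sample readings are illustrative assumptions.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for the model x[t] = x[t-1] + w, z[t] = x[t] + v."""
    x, p = x0, p0          # state estimate and its variance
    estimates = []
    for z in measurements:
        # Predict: the state is assumed unchanged, uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


noisy_readings = [4.8, 5.3, 4.9, 5.1, 5.2, 4.7, 5.0]
smoothed = kalman_1d(noisy_readings, process_var=0.01, measurement_var=0.25)
print(smoothed)
```

A larger `measurement_var` makes the filter trust the measurements less and smooth more heavily.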
Introduction to SVM

A technique for finding the optimal separating
hyperplane
f(x) = wx + b or f(x) = wg(x) + b



Linear SVM was developed by Vapnik and Chervonenkis in
the 1960s
Non-linear SVM was developed in the 1990s by applying the
kernel trick
Unbiased if b = 0
Computation of Separating
Hyperplane

Assume f(x)= wx + b (Linear Case)

Task: Maximize the margin M = 2/||w||

Which means to minimize ||w||

Also, we allow some error ξ
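As a short worked step (standard notation; the points x_+ and x_- are illustrative symbols not used in the slides), the margin formula follows from the distance between the two supporting hyperplanes wx + b = +1 and wx + b = -1:

```latex
\begin{aligned}
w x_+ + b &= +1, \qquad w x_- + b = -1 \\
w (x_+ - x_-) &= 2 \\
M &= \frac{w}{\|w\|} (x_+ - x_-) = \frac{2}{\|w\|}
\end{aligned}
```

Maximizing M is therefore equivalent to minimizing ||w||, or more conveniently ||w||^2 / 2.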
Some Math

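Our problem now is

$$\min_{w,\,b,\,\xi} \ \frac{1}{2}\|w\|^2 + C \sum_i \xi_i$$

Subject to

$$y_i (w x_i + b) \ge 1 - \xi_i, \qquad \xi_i \ge 0$$

where y_i is the class label (+1 or -1) of x_i and C > 0 weights the error term.
Convex optimization is then used to solve for f(x)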
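The sketch below illustrates the linear soft-margin problem on a toy data set. Instead of a quadratic-programming solver, it swaps in plain subgradient descent on the equivalent objective (1/2)||w||^2 + C·Σ max(0, 1 - y_i(w·x_i + b)), where the hinge term plays the role of the error ξ. The data points, step size, epoch count, and function names are all assumptions made for illustration.

```python
def train_linear_svm(points, labels, C=1.0, lr=0.01, epochs=2000):
    """Subgradient descent on (1/2)||w||^2 + C * sum(hinge losses)."""
    dim = len(points[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)          # gradient of the (1/2)||w||^2 term
        grad_b = 0.0
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1.0:      # this point has positive slack
                for j in range(dim):
                    grad_w[j] -= C * y * x[j]
                grad_b -= C * y
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Sign of f(x) = w.x + b, returned as +1 or -1."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


points = [(2.0, 2.0), (3.0, 3.0), (2.0, 3.0), (0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
labels = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(points, labels)
print(w, b, [predict(w, b, x) for x in points])
```

A dedicated QP or SMO solver would reach the optimum more precisely; the constant step size here only gets close to it.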
The Kernel Trick



In the linear case we have K(x, y) = <x, y>. If the data
points are not linearly separable, we construct a mapping

$$g : X \to V$$

such that

$$K(x, y) = \langle g(x), g(y) \rangle_V$$

which means K(x, y) can be expressed as an inner product of
terms in an inner product space V (e.g. a Hilbert space)
of dimension dim(V), in which the data can become linearly separable.
A Simple Example


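Common kernel functions are polynomial or Gaussian
To see how they transform input vectors into a higher
dimension, consider the quadratic kernel, for example in its homogeneous form

$$K(x, y) = (x^T y)^2 = \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i x_j)(y_i y_j)$$

which corresponds to the feature map

$$g(x) = (x_1 x_1,\ x_1 x_2,\ \dots,\ x_n x_n)$$

The dimension of the feature is O(n^2).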
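A quick numerical check makes the kernel trick tangible. Assuming the homogeneous quadratic kernel K(x, y) = (x·y)^2, the sketch below compares the kernel value with the inner product of explicit features made of all pairwise products x_i·x_j. The function names and sample vectors are illustrative assumptions.

```python
def quadratic_kernel(x, y):
    """K(x, y) = (x . y)^2, computed in the original n-dimensional space."""
    return sum(a * b for a, b in zip(x, y)) ** 2


def quadratic_features(x):
    """Explicit feature map: all n^2 pairwise products x_i * x_j."""
    return [xi * xj for xi in x for xj in x]


x = [1.0, 2.0, 3.0]
y = [0.5, -1.0, 2.0]

kernel_value = quadratic_kernel(x, y)
feature_value = sum(a * b for a, b in
                    zip(quadratic_features(x), quadratic_features(y)))
print(kernel_value, feature_value, len(quadratic_features(x)))
```

The kernel needs O(n) work, while the explicit feature vector has n^2 entries, which is the saving the kernel trick provides.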
Non-Linear Decision Boundary
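Least-Squares SVM (Among
other Variants)


Express the error term as in linear regression:

$$\min_{w,\,b,\,e} \ \frac{1}{2}\|w\|^2 + \frac{\gamma}{2} \sum_i e_i^2 \quad \text{subject to} \quad y_i = w g(x_i) + b + e_i$$

The problem can then be solved by setting the partial derivatives
of the Lagrangian to zero, which eliminates w and e and
leads to a linear system in α (the Lagrange multipliers) and b that can be
solved directly. LS-SVM sacrifices a bit of accuracy for efficiency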
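The linear system can be sketched as follows for the regression case, in the commonly used form [[0, 1ᵀ], [1, K + I/γ]]·[b; α] = [0; y] with prediction f(x) = Σ α_i K(x, x_i) + b. The Gaussian kernel, the value of γ, the small Gaussian-elimination helper, and the toy series are assumptions for illustration, not details taken from the slides.

```python
import math


def rbf(u, v, width=1.0):
    """Gaussian kernel between two input vectors."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(u, v)) / (2.0 * width ** 2))


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def fit_lssvm(X, y, gamma=100.0, width=1.0):
    """Build and solve the LS-SVM system; returns (b, alpha)."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(X[i], X[j], width) for j in range(n)]
        row[i + 1] += 1.0 / gamma      # ridge term I / gamma
        A.append(row)
    sol = solve(A, [0.0] + list(y))
    return sol[0], sol[1:]


def predict_lssvm(X, b, alpha, x, width=1.0):
    return b + sum(a * rbf(x, xi, width) for a, xi in zip(alpha, X))


# Toy series turned into (two past values -> next value) pairs.
series = [0.0, 0.84, 0.91, 0.14, -0.76, -0.96, -0.28, 0.66]
X = [series[t - 2:t] for t in range(2, len(series))]
y = [series[t] for t in range(2, len(series))]
b, alpha = fit_lssvm(X, y)
print(predict_lssvm(X, b, alpha, series[-2:]))  # one-step-ahead forecast
```

Because every training point receives a generally non-zero α, the solution is not sparse, which is the usual trade-off against the simpler solve.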
Hybrid Approaches

Fuzzy SVM (blurred decision boundaries)

SOM and SVM (a self-organizing map, SOM, is used to cluster the data first)

GA and SVM (a genetic algorithm, GA, is used to optimize the free variables in
SVM; a broader type is adaptive SVM (ASVM), where the
free variables are subject to change during the training
process)
Specific to time series: use ARIMA to predict the linear
region
Smooth the data first with SOM or a Kalman filter. In the case
of SOM, however, this worsened the performance
Application of SVM for Time
Series in Different Industries

Financial Data Time Series Prediction



Use of SVR to predict five specific
financial time series sources including
the S&P 500 and several foreign bond
indices.
The SVR significantly outperformed the
BP NN because of its ability to
appropriately fit the data.
A hybrid system of using SOM
combined with SVR yielded not only
better prediction performance but also
superior convergence speed.
Application of SVM for Time
Series in Different Industries

General Business Applications





Electricity Price Forecasting
Credit Rating Analysis
Customer “Churning” - Auto Insurance Market Prediction
Financial Failure of Dotcoms – Financial Analysis using 24 financial ratios
Production Value Prediction of the Taiwanese Machinery Industry
Application of SVM for Time
Series in Different Industries

Environmental Parameter Estimation
Using SVR

Used to predict environmental parameters, for example
rainfall estimation and detection,
weather forecasting, and short-term air
quality (environmental pollutants such as
nitrogen oxides).
Application of SVM for Time
Series in Different Industries

Electric Utility Load Forecasting
Applications Using SVR.

Forecasting of electric power
consumption demands by consumers,
which is a non-linear prediction
problem.

Factors influencing usage include, for example,
holiday periods,
weather (temperature and humidity),
and electricity pricing.
Application of SVM for Time
Series in Different Industries

Machine Reliability Forecasting
Applications



The prediction of machine reliability is a non-linear
problem.
One example is to predict the “period reliability ratio”
for the automotive industry based on time series
data containing vehicle damage incidents and the
number of damages repaired.
Another example is the use of SVR to predict
engine failure in both the repair and design process
of mechanical engines. The input data was the
engine age at the time of the unscheduled
maintenance action per maintenance period, and the
output was the predicted engine age at the next
unscheduled maintenance action.
Application of SVM for Time
Series in Different Industries

Control System and Signal Processing
Applications.







Mobile Position Tracking
Internet Flow Control
Adaptive Inverse Disturbance
Cancelling
Narrowband Interference Suppression
Antenna Beamforming
Elevator Traffic Flow Prediction
Dynamically Tuned Gyroscope Drift
Modeling
Application of SVM for Time
Series in Different Industries

Miscellaneous Applications

Biological Neuron Application (Australian
Crayfish)

Kalman Filtering Method of Switching Dynamics
Associated with Unsupervised Segmentation

Natural Gas Load Forecasting (using weather-related
factors such as temperature, day of week,
and holidays)

Transportation Travel Time Estimation

Use of Particle Swarm Optimization in conjunction
with SVR.
Conclusion



Support Vector Machines (SVM) and Support Vector
Regression (SVR) are powerful learning mechanisms that
have been developed and matured over the last 15
years.
They are useful for predicting and forecasting time series in a
myriad of applications.
SVR continues to be a viable approach to the
prediction of time series data in non-linear systems.