Support Vector Regression

Artur Akbarov
A tutorial on support vector regression
By Smola, A.J. and Schölkopf, B.
Statistics and Computing, 14,
pp. 199-222, 2004


SVM is applied in many different fields,
including bioinformatics, epidemiology,
finance and economics.
Essentially, it can be applied wherever there is
a classification or prediction problem.
Source: http://www.saedsayad.com/support_vector_machine_reg.htm
$y = w \varphi(x) + b$
Linear SVR:
Non-linear SVR:

Gaussian radial basis function:
$K(x_i, x_j) = \exp(-\gamma \lVert x_i - x_j \rVert^2)$

Polynomial:
$K(x_i, x_j) = (x_i \cdot x_j)^d$
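
The two kernels can be evaluated directly for a pair of vectors. The following is a minimal sketch in base R, assuming the Gaussian kernel has the form exp(-gamma * ||xi - xj||^2) and the polynomial kernel the form (xi . xj)^d. The function names and the width parameter gamma are illustrative choices, not taken from the slides.

```r
# Gaussian radial basis function kernel.
# gamma is an assumed width parameter (larger gamma = narrower kernel).
rbf_kernel <- function(xi, xj, gamma = 1) {
  exp(-gamma * sum((xi - xj)^2))
}

# Polynomial kernel of degree d, without an offset term.
poly_kernel <- function(xi, xj, d = 2) {
  sum(xi * xj)^d
}

xi <- c(1, 2)
xj <- c(3, 4)
k_rbf  <- rbf_kernel(xi, xj, gamma = 0.5)  # squared distance is 8
k_poly <- poly_kernel(xi, xj, d = 2)       # dot product is 11
```

A kernel value close to 1 for the Gaussian case means the two points are near each other; identical points always give exactly 1.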

Model fitting:
◦ Training set
– fit the model
◦ Validation set
– predict using the fitted model,
choose the model with minimum
prediction error.
Model testing:
◦ Test set
– examine the prediction error
(model performance, compare
different prediction methods)
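
The fitting and testing workflow can be sketched in R as follows. This is only an illustration under stated assumptions: the synthetic sine data, the 60/20/20 split, the candidate cost values and the helper rmse() are all made up for the example, and svm() comes from the e1071 package.

```r
library(e1071)

set.seed(1)
# Synthetic data (an assumption for illustration): a noisy sine curve.
n <- 300
D <- data.frame(X1 = runif(n, 0, 2 * pi))
D$Y <- sin(D$X1) + rnorm(n, sd = 0.1)

# Random 60/20/20 split into training, validation and test sets.
idx   <- sample(n)
train <- D[idx[1:180], ]
valid <- D[idx[181:240], ]
test  <- D[idx[241:300], ]

# Hypothetical helper: root mean squared prediction error.
rmse <- function(y, y_hat) sqrt(mean((y - y_hat)^2))

# Fit one model per candidate cost on the training set,
# then score each model on the validation set.
costs   <- c(0.1, 1, 10)
models  <- lapply(costs, function(C) svm(Y ~ X1, data = train, cost = C))
val_err <- sapply(models, function(m) rmse(valid$Y, predict(m, valid)))

# Keep the model with the smallest validation error and
# examine its error on the untouched test set.
best     <- models[[which.min(val_err)]]
test_err <- rmse(test$Y, predict(best, test))
```

The test set is used only once, after the model has been chosen, so test_err is not biased by the model selection step.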

Training and validation sets:
◦ Fixed split
◦ Random split
◦ Cross-validation
– Split the data into n subsets, train on n-1
subsets, validate on the remaining subset, loop over all
subsets.
– Leave-one-out cross-validation: each subset is a single observation.
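
The cross-validation loop can be written out by hand. The sketch below assumes synthetic quadratic data and five subsets; the names n_obs, n_folds and fold are illustrative, and svm() is from the e1071 package.

```r
library(e1071)

set.seed(2)
# Synthetic data (an assumption for illustration): a noisy parabola.
n_obs <- 100
D <- data.frame(X1 = runif(n_obs, -2, 2))
D$Y <- D$X1^2 + rnorm(n_obs, sd = 0.2)

# Assign every observation to one of n_folds subsets at random.
n_folds <- 5
fold <- sample(rep(1:n_folds, length.out = n_obs))

# Train on all subsets but one, validate on the held-out subset,
# and loop over all subsets.
cv_err <- sapply(1:n_folds, function(i) {
  fit   <- svm(Y ~ X1, data = D[fold != i, ])
  y_hat <- predict(fit, D[fold == i, ])
  mean((D$Y[fold == i] - y_hat)^2)
})

# Average validation error (mean squared error) over the subsets.
cv_mse <- mean(cv_err)
```

Setting n_folds equal to n_obs turns the same loop into leave-one-out cross-validation, at the price of fitting one model per observation.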

LIBSVM – SVM library with interfaces for many programming languages.

Weka – data mining tools.

R package - “e1071”






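install.packages("e1071")
library("e1071")
# Formula interface: D is a data frame with columns Y, X1 and X2
model <- svm(Y ~ X1 + X2, data = D)
# Alternative matrix interface: X holds the predictors, Y the response
model <- svm(x = X, y = Y)
# Fitted values on the training inputs
Y_fit <- predict(model, X)
# Predictions for new inputs
Y_hat <- predict(model, X_new)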

Other SVM parameters for the svm() function:
◦ epsilon = 0.1 (default)
◦ cost = 1.0 (default), which is C
◦ kernel = "linear", "polynomial", "radial" (default) or "sigmoid".
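
To see what these parameters do, the sketch below fits the same synthetic data twice with different epsilon values and compares the number of support vectors. The data and the two epsilon values are assumptions chosen for illustration; tot.nSV is the field of the fitted e1071 model that holds the total number of support vectors.

```r
library(e1071)

set.seed(3)
# Synthetic data (an assumption for illustration): a noisy sine curve.
X <- matrix(seq(0, 2 * pi, length.out = 200), ncol = 1)
Y <- sin(X[, 1]) + rnorm(200, sd = 0.1)

# Same kernel and cost, two different widths of the epsilon tube.
narrow <- svm(x = X, y = Y, kernel = "radial", cost = 1, epsilon = 0.01)
wide   <- svm(x = X, y = Y, kernel = "radial", cost = 1, epsilon = 0.5)

# Number of support vectors kept by each model.
n_sv <- c(narrow = narrow$tot.nSV, wide = wide$tot.nSV)
```

Points that fall inside the epsilon tube do not become support vectors, so a wider tube usually gives a sparser model, typically at some cost in accuracy.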


The best.svm() function uses a grid search, scored by
cross-validation, to find the best values of the SVM parameters
among those supplied.
model <- best.svm(x = X, y = Y,
                  tunecontrol = tune.control(cross = 5),
                  cost = 1:10, epsilon = c(0.05, 0.10))
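
best.svm() returns only the winning model. To inspect which parameter values were selected, the related tune.svm() function from the same package can be used. The following is a sketch under assumptions: the synthetic data and the small parameter grid are chosen only to keep the example quick.

```r
library(e1071)

set.seed(4)
# Synthetic data (an assumption for illustration): a noisy sine curve.
X <- matrix(runif(150, 0, 2 * pi), ncol = 1)
Y <- sin(X[, 1]) + rnorm(150, sd = 0.1)

# Grid search over cost and epsilon with 5-fold cross-validation.
tuned <- tune.svm(x = X, y = Y,
                  cost = c(1, 5, 10), epsilon = c(0.05, 0.10),
                  tunecontrol = tune.control(cross = 5))

best_par <- tuned$best.parameters   # selected cost and epsilon
best_err <- tuned$best.performance  # cross-validated error of that setting

# The model refitted with the selected parameters.
model <- tuned$best.model
Y_fit <- predict(model, X)
```

The grid search fits one model per fold for every parameter combination, so its cost grows with the product of the grid sizes and the number of folds.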
Thank you for your attention