The Prediction Study on Output Value and Profit of China Mobile Tianjin
Branch
Hong-li Wang 1, Lin-xi Song 1,2, Ya-tao Zhang 3
1 School of Management, Tianjin University, Tianjin, P.R. China
2 China Mobile Tianjin Branch, Tianjin, P.R. China
3 School of Mechanical Engineering, Tianjin University, Tianjin, P.R. China
(wanghl@tju.edu.cn)
Abstract - This paper predicts the future development
of China Mobile Tianjin Branch, based on support vector
machine, stochastic gradient boosting and artificial neural
networks. By comparison, we found that stochastic gradient
regression had higher precision for the prediction of output
value, and that support vector machine regression had higher
precision for the prediction of profit. Using these two
regression methods, output value and profit in the next 4
years were predicted respectively, based on the company's
past data. The study will provide credible support for the
formulation of the company's development strategy.
Keywords - prediction, China Mobile, SVM, stochastic
gradient regression
Before 1994, the mobile communications market in
China was monopolized by the posts and telecommunications
department. In 1994, this monopoly was broken when China
Unicom was established. After the posts and
telecommunications operations were separated and China's
telecommunications sector was restructured, China Mobile
Communications Corporation was officially established on
April 20th, 2000. Since then, the performance of China
Mobile Communications Corporation has increased rapidly,
and the output value of Tianjin Branch has increased from
1.98 billion in 2000 to 7.40 billion in 2011.
In recent years, a number of scholars have studied
the prediction of business data in different industries. Fang
Kuangnan et al. [1] used the random forest method to
predict the excess rate of return for China's funds, and
showed the predictability of China's financial markets.
Chen Bo et al. [2] predicted the output value of China's
online game industry in the next few years based on gray
prediction theory.
I. METHOD OF THE SUPPORT VECTOR
MACHINE REGRESSION
In 1995, based on statistical learning theory, Vapnik
and Cortes proposed a new machine learning
method: the Support Vector Machine (SVM) [3-5]. SVM has unique
advantages in solving small-sample, nonlinear and
high-dimensional pattern recognition problems, and it can be
extended to other machine learning problems such
as function fitting.
For an unknown function $y = f(x)$, where $x \in R^d$ and
$y \in R$, we seek a fitting function $\hat{f}: R^d \to R$
that makes the distance function
$$R(f, \hat{f}) = \int L(f, \hat{f}) \, dx \qquad (1)$$
smallest, where $L$ is the loss function. Because $f$ is
unknown, we can only determine the fitting function
$\hat{f}$ from the samples that have been obtained.
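Using the regression function $f(x) = \langle w, x \rangle + b$ to fit
the sample data $(x_1, y_1), (x_2, y_2), \ldots, (x_k, y_k)$, where
$x_i \in R^d$ and $y_i \in R$, assume that all training data can be
fitted by a linear function without error at the accuracy $\varepsilon$:
$$\begin{cases} y_i - \langle w, x_i \rangle - b \le \varepsilon \\ \langle w, x_i \rangle + b - y_i \le \varepsilon \end{cases} \qquad i = 1, 2, \ldots, k \qquad (2)$$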
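To allow for fitting error, we introduce the relaxation
factors $\xi_i \ge 0$, $\xi_i^* \ge 0$.
Formula (2) is rewritten as follows:
$$\begin{cases} y_i - \langle w, x_i \rangle - b \le \varepsilon + \xi_i \\ \langle w, x_i \rangle + b - y_i \le \varepsilon + \xi_i^* \end{cases} \qquad i = 1, 2, \ldots, k \qquad (3)$$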
The regression problem is thus transformed into
minimizing the distance function under the constraints
(3). The rewritten distance function is
$$R(w, \xi_i, \xi_i^*) = \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{k} (\xi_i + \xi_i^*) \qquad (4)$$
where $C > 0$ is a constant that weights the fitting errors
against $\|w\|^2$.
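As a rough illustration of formulas (3) and (4), the following Python sketch evaluates the distance function for a given linear model. It is not the authors' implementation: the function name `svr_objective` and the toy data are hypothetical, and the slack values are taken as the smallest ones that satisfy (3) for the given $w$ and $b$.

```python
def svr_objective(w, b, xs, ys, eps, C):
    """Evaluate formula (4) for the linear model f(x) = <w, x> + b.

    For fixed w and b, the smallest slacks satisfying (3) are
    xi_i = max(0, y_i - f(x_i) - eps) and xi_i* = max(0, f(x_i) - y_i - eps).
    """
    slack_total = 0.0
    for x, y in zip(xs, ys):
        fx = sum(wj * xj for wj, xj in zip(w, x)) + b
        xi = max(0.0, y - fx - eps)        # point lies above the eps-tube
        xi_star = max(0.0, fx - y - eps)   # point lies below the eps-tube
        slack_total += xi + xi_star
    return 0.5 * sum(wj * wj for wj in w) + C * slack_total


# Hypothetical toy sample: only the third point falls outside the tube.
xs = [(0.0,), (1.0,), (2.0,)]
ys = [0.0, 1.0, 3.0]
value = svr_objective(w=(1.0,), b=0.0, xs=xs, ys=ys, eps=0.5, C=10.0)
print(value)
```

Points inside the tube of half-width $\varepsilon$ contribute nothing to the sum, so a larger $\varepsilon$ or a smaller $C$ makes the fit more tolerant of deviations.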
II. METHOD OF THE STOCHASTIC GRADIENT
REGRESSION
Stochastic gradient regression [6-8], put forward
by Friedman in 2001, is a linear combination of
several regression trees:
$$F(x) = \sum_{m=1}^{M} \alpha_m T_m(x) \qquad (5)$$
where $\alpha_m$ are the combination coefficients and $T_m(x)$ are
regression trees. The trees are built following the
steepest-descent idea: each tree is fitted to the
pseudo-residuals given by the negative gradient of the loss
function. The stochastic gradient boosting algorithm is
shown in Figure 1.
Figure 1. Stochastic gradient boosting algorithm
Here $\{y_i, x_i\}_1^N$ represents the original sample, and
$\{y_{\pi(i)}, x_{\pi(i)}\}_1^{\hat{N}}$ is a random sub-sample,
where $\hat{N} < N$ and $\{\pi(i)\}_1^N$ is a random
permutation of $\{1, 2, \ldots, N\}$.
Figure 2. The profit of Tianjin Mobile in 2000-2011 (billion)
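The boosting procedure described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm of Figure 1 as the authors ran it: the loss is taken to be squared error (so the pseudo-residuals are ordinary residuals), each tree $T_m$ is a one-split stump, every $\alpha_m$ is fixed to a constant `shrinkage`, and the names `fit_stump` and `fit_sgb` are hypothetical.

```python
import random


def fit_stump(xs, rs):
    """Least-squares regression stump: one split on one feature."""
    best = None
    for j in range(len(xs[0])):
        for t in sorted({x[j] for x in xs}):
            left = [r for x, r in zip(xs, rs) if x[j] <= t]
            right = [r for x, r in zip(xs, rs) if x[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    if best is None:  # no valid split: fall back to a constant
        m = sum(rs) / len(rs)
        return lambda x: m
    _, j, t, lm, rm = best
    return lambda x: lm if x[j] <= t else rm


def fit_sgb(xs, ys, n_trees=200, shrinkage=0.1, subsample=0.5, seed=0):
    """Stochastic gradient boosting with squared-error loss."""
    rng = random.Random(seed)
    n = len(xs)
    n_sub = max(2, int(subsample * n))  # sub-sample size, smaller than N
    f0 = sum(ys) / n                    # initial constant model
    trees = []

    def predict(x):
        return f0 + shrinkage * sum(tree(x) for tree in trees)

    for _ in range(n_trees):
        # Drawing without replacement equals taking the first n_sub
        # entries of a random permutation of the sample indices.
        idx = rng.sample(range(n), n_sub)
        sub_x = [xs[i] for i in idx]
        # Negative gradient of (y - F)^2 / 2 with respect to F.
        residuals = [ys[i] - predict(xs[i]) for i in idx]
        trees.append(fit_stump(sub_x, residuals))
    return predict


# Hypothetical toy data: y = 2x on ten points.
xs = [(float(i),) for i in range(10)]
ys = [2.0 * i for i in range(10)]
model = fit_sgb(xs, ys)
mean_y = sum(ys) / len(ys)
baseline = sum((y - mean_y) ** 2 for y in ys) / len(ys)
mse = sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(ys)
print(mse, baseline)
```

Fitting each tree on a random sub-sample rather than the full sample is what distinguishes the stochastic variant from plain gradient boosting.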
III. THE PREDICTION STUDY ON CHINA
MOBILE TIANJIN BRANCH
With the development of China's economy, and
especially of Tianjin's, the output value
of China Mobile Tianjin Branch increased from 1.98
billion in 2000 to 7.403 billion in 2011, and the profit of
the company increased from 0.498 billion in 2000 to 1.573
billion in 2011. To provide a better basis for
formulating company policy, it is necessary
to predict the output value and profit over the next few
years.
First, to apply regression methods to the output value
and profit of the company, we needed to determine the
independent variables and the dependent variable. Since we
were using historical data to predict future output value and
profit, we let the data of the previous n years be the
independent variables and the data of the (n+1)-th year be the
dependent variable (in the calculations below, n = 2). The data
for 2011 were not used for fitting, but were held out to verify
the accuracy of the calculations. Second, we analyzed
the relationships among the independent variables and
between the dependent variable and the
independent variables. Using three kinds of regression
methods, predictions of the output value and profit were
obtained, and the method was selected by comparison with
the held-out data.
The algorithms in this paper were implemented in R.
In the calculations, the annual output value and profit were
converted into annual growth rates, and the final predicted
values were obtained by predicting the growth rates.
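The data preparation described above can be sketched as follows. The paper's computations were done in R; Python is used here only for illustration, and the helper names `growth_rates`, `lag_windows` and `compound` are hypothetical. The rates are the output value growth rates as read from Table IV, with n = 2 and 2011 held out.

```python
def growth_rates(levels):
    """Convert yearly levels into year-over-year growth rates."""
    return [cur / prev - 1.0 for prev, cur in zip(levels, levels[1:])]


def lag_windows(series, n=2):
    """Pair each value with the n values that precede it."""
    X = [series[i - n:i] for i in range(n, len(series))]
    y = [series[i] for i in range(n, len(series))]
    return X, y


def compound(last_level, rates):
    """Turn a sequence of growth rates back into yearly levels."""
    levels = []
    for r in rates:
        last_level *= 1.0 + r
        levels.append(last_level)
    return levels


# Output value growth rates for 2001-2011, as read from Table IV.
table_iv_rates = [0.120, 0.120, 0.120, 0.120, 0.126, 0.111,
                  0.181, 0.293, 0.056, 0.121, 0.052]

# Fit on 2001-2010 only; the 2011 rate is kept for verification.
X, y = lag_windows(table_iv_rates[:-1], n=2)
query_2011 = table_iv_rates[-3:-1]   # rates of 2009 and 2010
held_out_2011 = table_iv_rates[-1]

# Compounding the rates from the 2000 level of 1.98 billion gives
# about 7.41, close to the reported 7.403 (the rates are rounded).
levels = compound(1.98, table_iv_rates)
print(len(X), query_2011, levels[-1])
```

Any regression method can then be fitted on `X` and `y`, and a predicted growth rate is converted back into a level with `compound`.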
A. The Profit Prediction by SVM Regression
The profit of China Mobile Tianjin Branch from
2000 to 2011 is shown in Figure 2, and its growth rates
are shown in Table I.
TABLE I
ANNUAL GROWTH RATES OF THE PROFIT IN 2000-2011
Year   2001    2002    2003    2004    2005    2006
Rate   0.163   0.164   0.163   0.163   0.163   0.163
Year   2007    2008    2009    2010    2011
Rate   0.146   0.330   0.065   0.090   0.016
Using three regression methods (support vector
machine (SVM), artificial neural networks (ANNs) [9-10],
and stochastic gradient (SG)), the profit of 2011
was predicted and compared with the true profit.
The results are shown in Table II.
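TABLE II
THE ERRORS COMPARED WITH THE PROFIT IN 2011
Methods                SVM      ANN      SG
Prediction (billion)   1.629    2.297    1.689
Error                  3.58%    46.07%   7.37%
Note: one of the predictions is printed with a fourth decimal
digit (8) in the source; the extracted text does not show which.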
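The error row of Table II can be checked with a short calculation. The sketch below assumes the error is the absolute relative error against the actual 2011 profit of 1.573 billion; the paper does not state the formula, and the function name `relative_error_pct` is hypothetical. With predictions rounded to three decimals, the recomputed values agree with the table only to within rounding.

```python
def relative_error_pct(prediction, actual):
    """Absolute relative error, in percent."""
    return abs(prediction - actual) / actual * 100.0


actual_2011 = 1.573  # profit in 2011 (billion), from Section III
predictions = {"SVM": 1.629, "ANN": 2.297, "SG": 1.689}  # Table II

errors = {m: relative_error_pct(p, actual_2011)
          for m, p in predictions.items()}
best = min(errors, key=errors.get)  # method with the smallest error

for method, err in errors.items():
    print(f"{method}: {err:.2f}%")
print("smallest error:", best)
```

The same comparison, applied to the output value predictions, is what selects the method for the four-year forecast.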
By comparison, SVM regression was the most accurate
for the profit forecast. We used this method to obtain
the predicted profit for 2012-2015. The results are shown
in Table III.
TABLE III
THE FORECASTING RESULTS OF PROFIT BASED ON SVM
REGRESSION (BILLION)
Year         2012    2013    2014    2015
Prediction   1.584   1.645   1.688   1.716
B. The Output Value Prediction by Stochastic Gradient
Regression
The output value of China Mobile Tianjin Branch
from 2000 to 2011 is shown in Figure 3, and its
growth rates are shown in Table IV.
Figure 3. The output value of Tianjin Mobile in 2000-2011 (billion)
TABLE IV
ANNUAL GROWTH RATES OF THE OUTPUT VALUE IN
2000-2011
Year   2001    2002    2003    2004    2005    2006
Rate   0.120   0.120   0.120   0.120   0.126   0.111
Year   2007    2008    2009    2010    2011
Rate   0.181   0.293   0.056   0.121   0.052
Using three regression methods (support vector
machine (SVM), artificial neural networks (ANNs), and
stochastic gradient (SG)), the output value of
2011 was predicted and compared with the true
value. The results are shown in Table V.
TABLE V
THE ERRORS COMPARED WITH THE OUTPUT VALUE IN 2011
Method                 SVM      ANNs     SG
Prediction (billion)   8.007    7.999    7.783
Error                  8.17%    8.06%    5.13%
By comparison, stochastic gradient regression was the
most accurate for the output value forecast. We used this
method to obtain the predicted output value for 2012-2015.
The results are shown in Table VI.
TABLE VI
THE FORECASTING RESULTS OF OUTPUT VALUE BASED
ON STOCHASTIC GRADIENT REGRESSION (BILLION)
Year         2012    2013    2014    2015
Prediction   8.379   9.108   9.716   10.879
IV. CONCLUSION
Using three kinds of regression methods, this paper
sets up prediction models for output value and profit, and
predicts the output value and profit of China Mobile Tianjin
Branch for the next four years. The results indicate that
the output value of the company will exceed 10 billion by
2015. The methods used in this paper can be applied to
similar problems.
REFERENCES
[1] Fang Kuangnan, Zhu Jianping, and Xie Bangchang, “A
Research into the Forecasting of Fund Return Rate Direction
and Trading Strategies Based on the Random Forest
Method”, Economic Survey, vol. 2, no. 9, pp. 61-66, 2010.
[2] Chen Bo, and Li Fuming, “Application of Grey Theory to
Predict Chinese Online Game Output”, Journal of Shandong
University of Technology (Nature Science Edition), vol. 23,
no. 6, pp. 52-54, Nov. 2009.
[3] Wang Guosheng, “Research on Theory and Algorithm for
Support Vector Machine Classifier”, Ph. D. dissertation,
Beijing University of Posts and Telecommunications,
Beijing, China, 2007.
[4] O. Chapelle, and V. Vapnik, “Model Selection for Support
Vector Machines”, Advances in Neural Information
Processing Systems, MIT Press, 2000.
[5] O. L. Mangasarian, and D. R. Musicant, “Successive Over
Relaxation for Support Vector Machines”, IEEE Transactions
on Neural Networks, vol. 10, no. 5, pp. 1032-1037, 2001.
[6] Xia Guoen, Jin Weidong, and Zhang Gexiang, “Synthetic
Evaluation Method Based Support Vector Classifier and
Regression Machine”, Journal of Southwest Jiaotong
University, vol. 41, no. 4, pp. 522-527, 2006.
[7] Han Hongchen, “Study of Nonlinear Dynamics and Analysis
of Stochastic Gradient Boosting in Price System”, Ph. D.
dissertation, Tianjin University, Tianjin, China, 2009.
[8] Guo Jianxiao, and Wang Hongli, “Analysis of Influencing
Factors in Real Estate Prices Based on Stochastic Gradient
Regression Model”, 2009 Third International Symposium on
Intelligent Information Technology Application, pp. 483-846,
2009.
[9] Zhang Lidong, Jia Lei, and Zhu Wenxing, “Overview of Traffic
Flow Hybrid ANN Forecasting Algorithm Study”, Institute
of Electrical and Electronics Engineers, Chengdu, 2010.
[10] Wu Jianxin, Zhou Zhihua, and Shen Xuehua, “A Selective
Constructing Approach to Neural Network Ensemble”,
Journal of Computer Research and Development, vol. 37,
no. 9, pp. 1039-1044, 2000.