The Prediction Study on Output Value and Profit of China Mobile Tianjin Branch

Hong-li Wang 1, Lin-xi Song 1,2, Ya-tao Zhang 3
1 School of Management, Tianjin University, Tianjin, P.R. China
2 China Mobile Tianjin Branch, Tianjin, P.R. China
3 School of Mechanical Engineering, Tianjin University, Tianjin, P.R. China
(wanghl@tju.edu.cn)

Abstract - This paper predicted the future development of China Mobile Tianjin Branch using support vector machine regression, stochastic gradient boosting and artificial neural networks. A comparison of the three methods showed that stochastic gradient regression had the highest precision for the prediction of output value, and that support vector machine regression had the highest precision for the prediction of profit. Using these two regression methods, the output value and profit for the next 4 years were predicted from the company's past data. The study will provide credible support for the formulation of the company's development strategy.

Keywords - prediction, China Mobile, SVM, stochastic gradient regression

Before 1994, the mobile communications market in China was monopolized by the posts and telecommunications department. In 1994, this monopoly was broken with the establishment of China Unicom. After posts and telecommunications operations were separated and the telecommunications sector was restructured, China Mobile Communications Corporation was officially established on April 20th, 2000. Since then, the performance of China Mobile Communications Corporation has increased rapidly, and the output value of the Tianjin Branch increased from 1.98 billion in 2000 to 7.40 billion in 2011.

In recent years, a number of scholars have carried out predictions of business data in different industries. Fang Kuangnan et al. [1] used the random forest method to predict the excess rate of return of Chinese funds, and demonstrated the predictability of China's financial markets.
Chen Bo et al. [2] predicted the output value of China's online game industry for the following few years based on grey prediction theory.

I. METHOD OF THE SUPPORT VECTOR MACHINE REGRESSION

In 1995, based on statistical learning theory, Vapnik and Cortes proposed a new machine learning method, the support vector machine (SVM) [3-5]. SVM has unique advantages in solving small-sample, nonlinear and high-dimensional pattern recognition problems, and it can be extended to other machine learning problems such as function fitting.

For an unknown function $y = f(x)$, where $x \in R^d$ and $y \in R$, we seek a fitting function $\hat{f}: R^d \to R$ that minimizes the distance function

$$R(f, \hat{f}) = \int L(f, \hat{f})\,dx \tag{1}$$

where $L$ is the loss function. Because $f$ is unknown, the fitting function $\hat{f}$ can only be determined from the samples that have been obtained. We use the regression function $\hat{f}(x) = \langle w, x \rangle + b$ to fit the sample data $(x_1, y_1), (x_2, y_2), \dots, (x_k, y_k)$, where $x_i \in R^d$ and $y_i \in R$, and first assume that all training data can be fitted by the linear function with accuracy $\varepsilon$:

$$\begin{cases} y_i - \langle w, x_i \rangle - b \le \varepsilon \\ \langle w, x_i \rangle + b - y_i \le \varepsilon \end{cases} \qquad i = 1, 2, \dots, k \tag{2}$$

To allow fitting errors, we introduce the slack variables $\xi_i \ge 0$ and $\xi_i^* \ge 0$. Formula (2) is rewritten as

$$\begin{cases} y_i - \langle w, x_i \rangle - b \le \varepsilon + \xi_i \\ \langle w, x_i \rangle + b - y_i \le \varepsilon + \xi_i^* \end{cases} \qquad i = 1, 2, \dots, k \tag{3}$$

The regression problem is thus transformed into minimizing the distance function subject to the constraints (3). The rewritten distance function is

$$R(w, \xi_i, \xi_i^*) = \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{k} (\xi_i + \xi_i^*) \tag{4}$$

where the constant $C > 0$ sets the penalty on fitting errors larger than $\varepsilon$.

II. METHOD OF THE STOCHASTIC GRADIENT REGRESSION

Stochastic gradient regression [6-8], put forward by Friedman in 2001, is a linear combination of several regression trees:

$$F(x) = \sum_{m=1}^{M} \alpha_m T_m(x) \tag{5}$$

where $\alpha_m$ are the combination coefficients and $T_m(x)$ are regression trees. The trees are built following the steepest-descent idea: each tree is fitted to the pseudo-residuals, which are given by the negative gradient of the loss function.
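The tree-building idea described above can be illustrated with a minimal Python sketch. It is not the implementation used in this study; it rests on several assumptions that are not in the original text: a squared-error loss (so that the pseudo-residual is simply $y - F(x)$), one-split regression trees on a single input variable as $T_m$, one constant shrinkage factor in place of the individual coefficients $\alpha_m$, and synthetic data. The names `fit_stump` and `fit_sgb` are hypothetical.

```python
import random

def fit_stump(xs, residuals):
    """Fit a one-split regression tree to (x, residual) pairs by least squares."""
    pairs = sorted(zip(xs, residuals))
    best = None
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # equal x values cannot be separated by a threshold
        left = [r for _, r in pairs[:k]]
        right = [r for _, r in pairs[k:]]
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
            best = (sse, threshold, lmean, rmean)
    if best is None:  # all x equal: predict the mean residual everywhere
        mean = sum(residuals) / len(residuals)
        return lambda x: mean
    _, threshold, lmean, rmean = best
    return lambda x: lmean if x <= threshold else rmean

def fit_sgb(xs, ys, n_trees=200, shrinkage=0.1, subsample=0.5, seed=0):
    """Stochastic gradient boosting sketch for squared-error loss."""
    rng = random.Random(seed)
    f0 = sum(ys) / len(ys)              # initial constant model
    preds = [f0] * len(xs)              # current F(x_i) on the full sample
    n_sub = min(len(xs), max(2, int(subsample * len(xs))))
    trees = []
    for _ in range(n_trees):
        # draw a random sub-sample without replacement
        idx = rng.sample(range(len(xs)), n_sub)
        # pseudo-residuals: negative gradient of (y - F)^2 / 2 is y - F
        res = [ys[i] - preds[i] for i in idx]
        tree = fit_stump([xs[i] for i in idx], res)
        trees.append(tree)
        # move the model a small step along the fitted tree
        preds = [p + shrinkage * tree(x) for p, x in zip(preds, xs)]
    return lambda x: f0 + shrinkage * sum(t(x) for t in trees)

# Synthetic illustration (assumed data, not from the paper): a step function.
xs = [float(i) for i in range(10)]
ys = [0.0] * 5 + [1.0] * 5
model = fit_sgb(xs, ys)
```

Drawing a fresh sub-sample in every iteration is what distinguishes the stochastic variant from plain gradient boosting; setting `subsample=1.0` in this sketch reduces it to the deterministic version.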
The stochastic gradient boosting algorithm is shown in Figure 1.

Figure 1. Stochastic gradient boosting algorithm

In Figure 1, $\{y_i, x_i\}_1^N$ represents the original sample and $\{y_{\pi(i)}, x_{\pi(i)}\}_1^{\hat{N}}$ is a random sub-sample, where $\hat{N} < N$ and $\{\pi(i)\}_1^N$ is a random permutation of $\{1, 2, \dots, N\}$.

Figure 2. The profit of Tianjin Mobile in 2000-2011 (billion)

III. THE PREDICTION STUDY ON CHINA MOBILE TIANJIN BRANCH

With the development of China's economy, and especially that of Tianjin, the output value of China Mobile Tianjin Branch increased from 1.98 billion in 2000 to 7.403 billion in 2011, and the profit of the company increased from 0.498 billion in 2000 to 1.573 billion in 2011. To provide a better basis for formulating the company's policies, it is necessary to predict the output value and profit for the next few years.

First, we determined the independent and dependent variables required by the regression methods. Since historical data were used to predict future output value and profit, the data of the previous n years served as the independent variables and the data of the (n+1)-th year served as the dependent variable (n = 2 in the calculations below). The 2011 data were not used for fitting; they were held out to check the accuracy of the predictions. Second, we analyzed the relationships among the independent variables and between the dependent variable and the independent variables. Predictions of the output value and profit were then obtained with three regression methods, and the method for each quantity was selected by comparison with the held-out data. The algorithms in this paper were implemented in R. In the calculations, the annual output value and profit were converted into annual growth rates, and the final predicted values were recovered from the predicted growth rates.
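The data preparation described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the R code used in the study: the helper names `growth_rates`, `lagged_samples` and `levels_from_rates` are hypothetical, the growth rates are the output value rates as read from Table IV, and the starting level of 1.98 billion is the 2000 output value given in the text.

```python
def growth_rates(levels):
    """Convert yearly levels into year-over-year growth rates."""
    return [(curr - prev) / prev for prev, curr in zip(levels, levels[1:])]

def lagged_samples(rates, n=2):
    """Pair every n consecutive rates (inputs) with the rate that follows (target)."""
    inputs = [rates[i:i + n] for i in range(len(rates) - n)]
    targets = [rates[i + n] for i in range(len(rates) - n)]
    return inputs, targets

def levels_from_rates(start_level, rates):
    """Rebuild yearly levels from a starting level and successive growth rates."""
    levels = [start_level]
    for rate in rates:
        levels.append(levels[-1] * (1.0 + rate))
    return levels

# Output value growth rates for 2001-2011, as read from Table IV.
output_rates = [0.120, 0.120, 0.120, 0.120, 0.126, 0.111,
                0.181, 0.293, 0.056, 0.121, 0.052]

inputs, targets = lagged_samples(output_rates, n=2)

# Targets 2003-2010 are used for fitting; the 2011 sample is held out.
train_X, train_y = inputs[:-1], targets[:-1]
test_x, test_y = inputs[-1], targets[-1]

# Going back from growth rates to levels, starting at the 2000 output value.
levels = levels_from_rates(1.98, output_rates)
```

With n = 2 and eleven growth rates, this leaves eight samples for fitting and one for verification, which shows how small the training set is. A predicted growth rate is turned back into a level the same way as in `levels_from_rates`: the previous year's level times one plus the predicted rate.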
A. The Profit Prediction by SVM Regression

The profit of China Mobile Tianjin Branch from 2000 to 2011 is shown in Figure 2, and its growth rates are shown in Table I.

TABLE I
ANNUAL GROWTH RATES OF THE PROFIT IN 2000-2011
Year    2001     2002     2003     2004     2005     2006
Rate    0.163    0.164    0.163    0.163    0.163    0.163
Year    2007     2008     2009     2010     2011
Rate    0.146    0.330    -0.065   -0.090   -0.016

Based on three regression methods (support vector machine (SVM), artificial neural networks (ANNs) [9-10] and stochastic gradient (SG)), the profit of 2011 was predicted and compared with the true profit. The results are shown in Table II.

TABLE II
THE ERRORS COMPARED WITH THE PROFIT IN 2011
Methods       SVM      ANNs     SG
Prediction    1.629    2.2978   1.689
Error         3.58%    46.07%   7.37%

By comparison, the SVM regression was the most effective for the profit forecast. We used this method to obtain the predicted profit for 2012-2015. The results are shown in Table III.

TABLE III
THE FORECASTING RESULTS OF PROFIT BASED ON SVM REGRESSION (BILLION)
Year          2012     2013     2014     2015
Prediction    1.584    1.645    1.688    1.716

B. The Output Value Prediction by Stochastic Gradient Regression

The output value of China Mobile Tianjin Branch from 2000 to 2011 is shown in Figure 3, and its growth rates are shown in Table IV.

Figure 3. The output value of Tianjin Mobile in 2000-2011 (billion)

TABLE IV
ANNUAL GROWTH RATES OF THE OUTPUT VALUE IN 2000-2011
Year    2001     2002     2003     2004     2005     2006
Rate    0.120    0.120    0.120    0.120    0.126    0.111
Year    2007     2008     2009     2010     2011
Rate    0.181    0.293    0.056    0.121    0.052

Based on the same three regression methods (SVM, ANNs and SG), the output value of 2011 was predicted and compared with the true value. The results are shown in Table V.

TABLE V
THE ERRORS COMPARED WITH THE OUTPUT VALUE IN 2011
Method        SVM      ANNs     SG
Prediction    8.007    7.999    7.783
Error         8.17%    8.06%    5.13%

By comparison, the stochastic gradient regression was the most effective for the output value forecast. We used this method to obtain the predicted output value for 2012-2015. The results are shown in Table VI.

TABLE VI
THE FORECASTING RESULTS OF OUTPUT VALUE BASED ON STOCHASTIC GRADIENT REGRESSION (BILLION)
Year          2012     2013     2014     2015
Prediction    8.379    9.108    9.716    10.879

IV. CONCLUSION

Using three regression methods, this paper built prediction models for output value and profit, and predicted both quantities for China Mobile Tianjin Branch over the next four years. The results indicate that the output value of the company will exceed 10 billion by 2015. The methods used in this paper can be applied to similar problems.

REFERENCES

[1] Fang Kuangnan, Zhu Jianping, and Xie Bangchang, “A Research into the Forecasting of Fund Return Rate Direction and Trading Strategies Based on the Random Forest Method”, Economic Survey, vol. 2, no. 9, pp. 61-66, 2010.
[2] Chen Bo, and Li Fuming, “Application of Grey Theory to Predict Chinese Online Game Output”, Journal of Shandong University of Technology (Nature Science Edition), vol. 23, no. 6, pp. 52-54, Nov. 2009.
[3] Wang Guosheng, “Research on Theory and Algorithm for Support Vector Machine Classifier”, Ph.D. dissertation, Beijing University of Posts and Telecommunications, Beijing, China, 2007.
[4] O. Chapelle, and V. Vapnik, “Model Selection for Support Vector Machines”, Advances in Neural Information Processing Systems, MIT Press, 2000.
[5] O. L. Mangasarian, and D. R. Musicant, “Successive Over Relaxation for Support Vector Machines”, IEEE Transactions on Neural Networks, vol. 10, no. 5, pp. 1032-1037, 2001.
[6] Xia Guoen, Jin Weidong, and Zhang Gexiang, “Synthetic Evaluation Method Based Support Vector Classifier and Regression Machine”, Journal of Southwest Jiaotong University, vol. 41, no. 4, pp. 522-527, 2006.
[7] Han Hongchen, “Study of Nonlinear Dynamics and Analysis of Stochastic Gradient Boosting in Price System”, Ph.D. dissertation, Tianjin University, Tianjin, China, 2009.
[8] Guo Jianxiao, and Wang Hongli, “Analysis of Influencing Factors in Real Estate Prices Based on Stochastic Gradient Regression Model”, 2009 Third International Symposium on Intelligent Information Technology Application, pp. 483-846, 2009.
[9] Zhang Lidong, Jia Lei, and Zhu Wenxing, “Overview of Traffic Flow Hybrid ANN Forecasting Algorithm Study”, Institute of Electrical and Electronics Engineers, Chengdu, 2010.
[10] Wu Jianxin, Zhou Zhihua, and Shen Xuehua, “A Selective Constructing Approach to Neural Network Ensemble”, Journal of Computer Research and Development, vol. 37, no. 9, pp. 1039-1044, 2000.