Forecasting stock exchange movements using neural networks


Forecasting stock exchange movements using neural networks: Empirical evidence from Kuwait

Mohamed M. Mostafa, New York Institute of Technology, Global Program in Bahrain, Manama, Bahrain

Available online 23 February 2010.

Abstract

Financial time series are complex and dynamic, characterized by extreme volatility. The major aim of this research is to forecast Kuwait stock exchange (KSE) closing price movements using data for the period 2001–2003. Two neural network architectures, the multi-layer perceptron (MLP) and the generalized regression neural network (GRNN), are used to predict the KSE closing price movements. The results of this study show that neuro-computational models are useful tools for forecasting stock exchange movements in emerging markets. The results also indicate that the quasi-Newton training algorithm produces fewer forecasting errors than other training algorithms. Owing to the robustness and flexibility of their modeling algorithms, neuro-computational models are expected to outperform traditional statistical techniques such as regression and ARIMA in forecasting stock exchange price movements.

Keywords: KSE; Neural networks; Forecasting

Article Outline

1. Introduction and related work

2. Methodology

2.1. Data

2.2. Multi-layer perceptron

2.3. Generalized regression neural network

3. Results

3.1. MLP-based forecasting

3.2. GRNN-based forecasting

4. Implications, limitations and future research

References

1. Introduction and related work

The increasing globalization of the financial markets has heightened interest in emerging markets. In the Middle East there are 11 formal stock markets monitored by the Standard and Poor's Emerging Markets Database (Smith, 2007). Stock markets are usually classified as either developed or emerging. Three of the Middle East stock markets are categorized as developed: Kuwait, United Arab Emirates and Qatar. Middle East stock markets are, however, relatively small by world standards, as the 11 formal stock markets in the region account for around 0.9% of world stock market capitalization (Smith, 2007).

The Kuwait stock exchange (KSE) was officially established in 1984 after the crash of Almanakh (the over-the-counter market). In the post-liberation period (1992) the KSE witnessed many reforms, among them the government privatization program to sell its holdings of shares in local shareholding companies, the launching of mutual funds, and the emergence of institutional investors as the dominant players in the market (Al-Loughani & Chappell, 2000).

Throughout its history, the KSE has been characterized by irregularity in trades and in the price formation process. However, the Kuwaiti parliament has recently introduced measures that permit foreign investors to trade on the KSE (Al-Loughani & Chappell, 2000). This study aims to forecast KSE movements using neural network (NN) models.

NN models have been successfully used in prediction and forecasting studies across many disciplines. One of the first successful applications of the MLP is reported by Lapedes and Farber (1988). Using two deterministic chaotic time series generated by the logistic map and the Mackey–Glass equation, they designed an MLP that can accurately mimic and predict such dynamic non-linear systems. Another major application of the MLP is in electric load forecasting (e.g., Darbellay & Slama, 2000; McMenamin & Monforte, 1998).

Many other problems have been solved by the MLP. A short list includes air pollution forecasting (e.g., Videnova, Nedialkova, Dimitrova, & Popova, 2006), maritime traffic forecasting (Mostafa, 2004), airline passenger traffic forecasting (Nam & Yi, 1997), railway traffic forecasting (Zhuo, Li-Min, Yong, & Yan-hui, 2007), commodity prices (Kohzadi, Boyd, Kemlanshahi, & Kaastra, 1996), ozone levels (Ruiz-Suarez, Mayora-Ibarra, Torres-Jimenez, & Ruiz-Suarez, 1995), student grade point averages (Gorr, Nagin, & Szczypula, 1994), macroeconomic data (Aminian, Suarez, Aminian, & Walz, 2006), financial time series (Yu, Wang, & Lai, 2009), advertising (Poh, Yao, & Jasic, 1998), and market trends (Aiken & Bsat, 1999).

Due to its good performance in noisy environments, the generalized regression neural network (GRNN) has been used extensively in prediction and forecasting tasks in the literature. For example, Gaetz, Weinberg, Rzempoluck, and Jantzen (1998) analyzed EEG activity of the brain using a GRNN. Chtioui, Panigrahi, and Francl (1999) used a GRNN for leaf wetness prediction. Ibric, Jovanovi, Djuri, Paroj, and Solomun (2002) used a GRNN in the design of extended-release aspirin tablets. Cigizoglu (2005) employed a GRNN to forecast monthly water flow in Turkey; in that study, GRNN forecasting performance was found to be superior to the MLP and other statistical and stochastic methods. Kim and Lee (2005) used a GRNN-based genetic algorithm to predict silicon oxynitride etching. Hanna, Ural, and Saygili (2007) developed a GRNN model to predict seismic conditions in sites susceptible to liquefaction. Shie (2008) used a hybrid method integrating a GRNN and sequential quadratic programming to determine an optimal parameter setting for an injection-molding process.

There is an extensive literature on financial applications of NNs (e.g., Harvey, Travers, & Costa, 2000; Kumar & Bhattacharya, 2006). For example, Cao, Leggio, and Schniederjans (2005) used NNs to predict stock price movements for firms traded on the Shanghai stock exchange. The authors compared the predictive power of linear models to that of univariate and multivariate NN models. Results showed that the NN models outperform the linear models; these results were statistically significant across the sample firms and indicated that NN models are useful for stock price prediction. Kryzanowski, Galler, and Wright (1993) used NN models with historic accounting and macroeconomic data to identify stocks that will outperform the market. McGrath (2002) used market-to-book and price-earnings ratios in an NN model to rank stocks based on likelihood estimates. Ferson and Harvey (1993) and Kimoto, Asakawa, Yoda, and Takeoka (1990) used a series of macroeconomic variables to capture predictable variation in stock price returns. McNelis (1996) used data from the Chilean stock market to predict returns in the Brazilian markets. Yumlu, Gurgen, and Okay (2005) used various NN architectures to model the performance of the Istanbul stock exchange over the period 1990–2002. Leigh, Hightower, and Modani (2005) used NN models and linear regression models to model the New York Stock Exchange Composite Index for the period 1981–1999; results were robust and informative as to the role of trading volume in the stock market. Chen, Leung, and Daouk (2003) predicted the direction of return on the market index of the Taiwan stock exchange using a probabilistic NN model, and compared the results to the generalized method of moments (GMM) with a Kalman filter. From this literature survey we find that no previous study has attempted to predict the movements of the KSE. In this study we aim to fill this research gap through the application of MLP and GRNN models to forecast the movements of the KSE.

2. Methodology

2.1. Data

This study covers the period June 17, 2001 through November 30, 2003; the data set thus contains 612 data points in the time series, a length quite similar to the data sets used in previous studies of a similar nature. We used data for all listed companies traded on the KSE. The data consist of daily closing prices, obtained from the KSE (www.Kuwaitse.com). Table 1 shows descriptive statistics of the opening and closing prices used in this study.

Table 1.
Descriptive statistics of KSE opening and closing prices.

Statistic             Open           Close
Mean                  2550.021242    2553.124346
Standard error        36.7838047     36.81831376
Median                2196.25        2194.55
Mode                  1746.7         1764.2
Standard deviation    909.9810725    910.8347795
Sample variance       828065.5523    829619.9955
Kurtosis              −0.63889504    −0.64911602
Skewness              0.891027464    0.88577702
Range                 2945.2         2942.8
Minimum               1578.7         1579
Maximum               4523.9         4521.8
Count                 612            612

2.2. Multi-layer perceptron

The MLP consists of sensory units that make up the input layer, one or more hidden layers of processing units (perceptrons), and one output layer of processing units. The MLP performs a functional mapping from the input space to the output space. An MLP with a single hidden layer having H hidden units and a single output, y, implements mappings of the form

y = W_0 + \sum_{h=1}^{H} W_h Z_h,   (1)

Z_h = f\Big(\sum_{j=1}^{N} \beta_{jh} X_j\Big),   (2)

where Z_h is the output of the h-th hidden unit, W_h is the weight between the h-th hidden unit and the output unit, and W_0 is the output bias. There are N sensory inputs X_j; the j-th input is weighted by an amount β_jh in the h-th hidden unit, and f(·) is the activation function. The output of an MLP is compared to a target output and an error is calculated. This error is back-propagated through the network and used to adjust the weights, with the aim of minimizing the mean square error between the network's prediction and the target output.
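For concreteness, here is a minimal NumPy sketch of the mapping in Eqs. (1) and (2); the sigmoid hidden-layer activation and the array layout are assumptions, since the text does not name the transfer function.

```python
import numpy as np

def mlp_forward(x, beta, w, w0):
    """Single-hidden-layer MLP mapping of Eqs. (1)-(2).

    x    : input vector of N sensory values X_j
    beta : (H, N) array; beta[h, j] weights input j in hidden unit h
    w    : (H,) hidden-to-output weights W_h
    w0   : scalar output bias W_0
    """
    z = 1.0 / (1.0 + np.exp(-beta @ x))  # Eq. (2): hidden outputs Z_h (sigmoid assumed)
    return w0 + w @ z                    # Eq. (1): y = W_0 + sum_h W_h * Z_h
```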

The MLP was first developed to mimic the functioning of the brain. It consists of interconnected nodes, referred to as processing elements, that receive, process, and transmit information. The MLP consists of three types of layers: the first is the input layer, which corresponds to the problem input variables, with one node for each input variable. The second is the hidden layer, which is useful in capturing non-linear relationships among variables. The final layer is the output layer, which corresponds to the classification being predicted (Baranoff, Sager, & Shively, 2000). Fig. 1 represents the typical structure of the MLP.

Fig. 1. General structure of the MLP.

First, the network has to be trained until it produces a tolerably small error. Training proceeds as follows. Input is fed to the input nodes, and the hidden-layer nodes take the input values and process them according to the randomly allocated initial weights of the links. The values travel from one layer to the next, each layer processing them based on the weights of its links. When a value finally reaches the output node, the actual output is compared with the expected output. The difference is calculated and propagated backwards, at which point the links adjust their weights. After the error has propagated all the way back to the first layer of hidden nodes, the input is fed to the input nodes again. The cycle repeats and the weights are adjusted over and over until the error is minimized. The key here is the weights of the links: they determine the output value.
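The cycle just described can be sketched as a plain gradient-descent loop; this is an illustrative reconstruction under assumed sigmoid hidden units and a squared-error criterion, not the exact implementation used in the study.

```python
import numpy as np

def train_mlp(X, targets, H, lr=0.01, tol=1e-4, max_epochs=1000, seed=0):
    """Backpropagation cycle: forward pass, compare with expected output,
    propagate the error backwards, adjust link weights, repeat until the
    error is tolerable. Initial weights are randomly allocated."""
    rng = np.random.default_rng(seed)
    beta = rng.normal(scale=0.1, size=(H, X.shape[1]))  # input -> hidden weights
    w = rng.normal(scale=0.1, size=H)                   # hidden -> output weights
    w0 = 0.0                                            # output bias
    for _ in range(max_epochs):
        sse = 0.0
        for x, t in zip(X, targets):
            z = 1.0 / (1.0 + np.exp(-beta @ x))         # hidden-layer outputs
            y = w0 + w @ z                              # network output
            err = t - y                                 # actual vs. expected
            # back-propagate the error and adjust the link weights
            w += lr * err * z
            w0 += lr * err
            beta += lr * err * (w * z * (1 - z))[:, None] * x[None, :]
            sse += err ** 2
        if sse / len(X) < tol:                          # tolerable error reached
            break
    return beta, w, w0
```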

The MLP is the most frequently used neural network technique in pattern recognition (Bishop, 1999) and classification problems (Sharda, 1994). However, numerous researchers have documented the disadvantages of the MLP approach. For example, Calderon and Cheh (2002) argue that the standard MLP network is subject to problems of local minima. Swicegood and Clark (2001) note that there is no formal method of deriving an MLP configuration for a given classification task, so there is no direct way to find the optimal structure for the modeling process. Consequently, the refinement process can be lengthy, accomplished by iteratively testing various architectural parameters and keeping only the most successful structures. Wang (1995) argues that the standard MLP provides unpredictable solutions when classifying statistical data.

2.3. Generalized regression neural network

The GRNN was devised by Specht (1991), who cast a statistical method of function approximation into neural network form. Like the MLP, the GRNN is able to approximate any functional relationship between inputs and outputs (Wasserman, 1993). Structurally, the GRNN resembles the MLP. Unlike the MLP, however, the GRNN does not require the number of hidden units to be estimated before training can take place. Furthermore, the GRNN differs from the classical MLP in that every weight is replaced by a distribution of weights, which minimizes the chance of ending up in a local minimum. Therefore, no test and verification sets are required, and in principle all available data can be used for training the network (Parojcic, Ibric, Djuric, Jovanovic, & Corrigan, 2005).

The GRNN is a method of estimating the joint probability density function (pdf) of x and y given only a training set. The estimated value is the most probable value of y and is defined by

\hat{y}(x) = \frac{\int_{-\infty}^{\infty} y\, f(x, y)\, dy}{\int_{-\infty}^{\infty} f(x, y)\, dy}.   (3)

The density function f(x, y) can be estimated from the training set using Parzen's estimator (Parzen, 1962):

\hat{f}(x, y) = \frac{1}{n\,(2\pi)^{(p+1)/2}\,\sigma^{p+1}} \sum_{i=1}^{n} \exp\!\left[-\frac{(x - x_i)^T (x - x_i)}{2\sigma^2}\right] \exp\!\left[-\frac{(y - y_i)^2}{2\sigma^2}\right],   (4)

where p is the dimension of x and n is the number of training samples. The probability estimate f(x, y) assigns a sample probability of width σ to each sample x_i and y_i, and the probability estimate is the sum of these sample probabilities (Specht, 1991). Defining the scalar function

D_i^2 = (x - x_i)^T (x - x_i)   (5)

and carrying out the indicated integration yields the following:

\hat{y}(x) = \frac{\sum_{i=1}^{n} y_i \exp(-D_i^2 / 2\sigma^2)}{\sum_{i=1}^{n} \exp(-D_i^2 / 2\sigma^2)}.   (6)

The resulting regression (6) is directly applicable to problems involving numerical data.
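A minimal sketch of evaluating the regression in Eq. (6); the function and variable names are illustrative, not from the paper.

```python
import numpy as np

def grnn_predict(x, X_train, y_train, sigma):
    """GRNN output of Eq. (6): a Gaussian-kernel weighted average of the
    training targets, with sample-probability width sigma."""
    d2 = np.sum((X_train - x) ** 2, axis=1)    # D_i^2 of Eq. (5)
    k = np.exp(-d2 / (2.0 * sigma ** 2))       # radial-unit activations
    return np.dot(k, y_train) / np.sum(k)      # weighted sum / sum of weights
```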

The first hidden layer in the GRNN contains the radial units. A second hidden layer contains units that help to estimate the weighted average. This is a specialized procedure: each output has a special unit in this layer that forms the weighted sum for the corresponding output. To get the weighted average from the weighted sum, the weighted sum must be divided by the sum of the weighting factors; a single special unit in the second layer calculates the latter value. The output layer then performs the actual divisions (using special division units). Hence, the second hidden layer always has exactly one more unit than the output layer. In regression problems, typically only a single output is estimated, so the second hidden layer usually has two units.

The GRNN can be modified by assigning radial units that represent clusters rather than individual training cases; this reduces the size of the network and increases execution speed. Centers can be assigned using any appropriate algorithm (e.g., sub-sampling, K-means or Kohonen networks), as sketched below. Fig. 2 shows the general structure of the GRNN.
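A sketch of the cluster-based modification, assuming K-means centers; each center keeps the sum of its members' targets and a member count, which suffices to reproduce the weighted average of Eq. (6) at much lower cost.

```python
import numpy as np

def fit_clustered_grnn(X_train, y_train, n_clusters, n_iter=50, seed=0):
    """Assign radial units to K-means cluster centers instead of individual
    training cases (one of the center-assignment algorithms suggested above)."""
    rng = np.random.default_rng(seed)
    centers = X_train[rng.choice(len(X_train), n_clusters, replace=False)].astype(float)
    for _ in range(n_iter + 1):
        d = ((X_train[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)               # nearest center for each case
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = X_train[labels == c].mean(axis=0)
    y_sum = np.array([y_train[labels == c].sum() for c in range(n_clusters)])
    counts = np.bincount(labels, minlength=n_clusters)
    return centers, y_sum, counts

def clustered_grnn_predict(x, centers, y_sum, counts, sigma):
    d2 = np.sum((centers - x) ** 2, axis=1)
    k = np.exp(-d2 / (2.0 * sigma ** 2))
    return np.dot(k, y_sum) / np.dot(k, counts)  # counts weight the denominator
```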

Fig. 2. General structure of the generalized regression neural network (GRNN).

3. Results

3.1. MLP-based forecasting

There are many software packages available for analyzing MLP models. We chose the NeuroIntelligence package (Alyuda Research Company, 2003), which applies artificial-intelligence techniques to find an efficient MLP architecture automatically. Typically, the application of an MLP requires a training data set and a testing data set (Lek & Guegan, 1999). The training data set is used to train the MLP and must contain enough examples to be representative of the overall problem. The testing data set should be independent of the training set and is used to assess the classification/prediction accuracy of the MLP after training. Following Lim and Kirikoshi (2005), an error back-propagation algorithm with weight updates occurring after each epoch was used for MLP training.

Fig. 3 shows the actual versus the fitted closing price values for the whole series using the quick propagation training algorithm. Fig. 4 shows the negative exponential decay in the error rate. As can be seen from Fig. 4, the best network was obtained after around 500 epochs (trials). This figure is quite similar to the typical convergence of errors in MLP models (similar error graphs were obtained with the other training algorithms).

Fig. 3. Actual versus fit using the quick propagation training algorithm.

Fig. 4. Error convergence and best network error.

Fig. 5 shows the actual versus the fitted closing price values for the whole series using the conjugate gradient descent training algorithm. Fig. 6 shows the actual versus the fitted closing price values for the whole series using the quasi-Newton training algorithm.

Fig. 5. Actual versus fit using the conjugate gradient descent training algorithm.

Fig. 6. Actual versus fit using the quasi-Newton training algorithm.

Based on the descriptive statistics of the different training algorithms reported in Table 2, along with the plots for each algorithm, the quasi-Newton training algorithm appears to produce fewer forecasting errors than the other two methods. This is also evident from the target versus output graph and from the error dependence graph for the quasi-Newton training algorithm (Fig. 7).
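The Table 2 statistics can be reproduced from the target and output series roughly as follows; the AE and ARE definitions (absolute error and absolute relative error) are assumptions, as the paper does not state them explicitly.

```python
import numpy as np

def forecast_error_summary(target, output):
    """Mean/SD/min/max of absolute error (AE) and absolute relative error
    (ARE), plus the target-output correlation reported in the footnotes."""
    ae = np.abs(target - output)
    are = ae / np.abs(target)
    summarize = lambda v: {"mean": v.mean(), "sd": v.std(ddof=1),
                           "min": v.min(), "max": v.max()}
    return {"AE": summarize(ae), "ARE": summarize(are),
            "correlation": np.corrcoef(target, output)[0, 1]}
```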

Table 2.
MLP training algorithms: major statistics.

                               Target      Output      AE        ARE
Quick propagation (a)
  Mean                         2534.744    2537.564    48.450    0.021
  SD                           908.299     890.698     31.966    0.015
  Min                          1579.000    1708.253    0.336     0.000
  Max                          4521.800    4369.447    155.094   0.084
Conjugate gradient descent (b)
  Mean                         2534.744    2535.314    26.598    0.010
  SD                           908.299     902.767     22.605    0.009
  Min                          1579.000    1653.851    0.334     0.000
  Max                          4521.800    4433.294    170.075   0.049
Quasi-Newton algorithm (c)
  Mean                         2534.744    2535.313    24.956    0.010
  SD                           908.299     903.801     21.542    0.008
  Min                          1579.000    1646.779    0.303     0.000
  Max                          4521.800    4442.282    166.271   0.046

(a) Correlation = 0.998; R² = 0.996. (b) Correlation = 0.999; R² = 0.999. (c) Correlation = 0.999; R² = 0.999.

Fig. 7. Predicted vs. actual and error dependence graph using the quasi-Newton training algorithm.

3.2. GRNN-based forecasting

Many computer software packages are available for building and analyzing NNs. Because of its extensive capabilities for building networks based on a variety of training and learning methods, the NeuralTools Professional package (Palisade Corporation, 2005) was chosen to conduct the GRNN analysis in this study. This software automatically scales all input data, mapping each variable into a fixed range. NeuralTools uses a non-linear scaling function known as 'tanh', which scales inputs to a (−1, 1) range. This function tends to squeeze data together at the low and high ends of the original data range and may thus help reduce the effects of outliers (Tam, Tong, Lau, & Chan, 2005). Table 3 shows the basic properties of the GRNN model used in this study. Fig. 8 and Fig. 9 show the error distribution and the predicted versus actual results using the GRNN. These figures indicate the robustness of the GRNN and the normality of its error distributions.
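A minimal sketch of such tanh scaling; standardizing before squashing is an assumption, since the package's exact formula is not given in the paper.

```python
import numpy as np

def tanh_scale(x):
    """Squash a variable into (-1, 1): values far from the centre are
    compressed, which damps the influence of outliers."""
    z = (x - x.mean()) / x.std(ddof=1)  # centre and scale first (assumed step)
    return np.tanh(z)
```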

Table 3.
GRNN architecture.

Net information
  Configuration                        GRNN Numeric Predictor

Training
  Number of cases                      490
  Number of trials                     61
  Reason stopped                       Auto-stopped
  % Bad predictions (30% tolerance)    0.0000%
  Root mean square error               24.51
  Mean absolute error                  17.13
  Std. deviation of abs. error         17.53

Testing
  Number of cases                      122
  % Bad predictions (30% tolerance)    0.0000%
  Root mean square error               32.12
  Mean absolute error                  22.30
  Std. deviation of abs. error         23.12

Data set
  Number of rows                       612

Fig. 8. GRNN histogram of residuals (training/testing).

Fig. 9. GRNN predicted versus actual values (training/testing).

4. Implications, limitations and future research

Our results confirm the theoretical work of Hecht-Nielson (1989), who showed that NNs can learn input–output relationships to the point of making perfect forecasts on the data on which the network is trained. The good performance of the NN models in predicting KSE closing price movements can be traced to their inherent non-linearity, which makes NNs ideal for dealing with non-linear relations that may exist in the data. Thus, neuro-computational models are needed to better understand the inner dynamics of stock markets. Our results are also in line with the findings of other researchers who have compared the performance of NNs with traditional statistical techniques such as regression analysis, discriminant analysis, and logistic regression. For example, in a study of credit-scoring models used in commercial and consumer lending decisions, Bensic, Sarlija, and Zekic-Susac (2005) compared the performance of logistic regression, neural networks, and decision trees; the probabilistic NN model produced the highest hit rate and the lowest Type I error. Similar findings have been reported in studies of bankruptcy prediction (Anandarajan, Lee, & Anandarajan, 2001) and the diagnosis of acute appendicitis (Sakai et al., 2007).

Despite its contributions, this study suffers from a number of limitations. First, despite the satisfactory performance of the NN models used here, future research might improve on them by integrating fuzzy discriminant analysis and genetic algorithms (GAs) with NN models. Mirmirani and Li (2004) point out that traditional algorithms search for optimal weight vectors for a neural network with a given architecture, whereas GAs can explore the search space efficiently when the modeler has little a priori knowledge of the structure of the problem domain.

Second, future research might use other NN architectures, such as self-organizing maps (SOMs), to classify movements in the KSE. Owing to the unsupervised character of their learning algorithm and their excellent visualization ability, SOMs have recently been used in myriad classification tasks. Examples include classifying cognitive performance in schizophrenic patients and healthy individuals (Silver & Shmoish, 2008), mutual fund classification (Moreno, Marco, & Olmeda, 2006), crude oil classification (Fonseca, Biscaya, de Sousa, & Lobo, 2006), and classifying magnetic resonance brain images (Chaplot, Patnaik, & Jagannathan, 2006).

References

Aiken and Bsat, 1999 M. Aiken and M. Bsat, Forecasting market trends with neural networks, Information Systems Management 16 (1999), pp. 42–49.

Al-Loughani and Chappell, 2000 N. Al-Loughani and D. Chappell, Modelling the day-of-the-week effect in the Kuwait stock exchange: A non-linear GARCH representation, Applied Financial Economics 11 (2000), pp. 353–359.

Alyuda Research Company, 2003 Alyuda Research Company (2003). NeuroIntelligence user manual (Version 2.1).

Aminian et al., 2006 F. Aminian, E. Suarez, M. Aminian and D. Walz, Forecasting economic data with neural networks, Computational Economics 28 (2006), pp. 71–88.

Anandarajan et al., 2001 M. Anandarajan, P. Lee and A. Anandarajan, Bankruptcy prediction of financially stressed firms: An examination of the predictive accuracy of artificial neural networks, International Journal of Intelligent Systems in Accounting, Finance and Management 10 (2001), pp. 69–81.

Baranoff et al., 2000 E. Baranoff, T. Sager and T. Shively, A semi-parametric stochastic spline model as a managerial tool for potential insolvency, Journal of Risk and Insurance 67 (2000), pp. 369–396.

Bensic et al., 2005 M. Bensic, N. Sarlija and M. Zekic-Susac, Modelling small-business credit scoring by using logistic regression, neural networks and decision trees, Intelligent Systems in Accounting, Finance and Management 13 (2005), pp. 133–150.

Bishop, 1999 C. Bishop, Neural networks for pattern recognition, Oxford University Press, New York (1999).

Calderon and Cheh, 2002 T. Calderon and J. Cheh, A roadmap for future neural networks research in auditing and risk assessment, International Journal of Accounting Information Systems 3 (2002), pp. 203–236.

Cao et al., 2005 Q. Cao, K. Leggio and M. Schniederjans, A comparison between Fama and French's model and artificial neural networks in predicting the Chinese stock market, Computers and Operations Research 32 (2005), pp. 2499–2512.

Chaplot et al., 2006 S. Chaplot, L. Patnaik and N. Jagannathan, Classification of magnetic resonance brain images using wavelets as input to support vector machines and neural network, Biomedical Signal Processing and Control 1 (2006), pp. 86–92.

Chen et al., 2003 A. Chen, M. Leung and H. Daouk, Application of neural networks to an emerging financial market: Forecasting and trading the Taiwan stock index, Computers and Operations Research 30 (2003), pp. 901–923.

Chtioui et al., 1999 Y. Chtioui, S. Panigrahi and L. Francl, A generalized regression neural network and its application for leaf wetness prediction to forecast plant disease, Chemometrics and Intelligent Laboratory Systems 48 (1999), pp. 47–58.

Cigizoglu, 2005 H. Cigizoglu, Generalized regression neural network in monthly flow forecasting, Civil Engineering and Environmental Systems 22 (2005), pp. 71–84.

Darbellay and Slama, 2000 G. Darbellay and M. Slama, Forecasting the short-term demand for electricity: Do neural networks stand a better chance?, International Journal of Forecasting 16 (2000), pp. 71–83.

Ferson and Harvey, 1993 W. Ferson and C. Harvey, The risk and predictability of international equity returns, Review of Financial Studies 6 (1993), pp. 527–566.

Fonseca et al., 2006 A. Fonseca, J. Biscaya, J. de Sousa and A. Lobo, Geographical classification of crude oils by Kohonen self-organizing maps, Analytica Chimica Acta 556 (2006), pp. 374–382.

Gaetz et al., 1998 M. Gaetz, H. Weinberg, E. Rzempoluck and K. Jantzen, Neural network classification and correlation analysis of EEG and MEG activity accompanying spontaneous reversals of the Necker cube, Cognitive Brain Research 6 (1998), pp. 335–346.

Gorr et al., 1994 W. Gorr, D. Nagin and J. Szczypula, Comparative study of artificial neural network and statistical models for predicting student grade point averages, International Journal of Forecasting 10 (1994), pp. 17–34.

Hanna et al., 2007 A. Hanna, D. Ural and G. Saygili, Evaluation of liquefaction potential of soil deposits using artificial neural networks, Engineering Computations 24 (2007), pp. 5–16.

Harvey et al., 2000 C. Harvey, K. Travers and M. Costa, Forecasting emerging market returns using neural networks, Emerging Markets Quarterly 4 (2000), pp. 43–55.

Hecht-Nielson, 1989 R. Hecht-Nielson, Theory of the back-propagation neural network. In: International joint conference on neural networks (pp. 593–605), Washington, DC (1989).

Ibric et al., 2002 S. Ibric, M. Jovanovi, C. Djuri, J. Paroj and L. Solomun, The application of generalized regression neural network in the modeling and optimization of aspirin extended release tablets with Eudragit RSPO as matrix substance, Journal of Controlled Release 82 (2002), pp. 213–222.

Kim and Lee, 2005 B. Kim and B. Lee, Prediction of silicon oxynitride plasma etching using a generalized regression neural network, Journal of Applied Physics 98 (2005), pp. 1–6.

Kimoto et al., 1990 T. Kimoto, K. Asakawa, M. Yoda and M. Takeoka, Stock market prediction system with modular neural networks. In: Proceedings of the IEEE international conference on neural networks (pp. 1–16) (1990).

Kohzadi et al., 1996 N. Kohzadi, M. Boyd, B. Kemlanshahi and I. Kaastra, A comparison of artificial neural network and time series models for forecasting commodity prices, Neurocomputing 10 (1996), pp. 169–181.

Kryzanowski et al., 1993 L. Kryzanowski, M. Galler and D. Wright, Using artificial neural networks to pick stocks, Financial Analysts Journal 49 (1993), pp. 21–27.

KSE, 2008 KSE (www.Kuwaitse.com), visited on November 10, 2008.

Kumar and Bhattacharya, 2006 K. Kumar and S. Bhattacharya, Artificial neural network vs. linear discriminant analysis in credit ratings forecast, Review of Accounting and Finance 5 (2006), pp. 216–227.

Lapedes and Farber, 1988 A. Lapedes and R. Farber, How neural nets work. In: D. Anderson, Editor, Neural information processing systems, American Institute of Physics, New York (1988), pp. 442–456.

Leigh et al., 2005 W. Leigh, R. Hightower and N. Modani, Forecasting the New York stock exchange composite index with past price and interest rate on condition of volume spike, Expert Systems with Applications 28 (2005), pp. 1–8.

Lek and Guegan, 1999 S. Lek and J. Guegan, Artificial neural networks as a tool in ecological modelling: An introduction, Ecological Modelling 120 (1999), pp. 65–73.

Lim and Kirikoshi, 2005 C. Lim and T. Kirikoshi, Predicting the effects of physician-directed promotion on prescription yield and sales uptake using neural networks, Journal of Targeting, Measurement and Analysis for Marketing 13 (2005), pp. 158–167.

McGrath, 2002 C. McGrath, Terminator portfolio, Kiplinger's Personal Finance 56 (2002), pp. 56–57.

McMenamin and Monforte, 1998 J. McMenamin and F. Monforte, Short term energy forecasting with neural networks, Energy Journal 19 (1998), pp. 43–52.

McNelis, 1996 P. McNelis, A neural network analysis of Brazilian stock prices: Tequila effects vs. pisco sour effects, Journal of Emerging Markets 1 (1996), pp. 29–44.

Mirmirani and Li, 2004 S. Mirmirani and H. Li, Gold price, neural networks and genetic algorithm, Computational Economics 23 (2004), pp. 193–200.

Moreno et al., 2006 D. Moreno, P. Marco and I. Olmeda, Self-organizing maps could improve the classification of Spanish mutual funds, European Journal of Operational Research 147 (2006), pp. 1039–1054.

Mostafa, 2004 M. Mostafa, Forecasting the Suez Canal traffic: A neural network analysis, Maritime Policy and Management 31 (2004), pp. 139–156.

Nam and Yi, 1997 K. Nam and J. Yi, Predicting airline passenger volume, Journal of Business Forecasting Methods and Systems 16 (1997), pp. 14–17.

Palisade, 2005 Palisade Corporation (2005). NeuralTools professional user guide (Version 1.0). Ithaca, New York: Palisade Corporation.

Parojcic et al., 2005 J. Parojcic, S. Ibric, Z. Djuric, M. Jovanovic and O. Corrigan, An investigation into the usefulness of generalized regression neural network analysis in the development of level A in vitro–in vivo correlation, European Journal of Pharmaceutical Sciences 30 (2005), pp. 264–272.

Parzen, 1962 E. Parzen, On the estimation of a probability density function and mode, Annals of Mathematical Statistics 33 (1962), pp. 1065–1076.

Poh et al., 1998 H. Poh, J. Yao and T. Jasic, Neural networks for the analysis and forecasting of advertising impact, International Journal of Intelligent Systems in Accounting, Finance and Management 7 (1998), pp. 253–268.

Ruiz-Suarez et al., 1995 J. Ruiz-Suarez, O. Mayora-Ibarra, J. Torres-Jimenez and L. Ruiz-Suarez, Short-term ozone forecasting by artificial neural network, Advances in Engineering Software 23 (1995), pp. 143–149.

Sakai et al., 2007 S. Sakai, K. Kobayashi, S. Toyabe, N. Mandai, T. Kanda and K. Akazawa, Comparison of the levels of accuracy of an artificial neural network model and a logistic regression model for the diagnosis of acute appendicitis, Journal of Medical Systems 31 (2007), pp. 357–364.

Sharda, 1994 R. Sharda, Neural networks for the MS/OR analyst: An application bibliography, Interfaces 24 (1994), pp. 116–130.

Shie, 2008 J. Shie, Optimization of injection-molding process for mechanical properties of polypropylene components via a generalized regression neural network, Polymers for Advanced Technologies 19 (2008), pp. 73–83.

Silver and Shmoish, 2008 H. Silver and M. Shmoish, Analysis of cognitive performance in schizophrenia patients and healthy individuals with unsupervised clustering models, Psychiatry Research 159 (2008), pp. 167–179.

Smith, 2007 G. Smith, Random walks in Middle Eastern stock markets, Applied Financial Economics 17 (2007), pp. 587–596.

Specht, 1991 D. Specht, A general regression neural network, IEEE Transactions on Neural Networks 2 (1991), pp. 568–576.

Swicegood and Clark, 2001 P. Swicegood and J. Clark, Off-site monitoring systems for predicting bank underperformance: A comparison of neural networks, discriminant analysis, and professional human judgment, International Journal of Intelligent Systems in Accounting, Finance and Management 10 (2001), pp. 169–186.

Tam et al., 2005 C. Tam, T. Tong, T. Lau and K. Chan, Selection of vertical framework system by probabilistic neural network models, Construction Management and Economics 23 (2005), pp. 245–254.

Videnova et al., 2006 I. Videnova, D. Nedialkova, M. Dimitrova and S. Popova, Neural networks for air pollution forecasting, Applied Artificial Intelligence 20 (2006), pp. 493–506.

Wang, 1995 S. Wang, The unpredictability of standard back propagation neural networks in classification applications, Management Science 41 (1995), pp. 555–559.

Wasserman, 1993 P. Wasserman, Advanced methods in neural computing, Van Nostrand Reinhold, New York (1993).

Yu et al., 2009 L. Yu, S. Wang and K.K. Lai, A neural-network-based nonlinear metamodeling approach to financial time series forecasting, Applied Soft Computing 9 (2009), pp. 563–574.

Yumlu et al., 2005 S. Yumlu, F. Gurgen and N. Okay, A comparison of global, recurrent and smoothed-piecewise neural models for Istanbul stock exchange (ISE) prediction, Pattern Recognition Letters 26 (2005), pp. 2093–2103.

Zhuo et al., 2007 W. Zhuo, J. Li-Min, Q. Yong and W. Yan-hui, Railway passenger traffic volume prediction based on neural network, Applied Artificial Intelligence 21 (2007), pp. 1–10.

Expert Systems with Applications
Volume 37, Issue 9, September 2010, Pages 6302–6309

Forecasting model of global stock index by stochastic time effective neural network

Zhe Liao and Jun Wang, Institute of Financial Mathematics and Financial Engineering, College of Science, Beijing Jiaotong University, Beijing 100044, PR China

Available online 8 June 2009.

Abstract

In this paper, we investigate the statistical properties of the fluctuations of the Chinese Stock Index, and we study the statistical properties of the HSI, DJI, IXIC and SP500 by comparison. Drawing on the theory of artificial neural networks, a stochastic time effective function is introduced into the forecasting model of the indices, yielding an improved neural network: the stochastic time effective neural network model. In this model, a promising data-mining technique in machine learning is proposed to uncover the predictive relationships of numerous financial and economic variables. We suppose that investors decide their investment positions by analyzing the historical data of the stock market, and that the historical data are given weights depending on their time: the nearer the time of the historical data is to the present, the stronger the impact the data have on the predictive model. We also introduce Brownian motion in order to give the model the effect of random movement while maintaining the original trend. In the last part of the paper, we test the forecasting performance of the model using different volatility parameters and show some results of the analysis of the fluctuations of the global stock indices using the model.

Keywords: Brownian motion; Stochastic time effective function; Data analysis; Neural network; Returns; Predict

Article Outline

1. Introduction

2. Methodology for stochastic time effective function

2.1. Introduction of stochastic time effective function

2.2. Procedure followed in the stochastic time effective model

3. Experiment analysis

3.1. Selection and preprocessing of data

3.2. Training stochastic time effective neural network

4. Conclusions

Acknowledgements

References

1. Introduction

Recently, some progress has been made in work on the fluctuations of the Chinese stock market; see, for example, Ji and Wang (2007) and Li and Wang (2006). In the present paper, we investigate the statistical properties of the fluctuations of the indices of the Chinese Stock Exchange, and study the statistical properties of the HSI, DJI, IXIC and SP500 by comparison. A predictive model of stock prices is constructed using the theory of artificial neural networks and a stochastic time effective function, and data from the Chinese stock markets are analyzed with the model. China has Stock A and Stock B in its stock markets. The indices of Stock A and Stock B play an important role in the Chinese stock markets, and the database of the indices is from the website www.sse.com.cn.

Recently, the properties of the fluctuations of stock markets have been studied in many research fields; see, for example, Azoff (1994), Nakajima (2000), Pino et al. (2008), Shtub and Versano (1999) and Wang (2007). Artificial neural networks are one of the technologies that have made great progress in the study of stock markets. Stock prices can usually be seen as a random time sequence with noise. Artificial neural networks, as large-scale parallel-processing nonlinear systems that depend on their own intrinsic link data, provide methods and techniques that can approximate any nonlinear continuous function without a priori assumptions about the nature of the generating process; see Pino et al. (2008) and Shtub and Versano (1999). They have good self-learning ability and strong anti-jamming capability, and have been widely used in financial fields such as the analysis and prediction of stock prices, profits, exchange rates and risk. Although historical data have a great influence on investors' positions, the degree of impact depends on the date at which the data occur: data very near the current state have a high-level effect. Furthermore, we also introduce Brownian motion into the model (see Wang, 2007) in order to give the model the effect of random movement while maintaining the original trend. We test the forecasting performance of the model using different volatility parameters, and we show some results of the analysis of the fluctuations of the global stock indices using the model. In this work, the forecasting model is developed to estimate the level of returns on the Stock A Index of China. In Section 3, the results show that the forecasting model can predict index behavior better over a short time interval than over a long one, and show the different performances of the HSI, DJI, IXIC and SP500 by comparison; see Abhyankar et al. (1997), Austin et al. (1997) and Balvers et al. (1990).

In this paper, forecasting based on a neural network involves two major steps: data preprocessing and structure design. In the pretreatment stage, the collected data should be normalized and properly adjusted in order to reduce the impact of noise in the stock markets. At the design stage, different data training sets, validations and data processing will produce different results; see Breen et al. (1990) and Campbell (1987). In stock markets, the environment and behavior of the markets may change greatly, as in the Chinese stock markets in 2007. As a result, the data in the training set should be time-variant, reflecting the different behavior patterns of the markets at different times. If all the data are used to train the network equally, the network system may not be consistent with the development of the stock markets; see Chenoweth and Obradovic (1996), Cybenko (1989) and Demuth and Beale (1998). Especially in the current Chinese stock markets, trading rules and management systems are changing rapidly, for example the daily price limit (now 10%), the shareholding reform, direct investment in the Hong Kong stock markets, the reorganization of A shares, B shares and H shares, and the establishment of financial derivatives such as futures and options. It is therefore difficult for historical data from the distant past to reflect the current development of the Chinese stock markets. However, if only recent data are selected, much useful information held by the early data will be lost. In the financial model of the present paper, a promising data-mining technique in machine learning is proposed to uncover the predictive relationships of numerous financial and economic variables. Considering the above-mentioned financial situation, this paper presents an improved neural network model, the stochastic time effective neural network model: each historical datum is given a weight depending on the time at which it occurs, and we also use probability density functions to classify the various variables from the training samples; see Desai and Bharati (1998), Duda et al. (2001) and Elton and Gruber (1991).

2. Methodology for stochastic time effective function

2.1. Introduction of stochastic time effective function

In this section, we first describe a three-layer BP neural network model (see Azoff, 1994), shown in Fig. 1. The model includes three layers: an input layer, a hidden layer and an output layer. Choosing the proper number of hidden-layer nodes requires validation techniques to avoid under-fitting (too few neurons) and over-fitting (too many neurons). Generally, too many neurons in the hidden layers, and hence too many connections, produce a neural network that memorizes the data and lacks the ability to generalize.

Fig. 1. Three-layer neural network.

Suppose that a three-layer neural network has N neurons, and for any fixed neuron n (n = 1, 2, …, N) the model has the following structure: let {x_i(n): i = 1, 2, …, p} denote the set of inputs, {y_j(n): j = 1, 2, …, m} denote the set of outputs of the hidden-layer neurons, and {o_k(n): k = 1, 2, …, q} denote the set of outputs. V_{ij} is the weight that connects node i in the input layer to node j in the hidden layer, and W_{jk} is the weight that connects node j in the hidden layer to node k in the output layer. Then the output value of a unit is given by

y_j(n) = f\Big(\sum_{i=1}^{p} V_{ij}\, x_i(n) - \theta_j\Big), \qquad o_k(n) = f\Big(\sum_{j=1}^{m} W_{jk}\, y_j(n) - \theta_k\Big),

where θ_j and θ_k are the neural thresholds and f(u) = 1/(1 + e^{−u}) is the sigmoid activation function.

Let T_k(n) be the actual value from the data sets; then the error of the corresponding output neuron k is defined as ε_k = T_k − o_k. In this paper, the error of the output is defined as

E(n) = \frac{1}{2}\sum_{k=1}^{q}\varepsilon_k^2(n),

and the error of sample n (n = 1, 2, …, N) is defined as

e(n) = \varphi(t_n)\,E(n) = \varphi(t_n)\,\frac{1}{2}\sum_{k=1}^{q}\big(T_k(n) - o_k(n)\big)^2,

where φ(t) is the stochastic time effective function. We now define φ(t) as

\varphi(t_n) = \frac{1}{\tau}\exp\left\{\int_{t_1}^{t_n}\mu(t)\,dt + \int_{t_1}^{t_n}\sigma(t)\,dB(t)\right\},

where τ (>0) is the time strength coefficient, t_1 is the current time or the time of the newest data in the data set, and t_n is an arbitrary time point in the data set; μ(t) is the drift function (or the trend term), σ(t) is the volatility function, and B(t) is standard Brownian motion (Wang, 2007). The stochastic time effective function implies that recent information has a stronger effect on investors than old information: the more recently an event happened, the more strongly it affects investors and markets, and the impact of the data decays exponentially in time; see Nakajima (2000) and Pino et al. (2008). The total error of all the data training sets in the output layer with the stochastic time effective function is then defined as

E = \sum_{n=1}^{N} e(n) = \sum_{n=1}^{N}\varphi(t_n)\,\frac{1}{2}\sum_{k=1}^{q}\big(T_k(n) - o_k(n)\big)^2.

2.2. Procedure followed in the stochastic time effective model

Note that the training objective of stochastic time effective neural network is to modify the weights so as to minimize the error between the network’s prediction and the actual target.

When all the training data are data (that is t

1

= t n

), the stochastic time effective neural network is the general neural network model. In Fig. 2 , the training algorithms procedures of stochastic time effective neural network are shown, which are as follows:

Step 1: Train a stochastic time effective neural network by choosing five kinds of stock prices for the input layer (daily opening price, daily closing price, daily highest price, daily lowest price and daily trade volume) and one price for the output layer (the closing price of the next trading day). Then set the connective weights and input the training data sets.

Step 2: At the beginning of data processing, the connective weights V_{ij} and W_{jk} follow the uniform distribution on (−1, 1), and the neural thresholds θ_k, θ_j are set to 0.

Step 3: Introduce the stochastic time effective function φ(t) into the error function e(n, t). Choose different volatility parameters. Give the transfer function from the input layer to the hidden layer and the transfer function from the hidden layer to the output layer.

Step 4: Establish an error-acceptance criterion and set a pre-set minimum error. If the output error is below the pre-set minimum error, go to Step 6; otherwise go to Step 5.

Step 5: Modify the connective weights. Calculate δ backward for each node k in the output layer,

\delta_k = o_k(n)\,[1 - o_k(n)]\,[T_k(n) - o_k(n)],

and calculate δ backward for each node h in the hidden layer,

\delta_h = y_h(n)\,[1 - y_h(n)]\sum_{k}\delta_k W_{hk},

where o(n) is the output of neuron n, T(n) is the actual value of neuron n in the data sets, o(n)[1 − o(n)] is the derivative of the sigmoid activation function, and the sum runs over the nodes connected to node h in the layer after node h. Modify the weights from each layer to the previous layer by

\Delta w = \eta\,\delta\,(\text{input to the link}),

where η is the learning step, which usually takes a constant value between 0 and 1.

Step 6: Output the predictive value.
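A rough discretized sketch of this time-weighted error follows, assuming unit time steps and the sign convention of the φ(t) reconstruction in Section 2.1; it illustrates the idea, not the authors' exact implementation.

```python
import numpy as np

def time_effective_weights(n, mu=1.0, sigma=1.0, tau=1.0, seed=0):
    """phi(t_n) for data ordered oldest to newest: exponential decay in the
    age of each datum plus a Brownian-increment sum (crude Ito integral).
    Unit time steps and this sign convention are assumptions."""
    rng = np.random.default_rng(seed)
    age = np.arange(n - 1, -1, -1.0)             # t_1 - t_n in unit steps
    dB = rng.normal(size=n)                      # Brownian increments
    bsum = np.array([dB[i + 1:].sum() for i in range(n)])  # ~ B(t_1) - B(t_n)
    return np.exp(-mu * age - sigma * bsum) / tau

def weighted_error(targets, outputs, phi):
    """Step 4 criterion: sum_n phi(t_n) * 0.5 * (T(n) - o(n))^2."""
    return 0.5 * np.sum(phi * (targets - outputs) ** 2)
```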

Fig. 2. Procedure followed in the stochastic time effective model.

3. Experiment analysis

3.1. Selection and preprocessing of data

In this paper, we select the data of the Stock A Index (SAI) and Stock B Index (SBI) of China for each trading day over an 18-year period, from December 19, 1990 to June 7, 2008, obtained from the Shanghai and Shenzhen Stock Exchanges. We also choose the data of the HSI, DJI, IXIC and SP500 for comparison. First, we study the statistical properties of the returns of the indices using the stochastic time effective neural network model, and then study the correlation between the Chinese stock indices and foreign stock indices.

Fig. 3 presents the correlation coefficients between SAI, SBI, HSI, DJI, IXIC and SP500. As the figure shows, the correlation coefficients between SAI and SBI, HSI, DJI, IXIC and SP500 are 0.885, 0.629, 0.463, 0.108 and 0.329, and the correlation coefficients between SBI and SAI, HSI, DJI, IXIC and SP500 are 0.885, 0.817, 0.688, 0.458 and 0.623.

Fig. 3. Correlation coefficients.

In the model of this paper, we suppose that the network inputs include five kinds of data, daily opening price, daily closing price, daily highest price, daily lowest price and daily trade volume, and the network outputs include the closing price of the next trading day.

Fig. 4 presents the plots of the time sequences of log returns of SAI, SBI, IXIC and SP500. Denoting the price sequence at time t by S(t), the logarithmic return rate is

R(t) = \ln S(t+1) - \ln S(t).

In Fig. 4 we can see that the index prices fluctuate wildly, which indicates substantial noise in the data and makes forecasting difficult. We should therefore preprocess the data before forecasting, so the data are normalized as follows:

S'(t) = \frac{S(t) - \min_t S(t)}{\max_t S(t) - \min_t S(t)}.   (5)

Similarly to (5), the normalized values of the above-mentioned five kinds of input data can also be obtained.
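A small sketch of these two preprocessing steps, with the min-max form of (5) carried over from the reconstruction above.

```python
import numpy as np

def log_returns(prices):
    """R(t) = ln S(t+1) - ln S(t), the log return plotted in Fig. 4."""
    return np.diff(np.log(prices))

def minmax_normalize(x):
    """Normalization (5): map a series into [0, 1] (assumed min-max form)."""
    return (x - x.min()) / (x.max() - x.min())
```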

Fig. 4. The plot of log returns of SAI (a), SBI (b), IXIC (c), and SP500 (d).

3.2. Training stochastic time effective neural network

The data sets are divided into two parts, a training set and a testing set. We use the SAI data from 1990–2006 as the training set and the SAI data from 2007–2008 as the testing set. Following the three-layer network procedure introduced in Section 2, the number of nodes in the input layer is 5, the number of nodes in the hidden layer is 20 and the number of nodes in the output layer is 1; the threshold for the maximum number of training cycles is 100 and the threshold for the minimum error is 0.0001. We take the drift parameter μ(t) and the volatility parameter σ(t), written (μ(t), σ(t)), to be (1, 0), (1, 1) and (1, 2). When (μ(t), σ(t)) is (1, 0), the model has only the effect of the time effective function; when (μ(t), σ(t)) is (1, 1), the model has the effects of both the time effective function and normal randomization; and when (μ(t), σ(t)) is (1, 2), the model has the effect of intensive randomization.

Table 1 gives part of the training errors at different trading dates for (a) SAI, (b) SBI, (c) HSI, (d) DJI, (e) IXIC and (f) SP500. It can clearly be seen that the relative errors during 1991 and 1992 are larger than those in 2007, which shows the effect of the time effective function. Furthermore, the gap between the relative errors of SAI and SBI is much greater than that of the foreign stock markets. We can therefore conclude that the value of historical data in the foreign stock markets is greater than in the Chinese stock markets, which means that the Chinese stock markets fluctuate more sharply than the foreign markets.

Table 1.
Comparison of errors of different dates for SAI, SBI, HSI, DJI, IXIC and SP500 ((μ(t), σ(t)) is (1, 1)).

(a) SAI
Time       Actual     Predictive   Error
90/12/20   104.39     144.57       −0.384
91/05/10   108.53     149.06       −0.373
06/09/20   1820.8     1830.3       −0.005
07/02/13   2973.4     2974.0       −0.0001

(b) SBI
92/02/24   124.65     128.88       −0.034
93/12/24   93.14      87.64        0.059
05/07/11   59.88      60.8308      −0.02
06/11/15   108.27     106.37       0.017

(c) HSI
01/07/31   12316.7    12123.8      0.016
02/01/29   11014.2    10712.2      0.027
06/10/11   17862.8    17849.5      0.001
07/09/26   26430.2    26230.8      0.008

(d) DJI
91/12/23   3022.6     2948.1       0.025
92/07/07   3295.2     3340.8       −0.014
06/07/26   11102.5    11125.5      −0.002
07/12/28   13365.9    13412.9      −0.004

(e) IXIC
91/11/08   548.08     540.3        0.014
92/03/05   621.97     635.6        −0.022
06/03/07   2268.38    2290.0       −0.010
07/05/29   2572.06    2559.6       0.005

(f) SP500
91/01/03   321.91     339.06       −0.053
92/02/04   413.85     408.38       0.013
06/10/31   1377.94    1376.05      0.001
07/02/05   1446.99    1448.47      −0.001

Fig. 5 shows the fluctuations of the time sequences of relative errors during the years 1991 to 2007 for the prices of SAI, SBI, HSI, DJI, IXIC and SP500. In these plots, 0 represents the data farthest from the current date, and larger t (date) represents dates nearer to the current date. Fig. 5 also clearly indicates that the stochastic time effective neural network model can be realized by assigning different weights to data from different times. The time sequences of relative errors of (b) SBI and (d) DJI in Fig. 5 clearly reflect the randomization introduced by the Brownian motion. We also conclude that SBI behaves similarly to DJI, and SAI similarly to HSI; this conclusion is supported by the correlation analysis shown in Fig. 3.

Fig. 5. Relative errors of SAI, SBI, HSI, DJI, IXIC and SP500 ((μ(t), σ(t)) = (1, 1)).

In order to test the effect of the volatility parameter σ(t), we also take σ(t) to be 0 or 2. If σ(t) is 0, the model has no randomization and only the effect of the time effective function. If σ(t) is 2, the model has the intensive effect of wild fluctuation.

Table 2 gives the predictive values and relative errors of SAI for the different values of σ(t). It shows that the relative error is smallest when σ(t) = 1 (almost all below 1%) and largest when σ(t) = 2 (mostly over 10%). We can therefore conclude that adding Brownian motion to this financial model is propitious for increasing the precision of the prediction. However, the volatility parameter should not be too large; otherwise it will increase the prediction error.

Table 2.
Predictive values and errors of the time effective neural network model of SAI for different σ(t).

(a) σ(t) = 1
Time       Actual      Predictive   Error
08/06/03   3605.859    3627.168     −0.0059
08/06/04   3535.809    3604.064     −0.0193
08/06/05   3516.219    3530.171     −0.0039
08/06/06   3493.189    3507.505     −0.0040

(b) σ(t) = 0
08/06/03   3605.859    3655.273     −0.014
08/06/04   3535.809    3689.849     −0.044
08/06/05   3516.219    3559.069     −0.012
08/06/06   3493.189    3502.137     −0.003

(c) σ(t) = 2
08/06/03   3605.859    4060.890     −0.126
08/06/04   3535.809    4034.350     −0.141
08/06/05   3516.219    4001.172     −0.138
08/06/06   3493.189    3968.767     −0.136

In Table 3 , I stands for the average relative error, II stands for the average relative error of the first 1000 days in the data sets, and III stands for the latest 100 days in the data sets. In Table

3 , the time effective function in the model is clearly expressed. Take the relative error in SAI at

σ

( t )=1 for example, the average relative error is 2.6%; the average relative error of the first

1000 days is 6.88% and the average relative error of the latest 100 days is 1.3%. So the latest data are more valuable than the historical data of the past in the stock market.

Table 3.
Average relative errors for SAI, SBI, DJI, IXIC, HSI and SP500 for different σ(t).

             SAI      SBI      DJI      IXIC     HSI      SP500
(a) σ(t) = 1
I            0.026    0.0212   0.0388   0.0152   0.0104   0.0074
II           0.0688   0.0262   0.0697   0.0366   0.0112   0.0077
III          0.013    0.0141   0.0058   0.0078   0.0109   0.0078
(b) σ(t) = 0
I            0.0766   0.0284   0.0128   0.2143   0.0322   0.0345
II           0.1637   0.0391   0.201    0.7247   0.0379   0.045
III          0.0514   0.0148   0.0103   0.0328   0.0247   0.0366
(c) σ(t) = 2
I            0.2458   0.1041   0.043    0.1443   0.0636   0.0897
II           0.8686   0.1276   0.083    0.3466   0.0875   0.2256
III          0.0353   0.0985   0.034    0.1041   0.0347   0.0265

Table 3 also gives the errors of the global stock indices for the different values of σ(t). Taking the relative error of SAI as an example: when σ(t) = 1, the average relative error is 2.6%; when σ(t) = 0, it is 7.66%; and when σ(t) = 2, it is 24.58%. This implies that an appropriate volatility parameter is beneficial for the prediction of the stochastic time effective neural network, and that the model is widely applicable to the global stock markets.

Fig. 6 compares the predictive values with the actual values for SAI, SBI, HSI and IXIC under the stochastic time effective neural network model.


Fig. 6. Comparison of the predictive values and actual values.

In Fig. 7, using the linear regression method, we compare the predictive values of the stochastic time effective neural network model with the actual values for SAI (a), SBI (b), NSDK (c) and SP500 (d). The regression analysis yields a different fitted linear equation for each panel (the fitted equations are omitted here); the corresponding correlation coefficients are R = 0.9984 for SAI (a), R = 0.9978 for SBI (b), R = 0.9988 for NSDK (c) and R = 0.9991 for SP500 (d). This tests the accuracy of the forecasting results from another angle.
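The check in Fig. 7 amounts to regressing predicted on actual values and reading off the correlation coefficient. A minimal sketch (the function name is assumed):

import numpy as np

def fit_predicted_vs_actual(actual, predicted):
    # Least-squares line predicted ~= a*actual + b, plus correlation R.
    # R near 1 (e.g. 0.9984 for SAI) means predictions track actual values closely.
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    a, b = np.polyfit(actual, predicted, deg=1)
    r = np.corrcoef(actual, predicted)[0, 1]
    return a, b, r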


Fig. 7. Regression of the predictive values and actual values.

4. Conclusions

This paper introduces a new stochastic time effective function and uses it to build a stochastic time effective neural network model. The effectiveness of the model has been analyzed through a numerical experiment on the data of SAI, SBI, HSI, DJI, IXIC and SP500, and the validity of the volatility parameter of the Brownian motion has been tested. Further, the paper presents predictive results for the global stock indices using the stochastic time effective neural network model.

Acknowledgements

The authors were supported in part by National Natural Science Foundation of China Grant

No. 70771006 and by BJTU Foundation No. 2006XM044.


Corresponding author. Tel./fax: +86 10 51682867.

Expert Systems with Applications

Volume 37, Issue 1 , January 2010, Pages 834-841

Improving returns on stock investment through neural network selection

Tong-Seng Quah a and Bobby Srinivasan b

a School of Electrical and Electronic Engineering, Nanyang Technological University, Nanyang Avenue, Singapore 639798, Singapore
b School of Accountancy and Business, Nanyang Technological University, Nanyang Avenue, Singapore 639798, Singapore

Available online 1 November 1999.

Abstract

The Artificial Neural Network (ANN) is a heavily researched technique used in engineering and scientific applications for purposes ranging from control systems to artificial intelligence. Its generalization power has received admiration not only in engineering and science; in recent years, finance researchers and practitioners have also taken an interest in applying the ANN. Bankruptcy prediction, debt-risk assessment and security market applications are the three most heavily researched areas in finance. The results thus far have been encouraging, as the ANN displays better generalization power than conventional statistical tools and benchmarks.

With such intensive research, the proven ability of the ANN in security market applications, and the growing importance of equity securities in Singapore, this project was conceived to use the ANN for stock selection.

With its proven generalization ability, the ANN can infer from historical patterns the characteristics of performing stocks. The performance of stocks reflects the profitability and quality of management of the underlying company, and such information is captured in financial and technical variables. The ANN is therefore used as a tool to uncover the intricate relationships between the performance of stocks and the related financial and technical variables. Historical data such as financial variables (inputs) and stock performance (output) are used in this ANN application. Experimental results obtained thus far have been very encouraging.

Author Keywords: Technical analysis; Fundamental analysis; Neural network; Economic factors; Political factors; Firm specific factors

Article Outline

1. Introduction

2. Application of neural network in financial and commercial domains

3. Neural architecture

3.1. Select the Appropriate Algorithm

3.2. Architecture of ANN

3.3. Selection of the Learning Rule

3.4. Selection of the Appropriate Learning Rates and Momentum

4. Variables selection

5. Experiment

5.1. Research design

5.2. Design 1 (Basic System)

5.3. Design 2 (Moving Window System)

5.4. Results

6. Conclusion and future works

References

1. Introduction

With the growing importance of the role of equities to both international and local investors, the selection of attractive stocks is of utmost importance to ensure a good return.

Therefore, a reliable tool in the selection process can be of great assistance to these investors.

An effective and efficient tool/system gives the investor the competitive edge over others as he/she can identify the performing stocks with minimum effort.

Trading strategies, rules and concepts based on fundamental and technical analysis have been devised by both academics and practitioners to assist investors in their decision making. Innovative investors opt to employ information technology to improve the efficiency of the process, transforming trading strategies into computer-readable languages so as to exploit the logical processing power of the computer. This greatly reduces the time and effort needed to short-list attractive stocks.

In this age where information technology is dominant, such computerized rule-based expert systems have serious limitations that affect their effectiveness and efficiency. However, significant advances in the field of the Artificial Neural Network (ANN) offer a solution to these limitations. In this research, the generalization ability of the ANN is harnessed to create an effective and efficient tool for stock selection. Results of research in this field have so far been very encouraging.

2. Application of neural network in financial and commercial domains

Research developments in bankruptcy prediction have shown that the ANN performs better than conventional statistical methods such as Discriminant Analysis and Logistic Regression. Alici (1996), using UK data, shows that an ANN, a three-layer multi-layer perceptron (28 financial inputs, seven hidden neurons and two output neurons), achieves an average classification accuracy of 71.38% (76.07%) for failed (healthy) firms. In comparison, Discriminant Analysis and Logistic Regression achieved 60.12% (71.43%) and 65.29% (71.07%), respectively, for failed (healthy) firms. The feasibility of applying the ANN to bankruptcy prediction was also studied by Raghupathi, Schkade and Raju (1991), Odom and Sharda (1993), and many others.

Salchenberger, Cinar and Lash (1992) use the ANN to classify failures of Savings and Loans (S&Ls) organizations in the US. In debt-risk assessment applications, Dutta and Shekhar (1993) use neural networks to rate bonds.

The generalization ability of the ANN also extends to commodity trading (copper): Robles and Naylor (1996) show that the ANN outperforms the traditional Weighted Moving Average rule and a "buy and hold" strategy. In equities, Gencay and Stengos (1996) show that the ANN outperforms a linear model under an identical research methodology; the ratios of the average MSPE of the testing model to the benchmark are 0.961 for the ANN and 1 for the linear model.

Burgess and Refenes (1996) show that the ANN has better generalization ability than Ordinary Least Squares in predicting the daily return of an equity index. Neural networks have also been applied to predicting the trend of the Italian stock market index: Chinetti and Rossignoli (1993) attempted to use technical variables to build an accurate prediction system to forecast the timing of buying and selling for the index. Westheider (1994) studied the predictive power of the ANN on stock index returns, using economic and fundamental variables to predict monthly, quarterly and yearly returns.

3. Neural architecture

The computer software selected for training and testing the network is Neural Planner version 3.71, programmed by Stephen Wolstenholme. It is an ANN simulator designed solely for the Back Propagation learning algorithm.

There are four major issues in the selection of the appropriate network:

1. Select the Appropriate Algorithm.

2. Architecture of the ANN.

3. Selection of the Learning Rule.

4. Selection of the Appropriate Learning Rates and Momentum.

3.1. Select the Appropriate Algorithm

Since the sole purpose of this project is to identify the top performing stocks, and the historical data used for the training process have a known outcome (top performer or otherwise), algorithms designed for supervised learning are ideal. Among the available algorithms, the Back Propagation algorithm designed by Rumelhart, Hinton and Williams (1986) is the most suitable, as it has been intensively tested in finance. Moreover, it is recognized as a good algorithm for generalization purposes.

3.2. Architecture of ANN

Architecture, in this context, refers to the entire structural design of the ANN (Input Layer,

Hidden Layer and Output Layer). It involves determining the appropriate number of neurons required for each layer and also the appropriate number of layers within the Hidden Layer.

The Hidden Layer can be considered the crux of the Back Propagation method, because it can extract higher-level features and facilitate generalization when the input vectors carry low-level features of the problem domain or when the output/input relationship is complex. The fewer the hidden units, the better the ANN is able to generalize. It is important not to over-fit the ANN with more hidden units than required, to the point where it can memorize the data: the hidden units act like a storage device, learning the noise present in the training set as well as the key structures. Little generalization ability can be expected in such a network, which is undesirable as it has little explanatory power in a different situation or environment.

3.3. Selection of the Learning Rule

The learning rule is the rule the network follows in its error-reducing process, facilitating the derivation of the relationships between the input(s) and output(s). The generalized delta rule developed by Rumelhart et al. (1986) is used in the calculation of weights. This particular rule is selected because it is heavily used and has proven effective in finance research.
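For concreteness, the following is a minimal sketch of one generalized-delta-rule (back propagation) update with momentum on a small network of the kind used here: seven inputs, a sigmoid hidden layer, and one output. The layer sizes, initialization and squared-error loss are illustrative assumptions, not the paper's exact settings.

import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# Seven inputs, four hidden neurons, one output (the excess return)
W1 = rng.normal(0, 0.5, (7, 4)); b1 = np.zeros(4)
W2 = rng.normal(0, 0.5, (4, 1)); b2 = np.zeros(1)
vel = [np.zeros_like(p) for p in (W1, b1, W2, b2)]

def train_step(X, y, lr=0.9, momentum=0.9):
    # One generalized-delta-rule step: delta_w = momentum*prev_delta_w - lr*grad
    h = sigmoid(X @ W1 + b1)             # hidden activations
    out = h @ W2 + b2                    # network output
    err = out - y.reshape(-1, 1)         # output error
    # Gradients of 0.5 * sum(err**2), backpropagated layer by layer
    gW2 = h.T @ err; gb2 = err.sum(axis=0)
    dh = (err @ W2.T) * h * (1.0 - h)    # sigmoid derivative term
    gW1 = X.T @ dh;  gb1 = dh.sum(axis=0)
    for i, (p, g) in enumerate(zip((W1, b1, W2, b2), (gW1, gb1, gW2, gb2))):
        vel[i] = momentum * vel[i] - lr * g / len(X)
        p += vel[i]                      # in-place update of the shared arrays
    return float(np.mean(err ** 2))      # mean squared error for monitoring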

3.4. Selection of the Appropriate Learning Rates and Momentum

The Learning Rates and Momentum are parameters in the learning rule that aid the convergence of error, so as to arrive at the appropriate weights that are representative of the existing relationships between the input(s) and the output(s).

As for the appropriate learning rate and momentum, the software has a feature, known as "Smart Start", that can determine suitable starting values. Once this function is activated, the network is tested with different values of learning rate and momentum to find the combination that yields the lowest average error after a single learning cycle. These are the optimum starting values, as using them improves the error-converging process and thus requires less processing time.

Another attractive feature is the "Auto Decay" function, which can be enabled or disabled. This function automatically adjusts the learning rate and momentum to enable faster and more accurate convergence: the software samples the average error periodically, and if it is higher than the previous sample, the learning rate is reduced by 1%. The momentum is "decayed" by the same method, but its sampling rate is half that used for the learning rate, so if both decays are enabled the momentum decays more slowly than the learning rate.
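A sketch of that heuristic follows (the 1% cut and the half-rate momentum sampling come from the text; the sampling period and the state-dictionary layout are assumptions):

def auto_decay(avg_err, state, period=10):
    # Every `period` epochs, compare the average error with the previous
    # sample and cut the learning rate by 1% if it rose; do the same for
    # momentum but sample half as often, so momentum decays more slowly.
    state["epoch"] += 1
    if state["epoch"] % period == 0:
        if state.get("last_lr_err") is not None and avg_err > state["last_lr_err"]:
            state["lr"] *= 0.99
        state["last_lr_err"] = avg_err
    if state["epoch"] % (2 * period) == 0:
        if state.get("last_mom_err") is not None and avg_err > state["last_mom_err"]:
            state["momentum"] *= 0.99
        state["last_mom_err"] = avg_err
    return state

# usage: state = {"epoch": 0, "lr": 0.9, "momentum": 0.9}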

In general cases where these features are not available, a high learning rate and momentum (e.g. 0.9 for both) are recommended, as the network converges faster than with lower figures. However, too high a learning rate and momentum will cause the error to oscillate and prevent convergence. The choice of learning rate and momentum therefore depends on the structure of the data and the objective of using the ANN.

4. Variables selection

In general, the financial variables chosen are constrained by data availability. They are chosen first for their significant influence on stock returns, based on past literature and practitioners' opinions, and then for the availability of such data. Most of the data used in this research are provided by Credit Lyonnais Securities (Singapore) Pte Ltd. Stock prices are extracted from a financial database called […].

Broadly, the factors that can affect stock prices fall into three categories: economic factors, political factors and firm/stock-specific factors. Economic factors have a significant influence on the returns of individual stocks, and on the stock index in general, because they have a significant impact on the growth and earnings prospects of the underlying companies, and thus on valuations and returns. Moreover, economic variables also significantly influence the liquidity of the stock market. Some of the economic variables used are inflation rates, employment figures and the producers' price index.

Many researchers have found it difficult to account for more than one third of the monthly variation in individual stock returns on the basis of systematic economic influences, and have shown that political factors can help explain some of the missing variation. Political stability is vital to the existence of business activities and is the main driving force in building a strong and stable economy. It is therefore only natural that political factors such as fiscal policies and budget surpluses/deficits have effects on stock price movements.

Firm-specific factors affect only individual stock returns; examples are financial ratios and technical information that affect the return structure of specific stocks, such as yield factors, growth factors, momentum factors, risk factors and liquidity factors. As far as stock selection is concerned, firm-specific factors are important considerations, as it is these factors that determine whether a firm is a bright star or a dim light in its industry. Such firm-specific factors can be classified into five major categories:

1. Yield factors: these include the "historical P/E ratio" and the "prospective P/E ratio". The former is computed as price/earnings per share; the latter as price/consensus earnings-per-share estimate. Another variable is the "cashflow yield", which is price/operating cashflow over the latest 12 months.

2. Liquidity factors: the most important variable is the “market capitalization”, which is determined by “price of share×number of shares outstanding”.

3. Risk factors: the representative variable is the “earning per share uncertainty”, which is defined as “percentage deviation about the median EPS estimates”.

4. Growth factors: basically the "return on equity (ROE)", computed as "net profit after tax before extraordinary items/shareholders' equity".

5. Momentum factors: a proxy is derived as the "average of the price appreciation over the quarter, with half of its weight on the last month and the remaining weight distributed equally over the other two months" (a sketch of this computation follows the list).
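As an illustration, the momentum proxy in item 5 could be computed as below (a hypothetical helper; the weighting follows the description above):

def momentum_proxy(month1, month2, month3):
    # Quarterly momentum proxy: half the weight on the last month's price
    # appreciation, the remainder split equally over the first two months.
    return 0.25 * month1 + 0.25 * month2 + 0.5 * month3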

The inputs of the neural network stock selection system are the seven variables above, and the output is the return difference between the stock and the market return (excess return).

This is to enable the neural network to establish the relationships between inputs and the output (excess returns).

The training data set will include all data available until the quarter before the testing quarter.

This is to ensure that the latest changes in the relationship of the inputs and the output are being captured in the training process.

5. Experiment

The quarterly data required by the project are generally stock prices and financial variables

(inputs to the ANN stock selection system) from 1/1/93 to 31/12/96.

Stock prices, which are used to calculate stock returns, are extracted from the financial database called […]. These stock returns, adjusted for dividends, stock splits and bonus issues, are used as the output in the ANN training process.

One unique feature of this research is that the prospective P/E ratio, measured as price/consensus earnings-per-share estimate, is used as a forecasting variable. This variable has not received much attention in financial research. The prospective P/E ratio is used among practitioners because it reflects the perceived value of a stock with respect to EPS (earnings per share) expectations. It is a value indicator with implications similar to those of the historical P/E ratio: a low prospective P/E suggests that the stock is undervalued with respect to its future earnings, and vice versa. With this explanatory power, the prospective P/E ratio qualifies as an input to the stock selection system.

Data on earnings-per-share estimates, used to calculate EPS uncertainty and the prospective P/E ratio, are available in […], a compilation of EPS estimates and recommendations put forward by financial analysts. The coverage includes estimates for countries around the Asia-Pacific Rim from January 1993 onwards.

5.1. Research design

The purpose of this ANN stock selection system is to select stocks that are top performers in the market (stocks that outperformed the market by 5%) and to avoid selecting underperformers (stocks that underperformed the market by 5%). More importantly, the aim is to beat the market benchmark (the quarterly return on the index) on a portfolio basis.

This ANN stock selection system is a quarterly portfolio re-balancing strategy: stocks are selected at the beginning of the quarter, and performance (the return of the portfolio) is assessed at the end of the quarter.

5.2. Design 1 (Basic System)

In this research design, the sample used for training consists of stocks that outperformed and underperformed the market quarterly by 5% from 1/1/93 to 30/6/95.

The inputs and output are as defined in Section 4; again, the training data set includes all data available until the quarter before the testing quarter, so that the latest changes in the input–output relationship are captured in the training process.

The generalization ability of the ANN in selecting top-performing stocks, and whether the system performs consistently, is tested across time. The data used for the selection process are from the third quarter of 1995 (1/7/95–30/9/95), the fourth quarter of 1995 (1/10/95–31/12/95), the first quarter of 1996 (1/1/96–31/3/96), the second quarter of 1996 (1/4/96–30/6/96), the third quarter of 1996 (1/7/96–30/9/96) and the fourth quarter of 1996 (1/10/96–31/12/96). The limited test duration is constrained by data availability.

The testing inputs are fed into the system, and the predicted output is calculated using the established weights. The top 25 stocks with the highest output values are then selected to form a portfolio; these are the 25 stocks recommended for purchase at the beginning of the quarter. The generalization ability of the ANN is determined by the performance of the portfolio, measured by excess returns over the market and by the percentage of top performers in the portfolio compared to the benchmark (testing) portfolio at the end of the quarter.
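A sketch of this selection-and-evaluation step (the names and data layout are hypothetical):

def select_and_evaluate(predicted, actual_returns, market_return, k=25):
    # Rank stocks by the network's predicted output, buy the top k, and
    # measure the equally weighted portfolio's excess return over the market.
    # `predicted` and `actual_returns` are dicts keyed by stock code.
    top = sorted(predicted, key=predicted.get, reverse=True)[:k]
    portfolio_return = sum(actual_returns[s] for s in top) / len(top)
    return top, portfolio_return - market_return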

5.3. Design 2 (Moving Window System)

The Basic System is constrained by the need to meet the minimum sample size required for the training process. This second design forgoes the recommended minimum sample size and introduces a Moving Window concept, in order to analyze the ANN's ability to perform in a restricted-sample-size environment.

The inputs and output variables are identical to those of the Basic System, but the training and testing samples differ: the Moving Window System uses three quarters as the training sample and the subsequent quarter as the testing sample. The selection criterion is also identical to that of the Basic System in research design 1.
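A sketch of the moving-window split (quarter labels are hypothetical): with 16 quarters from 1993Q1 to 1996Q4, a three-quarter training window yields the 13 testing quarters reported in Section 5.4.

def moving_windows(quarters, train_len=3):
    # Yield (training quarters, testing quarter): each window trains on
    # three consecutive quarters and tests on the one that follows.
    for i in range(len(quarters) - train_len):
        yield quarters[i:i + train_len], quarters[i + train_len]

quarters = [f"{y}Q{q}" for y in range(1993, 1997) for q in range(1, 5)]
print(sum(1 for _ in moving_windows(quarters)))  # -> 13 testing quarters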

5.4. Results

The ANN is trained for 10 000 and 15 000 cycles. These numbers are used because error convergence is generally slow after 10 000 cycles, suggesting that training is adequate, while the error no longer converges beyond 15 000 cycles, an indicator that the network is over-trained.

Training a network with four hidden neurons for 10 000 cycles takes approximately 1.5 h, eight hidden neurons about 3 h, and the most complex architecture (14 hidden neurons) about 6 h on a Pentium 100 MHz PC. Architectures requiring 15 000 cycles usually take about 1.5 times as long as the same network trained for 10 000 cycles.

The results of the Basic Stock Selection System, based on the training and testing schedules described, are presented in two forms: (1) excess returns and (2) the percentage of top performers in the selected portfolio. These two measures are used to assess the performance and generalization ability of the ANN.

Testing results show that the ANN is able to "beat" the market over time, as shown by the positive compounded excess returns achieved consistently across all architectures and training cycles. This implies that the ANN can generalize relationships over time. Even at the level of individual quarters, the relationships between the inputs and the output established by the training process proved successful, "beating" the market in 6 out of 8 possible quarters, a reasonable 75%. (Fig. 1.)


Fig. 1. Graphical presentation of excess returns for 15 000 cycles.

The Basic Stock System consistently performed better than the testing portfolio over time, as is evident from the fact that the selected portfolios contained a higher percentage of top-performing stocks (above 5%) than the testing portfolio. This ability also enabled the network to better the performance of the market index presented earlier. (Fig. 2 and Fig. 3.)


Fig. 2. Performance of portfolios with % of stocks with actual return of 5% above market for 10 000 cycles.


Fig. 3. Graphical presentation of excess returns of portfolio.

The Moving Window Selection System is designed to test the generalization power of the ANN in an environment with limited data.

The generalization ability of the ANN is again evident in the Moving Window Stock Selection System, which outperformed the testing portfolio in 9 out of 13 testing quarters (69.23%). This can be seen in the graphical presentation: the line representing the selected portfolio lies above the line representing the testing portfolio most of the time. (Fig. 4.) Moreover, the compounded excess returns and the annualized compounded excess returns are more than twice those of the testing portfolio. The selected portfolios outperformed the market in 10 out of 13 testing quarters (76.92%), with excess returns of 127.48% over the 13 quarters (36.5% annualized compounded), which proves consistent performance over the market index over time. (Fig. 5.)


Fig. 4. % of Top performers in the portfolio - moving window stock selection system.


Fig. 5. Actual returns - moving window stock selection system.

The selected portfolios outperformed the testing portfolio on nine occasions (69.23%) and equalled its performance on one occasion. This further proves the generalization ability of the ANN.

Moreover, the ability to avoid selecting undesirable stocks is also evident: the selected portfolios contained fewer such stocks than the testing portfolio on 10 out of 13 occasions (76.92%).

The experimental results show that the selected portfolios outperformed the testing and market portfolios in terms of compounded actual returns over time, because the selected portfolios outperformed those two portfolios in most of the testing quarters, achieving a better overall position at the end of the testing period.

6. Conclusion and future works

The ANN has displayed its generalization ability in this application. This is evident from its ability to single out performing stock counters and to generate excess returns in the Basic Stock Selection System over time. Moreover, the neural network has also shown its ability to derive relationships in a constrained environment in the Moving Window Stock Selection System, making it even more attractive for applications in the field of finance.

This paper is largely constrained by the availability of data. When more data become available, the performance of the neural networks can be better assessed under various market conditions, such as bull and bear markets, high and low inflation, or even different political conditions, all of which have different impacts on stocks.

Also, as researchers discover more powerful neural architectures at a fast pace, it would be worthwhile to repeat the experiments using several architectures and compare the results. The best-performing structure may then be employed.

References

Alici, Y., 1996. Neural networks in corporate failure prediction: the UK experience. In: Proceedings of the Third International Conference on Neural Networks in the Capital Markets, London, 11–13 October 1995. World Scientific, Singapore.

Burgess, A.N. and Refenes, A.N., 1996. Modelling non-linear co-integration in international equity index futures. In: Proceedings of the Third International Conference on Neural Networks in the Capital Markets, London, 11–13 October 1995. World Scientific, Singapore.

Chinetti, D. and Rossignoli, C., 1993. A neural network model for stock market prediction. Tech. Rep., Department of Computer Science, Universita Statale di Milano, Milan, Italy.

Dutta, S. and Shekhar, S., 1993. Bond rating: a non-conservative application of neural networks. Reprinted in: Trippi, R.R. and Turban, E. (Eds.), Neural Networks in Finance and Investing. Probus Publishing Company, pp. 257–273. Originally in: Proceedings of the IEEE International Conference on Neural Networks, July 1988, pp. II443–II450.

Gencay, R. and Stengos, T., 1996. The predictability of stock returns with local versus global nonparametric estimators. In: Proceedings of the Third International Conference on Neural Networks in the Capital Markets, London, 11–13 October 1995. World Scientific, Singapore.

Odom, M.D. and Sharda, R., 1993. A neural network model for bankruptcy prediction. Reprinted in: Trippi, R.R. and Turban, E. (Eds.), Neural Networks in Finance and Investing. Probus Publishing Company, pp. 178–185. Originally in: Proceedings of the IEEE International Conference on Neural Networks, San Diego, CA, pp. II163–II168. IEEE.

Raghupathi, W., Schkade, L.L. and Raju, B.S., 1991. A neural network approach to bankruptcy prediction. In: Proceedings of the IEEE 24th Annual Hawaii International Conference on System Sciences.

Robles, J.V. and Naylor, C.D., 1996. Applying neural networks in copper trading: a technical analysis simulation. In: Proceedings of the Third International Conference on Neural Networks in the Capital Markets, London, 11–13 October 1995. World Scientific, Singapore.

Rumelhart, D.E., Hinton, G.E. and Williams, R.J., 1986. Learning internal representations by error propagation. In: Parallel Distributed Processing, Vols. 1 and 2. MIT Press, Cambridge, MA.

Salchenberger, L.M., Cinar, E.M. and Lash, N.A., 1992. Neural networks: a new tool for predicting thrift failures. Decision Sciences 23(4), pp. 899–916.

Westheider, O., 1994. Stock return predictability: a neural network approach. Information Systems Working Paper #8–93, The John E. Anderson Graduate School of Management at UCLA.

Corresponding author; email: itsquah@ntu.edu.sg

Expert Systems with Applications

Volume 17, Issue 4 , November 1999, Pages 295-301
