Web Site: www.ijaiem.org Email: editor@ijaiem.org, editorijaiem@gmail.com
Volume 2, Issue 4, April 2013
ISSN 2319 - 4847
Improvement in the Speed of Training a Neural
Network Using Sampled Training
Advait S. Bhatt1, Harikrishna B. Jethva2
1PG Student, 2Associate Professor,
Department of Computer Engineering, L.D. College of Engineering, Gujarat Technological University, Ahmedabad
ABSTRACT
Today one of the most important uses of neural networks is in classifying data. A neural network is trained for
classification by giving it input-output pairs, an approach also known as supervised learning. As training the network is
part of the whole classification process, the time required for training should be as small as possible. In this paper, we
propose a scheme in which the training set is sampled to reduce the time required for training the network, and hence
the overall time required for classification.
Keywords: Supervised Learning, Learning rate, Classification, Training Time, Neural Network, Sampled Training.
1. INTRODUCTION
Classification is basically a set of activities responsible for deriving a model that categorizes and describes classes of
data and concepts, and whose purpose is to predict the classes of objects whose class labels are unknown [1]. The
derived model can be represented in various forms, such as classification (IF-THEN) rules, decision trees, mathematical
formulae, or neural networks.
A neural network, when used for classification, is typically a collection of neuron-like processing units with weighted
connections between the units. Figure 1 below shows an example of a feed-forward artificial neural network. Every
such network has an input layer and an output layer, along with zero or more hidden layers. The input layer receives
the input, the output layer generates the classification result, and the hidden layers perform the processing.
Several important components make up an artificial neuron. These include the weighting factors, the summation
function, the transfer function, the error function, the error and back-propagated value, and the learning function. The
most important among these is the learning function, whose purpose is to modify the variable connection weights,
according to some neural-network algorithm, as inputs are presented to each processing element.
Figure 1 A Feed-Forward Neural Network
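To make these components concrete, the following is a minimal sketch in Python of how a single neuron combines
weighting factors, a summation function, a transfer function, and a learning function. The function names, the choice
of a sigmoid transfer function, and the learning rate are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def transfer(s):
    # Transfer function: a sigmoid (an assumption for illustration)
    return 1.0 / (1.0 + np.exp(-s))

def neuron_output(weights, bias, inputs):
    # Weighting factors and summation function: weighted sum of the inputs
    s = np.dot(weights, inputs) + bias
    return transfer(s)

def update_weights(weights, bias, inputs, delta, learning_rate=0.1):
    # Learning function: modify the variable connection weights using the
    # back-propagated error value (delta) for this processing element
    new_weights = weights + learning_rate * delta * inputs
    new_bias = bias + learning_rate * delta
    return new_weights, new_bias
```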
A large number of neural network algorithms have been implemented for different applications, and each type of
algorithm always uses some form of learning technique. The time required for training a network should be as small as
possible so that the actual data can be classified quickly.
The rest of this paper is divided into the following sections: Section 2 describes training a neural network, Section 3
presents the proposed scheme of sampled training, Section 4 discusses the advantages of the proposed scheme, and
Section 5 gives the conclusion derived from the study of the two schemes.
2. TRAINING A NEURAL NETWORK
2.1 Process of Classification
Every classification technique basically consists of two phases: the learning (training) phase and the output (testing)
phase. In the learning phase, training data whose objects have known class labels are given as input to the model in
order to train it. Once the model has been trained, the data set to be classified is given as input, and the model, based
on the information learned from the training data, assigns appropriate class labels to the test data.
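As a minimal sketch of these two phases, the snippet below uses scikit-learn's MLPClassifier and the iris data set
purely for illustration; the paper itself assumes no particular library or data set:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)
# Learning phase uses data with known class labels; the rest is held out
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

model = MLPClassifier(hidden_layer_sizes=(10,), max_iter=1000)
model.fit(X_train, y_train)               # learning / training phase
predicted_labels = model.predict(X_test)  # output / testing phase
print(model.score(X_test, y_test))        # accuracy on the test data
```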
2.2 Batch Mode Training (Existing Method)
In a feed-forward neural network, training is done by providing input-output pairs to the network and then generating
the model, which can then be used for classification of the test data. In the existing system, the whole training set is
given as input for training the network. The total time required for performing the classification therefore includes the
time required for training and the time required for classifying the test data based on the generated model.
It has been found that, for better error gradient estimations, more input vectors should be used for each gradient
calculation [2]. The downside of using more input vectors per weight update, however, is that the gradient has to be
calculated for each input vector used [3]. In the training process it is the gradient calculations that take the longest, so
using many input vectors quickly lengthens the training time to intolerable levels.
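The sketch below illustrates where this cost comes from: in batch mode, a single weight update requires one gradient
calculation per input vector in the training set. It is a minimal single-layer example with assumed names and a sigmoid
unit, not the authors' implementation:

```python
import numpy as np

def batch_update(W, X, Y, lr=0.1):
    """One batch-mode weight update for a single-layer sigmoid network.

    W: (n_inputs, n_outputs) weight matrix,
    X: (n_samples, n_inputs) input vectors,
    Y: (n_samples, n_outputs) target outputs.
    """
    grad = np.zeros_like(W)
    for x, y in zip(X, Y):
        # A gradient calculation is needed for EVERY input vector ...
        out = 1.0 / (1.0 + np.exp(-(x @ W)))   # forward pass (sigmoid)
        delta = (out - y) * out * (1.0 - out)  # error gradient at the output
        grad += np.outer(x, delta)             # accumulate over the whole set
    # ... but the weights are updated only ONCE per pass over the data
    return W - lr * grad / len(X)
```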
Figure 2 Flow-Chart of Batch Training in Neural Networks
Figure 2 shows the flowchart of the batch-mode training scheme used in the existing system. As can be seen from the
figure, in batch-mode training the complete training set is used for training. After training has been accomplished and
the model generated, the test data is given as input for classification of the unknown class labels. This is how the
existing system works.
But the problem with this kind of training is that the time required to train the network grows with the size of the
training set, since the whole set is used [4]. To reduce the training time, we therefore propose a new scheme in which
the training set is sampled into smaller chunks.
3. SAMPLED TRAINING (PROPOSED SCHEME)
To reduce the overall time of classifying the test data, the time required for training the network must be greatly
reduced. In this scheme, we sample the whole training set into fixed-size data sets and then give them as input to the
network for training.
As shown in the figure below, the training set is sampled into smaller training sets with a fixed number of tuples. Once
the network has been trained on a sample, test data is given for classification. If the performance of the generated
model meets the user-specified level, training stops; if not, the remaining samples of the training set are given as input
for further training of the network. This process continues as long as samples are available for training, or until the
user-specified accuracy is obtained.
Figure 3 Flow-Chart of Sampled Training in Neural Networks
With this sampling in place, the amount of training data used at a time is reduced, and so is the training time. As the
training time is reduced, the overall time of classification is also reduced [5].
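A minimal sketch of this loop is given below, assuming a fixed sample size, a user-specified target accuracy, and
generic train and evaluate helpers; all names here are illustrative rather than taken from the paper:

```python
def sampled_training(model, X_train, y_train, X_test, y_test,
                     sample_size, target_accuracy, train, evaluate):
    """Train on fixed-size samples until the user-specified accuracy is
    reached or no samples remain (train/evaluate are passed-in helpers)."""
    for start in range(0, len(X_train), sample_size):
        # Take the next fixed-size sample (chunk) of the training set
        X_sample = X_train[start:start + sample_size]
        y_sample = y_train[start:start + sample_size]
        model = train(model, X_sample, y_sample)    # train on this sample only
        accuracy = evaluate(model, X_test, y_test)  # check the generated model
        if accuracy >= target_accuracy:             # performance is acceptable
            break                                   # stop; no more samples needed
    return model
```

Training stops as soon as a sample pushes the model past the user-specified level, so the gradient calculations for the
remaining samples are never performed.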
4. ADVANTAGES OF SAMPLED TRAINING
4.1 Fewer Weight Updates
As discussed earlier, gradient calculation takes the most time in the training process, so to reduce the training time, the
number of gradient calculations performed should be decreased. In sampled training, since fewer training tuples are
used at a time, the gradient calculations are reduced, and so is the number of weight updates, resulting in an
improvement in the speed of training the neural network.
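For illustration, with assumed numbers rather than measurements from the paper: if the training set holds 10,000
tuples, every batch-mode weight update costs 10,000 gradient calculations, whereas with fixed samples of 1,000 tuples
each update costs only 1,000. If the user-specified accuracy is reached after three samples, the gradients for the
remaining 7,000 tuples are never calculated at all.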
4.2 Reduction in Training Time
Since the number of training tuples used at a time is smaller than the whole training set, the amount of time required to
train the neural network with sampled training is less.
4.3 More Accurate Classification Model
Since the accuracy of the trained model is checked each time after the test data is classified, the accuracy of the
generated classification model is always at or above the user-specified accuracy level.
5. CONCLUSION
In feed-forward neural networks trained with the gradient descent technique, training time is the most important
parameter to consider, since the gradient calculations consume more time than any other part of training. The way the
training set is used should therefore be examined in order to reduce the training time.
In batch-mode training, the number of weight updates grows with the size of the training data, which in turn increases
the number of gradient calculations. With more gradient calculations, the training time also increases.
In sampled training, since the original training data is sampled into fixed-size samples that are then given as input to
the network for training, the number of weight updates is smaller than in batch-mode training. As a result, the number
of gradient calculations is also reduced, which leads to shorter training times.
6. ACKNOWLEDGEMENTS
Advait Bhatt wishes to thank Prof. H. B. Jethva for his guidance and help in carrying out this work. He also thanks
Prof. D. A. Parikh, Head of the Computer Department, and all the staff of the Computer Department for their full
support in completing this work.
Prof. Harikrishna Jethva wishes to acknowledge his family and the staff of the Computer Department at L.D. College
of Engineering.
References
[1] Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques", Second Edition, Elsevier Press.
[2] William Daniel Nortje, "Comparison of Bayesian Learning and Conjugate Gradient Descent Training of Neural
Networks", University of Pretoria, October 2001.
[3] Er. Parveen Sehgal, Dr. Sangeeta Gupta and Prof. Dharminder Kumar, "Minimization of Error in Training a
Neural Network Using Gradient Descent Technique", International Journal of Technical Research (IJTR), Vol. 1,
Issue 1, Mar-Apr 2012.
[4] Liang Gong, Chengliang Liu and Fuqing Yuan, "Training Feed-forward Neural Network Using the Gradient
Descent Method with the Optimal Step Size", Journal of Computational Information Systems, 8:4 (2012).
[5] Sudarshan Nandy, Partha Pratim Sarkar and Achintya Das, "An Improved Gauss-Newton's Method Based
Back-Propagation Algorithm for Fast Convergence", International Journal of Computer Applications, Vol. 39,
No. 8, February 2012.
AUTHORS
Advait S. Bhatt is pursuing his Master's Degree in Computer Science & Technology from Gujarat
Technological University (L.D. College of Engineering, Ahmedabad). He received his Bachelor's Degree in
Information Technology from Dharmsinh Desai University, Nadiad, in 2010. His areas of interest include
Database Systems, Data Mining, and C/C++/Java/Advanced Java Programming.
Prof. Harikrishna B. Jethva received his post-graduate degree in Computer Engineering from Dharmsinh
Desai University in 2009 and his Bachelor's Degree in Computer Engineering from Saurashtra University,
Rajkot, in 2001. He worked as an Assistant Professor for 10 years and is presently working as an Associate
Professor at L.D. College of Engineering. His areas of interest include Neural Networks, Theory of
Computation, Computer Networks, Compiler Design, Soft Computing, and Distributed Computing.