Survey: Learning Techniques for Intrusion Detection System (IDS)

Roshani Gaidhane#1, Prof. C. Vaidya#2, Dr. M. Raghuwanshi#3
# RGCER, Computer Science and Engineering Department, RTMNU University Nagpur, Maharashtra, India
1roshani.deotare@gmail.com 2chandu.nyss@gmail.com 3m_raghuwanshi@rediffmail.com

Abstract— An intrusion detection system (IDS) is a software application that monitors network or system activities for malicious activity. Research on neural network methods and machine learning techniques that improve network security by examining the behaviour of the network as well as that of threats is progressing rapidly. Several intrusion detection techniques exist at present to provide more security to the network; however, many of them are static. Many researchers have used machine learning techniques for intrusion detection, but some techniques show poor detection and some require a large amount of training time. In this paper, the learning approaches (i.e. neural network approaches) used for intrusion detection in recent research papers are surveyed, and an extreme learning approach is proposed to address the training time issue.

Keywords— Intrusion Detection System (IDS), Training, Neural Network, anomaly detection, misuse detection.

I. INTRODUCTION
Intrusion detection is a major focus of research in the security of computer systems and networking. An intrusion detection system (IDS) [1] is used to detect unauthorized intrusions, i.e. attacks, into computer systems and networks. These systems are known to generate alarms (alerts). The following general terms are used for the detection and identification of attack and non-attack behaviour.
True positive (TP): the number of records detected as attacks that are actually attacks.
True negative (TN): the number of records detected as normal that are actually normal.
False positive (FP): the number of records detected as attacks that are actually normal, also called false alarms.
False negative (FN): the number of records detected as normal that are actually attacks, namely the attacks that the intrusion detection system fails to detect.

A. Classification of IDS
Intrusion detection systems are primarily classified into two types, i.e. Host-based IDS (HIDS) and Network-based IDS (NIDS) [2]. A HIDS looks for particular host activity, while a NIDS watches network traffic.

B. IDS Techniques
The two basic techniques used by intrusion detection systems for detecting intruders are Misuse Detection (also called signature-based detection) and Anomaly Detection [2], [3], [4].

1) Signature or Misuse based IDS: A misuse detection system tries to match data with known attack patterns. In this system every signature requires an entry in a database, which is one of the big challenges. The database may contain hundreds or even thousands of entries, and each packet is compared with all of them.
Disadvantages:
Any new form of misuse is not detected.
It is resource consuming and slows down the throughput.
Advantages:
It raises fewer false alarms because signatures can be very specific about what they are looking for.

2) Anomaly based IDS: An anomaly detection system looks for unknown intrusions by watching for abnormalities in traffic.
Disadvantages:
It raises a high number of false alarms.
It is limited by its training data.
Advantage:
New forms of attack can be detected.

There are various approaches [4] used for intrusion detection in the research. In this paper, the learning approaches (neural networks) used for IDS are surveyed. The Neural Network (NN) approach has scope for both misuse detection and anomaly detection systems due to its self-adaptive, self-organizing and self-learning (training) abilities [5].

C.
Neural Network approach
An increasing amount of research is being carried out on Artificial Neural Networks (ANN) [6], [7]. An ANN consists of base units called neurons, which are grouped in several layers. Neurons are connected to neighbouring neurons, and those connections are weighted. An ANN has an input layer, one or several hidden layers, and an output layer. Neural network architectures can be distinguished as follows:
Supervised training algorithm [5], [6]: The network learns the desired output for a given input or pattern in the learning phase. Ex. Multi-Layer Perceptron (MLP); the MLP is employed for pattern recognition problems.
Unsupervised training algorithm [5], [6]: The network learns without the desired output being specified in the learning phase. Ex. Self-Organizing Maps (SOM); a SOM finds a topological mapping from the input space to clusters and is generally used for classification problems.

An IDS using the ANN approach has two phases: 1) Training and 2) Testing.
1) Training: To recognize various kinds of normal and abnormal traffic behaviour, one has to train the network. In the research this is done using a dataset. The KDD99 dataset is publicly available and is the one most often used for evaluating IDS.
2) Testing: It is similar to training. After training, the NN IDS is tested using a test dataset. This dataset is smaller than the training dataset and is used to check that the network can detect the intrusions it was trained to detect.

II. LITERATURE SURVEY
For an IDS using the neural network approach, it is necessary to collect data representing normal and abnormal behaviour to train the neural network. Machine learning is based heavily on statistical analysis of data, and some algorithms can use patterns found in previous data to make decisions about new data [6]. The advantage of a neural network [7] is that it is capable of analysing data from the computer network even if the data is incomplete or distorted.
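The outcome of the testing phase described in Section I-C is usually summarised by counting the four terms defined in the Introduction (TP, TN, FP, FN). The sketch below is a minimal illustration and is not taken from any of the surveyed papers. It assumes labels encoded as 1 for attack and 0 for normal, and it assumes the commonly used definitions detection rate = TP / (TP + FN) and false alarm rate = FP / (FP + TN), which this survey does not state explicitly. All function names are hypothetical.

```python
from typing import Dict, Sequence


def confusion_counts(actual: Sequence[int], predicted: Sequence[int]) -> Dict[str, int]:
    """Count TP, TN, FP and FN. Assumed encoding: 1 = attack, 0 = normal."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["TP"] += 1   # attack detected as attack
        elif a == 0 and p == 0:
            counts["TN"] += 1   # normal detected as normal
        elif a == 0 and p == 1:
            counts["FP"] += 1   # normal detected as attack (false alarm)
        else:
            counts["FN"] += 1   # attack detected as normal (missed attack)
    return counts


def detection_rate(c: Dict[str, int]) -> float:
    """Fraction of actual attacks that were detected (assumed definition)."""
    attacks = c["TP"] + c["FN"]
    return c["TP"] / attacks if attacks else 0.0


def false_alarm_rate(c: Dict[str, int]) -> float:
    """Fraction of normal records flagged as attacks (assumed definition)."""
    normals = c["FP"] + c["TN"]
    return c["FP"] / normals if normals else 0.0


if __name__ == "__main__":
    # Hypothetical test-set labels and classifier outputs.
    actual = [1, 1, 1, 0, 0, 0, 0, 1]
    predicted = [1, 0, 1, 0, 1, 0, 0, 1]
    c = confusion_counts(actual, predicted)
    print(c)                    # {'TP': 3, 'TN': 3, 'FP': 1, 'FN': 1}
    print(detection_rate(c))    # 0.75
    print(false_alarm_rate(c))  # 0.25
```

With these definitions, a classifier that raises many alerts can reach a high detection rate while also producing a high false alarm rate, which is why the surveyed papers typically report both figures together.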
Current ANN intrusion detection technologies are the Back-propagation Neural Network called NNID (Neural Network Intrusion Detector) [8], Multiple Self-Organizing Maps (MSOMs), CMAC (Cerebellar Model Articulation Controller), which uses an adaptive NN, and MLP (Multi-Layer Perceptron) [9].

A. Related Work
Hua Tang and Zhuolin Cao proposed an approach in [10] to detect attacks using artificial neural networks and support vector machines. The proposed approach is applied to the KDD CUP'99 data set. The average detection rates obtained for the various attacks are as follows.

TABLE I
ATTACK DETECTION RATES OBTAINED IN [10]

            Attack type
Approach    DoS      U2R      Probe    R2L
NN          82.4%    59.1%    65.9%    14.3%
SVM         83.8%    63.1%    66.3%    14.9%

The results show that SVM is better than NN. When overall accuracy is compared, however, the authors obtained results in which NN is slightly better than SVM.

Laheeb Mohammad Ibrahim proposed an approach in [11] for anomaly detection using a Distributed Time-Delay Artificial Neural Network (DTDNN) over the KDD99 dataset. He used a training dataset consisting of 25000 patterns (5000 patterns for each class of DoS, U2R, R2L, Probe and Normal) and a testing dataset consisting of 2500 patterns (500 patterns for each class). The results show an overall classification accuracy of 99.884% for the Distributed Time-Delay network, and the percentages of successful classification are DoS (97.6%), U2R (96.2%), R2L (95.8%), Probe (98.2%) and Normal (98.4%).

For intrusion detection, the authors of [12] used a neural network IDS model based on a BP neural network. 2570 records were selected from the KDD99 dataset, of which 1325 were used for training (631 normal connections and 694 intrusion connections) and 1245 for testing (523 normal connections and 722 intrusion connections). The results obtained are detection rate = 80.5%, false alarm rate = 7.4% and omission rate = 11.3%.

Also, in [13], Mukhopadhyay, M. Chakraborty, S. Chakrabarti and T. Chatterjee proposed a back-propagation neural network for intrusion detection.
Their emphasis is on the detection of new attacks and a low failure rate. The proposed model consists of a data collector, pre-processor, encoder and neural network classifier. First the network is trained and then it is tested. Testing includes two phases, Level 1 and Level 2. In Level 1 sample data is used, whereas in Level 2 a totally new dataset is used. The success rates for Level 1 and Level 2 testing are 95.6% and 73.9%, whereas the failure rates are 4.4% and 26.1% respectively.

Sufyan T. Faraj Al-Janabi and Hadeel Amjed Saeed worked on anomaly-based intrusion detection in [14]. They developed an anomaly-based IDS built on a back-propagation neural network (BPN) and used packet behaviour parameters for the experiment. The proposed model first separates normal from abnormal traffic, then classifies abnormal events into four attack types (DoS, Probe, U2R or R2L), and then performs a detailed classification of abnormal events into 29 sub-attack types. 22 features of the KDD99 dataset are used for the experiment, categorized as 5 preliminary, 7 secondary and 10 less important features. They faced several issues, which are as follows:
A large amount of training data is required to train the ANN and to get accurate results.
There is a slight trade-off between increasing the number of classification levels and the percentage of detection.

In [15], Vladimir Bukhtoyarov and Eugene Semenkin proposed a neural network ensemble approach to detect intrusion. The approach is used for fixed-size neural network ensembles with single-stage voting. A collective neural network approach is used to overcome the problem of detecting network attacks. However, the structure becomes complex due to the collective approach, and more training time is required to train each ANN model; these are issues of the system. The choice of the threshold for invoking the neural network ensemble classifier is another issue.

Prof. D.P.
Gaikwad, Sonali Jagtap, Kunal Thakare and Vaishali Budhawant implemented an FC-ANN approach in [16], based on ANN and fuzzy clustering, to address the issues of lower detection precision and weaker detection stability. In the proposed model a restore point is provided for rolling back system files, registry keys, installed programs, the project database, etc. To reduce the complexity and size of the subsets, different training subsets are first generated using fuzzy clustering. Then different ANN models are trained on those subsets, and finally the results are combined.

V. Jaiganesh, Dr. P. Sumathi and S. Mangayarkarasi proposed a back-propagation approach to detect intrusion in [17]. First, an input and its corresponding target, together called a training pair, are generated. Then the training pair is applied to the network. Detection rate and false alarm rate are the performance measures used to evaluate the proposed method. The detection rate for DoS, Probe, U2R and R2L attacks is below 80%. Poor detection when hidden attackers are present is one of the issues.

In [18], Devikrishna K S and Ramakrishna B B proposed a system which uses a Multi-Layer Perceptron (MLP) architecture. The system detects attacks and classifies them into six groups. The authors pointed out the issue of obtaining irrelevant output and suggested solving it in future work.

III. DRAWBACKS OF EXISTING LEARNING TECHNIQUES
Several issues emerge from the survey, such as false detection, long training time, low detection precision for low-frequency attacks, classification of attacks, etc. To overcome the problem of long training time, it is necessary to use a high-speed learning algorithm for IDS and to compare its results with an existing learning technique. In this paper a technique is proposed which will reduce the training time, and its results will be compared with an existing technique.

IV. PROPOSED APPROACH
From the literature survey it is observed that many authors used the back-propagation neural network approach [12], [13], [14], [17] for intrusion detection. However, there are some issues, such as low detection and long training time. So, our task is to find another approach which can address these issues. In theory, the Extreme Learning Machine (ELM) [19], [20] algorithm tends to provide an extremely fast learning speed compared with traditional learning algorithms [20]. Therefore our proposed approach is to build a predictive model for intrusion detection which learns faster than BPN. Using the ELM technique, a classifier will be built to classify normal and abnormal activity. The results of ELM will be compared with the traditional BPN approach.

The proposed approach has the following three phases.
1) Data pre-processing: Convert raw data to machine-readable form.
2) Training: In this phase the network is trained on normal and attack data.
3) Testing: Activity is predicted as either intrusive or not.

Fig. 1 depicts the architecture of the proposed approach.

Fig. 1. Proposed Architecture of IDS.

The architecture has the following modules.
Network Data Monitoring: This module monitors the network stream and captures packets to serve as the data source of the NIDS.
Pre-processing: In the pre-processing phase, network traffic is collected and processed for use as input to the system.
Feature Extraction: This module extracts a feature vector from the network packets (connection records) and submits the feature vector to the classifier module. The feature extraction process consists of feature construction and feature selection. The quality of the feature construction and feature selection algorithms is one of the most important factors that influence the effectiveness of an IDS. Achieving a reduction of the number of relevant traffic features without a negative impact on classification accuracy is a goal that largely improves the overall effectiveness of the IDS.
Classifier: This module analyses the network stream and draws a conclusion as to whether an intrusion has happened or not. BPN and ELM techniques can be used as a classifier. The most successful application of neural networks is classification or categorization and pattern recognition.
Training: The learning process is a process of optimization in which the best set of connection coefficients (weights) for solving a problem is found.
Testing: When an intrusion is detected, this module sends a warning message to the user.
Knowledgebase: This module provides the training samples for the classifier phase. An artificial neural network can work effectively only when it has been trained correctly and sufficiently.

V. CONCLUSION
In this paper some basics of IDS are introduced and the different neural network approaches used for IDS in research papers are discussed. It is found that most researchers used BPN for intrusion detection. However, our survey pointed out some issues, such as a low detection rate, detailed classification of attacks that sometimes gives irrelevant output, and the long training time required to train the network. To overcome the training time issue, an extreme learning approach is proposed, and in future work its results will be compared with the traditional BPN approach.

REFERENCES
[1] Danny Rozenblum, "Understanding Intrusion Detection Systems", SANS Institute Reading Room site.
[2] K. Rajasekhar, B. Sekhar Babu, P. Lakshmi Prasanna, D. R. Lavanya, T. Vamsi Krishna, "An Overview of Intrusion Detection System
[3] Peng Ning, Sushil Jajodia, "Intrusion Detection Techniques", http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.89.2492&rep=rep1&type=pdf
[4] Sandip Sonawane, Shailendra Pardeshi, Ganesh Prasad, "A survey on intrusion detection techniques", World Journal of Science and Technology 2012, 2(3):127-133.
[5] Jean-Philippe, "Application of Neural Networks to Intrusion Detection", SANS Institute Reading Room site.
[6] Deepika P Vinchurkar, Alpa Reshamwala, "A Review of Intrusion Detection System Using Neural Network and Machine Learning Technique", International Journal of Engineering Science and Innovative Technology (IJESIT), Volume 1, Issue 2, November 2012.
[7] Shahbaz Pervez, Iftikhar Ahmad, Adeel Akram, Sami Ullah Swati, "A Comparative Analysis of Artificial Neural Network Technologies in Intrusion Detection Systems", Proceedings of the 6th WSEAS International Conference on Multimedia, Internet & Video Technologies, Lisbon, Portugal, September 22-24, 2006.
[8] V. Jaiganesh, Dr. P. Sumathi, S. Mangayarkarasi, "An Analysis of Intrusion Detection System using back propagation neural network", IEEE Computer Society Publication, 2013.
[9] Aida O. Ali, Ahmed I. Saleh, Tamer R. Badawy, "Intelligent Adaptive Intrusion Detection Systems Using Neural Networks (Comparative study)", International Journal of Video & Image Processing and Network Security IJVIPNS-IJENS, Vol. 10, No. 01, Feb 2010.
[10] Hua Tang, Zhuolin Cao, "Machine Learning-based Intrusion Detection Algorithms", Journal of Computational Information Systems 5:6 (2009) 1825-1831.
[11] Laheeb Mohammad Ibrahim, "Anomaly network intrusion detection system based on distributed time-delay neural network (DTDNN)", Journal of Engineering Science and Technology, Vol. 5, No. 4 (2010) 457-471.
[12] Changjun Han, Yi Lv, Dan Yang, Yu Hao, "An Intrusion Detection System Based on Neural Network", IEEE publication, 2011 International Conference on Mechatronic Science, Electric Engineering and Computer, August 19-22, 2011, Jil.
[13] Mukhopadhyay, M. Chakraborty, S. Chakrabarti, T. Chatterjee, "Back Propagation Neural Network Approach for Intrusion Detection System", 2011 International Conference on Recent Trends in Information Systems, IEEE Publication.
[14] Sufyan T. Faraj Al-Janabi, Hadeel Amjed Saeed, "A Neural Network Based Anomaly Intrusion Detection System", 2011 Developments in E-systems Engineering, IEEE Publication, 978-0-7695-4593-6/11, DOI 10.1109/DeSE.2011.19.
[15] Vladimir Bukhtoyarov, Eugene Semenkin, "Neural Networks Ensemble Approach for Detecting Attacks in Computer Networks", WCCI 2012 IEEE World Congress on Computational Intelligence.
[16] Prof. D.P. Gaikwad, Sonali Jagtap, Kunal Thakare, Vaishali Budhawant, "Anomaly Based Intrusion Detection System Using Artificial Neural Network and fuzzy clustering", International Journal of Engineering Research & Technology (IJERT), ISSN: 2278-0181, Vol. 1, Issue 9, November 2012.
[17] V. Jaiganesh, Dr. P. Sumathi, S. Mangayarkarasi, "An Analysis of Intrusion Detection System using back propagation neural network", IEEE Computer Society Publication, 2013.
[18] Devikrishna K S, Ramakrishna B B, "An Artificial Neural Network based Intrusion Detection System and Classification of Attacks", International Journal of Engineering Research and Applications (IJERA), ISSN: 2248-9622, Vol. 3, Issue 4, Jul-Aug 2013, pp. 1959-1964.
[19] Chi Cheng, "Extreme learning machines for intrusion detection", Neural Networks (IJCNN), The 2012 International Joint Conference on, 10-15 June 2012.
[20] Guang-Bin Huang, Qin-Yu Zhu, Chee-Kheong Siew, "Extreme learning machine: Theory and applications", NeuroComputing, December 2005.
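
The speed argument behind the ELM proposal in Section IV can be made concrete with a short sketch. In the formulation of [20], a single-hidden-layer network keeps its randomly drawn input weights and biases fixed and obtains the output weights in one step from the Moore-Penrose pseudoinverse of the hidden-layer output matrix, so no iterative back-propagation is needed. The code below is an illustrative sketch only and is not the authors' implementation: the sigmoid activation, the hidden-layer size, the synthetic two-class data (standing in for pre-processed connection records) and all function names are assumptions.

```python
import numpy as np


def hidden_output(X, W, b):
    """Sigmoid activations of the hidden layer for the input rows of X."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))


def train_elm(X, T, n_hidden, rng):
    """Fit an ELM: random fixed hidden layer, least-squares output weights."""
    W = rng.standard_normal((X.shape[1], n_hidden))  # input weights, never updated
    b = rng.standard_normal(n_hidden)                # hidden biases, never updated
    H = hidden_output(X, W, b)                       # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                     # output weights in a single step
    return W, b, beta


def predict_elm(X, W, b, beta):
    """Return 1 (attack) where the network output exceeds 0.5, else 0 (normal)."""
    return (hidden_output(X, W, b) @ beta > 0.5).astype(int).ravel()


def make_toy_data(rng, n_per_class=100, n_features=4):
    """Synthetic stand-in for connection records (assumption, not KDD99)."""
    normal = rng.standard_normal((n_per_class, n_features))
    attack = rng.standard_normal((n_per_class, n_features)) + 3.0
    X = np.vstack([normal, attack])
    y = np.concatenate([np.zeros(n_per_class, dtype=int),
                        np.ones(n_per_class, dtype=int)])
    return X, y


rng = np.random.default_rng(0)
X, y = make_toy_data(rng)
W, b, beta = train_elm(X, y.reshape(-1, 1).astype(float), n_hidden=20, rng=rng)
train_accuracy = float(np.mean(predict_elm(X, W, b, beta) == y))
print(f"training accuracy: {train_accuracy:.3f}")
```

For the comparison with BPN planned as future work, the training time would be measured around the call to `train_elm` and around the iterative weight-update loop of the BPN, using the same training records for both.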