Survey on different techniques of Anomaly based Intrusion Detection

Parth Shah, Harsh Shah, Sindhu Nair
Department of Computer Engg, Dwarkadas J. Sanghvi COE, Mumbai, India
shahp13594@gmail.com, harshshah30032@gmail.com, sindhu.nair@djsce.ac.in

ABSTRACT: Anomaly detection is a well-studied problem in many application domains. Widely used examples include identifying fraud, changes in customer behaviour, and manufacturing flaws. Many of the methods are specific to particular applications or research domains. The methods presented here are drawn from the data mining and machine learning domains. We give an overview of an anomaly detection system along with its key components, and discuss different anomaly detection techniques.

Keywords: Data Mining, Anomaly Detection, Anomaly Based Intrusion Detection, Machine Learning

1. INTRODUCTION:
Enormous amounts of data are stored in files, databases, and other repositories, so it is increasingly important to develop powerful means for interpreting such data. Data mining is a process that enables the discovery of knowledge in databases and is hence also known as Knowledge Discovery in Databases (KDD). It can be defined as the nontrivial extraction of hidden, previously unknown and potentially useful information from data in databases [5]. The terms data mining and KDD are often used interchangeably; strictly, data mining is one step of the KDD process.

The major steps of the KDD process are data integration, data cleaning, data selection, data transformation, data mining, pattern evaluation and knowledge presentation. Data mining primarily involves anomaly detection, association rule learning, classification, regression, summarization and clustering. It is concerned with how to summarize and generalize the data, for instance to determine which observations do not belong and which are interesting.

2. ANOMALIES
Anomalies are aberrant data that do not conform to the normal data [1][5]. Fig. 1 illustrates anomalies in a simple 2-dimensional data set. N1 and N2 are the two normal regions, since most observations lie in these two regions. Points that are sufficiently far away from the regions, e.g. points d1 and d2 and the points in region d3, are the anomalies.

Fig.1. A simple example of anomalies in a two-dimensional data set.[1]

Anomaly detection is also known as outlier detection. It is the detection of items, events or observations that do not conform to an expected pattern [5]. Outliers, novelties, noise, aberrations and exceptions in data are also considered examples of anomalies. Applications include detecting aberrant images in surveillance footage, identifying abnormal organic compounds, data cleaning and identifying flaws in manufactured materials. Across these applications the basic steps remain the same:
1) Identify normality by calculating some “signature” of the data.
2) Measure how far an observation deviates from that signature.
3) Set some threshold which, if exceeded by an observation’s metric measurement, causes the observation to be flagged as an anomaly.

3. VARIOUS CHALLENGES IN ANOMALY DETECTION:
A straightforward approach is to define a region representing normal behaviour and to declare any observation that falls outside this region an anomaly. Several factors make this difficult:
● Defining a normal region that encompasses every possible normal behaviour is very difficult [1].
● In many domains normal behaviour keeps evolving, and a current notion of normal behaviour might not be sufficiently representative in the future.
● The exact notion of an anomaly differs across application domains [3].
● Availability of labelled data for training and validation of the models used is usually a major issue.
● Data often contains noise that tends to be similar to the actual anomalies, which makes it difficult to distinguish and remove.

Fig.2. The key components of anomaly detection technique.[2]

4. DIFFERENT ASPECTS OF ANOMALY DETECTION
This section identifies the different aspects of anomaly detection. The formulation of the problem is determined by several factors, such as the structure and nature of the input data and the availability or unavailability of labels.

4.1. Structure and Nature of the Input Data
The input is a collection of data instances (also referred to as object, record, point, vector, pattern, event, case, sample, observation or entity). Each data instance is described by a set of attributes. Attributes can be of different types, such as binary, categorical or continuous. A data instance that consists of only one attribute is known as univariate, and a data instance that consists of multiple attributes is known as multivariate. In multivariate data instances the attributes may all be of the same type or a mixture of different data types.

The nature and structure of the attributes determine the applicability of anomaly detection techniques. For example, for statistical techniques different statistical models have to be used for continuous and categorical data [3]. Similarly, for nearest neighbour based techniques, the nature of the attributes determines the distance measure to be used [3]. In some cases the pairwise distances between data instances are provided in the form of a similarity matrix instead of the actual data.

Input data can also be categorized based on the relationships present among data instances. Most existing anomaly detection techniques deal with record data, in which no relationship is assumed among the data instances. In general, however, data instances can be related to each other; sequence data, multidimensional data, and graph data are some examples.

4.2 Types Of anomaly
Anomalies can be classified into the following categories:

1. Point Anomalies: When an individual data instance can be considered abnormal with respect to the rest of the data, it is known as a point anomaly [2][5]. For example, if an individual’s credit card spending is usually in the range of $50000 to $100000, a payment of $500000 is by itself a point anomaly and therefore worth investigating.

2. Contextual Anomalies: A contextual anomaly is a data instance that is anomalous in a specific context [2]. Both point and collective anomalies can be contextual. For example, if the usual credit card spending pattern is $100 per week, spending $1200 during Diwali week is considered normal, whereas the same $1200 during a week in May is not [5]. Each data instance is defined using the following two sets of attributes:
Contextual attributes: The contextual attributes are used to determine the context or neighbourhood of an instance [2][5]. For instance, in spatial data sets the coordinates of a location are the contextual attributes, and in sequence data the contextual attribute is the one that determines the position of an instance in the entire sequence [5].
Behavioural attributes: The behavioural attributes define the non-contextual characteristics of an instance [2][5] and are used to find anomalous behaviour within a context [5]. A data instance might be a contextual anomaly in a given context, while an identical data instance could be considered normal in a different context [5].

3. Collective Anomalies: A collective anomaly is a collection of related data instances that is anomalous with respect to the entire data set [2][5]. The individual data instances in a collective anomaly may not be anomalies by themselves, but their occurrence together as a collection is.

4.3 Data Labels
The label associated with a data instance denotes whether it is normal or anomalous [1].
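To make the roles of contextual attributes, behavioural attributes and labels concrete, the following is a minimal Python sketch built on the credit-card example of Section 4.2. The `Instance` class, the context names and the per-context spending limits are assumptions introduced purely for illustration; they do not come from the cited works.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    context: str   # contextual attribute: which kind of week it is
    spend: float   # behavioural attribute: amount spent in that week (USD)


# Hypothetical upper limits on normal weekly spend for each context.
NORMAL_LIMIT = {"diwali_week": 1500.0, "ordinary_week": 300.0}


def label(instance: Instance) -> str:
    """Return 'anomalous' if the spend exceeds the limit for its context."""
    limit = NORMAL_LIMIT[instance.context]
    return "anomalous" if instance.spend > limit else "normal"


# The same behavioural value receives different labels in different contexts.
for inst in (Instance("diwali_week", 1200.0), Instance("ordinary_week", 1200.0)):
    print(inst.context, inst.spend, label(inst))
```

Under these assumed limits, a $1200 week is labelled normal in the Diwali context and anomalous in an ordinary week, which is the behaviour expected of a contextual detector.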
Labelling is typically carried out by human experts. Obtaining a labelled set of anomalous data that covers all possible types of anomalous behaviour is usually more difficult than obtaining labels for normal behaviour. Moreover, anomalous behaviour is often dynamic in nature. Depending on the labels available, anomaly detection falls into three categories:

1. Supervised anomaly detection: In this type the data set is labelled as “normal” and “abnormal”, and a classifier is trained on it [1][5]. A major issue in supervised anomaly detection is that the anomalous data instances are very few compared to the normal data instances in the training data.

2. Semi-supervised anomaly detection: Techniques that operate in semi-supervised mode assume that the training data has labelled instances only for the normal class. Since they do not require labels for the anomaly class, they are more widely applicable than supervised techniques.

3. Unsupervised anomaly detection: The aim of this type of anomaly detection is to find which parts of a collection or document are most anomalous with respect to the rest of the collection, without relying on labelled training data. For example, given a collection of many news stories with one fictional story inserted, we would wish to identify the fictional story as anomalous, because its language is abnormal with respect to the rest of the collection.

4.4 Anomaly detection output:
An important aspect of any anomaly detection technique is the form in which anomalies are reported. Typically, the output produced by anomaly detection techniques is of one of the following two types:
1) Scores: Scoring techniques assign an anomaly score to each instance in the test data, depending on the degree to which that instance is considered an anomaly [1]. The analyst may then apply a threshold to the scores to select the anomalies.
2) Labels: Labelling techniques assign a label (normal or anomalous) to each test instance [1]. Techniques that provide binary labels do not directly allow analysts to use a specific threshold to select the most relevant anomalies, although this can be controlled indirectly through parameter choices within each technique.

5. CLASSIFICATION TECHNIQUES FOR ANOMALY BASED INTRUSION DETECTION SYSTEM
Classification is widely used for anomaly detection. The classification process is divided into two steps. In the first step, a training set made up of data instances and their associated class labels is used to build a classifier. In the second step, the classifier is used to assign class labels to new data instances. Classification techniques used on intrusion detection data include Bayesian classification, Decision tree, k-Nearest neighbour, Support Vector Machine, Neural Network and Rule Induction Methods.

A. Naïve Bayes
Naïve Bayes is a probabilistic classifier that is mainly used to predict the probability of class membership [4]. It assumes class-conditional independence: given the class, the probability of one attribute does not affect the probability of another. It makes 2n! independent assumptions for a series of n attributes. The algorithm first computes the prior probability of each class and then the class-conditional probabilities for the given intrusion data set [3]. The next step is to find the class with the highest probability, after which the detection rate and false positive rate are calculated [3]. Figure 3 shows the framework for a naïve Bayesian model to perform intrusion detection.

Fig 3: The framework of intrusion detection model [4]

To handle missing attribute values, the naïve Bayes classifier omits the corresponding probability when calculating the likelihoods of membership in each class.

B. Decision Tree:
The main advantage of decision trees is that they can learn a model from the training data and then predict future data as one of the attack types [3]. For this reason they can be used for misuse intrusion detection. Two prerequisites for the analysis are data collection and tool acquisition and selection [3].
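As an illustration of how a tree can be learned from labelled records and then used to assign an attack type to new traffic, here is a minimal Python sketch. It uses an ID3-style split on the attribute with the lowest weighted entropy, which is one common choice and not necessarily the algorithm used in [3]; the attribute names, values and class labels in the toy data set are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def majority(labels):
    """Most frequent class label."""
    return Counter(labels).most_common(1)[0][0]


def build_tree(rows, labels, features):
    """Recursively build a tree; leaves are class labels, inner nodes dicts."""
    if len(set(labels)) == 1 or not features:
        return majority(labels)

    def weighted_entropy(feature):
        groups = {}
        for row, lab in zip(rows, labels):
            groups.setdefault(row[feature], []).append(lab)
        return sum(len(g) / len(labels) * entropy(g) for g in groups.values())

    best = min(features, key=weighted_entropy)
    remaining = [f for f in features if f != best]
    node = {"feature": best, "branches": {}, "default": majority(labels)}
    for value in {row[best] for row in rows}:
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        node["branches"][value] = build_tree(
            [r for r, _ in subset], [l for _, l in subset], remaining)
    return node


def classify(tree, row):
    """Walk from the root to a leaf, following the branch for each test."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(row[tree["feature"]], tree["default"])
    return tree


# Hypothetical toy training set: connection attributes and a class label.
train = [
    ({"protocol": "tcp", "flag": "SF"}, "normal"),
    ({"protocol": "tcp", "flag": "S0"}, "dos"),
    ({"protocol": "udp", "flag": "SF"}, "normal"),
    ({"protocol": "tcp", "flag": "S0"}, "dos"),
    ({"protocol": "icmp", "flag": "SF"}, "normal"),
    ({"protocol": "tcp", "flag": "REJ"}, "probe"),
    ({"protocol": "udp", "flag": "SF"}, "normal"),
    ({"protocol": "tcp", "flag": "REJ"}, "probe"),
]
rows = [r for r, _ in train]
labels = [l for _, l in train]
tree = build_tree(rows, labels, ["protocol", "flag"])
print(classify(tree, {"protocol": "icmp", "flag": "S0"}))
```

Attribute values that were never seen during training fall back to the majority class of the node; this is a simplification chosen to keep the sketch short, and a production detector would need a more careful policy for unseen values.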
The figure below shows the process of implementing a decision tree for intrusion detection.

[Figure: decision tree intrusion detection process; components: Network Traffic, Pre-processing, Alerts, Detector, Pattern Building, Data Set]

To classify a new instance, start at the topmost (root) node and follow the branch indicated by the outcome of each test until a bottommost (leaf) node is reached. The label of the leaf node is the result of the classification. Decision trees are useful in real-time intrusion detection because of their high performance. Owing to their generalization accuracy, they are also able to detect new intrusions.

C. Support Vector Machine
An SVM model has to be constructed for classification in an IDS. The main aim of SVM is to produce a model that predicts the target value of a data instance using various kernel functions [3][4]. The three major SVM kernel functions are the Gaussian kernel, the polynomial kernel, and the sigmoid kernel. In the classification phase, an SVM training model is built and SVM functions are used to generate the classification results [3]. Speed and accuracy are the main reasons for using SVM for intrusion detection [6]. Training and testing are the two phases of an SVM intrusion detection system [3]. Whenever a new pattern is detected during classification, the training pattern is updated dynamically. SVM provides high accuracy rates.

D. K-Nearest Neighbour
K-NN classification is applied when reliable parametric estimates of probability densities are unknown or difficult to determine [3]. In this method objects are classified based on the closest training examples in the feature space [3]. It is also known as a lazy learning technique, since the function is approximated only locally and all computation is deferred until classification. The figure below shows the method for deciding the nearest neighbour.

Fig. 3 Majority voting scheme [3]

The 1-Nearest Neighbour (1NN) classifier is based on the following key points. The training samples are taken as the representative points, and the distance between each test sample and each representative point is computed [3]. The class label of the nearest representative point is assigned to the test sample. k-NN is an extension of 1NN in which the class of a test sample is determined from its k nearest neighbours. It is mainly used together with statistical schemes for intrusion detection.

E. Artificial Neural Networks
Neural networks are used in both anomaly intrusion detection and misuse intrusion detection [3][6]. In anomaly IDS, neural networks identify variations from a user’s established behaviour, while in misuse IDS, neural networks are designed to receive data from the network and analyse it for any occurrence of misuse [3]. In the latter approach the neural network acts as a standalone misuse detection system: data is received from a network stream and then analysed for misuse. A neural network is capable of analysing data from the network even if the data is incomplete or distorted.

6. POTENTIAL ISSUES FOR ANOMALY BASED INTRUSION DETECTION SYSTEMS
● Feature extraction
● Classifier construction
● Sequential pattern prediction
● Human intervention
● False positive and false negative alarm rates

7. CONCLUSION
In this paper we have presented a generalized definition of anomaly detection along with its different aspects, methods and techniques. We have also studied the different types of intrusion detection systems, with a brief introduction to each category of anomaly detection methods. As future work, we suggest investigating these techniques further and evaluating their efficiency against existing methods.

8. REFERENCES
[1] Varun Chandola, Arindam Banerjee, and Vipin Kumar, “Anomaly Detection: A Survey”, ACM Computing Surveys, Vol. 41, No. 3, Article 15, July 2009.
[2] Vaishali V.
Khandagale, Yoginath Kalshetty, “Review and Discussion on different techniques of Anomaly Detection Based and Recent Work”, International Journal of Engineering Research & Technology (IJERT), Vol. 2, Issue 10, October 2013, ISSN: 2278-0181.
[3] Anju, Pardeep Kumar Mittal, Shalini Aggarwal, “A Review of Various Classification Techniques Based Data Mining for Intrusion Detection”, International Journal of Advanced Research in Computer Science and Software Engineering, Volume 5, Issue 3, March 2015, ISSN: 2277 128X.
[4] Patel Hemant, Bharat Sarkhedi, Hiren Vaghamshi, “Intrusion Detection in Data Mining With Classification Algorithm”, International Journal of Advance Research in Electrical, Electronics and Instrumentation Engineering, ISSN: 2320-3765, Vol. 2, Issue 7, July 2013.
[5] Wong, W., Moore, A., Cooper, G., and Wagner, M. 2003. Bayesian network anomaly pattern detection for disease outbreaks. In Proceedings of the 20th International Conference on Machine Learning. AAAI Press, Menlo Park, California, 808-815.
[6] P. Amudha, S. Karthik, S. Sivakumari, “Classification Techniques for Intrusion Detection – Overview”, International Journal of Computer Applications (0975-8887), Vol. 76, No. 16, August 2013.
[7] S. Neelima, N. Satyanarayana and P. Krishna Murthy, “Data Mining Techniques in Intrusion detection”, International Journal of Emerging Technology and Advanced Engineering, ISSN: 2250-2459, Vol. 4, Issue 2, Feb 2014, pp. 631-634.
[8] Shyara Taruna R., Mrs. Saroj Hiranwal, “Enhanced Naïve Bayes Algorithm for Intrusion Detection in Data Mining”, International Journal of Computer Science