See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/331417497 A Two-Level Hybrid Model for Anomalous Activity Detection in IoT Networks Conference Paper · January 2019 DOI: 10.1109/CCNC.2019.8651782 CITATIONS READS 21 502 2 authors: Imtiaz Ullah Qusay H. Mahmoud Ontario Tech University Ontario Tech University 9 PUBLICATIONS 89 CITATIONS 253 PUBLICATIONS 3,811 CITATIONS SEE PROFILE Some of the authors of this publication are also working on these related projects: Performance Evaluation of NoSQL Databases View project All content following this page was uploaded by Imtiaz Ullah on 20 February 2020. The user has requested enhancement of the downloaded file. SEE PROFILE WK,((($QQXDO&RQVXPHU&RPPXQLFDWLRQV 1HWZRUNLQJ&RQIHUHQFH &&1& A Two-Level Hybrid Model for Anomalous Activity Detection in IoT Networks Imtiaz Ullah, Qusay H. Mahmoud Department of Electrical, Computer and Software Engineering University of Ontario Institute of Technology, Oshawa, ON, L1G 0C5 Canada {imtiaz.ullah, qusay.mahmoud}@uoit.net Abstract-In this paper we propose a two-level hybrid anomalous activity detection model for intrusion detection in IoT networks. The level-1 model uses flow-based anomaly detection, which is capable of classifying the network traffic as normal or anomalous. The flow-based features are extracted from the CICIDS2017 and UNSW-15 datasets. If an anomaly activity is detected then the flow is forwarded to the level-2 model to find the category of the anomaly by deeply examining the contents of the packet. The level-2 model uses Recursive Feature Elimination (RFE) to select significant features and Synthetic Minority Over-Sampling Technique (SMOTE) for oversampling and Edited Nearest Neighbors (ENN) for cleaning the CICIDS2017 and UNSW-15 datasets. Our proposed model precision, recall and F score for level-1 were measured 100% for the CICIDS2017 dataset and 99% for the UNSW-15 dataset, while the level-2 model precision, recall, and F score were measured at 100 % for the CICIDS2017 dataset and 97 % for the UNSW-15 dataset. The predictor we introduce in this paper provides a solid framework for the development of malicious activity detection in IoT networks. Keywords-Anomaly detection system; flow-based intrusion detection; internet of things; machine learning; cybersecurity; vulnerabilities. I. INTRODUCTION The network administrator is facing many challenges due to the increasing number of devices connecting to the Internet. The Internet can nowadays be entitled Internet of Things (IoT) and the future network will be the Internet of Everything (IoE). The Internet is exposed to various irregular actions that affect network performance. Computer security is becoming more significant and essential due to the substantial expansion of the computer network. Network failures or attacks, data theft from the organization or people are the lists of irregular actions [1]. These actions are not acceptable in computer networks and known as “network anomalies”. These abnormalities can bring severe harms to the network condition and it is required to discover these anomalies as quickly as possible to reduce the chance of its damages. Many solutions have been developed in the past decade to address this problem. In general, the anomaly detection system (ADS) can be divided into two groups: feature-based and capacity based [2]. A feature-based anomaly detection system uses IP and TCP header to extract the properties of network traffic to k,((( detect a malicious activity in the network communication [3] while the capacity-based ADS use a threshold to define the anomalous and normal behavior of the system. Anomaly-based and misused based are typically focused and motivated detection techniques in the research field of intrusion detection [4, 5]. Anomaly-based detection system becomes more significant as compared to signaturebased detection system in identifying novel attacks. Anomaly-based detection methodology defines normal behavior and the rest is considered as malicious behavior while the signature-based detection methodology defines malicious behavior and consider the rest as normal behavior. The signature-based detection system collects patterns of known attacks, input the attack patterns into the signature database, extract features from various audit stream, compare these features with the attack signature and raise alarm if there is a possible intrusion. Manual update of the signature database and the inability to detect new emerging cyber threats limit the use of signature-based detection systems. Several studies have shown that anomaly-based methodology achieves better performance in detecting new attacks as compared to the signaturebased detection system. Network and host are the two primary domain area of IDS. The network-based IDS monitor packets traveling to a network via wireless or wire media while host-based IDS detect malicious activity at the host level only. IDS generate alerts as a result of malicious activity. A false positive alarm is an indication but not an actual intruder activity. The rate of false positive alarms can be decreased by changing the default configuration of the rule base [6]. IDS required to generate accurate and appropriate alerts along with timely effective actionable information. The operation of anomaly-based IDS depends on the stored normal patterns. Data mining and machine learning, artificial intelligence etc. are practices used to increase the effectiveness of IDS. The rest of this paper is organized as follows: Section II discusses related work. Section III presents the proposed model framework; Section IV presents the experimental setup, and comparison of results. Finally, Section V concludes the paper and offers ideas for future work. II. RELATED WORK A network is vulnerable to the different type of unpredicted actions that might in some way compromise its WK,((($QQXDO&RQVXPHU&RPPXQLFDWLRQV 1HWZRUNLQJ&RQIHUHQFH &&1& action. These unpredicted actions are known as anomalies. These anomalies reduce network service by making a network device or service partly or absolutely inaccessible for legitimate users. The study of intrusion detection has received a lot of attention due to the available vulnerabilities and threats in the networks as a result of new techniques are required for intrusion detection fulfill these gaps. Packet-based inspection is a common approach to acquire information about network traffic. In this approach, all packets communicated over the network are captured by means of TCPdump or Wireshark [7]. The benefit of this technique is to obtain a detail investigation of network communication. However, this technique is very expensive in large networks since it required a high memory and computation power [8]. A flow-based inspection is an alternative approach to reduce the amount of data capture. Presently, various network devices support flow protocols to monitor the network [3]. A flow is a set of packets having mutual properties, protocols, source and destination port and IP addresses. Flow-based intrusion detection only inspects the packet header to detect an attack. Usually, IDS use stateful protocol analysis or deep packet examination to identify anomalous actions in the network communication [9]. A flow-based IDS use network stream to classify the network traffic as normal or malicious. Flow-based IDS only examine a packet header; therefore, the flow-based anomaly detection is faster than packet-based and state-full protocol examination. Altaher et al. [10] propose an adaptive entropy based anomaly detection system to determine the deviation in network traffic. They used real network traffic to validate the efficiency of their proposed system. Mishra et al. [11] revealed that the security of the wireless network is not an easy task due to the major architectural difference between a wired and wireless network. An intrusion detection in the wireless network depends on protocols, applications, and the type of intrusion detected, so a response for one intrusion may be re-authenticating all nodes while the other response may be reinitializing communication channels. Cho et al. [12] described an attack model and proposed an anomaly-based detection scheme for botnet attacks in 6LowWPAN. They used centralized IDS at the border router to evaluate their model in a simulation environment. The substantial growth of IoT devices in smart homes, and the new attacks on IoT devices required to secure and protect these IoT systems. Hemdan and Manjaiah [13] investigate various attacks in IoT network that help a cybercrime investigator or security professional to understand and protect IoT systems. A framework for the detection of DoS attack in IoT based on 6LowPAN was proposed and evaluated by means of a real-life scenario [14]. Their framework successfully detect DoS attack and produce a security measure to increase network availability. Farooqi and Khan [15] analyzed IDS for wireless sensor networks. They categorized IDS into three groups, pure centralized, pure distributed and distributed centralized. Centralized IDS are more energy efficient because a powerful network device will be used to detect intrusions but need a specialized protocol to gather data from all sensors for possible malicious behavior detection. A distributed IDS required to install agents at every node need extra power consumption making it less energy efficient. An optimal solution is to combine centralized and distributed IDS together to work in a distributed manner and coordinate with the other nodes to identify malicious behavior. Oh et al. [16] present two novel techniques to detect malicious activity in the Internet of Things. Their proposed approach achieved better performance as compared to a traditional pattern-matching approach. IoT security becomes more challenging because different devices may use different technology e.g. 6LoWPAN, RPL, ZigBee, and a loose communication link, therefore; an IoT network can be attacked easily. Pongle and Chavan [17] used location and neighbor information to identify a wormhole attack and the attacker in the IoT networks. Their approach is a very energy efficient for the resource-constrained situation because they used a centralized node for heavy processing while lightweight processing was performed on the sensors node. Fig. 1. Two-level IoT Intrusion Detection Model III. FRAMEWORK OF PROPOSED MODEL Anomaly detection techniques have been the key motivation techniques for many researchers due to its potential in identifying novel attacks. The IoT attacks appear identical to the traditional Internet but due to limited protection of IoT devices, the scale and simplicity of these attacks are larger. In this paper, we proposed a two-level framework for the Internet of Things, the level-1 model operates at the network layer of the IoT infrastructure, classify the network flow as normal or anomaly and forward the network flow to the level-2 model to identify the category of the anomaly. We select the decision tree classifier for the level-1 model. The level-1 model operates at the network communication layer and is responsible to classify the network flow as normal or anomalous. The benefit of the level-1 model is to detect a malicious activity WK,((($QQXDO&RQVXPHU&RPPXQLFDWLRQV 1HWZRUNLQJ&RQIHUHQFH &&1& nearby the smart infrastructure by optimizing the local parameters. The flow-based local IDS accelerate the attack detection by getting the parameters updated locally. The network communicat- ion layer transfers the network flow to the level-2 model for deep packet inspection to identify the category of detected anomaly. Figure 1 shows our proposed two-level intrusion detection model for IoT networks. The level-2 model extracts all features from the raw network data. Irrelevant and redundant features can affect the classification power of the machine-learning algorithm; therefore, it is required to remove irrelevant and redundant features. Feature selection is a technique of choosing a subclass of the significant features and remove the irrelevant and redundant features. These significant features increase the accuracy by concentrating on enthusiastic features while ignoring unrelated and redundant features. The level-2 model combines the Recursive Feature Elimination (RFE) methodology for feature selection and Synthetic Minority Over-sampling Technique (SMOTE) and Edited Nearest Neighbors (ENN) for oversampling and undersampling classes. The internal structure of our proposed model for IoT intrusion detection presented in Figure 2. Fig. 2. Internal Organization of Two-level IoT Intrusion Detection Model The level-2 model uses RFE methodology to obtain the importance of every feature through feature importance and coefficient attribute and remove the least significant feature from the present set of features. The procedure is repeated until the desired number of features selected that give a high value of accuracy or precision or recall. We use RFE to select optimal features for accuracy, precision, and recall. The reason to use RFE for accuracy, precision, and recall is to find out a separate set of features for different metrics. It has been observed that some features are significant for one metric and insignificant for another metric. These three set of features are combined to have an optimal set of features that have the highest values for all three metrics. Details of RFE for accuracy, precision, and recall for CICIDS2017 and UNSW-15 datasets are presented in Figure 3 and Figure 4 respectively. Imbalanced datasets are biased towards majority classes. In order to solve this issue, oversampling and undersampling techniques are used to balance the dataset [18, 19]. We used SMOTE and ENN to balance the CICIDS2017 and UNSW-15 datasets. After the feature selection process, a combination of oversampling and undersampling was applied to balance the dataset to improve the classification accuracy of minor classes. We used SMOTE for oversampling and ENN for cleaning the dataset. The reason for using SMOTE and ENN is to create a synthetic sample of minority classes to improve the classification accuracy of minor classes. We used the random forest classifier at level-2 to identify the category of the detected anomaly. The following steps are required in order to implement our proposed model. Step 1. Extract the network features from the collected flow using tools such as tcptrace. Step 2. Empirically select flow-based features a. Train level-1 model via a decision tree. b. Classify the flow anomaly. Step 3. If the network flow is anomalous then forward to the flow to the level-2 model. Step 4. Use RFE at level-2 to calculate an optimal set of features for Accuracy, Precision, Recall WK,((($QQXDO&RQVXPHU&RPPXQLFDWLRQV 1HWZRUNLQJ&RQIHUHQFH &&1& Step 5. Combine the optimal set of features calculated in Step 4. Step 6. Generate training and test data. We used 60 % data for training and 40 % for testing. Step 7. Apply SMOTE and ENN for oversampling and undersampling to the training data. Step 8. Train the layer-2 model via a random forest classifier to identify the category of detected anomaly. Step 9. Validate the model using testing data. IV. EXPERIMENTAL SETUP A. Dataset Anomaly-based detection system becomes more important as compared to signature-based detection system in discovering novel attacks. KDD99 [20] and NSL-KDD datasets [21] does not reflect modern normal and anomalous network traffic. The KDD99 dataset was created two decades ago does not contain the new footprint of normal and anomalous network applications. The ISCX 2012 [22] dataset was developed to remove the deficiencies of KDD99 and NSL-KDD datasets but a limited number of attacks and network features used in ISCX dataset which makes it less efficient dataset for intrusion detection. To address these challenges the UNSW-NB15 [23, 24] and CICIDS2017 [25, 26] datasets have been recently released at defense force academy University of New South Wales Australia and Canadian Institute for Cybersecurity (CIC), University of New Brunswick Canada respectively. The UNSW-NB15 and CICIDS2017 datasets reflect modern attacks scenario and deep structured network traffic information. The UNSW-NB15 dataset contains nine modern attacks categories and realistic events of normal network traffic. The complexity of UNSW-NB15 training and testing sets were evaluated via Kolmogorov-Smirnov test in order to compare the distribution of training and testing, the skewness test used to measure the features irregularity and the kurtosis test is used to evaluate the smoothness of features. The UNSWNB15 dataset contains 49 features and a class label. The features of the UNSW-NB15 dataset are grouped into seven categories; Flow, Basic, Content, Time, Additional Generated features, Connection and Labeled features. The UNSW-NB15 attack categories are Analysis, Backdoor, DoS, Exploit, Fizzers, Generic, Reconnaissance, Shellcode, and Worm. The dataset is organized in two way to detect anomaly: binary class consists of normal and anomaly labels while the attack category label consists of nine attack categories and a normal network label. The UNSW-NB15 dataset has several advantages as compared to the earlier intrusion detection datasets. It contains modern anomalous and normal behavior; training and testing set has similar probability distribution and the flow-based features from the packet header and payload replicate the network packets competently. Fig. 3. RFE for Accuracy, Precision, and Recall for CICIDS2017 Fig. 4. RFE for Accuracy, Precision, and Recall for UNSW-NB15 The Canadian Institute for Cybersecurity (CIC), release the CICIDS2017 dataset at University of New Brunswick Canada. The CICIDS2017 dataset used diverse attack scenarios to generate anomalous network traffic and accurate measures to produce the normal network traffic. The CICIDS2017 data were collected for five days and 80 features were extracted by means of CIC flow meter [27]. The CICIDS2017 dataset essential criteria are; complete WK,((($QQXDO&RQVXPHU&RPPXQLFDWLRQV 1HWZRUNLQJ&RQIHUHQFH &&1& network configuration; complete traffic; labeled dataset; complete interaction; complete capture; a larger number of available protocols; attack diversity; heterogeneity; large number features set; and the metadata of the dataset. B. Dataset Preprocessing Preprocessing of the dataset is essential because the data type and format of some features are not suitable for machine learning algorithms. The IP address was converted to integer datatype and the protocol, state, and service features have string datatype converted to an integer representation. All features were normalized. The system uses the same methodology for both datasets. After the preprocessing, the data is ready for training and testing. Python sklearn package was employed to develop the proposed model. TABLE 1. SPECIFICITY, PRECISION, RECALL, AND F1-SCORE FOR CICIDS2017 FOR BINARY CLASS DATASET Normal Anomaly Avg Specificity 100 100 100 Precision 100 100 100 Recall 100 100 100 F1-score 100 100 100 TABLE 2. SPECIFICITY, PRECISION, RECALL, AND F1-SCORE FOR UNSW15 BINARY CLASS DATASET Normal Anomaly Avg Specificity 99 100 99 Precision 100 99 100 Recall 100 99 100 F1-score 100 99 100 TABLE 3. CROSS-VALIDATION FOR CICIDS2017 DATASET Specificity 99 99 99 3 Fold 5 Fold 10 Fold Precision 100 100 100 Recall 100 100 100 F1-score 100 100 100 TABLE 4. CROSS-VALIDATION FOR UNSW-15 DATASET 3 Fold 5 Fold 10 Fold Specificity 99 99 99 Precision 99 99 99 Recall 99 99 99 F1-score 99 99 99 TABLE 5. FEATURES SELECTED FOR CICIDS2017 AND UNSW-NB15 Dataset Selected Feature CICIDS2017 Source_Port, Destination_Port, Bwd_Packet_Length_Max, Bwd_Packet_Length_Min, Bwd_Packet_Length_Std, Flow_IAT_Mean, Flow_IAT_Std, Flow_IAT_Min, Fwd_IAT_Min, Bwd_Packets/s, Min_Packet_Length, Max_Packet_Length, Init_Win_bytes_forward, act_data_pkt_fwd UNSW-NB15 sport, dsport, proto, dur, sbytes, dbytes, sttl, service, Sload, smeansz, dmeansz, Stime, synack, ct_srv_dst V. RESULTS AND DISCUSSION The main function of our prospective model is to appropriately classify a malicious activity in IoT communica- tion. The key role of the level-1 model to categorize the input flow as normal or anomalous. The level-1 model use only flow-based features to classify the IoT communication. The proposed level-1 model achieved 100% specificity; precision; recall and F score for CICIDS2017 and 99 % specificity and 100 % precision; recall and F score for UNSW-NB15 datasets as shown in Table 1 and Table 2. Cross-validation tests are used to evaluate the specificity, precision, recall and F score of our proposed model. Cross-validation test for CICIDS2017 and UNSW-15 datasets are presented in Table 3 and Table 4. The result for UNSW-15 remain the same but there is a minor change in the CICIDS2017 dataset as shown in Table 3 and Table 4. In machine learning, the feature selection technique is considered an important step to select efficient features. Our proposed model uses RFE methodology in a wellorganized way to select active and reliable features for intrusion detection. The feature selected for accuracy, precision, recall are presented in Figure 3 and Figure 4. The features selected for accuracy, precision, and recall are combined to produce an optimal set of features for all evaluation matrixes. Table 5 shows the features selected for accuracy, precision, and recall for CICIDS2017 and UNSW-NB15 datasets. A performance study was accomplished to select the finest classification technique to achieve a high true positive and true negative value. We used random forest classifier for the level-2 model. The level-2 model deeply inspects the network traffic to categorize the detected anomaly appropriately. TABLE 6. SPECIFICITY, PRECISION, RECALL, AND F1-SCORE FOR CICIDS2017 Attack Type Benign FTP-Patator SSH-Patator DoS Slowloris DoS Slowhttptest DoS Hulk DoS GoldenEye Heartbleed Brute Force SQL Injection XSS Infiltration Bot Port Scan DDoS Average Specificity 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 Precision 100 100 100 100 100 99 100 100 85 75 74 100 100 100 100 100 Recall 100 100 100 99 100 100 100 100 88 27 70 69 99 100 100 100 F1 100 100 100 99 100 100 100 100 87 43 72 81 100 100 100 100 TABLE 7: SPECIFICITY, PRECISION, RECALL, AND F1-SCORE FOR UNSWNB15 Attack Type Normal Exploits Fuzzers Backdoors DoS Generic Reconnaissance Shellcode Analysis Worms Average Specificity 100 98 100 100 99 100 100 100 100 100 100 Precision 100 64 93 65 36 100 93 86 62 64 97 Recall 100 82 89 11 32 99 76 88 18 16 97 F1 100 72 91 18 34 99 84 87 28 25 97 The classification algorithm biased towards the majority class when imbalance CICIDS2017 and UNSWNB15 datasets are used. In order to solve this issue, we WK,((($QQXDO&RQVXPHU&RPPXQLFDWLRQV 1HWZRUNLQJ&RQIHUHQFH &&1& used oversampling and undersampling techniques to balance the datasets. We adopted SMOTE and ENN for our proposed model to balance CICIDS2017 and UNSW-NB15 datasets. SMOTE is used for oversampling and ENN is used for cleaning the dataset. Regular, borderline1, borderline2 and SVM algorithms were used but the finest results were obtained via regular and borderline1 algorithms. We used borderline1 algorithm for the oversampling. K is the number of neighbors that determine the average distance to the minority samples in the borderline1 algorithm. We used various values of K but K=4 to 8 considered as an optimal number of neighbors to compute an average distance to the minority samples. The proposed level-2 model specificity, precision, recall and F score for CICIDS2017 and UNSW-NB15 datasets as shown in Table 6 and Table 7 respectively. [9] [10] [11] [12] [13] [14] VI. CONCLUSION AND FUTURE WORK In this work, we have developed a two-level intrusion detection model for IoT networks. The level-1 model is a flow-based anomalous activity detection system, which classifies the input flow as a normal or an anomaly. If the flow is detected as anomalous flow then the flow will be transferred to the level-2 model to deeply inspect the flow to categorize the class of the detected anomaly. The level-2 model uses RFE for feature selection and SMOTEENN for balancing the dataset. The proposed model was evaluated via CICIDS2017 and UNSW-NB15 datasets. Our proposed model achieved 100% specificity for both datasets and the precision, recall and F score was measured 100% for CICIDS2017 and 97 % UNSW-NB15 datasets. [15] In future work, we plan to implement our proposed model in real-world scenario to supplementary validate the model. [21] REFERENCES [1] [2] [3] [4] [5] [6] [7] [8] Mangrulkar, N.S., Patil, A.R.B. and Pande, A.S., “Network Attacks and Their Detection Mechanisms: A Review”. International Journal of Computer Applications, 90(9), (2014). Amaral, A.A., de Souza Mendes, L., Zarpelão, B.B. and Junior, M.L.P., “Deep IP flow inspection to detect beyond network anomalies”. Computer Communications, pp.80-96, (2017). Proença, M.L., Coppelmans, C., Bottoli, M. and de Souza Mendes, L., “Baseline to help with network management”. e-Business and telecommunication networks, pp. 158-166, (2006). Amin, S.O., Siddiqui, M.S., Hong, C.S. and Lee, S., “RIDES: Robust intrusion detection system for IP-based ubiquitous sensor networks”. Sensors, 9(5), pp.3447-3468, (2009). Li, L., Yang, D.Z. and Shen, F.C., “A novel rule-based Intrusion Detection System using data mining”. 3rd IEEE International Conference on Computer Science and Information Technology (ICCSIT), Vol. 6, pp. 169-172, (2010). Zhao, Y., Zheng, Z. and Wen, H., “Bayesian statistical inference in machine learning anomaly detection”. 2010 International Conference on Communications and Intelligence Information Security (ICCIIS), pp. 113-116, (2010). Sperotto, A., Schaffrath, G., Sadre, R., Morariu, C., Pras, A. and Stiller, B., “An Overview of IP Flow-based Intrusion Detection”. IEEE Communications Surveys and Tutorials, 12(3), pp.343-356, (2010). Kanda, Y., Fontugne, R., Fukuda, K. and Sugawara, T., “ADMIRE: Anomaly detection method using entropy-based PCA with three-step sketches”. Computer Communications, 36(5), pp.575-588, (2013). View publication stats [16] [17] [18] [19] [20] [22] [23] [24] [25] [26] [27] Gupta, R.M., Jain, P.K., Amidon, K.E., Gong, F., Vissamsetti, S., Haeffele, S.M. and Raman, A., McAfee LLC, “Hierarchy-based method and apparatus for detecting attacks on a computer system”. U.S. Patent 7,234,168, (2007). Altaher, A., Ramadass, S. and Almomani, A., “Real time network anomaly detection using relative entropy”. IEEE High Capacity Optical Networks and Enabling Technologies (HONET), pp. 258-260, (2011). Mishra, A., Nadkarni, K. and Patcha, A., “Intrusion detection in wireless ad hoc networks”. IEEE wireless communications, 11(1), pp.48-60, (2004). Cho, E.J., Kim, J.H. and Hong, C.S., “Attack model and detection scheme for Botnet on 6LoWPAN”. In Asia-Pacific Network Operations and Management Symposium, pp. 515-518, (2009). Hemdan, E.E.D. and Manjaiah, D.H., “Cybercrimes Investigation and Intrusion Detection in Internet of Things Based on Data Science Methods”. In Cognitive Computing for Big Data Systems Over IoT, pp. 39-62, (2018). Kasinathan, P., Pastrone, C., Spirito, M. A., & Vinkovits, M. “Denialof-Service detection in 6LoWPAN based Internet of Things.” In IEEE 9th International Conference on Wireless and Mobile Computing, Networking and Communications, pp. 600-607, (2013). Farooqi, Ashfaq Hussain, and Farrukh Aslam Khan. "Intrusion detection systems for wireless sensor networks: A survey." Communication and networking, pp. 234-241, (2009). Oh, Doohwan, Deokho Kim, and Won Woo R, "A malicious pattern detection engine for embedded security systems in the Internet of Things." Sensors, pp, 24188-24211, (2014). Pongle, Pavan, and Gurunath Chavan. "A survey: Attacks on RPL and 6LoWPAN in IoT." IEEE International Conference on Pervasive Computing, (2015). Liu, A.C., “The effect of oversampling and undersampling on classifying imbalanced text datasets”. The University of Texas at Austin, (2004). Chawla, N.V., Bowyer, K.W., Hall, L.O. and Kegelmeyer, W.P., “SMOTE: synthetic minority over-sampling technique”. Journal of artificial intelligence research, pp.321-357, (2002). Lee, W. and Stolfo, S.J., “A framework for constructing features and models for intrusion detection systems”. ACM transactions on Information and system security (TiSSEC), 3(4), pp.227-261, (2000). Tavallaee, M., Bagheri, E., Lu, W. and Ghorbani, A.A., “A detailed analysis of the KDD CUP 99 data set”. IEEE Symposium on Computational Intelligence for Security and Defense Applications, pp. 1-6, (2009). Shiravi, A., Shiravi, H., Tavallaee, M. and Ghorbani, A.A., “Toward developing a systematic approach to generate benchmark datasets for intrusion detection”. Computers & Security, 31(3), pp.357-374, (2012). Moustafa, N. and Slay, J., “UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). IEEE Military Communications and Information Systems Conference (MilCIS), pp. 1-6, (2015). Moustafa, N. and Slay, J., “The evaluation of Network Anomaly Detection Systems: Statistical analysis of the UNSW-NB15 data set and the comparison with the KDD99 dataset”. Information Security Journal: A Global Perspective, 25(1-3), pp.18-31, (2016). Sharafaldin, I, Lashkari,A.H and Ghorbani, A.A, “Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization”, 4th International Conference on Information Systems Security and Privacy (ICISSP), Purtogal, (2018). Gharib, A., Sharafaldin, I., Lashkari, A.H. and Ghorbani, A.A., “An Evaluation Framework for Intrusion Detection Dataset”. 2016 IEEE International Conference Information Science and Security (ICISS), pp. 1-6, (2016). Gil, G.D., Lashkari, A.H., Mamun, M. and Ghorbani, A.A., “Characterization of encrypted and VPN traffic using time-related features. In Proceedings of the 2nd International Conference on Information Systems Security and Privacy, pp. 407-414, (2016).