Journal of Network and Computer Applications 215 (2023) 103637
DOI: 10.1016/j.jnca.2023.103637
Journal homepage: www.elsevier.com/locate/jnca

Review

Intelligent approaches toward intrusion detection systems for Industrial Internet of Things: A systematic comprehensive review

Mudhafar Nuaimi a, Lamia Chaari Fourati b,∗, Bassem Ben Hamed b

a National School of Electronics and Telecommunications of Sfax, Tunisia
b Digital Research Center of Sfax (CRNS); Laboratory of Signals, SysteMs, aRtificial Intelligence and neTworkS (SM@RTS), Sfax University, Tunisia

ARTICLE INFO
Keywords: Internet of Things (IoT); Industrial IoT (IIoT); Intrusion Detection System (IDS); Artificial Intelligence (AI); Industry 4.0

ABSTRACT
In recent years, we have seen the exponential growth of the Industrial Internet of Things (IIoT), which brings significant benefits to our daily lives, industry, and society. The common idea behind the IIoT is to connect digital devices and services with physical systems using sensors and actuators. The IIoT generates an enormous amount of data through resource-constrained devices (sensors).
However, due to the heterogeneity of these devices and their limited resources, they are exposed to a wide variety of threats that jeopardize the IIoT’s ability to provide seamless operations to enterprises. Therefore, there is an urgent need to develop efficient security approaches to combat these threats and protect IIoT systems. To this end, several intrusion detection systems (IDSs) have been enhanced in recent years. With the advent of intelligent approaches (IAs), most IDSs are now based on Machine Learning (ML) or Deep Learning (DL) approaches, and institutions are incorporating accurate IA techniques in their real-world applications. Therefore, we survey recent efforts in the literature related to intrusion detection in IIoT, focusing on ML algorithms. We divide them into three categories: Agent Placement Strategy, Detection Method, and Security Problem. We pose a number of open questions and suggest some research directions.

1. Introduction

Because of the massive usage of the Internet in recent years, network security has become a real necessity. As access to information increases, various security threats arise, from viruses to network intrusions. These security threats not only hurt companies financially and damage their reputation, but also lead to the theft of users’ confidential data. In this context, companies are making efforts to improve network security by using IAs as intrusion detection tools (Liu et al., 2020; Su et al., 2020; Magán-Carrión et al., 2020). The introduction of the IPv6 protocol enabled the Internet of Things (IoT), which allows various devices such as wearables, cognitive buildings, smart blenders, microwaves, and clothing, as well as industrial equipment, to freely access the Internet (Smys, 2020).
The IIoT, which connects digital devices and services with physical systems, integrates these smart IoT devices with Supervisory Control and Data Acquisition (SCADA) systems, Distributed Control Systems (DCS), and other industrial components to improve productivity. The IIoT promises many benefits in various industries, such as automotive (Abdel-Basset and Imran, 2020), transportation (Chavhan et al., 2021), and healthcare (Darwish et al., 2020). More specifically, IIoT aims to create improved services and goods in several industrial fields. In this context, Piccialli et al. (2021) examined in depth how and where the benefits of the IIoT will lead to organizational change in many industries and what new business models will be created by IIoT solutions. As mentioned earlier, the IIoT consists of a number of components that bridge the gap between the physical and virtual worlds. Consequently, a number of disruptive activities occur in IIoT at the junction between information technology (IT) and operational technology (OT), which makes network vulnerability a major challenge in IIoT networks. According to Muna et al. (2018), cyber threats in IIoT will cost up to $90 trillion by 2030 if efficient IDS tools are not found in the near future. The most dangerous risk in IIoT is often malware, where perpetrators use Denial of Service (DoS), Distributed DoS (DDoS), and Progressive Determined Risk (PDR) attacks to infect vulnerable computers. In past years, Ukraine experienced a cyberattack dubbed BlackEnergy (BE), in which Ukrainian power utilities suffered unplanned blackouts affecting more than 800,000 customers (Kushner, 2013). A Czech hospital shut down its network due to a cyberattack in March 2020, which greatly impacted the diagnosis of COVID-19 and patient care (Stevenson, 2020). Recently, a ransomware attack was launched on a US fuel pipeline, leading to the shutdown of a critical fuel network (Ding et al., 2020).
These events show that current methods of dealing with cyber threats do not work well when it comes to protecting industrial systems. Therefore, finding more effective IDS methods is a crucial task. To improve network security in IIoT, the research community is working to find reliable and robust intrusion detection tools for IIoT scenarios (Elrawy et al., 2018; Tsiknas et al., 2021). Several novel approaches in the literature aim at detecting intrusions into IIoT networks. Specifically, Alruwaili (2021) recently showed that intrusion detection is fundamental in IIoT. The article presented various protocols, algorithms, and mechanisms for intrusion detection in IIoT systems and suggested that the combination of ML and blockchain technology can efficiently combat security threats in IIoT. Butun et al. (2020) provided an in-depth study on how the distributed and parallel implementation of streaming applications in industrial environments can help detect intrusions in the IIoT. They argued that future IIoT deployments should consider the concept of streaming data when developing their IDS. da Costa et al. (2019) showed the importance of using IAs in IDSs for IoT use cases.

∗ Corresponding author.
E-mail addresses: mudhafar.fadhil@utq.edu.iq (M. Nuaimi), lamiachaari1@gmail.com (L.C. Fourati).
https://doi.org/10.1016/j.jnca.2023.103637
Received 8 November 2022; Received in revised form 27 February 2023; Accepted 2 April 2023; Available online 17 April 2023
1084-8045/© 2023 Elsevier Ltd. All rights reserved.

Current mature IDSs have been developed without considering the requirements of IIoT networks. Unlike traditional networks, IIoT networks have some special requirements that make it difficult to deploy current mature IDSs directly in them.
First, in traditional networks, nodes with high computational resources host the IDS agents. In IIoT, however, such nodes are hard to find because most devices deployed in these networks have limited resources. Therefore, finding a strategy for placing IDS agents is crucial for efficient intrusion detection in IIoT. Second, the architecture of IIoT networks makes them challenging for IDSs. In traditional networks, end nodes are usually connected to an access point or gateway that forwards data to its destination. In IIoT networks, however, multi-hop communication is used in many cases, where data is transmitted to the destination through multiple intermediate stations. In other words, several devices act as relay nodes, working as both forwarding and terminating devices. Finally, several protocols are used in IIoT, such as the Constrained Application Protocol (CoAP), IPv6 over Low-power Wireless Personal Area Network (6LoWPAN), LoRaWAN (Lalle et al., 2021), and the Routing Protocol for Low-Power and Lossy Networks (RPL). These new protocols bring new vulnerabilities and thus new requirements for IDSs. Considering these requirements, this work aims to compile recent efforts in the literature on intrusion detection in IIoT.
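The first two requirements can be made concrete with a toy model. The sketch below is an illustration only and is not taken from any of the surveyed works: the topology in `LINKS`, the `CPU_MHZ` figures, the 100 MHz threshold, and both function names are hypothetical. It routes each sensor to the gateway over a multi-hop path and then lists the nodes that both relay traffic and have enough resources to host an IDS agent.

```python
from collections import deque

# Hypothetical multi-hop topology: sensors reach the gateway through relays.
LINKS = {
    "s1": ["r1"],
    "s2": ["r1"],
    "s3": ["r2"],
    "r1": ["s1", "s2", "r2"],
    "r2": ["s3", "r1", "gw"],
    "gw": ["r2"],
}
# Hypothetical per-node compute budget in MHz.
CPU_MHZ = {"s1": 16, "s2": 16, "s3": 16, "r1": 120, "r2": 80, "gw": 1200}


def shortest_path(links, src, dst):
    """Breadth-first search; returns the hop-by-hop path or None."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in links[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


def candidate_agent_hosts(links, cpu, sources, gateway, min_cpu):
    """Rank nodes that carry traffic and meet the CPU threshold."""
    load = {}
    for src in sources:
        path = shortest_path(links, src, gateway)
        if path is None:
            continue
        for node in path[1:]:  # skip the originating sensor itself
            load[node] = load.get(node, 0) + 1
    hosts = [n for n in load if cpu[n] >= min_cpu]
    # Nodes that see the most flows come first; ties are broken by name.
    return sorted(hosts, key=lambda n: (-load[n], n))


hosts = candidate_agent_hosts(LINKS, CPU_MHZ, ["s1", "s2", "s3"], "gw", 100)
print(hosts)
```

In this toy network, relay `r2` sees every flow but falls below the assumed threshold, so it is excluded even though it would be the most informative vantage point. This is the kind of trade-off that makes agent placement a design problem of its own in IIoT.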
1.1. Motivation and contributions

Recently, many IDSs have been proposed that use attack rules/signatures or specifications of regular behavior to combat cyberattacks in IIoT. However, these methods have two main drawbacks: (1) they produce a high rate of false positives and false negatives in attack recognition and (2) they cannot detect zero-day attacks. Current efforts are therefore focused on ML, with an emphasis on Deep Learning (DL), to enhance the security of IIoT networks (Fadlullah et al., 2017). Indeed, ML and DL approaches have demonstrated their efficiency in complex problems such as object classification, natural language processing, and fraud detection. This motivates researchers to explore and propose learning-based approaches for intrusion detection. In addition, companies are testing ML/DL-based intrusion detection methods for use in practice. However, we have not found any research work that provides a detailed overview of the application of ML/DL in intrusion detection in IIoT. Therefore, this paper reviews recent learning-based approaches for intrusion detection in IIoT.

In this paper, we present the progress of research on security issues in IIoT networks, focusing on ML/DL approaches. Our survey targets three important areas: (1) IIoT, (2) IDS, and (3) ML/DL. This article aims to extend the literature by providing information on the novel learning-based approaches that have been proposed to address security issues in IIoT. Moreover, this work aims to be a new resource for researchers interested in assessing and mitigating vulnerabilities in IIoT networks. In addition, this work identifies some challenges and proposes some research directions.

1.2. List of acronyms

Table 1 presents the list of acronyms used in this paper.

Table 1
Nomenclature.
Abbreviation: Meaning
IoT: Internet of Things
IIoT: Industrial Internet of Things
ML: Machine Learning
DL: Deep Learning
IDS: Intrusion Detection System
IAs: Intelligent Approaches
AI: Artificial Intelligence
IPv6: Internet Protocol Version 6
DCS: Distributed Control Systems
SCADA: Supervisory Control and Data Acquisition
SLR: Systematic Literature Review
IT: Information Technology
OT: Operational Technology
PDR: Progressive Determined Risk
DoS: Denial of Service
DDoS: Distributed Denial of Service
ICS: Industrial Control System
BE: Black-Energy
RPL: Routing Protocol for Low-Power and Lossy Networks
6LoWPAN: IPv6 over Low-power Wireless Personal Area Network
CPS: Cyber–Physical System
PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses
IEEE: Institute of Electrical and Electronics Engineers
MDPI: Multidisciplinary Digital Publishing Institute
RQ: Research Question
AGV: Automated Guided Vehicle
NB-IoT: Narrowband Internet of Things
SQL: Structured Query Language
BOA: Buffer Overflow Attack
DTW: Dynamic Time Warping
RNN: Recurrent Neural Network
LSTM: Long Short-Term Memory
RFID: Radio-Frequency Identification
MAC: Media Access Control
DTA: Directory Traversal Attack
SVM: Support Vector Machine
KNN: k-Nearest Neighbour
DT: Decision Tree
NN: Neural Network
RF: Random Forest
AR: Association Rule
PCA: Principal Component Analysis
EMSs: Energy Management Systems
CeWL: Custom Word List generator
NS: Not Specified
AB: Anomaly Based
SB: Signature Based
RL: Reinforcement Learning
PDS: Post-Decision State
ACCS: Australian Centre for Cyber Security
HTTP: Hypertext Transfer Protocol
HTTPS: Hypertext Transfer Protocol Secure
FTP: File Transfer Protocol
SSH: Secure Shell
CIC: Canadian Institute for Cyber-security
TP: True Positive
TN: True Negative
FP: False Positive
FN: False Negative
ROC: Receiver Operating Characteristic
AUROC: Area Under ROC
AUPR: Area Under Precision-Recall Curve
GA: Genetic Algorithm
LR: Linear Regression
NB: Naïve Bayes
ET: Extra-Trees
XGB: Extreme Gradient Boosting
PSO: Particle Swarm Optimization
LightGBM: Light Gradient Boosting Machine
VLSTM: Variant of Long Short-Term Memory
ANN: Auto-encoder Neural Network
FAR: False Alarm Rate
I-ELM: Incremental Extreme Learning Machine
A-PCA: Adaptive Principal Component Analysis
GWO: Grey Wolf Optimizer
TS: Tabu Search
IG: Information Gain
DBNs: Deep Belief Networks
CDBN: Conditional Deep Belief Network
FDI: False Data Injection
TE: Tennessee Eastman
OCSVM: One Class-Support Vector Machine
HEDV: Hybrid Environment for Design and Validation
DCT: Discrete Cosine Transform
SVD: Singular Value Decomposition
CSS: Cross-site Scripting
OICS-VFSL: Optimized Intra/Inter-Class Structure-based Variational Few-Shot Learning
DR: Detection Rate
CNN: Convolutional Neural Network
SET: Sparse Evolutionary Training
RTU: Remote Telemetry Unit
FFDNN: Feed-Forward Deep Neural Network
UAVs: Unmanned Aerial Vehicles
CLS: CertificateLess signature Scheme
FS: Feature Selection
DRaNN: Deep Random Neural Network

1.3. Paper organization

The rest of this article consists of 10 other sections. Section 2 presents the search map used during this work. Section 3 highlights the related works that provided systematic literature reviews (SLRs) or surveys and investigations regarding the use of intelligent approaches for intrusion detection within IIoT systems. A brief background on different IIoT systems and architectures is provided in Section 4. Section 5 briefly explains intrusion detection methods and IIoT security issues and discusses the different attacks that can occur. Section 6 presents a comprehensive overview of deep learning (DL) and machine learning (ML) approaches. Section 7 presents the background of IDSs based on ML/DL. Section 8 provides a deep investigation of the use of intelligent approaches to detect intrusions in IIoT systems. Section 9 provides a comparative study and an in-depth discussion of the reviewed proposals. Open problems, obstacles, and future aspirations are highlighted in Section 10. Finally, Section 11 concludes our study.

2. Systematic literature review methodology

A systematic literature review (SLR) is a review methodology that aggregates all relevant existing studies related to a specific topic. This section presents a methodology anchored on a systematic literature technique to address specific research questions. The methodology focuses on analyzing recent works related to machine-learning-based IDS approaches for the IIoT. The SLR method starts with formulating clear research questions, denoted RQi (i = 1, 2, ..., n), together with inclusion and exclusion criteria to retrieve and select the relevant scientific papers that will be studied in depth. In this work, we have established our SLR according to PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses). PRISMA is a minimum evidence-based set of items developed to assist authors in reporting systematic reviews and meta-analyses. In this work, the PRISMA methodology is used to assess the benefits of IA-based approaches to detect intrusions within IIoT-based systems. The PRISMA statement involves a 27-item checklist and a four-step flowchart (see Fig. 1). PRISMA is widely used in existing systematic literature reviews in the context of IIoT, security, and the Internet of Things, so we have chosen this technique over others. In this context, Booth et al. (2021) pinpoint the benefits of the PRISMA methodology compared to other systematic approaches. We choose PRISMA in this research because it enables reviewers and readers to be aware of what we did and found. Additionally, PRISMA not only optimizes the quality of reporting but also makes the peer review process more efficient. In general, an SLR can be summarized via the following phases:

1. Posing research questions.
2. Locating studies: search for adequate databases.
3. Critical evaluation of the studies (inclusion and exclusion criteria).
4. Data collection: mapping the selected papers to the research questions.
5. Reporting and analyzing the review.
6. Interpreting the findings.

The search string is determined after specifying essential keywords and related alternatives, which are: ("Industrial Internet of Things" OR "IIoT" OR "Industry 4.0") AND ("IDS" OR "Intrusion Detection Systems") AND ("Vulnerabilities" OR "Threats" OR "Attacks" OR "Security") AND ("Artificial Intelligence" OR "AI" OR "Machine learning" OR "ML" OR "Deep learning" OR "DL").

Fig. 1. Main phases of the PRISMA methodology.

2.1. Research questions

According to our paper’s goals, clear research questions (RQi) are defined to be answered in the remaining sections of the survey. The defined research questions are as follows:

(RQ1): What are the main security concerns and IIoT vulnerabilities?
(RQ2): What are the main attacks in IIoT?
(RQ3): What are the main methods and strategies to detect intrusions within IIoT-based systems?
(RQ4): What are the main artificial intelligence approaches proposed in the literature to detect intrusions within IIoT-based systems?
(RQ5): What are the open perspectives and future research directions for securing IIoT-based systems?

After filtering by the research questions, we obtained different research papers related to security issues in IIoT, based on keywords and similar alternatives, as shown in Table 3.

2.2. Papers selection

Our database is composed of research papers published between 2017 and 2022 in Science Direct, IEEE Xplore, Springer, MDPI Publisher of Open Access Journals, SCOPUS, ACM, Wiley, Web of Science, Inderscience, and Hindawi Publishing Corporation. Our selection thus covers the most recent papers and provides comprehensive information about the existing state of the art regarding our research topic. The selection methodology is illustrated via two figures. Fig. 2 highlights the selected papers for each year, and Fig. 3 pinpoints the sources of the included papers within the domain of this study. The articles covered in the study are listed by publication year in Table 2.

Fig. 2. Distribution of selected papers over the years (2017–2022).

Fig. 3. Databases for selected papers.

Table 2
Distribution of selected papers.
Year: Related papers
2017: Fadlullah et al. (2017), Rubio et al. (2017a,b), Abdelhafidh et al. (2017), Bertino and Islam (2017), Han et al. (2017), Zhao et al. (2017), Lashkari et al. (2017), Ullah and Mahmoud (2017), He et al. (2017), Potluri et al. (2017), Siddavatam et al. (2017), Stewart et al. (2017), Chen and Ng (2017) and McMahan et al. (2017)
2018: Al-Jaroodi et al. (2018), Muna et al. (2018), Elrawy et al. (2018), Boyes et al. (2018), Esposito et al. (2018), Doshi et al. (2018), Sharafaldin et al. (2018b,a), Zong et al. (2018), Alves et al. (2018), Kalash et al. (2018), Dong et al. (2018), Zolanvari et al. (2018), Zhang et al. (2018), Aljawarneh et al. (2018), Panchal et al. (2018), Zhou et al. (2018), Maglaras (2018) and Lin et al. (2018)
2019: Al-Hawawreh and Sitnikova (2019), da Costa et al. (2019), Hajiheidari et al. (2019), Fahim and Sillitti (2019), Yao et al. (2019), Tange et al. (2019), Sassi et al. (2019), Lalle et al. (2019), He et al. (2019), Bala and Nagpal (2019), Hasan et al. (2019), Gao et al. (2019), Hanif et al. (2019), Ketzaki et al. (2019), Zolanvari et al. (2019), Al-Hawawreh et al. (2019), Wang et al. (2019), Singh et al. (2019), Rezaeibagha et al. (2019), Koroniotis et al. (2019), Albettar (2019), Bhatia et al. (2019) and Mullen and Meany (2019)
2020: Liu et al. (2020), Su et al. (2020), Magán-Carrión et al. (2020), Smys (2020), Abdel-Basset and Imran (2020), Darwish et al. (2020), Butun et al. (2020), Alsoufi et al. (2020), Sengupta et al. (2020), Bekri et al. (2020a,b), Lalle et al. (2020), Xu et al. (2020), Borgiani et al. (2020), Tajalli et al. (2020), Alsaedi et al. (2020), Li et al. (2020), Zhou et al. (2020), Almomani (2020), Latif et al. (2020), Abdel-Basset et al. (2020), Hassan et al. (2020), Kasongo and Sun (2020), Qiao et al. (2020), Aldawood and Skinner (2020), Ding et al. (2020), ElMamy et al. (2020), Stevenson (2020) and Zhang et al. (2020)
2021: Chavhan et al. (2021), Piccialli et al. (2021), Tsiknas et al. (2021), Alruwaili (2021), Lalle et al. (2021), Jayalaxmi et al. (2021), Pal and Jadidi (2021), Dwivedi et al. (2021), Abosata et al. (2021), Kasongo (2021), Liu et al. (2021), Nazir and Khan (2021), Wang et al. (2021), Awotunde et al. (2021), Mendonça et al. (2021), Raja et al. (2021), Sarhan et al. (2021a), Liang et al. (2021), Sarhan et al. (2021b), Ferretti et al. (2021), Zolanvari (2021), Khan et al. (2021), Mohammadi et al. (2021), Singh and Saini (2021), Zolanvari et al. (2021) and Booth et al. (2021)
2022: Zhang et al. (2022), Tang et al. (2022), Shrivastava et al. (2022), Shahin et al. (2022), Lalle et al. (2021), Hasan et al. (2022), Ferrag et al. (2022), Capuano et al. (2022), Alani et al. (2022) and Al-Hawawreh et al. (2022)

Table 3
Databases for selected papers.
Research question: Related papers
RQ1: Abdel-Basset and Imran (2020), Abdelhafidh et al. (2017), Abosata et al. (2021), Alruwaili (2021), Alsaedi et al. (2020), Awotunde et al. (2021), Borgiani et al. (2020), Boyes et al. (2018), Butun et al. (2020), Chavhan et al.
(2021), Darwish et al. (2020), Dong et al. (2018), Dwivedi et al. (2021), Esposito et al. (2018), Jayalaxmi et al. (2021), Kasongo (2021), Mendonça et al. (2021), Pal and Jadidi (2021), Piccialli et al. (2021), Qiao et al. (2020), Raja et al. (2021), Rubio et al. (2017a,b), Sarhan et al. (2021b), Sengupta et al. (2020), Tajalli et al. (2020), Tange et al. (2019), Tsiknas et al. (2021), Xu et al. (2020), Yao et al. (2019), Zolanvari et al. (2019, 2018), Panchal et al. (2018), Zolanvari (2021), Al-Hawawreh et al. (2022) and Sgandurra et al. (2016)
RQ2: Abdel-Basset et al. (2020), Abosata et al. (2021), Alsaedi et al. (2020), Butun et al. (2020), Jayalaxmi et al. (2021), Li et al. (2020), Mendonça et al. (2021), Pal and Jadidi (2021), Qiao et al. (2020), Raja et al. (2021), Rubio et al. (2017a,b), Sarhan et al. (2021b), Al-Hawawreh and Sitnikova (2019), Sengupta et al. (2020), Tajalli et al. (2020), Tsiknas et al. (2021), Wang et al. (2019), Xu et al. (2020), Zolanvari et al. (2019, 2018), Panchal et al. (2018), Singh et al. (2019), Kasongo (2021), Liu et al. (2021), Zhou et al. (2020), Gao et al. (2019), Zong et al. (2018), Ullah and Mahmoud (2017), Alves et al. (2018), Potluri et al. (2017), Wang et al. (2021), Awotunde et al. (2021), Kasongo and Sun (2020), Zolanvari (2021) and Al-Hawawreh et al. (2022)
RQ3: Abdel-Basset et al. (2020), Abosata et al. (2021), Al-Hawawreh et al. (2019), Alruwaili (2021), Alsaedi et al. (2020), Awotunde et al. (2021), Butun et al. (2020), Dong et al. (2018), Fahim and Sillitti (2019), Jayalaxmi et al. (2021), Kasongo (2021), Latif et al. (2020), Li et al. (2020), Liang et al. (2021), Mendonça et al. (2021), Muna et al. (2018), Qiao et al. (2020), Raja et al. (2021), Rubio et al. (2017a,b), Sarhan et al. (2021b), Yao et al. (2019), Zolanvari et al. (2019, 2018), Liu et al. (2021), Zhou et al. (2020), Gao et al. (2019), Hanif et al. (2019), Ketzaki et al. (2019), Almomani (2020), Nazir and Khan (2021), Zong et al.
(2018), Ullah and Mahmoud (2017), Alves et al. (2018), He et al. (2017), Potluri et al. (2017), Keliris et al. (2016), Eigner et al. (2016), Siddavatam et al. (2017), Maglaras (2018), Mantere et al. (2012), Wang et al. (2021), Stewart et al. (2017), Zhang et al. (2018), Aljawarneh et al. (2018), Kalash et al. (2018), Hassan et al. (2020), Kasongo and Sun (2020), Zolanvari (2021) and Al-Hawawreh et al. (2022)
RQ4: Abdel-Basset et al. (2020), Al-Hawawreh et al. (2019), Alruwaili (2021), Alsaedi et al. (2020), Awotunde et al. (2021), Butun et al. (2020), Fahim and Sillitti (2019), He et al. (2019), Jayalaxmi et al. (2021), Kasongo (2021), Latif et al. (2020), Li et al. (2020), Liang et al. (2021), Mendonça et al. (2021), Muna et al. (2018), Qiao et al. (2020), Raja et al. (2021), Rubio et al. (2017a), Sarhan et al. (2021b), Yao et al. (2019), Zolanvari et al. (2019, 2018), Liu et al. (2021), Zhou et al. (2020), Gao et al. (2019), Hanif et al. (2019), Ketzaki et al. (2019), Almomani (2020), Nazir and Khan (2021), Zong et al. (2018), Ullah and Mahmoud (2017), Alves et al. (2018), He et al. (2017), Potluri et al. (2017), Keliris et al. (2016), Eigner et al. (2016), Siddavatam et al. (2017), Maglaras (2018), Wang et al. (2021), Stewart et al. (2017), Dong et al. (2018), Zhang et al. (2018), Aljawarneh et al. (2018), Kalash et al. (2018), Hassan et al. (2020), Kasongo and Sun (2020), Zolanvari (2021) and Al-Hawawreh et al. (2022)
RQ5: Abdel-Basset et al. (2020), Abosata et al. (2021), Al-Hawawreh et al. (2019), Alruwaili (2021), Alsaedi et al. (2020), Awotunde et al. (2021), Boyes et al. (2018), Esposito et al. (2018), Fahim and Sillitti (2019), Gao et al. (2019), Jayalaxmi et al. (2021), Kasongo (2021), Li et al. (2020), Liang et al. (2021), Mendonça et al. (2021), Muna et al. (2018), Pal and Jadidi (2021), Qiao et al. (2020), Raja et al. (2021), Rubio et al. (2017a,b), Sarhan et al. (2021b), Sengupta et al. (2020), Tange et al. (2019), Tsiknas et al. (2021), Yao et al.
(2019), Zolanvari et al. (2019, 2018), Panchal et al. (2018), Singh et al. (2019), Rezaeibagha et al. (2019), Zolanvari (2021) and Al-Hawawreh et al. (2022)

3. Related works

In the literature, we found some research papers providing an overview of IDS in IoT and/or IIoT networks. However, most of them focused on IoT, while few works are dedicated specifically to IIoT. Of the works that provide an overview of IIoT, none focuses exclusively on ML/DL. Accordingly, in this paper, we aim to fill this gap by providing a comprehensive overview of IDS in IIoT networks. In the following, we introduce the existing related works.

Hajiheidari et al. (2019) presented a comprehensive survey of IDS in IoT environments. Their work focused on four types of IDS, namely signature-based, anomaly-based, specification-based, and hybrid-based. Nonetheless, the work addresses IDS approaches for IoT in general and is not specific to ML/DL. In contrast, our review focuses on IDS algorithms based on IAs. Fahim and Sillitti (2019) provide a systematic literature review on techniques for analysis, prediction, and anomaly detection in IoT scenarios. However, this work is not specific to IAs, nor is it specific to IDSs. Recently, Alsoufi and Razak (Alsoufi et al., 2020) conducted a comprehensive study of anomaly intrusion detection techniques in IoT scenarios using DL. However, the work is not specific to IDS and does not consider ML approaches. da Costa et al. (2019) provided an overview of ML-based approaches for intrusion detection in IoT environments. They discussed current ML-based IDSs and various public datasets used for training. However, novel DL-based approaches for intrusion detection in IoT and IIoT were not considered.
Recently, Alruwaili (2021) provided a brief overview of protocols, algorithms, and mechanisms that have been proposed to secure the IIoT network. The author also compared different approaches used to prevent, detect, and protect IIoT systems from various threats, vulnerabilities, and attacks. However, the paper is very short and does not include relevant recent work targeting IDS in IIoT. In addition, the paper is not specific to ML/DL. Jayalaxmi et al. (2021) analyzed the security vulnerabilities and attacks at different levels of IIoT. The authors examined some IDSs proposed for the IIoT. They also presented different frameworks that address various security requirements for smart factory systems. Some ML and DL algorithms were examined. Pal and Jadidi (2021) analyzed the security issues in IIoT networks. They identified security issues from technological, architectural, and logical points of view and discussed various IIoT architectures. They studied multiple attacks and threats that occur at each layer of an IIoT system. However, they did not examine the various approaches proposed in the literature to deal with these threats and attacks. Rubio et al. (2017a) examined current techniques that attempt to identify intrusions into IIoT systems. First, the authors examined various cybersecurity threats in IIoT systems. Then, they examined various defense techniques against these threats. However, the authors focused on the classification of these techniques and did not address the recent ML/DL approaches that have been proposed to address security threats in IIoT. Rubio et al. (2017b) analyzed cybersecurity threats in the IIoT and presented the requirements that an IDS must meet to be effective against attacks and threats in the IIoT. Tsiknas et al. (2021) focused on describing various attacks on IIoT systems and analyzed in detail possible solutions against these threats.
However, as with previous work, the authors focused on analyzing the security threats without paying more attention to the ML/DL approaches proposed in the literature to address these attacks and threats. Yao et al. (2019) briefly reviewed the traditional ML and DL proposed for intrusion detection in IIoT. They proposed a hybrid IDS system and presented an ML-supported detection method. Boyes et al. (2018) first presented the relationship between Cyber–Physical System (CPS), Industry 4.0, and IIoT in their paper. Then, the authors developed an analysis framework that helped analyze security vulnerabilities. However, they focused on analyzing the integration of IoT, IIoT, and blockchain to address the security challenges without paying attention to IDS. Sengupta et al. (2020) provided a survey of security issues and attacks in IoT/IIoT. They then explored the integration of blockchain to address security challenges in IoT/IIoT. Similar work is presented by Dwivedi et al. (2021). Tange et al. (2019) identified the security needs of IIoT that can subsequently be used in developing security approaches. They proposed a useful research methodology that can be used for a Systematic Literature Review (SLR) in IIoT. Among the above reviews that examined the IIoT domain, we did not find any work that specifically provided an SLR on IIoT/IoT security threats using ML/DL with the help of PRISMA methodology. As we know, our work is the first to prepare a survey focusing on ML/DL in the context of IDS. In Table 4, we summarize the main features of our paper that differ from existing survey papers. Ref/year SLR IIoT/IoT ML/DL PRISMA Hajiheidari et al. (2019) Fahim and Sillitti (2019) Alsoufi et al. (2020) da Costa et al. (2019) Alruwaili (2021) Jayalaxmi et al. (2021) Pal and Jadidi (2021) Rubio et al. (2017a) Rubio et al. (2017b) Tsiknas et al. (2021) Yao et al. (2019) Boyes et al. (2018) Sengupta et al. 
(2020) Ours ✓ ✓ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✗ ✗ ✗ ✗ ✓ ✗ ✗ ✓ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✓ ✓

4. Industrial Internet of Things (IIoT) Architecture

Several IoT (Bekri et al., 2020a,b; Sassi et al., 2019) and IIoT (Abdelhafidh et al., 2017) architectures have been proposed in the literature. More specifically, Abdelhafidh et al. (2017) proposed an IIoT architecture for fluid distribution systems. Moreover, in Lin et al. (2018), the authors proposed a hierarchical blockchain-based framework that consists of four tangible layers for the “Industry 4.0” era. Their framework was proposed to provide privacy and security guarantees such as anonymous authentication, auditability, and confidentiality in IIoT. When assessing their proposed framework, they obtained promising results. In ElMamy et al. (2020), a survey on the use of blockchain technology to mitigate cyber-threats in Industry 4.0 was provided. The authors investigated the most important cyber-attacks that occurred in the last decade in Industry 4.0 and highlighted the benefits of combining blockchain with Industry 4.0. Al-Jaroodi et al. (2018) proposed Man4Ware, a Service-Oriented Middleware Framework for Manufacturing Industry 4.0. The Man4Ware services are composed of three layers, namely, the multiple manufacturing cyber–physical systems (CPS) layer, the cloud manufacturing layer, and the fog manufacturing nodes layer. In this work, we consider a four-layer architecture, as illustrated in Fig. 4. The four layers are the perception layer, the network layer, the data processing layer, and the application layer. In a five-layer architecture, however, an Edge/Fog layer for data preprocessing is added between the perception layer and the network layer.

• The Perception layer: It is composed of smart IoT objects such as sensors, devices, and actuators that sense the surrounding environment.
These devices are accompanied by equipment such as Automated Guided Vehicles (AGVs), conveyor systems, and industrial robots. They collect data and send it through communication channels such as LoRaWAN, NB-IoT, SigFox, Zigbee, Bluetooth, etc. (Lalle et al., 2019, 2020, 2021).
• The Network layer: The most important layer of IIoT is the network layer. All data gathered by smart objects are sent through communication protocols. The network layer works as a transmission medium for data using different communication technologies such as the Internet, cellular networks, LoRaWAN, SigFox, NB-IoT, IEEE 802.11ah, IEEE 802.15.4e, Z-Wave, LTE-M, EC-GSM, and others. In general, this layer can be subdivided into two sub-layers: (1) a short-range communication sub-layer providing connectivity between sensor nodes or between a sensor node and a data-aggregating node; and (2) a long-range communication sub-layer providing connectivity between the aggregating node and the processing node (in general, the cloud layer or the remote servers), as shown in Fig. 14.
• The Data processing layer: This layer includes local clouds and servers for data storage and processing. It acts as middleware between the application layer and the network layer. The big data generated by the perception layer is stored and processed at this layer. ML and DL techniques can be used at this layer to analyze the collected data for better understanding and better decision-making.
• The Application layer: It is the last layer of IIoT systems, and it aims to provide services to end users through web and mobile applications or other software. Examples of such applications are Smart Grid, Amazon’s Reinventing Warehousing, Smart Factory, Supply Chain, and Smart Robotics.

Fig. 4. Architecture of IIoT.

users or manipulate data stored in the database.
This attack can have a severe impact on the IIoT system, as the main goal of SCADA systems is to collect and store information (Zolanvari et al., 2019).

5.1.2. Data tampering
Here, the intruder deliberately manipulates legitimate users’ data to compromise their privacy. The main targets of data tampering are devices that carry crucial information, such as the location and billing price of an IIoT system (Shrivastava et al., 2022).

5.1.3. Improper input validation attack
Improper input validation is caused by a lack of appropriate rules for validating the user’s input. In this type of attack, an intruder enters wrong input values that can make the system insecure (Zolanvari et al., 2019).

It is worth mentioning that most IIoT systems are based on Industry 4.0. More specifically, Industry 4.0 describes the growing trend toward automation and data exchange in technology and processes within the manufacturing industry, including IIoT, IoT, cloud computing, smart manufacturing, smart factories, cyber–physical systems (CPS), etc. In Industry 4.0, wireless connectivity and smart sensors are added to the manufacturing machines in order to monitor and visualize an entire production process and make autonomous decisions.

5.1.4. IIoT-integrity attacks and mitigation: Related works
Authors in Abosata et al. (2021) discussed the integrity of industrial IoT systems, classified attacks according to the layers of the IIoT architecture, and provided a critical analysis of the existing IoT/IIoT approaches. Authors in Xu et al. (2020) considered the evolution of the strategies that an attacker uses to launch data integrity attacks against an IIoT system and of the strategies that a defender relies on to detect such attacks.
In addition, a realistic IIoT testbed implementing multiple integrity attacks under different levels of communication noise and defensive strategies, and evaluating their interactions, was provided in Xu et al. (2020). According to the authors, the experimental results demonstrate that 1NN-DTW (1NN-dynamic time warping) has strong classification accuracy under weak communication noise, whereas RNN-LSTM (recurrent neural networks (RNNs) with long short-term memory (LSTM) units) is better under heavy communication noise scenarios. In Esposito et al. (2018), the authors proposed a real-time and energy-aware solution ensuring message integrity and authentication, based on the use of a group-signature scheme within the context of IIoT publish/subscribe services. The limited power resources of IIoT devices make them vulnerable to attacks and make it difficult to implement robust security solutions for such devices. To overcome this issue, the authors in He et al. (2019) suggested a blockchain-based software status monitoring system, named BoSMoS, to monitor the software status of IIoT devices and detect malicious behaviors. To ensure software integrity, a snapshot of the trusted software status is stored in a blockchain.

5. Prevalent attacks in IIoT
Typically, the security aspects are divided into integrity, availability, confidentiality, authentication, and authorization. Attacks are therefore divided into five categories based on which security aspect is compromised. In this section, we discuss the different attacks that can occur for each security aspect. We summarize the most prevalent attacks encountered in IIoT in Fig. 5.

5.1. Integrity
This subsection briefly describes attacks related to the integrity aspect.

5.1.1. Code injection attack
In this attack, the attacker attempts to inject malicious data or malicious commands into the system.
For example, in IIoT, an intruder may attempt to control or compromise the database server by sending it malicious Structured Query Language (SQL) queries. In this way, the attacker can access sensitive information of

Fig. 5. The most encountered attacks in IIoT.

constantly busy (Han et al., 2017). This attack negatively affects the devices by depleting their resources such as energy, bandwidth, and memory.

5.2. Availability
In this subsection, attacks related to the availability aspect are briefly described.

5.2.4. IIoT-availability attacks and mitigation: Related works
Authors in Borgiani et al. (2020) proposed a distributed mechanism to detect and alleviate DoS attacks within IIoT systems by muting malicious nodes. The proposed approach, named distributed congestion control by duty-cycle restriction (D-ConCReCT), looks for abnormal traffic patterns in each node by monitoring its child nodes. When an attack is detected, it is mitigated by sending a message to the malicious node to reduce its traffic. According to the contents of this message, the malicious node adjusts its own duty cycle to reduce traffic or is completely muted. In addition, authors in Tajalli et al. (2020) proposed a scheduling framework for smart microgrids in the IIoT environment that deploys an average consensus-based algorithm providing mitigation against DoS attacks.

5.2.1. Denial of Service (DoS) attack
DoS attacks obstruct the services of the system by sending a vast number of random packets at high speed to the target IoT device (Shahin et al., 2022). The target device is then constantly busy and therefore unavailable to legitimate users. Because the attacked IIoT device is kept always on while being powered by a small battery, its lifetime is shortened.
A distributed DoS (DDoS) attack, which is a special kind of DoS attack, occurs when attacks are launched from many IP addresses to generate diverse requests and keep the server constantly busy. It therefore becomes difficult to distinguish normal traffic from abnormal traffic. In September 2016, the Mirai botnet was responsible for launching devastating DDoS attacks that compromised thousands of IoT devices (Bertino and Islam, 2017).

5.2.2. Buffer Overflow Attack (BOA)
In BOA, the intruder tries to write more data into a buffer than its specified size allows, which overwrites other buffers, alters their values, and adds extra bits. This type of attack usually happens when size validation or input-type checking is poor, and it can make the system crash (Mullen and Meany, 2019).

5.2.3. Jamming attacks
Here, the intruder disrupts ongoing communications on a wireless network by sending unwanted data packets or signals to IIoT nodes, causing problems for legitimate users as the network is

5.3. Confidentiality
In this subsection, attacks related to the confidentiality aspect are briefly described.

5.3.1. Eavesdropping
Eavesdropping is a stealthy attack that is classified into two types: (1) passive eavesdropping, in which the intruder penetrates the IIoT system to collect data about the system, such as IP addresses, connected devices, host information, security policies, etc. Once the intruder identifies the network components, it creates a map of the network architecture to recognize the vulnerabilities in the system. Then, it eavesdrops on and inspects the ongoing network traffic to obtain the status and information of the network devices. (2) active eavesdropping, in which attackers can break into communications and listen to phone calls between the two ends of the communication (Hasan et al., 2022).

mental operations, this is difficult to determine.
Moreover, the attacker can manipulate one device while infecting others, such as smart meters linked to the grid. In an IIoT scenario, the attacker might, for instance, gain access to one of the smart meters and use it to execute harmful attacks against Energy Management Systems (EMSs) or forcibly remove power lines (Rambus, 2022).

5.3.2. Password attacks
This attack is used to break the passwords of system users in order to gain access to their accounts. There are two types: dictionary-based attacks, in which an attacker tries commonly used passwords or all the words in a dictionary to crack a user’s password, and brute-force attacks, in which an attacker uses brute-force tools to try every possible password combination (Zolanvari et al., 2019).

5.4.5. IIoT-authentication attacks and mitigation: Related works
Authors in Panchal et al. (2018) suggested the use of firewalls, which allow or block access requests, as a countermeasure to DoS attacks. The use of a stronger authentication and authorization system and of an IDS can also help to avert such attacks. Authors in Singh et al. (2019) presented an authentication scheme that ensures the authenticity of devices and messages and the confidentiality of communication in the context of generic IoT and human-centric IIoT. They also analyzed the security of their proposed scheme against various authentication-related threats. In addition, to reduce the cost of authentication, authors in Rezaeibagha et al. (2019) relied on a new certificateless signature (CLS) scheme; to demonstrate its computational efficiency, they simulated the scheme and measured its time complexity. According to the authors, it is the first scheme shown to offer very robust security while being well suited to the IIoT environment and computationally efficient compared to other schemes.

5.3.3. IIoT-confidentiality attacks and mitigation: Related works
Authors in Wang et al.
(2019) proposed an anti-eavesdropping scheme based on unmanned aerial vehicles (UAVs), in which these aircraft emit jamming signals that reduce or even disable eavesdropping activities. To assess the performance of this scheme, they created a theoretical framework for analyzing the potential for eavesdropping, both local and macro. They also indicated that this scheme had no effect on legitimate communications. Authors in Aldawood and Skinner (2020) focused on human awareness as a means of avoiding social engineering attacks, which aim to interrupt or infect information systems in the industrial environment. Their results showed that user awareness is closely related to social engineering: the more users know about social engineering, the less likely they are to become victims of it. In addition, authors in Alsaedi et al. (2020) developed two bash scripts: password-1.sh, which uses the Custom Word List generator (CeWL) toolkit for dictionary attacks, and password-2.sh, which uses the Hydra toolkit for brute-force attacks. The authors also stated that these scripts were designed to launch password attack scenarios concurrently against IIoT devices on the testbed.

5.5. Authorization
This subsection describes attacks related to the authorization aspect.

5.5.1. Backdoor
In a backdoor attack, the intruder looks for a way to enter the system while evading authentication. Once inside, the intruder has access to all system data and is capable of issuing commands to harm the system (Zolanvari et al., 2019).

5.5.2. Directory Traversal Attack (DTA)
In DTA, the attacker aims to access restricted directories or files that only the root user is allowed to access. Poor filtering or poor validation rules are among the causes of this vulnerability (Albettar, 2019).

5.4. Authentication
This subsection describes attacks related to the authentication aspect.

5.4.1.
Man-in-the-middle attack
Here, the intruder pretends to be a part of the communication system and tries to intercept the messages exchanged between two terminals, which still believe they are communicating directly (Zolanvari et al., 2019).

5.5.3. IIoT-authorization attacks and mitigation: Related works
Authors in Chen and Ng (2017) introduced SecIIoT, an authorization framework that uses annotated metadata to secure IIoT objects. They suggested indexing the owners of these objects with 600 and defined parameters such as location, address, time, etc. on the services that IIoT provides with flexibility and accuracy. The application of this prototype in the IIoT environment has shown that in-memory processing and high-dimensional refinement allow effective, large-scale, and convenient deployment. The authors of Zhou et al. (2018) created a new file-centric multi-key aggregate keyword searchable encryption (Fc-MKA-KSE) scheme for file-centric data sharing in IIoT. Additionally, they created two security models, one capturing indistinguishability against selective-file chosen keyword attacks (IND-SF-CKA) and the other capturing trapdoor privacy. In addition, in Ferretti et al. (2021) the authors proposed a framework that regulates access to industrial system resources through delegation steps. This framework offers several benefits, including the ability to delegate audit permissions issued by authorized third parties, to check for potential anomalies and attacks, and to confirm the attribution of misconduct. The authors showed that their solution is compatible with the IIoT environment.

5.4.2. Selective forwarding attacks
This attack is also known as a gray hole attack. The nodes that have been compromised simply drop or refuse to relay the data packet to the other node in the network during this kind of attack.
Selective forwarding attacks can take a variety of forms, since they can target a single node or a group of nodes to drop the data flowing through them, which causes DoS conditions over that node or set of nodes (Singh and Saini, 2021).

5.4.3. Sybil and spoofing attacks
A Sybil attack is one in which the attacker compromises a trustworthy node and uses it to pretend to be another node in order to take over the network. A spoofing attack involves the attacker personally fabricating each network node’s identity and using that identity to carry out the attack (Mohammadi et al., 2021).

5.4.4. Device hijacking attack
In this kind of attack, the attacker entirely seizes control of the target device. Due to the hijacker’s failure to alter the device’s funda

Fig. 6. Pictorial illustrations of Supervised Learning algorithm.

6. ML and DL techniques as a solution for IDSs in IIoT
Before we provide our comprehensive survey on ML/DL in the context of IDS, we find it crucial to give a brief description of the widely used ML/DL algorithms. Therefore, in what follows, a brief description of these algorithms is provided.

6.1. Supervised learning
As the name suggests, a supervisor monitors the entire learning process. More specifically, to perform the learning process, a labeled data set is provided to a supervised learning algorithm. Typically, supervised learning is used to solve classification and regression problems. As shown in Fig. 6, supervised learning techniques include Neural Networks, Decision Trees, Random Forests, k-nearest neighbor (K-NN), and Support Vector Machine (SVM). The following is a succinct summary of these algorithms:

6.1.1. Decision Tree
A Decision Tree (DT) is an algorithm that looks like a tree with branches and leaves, as shown in Fig. 6(a). It relies on if-then-else rules to improve readability. A DT contains two kinds of nodes, namely, leaf nodes and decision nodes. In classification or regression problems, based on the decision rules, DT predicts a class and creates a training model inferred from training data. Simple construction, ease of implementation, the ability to handle large data samples, and transparency are advantages of DT. Its drawback is that a large space is required to store the data due to its large construction, which makes it more complex in cases where several DTs are considered to solve the problem. In IIoT, DT is used to solve security problems, especially DDoS and intrusion detection.

6.1.2. k-nearest neighbor (K-NN)
KNN is considered a lazy and non-parametric supervised learning algorithm. It usually relies on Euclidean distance, even though other distance functions such as Hamming distance, Manhattan distance, and Chebyshev distance can be used. As illustrated in Fig. 6(b), the distance function is used to find the k nearest neighbors of an unknown sample, and the sample is estimated from their average value. More specifically, the average value of the nearest neighbors can be used to estimate missing samples. KNN is utilized in the IIoT for anomaly detection, malware detection, and intrusion detection. Finding unknown samples takes a lot of time, which reduces KNN’s accuracy in a variety of applications.

Fig. 7. Pictorial illustration of Bayesian algorithm.

6.1.3. Random forest
A Random Forest (RF) is made up of a number of decision trees, each of which classifies the given problem. The combination of these DTs yields an accurate and robust estimation model using the concept of majority voting, as shown in Fig. 6(c).

6.1.4. Support Vector Machine (SVM)
SVM is a supervised machine learning technique that produces a hyper-plane between two classes to get the best classification from a given data set.
The hyper-plane is chosen to maximize its distance from each class, that is, to separate the classes with minimum error at the maximum margin, as shown in Fig. 6(d). SVM can be used to tackle linear and non-linear problems. SVM has a high accuracy level, making it suitable for solving security problems in IIoT such as intrusion detection.

Fig. 8. Pictorial illustration of Ada-Boost algorithm.

6.1.7. Bayesian
The Bayesian algorithm is a statistical learning methodology used to detect relationships between data sets by learning conditional independence using various statistical techniques. This algorithm predicts an output based on the current information using Bayesian probability. More specifically, Bayesian learning takes different prior probability functions and the current information as input to predict posterior probabilities. A Bayesian learning algorithm is used in IIoT to detect anomalies and intrusions in the network layer. Although the Bayesian algorithm requires less training data and is easier to apply, it is less accurate since it relies on prior knowledge and relationships between features, as illustrated in Fig. 7.

6.1.5. Neural Networks (NN)
A Neural Network (NN) is a mathematical representation of the neuron-based organization of the human brain. In ML/DL, a NN consists of an enormous number of neurons that process the input data and produce an output. A NN typically consists of an input layer, one or more hidden layers, and an output layer, as shown in Fig. 6(e). The input layer receives the input data, the hidden layer or layers process that data using mathematical operations (multiplication, activation function), and the output layer provides the predicted result. In terms of response time, NN is effective. However, the computational cost of NN is high.

In the context of IIoT, RF is often used for DDoS attack detection, anomaly detection, and unauthorized IIoT device identification.
More specifically, in Doshi et al. (2018), the authors obtained excellent results with RF when tackling the DDoS attack detection problem in IIoT, and RF even outperformed SVM, NN, and KNN in terms of accuracy. However, one of the drawbacks of RF is that it requires a larger amount of training data.

6.1.8. Ada-Boost
Ada-Boost is a special ML algorithm that combines different classifiers to obtain a reasonable outcome. To make a prediction, Ada-Boost explicitly combines multiple homogeneous or heterogeneous classifiers. Ada-Boost, illustrated in Fig. 8, is utilized in IIoT for malware detection, anomaly detection, and intrusion detection.

6.1.9. Deep learning
The layered structure of NNs is the foundation of deep learning (DL), a specialized branch of machine learning. By mimicking the information processing and communication systems of the human brain, DL can recognize speech, translate languages, process data to detect objects, and make decisions, as shown in Fig. 9. In IIoT, DL algorithms are widely used for intrusion detection, anomaly detection, and for solving other security issues (Alsaedi et al., 2020; Li et al., 2020).

6.1.6. Association Rule (AR)
Association Rule (AR) is a machine learning method that seeks to determine unknown samples by taking into account their shared relationships in a given data set, as shown in Fig. 6(f). In Tsai (2009), AR has shown its efficiency in network intrusion detection tasks. Although AR is simple to implement, its use in IDSs is limited due to its high time complexity and poor results in cases where the model is large and complex.

Fig. 11. Pictorial illustration of K-means algorithm.
Fig. 9. Pictorial illustration of Deep Learning algorithm.
Fig. 12. Pictorial illustration of PCA algorithm.

in Fig. 12. In the IIoT, PCA can be used as a feature selector to quickly identify intrusion attempts.
Therefore, a robust security protocol can be obtained by combining PCA with ML/DL techniques. In particular, Zhao et al. (2017) suggested an online machine-learning technique for an IDS in IIoT that integrates PCA and KNN.

Fig. 10. Supervised vs Unsupervised Learning.

6.3. Reinforcement Learning
Reinforcement Learning (RL) is another category of ML that enables machines to learn through the concept of action and reward. More specifically, an agent interacts with its environment and takes suitable actions to maximize its reward according to the situation. In contrast to the previous two categories, there is no training data set in RL. The agent uses trial and error to build the best strategy from its experience in order to be rewarded highly, as shown in Fig. 13. In the past, RL approaches such as Q-learning, deep Q-learning, post-decision state (PDS), and Dyna-Q have been utilized to address security challenges in IoT networks such as jamming attacks, malicious inputs, and malware identification (Xiao et al., 2016).

6.2. Unsupervised learning
The learning algorithm in unsupervised learning is not directed. In other words, neither classification nor labeling is included in the training data set. A comparison of supervised and unsupervised learning is shown in Fig. 10. Since the data is unlabeled, the learning algorithm looks for similarities between data samples and groups them into various clusters. Unsupervised learning algorithms have accordingly been used for DoS detection in IIoT (Devare et al., 2016; Xiao et al., 2013). In what follows, we briefly describe the unsupervised learning techniques.

6.2.1. K-means
K-means is one of the most popular unsupervised learning techniques for grouping data points into clusters or groups. The K-means clustering technique separates the training data set into k clusters, each of which has a centroid, as seen in Fig. 11.
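To make the grouping performed in Section 6.2.1 concrete, the following is a minimal sketch of K-means in plain Python. It is not taken from any of the surveyed works: the function names (`assign_to_nearest`, `kmeans`), the toy two-dimensional points, and the fixed random seed are illustrative assumptions. The sketch alternates between assigning each sample to its nearest centroid by Euclidean distance and recomputing each centroid as the mean of its cluster, and it stops when the centroids no longer change.

```python
import math
import random


def assign_to_nearest(points, centroids):
    """Return, for every point, the index of its closest centroid (Euclidean distance)."""
    return [
        min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))
        for point in points
    ]


def kmeans(points, k, max_iter=100, seed=0):
    """Plain K-means on a list of equal-length numeric tuples.

    Hypothetical helper for illustration only; initial centroids are k
    distinct samples drawn with a fixed seed so that runs are repeatable.
    """
    rng = random.Random(seed)
    centroids = [tuple(p) for p in rng.sample(points, k)]
    for _ in range(max_iter):
        labels = assign_to_nearest(points, centroids)
        updated = []
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                # New centroid = coordinate-wise mean of the cluster members.
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                # Keep the old centroid if a cluster received no samples.
                updated.append(centroids[i])
        if updated == centroids:
            break  # Converged: assignments can no longer change.
        centroids = updated
    return centroids, assign_to_nearest(points, centroids)


# Toy feature vectors (assumed values): two well-separated groups of samples.
samples = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
final_centroids, final_labels = kmeans(samples, k=2)
```

In an IDS setting, the tuples would be numeric traffic features; samples that end up far from every centroid, or in a very small cluster, are the usual candidates for anomalies.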
Each centroid, acting as the heart of its cluster, captures the data samples closest to it based on Euclidean distance and includes them in its cluster. Then, the mean of every cluster is recalculated as the new centroid, and the same process iterates until the optimal cluster centroids are obtained. The K-means clustering algorithm is utilized in IIoT and IoT networks for anomaly and Sybil attack detection. However, compared to supervised learning algorithms, the technique is less effective.

6.2.2. Principal Component Analysis (PCA)
A massive data set can be reduced in size using Principal Component Analysis (PCA), commonly known as a feature reduction procedure, while retaining most of the information it contains as shown

Fig. 13. Reinforcement Learning (RL).
Fig. 14. IDS based on ML/DL.

7. Intrusion Detection System background based on ML/DL
In this section, we present a background of IDS based on ML/DL. The general architecture is shown in Fig. 14. As discussed in the preceding section, IIoT networks experience numerous attacks, including malware, Man-in-the-Middle, DoS, DDoS, etc. The life cycle of an IDS includes the following steps: data gathering, data preprocessing, implementation, training, and validation. Data gathering consists of collecting the signatures of all the targeted attacks (malware, Man-in-the-Middle, DoS, DDoS, spoofing, jamming, etc.). The network layer can be divided into two sub-layers: (1) a short-range communication sub-layer providing connectivity between sensor nodes or between a sensor node and a data-aggregating node; and (2) a long-range communication sub-layer providing connectivity between the aggregating node and the processing node. The pre-processing and training steps consist of training ML/DL-based software that connects with the system’s Next-Generation Firewalls (NGFWs). A Next-Generation Firewall is an AI-based firewall that is able to learn how users behave within a network and to decide whether to admit or reject those users based on whether their behavior is normal or abnormal. The implemented solution is tested as part of the validation stage.

8. Review on Intrusion Detection System based on intelligent approaches for IIoT
In this section, we present a thorough assessment of the literature on ML/DL-based IDSs for IIoT networks.

8.1. Datasets
Nowadays, many research groups produce various types of datasets for their own studies and provide them to

• X-IIoTID database (Al-Hawawreh et al., 2022): It was generated at the University of New South Wales in Canberra. It was developed to mimic the methods, moves, and plans utilized by new attackers in the IIoT setting. IoT sensors, actuators, controllers, edge, mobile, and cloud traffic were all incorporated into this simulation. Several communication styles are included, such as Machine-to-Human and Machine-to-Machine, with high network activity and events. In addition, it includes communication protocols such as CoAP, WebSocket, and MQTT. The dataset contains 68 features.
• WUSTL-IIoT-2021 Dataset (Zolanvari, 2021): Maede Zolanvari from the McKelvey School of Engineering at Washington University in St. Louis created it in 2021. By classifying all of the normal traffic as class 0 and all of the attack traffic as class 1, the problem has been reduced to binary classification. The dataset was intentionally designed to be unbalanced so that it resembles the actual IIoT ecosystem. It consists of 1,107,448 normal samples, 87,016 attack samples, and 41 features.
With this dataset, an AI/ML-based IDS only offers a binary classification to determine whether a traffic sample is indicative of an attack or not.
• Edge-IIoT database (Ferrag et al., 2022): It is one of the most recent datasets designed for IoT/IIoT environments. Its data were gathered from a seven-layer testbed including more than 10 IoT devices, an IIoT-based Modbus flow, and 14 attacks on IoT/IIoT protocols. These attacks fall under five threat categories: man-in-the-middle attacks, malware attacks, injection attacks, information-gathering attacks, and DoS/DDoS attacks.

science community repositories. Several of these datasets are highlighted in this section, particularly those that are concerned with intrusion detection systems and machine learning in IoT/IIoT environments. It is worth noting that some of these datasets are not specific to the IoT/IIoT environment, while others were produced specifically for it. Table 5 describes these datasets.

• KDD99 (Hettich, 1999): KDD99 is a dataset that was used to build reliable IDSs in the Third International Knowledge Discovery and Data Mining Tools Competition (Xiang and Lim, 2005). The records, which come from a military network environment, carry the following attack types or labels: (1) probing; (2) remote to user; (3) user to root; and (4) denial-of-service. The dataset has 41 features that are categorized into four kinds: (1) basic features of individual TCP connections (F1–F9); (2) content features within a connection suggested by domain knowledge (F10–F22); (3) traffic features computed using a two-second time window (F23–F31); and (4) host features intended to evaluate attacks that run longer than two seconds (F32–F41).
• NSL-KDD (Bala and Nagpal, 2019): NSL-KDD is a modified version of the previous dataset that overcomes several of KDD99’s limitations.
Particularly, three significant issues have been reduced: (1) redundant records have been removed; (2) a range of data samples have been chosen from the original dataset to improve the accuracy of ML/DL approaches; and (3) the imbalanced probability distribution problem has been resolved. • UNSW-NB15 (Moustafa and Slay, 2015): The IXIA PerfectStorm tool and the Cyber Range Lab of the Australian Centre for Cyber Security (ACCS) published it in 2015. The intention of this dataset was to include both hybrid real-world, and contemporary incursion scenarios. The dataset, which consists of 540,044 samples, was created from 100 GB of raw traffic that was recorded using the TCP dump program and stored in four CSV files. • CICIDS database (Sharafaldin et al., 2018a): It is the most recent dataset made available by the University of New Brunswick’s Canadian Institute for Cyber-security. Eleven significant features were taken from the dataset in Gharib et al. (2016) and evaluated as being sufficient to develop a reliable IDS with a high detection rate. The dataset was produced using the FTP, HTTP, HTTPS, SSH, and email protocols while taking into account the hypothetical actions of 25 people. The dataset was analyzed using the CICFlowMeter tool (Lashkari et al., 2017) based on timestamp, initial and final ports, IP, protocols, and assaults. The dataset includes SSH Heartbleed, Brute force FTP, and DDoS attacks. • CSE-CIC-IDS2018 (Sharafaldin et al., 2018b): It is an anomalybased dataset with network traffic incursion. The dataset contains 80 features that were extracted from network traffic and system logs using the CICFlowMeter-V3 program. • Bot-IoT (Koroniotis et al., 2019): In 2018, the Cyber Range Lab of the UNSW Canberra Cyber Center offered it. To be compliant with the IoT ecosystem, this data was created. There are sufficient records with diverse network profiles. 
In a simulated Internet of Things context, this dataset contains more than 72 million recordings of network activity. In the original dataset, a list of the top 10 features has been supplied. There are 5 output classes in the dataset, each of which represents both regular traffic and the four different forms of assaults that were successful against the IoT network. • TON-IoT database (Alsaedi et al., 2020): The Australian Centre for Cyber Security’s (ACCS) Cyber Range Lab published it in 2019. to duplicate the IIoT network’s complexity and scalability. The telemetry data from an IIoT network is part of a heterogeneous dataset. It includes network activity, operating system logs, and other IIoT service traces. Several attack scenarios, including Man in the Middle, injection, DoS, DDoS, backdoor, password, ransomware, scanning, and Cross-Site Scripting, are contained in this data. 8.2. Performance evaluation metrics Performance indicators like accuracy, precision, recall, and F-Score are frequently used to gauge how effective machine learning and deep learning techniques are. These performance measurements are more specifically known by the following terms: • True positive (TP): both the initial data points and the anticipated data points are true. • True Negative (TN): The anticipated data points as well as the original data points are both false. • False Positive (FP): The projected data points are accurate whereas the original data points are false. • False Negative (FN): Although the projected data points were incorrect, the initial data points were accurate. 
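The four counts listed above, and the accuracy, precision, recall, and F-Score built from them, can be illustrated with a short sketch. It is a minimal illustration that uses the standard textbook definitions; the function names and the toy labels (1 = attack, 0 = normal) are assumptions made here and do not come from any of the reviewed works.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int],
                     positive: int = 1) -> Dict[str, int]:
    """Count TP, TN, FP and FN for a binary problem (positive = attack)."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            counts["TP" if actual == positive else "FP"] += 1
        else:
            counts["FN" if actual == positive else "TN"] += 1
    return counts


def basic_metrics(c: Dict[str, int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F-Score from the four counts."""
    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    total = c["TP"] + c["TN"] + c["FP"] + c["FN"]
    precision = ratio(c["TP"], c["TP"] + c["FP"])
    recall = ratio(c["TP"], c["TP"] + c["FN"])
    return {
        "accuracy": ratio(c["TP"] + c["TN"], total),
        "precision": precision,
        "recall": recall,
        "f_score": ratio(2 * precision * recall, precision + recall),
    }


# Toy labels, purely illustrative: 1 = attack, 0 = normal.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
metrics = basic_metrics(counts)
print(counts)
print(metrics)
```

On an imbalanced dataset, accuracy alone can look high even when most attacks are missed, which is why the reviewed works usually report precision, recall, and F-Score alongside it.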
In terms of these four counts, the performance measures mentioned above are defined as:

\[ \mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1} \]

\[ \mathrm{Precision} = \frac{TP}{TP + FP} \tag{2} \]

\[ \mathrm{Recall} = \frac{TP}{TP + FN} \tag{3} \]

The F-Score, also widely called the F-Measure (Hasan et al., 2019), is the harmonic mean of recall and precision:

\[ \text{F-Score} = \frac{2 \, (\mathrm{Precision} \times \mathrm{Recall})}{\mathrm{Precision} + \mathrm{Recall}} \tag{4} \]

The receiver operating characteristic (ROC) curve plots the true positive rate against the false positive rate and shows how well a classifier performs at various threshold levels. The Area Under the Precision-Recall Curve (AUPR) measures the trade-off between recall and precision at different thresholds, whereas the Area Under ROC (AUROC) measures the two-dimensional area beneath the ROC curve. They are expressed as:

\[ \mathrm{AUROC} = \int_0^1 \frac{TP}{TP + FN} \, d\!\left( \frac{FP}{FP + TN} \right) \tag{5} \]

\[ \mathrm{AUPR} = \int_0^1 \frac{TP}{TP + FP} \, d\!\left( \frac{TP}{TP + FN} \right) \tag{6} \]

In the IIoT literature, researchers also evaluate their proposals using the detection rate and the false positive rate. According to Aburomman and Reaz (2016), an effective IDS should ideally have a very low false positive rate and a very high detection rate. These metrics are defined as:

\[ \mathrm{Detection\ Rate} = \frac{TP}{TP + FN} \tag{7} \]

\[ \mathrm{False\ Positive\ Rate} = \frac{FP}{FP + TN} \tag{8} \]

Table 5
Popular IDS datasets and their corresponding features.

| Dataset | Creation year | Description | Features | Centralized | IoT | IIoT |
|---|---|---|---|---|---|---|
| KDD99 (Hettich, 1999) | 1998 | It was applied at the Third International Knowledge Discovery and Data Mining Tools Competition. | 42 | ✗ | ✗ | ✗ |
| NSL-KDD (Bala and Nagpal, 2019) | 1998 | A modified version of the KDD99 dataset. | | ✓ | ✗ | ✗ |
| UNSW-NB15 (Moustafa and Slay, 2015) | 2015 | It was created using the IXIA PerfectStorm tool by the Cyber Range Lab of the Australian Centre for Cyber Security (ACCS). | 49 | ✗ | ✓ | ✓ |
| CICIDS (Sharafaldin et al., 2018a) | 2017 | It was published in 2017 by the Canadian Institute for Cybersecurity, University of New Brunswick. | 11 | ✓ | ✗ | ✗ |
| CSE-CIC-IDS2018 (Sharafaldin et al., 2018b) | 2018 | It was made public in collaboration between the Communications Security Establishment and the Canadian Institute for Cybersecurity. | 77 | ✓ | ✗ | ✗ |
| Bot-IoT (Koroniotis et al., 2019) | 2018 | It was made available by the Cyber Range Lab of UNSW Canberra Cyber. | 46 | ✓ | ✓ | ✗ |
| TON-IoT (Alsaedi et al., 2020) | 2019 | It is a heterogeneous dataset released by the Cyber Range Lab of the Australian Centre for Cyber Security. | 31 | ✗ | ✓ | ✓ |
| X-IIoTID (Al-Hawawreh et al., 2022) | 2021 | Created in Canberra at the University of New South Wales. | 59 | ✗ | ✓ | ✓ |
| WUSTL-IIoT-2021 (Zolanvari, 2021) | 2021 | It was produced by Maede Zolanvari from the McKelvey School of Engineering at Washington University in St. Louis. | 41 | ✓ | ✓ | ✓ |
| Edge-IIoT (Ferrag et al., 2022) | 2022 | Prepared for the IoT/IIoT environment on a realistic testbed to evaluate the effectiveness of machine learning-based IDSs. | 61 | ✓ | ✓ | ✓ |

8.3. Methods

Table 6 summarizes the reviewed works. It has nine columns. The first column gives the reference work. The second indicates the placement strategy: centralized, distributed, hybrid, or not specified (NS). The third gives the detection method: signature-based (SB), anomaly-based (AB), or another type. The fourth column lists the attacks considered in the implementation, and the fifth identifies the validation method: simulation, experiment, or not specified. The sixth column gives the industrial environment, such as IIoT, ICS, Big Data, or 5G. The seventh column gives the type of AI method used, such as RF, DT, XGB, or ANN. The eighth column gives the feature selection method, and the ninth gives the data used.

Table 6
Recent IDS for IIoT based on ML/DL.

| Ref. | Placement strategy | Detection method | Security threat | Validation type | IIoT scenario | ML/DL method | Feature selection method | Dataset |
|---|---|---|---|---|---|---|---|---|
| Kasongo (2021) | Not specified (NS) | Signature-based (SB) | Normal, Generic, Exploits, DoS, Reconnaissance, Shellcode | Experiment | Large-scale IIoT | RF, LR, NB, DT, ET, XGB | Wrapper-based feature selection (FS) | UNSW-NB15 |
| Liu et al. (2021) | Centralized | SB | Backdoor, Shellcode, Worms | Experiment | NS | OCSVM | PSO-LightGBM | UNSW-NB15 |
| Zhou et al. (2020) | Centralized | Anomaly-based (AB) | Fuzzers, Worm, Analysis, Exploit, Backdoor, DoS, Shellcode, Reconnaissance, Generic | Experiment | Industrial Big Data systems | VLSTM | PCA | UNSW-NB15 |
| Gao et al. (2019) | Centralized | AB | DoS, Probe, U2R, R2L, Normal | Experiment | ICS | I-ELM | PCA | UNSW-NB15, NSL-KDD |
| Hanif et al. (2019) | Centralized | AB | Normal or Threat | Simulation | ICS | ANN | None | UNSW-NB15 |
| Ketzaki et al. (2019) | Centralized | AB | Normal or Threat | Simulation | IIoT and 5G | ANN | Statistical analysis | UNSW-NB15 |
| Almomani (2020) | Centralized | AB | Normal or Threat | Experiment | IIoT | DT, SVM | GA, GWO, FFA | UNSW-NB15 |
| Nazir and Khan (2021) | Centralized | AB | Normal or Threat | Experiment | ICS | RF | TS | UNSW-NB15 |
| Zong et al. (2018) | Centralized | AB | Normal, Analysis, Total Records, DoS, Exploits, Worms, Generic, Reconnaissance, Shellcode, Fuzzers, Backdoor | Experiment | Large-scale network environment | RF | IG | UNSW-NB15 |
| Ullah and Mahmoud (2017) | Centralized | AB | Normal, Fuzzers, Backdoor, Worms, Analysis, Generic, Reconnaissance, Total Records, DoS, Exploits, Shellcode | Experiment | SCADA | BayesNet and J48 | IG | Private |
| Alves et al. (2018) | Centralized | AB | Code injection, DoS, Eavesdropping | Experiment | SCADA | K-means | NS | Private |
| He et al. (2017) | Centralized | AB | FDI | Simulation | SCADA | CDBN | NS | Private |
| Potluri et al. (2017) | Hybrid | AB | Probe, U2R, R2L, DoS | Simulation | Network Control Systems | CDBN, SVM | NS | NSL-KDD |
| Keliris et al. (2016) | Centralized | AB | Reconnaissance, Code injection | Simulation | ICS | SVM | NS | Private |
| Eigner et al. (2016) | Centralized | AB | Man-in-the-Middle | Simulation | ICS | KNN | NS | Private |
| Siddavatam et al. (2017) | Centralized | AB | Normal or Abnormal | Simulation | In-house developed industrial compliant | AdaBoost | NS | Private |
| Maglaras (2018) | Centralized | AB | Normal or Abnormal | Simulation | SCADA | OCSVM | NS | Private |
| Mantere et al. (2012) | NS | AB | NS | NS | ICS | NS | NS | NS |
| Zolanvari et al. (2019) | Centralized | AB | Command injection, Backdoor, SQL injection | Experiment | IIoT | SVM, ANN, KNN, NB, DT, RF, LR | The features that change during the attack phases | Private |
| Awotunde et al. (2021) | Centralized | AB | DoS, Probe, U2R, R2L | Experiment | IIoT | Deep feed-forward neural network | Rule-based | NSL-KDD, UNSW-NB15 |
| Wang et al. (2021) | Centralized | AB | DoS, DDoS, Theft, Reconnaissance | Experiment | ICS | ANN | NS | NF-BoT-IoT-v2 |
| Stewart et al. (2017) | Hybrid | NS | NS | Simulation | SCADA | OCSVM | GA | HEDVa |
| Kalash et al. (2018) | Hybrid | AB | Malware | Experiment | IIoT | CNN | NS | Malimg and Microsoft |
| Latif et al. (2020) | Centralized | AB | Attack or not attack | Experiment | IIoT | DRaNN | NS | UNSW-NB15 |
| Abdel-Basset et al. (2020) | Distributed | AB | Reconnaissance, Theft, DoS, DDoS, Legitimate | Experiment | IIoT | Deep-IFS | NS | Bot-IIoT, UNSW-NB15 |
| Hassan et al. (2020) | Centralized | AB | Attack or not attack | Experiment | Big data environment | CNN, LSTM | NS | UNSW-NB15 |
| Li et al. (2020) | Centralized | AB | DoS, Probe, U2R, R2L | Experiment | IIoT | Multi-CNN | NS | NSL-KDD |
| Raja et al. (2021) | Centralized | AB | DoS | Experiment | IIoT | Deep neural network | NS | TON_IoT |
| Zolanvari et al. (2018) | Distributed | NS | Normal, Attack | Experiment | IIoT | ANN | Raw network | NS |
| Zhang et al. (2018) | Distributed | AB | Normal, Attack | Simulation | IIoT | ANN | AutoEncoder | UNSW-NB15 |
| Mendonça et al. (2021) | Centralized | AB | DoS, malevolent operation, spying, scanning, data type probing, brute force, wrong setup, web attacks | Experiment | IIoT | ANN | NS | DS2OS, CICIDS2017 |
| Al-Hawawreh et al. (2019) | Centralized | AB | Attack or not attack | Experiment | Brownfield IIoT | Deep neural network | NS | Mississippi State University's Critical Infrastructure Protection Center dataset |
| Kasongo and Sun (2020) | Centralized | AB | Reconnaissance, Theft, DoS, DDoS, Legitimate | Experiment | IIoT, 5G, 6G | Deep neural network | NS | UNSW-NB15, AWID |
| Al-Hawawreh et al. (2022) | Centralized | NS | Normal, Attack | Experiment | IoT/IIoT | NB, DT, SVM, KNN, LR, Gated Recurrent Unit, DNN | NS | X-IIoTID |
| Zolanvari (2021) | Centralized | NS | DoS, Backdoor, Reconnaissance, Command injection | Experiment | IoT/IIoT | DT, NB, SVM, KNN, LR, ANN, RF | NS | WUSTL-IIoT-2021 |
| Dong et al. (2018) | NS | NS | Normal and abnormal | Experiment | SCADA | NS | Entropy-based | NSL-KDD, water dataset, gas dataset |
| Yao et al. (2019) | Hybrid | AB | NS | Simulation | Edge-based IIoT | LightGBM | NS | NS |
| Alsaedi et al. (2020) | NS | NS | DoS, DDoS, XSS, password cracking, injection, MITM, ransomware, scanning, backdoor | Experiment | IoT/IIoT | SVM, DT, NB, LR, RF, KNN, LSTM | NS | TON-IoT |
| Sarhan et al. (2021b) | Distributed | Hybrid | NS | Simulation | IIoT | ANN, RF | Chi-square, IG, Correlation | UNSW-NB15, CSE-CIC-IDS2018, ToN-IoT |
| Al-Hawawreh and Sitnikova (2019) | NS | NS | Ransomware attacks | Experiment | IIoT | SVM, NB, RF, DNN, LR | AutoEncoder | Data in Sgandurra et al. (2016) |
| Liang et al. (2021) | Distributed | NS | Normal or Attack | Experiment | IIoT | ANN, DNN, LSTM, LuNet, DeRol, Siamese-NN, Deep-MCDD, OICS-VFSL | Intra/inter-class based | NSL-KDD, CIC-IDS 2017 |
| Qiao et al. (2020) | Distributed | Hybrid | Fuzzers, DoS, Exploits, Generic, Backdoor, Analysis | Simulation | IIoT | KNN | LDA, PCA | UNSW-NB15 |
| Aljawarneh et al. (2018) | Centralized | Hybrid | DoS, U2R, R2L, Probe | Simulation | NS | NB, J48, Meta Pagging, RandomTree, REPTree, AdaBoostM1, DecisionStump | IG | NSL-KDD |
| Koroniotis et al. (2019) | Centralized | NS | DoS/DDoS, Information theft, Probing | Simulation | IoT | SVM, RNN, LSTM | Joint Entropy, Correlation Coefficient | Bot-IoT |
| Ferrag et al. (2022) | Centralized | NS | DoS/DDoS, Information gathering, Injection, Man-in-the-Middle, Malware | Experiment | IoT/IIoT | SVM, DT, RF, KNN, LSTM | Zeek tool, TShark tool | Edge-IIoT |
| Al-Hawawreh et al. (2019) | NS | NS | Normal or Attack | Experiment | IIoT | SVM, NB, RF, KNN | AutoEncoder | RTU data |
| Zhang et al. (2020) | NS | AB | Normal, Generic, Exploits, Fuzzers, Reconnaissance, DoS, Worms, Shellcode | Experiment | IIoT | SVM, MRMR | AutoEncoder | UNSW-NB15, MSU |
| Bhatia et al. (2019) | NS | AB | Normal or Attack | Experiment | IIoT | SVM | AutoEncoder, PCA | Public data |
| Tang et al. (2022) | Distributed | NS | Cyber attacks | Experiment | IoT | GRU | GRU | CICIDS2017 |
| Khan et al. (2021) | Distributed | NS | NS | Experiment | IIoT | CNN, LIME | AutoEncoder, LSTM | Gas data |
| Zolanvari et al. (2021) | NS | NS | NS | Experiment | IIoT | MMG, LIME | TRUST XAI | WUSTL-IIoT, NSL-KDD, UNSW |
| Alani et al. (2022) | NS | NS | Reconnaissance, Backdoors, DoS, Injection | Experiment | IIoT | DeepIIoT | MLP | WUSTL-IIoT-2021 |

An IDS based on ML techniques and the Genetic Algorithm (GA) for IIoT was recently proposed by Kasongo (2021). In the proposal, GA was used for feature selection, with Random Forest (RF) used in the GA fitness function. RF, Logistic Regression (LR), Naive Bayes (NB), Decision Tree (DT), Extra-Trees (ET), and Extreme Gradient Boosting (XGB) were used for intrusion detection, and their performance was evaluated in terms of test accuracy (TAC) and area under the curve (AUC). The author selected 16 features from the UNSW-NB15 dataset and showed that RF outperforms the other techniques. However, only a few classes (six) were considered when training the model.

Liu et al. (2021) presented an IDS for an IIoT use case that selects relevant features with a Particle Swarm Optimization (PSO) technique based on the Light Gradient Boosting Machine (LightGBM) and uses SVM for classification. The authors validated their proposal on the UNSW-NB15 dataset, using accuracy and false alarm rate as performance indicators. They obtained an accuracy of 86.68% and a high false alarm rate of 10.62%. The proposed solution targets backdoor, shellcode, and worm attacks. The dataset used for validation is imbalanced and has few classes.

Zhou et al. (2020) proposed a variational Long Short-Term Memory (VLSTM) model for intrusion detection. An auto-encoder neural network is designed to extract low-dimensional feature representations from high-dimensional IIoT data. The approach was validated on the UNSW-NB15 dataset using F1-Score, precision, Area Under the Curve (AUC), recall, and False Alarm Rate (FAR) as performance indicators. The authors obtained an F1-Score of 90.7%, an AUC of 0.895, a precision of 86%, and a recall of 97.8%.

Gao et al. (2019) proposed an incremental extreme learning machine (I-ELM) strategy for IDS that uses adaptive principal component analysis (A-PCA) for feature selection. Candidate features are first chosen using A-PCA and then fed into the I-ELM for classification. The technique was evaluated on NSL-KDD and UNSW-NB15, reaching an accuracy of 81.22% on NSL-KDD and 70.51% on UNSW-NB15. According to the authors, more improvements are required before the technique can be used in IIoT networks.

Hanif et al. (2019) proposed an artificial neural network (ANN)-based approach for IDS, in which the ANN is used to detect attacks and the controller discards commands that are classified as threats. The approach was validated on the UNSW-NB15 dataset and attained a precision of 84.00% for binary classification. The authors did not use a feature selection algorithm.

For IIoT and 5G networks, Ketzaki et al. (2019) developed a lightweight IDS that uses an ANN for classification. The authors used statistical analysis to determine which features to employ for deciding whether an attack is occurring. The proposal was validated on the UNSW-NB15 dataset and attained an accuracy of 83.9%.

Almomani (2020) proposed an IDS for IIoT networks that classifies data using DT and SVM. For feature selection, the genetic algorithm (GA), the grey wolf optimizer (GWO), and the firefly algorithm (FFA) were used. GWO-DT, FFA-DT, and GA-DT achieved accuracies of 85.676%, 86.037%, and 86.874%, respectively, while GWO-SVM, FFA-SVM, and GA-SVM achieved 84.485%, 85.429%, and 86.387%, respectively.
The UNSW-NB15 dataset was used to validate the proposed approach.

Nazir and Khan (2021) recently proposed an RF-based method that uses Tabu Search (TS) for feature selection and RF as the classifier. This strategy increased classification accuracy while reducing the number of features and false positives. Performance was verified on the UNSW-NB15 dataset, with an accuracy of 83.12% and a False Positive Rate of 3.7%. However, the authors did not take the dataset's class imbalance problem into account.

Zong et al. (2018) introduced a two-stage IDS for imbalanced intrusion detection datasets that separates the training and detection of minority and majority intrusion classes. The minority classes are detected in the first stage, and the majority classes in the second stage. Information Gain (IG) was used for feature selection and RF for classification. The authors validated the technique on the UNSW-NB15 dataset and reported an accuracy of 85.78% and a False Alarm Rate of 15.64%.

Ullah and Mahmoud (2017) proposed a hybrid model for anomaly-based intrusion detection in the IIoT based on the J48 and BayesNet classifiers. In this approach, the J48 classifier served as a supervised attribute filter and information gain served as a feature selector. Anomaly-based intrusion detection was then built using BayesNet. The authors used a private dataset created by Mississippi State University on a gas pipeline network. They reported 100% accuracy, precision, recall, and F-Value for binary classification and 99.5% accuracy, precision, recall, and F-Value for multi-class classification.

Alves et al. (2018) proposed a k-means clustering technique for IDS in SCADA settings.
They created a simulation of a SCADA system that includes a gas pipeline, a water treatment facility, and a water storage tank using an open-source virtual PLC (the OpenPLC platform) and AES-256 encryption. Three types of attacks were taken into account in their studies: DoS, code injection, and eavesdropping. The authors collected 4000 data samples, ran simulations, and used the standard deviation to validate the method's effectiveness.

To identify false data injection (FDI) attacks on data integrity in real time, He et al. (2017) proposed an IDS based on a conditional deep belief network (CDBN). The system was simulated using the IEEE 118-bus and 300-bus test systems. The detection accuracy attained by the authors was over 95%. They compared the performance of their method against ANN and SVM.

Potluri et al. (2017) evaluated the combined use of DL and ML to identify intrusions in network control systems. More precisely, they employed deep belief networks (DBNs) to extract features from the NSL-KDD dataset and an SVM classifier for the classification tasks. They considered DoS, Probe, U2R, and R2L attacks. On Normal, DoS, Probe, R2L, and U2R, the precision was 95.33%, 98.21%, 87.16%, 23.58%, and 29.26%, respectively. The F-score and recall were also evaluated on the various attacks.

Keliris et al. (2016) used a MATLAB test bed to control a Tennessee Eastman (TE) chemical process. For the detection module, they built an SVM model that can tell the difference between disturbances during regular operation and malicious activity. The authors first concentrated on reconnaissance attacks and then assessed the impact of command injection attacks on the controller's ability to manage reactor pressure. The authors assert that their approach can successfully distinguish between normal and abnormal operation.
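Several of the works reviewed in this section rank features by Information Gain (IG), for example Zong et al. (2018) and Ullah and Mahmoud (2017). The sketch below illustrates the usual definition, IG = H(Y) - H(Y | X), for a discrete feature X and class label Y. It is a minimal sketch under stated assumptions: the function names, the feature names, and the toy values are hypothetical and are not taken from any of the cited papers, and continuous features would first need to be discretized.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence, Tuple


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature: Sequence[Hashable],
                     labels: Sequence[Hashable]) -> float:
    """IG = H(labels) - H(labels | feature) for a discrete feature."""
    total = len(labels)
    groups: Dict[Hashable, List[Hashable]] = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns: Dict[str, Sequence[Hashable]],
                  labels: Sequence[Hashable]) -> List[Tuple[str, float]]:
    """Sort feature names by decreasing information gain."""
    scored = [(name, information_gain(values, labels))
              for name, values in columns.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: 'protocol' separates the classes, 'flag' does not.
labels = ["attack", "attack", "normal", "normal"]
columns = {
    "flag": ["a", "b", "a", "b"],
    "protocol": ["tcp", "tcp", "udp", "udp"],
}
ranking = rank_features(columns, labels)
print(ranking)
```

A selector of this kind keeps the features whose score exceeds a chosen threshold, or the top-k features of the ranking; the threshold is a design choice of each study.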
Eigner et al. (2016) presented a k-Nearest Neighbors (KNN) technique to identify Man-in-the-Middle attacks in ICS. To gather data, they first ran simulations, from which they then extracted features for the KNN method. In total, 32 features were gathered. Using the KNN technique, an outlier score of 1.231 was calculated for the Man-in-the-Middle attack on the collected data.

An AdaBoost algorithm based on DT and RF was proposed in Siddavatam et al. (2017) for anomaly detection in an in-house developed industrial-compliant system. To train the model, a prototype was constructed, from which features were collected during both regular and abnormal operation. It is important to note that no attack scenario was tested. The reported accuracy is 98%.

A one-class SVM (OCSVM) model was proposed by Maglaras (2018) to identify anomalies in a SCADA system. To train the model, simulations were run and 1570 data points were collected from the system's normal and abnormal functioning. No specific attack was considered. Data rate and packet size were the two features used to train the OCSVM model. The authors report 98% accuracy on the testing dataset.

Mantere et al. (2012) described feature selection for ICS anomaly detection. They concluded that the key features to consider for anomaly detection in IIoT systems are individual packet sizes, flow directions, protocol, average data byte rates, and average packet rates. However, no attack was put forward to support their conclusion.

Several machine learning (ML) methods, including KNN, SVM, naive Bayes (NB), DT, logistic regression (LR), RF, and ANN, were assessed by Zolanvari et al. (2019) for the IIoT. They created their own dataset using a real-world test bed that replicates an actual industrial plant.
The authors only took into account features whose values vary throughout the attack phase out of a set of features that were developed. Performance indicators included False Alarm Rate, Undetected 19 Journal of Network and Computer Applications 215 (2023) 103637 M. Nuaimi et al. RF algorithms are used to measure the attack detection accuracy on CSE-CIC-IDS2018, UNSW-NB15, and ToN-IoT datasets. Zhang et al. (2018) selected characteristics from the UNSW-NB15 dataset using a denoising AutoEncoder (AE), and trained an ANN model for IIoT anomaly classification. The authors considered binary classification and obtained an overall accuracy of 98.80% of F-score, precision and recall. Aljawarneh et al. (2018) used Information Gain (IG) on the NSLKDD dataset as a feature selector. On the basis of an IG threshold value of 0.40, a total of 8 features were chosen. Then, these features are fed to a hybrid J48, Meta Pagging, RF, REPTree, AdaBoostM1, DecisionStump, and NB classifiers to measure the detection accuracy rate. The proposed hybrid algorithm achieved an accuracy of 99.81% on binary classification. In order to detect the presence of intruders, Awotunde et al. (2021) presented a DL-based IDS for IIoT using hybrid rule-based feature selection. To be more precise, the model was trained using a deep feedforward neural network and feature selection was done using rules. The authors used the UNSW-NB15 and NSL-KDD datasets to validate their hypothesis. For the UNSW-NB15 dataset, the authors achieved a False Positive Rate, detection rate, and accuracy of 1.1%, 99.9%, and 98.9%, respectively, during the evaluation phase. For the NSLKDD dataset, they achieved a False Positive Rate, detection rate, and accuracy of 1.0%, 99.0 %, and 99.0%, respectively. Kalash et al. (2018) malware categorization using a convolutional neural network-based DL model (CNN). The authors trained a CNN for classification after first converting malware binaries to grayscale pictures. 
On the Microsoft and Malimg datasets, the suggested method has accuracy rates of 99.97% and 98.52%, respectively. Latif et al. (2020) provided an approach for intrusion detection in the IIoT that uses deep random neural networks (DRaNN), where the positive and negative signals are switched in the form of amplitude spikes. The authors claimed that the highly scattered character of random neural networks was the reason they chose to employ them. The concept was validated using the UNSW-NB15 dataset, and accuracy and attack detection rates were 99.54% and 99.41%, respectively. To detect intrusions in IIoT traffic, Abdel-Basset et al. (2020) introduced the Deep-IFS distributed forensics-based deep learning model. This model learns local representations using gated recurrent units while Multihead attention learns the global representation. The BotIIoT and UNSW-NB15 datasets were used to validate the suggested model. The binary classification approach yielded an accuracy of 99.75%, an F1-measure of 98.14%, and an AUC of 99.98 %. The proposed method had the following performance on multi-classification: F1-measure: 99.88%; precision: 99.99%; recall 99.77%; and accuracy: 99.77 %. A hybrid deep learning model for IDS for big data situations was presented by Hassan et al. (2020). The proposed method involved using the CNN to extract features while keeping long-term relationships between extracted features in a weight-dropped, lengthy short-term memory to minimize overfitting. The model scored a 97.1% accuracy for binary classification and 98.4% accuracy for multiclass classification. A multi-convolutional neural network (multi-CNN) fusion approach for IDS in the IIoT was proposed by Li et al. (2020). The suggested approach offers a migration learning model analysis framework and a way to choose system features. The multi-CNN model had a multi-class classification accuracy of 64.81% and a binary classification accuracy of 86.95% on the NSL-KDD dataset. 
A novel sparse evolutionary training (SET) based prediction model for intrusion detection in the IIoT was proposed in Mendonça et al. (2021). On the DS2OS and CICIDS2017 datasets, the approach was validated, and the model’s precision was assessed against the state-ofthe-art. The authors of Al-Hawawreh et al. (2019) suggested a DL-based method in which the model is trained using data gathered from Remote Telemetry Unit (RTU) streams of the gas pipeline system. Labeled and Rate, Accuracy, Matthews Correlation Coefficient, and Sensitivity, and various comparisons were done. In order to enable neural networks to recognize unknown attack classes, Wang et al. (2021) suggested an IDS based on ANN, employing the Openmax layer rather than the popular softmax layer. The NF-BoTIoT-v2 dataset, which was just released by Sarhan et al. (2021a), was used by the authors. The authors report accuracy, F-score, and recall on the NF-BoT-IoT-v2 data to be 0.864 ±0.094, 0.845 ± 0.084, and 0.840 ± 0.090, respectively, without taking into account any feature selection methods. Stewart et al. (2017) suggested an IDS for the IIoT based on OCSVM. The HEDVa (Hybrid Environment for Design and Validation) was utilized by the authors to create the proposed approach, and data were collected for the SVM classifier. The GA was used to choose the features and optimize the hyperparameters (kernel). The authors’ 99% detection accuracy for testing data. For IIoT networks, Dong et al. (2018) introduced a traffic feature map-based IDS. The authors’ strategy for creating a feature vector and extracting significant features in their proposal is information entropybased. Then, using a correlation analysis method, a feature relationship map is created. A perceptual hash digest database of the normal and abnormal is created using the discrete cosine transform (DCT) and singular value decomposition (SVD) methods and intrusion detection rules are then retrieved from the database. 
In their proposal for a hybrid IDS architecture for edge-based IIoT, Yao et al. (2019) used ML and DL algorithms in the lower-layer network and upper-layer network, respectively. Advanced characteristics are taken from the dataset and placed in the lower layer, also known as the edge nodes layer. The intrusion detection task is then carried out at the upper layer using a DL algorithm with improved accuracy. The suggested architecture is not put through any sort of validation procedure. A dataset for intruder detection in IoT and IIoT scenarios was proposed by the authors of the paper (Alsaedi et al., 2020). On the suggested dataset, they assessed how well a number of ML algorithms performed. The authors take into account the ML algorithms NB, SVM, RF, kNN, DT, LR, and LSTM. Scanning, DoS, DDoS, ransomware, backdoor, data injection, Cross-site Scripting (XSS), password cracking assault, and Man-in-the-Middle are the nine attacks that are examined (MITM). The authors assert that RF is the top algorithm on the suggested dataset for identifying intruders in IIoT systems based on their analyses. To identify intruders in dispersed IIoT systems, Liang et al. (2021) presented an Optimized Intra/Inter-Class Structure-based Variational Few-Shot Learning (OICS-VFSL). The authors used Bayesian theory to maximize both the intra-class and the inter-class utilizing the maximization of similarities. The ANN is then used to carry out multiclassification tasks. Performance indicators used in the experiments on the NSL-KDD and CIC-IDS 2017 dataset include the false acceptance rate (FAR), detection rate (DR), and F-score. We looked at both binary classification and multi-classification. In Zolanvari et al. (2018), explored the command manipulation attack in a water storage scenario to compromise the output commands and took into account the impact of the unbalanced dataset. They used Kali Linux Penetration Testing Distribution to carry out attack scenarios and gather training data. 
The dataset was generated in such a way that the training data are imbalanced. The ANN algorithm was then chosen to train the model.

Qiao et al. (2020) compared two feature extractors, namely Linear Discriminant Analysis (LDA) and PCA. By feeding the discriminant information obtained from the LDA into the PCA, they presented a linear discriminative PCA. A KNN algorithm was then applied to the UNSW-NB15 dataset to detect intruders. For binary classification, the authors achieved a detection rate of 92.35%.

Sarhan et al. (2021b) studied three feature selection algorithms, namely chi-square, information gain, and correlation. These feature selection algorithms identify and rank data features. Then, ANN and

Zhang et al. (2020) studied the impact of distinct features on anomaly detection in IIoT through the maximum correlation minimum redundancy (MRMR) feature selection algorithm and used SVM as the classification method. Using the UNSW-NB15 dataset and a private industrial dataset, the authors concluded that both coupling and independence exist between different features and that different features have different impacts on anomaly detection. However, the authors did not state which anomalies their model is able to detect.

Bhatia et al. (2019) proposed an unsupervised learning method to secure devices and machines in industrial control systems. The authors claim that their method can be incorporated into a larger system to identify spoofing attacks.

Tang et al. (2022) proposed a federated learning-based approach for intrusion detection in industrial control systems. The experimental results on CICIDS2017 showed promising performance with regard to data privacy protection and network scalability.

Ferrag et al. (2022) proposed the Edge-IIoTset dataset, with which ML/DL-based IDSs can employ either centralized learning or federated learning. Features are taken from a number of sources.
Out of the 1176 features discovered, the authors proposed 61 features with high correlations. The suggested testbed is divided into seven layers: the edge computing layer, the IoT and IIoT perception layer, the cloud computing layer, the blockchain network layer, the network function virtualization layer, the software-defined networking layer, and the fog computing layer. For each layer, the authors provided emerging technologies such as the ThingsBoard IoT platform, Hyperledger Sawtooth, the ONOS SDN controller, digital twin, the OPNFV platform, and Modbus TCP/IP. After processing and analyzing the proposed data, they performed a preliminary exploratory data analysis and evaluated the effectiveness of machine learning methodologies in both centralized and federated learning modes. The authors demonstrated that a DL technique (such as DNN) is more effective for IDS than conventional ML techniques (such as DT, RF, SVM, and KNN). Whereas DT achieved the lowest accuracy of 67.11% under identical conditions, DNN earned the best accuracy of 96.01% for multiclass classification in centralized mode. In binary mode, DNN had the best accuracy, 99.99%, while DT had the lowest, 99.98%. DNN outperformed the other ML algorithms in the multi-class model in terms of Precision, Recall, and F1-score, and all ML algorithms reach 100% on the normal class, with a false positive rate of zero. For 15-class classification with K = 15, the best accuracy of an individual client was 71.42%, whereas with federated learning the client attained an accuracy of 91.74%, which illustrates the benefit of the federated model.

Recently, Explainable Artificial Intelligence (XAI) methods were explored in cyber security applications. Capuano et al. (2022) and Zhang et al. (2022) provided surveys on XAI for cyber security applications. In the context of XAI for security issues, Khan et al.
(2021) presented an explainable auto-encoder-based framework for cyber threat discovery in Industrial IoT networks, which leverages convolutional and recurrent networks to discover cyber threats in IoT networks. The model is able to detect known and zero-day attacks. It first uses a sliding window technique to transform a 1-dimensional (1D) sample into smaller contiguous 2-dimensional (2D) samples. These samples are then fed into a CNN, comprising a 1D convolutional layer and a 1D max-pooling layer, which extracts spatial features. The data is then fed into the auto-encoder-based LSTM, which extracts temporal features. Finally, the DNN uses the extracted representation to make predictions. To make the model explainable, the authors use LIME (Ribeiro et al., 2016). The dataset used for experimentation came from a real-world gas pipeline system and consists of system logs that include the packet data used to communicate with the pipeline, along with features such as packet length, pressure setpoint, and PID gain. The authors achieved 99.35% accuracy using their proposed model. However, the dataset used is not described well, and the proposed approach is not able to detect a large number of attacks.

In the method of Al-Hawawreh et al. (2019), labeled and unlabeled data are first separated from the data that were collected. To train an unsupervised model and derive the model parameters, the unsupervised learning approach uses the sparse and denoising auto-encoder methods. The supervised learning algorithm, which uses a deep neural network, is then prepared using these parameters. Precision, detection accuracy, and false positive rate are measured at 96.41%, 91.49%, and 1.87%, respectively.

Kasongo and Sun (2020) proposed a Feed-Forward Deep Neural Network (FFDNN)-based binary and multi-class classification approach for wireless IDS. A feature vector is built using a Wrapper-Based Feature Extraction Unit based on Extra Trees and fed into a Deep Neural Network for classification.
In order to validate the approach, the UNSW-NB15 and AWID intrusion detection datasets were employed. Overall accuracies for the binary and multiclass classification setups are 99.66% and 99.77%, respectively.

Raja et al. (2021) proposed a two-level IDS in which the first level of detection targets the simple attacks, while the second level looks for the difficult IIoT attacks that were misclassified at the first level. The solution was validated on the TON_IoT dataset, and an F1-score of 99.65%, recall of 99.5%, accuracy of 99.97%, and precision of 95.62% were obtained.

Al-Hawawreh et al. (2022) introduced the X-IIoTID dataset. SVM, DT, NB, K-nearest Neighbor (KNN), Deep Neural Network (DNN), Logistic Regression (LR), and Gated Recurrent Units (GRU) were among the machine learning techniques employed. In terms of accuracy, DT outperformed all other algorithms, achieving 99.54% for binary classification and 99.49% for multi-classification. To find intrusions, centralized learning techniques were used with this dataset.

Zolanvari (2021) employed four different attack types in their dataset: Command Injection, Reconnaissance, DoS, and Backdoor. Seven AI/ML-based IDS techniques were employed and put to the test, namely SVM, KNN, Naive Bayes (NB), RF, Decision Tree (DT), Logistic Regression (LR), and Artificial Neural Network (ANN). According to the results, the RF algorithm had the highest accuracy (99.99%), while the NB algorithm had the lowest (97.48%). The results also revealed that the NB method had the highest false alarm rate (2.52) among the employed algorithms, while the SVM algorithm had the lowest rate (0.00), and the remaining algorithms achieved similar results (0.01).

In their dataset, Koroniotis et al. (2019) generated malicious traffic using cyberattacks that originated in Kali Linux VMs, including DoS/DDoS, information theft (data theft and keylogging), and probing (port scanning and OS fingerprinting).
This dataset was tested using three ML/DL models: Support Vector Machine (SVM), Recurrent Neural Network (RNN), and Long Short-Term Memory (LSTM). With a 99% accuracy rate, the SVM algorithm had the best performance. The lack of IIoT traffic, however, renders the dataset ineffective for IIoT security.

In Al-Hawawreh and Sitnikova (2019), a stacked Variational AutoEncoder (VAE), which is a variant of neural network, was proposed to protect Industrial Internet of Things (IIoT) systems against ransomware attacks. The proposed method was designed to learn the latent structure of system activities and reveal ransomware behavior. To enable their model to generalize well, the authors proposed a data augmentation technique based on VAE. They used the dataset generated in Sgandurra et al. (2016) to validate their proposal. The dataset used is not public, and the authors implemented their method to detect only ransomware activities.

The authors of Al-Hawawreh et al. (2019) proposed an IDS for IIoT that leverages sparse and denoising auto-encoder methods for unlabeled data and deep neural networks for supervised learning. The proposed detection method was validated using data collected from Remote Telemetry Unit (RTU) streams of the gas pipeline system. Even though their proposal provided good results in identifying malicious activities, the authors did not detail the dataset used or the attacks that their proposed method is able to detect.

In real-time industrial scenarios, real-time service provision becomes important. On the other hand, the attacks examined above may have a negative effect on the functionality of these applications. Security concerns are therefore a serious problem for many applications. As a result, strong and privacy-preserving security procedures are needed to safeguard IIoT systems without degrading their functionality or infringing the privacy of their users.
One area of research that can be explored is Federated Learning. In traditional Machine Learning (ML) algorithms, data owners upload their data to a centralized data center to train a model for future prediction (or classification, pattern recognition, etc.). However, always sending data to the centralized data server (1) increases the communication cost and (2) reveals private information of end-users. This traditional approach makes data collection very hard because data owners are unwilling to share their data. To mitigate these issues while preserving the ML model’s accuracy and performance, the concept of Federated Learning (FL) was proposed in 2017 by Google (McMahan et al., 2017). In FL, the training data are distributed among different users, so that no single node stores the entire training data. The basic idea behind FL is that each user trains a local model on its own dataset and sends only the model parameters to the centralized server for model averaging. The centralized server then computes the global model from the local models trained by each user. Therefore, FL (Lalle et al., 2021) efficiently protects users’ privacy and reduces the communication overhead. FL enables the training of a model without sharing the original data, which preserves data privacy.

IIoT is also made up of numerous IoT objects dispersed throughout the monitoring environment. These objects are based on various IoT platforms and are sold by various vendors. Interoperability thus becomes an issue. When creating IDSs for IIoT networks, it is essential to take interoperability and standards concerns into account. Moreover, the objects used in IIoT are powered by batteries, and replacing or recharging them is difficult in some scenarios. Therefore, lightweight IDSs that require only a small number of computational operations are preferred.
During the design of an IDS for IIoT scenarios, the power consumption and memory metrics must be taken into account. Firstly, energy optimization techniques can be designed and implemented to minimize the energy consumption of devices when running an IDS on them. Secondly, deep learning architectures that require less computational time are preferred because the longer an algorithm runs, the more energy the device consumes. Also, a less complex network architecture is preferred to save the memory space needed on devices for storing the parameters of the trained model. Finally, IIoT devices can be equipped with power-saving options that power off the device radio when it is not in use.

It is evident from the aforementioned survey studies that few papers took hybrid intrusion detection systems into account. However, such a method is essential for identifying various intruders across various computing settings. We believe, again, that FL and Blockchain can be combined to play a great role in such a context. More specifically, by combining FL and Blockchain, the clients can collaboratively train a global model by training local models on their own local datasets and then sharing the local updates with a Blockchain for model aggregation. The Blockchain then plays the role of the central server that aggregates the local models received from several clients.

The IDS placement strategy should be carefully considered when designing IDSs for IIoT use cases because it impacts the overall efficiency of the IDS. As mentioned above, there are centralized, distributed, and hybrid placement strategies, each with its advantages and drawbacks. Therefore, it is crucial to investigate the pros and cons of each strategy in order to find trade-offs between them. When comparing placement strategies, Krimmling and Peter (2014) concluded that hybrid methods are preferred over centralized and distributed ones.
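The federated averaging step discussed in this section, in which each client trains locally and only model parameters are aggregated, can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any surveyed paper: the model is a one-dimensional linear regressor, the aggregation weights each client by its local dataset size (the usual FedAvg convention), and the names `local_update`, `federated_average`, and the toy client datasets are hypothetical.

```python
def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few gradient-descent epochs of y = w*x + b on one client's private data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Aggregate client parameters, weighting each client by its number of samples."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return tuple(
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    )


# Hypothetical private datasets of two clients, both drawn from y = 2x + 1.
clients = [
    [(-1.0, -1.0), (0.0, 1.0)],
    [(1.0, 3.0), (0.5, 2.0), (-0.5, 0.0)],
]

global_weights = (0.0, 0.0)
for _ in range(100):  # communication rounds
    # Only parameters leave the clients; the raw (x, y) samples stay local.
    updates = [local_update(global_weights, data) for data in clients]
    global_weights = federated_average(updates, [len(d) for d in clients])

print(global_weights)  # close to (2.0, 1.0)
```

In the Blockchain-based variant suggested above, the call to `federated_average` would be carried out by the ledger (for example, inside a smart contract) rather than by a single central server; the client-side training would be unchanged.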
In the future, Blockchain can be investigated to detect and protect different IIoT systems from intrusions and various types of attacks. Authentication and authorization issues can be addressed with Ethereum smart contracts with a high acceptance rate.

Zolanvari et al. (2021) proposed an XAI model called TRUST (Transparency Relying Upon Statistical Theory) for IIoT. They demonstrated the performance of their model on different cybersecurity datasets (WUSTL-IIoT, NSL-KDD, and UNSW). Using factor analysis, the authors convert the features to latent variables and then rank the most relevant ones for each class based on mutual information. The likelihood of a new sample falling into each class is calculated using multi-modal Gaussian distributions.

Alani et al. (2022) proposed an explainable Deep Learning-based system called DeepIIoT, intended to secure industrial IoT devices. The proposed approach achieved 99% accuracy on the testing portion of the WUSTL-IIOT-2021 dataset. They focus on industrial IoT, especially water treatment facilities, nuclear reactors, and power grids. However, the innovative intermediate communication system makes it network-independent, so the application boundary of the proposed IDS is not confined to industrial IoT only.

9. Performance comparison

This section compares the previously studied intrusion detection algorithms by considering the above performance metrics. The comparison results are summarized in Table 7. The table has eight columns: the first displays the reference paper, the second the accuracy (although some studies did not use this metric), the third the false alarm rate, and the fourth the AUC (area under the curve).
The precision metric is displayed in the fifth column, recall in the sixth, the F1 score in the seventh, and the false positive rate in the eighth and final column.

Table 7 compares the performance of the different methods proposed for intrusion detection in IIoT using different performance metrics. Some approaches adopted feature selection techniques while others did not. Our first remark is that most proposed approaches that employed deep learning instead of traditional machine learning obtained excellent results, which suggests that deep-learning models are preferred over traditional ML algorithms for tackling intrusion detection problems in IIoT. In addition, few methods used datasets designed for IoT or IIoT; most employed datasets from past years that were not designed for IIoT. Secondly, Table 7 shows that most IDSs adopted binary classification schemes or employed few classes during training, except Ferrag et al. (2022), which used up to 15 classes for the classification task. Thirdly, the proposed approaches did not consider scalability and privacy concerns. More specifically, most approaches are centralized, while few proposals follow the distributed pattern. Only Ferrag et al. (2022) and Tang et al. (2022) proposed distributed federated learning approaches for IDS in IIoT. Finally, it seems that, except for the dataset proposed by Ferrag et al. (2022), most datasets proposed for training IDSs in IoT or IIoT did not consider distributed learning algorithms. In other words, most datasets were proposed for training centralized learning models.

10. Research finding, open issues, and future outlook

From the literature reviews above, it can be concluded that there is an urgent need for IDS in IIoT networks.
Additionally, as the main finding, it is clear that ML/DL approaches will be used in the near future to detect intruders in real-world IIoT applications because they, and deep learning in particular, have demonstrated their ability to accurately detect intruder activities in IIoT network traffic. IIoT network use in industrial scenarios is regarded as essential. These applications may involve real-time situations where network latency and delay have a direct impact on their performance.

Table 7
Performance comparison of the IDSs based on ML/DL approaches.

| Ref. | Accuracy | False alarm rate | AUC | Precision | Recall | F1-Score | False positive rate |
|---|---|---|---|---|---|---|---|
| Kasongo (2021) | Binary: f3 = 87.61% (RF), f9 = 70.83% (LR); Multi: g5 = 77.67 (ET), g4 = 32.72 (NB) | – | 0.98 (RF, GB); 0.79 (LR) | Binary: f3 = 82.51% (RF), f9 = 68.83% (LR); Multi: g4 = 83.35 (ET), g3 = 52.69 (NB) | Binary: f2 = 98.69% (RF), f10 = 82.71% (LR); Multi: g5 = 77.64 (ET), g4 = 32.72 (NB) | Binary: f3 = 89.73% (RF), f9 = 76.43% (LR); Multi: g5 = 80.27 (ET), g4 = 44.35 (NB) | – |
| Liu et al. (2021) | 86.68% (PSO-LightGBM) | 10.62% | – | – | – | – | – |
| Zhou et al. (2020) | – | 0.117 (VLSTM); 0.43 (SSAE) | 0.895 (VLSTM); 0.731 (SSAE) | 86% (VLSTM); 73.1% (SSAE) | 99% (VLSTM); 95.6% (SSAE) | 90.7% (VLSTM); 83.2% (SSAE) | – |
| Gao et al. (2019) | NSL-KDD: 81.22% (I-ELM + A-PCA), 73.35% (BP); UNSW-NB15: 70.51% (I-ELM + A-PCA), 62.60% (BP) | NSL-KDD: 30.03% (I-ELM + A-PCA), 42.26% (SVM); UNSW-NB15: 26.04% (ELM), 39.72% (CNN) | – | – | – | – | – |
| Hanif et al. (2019) | 84.00% | – | – | – | – | – | 8% |
| Ketzaki et al. (2019) | 92.6% (Adapt AAO) | – | – | – | – | – | – |
| Almomani (2020) | 86.874% (GA-j48); 85.676% (GWO-j48); 86.037% (FFA-j48); 86.387% (GA-SVM); 84.485% (GWO-SVM); 85.429% (FFA-SVM) | – | j48: 86.874% (GA), 85.676% (GWO), 86.037% (FFA); SVM: 86.387% (GA), 84.485% (GWO), 85.429% (FFA) | – | – | j48: 21.164%, 20.952%, 22.592%; SVM: 22.270%, 22.931%, 22.592% | – |
| Nazir and Khan (2021) | 83.12% | – | – | – | – | – | 3.7% |
| Zong et al. (2018) | 85.78% | 15.64% | – | – | – | – | – |
| Ullah and Mahmoud (2017) | 100% (binary); 99.5% (multi-class) | – | – | 100% (binary); 99.5% (multi-class) | 100% (binary); 99.5% (multi-class) | 100% (binary); 99.5% (multi-class) | – |
| Alves et al. (2018) | – | – | – | – | – | – | – |
| He et al. (2017) | 95% | – | – | – | – | – | – |
| Potluri et al. (2017) | – | – | – | Normal: 95.33%; DoS: 98.21%; Probe: 87.16%; R2L: 23.58%; U2R: 29.26% | Normal: 92.89%; DoS: 97.80%; Probe: 83.87%; R2L: 83.91%; U2R: 53.84% | Normal: 92.98%; DoS: 96.83%; Probe: 82.66%; R2L: 36.82%; U2R: 35.82% | – |
| Keliris et al. (2016) | – | – | – | – | – | – | – |
| Eigner et al. (2016) | – | – | – | – | – | – | – |
| Siddavatam et al. (2017) | 99.81% (RF); 99.42% (DT) | – | – | – | – | – | – |
| Maglaras (2018) | 96.3% (all) | 2.5% (all) | – | – | – | – | – |
| Mantere et al. (2012) | – | – | – | – | – | – | – |
| Zolanvari et al. (2019) | 99.99% (RF); 99.98% (DT, KNN); 99.90% (LR); 99.64% (SVM, ANN); 97.48% (NB) | 0.00 (SVM); 0.01 (DT, KNN, RF, LR); 0.03 (ANN); 2.52 (NB) | – | – | – | – | – |
| Wang et al. (2021) | 0.864 ± 0.094 | – | – | – | 0.840 ± 0.090 | 0.845 ± 0.084 | – |
| Stewart et al. (2017) | 99% | – | – | – | – | – | – |
| Dong et al. (2018) | – | 0.0029 | – | – | – | – | – |
| Qiao et al. (2020) | 92.35% | – | – | – | – | – | – |
| Zhang et al. (2018) | 98.80% | – | – | – | 98.80% | 98.80% | – |
| Sarhan et al. (2021b) | All for RF. UNSW-NB15: 99.27%; NF-UNSW-NB15: 98.50%; ToN-IoT: 97.35%; NF-ToN-IoT: 99.38%; CSE-CIC-IDS2018: 98.01%; NF-CSE-CIC-IDS2018: 95.51% | All for RF. UNSW-NB15: 0.36%; NF-UNSW-NB15: 1.27%; ToN-IoT: 2.94%; NF-ToN-IoT: 0.42%; CSE-CIC-IDS2018: 1.43%; NF-CSE-CIC-IDS2018: 4.36% | All for RF. UNSW-NB15: 0.958; NF-UNSW-NB15: 0.962; ToN-IoT: 0.972; NF-ToN-IoT: 0.994; CSE-CIC-IDS2018: 0.966; NF-CSE-CIC-IDS2018: 0.951 | – | All for RF. UNSW-NB15: 91.95%; NF-UNSW-NB15: 93.71%; ToN-IoT: 97.36%; NF-ToN-IoT: 99.33%; CSE-CIC-IDS2018: 94.79%; NF-CSE-CIC-IDS2018: 94.61% | All for RF. UNSW-NB15: 0.92; NF-UNSW-NB15: 0.85; ToN-IoT: 0.99; NF-ToN-IoT: 1.00; CSE-CIC-IDS2018: 0.93; NF-CSE-CIC-IDS2018: 0.84 | – |
| Aljawarneh et al. (2018) | 99.81% (binary); 98.56% (multi-class) | – | – | – | – | – | – |
| Awotunde et al. (2021) | 99.0% (NSL-KDD); 98.9% (UNSW-NB15) | – | – | – | 99.0%; 99.9% | – | 1.0%; 1.1% |
| Yao et al. (2019) | 0.932 (LGB); 0.924 (RF); 0.926 (DT); 0.923 (LR); 0.884 (NB) | – | – | 0.999 (LGB); 0.996 (RF); 0.995 (DT); 0.994 (LR); 0.966 (NB) | 0.916 (LGB); 0.905 (RF); 0.912 (DT); 0.909 (LR); 0.885 (NB) | 0.956 (LGB); 0.95 (RF); 0.952 (DT); 0.95 (LR); 0.925 (NB) | – |
| Liang et al. (2021) | – | 0.05 (NSL-KDD); 0.01 (CIC-IDS) | – | – | 1.0 (NSL-KDD); 0.99 (CIC-IDS) | 0.98 (NSL-KDD); 0.98 (CIC-IDS) | – |
| Zolanvari et al. (2018) | 99.99 | 0.0 | – | – | – | – | – |
| Kalash et al. (2018) | 98.52% (Malimg); 99.97% (Microsoft) | – | – | – | – | – | – |
| Latif et al. (2020) | 99.54% | – | – | – | – | – | – |
| Abdel-Basset et al. (2020) | BoT-IoT: 98.10%; UNSW-NB15: 99.75% | – | 99.7%; 99.98% | – | – | 97.5%; 98.14% | – |
| Hassan et al. (2020) | 97.1% (binary); 98.4% (multi-class) | – | – | – | – | – | – |
| Li et al. (2020) | 86.95% (binary); 81.33% (multi-class) | – | – | – | – | – | – |
| Mendonça et al. (2021) | 99.89% | – | – | 98% | 98% | 98% | – |
| Al-Hawawreh et al. (2019) | – | – | – | 96.41% | 91.49% | – | 1.87% |
| Kasongo and Sun (2020) | 99.66% (binary); 99.77% (multi-class) | – | – | – | – | – | – |
| Raja et al. (2021) | 99.97% | 2.34% | – | 95.62% | 99.5% | 99.65% | – |
| Al-Hawawreh et al. (2022) | Binary: 99.54%; Multi-class: 99.49% | – | – | 99.54%; 98.27% | 99.54%; 97.20% | 99.54%; 97.27% | – |
| Zolanvari (2021) | 99.99% (RF); 99.98% (DT, KNN); 99.90% (LR); 99.64% (SVM, ANN); 97.48% (NB) | 2.52 (NB); 0.03 (ANN); 0.01 (DT, KNN, RF, LR); 0.00 (SVM) | – | – | – | – | – |
| Koroniotis et al. (2019) | 99.988% (SVM) | – | – | 99.998% (LSTM) | 100% (SVM) | – | – |
| Al-Hawawreh and Sitnikova (2019) | 92.81% | – | – | 99.47% | – | – | 13.9 |
| Al-Hawawreh et al. (2019) | – | – | – | 95.60% | 90.72% | – | 1.99 |
| Zhang et al. (2020) | 95.67% | – | – | – | – | – | – |
| Bhatia et al. (2019) | – | – | – | 99.9% (SVM) | 99.7% (SVM) | 99.8% (SVM) | – |
| Tang et al. (2022) | 97.3% | – | – | – | – | – | – |
| Khan et al. (2021) | 99.35% | – | – | – | – | – | – |
| Alani et al. (2022) | 99.94% | 0.032% | – | – | – | 0.9994 | 0.069% |
| Alsaedi et al. (2020) | Light-Motion: 59% (LSTM), 54% (KNN); Thermostat: 66% (LR, SVM, RF, LSTM, LDA, NB), 59% (CART); Weather: 87% (CART), 58% (LR); Binary: 88% (CART), 61% (LR, SVM); Multi: 77% (CART), 54% (NB) | – | – | Light-Motion: 35% (LSTM), 34% (KNN, SVM, NB, LR, LDA, RF, CART); Thermostat: 59% (RF), 44% (LR, NB, SVM, LDA); Weather: 88% (CART), 59% (LDA); Binary: 90% (CART), 37% (LR, SVM); Multi: 77% (CART), 37% (SVM) | Light-Motion: 59% (all); Thermostat: 67% (LSTM), 59% (CART); Weather: 87% (CART), 59% (LR); Binary: 88% (CART), 61% (LR, SVM); Multi: 77% (CART), 51% (NB) | Light-Motion: 44% (LSTM), 43% (the rest); Thermostat: 57% (CART, KNN), 53% (LR, NB, RF, LDA, SVM); Weather: 87% (CART), 53% (LR, LDA); Binary: 88% (CART), 46% (LR, SVM); Multi: 75% (CART), 46% (SVM) | – |
| Ferrag et al. (2022) | Centralized model, 15-class: 94.67% (DNN), 80.83% (RF), 79.18% (KNN), 77.61% (SVM), 67.11% (DT); 6-class: 96.01% (DNN), 85.62% (SVM), 83.39% (KNN), 82.90% (RF), 77.90% (DT); 2-class: 99.99% (DNN, RF, SVM, KNN), 99.98% (DT). Federated model: 71.42% when k = 15 for multi-classification; 91.74% in the 10th round of federated learning | – | – | Centralized model, multi-class: DNN: 100% (MITM), 99% (DDoS); SVM: 91% (Scanning), 91% (Injection). Normal class: 100% for all algorithms | Centralized model, multi-class: DNN: 100% (MITM), 97% (Malware), 94% (Scanning), 98% (DDoS), 67% (Injection). Normal class: 100% for all algorithms | Centralized model, multi-class: DNN: 97% (MITM), 64% (Malware); DT: 99% (MITM), 59% (Injection); RF: 100% (MITM), 73% (Malware). Normal class: 100% for all algorithms | Centralized model, normal class: 0.00 for all algorithms |
| Zolanvari et al. (2021) | LIME: 82.03% (WUSTL-IIoT), 94% (NSL-KDD), 94.86% (UNSW); MMG: 98.65% (WUSTL-IIoT), 98.08% (NSL-KDD), 93.76% (UNSW) | – | – | – | – | – | – |

11. Conclusion

In recent years, the Industrial Internet of Things (IIoT) has significantly improved our daily lives, businesses, and society. At the same time, hackers might leverage the IIoT’s enormous potential as a brand-new avenue to endanger the security and privacy of users. An IDS is therefore one of the most crucial security technologies, just as in traditional networks. In this paper, we have surveyed recent ML/DL-focused IDS research efforts in the literature. We selected research papers published between 2017 and 2022 in IEEE Xplore, MDPI Publisher of Open Access Journals, Science Direct, Springer, SCOPUS, ACM, Wiley, Web of Science, and Hindawi Publishing Corporation. Then, we proposed a taxonomy to classify them. More specifically, they are classified by placement strategy, detection method, threat type, validation type, IIoT use case, and ML approach. As the main finding, we observed that, until now, most research papers are based on a centralized placement strategy and that the prevailing detection method is anomaly-based detection.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

No data was used for the research described in the article.

References

Abdel-Basset, M., Chang, V., Hawash, H., Chakrabortty, R.K., Ryan, M., 2020. Deep-IFS: Intrusion detection approach for IIoT traffic in fog environment. IEEE Trans. Ind. Inform. Abdel-Basset, M., Imran, M., 2020. Special issue on Industrial Internet of Things for automotive industry-New directions, challenges and applications. Mech. Syst. Signal Process. 142, 106751. da Costa, K.A., Papa, J.P., Lisboa, C.O., Munoz, R., de Albuquerque, V.H.C., 2019. Internet of Things: A survey on machine learning-based intrusion detection approaches. Comput. Netw. 151, 147–157. Darwish, L.R., Farag, M.M., El-Wakad, M.T., 2020.
Towards reinforcing healthcare 4.0: A green real-time IIoT scheduling and nesting architecture for COVID-19 large-scale 3D printing tasks. IEEE Access 8, 213916–213927. Devare, A., Shelake, M., Vahadne, V., Kamble, P., Tamboli, B., 2016. A system for denial-of-service attack detection based on multivariate correlation analysis. Int. Res. J. Eng. Technol. (IRJET) 3 (04), 1917–1923. Ding, D., Han, Q.L., Ge, X., Wang, J., 2020. Secure state estimation and control of cyber-physical systems: A survey. IEEE Trans. Syst. Man. Cybern. Syst. 51 (1), 176–190. Dong, R.H., Wu, D.F., Zhang, Q.Y., Zhang, T., 2018. Traffic characteristic map-based intrusion detection model for industrial internet. Int. J. Netw. Secur. 20 (2), 359–370. Doshi, R., Apthorpe, N., Feamster, N., 2018. Machine learning ddos detection for consumer Internet of Things devices. In: 2018 IEEE Security and Privacy Workshops. SPW, IEEE, pp. 29–35. Dwivedi, S.K., Roy, P., Karda, C., Agrawal, S., Amin, R., 2021. Blockchain-based Internet of Things and industrial IoT: a comprehensive survey. Secur. Commun. Netw. 2021. Eigner, O., Kreimel, P., Tavolato, P., 2016. Detection of man-in-the-middle attacks on industrial control networks. In: 2016 International Conference on Software Security and Assurance. ICSSA, IEEE, pp. 64–69. ElMamy, S.B., Mrabet, H., Gharbi, H., Jemai, A., Trentesaux, D., 2020. A survey on the usage of blockchain technology for cyber-threats in the context of industry 4.0. Sustainability 12 (21), 9179. Elrawy, M.F., Awad, A.I., Hamed, H.F., 2018. Intrusion detection systems for IoT-based smart environments: a survey. J. Cloud Comput. 7 (1), 1–20. Esposito, C., Castiglione, A., Palmieri, F., De Santis, A., 2018. Integrity for an event notification within the industrial Internet of Things by using group signatures. IEEE Trans. Ind. Inform. 14 (8), 3669–3678. Fadlullah, Z.M., Tang, F., Mao, B., Kato, N., Akashi, O., Inoue, T., Mizutani, K., 2017. 
State-of-the-art deep learning: Evolving machine intelligence toward tomorrow’s intelligent network traffic control systems. IEEE Commun. Surv. Tutor. 19 (4), 2432–2455. Fahim, M., Sillitti, A., 2019. Anomaly detection, analysis and prediction techniques in iot environment: A systematic literature review. IEEE Access 7, 81664–81681. Ferrag, M.A., Friha, O., Hamouda, D., Maglaras, L., Janicke, H., 2022. Edge-IIoTset: A new comprehensive realistic cyber security dataset of IoT and IIoT applications for centralized and federated learning. IEEE Access 10, 40281–40306. Ferretti, L., Longo, F., Merlino, G., Colajanni, M., Puliafito, A., Tapas, N., 2021. Verifiable and auditable authorizations for smart industries and industrial internet-of-things. J. Inf. Secur. Appl. 59, 102848. Gao, J., Chai, S., Zhang, B., Xia, Y., 2019. Research on network intrusion detection based on incremental extreme learning machine and adaptive principal component analysis. Energies 12 (7), 1223. Gharib, A., Sharafaldin, I., Lashkari, A.H., Ghorbani, A.A., 2016. An evaluation framework for intrusion detection dataset. In: 2016 International Conference on Information Science and Security. ICISS, IEEE, pp. 1–6. Hajiheidari, S., Wakil, K., Badri, M., Navimipour, N.J., 2019. Intrusion detection systems in the Internet of Things: A comprehensive investigation. Comput. Netw. 160, 165–191. Han, G., Xiao, L., Poor, H.V., 2017. Two-dimensional anti-jamming communication based on deep reinforcement learning. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing. ICASSP, IEEE, pp. 2087–2091. Hanif, S., Ilyas, T., Zeeshan, M., 2019. Intrusion detection in IoT using artificial neural networks on UNSW-15 dataset. In: 2019 IEEE 16th International Conference on Smart Cities: Improving Quality of Life using ICT & IoT and AI (HONET-ICT). IEEE, pp. 152–156. Hasan, M.K., Ghazal, T.M., Saeed, R.A., Pandey, B., Gohel, H., Eshmawi, A., AbdelKhalek, S., Alkhassawneh, H.M., 2022. 
A review on security threats, vulnerabilities, and counter measures of 5G enabled Internet-of-Medical-Things. IET Commun. 16 (5), 421–432. Hasan, M., Islam, M.M., Zarif, M.I.I., Hashem, M., 2019. Attack and anomaly detection in IoT sensors in IoT sites using machine learning approaches. Internet Things 7, 100059. Hassan, M.M., Gumaei, A., Alsanad, A., Alrubaian, M., Fortino, G., 2020. A hybrid deep learning model for efficient intrusion detection in big data environment. Inform. Sci. 513, 386–396. He, Y., Mendis, G.J., Wei, J., 2017. Real-time detection of false data injection attacks in smart grid: A deep learning-based intelligent mechanism. IEEE Trans. Smart Grid 8 (5), 2505–2516. He, S., Ren, W., Zhu, T., Choo, K.K.R., 2019. BoSMoS: A blockchain-based status monitoring system for defending against unauthorized software updating in industrial Internet of Things. IEEE Internet Things J. 7 (2), 948–959. Hettich, S., 1999. Kdd cup 1999 data. In: The UCI KDD Archive. University of California, Department of Information and Computer Science. Jayalaxmi, P., Saha, R., Kumar, G., Kumar, N., Kim, T.H., 2021. A taxonomy of security issues in Industrial Internet-of-Things: scoping review for existing solutions, future implications, and research challenges. IEEE Access 9, 25344–25359. Abdelhafidh, M., Fourati, M., Fourati, L.C., Chouaya, A., et al., 2017. İnternet of things in industry 4.0 case study: fluid distribution monitoring system. In: CS & IT Conference Proceedings, Vol. 7, No. 15. Abosata, N., Al-Rubaye, S., Inalhan, G., Emmanouilidis, C., 2021. Internet of Things for system integrity: A comprehensive survey on security, attacks and countermeasures for industrial applications. Sensors 21 (11), 3654. Aburomman, A.A., Reaz, M.B.I., 2016. Ensemble of binary SVM classifiers based on PCA and LDA feature extraction for intrusion detection. In: 2016 IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference. IMCEC, IEEE, pp. 636–640. 
Al-Hawawreh, M., Sitnikova, E., 2019. Industrial Internet of Things based ransomware detection using stacked variational neural network. In: Proceedings of the 3rd International Conference on Big Data and Internet of Things. pp. 126–130.
Al-Hawawreh, M., Sitnikova, E., Aboutorab, N., 2022. X-IIoTID: A connectivity-agnostic and device-agnostic intrusion data set for Industrial Internet of Things. IEEE Internet Things J. 9 (5), 3962–3977. http://dx.doi.org/10.1109/JIOT.2021.3102056.
Al-Hawawreh, M., Sitnikova, E., Den Hartog, F., 2019. An efficient intrusion detection model for edge system in brownfield industrial internet of things. In: Proceedings of the 3rd International Conference on Big Data and Internet of Things. pp. 83–87.
Al-Jaroodi, J., Mohamed, N., Jawhar, I., 2018. A service-oriented middleware framework for manufacturing industry 4.0. ACM SIGBED Rev. 15 (5), 29–36.
Alani, M.M., Damiani, E., Ghosh, U., 2022. DeepIIoT: An explainable deep learning based intrusion detection system for industrial IoT. In: 2022 IEEE 42nd International Conference on Distributed Computing Systems Workshops. ICDCSW, IEEE, pp. 169–174.
Albettar, M., 2019. Evaluation and assessment of cyber security based on Niagara framework: a review. J. Cyber Secur. Technol. 3 (3), 125–136.
Aldawood, H., Skinner, G., 2020. Analysis and findings of social engineering industry experts explorative interviews: perspectives on measures, tools, and solutions. IEEE Access 8, 67321–67329.
Aljawarneh, S., Aldwairi, M., Yassein, M.B., 2018. Anomaly-based intrusion detection system through feature selection analysis and building hybrid efficient model. J. Comput. Sci. 25, 152–160.
Almomani, O., 2020. A feature selection model for network intrusion detection system based on PSO, GWO, FFA and GA algorithms. Symmetry 12 (6), 1046.
Alruwaili, F.F., 2021. Intrusion detection and prevention in Industrial IoT: A technological survey.
In: 2021 International Conference on Electrical, Computer, Communications and Mechatronics Engineering. ICECCME, IEEE, pp. 1–5.
Alsaedi, A., Moustafa, N., Tari, Z., Mahmood, A., Anwar, A., 2020. TON_IoT telemetry dataset: A new generation dataset of IoT and IIoT for data-driven intrusion detection systems. IEEE Access 8, 165130–165150.
Alsoufi, M., Razak, S., Siraj, M.M., Ali, A., Nasser, M., Abdo, S., et al., 2020. Anomaly intrusion detection systems in IoT using deep learning techniques: A survey. In: International Conference of Reliable Information and Communication Technology. Springer, pp. 659–675.
Alves, T., Das, R., Morris, T., 2018. Embedding encryption and machine learning intrusion prevention systems on programmable logic controllers. IEEE Embed. Syst. Lett. 10 (3), 99–102.
Awotunde, J.B., Chakraborty, C., Adeniyi, A.E., 2021. Intrusion detection in industrial internet of things network-based on deep learning model with rule-based feature selection. Wirel. Commun. Mob. Comput. 2021.
Bala, R., Nagpal, R., 2019. A review on KDD Cup99 and NSL-KDD dataset. Int. J. Adv. Res. Comput. Sci. 10 (2).
Bekri, W., Jmal, R., Chaari Fourati, L., 2020a. Internet of things management based on software defined networking: a survey. Int. J. Wirel. Inf. Netw. 27, 385–410.
Bekri, W., Jmal, R., Fourati, L.C., 2020b. Softwarized Internet of Things network monitoring. IEEE Syst. J. 15 (1), 826–834.
Bertino, E., Islam, N., 2017. Botnets and Internet of Things security. Computer 50 (2), 76–79.
Bhatia, R., Benno, S., Esteban, J., Lakshman, T., Grogan, J., 2019. Unsupervised machine learning for network-centric anomaly detection in IoT. In: Proceedings of the 3rd ACM CoNEXT Workshop on Big Data, Machine Learning and Artificial Intelligence for Data Communication Networks. pp. 42–48.
Booth, A., Sutton, A., Clowes, M., Martyn-St James, M., 2021. Systematic Approaches to a Successful Literature Review. Sage.
Borgiani, V., Moratori, P., Kazienko, J.F., Tubino, E.R., Quincozes, S.E., 2020. Toward a distributed approach for detection and mitigation of denial-of-service attacks within Industrial Internet of Things. IEEE Internet Things J. 8 (6), 4569–4578.
Boyes, H., Hallaq, B., Cunningham, J., Watson, T., 2018. The Industrial Internet of Things (IIoT): An analysis framework. Comput. Ind. 101, 1–12.
Butun, I., Almgren, M., Gulisano, V., Papatriantafilou, M., 2020. Intrusion detection in industrial networks via data streaming. In: Industrial IoT. Springer, pp. 213–238.
Capuano, N., Fenza, G., Loia, V., Stanzione, C., 2022. Explainable artificial intelligence in CyberSecurity: A survey. IEEE Access 10, 93575–93600.
Chavhan, S., Kulkarni, R.A., Zilpe, A.R., 2021. Smart sensors for IIoT in autonomous vehicles. In: Smart Sensors for Industrial Internet of Things. Springer, pp. 51–61.
Chen, G., Ng, W.S., 2017. An efficient authorization framework for securing industrial internet of things. In: TENCON 2017-2017 IEEE Region 10 Conference. IEEE, pp. 1219–1224.
Mullen, G., Meany, L., 2019. Assessment of buffer overflow based attacks on an IoT operating system. In: 2019 Global IoT Summit. GIoTS, IEEE, pp. 1–6.
Muna, A.H., Moustafa, N., Sitnikova, E., 2018. Identification of malicious activities in industrial Internet of Things based on deep learning models. J. Inf. Secur. Appl. 41, 1–11.
Nazir, A., Khan, R.A., 2021. A novel combinatorial optimization based feature selection method for network intrusion detection. Comput. Secur. 102, 102164.
Pal, S., Jadidi, Z., 2021. Analysis of security issues and countermeasures for the industrial internet of things. Appl. Sci. 11 (20), 9393.
Panchal, A.C., Khadse, V.M., Mahalle, P.N., 2018. Security issues in IIoT: A comprehensive survey of attacks on IIoT and its countermeasures. In: 2018 IEEE Global Conference on Wireless Computing and Networking. GCWCN, IEEE, pp.
124–130.
Piccialli, F., Bessis, N., Cambria, E., 2021. Industrial Internet of Things (IIoT): Where we are and what’s next. IEEE Trans. Ind. Inform.
Potluri, S., Henry, N.F., Diedrich, C., 2017. Evaluation of hybrid deep learning techniques for ensuring security in networked control systems. In: 2017 22nd IEEE International Conference on Emerging Technologies and Factory Automation. ETFA, IEEE, pp. 1–8.
Qiao, H., Blech, J.O., Chen, H., 2020. A machine learning based intrusion detection approach for industrial networks. In: 2020 IEEE International Conference on Industrial Technology. ICIT, IEEE, pp. 265–270.
Raja, K., Karthikeyan, K., Abilash, B., Dev, K., Raja, G., 2021. Deep learning based attack detection in IIoT using two-level intrusion detection system.
Rambus, 2022. Industrial IoT: Threats and countermeasures. URL: https://www.rambus.com/iot/industrial-iot/.
Rezaeibagha, F., Mu, Y., Huang, X., Yang, W., Huang, K., 2019. Fully secure lightweight certificateless signature scheme for IIoT. IEEE Access 7, 144433–144443.
Ribeiro, M.T., Singh, S., Guestrin, C., 2016. "Why should I trust you?" Explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. pp. 1135–1144.
Rubio, J.E., Alcaraz, C., Roman, R., Lopez, J., 2017a. Analysis of intrusion detection systems in industrial ecosystems. In: SECRYPT. pp. 116–128.
Rubio, J.E., Roman, R., Lopez, J., 2017b. Analysis of cybersecurity threats in Industry 4.0: the case of intrusion detection. In: International Conference on Critical Information Infrastructures Security. Springer, pp. 119–130.
Sarhan, M., Layeghy, S., Moustafa, N., Portmann, M., 2021a. Towards a standard feature set of NIDS datasets. arXiv preprint arXiv:2101.11315.
Sarhan, M., Layeghy, S., Portmann, M., 2021b. Feature analysis for ML-based IIoT intrusion detection. arXiv preprint arXiv:2108.12732.
Sassi, M.S.H., Jedidi, F.G., Fourati, L.C., 2019.
A new architecture for cognitive Internet of Things and big data. Procedia Comput. Sci. 159, 534–543.
Sengupta, J., Ruj, S., Bit, S.D., 2020. A comprehensive survey on attacks, security issues and blockchain solutions for IoT and IIoT. J. Netw. Comput. Appl. 149, 102481.
Sgandurra, D., Muñoz-González, L., Mohsen, R., Lupu, E.C., 2016. Automated dynamic analysis of ransomware: Benefits, limitations and use for detection. arXiv preprint arXiv:1609.03020.
Shahin, M., Chen, F.F., Hosseinzadeh, A., Bouzary, H., Rashidifar, R., 2022. A deep hybrid learning model for detection of cyber attacks in industrial IoT devices. Int. J. Adv. Manuf. Technol. 1–11.
Sharafaldin, I., Lashkari, A.H., Ghorbani, A.A., 2018a. Intrusion detection evaluation dataset (CIC-IDS2017). In: Proceedings of the Canadian Institute for Cybersecurity.
Sharafaldin, I., Lashkari, A.H., Ghorbani, A.A., 2018b. Toward generating a new intrusion detection dataset and intrusion traffic characterization. In: ICISSP, Vol. 1. pp. 108–116.
Shrivastava, R.K., Singh, S.P., Hasan, M.K., Islam, S., Abdullah, S., Aman, A.H.M., et al., 2022. Securing Internet of Things devices against code tampering attacks using return oriented programming. Comput. Commun. 193, 38–46.
Siddavatam, I.A., Satish, S., Mahesh, W., Kazi, F., 2017. An ensemble learning for anomaly identification in SCADA system. In: 2017 7th International Conference on Power Systems. ICPS, IEEE, pp. 457–462.
Singh, J., Gimekar, A., Venkatesan, S., 2019. An efficient lightweight authentication scheme for human-centered industrial Internet of Things. Int. J. Commun. Syst. e4189.
Singh, S., Saini, H.S., 2021. Learning-based security technique for selective forwarding attack in clustered WSN. Wirel. Pers. Commun. 118 (1), 789–814.
Smys, S., 2020. A survey on Internet of Things (IoT) based smart systems. J. ISMAC 2 (04), 181–189.
Stevenson, K., 2020. Healthcare cyber-attacks and the COVID-19 pandemic: An urgent threat to global health.
Stewart, B., Rosa, L., Maglaras, A., Cruz, T., Ferrag, M., Simoes, P., Janicke, H., 2017. A novel intrusion detection mechanism for SCADA systems that automatically adapts to changes in network topology. Ind. Netw. Intell. Syst. 1–12.
Su, T., Sun, H., Zhu, J., Wang, S., Li, Y., 2020. BAT: Deep learning methods on network intrusion detection using NSL-KDD dataset. IEEE Access 8, 29575–29585.
Tajalli, S.Z., Mardaneh, M., Taherian-Fard, E., Izadian, A., Kavousi-Fard, A., Dabbaghjamanesh, M., Niknam, T., 2020. DoS-resilient distributed optimal scheduling in a fog supporting IIoT-based smart microgrid. IEEE Trans. Ind. Appl. 56 (3), 2968–2977.
Tang, Z., Hu, H., Xu, C., 2022. A federated learning method for network intrusion detection. Concurr. Comput.: Pract. Exper. 34 (10), e6812.
Kalash, M., Rochan, M., Mohammed, N., Bruce, N.D., Wang, Y., Iqbal, F., 2018. Malware classification with deep convolutional neural networks. In: 2018 9th IFIP International Conference on New Technologies, Mobility and Security. NTMS, IEEE, pp. 1–5.
Kasongo, S.M., 2021. An advanced intrusion detection system for IIoT based on GA and tree based algorithms. IEEE Access 9, 113199–113212.
Kasongo, S.M., Sun, Y., 2020. A deep learning method with wrapper based feature extraction for wireless intrusion detection system. Comput. Secur. 92, 101752.
Keliris, A., Salehghaffari, H., Cairl, B., Krishnamurthy, P., Maniatakos, M., Khorrami, F., 2016. Machine learning-based defense against process-aware attacks on industrial control systems. In: 2016 IEEE International Test Conference. ITC, IEEE, pp. 1–10.
Ketzaki, E., Drosou, A., Papadopoulos, S., Tzovaras, D., 2019. A light-weighted ANN architecture for the classification of cyber-threats in modern communication networks. In: 2019 10th International Conference on Networks of the Future. NoF, IEEE, pp. 17–24.
Khan, I.A., Moustafa, N., Pi, D., Sallam, K.M., Zomaya, A.Y., Li, B., 2021.
A new explainable deep learning framework for cyber threat discovery in industrial IoT networks. IEEE Internet Things J. 9 (13), 11604–11613.
Koroniotis, N., Moustafa, N., Sitnikova, E., Turnbull, B., 2019. Towards the development of realistic botnet dataset in the Internet of Things for network forensic analytics: Bot-IoT dataset. Future Gener. Comput. Syst. 100, 779–796.
Krimmling, J., Peter, S., 2014. Integration and evaluation of intrusion detection for CoAP in smart city applications. In: 2014 IEEE Conference on Communications and Network Security. IEEE, pp. 73–78.
Kushner, D., 2013. The real story of Stuxnet. IEEE Spectr. 50 (3), 48–53.
Lalle, Y., Fourati, L.C., Fourati, M., Barraca, J.P., 2019. A comparative study of LoRaWAN, Sigfox, and NB-IoT for smart water grid. In: 2019 Global Information Infrastructure and Networking Symposium. GIIS, IEEE, pp. 1–6.
Lalle, Y., Fourati, L.C., Fourati, M., Barraca, J.P., 2020. LoRaWAN network capacity analysis for smart water grid. In: 2020 12th International Symposium on Communication Systems, Networks and Digital Signal Processing. CSNDSP, IEEE, pp. 1–6.
Lalle, Y., Fourati, M., Fourati, L.C., Barraca, J.P., 2021. A Hierarchical Clustering Federated Learning-Based Blockchain Scheme for Privacy-Preserving in Water Demand Prediction, Available at SSRN 4108575.
Lalle, Y., Fourati, M., Fourati, L.C., Barraca, J.P., 2021. Communication technologies for Smart Water Grid applications: Overview, opportunities, and research directions. Comput. Netw. 107940.
Lashkari, A.H., Draper-Gil, G., Mamun, M.S.I., Ghorbani, A.A., 2017. Characterization of Tor traffic using time based features. In: ICISSP. pp. 253–262.
Latif, S., Idrees, Z., Zou, Z., Ahmad, J., 2020. DRaNN: A deep random neural network model for intrusion detection in industrial IoT. In: 2020 International Conference on UK-China Emerging Technologies. UCET, IEEE, pp. 1–4.
Li, Y., Xu, Y., Liu, Z., Hou, H., Zheng, Y., Xin, Y., Zhao, Y., Cui, L., 2020.
Robust detection for network intrusion of industrial IoT based on multi-CNN fusion. Measurement 154, 107450.
Liang, W., Hu, Y., Zhou, X., Pan, Y., Kevin, I., Wang, K., 2021. Variational few-shot learning for microservice-oriented intrusion detection in distributed industrial IoT. IEEE Trans. Ind. Inform.
Lin, C., He, D., Huang, X., Choo, K.K.R., Vasilakos, A.V., 2018. BSeIn: A blockchain-based secure mutual authentication with fine-grained access control system for industry 4.0. J. Netw. Comput. Appl. 116, 42–52.
Liu, Z., Thapa, N., Shaver, A., Roy, K., Yuan, X., Khorsandroo, S., 2020. Anomaly detection on IoT network intrusion using machine learning. In: 2020 International Conference on Artificial Intelligence, Big Data, Computing and Data Communication Systems. IcABCD, IEEE, pp. 1–5.
Liu, J., Yang, D., Lian, M., Li, M., 2021. Research on intrusion detection based on particle swarm optimization in IoT. IEEE Access 9, 38254–38268.
Magán-Carrión, R., Urda, D., Díaz-Cano, I., Dorronsoro, B., 2020. Towards a reliable comparison and evaluation of network intrusion detection systems based on machine learning approaches. Appl. Sci. 10 (5), 1775.
Maglaras, L., 2018. Intrusion Detection in SCADA Systems Using Machine Learning Techniques (Ph.D. thesis). University of Huddersfield.
Mantere, M., Sailio, M., Noponen, S., 2012. Feature selection for machine learning based anomaly detection in industrial control system networks. In: 2012 IEEE International Conference on Green Computing and Communications. IEEE, pp. 771–774.
McMahan, B., Moore, E., Ramage, D., Hampson, S., y Arcas, B.A., 2017. Communication-efficient learning of deep networks from decentralized data. In: Artificial Intelligence and Statistics. PMLR, pp. 1273–1282.
Mendonça, R.V., Silva, J.C., Rosa, R.L., Saadi, M., Rodriguez, D.Z., Farouk, A., 2021. A lightweight intelligent intrusion detection system for industrial internet of things using deep learning algorithm. Expert Syst. e12917.
Mohammadi, M., Kavousi-Fard, A., Dabbaghjamanesh, M., Farughian, A., Khosravi, A., 2021. Effective management of energy internet in renewable hybrid microgrids: A secured data driven resilient architecture. IEEE Trans. Ind. Inform. 18 (3), 1896–1904.
Moustafa, N., Slay, J., 2015. UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In: 2015 Military Communications and Information Systems Conference. MilCIS, IEEE, pp. 1–6.
Zhou, X., Hu, Y., Liang, W., Ma, J., Jin, Q., 2020. Variational LSTM enhanced anomaly detection for industrial big data. IEEE Trans. Ind. Inform. 17 (5), 3469–3477.
Zhou, R., Zhang, X., Du, X., Wang, X., Yang, G., Guizani, M., 2018. File-centric multi-key aggregate keyword searchable encryption for Industrial Internet of Things. IEEE Trans. Ind. Inform. 14 (8), 3648–3658.
Zolanvari, M., 2021. Addressing Pragmatic Challenges in Utilizing AI for Security of Industrial IoT (Ph.D. thesis). Washington University in St. Louis.
Zolanvari, M., Teixeira, M.A., Gupta, L., Khan, K.M., Jain, R., 2019. Machine learning-based network vulnerability analysis of industrial internet of things. IEEE Internet Things J. 6 (4), 6822–6834.
Zolanvari, M., Teixeira, M.A., Jain, R., 2018. Effect of imbalanced datasets on security of industrial IoT using machine learning. In: 2018 IEEE International Conference on Intelligence and Security Informatics. ISI, IEEE, pp. 112–117.
Zolanvari, M., Yang, Z., Khan, K., Jain, R., Meskin, N., 2021. TRUST XAI: Model-agnostic explanations for AI with a case study on IIoT security. IEEE Internet Things J.
Zong, W., Chow, Y.-W., Susilo, W., 2018. A two-stage classifier approach for network intrusion detection. In: International Conference on Information Security Practice and Experience. Springer, pp. 329–340.
Tange, K., De Donno, M., Fafoutis, X., Dragoni, N., 2019.
Towards a systematic survey of industrial IoT security requirements: research method and quantitative analysis. In: Proceedings of the Workshop on Fog Computing and the IoT. pp. 56–63.
Tsai, F.S., 2009. Network intrusion detection using association rules. Int. J. Recent Trends Eng. 2 (2), 202.
Tsiknas, K., Taketzis, D., Demertzis, K., Skianis, C., 2021. Cyber threats to industrial IoT: A survey on attacks and countermeasures. Internet Things 2 (1), 163–188.
Ullah, I., Mahmoud, Q.H., 2017. A hybrid model for anomaly-based intrusion detection in SCADA networks. In: 2017 IEEE International Conference on Big Data. Big Data, IEEE, pp. 2160–2167.
Wang, Q., Dai, H.N., Wang, H., Xu, G., Sangaiah, A.K., 2019. UAV-enabled friendly jamming scheme to secure industrial Internet of Things. J. Commun. Netw. 21 (5), 481–490.
Wang, C., Wang, B., Sun, Y., Wei, Y., Wang, K., Zhang, H., Liu, H., 2021. Intrusion detection for industrial control systems based on open set artificial neural network. Secur. Commun. Netw. 2021.
Xiang, C., Lim, S., 2005. Design of multiple-level hybrid classifier for intrusion detection system. In: 2005 IEEE Workshop on Machine Learning for Signal Processing. IEEE, pp. 117–122.
Xiao, L., Li, Y., Han, G., Liu, G., Zhuang, W., 2016. PHY-layer spoofing detection with reinforcement learning in wireless networks. IEEE Trans. Veh. Technol. 65 (12), 10037–10047.
Xiao, L., Yan, Q., Lou, W., Chen, G., Hou, Y.T., 2013. Proximity-based security techniques for mobile users in wireless networks. IEEE Trans. Inf. Forensics Secur. 8 (12), 2089–2100.
Xu, H., Yu, W., Liu, X., Griffith, D., Golmie, N., 2020. On data integrity attacks against industrial Internet of Things. In: 2020 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress. DASC/PiCom/CBDCom/CyberSciTech, IEEE, pp. 21–28.
Yao, H., Gao, P., Zhang, P., Wang, J., Jiang, C., Lu, L., 2019. Hybrid intrusion detection system for edge-based IIoT relying on machine-learning-aided detection. IEEE Netw. 33 (5), 75–81.
Zhang, Z., Hamadi, H.A., Damiani, E., Yeun, C.Y., Taher, F., 2022. Explainable artificial intelligence applications in cyber security: State-of-the-art in research. arXiv preprint arXiv:2208.14937.
Zhang, X., Li, J., Zhang, D., Gao, J., Jiang, H., 2020. Research on feature selection for cyber attack detection in industrial internet of things. In: Proceedings of the 2020 International Conference on Cyberspace Innovation of Advanced Technologies. pp. 256–262.
Zhang, H., Wu, C.Q., Gao, S., Wang, Z., Xu, Y., Liu, Y., 2018. An effective deep learning based scheme for network intrusion detection. In: 2018 24th International Conference on Pattern Recognition. ICPR, IEEE, pp. 682–687.
Zhao, S., Li, W., Zia, T., Zomaya, A.Y., 2017. A dimension reduction model and classifier for anomaly-based intrusion detection in Internet of Things. In: 2017 IEEE 15th Intl Conf on Dependable, Autonomic and Secure Computing, 15th Intl Conf on Pervasive Intelligence and Computing, 3rd Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress. DASC/PiCom/DataCom/CyberSciTech, IEEE, pp. 836–843.

Lamia Chaari Fourati is a professor at the Computer Science and Multimedia Higher Institute, Sfax University, and a researcher at the Digital Research Center of Sfax (CRNS), Laboratory of Signals, systeMs, aRtificial Intelligence, neTworkS (SM@RTS). She has focused her research activities on the design and validation of new protocols and mechanisms for emerging network technologies.
Her research activities relate to digital telecommunication networks, in particular wireless access networks, sensor networks, vehicular networks, the Internet of Things, 5G, software-defined networks, information-centric networks, wireless body area networks, and unmanned aerial vehicles. In these areas, she is interested in problems such as quality-of-service provisioning (congestion control, admission control, resource allocation, etc.), cyber security, and ambient intelligence. Furthermore, she has focused her research on applications that positively impact society, such as the healthcare domain, through the design of innovative protocols and mechanisms for health monitoring, remote systems, and environmental control, as well as energy-saving approaches. Her scientific publications have attracted the interest of the scientific community, and her work has been published in very good journals and conferences. She has more than 70 papers published in journals and more than 120 papers published in conferences. Prof. Lamia Chaari Fourati is the laureate of the Kwame Nkrumah Regional Awards for Women 2016 (North Africa Region).