Journal of Parallel and Distributed Computing 162 (2022) 89–104

Machine learning and the Internet of Things security: Solutions and open challenges

Umer Farooq a, Noshina Tariq b, Muhammad Asim a, Thar Baker c,∗, Ahmed Al-Shamma’a c

a National University of Computer and Emerging Sciences, A. K. Barohi Road, H-11/4, Islamabad, Pakistan
b Shaheed Zulfikar Ali Bhutto Institute of Science and Technology, Street No. 09, Plot No. 67, Sector H-8/4, Islamabad, 44000, Pakistan
c University of Sharjah, P.O. Box 27272, Sharjah, United Arab Emirates

Article info

Article history: Received 2 April 2021; Received in revised form 2 August 2021; Accepted 19 January 2022; Available online 29 January 2022

Keywords: Internet of Things; Security; Machine learning

Abstract

The Internet of Things (IoT) has been a pervasively used technology for the last few years. IoT technologies also drive various everyday smart applications that improve the standard of living. However, the interconnection of IoT systems and the many elements involved in their deployment have raised new safety concerns. These systems generate and share a massive amount of sensitive data. Unfortunately, both the data and the devices are susceptible to many privacy and security challenges. Much research has been done to secure these infrastructures; among the proposed approaches, Machine Learning (ML) is reported to provide higher accuracy. This survey covers the major security issues and open challenges encountered by IoT infrastructures. It also encompasses an in-depth study and analysis of ML-based state-of-the-art solutions used in securing such domains. The security challenges and requirements in IoT-based systems are highlighted, along with a discussion on how ML supports security measures in the said domain.
Furthermore, the challenges associated with ML-based security solutions are identified with respect to IoT. An analysis of the constraints of prevailing ML security techniques is also presented.

© 2022 Elsevier Inc. All rights reserved.

* Corresponding author.
E-mail addresses: i181613@nu.edu.pk (U. Farooq), dr.noshina@szabist-isb.edu.pk (N. Tariq), muhammad.asim@nu.edu.pk (M. Asim), tshamsa@sharjah.ac.ae (T. Baker), alshammaa@sharjah.ac.ae (A. Al-Shamma’a).
https://doi.org/10.1016/j.jpdc.2022.01.015
0743-7315/© 2022 Elsevier Inc. All rights reserved.

1. Introduction

The Internet of Things (IoT) is a collection of smart devices that disseminate information through the Internet. These smart devices, deployed in different locations, sense and capture data. Potential IoT applications include smart cities, intelligent transportation systems, smart homes, earthquake detection, and smart grid systems, to name a few. Although IoT is among the most prominent emerging technologies of the last decade, with applications in almost all fields of life, security is still a limiting factor in many such application areas. A great deal of research has been carried out in the last decade to secure and safeguard IoT applications. Fernandes et al. [37] differentiated between the current security problems of IoT and conventional Information Technology (IT) security issues. This differentiation is made on the basis of the hardware, the software, and the protocols used. They argued that conventional security measures are not as beneficial for the IoT domain as they are for conventional IT. For instance, encryption-based solutions are utilized in IoT design at separate layers and protocols, with many encryption, decryption, and re-encryption sequences across the entire structure [28]. The design is more likely to be attacked because of these sequences. To prevent intrusions, back-to-back encryption must be performed, which is a burdensome process.
However, such solutions demand extensive computation, storage, and energy resources. Accordingly, constrained resources are the central issue in IoT, since they do not support complex and advanced security methods in IoT networks [115]. The IoT needs a cross-layer structure and enhanced procedures to solve security problems. IoT designs require improved cryptography and procedures to solve security issues because of complex mathematical hindrances [54]. However, the large number of designs in IoT creates more problems for security systems. Many security issues are complicated, with no specific remedy. For example, in certain security issues, like Distributed Denial of Services (DDoS) or intrusion, the chance of false positives makes the solutions fail. It also reduces user trust and hence lowers the efficiency of the solutions. Moreover, the enormous volume of collected information and its diversity is another factor that affects the functionality and accuracy of many security solutions. Such massive information is analyzed for behaviors, patterns, assessments, and predictions [82]. The diversity of IoT-generated information poses a further challenge to existing data-processing methods. Hence, improved procedures are required to handle the volume of IoT-generated information.

Based on the above discussion, Machine Learning (ML) is the most appropriate mathematical model for embedding intelligence in IoT [54]. It helps IoT devices infer necessary information from the system, the environment, or human-generated data. A smart device can then adjust its state or actions based on this information, which is critical for solving IoT problems. ML methods are used for classification, regression, and density estimation [75]. ML-based models are extensively used in many areas, for instance, IoT, computer vision, intrusion and malware identification, speech recognition, and authentication [6]. ML is a technique that performs computational work autonomously and intelligently. It needs to be designed and tested with the help of different methods. Some cases require early, predictive decisions before an event actually happens; for example, a fire should be predicted before it breaks out. That is possible only by combining IoT with ML-incorporated security measures. Moreover, the current security-related dilemmas of IoT systems need to be explored to make these systems tamper-proof. The use of ML requires an efficient process for the computation and storage of massive data [65]. For example, Reference [52] reviewed specific security-related issues that arise while integrating ML techniques into smart grid systems. Some intrusion detection mechanisms and their effects have also been addressed in [23] and [11].

We compare the latest state-of-the-art surveys with ours. In this regard, we selected only those surveys that focus particularly on ML-based security solutions for IoT. Table 1 summarizes the state of the art in this regard. No doubt, these surveys provide in-depth knowledge of their claimed subject matter. However, this paper aims to find information concerning the security challenges and threats that disrupt IoT applications in resource-constrained IoT infrastructures. It covers IoT with respect to security-limiting factors, security challenges, and the threats faced in such infrastructures. To the best of our knowledge, this is the only survey that incorporates an exhaustive study of ML-based security solutions along with a critical analysis of the limitations they impose on IoT devices, especially resource-constrained ones. We analyze many research models concerning the major threats and present the overheads they may impose on IoT devices in terms of memory, computation, and energy. (Table 2 presents the nomenclature used in this paper.)

Table 1
State-of-the-art.

Reference | Year | Subject matter | Resource analysis
Proposed | 2021 | ML-based security solutions’ analysis and open challenges with respect to IoT resource analysis | ✓
[54] | 2020 | Focuses on ML-based security and privacy solutions and challenges | ×
[82] | 2020 | ML-, AI-, and blockchain-based security solutions discussed for layer-wise issues | ×
[6] | 2020 | ML- and DL-based security solutions and challenges | ×
[23] | 2019 | ML-based security solutions for Network Intrusion Detection systems (NIDSs) | ×
[10] | 2020 | ML-based privacy management | ×
[126] | 2020 | Addressing confidentiality and security issues using ML and BC-based models in the IoT domain | ×
[79] | 2020 | ML-based security solutions and open challenges with respect to IoT layers | ×
[43] | 2019 | ML-based security solutions with respect to three-layer IoT architecture and requirements | ×
[50] | 2019 | Blockchain-, edge-, fog-, and ML-based solution analysis with respect to IoT layers | ×

Table 2
Nomenclature.

Abbreviation | Term
AR | Auto-Regressive
ANN | Artificial Neural Network
ADCs | Application Delivery Controllers
BM | Boltzmann Machine
CNN | Convolutional Neural Networks
DT | Decision Tree
DL | Deep Learning
DoS | Denial-of-Services
DBN | Deep Belief Networks
DRL | Deep Reinforcement Learning
DRNN | Dense Random Neural Networks
DDoS | Distributed-Denial-of-Services
D2D | Device-to-Device Communication
ELM | Extreme Learning Machine
FCM | Fuzzy C-Means
FE | Feature Extraction
FDN | Feedforward Deep Networks
GAN | Generative Adversarial Networks
IP | Internet Protocol
IoT | Internet of Things
IDS | Intrusion Detection System
ICNs | Information-Centric Networks
IoBT | Internet of Battlefield Things
LR | Logistic Regression
LSTM | Long Short-Term Memory
ML | Machine Learning
MLP | Multi-Layer Perceptron
MCA | Multivariate Correlation Analysis
NIDSs | Network Intrusion Detection systems
NB | Naïve Bayes
NN | Neural Networks
OWASP | Open Web-Application Security Project
RF | Random Forest
RL | Reinforcement Learning
RNN | Random Neural Networks
RBF | Radial Basis Function
RaNN | Random Neural Networks
RFID | Radio Frequency IDentification
SVM | Support Vector Machine
SBL | Sequence Based Learning
SVR | Support Vector Regression
SQL | Structured Query Language
SDN | Software Defined Networking
SECaaS | SECurity as a Service
SSL/TLS | Secure Sockets Layer/Transport Layer Security

The main contributions of this paper are as follows:

1. Highlighting security challenges and requirements in IoT-based systems.
2. Discussing ML as a good security measure for the IoT domain and the challenges it poses for IoT.
3. Analyzing the constraints of the prevailing ML security techniques proposed for IoT domain security.

The paper’s organization is as follows: Section 2 discusses the details of the Internet of Things and its prominent features.
Section 3 details the security challenges faced in IoT deployments. Sections 4 and 5 give insight into machine learning as a security solution in IoT frameworks. Section 6 discusses the potential limitations faced while applying ML solutions to the said domain. An analysis of ML-based security solutions in IoT infrastructures is presented in Section 7. Section 8 details the implications of the findings for the future of IoT security, and, lastly, the conclusion and recommendations for future research are given in Section 9.

2. Internet of things

IoT is known as inter-connected and embedded systems that communicate using wireless or wired technologies [54]. It is often considered a network of tangible things enhanced by computation, storage, communication resources, and network connectivity, and embedded with electronics (i.e., sensors and actuators). Some of the IoT entities are shown in Fig. 1. Most importantly, software is also embedded in them, which allows such entities to capture, often analyze, and exchange data. IoT refers to ‘things’ from our everyday lives, ranging from smart household gadgets, such as smart bulbs, smart switches, smart meters, smart refrigerators, smart ovens, air conditioners, temperature sensors, smoke detectors, and Internet Protocol (IP) cameras, to more advanced devices like Radio Frequency Identification (RFID) tags, monitors, sensors used in parking areas, and a variety of other sensory devices [53].

Fig. 1. Internet of Things.

With the advancement of technology in society, new options have emerged that improve living, deliver more efficient services, and automate manufacturing processes. The concept of “smart” has been elevated to the epicenter of ongoing technical advancements. Indeed, IoT technologies are now regarded as the backbone of the fourth industrial revolution, owing to their enormous potential for innovation and societal benefits. They provide a whole new perspective on the advancement of numerous areas, including engineering [140], agriculture [36], and medicine [100], as well as hitherto unexplored domains. Certain application areas for IoT technologies are still unknown, or it is unclear how to approach them, indicating that more intensive study should be undertaken in this challenging field to uncover new and significant potential advantages for society.

Owing to exponential growth, the number of IoT devices has surpassed the human population. Their sensing and actuating capabilities make IoT systems applicable to virtually any real-world application [115]. These devices produce huge volumes of sensitive and valuable data. It is, therefore, a big challenge to compute and process this volume of data. The future significance and value of IoT technologies are abundantly evident, and their growth is currently accelerating. By 2026, the number of interconnected IoT devices is projected to reach 26.9 billion, an annual growth rate of 13% [24]. For consumer-oriented applications, smart monitoring of IoT devices is critical. While AI may interconnect IoT devices, monitoring, managing, and securing the growing connectivity in real time is still a major concern. According to one estimate, the IoT could generate an economic impact of $3,900–$11,100 billion annually by 2025, owing to lower hardware costs, cloud storage, advanced processing, lower connectivity costs, and increased speed, all of which contribute to increased connectivity of devices to the Internet [31]. It is also estimated that IoT may produce between 4% and 11% of global GDP in 2025, and it is argued that both users and enterprises will be influenced by emerging IoT technologies. However, because IoT systems operate within certain constraints, they face additional protection challenges for both devices and applications [113]. Furthermore, the IoT provides a broad spectrum of software and services, from critical infrastructure to home appliances, personal healthcare, agriculture, and the military [114]. The domains served by IoT include energy, medical, building management, manufacturing, retail, and transport, to mention a few. The enormous size of IoT networks has brought new issues, such as managing these devices, managing the large quantities of data they generate, secure communication, storage, protection, computation, and privacy. Extensive work has been conducted to review these various IoT aspects, such as protocols, architecture, communication, software, protection, and privacy [113]. However, the basis of marketing IoT technology is the assurance of protection, privacy, and customer satisfaction. IoT networks utilize different supporting technologies, for instance, Software Defined Networking (SDN), fog, and cloud computing, which further enlarge the attack surface available to attackers.

IoT devices generate voluminous data streams; therefore, conventional methods of data collection, processing, and storage might not perform efficiently for these streams. Such high-volume data streams can be analyzed to extract trends, behaviors, forecasts, and evaluations. Additionally, the complexity of IoT-produced data poses a further challenge to existing data-processing methods. Thus, leveraging the valuable information in IoT-produced data requires new frameworks and strategies. From this point of view, ML is regarded as one of the essential computational models capable of providing embedded intelligence in IoT devices [75]. ML helps machines and smart devices infer useful information from system-generated or human-generated data, so that smart devices can change or automate their situation or actions based on this knowledge, which is considered an integral part of an IoT solution. It is also used for regression, classification, and density estimation. As mentioned earlier, ML algorithms and techniques are used in numerous domains, such as fraud detection, computer vision, speech recognition, bioinformatics, malware detection, and authentication [54]. Similarly, ML may be utilized in IoT to provide intelligent services. This paper’s scope revolves around discussing ML applications for the provision of security in IoT networks.

2.1. Characteristics of the IoT network

This section addresses a few special features of the IoT network (as depicted in Fig. 2), which are defined below:

– Heterogeneity: The different devices interacting with each other in an IoT network have different characteristics, capabilities, and communication protocols. More precisely, the systems may use different connectivity protocols and networking paradigms, such as cellular or Ethernet, with different hardware resources [113].
– Plethora of IoT devices: According to an estimate, billions of IoT devices interact through the Internet, which exceeds the capabilities of the existing Internet. The massive-scale implementation of IoT has brought new challenges, such as the need for new storage and networking architectures, data communication protocols, standardization of the technology, and interface design for the proactive detection and protection of IoT infrastructure, just to name a few [1].
– Inter-connectivity: IoT devices are connected to the global infrastructure to communicate and access any information at any time and place. This connectivity depends on the types of applications and services. Local connectivity is required in some cases, for example, connected car technology or a sensor swarm. In other cases, global connectivity is required, for example, to access smart homes through mobile networks and to manage critical infrastructures [114].
– Close proximity and D2D communication: Another significant aspect of IoT is direct contact without involving central administration, for instance, base stations. It uses Device-to-Device Communication (D2D), which provides point-to-point contact. The traditional Internet architecture was more focused on network-centric communication. However, the recent decoupling of services and networks also allows device- and content-centric communication, resulting in an enhanced spectrum of IoT [52].
– Reliable and low-latency communication: It is worth mentioning that real-time operations, such as technical automation, remote surgery, and intelligent transportation systems, require IoT networks to provide reliable and low-latency communication [52].
– Low-power and low-cost: The massive connectivity of IoT devices relies on low-power and low-cost solutions [115,117].
– Self-healing and self-organization characteristics: These are important requirements for critical and time-sensitive IoT communication, such as delay-sensitive situations. In these scenarios, dependence on the network infrastructure is not possible; therefore, there is a dire need for self-organized networks [52].
– Dynamism: IoT involves a huge number of devices, which need to be managed proficiently. These devices need energy to operate, and their wake-up/sleep times depend on the application. To perform these tasks and communicate with other devices, they use the Internet. This dynamic behavior must also be accommodated in an IoT network [115,122].
– Safety: Along with the other factors, safety is one of the most important factors and has much significance for the proper functioning of IoT systems. It is equally important for clients and devices because these devices are inter-connected. They use the Internet for communication, which may get compromised and may affect the personal data present on these devices [114].
– Intelligence: This is the most interesting characteristic of IoT devices. It is the ability by which timely decisions and informed opinions are formed. Data generated by IoT devices must be handled so as to extract sound inferences from it and to execute actions based on the resulting decisions [52].

Fig. 2. Internet of Things characteristics.

3. Security challenges in the deployment of IoT

The IoT ecosystem consists of many things around us that are interconnected and linked to the Internet to provide a plethora of services and an improved lifestyle. The performance of everyday activities may be improved in this way. These systems are accessible globally; however, they mostly comprise constrained devices connected through lossy links [1]. The growing connectivity and computing capacity of such devices naturally increase the associated vulnerabilities (i.e., in hardware, firmware, and communications), which can raise the likelihood of exploitation [12]. Ideally, developers and designers must build safety into software and hardware at the time of inception rather than after damage has occurred [119]. Considering this, it is critical to have a clear and exact list of attack vectors in order to quickly formulate a plan for better responding to the growing risks affecting the IoT ecosystem as a whole. This may ensure that exploitable flaws in a generic architecture are discovered and that specific actions are implemented to avoid or reduce the probability of an attack. However, this is not practical due to rapid advancements in technology and attacks. Hence, it is difficult to protect the large attack surface of an IoT system. To address this issue, IoT systems must be thoroughly scrutinized. However, IoT devices are likely to operate unattended. As a result, an attacker can physically threaten these devices. IoT appliances are typically linked over wireless networks, where an attacker can retrieve confidential data from a communication link by eavesdropping. Attacks must be anticipated in IoT applications such as smart healthcare, since this environment serves the community and relies on various technologies and sensors. It often consists of a number of insecure devices, frameworks, and services that interact over insecure media via insecure protocols, for example, FTP, HTTP, and telnet [118]. An attack can take various forms, including network attacks that monitor unencrypted traffic in search of sensitive data and passive attacks that monitor vulnerable network communications to decrypt weakly encrypted traffic and obtain authentication data. Attackers may exploit loopholes to obtain access to systems and steal sensitive information [78]. Additionally, unauthorized individuals may damage devices and disrupt service effectiveness.

Unfortunately, designing, developing, and implementing IoT applications involve several limitations and difficulties, including scarce resources, device compatibility, diversity, and security. Moreover, most manufacturers try to speed up their product development and leave safety aside [33], [61]. This can lead to various vulnerabilities in such ecosystems, for example, accidental hardware and software backdoors. Moreover, IoT devices cannot support complex protection mechanisms due to their restricted computation and power resources [4]. With security as a primary concern, IoT systems must continually adapt and perform accurately and reliably, specifically in settings where there is a threat (e.g., in health systems [103]). IoT environments also introduce new attack surfaces, which arise from the interrelated and interlinked nature of the IoT. The standard methods used to secure conventional systems are unsuitable; therefore, the safety threat is greater in IoT systems than in other systems [16]. Likewise, they have unfortunately inherited all the vulnerabilities associated with the Internet [98]. Moreover, certain IoT devices are designated as security-critical, which means that their failure may result in irreparable and cascading damage to society [73].

The primary activities in an IoT application may encompass data collection, storage, processing, and exchange, each of which needs to be secured. Traditionally, an IoT application comprises a specific environment in which devices (i.e., sensors or actuators) are installed to carry out certain tasks and are connected to the cloud via a network technology such as Low-Power Wide Area Networks (LPWANs) [119]. In the cloud, the data is aggregated and processed for analysis in the corporate network. The introduction of the cloud also adds to the attack vectors. Furthermore, most IoT devices have a basic design based on the idea that they can be controlled remotely and connected to a third party, and, due to time-to-market pressure, protection and reliability may be overlooked. Therefore, in addition to the lowest tiers of hardware and firmware, higher layers such as frameworks and software must also be made secure. Many IoT devices do not allow firmware/software upgrades, making them highly susceptible to vulnerabilities and attacks [107]. Likewise, services, objects, and data must be protected during transmission and storage. On account of this, the existing security concepts for wireless and information networks must be adapted to sustain efficient IoT security procedures.
As these systems are prone to attacks, applying current protective measures, such as encryption, authentication mechanisms, access control lists, network protection, and application safety, is a daunting task for large systems with numerous linked devices. One example is ‘Mirai,’ a malware that launched serious Distributed-Denial-of-Services (DDoS) attacks by exploiting IoT devices [64]. Present security measures need to be refined to safeguard the IoT network. Nevertheless, a security measure that counters one particular risk can be circumvented when attackers devise new attacks that bypass current solutions. For instance, improved DDoS attacks use spoofed IP addresses, so the defenders cannot identify where the attack originates. As a result, more harmful and intricate attacks than Mirai are expected, as IoT systems are more likely to be affected by attacks. Hence, finding the best solutions to secure IoT systems is difficult due to the full range of IoT applications and options.

Current security solutions, particularly those based on cryptography, generally require many resources (e.g., computation, storage, and energy). This places a considerable burden on the computing resources of an IoT device and contributes to a significant degradation of secured communications and services [110]. To provide fast and adequate resources and services, present research and business initiatives encourage hardware-accelerated cryptography [48], SECurity as a Service (SECaaS) [141], [17], Application Delivery Controllers (ADCs) [41], and Secure Sockets Layer/Transport Layer Security (SSL/TLS) acceleration [84]. Most of these approaches have proven successful in enhancing and securing communications in their designated areas. Technical innovations, such as SDN, IoT, OpenFlow, and Information-Centric Networks (ICNs), have advanced mobile development and regional communication networks [34].
These advancements have led to robust and budget-friendly production and manufacturing processes for a broad range of devices. However, these developments have made it difficult to manage secure interactions, particularly in resource-constrained IoT infrastructures [46]. For example, a sensor or an RFID tag has too few computational, storage, and energy resources to carry out complex security-related computations. Hence, the resource limitations of such IoT devices must be considered while designing security measures. The following properties should be taken into account to achieve effective security. The attacks [113] that are more likely to affect the important security requirements (authentication, integrity, confidentiality, availability, non-repudiation, and authorization) are shown in Fig. 3.

Fig. 3. Internet of Things Security requirements.

– Confidentiality: Confidentiality is one of the most important security requirements of an IoT system. Confidential data stored in IoT devices should not be disclosed to unauthorized persons [113].
– Integrity: The information from IoT devices is transmitted via a wireless connection, and only authorized individuals should be able to modify it. This property is fundamental to detecting any change made while communicating over an unprotected wireless network [1].
– Authentication: Before any further operation is carried out, the identity of the communicating party must be verified. However, verification details vary from system to system because of the attributes of IoT systems [6].
– Authorization: Authorization means that users are permitted to access an IoT system, for example, a sensor device. The end-users can be persons, devices, or services. For instance, the information that a sensor collects must only be transmitted to, and be accessible by, legitimate users, that is, authorized objects and service requesters [1].
– Availability: The services delivered by IoT systems must be available to legitimate users.
Availability is the leading property for an effective deployment of IoT systems [6].
– Non-repudiation: This property provides access logs that act as proof in cases where objects or users must not be able to deny an action [113].

4. Machine learning: providing IoT security solution

Machine learning is an intelligent method used to optimize performance criteria by learning from current information or past experience. More precisely, ML algorithms are built on mathematical approaches for working on massive data. ML also gives smart devices the capability to learn without being explicitly programmed. These models are considered a basis for predicting upcoming trends in an input data stream. Fig. 4 illustrates a generic view of ML utilization for IoT-generated data. It shows that the generated data may be structured, semi-structured, or unstructured, and can be passed to an ML model directly from the devices to be processed and analyzed. ML is multidisciplinary in nature, with origins in different areas of engineering and science, including optimization theory, artificial intelligence, cognitive sciences, and information theory, to name a few [94]. It is used in situations where human expertise either does not exist or cannot be used, such as navigating a hostile region where humans cannot apply their skills, for example, in robotics and speech recognition [75]. It is also applicable in situations where the solution to a particular problem changes over time, such as routing in a computer network or detecting malicious code in an application or software. Besides, it is deployed in practical smart systems [52]; for example, Google uses ML to analyze threats against mobile endpoints and Android-based applications.
ML is applied to identify and eliminate malware from infected handsets. Similarly, the Macie service, provided by Amazon, uses machine learning to sort and categorize the data residing in its cloud storage service. Like other technologies, ML also has some reliability and accuracy issues, as it may give false positives and false negatives. Therefore, this technique needs adjustment and proper guidance to make precise predictions. In contrast, DL [135], a promising subfield of ML, can determine the accuracy of its predictions by itself. Owing to this self-service nature, such models are considered appropriate for prediction and classification in IoT applications with contextual and customized assistance.

Conventional methods are broadly utilized for different aspects of IoT, such as protocols, applications, data aggregation, architecture, services, resource allocation, analytics, and security; however, the large-scale operation of IoT demands robust, intelligent, and reliable techniques. From this point of view, ML strategies are promising solutions for IoT networks. ML algorithms improve the intelligence of IoT networks by mining the immense data streams generated by these networks. Additionally, the value of the data gathered by IoT devices is better exploited by incorporating ML paradigms, which help to make knowledgeable and proactive intelligent decisions. ML-based solutions are fundamentally utilized for security, malware analysis, privacy, and attack detection [52]. However, their usage in IoT applications involves many complex challenges. For example, it is challenging to build a generic model that caters to the diversity of data from IoT applications. Likewise, labeling the input stream efficiently is another challenging job. A critical point in this respect is to reduce the number of labeled instances required during the learning phase.
Executing these methods on resource-constrained IoT devices gives rise to further challenges, as the processing and storage overhead must be kept low [103]. In addition, transportation and other real-time applications cannot tolerate the irregularities that arise from executing ML algorithms. Given this scenario, it is crucial to systematically evaluate the IoT security solutions that leverage ML.

5. ML-based security

This section discusses state-of-the-art algorithms from the machine learning field and presents how these algorithms are applied in IoT applications (Table 3).

Fig. 4. A generic illustration of ML-based model in IoT.

Table 3 ML-based security applications in IoT.
Reference: [65] [52] [23] [11] [97] [19] [95] [49] [80] [47]
Specificity: Mobile IoT Smart Grid IoT Smart Home IoT IoT-connected home Fog, IoT IoT IoT IoT sites
Dealt-with: Security monitoring Security survey Intrusion detection Intrusion detection Authentication Network attacks on gateways Real time network attacks Darknet traffic analysis, malware Identification, Authentication Anomaly detection
Solutions: Big data processing, ML Big data, ML ML supervised ML SDN, ML Dense Random Neural Networks (DRNN) ELM-based Semi-supervised Fuzzy C-Means (ESFCM) Association rule learning Neural network-based emitter Support Vector Machine (SVM), Artificial Neural Network (ANN), Logistic Regression (LR), Random Forest (RF), and Decision Tree (DT)

5.1. Primary machine learning techniques

ML algorithms may broadly be arranged into four groups: supervised, semi-supervised, unsupervised, and reinforcement learning algorithms:
– Supervised learning: This technique is used when pre-defined targets are available for the given sets of inputs. The data is first labeled, and the model is then trained on that data. The model derives rules from the available data set to define classes and finally assigns the elements to those classes [11].
– Unsupervised learning: In this process, the environment presents only the input instances, without any targets. This learning method does not require labeled data; it finds similarities in the unlabeled data and categorizes it into different groups [22].
These two learning techniques, supervised and unsupervised, rely on data analysis, whereas reinforcement models are used in decision-making and comparisons. Therefore, the choice of ML model depends on the type of data and the desired output. For instance, when the data samples and desired outputs are known in advance, supervised learning is used, and the system is trained to map the inputs to the desired outputs. Regression and classification are examples of supervised learning. The difference between them is that classification gives discrete outputs, whereas regression gives continuous outputs. Support Vector Regression (SVR), Polynomial Regression (PR), and Logistic Regression (LR) are among the well-known regression models. Similarly, k-Nearest Neighbor (KNN), LR, and Support Vector Machine (SVM) are among the widely used classification methods [47].
Fortunately, models such as Neural Networks (NN) perform equally well in regression and classification [80]. In some cases, the output is not adequately defined; therefore, the algorithm has to determine structure from raw data, and unsupervised learning methods are used to train the system. Unsupervised learning techniques, such as K-means clustering, group objects that satisfy a similarity criterion. The accuracy of predictive analytics depends on how well the ML model is trained on the available data, as training enables it to develop models for future use. SVR, NN [80], and Naive Bayes (NB) algorithms [129] are used for predictive modeling.
– Semi-supervised learning: Semi-supervised learning lies between the two learning paradigms defined above, which require labels either for all of the data or for none of it. Labeling data is costly, as it requires skilled human intervention. Semi-supervised methods therefore offer advantages for designing models, since the majority of the available data is unlabeled [143].
– Reinforcement Learning (RL): This process does not require specific outcomes; instead, it learns from the feedback obtained by interacting with the environment. The agent takes actions and makes choices based on the outcomes. Humans and animals learn in a similar way. This ability makes RL a practical choice in robotics, as robots have to make decisions for performing tasks without pre-defined programming [96]. The main requirement of this technique is a suitable reward function, because it determines how well the learning technique can perform [130].
– Deep Learning (DL): This sub-field of machine learning has its roots in the well-known brain-inspired Artificial NN (ANN). An ANN comprises a group of interconnected neurons, such that the output of one neuron serves as input to the next neuron.
To obtain knowledge, supervised or unsupervised learning methods are combined with a simple NN or a deep NN. The term “deep” refers to the number of layers that extend the network in depth [134], [68]. DL is used for distributed processing and interpretation of extensive raw data, such as unlabeled and uncategorized data. DL methods can be applied in ML applications such as speech recognition, Natural Language Processing (NLP), and computer vision, where they produce improved classification and data samples. Moreover, DL helps with data recovery and compression in the temporal and spatial domains, owing to its efficiency in extracting patterns and features from large volumes of time-dependent data. Several DL models are used in this context, for instance, Convolutional Neural Networks (CNN), Generative Adversarial Networks (GAN), Deep Belief Networks (DBN), Recurrent Neural Networks (RNN), Boltzmann Machines (BM), Feed-forward Deep Networks (FDN), and Long Short-Term Memory (LSTM). The most commonly used deep learning architectures are CNN (for spatially distributed data) and RNN (for time-series data).
– Deep Reinforcement Learning (DRL): As described earlier, DL is a subset of ML that helps solve classification, function approximation, and prediction problems. RL, in turn, is an ML technique for decision making in which a software agent learns by interacting with the environment. When the multi-dimensional data is extensive in a stable environment, neither technique performs well on its own, so DL and RL are used together. By combining these techniques, the agents can perform well and obtain the best possible rewards. During execution, RL relies on DL to identify the optimal policy, and DL finds the best approximation of the action values in order to select the best action for a particular state.
DL can learn by observing intricate patterns; however, it can misclassify them. RL helps here and ensures better classification, as it does not require feature classification [76], [86]. DRL thus combines the powerful capability of RL with the perceptiveness of DL.

5.2. ML and IoT security

The following paragraphs elaborate on ML algorithms related to privacy and protection in IoT networks. Among the many applications, we consider malware analysis, authentication, attack detection and mitigation, anomaly and intrusion detection, and DDoS attacks. Supervised algorithms are preferred in IoT because of their ability to work with labeled data. They perform tasks such as channel estimation, localization, adaptive filtering, security, and spectrum sensing. This class includes both classification and regression techniques. Supervised machine learning enables prediction as well as modeling of the available data sets, while regression in particular predicts numeric variables with continuous values. SVM, DT, NB, and RF are some of the most frequently used learners. SVMs use the kernel mechanism to find the separation between points of different classes and can model non-linear decision boundaries [47]. Nevertheless, this approach is memory-intensive, and selecting the right kernel for modeling large data sets can be difficult. For this reason, RF is preferred over SVM [47]. NB, in turn, is used to solve real-world problems such as spam detection and text categorization [104]. RF and its variants are well suited for modeling real-time problems due to the naïve and independent nature of the input data. Furthermore, they are easier to implement and adapt to the size of the data set. The RF algorithm is also the right choice for solving real-world problems due to the naïve and independent nature of the input data.
Furthermore, its implementation and its adaptation to data sets of any size are not very complex [47]. These algorithms require lengthy training time compared to supervised algorithms like NB and SVM; however, they deliver high-accuracy results in less time. Another advantage of this technique is that it produces a connected graph of branches and leaves that represent the decisions and classes used to classify an event. Classification proceeds top-down by traversing the trees until a final decision is reached. Some standard regression algorithms are nearest neighbors and logistic regression. Nearest neighbors is an “instance-based” algorithm that makes a prediction for each new observation by searching for similar training data. Unfortunately, such algorithms are unsuitable for high-dimensional data and are memory-intensive. Unsupervised learning algorithms are designed to deal with unlabeled data based on heuristics. They are used for anomaly, fault, and intrusion detection, cell clustering, and load balancing. Data clustering is an unsupervised learning strategy in which clusters are formed based on similarities or differences within a data set. The unsupervised nature of these algorithms makes it challenging to evaluate their accuracy [22]; therefore, data visualization techniques are employed. However, if a correct or incorrect answer exists, the clusters can be pre-labeled, and classification algorithms are preferred in this situation. The most widely used clustering algorithms are hierarchical and K-means clustering. K-means is very simple yet flexible, and it forms clusters by calculating the distances between data points. The clusters are arranged around centroids and tend to be equal-sized and “globular” [40]. K-means requires an initial cluster specification before it starts, which is not always efficient.
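The K-means behaviour described above can be illustrated with a minimal sketch of Lloyd's algorithm in plain Python. The function names (`kmeans`, `sq_dist`, `mean_point`), the toy 2-D points, and the two sets of starting centroids are assumptions made for illustration and are not taken from [40]. The sketch runs the same data from a well-spread and a poorly chosen initial specification and compares the within-cluster sum of squared distances (inertia).

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def mean_point(cluster):
    """Coordinate-wise mean of a non-empty list of 2-D points."""
    n = len(cluster)
    return (sum(p[0] for p in cluster) / n, sum(p[1] for p in cluster) / n)


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm, started from the given initial centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [mean_point(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break  # assignments no longer change
        centroids = updated
    inertia = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, inertia


# Three well-separated toy groups of four points each (hypothetical data).
points = [(x + dx, dy) for x in (0, 10, 20) for dx in (0, 1) for dy in (0, 1)]

# One starting centroid per group versus two starting centroids in one group.
good_centroids, good_inertia = kmeans(points, [(0, 0), (10, 0), (20, 0)])
bad_centroids, bad_inertia = kmeans(points, [(0, 0), (1, 1), (15, 0)])
print(good_inertia, bad_inertia)
```

In this toy case, the well-spread start recovers the three groups, whereas the poor start leaves one centroid covering two groups and yields a much larger inertia, so the outcome depends on the initial cluster specification.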
Therefore it will reduce the quality of the result. In reality, most IoT applications have unsupervised nature and are deficient in having preliminary information about the data points, just like the natural learning process of human beings. Like the “Zeroday,” attacks on IoT infrastructure have no preliminary information available. RL technique works in various steps and prepares the action-reward correlation of agents with their environment. It is no doubt expedient in resolving many IoT problems [90]. This technique does not rely upon too many training samples; however, agents should have enough knowledge of transition function. It requires a large amount of time to come in a steady-state, but it is computationally elementary. Therefore these limiting factors are the critical challenges in this technique for application in a dynamic IoT environment. DL requires robust learning, estimation, and function approximation for better performance in IoT domains for solving the existing problems. Due to the limited resources, these IoT devices cannot run complex computational algorithms for any task like analysis, prediction, and communication. Therefore, DL-based machine learning algorithms show superior performance with low complexity and latency as compared to conventional techniques and theories [128]. Furthermore, these DNN’s are ideal in locating and defining representation from any sort of data, such as text, audio, and an image having high dimensional visualizations. DRL, along with its variants, is utilized in authentication and detecting DDoS in the diverse IoT environment. The most common algorithms of this class are deep deterministic policy gradient, continuous DQN, Dueling network DQN, and asynchronous advantage actor-critic. the main hurdle in the application of ML-based techniques in IoT environments. Table 4 reflects some state-of-the-art in this respect. It shows that most of Ml-based solutions pose memory, computation and energy overheads. 
The solutions given in [65], [11], and [23] strain memory. The solutions given in [120], [129], [142], [19], [95], [49], [57], [40], [101], [144], [47], and [104] are computationally costly along with the memory overhead. In contrast [26] poses all the three overheads (i.e., memory, computation, and energy) – Delay-sensitive applications: Delay-sensitive IoT devices need real-time analysis in delay-sensitive applications, which is again not supported by ML-based approaches [114]. Therefore, it is advised to merge the ML-based approaches with the existing streaming solutions. However, it may increase the complexity of the algorithm. Furthermore, ML-based networks are designed on the fact that complete data is available during training and analysis, which does not apply to dynamic and heterogeneous IoT data. It may become more challenging as these ML-based algorithms do not consider the unprecedented volume of data by design. Furthermore, the algorithm’s predictive ability reduces significantly due to increased multifaceted data and volume [69]. The previous discussion is only focused on the security-related challenges in functions of IoT for possible attacks and intrusion. – Data analytics: The wireless data can be obtained from multiple sources and locations, such as network information systems and sensing devices deployed at various locations for data gathering [115]. The raw data is considered the IoT systems’ flagship, requiring efficacious and robust analysis to get useful information. However, this data is extensive; thus, data management is a hectic task from an application point of view. This data is heterogeneous, having different types, semantics, and formats. That is why they possess syntactic (such as diversified data, document formats, programming, and data architectures) and semantic heterogeneity (such as having different sense and data interpretation) [18]. 
This heterogeneity and diversity may result in problems related to generalizations, particularly in Big Data having various data sets with different attributes. They are designed on the basis of the assumption that the statistical data features remain unchanged. Therefore, it requires data pre-processing for filtering the odd ones out [69]. However, it is not possible in the real world as the data is collected from different sources with different formats and representation styles. These differences create problems in ML-based algorithms, as these algorithms cannot deal with such diverse data. 7. Analysis of ML-based security solutions in IoT infrastructures 6. Machine learning challenges in IoT infrastructures This section discusses state-of-the-art ML-based solutions designed for mitigating security threats in IoT. Table 5 presents the definition of some major threats in this regard. IoT infrastructures are heterogeneous, uncertain, and produce data massively. Therefore, traditional ML techniques are inherently inefficient in dealing with such situations and demand considerable modifications [94]. The inherent uncertainties of IoT data are challenging in removing the unpredictability of the data samples. Therefore, it is inevitable to discuss major challenges in ML-based security solutions for IoT infrastructures. Following are some important factors that need to be carefully handled when dealing with IoTs: 1. Eavesdropping and Spoofing: The applications of the Internet of Things are generally classified as personal and enterprise classes. The former includes smart homes, smart offices, smart healthcare, sensor networks, and body area networks. In contrast, in the latter one, the applications may comprise smart cities, critical infrastructure, and smart industries. Such applications generate highly confidential and sensitive data. 
The primary yet most crucial requirement in such infrastructures is to ensure authorized and controlled access of an IoT service or application and the data to mitigate eavesdropping and spoofing. Usually, IoT services and applications are designed concerning the amount and type of data to be interchanged among devices/networks. The data obtained from an IoT de- – Device resources: Inherently, the ML-based algorithm has complex challenges, such as computational, memory, and sample complexity. Furthermore, they lack scalability, and their scope is confined to low-dimensional problems [54]. Moreover, most of the IoT devices are resource-constrained (i.e., limited energy, storage, and processing power). Such factors are 96 U. Farooq, N. Tariq, M. Asim et al. Journal of Parallel and Distributed Computing 162 (2022) 89–104 Table 4 ML-based solutions in IoT. Reference [65] [23] [11] [19] [95] Solutions Attack PCA SVM, KNN, GNB, ANN, DT, Extreme Learning Machine (ELM) J48 DT [49] [47] RNN ELM-based Semi-supervised Fuzzy C-Means (ESFCM) Association rule learning LR, SVM, DT, RF, ANN [129] [104] [40] [120] [142] [57] [26] NB, J48 DT NB, BayesNet, DT, RF, RT K-Means Clustering RL Swarm Intelligence(SI) SVM DL [101] DT, NB and Laplace estimator RF, DT, KNN, LR, SVM, ANN, NB [144] Accuracy Computation Energy Big data processing - × × Spoofing - × × DoS, MITM, Replay, Reconnaissance, multi-stage attack DoS Fog-based attack detection 98.0% × × 86.53% × × Scan attacks Data Type Probing, Spying and Wrong Setup, DoS, Malicious Operation, Malicious Control, Scan Malicious mobile malware Anomaly, IDS Location tracking Impersonation attack Data risk management Intrusion detection system (IDS) Over-fitting 99.4% × × × 82.5% and 86% 99.99% 94% × × × × × × × × × × - × Backdoor, command injection, and Structured Query Language (SQL) injection - × tation, and energy resources, this technique is not suitable in resource-constrained IoT infrastructures. 
Another SVM-based approach is presented in [35] that has computation overhead. In [87], an RL-based approach has been proposed for authentication and access control; however, it may exhaust memory and computation resources at the device level. Similarly, [93] presented a feature extraction model, which has computation and memory overheads. Another Auto-Regressive (AR) technique is proposed in [74] that may have computation overhead at node-level. 2. Attack Detection and Mitigation Under this heading, different attacks are discussed based on the standard layers of an IoT infrastructure. The IoT devices, due to their limited resources and heterogeneity, become an easy target for attackers. The attackers always try to attack at the known weak points of the network and device. The nature of these attacks may range from low profile hacking to severe ransomware attacks (such as WanaCry), including esoteric attacks, such as Mirai and Dyn. However, existing cryptographic models have limited accuracy. For such reason, ML-based approaches are utilized, such as KNN, DL, and SVM. ML-based approaches demonstrated high accuracy; however, they can exhaust resource-constrained IoT devices, as shown in Table 7. A fog-based semi-supervised assault identification model is presented in [95]. It uses Extreme ELM and FCM. Nevertheless, its limiting factor is its low accuracy compared to the other DL-based models. In contrast, it shows excellent performance than the standard ML algorithms. As afore-mentioned, the spectrum of IoT expands from personal, such as body area networks to the most sensitive critical infrastructures, such as smart healthcare and smart grid infrastructures. Hence, securing such infrastructures from a massive spectrum of attacks is inevitable [116]. For example, [88] studies the role of semisupervised, feature space fusion, supervised, and online learning models for securing smart grid systems. Table 5 Major threats in IoT infrastructures. 
Threat Description Eavesdropping The attacker eavesdrops on the credentials input by a victim, such as passwords and sensitive documents [139]. The adversary get illegal access of the victim system by targeting the identification, such as RFID and MAC address [79]. The adversary impedes the target system or the network by restricting authorized and legitimate users from accessing it by flooding a massive number of spam requests [32]. A variant of DoS attacks, which may compromise a large number of IoT devices to send a massive number of requests to the victim to shut it down [32]. The intruders disrupt the system covertly and silently in an IoT network/system [9]. A spurious software attacking legitimate software to behave other than the normal [92] Spoofing DoS DDoS Intrusion/Anomaly Malware Overhead Memory vice is usually pre-processed, shipped to a decision-making system for analysis depending upon the under-laying IoT infrastructure; however, they might have analogous data flow. Upon data request from an application/user, the request must be authenticated by the IoT device and ensured that the requester has all the required access permissions and protocols. Otherwise, it should deny the request. Table 6 presents a summary of state-of-the-art research in this domain. [131] used a Q-learning and Dyna-Q model to ensure controlled eavesdropping and spoofing using authentication and access control. However, the proposed technique has computation overhead and is not suitable for resourceconstrained IoT. The same author presented an LR-based solution in [132]. However, it requires more memory. Shi et al. [105] presented a new user authentication technique using DNN and SVM. Due to the extensive need for memory, compu97 U. Farooq, N. Tariq, M. Asim et al. Journal of Parallel and Distributed Computing 162 (2022) 89–104 Table 6 Eavesdropping and Spoofing in IoT using ML. 
Reference [131] [132] [105] [35] [87] [93] [74] ML-Technique Q-learning and Dyna-Q LR DNN, SVM SVM RL FE AR Accuracy Overhead 86% > 90% Memory Computation Energy × × × × × × × × × × Table 7 Attack detection and mitigation in IoT using ML. Reference [95] [88] [29] [3] [127] [121] [14] ML-Technique Fuzzy C-Means (ESFCM) SVM, KNN, SLR DL Distributed Deep Learning k-NN, LR, NB, RF, SVM DL NN, SVM Accuracy Overhead 86.53% 99.20% 99.20% 98.7% 95% 99.71% Another fog- and DL-based attack detection model is proposed in [29]. However, this mechanism is not suitable for resourceconstrained IoT deployments. A distributed deep learning security model is proposed in [3]. Although the mechanism shows outstanding detection accuracy; however, it requires memory, computation, and energy resources. In [95], an FCM approach is used for attack mitigation; however, it may pose computation overhead at the node level. Another ML-based attack detection mechanism is proposed in [88] using SVM, KNN, and SLR. Nevertheless, this technique is not suitable for IoT networks with limited memory and computation resources. Recently, [127], [121], and [127] propose ML-based security solutions with more than 95% detection accuracy. [127] used NN, and SVM techniques, [121] uses DL, and kNN, LR, NB, RF, and SVM are used in [127]. All these three techniques require high computation, energy, and memory resources. 3. DoS Attacks DoS attacks are hard to be detected and mitigated in the IoT environment. Numerous reasons make it difficult to design effective remedies against such attacks. The main reason is the massive interconnection of IoT devices to the Internet. Besides, heterogeneity and the low-level security mechanisms also strain resource-constrained IoT devices. The crossplatform and massive communications also play a significant role in making these attacks’ detection cumbersome. All these factors, collaboratively enhance the susceptibilities of IoT devices against DDoS attacks. 
For example, Mirai4,1 and other Mirai-like bots abased the Internet globally including compromised smart home devices, such as baby cams, printers, and webcams. Therefore, it is dire to protect such infrastructures against these attacks. Vlajic et al. [125] mentions IoT as the “Land of opportunities for DDoS attackers”. It is important to note that IoT devices can be hacked as the starting point for launching deadliest attacks against big scale organizations. Therefore this supports the need for serious, intelligent security measures for securing these devices. At present, some remarkable researches based on different mechanisms for mitigating DDoS attacks in IoT are present. However, each IoT infrastructure’s different architectures make 1 Memory Computation Energy × × × × × it very difficult to generalize a common unified mechanism for combating attacks like DDoS made over diversified IoT networks [137]. Conventionally, the detection and mitigation of a DDoS are handled at the IoT networks’ entry-points, such as gateways or routers. A lot of research has been done to effectively study and deal with such attacks, for instance, [89], [91], and [30]. Considering the resource-constrained nature of IoT devices, several state-of-the-art proposed multi-layer cloud, fog, or SDN-based DDoS attacks mitigation techniques, such as but not limited to, [7], [133]. Hence, it can be said that there is no straightforward way to eliminate the DDoS issue in resource-limited networks. The literature review also shows the need for more robust and sophisticated detection mechanisms for countering DDoS. Another critical issue is the false positives results, which can deny a legitimate or genuine request. Although the new research has solved some of the existing issues, there is still room for more intelligent systems considering the amount of traffic with the attacker’s nature. From this point of view, ML becomes the ray of hope for solving such issues. 
Table 8 summarizes state-of-the-art in this regard. The table shows different ML-based proposed mechanisms for the detection of DDoS attacks. Although ML-based mechanisms show phenomenal accuracy results; however, these mechanisms strain resource-constrained IoT infrastructures. In [30], a good comparison of ML-based DDoS attack detection mechanisms, such as NN, KNN, SVM, DT, and RF, are discussed for IoT networks. [136] presents an SDN-based DDoS detection mechanism using SVM. In [63] proposes another SDN- and SVM-based DDoS detection. In [64], NB, Radial Basis Function (RBF), RF, and Bagging are used to mitigate DoS attacks. Similarly, [112] presents a Multivariate Correlation Analysis (MCA), based on behavioral analysis of traffic data, to detect DDoS attacks. [72] proposes a new Signal-to-Interference-plus-NoiseRatio (SINR)-based DoS attack detection mechanism. In [51], a supervised ANNs model is proposed for thwarting the DDoS in IoT infrastructure. Likewise, [66] proposes an MLP mechanism for detecting a DoS attack on the sensor networks. 4. Anomaly/Intrusion Detection At present, there are numerous ML-based techniques for the detection of irregularities and intrusion in IoT infrastructure [102]. The most common method for intrusion detection is traffic filtering [62]. In this process, Mirai malware is discussed in detail in [59]. 98 U. Farooq, N. Tariq, M. Asim et al. Journal of Parallel and Distributed Computing 162 (2022) 89–104 Table 8 ML-based DoS and DDoS Mitigation in IoT. Reference [30] [136] [63] [112] [72] [51] [66] ML-Technique KN, LSVM, DT, RF, NN SVM SVM MCA Q-learning ANN MLP Accuracy Overhead >99% 95.24% 95.11% 95.20 - 99.95% 99.4% - batch or individual packets are analyzed for filtering the legitimate packets from malicious packets. Nevertheless, this technique still produces a higher number of false positives due to efficient traffic classification mechanisms and lessens its reliability [15]. 
Similarly, some behavior-based models are used for detecting intrusions in the network. It is worth mentioning that the existing traditional behavior-based and signaturebased schemes are ineffective in detecting the zero-day intrusions [81]. ML-based approaches proved to be promising in IDS in IoT infrastructures. For instance (but not limited to), [71], [70], [39], [20], [2], and [25]. Table 9 summarizes state-of-the-art intrusion and anomaly detection schemes using ML for IoT infrastructures along with the memory, computation, and energy overheads. In [106], a light-weight ML-based IDS is proposed for detecting wormhole attack in low-power IoT networks based on 6LoWPAN. The proposed technique uses K-means clustering, DT, and a hybrid technique. Two ML-based mechanisms are proposed in [21] to detect intrusion at IoT gateways. They used genetic algorithms and ANN to secure IoT against such attacks. In [131], the authors propose an outlier detection technique to handle invalid data in IoT infrastructures. However, it requires a large amount of data for analyzing data having outliers. A light-weight non-parametric mechanism is proposed in [85] using sequence-based supervised learning. On the other hand, an energy-efficient and computational friendly IDS is presented in [124] using DT, Linear Discriminant Analysis (LDA), and NB. Likewise, in [102], other energy-efficient IDS is proposed using the game-theoretic technique. The use of DL-based methods in anomaly detection systems has aided in the easy discrimination of most characteristics that humans may not identify. DL will undoubtedly enhance the accuracy of detection techniques, making it a better choice over shallow conventional learning. A large dataset was used in [58] to mitigate IDS using an optimized LSTM-based deep learning framework. The proposed model outperformed the state-of-the-art with 99% accuracy, 98% sensitivity, and 98% specificity. Idriss et al. 
[55] proposed a CNN-based IDS for bot attack mitigation in IoTs. The results showed 99.94% validation accuracy with validation loss of 0.58%. Similarly, CNN, LSTM, and CNN-LSTM with 96.60%, 99.82%, and 98.80% respective accuracy were proposed in [8] for intrusion detection in IoTbased infrastructure. Another customized DL feed-forward NN model was proposed in [38] for IoT-based networks’ intrusion detection with embedding layers for multi-class classification. An RNN-based IDS is introduced in [62] for heterogeneous IoT deployments. In [99], an efficient two-layered RNN-based mechanism is proposed for intrusion detection in low-power IoT infrastructures. 5. Malware Analysis in IoT: The ever-increasing volume, along with heterogeneity in the IoT devices, gives the impeccable and profitable grounds for the adversaries. One of the most common and dreadful attacks is the malicious code injection attack in the IoT devices by exploiting these devices’ weak- Memory Computation Energy × × × × × - - - × × × nesses. Such weaknesses may be classified as weak application safety, authentication, and authorization mechanisms. Besides, these devices are highly prone to physical tampering to gain access to the software for modification and security parameters’ misconfiguration. Examples of malware attacks may include bot, virus, spyware, adware, ransomware, trojan are some to name. Firstly, we will discuss some different types of malware attacks concerning IoT systems, and later on, we will discuss state-of-the-art solutions. It is observed that the smart devices, interconnected to the Internet, do not comply with any of the existing security mechanisms, which compromises the security of both the device itself and the connected network. For instance, for launching the massive scale attacks, [83] tested the music devices for their innate vulnerability. They successfully launched the malicious code attack on the under-laying devices. 
Similarly, Insecam⁵ is a list of potentially compromised Internet-connected webcams that can be exploited for different attacks. These cameras observe residences, offices, public places, and restaurants, and can thus become hotspots for malicious attacks. Some common malware² responsible for upsetting the usual working routine includes Red October, NotPetya, Stuxnet, Night Dragon, and CryptoLocker, to mention a few. These malware attacks have caused considerable monetary losses to the industry, as well as other losses, such as damage to the organizations’ public image. Nevertheless, it is worth learning about the attacker's strategy through the malware [77]. At present, standard security assessment mechanisms, such as Open Web-Application Security Project (OWASP), may also introduce inherent security vulnerabilities that an attacker can take advantage of. The attacker may inject malicious code or redirect the payload and generate attacks, such as phishing, rootkit, etc. Malware may range from naive single-task malware to more sophisticated, smart, and multipurpose malware. Some clever malware adapts to the IoT environment and executes only after analyzing the network. For example, some malware can escape detection mechanisms and stay dormant until a suitable time for executing the malicious code [138], [123]. In [45], an SVM-based malware classification approach for IoT is proposed. However, this classification is time-consuming. Another RNN-based DL malware-analysis approach is proposed in [42]; however, the said mechanism poses memory and computational overhead in resource-constrained IoT deployments. In [13], a novel DL-based approach is proposed to study the Operational codes (Opcode) sequence on the Internet of Battlefield Things (IoBT). This technique may also require more energy and computation power. Another DL-based malware analysis mechanism is proposed in [60].
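As a rough illustration of what opcode-based malware classification involves, the following Python sketch turns an opcode trace into a bigram frequency profile and assigns it to the class with the most similar centroid. It is a deliberately simple stand-in and not the method of [13], [42], or [45], which rely on DL or SVM models; the function names, the nearest-centroid rule, and the toy opcode traces are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def ngram_profile(opcodes, n=2):
    """Relative frequency of each opcode n-gram in one trace."""
    grams = [tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]
    counts = Counter(grams)
    total = len(grams)
    return {g: c / total for g, c in counts.items()} if total else {}


def centroid(profiles):
    """Mean of several sparse profiles (one centroid per class)."""
    summed = Counter()
    for profile in profiles:
        summed.update(profile)  # Counter.update adds values key by key
    return {g: v / len(profiles) for g, v in summed.items()}


def cosine(a, b):
    """Cosine similarity between two sparse profiles."""
    dot = sum(v * b.get(g, 0.0) for g, v in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(opcodes, centroids, n=2):
    """Return the label whose centroid is most similar to the trace."""
    profile = ngram_profile(opcodes, n)
    return max(centroids, key=lambda label: cosine(profile, centroids[label]))


# Toy, hand-made traces (hypothetical data, for illustration only).
BENIGN = [["push", "mov", "call", "pop", "ret"],
          ["mov", "add", "mov", "ret"]]
MALWARE = [["xor", "jmp", "xor", "jmp", "call"],
           ["jmp", "xor", "jmp", "xor", "jmp"]]

CENTROIDS = {
    "benign": centroid([ngram_profile(t) for t in BENIGN]),
    "malware": centroid([ngram_profile(t) for t in MALWARE]),
}

if __name__ == "__main__":
    print(classify(["xor", "jmp", "xor", "call"], CENTROIDS))
    print(classify(["push", "mov", "call", "ret"], CENTROIDS))
```

A sparse dictionary per class keeps the memory footprint small, which is the kind of trade-off the overhead remarks above point to; a realistic detector would need far larger labelled traces and a stronger model.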
In [109], a light-weight deep autoencoder-based mechanism is proposed to mitigate Linux.Gafgyt and Mirai malware attacks. However, a large data set is required to train the model, which can exhaust resources. Similarly, [104] presents a malware detection and analysis mechanism using NB, BNet, C4.5 DT, RF, and RT. Although the model achieved high accuracy, it requires larger memory and computation power. Likewise, in [67], a DT and NN-based model is proposed for mitigating malware. Table 10 illustrates the state of the art in this regard.

² A detailed taxonomy with the working principles of malware may be found in [77].

Table 9
ML-based Anomaly/Intrusion Detection in IoT.
Reference | ML-Technique
[102] | Game Theory
[62] | RNN
[106] | K-means clustering, DT
[21] | ANN
[85] | Sequence Based Learning (SBL)
[99] | RaNN
Accuracy (as listed): 98.88%; 70-93, 71-80; 99%; 99.65%, 98.53%; 97.23%
Overhead, Memory / Computation / Energy (as listed): - - - × × × - - - ×

Table 10
ML-based malware Detection in IoT.
Reference | ML-Technique | Accuracy
[104] | NB, BNet, C4.5 DT, RF, RT | 99.79%
[45] | SVM | 95.7%
[42] | RNN | 98.18%
[13] | DL | 99.68%
[60] | DL | 96-99%
[109] | CNN | 94-81.8%
[67] | DT, NN | 99.4%
Overhead, Memory / Computation / Energy (as listed): × × × × × × × ×

8. Discussion and future research areas

Currently, the IoT and its significance have touched every domain. Additionally, its security has already attracted the attention of a large number of device and network experts. However, its deployment, usage, and impact on infrastructure reveal many challenges and shortcomings, paving the way for new research areas. The roots of privacy and security concerns must be explored further to deploy IoT infrastructures effectively. The idea of IoT, in particular, has been driven by technological advances, and so have its safety concerns. Therefore, the legacy of older technology needs to be enhanced [108], [27]. The main issue is the constrained resources of these devices, which make it difficult to adapt currently deployed advanced countermeasures to IoT deployments. Furthermore, due to their heterogeneity, IoT networks are dissimilar to other networks [37]. They consist of various devices made by distinct manufacturers, with diverse applications and system software platforms. Therefore, security mechanisms ought to be compatible with ecosystems at this level of sophistication. Additionally, privacy and security solutions for the IoT demand a cross-cutting layer design and simplified security methodologies. In this regard, ML techniques provide a potential option for establishing reliable IoT frameworks. ML is a cutting-edge AI domain that does not need explicit programming and can perform well in complicated networks [111], [5]. Though ML-based solutions provide autonomous and more accurate countermeasures, their time complexity and resource requirements are still questionable in delay-sensitive and resource-limited applications where real-time response is obligatory. Furthermore, when designing any security solution for IoT, it is important to consider certain network quality parameters. For example, network overhead can drain the energy resources of the IoT devices, minimizing the network lifetime. Therefore, analyzing every aspect of the system (e.g., energy, computing, and other network resources) is essential when designing a security solution for IoT. Although some studies report good accuracy (i.e., >90% on average) in their attack detection results, they have failed to report the performance evaluation from the just-mentioned perspectives. As summarized in Tables 6 to 10, approximately 90% (on average) of the state-of-the-art solutions may pose computational overhead in resource-constrained IoT devices, which is thought-provoking. Similarly, approximately 82% (on average) of the state-of-the-art solutions may yield memory overhead in resource-constrained IoT devices due to their innate space complexity. Hence, due to computing and memory constraints, these systems may need new breeds of efficient and light-weight security measures, as one promising solution may not work for all heterogeneous infrastructures [113]. For instance, a solution proposed for resourceful IoT infrastructure may fail for infrastructures based on resource-constrained IoT devices (e.g., sensor nodes), but not vice versa. Therefore, research may focus on either light-weight or edge-based solutions.

Additionally, while designing ML-based security measures, it is essential to gather suitable data sets to apply these techniques successfully in large-scale, distributed, and heterogeneous IoT systems. A good data set is required for detection to provide an accurate, unbiased result during the training and testing process. As a result, selecting the appropriate data set for the system is critical. There are still some significant issues from a security perspective because of the diverse data collected from these IoT devices. Unfortunately, such data sets are often challenging to acquire based on the capacity of the systems to identify risks and take appropriate measures [56]. Besides that, there are further concerns about the proliferation of IoT devices’ protection protocols. There are no standalone solutions, which is one of the intricate defensive barriers. For example, false-positive findings could render solutions ineffectual against security breach attempts, mostly in the case of intrusion and DDoS attacks [44]. Additionally, market trust would be undermined, reducing the efficacy of such security solutions. Thus, a holistic strategy for IoT security and privacy may use existing safety technologies and enable the development of new autonomous, reliable, adaptive, light-weight, and scalable IoT security solutions to meet their growing needs [108]. Even though AI has already demonstrated encouraging outcomes in security systems in several domains, it still has much momentum among researchers over many other areas. While there are many applications for data extraction and data forecasting in IoT and big data analytics, there is limited use of AI and ML in the context of IoT security. As a result, there is a critical need to concentrate on attack detection. In addition, the proposed solutions must be able to modify their approach in response to new, constantly emerging threats. Scalability is also critical when evaluating the performance of a new system in massive dynamic infrastructures with nodes joining or leaving the network [117]. Another popular research topic that may help IoT networks overcome scalability limitations is fog computing. An IoT network that integrates ML and IoT solutions may change how we see the IoT network, since resource-hungry ML solutions are unsuitable for resource-constrained IoT devices. This explains why shifting computational activities to the edge layer (i.e., fog) may solve many of today's security problems, which arise because the existing security models require a large amount of resources and computing power.

References

[1] N. Abbas, M. Asim, N. Tariq, T. Baker, S. Abbas, A mechanism for securing IoT-enabled applications at the fog layer, J. Sens. Actuator Netw. 8 (1) (2019) 16. [2] A. Abduvaliyev, A.-S.K. Pathan, J. Zhou, R. Roman, W.-C.
Wong, On the vital areas of intrusion detection systems in wireless sensor networks, IEEE Commun. Surv. Tutor. 15 (3) (2013) 1223–1237. [3] A. Abeshu, N. Chilamkurti, Deep learning: the frontier for distributed attack detection in fog-to-things computing, IEEE Commun. Mag. 56 (2) (2018) 169–175. [4] M. Abomhara, et al., Cyber security and the Internet of things: vulnerabilities, threats, intruders and attacks, J. Cyber Secur. Mobil. 4 (1) (2015) 65–88. [5] Z.S. Ageed, S.R. Zeebaree, M.M. Sadeeq, S.F. Kak, H.S. Yahia, M.R. Mahmood, I.M. Ibrahim, Comprehensive survey of big data mining approaches in cloud systems, Qubahan Acad. J. 1 (2) (2021) 29–38. [6] M.A. Al-Garadi, A. Mohamed, A. Al-Ali, X. Du, I. Ali, M. Guizani, A survey of machine and deep learning methods for Internet of things (IoT) security, IEEE Commun. Surv. Tutor. (2020). [7] S. Alharbi, P. Rodriguez, R. Maharaja, P. Iyer, N. Subaschandrabose, Z. Ye, Secure the Internet of things with challenge response authentication in fog computing, in: 2017 IEEE 36th International Performance Computing and Communications Conference (IPCCC), IEEE, 2017, pp. 1–2. [8] H. Alkahtani, T.H. Aldhyani, Intrusion detection system to advance Internet of things infrastructure-based deep learning algorithms, Complexity (2021) 2021. [9] M. Aloqaily, S. Otoum, I. Al Ridhawi, Y. Jararweh, An intrusion detection system for connected vehicles in smart cities, Ad Hoc Netw. 90 (2019) 101842. [10] M. Amiri-Zarandi, R.A. Dara, E. Fraser, A survey of machine learning-based solutions to protect privacy in the Internet of things, Comput. Secur. (2020) 101921. [11] E. Anthi, L. Williams, M. Słowińska, G. Theodorakopoulos, P. Burnap, A supervised intrusion detection system for smart home IoT devices, IEEE Int. Things J. 6 (5) (2019) 9042–9053. [12] L. Aversano, M.L. Bernardi, M. Cimitile, R. Pecori, A systematic review on deep learning approaches for IoT security, Comput. Sci. Rev. 40 (2021) 100389. [13] A. Azmoodeh, A. Dehghantanha, K.-K.R. 
Choo, Robust malware detection for Internet of (battlefield) things devices using deep eigenspace learning, IEEE Trans. Sustain. Comput. 4 (1) (2018) 88–95. [14] M. Bagaa, T. Taleb, J.B. Bernabe, A. Skarmeta, A machine learning security framework for IoT systems, IEEE Access (2020). [15] E. Benkhelifa, T. Welsh, W. Hamouda, A critical review of practices and challenges in intrusion detection systems for IoT: toward universal and resilient systems, IEEE Commun. Surv. Tutor. 20 (4) (2018) 3496–3509. [16] E. Bertino, N. Islam, Botnets and Internet of things security, Computer 50 (2) (2017) 76–79. [17] G. Blanc, N. Kheir, D. Ayed, V. Lefebvre, E.M. de Oca, P. Bisson, Towards a 5G security architecture: articulating software-defined security and security as a service, in: Proceedings of the 13th International Conference on Availability, Reliability and Security, 2018, pp. 1–8. [18] T.E. Bogale, X. Wang, L.B. Le, Machine intelligence techniques for nextgeneration context-aware wireless networks, arXiv preprint, arXiv:1801. 04223, 2018. [19] O. Brun, Y. Yin, E. Gelenbe, Y.M. Kadioglu, J. Augusto-Gonzalez, M. Ramos, Deep learning with dense random neural networks for detecting attacks against IoT-connected home environments, in: International ISCIS Security Workshop, Springer, Cham, 2018, pp. 79–89. [20] I. Butun, S.D. Morgera, R. Sankar, A survey of intrusion detection systems in wireless sensor networks, IEEE Commun. Surv. Tutor. 16 (1) (2013) 266–282. [21] J. Canedo, A. Skjellum, Using machine learning to secure IoT systems, in: 2016 14th Annual Conference on Privacy, Security and Trust (PST), IEEE, 2016, pp. 219–222. [22] G. Casolla, S. Cuomo, V.S. Di Cola, F. Piccialli, Exploring unsupervised learning techniques for the Internet of things, IEEE Trans. Ind. Inform. 16 (4) (2019) 2621–2628. [23] N. Chaabouni, M. Mosbah, A. Zemmari, C. Sauvignac, P. Faruki, Network intrusion detection for IoT security based on learning techniques, IEEE Commun. Surv. Tutor. 
21 (3) (2019) 2671–2701. [24] A. Chawla, P. Babu, T. Gawande, E. Aumayr, P. Jacob, S. Fallon, Intelligent monitoring of IoT devices using neural networks, in: 2021 24th Conference on Innovation in Clouds, Internet and Networks and Workshops (ICIN), IEEE, 2021, pp. 137–139. [25] J.F. Colom, D. Gil, H. Mora, B. Volckaert, A.M. Jimeno, Scheduling framework for distributed intrusion detection systems over heterogeneous network architectures, J. Netw. Comput. Appl. 108 (2018) 76–86. [26] A. Dawoud, S. Shahristani, C. Raun, Deep learning and software-defined networks: towards secure IoT architecture, Int. Things 3 (2018) 82–89. [27] A. Derhab, R. Alawwad, K. Dehwah, N. Tariq, F.A. Khan, J. Al-Muhtadi, Tweet-based bot detection using big data analytics, IEEE Access 9 (2021) 65988–66005. [28] S.S. Dhanda, B. Singh, P. Jindal, Lightweight cryptography: a solution to secure IoT, Wirel. Pers. Commun. (2020) 1–34.

9. Conclusion

The Internet of Things (IoT) combines a large number of smart devices so that they connect with very little human interference. It is regarded as the fastest-growing sector in computing history, with an estimated 50 billion devices in 2020. Unfortunately, these infrastructures are prone to several cyber-attacks and threats. It is essential to ensure that the IoT ecosystem is successfully secured. In the past few years, ML-based security approaches have developed remarkably. However, despite being efficient and appropriate for classification and prediction in several tasks, ML-based techniques are not the only solution to all the problems faced by IoT networks. Moreover, ML being the latest technological and computational trend does not change the fact that ML techniques have their flaws, which need to be addressed before they are incorporated into IoT networks. Thus, we should improve the current security procedures to ensure the IoT ecosystem is successfully secured.
Even though ML-based solutions provide high accuracy, they require more storage, computation, and energy resources. Hence, they may strain a resource-constrained IoT infrastructure. Therefore, this calls for light-weight ML-based security solutions that do not exhaust such frameworks. Besides, a layered approach may also be helpful in this regard: computation-, memory-, and energy-intensive operations and analysis may be moved from the device layer to the upper layers, such as the cloud, fog, or SDN layer. In the future, we are looking forward to addressing intrusion and anomaly-based threats using deep learning at the fog layer.

CRediT authorship contribution statement

Umer Farooq: Conceptualization, Data curation, Formal analysis, Writing. Noshina Tariq: Visualization, Roles/Writing - original draft, Formal analysis. Muhammad Asim: Supervision, Project administration, Formal analysis, Writing - review & editing. Thar Baker: Formal analysis, Writing - review & editing. Ahmed Al-Shamma’a: Formal analysis, Writing - review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

[29] A.A. Diro, N. Chilamkurti, Distributed attack detection scheme using deep learning approach for Internet of things, Future Gener. Comput. Syst. 82 (2018) 761–768. [30] R. Doshi, N. Apthorpe, N. Feamster, Machine learning DDoS detection for consumer Internet of things devices, in: 2018 IEEE Security and Privacy Workshops (SPW), IEEE, 2018, pp. 29–35. [31] H. Edquist, P. Goodridge, J. Haskel, The Internet of things and economic growth in a panel of countries, Econ. Innov. New Technol. 30 (3) (2021) 262–283. [32] M. El-hajj, A. Fadlallah, M. Chamoun, A.
Serhrouchni, A survey of Internet of things (IoT) authentication schemes, Sensors 19 (5) (2019) 1141. [33] P. Emami-Naeini, Y. Agarwal, L.F. Cranor, Comments on “Establishing confidence in IoT device security: how do we get there?” 2021. [34] C. Fang, H. Yao, Z. Wang, W. Wu, X. Jin, F.R. Yu, A survey of mobile information-centric networking: research issues and challenges, IEEE Commun. Surv. Tutor. 20 (3) (2018) 2353–2371. [35] H. Fang, A. Qi, X. Wang, Fast authentication and progressive authorization in large-scale IoT: how to leverage AI for security enhancement, IEEE Netw. 34 (3) (2020) 24–29. [36] M.S. Farooq, S. Riaz, A. Abid, T. Umer, Y.B. Zikria, Role of IoT technology in agriculture: a systematic literature review, Electronics 9 (2) (2020) 319. [37] E. Fernandes, A. Rahmati, K. Eykholt, A. Prakash, Internet of things security research: a rehash of old ideas or new intellectual challenges?, IEEE Secur. Priv. 15 (4) (2017) 79–84. [38] M. Ge, N.F. Syed, X. Fu, Z. Baig, A. Robles-Kelly, Towards a deep learningdriven intrusion detection approach for Internet of things, Comput. Netw. 186 (2021) 107784. [39] A.A. Gendreau, M. Moorman, Survey of intrusion detection systems towards an end to end secure Internet of things, in: 2016 IEEE 4th International Conference on Future Internet of Things and Cloud (FiCloud), IEEE, 2016, pp. 84–90. [40] A. Gondalia, D. Dixit, S. Parashar, V. Raghava, A. Sengupta, V.R. Sarobin, IoTbased healthcare monitoring system for war soldiers using machine learning, Proc. Comput. Sci. 133 (2018) 1005–1013. [41] L. Gupta, R. Jain, M. Samaka, Dynamic analysis of application delivery network for leveraging software defined infrastructures, in: 2015 IEEE International Conference on Cloud Engineering, IEEE, 2015, pp. 305–310. [42] H. HaddadPajouh, A. Dehghantanha, R. Khayami, K.-K.R. Choo, A deep recurrent neural network based approach for Internet of things malware threat hunting, Future Gener. Comput. Syst. 85 (2018) 88–96. [43] H. 
HaddadPajouh, A. Dehghantanha, R.M. Parizi, M. Aledhari, H. Karimipour, A survey on Internet of things security: requirements, challenges, and solutions, Int. Things (2019) 100129. [44] S.H. Haji, S.Y. Ameen, Attack and anomaly detection in IoT networks using machine learning techniques: a review, Asian J. Res. Comput. Sci. (2021) 30–46. [45] H.-S. Ham, H.-H. Kim, M.-S. Kim, M.-J. Choi, Linear SVM-based Android malware detection for reliable IoT services, J. Appl. Math. 2014 (2014). [46] S.A. Hamad, Q.Z. Sheng, W.E. Zhang, S. Nepal, Realizing an Internet of secure things: a survey on issues and enabling technologies, IEEE Commun. Surv. Tutor. 22 (2) (2020) 1372–1391. [47] M. Hasan, M.M. Islam, M.I.I. Zarif, M. Hashem, Attack and anomaly detection in IoT sensors in IoT sites using machine learning approaches, Int. Things 7 (2019) 100059. [48] Y. Hasegawa, S. Abe, H. Matsutani, H. Amano, K. Anjo, T. Awashima, An adaptive cryptographic accelerator for IPsec on dynamically reconfigurable processor, in: Proceedings. 2005 IEEE International Conference on FieldProgrammable Technology, IEEE, 2005, 2005, pp. 163–170. [49] N. Hashimoto, S. Ozawa, T. Ban, J. Nakazato, J. Shimamura, A darknet traffic analysis for IoT malwares using association rule learning, Proc. Comput. Sci. 144 (2018) 118–123. [50] V. Hassija, V. Chamola, V. Saxena, D. Jain, P. Goyal, B. Sikdar, A survey on IoT security: application areas, security threats, and solution architectures, IEEE Access 7 (2019) 82721–82743. [51] E. Hodo, X. Bellekens, A. Hamilton, P.-L. Dubouilh, E. Iorkyase, C. Tachtatzis, R. Atkinson, Threat analysis of IoT networks using artificial neural network intrusion detection system, in: 2016 International Symposium on Networks, Computers and Communications (ISNCC), IEEE, 2016, pp. 1–6. [52] E. Hossain, I. Khan, F. Un-Noor, S.S. Sikander, M.S.H. 
Sunny, Application of big data and machine learning in smart grid, and associated security concerns: a review, IEEE Access 7 (2019) 13960–13988. [53] F. Hussain, Internet of Things: Building Blocks and Business Models, no. 978–3 Springer, 2017. [54] F. Hussain, R. Hussain, S.A. Hassan, E. Hossain, Machine learning in IoT security: current solutions and future challenges, IEEE Commun. Surv. Tutor. (2020). [55] I. Idrissi, M. Boukabous, M. Azizi, O. Moussaoui, H. El Fadili, Toward a deep learning-based intrusion detection system for IoT against botnet attacks, Int. J. Artif. Intell. 10 (1) (2021) 110. [56] A.F. Jahwar, S.R. Zeebaree, A state of the art survey of machine learning algorithms for IoT security, Asian J. Res. Comput. Sci. (2021) 12–34. [57] U. Jayasinghe, G.M. Lee, T.-W. Um, Q. Shi, Machine learning based trust computational model for IoT services, IEEE Trans. Sustain. Comput. 4 (1) (2018) 39–52. [58] B. Jothi, M. Pushpalatha, WILS-TRS—a novel optimized deep learning based intrusion detection framework for IoT networks, Pers. Ubiquitous Comput. (2021) 1–17. [59] G. Kambourakis, C. Kolias, A. Stavrou, The Mirai botnet and the IoT zombie armies, in: MILCOM 2017-2017 IEEE Military Communications Conference (MILCOM), IEEE, 2017, pp. 267–272. [60] E.B. Karbab, M. Debbabi, A. Derhab, D. Mouheb, MalDozer: automatic framework for Android malware detection using deep learning, Digit. Investig. 24 (2018) S48–S59. [61] H. Khujamatov, E. Reypnazarov, A. Lazarev, Modern methods of testing and information security problems in IoT, Bull. TUIT, Manag. Commun. Technol. 4 (2) (2021) 4. [62] J. Kim, J. Kim, H.L.T. Thu, H. Kim, Long short term memory recurrent neural network classifier for intrusion detection, in: 2016 International Conference on Platform Technology and Service (PlatCon), IEEE, 2016, pp. 1–5. [63] R. Kokila, S.T. Selvi, K. 
Govindarajan, DDoS detection and analysis in SDN-based environment using support vector machine classifier, in: 2014 Sixth International Conference on Advanced Computing (ICoAC), IEEE, 2014, pp. 205–210. [64] C. Kolias, G. Kambourakis, A. Stavrou, J. Voas, DDoS in the IoT: Mirai and other botnets, Computer 50 (7) (2017) 80–84. [65] I. Kotenko, I. Saenko, A. Branitskiy, Framework for mobile Internet of things security monitoring based on big data processing and machine learning, IEEE Access 6 (2018) 72714–72723. [66] R.V. Kulkarni, G.K. Venayagamoorthy, Neural network based secure media access control protocol for wireless sensor networks, in: 2009 International Joint Conference on Neural Networks, IEEE, 2009, pp. 1680–1687. [67] A.P. Kuruvila, S. Kundu, K. Basu, Defending hardware-based malware detectors against adversarial attacks, arXiv preprint, arXiv:2005.03644, 2020. [68] N.D. Lane, P. Georgiev, L. Qendro, DeepEar: robust smartphone audio sensing in unconstrained acoustic environments using deep learning, in: Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing, 2015, pp. 283–294. [69] A. L’heureux, K. Grolinger, H.F. Elyamany, M.A. Capretz, Machine learning with big data: challenges and approaches, IEEE Access 5 (2017) 7776–7797. [70] D. Li, Z. Cai, L. Deng, X. Yao, H.H. Wang, Information security model of block chain based on intrusion sensing in the IoT environment, Clust. Comput. 22 (1) (2019) 451–468. [71] J. Li, Z. Zhao, R. Li, H. Zhang, AI-based two-stage intrusion detection for software defined IoT networks, IEEE Int. Things J. 6 (2) (2018) 2093–2102. [72] Y. Li, D.E. Quevedo, S. Dey, L. Shi, SINR-based DoS attack on remote state estimation: a game-theoretic approach, IEEE Trans. Control Netw. Syst. 4 (3) (2016) 632–642. [73] C. Lin, H. Khazaei, A. Walenstein, A. Malton, Autonomic security management for IoT smart spaces, 2021. DOI: https://doi.org/10.1145/3466696. [74] X. Lixia, D. Ying, Y. Hongyu, H. 
Ze, Mitigating LFA through segment rerouting in IoT environment with traceroute flow abnormality detection, J. Netw. Comput. Appl. (2020) 102690. [75] M.S. Mahdavinejad, M. Rezvan, M. Barekatain, P. Adibi, P. Barnaghi, A.P. Sheth, Machine learning for Internet of things data analysis: a survey, Dig. Commun. Netw. 4 (3) (2018) 161–175. [76] M. Mahmud, M.S. Kaiser, A. Hussain, S. Vassanelli, Applications of deep learning and reinforcement learning to biological data, IEEE Trans. Neural Netw. Learn. Syst. 29 (6) (2018) 2063–2079. [77] I. Makhdoom, M. Abolhasan, J. Lipman, R.P. Liu, W. Ni, Anatomy of threats to the Internet of things, IEEE Commun. Surv. Tutor. 21 (2) (2018) 1636–1675. [78] P. Malhotra, Y. Singh, P. Anand, D.K. Bangotra, P.K. Singh, W.-C. Hong, Internet of things: evolution, concerns and security challenges, Sensors 21 (5) (2021) 1809. [79] S. Manjia Tahsien, H. Karimipour, P. Spachos, Machine learning based solutions for security of Internet of things (IoT): a survey, J. Netw. Comput. Appl. 161 (2020) 102630. [80] J.M. McGinthy, L.J. Wong, A.J. Michaels, Groundwork for neural network-based specific emitter identification authentication for IoT, IEEE Int. Things J. 6 (4) (2019) 6429–6440. [81] W. Meng, Intrusion detection in the era of IoT: building trust via traffic filtering and sampling, Computer 51 (7) (2018) 36–43. [82] B.K. Mohanta, D. Jena, U. Satapathy, S. Patnaik, Survey on IoT security: challenges and solution using machine learning, artificial intelligence and blockchain technology, Int. Things (2020) 100227. [83] J. Moos, IoT, malware and security, ITNOW 59 (1) (2017) 28–29. [84] R. Nelson, SSL offloading, encryption, and certificates with NGINX, Retrieved from https://www.nginx.com/blog/nginx-ssl/, Apr 30 (2014) 5. [85] N. Nesa, T. Ghosh, I. Banerjee, Non-parametric sequence-based learning approach for outlier detection in IoT, Future Gener. Comput. Syst. 82 (2018) 412–421. [86] N.D. Nguyen, T. Nguyen, S. Nahavandi, System design perspective for human-level agents using deep reinforcement learning: a survey, IEEE Access 5 (2017) 27091–27102. [87] A. Outchakoucht, E. Hamza, J.P. Leroy, Dynamic access control policy based on blockchain and machine learning for the Internet of things, Int. J. Adv. Comput. Sci. Appl. 8 (7) (2017) 417–424. [88] M. Ozay, I. Esnaola, F.T.Y. Vural, S.R. Kulkarni, H.V. Poor, Machine learning methods for attack detection in the smart grid, IEEE Trans. Neural Netw. Learn. Syst. 27 (8) (2015) 1773–1786. [89] L.A.B. Pacheco, J.J. Gondim, P.A.S. Barreto, E. Alchieri, Evaluation of distributed denial of service threat in the Internet of things, in: 2016 IEEE 15th International Symposium on Network Computing and Applications (NCA), IEEE, 2016, pp. 89–92. [90] T. Park, N. Abuzainab, W. Saad, Learning how to communicate in the Internet of things: finite resources and heterogeneity, arXiv preprint, arXiv:1610.01586, 2016. [91] V. Paxson, An analysis of using reflectors for distributed denial-of-service attacks, Comput. Commun. Rev. 31 (3) (2001) 38–47. [92] G. Primiero, F.J. Solheim, J.M. Spring, On malfunction, mechanisms and malware classification, Philos. Technol. 32 (2) (2019) 339–362. [93] P. Punithavathi, S. Geetha, M. Karuppiah, S.H. Islam, M.M. Hassan, K.-K.R. Choo, A lightweight machine learning-based authentication framework for smart IoT devices, Inf. Sci. 484 (2019) 255–268. [94] J. Qiu, Q. Wu, G. Ding, Y. Xu, S. Feng, A survey of machine learning for big data processing, EURASIP J. Adv. Signal Process. 2016 (1) (2016) 67. [95] S. Rathore, J.H. Park, Semi-supervised learning based distributed attack detection framework for IoT, Appl. Soft Comput. 72 (2018) 79–89. [96] N. Ravishankar, M. Vijayakumar, Reinforcement learning algorithms: survey and classification, Indian J. Sci. Technol. 10 (1) (2017) 1–8. [97] F. Restuccia, S. D’Oro, T.
Melodia, Securing the Internet of things in the age of machine learning and software-defined networking, IEEE Int. Things J. 5 (6) (2018) 4829–4842.
[98] T. Saba, K. Haseeb, A.A. Shah, A. Rehman, U. Tariq, Z. Mehmood, A machine-learning-based approach for autonomous IoT security, IT Prof. 23 (3) (2021) 69–75.
[99] A. Saeed, A. Ahmadinia, A. Javed, H. Larijani, Intelligent intrusion detection in low-power IoTs, ACM Trans. Internet Technol. 16 (4) (2016) 1–25.
[100] S. Salagare, R. Prasad, An overview of Internet of dental things: new frontier in advanced dentistry, Wirel. Pers. Commun. 110 (3) (2020) 1345–1371.
[101] I.H. Sarker, A machine learning based robust prediction model for real-life mobile phone data, Int. Things 5 (2019) 180–193.
[102] H. Sedjelmaci, S.M. Senouci, M. Al-Bahri, A lightweight anomaly detection technique for low-resource IoT devices: a game-theoretic methodology, in: 2016 IEEE International Conference on Communications (ICC), IEEE, 2016, pp. 1–6.
[103] D. Serpanos, The cyber-physical systems revolution, Computer 51 (3) (2018) 70–73.
[104] M. Shafiq, Z. Tian, Y. Sun, X. Du, M. Guizani, Selection of effective machine learning algorithm and Bot-IoT attacks traffic identification for Internet of things in smart city, Future Gener. Comput. Syst. 107 (2020) 433–442.
[105] C. Shi, J. Liu, H. Liu, Y. Chen, Smart user authentication through actuation of daily activities leveraging WiFi-enabled IoT, in: Proceedings of the 18th ACM International Symposium on Mobile Ad Hoc Networking and Computing, 2017, pp. 1–10.
[106] P. Shukla, ML-IDS: a machine learning approach to detect wormhole attacks in Internet of things, in: 2017 Intelligent Systems Conference (IntelliSys), IEEE, 2017, pp. 234–240.
[107] K. Sowmya, C. Srinivasan, K. Lakshmy, T.K. Bansal, A secure protocol for the delivery of firmware updates over the air in IoT devices, in: Soft Computing and Signal Processing, Springer, 2021, pp. 213–224.
[108] N.-A. Stoian, Machine learning for anomaly detection in IoT networks: malware analysis on the IoT-23 data set, B.S. thesis, University of Twente, 2020.
[109] J. Su, V.D. Vasconcellos, S. Prasad, S. Daniele, Y. Feng, K. Sakurai, Lightweight classification of IoT malware based on image recognition, in: 2018 IEEE 42nd Annual Computer Software and Applications Conference (COMPSAC), vol. 2, IEEE, 2018, pp. 664–669.
[110] A.-E.M. Taha, A.M. Rashwan, H.S. Hassanein, Secure communications for resource-constrained IoT devices, Sensors 20 (13) (2020) 3637.
[111] S.M. Tahsien, H. Karimipour, P. Spachos, Machine learning based solutions for security of Internet of things (IoT): a survey, J. Netw. Comput. Appl. 161 (2020) 102630.
[112] Z. Tan, A. Jamdagni, X. He, P. Nanda, R.P. Liu, A system for denial-of-service attack detection based on multivariate correlation analysis, IEEE Trans. Parallel Distrib. Syst. 25 (2) (2013) 447–456.
[113] N. Tariq, M. Asim, F. Al-Obeidat, M. Zubair Farooqi, T. Baker, M. Hammoudeh, I. Ghafir, The security of big data in fog-enabled IoT applications including blockchain: a survey, Sensors 19 (8) (2019) 1788.
[114] N. Tariq, M. Asim, F.A. Khan, Securing SCADA-based critical infrastructures: challenges and open issues, Proc. Comput. Sci. 155 (2019) 612–617.
[115] N. Tariq, M. Asim, Z. Maamar, M.Z. Farooqi, N. Faci, T. Baker, A mobile code-driven trust mechanism for detecting internal attacks in sensor node-powered IoT, J. Parallel Distrib. Comput. 134 (2019) 198–206.
[116] N. Tariq, A. Qamar, M. Asim, F.A. Khan, Blockchain and smart healthcare security: a survey, Proc. Comput. Sci. 175 (2020) 615–620.
[117] N. Tariq, M. Asim, F.A. Khan, T. Baker, U. Khalid, A. Derhab, A blockchain-based multi-mobile code-driven trust mechanism for detecting internal attacks in Internet of things, Sensors 21 (1) (2021) 23.
[118] J. Thom, Y. Shah, S. Sengupta, Correlation of cyber threat intelligence data across global honeypots, in: 2021 IEEE 11th Annual Computing and Communication Workshop and Conference (CCWC), IEEE, 2021, pp. 0766–0772.
[119] N. Torres, P. Pinto, S.I. Lopes, Security vulnerabilities in LPWANs: an attack vector analysis for the IoT ecosystem, Appl. Sci. 11 (7) (2021) 3176.
[120] S. Tu, M. Waqas, S.U. Rehman, M. Aamir, O.U. Rehman, Z. Jianbiao, C.-C. Chang, Security in fog computing: a novel technique to tackle an impersonation attack, IEEE Access 6 (2018) 74993–75001.
[121] R.M.A. Ujjan, Z. Pervez, K. Dahal, A.K. Bashir, R. Mumtaz, J. González, Towards sFlow and adaptive polling sampling for deep learning based DDoS detection in SDN, Future Gener. Comput. Syst. 111 (2020) 763–779.
[122] T. ul Hassan, M. Asim, T. Baker, J. Hassan, N. Tariq, CTrust-RPL: a control layer-based trust mechanism for supporting secure routing in routing protocol for low power and lossy networks-based Internet of things applications, Trans. Emerg. Telecommun. Technol. 32 (3) (2021) e4224.
[123] C.S. Veerappan, P.L.K. Keong, Z. Tang, F. Tan, Taxonomy on malware evasion countermeasures techniques, in: 2018 IEEE 4th World Forum on Internet of Things (WF-IoT), IEEE, 2018, pp. 558–563.
[124] E. Viegas, A. Santin, L. Oliveira, A. Franca, R. Jasinski, V. Pedroni, A reliable and energy-efficient classifier combination scheme for intrusion detection in embedded systems, Comput. Secur. 78 (2018) 16–32.
[125] N. Vlajic, D. Zhou, IoT as a land of opportunity for DDoS hackers, Computer 51 (7) (2018) 26–34.
[126] N. Waheed, X. He, M. Usman, Security & privacy in IoT using machine learning & blockchain: threats & countermeasures, arXiv preprint, arXiv:2002.03488, 2020.
[127] Y. Wan, K. Xu, G. Xue, F. Wang, IoTArgos: a multi-layer security monitoring system for Internet-of-things in smart homes, in: IEEE INFOCOM 2020-IEEE Conference on Computer Communications, IEEE, 2020, pp. 874–883.
[128] T. Wang, C.-K. Wen, H. Wang, F. Gao, T. Jiang, S. Jin, Deep learning for wireless physical layer: opportunities and challenges, China Commun. 14 (11) (2017) 92–111.
[129] L. Wei, W. Luo, J. Weng, Y. Zhong, X. Zhang, Z. Yan, Machine learning-based malicious application detection of Android, IEEE Access 5 (2017) 25591–25601.
[130] C. Wirth, R. Akrour, G. Neumann, J. Fürnkranz, A survey of preference-based reinforcement learning methods, J. Mach. Learn. Res. 18 (1) (2017) 4945–4990.
[131] L. Xiao, Y. Li, G. Han, G. Liu, W. Zhuang, Phy-layer spoofing detection with reinforcement learning in wireless networks, IEEE Trans. Veh. Technol. 65 (12) (2016) 10037–10047.
[132] L. Xiao, X. Wan, Z. Han, Phy-layer authentication with multiple landmarks with reduced overhead, IEEE Trans. Wirel. Commun. 17 (3) (2017) 1676–1687.
[133] Q. Yan, W. Huang, X. Luo, Q. Gong, F.R. Yu, A multi-level DDoS mitigation framework for the industrial Internet of things, IEEE Commun. Mag. 56 (2) (2018) 30–36.
[134] L. Yang, Y. Chen, X.-Y. Li, C. Xiao, M. Li, Y. Liu, Tagoram: real-time tracking of mobile RFID tags to high precision using COTS devices, in: Proceedings of the 20th Annual International Conference on Mobile Computing and Networking, 2014, pp. 237–248.
[135] S. Yao, Y. Zhao, A. Zhang, S. Hu, H. Shao, C. Zhang, L. Su, T. Abdelzaher, Deep learning for the Internet of things, Computer 51 (5) (2018) 32–41.
[136] J. Ye, X. Cheng, J. Zhu, L. Feng, L. Song, A DDoS attack detection method based on SVM in software defined network, Secur. Commun. Netw. 2018 (2018).
[137] D. Yin, L. Zhang, K. Yang, A DDoS attack detection and mitigation with software-defined Internet of things framework, IEEE Access 6 (2018) 24694–24705.
[138] I. You, K. Yim, Malware obfuscation techniques: a brief survey, in: 2010 International Conference on Broadband, Wireless Computing, Communication and Applications, IEEE, 2010, pp. 297–300.
[139] J. Yu, L. Lu, Y. Chen, Y. Zhu, L. Kong, An indirect eavesdropping attack of keystrokes on touch screen through acoustic sensing, IEEE Trans. Mob. Comput. (2019).
[140] A. Zaidan, B. Zaidan, A review on intelligent process for smart home applications based on IoT: coherent taxonomy, motivation, open challenges, and recommendations, Artif. Intell. Rev. 53 (1) (2020) 141–165.
[141] S.T. Zargar, H. Takabi, J. Iyer, Security-as-a-service (SECaaS) in the cloud, in: Security, Privacy, and Digital Forensics in the Cloud, 2019, pp. 189–200.
[142] O. Zedadra, A. Guerrieri, N. Jouandeau, G. Spezzano, H. Seridi, G. Fortino, Swarm intelligence-based algorithms within IoT-based systems: a review, J. Parallel Distrib. Comput. 122 (2018) 173–187.
[143] X.J. Zhu, Semi-supervised learning literature survey, Tech. Rep., University of Wisconsin-Madison, Department of Computer Sciences, 2005.
[144] M. Zolanvari, M.A. Teixeira, L. Gupta, K.M. Khan, R. Jain, Machine learning-based network vulnerability analysis of industrial Internet of things, IEEE Int. Things J. 6 (4) (2019) 6822–6834.

Umer Farooq is a PhD student in the Department of Computer Science, National University of Computer and Emerging Sciences, Islamabad, Pakistan. He received his MS degree in Computer Science from the same university. His research interests include Wireless Sensor Networks, Internet of Things, Cyber Security, Blockchain, Network Security, and Machine Learning.

Noshina Tariq holds a PhD in Computer Science from FAST - National University of Computer and Emerging Sciences, Islamabad, Pakistan. Her core area of research is Cyber Security. She is currently associated with Shaheed Zulfiqar Ali Bhutto Institute of Science and Technology, Islamabad, as an Assistant Professor in the Department of Computer Science. Her research interests include Cyber Security, Network Security, Internet of Things (IoT), Wireless Sensor Networks (WSN), Fog Computing, Blockchain, and Machine Learning.

Muhammad Asim is a Professor in the Department of Computer Science, National University of Computer and Emerging Sciences, Pakistan. He received his Ph.D. from Liverpool John Moores University in 2010. His research covers the Internet of Things (IoT), Cloud Computing, Blockchain, and Security and Trust applied to IoT and Cloud Computing. He serves as an editorial board member and reviewer for several scientific journals and conferences.

Thar Baker is an Associate Professor in the Department of Computer Science at the University of Sharjah (UoS), UAE. Before joining UoS, he was a Reader in Cloud Engineering and Head of the Applied Computing Research Group (ACRG) in the Faculty of Engineering and Technology at Liverpool John Moores University (LJMU), UK. He received his PhD in Autonomic Cloud Applications from LJMU in 2010 and became a Senior Fellow of the Higher Education Academy (SFHEA) in 2018. Dr Baker has published numerous refereed research papers in multidisciplinary research areas including parallel and distributed computing, federated learning, IoT, and energy routing protocols.

Ahmed Al-Shamma’a (BEng, MSc, PhD) is a Professor and the Dean of the College of Engineering, University of Sharjah, UAE. He obtained his MSc and PhD degrees from the University of Liverpool, UK, in 1990 and 1993, respectively. He is a member of many professional bodies, an active member of the UK and European Research Councils, and one of the founders of UK Sensor City. He was the Pro Vice Chancellor-Executive Dean (Teaching, Research and Enterprise) of the College of Engineering at Liverpool John Moores University, UK, for 5 years before joining the University of Sharjah in November 2019. His main areas of expertise are non-invasive sensing for industrial applications, microwave devices and systems, Industry 4.0 complete system integration, and telecommunications. His academic contributions and impact through professional practice are illustrated by a long list of publications: over 300 refereed journal and conference papers, 18 patents, 70 technical papers and reports, 1 joint book, 18 book chapters, and over 60 keynote speeches at national and international conferences and workshops. He was the main supervisor of 33 PhD students who successfully completed their studies, and he directly supervised over 50 postdocs.