Digital Object Identifier: 10.1109/MNET.011.2000396

This article has been accepted for inclusion in a future issue of this magazine. Content is final as presented, with the exception of pagination.
ACCEPTED FROM OPEN CALL
Design Guidelines for Machine Learning-based Cybersecurity in Internet of Things
Azzedine Boukerche and Rodolfo W. L. Coutinho
Abstract
Cybersecurity is one of the building blocks in
need of increasing attention in Internet of things
(IoT) applications. IoT has become a popular target for attackers seeking sensitive and personal
user data, computing infrastructure for massive
attacks, or the compromise of critical applications. Worryingly, the industrial race toward the
forefront of IoT software and device development
has led to increased market penetration of vulnerable IoT devices and applications. Nevertheless,
traditional cybersecurity solutions designed for
personal computers often rely on heavy computation and high communication overhead, and
therefore are prohibitive for IoT, given the explosive number of IoT devices, their resource-constrained nature, and their heterogeneity. Hence,
innovative solutions must be designed for securing IoT applications, while considering the peculiar characteristics of IoT devices and networks.
In this article, we discuss the motivations and
challenges of using machine learning (ML) models for the design of cybersecurity solutions for
IoT. More specifically, we tackle the challenge of
designing ML-based solutions and provide guidelines for ML-based physical layer solutions aimed
at securing IoT. We propose a device-oriented
and network-oriented classification and investigate
recent works that designed ML-based solutions,
considering IoT physical layer features, to secure
IoT applications. The proposed classification helps
engineers and practitioners starting in this area
to better identify and understand the challenges, requirements, and up-to-date common design
principles for securing IoT devices and networks
considering physical layer features. Finally, we
shed light on some future research directions that
need further investigation.
Introduction
In recent years, significant advances have been
made on embedded devices, sensing and actuation hardware, wireless networking technologies,
edge computing, and data-centric networking,
which have contributed to the development and
market penetration of Internet of things (IoT). IoT
has emerged as a network of seamlessly interconnected devices (e.g., sensors and actuators),
which cooperate to attain common objectives
[1]. Moreover, IoT has gained increased attention
thanks to its potential to change the way people
live and work by creating efficient, comfortable,
green and enjoyable environments through smart
applications over different domains, such as education, health-care, transportation, manufacturing, and surveillance. IoT has
unlocked sensing and actuation-based applications in several domains. A traditional IoT application relies on various heterogeneous devices
to sense the environment and act based on the
observed conditions or received commands. The
IoT devices gather a large amount of multimedia
data through heterogeneous sensors, share collected data whenever needed through machine-to-machine (M2M) communication, and offload it
to edge or cloud infrastructures.
Current advancements in key technologies
are supporting the ever-growing expansion and
popularization of IoT. The evolving Long Term
Evolution (LTE) and 5G networks are expected
to provide IoT applications with massive connectivity, high bandwidth, and ultra-reliable and
low-latency communication. Edge computing will
expand data processing capabilities closer to IoT,
which helps improve energy efficiency and reduce
network congestion, since devices will no longer
need to offload all collected data to cloud servers.
Information-centric networking architectures will
improve communication interoperability and data
delivery in IoT, by employing data-centric request
and response, and in-network content caching, respectively. Nevertheless, cybersecurity is a
fundamental building block in need of increased
attention in IoT. IoT systems are being targeted
with an unprecedented number of cyberattacks.
F-Secure reports that attack traffic on IoT
devices more than tripled in the first half of 2019,
when compared with the previous period, and
reached a total of over 2.9 billion events (see
the Attack Landscape H1 2019 report, available at https://tinyurl.com/sxaeq4c). Moreover,
malicious users rely on spoofing attacks, intrusions, jamming, eavesdropping, and malware
to: leak sensitive IoT data; turn IoT systems into botnets
for massive distributed denial-of-service
(DDoS) attacks, spam, phishing, and click-fraud; and
make critical IoT applications unavailable (e.g.,
health-care systems, surveillance, smart transportation, smart grids, and industrial applications).
The industrial race toward the forefront of the
development of IoT devices has led to increased
market penetration of vulnerable devices. For
instance, the security researcher Billy Rios showed
that the LifecarePCA drug infusion system, as well
as five other Hospira automated drug delivery
machines, is vulnerable to attacks that can change
the drug dosage to be delivered (https://tinyurl.com/t8xyr4h).
Nevertheless, traditional cybersecurity solutions designed for protecting personal
computers connected to the Internet will not be
feasible for IoT because of the explosive number
of IoT devices, their resource-constrained nature,
and their heterogeneity. Thus, Restuccia et al. [2]
have advocated for a secure-by-design approach,
in which IoT systems shall be built as free
of vulnerabilities as possible. However, security-by-design is hard to achieve in IoT, as devices
are composed of several hardware parts manufactured by different vendors and software developed
by different companies.
FIGURE 1. Common security threats for IoT applications.
Azzedine Boukerche is with the University of Ottawa; R. W. L. Coutinho is with Concordia University.
0890-8044/20/$25.00 © 2020 IEEE
IEEE Network • Accepted for Publication
Therefore, the design of new solutions to protect IoT from cyberattacks has received increased
attention in the scientific and industrial communities. In particular, machine learning (ML)-based
solutions have emerged for IoT cybersecurity. Traditional cybersecurity solutions are prohibitive for
IoT, as they rely on heavy computation [3] and will
overload the network with traffic for the autonomous
changing of default passwords on millions of IoT
devices, two-factor device authentication, and the application of security patches and updates to IoT devices
(https://tinyurl.com/yblo7yq6). Moreover, they
were not designed considering the devices’ severe constraints in terms of computation, memory,
radio bandwidth, and battery resources, and do
not encompass the entire security spectrum on
devices, edge computing, and wireless networking.
In contrast, ML-based cybersecurity solutions for
IoT have gained increased momentum, and several
works have been proposed in the literature (see
[1–4] and references therein). ML models can be
used, for instance, to create traffic profiles, detect
threats through traffic exchange that does not fall
within the established normal behavior, detect IoT
hardware vulnerabilities through observed physical
layer characteristics, and authenticate legitimate
devices based on their characteristics and behavior. Xiao et al. [3] analyzed learning-based solutions
designed for IoT device authentication, access control, malware detection, and secure offloading. Li
et al. [5] evaluated the feasibility and suitability of
statistical learning models for detecting anomalous
behavior of IoT devices by considering system statistics (e.g., CPU usage cycles and disk usage).
In this work, we tackle the challenge of designing ML-based solutions for physical layer IoT security. Related works either addressed one particular
security problem (e.g., intrusion detection) that
might appear, or focused primarily on the discussion of the ML models while presenting proposed
solutions to secure IoT. In contrast, we discuss
ML-based solutions for cybersecurity in IoT applications by addressing IoT from two distinct points
of view: the device and the network.
This process helps engineers and practitioners
starting in the area to better understand the challenges and design principles of ML-based cybersecurity solutions when they are intended to protect
IoT devices individually, as well as IoT network
infrastructure. More specifically, the contributions
of this work include:
• A thorough discussion of the motivation for
the design of novel solutions to secure IoT
systems, ML-based cybersecurity, and requirements and current daunting challenges.
• A proposed classification to categorize
recent works that designed ML-based
approaches for IoT cybersecurity into device-based
and network-based solutions. The proposed
classification, by considering two distinct
points of view of IoT systems, helps to better identify and understand the challenges,
requirements and up-to-date common principles for the design of security solutions for
IoT devices and networks considering physical layer features.
• A thorough discussion of open issues and
future research directions toward the design
of efficient cybersecurity solutions for IoT.
Fundamentals
Cybersecurity for IoT
Figure 1 illustrates classic attacks that IoT infrastructures can experience. Cybersecurity solutions must be designed to protect IoT data and
prevent IoT devices from being compromised. Cisco
estimates that data produced by IoT applications
will reach nearly 850 ZB by 2021 (https://tinyurl.com/ybez862s).
Beyond this impressive volume,
it is worth highlighting that IoT data will mostly
be sensitive and may reveal private aspects of
users and their interactions with the application.
For instance, a smart health-care application will
produce data regarding users’ health conditions
and historical health records. A smart home application will produce data regarding rooms and
environmental states and conditions (e.g., temperature, light level, humidity, and noise), as well
as users’ interactions with them.
In both examples mentioned above, IoT data
leakage can reveal users’ sensitive information and behavior in their most private spaces. An
attacker in possession of such data can infer when
a user is at home (or if the house is vacant), as
well as the user’s routines and preferences while
at home. Therefore, cybersecurity solutions for
IoT must deal with eavesdropping attacks efficiently, preventing information leakage, ensuring
that data will not be globally accessible, and limiting
data lifetime to the minimum extent required.
IoT characteristics: Challenges for cybersecurity in IoT applications
Massive deployment
• Data is distributed among multiple devices.
• Individual protection of devices.
• Network overhead.
Heterogeneity
• Devices with heterogeneous capabilities.
• Need for different solutions to secure different devices.
Dynamic network topologies
• IoT network topology changes frequently due to controllable and uncontrollable factors.
• Topology changes will affect the communication pattern of IoT devices.
• Fingerprinting-based cybersecurity solutions should consider communication traffic pattern changes.
Low-power and low-cost communication
• IoT devices have severe energy constraints.
• Networking protocols do not implement robust mechanisms for reliable communication.
• Distributed cybersecurity solutions should consider low-reliability communication in IoT applications.
Low latency communication
• IoT applications might have time constraints.
• Complex cybersecurity solutions will incur additional delays.
TABLE 1. IoT characteristics and challenges for cybersecurity.
Moreover, IoT devices have been targeted by
cyber-attackers aimed at taking control of them.
In contrast to traditional computing systems, each
IoT device performs a well-defined task. However,
such a task might be critical; for instance, an IoT
medical device can be used for insulin delivery in
a health-care system. In this regard, compromised
IoT devices can lead to fatal consequences, as
they can pump lethal doses of the administered
drug in health-care applications (https://tinyurl.com/y8tsb7fu).
Besides, compromised IoT systems can be
used to create botnets, which will be exploited to
attack and damage other computing infrastructures. Although each device individually lacks
computing capabilities, the numbers compensate
for this. An infected IoT device can be instructed
to download malware and wait for commands
to begin an attack. Despite having constrained
resources, it is undeniable that orchestrated DDoS
attacks from IoT are destructive because of the
excessive number of involved devices. IoT botnets
(e.g., the Mirai botnet) have served as infrastructure
for powerful DDoS attacks, such as those in October 2016, which took down hundreds of websites (e.g., Twitter, Netflix, Reddit, and GitHub)
for several hours [6]. The critical fact is that traditional cybersecurity approaches might not prove
suitable for IoT applications, given the unique
characteristics of IoT devices and networks, as
summarized in Table 1.
ML-Based Cybersecurity
Machine learning has gained increased attention
in the design of cybersecurity solutions for IoT.
One of the reasons for such increased attention
is the potential for using ML models to protect
IoT data and control access to IoT resources.
In traditional personal computer-based systems
(e.g., client/server computing applications), data
is located in a well-defined place and is requested by users through a unique data identifier or
the address of the host storing it. In contrast, IoT
data might be spread out among devices and
processing units; that is, IoT data will not reside
in a single place, and its location will not be
well-defined.
Therefore, a naive solution for securing IoT
data would be to implement protective measures on every single device in an IoT application.
However, such a naive approach will be infeasible, given the heterogeneous and resource-constrained nature of the IoT devices, and the heavy
computation and high communication load nature
of traditional cybersecurity techniques. Moreover, cybersecurity solutions must guarantee that
access to IoT resources is controlled. IoT devices
might perform vital tasks, such as in health-care
applications. Hence, cybersecurity solutions must
make sure that the access to update a device configuration or working mode is granted only to a
legitimate entity. Such access control is needed
to prevent, for instance, a malicious user from
changing the dosage a device must deliver to a
patient in a smart health-care application.
In this regard, ML-based solutions can observe
different variables in an IoT system and make
decisions to secure it. In an IoT application, each
device will have a well-defined task to perform.
Moreover, the interaction between users and a
set of IoT devices, or a machine-to-machine interaction in a given IoT application, tends to follow
a pattern, that is, it is not a random interaction.
In this regard, machine learning algorithms can
be trained to learn such an interaction pattern,
as well as the characteristics of networking traffic
generated from such interactions. Therefore, an
ML-based solution will be able to authenticate
users, control data access, and identify DDoS
attacks, compromised IoT devices, or unauthorized attempts to access IoT data or resources.
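The idea of learning a normal interaction pattern and flagging deviations from it can be illustrated with a minimal sketch. It is a deliberately simple statistical stand-in for a trained ML model, not a method from any of the cited works: the feature (packets per minute), the sample values, and the threshold of three standard deviations are all assumptions chosen for illustration.

```python
import statistics


def build_profile(normal_rates):
    """Learn a simple traffic profile (mean, standard deviation) from normal observations."""
    return statistics.mean(normal_rates), statistics.stdev(normal_rates)


def is_anomalous(rate, profile, threshold=3.0):
    """Flag a rate whose distance from the learned mean exceeds `threshold` standard deviations."""
    mean, std = profile
    if std == 0:
        return rate != mean
    return abs(rate - mean) / std > threshold


# Hypothetical packets-per-minute values observed for one sensor during normal operation.
normal_rates = [10, 12, 11, 9, 10, 11, 12, 10]
profile = build_profile(normal_rates)

print(is_anomalous(11, profile))   # False: within the learned profile
print(is_anomalous(500, profile))  # True: far outside the learned profile
```

A real deployment would replace the single rate with a vector of traffic features and the z-score test with a trained model, but the train-on-normal, flag-the-deviation structure stays the same.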
Requirements and Fundamental Challenges
Cybersecurity techniques for IoT must be lightweight, resilient, fault-tolerant, and robust. Moreover, they should tackle the heterogeneous
capabilities of IoT devices and wireless networking technologies. In addition, cybersecurity techniques should protect IoT data by considering
different data sensitivity levels. Moreover, they
should guarantee that data is accessed only by
users and system components that have the right
permission to access it. Furthermore, cybersecurity solutions for IoT should detect unusual IoT traffic, block attack attempts, and mitigate damage
when a device or component is compromised.
Nonetheless, solutions to secure IoT must not
incur significant overhead for the system and network, which would diminish the performance of
an IoT application.
In this regard, supervised machine learning techniques (e.g., SVM, naive Bayes, K-nearest neighbor,
deep neural networks, and random forests) have
been used for detecting network intrusion and
malware, DDoS, and spoofing attacks [3]. Supervised ML techniques require labeled data, with a
set of inputs and their corresponding outputs, used
to train the model initially. The working principle
of such an approach overall includes the centralized training of the model and its later execution
in selected IoT devices. This might require a vast
amount of raw data for training the models.
In addition, the data needed for training might
be sensitive and private, which will not be easy
to acquire. Furthermore, supervised models must
be resilient to maliciously introduced data; that is,
they must reject compromised training data sets
that might negatively impact the result. Biased
data from user interactions must also be properly treated when training supervised models for
securing IoT. The challenges mentioned above
will also emerge whenever a used supervised ML
model must be re-trained and updated.
In contrast, unsupervised machine learning
techniques (e.g., k-means and hierarchical clustering)
have gained increased attention for IoT
networks [7]. Unsupervised learning can be used
for detecting data modification attacks, classifying statistical
data tuples as benign or malicious,
identifying abnormal flows, and detecting malicious
relays. The main advantage of unsupervised
ML is that it does not require labeled data for
training, which contributes to reducing complexity and required resources. However, efficient
unsupervised ML-based solutions will require the
proper selection of features to be considered, and
removal of features possessing no discriminating
power, aimed at coping with the curse of
dimensionality.
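As an illustration of removing features that possess no discriminating power, the sketch below drops every feature whose variance across the samples is (almost) zero. Low variance is only one possible proxy for a lack of discriminating power, and the feature names, sample values, and threshold are hypothetical.

```python
import statistics


def select_features(samples, names, min_variance=1e-6):
    """Keep only the feature columns whose population variance exceeds `min_variance`."""
    columns = list(zip(*samples))
    kept = [i for i, col in enumerate(columns) if statistics.pvariance(col) > min_variance]
    reduced = [[row[i] for i in kept] for row in samples]
    return reduced, [names[i] for i in kept]


# Hypothetical per-flow samples: the header size is constant, so it cannot
# help separate benign from malicious flows.
names = ["modified_ratio", "header_size", "delay_ms"]
samples = [
    [0.10, 64, 1.0],
    [0.90, 64, 3.5],
    [0.15, 64, 1.2],
    [0.85, 64, 3.1],
]

reduced, kept_names = select_features(samples, names)
print(kept_names)  # ['modified_ratio', 'delay_ms']
```

The reduced samples can then be passed to a clustering algorithm such as k-means, which operates on fewer dimensions as a result.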
Finally, it is worth mentioning that some
machine learning models, such as deep learning,
are well known for how difficult it is to understand the reasoning behind their decisions. Thus, ML-based
cybersecurity solutions might fall short in
forensic capabilities, as the decisions taken might
not be traceable. It will be challenging to develop
ML-based solutions to secure IoT that are capable of providing transparency and accountability for the actions taken. It might not be possible
to prove that the actions taken were correct, which
would make the system hard to defend in
a court of law whenever needed.
ML-Based Cybersecurity for IoT
The first step toward the design of efficient
ML-based solutions for IoT applications is to
understand IoT characteristics, security requirements, and design challenges. To facilitate this
process, we propose a novel classification to
categorize current ML-based designs to secure
IoT applications. Based on the primary goal, we
categorize the solutions into IoT device security and IoT
network security, as summarized in Table 2. The
proposed classification contributes to the study of
the challenges and requirements of IoT systems
from a device and network point of view. Hence,
for each category, we highlight the design principles and main challenges to be overcome, and
shed light on some recent works in the literature.
The discussed works are summarized in Table 3.
Approach: Description
Device security: Solutions aimed at tackling vulnerabilities and attacks targeting IoT devices (e.g., hardware trojans, cloning, and battery draining), and at securing them to avoid privacy leakage, DDoS, and jamming.
Network security: Solutions aimed at securing the IoT communication infrastructure (e.g., edge nodes, access points, routers, and cache systems) against adversaries.
TABLE 2. Classification of IoT cybersecurity approaches.
IoT Device Security
One of the daunting challenges in IoT applications is how to secure the devices. IoT devices
might present vulnerabilities, such as open telnet
ports, outdated firmware, and unencrypted transmission of sensitive data. Hence, they are susceptible to many kinds of attacks, which include
hardware trojans, non-network side-channel
attacks, DDoS, and tampering attacks [15]. Moreover, IoT devices overall have severe limitations
in terms of power supply, which leads them to
work in a duty-cycled manner to conserve energy. However, they are also susceptible to sleep
deprivation and battery draining attacks. In this
regard, ML-based cybersecurity approaches can
be explored to ensure IoT devices are working
correctly, that is, to detect when they are compromised or are receiving unusual requests, whether for sensitive data or as part of DDoS attempts. Moreover,
ML-based cybersecurity can improve authentication mechanisms and access control to data and
networks for new devices added to the system.
Machine learning has been used in proposed
solutions to authenticate IoT devices through
fingerprinting. Figure 2 depicts the general working
principle of such approaches. IoT devices will
have unique radio signal signatures. These unique
signatures of transmitted signals arise from
the transmitter’s hardware imperfections or
effects of signal propagation (e.g., fading, Doppler
effect, noise, and distortion). Furthermore, recent
studies [9–11] designed ML-based solutions to
extract unique features from received signals and
determine if a device that is trying to authenticate
in the network is legitimate or adversarial.
Das et al. [9] proposed a Long Short-Term
Memory (LSTM)-based classifier to learn unique
hardware imperfections of legitimate IoT devices.
Hence, such unique imperfections are used to distinguish legitimate devices from adversaries that
try to emulate them. To do so, wireless signals
are considered through samples of their transmitted preambles, which are composed of multiple symbols. For
a given input, the LSTM classifier’s output will be
the imperfection characteristics of the transmitter
hardware, in terms of frequency offset, phase offset, filters, timing offset, and multipath.
Xiao et al. [8]
• Category: Network security
• ML technique: DQN
• Goal: Secure mobile edge caching devices
• Description: Determine the edge node the IoT device should use, the task offloading rate/time, and the transmission power to be used in the communication. Those parameters are selected from the observed users’ density, devices’ battery level, jamming strength, and radio channel bandwidth.
Liu et al. [7]
• Category: Network security
• ML technique: k-means
• Goal: Detect malicious devices within IoT multi-hop paths
• Description: Use probe packets to discover multi-hop paths from source nodes to the sink. The sink node determines the fraction of unmodified packets of each path, from received probes. Hence, k-means is used to cluster nodes into benign and malicious, based on the reputation of the paths they are a member of and their contribution to each path.
Das et al. [9]
• Category: Device security
• ML technique: LSTM
• Goal: Device authentication
• Description: Use the unique hardware imperfections of IoT devices to authenticate them.
Chatterjee et al. [10]
• Category: Device security
• ML technique: ANN
• Goal: Device authentication
• Description: Authenticate IoT devices from physical unclonable functions.
Ferdosi and Saad [11]
• Category: Device security
• ML technique: LSTM
• Goal: Device authentication
• Description: Gateway nodes authenticate devices of massive IoT scenarios through received watermarked signals.
Chen et al. [12]
• Category: Network security
• ML technique: DBN
• Goal: Detect jamming attacks in the mobile edge computing infrastructure
• Description: Deep belief network is used to learn features of eavesdropping and jamming attacks to mobile edge computing systems.
Miettinen et al. [13]
• Category: Network security
• ML technique: Random Forest
• Goal: Detect devices with unpatched vulnerabilities
• Description: Use devices’ fingerprint to identify if they have any unpatched vulnerability. Hence, protective measures are taken to limit the operation of a vulnerable device in the IoT network.
Alli et al. [14]
• Category: Network security
• ML technique: PSO and Neuro-Fuzzy
• Goal: Prevent malicious IoT devices from offloading invalid data aimed at network congestion and exhaustion of fog and cloud computing resources.
• Description: Surrogate entities at fog nodes collect and store information regarding IoT devices within the network. PSO is used at the fog nodes to select the optimal node, aimed at reducing delay, for handling offloaded tasks. Neuro-Fuzzy is used at gateways to evaluate data coming from IoT devices and identify malicious task offloading.
Vashist et al. [4]
• Category: Device security
• ML technique: ANN, SVM, kNN, and decision tree classifiers
• Goal: Detect burst errors on multiple consecutive flits of a packet in a WiNoC.
• Description: Implements a set of machine learning classifiers to detect jamming attacks aimed at denial-of-service on wireless Network-on-Chip. The classifiers are used to distinguish burst errors occasioned during normal operation from errors that happen when an internal or external attacker is interfering in the communication.
TABLE 3. Summary of discussed works.
Chatterjee et al. [10] proposed the RF-PUF
for IoT device authentication through physical
unclonable functions (PUF). In the RF-PUF, device
identification is performed at the receiver node,
from features (frequency, in-phase (I) and quadrature (Q)
components, and channel features) extracted from
received wireless signals. The proposed solution
implements a three-layer Artificial Neural Network
(ANN) that will determine the unique identifier
of the transmitter based on the output (normalized geometric means of feature values) and PUF
properties.
The main disadvantage of the above work is the
high demand placed on the gateway node, which might
fail to authenticate IoT devices simultaneously
in massive IoT systems. In this regard, Ferdosi and
Saad [11] proposed an LSTM-based watermarking algorithm for assisting dynamic massive IoT
device authentication. In the proposed solution,
the LSTM model is used to extract fingerprints
from device signals’ characteristics (spectral flatness, mean, variance, skewness, and kurtosis). The
output is a bitstream used to watermark the original signal using a key. At the gateway, a proposed
dynamic watermarking LSTM (DW-LSTM) model is
used to extract the bitstream and features of a received
watermarked signal. Those outputs are compared,
and in the event of dissimilarities between the two
sequences, an attack alarm is triggered.
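The five signal characteristics named above can be computed as in the following sketch, which covers only the feature-extraction step and does not reproduce the LSTM or the watermarking procedure of [11]. The test signal, the naive DFT, and all function names are assumptions made for illustration; spectral flatness is taken here as the ratio of the geometric mean to the arithmetic mean of the power spectrum.

```python
import cmath
import math
import statistics


def power_spectrum(signal):
    """Naive O(n^2) DFT power spectrum; adequate for a short illustrative signal."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_flatness(signal, eps=1e-12):
    """Geometric mean divided by arithmetic mean of the power spectrum (close to 0 for a pure tone)."""
    power = [p + eps for p in power_spectrum(signal)]
    geometric = math.exp(sum(math.log(p) for p in power) / len(power))
    arithmetic = sum(power) / len(power)
    return geometric / arithmetic


def fingerprint_features(signal):
    """Return the five statistical features of a real-valued, non-constant signal."""
    mean = statistics.fmean(signal)
    variance = statistics.pvariance(signal, mu=mean)
    std = math.sqrt(variance)
    standardized = [(x - mean) / std for x in signal]
    return {
        "spectral_flatness": spectral_flatness(signal),
        "mean": mean,
        "variance": variance,
        "skewness": statistics.fmean([z ** 3 for z in standardized]),
        "kurtosis": statistics.fmean([z ** 4 for z in standardized]),
    }


# Hypothetical received signal: a single sinusoid sampled over four full periods.
signal = [math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
features = fingerprint_features(signal)
for name, value in features.items():
    print(f"{name}: {value:.4f}")
```

In a fingerprinting pipeline, a feature vector of this kind would be the input from which a learned model derives the device-specific bitstream.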
In contrast to the works mentioned above,
Vashist et al. [4] addressed jamming attacks aimed
at DoS on wireless Network-on-Chip (WiNoC).
The authors used a burst error correction code to
monitor the rate of burst errors received over the
wireless medium, and ML classifiers (ANN, SVM,
kNN, and decision tree) to detect the persistent
jamming attack. In the considered attack model,
an external or internal attacker will interfere with
legitimate transmissions, which will cause high
burst error rates on multiple consecutive flits of
a packet. Hence, ML classifiers were employed
to distinguish random burst errors occasioned by
power source fluctuations, ground bounce, or
crosstalk from burst errors due to jamming attacks.
The authors created a simulation-based dataset
with different bit error rates (BER) to model normal operation and burst errors from jamming
attacks. The number of transmitted and received
flits, as well as the number of errors, are used
together with the operating mode (i.e., normal or
attacked) for training the classifiers.
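A minimal sketch of this kind of training setup is given below, using a k-nearest-neighbor classifier (one of the classifier families listed above) on a simulated dataset. It is not the dataset or the model of [4]: the error probabilities, the window size, and the use of the error rate as the single feature are assumptions chosen to keep the example short.

```python
import random


def simulate_window(attacked, rng, flits=1000):
    """Return (error_rate, label) for one hypothetical monitoring window."""
    # Assumed error probabilities: rare random bursts versus persistent jamming.
    p_error = 0.30 if attacked else 0.01
    errors = sum(1 for _ in range(flits) if rng.random() < p_error)
    return errors / flits, "attacked" if attacked else "normal"


def knn_predict(train, rate, k=3):
    """Label an observed error rate by majority vote among its k nearest training windows."""
    neighbors = sorted(train, key=lambda item: abs(item[0] - rate))[:k]
    labels = [label for _, label in neighbors]
    return max(set(labels), key=labels.count)


rng = random.Random(0)
# Alternate normal and attacked windows to build a balanced training set.
train = [simulate_window(i % 2 == 1, rng) for i in range(40)]

print(knn_predict(train, 0.02))  # normal
print(knn_predict(train, 0.28))  # attacked
```

With error rates this far apart the two classes are trivially separable; the interesting cases in practice are the windows where random bursts and jamming produce similar rates.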
Despite the advancements, many challenges should be addressed during the design of
ML-based cybersecurity solutions to protect devices. First, the solutions must be lightweight as
devices have limited resources in terms of computing, storage, and energy. Second, the solutions
will need to deal with the lack of reliable data sets
to be used for training and validation. Simulated
data were considered to evaluate the proposed
solutions in [9, 10], for instance. Third, it might be
required to re-train and update the parameters of
an ML-based cybersecurity solution. Hence, the
data exchange for such tasks should be done in a
way that will not congest the network.
FIGURE 2. ML-based IoT device fingerprinting.
IoT Network Security
Network-based IoT security aims to create barriers
to protect the IoT network, rather than addressing
security in a per-device manner. This includes, for
instance, identification of malicious devices,
traffic filtering as it traverses the network, identifying unusual requests without congesting the
network or increasing latency, link protection
between IoT and edge/cloud servers, and device
identification and registration when new devices connect to the network. ML-based solutions
to secure IoT networks can also be deployed at
edge and cloud infrastructure to monitor incoming and outgoing traffic of devices within the
network, profile them, and determine when the
network is under attack by distinguishing normal from unusual
behavior of the entities.
Jamming is one of the attacks in IoT networks
aimed at disrupting communication between devices and edge servers. In order to tackle jamming
attacks, physical-layer security methods have
been proposed for IoT, as alternative solutions
to encryption/decryption-based methods, which
are costly in terms of computing resources. Chen
et al. [12] proposed a deep learning framework
for jamming attack detection in a mobile edge
computing infrastructure supporting IoT-based
cyber-physical transportation. The proposed
framework uses a deep belief network to analyze
attack behaviors from required permissions, sensitive application programming interfaces (APIs),
and dynamic behaviors.
Liu et al. [7] used k-means clustering to identify
malicious nodes involved in data routing in IoT
multi-hop applications. Accordingly, probe packets are transmitted from source nodes toward the
destination (sink). The destination calculates the
fraction of unmodified packets by checking the
integrity of each received probe packet, along
the multiple paths from the source node. Hence,
k-means is used to cluster nodes in two groups
(benign or malicious nodes) based on the reputation attributes of the paths they are part of, and
their contributions to the paths.
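The clustering step can be sketched as follows, under simplifying assumptions: each node is scored only by the average reputation of the paths it belongs to (the per-path contribution attribute used in [7] is omitted), and a one-dimensional k-means with k = 2 separates the scores. The path names, memberships, and reputation values are hypothetical.

```python
def node_scores(path_reputation, path_members):
    """Score each node by the average reputation of the paths it is part of."""
    collected = {}
    for path, members in path_members.items():
        for node in members:
            collected.setdefault(node, []).append(path_reputation[path])
    return {node: sum(vals) / len(vals) for node, vals in collected.items()}


def two_means(scores, iterations=20):
    """1-D k-means with k=2; centroids start at the lowest and highest score."""
    low, high = min(scores.values()), max(scores.values())
    for _ in range(iterations):
        malicious = {n for n, s in scores.items() if abs(s - low) < abs(s - high)}
        benign = set(scores) - malicious
        new_low = sum(scores[n] for n in malicious) / len(malicious) if malicious else low
        new_high = sum(scores[n] for n in benign) / len(benign) if benign else high
        if (new_low, new_high) == (low, high):
            break  # centroids stopped moving
        low, high = new_low, new_high
    return benign, malicious


# Hypothetical reputations: fraction of probe packets that reached the sink unmodified.
path_reputation = {"p1": 0.97, "p2": 0.96, "p3": 0.40, "p4": 0.35, "p5": 0.38, "p6": 0.98}
path_members = {
    "p1": ["a", "b"], "p2": ["b", "c"], "p3": ["a", "d"],
    "p4": ["c", "d"], "p5": ["b", "d"], "p6": ["a", "c"],
}

benign, malicious = two_means(node_scores(path_reputation, path_members))
print(sorted(benign))     # ['a', 'b', 'c']
print(sorted(malicious))  # ['d']
```

Node d appears on every low-reputation path, so its average score falls into the low cluster even though nodes a, b, and c each share one path with it.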
Another approach to secure IoT networks is
to detect the presence of devices with unpatched
vulnerabilities and apply the necessary protection
measures to secure the other devices in the
same network. IoT SENTINEL [13] implements a software-defined networking (SDN)-based
Security Gateway to monitor and classify the
devices, as well as to send device fingerprints to
the proposed IoT Security. The Random Forest
algorithm is used to create classifiers for devices
with known fingerprints. When a new IoT device connects to the network, 23
features extracted from each packet of a set collected
during the device’s initialization are used as
input for each classifier, which provides a binary decision as to whether the input fingerprint
matches the device type.
In addition, IoT networks can suffer from DoS
attacks on edge computing resources. Malicious nodes can
attack the edge computing infrastructure by
offloading tasks aimed at occupying edge processing, storage, and communication resources. As a result, tasks offloaded by legitimate devices will
not find available resources on the edge and will
need to be handled locally, which will exhaust
resource-constrained IoT devices and impair the performance of applications. Alli and Alam [14] proposed
SecOFF-FCIoT, an ML-based approach for secure
task offloading to fog and cloud servers. The proposed solution uses Particle Swarm Optimization
(PSO) at the IoT device level to optimally select a fog
node to handle offloaded tasks. In addition, a neuro-fuzzy model is used at gateway nodes to evaluate
data coming from IoT devices and isolate the malicious devices that are sending invalid data with the
purpose of congesting the network.
In contrast, Xiao et al. [8] investigated the use
of a reinforcement learning-based procedure for
securing mobile edge caching (MEC) devices. In
IoT applications, an MEC infrastructure is a
target for attackers seeking either to leak data
cached at the MEC devices or to cause denial of service
by impairing the performance of MEC systems. To this end, the authors investigated the use of
a deep Q-network (DQN) to secure MEC. The
DQN model observes user density, battery levels,
jamming strength, and radio channel bandwidth,
and selects the edge device to which the task is offloaded,
the offloading rate/time, and the transmission
power the IoT device uses for offloading.
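The learning loop behind such an approach can be sketched with tabular Q-learning, a simpler stand-in for the DQN used in [8], which replaces the table with a neural network. The state and action encodings, the reward values, the learning parameters, and the function names below are hypothetical and chosen only to mirror the observations and decisions listed above.

```python
import random


def select_action(q_table, state, actions, epsilon, rng=random):
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q_table.get((state, a), 0.0))


def q_update(q_table, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """One Q-learning step: Q <- Q + alpha * (reward + gamma * max Q' - Q)."""
    best_next = max(q_table.get((next_state, a), 0.0) for a in actions)
    old = q_table.get((state, action), 0.0)
    q_table[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q_table[(state, action)]


# Hypothetical action space: (edge device, offloading rate, transmit power).
ACTIONS = [
    (edge, rate, power)
    for edge in ("edge_a", "edge_b")
    for rate in (0.5, 1.0)
    for power in ("low", "high")
]

# Hypothetical discretized observations:
# (user density, battery level, jamming strength, channel bandwidth).
state = ("few_users", "battery_high", "jamming_strong", "bw_narrow")
next_state = ("few_users", "battery_high", "jamming_weak", "bw_narrow")

q = {}
# The device offloaded at full rate and high power and received reward 1.0.
first = q_update(q, state, ACTIONS[3], reward=1.0,
                 next_state=next_state, actions=ACTIONS)
# A later step with zero reward still inherits discounted value from `state`.
second = q_update(q, next_state, ACTIONS[0], reward=0.0,
                  next_state=state, actions=ACTIONS)
# With epsilon = 0 the policy is purely greedy.
greedy = select_action(q, state, ACTIONS, epsilon=0.0)
print(first, second, greedy)
```

A table indexed by discretized states does not scale to the continuous observations named above, which is the usual motivation for approximating the Q-function with a deep network.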
Collaborative solutions need to be
explored to improve the security of large-scale
and massive IoT networks. IoT devices can
select edge servers based on the security level of the communication and the server. However, periodic communication among
IoT devices to exchange the security levels of the edge devices
they have used will congest the network and incur additional costs, such as energy consumption.
This calls for collaborative machine learning approaches in which model
parameters, rather than the data used for training,
are shared among the devices.
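One common way to share parameters rather than data is federated averaging, shown here only as an illustrative sketch; the discussion above does not prescribe a specific algorithm. Each device is assumed to report a locally trained parameter vector together with its number of local training samples. The function name and the sample values are hypothetical.

```python
def federated_average(updates):
    """Average parameter vectors, weighting each by its local sample count.

    `updates` is a list of (parameters, num_samples) pairs, where every
    `parameters` list has the same length. Raw training data never leaves
    the devices; only these parameter vectors are exchanged.
    """
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("at least one training sample is required")
    size = len(updates[0][0])
    return [
        sum(params[i] * n for params, n in updates) / total
        for i in range(size)
    ]


# Two hypothetical devices: the second trained on three times as much data,
# so its parameters carry three times the weight in the aggregate.
device_updates = [
    ([1.0, 2.0], 100),
    ([3.0, 4.0], 300),
]
global_params = federated_average(device_updates)
print(global_params)
```

Only one parameter vector per device crosses the network in each round, which addresses the congestion and energy concerns raised above better than exchanging raw observations.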
Future Research Directions
While important progress has been achieved,
there are several directions that require further
exploration in the design of solutions to secure
IoT applications.
First, there is a lack of machine learning-based
solutions that consider different types of information for
device profiling. Current ML-based cybersecurity
solutions for IoT authentication and access control
profile devices in terms of their hardware
imperfections. However, additional information
can be considered to improve device profiling
and thereby the performance of IoT cybersecurity
solutions. For instance, ML-based solutions could combine several kinds of IoT infrastructure usage information, such as CPU, memory,
and network traffic intensity and pattern, rather
than considering a single aspect (as is done in the
current literature). High-level
information, such as social interactions with other
devices, could also be used to improve the performance of ML-based
cybersecurity solutions in IoT applications.
Battery draining and sleep deprivation attacks
are common and can be catastrophic for IoT devices. As
mentioned in [15], some works in the literature
have already investigated the energy usage patterns of IoT devices with the aim of detecting energy
depletion and DDoS attacks. However, more
research efforts in this area are needed. For
instance, IoT devices will work in a duty-cycled
manner, sleeping (i.e., with the
transceiver turned off) most of the time
to reduce energy consumption. Such features
should be explored: supervised learning
can be used to correlate devices with similar functionalities and detect when a device is working
with an abnormal active and sleep cycle.
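As a rough illustration of duty-cycle monitoring, the sketch below flags a device whose active fraction deviates strongly from that of peers with similar functionality. It deliberately swaps the supervised model suggested above for a simple robust statistic (the median and the median absolute deviation). The function name, the 3.5 threshold, and the sample values are assumptions.

```python
from statistics import median


def is_duty_cycle_abnormal(active_fraction, peer_fractions, threshold=3.5):
    """Flag a device whose active-time fraction is far from its peers'.

    Uses the modified z-score 0.6745 * |x - median| / MAD, which is less
    sensitive to outliers among the peers than a mean-based score.
    """
    med = median(peer_fractions)
    mad = median([abs(x - med) for x in peer_fractions])
    if mad == 0:
        # All peers agree exactly; any deviation at all is treated as abnormal.
        return active_fraction != med
    score = 0.6745 * abs(active_fraction - med) / mad
    return score > threshold


# Hypothetical fractions of time the transceiver is on, for peer devices
# with the same functionality (for example, identical temperature sensors).
peers = [0.05, 0.06, 0.04, 0.05, 0.07]

kept_awake = is_duty_cycle_abnormal(0.60, peers)  # awake 60% of the time
healthy = is_duty_cycle_abnormal(0.055, peers)    # close to its peers
print(kept_awake, healthy)
```

A device that is kept awake far more often than comparable devices is a candidate victim of a sleep deprivation attack and can be inspected before its battery is exhausted.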
Furthermore, there is a lack of investigation
of collaborative and distributed machine learning-based solutions. IoT will demand ML-based
solutions that run on distributed and heterogeneous devices. Such solutions must be collaborative and
must not rely on centralized data training. In this
regard, federated learning could be used as a
starting point for such approaches.
In addition, classic challenges of machine
learning, such as the availability of data sets for training and validation, must be tackled. There is a lack of IoT
data sets covering incoming/outgoing network
traffic, device operations, and user interactions.
Moreover, there is a lack of data sets related to
attacks on and threats to IoT applications.
Conclusion
This article presented a detailed discussion of
the advantages and challenges of machine learning (ML)-based solutions to secure the Internet
of things (IoT). We described the fundamental
design requirements and challenges of cybersecurity solutions for IoT. We then discussed
how ML-based solutions could be advantageous
in tackling the vulnerabilities of IoT. We classified
ML-based cybersecurity solutions as device-based
and network-based, according to the main security goal they are intended to address in IoT
applications. The proposed classification helps in
understanding the requirements and challenges
faced when designing new ML-based cybersecurity solutions for IoT applications. For each category of the proposed classification, we shed light
on the main goal and fundamental challenges to
be tackled, and discussed representative works in
the literature. Finally, we presented some future
research directions that need further investigation.
Acknowledgment
This work is partially supported by the NSERC
DISCOVERY, NSERC CREATE TRANSIT and Canada Research Chairs Programs.
References
[1] J. Jagannath et al., “Machine Learning for Wireless Communications in the Internet of Things: A Comprehensive Survey,” Ad Hoc Networks, vol. 93, 2019, p. 101913–59.
[2] F. Restuccia et al., “Securing the Internet of Things in the Age of Machine Learning and Software-Defined Networking,” IEEE Internet of Things J., vol. 5, no. 6, Dec. 2018, pp. 4829–42.
[3] L. Xiao et al., “IoT Security Techniques Based on Machine Learning: How Do IoT Devices Use AI to Enhance Security?” IEEE Signal Processing Mag., vol. 35, no. 5, Sep. 2018, pp. 41–49.
[4] A. Vashist et al., “Securing a Wireless Network-on-Chip Against Jamming Based Denial-of-Service Attacks,” Proc. IEEE Computer Society Annual Symposium on VLSI (ISVLSI), July 2019, pp. 320–25.
[5] F. Li et al., “System Statistics Learning-Based IoT Security: Feasibility and Suitability,” IEEE Internet of Things J., vol. 6, no. 4, Aug. 2019, pp. 6396–6403.
[6] C. Kolias et al., “DDoS in the IoT: Mirai and Other Botnets,” Computer, vol. 50, no. 7, July 2017, pp. 80–84.
[7] X. Liu et al., “Identifying Malicious Nodes in Multihop IoT Networks Using Diversity and Unsupervised Learning,” Proc. IEEE Int’l Conference on Communications (ICC), May 2018, pp. 1–6.
[8] L. Xiao et al., “Security in Mobile Edge Caching with Reinforcement Learning,” IEEE Wireless Commun., vol. 25, no. 3, June 2018, pp. 116–22.
[9] R. Das et al., “A Deep Learning Approach to IoT Authentication,” Proc. IEEE Int’l Conference on Communications (ICC), May 2018, pp. 1–6.
[10] B. Chatterjee et al., “RF-PUF: Enhancing IoT Security Through Authentication of Wireless Nodes Using in-situ Machine Learning,” IEEE Internet of Things J., vol. 6, no. 1, Feb. 2019, pp. 388–98.
[11] A. Ferdowsi and W. Saad, “Deep Learning for Signal Authentication and Security in Massive Internet-of-Things Systems,” IEEE Trans. Commun., vol. 67, no. 2, Feb. 2019, pp. 1371–87.
[12] Y. Chen et al., “Deep Learning for Secure Mobile Edge Computing in Cyber-Physical Transportation Systems,” IEEE Network, vol. 33, no. 4, July 2019, pp. 36–41.
[13] M. Miettinen et al., “IoT SENTINEL: Automated Device-type Identification for Security Enforcement in IoT,” Proc. IEEE 37th Int’l Conf. on Distributed Computing Systems (ICDCS), June 2017, pp. 2177–84.
[14] A. Alli and M. Alam, “SecOFF-FCIoT: Machine Learning Based Secure Offloading in Fog-Cloud of Things for Smart City Applications,” Internet of Things, vol. 7, 2019, pp. 70–89.
[15] A. Mosenia and N. Jha, “A Comprehensive Study of Security of Internet-of-Things,” IEEE Trans. on Emerging Topics in Computing, vol. 5, no. 4, Oct. 2017, pp. 586–602.
Biographies
Azzedine Boukerche [FIEEE, FEiC, FCAE, FAAAS] is a Distinguished University Professor and Canada Research Chair
Tier-1 at the University of Ottawa. He has received the C. Gotlieb Computer Medal Award, Ontario Distinguished Researcher Award, Premier of Ontario Research Excellence Award, G.
S. Glinski Award for Excellence in Research, IEEE Computer
Society Golden Core Award, IEEE CS-Meritorious Award, IEEE
TCPP Leaderships Award, IEEE ComSoc ASHN Leaderships and
Contribution Award, and the University of Ottawa Award for
Excellence in Research. His research interests include wireless
ad hoc and sensor networks, wireless networking and mobile
computing.
Rodolfo W. L. Coutinho (rodolfo.coutinho@concordia.ca)
is an assistant professor at Concordia University, Canada. He
received the ACM MSWiM’19 Rising Star Award and the
2018 Pierre Laberge Prize at the University of Ottawa. He also
received the Best Thesis Awards from the CAPES, Brazilian Computer Society and the Brazilian Computer Networks and Distributed Systems Interest Group. He has served as TPC Co-Chair
for ACM and IEEE conferences. His research interests include
Internet of Things, underwater networks, information-centric
networking, and mobile computing.
IEEE Network • Accepted for Publication