EAI/Springer Innovations in Communication and Computing
Sarvesh Pandey
Udai Shanker
Vijayalakshmi Saravanan
Rajinikumar Ramalingam Editors
Role
of Data-Intensive
Distributed
Computing Systems
in Designing Data
Solutions
EAI/Springer Innovations in Communication
and Computing
Series Editor
Imrich Chlamtac, European Alliance for Innovation, Ghent, Belgium
The impact of information technologies is creating a new world yet not fully
understood. The extent and speed of economic, life style and social changes
already perceived in everyday life is hard to estimate without understanding the
technological driving forces behind it. This series presents contributed volumes
featuring the latest research and development in the various information engineering technologies that play a key role in this process. The range of topics,
focusing primarily on communications and computing engineering include, but
are not limited to, wireless networks; mobile communication; design and learning;
gaming; interaction; e-health and pervasive healthcare; energy management; smart
grids; internet of things; cognitive radio networks; computation; cloud computing;
ubiquitous connectivity, and in more general smart living, smart cities, Internet of
Things and more. The series publishes a combination of expanded papers selected
from hosted and sponsored European Alliance for Innovation (EAI) conferences
that present cutting edge, global research as well as provide new perspectives on
traditional related engineering fields. This content, complemented with open calls
for contribution of book titles and individual chapters, together maintain Springer’s
and EAI’s high standards of academic excellence. The audience for the books
consists of researchers, industry professionals, advanced level students as well as
practitioners in related fields of activity, including information and communication
specialists, security experts, economists, urban planners, doctors, and in general
representatives in all those walks of life affected by and contributing to the information
revolution.
Indexing: This series is indexed in Scopus, Ei Compendex, and zbMATH.
About EAI - EAI is a grassroots member organization initiated through cooperation between businesses, public, private and government organizations to address
the global challenges of Europe’s future competitiveness and link the European
Research community with its counterparts around the globe. EAI reaches out to
hundreds of thousands of individual subscribers on all continents and collaborates
with an institutional member base including Fortune 500 companies, government
organizations, and educational institutions, to provide a free research and innovation
platform. Through its open free membership model EAI promotes a new research
and innovation culture based on collaboration, connectivity and recognition of
excellence by community.
Editors
Sarvesh Pandey
Computer Science
Banaras Hindu University
Varanasi, India
Udai Shanker
Madan Mohan Malaviya University of
Technology
Gorakhpur, Uttar Pradesh, India
Vijayalakshmi Saravanan
University of South Dakota
South Dakota, SD, USA
Rajinikumar Ramalingam
Deutsches Elektronen-Synchrotron DESY
Hamburg, Germany
ISSN 2522-8595
ISSN 2522-8609 (electronic)
EAI/Springer Innovations in Communication and Computing
ISBN 978-3-031-15541-3
ISBN 978-3-031-15542-0 (eBook)
https://doi.org/10.1007/978-3-031-15542-0
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland
AG 2023
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Foreword
Data analytics and machine learning technologies, particularly in a decentralized
scenario, are offering cost-effective solutions for many real-life problems. Recent
developments in computer technology have led to increased research interests in
the field of modern data-intensive distributed computing systems. Today, when user
requirements are becoming exponentially complex, it is not possible to meet the
expectations of society by applying core knowledge of any single research area of
computer science; rather, there is a need for integrated efforts with the umbrella of
research topics. This prompted the researchers to think about the multi-disciplinary
nature of work to provide a solution for the challenges set forth due to various future
requirements. In this direction, data systems serve as a strong component that we are
either using or would be using in near future.
Advancement in the field of modern computing will continue to be critical
for computer science researchers and a matter of concern for the end users.
Therefore, the objective of the book Role of Data-Intensive Distributed Computing
Systems in Designing Data Solutions, edited by Sarvesh Pandey, Udai Shanker,
Vijayalakshmi Saravanan, and Rajinikumar Ramalingam, is to introduce the reader
to recent research activities in the field of modern-day data-driven decision-making
processes. It is an excellent example of a collection of advanced works applied to
relevant problems. It covers areas like real-time systems, machine learning, data
analytics, medical imaging, and applications of all these areas considering ever-growing user demands. Some of the chapters of this book provide interesting
information on the integration of this wonderful and disruptive technology with
modern applications. Also, one chapter introduces the readers to a system model
for detecting the original camera that clicked a particular image – this would help
in solving many real-life issues in the near future. Research addressing performance
issues of these systems is a relatively novel area, and the contents in the chapter are
good enough to evince the interest for developing innovative solutions to the open
technical challenges.
This book will be very helpful to students, researchers, scientists, and industry
professionals working in the field of computing. A genuine attempt is made to
increase the understanding of how data is going to play a central role in many of the
emerging research domains. It would empower the readers to work on new research
domains, which would be useful for society. At last, this book indeed provides future
insights on the performance issues with modern data-intensive systems.
Director at IIIT, Pune, Maharashtra, India
Anupam Shukla
Preface
This book, titled Role of Data-Intensive Distributed Computing Systems in Designing Data Solutions, is centered on discussing various new opportunities created
by the fast-computing power and big data collectively. There were more than 40
submissions; out of these, 16 submissions have been finally included in this book
proceeding after rigorous review. We appreciate everyone who considered this venue
for the possible publication of their research articles; congratulations to all the
authors whose book chapters are included.
To better organize the contents, this book is divided into three sections. Part
I, which consists of four chapters, is mainly on integration of data systems and
traditional computing research. Part II, which consists of seven chapters, is about
how data-driven decision-making is now a reality. Finally, Part III, which consists
of five chapters, discusses the critical role of data management in healthcare
functioning. The themes of the accepted book chapters are discussed below in brief
so that readers can see what this book has to offer.
Part I: On Integration of Data Systems and Traditional
Computing Research
Chapter 1 talks about energy-conscious scheduling of resources for fault-tolerant
distributed computing systems. This chapter emphasizes the point that reliability
should be given the same weight as deadlines in the design of such systems.
Chapter 2 discusses how advanced morphological component analysis and
steganography could be utilized for the purpose of secret data transmission.
Chapter 3 puts light on cyber-security aspects of data management in wireless
sensor networks. Chapter 4 proposes a dynamic privacy protection scheme for
trajectory data.
Part II: Data-Driven Decision-Making Systems
Chapter 5 proposes an idea of how integration of mobile agent systems with e-governance can lead to a better, more transparent, and dynamic infrastructure with no loss of
reliability and fault tolerance.
When we are living in a world where a countless number of websites are on
the Internet, it is important that we should design a system to make sure that end
users do not fall into the trap of phishing websites. Chapter 6 not only discusses
this problem but also attempts to resolve this issue by using some of the existing
machine learning techniques.
Source camera identification method, which can be used to identify the source
camera of the images/photos, plays a very important role in today’s era, especially in
the domain of digital image forensics. In Chap. 7, using machine learning classifiers,
authors attempted to predict device-specific information from picture data.
Dependence on vehicles has increased manifold in the twentieth century. Now,
with advent of the Internet, researchers started working on the idea of “Internet of
Vehicles (IoV).” After that, since 2015, a cross-injection of IoV and blockchain
technology has continued to be a research area with lots of potential. Chapter 8 puts
light on all these aspects.
Traditional bidding system can also benefit from blockchain technology. Chapter
9 discusses this. With integration of blockchain, without any doubt, transparency of
bidding process would increase.
Chapter 10 talks about vehicular ad hoc networks (VANETs). Various security
challenges one may face with VANET-based systems are nicely discussed in this
chapter. This exploratory study also lists future promising solutions.
Chapter 11 is all about providing a user-friendly GUI to the learners. In the recent
past, we faced an unprecedented threat of COVID-19. This has proven yet again that
online learning systems are our friends and can co-exist with traditional classroom
teaching methods, and by utilizing both, we could improve the outcomes to a greater
extent.
Part III: Data-Intensive Systems in Healthcare
After the COVID-19 outbreak, the first thing we struggled with was the need for an
efficient medical kit to test whether someone is COVID-19 positive or not. In the
fight against COVID-19, it has been an accepted practice that CT scans could be
relied on for testing. Chapter 12 details on the aspect of analyzing high-resolution
CT images for COVID testing.
Chapter 13 proposes the use of an attention-based deep learning approach for the
analysis of X-ray images.
The efficacy of swarm-based methods in processing medical images is discussed
in detail in Chapter 14.
Chapter 15 talks about analyzing cardiac MRI images using convolutional neural
networks to detect cardiovascular diseases.
Along the line of Chap. 15, Chap. 16 focuses on analyzing brain images using
deep learning to detect brain tumors. In constrained circumstances, where people
with medical expertise may get overwhelmed, the techniques presented in Chaps.
15 and 16 could be of great assistive help.
To summarize, we are of the view that this book has perfectly covered various
application areas with central focus on big data.
Varanasi, India
Uttar Pradesh, India
SD, USA
Hamburg, Germany
Sarvesh Pandey
Udai Shanker
Vijayalakshmi Saravanan
Rajinikumar Ramalingam
Acknowledgment
It is a matter of immense pleasure for us to write the acknowledgment part of this
book; it took more than 2 years to complete – the longest assignment we worked
on till date. At the same time, when we look back, it feels fulfilling that most of
the goals we had thought of before accepting this opportunity have been met. This
book is truly an international one as it attracted submissions from multiple countries
around the globe. Acknowledgments are not just a part of a book; instead, they
remind us that a network of coordination among like-minded people can be a
blessing.
It’s a result of effort from a lot of people who directly/indirectly helped us in
making this book a reality. We thank all the authors who contributed to this book.
A hearty thank you goes to all the reviewers; you all have really made our job
easy by giving your timely, valuable insights on submissions. As a small token of
appreciation, we are sharing the names of reviewers:
1. Dr. Karthikeyan Chinnusamy, Veritas Technologies LLC, USA
2. Dr. Abdullah Alghamid, Najran University, Najran, Saudi Arabia
3. Dr. Savina Bansal, Maharaja Ranjit Singh Punjab Technical University
Bathinda, India
4. Dr. Ajey Kumar, Symbiosis International Deemed University, Pune, India
5. Dr. Sri Hari Nallamala, Lakireddy Bali Reddy College of Engineering
(Autonomous), India
6. Dr. Saurabh Pal, Veer Bahadur Singh Purvanchal University, Jaunpur, India
7. Dr. Gargi Srivastava, Rajeev Gandhi Institute of Petroleum Technology, Amethi, India
8. Dr. Nagendra Pratap Singh, NIT, Hamirpur, India
9. Dr. Vibhav Prakash, MNNIT, Allahabad, India
10. Dr. Sanjay Kumar, NIT, Jamshedpur, India
11. Dr. Mohit Kumar, NIT, Hamirpur, India
12. Dr. Awadhesh Kumar, BHU, Varanasi, India
13. Dr. Manoj Mishra, BHU, Varanasi, India
14. Dr. Rohit Tiwari, M. M. M. University of Technology, Gorakhpur, India
15. Dr. Sukhvinder Kaur, Swami Devi Dayal Institute of Engineering and Technology, Haryana, India
16. Dr. Arvind Tiwari, KNIT, Sultanpur, India
17. Dr. A. Shaji George, Indian Institute of Integrated Science Tech. and Research,
Chennai, India
18. Dr. Anil Kumar, Mody University, India
19. Dr. Ankit Jaiswal, Bennett University, Greater Noida, India
20. Dr. Ravi Sharma, IMS Engineering College, Ghaziabad, India
21. Dr. Suryabhan Pratap Singh, Institute of Technology and Management, Gorakhpur, India
22. Dr. Ajay Kumar Gupta, an Independent Researcher from New Delhi, India
23. Mr. Ravi Yadav, Research Scholar, BHU Varanasi, India
24. Mr. Santosh Tripathy, Research Scholar, IIT (BHU) Varanasi, India
25. Ms. Anupama Arun, Research Scholar, IIIT Pune, India
26. Ms. Ruchi Pathak, Infosys Limited, Mysore
27. Mr. Ankit Aakash, Stone Business Development Executive, Pidilite Industries
Limited, India
We would also like to thank our families. There is no such thing as work-life balance; obviously, at many points, we felt like we were not able to devote
enough time to our beloved ones because of being too busy with our professional
responsibilities. Almighty GOD has always been alongside us in all we do.
Contents
Part I On Integration of Data Systems and Traditional Computing Research
Energy Conscious Scheduling for Fault-Tolerant Real-Time Distributed Computing Systems
Savina Bansal, Rakesh Kumar Bansal, and Kiran Arora
Secret Data Transmission Using Advanced Morphological Component Analysis and Steganography
Binay Kumar Pandey, Digvijay Pandey, Ankur Gupta, Vinay Kumar Nassa, Pankaj Dadheech, and A. Shaji George
Data Detection in Wireless Sensor Network Based on Convex Hull and Naïve Bayes Algorithm
Edwin Hernan Ramirez-Asis, Miguel Angel Silva Zapata, A. R. Sivakumaran, Khongdet Phasinam, Abhay Chaturvedi, and R. Regin
DSPPTD: Dynamic Scheme for Privacy Protection of Trajectory Data in LBS
Ajay K. Gupta and Sanjay Kumar
Part II Data-Driven Decision-Making Systems
n-Layer Platform for Hi-Tech World
R. B. Patel, Lalit Awasthi, M. C. Govil, and Rachita
A Comparative Study of Machine Learning Techniques for Phishing Website Detection
Mohammad Farhan Khan, Rohit Kumar Tiwari, Sushil Kumar Saroj, and Tripti Tripathi
Source Camera Identification Using Hybrid Feature Set and Machine Learning Classifiers
Ankit Kumar Jaiswal and Rajeev Srivastava
Analysis of Blockchain Integration with Internet of Vehicles: Challenges, Motivation, and Recent Solution
Manik Gupta, R. B. Patel, and Shaily Jain
Reliable System for Bidding System Using Blockchain
N. Ambika
Security Challenges and Solutions for Next-Generation VANETs: An Exploratory Study
Pavan Kumar Pandey, Vineet Kansal, and Abhishek Swaroop
iTeach: A User-Friendly Learning Management System
Nikhil Sharma, Shakti Singh, Shivansh Tyagi, Siddhant Manchanda, and Achal Kaushik
Part III Data-Intensive Systems in Health Care
Analysis of High-Resolution CT Images of COVID-19 Patients
A. Joy Christy and A. Umamakeswari
Attention-Based Deep Learning Approach for Semantic Analysis of Chest X-Ray Images Modality
Rishabh Dhenkawat, Snehal Saini, Shobhit Kumar, and Nagendra Pratap Singh
Medical Image Processing by Swarm-Based Methods
María-Luisa Pérez-Delgado and Jesús-Ángel Román-Gallego
Left Ventricle Volume Analysis in Cardiac MRI Images Using Convolutional Neural Networks
Palakala Sai Krishna Yadhav, K. Susheel Kumar, and Nagendra Pratap Singh
MRI Image Analysis for Brain Tumor Detection Using Deep Learning
Prachi Chauhan, Hardwari Lal Mandoria, and Alok Negi
Index
Part I
On Integration of Data Systems and
Traditional Computing Research
Energy Conscious Scheduling for
Fault-Tolerant Real-Time Distributed
Computing Systems
Savina Bansal, Rakesh Kumar Bansal, and Kiran Arora
1 Introduction
The integration of computing with the physical world, together with ongoing advances in
technology, has made it possible to meet the growing computational demands of
industry and individuals. Pervasive computing devices employ controllers
to read physical inputs through sensors, perform data processing, and feed tangible
outputs to actuators. Real-time functions, especially those based on artificial intelligence
such as computer vision and sensor fusion, are gaining popularity because the needed hardware has become cost-effective owing to advances in VLSI and related
technologies. Real-time applications, as in avionics and aerospace engineering, the
automobile sector, and mission- and safety-critical applications in defense and medicine,
for which completion within a given deadline is as crucial as
logical accuracy, demand the use of real-time systems. Timeliness is essential
for a real-time application because beyond the specified time window or time instant (also
referred to as the task deadline) even a logically correct outcome is of no use.
Failing to honor a deadline can lead to serious consequences, ranging from loss of signal
quality, as during video-conferencing, to major financial loss or even the loss of
human lives [37, 49]. Real-time systems are capable of producing accurate
results within the given deadline provided the tasks are scheduled properly over
S. Bansal · R. K. Bansal
Department of Electronics and Communication Engineering, Giani Zail Singh Campus College of
Engineering & Technology, Bathinda, India
e-mail: savina.bansal@gmail.com; drrakeshkbansal@gmail.com
K. Arora ()
Department of Computer Science and Engineering, Baba Hira Singh Bhattal Institute of
Engineering & Technology, Lehragaga, India
e-mail: erkiranarora@gmail.com
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
S. Pandey et al. (eds.), Role of Data-Intensive Distributed Computing Systems in
Designing Data Solutions, EAI/Springer Innovations in Communication and
Computing, https://doi.org/10.1007/978-3-031-15542-0_1
Fig. 1 Embedded system market [14]
them. Scheduling, in general, relates to the allocation and assignment of incoming tasks
to the underlying computing processor(s) such that the deadlines of all tasks are honored.
This work is mainly concerned with the scheduling of real-time systems.
The development of systems with real-time capability is increasing at a fast
pace (Fig. 1) to satisfy the needs of our day-to-day lives. More specifically, these
systems have applications that affect our social and personal lives directly or
indirectly, such as bank transactions, automobiles, traffic signal controllers, medical
care, video-conferencing, smart homes, and firefighting [40]. According to a Global
Info Research study, the worldwide market for embedded
systems is projected to rise from 86,500 million US dollars in 2020 to 116,200 million US
dollars in 2025, at a compound annual growth rate of approximately 6.1% [39].
For instance, contemporary cars are equipped with hundreds of processing units that
provide everything from basic features such as vehicle control to specialized facilities for safety
and comfort. To recognize its surroundings, the perception subsystem in a vehicle
must process an enormous amount of data, which demands huge computational power
and necessitates the use of multicore or multiprocessor systems [35] in order to achieve
higher throughput, reduced response time, and increased reliability.
Substantial advancement in the performance of present-day computing systems
has led to a considerable rise in power consumption. In fact, the amount of heat
generated by them is quickly growing to a level equivalent to that of nuclear reactors, as
shown in Fig. 2 [46, 57]. In step with Moore’s law, the energy utilization of
computing systems has increased at an exponential rate over the last few decades.
Such a rise in energy consumption results in ecological and monetary problems,
due to which energy management has turned out to be a prime design concern for
computing systems [1, 13, 20, 38]. In the scientific and technical literature, the
interrelation between energy, economy, and environment is referred to as “3E”
Fig. 2 Rise in power densities [46]
[56]. The United Nations Development Organization proposed at the beginning of the
twenty-first century that this triple bond is of global importance, and it is among the
eight Millennium Development Goals (MDGs) in the global economic scenario.
The goal of ensuring environmental sustainability underlines the
importance of energy management. In light of this, 3E is more conspicuous today
than ever before, making energy management a premier research problem for
computational systems.
The main target for energy saving in such systems is the processor, as
a major fraction of total power is consumed by the CPU alone. As the complexity
and computational power of real-time computing systems grow, operating
temperatures rise because ever more transistors are integrated on small
chips. Miniaturization further aggravates the energy consumption of processors.
State-of-the-art processors consume a substantial amount of energy. For example,
the Intel Core i7-975 draws an estimated 83 W of power in the idle state, and the AMD FX
8350 processor has a peak power consumption of 210 W [41]. Various assessments
[5, 8, 72] recommend that the main focus should be on power efficiency when designing complex real-time systems. Hence, it becomes necessary to treat energy
management as a mandatory parameter in real-time multiprocessor scheduling
algorithms.
Along with timing precision, real-time systems must be reliable, yet even a precisely designed system may fail, which can lead to unexpected situations. Massive
heat dissipation adversely affects the reliability and performance of semiconductor
devices and also contributes to global warming [17]. Another serious threat
to reliability is high operating temperature, which is a direct consequence
of the high power consumption that results from excessive transistor integration in
a small area. As reported by Srinivasan et al. [53], the maximum temperature rise
of a 180 nm processor is 15 K lower than that of a 65 nm processor,
and scaling down to such a small feature size leads to a 316% growth in the error rate. For safety-critical real-time computing systems, reliability is a vital feature because faults may
cause deadline violations, which can be disastrous at times. To avoid this, fault
detection and tolerance features should be incorporated in the system to achieve
high reliability, so that it can operate proficiently even in the presence of faults. Therefore,
reducing energy consumption while maintaining the reliability of a real-time system is
a challenging problem that requires consideration.
2 Energy Management
The presence of miniaturized electronic components and chips in contemporary
computing systems makes the energy consumption problem more severe than ever. The most
prevalent digital electronic technology is Complementary Metal-Oxide Semiconductor (CMOS), whose peak power dissipation occurs during state transitions
of transistors. To handle the power consumption of CMOS circuits, both static power
dissipation (caused by leakage current) and dynamic power dissipation (caused by
switching activity) need to be minimized [31–33]:
– Static power: It arises as an after-effect of leakage current flowing through
transistors. Leakage current increases exponentially with reduced thickness of
insulating region and leads to rise in static power.
– Dynamic power: It is a consequence of repeated charging and discharging of
capacitance of several hundreds of gates in contemporary chips.
To reduce static and dynamic power consumption, commonly used techniques
are dynamic power management (DPM) and dynamic voltage and frequency
scaling (DVFS), respectively. These techniques are overviewed in the following
subsections.
2.1 Dynamic Power Management
Intel, HP, and Microsoft presented an enhanced framework, called Advanced
Configuration and Power Interface (ACPI) for device configuration and monitoring
[7, 60]. Basically, ACPI provides a simple and adaptable interface to operating
system for configuring and discovering peripherals. The motive of ACPI-based
power management is to put the whole system or devices that are unused or less
used into low-power states when possible.
Because workloads vary arbitrarily at run time, DPM attains energy efficiency
in the system by judiciously lowering the performance of system components and by
switching off the processor in idle periods. However, putting
a system component into a power-saving state and bringing it back to the active state involves
energy and time overhead.
Every processor has a break-even time $b_0$ associated with the transition from one
state to another. Only when an idle interval is longer than the break-even time
is the processor put into sleep mode to take advantage of DPM for reducing
energy consumption. The DPM technique decides whether to put system
components into a power-saving state based on the length of the forthcoming idle period. For
example, if the idle period is relatively short, the energy cost of switching between states
will be larger than the energy saved in the sleep state.
The smallest value of $b_0$ is the idle duration for which the processor consumes exactly the same
amount of energy whether it stays in the active state or transitions from the active state to the power-efficient state.
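The break-even rule can be made concrete with a small sketch. The energy model below is an assumption introduced for illustration and is not taken from the chapter: staying idle in the active state costs `p_idle` watts, the sleep state costs `p_sleep` watts, and a round trip into and out of sleep costs `e_transition` joules and takes `t_transition` seconds. All names and numbers are hypothetical.

```python
def break_even_time(p_idle, p_sleep, e_transition, t_transition):
    """Smallest idle interval (s) for which sleeping costs no more energy than staying active.

    Assumed model (not from the chapter): staying active costs p_idle * t;
    sleeping costs e_transition for the round trip (taking t_transition seconds)
    plus p_sleep for the remaining time.
    """
    if p_idle <= p_sleep:
        raise ValueError("sleep state must draw less power than the idle active state")
    # Solve p_idle * t = e_transition + p_sleep * (t - t_transition) for t.
    energy_balance = (e_transition - p_sleep * t_transition) / (p_idle - p_sleep)
    # The processor cannot sleep for less than the time needed to switch states.
    return max(t_transition, energy_balance)


def should_sleep(idle_interval, b0):
    """DPM rule from the text: sleep only if the idle interval exceeds b0."""
    return idle_interval > b0


# Hypothetical numbers: 1.0 W idle, 0.1 W asleep, 0.05 J and 10 ms per round trip.
b0 = break_even_time(p_idle=1.0, p_sleep=0.1, e_transition=0.05, t_transition=0.010)
print(f"break-even time: {b0 * 1000:.1f} ms")
print(should_sleep(0.020, b0), should_sleep(0.100, b0))
```

With these assumed values the break-even time is about 54 ms, so a 20 ms idle gap is not worth a transition while a 100 ms gap is.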
2.2 Dynamic Voltage and Frequency Scaling
Growing computing capabilities demand higher processor operating frequencies,
which lead to higher energy consumption. To sustain the necessary processor performance at higher operational frequencies, the number of integrated
transistors per chip is growing day by day [12]. Fast switching of a large number
of transistors increases the frequency of a processor and also makes them dissipate
more dynamic power.
The dynamic power consumption of a processor has a quadratic relation with the
supply voltage:
\[ \rho_{dyn} = \wp \, \zeta_{ef} \, \upsilon^{2} f, \tag{1} \]
where $\rho_{dyn}$ is the dynamic power, $\wp$ is the gate activity factor, $\zeta_{ef}$ is the switched
capacitance, $\upsilon$ is the supply voltage, and $f$ is the operating frequency. DVFS
dynamically adjusts the voltage/frequency to reduce the processor’s power consumption;
however, it trades performance for energy, since reducing the frequency in turn
increases the execution time of the application. The challenge for DVFS in real-time applications is to preserve the feasibility of the system while reducing the
voltage, so that all deadlines are honored and energy consumption decreases.
Care must therefore be taken when using DVFS for real-time applications, as they have
stringent timing constraints.
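To illustrate the trade-off expressed by Eq. (1), the following sketch evaluates dynamic power, dynamic energy, and execution time at two operating points. The parameter values and the task length are hypothetical, and the sketch assumes that voltage can be halved together with frequency, which real processors only approximate.

```python
def dynamic_power(activity, c_eff, voltage, frequency):
    """Eq. (1): dynamic power = activity factor * switched capacitance * V^2 * f."""
    return activity * c_eff * voltage ** 2 * frequency


def dynamic_energy(cycles, activity, c_eff, voltage, frequency):
    """Dynamic energy of a task needing `cycles` clock cycles: power * (cycles / f)."""
    return dynamic_power(activity, c_eff, voltage, frequency) * cycles / frequency


# Hypothetical operating points: full speed versus half voltage and half frequency.
full = dict(activity=0.5, c_eff=1e-9, voltage=1.2, frequency=2.0e9)
half = dict(activity=0.5, c_eff=1e-9, voltage=0.6, frequency=1.0e9)
cycles = 4.0e6  # hypothetical task length in clock cycles

power_ratio = dynamic_power(**half) / dynamic_power(**full)
energy_ratio = dynamic_energy(cycles, **half) / dynamic_energy(cycles, **full)
time_ratio = (cycles / half["frequency"]) / (cycles / full["frequency"])
print(power_ratio, energy_ratio, time_ratio)
```

Under these assumptions, dynamic power falls to one-eighth and dynamic energy to one-quarter, while execution time doubles. That doubling is exactly what a real-time scheduler has to check against the task deadline.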
Processors now being launched in the market, such as the AMD R-series [2], have DVFS
capabilities enabled on them. Thus, in contemporary processors,
it is possible to dynamically regulate the supply voltage and operational frequency
to cut down dynamic power consumption using DVFS, but at the price of longer
circuit delay [6, 9]. Real-time DVFS techniques can be classified by the time
of speed adjustment as inter-task or intra-task.
– Inter-task DVFS: With inter-task DVFS techniques, a job runs at the same
frequency level until it finishes its execution after being dispatched or is
preempted by a high-priority job. The speed may be readjusted when it restarts
execution after preemption depending on the available slack at that particular
time. A majority of DVFS algorithms are based on inter-task technique as it has
low run-time overhead.
– Intra-task DVFS: The intra-task algorithms adjust the speed at the well-determined power-management points (PMPs) at run time and focus on reducing
dynamic energy consumption. But they involve extra energy and time overhead
owing to a large number of speed changes.
A decrease in processor frequency reduces frequency-dependent power, but it
increases the execution time of a task, which in turn increases the energy consumed
by static and frequency-independent power. To overcome this problem, a critical
frequency fcrit, also called the energy-efficient frequency, has been proposed in
the literature [29], below which DVFS is no longer effective. Tasks should therefore
not be executed at a frequency lower than fcrit.
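The existence of such a frequency can be illustrated with a small sketch. It assumes a power model of the form static power + frequency-independent power + c_eff * f^m with frequency normalized to the maximum; this model, the parameter values, and the function names are assumptions chosen for illustration and are not taken from the text.

```python
def task_energy(freq, wcet_at_fmax, p_static, p_indep, c_eff, exponent):
    """Energy to run one task at normalized frequency `freq` (0 < freq <= 1).

    Execution time stretches to wcet_at_fmax / freq, so the constant power
    terms cost more energy as the frequency is lowered.
    """
    power = p_static + p_indep + c_eff * freq ** exponent
    return power * (wcet_at_fmax / freq)


def energy_efficient_level(levels, **model):
    """Pick the discrete frequency level with the lowest task energy."""
    return min(levels, key=lambda f: task_energy(f, **model))


# Hypothetical model parameters and discrete frequency levels.
MODEL = dict(wcet_at_fmax=1.0, p_static=0.05, p_indep=0.1, c_eff=1.0, exponent=3)
LEVELS = [0.2, 0.4, 0.6, 0.8, 1.0]

if __name__ == "__main__":
    for f in LEVELS:
        print(f"f = {f:.1f}: energy = {task_energy(f, **MODEL):.4f}")
    print("energy-efficient level:", energy_efficient_level(LEVELS, **MODEL))
```

With these assumed parameters the energy curve has its minimum at an intermediate level, and running at the lowest level costs more energy than running at the level above it.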
3 Fault Tolerance
Rapid growth in the scale and complexity of real-time multiprocessor computing
systems has made reliability an increasingly challenging issue. Due to the aggressive
scaling of transistors, CMOS devices have become more susceptible to extrinsic
effects such as high-energy radiation and electromagnetic interference. Thus,
computing systems have become prone to various types of faults that may introduce
errors in results. As the feature size of combinational logic circuits shrinks from
600 nm to 50 nm, the soft error rate (SER) per chip increases by nine orders of
magnitude [52]. If scaling continues at the same pace, then at the 16 nm technology
node each chip will experience at least one failure per day [23, 34].
Even a perfectly designed system may fail owing to unpredictable fault occurrence.
A fault is a defect in the system, either a hardware defect or an implementation
flaw in the software, that causes an abnormal response. A system is said to have
failed when the service it provides diverges from the desired service. For example,
a computing system that monitors the state of critical patients in a hospital must
act as soon as a patient’s state changes. A remedial measure, such as raising an
alarm or injecting medicine into the patient’s body, must be taken if the patient’s
blood pressure falls below or rises above a specific threshold. This process must
be performed strictly within a defined time limit (or deadline). Thus, computing
systems employed in hospitals, especially in intensive care units (ICUs), should
guarantee that even if the processor incurs a fault, the task is executed within
its deadline [18]. Another example is flight control systems, where tasks are often
activated by the controllers depending
on the output seen on screen. However, if the system incurs a fault, it should be
able to handle the fault before the deadline [36].
Processor faults are broadly classified as permanent and transient faults:
– Permanent faults: Permanent faults are hardware failures caused by manufacturing
defects at the time of fabrication or by wear and tear due to aging. They damage
the processor and can stop it from working. The only way to tolerate permanent
faults is hardware redundancy (employing additional hardware components).
– Transient faults: This type of fault generates soft errors (or single-event upsets
(SEUs)), which are not persistent but may cause errors in computation or corruption
of data. With the continued scaling of CMOS technologies, nearly all digital
systems, not only those that work in outer space, are prone to transient faults
[70]. Studies have shown that transient faults occur more often than permanent
faults [11, 19].
Many techniques have been proposed for detecting faults based on hardware and
software [30, 42]. The well-known error detection mechanisms are fail-signal
processors, alarms or watchdogs, signatures, and acceptance tests (ATs) [10, 23, 45].
3.1 Fault-Tolerant Techniques
Fault tolerance essentially conceals an error by switching to another unit of work
when a fault occurs. Redundancy is generally applied in the form of extra resources
that mask faults in order to preserve the required level of performance in the
system [19]. The approaches suggested for integrating fault tolerance into a
computing system are generally based on redundancy of various resources such as
hardware, software, time, and information.
– Hardware redundancy is achieved by deploying extra hardware in the system
for the replacement of a faulty component.
– Software redundancy employs alternative implementations of a program that can
be used if the initial version encounters a fault at run time.
– Information redundancy techniques, such as error-detecting and error-correcting
codes, are used to handle faults that occur during the transfer or storage of data.
– Time redundancy uses extra CPU time for re-execution of a faulty task or
executes a secondary task in case of a fault.
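As a small illustration of information redundancy, the sketch below appends a single even-parity bit to a data word. This is only the simplest error-detecting code, chosen here as an example; it detects an odd number of bit flips and cannot correct them. The function names are hypothetical.

```python
def add_parity(bits):
    """Append an even-parity bit so that the total number of 1s is even."""
    return bits + [sum(bits) % 2]


def parity_ok(word):
    """Return True if the word (data bits plus parity bit) has even parity."""
    return sum(word) % 2 == 0


if __name__ == "__main__":
    stored = add_parity([1, 0, 1, 1])
    corrupted = stored.copy()
    corrupted[2] ^= 1  # emulate a single-bit upset during storage or transfer
    print("stored word passes check:", parity_ok(stored))
    print("corrupted word passes check:", parity_ok(corrupted))
```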
To tolerate permanent faults, hardware redundancy is essential, whereas repeating
the execution of a task fully or partially helps in tolerating a transient fault [48].
Re-execution and checkpointing are the two most commonly used time-redundancy-based
techniques for tolerating transient faults; they repeat the task fully and partially,
respectively.
– Checkpointing: This technique saves a snapshot of the current state of the system
to stable storage at regular intervals during execution. Each saved snapshot, called
a checkpoint, comprises all the context information required to restart the process
from that point in time. On detecting a fault, the system re-executes the faulty
segment from the most recent correct checkpoint. This technique is able to tolerate
g faults in a task.
– Re-execution: Under this technique, the original task is re-executed in full when
a fault is detected. It is widely used to tolerate transient faults.
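The difference between the two techniques can be sketched with a simple worst-case timing comparison. The model below is an assumption made for illustration: the task is split into equal segments, each followed by a checkpoint of fixed cost, and every fault forces one segment and its checkpoint to be repeated. Function names and numbers are hypothetical.

```python
def reexecution_worst_case(wcet, faults):
    """Worst-case time when every fault forces a full re-run of the task."""
    return (faults + 1) * wcet


def checkpointing_worst_case(wcet, faults, segments, overhead):
    """Worst-case time with `segments` equal segments, each followed by a
    checkpoint costing `overhead`; each fault repeats one segment and its
    checkpoint."""
    segment = wcet / segments
    return wcet + segments * overhead + faults * (segment + overhead)


def best_segment_count(wcet, faults, overhead, max_segments=50):
    """Brute-force the segment count with the smallest worst-case time."""
    return min(
        range(1, max_segments + 1),
        key=lambda m: checkpointing_worst_case(wcet, faults, m, overhead),
    )


if __name__ == "__main__":
    wcet, faults, overhead = 100, 2, 2
    m = best_segment_count(wcet, faults, overhead)
    print("re-execution:", reexecution_worst_case(wcet, faults))
    print("checkpointing:", checkpointing_worst_case(wcet, faults, m, overhead),
          "with", m, "segments")
```

Under this assumed model, too few checkpoints make each rollback expensive and too many make the checkpoint overhead dominate, so the best count lies in between.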
If the system is safety critical, task duplication/replication is used to tolerate
transient faults and provide the required reliability level. However, redundancy
increases resource overhead, for example through higher energy consumption. Owing
to the rising concern for energy management and reliability, energy-saving
techniques must be incorporated in fault-tolerant real-time task scheduling.
4 Joint Optimization of Energy and Fault Management
Computing systems nowadays affect almost every facet of everyday life. Given
these increased responsibilities, it is essential that computer systems provide
both safety and reliability. For many years, researchers have addressed the
emerging problems of system reliability that accompany the rapid evolution of VLSI
technology and have raised reliability as a prime design concern for real-time
systems. Energy management has also become an essential design parameter
for real-time systems due to various environmental, economic, technical, and social
issues such as rising greenhouse gas emissions, the cost of cooling infrastructure
needed for greater heat dissipation, and damage to public health. If not judiciously
handled, high energy consumption and degraded reliability will restrict future
advances in real-time multiprocessor computing systems.
Systems with real-time constraints, such as those used in avionics, defense, and
space exploration, need to be reliable as well as energy-efficient. Conventional
approaches focused solely on timing constraints, whereas additional design issues
such as thermal behavior, energy, and reliability have recently gained attention,
which has made the scheduling problem more complex. Hence, task scheduling
algorithms for real-time systems should consider constraints such as timing,
energy, and reliability together and be designed systematically to accomplish the
specified design objectives.
Reliability and energy management are conflicting design goals for a real-time
system. Redundancy-based reliability/fault-tolerance enhancement techniques
increase energy consumption due to the overhead of the additional
resources/computation. Researchers have also observed an inverse relationship
between supply voltage and the rate of transient faults. As a result, reducing
energy consumption through voltage scaling makes the system more vulnerable to
transient faults.
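The inverse relationship between operating point and fault rate can be sketched as follows. The code assumes an exponential fault-rate model in the normalized frequency, of a kind often used in reliability-aware power-management studies, together with Poisson fault arrivals; the model form, the parameter names, and the values are assumptions made for illustration and are not given in the text.

```python
import math


def fault_rate(freq, rate_at_fmax, sensitivity, f_min):
    """Assumed model: the fault rate grows exponentially as the normalized
    frequency (and with it the supply voltage) drops from 1.0 towards f_min."""
    return rate_at_fmax * 10 ** (sensitivity * (1.0 - freq) / (1.0 - f_min))


def task_reliability(freq, wcet_at_fmax, rate_at_fmax, sensitivity, f_min):
    """Probability that a task finishes without a transient fault, assuming
    Poisson fault arrivals over the stretched execution time."""
    exec_time = wcet_at_fmax / freq
    return math.exp(-fault_rate(freq, rate_at_fmax, sensitivity, f_min) * exec_time)


if __name__ == "__main__":
    # Hypothetical parameters: 1e-6 faults per time unit at full speed.
    for f in (1.0, 0.75, 0.5):
        r = task_reliability(f, wcet_at_fmax=10.0, rate_at_fmax=1e-6,
                             sensitivity=2.0, f_min=0.5)
        print(f"f = {f:.2f}: reliability = {r:.6f}")
```

Lowering the frequency hurts reliability twice in this model: the fault rate rises, and the task is exposed to faults for a longer time.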
The ability to achieve timeliness for real-time applications increases on
multiprocessor systems as the number of computing units grows. A greater number
of processing units also improves the prospects for enhancing reliability/fault
tolerance through replication of tasks. However, task redundancy raises energy
consumption due to multiple executions. For this reason, energy-aware task
scheduling algorithms remain a pressing concern for fault-tolerant real-time
multiprocessor systems. It is challenging to reduce energy while tolerating faults,
because the two goals conflict and must be traded off against each other.
These concerns have motivated research on the joint optimization of energy and
system reliability.
Several schemes are available in the literature that deal with the joint problem of
power and reliability management on single-processor as well as multiprocessor
platforms. Re-execution, checkpointing, and replication, along with voltage scaling
and shutdown methods, are frequently used strategies to preserve the desired level
of reliability/fault tolerance and power management in the system. Not only the
ordering of tasks for execution on a given processor but also the mapping of tasks
to processors affects the energy consumption and reliability of the system. Hence,
there are various aspects of fault-tolerant task scheduling on a real-time
multiprocessor system where energy efficiency can be improved. In the recent past,
the research community has turned to examining problems at the intersection of fault
tolerance and power management. Task scheduling techniques for joint management of
fault tolerance and energy efficiency are discussed below as per the classification
shown in Fig. 3.
[Figure 3: classification tree. Root: fault-tolerant energy-aware RTS techniques.
Branches: re-execution with voltage scaling; checkpointing with voltage scaling;
task duplication with voltage scaling, subdivided into standby-sparing techniques,
M-of-N hardware redundancy techniques, and Y-replication techniques. The branches
are additionally labeled by platform (uniprocessor or multiprocessor).]
Fig. 3 Classification of real-time scheduling techniques with joint management of
energy and reliability
Re-execution with Voltage Scaling A combination of time redundancy and voltage
scaling is used to tackle the joint problem of fault tolerance and energy management
on uniprocessor systems. Reliability-aware power management (RA-PM) is a unified
approach to energy management and fault tolerance that is based on re-execution
(time redundancy); it has been explored in the literature from different
perspectives. RA-PM relies on the notion of original reliability, which is the
probability of successfully executing all real-time tasks at maximum CPU speed
without encountering a transient fault. RA-PM uses the available slack both to slow
down tasks with a DVFS policy and to execute a backup copy of a scaled task in case
of a fault [69]. Zhu et al. [69] proposed RA-PM for periodic real-time tasks with
both EDF [69] and RM [71] as underlying scheduling algorithms and showed that the
RA-PM approach maintains the original reliability of all tasks while saving energy.
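The slack-sharing idea behind RA-PM can be sketched for a single task. The simplification below is an assumption made for illustration and is not the published algorithm: a recovery copy is reserved at maximum speed, the primary copy is slowed into whatever time remains before the deadline, and the frequency is never lowered below a given floor (for example, the energy-efficient frequency). The function name and parameters are hypothetical.

```python
def rapm_plan(wcet_at_fmax, deadline, f_floor=0.0):
    """Return (normalized frequency for the primary copy, recovery reserved?).

    If the slack cannot hold a full-speed recovery copy, the task is left
    unscaled at maximum speed and no recovery is reserved.
    """
    slack = deadline - wcet_at_fmax
    if slack < 0:
        raise ValueError("task is infeasible even at maximum speed")
    if slack < wcet_at_fmax:
        return 1.0, False
    # Time left for the primary copy after reserving the recovery copy.
    primary_budget = deadline - wcet_at_fmax
    freq = max(f_floor, wcet_at_fmax / primary_budget)
    return min(freq, 1.0), True


if __name__ == "__main__":
    print(rapm_plan(2.0, 8.0))        # ample slack: scale down and reserve recovery
    print(rapm_plan(2.0, 8.0, 0.4))   # same task, frequency floor applied
    print(rapm_plan(2.0, 3.0))        # too little slack: run unscaled
```

In the second call the primary copy takes 5 time units at the floor frequency and the reserved recovery takes 2, so both fit before the deadline of 8.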
In another work on aperiodic tasks, Zhu [70] exploited dynamic slack with the
RA-Greedy algorithm, both to further lower the frequency of tasks and to assign
backup tasks that enhance reliability. He also proposed checkpointing to utilize
dynamic slack when a recovery cannot be placed because the available slack is too
small. An energy-constrained version of reliability-aware power management
(ECRM) has been presented by Zhao et al. [65], which focuses on achieving
maximum reliability for a real-time system that operates within a limited energy
budget.
For fixed-priority real-time systems with weakly hard QoS constraints, Niu
et al. [44] proposed a reliability-conscious energy-aware scheduling algorithm
(FPRMK-EM) that reserves space for the recovery of mandatory jobs in case of failure
and reduces the frequency of other tasks for energy efficiency.
Zhang et al. [61] aimed to improve the energy savings of real-time systems with
shared resources under a reliability constraint, with EDF/DDM as the underlying
scheduling algorithm. They proposed the Dynamic Low-Power Scheduling Algorithm
for Periodic Tasks with Shared Resources (DLPSR), which exploits dynamic
slack for reliability preservation and energy conservation.
Considering the preemption overhead, Xu et al. [59] proposed reliability-aware
power-management algorithms that attempt to reduce the execution time and energy
consumption of real-time tasks by minimizing the number of preemptions. They
proposed the greedy energy efficiency scheduling algorithm (GEE), based on a greedy
strategy of maximally utilizing slack time. Further, GEEPU and GLEEPU have been
proposed, which reduce frequency based on processor utilization, while DGAET exploits
dynamic slack to improve energy savings.
Zhao et al. [66] proposed the Generalized Shared Recovery (GSHR) technique,
in which, instead of separate recovery copies for scaled tasks, one or more global
shared recovery blocks are reserved that can be used by any task at any time in the
event of a fault. If a task encounters a fault, it uses a recovery block, and
the rest of the tasks are then executed at the maximum speed. This scheme greatly
improves the reliability of a system, because multiple shared recovery blocks allow
the same task to tolerate multiple faults. Thus, it can be used
for safety-critical systems where it is essential to maintain an arbitrary level
of reliability in an energy-efficient manner. The authors proposed shared recovery
scheme for common-deadline-based tasks (Incremental Reliability Configuration
Search—IRCS) [64, 66], periodic tasks [67], and precedence constrained tasks
(SHR-DAG) [68], where frequency of tasks and the number of recovery blocks are
determined based on the given reliability target.
Qi et al. [50] proposed global-scheduling-based reliability-aware power-management
schemes with individual and shared recovery on multiprocessor systems. They showed
that finding an optimal selection of the subset of tasks to manage for energy and
reliability is an NP-hard problem. They also proposed algorithms that exploit
dynamic slack to improve energy savings.
Reliability-aware dynamic power management (RA-DPM) with shared recovery blocks on
a single-processor system has been presented by Fan et al. [17]. As soon as a
minimum number of tasks have executed successfully, the time reserved for recoveries
is used to reduce frequency and thereby extend energy savings dynamically.
Huang et al. [28] proposed energy-efficient fault-tolerant mapping and scheduling for precedence constrained tasks with mixed-integer linear programming formulation on heterogeneous multiprocessor system. They proposed List-based Binary
Particle Swarm Optimization (LBPSO) algorithm that is based on particle swarm
optimization to obtain high-quality solution in terms of energy saving and reliability.
Checkpointing with Voltage Scaling In order to guarantee reliability and energy
efficiency, an adaptive checkpointing scheme (ADT-DVS) has been presented by
Zhang et al. [63] assuming Poisson fault model. They adjust checkpoint intervals
dynamically to tolerate a fixed number of faults for a set of periodic tasks with EDF
scheduling policy on a single processor system.
For fixed-priority scheduling, Zhang et al. [62] proposed a unified
approach to checkpointing and DVFS (with both task-level and application-level speed
scaling) that tolerates g transient faults while reducing energy consumption for
periodic real-time task sets. The authors used a genetic-algorithm-based (GA)
approach to find the frequency assignment, because exhaustive search is
computationally infeasible for heavy workloads on processors with
a large number of available discrete frequency levels. Wei et al. [58] applied an
adaptive checkpointing technique that responds to the behavior of tasks and
the fault rate at run time while complying with the tasks’ deadline constraints.
Two offline DVFS scheduling algorithms, application-level DVS (A-DVS) and task-level
DVS (T-DVS), were proposed for fixed-priority real-time tasks, exploiting dynamic
slack to minimize energy consumption.
Another technique, which combines non-uniform checkpointing with DVFS for power
management, has been presented by Melhem et al. [43]; it improves energy savings
over uniform checkpointing. They considered the EDF scheduling algorithm for
periodic tasks on a single-core processor under the constraint of at most one
failure in the system. To reduce the number of checkpoints and thus minimize energy
consumption, the two-state checkpointing (TsCp) concept has been introduced by
Salehi et al. [51], where non-uniform
checkpointing is applied in the fault-free scenario, but as soon as a fault occurs,
the system shifts to a uniform checkpointing policy.
Duplication with Voltage Scaling The majority of strategies proposed in the
literature for fault-tolerant energy-aware task scheduling based on duplication
of task copies target homogeneous platforms. Research on multiprocessors that uses
task duplication is divided into three categories based on the number of
duplicate/replicated copies of tasks, as follows:
– Standby-sparing techniques: The standby-sparing strategy uses one level of
replication on a dual-processor system, such that each task has exactly one replica
to execute for fault handling. The workload handled by this technique cannot exceed
the maximum utilization bound of a single processor, because the extra processor
is employed only to provide fault tolerance by scheduling duplicate task copies
on it.
For independent periodic tasks with a common deadline, Ejlali et al. [15]
showed that, compared with standby-sparing schemes that use hot or cold spares,
Low-Energy Standby-Sparing (LESS) is more effective in saving energy while
providing reliability. LESS reduces the voltage of primary tasks by applying DVFS
and delays backup tasks as much as possible while still satisfying the deadline
constraint. Their energy and reliability models account for energy and time
overheads as well as static energy consumption.
Aminzadeh et al. [3] carried out a comparative analysis of system-level
energy-management schemes based on DVFS and DPM for standby-sparing systems
and proposed a Markov model to analyze their energy and reliability parameters.
They reported that a hybrid method, which postpones secondary tasks and reduces the
frequency of primary and backup tasks on a standby-sparing system, always saves
more energy than simple DVFS and DPM methods.
For fixed-priority scheduling, Haque et al. [26] suggested that executing
primary tasks at a lower voltage and backup tasks at maximum voltage
maintains the reliability of the system as well as saves energy. They proposed the
Standby-Sparing Fixed-Priority (SSFP) algorithm for periodic tasks, which uses a
dual-priority scheduling approach on the spare processor to delay backup tasks and
also applies a deallocation strategy that saves energy by canceling backup tasks
whose corresponding primary tasks have finished successfully. Dynamic slack is
exploited to enhance energy savings by reducing the speed of tasks on the main
processor and further delaying backup tasks at run time.
Ansari et al. [4] followed a concept similar to that of [26] for energy- and
reliability-aware scheduling on standby-sparing systems, using a dual-priority
strategy with the earliest deadline first scheduling algorithm. They presented a new
Adaptive Dual-Queue scheduling (AdDQ) algorithm and showed that their work
saves 14% more energy than the ASSPT and CSSPT algorithms [24].
– M-of-N hardware redundancy techniques: Optimistic TMR has been proposed
by Elnozahy et al. [16] to reduce the energy consumption for conventional TMR
systems. Two out of three machines run at lower frequency and their result is
matched. If the results agree, the output is released; if they deviate, the output
of the third machine, which was slower than the other ones, is used as a tie-breaker.
Salehi et al. [51] exploited the multiprocessor platform with N-modular redundancy
to improve reliability by masking errors on multiple processing units; however,
this imposes substantial energy overhead. They suggested carrying out N-modular
redundancy in two phases. Half-plus-one copies of every task are executed in the
first, indispensable phase, and the rest of the copies are executed in the
on-demand phase only if a fault appeared in the earlier phase, thereby saving
energy in the fault-free scenario.
– Y-replication techniques: For a set of independent periodic tasks, Unsal et al.
[55] proposed an energy-aware fault-tolerance technique with primary–backup
scheme, which defers the execution of backup tasks as late as possible to
minimize overlap between the execution of primary and backup copies. Energy
consumption is reduced by canceling the backup copy on successful completion
of primary copy.
For heterogeneous systems, Tosun [54] proposed energy- and reliability-aware
task scheduling and achieved 62% energy savings over energy-oblivious
schemes. He presented an integer-linear-programming-based framework for
mapping and scheduling periodic real-time tasks on a heterogeneous multiprocessor
system on chip (HMPSoC).
For highly safety-critical systems, a certain number of replicas is required to
achieve the target reliability level, yet tasks must be executed at a reduced
frequency to generate an energy-efficient schedule. For preemptive periodic
real-time applications, Haque et al. [27] analyzed the interplay between energy,
replication, frequency, and reliability. They proposed a method to create an
energy–frequency–reliability (EFR) table [25] and showed how to use it, with the
help of the energy-efficient replication (EER) algorithm, to determine the extent
of replication and frequency reduction that lowers energy consumption.
Poursafaei et al. [47] used the EFR table and presented an algorithm that works
in two phases. The first phase is offline replication, in which the extent of
replication and frequency reduction is determined for every task depending on the
given reliability target. Later, at run time, the online phase prevents the
execution of redundant copies of a task once one of its copies has finished
successfully.
Extending the concept of the standby-sparing scheme to multiprocessor
systems, Guo et al. [22] proposed the paired-SS and generalized-SS task configuration
schemes for independent periodic real-time tasks with a dynamic-priority scheduling
algorithm. They used a worst-fit decreasing strategy for task allocation and
showed that generalized-SS is the more energy-efficient configuration for
dynamic-priority task sets on multiprocessor systems. Later, they extended the
concept to mixed scheduling with the POED-Mix algorithm, in which, instead of a
standby-sparing configuration, tasks are allocated in a mixed manner such that
every processor holds a mixture of primary and backup tasks, provided that copies
of the same task are not allocated to the same processor. The POED algorithm is used
to schedule primary tasks with ASAP preference and backup tasks in an ALAP manner
[21]; delaying backup tasks saves energy by reducing the overlap in the execution
of the two copies of the same task and by minimizing the number of backups executed.
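The interaction between slowing down the primary copy and delaying the backup copy in standby-sparing schemes can be sketched for a single task. The model is an assumption made for illustration and is not any of the cited algorithms: the primary copy runs from time zero at a scaled frequency, the backup copy runs at maximum speed as late as possible before the deadline, and the backup is canceled the moment a fault-free primary copy finishes. The power model, parameter values, and function names are hypothetical.

```python
def power(freq, p_indep=0.1, c_eff=1.0, exponent=3):
    """Assumed active power at normalized frequency `freq`."""
    return p_indep + c_eff * freq ** exponent


def fault_free_energy(freq, wcet_at_fmax, deadline):
    """Energy of primary plus backup when no fault occurs.

    Only the part of the backup that overlaps with the still-running primary
    is executed; the rest is canceled when the primary completes.
    """
    primary_time = wcet_at_fmax / freq
    if primary_time > deadline:
        raise ValueError("primary copy would miss its deadline")
    backup_start = deadline - wcet_at_fmax  # as late as possible at full speed
    overlap = max(0.0, primary_time - backup_start)
    return power(freq) * primary_time + power(1.0) * overlap


def best_primary_level(levels, wcet_at_fmax, deadline):
    """Pick the primary frequency level with the lowest fault-free energy."""
    return min(levels, key=lambda f: fault_free_energy(f, wcet_at_fmax, deadline))


if __name__ == "__main__":
    levels = [0.5, 0.6, 0.8, 1.0]
    for f in levels:
        print(f"f = {f:.1f}: energy = {fault_free_energy(f, 2.0, 5.0):.3f}")
    print("best level:", best_primary_level(levels, 2.0, 5.0))
```

Under these assumptions, slowing the primary copy too far is counterproductive, because a larger part of the full-speed backup then runs before it can be canceled.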
5 Conclusion
With the growing availability of multiprocessor technology, hardware redundancy
has emerged as a suitable candidate for providing fault tolerance in real-time
systems. Duplicating tasks on separate processing units has proved to be a fitting
technique to meet stringent reliability requirements. However, efficient scheduling
techniques are required to handle the increased energy consumption that results
from task replication. Dynamic voltage scaling and dynamic
power-management techniques have been the choice of researchers for designing
energy-efficient scheduling algorithms. In fault-tolerant systems, however, careful
application of energy-management schemes is required, as executing on a processor
at lower voltage raises the fault rate.
References
1. Agarwal, M. M., Govil, M. C., Sinha, M., & Gupta, S. (2019). Fuzzy based data fusion
for energy efficient internet of things. International Journal of Grid and High Performance
Computing, 11(3), 46–58. https://doi.org/10.4018/ijghpc.2019070103
2. AMD. 2nd generation AMD embedded R-series APU.
https://www.amd.com/en/products/embedded-r-series-2nd-gen-apu. Accessed 20 March 2020.
3. Aminzadeh, S., & Ejlali, A. (2011). A comparative study of system-level energy management
methods for fault-tolerant hard real-time systems. IEEE Transactions on Computers, 60(9),
1288–1299. https://doi.org/10.1109/tc.2011.42
4. Ansari, M., Safari, S., Poursafaei, F. R., & Salehi, M. (2017). AdDQ: Low-energy hardware
replication for real-time systems through adaptive dual-queue scheduling. The CSI Journal on
Computer Science and Engineering, 15(1), 31–38.
5. Attia, K. M., El-Hosseini, M. A., & Ali, H. A. (2017). Dynamic power management techniques
in multi-core architectures: A survey study. Ain Shams Engineering Journal, 8(3), 445–456.
https://doi.org/10.1016/j.asej.2015.08.010
6. Aydin, H., Melhem, R., Mosse, D., & Mejia-Alvarez, P. (2004). Power-aware scheduling for
periodic real-time tasks. IEEE Transactions on Computers, 53(5), 584–600.
https://doi.org/10.1109/tc.2004.1275298
7. Bambagini, M. (2014). Energy Saving in Real-Time Embedded Systems. Ph.D. Thesis, ReTiS
Lab, TeCIP Institute, Pisa, Italy.
8. Bambagini, M., Marinoni, M., Aydin, H., & Buttazzo, G. (2016). Energy-aware scheduling for
real-time systems. ACM Transactions on Embedded Computing Systems, 15(1), 1–34.
https://doi.org/10.1145/2808231
9. Burd, T. D., & Brodersen, R. W. (1995). Energy efficient CMOS microprocessor design. In
Proceedings of the Twenty-Eighth Annual Hawaii International Conference on System Sciences
(Vol. 1, pp. 288–297). https://doi.org/10.1109/HICSS.1995.375385
10. Campbell, A., McDonald, P., & Ray, K. (1992). Single event upset rates in space. IEEE
Transactions on Nuclear Science, 39(6), 1828–1835. https://doi.org/10.1109/23.211373
11. Castillo, X., McConnel, S. R., & Siewiorek, D. P. (1982). Derivation and calibration of a
transient error reliability model. IEEE Transactions on Computers, C-31(7), 658–671.
https://doi.org/10.1109/tc.1982.1676063
12. Cong, J., Nagaraj, N. S., Puri, R., Joyner, W., Burns, J., Gavrielov, M., Radojcic, R., Rickert,
P., & Stork, H. (2009). Moore’s law: Another casualty of the financial meltdown? In 2009 46th
ACM/IEEE Design Automation Conference (pp. 202–203).
13. Dewangan, B. K., Agarwal, A., Venkatadri, M., & Pasricha, A. (2019). Energy-aware
autonomic resource scheduling framework for cloud. International Journal of Mathematical,
Engineering and Management Sciences, 4(1), 41–55.
https://doi.org/10.33889/ijmems.2019.4.1-004
14. EETimes, Staff, E. (2017). 2017 Embedded Market Survey. Accessed 21 May 2020.
15. Ejlali, A., Al-Hashimi, B. M., & Eles, P. (2012). Low-energy standby-sparing for hard real-time
systems. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems,
31(3), 329–342. https://doi.org/10.1109/tcad.2011.2173488
16. Elnozahy, E., Melhem, R., & Mosse, D. (2002). Energy-efficient duplex and TMR real-time
systems. In 23rd IEEE Real-Time Systems Symposium, 2002. RTSS 2002. IEEE Comput. Soc.
https://doi.org/10.1109/real.2002.1181580
17. Fan, M., Han, Q., & Yang, X. (2017). Energy minimization for on-line real-time scheduling
with reliability awareness. Journal of Systems and Software, 127, 168–176.
https://doi.org/10.1016/j.jss.2017.02.004
18. Ghosh, S., Melhem, R., & Mosse, D. (1997). Fault-tolerance through scheduling of aperiodic
tasks in hard real-time multiprocessor systems. IEEE Transactions on Parallel and Distributed
Systems, 8(3), 272–284. https://doi.org/10.1109/71.584093
19. Ghosh, S., Melhem, R., Mossé, D., & Sarma, J. S. (1998). Fault-tolerant rate-monotonic
scheduling. Real-Time Systems, 15(2), 149–181. https://doi.org/10.1023/a:1008046012844
20. Goyal, N., Dave, M., & Verma, A. K. (2016). Energy efficient architecture for intra and
inter cluster communication for underwater wireless sensor networks. Wireless Personal
Communications, 89(2), 687–707. https://doi.org/10.1007/s11277-016-3302-0
21. Guo, Y., Su, H., Zhu, D., & Aydin, H. (2015). Preference-oriented real-time scheduling and its
application in fault-tolerant systems. Journal of Systems Architecture, 61(2), 127–139.
https://doi.org/10.1016/j.sysarc.2014.12.001
22. Guo, Y., Zhu, D., Aydin, H., Han, J. J., & Yang, L. T. (2017). Exploiting primary/backup mechanism for energy efficiency in dependable real-time systems. Journal of Systems Architecture,
78, 68–80. https://doi.org/10.1016/j.sysarc.2017.06.008
23. Han, Q., Wang, T., & Quan, G. (2015). Enhanced fault-tolerant fixed-priority scheduling of
hard real-time tasks on multi-core platforms. In 2015 IEEE 21st International Conference on
Embedded and Real-Time Computing Systems and Applications. IEEE.
https://doi.org/10.1109/rtcsa.2015.22
24. Haque, M. A., Aydin, H., & Zhu, D. (2011). Energy-aware standby-sparing technique for
periodic real-time applications. In 2011 IEEE 29th International Conference on Computer
Design (ICCD). IEEE. https://doi.org/10.1109/iccd.2011.6081396
25. Haque, M. A., Aydin, H., & Zhu, D. (2013). Energy-aware task replication to manage
reliability for periodic real-time applications on multicore platforms. In 2013 International
Green Computing Conference Proceedings (pp. 1–11). IEEE.
https://doi.org/10.1109/igcc.2013.6604518
26. Haque, M. A., Aydin, H., & Zhu, D. (2015). Energy-aware standby-sparing for fixed-priority
real-time task sets. Sustainable Computing: Informatics and Systems, 6, 81–93.
https://doi.org/10.1016/j.suscom.2014.05.001
27. Haque, M. A., Aydin, H., & Zhu, D. (2017). On reliability management of energy-aware
real-time systems through task replication. IEEE Transactions on Parallel and Distributed
Systems, 28(3), 813–825. https://doi.org/10.1109/tpds.2016.2600595
28. Huang, K., Jiang, X., Zhang, X., Yan, R., Wang, K., Xiong, D., & Yan, X. (2018).
Energy-efficient fault-tolerant mapping and scheduling on heterogeneous multiprocessor
real-time systems. IEEE Access, 6, 57614–57630. https://doi.org/10.1109/access.2018.2873641
29. Jejurikar, R., Pereira, C., & Gupta, R. (2004). Leakage aware dynamic voltage scaling for
real-time embedded systems. In Proceedings of the 41st Annual Design Automation Conference,
DAC ’04 (pp. 275–280). ACM. https://doi.org/10.1145/996566.996650
30. Jhumka, A., Hiller, M., Claesson, V., & Suri, N. (2002). On systematic design of globally
consistent executable assertions in embedded software. ACM SIGPLAN Notices, 37(7), 75.
https://doi.org/10.1145/566225.513843
31. Kaur, N., Bansal, S., & Bansal, R. K. (2016). Energy conscious scheduling with controlled
threshold for precedence-constrained tasks on heterogeneous clusters. Concurrent Engineering, 25(3), 276–286. https://doi.org/10.1177/1063293x16679001
32. Kaur, N., Bansal, S., & Bansal, R. K. (2016). Energy efficient duplication-based scheduling
for precedence constrained tasks on heterogeneous computing cluster. Multiagent and Grid
Systems, 12(3), 239–252. https://doi.org/10.3233/MGS-160252
33. Kaur, N., Bansal, S., & Bansal, R. K. (2017). Duplication-controlled static energy-efficient
scheduling on multiprocessor computing system. Concurrency and Computation: Practice and
Experience, 29(12), e4124. https://doi.org/10.1002/cpe.4124
34. Khudia, D. S., & Mahlke, S. (2014). Harnessing soft computations for low-budget fault
tolerance. In 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
IEEE. https://doi.org/10.1109/micro.2014.33
35. Kim, J., Kim, H., Lakshmanan, K., & Rajkumar, R. (2013). Parallel scheduling for cyberphysical systems: Analysis and case study on a self-driving car. In 2013 ACM/IEEE
International Conference on Cyber-Physical Systems (ICCPS) (pp. 31–40).
36. Lala, J., & Harper, R. (1994). Architectural principles for safety-critical real-time applications.
Proceedings of the IEEE, 82(1), 25–40. https://doi.org/10.1109/5.259424
37. Leveson, N. G. (1986). Software safety: Why, what, and how. ACM Computing Surveys, 18(2),
125–163. https://doi.org/10.1145/7474.7528
38. Li, K. (2016). Energy and time constrained task scheduling on multiprocessor computers with
discrete speed levels. Journal of Parallel and Distributed Computing, 95, 15–28. https://doi.
org/10.1016/j.jpdc.2016.02.006
39. Market, E.S. (2020). Embedded system market by hardware (MPU, MCU, application-specific
integrated circuits, DSP, FPGA, and memories), software (middleware, operating systems),
system size, functionality, application, region—global forecast to 2025. Accessed 21 May
2020.
40. Marwedel, P. (2018). Embedded system design. Springer International Publishing. https://doi.
org/10.1007/978-3-319-56045-8
41. Masiero, M., & Roos, A. (2012). Power consumption—CPU charts 2012: 86 processors from
AMD and Intel, tested (2012). Accessed 02 Jan 2020.
42. Meixner, A., Bauer, M. E., & Sorin, D. (2007). Argus: Low-cost, comprehensive error detection
in simple cores. In 40th Annual IEEE/ACM International Symposium on Microarchitecture
(MICRO 2007). IEEE. https://doi.org/10.1109/micro.2007.18
43. Melhem, R., Mosse, D., & Elnozahy, E. (2004). The interplay of power management and fault
recovery in real-time systems. IEEE Transactions on Computers, 53(2), 217–231. https://doi.
org/10.1109/tc.2004.1261830
44. Niu, L., & Li, W. (2016). Reliability-conscious energy management for fixed-priority real-time
embedded systems with weakly hard QoS-constraint. Microprocessors and Microsystems, 46,
107–121. https://doi.org/10.1016/j.micpro.2016.03.005
45. Oh, S. K., & Macewen, G. H. (1992). Toward fault-tolerant adaptive real-time distributed
systems.
46. Pollack, F. J. (1999). New microarchitecture challenges in the coming generations of CMOS
process technologies (keynote address) (abstract only). In Proceedings of the 32Nd Annual
ACM/IEEE International Symposium on Microarchitecture, MICRO 32 (p. 2). IEEE Computer
Society.
47. Poursafaei, F. R., Safari, S., Ansari, M., Salehi, M., & Ejlali, A. (2015). Offline replication and
online energy management for hard real-time multicore systems. In 2015 CSI Symposium on
Real-Time and Embedded Systems and Technologies (RTEST). IEEE. https://doi.org/10.1109/
rtest.2015.7369847
48. Pradhan, D. K. (1996). Fault-tolerant computer system design. Prentice-Hall.
49. Punnekkat, S. (1997). Schedulability Analysis for Fault Tolerant Real-time Systems. Ph.D.
Thesis, University of York, UK.
Energy Conscious Scheduling for Fault-Tolerant Real-Time Distributed. . .
19
50. Qi, X., Zhu, D., & Aydin, H. (2011). Global scheduling based reliability-aware power
management for multiprocessor real-time systems. Real-Time Systems, 47(2), 109–142. https://
doi.org/10.1007/s11241-011-9117-x
51. Salehi, M., Ejlali, A., & Al-Hashimi, B. M. (2016). Two-phase low-energy n-modular redundancy for hard real-time multi-core systems. IEEE Transactions on Parallel and Distributed
Systems, 27(5), 1497–1510. https://doi.org/10.1109/tpds.2015.2444402
52. Shivakumar, P., Kistler, M., Keckler, S., Burger, D., & Alvisi, L. (2002). Modeling the effect
of technology trends on the soft error rate of combinational logic. In Proceedings International
Conference on Dependable Systems and Networks. IEEE Comput. Soc. https://doi.org/10.
1109/dsn.2002.1028924
53. Srinivasan, J., Adve, S., Bose, P., & Rivers, J. (2004). The impact of technology scaling on
lifetime reliability. In International Conference on Dependable Systems and Networks, 2004.
IEEE. https://doi.org/10.1109/dsn.2004.1311888
54. Tosun, S. (2011). Energy- and reliability-aware task scheduling onto heterogeneous MPSoC
architectures. The Journal of Supercomputing, 62(1), 265–289. https://doi.org/10.1007/
s11227-011-0720-3
55. Unsal, O. S., Koren, I., & Krishna, C. M. (2002). Towards energy-aware software-based fault
tolerance in real-time systems. In Proceedings of the 2002 International Symposium on Low
Power Electronics and Design (pp. 124–129). ACM Press. https://doi.org/10.1145/566408.
566442
56. Uribe-Toril, J., Ruiz-Real, J., Milán-García, J., & de Pablo Valenciano, J. (2019). Energy,
economy, and environment: A worldwide research update. Energies, 12(6), 1120. https://doi.
org/10.3390/en12061120
57. Venkatachalam, V., & Franz, M. (2005). Power reduction techniques for microprocessor
systems. ACM Computing Surveys, 37(3), 195–237. https://doi.org/10.1145/1108956.1108957
58. Wei, T., Mishra, P., Wu, K., & Zhou, J. (2012). Quasi-static fault-tolerant scheduling schemes
for energy-efficient hard real-time systems. Journal of Systems and Software, 85(6), 1386–
1399. https://doi.org/10.1016/j.jss.2012.01.020
59. Xu, H., Li, R., Zeng, L., Li, K., & Pan, C. (2018). Energy-efficient scheduling with reliability
guarantee in embedded real-time systems. Sustainable Computing: Informatics and Systems,
18, 137–148. https://doi.org/10.1016/j.suscom.2018.01.005
60. Zahaf, H. E. (2016). Energy efficient scheduling of parallel real-time tasks on heterogeneous
multicore systems. Ph.D. Thesis, Lille 1 University of Science and Technology, France.
61. Zhang, Y. W., Zhang, H. Z., & Wang, C. (2017). Reliability-aware low energy scheduling in
real time systems with shared resources. Microprocessors and Microsystems, 52, 312–324.
https://doi.org/10.1016/j.micpro.2017.06.020
62. Zhang, Y., & Chakrabarty, K. (2006). A unified approach for fault tolerance and dynamic power
management in fixed-priority real-time embedded systems. IEEE Transactions on ComputerAided Design of Integrated Circuits and Systems, 25(1), 111–125. https://doi.org/10.1109/tcad.
2005.852657
63. Zhang, Y., & Chakrabarty, K. (2004). Dynamic adaptation for fault tolerance and power
management in embedded real-time systems. ACM Transactions on Embedded Computing
Systems, 3(2), 336–360. https://doi.org/10.1145/993396.993402
64. Zhao, B., Aydin, & H., Zhu, D. (2009). Enhanced reliability-aware power management through
shared recovery technique. In Proceedings of the 2009 International Conference on ComputerAided Design (pp. 63–70). ACM Press. https://doi.org/10.1145/1687399.1687412
65. Zhao, B., Aydin, H., & Zhu, D. (2010). On maximizing reliability of real-time embedded
applications under hard energy constraint. IEEE Transactions on Industrial Informatics, 6(3),
316–328. https://doi.org/10.1109/tii.2010.2051970
66. Zhao, B., Aydin, H., & Zhu, D. (2011). Generalized reliability-oriented energy management for
real-time embedded applications. In Proceedings of the 48th Design Automation Conference
on—DAC ’11. ACM Press. https://doi.org/10.1145/2024724.2024815
20
S. Bansal et al.
67. Zhao, B., Aydin, H., & Zhu, D. (2012). Energy management under general task-level reliability
constraints. In 2012 IEEE 18th Real Time and Embedded Technology and Applications
Symposium (pp. 285–294). IEEE. https://doi.org/10.1109/rtas.2012.30
68. Zhao, B., Aydin, H., & Zhu, D. (2013). Shared recovery for energy efficiency and reliability
enhancements in real-time applications with precedence constraints. ACM Transactions on
Design Automation of Electronic Systems, 18(2), 1–21. https://doi.org/10.1145/2442087.
2442094
69. Zhu, D., & Aydin, H. (2009). Reliability-aware energy management for periodic real-time
tasks. IEEE Transactions on Computers, 58(10), 1382–1397. https://doi.org/10.1109/TC.2009.
56
70. Zhu, D. (2010). Reliability-aware dynamic energy management in dependable embedded realtime systems. ACM Transactions on Embedded Computing Systems, 10(2), 1–27. https://doi.
org/10.1145/1880050.1880062
71. Zhu, D., Qi, X., & Aydin, H. (2007). Priority-monotonic energy management for real-time
systems with reliability requirements. In 2007 25th International Conference on Computer
Design. IEEE. https://doi.org/10.1109/iccd.2007.4601963
72. Zhuravlev, S., Saez, J. C., Blagodurov, S., Fedorova, A., & Prieto, M. (2013). Survey of energycognizant scheduling techniques. IEEE Transactions on Parallel and Distributed Systems,
24(7), 1447–1464. https://doi.org/10.1109/tpds.2012.20
Secret Data Transmission Using
Advanced Morphological Component
Analysis and Steganography
Binay Kumar Pandey, Digvijay Pandey, Ankur Gupta,
Vinay Kumar Nassa, Pankaj Dadheech, and A. Shaji George
B. K. Pandey
Department of IT, College of Technology, Govind Ballabh Pant University of Agriculture and
Technology, Pantnagar, Uttarakhand, India
D. Pandey (✉)
Department of Electronics Engineering, Institute of Engineering and Technology, Dr. A.P.J.
Abdul Kalam Technical University, Lucknow, Uttar Pradesh, India
A. Gupta
Department of Computer Science and Engineering, Vaish College of Engineering, Rohtak,
Haryana, India
V. K. Nassa
Department of Computer Science Engineering, Rajarambapu Institute of Technology, Islampur,
Maharashtra, India
P. Dadheech
Computer Science & Engineering, Swami Keshvanand Institute of Technology, Management &
Gramothan (SKIT), Jaipur, Rajasthan, India
A. S. George
Department of Information and Communication Technology, Crown University, Int’l. Chartered
Inc. (CUICI), Santa Cruz, Argentina
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
S. Pandey et al. (eds.), Role of Data-Intensive Distributed Computing Systems in
Designing Data Solutions, EAI/Springer Innovations in Communication and
Computing, https://doi.org/10.1007/978-3-031-15542-0_2
1 Introduction
The World Wide Web is among the most popular and convenient media for individuals
to exchange digital information, yet one of the most common threats to transmission
is that a third party may intercept the data, and the Internet by itself offers little
protection against this. A sender therefore prefers to apply additional security
measures to keep unauthorized users from accessing the data. Information travels
between the end points of a computer network in data packets of fixed or variable
size. Most protection techniques operate only at the software level, so the encrypted
data are packetized and passed down to the lower layers of the OSI architecture. An
intruder who captures all of the data packets can obtain the encrypted information by
placing the packet contents in the correct order, and can then attempt to break the
user's security scheme.
An attacker cannot identify the nature of the data being sent or received, or
reassemble the relevant information from the individual data packets, if it is hard
to extract the data carried in the captured packets during transmission. In that
case, the goal of safeguarding the transmitted information and delivering it securely
to the receiver is achieved. This is what the proposed technique, which combines
morphological component analysis (MCA)-based steganography with a hybrid convolutional
neural network, aims to accomplish. Morphological component analysis provides a method
for separating the components of a picture that have different morphological
properties.
Morphological component analysis can be used for contrast enhancement and image
separation, and it is highly effective at dividing a picture into texture and smooth
components [2]. Morphological component analysis combined with total variation (TV)
regularization [5] is a particularly effective approach for separating a picture into
piecewise smooth content and texture. Using a curvelet [7] dictionary for
morphological component analysis produces ringing artifacts in the piecewise smooth
part. A total variation regularization approach [27] is applied to eliminate these
ringing distortions. The Daubechies wavelet is used to approximate the TV
regularization step: the cartoon component of the picture is transformed with a
Daubechies wavelet transform, and soft thresholding is applied to the result. The
updated cartoon component of the text-based picture [29] is then embedded in the cover
picture and delivered over Internet of Things connections. A steganographic approach
modifies the original cover image only minimally in order to embed the textual image
component inside it. The result is a stego image. Numerous image-hiding techniques
have been developed to conceal sensitive material within digital photographs.
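The soft-thresholding step mentioned above can be illustrated with a minimal sketch. It assumes a one-level, one-dimensional Haar transform (the simplest Daubechies wavelet, db1) applied to a single image row; the chapter does not state which Daubechies filter or how many levels are used, and the function names and the threshold `lam` are hypothetical.

```python
import math

def soft_threshold(value, lam):
    """Shrink value toward zero by lam; anything within [-lam, lam] becomes 0."""
    magnitude = max(abs(value) - lam, 0.0)
    return math.copysign(magnitude, value)

def haar_forward(signal):
    """One level of the orthonormal Haar (db1) transform of an even-length signal."""
    s = math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / s for a, b in pairs]
    detail = [(a - b) / s for a, b in pairs]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward: rebuild each neighbouring pair from (approx, detail)."""
    s = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / s, (a - d) / s])
    return signal

def smooth_row(row, lam):
    """Soft-threshold only the detail coefficients, then transform back."""
    approx, detail = haar_forward(row)
    return haar_inverse(approx, [soft_threshold(d, lam) for d in detail])

# Illustrative row: the small jump between the last two pixels is smoothed away.
print(smooth_row([10.0, 10.0, 10.0, 12.0], lam=2.0))
```

Only the detail coefficients are shrunk, so small oscillations are suppressed while the local averages of the row are preserved.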
One of the most prevalent approaches is the least significant bit (LSB) technique,
which embeds the secret information in the least significant bits of the cover
picture's pixels. The LSB approach is easy to compute and can carry a large payload in
a cover image while preserving acceptable picture fidelity, which is why LSB-based
techniques are commonly used. Small changes to pixels in edge areas are less
noticeable, whereas small changes in smooth regions are considerably more visible.
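As a minimal sketch of plain LSB embedding (not the chapter's full scheme), the example below hides a short text payload in a flat list of 8-bit grayscale pixel values. The helper names and the choice to write the payload into the leading pixels are assumptions made for illustration.

```python
def text_to_bits(text):
    """UTF-8 bytes of the text as a list of bits, most significant bit first."""
    return [(byte >> shift) & 1
            for byte in text.encode("utf-8")
            for shift in range(7, -1, -1)]

def bits_to_text(bits):
    """Inverse of text_to_bits."""
    data = bytes(int("".join(map(str, bits[i:i + 8])), 2)
                 for i in range(0, len(bits), 8))
    return data.decode("utf-8")

def embed_lsb(cover_pixels, bits):
    """Overwrite the least significant bit of the first len(bits) pixels."""
    if len(bits) > len(cover_pixels):
        raise ValueError("payload larger than cover capacity")
    stego = list(cover_pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | (bit & 1)
    return stego

def extract_lsb(stego_pixels, n_bits):
    """Read back the least significant bit of the first n_bits pixels."""
    return [p & 1 for p in stego_pixels[:n_bits]]

cover = list(range(100, 164))          # 64 illustrative pixel values
payload = text_to_bits("Hi")           # 16 bits
stego = embed_lsb(cover, payload)
print(bits_to_text(extract_lsb(stego, len(payload))))  # prints Hi
```

Each pixel changes by at most one gray level, which is why the stego image stays visually close to the cover.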
An efficient search method based on extended wavelet convolution layers, combined with
quotient multi-pixel value discretization, is then proposed for concealing text-based
image features in the cover media. It performs steganographic embedding at the
transmitter end and extraction at the receiver end for full-size input images. After
embedding, the stego image is sent to the receiver over a protected channel built on
the Internet of Things. At the receiver side, the textual image is rebuilt by applying
the inverse of the steganographic embedding to recover the text-based visual features
contained in the cover picture. These features are subsequently used to build a hybrid
convolutional neural network, whose performance is evaluated on the data set.
The main objective of this study is to improve the reliability of text-based picture
transmission using morphological component analysis, steganography, and text
recognition at the receiver side with a hybrid convolutional neural network. The work
therefore proposes two approaches. The first employs morphological component analysis
in combination with an image steganography technique, whereas the second employs the
same steganography technique in conjunction with a hybrid convolutional neural
network. In effect, the information is secured twice. According to the results, the
proposed approach surpasses the traditional technique on established quality measures,
including peak signal-to-noise ratio, structural similarity index measure, and
accuracy.
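Of the quality measures named above, the peak signal-to-noise ratio is the simplest to state: PSNR = 10 log10(peak² / MSE), where MSE is the mean squared error between the two images. The sketch below assumes 8-bit grayscale images stored as lists of rows (peak value 255); the function name is illustrative, and the structural similarity index is not covered here.

```python
import math

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    ref = [p for row in reference for p in row]
    tst = [p for row in test for p in row]
    mse = sum((r - t) ** 2 for r, t in zip(ref, tst)) / len(ref)
    if mse == 0:
        return math.inf          # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

cover = [[100, 101], [102, 103]]
stego = [[101, 100], [103, 103]]   # three pixels differ by one gray level
print(round(psnr(cover, stego), 2))
```

A higher PSNR means the stego image is closer to the cover image.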
2 Objectives of the Proposed Method
The proposed approach enables secure transmission of textual data and its extraction
from complex degraded images, using both steganographic techniques and deep learning
to achieve this goal. Its objectives are as follows:
• To study the wide range of textual image features and steganography techniques
used to communicate text-based picture information over untrusted connections.
• To build a robust data set.
• To apply an efficient MCA-based approach to text-based pictures that separates the
textured and smooth regions of a text image more accurately while improving the
corresponding quality measures, such as peak signal-to-noise ratio (PSNR), accuracy,
and structural similarity index measure (SSIM).
• To divide the image into smooth and textured parts using the total variation
method. After the piecewise smooth shapes and the textures contained in a picture are
recovered, steganography is used to hide the smooth shapes and the textures of the
text-based picture separately inside cover images.
• To create a suitable procedure for generating enhanced stego images for data
protection from the smooth shapes and textures of text-based pictures, in order to
enable highly secure transmission of classified content over the web.
• To apply inverse steganography at the receiving end to recover the smooth shapes
and textures of the text-based pictures from the high-quality cover images.
• To reconstruct the textual picture by applying inverse morphological component
analysis to the recovered smooth shapes and textures.
• To improve the efficiency of the text-based picture reconstruction by applying an
adapted optimization method.
• To assess the efficacy of the proposed approaches by comparing them with existing
approaches.
3 Review of Literature
Since the early days of computing, secure transmission and recognition of text data
have attracted the attention of many researchers. In recent years, numerous
specialists have developed theoretical approaches and tools to construct,
discriminate, reproduce, and classify such pictures. An overview of these efforts is
given below.
Caselles et al. [9] observe that total variation can recover image discontinuities,
which motivates its use as a regularizer in imaging problems. Its applications include
denoising, optical flow, stereo imaging, 3-D surface reconstruction, segmentation,
and interpolation, to mention just a few. On the one hand, the authors review the main
theoretical arguments advanced in favor of this approach. On the other hand, they
cover the main numerical techniques for solving models based on total variation,
presenting the basic iterative schemes and the optimization methods that rely on
max-flow techniques.
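To make the notion of total variation concrete, the sketch below computes a discrete, anisotropic total variation (the sum of absolute differences between horizontally and vertically neighbouring pixels). This particular discretization and the function name are assumptions for illustration; it is not the solver discussed in [9].

```python
def total_variation(img):
    """Anisotropic discrete TV: sum of |horizontal| and |vertical| pixel differences."""
    rows, cols = len(img), len(img[0])
    horizontal = sum(abs(img[r][c + 1] - img[r][c])
                     for r in range(rows) for c in range(cols - 1))
    vertical = sum(abs(img[r + 1][c] - img[r][c])
                   for r in range(rows - 1) for c in range(cols))
    return horizontal + vertical

# A clean edge has low TV; an oscillating pattern of the same amplitude has high TV.
step = [[0, 0, 1, 1] for _ in range(4)]
oscillating = [[(r + c) % 2 for c in range(4)] for r in range(4)]
print(total_variation(step), total_variation(oscillating))  # prints 4 24
```

A single sharp edge costs little, whereas noise and fine texture cost a lot, which is why minimizing total variation tends to keep discontinuities while removing oscillations.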
The study in [20] first divides every picture into regions corresponding to one of
three morphological components, namely contours, textures, and smooth areas, using the
regional energy of the alternating-current (AC) coefficients of a discrete cosine
transform (DCT). Next, a block size is chosen for each morphological component: the
smallest block size for contour regions, a medium block size for texture regions, and
the largest block size for smooth regions. To better preserve picture features,
denoising is performed with a multistep technique in which each stage resembles the
BM3D method but uses adaptive block sizes and different transforms. Experimental
results show that the method achieves higher PSNR and MSSIM values than the BM3D
approach, as well as better visual quality of the restored pictures than BM3D and
some other state-of-the-art methods.
A novel variational structure-texture decomposition method is presented in [6]. Its
main components are (1) using the low-pass-filtered level-set curvature of the source
image as a guide image and (2) suppressing texture by minimizing a variable-exponent
energy, where the variable exponent is learned from the curvature-guided image
filtering result. According to quantitative studies, the approach is competitive with
the state of the art in structure-texture image decomposition. Several applications
are also considered.
The study in [28] presents a multi-resolution analysis framework for compression and
shows how a 2-D picture can be split into four parts, namely the approximation image
and the detail images. The wavelet coefficients are thus organized into four sub-band
pictures. A multi-resolution 2-D discrete wavelet transform (DWT) decomposition is
used to obtain the picture approximations. The source picture is reconstructed from
the lowest-frequency sub-band (i.e., LL) of the three-level decomposition, that is,
using only the approximation coefficients. The experiment, performed in MATLAB,
reports a low error rate. The authors conclude that the two-dimensional DWT is
effective in attaining a satisfactory result.
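The four-sub-band split described above can be sketched with a one-level 2-D Haar transform. This is an illustrative assumption: the study in [28] uses MATLAB and a three-level decomposition, and the wavelet it uses is not stated here. The sub-band naming (LL, LH, HL, HH) follows one common convention, and all function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def split_pairs(vec):
    """Low-pass (scaled sums) and high-pass (scaled differences) of neighbouring pairs."""
    pairs = list(zip(vec[0::2], vec[1::2]))
    return [(a + b) / SQRT2 for a, b in pairs], [(a - b) / SQRT2 for a, b in pairs]

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def filter_rows(matrix):
    """Apply split_pairs to every row; return the low and high halves as matrices."""
    halves = [split_pairs(row) for row in matrix]
    return [lo for lo, _ in halves], [hi for _, hi in halves]

def haar_dwt2(img):
    """One-level 2-D Haar DWT of an image with even height and width."""
    low, high = filter_rows(img)                                   # along rows
    ll, lh = (transpose(m) for m in filter_rows(transpose(low)))   # along columns
    hl, hh = (transpose(m) for m in filter_rows(transpose(high)))
    return ll, lh, hl, hh

def approximation_from_ll(ll):
    """Inverse Haar transform with all detail sub-bands set to zero:
    every LL coefficient becomes a 2x2 block holding the original block mean."""
    out = []
    for row in ll:
        expanded = [v / 2.0 for v in row for _ in range(2)]
        out.extend([expanded, list(expanded)])
    return out

img = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
ll, lh, hl, hh = haar_dwt2(img)
print(approximation_from_ll(ll))
```

Applying `haar_dwt2` again to `ll` gives the next decomposition level, and repeating this three times would correspond to the three-level decomposition mentioned in the text.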
According to [1], data protection relies mainly on cryptography, with steganography
serving as a supplementary layer of security in some cases. Steganography is the
science of concealing the very existence of a message, here a textual picture, within
a transmission. Several steganographic approaches exist, and most of them introduce
noticeable changes to the cover carrier, particularly as the textual payload grows in
size.
The work in [23] presents a deep learning-based weighted naive Bayes classifier
(WNBC) that can identify text and characters in image files. Real-scene photos often
contain small disturbances, which are removed during preprocessing by supervised
image filtering. Both Gabor transforms and stroke width transform approaches are used
to extract the features that are important for classification. Using the extracted
features, WNBC, together with deep neural network-based adaptive galactic swarm
optimization, performs text identification and character detection. Accuracy,
F1-score, precision, mean square error, and recall are used to evaluate the proposed
method.
Baran et al. [4] presented a method for text detection and character recognition. It
takes a connected component-based approach that relies on detecting maximally stable
extremal regions (MSERs). Contour-based and geometric filtering is used to separate
text MSERs from non-text MSERs. The remaining text regions are then grouped into
words and sentences. After that, additional filtering steps exclude superfluous words
and non-text regions that do not match the expected characteristics. In the final
phase, OCR is applied to recognize the words and phrases that remain. Finally, the
scheme was implemented within a content identification and distribution framework.
Starck et al. [25] used multi-scale representations such as the ridgelet and curvelet
transforms to restore pictures, compared their results with well-established methods
based on thresholding of wavelet coefficients, and showed that the curvelet transform
better preserves the features of the data.
According to Feng et al. [16], several state-of-the-art data hiding algorithms
concentrate their embedding changes on the centers of L-shape patterns. However, an
embedding criterion of this kind modifies the boundary structure in an unbalanced
way. The paper proposes a scheme that exploits the embedding effects of the L-shape
pattern criterion to detect content-adaptive data hiding in digital pictures. It
first examines how different L-shape patterns affect the distribution of a single
4 × 3 pattern. Based on this analysis, four classes of patterns, each representing a
pair of pixels centered in the changed region, are used to build a 32-dimensional
feature set.
Aujol et al. [3] study the properties of norms that are dual to Sobolev and Besov
norms, building on Y. Meyer's earlier work on decomposing images into texture and
simple geometric components. They then propose a new decomposition model that splits
a picture into three components: its geometric structure, texture, and noise. The
model relies on three semi-norms: the total variation for the geometric component, a
negative Sobolev norm for the texture component, and a negative Besov norm for the
noise.
Gao et al. [17] proposed using morphological component analysis to decompose
mammogram pictures into a piecewise smooth component and a texture component in order
to improve the accuracy of mass detection. Mammographic mass detection is widely used
in breast cancer diagnosis, but distinguishing masses from normal regions is difficult
because of their rich morphological characteristics and ambiguous margins. The texture
component is used because it effectively suppresses structural noise and the effects
of blood vessels, and two concentric-layer criteria are established to detect
different types of suspicious regions in a mammogram.
Prasad et al. [24] discuss negentropy approximations and the associated nonlinear
functions for efficient use in frequency-domain independent component analysis
(FDICA). They propose a nonlinear function based on statistical modeling of the
time-frequency series of speech (TFSS) with a generalized Gaussian distribution (GGD),
which improves separation performance while also speeding up convergence. The
experiments strongly support the proposed nonlinear function.
Starck et al. [26] describe a method based on sparse signal representation,
Morphological Component Analysis (MCA), which rests on the assumption that, for every
signal behavior to be separated, there exists a dictionary that represents it
sparsely. A pursuit algorithm that searches for the sparsest representation then
yields the desired separation. The study also reports results on separating picture
content, theoretical results that justify the overall separation procedure, and a
description of several methods that support the proposed approach. It builds on
several redundant multi-scale transforms, including the ridgelet and curvelet
transforms.
The work in [22] describes a feature selection strategy based on the maximum mutual
information principle. The technique combines linear independent component analysis
(ICA)-based transforms with an efficient and accurate mutual information estimator,
on the understanding that a bound on the probability of error can be minimized by
maximizing mutual information measures of the features.
A linear ICA transform is used to map the joint features into approximately
independent components, which allows mutual information to be estimated one dimension
at a time. The ICA transform is estimated by solving a generalized eigendecomposition
problem, which is computationally practical and reliable. A limitation is that the
method is based on linear ICA, which does not always produce independent features, so
the additive decomposition of mutual information that the method relies on may not
hold.
Elad et al. [15] discuss a variety of signal processing problems, such as the
recovery of data lost in a physical measurement; the cocktail party problem, which is
the separation of one voice from a mixture of many others recorded at a gathering;
and the decomposition of a signal or picture into superposed contributions from
distinct sources. A related problem is the separation of a picture into several
parts, such as texture and cartoon (piecewise smooth) components.
Hyvärinen and Oja [21] presented the theoretical background and applications of ICA,
together with their recent work on the topic. ICA is a relatively recent method whose
goal is to find a linear representation of non-Gaussian random data in which the
components are statistically independent, or as independent as possible. Such a
representation makes it easier to build a suitable model in a variety of
applications, such as feature extraction. They also discuss related linear methods
such as principal component analysis and factor analysis.
A description of wavelets on an interval is given in [13], which provides a basis for
extending wavelet techniques to problems that are neither periodic nor defined on the
whole Euclidean space. The properties of such wavelets are studied, along with their
applications. The authors then analyze the stability of wavelet transforms, focusing
primarily on intervals, and propose strategies for improving it.
Donoho and Duncan [14] examine the curvelet transform, a recently introduced
multi-resolution representation suited to objects that are smooth apart from
discontinuities along curves. The original proposal was formulated for functions
defined on the continuum plane R². To realize the transform for digital data, the
authors define a strategy for computing a digital curvelet transform and describe a
software environment, Curvelet256, that implements this strategy for 256 × 256
images. They also discuss some of the experiments carried out with it.
According to Garofalakis and Gibbons [18], wavelet transforms provide an effective
hierarchical decomposition of functions. The wavelet transform of a function consists
of a coarse overall approximation together with detail coefficients that influence
the function at different scales. Numerous recent studies have demonstrated the
effectiveness of the wavelet transform for answering approximate queries over large
data sets. The idea is to apply the wavelet transform to the input data and build a
compact hierarchical summary of the data from a small number of wavelet coefficients.
The energy compaction and decorrelation properties of the wavelet transform help
produce concise and accurate approximate representations. Wavelet transforms can also
be computed in linear time, which makes the approach efficient.
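The idea of summarizing data with a few wavelet coefficients can be sketched as follows. The example uses the unnormalized Haar decomposition (pairwise averages and half-differences) and keeps the coefficients of largest magnitude; the function names, the sample data, and this simple selection rule are assumptions for illustration and not necessarily the scheme analysed in [18].

```python
def haar_decompose(data):
    """Full unnormalized Haar decomposition of a sequence whose length is a power of two.
    Returns [overall average, coarsest detail, ..., finest details]."""
    avg = [float(v) for v in data]
    details = []
    while len(avg) > 1:
        pairs = list(zip(avg[0::2], avg[1::2]))
        details = [(a - b) / 2.0 for a, b in pairs] + details  # coarser levels first
        avg = [(a + b) / 2.0 for a, b in pairs]
    return avg + details

def haar_reconstruct(coeffs):
    """Invert haar_decompose level by level."""
    avg, pos = list(coeffs[:1]), 1
    while pos < len(coeffs):
        diff = coeffs[pos:pos + len(avg)]
        pos += len(avg)
        avg = [v for a, d in zip(avg, diff) for v in (a + d, a - d)]
    return avg

def synopsis(coeffs, budget):
    """Keep the `budget` largest-magnitude coefficients and zero out the rest."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:budget])
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]

data = [2, 2, 0, 2, 3, 5, 4, 4]
coeffs = haar_decompose(data)
print(coeffs)                                  # [2.75, -1.25, 0.5, 0.0, 0.0, -1.0, -1.0, 0.0]
print(haar_reconstruct(synopsis(coeffs, 2)))   # [1.5, 1.5, 1.5, 1.5, 4.0, 4.0, 4.0, 4.0]
```

With only two of the eight coefficients retained, the reconstruction still captures the coarse shape of the data, which is the basis for answering approximate queries from the summary. Each level halves the working array, so the decomposition performs a linear number of arithmetic operations.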
Candès et al. [8] describe two digital curvelet transform algorithms in two and three
dimensions. The first is based on unequally spaced fast Fourier transforms (USFFT),
while the second is based on the wrapping of specially selected Fourier samples. The
two implementations differ in the spatial grid used to translate curvelets at each
scale and angle. Both digital transforms return a table of digital curvelet
coefficients indexed by scale, orientation, and spatial location parameters. For
n-by-n Cartesian arrays, both implementations are fast, with a computational
complexity of O(n² log n), and they are also invertible. Compared with earlier
implementations based on the first generation of curvelets, the proposed digital
transforms are simpler, faster, and significantly less redundant.
Four textural characteristics for building morphological components were
explored by Xu et al. [31]: content, coarseness, contrast, and directionality
(including horizontal and vertical). Furthermore, to evaluate the resulting morphological features, classification was performed on both remote-sensing hyperspectral and polarimetric synthetic aperture radar (SAR) images, demonstrating
the ability of the proposed techniques to handle multiple types of remote sensing
imagery. The results showed that, even with a limited number of training samples,
the proposed MCA framework can produce very good classification results in a
wide range of analysis scenarios.
The work in [11] provides a method for improving image size reduction by
employing a Partial EZW methodology. Partial EZW is an enhancement of Shapiro's
embedded zerotree wavelet (EZW) image compression method. The Partial EZW
approach addresses the loss of efficiency that EZW suffers when moving to lower
bit planes. In that study, integer wavelet transforms and region of interest [10]
coding were introduced into Partial EZW, making it superior to both the EZW and
the SPIHT methods.
4 Suggested Method
Morphological component analysis (MCA) is applied at the transmitter end to
separate an image into independent components, namely a natural-scene (smooth)
component and a texture component, as shown in the workflow of the proposed
methodology in Fig. 1. According to the analytical results, the curvelet
transform outperforms the wavelet transform in identifying the natural-scene
component. The initial values of the image are first assigned to the initial
vectors. The discrete cosine transform and the curvelet transform of the residual
are then computed. After the curvelet transform of the residual is obtained, hard
thresholding is applied to the curvelet coefficients.
Fig. 1 Flow diagram of suggested methodology. Sender side: database of cover
images; preprocess cover image; textual image; dictionary built by combining
several transformations; morphological component analysis; smoother part and
texture part of textual image; steganography (eLSB, optimal pixel selection).
Receiver side: steganalysis (textual image feature extraction); smoother part and
texture part of textual image; features of textual image; hybrid convolutional
neural network trained, based on the features of textual images, to identify the
textual image
The curvelet transform returns a matrix of frequency components. Image noise is a
high-frequency feature, so coefficients whose values are well below the threshold
are discarded. The extracted features are then embedded in the cover image using
eLSB-based steganography, and the stego image is transmitted over the Internet.
On the receiver side, the features of the textual image are separated from the
cover image by reversing the steganographic embedding, and the image coefficients
are recovered via the inverse transforms of the received matrices. Applying the
inverse curvelet transform produces an image containing ringing artifacts; total
variation regularization is used to overcome this problem.
The coefficients of a wavelet transform are typically sparse: for a noise-free
image, most of the coefficients are essentially zero. As a result, the image
restoration problem can be reframed as recovering the image coefficients that are
“stronger” than the Gaussian white noise background. Small-magnitude coefficients
are therefore treated as noise and reduced to zero. Wavelet thresholding compares
each coefficient against a threshold to decide whether it represents a useful part
of the original signal. The residual is then estimated via a discrete cosine
transform.
Thresholding is normally applied only to the detail coefficients of the image,
not to the approximation coefficients, since the latter represent
“low-frequency” content that typically comprises important components of the
signal and is much less affected by noise. The approximation coefficients
correspond to the low-frequency part, whereas the detail coefficients correspond
to the “high-frequency part.” The effectiveness of the previous MCA methodology
and of the current strategy was tested using various initial thresholds.
Coefficients with an absolute value below the set threshold are set to zero so
that only the significant coefficients are retained. The image is then
reconstructed using the inverse wavelet transform. The discrete cosine transform
of the residual is computed next. Two images are then combined: the one obtained
from the inverse discrete curvelet transform and the one obtained from the
inverse discrete cosine transform of the residual image.
This sum is then subtracted from the overall image, and the difference is
assigned as the new residual. The residual image is added to the output obtained
from the inverse curvelet transform. The curvelet transform of this output is
computed again, and the steps are repeated. The final result is obtained after a
number of iterations. The following procedure is used at the sending end to
extract the features of a textual image by morphological component analysis and
then hide those features using an eLSB-based steganography technique:
1. Initialize $K_{max}$ and the threshold $\lambda = \delta \cdot K_{max}$,
where $K_{max}$ is the total number of iterations for every layer $N$ and
$\lambda$ is the hard threshold with threshold level $\delta$.
2. For J = 1 to M, where M is the number of images:
Compute the residual term $Res$, assuming that the current value of $x_n$ is
variable and the value of $x_t$ is fixed:
$Res = x - x_t - x_n$
For a given sample of $m$ textual images, $x$ consists of a set of $n$ components
$\{x_i\}_{i=1,2,3,\dots,n}$ with distinct morphology, such that
$x = \sum_{i=1}^{n} x_i$.
3. Compute the curvelet transform of $x_n + Res$ and obtain
$\chi_n = \phi_n^{+}(x_n + Res)$
For each $i = 1$ to $n$, $\phi_i$ denotes a set of bases of the dictionary such
that $x_i$ is sparse in $\phi_i$ and not, or at least not as sparse, in $\phi_j$
for $j \neq i$.
4. Apply a hard threshold to the coefficients $\chi_n$ with threshold value
$\lambda$ and obtain
$\chi_n = \mathrm{HT}_{\lambda}\left(\phi_n^{+}(x_n + Res)\right)$
where $\mathrm{HT}_{\lambda}$ denotes hard thresholding at level $\lambda$.
5. For S = 1 to M:
Compute the residual term $Res$, assuming that the current value of $x_n$ is fixed
and the value of $x_t$ is variable:
$Res = x - x_t - x_n$
with $x = \sum_{i=1}^{n} x_i$ as defined in step 2.
6. Compute the discrete cosine transform of $x_t + Res$ and obtain
$\chi_t = \phi_t^{+}(x_t + Res)$
with the dictionary bases $\phi_i$ as defined in step 3.
7. Apply a hard threshold to the coefficients $\chi_t$ with threshold value
$\lambda$ and obtain
$\chi_t = \mathrm{HT}_{\lambda}\left(\phi_t^{+}(x_t + Res)\right)$
8. Read all usable frames of the cover image, together with the values of
$\chi_t$, $\chi_n$, and $Res$ as the features extracted by morphological
component analysis of the textual image.
9. Convert the values of $\chi_t$, $\chi_n$, and $Res$ to binary format.
10. Compute the LSB of each pixel of the cover image that will be sent over the
channel during transmission.
11. Replace the LSB of every pixel of the cover image to be transmitted with the
binary values of $\chi_t$, $\chi_n$, and $Res$.
12. The stego image is then produced and transmitted over the World Wide
Web.
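The decomposition loop in steps 1 to 7 can be sketched on a one-dimensional signal. This is only an illustration under stated assumptions: the curvelet dictionary is replaced by the identity (spike) basis because no curvelet transform is available in NumPy, the DCT is built as an explicit orthonormal matrix, the initial threshold is set to the largest coefficient, and the names `dct_matrix`, `hard_threshold`, and `mca_two_parts` are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix D: D @ x analyses, D.T @ c synthesises."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    d[0, :] /= np.sqrt(2.0)
    return d

def hard_threshold(coeffs, lam):
    """Zero every coefficient whose magnitude is below lam."""
    return np.where(np.abs(coeffs) >= lam, coeffs, 0.0)

def mca_two_parts(x, k_max=50):
    """Split x into a DCT-sparse part x_t and a spike-sparse part x_n."""
    x = np.asarray(x, dtype=float)
    D = dct_matrix(x.size)
    x_t = np.zeros_like(x)   # texture-like part (sparse in the DCT dictionary)
    x_n = np.zeros_like(x)   # stand-in for the curvelet part (sparse in the identity basis)
    # Assumption: start at the largest coefficient so that lam = delta * k_max.
    lam = max(np.abs(D @ x).max(), np.abs(x).max())
    delta = lam / k_max
    for _ in range(k_max):
        res = x - x_t - x_n                                # x_t fixed, update x_n
        x_n = hard_threshold(x_n + res, lam)               # identity analysis/synthesis
        res = x - x_t - x_n                                # x_n fixed, update x_t
        x_t = D.T @ hard_threshold(D @ (x_t + res), lam)   # DCT analysis, threshold, synthesis
        lam -= delta                                       # lower the threshold linearly
    return x_t, x_n, x - x_t - x_n
```

The three returned arrays always sum back to the input, which mirrors the relation Res = x - xt - xn used in the steps above. How cleanly the two parts separate depends on the signal and on the dictionaries chosen.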
The following algorithm is used at the receiver side to extract the textual
image:
1. Read the received stego image.
2. Compute the least significant bit of each pixel of the stego image that was
sent from the sender side and received at the receiver side.
3. Collect the bits and convert each group of 8 bits to a character to obtain the
values of $\chi_t$, $\chi_n$, and $Res$.
4. Rebuild $x_n$ by $x_n = \phi_n \chi_n$, where $\phi_n$ is the inverse of $\phi_n^{+}$.
5. Rebuild $x_t$ by $x_t = \phi_t \chi_t$, where $\phi_t$ is the inverse of $\phi_t^{+}$.
6. Rebuild $x$ from the residual relation
$Res = x - x_t - x_n$, that is, $x = x_t + x_n + Res$.
7. Update the threshold by $\lambda = \lambda - \delta$.
8. If $\lambda > \delta$, continue the rebuilding procedure.
9. Else, finish.
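The embedding and extraction of the feature bits can be sketched as a plain LSB round trip. This is a simplified illustration, not the chapter's eLSB scheme: it writes the payload into the first pixels in raster order and omits optimal pixel selection and the header bytes. The function names `embed_lsb` and `extract_lsb`, and the assumption that the features have already been serialized to bytes, are hypothetical.

```python
import numpy as np

def embed_lsb(cover, payload: bytes):
    """Write the payload bits into the least significant bits of an 8-bit cover image."""
    bits = np.unpackbits(np.frombuffer(payload, dtype=np.uint8))
    flat = cover.astype(np.uint8).ravel().copy()
    if bits.size > flat.size:
        raise ValueError("payload too large for cover image")
    # Clear the LSB of the first len(bits) pixels, then set it to the payload bit.
    flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits
    return flat.reshape(cover.shape)

def extract_lsb(stego, n_bytes):
    """Read n_bytes back from the least significant bits of the stego image."""
    bits = stego.ravel()[:n_bytes * 8] & 1
    return np.packbits(bits).tobytes()
```

Because only the lowest bit of each pixel changes, every pixel of the stego image differs from the cover image by at most one gray level. The receiver must know the payload length, which is one reason a header is stored with the hidden data.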
5 Results and Discussion
The main purpose of this work is to retrieve textual images concealed inside a
cover image. The textual image extraction technique is divided into two stages,
one on the sender side and the other on the receiver side. First, the proposed
morphological component analysis was applied to a textual image in order to
enhance the variation between distinct textures throughout the image. The
accompanying examples show how a textual image is decomposed into different parts
based on various textural qualities, using a specified dictionary for MCA.
Based on coarseness, the image is separated into coarse (strong) and fine (weak)
components, where coarseness is an estimate of the number of edges in a local
square neighborhood of a given radius (see Fig. 2).
Based on contrast, the image is separated into high-contrast (strong) and
low-contrast (weak) parts (see Fig. 3).
Based on directionality, the image is separated into horizontal and vertical
parts, even where one of these orientations is absent (see Fig. 4).
Based on line likeness, the image is separated into line-like (strong) and
non-line-like (weak) components (see Fig. 5).
Following the decomposition [12] of a textual image into these parts, the
components were enhanced using appropriate image enhancement techniques and then
merged to obtain the processed textual image. The modified textual images shown
in Fig. 6 are then hidden inside the cover image shown in Fig. 7 using an
enhanced least significant bit steganographic approach, and the resulting stego
image is shown in Fig. 8.
Fig. 2 Decomposition of image by coarseness
Fig. 3 Decomposition of image by contrast
Figure 6 illustrates the proposed technique, which uses eLSB to optimize image
quality so that the methodology can be carried out properly. Because the
proposed steganographic technique operates in the spatial domain, it is divided
into two stages. In the first stage, the metadata is established: the first few
bytes of the cover image hold header data. The hidden data is then inserted into
the cover image in an optimized manner. Figure 9a and b show the outcome of each
stage for two distinct text images: the source text image, the features of the
preprocessed image, and the resulting stego image.
Fig. 4 Images decomposed based on directionality
The secret key of the embedding method is then provided to the textual
information extraction procedure on the receiver end. The key is used to extract
the textual images, and the embedded content is then recovered from the cover
image using a deep learning-based hybrid convolutional neural network (CNN). An
adaptive optimization method is applied to further improve the effectiveness of
the deep learning algorithm. The peak signal-to-noise ratio (PSNR) and the
structural similarity (SSIM) index are used to evaluate the effectiveness of the
presented secret text retrieval technique.
As given in Eq. 1, the mean squared error (MSE) measures the degree of
similarity or, conversely, the magnitude of the difference (degradation) between
the original and the decompressed image. For an M × N original image I and a
decompressed image K, the MSE is defined as:

$$\mathrm{MSE} = \frac{1}{M \times N} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \left[ I(i,j) - K(i,j) \right]^2 \tag{1}$$

Fig. 5 Images decomposed based on line likeness
Fig. 6 Textual images that have been manipulated
Fig. 7 Cover image that has been used to hide textual image
Fig. 8 Stego image that has resulted
As described by Eq. 2, the peak signal-to-noise ratio (PSNR) is an engineering
term for the ratio between the maximum possible power of a signal and the power
of the corrupting noise that affects the fidelity of its representation. Because
many signals have a very wide dynamic range, PSNR is usually expressed on the
logarithmic decibel scale. PSNR is most commonly used to measure the quality of
reconstruction of lossy image compression codecs. In that case, the signal is the
original data and the noise is the error introduced by compression. When
compression codecs are compared, PSNR serves as an approximation to human
perception of reconstruction quality.

$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{MAX_I^2}{\mathrm{MSE}}\right) \tag{2}$$

where $MAX_I$ is the maximum possible pixel value of the image.
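The two measures in Eqs. 1 and 2 can be computed directly with NumPy. The sketch below assumes 8-bit images, so the peak value defaults to 255; the function names `mse` and `psnr` are illustrative only.

```python
import numpy as np

def mse(original, restored):
    """Mean squared error between two images of the same shape (Eq. 1)."""
    a = np.asarray(original, dtype=np.float64)
    b = np.asarray(restored, dtype=np.float64)
    return float(np.mean((a - b) ** 2))

def psnr(original, restored, max_value=255.0):
    """Peak signal-to-noise ratio in decibels (Eq. 2)."""
    err = mse(original, restored)
    if err == 0.0:
        return float("inf")   # identical images: the ratio is unbounded
    return float(10.0 * np.log10(max_value ** 2 / err))
```

Converting to floating point before subtracting avoids the wrap-around that would occur if two unsigned 8-bit arrays were subtracted directly.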
The structural similarity (SSIM) index given by Eq. 3 is a metric for measuring
how similar two images are. The SSIM [30] index is a full-reference metric; that
is, it measures image quality using an initial uncompressed or distortion-free
image as the reference. SSIM was designed to improve on conventional methods,
including PSNR and MSE, which have been shown to be inconsistent with human
visual perception.
Fig. 9 (a) and (b) represent feature extraction using morphological component analysis and
encoding of textual image by using steganography
The difference between SSIM and the other methods mentioned, such as MSE or PSNR,
is that those methods estimate absolute errors, whereas SSIM treats image
degradation as a perceived change in structural information. The term
“structural information” reflects the fact that pixel values have strong
interdependencies, especially when they are spatially close. These dependencies
carry important information about the structure of the objects in the visual
scene.
The measure between two windows x and y of common size N × N is:

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)} \tag{3}$$

with
$\mu_x$ the average of x,
$\mu_y$ the average of y,
$\sigma_x^2$ the variance of x,
$\sigma_y^2$ the variance of y,
$\sigma_{xy}$ the covariance of x and y,
$c_1 = (k_1 L)^2$ and $c_2 = (k_2 L)^2$ two variables to stabilize the division with weak
denominator,
$L$ the dynamic range of the pixel values (typically $2^{\text{bits per pixel}} - 1$),
$k_1 = 0.01$ and $k_2 = 0.03$ by default.
The resulting SSIM index is a decimal value between −1 and 1, and the value 1 is
reached only when the two sets of data are identical. It is typically computed on
8 × 8 windows. The window can be displaced pixel by pixel across the image, but
in practice only a subset of the possible windows is used in order to reduce the
computational complexity.
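A single-window evaluation of Eq. 3 can be sketched as follows. The sketch assumes 8-bit data (dynamic range 255) and uses population statistics over the window, whereas the reference formulation in [30] uses weighted local statistics; the function name `ssim_window` is hypothetical.

```python
import numpy as np

def ssim_window(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """SSIM of two equally sized windows, following Eq. 3."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()            # population variance (assumption)
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)
```

An image-level score is usually obtained by averaging this value over many windows, for example 8 × 8 blocks taken at regular offsets.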
A performance analysis was carried out to evaluate the adequacy of the suggested
method. The evaluation showed that the suggested technique outperformed the existing approach in terms of the structural similarity index measure,
accuracy, and peak signal-to-noise ratio across a range of thresholds (see Tables 1, 2
and 3).
When the TV regularization approach is applied with the Haar and Daubechies
wavelets, the effect on the PSNR, SSIM, and accuracy of textual images at
different threshold levels is as shown in Figs. 10, 11 and 12.
Table 1 Illustrates the influence on PSNR of the Haar and Daubechies filtering techniques for
cover image shown in Fig. 6
Threshold (γ)   PSNR with Haar (existing methods)   PSNR with Daubechies (proposed methods)
5               40.12                               39.75
7               40.25                               39.70
9               40.50                               40.25
10              41.00                               40.50
12              41.25                               42.00
14              41.60                               44.20
16              42.70                               45.70
18              43.80                               47.20
19              43.76                               48.09
20              44.03                               48.24
21              44.34                               48.54
24              45.42                               49.54
Table 2 Illustrates the influence on SSIM of the Haar and Daubechies filtering techniques for
cover image shown in Fig. 6
Threshold (γ)   SSIM with Haar   SSIM with Daubechies
5               0.9964           0.9972
7               0.9981           0.9975
9               0.9974           0.9980
10              0.9976           0.9980
12              0.9981           0.9981
14              0.9979           0.9981
16              0.9980           0.9982
18              0.9983           0.9985
19              0.9967           0.9984
20              0.9977           0.9985
21              0.9976           0.9967
24              0.9978           0.9986
6 Conclusions
At present, when the primary form of communication involves mobile technology
with Internet connectivity for transmitting data, the key problem is the
safeguarding of secret information. For secure communication, information
extraction, and validation of textual information across a public network
connection, this methodology uses morphological component analysis, total
variation, enhanced LSB (e-Least Significant Bit) steganography, a deep
learning-based weighted naive Bayes classifier, and an adaptive optimization
method.
To secure information transfer over a general Internet data transmission link, a
combination of morphological component analysis, total variation, steganography,
and a weighted naive Bayes classifier-based deep learning algorithm was used in
several stages. The first stage employs morphological component analysis and
total variation to generate image components based on coarseness, directionality,
contrast, and line likeness. In the second stage, the morphological components of
the text-based image are further divided and then embedded into the least
significant bits of the cover image using a spatial-domain steganographic method.
Table 3 Illustrates the influence on accuracy of the Haar and Daubechies filtering techniques for
cover image shown in Fig. 6
Threshold (γ)   Accuracy with Haar filtering techniques   Accuracy with Daubechies filtering techniques
5               95.645                                    96.343
7               96.998                                    97.123
9               97.923                                    98.126
10              96.782                                    97.945
12              96.976                                    97.997
14              97.135                                    98.342
16              98.695                                    99.182
18              98.456                                    99.385
19              98.443                                    99.461
20              98.544                                    99.465
21              98.556                                    99.532
24              98.561                                    99.623
Fig. 10 Peak signal-to-noise ratio comparisons for the Haar and Daubechies filters utilized in the
MCA and steganography techniques (plot of PSNR against threshold; series: PSNR with Haar filter,
PSNR with Daubechies filters)
Fig. 11 A comparison of the structural similarity index measure (SSIM) for the Haar filter and
the Daubechies filter used for the MCA and steganography procedures (plot of SSIM against
threshold; series: SSIM with Haar, SSIM with Daubechies)
Fig. 12 A comparison of the accuracy of the Haar and Daubechies filters employed in the MCA
and steganographic processes (plot of accuracy against threshold; series: accuracy with Haar
filtering techniques, accuracy of proposed method with Daubechies filtering techniques)
The secret key obtained from the textual image embedding method is sent to the
textual image retrieval technique on the receiver side in order to recover the
textual image, and the embedded textual image is finally detected using a
weighted naive Bayes classifier. An adaptive optimization method can be used to
increase the effectiveness of the weighted naive Bayes classifier. The suggested
scheme, which uses a Daubechies filter to create text-based image features,
improves on the conventional process, which uses the Haar filter to create
text-based image features. The suggested model improves not only PSNR but also
SSIM.
The suggested methodology performs textual extraction correctly at the receiver
side, although letters may occasionally be missed or the same letter may be
retrieved repeatedly, leading to the recovery of incorrect textual data. An
integrated solution that prevents such errors in textual recognition should
therefore be developed in future work. In addition, several types of
fuzzification algorithms could be considered, and an ant colony optimization
approach could be used to achieve better results.
References
1. Attaby, A. A., Ahmed, M. F. M., & Alsammak, A. K. (2018). Data hiding inside JPEG images
with high resistance to steganalysis using a novel technique: DCT-M3. Ain Shams Engineering
Journal, 9(4), 1965–1974.
2. Aujol, J. F., Aubert, G., Blanc-Féraud, L., & Chambolle, A. (2003). Image decomposition
application to SAR images. In L. D. Griffin & M. Lillholm (Eds.), Scale space methods in
computer vision. Scale-space 2003 (Lecture notes in computer science) (Vol. 2695). Springer.
https://doi.org/10.1007/3-540-44935-3_21
3. Aujol, J. F., Aubert, G., Blanc-Féraud, L., & Chambolle, A. (2005). Image decomposition into a
bounded variation component and an oscillating component. Journal of Mathematical Imaging
and Vision, 22(1), 71–88.
4. Baran, R., Partila, P., & Wilk, R. (2018). Automated text detection and character recognition
in natural scenes based on local image features and contour processing techniques. In
W. Karwowski & T. Ahram (Eds.), International conference on intelligent human systems
integration (pp. 42–48). Springer.
5. Beck, A., & Teboulle, M. (2009). Fast gradient-based algorithms for constrained total variation
image denoising and deblurring problems. IEEE Transactions on Image Processing, 18(11),
2419–2434.
6. Belyaev, A., & Fayolle, P. A. (2018). Adaptive curvature-guided image filtering for structure+
texture image decomposition. IEEE Transactions on Image Processing, 27(10), 5192–5203.
7. Candes, E. J., & Donoho, D. L. (2000). Curvelets – A surprisingly effective nonadaptive
representation for objects with edges. Saint-Malo Proceedings, 1–10.
8. Candès, E., Demanet, L., Donoho, D., & Ying, L. (2006). Fast discrete curvelet transforms. Multiscale Modeling and Simulation, 5(3), 861–899. ISSN 1540-3459.
https://resolver.caltech.edu/CaltechAUTHORS:CANmms06
9. Caselles, V., Chambolle, A., & Novaga, M. (2015). Total variation in imaging. In O. Scherzer
(Ed.), Handbook of mathematical methods in imaging. Springer.
https://doi.org/10.1007/978-1-4939-0790-8_23
10. Chaoqiang, L. (2004). ROI and FOI algorithms for wavelet-based video compression. In
PCM’04 proceedings of the 5th Pacific rim conference on advances in multimedia information
processing, 3, 241–248.
11. Charalampos, D., & Ilias, M. (2007, September/October). Region of interest coding techniques
for medical image compression. IEEE engineering in medicine and biology magazine.
12. Chen, S., Donoho, D., & Saunders, M. (1998). Atomic decomposition by basis pursuit. SIAM
Journal on Scientific Computing, 20, 33–61.
13. Cohen, A. (2000). Wavelet methods in numerical analysis. In E. Trélat & E. Zuazua (Eds.),
Handbook of Numerical Analysis (Vol. 7, pp. 417–711). Elsevier.
https://doi.org/10.1016/S1570-8659(00)07004-6. ISSN 1570-8659, ISBN 9780444503503.
14. Donoho, D., & Duncan, M. (2000). Digital Curvelet transform: Strategy, implementation and
experiments. Proceedings of SPIE – the International Society for Optical Engineering. 4056.
https://doi.org/10.1117/12.381679.
15. Elad, M., Fadili, J., Starck, J., & Donoho, D. (2010). MCALab: Reproducible research in signal
and image decomposition and inpainting. Computing in Science & Engineering, 12(1), 44–63.
https://doi.org/10.1109/MCSE.2010.14
16. Feng, B., Weng, J., Lu, W., & Pei, B. (2017). Steganalysis of content-adaptive binary image
data hiding. Journal of Visual Communication and Image Representation, 46, 119–127.
17. Gao, X., Wang, Y., Li, X., & Tao, D. (2010). On combining morphological component
analysis and concentric morphology model for mammographic mass detection. IEEE Transactions on Information Technology in Biomedicine, 14(2), 266–273.
https://doi.org/10.1109/TITB.2009.203616
18. Garofalakis, M., & Gibbons, P. B. (2002). Wavelet synopses with error guarantees. In M.
J. Franklin, B. Moon, & A. Ailamaki (Eds.), Proceedings of the 2002 ACM SIGMOD
international conference on Management of Data (Madison, WI) (pp. 476–487). ACM.
19. Guo, C., Zhu, S., & Wu, Y. (2003). Towards a mathematical theory of primal sketch and
Sketchability. Proceedings of the ninth IEEE international conference on computer vision
(ICCV), Nice, France.
20. Hou, Y., & Shen, D. (2018). Image denoising with morphology- and size-adaptive block-matching transform domain filtering. Journal on Image and Video Processing, 2018(59).
https://doi.org/10.1186/s13640-018-0301-y
21. Hyvärinen, A., & Oja, E. (2000). Independent component analysis: Algorithms and applications. Neural Networks, 13(4–5), 411–430. https://doi.org/10.1016/S0893-6080(00)00026-5.
ISSN 0893-6080.
22. Lan, T., Erdogmus, D., Adami, A., & Pavel, M. (2005). Feature selection by independent
component analysis and mutual information maximization in EEG signal classification. In
Proceedings. 2005 IEEE international joint conference on neural networks (Vol. 5, pp. 3011–
3016). https://doi.org/10.1109/IJCNN.2005.1556405.
23. Pandey, D., Pandey, B. K., & Wairya, S. (2021). Hybrid deep neural network with adaptive
galactic swarm optimization for text extraction from scene images. Soft Computing, 25, 1563–
1580. https://doi.org/10.1007/s00500-020-05245-4
24. Prasad, R., Saruwatari, H., & Shikano, K. (2005). Blind separation of speech by fixed-point
ICA with source adaptive negentropy approximation. IEICE Transactions, 88-A, 1683–1692.
https://doi.org/10.1093/ietfec/e88-a.7.1683
25. Starck, J. L., Candes, E., & Donoho, D. (2003). Astronomical image representation by the
curvelet transform. Astronomy and Astrophysics, 398, 785–800.
https://doi.org/10.1051/0004-6361:20021571
26. Starck, J. L., Elad, M., & Donoho, D. (2004). Redundant multiscale transforms and their
application for morphological component separation. Advances in Imaging and Electron
Physics, 132.
27. Starck, J. L., Elad, M., & Donoho, D. L. (2005). Image decomposition via the combination
of sparse representations and a variational approach. IEEE Transactions on Image Processing,
14, 1570–1582.
28. Thakral, S., & Manhas, P. (2019). Image processing by using different types of discrete wavelet
transform. In A. Luhach, D. Singh, P. A. Hsiung, K. Hawari, P. Lingras, & P. Singh (Eds.),
Advanced Informatics for Computing Research. ICAICR 2018 (Communications in computer
and information science) (Vol. 955). Springer. https://doi.org/10.1007/978-981-13-3140-4_45
29. Tharwat, A. (2021). Independent component analysis: An introduction. Applied Computing
and Informatics, 17(2), 222–249. https://doi.org/10.1016/j.aci.2018.08.006
30. Wang, Z., Bovik, A. C., Sheikh, H. R., & Simoncelli, E. P. (2004). Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing,
13(4), 600–612. https://doi.org/10.1109/TIP.2003.819861
31. Xu, X., Li, J., & Mura, M. D. (2016). Multiple morphological component analysis based
decomposition for remote sensing image classification. IEEE Transactions on Geoscience and
Remote Sensing, 54(5), 3083–3102.
Data Detection in Wireless Sensor
Network Based on Convex Hull
and Naïve Bayes Algorithm
Edwin Hernan Ramirez-Asis, Miguel Angel Silva Zapata, A. R. Sivakumaran,
Khongdet Phasinam, Abhay Chaturvedi, and R. Regin
1 Introduction
Wireless sensor networks (WSNs) play a major role in the world because of their
applications in wildlife tracking, sensing of military movements, health care
systems, structural health monitoring of buildings, and storing environmental
observations [7]. Distributed sensor systems based on geographic routing have
applications in the Internet of Things (IoT) domain. Greedy forwarding is
considered one of the best geographic routing schemes because of its simplicity
and efficiency. However, the greedy mechanism for data transport has a drawback,
namely the communication void. Most geographic routing schemes comprise two
components: (1) greedy forwarding and (2) a backup mode. In this chapter, we
propose a novel void-handling
E. H. Ramirez-Asis · M. A. S. Zapata
Santiago Antúnez de Mayolo National University, Huaraz, Peru
e-mail: ehramireza@unasam.edu.pe; msilvaz@unasam.edu.pe
A. R. Sivakumaran
Information Technology, Malla Reddy Engineering College for Women, Secunderabad, India
K. Phasinam
School of Agricultural and Food Engineering, Faculty of Food and Agricultural Technology,
Pibulsongkram Rajabhat University, Phitsanulok, Thailand
e-mail: phasinam@psru.ac.th
A. Chaturvedi
Department of Electronics & Communication Engineering, GLA University, Mathura,
Uttar Pradesh, India
e-mail: abhay.chaturvedi@gla.ac.in
R. Regin ()
Department of Information Technology, Adhiyamaan College of Engineering, Hosur,
Tamil Nadu, India
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
S. Pandey et al. (eds.), Role of Data-Intensive Distributed Computing Systems in
Designing Data Solutions, EAI/Springer Innovations in Communication and
Computing, https://doi.org/10.1007/978-3-031-15542-0_3
E. H. Ramirez-Asis et al.
technique that identifies the void boundary. Moreover, a relay node is used so
that packets from the source travel around the void along a path with a lower hop
count. The nodes in a WSN interact with the surrounding environment using sensors
and actuators. Sensor nodes in a WSN operate under severe energy constraints
[6]. Difficulties arising from the power, processing, sensing, and location units
in the sensor hardware may result in hardware failure. Likewise, a badly written
sensor program may result in software failure. Moreover, problems in the
transceiver of the sensors lead to communication failure in the WSN. The data are
classified based on the data sensed by the WSN [12].
The static nodes of the WSN are randomly deployed within a two-dimensional square
monitoring area, which is partitioned into Voronoi polygons. The key points of
the static nodes are then connected in the counterclockwise direction to
determine the hole regions. Depending on the particular concave shape, each hole
is reduced to its convex hull, and the center of the convex hull is used as the
base point. The Delaunay triangulation is combined with the convex hull vertices
to calculate the position of the virtual repair node. Finally, a multi-factor
cooperative matching decision table between the virtual repair nodes and the
mobile nodes is built from the relative distance of the nodes, their remaining
energy, and their criticality; according to this decision table, a mobile node
moves over a limited distance to achieve optimized repair of the coverage hole.
Simulation experiments are designed to evaluate the existing coverage holes. The
AEL-HO algorithm improves network coverage and extends the network life cycle.
2 Related Work
Data appear in various forms, e.g., lost data, offset data, out-of-bounds data,
gain data, spike data, stuck-at data, noise data, or random data. Detecting such data in
WSNs is complex because the sensors are resource-restricted and the placement fields differ.
In recent years, with the advancements in ad hoc networks, numerous researchers have been
actively contributing to the domain of IoT and sensor systems. However,
in contrast to MANETs, the mobility of vehicles in IoT is commonly constrained
by predefined roads. The speed of a vehicle is additionally restricted by
speed limits, the level of road congestion, and traffic control mechanisms.
Consequently, developing a realistic mobility model of IoT is critical for
evaluating and designing routing protocols. This linkage is used to maintain
a dynamic transmission that adapts the transmission range of the vehicle
to the local traffic conditions. The effect of vehicular traffic
on Matt's traceroute (MTR) is mostly analyzed as the density changes from steady
flow to heavily congested driving conditions. Clients must perceive the urban infill,
which is not simply to copy the length of vehicles. In addition, traffic
congestion reduces the average speed of vehicles. The assumption of local
non-repudiation allows a node to apply Bayesian reasoning within its
local neighborhood, where the node is capable of recognizing its neighbors.
Data Detection in Wireless Sensor Network Based on Convex Hull and Naïve. . .
Data detection techniques play a major role in ensuring a reliable operating
condition of WSNs. Different types of data detection algorithms are used in
WSNs, based on either supervised or unsupervised learning.
A support vector machine (SVM) classifier grounded in statistical learning theory [8] can be
introduced to detect data in the networks. The classifier is trained using
kernel functions, including radial basis, polynomial, and linear kernels.
However, the single-hop indoor dataset is not considered. To overcome these problems,
trend analysis with least squares SVM [4] was developed for improved sensor data diagnosis,
but the error-correcting output matrix used for data classification has
limitations. SVM with statistical time-domain features [10] and one-class SVM
with stochastic gradient descent [3] have been presented for data detection in
WSNs, but they are suitable only for binary classification of data. For multi-class data
classification, the classifier identification process is combined with convolutional
neural networks (CNN) [11] and random forest (RF) [13], which also take into account the
energy consumed by the sensors during anomaly detection. However, there is no detailed
study on the evaluation of hybrid classifiers. The comparison is made by applying
these algorithms to different datasets, and their feature values are used for the
classification of sensed data.
2.1 Challenges and Problem Statement
The WSNs face various challenges in data detection mechanism due to the following
reasons.
• Because resources at the node level are limited, nodes can use classifiers [2] only
in restricted ways that avoid difficult calculation.
• Sensor nodes often have to be placed in hazardous and uncertain
environments.
• Medical information detection techniques [14] must be accurate and random
to eliminate loss. For example, the method should identify the differences between
normal data and sensor data. Otherwise, inaccurate information may be
obtained, which might lead to a misleading response.
2.2 Contributions
• Adaptive ensemble learning with hyper-optimization classifier is utilized to
detect health information in WSNs.
• In addition, three more classifiers are applied to the datasets, and an extensive
experimental evaluation is carried out on WSN data. In this chapter, SVM, RF,
and CNN classifiers are used for comparison with the proposed AEL-HO classifier.
• Performance of classifiers is analyzed based on the parameters of the measures
such as F1-score, detection accuracy (DA), Matthew’s correlation coefficient
(MCC), and true positive rate (TPR).
In the work discussed in this chapter, a novel adaptive ensemble learning with
hyper-optimization (AEL-HO) algorithm has been developed to reduce the complexity of the
AEL-HO algorithm when dealing with large training datasets. The proposed
technique retains relevant data points and removes irrelevant data points from the
dataset. Initially, the k-means clustering process is applied to select training data
points. Next, the quickhull algorithm [5] computes the convex hull of each cluster
whose data points carry a single class label. The data points at the convex hull vertices,
together with the data points of clusters containing more than one class label, are finally
taken as the selected training data points of the AEL-HO classifier. The experimental results
on a large dataset demonstrate that the proposed technique minimizes the total number of
training data points without reducing accuracy.
Here, a 90% reduction in training time is achieved in comparison to the AEL-HO
method.
This chapter is organized as follows. The general explanation of WSNs is given
in Sect. 1. Related works and the proposed contribution are presented in Sect. 2. The
proposed methodology of the AEL-HO classifier is explained in Sect. 3. The system
model result is structured in Sect. 4, and the conclusion of the chapter is given in
Sect. 5.
3 Proposed Methodology
3.1 Preprocessing
The aim of preprocessing is to eliminate unwanted words from the bug report.
Unnecessary words are removed during analysis since they may worsen the
learning performance. The feature space is thereby reduced, which makes
learning and data analysis easier. This process comprises three
steps: tokenization, stop-word removal, and stemming. First, a sequence of text
is partitioned into numbers, words, punctuation, etc., which are termed tokens.
Then, every punctuation mark is replaced with a space, nonprintable escape characters
are eliminated, and all words are changed to lowercase. Finally, each word is replaced
by its common stem, which is saved as a selected feature. For instance, words like
“moving,” “moved,” “moves,” and “move” are replaced with the word “move.”
The words obtained once preprocessing is completed are termed features, as
given in Eqs. (1) and (2).
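The preprocessing steps described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the stop-word list `STOP_WORDS` and the toy suffix stripper `stem` are hypothetical stand-ins, since the chapter does not name a stop-word list or a stemming algorithm. Note that this toy stemmer maps the four example words to the common stem "mov" rather than "move."

```python
import re

# Hypothetical stop-word list; the chapter does not specify one.
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "it", "of", "or", "the", "to"}

def tokenize(text):
    """Lowercase, replace punctuation and non-printable characters with spaces, split."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return cleaned.split()

def stem(word):
    """Toy suffix stripper (an assumption, not a full stemming algorithm)."""
    for suffix in ("ing", "ed", "es", "e", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Tokenize, drop stop words, and stem; the result plays the role of S_j."""
    return [stem(tok) for tok in tokenize(text) if tok not in STOP_WORDS]

features = preprocess("The node moved, and it is moving!")
print(features)
```

A production pipeline would normally replace `stem` with an established stemmer; the sketch only shows the order of the three steps.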
$$BR = \{S_1, S_2, S_3, \ldots, S_j\} \tag{1}$$
$$S_j = \{f_1, f_2, f_3, \ldots, f_k\} \tag{2}$$
3.1.1 N-Gram Extraction
During feature extraction, the information present in the bug report is expressed as
a feature vector. Transforming the information of the bug report into various sets of
n-gram data supports extending the network, so the features are represented in a
better way. The n-gram method is deployed for observing the semantic relationship and
for estimating the frequency of feature order. The n-gram model conditions each feature
on the preceding k − 1 features and organizes the series of i features into unigrams,
bigrams, and trigrams.
$$P(f_i \mid f_1, f_2, \ldots, f_{i-1}) = p(f_i \mid f_{i-k+1}, \ldots, f_{i-1}) \tag{3}$$
where f_i and p() represent the feature (word) and the probability, respectively. In a
unigram model, successive features are assumed to be independent of one another,
so the features of the feature string carry no mutual information. Hence, the conditional
probability of a unigram is given as follows.
$$P(f_1/w_1) = p\left(\frac{f_i}{w_1}\right) \tag{4}$$
In bigram, two adjoining features provide language information. Its conditional
probability is written as follows.
$$P(f_1/w_1) = p\left(\frac{f_{i+1}}{w_2}\right) \tag{5}$$
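As a hedged illustration of the unigram, bigram, and trigram organization, the following Python sketch extracts n-grams from a feature sequence and estimates a bigram conditional probability by simple counting. The function names and the maximum-likelihood estimate are assumptions made for illustration; the chapter does not state how its probabilities are estimated.

```python
from collections import Counter

def ngrams(features, n):
    """Return all contiguous n-feature tuples of the feature sequence."""
    return [tuple(features[i:i + n]) for i in range(len(features) - n + 1)]

def bigram_probability(features, prev, cur):
    """Maximum-likelihood estimate of p(cur | prev) from bigram counts."""
    pairs = ngrams(features, 2)
    pair_counts = Counter(pairs)
    # Count how often `prev` opens a bigram, i.e., has a successor.
    context_counts = Counter(first for first, _ in pairs)
    if context_counts[prev] == 0:
        return 0.0
    return pair_counts[(prev, cur)] / context_counts[prev]

# Hypothetical feature sequence produced by preprocessing.
features = ["sensor", "node", "fault", "sensor", "node", "reset"]
unigrams, bigrams, trigrams = (ngrams(features, n) for n in (1, 2, 3))
print(len(unigrams), len(bigrams), len(trigrams))
print(bigram_probability(features, "node", "fault"))
```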
Various n-gram techniques must be integrated to exploit its entire ability. Several
n-gram techniques can be used for analyzing an individual sentence, and then the
results obtained are combined. Thus, the relationship among n-gram feature is
expressed as an analysis at word level, which is given as follows.
$$P(f_1, f_{k+1}) = p(f_1)\, p(f_{k+1}, f_k, \ldots, f_{k-m}, w_i)\, p(w_i) \tag{6}$$
Algorithm
Input ← Training data D_tr, testing data D_ts, and defective rate σ_d
Output → Predicted class label c_j of every instance in the testing data
For every instance of D_tr and D_ts
    1. Delete every duplicate instance from D_tr and D_ts.
    2. Fill each missing instance value in D_tr and D_ts with the mean of the
       corresponding instance value.
    3. Normalize both D_tr and D_ts using the min-max method.
    Input ← SDA for generating a deep representation of a_ik, the kth metric of the
    ith instance of class label c_i.
    Output → Deep a_ik, the deep representation of a_ik.
End for
For every base learner b_l in the EL phase, l = 1, 2, ..., 10
    Train on D_tr.
    Perform cross-fold validation on D_tr according to b_l.
    Calculate the average MCC (AvGMCC_i) and the average F-measure (AvGF).
End for
Tune x_k with y_k:
    y_k = a_k + f × (b_k − c_k)
    {np, f, cr} = {10n, 0.8, 0.9}
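The data-cleaning steps of the algorithm (duplicate removal, mean imputation, min-max normalization) and the tuning rule y_k = a_k + f × (b_k − c_k) can be sketched as follows. This is a minimal sketch under stated assumptions: missing values are represented by `None`, the "corresponding instance value" is read as the column (metric) mean, and the tuning rule is read as a differential-evolution-style mutation with scale factor f = 0.8. All function names are hypothetical.

```python
def drop_duplicates(rows):
    """Remove exact duplicate instances while preserving order."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(list(row))
    return unique

def fill_missing_with_mean(rows):
    """Replace None entries with the mean of the observed values in that column."""
    means = []
    for col in zip(*rows):
        observed = [v for v in col if v is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[c] if v is None else v for c, v in enumerate(row)] for row in rows]

def min_max_normalize(rows):
    """Scale every column to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(row, lows, highs)] for row in rows]

def de_mutation(a, b, c, f=0.8):
    """Candidate y with y_k = a_k + f * (b_k - c_k), applied element-wise."""
    return [ak + f * (bk - ck) for ak, bk, ck in zip(a, b, c)]

# Hypothetical raw instances: one duplicate row and one missing value.
raw = [[1.0, 10.0], [1.0, 10.0], [3.0, None], [5.0, 30.0]]
clean = min_max_normalize(fill_missing_with_mean(drop_duplicates(raw)))
print(clean)
print(de_mutation([1.0, 2.0], [3.0, 5.0], [2.0, 1.0]))
```

In practice the normalization bounds would usually be computed on D_tr and reused for D_ts; the sketch normalizes a single table for brevity.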
Data Point Clustering: Initially, the k-means clustering technique is applied to the
new training data points, separating them into k clusters. The number of clusters is
selected depending on the dataset structure and the total number of data points. The
accuracy of the clustering technique depends on two things: the initial centroids and
the value of k. A singular cluster is one whose data points all belong to a single
class, whereas a nonsingular cluster contains data points of more than one
class. Using the k-means technique, five clusters are formed on the
data points: four “singular” clusters and one “nonsingular” cluster.
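The clustering and labeling step can be sketched in a few lines of standard-library Python. The sketch is illustrative only: the deterministic initialization (the first k points) and the toy two-dimensional data are assumptions, not the chapter's configuration.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iters=100):
    """Plain k-means; returns the cluster index assigned to each point."""
    centroids = [tuple(p) for p in points[:k]]  # deterministic init (assumption)
    assign = None
    for _ in range(max_iters):
        new_assign = [min(range(k), key=lambda j: squared_distance(p, centroids[j]))
                      for p in points]
        if new_assign == assign:  # assignments are stable: converged
            break
        assign = new_assign
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign

def label_clusters(assign, labels):
    """Mark a cluster 'Singular' if all its points share one class label."""
    kinds = {}
    for cluster in sorted(set(assign)):
        classes = {lab for a, lab in zip(assign, labels) if a == cluster}
        kinds[cluster] = "Singular" if len(classes) == 1 else "Nonsingular"
    return kinds

# Hypothetical data: one pure group (class 1) and one mixed group (classes 1 and 2).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = [1, 1, 1, 1, 2, 2]
assign = kmeans(points, k=2)
print(assign, label_clusters(assign, labels))
```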
Convex Hull Construction: Using the quickhull technique, the convex hull is constructed
for every cluster, covering both the singular and nonsingular clusters. V1 and V2 denote
the vertex sets of class labels 1 and 2, respectively.
Redundant Data Points Elimination: In this step, we eliminate the data points
that are not used to form the vertices in step 2. The data points that are
used to form the vertices are defined as the “Rem” data points. Here, 41 data points
are passed to the next step, the naïve Bayes classifier.
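The hull construction and the elimination of redundant points can be sketched together. The following is a minimal two-dimensional quickhull written for illustration; it follows the listing of the proposed algorithm in keeping only hull vertices for single-class clusters and all points for mixed clusters. The function names, the 2-D restriction, and the toy data are assumptions.

```python
def cross(o, a, b):
    """Positive if point b lies to the left of the directed line o -> a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def _hull_side(points, a, b):
    """Hull vertices strictly left of a -> b, ordered from a to b."""
    left = [p for p in points if cross(a, b, p) > 0]
    if not left:
        return []
    far = max(left, key=lambda p: cross(a, b, p))  # farthest point from the line
    return _hull_side(left, a, far) + [far] + _hull_side(left, far, b)

def quickhull(points):
    """Convex hull vertices of 2-D points, in clockwise order."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts
    lo, hi = pts[0], pts[-1]
    return [lo] + _hull_side(pts, lo, hi) + [hi] + _hull_side(pts, hi, lo)

def reduce_training_set(points, labels, assign):
    """Build the 'Rem' set: hull vertices of singular clusters, all points otherwise."""
    rem = []
    for cluster in sorted(set(assign)):
        idx = [i for i, a in enumerate(assign) if a == cluster]
        classes = {labels[i] for i in idx}
        if len(classes) == 1:
            label = next(iter(classes))
            rem.extend((v, label) for v in quickhull([points[i] for i in idx]))
        else:
            rem.extend((points[i], labels[i]) for i in idx)
    return rem

# Hypothetical data: a single-class square with interior points, plus a mixed cluster.
points = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0.5), (10, 10), (11, 10), (10, 11)]
labels = [1, 1, 1, 1, 1, 1, 1, 2, 2]
assign = [0, 0, 0, 0, 0, 0, 1, 1, 1]
rem = reduce_training_set(points, labels, assign)
print(len(points), "->", len(rem))
```

Interior points such as (1, 1) are dropped because they do not change the boundary of a single-class region.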
The system model of this work contains two TelosB mote sensors and one
desktop computer, which are assembled to make the measurements. This model
contains three stages.
Stage 1
The TelosB sensors take the sensed data readings. These readings are passed to the
data preparation block, where each new data measurement V_t produces a new
observation vector. The temperature (T_1 and T_2) and humidity
(H_1 and H_2) measurements are combined to form the dataset. Each new observation
vector is formed from three sequential data measurements V_t, V_(t-1), and V_(t-2).
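The construction of observation vectors from three sequential measurements can be sketched as a sliding window. The ordering of the concatenation (V_t first, then V_(t-1), then V_(t-2)) and the sample readings are assumptions made for illustration; with four values per measurement, each vector has 12 dimensions.

```python
def observation_vectors(readings):
    """Concatenate V_t, V_(t-1), V_(t-2) into one 12-dimensional vector per time step.

    `readings` is a time-ordered list of (T_1, T_2, H_1, H_2) tuples.
    """
    return [list(readings[t]) + list(readings[t - 1]) + list(readings[t - 2])
            for t in range(2, len(readings))]

# Hypothetical temperature (deg C) and humidity (%) readings from two sensors.
readings = [
    (20.0, 20.5, 40.0, 41.0),
    (20.1, 20.6, 40.2, 41.1),
    (20.3, 20.7, 40.1, 41.3),
    (20.2, 20.9, 40.4, 41.2),
]
vectors = observation_vectors(readings)
print(len(vectors), len(vectors[0]))
```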
Stage 2
In the second stage, the faults are included in the training dataset. From [1], four
different faults are taken (offset, gain, out of bounds, and stuck-at). In addition to
that, data loss fault and spike fault are taken in this system model.
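A hedged sketch of how the six fault types could be injected into a series of readings at a given rate is shown below. The chapter does not define the fault models, so the offsets, gain factor, stuck-at value, and out-of-bounds value are purely hypothetical, and data loss is represented by `None`. Labels follow the convention used for the dataset (1 for normal, 2 for fault).

```python
import random

# Hypothetical fault models; every constant here is an assumption.
FAULTS = {
    "offset": lambda x: x + 10.0,          # constant bias added to the reading
    "gain": lambda x: x * 1.5,             # reading scaled by a constant factor
    "stuck_at": lambda x: 0.0,             # sensor frozen at a fixed value
    "out_of_bounds": lambda x: 1000.0,     # value outside the physical range
    "spike": lambda x: x + 50.0,           # isolated large jump
    "data_loss": lambda x: None,           # reading missing
}

def inject_fault(values, fault, rate, seed=0):
    """Corrupt a fraction `rate` of the readings; return (readings, labels)."""
    rng = random.Random(seed)
    n_faulty = int(round(rate * len(values)))
    faulty_idx = set(rng.sample(range(len(values)), n_faulty))
    corrupted = [FAULTS[fault](v) if i in faulty_idx else v
                 for i, v in enumerate(values)]
    labels = [2 if i in faulty_idx else 1 for i in range(len(values))]
    return corrupted, labels

values = [20.0 + i for i in range(10)]
corrupted, labels = inject_fault(values, "offset", rate=0.5)
print(labels.count(2), "faulty readings out of", len(values))
```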
Stage 3
In the third stage, many cluster nodes are connected to form the WSN. Each cluster
contains one cluster head, which handles the communication between the other nodes
and the network layer. The naïve Bayes algorithm is incorporated in each cluster head to
classify faults, and the observation vectors are used to form the decision function.
This process is less expensive because the decision functions are incorporated in
each cluster head along with the classifiers. Using the decision function, the
classifiers classify the dataset into positive (fault) and negative
(normal).
3.2 Attribute-Based Encryption
Attribute-based encryption is a form of public key encryption in which the
user's secret key and the ciphertext depend on attributes. In such a framework, the
decryption of a ciphertext is possible only if the set of attributes of the user key
matches the attributes of the ciphertext. A crucial security aspect of attribute-based
encryption is collusion resistance: an adversary holding multiple keys should be able
to access data only if at least one individual key grants access.
3.3 Symmetric-Key Algorithm
Implementations of symmetric-key encryption can be highly efficient, so
that users do not experience any significant time delay due to encryption and
decryption. Symmetric-key encryption also offers a degree of authentication,
as data encrypted with one symmetric key cannot be decrypted with any other
symmetric key. Therefore, as long as the symmetric key is kept secret by the two parties
using it to encrypt communications, each party can be sure that
it is communicating with the other as long as the decrypted messages continue to
make sense.
3.4 Cipher Text Attribute–Based Encryption
Private keys are identified with a set of descriptive attributes. In our construction,
a party that wants to encrypt a message specifies, through an access tree structure,
a policy that private keys must satisfy in order to decrypt. Every interior node of
the tree is a threshold gate, and the leaves are associated with attributes. We use the
same notation to describe the access trees, even though in our case the attributes
are used to identify the keys (as opposed to the data) and are specified in the
private key.
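The access tree structure can be illustrated with a short policy-check sketch. It shows only how threshold gates and attribute leaves decide whether a set of key attributes satisfies a policy; it performs no cryptography. The tuple representation `(threshold, children)` and the attribute names are assumptions made for illustration.

```python
def satisfies(node, attributes):
    """Return True if the attribute set satisfies the access tree rooted at `node`.

    A leaf is an attribute string; an interior node is (threshold, children),
    satisfied when at least `threshold` children are satisfied.
    """
    if isinstance(node, str):
        return node in attributes
    threshold, children = node
    return sum(satisfies(child, attributes) for child in children) >= threshold

# Hypothetical policy: any 2 of {doctor, (cardiology OR oncology), senior}.
# An AND gate is a threshold equal to the number of children; an OR gate is threshold 1.
policy = (2, ["doctor", (1, ["cardiology", "oncology"]), "senior"])

print(satisfies(policy, {"doctor", "oncology"}))
print(satisfies(policy, {"doctor"}))
```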
3.5 Performs the Naïve Bayes Classifier on the Remaining
Data Points
In the last step, the remaining 41 data points are used to train the naïve Bayes
classifier. Of the 85 training data points, only about 50% are thus needed to obtain
an accurate naïve Bayes classifier [38]. With fewer data points, fewer
computation steps are required to classify data points. In
addition, the computation time is reduced and higher accuracy is obtained.
Proposed AEL-HO Algorithm
1. Choose the cluster value K.
2. Perform the k-means clustering techniques.
3. For each cluster k (k ≤ K), do the following:
4. Based on cluster k, check the data points class label.
5. If the cluster data points are a single class.
6. Allocate the cluster label as “Singular.”
7. Else, allocate the cluster label as “Nonsingular.”
8. End.
9. End.
10. For “Singular” cluster, do the following:
11. Perform quickhull techniques.
12. Estimate the convex hull (V1 ), which denotes the class-1 label vertices points.
13. Estimate the convex hull (V2 ), which denotes the class-2 label vertices points.
14. Set of vertices points are formed.
15. Eliminate each cluster sample that does not belong to this set of vertices.
16. End.
17. For “Nonsingular” cluster do the following:
18. Choose each cluster data points and form in a single set.
19. End.
20. Remaining samples are structured as “Rem” dataset.
21. Perform naïve Bayes classifier to the “Rem” values.
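The final step, training a naïve Bayes classifier on the "Rem" data, can be sketched as follows. The chapter does not state which naïve Bayes variant is used, so a Gaussian model with a small variance floor is assumed here; the function names and the toy temperature and humidity samples are hypothetical.

```python
import math
from collections import defaultdict

def train_gaussian_nb(samples, labels, var_floor=1e-6):
    """Estimate per-class log prior, feature means, and feature variances."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    model = {}
    for y, rows in by_class.items():
        columns = list(zip(*rows))
        means = [sum(col) / len(rows) for col in columns]
        variances = [sum((v - m) ** 2 for v in col) / len(rows) + var_floor
                     for col, m in zip(columns, means)]
        model[y] = (math.log(len(rows) / len(samples)), means, variances)
    return model

def predict(model, x):
    """Return the class with the highest log posterior for sample x."""
    def log_posterior(y):
        log_prior, means, variances = model[y]
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances))
        return log_prior + log_likelihood
    return max(model, key=log_posterior)

# Hypothetical "Rem" samples: (temperature, humidity); 1 = normal, 2 = fault.
rem_samples = [(20.0, 50.0), (21.0, 52.0), (19.5, 49.0),
               (60.0, 5.0), (62.0, 4.0), (58.0, 6.0)]
rem_labels = [1, 1, 1, 2, 2, 2]
model = train_gaussian_nb(rem_samples, rem_labels)
print(predict(model, (20.5, 51.0)), predict(model, (61.0, 5.0)))
```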
4 Experimental Results
The experimental results of the proposed system are evaluated. Here, the dataset
used in this work is explained and the performance analysis parameter and their
equations are presented. A final comparison of the proposed method with the other
three classifier algorithms is discussed.
System configuration:
• Operating System: Windows 8.
• Processor: Intel Core i3.
• RAM: 4 GB.
• Platform: MATLAB.
4.1 Dataset
In WSNs, the sensor measurements and the different fault types are combined to
form the labeled dataset. This dataset is based on the existing dataset published
by investigators at the University of North Carolina at Greensboro in 2010 [13].
The data are gathered by TelosB motes from single-hop, multi-hop, and two outdoor
multi-hop sensor WSNs. The sensed data contain temperature and humidity
measurements. Each vector is formed from three successive instances t_0,
t_1, and t_2. Each instance is constructed from the temperature (T_1 and T_2)
and humidity (H_1 and H_2) measurements. Six
different faults (stuck-at, data loss, offset, out of bounds, gain, and spike) are
introduced into the dataset at different rates (10%, 20%, 30%, 40%, and 50%).
From 9566 observations, 40 datasets have been prepared, each with 12 dimensions.
Each dataset contains the measurement values and target values (1 for normal and
2 for fault). The naïve Bayes classifier is used to classify the whole dataset into two
labels, i.e., normal case and fault case. Table 1 summarizes the various types of fault
results.
Table 1 MCC of SVM, CNN, RF, and proposed AEL-HO
Techniques         Matthews correlation coefficient (MCC)   Rank
SVM                0.65                                      2
CNN                0.32                                      4
RF                 0.46                                      3
Proposed AEL-HO    0.73                                      1
Table 2 Performance comparison of accuracy and response time
Classifiers                   Data accuracy (%)   Response time (s)
RF [12]                       95.22               1.99
CNN [9]                       95.33               1.76
SVM [1]                       95.45               1.22
Proposed AEL-HO classifier    96.42               0.97
4.2 Performance Evaluation Parameters
DA is the first metric [1, 7], which is represented in Eq. (7):
DA = (Number of faulty observations detected)/(Total number of faulty observations)
(7)
TPR is the second metric [15]. It is defined as the proportion of actual positives
that are correctly identified. The corresponding expression is given in Eq. (8).
TPR = TP/ (TP + FN)
(8)
where true positive (TP) denotes the number of faults correctly identified as
faults, and false negative (FN) denotes the number of faults wrongly
classified as negative. MCC is the third metric [17–25], which ranks the classifiers
based on the accuracy values. It ranges between −1 and 1. The expression of MCC
is given in Eq. (9):
$$\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{9}$$
where true negative (TN) denotes the number of non-faulty nodes correctly identified
as non-faulty, and false positive (FP) denotes the number of non-faulty nodes incorrectly
identified as faulty. F1-score is the fourth metric [16], which is the harmonic mean of
precision and recall.
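The four metrics can be computed directly from the confusion-matrix counts, as the following sketch shows. The function names and the example counts are hypothetical; returning 0.0 for MCC when the denominator is zero is a common convention adopted here as an assumption.

```python
import math

def detection_accuracy(detected_faulty, total_faulty):
    """DA as in Eq. (7): detected faulty observations over all faulty observations."""
    return detected_faulty / total_faulty

def true_positive_rate(tp, fn):
    """TPR as in Eq. (8)."""
    return tp / (tp + fn)

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient as in Eq. (9); 0.0 if undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical confusion-matrix counts for one classifier.
tp, tn, fp, fn = 40, 45, 5, 10
print(true_positive_rate(tp, fn), mcc(tp, tn, fp, fn), f1_score(tp, fp, fn))
```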
Table 1 shows the ranking of the proposed AEL-HO classifier against the existing
CNN, SVM, and RF classifiers based on the MCC score. The AEL-HO classifier
is considered to perform well based on its MCC value. An MCC
value of 1 indicates a perfect classifier. Here, the MCC value of the proposed
classifier is closer to 1 than those of the SVM, CNN, and RF classifiers.
Therefore, the proposed classifier ranks first (MCC value,
0.73), followed by the SVM classifier (MCC value, 0.65) [26–31].
Table 2 shows the accuracy and response time of the proposed AEL-HO classifier
compared with the CNN, SVM, and RF classifiers [32–37]. The analysis shows that the
proposed classifier exhibits the highest accuracy (96.42%) and the lowest response time (0.97 s).
5 Conclusions and Future Work
Clustering is first performed on the data points, and then the convex hull algorithm
is used to find the vertices of the data points that belong to each cluster. The
performance of the classifiers is analyzed based on metrics such as F1-score,
DA, MCC, and TPR. Based on the values of DA and TPR, it is concluded that
the proposed algorithm performs better than the existing methods. In future
work, the same approach can be applied to new kinds of data that appear in WSNs. In
addition, the identification of WSN data is found to be accurate in the network layer
and the sensor nodes.
References
1. Alenezi, M., Magel, K., & Banitaan, S. (2013). Efficient bug triaging using text mining. Journal
of Software, 8(9), 2185–2190.
2. Guo, S., Chen, R., Li, H., Zhang, T., & Liu, Y. (2019). Identify severity bug report with distribution imbalance by CR-SMOTE and ELM. International Journal of Software Engineering
and Knowledge Engineering, 29(6), 139–175.
3. Jindal, R., Malhotra, R., & Jain, A. (2017). Prediction of defect severity by mining software
project reports. International Journal of Systems Assurance Engineering and Management, 8,
334–351.
4. Kamei, Y., Shihab, E., Adams, B., et al. (2013). A large-scale empirical study of just-in-time
quality assurance. IEEE Transactions on Software Engineering, 39(6), 757–773.
5. Kanwal, J., & Maqbool, O. (2012). Bug prioritization to facilitate bug report triage. Journal of
Computer Science and Technology, 27(2), 397–412.
6. Lamkanfi, S., Demeyer, E. G., & Goethals, B. (2010). Predicting the severity of a reported bug.
In Mining Software Repositories (MSR) (pp. 1–10).
7. Li, H., Gao, G., Chen, R., Ge, X., & Guo, S. (2019). The influence ranking for testers in bug
tracking systems. International Journal of Software Engineering and Knowledge Engineering,
29(1), 1–21.
8. Sampathkumar, A., & Vivekanandan, P. (2018). Gene selection using multiple queen colonies
in large scale machine learning. Journal of Electrical Engineering, 9(6), 97–111.
9. Singh, V. B., & Chaturvedi, K. K. (2011). Bug tracking and reliability assessment system.
International Journal of Software Engineering and Its Applications, 5(4), 17–30.
10. Yang, X.-L., Lo, D., Xia, X., Huang, Q., & Sun, J.-L. (2017). High-impact bug report identification with imbalanced learning strategies. Journal of Computer Science and Technology,
32(1), 181–198.
11. Yu, H., Zhang, W. Y., & Li, H. (2019). Data-tolerant compensation control based on
sliding mode technique of unmanned marine vehicles subject to unknown persistent ocean
disturbances. International Journal of Control, Automation, and Systems, 18(9), 739–752.
12. Zhang, T., Chen, J., Yang, G., Lee, B., & Luo, X. (2016). Towards more accurate severity
prediction and fixer recommendation of software bugs. Journal of Systems and Software, 117,
166–184.
13. Alrosan, A., Alomoush, W., Norwawi, N., Alswaitti, M., & Makhadmeh, S. N. (2020). An
improved artificial bee colony algorithm based on mean best-guided approach for continuous
optimization problems and real brain MRI images segmentation. Neural Computing and
Applications, 33(3), 1671–1697.
14. Elgamal, Z. M., Yasin, N. B. M., Tubishat, M., Alswaitti, M., & Mirjalili, S. (2020). An
improved Harris hawks optimization algorithm with simulated annealing for feature selection
in the medical field. IEEE Access, 8, 186638–186652.
15. Tubishat, M., Alswaitti, M., Mirjalili, S., Al-Garadi, M. A., Alrashdan, M. T., et al. (2020).
Dynamic butterfly optimization algorithm for feature selection. IEEE Access, 8, 194303–
194314.
16. Rohit, R., Kumar, S., & Mahmood, M. R. (2020). Color object detection based image retrieval
using ROI segmentation with multi-feature method. Wireless Personal Communications,
112(1), 169–192.
17. Sandeep, K., Jain, A., Shukla, A. P., Singh, S., Raja, R., Rani, S., Harshitha, G., AlZain, M. A.,
& Masud, M. (2021). A comparative analysis of machine learning algorithms for detection of
organic and nonorganic cotton diseases. Mathematical Problems in Engineering, 2021.
https://doi.org/10.1155/2021/1790171
18. Naseem, U., Razzak, I., Khan, S. K., & Prasad, M. (2020). A comprehensive survey on word
representation models: From classical to state-of-the-art word representation language models.
arXiv preprint arXiv, 15036.
19. Vikram, K. K., & Narayana, V. L. (2016). Cross-layer Multi Channel MAC protocol for wireless sensor networks in 2.4-GHz ISM band. In IEEE conference on, computing, analytics and
security trends (CAST-2016) on DEC 19–21, 2016 at Department of Computer Engineering &
information technology. College of Engineering, Pune, Maharashtra.
https://doi.org/10.1109/CAST.2016.7914986
20. Tiwari, L., Raja, R., Awasthi, V., Rohit Miri, G. R., Sinha, M. H., & Alkinani, K. P. (2021).
Detection of lung nodule and cancer using novel Mask-3 FCM and TWEDLNN algorithms.
Measurement, 172, 108882. https://doi.org/10.1016/j.measurement.2020.108882, ISSN 0263-2241.
21. Raja, R., Raja, H., Patra, R. K., Mehta, K., & Gupta, A. (2020). Assessment methods of
cognitive ability of human brains for inborn intelligence potential using pattern recognition.
In Biometric systems. IntechOpen. ISBN 978-1-78984-188-6.
22. Vikram, K., & Sahoo, S. K. (2017, December). Load Aware Channel estimation and channel
scheduling for 2.4GHz frequency band wireless networks for smart grid applications. International Journal on Smart Sensing and Intelligent Systems, 10(4), 879–902.
https://doi.org/10.21307/ijssis-2018-023
23. Naseem, U., Khan, S. K., Farasat, M., & Ali, F. (2019). Abusive language detection: A
comprehensive review. Indian Journal of Science Technology, 12(45), 1–13.
24. Sharma, D. K., Singh, B., Regin, R., Steffi, R., & Chakravarthi, M. K. (2021). Efficient
classification for neural machines interpretations based on mathematical models. In 2021 7th
International Conference on Advanced Computing and Communication Systems (ICACCS)
(pp. 2015–2020). https://doi.org/10.1109/ICACCS51430.2021.9441718
25. Arslan, F., Singh, B., Sharma, D. K., Regin, R., Steffi, R., & Suman Rajest, S. (2021). Optimization technique approach to resolve food sustainability problems. In 2021 International
Conference on Computational Intelligence and Knowledge Economy (ICCIKE) (pp. 25–30).
https://doi.org/10.1109/ICCIKE51210.2021.9410735
26. Ogunmola, G. A., Singh, B., Sharma, D. K., Regin, R., Rajest, S. S., & Singh, N. (2021).
Involvement of distance measure in assessing and resolving efficiency environmental obstacles.
In 2021 International Conference on Computational Intelligence and Knowledge Economy
(ICCIKE) (pp. 13–18). https://doi.org/10.1109/ICCIKE51210.2021.9410765
27. Sharma, D. K., Singh, B., Raja, M., Regin, R., & Rajest, S. S. (2021). An efficient python
approach for simulation of Poisson distribution. In 2021 7th International Conference on
Advanced Computing and Communication Systems (ICACCS) (pp. 2011–2014).
https://doi.org/10.1109/ICACCS51430.2021.9441895
28. Sharma, D. K., Singh, B., Herman, E., Regine, R., Rajest, S. S., & Mishra, V. P. (2021).
Maximum information measure policies in reinforcement learning with deep energy-based
model. In 2021 International Conference on Computational Intelligence and Knowledge
Economy (ICCIKE) (pp. 19–24). https://doi.org/10.1109/ICCIKE51210.2021.9410756
29. Metwaly, A. F., Rashad, M. Z., Omara, F. A., & Megahed, A. A. (2014). Architecture of
multicast centralized key management scheme using quantum key distribution and classical
symmetric encryption. The European Physical Journal Special Topics, 223(8), 1711–1728.
30. Farouk, A., Zakaria, M., Megahed, A., & Omara, F. A. (2015). A generalized architecture of
quantum secure direct communication for N disjointed users with authentication. Scientific
Reports, 5(1), 1–17.
31. Naseri, M., Raji, M. A., Hantehzadeh, M. R., Farouk, A., Boochani, A., & Solaymani, S.
(2015). A scheme for secure quantum communication network with authentication using GHZlike states and cluster states controlled teleportation. Quantum Information Processing, 14(11),
4279–4295.
32. Wang, M. M., Wang, W., Chen, J. G., & Farouk, A. (2015). Secret sharing of a known arbitrary
quantum state with noisy environment. Quantum Information Processing, 14(11), 4211–4224.
33. Zhou, N. R., Liang, X. R., Zhou, Z. H., & Farouk, A. (2016). Relay selection scheme for
amplify-and-forward cooperative communication system with artificial noise. Security and
Communication Networks, 9(11), 1398–1404.
34. Supritha, R., Chakravarthi, M. K., & Ali, S. R. (2016). An embedded visually impaired reconfigurable author assistance system using LabVIEW. In Microelectronics, electromagnetics and
telecommunications (pp. 429–435). Springer.
35. Ganesh, D., Naveed, S. M. S., & Chakravarthi, M. K. (2016). Design and implementation
of robust controllers for an intelligent incubation Pisciculture system. Indonesian Journal of
Electrical Engineering and Computer Science, 1(1), 101–108.
36. Chakravarthi, M. K., Gupta, K., Malik, J., & Venkatesan, N. (2015, December). Linearized PI
controller for real-time delay dominant second order nonlinear systems. In 2015 International
Conference on Control, Instrumentation, Communication and Computational Technologies
(ICCICCT) (pp. 236–240). IEEE.
37. Yousaf, A., Umer, M., Sadiq, S., Ullah, S., Mirjalili, S., Rupapara, V., & Nappi, M. (2021b).
Emotion recognition by textual tweets classification using voting classifier (LR-SGD). IEEE
Access, 9, 6286–6295. https://doi.org/10.1109/access.2020.3047831
38. Sadiq, S., Umer, M., Ullah, S., Mirjalili, S., Rupapara, V., & Nappi, M. (2021). Discrepancy detection between actual user reviews and numeric ratings of Google App store
using deep learning. Expert Systems with Applications, 181, 115111.
https://doi.org/10.1016/j.eswa.2021.115111
DSPPTD: Dynamic Scheme for Privacy
Protection of Trajectory Data in LBS
Ajay K. Gupta and Sanjay Kumar
1 Introduction
Location-aware services [1–3] are a type of context-aware service in which location
is provided as input to the system. The system takes the location as input and provides
services to the user. The location may be geometric, i.e., in latitude and longitude form,
or it may be semantic, i.e., expressed by relations such as near and within. A user querying
the services provides his or her location to service providers, believing that the correct
location will improve the quality of services (QoS). However, this leads to a risk of
disclosure of private and confidential information [4]. It is highly challenging to design
an efficient trade-off between the QoS and the privacy of the mobile user. In location-based
services, the user provides his or her current location to a third party for a service
request. An attacker (or untrusted service provider) may mount an inference attack [5] on
these live locations and infer confidential personal information regarding the user's
health or lifestyle by observing the location, duration of stay, and habitual activities.
This is a security and privacy problem. The aim here is to
reduce the risk of privacy leakage while still providing quality of service.
The general architecture of the cellular mobile environment [15, 16] consists of
mobile units (MU), fixed hosts (FHs), and base stations (BSs). The BS has its fixed
location, functions with two-way radio, and has some data processing capabilities.
The basic function of data and transaction management is done by the database
server (DBS). Many BSs and FHs are linked via a high-speed network. Each cell
has a limited radio coverage area and a BS to manage mobile clients. The cell
A. K. Gupta ()
Indian Institute of Information Technology, Pune, India
S. Kumar
United Services Automobile Association (USAA) – AADC Project Technical Lead (HCL
America Inc., America), San Antonio, TX, USA
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
S. Pandey et al. (eds.), Role of Data-Intensive Distributed Computing Systems in
Designing Data Solutions, EAI/Springer Innovations in Communication and
Computing, https://doi.org/10.1007/978-3-031-15542-0_4
A. K. Gupta and S. Kumar
can be seen as a limited-bandwidth radio coverage area and is normally represented
by a hexagon. In wireless local area networks, it can be treated as a
high-bandwidth network within the area of a building. The wireless channel is
split into two channels known as the uplink and downlink channels. The first
is used to submit the mobile client's queries, and the second is
used by the mobile switching stations (MSSs) to answer those queries.
The base station controller (BSC) controls the various BSs. The mobile
switching center (MSC) gives commands to the BSC to control the appropriate BS.
Unlimited mobility in personal communications service and the global system for mobile
communication, together with reachability to any BS or FH, makes many services
easy to deploy in the real world. The public switched telephone network and the MSC
connect the databases available for a mobile environment to the outer world.
Mobile transactions run under frequent disconnection. Because of mobility
and frequent disconnection, these transactions are long-lived. The data and/or the user may also move in a mobile environment. Therefore,
a mobile transaction may have associated sub-transactions (cohorts). Among
those, some may run on the MSS and some may run on mobile nodes. Because of its
disconnected and mobile nature, a transaction shares its state information
and partial results with other transactions. A mobile transaction
should also fulfill some prerequisites to work well in a mobile environment.
As nodes move, the state of the data object being accessed
and the corresponding location information must also move. Techniques must be
available to deal with concurrency, frequent disconnection,
and consistency between replicated data objects residing at different locations.
Mobile transactions are also executed in a distributed manner and may
be subject to further restrictions such as limited bandwidth. Designing commit
protocols [17] for a distributed transaction in the presence of mobility is a more
challenging task than in the generic environment. Here, a mobile
transaction may face a forced wait or a forced abort if the wireless
channels (uplink or downlink) are unavailable at some instant, and it may
be delayed by hand-offs that occur at random. A mobile transaction might not be
able to complete its execution because full database
management system (DBMS) capability is unavailable [18]. This is why conventional
transaction control strategies are not well suited to the mobile environment. If a
connection to mobile nodes is not possible, or a continuous
connection is too expensive, the mobile host can also decide to work in disconnected mode. Based
on the locations of initiation and execution, mobile transactions can be
classified into three types. The first category consists of mobile transactions that
are both initiated and executed by a mobile host (MH). The second category consists of
mobile transactions [19] that are initiated by a fixed host (FH) but executed
by the MH. The third category consists of mobile transactions that are initiated
by the MH but executed by both the MH and the FH. In a mobile transaction environment
where the MH initiates transactions that are executed completely by the FH, the MH requires
no record retrieval capability.
DSPPTD: Dynamic Scheme for Privacy Protection of Trajectory Data in LBS
The location-dependent information system (LDIS) is an application of context-aware computing in the mobile environment where the transaction is initiated
and/or executed by the MH. Moflex is an example of this type of scenario. This
model is built primarily on a set of dependencies, suitable goals, and rules.
Sub-transactions for location-based services are also supported by this model. A
further version of the same scenario is pre-serialization, which permits the cohorts
of global transactions to commit independently. The serialization technique
permits nearby resources to be released in a timely manner. There is a
need to integrate revised data visualization and data item indexing techniques
into LDIS for user-friendly responses and faster access, respectively. Many
mobile devices, including personal digital assistants, have very small screens.
Therefore, future research should consider the
mobile system's screen size and computational limits when developing lightweight
simulation methods for desktop and handheld applications. Indexing in LBS
is used to get faster search results. Before one can search through the LBS,
one has to create a search engine index. The index lets the client stay in power-saving mode
until the queried records arrive on the requested channel. The index overhead
induced by an LBS implementation affects the choice of indexing approach.
The relative frequency of queries versus updates favors either query-optimized or update-optimized indexing approaches. Future research could
investigate trajectory and filtering approaches to further enhance
the efficiency of these indexing approaches in terms of updating and querying.
1.1 Problem Statement
Past trajectory privacy protection approaches mostly rely on obfuscating
the trajectory locations and adding uncertainty to preserve privacy. However, it
is challenging to control the trade-off between the effectiveness of trajectory privacy
protection and the utility of the data for spatial and temporal analysis, and this problem
has not been thoroughly explored or measured in past strategies [6, 10].
Recent analyses concentrate predominantly on the spatial component of trajectory
data, whereas other semantics such as thematic and temporal attributes are seldom
addressed. In addition, existing methods depend extensively on manually crafted
procedures. If the procedure is revealed, the original trajectory data can be recovered.
To this end, this study investigates the feasibility of deep learning methods
for overcoming the above-mentioned trajectory privacy protection challenges.
The following points can describe the primary contributions of this work.
1. An edge-based distance measure is introduced in the proposed DSPPTD
for k-path trajectory clustering of deep-neural-network-processed trajectories to
achieve differential privacy before publishing them. The work discusses an end-to-end deep learning solution for producing trajectory data that supports differential
privacy. A Gaussian mechanism for synthetic trajectory preparation is
described in this work.
2. Two functions, namely mutual information and Hausdorff distance, are used
to measure the strength of privacy protection and the utility of the trajectory data
when training the deep learning approach.
3. The trade-off between privacy protection effectiveness and the
usefulness of the new model is analyzed using real-world LBS data.
The rest of this paper is organized as follows: Sect. 2 gives an overview of related
work. The deep-learning-based differential privacy protection approach has been
described in Sect. 3. We discuss the factors affecting privacy protection effectiveness
to verify the utility and privacy trade-off of the proposed policy in Sect. 4. Finally,
Sect. 5 concludes this paper.
2 Related Work
With the advancement of mobile technologies, smartphones allow people to access
numerous LBSs that provide interactive information depending on the location of
the user. The study of users' positions and associated confidential information not
only enables more sophisticated and reliable user information to be created but
also inevitably leads to security and privacy problems. Therefore, this domain needs
more research on location-based technologies to resolve
these pressing issues [11–13].
There are various reports on dummy-based trajectory privacy protection.
Kido et al. [14] were the first to use the concept of a random move to
create dummies. Lu et al. [15] suggested a privacy-aware, dummy-based
strategy for preserving consumer data. However, these systems overlooked
the history details. Niu et al. [16] established Dummy-T, an effective privacy protection
system for trajectories. It employs the minimum cloaking area and context details to
ensure that each dummy produced on the trajectories resembles the real one. However,
it lacks the actual mobility trend and spatiotemporal association, which
degrades the degree of privacy.
The definition of k-anonymity was first introduced for relational databases [17].
If the position of the user issuing a query is indistinguishable from the positions of at least
k-1 other persons, then the query is said to be location k-anonymous. Zhang et al.
[18] also suggested the caching and spatial K-anonymity (CSKA) policy to improve
privacy through k-anonymity and caching. This system, though, is not well suited
to protecting trajectories. Moreover, past policies are based on user-clustered
or centralized architectures. Hence, the workload of the network is high, and the
anonymizer could become a performance bottleneck.
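The location k-anonymity condition described above can be illustrated with a minimal Python sketch. It assumes a rectangular cloaking region and a dictionary of current user positions; the names `users_in_region` and `is_location_k_anonymous` are hypothetical and do not come from the cited works.

```python
from typing import Dict, Tuple

Location = Tuple[float, float]              # (lat, lng)
Region = Tuple[float, float, float, float]  # (min_lat, min_lng, max_lat, max_lng)


def users_in_region(positions: Dict[str, Location], region: Region) -> int:
    """Count the users whose current position falls inside the cloaking region."""
    min_lat, min_lng, max_lat, max_lng = region
    return sum(
        1
        for lat, lng in positions.values()
        if min_lat <= lat <= max_lat and min_lng <= lng <= max_lng
    )


def is_location_k_anonymous(positions: Dict[str, Location], region: Region, k: int) -> bool:
    """A query issued from the region is location k-anonymous when the issuer
    cannot be told apart from at least k-1 other users in the same region."""
    return users_in_region(positions, region) >= k


# Illustrative data only: three users, two of them inside the region.
positions = {"u1": (39.90, 116.40), "u2": (39.91, 116.41), "u3": (40.50, 117.00)}
region = (39.85, 116.35, 39.95, 116.45)
```

With this data the region satisfies 2-anonymity but not 3-anonymity, which shows why a cloaking region may have to be enlarged in sparsely populated areas.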
To ensure an optimal distribution of the selected dummy locations, the
authors in [19] also provided an enhanced decay lengths (DLs) approach that
could expand the cloaking region while retaining a degree of privacy close to that of the
DLs algorithm. In [15], the authors suggested two approaches to dummy creation,
namely grid-based and circle-based methods, which take into account
the privacy criteria. In [8], the authors developed dual dummy-based techniques
to guarantee the k-anonymity of privacy-conscious clients in LBS, recognizing
that adversaries would exploit side information. Earlier approaches did not
consider the side information that attackers may exploit when picking
dummy locations. While some approaches have taken the side information into
account, they have a high processing cost. The effective selection of
dummy locations in IoT therefore remains a research problem. In K-anonymity systems,
Hu et al. [5] applied a credit-incentive framework to maximize the efficiency of
selecting dummy roles. Based on fuzzy reasoning, the credit rating yields
a certain maximum level of probability for each client. A client can get
help from specific users only if his or her credit rating passes a certain
probability threshold. This motivates people to actively assist others in building K-anonymity. In a sense, all the above solutions originate directly from the
single-time LBS position privacy policy [20] and therefore ignore the following two
issues:
(a) Protection of communication messages in user’s LBS request.
(b) Exposure of the users’ real location details due to the continuous importance of
query position.
Current work on trajectory privacy protection concentrates primarily on
two lines of study. One is the hierarchical approach to privacy, which combines and mixes
trajectories from various users so that the detection of an individual's trajectory data
becomes a k-anonymity problem [19, 21]. Here, the spatial cloaking method uses k-anonymous cloaked spatial regions to combine the trajectory locations of
k objects and renders these trajectories k-anonymized [22]. The mix-zone strategy
anonymizes trajectory locations in a mix-zone using aliases. It removes the
link between the part of the trajectory before the mix-zone and the part after it
[23].
Additionally, the generalization-based method first divides the positions of k trajectories into
k-anonymized separate regions and then, by uniform selection,
reassembles k new trajectories by connecting points from each k-anonymized region
[24]. The other line of study is geo-masking, which blurs the positions of
actual trajectory data by using interference in the spatial dimension to cover or change
the original positions, while spatial trends might not be substantially affected
[25, 26]; for example, Zandbergen [27] discussed the need to preserve both privacy and
the spatial usefulness of many forms of geo-masks.
Kwan et al. [28] tested the efficacy of three independent random-perturbation geo-masks
on lung cancer cases in spatial analysis. Seidl et al. [29] applied grid
masking and random perturbation to GPS data sets and measured the effectiveness
of the privacy protection. Gao et al. [26] studied the efficacy of aggregation,
Gaussian perturbation, and random perturbation on Twitter data, exploring the complexity, degree of
anonymity, and analytics of each method.
In such a system, users may access preference details for their actual position
without revealing their location data to the service provider. Beresford et
al. [30], who were the first to introduce the mix-zone concept, proposed anonymous communication techniques.
A mix zone is a geographic area in which no callback
activity has been recorded by any user. The researchers in [31] allow users to
swap pseudonyms if they meet in a mix zone and also ensure that a user avoids using
the same pseudonym for a long time. The association between app positions and pseudonyms
may, therefore, be broken by the exchange of pseudonyms.
Finally, researchers are actively studying
the privacy concerns of query processing [5, 32]. Several notable survey articles
have emerged in recent years that address privacy problems in LBS, together with the difficulties and
probabilistic scope connected with them [7, 33].
3 Our Proposed Scheme
We follow a system approach based on fog computing, as seen in Fig. 1. It
is made up of three entities: a handheld device, an LBS server, and a fog server. The
fog system is operated by the user and installed, with enough storage space, on
the user's spare devices. In the proposed approach, the fog server receives the
background information. It applies the DSPPTD policy to protect the trajectory
and dependent confidential information from the attacker while providing maximum
QoS for the user's query request to the LBS server. The LBS server scans the POIs
of users and returns the results to the fog server, after which the fog
server delivers the relevant results to the user.
Fig. 1 LBS system structure
Table 1 Key notation used in proposed scheme
Term                              Description
T                                 Trajectory
$p_j^t$                           Location point at time t of set j
Datat                             Trajectory original data
K                                 Number of clusters
E(S)                              Entropy of set S
N                                 System defined variable, N > 2k
$d_r^t$                           A real location from trajectory at time t
$TR(d_j^t \mid d_j^{t-1})$        Transition probability from time t−1 to time t for j dummy locations, where 2 > j ≥ k-1
M                                 Randomly selected m locations set from N dummy locations, where m ≤ C(2k-1, N)
Dt                                Anonymous set at time t
$Dist_{max}^t$                    Separation length from the current position to the next
$q(d_j^{t-1})$                    Time-dependent query probability at location $d_j^{t-1}$
θ (Theta)                         Skewness parameter based on zipf access distribution
ε                                 Differential privacy
DNN_Datat                         DNN processed trajectory data
K                                 Anonymity degree
CmaxH                             Optimized set using entropy
n                                 The total number of snapshots
$q_j^{t-1}$                       A query probability for j dummy location at time t−1
$HCR_t^i$                         Path entropy
m                                 Randomly selected m' locations set from 2k optimized set
MI                                Mutual information
(x, y)                            Separation angle between x and y
HD                                Hausdorff distance
Fog computing makes server computation resources
available closer to end-users. In comparison with clustered data
centers, fog nodes are physically much closer to smartphones, which leads to fast
communication between entities. Edge nodes have the remarkable ability to
process and analyze large amounts of data on their own, without submitting
it to distant servers. Fog computing is an intermediary between external servers
and mobile devices. It controls which details the server can obtain and which can be
accessed locally. In this sense, fog is a smart gateway that offloads the cloud, making
for more effective data collection, retrieval, and analysis. Table 1 summarizes the
notations used in the proposed scheme.
The DSPPTD approach is a trajectory privacy protection scheme that combines
a deep neural network with the Gaussian mechanism to build privacy-preserving synthetic trajectories as substitutes for actual trajectories in the exchange
and publishing of trajectories.
The proposed approach consists of four main components:
processing,
generation, optimization, and release of trajectories. A detailed summary of each
unit is given below.
A. The trajectory processing model uses the user's moving scene to generate the corresponding high-dimensional data items.
B. The trajectory generator uses a deep neural network, which takes random noise
and the original location points of trajectories as inputs and generates synthetic
trajectories as outputs. The processed trajectory consists of position points at
actual timestamps that can shield the original collection of data.
C. K-means clustering divides the location trajectory
data region into k subregions of common data points.
D. The trajectory release step involves comparing each clustered “synthetic trajectories datum” to the corresponding “real” trajectory and merging accordingly.
The process also involves a prejudging mechanism to ensure that at least one actual
trajectory record appears in the processed trajectory.
3.1 Trajectory Processing Model
A trajectory is an ordered series of user movement points in which the interval
between two consecutive user location points does not reach a fixed threshold $T_h$. It is
represented by $T : p_1 \rightarrow p_2 \rightarrow \cdots \rightarrow p_m$, where $T_h > p_{i+1}.t - p_i.t > 0$ with $m > i \ge 1$
and $p_i \in P \subset L$. $|T|$ is the number of samplings ($|T| = m$), and the difference $p_{i+1}.t - p_i.t$ is
the interval between sampling points. $P = \{p_1, p_2, \ldots, p_m\}$ is the sequence of points
known as the user movement log, where each point $p_i \in P$ contains $p_i.lat$, $p_i.lng$, $p_i.t$,
and $p_i.v$ as latitude, longitude, timestamp, and velocity, respectively.
\[ p_i = \{p_i.lat,\ p_i.lng,\ p_i.t,\ p_i.v\} \]
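As a rough illustration of this definition, the following Python sketch cuts a movement log into trajectories wherever the gap between consecutive timestamps is not strictly between 0 and the threshold. The `Point` type, the function name `split_log`, and the choice of seconds as the time unit are assumptions made for this example only.

```python
from typing import List, NamedTuple


class Point(NamedTuple):
    lat: float
    lng: float
    t: float  # timestamp, assumed to be in seconds
    v: float  # velocity


def split_log(log: List[Point], threshold: float) -> List[List[Point]]:
    """Split a movement log into trajectories.

    A point extends the current trajectory only when the time gap to the
    previous point is positive and below the threshold; otherwise it
    starts a new trajectory.
    """
    trajectories: List[List[Point]] = []
    for p in sorted(log, key=lambda q: q.t):
        if trajectories and 0 < p.t - trajectories[-1][-1].t < threshold:
            trajectories[-1].append(p)
        else:
            trajectories.append([p])
    return trajectories


# Illustrative log: a 480-second pause separates two trips.
log = [
    Point(39.90, 116.40, 0.0, 5.0),
    Point(39.91, 116.41, 10.0, 5.5),
    Point(39.92, 116.42, 20.0, 6.0),
    Point(39.95, 116.45, 500.0, 4.0),
    Point(39.96, 116.46, 510.0, 4.2),
]
parts = split_log(log, threshold=60.0)
```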
The location coordinates also change as time passes. Figure 2 gives a
view of the initial trajectory data collection, which is clearly irregular. The points
are linked in time-series order and thus form a trajectory.
A general representation of a record is given below:
Fig. 2 Log and trajectory for moving person
Original Data:
\[ \mathrm{Data}_t : (p_1.lat, p_1.lng, p_1.t, p_1.v) \rightarrow (p_2.lat, p_2.lng, p_2.t, p_2.v) \rightarrow \cdots \rightarrow (p_i.lat, p_i.lng, p_i.t, p_i.v) \]
The Gaussian method is used to attain differential privacy by applying random
noise to the time parameter t of the trajectory data predicted from client behavior.
The Gaussian method can be described using a data set $D = \{x_1, x_2, \ldots, x_N\}$,
a privacy parameter ε, and the global sensitivity Δf of a given function f.
In the differential privacy mechanism, for given neighboring data sets $D$ and $D'$,
the sensitivity of the function f is represented by Δf as given below:
\[ \Delta f = \max_{D \Delta D'} \lVert f(D) - f(D') \rVert \]
Here $D \Delta D'$ ranges over all pairs of data sets that differ in at most one record.
Theorem 1 For a given output function $f : D^d \rightarrow \mathbb{R}^d$, the following function M has
(ε, δ)-differential privacy if $\delta > \frac{4}{5} \exp\left(-\frac{(\sigma\varepsilon)^2}{2}\right)$ and ε < 1.
\[ M(f, D) = f(D) + (Y_1, Y_2, \ldots, Y_d) \]
The probability δ represents the likelihood that the differential privacy guarantee fails to hold. The
parameter δ bounds the differential privacy level, and its value is smaller than
$\frac{1}{|D|}$. The parameter ε is inversely related to privacy protection: the smaller ε is, the stronger the protection. Each
Gaussian draw $Y_i$ (i = 1, 2, . . . , d) has mean 0 and
standard deviation Δf σ, i.e., $Y_i \sim \mathcal{N}(0, (\Delta f \sigma)^2)$.
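The equation below gives the trajectory data after the Gaussian noise value
$\mathrm{Gaus}\left(\frac{\Delta f}{\varepsilon}\right)$ has been added to every time attribute, which can resist
attacks based on context awareness.
Processed Data:
\[ \mathrm{Pro\_Data}_t : \left(p_1.lat, p_1.lng, p_1.t + \mathrm{Gaus}\left(\tfrac{\Delta f}{\varepsilon}\right), p_1.v\right) \rightarrow \left(p_2.lat, p_2.lng, p_2.t + \mathrm{Gaus}\left(\tfrac{\Delta f}{\varepsilon}\right), p_2.v\right) \rightarrow \cdots \rightarrow \left(p_i.lat, p_i.lng, p_i.t + \mathrm{Gaus}\left(\tfrac{\Delta f}{\varepsilon}\right), p_i.v\right) \]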
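The time perturbation step can be sketched in Python as follows. This is only an illustration under stated assumptions: the condition of Theorem 1 is read as δ > (4/5)·exp(−(σε)²/2), the noise standard deviation is taken as Δf·σ, and the helper names `min_sigma` and `perturb_timestamps` are hypothetical. The chapter does not specify how σ is chosen, so the boundary value used here is an assumption.

```python
import math
import random
from typing import List, NamedTuple


class Point(NamedTuple):
    lat: float
    lng: float
    t: float  # timestamp
    v: float  # velocity


def min_sigma(epsilon: float, delta: float) -> float:
    """Boundary sigma at which delta = (4/5) * exp(-(sigma * epsilon)**2 / 2).

    Any sigma larger than this value satisfies the strict inequality
    assumed for Theorem 1.
    """
    if not (0 < epsilon < 1) or not (0 < delta < 0.8):
        raise ValueError("need 0 < epsilon < 1 and 0 < delta < 4/5")
    return math.sqrt(2.0 * math.log(0.8 / delta)) / epsilon


def perturb_timestamps(
    traj: List[Point], sensitivity: float, epsilon: float, delta: float, rng: random.Random
) -> List[Point]:
    """Add zero-mean Gaussian noise to every timestamp; other fields are unchanged."""
    std = sensitivity * min_sigma(epsilon, delta)
    return [p._replace(t=p.t + rng.gauss(0.0, std)) for p in traj]


traj = [Point(39.90, 116.40, 100.0, 5.0), Point(39.91, 116.41, 160.0, 6.0)]
noisy = perturb_timestamps(traj, sensitivity=1.0, epsilon=0.5, delta=1e-3, rng=random.Random(42))
```

Only the time attribute changes, so the spatial shape of the trajectory is preserved while the exact visit times are hidden.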
3.2 Trajectory Generator
DSPPTD's essential purpose is to improve the utility of the statistics reported on trajectory
data, as well as the scheme's efficiency, while maintaining
differential privacy. Differential privacy frameworks and deep neural network
(DNN) deep learning algorithms are the core methodologies applied in this paper.
DSPPTD uses differential privacy to offer protection and privacy functionality to
LBS apps and uses a DNN to process trajectory data efficiently from complex
time series. DSPPTD is built for dynamic object movement and defines a
dynamic model of four components corresponding to the speed, latitude, longitude,
and time of the users.
3.3 Multilayer Perceptron and Deep Neural Network
A “perceptron” is an “artificial neuron,” the building block of a “neural” system. This
paper first discusses the simplest multilayer perceptron, with a single hidden layer, before
the deep-learning-based multilayer perceptron. In general, this multilayer perceptron
has a structure in which every location might be represented by a single
input neuron and a single output neuron, with one hidden layer in between. Positive weights are
typically considered excitatory in a neural network, whereas negative weights
are considered inhibitory. Training is the process of changing the weights to build a
network that performs some task. The basic architecture of an artificial neural network
consists of three components: presynaptic connections, which carry the inputs xi;
synaptic influence, which is modeled using real weights wi; and the neuron reaction,
which is a nonlinear function f of the weighted inputs.
As shown in Fig. 3, x1, x2, and x3 are given as inputs to the perceptron, which
produces a single binary output. Piecewise linear and sigmoid functions are examples of
output or response functions. The equations for the sigmoid and piecewise linear functions are given
below:
\[ f(x) = \frac{1}{1 + e^{-\lambda x}} \]
\[ f(x) = \begin{cases} x, & \text{if } x \ge \theta \\ 0, & \text{if } x < \theta \end{cases} \]
Fig. 3 Perceptron in ANN
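The two response functions and a single perceptron can be written in a few lines of Python. This is a minimal sketch; the function names and the default values of λ and θ are chosen for illustration only.

```python
import math
from typing import Callable, Sequence


def sigmoid(x: float, lam: float = 1.0) -> float:
    """Sigmoid response f(x) = 1 / (1 + exp(-lambda * x))."""
    return 1.0 / (1.0 + math.exp(-lam * x))


def piecewise_linear(x: float, theta: float = 0.0) -> float:
    """Piecewise linear response: x when x >= theta, otherwise 0."""
    return x if x >= theta else 0.0


def perceptron(
    inputs: Sequence[float],
    weights: Sequence[float],
    response: Callable[[float], float] = sigmoid,
) -> float:
    """Apply the response function to the weighted sum of the inputs."""
    return response(sum(w * x for w, x in zip(weights, inputs)))


# Three inputs x1, x2, x3 with excitatory (positive) and inhibitory (negative) weights.
y = perceptron([1.0, 0.5, 2.0], [0.4, -0.2, 0.1])
```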
Neural network technology imitates the functioning of the human brain
for pattern recognition by passing the input
through successive layers of simulated neural connections. “Artificial neural
networks” have an “input layer,” at least one “hidden layer” in between, and an
“output layer.” In a “feature hierarchy,” specific kinds of sorting and ordering are carried
out in each layer. Dealing with unlabeled or unstructured data is among the practical
uses of these neural networks. Figure 3 shows the perceptron in an artificial neural
network (ANN). The leftmost layer refers to the “input neurons” present in the “input
layer.” The rightmost layer refers to the “output neurons” present in the “output
layer.” The middle layer refers to the “hidden layer,” which contains neither the
input “neurons” nor the output “neurons.”
One of the downsides of a “neural network” is the cost of computing the gradient of the cost function. One
of the quicker ways to compute this gradient is “error back propagation,” which
also gives in-depth knowledge of how changing the weights changes the system's behavior.
The “deep neural network” gives a hierarchical composition of “linear” and
“nonlinear” activation functions. We propose using “deep neural networks” or “deep
learning.” In this proposed work, the system considers an input layer, two hidden
layers, and a final output layer. The hidden layers and the output layer use
the sigmoid activation function.
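A forward pass through such a network (an input layer, two hidden layers, and an output layer, all with sigmoid activations) might look like the following Python sketch. The layer sizes, the random weight initialization, and the input values are assumptions for illustration; the chapter does not state them, and training by back propagation is not shown.

```python
import math
import random
from typing import List, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weights, biases)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in: int, n_out: int, rng: random.Random) -> Layer:
    """Create one fully connected layer with small random weights and zero biases."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(x: List[float], layers: List[Layer]) -> List[float]:
    """Propagate the input through every layer, applying the sigmoid after each one."""
    for weights, biases in layers:
        x = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(weights, biases)]
    return x


rng = random.Random(0)
sizes = [4, 8, 8, 3]  # assumed: 4 inputs (lat, lng, t, v), two hidden layers, 3 outputs
network = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
output = forward([0.39, 0.16, 0.50, 0.20], network)  # inputs assumed pre-scaled to [0, 1]
```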
The three elements involved in back propagation training are listed below:
1. Training set: The neural network uses a collection of input–output patterns for
training.
2. Test set: Another collection of input–output patterns is used to assess
the performance of the neural network.
3. Learning rate: A scalar parameter that determines the rate of change, which
is similar to the step size in numerical integration.
Network error is used as a termination criterion or as an indicator of the desired training
of the neural network. Root mean square error (RMSE) and total sum squared error
(TSSE) are the two indicators most commonly used in neural
network applications. Their equations are given below:
\[ \mathrm{RMSE} = \sqrt{\frac{2 \cdot \mathrm{TSSE}}{\#\mathrm{patterns} \cdot \#\mathrm{outputs}}} \]
\[ \mathrm{TSSE} = \frac{1}{2} \sum_{\mathrm{patterns}} \sum_{\mathrm{outputs}} (\mathrm{desired} - \mathrm{actual})^2 \]
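The two error measures can be computed directly from the desired and actual outputs, as in the Python sketch below. It assumes each pattern is a list with one value per output neuron; the function names are illustrative.

```python
import math
from typing import List


def tsse(desired: List[List[float]], actual: List[List[float]]) -> float:
    """Total sum squared error: half the squared differences summed over patterns and outputs."""
    return 0.5 * sum(
        (d - a) ** 2
        for d_row, a_row in zip(desired, actual)
        for d, a in zip(d_row, a_row)
    )


def rmse(desired: List[List[float]], actual: List[List[float]]) -> float:
    """Root mean square error derived from TSSE."""
    n_patterns = len(desired)
    n_outputs = len(desired[0])
    return math.sqrt(2.0 * tsse(desired, actual) / (n_patterns * n_outputs))


# Two patterns with two outputs each.
desired = [[1.0, 0.0], [0.0, 1.0]]
actual = [[0.5, 0.0], [0.0, 0.5]]
```

For this data the squared differences sum to 0.5, so TSSE is 0.25 and RMSE is the square root of 0.125.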
The trajectory data processed by the deep neural network is a new trajectory that can
mask the original trajectory data set. With this training model, we obtain
more anonymous data only at specific points of the complex trajectory. The
model avoids loading complete data sets as the standard
procedures do and thus reduces running time. The equation below gives
the trajectory data after deep neural network processing:
\[ \mathrm{DNN\_Data}_t : (p_1^{DNN}.lat, p_1^{DNN}.lng, p_1^{DNN}.t, p_1^{DNN}.v) \rightarrow (p_2^{DNN}.lat, p_2^{DNN}.lng, p_2^{DNN}.t, p_2^{DNN}.v) \rightarrow \cdots \rightarrow (p_i^{DNN}.lat, p_i^{DNN}.lng, p_i^{DNN}.t, p_i^{DNN}.v) \]
The trajectory data generation procedure can be described by Algorithm 1 as
given below:
Algorithm 1: Differential Privacy Generation of Trajectory Data
Input: Trajectory Original Data (Datat)
Output: DNN processed Trajectory Data (DNN_Datat)
Begin
  For_ALL Datat
    For_ALL t ≠ 0 in Datat
      ∆f = max over D∆D′ of ‖f(D) − f(D′)‖
      ∆f = Gaus(∆f / ε)
      tdnn = t + ∆f
      latdnn = DNNlat(Pro_Datat, tdnn)
      londnn = DNNlon(Pro_Datat, tdnn)
      vdnn = DNNv(Pro_Datat, tdnn)
      (latdnn, londnn, tdnn, vdnn) → DNN_Datat
    End_For
  End_For
  Return DNN_Datat
End
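The control flow of Algorithm 1 can be sketched in Python as below. Several points are assumptions rather than the authors' implementation: Gaus(Δf/ε) is interpreted as a zero-mean Gaussian draw with standard deviation Δf/ε, the sensitivity is passed in as a precomputed number, and the trained networks DNNlat, DNNlon, and DNNv are replaced by a hypothetical stand-in (`nearest_value`) that simply looks up the record closest in time.

```python
import random
from typing import Callable, List, Tuple

Record = Tuple[float, float, float, float]  # (lat, lng, t, v)
Model = Callable[[List[Record], float], float]


def nearest_value(index: int) -> Model:
    """Stand-in for a trained DNN: return one field of the record closest in time."""
    def predict(data: List[Record], t: float) -> float:
        return min(data, key=lambda r: abs(r[2] - t))[index]
    return predict


def generate_dnn_trajectory(
    data: List[Record],
    sensitivity: float,
    epsilon: float,
    dnn_lat: Model,
    dnn_lon: Model,
    dnn_v: Model,
    rng: random.Random,
) -> List[Record]:
    """Perturb each timestamp, then query the models at the perturbed time."""
    out: List[Record] = []
    for _, _, t, _ in data:
        if t == 0:  # Algorithm 1 only visits records with t != 0
            continue
        t_dnn = t + rng.gauss(0.0, sensitivity / epsilon)
        out.append((dnn_lat(data, t_dnn), dnn_lon(data, t_dnn), t_dnn, dnn_v(data, t_dnn)))
    return out


data = [(39.90, 116.40, 0.0, 5.0), (39.91, 116.41, 10.0, 5.5), (39.92, 116.42, 20.0, 6.0)]
result = generate_dnn_trajectory(
    data, 1.0, 0.5, nearest_value(0), nearest_value(1), nearest_value(3), random.Random(7)
)
```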
3.4 K-Paths Trajectory Clustering
Partition-based approaches are clustering techniques in which the number of
clusters (or centers) is fixed before processing. A parameter k (k ≤ n, where n is
the number of data points in the data set) is needed to set the number of final data partitions.
Each partition represents a cluster and must contain at least one data point.
Partition-based approaches include the k-medoids and k-means techniques. In [] and
[], two improved variants of k-means and k-medoids are described. The k-means
algorithms have been used in several clustering projects. The central concept is
to choose k cluster centers randomly and then, iteratively, to assign
each data item to its nearest cluster center according to a distance measure until all
cluster centers converge.
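The k-means idea described above can be sketched for two-dimensional points as follows. This is a plain illustration of the standard iteration (random initial centers, nearest-center assignment, mean update), not the k-paths variant used later in the chapter; the function names and sample points are assumptions.

```python
import random
from typing import List, Tuple

Pt = Tuple[float, float]


def sq_dist(a: Pt, b: Pt) -> float:
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def k_means(points: List[Pt], k: int, rng: random.Random, max_iter: int = 100) -> List[Pt]:
    """Return k cluster centers for the given points."""
    centers = rng.sample(points, k)  # random initial centers taken from the data
    for _ in range(max_iter):
        clusters: List[List[Pt]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        new_centers = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c)) if c else centers[j]
            for j, c in enumerate(clusters)
        ]
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return centers


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers = sorted(k_means(points, 2, random.Random(0)))
```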
This step divides the location trajectory data region into k subregions
of common points. Here, the position data with the same timestamp t is first
segmented. Then k subregions or groups with similar data points are identified,
and an initial centroid is chosen for each subregion. If, at any instant,
the area covers a larger number of mobile users than the threshold,
then k needs to be revised accordingly. In clustering, the location data with closer
trajectories are merged into a common cluster.
The k-paths trajectory clustering process can be defined as given below:
Given a set of trajectories of the form $T: p_1 \rightarrow p_2 \rightarrow \cdots \rightarrow p_m$, the goal of k-paths is to
divide the n trajectories into k (k ≤ n) clusters $C = \{C_1, C_2, \ldots, C_k\}$ so as to
minimize the objective function below:
\[ O = \arg\min_{C} \sum_{x=1}^{k} \sum_{p_i \in C_x} \mathrm{Dist}(p_i, \mu_x) \]
where each cluster $C_x$ has its centroid path $\mu_x$, which is an element of the set
of paths in the road network directed graph G [34], and Dist is the
Euclidean distance between two trajectories.
The k-means and k-paths approaches can be differentiated based on the following points:
(a) Unlike fixed-length vectors in a Euclidean space, trajectories can differ in length.
(b) A trajectory distance measure “Dist” must be specified for two trajectories.
(c) We cannot find the centroid path μx by merely averaging
the trajectories in the cluster. As in a variant of k-means named k-medoids [35], it is possible to use an existing trajectory as the
centroid path.
Let EH and ALH be the edge histogram and the accumulated length histogram,
respectively. Let ub(i) be the upper bound on the distance from Ti to its nearest cluster
and lb(i) the lower bound on the distance from Ti to its second nearest cluster.
Let cd(x) and cb(x) be the centroid drift and centroid bound of
μx, respectively. The formula for the edge-based distance measure used in Algorithm 2
is given below:
\[ \text{Edge-Based-Distance}(T_1, T_2) = \max(|T_1|, |T_2|) - |T_1 \cap T_2| \]
Here $|T_1|$ and $|T_2|$ are the total travel lengths of trajectories $T_1$ and $T_2$, respectively.
In k-path trajectory clustering, the trajectory distance measure “Dist” is replaced by the
edge-based distance. Therefore, the applied objective function is revised as
given below:
\[ O = \arg\min_{C} \sum_{x=1}^{k} \sum_{p_i \in C_x} \text{Edge-Based-Distance}(p_i, \mu_x) \]
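A small Python sketch of the edge-based distance follows. It assumes that each trajectory has been map-matched to a collection of road-network edge identifiers and that |T| is the summed length of its distinct edges; the edge names, the lengths in meters, and the function names are illustrative.

```python
from typing import Dict, Iterable


def travel_length(edges: Iterable[str], edge_len: Dict[str, float]) -> float:
    """Total length of the distinct edges of a trajectory."""
    return sum(edge_len[e] for e in set(edges))


def edge_based_distance(t1: Iterable[str], t2: Iterable[str], edge_len: Dict[str, float]) -> float:
    """max(|T1|, |T2|) - |T1 ∩ T2|, with lengths measured along shared road edges."""
    s1, s2 = set(t1), set(t2)
    shared = sum(edge_len[e] for e in s1 & s2)
    return max(travel_length(s1, edge_len), travel_length(s2, edge_len)) - shared


# Illustrative road edges with lengths in meters.
edge_len = {"a": 100.0, "b": 200.0, "c": 50.0, "d": 300.0}
t1 = ["a", "b", "c"]  # 350 m
t2 = ["b", "c", "d"]  # 550 m, sharing 250 m with t1
```

The distance is zero for identical trajectories and grows as the trajectories share fewer edges.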
Algorithm 2: K-Paths Clustering (K, DNN_Datat)
Input: Number of clusters (k), DNN processed Trajectory Data (DNN_Datat)
Output: k centroid paths: {μ1, . . . , μk}.
Begin
  Centroid paths μ = {μ1, · · · , μk} initialization, t ← 0;
  Repeat
    If (t = 0) then
      For Each Ti ∈ DNN_Datat do
        mini ← +infinity;
        For Each path centroid μj do
          lb(i, j) ← Edge-Based-Distance(pi, μj);
          If (mini > lb(i, j)) then
            a(i) ← j;
            mini ← lb(i, j);
          End If
        End For
        UpdateHistogram(pi, ALH, EH, a(i));
      End For
    Else
      For Each cluster do
        Compute and update centroid bound cb and centroid drift cd;
      End For
      For Each trajectory Ti ∈ DNN_Datat do
        Compute and update lb and ub;
        If (ub(i) < max(cb(a’(i))/2, lb(i))) then
          a(i) ← a’(i);   \\ Ti remains in the same cluster
        Else
          mini ← +infinity;
          For Each path centroid μx do
            If (lb(i, x) < ub(i)) then
              lb(i, x) ← Edge-Based-Distance(pi, μx);
              If (mini > lb(i, x)) then
                a(i) ← x;
                mini ← lb(i, x);
              End If
            End If
          End For
        End If
        If (a(i) ≠ a’(i)) then
          UpdateHistogram(pi, ALH, EH, a(i));
        End If
      End For
    End If
    For Each centroid path μj do
      Compute μj = arg min over μ of the sum of Edge-Based-Distance(pi, μ) for pi ∈ Cj, and update μj;
    End For
    t ← t + 1;
  While (μ changed)
  Return {μ1, . . . , μk}
End
The k-paths trajectory clustering procedure is described by Algorithm 2. To
update the centroid path in each iteration and to evaluate the objective function, the
system maintains two histograms of trajectories for each cluster.
(a) Edge histogram: The edge histogram (EHj) for the trajectories in cluster Cj
holds the frequencies of the graph edges in sorted order. EHj(e) stands for
the frequency of edge e, i.e., EHj(e) = |e|, and EHj[l] stands for the l-th
largest frequency. The system does not need to rebuild the
histograms in each iteration; instead, it maintains one histogram incrementally for every cluster and
updates it only when a trajectory moves into or out of the cluster.
Many trajectories stay in the same cluster in the next iteration,
so there are few changes to the histogram.
(b) Accumulated length histogram: The key of each entry is a trajectory
length in meters. The histogram records the number of trajectories
that have this length. ALH is ordered by key in ascending order; ALHx[l]
gives the number of trajectories in cluster Cx that have length l.
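The incremental maintenance of the two histograms can be illustrated with Python's `collections.Counter`. The class below is a hypothetical sketch, not the chapter's UpdateHistogram routine: it updates the counts only when a trajectory enters or leaves a cluster and sorts the frequencies on demand instead of keeping them permanently ordered.

```python
from collections import Counter
from typing import Iterable


class ClusterHistograms:
    """Edge histogram (EH) and accumulated length histogram (ALH) for one cluster."""

    def __init__(self) -> None:
        self.eh: Counter = Counter()   # edge id -> number of trajectories using the edge
        self.alh: Counter = Counter()  # trajectory length in meters -> number of trajectories

    def add(self, edges: Iterable[str], length: int) -> None:
        """Called when a trajectory moves into the cluster."""
        self.eh.update(set(edges))
        self.alh[length] += 1

    def remove(self, edges: Iterable[str], length: int) -> None:
        """Called when a trajectory moves out of the cluster."""
        for e in set(edges):
            self.eh[e] -= 1
            if self.eh[e] <= 0:
                del self.eh[e]
        self.alh[length] -= 1
        if self.alh[length] <= 0:
            del self.alh[length]

    def lth_largest_frequency(self, l: int) -> int:
        """EH[l]: the l-th largest edge frequency (1-based)."""
        return sorted(self.eh.values(), reverse=True)[l - 1]


h = ClusterHistograms()
h.add(["a", "b"], 300)
h.add(["b", "c"], 300)
```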
3.5 Trajectory Release
Trajectory release is the last step. It compares each clustered synthetic trajectory datum with the corresponding real trajectory and merges them accordingly. The process also involves a prejudging mechanism to ensure that at least one actual trajectory record appears in the processed trajectory. When the count of actual records is zero, the produced trajectory is a null trajectory and is considered irregular. Because this check for irregular trajectories is included, the probability of issuing a null trajectory is further reduced, the reliability of the trajectory assignment is increased, and better data availability is assured.
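The prejudging check can be illustrated with a short sketch. The function names and the representation of a record as an exact (x, y, t) tuple are hypothetical; the chapter does not specify how records are matched or merged, so only the null-trajectory test is shown.

```python
def count_real_records(synthetic, real_records):
    """Number of records in the synthetic trajectory that are actual records."""
    return sum(1 for record in synthetic if record in real_records)


def release(candidates, real_trajectory):
    """Keep only candidates that contain at least one actual record.

    A candidate with zero actual records is treated as a null (irregular)
    trajectory and is not released.
    """
    real_records = set(real_trajectory)
    return [t for t in candidates if count_real_records(t, real_records) > 0]
```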
4 Performance Analysis of Privacy Protection Scheme
To check the feasibility of our proposed approach and the data availability, we performed tests based on T-Drive project data from Microsoft Research [38], which include the trajectory details of 10,357 taxis over one week. The cumulative point count is about 15 million, for a cumulative trajectory length of nine million kilometers. The evaluation was conducted on a machine with an octa-core 3.2 GHz Intel i7 processor, 64 GB of RAM, and the Windows 8 operating system. The processing time overhead of the query and service schedule is assumed to be negligible in the proposed model. Location-based services have drawn millions of users and hold a massive volume of their digital footprints. The query process and the interval process are the two modules executed for the simulation of the proposed model.
A location-dependent k-nearest neighbor query (e.g., nearest hospital profile info) is continuously generated by the query process with an exponentially distributed query interval. Driven by the assessment process surrounding anonymity, modeling, and uncertainty [36], we examine the relationship between the efficacy and the usefulness of data protection. Past policies are based on user-clustered or user-centralized architectures. Hence, the workload of the network is high, and the anonymizer can become a performance bottleneck. Unlike existing work, our methodology integrates a fog server, which processes the data in an IoT gateway or fog node, as it is nearer to the consumer and can be partly managed by the user [37]. To protect trajectories, our system takes into account the time-dependent mobility trend, the probability of query, and the spatiotemporal connection. It produces k − 1 dummy positions and trajectories with full entropy, which can protect the original trajectory both offline and online. Here, we use two measures, namely mutual information and Hausdorff distance, to establish this relationship and evaluate the proposed policy. We believe that these measures may assist in choosing and implementing suitable privacy protection methods for particular situations. The two measures are defined below.
Mutual information: Mutual information (MI) measures the privacy protection intensity of a given privacy protection scheme. It is directly proportional to the differential privacy parameter (ε), also known as the differential privacy budget, and inversely proportional to privacy protection intensity. A lower MI therefore indicates stronger protection.
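For reference, MI between two discrete variables can be estimated from paired samples with the standard Shannon definition, I(X;Y) = Σ p(x,y) log2(p(x,y) / (p(x) p(y))). The sketch below assumes, purely for illustration, that each original location and its released counterpart have been mapped to discrete cell labels; the chapter does not state how its MI values were computed.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Empirical mutual information, in bits, of a list of (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) / (p(x) p(y)) simplifies to c * n / (count_x * count_y).
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi
```

When the released cell always equals the original cell, MI equals the entropy of the original data (nothing is hidden); when the two are independent, MI is zero.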
Hausdorff distance: The Hausdorff distance measures the difference between two sets of points in a metric space and is commonly used to calculate the spatial dissimilarity of two trajectories. We measure the Hausdorff distance between each original trajectory and its synthetic counterpart. A higher Hausdorff distance between a pair of trajectories represents higher dissimilarity, so the synthetic trajectory retains fewer of the points of interest (POIs) of the original trajectory. Therefore, a higher Hausdorff distance indicates lower utility of the given trajectory data for LBS.
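The definition above can be made concrete with a small sketch. It assumes that trajectories are lists of planar (x, y) points already projected to meters and that the Euclidean distance is used; both are assumptions for the example, not details given in the chapter.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point of a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sequences."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))
```

For example, if a synthetic trajectory contains every point of the original plus one detour point 3 m away from the nearest original point, the distance is 3 m, even though all other points coincide.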
We compared the proposed deep neural network-based differential privacy protection scheme, DSPPTD, with past policies, namely TSTDA [38], NGTMA [39], and SDD [40]. Privacy protection intensity in terms of mutual information (MI) and trajectory data utility in terms of Hausdorff distance (HD) were computed for all models and are depicted in Figs. 4 and 5. In this comparison, DSPPTD outperforms the other policies on both measures.
DSPPTD has the lowest MI level, which shows that it provides a higher level of privacy protection than the NGTMA, TSTDA, and SDD methods. In this study, we find that the level of privacy protection is directly linked to ε, as depicted in Fig. 4. Because DSPPTD uses the Gaussian mechanism in the data collection step in addition to the exponential mechanism in the data release phase, the dual differential privacy mechanisms provide better privacy protection.
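The two mechanisms named above can be sketched in their textbook forms. The chapter does not give its noise calibration, so the parameters delta and sensitivity, the function names, and the use of the classical Gaussian calibration (valid for 0 < ε < 1) are assumptions made for illustration.

```python
import math
import random


def gaussian_sigma(epsilon, delta, sensitivity):
    """Classical Gaussian-mechanism noise scale for (epsilon, delta)-DP, 0 < epsilon < 1."""
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / epsilon


def add_gaussian_noise(point, sigma, rng):
    """Perturb each coordinate of a collected point with zero-mean Gaussian noise."""
    return tuple(c + rng.gauss(0.0, sigma) for c in point)


def exponential_probs(utilities, epsilon, sensitivity):
    """Exponential-mechanism selection probabilities for candidate outputs.

    Each candidate is chosen with probability proportional to
    exp(epsilon * utility / (2 * sensitivity)).
    """
    weights = [math.exp(epsilon * u / (2.0 * sensitivity)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]
```

A smaller ε yields a larger noise scale and a flatter selection distribution, which is consistent with the trend in which privacy protection strengthens, and MI falls, as ε decreases.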
As depicted in Fig. 5, DSPPTD has the smallest HD of the four systems, so the published data set is the closest to the originally collected data. DSPPTD has thus increased the practicability of the results. The fundamental explanation for this is the convergence of the prediction system. Redundancy in the trajectory data is removed during data processing. When DSPPTD finds data to be incorrect, it removes them to improve the reliability of the released trajectory data. The reliability of these data leads to higher utility for the LBS.
Fig. 4 Effect of number of cluster groups and differential privacy parameter (ε) on mutual information (MI). Panels: MI versus number of groups (differential privacy parameter = 0.5) and MI versus ε (%) (number of groups = 50); curves for DSPPTD, NGTMA, TSTDA, and SDD
Fig. 5 Effect of number of cluster groups and differential privacy parameter (ε) on Hausdorff distance (HD). Panels: HD (in meters) versus number of groups (differential privacy parameter = 0.5) and HD (in meters) versus ε (%) (number of groups = 50); curves for DSPPTD, NGTMA, TSTDA, and SDD
As shown in Fig. 6, DSPPTD has the lowest execution time among the algorithms for a fixed privacy budget (ε = 0.5). An algorithm's execution time comprises the time for generating noise and the time for processing trajectories. The noise generation time is the same for all algorithms, so the execution time of the proposed policy is determined by the time for processing trajectories. The distinction lies in the computation time of trajectories. DSPPTD has the benefit of working on the trajectory data produced by the prediction model rather than on conventional processing of time-series data. So, it has better execution time efficiency. We conclude, therefore, that DSPPTD ensures privacy protection and availability of data, along with high efficiency in terms of running time.
Fig. 6 Comparison of execution time based on number of cluster groups. Panels: average trajectory generation time (sec) and average clustering and trajectory release time (sec) versus number of groups (ε = 0.5); curves for DSPPTD, NGTMA, TSTDA, and SDD
5 Conclusion
In this work, we introduced the Dynamic Scheme for Privacy Protection of Trajectory Data (DSPPTD). DSPPTD combines a Gaussian mechanism with a dual, deep learning-based differential privacy requirement to provide privacy protection, and uses edge computing to provide enhanced-utility services. For consumer services, a dual deep learning-based differential privacy model has been suggested. Through an empirical study, we have shown that DSPPTD offers stronger privacy protection, better data utility, and better overall reliability than existing state-of-the-art systems.
Our future research will concentrate on improving the trajectory resemblance
loss metric model, expanding our system to global trajectory data sets, creating
personalized simulated trajectory data for variable lengths, investigating possible
attacks on privacy and security techniques, and assessing the efficacy and usefulness
of our system in other trajectory data mining and analytics schemes.
Competing Interest Declaration On behalf of all authors, the corresponding author states that there is no conflict of interest.
References
1. Sun, G., et al. (2017). Efficient location privacy algorithm for Internet of Things (IoT) services
and applications. Journal of Network and Computer Applications, 89, 3–13. https://doi.org/
10.1016/j.jnca.2016.10.011
2. Gupta, A. K., & Shanker, U. (2020). Some issues for location dependent information system
query in Mobile environment. In 29th ACM international conference on information and
knowledge management (CIKM ’20) (p. 4). https://doi.org/10.1145/3340531.3418504
3. Gupta, A. K., & Shanker, U. (2018). Location dependent information System’s queries for
Mobile environment. In Lecture notes in computer science (pp. 218–226). https://doi.org/
10.1007/978-3-319-91455-8_19
4. Zakhary, S., & Benslimane, A. (2018). On location-privacy in opportunistic mobile networks, a
survey. Journal of Network and Computer Applications, 103, 157–170. https://doi.org/10.1016/
j.jnca.2017.10.022
5. Hu, H., Sun, Z., Liu, R., & Yang, X. (2019, July). Privacy implication of location-based service:
Multi-class stochastic user equilibrium and incentive mechanism. Transportation Research
Record, 2673(12), 256–265. https://doi.org/10.1177/0361198119859322
6. Gupta, A. K., & Shanker, U. (2020). OMCPR: Optimal mobility aware cache data prefetching and replacement policy using spatial K-anonymity for LBS. Wireless Personal
Communications, 114(2), 949–973. https://doi.org/10.1007/s11277-020-07402-2
7. Shen, H., Bai, G., Yang, M., & Wang, Z. (2017). Protecting trajectory privacy: A user-centric analysis. Journal of Network and Computer Applications, 82, 128–139. https://doi.org/10.1016/j.jnca.2017.01.018
8. Niu, B., Zhang, Z., Li, X., & Li, H. (2014). Privacy-area aware dummy generation algorithms
for location-based services. In 2014 IEEE International Conference on Communications (ICC)
(pp. 957–962). https://doi.org/10.1109/ICC.2014.6883443
9. Indyk, P., & Woodruff, D. (2006). Polylogarithmic private approximations and efficient
matching. In Theory of cryptography (pp. 245–264). Springer.
10. Gupta, A. K., & Shanker, U. (2020). MAD-RAPPEL: Mobility aware data replacement
& prefetching policy enrooted LBS. Journal of King Saud University – Computer and
Information Sciences. https://doi.org/10.1016/j.jksuci.2020.05.007
11. Gambs, S., Killijian, M., & Cortez, M. N. D. P. (2013). De-anonymization attack on Geolocated
data. In 2013 12th IEEE international conference on trust, security and privacy in computing
and communications (pp. 789–797). https://doi.org/10.1109/TrustCom.2013.96
12. Liu, H., Darabi, H., Banerjee, P., & Liu, J. (2007). Survey of wireless indoor positioning
techniques and systems. IEEE Transactions on Systems, Man, and Cybernetics, Part C
(Applications and Reviews), 37(6), 1067–1080. https://doi.org/10.1109/TSMCC.2007.905750
13. Petrou, L., Larkou, G., Laoudias, C., Zeinalipour-Yazti, D., & Panayiotou, C. G. (2014).
Demonstration abstract: Crowdsourced indoor localization and navigation with anyplace. In
IPSN-14 proceedings of the 13th international symposium on information processing in sensor
networks (pp. 331–332). https://doi.org/10.1109/IPSN.2014.6846788
14. Kido, H., Yanagisawa, Y., & Satoh, T. (2005). An anonymous communication technique using
dummies for location-based services. In Proceedings of ICPS (pp. 88–97).
15. Lu, H., Jensen, C., & Yiu, M. (2008). PAD: Privacy-area aware, dummy-based location
privacy in mobile services. https://doi.org/10.1145/1626536.1626540
16. Niu, B., Gao, S., Li, F., Li, H., & Lu, Z. (2016). Protection of location privacy in continuous
LBSs against adversaries with background information. In 2016 International Conference
on Computing, Networking and Communications (ICNC) (pp. 1–6). https://doi.org/10.1109/
ICCNC.2016.7440649
17. Samarati, P., & Sweeney, L. (1998). Generalizing data to provide anonymity when disclosing
information (Abstract). In Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART
symposium on principles of database systems (p. 188). https://doi.org/10.1145/275487.275508
18. Zhang, S., Li, X., Tan, Z., Peng, T., & Wang, G. (2019). A caching and spatial K-anonymity
driven privacy enhancement scheme in continuous location-based services. Future Generation
Computer Systems, 94, 40–50. https://doi.org/10.1016/j.future.2018.10.053
19. Niu, B., Li, Q., Zhu, X., Cao, G., & Li, H. (2014). Achieving k-anonymity in privacy-aware location-based services. In IEEE INFOCOM 2014 – IEEE Conference on Computer
Communications (pp. 754–762). https://doi.org/10.1109/INFOCOM.2014.6848002
20. Guan, Z., et al. (2019, January). APPA: An anonymous and privacy preserving data aggregation scheme for fog-enhanced IoT. Journal of Network and Computer Applications, 125, 82–92. https://doi.org/10.1016/j.jnca.2018.09.019
21. Zhu, H., Yang, X., Wang, B., Wang, L., & Lee, W.-C. (2019). Private trajectory data publication
for trajectory classification. In Web information systems and applications (pp. 347–360).
22. Gruteser, M., & Grunwald, D. (2003). Anonymous usage of location-based services through
spatial and temporal cloaking. In Proceedings of the 1st international conference on Mobile
systems, applications and services (pp. 31–42). https://doi.org/10.1145/1066116.1189037
23. Palanisamy, B., & Liu, L. (2011). MobiMix: Protecting location privacy with mix-zones over
road networks. In 2011 IEEE 27th international conference on data engineering (pp. 494–505).
https://doi.org/10.1109/ICDE.2011.5767898
24. Nergiz, M., Atzori, M., & Saygin, Y. (2008). Towards trajectory anonymization: A generalization-based approach.
25. Hampton, K., et al. (2010, November). Mapping health data: Improved privacy protection
with donut method Geomasking. American Journal of Epidemiology, 172, 1062–1069. https://
doi.org/10.1093/aje/kwq248
26. Gao, S., Rao, J., Liu, X., Kang, Y., Huang, Q., & App, J. (2019, December). Exploring the
effectiveness of geomasking techniques for protecting the geoprivacy of Twitter users. Journal
of Spatial Information Science. https://doi.org/10.5311/JOSIS.2019.19.510
27. Zandbergen, P. (2014, April). Ensuring confidentiality of geocoded health data: Assessing
geographic masking strategies for individual-level data. Advances in Medicine, 2014, 1–14.
https://doi.org/10.1155/2014/567049
28. Kwan, M.-P., Casas, I., & Schmitz, B. (2004, June). Protection of Geoprivacy and accuracy of
spatial information: How effective are geographical masks? Cartographica the International
Journal for Geographic Information and Geovisualization, 39, 15–28. https://doi.org/10.3138/
X204-4223-57MK-8273
29. Seidl, D. E., Jankowski, P., & Tsou, M.-H. (2016, April). Privacy and spatial pattern preservation in masked GPS trajectory data. International Journal of Geographical Information
Science, 30(4), 785–800. https://doi.org/10.1080/13658816.2015.1101767
30. Beresford, A. R., & Stajano, F. (2003). Location privacy in pervasive computing. IEEE
Pervasive Computing, 2(1), 46–55. https://doi.org/10.1109/MPRV.2003.1186725
31. Liu, X., Zhao, H., Pan, M., Yue, H., Li, X., & Fang, Y. (2012). Traffic-aware multiple mix
zone placement for protecting location privacy. In 2012 Proceedings IEEE INFOCOM (pp.
972–980). https://doi.org/10.1109/INFCOM.2012.6195848
32. Hasan, A. S. M. T., Jiang, Q., & Li, C. (2017, October). An effective grouping method for
privacy-preserving bike sharing data publishing. Future Internet, 9, 65. https://doi.org/10.3390/
fi9040065
33. Li, X., Zhu, Y., Wang, J., Liu, Z., Liu, Y., & Zhang, M. (2018). On the soundness and security of
privacy-preserving SVM for outsourcing data classification. IEEE Transactions on Dependable
and Secure Computing, 15(5), 906–912. https://doi.org/10.1109/TDSC.2017.2682244
34. Gupta, A. K., & Shanker, U. (2020). Study of fuzzy logic and particle swarm methods in map
matching algorithm. SN Applied Sciences, 2, 608. https://doi.org/10.1007/s42452-020-2431-y
35. Park, H.-S., & Jun, C.-H. (2009). A simple and fast algorithm for K-medoids clustering. Expert Systems with Applications, 36(2, Part 2), 3336–3341. https://doi.org/10.1016/
j.eswa.2008.01.039
36. Gupta, A. K. (2020). Spam mail filtering using data mining approach: A comparative
performance analysis. In S. Shanker & U. Pandey (Eds.), Handling priority inversion in time-constrained distributed databases (pp. 253–282). IGI Global.
37. Wang, T., et al. (2017). Trajectory privacy preservation based on a fog structure for cloud location services. IEEE Access, 5, 7692–7701. https://doi.org/10.1109/ACCESS.2017.2698078
38. Hua, J., Gao, Y., & Zhong, S. (2015). Differentially private publication of general time-serial
trajectory data. In 2015 IEEE Conference on Computer Communications (INFOCOM) (pp.
549–557). https://doi.org/10.1109/INFOCOM.2015.7218422
39. Li, M., Zhu, L., Zhang, Z., & Xu, R. (2017, March). Achieving differential privacy of trajectory
data publishing in participatory sensing. Information Sciences, 400. https://doi.org/10.1016/
j.ins.2017.03.015
40. Jiang, K., Shao, D., Bressan, S., Kister, T., & Tan, K.-L. (2013). Publishing trajectories with
differential privacy guarantees. https://doi.org/10.1145/2484838.2484846
Part II
Data-Driven Decision-Making Systems
n-Layer Platform for Hi-Tech World
R. B. Patel, Lalit Awasthi, M. C. Govil, and Rachita
1 Introduction
Online information management systems are in demand in today's world because they ease everyday life. The demand for services has grown with the increasing use of the Internet. If every citizen of a nation uses technology to fulfill day-to-day needs, then network traffic and bandwidth become crucial for that nation. Increasing democratic participation, accountability, transparency, quality of service, and on-time availability of services are challenges for a country [14]. E-governance systems that cater to such functions are partially available in a few developed countries and regions, viz., Europe, the United States, Australia, and Singapore [1, 2, 3]. A complete e-governance system is a demand of the present generation, as to date it is not fully functional in any country in the world. An intelligent e-governance system may be adopted and implemented worldwide within the next decade [7, 8]. At present, it is too early to say that an e-governance system is available across the world. Such a system gives citizens knowledge about the day-to-day development of a country. However, due to the limitations of client/server technology, it suffers in terms of network bandwidth as the number of clients increases. In the next decade, it will be challenging for system managers to fulfill people's expectations, which in turn forces the creation of a hi-tech world. These expectations demand a system that fulfills people's requirements and also adapts to new changes and developments. Thus, it is necessary to model a system in which every country is a member and works over a common platform in the global public interest [9, 10].
Note: We have used model, platform, and framework interchangeably.
R. B. Patel
Chandigarh College of Engineering and Technology, Chandigarh, Punjab, India
L. Awasthi
National Institute of Technology, Hamirpur, Himachal Pradesh, India
e-mail: lalit@nith.ac.in
M. C. Govil
National Institute of Technology, Ravangla, Sikkim, India
Rachita
TD Canada Trust, St. Catharine, ON, Canada
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
S. Pandey et al. (eds.), Role of Data-Intensive Distributed Computing Systems in
Designing Data Solutions, EAI/Springer Innovations in Communication and
Computing, https://doi.org/10.1007/978-3-031-15542-0_5
A work agent (WA) is a mobile software program, i.e., a mobile agent (MA), which moves from node to node over the global network. It searches for the resources required to accomplish its assigned task. When it finds the resources, it finishes the task, returns the outcome to its owner, issues a death warrant to itself, and dies. The death warrant is important to avoid misuse of the code and data associated with a WA. This chapter presents a novel n-layer platform for the hi-tech world, in which agent technology is used. A new and unique naming scheme is used to identify each citizen of a country and to give a unique name to the work agent(s) created by that citizen. The initial part of this naming scheme provides unique identification to a citizen of a country, who is the owner of the agent(s). The scheme provides a unique name for an agent across the global network for everyone using mobile agent technology (MAT). In the proposed platform, the value of n may vary from 2 to 10. We have considered India as a case study, and for this platform n is taken as 7 plus 1. Here, seven levels are used to uniquely identify every citizen, and the eighth level identifies the work agent(s) created by a citizen.
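The life cycle of a work agent described above (migrate, find resources, complete the task, report, self-terminate) can be sketched as follows. The class and field names are hypothetical, nodes are modeled as plain dictionaries rather than real hosts, and the decision to terminate even when no resource is found is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class WorkAgent:
    owner: str
    needed: str                      # name of the resource the task requires
    payload: dict = field(default_factory=dict)
    alive: bool = True

    def run(self, nodes, task):
        """Visit nodes in order; run the task on the first node that has the resource."""
        for node_name, resources in nodes:
            if self.needed in resources:
                result = task(resources[self.needed])
                self.die()
                return node_name, result
        self.die()                   # assumption: the agent also dies if nothing is found
        return None, None

    def die(self):
        """Death warrant: discard carried data so it cannot be misused."""
        self.payload.clear()
        self.alive = False
```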
This platform is named the neighbor-assisted framework for mobile agents (NAFMA). Here, the term mobile agent is also used for a physical mobile agent, which may be a citizen of a country. The platform uses an eight-component naming scheme for MAs, in which seven components are arranged in a logical hierarchical order. These seven components of the name are contributed by the seven layers of NAFMA, and the eighth component is contributed by the agent owner. Every layer of NAFMA is integrated with the platform for mobile agent distribution and execution (PMADE) [2, 12] and with layer intelligent agents (LIAs). The system uses a hexadecimal digital identification code for every citizen of a country. The code is 17 digits long for a citizen, and an 18th digit defines a WA. NAFMA supports two types of services, internal and external, which are useful for the day-to-day work of the people of a country. External services serve everyone around the world and are available at the country layer of the system. Thus, the system embodies the idea of one world under one umbrella. Several other types of services need to be available among the people of a country; they are managed by the provincial layers and are known as internal services.
Implementation of this system will bring the whole world under one umbrella and make the hi-tech world smart and green. The system will also remove the burden of carrying documents while traveling across the world. Using only the NAFMA card and a fingerprint or face print, an individual's details can be fetched from the system, which in turn presents the identification of the individual along with face value.
This system assigns a unique 17-digit identification (ID) code to every citizen and a 1-digit ID to each of their work agents. Because one hexadecimal digit has 16 possible values, a citizen is permitted to use 16 work agents concurrently. The system uses the peer-to-peer concept to generate a location-dependent identification code for each citizen [2, 4]. Citizens are permitted to share their views and ideas with other citizens and authorities, as well as with the government, with the help of work agents across the globe. This system will support all kinds of applications related to human beings. In describing this system, we use the terms model, platform, and framework interchangeably.
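The arithmetic behind the 17 + 1 digit code can be illustrated with a short sketch. The function names and the sample citizen code are hypothetical; only the digit lengths and the hexadecimal alphabet come from the text.

```python
CITIZEN_DIGITS = 17          # hexadecimal digits identifying the citizen
HEX_DIGITS = set("0123456789ABCDEF")


def agent_name(citizen_id, agent_index):
    """Append a one-hex-digit work agent index to a 17-digit citizen code."""
    cid = citizen_id.upper()
    if len(cid) != CITIZEN_DIGITS or not set(cid) <= HEX_DIGITS:
        raise ValueError("citizen ID must be 17 hexadecimal digits")
    if not 0 <= agent_index <= 0xF:
        raise ValueError("a citizen may own at most 16 work agents (0x0 to 0xF)")
    return cid + format(agent_index, "X")


def split_agent_name(name):
    """Return (citizen_id, agent_index) from an 18-digit agent name."""
    return name[:CITIZEN_DIGITS], int(name[CITIZEN_DIGITS], 16)
```

With a made-up citizen code such as 0123456789ABCDEF0, the sixteen valid agent names run from 0123456789ABCDEF00 to 0123456789ABCDEF0F; a seventeenth agent cannot be named without changing the code length.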
The rest of the chapter is organized as follows: Section 1 gives the introduction, and Sect. 2 presents information management issues and challenges. The Indian administrative system is discussed in Sect. 3. The system model is explored in Sect. 4, and the system architecture is given in Sect. 5. The unique customer identification code is presented in Sect. 6. The implementation and performance study is given in Sect. 7. Findings are discussed in Sect. 8, and finally, the chapter is concluded in Sect. 9.
2 Information Management Issues
The easy availability of Internet connectivity has fuelled the growth of electronic information. This happened due to advances in electronic systems, which promoted the production and use of economical electronic gadgets. Because of these, the use of inexpensive Internet services has grown drastically around the world. This usage considerably worsens information management challenges. These challenges are serious issues for organizations and governments because of their dependence on existing rules and resources. Organizations and governments suffer from a lack of clear direction for the use of technologies and for their integration with the disparate information management systems available to them. Their policies also suffer from internal politics and a lack of clarity around broader organizational strategies and directions. Thus, information management systems suffer from poor quality of information, which leads to inconsistent, duplicate, and stale information. Further, the most important obstacle to changing the working practices and processes of an organization's staff is that the staff do not want to upgrade as the times demand. To handle such issues, researchers have proposed several models.
The authors of [5] explored the opportunities and challenges for networked organizations. They presented a flexible and efficient information architecture for establishing new values, attitudes, and behaviors to share information and build databases. This system provides integrated customer support on a worldwide basis and protects personal freedoms and privacy.
Electronic brainstorming is a seemingly suitable and prevalent platform in the twenty-first century. It makes daily public life easy but leaves open issues concerning its management and security [6].
The Unique Identification Authority of India (UIDAI) is mandated to issue an easily verifiable 12-digit random number as a unique identification for residents. This 12-digit unique identification (UID) number is termed "Aadhaar". In the Aadhaar number, the twelfth digit is used as a checksum [13]. This UID has several limitations. It does not directly indicate the place of birth of the person using it. The card does not reflect the day-to-day financial condition of individuals. A single card for all kinds of tasks (banking, income tax, vehicle registration, driving license, loan account management, payment, etc.) is not possible with the UID. The Aadhaar card also does not guarantee food to everyone every day. It does not keep records of unemployed citizens of India.
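The chapter states only that the twelfth digit is a checksum. The Aadhaar check digit is commonly documented as a Verhoeff checksum; assuming that scheme, the following sketch shows how a check digit can be generated for an 11-digit payload and how a 12-digit number can be validated. The function names and the sample payload in the usage note are made up for illustration.

```python
# Verhoeff tables: D is the dihedral-group multiplication table,
# P the position-dependent permutations, INV the inverse of each element.
D = ["0123456789", "1234067895", "2340178956", "3401289567", "4012395678",
     "5987604321", "6598710432", "7659821043", "8765932104", "9876543210"]
P = ["0123456789", "1576283094", "5803796142", "8916043527",
     "9453126870", "4286573901", "2793806415", "7046913258"]
INV = "0432156789"


def _fold(digits, offset):
    """Combine the digits right to left through the permutation and group tables."""
    c = 0
    for i, ch in enumerate(reversed(digits)):
        c = int(D[c][int(P[(i + offset) % 8][int(ch)])])
    return c


def check_digit(payload):
    """Verhoeff check digit to append to a string of decimal digits."""
    return INV[_fold(payload, 1)]


def is_valid(number):
    """True if the last digit of number is a correct Verhoeff check digit."""
    return _fold(number, 0) == 0
```

For a made-up payload such as 12345678901, is_valid(payload + check_digit(payload)) holds, and changing any single digit of the result makes validation fail, which is the property that makes such a checksum useful against typing errors.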
Besides, there are several other issues that existing systems do not address in the real life of people in most developing and underdeveloped countries. Poverty is a major issue in these countries, where the system is not able to reach common people in time, so the growth of the younger generation is hampered by a lack of basic necessities. Thus, there is a need for a common e-governance system that addresses the issues of common people while reducing the overall management cost of a country's system [10, 11].
3 Indian Administrative System
When the population and area of a system become very large, the cost and processing involved in direct communication are prohibitive. A popular alternative to direct communication that eliminates these difficulties is to organize the population and area into a federated system. Citizens of a country do not communicate directly with the higher authority, but they can communicate locally. A set of people or an area has a facilitator, who is kept informed about their individual needs and abilities. Citizens can also send application-level information and requests to these facilitators and receive them in return. Facilitators use the information provided by citizens to transform these application-level messages and route them to the appropriate authorities. A federated system consists of a group of organizations, countries, regions, etc. that have joined together to form a larger organization or government.
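The facilitator role described above can be sketched as a small routing component. Everything named here (the Facilitator class, the topic-to-authority table, the fallback destination) is hypothetical and only illustrates the register, transform, and route steps.

```python
class Facilitator:
    """Local contact point that forwards citizens' requests to authorities."""

    def __init__(self, name, routes, fallback="higher-level facilitator"):
        self.name = name
        self.routes = routes      # topic -> responsible authority
        self.fallback = fallback  # used when no local route is known
        self.members = {}         # citizen -> declared needs and abilities

    def register(self, citizen, needs, abilities):
        """Keep the facilitator informed about an individual's needs and abilities."""
        self.members[citizen] = {"needs": list(needs), "abilities": list(abilities)}

    def route(self, citizen, topic, body):
        """Wrap an application-level message and pick the authority to receive it."""
        if citizen not in self.members:
            raise KeyError("unknown citizen: " + citizen)
        return {
            "to": self.routes.get(topic, self.fallback),
            "via": self.name,
            "from": citizen,
            "topic": topic,
            "body": body,
        }
```

The citizen only ever talks to the local facilitator; which authority finally receives the message is decided by the routing table, which is what keeps direct communication costs bounded as the population grows.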
India is a federated republic with a civil law system. It consists of 29 states and eight union territories. There are 638 districts in the states, 11 districts in Delhi, and 26 districts in the union territories. These districts are further organized into Tehsils/blocks/Talukas, of which there are about 5479 in the states and 269 in Delhi and the union territories. At the root level, India is divided into villages and wards; there are approximately 638,365 villages and wards across the country.
The system shown in Fig. 1 faces several issues: timely observation of common people is not possible; common people, who have lower precedence, cannot observe the higher authorities; welfare schemes for individuals never reach everyone in time; higher authorities always depend on their subordinates to learn the status of common people; the current election system of India uses 20–30% of a full year's budget; and it is unclear how the earnings of common people will be saved and used for society.
Fig. 1 Administrative organization of India at a glance. Nodes shown in the figure: Government of India; State Government(s); Division(s); District(s) (Zilla-Parishad); Tehsil(s); Block(s); Villages (Gram Panchayat); Municipal Corporation(s) (Mahanagar Palika); Municipality(s) (Nagar Palika); City Council(s) (Nagar Panchayat); Ward(s)
4 System Model
The proposed hi-tech world model is an n-layer platform that uses agent technology in the background. In this model, a new and unique naming/identification scheme is used to identify each citizen of a country and to give a unique name to the citizen's work agent(s). This naming scheme is composed of two parts. The first part consists of seven layers, which provide unique identification to the citizens of a country who are entitled to create work agent(s). The eighth layer determines how many work agents a citizen is allowed to create. The proposed system promises a unique identification for every citizen of a country and a unique name for every work agent within the global network for everyone using agent technology. In the proposed platform, the value of n may vary from 2 to 10. We have considered India as a case study, and for this platform n is taken as 7 plus 1. Here, seven levels are used to uniquely identify every citizen, and the eighth level identifies the work agent(s) created by a citizen.
The seven coordinating layers are arranged logically in a hierarchical fashion. Seven components of the name/identification are contributed by the seven layers of the system, and the eighth component is contributed by the agent owner/citizen. This new agent naming/identification scheme provides the basis for developing the hi-tech world, as shown in Fig. 2. Coordination among the layers is an important factor for the smooth functioning of the system. Further, coordination among the NAFMAs of different countries will be a major factor in completing the task of a work agent and of a citizen who is a resident of a country.
Fig. 2 n-Layer neighbor-assisted framework for mobile agents (NAFMA). Labels shown in the figure: province servers at Level 1, Level 2, Level n-1, and Level n (for example, P-1-Server and P-2-Server), together with nodes labeled D
The proposed system (NAFMAS) accepts information through a registration process, but it also opens a channel to accept databases in other formats. Anyone who wants to become a member of this system may register by giving his/her details with valid credentials and documents. The database created through the registration process, or through integration of the databases of other systems, is partially shared by this system among its layers [2, 7]. It also accepts the Aadhaar card database of India for gathering information about its citizens. This system converts the 12-digit decimal number into a 17-digit hexadecimal unique identification for a citizen. A person who registers with this system has the right to decide the one-nibble identification code range for his/her work agents. In general, a work agent may be a vehicle, a house, an income tax identification, a passport, a field of land, etc. A member registered with the NAFMAS system is allowed both types of services to fulfill day-to-day work. Deployment of the NAFMAS e-governance system will permit services such as the Internet, E-taxation, E-health schemes, E-education, social service, and E-conversation with persons of other countries. It will also facilitate events that are subject to resource, time, and money constraints, viz., E-voting, E-democracy, and E-suggestions [14].
The NAFMA database is useful for all kinds of systems that work
through e-governance. One such example is international law, which would need
to be framed to govern the use of the database of the said system. Once such
a law is implemented, international police may use this database to identify
persons/systems that are engaged in illegal work. When an organization/government
wants to allocate project(s) to any person/organization, it does not
need to collect information about that person again. Simply by using the
identification code of a person/organization, all details can be collected/verified
before the allotment of project(s). Further, this system may work
like a ready-made database at every layer. A person is not required to carry other identity
proof; the NAFMAS card alone will be sufficient because every kind of identification
mark, viz., photograph, fingerprint, and retina of every person, is collected by the said
system just once. The NAFMAS system keeps track of all kinds of changes that a person
might make in order to commit a crime. If a person makes such changes, they are
reflected at every layer of the system.
5 System Architecture
We have developed an e-governance system based on the neighbor-assisted framework
for mobile agents (NAFMA). It is a peer-to-peer n-layer architecture whose layers
are logically hierarchical in nature. Scalability and communication efficiency are the
major achievements of the proposed system. PMADE is the background technology, and
layer-specific intelligent agents (LSIAs) are integrated at each layer. The NAFMAS
e-governance system uses one LSIA at each layer. The number of LSIAs depends on the
governance structure of the country that is going to be a member of NAFMAS. If the
governance structure uses an n-level federated system, then at least n LSIAs will be
required to run the system smoothly.
We have considered the provinces of India as a case study. Figure 3 shows the NAFMAS
model for the provinces of India. The top layer contains the country intelligent agent(s)
(CIA) and maintains external linkage with the world. It manages information about
the country and serves a role similar to that of external affairs. It keeps information
about the culture, gender-wise population, sources of income from agriculture and
industry, area, and category (developing/developed country), etc. Besides the above
information, the CIA running at the P-1-Server keeps several other attributes.
The state intelligent agent (SIA) keeps track of state information at the P2 Server,
which is at layer 2. Similarly, the P3 Server hosts the district intelligent agent (DIA),
which lies at layer 3, the district headquarters, and maintains information about the
people of the district. The tehsil intelligent agent (TIA) resides on the P4 Server, which
is in charge of layer 4 and keeps records of the public of a tehsil. The block intelligent
agent (BIA) is owned by the P5 Server and keeps records of the citizens of a block
at layer 5. The P6 Server runs the Panchayat/ward intelligent agent (PIA), which owns
layer 6 of the system and does the bookkeeping for the citizens of a Panchayat/ward.
Layer 7 runs the P7 Server for a village/town. It uses a smart and intelligent agent
for keeping the records. The actual data maintained at this layer is replicated
across the different layers. Authenticity of the records/data of citizens is important for a
country. All the agents discussed above jointly decide the unique identification number
of a citizen. A citizen is permitted to allocate a 1-nibble ID to his/her work agent, which
is the eighth layer of NAFMAS.
Fig. 3 Citizen ID and work agent generation using intelligent agents (Layer 1: CIA; Layer 2: SIA; Layer 3: DIA; Layer 4: TIA; Layer 5: BIA; Layer 6: PIA; Layer 7: VIA; Citizen Id)
NAFMAS is an intelligent e-governance model that maintains a heterogeneous
collection of databases. These databases are accessible only to
NAFMAS members. The system maintains information-specific agents (ISAs)
to help users. An ISA handles user/client tasks with the help of system
intelligent agents (SIAs). Communication may take place between an ISA and
SIA(s) for processing queries/exchanging information. The system keeps a record
of visiting work agent(s), ISA(s), SIA(s), and system resources accessed during the
execution of assigned task(s). It is also required to distinguish between the
task(s), the information access agent (IAA), and other agent(s). An IAA is permitted to
access the databases in read-only mode. These databases may belong to a government
organization/department or a private body. Security of the records is important at every
stage and is provided by NAFMAS using the PMADE security features
[15].
6 Unique Citizen Identification Code (UCIC)
The NAFMAS e-governance system ensures a unique citizen identification code
(UCIC) for every citizen of a country. It uses a downstream concept for generating the
identification codes (IDs) of the different layers. This process is carried out at system
boot-up time. Higher-level layers are responsible for providing IDs to lower-level
layers. A layer at a lower level in the hierarchy is responsible for prefixing the main
part of the identification code to its own local ID. All the layers
in the system thus jointly contribute to the generation of the new ID of a layer. The country
studied in this chapter is India (as a case study). At the primary level, it is a land of
villages and, at the secondary level, of towns. The lowest layer (layer 7) is at Province Level 7.
A 17-hex-digit identification code (ID) is issued by the said system to every person
of a country. A citizen is permitted to allocate a 1-nibble ID to his/her WA(s). Figure
4 illustrates a sample identification code. This identification code consists of a 2-nibble
country provincial code, a 2-nibble state provincial code, a 2-nibble district provincial
code, a 1-nibble tehsil provincial code, a 1-nibble block provincial code,
a 2-nibble Panchayat provincial code, a 1-nibble village/town provincial code,
and a 5-nibble citizen identification number (ID). The
first field of the code is a 1-nibble priority code, which brings the total to 17 hex
digits. It is used to represent 1 for a developed country,
2 for a developing country, and any other number as needed, to be decided by an
international body. Here, 0 is used for no priority.
A nibble (4 binary digits) forms one hexadecimal digit, so for
simplicity the nibble is used as the unit of data size in the system. The sample format shown
in Fig. 4 allows a total population of 1,048,576 in a village/town. Each citizen
is allowed to launch 16 work agents simultaneously at 16 sites. This
system generates 252 million unique user identification codes. Initially, a citizen is
required to register himself/herself through the local provinces. This may be done through
a ward/village/town layer of the system. The NAFMAS system appends the citizen ID to
the system ID to generate a unique identification code for a citizen. Figure 4
presents 2 (priority: developing country), 5B (India), 0B (Delhi), 05 (West Delhi
District), 2 (Punjabi Bagh, Tehsil), 8 (Nilothi block), 01 (Chander Vihar, Panchayat),
5 (Shani Bazar Town), 0001C (Dr. Munshi Yadav), and 4 (work agent) for Dr.
Munshi Yadav’s agent.

Field No   Source of Data   Maximum Data   Size in Nibble   Id in Hexadecimal
1          Priority         1-16           1                2
2          CPC              1-256          2                5B
3          SPC              1-256          2                0B
4          DPC              1-156          2                05
5          TPC              1-16           1                2
6          BPC              1-16           1                8
7          PPC              1-256          2                01
8          VPC              1-16           1                5
9          Citizen Code     1-1048576      5                0001C
10         Agent Code       1-16           1                4

Fig. 4 Citizen ID generation fields using WA. Province Level 7, Village/town (VPC); Province
Level 6, Panchayat (PPC); Province Level 5, Block (BPC); Province Level 4, Tehsil (TPC);
Province Level 3, Districts (DPC); Province Level 2, State (SPC); Province Level 1, Country (CPC)
Dr. Yadav is a citizen of India: Delhi State; West Delhi District; Punjabi
Bagh Tehsil; Nilothi block; Chander Vihar Panchayat; and Shani Bazar Town.
Here, 5B is the code of India, the personal ID of Dr. Munshi Yadav is 0001C, and his
work agent ID is 4. Other details are as discussed above.
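The field layout described in this section can be sketched in code. The following Python fragment is only an illustration under stated assumptions, not the NAFMAS implementation: the names `FIELDS`, `encode_ucic`, `decode_ucic`, and `agent_id` are hypothetical, and the field order (priority first, citizen number last) is assumed from the sample values of Fig. 4.

```python
from typing import Dict

# (field name, width in nibbles). The order is an assumption taken from the
# Fig. 4 sample; the names are illustrative and not part of NAFMAS.
FIELDS = [
    ("priority", 1), ("country", 2), ("state", 2), ("district", 2),
    ("tehsil", 1), ("block", 1), ("panchayat", 2), ("village", 1),
    ("citizen", 5),
]
UCIC_LENGTH = sum(width for _, width in FIELDS)  # 17 hex digits


def encode_ucic(values: Dict[str, int]) -> str:
    """Concatenate the fields into a 17-hex-digit citizen code."""
    digits = []
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < 16 ** width:
            raise ValueError(f"{name}={value} does not fit in {width} nibble(s)")
        digits.append(format(value, f"0{width}X"))
    return "".join(digits)


def decode_ucic(code: str) -> Dict[str, int]:
    """Split a 17-hex-digit citizen code back into its fields."""
    if len(code) != UCIC_LENGTH:
        raise ValueError(f"expected {UCIC_LENGTH} hex digits, got {len(code)}")
    values, pos = {}, 0
    for name, width in FIELDS:
        values[name] = int(code[pos:pos + width], 16)
        pos += width
    return values


def agent_id(ucic: str, agent: int) -> str:
    """Append the citizen-chosen 1-nibble work agent code (0 to 15)."""
    if not 0 <= agent < 16:
        raise ValueError("a work agent code must fit in one nibble")
    return ucic + format(agent, "X")


# Sample values quoted in the text for Fig. 4.
sample = {
    "priority": 0x2, "country": 0x5B, "state": 0x0B, "district": 0x05,
    "tehsil": 0x2, "block": 0x8, "panchayat": 0x01, "village": 0x5,
    "citizen": 0x0001C,
}
sample_ucic = encode_ucic(sample)
sample_agent = agent_id(sample_ucic, 4)
```

With these assumptions, the sample fields concatenate to `25B0B05280150001C`, and the work agent code adds an eighteenth hex digit. One nibble per agent is also what limits a citizen to 16 simultaneous work agents.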
This identification code is hierarchical in nature. The code is generated upstream
toward the root of the tree, and every component of the code passes through the
branch(es) of the tree. This process makes the identification available to each of its
ancestor layers. A WA moving across the global network possesses a unique
identification code and is permitted to communicate with any ancestor layer through
its local village/town layer agents. In case of failure, corruption, or maliciousness,
a citizen work agent is permitted to approach its next higher ancestor
in the branch directly. A citizen work agent is not allowed to route through a branch to
which it does not belong, because the identification of that citizen work agent is
supported only by the branch to which it belongs. In case of roaming of the citizen
work agent, the system may also be enhanced to provide all privileges, viz., create
work agent assign to the work agent.
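The branch restriction described above can be illustrated with a prefix check. The sketch below is hypothetical: it assumes that each layer's ID is the cumulative prefix of the citizen code up to that layer, and that the priority nibble belongs to the country-layer ID. Neither detail is confirmed by the text, and the function names are invented for illustration.

```python
from typing import List, Optional

# Hex digits contributed by each ancestor layer, from country (layer 1) down to
# village/town (layer 7). Grouping the priority nibble with the country code is
# an assumption made for this sketch.
LAYER_WIDTHS = [
    ("country", 3), ("state", 2), ("district", 2), ("tehsil", 1),
    ("block", 1), ("panchayat", 2), ("village", 1),
]


def ancestor_prefixes(ucic: str) -> List[str]:
    """Return the assumed layer IDs on the citizen's branch, root first."""
    prefixes, end = [], 0
    for _, width in LAYER_WIDTHS:
        end += width
        prefixes.append(ucic[:end])
    return prefixes


def may_route_through(ucic: str, layer_id: str) -> bool:
    """A work agent may only use layers that lie on its own branch."""
    return layer_id in ancestor_prefixes(ucic)


def next_higher_ancestor(ucic: str, layer_id: str) -> Optional[str]:
    """Layer to approach directly when layer_id has failed (None at the root)."""
    prefixes = ancestor_prefixes(ucic)
    index = prefixes.index(layer_id)  # raises ValueError if off-branch
    return prefixes[index - 1] if index > 0 else None
```

For the sample code `25B0B05280150001C`, the village/town layer would be `25B0B0528015` and its next higher ancestor `25B0B052801` under these assumptions.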
7 Implementation and Performance Study
The NAFMA-based e-governance system was implemented and tested on a network of
60 machines. These machines are divided into 40 networks. To implement all the
state provinces, 38 networks are established, one for each state/union territory.
Two state networks are completely implemented, and all their layers are equipped
with 1/2 machines. Each network has a gateway that works as a province server.
The remaining four machines work as country servers for different countries.
Every machine is equipped with the PMADE secure mobile agent platform, and the
NAFMAS system is executed on top of it. Each machine is configured as
follows: Intel(R) Core(TM) i7-8700 CPU @ 3.20 GHz (3.19 GHz), 8.00 GB RAM,
64-bit operating system, x64-based processor, and Windows 10 Pro. The arrival
of WAs at different sites/servers follows a Poisson distribution. Further,
the registration of 1 million citizens is done randomly. The system
is also updated with some UIDAI database records. More than 5000 WAs were launched
by different users/clients at different traffic loads (high/peak, medium, and
low) on the network. These WAs were allowed to work at different locations of the
network. In the implementation of the said system, different performance metrics
were used, viz., fault tolerance, security, network delay, and failure of different layers.
The system was kept running continuously for a few weeks under different conditions.
Performance measurement of the NAFMA e-governance system depends on various
implementation factors. The case study implementation of the said system inherently
consists of eight layers. The country layer sits at the first level and a citizen at the eighth level.
Network delay (ND) may occur in WA transportation. The movement of a WA may
be upward (from layer 8 to 1) or downward (from layer 1 to 8) in the system, and this
movement takes time accordingly.
The per-record registration time is 10 minutes for entering a fresh record into the
database. In the implementation of the said system, it is assumed that the minimum number
of records per village/town is 500 citizens and the maximum is 10,000. In city wards,
the population is larger and is assumed to be a minimum of 5000 and a maximum of 50,000.
Record processing time (RPT) depends on the network traffic. Figure
5 shows the time required for registration of all the citizens of the country (India). It shows
that if the system runs fault free, about 100 days will be required to complete the
registration process. When random network failures occur, the maximum registration
time increases to about 119 days. The maximum RPT also increases, to 876 ms,
compared with 776 ms without failure.
In the implementation of the system, between 500 and 800,000
queries were generated to study the processing time of the system.
Figures 6 and 7 show the query processing time (QPT) without and with network
failure, respectively. It is observed that network failure affects system performance,
but its effect is very small.
8 Discussion
The NAFMAS-based e-governance system warrants a food guarantee for every poor person in the
locality of every province. This system will help realize the duty of every publicly elected
official. The said system is fully distributed. The one-time initial implementation cost
of the system will be about Rs. 600 crores for a huge country like India. The UIDAI
system of the government of India does not warrant a food guarantee for every
poor person in the locality of every province. Further, it also does not warrant a weekly
unemployment record like that of the United States. But the NAFMAS governance system
warrants all kinds of citizen-oriented applications.
Fig. 5 Registration and record processing time
Fig. 6 Query processing time (QPT) without fault
Fig. 7 Query processing time (QPT) with fault
9 Conclusion and Future Work
This chapter gives an overview of the n-layer NAFMAS system for promoting e-governance
toward a hi-tech world. NAFMAS layers work in a hierarchical fashion. Both
internal and external services are available to the citizens of a country. PMADE
plays a key role in the deployment of the said system. Citizens are allowed to
use 16 WAs simultaneously to accomplish their tasks across the world. A WA is a
mobile and intelligent agent and is sufficient for the dissemination of work/accessing
information. NAFMAS integrates fault tolerance and reliability attributes from
PMADE for the implementation of a successful e-governance system. The NAFMAS-based
governance system warrants most citizen-oriented applications, viz.,
daily food for the poor in the vicinity of a province. A weekly unemployment record
of citizens is maintained at the province level. The NAFMAS card warrants every
human-related application, from identity to food guarantee and from income tax to voting,
whereas UIDAI only warrants identity to the citizens of India. In the future, more
rigorous properties of the said system will be tested.
References
1. West, D. M. (2004). E-government and the transformation of service delivery and citizen
attitude. Public Administration Review, 64(1), 15–27.
2. Patel, R. B. (2004). Design and implementation of a secure Mobile agent platform for
Distributed computing. PhD Thesis. Department of Electronics and Computer Engineering,
IIT Roorkee.
3. Bagchi, S., Srinivasan, B., Kalbarczyk, Z., & Iyer, R. K. (2000). Hierarchical error detection in
a software implemented fault tolerance (SIFT) environment. IEEE Transactions on Knowledge
and Data Engineering, 12(2).
4. Azar, Y., Kutten, S., and Patt Shamir, B. (2003). Distributed error confinement. In ACM PODC
(pp. 33–42).
5. Jarvenpaa, S. L., & Ives, B. (2015). The global network organization of the future: Information
management opportunities and challenges. Journal of Management Information Systems,
10(4).
6. Maaravi, Y., Heller, B., Shoham, Y., Mohar, S., & Deutsch, B. (2021). Ideation in the digital
age: literature review and integrative model for electronic brainstorming. Review of Managerial
Science, 15(6), 1431–1464.
7. Habermas, J. (1996). Citizenship and national identity between facts and norms, contributions
to a discourse theory of law and democracy (pp 491–515). Translation by William Regh. MIT
Press.
8. Tsuchiya, T., Sawano, H., Lihan, M., Yoshinaga, H., & Koyanagi, K. (2009). A distributed
information retrieval manner based on the statistic information for ubiquitous services.
Progress in Information, 63–77.
9. Tryfonopoulos, C. (2008). P2P information retrieval and filtering (pp. 607–611). Springer.
10. Jung, J. (2009). Consensus-based evaluation framework for distributed information retrieval
system. Knowledge and Information System, 18(2), 199–211.
11. McLean, M., & Tawfik, J. (2003). The Role of information and communication technology in
the Modernization of e-Government, pp. 237–245.
12. Patel, R. B., & Garg, K. (2004). A new paradigm for mobile agent computing. WSEAS
Transaction on Computers, 1(3), 57–64.
13. K-13011/26/2012-DD-I, Gazette of India. Retrieved 7 July 2015.
14. Vaid, R., & Patel, R. B. (2009, October). A 7-layer model for modernizing the World: a step
towards a Hi-Tech World. ARTCOM’09: Proceedings of the 2009 international conference on
advances in recent technologies in communication and computing, pp 840–843.
15. Patel, R. B., & Garg, K. (2005). A Flexible Security Framework for Mobile Agent Systems.
Control and Intelligent Systems, 33(3).
A Comparative Study of Machine
Learning Techniques for Phishing
Website Detection
Mohammad Farhan Khan, Rohit Kumar Tiwari, Sushil Kumar Saroj,
and Tripti Tripathi
1 Introduction
The Internet has become an integral part of our lives in recent years due to
the availability of various online services such as online banking, social media,
and entertainment. These online services have caused an exponential increase in
Internet users, which in turn has given cyber criminals an opportunity for
cyber fraud, causing huge financial losses to users every year. Cyber criminals use
various techniques to harm users’ systems or steal their sensitive information,
such as usernames, passwords, email IDs, and other credentials, by posing
as a trustworthy entity [1]. Phishing is a cybercrime technique in which
the attacker poses as a trustworthy entity to collect sensitive
information through email or websites. In a phishing website attack, the attacker creates a
fake website by cloning a legitimate website and sends an email asking target users to
update their information. Once a user reads the email and clicks on the URL
of the website, it redirects the user to the fake website, where the user enters
credentials to update the information. The attacker obtains the credentials
and uses them for financial or other types of fraud.
Phishing website attacks have become one of the main challenges in cyberspace.
They cause huge financial losses to users every year as the number of Internet
users increases. According to the RSA quarterly fraud report for the period from 1 January
to 31 March 2018, phishing is responsible for 48 percent of all cyberattacks [2].
The report says that Canada, the United States, India, and Brazil are the countries
most affected by phishing attacks. Figure 1 shows the statistics of phishing attacks in
various countries in the above period.
M. F. Khan · R. K. Tiwari () · S. K. Saroj · T. Tripathi
Department of Computer Science & Engineering, Madan Mohan Malaviya University of
Technology, Gorakhpur, Uttar Pradesh, India
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
S. Pandey et al. (eds.), Role of Data-Intensive Distributed Computing Systems in
Designing Data Solutions, EAI/Springer Innovations in Communication and
Computing, https://doi.org/10.1007/978-3-031-15542-0_6
Fig. 1 Percentage of phishing attack in different countries during January to March 2018 (pie chart; legend: Canada, United States, India, Brazil, Netherlands, Colombia, Spain, Mexico, Germany)
Due to the increasing number of cyber frauds through phishing websites, it has
become necessary to develop techniques to prevent them. Many techniques have
been proposed to detect phishing websites with the help of blacklist and whitelist
databases such as PhishTank. In this approach, whenever a user visits a website, its
URL is checked against the database. If the URL is present in the blacklist, then the
website is treated as phishing. However, these techniques are not sufficient to detect phishing
websites, because new phishing websites are created every minute and adding them to the
database takes time. Therefore, there is a need for an intelligent phishing website
detection system that can detect phishing websites automatically.
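The list-based approach and its limitation can be illustrated with a small sketch. The code below is a hypothetical in-memory lookup, not the PhishTank service or its API; the function names and the URL normalization are assumptions made for illustration.

```python
from typing import Set
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Reduce a URL to scheme, lowercase host, path, and query.

    The port and fragment are dropped; this is a simplification chosen for
    the sketch, not a standard canonical form.
    """
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    path = parts.path or "/"
    query = f"?{parts.query}" if parts.query else ""
    return f"{parts.scheme}://{host}{path}{query}"


def lookup(url: str, blacklist: Set[str], whitelist: Set[str]) -> str:
    """Classify a URL using only the two lists."""
    key = normalize(url)
    if key in blacklist:
        return "phishing"
    if key in whitelist:
        return "legitimate"
    return "unknown"  # a newly created phishing site ends up here


blacklist = {normalize("http://evil.test/login")}
whitelist = {normalize("https://example.com/")}
```

A URL that appears in neither list comes back as `unknown`, which is exactly the gap that motivates a feature-based detector.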
In this paper, we have developed intelligent phishing detection systems based
on machine learning techniques that use structural features of the website to detect
whether a website is phishing or not. The features used for training the detector are
based on the structure of the webpage and URL. We have used various machine
learning algorithms to train a detector using a standard dataset consisting of phishy
and non-phishy websites. We have also compared them in terms of various metrics
to show the usefulness of the machine learning algorithm for phishing website
detection.
The remainder of this paper is organized as follows: Section 2 discusses
the categorization of phishing website detection techniques with a detailed review
of them. Section 3 discusses the various structural features that are used
to train the phishing website detector. Section 4 explains the working of the
proposed machine learning-based phishing detector, followed by results
and discussion in Sect. 5. Finally, the conclusion is presented in Sect. 6.
2 Literature Review
Phishing website detection is one of the major challenges in cybersecurity.
Various methods exist in the literature to detect phishing websites. Figure 2
presents the taxonomy of phishing website detection systems. In user awareness-based
techniques, a user recognizes a phishing website from his/her experience or
knowledge, whereas software detection-based techniques use automated techniques
to detect phishing websites. Vision-based and web page structure-based techniques use
website design and structure to detect phishing. The vision-based technique detects
a phishing website based on a visual comparison of legitimate and illegitimate
websites, using interest point detector techniques of computer vision to locate
a phishing website. The web page structure-based technique detects a phishing
website using the structure of the web page, such as referencing, HTML structure, and URL.
Various authors have proposed phishing website detectors based on the above
categorizations. Some of them are discussed below.
Rao et al. [3] proposed an approach to detect phishing websites based
on machine learning techniques. They trained eight machine learning algorithms
to detect phishy websites, of which the random forest-based technique was the most
accurate. Sönmez et al. [4] also proposed a machine learning-based phishing website
detector. They first extracted features of the website and used them to train the
classifier. They used support vector machine, naive Bayes, and extreme learning
machine (ELM) as machine learning techniques, of which ELM had the highest
accuracy. Sharmin et al. [5] proposed a supervised learning technique in which
they addressed the problem of spam detection on social media platforms. They
worked on the comment section of YouTube to filter out spam comments. They tried
to solve the phishing problem by applying various methods, among which the ensemble
classifier gave the best response.
Altaher et al. [6] proposed a hybrid algorithm that combines the KNN and SVM
methods. They first applied the KNN method to remove noisy data, followed by
the SVM method to classify the phishy website. This hybrid approach
has 90.04% accuracy. Karnik et al. [7] proposed a phishy website detector using
the SVM algorithm to solve the phishing attack problem. They used features such as
webpage contents, DNS information, link structure, textual properties, and network
traffic of the webpage to train the detector. The proposed approach has 95% accuracy.
Abunadi et al. [8] presented a review of features used in machine learning-based
techniques to detect phishy websites. They also added some new features that are
helpful in phishing website detection and showed experimentally that the new features
are more helpful in phishy website detection.
Fig. 2 Taxonomy of phishing detection techniques (nodes: Phishing Detection Technique; User Awareness; Software; Vision; Web Page Structure)
James et al. [9] proposed a machine learning-based approach to prevent phishing attacks.
They used three more features, namely host properties, page importance properties,
and lexical features, to train the detector. Xiang et al. [10] proposed an extension
of CANTINA called CANTINA+ to detect phishy websites. In CANTINA+,
they added new features to the previous features to get better accuracy. In the first
step, the proposed system filters out websites without a login form to reduce the
false-positive rate. The proposed approach utilizes 15 attributes drawn from the URL, HTML
document object model, search engines, and other services to train an SVM to identify
phishing websites. The true-positive rate of CANTINA+ is 92%, and the false-positive
rate is 0.4%. Aburrous et al. [11] presented a method that uses data mining
techniques to identify and prevent phishing
attacks on Internet banking systems. They used 27 phishing characteristics divided into
six categories, namely source code and JavaScript, protection and encryption, URL and
domain identity, content and page style, the web address bar, and social human
factors, to train the detector. Wenyin et al. [12] proposed a method to detect phishy
websites in two stages. The first stage detects keywords and suspicious URLs
on the local email server. After suspicious URLs or keywords are detected in an
email, the second stage compares the layout, block-level, and style similarity of
the suspicious webpage to detect a phishy website.
3 Phishing Websites Features
Phishing websites follow some patterns related to web page structure and URL,
and there are many features that help identify them. These patterns and features
are used to categorize websites as legitimate or illegitimate. Figure 3
shows the classification of phishing website features.
3.1 Address Bar Features
Fig. 3 Phishing websites features (Phishing Features: Address Bar Features; Abnormal Features; HTML and JavaScript Features; Domain Features)
Fig. 4 URL and its components
Address bar features are features obtained from the URL or address of
a website. Various patterns in the address of a website indicate whether the
website is phishy or not. Figure 4 shows an example of the address of a website
with its various parts. A typical URL consists of a protocol, a domain, and a file
path. Some important address bar features used for detecting phishing websites are
discussed below.
• IP address: An IP address uniquely identifies a computer on the Internet.
Websites are hosted on servers and are accessed through URLs.
Sometimes the URL contains an IP address, which normally does not appear in the
URL of a legitimate website. So, if the URL of a website contains an IP address, the
website is treated as phishy. The rule used to identify a phishy website is given in Eq. 1:
$$
\text{Rule: If } \begin{cases}
\text{IP address exists in URL} \rightarrow \text{Phishy} \\
\text{Otherwise} \rightarrow \text{Legitimate}
\end{cases} \tag{1}
$$
• URL length: URL length also indicates whether a website is phishing or not.
The rule for classifying a website as phishing, legitimate, or suspicious is given
in Eq. 2:
$$
\text{Rule: If } \begin{cases}
\text{URL length} < 54 \rightarrow \text{Legitimate} \\
\text{URL length} \ge 54 \text{ and} \le 75 \rightarrow \text{Suspicious} \\
\text{Otherwise} \rightarrow \text{Phishy}
\end{cases} \tag{2}
$$
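• URL with @ symbol: If a URL contains an @ symbol, it is categorized as a
phishing URL; otherwise, it is legitimate. The rule is shown in Eq. 3:
$$
\text{Rule: If } \begin{cases}
\text{URL has @ symbol} \rightarrow \text{Phishy} \\
\text{Otherwise} \rightarrow \text{Legitimate}
\end{cases} \tag{3}
$$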
• Prefix or suffix: To make users believe that they are dealing with a legitimate
website, phishers add prefixes or suffixes separated by “-” to the domain name.
Users therefore think that they are working with a valid webpage with a genuine
domain name. The rule used to categorize a website based on suffix and prefix is
given in Eq. 4:
$$
\text{Rule: If } \begin{cases}
\text{Domain has “-” symbol} \rightarrow \text{Phishy} \\
\text{Otherwise} \rightarrow \text{Legitimate}
\end{cases} \tag{4}
$$
• Subdomain and multisubdomains: A website is classified as legitimate, suspicious,
or phishy based on the number of subdomains in its URL. The rule used
to classify it is shown in Eq. 5:
$$
\text{Rule: If } \begin{cases}
\text{Dots in the domain part} < 3 \rightarrow \text{Legitimate} \\
\text{Dots in the domain part} = 3 \rightarrow \text{Suspicious} \\
\text{Otherwise} \rightarrow \text{Phishy}
\end{cases} \tag{5}
$$
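The five address bar rules (Eqs. 1 to 5) can be sketched as small functions. This is a minimal illustration rather than the authors' feature extractor: the function names and string labels are hypothetical, the URL is assumed to include its scheme (for example `http://`), and only plain dotted or IPv6 addresses are recognized as IP hosts.

```python
import ipaddress
from typing import Dict
from urllib.parse import urlsplit

PHISHY, SUSPICIOUS, LEGITIMATE = "phishy", "suspicious", "legitimate"


def _host(url: str) -> str:
    """Host part of the URL, or an empty string if there is none."""
    return urlsplit(url).hostname or ""


def ip_address_rule(url: str) -> str:
    """Eq. 1: an IP address used as the host marks the URL as phishy."""
    try:
        ipaddress.ip_address(_host(url))
    except ValueError:
        return LEGITIMATE
    return PHISHY


def url_length_rule(url: str) -> str:
    """Eq. 2: below 54 legitimate, 54 to 75 suspicious, longer phishy."""
    length = len(url)
    if length < 54:
        return LEGITIMATE
    if length <= 75:
        return SUSPICIOUS
    return PHISHY


def at_symbol_rule(url: str) -> str:
    """Eq. 3: any @ in the URL marks it as phishy."""
    return PHISHY if "@" in url else LEGITIMATE


def prefix_suffix_rule(url: str) -> str:
    """Eq. 4: a hyphen in the domain marks it as phishy."""
    return PHISHY if "-" in _host(url) else LEGITIMATE


def subdomain_rule(url: str) -> str:
    """Eq. 5: count the dots in the domain part."""
    dots = _host(url).count(".")
    if dots < 3:
        return LEGITIMATE
    if dots == 3:
        return SUSPICIOUS
    return PHISHY


def address_bar_features(url: str) -> Dict[str, str]:
    """Apply all five rules and collect the labels."""
    return {
        "ip_address": ip_address_rule(url),
        "url_length": url_length_rule(url),
        "at_symbol": at_symbol_rule(url),
        "prefix_suffix": prefix_suffix_rule(url),
        "subdomain": subdomain_rule(url),
    }
```

Each rule returns a label that can later be encoded as a number and passed to a classifier together with the other features.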
3.2 Abnormal Features
Abnormal features are the information or features obtained from a website. We have
examined different abnormal features. Some important abnormal features used for
identifying phishing websites are presented below.
• Request URL: Websites use different application programming interfaces (APIs)
to access some resources. An API is identified by its URL and request data. A
website that accesses APIs under its own domain name is legitimate. The percentage of
request URLs that point to another domain name is used to identify
whether a website is phishy or not. If this percentage is less
than 22%, the website is legitimate. If it is between 22% and 61%, the website
is suspicious; otherwise, the website is a phishing website. The rule to detect
phishing websites is given in Eq. 6:
$$
\text{Rule: If } \begin{cases}
\text{Request URL \%} < 22\% \rightarrow \text{Legitimate} \\
\text{Request URL \%} \ge 22\% \text{ and} < 61\% \rightarrow \text{Suspicious} \\
\text{Otherwise} \rightarrow \text{Phishy}
\end{cases} \tag{6}
$$
• Anchor tag URL: A website is classified as phishy based on the URLs of its
anchor tags. An anchor is defined by the HTML <a> element. A website can be
classified as phishy based on the percentage of anchor URLs pointing to another domain.
The rule used to classify a website as phishy is given in Eq. 7:
$$
\text{Rule: If } \begin{cases}
\text{Anchor URL \%} < 31\% \rightarrow \text{Legitimate} \\
\text{Anchor URL \%} \ge 31\% \text{ and} \le 67\% \rightarrow \text{Suspicious} \\
\text{Otherwise} \rightarrow \text{Phishy}
\end{cases} \tag{7}
$$
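Eqs. 6 and 7 both depend on the share of URLs that leave the page's own domain, which can be sketched with the standard-library HTML parser. The code is an illustration under assumptions: the percentage is taken over external URLs, only the `src` attributes of a few embedding tags are counted as request URLs, and all names are hypothetical.

```python
from html.parser import HTMLParser
from typing import Dict, List
from urllib.parse import urljoin, urlsplit

PHISHY, SUSPICIOUS, LEGITIMATE = "phishy", "suspicious", "legitimate"
# Tags whose src attribute is counted as a request URL (an assumption).
EMBED_TAGS = {"img", "script", "iframe", "audio", "video", "embed"}


class LinkCollector(HTMLParser):
    """Collect resource URLs (src) and anchor URLs (href) from a page."""

    def __init__(self) -> None:
        super().__init__()
        self.request_urls: List[str] = []
        self.anchor_urls: List[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self.anchor_urls.append(attributes["href"])
        elif tag in EMBED_TAGS and attributes.get("src"):
            self.request_urls.append(attributes["src"])


def external_percent(page_url: str, links: List[str]) -> float:
    """Percentage of links whose host differs from the page's host."""
    if not links:
        return 0.0
    page_host = urlsplit(page_url).hostname
    external = sum(
        1 for link in links
        if urlsplit(urljoin(page_url, link)).hostname != page_host
    )
    return 100.0 * external / len(links)


def request_url_rule(percent: float) -> str:
    """Eq. 6: below 22 legitimate, 22 up to (not including) 61 suspicious."""
    if percent < 22:
        return LEGITIMATE
    if percent < 61:
        return SUSPICIOUS
    return PHISHY


def anchor_url_rule(percent: float) -> str:
    """Eq. 7: below 31 legitimate, 31 to 67 inclusive suspicious."""
    if percent < 31:
        return LEGITIMATE
    if percent <= 67:
        return SUSPICIOUS
    return PHISHY


def abnormal_features(page_url: str, html: str) -> Dict[str, str]:
    """Parse the page and apply both rules."""
    collector = LinkCollector()
    collector.feed(html)
    return {
        "request_url": request_url_rule(
            external_percent(page_url, collector.request_urls)),
        "anchor_url": anchor_url_rule(
            external_percent(page_url, collector.anchor_urls)),
    }
```

Relative links are resolved against the page URL before the hosts are compared, so `/home` counts as an internal link.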
3.3 HTML and JavaScript Features
These features are extracted from the HTML and JavaScript files of a website.
The various features that can be extracted from these files to
classify a website as phishy or non-phishy are discussed below.
• Redirect page: Page redirection is a situation in which we click on a URL
to reach page x but are redirected to page y. The rule used to classify a website as
phishy is given in Eq. 8:
$$
\text{Rule: If } \begin{cases}
\text{page redirect} \le 1 \rightarrow \text{Legitimate} \\
\text{page redirect} > 1 \text{ and} < 4 \rightarrow \text{Suspicious} \\
\text{Otherwise} \rightarrow \text{Phishy}
\end{cases} \tag{8}
$$
• Hide link: JavaScript supports various functions to make a website responsive
to user mouse actions. One of the functions provided by it is onMouseOver(),
which hides the text when the mouse hovers over it. Phishers use this feature to
hide phishy links. The rule used to categorize a website as phishy is given in Eq. 9:
$$
\text{Rule: If } \begin{cases}
\text{status bar change on onmouseover} \rightarrow \text{Phishy} \\
\text{no status bar change} \rightarrow \text{Suspicious} \\
\text{Otherwise} \rightarrow \text{Legitimate}
\end{cases} \tag{9}
$$
• Right click disable: Phishers use JavaScript to block right clicks on
a webpage, which prevents users from viewing the page source and helps the
phishers commit cyber fraud. The rule used to classify a website as phishing or
legitimate based on this feature is given in Eq. 10:
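\[
\text{Rule: If }
\begin{cases}
\text{disabled right click} & \rightarrow \text{Phishy} \\
\text{alert on right click} & \rightarrow \text{Suspicious} \\
\text{Otherwise} & \rightarrow \text{Legitimate}
\end{cases}
\tag{10}
\]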
3.4 Domain Features
Domain features are the information or features extracted from the domain of
a website. The domain-based features used to categorize a website as phishy or
legitimate are discussed below.
• Domain age: The domain age of a website is obtained from the WHOIS database.
The domain age information helps us to identify a phishing website. If a domain
is old, it indicates that the website is valid and was not created for phishing
purposes. The rule used to classify a website as phishy based on domain age is
given in Eq. 11:
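\[
\text{Rule: If }
\begin{cases}
\text{domain age} \ge 6 \text{ months} & \rightarrow \text{Legitimate} \\
\text{Otherwise} & \rightarrow \text{Phishy}
\end{cases}
\tag{11}
\]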
• DNS record: A website is categorized as phishing or not based on its domain
name system (DNS) record. The rule used to classify it is given in Eq. 12:
\[
\text{Rule: If }
\begin{cases}
\text{no DNS record} & \rightarrow \text{Phishing} \\
\text{Otherwise} & \rightarrow \text{Legitimate}
\end{cases}
\tag{12}
\]
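The count-based and domain-based rules (Eqs. 8, 11, and 12) can be sketched in the same style. This is an illustrative Python sketch under stated assumptions: the function names are hypothetical, "6 months" is interpreted as six completed calendar months, and the redirect count, creation date, and DNS flag are taken as already-extracted inputs rather than looked up.

```python
from datetime import date


def classify_redirects(redirect_count):
    """Eq. 8: at most 1 redirect legitimate, 2 or 3 suspicious, else phishy."""
    if redirect_count <= 1:
        return "Legitimate"
    if redirect_count < 4:
        return "Suspicious"
    return "Phishy"


def classify_domain_age(created, today):
    """Eq. 11: a domain that is at least 6 months old is legitimate.

    Assumption: age is measured in completed calendar months.
    """
    months = (today.year - created.year) * 12 + (today.month - created.month)
    if today.day < created.day:
        months -= 1  # the current month is not yet complete
    return "Legitimate" if months >= 6 else "Phishy"


def classify_dns_record(has_dns_record):
    """Eq. 12: a website with no DNS record is classified as phishing."""
    return "Legitimate" if has_dns_record else "Phishing"
```

In practice the creation date would come from a WHOIS lookup and the DNS flag from a resolver query; both are left outside the sketch.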
4 Proposed Method
The proposed method used to detect whether a website is phishy or non-phishy is
shown in Fig. 5. It aggregates the different website features and provides them as
input to a trained machine learning classifier, which classifies the website as
phishy or non-phishy. The classifier is trained on a standard dataset. We have used
six machine learning algorithms, namely k-nearest neighbor (KNN), logistic model
tree, support vector machine, naive Bayes, multilayer perceptron, and decision tree,
to classify a website as legitimate or non-legitimate. The details of the
machine learning algorithms are discussed below.
Fig. 5 Flowchart of proposed method: address bar, abnormal, domain, and HTML and
JavaScript features are passed to a machine learning classifier, which outputs
phishy/non-phishy website
• K-nearest neighbor algorithm: KNN is a simple and commonly used
algorithm. It is a type of supervised learning method that classifies a new website
based on similarity measures. It uses a distance measure to find the distance of the
new website from the phishy and non-phishy websites available in the dataset,
and based on this similarity, it predicts whether the website is phishy
or not. The distance measure generally used in KNN is Euclidean; however,
Hamming distance is also used in some cases.
• Logistic model tree: The logistic model tree is a supervised learning
classification model built by combining logistic regression and a decision tree. It
uses the concepts of both. The decision tree represents the problem as a tree,
whereas logistic regression generates the result as a discrete
value such as yes or no, 0 or 1, true or false, or high or low. So, we can say
that the logistic model tree combines the two methods into a model tree
and generates a tree whose nodes contain logistic regression functions.
• Support vector machine: Support vector machine (SVM) is one of the most
effective machine learning classifiers and is used in various fields such as face
recognition, cancer classification, and many more. It is a supervised classification
method that separates data using a hyperplane, which acts as a
decision boundary between the various classes. It represents the training
examples as points in space such that the points of different categories are
separated by a gap as wide as possible. It can also perform nonlinear classification
and work well with large datasets.
• Naive Bayes: Naïve Bayes is a simple probabilistic classifier based on Bayes'
theorem with an assumption of independence among features. It assumes
that the quantity of interest is governed by probability distributions and that the
optimal decision can be made by reasoning about these probabilities together
with observed data. Bayes' theorem provides a way to calculate the probability of
a hypothesis based on its prior probability and the observed data.
• Multilayer perceptron: A multilayer perceptron is a perceptron with
multiple layers. It is a type of feed-forward artificial neural network. It has
an input layer, an output layer, and one or more hidden layers of perceptrons. A
perceptron consists of weights, a summation processor, and an activation function. A
perceptron takes a weighted sum of inputs and outputs a single value. Input
signals are taken from the input layer, the computations are performed at the
hidden layers, and the final output is reflected on the output layer. If the predicted
output is the same as the desired output, then the performance is considered
satisfactory, and no changes to the weights are made. However, if the output does
not match the desired output, then the weights are changed to reduce the error.
• Decision tree: A decision tree is a graphical representation of all the possible
solutions to a decision based on certain conditions. It is a decision support tool
that arranges datasets in the form of a tree-like structure. It is also called a
tree-like model, in which each internal node represents a test on an attribute and
the branches represent the outcomes of the test. The tree is built from
the training examples by computing the entropy and information gain of the samples.
Once the decision tree is constructed, a new sample is classified into a category
based on the decision rule at each node of the decision tree.
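To make the distance-based classification described for KNN concrete, the following is a minimal Python sketch, assuming a toy dataset. The three-feature vectors with values in {-1, 0, 1}, the labels, and the function names are illustrative assumptions; the chapter itself uses the Weka implementation.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hamming(a, b):
    """Hamming distance: number of positions where the vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train_x, train_y, query, k=3, distance=euclidean):
    """Label the query by majority vote among its k nearest training samples."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy training set (hypothetical feature values, not taken from the dataset)
train_x = [(1, 1, 1), (1, 0, 1), (1, 1, 0),
           (-1, -1, -1), (-1, 0, -1), (-1, -1, 0)]
train_y = ["phishy"] * 3 + ["non-phishy"] * 3

print(knn_predict(train_x, train_y, (1, 1, -1)))
print(knn_predict(train_x, train_y, (-1, -1, 1), distance=hamming))
```

Passing the distance function as a parameter keeps the voting logic unchanged when switching between Euclidean and Hamming distance.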
5 Results and Discussions
We compare the different machine learning algorithms for phishing website detection using a publicly available dataset from the UCI Machine Learning Repository, collected from PhishTank, MillerSmiles, and Google [13]. The
dataset consists of a total of 11,055 entries of phishy and non-phishy website
features, of which 4898 are non-phishy and the rest are phishy websites. Each
website has a total of 30 features, which are used for classification. Some
of them, such as IP address, URL length, and right click, are discussed in Sect. 3.
We used the Windows 8 operating system and the Weka tool to train and compare
the accuracy of machine learning algorithms, namely multilayer perceptron, support
vector machine, decision tree, logistic model tree, random forest, and k-nearest
neighbor, for phishing website detection. We used
tenfold cross-validation during training to reduce bias. The confusion matrices
obtained during the training and validation phase for the different machine learning
algorithms are shown in Fig. 6. It can be observed from Fig. 6 that random forest has
the highest number of true positives, while k-nearest neighbor has the highest
number of false negatives.
We further compared the accuracy of the different machine learning algorithms used
for phishing website detection in the proposed approach. The accuracy of the different
algorithms is shown in Fig. 7. It can be observed that random forest is
efficient in terms of accuracy in detecting phishing websites. The accuracy of random
forest is 97.20%. K-nearest neighbor has an accuracy of 97.2%, while logistic model
tree and multilayer perceptron both have an accuracy of 96.9%. The decision tree
has an accuracy of 95.9%, while the support vector machine has the lowest accuracy,
94%.
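The accuracy values above follow from the confusion matrices: accuracy is the sum of the diagonal divided by the total number of samples. The short Python sketch below illustrates the calculation. The four counts are one set that appears in Fig. 6 and whose row sums match the class totals stated above (4898 non-phishy and 6157 phishy); which classifier panel they belong to is not asserted here, and the function name is an assumption.

```python
def accuracy(matrix):
    """Accuracy from a confusion matrix.

    matrix[i][j] is the number of samples of true class i predicted as class j,
    so correct predictions lie on the diagonal.
    """
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Rows: true NonPhishy, true Phishy (row sums 4898 and 6157)
matrix = [[4683, 215],
          [130, 6027]]

print(f"{100 * accuracy(matrix):.1f}%")  # prints 96.9%
```

The result, 10,710 correct out of 11,055, agrees with the 96.9% level reported for two of the classifiers.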
Fig. 6 Confusion matrices (true label vs. predicted label, classes Phishy and
NonPhishy) for phishing website detection using different machine learning
algorithms: (a) multilayer perceptron, (b) support vector machine, (c) decision
tree, (d) logistic model tree, (e) random forest, (f) k-nearest neighbor
6 Conclusions
Phishing is one of the important cybersecurity challenges of today's era. The
cases of phishing are growing exponentially and causing many cyber frauds, which
result in the loss of money for business organizations and individuals. In this paper,
we have proposed a machine learning-based approach to detect phishing websites.
We used various website features to train a classifier to detect a phishy website.
We used six machine learning algorithms, namely random forest, multilayer
perceptron, naïve Bayes, support vector machine, decision tree, and logistic model
tree, to train the classifier. It was found that random forest is the most efficient
algorithm for detecting phishing websites, while the other methods detect phishing
websites with lower accuracy. Finally, a comparison of the machine learning methods
in terms of accuracy is given.
Fig. 7 Accuracy (%) of various machine learning algorithms for phishing detection
References
1. Singh, P., Maravi, Y. P., Sharma, S. (2015, February). Phishing websites detection through
supervised learning networks. In 2015 international conference on computing and communications technologies (ICCCT) (pp. 61–65). IEEE.
2. Heidi Bleau. RSA quarterly fraud report: Q4 2018. URL: https://www.rsa.com/en-us/offers/rsafraud-report-q4-2018. Accessed 15 Sept 2021.
3. Rao, R. S., & Pais, A. R. (2019). Detection of phishing websites using an efficient feature-based
machine learning framework. Neural Computing and Applications, 31(8), 3851–3873.
4. Sönmez, Y., Tuncer, T., Gökal, H., & Avcı, E. (2018, March). Phishing web sites features
classification based on extreme learning machine. In 2018 6th international symposium on
digital forensic and security (ISDFS) (pp. 1–5). IEEE.
5. Sharmin, S., & Zaman, Z. (2017, December). Spam detection in social media employing
machine learning tool for text mining. In 2017 13th international conference on signal-image
Technology & Internet-Based Systems (SITIS) (pp. 137–142). IEEE.
6. Altaher, A. (2017). Phishing websites classification using hybrid svm and knn approach.
International Journal of Advanced Computer Science and Applications, 8(6), 90–95.
7. Karnik, R., & Bhandari, G. M. (2016). Support vector machine-based malware and phishing
website detection. International Journal of Computer Applications in Technology, 3(5), 295–
300.
8. Abunadi, A., Akanbi, O., & Zainal, A. (2013, December). Feature extraction process: A
phishing detection approach. In 2013 13th international conference on Intelligent systems design
and applications (pp. 331–335). IEEE.
9. James, J., Sandhya, L., & Thomas, C. (2013, December). Detection of phishing URLs using
machine learning techniques. In 2013 international conference on control communication and
computing (ICCC) (pp. 304–309). IEEE.
10. Xiang, Y. A. N. G., Li, Y. A. N., Bo, Y. A. N. G., & LI, Y. F. (2017). Phishing website detection
using C4.5 decision tree. DEStech Transactions on Computer Science and Engineering.
11. Aburrous, M., Hossain, M. A., Dahal, K., & Thabtah, F. (2010). Intelligent phishing detection
system for e-banking using fuzzy data mining. Expert Systems with Applications, 37(12),
7913–7921.
12. Fu, A. Y., Wenyin, L., & Deng, X. (2006). Detecting phishing web pages with visual similarity
assessment based on earth mover’s distance (EMD). IEEE Transactions on Dependable and
Secure Computing, 3(4), 301–311.
13. Mohammad, R., Thabtah, F., & McCluskey, T. L. (2015). Phishing websites dataset.
Source Camera Identification Using
Hybrid Feature Set and Machine
Learning Classifiers
Ankit Kumar Jaiswal and Rajeev Srivastava
1 Introduction
In today's era, with the growth of the digital world in communication technologies,
one can freely take images and videos without the consent of a third party
and without disclosing their location and time, and felonies involving such content
have become a big concern for our society. Digital images are used in different
applications in areas such as entertainment, social networking, and security
systems. With the development of new image editing tools, these images can be
manipulated and forged, which can harm public credence and can also call forensic
results into question because of the manipulated evidence. Source camera
identification (SCI) is used to identify the source camera of images/photos, as
shown in Fig. 1.
SCI has a wide range of applications in forensic and judicial
systems. In felonies such as tampering of images [1–5], terrorist-act scenes, video
voyeurism, sharing with a third party without consent, or any distribution of
illicit content, it helps forensic investigators extract relevant information about
the culprit by detecting information about the camera device, such as the brand and
model of the camera.
Many techniques have been introduced by researchers, involving correlation-based and
feature-based models [6–8]. Some techniques used manufacturing defects such as
lens distortion and noise filtering as their identification criteria. In 2006, Lukas
et al. [9] used an SPN-based approach together with the wavelet filtering approach
implemented in [10], averaging the SPN of images from the same camera model to
obtain the average SPN for that particular camera model. Others advanced it [11] by
implementing two preprocessing steps using zero mean (ZM) and Wiener filter (WF),
but the approach suffers from the drawback of image contamination. After that,
several other papers worked on sensor pattern noise (SPN), such as SPN enhancement
and SPN extraction, but due to its high complexity, it is not an efficient approach
to pursue further.
Fig. 1 Given image belongs to which camera model? (Given image; candidate models:
Apple, Samsung, Sony, Motorola, Nexus)
A. K. Jaiswal
Department of Computer Science and Engineering, Thapar Institute of Engineering and
Technology, Patiala, India
e-mail: Ankitkrjaiswal.rs.cse17@iitbhu.ac.in
R. Srivastava
Department of Computer Science and Engineering, Indian Institute of Technology (BHU),
Varanasi, India
e-mail: Rajeev.cse@iitbhu.ac.in
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
S. Pandey et al. (eds.), Role of Data-Intensive Distributed Computing Systems in
Designing Data Solutions, EAI/Springer Innovations in Communication and
Computing, https://doi.org/10.1007/978-3-031-15542-0_7
With the growth in machine learning and advancement in techniques, feature-based methods perform better than correlation-based methods. In
[12–14], statistical features are used to train multi-class classifiers such as SVM,
which results in better detection accuracy. Taking all these methods as
motivation, this work uses feature-based methods but adds techniques to
improve feature-vector extraction and to overcome the drawback of limited data
size.
To overcome the problems based on correlation-based techniques and previous
feature-based techniques, this chapter introduces a machine learning-based SCI
technique using a combination of frequency and spatial features, where data augmentation on the images is performed first and DWT and LBP features are extracted
to overcome the limitations of existing techniques. Then, different classifiers such as
SVM, KNN, and LDA are used to classify these features. Images are classified into
five different camera models, namely, Sony, Samsung, Apple, Motorola, and Nexus.
In the feature-based approach using machine learning, the major contributions of
this work are the following:
• To increase the number of feature vectors available for training, data augmentation is performed over the dataset by rotating, resizing, gray scaling, and
adjusting brightness and contrast. This overcomes the limitation of the small
size of the dataset.
• Since spatial features are not sufficient when images in the same class have
different properties, discrete wavelet features and local binary pattern
features are extracted from the augmented images.
• This results in a sufficient number of feature vectors, which gives better results
as compared with other correlation-based and feature-based techniques.
This chapter comprises five sections: Section 1 gives the introduction to the
problem, motivation behind work, and major contributions. Section 2 talks about
theoretical background and literature review. The proposed method is discussed in
Sect. 3. Section 4 demonstrates the experimental results on the publicly available
dataset followed by its discussion. The presented work is concluded with future
scope in the fifth section.
2 Literature Review
Source camera identification methods proposed by various researchers in literature
are divided into two categories:
2.1 Correlation-Based Method
Hardware defects introduced during manufacturing leave ineradicable marks on camera
models, which can help researchers identify a camera uniquely. This
has led researchers to use the sensor pattern noise (SPN) method [12–14]. The main
component of SPN is photo-response nonuniformity (PRNU) noise. A wavelet
filtering technique is used to extract the SPN from images, and the camera reference
is obtained by taking the average of the SPN of multiple images of the same camera
model. These methods are based on three steps: SPN enhancement, SPN
extraction, and correlation calculation. Researchers try to strengthen the SPN
because a stronger SPN gives higher identification accuracy, but the accuracy of
the SPN and the huge complexity of the
matching techniques still concern researchers.
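The extract, average, and correlate pipeline described above can be sketched as follows. This is an illustrative Python sketch, not the method of the cited works: a 3 × 3 mean filter stands in for the wavelet denoiser, images are plain lists of rows, and all function names are assumptions.

```python
import math


def mean_filter(img):
    """3x3 mean filter with edge replication (stand-in for a wavelet denoiser)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), h - 1)
                    cc = min(max(c + dc, 0), w - 1)
                    total += img[rr][cc]
            out[r][c] = total / 9.0
    return out


def noise_residual(img):
    """Noise estimate of one image: the image minus its denoised version."""
    den = mean_filter(img)
    return [[img[r][c] - den[r][c] for c in range(len(img[0]))]
            for r in range(len(img))]


def reference_pattern(images):
    """Camera reference: pixel-wise average of the residuals of several images."""
    residuals = [noise_residual(img) for img in images]
    h, w, n = len(residuals[0]), len(residuals[0][0]), len(residuals)
    return [[sum(res[r][c] for res in residuals) / n for c in range(w)]
            for r in range(h)]


def correlation(a, b):
    """Normalized correlation between two equally sized 2D patterns."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) *
                    sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0
```

A query image would be attributed to the camera whose reference pattern correlates most strongly with the query's residual. The per-pixel loops also hint at why matching becomes expensive at full image resolution.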
2.2 Feature-Based Method
This method uses statistical features, which make this approach a classification
problem. Various classifiers such as support vector machine (SVM) and ensemble
are used. High-order wavelet statistics (HOWS) and sequential forward feature
selection (SFFS) are different algorithm approaches used for feature extraction.
These methods are mostly based on extracting first-order or higher-order statistical
features by identifying overall variations in the imaging process of different camera
models.
Table 1 compares these methods on different parameters.
The research articles reviewed in Table 1 show different approaches
and techniques for achieving the desired goal; all of them have their advantages and
limitations, which are discussed in detail below. In the method [7], the authors used wavelet
Table 1 Discussion of the state-of-the-art techniques of SCI on different parameters
Ref.: [15]
Method: Based on photo-response nonuniformity (PRNU) that provides features for
classification with SVM and applying two-level DWT wavelet transform and PCA for
reducing-edge effects and de-noise effects in PRNU noise pattern.
Dataset: 130 random images per model and a total of four models
Result: Accuracy: 89%
Merits: This method achieves better results as compared with a correlation-based
method, which is time-inefficient.
Demerits: This method performs poorly when the number of classes increases and also
training on a very small size of the dataset.

Ref.: [16]
Method: It uses CNN for classification by using linear SVM as a classifier with
feature-extraction and dimensionality reduction techniques.
Dataset: Flickr dataset [17]
Result: Accuracy: 93%
Merits: It is prone to less error because of its simplistic assumptions. Can work
on small-image patches with high accuracy.
Demerits: Since the model needs to be trained from scratch, it increases its
computational burden. For camera models having a smaller number of images, it shows
poor accuracy.

Ref.: [18]
Method: This methodology is based on the hypothesis that the attenuation in signal
is required for those having high n value as they are less trustworthy components
and generally fluctuate. This is tested by giving greater weight to the smaller SPN
elements and performing the correlation-based method.
Dataset: Dresden image dataset [19]
Result: Accuracy: 80.8%
Merits: To resolve the issue of complexity and contamination of noise in images in
a correlation-based method, this method provides a faster approach for enhancing SPN.
Demerits: This method is often time-consuming and involves analysis on a larger
dataset of images having a smooth surface for averaging.
Table 1 (continued)
Ref.: [9]
Method: This methodology involves averaging sensor pattern noise by different
camera models using de-noising filters and then estimating false alarm rates and
false rejection rates.
Dataset: Three hundred twenty random images per model, nine camera models in total
Result: Good values of correlation without JPEG or gamma correction
Merits: This work introduced the idea that sensor pattern noise can be used in
identifying unique camera models.
Demerits: Since the identification process requires proper alignment, geometric
operations such as cropping, enlargement, rotation, and digital zoom cause
desynchronization and prevent correct camera identification. In this case, a
brute-force search is needed.

Ref.: [20]
Method: Summarizes all the existing feature-based methods and gives a unified
method to determine not only the camera model but also the brand as well as the
individual device with high accuracy. Six steps: image preprocessing, residual
calculation, feature extraction, reduction, and classification.
Dataset: Dresden image dataset [19]
Result: Accuracy: 89.9%
Merits: The proposed framework achieves higher accuracy as compared with other
state-of-art methods due to the LBP feature extraction and bilinear classification
model.
Demerits: The combination of wavelet and contourlet transform shows slightly worse
performance as compared with when we use only wavelet transform.
Table 1 (continued)
Ref.: [21]
Method: This work focuses on small-scale training samples by introducing a concept
of megatrend diffusion (MTD) and combining it with ensemble learning. This method
generates visual samples with a uniform distribution of the distribution distance
of the samples calculated from the practice distribution method.
Dataset: Dresden image dataset [19]
Result: Accuracy: 53.93% in five on samples and 83.28% with ensembles
Merits: Through ensemble learning, we increased the identification accuracy of the
model, which was not achieved by using only MTDBOX and MTDRELATION.
Demerits: Not efficient on large training samples. Ensemble models are difficult to
interpret as compared with other models. Ensemble learning is expensive in both
time and space.

Ref.: [22]
Method: This method uses a multi-classifier instead of a binary classifier for
testing images taken from different camera models. The main idea is to develop a
CNN architecture that can extract characteristic features on its own.
Dataset: Dresden image dataset [19]
Result: Accuracy: 86.22%
Merits: JPEG compression and noise do not affect the image resolution.
Demerits: The accuracy gets affected by the rescaling attack.
[15]
[23]
This work considered each
noise pattern as its
fingerprint to identify its
source camera sensor
where SPN is used to
classify the images further
characterized by
wavelet-based feature
vector noise images
obtained by PRNU
extraction and further 81
features were exacted
using the extracting
feature. The RBF kernel in
SVM performed the
classification.
An algorithm by extracting
color models and color
channels and using them in
image texture features. It
can even differentiate
between the images taken
from the same model and
source device.
Dresden image dataset
[19]
A random sample of
100 images was used
for training and 100
different for testing.
Accuracy: 93.2% and
87.2%
The road is high on both
gaining accuracy and
durability. Images remain
unaffected even after
rescaling, adding noise, or
JPEG compression.
This method can be
considered for a large
number of different
cameras and hence can be
utilized in forensics and
mining.
We cannot detect the
accuracy based on resizing.
The decrease in the
accuracy of experiment
two indicates that the
performances decrease
when the number of
classes is increased.
transform on photo-response nonuniformity (PRNU) to provide the features, which
were then classified using a support vector machine (SVM). It gives satisfactory
accuracy when tested on level-1 decomposition but performs poorly when the
number of levels increases. The authors of [8] first trained the algorithm through CNN
and SVM and then used the trained model to classify the images; it was
evaluated only on original JPEG images without any manipulation, and the model
is of no use on manipulated images.
The authors of [15] worked on the hypothesis that attenuation is required
for signal components having a high SPN value, as they are less trustworthy and
generally fluctuate. Method [9] estimates false alarm and false rejection
rates using denoising filters, but geometric operations such as cropping,
enlargement, rotation, and digital zoom cause desynchronization and lead to
incorrect camera identification unless a brute-force search is used. The authors of
[20] proposed a feature-based method. Image preprocessing and residual image
calculations were done to extract specific artifacts. Then, image transformation was
performed, followed by a dimensionality reduction process to reduce complexity.
The concerning part is that its accuracy is lower than that of the hierarchical model.
The method in [24] uses high-order wavelet features and the SFFS algorithm
to improve the accuracy of a multi-class SVM; this method does not introduce a
new idea and instead builds on an existing method. In [21], a method to
deal with small training samples was proposed; the mega-trend-diffusion
(MTD) method is combined with ensemble learning to generate visual sample images. It
fails for large training samples, and ensemble learning is expensive in both time
and space. In [22], a multi-classifier is used to extract features
automatically instead of handcrafting them. Rescaling attacks affect its accuracy. In
[23], the authors used the same feature-based extraction method but with an
RBF-kernel SVM for classification. They experimented with different numbers of
classes and concluded that the accuracy decreases as the number
of classes increases. In the method [24], color channels and color models are used to
extract image texture features, which can distinguish images even if they are taken
from the same camera source. However, its accuracy cannot be analyzed under
parameters such as resizing.
After reviewing the literature, it can be concluded that there is a need for
a model that can overcome the above limitations. Such a model should perform well
on unseen testing datasets, even if there is a
distribution gap between training and testing images, and its accuracy should remain
unaffected by operations such as double JPEG compression, rescaling, and
resizing.
3 Method and Model
A source camera identification model is proposed using a combination of spatial
and frequency-based features. The main concept behind it is that every camera
model has its own unique properties, so when one takes a picture, properties
such as color channel interpolation, the focusing of light rays through lenses,
complex operations at acquisition time, brightness adjustment, and many others act
as footprints that identify the specific camera model. Techniques based on sensor
pattern noise (SPN) and lens radial distortion have been used to identify the source
camera, using manufacturing defects of the camera as the unique property
that identifies the particular camera model. Earlier, correlation-based methods
[25, 26] were used, which take the average of multiple SPNs from images taken by
different camera models to identify them uniquely, but this involves a lot of
complexity in the matching process, and images severely contaminated by scene
content make it difficult to reach a proper
conclusion. In this chapter, a machine learning classification model is given based
on spatial as well as frequency domain-based features. To overcome the limitation
of a small dataset, data augmentation is performed on the images of each class.
Local binary pattern (LBP) features and DWT features are extracted from the
augmented images to make a feature vector. These feature vectors are then used to
train classifiers such as support vector machine (SVM), linear discriminant analysis
(LDA), and KNN to detect a camera model. This methodology is divided into three
steps: first image preprocessing, second feature extraction, and third
classification.
3.1 Image Preprocessing
The dataset used for evaluation consists of diverse images, varying from
close-ups and indoor scenes to nature and outdoor scenes. Since we do not need to
classify the content of these images but instead want to extract their frequency
component features, which are useful in identifying a unique camera model, we
converted the color images into gray scale (see Fig. 2). Furthermore, this conversion
is useful because color image processing has a higher computational cost. The
conversion formula is as follows:
\[
\text{Gray scale} = 0.299\,R + 0.587\,G + 0.114\,B
\]
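As a small illustration of the weighted gray-scale conversion, the following Python sketch applies the weights to an image stored as rows of (R, G, B) tuples. It uses the standard ITU-R BT.601 luma weights (0.299, 0.587, 0.114), which sum to 1; the function name and the list-of-tuples image layout are assumptions for illustration.

```python
def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples to rows of gray-scale intensities.

    Uses the BT.601 luma weights, so a white pixel stays at full intensity.
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


# A 1 x 3 image: pure red, pure green, pure blue
print(to_grayscale([[(255, 0, 0), (0, 255, 0), (0, 0, 255)]]))
```

Green contributes the most to the result and blue the least, reflecting the eye's different sensitivity to the three channels.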
To overcome the limitation of a small dataset, data augmentation is performed on these images. Augmentation techniques such as brightness enhancement,
contrast adjustment, rotation, and resizing are applied as shown in Fig. 3. Images
are rescaled to a high-resolution size of 1500 × 1500, and two types of
brightness enhancement are performed, first increasing the intensity values by
10 and then by 20. Images are also rotated by 5° and 10°, and contrast
adjustments are done in three ways: first by adjusting the intensity values to
increase the contrast of the output image, second by contrast-limited adaptive
histogram equalization, and third by histogram equalization, which transforms the
input image into an output image whose histogram has 64 bins and is approximately
flat.
Fig. 2 Color conversion of image (gray scale)
Fig. 3 Data augmentation of image: rescaled, rotation 5°, rotation 10°, brightness
10, brightness 20, contrast 1, contrast 2, contrast 3
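Two of the augmentation operations can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' pipeline: it assumes 8-bit gray-scale images stored as lists of integer rows, and it implements classic 256-level histogram equalization rather than the 64-bin variant described in the text. Function names are hypothetical.

```python
def adjust_brightness(gray, offset):
    """Add a constant to every pixel, clipping to the 8-bit range [0, 255]."""
    return [[min(255, max(0, p + offset)) for p in row] for row in gray]


def equalize_histogram(gray, levels=256):
    """Classic histogram equalization via the cumulative distribution."""
    flat = [p for row in gray for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    if len(flat) == cdf_min:
        # Constant image: nothing to spread out
        return [list(row) for row in gray]
    scale = (levels - 1) / (len(flat) - cdf_min)
    return [[round((cdf[p] - cdf_min) * scale) for p in row] for row in gray]


low_contrast = [[100, 101], [102, 103]]
print(adjust_brightness(low_contrast, 10))
print(equalize_histogram(low_contrast))
```

Each augmented copy keeps the class label of its source image, which is how augmentation increases the number of training vectors per camera model.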
3.2 Feature Extraction
In source camera identification, features extracted from the wavelet domain
can perform better than spatial features (such as image color, IQM, and CFA
features). The reason is that wavelet transform algorithms reduce the edge effect
and remove noise while preserving perceptually important features, whereas spatial
features often smooth edges and affect image quality.
Wavelet Features
Discrete wavelet transform (DWT) is a digital filtering process that decomposes
an image into four frequency subbands: low–low (LL), low–high
(LH, vertical), high–high (HH, diagonal), and high–low (HL, horizontal), as shown
in Fig. 4. LL holds the approximation coefficients (generated by low-pass filters),
and the other three hold the detail coefficients (generated by high-pass filters).
Local Binary Pattern Features
LBP yields feature vectors that are robust to monotonic gray-scale changes.
LBP features are extracted by comparing each pixel with its neighboring pixels.
The method is computationally simple and gives high accuracy. After image
preprocessing, a local binary pattern (LBP) operator is applied to extract
59 features.
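One common way to arrive at 59 LBP values is the uniform-pattern histogram for 8 neighbors: the 58 patterns with at most two 0/1 transitions each get their own bin, and all remaining patterns share one bin. The chapter does not state which LBP variant was used, so the sketch below is an assumption; the names `transitions`, `lbp_histogram`, and the neighbor ordering are illustrative only.

```python
def transitions(code):
    """Count 0/1 changes around the circular 8-bit pattern."""
    bits = [(code >> i) & 1 for i in range(8)]
    return sum(bits[i] != bits[(i + 1) % 8] for i in range(8))


# Uniform patterns have at most two transitions; each gets its own bin.
UNIFORM = [c for c in range(256) if transitions(c) <= 2]
BIN_OF = {code: index for index, code in enumerate(UNIFORM)}
N_BINS = len(UNIFORM) + 1  # 58 uniform bins + 1 shared bin

# Offsets of the 8 neighbors, clockwise from the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_histogram(image):
    """Histogram of uniform LBP codes over the interior pixels of an image."""
    hist = [0] * N_BINS
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            code = 0
            for bit, (dr, dc) in enumerate(NEIGHBOURS):
                # A neighbor at least as bright as the center sets its bit.
                if image[r + dr][c + dc] >= image[r][c]:
                    code |= 1 << bit
            hist[BIN_OF.get(code, N_BINS - 1)] += 1
    return hist


features = lbp_histogram([[5, 5, 5], [5, 5, 5], [5, 5, 5]])
print(len(features))
```

In practice the histogram is usually normalized before it is passed to a classifier.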
Fig. 4 Extraction of frequency domain-based features (discrete wavelet transform).
Steps shown: 2D DWT applied up to the 4th level; median absolute deviation from
4 × 4 blocks of HH4; mean of each row
From the augmented image, spatial features (LBP features) are extracted first to
form a feature vector. From one image, 59 feature descriptors are extracted as LBP.
Similarly, for frequency-domain features, four-level DWT features are extracted
from the augmented image using the Daubechies (db1) wavelet [6]. A four-level
transformation is used to obtain enough frequency-component features
for the classifiers. The steps are as follows:
• Each level of wavelet decomposition produces one set of approximation
coefficients and three sets of detail coefficients. The second-level
decomposition is computed from the first-level approximation coefficients, and
the process continues until the fourth-level approximation and detail
coefficients are obtained.
• From the fourth-level detail coefficients, the diagonal component is selected
and divided into distinct blocks of size 4 × 4.
• For every distinct 4 × 4 block, the median absolute deviation is calculated.
• Feature reduction is then done by calculating the mean of each row, which
results in 23 features.
This results in a feature vector of 82 features (59 from LBP and 23 from DWT).
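The block-wise reduction in the steps above can be sketched as follows. The code assumes the fourth-level diagonal band (HH4) is already available as a list of rows; the function names are hypothetical. The 94 × 94 size used in the example is an inference, not a figure from the chapter: halving 1500 four times with rounding up gives 750, 375, 188, and 94, and 94 // 4 = 23 complete blocks per row, which would match the 23 DWT features reported.

```python
from statistics import mean, median


def mad(values):
    """Median absolute deviation: median of |x - median(x)|."""
    center = median(values)
    return median([abs(v - center) for v in values])


def block_mad_features(hh, block=4):
    """MAD of each distinct block, then the mean of each row of MAD values."""
    n_rows = len(hh) // block      # incomplete edge blocks are dropped
    n_cols = len(hh[0]) // block
    features = []
    for br in range(n_rows):
        row_of_mads = []
        for bc in range(n_cols):
            values = [hh[br * block + i][bc * block + j]
                      for i in range(block) for j in range(block)]
            row_of_mads.append(mad(values))
        features.append(mean(row_of_mads))
    return features


# Hypothetical 94 x 94 band filled with a simple repeating pattern.
hh4 = [[float((r * 94 + c) % 7) for c in range(94)] for r in range(94)]
dwt_features = block_mad_features(hh4)
print(len(dwt_features))
```

Dropping incomplete edge blocks is a design choice of this sketch; padding the band to a multiple of four would be an equally plausible reading of "distinct blocks".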
3.3 Classification
To perform classification among different camera models, multi-class classification
models are used. These classifiers use extracted feature vectors to train the model.
SVM (Support Vector Machine)
SVM is a supervised classification model with good generalization ability. It is
a machine learning algorithm that analyzes the training data and generates
a classifier function. It takes the feature vectors produced by feature extraction
as input and outputs a label that identifies one of the camera models. It
performs best on small sets of observations. In this chapter, a medium Gaussian
SVM with the one-vs-one multiclass method is implemented.
Linear Discriminant Analysis (LDA)
LDA is mostly used as a supervised classification model. It is also used to project
higher-dimensional features into a lower-dimensional space. Here, the linear
discriminant is used as a linear classifier that performs multi-class
classification as well as dimensionality reduction.
KNN (k-Nearest Neighbor)
KNN is one of the simplest classification algorithms. It relies on the assumption
that similar data points lie near each other, i.e., if two data points are close
enough, they are classified into the same class. In image classification, the
input is a set of N images labeled with K classes, and the classifier is trained
on this image dataset. Each testing image is then compared with every
training image, and the label of the closest training image is predicted.
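The nearest-neighbor rule described above can be sketched in a few lines. This is a minimal illustration with k = 1 and Euclidean distance; the function name and the toy two-dimensional feature vectors are assumptions, whereas the chapter works with 82-dimensional vectors.

```python
import math


def nearest_neighbour(train_x, train_y, query):
    """Return the label of the training vector closest to `query`."""
    best = min(range(len(train_x)),
               key=lambda i: math.dist(train_x[i], query))
    return train_y[best]


# Hypothetical two-dimensional feature vectors for two camera models.
train_x = [[0.0, 0.0], [0.5, 1.0], [10.0, 10.0], [9.0, 11.0]]
train_y = ["Apple", "Apple", "Sony", "Sony"]
predicted = nearest_neighbour(train_x, train_y, [1.0, 2.0])
print(predicted)
```

Every prediction scans the whole training set, so the cost of classifying one image grows linearly with the number of training images.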
4 Experiment and Result Analysis
The proposed model is evaluated experimentally. The experiments are
performed in MATLAB R2017b on a Windows 10 operating system with an Intel
Core i5 8th Gen processor and 8 GB of RAM. For evaluation, a publicly
available dataset is used.
The dataset used for implementation is the IEEE Signal Processing Society
camera model identification dataset [27]. It consists of images taken with
five popular mobile devices, namely Sony, Samsung, Apple, Motorola, and Nexus.
The dataset has 495 images per class, for a total of 2475 images. In the
proposed approach, data augmentation is performed first. Using these operations,
a total of 3960 images are created per class. Hence, a total of 19800 images
are used for training the model. From each image, a total of 82 features are
extracted using DWT and local binary pattern (LBP). This section gives details
of the results obtained with the different classifiers.
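The dataset sizes quoted above follow from simple arithmetic if each original image yields the eight variants named in Fig. 3. That one-to-eight mapping is an inference from the figure labels rather than a statement in the text, and the variable names below are illustrative.

```python
images_per_class = 495
num_classes = 5
num_features = 82

# The eight augmentation outputs labeled in Fig. 3 (assumed one per original).
variants = ["rescaled", "rotation 5", "rotation 10",
            "brightness 10", "brightness 20",
            "contrast 1", "contrast 2", "contrast 3"]

original_total = images_per_class * num_classes            # 2475
augmented_per_class = images_per_class * len(variants)     # 3960
augmented_total = augmented_per_class * num_classes        # 19800
feature_matrix_shape = (augmented_total, num_features)     # (19800, 82)

print(original_total, augmented_per_class, augmented_total, feature_matrix_shape)
```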
For performance measurement, different evaluation metrics are used, all
calculated from a confusion matrix: accuracy, precision, recall, specificity,
and F1-score. Since the number of images is equal for each class (balanced
dataset), these performance measures are sufficient for evaluation. The entries
of the confusion matrix are defined as:
TP: Correctly predicted the positive class
FP: Incorrectly predicted the positive class
TN: Correctly predicted the negative class
FN: Incorrectly predicted the negative class
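By using these instances, performance measures are calculated as follows:
$$\text{precision} = \frac{tp}{tp + fp}, \qquad \text{recall} = \frac{tp}{tp + fn}$$
$$\text{accuracy} = \frac{tp + tn}{tp + tn + fp + fn}, \qquad \text{specificity} = \frac{tn}{tn + fp}$$
$$\text{F1-score} = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}}$$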
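For a multi-class problem, these measures are usually computed per class by treating one class as positive and all others as negative. The sketch below applies the definitions above in that one-vs-rest fashion; the function name, the row/column convention, and the small two-class matrix are assumptions for illustration, not results from the chapter.

```python
def per_class_metrics(cm, k):
    """One-vs-rest metrics for class k.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                    # class k predicted as another class
    fp = sum(row[k] for row in cm) - tp     # another class predicted as class k
    tn = total - tp - fn - fp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / total,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical confusion matrix for two classes.
cm = [[8, 2],
      [1, 9]]
metrics = per_class_metrics(cm, 0)
print(metrics)
```

Averaging each metric over all classes gives the macro-averaged values of the kind reported for each classifier.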
Three different classifiers are used, namely support vector machine
(SVM), k-nearest neighbor (KNN), and linear discriminant analysis (LDA). The
dataset is divided into two parts, one for training the model and the other
for validating it, using tenfold cross-validation. Classes are represented by
numeric values (i.e., 1-Apple, 2-Sony, 3-Samsung, 4-Motorola, and 5-Nexus).
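Tenfold cross-validation splits the data into ten folds and uses each fold once for validation while the other nine are used for training. The sketch below generates the index sets for such a split; the function name is hypothetical, and shuffling or stratifying the samples before splitting (which is normally done in practice) is left out for brevity.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    fold_size, remainder = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `remainder` folds take one extra sample each.
        stop = start + fold_size + (1 if fold < remainder else 0)
        validation = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, validation
        start = stop


folds = list(k_fold_indices(19800, k=10))
print(len(folds), len(folds[0][0]), len(folds[0][1]))
```

With 19800 feature vectors, each fold validates on 1980 samples and trains on the remaining 17820.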
SVM
A medium Gaussian SVM is trained on our dataset. The implemented method
has an overall accuracy of 89.7%, with average precision and
recall/sensitivity rates of 90.2% and 89.7%, respectively, and an average
specificity rate of 97.3%. The quantitative result of the proposed model with
an SVM classifier on different performance measures is shown in Table 2.
Table 2 Quantitative result of the proposed model with medium Gaussian SVM on different
performance measures
Camera model   Precision (%)   Recall (%)   Specificity (%)   F1 score (%)
Apple          89.4            91.0         96.9              90.2
Sony           95.1            89.7         99.4              92.3
Samsung        91.0            89.0         97.5              89.9
Motorola       84.6            88.6         95.4              86.5
Nexus          91.0            90.5         97.4              90.7
LDA
In this method, a linear discriminant analysis classifier is trained on our
dataset. The classifier gives an accuracy of 80.5%, with average precision
and recall/sensitivity rates of 81.6% and 80.04%, respectively, and an average
specificity rate of 94.76%. The figure mentioned below shows the
confusion matrix of the experimental result of LDA. The quantitative result of the
proposed model with an LDA classifier on different performance measures is shown
in Table 3.
Table 3 Quantitative result of the proposed model with LDA on different performance measures
Camera model   Precision (%)   Recall (%)   Specificity (%)   F1 score (%)
Apple          77.4            84.1         91.9              80.6
Sony           88.6            77.0         98.6              82.3
Samsung        84.7            76.4         96.0              79.6
Motorola       75.2            82.0         92.4              78.4
Nexus          82.1            80.7         94.9              81.3
Table 4 Quantitative result of the proposed model with KNN on different performance measures
Camera model   Precision (%)   Recall (%)   Specificity (%)   F1 score (%)
Apple          81.6            88.3         94.3              84.8
Sony           88.2            89.9         98.5              89.0
Samsung        85.7            86.0         95.9              85.8
Motorola       85.4            79.4         96.1              82.3
Nexus          88.4            86.3         96.7              87.2
Table 5 Comparison of the different classifiers on the proposed feature set
Model   Accuracy (%)   Precision (%)   Recall (%)   Specificity (%)   F1 score (%)
SVM     89.7           90.22           89.76        97.32             89.93
LDA     80.5           81.6            80.04        94.76             80.44
KNN     85.6           85.86           85.98        96.3              85.8
KNN
In this method, a Fine KNN classifier is trained on our dataset, and the
classifier gives an accuracy of 85.6%. The average precision and recall/sensitivity
rates are 85.86% and 85.98%, respectively, and the average specificity
rate is 96.3%. The quantitative result of the proposed model with the KNN
classifier on different performance measures is shown in Table 4.
Table 5 compares the average values of accuracy, precision, recall, specificity,
and F1-score on different classifiers.
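The averages in Table 5 can be reproduced as plain (macro) means of the per-class values. The sketch below does this for the SVM figures of Table 2; the variable names are illustrative. The computed mean F1 score is 89.92, slightly below the 89.93 listed in Table 5, presumably because the table was computed from unrounded per-class values.

```python
# Per-class SVM results copied from Table 2
# (order: Apple, Sony, Samsung, Motorola, Nexus).
svm_table = {
    "precision":   [89.4, 95.1, 91.0, 84.6, 91.0],
    "recall":      [91.0, 89.7, 89.0, 88.6, 90.5],
    "specificity": [96.9, 99.4, 97.5, 95.4, 97.4],
    "f1":          [90.2, 92.3, 89.9, 86.5, 90.7],
}

# Macro average: unweighted mean over the five classes.
macro = {name: sum(values) / len(values) for name, values in svm_table.items()}

for name, value in macro.items():
    print(f"{name}: {value:.2f}")
```

Because the dataset is balanced, the unweighted mean over classes coincides with a mean weighted by class size.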
From the above table, it can be concluded that SVM (medium Gaussian) performs best on the given dataset. SVM is therefore used for further comparison
with the other state-of-the-art techniques mentioned in Sect. 2. The implemented
method works better than most of these techniques (its accuracy is higher
than that of most of the papers mentioned in Sect. 2) because we extracted a
significant number of features, which helped the classifier train a better
model. However, we noticed that as the number of classes increased, the accuracy
decreased, due to the limited number of photos of each class.
5 Conclusion
In this chapter, a feature-based framework for source camera identification is
proposed that has three steps: image preprocessing, feature extraction, and classification. Image preprocessing includes data augmentation and color conversion
from RGB to gray scale. A total of 82 features were extracted using DWT (23
features) and LBP (59 features). In experiments on the resulting 19800 × 82
feature matrix using different classifiers, SVM shows an overall accuracy of 89.7%
for five camera models, whereas LDA achieved an accuracy of 80.5% and
KNN an overall accuracy of 85.6%. Of the three classifiers, SVM gave the best
results, and it also compares favorably with the state-of-the-art techniques
mentioned in Sect. 2. However, the proposed model has been evaluated on only
five classes because of the small size of the dataset available so far. In the future,
the accuracy and the method can be further improved by performing experiments on a
larger number of images, generating a dataset containing more classes.
References
1. Jaiswal, A. K., & Srivastava, R. (2019). Copy-move forgery detection using shift-invariant SWT
and block division mean features (Vol. 524). Springer.
2. Jaiswal, A. K., & Srivastava, R. (2020). Forensic image analysis using inconsistent noise
pattern. Pattern Analysis and Applications. https://doi.org/10.1007/s10044-020-00930-4
3. Mehta, V., Jaiswal, A. K., & Srivastava, R. (2020). Copy-move image forgery detection using
DCT and ORB feature set (Vol. 1206 CCIS). Springer.
4. Jaiswal, A. K., & Srivastava, R. (2019). Image splicing detection using deep residual network.
SSRN Electronic Journal. https://doi.org/10.2139/ssrn.3351072
5. Jaiswal, A. K., & Srivastava, R. (2019). Image splicing detection using deep residual network.
In International conference on advanced computing and software engineering (ICACSE-2019)
(pp. 99–102). https://doi.org/10.2139/ssrn.3351072
6. Jaiswal, A. K., & Srivastava, R. (2020). A technique for image splicing detection using hybrid
feature set. Multimedia Tools and Applications, 79(17–18), 11837–11860. https://doi.org/10.1007/s11042-019-08480-6
7. Singh, S., & Singh, N. P. (2019). Machine learning-based classification of good and rotten
apple. Recent Trends in Communication, Computing, and Electronics, 377–386.
8. Singh, S., & Kumar, R. (2020). Histopathological image analysis for breast cancer detection
using cubic SV. In 7th International conference on signal processing and integrated networks
(SPIN) (pp. 498–503).
9. Lukáš, J., Fridrich, J., & Goljan, M. (2006). Digital Camera Identification from sensor pattern
noise. IEEE Transactions on Information Forensics and Security, 1(2), 205–214.
10. Mihçak, M. K., Kozintsev, I., Ramchandran, K., & Ave, N. M. (1999). Spatially adaptive
statistical modeling of wavelet image. International Conference on Acoustics, Speech, and
Signal Processing, 6, 3253–3256.
11. Chen, M., Fridrich, J., Goljan, M., & Lukáš, J. (2008). Determining image origin and integrity
using sensor noise. IEEE Transactions on Information Forensics and Security, 3(1), 74–90.
https://doi.org/10.1109/TIFS.2007.916285
12. Wang, B., Kong, X., & You, X. (2009, January). Source camera identification using support
vector machines. In IFIP International Conference on Digital Forensics. pp. 107–118.
Springer, Berlin, Heidelberg.
13. Gloe, T. (2012). Feature-based forensic camera model identification. In Lecture Notes in
Computer Science (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), vol.
7228 LNCS (pp. 42–62). https://doi.org/10.1007/978-3-642-31971-6_3
14. Tsai, M. J., Wang, C. S., Liu, J., & Yin, J. S. (2012). Using decision fusion of feature selection
in digital forensics for camera source model identification. Computer Standards & Interfaces,
34(3), 292–304. https://doi.org/10.1016/j.csi.2011.10.006
15. Bolouri, K., Javanmard, M., & Firouzmand, M. (2013). Camera identification algorithm
based on Sensor Pattern Noise using wavelet transform, SVD/PCA and SVM classifier. Journal of Information Systems and Telecommunication, 1(4), 233–237. https://doi.org/10.7508/jist.2013.04.004
16. Bondi, L., et al. (2017). First steps toward camera model identification with Convolutional
neural networks. IEEE Signal Processing Letters, 24(3), 259–263. https://doi.org/10.1109/LSP.2016.2641006
17. “Flicker Camera Model Dataset.”
18. Li, C. T. (2009). Source camera linking using enhanced sensor pattern noise extracted from
images. IET Seminar Digest, 2, 2009. https://doi.org/10.1049/ic.2009.0274
19. Gloe, T., & Böhme, R. (2010). The dresden image database for benchmarking digital image
forensics. Journal of Digital Forensic Practice, 3(2–4), 150–159. https://doi.org/10.1080/15567281.2010.531500
20. Wang, B., Zhong, K., Shan, Z., Zhu, M. N., & Sui, X. (2020). A unified framework of source
camera identification based on features. Forensic Science International, 307. https://doi.org/10.1016/j.forsciint.2019.110109
21. Wu, S., Wang, B., Zhao, J., Zhao, M., Zhong, K., & Guo, Y. (2021). Virtual sample
generation and ensemble learning based image source identification with small training
samples. International Journal of Digital Crime and Forensics, 13(3), 34. https://doi.org/10.4018/IJDCF.20210501.oa3
22. Yao, H., Qiao, T., Zheng, N., & Xu, M. (2018). Robust multi-classifier for camera model
identification based on convolution neural network. IEEE Access, 6, 24973–24982. https://doi.org/10.1109/ACCESS.2018.2832066
23. Corripio, J. R., González, D. M. A., Orozco, A. L. S., Villalba, L. J. G., Hernandez-Castro, J., &
Gibson, S. J. (2013). Source smartphone identification using sensor pattern noise and wavelet
transform. In 5th International conference on imaging for crime detection and prevention.
ICDP 2013. https://doi.org/10.1049/ic.2013.0267
24. Wang, B., Guo, Y., Kong, X., & Meng, F. (2009). Source camera identification forensics based
on wavelet features. In IIH-MSP 2009 – 2009 5th International conference on intelligent
information hiding and multimedia signal process (pp. 702–705), no. April 2015. https://doi.org/10.1109/IIH-MSP.2009.244
25. Zhang, L. B., Peng, F., & Long, M. (2017). Identifying source camera using guided image
estimation and block weighted average. Journal of Visual Communication and Image Representation, 48(December), 471–479. https://doi.org/10.1016/j.jvcir.2016.12.013
26. Lawgaly, A., & Khelifi, F. (2017). Sensor pattern noise estimation based on improved
locally adaptive DCT filtering and weighted averaging for source camera identification and
verification. IEEE Transactions on Information Forensics and Security, 12(2), 392–404. https://doi.org/10.1109/TIFS.2016.2620280
27. “IEEE’s Signal Processing Society - Camera Model Identification.” .
Analysis of Blockchain Integration
with Internet of Vehicles: Challenges,
Motivation, and Recent Solution
Manik Gupta, R. B. Patel, and Shaily Jain
M. Gupta · S. Jain
Chitkara University School of Engineering and Technology, Chitkara University, Himachal
Pradesh, India
e-mail: manik.gupta@chitkarauniversity.edu.in; shaily.jain@chitkarauniversity.edu.in
R. B. Patel
Department of Computer Science & Engineering, Chandigarh College of Engineering &
Technology, Chandigarh, India
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
S. Pandey et al. (eds.), Role of Data-Intensive Distributed Computing Systems in
Designing Data Solutions, EAI/Springer Innovations in Communication and
Computing, https://doi.org/10.1007/978-3-031-15542-0_8
1 Introduction
Due to the increased number of road vehicles, the demand for effective transport
networks has grown considerably. The growth of the urban population has made
traffic management, and the handling of severe traffic problems in particular,
more difficult. Road conditions, traffic congestion, and other factors such as
poor public transportation must all be taken into consideration. Additionally,
cities adopting the smart city approach will need traffic control systems to
support municipal administrations and new car ownership applications.
Consequently, smart services can only be provided through new technological
solutions, which makes these solutions critical for road authorities and driver
satisfaction. Traditionally, transportation management systems have relied on
vehicular ad hoc networks (VANETs) to provide various applications and services.
Building on the large volume of available data and the improved connectivity
provided by the Internet of Things (IoT), the Internet of Vehicles (IoV)
approach [1] advances this technology by connecting vehicles together.
The IoV is a sophisticated, traffic-efficient ad hoc network. IoV applications
share many features with traditional IoT applications, but they also differ
significantly. The IoV operates in a network environment that is accessed over
wireless links, and the network topology changes continuously, which complicates
the processing of mobile IoV services [2]. The IoV continuously produces a
variety of data through moving vehicles, including trajectories, road traffic
statistics, and multimedia data. Under this paradigm, vehicles may interact with
one another and with their surroundings: individuals, telecommunication
networks, devices, traffic signs, and so on. This will lead to the development
and deployment of effective road safety and global traffic efficiency
applications that reduce traffic casualties [3]. However, drivers and passengers
who use ridesharing services may be at risk of physical injury or property
damage as a result of security and privacy problems such as data manipulation,
identity counterfeiting, and exposure of sensitive information.
Many distinct factors contribute to the inherent danger of privacy breaches.
One is the kind of data being gathered by the various organisations on a
network. Additionally, even for applications that provide accurate data, the
collected data is kept on devices that are meant to be reused, which increases
the possibility of data reuse without permission.
Blockchain technology has recently risen to prominence, bringing with it
the benefits of decentralisation, privacy, and reliability [4–6]. It is an
emerging approach in computer science for safeguarding information resources
shared between networked devices, and it has been widely suggested
as a promising solution to the IoT's trust-related issues [7–9]. With the help
of blockchain, IoT devices can safely exchange energy or resources with
unknown counterparts [10–12].
With its distributed consensus properties, blockchain technology is well suited
to decentralised applications, in particular when vehicles do not trust one
another in complicated road transport settings. Blockchain technology was first
implemented as the distributed ledger of the Bitcoin system [13–15],
with the goal of solving the cryptocurrency's double-spending problem. Because
the distributed ledger is immutable, one of the most important characteristics
of the blockchain is that it enables transacting parties and stakeholders to build
trust among previously untrusting organisations in a decentralised way [16–17].
The blockchain-based architecture is decentralised, open, and implemented by
many dispersed nodes. Each node holds a copy of the Bitcoin transaction records,
which are cryptographically linked and structured into blocks that the
blockchain nodes agree on through a consensus procedure [18–21].
The increasing number of smart cars is anticipated to generate and exchange
a large quantity of data, and, given the fast development of vehicular
applications and services, the network traffic to be handled will be
considerable. Significant problems will also arise when conventional cloud
services and administration are used with the high speed, low-latency
requirements, contextual complexity, and heterogeneous features of the IoV. It
is also challenging for IoV entities from different service providers to
guarantee good interoperability and compatibility. As a result, blockchain
technology, in conjunction with current cryptographic methods and edge
computing, already offers significant possibilities in a variety of IoV
applications. The combination of blockchain technology with the
Internet of Things is anticipated to significantly enhance the security,
intelligence, large-scale data storage, and efficient administration of the
Internet of Things, making it an attractive subject of study. Because the
communication links between vehicles, or between vehicles and roadside units
(RSUs), are insecure or unreliable, it is critical to address the design of a
blockchain-based vehicular network and the modelling of the accessible portion
of that network.
1.1 Organization and Reading Map
Figure 1 shows the organization of the chapter. Section 1 gives the basic
introduction of the chapter, following the abstract. Section 2 describes the basic
architecture of the Internet of Vehicles, and Sect. 3 discusses the challenges
that the IoV faces. Section 4 gives an overview of blockchain technology, and
Sect. 5 focuses on the types of blockchain networks. In Sect. 6, the motivation
for using blockchain in the IoV is highlighted, and Sect. 7 reviews the recent
solutions for IoV integration with blockchain. Section 8 highlights the
applications of blockchained IoV from three major perspectives: incentive
mechanisms, trust establishment, and security and privacy. Some common use cases
are shown in Sect. 9, followed by the future scope of blockchained IoV and the
conclusion of the chapter in Sects. 10 and 11, respectively.
2 Architecture of IoV
The Internet of Vehicles (IoV) enables individuals to manage their vehicles from a
distance through the connectivity of people, vehicles, and things, as well as the
intelligent provision of services by vehicles to people. Figure 2 depicts the
organisational structure of the IoV. Taking into account the life cycle of data
in the IoV, we split the IoV structure from bottom to top into four layers: the
physical (sensing) layer, the communication layer, the computation layer, and
the application layer.
2.1 Physical Layer
The physical layer comprises a wide range of on-board sensors, sensors situated
at key locations, and portable devices. It collects real-time vehicle operating
parameters together with real-time vehicle and road environment conditions
[27], for example velocity, geolocation, engine RPM, and distance.
For uniform visualization and interpretation, all gathered data is
transmitted to the cloud server to derive the business data needed by clients
and to offer accurate IoV data support [28–31].
Fig. 1 Organization of chapter. Section I: Introduction (Organization and Reading Map);
Section II: Architecture of IoV (Physical Layer, Communication Layer, Computation Layer,
Application Layer); Section III: Challenges of IoV (Privacy Leak, High Mobility,
Complexity in Wireless Networks, Latency-Critical Applications, Scalability and
Heterogeneity); Section IV: Overview of Blockchain Technology (Blocks, Miners, Nodes);
Section V: Types of Blockchain Network (Public, Private, Consortium, and Hybrid
Blockchain); Section VI: Motivations of Using Blockchained IoV; Section VII: Recent
Solutions for IoV Integration with Blockchain; Section VIII: Applications of
Blockchained IoV (Incentive Mechanisms, Trust Establishment, Security and Privacy);
Section IX: Use Cases of Blockchained IoV; Section X: Future Scope of Blockchained IoV;
Section XI: Conclusion
Fig. 2 Architecture of Internet of Vehicles (IoV), showing the application, computation,
communication, and physical layers. Diagram labels include geo-location service,
on-board entertainment, auto pilot, navigation planner, data analysis, vehicle fault
detection, safety measures calculation, traffic flow calculation, cloud server, NFC,
Bluetooth, Wi-Fi, satellite, near-field communication, vehicles, cellular network,
smart phones, human, and traffic signals
2.2 Communication Layer
There are two types of communication: near-field communication and far-field
communication. Near-field communication uses Bluetooth, RFID, and other
technologies to construct a VANET for sharing information among vehicles and
between vehicles and the environment. Far-field communication uploads data to
cloud servers over a variety of wireless links, including cellular
networks and satellite links. The server then processes the information in
order to offer matching services to each car [27, 29, 32].
2.3 Computation Layer
Once the data has been uploaded to cloud servers, it is summarised, analysed,
and processed to offer powerful computing services [33]. A combined
data analysis is carried out to establish different connections between
automobiles. Effective, correct, and quick data computation services are
provided to user groups.
2.4 Application Layer
The application layer is the topmost layer of the IoV and offers consumers a range
of vehicle services [34]. The vehicle's real-time information is processed by
the lower layers, and the computed result is sent to the on-board system. This
enables customers to access location and navigation-related travel services, such as
assisted trip planning, on-board entertainment systems, and self-driving cars.
3 Challenges of IoV
Progress in the Internet of Vehicles (IoV) sector is faster than ever before
as new vehicle equipment and Internet technologies converge. In principle, the
aim of the IoV is to increase driver safety in the near future while also
improving vehicles, transport infrastructure, and people's lifestyles.
The IoV will introduce an enormous quantity of data, derived from cars and
vehicle services, into cloud and edge storage devices. In addition, future
vehicles will be equipped with very powerful computing and storage capabilities.
These resources and services will be exchanged among vehicles in order to
provide a broad variety of application services. As IoV connections develop, the
associated problems will be exacerbated, and a variety of issues are intricately
linked to intelligent transportation systems (ITS). When IoV technologies
are integrated with current Internet technologies, numerous problems [24–26] arise,
including security, confidentiality, trustworthiness, openness, performance,
and connectivity issues.
3.1 Privacy Leak
Privacy leakage is defined as the unauthorised acquisition and use of a person's
private information by an outside entity without the person's knowledge or
consent. Because the information has economic value, a privacy breach can
disrupt the owner's life. While the owner's mobile phone number or licence
plate number may have little commercial worth, the driving track and car pricing
may have advertising potential. In addition to exposing personal itineraries,
leaked travel information may lead to the disclosure of further
personal information. Personal information can leak through three pathways
in the IoV architecture:
3.1.1 Leakage in the Communication Layer
When near-field communication has no authentication method, the trustworthiness
of the other party cannot be ensured, and sharing private information
may lead to the loss of confidential information. During far-field data
transmission, information may be monitored or captured by other
parties.
3.1.2 Vulnerabilities in the Computation Layer
An attack on the computation layer, which is responsible for executing
computing jobs, may expose private information about the user.
Additionally, since the data is kept on devices that are not within the user's control,
the service provider may acquire and use the data unlawfully.
3.1.3 Vulnerabilities in the Application Layer
Vehicle-embedded operating systems, as well as vehicle-mounted software and
hardware, may be compromised through a variety of sophisticated technical
methods. When the vehicle information storage system, infotainment
system, or navigation system is compromised, malicious hackers can steal users'
private data and monitor the users for an extended period of time [22, 23].
3.2 High Mobility
In IoV scenarios, both human-driven and autonomous vehicles (AVs) are regarded as
highly mobile objects that normally travel along roadways, in contrast to other
IoT smart devices. Likewise, the speeds at which cars operate may vary, leading
to varied mobility, especially for manually driven vehicles. As a result,
although cars can make effective use of computing and communication resources
when they are connected to a large number of peers, their connectivity is
challenging to sustain because of the highly mobile and varied nature of the network.
High mobility may therefore cause additional difficulties.
3.3 Complexity in Wireless Networks
A diverse wireless connectivity system, in which a variety of wireless technologies
compete, serves as the basis of the IoV ecosystem. In this environment, vehicles
communicate with neighbouring vehicles, humans, and fixed RSUs through a
wireless network. The technology used is often a combination of Bluetooth,
mmWave, and dedicated short-range communication (DSRC), which makes different
wireless network-related services possible. Their coverage areas differ:
Bluetooth covers less than 100 m and mmWave less than 10 m, whereas DSRC
usually offers wider coverage. Furthermore, when the vehicles move, the
topologies of their networks change. As a result, network complexity has a
substantial effect on IoV scenarios.
3.4 Latency-Critical Applications
Many IoV applications need more efficient network methods to share data with
local peers rather than with far-off centralised cloud nodes. Such applications
are frequently sensitive to delays and usually involve relatively short
propagation distances. As a result, the maximum delay between source and
destination should be as short as feasible. Examples are emergency and
safety-related automotive applications, where communication must occur within a
certain time frame in order to avoid events such as collisions. To guarantee
that prospective Internet-assisted technologies do not introduce excessive
transmission delays, a delay constraint should be placed on IoV implementations.
3.5 Scalability and Heterogeneity
Vehicles that often travel over a large geographical region offer a potentially
handy alternative for achieving scalability through roadside edge computing nodes,
VANETs, and wirelessly linked Internet technologies. Additionally, when
heterogeneity is taken into account, IoV components with diverse devices, protocols,
and platforms can be expected to integrate smoothly with cutting-edge information
and communication technologies. At the same time, this variability among IoV
components makes interoperability an additional difficulty. Interoperability
refers to the capacity of IoV components to use information and to exchange it
across sectors, centres, and systems, including hardware and software.
Nonetheless, the majority of the aforementioned characteristics are virtually
universal in automotive ad hoc network situations. In IoV applications, the
difficulties that arise from these particular features must be overcome, and the
ways of overcoming them are likewise specific to IoV.
4 Overview of Blockchain Technology
A blockchain is a decentralised computing and information exchange platform that
facilitates trustless cooperation and coordination between institutions that do not
entirely trust one another. It is a kind of database built from blocks and chains:
in contrast to conventional databases, a blockchain stores data in blocks, which
are subsequently linked together. When new information arrives, a new block is
added. For instance, a Bitcoin block includes information about the sender, the
recipient, and the amount of Bitcoin being sent. Blockchain is a synthesis of three
technologies: cryptographic keys and hashes, a peer-to-peer network with a shared
ledger, and a means of computing to store the network's transactions and records.
In a nutshell, blockchain technology is a decentralised, distributed ledger that
tracks the provenance of a digital asset. Because the data on a blockchain is
extremely hard to change once recorded, it is a genuine disruptor in sectors like
finance, cybersecurity, health, and the Internet of Things. Blockchain is a
particularly promising technology, since it helps to reduce risk, stamp out fraud,
and bring transparency at scale to a multitude of applications. Blocks, nodes, and
miners are the three main elements of the blockchain.
4.1 Blocks
A chain is made up of many blocks, each of which has three fundamental elements:
the data in the block, a nonce (a 32-bit number used only once), and a hash (a
256-bit number computed from the block contents, including the nonce), which serves
as the block header hash. When the first block of a chain is created, a nonce is
chosen and generates the cryptographic hash. The data in the block is then deemed
signed and remains permanently tied to the nonce and hash unless the block is mined
again.
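The relationship between data, nonce, and hash can be illustrated with a minimal
Python sketch. It is not the block format of any particular blockchain: the class
name Block, the JSON serialisation, and the inclusion of a prev_hash field are
assumptions made for illustration, and SHA-256 is used simply because it yields a
256-bit digest.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass
class Block:
    """Illustrative block; field names are assumptions, not a real wire format."""
    data: str        # payload, e.g. a serialised list of transactions
    prev_hash: str   # hex digest of the preceding block
    nonce: int = 0   # 32-bit number that is varied during mining

    def hash(self) -> str:
        """Return the SHA-256 digest of the block contents as a hex string."""
        payload = json.dumps(
            {"data": self.data, "prev_hash": self.prev_hash, "nonce": self.nonce},
            sort_keys=True,
        ).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


block = Block(data="A pays B 5 units", prev_hash="0" * 64)
first = block.hash()
block.nonce += 1            # changing only the nonce changes the whole digest
second = block.hash()
print(len(first) * 4)       # 64 hex digits, i.e. 256 bits
print(first != second)
```

Because the digest covers the data, the nonce, and the previous hash together, any
change to one of them produces a different block hash.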
4.2 Miners
Through a process known as mining, miners add new blocks to the chain. Every
block in a blockchain has its own distinct nonce and hash and also references the
hash of the preceding block, which makes mining a block very difficult. Miners use
specialised software to tackle the extremely difficult mathematical challenge of
finding a nonce that generates an accepted hash. Because the nonce is only 32 bits
long while the hash is 256 bits long, there are about four billion possible
nonce-hash combinations that may have to be tried before the right one is found.
When this occurs, the miner is said to have discovered the "golden nonce," and the
block is added to the blockchain. Modifying a block earlier in the chain requires
re-mining not just that block but all the blocks that follow it. Manipulating a
blockchain is therefore very hard. This may be seen as security through
mathematics, since it takes a great deal of time and computing power to discover
golden nonces. When a block is successfully mined, all of the nodes in the network
accept the change, and the miner is rewarded financially.
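The search for a golden nonce can be sketched in a few lines of Python. This is a
simplified illustration rather than the Bitcoin algorithm: the functions block_hash
and mine, the string encoding of the block, and the rule that an accepted hash must
start with a given number of zero hex digits are all assumptions chosen to keep the
example small.

```python
import hashlib

MAX_NONCE = 2**32  # a 32-bit nonce allows about four billion candidates


def block_hash(data: str, prev_hash: str, nonce: int) -> str:
    """Hash the block contents together with the candidate nonce."""
    return hashlib.sha256(f"{prev_hash}|{data}|{nonce}".encode("utf-8")).hexdigest()


def mine(data: str, prev_hash: str, difficulty: int = 3):
    """Try nonces in order until the hash starts with `difficulty` zero hex digits."""
    target_prefix = "0" * difficulty
    for nonce in range(MAX_NONCE):
        digest = block_hash(data, prev_hash, nonce)
        if digest.startswith(target_prefix):
            return nonce, digest   # the "golden nonce" and the accepted hash
    return None                    # nonce space exhausted without success


nonce, digest = mine("vehicle A pays RSU 7", prev_hash="0" * 64)
print(nonce, digest)
```

Each additional required zero digit multiplies the expected number of attempts by
16 in this toy scheme, which is why finding a valid nonce is costly while checking
one takes a single hash computation.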
4.3 Nodes
Decentralisation is one of the most significant ideas in blockchain technology:
the chain cannot be owned by a single device or entity. Rather, it is distributed
among the nodes linked to the chain. A node is any electronic device capable of
maintaining a copy of the blockchain and keeping the network functioning. Each node
has its own copy of the blockchain, and the network must algorithmically approve
each freshly mined block before the chain is updated, trusted, and verified.
Because blockchains are transparent, any transaction on the ledger can be readily
checked and viewed. Each participant is issued a unique alphanumeric identifier
that identifies their transactions. Combining publicly available information with a
system of checks and balances helps to preserve integrity and foster confidence
among the network's users. Blockchains may thus be described as a way of scaling
trust through technology.
As shown in Fig. 3, the transactions recorded in a blockchain are organised into
blocks, with each freshly produced block referencing the block before it using a
unique identification number known as a “hash.” These blocks form a chain, which
is why the term “blockchain” was coined. This sequence of events may continue
forever.
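The way each block references the hash of the block before it, and why this exposes
tampering, can be sketched as follows. The dictionary layout and the helper names
build_chain and first_invalid_block are hypothetical and serve only to illustrate
the linking; no mining or networking is modelled.

```python
import hashlib


def block_hash(index: int, data: str, prev_hash: str) -> str:
    """Hash a block's position, payload, and back-link."""
    return hashlib.sha256(f"{index}|{data}|{prev_hash}".encode("utf-8")).hexdigest()


def build_chain(records):
    """Create a list of blocks, each storing the hash of its predecessor."""
    chain, prev = [], "0" * 64
    for index, data in enumerate(records):
        digest = block_hash(index, data, prev)
        chain.append({"index": index, "data": data, "prev_hash": prev, "hash": digest})
        prev = digest
    return chain


def first_invalid_block(chain):
    """Return the index of the first block that fails verification, or None."""
    prev = "0" * 64
    for block in chain:
        recomputed = block_hash(block["index"], block["data"], block["prev_hash"])
        if block["prev_hash"] != prev or block["hash"] != recomputed:
            return block["index"]
        prev = block["hash"]
    return None


chain = build_chain(["A pays B 5", "B pays C 2", "C pays A 1"])
print(first_invalid_block(chain))   # None: the untouched chain verifies
chain[1]["data"] = "B pays C 200"   # tamper with a recorded transaction
print(first_invalid_block(chain))   # 1: the stored hash no longer matches
```

If the attacker also recomputes the hash of the altered block, the mismatch simply
moves to the next block, whose back-link still points to the old hash.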
Here, trust is established via technical characteristics, including the fact that
all blocks are publicly visible. No transaction is placed into a block without
first being confirmed by a miner, a specific type of computer inside the network.
In this manner, the community monitors transaction integrity to guarantee that
there are no false records on the blockchain. As a result, parties that do not
necessarily trust one another may use a blockchain to do business, since they can
be confident that their transactions are secure and cannot be tampered with.
Fig. 3 Transaction flow in blockchain. Stages labelled in the figure: a transaction
request is made within the network; the request is uploaded to the P2P network
containing the network nodes; mining, i.e. validation by blockchain network nodes
using a cryptographic algorithm; blockchain validation and addition of new blocks
by miners; the new block containing the transaction is added at the end of the
blockchain; the transaction is complete once the new block is integrated with the
blockchain. Each block shown holds data, its own hash, and the hash of the
previous block.
5 Types of Blockchain Network
It was with blockchain technology that the concepts of a public blockchain and
a cryptocurrency were first presented to the world. The developers' intentions are
unclear, but the concept of decentralised ledger technology was presented. This
has altered the way problems are resolved, since it allows groups to function
without relying on a centralised body. Distributed technologies not only address
the disadvantages of centralisation but also bring problems of their own. For
example, Bitcoin uses proof of work, an inefficient consensus algorithm that
requires nodes to spend energy on mathematical computations. As long as the
problems were simple, solving them took little time or effort. As their difficulty
rose, however, the time and effort needed to solve them rose as well. Because of
this inefficiency, proof of work is unsuitable for any system that must remain
efficient regardless of the circumstances. Banks, for example, deal with a large
number of transactions on a daily basis, and a blockchain of this kind would not be
appropriate for them. The first generation of blockchain also brought additional
difficulties, such as limited scalability and a lack of automation.
Next, let us look at this from a different perspective. Public blockchains are not
suitable for all entities. If an entity must keep some aspects of its operations
private, it cannot use a public blockchain: a company holds important data that
sustains its performance, and rivals could exploit that data if it became public.
Private and federated blockchains were created to address such use cases. Private
blockchains enable organisations to have total control over who participates in
the network. This gives them the opportunity to benefit from blockchain-based
capabilities without having to expose everything. In summary, the first-generation
blockchain has numerous shortcomings, including inefficiency and limited
scalability, and a public blockchain, although open to everybody, does not suit
certain business interests. These two aspects may be seen as the main drivers
behind the development of the various kinds of blockchain technology.
Public blockchains, private blockchains, consortium blockchains, and hybrid
blockchains, each of which is unique, are the primary kinds of blockchain networks
[35]. Each of these platforms has a number of advantages, disadvantages, and
optimal applications.
5.1 Public Blockchain
A public blockchain is a permissionless distributed ledger system: users need no
additional permission to take part. Anybody with an Internet connection may
register on a blockchain platform, become an authorised node, and join the
blockchain network. Public blockchain nodes or users are permitted to view
current and previous data, verify transactions, and provide proof-of-work for an
arriving block. Public blockchains are used mainly for mining and cryptocurrency
exchange. Bitcoin, Ethereum, and Litecoin are the most popular public blockchains.
Public blockchains are generally safe as long as users adhere to rigorous security
rules and procedures; the risk arises when participants do not genuinely follow
them.
5.1.1 Benefits
Because public blockchains are entirely independent of organisations, they
continue to function as long as there are computers linked to them, even if the
organisation that created them goes out of business. This is one of their main
benefits. Another benefit is that they allow for more network openness. Public
blockchains are also generally safe, provided that users strictly follow security
rules and techniques.
5.1.2 Drawbacks
On the other hand, public networks may be painfully slow, and businesses cannot
limit usage or access. If hackers acquire 51% or more of a public blockchain
network's processing power, they may alter it unilaterally. Public blockchains also
do not scale well: as additional nodes join, the network slows down.
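Why a majority of processing power is decisive can be illustrated with the
gambler's-ruin estimate used in the Bitcoin white paper: an attacker controlling a
fraction q of the hashing power, starting z blocks behind the honest chain, ever
catches up with probability (q/p)^z when q < p = 1 - q, and with probability 1
otherwise. The function name catch_up_probability and the sample shares below are
chosen for illustration only.

```python
def catch_up_probability(attacker_share: float, blocks_behind: int) -> float:
    """Probability that an attacker ever catches up with the honest chain.

    With honest share p = 1 - q, the result is 1 if q >= p and
    (q / p) ** blocks_behind otherwise (gambler's-ruin model).
    """
    if not 0.0 <= attacker_share <= 1.0:
        raise ValueError("attacker_share must be between 0 and 1")
    q = attacker_share
    p = 1.0 - q
    if q >= p:
        return 1.0
    return (q / p) ** blocks_behind


# Chance of rewriting a block buried under six later blocks, by attacker share.
for share in (0.10, 0.30, 0.45, 0.51):
    print(share, catch_up_probability(share, blocks_behind=6))
```

Below one half, the chance of success shrinks exponentially with the number of
blocks to be overtaken; at or above one half, the attacker eventually succeeds in
this model.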
5.1.3 Applications
For the most part, public blockchains are used for cryptocurrency mining and the
trading of digital currencies like Bitcoin. They may also be used to certify
permanent records, such as affidavits and government records of property
ownership, with an auditable chain of custody. A blockchain that focuses on
transparency and trust is well suited to organisations such as social assistance
groups and non-governmental organisations. Because of the network's public
character, private companies will almost certainly wish to avoid it.
5.2 Private Blockchain
A private blockchain operates in a restricted environment, such as an
organisation's internal network. In contrast to public blockchains, private
blockchains are often used inside an organisation or business where only a small
number of individuals are allowed to participate in the network. This arrangement
gives the governing organisation exclusive discretion over the degree of security,
authorisation, approval, and access. Private blockchains therefore work in a
similar way to public blockchains, but on a smaller and more restricted network.
They are used for a variety of purposes, including e-voting, supply chain
management, digital identification, and asset ownership. MultiChain, Hyperledger
projects such as Fabric and Sawtooth, and Corda are typical instances of private
blockchains.
5.2.1 Benefits
The controlling organisation determines who is granted access to a given
application or part of the system. A blockchain network set up for a particular
company may, for example, control which nodes have permission to access,
contribute, or modify data, and it may prohibit other parties from gaining access
to specific information. Because of their small size, private blockchains may be
very fast and can process transactions considerably faster than public blockchains.
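The kind of per-node control described here can be pictured as a membership table
held by the governing organisation. The sketch below is an assumption-laden
illustration: the node names, the three permission flags, and the is_allowed helper
are hypothetical and do not correspond to the access-control API of any specific
private blockchain platform.

```python
from enum import Flag, auto


class Permission(Flag):
    NONE = 0
    READ = auto()       # view ledger data
    WRITE = auto()      # submit transactions
    VALIDATE = auto()   # take part in validating blocks


# Hypothetical membership table maintained by the governing organisation.
ACCESS_LIST = {
    "hq-node": Permission.READ | Permission.WRITE | Permission.VALIDATE,
    "warehouse-node": Permission.READ | Permission.WRITE,
    "auditor-node": Permission.READ,
}


def is_allowed(node_id: str, action: Permission) -> bool:
    """Check a node's rights; nodes missing from the table get no permissions."""
    granted = ACCESS_LIST.get(node_id, Permission.NONE)
    return (granted & action) == action


print(is_allowed("auditor-node", Permission.READ))    # True
print(is_allowed("auditor-node", Permission.WRITE))   # False
print(is_allowed("outsider", Permission.READ))        # False
```

Denying unknown nodes by default mirrors the idea that only parties approved by the
organisation may read from or write to the ledger.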
5.2.2 Drawbacks
The main criticism of private blockchains is the disputed assertion that they are
not real blockchains, because decentralisation is the fundamental concept of
blockchain. Moreover, complete confidence in the information is more difficult to
establish, since centralised nodes decide what is genuine. A limited number of
nodes may also imply weaker security: the consensus process may be undermined if a
few nodes misbehave. Furthermore, the source code of private blockchains is usually
proprietary and closed, which makes it difficult to mine on such platforms. Users
cannot independently audit or verify it, which may lead to reduced security.
5.2.3 Applications
Private blockchains are better suited to situations where confidentiality is
required and the controlling organisation does not want the information to be
accessible to the general public. Supply chain management, asset ownership, and
internal voting are just a few of the applications of private blockchain
technology.
5.3 Consortium Blockchain
A consortium blockchain is a type of blockchain network that is managed by several
organisations in partnership. It differs from a private blockchain, which is
managed by a single company alone. Multiple organisations may serve as nodes in
this kind of blockchain, exchanging data and doing mining. Banks, government
agencies, and other institutions often use consortium blockchains. Well-known
instances include the Energy Web Foundation and R3.
5.3.1 Benefits
Compared with a public blockchain, a consortium blockchain tends to be more secure,
scalable, and efficient. Like private and hybrid blockchains, it also incorporates
access restrictions.
5.3.2 Drawbacks
Consortium blockchains are less transparent than public blockchains. The network
can still be compromised if a member node is breached, and the blockchain's own
rules may impair the functioning of the network.
5.3.3 Applications
This kind of blockchain is used in banking and payments, among other applications.
Several banks may collaborate to create a consortium, which then chooses the nodes
that verify transactions. Research groups may develop a comparable model, as may
organisations that want to track food. It is well suited to supply chains,
especially for food and medical applications.
There are also other consensus methods to take into consideration. Aside
from PoW and PoS, anybody intending to build a network should examine the other
kinds, which are available on various platforms, such as Waves and Burstcoin. For
example, leased proof of stake lets users earn rewards from mining without having
to run a mining node themselves, and proof of importance combines users'
transactions and balances to gauge their significance to the network.
5.4 Hybrid Blockchain
A hybrid blockchain combines elements of both private and public blockchains. It
uses both kinds of blockchain, which may include a private, permission-based system
and a public, permissionless system. With such a hybrid network, users can control
who has access to which information stored on the blockchain. Only a selected
portion of the blockchain's data or records may be made public, while the remainder
stays confidential inside the private network. The hybrid solution is flexible,
enabling users to connect a private blockchain with several public blockchains
easily. A transaction on the private network of a hybrid blockchain is usually
validated within that network. However, users may also publish it to a public
blockchain for validation. The public blockchain contributes more hashing power
and involves a greater number of nodes in verification, which improves the
network's security and transparency. Dragonchain is a well-known example of a
hybrid blockchain.
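One common way to picture this split is that full transactions stay on the private
side while only a digest is published for public verification. The sketch below
assumes exactly that design; the class HybridLedger, its two in-memory lists, and
the trivial validation rule are hypothetical stand-ins and do not describe how
Dragonchain or any other platform is implemented.

```python
import hashlib
import json


class HybridLedger:
    """Toy model: private records plus optional public digests (anchors)."""

    def __init__(self):
        self.private_records = []   # full transactions, visible to members only
        self.public_anchors = []    # SHA-256 digests, visible to everyone

    @staticmethod
    def _digest(tx: dict) -> str:
        return hashlib.sha256(json.dumps(tx, sort_keys=True).encode("utf-8")).hexdigest()

    def submit(self, tx: dict, publish: bool = False) -> str:
        """Validate privately, store the transaction, and optionally anchor it publicly."""
        if tx.get("amount", 0) <= 0:          # stand-in for private validation
            raise ValueError("invalid transaction")
        digest = self._digest(tx)
        self.private_records.append(tx)
        if publish:
            self.public_anchors.append(digest)
        return digest

    def publicly_verifiable(self, tx: dict) -> bool:
        """Anyone holding the transaction can check it against the public anchors."""
        return self._digest(tx) in self.public_anchors


ledger = HybridLedger()
ledger.submit({"from": "dealer", "to": "buyer", "amount": 3}, publish=True)
ledger.submit({"from": "dealer", "to": "supplier", "amount": 1})
print(ledger.publicly_verifiable({"from": "dealer", "to": "buyer", "amount": 3}))     # True
print(ledger.publicly_verifiable({"from": "dealer", "to": "supplier", "amount": 1}))  # False
```

Publishing only the digest keeps the transaction contents private while still
letting outside parties confirm that a disclosed record was not altered.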
5.4.1 Benefits
One of the great benefits of a hybrid blockchain is that attackers outside the
system cannot launch a 51% attack on the network, since it operates inside a closed
ecosystem. It also protects private information while still allowing communication
with other parties. According to its developers, the network is more scalable than
a public blockchain network.
5.4.2 Drawbacks
The main issue with this kind of blockchain is that it is not fully transparent,
because information may be hidden. Upgrading may also be difficult, and users have
no incentive to engage or participate in the network.
5.4.3 Applications
Real estate is one of the many compelling applications for hybrid blockchains.
With a hybrid blockchain, companies may keep the majority of their business
private while allowing open access to specific information, such as listings.
Retail can also simplify its operations with a hybrid blockchain, and highly
regulated sectors such as financial services and medical record keeping can benefit
as well. Users may access their own information through a smart contract, while
third parties are prevented from seeing it. Governments may also use hybrid
blockchains to store citizens' data privately and to exchange information securely
across organisations.
6 Motivations of Using Blockchain in IoV
The introduction of blockchain technology offers significant potential for creative
solutions in nearly all IoV use cases. Nearly all IoV scenarios are real time and
dynamic, and they create and exchange a substantial quantity of data. It is
doubtful that many traditional methods would be appropriate and successful in such
scenarios. In addition, increased connectivity may offer hostile actors new attack
avenues. Besides enhancing security, privacy, and trust, the incorporation of
blockchain into IoV may improve system performance and automation. A robust
technology such as blockchain should therefore be used to allow flexibility and to
manage large amounts of data. Owing to the characteristics described below,
blockchain (BC) may be an appealing way of solving these problems.
6.1 Decentralization
The blockchain’s design is decentralised and is less dependent on a single body;
thus, it may prove to be an effective method for deploying secure solutions.
Blockchain enables decentralised IoV networks to be established that include
additional distributed entities, such as RSUs, vehicles, and people. These
entities are capable of managing their own activities autonomously. The operating
principles of the present IoV network, which rely heavily on central
decision-making, will be simplified and moved to a decentralised architecture.
The ultimate effect of decentralisation will be to improve the transportation user
experience.
6.2 Availability
Due to the decentralised nature of BC, there is no single point of failure, so the
safety and availability of the system are improved. This is because all linked peer
nodes replicate and synchronise the blockchain data: if one or more nodes are
compromised, the services can still work efficiently. In addition, blockchain
technology relies on current cryptographic methods to guarantee that basic
confidentiality and anonymity properties are maintained, and greater anonymity and
confidentiality benefit Internet of Vehicles (IoV) networks.
6.3 Transparency
The BC resources and content are available to all the nodes that have joined the
network. The system itself is open and transparent, which eliminates the need to
create trust relationships among nodes. Nodes cannot deceive one another within
the constraints of the system's rules.
6.4 Immutability
Blockchain technology offers a high level of immutability for IoV services and
scenarios, since the blocks in a blockchain are linked to each other by hash values
that form the chain. Information recorded in the blockchain ledger cannot be
changed without breaking these links. BC thus offers a straightforward and fast
method of storing protected data. This immutability can prevent data manipulation
and alteration and aids correct auditing. With the assistance of a smart contract,
it is also possible to install and enforce preset rules or scripts. The blockchain
therefore makes it easy and safe to store confidential data.
6.5 Exchanges Automation
Transactions between devices and automobiles may be automated using smart contract
technology. As a result, services such as data exchange or resource sharing can be
carried out automatically, without human involvement. The decentralised nature of
the blockchain is especially useful for peer-to-peer (p2p) trade, sharing, and
interaction between two parties. Service requesters and providers are able to
communicate directly with each other through a p2p network. For IoV scenarios,
this p2p functionality is very helpful for securely transferring data and resources
between cars and RSUs. Because no middleman is required in the peer-to-peer
network, it ultimately leads to low-latency services and applications.
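The idea of an exchange that settles itself without a middleman can be sketched as
a small escrow object written in plain Python. It only imitates the logic of a
smart contract: the class DataExchangeContract, the in-memory balances dictionary,
and the party names are hypothetical, and nothing here runs on an actual
blockchain.

```python
import hashlib


class DataExchangeContract:
    """Toy escrow: the price is locked up front and released only on valid delivery."""

    def __init__(self, buyer: str, seller: str, price: int,
                 expected_hash: str, balances: dict):
        if balances.get(buyer, 0) < price:
            raise ValueError("buyer cannot fund the contract")
        self.seller = seller
        self.price = price
        self.expected_hash = expected_hash   # digest of the data agreed in advance
        self.balances = balances
        self.settled = False
        balances[buyer] -= price             # lock the payment in the contract

    def deliver(self, data: bytes) -> bool:
        """Pay the seller automatically if the data matches the agreed digest."""
        if self.settled:
            return False
        if hashlib.sha256(data).hexdigest() != self.expected_hash:
            return False
        self.balances[self.seller] = self.balances.get(self.seller, 0) + self.price
        self.settled = True
        return True


balances = {"car-17": 10, "rsu-3": 0}
agreed = hashlib.sha256(b"road-condition report").hexdigest()
contract = DataExchangeContract("car-17", "rsu-3", 4, agreed, balances)
print(contract.deliver(b"forged report"))          # False: digest mismatch, no payment
print(contract.deliver(b"road-condition report"))  # True: payment released
print(balances)                                    # {'car-17': 6, 'rsu-3': 4}
```

The rule that triggers payment is fixed when the contract is created, so neither
party needs to trust the other or involve a third party at delivery time.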
7 Recent Solutions for IoV Integration with Blockchain
In this section, based on our observations and research, we highlight recent
solutions for the integration of IoV with blockchain and the major difficulties
that must be tackled when blockchain is included in IoV scenarios. Based on a
review of recent research contributions from the year 2019, the section briefly
presents the perspective of each contribution, the challenges it addresses, and its
outcomes. In addition, we emphasise the remedies that the literature proposes for
these difficulties.
[36] concentrates on uncertain key management and proposes a new network
model based on safe broadcasting groups. The suggested model enables more
efficient distributed key management by shortening key transfer times.
Updating distributed ledgers and performing blockchain transfer operations increase
the energy and transaction loads. [37] uses a technique known as distributed
clustering that makes it possible to control the number of transactions
efficiently. Compared with the Bitcoin model, the suggested approach uses much
less energy and requires significantly fewer transactions.
[38] focused on the difficulties associated with centralised privacy solutions. It
employs a security mechanism based on remote attestation. The suggested approach
satisfies the decentralised characteristics, user anonymity, and traceability.
[39] addresses the compromise of vehicle identity privacy through tracking attacks
and the dissemination of forged messages by internal vehicles. It employs an
anonymous reputation system, which calculates reputation from past interactions as
well as views, and uses pseudonym addresses rather than actual names. This
approach can create a model of trust and also meet the needs of anonymity,
transparency, and stability.
[40] is specifically designed to address security and authentication issues in
consensus algorithms. It employs a Byzantine consensus method in conjunction with
a gossip protocol and a time sequence mechanism. The suggested consensus
outperforms the conventional method in terms of consensus efficiency, security,
scalability, and fault tolerance.
Due to the absence of penalty systems for sending misleading signals, [41] relies
on monitoring and enforcing trust values on vehicles. This method provides a safer
and more trusted network while limiting the transmission of misleading messages.
Due to the lack of decentralisation in current authentication methods, [42]
concentrates on mutual authentication using elliptic curve cryptography (ECC). The
suggested model is a lightweight, scalable, decentralised, and anonymous
authentication and key-exchange system.
There is currently no incentive system in place for backup miners in general
blockchain solutions. The suggested method by [43] improves and secures data
exchange by using contract theory.
Scalability has always been a major problem when trying to handle large amounts
of IoV data on the blockchain. The suggested framework by [44] can maximise
throughput while simultaneously guaranteeing low latency and decentralisation, and
deep reinforcement learning is used to achieve it.
Another significant obstacle to electric vehicle participation in energy distribution is the absence of an adequate incentive programme. The suggested system by
[45], which is based on price and reputation theories, not only encourages cars
to join the blockchain network in order to create a balanced grid but also allows
vehicles to optimise their utility.
For VANETs, [46] suggested block-SDV, a permissioned blockchain-enabled
software-defined architecture. The authors suggest using the consensus method
known as Redundant Byzantine Fault Tolerance (RBFT) to guarantee that all
consensus-required transactions, including executing and writing transactions, are
performed properly. A joint optimisation problem is modelled as a Markov decision
process with three components: state space, action space, and reward function.
Another aspect of the challenge when dealing with IoV using blockchain technology
is the security of the delegated PoS consensus mechanism, in particular the
possibility of collusion between miner candidates and compromised high-stake
vehicles when miners are selected via stake-based voting. [47] combines
reputation-based selection of miners among the candidates, two-stage verification
and auditing by active and stable miners, and contract theory to propose an
improved security mechanism. It provides strong defence against internal
collusion, a high detection rate for compromised candidate vehicles, and a better
reputation system than the existing systems.
Because of their restricted resources, vehicles are unable to take part in
competitive PoW- and PoS-like consensus mechanisms to earn incentives. Based on the
theories of satisfaction module, a brokerage technique with decision-making
capacity has been developed by [47]. The suggested method generates much more
profit and consumes significantly less energy when mining and validation operations
are uploaded.
The risk of exposing vehicle location privacy, for example through location tracing
and leaks of sensitive information, has always remained high when location-based
services are used. [48] uses the Dirichlet distribution to manage trust, together
with a data structure capable of storing trustworthiness. The suggested system is
capable of defending against attacks that target trust models, protecting
location privacy, and detecting hostile vehicles.
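How a Dirichlet distribution can turn observed behaviour into a trust value is
shown in the generic sketch below. It is not the scheme of [48]: the three rating
levels (bad, uncertain, good), the symmetric prior of 1, and the level weights are
assumptions made for illustration. The only property relied on is that the
posterior mean of a Dirichlet distribution for level i is (count_i + prior)
divided by the sum of all counts plus the prior mass.

```python
def dirichlet_posterior_mean(counts, prior=1.0):
    """Expected probability of each rating level given observed counts."""
    total = sum(counts) + prior * len(counts)
    return [(count + prior) / total for count in counts]


def trust_score(counts, weights=(0.0, 0.5, 1.0), prior=1.0):
    """Collapse the level probabilities into one score between 0 and 1.

    `counts` holds observations per level in the order (bad, uncertain, good);
    `weights` is an illustrative mapping from level to score.
    """
    probabilities = dirichlet_posterior_mean(counts, prior)
    return sum(p * w for p, w in zip(probabilities, weights))


print(trust_score((0, 0, 0)))   # no evidence yet: neutral score
print(trust_score((0, 0, 8)))   # consistently good behaviour raises the score
print(trust_score((6, 1, 1)))   # mostly bad behaviour lowers it
```

The prior keeps the score of a newly seen vehicle neutral, and each further
observation shifts it gradually, so a single report cannot dominate the estimate.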
There has been a dearth of effective decentralised keyword search methods. By
combining smart contracts and searchable encryption, the method suggested in [49]
may significantly enhance privacy protection, offering forward and backward
privacy. Smart contracts take over the role of centralised search.
[50] offers an SDN-enabled blockchain-based system for IoVs in fog computing
and 5G communication networks. It aims at efficient and effective management and
control of the vehicular network in order to ensure the safety of a standardised
vehicle communications architecture. As part of the shared management process,
IoVs integrate blockchain and SDN: the blockchain fulfils the requirement for trust
among linked peers, while SDN ensures efficient network administration and a
smooth control procedure across the network. Furthermore, fog computing addresses
the handover issues that arise in SDN when a large number of vehicles are linked to
the RSUs, and the low-latency communication services of 5G help to improve network
performance. The authors also propose a network trust model, which may reduce
harmful actions and stop users from being deceived by in-network peers by
determining whether the information supplied by those peers is reliable.
Permissioned blockchain users, in contrast to public blockchain users, are subject
to certain restrictions. In [51], policy-based authorisation, attribute-based
signatures, and certificateless cryptography are employed. The suggested signature
method has a small signature size and a low computational cost.
8 Applications of Blockchain in IoV
BC technology has many advantages, such as decentralisation, immutability,
anonymity, and exchange automation. Its potential to transform the vehicular
environment is evidenced by the various applications for which it has already been
considered. As shown in Fig. 4, there are three primary applications of blockchain
in IoV.
8.1 Incentive Mechanisms
Vehicle cooperation is the reason why V2V networks are so important. Data
forwarding, in particular, is very important for improving the safety of vehicles
on the road. Owing to their nature, these networks can be affected by factors such
as storage and bandwidth consumption. Current incentive mechanisms could be
adapted to encourage vehicles to share their network and computing capabilities,
and BC technology could be used to design incentive mechanisms that are secure and
scalable.
Fig. 4 Applications of blockchain in IoV
The concept of Vehicular Driver Reinforcement Networks (VDRNs) is based
on the idea of storing and carrying forward data between connected road devices.
In [52, 53], the authors proposed a method that uses Bitcoin to incentivize vehicle
operators to cooperate: drivers are rewarded if they transport data from one base
station to another. In a normal Bitcoin transaction, however, the money is simply
taken from the sender and sent to the recipient. The authors therefore present a
method that enables vehicle-to-car sharing services to use the Bitcoin BC for
conditional rewarding. This method overcomes the limitations of the Bitcoin BC and
could be extended to other scenarios. Vehicle-to-car sharing services could also
be considered as clients.
Despite the advantages of vehicular cloud computing, its existence still depends
on the vehicles' cooperation. [54] proposes a credit-based incentive approach to
encourage vehicles to cooperate in this field. The goal of that work is to develop
a conditional rewarding system that would allow a vehicle to collect data
without being incentivized. However, this system should not be implemented, as it
involves the deployment of the Bitcoin BC in a complex environment. The authors in
[52, 53] introduce a framework for establishing trust in vehicular networks. They
also introduce an incentive mechanism that rewards a vehicle that contributes to
the proper functioning of the networks. That work sets an ambitious goal: to
implement a secure and private key management system based on hashes and
public/private keys. However, this approach is not yet fully implemented, and the
system's security is not yet guaranteed.
M. Gupta et al.
Fig. 5 Blockchain-enabled Internet of Vehicles (legend: blockchain network, roadside unit, cloud storage, blockchain communication, blockchain nodes, node communication link, connectivity)
One of the most important factors in optimising an incentive system is the
protection of privacy. Guaranteeing a basic level of privacy is not enough:
unlinkability is also an essential property of such a system. The authors
of [55] propose a mechanism that allows vehicles to inform the public
about driving conditions, together with an incentive for the users who contribute.
For instance, if a vehicle wants to find out about a certain
traffic condition, it offers a reward to the users who provide the information.
Since the BC technology was initially designed to enable cryptocurrencies’
exchange, it has been proposed to be used in the vehicle environment for various
incentive mechanisms. Some of these include the sharing of information and
computational resources among vehicles, establishing collaborations in vehicular
cloud computing and intersection management, and improving the efficiency of BC
ledgers through different mechanisms. This can be done by treating vehicles
as blockchain nodes [54], as presented in Fig. 5. The feasibility of deploying BC
ledgers in an unstable environment should be studied, and the various use cases
for BC should be defined. Improving the performance of BC technology will be an
important challenge for its reliability. The concept of decentralised cloud
computing is proposed in [56]. Given the increasing computational capabilities
and storage requirements of vehicles, it has been proposed that vehicular cloud
computing be developed within the vehicular environment itself.
8.2 Trust Establishment
A vehicular network is composed of various kinds of nodes, such as vehicles
(mobile platforms), roadside units, and fixed base stations.
These nodes can be connected to each other using various communication
technologies. Establishing trust between these components is very important to
ensure the safety of road users. BC technology can help to create trust models that
are based on shared data and the immutability of the ledger. It can also distinguish
untrustworthy users from trusted ones. Vehicular networks are mainly used for
providing road safety services. These systems operate by broadcasting messages to
alert the surrounding vehicles about certain road conditions. However, they are also
prone to various types of attacks. To improve the trust between vehicles
and the environment, a system that uses a Bayesian inference model was
proposed [57, 58]. This model is used to determine the credibility of the messages
sent by other vehicles: the messages sent by each vehicle are analysed by the BC
network to determine whether they are credible. The goal of this system is to maintain
a trust index for each vehicle. [57, 58] propose a method that allows a vehicle
to determine which of the surrounding vehicles are trusted. It also collects the
trust index of the vehicles and their surrounding areas. However, there is no
mechanism to ensure that the data collected by the system is valid, so the proposed
solution should be improved to make it secure. [59] proposes a system
that enables local BCs to control the behaviour of the nodes within a given area.
This method could provide better security, but the complexity of the system still
needs to be evaluated.
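One simple way to picture a Bayesian trust index is a Beta-Bernoulli update, in which each message judged credible or not credible shifts a vehicle's estimated reliability. The sketch below is an assumption for illustration and not the model of [57, 58]: the `TrustIndex` class, its uniform prior, and the way credibility verdicts are obtained are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TrustIndex:
    """Beta(alpha, beta) belief about how often a vehicle's messages are credible."""
    alpha: float = 1.0   # prior pseudo-count of credible messages (uniform prior)
    beta: float = 1.0    # prior pseudo-count of non-credible messages

    def update(self, credible: bool) -> None:
        """Bayesian update after one message has been judged by the network."""
        if credible:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def value(self) -> float:
        """Posterior mean: expected probability that the next message is credible."""
        return self.alpha / (self.alpha + self.beta)


if __name__ == "__main__":
    trust = TrustIndex()
    for verdict in (True, True, False, True):
        trust.update(verdict)
    print(round(trust.value, 3))
```

With no evidence the index sits at 0.5. It rises with every credible message and falls with every non-credible one, and each new observation moves it less as evidence accumulates.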
To improve the trust level of messages, [60] introduced the proof-of-event
algorithm. To accelerate the dissemination of data, the authors
propose a two-phase continuous transaction in BC. This method allows the vehicles
to share their messages with the surrounding RSUs, which in turn simplifies the data
dissemination process. [57–60] introduce a vehicle-centric system that uses
RSUs to control and store data. The system only handles the
trust establishment and governance of the data. Another component of the system is
a reputation management mechanism, whose objective is to determine
which vehicle provides the most reliable data. The data collected through the BC
ledger is then stored in a secure environment and is easily verifiable. Each vehicle
can then select a trusted data provider.
[61] describes a system that aims to provide security in networks without RSUs.
It stores the BC ledger inside the vehicles and updates it when the vehicles
exchange data. Although this system is attractive, it should not be relied on as
the only way to establish trust within a platoon without external help: managing
the global trust index calls for a more complex approach. In [62], the
authors present a decentralized system that enables vehicles to send and receive
emergency messages through a BC ledger. They also provide a secure and trust-based
environment for the users. In [63], the authors introduce the benefits of
permissioned BC for Content Centric Networks. They show how it can improve
the trust in vehicular environments by allowing users to control the behaviour of the
various nodes. Message transmission between vehicles is handled through the BC
ledger. The vehicle’s surrounding details are then controlled by the vehicle’s owner.
If the transaction is valid, a trusted validator can be added to the ledger. Since the
position of the message sender is not verified, the message is not sent to all the
vehicles in the vicinity. This method could require different security measures and
energy consumption.
In a virtual environment, such as a software-defined IoV (SD-IoV) [64],
applications can request and modify the behaviour of the controllers. SD-IoV
technology allows applications to request resources from different controllers.
This is very useful for monitoring and controlling the usage of resources, and it
is done by modifying the data plane’s configuration (reserving resources and
modifying communication paths). [65] introduced the concepts of an application
trust index and an application identity, with BC acting as a public key
infrastructure. When a controller discovers abnormal
behaviour, it shares this information with the other SDN controllers. This method
prevents the exploitation of the network by malicious nodes. [65] also aims to
introduce AI-based techniques to evaluate the behaviour of SDN applications. The
systems described here rely on the BC ledger to store details about the vehicles,
which limits the possibility of exploitation by attackers. A decentralized version
of this system, called the anonymous reputation system, has also been proposed.
The authors in [66] created an architecture composed of various entities, such
as the law enforcement authority, the registry authority, and the vehicle securing
system. This system allows each vehicle to request a new key. Because keys can be
renewed, a vehicle cannot be tracked, which protects its privacy. Its reputation
score is attached to its certificate, which enables the surrounding vehicles to
identify which ones are trusted. The system proposed in [66] should be able to
provide vehicles with the privacy they need, but it should be based on an efficient
certificate management mechanism. [59] presents a more complete and robust
BC-based mechanism that enables vehicle authentication and control. It can still be
improved in terms of its complexity and deployment.
BC aims to enable secure exchange of data between nodes. For vehicles, this
will be very important, since they will be able to share data with the surrounding
cars. To control the trustworthiness of messages [60], various authors have
proposed systems that allow users to rate that trustworthiness. These
systems can also protect the privacy of the users. For fully decentralized
systems, the concept is not yet clear: vehicles would act as BC
nodes and update their own BC ledger, but this method requires several steps
to be successful. Vehicular networks are designed to enable trust establishment by
controlling the behaviour of the vehicles. These networks can then check whether the
information provided by the vehicles is correct. These systems can help driverless
cars operate by controlling the behaviour of the surrounding vehicles. Aside from
this, they can also detect the location of vehicles and provide useful data. In these
systems, the performance of the multilayered BC should be evaluated to determine
its relevance. Then, the behaviour of the RSUs should be controlled by the BC
network to ensure their security.
8.3 Security and Privacy
Regardless of the type of application, securing the network is essential.
Various security services can be used to protect data. Aside
from enforcing the necessary permissions, a BC system should also allow the users
to authenticate and read private content, and it should protect the
integrity of the data. Due to the decentralized nature of the network, data availability
is guaranteed, and control mechanisms are easily defined. This eliminates the need
for manual intervention and complex encryption methods.
Vehicle authentication is a challenging task, especially since it requires
new solutions that can cope with fast-moving vehicles.
For this reason, it is often necessary to develop solutions that are specifically built
for vehicular environments. Due to their various limitations, centralized PKIs [67]
are not able to provide users with the level of security required. In [68],
the authors introduce a BC-based PKI system that is designed to address these
issues. Through this system, a BC ledger can be shared between the revocation
authority and the RSUs, allowing for mutual authentication between the vehicles
and the RSUs. [69] introduces an approach that enables users to quickly share new
revocation notifications with their surrounding users. This method works by storing
the new certificates and other details in a secure environment. It could
improve the system’s latency, provide better storage capabilities, and
reduce the verification overhead. However, this approach should be evaluated to
ensure that the resulting security level is still adequate.
The Security Credential Management System (SCMS) is a PKI system that secures
vehicular communications [70]. It can be used to establish and maintain
mutual trust, and its main goal is to provide a transparent and authenticated system
that can be used by both parties. Each vehicle that has been certified to use BC
must agree to share its abnormal behaviour data with a designated RSU. The data
is then shared with the global BC network. This proposal mainly addresses
the issue of revocation lists. However, it also addresses various aspects of
implementation, such as establishing a cluster and implementing
security measures. [71] studies key management in heterogeneous vehicular
communication systems. In that work, the authors introduce a method that enables
key transfer between central and local security managers. Due to the complexity of
the distributed key management concept, a simple and secure method is needed
for transferring key information between different service managers.
The work presents a distributed key transfer handshake scheme that takes
advantage of the BC ledger’s efficiency and transparency. Although it
can reduce the complexity of the transaction, it can still incur high
computational time and overhead. To overcome the limitations of the existing public
key infrastructure (PKI), various BC-based solutions have been proposed. Their
aims include protecting against unauthorized access to the networks of certification
authorities, reducing communication exchanges between network devices and the
authorities, and improving the performance of existing systems.
With the increasing volume of data generated by the IoV and vehicular
networks, securing the confidentiality of this information is an important
point. The data collected by the vehicles will be very useful for various applications.
However, accessing and storing this data in a fast and efficient manner is an
important challenge. Applications can request pieces of information
from a given RSU, and the system can also balance the load between the different RSUs.
With the permissions being controlled, the applications can be authenticated and
secured. However, the complexity of the access control mechanism should not be
ignored. BC-based access control has many advantages, such as
transparency, low-cost deployment, and distributed auditability [72]. However,
implementing this technology in vehicular networks is still in its early stages.
[73] describes a BC-based access control system for Internet of Things devices,
which could enable users to own and control the data they consume. This concept
could be used in new applications.
One of the main advantages of a BC ledger is the ability to provide consistent and
reliable data availability. However, a network failure can undermine this benefit
unless the system ensures that the data remains available. In addition, to
minimize network downtime, it is important that the system is capable of handling
V2V communications.
Non-repudiation and integrity are intrinsic to BC-based applications:
transactions are verifiable through the BC ledger and cannot easily be
invalidated. Although it is not considered a major issue, a high level of
integrity can also have a negative impact [74]. For instance, if a huge amount of
data is stored in a BC ledger, it cannot easily be erased or modified.
Moreover, BC technology can guarantee the integrity of the data stored in the BC
ledger, but not its correctness or relevance. This issue could be addressed by
combining BC technology with other secure computing platforms.
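The distinction between integrity and correctness can be seen in a minimal hash-chain sketch. It is a simplification that omits consensus, signatures, and networking, and the function names and block layout are assumptions made for illustration.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index: int, prev_hash: str, payload: str) -> str:
    """Hash a block's contents together with the hash of its predecessor."""
    record = json.dumps(
        {"index": index, "prev": prev_hash, "payload": payload}, sort_keys=True
    )
    return hashlib.sha256(record.encode()).hexdigest()


def append_block(chain: list, payload: str) -> None:
    """Append a new block that commits to the current tip of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    index = len(chain)
    chain.append({
        "index": index,
        "prev": prev_hash,
        "payload": payload,
        "hash": block_hash(index, prev_hash, payload),
    })


def is_valid(chain: list) -> bool:
    """Recompute every hash; any edit to an earlier block breaks the links."""
    prev_hash = GENESIS_PREV
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(i, prev_hash, block["payload"]):
            return False
        prev_hash = block["hash"]
    return True


if __name__ == "__main__":
    ledger = []
    append_block(ledger, "vehicle A: speed 52 km/h")
    append_block(ledger, "vehicle B: speed 900 km/h")  # implausible, yet accepted
    print(is_valid(ledger))
    ledger[0]["payload"] = "vehicle A: speed 30 km/h"
    print(is_valid(ledger))
```

The chain detects that the first record was altered after the fact, but it has no way of knowing that the second record was implausible when it was written. Validating the data itself requires a separate mechanism.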
Different privacy-preserving mechanisms have been designed to help prevent
attacks and minimize the exploitation of users’ privacy. These mechanisms
can be useful for many applications. For example, carpooling could expose sensitive
information such as a person’s location and identity. [75] proposes a privacy-preserving scheme that uses BC technology. As per the system requirements, RSUs
should also be involved in the system’s operations. Ideally, they should be able to
detect when a car is driving and prevent it from accessing certain services such as
carpooling. This system removes the need for intermediaries, or for users
themselves, to protect the users’ privacy. It allows the vehicle to modify its pseudonym at any time.
However, this method is not secure and has non-traceability. To improve scalability,
regions are defined, each of which is a set of service managers that are responsible for
maintaining the BC ledger. This system allows users to authenticate and access
credentials with a list of authorized pseudonyms. It does so by creating a new alias
for each new connection. This approach is not ideal for handling the registration of
vehicles. Instead, it uses a centralised registration service
to manage the entire process. Due to the large amounts of data that are generated in
the vehicular environment, it is important that the users’ privacy is protected.
[76–78] show three similar approaches that enable users to generate pseudonyms for
their privacy and also discuss the possibility of protecting localisation based on BC
technology.
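Per-connection pseudonyms of the kind discussed above can be sketched as follows. This is an illustrative assumption and not the scheme of [75] or [76–78]: the `PseudonymGenerator` class, the 16-character alias length, and the derivation from a locally held secret with HMAC are hypothetical choices.

```python
import hashlib
import hmac
import secrets
from typing import Optional


class PseudonymGenerator:
    """Derives a fresh alias per connection from a secret only the vehicle holds."""

    def __init__(self, secret: Optional[bytes] = None) -> None:
        # A random 32-byte secret is drawn when none is supplied.
        self._secret = secret if secret is not None else secrets.token_bytes(32)
        self._counter = 0

    def next_alias(self) -> str:
        """Return a new alias; without the secret, aliases cannot be linked."""
        self._counter += 1
        mac = hmac.new(
            self._secret, self._counter.to_bytes(8, "big"), hashlib.sha256
        )
        return mac.hexdigest()[:16]


if __name__ == "__main__":
    vehicle = PseudonymGenerator()
    print(vehicle.next_alias())
    print(vehicle.next_alias())
```

Because each alias is a keyed hash of a counter, an observer sees unrelated identifiers, while a party that holds the secret can regenerate the sequence if accountability is required.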
9 Use Cases of Blockchained IoV
Blockchain is a potentially revolutionary technology for the automotive sector.
Mobile devices and other equipment around the car (road signs, smartphones, etc.)
exchange information through the BC, and this communication is authorised and
protected. Some typical use cases are listed below:
9.1 Supply-Chain Management
BC may facilitate transparent communication between suppliers,
transporters, and manufacturers and help coordinate their activities in the
future.
9.2 Manufacturing and Production
Manufacturing and production processes may be enhanced by blockchain transparency in inventory management, ownership issues, product traceability, and quality check
records.
9.3 Settlements of Insurance Claim
BC may be able to handle insurance claims effectively by storing various
pieces of information, such as location, acceleration, braking, and vehicle speed,
in a protected BC header.
9.4 Management of Fleet
With blockchain, fleet managers, owners, operators, and drivers
may securely share critical information. The technology may automate
tasks such as route planning, payment processing, and vehicle maintenance, among
other things.
9.5 Tracking of Vehicle
Blockchain technology enables the safe management of transparent vehicle information, automobile title transfers, and car leasing.
10 Future Scope of Blockchained IoV
In the previous sections, we discussed the various applications of blockchain in the
IoV. Some of these are related to improving the vehicular network environment,
while the others are focused on developing new technologies. Directions that could
contribute to improving the network environment in the near future are listed below.
10.1 Off-Chain Data Trust
Although blockchain technology has been widely used to address various
security issues in IoVs, there are still many concerns regarding the quality of off-chain transactions. In [83], the authors proposed a method that aims to provide
secure and privacy-oriented incentives for off-chain data.
10.2 Evaluation Criteria
Most of the proposed solutions are based on independent evaluation and simulation.
As a result, there is no comparison between the advantages and disadvantages of
these solutions. Comparative evaluations could be valuable for different research directions.
10.3 Management of Resources
Due to the high number of transactions processed on each server, BC consumes a huge
amount of energy. The current consensus mechanisms also waste resources.
Although DPoS and PoS can reduce resource consumption, they suffer from
weak supervision and inadequate security. Lightweight cryptographic algorithms
are needed to address these issues.
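PoS-style schemes generally save resources by choosing the next block proposer in proportion to stake instead of through a hashing race. The sketch below illustrates only that selection step, under stated assumptions: the node names and stakes are invented, and a seeded pseudo-random generator stands in for the verifiable randomness a real protocol would need.

```python
import random
from collections import Counter
from typing import Dict


def pick_validator(stakes: Dict[str, float], rng: random.Random) -> str:
    """Choose one validator with probability proportional to its stake."""
    names = list(stakes)
    weights = [stakes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(42)                      # fixed seed, illustration only
    stakes = {"rsu_a": 90.0, "vehicle_b": 10.0}  # hypothetical nodes and stakes
    tally = Counter(pick_validator(stakes, rng) for _ in range(10_000))
    print(tally.most_common())
```

The sketch also makes the supervision concern concrete: a node holding most of the stake is selected most of the time, so the security of the scheme depends on how stake is distributed and monitored.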
10.4 Data-Centric Consensus
Designing consensus mechanisms for IoV is an important challenge [79], as it
involves ensuring that the data sent by users is verified correctly, and
many factors can affect the integrity of that data.
Traditional consensus mechanisms were not designed to validate or invalidate
data sent by a vehicle. Instead, it has been
suggested that various algorithms be used to guarantee the correctness of the data.
Consensus algorithms must therefore be improved to provide secure and resilient
solutions. However, such algorithms are still not deployable.
10.5 Blockchain for the Environment
The current BC and consensus mechanisms require a high level of computational
capability to perform their intended functions. This limitation could be addressed
by reducing the computational requirements of existing consensus protocols. The
security level of the BC ledgers is still an area of concern [80]. Another issue
is the amount of storage that the BC ledgers require.
Designing a lightweight BC system is a step towards realising the full potential of
this technology. This process involves improving the existing approaches [81, 82]
and developing new ones.
10.6 Administration of Blockchain Platform
Successfully implementing BC-based applications in vehicles means meeting requirements that vary with the environment and the complexity of the task.
These include bandwidth constraints, storage capabilities, and latency
constraints. Due to the complexity of the vehicular environment, various solutions
are being developed to enable the deployment of Bitcoin on mobile platforms. These
include enabling devices with limited processing capabilities to access BC services
or side chains that improve the network’s scalability. It is also important that the
deployment of BC networks in vehicles is carried out in a way that is efficient and
secure.
10.7 Evaluation of Performance
The integration of various technologies such as SDN, NFV, edge computing, and
BC has been proposed to improve the overall performance of vehicular networks.
However, to get the most out of these new features, it is necessary to
develop suitable software for evaluating their performance in a
near-realistic environment. The current evaluation of the proposed solutions mainly
relies on independent and specific simulation methods that are not very meaningful.
Such an evaluation would also make it possible to determine which kinds of
application each technology can support. For instance, for some applications, such
as high-speed web surfing, BC-based approaches might not be suitable.
10.8 Design of New Services
Through the use of BC technology, connected cars could operate
seamlessly, allowing the users to collect and share their data in exchange for
financial compensation. This concept is very beneficial for the vehicle economy, as
it will allow the users to store and manage their data without having to rely on a
financial institution.
10.9 Future Architecture Integrations
There are a number of technologies that should be integrated in IoV: artificial
intelligence, edge computing, and software-defined networking. BC could be used
to improve these technologies or even secure them. Its immutability could allow it
to be used to improve various aspects of AI.
This chapter, inspired by paper [84], proposes using blockchain technology to
improve the security and privacy of a hybrid vehicular framework. This chapter
tackles one of the main challenges of the integration of various technologies into
one vehicle architecture. In order to get the full benefit of blockchain technology,
this chapter proposes to implement blockchain in a hybrid framework that combines
various technologies such as 5G, fog computing, and SDN.
11 Conclusion
Every car will be linked to the Internet in the future vision of the Internet
of Vehicles, and blockchain technology is anticipated to offer credit support for
the fundamental information of vehicles at a low cost. The distributed ledger
(blockchain) provides an effective way to circumvent the centralised Internet of
Vehicles (IoV) architecture. Accordingly, this chapter began with preliminary
background covering the fundamental architecture of IoV, its difficulties, and
short explanations of blockchain technology, as well as the reasons for using
blockchain technology in IoV. We also addressed the study’s motivations by highlighting the
difficulties connected with IoV and the realisation of decentralisation, great flexibility, accessibility, and trustworthiness, followed by some significant
use cases of blockchained IoV. Based on this analysis, many significant difficulties
in the IoV (including unfulfilled expectations and challenges) seem to be just around
the corner. It is anticipated that the combination of blockchain technology with IoV
will substantially enhance the functionality of transportation networks. We believe
this chapter will be useful as a starting point for further study.
References
1. Contreras-Castillo, J., Zeadally, S., & Guerrero-Ibañez, J. A. (2017). Internet of vehicles:
Architecture, protocols, and security. IEEE Internet of Things Journal, 5(5), 3701–3709.
2. Qiu, T., Liu, X., Li, K., Hu, Q., Sangaiah, A. K., & Chen, N. (2018). Community-aware data
propagation with small world feature for internet of vehicles. IEEE Communications Magazine,
56(1), 86–91.
3. Kaiwartya, O., Abdullah, A. H., Cao, Y., Altameem, A., Prasad, M., Lin, C. T., & Liu, X.
(2016). Internet of vehicles: Motivation, layered architecture, network model, challenges, and
future aspects. IEEE Access, 4, 5356–5373.
4. Zyskind, G., & Nathan, O. (2015, May). Decentralizing privacy: Using blockchain to protect
personal data. In 2015 IEEE Security and Privacy Workshops (pp. 180–184). IEEE.
5. Fernández-Caramés, T. M., & Fraga-Lamas, P. (2018). A review on the use of blockchain for
the Internet of Things. IEEE Access, 6, 32979–33001.
6. Kshetri, N. (2017). Can blockchain strengthen the Internet of Things? IT Professional, 19(4),
68–72.
7. Christidis, K., & Devetsikiotis, M. (2016). Blockchains and smart contracts for the Internet of
Things. IEEE Access, 4, 2292–2303.
8. Dorri, A., Kanhere, S. S., Jurdak, R., & Gauravaram, P. (2017, March). Blockchain for IoT
security and privacy: The case study of a smart home. In 2017 IEEE international conference
on pervasive computing and communications workshops (PerCom workshops) (pp. 618–623).
IEEE.
9. Novo, O. (2018). Blockchain meets IoT: An architecture for scalable access management in
IoT. IEEE Internet of Things Journal, 5(2), 1184–1195.
10. Li, R., Song, T., Mei, B., Li, H., Cheng, X., & Sun, L. (2018). Blockchain for large-scale
Internet of Things data storage and protection. IEEE Transactions on Services Computing,
12(5), 762–771.
11. Li, Z., Yang, Z., & Xie, S. (2019). Computing resource trading for edge-cloud-assisted Internet
of Things. IEEE Transactions on Industrial Informatics, 15(6), 3661–3669.
12. Chen, W., Zhang, Z., Hong, Z., Chen, C., Wu, J., Maharjan, S. Z., & Zhang, Y. (2019).
Cooperative and distributed computation offloading for blockchain-empowered industrial
Internet of Things. IEEE Internet of Things Journal, 6(5), 8433–8446.
13. Nakamoto, S. (2008). Bitcoin: A peer-to-peer electronic cash system. Decentralized Business
Review, 21260.
14. Tschorsch, F., & Scheuermann, B. (2016). Bitcoin and beyond: A technical survey on
decentralized digital currencies. IEEE Communications Surveys & Tutorials, 18(3), 2084–
2123.
15. Belotti, M., Božić, N., Pujolle, G., & Secci, S. (2019). A vademecum on blockchain
technologies: When, which, and how. IEEE Communications Surveys & Tutorials, 21(4),
3796–3838.
16. Buterin, V. (2014). A next-generation smart contract and decentralized application platform.
White paper, 3(37).
17. Fromknecht, C., Velicanu, D., & Yakoubov, S. (2014). A Decentralized Public Key Infrastructure with Identity Retention (Vol. 2014, p. 803). IACR Cryptol. ePrint Arch.
18. Wilkinson, S., Boshevski, T., Brandoff, J., & Buterin, V. (2014). Storj a peer-to-peer cloud
storage network.
19. Wilkinson, S., Lowry, J., & Boshevski, T. (2014). Metadisk a blockchain-based decentralized
file storage application (pp. 1–11). Storj Labs, Technical Report, hal.
20. Crosby, M., Pattanayak, P., Verma, S., & Kalyanaraman, V. (2016). Blockchain technology:
Beyond bitcoin. Applied Innovation, 2(6–10), 71.
21. Kalodner, H. A., Carlsten, M., Ellenbogen, P., Bonneau, J., & Narayanan, A. (2015, June). An
Empirical Study of Namecoin and Lessons for Decentralized Namespace Design. In WEIS.
22. Butt, T. A., Iqbal, R., Salah, K., Aloqaily, M., & Jararweh, Y. (2019). Privacy management in
social internet of vehicles: Review, challenges and blockchain based solutions. IEEE Access,
7, 79694–79713.
23. Feng, J., Liu, Z., Wu, C., & Ji, Y. (2018). Mobile edge computing for the internet of vehicles:
Offloading framework and job scheduling. IEEE Vehicular Technology Magazine, 14(1), 28–
36.
24. Chattopadhyay, A., Lam, K. Y., & Tavva, Y. (2020). Autonomous vehicle: Security by design.
IEEE Transactions on Intelligent Transportation Systems.
25. Hahn, D. A., Munir, A., & Behzadan, V. (2021). Security and Privacy Issues in Intelligent
Transportation Systems: Classification and Challenges. IEEE Intelligent Transportation Systems Magazine, 13(1), 181–196.
26. Li, W., & Song, H. (2015). ART: An attack-resistant trust management scheme for securing
vehicular ad hoc networks. IEEE Transactions on Intelligent Transportation Systems, 17(4),
960–969.
27. Alam, K. M., Saini, M., & El Saddik, A. (2015). Toward social internet of vehicles: Concept,
architecture, and applications. IEEE Access, 3, 343–357.
28. Wang, L., & Liu, X. (2018). NOTSA: Novel OBU with three-level security architecture for
internet of vehicles. IEEE Internet of Things Journal, 5(5), 3548–3558.
29. Dai, F., Mo, Q., Li, T., Huang, B., Yang, Y., & Zhao, Y. (2020). Refactoring business process
models with process fragments substitution. Wireless Networks, 1–15.
30. Dai, F., Mo, Q., Qiang, Z., Huang, B., Kou, W., & Yang, H. (2020). A choreography analysis
approach for microservice composition in cyber-physical-social systems. IEEE Access, 8,
53215–53222.
31. Xu, X., Cao, H., Geng, Q., Liu, X., Dai, F., & Wang, C. (2020). Dynamic resource provisioning
for workflow scheduling under uncertainty in edge computing environment. Concurrency and
Computation: Practice and Experience, e5674.
32. Kaiwartya, O., Abdullah, A. H., Cao, Y., Altameem, A., Prasad, M., Lin, C. T., & Liu, X.
(2016). Internet of vehicles: Motivation, layered architecture, network model, challenges, and
future aspects. IEEE Access, 4, 5356–5373.
33. Zheng, W., Zheng, Z., Chen, X., Dai, K., Li, P., & Chen, R. (2019). NutBaaS: A blockchain-as-a-service platform. IEEE Access, 7, 134422–134433.
34. Liu, D. (2018, January). Big data analytics architecture for internet-of-vehicles based on the
spark. In 2018 International conference on intelligent transportation, big data & smart city
(ICITBS) (pp. 13–16). IEEE.
35. Niranjanamurthy, M., Nithya, B. N., & Jagannatha, S. (2019). Analysis of Blockchain
technology: Pros, cons and SWOT. Cluster Computing, 22(6), 14743–14757.
36. Lei, A., Cruickshank, H., Cao, Y., Asuquo, P., Ogah, C. P. A., & Sun, Z. (2017). Blockchain-based dynamic key management for heterogeneous intelligent transportation systems. IEEE
Internet of Things Journal, 4(6), 1832–1843.
37. Sharma, V. (2018). An energy-efficient transaction model for the blockchain-enabled internet
of vehicles (IoV). IEEE Communications Letters, 23(2), 246–249.
38. Xu, C., Liu, H., Li, P., & Wang, P. (2018). A remote attestation security model based on privacy-preserving blockchain for V2X. IEEE Access, 6, 67809–67818.
39. Lu, Z., Liu, W., Wang, Q., Qu, G., & Liu, Z. (2018). A privacy-preserving trust model based
on blockchain for VANETs. IEEE Access, 6, 45655–45664.
40. Hu, W., Hu, Y., Yao, W., & Li, H. (2019). A blockchain-based Byzantine consensus algorithm
for information authentication of the Internet of vehicles. IEEE Access, 7, 139703–139711.
41. Ijaz, A., & Javaid, N. (2019). Reward and penalty based mechanism in vehicular network using
decentralized blockchain technology. Submitted for publication.
42. Kaur, K., Garg, S., Kaddoum, G., Gagnon, F., & Ahmed, S. H. (2019, May). Blockchain-based lightweight authentication mechanism for vehicular fog infrastructure. In 2019 IEEE
International Conference on Communications workshops (ICC workshops) (pp. 1–6). IEEE.
43. Kang, J., Xiong, Z., Niyato, D., & Kim, D. I. (2019, May). Incentivizing secure block
verification by contract theory in blockchain-enabled vehicular networks. In ICC 2019–2019
IEEE International Conference on Communications (ICC) (pp. 1–7). IEEE.
44. Liu, M., Teng, Y., Yu, F. R., Leung, V. C., & Song, M. (2019, May). Deep reinforcement
learning based performance optimization in blockchain-enabled internet of vehicle. In ICC
2019–2019 IEEE International Conference on Communications (ICC) (pp. 1–6). IEEE.
45. Wang, Y., Su, Z., & Zhang, N. (2019). BSIS: Blockchain-based secure incentive scheme for
energy delivery in vehicular energy network. IEEE Transactions on Industrial Informatics,
15(6), 3620–3631.
46. Zhang, D., Yu, F. R., & Yang, R. (2019). Blockchain-based distributed software-defined
vehicular networks: A dueling deep Q-learning approach. IEEE Transactions on Cognitive
Communications and Networking, 5(4), 1086–1100.
47. De Maio, V., Brundo Uriarte, R., & Brandic, I. (2019, December). Energy and profit-aware
proof-of-stake offloading in blockchain-based VANETs. In Proceedings of the 12th IEEE/ACM
International Conference on Utility and Cloud Computing (pp. 177–186).
48. Luo, B., Li, X., Weng, J., Guo, J., & Ma, J. (2019). Blockchain enabled trust-based location
privacy protection scheme in VANET. IEEE Transactions on Vehicular Technology, 69(2),
2034–2048.
49. Chen, B., Wu, L., Wang, H., Zhou, L., & He, D. (2019). A blockchain-based searchable
public-key encryption with forward and backward privacy for cloud-assisted vehicular social
networks. IEEE Transactions on Vehicular Technology, 69(6), 5813–5825.
50. Gao, J., Agyekum, K. O. B. O., Sifah, E. B., Acheampong, K. N., Xia, Q., Du, X., et al.
(2019). A blockchain-SDN-enabled Internet of vehicles environment for fog computing and
5G networks. IEEE Internet of Things Journal, 7(5), 4278–4291.
51. Mu, Y., Rezaeibagha, F., & Huang, K. (2019). Policy-driven blockchain and its applications
for transport systems. IEEE Transactions on Services Computing, 13(2), 230–240.
52. Park, Y., Sur, C., Kim, H., & Rhee, K. H. (2017). A reliable incentive scheme using Bitcoin on
cooperative vehicular ad hoc networks. IT CoNvergence PRActice (INPRA), 5(4), 34–41.
53. Park, Y., Sur, C., & Rhee, K. H. (2018). A secure incentive scheme for vehicular delay tolerant
networks using cryptocurrency. Security and Communication Networks, 2018.
54. Alouache, L., Nguyen, N., Aliouat, M., & Chelouah, R. (2018, November). Credit based
incentive approach for V2V cooperation in vehicular cloud computing. In International
Conference on Internet of Vehicles (pp. 92–105). Springer.
55. Li, L., Liu, J., Cheng, L., Qiu, S., Wang, W., Zhang, X., & Zhang, Z. (2018). Creditcoin: A
privacy-preserving blockchain-based incentive announcement network for communications of
smart vehicles. IEEE Transactions on Intelligent Transportation Systems, 19(7), 2204–2220.
56. Hong, Z., Wang, Z., Cai, W., & Leung, V. (2017). Blockchain-empowered fair computational
resource sharing system in the D2D network. Future Internet, 9(4), 85.
57. Yang, Z., Zheng, K., Yang, K., & Leung, V. C. (2017, October). A blockchain-based reputation
system for data credibility assessment in vehicular networks. In 2017 IEEE 28th annual
international symposium on personal, indoor, and mobile radio communications (PIMRC) (pp.
1–5). IEEE.
58. Yang, Z., Yang, K., Lei, L., Zheng, K., & Leung, V. C. (2018). Blockchain-based decentralized
trust management in vehicular networks. IEEE Internet of Things Journal, 6(2), 1495–1505.
59. Shrestha, R., Bajracharya, R., & Nam, S. Y. (2018, October). Blockchain-based message
dissemination in VANET. In 2018 IEEE 3rd International Conference on Computing, Communication and Security (ICCCS) (pp. 161–166). IEEE.
60. Yang, Y. T., Chou, L. D., Tseng, C. W., Tseng, F. H., & Liu, C. C. (2019). Blockchain-based
traffic event validation and trust verification for VANETs. IEEE Access, 7, 30868–30877.
61. Wagner, M., & McMillin, B. (2018, December). Cyber-physical transactions: A method for
securing VANETs with blockchains. In 2018 IEEE 23rd Pacific rim international symposium
on dependable computing (PRDC) (pp. 64–73). IEEE.
62. Awais Hassan, M., Habiba, U., Ghani, U., & Shoaib, M. (2019). A secure message-passing
framework for inter-vehicular communication using blockchain. International Journal of
Distributed Sensor Networks, 15(2), 1550147719829677.
63. Ortega, V., Bouchmal, F., & Monserrat, J. F. (2018). Trusted 5G vehicular networks:
Blockchains and content-centric networking. IEEE Vehicular Technology Magazine, 13(2),
121–127.
64. Jiacheng, C., Haibo, Z. H. O. U., Ning, Z., Peng, Y., Lin, G., & Sherman, S. X. (2016). Software
defined Internet of vehicles: Architecture, challenges and solutions. Journal of communications
and information networks, 1(1), 14–26.
65. Mendiboure, L., Chalouf, M. A., & Krief, F. (2018, November). Towards a blockchain-based
SD-IoV for applications authentication and trust management. In International Conference on
Internet of Vehicles (pp. 265–277). Springer.
66. Lu, Z., Liu, W., Wang, Q., Qu, G., & Liu, Z. (2018). A privacy-preserving trust model based
on blockchain for VANETs. IEEE Access, 6, 45655–45664.
67. Al-Bassam, M. (2017, April). SCPKI: A smart contract-based PKI and identity system. In
Proceedings of the ACM Workshop on Blockchain, Cryptocurrencies and Contracts (pp. 35–
40).
68. Malik, N., Nanda, P., Arora, A., He, X., & Puthal, D. (2018, August). Blockchain based
secured identity authentication and expeditious revocation framework for vehicular networks.
In 2018 17th IEEE International Conference on Trust, Security and Privacy in Computing and
Communications/12th IEEE International Conference on Big Data Science and Engineering
(TrustCom/BigDataSE) (pp. 674–679). IEEE.
69. Lasla, N., Younis, M., Znaidi, W., & Arbia, D. B. (2018, February). Efficient distributed
admission and revocation using blockchain for cooperative its. In 2018 9th IFIP international
conference on new technologies, mobility and security (NTMS) (pp. 1–5). IEEE.
70. Whyte, W., Weimerskirch, A., Kumar, V., & Hehn, T. (2013, December). A security credential
management system for V2V communications. In 2013 IEEE Vehicular Networking Conference (pp. 1–8). IEEE.
71. Lei, A., Cruickshank, H., Cao, Y., Asuquo, P., Ogah, C. P. A., & Sun, Z. (2017). Blockchain-based dynamic key management for heterogeneous intelligent transportation systems. IEEE
Internet of Things Journal, 4(6), 1832–1843.
72. Maesa, D. D. F., Mori, P., & Ricci, L. (2017, June). Blockchain based access control. In IFIP
international conference on distributed applications and interoperable systems (pp. 206–220).
Springer.
73. Ouaddah, A., Abou Elkalam, A., & Ait Ouahman, A. (2016). FairAccess: A new Blockchain-based access control framework for the Internet of Things. Security and communication
networks, 9(18), 5943–5964.
74. Staples, M., Chen, S., Falamaki, S., Ponomarev, A., Rimba, P., Tran, A., Tran, I., Weber, X.,
& Xu, J. Z. (2017). Risks and opportunities for systems using blockchain and smart contracts.
Data 61. CSIRO.
75. Zheng, Y., Li, M., Lou, W., & Hou, Y. T. (2015). Location based handshake and private
proximity test with location tags. IEEE Transactions on Dependable and Secure Computing,
14(4), 406–419.
76. Sharma, R., & Chakraborty, S. (2018, December). BlockAPP: Using blockchain for authentication and privacy preservation in IoV. In 2018 IEEE Globecom Workshops (GC Wkshps) (pp.
1–6). IEEE.
77. Li, M., Zhu, L., & Lin, X. (2018). Efficient and privacy-preserving carpooling using
blockchain-assisted vehicular fog computing. IEEE Internet of Things Journal, 6(3), 4573–
4584.
78. Yao, Y., Chang, X., Mišić, J., Mišić, V. B., & Li, L. (2019). BLA: Blockchain-assisted
lightweight anonymous authentication for distributed vehicular fog services. IEEE Internet
of Things Journal, 6(2), 3775–3784.
79. Wang, X., Zha, X., Ni, W., Liu, R. P., Guo, Y. J., Niu, X., & Zheng, K. (2019). Survey on
blockchain for Internet of Things. Computer Communications, 136, 10–29.
80. Lin, I. C., & Liao, T. C. (2017). A survey of blockchain security issues and challenges.
International Journal of Network Security, 19(5), 653–659.
81. Dorri, A., Kanhere, S. S., Jurdak, R., & Gauravaram, P. (2019). LSB: A Lightweight Scalable
Blockchain for IoT security and anonymity. Journal of Parallel and Distributed Computing,
134, 180–197.
82. Liu, Y., Wang, K., Lin, Y., & Xu, W. (2019). LightChain: A lightweight blockchain system for
industrial Internet of Things. IEEE Transactions on Industrial Informatics, 15(6), 3571–3581.
83. Chen, W., Chen, Y., Chen, X., & Zheng, Z. (2019). Toward secure data sharing for the IoV: A
quality-driven incentive mechanism with on-chain and off-chain guarantees. IEEE Internet of
Things Journal, 7(3), 1625–1640.
84. Mendiboure, L., Chalouf, M. A., & Krief, F. (2019, May). Towards a 5G vehicular architecture.
In International Workshop on Communication Technologies for Vehicles (pp. 3–15). Springer.
Reliable System for Bidding System
Using Blockchain
N. Ambika
1 Introduction
Blockchain [5, 18] technology, which originated with Bitcoin, the first
cryptocurrency system, introduced in 2008, can provide a viable solution for IoT
privacy and security. The e-auction [7] is one of the best-known e-commerce
activities and enables bidders to bid for products directly over the Internet. In a
sealed bid, an additional transaction cost is incurred for the intermediary,
because a third party plays a significant role in helping buyers and sellers trade
during the auction. There is no guarantee that the third party can be trusted.
To resolve these issues, blockchain technology [3], with its low transaction cost,
is used to build smart contracts for public bids and sealed bids. The smart
contract, proposed in the 1990s and executed on the Ethereum platform, can keep
the bid secure, private, non-repudiable, and unalterable, because all
transactions are recorded in the same, yet decentralized, ledgers.
Bitcoin is a protocol that orders transactions into groups called blocks. The
protocol targets a block generation interval of ten minutes with a maximum block
size of 1 MB. The last 100 blocks observed had a median block size of 0.99 MB
and a mean interval of 9.8 min. The protocol runs on a peer-to-peer network
based on flooding block and transaction announcements. The peer-to-peer network is
formed by point-to-point connections. To make a connection, clients open a TCP
socket and complete a protocol-level three-way handshake. The protocol-level
handshake exchanges the state of each client, such as the height of the
blockchain [3, 4] and a version string associated with the software being run.
N. Ambika
Dept. of Computer Science and Applications, St. Francis College, Bangalore, India
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
S. Pandey et al. (eds.), Role of Data-Intensive Distributed Computing Systems in
Designing Data Solutions, EAI/Springer Innovations in Communication and
Computing, https://doi.org/10.1007/978-3-031-15542-0_9
Fig. 1 Principles of blockchain
When a client mines or receives a new block, it floods the network with the
hash of the block. If a neighboring client needs the block, it requests
the block by its hash.
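The announce-then-request pattern described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Bitcoin wire protocol: the `Node` class and its method names are hypothetical, peers are plain in-memory objects rather than TCP connections, and a single SHA-256 digest stands in for the block hash.

```python
import hashlib


class Node:
    """Toy peer (hypothetical) that announces blocks by hash and serves them on request."""

    def __init__(self, name):
        self.name = name
        self.peers = []    # directly connected neighbors
        self.blocks = {}   # block hash -> raw block bytes

    def connect(self, other):
        # Point-to-point link; the handshake is omitted in this sketch.
        self.peers.append(other)
        other.peers.append(self)

    def receive_block(self, block):
        block_hash = hashlib.sha256(block).hexdigest()
        if block_hash in self.blocks:
            return  # already known, so the flood stops here
        self.blocks[block_hash] = block
        for peer in self.peers:
            peer.on_announce(self, block_hash)  # flood only the hash

    def on_announce(self, sender, block_hash):
        if block_hash not in self.blocks:
            # Request the full block from the announcing peer by its hash.
            self.receive_block(sender.blocks[block_hash])


a, b, c = Node("a"), Node("b"), Node("c")
a.connect(b)
b.connect(c)
a.receive_block(b"block #1")
```

After the last call, all three nodes hold the block, although node `c` is not connected to node `a` directly; only hashes travel unsolicited.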
Blockchain [5] technology, which originated with Bitcoin, the first
cryptocurrency system, introduced in 2008 [5], can provide a viable solution for
IoT [25] privacy and security because of its three fundamental principles:
• Data in the blockchain is stored in a shared, distributed, and
fault-tolerant database that every participant in the network can access.
Adversaries are neutralized by harnessing the computational capabilities of the
honest nodes, and the data exchanged is resilient to manipulation.
• Blockchain is a decentralized architecture, which makes systems robust against
failures and attacks.
• Blockchain relies on a public key infrastructure, which enables the content to
be encrypted in a way that is expensive to break.
Figure 1 depicts these principles. Figure 2 represents the interplay of other
factors.
1.1 Features of Blockchain
1.1.1 Decentralization
This feature refers to the processes of data verification, storage,
maintenance, and transmission on the blockchain, which are based on a distributed
system structure. Trust between distributed nodes is established through
mathematical methods rather than centralized organizations. The demonstration
site [20] is the Jean Jaurès primary school, which has a wood-fired boiler
that serves as a heat generator. The consumers and maintenance staff follow up on
their recovery steps to improve energy management. Virtual nodes are added to
the system, and their performance is used to estimate the scaling potential
for an eco-district. Green certificates are allocated to local generators.
The certificates are bought by energy consumers. The transaction is certified by
Fig. 2 Interplay of other factors [35] (diagram placing sustainability at the intersection of society, economics, and environment, with social-economic, social-environmental, and environmental-economic overlaps)
the blockchain controller node, which acts as the central memory of certificate
transactions. Three client interfaces have been developed and run on the Predix
platform.
The components of the blockchain store construction framework [6] and their
functionalities are kept distinct, which gives ease, speed, and efficiency. The
framework has two components; the customer component sits in the end-user tier.
The framework is a blockchain store with three tiers. The end-user tier supports
communication between the end users and is implemented through graphical
interfaces or workflow management. End users can draw on the current production
methods on the platform through blockchain technology, since the P2P method is
preferred for cooperation between parties. The core tier supervises the
organization by carrying out all the additional processes on conformity. Its main
function is to provide a blockchain interface and maintain the P2P network.
The sub-tier covers warehouse providers, scheduling, administration, and
maintenance of the operation.
The NIST model [2] classifies actors, services, and concepts into seven
distinct domains and their subdomains. These domains are customers, markets,
service providers, operations, generation, transmission, and distribution.
The customer domain consists of the end energy consumers in the smart grid. The
market domain is made up of the current market participants. Service providers
are the entities that offer cloud and edge computing services to consumers,
prosumers, and utilities. The operations domain consists of independent system
operators that support smart grid operation, manage edge and cloud computing
services, and administer the various market rules. The bulk generation,
transmission, and distribution domains are the generators and carriers of
energy. The underlying P2P networks and distributed ledger technology of
blockchain systems are required to guarantee consensus-based sharing of energy
consumption data and collaborative use of energy resources among consumers,
prosumers, and utilities. Cryptocurrencies and incentive mechanisms are needed
to create new decentralized trading paradigms among prosumers and utilities.
The system [17] has seven roles. Voters register for polling. The authentication
server verifies the voter's identity and provides eligible voters with
ballots. The authorization server checks the record. The polling process
is managed by the election administrators. The recording center stores the
certificates. Shared data servers save the encrypted records of the votes
cast. The smart contract provides the functionality for summarizing the
results: it can count the ballots to improve the trustworthiness and
authenticity of the voting.
In the initial phase [15], the agents start their iterations. The agents send an
initial set of information that supports their bid for their allocation
in the market solution. This set of information is a solution package. The
solution package is made up of random values that stay within the limits of the
original proposals. The agents update the resources that appear in the
solution packages at all levels according to the pheromone levels. In the first
iteration, every agent evaluates the objective function from its own
state and the solution packages it collected from all other agents.
Every ant's tour describes a potential solution to the market problem. After all
ants make their first tours, the fitness of every tour is assessed.
The fitness is assessed based on two criteria. The weighted average of the
fitness measures is used to decide the best tour. This metric
identifies the solution with the highest social welfare and the least positive net
surplus. A rejection criterion is introduced, which excludes solutions with a
net energy surplus that exceeds a permitted tolerance of ±5%. DACO
proposes two levels of pheromone updating. The first level is a local
pheromone update based on the success level of the ants in their assignments.
Every ant retraces its trail and deposits pheromones on the edges
it visited. The purpose is to encourage the ants to explore new paths with a
low concentration of pheromones.
The proposal [32] combines a Turing-complete programming language with smart
contract functionality. A solution is developed that enables the creation of
organizations. The shareholders maintain real-time oversight by adding the
necessary components. The governance rules are formalized, automated, and
enforced using software. The rules needed for good governance are written to
create a decentralized autonomous organization on the Ethereum blockchain.
Ethereum provides an abstract foundational layer with a built-in Turing-complete
programming language. It allows anyone to write smart contracts and
decentralized applications, in which they define their own arbitrary rules for
ownership, transaction formats, and state transition functions. Ethereum Virtual
Machine (EVM) code is a low-level, stack-based bytecode language. Ethereum
contracts are written in EVM code. Every byte in the code represents an
operation. When the code executes, it runs an infinite loop that repeatedly
carries out the operation at the current program counter. The counter starts at
zero and increases by one until the code stops.
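The execution loop just described can be made concrete with a toy interpreter. The sketch below is an illustration, not an EVM implementation: the function `run` is hypothetical, it supports only four opcodes (whose byte values match the real `STOP`, `ADD`, `MUL`, and `PUSH1`), and it ignores gas, memory, storage, and error handling for stack underflow.

```python
# Opcode byte values as used by the EVM; everything else here is simplified.
STOP, ADD, MUL, PUSH1 = 0x00, 0x01, 0x02, 0x60
WORD = 2 ** 256  # EVM arithmetic wraps at 256 bits


def run(code):
    """Execute a tiny subset of stack-based bytecode and return the final stack."""
    stack = []
    pc = 0  # the program counter starts at zero
    while pc < len(code):
        op = code[pc]
        if op == STOP:
            break
        elif op == PUSH1:
            pc += 1                  # the operand is the byte after the opcode
            stack.append(code[pc])
        elif op == ADD:
            stack.append((stack.pop() + stack.pop()) % WORD)
        elif op == MUL:
            stack.append((stack.pop() * stack.pop()) % WORD)
        else:
            raise ValueError(f"unsupported opcode {op:#04x}")
        pc += 1                      # advance to the next instruction
    return stack


# (2 + 3) * 4
program = bytes([PUSH1, 2, PUSH1, 3, ADD, PUSH1, 4, MUL, STOP])
result = run(program)
```

Here `result` is `[20]`. A real contract would also pay gas for every step, which is what bounds the otherwise unending loop.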
1.1.2 Traceability
This feature means that all transactions on the blockchain are arranged in
chronological order, and a block is linked to its two adjacent blocks by the
cryptographic hash function. Every transaction is therefore traceable by
examining the block data linked by hash keys.
Product suppliers and retailers use the traceability service [24] for various
purposes. Suppliers want to obtain certificates that show their products' origin
and quality to customers and that comply with regulations. Retailers need proof
of product origin and quality. Each product supplier that uses the partner
service has on average 20 products to be traced. The traceability information
granularity is coarse: it refers to product batches rather than individual
products. The size of this information is not easy to predict because many
records are not yet digitized, such as certificates issued by traceability
service providers. The traceability service provider verifies an application
from a product supplier or retailer based on paperwork. The two parties sign
a legal agreement. The provider generates a smart contract that represents the
legal agreement. The smart contract encodes the combination of services
and other conditions defined in the agreement. It also checks whether all
the information required by regulation is provided, which enables automatic
regulatory compliance checks.
The system [34] relies on RFID technology to perform data acquisition,
circulation, and sharing in the production, processing, warehousing, distribution,
and sales links of the agri-food supply chain. If a food safety incident
occurs, the authorities can take emergency measures promptly to limit the
spread of the hazard.
The contribution [9] is a fully decentralized traceability system for agri-food
supply chain management. Its layered architecture relies on blockchain and IoT
technologies. It guarantees the transparency, auditability, and immutability of
the stored records in untrusted environments. It takes advantage of the growing
capabilities offered by modern edge devices, which can run as nodes of the
distributed blockchain layer. This improves the robustness, decentralization,
security, and trust of the whole system. The API is a REST application
programming interface that exposes the capabilities of AgriBlockIoT to other
applications. It provides a level of abstraction that allows easy integration
with existing software systems. The controller is responsible for transforming
high-level function calls into the corresponding low-level calls for the
blockchain layer. The main component of the system contains all the business
logic, which is implemented through smart contracts. It is a hub to the chain
itself. Its complexity varies with the capabilities of the selected blockchain
and the needs of the customer.
The supply chain [8] consists of various actors and covers the process from
the start to the end product that is obtained and used by a
customer. Traceability between actors is handled through the internal
traceability systems of the individual actors. Central to the structure is that
all actors in the food supply chain provide information to, and retrieve
information from, the Food Safety Information System (FSIS) through various
technologies. It holds the data needed to achieve transparency and quality
assurance among the actors of the food supply chain. It covers the safety and
quality management of the supply chain actors. Traceability data that reflect
compliance with these arrangements are collected in the FSIS. A smart contract
is program code on the blockchain that is executed once its conditions are
satisfied. The decision process is automated and deterministic, as predefined
in the logic of the code. The case analysis examines the characteristics of the
food supply chain processes in four different trading situations. The outcomes
show similarities and differences among the cases. The findings can inform
discussions in businesses with comparable industry conditions and help guide
decisions concerning the framework conditions.
1.1.3 Immutability
There are two reasons why blockchain technology is immutable. On the one hand,
all transactions are stored in blocks, with one hash key linking to
the previous block and one hash key pointing to the next block. Tampering with
any transaction would produce different hash values and would therefore
be detected by the other nodes running the same validation algorithm. On the
other hand, the blockchain is a shared public ledger stored on thousands of nodes,
and all ledgers are synchronized continuously. Successful tampering would need to
change over 51% of the ledgers stored in the network.
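The first of these two reasons can be illustrated with a short Python sketch. It is a simplified model under stated assumptions: the block layout (`prev_hash` plus a transaction list), the JSON serialization, and the helper names are all hypothetical, and real blockchains hash a structured block header rather than a JSON dump.

```python
import hashlib
import json


def block_hash(block):
    """Hash a block's full contents with SHA-256 (illustrative serialization)."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def build_chain(tx_batches):
    """Link each block to its predecessor through the predecessor's hash."""
    chain = []
    prev = "0" * 64  # placeholder hash for the first block
    for txs in tx_batches:
        block = {"prev_hash": prev, "transactions": txs}
        chain.append(block)
        prev = block_hash(block)
    return chain


def first_invalid_link(chain):
    """Return the index of the first block whose prev_hash no longer matches, or None."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return i
    return None


chain = build_chain([["a->b:5"], ["b->c:2"], ["c->a:1"]])
intact = first_invalid_link(chain)          # None: every link matches

chain[0]["transactions"][0] = "a->b:500"    # tamper with an old transaction
broken_at = first_invalid_link(chain)       # 1: the next block exposes the change
```

Any node that reruns the same check notices the mismatch, so the forger would also have to rewrite every later block and convince a majority of the network to accept the rewritten copy.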
1.1.4 Currency
The essence of blockchain technology is point-to-point transactions; no third
party is involved, which means that transactions do not require the participation
of intermediaries. The circulation of digital currency based on blockchain
technology is fixed. Specifically, in Bitcoin, the monetary base is capped at
21 million, so digital currency is generated by a specific mining algorithm and
is bounded by a predefined formula. Consequently, there is no problem of
Fig. 3 Features of blockchain
inflation, collapse, and so on. In Blockchain 2.0 and 3.0 applications, the
combination of other activities, such as government, educational, and
financial activities, can give these non-monetary activities the property
of currency. Figure 3 depicts the features of blockchain.
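The 21 million cap mentioned above follows from Bitcoin's issuance schedule: the block subsidy starts at 50 BTC and is halved every 210,000 blocks. The short Python calculation below sums that schedule; the constant and function names are chosen for this sketch, while the schedule itself is Bitcoin's published rule.

```python
COIN = 100_000_000          # satoshis per bitcoin
HALVING_INTERVAL = 210_000  # blocks between subsidy halvings


def total_supply_satoshis():
    """Sum the block subsidies over all halving periods."""
    subsidy = 50 * COIN
    total = 0
    while subsidy > 0:
        total += HALVING_INTERVAL * subsidy
        subsidy >>= 1  # integer halving, so the subsidy eventually reaches zero
    return total


total = total_supply_satoshis()
total_btc = total / COIN  # 20999999.9769, just under 21 million
```

Because the subsidy is halved with integer arithmetic, the sum stops slightly below the nominal cap.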
The e-auction [33] is one of the best-known e-commerce activities and
allows bidders to bid for products directly over the Internet. In a sealed bid,
an additional transaction cost is required for the intermediary. The third party
plays a significant role in helping buyers and sellers trade during the auction,
yet there is no guarantee that the third party can be trusted. To
resolve these issues, blockchain technology with low transaction cost is used
to build smart contracts for public bids and sealed bids. The e-auction has
two main problems. First, a centralized intermediary is needed in a bidding
system to support communication between bidders and auctioneers. The fees
charged by the centralized intermediary increase the transaction cost, and the
personal data and transaction records are stored in a database, which may
cause privacy leakage. Second, in a sealed bid, bidders have no way
to ensure that the lead bidder never leaks their bid price.
The work [11] applies blockchain technology to the e-auction to resolve the two
problems. The blockchain is a peer-to-peer structure in which every node
can trust the other nodes. Every node can securely communicate with, verify, and
transfer data to any other node. Therefore, in the decentralized structure, the
centralized intermediary can be removed to reduce the transaction cost. As for
the second problem, the smart contract is used to prevent the lead bidder from
leaking the bid price. Several rules are written inside the smart contract, and
the bids cannot be opened before the deadline. The contract [11] comprises the
address of the auctioneer, the auction start time, the deadline, the address of
the current winner, and the current highest price. The smart contract is a set of
code and data deployed on the Ethereum platform. The contract starts executing
when a time or event is triggered, for example, sending a message, handling
transactions, or terminating the contract. Before the deadline, all valid
bidders can send a sealed envelope to submit their price. All the sealed
envelopes are opened when the deadline arrives. The highest price among the
sealed envelopes determines the final winner. In a public bid, bidders can bid
several times, so a public bid is also called a multi-bid auction. In a sealed
bid, bidders encrypt the bid and send it only once. When the deadline arrives,
the auctioneer compares all the bids, and the bidder who offers the highest
price wins the sealed bid.
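The sealed-bid rules above can be sketched in plain Python to show how the deadline and the "send once" rule interact. This is not the contract from [11]: the class `SealedBidAuction` and its methods are hypothetical, time is passed in as an integer instead of being read from a blockchain, and a hash commitment (bid plus random nonce) is swapped in for the encryption of the bid that the text describes. The stored fields mirror the ones listed above: auctioneer address, start time, deadline, current winner, and current highest price.

```python
import hashlib
import secrets


def commit(bid, nonce):
    """Seal a bid: only the hash is published before the deadline."""
    return hashlib.sha256(str(bid).encode() + nonce).hexdigest()


class SealedBidAuction:
    def __init__(self, auctioneer, start, deadline):
        self.auctioneer = auctioneer
        self.start = start
        self.deadline = deadline
        self.highest_bidder = None  # address of the current winner
        self.highest_bid = 0        # current highest price
        self.envelopes = {}         # bidder -> sealed commitment

    def submit(self, bidder, envelope, now):
        if not (self.start <= now < self.deadline):
            raise ValueError("bidding is closed")
        if bidder in self.envelopes:
            raise ValueError("a sealed bid may be sent only once")
        self.envelopes[bidder] = envelope

    def reveal(self, bidder, bid, nonce, now):
        if now < self.deadline:
            raise ValueError("envelopes cannot be opened before the deadline")
        if self.envelopes.get(bidder) != commit(bid, nonce):
            raise ValueError("revealed bid does not match the sealed envelope")
        if bid > self.highest_bid:
            self.highest_bidder, self.highest_bid = bidder, bid


auction = SealedBidAuction("auctioneer", start=0, deadline=100)
bids = {"alice": 70, "bob": 90}
nonces = {name: secrets.token_bytes(16) for name in bids}
for name, bid in bids.items():
    auction.submit(name, commit(bid, nonces[name]), now=10)
for name, bid in bids.items():
    auction.reveal(name, bid, nonces[name], now=100)
```

After the reveal phase, `auction.highest_bidder` is `"bob"` with a price of 90. Because only hashes are stored before the deadline, nobody, including the auctioneer, can read a competing bid early.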
The proposed system embeds blockchain technology. The procedure is divided
into three phases. In the registration phase, users register with the
server by providing their own details and those of their device. In the
broadcast phase, the server broadcasts the auction details to all registered
users. In the auction phase, users transmit their auction value. This
methodology aims to bring reliability to the system by using the device identity
and a biometric extract as parameters.
Following the introduction, the literature survey is summarized in Sect. 2. The
notations are listed in Sect. 3. The proposed work is described in Sect. 4. The
security of the work is evaluated in Sect. 5. The work is concluded in Sect. 6.
2 Literature Survey
Many contributors have used the blockchain methodology in different applications.
Their work is summarized in this section.
Shaikh and Iliev [31] introduce a transaction processing system for e-commerce
that uses blockchain technology and zero-knowledge proof, and they propose a
modified elliptic curve cryptography encryption. Blockchain technology provides a
way to share the database among the participants even if they do not trust one
another. Based on the peer-to-peer network, it creates a marketplace to transfer
assets without a central authority. The zero-knowledge proof method is then
applied; on this basis, a blockchain-based TPS (Bb-TPS) is built, and it
demonstrates continuous monitoring, accounting, and permission management in
real-time applications. Finally, the modified elliptic curve cryptography is used
to encrypt the data, using the cuckoo search (CS) optimization algorithm. The
private key and the public key are the two keys used in ECC.
Current public health information technology systems, such as eligibility,
enrollment, and electronic health records, have documented problems with
interoperability and are slow to adapt to changing program and technology
demands. Blockchain has been suggested as a way to solve these problems.
The Medicaid Management Information Systems (MMIS) program
[28] spent $3.7 billion in 2015, and total administration and
other technology spending on eligibility systems, electronic health records,
and technology related to administration was over $25 billion in 2015.
Blockchain promises to solve these problems at a reduced cost
because of its relative ease of deployment compared with conventional hardware
and software infrastructure.
Ranganthan et al. [29] introduce an application that remedies all three
drawbacks using the Ethereum blockchain platform. The application was developed
using the Truffle development framework. The application's functions were
contained in an Ethereum smart contract, which was then migrated to the Ethereum
network. An Ethereum network is composed of a set of nodes running an Ethereum
client. Each of these nodes has a copy of the blockchain, which contains a list
of all operations performed on the network. This enables nodes to prevent
fraudulent activities, such as forging and duplicating cryptocurrencies, and
provides an auditable record of all transactions performed on the network.
Because of the decentralized nature of the network, the Ethereum framework
preserves user pseudonymity, as every user's identity is given by a public
credential. This enables users on the platform to perform functions such as
transferring money, buying, and selling.
To solve the double-spending problem [21], every computation node in the
blockchain network needs not only to store each transaction, so that transactions
can be verified in a distributed way, but also to follow a distributed timestamp
mechanism that determines which transactions should be accepted and which should be
rejected. An additional benefit of the proof-of-work consensus protocol used in the
blockchain is the ability to resolve disagreement between chains, which lets
blockchains serve as immutable audit trails. That is, when an attacker alters a
block, all the blocks after that block must be recomputed because each block
contains the hash value of the previous block's header, and the computational cost
of such a change should be high enough to deter attacks.
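The hash linking described above can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions: the block layout (`index`, `tx`, `prev_hash`) is hypothetical, the hash covers the whole block rather than only its header, SHA-256 is assumed, and proof of work is omitted, so the sketch shows only why altering one block breaks the link held by the next.

```python
import hashlib
import json
from typing import Dict, List, Optional


def block_hash(block: Dict) -> str:
    """SHA-256 over a canonical JSON encoding of the block (illustrative layout)."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_chain(transactions: List[str]) -> List[Dict]:
    """Link blocks by storing the previous block's hash in each new block."""
    chain = []
    prev = "0" * 64  # placeholder hash for the first block
    for index, tx in enumerate(transactions):
        block = {"index": index, "tx": tx, "prev_hash": prev}
        chain.append(block)
        prev = block_hash(block)
    return chain


def first_invalid_link(chain: List[Dict]) -> Optional[int]:
    """Return the index of the first block whose prev_hash no longer matches."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return i
    return None


chain = build_chain(["A pays B 5", "B pays C 2", "C pays D 1"])
intact = first_invalid_link(chain)   # None: every link matches
chain[0]["tx"] = "A pays B 500"      # an attacker alters an early block
broken = first_invalid_link(chain)   # 1: the next block's stored hash no longer matches
```

To hide the change, the attacker would have to recompute the stored hash in every later block; with proof of work, each recomputation carries the mining cost that the text refers to.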
Huh et al. [19] use Ethereum as the blockchain platform, relying on its smart
contracts. They write their own Turing-complete code to run on Ethereum. In the
work of [12], each smart home is equipped with an always-online, high-resource
device, known as a "miner," that is responsible for handling all communication
inside and outside the home. The miner also maintains a private and secure BC,
used for controlling and auditing communications. The authors show that their
proposed BC-based smart home framework is secure by thoroughly analyzing its
security with respect to the fundamental security goals of confidentiality,
integrity, and availability.
Dorri et al. [1] propose a lightweight BC-based architecture for IoT that
eliminates the overheads of classic BC while maintaining most of its security and
privacy benefits. IoT devices benefit from a private immutable ledger that acts
like a BC but is managed centrally to optimize energy consumption. High-resource
devices create an overlay network to implement a publicly accessible distributed BC
that ensures end-to-end security and privacy. The proposed architecture uses
distributed trust to reduce the block validation processing time. The approach is
explored in a smart home setting as a representative case study for broader IoT
applications.
The architecture for scalable access management in IoT by Novo [26] is a new
design for arbitrating roles and permissions in IoT. It is a fully distributed
access control system for IoT based on blockchain technology. The architecture is
backed by a proof-of-concept implementation and evaluated in realistic IoT
scenarios. The design operates through a single smart contract, which simplifies
the whole process in the blockchain network and reduces the communication overhead
between the nodes. Furthermore, the access control information is provided to the
IoT devices in real time.
Zhang and Wen's [36] IoT electric business model is an e-business architecture
designed specifically for IoT and based on the Bitcoin protocol. The authors adopt
distributed autonomous corporations (DACs) as the transaction entity to handle
paid data and smart property. DACs can offer paid services with no human
involvement under the control of an incorruptible set of business rules. These
rules are implemented as publicly auditable open source software distributed
across the computers of their stakeholders. In the proposed e-business
architecture, people can trade with DACs to obtain IoT coins through P2M.
Liang et al. [23] propose a trusted and scalable architecture for IoT services
based on blockchain, which provides self-trust, data integrity auditing, and data
resilience, as well as scalability. A drone system is a typical microcosm of IoT,
where drones collect data from embedded sensors and cameras and receive commands
from remote control systems. Each controller, whether the control system or the
cloud server, is responsible for uploading operation records to the blockchain
network. This gives every operation a fingerprint, which makes each operation
traceable. The distributed nature of blockchain nodes contributes to the
availability of both data and data validation, making it an on-demand service with
no downtime.
Ouaddah et al. [27] propose using smart contracts to express fine-grained and
contextual access control policies for making authorization decisions. Their
framework, FairAccess, uses the consistency offered by blockchain-based
cryptocurrencies to address the problem of centralized and decentralized access
control in IoT. In FairAccess, the authors adopt authorization tokens as the access
control mechanism, delivered through emerging cryptocurrency solutions. They use
the blockchain, first, to ensure that access policies are evaluated in distributed
environments where there is no central authority or administrator, with a guarantee
that policies will be properly enforced by all interacting entities, and, second,
to ensure that token reuse is detected.
Shafagh et al. [30] propose a blockchain-based design for the IoT that provides
distributed access control and data management. The design is tailored to IoT data
streams and enables secure data sharing. They enable secure and resilient access
control management by using the blockchain as an auditable and distributed access
control layer on top of the storage layer. They facilitate the storage of
time-series IoT data at the edge of the network through a locality-aware
decentralized storage system that is managed with blockchain technology. The
system is agnostic of the physical storage nodes and also supports the use of cloud
storage resources as storage nodes.
Blockchain technology can be applied to education in many ways beyond diploma
management and achievement assessment [10, 11]. For both learners and teachers,
blockchain technology has great potential for broader application in formative
evaluation, in the design and implementation of learning activities, and in keeping
track of the whole learning process. Smart contracts between teachers and students
can be applied to the educational scenario. Teachers can give real-time awards to
students with a few simple clicks. Students receive a certain amount of digital
currency as a reward according to the smart contracts. This currency can be stored
in an education wallet, used as tuition, or even exchanged for real currencies.
Dujak and Sajter [13] aim to introduce the concept of blockchain and its current
applications in logistics and supply chains. Blockchain technology promises to
overcome trust issues and to allow trustless, secure, and authenticated systems for
exchanging logistics and supply chain information in supply networks. Some of the
most important current application areas of blockchain in logistics and supply
chains are tracking product origin and product flow through the supply network,
demand forecasting, reducing the risk of counterfeiting and fraud, open access to
information in the supply chain, reducing the negative impact on the environment,
and transaction automation through smart contracts.
Liang et al. [22] present an innovative user-centric health data sharing solution.
It uses a decentralized and permissioned blockchain to protect privacy through a
channel formation scheme and to enhance identity management. A mobile application
is deployed to collect health data from personal wearable devices, manual input,
and medical devices and to synchronize the data to the cloud for sharing with
healthcare providers and health insurance companies. To preserve the integrity of
health data, a proof of integrity and validation for each record is permanently
retrievable from the cloud database. The record is then submitted to the
blockchain network, followed by several steps that transform a list of records
into a transaction. A list of transactions is used to form a block, and the block
is validated by nodes in the blockchain network. After this series of processes,
the integrity of the record is preserved, and future validation of the block and
the transaction related to this record remains possible. Each time there is an
operation on personal health data, a record is reflected in the blockchain. This
ensures that every action on personal health data is accountable.
Table 1 Description of work complexity

Contribution    Work complexity
[31]            N*O(N^2 log N)
[21]            (N)
[19]            O(N^3)
[1]             N*O(log N)
[26]            (log N)
[36]            O(N^5)
[22, 23]        O(N*log N^2)
[27]            N*O(2^N)
[30]            O(N^α)
[10, 11]        O(2^N)
[13]            O(log N!)
[22, 23]        2*O(log N)
[11]            O(log N^2)
[11]            O(N)*N^2
The work of Chen et al. [11] applies blockchain to e-auctions to resolve two
problems. The blockchain is a peer-to-peer structure in which nodes can trust one
another. Every node can securely communicate, verify, and transfer data to any of
the other nodes. Therefore, in the decentralized structure, the centralized
intermediary can be removed to reduce the transaction cost. As for the second
problem, a smart contract is used to prevent the bid price from being leaked by the
lead bidder. Certain rules are written inside the smart contract, which cannot be
opened before the deadline. The contract in [11] is composed of the address of the
auctioneer, the auction start time, the deadline, the address of the current
winner, and the current highest price. The smart contract is a set of code and data
implemented on the Ethereum platform. The contract starts executing when a time or
event is triggered, for example, sending a message, handling transactions, or
terminating the contract. Before the deadline, all legitimate bidders can send a
sealed envelope to submit their price. All the sealed envelopes are opened when the
deadline is reached. The bidder with the highest price in a sealed envelope is the
final winner.
During public bidding, bidders can bid several times; for this reason, a public bid
is also called a multi-bidding auction. In a sealed bid, bidders encrypt the bid
and send it once. When the deadline is reached, the auctioneer compares all of the
bids. The bidder who offers the highest price is the winner of the sealed bid.
Table 1 summarizes the work complexity of the surveyed contributions.
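The sealed-bid rules described above (one envelope per bidder, nothing opened before the deadline, highest price wins) can be sketched in Python. This is only an illustration, not the smart contract of [11]: a SHA-256 hash commitment stands in for the encryption the text mentions, and the class and function names are hypothetical.

```python
import hashlib
import secrets
from typing import Dict, Optional, Tuple


def seal(price: int, nonce: bytes) -> str:
    """Commitment to a bid: hides the price until price and nonce are revealed."""
    return hashlib.sha256(nonce + str(price).encode("utf-8")).hexdigest()


class SealedBidAuction:
    def __init__(self, deadline: int) -> None:
        self.deadline = deadline
        self.envelopes: Dict[str, str] = {}  # bidder -> sealed envelope

    def submit(self, bidder: str, envelope: str, now: int) -> bool:
        """Accept one envelope per bidder, and only before the deadline."""
        if now >= self.deadline or bidder in self.envelopes:
            return False
        self.envelopes[bidder] = envelope
        return True

    def open_envelopes(self, reveals: Dict[str, Tuple[int, bytes]], now: int) -> Optional[str]:
        """After the deadline, keep reveals that match their envelope; highest price wins."""
        if now < self.deadline:
            raise ValueError("envelopes cannot be opened before the deadline")
        valid = {
            bidder: price
            for bidder, (price, nonce) in reveals.items()
            if self.envelopes.get(bidder) == seal(price, nonce)
        }
        return max(valid, key=valid.get) if valid else None


auction = SealedBidAuction(deadline=28)
bids = {
    "alice": (120, secrets.token_bytes(16)),
    "bob": (150, secrets.token_bytes(16)),
}
for t, (name, (price, nonce)) in enumerate(bids.items()):
    auction.submit(name, seal(price, nonce), now=t)
late = auction.submit("carol", seal(999, b"x"), now=30)  # rejected: past the deadline
winner = auction.open_envelopes(bids, now=30)            # highest valid revealed price
```

A public (multi-bidding) auction would differ only in `submit`, which would accept repeated bids from the same bidder instead of rejecting them.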
Table 2 Notations used in the study

Notation    Description
Ui          ith user of the network
N           Network in consideration
As          Auctioneer server
Ri          ith request of the user Ui
Ad          Auction details transmitted by the auctioneer server
Uid         User's device identity
Ul          User's location
Ubio        User's biometric extract
Aid         Auctioneer's device identity
Al          Auctioneer's location
Abio        Biometric extract of the auctioneer
Tb          Beginning time of the auction
Te          Close time of the auction
Tc          Cutoff time of the auction
Bu          Bidding value of the user
Tu          Timestamp of the user
h(..)       Hashing algorithm (blockchain algorithm)
3 Notations Used in the Study
Table 2 lists the notations used in the proposal.
4 Proposed Work
Online bidding aims to provide flexibility to the user. The user is given a time
slot in which to submit a bidding value. The received values are evaluated, and the
winner is determined. To take part, the user must register with the server. The
server validates the user and makes an entry in the system. Blockchain can be used
to enhance the security of such a system, and the proposed system embeds this
technology. The procedure is divided into three phases. In the registration phase,
the user registers with the server by providing details of the device and of
himself. In the broadcast phase, the server broadcasts the auction details to all
the registered users. In the auction phase, the user transmits his bidding value.
(a) Registration Phase
Users register with the auction system using their unique device identity,
biometric extract, and location information. In Eq. (1), the user Ui transmits the
hash of its request Ri, device identity Uid, location information Ul, and
biometric extract Ubio to the auctioneer server As:

$$U_i \rightarrow A_s : h(R_i \,\|\, U_{id} \,\|\, U_l \,\|\, U_{bio}) \tag{1}$$

Both communicating parties must undergo mutual authentication before bidding
starts. Because the system shares only hashed values, confidentiality is
maintained. In Eq. (2), the auctioneer server As responds with the hash of the
auction details Ad and the hash of its own device identity Aid, location Al, and
biometric extract Abio:

$$A_s \rightarrow U_i : h(A_d) \,\|\, h(A_{id} \,\|\, A_l \,\|\, A_{bio}) \tag{2}$$

Both communicating parties use these hashed values to identify themselves during
the auction.
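As an illustration of Eqs. (1) and (2), the following Python sketch builds the two registration messages. Several details are assumptions rather than part of the proposal: SHA-256 stands in for the unspecified hash h, the concatenation "||" is encoded with a 4-byte length prefix per field, and all field values are hypothetical (their sizes follow Table 3: 16-bit device identity, 8-bit location, 32-bit biometric extract).

```python
import hashlib


def h(*fields: bytes) -> str:
    """h(a || b || ...): SHA-256 over length-prefixed fields (assumed encoding)."""
    digest = hashlib.sha256()
    for field in fields:
        digest.update(len(field).to_bytes(4, "big"))  # prefix keeps field boundaries unambiguous
        digest.update(field)
    return digest.hexdigest()


# Hypothetical field values; the names follow Table 2.
R_i, U_id, U_l, U_bio = b"register", b"\x12\x34", b"\x07", b"\xde\xad\xbe\xef"
A_d, A_id, A_l, A_bio = b"lot-42", b"\xab\xcd", b"\x01", b"\xca\xfe\xba\xbe"

# Eq. (1): U_i -> A_s, a single digest over request, identity, location, biometric.
registration_msg = h(R_i, U_id, U_l, U_bio)

# Eq. (2): A_s -> U_i, the auction-details digest followed by the auctioneer's identity digest.
response_msg = (h(A_d), h(A_id, A_l, A_bio))
```

The length prefix is a design choice of this sketch: without it, the fields `(b"ab", b"c")` and `(b"a", b"bc")` would concatenate to the same bytes and produce the same digest.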
(b) Broadcast Phase
At the time of the auction, all devices are given the opportunity to bid. The
auction server broadcasts the cutoff time to its clients. In Eq. (3), the
auctioneer server As transmits the hash of the cutoff time Tc, beginning time Tb,
and close time Te, together with the hashed identity of the auctioneer, to the
network N:

$$A_s \rightarrow N : h(T_c \,\|\, T_b \,\|\, T_e) \,\|\, h(A_{id} \,\|\, A_l \,\|\, A_{bio}) \tag{3}$$
(c) Auction Phase
The user assembles the hashed identity, hashed bidding value, and hashed timestamp
and transmits them to the auction server. In Eq. (4), the user Ui transmits the
hashed identity, the hashed bidding value Bu, and the hashed timestamp Tu to the
auction server As:

$$U_i \rightarrow A_s : h(U_{id}, U_l, U_{bio}) \,\|\, h(B_u) \,\|\, h(T_u) \tag{4}$$

Using the data received from the users, the auction server applies the auction
rules to determine the winner, and the result is broadcast to all the registered
users.
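The broadcast and bid messages of Eqs. (3) and (4) can be sketched in the same way. The sketch again assumes SHA-256 for h, a length-prefixed encoding of the fields, 2-byte millisecond timestamps (enough for the 12-bit timestamp of Table 3), and hypothetical field values. The text does not specify how the server recovers bid values from h(Bu), so the sketch stops at message construction.

```python
import hashlib
from typing import Tuple


def h(*fields: bytes) -> str:
    """SHA-256 over length-prefixed fields (assumed encoding of the hash inputs)."""
    digest = hashlib.sha256()
    for field in fields:
        digest.update(len(field).to_bytes(4, "big"))
        digest.update(field)
    return digest.hexdigest()


def encode_ms(value: int) -> bytes:
    """Encode a millisecond value in 2 bytes (assumed width)."""
    return value.to_bytes(2, "big")


# Eq. (3): the server broadcasts the hashed schedule and its hashed identity.
T_b, T_e, T_c = 0, 30, 28                      # beginning, close, cutoff (ms), as in Table 3
A_id, A_l, A_bio = b"\xab\xcd", b"\x01", b"\xca\xfe\xba\xbe"
broadcast_msg = (h(encode_ms(T_c), encode_ms(T_b), encode_ms(T_e)), h(A_id, A_l, A_bio))


def bid_message(U_id: bytes, U_l: bytes, U_bio: bytes, B_u: int, T_u: int) -> Tuple[str, str, str]:
    """Eq. (4): hashed identity, hashed bidding value, hashed timestamp."""
    return (h(U_id, U_l, U_bio), h(str(B_u).encode("utf-8")), h(encode_ms(T_u)))


msg = bid_message(b"\x12\x34", b"\x07", b"\xde\xad\xbe\xef", B_u=150, T_u=12)
other = bid_message(b"\x12\x34", b"\x07", b"\xde\xad\xbe\xef", B_u=151, T_u=12)
```

Two bids from the same user that differ only in value share the identity and timestamp digests but carry different bid digests, which is what `msg` and `other` show.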
5 Security Analysis
The contract in [11] is composed of the address of the auctioneer, the auction
start time, the deadline, the address of the current winner, and the current
highest price, whereas the proposed work aims to enhance the reliability of the
system. The proposed system embeds blockchain technology. The procedure is divided
into three phases. In the registration phase, the user registers with the server by
providing details of the device and of himself. In the broadcast phase, the server
broadcasts the auction details to all the registered users. In the auction phase,
the user transmits his bidding value. Compared with the previous work, the proposed
work uses the device identity and biometric extract of the user to improve the
reliability of the system. Table 3 lists the parameters used in the study.
(a) Reliability
The previous system uses only location details, whereas the proposed system also
uses the device identity and biometric extract to enhance reliability. The
auctioneer can therefore place more trust in the user in the proposed system. The
proposed system enhances reliability by 11% compared with Chen et al. [11], as
shown in Fig. 4.
Table 3 Parameters used in the study

Parameter                           Value
No. of users                        5
No. of auctioneers                  1
Time duration                       60 ms
Beginning time                      0 ms
Close time                          30 ms
Cutoff time                         28 ms
Length of the biometric extract     32 bits
Length of location information      8 bits
Length of device identity           16 bits
Length of timestamp                 12 bits

Fig. 4 Comparison of work w.r.t. reliability
6 Conclusion
Online bidding aims to provide flexibility to the user. The user is given a time
slot in which to submit a bidding value. The received values are evaluated, and the
winner is determined. To take part, the user must register with the server. The
server validates the user and makes an entry in the system. Blockchain can be used
to enhance the security of such a system, and the proposed system embeds this
technology. The procedure is divided into three phases. In the registration phase,
the user registers with the server by providing details of the device and of
himself. In the broadcast phase, the server broadcasts the auction details to all
the registered users. In the auction phase, the user transmits his bidding value.
The proposed system enhances reliability by 11% compared with Chen et al. [11].
References
1. Dorri, A., Kanhere, S. S., & Jurdak, R. (2017). Towards an optimized blockchain for IoT. In
Second International Conference on Internet-of-Things Design and Implementation (pp. 173–
178). ACM.
2. Aderibole, A., et al. (2020). Blockchain technology for smart grids: Decentralized NIST
conceptual model. IEEE Access, 8, 43177–43190.
3. Ambika, N. (2021a). A reliable blockchain-based image encryption scheme for IIoT networks.
In S. K. Pani, S. L. Lau, & X. Liu (Eds.), Blockchain and AI technology in the industrial
internet of things (pp. 81–97). IGI Global.
4. Ambika, N. (2021b). A Reliable hybrid blockchain-based authentication system for IoT
network. In S. Singh & A. D. Jurcut (Eds.), Revolutionary applications of blockchain-enabled
privacy and access control (pp. 219–233). IGI Global.
5. Atlam, H. F., & Wills, G. B. (2019). Technical aspects of blockchain and IoT. In Role of
Blockchain Technology in IoT Applications (Vol. 115).
6. Barenji, A. V., Guo, H., Tian, Z., Li, Z., Wang, W. M., & Huang, G. Q. (2019). Blockchain-based cloud manufacturing: Decentralization. Open Access by IOS Press, 1003–1011.
7. Bauer, D. L., & Adair, A. J. (2008). Washington. DC: U.S. Patent, 7, 315,832.
8. Behnke, K., & Janssen, M. F. W. H. A. (2020). Boundary conditions for traceability in food
supply chains using blockchain technology. International Journal of Information Management,
52, 101969.
9. Caro, M. P., Ali, M. S., Vecchio, M., & Giaffreda, R. (2018). Blockchain-based traceability in
Agri-Food supply chain management: A practical implementation. In IoT Vertical and Topical
Summit on Agriculture-Tuscany (IOT Tuscany) (pp. 1–4). IEEE.
10. Chen, G., Xu, B., Lu, M., & Chen, N. S. (2018a). Exploring blockchain technology and its
potential applications for education. Smart Learning Environments, 5(1), 1.
11. Chen, Y.-H., Chen, S.-H., & Lin, I.-C. (2018b). Blockchain based smart contract for bidding
system. In IEEE International Conference on Applied System Invention (ICASI) (pp. 208–211).
IEEE.
12. Dorri, A., Kanhere, S. S., Jurdak, R., & Gauravaram, P. (2017). Blockchain for IoT security
and privacy: The case study of a smart home. In IEEE international conference on pervasive
computing and communications workshops (pp. 618–623). IEEE.
13. Dujak, D., & Sajter, D. (2018). Blockchain applications in supply chain. In A. Kawa & A.
Maryniak (Eds.), SMART Supply Network (pp. 21–46). Springer.
14. Tijan, E., Aksentijević, S., Ivanić, K., & Jardas, M. (2019). Blockchain Technology Implementation in Logistics. Sustainability, 11(4), 1–13.
15. Esmat, A., de Vos, M., Ghiassi-Farrokhfal, Y., Palensky, P., & Epema, D. (2021). A novel
decentralized platform for peer-to-peer energy trading market with blockchain technology.
Applied Energy, 116123.
16. Hackius, N., & Petersen, M. (2017). Blockchain in logistics and supply chain: trick or treat? In
Proceedings of the Hamburg International Conference of Logistics (HICL) (pp. 3–18).
17. Hsiao, J. H., Tso, R., Chen, C. M., & Wu, M. E. (2017). Decentralized E-voting systems based
on the blockchain technology. In Advances in Computer Science and Ubiquitous Computing
(pp. 305–309). Springer, Singapore.
18. Huckle, S., Bhattacharya, R., White, M., & Beloff, N. (2016). Internet of things, blockchain
and shared economy applications. In 7th International conference on Emerging Ubiquitous
Systems and pervasive networks (pp. 461–466). Elsevier.
19. Huh, S., Cho, S., & Kim, S. (2017). Managing IoT devices using blockchain platform. In 19th
International Conference on Advanced Communication Technology (ICACT) (pp. 464–467).
IEEE.
20. Imbault, F., Swiatek, M., De Beaufort, R., & Plana, R. (2017). The green blockchain: Managing
decentralized energy production and consumption. In IEEE International Conference on
Environment and Electrical Engineering and 2017 IEEE Industrial and Commercial Power
Systems Europe (pp. 1–5). IEEE.
21. Kuo, T. T., Kim, H. E., & Ohno-Machado, L. (2017). Blockchain distributed ledger technologies for biomedical and health care applications. Journal of the American Medical Informatics
Association, 24(6), 1211–1220.
22. Liang, X., Zhao, J., Shetty, S., Liu, J., & Li, D. (2017a). Integrating blockchain for data sharing
and collaboration in mobile healthcare applications. In 28th Annual International Symposium
on Personal, Indoor, and Mobile Radio Communications (PIMRC) (pp. 1–5). IEEE.
23. Liang, X., Zhao, J., Shetty, S., & Li, D. (2017b). Towards data assurance and resilience in
iot using blockchain. In MILCOM 2017–2017 IEEE Military Communications Conference
(MILCOM) (pp. 261–266). IEEE.
24. Lu, Q., & Xu, X. (2017). Adaptable blockchain-based systems: A case study for product
traceability. IEEE Software, 34(6), 21–27.
25. Nagaraj, A. (2021). Introduction to Sensors in IoT and Cloud Computing Applications.
Bentham Science Publishers.
26. Novo, O. (2018). Blockchain meets IoT: An architecture for scalable access management in
IoT. IEEE Internet of Things Journal, 5(2), 1184–1195.
27. Ouaddah, A., Abou Elkalam, A., & Ait Ouahman, A. (2016). FairAccess: A new blockchain-based access control framework for the Internet of Things. Security and Communication
Networks, 9(18), 5943–5964.
28. Randall, D., Goel, P., & Abujamra, R. (2017). Blockchain applications and use cases in health
information technology. Journal of Health & Medical Informatics, 8(3), 8–11.
29. Ranganthan, V. P., Dantu, R., Paul, A., Mears, P., & Morozov, K. (2018). A decentralized
marketplace application on the ethereum blockchain. In IEEE 4th International Conference on
Collaboration and Internet Computing (CIC) (pp. 90–97). IEEE.
30. Shafagh, H., Hithnawi, A., Burkhalter, L., Fischli, P., & Duquennoy, S. (2017). Secure sharing
of partially homomorphic encrypted iot data. In 15th ACM Conference on Embedded Network
Sensor Systems (pp. 1–14). ACM.
31. Shaikh, J. R., & Iliev, G. (2018). Blockchain based Confidentiality and Integrity Preserving
Scheme for Enhancing E-commerce Security. In IEEE Global Conference on Wireless
Computing and Networking (GCWCN) (pp. 155–158). IEEE.
32. Singh, M., & Kim, S. (2019). Blockchain technology for decentralized autonomous organizations. Advances in Computers, 115, 115–140.
33. Teich, J. E., Wallenius, H., Wallenius, J., & Zaitsev, A. (2006). A multi-attribute e-auction
mechanism for procurement: Theoretical foundations. European Journal of Operational
Research, 175(1), 90–100.
34. Tian, F. (2016). An agri-food supply chain traceability system for China based on RFID
& blockchain technology. In 13th International Conference on Service Systems and Service
Management (ICSSSM) (pp. 1–6). IEEE.
35. Tseng, C.-T., & Shang, S. S. C. (2021). Exploring the Sustainability of the Intermediary Role
in Blockchain. Sustainability, 13, 1–21.
36. Zhang, Y., & Wen, J. (2017). The IoT electric business model: Using blockchain technology
for the internet of things. Peer-to-Peer Networking and Applications, 10(4), 983–994.
Security Challenges and Solutions for
Next-Generation VANETs: An
Exploratory Study
Pavan Kumar Pandey, Vineet Kansal, and Abhishek Swaroop
1 Introduction
Wireless communication and cloud computing [1] have evolved and gained significant
popularity owing to their applicability and several recent technical advancements
in the area of communication. Mobile ad hoc networks (MANETs) [2] and wireless
sensor networks (WSNs) [3, 4] are popular examples of wireless communication-based
networks. Vehicular ad hoc networks (VANETs) [5] and flying ad hoc networks
(FLANETs) [6] are subclasses of mobile ad hoc networks. By utilizing fixed roads
and roadside units (RSUs), VANETs provide an infrastructure-less communication
framework for vehicles and other infrastructure nodes. Figure 1 presents a taxonomy
of vehicular networks based on factors such as the components used, the types of
communication, and their applications.
Participating vehicles in VANETs are equipped with onboard units (OBUs), a global
positioning system (GPS), and other sensors. Widespread RSUs in VANETs work as
traffic authorities that facilitate registration, tracking, and monitoring of
vehicles. OBUs and other sensors can communicate wirelessly to support Internet of
Things (IoT) applications. Substantial enhancements in vehicle capabilities and
evolved communication technologies have enabled the design of intelligent
transportation systems (ITS) [7].
P. K. Pandey
Dr. A.P.J. Abdul Kalam Technical University, Lucknow, India
V. Kansal
I.E.T. Dr. A.P. J. Abdul Kalam Technical University, Lucknow, India
A. Swaroop
Bhagwan Parshuram Institute of Technology, New Delhi, India
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
S. Pandey et al. (eds.), Role of Data-Intensive Distributed Computing Systems in
Designing Data Solutions, EAI/Springer Innovations in Communication and
Computing, https://doi.org/10.1007/978-3-031-15542-0_10
Fig. 1 Graphical taxonomy of VANETs. Branches shown: communication range (short,
wide area); types of communication (vehicle to vehicle (V2V), vehicle to
infrastructure (V2I or I2V)); challenges (efficient routing, security, power
management); applications (comfort driving, traffic efficiency, traffic management,
infotainment); entities (vehicles, RSU, central authority, data network)
VANETs possess characteristics such as dynamic topology, high mobility, and
variable network size. Consequently, numerous challenges are associated with
VANETs, such as efficient routing [8], security [9], and power management. With
these challenges in mind, traditional routing strategies have been proposed in
[10]. ITS supports short-range communication among vehicles, known as
vehicle-to-vehicle (V2V) communication. Communication between vehicles and fixed
infrastructure nodes, called vehicle-to-infrastructure (V2I) communication, also
exists. Figure 2 presents V2V and V2I communication in VANETs.
The further evolution of cellular technology led to the integration of 5G technology [12] with VANETs to improve flexibility, scalability, and mobility management.
Load distribution challenges [13] are also associated with high-speed VANETs.
Software-defined networks (SDN) [14] play an important role in the integration of
5G by decoupling control functions from the data plane. SDN is a logical network
paradigm to manage networks in a centralized manner.
Fig. 2 Types of vehicular communication [11]
SDNs [15] are three-tier networks with application, control, and infrastructure
layers. They provide a centralized view of the network and its associated
hardware. SDN enables network programmers to develop new services without
requiring the usual hardware or software updates. SDN therefore reduces the need
for manual intervention and allows networks to run smoothly. The SDN-enabled
communication standard differs from the traditional networking paradigm in
factors such as configuration, performance, and features. The function of each
layer is explained below:
• Application layer: The network services defined within this layer include path
reservation, network configuration, and network topology. These services use the
services of the control layer and the infrastructure layer. SDN applications can be
designed by using network virtualization (NV), network function virtualization
(NFV), and information-centric networking (ICN) [16].
• Control layer: The control layer contains the controller, which installs and
modifies flow rules according to the running applications. It also keeps the flow
table up to date with changes in the network topology.
• Infrastructure layer: This layer is also known as the data plane. A data plane
is a forwarding entity in a network that forwards packets by following
instructions and messages from a controller. A communication standard known as
OpenFlow [17] is used for sending flow instructions and messages between the
controller and the switch.
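The split between the control layer and the infrastructure layer described above can be sketched in Python. This is a toy model under stated assumptions, not the OpenFlow protocol or any controller's API: the `Switch` and `Controller` classes, the rule format (destination mapped to output port), and the node names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Switch:
    """Infrastructure-layer element: forwards only according to its flow table."""
    name: str
    flow_table: Dict[str, str] = field(default_factory=dict)  # destination -> output port

    def forward(self, destination: str) -> Optional[str]:
        # A table miss returns None; a real switch would consult the controller.
        return self.flow_table.get(destination)


class Controller:
    """Control-layer element: installs and modifies flow rules on switches."""

    def __init__(self) -> None:
        self.switches: Dict[str, Switch] = {}

    def register(self, switch: Switch) -> None:
        self.switches[switch.name] = switch

    def install_rule(self, switch_name: str, destination: str, port: str) -> None:
        self.switches[switch_name].flow_table[destination] = port

    def remove_rule(self, switch_name: str, destination: str) -> None:
        self.switches[switch_name].flow_table.pop(destination, None)


controller = Controller()
rsu_switch = Switch("rsu-1")
controller.register(rsu_switch)

controller.install_rule("rsu-1", "vehicle-7", "port-2")
before = rsu_switch.forward("vehicle-7")                 # rule installed by the controller
controller.install_rule("rsu-1", "vehicle-7", "port-3")  # topology change: rule updated
after = rsu_switch.forward("vehicle-7")
miss = rsu_switch.forward("vehicle-9")                   # no rule for this destination
```

The switch never decides a route on its own; every forwarding decision traces back to a rule that the controller installed, which is the decoupling of control functions from the data plane that the text describes.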
Fig. 3 5G VANETs integrated architecture [18]
In NG-VANETs, evolved node base controllers (eNBCs) play an important role in
providing intelligence for policy management and traffic control. At the same
time, the road-side controller (RSC) is introduced to share the traffic management
load with the eNBC. The RSC is a key component of 5G-VANETs that manages the
control plane together with the eNBC. In addition, the RSC handles the data plane
and security plane with vehicles in 5G-VANETs. The 5G-based VANET architecture is
presented in Fig. 3.
Numerous ITS applications [19], such as traffic efficiency, comfort driving,
in-vehicle infotainment, and traffic management, have attracted many researchers
and industry personnel in recent years. Several researchers have therefore
discussed the security challenges, attack models, and their respective solutions
in VANETs.
This chapter focuses on the security perspective of NG-VANETs. Our major
contributions are listed below:
• Security requirements for NG-VANETs are investigated and discussed.
• Security threats and attacks are classified based on their ultimate impact.
• A few well-suited security solutions that target security services for
NG-VANETs are discussed.
• A comparative analysis of the discussed security solutions is presented based
on their applicability in VANETs.
The remainder of the chapter consists of three more sections. Section 2 explains
security requirements in VANETs with details of possible security threats and
attacks. The next section presents some recently proposed security methods to
secure NG-VANETs. We conclude our discussion in the last section.
2 Security Requirements
The specific characteristics of VANETs, such as high mobility, weak trust
management, and inefficient key distribution, lead to several security challenges
in vehicular communication. The next-generation VANETs are more vulnerable to
security threats. Therefore, proper treatment of security attacks is required to
ensure that the messages and traffic information exchanged among vehicles are not
altered.
2.1 Security Services
As part of security requirements, several security services [20] are used to measure
security imposed on VANETs. Some of the security objectives in VANETs are:
Authentication
Authentication ensures that the sender of a received message is correctly identified in
terms of identity, location, etc. It can be achieved using certificates and pseudonym
methods.
Confidentiality
Confidentiality restricts access to messages to the sender and receiver
only. Predefined rules and keys are used to ensure confidential communication.
Integrity
It ensures the correctness of messages and makes sure that the transmitted messages
are not altered or dropped in communication.
Privacy
Privacy is the nondisclosure of the identities of vehicles, RSUs, etc. to
unauthorized parties.
Availability
Availability requires that the system always be operational and that a wireless interface always be available for communication. Availability is
enhanced by resisting attacks such as DoS.
Table 1 Security services for different vehicular communication modes
VANET communication mode
V2V communication
V2I communication
V2V and V2I communication
Security services
Content verification
Access control
Traceability
Privacy
Authentication
Availability
Confidentiality
Integrity
Content Verification
This is to avoid false messaging in the system by verifying the content of messages.
Data verification must be done to check data consistency in multiple messages for
ensuring content correctness.
Access Control
Access control defines the protocols and mechanisms to be followed for
accessing critical information and resources. Every entity must act in the network
according to its rules and role privileges.
Traceability
It provides a mechanism that can verify the location or history of any entity by
using some recorded identification. Since the real identity of vehicles should not
be revealed, some other mechanism must be used to obtain the real identity of the
vehicle for tracking.
After discussing several security services required in vehicular communication,
Table 1 presents a further classification of discussed services based on their
applicability in different communication modes. All discussed security services are
classified in three communication modes, namely V2V, V2I, and both V2V and V2I.
2.2 Security Attacks
Deployment scenarios in VANETs are highly vulnerable to several types of attacks.
The incorporation of recent technologies has increased security concerns for
future-generation VANETs. Optimization techniques [21] can sometimes help in
securing the system. However, proper security solutions are still required against
security attacks.
Several researchers have investigated attacks [22] and classified them into
subcategories based upon different parameters. Relevant attacks for NG-VANETs
are listed and classified in different subcategories in Fig. 4.
Fig. 4 Classification of attacks in NG-VANETs
[Figure: security attacks in NG-VANETs grouped into four categories]
User based: repudiation; masquerading; location tracking; brute force; illusion; identity and location revealing
Content based: malware and spamming; bogus information; message alteration; repression; Sybil attacks; forgery; message delay
Channel based: jamming; black hole; man in middle (MiM); gray hole attack; wormhole; replay
Network based: DoS/DDoS; unauthorized access; eavesdropping; clone attack; session hijack; timing attack
As evident from Fig. 4, the attacks are selected by considering the high bandwidth
utilization, centralized control (cloud), and powerful sensors used in vehicular
communication. These attacks are further categorized based on their impact on
the corresponding participating entities in communication. Accordingly, they are
divided into four categories: user-based, content-based, channel-based, and
network-based attacks.
User-Based Attacks
These attacks target the identification and personal details of vehicles and RSUs for
reducing the effectiveness of communication in several ways.
• Repudiation – A participating node, such as a vehicle or an RSU, denies having taken part in a communication.
• Masquerading or impersonation or spoofing – To gain additional privileges,
attackers assume the identity and location of another legitimate user and
participate in the communication.
• Location tracking – By launching this attack on GPS, the attacker tracks the
location of a vehicle to misuse that information.
• Brute force attack – Using this method, confidential details of users, such
as identity number, user ID, and password, are stolen and misused.
• Illusion attacks – Attackers tamper with the sensors and software used by communication
entities so that incorrect and misleading details are broadcast into the network.
• Identity and location revealing – By attacking common servers of the
system, attackers disclose details such as the identity and location of a participant.
Content-Based Attacks
Attacks discussed in this section affect the transmitted messages directly and
tamper with the communicated data.
• Malware or spamming – Attackers send spam messages into the network to degrade
its QoS, for example by increasing latency and bandwidth consumption.
• Bogus information – Attackers intentionally inject fake information into the network that affects the behavior of vehicles and RSUs in traffic.
• Message alteration or repression – Adversaries modify or drop
messages in transit.
• Sybil attack – An attacker sends the same message under several different sender
identities to the same receiver, so that the receiver believes the
message was received from different sources.
• Forgery – Attackers send fake warning messages and
alerts (e.g., accident alerts and poor road conditions) into VANETs.
• Message delay – Attackers introduce a significant time delay in messages so that
they may be discarded on the receiver side.
Channel-Based Attacks
These attacks target the channels and paths established for vehicular
communication and affect that particular session.
• Jamming – Attackers intentionally introduce interference into the channel established
between nodes and disrupt their conversation.
• Black hole – Adversaries receive messages and intercept the conversation by falsely
indicating that they have the shortest route to the destination.
• Man in middle (MiM) – Attackers sit between sender and receiver to listen to
their conversation and pose as the responder to each of them.
• Gray hole attack – A malicious node pretends to be a forwarding node in the network
but drops packets after receiving them.
• Wormhole – In this attack, attackers are strategically placed at two places and
create a tunnel between both places. Then, attackers start receiving packets from
one place and forwarding them toward another place using that tunnel.
• Replay – A malicious node repeatedly and fraudulently forwards valid messages
toward a destination node.
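As an illustration of how a receiver might resist the replay attack described above, the following is a minimal sketch, not taken from any of the surveyed schemes. It assumes every message carries a sender ID, a nonce, and a timestamp that are already authenticated; the class name `ReplayFilter` and the 5-second freshness window are hypothetical choices.

```python
import time


class ReplayFilter:
    """Reject messages that are stale or whose (sender, nonce) pair was already seen."""

    def __init__(self, max_age_s=5.0):
        self.max_age_s = max_age_s   # assumed freshness window in seconds
        self._seen = {}              # (sender_id, nonce) -> message timestamp

    def accept(self, sender_id, nonce, timestamp, now=None):
        now = time.time() if now is None else now
        # Forget entries that have left the freshness window.
        self._seen = {k: t for k, t in self._seen.items()
                      if now - t <= self.max_age_s}
        # A message outside the window is treated as stale (or as a delayed replay).
        if abs(now - timestamp) > self.max_age_s:
            return False
        key = (sender_id, nonce)
        if key in self._seen:
            return False             # same nonce from the same sender: replay
        self._seen[key] = timestamp
        return True


if __name__ == "__main__":
    f = ReplayFilter(max_age_s=5.0)
    print(f.accept("vehicle-1", nonce=1, timestamp=100.0, now=101.0))  # first copy
    print(f.accept("vehicle-1", nonce=1, timestamp=100.0, now=102.0))  # replayed copy
```

The filter only helps if the timestamp and nonce are covered by the message signature or MAC; otherwise an attacker could simply rewrite them.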
Network-Based Attacks
These attacks are liable to degrade the performance of the complete vehicular
system.
• Denial of service (DoS) – Malicious nodes make system resources and
services unavailable to vehicles and other users in VANETs.
• Distributed denial of service (DDoS) – These are DoS attacks launched on the system from
several locations.
• Unauthorized access – Attackers try to use some network resources and services
without proper permissions and privileges.
• Eavesdropping – The malicious nodes intend to intercept information transmitted
over the vehicular network using some connected devices into an unsecured
network.
• Session hijacking – Interception of complete conversation by hacking a particular
channel established between two or more entities.
• Clone attack – Adversaries create several devices similar to legitimate devices in
a network to compromise other genuine devices in the network.
• Timing attack – Malicious nodes introduce additional time slots into critical
messages toward infrastructure nodes such as RSUs; this may bring down the complete
network.
So far, several security requirements and security attacks applicable to
NG-VANETs have been discussed. For a better interpretation of security in NG-VANETs,
a mapping between security services and the corresponding attacks is required. Table 2
lists the discussed security services together with the security attacks that compromise
each service in vehicular communication.
3 Security Mechanisms
Several researchers have proposed solutions for different types of attacks. Based on
different attack scenarios, the security approaches are designed to achieve security
requirements. The design of effective security schemes for future VANETs becomes
more critical due to demanding specifications such as increased network bandwidth,
enhanced sensors, and high-performance processing devices in next-generation VANETs. Some
of the security approaches designed to target different types of attacks in NG-VANETs are discussed below.
Table 2 Security services compromised by corresponding attacks
Compromised services
Privacy
Confidentiality
Authentication
Integrity
Availability
Content verification
Access control
Traceability
Security attacks
Impersonation
Identity/location revealing
Repudiation
Anonymity
Eavesdropping
Replay attack
Spamming
Black hole
Gray hole
Packet analysis
Sybil attack
Password attack
Illusion attack
Clone attack
Timing attack
Bogus information
Man in middle (MiM)
Session hijacking
DoS
DDoS
Jamming
Sybil attack
Message alteration
Message tampering
Brute force attack
Unauthorized access
Vehicle tracing
Packet tracing
3.1 Hybrid Device to Device (D2D) Message Authentication
(HDMA) Scheme
The HDMA approach [23] uses a group signature algorithm for authentication and
a precomputed lookup table to reduce authentication overhead. This approach divides
the network into two logical groups: a global group and a local group. The
global group contains all network entities, such as the trusted authority (TA), roadside
base stations (RSBSs), and vehicles. On the other hand, the communication range
of an RSBS and vehicles creates a logical boundary and forms a local group. The
global group uses pseudonym-based authentication, whereas signature-based
authentication is used in a local group.
HDMA functions in four different phases, namely, initialization, authentication,
tracing, and revocation. As part of initialization, TA generates system parameters
for global and local groups. Moreover, TA generates identities and pseudonyms
for RSBSs and vehicles. The authentication phase follows different authentication
methods for V2V and V2I authentication. The tracing phase is used for revealing the
real identity of communicating nodes to protect the system against malicious
activities. If the TA identifies malicious vehicles, then the certificates of those
vehicles are revoked and stored in the certificate revocation list (CRL) for future
reference.
By using effective authentication methods, the HDMA approach protects
VANETs against several attacks such as impersonation, message alteration, replay,
and identity revealing. In addition to adding several security features, this approach
reduces overall computation overhead by reducing the cost of
signing messages and verifying users. A predefined lookup table is suggested
for increasing the computation speed of modular exponentiation.
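The chapter does not give the layout of HDMA's lookup table, so the following is only a generic sketch of the idea: when the base of a modular exponentiation is fixed, its repeated squares can be precomputed once, and each later exponentiation reduces to a few table multiplications. The function names and the example modulus are illustrative assumptions.

```python
def build_table(base, modulus, max_bits):
    """Precompute base**(2**i) mod modulus for i in range(max_bits)."""
    table = []
    power = base % modulus
    for _ in range(max_bits):
        table.append(power)
        power = (power * power) % modulus
    return table


def table_pow(table, exponent, modulus):
    """Compute base**exponent mod modulus using only table lookups and multiplications."""
    if exponent < 0 or exponent.bit_length() > len(table):
        raise ValueError("exponent out of range for this table")
    result = 1 % modulus
    i = 0
    while exponent:
        if exponent & 1:                 # bit i set: multiply in base**(2**i)
            result = (result * table[i]) % modulus
        exponent >>= 1
        i += 1
    return result


if __name__ == "__main__":
    p = 2**127 - 1                       # example prime modulus (assumption)
    table = build_table(5, p, 128)       # built once, e.g. during initialization
    e = 123456789
    print(table_pow(table, e, p) == pow(5, e, p))
```

The squarings are paid once when the table is built; each later exponentiation needs only one multiplication per set bit of the exponent.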
3.2 Blockchain-Based Secure and Trustworthy Approach
The Internet of Things (IoT) plays a key role in future VANETs. Therefore, the
blockchain-based approach [24] is specifically designed for trust management and
privacy for IoT services in VANETs by using the decentralized and immutable properties
of the blockchain. The use of a blockchain in the proposed framework improves the
security and efficiency of vehicular systems.
In this approach, all participating nodes, including vehicles, roadside units
(RSUs), and traffic authorities (TAs), form a P2P network to
maintain the blockchain. The vehicular services discussed with the proposed
architecture are trust management, real-time video reports
of traffic situations, and message exchange among vehicles.
This approach provides user privacy, secured data,
and trust management to protect the system against several types of attacks and
message tampering.
A two-step process is suggested for real-time video reports and message-sharing services. The first step is vehicle registration, and the second step
is the road condition report. Separate algorithms are designed for the two
steps. Each vehicle first needs to be registered using its subscriber number and
device number. At the time of registration, a symmetric key (SKE) is generated after
verifying the identity details. After the registration process, the video file is recorded
using a camera, and the message digest of the video content is calculated. Thereafter,
the vehicle broadcasts that video to its neighbors after encrypting the hash value and
signing the content.
Trust management is a four-step process that consists of traffic collection, trust
computation, miner election, and credibility assessment. After receiving traffic
information from vehicles, the RSU classifies the scores of messages received from
forwarding vehicles and calculates the trust value of the vehicles. In the next step,
the difficulty level of the RSU is calculated using the RSU's trust value. Then, the RSU is
elected as a miner, and a new block is added. Credibility assessment reviews an
uploaded video to check for any suspicious activity in the traffic recording.
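To illustrate why a blockchain resists message tampering, the sketch below builds a minimal hash-linked ledger. It is a generic illustration, not the algorithm of [24]: miner election, difficulty, and trust computation are omitted, and the block fields and function names are assumptions.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64   # placeholder "previous hash" for the first block


def block_hash(index, prev_hash, payload):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    body = json.dumps({"index": index, "prev": prev_hash, "payload": payload},
                      sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Append a block that commits to the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    index = len(chain)
    chain.append({"index": index, "prev": prev_hash, "payload": payload,
                  "hash": block_hash(index, prev_hash, payload)})


def chain_is_valid(chain):
    """Recompute every hash and check every link."""
    prev_hash = GENESIS_PREV
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["index"], block["prev"], block["payload"]):
            return False
        prev_hash = block["hash"]
    return True


if __name__ == "__main__":
    ledger = []
    append_block(ledger, {"vehicle": "V1", "trust": 0.8})
    append_block(ledger, {"vehicle": "V2", "trust": 0.4})
    print(chain_is_valid(ledger))        # untouched ledger
    ledger[0]["payload"]["trust"] = 1.0  # tamper with an old record
    print(chain_is_valid(ledger))        # tampering is detected
```

Because each block stores the hash of its predecessor, changing an old trust record invalidates that block's stored hash, and recomputing it would break the link to the next block.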
3.3 Searchable Encryption with Vehicle Proxy
Re-encryption-Based Scheme
An efficient and secure routing scheme based on searchable encryption with vehicle
proxy re-encryption (ESSPR) [25] was presented to preserve the privacy of messages in vehicular peer-to-peer social networks (VP2PSN). The
scheme functions in six steps: system initialization, peer registration, document
generation, document forwarding, proxy re-encryption, and document receiving.
ESSPR provides authentication, privacy, and data integrity based on public-key
encryption, aggregate signatures, proxy re-encryption, and quality of service (QoS)-
based clustering. This scheme protects VANETs against multiple attacks such as
eavesdropping, wormhole, packet analysis, packet tracing, and replay.
The first phase of the approach generates several parameters
as part of system initialization. Peer registration ensures that any
new vehicle joins a cluster with the corresponding private key, public key, and
certificate. After peer registration and authentication, the vehicle selects an encryption
algorithm and a public-private key pair and generates public-private key pairs for
peer vehicles. Thereafter, the vehicle generates a ciphertext of the document together with its
keywords.
The document forwarding algorithm specifies the procedure when the destination is not
in range of the source vehicle. In the proxy re-encryption phase, the ciphertext is
encrypted again on a proxy of the destination vehicle. In the last phase, the ciphertext
is decrypted back to the original document after it is received by the destination vehicle.
3.4 Secure and Efficient AOMDV (SE-AOMDV) Routing
Protocol
Multipath on-demand routing protocols are more exposed to multiple types of
security attacks such as man in middle (MiM) and black hole attacks. Securing a
multipath on-demand routing approach, as SE-AOMDV [26] is designed to do,
is therefore a challenging task. This scheme introduces authentication
and integrity for the route-reply packet that is used to fetch the best and most secure routes.
Authentication of vehicles is introduced as a mandatory step in AOMDV to establish
trust in participating vehicles. This step helps in discarding malicious activity
by distinguishing the malicious behavior of vehicles from legitimate behavior in VANETs.
The authors add a few new parameters, such as AUTH, DETECT, and R BIN, to the
routing information format to support authentication in vehicular communication.
An information check value (ICV) field is added to the RREP
packet for capturing the hash value of the source, destination, sequence number,
and hop count. This approach captures details of the authentication process with
misbehavior detection and integrity check.
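The chapter states only that the ICV holds a hash over the source, destination, sequence number, and hop count. The sketch below shows one way such a check value could be computed and verified. Using a keyed hash (HMAC-SHA256) with a key shared after authentication is an assumption made here, as are the field names and the `|` separator; the actual SE-AOMDV packet format may differ.

```python
import hashlib
import hmac


def compute_icv(key, source, destination, seq_no, hop_count):
    """Keyed hash over the RREP fields named in the text (assumed encoding)."""
    fields = "|".join([source, destination, str(seq_no), str(hop_count)])
    return hmac.new(key, fields.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_rrep(key, rrep):
    """Return True when the ICV carried in the packet matches the recomputed value."""
    expected = compute_icv(key, rrep["source"], rrep["destination"],
                           rrep["seq_no"], rrep["hop_count"])
    return hmac.compare_digest(expected, rrep["icv"])


if __name__ == "__main__":
    key = b"session-key-established-after-authentication"   # hypothetical key
    rrep = {"source": "V17", "destination": "V42", "seq_no": 9, "hop_count": 3}
    rrep["icv"] = compute_icv(key, rrep["source"], rrep["destination"],
                              rrep["seq_no"], rrep["hop_count"])
    print(verify_rrep(key, rrep))    # genuine packet
    rrep["seq_no"] = 99              # a black hole node inflates the sequence number
    print(verify_rrep(key, rrep))    # alteration is detected
```

A plain unkeyed hash would let an attacker recompute the ICV after altering a field, which is why the sketch uses a keyed construction.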
Beyond the traditional authentication process, in which a third party generates
certificates, a major contribution of this approach is the use of certificates for
both vehicles and RSUs. A certificate binds an identity to the
key pair it uses. The authentication process takes place in two steps: initialization
and authentication. The initialization process generates all required parameters
and certificates, whereas the authentication process is performed through the TA using
the generated keys and certificates.
The proposed misbehavior detection algorithm
considers suspicious behavior of vehicles such as duplicate packets, dropped packets,
changes in a vehicle's speed, and an increased number of interrupted connections. Additionally, the ICV
is introduced in the RREP packet for providing data integrity, along with the R Bin field,
which is used to handle node disconnection. The R Bin field is used to verify whether
a packet needs to be forwarded before broadcasting it further. Possible
node disconnection is handled in AOMDV by selecting disjoint routes only.
3.5 Socially Aware Security Message Forwarding Mechanism
To protect the system against privacy attacks, the trust-based socially aware security
message forwarding (SASMF) strategy [27] was proposed. In this approach,
pseudonyms are used to protect privacy. This security scheme protects the
system against attacks on anonymity, location privacy, message
authentication, and traceability, as well as edge information attacks. The scheme
consists of two strategies, namely privacy protection and forwarding control.
The privacy protection strategy consists of three steps, namely, key generation,
pseudonym updating, and message protection. The traffic authority (TA) generates
system parameters and derives private and public keys from them.
Moreover, the TA chooses a hash function for protecting messages from replay attacks
during transmission. After that, all parameters except the private key are made publicly
available in the system. Based on these global parameters, the RSUs, the application
provider (AP), and the vehicles generate their own sets of keys. In the next step, the TA
generates a pseudonymous identity for each vehicle from its registration identity. The
pseudonymous identity needs to be updated at certain intervals, since using
the same identity for a long time risks disclosing the real identity. Exchange
entropy is suggested as an exchange control tool; it is directly
proportional to the privacy of the vehicle, i.e., greater entropy implies higher
privacy. In the last step, message forwarding uses
a fragmentation scheme based on Shamir's secret sharing algorithm, which adds
security in addition to privacy: the
original message is divided into several fragments based on traffic situations, and the
fragments are transmitted to the AP through RSUs.
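The fragmentation step relies on Shamir's secret sharing, which can be sketched as follows. This is a textbook (k, n) threshold construction over a prime field, not the SASMF code itself; the prime, the function names, and the encoding of a short message as an integer are assumptions made for illustration.

```python
import secrets

PRIME = 2**127 - 1   # field modulus (a Mersenne prime); the secret must be smaller


def split_secret(secret, n_shares, threshold):
    """Split `secret` into n_shares points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must lie in [0, PRIME)")
    if not 1 <= threshold <= n_shares:
        raise ValueError("need 1 <= threshold <= n_shares")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, n_shares + 1):
        y = 0
        for c in reversed(coeffs):       # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


if __name__ == "__main__":
    message = int.from_bytes(b"brake ahead", "big")   # short message as an integer
    fragments = split_secret(message, n_shares=5, threshold=3)
    print(recover_secret(fragments[:3]) == message)
```

With a threshold of 3 out of 5, an eavesdropper who captures fewer than three fragments learns nothing about the message, while the AP can reconstruct it from any three fragments that arrive. The modular inverse via `pow(den, -1, PRIME)` requires Python 3.8 or later.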
For the forwarding control strategy, the trust management approach is used to
provide security and reliability. There are two separate algorithms discussed for trust
forwarding decisions and message forwarding mechanisms. In the trust decision
phase, every vehicle is evaluated based on its trust degree, which can
be assessed in several ways, such as direct trust, indirect trust, and comprehensive trust.
After that, forwarding vehicles are selected based on their trust values, and
messages are forwarded through those high-trust vehicles.
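The forwarding decision can be pictured with the following sketch. The chapter does not give the SASMF trust formulas, so the weighted average used for comprehensive trust, the weight 0.6, the threshold 0.5, and all names are illustrative assumptions only.

```python
def comprehensive_trust(direct, indirect, weight_direct=0.6):
    """Assumed combination: weighted average of direct and indirect trust in [0, 1]."""
    return weight_direct * direct + (1.0 - weight_direct) * indirect


def select_forwarders(neighbors, threshold=0.5, max_forwarders=2):
    """Keep neighbors whose comprehensive trust reaches the threshold, best first."""
    scored = [(comprehensive_trust(n["direct"], n["indirect"]), n["id"])
              for n in neighbors]
    trusted = [(score, vid) for score, vid in scored if score >= threshold]
    trusted.sort(key=lambda item: (-item[0], item[1]))   # highest trust first
    return [vid for _, vid in trusted[:max_forwarders]]


if __name__ == "__main__":
    neighbors = [
        {"id": "A", "direct": 0.9, "indirect": 0.7},
        {"id": "B", "direct": 0.2, "indirect": 0.4},
        {"id": "C", "direct": 0.6, "indirect": 0.8},
    ]
    print(select_forwarders(neighbors))   # B falls below the threshold
```

Here direct trust stands for a vehicle's own observations of a neighbor and indirect trust for recommendations from others; how each is measured is left open.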
3.6 Puzzle-Based Co-authentication (PCA) Scheme
Pseudonym schemes are significantly exposed to DoS attacks because of the cost of
initial authentication. Therefore, the puzzle-based co-authentication (PCA) scheme [28] was proposed to protect the
system from DoS attacks against pseudonym authentication using a hash puzzle.
The scheme assumes that attackers exploit the cost of first-time
certificate verification by forging a huge number of fake verification requests.
Authenticating pseudonyms is a time-consuming operation, and the total
time taken for frequent changes in the pseudonymous identities of vehicles is high.
Consequently, when attackers inject a huge number of fake identities into the system
during a DoS attack, a vehicle can verify only a few pseudonymous identities.
The PCA scheme is divided into two parts: hash puzzle design and mutual
trust-based co-authentication. The computational puzzle is built on the property of
one-way functions. A hash puzzle has two parameters, a message and an
answer, from which the value of the puzzle is calculated.
Solving the puzzle means finding the answer for which the hash function yields the
given value for a given message.
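A hash puzzle of this kind can be sketched as follows. The chapter does not specify how the target value is chosen in [28]; the sketch assumes the common formulation in which the hash of the message and the answer must start with a given number of zero bits. SHA-256, the 8-byte answer encoding, and the function names are likewise assumptions.

```python
import hashlib
from itertools import count


def puzzle_value(message, answer):
    """Value of the puzzle: a one-way hash over the message and the answer."""
    return hashlib.sha256(message + answer.to_bytes(8, "big")).digest()


def has_leading_zero_bits(digest, bits):
    """True when the first `bits` bits of the digest are all zero."""
    return int.from_bytes(digest, "big") >> (len(digest) * 8 - bits) == 0


def solve_puzzle(message, bits):
    """Brute-force search for an answer; expected cost grows as 2**bits hashes."""
    for answer in count():
        if has_leading_zero_bits(puzzle_value(message, answer), bits):
            return answer


def verify_puzzle(message, answer, bits):
    """Verification costs a single hash, however hard the puzzle was to solve."""
    return has_leading_zero_bits(puzzle_value(message, answer), bits)


if __name__ == "__main__":
    request = b"pseudonym-certificate-verification-request"
    answer = solve_puzzle(request, bits=12)
    print(verify_puzzle(request, answer, bits=12))
```

The asymmetry is what blunts the DoS attack: a requester must spend many hash evaluations per verification request, while the verifier checks the answer with one hash before committing to the expensive certificate verification.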
Three different roles for generator, verifier, and beneficiary are also associated
with the puzzle. In the next phase of the scheme, a trust-based cluster is defined as
a group of vehicles for collaborative authentication. A trust cluster is defined as a
strongly connected component from a derived undirected graph of VANETs. It is
recommended that member vehicles from mutual trust clusters can generate puzzles
together and cooperate in certificate verification.
3.7 Intelligent Drone-Assisted Security Scheme
This scheme [29] proposes an assistant-based communication protocol that uses
an intelligent drone. It allows vehicles to communicate securely
with other vehicles while protecting the privacy of the individual. The
approach proposes an anonymous authentication and key agreement protocol for 5G-based vehicular networks. It introduces an authentication scheme that uses drones
in remote areas that lack adequate base station coverage and is
mainly aimed at rural regions with poor signal.
The complete approach is divided into five phases. The first is system
initialization, where the control center (CC) chooses a random-number-based master
key, calculates the respective public key, and selects several one-way hash functions.
In the second phase, vehicle registration, each vehicle registers with the CC in
offline mode. In the third phase, once the vehicles are registered, the drone also
registers with the CC to obtain a secret key.
Vehicles and drones both follow a three-step registration process. The fourth phase
is the login phase, during which the
drone broadcasts messages at regular intervals for vehicles that require services
from drones. The last phase covers the authentication and key agreement
mechanism. This approach uses a hybrid of cryptographic primitives,
namely elliptic curve cryptography, hash functions, and AES.
3.8 Efficient Privacy-Preserving Anonymous Authentication
Protocol
The approach proposed in [30] is designed to target multiple security
services, such as privacy, authentication, confidentiality, and integrity, in a single
solution. It describes an anonymous authentication scheme with an efficient
privacy-preserving mechanism. An identity-based signature is
used to design the authentication protocol. In this approach, vehicles send
authenticated messages to nearby RSUs. Furthermore, the protocol proposes a
key exchange mechanism for generating session keys for secure communication
between vehicles and RSUs.
The first phase of the approach focuses on identity-based signature, which
contains four algorithms: setup, key extract, sign, and verify. In the second phase
of the approach, authentication protocol has been described, and the complete
authentication protocol further contains three steps. The three steps discuss system
initialization, registration, and authentication.
In system initialization, the TA generates several public parameters, which are
used in the subsequent steps. These parameters include prime numbers,
random numbers, hash functions, and an AES-based MAC. The next step,
registration, requires each vehicle to register its identity with the
TA before communicating with a nearby RSU. In the last step,
authentication, a vehicle with a unique identity authenticates itself to the nearby
RSU. This approach makes authenticated communication possible between vehicles
and RSUs.
4 Comparative Study of Security Solutions
In previous sections, several relevant attacks for next-generation (NG) VANETs
have been discussed. Those attacks have been classified into four categories based
on impacted vehicular entities. Additionally, some recent security approaches have
been discussed for the prevention of these attacks. The comparative analysis of
discussed strategies is summarized in Table 3. The security approaches are compared
by the security services they provide and by their applicability.
All the discussed approaches are suitable for tackling security attacks in NG-VANETs, since they are specifically designed by considering the architecture
of 5G-VANETs. These approaches are suitable for high-bandwidth vehicular networks. However, their suitability varies with the
methodology used and the targeted applications.
Table 3 Comparison chart of discussed security approaches
Security approach | Targeted security services | Applicability
HDMA | Privacy, confidentiality | High-bandwidth VANETs for V2V communication
Blockchain based | Privacy, confidentiality, trust management | Vehicular IoT environment in SDN-enabled VANETs
ESSPR | Privacy, authentication, integrity | Vehicular P2P social networks (VP2PSN)
SE-AOMDV | Authentication, integrity, availability | High-speed vehicular environment (highway/expressway)
SASMF | Privacy, authentication | Informal vehicular communication (not suitable for critical alerts/warnings)
PCA | Privacy, availability | High-bandwidth VANETs
Drone assisted | Privacy, authentication | A rural or mountainous area
Privacy preserving | Privacy, authentication, confidentiality | Military VANETs
Among the discussed security solutions, each approach is designed to achieve a specific
goal in a certain environment. HDMA and PCA are signature-based and puzzle-based
mutual authentication protocols, respectively. Both
protocols provide authentication and confidentiality in high-bandwidth vehicular
communication. On the other hand, the blockchain-based security framework and
SE-AOMDV detect malicious vehicles and messages in vehicular communication,
where the blockchain-based algorithms are specifically designed for IoT services such
as real-time traffic monitoring. Furthermore, ESSPR and SASMF both provide privacy to vehicles in a vehicular network using searchable encryption and pseudonym-based algorithms, respectively. Depending on their use cases, these approaches are
more efficient in different environments. SASMF is more suitable for communicating noncritical messages, the blockchain-based algorithms suit IoT services, and
ESSPR can be used effectively in P2P social networks. Apart from these approaches,
HDMA, SE-AOMDV, and PCA are effective for high-bandwidth V2V or V2I
vehicular communication, providing several security services such as privacy and
confidentiality; PCA is specifically designed to provide availability,
and SE-AOMDV provides integrity in message communication.
5 Conclusion and Future Work
VANETs play a key role in designing intelligent transportation systems (ITS)
to enhance our traffic experience. However, recent advances in VANETs leave
vehicular communication more exposed to several types of attacks by greedy
drivers and other selfish users. Therefore, the security of VANETs must
be addressed before they are put to use. In this chapter, we investigated possible
security threats for NG-VANETs. The attacks have been grouped into
four categories based on the entities they affect, which gives an
extensive analysis of attack scenarios. In addition, some recently proposed
solutions for protecting the network against various types of attacks have been
discussed. Moreover, we have summarized the security strategies with their features
and suitability for various application areas. The security analysis presented in this
chapter may enable other researchers and users to select an appropriate
security mechanism for using vehicular communication securely.
References
1. Tyagi, N., Rana, A., & Kansal, V. (2019). Creating elasticity with enhanced weighted
optimization load balancing algorithm in cloud computing. In Amity International Conference
on Artificial Intelligence (AICAI) (pp. 600–604). IEEE Xplore.
2. Basagni, S., Conti, M., & Giordano, S. (2013). Mobile ad hoc networking: Cutting edge
directions. Wiley-IEEE Press.
3. Gupta, S., Rana, A., & Kansal, V. (2020). Optimization in wireless sensor network using
soft computing. In Advances in intelligent systems and computing (Vol. 1090, pp. 801–810).
Springer.
4. Varshney, S., Kumar, C., Swaroop, A., Khanna, A., Gupta, D., Rodrigues, J. J. P. C., Pinheiro, P.
R., & De Albuquerque, V. H. C. (2018). Energy efficient management of pipelines in buildings
using linear wireless sensor networks. Sensors (Basel), 18(8), 2618.
5. Liang, W., Li, Z., Zhang, H., Wang, S., & Bie, R. (2015). Vehicular ad hoc networks:
Architectures, research issues, methodologies, challenges, and trends. International Journal
of Distributed Sensor Networks, 11(8), 745303.
6. Khanna, A., Rodrigues, J. J. P. C., Gupta, N., Swaroop, A., Gupta, D., Saleem, K., & de
Albuquerque, V. H. C. (2019). A mutual exclusion algorithm for flying ad hoc networks.
Computers & Electrical Engineering, 76, 82–93.
7. Boussoufa-Lahlah, S., Semchedine, F., & Bouallouche-Medjkoune, L. (2018). Geographic
routing protocols for Vehicular Ad Hoc Networks (VANETs): A survey. Vehicular communications, 11, 20–31.
8. Pandey, P. K., Swaroop, A., & Kansal, V. (2019). A Concise Survey on Recent Routing Protocols for Vehicular Ad hoc Networks (VANETs). In International Conference on Computing,
Communication, and Intelligent Systems (ICCCIS) (pp. 188–193). IEEE.
9. Mokhtar, B., & Mohamed, A. (2015). Survey on security issues in vehicular ad hoc networks.
Alexandria Engineering Journal, 54(4), 1115–1126.
10. Pandey, P. K., Kansal, V., & Swaroop, A. (2021). IBRP: An infrastructure-based routing
protocol using static clusters in urban VANETs. In A. Khanna, A. K. Singh, & A. Swaroop
(Eds.), Recent studies on computational intelligence. Series of studies in computational
intelligence (Vol. 921). Springer.
11. Rehman, S., Khan, M. A., Zia, T., & Zheng, L. (2013). Vehicular Ad Hoc Networks
(VANETs) – An overview and challenges. EURASIP Journal on Wireless Communications
and Networking, 3(3), 29–38.
12. Khan, A. A., Abolhasan, M., & Ni, W. (2018). 5G next generation VANETs using SDN and
fog computing framework. In 15th IEEE Annual Consumer Communications & Networking
Conference (CCNC) (pp. 1–6).
13. Tyagi, N., Rana, A., & Kansal, V. (2020). Load distribution challenges with virtual computing.
In V. Solanki, M. Hoang, Z. Lu, & P. Pattnaik (Eds.), Intelligent computing in engineering,
advances in intelligent systems and computing (Vol. 1125, pp. 51–56). Springer.
14. Haleplidis, E., Pentikousis, K., Denazis, S., Salim, J. H., Meyer, D., & Koufopavlou, O. (2015).
Software-defined networking (SDN): Layers and architecture terminology. RFC 7426, IRTF.
15. Sultana, R., Grover, J., & Tripathi, M. (2021). Security of SDN-based vehicular ad hoc
networks: State-of-the-art and challenges. Vehicular Communications, 27, 100284.
16. Jarraya, Y., Madi, T., & Debbabi, M. (2014). A survey and a layered taxonomy of software
defined networking. IEEE Communications Surveys & Tutorials, 16, 1955–1980.
17. Hu, F., Hao, Q., & Bao, K. (2014). A survey on software-defined network and open flow: From
concept to implementation. IEEE Communications Surveys & Tutorials, 16, 2181–2206.
18. Hussein, A., Elhajj, I., Chehab, A., & Kayssi, A. (2017). SDN VANETs in 5G: An architecture
for resilient security services. In 2017 fourth international conference on Software Defined
Systems (SDS) (pp. 67–74).
19. Pandey, P. K., Kansal, V., & Swaroop, A. (2020). Vehicular Ad Hoc Networks (VANETs):
Architecture, challenges, and applications. In U. Shanker & S. Pandey (Eds.), Handling priority
inversion in time-constrained distributed databases Hershey (pp. 224–239). IGI Global.
20. Hasrouny, H., Samhat, A. E., Bassil, C., & Laouiti, A. (2017). Vanet security challenges and
solutions: A survey. Vehicular Communications, 7, 7–20.
21. Tyagi, N., Rana, A., & Kansal, V. (2018). Resourceful system-level optimized placement of
virtual machines for cloud computing. In 4th International Conference on Applied and Theoretical Computing and Communication Technology (iCATccT) (pp. 99–102). IEEE Explore.
22. Malhi, A. K., Batra, S., & Pannu, H. S. (2020). Security of vehicular ad-hoc networks: A
comprehensive survey. Computers & Security, 89, 101664.
23. Wang, P., Chen, C., Kumari, S., Shojafar, M., Tafazolli, R., & Liu, Y. (2020). HDMA:
Hybrid D2D message authentication scheme for 5G-enabled VANETs. IEEE Transactions on
Intelligent Transportation Systems, 1–10.
24. Xie, L., Ding, Y., Yang, H., & Wang, X. (2019). Block-chain-based secure and trustworthy
internet of things in SDN-enabled 5G-VANETs. IEEE Access, 7, 56656–56666.
25. Ferrag, M. A., & Ahmim, A. (2017). ESSPR: An efficient secure routing scheme based
on searchable encryption with vehicle proxy re-encryption for vehicular peer-to-peer social
network. Telecommunication Systems, 66, 481–503.
26. Makhlouf, A. M., & Guizani, M. (2019). SE-AOMDV: Secure and efficient AOMDV routing
protocol for vehicular communications. International Journal of Information Security, 18, 665–
676.
27. Yang, P., Deng, L., Yang, J., & Yan, J. (2020). SASMF: Socially aware security message
forwarding mechanism in VANETs. Mobile Networks and Applications, 25, 660–671.
28. Liu, P., Liu, B., Sun, Y., Zhao, B., & You, I. (2018). Mitigating DoS attacks against
pseudonymous authentication through puzzle-based co-authentication in 5G-VANET. IEEE
Access, 6, 20795–20806.
29. Zhang, J., Cui, J., Zhong, H., Bolodurina, I., & Liu, L. (2020). Intelligent drone-assisted
anonymous authentication and key agreement for 5G/B5G vehicular ad-hoc networks. In IEEE
transactions on network science and engineering.
30. Zhang, X., Wang, W., Mu, L., Huang, C., Fu, H., & Xu, C. (2021). Efficient privacy-preserving
anonymous authentication protocol for vehicular ad-hoc networks. Wireless Personal Communications.
iTeach: A User-Friendly Learning
Management System
Nikhil Sharma, Shakti Singh, Shivansh Tyagi, Siddhant Manchanda,
and Achal Kaushik
1 Introduction
Electronic learning, or e-learning [5], refers to gaining knowledge by leveraging
the Internet and computer networks across the globe. E-learning consists of various
forms of learning methodologies that are electronically supported [29]. Automation
has made people’s lives a lot easier. Its incorporation in education has likewise
enabled students to learn in different styles by making the process more accessible,
and it has empowered educators through a set of automation tools for creating
content and teaching [4].
E-learning has proven to be a boon during the COVID-19 pandemic. It provides a
learning platform for students of all classes and offers an immediate response to
the outbreak, in which the authorities have enforced countrywide lockdowns that
include educational institutions. In this gloomy scenario, e-learning systems help
content makers, educators, and students. Such online software allows people to study
from home and obtain learning materials and guidance from teachers online, without
their physical presence. E-learning systems offer many features, such as online
doubt sessions, online tests and quizzes, and assignment submission. These features
support isolation and protection from the virus, and they enable a holistic approach
to learning that is challenging to simulate in a physical schooling environment.
The twentieth century saw a shift from the industrial age to the age of
knowledge and technical know-how. E-learning uses the Internet as a platform on
which students learn by accessing online content such as notes and videos, making
N. Sharma · S. Singh · S. Tyagi · S. Manchanda · A. Kaushik ()
Bhagwan Parshuram Institute of Technology, Affiliated To GGSIPU, Delhi, India
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
S. Pandey et al. (eds.), Role of Data-Intensive Distributed Computing Systems in
Designing Data Solutions, EAI/Springer Innovations in Communication and
Computing, https://doi.org/10.1007/978-3-031-15542-0_11
learning more effective and efficient. One of the pivotal purposes of e-learning is
that every participant should know the technology’s core know-how and understand how
it can be used to reach a specific goal or objective [4].
In today’s world, competition among industries on product efficiency, features,
and performance plays an essential role in retaining customers and countering
competing products. This cut-throat competition requires skillful employees who can
keep up with the latest technology at a demanding pace. An efficient learning
management system plays a pivotal role in helping learners pick up technologies
beyond the limits of time frames and physical boundaries. E-learning systems are
also a good way to document this knowledge for later reference and planning, which
can help automate learning tasks [4].
In our work, we have developed a platform with an inbuilt scribble pad that
simulates the effect of blackboard teaching. This feature consists of a paint-like
drawing area on which an external device such as an electronic pen is used to write,
draw shapes, and even import pictures and explain them visually. The application
also includes a video editor that enables content creators to create and edit videos
on the fly. Cloudinary is used as the cloud service for hosting these videos.
Another feature is a screen recorder that records the full screen or a custom part
of it while the instructor teaches via webcam or through the scribble pad.
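The hosting step can be illustrated with a minimal sketch. It assumes an unsigned upload to Cloudinary’s HTTP upload endpoint; the chapter does not state how iTeach actually calls the service, and the cloud name `demo-cloud`, the preset `lecture_preset`, and the function name are hypothetical placeholders.

```javascript
// Describe an unsigned Cloudinary video upload (a sketch, not the authors' code).
// cloudName and uploadPreset are hypothetical placeholders for account settings.
function buildVideoUploadRequest(cloudName, uploadPreset, file) {
  if (!cloudName || !uploadPreset) {
    throw new Error("cloudName and uploadPreset are required");
  }
  return {
    method: "POST",
    url: "https://api.cloudinary.com/v1_1/" + encodeURIComponent(cloudName) + "/video/upload",
    // `file` may be the recorded Blob or a remote URL of the recording.
    fields: { file: file, upload_preset: uploadPreset },
  };
}

const uploadRequest = buildVideoUploadRequest(
  "demo-cloud",
  "lecture_preset",
  "https://example.com/lecture-01.webm"
);
console.log(uploadRequest.method, uploadRequest.url);
```

In a browser, the `fields` would typically be appended to a `FormData` object and sent with `fetch`; the returned JSON then carries the hosted video’s URL.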
Content creators can publish playlists of their courses, which are available for
purchase at the student marketplace. Each video lecture includes an online
discussion forum where students can discuss the concepts they need clarity on, and
teachers can attach assignment answers to the video lectures. One of the platform’s
significant features is an online one-to-one doubt session with the teacher after
the video lecture. Other learning management systems do not offer an online doubt
session with the same instructor who appears in the video. Students can avail
themselves of this facility by performing well in examinations, lecture revision
tests, and assessments, which earn them iTeach [17] passes. These passes can be
redeemed for access to live doubt sessions and for discounts on video content
purchases.
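The pass mechanism can be pictured with a small sketch. The chapter does not give the earning rule or the cost of a doubt session, so the score threshold of 80 and the cost of one pass below are purely illustrative assumptions, as are the function names.

```javascript
// Hypothetical rule: one pass for every assessment scored at or above the threshold.
function earnedPasses(scores, threshold) {
  return scores.filter((score) => score >= threshold).length;
}

// Spend passes on a reward; returns the remaining balance or throws if too few.
function redeemPasses(balance, cost) {
  if (cost > balance) {
    throw new Error("not enough passes");
  }
  return balance - cost;
}

const passBalance = earnedPasses([92, 67, 85, 78], 80); // two scores reach 80
const afterDoubtSession = redeemPasses(passBalance, 1); // assumed cost: one pass
console.log(passBalance, afterDoubtSession); // 2 1
```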
Apart from that, the application has an extensible and easy-to-use user interface,
which makes the application easy to access. At its backend, the platform implements
rich session and role-right management, which keeps the database normalized.
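A role-right check of this kind can be sketched as a lookup from roles to permitted actions. The roles, the action names, and the `authorize` helper are hypothetical; the chapter does not list the rights that iTeach actually defines.

```javascript
// Hypothetical role-to-rights table; the actual rights in iTeach are not documented here.
const ROLE_RIGHTS = {
  student: ["view_lecture", "post_in_forum", "take_test"],
  teacher: ["view_lecture", "post_in_forum", "create_lecture", "evaluate_test"],
  admin: ["view_lecture", "manage_users"],
};

// True only when the role exists and explicitly grants the action.
function can(role, action) {
  const rights = ROLE_RIGHTS[role];
  return Array.isArray(rights) && rights.includes(action);
}

// A session carries the user's role; each request is checked before it is served.
function authorize(session, action) {
  if (!session || !can(session.role, action)) {
    return { status: 403, message: "forbidden" };
  }
  return { status: 200, message: "ok" };
}

console.log(authorize({ userId: "u1", role: "student" }, "create_lecture").status); // 403
console.log(authorize({ userId: "t1", role: "teacher" }, "create_lecture").status); // 200
```

Keeping the rights in one table, rather than scattering role checks through the request handlers, is one way to keep such rules consistent across endpoints.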
The paper’s outline is as follows: Sect. 2 reviews the related work by surveying
the literature, and Sect. 3 formulates the problem and presents the proposed
model with all its features. Section 4 compares the platforms on various parameters,
and Sect. 5 presents the feedback analysis. Concluding remarks appear in Sect. 6.
2 Literature Review
Various learning management strategies [3, 32, 33] have been implemented, which
have significantly contributed to a different approach to learning. But if the users
cannot use these systems to their full potential or face problems due to a lack of
accessibility, the whole purpose is defeated. Therefore, an LMS is successful only
if the students can make the most of it and fulfill their purpose. However, recent
studies [16, 26] have shown that e-learning implementation is not only a technical
solution but also involves many other factors, such as social and individual
factors. Facilitating conditions also play a part, in addition to organizational,
behavioral, and cultural factors [4].
Absorb [27] is a learning management system that empowers organizations to
teach their employees the skills required to stay competitive in the modern world
and to adapt to changing technology demands. The LMS offers content libraries that
provide an instant return on investment via thousands of pre-built online courses.
It also includes an e-commerce marketplace for selling and monetizing courses, along
with administration features that let teachers and learners automate various
processes. The LMS serves 12.9M users across 129 countries with 11,000 customers.
Its disadvantages are limited accessibility and the lack of a dead-simple user
interface and of modern web optimization approaches. It also does not provide
automation for content creation. Unacademy [37] is another learning management
platform that offers features such as an accessible UI, live lecture streaming, and
course subscriptions. It uses React [11] for its frontend and Nginx and Node [36]
for its backend services, along with management and automation tools such as CMake
and Google Tag Manager [10]. However, it does not provide features such as an
inbuilt text editor, a scribble pad, or software for creating videos and editing
content on the fly.
Moodle [8] is one of the best-known learning management systems on the
market, providing features such as a learner progress tracker and quick activity
and course setup. It relies heavily on Nginx and PHP in its platform layers [18].
Google Classroom [28], a product backed by one of the leading tech brands, Google,
is a web platform for creating classrooms in which teachers can distribute,
collect, and manage classwork. It uses Google Suite, with Kotlin for the Android
version of the app, and supports distributing resources such as YouTube videos and
Google Drive links, with tools such as GeoGebra Classic, Activity learn, and Hiver
under the hood [2]. Easyclass [22] is an LMS that provides a shared digital
environment in which content creators upload and deliver content to students as
videos, digital notes, etc. Students can also submit quizzes and assignments and
take tests on the platform. The platform provides a secure environment for teachers
and students so that the content is safe from external or unauthorized access.
Zoom classroom [6] is another such platform, providing a synchronous teaching
mode. The platform allows a host and students to join a meeting room where teachers
connect with them in real time and deliver their content. It also offers platform
versatility, since a person can connect from a Windows machine, a Mac, or a mobile
device. Backed by Office365, Microsoft Teams (MS-Teams) [21] is another learning
platform. MS-Teams allows meetings in a virtual meeting room where approximately
10,000 people can connect at the same time. It supports text chat and the sharing
of documents and files all in one place and its
accessible user interface makes it a platform suited to teaching large audiences.
Hypersay [13] is an online platform that brings a new perspective to presentations.
Students can interact with presentations in real time, and the platform supports
features such as live subtitle changes. It is not suited to a broad audience, since
it allows only 20 participants in each classroom session.
Nearpod [30] is another online learning platform that provides an interactive
learning environment in which instructors and content creators can deliver
interactive lessons. Its features include polls, 3-D objects, open-ended
questions, and field trips through virtual reality. BrainPOP [9] gives students
online access to resources through interactive class sessions, quizzes, and study
materials when access to schools is not feasible. It has therefore been one of the
favorite tools during school closures. Eduflow [25] is an online platform that
allows content creators to create online content. It provides a simulation
environment in which students can submit quizzes and assignments, ask live doubts,
track their performance, carry out teamwork activities, and more. YouTube [7] has
been one of the most popular sources for learning online. It allows content
creators to create education-specific channels and upload series of videos as
playlists. Students have free access to the content and can save a playlist to
access it offline.
Many screen recording applications are also available. They do not act as
standalone learning management systems but as tools for creating and delivering
content online. Applications such as Camtasia, ScreenHunter [38], Icecream Screen
Recorder [34], and the Windows screen recorder make it possible to create audio,
video, and screen recordings. They also provide options to edit the recordings and
customize the videos with intro and ending screens, animations, etc.
The main drawback of these systems is their inability to automate video
creation. Their approaches are not very user-friendly: rather than encouraging
educators to create videos with ease, they confine them behind the barrier of
content creation software, which discourages them from creating content and keeps
them attached to their old approaches. iTeach [17] therefore offers features such
as an accessible UI, an online video recorder and editor for content creation on
the fly, hosting through Cloudinary, and a rich video editor. The application is
built for the demands of the modern era. It uses current, in-demand technologies:
React.js [11] for the frontend, Node [36] for creating the API and exposing its
endpoints, and Mlab [15] for saving and managing the users’ content.
3 Proposed Model
One of the critical goals of an LMS [23] is to provide learning facilities to
students in the best possible manner. An LMS allows students to connect with the
best teachers worldwide, clear their doubts online, and take regular exams to track
their knowledge and learn better. It also empowers educators to create content without
getting stuck in the cumbersome process of buying and learning video-making
software, which holds back many enthusiastic instructors who are willing to create
content.
Although many e-learning models are available, the most prevalent ones fail to
provide a simple interface and easy access for students and teachers. For example,
a person who wants to teach students on YouTube or Unacademy [37] has to spend
money on recording and editing software and hardware. A teacher cannot teach with
images on the screen and annotate them directly, nor conduct live interactive doubt
sessions. One of the pivotal purposes of building an LMS is to make learning
accessible to students by connecting them with the best teachers worldwide,
clearing their doubts online, and conducting regular tests to track their knowledge
and help them learn better.
In summary, the following points indicate how the present systems act as blockers
for potential users:
• The present systems impose a cumbersome and painful process on instructors who
want to make content for students. For example, a person who wants to create
educational content on platforms like Unacademy [37], YouTube, or Scrimba has to
buy and set up video editing software and renderers and then edit the videos,
which requires sound technical know-how. This tedious process discourages many
teachers from making content.
• Many platforms, such as Unacademy [37], Moodle [8], and Absorb [27], lack a live
scribbler, an on-the-fly video editor, and live doubt sessions with the teachers.
Our model combines new features with pre-existing e-learning systems and
infrastructure, which makes learning easier. Our learning management system provides
the following features, shown in Fig. 1, to make learning easier and help automate
learning tasks.
1. Webcam teaching [24] – This utility allows teachers to teach from anywhere
in the world through live video sharing. A webcam enables a teacher to simulate
live classroom teaching. This feature uses the browser APIs to record videos with
a webcam and hosts the video sessions on a cloud service (Fig. 2).
2. Screen recorder – It allows teachers to teach by broadcasting their screen to
the students. It serves many purposes, such as teaching through code or explaining
through pictorial representations. The instructor can start the screen recorder
with a single click and is then given the option to capture the whole desktop or a
custom screen. Saving the recording makes an API call, through which a storage
management utility called Cloudinary stores the recording and hosts it on its
server (Fig. 3).
3. Scribble pad [20] – Another application feature is an inbuilt scribble
pad. The pad simulates a blackboard in a more interactive and handy way, enabling
educators to teach on a virtual board with a hardware pen attached to their
computing device. The device communicates with the OS and triggers drawing strokes
on the provided area. Usually, the existing
Fig. 1 Lecture creation
Fig. 2 Embedded video
mechanisms only handle and upload the instructors’ videos but don’t offer the
tools to create them. An online scribble pad would prove to be a great boon for
teachers, offering the following features.
(a) Pen – This feature lets teachers write with a hardware device, constructing
strokes on the screen to simulate a blackboard. Apart from the pen, other handy
tools for drawing squares, circles, or even panels are provided, which aid in
teaching geometry.
Fig. 3 Live screen
(b) Image upload – The image upload feature allows the user to upload an
image and explain it using scribbles, which makes for a better learning
experience. Teachers are freed from old-school methods of writing everything on
the board and explaining afterwards, or building the whole diagram first and then
explaining it. The feature works by inserting an image link, either from Google or
any other JPEG link, or by importing an image from the computer, and then clicking
the upload button. The image then appears on the display screen, where the
instructor can teach with it seamlessly.
(c) Undo/redo – The scribble pad also provides options for undoing and redoing
strokes and other actions, such as adding images or changing the stroke color,
which makes for effortless teaching and a smooth experience (Fig. 4).
(d) Color palette – The pad also has a rich color palette for colorful
teaching, highlighting the essential parts, and marking text appropriately.
4. Lecture live streaming – This feature allows teachers to live stream their
lectures, interact directly with the students, and resolve their doubts (Fig. 3).
The live sessions help enhance communication between the two parties [1].
5. Student marketplace – This section of the application offers a complete
marketplace where students can choose from a range of courses and teachers
worldwide.
6. One-to-one doubt sessions – This feature offers live one-to-one doubt sessions
with the teachers who delivered the lectures. Teachers can go live and interact
with students, who raise their doubts and questions. Instructors can thus discuss
the students’ queries with them directly. This feature is missing from many
prevalent systems and would improve accessibility and user experience significantly.
Fig. 4 Playback class
7. A blazing fast video editor – Teachers are saved from the hassle of
extensive manual video editing. The features include video presets that
automatically enhance sound quality and an easy-to-use GUI for editing videos on
the fly.
8. Teacher ratings – The application offers an anonymous feedback forum through
which students rate their teachers, so that a teacher can understand the students’
learning strategies and adapt their teaching methodology accordingly [19] (Fig. 5).
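The undo/redo behavior described for the scribble pad can be sketched with two stacks. This is a minimal illustration under assumed action shapes (`type`, `color`, `src`), not the iTeach implementation.

```javascript
// Minimal undo/redo history for scribble-pad actions (strokes, images, color changes).
function createHistory() {
  const done = [];   // actions currently visible on the pad
  const undone = []; // actions removed by undo, available for redo
  return {
    apply(action) {
      done.push(action);
      undone.length = 0; // a new action invalidates the redo stack
    },
    undo() {
      if (done.length > 0) undone.push(done.pop());
    },
    redo() {
      if (undone.length > 0) done.push(undone.pop());
    },
    visible() {
      return done.slice(); // copy, so callers cannot mutate the history
    },
  };
}

const pad = createHistory();
pad.apply({ type: "stroke", color: "black" });
pad.apply({ type: "image", src: "diagram.jpg" });
pad.undo();
console.log(pad.visible().length); // 1
pad.redo();
console.log(pad.visible().length); // 2
```

Redrawing the pad from `visible()` after each call keeps the canvas in step with the history; clearing the redo stack on a new action is the usual convention in drawing tools.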
Following are some of the screenshots of the applications:
4 Comparison
Various LMSs are available in the market, built on different technology stacks
and platforms. They offer features ranging from learner progress tracking and
course setup to live classes, doubt sessions, assignment management, distribution
of resources, and planners for teachers, as well as features such as an in-line
scribble pad, a video editor, and a student discussion forum. Table 1 compares
various learning management platforms on parameters such as the technology used,
features, and platforms [28].
From the literature, we have identified various LMSs offering different feature
sets. Table 2 compares the applications on features such as SCORM import, course
content creation, and LTI support.
Fig. 5 User interface
5 Users Feedback Analysis
We tested our application during the COVID-19 pandemic. This section describes
the questionnaires distributed among 104 teachers and 600 students, who tested the
application and provided their feedback by answering them. Two questionnaires (one
for teachers and one for students) were prepared according to the sections of the
application each group uses.
Table 3 specifies a set of generic questions for both teachers and students.
In our feedback evaluation, we used a Likert scale that ranges from strongly
disagree (1), disagree (2), and neutral (3) to agree (4) and strongly agree (5).
For the feedback analysis, we used divergent stacked bar charts. These charts place
negative and positive responses on opposite sides of a common baseline, which helps
us see the polarity of the feedback visually.
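The layout of one bar in such a chart can be sketched as follows. The sketch converts the five Likert counts into percentages and positions the segments around a zero line, splitting the neutral share evenly between the two sides; that split is one common convention and an assumption here, since the chapter does not say how its charts treat neutral responses.

```javascript
// Convert five Likert counts (ratings 1..5) into diverging-bar segments in percent.
// The neutral share is split evenly across the zero line (an assumed convention).
function divergingSegments(counts) {
  const total = counts.reduce((sum, c) => sum + c, 0);
  if (counts.length !== 5 || total === 0) {
    throw new Error("need five counts with a non-zero total");
  }
  const pct = counts.map((c) => (c * 100) / total);
  let start = -(pct[0] + pct[1] + pct[2] / 2); // left edge of the bar
  return pct.map((width) => {
    const segment = { start: start, end: start + width };
    start += width;
    return segment;
  });
}

// Counts reported in Table 3 for "The application has a smooth user experience".
const smoothUx = divergingSegments([66, 47, 82, 262, 203]);
const negativeExtent = -smoothUx[0].start; // how far the bar reaches to the left
const positiveExtent = smoothUx[4].end;    // how far the bar reaches to the right
console.log(negativeExtent.toFixed(1), positiveExtent.toFixed(1));
```

For this question the bar extends roughly 23% to the left of the baseline and 77% to the right, which is the visual polarity the chart is meant to convey.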
Figure 6 makes it evident that the feedback on the feature set is positive. The
evaluation of the questionnaire provides the following insight about the
application: it offers a smooth user experience without lag in its interface (Fig.
5), and it works well even on a low-bandwidth network.
The students’ feedback on application usage, shown in Table 4, indicates its
effectiveness on various parameters. From Fig. 7, we can see that the application
proves to be effective, with nearly 40% of students agreeing that the
Table 1 Comparison of the various learning management platforms
LMS name | Technology used | Features | Platform and tools
Moodle [18] | PHP, Nginx | Learner progress tracker, quick activity and course setup | Mandrill, GSuite, RequireJS for DevOps
Unacademy [37] | JavaScript, React [11], Ruby, C#, TypeScript | Live classes, doubt sessions, student subscriptions | CMake, Google Tag Manager [10]
Absorb LMS [27] | PHP, Amazon Web Services, JavaScript, TypeScript, React | E-commerce, content libraries, analytic reports, AICC/SCORM support | Google Analytics
Google Classroom [12] | Google Suite, Kotlin | Assignment management, distribution of resources | GeoGebra Classic, Activity learn, Hiver
Edmodo [31] | jQuery, Nginx, Handlebars, Node.js | Planner for teachers, Edmodo badges, publisher communities | Optimizely, Google Analytics, Google Tag Manager [10], Bugsnag, New Relic
iTeach [17] | React, JavaScript, Node.js, MongoDB, Mongoose | In-line scribble pad, video editor, student discussion forum | Mlab [15], Cloudinary
Table 2 Feature comparison
SCORM import
Bundled course
content
Google app
integration
Single sign-on
E- commerce
Developer API
available
LTI support
Native web
hosting
Scribble pad
Screen recording
LMS Name
Yes
No
Yes
No
Yes
Yes
No
Yes
No
Yes
Yes
Yes
Yes
Yes
No
Yes
Yes
Yes
yes
Yes
Yes
Yes
No
Yes
Yes
No
Yes
Yes
No
No
No
No
Yes
No
Yes
Yes
Yes
Yes
No
Yes
No
No
Absorb LMS
[27]
No
No
Moodle LMS
[18]
No
No
Infrastructure
Canvas LMS
[35]
No
No
Schoology
LMS
[14]
Yes
Yes
iTeach [17]
Table 3 General feedback ratings
S. no. | Questions | 1 (Strongly disagree) | 2 (Disagree) | 3 (Neutral) | 4 (Agree) | 5 (Strongly agree)
1. | The application has a smooth user experience | 66(10%) | 47(7%) | 82(12%) | 262(40%) | 203(31%)
2. | The application is easy to operate and find on the Internet | 22(3%) | 19(3%) | 100(15%) | 200(30%) | 319(48%)
3. | No personal data leaked from the application | 20(3%) | 47(8%) | 419(70%) | 95(16%) | 20(3%)
4. | The application works even on slow connections | 23(3%) | 45(7%) | 111(17%) | 336(51%) | 145(22%)
5. | The interface does not get stuck and responds effectively | 50(8%) | 62(9%) | 61(9%) | 76(12%) | 411(62%)
6. | The application replies to bugs and patch update issues on time | 73(11%) | 79(12%) | 97(15%) | 281(43%) | 130(20%)
Fig. 6 General feedback stacked bar chart
application provides a good learning platform. Almost 70% of students agreed that
the test experience is smooth, and more than 90% appreciated the result evaluation
and its representation for pinpointing the scope for improvement and learning gaps.
The application offers better adaptability and understanding of the topic because
the same teacher takes the doubt session. The students also confirmed that the
application was not bulky, and they did not observe any connectivity issues. The
application does not take too much RAM on mobile or desktop devices, and the
scratchpad feature is handy for visual learning. Over 80% of students are willing
to recommend the platform to other students.
The teachers’ feedback (Table 5, Fig. 8) on various usability parameters also
suggests that the application serves its purpose well and has an easy-to-use
interface. Over 80% of faculty found the platform user-friendly for creating
content. Nearly 70% found that the application does not take too much video
rendering time, which is otherwise an issue with LMSs. The platform allows bulk
entry of questions through Excel sheets, which saves the evaluators time. The
platform also saves time in evaluating tests and assignments, provides correct
visualization data, and correctly depicts students’ weak points. More than 75% of
teachers agree on the effectiveness of the application and are ready to recommend
it to other teachers.
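Bulk question entry of this kind can be sketched as parsing spreadsheet rows that have been exported as CSV. The column layout (question text, four options, index of the correct option) and the function name are assumptions made for illustration; the chapter does not describe the sheet format that iTeach expects.

```javascript
// Parse spreadsheet rows exported as CSV into question objects.
// Assumed (hypothetical) columns: question, four options, correct option index (1-4).
// This simple parser does not handle quoted fields that contain commas.
function parseQuestionSheet(csvText) {
  const lines = csvText.split(/\r?\n/).filter((line) => line.trim() !== "");
  const rows = lines.slice(1); // skip the header row
  return rows.map((line, index) => {
    const cells = line.split(",").map((cell) => cell.trim());
    if (cells.length !== 6) {
      throw new Error("row " + (index + 2) + " should have 6 columns");
    }
    const answer = Number(cells[5]);
    if (!Number.isInteger(answer) || answer < 1 || answer > 4) {
      throw new Error("row " + (index + 2) + " has an invalid answer index");
    }
    return { text: cells[0], options: cells.slice(1, 5), answer: answer };
  });
}

const questionSheet = [
  "question,optionA,optionB,optionC,optionD,answer",
  "2 + 2 = ?,3,4,5,6,2",
  "Capital of India?,Mumbai,Kolkata,New Delhi,Chennai,3",
].join("\n");
console.log(parseQuestionSheet(questionSheet).length); // 2
```

Rejecting malformed rows with their row number lets an evaluator correct the sheet and upload it again instead of entering the questions one by one.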
Table 4 Students feedback ratings
S. no. | Questions | 1 (Strongly disagree) | 2 (Disagree) | 3 (Neutral) | 4 (Agree) | 5 (Strongly agree)
1 | The application helps me learn efficiently | 32(5%) | 112(19%) | 221(37%) | 145(24%) | 90(15%)
2 | The course material is useful and covers all concepts | 67(11%) | 124(21%) | 172(29%) | 203(34%) | 34(6%)
3 | The application has a smooth experience for taking tests | 15(3%) | 46(8%) | 58(10%) | 339(57%) | 142(24%)
4 | The application is sound with no security vulnerabilities (like hacking the test timer, etc.) | 31(5%) | 70(12%) | 363(61%) | 96(16%) | 40(7%)
5 | The application shows the correct test result data after the test | 15(3%) | 12(2%) | 20(3%) | 27(5%) | 526(88%)
6 | The application shows the graphical representation of tests results taken over time | 5(1%) | 7(1%) | 14(2%) | 56(9%) | 518(86%)
7 | The doubt sessions involve the same teacher taking your classes, with no connectivity issues | 77(13%) | 93(16%) | 70(12%) | 200(33%) | 160(27%)
8 | The application doesn’t take too much RAM for mobile/desktop devices | 68(11%) | 88(15%) | 115(19%) | 267(45%) | 62(10%)
9 | The scratchpad feature is handy for visual learning | 32(5%) | 24(4%) | 69(12%) | 163(27%) | 312(52%)
10 | You wish to recommend the platform to other students | 2(0%) | 43(7%) | 36(6%) | 300(50%) | 219(37%)
11 | The courses provide a sufficient number of assignments to cover the topic | 70(12%) | 83(14%) | 196(33%) | 155(26%) | 96(16%)
Fig. 7 Students feedback stacked bar chart
6 Conclusion and Future Scope
The learning management system has been successfully built with easy-to-use
features. The model combines new features with pre-existing e-learning systems and
infrastructure, which makes learning easier. It provides a better teaching–learning
environment that suits both students and teachers: students can visualize the
content in an appealing way, and teachers can create content with great ease. A few
of the critical components that make the application smooth and easy to use are as
follows:
• A fast video editor allows teachers to create and edit their videos easily,
without using external resources.
• A new feature called the Feedback system lets students complete a short survey
on what they have understood and what requires more clarity.
Fig. 8 Teachers’ feedback stacked bar chart
It also allows online quiz sessions with the teacher, rather than only letting
students write questions in forums, which is never fun.
• It offers a built-in, powerful Scratchpad that mimics the board experience, with
the added ability to import images and videos and explain concepts through visual
learning.
• Teachers can periodically conduct tests that show which concepts the students
have understood and which elements need repetition.
• The product enables instructors to create content without any prior knowledge
of video editing, content rendering, etc. They only need to focus on what they
are going to teach.
As future scope, we plan to provide more advanced software for e-learning
management systems. A teletyping feature, in which the teacher’s words are
transcribed as they speak, would give students notes for greater accessibility.
Better provisions for storing and saving videos and for removing noise are also
planned.
Table 5 Teachers feedback ratings
S. no. | Question | 1 (Strongly disagree) | 2 (Disagree) | 3 (Neutral) | 4 (Agree) | 5 (Strongly agree)
1 | Students can understand the content delivered | 13(13%) | 11(11%) | 12(12%) | 18(17%) | 50(48%)
2 | Students happy with audio/video quality | 12(12%) | 14(13%) | 6(6%) | 16(15%) | 56(54%)
3 | The application provides correct visualization data and correctly depicts the weak points of students | 5(5%) | 7(7%) | 17(16%) | 57(55%) | 18(17%)
4 | Video saved have no audio/video issues | 2(2%) | 17(16%) | 8(8%) | 49(47%) | 28(27%)
5 | You would recommend the application to other teachers | 5(5%) | 9(9%) | 9(9%) | 15(14%) | 66(63%)
6 | The application doesn’t take too much video rendering time | 7(7%) | 10(10%) | 16(15%) | 22(21%) | 49(47%)
7 | The platform is user-friendly in creating contents | 1(1%) | 13(13%) | 6(6%) | 38(37%) | 46(44%)
8 | The platforms allow bulk entry to questions through excel sheets | 0(0%) | 4(4%) | 0(0%) | 26(25%) | 74(71%)
9 | The platform is time-saving to evaluate tests and assignments | 1(1%) | 3(3%) | 10(10%) | 46(44%) | 44(42%)
10 | The discussion forum helps understand the doubts of the students | 5(5%) | 23(22%) | 9(9%) | 61(59%) | 6(6%)
11 | The application correctly depicts the weak points of students | 14(13%) | 19(18%) | 15(14%) | 30(29%) | 26(25%)
N. Sharma et al.
iTeach: A User-Friendly Learning Management System
The above items are improvements that can increase the applicability and utilization of this model. They call for effective management of student and assignment records and a strategy for using cloud storage to obtain more space at a low cost. The video players can also be made more versatile than they currently are. It is possible to introduce a method for managing e-learning management systems with improvements for students, administrators, and teachers, such as quizzes and assignments. A robust role and rights management system is necessary to provide a normalized implementation, so that real data can be handled without any security leaks and the users’ trust can be retained.
References
1. Abdous, M., & Yen, C. J. (2010). A predictive study of learner satisfaction and outcomes in
face-to-face, satellite broadcast, and live video-streaming learning environments. Internet and
Higher Education, 13(4), 248–257. https://doi.org/10.1016/j.iheduc.2010.04.005
2. Abid Azhar, K., & Iqbal, N. (2018). Effectiveness of Google classroom: Teachers’ perceptions.
Prizren Social Science Journal, 2(2), 52–66.
3. Adrain, S. (2019). 10 strategies to improve your learning management system. https://
www.intuto.com/blog/10-strategies-to-improve-your-learning-management-system
4. Alhabeeb, A., & Rowley, J. (2018). E-learning critical success factors: Comparing perspectives
from academic staff and students. Computers and Education, 127(August), 1–12. https://
doi.org/10.1016/j.compedu.2018.08.007
5. Attwell, G. (2007). Personal learning environments-the future of eLearning. Elearning Papers,
2(1), 1–8.
6. Barbosa, T. J. G., & Barbosa, M. J. (2019). Zoom: An innovative solution for the
live-online virtual classroom. HETS Online Journal, 9(2). https://go.gale.com/ps/
anonymous?p=AONE&sw=w&issn=&v=2.1&it=r&id=GALE%7CA596061565&sid=google
Scholar&linkaccess=fulltext
7. Chtouki, Y., Harroud, H., Khalidi, M., & Bennani, S. (2012). The impact of YouTube
videos on the student’s learning. In 2012 International Conference on information technology based higher education and training, ITHET 2012 (pp. 1–4). https://doi.org/10.1109/
ITHET.2012.6246045
8. Despotović-Zrakić, M., Marković, A., Bogdanović, Z., Barać, D., & Krčo, S. (2012). Providing
adaptivity in moodle lms courses. Educational Technology and Society, 15(1), 326–338.
9. Donovan, L., Green, T. D., & Mason, C. (2014). Examining the 21st century classroom:
Developing an innovation configuration map. Journal of Educational Computing Research,
50(2), 161–178. https://doi.org/10.2190/EC.50.2.a
10. Farney, T. (2019). Designing shareable tags: Using Google tag manager to share code. The
Code4Lib Journal, 46. https://journal.code4lib.org/articles/14853
11. Fedosejev, A. (2015). Create your first react element. React. js essentials. Packt Publishing
Ltd.
12. Hulse, R. (2019). The use and implementation of Google classroom in a Japanese University.
The Centre for the Study of English Language Teaching Journal, 7, 71.
13. Hypersay. (n.d.). Retrieved April 4, 2020, from https://hypersay.com/
14. Kattoua, T., Al-Lozi, M., & Alrowwad, A. (2016). A review of literature on knowledge
management using ICT in Higher Education. International Journal of Business Management
& Economic Research, 7(5), 754–762.
15. Knott, G. D. (1979). MLAB – A mathematical modeling TOOL. Computer Programs in
Biomedicine, 10, 231–244.
220
N. Sharma et al.
16. Kraleva, R., Sabani, M., & Kralev, V. (2019). An analysis of some learning management systems. International Journal on Advanced Science, Engineering and Information Technology,
9(4), 1190–1198. https://doi.org/10.18517/ijaseit.9.4.9437
17. Learning-management-system. (2019). https://github.com/siddhant1/Learning-managementsystem#start-of-content
18. Machado, M., & Tao, E. (2007). Blackboard vs. Moodle: Comparing user experience of
learning management systems. In Proceedings – Frontiers in Education Conference, FIE,
December 2006 (pp. 7–12). https://doi.org/10.1109/FIE.2007.4417910
19. MacKenzie, D. (2003). Assessment for e-learning: What are the features of an ideal eassessment system. In Proceedings of the 7th CAA conference (pp. 8–9). http://hdl.handle.net/
2134/1914
20. Maher, M. L., Rosenman, M. A., Smith, G. J., & Marchant, D. (2005). Supporting collaboration and multiple views of building models in virtual worlds. 1–11.
21. Martin, L., & Tapp, D. (2019). Teaching with teams: An introduction to teaching an
undergraduate law module using Microsoft Teams. Innovative Practice in Higher Education,
3(3), 58–66.
22. Mayyas, M., & Bataineh, R. (2019). Perceived and actual effectiveness of Easyclass in
Jordanian EFL tertiary- level students’ grammar learning. International Journal of Education
and Development Using Information and Communication Technology, 15(4), 89–100.
23. Mohammed, O., Mohamed, Y., & Chkouri, A. N. (2018). Learning management system and
the underlying learning theories: Towards a new modeling of an LMS. International Journal
of Information Science & Technology, 2(1), 25–33. https://doi.org/10.1007/978-3-319-745008_67
24. Morris, M. (2020). react-webcam. Webcam Component. https://www.npmjs.com/package/
react-webcam
25. Nitzsche, J., & Norton, B. (2009). Business process management workshops. In Business
process management workshops (Vol. 17). https://doi.org/10.1007/978-3-642-00328-8
26. Ouadoud, M., Nejjari, A., Chkouri, M. Y., & El Kadiri, K. E. (2018). Educational modeling of
a learning management system. In Proceedings of 2017 International conference on electrical
and information technologies, ICEIT 2017, 2018-Janua (March 2018), 1–6. https://doi.org/
10.1109/EITech.2017.8255247
27. Pandya, C. (2019). Librarianship development through Internet of Things and customer
services (Issue February).
28. Pappas, C. (2019). The 20 Best learning management systems (2019 Update). Learning Management Systems. https://elearningindustry.com/the-20-best-learning-management-systems
29. Patel, C., Gadhavi, M., & Patel, A. (2013). A survey paper on e-learning based learning
management Systems (LMS). Ijser.Org, 4(6), 171–177. http://www.ijser.org/researchpaper/Asurvey-paper-on-e-learning-based-learning-management-Systems-LMS.pdf
30. Perez, J. E. (2017). Resource review: Nearpod. Journal of the Medical Library Association,
105(1), 108–110. https://doi.org/10.5195/jmla.2017.121
31. Pertiwi, A., Kariadinata, R., Juariah, J., Sugilar, H., & Ramdhani, M. A. (2019). Edmodo-based
blended learning on mathematical proving capability. Journal of Physics: Conference Series,
1157(4). https://doi.org/10.1088/1742-6596/1157/4/042001
32. Raj, V. P. (2020). Solutions to unlock the challenges in a Flipped class. Our Heritage, 68(41),
5–9.
33. Saba, T. (2012). Implications of E-learning systems and self-efficiency on students outcomes:
A model approach. Human-Centric Computing and Information Sciences, 2(1), 1–11. https://
doi.org/10.1186/2192-1962-2-6
34. Screen Recorder. (n.d.). Retrieved April 4, 2020, from https://icecreamapps.com/ScreenRecorder/
35. Shepherd, I., Pope, D., & Reeves, B. (2019). A Canvas learning management system proposal
for accreditation reporting using rubrics and assignments. Journal of Higher Education Theory
and Practice, 19(4). https://doi.org/10.33423/jhetp.v19i4.2205
iTeach: A User-Friendly Learning Management System
221
36. Tilkov, S., & Vinoski, S. (2010). Node.js: Using JavaScript to build high-performance network
programs. IEEE Internet Computing, 14(6), 80–83. https://doi.org/10.1109/MIC.2010.145
37. Unacademy.com. (n.d.). Retrieved April 4, 2020, from https://unacademy.com/
38. Wright-Porto, H., & Wright-Porto, H. (2011). More than text. In Creative blogging. Springer.
https://doi.org/10.1007/978-1-4302-3429-6_4
Part III
Data-Intensive Systems in Health Care
Analysis of High-Resolution CT Images
of COVID-19 Patients
A. Joy Christy and A. Umamakeswari
1 Introduction
The coronavirus behind one of the deadliest pandemics was first identified in Wuhan, Hubei Province, China, in late December 2019 [1]. The virus, formerly called 2019-nCoV, is a mutated virus from the family of severe acute respiratory syndrome (SARS) coronaviruses and is termed SARS-CoV-2. Later, the World Health Organization (WHO) officially named the disease coronavirus disease 2019 (COVID-19) on 12 February 2020 [2]. The main symptoms of the disease include pneumonia, fever, dry cough, shortness of breath, fatigue, sore throat, and severe respiratory illness in the later stages [3]. The disease is highly infectious and has spread across 210 countries all over the world. Because the disease can be asymptomatic, infected individuals may act as carriers or transmitters of the virus, which accelerates the spread of the disease. Besides being highly contagious, the virus had claimed more than 461,715 lives and infected 8,708,008 people across the globe as of 22 June 2020. The pandemic not only poses a threat to human lives but also has adverse social, economic, and political effects on nations. Countries are racing to slow down the spread of the virus through rapid tests and treatments, with the intention of reducing the high case fatality rate and returning to normal life after the worldwide shutdown. The initial screening of symptomatic individuals starts with the testing of nasal and throat swabs. The high false-negative rate of these tests makes the COVID-19 outbreak harder to control, because misdiagnosed patients might miss the golden hours for proper treatment and continue to spread the disease. Analysis of high-resolution computed tomography (CT) scan images is proposed as the main diagnostic method, as suggested by the “Pneumonia diagnosis and treatment guideline for SARS-CoV-2 infection (trial version 5),” issued by the National Health Commission of the People’s Republic of China. Specifically, CT scan images of the chest are found to be effective in detecting abnormalities in the lungs and help in the early diagnosis of the disease.
A. Joy Christy (✉) · A. Umamakeswari
School of Computing, SASTRA Deemed to be University, Thanjavur, Tamil Nadu, India
e-mail: joychristy@cse.sastra.edu
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
S. Pandey et al. (eds.), Role of Data-Intensive Distributed Computing Systems in Designing Data Solutions, EAI/Springer Innovations in Communication and Computing, https://doi.org/10.1007/978-3-031-15542-0_12
The use of machine learning for the descriptive analysis of various diseases has recently gained attention, and such methods are intended to serve as assistive tools for physicians. Image segmentation is one of the popular applications of machine learning. The goal of image segmentation is to transform an image into a representation that is easier to analyze. Image segmentation methods typically help to locate objects and boundaries in images. In image segmentation, every pixel in the image is labeled, and pixels with the same label share similar characteristics with respect to color, intensity, or texture. The outcome of an image segmentation method is a set of segments, or a set of contours extracted from the image, which helps medical practitioners to locate tumors and pathologies, measure the volume of tissues, study anatomical structure, and carry out surgery simulation and planning.
This work aims at segmenting the chest CT scan images of patients admitted with coronavirus symptoms and analyzing the abnormality in the lungs from the segmented images, in order to better understand the condition of patients with COVID-19. The proposed work would facilitate the quantification of lesion regions, with emphasis on the survivability of the patients.
The clinical data has been acquired from the Kaggle data repository as a dataset of about 80 MB. The dataset contains a database of COVID-19 cases with chest x-ray and high-resolution CT images. The dataset notes that the images are taken from the costophrenic angle with the patients in a supine, head-first position while holding their breath. In addition to the images, the dataset also holds metadata about the patients: patient ID, offset (number of days since the start of symptoms or hospitalization for each image), gender, age, finding (which type of pneumonia), survival (whether the patient survived or not), view (x-ray or CT image), date (date the image was acquired), location (hospital name, city, state, and country), source (the source file name of the image), DOI (DOI of the research article), URL (URL of the research article or website resource of the image), clinical notes (radiograph information), and other notes (other information related to credit).
An image segmentation algorithm is applied to the CT scan images to analyze the distribution features and the shape of abnormal attenuation in the lungs. Each CT scan image is represented as a grayscale image whose pixel values range from 0 to 1. In this work, the grayscale image is converted into a binary image having only two classes, 0 or 1, where 0 represents black and 1 represents white. Image segmentation groups the pixels with value 0 into one segment and the remaining pixels into another. The binary regions produced by image segmentation algorithms are distorted by noise and texture and may contain numerous imperfections. To overcome these issues, morphological image processing is used to remove the imperfections by accounting for the form and structure of the image. The segments are then processed to extract the regions of interest from the CT images.
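The binarization and morphological clean-up described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the 0.5 cut-off, the 3x3 neighbourhood, and the helper names (`binarize`, `erode`, `dilate`, `opening`) are assumptions introduced here, and morphological opening is only one of several operations that could be meant by "morphological image processing".

```python
from typing import List

Mask = List[List[int]]


def binarize(image: List[List[float]], threshold: float = 0.5) -> Mask:
    """Label each grayscale pixel (0..1) as 1 (white) or 0 (black).

    The 0.5 cut-off is an illustrative assumption, not a value from the text.
    """
    return [[1 if px >= threshold else 0 for px in row] for row in image]


def _window(mask: Mask, r: int, c: int) -> List[int]:
    """Return the 3x3 neighbourhood of (r, c); out-of-image pixels count as 0."""
    rows, cols = len(mask), len(mask[0])
    return [
        mask[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0
        for rr in (r - 1, r, r + 1)
        for cc in (c - 1, c, c + 1)
    ]


def erode(mask: Mask) -> Mask:
    """Keep a white pixel only if its whole 3x3 neighbourhood is white."""
    return [[min(_window(mask, r, c)) for c in range(len(mask[0]))]
            for r in range(len(mask))]


def dilate(mask: Mask) -> Mask:
    """Turn a pixel white if any pixel in its 3x3 neighbourhood is white."""
    return [[max(_window(mask, r, c)) for c in range(len(mask[0]))]
            for r in range(len(mask))]


def opening(mask: Mask) -> Mask:
    """Erosion followed by dilation: removes specks smaller than the window."""
    return dilate(erode(mask))


# Toy 5x5 image: a 3x3 bright region plus one isolated bright pixel (noise).
gray = [
    [0.1, 0.1, 0.1, 0.1, 0.9],
    [0.1, 0.9, 0.9, 0.9, 0.1],
    [0.1, 0.9, 0.9, 0.9, 0.1],
    [0.1, 0.9, 0.9, 0.9, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.1],
]
cleaned = opening(binarize(gray))
```

In this toy case the isolated corner pixel disappears after opening while the 3x3 region is preserved, which is the kind of imperfection removal the paragraph refers to.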
The regions of interest from the segmented images are to be independently
reviewed in the following aspects:
(i) Lesion distribution: Left lobe or right lobe of the lung
(ii) Lesion location: Central and peripheral
(iii) Lobes involved: Superior, middle, and inferior lobes
(iv) Lesion characteristics: Ground-glass opacity (GGO) and margins
2 Review of Literature
Recently, many diagnostic tools based on predictive analysis have been proposed to detect the presence of COVID-19. Many of these works use deep learning approaches and analyze CT scan or x-ray images to determine whether a person is infected with COVID-19. This review focuses on recent models for COVID-19 disease.
Ozturk et al. [4] have proposed an automated diagnostic tool to detect COVID-19 cases using a deep neural network with x-ray images. The authors have used the Darknet convolutional neural network model as a predictive model. They have developed models for binary classification (COVID vs. no-findings cases) and multiclass classification (COVID vs. no-findings vs. pneumonia cases). The authors have obtained the images from public image data sources and have processed 1127 images. Among the experimental images, 500 represent no-finding cases, 500 denote pneumonia cases, and 127 illustrate COVID-19 positive cases. The authors have claimed that the classification accuracy of the proposed model is 98.08% for binary classification and 87.02% for multiclass classification. The authors have stated that their work does not include feature selection.
Pathak et al. [5] have proposed a deep transfer learning-based classification model for COVID-19 disease. The authors have also used a top-2 smooth loss function with cost-sensitive attributes to handle noisy and imbalanced images. The authors have taken CT scan images of COVID-19 cases and implemented the ResNet-50 convolutional neural network model as a knowledge prediction model. They have used transfer learning to tune the initial parameters and deep transfer learning to train the classification model. The authors have built a classifier that categorizes 413 COVID-19 positive cases and 439 normal and pneumonia-infected cases with 96.22% accuracy. The authors have claimed that tenfold cross-validation of the classifier is used to prevent overfitting and have stated that the optimal selection of hyper-parameters is not considered in their work.
Togacar et al. [6] have proposed deep learning models that exploit social mimic optimization (SMO) and structured chest x-ray images using fuzzy color and stacking approaches for the detection of COVID-19. The authors have implemented MobileNetV2 and SqueezeNet deep learning models, with the feature sets obtained by the models processed using the SMO method. The authors have used the SVM algorithm as a classifier for diseased and non-diseased cases of COVID-19. The authors have accessed a publicly available COVID-19 x-ray image dataset with three classes. They have collated images from multiple sources and have processed 458 images with 295 COVID-19 images, 65 normal images, and 95 pneumonia images. The authors have claimed that their proposed model achieves 99.27% accuracy in COVID-19 disease prediction.
RahimZadeh and Attar [7] have introduced a modified deep convolutional neural network model for detecting COVID-19. The authors have proposed a concatenation of the Xception and ResNet50V2 neural network models to detect COVID-19 disease. The authors have taken images from two open-source datasets, ieee8023 and Kaggle, where the former contains three classes of x-ray images and the latter contains two classes of x-ray images. The dataset encompasses 118 COVID-19 cases, 42 pneumocystis cases, 25 streptococcus cases, 6012 pneumonia cases, and 8851 no-finding cases. The authors have claimed that the proposed model correctly classifies the COVID-19 disease with an accuracy of 99.50%, with an overall accuracy across all classes of 91.4%. The authors have stated that the imbalance of images across the various classes is problematic.
Zhang et al. [8] have constructed a large CT dataset on novel coronavirus pneumonia (NCP), other common types of pneumonia, and normal controls. The authors have developed an AI diagnostic system for assisting junior radiologists in the epidemic area and two non-epidemic areas in China. The authors have also provided prognosis indications for patients with NCP by using a combination of CT scan images and clinical attributes. The dataset contains 617,775 CT images from 4154 patients. The authors have built the knowledge prediction model as a multiclass classifier over the NCP, viral pneumonia, bacterial pneumonia, mycoplasma pneumonia, and normal control classes. The authors have claimed that their system is able to differentiate NCP from other classes with 92.49% accuracy.
Wu et al. [9] have developed a deep learning-based method to assist radiologists in identifying patients with COVID-19 from CT images. The authors have collected the CT images of 495 patients from three hospitals in China. The dataset contains two classes: COVID-19 and other pneumonia. The authors have built a binary classifier as a knowledge prediction model. They have used a multi-view fusion model based on a deep learning network to screen patients with COVID-19 using the maximum lung regions in axial, coronal, and sagittal views. The authors have claimed that the proposed multi-view deep learning fusion model has achieved 81.9% accuracy. The authors have also computed the risk score for each patient based on the morbidities and comorbidities.
Most of the articles published in the existing literature on COVID-19 are concerned with the detection of COVID-19 disease using deep learning models. These models vary in image size, classes, neural network model, classifier, and accuracy. Disease prediction falls under the predictive analysis branch of machine learning. Only a little effort has been made in the descriptive analysis of COVID-19. Hence, in this work, we analyze the CT scan images of COVID-19 cases to perform the following:
• To segment the patchy ground-glass opacities with clear margins and highlight septal thickening inside the lesions
• To exhibit the disfigured and abnormal shape of the lung
• To quantify the lesion region
3 Materials and Methods
This work examines the lung-CT images of 21 COVID-19 patients and reviews their clinical data, such as age, survival, and the condition of the respiratory organs at the time of admission. The images are obtained from Kaggle’s ieee8023-covidchestxray dataset. The dataset also includes clinical notes for each patient that describe the signs, symptoms, and technical complications of the patient. The features of the lung-CT images encompass GGO, mixed GGO, and consolidations. The objective of the study is to assess the lesion region segment by segment so as to find the correlation between the patient’s survivability and the percentage of lesion region. The study is also extended to analyze the relationship between the patient’s age and the lesion region. To do so, the affected lung regions must be extracted from the CT images. Thus, in this work, a novel thresholding-based image segmentation method called percentage split distribution (PSD) is used for segmenting lesion regions. PSD is a multilevel thresholding method that segments an image based on the distribution of its pixels. The method intends to identify the region that develops the GGO, which is considered the benign lesion region of COVID-19 disease. The images are preprocessed to remove the background noise. The processed images are then segmented into different groups to extract the GGOs using the notation given in Eq. 1:
$$\mathrm{PSD}_i = \big((\text{Maximum Pixel Value} - \text{Minimum Pixel Value}) \times (i \times sp)\big) - \text{Minimum Pixel Value}, \qquad i = 1, \ldots, n-1 \tag{1}$$
sp refers to the segment percentage and is calculated using the formula denoted in Eq. 2:
$$sp = \frac{100}{n} \tag{2}$$
Equation 1 generates n−1 thresholds that assign each pixel to its relevant group. The number of pixels in each group is counted, and its fraction of the total image pixels is calculated to quantify the lesion and normal regions. Equations 3 and 4 define the quantification of the lesion (GGO) and normal (NR) regions:
$$\mathrm{GGO} = \frac{\text{Number of Pixels in Gray Region}}{\text{Total Number of Pixels in the Image}} \tag{3}$$
$$\mathrm{NR} = \frac{\text{Number of Pixels in Black Region}}{\text{Total Number of Pixels in the Image}} \tag{4}$$
Fig. 1 Graphical abstract of the proposed work
Figure 1 denotes the graphical abstract of the proposed work.
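The thresholding and quantification steps of Eqs. 1 to 4 can be sketched as below. This is an interpretive sketch rather than the authors' code: it assumes that sp is a percentage that is divided by 100 before use and that the cut points are offset from the minimum pixel value, so that they are evenly spaced across the intensity range. The function names and the toy pixel values are hypothetical, and which group counts as the "gray" (GGO) or "black" (normal) region is left to the caller.

```python
from typing import List, Sequence


def psd_thresholds(pixels: Sequence[float], n: int) -> List[float]:
    """Return n-1 evenly spaced cut points across the pixel intensity range."""
    lo, hi = min(pixels), max(pixels)
    sp = 100 / n  # segment percentage, Eq. 2
    # Assumption: sp is applied as a fraction and offset from the minimum.
    return [lo + (hi - lo) * (i * sp / 100) for i in range(1, n)]


def assign_group(value: float, thresholds: Sequence[float]) -> int:
    """Return the index (0..n-1) of the group that the pixel value falls in."""
    for group, cut in enumerate(thresholds):
        if value < cut:
            return group
    return len(thresholds)


def group_fractions(pixels: Sequence[float], n: int) -> List[float]:
    """Fraction of all pixels in each group, in the spirit of Eqs. 3 and 4."""
    thresholds = psd_thresholds(pixels, n)
    counts = [0] * n
    for px in pixels:
        counts[assign_group(px, thresholds)] += 1
    return [count / len(pixels) for count in counts]


# Toy example with made-up intensities in [0, 1] and n = 4 groups.
pixels = [0.0, 0.1, 0.2, 0.5, 0.6, 0.9, 1.0, 1.0]
thresholds = psd_thresholds(pixels, 4)
fractions = group_fractions(pixels, 4)
```

Under these assumptions, the fraction reported for the darkest group would play the role of NR and the fraction for the chosen gray group would play the role of GGO.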
The correlation between the quantitative imaging data and the survivability of the patient is tested using Pearson’s correlation. The method measures the linear dependency between two variables with a value ranging from −1 to 1, where 1 indicates a perfect positive correlation, −1 a perfect negative correlation, and 0 no linear correlation [10]. The ρ value in Pearson’s correlation (the p-value) refers to the probability of obtaining a result at least as extreme as the current one if the two variables were in fact uncorrelated. If this probability is lower than 0.05, the linear relationship between the two variables is considered statistically significant.
Equation 5 denotes the Pearson correlation measure:
$$r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \, \sum_{i=1}^{n} (Y_i - \bar{Y})^2}} \tag{5}$$
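As an illustration of Eq. 5, the following sketch computes Pearson's r with the standard library only. The function name and the sample values are made up for the example and are not taken from the study data.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples (Eq. 5)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Numerator: sum of products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator: square root of the product of summed squared deviations.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(ss_x * ss_y)


# Made-up percentages: the second series is 100 minus the first,
# so the two are perfectly negatively correlated.
part = [10.0, 20.0, 30.0, 40.0]
rest = [100.0 - p for p in part]
r_complement = pearson_r(part, rest)
r_scaled = pearson_r(part, [2.0 * p for p in part])
```

The example also shows a general property: when two quantities always sum to a constant, their correlations with any third variable have the same magnitude and opposite signs.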
The quality of the segmented images is analyzed using the performance metrics
such as MSE, PSNR, SSIM, and computational complexity.
The mean-square error (MSE) is the average of the squared errors between the original and segmented images [11]. The lower the MSE, the closer the segmented image is to the original. Equation 6 is used to compute the MSE:
$$\mathrm{MSE} = \frac{\sum_{X,Y} \left[ O(x, y) - S(x, y) \right]^2}{X \times Y} \tag{6}$$
Where
O refers to the original image
S refers to the segmented image
X and Y refer to the rows and columns in the original and segmented images
PSNR metric computes the peak signal-to-noise ratio (PSNR) between original
and segmented images in decibels. The higher the PSNR is, the better the quality
of the segmented image [12]. PSNR is a measure of peak error, derived from MSE.
The formula for computing PSNR is denoted in Eq. 7:
$$\mathrm{PSNR} = 10 \log_{10} \left( \frac{F^2}{\mathrm{MSE}} \right) \tag{7}$$
Where
F refers to the maximum fluctuation in the image data type
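Equations 6 and 7 can be sketched in a few lines. The function names are hypothetical, the images are plain nested lists, and the default peak value F = 1.0 assumes pixel values scaled to the range 0 to 1; for 8-bit images F would be 255.

```python
import math
from typing import List

Image = List[List[float]]


def mse(original: Image, segmented: Image) -> float:
    """Mean of squared pixel differences over an X-by-Y image (Eq. 6)."""
    rows, cols = len(original), len(original[0])
    total = sum(
        (o - s) ** 2
        for row_o, row_s in zip(original, segmented)
        for o, s in zip(row_o, row_s)
    )
    return total / (rows * cols)


def psnr(original: Image, segmented: Image, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in decibels (Eq. 7); peak plays the role of F."""
    error = mse(original, segmented)
    if error == 0:
        return math.inf  # identical images: no noise at all
    return 10 * math.log10(peak ** 2 / error)


# Toy 2x2 example: exactly one of the four pixels differs by 1.0.
original = [[0.0, 1.0], [1.0, 0.0]]
segmented = [[0.0, 1.0], [1.0, 1.0]]
toy_mse = mse(original, segmented)
toy_psnr = psnr(original, segmented)
```

Here one of four pixels is off by 1.0, so the MSE is 0.25 and the PSNR is 10 log10(4), roughly 6.02 dB.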
The structural similarity index measure (SSIM) is a multiplicative combination of luminance, contrast, and structure terms [13], as denoted in Eq. 8, followed by the expanded forms in Eqs. 9 to 11.
$$\mathrm{SSIM}(O, S) = [l(O, S)]^{\alpha} \cdot [c(O, S)]^{\beta} \cdot [St(O, S)]^{\gamma} \tag{8}$$
Where
$$l(O, S) = \frac{2 \mu_O \mu_S + C_1}{\mu_O^2 + \mu_S^2 + C_1} \tag{9}$$
$$c(O, S) = \frac{2 \sigma_O \sigma_S + C_2}{\sigma_O^2 + \sigma_S^2 + C_2} \tag{10}$$
$$St(O, S) = \frac{\sigma_{OS} + C_3}{\sigma_O \sigma_S + C_3} \tag{11}$$
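A simplified sketch of Eqs. 8 to 11 is given below. It makes several assumptions that the text does not state: the statistics are computed once over the whole image instead of over local windows, the exponents α, β, and γ are all set to 1, variances are population variances, and the constants follow the commonly used choices C1 = (0.01 L)², C2 = (0.03 L)², and C3 = C2 / 2, where L is the dynamic range. The function name is hypothetical.

```python
import math
from typing import Sequence


def ssim_global(o: Sequence[float], s: Sequence[float],
                dynamic_range: float = 1.0) -> float:
    """Single-window SSIM of two flattened images with alpha = beta = gamma = 1."""
    n = len(o)
    mu_o, mu_s = sum(o) / n, sum(s) / n
    var_o = sum((a - mu_o) ** 2 for a in o) / n
    var_s = sum((b - mu_s) ** 2 for b in s) / n
    cov = sum((a - mu_o) * (b - mu_s) for a, b in zip(o, s)) / n
    sd_o, sd_s = math.sqrt(var_o), math.sqrt(var_s)

    # Stabilising constants (assumed conventional values, see lead-in).
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    c3 = c2 / 2

    luminance = (2 * mu_o * mu_s + c1) / (mu_o ** 2 + mu_s ** 2 + c1)  # Eq. 9
    contrast = (2 * sd_o * sd_s + c2) / (var_o + var_s + c2)           # Eq. 10
    structure = (cov + c3) / (sd_o * sd_s + c3)                        # Eq. 11
    return luminance * contrast * structure                            # Eq. 8


flat = [0.0, 0.0, 1.0, 1.0]
inverted = [1.0, 1.0, 0.0, 0.0]
same = ssim_global(flat, flat)
opposite = ssim_global(flat, inverted)
```

An image compared with itself scores 1, while the inverted image keeps the same luminance and contrast but has a negative structure term, so its score is negative.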
4 Results and Discussion
Figure 2 shows the segmentation of lesion regions from lung-CT images using the PSD method. The method correctly segments the patchy ground-glass opacities with optimized margins that highlight the lesion region. The results of the PSD method clearly exhibit the disfigured and abnormal shape of the lungs. It is evident that the GGO pattern is the most common finding in COVID-19 infections. The GGO patterns are most commonly multifocal, bilateral, and peripheral. GGO presents as a multifocal lesion in 11 of the 21 experimental lung-CT images and involves all lobes. GGO presents as a bilateral lesion in 6 of the 21 images and is most commonly located in the inferior lobes of both the left and right lungs. In the remaining four images, GGO appears as a unilateral lesion and is most commonly found in the inferior lobe of the right lung.
Fig. 2 PSD image segmentation results
Table 1 Image quantification of lung-CT images
| S. no | Lesion region (%) | Normal region (%) | Left lung lesion region (%) | Left lung normal region (%) | Right lung lesion region (%) | Right lung normal region (%) |
|---|---|---|---|---|---|---|
| 1 | 12.805 | 87.195 | 2.48 | 97.52 | 23.13 | 76.87 |
| 2 | 17.77 | 82.23 | 21.36 | 78.64 | 14.18 | 85.82 |
| 3 | 62.695 | 37.305 | 60.18 | 40.1 | 65.21 | 34.51 |
| 4 | 28.095 | 71.905 | 33.07 | 66.93 | 23.12 | 76.88 |
| 5 | 33.26 | 66.74 | 27.11 | 72.89 | 39.41 | 60.59 |
| 6 | 12.205 | 87.795 | 11.19 | 88.81 | 13.22 | 86.78 |
| 7 | 9.63 | 90.37 | 7.48 | 92.52 | 11.78 | 88.22 |
| 8 | 16.195 | 83.805 | 14.46 | 85.54 | 17.93 | 82.07 |
| 9 | 15.05 | 84.95 | 13.84 | 86.16 | 16.26 | 83.74 |
| 10 | 47.11 | 52.89 | 48.37 | 51.63 | 45.85 | 54.15 |
| 11 | 40.16 | 59.84 | 39.53 | 60.47 | 40.79 | 59.21 |
| 12 | 65.505 | 34.495 | 63.41 | 36.59 | 67.6 | 32.4 |
| 13 | 37.19 | 62.81 | 41.86 | 58.14 | 32.52 | 67.48 |
| 14 | 19.795 | 80.205 | 21.15 | 78.85 | 18.44 | 81.56 |
| 15 | 15.29 | 84.71 | 16.28 | 83.72 | 14.3 | 85.7 |
| 16 | 14.58 | 85.42 | 14.85 | 85.15 | 14.31 | 85.69 |
| 17 | 28.325 | 71.675 | 31.24 | 68.76 | 25.41 | 74.59 |
| 18 | 20.56 | 79.44 | 27.32 | 72.68 | 13.8 | 86.2 |
| 19 | 15.37 | 84.63 | 14.83 | 85.17 | 15.91 | 84.09 |
| 20 | 32.23 | 67.77 | 29.34 | 70.66 | 35.12 | 64.88 |
| 21 | 45.65 | 54.35 | 43.91 | 56.09 | 47.39 | 52.61 |
Table 1 denotes the quantification of lesion and normal regions of the lungs.
The first two columns in the table denote quantification of the lesion and normal
regions of both the right and left lungs. The third and fourth columns illustrate the
quantification of lesion and normal region in left lung. Finally, the fifth and sixth
columns show the quantification of lesion and normal regions of the right lung.
Figure 3 illustrates the linear dependency between the quantified lesion region and the survivability of COVID-19 patients using Pearson’s correlation. The correlation value r is 0.83, and ρ is <0.05, denoting a strong and statistically significant linear relationship between the quantified lesion region of lung-CT images and the survivability of COVID-19 patients. The results indicate that the larger the quantified lesion region, the lower the survivability of the patient, and vice versa. Figure 4 depicts the linear relationship between the quantified normal region and the survivability of COVID-19 patients. Here, the r value is −0.83, which mirrors the lesion-region result because the lesion and normal percentages in Table 1 sum to 100; the larger the quantified normal region, the higher the survivability of the patient. Again, ρ < 0.05, i.e., the linear relationship between these two variables is statistically significant.
Fig. 3 Lesion region vs. survivability
Fig. 4 Normal region vs. survivability
The analysis of the linear relationship between the quantified lesion region and the age of COVID-19 patients is shown in Fig. 5. The correlation value r is 0.23, and ρ is 0.31 > 0.05, denoting little or no correlation between the patient’s age and the quantified lesion region. The result suggests that the patient’s age does not affect the lesion region, i.e., the lesion region grows across the lungs irrespective of the patient’s age, and the relationship between the variables is not statistically significant. Finally, Fig. 6 shows the correlation between the normal region and age. The results show little or no relationship between the attributes, with an r value of −0.23 and a ρ value of 0.31 > 0.05. Hence, the results indicate no significant linear relationship between the patient’s age and the normal lung region. From the results, it has been observed that the analysis of lung-CT images promises higher sensitivity but lower specificity and can play a role in the descriptive analysis of COVID-19 disease. Though the severity of COVID-19 disease can be estimated by the visual assessment of lung-CT images, the software assistance provided in this paper offers further support for the quantification of the lesion region. Based on the involvement of the lobes and GGO patterns, the severity of the disease can be computed.
Fig. 5 Lesion region vs. age
Fig. 6 Normal region vs. age
Table 2 reports the performance of the PSD image segmentation algorithm in terms of PSNR, SSIM, MSE, and running time. The results obtained from the PSD method are compared with those of the K-means image thresholding and adaptive image thresholding methods to assess the quality of the segmented regions. For almost all images, the PSD method obtains higher PSNR and SSIM values and a lower MSE value in less time. These results indicate that the segments created in lung-CT images of COVID-19 patients using the PSD method are valid and of better quality than those produced by the other two experimental methods.

Table 2 Performance analysis of PSD image segmentation method

S. no  Metric    PSD       K-means   Fuzzy
1      PSNR      17.16858  14.63592  9.37694
       SSIM      0.993961  0.992213  0.98507
       MSE       0.127365  0.137552  0.461706
       Time(ms)  0.146008  0.694038  0.131007
2      PSNR      21.77537  13.10178  15.19089
       SSIM      0.998667  0.981128  0.987263
       MSE       0.0598    0.195831  0.121051
       Time(ms)  0.154008  0.642037  0.395021
3      PSNR      23.16629  12.1858   7.698778
       SSIM      0.999119  0.893495  0.952042
       MSE       0.043412  0.241813  0.679488
       Time(ms)  0.169009  0.573032  0.117006
4      PSNR      16.75699  13.99115  8.672593
       SSIM      0.993321  0.936215  0.963403
       MSE       0.139908  0.159567  0.543001
       Time(ms)  0.35302   0.555031  0.234013
5      PSNR      16.75818  13.78368  8.242791
       SSIM      0.993495  1         0.981396
       MSE       0.189856  0.167375  0.599489
       Time(ms)  0.086004  0.20601   0.192011
6      PSNR      17.1571   15.26202  9.289661
       SSIM      0.993971  0.985491  0.943923
       MSE       0.173193  0.119085  0.471079
       Time(ms)  0.134008  0.556032  0.141008
7      PSNR      22.42507  21.07038  9.770341
       SSIM      0.998859  1         0.978357
       MSE       0.051491  0.031262  0.421721
       Time(ms)  0.101006  0.425024  0.105006
8      PSNR      27.02776  15.24397  12.65338
       SSIM      0.999672  0.970221  0.992091
       MSE       0.017842  0.119581  0.21713
       Time(ms)  0.006001  0.099006  0.18601
9      PSNR      28.09448  19.18688  13.02038
       SSIM      0.999749  0.98294   0.992957
       MSE       0.013957  0.048236  0.199536
       Time(ms)  0.044002  0.090005  0.099006
10     PSNR      17.51284  11.62889  7.462082
       SSIM      0.995754  0.783696  0.962826
       MSE       0.159572  0.274898  0.717549
       Time(ms)  0.092005  0.206012  0.159009
11     PSNR      24.87536  17.06113  9.047227
       SSIM      0.999432  0.972831  0.975545
       MSE       0.029289  0.074474  0.498124
       Time(ms)  0.043003  0.127201  0.104006
12     PSNR      19.4277   11.72732  8.223032
       SSIM      0.997376  1         0.987717
       MSE       0.102676  0.268737  0.602222
       Time(ms)  0.044002  0.0624    0.082005
13     PSNR      23.7363   16.12227  6.658499
       SSIM      0.999245  0.983357  0.940081
       MSE       0.038072  0.197686  0.863396
       Time(ms)  0.123007  0.470001  0.206012
14     PSNR      19.8939   22.11123  9.386661
       SSIM      0.997598  1         0.976044
       MSE       0.092226  0.0246    0.460674
       Time(ms)  0.131008  0.468402  0.122007
15     PSNR      22.55313  13.00766  8.14966
       SSIM      0.999693  0.896225  0.984885
       MSE       0.05057   0.200122  0.612483
       Time(ms)  0.264015  0.1248    0.081004
16     PSNR      20.19215  12.27778  7.905123
       SSIM      0.998391  0.889713  0.985079
       MSE       0.072288  0.236746  0.647959
       Time(ms)  0.077004  0.136008  0.076004
17     PSNR      18.49671  12.73913  8.019295
       SSIM      0.989972  1         0.982554
       MSE       0.153847  0.212886  0.631147
       Time(ms)  0.127007  0.437025  0.109006
18     PSNR      21.65835  22.15041  14.29212
       SSIM      0.998636  0.999531  0.996478
       MSE       0.061434  0.024379  0.148884
       Time(ms)  0.16001   0.593034  0.139008
19     PSNR      21.4594   9.271233  6.420316
       SSIM      0.998314  0.75446   0.984929
       MSE       0.056597  0.473082  0.91207
       Time(ms)  0.17401   0.605035  0.125007
20     PSNR      23.20182  13.01902  7.05006
       SSIM      0.979198  0.99078   0.979232
       MSE       0.030587  0.199599  0.788958
       Time(ms)  0.103006  0.418024  0.088005
21     PSNR      24.9786   13.35937  7.317229
       SSIM      0.998483  0.963549  0.983924
       MSE       0.046011  0.184554  0.741886
       Time(ms)  0.015007  0.309018  0.137008
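The MSE and PSNR metrics used in this comparison can be illustrated with a minimal sketch based on their standard definitions. The function names, the 2x2 sample images, and the peak intensity of 255 are assumptions made for illustration; the chapter does not state how intensities were scaled, so the sketch is not expected to reproduce the values in Table 2. SSIM is omitted for brevity.

```python
import math

def mse(reference, segmented):
    """Mean squared error between two equally sized grayscale images (lists of rows)."""
    total, count = 0.0, 0
    for ref_row, seg_row in zip(reference, segmented):
        for ref_pixel, seg_pixel in zip(ref_row, seg_row):
            total += (ref_pixel - seg_pixel) ** 2
            count += 1
    return total / count

def psnr(reference, segmented, max_value=255.0):
    """Peak signal-to-noise ratio in dB; max_value is the assumed peak intensity."""
    error = mse(reference, segmented)
    if error == 0:
        return math.inf  # identical images have no noise
    return 10.0 * math.log10(max_value ** 2 / error)

# Hypothetical 2x2 images, for illustration only.
reference = [[52, 55], [61, 59]]
segmented = [[54, 55], [60, 59]]
print(mse(reference, segmented), psnr(reference, segmented))
```

A lower MSE and a higher PSNR both indicate that the segmented image is closer to the reference, which is the sense in which the metrics are compared above.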
5 Conclusion
In this work, we have used PSD, a novel thresholding-based image segmentation
method, to segment the lung-CT images of COVID-19 patients for the quantification
of lesion regions. The segments created by the PSD method are of good quality and
delineate the lesion regions in the lung-CT images more accurately than the other
experimental methods. The quantitative analysis of lesion regions can quantify
not only the whole volume of the regions infected by COVID-19 but
also the proportion of GGO in the right and left lungs. The method highlights
the abnormality and shape of the lungs due to the disease. The results
suggest that a patient admitted with less than 20% infection may survive.
The complexity of the disease increases as the infected regions grow.
The quantified segment regions also correlate well with the survivability of the
patients. In this work, we have taken 21 images for the quantification analysis. The
efficiency of the method can be further validated with a larger number of images. A predictive
model may also be implemented for predicting the survivability of a patient based
on the infected regions.
References
1. Wang, W., Tang, J., & Wei, F. (2020). Updated understanding of the outbreak of 2019 novel
coronavirus (2019-nCoV) in Wuhan, China. Journal of Medical Virology, 92(4), 441–447.
2. Zu, Z. Y., Di Jiang, M., Peng Peng, X., Chen, W., Ni, Q. Q., Guang Ming, L., & Zhang, L.
J. (2020). Coronavirus disease 2019 (COVID-19): A perspective from China. Radiology, 296,
200490.
3. Chen, Z. M., Fu, J. F., Shu, Q., Chen, Y. H., Hua, C. Z., Li, F. B., Lin, R., Tang, L. F., Wang, T.
L., Wang, W., & Wang, Y. S. (2020). Diagnosis and treatment recommendations for pediatric
respiratory infection caused by the 2019 novel coronavirus. World Journal of Pediatrics, 16,
1–7.
4. Ozturk, T., Talo, M., Yildirim, E. A., Baloglu, U. B., Yildirim, O., & Acharya, U. R. (2020).
Automated detection of COVID-19 cases using deep neural networks with X-ray images.
Computers in Biology and Medicine, 121, 103792.
5. Pathak, Y., Shukla, P. K., Tiwari, A., Stalin, S., Singh, S., & Shukla, P. K. (2020). Deep transfer
learning based classification model for COVID-19 disease. IRBM, 43, 87.
6. Toğaçar, M., Ergen, B., & Cömert, Z. (2020). COVID-19 detection using deep learning models
to exploit Social Mimic Optimization and structured chest X-ray images using fuzzy color and
stacking approaches. Computers in Biology and Medicine, 121, 103805.
7. Rahimzadeh, M., & Attar, A. (2020). A modified deep convolutional neural network for
detecting COVID-19 and pneumonia from chest X-ray images based on the concatenation of
Xception and ResNet50V2. Informatics in Medicine Unlocked, 19, 100360.
8. Zhang, K., Liu, X., Shen, J., Li, Z., Sang, Y., Wu, X., Zha, Y., Liang, W., Wang, C., Wang,
K., & Ye, L. (2020). Clinically applicable AI system for accurate diagnosis, quantitative
measurements, and prognosis of covid-19 pneumonia using computed tomography. Cell, 181,
1423.
9. Wu, X., Hui, H., Niu, M., Li, L., Wang, L., He, B., Yang, X., Li, L., Li, H., Tian, J., & Zha,
Y. (2020). Deep learning-based multi-view fusion model for screening 2019 novel coronavirus
pneumonia: A multicentre study. European Journal of Radiology, 128, 109041.
10. Cheng, Z., Qin, L., Cao, Q., Dai, J., Pan, A., Yang, W., Gao, Y., Chen, L., & Yan, F. (2020).
Quantitative computed tomography of the coronavirus disease 2019 (COVID-19) pneumonia.
Radiology of Infectious Diseases, 7, 55–61.
11. Deshmukh, A. B., & Rani, N. U. (2019). Fractional-Grey Wolf optimizer-based kernel
weighted regression model for multi-view face video super resolution. International Journal
of Machine Learning and Cybernetics, 10(5), 859–877.
12. Sara, U., Akter, M., & Uddin, M. S. (2019). Image quality assessment through FSIM, SSIM,
MSE and PSNR—A comparative study. Journal of Computer and Communications, 7(3), 8–
18.
13. Ho, Y. H., Cho, C. Y., Peng, W. H., & Jin, G. L. (2019). Sme-net: Sparse motion estimation
for parametric video prediction through reinforcement learning. In Proceedings of the IEEE
international conference on computer vision (pp. 10462–10470).
Attention-Based Deep Learning
Approach for Semantic Analysis of Chest
X-Ray Images Modality
Rishabh Dhenkawat, Snehal Saini, Shobhit Kumar, and Nagendra
Pratap Singh
1 Introduction
Computer-vision-based diagnosis provides automatic classification and suggestions for reference, improving the accuracy and efficiency of diagnosis. Over the past
few years, many deep learning and machine learning algorithms have been used for the
classification of medical images, including SVM, K-nearest neighbors, and random forest.
They can be used in a variety of medical image
processing applications.
Traditional machine learning methods pose two major difficulties. First, they
give inaccurate results because of their limited ability to process large inputs. Second, they rely
on manual feature extraction instead of learning valid features. Thus, deep learning
methods are preferred for medical image processing.
Deep learning technology has a wide variety of applications in healthcare image
processing, such as diagnosis and organ segmentation. The convolutional neural
network (CNN) has been used extensively in research that includes
reading and interpreting CT images for medical applications. Deep learning is
a representation learning technique that connects different layers and nonlinear
components efficiently to obtain various representation levels.
CNNs have two essential characteristics: local connectivity
and shared weights. Deep learning is widely used in image analysis because
of these features, which make it much easier to handle complex data processing
R. Dhenkawat · S. Saini (✉) · N. P. Singh
NIT Hamirpur, Hamirpur, India
e-mail: 185519@nith.ac.in; 5519@nith.ac.in; nps@nith.ac.in
S. Kumar
Institute for Geodesy and Geoinformation, University of Bonn, Bonn, Germany
e-mail: skuma@uni-bonn.de
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
S. Pandey et al. (eds.), Role of Data-Intensive Distributed Computing Systems in
Designing Data Solutions, EAI/Springer Innovations in Communication and
Computing, https://doi.org/10.1007/978-3-031-15542-0_13
tasks. Convolution layers, pooling layers, and fully connected layers are the three
types of layers that make up the CNN architecture. Convolution layers extract features
from the previous layer, pooling layers reduce computational complexity, and
fully connected layers finally combine the extracted features to produce the
output. A recurrent neural network (RNN) is used to process sequence
data. Since words in a sentence are semantically related, word
generation uses knowledge of the previous words to predict the next word in the sentence.
In an RNN, the current output of a sequence is related to the previous output, enabling
word relationships to be determined. Because of its feedback connections, an RNN is used
to model temporal sequences and their long-range dependencies.
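The roles of the convolution and pooling layers described above can be sketched in a few lines of plain Python. This is an illustrative toy, not the chapter's implementation: the function names, the 5x5 image, and the 2x2 edge kernel are all assumptions. The convolution responds where the image changes from dark to bright, and the pooling step halves each spatial dimension, which is how it reduces the amount of computation in later layers.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    return [
        [
            sum(
                image[i + a][j + b] * kernel[a][b]
                for a in range(kernel_h)
                for b in range(kernel_w)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 block."""
    return [
        [
            max(
                feature_map[i][j], feature_map[i][j + 1],
                feature_map[i + 1][j], feature_map[i + 1][j + 1],
            )
            for j in range(0, len(feature_map[0]) - 1, 2)
        ]
        for i in range(0, len(feature_map) - 1, 2)
    ]

# Hypothetical 5x5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
edge_kernel = [[-1, 1], [-1, 1]]

features = conv2d_valid(image, edge_kernel)  # 4x4 feature map
pooled = max_pool_2x2(features)              # 2x2 map after pooling
print(features)
print(pooled)
```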
In this chapter, we propose a CNN–LSTM model for the semantic analysis of chest X-ray images
that uses an attention process to produce a description of the images.
In the deep learning model, we use attention to highlight the
infected regions in the lungs. Two types of attention mechanisms in deep learning
are local attention and global attention. In our model, we used local attention,
also known as additive attention or Bahdanau attention. As a result, the model assists
in the analysis and clarification of chest X-ray images, automatically supplying
doctors with valuable knowledge about the input X-ray image. Two types of chest
X-ray images are available: frontal and lateral views. Using these two types of images
as data, our model generates a report for the chest X-ray images. To construct the
deep learning model, we present a predictive model that uses both image and text
processing. This chapter uses chest X-ray images from Indiana University’s large
chest X-ray dataset to describe the model’s architecture and detection efficiency.
2 Literature Review
2.1 Image Captioning
Various studies in computer vision initially focused on generating descriptions for visual data from videos [1, 2]. However, these models were relatively
brittle, complex, and hand-designed, and were limited to applications in very few
domains. The problem of image captioning gained popularity after the advancement
in object recognition systems. Farhadi et al. [3] converted a triplet of scene elements
into text descriptions using templates. Li et al. [4] detected objects and their
relationships and put together their descriptions to form phrases. Kulkarni
et al. [5] used a template-based model to generate text from complex detection
graphs beyond triplets.
Aker et al. [6] used dependency relations in documents from the web that had
information about the location of an image to summarize that image automatically
and generate its caption.
Elliott et al. [7] proposed visual dependency representations to find out the
relationships within the objects of an image to improve image captioning.
Kuznetsova et al. [8] used a tree-based design for image caption generalization
and generation. They used web images and their captions to generate natural
language image descriptions that are more expressive. These approaches were,
however, quite rigid in terms of text generation.
Oriol et al. [9] combined the techniques of computer vision and machine
translation to propose a model that produced captions or sentences that would
describe a particular image. The model uses a supervised learning approach for
training purposes and learns individually for each image by encoding the image first
using a convolutional neural network and then generating a description of the image
in natural language using a recurrent neural network.
Karpathy and Fei-Fei [10] proposed a model to generate captions for images
and their regions using the image and description pairs. The model uses multimodal
embedding to align the modalities for a convolutional neural network on images and
a bidirectional recurrent neural network on sentences.
Anderson et al. [11] proposed an image captioning model that is trained using
partially specified sequence data from labeled images and object detection datasets.
Their algorithm utilizes finite state automata to describe partial sequences for
training the recurrent neural networks. This method solved the problem of training
the image captioning models on image–sentence pairs that was a requirement for
previously proposed models.
Chen et al. [12] proposed an adversarial training model for exploring cross-domain image captioning on unpaired image–sentence data. Despite achieving
improved accuracy, the novel algorithm still requires image–sentence paired data
for training purposes.
Gu et al. [13] used a technique called language pivoting: instead of using the
traditional image–language pairs, they first captured the attributes of an image with
a captioning module in a pivot language and then translated the result into the
target language. However, this method was still dependent on a pivot-target parallel language
translation corpus.
Feng et al. [14] proposed an unsupervised technique to generate captions for
images by utilizing a visual concept detector, a set of images, and a sentence corpus
without requiring any paired image–sentence data during the training phase.
2.2 Attention Mechanism
For a long time, image captioning models could not accurately extract all the features
present in the image. This problem was solved by a new technique called the
attention mechanism, proposed by Bahdanau et al. [15] in 2015. They proposed
a model for neural machine translation that automatically “soft-searched” for
relevant parts in a sentence to predict a target word without taking each word into
consideration explicitly. This method of soft searching and focusing only on the
specific parts of the input source was then named the attention mechanism or Bahdanau
attention. Since then, it has been widely used in computer vision
and image captioning [22].
You et al. [16] combined the top-down and bottom-up approaches for image
captioning by employing a semantic attention module that selectively focuses on
semantic concepts and fuses them into the outputs of the RNN, thus forming a feedback
loop that connects the two approaches. This approach
achieved outstanding results compared to the pre-existing methods.
The attention mechanism for image captioning in most cases forces specific
words to correspond to a particular region. However, words such as articles and
conjunctions cannot correspond to an image region. Deng et al. [17] solved
this problem by using adaptive attention to decide whether to use
a specific image feature to generate the corresponding word of the
description. They introduced DenseNet to extract the features of the image and
simultaneously employed a sentinel gate for adaptive attention. The decoding phase
uses an LSTM for word generation. Using this technique, they were able
to improve the BLEU and METEOR scores. Yan et al. [18] further
addressed this problem by introducing diversity regularization for adaptive attention
so that it does not focus only on visual features while generating image captions
but instead generates more expressive words.
2.3 Medical Report Generation
Writing reports from medical imaging can be a time-consuming and tedious
task for physicians. For this purpose, many studies in computer vision and the
natural language processing domain have tried to formulate artificial methods to
automatically generate such medical reports.
However, this poses various problems for the system such as identification
of abnormal regions in the medical scans, generation of multiple long sentences
for describing the medical image accurately, and including all sorts of multiple
heterogeneous pieces of information such as findings and tags.
To solve these problems, Jing et al. [19] introduced a multi-task learning
framework to jointly predict tags and generate paragraphs for medical images fed
as inputs. They employed a co-attention mechanism to focus on abnormal regions
in the scanned images and developed a hierarchical LSTM network model for
producing long descriptive findings of the images in the form of medical reports.
Xue et al. [20] proposed a CNN–LSTM model to generate radiology reports
from chest X-rays involving high-level conclusive findings and detailed descriptive
findings. They also used an attention input to maintain the coherence between the
generated sentences. The model was evaluated on Indiana University’s chest X-ray
dataset.
Yuan et al. [21] used a generative encoder–decoder model and pre-trained it using
chest X-ray images to discover 14 radiographic observations. They employed a late
fusion attention mechanism on a sentence level to extract visual features. Further,
they enhanced the model by fine-tuning the encoder to bring out the most common
medical concepts present in the X-ray images and fused these concepts at every step
of the decoding using the word-level attention mechanism. This allowed the model
to be more expressive and generate much more descriptive semantics for the image
under consideration, which increases the accuracy of the resulting radiology
report.
3 Methodology
3.1 Overview
We have used an encoder–decoder architecture with an attention mechanism
(Fig. 1) and compared it with an encoder–decoder architecture without attention.
Here, a convolution neural network is taken as the encoder to extract visual features;
this encoder outputs image feature vectors. The resulting feature vectors are
taken as input to an additive attention-based LSTM decoder. The LSTM decoder takes the
image feature vector and the sequence vector to produce reports. An image classification
model based on InceptionV3 is first trained on the chest dataset; the weights of this
classification model are saved after training and later loaded into InceptionV3 for
feature extraction in the encoder.
3.2 CNN Encoder
The convolution neural network is popular in deep learning due to its ability to learn
and represent image feature vectors. Many frameworks such as VGG16, ResNet,
Inception, and DenseNet are trained on the ImageNet dataset containing 1.3 million
natural images. Due to the difference between medical chest images and natural
images, InceptionV3 is trained again on labeled chest X-ray images to improve
transfer efficiency. The encoder is a single fully connected linear layer. The
input X-ray images are fed to InceptionV3, which extracts the features of the two images;
these features are then combined. They are then input to the FC layer, and an output vector is obtained.
The encoder’s last hidden state is connected to the decoder.
Fig. 1 Model flow
To deal with datasets
that are too big to fit into memory, the input pipeline does not shuffle the entire dataset; instead, it keeps
a buffer of buffer-size elements and picks the next element at random from that
buffer (replacing it with the next input element, if one is available). The buffer
size is taken as 1000. To turn each word into a fixed-length vector of
a predetermined size, an embedding layer of size 256 is used. The resulting vector
is dense, with real values rather than just 0s and 1s. The constant length of the word
vectors allows us to better represent words while reducing their dimensionality. We
use the rectified linear (ReLU) activation function, which returns its input
if the input is positive and zero otherwise. A concatenation layer feeds
a dense layer, whose output
is passed through the ReLU activation function. The final output of the activation
function is passed to the decoder.
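The buffered shuffling described above can be sketched in plain Python. This is a simplified illustration of the idea, not the input pipeline used in the chapter: the function name `buffered_shuffle` and its parameters are assumptions, and a real pipeline would typically rely on the shuffling facility of its deep learning framework.

```python
import random

def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in approximately random order using a fixed-size buffer."""
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    rng = random.Random(seed)
    iterator = iter(items)

    # Fill the buffer with the first buffer_size elements.
    buffer = []
    for item in iterator:
        buffer.append(item)
        if len(buffer) == buffer_size:
            break

    # Emit a random buffered element and replace it with the next input element.
    for item in iterator:
        index = rng.randrange(len(buffer))
        yield buffer[index]
        buffer[index] = item

    # The input is exhausted: emit what is left in random order.
    rng.shuffle(buffer)
    yield from buffer

shuffled = list(buffered_shuffle(range(10), buffer_size=4, seed=0))
print(shuffled)
```

Only `buffer_size` elements are held in memory at any time. A larger buffer mixes the data more thoroughly, and a buffer of size 1 leaves the order unchanged.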
3.3 LSTM Decoder
Recurrent neural networks model the non-static behavior of sequences through connections between different units. LSTM is a type of RNN with three added gates:
the forget gate, the input gate, and the output gate. An LSTM layer is therefore used in
the decoder, which performs language modeling at the word level. The first step receives
the encoded output from the encoder and the <start> vector. The input is passed to
the LSTM layer with additive attention. The output consists of two vectors: one is
the predicted label and the other is the previous hidden state of the decoder; this
feedback goes to the decoder again at each time step. Here, bidirectional LSTM (bi-LSTM)
layers are used with the ReLU activation function. One embedding layer is used with
the bi-LSTM layer. The output of the two concatenation layers is passed to additive
attention, and the final output is produced by one flatten layer and two dense layers.
Figure 2 shows the LSTM model with attention.
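To make the role of the three gates concrete, the following sketch computes one time step of a standard LSTM cell for a single unit with scalar input and state. It follows the textbook LSTM equations rather than the chapter's trained model; the function names and the `params` layout are assumptions chosen for readability.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step for a single unit.

    params maps "forget", "input", "output", and "candidate" to a tuple of
    (input weight, recurrent weight, bias).
    """
    def pre_activation(name):
        w, u, b = params[name]
        return w * x + u * h_prev + b

    forget_gate = sigmoid(pre_activation("forget"))   # how much old memory to keep
    input_gate = sigmoid(pre_activation("input"))     # how much new content to write
    output_gate = sigmoid(pre_activation("output"))   # how much memory to expose
    candidate = math.tanh(pre_activation("candidate"))

    c = forget_gate * c_prev + input_gate * candidate  # new cell state
    h = output_gate * math.tanh(c)                     # new hidden state
    return h, c

# Hypothetical parameters: all weights and biases zero, so every gate equals 0.5.
zero_params = {name: (0.0, 0.0, 0.0) for name in ("forget", "input", "output", "candidate")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, params=zero_params)
print(h, c)
```

With all parameters at zero, each gate is half open, so the cell state is halved and the hidden state is half of its squashed value.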
3.4 Attention Mechanism
As in the decoder described in Sect. 3.3, the first decoding step receives the
encoded output from the encoder and the <start> vector, and the input is passed to
the LSTM layer with additive attention. The output consists of two vectors: one is
the predicted label and the other is the previous hidden state of the decoder; this
feedback goes to the decoder again at each time step. Bidirectional LSTM (bi-LSTM)
layers are used with the ReLU activation function, together with one embedding layer.
The output of the two concatenation layers is passed to additive
attention, and the final output is produced by one flatten layer and two dense layers.
Fig. 2 LSTM model with Attention [22] (layers from bottom to top: input layer x1 ... xT, embedding layer e1 ... eT, LSTM layer h1 ... hT, attention layer, output layer y)
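The following is a minimal sketch of additive (Bahdanau-style) attention in plain Python, assuming the usual formulation in which each feature vector is scored against the decoder state as v · tanh(W_feat f + W_state s) and the scores are normalized with a softmax. The function names, the weight matrices `w_feat`, `w_state`, and `v`, and the sample values are hypothetical; they are not parameters of the chapter's model.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def additive_attention(features, state, w_feat, w_state, v):
    """Return (context vector, attention weights) for one decoding step."""
    projected_state = matvec(w_state, state)
    scores = []
    for feature in features:
        projected = matvec(w_feat, feature)
        hidden = [math.tanh(p + q) for p, q in zip(projected, projected_state)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    weights = softmax(scores)
    # The context vector is the attention-weighted sum of the feature vectors.
    context = [
        sum(w * feature[d] for w, feature in zip(weights, features))
        for d in range(len(features[0]))
    ]
    return context, weights

# Hypothetical example: two 2-dimensional features and a 1-dimensional state.
features = [[1.0, 0.0], [0.0, 1.0]]
state = [0.0]
context, weights = additive_attention(
    features, state, w_feat=[[1.0, 0.0]], w_state=[[0.0]], v=[1.0]
)
print(context, weights)
```

The weights sum to one, so the context vector is a convex combination of the features; features with higher scores contribute more to it.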
3.5 Model Architecture
The model proposed in this chapter (Fig. 2) contains five components. In the input
layer, input labels are given to the model and then summed with the image feature
vectors. The embedding layer is then used to map each label to a low-dimensional
vector. The LSTM layer is used to obtain high-level features from the embeddings; these LSTM
layers are repeated twice to capture the features in more depth. The attention layer
produces a weight vector and merges the word-level features from each
time step into a sentence-level feature vector by multiplying them by the weight vector.
Finally, the sentence-level feature vector is used in the output layer for
relation classification.
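The merging step performed by the attention layer can be written compactly. The chapter does not give its equations, so the following is one common formulation of attention pooling over LSTM outputs and should be read as an assumption: H = [h_1, ..., h_T] collects the LSTM outputs for the T time steps, w is a trainable weight vector, alpha is the resulting vector of attention weights, and r is the sentence-level feature vector.

```latex
\[
\begin{aligned}
M &= \tanh(H), \\
\alpha &= \operatorname{softmax}\left(w^{\top} M\right), \\
r &= H \alpha^{\top}
\end{aligned}
\]
```

Because the entries of alpha are non-negative and sum to one, r is a weighted average of the word-level features h_1, ..., h_T.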
4 Experiments
4.1 Dataset
We have used Indiana University’s vast chest X-rays dataset provided by the Open-i
service of the National Library of Medicine. The dataset contains 7000 chest X-rays
from various hospitals along with 3900 associated radiology reports. Each report is
associated with two different views of the chest, i.e., frontal and lateral views. The
associated tags contain the basic findings from the X-ray images that are used to
train the model so as to generate image captions later on.
4.2 Exploratory Data Analysis
Before moving to the main code, we analyzed the dataset to visualize some of its
important characteristics. For example, by performing text analysis, we obtained a bar
plot of the unique sentences that occur most often in the X-ray reports and their
frequencies (see Fig. 3). By generating a word cloud, we can see the most
frequently occurring words in the sentences of the X-ray reports (Fig. 4). Some of these
words are chest pain, shortness, breath, male, female, dyspnea, and indication. The
word cloud in Fig. 4 represents the words with the highest word
count in the impression column, which is the target variable. Further, Fig. 5 shows the word count
distribution for the impression column. This plot makes the minimum and maximum
word counts easier to see. From the plot, we
conclude that the minimum word count is 1, the maximum word count is 122, and
the median word count is 5.0.
Further, we analyze the distribution of image count per patient using a bar plot (Fig. 6)
and see that the minimum image count is 1 and the maximum image count is 5.
Since two types of chest X-ray images are available to us, frontal
and lateral views, we select a sample data point and find the total number of
images present for that particular patient, together with its findings and impressions. From this,
we observe that multiple images can be associated with a patient.
Fig. 3 Bar plot of unique words for indication
Fig. 4 Word cloud for impression column
Fig. 5 Word count distribution plot (panel title: word_count distribution; x-axis: impression_count)
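The word count statistics reported above can be obtained with a short computation of the following kind. This is a sketch using only the standard library; the function name `word_count_summary` and the three sample impressions are illustrative assumptions, so the output does not correspond to the dataset statistics.

```python
from statistics import median

def word_count_summary(texts):
    """Return the (minimum, maximum, median) number of words per text."""
    counts = [len(text.split()) for text in texts]
    return min(counts), max(counts), median(counts)

# Hypothetical impression texts, for illustration only.
impressions = [
    "no acute cardiopulmonary abnormality",
    "normal",
    "no evidence of acute cardiopulmonary disease",
]
summary = word_count_summary(impressions)
print(summary)
```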
4.3 Pre-processing and Training
The transfer learning method is used to convert images to feature vectors, and
text data tokenization is used for dataset preparation. An InceptionV3 model trained on
ImageNet is used. A classifier was built to detect which type of disease the person is suffering
from. Once classification was done, the weights of the trained model were
saved in HDF5 format.
Fig. 6 Image count per patient distribution (x-axis: image count per patient, 1 to 5; y-axis: 0 to 3000)
4.4 Model without Attention Mechanism
4.4.1 Encoder Architecture
A single fully connected layer with linear output is used. Before being passed to the FC layer,
the two image tensors are added. This layer outputs a tensor of
shape (batch size, embedding dimension).
4.4.2 Decoder Architecture
The decoder contains an embedding layer, an LSTM layer, and a dense layer that outputs a tensor of shape (batch
size, vocab size). Here, a bi-LSTM layer is used with two dense layers and one flatten layer
(Fig. 7).
4.4.3 Model Training
Teacher forcing is used in the training phase. It is a method for training recurrent neural
networks that use the output from a previous step as an input. In training, a “start”
token is used to start the process, and the word at each position of the output sequence is
used as input in the subsequent time step, along with other inputs such as an image
or a source text. With teacher forcing, the word fed in at each step is taken from the
ground-truth report rather than from the model’s own prediction. This recursive use of the
output as the next input continues until the end of the sequence, and training is repeated
until better results are obtained.
Fig. 7 Model architecture without attention (layer: output shape): input_10 InputLayer: (?, 2, 1, 2048); tf_op_layer_strided_slice_6 and tf_op_layer_strided_slice_7: (?, 1, 2048) each; concatenate_12: (?, 1, 4096); dense_12: (?, 1, 256); input_11 InputLayer: (?, 1); embedding_7: (?, 1, 256); concatenate_13: (?, 1, 512); lstm_6: (?, 512); dense_13: (?, 1339)
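The way teacher forcing pairs decoder inputs with targets can be sketched as follows. The helper name `teacher_forcing_pairs` is hypothetical, and the "<end>" token is an assumption, since the text above only mentions a start token. At each time step the decoder receives the ground-truth previous word and is trained to predict the next one.

```python
def teacher_forcing_pairs(caption_tokens, start="<start>", end="<end>"):
    """Return (decoder input, target) pairs for one caption."""
    sequence = [start] + list(caption_tokens) + [end]
    decoder_inputs = sequence[:-1]  # ground-truth word fed in at each step
    targets = sequence[1:]          # word the decoder should predict at that step
    return list(zip(decoder_inputs, targets))

pairs = teacher_forcing_pairs("no acute cardiopulmonary abnormality".split())
for decoder_input, target in pairs:
    print(f"{decoder_input!r} -> {target!r}")
```

Because the inputs come from the reference caption, an early wrong prediction does not derail the rest of the sequence during training.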
4.4.4 Model with an Attention Mechanism
The encoder is the same as in the previous model architecture: the summed image
vector is passed through a single fully connected layer. In the decoder, an LSTM with attention
is used. Here, additive attention (local or Bahdanau attention) is used.
4.4.5 Model Evaluation
A beam-search-based method is used to find the resulting sentence.
Beam search is used here because it chooses the most probable next steps as the
sequence is built. Beam search is a heuristic search algorithm that explores
a graph by expanding the most promising nodes in a limited set. At each level,
it keeps only the W best nodes and expands the search downward exclusively from them,
proceeding level by level. Beam search constructs its search tree using breadth-first
search: it generates all the successors of the states at the current level of the tree
but retains only W of them; the other nodes are not
taken into account. In other words, it considers all possible next steps and keeps the k most likely ones. Here, k is
user-specified and controls the number of candidate sequences kept while generating
reports for X-ray images and other medical scans. To evaluate the model on various
chest X-ray inputs, it is first checked on a sample data point, and the results
shown in Fig. 8 are observed. The model accuracy for pre-processing is also plotted
against every epoch of training, as shown in Fig. 9. Figures 10
and 11 show the loss and accuracy curves obtained for the model without
attention. As can be seen from Figs. 12 and 13, the accuracy increases considerably
after the attention mechanism is applied. The loss also
decreases after the attention mechanism is employed, which shows that the attention
mechanism is more effective for image captioning tasks and can be used
for medical searches.
Fig. 8 Results for sample data point
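A compact sketch of beam search over a toy next-word model illustrates the procedure. Everything in it is an assumption made for illustration: the vocabulary, the probabilities in `TOY_MODEL`, and the function names are hypothetical and unrelated to the trained decoder. A beam width of 1 reduces to greedy (argmax) decoding, while a wider beam can recover a more probable sentence that greedy decoding misses.

```python
import math

# Hypothetical toy model: next-word probabilities given the last word.
TOY_MODEL = {
    "<start>": {"no": 0.6, "normal": 0.4},
    "no": {"acute": 0.55, "evidence": 0.45},
    "normal": {"<end>": 1.0},
    "acute": {"<end>": 1.0},
    "evidence": {"<end>": 1.0},
}

def next_word_probs(tokens):
    return TOY_MODEL[tokens[-1]]

def beam_search(next_probs, beam_width, max_len=5, start="<start>", end="<end>"):
    """Return the best (tokens, log-probability) pair found with the given beam width."""
    beams = [([start], 0.0)]
    finished = []
    for _ in range(max_len):
        # Expand every beam by every possible next word.
        candidates = [
            (tokens + [word], score + math.log(prob))
            for tokens, score in beams
            for word, prob in next_probs(tokens).items()
            if prob > 0.0
        ]
        # Keep only the beam_width most probable candidates.
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_width]:
            (finished if tokens[-1] == end else beams).append((tokens, score))
        if not beams:
            break
    return max(finished + beams, key=lambda item: item[1], default=([start], 0.0))

greedy_tokens, greedy_score = beam_search(next_word_probs, beam_width=1)
beam_tokens, beam_score = beam_search(next_word_probs, beam_width=2)
print(greedy_tokens, math.exp(greedy_score))
print(beam_tokens, math.exp(beam_score))
```

In this toy model the greedy path commits to the most probable first word and ends with a sentence of lower overall probability than the one found with a beam width of 2.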
The argmax function is widely used in mathematics and machine learning.
In applied machine learning, its most common use is to determine the index
of the array element with the largest value, for example in a vector of predicted
probabilities. The probabilities show how likely a sample
is to belong to each class. The predicted probabilities are
ordered by class, with the predicted probability at index 0 belonging to the first
class, the predicted probability at index 1 to the second, and so on.
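A minimal argmax over a vector of predicted probabilities looks as follows. The vocabulary and probability values are hypothetical and serve only to show how the index of the largest probability is mapped back to a word.

```python
def argmax(values):
    """Return the index of the largest value; ties resolve to the first occurrence."""
    best_index = 0
    for index, value in enumerate(values):
        if value > values[best_index]:
            best_index = index
    return best_index

# Hypothetical vocabulary and predicted probabilities for one decoding step.
vocabulary = ["no", "acute", "cardiopulmonary", "abnormality"]
probabilities = [0.1, 0.2, 0.6, 0.1]

predicted_word = vocabulary[argmax(probabilities)]
print(predicted_word)
```

Applying this choice at every decoding step gives the argmax search that is compared with beam search in the results.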
Fig. 9 Model accuracy (train and test accuracy plotted against epoch)
Fig. 10 CNN–LSTM without additive attention loss curve
4.5 Results
Model accuracy and loss of the architecture with and without attention have been
calculated and plotted in Figs. 10, 11, 12, and 13. The accuracy of CNN–LSTM
without the attention mechanism on the training set is 84%, whereas the accuracy
of CNN–LSTM with additive attention is 91%, which is much better than the model
without attention on the training data (Fig. 13). Actual and predicted captions of
various chest X-ray images have been generated and compared. On the testing data,
additive attention gives an accuracy of 87.5%, and without additive attention, the
testing accuracy is 80.5%. The loss without attention and the loss with the attention
mechanism are plotted in Figs. 10 and 12, respectively. In Figs. 15, 16, 17, 18, 19,
and 20, actual and predicted captions of chest X-ray images are given with the
attention mechanism. The results are evaluated using two types of search, namely
argmax search and beam search. The actual description associated with Fig. 20 is
also printed, i.e., “no acute cardiopulmonary process,” while the model after
performing an argmax search predicts “no acute cardiopulmonary abnormality” as the
caption. For the same input, using beam search, the predicted caption is “no
cardiopulmonary abnormalities.”
Fig. 11 CNN–LSTM without additive attention accuracy curve
Fig. 12 CNN–LSTM with additive attention loss curve
Fig. 13 CNN–LSTM with additive attention accuracy curve
Similarly, the model is tested on various other chest X-ray images, and the
predictions are compared for both the argmax and beam search methods. For another
such test input, shown in Fig. 14, the actual caption is “no acute cardiopulmonary
abnormalities,” while argmax search outputs the caption “no acute cardiopulmonary
abnormality” and beam search outputs “no acute cardiopulmonary disease.” To better
understand how different types of search methods yield different results for
similar input, we analyze them through the cases discussed below.
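Before the cases, the difference between the two search strategies can be illustrated with a small sketch. It assumes a hypothetical toy next-token table in place of the trained CNN–LSTM decoder; the tokens, probabilities, and function names are invented for illustration. Argmax (greedy) search keeps only the single most probable token at each step, while beam search keeps the `beam_width` most probable partial captions and returns the best complete one.

```python
import math

# Hypothetical toy model: maps a caption prefix to next-token probabilities.
# It only stands in for the trained decoder described in the chapter.
TOY_MODEL = {
    (): {"no": 0.6, "normal": 0.4},
    ("no",): {"acute": 0.55, "evidence": 0.45},
    ("no", "acute"): {"disease": 0.6, "<end>": 0.4},
    ("no", "evidence"): {"<end>": 0.9, "of": 0.1},
}


def next_probs(prefix):
    # Any prefix not listed in the toy table simply ends the caption.
    return TOY_MODEL.get(tuple(prefix), {"<end>": 1.0})


def greedy_search(max_len=5):
    """Argmax search: always take the most probable next token."""
    tokens = []
    while len(tokens) < max_len:
        probs = next_probs(tokens)
        token = max(probs, key=probs.get)
        if token == "<end>":
            break
        tokens.append(token)
    return tokens


def beam_search(beam_width=2, max_len=5):
    """Keep the beam_width best partial captions, scored by log-probability."""
    beams = [([], 0.0, False)]  # (tokens, log-probability, finished)
    for _ in range(max_len):
        candidates = []
        for tokens, score, finished in beams:
            if finished:
                candidates.append((tokens, score, True))
                continue
            for token, p in next_probs(tokens).items():
                if token == "<end>":
                    candidates.append((tokens, score + math.log(p), True))
                else:
                    candidates.append((tokens + [token], score + math.log(p), False))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(finished for _, _, finished in beams):
            break
    return beams[0][0]


print(greedy_search())   # ['no', 'acute', 'disease']
print(beam_search(2))    # ['normal']
```

On this toy table the greedy caption has probability 0.6 × 0.55 × 0.6 ≈ 0.198, whereas beam search returns a shorter caption with probability 0.4. With `beam_width=1`, beam search reduces to argmax search.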
4.5.1 Case 1
The actual caption is “right-sided chest in without demonstration of an acute
cardiopulmonary abnormality,” while the predicted caption using argmax search
is “no evidence of acute cardiopulmonary disease,” and using beam search, it is
“no evidence of the same” (Figs. 15 and 16).
Fig. 14 Chest X-ray input image
Fig. 15 Case 1 report using argmax technique and attention mechanism
Fig. 16 Case 1 report using beam search technique and attention mechanism
Fig. 17 Case 2 report using beam search technique and attention mechanism
4.5.2 Case 2
The actual caption is “heart size is normal lungs are clear no nodules or masses
no adenopathy or effusion stable slightly sclerotic posterior inferior of one of
the mid-thoracic vertebral bodies seen on the lateral radiograph only this most
represents overlying degenerative spurring than metastasis,” while the predicted
caption using argmax search is “no acute cardio low lung volumes
no pneumothoraces is normal heart size and normal and clear lungs are grossly
within normal limits no acute cardiopulmonary abnormality identified,” and using
beam search, it is “no acute finding” (Figs. 17 and 18).
Fig. 18 Case 2 report using argmax technique and attention mechanism
4.5.3 Case 3
The actual caption is “no acute cardiopulmonary disease,” while the predicted
caption using argmax search is “no acute cardiopulmonary disease,”
and using beam search, it is “no acute abnormalities” (Figs. 19 and 20).
4.5.4 Case 4
The actual caption is “comparison no suspicious appearing lung nodules identified
well expanded and clear lungs mediastinal contour within normal limits no acute
cardiopulmonary abnormality identified,” while the predicted caption using argmax
search is “no impression nodules of size normal and posterior inferior
degenerative changes without superimposed pleural based suspected,” and using
beam search, it is “no evidence for disease” (Figs. 21 and 22).
Fig. 19 Case 3 report using argmax technique and attention mechanism
4.5.5 Case 5
The actual caption is “no focal airspace consolidation hyperexpanded lungs
suggestive of emphysema lingular subsegmental atelectasis or scarring,” while the
predicted caption using argmax search is “no acute abnormality noted
stable previous and appearance of atelectasis,” and using beam search, it is
“low lung features negative” (Figs. 23 and 24).
4.5.6 Case 6
The actual caption is “no active cardiopulmonary disease left humeral head is
positioned anterior and inferior to the glenoid concerning for anterior shoulder
subluxation this is related to the muscular dystrophy and decreased shoulder
muscles support postoperative changes from the spinal placement,” while the
predicted caption using argmax search is “no active disease no
evidence for contour no degenerative spurring of the prominent head of symptoms
from no acute tuberculosis since mediastinal contour no acute abnormalities since
patients symptoms of right volumes and l this may be an artifact of the previous the
exam is recommended no typical findings of pulmonary edema,” and using beam
search, it is “no acute cardiopulmonary disease” (Figs. 25 and 26).
Fig. 20 Case 3 report using beam search technique and attention mechanism
Fig. 21 Case 4 report using argmax technique and attention mechanism
Fig. 22 Case 4 report using beam search technique and attention mechanism
Fig. 23 Case 5 report using argmax technique and attention mechanism
Fig. 24 Case 5 report using beam search technique and attention mechanism
Fig. 25 Case 6 report using beam search technique and attention mechanism
Fig. 26 Case 6 report using argmax technique and attention mechanism
4.5.7 Conclusion
Comparing the results, the model architecture based on attention-based long
short-term memory networks for relation classification performed better than the
architecture without attention. The loss converged to 0.3 with an accuracy of 89%
on the training set and 92% on the validation set; from the results, we can see
that each predicted output is similar to the actual output. Thus, by using the
attention mechanism along with conventional deep learning methods, we can improve
the accuracy of the model.
Acknowledgments This paper and the research behind it would not have been possible without
the exceptional support of my supervisor, Dr. Nagendra Pratap Singh. His enthusiasm, knowledge,
and exacting attention to detail have been an inspiration and kept my work on track from our coding
to the final draft of this paper. The magnanimity and proficiency of one and all have enhanced this
study in innumerable ways and saved us from many errors.
References
1. Gerber, R., & Nagel, H.-H. (1996). Knowledge representation for the generation of quantified
natural language descriptions of vehicle traffic in image sequences. In ICIP. IEEE.
2. Yao, B. Z., Yang, X., Lin, L., Lee, M. W., & Zhu, S. C. (2010). I2T: Image parsing to text
description. Proceedings of the IEEE, 98(8), 1485–1508.
3. Farhadi, A., Hejrati, M., Sadeghi, M. A., Young, P., Rashtchian, C., Hockenmaier, J., & Forsyth,
D. (2010). Every picture tells a story: Generating sentences from images. In ECCV, 2010.
4. Kulkarni, G., Premraj, V., Dhar, S., Li, S., Choi, Y., Berg, A. C., & Berg, T. L. (2011). Baby
talk: Understanding and generating simple image descriptions. In CVPR, 2011.
5. Aker, A., & Gaizauskas, R. (2010). Generating image descriptions using dependency relational
patterns. In ACL, 2010.
6. Elliott, D., & Keller, F. (2013). Image description using visual dependency representations. In
EMNLP, 2013.
7. Zhou, P., Shi, W., Tian, J., Qi, Z., Li, B., Hao, H., & Xu, B. (2016). Attention-based
bidirectional long short-term memory networks for relation classification (pp. 207–212).
https://doi.org/10.18653/v1/P16-2034
8. Kuznetsova, P., Ordonez, V., Berg, T., & Choi, Y. (2014). Treetalk: Composition and
compression of trees for image descriptions. ACL, 2(10), 351–362.
9. Vinyals, O., Toshev, A., Bengio, S., & Erhan, D. (2015). Show and tell: A neural image caption
generator. In CVPR, 2015.
10. Karpathy, A., & Fei-Fei, L. (2015). Deep visual-semantic alignments for generating image
descriptions. In Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition (pp. 3128–3137).
11. Anderson, P., Gould, S., & Johnson, M. (2018). Partially-supervised image captioning. In
NeurIPS, 2018.
12. Chen, T.-H., Liao, Y.-H., Chuang, C.-Y., Hsu, W.-T., Fu, J., & Sun, M. (2017). Show, adapt,
and tell: Adversarial training of cross-domain image captioner. In ICCV, 2017.
13. Gu, J., Joty, S., Cai, J., & Wang, G. (2018). Unpaired image captioning by language pivoting.
In ECCV, 2018 (Vol. 61).
14. Feng, Y., Ma, L., Liu, W., & Luo, J. (2019). Unsupervised image captioning. In 2019
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 4120–4129).
15. Bahdanau, D., Cho, K., & Bengio, Y. (2014). Neural machine translation by jointly learning to
align and translate. arXiv preprint arXiv:1409.0473.
16. You, Q., Jin, H., Wang, Z., Fang, C., & Luo, J. (2016). Image captioning with semantic
attention. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
(pp. 4651–4659).
17. Deng, Z., Jiang, Z., Lan, R., Huang, W., & Luo, X. (2020). Image captioning using DenseNet
network and adaptive attention. Signal Processing: Image Communication, 85, 115836.
18. Yan, C., Hao, Y., Li, L., Yin, J., Liu, A., Mao, Z., Chen, Z., & Gao, X. (2021). Task-adaptive
attention for image captioning. IEEE Transactions on Circuits and Systems for Video
Technology, 32(1), 43–51.
19. Jing, B., Xie, P., & Xing, E. (2017). On the automatic generation of medical imaging reports.
arXiv preprint arXiv:1711.08195.
20. Xue, Y., Xu, T., Long, L. R., Xue, Z., Antani, S., Thoma, G. R., & Huang, X. (2018).
Multimodal recurrent model with attention for automated radiology report generation. In
International Conference on Medical Image Computing and Computer-Assisted Intervention
(pp. 457–466). Springer.
21. Yuan, J., Liao, H., Luo, R., & Luo, J. (2019). Automatic radiology report generation based
on multi-view image fusion and medical concept enrichment. In International Conference on
Medical Image Computing and Computer-Assisted Intervention (pp. 721–729). Springer.
22. Zhou, P., et al. (2016). Attention-based bidirectional long short-term memory networks for
relation classification. In Proceedings of the 54th annual meeting of the association for
computational linguistics (volume 2: Short papers).
Medical Image Processing by Swarm-Based Methods
María-Luisa Pérez-Delgado and Jesús-Ángel Román-Gallego
1 Introduction
There are several techniques to obtain medical images, such as X-ray, magnetic
resonance (MR) imaging, computed tomography (CT), positron emission tomography
(PET), ultrasound (US), or single-photon emission computed tomography (SPECT).
In general, each technique is applied to specific parts of the body and generates a
different type of image (Fig. 1).
The images obtained by all these methods provide very useful information to help
experts in making medical decisions. For this, it is necessary to process the images
to extract useful information. Currently, several different methods are available
for each processing operation. It must be taken into account that many image
processing operations have a high computational cost due to the dimensionality of
the data. This makes it necessary to use fast techniques that still produce good
results. Among such techniques, swarm-based algorithms have been successfully
applied in various image processing operations. This chapter shows the application
of swarm-based methods to medical imaging. In general, these methods are combined
with others to define a system that addresses various aspects of image processing.
Although there are many articles related to the subject, the description focuses on
analyzing recent works that present interesting proposals.
Many image processing operations are closely related and are often applied
sequentially to an image. For example, feature extraction and feature selection are
two operations that are usually applied to an image consecutively. The first operation
extracts a set of features from the image, which represent the image and at
the same time reduce the dimensionality of the data to be treated. Subsequently,
a subset is selected that includes only the most relevant features for the next
operation to be applied to the image.
M.-L. Pérez-Delgado · J.-Á. Román-Gallego
University of Salamanca, Escuela Politécnica Superior de Zamora, Zamora, Spain
e-mail: mlperez@usal.es; zjarg@usal.es
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
S. Pandey et al. (eds.), Role of Data-Intensive Distributed Computing Systems in
Designing Data Solutions, EAI/Springer Innovations in Communication and
Computing, https://doi.org/10.1007/978-3-031-15542-0_14
Fig. 1 Medical images obtained by different techniques
Among the operations applied to medical images, the chapter focuses on four
interesting cases: feature selection, segmentation, classification, and registration. A
section is included to describe interesting works that use swarm algorithms to apply
each of these operations. As already indicated, the processing of an image includes
several operations that are carried out by applying different methods. For example, it
is necessary to perform feature selection before applying a classification operation.
Therefore, although all the operations described in the subsequent sections are
related, it is easier to analyze them separately.
2 Swarm-Based Methods
Swarm-based methods apply a bioinspired approach to solve complex problems
[1]. These algorithms try to imitate the intelligent behavior observed in several
natural systems formed by a set of individuals that cooperate to face problems.
Although each individual in the population can only perform simple tasks, the
cooperation established among all individuals enables the population to perform
complex tasks. Swarm-based algorithms simulate this collective behavior and apply
it to solve optimization problems. These methods have been applied to solve a
variety of complex problems, generating good results compared to other existing
methods [2–4].
Although there are various swarm-based algorithms, they all share the same
basic structure. A population of individuals that represent solutions to the problem
is considered, and an iterative process is applied in which the population shares
information to move toward better positions in the search space. The initial
solution represented by each individual is defined in the initialization step. This
step generally associates each individual with a random solution of the search
space. Then, an iterative process improves the current solutions associated with the
individuals (some or all). To perform this improvement, it is necessary to compute
the quality or fitness of the solutions. This value is computed by applying the
objective function of the problem (or a modification of said function) to the solution
represented by each individual. The solution with the best fitness of the current
iteration represents the solution to the problem in that iteration, while the final
solution of the problem is the best found throughout the iterations. Once the fitness
of the solutions has been determined, the population shares information to try to
move the individuals to better areas of the search space. The computations applied
to perform this operation are different for each swarm-based method. Nevertheless,
in all cases, some or all the individuals move to new positions (generally more
promising positions) in the search space. The iterative process continues for a
predefined number of iterations or until the solution converges.
The first swarm-based method proposed in the literature mimics the foraging
behavior of ants. Several ant-based algorithms have been proposed over the years
[5]. The first one, called ant system, was applied to solve the well-known traveling
salesman problem. To solve this optimization problem, the associated weighted
graph is considered, and the algorithm looks for a minimum cost path on the graph.
For this purpose, a set of ants moves on the graph. The ants share
information through the pheromone that they deposit on the connections of the graph
that they traverse. Each ant traverses the graph to define a path that passes through
all the nodes once, choosing with higher probability the connections that have low
cost and large amounts of pheromone. When an ant has built its solution, it shares
information with other ants by updating the pheromone of the graph’s connections.
The amount of pheromone that an ant contributes is proportional to the quality of
the solution it has found, so lower cost paths receive more pheromone. As a result
of this update, the connections that are part of the best solutions become more
desirable to the ants in the next iteration of the algorithm. The solution to the
problem is the lowest cost path found throughout the iterations. Algorithm 1 shows
the basic steps that have been described above.
Algorithm 1: Ant-Based Algorithm
Initialize the pheromone of the graph connections
REPEAT
    Define a closed path for each ant
    Compute the cost of the solution defined by each ant
    Update the pheromone of the graph connections
    Update the best solution to the problem
UNTIL stop criterion is met
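The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration of an ant system for the traveling salesman problem, not the exact formulation given in the appendix: the parameter names (`alpha`, `beta`, `rho`), their default values, the deposit of 1/cost per ant, and the four-city example are assumptions chosen for the sketch.

```python
import math
import random


def tour_cost(tour, dist):
    # Cost of the closed path, including the edge back to the start.
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))


def build_tour(dist, pher, alpha, beta, rng):
    """One ant builds a closed path, preferring cheap, pheromone-rich edges."""
    n = len(dist)
    tour = [rng.randrange(n)]
    unvisited = set(range(n)) - {tour[0]}
    while unvisited:
        current = tour[-1]
        cities = sorted(unvisited)
        weights = [pher[current][j] ** alpha * (1.0 / dist[current][j]) ** beta
                   for j in cities]
        chosen = rng.choices(cities, weights=weights)[0]
        tour.append(chosen)
        unvisited.remove(chosen)
    return tour


def ant_system(dist, n_ants=10, n_iter=50, alpha=1.0, beta=2.0, rho=0.5, seed=0):
    rng = random.Random(seed)
    n = len(dist)
    pher = [[1.0] * n for _ in range(n)]          # initialize the pheromone
    best_tour, best_cost = None, math.inf
    for _ in range(n_iter):
        tours = [build_tour(dist, pher, alpha, beta, rng) for _ in range(n_ants)]
        costs = [tour_cost(t, dist) for t in tours]
        for i in range(n):                         # evaporation
            for j in range(n):
                pher[i][j] *= 1.0 - rho
        for tour, cost in zip(tours, costs):       # deposit: more for cheap tours
            for k in range(n):
                a, b = tour[k], tour[(k + 1) % n]
                pher[a][b] += 1.0 / cost
                pher[b][a] += 1.0 / cost
            if cost < best_cost:                   # update the best solution
                best_tour, best_cost = tour, cost
    return best_tour, best_cost


# Hypothetical example: four cities on the corners of a unit square.
points = [(0, 0), (0, 1), (1, 1), (1, 0)]
dist = [[math.hypot(ax - bx, ay - by) for bx, by in points] for ax, ay in points]
tour, cost = ant_system(dist)
print(tour, cost)
```

The distance matrix is assumed to contain strictly positive values between distinct cities, since the heuristic term uses the inverse of the distance.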
The particle swarm optimization (PSO) algorithm takes a different approach
from the ant-based algorithm [6]. PSO is applied to solve an
optimization problem that has an associated objective function. The solution to
the problem is a vector whose size is equal to the number of dimensions of the
solution space. To solve this problem, a set or swarm of particles is considered, and
the position of each particle is a feasible solution to the problem. In addition to a
position, each particle has a velocity and remembers the best position it has found
throughout the iterations of the algorithm. The quality or fitness of a solution is
calculated by applying the objective function of the problem. The algorithm begins
by giving initial values to the particles in the swarm. Then, it applies an iterative
process that allows the particles to move within the search space, to find a good
solution to the problem. Each particle adjusts its position, based on both the best
position reached by itself and the best position reached by the swarm. The best
position found by the swarm throughout the iterations will be the solution to the
problem. Algorithm 2 shows the basic steps of PSO.
Algorithm 2: PSO
Initialize the particles in the population
REPEAT
    Update the velocity of each particle
    Update the position of each particle
    Update the personal best position of each particle
    Update the best solution of the swarm
UNTIL stop criterion is met
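A minimal Python sketch of Algorithm 2 is given below. It assumes the widely used inertia-weight form of the velocity update and a minimization problem; the parameter names and values (`w`, `c1`, `c2`), the clipping of positions to the bounds, and the sphere test function are assumptions for illustration and may differ from the equations in the appendix.

```python
import random


def sphere(x):
    # Hypothetical objective function used only for this example.
    return sum(v * v for v in x)


def pso(objective, dim, bounds, n_particles=20, n_iter=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    # Initialize the particles: random positions, zero velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to personal best + pull to swarm best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Position update, kept inside the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            fit = objective(pos[i])
            if fit < pbest_fit[i]:                 # personal best
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit < gbest_fit:                # best of the swarm
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


best, best_fit = pso(sphere, dim=2, bounds=(-5.0, 5.0))
print(best, best_fit)
```

Here the swarm best is updated as soon as a particle improves it, which is one common design choice; updating it only once per iteration is equally valid.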
The length of this chapter precludes detailing the operations of other swarm
algorithms. However, the description given for PSO shows a general scheme
followed by many of these algorithms. For example, this can be seen in the steps
of the firefly algorithm (FA) [7] and the shuffled-frog leaping algorithm (SFLA) [8],
shown in Algorithms 3 and 4, respectively. To complete the information associated
with the algorithms that appear in this section, this chapter includes an appendix
that shows the flowchart of each method along with the equations associated with
the basic operations.
Algorithm 3: FA
Initialize the population of fireflies
REPEAT
    Sort the fireflies by brightness (fitness)
    Update all fireflies except the brightest one
    Update the brightest firefly
    Update the best solution to the problem
UNTIL stop criterion is met
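Algorithm 3 can be sketched in the same style. The code assumes the commonly used firefly update, in which a firefly moves toward each brighter one with an attractiveness that decays with the squared distance, plus a small random step; the parameter names and values (`beta0`, `gamma`, `alpha`), the decay of `alpha`, and the sphere objective are assumptions of this sketch rather than details taken from the chapter.

```python
import math
import random


def sphere(x):
    # Hypothetical objective; lower values correspond to brighter fireflies.
    return sum(v * v for v in x)


def firefly_algorithm(objective, dim, bounds, n_fireflies=15, n_iter=100,
                      beta0=1.0, gamma=0.01, alpha=0.2, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds

    def clip(v):
        return min(hi, max(lo, v))

    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_fireflies)]
    fit = [objective(x) for x in pop]
    best, best_fit = None, math.inf
    for _ in range(n_iter):
        # Sort by brightness, brightest first, and remember the best so far.
        order = sorted(range(n_fireflies), key=fit.__getitem__)
        pop = [pop[i] for i in order]
        fit = [fit[i] for i in order]
        if fit[0] < best_fit:
            best, best_fit = pop[0][:], fit[0]
        # Every firefly except the brightest moves toward the brighter ones.
        for i in range(1, n_fireflies):
            for j in range(i):
                if fit[j] < fit[i]:
                    r2 = sum((a - b) ** 2 for a, b in zip(pop[i], pop[j]))
                    beta = beta0 * math.exp(-gamma * r2)
                    pop[i] = [clip(a + beta * (b - a) + alpha * (rng.random() - 0.5))
                              for a, b in zip(pop[i], pop[j])]
                    fit[i] = objective(pop[i])
        # The brightest firefly performs a random move.
        pop[0] = [clip(a + alpha * (rng.random() - 0.5)) for a in pop[0]]
        fit[0] = objective(pop[0])
        alpha *= 0.97  # assumed schedule: shrink the random step over time
    k = min(range(n_fireflies), key=fit.__getitem__)
    if fit[k] < best_fit:
        best, best_fit = pop[k][:], fit[k]
    return best, best_fit


best, best_fit = firefly_algorithm(sphere, dim=2, bounds=(-5.0, 5.0))
print(best, best_fit)
```

Because the brightest firefly moves at random and may get worse, the best solution found so far is stored separately from the population.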
The following sections of this chapter refer to some other swarm-based methods
that cannot be described in this section due to space limitations. However, they
are listed below, and a reference is cited where they are clearly described. The
indicated methods are as follows: artificial bee colony (ABC) [9], bacterial foraging
optimization (BFO) [10], bat algorithm (BA) [11], cat swarm optimization (CSO)
[12], crow search (CRS) [13], cuckoo search (CUS) [14], flower pollination
algorithm (FPA) [15], and gray wolf optimization (GWO) [16].
Algorithm 4: SFLA
Initialize the population of frogs
REPEAT
    Sort the frogs by fitness
    Create the memeplexes
    FOR each memeplex
        Improve the worst frog in the memeplex
    END-FOR
    Recombine the frogs of all memeplexes
    Update the best solution of the population
UNTIL stop criterion is met
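A compact Python sketch of Algorithm 4 follows. It assumes the usual improvement rule for the worst frog: jump toward the best frog of its memeplex, then toward the best frog of the population if that fails, and otherwise replace it with a random frog. The parameter names, the number of local steps, the per-dimension random factor, and the sphere objective are assumptions made for illustration.

```python
import random


def sphere(x):
    # Hypothetical objective function (minimization).
    return sum(v * v for v in x)


def sfla(objective, dim, bounds, n_frogs=20, n_memeplexes=4, n_iter=50,
         local_steps=5, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds

    def random_frog():
        return [rng.uniform(lo, hi) for _ in range(dim)]

    frogs = [random_frog() for _ in range(n_frogs)]
    for _ in range(n_iter):
        frogs.sort(key=objective)                   # best frog first
        global_best = frogs[0]
        # Frog k goes to memeplex k mod n_memeplexes.
        memeplexes = [frogs[k::n_memeplexes] for k in range(n_memeplexes)]
        for memeplex in memeplexes:
            for _ in range(local_steps):
                memeplex.sort(key=objective)
                worst = memeplex[-1]
                # Try the memeplex best first, then the population best.
                for leader in (memeplex[0], global_best):
                    candidate = [min(hi, max(lo, w + rng.random() * (b - w)))
                                 for w, b in zip(worst, leader)]
                    if objective(candidate) < objective(worst):
                        break
                else:
                    candidate = random_frog()       # no improvement: random frog
                memeplex[-1] = candidate
        # Recombine (shuffle) the frogs of all memeplexes.
        frogs = [frog for memeplex in memeplexes for frog in memeplex]
    return min(frogs, key=objective)


best = sfla(sphere, dim=2, bounds=(-5.0, 5.0))
print(best, sphere(best))
```

Only the worst frog of each memeplex is replaced, so the best frog of the population is preserved as long as each memeplex holds at least two frogs.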
3 Feature Selection
An image can be represented by a set of features drawn from it. They are obtained
as a result of a feature extraction procedure, which is usually applied before other
image processing operations, such as classification. Once the set of features that
represent the image has been extracted, different operations can be applied to said
image. In general, these operations do not use the entire feature set, but only the most
suitable subset for the task to be performed. Therefore, a feature selection operation
is applied to the initial feature set. The objective of feature selection is to reduce the
initial set of features to a small subset, by selecting those that are the most relevant
for the processing to be applied to the image and reducing the redundancy.
The proposal of Jona and Nagaveni defines a feature selection method that is
applied to mammograms to detect breast cancer [17]. This method applies an ant-based algorithm called ant colony optimization (ACO) and uses the CUS algorithm
to perform the local search of ACO. The method considers an initial set with 78
features. When the first iteration of ACO is applied, each ant randomly selects
a subset of features. However, in subsequent iterations, the ants can only select
features from the subsets used in the previous iteration to update the pheromone.
CUS is used at each iteration to select the best features.
Sudha and Selvarajan described a feature selection method for breast cancer
classification based on mammograms that uses a modification of the CUS algorithm
[18]. The image is first segmented to extract the region of interest that contains the
suspicious mass. The mass is then represented by a set of 123 features, and the CUS
method is applied to select the most suitable subset of features to classify the image.
Since the final objective of the feature selection process is to classify the images,
the fitness function used for CUS is computed based on the classification accuracy
of the nearest neighbor classifier.
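A fitness function of this kind can be sketched as follows. The code scores a candidate feature subset, encoded as a 0/1 mask, by the leave-one-out accuracy of a 1-nearest-neighbor classifier. The leave-one-out protocol, the function name, and the toy data are assumptions for illustration; the cited work may evaluate accuracy differently.

```python
def nearest_neighbor_accuracy(samples, labels, mask):
    """Leave-one-out accuracy of a 1-NN classifier on the selected features."""
    selected = [k for k, keep in enumerate(mask) if keep]
    if not selected:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i, x in enumerate(samples):
        best_j, best_d = None, float("inf")
        for j, y in enumerate(samples):
            if i == j:
                continue
            # Squared Euclidean distance restricted to the selected features.
            d = sum((x[k] - y[k]) ** 2 for k in selected)
            if d < best_d:
                best_j, best_d = j, d
        correct += labels[best_j] == labels[i]
    return correct / len(samples)


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
samples = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0],
           [1.0, 5.1], [1.1, 1.1], [1.2, 9.1]]
labels = [0, 0, 0, 1, 1, 1]

print(nearest_neighbor_accuracy(samples, labels, [1, 0]))  # 1.0
print(nearest_neighbor_accuracy(samples, labels, [1, 1]))  # 0.0
```

A swarm algorithm would evaluate this function for the mask carried by each individual and move the population toward masks with higher accuracy.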
Jothi combined FA with tolerance rough set to define a feature selection method
for MR brain images in which the features are used for the detection of brain
tumors [19]. The tolerance rough set is a feature selection method that can
operate on real values [20]. The method described by Jothi first performs image
segmentation. Then, feature extraction is applied to obtain 28 features (including
shape, intensity-based features, and texture-based features). After this, the feature
selection operation is applied by executing FA. This algorithm uses the similarity
measures defined in the tolerance rough set to compute the similarity among
fireflies.
The research reported in [21] describes a system for brain tumor grade identification based on the analysis of MR images. The system applies successive methods
for image segmentation, tumor isolation, feature extraction, feature selection, and
classification. The feature extraction operation obtains textural, non-textural, shape,
and intensity-based features. Then, SFLA is applied to said features in order to select
the best subset of features to perform the classification.
Sahoo and Chandra describe a system for classifying cervix lesions as benign
or malignant [22]. This system applies a modified version of GWO to perform the
feature selection operation. Since the original GWO was defined to solve
single-objective optimization problems, this article describes two variants for applying
GWO to the multi-objective problem associated with feature selection.
Shankar et al. described a system for Alzheimer detection from MR brain images
[23]. After identifying the region of interest in the image, the features of that
region are extracted. Feature selection is then performed by applying the GWO
algorithm, which uses the classification accuracy as the fitness function.
Tan et al. describe a method for the diagnosis of skin cancer applied to
dermoscopic images, where a modified PSO is used for feature selection [24]. The
PSO-based method is applied to the general set of image features to identify the most
significant features of benign and malignant skin lesions. The main modifications of
the PSO are the use of two subpopulations and a new equation to update the velocity
of the particles, which considers the best particle of a subpopulation and discards
the worst particle. In addition, some updates are applied to selected subdimensions,
while others are applied to all subdimensions.
A feature selection method to classify MR images of brain tumors is described
in [25]. This method is based on the Fisher criterion and a variant of BA. The
modification introduced in BA tries to improve the exploration capacity of the
basic algorithm. Many feature selection methods measure the importance of the
feature subset by using the classification accuracy as a metric. When the
classification accuracy is used as the fitness criterion, the feature subset selected
depends on the classifier considered. To avoid this limitation, the method proposed
in this article uses the trace obtained via the Fisher criterion as the fitness
function, instead of using the classification accuracy to define that function. The
system described in the article completes the operation by applying a support
vector machine (SVM) to perform the classification.
The proposal of Dandu et al. describes a method for the detection of brain
tumors and pancreatic tumors where CSO is used for feature selection [26]. After
performing image segmentation, the scale-invariant feature transform is applied to
extract features. CSO then selects the features that make it possible to distinguish
the objects of different classes. After this, the classification is performed by
applying a backpropagation neural network. The method was applied to MR images and
CT images.
4 Image Segmentation
Image segmentation consists of decomposing an image into regions that do not
overlap. This operation makes it possible to identify interesting parts of the image
for further analysis. Image segmentation is an important operation in the analysis of
medical images, since it allows identifying areas of tissues, bones, or organs affected
by different problems (Fig. 2). Segmentation makes it possible to determine the
shape or volume of the affected area, and this information helps experts in making
medical decisions.
Various approaches can be applied for image segmentation, such as clustering,
thresholding, edge detection, or region identification.
Clustering algorithms are commonly used as segmentation techniques. These
methods divide the pixels of the image into clusters or groups of similar pixels.
The research presented in [27] proposes a model for blood vessel segmentation
that combines the matched filter method with the ant-based method called ant colony
system [28]. The matched filter is a method commonly used for blood vessel detection,
but the combination with the ant-based clustering method increases the accuracy of
the results. In this case, the matched filter algorithm and the ant-based algorithm are
applied in parallel, and the results of both methods are combined.
Hancer et al. describe an image segmentation method that applies ABC to extract
brain tumors from MR images [29]. Segmentation is carried out by ABC, which
is applied as a clustering method. In this case, each food source used by the
algorithm represents the centroid of each cluster. After this, the segmented image
is converted into a binary image by applying thresholding, and finally, the brain
tumor is extracted by applying connected component labeling.
Fig. 2 Original brain image (a) and segmented suspicious area (b)
Mostafa et al. describe a liver segmentation method that applies ABC [30]. This
proposal uses ABC as a clustering method that identifies regions with different
intensity in abdominal CT images. The initial liver area segmentation obtained by
this method is then refined by a region-growing approach.
Fuzzy c-means (FCM) [31] is a clustering method that has been widely applied to
image segmentation. This method is the fuzzy version of K-means [32]. K-means
and FCM are two very popular clustering methods. K-means separates a
set of items into a predefined number of groups or clusters. Each item is assigned
to the most similar group. Similarity is calculated by comparing the item and the
cluster centroid, which is the mean value of the elements associated with that
cluster. The process is applied iteratively to refine the centroids. In the case of FCM,
each item can be associated with several groups. There is a membership value that
determines the degree of association of each item with each cluster. It should be
noted that the results of both methods are influenced by the initial centroids used.
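The membership idea can be made concrete with a short sketch of FCM on one-dimensional intensity values. It uses the standard FCM update rules with fuzzifier `m`; the function name, the sample intensities, and the initial centers are assumptions for illustration. Because the result depends on the initial centers, several of the works discussed in this section use a swarm algorithm to choose them.

```python
def fuzzy_c_means(values, centers, m=2.0, n_iter=50):
    """Standard FCM on scalar values; returns the centers and memberships."""
    centers = list(centers)
    memberships = []
    for _ in range(n_iter):
        # Membership of each item in each cluster (each row sums to 1).
        memberships = []
        for x in values:
            dists = [abs(x - c) for c in centers]
            if 0.0 in dists:
                # The item coincides with a center: full membership there.
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(row)
                row = [r / total for r in row]
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
                total = sum(inv)
                row = [v / total for v in inv]
            memberships.append(row)
        # Each center is the mean of all items weighted by membership^m.
        centers = [
            sum(u[k] ** m * x for u, x in zip(memberships, values))
            / sum(u[k] ** m for u in memberships)
            for k in range(len(centers))
        ]
    # The memberships returned are those used for the last center update.
    return centers, memberships


# Hypothetical normalized pixel intensities forming two groups.
values = [0.0, 0.1, 0.2, 0.9, 1.0, 1.1]
centers, memberships = fuzzy_c_means(values, centers=[0.3, 0.7])
print(centers)
```

Unlike K-means, every item keeps a nonzero membership in both clusters, although for well-separated groups the membership in the distant cluster is very small.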
The method described by Taherdangkoo et al. in [33] combines ABC with FCM
to segment MR images. This proposal considers the method described by Shen et
al. in [34] as a starting point. To improve the results obtained for noisy images,
Shen et al. introduced two new parameters in FCM (the feature difference between
neighboring pixels in the image and the relative location of the neighboring pixels)
and computed them by a neural network. Since this operation is time-consuming,
Taherdangkoo et al. proposed using ABC to compute these parameters. The proposal
of Forghani et al. is also based on the method of Shen et al. but uses PSO to calculate
the two new parameters [35].
PSO was used in [36] to select the optimal cluster centers for the FCM
method that performs segmentation. Then, FCM was applied for MR brain images
segmentation. The authors use a variant of FCM described in [37] and improve it by
including three main modifications. First, PSO is used to initialize the FCM cluster
centers. Second, the membership function of FCM considers outlier rejection. Third,
the method considers spatial neighborhood information by using a square window
around the pixel being processed.
The proposal of Alagarsamy et al. combines CUS with a variant of FCM (called
type-2 FCM) to define a method for MR brain image segmentation [38]. In this
case, an iterative process is defined where CUS and the FCM variant are applied
sequentially until the solution converges. The same authors proposed another similar
method where BA is used instead of CUS [39].
Kavitha and Prabakaran describe a method for the early detection of lung tumors
on CT images [40]. In this case, PSO is used to select the initial cluster centers for
the FCM clustering method that performs the segmentation. Before applying PSO,
the filtered image is divided into five horizontal equidistant strips, and the second
strip is taken to apply segmentation.
Thresholding methods are frequently used techniques for image segmentation.
They separate the pixels in the image into two or more classes, based on their
intensity, and determine the boundaries between classes. The methods used to
calculate the thresholds can be divided into nonparametric and parametric, the
former being more precise. Nonparametric methods determine the thresholds by
optimizing a specific criterion. For example, the Otsu criterion selects optimal
thresholds by maximizing the between-class variance [41]. On the other hand,
entropy-based criteria maximize the sum of entropy for each class. The Kapur
entropy [42], the Tsallis entropy [43], the minimum cross entropy [44], and the
fuzzy entropy are very popular entropy-based approaches. Several articles use
swarm-based algorithms to find optimal threshold values for the cited criteria. In this
case, the fitness function of the swarm is defined based on one of the thresholding
criteria described above.
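To make the role of the fitness function concrete, the sketch below evaluates the Otsu between-class variance for a candidate set of thresholds on a gray-level histogram. A swarm individual would encode the thresholds and the swarm would maximize this value. The function name and the convention that a threshold t places levels below t in the lower class are assumptions of this example.

```python
def otsu_fitness(hist, thresholds):
    """Between-class variance of the classes defined by the thresholds."""
    total = sum(hist)
    probs = [h / total for h in hist]
    global_mean = sum(level * p for level, p in enumerate(probs))
    # Class k covers the gray levels in [bounds[k], bounds[k + 1]).
    bounds = [0] + sorted(thresholds) + [len(hist)]
    variance = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        weight = sum(probs[lo:hi])
        if weight > 0:
            mean = sum(level * probs[level] for level in range(lo, hi)) / weight
            variance += weight * (mean - global_mean) ** 2
    return variance


# Hypothetical bimodal histogram with six gray levels.
histogram = [10, 10, 0, 0, 10, 10]
good_split = otsu_fitness(histogram, [3])   # separates the two modes
poor_split = otsu_fitness(histogram, [1])   # cuts through the first mode
```

In this toy histogram the threshold that separates the two modes yields the larger between-class variance, so it is the one a maximizing swarm would prefer.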
The proposal described in [45] adapts the food-searching behavior of ants to
define a thresholding method for medical image segmentation. This method was
applied to MR brain images. The ants move on the image looking for food (similar
pixels) and can memorize the food they found during this process. When an ant
finds a new target, a fuzzy measure is used to evaluate the similarity between the
target and the previous position. When the operation of the ants is completed, the
pheromone deposited by the ants during their movement generates the segmentation
results. The segmentation method described in [46] is the same as that described by
[45], and it is also applied to the same type of images.
Menon and Ramakrishnan apply ABC to segment MR brain images and then use
FCM to process the segmentation result [47]. The segmentation method is based on
the use of gray levels and considers the entropy method for the threshold estimation.
ABC is applied to determine the global threshold. In this case, the authors use an
ABC-based method previously applied to satellite image segmentation [48]. Then,
FCM is applied to cluster the segmented image, which allows identifying the brain
tumor.
The proposal of Li et al. uses a variant of PSO to optimize the parameters for the
Otsu criterion that is applied to perform image segmentation [49]. The PSO variant
was previously proposed by the same authors in [50], using quantum uncertainty
and cooperation mechanisms to prevent PSO from being trapped in local optima.
The new article of the authors improves on this method by making better use of
contextual information, which is evaluated after each particle is processed. The
article shows the results of the method applied to CT images of a human stomach
cavity. On the other hand, the research described in [51] proposes an improvement
of the method presented in [49]. In this case, a set of auxiliary swarms is used to
initialize the particles in the main swarm. To reduce the effect of local minima,
the search space is partitioned into several regions, and each auxiliary swarm is
associated with a region.
Rajinikanth et al. describe a method to extract a tumor from a two-dimensional
gray scale brain MR image [52]. The method includes two stages. First, a multilevel
thresholding operation is performed by applying the FA method with a fitness
function that uses the Tsallis entropy. This operation enhances the tumor region by
grouping the similar pixels. To conclude the first stage, the skull region is eliminated.
The resulting image is then segmented into different partitions using the Markov
random field model combined with an expectation maximization algorithm, which
is a common method for gray scale image segmentation [53]. As a result, three
image segments are obtained: white matter, gray matter, and tumor mass.
The proposal discussed in [54] uses CUS to define a segmentation method
applied to microscopic images. The CUS method was applied considering three
different objective functions: Otsu criterion, Kapur entropy, and Tsallis entropy. The
article includes results that determine the efficiency of each of the variants in terms
of the execution time and the quality of the final solution.
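As a complement to the Otsu criterion, the sum of class entropies used by the Kapur criterion can be sketched as follows. The function name and the threshold convention (levels below t belong to the lower class) are assumptions of this example, and the Tsallis variant is not shown.

```python
import math


def kapur_fitness(hist, thresholds):
    """Sum of the Shannon entropies of the classes defined by the thresholds."""
    total = sum(hist)
    probs = [h / total for h in hist]
    bounds = [0] + sorted(thresholds) + [len(hist)]
    entropy_sum = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        weight = sum(probs[lo:hi])
        if weight > 0:
            # Entropy of the class distribution, normalized by the class weight.
            entropy_sum -= sum((p / weight) * math.log(p / weight)
                               for p in probs[lo:hi] if p > 0)
    return entropy_sum


# Hypothetical flat histogram with four gray levels.
flat_histogram = [1, 1, 1, 1]
balanced = kapur_fitness(flat_histogram, [2])
unbalanced = kapur_fitness(flat_histogram, [1])
```

For this flat histogram the balanced split gives two classes with entropy ln 2 each, which is larger than the value obtained when one class contains a single gray level.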
The proposal of Wang et al. applies multi-threshold image segmentation by using
FPA [55]. They use the Otsu criterion to define the objective function of the swarm-based method. In addition, they modify the basic method to increase population
diversity. On the one hand, the article proposes a new mutation mechanism for FPA
in which the solution vectors are selected in such a way that each vector represents
a different region of the search space. On the other hand, a crossover operator is
used to increase the population diversity in the local search process. The method
was applied to medical images of several types, most of them corresponding to CT
and MR images.
Edge-based methods used for segmentation attempt to detect edges in the image.
This requires finding local intensity changes in the image. On the other hand, region-based methods try to identify groups of neighboring pixels with similar intensity.
The method described by Pereira et al. applies ACO to segment the optic disc
in retinal images [56]. The pixels in the image are considered as the nodes of the
graph that the ants can visit and the ACO algorithm is used as an edge detector. The
ants move over the image driven by the local variation of the intensity values of the
image. They then update a pheromone matrix of the same size as the image, which
represents the edge information at each pixel of the image. At the end of the process,
the pheromone matrix is analyzed, and a binary decision is made for each pixel,
determining whether it is an edge pixel or not. The same authors used a similar approach to
define a method for automatic identification of diabetic retinopathy lesions in fundus
images [57]. In this case, the ACO algorithm was applied to segment exudates.
Another approach commonly used in image segmentation is that defined by
active contour models. These models typically use energy-based segmentation
techniques, thus attempting to minimize the energy associated with the active
contour as it evolves to fit around the desired object. Therefore, it is necessary to
solve an optimization problem whose objective is to minimize the total energy, to
guarantee that the active contour is located at the limits of the object. An active
contour problem is usually solved by the gradient descent method, but some swarm-based methods have also been applied.
PSO was applied in [58] for image segmentation based on active contours. This
solution uses an active contour model method described in [59], which is a popular
region-based model. The authors improve the results of said method by using PSO to
solve the fitting energy minimization problem. The article shows the results obtained
for various types of medical images.
The proposal of Ilunga-Mbuyamba et al. describes an active contour model
approach for image segmentation that uses a CUS variant [60]. The method is
applied to MR brain images to detect tumors. CUS is used to help control points
converge toward the global minimum of the energy function. With this purpose,
the method defines a local search space (window) for each control point from the
current contour. Then, the control points are placed randomly inside each window,
in order to obtain new ones by applying CUS.
The proposal presented in [61] describes an intensity-based statistical method
that extracts the three-dimensional cerebrovascular structure from time-of-flight
magnetic resonance angiography data. This segmentation method combines a new
finite mixture model with an improved PSO variant. The information is modeled
by a Rayleigh distribution function and two Gaussian distribution functions. In
addition, the finite mixture model is used to fit the intensity histogram of the images.
In this case, PSO is used to estimate the parameters of the finite mixture model that
fits the intensity histogram of the image. The PSO variant uses a modified method
to update the velocity of the particles and also considers that each particle can only
share information with the neighbors that are within a ring around its position.
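The idea of fitting a finite mixture model to an intensity histogram can be illustrated with the sketch below, which evaluates a mixture of one Rayleigh and two Gaussian components and scores a parameter vector against a normalized histogram. The parameter layout and the use of a negative sum of squared errors as fitness are assumptions made for this example, since the fitness used in [61] is not detailed here.

```python
import math


def rayleigh_pdf(x, sigma):
    return (x / sigma ** 2) * math.exp(-x * x / (2 * sigma ** 2)) if x >= 0 else 0.0


def gaussian_pdf(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def mixture_pdf(x, params):
    """params = (w_r, sigma_r, w_1, mu_1, sigma_1, w_2, mu_2, sigma_2)."""
    w_r, s_r, w1, mu1, s1, w2, mu2, s2 = params
    return (w_r * rayleigh_pdf(x, s_r)
            + w1 * gaussian_pdf(x, mu1, s1)
            + w2 * gaussian_pdf(x, mu2, s2))


def mixture_fitness(params, norm_hist):
    """Negative squared error between model and histogram (to be maximized)."""
    return -sum((mixture_pdf(level, params) - h) ** 2
                for level, h in enumerate(norm_hist))


# Hypothetical parameters and a synthetic histogram generated from them.
true_params = (0.5, 10.0, 0.3, 30.0, 4.0, 0.2, 50.0, 3.0)
histogram = [mixture_pdf(level, true_params) for level in range(64)]
shifted_params = (0.5, 10.0, 0.3, 35.0, 4.0, 0.2, 50.0, 3.0)
```

Each particle of the swarm would hold one such parameter vector, and the particle whose mixture best matches the histogram would guide the search.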
5 Image Classification
Medical imaging classification is generally used to identify suspicious areas. This
operation allows identifying the images that correspond to healthy people and those
that correspond to people with some disease (Fig. 3).
Fig. 3 A classification method can be used to differentiate between normal and abnormal images
(images showing a health problem)
In general, a part of the image is selected, and the classification process is applied
only to that part. Therefore, the classification operation is usually preceded by a
segmentation operation, which identifies the region of interest.
There are several methods frequently referenced in the literature related to
the classification of medical images, such as clustering methods, artificial neural
networks, SVM, or FCM. The quality of the result obtained by any classification
technique depends on the proper selection of its parameters. To aid in this task,
swarm methods have been combined with these techniques to set the corresponding
parameters.
Neural networks are trainable systems that learn to solve a problem from
examples of that problem. The training process adjusts the weights associated with
the network connections. Several research articles apply an artificial neural network
to classify medical images and use a swarm-based method to train the network. In
this case, each individual in the population represents the set of weights of the neural
network.
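The encoding in which one individual represents all the network weights can be sketched as follows. The example decodes a flat position vector into a network with one hidden layer and returns the mean squared error over a set of labeled samples. The layer layout, the activation functions, and all names are assumptions chosen for illustration, not the architecture of any cited article.

```python
import math


def decode(position, n_in, n_hidden):
    """Split a flat vector into hidden-layer and output-layer parameters."""
    idx = 0
    hidden = []
    for _ in range(n_hidden):
        hidden.append((position[idx:idx + n_in], position[idx + n_in]))
        idx += n_in + 1
    out_weights = position[idx:idx + n_hidden]
    out_bias = position[idx + n_hidden]
    return hidden, out_weights, out_bias


def predict(position, x, n_hidden):
    hidden, out_weights, out_bias = decode(position, len(x), n_hidden)
    h = [math.tanh(sum(w * xi for w, xi in zip(ws, x)) + b) for ws, b in hidden]
    z = sum(w * hi for w, hi in zip(out_weights, h)) + out_bias
    return 1.0 / (1.0 + math.exp(-z))  # output in (0, 1)


def mse_fitness(position, samples, n_hidden):
    """Mean squared error of the network encoded by position (lower is better)."""
    errors = [(predict(position, x, n_hidden) - y) ** 2 for x, y in samples]
    return sum(errors) / len(errors)


# Two inputs and two hidden units need 2 * (2 + 1) + 2 + 1 = 9 parameters.
samples = [([0.0, 1.0], 1.0), ([1.0, 0.0], 0.0)]
zero_particle = [0.0] * 9
```

Since the error must be minimized, a swarm method formulated for maximization would use the negative of this value as fitness.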
A method that applies a forward neural network to classify MR brain images as
normal or abnormal is proposed in [62]. The system applies principal component
analysis for feature selection and uses the selected set as input for the neural
network. The weights of this network are optimized by a PSO variant. The main
difference of this variant with respect to the original algorithm is in the definition of
the weights of the equation used to update the velocity of the particles. The fitness
function used in this case is the mean squared error.
A method that combines PSO and ABC to classify MR brain images is described
in [63]. The method classifies the images as normal and abnormal. It applies
principal component analysis for feature selection before applying the swarm-based
methods. The selected set is used as input of a feed-forward neural network that is
optimized with a combination of two swarm methods. The article investigates the
application of three different combinations of ABC and PSO previously proposed
by other authors. The results show that the best combination is the one described in
[64], which applies PSO and ABC in parallel and, at each iteration, recombines the
best solution obtained by both methods.
Dheeba et al. defined a system to detect breast abnormalities on digital mammograms [65]. The method classifies mammograms into normal and abnormal. The
feature extraction stage applied to the images allows obtaining texture information
that is used in the classification stage. The classification is carried out by means of
a neural network that uses the wavelet activation function, combined with the PSO
method that is used to tune the initial network parameters. The method described
in [66] considers the same problem and uses FA instead of PSO to optimize the
parameters of a neural network that also uses the wavelet function.
A method that analyzes skin images to detect melanoma was proposed in [67].
This method combines GWO with a neural network to process cancer images. In
this case, GWO is used to define the initial weights of a multilayer perceptron neural
network. The method identifies two areas for classification (cancer and healthy) and
classifies each pixel in the image into one of the two categories.
A classification method to identify brain tumors based on MR brain images is
described in [68]. The images are classified as normal or abnormal by a supervised
neural network that is combined with the GWO method to optimize the network
parameters.
The proposal described in [69] combines swarm-based methods and deep
learning to define a model for the detection and classification of lung cancer nodules
from CT images. The model uses a convolutional neural network trained using a
swarm-based method. The article analyzes the results obtained for seven swarm
methods, including PSO, ABC, BFO, and FA. Computational experiments show
that the best results are obtained when PSO is considered.
The method described in [70] for lung cancer diagnosis combines deep learning
and a variant of the CRS algorithm. The objective is to find lung nodules in CT
images and classify them as benign or malignant. The modified CRS is used to
update the weights of the neural network during the training phase. The CRS-variant
combines the original algorithm with the sine cosine algorithm proposed in [71].
This is a population-based method that creates a set of random initial solutions
and requires them to fluctuate outward or toward the best solution by applying
a mathematical model based on sine and cosine functions. Each individual in the
resulting CRS-variant can select to update its location according to the CRS method
or according to the sine cosine method.
SVM is a useful classification technique that has also been applied to classify
medical images. The objective of the SVM algorithm is to find a hyperplane in a
multidimensional space that separates the data points of different classes. When considering
a nonlinearly separable problem, SVM can use a kernel, which is a function that
implicitly maps a low-dimensional input space into a higher-dimensional
space, so that a nonseparable problem becomes separable. For SVM to give
good results, it is necessary to assign adequate values to
the parameters. Several researchers have applied swarm-based methods to set these
parameters.
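The following sketch shows the Gaussian radial basis function kernel and one possible way of mapping the position of a swarm individual to SVM parameters. The logarithmic encoding (the position stores base-2 exponents of the penalty C and the kernel width gamma) is an assumption of this example, not a detail taken from the cited articles.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian RBF kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def decode_svm_params(position):
    """Map a two-dimensional position to (C, gamma) on a log2 scale."""
    return 2.0 ** position[0], 2.0 ** position[1]


# Hypothetical individual encoding C = 2^0 and gamma = 2^-1.
penalty, gamma = decode_svm_params([0.0, -1.0])
similarity = rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma)
```

The fitness of an individual would then typically be the classification accuracy obtained by an SVM trained with the decoded parameters.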
Zhang et al. proposed a method to classify MR brain images as normal or
abnormal (abnormal images correspond to 17 different types of diseases) [72]. They
apply a kernel SVM that replaces the dot product of the original SVM method with
the radial basis function kernel. In addition, the method applies PSO to optimize the
parameters of the SVM classifier.
ABC was used in [73] to analyze CT images in order to detect cervical cancer.
The method classifies the input images as normal or abnormal. The system first
segments the images to obtain the region of interest and then extracts textural
features from that region. After this, three methods are proposed to perform
the classification, which combine ABC with the k-nearest neighbor, SVM with
linear kernel, and SVM with Gaussian kernel, respectively. The computational
experiments reported in the article show that the best results are obtained with the
third method.
Zhang et al. describe a system that classifies three-dimensional MR brain images
and can distinguish images corresponding to Alzheimer’s disease, mild cognitive
impairment, and normal cases [74]. Although other methods initially determine the
region of interest and then focus on it, this method considers the entire brain, so it
is not necessary to apply a segmentation operation. The article analyzes the use of
several SVM variants whose parameters are defined by the PSO algorithm with time-varying
acceleration coefficients. This PSO method modifies over time the weights
of the components used to update the velocity of the particles (it gives more weight
to the cognitive component in the early stage and more weight to the social
component in the late stage).
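A simple way to obtain time-varying acceleration coefficients is a linear schedule, sketched below. The start and end values (2.5 and 0.5) and the linear form are assumptions chosen for illustration; the article may use a different schedule.

```python
def tvac_coefficients(t, t_max, c1_start=2.5, c1_end=0.5, c2_start=0.5, c2_end=2.5):
    """Return (cognitive, social) coefficients for iteration t of t_max."""
    fraction = t / t_max
    cognitive = c1_start + (c1_end - c1_start) * fraction
    social = c2_start + (c2_end - c2_start) * fraction
    return cognitive, social


# The cognitive weight dominates at the start, the social weight at the end.
schedule = [tvac_coefficients(t, 100) for t in (0, 50, 100)]
```

With this schedule the particles explore around their own best positions first and are progressively pulled toward the best position found by the swarm.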
In the solution proposed by Ahmed et al., the classification is carried out using a
method that combines GWO and SVM [75]. In this case, GWO is used to select the
SVM parameters, and the kernel function used by SVM is the Gaussian radial basis
function.
6 Image Registration and Fusion
As indicated in the introduction section, medical images can be of different
modalities since they can be obtained using different techniques (x-ray, PET,
SPECT, etc.). It is common to use different types of images when evaluating a
patient, to obtain more information related to pathologies and decide the appropriate
treatment. In other cases, several images of the same type taken at different times are
used. For images to provide reliable and useful information, they must be properly
combined or fused (Fig. 4). Before images can be fused, they must be geometrically
and temporally aligned. This alignment operation is called registration. Therefore,
image fusion is a general operation that includes image registration as an initial step.
Fig. 4 Fused image obtained from two brain images
Different approaches have been applied to tackle the image registration problem.
One of these approaches is defined by the intensity-based techniques. These
techniques use image intensity values (color or gray level) to calculate similarity
measures between the images. This information is used to calculate the transformation that maximizes the value of a similarity metric by searching a certain
transformation space and comparing intensity patterns. An advantage of these
methods is that they do not require the prior application of a feature extraction
or image segmentation operation. These methods use a similarity metric that
determines the match between the features or intensity values of two images. There
are several similarity metrics that have been used successfully in multimodal image
registration, such as mutual information [76], normalized mutual information [77],
or Renyi entropy [78]. On the other hand, the methods apply a search strategy to
optimize the similarity metric. Powell’s method and the conjugate gradient [79] are
two local methods commonly used in image registration to optimize the similarity
metric. Several global methods have also been applied with this purpose, including
genetic algorithms, simulated annealing, and swarm-based methods.
In summary, intensity-based image registration methods include three important
elements: finding a transformation that aligns an image with another taken as a
reference, choosing a similarity metric that measures the similarity between these
two images, and using an optimization technique to find the optimal transformation
parameters that maximize the similarity measure.
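As an example of a similarity metric, the sketch below estimates the mutual information between two images from their joint intensity distribution. The images are given as flat lists of gray levels of equal length, which is a simplification assumed for this example, as is the function name.

```python
import math
from collections import Counter


def mutual_information(image_a, image_b):
    """Mutual information (in nats) between two equally sized intensity lists."""
    n = len(image_a)
    joint = Counter(zip(image_a, image_b))
    marginal_a = Counter(image_a)
    marginal_b = Counter(image_b)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        mi += p_ab * math.log(p_ab / ((marginal_a[a] / n) * (marginal_b[b] / n)))
    return mi


# Hypothetical four-pixel images.
reference = [0, 0, 1, 1]
aligned = [0, 0, 1, 1]
unrelated = [0, 1, 0, 1]
```

In a registration method, a search strategy such as a swarm algorithm would propose transformation parameters, apply them to the floating image, and keep the parameters that yield the highest value of this metric with respect to the reference image.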
The following describes several articles that apply swarm-based methods for
medical image registration.
PSO was applied for registration of medical images in [80]. Said method was
used as a search strategy in a solution that is applied to images obtained from
different modalities. Specifically, PSO was used to maximize the similarity metric
for registering single slice medical images to three-dimensional volumes. The article
analyzes three PSO variants. The first one includes crossover operators to update
the position and velocity of the particles. The second variant is based on the first
proposal but considers five subpopulations that are initialized using the well-known
K-means algorithm. The third variant includes three main modifications. Powell’s
method is applied to the initial position of the particles, and then particles are
generated around the position defined by said method. In addition, this PSO variant
includes a constriction coefficient in the expression that updates the velocity of the
particles and a relaxed convergence criterion. Once the PSO operations have been
completed, the three PSO variants apply Powell’s local optimization method to the
best particle in the swarm. In the case of the second variant, which considers five
subpopulations, this method is applied to the best particle of each subpopulation.
Talbi and Batouche adapted PSO for multimodal medical image registration
[81]. With this objective, they defined a differential evolution operator to improve
the best solution of each particle. Differential evolution is an optimization technique
that solves a problem by iteratively improving a candidate solution using an
evolutionary process [82]. The solution proposed in this article alternately applies
the basic PSO operations and the differential evolution operation; that is, one
iteration applies the equations to update the position and velocity of each particle,
and the next iteration applies the differential evolution operator.
Abdel-Basset et al. applied PSO as search strategy and used modified mutual
information as similarity metric [83]. The modified mutual information includes
spatial image information by using a linear combination of image intensity and
image gradient vector flow intensity.
Dida et al. compare the use of GWO and PSO for multimodal registration of
human brain CT and MR images [84]. In this case, the normalized mutual
information is used as the similarity metric. The results included in the article indicate
that GWO outperforms PSO.
Xiaogang et al. define a multi-resolution medical image registration method
based on a wavelet transformation that combines FA with Powell’s method [85].
This proposal uses normalized mutual information as similarity measure. The
image registration process based on the multi-resolution strategy using the wavelet
transform includes two parts: the rough registration with low sampling resolution
and the fine registration with high sampling resolution. In the proposed method,
both images are decomposed using wavelet transformation, thereby obtaining the
associated low-level resolution images. Then, the registration operation is applied.
To do this, FA is first used to obtain the approximate registration result from the
low-resolution images. The results of this method are used as the initial solution to
apply Powell’s method, which is applied to the high-resolution images to obtain a
better registration result.
Yang et al. described a method for nonrigid multimodal image registration
[86]. Image registration methods can be classified as rigid and nonrigid. The main
difference is that the transformations applied in rigid methods do not change the
shape of the objects, while those applied in nonrigid methods do. The solution
proposed by Yang et al. uses CSO as optimization technique and the normalized
mutual information as similarity criterion. CSO imitates two behaviors of cats,
called seeking mode and tracing mode. The modified CSO used in this article
incorporates the limited-memory Broyden–Fletcher–Goldfarb–Shanno (L-BFGS) method into the seeking
mode; L-BFGS is a commonly used method for optimizing the parameters of the
deformation model in nonrigid image registration. In addition, it includes the
roulette wheel method in the tracing mode.
The registration method described in [87] uses a similarity metric called
enhanced mutual information, defined in [88], and applies an optimization strategy
that combines CUS and Powell’s method. This solution combines local and global
optimization to improve the results. CUS is applied first to perform a global search.
Then, Powell’s method is applied to perform a local search around the best solution
obtained by CUS.
Several methods have been proposed that combine the pulse-coupled neural
network (PCNN) [89] with swarm-based algorithms to perform medical image
fusion. This neural network is efficient to perform this operation but uses a set of
parameters that are difficult to configure.
The research described in [90] presents a method that applies artificial ants for
fusing multimodal medical images. The method first applies artificial ants for edge
detection and optimization and then uses this information as input for a simplified
PCNN that generates the fused image. The method was applied to brain images.
Xu et al. defined a method to fuse multimodal medical images based on the
use of an adaptive PCNN that is optimized by a modified PSO, called quantum-behaved PSO (QPSO) [91]. A basic difference between PSO and QPSO is that in the
second method, the state of a particle is not defined by its position and velocity,
but by a wave function [92]. Xu et al. used QPSO to set the PCNN parameters and
defined a fitness function for QPSO that combines three evaluation criteria: average
gradient, image entropy, and spatial frequency.
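The three evaluation criteria mentioned above can be computed for a small gray-level image as in the following sketch. The definitions are the usual ones for fusion quality measures, but the way they are combined (a plain unweighted sum) and all function names are assumptions of this example, since the exact combination used in [91] is not given here.

```python
import math


def image_entropy(img, levels=256):
    """Shannon entropy (in bits) of the gray-level distribution."""
    counts = [0] * levels
    for row in img:
        for value in row:
            counts[value] += 1
    n = sum(counts)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)


def spatial_frequency(img):
    """Square root of the summed row and column squared differences per pixel."""
    rows, cols = len(img), len(img[0])
    rf = sum((img[r][c] - img[r][c - 1]) ** 2
             for r in range(rows) for c in range(1, cols))
    cf = sum((img[r][c] - img[r - 1][c]) ** 2
             for r in range(1, rows) for c in range(cols))
    return math.sqrt((rf + cf) / (rows * cols))


def average_gradient(img):
    """Mean magnitude of the local horizontal and vertical differences."""
    rows, cols = len(img), len(img[0])
    total = 0.0
    for r in range(rows - 1):
        for c in range(cols - 1):
            dx = img[r][c + 1] - img[r][c]
            dy = img[r + 1][c] - img[r][c]
            total += math.sqrt((dx * dx + dy * dy) / 2)
    return total / ((rows - 1) * (cols - 1))


def fusion_fitness(img):
    # Unweighted sum: an assumption made only for this illustration.
    return image_entropy(img) + spatial_frequency(img) + average_gradient(img)


flat_image = [[5, 5], [5, 5]]
checkerboard = [[0, 1], [1, 0]]
```

A fused image with more detail scores higher on all three measures, so a swarm that tunes the PCNN parameters would be driven toward parameter values that preserve detail.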
The proposal described in [93] combines PCNN with SFLA for the fusion of
CT and SPECT brain images. SFLA was used to optimize the PCNN parameters.
First, the intensity-hue-saturation (IHS) transform of each original image is decomposed
using a nonsubsampled contourlet transform (NSCT). This operation generates low-frequency and high-frequency images for each original image. The method that
combines PCNN and SFLA is used to fuse both high-frequency images, resulting
in a high-frequency fused image. The same method is applied to fuse the low-frequency images to generate the low-frequency fused image. The final fused image
is obtained by applying the inverse NSCT and inverse IHS transforms.
Scaling-based techniques are commonly used in multimodal image fusion.
Daniel et al. describe a mask-based technique for multimodal image fusion that uses
GWO to select the optimal scale values [94]. Mask-based techniques are controlled
by the gain factor called scale value. In general, mask-based methods use static scale
values, regardless of the input images considered. In contrast, the method in this article
dynamically adjusts the scale value by means of GWO. The mutual information metric
is used to define the GWO fitness function. The method first transforms the two
original images into Fourier space. Then, the Fourier spectrum of the input images
is optimally scaled using scale values obtained by the GWO algorithm. The resulting
spectrum masks corresponding to the two images are fused using a pixel-based averaging
rule. The resulting fused image is obtained in the Fourier domain, so the inverse
Fourier transform is used to obtain the spatial domain fused image.
The method described by Daniel et al. in [95] proposes another mask-based
method that shares some characteristics with the one described above. In this case,
GWO is also used to select the optimum scale values, but in addition, CUS is used
to select the random control parameters of the GWO algorithm. On the other hand,
GWO uses the same fitness function as in the previous case. Unlike the previous
solution, in this case, each original image is filtered by two masking filters (wavelet
filter and Laplacian filter). The filtered input images are scaled using optimal scale
values selected using GWO. Then, the Laplacian and wavelet mask corresponding
to each original image are fused, generating a mask for each original image. The
last operation fuses these two masks to generate the final image.
The method proposed in [96] uses the binary CRS optimization algorithm and
discrete wavelet transform. The method was applied to MR and CT image fusion.
Both images are decomposed using discrete wavelet transform, producing four
subband images that contain approximation and detailed coefficients. An initial
fusion is performed that combines the detailed coefficients of an image with the
approximation coefficients of the other image. Then, the final fusion is performed
by applying an optimal fusion rule whose parameters are optimally selected by the
swarm-based method.
7 Conclusions
The popularity of swarm-based algorithms has increased in recent years, and they
have been successfully applied to solve complex problems in different fields.
These methods use a set of very simple individuals and each of them looks for a
solution in the search space of the problem. The final solution to the problem will
be the best solution found by the swarm during the search process. Individuals in
the swarm share information to guide their search to promising areas of the search
space. Another important feature of these methods is that all the individuals perform
similar operations and there is no central control. The characteristics of these models
make them easy to implement.
This chapter shows a review of several interesting applications of swarm-based
solutions for processing medical images. Processing these images is not an easy
task, due to the large amount of information that must be handled, the different
image formats, and the variety of operations that can be applied to an image.
As indicated in the previous sections, processing an image requires carrying out
successive operations on it. For this reason, image
processing usually combines several techniques that are applied to each of these
operations. The description in this chapter shows how swarm-based methods can
be combined with other methods to define a system that applies certain processing
to a medical image. For example, different systems that combine artificial neural
networks and swarm algorithms have been described. In this way, the system defined
to process the image benefits from the advantages offered by each of the methods
integrated in the system.
Medical image processing is a very interesting field that offers the possibility
of further work on the application of swarm algorithms to improve systems that
analyze images and allow diagnoses.
A.1 Appendix A. Flowcharts of Swarm-Based Algorithms
This appendix shows the flowcharts (Figs. 5, 6, 7, 8, and 9) of the swarm-based
methods whose algorithms are outlined in Sect. 2. Tables 1, 2, 3, 4, and
5 describe the variables used in the flowcharts.
BEGIN
  INITIALIZE THE POPULATION
  t ← 1
  WHILE t < Tmax
    UPDATE THE POPULATION
    UPDATE THE BEST SOLUTION FOUND BY THE POPULATION
    t ← t+1
END
Fig. 5 Main operations of a swarm-based algorithm
BEGIN
  i ← 1
  WHILE i < N
    vi(t+1) ← w·vi(t) + φ1·ε1[bi(t) − xi(t)] + φ2·ε2[g(t) − xi(t)]
    xi(t+1) ← xi(t) + vi(t+1)
    IF fit(xi(t+1)) > fit(bi(t))
      bi(t+1) ← xi(t+1)
    ELSE
      bi(t+1) ← bi(t)
    i ← i+1
END
Fig. 6 Flowchart of the operation that updates the population in the PSO algorithm
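The operations of Figs. 5 and 6 can be sketched in Python as follows. This is a minimal illustration for a maximization problem, not a reference implementation: the default values of w, φ1, and φ2, the initialization range, and the function names are assumptions of this example.

```python
import random


def pso_update(positions, velocities, personal_best, global_best, fit,
               w=0.7, phi1=1.5, phi2=1.5, rng=random):
    """One pass over all particles, as in Fig. 6 (maximization)."""
    for i in range(len(positions)):
        x, v, b = positions[i], velocities[i], personal_best[i]
        # Inertia term + pull toward personal best + pull toward global best.
        new_v = [w * vd + phi1 * rng.random() * (bd - xd)
                 + phi2 * rng.random() * (gd - xd)
                 for xd, vd, bd, gd in zip(x, v, b, global_best)]
        new_x = [xd + vd for xd, vd in zip(x, new_v)]
        velocities[i], positions[i] = new_v, new_x
        if fit(new_x) > fit(b):
            personal_best[i] = new_x


def pso_maximize(fit, dim, n_particles=20, t_max=100, seed=0):
    """Outer loop of Fig. 5 with the PSO update plugged in."""
    rng = random.Random(seed)
    positions = [[rng.uniform(-5.0, 5.0) for _ in range(dim)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    global_best = max(personal_best, key=fit)
    for _ in range(t_max):
        pso_update(positions, velocities, personal_best, global_best, fit, rng=rng)
        global_best = max(personal_best, key=fit)
    return global_best


def negative_sphere(x):
    """Toy objective whose maximum (zero) is at the origin."""
    return -sum(v * v for v in x)


best = pso_maximize(negative_sphere, dim=2)
```

The personal best of a particle is replaced only when the new position is strictly better, so its fitness never decreases from one iteration to the next.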
BEGIN
  SORT FIREFLIES BY INCREASING BRIGHTNESS
  i ← 1
  WHILE i < N-1
    S ← 0
    j ← i+1
    WHILE j < N
      S ← S + β(rij)·(xj(t) − xi(t))
      j ← j+1
    xi(t+1) ← xi(t) + S + εi
    i ← i+1
  xN(t+1) ← xN(t) + ε
END
Fig. 7 Flowchart of the operation that updates the population in the FA algorithm
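A compact Python sketch of the firefly update of Fig. 7 is given below. Each firefly moves toward every brighter one with an attractiveness that decays with the squared distance, and the brightest firefly only performs a random move. The parameter values, the uniform noise scaled by alpha, and the function name are assumptions of this example.

```python
import math
import random


def firefly_update(positions, fit, beta0=1.0, gamma=1.0, alpha=0.1, rng=random):
    """Return the new positions, ordered by increasing brightness (fitness)."""
    ordered = sorted(positions, key=fit)
    n = len(ordered)
    new_positions = []
    for i in range(n):
        xi = ordered[i]
        step = [0.0] * len(xi)
        # Fireflies after position i in the ordering are at least as bright.
        for j in range(i + 1, n):
            xj = ordered[j]
            r2 = sum((a - b) ** 2 for a, b in zip(xi, xj))
            beta = beta0 * math.exp(-gamma * r2)
            step = [s + beta * (b - a) for s, a, b in zip(step, xi, xj)]
        noise = [alpha * (rng.random() - 0.5) for _ in xi]
        new_positions.append([a + s + e for a, s, e in zip(xi, step, noise)])
    return new_positions


def closeness_to_one(x):
    """Toy brightness: larger when the position is closer to 1."""
    return -abs(x[0] - 1.0)


# With alpha = 0 the random term vanishes and the move is deterministic.
moved = firefly_update([[0.0], [1.0]], closeness_to_one, alpha=0.0)
```

For the brightest firefly the inner loop is empty, so its move reduces to the random term, as in the last assignment of the flowchart.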
BEGIN
  i ← 1
  WHILE i < N
    Ti(1) ← Random node in the graph
    j ← 2
    WHILE j < NC
      Define the next stop, Ti(j), according to the probabilistic transition rule
      j ← j+1
    Li ← length of Ti
    i ← i+1
  i ← 1
  WHILE i < N
    j ← 1
    WHILE j < NC-1
      INC(Ti(j), Ti(j+1)) ← INC(Ti(j), Ti(j+1)) + 1/Li
      j ← j+1
    INC(Ti(NC), Ti(1)) ← INC(Ti(NC), Ti(1)) + 1/Li
    i ← i+1
  τ(t+1) ← (1−ρ)τ(t) + INC
END
Fig. 8 Flowchart of the operation that updates the population in the ant-based algorithm
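The pheromone update at the end of Fig. 8 can be sketched as follows. Each ant deposits 1/Li on every connection of its closed tour, including the return to the first node, and the previous pheromone evaporates at rate ρ. Storing the pheromone as a directed matrix and the names used are assumptions of this example; tour construction is not shown.

```python
def update_pheromone(tau, tours, lengths, rho=0.1):
    """Return the pheromone matrix after evaporation and deposit."""
    n = len(tau)
    inc = [[0.0] * n for _ in range(n)]
    for tour, length in zip(tours, lengths):
        # Pair each node with its successor; the last node links to the first.
        for a, b in zip(tour, tour[1:] + tour[:1]):
            inc[a][b] += 1.0 / length
    return [[(1 - rho) * tau[a][b] + inc[a][b] for b in range(n)]
            for a in range(n)]


# Hypothetical three-node graph visited by a single ant.
initial_tau = [[1.0] * 3 for _ in range(3)]
new_tau = update_pheromone(initial_tau, tours=[[0, 1, 2]], lengths=[3.0], rho=0.5)
```

Connections that belong to short tours receive larger deposits, so they become more likely to be chosen by the probabilistic transition rule in later iterations.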
BEGIN
  SORT FROGS BY FITNESS
  SPLIT THE FROGS INTO M MEMEPLEXES
  m ← 1
  WHILE m < M
    j ← 1
    WHILE j < Jmax
      B ← BEST FROG IN THE MEMEPLEX m
      W ← WORST FROG IN THE MEMEPLEX m
      X'W ← XW + ε1(XB − XW)
      IF fit(X'W) > fit(XW)
        XW ← X'W
      ELSE
        X'W ← XW + ε2(g(t) − XW)
        IF fit(X'W) > fit(XW)
          XW ← X'W
        ELSE
          XW ← ε3
      j ← j+1
    m ← m+1
  GROUP THE FROGS OF ALL THE MEMEPLEXES
END
Fig. 9 Flowchart of the operation that updates the population in the SFLA algorithm
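The inner step of Fig. 9, which tries to improve the worst frog of a memeplex, can be sketched as follows. The worst frog first jumps toward the best frog of its memeplex, then toward the global best, and is replaced by a random frog if neither jump improves it. Using a single random scalar per jump and sampling the random frog uniformly in [lower, upper] are assumptions of this example.

```python
import random


def improve_worst_frog(memeplex, global_best, fit, lower, upper, rng=random):
    """Update the worst frog of the memeplex in place (maximization)."""
    best = max(memeplex, key=fit)
    worst_idx = min(range(len(memeplex)), key=lambda k: fit(memeplex[k]))
    worst = memeplex[worst_idx]
    for leader in (best, global_best):
        r = rng.random()
        candidate = [w + r * (l - w) for w, l in zip(worst, leader)]
        if fit(candidate) > fit(worst):
            memeplex[worst_idx] = candidate
            return
    # Neither jump improved the frog: replace it with a random one.
    memeplex[worst_idx] = [rng.uniform(lower, upper) for _ in worst]


def closeness(x):
    """Toy fitness: larger when the position is closer to 1."""
    return -abs(x[0] - 1.0)


frogs = [[0.0], [4.0]]
improve_worst_frog(frogs, global_best=[1.0], fit=closeness,
                   lower=0.0, upper=2.0, rng=random.Random(1))
```

Repeating this step Jmax times for each memeplex and then regrouping all the frogs gives the population update described in the flowchart.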
Table 1 Variables used in the flowcharts
Variable      Description
Tmax          Number of iterations performed by the algorithm
t             Current iteration of the swarm-based algorithm
N             Population size
fit(x)        The fitness or quality of the solution x
xi(t)         Position of the individual i (a solution to the problem) at iteration t
g(t)          Best solution found by the population at iteration t

Table 2 Variables used in PSO algorithm flowchart
Variable      Description
vi            Velocity of particle i
bi            Personal best position of particle i
w, φ1, φ2     Weights to determine the relative influence of the addends
ε1, ε2        Random vectors

Table 3 Variables used in FA algorithm flowchart
Variable      Description
β(rij)        Attractiveness between fireflies i and j. It can be computed by the equation: $\beta(r_{ij}) = \beta_0 e^{-\gamma r_{ij}^{2}}$
β0            Attractiveness at distance 0
γ             Light absorption coefficient
rij           Distance between xi and xj
ε1, ε2        Random vectors

Table 4 Variables used in ant-based algorithm flowchart
Variable      Description
NC            Number of nodes in the graph
τij           Pheromone of the connection (i, j)
Ti            Tour defined by ant i that includes NC nodes
Li            Length of Ti (cost of all the connections included in the tour)
ρ             Evaporation rate of the pheromone

Table 5 Variables used in SFLA algorithm flowchart
Variable      Description
M             Number of memeplexes
Jmax          Number of iterations applied to improve each memeplex
ε1, ε2, ε3    Random vectors

The description of the algorithms considers that a maximization problem will be
solved.
References
1. Panigrahi, B. K., Shi, Y., & Lim, M. H. (Eds.). (2011). Handbook of swarm intelligence:
Concepts, principles and applications (Vol. 8). Springer.
2. Abraham, A., Guo, H., & Liu, H. (2006). Swarm intelligence: Foundations, perspectives and
applications. In Swarm intelligent systems (pp. 3–25). Springer.
3. Abdulrahman, S. M. (2017). Using Swarm Intelligence for solving NP-Hard Problems.
Academic Journal of Nawroz University, 6(3), 46–50.
4. Hassanien, A. E., & Emary, E. (2018). Swarm intelligence: Principles, advances, and
applications. CRC Press.
5. Dorigo, M., & Stützle, T. (2019). Ant colony optimization: Overview and recent advances.
In Handbook of metaheuristics (International series in operations research & management
science) (Vol. 272, pp. 311–351). Springer.
6. Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. In Proceedings of ICNN'95 -
International conference on neural networks (Vol. 4, pp. 1942–1948). IEEE.
https://doi.org/10.1109/ICNN.1995.488968
7. Yang, X. S., & He, X. (2013). Firefly algorithm: recent advances and applications. International
Journal of Swarm Intelligence, 1(1), 36–50.
8. Eusuff, M. M., & Lansey, K. E. (2003). Optimization of water distribution network design
using the shuffled frog leaping algorithm. Journal of Water Resources Planning and Management, 129(3), 210–225.
9. Karaboga, D., & Basturk, B. (2007). A powerful and efficient algorithm for numerical function
optimization: Artificial bee colony (ABC) algorithm. Journal of Global Optimization, 39(3),
459–471.
10. Passino, K. M. (2002). Biomimicry of bacterial foraging for distributed optimization and
control. IEEE Control Systems Magazine, 22(3), 52–67.
11. Yang, X. S. (2010). A new metaheuristic bat-inspired algorithm. In Nature inspired cooperative
strategies for optimization (NICSO 2010) (Studies in computational intelligence) (Vol. 284, pp.
65–74). Springer. https://doi.org/10.1007/978-3-642-12538-6_6.
12. Chu, S. C., & Tsai, P. W. (2007). Computational intelligence based on the behavior of cats.
International Journal of Innovative Computing, Information and Control, 3(1), 163–173.
13. Askarzadeh, A. (2016). A novel metaheuristic method for solving constrained engineering
optimization problems: Crow search algorithm. Computers and Structures, 169, 1–12.
14. Yang, X. S., & Deb, S. (2009). Cuckoo search via Lévy flights. In 2009 World congress on
nature & biologically inspired computing (NaBIC) (pp. 210–214). IEEE.
15. Yang, X. S., Karamanoglu, M., & He, X. (2014). Flower pollination algorithm: A novel
approach for multiobjective optimization. Engineering Optimization, 46(9), 1222–1237.
16. Mirjalili, S., Mirjalili, S. M., & Lewis, A. (2014). Grey wolf optimizer. Advances in
Engineering Software, 69, 46–61.
17. Jona, J. B., & Nagaveni, N. (2014). Ant-cuckoo colony optimization for feature selection in
digital mammogram. Pakistan Journal of Biological Sciences: PJBS, 17(2), 266–271.
18. Sudha, M. N., & Selvarajan, S. (2016). Feature selection based on enhanced cuckoo search
for breast cancer classification in mammogram image. Circuits and Systems, 7(04), 327–338.
https://doi.org/10.4236/cs.2016.74028
19. Jothi, G. (2016). Hybrid tolerance rough set–firefly based supervised feature selection for MRI
brain tumor image classification. Applied Soft Computing, 46, 639–651.
20. Mac Parthalain, N., & Shen, Q. (2009). Exploring the boundary region of tolerance rough sets
for feature selection. Pattern Recognition, 42(5), 655–667.
21. Subashini, M. M., Sahoo, S. K., Sunil, V., & Easwaran, S. (2016). A non-invasive methodology
for the grade identification of astrocytoma using image processing and artificial intelligence
techniques. Expert Systems with Applications, 43, 186–196.
22. Sahoo, A., & Chandra, S. (2017). Multi-objective grey wolf optimizer for improved cervix
lesion classification. Applied Soft Computing, 52, 64–80.
23. Shankar, K., Lakshmanaprabu, S. K., Khanna, A., Tanwar, S., Rodrigues, J. J., & Roy,
N. R. (2019). Alzheimer detection using group grey wolf optimization based features with
convolutional classifier. Computers and Electrical Engineering, 77, 230–243.
24. Tan, T. Y., Zhang, L., Neoh, S. C., & Lim, C. P. (2018). Intelligent skin cancer detection using
enhanced particle swarm optimization. Knowledge-Based Systems, 158, 118–135.
25. Kaur, T., Saini, B. S., & Gupta, S. (2018). A novel feature selection method for brain tumor MR
image classification based on the Fisher criterion and parameter-free bat optimization. Neural
Computing and Applications, 29(8), 193–206.
26. Dandu, J. R., Thiyagarajan, A. P., Murugan, P. R., & Govindaraj, V. (2020). Brain and
pancreatic tumor segmentation using SRM and BPNN classification. Health Technology, 10(1),
187–195.
27. Cinsdikici, M. G., & Aydin, D. (2009). Detection of blood vessels in ophthalmoscope images
using MF/ant (matched filter/ant colony) algorithm. Computer Methods and Programs in
Biomedicine, 96(2), 85–95.
28. Dorigo, M., & Gambardella, L. M. (1997). Ant colony system: A cooperative learning approach
to the traveling salesman problem. IEEE Transactions on Evolutionary Computation, 1(1), 53–
66.
29. Hancer, E., Ozturk, C., & Karaboga, D. (2013). Extraction of brain tumors from MRI
images with artificial bee colony based segmentation methodology. In 2013 8th International
conference on electrical and electronics engineering (ELECO) (pp. 516–520). IEEE.
https://doi.org/10.1109/ELECO.2013.6713896
30. Mostafa, A., Fouad, A., Abd Elfattah, M., Hassanien, A. E., Hefny, H., Zhu, S. Y., & Schaefer,
G. (2015). CT liver segmentation using artificial bee colony optimisation. Procedia Computer
Science, 60, 1622–1630.
31. Bezdek, J. C. (2013). Pattern recognition with fuzzy objective function algorithms. Springer.
32. MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations. In Proceedings of the fifth Berkeley symposium on mathematical statistics and probability
(Vol. 1, No. 14) (pp. 281–297). University of California Press.
33. Taherdangkoo, M., Yazdi, M., & Rezvani, M. H. (2010). Segmentation of MR brain images
using FCM improved by artificial bee colony (ABC) algorithm. In Proceedings of the 10th
IEEE international conference on information technology and applications in biomedicine (pp.
1–5). IEEE. https://doi.org/10.1109/ITAB.2010.5687803.
34. Shen, S., Sandham, W., Granat, M., & Sterr, A. (2005). MRI fuzzy segmentation of brain
tissue using neighborhood attraction with neural-network optimization. IEEE Transactions on
Information Technology in Biomedicine, 9(3), 459–467.
35. Forghani, N., Forouzanfar, M., & Forouzanfar, E. (2007). MRI fuzzy segmentation of brain
tissue using IFCM algorithm with particle swarm optimization. In 2007 22nd International
symposium on computer and information sciences (pp. 1–4). IEEE.
36. Mekhmoukh, A., & Mokrani, K. (2015). Improved fuzzy c-means based particle swarm
optimization (PSO) initialization and outlier rejection with level set methods for MR brain
image segmentation. Computer Methods and Programs in Biomedicine, 122(2), 266–281.
37. Mizutani, K., & Miyamoto, S. (2005). Possibilistic approach to kernel-based fuzzy c-means
clustering with entropy regularization. In International conference on modeling decisions for
artificial intelligence (pp. 144–155). Springer.
38. Alagarsamy, S., Kamatchi, K., Govindaraj, V., & Thiyagarajan, A. (2017). A fully automated
hybrid methodology using cuckoo-based fuzzy clustering technique for magnetic resonance
brain image segmentation. International Journal of Imaging Systems and Technology, 27(4),
317–332.
39. Alagarsamy, S., Kamatchi, K., Govindaraj, V., Zhang, Y. D., & Thiyagarajan, A. (2019).
Multi-channeled MR brain image segmentation: A new automated approach combining bat
and clustering technique for better identification of heterogeneous tumors. Biocybernetics and
Biomedical Engineering, 39(4), 1005–1035.
40. Kavitha, P., & Prabakaran, S. (2019). A novel hybrid segmentation method with particle swarm
optimization and fuzzy c-mean based on partitioning the image for detecting lung cancer.
International Journal of Engineering and Advanced Technology, 8(5), 1223–1227.
41. Otsu, N. (1979). A threshold selection method from gray-level histograms. IEEE Transactions
on Systems, Man, and Cybernetics, 9(1), 62–66.
42. Kapur, J. N., Sahoo, P. K., & Wong, A. K. (1985). A new method for gray-level picture
thresholding using the entropy of the histogram. Computer Vision, Graphics, and Image
Processing, 29(3), 273–285.
43. Tsallis, C. (1988). Possible generalization of Boltzmann-Gibbs statistics. Journal of Statistical
Physics, 52(1), 479–487.
44. Li, C. H., & Lee, C. K. (1993). Minimum cross entropy thresholding. Pattern Recognition,
26(4), 617–625.
45. Huang, P., Cao, H., & Luo, S. (2008). An artificial ant colonies approach to medical image
segmentation. Computer Methods and Programs in Biomedicine, 92(3), 267–273.
46. Lee, M. E., Kim, S. H., Cho, W. H., Park, S. Y., & Lim, J. S. (2009). Segmentation
of brain MR images using an ant colony optimization algorithm. In 2009 Ninth IEEE
international conference on bioinformatics and bioengineering (pp. 366–369). IEEE.
https://doi.org/10.1109/BIBE.2009.58
47. Menon, N., & Ramakrishnan, R. (2015). Brain tumor segmentation in MRI images using
unsupervised artificial bee colony algorithm and FCM clustering. In 2015 International
conference on communications and signal processing (ICCSP) (pp. 0006–0009). IEEE.
https://doi.org/10.1109/ICCSP.2015.7322635
48. Ma, M., Liang, J., Guo, M., Fan, Y., & Yin, Y. (2011). SAR image segmentation based on
artificial bee colony algorithm. Applied Soft Computing, 11(8), 5205–5214.
49. Li, Y., Jiao, L., Shang, R., & Stolkin, R. (2015). Dynamic-context cooperative quantum-behaved
particle swarm optimization based on multilevel thresholding applied to medical
image segmentation. Information Sciences, 294, 408–422.
50. Li, Y., Xiang, R., Jiao, L., & Liu, R. (2012). An improved cooperative quantum-behaved
particle swarm optimization. Soft Computing, 16(6), 1061–1069.
51. Li, Y., Bai, X., Jiao, L., & Xue, Y. (2017). Partitioned-cooperative quantum-behaved particle
swarm optimization based on multilevel thresholding applied to medical image segmentation.
Applied Soft Computing, 56, 345–356.
52. Rajinikanth, V., Raja, N. S. M., & Kamalanand, K. (2017). Firefly algorithm assisted
segmentation of tumor from brain MRI using Tsallis function and Markov random field.
Journal of Control Engineering and Applied Informatics, 19(3), 97–106.
53. Zhang, Y., Brady, M., & Smith, S. (2001). Segmentation of brain MR images through a hidden
Markov random field model and the expectation-maximization algorithm. IEEE Transactions
on Medical Imaging, 20(1), 45–57.
54. Chakraborty, S., Chatterjee, S., Dey, N., Ashour, A. S., Ashour, A. S., Shi, F., & Mali, K.
(2017). Modified cuckoo search algorithm in microscopic image segmentation of hippocampus. Microscopy Research and Technique, 80(10), 1051–1072.
55. Wang, R., Zhou, Y., Zhao, C., & Wu, H. (2015). A hybrid flower pollination algorithm based
modified randomized location for multi-threshold medical image segmentation. Bio-Medical
Materials and Engineering, 26(s1), S1345–S1351. https://doi.org/10.3233/BME-151432
56. Pereira, C., Gonçalves, L., & Ferreira, M. (2013). Optic disc detection in color fundus images
using ant colony optimization. Medical & Biological Engineering & Computing, 51(3), 295–
303.
57. Pereira, C., Gonçalves, L., & Ferreira, M. (2015). Exudate segmentation in fundus images
using an ant colony optimization approach. Information Sciences, 296, 14–24.
58. Mandal, D., Chatterjee, A., & Maitra, M. (2014). Robust medical image segmentation using
particle swarm optimization aided level set based global fitting energy active contour approach.
Engineering Applications of Artificial Intelligence, 35, 199–214.
59. Chan, T. F., & Vese, L. A. (2001). Active contours without edges. IEEE Transactions on Image
Processing, 10(2), 266–277.
60. Ilunga-Mbuyamba, E., Cruz-Duarte, J. M., Avina-Cervantes, J. G., Correa-Cely, C. R., Lindner,
D., & Chalopin, C. (2016). Active contours driven by cuckoo search strategy for brain tumour
images segmentation. Expert Systems with Applications, 56, 59–68.
61. Wen, L., Wang, X., Wu, Z., Zhou, M., & Jin, J. S. (2015). A novel statistical cerebrovascular
segmentation algorithm with particle swarm optimization. Neurocomputing, 148, 569–577.
62. Zhang, Y. D., Wang, S., & Wu, L. (2010). A novel method for magnetic resonance brain image
classification based on adaptive chaotic PSO. Progress in Electromagnetics Research, 109,
325–343.
63. Wang, S., Zhang, Y., Dong, Z., Du, S., Ji, G., Yan, J., Yang, J., Wang, Q., Feng, C., & Phillips,
P. (2015a). Feed-forward neural network optimized by hybridization of PSO and ABC for
abnormal brain detection. International Journal of Imaging Systems and Technology, 25(2),
153–164.
64. Kıran, M. S., & Gündüz, M. (2013). A recombination-based hybridization of particle swarm
optimization and artificial bee colony algorithm for continuous optimization problems. Applied
Soft Computing, 13(4), 2188–2203.
65. Dheeba, J., Singh, N. A., & Selvi, S. T. (2014). Computer-aided detection of breast cancer on
mammograms: A swarm intelligence optimized wavelet neural network approach. Journal of
Biomedical Informatics, 49, 45–52.
66. Senapati, M. R., & Dash, P. K. (2013). Local linear wavelet neural network based breast tumor
classification using firefly algorithm. Neural Computing and Applications, 22(7), 1591–1598.
67. Parsian, A., Ramezani, M., & Ghadimi, N. (2017). A hybrid neural network-gray wolf
optimization algorithm for melanoma detection. Biomedical Research, 28(8), 3408–3411.
68. Ahmed, H. M., Youssef, B. A., Elkorany, A. S., Saleeb, A. A., & Abd El-Samie, F. (2018).
Hybrid gray wolf optimizer–artificial neural network classification approach for magnetic
resonance brain images. Applied Optics, 57(7), B25–B31.
69. de Pinho Pinheiro, C. A., Nedjah, N., & de Macedo Mourelle, L. (2020). Detection and
classification of pulmonary nodules using deep learning and swarm intelligence. Multimedia
Tools and Applications, 79(21), 15437–15465.
70. Surendar, P. (2021). Diagnosis of lung cancer using hybrid deep neural network with adaptive
sine cosine crow search algorithm. Journal of Computational Science, 53, 101374.
https://doi.org/10.1016/j.jocs.2021.101374
71. Mirjalili, S. (2016). SCA: A sine cosine algorithm for solving optimization problems.
Knowledge-Based Systems, 96, 120–133.
72. Zhang, Y., Wang, S., Ji, G., & Dong, Z. (2013). An MR brain images classifier system
via particle swarm optimization and kernel support vector machine. The Scientific World
Journal. https://doi.org/10.1155/2013/130134
73. Agrawal, V., & Chandra, S. (2015). Feature selection using Artificial Bee Colony algorithm
for medical image classification. In 2015 Eighth international conference on contemporary
computing (IC3) (pp. 171–176). IEEE. https://doi.org/10.1109/IC3.2015.7346674
74. Zhang, Y., Wang, S., Phillips, P., Dong, Z., Ji, G., & Yang, J. (2015). Detection of Alzheimer’s
disease and mild cognitive impairment based on structural volumetric MR images using 3D-DWT
and WTA-KSVM trained by PSOTVAC. Biomedical Signal Processing and Control, 21,
58–73.
75. Ahmed, H. M., Youssef, B. A., Elkorany, A. S., Elsharkawy, Z. F., Saleeb, A. A., & Abd
El-Samie, F. (2019). Hybridized classification approach for magnetic resonance brain images
using gray wolf optimizer and support vector machine. Multimedia Tools and Applications,
78(19), 27983–28002.
76. Viola, P., & Wells, W. M. (1997). Alignment by maximization of mutual information.
International Journal of Computer Vision, 24(2), 137–154.
77. Studholme, C., Hill, D. L., & Hawkes, D. J. (1999). An overlap invariant entropy measure of
3D medical image alignment. Pattern Recognition, 32(1), 71–86.
78. He, Y., Hamza, A. B., & Krim, H. (2003). A generalized divergence measure for robust image
registration. IEEE Transactions on Signal Processing, 51(5), 1211–1220.
79. Maes, F., Vandermeulen, D., & Suetens, P. (1999). Comparative evaluation of multiresolution
optimization strategies for multimodality image registration by maximization of mutual
information. Medical Image Analysis, 3(4), 373–386.
80. Wachowiak, M. P., Smolkov, R., Zheng, Y., Zurada, J. M., & Elmaghraby, A. S. (2004). An
approach to multimodal biomedical image registration utilizing particle swarm optimization.
IEEE Transactions on Evolutionary Computation, 8(3), 289–301.
81. Talbi, H., & Batouche, M. (2004). Hybrid particle swarm with differential evolution for
multimodal image registration. In 2004 IEEE international conference on industrial technology
(IEEE ICIT '04) (Vol. 3, pp. 1567–1572). IEEE. https://doi.org/10.1109/ICIT.2004.1490800
82. Storn, R., & Price, K. (1997). Differential evolution–a simple and efficient heuristic for global
optimization over continuous spaces. Journal of Global Optimization, 11(4), 341–359.
83. Abdel-Basset, M., Fakhry, A. E., El-Henawy, I., Qiu, T., & Sangaiah, A. K. (2017). Feature
and intensity based medical image registration using particle swarm optimization. Journal of
Medical Systems, 41(12), 1–15.
84. Dida, H., Charif, F., & Benchabane, A. (2020). Grey wolf optimizer for multimodal medical
image registration. In 2020 Fourth international conference on intelligent computing in data
sciences (ICDS) (pp. 1–5). IEEE.
85. Xiaogang, D., Jianwu, D., Yangping, W., Xinguo, L., & Sha, L. (2013). An algorithm multiresolution medical image registration based on firefly algorithm and Powell. In 2013 Third
international conference on intelligent system design and engineering applications (pp. 274–
277). IEEE.
86. Yang, F., Ding, M., Zhang, X., Hou, W., & Zhong, C. (2015). Non-rigid multi-modal
medical image registration by combining L-BFGS-B with cat swarm optimization. Information
Sciences, 316, 440–456.
87. Shen, L., Huang, X., Fan, C., & Li, Y. (2018). Enhanced mutual information-based medical
image registration using a hybrid optimisation technique. Electronics Letters, 54(15), 926–
928.
88. Pradhan, S., & Patra, D. (2016). Enhanced mutual information based medical image registration. IET Image Processing, 10(5), 418–427. https://doi.org/10.1049/iet-ipr.2015.0346
89. Wang, Z., Ma, Y., Cheng, F., & Yang, L. (2010). Review of pulse-coupled neural networks.
Image and Vision Computing, 28(1), 5–13.
90. Kavitha, C. T., & Chellamuthu, C. (2014). Medical image fusion based on hybrid intelligence.
Applied Soft Computing, 20, 83–94.
91. Xu, X., Shan, D., Wang, G., & Jiang, X. (2016). Multimodal medical image fusion using PCNN
optimized by the QPSO algorithm. Applied Soft Computing, 46, 588–595.
92. Sun, J., Feng, B., & Xu, W. (2004). Particle swarm optimization with particles having quantum
behavior. In Proceedings of the 2004 congress on evolutionary computation (IEEE Cat. No.
04TH8753) (Vol. 1, pp. 325–331). IEEE.
93. Huang, C., Tian, G., Lan, Y., Peng, Y., Ng, E. Y. K., Hao, Y., Chen, Y., & Che, W. (2019).
A new pulse coupled neural network (PCNN) for brain medical image fusion empowered by
shuffled frog leaping algorithm. Frontiers in Neuroscience, 13, 210.
94. Daniel, E., Anitha, J., Kamaleshwaran, K. K., & Rani, I. (2017b). Optimum spectrum mask
based medical image fusion using gray wolf optimization. Biomedical Signal Processing and
Control, 34, 36–43.
95. Daniel, E., Anitha, J., & Gnanaraj, J. (2017). Optimum laplacian wavelet mask based
medical image using hybrid cuckoo search–grey wolf optimization algorithm. Knowledge-Based
Systems, 131, 58–69.
96. Parvathy, V. S., & Pothiraj, S. (2019). Multi-modality medical image fusion using hybridization
of binary crow search optimization. Health Care Management Science, 23, 661–669.
Left Ventricle Volume Analysis in
Cardiac MRI Images Using
Convolutional Neural Networks
Palakala Sai Krishna Yadhav, K. Susheel Kumar, and Nagendra Pratap Singh
1 Introduction
According to the World Health Organization (WHO), cardiovascular disease is a
leading cause of death globally, claiming the lives of 17.9 million people per year.
Consequently, a great deal of research has been, and continues to be, devoted to
cardiovascular diseases. Abnormalities of the left ventricle (LV), one of the larger
chambers of the heart, are especially dangerous and are associated with heart attacks.
Early diagnosis of heart disease is therefore important for reducing deaths and
improving patient health. Estimating the LV volume from cardiac MRI images is one of
the biggest challenges: the volume is estimated at systole, when the LV is
contracted, and at diastole, when the LV is expanded [1, 2].
Deep learning (DL), also known as hierarchical learning or deep structured
learning, is a sub-domain of machine learning. The term deep in the expression
deep learning refers to the number of layers in a neural network. DL has achieved
significant breakthroughs in many areas with outstanding performance. Following
these results, DL has been applied to high-dimensional data and to many complex
applications, where it gives state-of-the-art results compared with conventional
methods. In this chapter, some applications in which deep learning is widely used
are discussed, namely healthcare, self-driving or autonomous cars, NLP, speech
recognition, image recognition, cybersecurity, and automatic coloring, among the
many areas that are adopting deep learning. We also discuss some of the DL
architectures commonly used in these applications. Deep learning is becoming more
popular because of the large amount of data available for training the networks,
deep architectures, activation functions, and greater computational power. Deep
architectures have many hidden layers, which makes them more powerful and useful for
robust feature learning and thus increases performance.
P. S. K. Yadhav () · K. Susheel Kumar · N. P. Singh
Department of Computer Science and Engineering, National Institute of Technology Hamirpur,
Hamirpur, India
e-mail: ksusheel@nith.ac.in; nps@nith.ac.in
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
S. Pandey et al. (eds.), Role of Data-Intensive Distributed Computing Systems in
Designing Data Solutions, EAI/Springer Innovations in Communication and
Computing, https://doi.org/10.1007/978-3-031-15542-0_15
A deep learning algorithm is a neural network with more than three layers,
including the input and output layers. Deep learning has recently been applied in
several research areas, where it has achieved state-of-the-art results. DL algorithms
perform very well on large amounts of data, whether labeled or unlabeled. Nowadays,
large amounts of data are available for both supervised and unsupervised learning in
the form of audio, CSV files, images, and text. These data can be used to train DL
algorithms, which improves the performance of the models.
With conventional methods, feature extraction and prediction from data are
challenging tasks [3]. In the deep learning era this has changed, because DL
algorithms can learn from the data automatically, for example by finding patterns in
the data using many hidden layers. Additional hidden layers help the network
represent the data at a higher level and more effectively. Research problems that
previously required a lot of time for feature extraction, pre-processing, and
prediction are now solved more easily using DL algorithms, given sufficient
computational power. The computational resources available today are one of the
biggest advantages for deep learning: because of them, models take less time than
with the traditional ML approach.
As the performance of DL algorithms improved in research works, and because large
amounts of data are available, the depth of neural network models has increased
significantly, allowing for more data abstraction, such as patterns, edges, and
features, from the model's input. Another advantage of DL algorithms is that they can
learn from unlabeled data, provided that a large dataset is available. If the dataset
is small, data augmentation can be used to increase its size. Many research areas have
now started adopting deep learning, such as medical imaging, voice assistants,
self-driving cars, image recognition, fraud detection, advertising, and finance. In
medical imaging, for example, it can be applied to cardiovascular diseases to predict
whether a patient is normal or abnormal.
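As an illustration of data augmentation, the sketch below enlarges a small set of 2D images with simple flips and a rotation. The function name `augment` and the chosen transformations are assumptions; for medical images, only transformations that preserve the meaning of the label should be used.

```python
import numpy as np

def augment(images):
    """Return each image together with three transformed copies."""
    augmented = []
    for img in images:
        augmented.append(img)               # original
        augmented.append(np.fliplr(img))    # horizontal flip
        augmented.append(np.flipud(img))    # vertical flip
        augmented.append(np.rot90(img))     # 90-degree rotation
    return augmented
```

With these three transformations, a dataset of n images grows to 4n images.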
Similarly, many critical diseases can be predicted or diagnosed early using
medical images such as X-rays, CT scans, and MRI scans. Recently, DL algorithms have
been used to detect COVID-19 from CT scans of patients. As a further example, in 5G
and mobile networks, Zhang et al. [4] note that the number of mobile devices is
increasing exponentially and that processing the data exchanged between mobile
devices and the network with low battery consumption is a challenging task. DL is
used to tackle these challenges and improve the performance of mobile devices.
Barz et al. [5] proposed an approach for training DL algorithms on small
datasets. DL algorithms are usually given very large datasets and do not perform
well on small ones, yet in some research areas obtaining data is expensive. According
to [5], changing the loss function from categorical cross-entropy to the cosine loss
improves the performance of the model on small datasets.
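A cosine loss of this kind can be sketched as one minus the cosine similarity between the L2-normalized network output and the one-hot encoding of the true class. The code below is a simplified illustration under that assumption, not the implementation of [5]; the function name `cosine_loss` is hypothetical.

```python
import numpy as np

def cosine_loss(outputs, label, num_classes):
    """1 - cos(angle) between the network outputs and the one-hot target."""
    target = np.zeros(num_classes)
    target[label] = 1.0                      # one-hot vector, already unit length
    outputs = np.asarray(outputs, dtype=float)
    normalized = outputs / np.linalg.norm(outputs)
    return 1.0 - float(np.dot(normalized, target))
```

The loss is 0 when the output points exactly in the direction of the true class and 1 when it is orthogonal to it, independently of the magnitude of the outputs.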
Fig. 1 Convolutional neural network
1.1 Convolutional Neural Networks
AI has seen an unprecedented surge in bridging the gap between human and
machine capabilities. Researchers and enthusiasts from many backgrounds work
tirelessly on outstanding projects, and a number of these are found in the field of
computer vision. The aim of this discipline is to enable computers to observe and
perceive the world as people do and to use that knowledge for tasks such as image
and video processing, image analysis and classification, media discovery, product
recommendation, and natural language processing. One specific method, the
convolutional neural network, has matured and been refined over the years.
A CNN is a multi-layered neural network that is used particularly for images.
Ordinary neural networks are not designed to extract features from images. In a CNN,
convolution and pooling layers extract the features from the images; because these
layers cannot perform classification by themselves, fully connected layers are
needed to classify the data. In traditional ML algorithms, feature extraction from
images is the biggest challenge, whereas a CNN extracts features automatically and
learns from them. As shown in Fig. 1, multiple convolution and pooling layers are
used for feature extraction, and adding more layers generally improves the
algorithm. Pooling reduces the dimensionality of the features; max pooling is used
most often because it keeps the maximum value of each region of the extracted
features. After these layers, we use flatten, fully connected (FCN), and softmax
layers for the classification part; for a regression model, dense layers can be used.
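The effect of max pooling on a feature map can be illustrated with a few lines of NumPy. This is a minimal sketch with non-overlapping windows; the function name `max_pool2d` and the window size of 2 are assumptions.

```python
import numpy as np

def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling over size x size windows of a 2D feature map."""
    h, w = feature_map.shape
    h_out, w_out = h // size, w // size
    # drop rows/columns that do not fill a complete window
    trimmed = feature_map[:h_out * size, :w_out * size]
    blocks = trimmed.reshape(h_out, size, w_out, size)
    return blocks.max(axis=(1, 3))           # maximum of every window

features = np.arange(16).reshape(4, 4)
pooled = max_pool2d(features)                # 4 x 4 map reduced to 2 x 2
```

Each 2 × 2 window is replaced by its largest value, so the spatial size is halved in both directions while the strongest responses are kept.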
A convolutional neural network (ConvNet/CNN) is a deep learning algorithm that
takes an input image and assigns importance (learnable weights and biases) to
various aspects and objects in the image so that it can tell them apart. ConvNets
need substantially less pre-processing than other classification techniques.
Whereas filters are hand-engineered in rudimentary approaches, ConvNets can learn
these filters and characteristics given enough training. The architecture of a
ConvNet is analogous to the connectivity pattern of neurons in the human brain and
was inspired by the organization of the visual cortex. Individual neurons respond
to stimuli only in a restricted region of the visual field known as the receptive
field. A collection of such fields overlaps to cover the entire visual area.
Fig. 2 Matrix to vector
An image is nothing but a matrix of pixel values. Could one therefore simply
flatten the image (e.g., turn a 3 × 3 image matrix into a 9 × 1 vector, as shown in
Fig. 2) and feed it to a multilayer perceptron (MLP) for classification? Not
really. Such an approach would achieve an average accuracy score when classifying
images with little dependence between pixels but would perform poorly on images with
significant pixel dependencies.
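The flattening step of Fig. 2 can be reproduced with NumPy. The pixel values below are arbitrary illustrative numbers, not data from the chapter.

```python
import numpy as np

# a 3 x 3 single-channel image with made-up pixel values
image = np.array([[1, 1, 0],
                  [4, 2, 1],
                  [0, 2, 1]])

# row-major flattening into a 9 x 1 column vector
vector = image.reshape(-1, 1)
```

After flattening, the vertically adjacent pixels `image[0, 0]` and `image[1, 0]` end up three positions apart (`vector[0]` and `vector[3]`), which shows how the spatial neighborhood structure is no longer explicit in the MLP input.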
A ConvNet can capture the spatial and temporal dependencies in an image by
applying relevant filters. The architecture fits the image dataset better owing to
the reduction in the number of parameters and the reuse of weights. In other words,
the network can be trained to understand the complexity of the image better. An RGB
image, for example, is separated into its three color planes (red, green, and blue).
Images exist in several such color spaces, including grayscale, RGB, HSV, and CMYK.
One can imagine how computationally demanding things become once images reach
dimensions of, for example, 8K (7680 × 4320). The ConvNet's job is to reduce the
images to a form that is easier to process while preserving the features that are
critical for accurate predictions. This is an important consideration when
designing an architecture that is both effective at learning features and scalable
to large datasets.
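A quick calculation shows why weight reuse matters for large images. The sketch below compares the number of weights needed to connect a single fully connected neuron to an 8K RGB image with the number of weights in one convolutional filter; the function names and the 3 × 3 filter size are assumptions for illustration, and biases are ignored.

```python
def dense_weights(height, width, channels, units):
    """Weights of a fully connected layer applied to the flattened image."""
    return height * width * channels * units

def conv_weights(kernel_size, in_channels, filters):
    """Weights of a convolutional layer; each filter is reused at every position."""
    return kernel_size * kernel_size * in_channels * filters

per_neuron = dense_weights(4320, 7680, 3, 1)   # one dense neuron on an 8K RGB image
per_filter = conv_weights(3, 3, 1)             # one 3 x 3 filter over 3 channels
```

Here `per_neuron` is 99,532,800 while `per_filter` is 27, because the same small filter is slid over the whole image instead of learning a separate weight per pixel.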
An artificial neural network has three kinds of layers: the input layer, the hidden
layer, and the output layer. The input layer neurons are linked to all or part of the
hidden layer neurons, but neurons within the same layer are not connected to one
another. A network may contain multiple hidden layers. Convolutional neural networks
(CNNs) are the kind of artificial neural network applied in this project. The
network output reports the probability that a patch belongs to the left ventricle,
with values ranging from 0 to 1. Applying a convolutional neural network to a whole
image can be likened to filtering the image at various scales by performing several
convolutions on it.
Convolution layer (the kernel): in the demonstration of Fig. 3, the input image has
a height of 5, a width of 5, and a single channel (the third dimension is the number
of channels, e.g., 3 for RGB); it is shown in green. The element that carries out
the convolution operation in the first part of a convolutional layer is called the
kernel/filter, K, shown in yellow. We have chosen K as a 3 × 3 × 1 matrix (Fig. 3).
Fig. 3 Convolution operation
Because kernel stride length = 1 (non-stride), the kernel is shifting nine times for
each matrix multiplication operation, which it does while hovering over the region P
of the picture. The filter pushes a given stride value to the right until it has processed
the whole width. Hops from the beginning (left) of the picture to the left side then
move over the rest of the picture, repeating the procedure as needed. The convolution
operation’s main goal is to isolate important visual elements such as edges from the
input picture. Although just one convolutional layer is required, you may apply as
many convolutional layers as you would like. The low-level characteristics, such
as edges, color, gradient orientation, etc., are traditionally handled by the first
ConvLayer. Applying more layers to the design results in the model exhibiting highlevel properties while also adapting to the dataset. In this way, we have a network
that is comparable to how we would.
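The sliding-kernel procedure described above can be sketched in a few lines of plain
Python. The function name `convolve2d` and the pixel and kernel values are
illustrative assumptions; the sketch handles only square, single-channel inputs
without padding and, like most CNN libraries, does not flip the kernel.

```python
def convolve2d(image, kernel, stride=1):
    """Slide a square kernel over a square image (no padding) and
    return the resulting feature map."""
    k = len(kernel)
    out_size = (len(image) - k) // stride + 1
    feature_map = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            # Elementwise product of K and the region P under it, then summed.
            total = 0
            for a in range(k):
                for b in range(k):
                    total += image[i * stride + a][j * stride + b] * kernel[a][b]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 5 x 5 x 1 input image and 3 x 3 x 1 kernel.
image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

feature_map = convolve2d(image, kernel)
print(feature_map)  # [[4, 3, 4], [2, 4, 3], [2, 3, 4]]
```

With a stride of 1, the kernel visits 3 × 3 = 9 positions, which matches the nine
shifts mentioned in the text.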
1.2 Network Layers
A neural network may use several kinds of hidden layers to suit different
applications. The network in this work is made up of convolutional and max-pooling
layers, followed by ReLU layers and, finally, fully connected layers.
1.3 CNN Tuning
Tuning a convolutional neural network may be likened to building a house of cards:
it is difficult to maintain the delicate balance. A small adjustment to the
hyperparameters may benefit the CNN, but it may equally harm it. A CNN is fine-tuned
by comparing the accuracy on the training set with the accuracy on the validation set.
P. S. K. Yadhav et al.
The accuracy for a dataset is the number of samples correctly classified divided by
the total number of samples. Training accuracy indicates whether the CNN is learning,
while validation accuracy gives information about its performance on fresh, unseen
samples such as those in the test set. Overfitting, in which the validation accuracy
is lower than the training accuracy, is a typical issue when training a CNN. One
remedy is to modify the dropout rate of the CNN. The number of epochs, batch size,
patch size, and momentum can also be adjusted, which has the additional effect of
changing the training time. The aim is to avoid a CNN that has a high training
accuracy but a validation accuracy that differs greatly from it.
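As a small illustration of the accuracy definition and of the gap that signals
overfitting, consider the sketch below. The labels are invented for the example, and
the function name `accuracy` is an assumption.

```python
def accuracy(predicted, actual):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


# Hypothetical predictions and labels for a training and a validation set.
train_acc = accuracy([1, 0, 1, 1, 0, 1, 0, 0], [1, 0, 1, 1, 0, 1, 0, 1])
val_acc = accuracy([1, 0, 0, 1], [1, 1, 0, 0])

# A large positive gap suggests that the model is overfitting.
gap = train_acc - val_acc
print(train_acc, val_acc, gap)  # 0.875 0.5 0.375
```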
1.4 Max Pooling
Max pooling is used to find the most important features in a matrix or feature map.
A pooling filter is specified such that, when it is applied to a patch of the matrix,
it returns the maximum value in that patch. A stride is given along with the filter
and defines the step size with which the filter moves over the feature map. As shown
in Fig. 4, 2 × 2 pooling applied to a 4 × 4 matrix with a stride of 2 × 2 returns the
maximum value of each patch, which identifies the features with the strongest
presence while also downsampling the image. Max pooling is used to downsample the
feature maps produced by the 2D convolutional filters and is therefore a standard
part of convolutional neural network implementations. With max pooling, each map is
divided into sections of N × N pixels; the most typical size is 2 × 2. The largest
value in each 2 × 2 section is picked, so each spatial dimension is reduced to half
that of the original map. In Fig. 4, max pooling reduces the incoming 4 × 4 feature
map to a smaller 2 × 2 feature map. Max pooling helps speed up the computation in the
convolutional neural network because the number of weights that need to be optimized
is lowered. As a result, there is also a smaller chance of overfitting. The biggest
disadvantage of max pooling is that, if the pooling size is large relative to the
patch size or feature map, so much information may be removed that the ability of the
CNN to learn is impaired.
Fig. 4 Max pooling
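The 2 × 2 pooling with a stride of 2 described above can be sketched as follows. The
function name `max_pool2d` and the feature-map values are illustrative assumptions
and are not taken from Fig. 4.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Return the maximum of each size x size window of the feature map."""
    rows = (len(feature_map) - size) // stride + 1
    cols = (len(feature_map[0]) - size) // stride + 1
    pooled = []
    for i in range(rows):
        row = []
        for j in range(cols):
            window = [
                feature_map[i * stride + a][j * stride + b]
                for a in range(size)
                for b in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical 4 x 4 feature map.
feature_map = [
    [1, 3, 2, 1],
    [4, 6, 5, 0],
    [7, 2, 9, 8],
    [1, 0, 3, 4],
]

pooled = max_pool2d(feature_map)
print(pooled)  # [[6, 5], [7, 9]]
```

Each spatial dimension is halved, from 4 × 4 to 2 × 2, and only the strongest
response in each window is kept.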
1.5 ReLU
Activation functions are used in neural networks to help the model learn more
complex patterns and capture the non-linear properties of the data. These functions
take an input, apply a simple operation to it, and produce an output that can be used
as the input to another layer or as the final output. Many activation functions are
available, each performing this operation in a different manner. If no activation
functions are applied, the output of the neural network is just a linear function of
its inputs and learned parameters, and the network behaves like a linear regression
that fits a line to the data. However, neural networks are meant to find complex
patterns in data such as images and video, which requires the network to be
non-linear. ReLU (rectified linear unit) is a simple function whose values range from
zero to infinity. Its output is zero for any negative input and equal to the input for
any input greater than or equal to zero, that is, f(x) = max(0, x) (Fig. 5). This
activation function is widely used in neural networks, particularly in CNNs. Because
the function is so simple, it takes very little time to compute, and it is also known
for inducing sparsity in the network: many entries are zero, so only a few neurons are
activated. This helps the model focus on the patterns that are most useful and
reduces overfitting, that is, the tendency to pick up noise from random phenomena.
ReLU is also widely used because of the vanishing gradient problem. With other
activation functions, gradient values keep decreasing as the depth of the model
increases and may even reach zero, whereas ReLU mitigates this problem because its
slope does not plateau for positive inputs. A remaining drawback is that, being
unbounded above, ReLU can suffer from the exploding gradient problem.
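A minimal sketch of f(x) = max(0, x) and its slope is given below. The input values
are arbitrary and the helper names are assumptions; the derivative at exactly zero is
set to 0 here by convention.

```python
def relu(x):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return max(0.0, x)


def relu_grad(x):
    """Slope of ReLU: 1 for positive inputs, 0 otherwise (0 at x = 0 by convention)."""
    return 1.0 if x > 0 else 0.0


inputs = [-20.0, -5.0, 0.0, 2.5, 15.0]
outputs = [relu(x) for x in inputs]
slopes = [relu_grad(x) for x in inputs]

# Only the neurons with positive input stay active, which gives sparsity.
active = sum(1 for y in outputs if y > 0)

print(outputs)  # [0.0, 0.0, 0.0, 2.5, 15.0]
print(slopes)   # [0.0, 0.0, 0.0, 1.0, 1.0]
print(active)   # 2
```

The slope stays at 1 however large the positive input becomes, which is why the
gradient does not plateau on that side.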
Fig. 5 ReLU activation function
Fig. 6 Sigmoid activation function
1.6 Sigmoid
The sigmoid activation is used when the output of the model is a probability.
Because the sigmoid function takes values between 0 and 1, its output lies in the same
range as a probability, so it is a natural choice when the model has to predict one.
The sigmoid function is the function used in logistic regression and is therefore
also called the logistic sigmoid function. An activation function should also be
differentiable, which the sigmoid is. Tanh is a sigmoid-shaped function as well, but
it ranges between −1 and 1 with a mid-value of 0. For the sigmoid function, any
positive input gives an output above 0.5, and any negative input gives an output
below 0.5. The drawback of both functions is saturation: for large positive inputs
the output approaches 1, and for large negative inputs it approaches 0 (sigmoid) or
−1 (tanh), so the curve becomes flat and the output barely changes. Sigmoid is mostly
used as the activation function of the final layer to produce the output
(Figs. 6 & 7).
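The saturation behavior can be checked numerically with the short sketch below,
which uses only the Python standard library. The helper names and the sample inputs
are assumptions chosen for illustration.

```python
import math


def sigmoid(x):
    """Logistic sigmoid: maps any real input into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid, s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


for x in (-10.0, 0.0, 10.0):
    # Columns: input, sigmoid, tanh, sigmoid slope.
    print(x, sigmoid(x), math.tanh(x), sigmoid_grad(x))
```

At x = 0 the sigmoid returns 0.5 with a slope of 0.25, while at x = ±10 both sigmoid
and tanh are already very close to their limits and the slope is nearly zero.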
1.7 Dropout
A dropout layer is used to reduce overfitting by nullifying, or dropping, neurons in
the input layer and in the hidden layers. If dropout layers are not used, the first
batch of training samples may dominate the output of the model and prevent it from
learning new features from later training samples. Dropout is not applied to the
output layer; it is only added to the input and hidden layers of the neural network.
When dropout is applied to the hidden neurons of a CNN, each hidden neuron is
temporarily removed with a given probability. While a neuron is dropped, the
connections to and from it are likewise inactive. With a dropout rate of 50% in a
hidden layer, the likelihood of an individual neuron being temporarily eliminated
during an epoch is 50%.
Fig. 7 Tanh activation function
Fig. 8 With and without dropout (panels: without dropout; with dropout on the hidden layer)
As Fig. 8 shows, using dropout amounts to training with only a part of the CNN at a
time, which prevents neurons from becoming dependent on each other. When all
neurons are exposed to the same information, stronger associations form between
them, and dropout breaks these up. Dropout may also be seen as a combination of
several convolutional neural networks, each trained on a different part of the input.
Neurons that are inactivated for an epoch are reactivated before the next epoch, when
a different set of neurons may be dropped.
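The following sketch illustrates dropping neurons with a given probability. It
assumes the common "inverted dropout" variant, in which the surviving activations are
rescaled during training; this rescaling, the function name, and the seed are
assumptions and are not described in the text.

```python
import random


def dropout(activations, rate, rng, training=True):
    """Zero each activation with probability `rate` during training.

    At inference time (training=False) the activations pass through unchanged.
    """
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    # Assumption: inverted dropout, survivors are scaled by 1 / keep so that the
    # expected activation stays the same.
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(0)          # fixed seed so that the example is repeatable
hidden = [1.0] * 1000           # hypothetical hidden-layer activations
dropped = dropout(hidden, rate=0.5, rng=rng)

fraction_zero = dropped.count(0.0) / len(dropped)
print(fraction_zero)            # close to 0.5 for a 50% dropout rate
```

Calling the function again draws a new random mask, so a different set of neurons is
dropped each time.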
1.8 Hyperparameters
1.8.1 Learning Rate
The learning rate influences the training accuracy and how rapidly the loss function
decreases during the training of a CNN. It determines how well gradient descent, the
process that searches for the optimum parameters, learns from the examples provided
to the CNN. A learning rate that is either too small or too large is ineffective.
When the learning rate is set too high, the loss function decreases at first but then
plateaus or increases after a certain number of epochs, meaning that the CNN ceases
to learn. When the learning rate is set too low, learning takes a long time because
the weight updates are minimal, so the loss curve falls only slowly. The training
accuracy moves in the opposite direction to the loss. Fig. 9 illustrates how the
learning rate affects training accuracy: a good learning rate (red curve) steadily
improves accuracy, whereas a learning rate that is too high (blue curve) causes
accuracy to improve at first and then decline. The learning rate controls how big
each step is when the weights and biases are updated. To lower the step size as the
loss function approaches its minimum, learning rate decay is used, which helps
prevent the optimum from being overshot because of an excessively large step size.
One way of implementing learning rate decay is to update the learning rate after each
epoch according to the training accuracy. Another option is to reduce the learning
rate depending on the validation accuracy.
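One possible form of decay driven by validation accuracy is sketched below: the
learning rate is halved whenever the validation accuracy has failed to improve for a
set number of epochs. The function name, the initial rate, the decay factor, the
patience, and the accuracy history are all illustrative assumptions, not the settings
used in this chapter.

```python
def decay_on_plateau(val_accuracies, initial_lr=0.01, factor=0.5, patience=2):
    """Return the learning rate in effect during each epoch.

    The rate is multiplied by `factor` once the validation accuracy has not
    improved for `patience` consecutive epochs.
    """
    lr = initial_lr
    best = float("-inf")
    stalled = 0
    schedule = []
    for acc in val_accuracies:
        schedule.append(lr)      # rate used during this epoch
        if acc > best:
            best, stalled = acc, 0
        else:
            stalled += 1
            if stalled >= patience:
                lr *= factor     # smaller steps from the next epoch on
                stalled = 0
    return schedule


# Hypothetical validation accuracy after each of eight epochs.
history = [0.70, 0.78, 0.80, 0.80, 0.79, 0.81, 0.81, 0.80]
schedule = decay_on_plateau(history)
print(schedule)
```

In this example, the rate stays at 0.01 for the first five epochs and drops to 0.005
after two epochs without improvement.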
Fig. 9 Learning rate effect (training accuracy versus epoch for a high and a good learning rate)
1.8.2 Patch Size
Another aspect to take into consideration is the size of the input patch. If the
patch is too small, the CNN cannot tell what kind of pixel it is looking at, so the
patch must be large enough for the CNN to learn from. At the same time, a larger
patch means more weights to handle, which increases the computation time. If the
patch is too big, the pixel correlation between adjacent patches is reduced.
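To illustrate what an input patch is, the sketch below cuts a square patch centered on
a given pixel out of an image. The function name, the zero padding at the image
border, and the toy image are assumptions; the chapter does not state how patches at
the border are handled.

```python
def extract_patch(image, row, col, size):
    """Return the size x size patch centered on (row, col); `size` must be odd.

    Positions outside the image are filled with 0 (an assumption).
    """
    half = size // 2
    patch = []
    for r in range(row - half, row + half + 1):
        patch_row = []
        for c in range(col - half, col + half + 1):
            inside = 0 <= r < len(image) and 0 <= c < len(image[0])
            patch_row.append(image[r][c] if inside else 0)
        patch.append(patch_row)
    return patch


# Toy 5 x 5 image whose pixel value encodes its position (5 * row + column).
image = [[r * 5 + c for c in range(5)] for r in range(5)]

center_patch = extract_patch(image, 2, 2, 3)
corner_patch = extract_patch(image, 0, 0, 3)
print(center_patch)  # [[6, 7, 8], [11, 12, 13], [16, 17, 18]]
print(corner_patch)  # [[0, 0, 0], [0, 0, 1], [0, 5, 6]]
```

A larger `size` gives the network more context around the pixel, at the cost of more
input values and therefore more weights.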
1.8.3 Batch Size
In gradient descent, the batch size is the number of patches examined before the
weights are updated. When training a CNN on a big dataset, a batch size greater than
one results in faster convergence because each step taken is based on more patches.
Increasing the learning rate when the batch size is big increases the step size of
the gradient descent, allowing the optimum weights and biases to be reached more
quickly.
1.8.4 Epochs
In each epoch, the whole training set is passed through the network and the weights
are optimized. The order of the samples in the training set is randomized so that it
does not affect the optimization of the weights; otherwise, the CNN could end up
learning the order of the samples. The data are divided into training, validation,
and testing sets, and after each epoch the validation set is evaluated with the newly
updated weights and biases. The more validation samples the CNN identifies correctly,
the better the result. During validation, samples are only classified using the
current parameter values; no training takes place.
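How batch size, shuffling, and epochs fit together can be sketched as below. The
training step itself is omitted; the function name, the ten integer "patches", the
batch size of 4, and the seed are placeholders chosen for illustration.

```python
import random


def iterate_minibatches(samples, batch_size, rng):
    """Yield the samples in a freshly shuffled order, batch_size at a time."""
    order = list(range(len(samples)))
    rng.shuffle(order)           # new random order for every epoch
    for start in range(0, len(order), batch_size):
        yield [samples[i] for i in order[start:start + batch_size]]


rng = random.Random(42)          # fixed seed so that the example is repeatable
patches = list(range(10))        # stand-ins for ten training patches
updates_per_epoch = []

for epoch in range(3):
    batches = list(iterate_minibatches(patches, batch_size=4, rng=rng))
    # One weight update would be performed per batch here.
    updates_per_epoch.append(len(batches))

print(updates_per_epoch)  # [3, 3, 3]
```

Each epoch covers every patch exactly once, in a different order, and the last batch
is smaller when the batch size does not divide the number of patches.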
1.9 Augmentation
Image data augmentation is used to artificially enlarge the size of a training
dataset. Training deep learning neural network models on more data can yield more
capable models, and augmentation techniques create variations of the images that
improve the ability of the trained models to generalize what they have learnt.
Keras is a library used to build deep learning models. It includes the
ImageDataGenerator class, which generates image data with augmentation applied. The
effectiveness of deep learning neural networks tends to increase as more data is
used. Data augmentation creates fresh training data from the existing training data
by applying domain-specific techniques to training examples, producing new and
distinct examples. Image data augmentation is perhaps the best-known type of data
augmentation; it constructs transformed copies of images in the training dataset that
belong to the same class as the original image. The transformations include shifts,
flips, zooms, and several other image manipulation operations.
The goal is to add fresh, realistic instances to the training set, that is,
variations of the training images that the model is likely to encounter. For example,
a photo of a cat may have been taken from the left or from the right, so a horizontal
flip of the image makes sense. A vertical flip of a cat image does not make sense,
because the model is unlikely to encounter an upside-down cat. It is therefore
apparent that the choice of data augmentation methods for a training dataset depends
on both the dataset and the problem domain. It can also be useful to try data
augmentation methods in isolation and in combination, using a small prototype
dataset, model, and training run, to see whether they produce a noticeable
improvement in model performance.
Convolutional neural networks can learn features that are invariant to their location
in the image. Augmentation can strengthen this transform-invariant approach to
learning even further and help the model learn features that are also invariant to
transforms such as left-to-right and top-to-bottom ordering, light levels in photos,
and more. Augmentation of the image dataset is normally applied only during training
and never to the test or validation sets. This differs from data preparation steps
such as resizing images and scaling pixel values, which must be applied consistently
across all datasets used with the model.
1.9.1 Shift Augmentation
An image shift moves all pixels of the image in one direction, horizontally or
vertically, while the image dimensions stay the same. This implies that a portion of
the image is cut off and that new pixel values have to be supplied for the area that
is vacated. The amounts of horizontal and vertical shift are controlled by the
width_shift_range and height_shift_range arguments of the ImageDataGenerator
constructor.
These arguments accept a floating-point value between 0 and 1 that represents the
shift as a fraction of the width or height of the image. Alternatively, the shift may
be given as a number of pixels. For each image, the amount of shift is sampled at
random within the limit set by the given fraction or pixel value, which includes no
shift, and is then applied. A tuple or array such as [−100, 100] or [−0.5, 0.5] can be
supplied instead to specify the values from which the shift is drawn.
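The effect of a shift can be illustrated without any library, as in the sketch below.
This is not the Keras implementation: the function names, the constant fill value for
the vacated area, and the uniform sampling of a fractional shift from a (minimum,
maximum) pair are assumptions made for the example.

```python
import random


def shift_image(image, dx, dy, fill=0):
    """Move all pixels dx columns to the right and dy rows down.

    The image keeps its dimensions; pixels pushed outside are cut off and the
    vacated area is filled with `fill`.
    """
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            nr, nc = r + dy, c + dx
            if 0 <= nr < height and 0 <= nc < width:
                shifted[nr][nc] = image[r][c]
    return shifted


def random_shift(image, width_range, height_range, rng):
    """Sample fractional shifts from (min, max) pairs and apply them."""
    height, width = len(image), len(image[0])
    dx = round(rng.uniform(*width_range) * width)
    dy = round(rng.uniform(*height_range) * height)
    return shift_image(image, dx, dy)


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

right_one = shift_image(image, dx=1, dy=0)
print(right_one)  # [[0, 1, 2], [0, 4, 5], [0, 7, 8]]

augmented = random_shift(image, (-0.5, 0.5), (-0.5, 0.5), random.Random(1))
```

Shifting one column to the right cuts off the last column and leaves the first column
to be filled, which is the situation described in the text.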
1.9.2 Rotation Augmentation
Rotation augmentation rotates the image by a randomly chosen angle between 0° and
360°. Because the rotation is likely to move pixels out of the image frame, the frame
will contain areas with no pixel data, which have to be filled in.
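The empty areas created by a rotation can be seen in the sketch below, which rotates
an image about its center with nearest-neighbor sampling. The function name, the
constant fill value, and the test images are assumptions; a library implementation
would typically offer several fill and interpolation modes.

```python
import math


def rotate_image(image, degrees, fill=0):
    """Rotate an image about its center using nearest-neighbor sampling.

    Output positions whose source falls outside the frame are set to `fill`.
    """
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    theta = math.radians(degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = [[fill] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            # Inverse mapping: find the source pixel that lands on (r, c).
            y, x = r - cy, c - cx
            src_x = cos_t * x + sin_t * y + cx
            src_y = -sin_t * x + cos_t * y + cy
            sr, sc = round(src_y), round(src_x)
            if 0 <= sr < height and 0 <= sc < width:
                rotated[r][c] = image[sr][sc]
    return rotated


ones = [[1] * 5 for _ in range(5)]
rotated = rotate_image(ones, 45)
for row in rotated:
    print(row)
```

After a 45° rotation of the 5 × 5 image, the corners of the frame hold the fill value
because no source pixel maps onto them.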
2 Literature Review
Guo et al. [6] developed a CNN model integrated with a continuous max-flow framework
and kernel cut, which increased the segmentation accuracy, especially on smaller
datasets. Neural networks usually require large datasets, but the architecture
proposed in [6] improves segmentation accuracy even with limited data. Budai et al.
[7] developed a fully automatic segmentation approach based on a regression model.
The proposed architecture is a ResNet combined with additional convolutional and
dense layers. In previous research, the post-processing step was time-consuming; the
approach proposed in [7] overcomes this.
Wu et al. [8] proposed a hybrid CNN approach for more accurate segmentation of the
left ventricle, which can easily be confused with other areas such as the right
ventricle and myocardium. The approach combines a CNN with U-net: the CNN extracts
the ROI of the left ventricle, and the U-net finds the left ventricle accurately.
This method improved the accuracy of left ventricle segmentation. Khened et al. [9]
developed a novel approach that adds upsampling and shortcut connections to an FCN
architecture. The proposed model achieved 100% accuracy on the ACDC dataset for
diagnosis. They also proposed a loss function that combines the benefits of Dice loss
and cross-entropy.
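Since the Dice coefficient and the Jaccard index recur throughout the works reviewed
here, a short sketch of both overlap measures for binary segmentation masks may be
helpful. The masks are flattened to lists of 0s and 1s; the function names and the
mask values are invented for the example, and returning 1.0 for two empty masks is a
convention assumed here.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 |A ∩ B| / (|A| + |B|) for binary masks given as 0/1 lists."""
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 2.0 * intersection / total if total else 1.0


def jaccard_index(pred, truth):
    """Jaccard = |A ∩ B| / |A ∪ B| for binary masks given as 0/1 lists."""
    intersection = sum(p * t for p, t in zip(pred, truth))
    union = sum(pred) + sum(truth) - intersection
    return intersection / union if union else 1.0


# Hypothetical predicted and reference masks (flattened).
pred = [0, 1, 1, 1, 0, 0, 1, 0]
truth = [0, 1, 1, 0, 0, 1, 1, 0]

dice = dice_coefficient(pred, truth)
jaccard = jaccard_index(pred, truth)
print(dice, jaccard)  # 0.75 0.6
```

The two measures are related by J = D / (2 − D), so they rank segmentations in the
same order; a Dice loss is commonly defined as 1 − D.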
Recently, fully automated CNN networks have been used for image segmentation. For
example, Nasr-Esfahani et al. [10] proposed an FCN architecture to predict left
ventricle (LV) volumes. The model detects the roughly round shape in the image that
corresponds to the LV and then segments it. The dataset used in that work is the York
heart image dataset. Many researchers use CNN architectures to segment particular
regions in medical images, but learning features from unlabeled image data remains
the biggest challenge. Krishna et al. [11] proposed a contrastive learning strategy,
a variant of self-supervised learning (SSL), in which very limited annotations are
used for predicting the volumes in MRI images; the model achieves better results with
this method. The experiment was carried out on three publicly available datasets.
Scribble annotation is another SSL segmentation technique that is used to correctly
identify locations in the image. The scribble method proposed by Valvano et al. [12]
can be applied to many types of images, such as human pose parts, abdominal organs,
and the heart. The method in [12] was evaluated on several medical and non-medical
datasets and gives results similar to fully automated segmentation (FAS). Wang
et al. [13] proposed a FAS approach for segmentation based on slope difference
distribution threshold selection, which outperforms several existing CNN models.
Luo et al. [14] proposed a deep CNN-based approach for volumetric estimation without
segmentation that learns from the input over multiple epochs. This procedure yields a
high correlation with the true values. With an FCN, the LV, RV, and myocardium can be
identified in cardiac images; the myocardium is the muscle mass that surrounds the
LV. Jang et al. [15] applied an FCN to segment the myocardium, LV, and RV. Tan et al.
[16] applied a CNN regression model to LV segmentation and obtained a high Jaccard
index on the LVSC dataset [17–19].
Emad et al. [20] note that the LV has to be localized before segmentation, functional
analysis, and content-based retrieval of cardiac images can be carried out
automatically. They describe a fully automated methodology that uses a convolutional
neural network (CNN) to locate the LV in short-axis cardiac MRI images. Feature
extraction was performed by a six-layer CNN with variable kernel sizes, with a fully
connected softmax classifier on top for classification. A pyramid-of-scales analysis
was introduced to account for the varying sizes of the heart. A publicly accessible
database of 33 cases was used for training and testing. The method is based on
analysis of the whole image, using exhaustive search to look for the LV at various
scales without prior knowledge of where the LV is or how it moves. The method is
applied to each time frame individually. The reported results are very promising,
with accuracy, sensitivity, and specificity of 98.66%, 83.91%, and 99.07%,
respectively. One shortcoming of the method is the time needed to process each image,
around 10 s, although parallel processing and guided search could reduce it.
Simantiris et al. [21] address the segmentation of cardiac MR images, a demanding
task that is essential in the medical evaluation of heart disorders. Cardiovascular
disorders are among the leading causes of mortality worldwide, and the localization
of particular areas of interest, such as the right and left ventricular cavities and
the myocardium, supports clinicians in assessing them. The authors tackle the
semantic segmentation problem using a convolutional neural network trained on cardiac
MR images. To operate at full resolution throughout the layers of the network, they
use dilated convolutions, which give higher localization accuracy while keeping the
number of trainable parameters low. They designed a custom loss function to improve
the training process. To deal with the scarcity of training images, they devised
novel augmentation methods and adapted existing ones. This increases both the size
and the quality of the training set, so the network learns quickly and does not
overfit. Pre-processing and post-processing steps are integral parts of the whole
pipeline. On the ACDC dataset, the method achieved good results for the right and
left ventricles (RV, LV) and the myocardium (MYO). In the post-2017-MICCAI-challenge
algorithmic validation, it reached a Dice coefficient of 0.916, with average scores
comparable to the state of the art shown on the ACDC leaderboard but with far fewer
parameters. The technique outperformed competing approaches based on dilated
convolutions in that challenge by a considerable margin.
The network in [21] is a cascaded dilated convolutional neural network designed for
semantic segmentation of cardiac MR images. Convolutional kernels with a higher
dilation rate are used in the first layer, but the kernel size decreases as the
architecture becomes deeper. The network has fewer parameters than other deep
learning architectures designed for the same task, so training can be completed in a
short amount of time even on large datasets. Although the proposed network has only a
limited number of features and parameters, the authors believe it could be simplified
further in a later step, possibly by using training techniques developed to optimize
network parameters. Investigating whether the filters in the first layers of the
network have a meaningful interpretation could advance this effort. As data
augmentation, basic image manipulation methods such as rotation and scaling speed up
the training process. Before training, an intensity augmentation tailored to each
subject is applied to produce an enhanced training dataset: the image contrast is
altered to create additional tissue samples based on the distributions of the left
ventricle and myocardium densities. According to the authors’ findings, this is the
most important factor in reaching the highest performance on all measures. The
post-processing stage incorporates anatomical constraints to build a more consistent
segmentation map for the volume.
Zotti et al. [22] present a convolutional neural network architecture for segmenting
short-axis cardiac MRI slices. The proposed model extends the U-net by incorporating
a cardiac shape prior and a loss function tailored to the anatomy of the heart.
Because the shape prior is computed offline, it does not limit the execution of the
model. The system takes raw MR images as input, applies no post-processing or image
cropping, and is trained to segment the endocardium, the endocardium of the right
ventricle, and the center of the heart. With its multiresolution grid design, the
network learns both high- and low-level features in addition to locating the cardiac
regions. The model segments multi-slice CMRIs (left and right ventricle contours) in
0.18 s, with an average Dice coefficient of 0.91 and an average 3-D Hausdorff
distance of 9.5 mm.
Chang et al. [23] point out that segmentation of cardiac MRI images is critical in
clinical diagnosis. Traditionally, obtaining the indicators for cardiopathy diagnosis
requires clinical specialists to manually segment the left ventricle (LV), right
ventricle (RV), and myocardium, one by one. Manual segmentation is labor-intensive
and time-consuming. The authors propose deep neural networks (DNNs) that
automatically segment cardiac MRI images and classify cardiopathy. First, a
YOLO-based object detection network obtains an ROI from the diastolic and systolic
MRI sequences. Next, a fully convolutional network (FCN) extracts a pixel-wise
segmentation mask. Finally, a network for diagnosing cardiopathy produces the final
diagnosis of the heart condition from the MRI. The experiments show that the approach
effectively segments the left ventricle, the right ventricle, and the myocardium. For
classification of the heart disease, it reaches an accuracy of 90%.
Bernard et al. [24] observe that delineating the left ventricular cavity, myocardium,
and right ventricle from cardiac magnetic resonance images (multi-slice 2-D cine MRI)
is a frequent clinical task for establishing a diagnosis, and that research toward
automating it has been under way for several decades. They introduce the “Automatic
Cardiac Diagnosis Challenge” (ACDC) dataset, described as the largest and most
thoroughly annotated publicly available collection of cardiac MR images (CMRs). The
dataset contains 150 multi-equipment CMRI recordings with reference measurements and
classifications from two medical experts. The study aims to determine how far
state-of-the-art deep learning approaches can go in the assessment of CMRI, including
disease classification. Following the 2017 MICCAI-ACDC challenge, the authors report
the results of deep learning techniques from nine research groups for the
segmentation task and from four research groups for the classification task. The
best approaches accurately reproduced the expert analysis, with a mean correlation
score of 0.97 and an accuracy of 0.96. These findings suggest that accurate, fully
automatic analysis of cardiac CMRI is feasible. The authors also highlight cases in
which the latest deep learning approaches still fail. The data and results are
publicly accessible online.
Curiale et al. [25] note that cardiologists consider cardiac function important for
prognosis and therapy in almost every case. The structure and function of the heart
are the primary determinants of cardiac behavior, and assessing either requires
locating and segmenting the myocardium in medical imaging examinations. Magnetic
resonance imaging (MRI) is today one of the most important and accurate non-invasive
techniques for assessing heart anatomy and function. The objective of the study is to
use a deep learning algorithm to aid automated segmentation of myocardial tissue in
cardiac MRI. The proposed modifications to earlier work include a residual learning
strategy, the Jaccard distance as the optimization objective, and a batch
normalization layer for training the fully convolutional neural network. The
reported findings show that this network structure outperforms earlier ones developed
for the same application and is well suited to myocardial segmentation. According to
the authors’ tests, segmenting a volume of 128 × 128 × 13 pixels takes about 22 s on
an Intel Core i7 3.1 GHz processor.
Zreik et al. [26] note that accurate delineation of the left ventricle (LV) is
necessary for the examination of cardiac function. Their study developed an automated
approach for segmenting the LV in cardiac computed tomography angiography (CCTA)
data. Segmentation is conducted in two phases. In the first, three convolutional
neural networks (CNNs) detect a bounding box around the LV. In the second, a CNN
classifier is applied to the voxels within the detected bounding box to segment the
LV. In all, sixty CCTA scans were used: fifty to train the CNNs for LV localization
and five to train the LV segmentation approach. The automatic segmentation achieved
an average Dice coefficient of 0.85 and a mean absolute surface distance of 1.1 mm.
The findings show that the LV can be segmented automatically in CCTA images by voxel
classification with convolutional neural networks.
Srinivasa et al. [27] applied CNN and LSTM to detect expressions in videos or images.
A massive amount of multimedia data is available today, and their work aims to detect
facial expressions in an image or frame. The movement of facial muscles and eye
blinking are also important for detecting expressions. Traditional ML methods such as
KNN, SVM, and K-means have already been applied, but they cannot process large
volumes of data, capture each frame in a video, and aggregate the output. With LSTM
and CNN, each frame in a video can be captured and its expression scored in a range
between 1 and 100. Finally, the results for all frames are aggregated, and a large
amount of data can be handled. Iliadis et al. [28] proposed a deep learning framework
for recovering video frames. With DL, the quality of the reconstructed video frames
improved significantly compared with traditional ML methods, and the frames are
restored in a matter of seconds. Adding more layers to the network improved the
performance further. Jindal et al. [29] applied DL in wearable devices for monitoring
real-time data. Applying DL on low-power devices is a challenging task, and their
proposed design overcomes this limitation. They combined features of the external
network with inertial sensor data and used spectral-domain pre-processing to optimize
DL on the devices [30].
3 Overview of Our Work
The dataset we use is Kaggle’s SADSB dataset, whose images are in DICOM format. In
the first step, the images are pre-processed and then stored in NumPy format. In
the second step, the NumPy files are loaded, the training set is split for
validation, and hyperparameter tuning is performed. In the third step, the images
are fed into the convolutional neural network (CNN) model with the tuned
hyperparameters. In the final step, after the continuous ranked probability score
(CRPS) is calculated, the final predictions for the test dataset are stored in a
CSV file. The aim of our project is to develop a model that automatically finds
abnormalities in the heart and reduces the effort of reading MRI images manually,
which is a time-consuming process. The proposed pipeline is shown in Fig. 10.
P. S. K. Yadhav et al.
Fig. 10 Proposed model
pipeline
4 Methodology
4.1 Dataset
We used Kaggle’s SADSB Challenge dataset, which has 500 patients in the training
set, 200 patients in the validation set, and 440 patients in the test set. Each
patient has two-chamber, four-chamber, and short-axis (SAX) views. In this chapter,
we selected the SAX images, which give a clear view of the left ventricle at
systole and diastole. Each SAX image folder has 30 images, collected from end
systole to end diastole of the left ventricle over one cycle (heartbeat), and each
patient has a different number of SAX image folders. The images are in DICOM format
and are accessed with the pydicom library. Many studies use a dataset of about 100
patient images; the dataset we used has 1140 patients and is one of the largest
cardiac MRI image datasets available online. The short-axis view of the dataset is
shown in Fig. 11.
Fig. 11 Original image of the dataset
4.2 Preprocess
In the pre-processing stage, the images are first loaded from DICOM format using
the pydicom library in Python. Next, the images are blurred, and the Canny edge
detection method is applied to find all the edges in the heart image; these edges
are used to detect the major parts of the short-axis view, such as the left
ventricle, myocardium, and right ventricle. The Canny algorithm also reduces noise
as part of edge detection. In the next step, Gaussian blur is applied to smooth the
images and reduce their detail and noise. All of the above steps (blur, Canny edge
detection, and Gaussian blur) are applied using OpenCV, a Python library that is
also used in many computer vision applications.
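The Gaussian smoothing step can be illustrated without OpenCV. The following is a minimal sketch in plain NumPy, assuming a 2-D grayscale image; the function names, the value of `sigma`, and the reflect padding are illustrative choices rather than the settings used in this chapter, and Canny edge detection is not reproduced here.

```python
import numpy as np

def gaussian_kernel_1d(sigma, radius):
    """Return a normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def gaussian_blur(image, sigma=1.0):
    """Blur a 2-D image with a separable Gaussian filter (illustrative sketch)."""
    image = np.asarray(image, dtype=np.float64)
    radius = int(3 * sigma + 0.5)          # cover roughly three standard deviations
    kernel = gaussian_kernel_1d(sigma, radius)
    h, w = image.shape
    padded = np.pad(image, radius, mode="reflect")
    taps = range(2 * radius + 1)
    # Vertical pass: weighted sum of row-shifted copies of the padded image.
    vertical = sum(kernel[i] * padded[i:i + h, :] for i in taps)
    # Horizontal pass: weighted sum of column-shifted copies.
    return sum(kernel[j] * vertical[:, j:j + w] for j in taps)

# Synthetic noisy image standing in for one MRI slice.
noisy = np.random.default_rng(0).normal(size=(64, 64))
smoothed = gaussian_blur(noisy, sigma=2.0)
```

Because the kernel sums to one, flat regions keep their intensity while pixel-level noise is averaged out, which is the effect the smoothing step relies on.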
After these pre-processing techniques are applied, the images are cropped and
resized as required. In this project, the images are cropped about the center of
the image and then resized to 64 × 64. The pre-processed images are saved in order,
because each subject (patient) has 30 slices per cycle, taken from end diastole,
when the left ventricle is extended, to end systole, when the left ventricle is
contracted. All the pre-processed data are then saved in NumPy file format. The
training and test datasets are pre-processed separately and saved in different
NumPy files. The training studies are mapped to the targets provided with the
dataset; these targets are the systole and diastole volumes, which are loaded
along with the training data.
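The crop-and-resize step described above can be sketched as follows. This is an illustrative NumPy version, not the chapter's code: the helper names are hypothetical, nearest-neighbour sampling stands in for whatever interpolation the authors used, and random arrays stand in for real slices.

```python
import numpy as np

def center_crop(image, size):
    """Crop a square of side `size` about the center of a 2-D image."""
    h, w = image.shape[:2]
    top = (h - size) // 2
    left = (w - size) // 2
    return image[top:top + size, left:left + size]

def resize_nearest(image, out_h, out_w):
    """Resize a 2-D image by nearest-neighbour sampling (assumed interpolation)."""
    h, w = image.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return image[rows[:, None], cols]

def preprocess_slice(image, out_size=64):
    """Center-crop to the largest square, then resize to out_size x out_size."""
    side = min(image.shape[:2])
    return resize_nearest(center_crop(image, side), out_size, out_size)

# Thirty synthetic slices standing in for one SAX series of one patient.
rng = np.random.default_rng(0)
slices = [rng.random((256, 192)) for _ in range(30)]
study = np.stack([preprocess_slice(s) for s in slices])   # shape (30, 64, 64)
```

Stacking the slices in acquisition order keeps the cardiac cycle intact, so one study matches the (30, 64, 64) input that the model expects; an array like `study` could then be written to disk with `np.save`.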
4.3 Load and Split Dataset
The training dataset is split into training and validation subsets so that the
performance of the model can be assessed. The dataset is loaded with NumPy, because
the data are stored in NumPy format, which loads far more quickly than the original
images, which exceed 70 GB and are larger than many datasets. The split ratio is
80% for training and the remaining 20% for validation.
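A minimal sketch of such an 80/20 split is shown below. It assumes the studies and their targets are already loaded as NumPy arrays; the function name, the fixed seed, and the small placeholder array shapes are illustrative assumptions, not details given in the chapter.

```python
import numpy as np

def split_train_val(X, y, val_fraction=0.2, seed=0):
    """Shuffle the studies once and hold out `val_fraction` of them."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_val = int(round(len(X) * val_fraction))
    val_idx, train_idx = order[:n_val], order[n_val:]
    return X[train_idx], y[train_idx], X[val_idx], y[val_idx]

# Placeholder data: 500 training studies (tiny arrays keep the example light)
# and two targets per study (systole and diastole volume).
X = np.zeros((500, 2, 4, 4), dtype=np.float32)
y = np.arange(1000, dtype=np.float32).reshape(500, 2)

X_train, y_train, X_val, y_val = split_train_val(X, y)
```

With 500 training studies this yields 400 studies for training and 100 for validation, and no study appears in both subsets.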
4.4 Augmentation
Augmentation is applied to the training dataset to improve the performance of the
model. In this project, rotation augmentation and shift augmentation are used. In
shift augmentation, the image pixels are shifted from one position to another
within a range of (0.1, 0.1), and in rotation augmentation, the images are rotated
within a range of 15°.
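The shift augmentation can be sketched in NumPy as follows. The interpretation of 0.1 as a fraction of the image height and width, the zero fill for vacated pixels, and the function names are assumptions made for illustration; rotation is omitted because it would normally rely on an image-processing library.

```python
import numpy as np

def shift_image(image, dy, dx):
    """Shift a 2-D image by dy rows and dx columns, filling vacated pixels with 0."""
    h, w = image.shape
    out = np.zeros_like(image)
    src_rows = slice(max(0, -dy), min(h, h - dy))
    dst_rows = slice(max(0, dy), min(h, h + dy))
    src_cols = slice(max(0, -dx), min(w, w - dx))
    dst_cols = slice(max(0, dx), min(w, w + dx))
    out[dst_rows, dst_cols] = image[src_rows, src_cols]
    return out

def random_shift(image, max_fraction=0.1, rng=None):
    """Shift by a random offset of at most `max_fraction` of each dimension."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape
    max_dy, max_dx = int(h * max_fraction), int(w * max_fraction)
    dy = int(rng.integers(-max_dy, max_dy + 1))
    dx = int(rng.integers(-max_dx, max_dx + 1))
    return shift_image(image, dy, dx)

# A single bright pixel makes the effect of the shift easy to follow.
image = np.zeros((64, 64))
image[10, 10] = 1.0
shifted = shift_image(image, 2, -3)      # two rows down, three columns left
augmented = random_shift(image, rng=np.random.default_rng(0))
```

For a 64 × 64 slice, a range of 0.1 corresponds to offsets of at most 6 pixels in each direction, so the left ventricle stays well inside the frame.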
4.5 Model
The model is built with Keras. It is a sequential model, in which all the layers
are arranged in a stack and each layer has exactly one input tensor and one output
tensor. A custom activation function is used to center and normalize the data
sample-wise: each image is normalized individually over all of its features by
subtracting its mean (sample-wise centering) and then dividing by its standard
deviation (sample-wise standard normalization).
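The sample-wise normalization can be written in a few lines of NumPy. This is a sketch under the assumption that a batch has shape (samples, slices, height, width); the function name and the small `eps` term that guards against division by zero are illustrative additions.

```python
import numpy as np

def center_normalize(batch, eps=1e-7):
    """Subtract each sample's mean and divide by its standard deviation."""
    batch = np.asarray(batch, dtype=np.float64)
    axes = tuple(range(1, batch.ndim))             # every axis except the sample axis
    mean = batch.mean(axis=axes, keepdims=True)
    std = batch.std(axis=axes, keepdims=True)
    return (batch - mean) / (std + eps)

# Four synthetic studies with very different intensity scales.
rng = np.random.default_rng(0)
scales = np.array([1.0, 10.0, 50.0, 100.0]).reshape(4, 1, 1, 1)
batch = rng.random((4, 30, 64, 64)) * scales
normalized = center_normalize(batch)
```

After this step every study has approximately zero mean and unit standard deviation, regardless of the scanner's original intensity range.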
The input shape of the images to the model is (30, 64, 64): the images are resized
to 64 × 64 during pre-processing, and each subject has 30 slices. The model has 8
convolutional layers, 9 activation layers, and 4 max-pooling layers with a pool
size of 2. After the flatten layer and another activation, a dense layer is added
with an L2 regularizer and a dropout of 0.5. In addition, 2 zero-padding layers
are used, together with dropout layers of 0.25. The activation function is ReLU,
which does not pass negative values: whenever its input is negative, it returns 0.
One advantage of ReLU is that it does not activate all neurons at once, since a
neuron whose input is less than 0 is not activated.
The model is compiled with the Adam optimizer, with a learning rate of 10e-2, and
the root mean-squared error loss function. Adam, a widely used deep learning
optimizer, was chosen for this project because it combines the best features of
RMSProp and AdaGrad. Figure 12 shows the modular blocks of the proposed
architecture, and Fig. 13 shows the proposed CNN architecture.
Fig. 12 Proposed architecture modular blocks as follows: (a) dense layer with ReLU activation,
dropout layer, and dense layers, (b) layer 1 contains max pooling, ReLU, and dropout layers, (c)
layer block contains zero-padding, max pooling, ReLU, and dropout layers
Fig. 13 The proposed architecture of convolutional neural network (CNN)
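The behaviour of ReLU described in this section can be checked with a short NumPy sketch; the function name and the sample values are illustrative only.

```python
import numpy as np

def relu(x):
    """Return x where x is positive and 0 elsewhere."""
    return np.maximum(x, 0.0)

pre_activations = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
activations = relu(pre_activations)

# Units that received a negative (or zero) input stay inactive.
inactive_fraction = float(np.mean(activations == 0.0))
```

Here three of the five units output zero, which illustrates why only part of the network is active for any given input.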
4.6 Training the Model
The hyperparameters used for training the model are iterations, epochs, and batch
size. The number of iterations used here is 40, with only one epoch per iteration
with a batch size of 32. The image augmentation is performed for every iteration,
and the continuous ranked probability score (CRPS) is also calculated.
5 Results and Discussion
This section discusses the model’s performance, the evaluation metrics, the
training environment, the results, the computation time, the loss function, and a
comparison with other methods.
5.1 Environment
The experiments were conducted on a 6th-generation Intel i7 processor at 3.4 GHz
with 16 GB of RAM and an Nvidia GeForce GTX 960M GPU with 4 GB of memory. The
project code is written entirely in Python with Keras. The model was trained with
the Adam optimizer at a learning rate of 10e-2 for 40 epochs with a batch size of
32 to predict the left ventricle volumes.
5.2 Evaluation Metrics
The proposed approach is assessed with the continuous ranked probability score
(CRPS). The CRPS is calculated from two inputs: the true labels and the predicted
values. It is a mean-squared error computed on cumulative distribution functions,
as follows:
$$\mathrm{CRPS} = \frac{1}{N} \sum_{i=1}^{N} \left( x_i - \hat{x}_i \right)^2 \tag{1}$$
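The score in Eq. (1) can be sketched in NumPy as a mean-squared error between predicted and true cumulative distribution functions. The sketch assumes that each prediction is a CDF over 600 one-millilitre volume bins and that the true CDF is a step function at the measured volume, which is how the Kaggle competition framed the task; these details, and the function names, are assumptions rather than statements from this chapter.

```python
import numpy as np

def step_cdf(volume_ml, n_bins=600):
    """Ideal CDF for a known volume: 0 below the volume, 1 from it upward."""
    return (np.arange(n_bins) >= volume_ml).astype(np.float64)

def crps(pred_cdfs, true_volumes, n_bins=600):
    """Mean-squared difference between predicted and true CDFs over all bins."""
    pred_cdfs = np.asarray(pred_cdfs, dtype=np.float64)
    true_cdfs = np.stack([step_cdf(v, n_bins) for v in true_volumes])
    return float(np.mean((pred_cdfs - true_cdfs) ** 2))

# A perfect prediction and a deliberately poor one for a true volume of 300 mL.
perfect = crps([step_cdf(300)], [300])
poor = crps(np.zeros((1, 600)), [300])
```

A perfect CDF scores 0, and the score grows as predicted probability mass moves away from the true volume, so lower values such as those reported in the next section indicate better predictions.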
5.3 Training Details and Evaluation
The output images after applying the pre-processing methods are shown in Fig. 14.
After training, the CRPS on the validation set is 0.043. The same trained model is
applied to the test dataset to generate a submission file containing CDF values
for each subject’s systole and diastole volumes. The CRPS on the test dataset is
0.04307. The CRPS values are plotted over the epochs in Fig. 15. Pre-processing
the dataset takes 2 h; the proposed deep CNN model takes 2 min per epoch to train
and 10 s to evaluate the CRPS.
Fig. 14 (a) After applying CLAHE to the original image, (b) Gaussian blur is applied on image
a, (c) Canny edge detection is applied on the image b
Fig. 15 CRPS train and test
5.4 Comparison with Other Models and Pre-processing Methods
The pre-processing methods used in this chapter are Gaussian blur, Canny edge
detection, stacking the images and applying the standard deviation to find the
moving pixels in the image, data augmentation, CLAHE, and TV Chambolle denoising.
These give better results than the methods used in [7], which applied Gaussian
noise, augmented the data using shifting and rotation, and cropped the ROI in the
frame. In [8], a prefilter was applied to remove images that were not well
aligned, the images were downsampled, and the ROI was extracted.
In [8], a U-net architecture is used for segmentation; it consists of encoding and
decoding blocks with convolutional and max-pooling layers. In [7], the input to
the ResNet model is the cropped raw image, and the activation functions used in
the model are ReLU and sigmoid. In this chapter, we have proposed a customized CNN
model with 8 convolutional layers and modular blocks consisting of ReLU
activation, max pooling, dropout, and zero-padding layers, as shown in Fig. 7. The
loss function is the root mean-squared error (RMSE), and the optimizer is Adam.
Note, however, that the dataset used in this chapter differs from those used in
[7] and [8].
6 Conclusion
In this chapter, we developed a CNN model to address the left ventricle
segmentation problem in cardiac magnetic resonance images. Our proposed method
consists of three steps. In the first step, we analyzed the cardiac MRI images
used to predict whether the patient is normal or abnormal; the pre-processed
images are then given as input to the model. In the second step, we proposed a CNN
model for estimating the left ventricle volume when it is contracted (systole) and
extended (diastole). In the third step, the proposed method predicts the volume of
the left ventricle at systole and diastole. We applied our method to the Kaggle
SADSB dataset, which is publicly available.
References
1. Luo, G., Dong, S., Wang, K., Zuo, W., Cao, S., & Zhang, H. (2017). Multi-views fusion
CNN for left ventricular volumes estimation on cardiac MR images. IEEE Transactions on
Biomedical Engineering, 65(9), 1924–1934.
2. Liao, F., Chen, X., Hu, X., & Song, S. (2017). Estimation of the volume of the left ventricle
from MRI images using deep neural networks. IEEE Transactions on Cybernetics, 49(2), 495–
504.
3. Dixit, M., Tiwari, A., Pathak, H., & Astya, R. (2018). An overview of deep learning architectures, libraries and its applications areas. In 2018 International Conference on Advances in
Computing, Communication Control and Networking (ICACCCN) (pp. 293–297).
4. Zhang, C., Patras, P., & Haddadi, H. (2019). Deep learning in mobile and wireless networking:
A survey. IEEE Communications Surveys & Tutorials, 21(3), 2224–2287.
5. Barz, B., & Denzler, J. (2020). Deep learning on small datasets without pre-training using
cosine loss. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer
Vision (pp. 1371–1380).
6. Guo, F., Ng, M., Goubran, M., Petersen, S., Piechnik, S., Neubauer, S., & Wright, G. (2020).
Improving cardiac MRI convolutional neural network segmentation on small training datasets
and dataset shift: A continuous kernel cut approach. Medical Image Analysis, 61, 101636.
7. Budai, A., Suhai, F., Csorba, K., Toth, A., Szabo, L., Vago, H., & Merkely, B. (2020).
Fully automatic segmentation of right and left ventricle on short-axis cardiac MRI images.
Computerized Medical Imaging and Graphics, 85, 101786.
8. Wu, B., Fang, Y., & Lai, X. (2020). Left ventricle automatic segmentation in cardiac MRI
using a combined CNN and U-net approach. Computerized Medical Imaging and Graphics,
82, 101719.
9. Khened, M., Kollerathu, V., & Krishnamurthi, G. (2019). Fully convolutional multi-scale
residual DenseNets for cardiac segmentation and automated cardiac diagnosis using ensemble
of classifiers. Medical Image Analysis, 51, 21–45.
10. Nasr-Esfahani, M., Mohrekesh, M., Akbari, M., Soroushmehr, S., Nasr-Esfahani, E., Karimi,
N., Samavi, S., & Najarian, K. (2018). Left ventricle segmentation in cardiac MR images
using fully convolutional network. In 2018 40th Annual International Conference of the IEEE
Engineering in Medicine and Biology Society (EMBC) (pp. 1275–1278).
11. Chaitanya, K., Erdil, E., Karani, N., & Konukoglu, E. (2020). Contrastive learning of global and
local features for medical image segmentation with limited annotations. Advances in Neural
Information Processing Systems, 33, 12546–12558.
12. Valvano, G., Leo, A., & Tsaftaris, S. (2021). Learning to segment from scribbles using multiscale adversarial attention gates. IEEE Transactions on Medical Imaging, 40(8), 1990–2001.
13. Wang, Z., & Wang, Z. (2020). Fully automated segmentation of the left ventricle in magnetic
resonance images. arXiv preprint arXiv:2007.10665.
14. Luo, G., Sun, G., Wang, K., Dong, S., & Zhang, H. (2016). A novel left ventricular volumes
prediction method based on deep learning network in cardiac MRI. In 2016 Computing in
Cardiology Conference (CinC) (pp. 89–92).
15. Jang, Y., Hong, Y., Ha, S., Kim, S., & Chang, H.J. (2017). Automatic segmentation of LV
and RV in cardiac MRI. In International Workshop on Statistical Atlases and Computational
Models of the Heart (pp. 161–169).
16. Tan, L., Liew, Y., Lim, E., & McLaughlin, R. (2017). Convolutional neural network regression
for short-axis left ventricle segmentation in cardiac cine MR sequences. Medical Image
Analysis, 39, 78–86.
17. Wang, X., Zhai, S., & Niu, Y. (2020). Left ventricle landmark localization and identification
in cardiac MRI by deep metric learning-assisted CNN regression. Neurocomputing, 399, 153–
170.
18. Rostami, A., Amirani, M., & Yousef-Banaem, H. (2020). Segmentation of the left ventricle
in cardiac MRI based on convolutional neural network and level set function. Health and
Technology, 10(5), 1155–1162.
19. Medley, D., Santiago, C., & Nascimento, J. (2019). Segmenting the left ventricle in cardiac
in cardiac MRI: From handcrafted to deep region based descriptors. In 2019 IEEE 16th
International Symposium on Biomedical Imaging (ISBI 2019) (pp. 644–648).
20. Emad, O., Yassine, I., & Fahmy, A. (2015). Automatic localization of the left ventricle in
cardiac MRI images using deep learning. In 2015 37th Annual International Conference of the
IEEE Engineering in Medicine and Biology Society (EMBC) (pp. 683–686).
21. Simantiris, G., & Tziritas, G. (2020). Cardiac MRI segmentation with a dilated CNN incorporating domain-specific constraints. IEEE Journal of Selected Topics in Signal Processing,
14(6), 1235–1243.
22. Zotti, C., Luo, Z., Lalande, A., & Jodoin, P.M. (2018). Convolutional neural network with
shape prior applied to cardiac MRI segmentation. IEEE Journal of Biomedical and Health
Informatics, 23(3), 1119–1128.
23. Chang, Y., Song, B., Jung, C., & Huang, L. (2018). Automatic segmentation and cardiopathy
classification in cardiac MRI images based on deep neural networks. In 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 1020–1024).
24. Bernard, O., Lalande, A., Zotti, C., Cervenansky, F., Yang, X., Heng, P.A., Cetin, I., Lekadir, K.,
Camara, O., Ballester, M., et al. (2018). Deep learning techniques for automatic MRI cardiac
multi-structures segmentation and diagnosis: is the problem solved? IEEE Transactions on
Medical Imaging, 37(11), 2514–2525.
25. Curiale, A., Colavecchia, F., Kaluza, P., Isoardi, R., & Mato, G. (2017). Automatic myocardial
segmentation by using a deep learning network in cardiac MRI. In 2017 XLIII Latin American
Computer Conference (CLEI) (pp. 1–6).
26. Zreik, M., Leiner, T., De Vos, B., Hamersvelt, R., & Viergever, (2016). Automatic segmentation
of the left ventricle in cardiac CT angiography using convolutional neural networks. In 2016
IEEE 13th International Symposium on Biomedical Imaging (ISBI) (pp. 40–43).
27. Srinivasa, K., & Anupindi, S. (2018). Performance analysis and application of expressiveness
detection on facial expression videos using deep learning techniques. Data-Enabled Discovery
and Applications, 2(1), 1–11.
28. Iliadis, M., Spinoulas, L., & Katsaggelos, A. (2018). Deep fully-connected networks for video
compressive sensing. Digital Signal Processing, 72, 9–18.
29. Jindal, V. (2016). Integrating mobile and cloud for PPG signal selection to monitor heart rate
during intensive physical exercise. In Proceedings of the International Conference on Mobile
Software Engineering and Systems (pp. 36–37).
30. Hatcher, W., & Yu, W. (2018). A survey of deep learning: Platforms, applications and emerging
research trends. IEEE Access, 6, 24411–24432.
MRI Image Analysis for Brain Tumor
Detection Using Deep Learning
Prachi Chauhan, Hardwari Lal Mandoria, and Alok Negi
1 Introduction
A primary brain or spinal cord tumor is one that begins in the brain or spinal
cord [1]. An estimated 24,530 adults in the United States (13,840 males and 10,690
females) [2] will be diagnosed with primary cancerous tumors of the brain and
spinal cord this year. In addition to primary brain tumors, there are secondary
brain tumors, also known as brain metastases. When a tumor begins elsewhere in the
body and spreads to the brain, this is known as metastasis. Bladder, breast,
kidney, and lung cancers, as well as leukemias, lymphoma, and melanoma, are among
the most prevalent cancers that spread to the brain. According to the World Health
Organization (WHO), a total of 19,292,789 new cancer cases were reported for 2020:
905,677 from liver cancer, 2,206,771 from lung cancer, 2,261,419 from breast
cancer, 1,414,259 from prostate cancer, 1,931,590 from colorectum cancer,
1,089,103 from stomach cancer, 604,127 from cervix uteri cancer, 604,100 from
esophagus cancer, and 8,275,743 from other types of cancer (World Health
Organization, New Cancer Release Report 2020) [3]. The statistics for these cases
are shown in Fig. 1.
Brain tumors are a broad group of cancers that may begin in almost any tissue of
the brain whenever abnormal cells grow uncontrollably and invade adjoining regions
of the brain [4]. Tumors can also arise from the membranes that surround the brain
(meninges), glands, and nerves. In general, tumors
P. Chauhan (✉) · H. L. Mandoria
Department of Information Technology, G.B. Pant University of Agriculture and Technology,
Pantnagar, India
A. Negi
Department of Computer Science and Engineering, National Institute of Technology,
Uttarakhand, India
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
S. Pandey et al. (eds.), Role of Data-Intensive Distributed Computing Systems in
Designing Data Solutions, EAI/Springer Innovations in Communication and
Computing, https://doi.org/10.1007/978-3-031-15542-0_16
Fig. 1 The number of new cancer cases in 2020, both sexes, all ages
can wreak havoc on brain cells, and they can cause cell damage by increasing the
pressure within the skull. Many healthcare systems in low- and middle-income
countries are ill-equipped to deal with the issue, and a substantial proportion of
brain tumor patients around the world lack adequate access to high-quality
diagnosis and care. Recovery rates for many forms of cancer are improving in
countries with good healthcare systems owing to early diagnosis, comprehensive
treatment, and survivorship services. Early detection of and intervention for a
brain tumor are therefore important for saving lives. Visual assessment and manual
procedures are commonly used to diagnose certain types of tumors, but manually
interpreting medical images is time-consuming and prone to error. As a result,
deep-learning-based computerized analysis has shown promise as a diagnostic
tool [5, 6].
Deep learning is already widely used in a number of fields, especially biomedical
imaging, because its implementation does not necessarily require the expertise of
a subject matter specialist; it does, however, require a large and varied set of
data in order to achieve good prediction performance. For instance, convolutional
neural networks (CNNs) have been shown to be capable of detecting tumors with good
results [7]. For both diagnosis and treatment, researchers have focused on brain
magnetic resonance imaging (MRI) as one of the best imaging techniques for
diagnosing brain tumors and predicting tumor progression. Because of their high
resolution, MRI images have a major impact on automated medical image
interpretation: they provide a great deal of information about the structure of
the brain and about anomalies inside the brain tissues [8–10]. For this reason, we
next describe deep learning.
Fig. 2 Deep learning abstraction levels
As shown in Fig. 2, deep learning (DL) is a family of machine learning approaches
that learn at multiple levels, corresponding to different levels of abstraction.
The levels represent distinct levels of concepts: higher-level concepts are
defined from lower-level ones, and the same lower-level concepts can help to
define many higher-level concepts. DL models [11, 12] become more and more
accurate as they analyze more data, learning from past results to improve their
ability to find correlations and associations. Deep learning can be applied to
virtually every field of science and has resulted in significant advancements.
With its recent progress in robust recognition, DL improves existing applications
by solving common problems and also opens up additional domains of study. Deep
learning has shown excellent results in a variety of fields and is gradually
making its way into groundbreaking technology with high-value applications in the
medical sector [16–19]. Researchers need to improve the performance of automation
and smart decision-making in primary patient care and public health systems. The
goal of this chapter is therefore to explore the potential of deep learning models
for brain tumor segmentation from microscopic tissue images by integrating an
advanced convolution model [20–22] and evaluating it with different performance
metrics.
2 Related Work
Sarhan et al. [10] developed a new CAD methodology for recognizing brain tumors in
MRI images. The proposed design extracts features from brain MRI images by
exploiting the strong energy compactness property of the discrete wavelet
transform (DWT). The wavelet features were used to characterize the input MRI
image with a CNN, which gave the proposed method a lower time complexity. The
brain images were MRI scans from the Figshare (Cheng) database. The extracted
features were fed into both the proposed WCNN method and an SVM classifier, and
the output of the proposed system was compared with that of the SVM classifier to
demonstrate its accuracy and robustness. Using a decomposition level of two and
the Haar wavelet, computational experiments on the Figshare (Cheng) database
yielded 99.3% recognition accuracy. According to the simulation results, the
proposed method consistently outperforms the SVM model on the evaluation metrics.
Rao et al. [1] reviewed different approaches for fully automated brain tumor
segmentation and classification that do not require user interaction. In the first
stage, the image data were pre-processed with an edge-preserving anisotropic
diffusion filter and then segmented using GLCM texture features to distinguish
tumor, white matter (WM), grey matter (GM), and edema regions. The features were
then analyzed to select the appropriate ones for use with an artificial neural
network (ANN) and a support vector machine (SVM). Finally, these machine learning
methods were used to categorize the extracted features as tumor or non-tumor.
Chauhan et al. [13] proposed the DWA-DNN method for classification of brain MRI.
The results indicate that DWA-DNN was more accurate and handled large datasets
more conveniently. The proposed DWA layer was made up of a DWT and an AE. Within
this layer, the image was encoded using the AE and afterward processed by a DWT
using the second-order Daubechies mother wavelet, which passes it through low-pass
and high-pass filters to obtain the approximation and detail coefficients,
respectively. The approximation coefficients were then used in the DNN model for
classification. Compared with the other methods, the accuracy of the CNN model was
almost identical to that of DWA-DNN, but it was not as efficient due to the use of
deep neural networks. This means that classification accuracy continued to improve
whenever the extracted features were precise.
Havaei et al. [14] presented a deep convolutional neural-network-based method for
automatically segmenting brain tumors. The authors assessed the effects of various
architectures on performance. Results from the BRATS 2013 online evaluation system
showed that the proposed method significantly outperformed the published
state-of-the-art models, as presented at MICCAI 2013, in terms of both speed and
accuracy. The improvement was obtained through a novel two-pathway architecture
(which can model both local features and global context) and by stacking two CNNs
to model local label dependencies. Training relied on two techniques that allowed
the CNNs to be trained efficiently even though the label distribution was
unbalanced. The resulting segmentation system was very fast owing to the
convolutional nature of the models and an efficient GPU implementation. The time
required to segment the whole brain with any of the CNN architectures ranges
between 25 s and 3 min, which makes the proposed method practical for
segmentation.
Amin et al. [15] presented a computer-aided system for segmenting and classifying
brain cancer using magnetic resonance imaging (MRI). Various methodologies were
used to segment candidate lesions. Then, for each individual tumor, a feature
array was selected based on shape appearance and severity. The accuracy of the
proposed framework was then evaluated using a support vector machine (SVM)
classifier with different cross-validations on the selected features. The proposed
approach was tested on three benchmark datasets: Harvard, RIDER, and Local. The
system achieved an overall precision of 97.1%, an area under the curve of 0.98, a
sensitivity of 91.9%, and a specificity of 98.0%.
Mohsen et al. [17] proposed an effective method for classifying brain MRIs into
normal and three kinds of malignant brain tumors (glioblastoma, sarcoma, and
metastatic bronchogenic carcinoma) using the discrete wavelet transform (DWT) and
deep neural networks (DNNs). The architecture of the new methodology was similar
to that of a convolutional neural network (CNN), but it had lower hardware
requirements and took a reasonable amount of time to process large images
(256 × 256). Furthermore, compared with standard classifiers, the DNN classifier
demonstrated high precision, and it performed well across all performance
measures.
Zhao et al. [18] developed a brain tumor segmentation approach that integrates
fully convolutional neural networks (FCNNs) and conditional random fields (CRFs)
in a unified framework to obtain segmentation results with appearance and spatial
consistency. The authors tested the approach using imaging data from the
Multimodal Brain Tumor Image Segmentation Challenge (BRATS) 2013, BRATS 2015, and
BRATS 2016. The experimental results indicated that a segmentation model built
using Flair, T1c, and T2 scans achieved performance comparable to that of models
built using Flair, T1, T1c, and T2 scans.
3 Proposed Work
The proposed approach employs a CNN model to diagnose brain tumors from MRI scans;
the model is trained using a Python script. This research also demonstrates the
capability of deep learning to identify brain tumors automatically by classifying
scans as tumorous or non-tumorous. The proposed work is depicted as a block
diagram in Fig. 3.
Fig. 3 Proposed work
3.1 Dataset Description
The dataset for this research was obtained from Kaggle and consists of two
folders, yes and no, containing 253 brain MRI images in total. The yes folder
holds 155 tumorous brain MRI scans, whereas the no folder holds 98 non-tumorous
brain MRI scans. Figure 4 depicts the dataset distribution prior to augmentation.
3.2 Data Augmentation
Data augmentation is applied to generate additional images because of the
dataset’s limited size. It is also performed to address the problem of class
imbalance: the tumorous scans account for 61% of the dataset (155 images), while
the non-tumorous scans account for 39% (98 images). To balance the data, we create
8 new images for each image in the “no” class and 5 new images for each image in
the “yes” class. The following augmentation parameters are used:
• Shear: 0.1
• Rotation range: 10
• Width shift range: 0.1
• Height shift range: 0.1
• Brightness range: (0.3, 1.0)
• Horizontal flip: True
• Vertical flip: True
• Fill mode: “nearest”
MRI Image Analysis for Brain Tumor Detection Using Deep Learning
Fig. 4 Dataset distribution before augmentation (bar chart titled “Distribution of different classes of Dataset”; categories Yes and No; vertical axis from 0 to 160)
Fig. 5 Data distribution after data augmentation
After augmentation, the dataset contains a total of 1812 images, of which
51.32% (930 images) are tumorous and 48.67% (882 images) are non-tumorous.
Figure 5 depicts the class distribution of the dataset after augmentation.
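The balancing arithmetic above can be checked with a short sketch. The dictionary below only restates the listed settings; its key names follow the argument names of Keras's `ImageDataGenerator`, which is an assumption, since the chapter does not name the augmentation library. The helper `augmented_total` is hypothetical and assumes the original images are kept alongside the generated ones.

```python
# Augmentation settings as listed in the text. The key names mirror the
# arguments of Keras's ImageDataGenerator (an assumption; the chapter does
# not name the library that was used).
AUGMENTATION_PARAMS = {
    "shear_range": 0.1,
    "rotation_range": 10,
    "width_shift_range": 0.1,
    "height_shift_range": 0.1,
    "brightness_range": (0.3, 1.0),
    "horizontal_flip": True,
    "vertical_flip": True,
    "fill_mode": "nearest",
}


def augmented_total(n_original: int, n_new_per_image: int) -> int:
    """Return the class size after augmentation: originals plus new copies."""
    return n_original * (1 + n_new_per_image)


n_yes = augmented_total(155, 5)  # tumorous class, 5 new images per scan
n_no = augmented_total(98, 8)    # non-tumorous class, 8 new images per scan
total = n_yes + n_no
print(n_yes, n_no, total)
```

The reported class sizes are reproduced only when each original scan is counted together with its generated copies, which suggests the originals were retained.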
Fig. 6 Original image after cropping (panels: Original Image, Cropped Image)
3.3 Loading and Splitting Augmented Data
We provide two arguments to load the augmented data: the first is a list of directory
paths for the folders “yes” and “no,” and the second is the image size. To locate the
extreme top, bottom, left, and right locations of the brain, the very first scan in both
directories is examined and then cropped to accommodate only the brain image, as
seen in Fig. 6.
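The cropping step can be illustrated with a minimal sketch. The chapter does not say how the extreme points are found (contour detection is a common choice), so the version below assumes a simple intensity threshold separates the brain from the dark background. The function name `crop_to_content` and the toy image are hypothetical, and a real pipeline would normally use an array library rather than nested lists.

```python
def crop_to_content(image, threshold=0):
    """Crop a 2-D grayscale image (a list of rows) to the bounding box of
    all pixels brighter than `threshold`."""
    # Rows and columns that contain at least one above-threshold pixel.
    rows = [r for r, row in enumerate(image) if any(p > threshold for p in row)]
    cols = [c for c in range(len(image[0]))
            if any(row[c] > threshold for row in image)]
    if not rows or not cols:
        return image  # nothing above the threshold; leave the image unchanged
    top, bottom = rows[0], rows[-1]      # extreme top and bottom
    left, right = cols[0], cols[-1]      # extreme left and right
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Toy 4 x 5 "scan": zeros are background, non-zero values are brain tissue.
scan = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 0, 0],
    [0, 7, 8, 6, 0],
    [0, 0, 0, 0, 0],
]
cropped = crop_to_content(scan)
print(cropped)
```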
Because the images in the dataset are of varying sizes, they are resized to (224,
224, 3) before being fed into the neural network. After that, normalization is used
to scale the pixel values to the range 0 to 1. Images and their labels are appended
to X and Y, respectively, and then shuffled. The total number of images is 1812, so
X has shape (1812, 224, 224, 3) and Y has shape (1812, 1). The kernel density plots
for both classes are shown in Figs. 7 and 8.
Sample images for both classes are shown in Figs. 9 and 10.
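The normalization described above amounts to dividing each pixel value by the maximum intensity. The sketch below assumes 8-bit images (maximum value 255), which the chapter does not state; `normalize` and `shape` are hypothetical helpers, and nested lists stand in for the image arrays.

```python
def normalize(image, max_value=255):
    """Scale integer pixel values of an H x W x C image into [0, 1]."""
    return [[[channel / max_value for channel in pixel] for pixel in row]
            for row in image]


def shape(image):
    """Return (height, width, channels) of a nested-list image."""
    return (len(image), len(image[0]), len(image[0][0]))


tiny = [[[0, 128, 255], [64, 64, 64]]]  # a 1 x 2 x 3 stand-in for a scan
scaled = normalize(tiny)
print(shape(scaled), scaled[0][0])
```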
The data (X, Y) is then divided into three parts: 70% for training (1268 images),
15% for validation (272 images), and 15% for testing (272 images). The shapes of
the split data are listed below.
• X train shape: (1268, 224, 224, 3)
• Y train shape: (1268, 1)
• X val shape: (272, 224, 224, 3)
• Y val shape: (272, 1)
• X test shape: (272, 224, 224, 3)
• Y test shape: (272, 1)
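One way to arrive at these sizes is sketched below. It assumes the training share is rounded down and the remainder is halved between validation and test; the chapter does not describe the exact procedure, and `split_indices` and the seed value are hypothetical.

```python
import random


def split_indices(n_samples, train_frac=0.70, seed=0):
    """Shuffle sample indices and split them into train/validation/test.

    The training share is rounded down; the remaining samples are divided
    between validation and test, with any odd sample going to test.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(n_samples * train_frac)
    n_val = (n_samples - n_train) // 2
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1812)
print(len(train_idx), len(val_idx), len(test_idx))
```

Splitting indices rather than the arrays themselves keeps each image paired with its label when X and Y are indexed with the same list.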
Fig. 7 Kernel density plot of tumorous images after augmentation
Fig. 8 Kernel density plot of non-tumorous images after augmentation
3.4 CNN Architecture
In a CNN, every input image is processed through a sequence of convolution layers
with filters (kernels), pooling layers, and fully connected (FC) layers, followed
by a Softmax function that classifies the object with probability values between
0 and 1.
A convolution layer with 32 filters is applied first, followed by batch
normalization and the ReLU activation function, defined as f(x) = max(0, x). The
model then uses two max-pooling layers with a 4 × 4 stride, followed by a flatten
layer. Finally, one dense layer with the sigmoid activation function produces the
output. Figure 11 depicts the layered design of the classifier.
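To make the layer sequence concrete, the sketch below traces the tensor shape through the described stack and defines the two activation functions. Several details are assumptions, because the text does not give them: the convolution is taken to preserve the spatial size ("same" padding, stride 1), and each max-pooling layer is taken to use a non-overlapping 4 × 4 window. The helper names are hypothetical.

```python
import math


def relu(x):
    """ReLU activation: f(x) = max(0, x)."""
    return max(0.0, x)


def sigmoid(x):
    """Sigmoid activation used by the single output unit."""
    return 1.0 / (1.0 + math.exp(-x))


def max_pool_shape(shape, pool=4):
    """Output shape of a non-overlapping pool x pool max-pooling layer."""
    height, width, channels = shape
    return (height // pool, width // pool, channels)


input_shape = (224, 224, 3)
# Assumption: the 32-filter convolution keeps the 224 x 224 spatial size.
conv_out = (input_shape[0], input_shape[1], 32)
pool1 = max_pool_shape(conv_out)
pool2 = max_pool_shape(pool1)
flat_units = pool2[0] * pool2[1] * pool2[2]  # input size of the dense layer
print(conv_out, pool1, pool2, flat_units)
```

Under these assumptions the flatten layer feeds 6272 values into the final dense unit; a different kernel size or padding would change that number.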
Fig. 9 Brain tumor yes
Fig. 10 Brain tumor no
4 Result and Analysis
The model is trained on Google Colab for 35 epochs using a Python script, with a
batch size of 32 and the Adam optimizer. For evaluation, accuracy curves, loss
curves, and F1-score-based analyses are used. Equations 1, 2, 3, 4, and 5 define
accuracy, log loss, precision, recall, and F1 score, respectively.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$
$$\text{logloss} = -\frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{M} y_{ij} \log(p_{ij}) \tag{2}$$
Fig. 11 CNN layered architecture
Fig. 12 Training and validation accuracy of CNN
$$\text{Precision} = \frac{TP}{TP + FP} \tag{3}$$
$$\text{Recall} = \frac{TP}{FN + TP} \tag{4}$$
$$\text{F1 Score} = \frac{2 \times (\text{Precision} \times \text{Recall})}{\text{Precision} + \text{Recall}} \tag{5}$$
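Equations 1 to 5 translate directly into code. The sketch below is illustrative only: the function names and the confusion-matrix counts in the usage lines are made up and are not results from the chapter. The clipping constant `eps` is an added safeguard against log(0), and the ratio functions do not guard against empty denominators.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1 score (Eqs. 1, 3, 4, 5)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (fn + tp)
    f1 = 2 * (precision * recall) / (precision + recall)
    return accuracy, precision, recall, f1


def log_loss(y_true, y_prob, eps=1e-15):
    """Log loss (Eq. 2) for N samples and M classes.

    y_true holds one-hot rows; y_prob holds predicted class probabilities.
    """
    total = 0.0
    for y_row, p_row in zip(y_true, y_prob):
        for y, p in zip(y_row, p_row):
            p = min(max(p, eps), 1 - eps)  # keep log() finite
            total += y * math.log(p)
    return -total / len(y_true)


# Hypothetical counts, chosen only to exercise the formulas.
acc, prec, rec, f1 = classification_metrics(tp=8, tn=7, fp=2, fn=3)
# Two samples with an uninformative 50/50 prediction for each class.
loss = log_loss([[1, 0], [0, 1]], [[0.5, 0.5], [0.5, 0.5]])
print(acc, prec, rec, f1, loss)
```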
As illustrated in Figs. 12 and 13, the model’s accuracy and loss were recorded
per epoch. Training accuracy approaches 100%, whereas validation accuracy is
nearly 90%, as indicated in the graph. We also computed the F1 score for the
validation set, which was 89.37%.
We also evaluated the proposed model on the test set. There are 272 scans in the
test set, 49.63% of which are tumorous (135 images) and the remaining 50.36% are
non-tumorous (137 images). The proposed model achieved a test accuracy of 88.23%
with a log loss of 0.50. The F1 score on the test set is 87.78%.
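The test-set figures can be cross-checked with simple arithmetic. In the sketch below the class counts come from the text, while the number of correctly classified scans is inferred from the reported accuracy and is not a reported value; `percent` is a hypothetical helper.

```python
def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


n_test = 272
n_tumorous, n_non_tumorous = 135, 137

share_tumorous = percent(n_tumorous, n_test)
share_non_tumorous = percent(n_non_tumorous, n_test)

# Inferred, not reported: the whole number of correct predictions that
# best matches the stated 88.23% test accuracy.
n_correct = round(0.8823 * n_test)
implied_accuracy = percent(n_correct, n_test)

print(f"{share_tumorous:.4f} {share_non_tumorous:.4f}")
print(n_correct, f"{implied_accuracy:.4f}")
```

Comparing the printed values with the text suggests that the reported percentages are truncated to two decimals rather than rounded.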
In this chapter, we built a classification model using custom CNN layers to
classify whether an individual has a brain tumor from MRI images. The model
performed well with a small number of training samples, but test accuracy might be
improved by adding more layers or by using deeper pre-trained architectures such
as VGG16 or ResNet-34.
Fig. 13 Training and validation loss of CNN
5 Conclusion
Medical image analysis plays an important role in the healthcare industry,
particularly in non-invasive therapy and clinical research. Healthcare
professionals and radiologists can use medical image analysis and reporting tools
to diagnose conditions accurately. Medical imaging has emerged as one of the most
effective methods for detecting and evaluating a wide range of abnormalities.
Visualization enables doctors to analyze and interpret MRI scans in order to
detect deformities or anomalies inside the organs. Medical data are collected from
numerous biomedical devices that use diverse imaging techniques, such as X-rays,
CT scans, MRI, and mammograms, all of which play an important role in diagnosis.
Magnetic resonance imaging (MRI) is used to diagnose brain tumors. When an MRI
reveals a tumor in the brain, the most common technique for determining the type
of tumor is to examine a biopsy sample of cells. In this chapter, we proposed a
CNN model capable of detecting tumors from MRI scans of the brain. The proposed
model achieved a test accuracy of 88.23%, with a log loss of 0.50 and an F1 score
of 87.78%. Results of this kind suggest that AI is likely to have an impact on
radiology practice, possibly more rapidly than on other medical disciplines. In
future, advanced deep neural network models with an app-based user interface can
be developed for better analysis of brain tumors.
References
1. Rao, G. S., & Vydeki, D. (2018). Brain tumor detection approaches: A review. In 2018
International Conference on Smart Systems and Inventive Technology (ICSSIT) (pp. 479–488).
IEEE.
2. Brain Tumor Facts 2021, National Brain Tumor Society, 2021. https://braintumor.org/braintumor-information/brain-tumor-facts/quick-facts. Accessed 23 May 2021
3. Cancer Facts and Figures 2021, World Health Organization, 2021. https://www.who.int/news-room/fact-sheets/detail/cancer. Accessed 23 May 2021
4. Zacharaki, E. I., Wang, S., Chawla, S., Soo Yoo, D., Wolf, R., Melhem, E. R., & Davatzikos, C.
(2009). Classification of brain tumor type and grade using MRI texture and shape in a machine
learning scheme. Magnetic Resonance in Medicine: An Official Journal of the International
Society for Magnetic Resonance in Medicine, 62(6), 1609–1618.
5. Alok, N., Krishan, K., & Chauhan, P. (2021). Deep learning-based image classifier for malaria
cell detection. In Machine learning for healthcare applications (pp. 187–197).
6. Negi, A., Kumar, K., Chauhan, P., & Rajput, R. S. (2021). Deep neural architecture for
face mask detection on simulated masked face dataset against Covid-19 pandemic. In 2021
International Conference on Computing, Communication, and Intelligent Systems (ICCCIS)
(pp. 595–600). IEEE.
7. Khambhata, K. G., & Panchal, S. R. (2016). Multiclass classification of brain tumor in
MRI images. International Journal of Innovative Research in Computer and Communication
Engineering, 4(5), 8982–8992.
8. Das, V., & Rajan, J. (2016). Techniques for MRI brain tumor detection: A survey. International
Journal of Research in Computer Application and Information Technology, 4(3), 53–56.
9. Singh, L., Chetty, G., & Sharma, D. (2012). A novel machine learning approach for detecting
the brain abnormalities from MRI structural images. In IAPR International Conference on
Pattern Recognition in Bioinformatics (pp. 94–105). Springer.
10. Sarhan, A. M. (2020). Brain tumor classification in magnetic resonance images using deep
learning and wavelet transform. Journal of Biomedical Science and Engineering, 13(6), 102.
11. LeCun, Y., Kavukcuoglu, K., & Farabet, C. (2010). Convolutional networks and applications in
vision. In Proceedings of 2010 IEEE International Symposium on Circuits and Systems, Paris
(pp. 253–256). https://doi.org/10.1109/ISCAS.2010.5537907
12. Alom, M. Z., Taha, T. M., Yakopcic, C., Westberg, S., Sidike, P., Nasrin, M. S., Hasan, M., Van
Essen, B. C., Awwal, A. A., & Asari, V. K. (2019). A state-of-the-art survey on deep learning
theory and architectures. Electronics, 8(3), 292.
13. Chauhan, N., & Choi, B. J. (2019). Performance analysis of classification techniques of human
brain MRI images. International Journal of Fuzzy Logic and Intelligent Systems, 19(4), 315–
322.
14. Havaei, M., Davy, A., Warde-Farley, D., Biard, A., Courville, A., Bengio, Y., & Larochelle,
H. (2017). Brain tumor segmentation with deep neural networks. Medical Image Analysis, 35,
18–31.
15. Amin, J., Sharif, M., Yasmin, M., & Fernandes, S. L. (2017). A distinctive approach in brain
tumor detection and classification using MRI. Pattern Recognition Letters, 139, 118–127.
16. Kumar, S. N., Fred, A. L., Padmanabhan, P., Gulyas, B., Kumar, H. A., & Miriam, L. J.
(2021). Deep learning algorithms in medical image processing for cancer diagnosis: Overview,
challenges and future. In Deep learning for cancer diagnosis (pp. 37–66).
17. Mohsen, H., El-Dahshan, E. S. A., El-Horbaty, E. S. M., & Salem, A. B. M. (2018).
Classification using deep learning neural networks for brain tumors. Future Computing and
Informatics Journal, 3(1), 68–71.
18. Zhao, X., Wu, Y., Song, G., Li, Z., Zhang, Y., & Fan, Y. (2018). A deep learning model
integrating FCNNs and CRFs for brain tumor segmentation. Medical Image Analysis, 43, 98–
111.
19. Munir, K., Elahi, H., Ayub, A., Frezza, F., & Rizzi, A. (2019). Cancer diagnosis using deep
learning: A bibliographic review. Cancers, 11(9), 1235.
20. Xiao, Z., Huang, R., Ding, Y., Lan, T., Dong, R., Qin, Z., Zhang, X., & Wang, W. (2016). A
deep learning-based segmentation method for brain tumor in MR images. In 2016 IEEE 6th
International Conference on Computational Advances in Bio and Medical Sciences (ICCABS)
(pp. 1–6). IEEE.
21. Dong, H., Yang, G., Liu, F., Mo, Y., & Guo, Y. (2017). Automatic brain tumor detection
and segmentation using u-net based fully convolutional networks. In Annual Conference on
Medical Image Understanding and Analysis (pp. 506–517). Springer.
22. Rezaei, M., Harmuth, K., Gierke, W., Kellermeier, T., Fischer, M., Yang, H., & Meinel,
C. (2017). A conditional adversarial network for semantic segmentation of brain tumor. In
International MICCAI Brain Lesion Workshop (pp. 241–252). Springer.
Index
A
Adaptive thresholding, 239
Additive attention, 242, 245–247, 251,
253–255
B
Bidding system, viii, 165–180
Blockchain, 129–159, 165–180, 193, 198, 199
Brain tumor, 270, 271, 277, 321–333
C
Cardiac, 295–318
Challenges in IoV, 129–159
Classifier, viii, 25, 39, 40, 42, 47, 48, 50–55,
99, 104, 105, 107, 111–125, 227, 228,
271, 278, 310, 324, 325
Computer vision, 3, 99, 241–243, 297, 313
Convolutional neural networks (CNN), 34,
47, 53, 54, 114, 116, 118, 241, 242,
244–246, 295–318, 324, 325, 329,
331–333
COVID-19, viii, 203, 211, 225–239, 296
Cyber crime, 97
Cyber fraud, 97, 98, 103, 106
D
Data, 3, 21, 45, 59, 84, 99, 112, 129, 166, 184,
214, 226, 241, 265, 295, 322
Decentralization, 138, 144–145, 148, 166–169
Deep learning, 25, 34, 39, 61, 62, 68, 76, 227,
228, 241–262, 277, 295–297, 305, 310,
311, 314, 321–333
Detection accuracy (DA), 25, 48, 112
Differential privacy, 61, 62, 65, 67, 68, 70,
74–76
Discrete wavelet transform (DWT), 25, 112,
114, 119–122, 125, 281, 324, 325
Dynamic trajectory, 59–76
E
Edge computing, 130, 136, 158
Educators, 203, 206, 207
E-governance, 83, 86, 88, 89, 91–93, 95
E-learning, 203–205, 207, 216, 217, 219
Energy efficiency, 6, 11–13
E-service, 89
Ethereum, 140, 165, 168, 169, 172, 173, 176
F
Fault tolerance, viii, 8–12, 14–16, 147
5G-VANETs, 186, 198
G
GGO, 229, 230, 234, 236, 239
H
Heuristics, 251, 252
I
Image processing, 119, 226, 241, 265–288
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
S. Pandey et al. (eds.), Role of Data-Intensive Distributed Computing Systems in
Designing Data Solutions, EAI/Springer Innovations in Communication and
Computing, https://doi.org/10.1007/978-3-031-15542-0
Images, 22–28, 30–40, 42, 112–122, 207,
209, 217, 225–239, 241–262, 265–288,
295–318, 321–333
Internet of Vehicles (IoV), viii, 129–159
K
K-means, 48, 50, 52, 66, 70, 71, 237–239, 272,
279, 310
L
Learning management system, 203–219
Left ventricle, 295–318
Lesion region, 226, 229, 231, 234–236, 239
LMS comparison, 210
Local binary pattern (LBP), 112, 119–122, 125
LSTM, 244–247, 250, 251, 311
M
Machine learning (ML), viii, 97–108, 111–125,
226, 241, 252, 295, 323, 324
Machine learning classifier, 104, 105, 111–125
Magnetic resonance imaging (MRI), 295–318,
321–333
Medical images, 241, 244, 265, 266, 271,
274–279, 281, 282, 296, 307, 322
Mobile agent technology (MAT), 84
MRI images, 295–318, 321–333
N
Nature-inspired methods, 266
Neighbor assister framework for mobile agents
(NAFMA), 84, 88, 89, 91–95
P
Percentage split distribution (PSD), 229, 231,
233, 236–239
Performance, 4–6, 9, 23, 38, 48, 53–55, 62,
67, 69, 73–76, 85, 92–93, 105, 115,
122–124, 140, 148, 150, 152, 154, 158,
166, 169, 185, 191, 204, 206, 230, 236,
237, 295, 296, 300, 306, 308, 309, 311,
314, 316, 322–325
Phishing, 97–108
Phishing website, 97–108
Platform for mobile agent distribution and
execution (PMADE), 84, 89, 91, 93, 95
Population-based methods, 277
R
Reliability, 4–6, 8, 10–16, 23, 73, 75, 76, 95,
130, 150, 172, 179, 180, 195, 219
S
Security, 21, 23–25, 51, 59, 61–63, 74, 76, 85,
91, 93, 111, 130, 131, 134, 135, 137,
140–143, 146, 147, 149, 151–158, 165,
166, 172, 173, 177, 179–180, 183–199
Security attacks, 187–192, 194, 198
Security requirements, 186–191
Security strategies, 199
Security threats, 186, 199
Severe acute respiratory syndrome (SARS),
225
Source camera identification, viii, 111–125
Steganography, 21–42
Swarm-based methods, 265–288
T
Task scheduling, 10, 11, 14, 15
Teacher force, 250, 251
Trajectory privacy, 61, 65
Transmission, 21–42, 46, 135, 136, 147, 152,
166, 195
Types of blockchain, 131, 138–144
W
Wireless sensor networks (WSNs), 45–55, 183
Work agent (WA), 84, 87, 88, 90–92
Working of blockchain, 171, 177–178