Lecture Notes in Electrical Engineering 666
Zainah Md Zain · Hamzah Ahmad ·
Dwi Pebrianti · Mahfuzah Mustafa ·
Nor Rul Hasma Abdullah ·
Rosdiyana Samad ·
Maziyah Mat Noh Editors
Proceedings of the
11th National
Technical Seminar
on Unmanned System
Technology 2019
NUSYS’19
Lecture Notes in Electrical Engineering
Volume 666
Series Editors
Leopoldo Angrisani, Department of Electrical and Information Technologies Engineering, University of Napoli
Federico II, Naples, Italy
Marco Arteaga, Departamento de Control y Robótica, Universidad Nacional Autónoma de México, Coyoacán,
Mexico
Bijaya Ketan Panigrahi, Electrical Engineering, Indian Institute of Technology Delhi, New Delhi, Delhi, India
Samarjit Chakraborty, Fakultät für Elektrotechnik und Informationstechnik, TU München, Munich, Germany
Jiming Chen, Zhejiang University, Hangzhou, Zhejiang, China
Shanben Chen, Materials Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
Tan Kay Chen, Department of Electrical and Computer Engineering, National University of Singapore,
Singapore, Singapore
Rüdiger Dillmann, Humanoids and Intelligent Systems Laboratory, Karlsruhe Institute for Technology,
Karlsruhe, Germany
Haibin Duan, Beijing University of Aeronautics and Astronautics, Beijing, China
Gianluigi Ferrari, Università di Parma, Parma, Italy
Manuel Ferre, Centre for Automation and Robotics CAR (UPM-CSIC), Universidad Politécnica de Madrid,
Madrid, Spain
Sandra Hirche, Department of Electrical Engineering and Information Science, Technische Universität
München, Munich, Germany
Faryar Jabbari, Department of Mechanical and Aerospace Engineering, University of California, Irvine, CA,
USA
Limin Jia, State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing, China
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Alaa Khamis, German University in Egypt El Tagamoa El Khames, New Cairo City, Egypt
Torsten Kroeger, Stanford University, Stanford, CA, USA
Qilian Liang, Department of Electrical Engineering, University of Texas at Arlington, Arlington, TX, USA
Ferran Martín, Departament d’Enginyeria Electrònica, Universitat Autònoma de Barcelona, Bellaterra,
Barcelona, Spain
Tan Cher Ming, College of Engineering, Nanyang Technological University, Singapore, Singapore
Wolfgang Minker, Institute of Information Technology, University of Ulm, Ulm, Germany
Pradeep Misra, Department of Electrical Engineering, Wright State University, Dayton, OH, USA
Sebastian Möller, Quality and Usability Laboratory, TU Berlin, Berlin, Germany
Subhas Mukhopadhyay, School of Engineering & Advanced Technology, Massey University,
Palmerston North, Manawatu-Wanganui, New Zealand
Cun-Zheng Ning, Electrical Engineering, Arizona State University, Tempe, AZ, USA
Toyoaki Nishida, Graduate School of Informatics, Kyoto University, Kyoto, Japan
Federica Pascucci, Dipartimento di Ingegneria, Università degli Studi “Roma Tre”, Rome, Italy
Yong Qin, State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing, China
Gan Woon Seng, School of Electrical & Electronic Engineering, Nanyang Technological University,
Singapore, Singapore
Joachim Speidel, Institute of Telecommunications, Universität Stuttgart, Stuttgart, Germany
Germano Veiga, Campus da FEUP, INESC Porto, Porto, Portugal
Haitao Wu, Academy of Opto-electronics, Chinese Academy of Sciences, Beijing, China
Junjie James Zhang, Charlotte, NC, USA
The book series Lecture Notes in Electrical Engineering (LNEE) publishes the latest developments
in Electrical Engineering - quickly, informally and in high quality. While original research
reported in proceedings and monographs has traditionally formed the core of LNEE, we also
encourage authors to submit books devoted to supporting student education and professional
training in the various fields and application areas of electrical engineering. The series covers
classical and emerging topics concerning:
• Communication Engineering, Information Theory and Networks
• Electronics Engineering and Microelectronics
• Signal, Image and Speech Processing
• Wireless and Mobile Communication
• Circuits and Systems
• Energy Systems, Power Electronics and Electrical Machines
• Electro-optical Engineering
• Instrumentation Engineering
• Avionics Engineering
• Control Systems
• Internet-of-Things and Cybersecurity
• Biomedical Devices, MEMS and NEMS
For general information about this book series, comments or suggestions, please contact
leontina.dicecco@springer.com.
To submit a proposal or request further information, please contact the Publishing Editor in
your country:
China
Jasmine Dou, Associate Editor (jasmine.dou@springer.com)
India, Japan, Rest of Asia
Swati Meherishi, Executive Editor (Swati.Meherishi@springer.com)
Southeast Asia, Australia, New Zealand
Ramesh Nath Premnath, Editor (ramesh.premnath@springernature.com)
USA, Canada:
Michael Luby, Senior Editor (michael.luby@springer.com)
All other Countries:
Leontina Di Cecco, Senior Editor (leontina.dicecco@springer.com)
** Indexing: The books of this series are submitted to ISI Proceedings, EI-Compendex,
SCOPUS, MetaPress, Web of Science and Springerlink **
More information about this series at http://www.springer.com/series/7818
Editors
Zainah Md Zain
Faculty of Electrical & Electronics
Engineering
Universiti Malaysia Pahang
Pekan, Pahang, Malaysia
Hamzah Ahmad
Faculty of Electrical & Electronics
Engineering
Universiti Malaysia Pahang
Pekan, Pahang, Malaysia
Dwi Pebrianti
Faculty of Electrical & Electronics
Engineering
Universiti Malaysia Pahang
Pekan, Pahang, Malaysia
Mahfuzah Mustafa
Faculty of Electrical & Electronics
Engineering
Universiti Malaysia Pahang
Pekan, Pahang, Malaysia
Nor Rul Hasma Abdullah
Faculty of Electrical & Electronics
Engineering
Universiti Malaysia Pahang
Pekan, Pahang, Malaysia
Rosdiyana Samad
Faculty of Electrical & Electronics
Engineering
Universiti Malaysia Pahang
Pekan, Pahang, Malaysia
Maziyah Mat Noh
Faculty of Electrical & Electronics
Engineering
Universiti Malaysia Pahang
Pekan, Pahang, Malaysia
ISSN 1876-1100
ISSN 1876-1119 (electronic)
Lecture Notes in Electrical Engineering
ISBN 978-981-15-5280-9
ISBN 978-981-15-5281-6 (eBook)
https://doi.org/10.1007/978-981-15-5281-6
© Springer Nature Singapore Pte Ltd. 2021
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, expressed or implied, with respect to the material contained
herein or for any errors or omissions that may have been made. The publisher remains neutral with regard
to jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Preface
The National Technical Seminar on Unmanned System Technology 2019
(NUSYS’19) was organized by the IEEE Oceanic Engineering Society (OES)
Malaysia Chapter and the Malaysian Society for Automatic Control Engineers (MACE)
IFAC NMO. NUSYS’19 was held on December 2–3, 2019, at Universiti
Malaysia Pahang, Gambang Campus, Kuantan, Pahang, Malaysia, with the conference
theme “Unmanned System Technology and AI Applications”. The event was
the 11th in a series of conferences held since 2008.
NUSYS’19 focused on both theory and application, primarily covering the topics of
intelligent unmanned technologies, robotics and autonomous vehicles. We invited
four keynote speakers who dealt with related state-of-the-art technologies including
unmanned aerial vehicles (UAVs), underwater vehicles (UVs), autonomous
vehicles, humanoid robots and intelligent systems, among others. They were
Mr. Kamarulzaman Muhamed (Founder and CEO, Aerodyne Group, “CEO of Top
10 hottest start-up company by Nikkei Japan, May 2019”), Assoc. Prof.
Dr. Hanafiah Yussof (Founder, Board of Director and Group Chief Officer of
Robopreneur Sdn. Bhd.), Assoc. Prof. Dr. Hairi Zamsuri (General Manager,
eMoovit Technology Sdn. Bhd.) and Mr. Mohd Fairuz Nor Azmi (Project Manager,
Fugro Malaysia Marine Sdn. Bhd., formerly known as Fugro Geodetic Malaysia
Sdn. Bhd.). The objectives of the conference were threefold: to provide a
medium for discussing a wide range of unmanned system technology between
universities and industries, to disseminate the latest technology in the field of
unmanned system technology and to provide an opportunity for researchers to
present their research papers in the unmanned system technology area.
Despite focusing on a rather specialized area of research concerning unmanned
system technology and electrical and electronics engineering technology,
NUSYS’19 attracted 87 papers from 12 local universities and
one international paper from Institute Technology Surabaya, Indonesia. This volume of
proceedings from the conference provides an opportunity for readers to engage with
a selection of refereed papers that were presented during the NUSYS’19 conference.
The book is organized into four parts, which reflect the research topics of the
conference themes:
Part 1: Unmanned System Technology, Underwater Technology and Marine
Part 2: Applied Electronics and Computer Engineering
Part 3: Control, Instrumentations and Artificial Intelligent Systems
Part 4: Sustainable Energy and Power Electronics.
One aim of this book is to stimulate interactions among researchers in the areas
pertinent to intelligent unmanned systems of AUV, UAV and AGV, namely
autonomous control systems and vehicles. Another aim is to share new ideas, new
challenges and the authors’ expertise on critical and emerging technologies. The
book covers multifaceted aspects of unmanned system technology.
The editors hope that readers will find this book not only stimulating but also
useful and usable in whatever aspect of unmanned system design in which they may
be involved or interested. The editors would like to express their sincere appreciation to all the contributors for their cooperation in producing this book.
We wish to take the opportunity to thank all individuals and organizations who
have contributed in some way in making NUSYS’19 a success and a memorable
gathering. Also, we wish to extend our gratitude to the members of the IEEE OES
Malaysia Chapter Committee and Organizing Committee for their tireless effort.
Finally, we thank the publisher, Springer, and most importantly Mr. Karthik Raj Selvaraj for
his support and encouragement in undertaking this publication.
Editors
Contents
Unmanned System Technology, Underwater Technology
and Marine

Tracking Control Design for Underactuated Micro Autonomous
Underwater Vehicle in Horizontal Plane Using Robust Filter
Approach
Muhammad Azri Bin Abdul Wahed and Mohd Rizal Arshad

Design and Development of Remotely Operated Pipeline
Inspection Robot
Mohd Shahrieel Mohd Aras, Zainah Md Zain, Aliff Farhan Kamaruzaman,
Mohd Zamzuri Ab Rashid, Azhar Ahmad, Hairol Nizam Mohd Shah,
Mohd Zaidi Mohd Tumari, Alias Khamis, Fadilah Ab Azis,
and Fariz Ali@Ibrahim

Vision Optimization for Altitude Control and Object Tracking
Control of an Autonomous Underwater Vehicle (AUV)
Joe Siang Keek, Mohd Shahrieel Mohd Aras, Zainah Md. Zain,
Mohd Bazli Bahar, Ser Lee Loh, and Shin Horng Chong

Development of Autonomous Underwater Vehicle Equipped
with Object Recognition and Tracking System
Muhammad Haniff Abu Mangshor, Radzi Ambar,
Herdawatie Abdul Kadir, Khalid Isa, Inani Yusra Amran,
Abdul Aziz Abd Kadir, Nurul Syila Ibrahim, Chew Chang Choon,
and Shinichi Sagara

Dual Image Fusion Technique for Underwater Image
Contrast Enhancement
Chern How Chong, Ahmad Shahrizan Abdul Ghani,
and Kamil Zakwan Mohd Azmi

Red and Blue Channels Correction Based on Green Channel
and Median-Based Dual-Intensity Images Fusion for Turbid
Underwater Image Quality Enhancement
Kamil Zakwan Mohd Azmi, Ahmad Shahrizan Abdul Ghani,
and Zulkifli Md Yusof

Analysis of Pruned Neural Networks (MobileNetV2-YOLO v2)
for Underwater Object Detection
A. F. Ayob, K. Khairuddin, Y. M. Mustafah, A. R. Salisa, and K. Kadir

Different Cell Decomposition Path Planning Methods for Unmanned
Air Vehicles-A Review
Sanjoy Kumar Debnath, Rosli Omar, Susama Bagchi, Elia Nadira Sabudin,
Mohd Haris Asyraf Shee Kandar, Khan Foysol,
and Tapan Kumar Chakraborty

Improved Potential Field Method for Robot Path Planning
with Path Pruning
Elia Nadira Sabudin, Rosli Omar, Ariffudin Joret, Asmarashid Ponniran,
Muhammad Suhaimi Sulong, Herdawatie Abdul Kadir,
and Sanjoy Kumar Debnath

Development of DugongBot Underwater Drones Using Open-Source
Robotic Platform
Ahmad Anas Yusof, Mohd Khairi Mohamed Nor,
Mohd Shahrieel Mohd Aras, Hamdan Sulaiman, and Abdul Talib Din

Development of Autonomous Underwater Vehicle for Water Quality
Measurement Application
Inani Yusra Amran, Khalid Isa, Herdawatie Abdul Kadir, Radzi Ambar,
Nurul Syila Ibrahim, Abdul Aziz Abd Kadir,
and Muhammad Haniff Abu Mangshor

Discrete Sliding Mode Controller on Autonomous Underwater Vehicle
in Steering Motion
Nira Mawangi Sarif, Rafidah Ngadengon, Herdawatie Abdul Kadir,
and Mohd Hafiz A. Jalil

Impact of Acoustic Signal on Optical Signal and Vice Versa
in Optoacoustic Based Underwater Localization
M. R. Arshad and M. H. A. Majid

Design and Development of Mini Autonomous Surface Vessel
for Bathymetric Survey
Muhammad Ammar Mohd Adam, Zulkifli Zainal Abidin,
Ahmad Imran Ibrahim, Ahmad Shahril Mohd Ghani,
and Al Jawharah Anchumukkil

Control, Instrumentation and Artificial Intelligent Systems

Optimal Power Flow Solutions for Power System Operations
Using Moth-Flame Optimization Algorithm
Salman Alabd, Mohd Herwan Sulaiman,
and Muhammad Ikram Mohd Rashid

A Pilot Study on Pipeline Wall Inspection Technology Tomography
Muhammad Nuriffat Roslee, Siti Zarina Mohd. Muji,
Jaysuman Pusppanathan, and Mohd. Fadzli Abd. Shaib

Weighted-Sum Extended Bat Algorithm Based PD Controller
Design for Wheeled Mobile Robot
Nur Aisyah Syafinaz Suarin, Dwi Pebrianti, Nurnajmin Qasrina Ann,
and Luhur Bayuaji

An Analysis of State Covariance of Mobile Robot Navigation
in Unstructured Environment Based on ROS
Hamzah Ahmad, Lim Zhi Xian, Nur Aqilah Othman,
Mohd Syakirin Ramli, and Mohd Mawardi Saari

Control Strategy for Differential Drive Wheel Mobile Robot
Nor Akmal Alias and Herdawatie Abdul Kadir

Adaptive Observer for DC Motor Fault Detection
Dynamical System
Janet Lee, Rosmiwati Mohd-Mokhtar,
and Muhammad Nasiruddin Mahyuddin

Water Level Classification for Flood Monitoring System
Using Convolutional Neural Network
J. L. Gan and W. Zailah

Evaluation of Back-Side Slits with Sub-millimeter Resolution
Using a Differential AMR Probe
M. A. H. P. Zaini, M. M. Saari, N. A. Nadzri, A. M. Halil,
A. J. S. Hanifah, and K. Tsukada

Model-Free Tuning of Laguerre Network for Impedance Matching
in Bilateral Teleoperation System
Mohd Syakirin Ramli, Hamzah Ahmad, Addie Irawan,
and Nur Liyana Ibrahim

Identification of Liquid Slosh Behavior Using Continuous-Time
Hammerstein Model Based Sine Cosine Algorithm
Julakha Jahan Jui, Mohd Helmi Suid, Zulkifli Musa,
and Mohd Ashraf Ahmad

Cardiotocogram Data Classification Using Random Forest Based
Machine Learning Algorithm
M. M. Imran Molla, Julakha Jahan Jui, Bifta Sama Bari, Mamunur Rashid,
and Md Jahid Hasan

FPGA Implementation of Sensor Data Acquisition for Real-Time
Human Body Motion Measurement System
Zarina Tukiran, Afandi Ahmad, Herdawatie Abd. Kadir,
and Ariffudin Joret

Pulse Modulation (PM) Ground Penetrating Radar (GPR) System
Development by Using Envelope Detector Technique
Maryanti Razali, Ariffuddin Joret, M. F. L. Abdullah,
Elfarizanis Baharudin, Asmarashid Ponniran,
Muhammad Suhaimi Sulong, Che Ku Nor Azie Hailma Che Ku Melor,
and Noor Azwan Shairi

An Overview of Modeling and Control of a Through-the-Road
Hybrid Electric Vehicle
M. F. M. Sabri, M. H. Husin, M. I. Jobli, and A. M. N. A. Kamaruddin

Euler-Lagrange Based Dynamic Model of Double Rotary
Inverted Pendulum
Mukhtar Fatihu Hamza, Jamilu Kamilu Adamu,
and Abdulbasid Ismail Isa

Network-Based Cooperative Synchronization Control
of 3 Articulated Robotic Arms for Industry 4.0 Application
Kam Wah Chan, Muhammad Nasiruddin Mahyuddin, and Bee Ee Khoo

EEG Signal Denoising Using Hybridizing Method Between Wavelet
Transform with Genetic Algorithm
Zaid Abdi Alkareem Alyasseri, Ahamad Tajudin Khader,
Mohammed Azmi Al-Betar, Ammar Kamal Abasi,
and Sharif Naser Makhadmeh

Neural Network Ammonia-Based Aeration Control for Activated
Sludge Process Wastewater Treatment Plant
M. H. Husin, M. F. Rahmat, N. A. Wahab, and M. F. M. Sabri

A Min-conflict Algorithm for Power Scheduling Problem
in a Smart Home Using Battery
Sharif Naser Makhadmeh, Ahamad Tajudin Khader,
Mohammed Azmi Al-Betar, Syibrah Naim,
Zaid Abdi Alkareem Alyasseri, and Ammar Kamal Abasi

An Improved Text Feature Selection for Clustering Using Binary
Grey Wolf Optimizer
Ammar Kamal Abasi, Ahamad Tajudin Khader,
Mohammed Azmi Al-Betar, Syibrah Naim, Sharif Naser Makhadmeh,
and Zaid Abdi Alkareem Alyasseri

Applied Electronics and Computer Engineering

Metamaterial Antenna for Biomedical Application
Mohd Aminudin Jamlos, Nur Amirah Othman, Wan Azani Mustafa,
and Maswani Khairi Marzuki

Refraction Method of Metamaterial for Antenna
Maswani Khairi Marzuki, Mohd Aminudin Jamlos, Wan Azani Mustafa,
and Khairul Najmy Abdul Rani

Circular Polarized 5.8 GHz Directional Antenna Design
for Base Station Application
Mohd Aminudin Jamlos, Nurasma Husna Mohd Sabri,
Wan Azani Mustafa, and Maswani Khairi Marzuki

Medical Image Enhancement and Deblurring
Reza Amini Gougeh, Tohid Yousefi Rezaii, and Ali Farzamnia

A Fast and Efficient Segmentation of Soil-Transmitted Helminths
Through Various Color Models and k-Means Clustering
Norhanis Ayunie Ahmad Khairudin, Aimi Salihah Abdul Nasir,
Lim Chee Chin, Haryati Jaafar, and Zeehaida Mohamed

Machine Learning Calibration for Near Infrared
Spectroscopy Data: A Visual Programming Approach
Mahmud Iwan Solihin, Zheng Zekui, Chun Kit Ang, Fahri Heltha,
and Mohamed Rizon

Real Time Android-Based Integrated System for Luggage
Check-in Process at the Airport
Xin Yee Lee and Rosmiwati Mohd-Mokhtar

Antenna Calibration in EMC Semi-anechoic Chamber
Using Standard Antenna Method (SAM) and Standard
Site Method (SSM)
Abdulrahman Ahmed Ghaleb Amer, Syarfa Zahirah Sapuan,
Nur Atikah Zulkefli, Nasimuddin Nasimuddin, Nabiah Binti Zinal,
and Shipun Anuar Hamzah

An Automatic Driver Assistant Based on Intention Detecting
Using EEG Signal
Reza Amini Gougeh, Tohid Yousefi Rezaii, and Ali Farzamnia

Hybrid Skull Stripping Method for Brain CT Images
Fakhrul Razan Rahmad, Wan Nurshazwani Wan Zakaria, Ain Nazari,
Mohd Razali Md Tomari, Nik Farhan Nik Fuad,
and Anis Azwani Muhd Suberi

Improvising Non-uniform Illumination and Low Contrast Images
of Soil Transmitted Helminths Image Using Contrast
Enhancement Techniques
Norhanis Ayunie Ahmad Khairudin, Aimi Salihah Abdul Nasir,
Lim Chee Chin, Haryati Jaafar, and Zeehaida Mohamed

Signal Processing Technique for Pulse Modulation (PM) Ground
Penetrating Radar (GPR) System Based on Phase and Envelope
Detector Technique
Che Ku Nor Azie Hailma Che Ku Melor, Ariffuddin Joret,
Asmarashid Ponniran, Muhammad Suhaimi Sulong, Rosli Omar,
and Maryanti Razali

Evaluation of Leap Motion Controller Usability in Development
of Hand Gesture Recognition for Hemiplegia Patients
Wan Norliyana Wan Azlan, Wan Nurshazwani Wan Zakaria,
Nurmiza Othman, Mohd Norzali Haji Mohd,
and Muhammad Nurfirdaus Abd Ghani

Using Convolution Neural Networks Pattern for Classification
of Motor Imagery in BCI System
Sepideh Zolfaghari, Tohid Yousefi Rezaii, Saeed Meshgini,
and Ali Farzamnia

Metasurface with Wide-Angle Reception for Electromagnetic
Energy Harvesting
Abdulrahman A. G. Amer, Syarfa Zahirah Sapuan, Nasimuddin,
and Nabiah Binti Zinal

Integrated Soil Monitoring System for Internet
of Thing (IOT) Applications
Xin Yi Lau, Chun Heng Soo, Yusmeeraz Yusof, and Suhaila Isaak

Contrast Enhancement Approaches on Medical
Microscopic Images: A Review
Nadzirah Nahrawi, Wan Azani Mustafa,
Siti Nurul Aqmariah Mohd Kanafiah, Mohd Aminudin Jamlos,
and Wan Khairunizam

Effect of Different Filtering Techniques on Medical
and Document Image
Wan Azani Mustafa, Syafiq Sam, Mohd Aminudin Jamlos,
and Wan Khairunizam

Implementation of Seat Belt Monitoring and Alert System
for Car Safety
Zainah Md Zain, Mohd Hairuddin Abu Bakar, Aman Zaki Mamat,
Wan Nor Rafidah Wan Abdullah, Norsuryani Zainal Abidin,
and Haris Faisal Shaharuddin

Electroporation Study: Pulse Electric Field Effect
on Breast Cancer Cell
Nur Adilah Abd Rahman, Muhammad Mahadi Abdul Jamil,
Mohamad Nazib Adon, Chew Chang Choon, and Radzi Ambar

Influence of Electroporation on HT29 Cell Proliferation, Spreading
and Adhesion Properties
Hassan Buhari Mamman, Muhammad Mahadi Abdul Jamil,
Nur Adilah Abd Rahman, Radzi Ambar, and Chew Chang Choon

Wound Healing and Electrofusion Application via Pulse Electric
Field Exposure
Muhammad Mahadi Abdul Jamil, Mohamad Nazib Adon,
Hassan Buhari Mamman, Nur Adilah Abd Rahman, Radzi Ambar,
and Chew Chang Choon

Color Constancy Analysis Approach for Color Standardization
on Malaria Thick and Thin Blood Smear Images
Thaqifah Ahmad Aris, Aimi Salihah Abdul Nasir, Haryati Jaafar,
Lim Chee Chin, and Zeehaida Mohamed

Stochastic Analysis of ANN Statistical Features for CT Brain
Posterior Fossa Image Classification
Anis Azwani Muhd Suberi, Wan Nurshazwani Wan Zakaria,
Razali Tomari, Ain Nazari, Nik Farhan Nik Fuad,
Fakhrul Razan Rahmad, and Salsabella Mohd Fizol

Improvement of Magnetic Field Induction for MPI Application Using
Maxwell Coils Paired-Sub-coils System Arrangement
Muhamad Fikri Shahkhirin Birahim, Nurmiza Othman,
Syarfa’ Zahirah Sapuan, Mohd Razali Md Tomari,
Wan Nurshazwani Wan Zakaria, and Chua King Lee

DCT Image Compression Implemented on Raspberry Pi
to Compress Image Captured by CMOS Image Sensor
Ibrahim Saad Mohsin, Muhammad Imran Ahmad, Saad M. Salman,
Mustafa Zuhaer Nayef Al-Dabagh, Mohd Nazrin Md Isa,
and Raja Abdullah Raja Ahmad

A Racial Recognition Method Based on Facial Color and Texture
for Improving Demographic Classification
Amer A. Sallam, Muhammad Nomani Kabir, Athmar N. M. Shamhan,
Heba K. Nasser, and Jing Wang

Automatic Passengers Counting System Using Images Processing
Based on YCbCr and HSV Colour Spaces Analysis
Muhammad Shahid Che Husin and Aimi Salihah Abdul Nasir

Face Recognition Using PCA Implemented on Raspberry Pi
Ibrahim Majid Mohammed, Mustafa Zuhaer Nayef Al-Dabagh,
Muhammad Imran Ahmad, and Mohd Nazrin Md Isa

Comparability of Edge Detection Techniques for Automatic Vehicle
License Plate Detection and Recognition
Fatin Norazima Mohamad Ariff, Aimi Salihah Abdul Nasir,
Haryati Jaafar, and Abdul Nasir Zulkifli

Classification of Facial Part Movement Acquired from Kinect V1
and Kinect V2
Sheng Guang Heng, Rosdiyana Samad, Mahfuzah Mustafa,
Zainah Md Zain, Nor Rul Hasma Abdullah, and Dwi Pebrianti

Hurst Exponent Based Brain Behavior Analysis of Stroke Patients
Using EEG Signals
Wen Yean Choong, Wan Khairunizam, Murugappan Murugappan,
Mohammad Iqbal Omar, Siao Zheng Bong, Ahmad Kadri Junoh,
Zuradzman Mohamad Razlan, A. B. Shahriman,
and Wan Azani Wan Mustafa

Examination Rain and Fog Attenuation for Path Loss Prediction
in Millimeter Wave Range
Imadeldin Elsayed Elmutasim and Izzeldin I. Mohd

Introduction of Static and Dynamic Features to Facial Nerve
Paralysis Evaluation
Wan Syahirah W Samsudin, Rosdiyana Samad, Kenneth Sundaraj,
and Mohd Zaki Ahmad

Offline EEG-Based DC Motor Control for Wheelchair Application
Norizam Sulaiman, Nawfan Mohammed Mohammed Ahmed Al-Fakih,
Mamunur Rashid, Mohd Shawal Jadin, Mahfuzah Mustafa,
and Fahmi Samsuri

Automated Cells Counting for Leukaemia and Malaria Detection
Based on RGB and HSV Colour Spaces Analysis
Amer Fazryl Din and Aimi Salihah Abdul Nasir

Simulation Studies of the Hybrid Human-Fuzzy Controller
for Path Tracking of an Autonomous Vehicle
Hafiz Halin, Wan Khairunizam, Hasri Haris, Z. M. Razlan, S. A. Bakar,
I. Zunaidi, and Wan Azani Mustafa

A New Approach in Energy Consumption Based on Genetic
Algorithm and Fuzzy Logic for WSN
Ali Adnan Wahbi Alwafi, Javad Rahebi, and Ali Farzamnia

Sustainable Energy and Power Engineering

Comparison of Buck-Boost Derived Non-isolated DC-DC
Converters in a Photovoltaic System
Jotham Jeremy Lourdes, Chia Ai Ooi, and Jiashen Teh

Fault Localization and Detection in Medium Voltage Distribution
Network Using Adaptive Neuro-Fuzzy Inference System (ANFIS)
N. S. B. Jamili, Mohd Rafi Adzman, Wan Syaza Ainaa Wan Salman,
M. H. Idris, and M. Amirruddin

Flashover Voltage Prediction on Polluted Cup-Pin the Insulators
Under Polluted Conditions
Ali. A. Salem, R. Abd-Rahman, M. S. Kamarudin, N. A. Othman,
N. A. M. Jamail, N. Hussin, H. A. Hamid, and I. M. Rawi

Effect of Distributed Generation to the Faults in Medium Voltage
Network Using ATP-EMTP Simulation
Wan Syaza Ainaa Wan Salman, Mohd Rafi Adzman, Muzamir Isa,
N. S. B. Jamili, M. H. Idris, and M. Amirruddin

Optimal Reactive Power Dispatch Solution by Loss Minimisation
Using Dragonfly Optimization Algorithm
Ibrahim Haruna Shanono, Masni Ainina Mahmud,
Nor Rul Hasma Abdullah, Mahfuzah Mustafa, Rosdiyana Samad,
Dwi Pebrianti, and Aisha Muhammad

Analysis of Pedal Power Energy Harvesting for Alternative
Power Source
Sheikh-Muhammad Haziq Sah-Azmi and Zuraini Dahari

An Application of Barnacles Mating Optimizer Algorithm
for Combined Economic and Emission Dispatch Solution
Mohd Herwan Sulaiman, Zuriani Mustaffa, Mohd Mawardi Saari,
and Amir Izzani Mohamed

Development of Microcontroller Based Portable Solar Irradiance
Meter Using Mini Solar Cell
Lee Woan Jun, Mohd Shawal Jadin, and Norizam Sulaiman

Performance of Graphite and Activated Carbon as Electrical
Grounding Enhancement Material
Mohd Yuhyi Mohd Tadza, Tengku Hafidatul Husna Tengku Anuar,
Fadzil Mat Yahaya, and Rahisham Abd Rahman

Design on Real Time Control for Dual Axis Solar Tracker
for Mobile Robot
Muhammad Hanzolah Shahul Hameed, Mohd Zamri Hasan,
and Junaidah Ali Mohd Jobran

Modified Particle Swarm Optimization for Robust Anti-swing
Gantry Crane Controller Tuning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1173
Mahmud Iwan Solihin, Wei Hong Lim, Sew Sun Tiang,
and Chun Kit Ang
Feasibility Analysis of a Hybrid System for a Health Clinic
in a Rural Area South-Eastern Iraq . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1193
Zaidoon W. J. AL-Shammari, M. M. Azizan, and A. S. F. Rahman
Optimal Sizing of PV/Wind/Battery Hybrid System for Rural
School in South Iraq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1203
Zaidoon W. J. AL-Shammari, M. M. Azizan, and A. S. F. Rahman
The Use of Gypsum and Waste Gypsum for Electrical
Grounding Backfill . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1213
Amizatulhani Abdullah, Nurmazuria Mazelan,
Mohd Yuhyi Mohd Tadza, and Rahisham Abd Rahman
Energy-Efficient Superframe Scheduling in Industrial Wireless
Networked Control System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1227
Duc Chung Tran, Rosdiazli Ibrahim, Fawnizu Azmadi Hussin,
and Madiah Omar
Design of Two Axis Solar Tracker Based on Optoelectrical Tracking
Using Hybrid FuGA Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1243
Imam Abadi, Erma Hakim Setyawan, and D. R. Pramesrani
Unmanned System Technology,
Underwater Technology and Marine
Tracking Control Design for Underactuated Micro Autonomous Underwater Vehicle in Horizontal Plane Using Robust Filter Approach
Muhammad Azri Bin Abdul Wahed and Mohd Rizal Arshad
Abstract The micro autonomous underwater vehicle (µAUV) designed and developed at the Underwater, Control and Robotics Group (UCRG) is a torpedo-shaped vehicle measuring only 0.72 m in length and 0.11 m in diameter, with a mass of approximately 6 kg. This paper proposes a time-invariant tracking control method for an underactuated µAUV in the horizontal plane, using a robust filter approach to track a predefined trajectory. A tracking error is introduced that can be driven to zero using only the force in the surge direction and the moment in the yaw direction. The robust control minimizes the effects of external disturbances and parameter uncertainties on AUV performance. With only the rigid-body system inertia matrix of the µAUV known, the proposed controller achieves robustness against parameter uncertainties, model nonlinearities, and unexpected external disturbances. The performance of the proposed robust tracking control is demonstrated in simulation.
Keywords Underactuated system · Micro autonomous underwater vehicles · Robust control · Trajectory tracking
1 Introduction
The micro autonomous underwater vehicle (µAUV) [1] developed by the Underwater, Control and Robotics Group (UCRG) is a torpedo-shaped vehicle designed for shallow-water inspection tasks such as coral reef inspection. In its most basic configuration it measures 0.72 m in length and 0.11 m in diameter and has a mass of 6 kg.
Underwater missions require the µAUV to be very stable so that it can follow a predefined trajectory with high accuracy. However, this µAUV is underactuated, which makes following a predefined trajectory more difficult. Therefore, a
M. A. B. A. Wahed · M. R. Arshad (✉)
Underwater, Control and Robotics Group, School of Electrical and Electronic Engineering,
Universiti Sains Malaysia, 14300 Nibong Tebal, Pulau Pinang, Malaysia
e-mail: eerizal@usm.my
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_1
tracking control system is required to allow the AUV to overcome the limitations of its propulsion system. Furthermore, the performance of the µAUV is adversely affected by unpredictable disturbances in the underwater environment.
A precise mathematical representation of an autonomous underwater vehicle (AUV) is very hard to obtain, which makes the control problem of underwater robots even more challenging. The hydrodynamic parameters that arise from the interaction between the vehicle and the fluid are difficult to obtain with reasonable accuracy because they vary with maneuvering conditions. Therefore, a robust control technique that does not require the complete mathematical representation is needed to reduce the effects of external disturbances on the behavior of the AUV.
Sliding Mode Control (SMC) has been used by many researchers because of its robustness and is widely regarded as a powerful robust control technique. SMC alters the dynamics of the underwater vehicle by applying a discontinuous control signal. The control signal guides the trajectory of the system state error toward a specified surface, called the sliding surface, and keeps it there [2].
However, because of the frequent switching, chattering occurs in the SMC control input. Chattering has to be avoided because it causes high thruster wear and degrades system performance. To avoid chattering, the dynamics in a small vicinity of the discontinuity surface need to be altered using a smoothing function such as a saturation function or a hyperbolic tangent function [3, 4]. Unfortunately, accuracy and robustness are partially lost, because convergence is ensured only to a boundary layer around the sliding surface.
To overcome the chattering effect, second-order SMC controllers have been proposed [5, 6]. A second-order SMC controller requires no smoothing function to produce a continuous control signal, and it allows finite-time convergence to zero of the first time derivative of the sliding surface. However, a second-order SMC controller takes longer for its error to converge to zero.
Another robust control technique used in underwater environments is Time Delay Control (TDC), a relatively new technique. It assumes that a continuous signal remains approximately unchanged over a sufficiently short time interval. Therefore, past observations of uncertainties and disturbances can be used directly in the controller. Even in the presence of sensor noise and ocean current disturbances, good performance is achievable with a TDC controller [7, 8].
In general, a TDC controller consists of a time delay estimator and a linear controller. However, the introduced delay leaves the TDC controller unable to eliminate the estimation error that arises. To avoid critically affecting the stability and performance of the system, the feedback data acquisition rate has to be fast enough to keep the delay time short.
In this paper, the position of the AUV is controlled using a time-invariant tracking control method based on a robust filter approach. With this approach, first proposed in [9], robustness against parameter uncertainties, model nonlinearities, and unexpected external disturbances is achievable with only inertia matrix information. The controller [10, 11] consists of a nominal controller and a robust compensator.
This paper contains six sections. Section 1 introduces the research background, Sect. 2 presents the µAUV dynamic model, and Sect. 3 presents the control objectives. Section 4 presents the design of the proposed robust tracking control, Sect. 5 discusses the simulation results, and Sect. 6 concludes the paper.
2 Mathematical Modeling of µAUV
Before defining the model, the reference frames need to be defined. An AUV is best described as a nonlinear system, and two reference frames are considered: the Earth-fixed frame and the Body-fixed frame. Standard notation from the Society of Naval Architects and Marine Engineers (SNAME) is used in this paper for easier understanding. Figure 1 shows the defined reference frames. The Earth-fixed frame has its x-axis and y-axis pointing towards North and East, respectively, while its z-axis points downwards, normal to the surface of the earth. The Body-fixed frame has its origin at the center of gravity of the AUV.
In this paper, the AUV is assumed to move at a constant depth and to be passively stable in roll. Therefore, the corresponding elements are neglected in the derivation of the dynamic equations.
Fig. 1 Defined Earth-fixed frame and Body-fixed frame
The nonlinear equations of motion in the Body-fixed frame are expressed in vectorial form in (1)–(6), where $\nu$ is the vector of linear and angular velocities expressed in the Body-fixed frame, $M_{RB}$ is the rigid-body system inertia matrix, and $M_A$ is the added-mass system inertia matrix. $D_L$ and $D_Q$ are the linear and quadratic hydrodynamic damping matrices, respectively. $L$ is the lift matrix and $\tau$ is the vector of Body-fixed forces from the actuators. For simplicity, the lift matrix is assumed to act as an input.
$$(M_{RB} + M_A)\dot{\nu} + (D_L + D_Q|\nu|)\nu = \tau + L|\nu|\nu \tag{1}$$
$$\nu = [\,u \;\; v \;\; r\,]^T \tag{2}$$
$$M_{RB} = \mathrm{diag}[\,m \;\; m \;\; I_z\,] \tag{3}$$
$$M_A = \mathrm{diag}[\,M_{Au} \;\; M_{Av} \;\; M_{Ar}\,] \tag{4}$$
$$D_L = \mathrm{diag}[\,D_{Lu} \;\; D_{Lv} \;\; D_{Lr}\,] \tag{5}$$
$$D_Q = \mathrm{diag}[\,D_{Qu} \;\; D_{Qv} \;\; D_{Qr}\,] \tag{6}$$
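As a rough illustration of how (1) can be evaluated numerically, the sketch below solves it for the body-frame acceleration, exploiting the diagonal structure of (3)–(6). Several things here are assumptions rather than details from the paper: the absolute value of the velocity vector is taken element-wise, the lift term is passed in as a ready-made input vector, and all numeric values are placeholders chosen for illustration (they are not the µAUV parameters of [1]).

```python
def body_acceleration(nu, tau, m_rb, m_a, d_l, d_q, lift=(0.0, 0.0, 0.0)):
    """Solve Eq. (1) for nu_dot when every matrix is diagonal.

    nu, tau             : [surge, sway, yaw] velocities and actuator forces/moment
    m_rb, m_a, d_l, d_q : diagonal entries of M_RB, M_A, D_L and D_Q
    lift                : lift contribution, treated here as a given input
    """
    nu_dot = []
    for i in range(3):
        # Linear plus quadratic damping, with |nu| taken element-wise (assumption).
        damping = (d_l[i] + d_q[i] * abs(nu[i])) * nu[i]
        nu_dot.append((tau[i] + lift[i] - damping) / (m_rb[i] + m_a[i]))
    return nu_dot


# Placeholder parameters (assumed, NOT taken from [1]).
M_RB = [6.0, 6.0, 0.25]   # diag[m, m, I_z]
M_A = [0.6, 5.0, 0.20]    # diag[M_Au, M_Av, M_Ar]
D_L = [1.0, 4.0, 0.10]    # diag[D_Lu, D_Lv, D_Lr]
D_Q = [2.0, 8.0, 0.05]    # diag[D_Qu, D_Qv, D_Qr]

# Surge thrust only, vehicle at rest: only the surge axis accelerates.
acc = body_acceleration([0.0, 0.0, 0.0], [1.32, 0.0, 0.0], M_RB, M_A, D_L, D_Q)
```

With the vehicle at rest the damping terms vanish, so the surge acceleration is simply the thrust divided by the total (rigid-body plus added) mass.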
Body-fixed linear and angular velocities are expressed in the Earth-fixed frame using the Euler angle transformation shown in (7)–(9), where $\eta$ is the vector of position and attitude expressed in the Earth-fixed frame and $J(\psi)$ is the Jacobian matrix.
$$\dot{\eta} = J(\psi)\,\nu \tag{7}$$
$$\eta = [\,x \;\; y \;\; \psi\,]^T \tag{8}$$
$$J(\psi) = \begin{bmatrix} \cos\psi & -\sin\psi & 0 \\ \sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{9}$$
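The transformation in (7)–(9) can be sketched in a few lines of Python. The function name and the sample values are illustrative only; the rotation uses the standard planar form with a negative sine term in the first row.

```python
import math


def body_to_earth(psi, nu):
    """Apply Eq. (7): eta_dot = J(psi) * nu, with nu = [u, v, r]."""
    u, v, r = nu
    c, s = math.cos(psi), math.sin(psi)
    x_dot = c * u - s * v    # first row of J(psi)
    y_dot = s * u + c * v    # second row of J(psi)
    psi_dot = r              # yaw rate is the same in both frames
    return [x_dot, y_dot, psi_dot]


# Heading due East (psi = 90 deg): pure surge motion moves the vehicle along +y.
eta_dot = body_to_earth(math.pi / 2, [0.2, 0.0, 0.1])
```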
3 Control Objectives
Before designing the trajectory tracking controller, the tracking error is first defined as in (10) and (11), where $e$ is the tracking error vector in the Earth-fixed frame and $\eta_d$ is the vector of desired position and orientation. Because the AUV is underactuated in the sway direction, the desired yaw angle and the desired velocities in the x and y directions are linked by (12).
$$e = \eta_d - \eta \tag{10}$$
$$e = [\,e_x \;\; e_y \;\; e_\psi\,]^T \tag{11}$$
$$\psi_d = \tan^{-1}\!\left(\frac{\dot{y}_d}{\dot{x}_d}\right) \tag{12}$$
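A minimal sketch of (10) and (12) follows. It substitutes `math.atan2` for the plain inverse tangent so that all four quadrants are handled, which is a choice made here and not something the paper specifies; wrapping of the yaw error is also left out. The sample values correspond to the straight-line case of Table 1.

```python
import math


def desired_yaw(x_dot_d, y_dot_d):
    """Eq. (12), using atan2 instead of a plain arctangent (assumption)."""
    return math.atan2(y_dot_d, x_dot_d)


def tracking_error(eta_d, eta):
    """Eq. (10): e = eta_d - eta, with eta = [x, y, psi]."""
    return [d - a for d, a in zip(eta_d, eta)]


# Straight line along x with desired velocity 0.2 and an offset of 0.5 in y.
psi_d = desired_yaw(0.2, 0.0)
e = tracking_error([0.0, 0.5, psi_d], [0.0, 0.0, 0.0])
```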
The first objective of this research is to design a controller that lets an underactuated AUV track a predefined, time-varying trajectory in the horizontal plane. Using only the force in the surge direction and the moment in the yaw direction, the proposed controller should drive the tracking errors of the underactuated AUV in x, y and $\psi$ to zero.
The second objective is to design a robust filter that compensates for the effect of unknown hydrodynamic parameters on the AUV, since the complete mathematical representation of the AUV is not available.
4 Robust Tracking Control Design
This section presents the design of the proposed tracking control for an underactuated AUV in the horizontal plane using the robust filter approach. Figure 2 shows the block diagram of the proposed controller.
The proposed controller is designed in three steps. First, the tracking error has to be transformed so that it can be driven to zero using only the force in the surge direction and the moment in the yaw direction. The Earth-fixed tracking error vector in (10) is transformed into the error vector in the Body-fixed frame given in (13)–(16), where $\alpha$ is a positive constant related to the convergence rate of $y_e$.
$$\eta_e = [\,x_e \;\; y_e \;\; \psi_e\,]^T \tag{13}$$
$$x_e = \cos(\psi)\,e_x + \sin(\psi)\,e_y \tag{14}$$
$$y_e = -\sin(\psi)\,e_x + \cos(\psi)\,e_y \tag{15}$$
$$\psi_e = e_\psi + \alpha\,y_e \tag{16}$$
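The error transformation in (13)–(16) can be sketched as below. The negative sign on the sine term of the sway error follows the usual rotation into the body frame and is an assumption about the garbled original; the zero initial heading in the example is likewise assumed. The coupling term is what lets a lateral offset be corrected through the yaw channel even though there is no sway actuator.

```python
import math


def body_frame_error(e, psi, alpha):
    """Eqs. (14)-(16): rotate [e_x, e_y] into the body frame and couple yaw to y_e."""
    e_x, e_y, e_psi = e
    x_e = math.cos(psi) * e_x + math.sin(psi) * e_y
    y_e = -math.sin(psi) * e_x + math.cos(psi) * e_y
    psi_e = e_psi + alpha * y_e   # lateral error is folded into the yaw error
    return [x_e, y_e, psi_e]


# Initial error of Simulation 1 (Table 1) with an assumed heading of psi = 0 and alpha = 1.
eta_e = body_frame_error([0.0, 0.5, 0.0], psi=0.0, alpha=1.0)
```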
Fig. 2 Block diagram of the proposed controller
The second step is to design a robust filter that compensates for the effect of added mass and hydrodynamic damping forces on the AUV system, as used in [12]. Since the complete mathematical representation of the AUV is unknown, an artificial equivalent-disturbance signal $q$, defined in (17) and (19), is introduced to represent the effect of added mass and damping forces on the AUV system. This equivalent disturbance is then compensated by the compensating signal $u_R$ in (18), which is produced by a unity-gain low-pass filter. $F_{LP}$ is the low-pass filter in (20), and $f_s$ and $f_l$ are two positive constants related to the undamped natural frequency of the filter.
$$M_{RB}\,\dot{\nu} + q = \tau \tag{17}$$
$$u_R = F_{LP}\,q \tag{18}$$
$$q = \tau - M_{RB}\,\dot{\nu} \tag{19}$$
$$F_{LP}(s) = \left[\, \frac{f_l f_s}{(s + f_l)(s + f_s)} \quad 0 \quad \frac{f_l f_s}{(s + f_l)(s + f_s)} \,\right] \tag{20}$$
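To make the filtering step concrete, the sketch below implements one nonzero channel of the low-pass filter in (20) as two cascaded first-order lags integrated with forward Euler. The discretisation, the step size and the constant test disturbance are assumptions made for illustration; the paper's simulations were run in Simulink and do not state a discretisation.

```python
def make_robust_filter(f_l, f_s, dt):
    """Return a step function for one channel of F_LP(s) = f_l*f_s / ((s+f_l)(s+f_s)).

    The second-order filter is realised as two cascaded first-order lags,
    integrated with forward Euler (an assumed discretisation).
    """
    state = [0.0, 0.0]

    def step(q):
        state[0] += dt * f_l * (q - state[0])         # first lag,  f_l / (s + f_l)
        state[1] += dt * f_s * (state[0] - state[1])  # second lag, f_s / (s + f_s)
        return state[1]                               # u_R for this channel

    return step


# Constant equivalent disturbance q = 1 with f_l = 8 and f_s = 2.
surge_filter = make_robust_filter(f_l=8.0, f_s=2.0, dt=0.001)
u_r = 0.0
for _ in range(10_000):   # 10 s of simulated time
    u_r = surge_filter(1.0)
```

Because the filter has unity DC gain, the compensating signal settles at the value of a constant disturbance, which is the property the robust compensator relies on.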
The final step is to design a nominal controller that introduces the desired error dynamics into the AUV system. The nominal control signal, which is similar to a PD controller, is given in (21), where $K_D$ and $K_P$ are the derivative and proportional gain matrices, respectively. With suitable derivative and proportional gains, the predefined error dynamics in (22) drive the introduced tracking error to zero.
$$u_N = M_{RB}\,(K_D\,\dot{\eta}_e + K_P\,\eta_e) \tag{21}$$
$$\ddot{\eta}_e + K_D\,\dot{\eta}_e + K_P\,\eta_e = 0 \tag{22}$$
In the proposed controller, the inputs from the robust compensator and the nominal controller are combined as shown in (23), where $u_R$ is the robust compensating signal and $u_N$ is the nominal control signal.
$$\tau = u_R + u_N \tag{23}$$
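The combination of (21) and (23) can be sketched as follows for diagonal matrices. The gains are the values listed in the simulation section; the yaw inertia in `M_RB` is a placeholder assumed for illustration, and the robust compensating signal is set to zero only to isolate the nominal term.

```python
def control_signal(eta_e, eta_e_dot, u_r, m_rb, k_p, k_d):
    """Eqs. (21) and (23) for diagonal M_RB, K_P and K_D."""
    tau = []
    for i in range(3):
        u_n = m_rb[i] * (k_d[i] * eta_e_dot[i] + k_p[i] * eta_e[i])  # Eq. (21)
        tau.append(u_r[i] + u_n)                                       # Eq. (23)
    return tau


K_P = [0.2, 0.0, 0.89]    # proportional gains, as in (24)
K_D = [0.2, 0.0, 0.89]    # derivative gains, as in (25)
M_RB = [6.0, 6.0, 0.25]   # diag[m, m, I_z]; the I_z value is an assumed placeholder

tau = control_signal(
    eta_e=[0.0, 0.5, 0.5], eta_e_dot=[0.2, 0.0, 0.0],
    u_r=[0.0, 0.0, 0.0], m_rb=M_RB, k_p=K_P, k_d=K_D,
)
```

The zero sway gains keep the commanded sway force at zero, consistent with the vehicle having actuation only in surge and yaw.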
5 Simulations
For simulation, Simulink™ is used to verify the performance of the proposed controller. The AUV parameters of the model in (1) are based on the parameters presented in [1], while the control parameter values are given in (24)–(27).
$$K_P = \mathrm{diag}[\,0.2 \;\; 0 \;\; 0.89\,] \tag{24}$$
$$K_D = \mathrm{diag}[\,0.2 \;\; 0 \;\; 0.89\,] \tag{25}$$
$$f_l = 8 \tag{26}$$
$$f_s = 2 \tag{27}$$
Simulation 1 tests the performance of the proposed controller on a straight-line trajectory with constant velocity. The parameter values used are shown in Table 1 and the results are shown in Figs. 3, 4 and 5. At constant velocity, the controller tracks the straight-line trajectory and drives the initial error in the y direction to zero within 30 s.
Table 1 Straight-line trajectory with constant velocity simulation parameters
Desired trajectory                                   $\eta_d = [\,0.2t \;\; 0.5 \;\; 0\,]^T$
Initial position in y direction                      $e(0) = [\,0 \;\; 0.5 \;\; 0\,]^T$
Initial velocity in x direction                      $\dot{e}(0) = [\,0.2 \;\; 0 \;\; 0\,]^T$
Positive constant related to converging rate of $y_e$    $\alpha = 1$
Fig. 3 Position response of straight-line trajectory tracking
Next, Simulation 2 compares the proposed controller against the Model-Free High Order Sliding Mode Control (MFHOSMC) controller designed in [6] on a sinusoidal desired trajectory. The parameter values used are shown in Table 2. As Fig. 6 shows, both controllers achieve a path similar to the desired path. In Fig. 7, the tracking error of the proposed controller reaches steady state in 22 s, while the MFHOSMC controller requires 25 s. Finally, Fig. 8 compares the steady-state tracking error in the y direction: the error of the proposed controller is bounded within $2 \times 10^{-3}$, while that of the MFHOSMC controller is bounded within $20 \times 10^{-3}$. The tracking error is larger in the y direction because there is no actuator in that direction.
Fig. 4 Tracking error in x direction of straight-line trajectory tracking
Fig. 5 Tracking error in y direction of straight-line trajectory tracking
Table 2 Sinusoidal trajectory tracking simulation parameters
Desired trajectory                                   $\eta_d = [\,0.2t \;\; \sin(0.05t) \;\; 0.25\cos(0.05t)\,]^T$
Initial position in y direction                      $e(0) = [\,0 \;\; 0 \;\; 0.25\,]^T$
Initial velocity in x direction                      $\dot{e}(0) = [\,0.2 \;\; 0.05 \;\; 0\,]^T$
Positive constant related to converging rate of $y_e$    $\alpha = 4$
Fig. 6 Position response of sinusoidal trajectory tracking
Fig. 7 Tracking error in x direction of sinusoidal trajectory tracking
Fig. 8 Tracking error in y direction of sinusoidal trajectory tracking
6 Conclusions
This paper proposed an underwater tracking control method using a robust filter approach. With the proposed controller, the effects of external influences on the AUV’s behavior are minimized even though a complete representation of the AUV system is not available. Simulation results show that the proposed controller tracks both straight-line and sinusoidal trajectories, reaching steady state in 22 s on the sinusoidal trajectory compared with 25 s for the MFHOSMC controller.
Acknowledgements The authors would like to thank RUI grant (Grant no.: 1001/PELECT/
8014088) and Universiti Sains Malaysia for supporting the research.
References
1. Wahed MA, Arshad MR (2019) Modeling of Torpedo-Shaped Micro Autonomous
Underwater Vehicle. Springer, Singapore
2. Shtessel Y, Edwards C, Fridman L, Levant A (2014) Sliding Mode Control and Observation.
Springer, New York
3. Guo J, Chiu FC, Huang CC (2003) Design of a sliding mode fuzzy controller for the guidance
and control of an autonomous underwater vehicle. Ocean Eng 30(16):2137–2155
4. Hoang NQ, Kreuzer E (2008) A robust adaptive sliding mode controller for remotely operated
vehicles. Tech Mech 28(3–4):185–193
5. Deng CN, Ge T (2013) Depth and heading control of a two DOF underwater system using a model-free high order sliding controller with transient process. In: Proceedings of 2013 5th International Conference on Measuring Technology and Mechatronics Automation, ICMTMA 2013, pp 423–426
6. García-Valdovinos LG, Salgado-Jiménez T, Bandala-Sánchez M, Nava-Balanzar L, Hernández-Alvarado R, Cruz-Ledesma JA (2014) Modelling, design and robust control of a remotely operated underwater vehicle. Int J Adv Robot Syst 11(1):1–16
7. Prasanth Kumar R, Dasgupta A, Kumar CS (2007) Robust trajectory control of underwater vehicles using time delay control law. Ocean Eng 34(5–6):842–849
8. Park JY, Cho BH, Lee JK (2009) Trajectory-tracking control of underwater inspection robot for nuclear reactor internals using Time Delay Control. Nucl Eng Des 239(11):2543–2550
9. Zhong YS (2002) Robust output tracking control of SISO plants with multiple operating points and with parametric and unstructured uncertainties. Int J Control 75(4):219–241
10. Gilbert S, Varghese E (2017) Design and simulation of robust filter for tracking control of quadcopter system. In: 2017 International Conference on Circuit, Power and Computing Technologies, ICCPCT, Kollam, pp 1–7
11. Yu Y, Zhong YS (2008) Robust tracking control for a 3DOF helicopter with multi-operation points. In: Proceedings 27th Chinese Control Conference, CCC, pp 733–737
12. Song YS, Arshad MR (2016) Tracking control design for autonomous underwater vehicle using robust filter approach. In: Autonomous Underwater Vehicles 2016, AUV 2016, pp 374–380
Design and Development of Remotely Operated Pipeline Inspection Robot
Mohd Shahrieel Mohd Aras, Zainah Md Zain,
Aliff Farhan Kamaruzaman, Mohd Zamzuri Ab Rashid,
Azhar Ahmad, Hairol Nizam Mohd Shah, Mohd Zaidi Mohd Tumari,
Alias Khamis, Fadilah Ab Azis, and Fariz Ali@Ibrahim
Abstract A Pipeline Inspection Robot (PIR) is a type of mobile robot that is operated remotely or autonomously with little to no human intervention, inspecting various parts of a pipeline system and even cleaning the inner walls of the pipes using integrated programs. Although Malaysia is developing rapidly in its industrial sectors, the development and application of PIRs specifically for monitoring pipeline systems are still not widely studied or applied. The proposed PIR can help to monitor and inspect pipes with diameters ranging from 215 to 280 mm that are impossible for humans to reach or hazardous to human life. In addition, the PIR makes the inspection operation easier and saves working time. This project focuses on the design and development of a suitable PIR for pipeline system monitoring. The PIR is designed using the SolidWorks software, in which several simulations, such as stress and strain analysis, are conducted. The PIR is fabricated from aluminium and uses an adaptive mechanism structure that allows the robot to adapt to changing pipe diameters. Moreover, the PIR is controlled by a microcontroller. Experiments are performed to verify the robot’s performance, such as its ability to adapt to the pipeline. The results show that the PIR has an average speed of 0.0096 m/s and can move accurately in a straight line in the pipeline.
Keywords Pipeline Inspection Robot · Solid works design · Performances analysis
M. S. Mohd Aras (✉) · A. F. Kamaruzaman · M. Z. Ab Rashid · H. N. Mohd Shah · A. Khamis · F. Ab Azis · F. Ali@Ibrahim
Underwater Technology Research Group (UTeRG), Centre for Robotics and Industrial
Automation (CERIA), Fakulti Kejuruteraan Elektrik, Universiti Teknikal Malaysia Melaka,
76100 Durian Tunggal, Melaka, Malaysia
e-mail: shahrieel@utem.edu.my
Z. Md Zain
Robotics & Unmanned Systems (RUS) Research Group, Faculty of Electrical and Electronics
Engineering, Universiti Malaysia Pahang, 26600 Pekan, Pahang, Malaysia
A. Ahmad · M. Z. Mohd Tumari
Fakulti Teknologi Kejuruteraan Elektrik dan Elektronik, Universiti Teknikal Malaysia
Melaka, 76100 Durian Tunggal, Melaka, Malaysia
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_2
1 Introduction
The Pipeline Inspection Robot is a mobile robot that is equipped with a camera and is used specifically to inspect various parts of pipeline systems. PIRs are widely used in water supply, petrochemical, and other industries that work with fluid transportation [1–3]. Pipelines are crucial equipment for transporting fuel oil and gas, delivering drinking water, and transferring pollutants [4]. Piping networks suffer from problems such as corrosion, aging, cracks, and mechanical abrasion. Hence, constant inspection, maintenance, and repair are needed [5]. Pipeline inspection robots are used to investigate internal disintegration, fractures, and defects, which arise mainly from causes such as corrosion, degradation, and overheating [6]. After decades of development in the robotics field, pipeline robots come in numerous designs, such as the wheel type, caterpillar type, wall-press type, legged type, inchworm type, and screw type [2]. In this project, a PIR is designed and developed using the SOLIDWORKS software. The robot is designed specifically for a straight pipeline system and can adapt to various pipeline diameters. The PIR is programmed through a microcontroller, the Arduino Mega2560. The performance of the PIR is assessed on its ability to move in pipelines of various diameters and its ability to inspect the pipelines.
The aim of this project is to design and develop the PIR using the SOLIDWORKS software, fabricate the robot, and analyze its performance. The goal is a PIR that is not too complex, is low in cost, can adapt to various pipelines, and is multifunctional. However, the performance of other types of complex robot is detailed in this project. Pipelines are generally used to transport fluids from place to place. The use of pipelines across Malaysian industries is growing rapidly [7]. Several organizations are well known in the pipeline industry, namely Lembaga Air Sarawak, Telekom Malaysia, Petronas, and Indah Water. For example, Petronas alone operates about 2500 km of gas transmission pipeline in the country [8]. Nowadays, modern housing and town planning in Malaysia mostly rely on centralized sewage systems. With these new sewage systems, the pipelines of all houses are connected to one station for each district. In addition, more pipeline networks will be constructed in the future.
These pipelines will require constant maintenance and supporting technology as pipeline repair becomes more vital [9]. There has been a series of accidents involving pipelines over the years. According to Carl Weimer [10], the executive director of the Pipeline Safety, 135 excavation accidents involving pipelines that transport dangerous chemicals such as crude oil and petroleum occurred over the last 10 years. This amounts to roughly one incident every month. Apart from that, on the 31st of July 2014, a series of gas explosions occurred in the Cianjhen and Lingya districts of Kaohsiung, Taiwan. Earlier that evening, there were reports of gas leaks; the blasts killed thirty-two people and wounded 321 others [11]. This year, on the 1st of August, another series of natural gas pipeline explosions occurred in Midland County, Texas, and five people were sent to hospital with critical burn injuries. Officials said the cause of the explosions was unknown [12].
2 Methodology
The whole system is constructed as shown in Fig. 1. The control module consists of the controller, which is wired to the Arduino Mega2560. The inspection module consists of a pan-and-tilt CCD camera attached to a servo motor and a computer used to obtain real-time images or video recordings for pipe inspection. The moving module consists of the motor driver, 12 V DC motors, gears, and the wheels. The whole system is powered by a power supply connected externally to the robot.
The Pipeline Inspection Robot is shown from different views in Fig. 2(a)–(d). The designed robot fits pipe diameters ranging from 90 to 130 mm. The robot uses an adaptive mechanism in which spring tension acts as a passive support that keeps the robot in contact with the inner walls of the pipe. The designed robot has a length of 15 cm, and its arms have a maximum reach of 130 mm. The most contracted and most expanded states of the robot arm are shown in Fig. 2(e) and (f), respectively. The body tube, which acts as the main body of the designed robot, is used to store the electrical components. The designed robot uses stainless steel as the main material for most of its parts. Stainless steel was chosen mainly for its resistance to corrosion and oxidation, as the robot is going to be used to inspect pipelines in various conditions. In addition, a transparent acrylic plastic cover is attached to both the front and the rear of the robot to protect the electrical components inside the body tube, especially the camera used for inspecting the pipelines.
3 Results and Discussions
Fig. 1 The block diagram of the pipeline inspection robot
The results of the stress and strain analysis carried out in SolidWorks on certain parts of the robot are shown in Fig. 3. All parts are loaded with the same force of 100 N and assigned the same material, Annealed Stainless Steel, which has a yield strength of 2.750e8 N/m2. The maximum stress produced by the 100 N force on the Body Tube is 2.656e5 N/m2, which is lower than the yield strength of the material. Therefore, the body tube operates within safe limits. For the robot part shown in Fig. 3, the yield strength is likewise 2.750e8 N/m2 and the maximum stress given by the 100 N force is 4.325e68 N/m2 which is lower than the yield strength. Therefore, this part of the robot also operates within the safe limit.
The same holds for the other two robot parts in Fig. 3: they operate within safe limits because their maximum stress is below the yield strength. The specifications and measurements of the fabricated robot are shown in Table 1.
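The safe-limit check described above amounts to comparing the maximum stress against the yield strength. A minimal sketch for the body tube is given below, using the two values reported in the text; the function and variable names are illustrative, and expressing the margin as a ratio is a choice made here, not a step the authors report.

```python
def safety_factor(yield_strength, max_stress):
    """Ratio of yield strength to maximum stress (both in N/m^2)."""
    return yield_strength / max_stress


YIELD_ANNEALED_SS = 2.750e8   # N/m^2, yield strength reported for Annealed Stainless Steel
BODY_TUBE_STRESS = 2.656e5    # N/m^2, maximum stress reported for the 100 N load case

sf = safety_factor(YIELD_ANNEALED_SS, BODY_TUBE_STRESS)
within_limits = sf > 1.0      # the part is within safe limits when the ratio exceeds 1
```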
Fig. 2 A view of the designed pipeline inspection robot using SolidWorks software
Fig. 3 The stress and strain analysis results on the certain parts of the PIR using SolidWorks software
Table 1 The specifications and measurements of the fabricated pipeline inspection robot
Items                                     Specifications
Length (mm)                               150
Weight (kg)                               2.2
Maximum adaptive diameter (mm)            280
Minimum adaptive diameter (mm)            215
Diameter without spring attached (mm)     200
Wheels diameter (mm)                      30
Average speed (m/s)                       0.0096
The differences between the designed and the fabricated Pipeline Inspection Robot lie mainly in the adaptive mechanism linkage, which connects to the wheels of the robot. The changes were made because the adaptive mechanism parts of the designed robot were too small to be fabricated. The changes in measurements increased both the maximum and the minimum extended diameter of the Pipeline Inspection Robot. Hence, pipes with a bigger diameter are needed to analyze the performance of the fabricated robot. The changes in measurements also increased the robot’s weight. The robot is quite heavy, at 2.2 kg. It turned out heavier than expected after fabrication, and thus the DC motors used to drive the robot did not have enough power to move it adequately. The robot is rather slow, with an average speed of 0.0096 m/s. Thus, further modifications of the fabricated Pipeline Inspection Robot are recommended as future work to improve its driving speed. The parts of the Pipeline Inspection Robot are made entirely of aluminium. Aluminium has a very low specific weight, about one third that of iron. Hence, it reduces the robot’s weight compared with using other common metals. Furthermore, aluminium has a very high resistance to corrosion and oxidation, which suits the Pipeline Inspection Robot, as it will travel inside pipelines in various conditions. Despite the beneficial properties of aluminium, the fabricated Pipeline Inspection Robot turned out quite heavy, and thus further research and development will be carried out in future work. Next, the transparent body covers for the front and back of the Pipeline Inspection Robot could not be completed because of time constraints. The fabrication, modification, and assembly of the robot took a tremendous amount of time. The designed body covers, made of acrylic plastic, are intended to protect the electronic parts inside the body of the robot, including the camera that will be placed inside the robot’s body for inspection (Fig. 4).
The experiment was prepared to observe and analyze the robot’s average speed in a
320 mm long pipe with a diameter of 266 mm. Ten trials were carried out, and the time
taken for the robot to move through the pipe and the resulting average speed are
recorded in Table 2. The robot took an average of 33.2 s to reach the end of the
320 mm long pipe, which corresponds to an average speed of 0.0096 m/s. The robot’s
speed can be further improved with proper modifications in future work.
Fig. 4 A view of the fabricated Pipeline Inspection Robot
Table 2 The results of the pipeline inspection robot speed test
Trial | Time taken to move inside the pipeline (320 mm length, 266 mm diameter) (s)
1 | 31
2 | 33
3 | 35
4 | 31
5 | 34
6 | 33
7 | 32
8 | 35
9 | 36
10 | 32
Average time | 33.2 s
Average speed | 0.0096 m/s
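The averages reported for the speed test can be reproduced from the ten trial times. The following is a minimal sketch that assumes the speed is simply the pipe length divided by the mean traversal time; the variable names are illustrative only.

```python
# Trial times in seconds, copied from Table 2; pipe length taken from the text.
trial_times_s = [31, 33, 35, 31, 34, 33, 32, 35, 36, 32]
pipe_length_m = 0.320

# Mean traversal time over the ten trials.
average_time_s = sum(trial_times_s) / len(trial_times_s)

# Assumption: average speed = pipe length / mean traversal time.
average_speed_m_per_s = pipe_length_m / average_time_s

print(f"average time:  {average_time_s:.1f} s")
print(f"average speed: {average_speed_m_per_s:.4f} m/s")
```

Rounded to the precision used in Table 2, this gives 33.2 s and 0.0096 m/s.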
4 Conclusion
The design of the Pipeline Inspection Robot (PIR), with its specifications and
features, has been completed successfully. The fabrication of the robot was also a
success, although a few modifications were made to the measurements and
specifications of the PIR. The performance of the PIR in terms of flexibility can be
analyzed further after proper modifications. Throughout the fabrication process, a few
measurements of the robot’s parts were changed because some parts were too small to be
fabricated. These changes were made carefully and the robot was fabricated
successfully. One unexpected result emerged after fabrication: the robot was heavier
than expected, and this affected its speed. There are many ways to improve the
Pipeline Inspection Robot in terms of performance and design, and future work is
needed to improve its performance and develop it further.
Acknowledgements The authors would like to thank Universiti Malaysia Pahang for the provision of PJP grant (RDU170366). Special appreciation and gratitude go to the Centre
of Research and Innovation Management (CRIM) and the Centre for Robotics and Industrial Automation
(CERIA) for supporting this research, and to the Faculty of Electrical Engineering, UTeM, for
supporting this research under PJP (PJP/2019/FKE(3C)/S01667).
References
1. Harish P, Venkateswarlu V (2013) Design and motion planning of indoor pipeline inspection
robot. Int J Innov Technol Explor Eng 3(7):41–47
2. Bhadoriya AVS, Gupta VK, Mukherjee S (2018) Development of in-pipe inspection robot.
Mater Today Proc 5(9):20769–20776
3. Nayak A, Pradhan SK (2014) Design of a new in-pipe inspection robot. Procedia Eng
97:2081–2091
4. Lee D, Park J, Hyun D, Yook G, Yang HS (2012) Novel mechanisms and simple locomotion
strategies for an in-pipe robot that can inspect various pipe types. Mech Mach Theory 56:52–
68
5. Roh SG, Choi HR (2005) Differential-drive in-pipe robot for moving inside urban gas
pipelines. IEEE Trans Robot 21(1):1–17
6. Roslin NS, Anuar A, Jalal MFA, Sahari KSM (2012) A review: Hybrid locomotion of in-pipe
inspection robot. Procedia Eng 41:1456–1462
7. Abidin ASZ (2015) Development of track wheel for in-pipe robot application. Procedia
Comput Sci 76:500–505
8. Bujang AS, Bern CJ, Brumm TJ (2016) Summary of energy demand and renewable energy
policies in Malaysia. Renew Sustain Energy Rev 53:1459–1467
9. Enner F, Rollinson D, Choset H (2013) Motion estimation of snake robots in straight pipes.
In: Proceedings of IEEE International Conference on Robotics and Automation, Germany,
pp 5168–5173. IEEE
10. How often do pipelines blow up? https://money.cnn.com/2016/11/01/news/pipelinefatalities/
index.html. Accessed 25 May 2019
11. Multiple gas explosions rock Kaohsiung streets. http://focustaiwan.tw/news/asoc/
201408010001.aspx. Accessed 25 May 2019
12. Natural Gas Pipeline Explosions in Texas Critically Injure 5 Workers. https://www.huffpost.
com/entry/natural-gas-pipeline-explosionstexas_n_5b62964be4b0fd5c73d62c97. Accessed
25 May 2019
Vision Optimization for Altitude Control
and Object Tracking Control
of an Autonomous Underwater
Vehicle (AUV)
Joe Siang Keek, Mohd Shahrieel Mohd Aras, Zainah Md. Zain,
Mohd Bazli Bahar, Ser Lee Loh, and Shin Horng Chong
Abstract Underwater vision is very different from atmospheric vision, in that the
former is subject to a dynamic and visually noisy environment. Absorption of
light by the water and ripples caused by atmospheric wind result in
uncertain refraction of light in the underwater environment, which continuously
disturbs the visual data collected. It is therefore always
challenging to obtain reliable visual data for the control of an autonomous
underwater vehicle (AUV). In this paper, an AUV was developed and tasked to
perform altitude control and object (pole) tracking control in a swimming pool
using only a forward-viewing camera and a convex mirror. As a step prior to the design
and development of the control system for the AUV, this paper focuses only on
utilizing and optimizing the visual data acquired. The processing involves
only gray-scaled images, without any of the common color restoration or image
enhancement techniques. The image processing technique implemented for
the object tracking control contains a self-optimizing algorithm, which
improves the object detection. The results show that, under a similarly
challenging and dynamic underwater environment, detection with optimization
is 80% more successful than without it.
Keywords Vision optimization · Altitude control · Autonomous underwater vehicle · Object tracking control
J. S. Keek · M. S. Mohd Aras (✉) · M. B. Bahar · S. L. Loh · S. H. Chong
Faculty of Electrical Engineering, Universiti Teknikal Malaysia Melaka, Jalan Hang Tuah
Jaya, 76100 Durian Tunggal, Melaka, Malaysia
e-mail: shahrieel@utem.edu.my
Z. Md. Zain
Robotics and Unmanned Systems (RUS) Research Group, Faculty of Electrical and
Electronics Engineering, Universiti Malaysia Pahang, 26600 Pekan, Pahang, Malaysia
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_3
1 Introduction
Besides the universe up in the sky and beyond, the underwater world is another
universe that has long been on mankind’s to-explore list. While mankind has observed
objects millions of light years away, the exploration of the underwater world is
still incomplete, even though the ocean is only kilometers deep. The main reason is
the medium itself: water not only hinders the transmission of radio frequency (RF)
signals, it also refracts and absorbs visible light, so underwater exploration
encounters various difficulties even in shallow water. As vision is one of the most
informative sources of feedback sensing, losing it means a ‘handicapped’ autonomous
underwater vehicle (AUV). Exploring the underwater world without vision is therefore
not preferable.
In an underwater environment, visible light is refracted. Even worse, a gust
of wind can easily create ripples, making the refraction varying and
uncertain. Therefore, the light reflected from an underwater object may show dynamic
reflections and patches over time. Moreover, water tends to absorb red and
green light, leaving a multicolored object with mostly blue color. An image taken
under water is therefore very different from an image taken on the ground, and
additional image processing techniques are mandatory.
Conventional image processing techniques for ground images are mature and
common; however, when applied to underwater images they may be inadequate.
Various additional image processing techniques for underwater images have therefore
been developed over time. As mentioned earlier, rippling water waves cause the
underwater image to contain noise and disturbance. Image transformation techniques
such as wavelet, curvelet and contourlet are promising in overcoming this [1].
Meanwhile, as water tends to absorb most of the visible spectrum except blue,
color restoration and correction with heuristic algorithms have been proposed for
acoustic underwater images [2, 3]. Occasionally, working with color images can make
feature extraction and object recognition easier, but it demands roughly three times
the computational power of gray-scaled images. Zhang et al. proposed using Particle
Swarm Optimization (PSO) to optimize the gray-scale tuning parameter, with the
objective of lowering the computational power while retaining decent accuracy of
object recognition and detection [4].
As color restoration or correction techniques add complexity to the image
processing, and color images require higher computational power as well,
gray-scaled underwater images are adopted in this project. Unlike [4–7], however,
a more complicated object is used for detection, and a simple self-tuning
algorithm is implemented to cope with the dynamic underwater environment.
The final result shows a more robust detection of the deployed object.
This paper is organized as follows. Section 2 describes the hardware
and experimental setups of the AUV developed. Section 3 presents the image
processing techniques used in this paper. Section 4 presents and discusses the
experimental results, and Sect. 5 concludes the paper.
2 Hardware and Experimental Setups
The autonomous underwater vehicle (AUV) developed in this paper is equipped
with a forward-looking Raspberry Pi camera module and is tasked to acquire
altitude and object location data. To acquire both concurrently without using two
cameras (one forward-looking and one downward-looking), a convex mirror is used.
The resulting forward-looking raw image is shown in Fig. 1.
The convex mirror is a blind-spot mirror of the kind used on the rear-view mirrors
of cars. Its advantage is that it provides a zoomed-out, wider field of view. Based
on Fig. 1, the areas (sizes) of the tiles visible in the mirror are computed and used to
determine the immediate altitude of the AUV. The benefit of this hardware setup is
that both altitude and object detection data can be acquired concurrently with just
one camera. Moreover, the image can be segmented into two smaller regions of interest
(ROI) for simultaneous processing, which saves considerable computational power and
time. The details of the poles are illustrated in Fig. 2.
Fig. 1 Forward view from the perspective of the AUV in a swimming pool
Fig. 2 Illustration and detail of the object (poles) used
Overall, the frames captured by the camera have a resolution of 640 × 480 pixels
at a frame rate of 10 frames per second (fps). Although the poles are painted
bright orange, in Fig. 1 they appear to have a dark surface and the overall image
is bluish. These properties vary from time to time and from position to position.
Therefore, a self-tuning image processing technique is implemented to cope with
such dynamics, and is presented in the next section.
3 Image Processing Technique
3.1 Data for Altitude Control
To acquire altitude data efficiently, the raw image or frame is first cropped to the
region of interest (ROI), that is, where the mirror is located in the image. Since the
mirror moves along with the AUV, its position in the frame is constant and
the ROI parameters can be pre-defined. Figure 3 depicts the cropped version
of the raw image in Fig. 1.
To ease the computation, the cropped image is converted into a gray-scaled image,
in which the intensity of each pixel ranges between 0 and 255. Next, a Gaussian blur
with a 5 × 5 pixel kernel is applied to smooth the image, followed by edge detection
using the built-in Canny function of Python OpenCV. To enhance the edges, a
morphological transformation is applied, whereby the image is first dilated and then
eroded; the result is shown in Fig. 4. At this stage, the contours of the image can
be easily obtained. The shape of each contour is approximated using the
Douglas-Peucker algorithm. A polygon with four vertices is detected as a
quadrilateral, which denotes a tile of the swimming pool. Finally, the areas of the
detected quadrilaterals (tiles) are computed and collected, and the altitude of the
AUV is determined from the average of these tile areas.
Fig. 3 ROI for altitude control
Fig. 4 Morphological transformed image
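The contour approximation and tile-area steps can be illustrated without OpenCV. The sketch below is not the authors' implementation (which relies on OpenCV's built-in routines); it is a plain-Python version of the Douglas-Peucker idea applied to a closed contour, followed by a shoelace-formula area and the mean over all four-vertex polygons. The tolerance `epsilon`, the way the closed contour is split into two open chains, and all function names are assumptions made for illustration, and the mapping from mean tile area to altitude is not given in the source, so it is left out.

```python
import math

def _distance_to_segment(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def douglas_peucker(points, epsilon):
    """Simplify an open polyline: keep a point only if it lies farther than epsilon from the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _distance_to_segment(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= epsilon:
        return [first, last]
    left = douglas_peucker(points[: index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right  # drop the duplicated split point

def simplify_closed(contour, epsilon):
    """Simplify a closed contour by splitting it into two open chains (illustrative heuristic)."""
    n = len(contour)
    if n < 3:
        return list(contour)
    cx = sum(x for x, _ in contour) / n
    cy = sum(y for _, y in contour) / n
    # Start at the point farthest from the centroid, then split at the point farthest from it.
    start = max(range(n), key=lambda i: math.hypot(contour[i][0] - cx, contour[i][1] - cy))
    ring = contour[start:] + contour[:start]
    far = max(range(n), key=lambda i: math.hypot(ring[i][0] - ring[0][0], ring[i][1] - ring[0][1]))
    chain_a = douglas_peucker(ring[: far + 1], epsilon)
    chain_b = douglas_peucker(ring[far:] + [ring[0]], epsilon)
    return chain_a[:-1] + chain_b[:-1]

def polygon_area(vertices):
    """Polygon area by the shoelace formula."""
    total = 0.0
    for i in range(len(vertices)):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def mean_tile_area(contours, epsilon=2.0):
    """Mean area of all contours that simplify to four vertices; None if there are none."""
    quads = [v for v in (simplify_closed(c, epsilon) for c in contours) if len(v) == 4]
    areas = [polygon_area(v) for v in quads]
    return sum(areas) / len(areas) if areas else None
```

Contours that do not reduce to exactly four vertices, such as triangles or noisy blobs, are ignored, which mirrors the quadrilateral filter described above.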
Fig. 5 ROI for object tracking control
3.2 Data for Object Tracking Control
In this subsection, the image processing technique for locating the target object,
i.e. the poles, in the vision of the autonomous underwater vehicle (AUV) is presented.
As mentioned earlier, because the underwater environment is dynamic and noisy,
detecting the poles in the swimming pool requires a certain degree of
adaptability. Therefore, a self-tuning algorithm is discussed in this subsection,
whereby a parameter is optimized heuristically based on a purpose-designed fitness
function. As before, to minimize computational power, only the region of interest
(ROI) is cropped out for processing. The cropped image is shown in Fig. 5.
Then, the image is converted into a gray-scaled image to further lighten the
computation. In Fig. 5, the poles stand out clearly from the environment to the
human eye. Therefore, there is certainly a threshold range that can capture and
detect the poles. Since the image is gray-scaled, the lower boundary value is 0,
whereas the upper boundary value, Uop, is the parameter to be optimized. Since the
optimization involves neither a multidimensional search space nor multiple
variables, a simple optimization process is implemented: the value of Uop is
increased with a step of 1 at each iteration. During each iteration, contours are
computed, and all polygons with four vertices (quadrilaterals) are collected.
Successful and accurate detection of the poles depends on the reliability of the
fitness function designed. The fitness function, written in the Python programming
language, is presented in Algorithm 1.
Algorithm 1: Fitness function for optimizing Uop.
    …
    # Line 1: exactly two quadrilaterals, each with |angle| < 45°
    if angles is not None and len(angles) == 2 and abs(angles[0]) < 45 and \
            abs(angles[1]) < 45:
        angleDiff = abs(round(angles[0]) - round(angles[1]))
    else:
        angleDiff = 90
    if len(widths) == 2 and len(areas) == 2 and angleDiff < 45:
        widthAreaRatio = []
        for i in range(2):
            widthAreaRatio.append(widths[i] / areas[i])
        # Line 10: cost is the absolute difference of the two width-area ratios
        fitnessFunction = abs(widthAreaRatio[0] - widthAreaRatio[1])
    else:
        fitnessFunction = float('inf')
    …
    costs.append(fitnessFunction)
    …
    minimumCostLocation = costs.index(min(costs))
    minimumCost = costs[minimumCostLocation]
    optimalParameter = parameters[minimumCostLocation]
Intuitively, the characteristics of the object (poles) are used as the criteria to
design the fitness function. Based on Fig. 5, the object is made up of two poles, and
therefore in line 1 of Algorithm 1 the number of detected quadrilaterals must
equal 2. Moreover, the poles are upright and never horizontal. Therefore, the
conditions on ‘abs(angles[0])’ and ‘abs(angles[1])’ only accept quadrilaterals
whose angles lie between −45° and 45°. Next, since the two poles are parallel to
each other, their angles should not differ greatly; only an angle
difference of less than 45° is allowed.
Next, the width-area ratio is introduced; the value returned by the fitness
function is exactly the absolute difference between the width-area ratios of the two
quadrilaterals (poles), as shown in line 10 of Algorithm 1. Intuitively, the two poles
are identical and therefore have very similar widths. However, because of the cropped
region shown in Fig. 5, one or both poles may occasionally be partially cut off,
so that the pole areas obtained via the image processing differ significantly.
Therefore, the widths are normalized by their respective areas for reasonable
detection. Finally, the fitness function values of all iterations are compiled, and
the value of Uop with the minimum cost is selected as the optimal parameter.
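The sweep over Uop described above can be sketched as a self-contained loop around the checks of Algorithm 1. This is an illustrative sketch, not the authors' full program: the search range 0 to 255, the `detect` callback (standing in for thresholding, contour extraction and quadrilateral measurement at a given Uop) and the stub `fake_detect` with its numbers are all hypothetical.

```python
import math

def fitness(angles, widths, areas):
    """Cost of one candidate threshold, following the checks in Algorithm 1."""
    if angles is not None and len(angles) == 2 and all(abs(a) < 45 for a in angles):
        angle_diff = abs(round(angles[0]) - round(angles[1]))
    else:
        angle_diff = 90  # rejects the candidate in the next check
    if len(widths) == 2 and len(areas) == 2 and angle_diff < 45:
        ratios = [w / a for w, a in zip(widths, areas)]
        return abs(ratios[0] - ratios[1])
    return math.inf

def optimise_upper_bound(detect, candidates=range(0, 256)):
    """Evaluate every candidate Uop and return (best Uop, its cost)."""
    parameters, costs = [], []
    for u in candidates:
        angles, widths, areas = detect(u)
        parameters.append(u)
        costs.append(fitness(angles, widths, areas))
    best = costs.index(min(costs))  # first index holding the minimum cost
    return parameters[best], costs[best]

def fake_detect(u):
    """Hypothetical detector: returns (angles, widths, areas) of the quadrilaterals found."""
    if u < 70:
        return [], [], []                                   # nothing detected
    if u < 90:
        return [2.0, -1.0], [12.0, 11.0], [600.0, 560.0]    # two upright poles
    return [3.0, 80.0], [12.0, 40.0], [600.0, 900.0]        # one pole plus clutter

best_u, best_cost = optimise_upper_bound(fake_detect)
print(best_u, best_cost)
```

If every candidate is rejected, the returned cost is infinite, which a caller can treat as "no detection in this frame".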
4 Experimental Result and Discussion
In this section, the results of the image processing methods are presented and
discussed. The autonomous underwater vehicle (AUV) was manually
moved from one position to another to acquire raw image data. Fifteen frames
were selected to evaluate the performance of the proposed method. Table 1 presents
the altitude data obtained experimentally.
Based on Table 1, the tiles are successfully detected in all 15 frames, even
though the underwater environment is dynamic and sensitive to external disturbance.
This is because, unlike the poles, the tiles are simply easier to detect. Moreover,
the tiles are beneath the AUV, so the noisy light refraction caused by rippling water
waves does not affect the image significantly. Overall, the tile areas within each
frame have a coefficient of variation (COV) of no more than 0.27, which indicates
that the detection is reliable and consistent.
Next, the results for the detection of the poles are presented in Table 2.
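The per-frame mean and coefficient of variation can be recomputed from the listed tile areas. The source does not state how the COV was calculated; the sketch below assumes the sample standard deviation (n − 1 denominator) divided by the mean, which reproduces the tabulated values for Frames 2 and 3. The function name is illustrative.

```python
import statistics

def tile_area_stats(areas):
    """Return (mean, coefficient of variation) of the tile areas in one frame."""
    mean = statistics.mean(areas)
    # Assumption: COV = sample standard deviation / mean.
    cov = statistics.stdev(areas) / mean
    return mean, cov

frame_2 = [126.0, 154.0]          # areas listed for Frame 2 in Table 1
frame_3 = [150.0, 224.0, 180.0]   # areas listed for Frame 3 in Table 1

for name, areas in (("Frame 2", frame_2), ("Frame 3", frame_3)):
    mean, cov = tile_area_stats(areas)
    print(f"{name}: mean = {mean:.1f} pixel^2, COV = {cov:.2f}")
```

At least two tile areas are needed per frame, since the sample standard deviation is undefined for a single value.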
In Table 2, experiments without and with optimization are compared. Without the
optimization, the parameter Uop is fixed at a value of 98 for all frames, whereas
with the optimization the value of Uop is dynamic and varies according to the
immediate state and environment. The overall result shows that without the
self-tuning algorithm only three frames, i.e. Frames 1, 5 and 7, successfully detect
the poles, whereas with the self-tuning algorithm all 15 frames attain successful
detection. Note that the values of Uop vary without an increasing or
decreasing pattern, which reflects the uncertain, dynamic underwater environment.
Meanwhile, the error, which is also the input to the system controller, denotes
the horizontal distance between the center point of the frame (white dot) and the
midpoint between the poles (black dot).
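The error signal described above can be written as a one-line computation. This is a sketch under stated assumptions: the sign convention (positive when the poles lie to the right of the frame center), the use of pole center x-coordinates as input, and the function name are not specified in the source.

```python
def tracking_error(pole_centres_x, frame_width):
    """Horizontal offset in pixels of the midpoint between two poles from the frame centre.

    Returns None unless exactly two poles were detected.
    Assumed sign convention: positive means the poles are right of centre.
    """
    if len(pole_centres_x) != 2:
        return None
    midpoint_x = sum(pole_centres_x) / 2.0
    return midpoint_x - frame_width / 2.0

# Hypothetical example: poles centred at x = 250 and x = 350 in a 640-pixel-wide frame.
print(tracking_error([250.0, 350.0], 640))
```

Returning None when the detection fails corresponds to the "nil" entries in Table 2.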
Table 1 Altitude data (outcome images omitted)
Frame No. | Areas (pixel²) | Coefficient of variation | Mean (pixel²)
1 | 208.0, 210.0, 126.0, 210.0 | 0.22 | 188.5
2 | 126.0, 154.0 | 0.14 | 140
3 | 150.0, 224.0, 180.0 | 0.20 | 184.7
4 | 126.0, 264.0, 176.0, 164.5, 256.0, 180.0, 255.5 | 0.27 | 203.1
5 | 224.0, 224.2, 196.0, 250.7, 154.0, 335.8, 188.0 | 0.26 | 224.6
6 | 225.0, 225.0 | 0 | 225.0
7 | 296.1, 223.4, 255.0, 176.0, 176.0, 192.0 | 0.22 | 204.5
8 | 289.0, 256.0, 180.0, 180.0, 272.0, 210.0 | 0.18 | 232.9
9 | 126.0, 150.0, 150.0, 196.0, 130.0, 225.0, 165.0, 154.0, 255.0, 180.0 | 0.24 | 173.1
10 | 225.0, 165.0, 255.4, 180.0, 255.0, 221.0, 203.4 | 0.17 | 213.8
Table 2 Object detection data without and with the self-tuning algorithm (outcome images omitted)
Frame No. | Error without self-tuning algorithm, Uop = 98 (pixels) | Error with self-tuning algorithm (pixels) | Uop
1 | -15.66 | -15.66 | 98
2 | nil | 4.54 | 95
3 | nil | 7.38 | 89
4 | nil | 52.60 | 83
5 | nil | 40.43 | 76
6 | -4.24 | -4.24 | 79
7 | 22.01 | 23.06 | 69
8 | nil | -29.32 | 91
9 | nil | -74.97 | 83
10 | nil | -18.56 | 65
5 Conclusion and Future Work
The proposed method has successfully achieved robust data extraction for the
purposes of future altitude control and object tracking control. The conclusion
that can be drawn is that a self-tuning or self-optimizing algorithm is mandatory for
dynamic circumstances such as the underwater environment. In future work, an
optimization technique with a better convergence time can be implemented to
improve the proposed image processing technique. Moreover, more tuning
parameters can be introduced to improve the robustness and reliability of the
detection.
Acknowledgements The authors would like to thank Universiti Malaysia Pahang for the provision of PJP grant (RDU170366) and Ministry of Higher Education of Malaysia for the provision
of FRGS grant (FRGS/2018/FKE-CeRIA/F00352).
References
1. Sharumathi K, Priyadharsini R (2016) A survey on various image enhancement techniques for
underwater acoustic images. In: International Conference on Electrical, Electronics, and
Optimization Techniques, pp 2930–2933
2. Pramunendar R, Shidik AGF, Supriyanto CP, Andono N, Hariadi M (2018) Auto level color
correction for underwater image matching optimization. Int J Comput Sci Netw Secur 13
(1):18–23
3. Trucco E, Olmos-Antillon AT (2016) Self-tuning underwater image restoration. IEEE J
Oceanic Eng 31(2):511–519
4. Zhang R, Liu J (2006) Underwater image segmentation with maximum entropy based on
particle swarm optimization (PSO). In: Proceedings of the First International
Multi-symposiums on Computer and Computational Sciences
5. Silpa-Anan C, Brinsmead T, Abdallah S, Zelinsky A (2001) Preliminary experiments in visual
servo control for autonomous underwater vehicle. In: Proceedings 2001 IEEE/RSJ
International Conference on Intelligent Robots and Systems. Expanding the Societal Role of
Robotics in the Next Millennium, vol 4, pp 1824–1829
6. Lee P-M, Hong S-W, Lim Y-K, Lee C-M, Jeon B-H, Park J-W (1999) Discrete-time
quasi-sliding mode control of an autonomous underwater vehicle. IEEE J Oceanic Eng 24
(3):388–395
7. Shojaei K, Dolatshahi M (2017) Line-of-sight target tracking control of underactuated
autonomous underwater vehicles. Ocean Eng 133:244–252
Development of Autonomous
Underwater Vehicle Equipped
with Object Recognition and Tracking
System
Muhammad Haniff Abu Mangshor, Radzi Ambar,
Herdawatie Abdul Kadir, Khalid Isa, Inani Yusra Amran,
Abdul Aziz Abd Kadir, Nurul Syila Ibrahim, Chew Chang Choon,
and Shinichi Sagara
Abstract Autonomous underwater vehicles (AUVs) are unmanned, self-propelled
vehicles that are typically deployed from a surface vessel and can operate
independently for periods of a few hours to several days. This project discusses
the development of an AUV equipped with an object recognition and tracking system.
The motion of the AUV is controlled by two thrusters for horizontal motion and two
thrusters for vertical motion. A Pixy CMUcam5 is used as the vision sensor of the
AUV to recognize an object through its specific color signature. The camera
recognizes an object through a colour-based filtering algorithm that calculates the
colour (hue) and saturation of each red, green and blue (RGB) pixel from the built-in
image sensor. When the camera recognizes an object, the AUV automatically tracks it
without any operator. Preliminary underwater experiments have been carried out to
test the AUV’s ability to stay submerged as well as to navigate and recognize
objects underwater. Experiments have also been carried out to verify the
effectiveness of the Pixy CMUcam5 in recognizing single and multiple objects
underwater and then tracking the recognized object. This work reports findings that
demonstrate the usefulness of the Pixy CMUcam5 in the development of the AUV.
Keywords Autonomous underwater vehicle · Pixy CMUcam5 · Object recognition · Object tracking
M. H. Abu Mangshor · R. Ambar (✉) · H. A. Kadir · K. Isa · I. Y. Amran · A. A. A. Kadir · N. S. Ibrahim · C. C. Choon
Department of Electronic Engineering, Faculty of Electrical and Electronic Engineering,
Universiti Tun Hussein Onn Malaysia, 86400 Parit Raja, Batu Pahat, Johor, Malaysia
e-mail: aradzi@uthm.edu.my
S. Sagara
Department of Mechanical and Control Engineering, Kyushu Institute of Technology,
Tobata, Kitakyushu 804-8550, Japan
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_4
1 Introduction
An underwater vehicle is a vehicle that travels underwater, and can be classified
as manned or unmanned. The manned variants include submarines and submersibles.
A submarine is a ship that can be submerged and navigated underwater, with a
streamlined hull intended for lengthy periods of operation in the ocean, fitted
with a periscope and typically armed with torpedoes or rockets. Military
submarines are typically used to protect aircraft carriers on the water surface, to
attack other submarines and watercraft, to serve as supply ships for other
submarines, to launch torpedoes and rockets, and to provide surveillance and
protection against prospective attackers. A submarine differs from a submersible,
which has limited underwater capability. Submersibles are used for various purposes,
including deep-sea surveys, marine ecological assessment, natural marine resource
harvesting, and deep-sea and marine exploration [1].
Unmanned underwater vehicles (UUV) include autonomous underwater vehicles (AUV),
which are robots that travel underwater independently, without requiring a
physical connection to or input from an operator [2, 3]. AUVs are programmed at the
surface and then navigate through the water on their own, collecting data as they
go. An AUV can be preprogrammed with an assignment and a location; once its
assignment is complete, it returns to that location.
On the other hand, remotely operated vehicles (ROV) are underwater vehicles that
are controlled by humans from a remote location using remote control devices [4–6].
A tether made up of a series of cables connects the vehicle to a surface ship. These
cables carry command and control signals between the operator and the ROV, enabling
the vehicle to be navigated remotely. An ROV can include a video camera, lights,
sonar systems and robotic arms. The roles of UUVs such as ROVs and AUVs include
mapping the seabed for the oil and gas industry, underwater observation, seabed
exploration, underwater construction, subsea project maintenance, underwater
inspection and ship hull cleaning. ROVs are typically involved in collecting samples
or manipulating the environment, while AUVs help to create detailed maps or measure
water properties.
A vision system is a technology that enables a computer to recognize and evaluate
images. It usually comprises hardware and software for digital cameras and back-end
image processing. The front camera of a robotic vehicle captures pictures of the
scene or of a centered object and sends them to the processing system [7]. A vision
system has the ability to recognize objects, places, people, writing and actions in
images. Computers can use machine vision technologies in combination with a camera
and artificial intelligence software to achieve image recognition. Image recognition
is used to perform a large number of machine-based visual tasks, for example
labelling the content of images with meta-tags, guiding self-driving vehicles and
accident-avoidance systems, performing image content search and controlling
autonomous robots. Robotic vehicles are expected to detect obstacles and recognize
objects simultaneously. The technology is even capable of following objects.
Applying a vision system to a robotic vehicle gives it eyes with which to recognize
an object. In this project, an autonomous underwater vehicle equipped with a vision
system has been developed. The project proposes the design and development of an AUV
that can navigate based on an object recognition and tracking system using a single
camera. A Pixy CMUcam5 camera is used to recognize a target object and track its
movements in an underwater environment.
This paper is organized as follows. Section 2 describes the detailed design of the
proposed AUV, including the 3D model design and the actual AUV prototype with its
vision sensor. Section 3 introduces the object recognition and tracking
algorithm used in this work, followed by a brief conclusion and
recommendations for future work in Sect. 4.
2 Methodology
2.1 AUV Design Process
Figure 1 shows the AUV design process, which can be divided into several stages.
The main stage focuses on the design concept of the AUV, which covers the mechanical
and electrical design. The next stages can be described in two sections; the first
section is the development of the mechanical parts. Computer-aided design software,
namely Sketch Up, is used to draw and animate the proposed AUV. Other
subsections discuss the development of the internal and external electrical design
of the AUV. The last stages are testing, fine tuning and minor upgrading tasks.
Fig. 1 Process of design and construction of the proposed AUV (flowchart boxes: Review of previous AUV concepts and designs; Propose design; Analysis of design concept; Choose final design concept; Designing mechanical system; Designing electrical system; Integrate mechanical and electrical system; Construction process; Testing and assessing AUV (object recognition, tracking system); Require upgrades? (yes: Adjustment to AUV design; no: Final AUV design))
2.2 AUV Structure 3D Modelling
This subsection discusses the 3D design of the AUV. The actual structure of the
AUV is developed based on this 3D design. Figure 2 shows the 3D design of the
proposed AUV, modelled in Sketch Up based on the actual sizes and
dimensions of all the components used. Figure 3 shows various
views of the 3D design. Figure 4 shows the main components of the proposed
AUV.
2.3 AUV Structure 3D Modelling
Figure 5 shows various views of the completed AUV structure. The structure is
composed of aluminium alloy struts, which are extremely tough, lightweight,
corrosion resistant and rust-free. The struts are easy to install and modify, which
makes it easy to fit other components into the AUV. The AUV is 65 cm long, 24 cm wide
and 24 cm high, as shown in the figure.
Fig. 2 3D design of the AUV structure
Fig. 3 Various views of the AUV’s 3D design
Fig. 4 AUV main components (labels: PVC pipe, left thruster, right thruster, two bottom thrusters, compartment, aluminium alloy strut, Arduino Mega, Pixy CMUcam5)
Fig. 5 AUV body structure with dimensions
The metal must be cut precisely to avoid difficulties during the
buoyancy test. The aluminium alloy struts are joined using 90° L-shaped aluminium
corner brackets, fastened with button-head screws and ball nuts. The joints need to
be fully tightened so that the AUV structure is strong enough to withstand
external underwater forces.
After all the installation and testing were completed, all the subsystems were
integrated and the program was uploaded to the Arduino Mega microcontroller. All the
electronic components were placed in the underwater compartment and the thrusters
were mounted onto the AUV in order to test the overall system functionality.
Figure 6 shows the completed AUV, including all peripherals such as thrusters and
electronic circuitry.
2.4 Pixy CMUcam5 Installation
The Pixy CMUcam5 is placed inside a waterproofed underwater compartment, as
shown in Fig. 7. The compartment has a dome end cap, which helps the camera to see
the underwater environment clearly. The Pixy CMUcam5 is positioned inside the
compartment at the dome end cap. A mounting bracket was made using a 3D printer to
hold the camera inside the compartment. Figure 8a shows the mounting bracket for the
Pixy CMUcam5, which is 8 cm in diameter and 1 cm thick. Figure 8b shows the Pixy
CMUcam5 attached inside the compartment using the mounting bracket.
Fig. 6 Various viewpoints of the completed AUV
44
M. H. Abu Mangshor et al.
Fig. 7 AUV’s waterproofed underwater compartment
Fig. 8 a Mounting bracket for Pixy CMUcam5, b Pixy CMUcam5 is attached inside the
compartment using the mounting bracket
2.5 Object Recognition and Tracking System Using Single Camera
Object recognition using Pixy CMUcam5. In this work, a Pixy CMUcam5 is used as the vision sensor. Figure 9 shows a Pixy CMUcam5 connected to an Arduino Mega microcontroller. The Pixy CMUcam5 uses a colour-based filtering algorithm to recognize objects. It calculates the hue and saturation of each RGB pixel from the image sensor and uses these as the primary filtering parameters. Changes in lighting and exposure can badly disrupt colour filtering algorithms, but the hue of an object remains largely unchanged under such changes, which makes hue a robust filtering parameter. The Pixy CMUcam5 can recognize seven different colour signatures, find hundreds of objects at the same time, and process images at 50 fps. It processes an entire 640 × 400 image frame every 1/50th of a second (20 ms), so a complete update of all detected objects’ positions is available every 20 ms. To keep the image data from overloading the microcontroller, the Pixy CMUcam5 pairs a powerful dedicated processor with the image sensor: it processes the images itself and sends only the useful information to the microcontroller. It connects easily to many different controllers because it supports several interface options (UART serial, SPI, I2C, USB, or digital/analog output).
Fig. 9 Pixy CMUcam5 connected to the Arduino at the ICSP pins
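The hue-and-saturation filtering idea described above can be illustrated with a minimal Python sketch. This is not the Pixy firmware; the function name `matches_signature`, the hue window `PINK` and the saturation threshold are hypothetical values chosen only to show why a dimmer version of the same colour still passes the filter while a grey pixel does not.

```python
import colorsys

def matches_signature(rgb, hue_range, min_saturation):
    """Return True if an 8-bit RGB pixel lies inside a hue window and is saturated enough."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, _v = colorsys.rgb_to_hsv(r, g, b)  # h, s and v are all in [0, 1]
    lo, hi = hue_range
    # The hue window may wrap around 1.0 (red), so handle both cases
    in_hue = lo <= h <= hi if lo <= hi else (h >= lo or h <= hi)
    return in_hue and s >= min_saturation

# Hypothetical "pink" signature: hues between magenta and red
PINK = (0.83, 0.99)

bright_pink = (255, 60, 160)
dim_pink = (128, 30, 80)    # same colour at roughly half the brightness
grey = (120, 120, 120)      # no saturation, so it never matches

for pixel in (bright_pink, dim_pink, grey):
    print(pixel, matches_signature(pixel, PINK, min_saturation=0.4))
```

Because brightness changes scale all three channels together, the hue and saturation of the two pink pixels are almost identical, which is the property the filtering relies on.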
Object Tracking using Pixy CMUcam5. The Pixy CMUcam5 is connected to an Arduino microcontroller to recognize and track an object. Figure 10 shows the flowchart of object tracking. The Pixy CMUcam5 searches for the set colour signature using the colour-based filtering algorithm. Once the Pixy CMUcam5 recognizes the object, the AUV moves towards it. Otherwise, the AUV keeps acquiring images until the target object is recognized. When the AUV comes close to the recognized object, it stops moving.
Initially, the Pixy CMUcam5 was ‘taught’ the object to track. PixyMon software was used for this teaching step. The object was held in front of the lens while the button on top of the camera was held down. During this step, the RGB LED under the lens gives feedback on which object the camera is looking at. When tracking an object with PixyMon, the Pixy CMUcam5 identifies the image regions that are consistent with the taught signature. Object tracking is implemented in the TrackBlock function, which keeps following the object within a set area. The Pixy CMUcam5 analyzes the image and identifies objects matching the colour characteristics of the object being tracked. It then reports the position, size and colour signature of every detected object back to the Arduino.
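The decision logic of the tracking flowchart can be sketched in Python as follows. This is an illustrative sketch, not the authors' Arduino code: the paper does not state how the distance to the object is measured, so the sketch assumes the apparent width of the detected block serves as a proxy for distance, and `FRAME_WIDTH`, `DEAD_BAND` and `STOP_WIDTH` are hypothetical values.

```python
FRAME_WIDTH = 320   # assumed width of the block coordinate space, in pixels
DEAD_BAND = 40      # hypothetical half-width of the "centred" zone
STOP_WIDTH = 150    # hypothetical apparent width treated as "close enough to stop"

def decide(block):
    """Map one detection (x centre, width) or None to a motion command."""
    if block is None:
        return "search"      # nothing recognized: keep acquiring images
    x, width = block
    if width >= STOP_WIDTH:
        return "stop"        # object appears large, so assume the stop distance is reached
    offset = x - FRAME_WIDTH // 2
    if offset < -DEAD_BAND:
        return "left"
    if offset > DEAD_BAND:
        return "right"
    return "forward"

print(decide(None))         # search
print(decide((40, 50)))     # left
print(decide((160, 50)))    # forward
print(decide((160, 200)))   # stop
```

The dead band keeps the vehicle from oscillating between left and right commands when the object is already nearly centred.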
Fig. 10 Flowchart of object tracking (nodes: START; Image acquisition; Object colour-based filtering algorithm; Object recognized? [Yes/No]; Object tracking; AUV moves towards object (Forward, Reverse, Left, Right); Object distance = 10 cm? [Yes/No]; Object lost? [Yes/No]; AUV stop)
2.6 AUV Circuit Design
Figure 11 shows the circuit design for the AUV, drawn using Fritzing software. As shown in the figure, the AUV uses an Arduino Mega microcontroller to control all peripherals. The circuit consists of one (1) input and four (4) outputs. The only input is the Pixy CMUcam5, which connects to the Arduino’s ICSP pins. The outputs are four (4) T100 thrusters from BlueRobotics that produce upward, downward, right and left movements. The thrusters require an 11 V power supply. Each thruster is connected to an electronic speed controller (ESC), which in turn is connected to the Arduino Mega. The ESC controls the thruster speed and the direction of rotation, giving forward or reverse thrust. The Pixy CMUcam5 recognizes and tracks the object underwater based on a set colour signature and sends the data to the control system. The control system then instructs the thrusters to move forward or reverse, or to submerge deeper or rise, depending on the location of the object. Figure 12 shows the actual circuit of the proposed AUV.
Fig. 11 AUV circuit design using Fritzing (labels: 5 V and 9 V supplies, power jack, four ESCs, Arduino Mega, thrusters A to D, Pixy CMUcam5)
Fig. 12 Actual circuit for the proposed AUV (labels: 9 V LiPo battery, electronic speed controller, 5 V power bank, Arduino Mega, Pixy CMUcam5, thrusters)
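As a rough illustration of how a controller can command speed and direction through an ESC, the sketch below converts a signed throttle value into a servo-style pulse width. The paper does not give the signal timing, so the 1500 µs neutral point and the ±400 µs span are assumptions based on the common hobby ESC convention, and `throttle_to_pulse_us` is a hypothetical helper.

```python
NEUTRAL_US = 1500   # assumed pulse width for "stopped"
SPAN_US = 400       # assumed deviation for full forward or full reverse

def throttle_to_pulse_us(throttle):
    """Convert a throttle in [-1.0, 1.0] to a pulse width in microseconds."""
    throttle = max(-1.0, min(1.0, throttle))  # clamp out-of-range requests
    return round(NEUTRAL_US + SPAN_US * throttle)

for t in (-1.0, 0.0, 0.5, 1.0):
    print(t, throttle_to_pulse_us(t))
```

Values above the neutral point give forward thrust and values below it give reverse thrust, so one signal line per ESC carries both speed and direction.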
3 Preliminary Experiments
3.1 Water Leakage and Submerging Experiment
Before the electronic devices were placed inside the underwater compartment, a water leak test was necessary. Figure 13a shows the water leakage test condition. To detect leaks, the compartment was submerged in a water container for an hour; any bubbles escaping from it indicate an air leak. This test helps to prevent short circuits in the electronic components and confirms that the inside of the compartment stays dry while submerged. The compartment was tested three times, each time submerged for an hour, to make sure it is watertight and reliably protects the electronic devices from water damage.
After the AUV was completely assembled, a submerging test was carried out in a lake to check whether the AUV could remain completely submerged for a period of time. The experiment also verified the waterproofing of the component storage compartment. Figure 13b shows the submerging experiment condition. As shown in the figure, yellow PVC pipes were added to the sides of the AUV as floats to adjust the buoyant force acting on it, and additional loads were added so that the AUV submerged. The experiment showed that the compartment was reliably waterproof and verified the amount of load required for the AUV to stay submerged.
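The balance between floats and added loads follows Archimedes' principle: the vehicle stays submerged once its weight is at least the weight of the water it displaces. The sketch below estimates the extra load under that principle. The volume and mass are hypothetical numbers, not measurements from the experiment, and the volume of the added load itself is ignored for simplicity.

```python
WATER_DENSITY = 1000.0  # kg/m^3, fresh water (assumed)

def ballast_for_neutral(displaced_volume_m3, vehicle_mass_kg):
    """Extra mass (kg) needed for weight to equal buoyancy; 0 if already heavy enough."""
    displaced_mass = WATER_DENSITY * displaced_volume_m3  # mass of the displaced water
    return max(0.0, displaced_mass - vehicle_mass_kg)

# Hypothetical example: 12 litres displaced by a 9.5 kg vehicle needs about 2.5 kg of load
print(ballast_for_neutral(0.012, 9.5))
```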
Fig. 13 a Compartment water leakage test condition, b AUV submerging experiment condition performed in a lake (labels: PVC pipe, AUV)
3.2 Underwater Experiment on Single Object Recognition Using Pixy CMUcam5
This experiment was carried out to investigate how effectively the Pixy CMUcam5 recognizes a single object underwater. The object used in the experiment is a pink dinosaur toy, referred to as Spinosaurus (pink). The underwater experiments were carried out in a water container measuring 80 cm (width) × 58 cm (depth) × 50 cm (height). This container was used because no large water tank was available to test long-distance recognition. Therefore, the maximum distance between the camera and the object was 30 cm.
Experimental Steps. The steps for this experiment are as follows:
1. Connect the Pixy CMUcam5 to the Arduino Mega.
2. Supply 5 V power to the Arduino Mega.
3. Upload the source code to the Arduino Mega.
4. Place the electronic components inside the underwater compartment.
5. Place the object in a water container as shown in Fig. 14.
6. Position the camera 30 cm from the object initially, then move it closer to 25, 20, 15 and 10 cm.
7. Repeat steps 4 to 6 for each type of water: clear water and muddy water.
8. Record the video images captured by the camera.
Fig. 14 Clear underwater single object recognition by Pixy CMUcam5
Fig. 15 Camera views of a single object in clear water at distances of a 30 cm, b 25 cm, c 20 cm, d 15 cm, e 10 cm
Fig. 16 Camera views of a single object in muddy water at distances of a 30 cm, b 25 cm, c 20 cm, d 15 cm, e 10 cm
Experimental Results. In clear water, the Pixy CMUcam5 recognized the single object at every tested camera-to-object distance (30, 25, 20, 15 and 10 cm), as shown in Fig. 15a–e. In muddy water, the Pixy CMUcam5 recognized the object only when it was 10 cm from the camera, as shown in Fig. 16a–e.
3.3 Underwater Experiment on Multiple Objects Recognition Using Pixy CMUcam5
This experiment was carried out to investigate how effectively the Pixy CMUcam5 recognizes multiple objects underwater. The objects used in the experiment were Spinosaurus (pink), Stegosaurus (green), Pteranodon (yellow), Triceratops (orange) and Tyrannosaurus (purple).
Experimental Steps. The steps for this experiment are as follows:
1. Connect the Pixy CMUcam5 to the Arduino Mega.
2. Supply 5 V power to the Arduino Mega.
3. Upload the source code to the Arduino Mega.
4. Place the electronic components inside the underwater compartment.
5. Place the objects in a water container as shown in Fig. 17a.
6. Position the camera 30 cm from the objects initially, then move it closer to 25, 20, 15 and 10 cm.
7. Repeat steps 4 to 6 for each type of water: clear water and muddy water.
8. Record the video images captured by the camera.
Fig. 17 Camera views of multiple objects in a clear water, b muddy water
Experimental Results. In clear water, the Pixy CMUcam5 recognized the objects at different distances, as shown in Fig. 17a. At a camera-to-object distance of 30 cm, the camera recognized the Spinosaurus and the Pteranodon. It recognized the Stegosaurus at 25 cm and the Tyrannosaurus at 20 cm. The orange Triceratops was recognized only from a distance of 15 cm.
In muddy water, the camera also recognized multiple objects, but only at shorter distances. At 20 cm, it recognized only the Stegosaurus. At 15 cm, it recognized all the objects except the Tyrannosaurus, which remained undetected in muddy water. Figure 17b shows the results. Light comprises many wavelengths, and each wavelength corresponds to a specific colour. As a result, the Pixy CMUcam5 recognizes colours of the longest wavelengths first, followed by those of shorter wavelengths in the visible spectrum.
3.4 Underwater Experiment on Recognizing and Tracking a Single Object
This experiment was carried out to investigate how effectively the Pixy CMUcam5 recognizes an object underwater and tracks it. The object to be recognized and tracked was a pink Spinosaurus.
Experimental Steps. The steps for this experiment are as follows:
1. Connect the 9 V LiPo battery to the ESCs for the thrusters.
2. Connect the Pixy CMUcam5 to the Arduino Mega.
3. Supply 5 V power to the Arduino Mega.
4. Upload the source code to the Arduino Mega.
5. Place the electronic components inside the underwater compartment.
6. Place the object at an underwater depth of 10 m.
7. Set the camera-to-object distance to 20 cm and move the object continuously from left to right.
8. Record the distances at which the object is recognized and tracked.
Experimental Results. Figures 18, 19, 20 and 21 show the experimental results. The system performed the desired tasks: the Pixy CMUcam5 recognized the Spinosaurus in clear water and tracked it. When the Spinosaurus was moved to the left, thruster A stopped and thruster B was activated, so the AUV turned left. When the Spinosaurus was moved to the right, thruster A was activated and thruster B stopped, so the AUV turned right. When the Spinosaurus was moved backwards, away from the camera, both thruster A and thruster B were activated to move the AUV forward and track the object. Lastly, when the distance between the Spinosaurus and the camera reached 10 cm, the thrusters stopped.
Fig. 18 The direction of thrusters moving to the right. As can be seen from the produced bubbles, the thruster on the left is rotating
Fig. 19 The direction of thrusters moving to the left. As can be seen, the thruster on the right is rotating
Fig. 20 The direction of thrusters moving forward. As can be seen, both thrusters are rotating
Fig. 21 All thrusters stopped. As can be seen, both thrusters are not rotating
3.5 Summary
Every step taken played an essential role in developing a fully functional AUV. From sketching the structure of the AUV in computer software to assembling it, each procedure was crucial to the development process. Since the AUV remains submerged, it is imperative to guarantee that the compartment holding the electronic components is waterproof and does not leak. The experimental results show that the camera was able to recognize single and multiple objects underwater, especially in clear water. The thrusters operated as desired, with their direction following the position of the object.
4 Conclusion
This paper describes the development of an autonomous underwater vehicle equipped with an object recognition and tracking system. The hardware and software designs of the AUV have been described. The AUV is fitted with a Pixy CMUcam5 camera for object recognition and tracking. Based on the preliminary object recognition experiments, the Pixy CMUcam5 is capable of recognizing single and multiple objects underwater. The Pixy CMUcam5 was observed to start recognizing objects at a distance of 30 cm in clear water, whereas in muddy water it had difficulty recognizing objects. This may be because the Pixy CMUcam5 relies on a colour-based algorithm. Furthermore, the thruster experiments showed that the thrusters rotated according to the input from the images captured by the Pixy CMUcam5.
In conclusion, the objective of the project, to design and develop an AUV equipped with an object recognition and tracking system, was achieved. Improvements to be considered in future projects include a high-end vision system that can monitor the underwater environment in real time, such as a camera that performs well in multiple types of water, so that the AUV is not limited to clear water but can also operate in muddy water.
Acknowledgements The authors would like to thank the Research Management Center (RMC),
UTHM and Ministry of Higher Education for sponsoring the research under Tier 1 Research
Grants (Vot H161).
References
1. Levin LA et al (2019) Global observing needs in the deep ocean. Front Mar Sci 6(241):1–32
2. Spears A et al (2016) Under Ice in Antarctica: the icefin unmanned underwater vehicle development and deployment. IEEE Robot Autom Mag 23(4):30–41
3. Ribas D et al (2015) I-AUV mechatronics integration for the TRIDENT FP7 project. IEEE/ASME Trans Mechatron 20(5):2583–2592
4. Ambar RB, Sagara S (2015) Development of a master controller for a 3-link dual-arm underwater robot. Artif Life Robotics 20:327–335
5. Yuh J (2000) Design and control of autonomous underwater robots: a survey. Auton Robots 8(1):7–24
6. Khatib O et al (2016) Ocean one: a robotic avatar for oceanic discovery. IEEE Robot Autom Mag 23(4):20–29
7. Techopedia: Machine Vision System (MVS). https://www.techopedia.com/definition/30414/machine-vision-system-mvs. Accessed 21 Feb 2019
Dual Image Fusion Technique for Underwater Image Contrast Enhancement
Chern How Chong, Ahmad Shahrizan Abdul Ghani,
and Kamil Zakwan Mohd Azmi
Abstract Underwater imaging has received growing attention in recent years. Attenuation of light causes underwater images to have poor contrast and deteriorated color. Furthermore, these images usually appear foggy and hazy. In this paper, a new approach to enhancing underwater images is proposed, which integrates a dehazing method, homomorphic filtering and image fusion. The dehazing method consists of a multi-scale fusion technique, which applies weight maps in the pre-processing step. Homomorphic filtering and image fusion are then applied to the resultant image for contrast and color enhancement. Qualitative and quantitative evaluations are performed to analyze the performance of the proposed method. The results show the superiority of the proposed method in terms of contrast, image details, colors, and entropy. Moreover, a Raspberry Pi with a Picamera is successfully implemented as a standalone underwater image processing device.
Keywords Underwater image · Contrast · Color · Multi-scale fusion · Standalone prototype device
1 Introduction
The physical features of an object are captured and stored as an image by a capturing device such as a camera, a telescope, or the built-in camera module of a computing device. Images therefore come in varied forms. In the digital domain, an image is represented as a two-dimensional (2D) rectangular matrix of sample values. Each quantized sample value is referred to as a picture element, or pixel. The properties of the image can be quantified and processed for further analysis to describe the characteristics of the image.
C. H. Chong · A. S. Abdul Ghani (✉) · K. Z. Mohd Azmi
Faculty of Manufacturing and Mechatronic Engineering Technology, Universiti Malaysia Pahang, 26600 Pekan, Malaysia
e-mail: shahrizan@ump.edu.my
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666, https://doi.org/10.1007/978-981-15-5281-6_5
Fig. 1 Different wavelengths of light are attenuated at different rates in water
As reported by Abdul Ghani [1], most images captured in a water medium have qualities (e.g. color and contrast) that differ from the true properties of the scene. An object captured underwater is overshadowed by a blue-green color cast. As a result, the genuine characteristics and natural color of an underwater object are misinterpreted. Moreover, the capturing device (i.e. the camera) can also degrade an underwater image: inadequate specifications of the capturing device may introduce various kinds of noise into the output image. These issues need to be resolved in order to obtain better-quality underwater images.
Underwater image processing has gradually become a challenging field of study for researchers. The fundamentals of image formation in a water medium are described briefly here in order to explain the underwater imaging process.
Light attenuation, shown in Fig. 1, causes underwater images to suffer from low quality and poor contrast [2]. In a few experiments, artificial light has been used to improve underwater illumination, yet it introduces other lighting issues: an image captured with an artificial light source tends to have a bright spot in its center. Moreover, absorption and scattering effects further degrade the contrast of underwater images.
Researchers have proposed many ways to enhance underwater image quality. Advances in underwater image processing techniques can ease the overall progress of marine exploration. For instance, Chiang and Chen [3] developed underwater image enhancement by wavelength compensation and de-hazing to compensate for the attenuation discrepancy along the light propagation path. In 2017, Abdul Ghani and Mat Isa [4] introduced a new method of enhancing underwater images that modifies the image histograms column-wise in accordance with the Rayleigh distribution. In another report, Mohd Azmi et al. [5] proposed a method that focuses on enhancing deep underwater images. They [6, 7] have also successfully integrated a swarm-intelligence algorithm to further improve the effectiveness of their image enhancement method.
In 2017, Peng and Cosman [8] proposed a depth estimation method for underwater scenes based on image blurriness and light absorption for underwater image enhancement. This method improves the visibility of the output image. However, the blue-green color cast is not significantly reduced. In 2018, Ancuti et al. [9] offered a single-image approach that blends two images directly derived from a color-compensated and white-balanced version of the original image. This method has proven effective in improving turbid underwater images. However, for deep underwater images, it tends to produce a reddish effect. Recently, Kareem et al. [10] applied an integrated color model with Rayleigh distribution (ICMRD) in their proposed method. The ICMRD approach operates in the YCbCr color space for image enhancement. The blue-green color cast is successfully reduced through this method. However, the image contrast remains low.
In this paper, the image enhancement technique is presented together with a graphical user interface (GUI) that displays a comparison between the raw input image and the processed output image. Moreover, the proposed method is extended by using a Raspberry Pi [11, 12] as the computing platform for underwater image processing. A GUI is developed as a visual aid to compare the results, and a standalone prototype device is designed for underwater image acquisition.
2 New Approach for Underwater Image Contrast Enhancement
In this work, a homomorphic filtering and image fusion with dehazing (HFIFD) technique is introduced for underwater image enhancement. First, the input image is subjected to a dehazing process in order to reduce the haziness in the underwater image. Dehazing is a pre-processing procedure that derives two separate images from the input: one improved through white balancing and the other through contrast enhancement. Luminance, chromatic and saliency weight maps are computed for both images, and all the outputs are then fused together to produce the output image, as shown in Fig. 2.
This dehazing technique is necessary to eliminate unwanted distortion in the image. The white balancing process aims to remove unrealistic chromatic casts introduced by atmospheric color. The shades-of-gray color constancy technique is applied in this process for better computational efficiency. Meanwhile, contrast enhancement is applied to the second image using the adaptive histogram equalization technique. This method enhances the contrast of each RGB channel by applying histogram equalization to the intensity of the whole frame of the image.
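The shades-of-gray idea can be sketched in a few lines of Python. It estimates the illuminant of each channel with a Minkowski p-norm and scales the channels so that the three estimates become equal. This is an illustrative sketch rather than the authors' MATLAB implementation; the norm order `p = 6` is a commonly used value and is an assumption here, and the image is represented as a plain list of (r, g, b) tuples.

```python
def shades_of_gray(pixels, p=6):
    """White-balance a list of (r, g, b) tuples using a Minkowski p-norm illuminant estimate."""
    n = len(pixels)
    # Per-channel illuminant estimate: (mean of value**p) ** (1/p)
    est = [(sum(px[c] ** p for px in pixels) / n) ** (1.0 / p) for c in range(3)]
    target = sum(est) / 3.0  # grey level that all three estimates are pulled towards
    gains = [target / e if e > 0 else 1.0 for e in est]
    # Apply the per-channel gain and clip to the 8-bit range
    return [tuple(min(255.0, px[c] * gains[c]) for c in range(3)) for px in pixels]

# Hypothetical pixels with a blue-green cast: the red channel is much weaker
cast = [(40, 120, 140), (60, 180, 200), (20, 90, 100)]
print(shades_of_gray(cast))
```

With `p = 1` the estimate reduces to the gray-world mean, and as `p` grows it approaches the per-channel maximum, so the norm order trades off between those two classic assumptions.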
Fig. 2 Block diagram of dehazing method with fusion technique (blocks: Underwater image; White Balanced Image; Contrast Enhanced Image; Luminance Weight Map; Chromatic Weight Map; Saliency Weight Map; Images Fusion; Dehazing Image)
Then, weight maps are applied to the white-balanced image and the contrast-enhanced image, because the previous enhancement alone is insufficient to restore the quality of the underwater image. Luminance, chromatic and saliency weight maps are introduced to improve the visibility and the color of the underwater image. The luminance weight map is applied because white balancing reduces color. It assigns a higher saturation value to regions with better visibility and a lower saturation value to other regions. The chromatic weight map then works on the saturation gain of the input image. Object edges in certain regions are the informative part of the image and carry important features, so they should be distinguished from their surroundings. The saliency weight map is therefore applied to emphasize those regions so that they can be seen easily. These three weight maps play critical roles in the dehazing process, enhancing the image quality and reducing haziness.
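The fusion step can be illustrated as a per-pixel weighted average of the two derived images. The paper does not spell out how the weight maps are normalized, so the sketch below assumes the usual approach of dividing each weight by the sum of the weights at that pixel. The function name `fuse`, the flat-list image representation and the sample values are hypothetical.

```python
def fuse(inputs, weight_maps, eps=1e-6):
    """Per-pixel weighted average of several images, each given as a flat list of intensities."""
    fused = []
    for i in range(len(inputs[0])):
        w = [wm[i] + eps for wm in weight_maps]  # eps avoids division by zero
        total = sum(w)
        fused.append(sum(wk / total * img[i] for wk, img in zip(w, inputs)))
    return fused

# Hypothetical two-pixel example: white-balanced and contrast-enhanced inputs
white_balanced = [100.0, 200.0]
contrast_enhanced = [50.0, 100.0]
weights_wb = [1.0, 0.0]   # aggregated weight map of the first input
weights_ce = [1.0, 1.0]   # aggregated weight map of the second input
print(fuse([white_balanced, contrast_enhanced], [weights_wb, weights_ce]))
```

At the first pixel both inputs contribute equally, while at the second pixel the result is taken almost entirely from the contrast-enhanced input because its weight dominates.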
Fig. 3 Block diagram of homomorphic filtering and image fusion with dehazing (HFIFD) (steps: Dehazing Image; Homomorphic Filtering process; Histogram Matching; Dual-image Global Stretching; Local Stretching as post processing; Sharpening of image for final output image)
After the pre-processing (dehazing) steps are done, homomorphic filtering is applied to the resultant image to enhance and restore the natural colors of the underwater image, as shown in Fig. 3. A Butterworth filter is applied within the homomorphic filtering to suppress low-frequency noise in the image.
However, homomorphic filtering alone is inadequate to improve the underwater image, as the bluish or greenish illumination tends to remain in the background. Therefore, a histogram matching method is used in the filtering process to strengthen the inferior and intermediate color channels. In this step, the dominant color channel is matched with the inferior and intermediate color channels. This process automatically increases the influence of the inferior and intermediate color channels while reducing the dominant color channel. Then, dual-image global stretching, local stretching, and image sharpening are applied to further enhance the image contrast.
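In homomorphic filtering, the image is typically moved to the log domain, transformed to the frequency domain, multiplied by a high-emphasis transfer function, and then transformed back and exponentiated. The sketch below builds only the transfer function, using the standard Butterworth high-emphasis form. The cutoff `d0`, the filter order and the gains `gamma_low` and `gamma_high` are hypothetical values, since the paper does not report its filter parameters.

```python
import math

def butterworth_homomorphic(rows, cols, d0=30.0, order=2, gamma_low=0.5, gamma_high=2.0):
    """Centred high-emphasis Butterworth transfer function for a rows x cols spectrum."""
    h = []
    for u in range(rows):
        row = []
        for v in range(cols):
            d = math.hypot(u - rows / 2.0, v - cols / 2.0)  # distance from the DC term
            if d == 0:
                highpass = 0.0
            else:
                highpass = 1.0 / (1.0 + (d0 / d) ** (2 * order))
            # Low frequencies are scaled by gamma_low, high frequencies by gamma_high
            row.append(gamma_low + (gamma_high - gamma_low) * highpass)
        h.append(row)
    return h

H = butterworth_homomorphic(64, 64)
print(H[32][32], H[32][62], H[0][0])
```

Choosing `gamma_low` below 1 attenuates the slowly varying illumination component, while `gamma_high` above 1 boosts the reflectance detail.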
3 GUI Application on Underwater Image Acquisition
MATLAB software is used as the development platform in this work. In addition, a GUI application is designed and developed through MATLAB GUIDE to display the input and output of the processed underwater image. The GUI helps users to see clearly the difference between the raw underwater image and the processed image.
As shown in Fig. 4, the GUI contains two axes and a push button. The two axes are labeled and display the input and output images respectively. Clicking the “Pick and Process” button opens MATLAB's file selector for choosing the input image. The corresponding function is uigetfile(), which returns the filename and pathname. The input image type is defined as .jpg, which is a common image format. The user can choose any underwater image in .jpg format and select it as the input image. Figure 5 shows the flowchart of the GUI application with the implementation of the proposed image enhancement technique.
Fig. 4 GUI for underwater image acquisition using MATLAB
Fig. 5 Flowchart for GUI application
4 Standalone Prototype Device for Underwater Image Acquisition Application
In the proposed method, a Raspberry Pi is used as the computing device for underwater image enhancement. Raspberry Pi is a low-cost single-board computer, commonly used as a basic embedded system to reduce system complexity in real-time applications. Running the method on a Raspberry Pi allows it to be evaluated on real hardware rather than through simulation results alone. The interaction between the Raspberry Pi and a PC is handled by MATLAB and Simulink software, where Simulink makes it possible to port MATLAB software to a variety of devices and platforms. MATLAB on Raspberry Pi can operate both in a simulation mode, where the board is connected to a PC, and in a standalone mode, where the software is downloaded onto the board and runs independently of a PC.
Raspberry Pi runs on special derivatives of the Linux operating system (OS). Six OS variants can be installed on the Raspberry Pi: Raspbian, Pidora, OpenELEC, RaspBMC, RISC OS and Arch Linux. Raspbian, which is developed specifically for the Raspberry Pi, is the most frequently used. For the underwater imaging field, Raspberry Pi is supported by programming software from MathWorks (i.e. MATLAB and Simulink). MATLAB’s support package enables the development of algorithms that run on the Raspberry Pi. It also allows peripheral devices connected to the board to be controlled through its GPIO interfaces, namely serial, I2C and SPI, as well as a camera module, via command functions in the MATLAB command window. As a computing platform, the Raspberry Pi helps researchers to study and analyze phenomena in the underwater environment.
To capture live images from the underwater environment, the Raspberry Pi Camera Module (Picamera) is used. The Picamera was chosen because it is supported directly in Raspbian and connects easily to the Raspberry Pi board via a short ribbon cable. Live still images can be captured through the Picamera module. Its 8-megapixel resolution is capable of capturing good-quality images. Moreover, a 5 in. TFT display, a mini panel-mountable HDMI monitor, is used to display the Raspbian operating system, since the Raspberry Pi board does not come with a display. The display is an 800 × 400 HDMI display made for the Raspberry Pi. For the power source, a portable PINENG power bank with 20,000 mAh capacity is connected to the Raspberry Pi board.
OpenCV is an open-source computer vision and machine learning software library aimed at real-time computer vision. In the proposed method, OpenCV is used from Python to implement the homomorphic filtering process for underwater image enhancement. Python 2 IDLE is used as the programming environment for writing the homomorphic filtering algorithm. The Picamera and OpenCV libraries are both imported into the programming environment to make full use of their features (Fig. 6).
The Picamera captures the input image, which is saved into a prepared folder for storage. The captured image is read using the OpenCV function cv2.imread, which returns the image with its channels in BGR order rather than RGB. The matching function cv2.imwrite expects the same BGR order and writes the final enhanced image to a standard image file. The input image is then processed with adaptive histogram equalization. The image is divided into small blocks using a 2 × 2 tile grid, and the histogram of each tile is equalized. A contrast-limiting parameter is also applied to prevent noise amplification if noise is present in a block. Histogram bins that exceed the clip limit are clipped, and the excess is distributed evenly among the other bins before the adaptive histogram equalization is carried out.
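The clip-and-redistribute step of contrast-limited equalization can be shown on a single tile histogram. The sketch below is a simplified pure-Python illustration and not OpenCV's internal implementation: it clips every bin at a hypothetical `clip_limit` and spreads the removed counts evenly over all bins, so the total number of pixels is preserved.

```python
def clip_histogram(hist, clip_limit):
    """Clip each bin at clip_limit and spread the removed counts evenly over all bins."""
    excess = sum(max(0, count - clip_limit) for count in hist)
    clipped = [min(count, clip_limit) for count in hist]
    share, remainder = divmod(excess, len(hist))
    # Every bin receives the same share; the first `remainder` bins take one extra count
    return [c + share + (1 if i < remainder else 0) for i, c in enumerate(clipped)]

# Hypothetical 4-bin tile histogram dominated by one intensity
tile_hist = [10, 0, 2, 0]
print(clip_histogram(tile_hist, clip_limit=4))
```

Flattening the dominant bin limits the slope of the equalization mapping, which is what keeps noise in nearly uniform tiles from being amplified. After redistribution a bin may again exceed the limit slightly; real implementations differ in whether they repeat the step.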
Fig. 6 Block diagram of the process flow of the homomorphic filtering method on the Raspberry Pi microprocessor module (final block: apply homomorphic filtering to the image)
The next step is to split the image produced by adaptive histogram
equalization into its B, G, and R color channels. Contrast adjustment is applied by
normalizing all three channels in order to adjust the color and contrast in the
image. The function processes each color band (BGR) and determines the minimum
and maximum values in each of the three bands. Each color channel has the same
minimum value but a different maximum value. The minimum and maximum values
lie in the range 0–255 since the input image is 8-bit. The normalized B, G, and R
channels are then merged, and adaptive histogram equalization is performed again.
The merged output is then subjected to homomorphic filtering for the final
enhancement. A Gaussian high-pass filter is used in the homomorphic filtering.
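The final homomorphic filtering stage can be illustrated with a small NumPy sketch. The text states only that a Gaussian high-pass filter is used, so the parameter names and default values here (`sigma`, `gamma_low`, `gamma_high`), the single-channel input, and the final rescaling to 8 bits are all assumptions for illustration, not the authors' settings.

```python
import numpy as np

def gaussian_highpass(shape, sigma, gamma_low=0.5, gamma_high=1.5):
    """Gaussian high-pass emphasis filter with the DC term at index [0, 0]."""
    rows, cols = shape
    u = np.fft.fftfreq(rows)[:, None] * rows   # integer frequency indices
    v = np.fft.fftfreq(cols)[None, :] * cols
    d2 = u ** 2 + v ** 2                       # squared distance from DC
    return gamma_low + (gamma_high - gamma_low) * (1.0 - np.exp(-d2 / (2.0 * sigma ** 2)))

def homomorphic_filter(gray, sigma=10.0, gamma_low=0.5, gamma_high=1.5):
    """log -> FFT -> high-pass emphasis -> inverse FFT -> exp, rescaled to 8 bits."""
    log_img = np.log1p(gray.astype(np.float64))
    h = gaussian_highpass(gray.shape, sigma, gamma_low, gamma_high)
    filtered = np.real(np.fft.ifft2(np.fft.fft2(log_img) * h))
    out = np.expm1(filtered)
    span = out.max() - out.min()
    if span < 1e-9:                            # flat image: nothing to rescale
        return np.clip(np.rint(out), 0, 255).astype(np.uint8)
    return np.rint(255.0 * (out - out.min()) / span).astype(np.uint8)

# Toy 8 x 8 ramp image (hypothetical data) to exercise the function.
demo = np.arange(64, dtype=np.uint8).reshape(8, 8)
enhanced = homomorphic_filter(demo, sigma=2.0)
```

Working in the log domain turns the multiplicative illumination-reflectance model into a sum, so attenuating low frequencies (`gamma_low` below 1) suppresses uneven illumination while boosting high frequencies (`gamma_high` above 1) emphasizes detail.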
Figure 7 shows the interface generated to compare the raw input image and the
enhanced output image on the Raspberry Pi. The left window shows the raw input
image loaded from the database of sample images, and the right window shows the
output image enhanced by the proposed method. Both windows are generated by
using Python IDLE.
Fig. 7 Raw image and enhanced image displayed on the Raspberry Pi with the HDMI display
5 Results and Discussion
Five sample images are used to test the effectiveness of the proposed method,
namely fish 1, coral 1, stone, fish 2, and coral 2. The performance of the proposed
method is compared with homomorphic filtering, gray world [13], CLAHE, and
contrast adjustment. The resultant images produced by all methods are shown in
Figs. 8, 9, 10, 11 and 12.
Fig. 8 Comparison of fish 1 images: a Original image; b Homomorphic filtering; c Gray world; d CLAHE; e Contrast adjustment; f Proposed HFIFD method
Fig. 9 Comparison of coral 1 images: a Original image; b Homomorphic filtering; c Gray world; d CLAHE; e Contrast adjustment; f Proposed HFIFD method
Fig. 10 Comparison of stone images: a Original image; b Homomorphic filtering; c Gray world; d CLAHE; e Contrast adjustment; f Proposed HFIFD method
The original image of fish 1 is affected by a bluish color cast and the objects
are hardly visible. The homomorphic filtering method shows a promising result as
the bluish color cast is significantly reduced. Meanwhile, gray world tends to
generate a reddish output image. CLAHE inadequately improves the original image
as the bluish color cast remains in the image. The contrast adjustment method is
able to reduce the bluish color cast in the foreground. However, the cast remains
in the background. On the other hand, the proposed method is able to reduce the
bluish color cast significantly. The image contrast is also well improved, as the
fish can be seen clearly.
Fig. 11 Comparison of fish 2 images: a Original image; b Homomorphic filtering; c Gray world; d CLAHE; e Contrast adjustment; f Proposed HFIFD method
Fig. 12 Comparison of coral 2 images: a Original image; b Homomorphic filtering; c Gray world; d CLAHE; e Contrast adjustment; f Proposed HFIFD method
The original image of coral 1 has poor contrast and the real color of the object
is overshadowed by the bluish color cast. The homomorphic filtering method is
able to reduce the bluish color cast. However, the image contrast is
insufficiently enhanced. The gray world method over-enhances the foreground
color, as a reddish color cast dominates that region. CLAHE makes no significant
improvement, as the bluish color cast remains in the output image. Similar to
gray world, the contrast adjustment method tends to produce a reddish color cast
in the foreground. On the other hand, the proposed method is able to improve the
image contrast adequately. The bluish color cast is also significantly reduced.
A similar trend can be seen in the other tested images, where the proposed method
successfully recovers the image contrast and improves the visibility of objects.
To support the visual observation, quantitative evaluation metrics are used,
namely entropy [14], MSE [15], and PSNR [15]. Entropy measures the information
content of an image. High entropy is preferred as it shows that the resultant
image contains more information. Meanwhile, MSE and PSNR are quantitative
metrics used to compare the original image and the improved image. A noisy
image is indicated by a high value of MSE and a low value of PSNR.
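The three metrics can be written compactly with NumPy. This is a sketch under stated assumptions: the images are 8-bit, entropy is the Shannon entropy of the 256-bin intensity histogram, and the function names are illustrative; the text does not say whether the authors compute the metrics per channel or on a grayscale version.

```python
import numpy as np

def entropy(gray):
    """Shannon entropy (bits) of the 256-bin histogram of an 8-bit image."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()
    p = p[p > 0]                         # empty bins contribute nothing
    return float(-(p * np.log2(p)).sum())

def mse(original, enhanced):
    """Mean squared error between two images of the same shape."""
    diff = original.astype(np.float64) - enhanced.astype(np.float64)
    return float(np.mean(diff ** 2))

def psnr_from_mse(m, peak=255.0):
    """PSNR in dB for a given MSE; infinite when the images are identical."""
    return float("inf") if m == 0 else float(10.0 * np.log10(peak ** 2 / m))

def psnr(original, enhanced, peak=255.0):
    return psnr_from_mse(mse(original, enhanced), peak)
```

With a peak value of 255, an MSE of 5.920 corresponds to a PSNR of about 40.41 dB, which agrees with the homomorphic filtering result reported for fish 1 in Table 1.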
As shown in Table 1, the proposed method obtains the highest entropy for all
tested images, indicating that it produces output images with more details and
information. For the MSE and PSNR evaluations, the proposed method is in fourth
place for images fish 1, coral 1, fish 2, and coral 2, and in fifth place for
image stone. Nevertheless, this does not necessarily mean that the proposed
method is inferior to the other methods. Because MSE and PSNR are computed
against the original, degraded image, a method that changes the image more
strongly also scores worse on these metrics. Such quantitative metrics therefore
have difficulty in correctly measuring the improvement made by an image
enhancement technique [16]. In some cases, performance metrics fail to agree
with the human perception of image quality [7].
For example, for image fish 1, the gray world method obtains a better MSE
score (3.521) than the proposed method (6.802). However, according to visual
observation, the output image produced by gray world looks reddish and the
image contrast is inadequately improved. Meanwhile, the proposed method
adequately reduces the bluish color cast and improves the image contrast
significantly, as the fish can be seen clearly. Therefore, in terms of image
quality comparison, qualitative evaluation by the human visual system is taken
as the first priority for overall image quality evaluation [4].
In addition, the GUI has been successfully developed with MATLAB software.
The performance of this application in enhancing underwater images is promising
since the computational time required is short: each image requires 2–3 s to be
processed and enhanced. Compared to the GUI, the Raspberry Pi requires a longer
computational time to process an underwater image. On average, the Raspberry Pi
implementation takes about 21 s to enhance an underwater image.
Table 1 Quantitative results in terms of entropy, MSE, and PSNR

| Image | Method | Entropy | MSE | PSNR (dB) |
|---|---|---|---|---|
| fish 1 | Original | 7.463 | – | – |
| fish 1 | Homomorphic filtering | 7.865 | 5.920 | 40.408 |
| fish 1 | Gray world | 6.404 | 3.521 | 42.664 |
| fish 1 | CLAHE | 6.940 | 11.078 | 37.686 |
| fish 1 | Contrast adjustment | 7.419 | 2.948 | 43.436 |
| fish 1 | HFIFD | 7.870 | 6.802 | 39.804 |
| coral 1 | Original | 7.591 | – | – |
| coral 1 | Homomorphic filtering | 7.779 | 55.592 | 30.681 |
| coral 1 | Gray world | 7.141 | 143.191 | 26.572 |
| coral 1 | CLAHE | 7.466 | 20.487 | 35.016 |
| coral 1 | Contrast adjustment | 7.491 | 32.673 | 32.989 |
| coral 1 | HFIFD | 7.878 | 60.119 | 30.341 |
| stone | Original | 7.557 | – | – |
| stone | Homomorphic filtering | 7.886 | 31.451 | 33.154 |
| stone | Gray world | 7.494 | 22.906 | 34.531 |
| stone | CLAHE | 7.653 | 8.298 | 38.941 |
| stone | Contrast adjustment | 7.608 | 9.549 | 38.331 |
| stone | HFIFD | 7.888 | 40.308 | 32.077 |
| fish 2 | Original | 7.529 | – | – |
| fish 2 | Homomorphic filtering | 7.863 | 5.635 | 40.622 |
| fish 2 | Gray world | 6.647 | 3.010 | 43.345 |
| fish 2 | CLAHE | 7.254 | 8.093 | 39.049 |
| fish 2 | Contrast adjustment | 7.441 | 3.102 | 43.214 |
| fish 2 | HFIFD | 7.889 | 7.589 | 39.329 |
| coral 2 | Original | 7.181 | – | – |
| coral 2 | Homomorphic filtering | 7.553 | 38.944 | 32.226 |
| coral 2 | Gray world | 6.734 | 266.114 | 23.880 |
| coral 2 | CLAHE | 7.047 | 28.447 | 33.590 |
| coral 2 | Contrast adjustment | 7.087 | 19.058 | 35.330 |
| coral 2 | HFIFD | 7.735 | 40.022 | 32.108 |
6 Conclusion
The proposed image enhancement method has proven effective in enhancing
underwater images in terms of color, contrast, and image details. Qualitative and
quantitative evaluations have been performed to assess and justify the
performance of the proposed method. Five sample images were tested and the
results showed the effectiveness of the proposed method. In addition, a GUI
application has been successfully developed for processing underwater images.
This GUI displays the comparison between the input image (raw image) and the
output image (enhanced image). The proposed method has also been successfully
implemented on the Raspberry Pi for underwater image acquisition: an image is
taken with the Picamera, and its quality is then improved through the proposed
method. The image quality produced through the Raspberry Pi is also satisfactory.
Acknowledgements The research is supported by University Malaysia Pahang (UMP) research
grant RDU1803131 entitled “Development of Multi-Vision Guided Obstacle Avoidance System
for Ground Vehicle”. The sample images and some related references are taken from database
https://sites.google.com/ump.edu.my/shahrizan/database-publication.
References
1. Abdul Ghani AS (2015) Improvement of underwater image contrast enhancement technique
based on histogram modification. Thesis - Universiti Sains Malaysia. Accessed Jan 2019
2. Ancuti C, Ancuti CO, Haber T, Bekaert P (2012) Enhancing underwater images and videos
by fusion. In: Proceedings of the IEEE computer society conference on computer vision and
pattern recognition, pp 81–88
3. Chiang JY, Chen YC (2012) Underwater image enhancement by wavelength compensation
and dehazing. IEEE Trans Image Process 21(4):1756–1769
4. Abdul Ghani AS, Mat Isa NA (2017) Automatic system for improving underwater image
contrast and color through recursive adaptive histogram modification. Comput Electron Agric
141:181–195
5. Mohd Azmi KZ, Abdul Ghani AS, Md Yusof Z, Ibrahim Z (2019) Deep underwater image
enhancement through integration of red color correction based on blue color channel and
global contrast stretching. In: Md Zain Z et al (eds) Proceedings of the 10th national technical
seminar on underwater system technology 2018, LNEE, vol 538, pp 35–44. Springer,
Singapore
6. Mohd Azmi KZ, Abdul Ghani AS, Md Yusof Z, Ibrahim Z (2019) Deep underwater image
enhancement through colour cast removal and optimization algorithm. Imag Sci J 67(6):330–
342
7. Mohd Azmi KZ, Abdul Ghani AS, Md Yusof Z, Ibrahim Z (2019) Natural-based underwater
image color enhancement through fusion of swarm-intelligence algorithm. Appl Soft
Comput J 85:1–19
8. Peng Y, Cosman PC (2017) Underwater image restoration based on image blurriness and light
absorption. IEEE Trans Image Process 26(4):1579–1594
9. Ancuti CO, Ancuti C, De Vleeschouwer C, Bekaert P (2018) Color balance and fusion for
underwater image enhancement. IEEE Trans Image Process 27(1):379–393
10. Kareem HH, Daway HG, Daway EG (2019) Underwater image enhancement using colour
restoration based on YCbCr colour model. In: IOP conference series: materials science and
engineering, vol 571, pp 1–7
11. Horak K, Zalud L (2015) Image processing on raspberry pi in Matlab. Adv Intell Syst Comput
4:1–7
12. Patil VP, Gohatre UB, Singla CR (2018) Design and development of raspberry pi based
wireless system for monitoring underwater environmental parameters and image enhancement. Int J Electron Electr Comput Syst 7(5):133–138
13. Buchsbaum G (1980) A spatial processor model for object colour perception. J Franklin Inst
310(1):1–26
14. Ye Z (2009) Objective assessment of nonlinear segmentation approaches to gray level
underwater images. ICGST J Graph Vis Image Process 9(II):39–46
15. Hitam MS, Awalludin EA, Wan Yussof WNJ, Bachok Z (2013) Mixture contrast limited
adaptive histogram equalization for underwater image enhancement. In: Proceeding of the
IEEE international conference on computer applications technology (ICCAT), pp 1–5
16. Rao SP, Rajendran R, Panetta K, Agaian SS (2017) Combined transform and spatial domain
based “no reference” measure for underwater images. In: Proceedings of the IEEE
international symposium on technologies for homeland security (HST), pp 1–7
Red and Blue Channels Correction
Based on Green Channel
and Median-Based Dual-Intensity
Images Fusion for Turbid Underwater
Image Quality Enhancement
Kamil Zakwan Mohd Azmi, Ahmad Shahrizan Abdul Ghani,
and Zulkifli Md Yusof
Abstract One of the main problems encountered in processing the turbid underwater images is the effect of greenish color cast that overshadows the actual color of
an object. This paper introduces a new technique which focuses on the enhancement of turbid underwater images. The proposed method integrates two major
steps. The first step is specially designed to reduce the greenish color cast problem.
The blue and red channels are improved according to the difference between these
channels and the reference channel in terms of the total pixel values. Then, the
median-based dual-intensity images fusion approach is applied to all color channels
to improve the image contrast. Qualitative and quantitative evaluation is used to test
the effectiveness of the proposed method. The results show that the proposed
method is very effective in improving the visibility of the turbid underwater images.
Keywords Image processing · Turbid underwater image · Contrast stretching
1 Introduction
The features of turbid underwater images differ from those of deep underwater
images: not only the red channel but also the blue channel is problematic, due to
absorption by organic matter [1]. As a result, a greenish color cast dominates
these images and makes the actual color of an object difficult to determine
accurately. In addition, turbid underwater images also suffer from low contrast,
resulting in poor image quality.
Based on the aforementioned issues, it is crucial for underwater researchers
to focus on improving turbid underwater images. In this paper, an idea to
improve the visibility of turbid underwater images is presented. The proposed
method involves two major steps: red and blue channels correction based on green
channel, and median-based dual-intensity images fusion (RBCG-MDIF). The
capability of the proposed method is validated through qualitative and quantitative
evaluation results.
K. Z. Mohd Azmi · A. S. Abdul Ghani (✉) · Z. Md Yusof
Faculty of Manufacturing and Mechatronic Engineering Technology,
Universiti Malaysia Pahang, 26600 Pekan, Malaysia
e-mail: shahrizan@ump.edu.my
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_6
This paper is organized as follows: the literature review is described in Sect. 2.
Section 3 discusses the motivation of this research. Section 4 provides a detailed
explanation of the proposed method. In Sect. 5, the capability of the proposed
method is confirmed through qualitative and quantitative evaluation results. This
paper ends with a conclusion.
2 Related Works
The gray world (GW) assumption [2] is a well-known method that has been
employed to improve underwater images. This method assumes that all color
channels have the same mean value before attenuation. However, this method
inadequately enhances underwater images that are highly affected by a strong
greenish effect, such as in turbid underwater scenes.
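As a rough illustration of the gray world assumption, the sketch below scales each channel so that all three channel means become equal. It is one common formulation and not necessarily the exact variant evaluated in this paper; the function name, the choice of the overall mean as the target, and the final clipping to 8 bits are assumptions.

```python
import numpy as np

def gray_world(image_rgb):
    """Scale each channel so that the three channel means become equal."""
    img = image_rgb.astype(np.float64)
    channel_means = img.reshape(-1, 3).mean(axis=0)
    target = channel_means.mean()                 # common gray level (assumption)
    gains = target / np.maximum(channel_means, 1e-12)
    return np.clip(np.rint(img * gains), 0, 255).astype(np.uint8)

# Toy greenish patch (hypothetical data): every pixel is (40, 120, 80).
patch = np.zeros((2, 2, 3), dtype=np.uint8)
patch[...] = [40, 120, 80]
balanced = gray_world(patch)
```

Because a weak channel receives a large gain, a strongly attenuated red channel can be amplified far more than the others, which is consistent with the reddish outputs reported for GW later in the paper.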
Another well-known method, frequently used as a benchmark for comparison, is
the unsupervised color correction method (UCM) [3]. This method is able to
increase the image contrast. However, for turbid underwater images, it tends to
produce a yellowish output image.
In 2016, Abdul Ghani and Mat Isa [4] proposed an integrated-intensity
stretched-Rayleigh histograms method (IISR). In this method, each color channel is
multiplied by a gain factor in order to balance all the color channels. Based on
visual observation, for turbid underwater images, IISR over-enhances the greenish
effect, thus reducing the visibility of the objects.
Recently, Mohd Azmi et al. (2019) [5] proposed a method for deep underwater
image enhancement. It incorporates two main steps, which are red color correction
based on blue color channel (RCCB) and global contrast stretching (GCS). This
method is very effective in enhancing deep underwater images, as it is able to
reduce the bluish color cast significantly. However, it is less effective in
improving the quality of turbid underwater images. The next section explains how
this method is modified and adapted for turbid underwater image enhancement.
3 Motivation
The RCCB step has shown excellent results in improving deep underwater
images [5]. This step works by modifying the red channel according to the
difference between this channel and the blue channel in terms of the total
pixel value.
However, this step is less effective in improving the quality of turbid underwater
images. As mentioned earlier, the features of turbid underwater images differ
from those of deep underwater images: not only the red channel but also the blue
channel is problematic, due to absorption by organic matter [1].
The diver image in Table 1(a) is used to show the output image produced by the
RCCB step. The original image is entirely covered by the greenish color cast
and the objects are hardly visible. According to the image histograms, the green
channel is dominant over the other color channels.
No changes can be seen in the output image generated by the RCCB step. The
image histograms also show no adjustment or improvement. This is because the
RCCB step only improves the red channel by referring to the blue channel [5],
while in a turbid scene the red and blue channels generally do not differ
significantly, as shown in the histograms of the original image.
Therefore, this paper introduces a new idea to improve the RCCB step,
considering the enhancements that need to be made to both the red and blue
channels. The reference channel should be changed to the green channel, instead
of the blue channel as proposed in the RCCB step [5]. This is because the green
channel is usually dominant over the other color channels in turbid underwater
images.
Table 1 Resultant image and image histograms produced by the RCCB step: (a) Original image; (b) RCCB [5]. Each row shows the resultant image and its red, green, and blue channel histograms.
4 Methodology: Red and Blue Channels Correction Based
on Green Channel and Median-Based Dual-Intensity
Images Fusion (RBCG-MDIF)
This section provides a detailed explanation of the proposed method. Figure 1 shows
the flowchart of the proposed method, while Table 2 shows the resultant images
and image histograms of each step of the proposed RBCG-MDIF method.
Fig. 1 Flowchart of the proposed RBCG-MDIF method: Input image → Red and blue channels correction based on green channel (RBCG) → Median-based dual-intensity images fusion (MDIF) → Unsharp masking → Output image
Table 2 Resultant images and image histograms of each step of the proposed RBCG-MDIF: (a) Input image; (b) RBCG; (c) MDIF; (d) Unsharp masking. Each row shows the resultant image and its red, green, and blue channel histograms.
4.1 Red and Blue Channels Correction Based on Green Channel (RBCG)
To begin with, the image is decomposed into its red, green, and blue channels.
Then, the total pixel values of the red channel, $R_{sum}$, the green channel, $G_{sum}$, and the blue
channel, $B_{sum}$, are calculated. The green channel is chosen as the reference channel
for the enhancement of the red and blue channels, as this color channel is usually
dominant in turbid underwater scenes. Two gain factors, $Y$ and $Z$, are obtained as
follows:
$$Y = \frac{G_{sum} - B_{sum}}{G_{sum} + B_{sum}} \tag{1}$$
$$Z = \frac{G_{sum} - R_{sum}}{G_{sum} + R_{sum}} \tag{2}$$
The gain factor Y carries information about the difference between the
green and blue channels in terms of total pixel value. Meanwhile, the gain factor
Z carries information about the difference between the green and red
channels. This information is crucial for controlling the amount of pixel
value that has to be added to the blue and red channels in order to reduce the
greenish color cast. The larger the pixel value difference between the green channel
and the other color channels, the larger the value that is added to improve the
blue and red channels.
Fig. 2 Images and their respective histograms before and after RBCG step (red, green, and blue histograms are shown for each image; the green channel is marked as the reference channel)
Then, the blue and red channels are improved through Eqs. (3) and (4),
respectively. As shown in Fig. 2, the proposed RBCG is able to enhance the blue
and red channels appropriately, thus significantly reducing the effect of the
greenish color cast.
$$P_{blue} = P_{blue} + Y \times P_{green} \tag{3}$$
$$P_{red} = P_{red} + Z \times P_{green} \tag{4}$$
where $P_{red}$, $P_{green}$ and $P_{blue}$ are the pixel values of the red, green and blue channels,
respectively.
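The RBCG step in Eqs. (1) to (4) can be sketched as follows. This is an illustrative reading of the equations rather than the authors' code: the function name `rbcg`, the RGB channel order of the input, the guard against an all-black image, and the clipping of the result to the 8-bit range are assumptions.

```python
import numpy as np

def rbcg(image_rgb):
    """Lift the blue and red channels toward the green reference channel."""
    img = image_rgb.astype(np.float64)
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    r_sum, g_sum, b_sum = r.sum(), g.sum(), b.sum()
    # Gain factors, Eqs. (1) and (2); zero gain for an all-black pair (assumption).
    y = (g_sum - b_sum) / (g_sum + b_sum) if (g_sum + b_sum) > 0 else 0.0
    z = (g_sum - r_sum) / (g_sum + r_sum) if (g_sum + r_sum) > 0 else 0.0
    b_new = b + y * g                       # Eq. (3)
    r_new = r + z * g                       # Eq. (4)
    out = np.stack([r_new, g, b_new], axis=-1)
    # Clipping to [0, 255] is an assumption; the paper does not state it.
    return np.clip(np.rint(out), 0, 255).astype(np.uint8), y, z

# Toy greenish image (hypothetical data): every pixel is (R, G, B) = (50, 150, 100).
toy = np.zeros((4, 4, 3), dtype=np.uint8)
toy[...] = [50, 150, 100]
corrected, gain_y, gain_z = rbcg(toy)
```

For this toy image the gains are Y = 0.2 and Z = 0.5, so the weaker red channel receives the larger correction, in line with the statement that a larger difference from the green channel leads to a larger added value.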
4.2 Median-Based Dual-Intensity Images Fusion (MDIF)
Then, the median-based dual-intensity images fusion approach is applied to all
color channels to improve the image contrast. The phase starts with the
determination of the minimum, median, and maximum intensity values of each image
histogram.
Fig. 3 Illustration of histogram division at a median point and stretching process (panels: original histogram with the minimum value, median point, and maximum value marked; lower stretched-region; upper stretched-region)
As shown in Fig. 3, each image histogram is separated at the median point into
two regions, the lower and upper stretched-regions. Each region is then stretched
according to Eq. (5):
$$P_{out} = 255 \times \frac{P_{in} - i_{min}}{i_{max} - i_{min}} \tag{5}$$
where $P_{in}$ and $P_{out}$ are the input and output pixels, respectively, and
$i_{min}$ and $i_{max}$ represent the minimum and maximum intensity level values
for the input image, respectively.
For each color channel, the separation at the median point and the global stretching
process produce two types of histograms, which are upper-stretched and
lower-stretched histograms. All upper-stretched histograms are integrated to
generate a new resultant image. The same process is performed on all
lower-stretched histograms. Then, the two resultant images are fused by averaging
their pixel values, as illustrated in Fig. 4.
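The MDIF steps described above can be sketched per channel as below. This is one plausible reading of the description, not the authors' implementation: the function names are hypothetical, and the treatment of pixels that fall outside a region (saturating them at the region boundary before stretching) is an assumption, since the text does not specify it.

```python
import numpy as np

def stretch(p, i_min, i_max):
    """Eq. (5): map the interval [i_min, i_max] linearly onto [0, 255]."""
    if i_max <= i_min:                 # degenerate region: leave values as they are
        return p.astype(np.float64)
    return 255.0 * (p - i_min) / (i_max - i_min)

def mdif_channel(channel):
    """Split one channel at its median, stretch both regions, average the results."""
    c = channel.astype(np.float64)
    i_min, i_med, i_max = c.min(), np.median(c), c.max()
    # Assumption: pixels outside a region saturate at that region's boundary.
    lower = stretch(np.clip(c, i_min, i_med), i_min, i_med)   # lower stretched-region
    upper = stretch(np.clip(c, i_med, i_max), i_med, i_max)   # upper stretched-region
    return np.rint((lower + upper) / 2.0).astype(np.uint8)

def mdif(image):
    """Apply the per-channel fusion to an H x W x 3 image."""
    return np.stack([mdif_channel(image[..., k]) for k in range(3)], axis=-1)

# Toy channel (hypothetical data) with minimum 0, median 75, and maximum 200.
toy_channel = np.array([[0, 50], [100, 200]], dtype=np.uint8)
fused = mdif_channel(toy_channel)
```

Averaging the two stretched images keeps detail from both halves of the histogram: the lower-stretched image expands the dark intensities, while the upper-stretched image expands the bright ones.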
4.3 Unsharp Masking
The unsharp masking technique [6] is applied in the last step to improve the overall
image sharpness. The fundamental idea of this method is to blur the original image
first, then subtract the blurred image from the original image. The difference is
then added to the original image.
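The three operations (blur, subtract, add back) can be sketched as follows for a single-channel image. The box blur, the `radius` and `amount` parameters, and the function names are assumptions chosen for a short self-contained example; the paper does not state which blur kernel or strength is used.

```python
import numpy as np

def box_blur(img, radius=1):
    """Mean filter with edge replication (a stand-in for any low-pass blur)."""
    k = 2 * radius + 1
    padded = np.pad(img.astype(np.float64), radius, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    for dy in range(k):                      # sum the k x k shifted copies
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def unsharp_mask(img, radius=1, amount=1.0):
    """Add the difference between the image and its blurred version back to it."""
    blurred = box_blur(img, radius)
    detail = img.astype(np.float64) - blurred        # original minus blurred
    sharpened = img + amount * detail                # add the difference back
    return np.clip(np.rint(sharpened), 0, 255).astype(np.uint8)

# Toy step edge (hypothetical data): left half 50, right half 150.
edge = np.full((5, 6), 50, dtype=np.uint8)
edge[:, 3:] = 150
sharp = unsharp_mask(edge)
```

On the toy step edge, pixels just to the bright side of the edge become brighter and pixels just to the dark side become darker, while flat regions are left unchanged; this local overshoot is what makes edges look sharper.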
Fig. 4 Composition of under-enhanced and over-enhanced images (input image → under-enhanced and over-enhanced images → enhanced-contrast output image)
This technique has been proven effective in improving the quality of
underwater images [7, 8]. Through this method, the blurry appearance of underwater
objects can be reduced. This can help underwater researchers to better
detect objects such as plants or animals under the sea.
5 Results and Discussion
In this experiment, 300 underwater images are used to evaluate the performance of
the proposed RBCG-MDIF method. The proposed method is compared with gray
world (GW) [2], unsupervised color correction method (UCM) [3],
integrated-intensity stretched-Rayleigh (IISR) [4], and red channel correction based
on blue channel and global contrast stretching (RCCB-GCS) [5].
Besides visual observation, three quantitative performance metrics are used to
support the qualitative assessment, which are entropy [9], patch-based contrast
quality index (PCQI) [10], and natural image quality evaluator (NIQE) [11]. A high
entropy value indicates that a method is able to generate an output image with more
information, while a high PCQI value corresponds to high quality of image contrast.
On the other hand, a low NIQE value indicates a high degree of image naturalness
of the output image. Five sample underwater images are selected for comparison
as shown in Figs. 5, 6, 7, 8 and 9, while Table 3 shows the quantitative results of
these sample images.
Fig. 5 Processed images of turbid image 1 based on different methods: (a) Original image; (b) GW; (c) UCM; (d) IISR; (e) RCCB-GCS; (f) Proposed RBCG-MDIF
Fig. 6 Processed images of turbid image 2 based on different methods: (a) Original image; (b) GW; (c) UCM; (d) IISR; (e) RCCB-GCS; (f) Proposed RBCG-MDIF
Fig. 7 Processed images of turbid image 3 based on different methods: (a) Original image; (b) GW; (c) UCM; (d) IISR; (e) RCCB-GCS; (f) Proposed RBCG-MDIF
Fig. 8 Processed images of turbid image 4 based on different methods: (a) Original image; (b) GW; (c) UCM; (d) IISR; (e) RCCB-GCS; (f) Proposed RBCG-MDIF
Fig. 9 Processed images of turbid image 5 based on different methods: (a) Original image; (b) GW; (c) UCM; (d) IISR; (e) RCCB-GCS; (f) Proposed RBCG-MDIF

Table 3 Quantitative results in terms of entropy, PCQI, and NIQE

| Images | Methods | Entropy | PCQI | NIQE |
|---|---|---|---|---|
| (a) Turbid image 1 | Original | 7.556 | 1.000 | 3.822 |
| (a) Turbid image 1 | GW | 7.030 | 0.943 | 3.769 |
| (a) Turbid image 1 | UCM | 7.665 | 1.196 | 3.849 |
| (a) Turbid image 1 | IISR | 7.113 | 1.107 | 4.026 |
| (a) Turbid image 1 | RCCB-GCS | 7.559 | 1.209 | 3.700 |
| (a) Turbid image 1 | Proposed RBCG-MDIF | 7.917 | 1.256 | 3.747 |
| (b) Turbid image 2 | Original | 7.600 | 1.000 | 7.112 |
| (b) Turbid image 2 | GW | 6.987 | 0.858 | 6.578 |
| (b) Turbid image 2 | UCM | 7.762 | 1.101 | 4.828 |
| (b) Turbid image 2 | IISR | 5.431 | 0.698 | 4.725 |
| (b) Turbid image 2 | RCCB-GCS | 7.490 | 1.141 | 5.112 |
| (b) Turbid image 2 | Proposed RBCG-MDIF | 7.942 | 1.166 | 3.959 |
| (c) Turbid image 3 | Original | 7.266 | 1.000 | 7.767 |
| (c) Turbid image 3 | GW | 6.639 | 0.846 | 6.310 |
| (c) Turbid image 3 | UCM | 7.391 | 1.131 | 4.696 |
| (c) Turbid image 3 | IISR | 4.779 | 0.756 | 4.619 |
| (c) Turbid image 3 | RCCB-GCS | 7.180 | 1.179 | 4.888 |
| (c) Turbid image 3 | Proposed RBCG-MDIF | 7.858 | 1.221 | 4.359 |
| (d) Turbid image 4 | Original | 6.713 | 1.000 | 4.996 |
| (d) Turbid image 4 | GW | 6.075 | 0.992 | 4.344 |
| (d) Turbid image 4 | UCM | 7.301 | 1.209 | 6.947 |
| (d) Turbid image 4 | IISR | 4.856 | 0.973 | 4.615 |
| (d) Turbid image 4 | RCCB-GCS | 6.630 | 1.421 | 4.783 |
| (d) Turbid image 4 | Proposed RBCG-MDIF | 7.719 | 1.442 | 4.774 |
| (e) Turbid image 5 | Original | 7.674 | 1.000 | 5.999 |
| (e) Turbid image 5 | GW | 7.033 | 0.940 | 5.279 |
| (e) Turbid image 5 | UCM | 7.863 | 1.155 | 4.711 |
| (e) Turbid image 5 | IISR | 6.796 | 1.033 | 4.943 |
| (e) Turbid image 5 | RCCB-GCS | 7.691 | 1.132 | 4.975 |
| (e) Turbid image 5 | Proposed RBCG-MDIF | 7.951 | 1.202 | 4.445 |

The original image of turbid image 1 has low contrast and the greenish color
cast overshadows the actual color of objects. Through comparison, GW produces a
reddish output image that seems unnatural to the human visual system. Furthermore, this
method insufficiently enhances the image contrast, as it produces the lowest values
of entropy (7.030) and PCQI (0.943). UCM is able to reduce the greenish color cast.
However, the bright region takes on a yellowish appearance. No major
enhancement can be observed in the resultant image delivered by IISR, as this
method further intensifies the greenish color cast. The high NIQE score (4.026)
obtained by this method shows that the quality of this output image is worse than
that of the original image. RCCB-GCS is able to lessen the greenish color cast problem.
However, based on quantitative analysis, this method obtains a low entropy value
(7.559), which is almost the same as that of the original image (7.556). Meanwhile, the proposed
RBCG-MDIF produces the best image quality, as the greenish color cast is
extensively reduced. This better performance is also verified by the quantitative
assessment stated in Table 3(a), as the proposed method obtains the highest scores
for entropy and PCQI. For NIQE, the proposed method ranks second after the
RCCB-GCS method. However, visual observation shows that the output image
produced by the proposed method is better than that of RCCB-GCS. In the output
image produced by the RCCB-GCS method, the greenish color cast remains in the
background, as shown in Fig. 5(e).
Unlike the previous test image, the original image of turbid image 2 is
affected by a strong greenish color cast that masks the actual color of the
objects. Instead of reducing the greenish color cast, GW introduces a reddish
color cast in the output image, which again masks the true color of the objects.
UCM is able to improve the image contrast. However, this method produces a
yellowish effect, especially in the background.
Compared to the original image, the resultant image processed by IISR is worse.
This method over-enhances the greenish effect, thus reducing the visibility of the
objects. This outcome is supported by the quantitative analysis, where this method
produces the lowest values of entropy (5.431) and PCQI (0.698). RCCB-GCS is
able to improve the image contrast and reduce the greenish color cast problem, as
the objects can be differentiated from the background. However, this method
produces a large NIQE value (5.112), indicating poor image naturalness. On the
other hand, the proposed RBCG-MDIF effectively reduces the greenish color cast.
The image contrast is also well improved. This notable result is verified
by the quantitative assessment stated in Table 3(b), as the proposed RBCG-MDIF
obtains the best values for all three metrics: the highest entropy (7.942), the
highest PCQI (1.166), and the lowest NIQE (3.959).
Meanwhile, the original image of turbid image 3 is covered by an intense greenish
color cast, so the objects are barely visible. Through comparison,
GW darkens the original image. This method also produces a high value of NIQE
(6.310), indicating poor naturalness of the processed image. UCM produces
a yellowish effect in the output image while the greenish color cast persists in the
background. IISR further degrades the original image, as the greenish color cast
heavily overshadows the output image. RCCB-GCS successfully reduces the
greenish color cast to some extent. However, this effect remains in the
background. On the other hand, the proposed RBCG-MDIF produces better image
quality than the other methods, as the greenish color cast is significantly reduced.
Furthermore, the objects can be seen clearly. This performance is confirmed by
the quantitative assessment stated in Table 3(c), as the proposed method
obtains the best scores for all performance metrics.
A similar trend can be observed in the other tested images, where the proposed
RBCG-MDIF successfully reduces the greenish color cast and improves the image
contrast. Table 4 reports the average quantitative scores of the 300 tested underwater
images. This table further supports the superior performance of the proposed
method, which attains the best rank for all performance metrics.
Table 4 Average quantitative result of 300 tested underwater images

| Methods | Entropy | PCQI | NIQE |
|---|---|---|---|
| Original | 7.064 | 1.000 | 4.244 |
| GW | 6.607 | 0.976 | 4.801 |
| UCM | 7.571 | 1.194 | 4.615 |
| IISR | 7.258 | 1.148 | 3.959 |
| RCCB-GCS | 7.287 | 1.192 | 3.836 |
| Proposed RBCG-MDIF | **7.775** | **1.279** | **3.808** |

Note The values in bold typeface represent the best result obtained in the comparison
6 Conclusion
The RBCG-MDIF method is specifically designed to solve turbid underwater image
problems, especially to reduce the greenish color cast effect and to improve overall
image contrast. This paper introduces a new idea to improve the RCCB step,
considering the enhancements that need to be made to the red and blue channels.
The reference channel has been changed to the green channel, instead of the blue
channel for turbid underwater image enhancement. The capability of the proposed
method in enhancing the turbid underwater images is verified through qualitative
and quantitative evaluation results.
Acknowledgements We would like to thank all reviewers for the comments and suggestions to
improve this paper. This study is supported by Universiti Malaysia Pahang (UMP) through
Postgraduate Research Grant Scheme (PGRS1903184) entitled “Development of Underwater
Image Contrast and Color through Optimization Algorithm”.
References
1. Lu H, Li Y, Xu X, Li J, Liu Z, Li X, Yang J, Serikawa S (2016) Underwater image
enhancement method using weighted guided trigonometric filtering and artificial light
correction. J Vis Commun Image Represent 38:504–516
2. Buchsbaum G (1980) A spatial processor model for object colour perception. J Franklin Inst
310(1):1–26
3. Iqbal K, Odetayo M, James A, Salam RA, Talib AZH (2010) Enhancing the low quality
images using unsupervised colour correction method. In: Proceedings of the IEEE
international conference on systems, man and cybernetics pp. 1703–1709
4. Abdul Ghani AS, Raja Aris RSNA, Muhd Zain ML (2016) Unsupervised contrast correction
for underwater image quality enhancement through integrated-intensity stretched-Rayleigh
histograms. J Telecommun Electron Comput Eng 8(3):1–7
5. Azmi KZM, Ghani, ASA, Md Yusof Z, Ibrahim Z (2019) Deep underwater image
enhancement through integration of red color correction based on blue color channel and
global contrast stretching. In: Md Zain Z, et al (eds) Proceedings of the 10th national technical
seminar on underwater system technology 2018. LNEE, vol 538. Springer, Singapore,
pp 35–44
6. Jain AK (1989) Fundamentals of digital image processing. Prentice Hall, Englewood Cliffs
7. Mohd Azmi KZ, Abdul Ghani AS, Md Yusof Z, Ibrahim Z (2019) Deep underwater image
enhancement through colour cast removal and optimization algorithm. Imaging Sci J 67
(6):330–342
8. Mohd Azmi KZ, Abdul Ghani AS, Md Yusof Z, Ibrahim Z (2019) Natural-based underwater
image color enhancement through fusion of swarm-intelligence algorithm. Appl Soft
Comput J 85:1–19
9. Ye Z (2009) Objective assessment of nonlinear segmentation approaches to gray level
underwater images. ICGST J Graph Vis Image Process 9(2):39–46
10. Wang S, Ma K, Yeganeh H, Wang Z, Lin W (2015) A patch-structure representation method for
quality assessment of contrast changed images. IEEE Signal Process Lett 22(12):2387–2390
11. Mittal A, Soundararajan R, Bovik AC (2013) Making a “completely blind” image quality
analyzer. IEEE Signal Process Lett 20(3):209–212
Analysis of Pruned Neural Networks
(MobileNetV2-YOLO v2)
for Underwater Object Detection
A. F. Ayob, K. Khairuddin, Y. M. Mustafah, A. R. Salisa,
and K. Kadir
Abstract Underwater object detection involves identifying multiple objects within
a dynamic and noisy environment. Such a task is challenging due to the
inconsistency of moving shapes underwater (e.g. goldfish) within very dynamic
surroundings (e.g. bubbles, miscellaneous objects). The application of
pre-trained deep learning classifiers (e.g. AlexNet, ResNet, GoogLeNet and so on)
as the backbone of several object detection algorithms (e.g. YOLO, Faster-RCNN
and so on) has gained popularity in recent years. However, little attention has
been paid to the systematic study of reducing the size of the pre-trained neural
networks, and hence speeding up the object detection process in real-world
applications. In this work, we investigate the effect of reducing the size of the
pre-trained MobileNetV2 as the backbone of the YOLOv2 object detection framework
to construct a fast, accurate and small neural network model that performs
goldfish breed identification in real time.
Keywords Artificial neural network · Object detection · Underwater engineering ·
Ocean technology
A. F. Ayob (corresponding author) · K. Khairuddin · A. R. Salisa
Faculty of Ocean Engineering Technology and Informatics,
Universiti Malaysia Terengganu, 21030 Kuala Nerus, Malaysia
e-mail: ahmad.faisal@umt.edu.my
Y. M. Mustafah
Department of Mechatronics Engineering, International Islamic University
Malaysia, 50728 Kuala Lumpur, Malaysia
K. Kadir
Garisan Automotive Sdn. Bhd., Cyberjaya, Malaysia
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_7
1 Introduction
Deep learning is a branch of artificial neural networks concerned with
developing a model that acts as a universal function approximator based on the
training data. In the field of underwater object detection, such a function
approximator/model can be constructed without prior knowledge such as the depth
of the water, the map of the surroundings, underwater occlusion and the
temperature of the surroundings.
The underwater object detection presented in [1] combined colour contrast,
intensity and transmission information to identify the ROI in underwater images;
however, unstable performance was reported in artificially illuminated
environments. Sung et al. [2] applied the You Only Look Once (YOLO) algorithm to
underwater fish detection via transfer learning, adopting the original framework
and training it on their custom dataset; however, they reported a very low frame
rate (16.7 frames per second, FPS) on a GeForce Pascal Titan GPU. Xu and
Matzner [3] used the third version of YOLO (YOLOv3) to perform underwater fish
detection via transfer learning, although with a moderate mean average precision
(mAP = 0.5392).
This paper addresses two questions concerning the effectiveness of deep learning
frameworks in real-life applications:
1. The effect of utilizing many layers of deep learning to solve for several classes
within a dynamic underwater environment with respect to detection time and
model size.
2. Whether there is a need to utilize all the layers in the pre-trained deep learning
model to be used in a different situation.
2 Proposed Approach
2.1 You Only Look Once (YOLO) and YOLOv2
YOLO [4] is a single convolutional network that predicts object bounding boxes
and class probabilities directly from full images in just one evaluation [5].
One of its main benefits is that it is exceptionally fast. YOLO does not need a
complex pipeline, as it models detection as a regression problem [4]. YOLO uses
regression as its final detection layer, which maps the output of the last fully
connected layer to the final bounding boxes and class assignments [6]. The
network of YOLO consists of 24 convolutional layers followed by 2 fully
connected layers [7], as shown in Fig. 1. Furthermore, YOLO reasons globally
about the image when making predictions, resulting in fewer false positive
predictions on the background. In addition, YOLO learns general representations
of objects, which means that it is able to detect objects in natural images as
well as in other domains such as artwork.
Fig. 1 Original YOLO architecture [4]
YOLOv2 [8], also known as YOLO9000, is an improved version of YOLO that is able
to detect over 9000 object categories. Compared to Fast R-CNN, YOLO makes a
significant number of localization errors [8]. YOLO also suffers from low recall
compared to region proposal-based methods. In YOLOv2, anchor boxes are added to
predict bounding boxes [9]. Anchor boxes prove to be effective, as they allow
the detection of multiple objects with different aspect ratios in a single grid
cell. Furthermore, YOLOv2 introduces dimension clustering (based on k-means) for
bounding box parameterization, which improves the mean Average Precision (mAP)
of the detection.
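The dimension clustering idea can be illustrated with a minimal sketch. It assumes boxes are described only by (width, height), uses the distance d = 1 − IoU proposed for YOLOv2 anchor selection, and runs on made-up box sizes; the function names and the sample data are hypothetical and are not taken from the authors' implementation.

```python
import random

def iou_wh(box, centroid):
    """IoU of two boxes given as (width, height), aligned at a common corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iterations=50, seed=0):
    """Cluster (width, height) pairs with k-means using the distance 1 - IoU."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Smallest (1 - IoU) distance is the same as largest IoU.
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                mean_w = sum(b[0] for b in members) / len(members)
                mean_h = sum(b[1] for b in members) / len(members)
                new_centroids.append((mean_w, mean_h))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return sorted(centroids)

# Hypothetical ground-truth box sizes (pixels): three small and three large boxes.
boxes = [(10, 12), (11, 13), (12, 11), (40, 20), (42, 22), (38, 19)]
anchors = kmeans_anchors(boxes, k=2)
```

The resulting centroids would serve as anchor box shapes; in practice the number of anchors k is a trade-off between recall and model complexity.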
2.2 MobileNet and MobileNetV2 Algorithm
MobileNet is based on depth-wise separable convolutions [10]. A depth-wise
separable convolution is made up of two layers: a depth-wise convolution and a
1 × 1 pointwise convolution, as shown in Fig. 2. The depth-wise layer applies a
single convolution filter to each input (e.g. colour) channel separately, rather
than combining all channels in one filter; the pointwise layer then combines the
per-channel outputs. Quantized MobileNet models have been shown to have a large
accuracy gap against their floating-point counterparts, even though separable
convolution successfully reduces parameter size and computation latency [11].
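The parameter saving of a depth-wise separable convolution can be checked with a short calculation. The sketch below counts weights only (biases and batch normalization are ignored), and the layer sizes in the example are arbitrary illustrative values rather than actual MobileNet layer dimensions.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out

def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depth-wise k x k convolution plus a 1 x 1 pointwise convolution."""
    depthwise = kernel * kernel * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out             # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

# Illustrative layer: 3 x 3 kernel, 32 input channels, 64 output channels.
std = standard_conv_params(3, 32, 64)
sep = depthwise_separable_params(3, 32, 64)
ratio = sep / std   # equals 1/c_out + 1/kernel**2
```

For this example the separable layer needs roughly one eighth of the weights of the standard layer, which is the source of the size and latency reduction described above.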
In MobileNetV2, bottleneck convolutions are utilized [12]. The ratio between the
size of the input and the inner size is referred to as the expansion ratio. Each
bottleneck block contains an input followed by several bottlenecks. Shortcuts
are used directly between the bottlenecks because the bottlenecks contain all
the necessary information, while an expansion layer only acts as an
implementation detail that accompanies a non-linear transformation of the
tensor, as shown in Fig. 3. Instead of the classical residual block, which
connects the layers with a high number of channels, inverted residuals are used,
which connect the bottlenecks. The inverted design is used because it is
considerably more memory efficient and works slightly better. The pre-trained
MobileNetV2 incorporates a 16-block architecture. The 16-block pre-trained
MobileNetV2 model can be obtained from [13].
Fig. 2 MobileNet architecture [12]
Fig. 3 Two types of bottleneck blocks incorporated in MobileNetV2 [12]
2.3 Evaluation of Models
In order to evaluate the models, several evaluation metrics have been utilized,
namely Precision, Recall, Average Precision and the mean Average Precision
(mAP). From a human perspective, such metrics aim to evaluate how well the
model mimics human capability in the detection task.
Given a number of queries:
a) Precision is defined as the ratio of true positive items to the sum of the
true positive and false positive items, i.e. all items reported by the detector,
as shown in Eq. (1).
\[ \text{precision} = \frac{\text{true positives}}{\text{true positives} + \text{false positives}} \tag{1} \]
b) Recall is defined as the ratio of true positive items to the sum of the true
positive and false negative items, i.e. all positive objects in the ground truth
data, as shown in Eq. (2).
\[ \text{recall} = \frac{\text{true positives}}{\text{true positives} + \text{false negatives}} \tag{2} \]
c) Average Precision (AP) is defined as the area under the precision-recall
curve, calculated across the given queries.
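The metrics above can be sketched in a few lines. This is a minimal illustration, not the evaluation code used in the paper: the trapezoidal integration of the precision-recall curve is an assumed choice (the source does not state how the area is computed), the counts are hypothetical, and the per-class AP list reuses the rounded values printed in the panel titles of Fig. 8.

```python
def precision(true_pos, false_pos):
    """Fraction of reported detections that are correct."""
    total = true_pos + false_pos
    return true_pos / total if total else 0.0

def recall(true_pos, false_neg):
    """Fraction of ground-truth objects that are detected."""
    total = true_pos + false_neg
    return true_pos / total if total else 0.0

def average_precision(recalls, precisions):
    """Area under the PR curve by the trapezoidal rule (assumed integration scheme).

    Points must be sorted by increasing recall.
    """
    area = 0.0
    for i in range(1, len(recalls)):
        width = recalls[i] - recalls[i - 1]
        area += width * (precisions[i] + precisions[i - 1]) / 2.0
    return area

def mean_average_precision(ap_per_class):
    """Mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)

# Hypothetical counts: 8 correct detections, 2 false alarms, 3 missed objects.
p = precision(8, 2)
r = recall(8, 3)
ap = average_precision([0.0, 0.5, 1.0], [1.0, 1.0, 0.8])
# Rounded per-class AP values as printed in the Fig. 8 panel titles (Block 1).
m_ap = mean_average_precision([0.9, 0.8, 0.9, 0.9, 1.0, 0.9])
```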
In this work, the mean Average Precision (mAP) is calculated for each model
across a number of classes, as shown in Eq. (3).
\[ \text{mAP} = \frac{\sum_{class=1}^{nClass} \text{AP}_{class}}{nClass} \tag{3} \]
2.4 Data Preparation
A four-minute video of free-swimming goldfish was recorded in the lab under a
controlled lighting setup, as shown in Fig. 4. This setup is adequate to
simulate a real-world application, where bubbles and other uncontrolled
movements are tolerated. The video was extracted frame by frame, which resulted
in 11,4444 images. The images were split into a training set and a validation
set at a ratio of 60%–40%. The training set was annotated/labelled with respect
to the goldfish breeds prior to the training of the YOLOv2 deep learning model.
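A 60%–40% split of extracted frames could be produced as in the sketch below. Whether the authors shuffled the frames or split them sequentially is not stated, so the seeded random shuffle here is an assumption, and the function name and the frame count of 1000 are purely illustrative.

```python
import random

def split_indices(n_items, train_fraction=0.6, seed=0):
    """Shuffle item indices and split them into training and validation sets."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)   # seeded so the split is reproducible
    cut = int(round(n_items * train_fraction))
    return indices[:cut], indices[cut:]

# Illustrative frame count only; not the dataset size used in the paper.
train_idx, val_idx = split_indices(1000)
```

Because consecutive video frames are highly similar, a random split places near-duplicate frames in both sets; a sequential split by time is a common alternative when a stricter validation estimate is wanted.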
Fig. 4 QR-code link to the video results of the 6-classes Goldfish Breeds detection/identification
[14]
3 Results and Discussions
The experiments were conducted using the pre-trained MobileNet v2 model as the
backbone of the YOLOv2 detection framework. The initial pre-trained model
consisted of 16 building blocks; for each experiment, the number of blocks was
systematically reduced by one (n−1) for each new training session. Each
training was conducted across 30 epochs with a mini-batch size of 16 and
stochastic gradient descent as the optimizer. The machine used has an Intel i7
(8th Generation) processor, 16 GB of RAM and an RTX 2060 GPU with 6 GB of VRAM.
The deep learning models were trained using a dataset of 5,833 annotated
goldfish images, which consists of 6 classes of goldfish breeds: Calico
Goldfish, Blackmoor Goldfish, Common Goldfish, Lionhead Goldfish, Ryukin
Goldfish and Pearlscale Goldfish. Each experiment took approximately 4 h to
complete. Each newly trained model was analyzed qualitatively (via videos) and
quantitatively (Table 1) to measure its effectiveness.
The first step in evaluating the models is to observe the precision-recall
(PR) curve. The PR curves are presented in Figs. 8, 9 and 10. Across all the
models (Block 1 to 16), the character of the PR curves is similar, which
indicates the consistency of the training. In this work, Block 1, Block 8 and
Block 16 were selected as typical representatives. It can be observed that the
models are able to perform with high precision, even with the recall threshold
of 0.5.
The video representation of the results can be accessed via the link provided in
the QR code shown in Fig. 4. The effectiveness of the detection model can be
observed qualitatively in the web-based demonstration and is further elaborated
in this section.
Figures 5, 6 and 7 show snapshots at time t = 1:28 min (named the
‘checkpoint’) for the three models that were trained on the respective feature
layers named Block 1, Block 8 and Block 16. Considering the whole 16 blocks that
build the pre-trained MobileNet v2, Block 1 represents 1/16 (6%) of the original
pre-trained model, Block 8 represents 8/16 (50%), and Block 16 represents the
whole (100%) original pre-trained model.
Referring to the figures, it can be observed that at the checkpoint of t = 1:28
min, Block 1 was able to detect 8 out of 11 goldfish in the aquarium, Block 8
was able to detect all the goldfish, and Block 16 was able to detect 8 out of 11
goldfish. This qualitative observation is closely related to the mAP of each
model as reported in Table 1, where Block 8 has a higher mAP than Block 1 and
Block 16.
Fig. 5 Video snapshot of the t = 1:28 min of the detection using Block 1 model
Fig. 6 Video snapshot of the t = 1:28 min of the detection using Block 8 model
Fig. 7 Video snapshot of the t = 1:28 min of the detection using Block 16 model
Further inspection of Table 1 indicates that Block 16, with about 3.6 million
parameters, has the longest detection time, averaging 12.53 frames per second,
compared with Block 1 (17,328 parameters), which performs the fastest detection
at 56.64 frames per second. A more reasonable frame rate (~24 frames per
second) for this case, with an mAP close to ~97%, can be attributed to Block 8,
Block 9 and Block 10, as shown in Table 1.
In terms of possible extensions or future work, for a non-critical,
non-life-threatening application, such a reduction in model size and parameters
is beneficial for mobile-based high-speed detection tasks such as the one
presented in this paper.
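The trade-off described above can be checked directly against the Table 1 values. In the sketch below the table rows are transcribed as (parameters ×10^5, mAP %, FPS, size in MB); the selection thresholds (mAP of at least 96% and a frame rate between 23 and 26 FPS) are illustrative assumptions chosen to mirror the "~24 FPS, ~97% mAP" description, not criteria stated by the authors.

```python
# Table 1 rows: block -> (parameters x 10^5, mAP %, mean FPS, model size in MB)
TABLE_1 = {
    16: (36.0284, 95.05, 12.53, 13.594),
    15: (17.478, 94.66, 12.61, 6.787),
    14: (14.3692, 94.41, 12.75, 5.596),
    13: (11.2604, 94.44, 12.74, 4.405),
    12: (6.782, 97.39, 13.80, 2.738),
    11: (5.654, 97.08, 15.04, 2.29),
    10: (4.526, 96.89, 23.21, 1.843),
    9: (2.95896, 96.79, 24.13, 1.246),
    8: (2.45272, 96.43, 25.06, 1.035),
    7: (1.94648, 96.65, 30.33, 0.825),
    6: (1.44024, 95.99, 32.35, 0.615),
    5: (0.67928, 95.83, 33.57, 0.322),
    4: (0.54904, 95.46, 35.47, 0.26),
    3: (0.4188, 94.86, 37.37, 0.198),
    2: (0.24792, 91.77, 50.90, 0.123),
    1: (0.17328, 89.53, 56.64, 0.084),
}

def select_models(table, min_map=96.0, fps_range=(23.0, 26.0)):
    """Return block numbers meeting the (assumed) mAP and FPS thresholds."""
    low, high = fps_range
    return sorted(
        block for block, (_, m_ap, fps, _) in table.items()
        if m_ap >= min_map and low <= fps <= high
    )

reasonable = select_models(TABLE_1)
fastest = max(TABLE_1, key=lambda block: TABLE_1[block][2])
```

Widening the FPS range would also admit Block 7 (30.33 FPS, 96.65% mAP), so the choice of "reasonable" models depends on the frame rate the application requires.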
Table 1 Quantitative observation of the trained models across different
evaluation metrics. Highlighted in the original are the most reasonable models
with respect to mAP, FPS and size

Model name   Total number of       Mean average precision   Mean frames per   Size of model
             parameters (×10^5)    (mAP) (%)                second (FPS)      (MB, decimal)
Block 16     36.0284               95.05                    12.53             13.594
Block 15     17.478                94.66                    12.61             6.787
Block 14     14.3692               94.41                    12.75             5.596
Block 13     11.2604               94.44                    12.74             4.405
Block 12     6.782                 97.39                    13.80             2.738
Block 11     5.654                 97.08                    15.04             2.29
Block 10     4.526                 96.89                    23.21             1.843
Block 9      2.95896               96.79                    24.13             1.246
Block 8      2.45272               96.43                    25.06             1.035
Block 7      1.94648               96.65                    30.33             0.825
Block 6      1.44024               95.99                    32.35             0.615
Block 5      0.67928               95.83                    33.57             0.322
Block 4      0.54904               95.46                    35.47             0.26
Block 3      0.4188                94.86                    37.37             0.198
Block 2      0.24792               91.77                    50.90             0.123
Block 1      0.17328               89.53                    56.64             0.084

Fig. 8 Precision-recall graph for Block 1 (six panels, precision versus recall):
Black Moor Goldfish AP = 0.9; Calico Goldfish AP = 0.8; Common Goldfish AP = 0.9;
Lionhead Goldfish AP = 0.9; Ryukin Goldfish AP = 1.0; Pearlscale Goldfish AP = 0.9
Fig. 9 Precision-recall graph for Block 8 (six panels, precision versus recall):
Black Moor Goldfish AP = 1.0; Calico Goldfish AP = 1.0; Common Goldfish AP = 1.0;
Lionhead Goldfish AP = 0.9; Ryukin Goldfish AP = 1.0; Pearlscale Goldfish AP = 1.0
4 Conclusions
In this work, we have presented a case study that investigates the effect of
reducing the neural network layers of the original MobileNetV2 from a ‘16
Blocks’ to a ‘1 Block’ architecture. The decrease in the number of layers
accounts for a reduction from about 3.6 million to 17,328 learnable parameters
in the deep neural network (Table 1). Important observations regarding the
effect of reducing the number of layers include a significant speed-up in the
detection process, which amounted to a 78% reduction in per-frame detection
time, from ~12 fps to ~56 fps. The mean Average Precision (mAP) was observed to
be 89% when utilizing only ‘Block 1’, compared with 95% mAP when utilizing the
whole 16 blocks of MobileNet v2. Furthermore, a 99% shrinkage in model size was
achieved between ‘Block 16’ (13.594 MB) and ‘Block 1’ (0.084 MB), indicating
that reducing the number of layers is also beneficial for real-world
mobile-based model architectures while maintaining satisfactory accuracy.
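The percentages quoted in the conclusions can be reproduced from the Block 16 and Block 1 rows of Table 1. The short calculation below is a worked check under the assumption that the speed figure refers to the time spent per frame (the reciprocal of FPS); the variable names are illustrative.

```python
# Values for Block 16 and Block 1 as listed in Table 1.
fps_16, fps_1 = 12.53, 56.64
size_16_mb, size_1_mb = 13.594, 0.084
params_16, params_1 = 36.0284e5, 0.17328e5

# Time per frame is 1/FPS, so the relative time saved is 1 - fps_16 / fps_1.
time_reduction = 1 - fps_16 / fps_1          # about 0.78
frame_rate_gain = fps_1 / fps_16             # about 4.5 times faster
size_reduction = 1 - size_1_mb / size_16_mb  # about 0.99
param_reduction = 1 - params_1 / params_16   # about 0.995
```

Expressed as a frame-rate increase rather than a time reduction, the same change corresponds to roughly a 4.5-fold gain.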
Fig. 10 Precision-recall graph for Block 16 (six panels, precision versus recall):
Black Moor Goldfish AP = 1.0; Calico Goldfish AP = 1.0; Common Goldfish AP = 0.9;
Lionhead Goldfish AP = 0.9; Ryukin Goldfish AP = 1.0; Pearlscale Goldfish AP = 1.0
Acknowledgements Parts of this research were sponsored under Fundamental Research Grant
Scheme (FRGS) 59361 awarded by Ministry of Education Malaysia, and Research Intensified
Grant Scheme (RIGS) 55192/12 awarded by Universiti Malaysia Terengganu.
References
1. Chen Z, Zhang Z, Dai F, Bu Y, Wang H (2017) Monocular vision-based underwater object
detection. Sensors (Basel) 17(8):1784
2. Sung M, Yu S, Girdhar Y (2017) Vision based real-time fish detection using convolutional
neural network. In: OCEANS 2017—Aberdeen, pp 1–6
3. Xu W, Matzner S (2018) Underwater fish detection using deep learning for water power
applications. arXiv preprint arXiv:1811.01494
4. Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: unified, real-time
object detection. In: IEEE conference on computer vision and pattern recognition, pp 779–788
5. Jing L, Yang X, Tian Y (2018) Video you only look once: overall temporal convolutions for
action recognition. J Visual Commun Image Rep, 58–65
6. Putra MH, Yussof ZM, Lim KC, Salim SI (2018) Convolutional neural network for person
and car detection using YOLO framework. J Telecommun Electron Comput Eng 10:1–7
7. Du J (2018) Understanding of object detection based on CNN family and YOLO. J Phys Conf
Series, 12–29
8. Redmon J, Farhadi A (2017) YOLO9000: better, faster, stronger. In: Proceedings of the IEEE
conference on computer vision and pattern recognition, pp 7263–7271
9. Shafiee MJ, Chywl B, Li F, Wong A (2017) Fast YOLO: a fast you only look once system for
real-time embedded object detection in video, arXiv preprint arXiv:1709.05943
10. Howard AG, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H
(2017) Mobilenets: efficient convolutional neural networks for mobile vision applications.
arXiv preprint arXiv:1704.04861
11. Sheng T, Feng C, Zhuo S, Zhang X, Shen L, Aleksic M (2018) A quantization-friendly
separable convolution for MobileNets. In: 1st workshop on energy efficient machine learning
and cognitive computing for embedded applications (EMC2), pp 14–18
12. Sandler M, Howard A, Zhu M, Zhmoginov A, Chen L (2018) Mobilenetv2: inverted residuals
and linear bottlenecks. In: Proceedings of the IEEE conference on computer vision and pattern
recognition, pp 4510–4520
13. Mathworks Inc.: Pretrained MobileNet-v2 convolutional neural network. Mathworks Inc.
(2019). https://www.mathworks.com/help/deeplearning/ref/mobilenetv2.html. Accessed 14
Nov 2019
14. Ayob AF.: MobileNet(v2)-YOLOv2 Goldfish Detection (2019). https://www.youtube.com/
playlist?list=PLyM-KBafTfgicwqAhpa9a8HSv2TSHV3fZ. Accessed 21 July 2019
Different Cell Decomposition Path
Planning Methods for Unmanned Air
Vehicles-A Review
Sanjoy Kumar Debnath, Rosli Omar, Susama Bagchi,
Elia Nadira Sabudin, Mohd Haris Asyraf Shee Kandar,
Khan Foysol, and Tapan Kumar Chakraborty
Abstract An Unmanned Aerial Vehicle (UAV) or robot is guided towards its goal
through path planning that helps it in avoiding obstacles. Path planning generates a
path between a given start and an end point for the safe and secure reach of the
robot with required criteria. A number of path planning methods are available such
as bio-inspired method, sampling based method, and combinatorial method. Cell
decomposition technique which is known as one of the combinatorial methods can
be represented with configuration space. The aim of this paper is to study the results
obtained in earlier researches where cell decomposition technique has been used
with different criteria like shortest travelled path, minimum computation time,
memory usage, safety, completeness, and optimality. Based on the classical taxonomy, the studied methods are classified.
Keywords Path planning · Cell decomposition · Regular grid · UAV
S. K. Debnath · R. Omar (corresponding author) · S. Bagchi · E. N. Sabudin · M. H. A. Shee Kandar
Faculty of Electrical and Electronic Engineering, Universiti Tun Hussein Onn Malaysia,
Johor, Malaysia
e-mail: roslio@uthm.edu.my
K. Foysol
Department of Allied Engineering, Bangladesh University of Textiles, Dhaka, Bangladesh
T. K. Chakraborty
Department of Electrical and Electronics Engineering, University of Asia Pacific, Dhaka,
Bangladesh
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_8
1 Introduction
The use of unmanned air vehicles or autonomous robots in place of human beings to
carry out dangerous missions in adverse environments has gradually increased over
the last decades. Path planning is one of the vital aspects of developing an
autonomous vehicle, which should traverse the shortest distance from a starting
point to a target point during a given mission in order to save its resources and
minimize the potential risks. Therefore, it is crucial for a path planning
algorithm to produce an optimal path. The path planning algorithm should also
satisfy the completeness criterion, which means that a path is found if one
exists. Moreover, the robot’s safety, the memory used for computation and
real-time operation of the algorithm are also significant [1–7]. Figure 1
illustrates the classification of path planning approaches.
Fig. 1 Classification of path planning approach [8]. Branches shown in the diagram:
Combinatorial (C-space representation: road map [visibility graph, Voronoi diagram],
potential field, cell decomposition; graph search algorithms: depth-first,
breadth-first, best-first, Dijkstra’s, A*, D*, M*); Sampling based (RRT, probability
roadmap); Biologically inspired (evolutionary algorithm [genetic algorithm,
differential evolution], swarm intelligence [particle swarm optimization, ant colony
optimization], simulated annealing, ecology based)
The bio-inspired methods are nature-motivated (biologically inspired)
algorithms. Examples of bio-inspired approaches are the Genetic Algorithm (GA),
Simulated Annealing (SA), Particle Swarm Optimization (PSO) and Ant Colony
Optimization (ACO). GA follows the natural selection process of biological
evolution, repeatedly modifying a population of individual solutions.
Nonetheless, it cannot assure an optimal path. Local minima may occur in narrow
environments; thus, it offers less safety and has difficulty with narrow
corridors. GA is computationally costly and, ultimately, it is not complete [8].
The SA algorithm is based on the heating and cooling process that metals undergo
to adjust their internal structure. Apart from being very slow and having a very
costly objective function, SA is not able to guarantee the optimal path [9–15].
PSO is a meta-heuristic population-based approach that can produce results in
real time, but it easily falls into local optima in many optimization problems.
Additionally, there is no general convergence theory applicable to PSO in
practice, and its convergence time is mostly uncertain for multidimensional
problems [16]. On the other hand, ACO emulates how an ant marks a path once a
food source has been found. The ant marks its route towards the food source with
pheromones for tracing purposes. In ACO, the path between the initial point and
the target point is generated randomly. ACO performs a blind exploration and is
therefore not suitable for efficient path planning, due to the lack of an optimal
result [13, 17].
In sampling-based path planning, the Rapidly Exploring Random Tree (RRT) method
does not require the configuration space to be constructed explicitly. In RRT,
the first step is to define the starting and the target points. Then, the
starting point is taken as the root of the tree, from which new branches are
grown until the tree reaches the target point [10, 11]. RRT is a simple way to
handle problems with obstacles and different constraints in
autonomous/unmanned robotic motion planning. The computation time increases with
the size of the generated tree. The path produced by RRT is not always optimal.
Nonetheless, it remains fairly easy to find a path for a vehicle with dynamic
and physical constraints, and it also creates the least number of edges [18, 19].
The probabilistic roadmap (PRM) method is a path-planning algorithm that takes
random samples from the configuration space, checking whether they lie in the
accessible free space and avoiding collisions in order to find a way. A local
planner is used to join these configurations with nearby configurations. PRM is
costly and offers no guarantee that a path will be acquired [18, 19].
Combinatorial path planning consists mainly of two steps, i.e. the C-space
representation technique and the graph search algorithm. In this case, the first
step is to create the configuration space of the environment. Then, a graph
search algorithm, for example Dijkstra’s or A-star (A*), is applied to search
for a path [7, 20].
Depth-first search (DFS) is good at picking one path among many possibilities
when the exact path does not matter. It may be less appropriate when there is
only one solution. DFS is good because a solution can be found without computing
all nodes [7]. Breadth-first search, which is suitable when few solutions are
available, uses a comparatively small number of steps. Its notable property is
that it finds the shortest path from the source node to each node it visits,
provided that all the graph’s edges are either unweighted or of equal weight.
Breadth-first search is complete, i.e. it finds a path if one exists, and it does
not get trapped in dead ends [21]. Best-first search, in contrast, does not
assure discovery of the shortest path because it bypasses some branches in the
search tree; it is a greedy search which is neither complete nor optimal.
Dijkstra’s algorithm is a systematic search algorithm that gives the shortest
path between two nodes. It suits cases where there is no prior knowledge of the
graph, so that the distance between each node and the target cannot be
estimated. Dijkstra’s usually covers a large area of the graph because it
selects the minimum-cost edge at every step; thus, it is useful in situations
with multiple target nodes and no prior knowledge of which is the closest [22].
A* is less efficient in such situations because it needs to be executed once for
each target node to reach them all. A* expands a node only if it seems
promising. It aims only to reach the target from the current node as early as
possible and does not attempt to reach any other node. A* is complete because it
always finds a path if one exists. By modifying the heuristics and node
evaluation tactics of A*, other path-finding algorithms can be developed [23].
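To make the role of the heuristic concrete, the following is a minimal sketch of A* on a small occupancy grid. The 4-connected neighbourhood, unit step cost, Manhattan-distance heuristic and the example grid are all illustrative assumptions and are not taken from any of the reviewed works.

```python
import heapq

def astar(grid, start, goal):
    """A* on a 4-connected grid; '#' marks an occupied cell, '.' a free cell.

    Returns the list of (row, col) cells from start to goal, or None if no path exists.
    """
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance never overestimates the cost on a 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    came_from = {start: None}
    cost = {start: 0}
    while open_heap:
        _, g, current = heapq.heappop(open_heap)
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        if g > cost[current]:
            continue  # stale heap entry; a cheaper route to this cell was found
        r, c = current
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != '#':
                new_cost = g + 1
                if new_cost < cost.get((nr, nc), float('inf')):
                    cost[(nr, nc)] = new_cost
                    came_from[(nr, nc)] = current
                    heapq.heappush(
                        open_heap, (new_cost + heuristic((nr, nc)), new_cost, (nr, nc))
                    )
    return None

grid = [
    "....",
    ".##.",
    "....",
]
path = astar(grid, (0, 0), (2, 3))
```

Setting the heuristic to zero for every cell turns this sketch into Dijkstra's algorithm, which is one example of how changing the node evaluation yields a different planner.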
Configuration space (C-space) gives complete information about the location of
all points in the system; it is the space of all configurations, i.e. the real
free space available for the motion of the autonomous vehicle, and it guarantees
that the vehicle does not collide with obstacles. An illustration of a C-space
for a circular vehicle is shown in Fig. 2. The robot is treated as a point and
the area of the obstacles is enlarged so that planning can be done more
efficiently. C-space is obtained by adding the vehicle radius while sliding the
vehicle along the edges of the obstacles and the border of the search space. In
Fig. 2(a), the obstacle-free area is represented by the white region inside the
closed area.
Fig. 2 A scenario represented in (a) its original form and (b) configuration
space. Note that the darker rectangles in (a) are those with actual dimensions,
while those in (b) are enlarged according to the size of robot A. The white areas
represent free space
The robot in Fig. 2(a) is represented by A. When the workspace is represented as
C-space, as shown in Fig. 2(b), the free space is reduced while the obstacles’
area is inflated. Hence, C-space indicates the real free space region for the
motion of the autonomous or unmanned vehicle, and it assures that the vehicle
does not collide with the obstacles.
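The obstacle inflation described above can be sketched on a discretized workspace. The example below assumes a boolean occupancy grid and a circular robot whose radius is given in cell units; it inflates obstacles only and, as a simplification, does not shrink the border of the search space. The function name and the 5 × 5 grid are hypothetical.

```python
def inflate_obstacles(grid, radius):
    """Mark every cell within `radius` (in cells) of an occupied cell as occupied.

    `grid` is a list of lists of booleans (True = occupied). The robot can then be
    treated as a point when planning on the returned grid.
    """
    rows, cols = len(grid), len(grid[0])
    reach = int(radius)
    inflated = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c]:
                continue
            for dr in range(-reach, reach + 1):
                for dc in range(-reach, reach + 1):
                    nr, nc = r + dr, c + dc
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    # Euclidean test so that the inflation is circular.
                    if inside and dr * dr + dc * dc <= radius * radius:
                        inflated[nr][nc] = True
    return inflated

# A 5 x 5 workspace with a single occupied cell in the centre.
workspace = [[False] * 5 for _ in range(5)]
workspace[2][2] = True
c_space = inflate_obstacles(workspace, radius=1)
occupied_cells = sum(cell for row in c_space for cell in row)
```

With a radius of one cell, the single obstacle grows to five occupied cells, illustrating how the free space shrinks as the obstacles are enlarged.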
2 Cell Decomposition (CD) Method
Cell decomposition (CD) is a very useful method, especially in outdoor
environments. In CD, C-space is first divided into simple, connected regions
called cells. The cells may be rectangular or polygonal, and they are discrete,
non-overlapping and contiguous to each other. If a cell contains an obstacle, it
is identified as occupied; otherwise it is obstacle free. A connectivity graph is
then constructed to link the adjacent cells [42]. There are several variations
of CD, including Regular Grid (RG), Adaptive Cell Decomposition (ACD) and Exact
Cell Decomposition (ECD) [22]. The classification of CD is shown in Fig. 3.
Fig. 3 Classification of cell decomposition method (Regular Grid, Adaptive Cell
Decomposition, Exact Cell Decomposition)
2.1 Regular Grid (RG)
The regular grid (RG) technique was introduced by Brooks and Lozano-Perez [24] to find a collision-free path for an object moving among cluttered obstacles. In general, RG is constructed by laying a regular grid over the configuration space. As the shape and size of the cells in the grid are predefined, RG is easy to apply. RG essentially samples the domain and then marks up the graph to indicate whether each cell is occupied, unoccupied or partially occupied.
A cell is marked as an obstacle if an object or part of it occupies the cell; otherwise it remains free space. A node is placed at the centre of every free cell within the C-space. A connectivity graph is then constructed from all the nodes. Path planning using RG is illustrated in Fig. 4. The path connecting the starting point and the target point is shown by a solid yellow line.
The RG method is popular because it is flexible and very easy to apply to a C-space. The computation time can be reduced by increasing the cell size. Conversely, the cell size can be made smaller to provide more detailed information and completeness.
Although RG is easy to apply, it has some drawbacks. Firstly, it suffers from digitization bias, whereby an obstacle much smaller than the cell dimension causes the whole grid square to be marked as filled or occupied. Consequently, a traversable space may be considered impenetrable by the planner. This scenario is illustrated in Fig. 4(b). Furthermore, if the cell is too big (i.e. the grid resolution is too coarse), the planner may not be complete.
Fig. 4 (a) Configuration space obstacles (b) Obstacles represented by the regular grid technique. Note that the drivable area is considered impenetrable
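The RG procedure and its digitization bias can be illustrated with a short Python sketch. It assumes rectangular obstacles given as (xmin, ymin, xmax, ymax), a 4-connected connectivity graph and breadth-first search as the graph search; these choices and the function names are assumptions made for illustration, not details taken from the cited works.

```python
from collections import deque


def occupied_cells(obstacles, nx, ny, cell):
    """Mark a cell as occupied if any part of an obstacle rectangle
    (xmin, ymin, xmax, ymax) overlaps it, however small the overlap."""
    occ = set()
    for ix in range(nx):
        for iy in range(ny):
            cx0, cy0 = ix * cell, iy * cell
            cx1, cy1 = cx0 + cell, cy0 + cell
            for (x0, y0, x1, y1) in obstacles:
                if x0 < cx1 and x1 > cx0 and y0 < cy1 and y1 > cy0:
                    occ.add((ix, iy))
                    break
    return occ


def grid_path(start, goal, occ, nx, ny):
    """Breadth-first search over 4-connected free cells.
    Returns a list of cell indices, or None if no path is found."""
    if start in occ or goal in occ:
        return None
    parent = {start: None}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == goal:
            path = []
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            return path[::-1]
        x, y = cur
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nxt[0] < nx and 0 <= nxt[1] < ny
                    and nxt not in occ and nxt not in parent):
                parent[nxt] = cur
                queue.append(nxt)
    return None


# A thin wall (0.2 units wide) blocks two whole unit cells of a 3 x 3 grid.
wall = [(1.4, 0.0, 1.6, 1.9)]
occ = occupied_cells(wall, nx=3, ny=3, cell=1.0)
path = grid_path((0, 0), (2, 0), occ, nx=3, ny=3)
```

With unit cells the planner must detour around the two blocked cells even though the wall itself is narrow. If the same space is covered by a single 3-unit cell, that cell is marked occupied and no path is found at all, which is the loss of completeness noted above for coarse grids.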
2.2 Adaptive Cell Decomposition (ACD)
Unlike RG, adaptive cell decomposition (ACD) is built using a quad-tree. The cells of a quad-tree are identified as free cells, which contain no obstacles; obstacle cells, which are occupied; or mixed cells, which contain both free space and obstacles. The mixed cells are recursively sub-divided into four identical sub-cells until the resulting smaller cells contain no obstacle region or the smallest cell size is reached [25].
ACD maintains as much detail as possible while keeping the regular shape of the cells. It also removes the digitization bias of RG. An ACD representation employed for path planning is depicted in Fig. 5. The collision-free path that connects the starting point (Start) and the target point (Goal) is depicted by a solid yellow line.
Fig. 5 Path planning using quad-tree
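The recursive subdivision can be sketched as follows. This is a minimal Python illustration under stated assumptions: obstacles are axis-aligned rectangles (xmin, ymin, xmax, ymax), a cell counts as occupied only when it lies entirely inside a single obstacle, and mixed cells that reach the minimum size are conservatively treated as occupied. The function names are hypothetical.

```python
def classify(cell, obstacles):
    """Label a square cell (x, y, size) as 'free', 'occupied' or 'mixed'."""
    x, y, s = cell
    overlaps = False
    for (x0, y0, x1, y1) in obstacles:
        if x0 <= x and y0 <= y and x + s <= x1 and y + s <= y1:
            return "occupied"  # cell lies entirely inside one obstacle
        if x0 < x + s and x1 > x and y0 < y + s and y1 > y:
            overlaps = True
    return "mixed" if overlaps else "free"


def quadtree(cell, obstacles, min_size):
    """Recursively split mixed cells into four equal sub-cells.
    Returns the leaf cells as a list of ((x, y, size), label) pairs."""
    label = classify(cell, obstacles)
    if label != "mixed":
        return [(cell, label)]
    x, y, s = cell
    if s <= min_size:
        # Assumption: the smallest mixed cells are treated as occupied.
        return [(cell, "occupied")]
    h = s / 2
    leaves = []
    for child in ((x, y, h), (x + h, y, h), (x, y + h, h), (x + h, y + h, h)):
        leaves.extend(quadtree(child, obstacles, min_size))
    return leaves


# An 8 x 8 region with one 2 x 2 obstacle in the lower-left corner.
leaves = quadtree((0, 0, 8), [(0, 0, 2, 2)], min_size=1)
```

In this example the quad-tree ends up with seven leaf cells, whereas a regular grid with the same 2-unit resolution near the obstacle would need sixteen cells to cover the region.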
2.3 Exact Cell Decomposition
Another variant of CD is the Exact Cell Decomposition (ECD) method, which uses two-dimensional cells to resolve certain problems associated with regular grids. The sizes of the cells are not predetermined; instead, they are decided based on the location and shape of the obstacles in the C-space [26]. The cell boundaries are determined exactly by the boundaries of the C-space, and the union of the cells constitutes the free space. Therefore, ECD is complete, i.e. it always finds a path if one exists. ECD is shown in Fig. 6. The path connecting the starting (Start) and target (Goal) points is shown as a solid yellow line.
Opposite angle-based exact cell decomposition has been proposed for the mobile robot path planning problem with curvilinear obstacles, to obtain a more natural, collision-free and efficient path [27].
Fig. 6 Path planning using exact cell decomposition
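One simple way to obtain cells whose union is exactly the free space is a vertical (slab) decomposition. The Python sketch below is an illustration under assumptions, not the method of the cited works: obstacles are axis-aligned rectangles (xmin, ymin, xmax, ymax) lying inside the bounds, and the function name is hypothetical.

```python
def exact_cells(bounds, obstacles):
    """Decompose the free space into rectangular cells (xmin, ymin, xmax, ymax)."""
    bx0, by0, bx1, by1 = bounds
    # Slab boundaries: the border plus every vertical obstacle edge.
    xs = sorted({bx0, bx1} | {x for (x0, _, x1, _) in obstacles for x in (x0, x1)})
    cells = []
    for sx0, sx1 in zip(xs, xs[1:]):
        # Obstacles that span this slab, ordered from bottom to top.
        blockers = sorted((y0, y1) for (x0, y0, x1, y1) in obstacles
                          if x0 <= sx0 and x1 >= sx1)
        y = by0
        for (y0, y1) in blockers:
            if y0 > y:
                cells.append((sx0, y, sx1, y0))  # free gap below this obstacle
            y = max(y, y1)
        if y < by1:
            cells.append((sx0, y, sx1, by1))  # free gap up to the top border
    return cells


# A 10 x 10 search space with one 2 x 2 obstacle in the middle.
cells = exact_cells((0, 0, 10, 10), [(4, 4, 6, 6)])
```

Here the cell boundaries follow the obstacle edges, so the four resulting cells cover the free space with no digitization error; a connectivity graph would then link cells that share an edge.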
To date, many researchers have used cell decomposition-based methods to solve path planning problems. In [28], three new formulations were proposed to construct a piecewise linear path for an unmanned/autonomous vehicle when a cell decomposition planning method is used. In [29], trajectories were obtained via path planning algorithms by varying the cell decomposition involved, the graph weights, and the technique used to calculate the waypoints. A combined algorithm based on cell decomposition and a fuzzy algorithm was developed to create a map of the robot's path [30]. In [31], an optimal trajectory generation framework was proposed in which the global obstacle-avoidance problem is decomposed into simpler sub-problems, each corresponding to a distinct path homotopy. This led to a technique that uses existing cell-decomposition methods to enumerate and represent the local trajectory generation problems so that they can be solved efficiently and autonomously.
Parsons and Canny [32] used a cell decomposition-based algorithm for path planning of multiple mobile robots sharing the same workspace. The algorithm computed a path for each robot that avoided both the obstacles and the other robots. It was based on the idea of a product operation defined on the cells in a decomposition of a 2D free space. However, the algorithm was only useful when the obstacle set changed infrequently. Chen et al. [8] introduced the framed quad-tree to create a map for finding a conditional shortest path through a new environment in real time. The conditional shortest path is the shortest among all possible paths based on the known environmental information. The path was found using a propagated circular path planning wave based on a graph search algorithm [33]. Jun and D’Andrea [34] used an approximate cell decomposition-based method to accomplish a robot path planning task. The approach used initial information on the locations and shapes of the obstacles. The method decomposed the region into uniform cells and updated the probability values when unexpected changes were detected during the mission. A search algorithm was used to find the shortest path. One drawback of this method is that, if a penalty is imposed on accelerations and decelerations, the graph becomes a tree that expands exponentially with the number of cells, making the search very slow.
Lingelbach [35] applied the so-called Probabilistic Cell Decomposition (PCD) method to path planning in a high-dimensional static C-space because of its easy scalability. Experimental results showed that the performance of PCD was acceptable in a range of rigid-body path planning problems, such as maze-like problems and chain-like robotic platforms. However, PCD performed poorly when the free space was small compared with the C-space. Zhang et al. [36] used ACD to subdivide the C-space into cells for robot path planning. Localised roadmaps were then computed by generating samples within these cells. Since the complexity of ACD increases with the number of degrees of freedom (DOF) of the robot, it is not practical for robots with a higher DOF. Arney [37] implemented an ACD path planning approach in which efficiency was attained using a method from Geographic Information Systems (GIS) known as tesseral addressing. During the decomposition process, each cell was labelled with an address that defined the cell's size, position and neighbours' addresses. The planner had a priori information about the environment, and the generated path had an optimal distance from the present location of the unmanned/autonomous vehicle to the target location. It is suitable for real-time path planning applications.
3 Discussion on Different Cell Decomposition Methods
The benefit of CD is that it guarantees finding a collision-free path, if one exists, and is controllable. It is therefore a complete algorithm, and an unmanned or autonomous vehicle can travel the path without the risk of encountering local minima [38]. The shortcoming of CD is that, if the cells are too coarse, the shortest path length cannot be achieved; conversely, if the cells are too fine, computation becomes more time-consuming [1, 39, 40]. The CD approach also does not perform acceptably in dynamic and real-time situations [10, 38, 39]. CD must be adjusted to the situation as necessary; e.g. in exact CD, the cells are not predefined but are selected based on the location and shape of the obstacles inside the C-space [41].
Table 1 Comparison of different cell decomposition methods
Methods listed: CD, RG, ACD, ECD. Criteria: optimal path, computational time, real time, memory, safety, completeness.
Although RG is easy to apply, the planner may not be complete if the cells are too big, i.e. finding a path where one exists is not guaranteed. If an obstacle is significantly smaller than the cell size, the entire grid square is still marked as occupied. One more drawback of RG is that it represents the C-space inefficiently, as many same-sized cells are required to fill the empty space in sparse areas. As a result, planning is costly because more cells are handled than are actually required.
The outcome of ACD is a map of grid cells of different sizes, with the cell boundaries concentrated so as to match the obstacle boundaries closely. It produces fewer cells, so the C-space is used more efficiently and hence less memory and processing time are required. ACD retains maximum detail while keeping the regular shape of the cells.
ECD is complete. Still, the paths generated via ECD are not optimal in path
length. There is no simple rule to decompose a space into cells. This method is not
suitable to apply in outdoor environments where obstacles are often poorly defined
and of irregular shape (Table 1).
4 Conclusion
The results of earlier research on several cell decomposition path planning algorithms are compared in this study, with emphasis on the nature of motion, and the advantages and drawbacks of these algorithms are discussed. An algorithm can be regarded as an efficient path planning algorithm when it can compute a complete, optimal, energy-efficient, collision-free path with the lowest computation time. Since none of the algorithms satisfies all the criteria, the optimization of energy-efficient path planning depends on the properties of the algorithm used, such as completeness and computation time, and on the key requirements of the vehicle's mission and its objective. For example, RG path planning is expensive but easy to apply. ACD has the adaptive quality, and ECD is complete but not suitable for outdoor environments.
Acknowledgements The authors would like to thank Universiti Tun Hussein Onn Malaysia (UTHM) and the Research Management Center (RMC) for funding support under TIER-1 VOT H131.
References
1. Omar R (2012) Path planning for unmanned aerial vehicles using visibility line based
methods. PhD diss., University of Leicester
2. Debnath SK, Omar R, Latip NBA (2019) A review on energy efficient path planning
algorithms for unmanned air vehicles. Computational science and technology. Springer,
Singapore, pp 523–532
3. Ganeshmurthy MS, Suresh GR (2015) Path planning algorithm for autonomous mobile robot
in dynamic environment. In: 2015 3rd international conference on signal processing,
communication and networking (ICSCN). IEEE
4. Nguyet T, Duy-Tung N, Duc-Lung V, Nguyen-Vu T (2013) Global path planning for
autonomous robots using modified visibility graph, vol 13. IEEE, pp 317–321
5. Latip NBA, Omar R, Debnath SK (2017) Optimal path planning using equilateral spaces
oriented visibility graph method. Int J Electr Comput Eng 7(6):3046
6. Chen P, et al (2013) Research of path planning method based on the improved Voronoi
diagram. In: 2013 25th Chinese Control and Decision Conference (CCDC). IEEE
7. Omar R, Da-Wei G (2009) Visibility line based methods for UAV path planning. In:
ICCAS-SICE, 2009. IEEE
8. Cho K, et al (2017) Cost-aware path planning under co-safe temporal logic specifications.
IEEE Robotics and Automation Letters 2(4)
9. Li G, et al (2012) An efficient improved artificial potential field based regression search
method for robot path planning. In: 2012 International Conference on Mechatronics and
Automation (ICMA). IEEE
10. Abbadi A, Matousek R (2014) Path planning implementation using MATLAB. In: Technical
computing Bratislava, pp 1–5
11. Adiyatov O, Huseyin AV (2013) Rapidly-exploring random tree based memory efficient motion
planning. In: 2013 IEEE International Conference on Mechatronics and Automation (ICMA).
IEEE
12. Achour N, Chaalal M (2011) Mobile robots path planning using genetic algorithms. In: the
seventh international conference on autonomic and autonomous systems, Baker (ICAS 2011),
pp 111–115
13. Hsu, C-C, Wang W-Y, Chien Y-H, Hou R-Y, Tao C-W (2016) FPGA implementation of
improved ant colony optimization algorithm for path planning. In: 2016 IEEE Congress on
Evolutionary Computation (CEC). IEEE, pp 4516–4521
14. Goyal JK, Nagla KS (2014) A new approach of path planning for mobile robots. In:
international conference on advances in computing, communications and informatics
(ICACCI 2014). IEEE, pp 863–867
15. Gomez EJ, Santa FM, Sarmiento FH (2013) A comparative study of geometric path planning
methods for a mobile robot: potential field and Voronoi diagrams. In: 2013 II International
Congress of Engineering Mechatronics and Automation (CIIMA), 23 October. IEEE, pp 1–6
16. Eberhart R, Kennedy J (1995) A new optimizer using particle swarm theory. In: proceedings
of the sixth international symposium on micro machine and human science (MHS 1995), 4
October 1995. IEEE, pp 39–43
17. Shaogang Z, Ming L (2010) Path planning of inspection robot based on ant colony
optimization algorithm. In: 2010 International Conference Electrical and Control Engineering
(ICECE). IEEE, pp 1474–1477
18. Latombe JC (1999) Motion planning: a journey of robots, molecules, digital actors, and other
artifacts. Int J Robot Res 18(11):1119–1128
19. Marble JD, Bekris KE (2013) Asymptotically near-optimal planning with probabilistic
roadmap spanners. IEEE Trans Rob 29(2):432–444
20. LaValle SM (2006) Planning algorithms. Cambridge University Press
21. Dudek G, Jenkin M (2000) Computational principles of mobile robotics. Cambridge
University Press, Cambridge
22. Mehlhorn K, Sanders P (2008) Algorithms and data structures: the basic toolbox (PDF).
Springer
23. Debnath SK, Omar R, Latip NBA, Shelyna S, Nadira E, Melor CKNCK, Chakraborty TK,
Natarajan E (2019) A review on graph search algorithms for optimal energy efficient path
planning for an unmanned air vehicle. Indonesian J Electr Eng Comput Sci 15(2):743–749
24. Brooks RA, Lozano-Perez T (1985) A subdivision algorithm in configuration space for
findpath with rotation. IEEE Trans Syst Man Cybern 2:224–233
25. Chen DZ, Szczerba RJ, Uhran JJ (1995) Planning conditional shortest paths through an
unknown environment: A framed-quadtree approach. In: Proceedings 1995 IEEE/RSJ
international conference on intelligent robots and systems. Human Robot Interaction and
Cooperative Robots. vol 3. IEEE
26. Debnath SK, Omar R, Latip NBA (2019) Comparison of different configuration space
representations for path planning under combinatorial method. Indonesian J Electr Eng
Comput Sci 1(1):401–408
27. Jung J-W et al (2019) Expanded douglas–peucker polygonal approximation and opposite
angle-based exact cell decomposition for path planning with curvilinear obstacles. Appl Sci 9
(4):638
28. Kloetzer M, Mahulea C, Gonzalez R (2015) Optimizing cell decomposition path planning for
mobile robots using different metrics. In: 2015 19th international conference on system
theory, control and computing (ICSTCC), IEEE pp 565–570
29. Gonzalez R, Kloetzer M, Mahulea C (2017) Comparative study of trajectories resulted from
cell decomposition path planning approaches. In: 2017 21st international conference on
system theory, control and computing (ICSTCC), IEEE, pp 49–54
30. Tunggal TP, Supriyanto A, Faishal I, Pambudi I (2016) Pursuit algorithm for robot trash can
based on fuzzy-cell decomposition. Int J Electr Comput Eng 6(6):2088–8708
31. Park J, Karumanchi S, Iagnemma K (2015) Homotopy-based divide-and-conquer strategy for
optimal trajectory planning via mixed-integer programming. IEEE Trans Rob 31(5):1101–
1115
32. Parsons D, Canny J (1990) A motion planner for multiple mobile robots. In: Proceedings,
IEEE international conference on robotics and automation. IEEE, pp 8–13
33. Chen DZ, Szczerba RJ, Uhran JJ (1995) Planning conditional shortest paths through an
unknown environment: A framed-quadtree approach. In: Proceedings 1995 IEEE/RSJ
international conference on intelligent robots and systems. Human Robot Interaction and
Cooperative Robots, vol 3. IEEE, pp 33–38
34. Jun M, D’Andrea R (2003) Path planning for unmanned aerial vehicles in uncertain and adversarial
environments. In: Cooperative control: models, applications and algorithms. Springer, Boston,
pp 95–110
35. Lingelbach F (2004) Path planning using probabilistic cell decomposition. In: IEEE
international conference on robotics and automation, 2004. Proceedings. ICRA 2004, vol 1.
IEEE, pp 467–472
36. Zhang X (1994) Cell decomposition in the affine weyl group wA ([Btilde] 4). Commun
Algebra 22(6):1955–1974
37. Timothy A (2007) An efficient solution to autonomous path planning by approximate cell
decomposition. In: 2007 third international conference on information and automation for
sustainability, IEEE, pp 88–93
38. Glavaški D, Volf M, Bonkovic M (2009) Robot motion planning using exact cell decomposition and
potential field methods. In: Proceedings of the 9th WSEAS international conference on
Simulation, modelling and optimization. World Scientific and Engineering Academy and
Society (WSEAS)
39. Gonzalez R, Mahulea C, Kloetzer M (2015) A Matlab-based interactive simulator for mobile
robotics. In: 2015 IEEE international conference on automation science and engineering
(CASE). IEEE, pp 310–315
40. Hoang VD, Hernandez DC, Hariyono J, Jo KH (2014) Global path planning for unmanned
ground vehicle based on road map images. In: 2014 7th international conference human
system interactions (HSI), IEEE, pp 82–87
41. Giesbrecht J (2004) Global path planning for unmanned ground vehicles.
No. DRDC-TM-2004-272. Defence Research and Development Suffield, Alberta
42. Omar R, Melor CK, Hailma CKNA (2015) Performance comparison of path planning
methods
Improved Potential Field Method
for Robot Path Planning with Path
Pruning
Elia Nadira Sabudin, Rosli Omar, Ariffudin Joret,
Asmarashid Ponniran, Muhammad Suhaimi Sulong,
Herdawatie Abdul Kadir, and Sanjoy Kumar Debnath
Abstract Path planning is vital for a robot deployed on a mission in a challenging environment with obstacles around. The robot needs to ensure that the mission is accomplished without colliding with any obstacles and to find an optimal path to reach the goal. Three important criteria, i.e., path length, computational complexity, and completeness, need to be taken into account when designing a path planning method. Artificial Potential Field (APF) is one of the best methods for path planning as it is fast, simple, and elegant. However, APF has a major problem called local minima, which can cause the robot to fail to reach the goal. This paper proposes an Improved Potential Field method to overcome this limitation of APF. Nevertheless, the path length produced by the Improved APF is not optimal. Therefore, a path pruning technique is proposed to shorten the path generated by the Improved APF. This paper also compares the path length and computational time of the Improved APF with and without path pruning. Simulation shows that the proposed technique can overcome the local minima problem and produces a relatively shorter path with fast computation time.
Keywords Path planning · Artificial Potential Field
E. N. Sabudin · R. Omar (✉) · A. Joret · A. Ponniran · H. A. Kadir · S. K. Debnath
Faculty of Electrical and Electronic Engineering, Universiti Tun Hussein Onn Malaysia,
Johor, Malaysia
e-mail: roslio@uthm.edu.my
A. Ponniran
Power Electronic Converters (PECs) Focus Group, Universiti Tun Hussein Onn Malaysia,
Johor, Malaysia
A. Joret · M. S. Sulong
Faculty of Technical and Vocational Education, Universiti Tun Hussein Onn Malaysia, Johor,
Malaysia
M. S. Sulong
Internet of Things (IOT) Focus Group, Universiti Tun Hussein Onn Malaysia, Johor,
Malaysia
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_9
1 Introduction
Path planning is one of the most critical issues in robotics research. Path planning in robotics is the process by which a robot computes a valid and feasible way to traverse from a start point to a goal point through a sequence of collision-free and safe motions in order to achieve a certain task in a given environment. The path taken must be free of collisions with surrounding obstacles and must also satisfy kinematic or dynamic conditions [1, 2]. In the path planning problem, the workspace of the robot and the obstacle geometry are described in 2D or 3D, while the motion is represented as a path in configuration space [3]. The structure of the environment must also be taken into account to ensure that the robot can achieve a defined mission. There are two types of environment for path planning, namely known and unknown.
As its name implies, a known environment is one in which all the information about the obstacles and the goal point is available. The robot moves based on this prescribed information. In an unknown environment, on the other hand, there is no prior knowledge of the environment, or only partial information is available. The robot needs to plan a path based on current information. An unknown environment may contain obstacles that move continuously, and dynamic obstacles may also appear spontaneously and randomly while the robot is performing its mission.
As previously mentioned, the aspects that need to be addressed in path planning are computation time, path length, and completeness. In a dynamic or uncertain environment, the path planning algorithm must have a low computation time to suit real-time applications. In addition, the robot should take the optimal path during the mission to save fuel and energy. The completeness criterion is satisfied if the path planning algorithm finds a path whenever one exists.
There are a few common techniques used in path planning problems, such as Cell Decomposition (CD), Visibility Graph (VG), Voronoi Diagram (VD), Probabilistic Roadmap (PRM) and Artificial Potential Field (APF). APF is a path planning method that is simple, highly safe, and elegant [4–6]. It uses simple mathematical equations that are well suited to real-time environments [7]. APF produces two types of forces, i.e., an attractive force and a repulsive force. The goal point generates the attractive force that pulls the robot towards it, while the obstacles produce a repulsive force that repels the robot from them. The movement of the robot thus depends on the resultant of the forces. However, local minima are the major drawback of APF. The robot will be trapped in a local minimum if the resultant force is zero. The problem of Goal Non-Reachable with Obstacle Nearby (GNRON) also arises if the robot falls into a local minimum. To solve these problems, this paper proposes an Improved Artificial Potential Field method. This technique reduces the limitations of the APF method and is also computationally tractable. To reduce the path length, path pruning is applied to the planned path.
2 Potential Field Method
Potential field (PF) is one of the most popular techniques for the path planning problem. The Artificial Potential Field (APF) method has been used by many researchers because of its simplicity, elegance, and high safety [3]. Khatib was the first to suggest this idea, in which the robot is regarded as a point under the influence of fields generated by the goals and obstacles in the search space [8]. APF plans a path based on two types of force, namely an attractive force produced by the goal and a repulsive force generated by the obstacles. The method can be applied in known scenarios and can also work in unknown environments despite changes and modifications. APF has several advantages: path planning can be implemented in a real-time environment owing to (1) its fast computation time and (2) its ability to generate a smooth path without any collision with obstacles. However, the method has major drawbacks, namely local minima, the goal non-reachable problem, and narrow passages [9, 10].
To address these problems, researchers have improved the potential field method. Mei and Arshad used a Balance-Artificial Potential Field method to solve the local minima and narrow passage problems, in addition to achieving heading and speed control of an ASV (Autonomous Surface Vessel) in a riverine environment [11]. Li et al. developed an efficient improved artificial potential field based regression search method for robot path planning, and an effective improved artificial potential field-based regression search method for autonomous mobile robot planning, which could generate a global sub-optimal/optimal path effectively and reduce the local minima and oscillation problems in a known environment without complete information [12, 13]. Sfeir et al. presented real-time mobile robot navigation in an unknown environment using an Improved APF approach that creates a smoother trajectory around the obstacles by integrating a rotational force [14]. This method successfully avoided the APF limitation caused by the Goal Non-Reachable when Obstacles are Nearby (GNRON) problem. In addition, Park et al. considered the potential field method (PFM) and vector field histogram (VFH) and, to overcome the PF limitations, developed a new obstacle avoidance method for mobile robots based on advanced fuzzy PFM (AFPFM) [15].
3 Path Planning Method
3.1 Field Function Based on Traditional APF
The attractive potential field at the goal, $V_g$, is represented as
$$V_g = K_g r_g \tag{1}$$
$$r_g = \mathrm{dist}(X, X_g) \tag{2}$$
where $K_g$ is a variable constant greater than zero, $X = (x, y)$ is the current position, $X_g = (x_g, y_g)$ is the goal position, and $r_g$ is the distance between the current robot position and the goal. Figure 1 shows an attractive potential field at the target. The attractive force pulls the robot towards the target [16].
The repulsive potential field at the obstacle, $V_o$, can be defined as
$$V_o = \frac{K_o}{r_o} \tag{3}$$
$$r_o = \mathrm{dist}(X, X_o) \tag{4}$$
where $K_o$ is a variable constant greater than zero that acts as the repulsive gain, $X_o = (x_o, y_o)$ is the obstacle position, and $r_o$ is the distance between the robot and the obstacle.
The repulsive potential field at the starting point, $V_r$, can be written as
$$V_r = \frac{K_r}{r_r} \tag{5}$$
$$r_r = \mathrm{dist}(X, X_r) \tag{6}$$
where $K_r$ is a variable constant equal to or greater than zero, $X = (x, y)$ is the current position and $X_r = (x_r, y_r)$ is the starting position.
Figure 2 illustrates a repulsive potential field [16]. The repulsive force pushes the robot away from the obstacles.
Fig. 1 The form of the general attractive potential field
Fig. 2 General repulsive potential field (the gradients point away from the obstacles)
Fig. 3 Negative gradient between target and obstacles
Therefore, the total potential field can be represented as in (7):
$$V_{total} = V_g + V_r + V_o \tag{7}$$
Figure 3 illustrates the total force of the potential field [16]. The resultant force of the fields is used to determine the direction of motion of the robot. In Fig. 4, the resultant potential is shown in a 3D view [17].
Fig. 4 (a) The attractive potential without obstacles (b) The repulsive potential, which sets the highest value at the obstacle (c) The whole potential, which combines the two forces to give the final potential field
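Equations (1) to (7) translate directly into a short function. The Python sketch below is only an illustration: the function names are hypothetical, summing the repulsive term over several point obstacles is an assumption (Eq. (3) is written for a single obstacle), and returning infinity at zero distance is a convenience chosen here to avoid division by zero.

```python
import math


def inverse_term(gain, distance):
    """Repulsive term K / r; zero gain switches the term off (assumption)."""
    if gain == 0:
        return 0.0
    return math.inf if distance == 0 else gain / distance


def total_potential(X, goal, start, obstacles, Kg, Kr, Ko):
    """V_total = V_g + V_r + V_o for a point robot at X = (x, y)."""
    V_g = Kg * math.dist(X, goal)                   # Eqs. (1)-(2): attraction to the goal
    V_r = inverse_term(Kr, math.dist(X, start))     # Eqs. (5)-(6): repulsion from the start
    V_o = sum(inverse_term(Ko, math.dist(X, obs))   # Eqs. (3)-(4), summed over obstacles
              for obs in obstacles)
    return V_g + V_r + V_o                          # Eq. (7)


# Example: robot at (3, 4), goal at the origin, start at (3, 0), one obstacle at (3, 9).
V = total_potential((3, 4), (0, 0), (3, 0), [(3, 9)], Kg=2, Kr=4, Ko=10)
```

In this example the three terms are 10, 1 and 2, so the total potential at (3, 4) is 13.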
3.2 Algorithm for Traditional Artificial Potential Field (APF)
In APF, there are two forces involved, namely the attractive force and the repulsive force. The traditional APF is unable to overcome the local minima problem, where the resultant force is zero. Figure 5 shows the flowchart of APF for robot path planning.
In particular, the APF algorithm starts with the initialization of variables such as the number of obstacles and the environment range. The current waypoint is initialised to the starting point, and the target point is defined. Subsequently, the total potential field is calculated. The robot then moves from the starting point in the direction of decreasing potential until it reaches the target point. If a local minimum is encountered while the robot is carrying out a mission to a target point, the robot will collide with obstacles or oscillate. The robot can reach the goal successfully only if no local minima are encountered during the mission.
Fig. 5 The traditional APF process for path planning
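The procedure of Sect. 3.2 can be sketched as a greedy descent over grid points. The Python fragment below is a minimal illustration under assumptions: the robot moves on an integer grid with eight neighbours, the potential follows the form of Eq. (7) with point obstacles, and all function names are hypothetical. It stops as soon as no neighbour has a lower potential, which is how a local minimum shows up in this setting.

```python
import math

NEIGHBOURS = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]


def make_potential(goal, start, obstacles, Kg, Kr, Ko):
    """Build V_total(p) in the form of Eq. (7); 1/r terms are infinite at r = 0."""
    def inv(k, r):
        return math.inf if r == 0 else k / r

    def V(p):
        return (Kg * math.dist(p, goal) + inv(Kr, math.dist(p, start))
                + sum(inv(Ko, math.dist(p, o)) for o in obstacles))
    return V


def apf_descent(potential, start, goal, max_steps=1000):
    """Follow decreasing potential on an integer grid.
    Returns (path, reached); reached is False if the robot gets stuck."""
    path = [start]
    current = start
    for _ in range(max_steps):
        if current == goal:
            return path, True
        x, y = current
        best = min(((x + dx, y + dy) for dx, dy in NEIGHBOURS), key=potential)
        if potential(best) >= potential(current):
            return path, False  # local minimum: no neighbour is lower
        current = best
        path.append(current)
    return path, current == goal


# Obstacle-free example: the robot descends straight to the goal.
V_free = make_potential(goal=(5, 5), start=(0, 0), obstacles=[], Kg=1.0, Kr=1.0, Ko=1.0)
free_path, free_reached = apf_descent(V_free, (0, 0), (5, 5))
```

The descent function accepts any callable potential, so the same loop can be used to see where a given field traps the robot.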
3.3 Improved APF Method
3.3.1 Background
The attractive gain at the goal, $K_g$, is determined by the diagonal distance of the search space:
$$K_g = \sqrt{(\mathrm{distx})^2 + (\mathrm{disty})^2} \tag{8}$$
where distx represents the distance of the search space along the x-axis, while disty is that of the search space along the y-axis.
On the other hand, the repulsive gain at the obstacle, $K_o$, is written as
$$K_o = \sqrt{\frac{(\mathrm{distx})^2 + (\mathrm{disty})^2}{ax + b}} \tag{9}$$
where $a$, $x$ and $b$ are the parameters of a line segment in (9). $K_o$ is defined based on the environmental factor (diagonal distance) and the number of obstacles.
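As a quick numerical check of Eq. (8), suppose the search space spans 200 units along each axis. This extent is an assumption made here for illustration (for instance, a range from −R to R with R = 100); it is not stated explicitly in this section. The attractive gain then evaluates to:

```latex
K_g = \sqrt{(200)^2 + (200)^2} = \sqrt{80000} = 200\sqrt{2} \approx 282.843
```

This agrees with the value of $K_g$ reported in the simulation settings of Sect. 4. The parameters $a$ and $b$ of Eq. (9) are not given, so $K_o$ cannot be checked in the same way.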
3.3.2 Algorithm of Improved APF Method
The proposed Improved APF algorithm is shown in Fig. 6. From its initial point, the next position of the robot is selected by identifying the lowest of the eight surrounding points generated by the potential field. Once the lowest point has been selected, the robot moves to that point. If the selected point is a local minimum, the robot identifies and selects the point with the second-lowest potential value. The robot moves to that point and removes the point where the local minimum occurs. This process continues until the robot reaches the target.
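One possible reading of the steps in Sect. 3.3.2 is sketched below in Python. It is an interpretation rather than the authors' implementation: the robot moves on an integer grid, "removing" a point is modelled by a set of points that may no longer be selected, and a point is treated as a local minimum when none of its remaining neighbours has a lower potential. The test field (distance to the goal plus a large penalty on a short wall) and all names are hypothetical.

```python
import math

NEIGHBOURS = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]


def improved_apf(potential, start, goal, max_steps=1000):
    """Greedy descent that escapes local minima by removing them.
    Returns (path, reached)."""
    path = [start]
    current = start
    removed = set()  # points where a local minimum occurred
    for _ in range(max_steps):
        if current == goal:
            return path, True
        x, y = current
        candidates = [(x + dx, y + dy) for dx, dy in NEIGHBOURS
                      if (x + dx, y + dy) not in removed]
        if not candidates:
            return path, False
        best = min(candidates, key=potential)
        if potential(best) >= potential(current):
            # The current point is a local minimum: remove it so that it is
            # never selected again, then move to the lowest remaining point.
            removed.add(current)
        current = best
        path.append(current)
    return path, current == goal


# Hypothetical test field: a three-cell wall placed between start and goal.
goal = (4, 0)
wall = {(2, -1), (2, 0), (2, 1)}


def field(p):
    return math.hypot(p[0] - goal[0], p[1] - goal[1]) + (100.0 if p in wall else 0.0)


path, reached = improved_apf(field, (0, 0), goal)
```

In this field a plain descent halts in front of the wall, whereas the sketch above removes the trapping point, steps sideways and then continues downhill around the wall. The `max_steps` guard is kept because this reading does not by itself guarantee termination in every field.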
3.4 APF with Pruning Path
The main aim of the improved APF is to solve the local minima, oscillation, and goal non-reachable problems. However, the path generated by the APF is not optimal in length. In addition, to ensure that the mission of the robot can be carried out successfully, other factors such as energy saving need to be taken into account. This can be achieved if the path is shortened. Therefore, a technique known as path pruning is applied to address this issue.
Debnath et al. mentioned that APF is effective in finding a shorter path [18]. Omar et al. proposed path pruning for the path planning problem using the probabilistic roadmap (PRM) to produce a path with a shorter length [19]. Li et al. proposed the Efficient Improved Artificial Potential Field Based Simultaneous Forward Search (Improved APF-based SIFORS) method for robot path planning, which redefines the potential function to calculate a valid path and consequently shortens the planned path [20]. Lifen et al. improved the APF by changing the repulsive potential function, which helped the UAV to avoid collisions with obstacles effectively and to find the optimal path [6].
Fig. 6 Proposed Improved APF method that solves the limitations of the potential field method
3.5 Algorithm for Improved APF Method with Path Pruning
In this paper, a path pruning technique is used to shorten the existing path. The
flowchart in Fig. 7 illustrates the process of pathfinding using the Improved APF
with path pruning.
Let the path W consist of the waypoints {P_i, P_{i+1}, P_{i+2}, ..., P_n}, where P_i is
the starting point and P_n is the target point. The path pruning process starts by
checking whether there is any obstacle between waypoints P_i and P_{i+1}. If no
obstacle is detected between P_i and P_{i+1}, P_{i+1} is eliminated and the check
proceeds between P_i and P_{i+2}. Otherwise, P_{i+1} is retained as one of the
waypoints of W, and the process continues from P_{i+1}. The process proceeds
until P_i = P_n.
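The pruning procedure can be illustrated with a short sketch. It is not the authors' code: circular obstacles given as (cx, cy, radius), and the names `segment_clear`, `prune_path` and `path_length`, are assumptions made for the example. The sketch uses the common line-of-sight formulation in which the last waypoint still visible from the current anchor is the one retained, so that every segment of the pruned path is obstacle-free.

```python
import math

Point = tuple[float, float]
Circle = tuple[float, float, float]  # (centre x, centre y, radius), assumed obstacle model


def segment_clear(a: Point, b: Point, obstacles: list[Circle]) -> bool:
    """Return True if the straight segment a-b touches no obstacle."""
    abx, aby = b[0] - a[0], b[1] - a[1]
    length_sq = abx * abx + aby * aby
    for cx, cy, r in obstacles:
        if length_sq == 0.0:
            t = 0.0
        else:
            # Parameter of the point on the segment closest to the centre.
            t = ((cx - a[0]) * abx + (cy - a[1]) * aby) / length_sq
            t = max(0.0, min(1.0, t))
        px, py = a[0] + t * abx, a[1] + t * aby
        if math.hypot(cx - px, cy - py) <= r:
            return False
    return True


def prune_path(waypoints: list[Point], obstacles: list[Circle]) -> list[Point]:
    """Drop intermediate waypoints that can be bypassed in a straight line."""
    if len(waypoints) < 3:
        return list(waypoints)
    pruned = [waypoints[0]]
    last = len(waypoints) - 1
    i = 0
    while i < last:
        j = i + 1
        # Extend the straight segment from P_i as far as it stays clear.
        while j < last and segment_clear(waypoints[i], waypoints[j + 1], obstacles):
            j += 1
        pruned.append(waypoints[j])
        i = j
    return pruned


def path_length(points: list[Point]) -> float:
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))
```

For example, a five-point detour around a single circular obstacle collapses to three waypoints, and `path_length` can be used to compare the path before and after pruning.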
Fig. 7 Algorithm of path pruning based on improved APF
4 Simulation Results and Discussion
Simulation of the proposed algorithm has been carried out using MATLAB R2016a
on a PC with an Intel i5-4200U 1.6 GHz CPU and Windows 10 OS. The range of the
environment, R, is set to 100 units, and the number of obstacles, O, is varied from
25 to 125. The coefficients Kg and Ko for calculating the attractive and repulsive
forces are set based on Eqs. (8) and (9), giving 282.843 and 15.687, respectively.
The performance of the proposed algorithm is evaluated in terms of:
i- Local minima
ii- Path length
iii- Computational time
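As a quick check of Eq. (8), the reported attractive gain can be reproduced numerically. The extents of 200 units per axis used below are an inference from the reported value of 282.843, not something the text states (it gives only the range R = 100), so they should be read as an assumption; the function name is hypothetical.

```python
import math


def attractive_gain(dist_x: float, dist_y: float) -> float:
    """Eq. (8): the gain equals the diagonal of the search space."""
    return math.hypot(dist_x, dist_y)


# Assumed search-space extents; chosen because they reproduce the reported Kg.
k_g = attractive_gain(200.0, 200.0)
print(round(k_g, 3))  # 282.843
```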
Figure 8 compares the simulation results of the traditional APF (blue line) and the
Improved APF (magenta line). As can be seen from the scenario in Fig. 8(a), the
Improved APF manages to overcome the local minima problem, and the robot reaches
the goal. The red dots mark the areas of local minima that have been addressed
successfully. Figure 8(b) illustrates the 3D representation of the scenario. The
altitude of the waypoints is plotted in Fig. 8(c), where the robot moves from the
highest value (initial point) to the lowest value (target point).
The resulting paths for different numbers of obstacles, i.e., 25, 50, 75, 100, and
125, are shown in Fig. 9(a)–(e), respectively. In each subplot, the magenta lines
show the paths planned by the Improved APF, and the blue lines represent the pruned
paths. The results show that the algorithm manages to address the local minima,
oscillation, GNRON, and narrow passage problems. In addition, the resulting paths
are shorter owing to the application of the path pruning technique.
Fig. 8 Comparison between the traditional APF (blue line) and Improved APF (magenta line)
simulation results: (a) Improved APF overcomes the local minima problem, (b) 3D representation and
(c) robot movement waypoints
Fig. 9 Paths generated by the Improved APF (magenta lines) and the pruned paths (blue lines) for
different numbers of obstacles: (a) 25 obstacles, (b) 50 obstacles, (c) 75 obstacles, (d) 100 obstacles
and (e) 125 obstacles
The computational time and path length of the proposed algorithm are summarized in
Table 1. Overall, the simulation results show that both the path length and the
computational time in each scenario are longer if local minima occur.

Table 1 The performance of Improved APF and pruning path

| Number of obstacles | Path length of Improved APF (unit) | Pruned path length (unit) | Computation time of Improved APF (s) | Computation time of pruned path (s) |
|---|---|---|---|---|
| 25 | 193.807 | 153.270 | 14.127 | 0.323 |
| 50 | 208.056 | 143.532 | 17.811 | 0.431 |
| 75 | 431.686 | 187.355 | 48.695 | 1.674 |
| 100 | 257.863 | 160.987 | 27.771 | 0.971 |
| 125 | 274.967 | 172.536 | 32.419 | 0.892 |

Regarding the performance of the Improved APF, the generated path is relatively
long when local minima are encountered. For 25 and 50 obstacles, there are no
local minima. For 75 obstacles in the environment, the generated path is relatively
long due to the local minima problems (red dots). The robot removes the previous
waypoints to avoid revisiting the local minimum point, and then needs to move to
the lowest point from the midpoint. It can be seen that the robot struggles to exit
from the local minima. As a result, the computation time increases dramatically.
On the other hand, the paths generated in the environments with 100 and 125
obstacles are of moderate length. In these cases, local minima still occur, but the
robot manages to address them.
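The effect of pruning in Table 1 can be expressed as a relative reduction in path length. The short sketch below only performs that arithmetic on the tabulated values; the dictionary layout and the function name are hypothetical, and the percentages are derived here rather than reported by the authors.

```python
# Path lengths from Table 1: obstacles -> (Improved APF, pruned), in units.
TABLE_1 = {
    25: (193.807, 153.270),
    50: (208.056, 143.532),
    75: (431.686, 187.355),
    100: (257.863, 160.987),
    125: (274.967, 172.536),
}


def reduction_percent(original: float, pruned: float) -> float:
    """Relative shortening of the path achieved by pruning, in percent."""
    return 100.0 * (original - pruned) / original


for n_obstacles, (apf_len, pruned_len) in TABLE_1.items():
    print(f"{n_obstacles:>3} obstacles: "
          f"{reduction_percent(apf_len, pruned_len):.1f} % shorter")
```

The largest reduction falls in the 75-obstacle scenario, which is consistent with the observation that local minima lengthen the unpruned path there.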
5 Conclusion and Future Work
The Improved APF with path pruning has been proposed for robot path planning in
a known environment. The proposed method finds a valid, feasible, and shorter
solution for the robot's mission and requires little computation time, which is
vital for real-time path planning applications. The Improved APF has also been
shown to address the problems faced by the APF method. With the proposed
algorithm, the criteria for path planning problems have been fulfilled. In future
work, the Improved APF with path pruning will be enhanced by considering a
specific region to improve the algorithm speed. Future research will also focus on
cooperative techniques for multi-robot path planning.
Acknowledgements The authors would like to thank Universiti Tun Hussein Onn Malaysia
(UTHM) and the Research Management Center (RMC) for the funding support under TIER-1 VOT H131.
References
1. Hasircioglu I, Topcuoglu HR, Ermis M (2008) 3-D path planning for the navigation of
unmanned aerial vehicles by using evolutionary algorithms. In: Proceedings of the conference
on genetic and evolutionary computation, pp 1499–1506
2. Omar RB (2011) Path planning for unmanned aerial vehicles using visibility line-based
methods. Control and Instrumentation Research Group, Department of Engineering, University
of Leicester, March 2011
3. Sabudin EN, Omar R, Che Ku Melor CKANH (2016) Potential field methods and their
inherent approaches for path planning. ARPN J Eng Appl Sci 11(18):10801–10805
4. Borenstein J, Koren Y (1991) Potential field methods and their inherent limitations for mobile
robot navigation, April 1991, pp 1398–1404
5. Cen Y, Wang L, Zhang H (2007) Real-time obstacle avoidance strategy for mobile robot
based on improved coordinating potential field with genetic algorithm. In: IEEE international
conference on control applications, October 2007
6. Lifen AL, Rouxin BS, Shuandao CL, Jiang DW (2016) Path planning for UAVS based on
improved artificial potential field method through changing the repulsive potential function.
In: IEEE Chinese guidance, navigation and control conference (CGNCC), 12–14 August
2016
7. Liu Y, Zhao Y (2016) A virtual-waypoint based artificial potential field method for UAV path
planning. In: Proceedings of 2016 IEEE Chinese guidance, navigation and control conference,
12–14 August 2016
8. Khatib O (1985) Real-time obstacle avoidance for manipulators and mobile robots. In:
Proceedings of the IEEE international conference on robotics and automation, pp 500–505
9. Mei W, Su Z, Tu D, Lu X (2013) A hybrid algorithm based on artificial potential field and
BUG for path planning of mobile robot. In: 2nd international conference on measurement,
information and control
10. Wang S, Min H (2013) Experience mixed the modified artificial potential field method. In:
IEEE/RSJ international conference on intelligent robots and systems (IROS), 3–7 November
2013
11. Mei JH, Arshad MR (2015) A balance-artificial potential field method for autonomous surface
vessel navigation in unstructured riverine environment. In: IEEE international symposium on
robotics and intelligent sensors (IRIS)
12. Li G, Tamura Y, Yamashita A, Asama H (2012) Effective improved artificial potential
field-based regression search method for robot planning. In: IEEE international conference on
mechatronic and automation, 5–8 August 2012
13. Li G, Tamura Y, Yamashita A, Asama H (2013) Effective improved artificial potential
field-based regression search method for autonomous mobile robot path planning. Int J
Mechatron Autom 3(3):141–170
14. Sfeir J, Saad M, Saliah-Hasane H (2011) An improved potential field approach to real-time
mobile robot path planning in an unknown environment. In: IEEE international symposium
on robotic and sensors environments (ROSE)
15. Park JW, Kwak HJ, Kang YC, Kim DW (2016) Advanced fuzzy potential field method for
mobile robot obstacle avoidance. J Comput Intell Neurosci 2016. Article No. 10
16. Godrich MA. Potential Field Tutorial. https://pdfs.semanticscholar.org/725e/fa1af22f41dcbecd8bd445ea82679a6eb7c6.pdf. Accessed 29 Aug 2019
17. Robot Motion Planning and Control. Potential Field. https://sebastian-hoeffner.de/uni/ceng786/index.php?number=2. Accessed 29 Aug 2019
18. Debnath SK, Omar RB, Abdul Latip NB (2019) A review on energy efficient path planning
algorithms for unmanned air vehicles. In: Computational science and technology. Springer,
Singapore
19. Omar RB, Che Ku Melor CKNAH, Sabudin EN (2015) Performance comparison of path
planning methods. ARPN J Eng Appl Sci
20. Li G, Tong S, Lv G, Xiao R, Cong F, Tong Z, Yamashita A, Asama H (2015) An improved
artificial potential field-based simultaneous forward search (improved APF-based SIFORS)
method for robot path planning. In: The 12th international conference on ubiquitous robots
and ambient intelligence (URAI), 28–30 October 2015
Development of DugongBot Underwater Drones Using Open-Source Robotic Platform
Ahmad Anas Yusof, Mohd Khairi Mohamed Nor,
Mohd Shahrieel Mohd Aras, Hamdan Sulaiman, and Abdul Talib Din
Abstract This paper presents the development and fabrication of an open-source,
do-it-yourself underwater drone called DugongBot, which is developed in collaboration
with the Underwater Technology Research Group (UTeRG), Universiti Teknikal Malaysia
Melaka. Research institutes and hobbyists have shown a growing interest in the
development of micro observation-class remotely operated vehicles (micro-ROVs) using
open-source platforms. Currently, OpenROV and ArduSub are the low-cost open-source
solutions available for such ROVs. These open-source hardware and software platforms
are being used worldwide to develop the system architecture of small electrically
powered ROVs, supported by literature available on the internet and by the extensive
experience acquired in developing robotic exploration systems. This paper presents the
development of DugongBot, which uses the OpenROV open-source platform. Weighing
approximately 3 kg and designed for 100 m depth, the drone uses a single 18 cm long
watertight tube with a 10 cm diameter to accommodate the main electronics
compartment, which can be tilted up and down with a servo for alignment of the
CMOS-sensor HD webcam. Two horizontal thrusters for forward, reverse and rotational
movement and a vertical thruster for depth control are used for manoeuvrability.
Keywords Micro-ROV · OpenROV · Underwater drones · Open-source
A. A. Yusof (✉) · M. K. M. Nor · M. S. M. Aras
Faculty of Electrical Engineering, Universiti Teknikal Malaysia Melaka, Hang Tuah Jaya,
76100 Durian Tunggal, Melaka, Malaysia
e-mail: anas@utem.edu.my
A. A. Yusof · M. K. M. Nor · M. S. M. Aras
Centre for Robotics and Industrial Automation, Universiti Teknikal Malaysia Melaka,
Hang Tuah Jaya, 76100 Durian Tunggal, Melaka, Malaysia
H. Sulaiman · A. T. Din
Faculty of Mechanical Engineering, Universiti Teknikal Malaysia Melaka, Hang Tuah Jaya,
76100 Durian Tunggal, Melaka, Malaysia
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_10
1 Introduction
Open-source robotic platforms for underwater robotics have provided a high return
on investment for the scientific community. There is now significant evidence that
this sharing concept allows underwater technology to be studied, modified, created,
and distributed by anyone. Thus, micro-ROVs, or underwater drones, are increasingly
popular owing to the growing interest among researchers who use the open-source
platform [1, 2]. The platform has led to the development of various low-cost
underwater drones for hobbyists, such as the OpenROV Trident, Gladius Mini and
Geneinno Poseidon, which serve a wide variety of purposes in capturing underwater
footage for scientific exploration, industrial inspection and military surveillance
[3–9]. The availability of open-source platforms also gives students the opportunity
to develop underwater robots for underwater vehicle competitions around the globe
[10–14]. These electrically powered vehicles can weigh as little as 2 kg and are
generally small enough for backpack storage. They are generally limited to depth
ratings of less than 100 m owing to limitations in pressure resistance and
power-to-weight ratio. They can easily be hand-launched from the surface, use a
simple tether system, and can sometimes be connected wirelessly through a floating
buoy at the surface. This ensures a continuous live video feed from the drone and,
more importantly, avoids losing the drone in the deep ocean. Most of them are also
equipped with powerful headlamps, providing visibility in dark and murky underwater
conditions. They also use 4K cameras for high-quality image capture, FPV goggles for
a first-person-view experience and a simple robotic arm for underwater sampling.
Figure 1 shows a price comparison of selected small ROVs and underwater drones in
Malaysian ringgit.
Fig. 1 Price comparison of small ROVs [15]
Thus, in this paper, the review and development of an underwater drone using
open-source platforms and solutions are presented and evaluated. Named DugongBot,
the underwater drone serves as the first generation of low-cost drones developed
in-house at UTeRG.
2 DugongBot Development
Dugong, as shown in Fig. 2, is a species of sea cow found throughout the warm
latitudes of the Indian and western Pacific Oceans. It can be found in the coastal
area of Malaysia, and has been categorized as decreasing in numbers in the
International Union for Conservation of Nature’s Red List of Threatened Species
[16]. In support of dugong protection throughout the world, the underwater
drone in this project is called DugongBot, as shown in the CAD design in Fig. 3.
Fig. 2 Dugong
2.1 Hardware Development
The DugongBot uses a BeagleBone Black single-board computer as its processor,
integrated with an Arduino Mega microcontroller for sensor reading and thruster
control. It can be tele-operated using either a gamepad or a keyboard to control
the vehicle's movement, and it works with any Windows-compatible gamepad.
DugongBot uses an inertial measurement unit (IMU) and a pressure sensor for
movement and depth calibration; the IMU uses a single-axis rate gyroscope to
measure the yaw rate and a two-axis accelerometer to measure the roll and the
pitch. The system has a maximum operational pressure of 30 bar for depth
capability and includes a magnetometer compass. A 1080p high-definition webcam
with a 120-degree field of view is used in the telemetry system through I2C
protocols for laptop display. There are 3 thrusters used for forward, upward and
downward movement. The topside control hardware contains a few electronic
components to communicate with the drone. The controller board, which is designed
based on the Arduino Mega configuration, manages the low-level input commands from
the IMU and pressure sensors and the output commands to the motors/thrusters and
lights, while the BeagleBone Black processes the underwater footage using
mjpg-streamer. The topside interface board provides an Ethernet connection between
the drone and the laptop. The drone uses a micro USB power supply that can supply
at least 500 mA to the topside interface board. It has been documented in the
OpenROV support group forum that the topside interface board can be connected
wirelessly by implementing a small modification [17]. Table 1 shows the
specification of the DugongBot 1.0.
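To relate the 30 bar pressure rating to depth, the hydrostatic relation p = p_atm + ρgh can be evaluated directly. The sketch below assumes that the rating refers to absolute pressure and that the water is seawater with a density of 1025 kg/m³; neither is stated in the paper, and the function names are hypothetical.

```python
ATM_PA = 101_325.0   # standard atmospheric pressure at the surface, Pa
G = 9.80665          # standard gravity, m/s^2
RHO_SEAWATER = 1025.0  # assumed water density, kg/m^3


def depth_from_pressure(p_abs_bar: float, rho: float = RHO_SEAWATER) -> float:
    """Depth in metres for an absolute pressure in bar (hydrostatic model)."""
    return (p_abs_bar * 1e5 - ATM_PA) / (rho * G)


def pressure_at_depth_bar(depth_m: float, rho: float = RHO_SEAWATER) -> float:
    """Absolute pressure in bar at a given depth in metres."""
    return (ATM_PA + rho * G * depth_m) / 1e5


print(f"30 bar corresponds to about {depth_from_pressure(30.0):.0f} m")
print(f"100 m corresponds to about {pressure_at_depth_bar(100.0):.1f} bar")
```

Under these assumptions the sensing range comfortably exceeds the 100 m design depth, so the depth limit of the vehicle would be set by other components, such as the hull.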
Fig. 3 DugongBot CAD design
Table 1 DugongBot specification

| Name | DugongBot 1.0 |
|---|---|
| Dimension | 25 (H) × 30 (W) × 45 (L) cm |
| Weight | 3 kg |
| Hull | Poly(methyl methacrylate) (Acrylic) |
| Frame | Polyvinyl chloride (PVC) pipe |
| Thrusters | 3 thrusters |
| ESCs | Afro ESC 12 A |
| Controller | Arduino Mega-based OpenROV microcontroller |
| Processor | BeagleBone Black |
| Software | OpenROV Cockpit, Node.JS, mjpg-streamer, Socket.IO |
| Batteries | 2500 mAh, 9.6 V, 26650, LiFePO4 |
| Sensors | OpenROV IMU (add-on) |
| Tether | Ethernet 2 wire |
| Ballast | Lead |
| Camera | HD camera on tilt servo |
2.2 Software Development
OpenROV is a company that produces underwater exploration devices; it was founded
in 2011 and is located in Berkeley, California. In 2019, the ocean data startup
Spoondrift and OpenROV announced their merger into a new company known as Sofar
Ocean Technologies. Since the merger, support for OpenROV 2.8 has been unavailable
from the OpenROV website, and the company's current focus is on marketing the
OpenROV Trident and the Intelligent Spotter buoy. Nevertheless, the support material
and documentation for OpenROV 2.8 and older versions can still be downloaded from
GitHub and Dozuki. GitHub is a hosting platform for software development, which
offers distributed version control and source code management for many software
developers, including OpenROV. The OpenROV community on GitHub is a DIY community
centred on underwater robots for exploration and adventure. It is a group of
amateur and professional ROV builders and operators from over 50 countries who have
a passion for underwater robotics. Dozuki is a cloud-based platform that provides
access to various step-by-step manuals for repair, process tracking, training and
work instructions. Both platforms provide a good community and support group for
OpenROV documentation. Almost 30 guides are available on Dozuki for the
step-by-step development of OpenROV. Figure 4 shows some of the open-source
support for the project.
Fig. 4 Open Source support
3 Drone Testing
3.1 Camera Function with Software
DugongBot uses an ultra-wide-angle full HD webcam. This camera lets the user stream
live video to explore the underwater environment and capture photos. The camera can
also detect objects and can be remotely tilted 25 to 30° upward and 60° downward.
It provides a 120° wide field of view. The battery allows the camera to operate for
up to 3 h. The movement of the camera is controlled from the keyboard: the Q key
controls the downward movement, the T key controls the upward movement and the I
key controls the lights. The visual interface of the OpenROV platform is known as
the Cockpit, as shown in Fig. 5; it provides the operator with a graphical user
interface showing depth, heading, battery voltage and consumption, and flight time.
The Node.js application, built on the cross-platform JavaScript run-time
environment, receives keyboard commands through an HTML5 single-page application
running in a supported browser. To connect to the ROV, the computer must use a
static IP address on the same subnet as the ROV's built-in static IP address. The
ROV's static IP address is 192.168.254.1; the last number of the computer's address
must be set to a value other than 1, and the subnet mask needs to be changed to
255.255.255.0.
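The addressing rule can be checked with Python's standard `ipaddress` module. This is only an illustrative sketch: the function name is hypothetical, and excluding the network and broadcast addresses is an extra safeguard that the text does not mention.

```python
import ipaddress

# Values taken from the text above.
ROV_ADDRESS = ipaddress.ip_address("192.168.254.1")
ROV_NETWORK = ipaddress.ip_network("192.168.254.0/255.255.255.0")


def is_valid_topside_address(addr: str) -> bool:
    """Check that a laptop address can reach the ROV on its static subnet."""
    ip = ipaddress.ip_address(addr)
    reserved = (ROV_NETWORK.network_address, ROV_NETWORK.broadcast_address)
    return ip in ROV_NETWORK and ip != ROV_ADDRESS and ip not in reserved


print(is_valid_topside_address("192.168.254.2"))  # True
print(is_valid_topside_address("192.168.254.1"))  # False: clashes with the ROV
print(is_valid_topside_address("192.168.1.5"))    # False: different subnet
```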
Fig. 5 Camera function using OpenROV cockpit platform
The drone transfers data over an Ethernet tether, and the operator does not need to
download any software or have an internet connection to operate it. The Ethernet
protocol is used to connect the DugongBot to a computer via the tether. The
BeagleBone Black in the drone runs the web server, and the browser on the computer
communicates with the server using Socket.IO, a JavaScript library that enables
bidirectional, real-time, event-based communication. The DugongBot's controller
board, which is designed based on the Arduino Mega configuration, manages the
low-level input commands from the IMU and pressure sensors and the output commands
to the motors/thrusters and lights, while the BeagleBone Black processes the
underwater footage using mjpg-streamer. The DugongBot's topside interface board
provides an Ethernet connection between the ROV and the laptop, as shown in Fig. 6.
Fig. 6 DugongBot version 1.0 (diagram labels: Tenda adapter, topside computer, Ethernet (RJ45), gamepad controller (optional), topside interface board, Ethernet (2 wire))
Fig. 7 DugongBot thrusters
3.2 Thrusters Functions
Low-cost brushless motors are a good choice for the thrusters, but the motors may
have a limited life when used in salt water environments. Nevertheless, proper
maintenance will extend their life expectancy. All the thrusters are wired to the
input power and controlled from the keyboard, which enables the user to control the
movement from the topside. The input power is supplied by 2500 mAh, 9.6 V, 26650
LiFePO4 batteries. The thrusters can also be tested with a 12 V power supply. The
rotation direction and movement effect of each thruster need to be identified in
order to align them. The thruster is mapped to the left Shift key on the keyboard
for anticlockwise rotation, which produces forward drone movement. The right Shift
key commands clockwise rotation of the same thruster, which produces backward
movement. In general, the Up, Down, Left, Right, Shift and Ctrl keys can be used to
maneuver the DugongBot. Figure 7 shows the thrusters used in the drone.
3.3 Buoyancy
It is very important that an underwater drone is stable and does not tip over.
DugongBot must be buoyant enough that it can be maneuvered up or down easily
without using too much energy. The objective of the development is also to produce
a well-balanced underwater drone structure that is naturally buoyant below the
water surface. During the first trial, the underwater drone was partially submerged
in the water but was not stable, because the left side was heavier than the right
side. Later, ballast weights were added, one at the front and two at the sides. The
result is a naturally buoyant DugongBot, as shown in Fig. 8.
Fig. 8 DugongBot in action
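The ballast adjustment can be related to Archimedes' principle: the drone is neutrally buoyant when the mass of the water it displaces equals its own mass. The sketch below uses the 3 kg mass and the tube dimensions (18 cm long, 10 cm diameter) quoted in the abstract; the fresh-water density of 1000 kg/m³ and all function names are assumptions made for the example.

```python
import math

RHO_FRESH_WATER = 1000.0  # assumed water density, kg/m^3
G = 9.80665               # standard gravity, m/s^2


def neutral_volume(mass_kg: float, rho: float = RHO_FRESH_WATER) -> float:
    """Displaced volume (m^3) at which buoyancy balances weight."""
    return mass_kg / rho


def cylinder_volume(diameter_m: float, length_m: float) -> float:
    return math.pi * (diameter_m / 2.0) ** 2 * length_m


def net_buoyant_force(mass_kg: float, volume_m3: float,
                      rho: float = RHO_FRESH_WATER) -> float:
    """Upward force in newtons when fully submerged (negative means it sinks)."""
    return (rho * volume_m3 - mass_kg) * G


tube = cylinder_volume(0.10, 0.18)
print(f"Volume needed for a 3 kg drone: {neutral_volume(3.0) * 1000:.2f} L")
print(f"Volume of the watertight tube:  {tube * 1000:.2f} L")
```

Under these assumptions the tube alone displaces less water than the full 3 kg mass requires, so the remaining displacement has to come from the frame, thrusters and other parts. Ballast is then used to trim the balance, as described above.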
4 Conclusion
The development of the DugongBot underwater drone using a low-cost open-source
robotic platform has been successfully implemented. The underwater drone has been
designed for maneuverability, performance and underwater footage capability. This
project will benefit related underwater industries by demonstrating the features of
small underwater drones at minimum implementation cost. In this paper, an
open-source prototype for building low-cost underwater drones and for customizing
their thruster and ballast configurations has been successfully tested using a
three-propeller underwater drone based on open-source hardware and software
solutions. Nonetheless, further tests in deeper waters and under different frame
configurations will be undertaken in the near future.
Acknowledgements The authors wish to thank the Ministry of Education (MOE) and Universiti
Teknikal Malaysia Melaka for their support.
References
1. Aristizábal LM, Rúa S, Gaviria CE, Osorio SP, Zuluaga CA, Posada NL, Vásquez RE (2016)
Design of an open source-based control platform for an underwater remotely operated vehicle.
DYNA 83(195):198–205
2. Schillaci G, Schillaci F, Hafner VV (2017) A customisable underwater robot. arXiv abs/
1707.06564
3. OpenROV Trident. https://www.sofarocean.com/products/trident. Accessed 10 Oct 2019
4. Fathom One. https://www.kickstarter.com/projects/1359605477/fathom-one-the-affordablemodular-hd-underwater-dr. Accessed 10 Oct 2019
5. Geneinno Poseidon. https://www.geneinno.com/poseidon.html. Accessed 10 Oct 2019
6. BlueROV2. https://www.bluerobotics.com/store/rov/bluerov2/. Accessed 10 Oct 2019
7. Aras MSM, Azis FA, Othman MN, Abdullah SS (2012) A low cost 4 DOF remotely operated
underwater vehicle integrated with IMU and pressure sensor. In: 2012 4th international
conference on underwater system technology: theory and applications (USYS 2012), Shah
Alam, Malaysia
8. Zain ZMd, Noh MM, Ab Rahim KA, Harun N (2016) Design and development of an
X4-ROV. In: IEEE 6th international conference on underwater system technology: theory &
applications, Penang, Malaysia
9. Mainong AI, Ayob AF, Arshad MR (2017) Investigating pectoral shapes and locomotive
strategies for conceptual designing bio-inspired robotic fish. J Eng Sci Technol 12(1):001–014
10. Singapore Autonomous Underwater Vehicle Challenge (2017). https://sauvc.org/. Accessed
10 Oct 2019
11. Malaysia Autonomous Underwater Vehicle Challenge (2018). http://oes.ieeemy.org/.
Accessed 10 Oct 2019
12. Yusof AA, Nor MKM, Shamsudin SA, Alkahari MR, Mohd Aras MS, Nawawi MRM (2018)
Facing the autonomous underwater vehicle competition challenge: the TUAH AUV
experience. In: Hassan M (eds) Intelligent manufacturing & mechatronics. Lecture notes in
mechanical engineering. Springer, Singapore
13. Yusof AA, Nor MKM, Shamsudin SA, Alkahari MR, Musa M (2018) The development of
PANTHER AUV for autonomous underwater vehicle competition challenge 2017/2018. In:
Hassan M (eds) Intelligent manufacturing & mechatronics. Lecture notes in mechanical
engineering. Springer, Singapore
14. Yusof A, Kawamura T, Yamada H (2012) Evaluation of construction robot telegrasping force
perception using visual, auditory and force feedback integration. J Robot Mechatron
24(6):949–957
15. Sulaiman H, Nor MKM, Yusof AA, Aras MSM, Mohamad Ayob AF (2019) Low cost
observation class remotely operated underwater vehicle using open-source platform: a
practical evaluation between Openrov And Bluerov. In: International conference on ocean,
engineering technology and environmental sustainability (I-OCEANS 2019), Kuala
Terengganu, Malaysia
16. IUCN Red List of Threatened Species. https://www.iucn.org/ur/node/24442. Accessed 10 Oct
2019
17. Jakobi N. Guide ID 59. How to build a WiFi enabled Tether Management System. https://openrov.dozuki.com/Guide/How+to+build+a+WiFi+enabled+Tether+Management+System/59. Accessed 10 Oct 2019
Development of Autonomous Underwater Vehicle for Water Quality Measurement Application
Inani Yusra Amran, Khalid Isa, Herdawatie Abdul Kadir,
Radzi Ambar, Nurul Syila Ibrahim, Abdul Aziz Abd Kadir,
and Muhammad Haniff Abu Mangshor
Abstract Autonomous Underwater Vehicles (AUVs) are unmanned, self-propelled
vehicles that are typically deployed from a surface vessel and can operate
independently of that vessel for periods of several hours to several days. This
project presents the development of an AUV equipped with a pH sensor, a temperature
sensor, and a turbidity sensor to measure water quality. In the conventional
approach, a scientist has to go to the site and collect a water sample to measure
its quality. This requires more time to gather the data and lacks the capability
for real-time data capture. With the approach proposed in this project, a scientist
can measure water quality in real time, autonomously and more easily than with the
conventional method. In this project, two thrusters placed on the sides of the AUV
control its horizontal motion, guided by a digital magnetic compass that controls
the direction of the AUV. The vertical movement of the AUV is controlled by two
thrusters located at the bottom of the AUV, with the help of a depth sensor to
ensure that the AUV remains submerged. A pH sensor is used to detect whether the
water is acidic, alkaline or normal. The temperature sensor is used to sense the
water temperature. The turbidity sensor is used to detect the cloudiness of the
water, distinguishing murky water from clear water. These three sensors start
operating when the microcontroller powers up. The AUV is tested in the G3 lake at
UTHM to assess its ability to stay submerged and its ability to measure the water
quality parameters. The AUV successfully carried out the given task without
requiring intervention from an operator. Future researchers can improve the AUV's
design to make the AUV work more efficiently.
Keywords Autonomous Underwater Vehicle · Water quality sensor · Water quality measurement
I. Y. Amran · K. Isa (✉) · H. A. Kadir · R. Ambar · N. S. Ibrahim · A. A. A. Kadir · M. H. A. Mangshor
Faculty of Electrical and Electronic Engineering, Universiti Tun Hussein Onn Malaysia,
86400 Parit Raja, Batu Pahat, Johor, Malaysia
e-mail: halid@uthm.edu.my
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_11
1 Introduction
1.1 Project Background
An Autonomous Underwater Vehicle (AUV), also known as an unmanned underwater
vehicle, is a robot that operates underwater without requiring commands from an
operator. An AUV differs from a Remotely Operated Vehicle (ROV) in how it is
operated. An AUV works independently of humans, while an ROV is an unoccupied
underwater robot linked to a vessel by a series of cables [1]. An AUV submerges
according to the mission programmed by the user and returns after it completes the
mission, whereas an ROV transmits all its data to the operator through cables that
also convey power and allow the operator to control the ROV.
AUVs are being used for more and more tasks, with roles and missions continually
evolving. For example, the oil and gas industry uses AUVs to make detailed maps of
the seafloor before building subsea infrastructure. Scientists use AUVs for ocean
floor mapping research; AUVs are also used to find the wreckage of missing
aeroplanes and can even be operated as a hobby.
Water is essential for every living thing to survive. However, when humans pollute
water, it becomes unclean, and water problems become widespread. Water
contamination is the primary cause of human disease [2]. Thus, measuring and
monitoring water quality is crucial. Conventional methods of measuring water
quality lack the capability for real-time data capture. Traditional techniques of
collecting, testing, and analysing water samples in water laboratories are not only
expensive but also lack the capacity to collect, analyse, and rapidly disseminate
information in real time [3]. Several procedures need to be completed before the
data become available. Many scientists collect water samples only from the lake
banks and from the water surface, and they also wait for good weather before going
out to collect samples. The collected water is then tested and analysed in the
laboratory, which takes time. As a result, the data are not captured in real time.
Traditional tools include litmus paper (pH strip paper) and membrane-based kits.
Litmus paper is produced from a lichen-based dye that turns purple in acid
(pH < 6.0) and green in a base (pH > 8.0) [4]. The paper only needs to be dipped
into the collected water, and it changes colour according to the pH indicator, which
assigns each colour to a specific range of pH values. A membrane-based kit is also a
type of strip paper, which carries tetrazolium dye and a carbon source. The kit only
requires the water sample to be kept in it while the colour development is observed
[5]. The traditional pH tools cannot deliver real-time data because the litmus paper
changes colour only after it is dipped into the collected water. The colour must
then be matched against the pH indicator to decide whether the water is acidic,
alkaline or neutral.
The objective of this project is to develop a functional prototype of an AUV for
water quality measurement. An AUV equipped with a pH sensor, a turbidity sensor, and
a temperature sensor is a new idea that makes it easier for scientists to carry out
measurement tasks. The AUV can collect water quality data both on the water surface
and underwater. The data are recorded and stored in the data logger and can be
retrieved by removing the memory card from the data logger. With this innovation,
the data produced approximate real-time data capture. The project also analyses the
performance of the AUV and the effectiveness of the water quality measurement.
This paper has five sections, structured as follows. Section 1 presents the
introduction of this project, discussing the problem statement and goals and
reviewing related prior projects. Section 2 introduces the project methodology,
including the system layout and a few project trials. Section 3 presents the
outcomes and analyses and discusses the gathered information in detail. Section 4
concludes the project and discusses its restrictions, and Sect. 5 presents future
work to enhance this project.
1.2 Previous AUV with Water Quality Sensor
This section reviews relevant previous research on underwater and surface vehicles
fitted with water quality sensors.
Komaki [6] was concerned with the design and creation of an AUV specifically
designed for entry into hydrothermal settings over complicated seabed topography
across a wide depth range. Such AUVs can approach the vent fields closely and carry
various types of chemical sensor. Okamura developed MINIMONE (Mini Monitoring
Equipment) for collecting water samples. The samples collected by MINIMONE are
analysed for various water characteristics such as water density, pH, dissolved
inorganic carbon, nutrients, iron and manganese. This AUV operates underwater. The
advantage of the AUV Urashima is that information is logged every second; the
disadvantage is that the AUV is 10 m long, as shown in Fig. 1.
Takeuchi [7] designed and implemented a Solar-Powered Autonomous Surface Vehicle
(SASV), as shown in Fig. 2. The SASV measured depth, temperature, turbidity,
conductivity, dissolved oxygen, and chlorophyll. The ultimate objective of that
study is to create an index of ocean ecosystem soundness and to suggest preventive
steps to avoid collisions between fast passenger vessels and large whales. The SASV
operates on the sea surface. The advantage of this project is that it is
solar-powered, and the disadvantage is that data are collected only on the water
surface.
Fig. 1 AUV Urashima [6]
Fig. 2 Solar-powered ASV [7]
Helmi [8] created an innovative project to monitor water quality in continental,
coastal and lake regions. The parameters measured are pH, Oxidation Reduction
Potential (ORP) and water temperature, with the water quality sensors attached to a
buoy. The buoy operates on the water surface, as shown in Fig. 3. The benefit of the
portable buoy is that information is obtained from it in real time, and the
disadvantage is that data are collected only on the water surface.
Prasad [9] stated that Internet of Things (IoT) and Remote Sensing (RS) methods are
commonly used to monitor, collect and analyse information from remote places. The
researchers developed the Smart Water Quality Monitoring System, shown in Fig. 4, to
analyse water parameters. The project aims to develop a technique for monitoring the
quality of seawater, surface water, tap water and polluted stream water in an
attempt to help manage water pollution using IoT and RS technologies. The benefit of
the Smart Water Quality Monitoring System is that the information is stored onboard
on the SD card or sent to a File Transfer Protocol (FTP) or cloud server, and the
disadvantage is that data can only be taken point by point, from one location to
another.
Fig. 3 Mobile buoy [8]
Fig. 4 Smart water quality system [9]
Kafli [10] described environmental monitoring as the process of characterising and
monitoring environmental quality, such as air quality and water quality.
Furthermore, environmental monitoring is used to prepare environmental impact
assessments and in many cases where human operations risk damaging the natural
environment. The authors developed a floating platform to observe the air and the
water, as shown in Fig. 5. The device monitors parameters such as temperature,
humidity, latitude and longitude, water pH, date and time, and carbon monoxide. The
benefit of this project is that the information is saved to the SD card in .txt
format every 10 min [11], and the weakness is that the water quality data are
collected only at the water surface.
Niswar [12] studied soft-shell crab farming in south-east Asian countries such as
Indonesia. Poor water quality in crab farming raises the mortality rate in the crab
pond. The authors proposed the design and implementation of a water quality
monitoring system for crab farming using IoT technology to raise awareness among
farmers about maintaining acceptable water quality levels in the pond. The project
uses a temperature sensor, a salinity sensor, and a pH sensor. The system operates
at the bottom of the pond, as shown in Fig. 6. The advantage of this project is that
the sensed data are transmitted via a ZigBee network and stored in a cloud database,
and the disadvantage is that data are collected only on the water surface.
Fig. 5 Floating platform for environment monitoring [10]
Fig. 6 IoT-based water quality monitoring system for soft-shell crab farming [12]
2 Methodology
2.1 Project Design
To attain its goals, this project is divided into several stages so that the design
can be carried out smoothly. There are three phases: the first is modelling, the
second is design and development, and the third is testing and analysis. Figure 7
shows the sequence plan for the AUV project.
The first phase of this project is modelling, in which the AUV system architecture
and the mechanical assembly drawing are designed. Computer-aided design software,
Solidworks, is used to draw the 3D model and design the proposed AUV structure.
Phase 2 is to design and develop the AUV, which consists of hardware development,
software development, and integration, and covers the internal and external
mechanical design and the electrical design. Phase 3 is to test and analyse the
components of the AUV. Three tests were the focus: a lake test, a buoyancy test, and
a leak test.
Figure 8 shows the sensor flowchart of the AUV for the water quality measurement
application. The sensors are switched on together with the connected parts and sense
the surroundings. The information gathered by the pH sensor, temperature sensor, and
turbidity sensor is stored in the data logger every 1 s. If data are not collected
or the result is not accurate, all sensor connections need troubleshooting.
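The sensor flow described above can be sketched in Python as a minimal, hedged illustration. The `read_sensors` callable, the field names, and the comma-separated line layout are assumptions made for this example; the source only states that pH, temperature, and turbidity are stored every 1 s and that the sensor connections are checked when no usable data arrive.

```python
import math
from typing import Callable, Dict, List, Optional

# Hypothetical field names; the source does not specify the log layout.
FIELDS = ("ph", "temperature_c", "turbidity_v")


def is_valid(sample: Optional[Dict[str, float]]) -> bool:
    """A sample counts as collected only if every field is a finite number."""
    if sample is None:
        return False
    return all(
        isinstance(sample.get(name), (int, float)) and math.isfinite(sample[name])
        for name in FIELDS
    )


def log_samples(read_sensors: Callable[[], Optional[Dict[str, float]]],
                n_samples: int) -> List[str]:
    """Poll the sensors n_samples times and return the lines to be stored."""
    lines = []
    for tick in range(n_samples):  # one tick stands for the 1 s logging period
        sample = read_sensors()
        if is_valid(sample):
            lines.append(
                f"{tick},{sample['ph']:.2f},"
                f"{sample['temperature_c']:.2f},{sample['turbidity_v']:.2f}"
            )
        else:
            # Mirrors the "No" branch of the flowchart.
            lines.append(f"{tick},TROUBLESHOOT_SENSOR_CONNECTIONS")
    return lines
```

On the real vehicle the loop would wait one second between polls and append each line to the memory card; the sketch returns the lines instead so that it can be run without hardware.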
Fig. 7 Sequence plan of project
[Flowchart: Start → Switch on Arduino to power up sensors → 4 thrusters start operating → Acquire data from water quality sensors → Is data collected? (No: troubleshoot sensor connections; Yes: store the sensor data in the memory card) → End]
Fig. 8 Sensor flowchart
[Flowchart: Start → Switch on Arduino to power up sensors → Acquire data from depth sensor → Acquire data from digital magnetic compass → decisions "Is data > range?", "Is data < range?" and "Is data = range?", with actions "Both bottom thrusters rotate counter-clockwise for 1 s", "Both bottom thrusters rotate clockwise for 1 s" and "Both horizontal thrusters remain stationary" → End]
Fig. 9 System flowchart
Figure 9 shows the system flowchart for the operation of the AUV. The AUV switches
on automatically once it is entirely in the water. The compass navigates the AUV
underwater, assisted by the depth sensor, which keeps the AUV submerged. When the
direction of the AUV changes, the horizontal thrusters return the AUV to its
instructed direction. At the same time, the vertical thrusters keep the AUV
submerged if it starts to rise to the water surface. The pH sensor, temperature
sensor, and turbidity sensor start operating when the AUV is switched on.
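The depth-keeping decision in the system flowchart can be illustrated with a short Python sketch. It assumes that depth is measured positive downward, that the "range" is a band between two hypothetical limits, and that clockwise rotation of the bottom thrusters submerges the vehicle while counter-clockwise rotation raises it, as the thruster description in Sect. 3.2 states. The names `Spin` and `vertical_command` are illustrative only.

```python
from enum import Enum


class Spin(Enum):
    CW = "clockwise"            # bottom thrusters push the AUV deeper
    CCW = "counter-clockwise"   # bottom thrusters let the AUV rise
    STOP = "stationary"         # depth is inside the target band


def vertical_command(depth_m: float, range_min_m: float, range_max_m: float) -> Spin:
    """Choose the bottom-thruster action for one 1 s control step."""
    if range_min_m > range_max_m:
        raise ValueError("range_min_m must not exceed range_max_m")
    if depth_m > range_max_m:   # "Is data > range?": too deep, so rise
        return Spin.CCW
    if depth_m < range_min_m:   # "Is data < range?": too shallow, so submerge
        return Spin.CW
    return Spin.STOP            # "Is data = range?": hold


# Example with a hypothetical 0.5 m to 1.0 m target band.
for depth in (0.2, 0.7, 2.0):
    print(depth, vertical_command(depth, 0.5, 1.0).value)
```

Using a band rather than a single target value is a design choice of this sketch: it avoids the thrusters reversing on every step when the reading hovers around the set point.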
2.2 System Design
Figure 10 shows the operational block diagram of the project, which consists of
input, process, and output parts. The input part comprises several sensors, with a
battery as the primary power supply. Processing takes place in the Arduino
microcontroller, and the data logger then records the output. The outcomes are
discussed in the results and analysis part in Sect. 3. Several hardware experiments,
namely endurance testing, buoyancy testing, and leakage testing, were performed
after the model had been successfully constructed. The AUV was tested for buoyancy,
endurance, and leakage at G3 Lake, Universiti Tun Hussein Onn Malaysia (UTHM).
In Fig. 10, the sensors enable the AUV to perceive its surroundings. The sensors in
the input section play a key role in providing the AUV with accurate and detailed
environmental information. They include a pH sensor, a turbidity sensor, and a
temperature sensor. The pH sensor measures the acidity or alkalinity of the water,
the turbidity sensor senses the cloudiness of the water, and the temperature sensor
detects the water temperature. These three sensors operate simultaneously when the
AUV is switched on. The output section consists of a memory card and four thrusters:
the memory card stores all the data collected from the water quality sensors, and
the thrusters stabilise the AUV and control its movement.
Fig. 10 Block diagram of system
2.3 Hardware Requirements
The hardware required for the AUV project comprises actuators and sensors. Figure 11
shows the T100 Thruster and the Electronic Speed Controller (ESC). Four thrusters
with ESCs were used in this project. The T100 Thruster is a patented underwater
thruster for marine robotics. It offers high performance, with more than 5 lb of
thrust, and is durable enough to be used at great depths in the open ocean. The
T100 is made of high-strength, UV-resistant, injection-moulded polycarbonate
plastic. The core of the motor is sealed and protected with an epoxy coating, and it
uses high-performance plastic bearings rather than steel bearings, which rust in
saltwater. Everything that is not plastic is high-quality, non-corroding aluminium
or stainless steel. The specially designed propeller and nozzle of the T100 deliver
reliable and efficient thrust, while active water-cooling helps cool the motor. The
thruster is driven by an electric brushless motor that runs from 300 to 4200 rpm,
delivers up to 130 W of output power and produces 2.36 kgf of nominal thrust [15].
The T100 can rotate both clockwise (CW) and counter-clockwise (CCW), which can be
used to counter torque.
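Because kgf is a unit of force, the 2.36 kgf figure quoted above can be cross-checked against the "more than 5 lb of thrust" figure with a small unit conversion. The sketch below uses only the standard conversion constants; the function names are illustrative.

```python
KGF_TO_N = 9.80665          # newtons per kilogram-force (exact by definition)
LBF_TO_N = 4.4482216152605  # newtons per pound-force


def kgf_to_newton(kgf: float) -> float:
    """Convert kilogram-force to newtons."""
    return kgf * KGF_TO_N


def kgf_to_lbf(kgf: float) -> float:
    """Convert kilogram-force to pound-force."""
    return kgf_to_newton(kgf) / LBF_TO_N


thrust_kgf = 2.36
print(f"{thrust_kgf} kgf = {kgf_to_newton(thrust_kgf):.1f} N "
      f"= {kgf_to_lbf(thrust_kgf):.2f} lbf")
```

The result is about 23 N, or a little over 5 lbf, so the two figures in the paragraph are consistent with each other when both are read as thrust.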
Figure 12 shows the microcontroller used to control the AUV. The board has 54
digital input/output pins and 16 analogue input pins [16]. The Arduino Mega uses an
8-bit Atmel microcontroller, the ATmega2560, with 256 KB of flash memory, 8 KB of
SRAM, 4 KB of EEPROM, and a 16 MHz clock frequency [17]. The Arduino Mega can be
powered by an external power supply or via a USB connection, and the power source is
selected automatically. In this project, the microcontroller controls the four (4)
thrusters, digital magnetic compass, depth sensor, temperature sensor, pH sensor,
turbidity sensor, IMU module, and data logger.
Figure 13 shows an analogue pH sensor that senses the pH level of the water. The
sensor operates at 5 V and has a measuring range of pH 0 to pH 14. It is an
alternative to litmus paper or a pH testing kit, whose colours must be compared
against a pH indicator to obtain the result. The electrode is made of a sensitive
glass membrane with low impedance, and it responds quickly. The pH is a significant
parameter for water quality measurement because it affects the development and
reproduction of aquatic animals [18].
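As a hedged illustration of how an analogue pH reading might be turned into a pH value, the sketch below converts a 10-bit ADC count (the resolution of the Arduino Mega's analogue inputs) to a voltage and then applies a two-point linear calibration. The calibration voltages are hypothetical; the source does not give the conversion it used, and a real probe must be calibrated against buffer solutions.

```python
from typing import Callable


def adc_to_voltage(raw: int, vref: float = 5.0, max_count: int = 1023) -> float:
    """Convert a 10-bit ADC count to volts (assumes a 5 V reference)."""
    if not 0 <= raw <= max_count:
        raise ValueError("raw ADC count out of range")
    return raw * vref / max_count


def make_ph_converter(v_a: float, ph_a: float,
                      v_b: float, ph_b: float) -> Callable[[float], float]:
    """Build a voltage-to-pH function from two calibration points."""
    if v_a == v_b:
        raise ValueError("calibration voltages must differ")
    slope = (ph_b - ph_a) / (v_b - v_a)

    def to_ph(voltage: float) -> float:
        ph = ph_a + slope * (voltage - v_a)
        return min(14.0, max(0.0, ph))  # clamp to the sensor's pH 0 to 14 range

    return to_ph


# Hypothetical calibration: 2.5 V in a pH 7 buffer, 3.0 V in a pH 4 buffer.
to_ph = make_ph_converter(2.5, 7.0, 3.0, 4.0)
print(round(to_ph(adc_to_voltage(512)), 2))
```

Clamping the output keeps a disconnected or saturated probe from producing values outside the physical scale, although such readings should still be treated as suspect.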
Fig. 11 T100 Thruster [13] and ESC [14]
Fig. 12 Arduino Mega 2560 microcontroller
Fig. 13 pH sensor
Figure 14 shows the turbidity sensor used to evaluate the turbidity of the water.
It works on the principle that the intensity of the light scattered by the suspended
matter is proportional to its concentration [19]. The turbidity sensor operates at
5 V and 40 mA.
Figure 15 shows the Celsius temperature sensor, which is built around the TSYS01. It
is a fast-response, high-precision temperature sensor that is sealed from the water,
protected by an aluminium cage and ready to be installed in a waterproof enclosure
[20]. The TSYS01 sensor itself has a rapid response time, and the entire package is
designed to maintain that speed so that the temperature profile can be measured
accurately even if the temperature drops and rises rapidly.
Fig. 14 Turbidity sensor
Fig. 15 Temperature sensor
3 Results and Analysis
3.1 3D AUV Modeling
This subsection discusses the 3D modelling of the AUV. The tool used to sketch the
3D AUV model is Solidworks 2016. Figure 16 shows the AUV, which is designed with a
box shape based on the features required for the AUV stabilisation system. The AUV
mechanical system is designed so that the centre of buoyancy (COB) is above the
centre of gravity (COG). The distance between the COB and the COG is referred to as
the metacentric height. The restoring moment that returns the vehicle to its stable
orientation is proportional to the metacentric height, so hydrostatic stability
increases as the metacentric height increases. In addition, the COB and COG must be
aligned in the vertical direction so that no moment acts on the vehicle when its
pitch and roll angles are zero.
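The stability argument above can be made concrete with the usual hydrostatic relation for a fully submerged, neutrally buoyant body, M = W · h · sin(θ), where W is the vehicle weight, h the COB to COG distance and θ the tilt angle. The sketch below is an illustration under those assumptions; the numerical values are hypothetical and are not taken from the prototype.

```python
import math


def righting_moment(weight_n: float, cob_cog_distance_m: float,
                    tilt_deg: float) -> float:
    """Restoring moment (N·m) for a neutrally buoyant body with COB above COG.

    Assumes buoyancy equals weight and that COB and COG lie on the same
    vertical line when the tilt angle is zero.
    """
    return weight_n * cob_cog_distance_m * math.sin(math.radians(tilt_deg))


# Hypothetical 100 N vehicle tilted by 10 degrees, for two COB-COG distances.
for h in (0.05, 0.10):
    print(h, round(righting_moment(100.0, h, 10.0), 3))
```

The moment vanishes at zero tilt and doubles when the COB to COG distance doubles, which matches the two statements in the paragraph.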
Figure 17 shows the isometric 3D design of the Autonomous Underwater Vehicle. The
isometric view has three principal axes, where the x-axis represents the front view,
the y-axis the left view, and the z-axis the top view of the 3D AUV model.
Fig. 16 3D AUV Modeling
Fig. 17 Isometric 3D AUV design
3.2 Control System
All thrusters and sensors were calibrated and tested for functionality before
installation on the AUV, as shown in Fig. 18.
The thrusters are connected to the AUV control system, powered by an external 11 V
power supply, which controls their speed and direction. The thrusters are precisely
mounted in the centre of the vehicle to prevent the AUV from becoming imbalanced
when submerged. A depth sensor is used to give the AUV instructions for submerging
or rising underwater. The depth sensor detects the water depth via its pressure
sensor and transmits the data to the control system. The control system then
instructs the thrusters to submerge deeper or to rise, depending on the preset
value.
Fig. 18 Thruster calibration and testing
Fig. 19 Thrusters tested on the AUV structure
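A pressure-based depth sensor typically relies on the hydrostatic relation depth = (p − p_atm) / (ρ · g). The sketch below illustrates that conversion only; the fresh-water density, the surface pressure and the function name are assumptions, since the source does not name the sensor model or the conversion it applies.

```python
STANDARD_ATMOSPHERE_PA = 101_325.0   # assumed surface pressure
FRESH_WATER_DENSITY = 1000.0         # kg/m^3, assumed for a lake
GRAVITY = 9.80665                    # m/s^2


def depth_from_pressure(p_abs_pa: float,
                        p_surface_pa: float = STANDARD_ATMOSPHERE_PA,
                        rho: float = FRESH_WATER_DENSITY,
                        g: float = GRAVITY) -> float:
    """Return depth in metres from an absolute pressure reading in pascals."""
    # Readings slightly below surface pressure are treated as zero depth.
    return max(0.0, (p_abs_pa - p_surface_pa) / (rho * g))


# About 19.6 kPa above surface pressure corresponds to roughly 2 m of water.
print(round(depth_from_pressure(101_325.0 + 19_613.3), 2))
```

The resulting depth is the value the control system would compare against its preset value when deciding whether to submerge or rise.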
A digital magnetic compass is used as the AUV navigation system. The compass
provides the microcontroller with directional data, and the AUV moves in the preset
direction. The AUV's orientation system uses an Inertial Measurement Unit (IMU)
module. The IMU sensors help to determine the orientation in three-dimensional space
of the object to which the sensor is attached. These values are usually given as
angles.
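Holding a preset compass direction requires comparing the measured heading with the target while respecting the 0°/360° wrap-around. The following sketch shows one common way to do that; the dead band and the command names are hypothetical and are not taken from the source.

```python
def heading_error_deg(target_deg: float, current_deg: float) -> float:
    """Signed shortest rotation from current to target, in [-180, 180)."""
    return (target_deg - current_deg + 180.0) % 360.0 - 180.0


def turn_command(target_deg: float, current_deg: float,
                 deadband_deg: float = 5.0) -> str:
    """Pick a steering action; compass headings increase clockwise."""
    error = heading_error_deg(target_deg, current_deg)
    if abs(error) <= deadband_deg:
        return "hold"
    return "turn_clockwise" if error > 0 else "turn_counter_clockwise"


print(heading_error_deg(10.0, 350.0))   # 20.0: turn clockwise through north
print(turn_command(350.0, 10.0))        # turn_counter_clockwise
```

Without the wrap-around step, a vehicle heading 350° with a target of 10° would compute an error of −340° and turn the long way round.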
Figure 19 shows the thruster testing process. All four thrusters are attached to
the open AUV structure: two thrusters are attached to the left and right sides of
the structure for horizontal movement, and two thrusters are attached at the bottom
of the structure for vertical movement. The two side thrusters provide
back-and-forth movement, rotating clockwise for forward movement or
counter-clockwise for backward movement. The two bottom thrusters provide submerging
and rising movement, rotating clockwise for submerging or counter-clockwise for
rising.
3.3 AUV Prototype
Before the model was constructed, several experiments were performed to check each
sensor's functionality. A few experiments were also carried out on the model by
placing it in the lake, namely the buoyancy test, the leakage test, and the
endurance test. After all the experiments were completed, all the assembled parts
were mounted on the AUV body structure, as shown in Fig. 20.
The AUV has four thrusters: two for horizontal movement and two for vertical
movement. The AUV has two compartments that store all its electronic components and
prevent them from coming into contact with water. The sensors and related modules,
namely the compass, IMU module, data logger, depth sensor, turbidity sensor,
temperature sensor, and pH sensor, are stored in the upper compartment. The thruster
speed controllers and the power supply are stored in the lower compartment. Floats
and weights were used to balance the buoyancy force so that the AUV stays suspended
in the water while fully submerged.
Data collection was conducted at the UTHM G3 Lake, as shown in Fig. 21. All sensors
begin to collect data when the power supply is switched on, and the data are sent to
the Arduino microcontroller for storage on the memory card. The underwater
compartments of the AUV are sealed with white tape, epoxy and silicone grease so
that no water can enter them and reach the components, which would short-circuit the
entire circuit. Plasticine was also used as additional reinforcement to seal off
every opening of the compartments.
Fig. 20 The AUV prototype
Fig. 21 AUV field test at the G3 Lake, UTHM
The endurance test shows that the AUV was able to withstand turbulent streams of
water. When the water flow is turbulent, the AUV swims in a stable and balanced
manner, with assistive sensors such as the IMU and the actuators keeping the AUV in
position.
3.4 AUV Submerging and Leaking Test
Following complete assembly, the AUV was submerged in the G3 Lake at UTHM to test
whether it could remain fully submerged for a period of time, as shown in Fig. 22.
Floats are added to the sides of the AUV as a floating mechanism to increase the
buoyant force acting on the AUV.
Fig. 22 AUV submerging and leaking test
Additional weights are added to the AUV as a sinking mechanism to prevent it from
rising back to the water surface. Both mechanisms work together to keep the AUV
suspended underwater.
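The balance between floats and weights can be expressed with Archimedes' principle: the net vertical force is ρ · g · V − m · g, and the vehicle hovers when it is zero. The sketch below is illustrative only; the displaced volume and mass are hypothetical numbers, not measurements of the prototype, and fresh water is assumed.

```python
FRESH_WATER_DENSITY = 1000.0  # kg/m^3, assumed for a lake
GRAVITY = 9.80665             # m/s^2


def net_buoyant_force(displaced_volume_m3: float, mass_kg: float,
                      rho: float = FRESH_WATER_DENSITY,
                      g: float = GRAVITY) -> float:
    """Net upward force in newtons; positive means the AUV tends to rise."""
    return (rho * displaced_volume_m3 - mass_kg) * g


def ballast_for_neutral(displaced_volume_m3: float, mass_kg: float,
                        rho: float = FRESH_WATER_DENSITY) -> float:
    """Extra mass (kg) needed for neutral buoyancy; negative means add floats."""
    return rho * displaced_volume_m3 - mass_kg


# Hypothetical vehicle: displaces 12 litres and has a mass of 10 kg.
print(round(net_buoyant_force(0.012, 10.0), 2))    # upward force, so it floats
print(round(ballast_for_neutral(0.012, 10.0), 2))  # about 2 kg of weights needed
```

A negative result from `ballast_for_neutral` would indicate the opposite situation, where the vehicle is too heavy and extra floats, rather than weights, are needed.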
The AUV's underwater compartments play a major role because they are used for
storing the AUV control system. As the control system is not waterproof, it is very
important to ensure that it does not come into contact with water. A leakage test is
therefore carried out at the same time to ensure that no water can enter the AUV's
underwater compartments.
3.5 Experimental Results
The project goal, which was to develop an AUV for a water quality measurement
application, was accomplished. The system gathered water turbidity, temperature, and
pH data and saved them to the SD card in .txt format every 1 s, as shown in Fig. 23.
UTHM G3 Lake was chosen as the place for the AUV field test because the lake has a
thermocline, a transition layer between deep water and surface water. Each layer of
water, namely the mixed layer (surface water), the thermocline layer, and the deep
water, has a different temperature, as shown in Fig. 24. Water close to the surface
and warmed by the sun is less dense than the water close to the bottom because water
density changes as the water temperature changes. The lower the water temperature,
the higher the water density, down to around 4 °C [21]. In a thermocline, the
temperature decreases rapidly with small increases in depth. These three layers also
differ in the cloudiness and pH value of the water.
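Since a thermocline is the layer where temperature falls fastest with depth, it can be located in a depth-temperature profile by searching for the steepest negative gradient. The sketch below illustrates that idea; the profile values are invented for the example and are not the measurements reported in this paper.

```python
from typing import Sequence, Tuple


def steepest_cooling(depths_m: Sequence[float],
                     temps_c: Sequence[float]) -> Tuple[float, float]:
    """Return (mid-depth in m, gradient in °C/m) of the fastest temperature drop."""
    if len(depths_m) != len(temps_c) or len(depths_m) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    best_depth, best_grad = 0.0, float("inf")
    for i in range(len(depths_m) - 1):
        dz = depths_m[i + 1] - depths_m[i]
        if dz <= 0:
            raise ValueError("depths must be strictly increasing")
        grad = (temps_c[i + 1] - temps_c[i]) / dz
        if grad < best_grad:  # more negative means faster cooling with depth
            best_depth = 0.5 * (depths_m[i] + depths_m[i + 1])
            best_grad = grad
    return best_depth, best_grad


# Hypothetical profile: warm mixed layer, sharp drop, then cool deep water.
depths = [0.0, 1.0, 2.0, 3.0, 4.0]
temps = [28.0, 27.8, 24.0, 20.5, 20.2]
print(steepest_cooling(depths, temps))
```

To apply this to a time-stamped log like the one stored by the AUV, each temperature sample would first have to be paired with the depth reading taken at the same time.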
Fig. 23 Data is saved in SD card with .txt format
Based on the results in Fig. 25, during the field test at G3 Lake in UTHM, the
water temperature decreases rapidly to below 15 °C between 12:57:00 and 12:57:20.
This is because the AUV is submerged at the centre of the lake, in the thermocline
layer; the closer a layer is to the thermocline, the lower the water temperature.
In the early minutes of AUV operation, the pH sensor produces unstable data: when
the voltage is read incorrectly, the pH value derived from that voltage is also
incorrect [23]. The glass electrode of the pH sensor takes time to settle on the
correct pH reading. After several seconds, when the AUV started to swim at the
centre of the lake, the pH sensor read pH values between pH 7 and pH 10. This is
because the pH value changes with every layer and location in which the AUV dives.
Fig. 24 Thermocline of water [22]
Finally, the turbidity sensor senses the cloudiness of the water. The results show
that the turbidity reading changed to 5 V between 12:56:18 and 12:56:36. This is
because the water flow was unsteady, in other words turbulent. Turbulence makes the
water murkier.
Fig. 25 Data analysis for temperature, turbidity, and pH sensors
4 Conclusion
After testing the AUV in the G3 Lake at Universiti Tun Hussein Onn Malaysia, it can
be concluded that the AUV can perform the given task without requiring the
intervention of an operator. The AUV switches on automatically once it is entirely
in the water, and all sensors in the control system power up, including the water
quality measurement sensors. The digital magnetic compass navigates the AUV
underwater while the depth sensor helps to keep it submerged. At the same time, the
water quality measurement sensors, namely the pH sensor, temperature sensor and
turbidity sensor, start measuring the water and record the data in the data logger.
A few problems arose before the final phase was reached. The first was leakage in
the second compartment, which holds the power supply (batteries) and the four ESCs.
The leakage could be solved by applying sealing tape with a layer of silicone grease
around the thruster wires to prevent the passage of water. The second problem,
concerning the power supply, could be solved by adding a charging port to the power
supply compartment so that the batteries can be recharged directly within the AUV
instead of being replaced with new ones. The third problem, uploading code to the
microcontroller, could be solved by adding a Universal Serial Bus (USB) port to the
primary compartment (the microcontroller compartment) so that the user can upload
code through a USB port connected to the microcontroller instead of opening the
hull.
In conclusion, the project aim of designing and developing a functional Autonomous
Underwater Vehicle for a water quality measurement application was achieved. The
last objective, to analyse the performance of the AUV and the effectiveness of the
water quality measurement, was also achieved, as the AUV is able to operate fully.
5 Recommendation
A few improvements can be implemented in future work. One recommendation is to
decrease the length of the AUV because a smaller AUV is more manoeuvrable. According
to Newton's First Law of Motion, also known as the law of inertia, an object at rest
remains at rest and an object in motion remains in motion unless an unbalanced force
acts on it. As the mass of the AUV increases, its inertia also increases. A smaller
AUV therefore has less inertia, which benefits its manoeuvrability.
Another improvement for future projects is to use waterproof electronic components.
This is essential in the development of an AUV because the AUV is used explicitly
for underwater missions, in particular for mapping seafloors, detecting wreckage,
and measuring water quality at the seafloor. With waterproof electronic components,
the components will not malfunction when in contact with water, which also lowers
the cost of replacing malfunctioning components with new ones.
References
1. National Oceanic and Atmospheric Administration (2018) What is the difference between an
AUV and an ROV? US Department of Commerce
2. Zhou B, Bian C, Tong J, Xia S (2017) Fabrication of a miniature multi-parameter sensor chip
for water quality assessment. Sensors 17(12):157
3. Faustine A, Mvuma AN, Mongi HJ, Gabriel MC, Tenge AJ, Kucel SB (2014) Wireless sensor
networks for water quality monitoring and control within lake victoria basin: prototype
development. Wirel Sens Netw 6:281–290
4. Gunda NSK, Dasgupta S, Mitra SK (2017) DipTest: a litmus test for E. coli detection in
water. PLoS ONE 12(9):1–13
5. Kumar SB, Shinde AH, Mehta R, Bhattacharya A, Haldar S (2018) Simple, one-step
dye-based kit for bacterial contamination detection in a range of water sources. Sens
Actuators B Chem 276:121–127
6. Komaki K, Hatta M, Okamura K, Noguchi T (2015) Development and application of
chemical sensors mounting on underwater vehicles to detect hydrothermal plumes. In: 2015
IEEE underwater technology, UT
7. Arima M, Takeuchi A (2016) Development of an autonomous surface station for underwater
passive acoustic observation of marine mammals. In: Ocean 2016, Shanghai, no. 26289339,
pp 1–4
8. Helmi AHMA, Hafiz MM, Rizam MSBS (2014) Mobile buoy for real-time monitoring and
assessment of water quality. In: Proceedings of the 2014 IEEE conference on systems, process
and control, ICSPC 2014, December, pp 19–23
9. Prasad AN, Mamun KA, Islam FR, Haqva H (2016) Smart water quality monitoring system.
In: 2015 2nd Asia-Pacific world congress on computer science and engineering, APWC CSE
2015, pp 1–6
10. Kafli N, Othman MZ, Isa K (2017) Unsupervised floating platform for environmental
monitoring. In: Proceedings of the 2016 IEEE international conference on automatic control
and intelligent systems, I2CACIS 2016, October, pp 84–89
11. Kafli N, Othman MZ, Isa K (2016) Development of a floating platform for measuring air and
water quality. In: 2016 IEEE 6th international conference on underwater system technology:
theory and applications, USYS 2016, pp 177–182
12. Niswar M et al (2018) IoT-based water quality monitoring system for soft-shell crab farming.
In: Proceedings of the 2018 IEEE international conference on internet of things and
intelligence system, IOTAIS 2018, pp 6–9
13. T100 Thruster - Blue Robotics. https://www.bluerobotics.com/store/thrusters/t100-t200thrusters/t100-thruster/. Accessed 18 May 2019
14. Speed Controllers (ESCs) Archives - Blue Robotics. https://www.bluerobotics.com/productcategory/thrusters/speed-controllers/. Accessed 18 May 2019
15. Nascimento S, Valdenegro-Toro M (2018) Modeling and soft-fault diagnosis of underwater
thrusters with recurrent neural networks. IFAC-PapersOnLine 51(29):80–85
16. Introduction to Arduino Mega 2560 - The Engineering Projects. https://www.theengineeringprojects.com/2018/06/introduction-to-arduino-mega-2560.html. Accessed 18 May 2019
17. RobotShop (2015) Arduino Mega 2560 Datasheet. Power, pp 1–7
18. Wei Y, Hu X, An D (2018) Design of an intelligent pH sensor based on IEEE1451.2.
IFAC-PapersOnLine 51(17):191–198
19. Lambrou TP, Anastasiou CC, Panayiotou CG (2010) A nephelometric turbidity system for
monitoring residential drinking water quality. Springer, Berlin, Heidelberg, pp 43–55
20. Fast-Response, High Accuracy (±0.1 °C) Temperature Sensor. https://www.bluerobotics.com/store/sensors-sonars-cameras/sensors/celsius-sensor-r1/. Accessed 18 May 2019
21. About Water Temperature. https://staff.concord.org/~btinker/GL/web/water/water_temperatures.html. Accessed 27 May 2019
22. US Department of Commerce, N. N. W. S. Thermocline - Temperature Fluctuations at Erie,
PA
23. Top 10 Mistakes in pH Measurement. https://blog.hannainst.com/top-10-mistakes-in-phmeasurement. Accessed 21 May 2019
Discrete Sliding Mode Controller
on Autonomous Underwater Vehicle
in Steering Motion
Nira Mawangi Sarif, Rafidah Ngadengon, Herdawatie Abdul Kadir,
and Mohd Hafiz A. Jalil
Abstract The purpose of this study is to implement sliding mode control in discrete time domain for Autonomous Underwater Vehicle (AUV). Six Degree of
Freedom (DOF) was established for Naval Postgraduate School (NPS) AUV II
model, followed by linearizing surge and sway nonlinear Equation of Motion
(EoM) in horizontal plane to simplify the control system design. Discrete sliding
mode controller was designed based on Gao’s reaching law. Discrete Proportional
Integral Derivative (PID) controllers were used for performance comparative
analysis and brief discussion on existence of chattering phenomena in the controller
input. As a result, computer simulations on NPS AUV II showed that the proposed
controller has zero overshoot and faster settling time than the discrete PID
controller.
Keywords AUV
Chattering reduction Discrete time sliding mode
N. M. Sarif · R. Ngadengon (corresponding author) · H. A. Kadir · M. H. A. Jalil
Faculty of Electrical Engineering, Universiti Tun Hussein Onn Malaysia, 86400 Parit Raja, Johor, Malaysia
e-mail: rafida@uthm.edu.my
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666, https://doi.org/10.1007/978-981-15-5281-6_12
1 Introduction
Autonomous Underwater Vehicles (AUVs) have grown in popularity over the past three decades owing to their versatility and performance, and they are increasingly used in many industries [1]. Their compact size, self-contained propulsion systems and ability to carry sensors such as depth sensors, video cameras, side-scan sonar and other oceanographic measuring devices make AUVs well suited to dangerous missions. These features extend their use to a much wider range of applications such as surveillance, environmental monitoring, underwater inspection of harbors and pipelines, geological and biological surveys and mine countermeasures. However, unpredictable ocean behavior poses challenges to AUV navigation and motion performance. It induces high-frequency oscillating movement that degrades sensor performance, especially for acoustic and optical sensors, and it makes the vehicle dynamics highly nonlinear and time-varying, with uncertainties in hydrodynamic parameters such as added mass, lift forces, gravity and buoyancy forces [2]. Additionally, most AUVs are underactuated, so tracking and stabilization control become demanding tasks because the vehicle has more Degrees Of Freedom (DOF) than independent control inputs [3]. This restriction is imposed in real-life applications because inverting or pointing vertically can cause equipment damage or a dangerous control response [4]. As a result, AUV motion control is restricted to one noninteracting subsystem at a time [5]. Owing to these challenges, many advanced control techniques have been reported in the literature, including robust control techniques [6–8], intelligent control methods [9] and adaptive control approaches [10–12]. Among the robust controllers, sliding mode control (SMC) is a promising strategy [13] for overcoming these obstacles because of its simple computation and its robustness to external disturbances and parameter variations [14].
The literature shows that most SMC applications on AUVs are formulated in continuous time, but this formulation is less effective in practice given the current trend toward digital rather than analog control of dynamic systems [15]. In other words, controllers nowadays are almost exclusively implemented on digital computers or microprocessors. This is mainly due to the availability of low-cost digital computers and the advantages of digital signals over continuous-time signals [16]. For this reason, researchers have shown significant interest in recent years [13, 17, 18] in solving the problems caused by the discretization of continuous-time controllers. In 1997, Lee et al. [19] applied self-tuning discrete sliding mode control to the AUV ARMA based on an equivalent discrete variable structure control method. This was followed by research on quasi-sliding mode control in the presence of uncertainties and long sampling intervals [20], applied to an AUV named VORAM (Vehicle for Ocean Research and Monitoring). Zhang et al. [21] then proposed a discrete-time quasi-sliding mode controller for the multiple-input multiple-output AUV REMUS. In addition, Wu et al. [22] implemented adaptive sliding mode control for a discrete-time system and applied a time-varying sliding surface obtained via a parameter estimation method. Bibuli et al. [23] described a hybrid guidance and control system based on the integration of neural dynamics and quasi-sliding mode control on the Shark USV. Verma et al. [24] controlled the speed of a carangiform robotic fish using a discrete terminal sliding mode controller.
Research on discrete-time sliding mode controllers was started by Milosavljevic [25]. Later, Gao et al. introduced the quasi-sliding mode band [26]. Soon after that, Bartoszewicz [27] proposed a non-switching condition for DSMC. Although Gao's reaching law method was introduced more than two decades ago, it is still being used in many significant studies such as [28–30].
The objective of this research is to implement the discrete-time sliding mode control law proposed by Gao et al. [31] for steering motion control. The aim is to keep the designed control law in line with technological advancement and to minimize the vehicle heading error so that the steering motion follows the desired heading angle as closely as possible. A discrete Proportional Integral Derivative (PID) controller and a Discrete Sliding Mode Controller (DSMC) are tested on the NPS AUV II via simulation, with the discrete PID controller used for comparative performance analysis. The paper is organized as follows: the dynamic model of the NPS AUV II in the Body-Fixed Reference Frame (BFF) and the DSMC design are presented in Sects. 2 and 3, respectively. Results from numerical simulation are presented in Sect. 4, and the advantages and drawbacks of the control methods are discussed in Sect. 5.
2 Mathematical Modelling of NPS AUV II
2.1 Nonlinear Equation of Motion
The AUV dynamic system is highly nonlinear, coupled and time-varying because it depends on many effects such as hydrodynamic drag, damping and lift forces, Coriolis and centripetal forces, gravity and buoyancy forces and thrust [32]. The general nonlinear equations of motion are

$$M\dot{\nu} + C(\nu)\nu + D(\nu)\nu + G(\eta) = \tau \qquad (1)$$

$$\dot{\eta} = J(\eta)\nu \qquad (2)$$

where $M \in \mathbb{R}^{6\times 6}$ is the inertia matrix, $C(\nu) \in \mathbb{R}^{6\times 6}$ is the Coriolis and centripetal matrix, $D(\nu) \in \mathbb{R}^{6\times 6}$ is the damping matrix, $G(\eta) \in \mathbb{R}^{6\times 1}$ is the vector of buoyancy and gravitational forces and moments, and $\tau \in \mathbb{R}^{6\times 1}$ is the vector of control inputs, i.e. the forces and moments acting on the vehicle.
The kinematics and dynamics of the AUV are described using an earth-fixed reference frame and a body-fixed reference frame, as illustrated in Fig. 1. The earth-fixed coordinate system is defined by three orthogonal axes originating from an arbitrary point, with the x-axis pointing north, the y-axis pointing east and the z-axis pointing in the direction of increasing depth. The velocity vector is expressed as

$$\nu = [\nu_1 \;\; \nu_2]^T, \quad \nu_1 = [u \;\; v \;\; w]^T \ \text{(linear velocities)}, \quad \nu_2 = [p \;\; q \;\; r]^T \ \text{(angular velocities)} \qquad (3)$$
The position and attitude of the body-fixed reference frame with respect to the earth-fixed frame are expressed by the vectors

$$\eta = [\eta_1 \;\; \eta_2]^T, \quad \eta_1 = [x \;\; y \;\; z]^T \ \text{(position of origin)}, \quad \eta_2 = [\phi \;\; \theta \;\; \psi]^T \ \text{(roll, pitch and yaw angles)} \qquad (4)$$

Fig. 1 The six Degree of Freedom of NPS AUV II [33]
The control input vector $\tau$ has three components, as stated in (5):

$$\tau = [\delta_r,\; \delta_e,\; n] \qquad (5)$$

where $\delta_e$ is the elevator deflection, $\delta_r$ is the rudder deflection and $n$ is the propeller revolutions.
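The 6 DOF kinematic equation is expressed in vector form as

$$
\begin{bmatrix}\dot{x}\\ \dot{y}\\ \dot{z}\\ \dot{\phi}\\ \dot{\theta}\\ \dot{\psi}\end{bmatrix} =
\begin{bmatrix}
u\cos\theta\cos\psi + v(\sin\phi\sin\theta\cos\psi - \cos\phi\sin\psi) + w(\sin\phi\sin\psi + \cos\phi\sin\theta\cos\psi)\\
u\cos\theta\sin\psi + v(\cos\phi\cos\psi + \sin\phi\sin\theta\sin\psi) + w(\cos\phi\sin\theta\sin\psi - \sin\phi\cos\psi)\\
-u\sin\theta + v\sin\phi\cos\theta + w\cos\phi\cos\theta\\
p + q\sin\phi\tan\theta + r\cos\phi\tan\theta\\
q\cos\phi - r\sin\phi\\
\dfrac{q\sin\phi + r\cos\phi}{\cos\theta}
\end{bmatrix} \qquad (6)
$$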
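As an illustration of the kinematic relation, the following minimal sketch evaluates the standard Euler-angle kinematics for a given pose and body-fixed velocity. The function name `euler_kinematics` and the tuple layout are assumptions made for this example and do not come from the paper.

```python
import math


def euler_kinematics(eta, nu):
    """Return eta_dot = J(eta) * nu (hypothetical helper, not from the paper).

    eta = (x, y, z, phi, theta, psi): earth-fixed position and Euler angles [rad]
    nu  = (u, v, w, p, q, r): body-fixed linear and angular velocities
    """
    _, _, _, phi, theta, psi = eta
    u, v, w, p, q, r = nu
    cphi, sphi = math.cos(phi), math.sin(phi)
    cth, sth = math.cos(theta), math.sin(theta)
    cpsi, spsi = math.cos(psi), math.sin(psi)
    if abs(cth) < 1e-9:
        # The Euler-angle representation is singular at +/-90 deg pitch.
        raise ValueError("pitch angle too close to +/-90 degrees")

    x_dot = (u * cth * cpsi
             + v * (sphi * sth * cpsi - cphi * spsi)
             + w * (sphi * spsi + cphi * sth * cpsi))
    y_dot = (u * cth * spsi
             + v * (cphi * cpsi + sphi * sth * spsi)
             + w * (cphi * sth * spsi - sphi * cpsi))
    z_dot = -u * sth + v * sphi * cth + w * cphi * cth
    phi_dot = p + (q * sphi + r * cphi) * math.tan(theta)
    theta_dot = q * cphi - r * sphi
    psi_dot = (q * sphi + r * cphi) / cth
    return (x_dot, y_dot, z_dot, phi_dot, theta_dot, psi_dot)
```

With zero roll and pitch the last row reduces to a heading rate equal to the yaw rate r, which is the simplification used later for the steering model.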
Table 1 Position and velocities of AUV [32]
| Motion direction | Forces & moments | Body-fixed frame (velocity) | Earth-fixed frame (position) |
|---|---|---|---|
| Surge | X | u | x |
| Sway | Y | v | y |
| Heave | Z | w | z |
| Roll | K | p | φ |
| Pitch | M | q | θ |
| Yaw | N | r | ψ |
The six motion components are conveniently defined as surge, sway, heave, roll, pitch and yaw, as summarized in Table 1 following Fossen [32].
The six-DOF rigid-body Equations of Motion (EoM) in (1), (2), (3), (4) and (5) are expanded as [32]

$$m\left[\dot{u} - vr + wq - x_G(q^2 + r^2) + y_G(pq - \dot{r}) + z_G(pr + \dot{q})\right] = X \qquad (7)$$

$$m\left[\dot{v} - wp + ur + x_G(pq + \dot{r}) - y_G(p^2 + r^2) + z_G(qr - \dot{p})\right] = Y \qquad (8)$$

$$m\left[\dot{w} - uq + vp + x_G(pr - \dot{q}) + y_G(qr + \dot{p}) - z_G(p^2 + q^2)\right] = Z \qquad (9)$$

$$I_x\dot{p} + qr(I_z - I_y) + I_{xy}(pr - \dot{q}) - I_{yz}(q^2 - r^2) - I_{xz}(\dot{r} + pq) + m\left[y_G(\dot{w} - uq + vp) - z_G(\dot{v} - wp + ur)\right] = K \qquad (10)$$

$$I_y\dot{q} + rp(I_x - I_z) - I_{xy}(qr + \dot{p}) + I_{yz}(qp - \dot{r}) + I_{xz}(p^2 - r^2) + m\left[z_G(\dot{u} - vr + wq) - x_G(\dot{w} - uq + vp)\right] = M \qquad (11)$$

$$I_z\dot{r} + pq(I_y - I_x) - I_{xy}(p^2 - q^2) - I_{yz}(\dot{q} + rp) + I_{xz}(rq - \dot{p}) + m\left[x_G(\dot{v} + ur - wp) - y_G(\dot{u} - vr + wq)\right] = N \qquad (12)$$

where $m$ is the AUV mass; $x_G, y_G, z_G$ are the coordinates of the vehicle's center of mass; $I_x, I_y, I_z$ are the moments of inertia; $u, v, w$ are the linear velocities along the x-, y- and z-axes; $p, q, r$ are the angular velocities in roll, pitch and yaw, respectively; $\dot{u}, \dot{v}, \dot{w}, \dot{p}, \dot{q}, \dot{r}$ are the linear and angular accelerations; and $X, Y, Z, K, M, N$ are the external forces and moments.
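The total forces and moments acting on the vehicle are expressed, following [32], as

$$X = -(W - B)\sin\theta + X_{u|u|}u|u| + X_{\dot{u}}\dot{u} + X_{wq}wq + X_{qq}qq + X_{vr}vr + X_{rr}rr + X_{prop} \qquad (13)$$

$$Y = (W - B)\cos\theta\sin\phi + Y_{v|v|}v|v| + Y_{r|r|}r|r| + Y_{\dot{v}}\dot{v} + Y_{\dot{r}}\dot{r} + Y_{ur}ur + Y_{wp}wp + Y_{pq}pq + Y_{uv}uv + Y_{uu\delta_r}u^2\delta_r \qquad (14)$$

$$Z = (W - B)\cos\theta\cos\phi + Z_{w|w|}w|w| + Z_{q|q|}q|q| + Z_{\dot{w}}\dot{w} + Z_{\dot{q}}\dot{q} + Z_{uq}uq + Z_{vp}vp + Z_{rp}rp + Z_{uw}uw + Z_{uu\delta_s}u^2\delta_s \qquad (15)$$

$$K = (y_G W - y_B B)\cos\theta\cos\phi - (z_G W - z_B B)\cos\theta\sin\phi + K_{p|p|}p|p| + K_{\dot{p}}\dot{p} + K_{prop} \qquad (16)$$

$$M = -(z_G W - z_B B)\sin\theta - (x_G W - x_B B)\cos\theta\cos\phi + M_{w|w|}w|w| + M_{q|q|}q|q| + M_{\dot{q}}\dot{q} + M_{uq}uq + M_{vp}vp + M_{rp}rp + M_{uw}uw + M_{uu\delta_s}u^2\delta_s \qquad (17)$$

$$N = (x_G W - x_B B)\cos\theta\sin\phi + (y_G W - y_B B)\sin\theta + N_{v|v|}v|v| + N_{r|r|}r|r| + N_{\dot{v}}\dot{v} + N_{\dot{r}}\dot{r} + N_{ur}ur + N_{wp}wp + N_{pq}pq + N_{uv}uv + N_{uu\delta_r}u^2\delta_r \qquad (18)$$

where $X_{u|u|}u|u|$ and $Y_{v|v|}v|v|$ are cross-flow drag terms; $X_{wq}, X_{vr}, X_{qq}, Y_{ur}, Y_{wp}, Y_{pq}$ are added-mass cross-term force coefficients; $X_{prop}$ and $K_{prop}$ are the propeller force and torque, respectively; $M_{uq}, M_{vp}, M_{rp}, M_{uw}, N_{ur}, N_{wp}, N_{pq}, N_{uv}$ are added-mass cross-term moment coefficients; and $Y_{uu\delta_r}, Z_{uu\delta_s}, M_{uu\delta_s}, N_{uu\delta_r}$ are fin lift force and moment coefficients. The hydrostatic terms are written with the signs that follow from moving $G(\eta)$ in (1) to the right-hand side.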
2.2 Linearization of Horizontal Plane Equation of Motion
According to Healey and Marco [5], the complete dynamic Equations of Motion (EoM) can be divided into three non-interacting subsystems. To reduce the complexity of the control law design, the scope of this work is limited to steering motion, with the vertical motion control parameters set to zero. The steering control system is responsible for controlling the heading error, and automatic steering is performed using a rudder and a pair of thrusters. The following assumptions are used to obtain a linearized model of the steering control system from the sway and yaw EoM [5]:
• The forward velocity, $u_0$, is constant.
• Vertical motion control parameters are set to zero.
• The body drag force and moment are negligible.
• The added mass force and moment are negligible.
• The origin of the vehicle coincides with the centre of gravity.
Linearizing (8) and (12) gives

$$m\dot{v} + m u_0 r = Y \qquad (19)$$

$$I_z\dot{r} = N \qquad (20)$$

where $u_0$ is the constant forward velocity of the vehicle. From (6), for small roll and pitch angles the heading rate simplifies to

$$\dot{\psi} = \frac{\sin\phi}{\cos\theta}\,q + \frac{\cos\phi}{\cos\theta}\,r \approx r \qquad (21)$$

Under the previous assumptions, the linearized hydrodynamic added-mass, damping and rudder terms of (14) and (18) give $Y$ and $N$ as

$$Y = Y_{\dot{v}}\dot{v} + Y_{\dot{r}}\dot{r} + Y_v v + Y_r r + Y_{\delta}\delta_r \qquad (22)$$

$$N = N_{\dot{v}}\dot{v} + N_{\dot{r}}\dot{r} + N_v v + N_r r + N_{\delta}\delta_r \qquad (23)$$
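Equations (19), (20), (21), (22) and (23) are expressed in the compact form

$$
\begin{bmatrix}
m - Y_{\dot{v}} & m x_G - Y_{\dot{r}} & 0\\
m x_G - N_{\dot{v}} & I_z - N_{\dot{r}} & 0\\
0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}\dot{v}\\ \dot{r}\\ \dot{\psi}\end{bmatrix}
=
\begin{bmatrix}
Y_v u_0 & (Y_r - m)u_0 & 0\\
N_v u_0 & (N_r - m x_G) & 0\\
0 & 1 & 0
\end{bmatrix}
\begin{bmatrix}v\\ r\\ \psi\end{bmatrix}
+
\begin{bmatrix}Y_{\delta_r}\\ N_{\delta_r}\\ 0\end{bmatrix}\delta_r
\qquad (24)
$$

where $v$ is the sway velocity, $r$ is the angular velocity in yaw, $\psi$ is the heading angle and $\delta_r$ is the rudder deflection (Table 2).
Rearranging (24) into state-space form, with $A$ and $B$ obtained by premultiplying the right-hand side of (24) by the inverse of the left-hand matrix, gives

$$\dot{x}(t) = A x(t) + B u(t), \qquad y(t) = C x(t) \qquad (25)$$

where $x = [v \;\; r \;\; \psi]^T$, $u = \delta_r$, $C = [1 \; 0 \; 0;\; 0 \; 1 \; 0;\; 0 \; 0 \; 1]$, and the controlled output is the heading angle $\psi$.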
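The step from the compact form (24) to the state-space form (25) amounts to solving a small linear system. The sketch below does this with plain Gauss-Jordan elimination; the helper name `solve_linear` and every numeric entry are placeholders chosen only to mimic the layout of (24), not NPS AUV II coefficients.

```python
def solve_linear(M, R):
    """Solve M X = R by Gauss-Jordan elimination with partial pivoting.

    M is an n x n nested list and R an n x k nested list; returns X (n x k).
    """
    n = len(M)
    aug = [list(map(float, M[i])) + list(map(float, R[i])) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("left-hand matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [a / pv for a in aug[col]]
        for i in range(n):
            if i != col:
                f = aug[i][col]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[col])]
    return [row[n:] for row in aug]


# Placeholder numbers laid out like Eq. (24); NOT the NPS AUV II coefficients.
M_lhs = [[2.0, 0.1, 0.0],
         [0.1, 1.5, 0.0],
         [0.0, 0.0, 1.0]]
A_rhs = [[-1.0, -0.8, 0.0],
         [-0.3, -0.6, 0.0],
         [0.0, 1.0, 0.0]]
B_rhs = [[0.4], [-0.5], [0.0]]

A = solve_linear(M_lhs, A_rhs)  # A = M_lhs^-1 * A_rhs
B = solve_linear(M_lhs, B_rhs)  # B = M_lhs^-1 * B_rhs
```

Because the third row of the left-hand matrix is [0 0 1], the third row of A stays [0 1 0], i.e. the heading rate equals the yaw rate regardless of the hydrodynamic coefficients.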
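Table 2 The NPS AUV II model parameter [35]
| Parameter | Value | Units |
|---|---|---|
| $m$ | 5443.4 | kg |
| $W$ | 53400 | N |
| $Z_G$ | 0.061 | m |
| $Z_B$ | 0 | m |
| $I_y$ | 13587 | N m s² |
| $M_{\dot{q}}$ | 1.7 × 10² | N m s² |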
3 Controller Design
3.1 Discrete Sliding Mode Control (DSMC) Design
In this section, the DSMC is designed in the discrete-time domain to control the heading error of the steering system. Discretizing the continuous-time system (25), obtained from (19), (20), (21), (22) and (23), with a Zero Order Hold (ZOH) yields

$$x(k+1) = \Phi x(k) + \Gamma u(k), \qquad y(k) = C x(k) \qquad (26)$$

where $x(k)$ is the state vector, $u(k)$ is the control input, $y(k)$ is the output, and $\Phi$ and $\Gamma$ are the discrete-time system and input matrices.
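A zero-order-hold discretization can be sketched with the usual augmented-matrix exponential. The code below is a minimal illustration with a truncated Taylor series, adequate only when the product of the matrix norm and the sampling time is small; the function names are assumptions for this example, and the double-integrator plant is a stand-in, not the AUV model. With SciPy available, `scipy.signal.cont2discrete` with `method='zoh'` serves the same purpose.

```python
def mat_mul(X, Y):
    """Multiply two matrices given as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def expm_taylor(X, terms=30):
    """Matrix exponential by truncated Taylor series (small-norm inputs only)."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, X)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result


def zoh_discretize(A, B, T):
    """Return (Phi, Gamma) such that x(k+1) = Phi x(k) + Gamma u(k).

    Uses exp([[A, B], [0, 0]] * T) = [[Phi, Gamma], [0, I]].
    """
    n, m = len(A), len(B[0])
    aug = [[A[i][j] * T for j in range(n)] + [B[i][j] * T for j in range(m)]
           for i in range(n)]
    aug += [[0.0] * (n + m) for _ in range(m)]
    E = expm_taylor(aug)
    Phi = [row[:n] for row in E[:n]]
    Gamma = [row[n:] for row in E[:n]]
    return Phi, Gamma


# Stand-in plant (double integrator) sampled at T = 0.2 s.
Phi, Gamma = zoh_discretize([[0.0, 1.0], [0.0, 0.0]], [[0.0], [1.0]], 0.2)
```

For this stand-in plant the result is Phi = [[1, 0.2], [0, 1]] and Gamma = [[0.02], [0.2]], which matches the closed-form ZOH solution of a double integrator.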
In this paper, the objective of the controller is to force the state $x$ to reach a constant reference position $x_r$. Hence, the output tracking error is defined as

$$e = x_r - x \qquad (27)$$

Next, the conventional discrete sliding surface is defined as

$$S(k) = C_s e(k) = C_s\left(x_r(k) - x(k)\right) \qquad (28)$$

where $e(k)$ is the heading error, $x_r$ is the reference input and $C_s$ is the selected sliding gain matrix.
A discrete sliding mode control scheme can be designed based on a reaching law or on the equivalent control method. To steer the state trajectory onto the sliding surface within one sampling instant, the strategy is developed from the condition

$$S(k) = 0 \qquad (29)$$

In discrete time, the counterpart of the first time derivative of (29) is the one-step difference

$$S(k+1) - S(k) = 0 \qquad (30)$$

The discrete-time reaching law proposed by Gao et al. [31] is defined as

$$S(k+1) - S(k) = -qT\,S(k) - \varepsilon T\,\mathrm{sgn}(S(k)) \qquad (31)$$

where $T$ is the sampling interval of the discrete-time system, and $\varepsilon > 0$ and $q > 0$ are constants satisfying $1 - qT > 0$.
From Eqs. (26) and (28), the one-step difference of the sliding surface is written as

$$S(k+1) - S(k) = C_s\left(x_r(k+1) - x(k+1)\right) - C_s\left(x_r(k) - x(k)\right) \qquad (32)$$

Substituting Eq. (26) into (32), the difference is expressed as

$$S(k+1) - S(k) = C_s x_r(k+1) - C_s\Phi x(k) - C_s\Gamma u(k) - C_s x_r(k) + C_s x(k) \qquad (33)$$

Equating (33) with the reaching law (31) and solving for $u(k)$ gives the DSMC control law for system (26), which steers the sliding variable toward zero in finite time:

$$u(k) = -(C_s\Gamma)^{-1}\left[C_s\Phi x(k) - C_s x_r(k+1) + (1 - qT)S(k) - \varepsilon T\,\mathrm{sgn}(S(k))\right] \qquad (34)$$
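The control law can be checked numerically: applying it for one step should make the sliding variable follow Gao's reaching law exactly when the model is perfect. The sketch below assumes a single-input plant, the error convention e = x_r - x, and the form of u(k) obtained by equating the one-step difference of S with the reaching law. The plant matrices and the sliding gain `Cs` are hypothetical values for illustration, not the paper's.

```python
def sign(s):
    """Return -1, 0 or +1."""
    return (s > 0) - (s < 0)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mat_vec(M, x):
    return [dot(row, x) for row in M]


def dsmc_control(x, x_ref, x_ref_next, Phi, Gamma, Cs, q, eps, T):
    """One step of a reaching-law DSMC for a single-input system.

    Returns (u, S) with S(k) = Cs (x_ref(k) - x(k)).
    """
    S = dot(Cs, [r - xi for r, xi in zip(x_ref, x)])
    cs_gamma = dot(Cs, Gamma)
    if abs(cs_gamma) < 1e-12:
        raise ValueError("Cs * Gamma must be non-zero")
    u = -(dot(Cs, mat_vec(Phi, x)) - dot(Cs, x_ref_next)
          + (1.0 - q * T) * S - eps * T * sign(S)) / cs_gamma
    return u, S


# Hypothetical stand-in plant (ZOH double integrator, T = 0.2 s) and gains.
T, q, eps = 0.2, 0.4, 0.01
Phi = [[1.0, 0.2], [0.0, 1.0]]
Gamma = [0.02, 0.2]
Cs = [1.0, 0.5]
x, x_ref = [0.0, 0.0], [1.0, 0.0]

u, S = dsmc_control(x, x_ref, x_ref, Phi, Gamma, Cs, q, eps, T)
x_next = [a + g * u for a, g in zip(mat_vec(Phi, x), Gamma)]
S_next = dot(Cs, [r - xi for r, xi in zip(x_ref, x_next)])
```

Here S_next equals (1 - qT) S - eps T sgn(S), so the switching term only sets how the sliding variable approaches zero; the plant model enters through the Cs Phi x(k) term.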
The following step obtains the sliding gain matrix $C_s$. Setting $S(k) = 0$ and $x_r = 0$ in (34) and substituting the resulting control into (26) gives

$$x(k+1) = (\Phi - \Gamma K)x(k) \qquad (35)$$

where $K = (C_s\Gamma)^{-1}C_s\Phi$.
Hence, the sliding gain matrix $C_s$ is the solution of the following equations:

$$C_s(\Phi - \Gamma K) = 0 \qquad (36)$$

$$C_s\Gamma = I \qquad (37)$$

where $I$ is an identity matrix and (37) ensures that $C_s\Gamma$ has full rank. Using (37), (36) reduces to $C_s\Phi = K$, and thus the above equations can be written as

$$C_s[\Phi \;\; \Gamma] = [K \;\; I] \qquad (38)$$

Finally, the sliding matrix $C_s$ is given by

$$C_s = [K \;\; I][\Phi \;\; \Gamma]^{+} \qquad (39)$$

where $+$ denotes the matrix pseudo-inverse. The feedback matrix $K$ is obtained from a linear quadratic regulator design [35].
4 Computational Result on Steering Control Motion
This section evaluates controller performance via Matlab/Simulink simulations in which the overall system is treated as a discrete control system using a Zero Order Hold (ZOH) with a 0.2 s sampling time. To illustrate the effectiveness of the DSMC, a discrete PID controller was used for comparison. Step response simulations were performed for sway and yaw motion. The discrete PID controller is widely used because of its reliability and simplicity, but its parameters are difficult to tune in the discrete-time domain for optimal performance. The discrete PID gains were obtained using the Ziegler–Nichols method and are listed in Table 3 (Fig. 2 and Table 4).
Using (39), the sliding gain matrix $C_s$ is given by

$$C_s = [\,0.1 \;\; 2 \;\; 0.3\,] \qquad (40)$$
Table 3 PID tuning gain value
| Gain | Value |
|---|---|
| Proportional ($K_p$) | −0.626 |
| Integral ($K_I$) | −0.038 |
| Derivative ($K_d$) | −1.179 |
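For reference, a discrete PID controller with these gains can be sketched as below. The paper does not state which discrete PID form was used in Simulink, so the positional form with rectangular integration and a backward-difference derivative is an assumption, and the class name `DiscretePID` is hypothetical.

```python
class DiscretePID:
    """Positional discrete PID (assumed form; the paper does not specify one)."""

    def __init__(self, kp, ki, kd, T):
        self.kp, self.ki, self.kd, self.T = kp, ki, kd, T
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, error):
        # Rectangular (backward Euler) integration of the error.
        self.integral += error * self.T
        # Backward-difference approximation of the error derivative.
        derivative = (error - self.prev_error) / self.T
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Gains from Table 3 and the 0.2 s sampling time used in the simulations.
pid = DiscretePID(kp=-0.626, ki=-0.038, kd=-1.179, T=0.2)
u_first = pid.update(1.0)   # first sample after a unit step in the error
u_second = pid.update(1.0)  # error unchanged, so the derivative term vanishes
```

The first output is dominated by the derivative term acting on the step in the error, which is one reason practical implementations often filter the derivative or apply it to the measurement instead.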
Fig. 2 Yawing angle evolution
Table 4 Controller performance comparison
| Transient response properties | Discrete PID | DSMC |
|---|---|---|
| Rise time (s) | 3.045 | 11.5 |
| Overshoot (%) | 20 | 0 |
| Settling time (s) | 50 | 45 |
| Steady state error | 0 | 0 |
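The transient properties in the table can be extracted from a sampled step response along the following lines. This is a sketch under stated assumptions: a positive target, a response starting near zero, a 10 to 90 percent rise-time definition and a 2 percent settling band. The paper does not state which definitions were used, and the sample data below are synthetic.

```python
def first_crossing(t, y, level):
    """Time of the first sample at or above `level`, or None if never reached."""
    for ti, yi in zip(t, y):
        if yi >= level:
            return ti
    return None


def step_metrics(t, y, target, band=0.02):
    """Rise time, percent overshoot and settling time of a step response."""
    if target <= 0 or len(t) != len(y) or not y:
        raise ValueError("need a positive target and equal-length, non-empty data")
    t10 = first_crossing(t, y, 0.1 * target)
    t90 = first_crossing(t, y, 0.9 * target)
    rise = None if t10 is None or t90 is None else t90 - t10
    overshoot = max(0.0, (max(y) - target) / target * 100.0)
    # Settling time: first sample after the last excursion outside the band.
    last_out = -1
    for i, yi in enumerate(y):
        if abs(yi - target) > band * abs(target):
            last_out = i
    settling = t[last_out + 1] if last_out + 1 < len(t) else None
    return {"rise_time": rise, "overshoot_pct": overshoot, "settling_time": settling}


# Synthetic response toward a 50 degree heading command (illustrative only).
t = [0, 1, 2, 3, 4, 5]
y = [0.0, 30.0, 60.0, 55.0, 50.5, 50.0]
metrics = step_metrics(t, y, target=50.0)
```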
The reaching law parameters are set as follows:

$$q = 0.4, \qquad \varepsilon = 0.01 \qquad (41)$$
Figure 2 shows the evolution of the steering motion under both controllers, both of which reach the desired output. Both controllers start responding after 5 s of the input command and achieve the desired 50-degree yawing angle.
The AUV changes its yawing angle gradually and stabilizes after 45 s with the DSMC and after 50 s with the PID controller. The rudder deflection commanded by the DSMC changes gradually to a positive value, which results in a smooth yawing rate with 0% overshoot. The PID controller, on the other hand, produces 20% overshoot before reaching steady state at 50 s, as demonstrated in Figs. 3 and 4. Both controllers therefore achieve the desired result, but with different performance.
Figure 5 illustrates the chattering phenomenon in the discrete sliding mode control input. Chattering arises because the discrete control algorithm is evaluated once per sample period and its output is held constant until the next sampling period. Because the sampling frequency is finite, a Quasi-Sliding Mode (QSM) occurs in the closed-loop system, which forces the system state to move around the sliding surface rather than stay on it. From Gao's reaching law in Eq. (31), the thickness of the Quasi-Sliding Mode Band (QSMB) in steady state depends on the parameter $\varepsilon$, as illustrated in Fig. 6. The width of the QSMB can be reduced by using a smaller $\varepsilon$. In other words, decreasing $\varepsilon$ narrows the band and reduces the chattering amplitude, which lessens the effect of the sampling time on the system.
Fig. 3 Control input evolution
Fig. 4 Yawing rate evolution
Fig. 5 Chattering phenomena in control input
Fig. 6 QSMB in sliding surface
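The dependence of the band on ε can be illustrated by iterating the reaching law itself. In steady state the sliding variable alternates in sign with a constant magnitude a; setting S(k+1) = -S(k) = -a in the reaching law gives a = εT/(2 - qT). The sketch below checks this for the parameter values quoted in the paper (q = 0.4, ε = 0.01, T = 0.2 s). The function names are assumptions for this example, and the simulation covers only the ideal scalar recursion, not the full AUV loop.

```python
def simulate_reaching_law(s0, q, eps, T, steps):
    """Iterate S(k+1) = (1 - q*T) * S(k) - eps*T*sgn(S(k)) from S(0) = s0."""
    s = [s0]
    for _ in range(steps):
        sk = s[-1]
        sgn = (sk > 0) - (sk < 0)
        s.append((1.0 - q * T) * sk - eps * T * sgn)
    return s


def steady_state_amplitude(q, eps, T):
    """Magnitude of the steady-state zigzag, from S(k+1) = -S(k)."""
    return eps * T / (2.0 - q * T)


q, T = 0.4, 0.2
trace = simulate_reaching_law(1.0, q, 0.01, T, 2000)
band = max(abs(v) for v in trace[-10:])

# Halving eps halves the steady-state band.
trace_half = simulate_reaching_law(1.0, q, 0.005, T, 2000)
band_half = max(abs(v) for v in trace_half[-10:])
```

For these values the zigzag magnitude is 0.002/1.92, roughly 1.04e-3, and it scales linearly with ε, consistent with the observation that a smaller ε gives a thinner band.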
5 Conclusion
In this study, two controllers, a DSMC and a discrete PID controller, were developed for the AUV in the discrete-time domain. The NPS AUV II model was used to design the discrete-time controllers. The comparative computer simulations on the NPS AUV II show that the DSMC outperforms the discrete PID controller.
However, the DSMC generates chattering because the sampling frequency is finite and the discrete-time control algorithm is evaluated once per sample period and held constant until the next sampling period. The chattering effect can be mitigated by reducing the thickness of the QSMB. The robustness of the designed controller was not considered in this study.
In future work, the controller performance will be tested by considering parameter uncertainties and external disturbances in the designed control law.
Acknowledgements The authors would like to thank Universiti Tun Hussein Onn Malaysia
(UTHM) for TIER 1 grant Vot H148, GPPS grant Vot H316 and AdMiRe FKEE for the research
funding support.
References
1. Gelli J, Meschini A, Monni N (2018) Development and design of a compact autonomous
underwater vehicle zeno AUV. IFAC-PapersOnLine 51(29):20–25
2. Yang R (2016) Modeling and robust control approach for autonomous underwater vehicles. Ph.D. thesis, Ocean University of China
3. Isa K, Arshad MR, Ishak S (2014) A hybrid-driven underwater glider model, hydrodynamics
estimation, and an analysis of the motion control. Ocean Eng 81:111–129
4. Brutzman DP (1994) A virtual world for an autonomous underwater vehicle, Ph.D. thesis,
Naval Postgraduate School, Monterey, California
5. Healey A, Marco D (1992) Slow speed flight control of autonomous underwater vehicles:
experimental results with NPS AUV II. In: The second international offshore and Polar
engineering conference, pp 523–532
6. Guerrero J, Antonio E, Manzanilla A, Torres J, Lozano R (2018) Autonomous underwater
vehicle robust path tracking- auto-adjustable gain high order sliding mode controller.
IFAC-PapersOnLine 51(13):161–166
7. Farhan M (2017) Sliding mode control of autonomous under water vehicle. Master thesis, Faculty of Engineering, Capital University of Science & Technology, Islamabad
8. Song YS, Arshad MR (2016) Sliding mode depth control of a hovering autonomous
underwater vehicle. In: Proceedings of the 5th IEEE international conference on control
system, computing and engineering ICCSCE (2016)
9. Ullah B, Ovinis M, Baharom MB, Javaid MY, Izhar SS (2015) Underwater gliders control
strategies: a review. In: 10th Asian control conference: emerging control techniques for a
sustainable world, ASCC (2015)
10. Qiao L, Zhang W (2019) Adaptive second-order fast nonsingular terminal sliding mode
tracking control for fully actuated autonomous underwater vehicles. IEEE J Ocean Eng 44
(2):363–385
11. Cui R, Zhang X, Cui D (2016) Adaptive sliding-mode attitude control for autonomous
underwater vehicles with input nonlinearities. Ocean Eng 123:45–54
12. Chu Z, Xiang X, Zhu D, Luo C, Xie D (2017) Adaptive fuzzy sliding mode diving control for
autonomous underwater vehicle with input constraint. Int J Fuzzy Syst 10–11
13. Wang B (2008) On discretization of sliding mode control systems. Ph.D. thesis, RMIT University
14. Hung JY, Gao W, Hung JC (1993) Variable structure control: a survey. IEEE Trans Ind
Electron 40(1):2–22
15. Sanchez-Gonzalez PL, Díaz-Gutiérrez D, Leo TJ, Núñez-Rivas LR (2019) Toward
digitalization of maritime transport. Sensors 19(4):926
16. Ogata K (1995) Discrete-time control systems, 2nd edn. Prentice Hall International, Inc.,
Upper Saddle River
17. Singh DP, Agarwal S, Gupta UK (2014) A technical review on discrete-time sliding mode
controller for linear time-varying systems. Int J Eng Tech Res 5:289–291
18. Feng Y, Xue C, Yu X, Han F (2018) On a discrete-time quasi-sliding mode control. In:
Proceedings of IEEE international workshop on variable structure systems, pp 251–254
19. Lee P, Hong S, Lim Y (1997) Self-tuning control of autonomous underwater vehicles based
on discrete variable structure system. In: IEEE Conference Proceedings Oceans 1997, vol 2.
MTS, pp 902–909
20. Lee PM, Hong SW, Lim YK, Lee CM, Jeon BH, Park JW (1999) Discrete-time quasi-sliding
mode control of an autonomous underwater vehicle. IEEE J Ocean Eng 24:388–394
21. Zhang S, Yu J, Zhang A (2010) Discrete-time quasi-sliding mode control of underwater
vehicles. In: Proceedings of the world congress on intelligent control and automation,
pp 6686–6690
22. Wu B, Li S, Wang X (2009) Discrete-time adaptive sliding mode control of autonomous
underwater vehicle in the dive plane. Springer, Heidelberg, pp 157–164
23. Lecce ND, Laschi C, Bibuli M, Bruzzone G, Zereik E (2015) Neural dynamics and sliding
mode integration for the guidance of unmanned surface vehicles. In: MTS/IEEE ocean.
Genova discovering sustainable ocean energy a new world, pp 1–6
24. Verma S, Abidi K, Xu JX (2016) Terminal sliding mode control for speed tracking of a
carangiform robotic fish. In: Proceedings of the IEEE international workshop on variable
structure systems, July, pp 345–350
25. Milosavljevic D (1985) General conditions for existence of a quasi-sliding mode on the
switching hyperplane in discrete variable structure systems. Autom Remote Control 46:307–
314
26. Gao W, Wang Y, Homaifa A (1995) Discrete-time variable structure control systems. IEEE
Trans Ind Electron 42(2):117–122
27. Bartoszewicz A (1998) Discrete-time quasi-sliding-mode control strategies. IEEE Trans Ind
Electron 45:633–637
28. Bsili I, Ghabi J, Messaoud H (2018) Discrete sliding mode control of inverted pendulum. In:
World symposium on mechatronics engineering and applied physics 2015, November
29. Dias MSG (2017) Discrete time sliding mode control strategies applied to a multiphase
brushless DC machine, Ph.D. thesis, Kassel University
30. Ngadengon R, Sam YM, Osman JHS, Ghazali R (2011) Controller design for inverted
pendulum system using discrete sliding mode control. In: Proceedings of the 2011 2nd
international conference on instrumentation control and automation ICA 2011, November,
pp 130–133
31. Liao TL (1997) On discrete-time variable structure control systems. J Control Syst Technol 5
(4):285–290
32. Fossen TI (1994) Guidance and control of ocean vehicles, 4th edn. Wiley, New York
33. Geranmehr B, Nekoo SR (2015) Nonlinear suboptimal control of fully coupled non-affine
six-DOF autonomous underwater vehicle using the state-dependent Riccati equation. Ocean
Eng 96:248–257
34. Gerdönmez F (2007) Simulation of motion of an underwater vehicle, Ph.D. thesis, Middle
East Technical University
35. Draženović B, Milosavljević C, Veselić B (2013) Comprehensive approach to sliding mode design and analysis in linear systems. In: Advances in sliding mode control. Springer, pp 1–19
Impact of Acoustic Signal on Optical
Signal and Vice Versa in Optoacoustic
Based Underwater Localization
M. R. Arshad and M. H. A. Majid
Abstract Underwater localization is an important process in order to determine the
approximate location of a deployed underwater tool such as different types of
underwater vehicle. A common underwater localization depends on acoustic signal,
but it has disadvantages of high development cost, slow propagation speed, high
attenuation and only works effectively at a long distance. Optic is an alternative
approach for underwater localization. Optical signal has advantages of low cost and
high propagation speed, but it has the disadvantage of shorter detection range
compared to an acoustic signal. A combination of both approaches is known as an
optoacoustic which eliminates the disadvantages of each individual approach and
can be used for both short and long distance localizations. However, since both
signals are travelling waves, the use of both signals simultaneously may introduce
interferences. This paper investigates this possibility through experimentation. The
results of investigation proved that the interference does exist when both signals are
used simultaneously underwater.
Keywords Underwater localization · Optical based localization · Acoustic based localization · Optoacoustic
M. R. Arshad (corresponding author) · M. H. A. Majid
Underwater, Control and Robotics Group, School of Electrical and Electronic Engineering, Universiti Sains Malaysia (Engineering Campus), 14300 Nibong Tebal, Penang, Malaysia
e-mail: eerizal@usm.my
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666, https://doi.org/10.1007/978-981-15-5281-6_13
1 Introduction
A successful underwater operation depends on a reliable underwater positioning system. In terrestrial applications, the Global Positioning System (GPS), which uses radio signals, is widely used for positioning and localization. In an underwater environment, however, GPS and other RF-based localization methods do not work properly because of the hostile aquatic channel conditions. Underwater localization can be categorized into two types, namely long-range localization and short-range localization. Long-range localization is used for tracking an underwater vehicle as it maneuvers, while short-range localization is used for underwater docking during recovery or power recharging. In general, underwater localization can be performed with acoustic, radio frequency (RF) and optical waves. However, most current underwater localization technology relies heavily on acoustic signals, with distances mostly estimated from time delays. An accurate position estimate therefore requires an accurate time delay estimate, which typically involves complex signal processing, so a large hydrophone array or a high sampling rate is needed. Additionally, acoustic waves have the disadvantages of slow propagation speed, high attenuation, low bandwidth and a negative impact on marine life [1]. On the other hand, acoustic signals have a large field-of-view (i.e. detection radius) and can travel long distances. Similarly, RF has the disadvantages of high attenuation and high absorption, requires large antennas and high transmission power, and is limited to shallow-water applications.
Optical-based localization is another approach to underwater localization, although it has only recently been studied. Optical waves typically offer high speed, low cost and low energy consumption, but compared with RF or acoustic waves they have a shorter operating range and support only point-to-point communication (i.e. a narrower field-of-view), in which the receiver and transmitter must be aligned within a limited detection range to avoid losing the connection. By combining the acoustic and optical approaches, optoacoustic-based underwater localization has been developed as a new way of performing underwater localization. In the following subsections, related work on acoustic-based, optical-based and optoacoustic-based localization is discussed.
1.1 Acoustic-Based Localization
Acoustic-based underwater localization is a common research problem addressed widely in the literature, and it is used in a wide range of commercial applications. In a real underwater environment, acoustic-based localization can be used not only to determine the underwater position of a specific target, but also to track the source of interest as it moves [2].
Acoustics is the most favorable medium for underwater communication, positioning and localization, since radio waves do not propagate efficiently underwater. However, underwater acoustic channels are characterized by harsh physical-layer conditions: low bandwidth, high propagation delay, high bit error rate and a variable speed of sound pose unique challenges for underwater localization. Common methods for underwater localization are Ultra Short Baseline (USBL), Short Baseline (SBL) and Long Baseline (LBL) [3]. These acoustic-based localization methods usually rely on a trigonometric solution expressed in terms of the distances between a transmitter (i.e. acoustic source) and a set of receivers (i.e. hydrophones in an array). The distances are determined directly from time delay or phase delay estimates [4]. Provided that the underwater speed of the acoustic signal is known (i.e. can be estimated accurately), the distance can be computed from the estimated time or phase delay.
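The delay-to-distance relation is simple arithmetic, as the short sketch below shows. The function name `range_from_delay` and the nominal sound speed of 1500 m/s are assumptions for illustration; the actual speed varies with temperature, salinity and depth and would need to be measured or estimated in practice.

```python
def range_from_delay(delay_s, sound_speed=1500.0, two_way=False):
    """Distance in metres from a measured acoustic travel time in seconds.

    sound_speed is an assumed nominal value in m/s, not a measured one.
    Set two_way=True when the delay covers a round trip (e.g. a transponder reply).
    """
    if delay_s < 0 or sound_speed <= 0:
        raise ValueError("delay must be non-negative and sound speed positive")
    distance = sound_speed * delay_s
    return distance / 2.0 if two_way else distance


one_way = range_from_delay(0.1)                   # 0.1 s one-way travel time
round_trip = range_from_delay(0.1, two_way=True)  # same delay as a round trip
biased = range_from_delay(0.1, sound_speed=1515.0)
```

Because the distance is proportional to the assumed sound speed, a 1% error in that speed produces a 1% error in range, which is why an accurate estimate of the sound speed matters.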
An underwater acoustic signal is influenced by path loss, noise, multi-path, Doppler spread and highly variable propagation delay. The direction of underwater acoustic communication also affects the acoustic link: different propagation directions have different propagation characteristics, especially with respect to time dispersion, multi-path spread and delay variance. Hence, the underwater acoustic channel is a temporally and spatially variable system, which limits the available bandwidth and makes it strongly dependent on both range and frequency [4].
1.2 Optical-Based Localization
Optical sources such as lasers and light emitting diodes (LEDs) have been used
widely in many applications, from simple pointing devices to advanced defense
weapons. In underwater applications, however, the use of optical signals,
specifically laser technology, is still limited because light is strongly absorbed
by sea water. Nevertheless, the use of lasers for underwater communication, imaging
and localization is seen as the future of underwater localization
technology [5]. Although most lasers cannot penetrate sea water over a long
distance, they perform better than LEDs. Some lasers, such as blue and
green lasers, can propagate from several hundred meters to several kilometers in
seawater, depending on the intensity of the laser. This type of laser has been studied
for an underwater broadband communication system in [6]. Similar works with
different research considerations can be found in [7] and [8].
LEDs, on the other hand, have been used for underwater communication in
very short range applications [9, 10]. High power LEDs are used to assist acoustic
devices in the localization of underwater robotic swarms [11], where the high power LEDs
are used to calculate distances between the robots. The downside of using an LED is
that it has to be high powered and its wavelength has to be properly selected due
to the high light absorption coefficient of water. Moreover, compared to lasers,
light emitted by an LED is easily scattered. Optical sensing is efficient at close range
and in clear water conditions, while acoustic sensing works efficiently at long range and is not
significantly affected by turbidity. In optical based operation, the laser has been shown to have
a better transmission range, higher data rate, lower latency and better power efficiency
than the Light Emitting Diode (LED) [1]. However, the propagating light beam
suffers from absorption, scattering and multipath fading. An optical receiver is
commonly built around an optical camera, but a camera is easily affected by
ambient lighting. In other research work, a laser-based vision system has been used to localize
an underwater vehicle [12]. The laser-based vision system consists of a camera and
M. R. Arshad and M. H. A. Majid
two laser pointers as its major components. The laser pointers serve
as the target, and the camera captures the image of the target in the form of two
dots. Based on the captured image, the underwater vehicle is able to
determine its location with respect to the targets. The downside of using the camera
instead of acoustics is its short working range (i.e. 40–150 cm). The work has been
extended to include an inertial measurement unit (IMU) to assist the localization
system [13].
1.3 Optoacoustic-Based Localization
One optoacoustic application in the underwater environment, used for bathymetry
in turbid water and known as optoacoustic underwater remote sensing (OAURS), has
been studied in [14] to improve accuracy and speed up the process.
Additionally, optoacoustics has been studied for both outdoor and indoor localization [4, 15].
Underwater, it has been investigated as an acoustic signal generator based on ultra-short
laser pulses [16]. Moreover, remotely
operated underwater vehicle (ROV) guidance based on optoacoustic data fusion, together with
optoacoustic based mosaicking and positioning of underwater vehicles, is proposed in
[17]. Other examples include fish tracking using optical and acoustical data fusion
[18] and an optical and acoustic based underwater sensor network
[19].
Optoacoustics can also be used for seabed mapping and motion estimation of
underwater vehicles [20]. Beyond underwater localization, optoacoustics has been used
to perform underwater mapping with multiple robots. In the
context of that work, opto refers to an imaging device and not specifically to a laser sensor.
The imaging device operates together with an acoustic sensor for multi-AUV
trajectory optimization [20]. A communication modem based on the hybridization of
acoustic and optical signals has also been proposed, with simulation studies indicating
the efficiency and effectiveness of the optoacoustic based
underwater localization solution [21]. In general, a primary advantage of an
optoacoustic system is that it compensates for both the low resolution of
acoustic sensors and the limitations of optical sensors in poor visibility conditions. In
addition, by combining both acoustics and optics in a single underwater localization
system, localization accuracy can be maintained over both short and long
distances.
However, the above studies mostly focus on direct implementation of
optoacoustic technology without considering the impact of the acoustic signal on
the optical signal. Since both acoustic and optical signals are travelling waves, it is important to
study possible interference between them so that accurate readings and reliable
optoacoustic localization can be realized. This consideration is important
to ensure that the readings obtained by the optical sensor and the acoustic sensor can be fully trusted
(i.e. free from interference) and thus that significant estimation errors can be avoided
during the localization process. To study this problem,
the rest of this paper is organized as follows: Sect. 2 describes in detail the research
methodology used to investigate the problem, and Sect. 3
reports and discusses the research findings.
2 Methodology
To investigate the impact of the acoustic signal on the optical signal, the overall
experimental setup shown in Fig. 1 is used. In this study, we
investigate the impact when the optical and acoustic sources are located perpendicular
to each other (i.e. the most critical orientation, which gives the largest possible
interference). As can be seen from Fig. 1, the optical source is a diffused green laser
and the acoustic signal is transmitted by a wideband underwater acoustic transmitter.
A green laser is selected since it has better penetration performance compared
to other colors (i.e. different wavelengths). The diffuser is used to
increase the field-of-view of the laser beam. The acoustic signal is generated by a
signal generator, and its parameters are shown
in Table 1. A sine wave is used since its fundamental
frequency is easy to identify, being free of harmonics.
Fig. 1 Experimental setup to determine effect of acoustic on optic and vice versa
Table 1 Parameters for acoustic signal source
Parameter              Value        Unit
Type                   Sinusoidal   –
Magnitude              0–12         VDC
Frequency              1–10000      kHz
Distance to receiver   40           cm
Table 2 Specification of laser, acoustic transmitter and hydrophone
Device                 Parameter               Value       Unit
Laser                  Color/wavelength        Green/530   nm
                       Power supply            12          V
                       Power                   50          W
Acoustic transmitter   Power supply            12          V
                       Max bandwidth           10          MHz
Hydrophone             Frequency               170         kHz
                       Sensitivity             −211 ± 3    dB re 1 V/µPa
                       Operating temperature   −2 to 80    °C
The optical receiver is a photoresistor (i.e. LDR, Light Dependent Resistor)
placed in a waterproof container. The light intensity measured by the
photoresistor is read by a microcontroller and then transferred to a computer for
real time data monitoring and analysis. The microcontroller is responsible for
converting the analog signal received from the LDR to a digital signal. The intensity of
the acoustic signal is measured by a hydrophone (i.e. underwater acoustic sensor)
and transmitted to a computer through a PicoScope™ (i.e. digital oscilloscope).
The specifications of the laser, acoustic transmitter and hydrophone are given in
Table 2.
The actual lab scale experimental setup is shown in Fig. 2. The inside view of
the tank and the orientation of the receivers and transmitters are shown in Fig. 3. The
size of the tank used in this study is 52 × 38 × 31 cm. In this study, the impact of
optic on acoustic, and vice versa, is measured under variation of different input parameters. The
measured outputs are the LDR intensity (i.e. measured as an ADC value and
converted to lux) and the hydrophone intensity (i.e. measured in dBm). To
prevent the measured light intensity from being disturbed by ambient lighting (i.e. to ensure reading
consistency), a cover is placed over the tank. In other words, the data are
recorded in a dark environment where the LDR measures only the light intensity from
the diffused laser beam. The input and output parameters used in this experiment are
listed in Table 3.
Fig. 2 Actual experimental setup (labelled components: signal generator, tank, PicoScope™, computer, microcontroller)
Fig. 3 View inside the tank used for investigation (labelled components: LDR, acoustic transmitter, hydrophone, diffused green laser)
Table 3 Parameters for acoustic signal source
Parameter   Acoustic on optic                  Optic on acoustic
Input       Amplitude (VDC), Frequency (kHz)   Light intensity (lx)
Output      Light intensity (lx)               Amplitude (dBm)
3 Results and Discussions
The results shown in Fig. 4(a) through Fig. 4(f) were obtained by recording the
ADC value received from the microcontroller. The light intensity in
lux (lx), $I_{lx}$, is then given by
Fig. 4 Measured light intensity for different values of applied acoustic signal: (a) 3 V, (b) 5 V, (c) 7 V,
(d) 9 V, (e) 12 V, (f) average intensity
$$I_{lx} = \frac{L_{ADC}\, R_{ANA}}{R_{ADC}} \qquad (1)$$
where $L_{ADC}$ is the ADC value, $R_{ANA}$ is the maximum range of the analog voltage and
$R_{ADC}$ is the maximum ADC value. The presented results are calculated based on
the average of 2000 samples over three sets of measurements. From
Fig. 4(a) through Fig. 4(e), it can be observed that both frequency and
amplitude affect the light intensity reading. According to the figures, an acoustic source
with a low frequency has a smaller impact (i.e. smaller interference) on the optical
intensity measurement than a high frequency signal. Note that a high
measured intensity indicates low interference
and vice versa. This can be observed from the general trend of the light intensity
as the frequency increases: as the frequency increases, the intensity decreases.
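The conversion in Eq. (1) and the averaging over measurement sets can be sketched in Python as follows. The 10-bit ADC range (1023) and the 5 V analog range used as defaults are assumptions for illustration only, since the paper does not state the microcontroller's ADC resolution or reference voltage. The function names are likewise hypothetical.

```python
from statistics import mean


def adc_to_intensity(adc_value, analog_range=5.0, adc_max=1023):
    """Eq. (1): I_lx = L_ADC * R_ANA / R_ADC.

    analog_range (R_ANA) and adc_max (R_ADC) are assumed defaults.
    """
    return adc_value * analog_range / adc_max


def averaged_intensity(measurement_sets, analog_range=5.0, adc_max=1023):
    """Average Eq. (1) within each measurement set, then across the sets."""
    per_set = [
        mean([adc_to_intensity(v, analog_range, adc_max) for v in samples])
        for samples in measurement_sets
    ]
    return mean(per_set)
```

In the setting described above, `measurement_sets` would hold three lists of 2000 ADC readings each, recorded for one combination of acoustic amplitude and frequency.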
From Fig. 4(f), it can be clearly observed that the magnitude of the acoustic
source also affects the intensity measurement of the optical signal. The larger the
magnitude, the higher the measured intensity but, as discussed earlier, the
intensity drops slightly as the acoustic frequency increases. An
example of how the acoustic signal interferes with the optical reading is shown in Fig. 5. From the
figure, it can be observed that the intensity measured by the LDR increases when the
acoustic pinger is activated (i.e. ON).
Theoretically, green light has a low absorption coefficient and low attenuation, which
gives a relatively good intensity reading. The light beam intensity is affected by
absorption, scattering and multipath fading due to interactions between the photons and
water molecules and particles as the beam propagates through the water.
However, when the acoustic source is activated (i.e. ON), the
travelling acoustic signal additionally scatters the light beam. As a result, the
light intensity measured by the LDR decreases. Figure 6 shows the impact of the optical signal
intensity on the acoustic signal intensity measurement (i.e. taken as an average value).
Fig. 5 Actual Light sensor (linear scale LDR) response on green light source with ON and OFF
acoustic source (pinger) at 500 kHz. Setting: light source and acoustic source are at 90° from each
other (perpendicular)
Fig. 6 Acoustic intensity versus optical intensity
In this case, the intensity or brightness of the optical signal (i.e. diffused laser) is
controlled by a potentiometer while the distance between the LDR and the diffused
laser is fixed. It can be observed that the optical signal intensity does not significantly
affect the acoustic signal intensity measurement. Although there are small
discrepancies in the measurements, these are expected to be due to environmental noise
and not to the change in optical intensity. This is because optical signal
transmission is not associated with the pressure changes measured by a hydrophone.
Thus, based on the above findings, it can be concluded that the acoustic signal has
a significant effect on the optical signal, but not the other way around.
4 Conclusion
This paper presents an experimental study of the interference effect of the acoustic signal
on the optical signal, and vice versa, in an underwater environment. From the findings,
both the frequency and the amplitude of the acoustic signal
affect the intensity reading of the optical signal. On the other hand, the optical
signal does not affect the intensity value of the acoustic signal. In the future, the
impact of various external parameters such as salinity, density and
pressure will be considered, and the reliability of actual optoacoustic based
underwater localization will be studied.
Acknowledgements This research is funded by the Fundamental Research Grant
Scheme (FRGS). Account No.: 1001/PELECT/6071346.
References
1. Saeed N, Celik A, Al-Naffouri TY, Alouini M-S (2019) Underwater optical wireless
communications, networking, and localization: a survey. Ad Hoc Netw 94:101935.
https://doi.org/10.1016/j.adhoc.2019.101935
2. Carroll P, Zhou S, Zhou H, Xu X, Cui J-H, Willett P (2012) Underwater localization and
tracking of physical systems. J Electr Comput Eng 2012:11.
https://doi.org/10.1155/2012/683919
3. Paull L, Saeedi S, Seto M, Li H (2014) AUV navigation and localization: a review. IEEE J
Ocean Eng 39(1):131–149. https://doi.org/10.1109/JOE.2013.2278891
4. Esslinger D, Rapp P, Wiertz S, Rendich H, Marsden R, Sawodny O, Tarín C (2019) Accurate
optoacoustic and inertial 3-D pose tracking of moving objects with particle filtering. IEEE
Trans Instrum Meas: 1–14. https://doi.org/10.1109/tim.2019.2905749
5. Shen C, Guo Y, Oubei HM, Ng TK, Liu G, Park K-H, Ho K-T, Alouini M-S, Ooi BS (2016)
20-meter underwater wireless optical communication link with 1.5 Gbps data rate. Opt
Express 24(22):25502–25509. https://doi.org/10.1364/OE.24.025502
6. Wu T-C, Chi Y-C, Wang H-Y, Tsai C-T, Lin G-R (2017) Blue laser diode enables underwater
communication at 12.4 Gbps. Sci Rep 7:40480. https://doi.org/10.1038/srep40480
7. Zhou T, Hu S, Mi L, Zhu X, Chen W (2017) A long-distance underwater laser communication
system with photon-counting receiver. In: 2017 16th international conference on optical
communications and networks (ICOCN), 7–10 August 2017, pp 1–3
8. Shan X, Yang C, Chen Y, Xia Q (2017) A free-space underwater laser communication device
with high pulse energy and small volume. In: OCEANS 2017, Anchorage, 18–21 September
2017, pp 1–5
9. Stefano B, Marco C, Silvia G, Ivan S (2013) Advances in underwater acoustic networking. In:
Mobile ad hoc networking: the cutting edge directions. IEEE, pp 804–852
10. Han X, Peng Y, Zhang Y, Ma Z, Wang J (2015) Research on the attenuation characteristics of
some inorganic salts in seawater, vol. 10
11. dell’Erba R, Moriconi C (2015) High power leds in localization of underwater robotics
swarms. IFAC-PapersOnLine 48(10):117–122. https://doi.org/10.1016/j.ifacol.2015.08.118
12. Wu S, Zhou P, Yang C, Zhu Y, Zhi H (2019) A novel approach for underwater vehicle
localization and communication based on laser reflection. Sensors 19(10):2253
13. Vila AP (2018) 3D underwater SLAM using sonar and laser sensors. University of Girona
14. Farrant D, Burke J, Dickinson L, Fairman P, Wendoloski J (2010) Opto-acoustic underwater
remote sensing (OAURS) - an optical sonar? In: OCEANS 2010 IEEE SYDNEY, 24–27 May
2010, pp 1–7
15. Esslinger D, Rapp P, Wiertz S, Sawodny O, Tarín C (2018) Highly accurate 3D pose
estimation for economical opto-acoustic indoor localization. In: 2018 15th international
conference on control, automation, robotics and vision (ICARCV), 18–21 November 2018,
pp 1984–1990
16. Brelet Y, Jarnac A, Carbonnel J, André Y-B, Mysyrowicz A, Houard A, Fattaccioli D,
Guillermin R, Sessarego J-P (2015) Underwater acoustic signals induced by intense ultrashort
laser pulse. J Acoust Soc Am 137(4):EL288–EL292. https://doi.org/10.1121/1.4914998
17. Campos R, Gracias N, Palomer A, Ridao P (2015) Global alignment of a multiple-robot
photomosaic using opto-acoustic constraints. IFAC-PapersOnLine 48(2):20–25.
https://doi.org/10.1016/j.ifacol.2015.06.004
18. Boudhane M, Nsiri B (2017) Fish tracking using acoustical and optical data fusion in
underwater environment. Paper presented at the proceedings of the international conference
on watermarking and image processing, Paris, France
19. Vasilescu I, Kotay K, Rus D, Dunbabin M, Corke P (2005) Data collection, storage, and
retrieval with an underwater sensor network. Paper presented at the proceedings of the 3rd
international conference on embedded networked sensor systems, San Diego, California, USA
20. Lagudi A, Bianco G, Muzzupappa M, Bruno F (2016) An alignment method for the
integration of underwater 3D data captured by a stereovision system and an acoustic camera.
Sensors 16(4):536
21. Kumar ML, Rani MJ (2019) A design of novel hybrid opto-acoustic modem for underwater
communication. Int J Innov Technol Explor Eng (IJITEE) 8(8):7
Design and Development of Mini
Autonomous Surface Vessel
for Bathymetric Survey
Muhammad Ammar Mohd Adam, Zulkifli Zainal Abidin,
Ahmad Imran Ibrahim, Ahmad Shahril Mohd Ghani,
and Al Jawharah Anchumukkil
Abstract Bathymetric survey is necessary in monitoring the hydrographic environment. The
conventional approach involves operators, including an engineer, a surveyor
and a boat captain, being onboard the vessel during survey work. The use of
Autonomous Surface Vessels (ASVs) is becoming popular for bathymetric mapping
as it reduces operation cost and replaces human operation in high risk areas.
Available commercial ASVs are typically designed for a particular task with limited
expandability for other similar applications. Also, with the current miniaturization of
industrial sensors, a small-sized ASV is sufficient for most inland water survey
operations. In this paper, the design and development of a modular
mini ASV, named Suraya-1, are detailed. This vessel is developed for hydrographic
survey using a singlebeam depth sonar, measuring the depth and temperature of a water
body. The specifications and performance of the developed ASV are compared to
existing commercial unmanned vessels of the same class and application. The
vehicle is the smallest among its counterparts, with dimensions of
1.04 m × 0.35 m × 0.32 m, and weighs only 6.8 kg without payload. Our ASV is
powered by two 18.5 V LiPo batteries connected in parallel, which is in the mid-range, yet it is able
to reach a navigation speed of 4 knots as required for survey work. The real-time
vessel poses and collected data are transmitted to the ground station within a range of
2 km. For performance evaluation, the developed ASV is tested in a pool environment.
Qualitative outcomes show minimal error in navigation control. Also, the output
data obtained are shown to be consistent and reliable for a calm water environment.
Keywords Autonomous Surface Vessel · Bathymetry mapping · Hydrographic survey
M. A. Mohd Adam (✉) · Z. Zainal Abidin · A. I. Ibrahim · A. S. Mohd Ghani · A. J. Anchumukkil
International Islamic University Malaysia, 53100 Gombak, KL, Malaysia
e-mail: m.ammaradam@gmail.com
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_14
1 Introduction
An Unmanned Surface Vessel (USV) is a vessel that operates on the surface of water
without an onboard crew. A typical unmanned vessel is made up of a surface
vehicle, a ground control station, a communication and control link as well as a logistic
module [9]. While USVs have been developed widely, the fully autonomous
version, the Autonomous Surface Vessel (ASV), is still at the development
stage. ASVs, which were initially developed for military purposes, have recently also
been used extensively in commercial and scientific research applications as they
become a cheaper and easier solution for operations. Missions such as maritime
security [4], oil spill handling [5], search and rescue [7], bathymetry mapping [3, 6]
and environmental monitoring [1] are among the works taking advantage of this
technology.
Bathymetric survey is an essential operation in environmental monitoring.
Conventionally, this work requires multiple operators to be on board the survey
vessel, including a surveyor, an engineer and a boat captain. In many hydrography
applications, ASVs are becoming a more popular alternative as they replace human
operation in dangerous and remote areas and reduce the cost of manpower and
operations.
In the current market, the common types of vessel available are mono-hull and
catamaran designs [9]. Gürsel et al. presented a hydrodynamic analysis of different
hull forms considered for their catamaran vessel, designed for geological
survey of coastal and offshore areas [2]. Although a catamaran in general provides better
stability and payload capacity than single-hull platforms, a mono-hull
offers lower cost with sufficient stability and maneuverability in inland waters,
which fulfills the requirements of our application.
In another work by Vasilj et al., the developed surface vessel, which is also a
catamaran, has a modular hardware design in which components can be
replaced with a suitable payload for different applications [10]. The navigation, data
collection and propulsion systems are separated into three control levels with separate
microcontrollers, connected over a shared data link. That work specifically
targets a research platform, with only low-cost sensors integrated into the
prototype. A modular system utilizing marine-standard sensors is yet to be
implemented.
Typically, commercial ASVs are designed for specific functions
and lack the modularity needed to support other hydrographic operations. Moreover,
with recent advancements in the miniaturization of survey sensors, a small-sized
vessel is sufficient to carry the necessary operation payloads. Prainetr and
Janprom presented a mini survey robot of 1 m length in [8], integrated with a
sonar sensor which operates at 200 kHz with a scanning range of 45°. However,
the specifications of the sonar sensor used are not included in the paper.
In this work, the research is limited to developing an ASV
prototype of the small-sized vessel class for hydrographic survey in inland waters,
including rivers, lakes and dam reservoirs. Specifically, this ASV is named
Suraya-1. The platform is able to navigate in two modes, autonomous
or remotely controlled, while transmitting real-time survey data obtained by the
singlebeam echo-sounder (SBES) installed on the vessel, to be monitored onshore by a
marine geology surveyor. Among the objectives of the development are:
• To reduce operation and research costs while obtaining comparable performance
in data measurement by utilizing industrial-grade sensors
• To modularize the design and allow flexibility in integrating alternative
marine-standard sensors for other forms of water survey work such as water
quality monitoring
• To increase portability and ease of operation through a light-weight,
small-sized vessel that can be operated by a minimal number of workers
This project is in collaboration with Temasek Hidroteknik, a company experienced in
hydrographic survey work and interested in innovative marine product
development.
2 System Overview
The overall system is designed to fulfill the functional and specification needs of the
hydrographic survey industry, within the scope defined by the collaborating
industrial partner.
2.1 Functional Requirements
For bathymetry work, industrial-grade sensors are integrated within the system to ensure
that quality data output is obtained from surveys conducted using the
developed ASV. The data typically required are the depth and
temperature of the surveyed water body. However, it is also essential for the system
to obtain the current position and timestamp for every data point recorded by the
singlebeam sensor. Both of these values are obtained using the Global Navigation
Satellite System (GNSS) and transmitted to the ground station via 900 MHz radio
frequency wireless communication together with information from the sonar, IMU and
compass sensors. On the ground, the received data are used to plot the
corrected depth readings at the corresponding geo-spatial positions on a global
map. The integrated sensors and their corresponding functions are listed in
Table 1. The range limit of data transmission using the existing communication device
is 2 km.
Table 1 List of integrated sensors for hydrographic survey with their corresponding functions
Sensors                 Type of sensors     Specification
Hemisphere Atlaslink    GNSS                GNSS sensitivity: −142 dBm
                                            Update rate: 10 Hz
                                            Pitch/roll accuracy: 1
                                            Horizontal accuracy: RTK – 10 mm; L-band – 0.04 m; SBAS – 0.3 m
Inertial Labs AHRS-II   IMU                 Heading accuracy: 0.3
                                            Pitch & roll accuracy: 0.05
                                            Gyroscope bias in-run stability: 1°/h
                                            Accelerometer bias in-run stability: 0.005 mg
KVH C100                Compass             Accuracy: ±0.5
                                            Repeatability: ±0.2
                                            Resolution: 0.1
                                            Response time: 0.1 to 24 s
Airmar SS510            Singlebeam sonar    Min. depth reading: 0.4 m
                                            Max. depth reading: 200 m
                                            Depth resolution: 0.01 m
                                            Depth precision: 0.25% full range
Apart from data collection, the other essential part of the system is navigation. In
terms of functional requirements, it is important for the operator to be able to
control the motion of the vessel in real time. Being unmanned, the system is
designed to be controlled either remotely using a remote control or
autonomously according to pre-defined waypoints set by the
operator. A user-friendly Graphical User Interface (GUI) is also necessary to flatten
the learning curve for operators adapting from traditional onboard
control.
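To make the waypoint-following requirement more concrete, the sketch below computes the great-circle distance and initial bearing from the vessel's current GNSS position to the next waypoint, using the standard haversine and forward-azimuth formulas. It is an illustration only and not the navigation code of the ASV. The function names, the spherical Earth radius and the 2 m acceptance radius are assumptions.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean Earth radius (spherical model)


def distance_and_bearing(lat1, lon1, lat2, lon2):
    """Return (distance in m, initial bearing in degrees from north)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)

    # Haversine formula for the great-circle distance.
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    distance = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

    # Forward azimuth, normalised to the range [0, 360).
    y = math.sin(dlmb) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb)
    bearing = (math.degrees(math.atan2(y, x)) + 360.0) % 360.0
    return distance, bearing


def waypoint_reached(lat, lon, wp_lat, wp_lon, acceptance_radius_m=2.0):
    """True once the vessel is within the (assumed) acceptance radius."""
    distance, _ = distance_and_bearing(lat, lon, wp_lat, wp_lon)
    return distance <= acceptance_radius_m
```

A waypoint mission then reduces to steering towards the returned bearing until `waypoint_reached` is true, and repeating for the next waypoint in the list.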
2.2 Specification Requirements
This development is designed considering specifications suitable for the real environment
of survey work. To reduce the amount of resources required for an operation, it is
preferable to have dimensions and a weight manageable by one or two operators. The
vessel is based on an existing hull design with a deep-V shape, which fits the
requirement of surveying calm inland water bodies such as rivers and dams.
In terms of operating period, a minimum running time of 1 h is sufficient, with
an operating speed of at least 3 knots during survey. This translates to a technical
specification of a minimum battery capacity of 10000 mAh to power a 150–250 W brushed motor at 18.5 V. A longer operation will require a larger capacity power source.
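As a quick sanity check of these figures, the following sketch estimates the ideal running time from battery capacity, voltage and load power. It is an idealised calculation: converter and motor losses, depth-of-discharge limits and the power drawn by the electronics are ignored, and the function name is hypothetical.

```python
def runtime_hours(capacity_mah, voltage_v, load_w):
    """Ideal running time (h) = stored energy (Wh) / load power (W)."""
    energy_wh = capacity_mah / 1000.0 * voltage_v
    return energy_wh / load_w


low_load = runtime_hours(10000, 18.5, 150)   # motor at the low end of its range
high_load = runtime_hours(10000, 18.5, 250)  # motor at the high end of its range
```

Under these assumptions a 10000 mAh pack at 18.5 V stores 185 Wh, which gives about 1.23 h at 150 W and 0.74 h at 250 W. The 1 h target is therefore met only if the average draw stays at or below roughly 185 W.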
One of the distinguishing features of this ASV is the minimal operator requirement for
launch and recovery due to the miniaturized size of the vessel. This is a significant advantage
in reducing operating cost as well as improving efficiency in hydrographic work.
In addition, the ASV is designed with modular sensor mounts,
which give the system the flexibility to carry other types of sensors. This allows
the same system to be easily adapted for other similar applications such as water
monitoring and sampling.
3 Hardware Design
This section presents the hardware design of the ASV, including the hull design, the overall
system block diagram and the sub-systems: navigation, data collection,
communication and power management.
3.1 Hull Selection
Among the criteria considered in selecting a suitable vessel as the platform for
autonomous survey are minimal operator requirement, low development cost and
sufficient stability to support payloads of up to 20 kg of survey equipment. In terms
of hull design, a comparison has been conducted between a mono-hull deep-V bottom and
a catamaran, which are the two most common hull shapes used for
small-sized vessels.
The selected vessel has a mono-hull deep-V bottom design with
dimensions of 1.04 m in length, 0.32 m in height and 0.35 m in width. This design
is preferable because it costs less and is easier to operate while being sufficiently
stable for the task required. Figure 1 illustrates the CAD drawing of the selected
vessel.
Fig. 1 CAD drawing of hull design selected for development from side (left) and isometric view
(right)
3.2 System Block Diagram
The block diagram in Fig. 2 illustrates the integration between sub-systems
involved in developing the ASV which includes navigation, data collection and
ground station.
Fig. 2 Overall block diagram of the ASV system and sub-systems including the navigation, data
collection and ground station systems (blocks shown: navigation system with mission control module, on-board sensors ICM-20689, BMI055, IST8310 and MS5611, processors, external sensors IST8310 and M8N GPS, effectors servo and DC motor, and PW-Link hardware interface; data collection system with GPS, sonar, compass and AHRS sensors, data combiner and RF modem; ground station with Wi-Fi, PC and RF modem)
3.3 Navigation System
For the autonomous navigation system, the mission control module is the main controller
that connects the feedback sensors with the output effectors. The module
is built on a 32-bit STM32F765 Cortex-M7 core (216 MHz, 2 MB
memory and 512 KB RAM) with a 32-bit STM32F100 Cortex-M3 I/O co-processor
(24 MHz, 8 KB SRAM) for failsafe purposes.
Internally, the module contains multiple sensors, namely
accelerometers, gyroscopes, a magnetometer (compass) and a barometer. These integrated
sensing components are complemented by external sensors, namely a
GPS and another compass. The redundancy of certain sensors
allows corrections to be calculated for the measurements, which yields more
accurate and stable readings for pose estimation (position, orientation and motion).
The vessel is propelled by a brushed DC motor as the thruster and steered by a heavy-duty
servo acting as the rudder. The driving signals for the servo and the thruster motor driver
are generated by the mission control module based on autonomous control calculations
in autonomous mode, or on manual remote control in manual mode.
3.4 Data Collection System
The hardware architecture of the data collection system is structured to be flexible to
changes in the type of integrated sensors. The data combiner module is equipped with 5
serial RS-232 ports (4 for payload sensors and 1 reserved for the RF modem), which
enables data from 4 different sensors to be combined and transmitted directly to the receiver at the
ground station.
The module is programmed to integrate inputs from National Marine Electronics
Association (NMEA) based sensors only, as NMEA is the common communication
protocol used in marine applications (typically over an RS-232 interface). It is
implemented on an ATmega2560, which communicates with the NMEA sensors by means of a
UART to RS-232 converter (MAX232 level shifter). The combined information is
transferred point-to-point to the PC on the ground via 900 MHz
band radio modems and is used by the GUI software module for data monitoring.
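The combining step can be illustrated with a short Python sketch that validates NMEA 0183 sentences by their checksum (the XOR of all characters between `$` and `*`) and forwards only the valid ones as a single block. This is not the firmware of the data combiner, which runs on the ATmega2560. The function names and the example depth sentence are assumptions for illustration.

```python
from functools import reduce


def nmea_checksum(payload):
    """XOR of all payload characters, formatted as two upper-case hex digits."""
    return "{:02X}".format(reduce(lambda acc, ch: acc ^ ord(ch), payload, 0))


def build_sentence(payload):
    """Wrap a payload such as 'SDDBT,7.8,f,2.4,M,1.3,F' into '$...*hh'."""
    return "${}*{}".format(payload, nmea_checksum(payload))


def parse_sentence(sentence):
    """Return the list of comma-separated fields, or None if invalid."""
    sentence = sentence.strip()
    if not sentence.startswith("$") or "*" not in sentence:
        return None
    payload, _, received = sentence[1:].partition("*")
    if received.upper() != nmea_checksum(payload):
        return None
    return payload.split(",")


def combine(sentences):
    """Keep only valid sentences and join them into one outgoing block."""
    valid = [s.strip() for s in sentences if parse_sentence(s) is not None]
    return ("\r\n".join(valid) + "\r\n") if valid else ""
```

For example, `parse_sentence(build_sentence("SDDBT,7.8,f,2.4,M,1.3,F"))` returns the seven fields of a depth-below-transducer sentence, while a sentence whose checksum has been corrupted in transit is dropped by `combine`.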
The data collection system is supplied with a power input isolated from the
navigation system, so that a malfunction in one system does not affect the other.
The connected power source powers the data combiner module,
while the other components within the system draw their power from the module. In
this case, off-the-shelf lithium polymer (LiPo) batteries are sufficient to meet the
power requirement of the system.
3.5 Ground Station
The station located on the ground is the central unit collecting information from both
the navigation and data collection systems. It provides a platform for real-time
status monitoring as well as control of the launched vessel from a
remote location.
Any PC or device with a Wi-Fi connection and USB ports
can serve as the ground station. The interface of the receiving radio modem is converted to
USB using an RS-232 to USB converter. The Wi-Fi connection, on the other hand, links
directly to the PW-Link transmitter on the vessel. The communication protocol of the
device is the User Datagram Protocol (UDP), which is chosen to
achieve low latency and minimal data loss in transmission.
4 Software Design
The software architecture for the navigation and data collection systems follows the same isolation concept as the hardware design. This section explains the algorithm for waypoint missions as well as the software implementation for hydrographic surveys.
4.1 Navigation
In the navigation system, the main autonomous control runs on the mission control module. This device is based on the NuttX operating system and runs ArduPilot, an open-source firmware configured here for ASV application. The architecture implemented for this system is shown in Fig. 3.
To obtain a reliable and accurate estimate of vehicle position, velocity and angular orientation, an Extended Kalman Filter (EKF) algorithm is implemented to fuse the data from the IMU, GPS, compass and other integrated sensors.
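The idea behind this kind of sensor fusion can be sketched with a much simpler filter. The example below is a scalar, linear Kalman filter that blends a dead-reckoned displacement with a noisy position fix; it is not the EKF used by ArduPilot, and all numbers (noise variances, displacements, measurements) are illustrative assumptions.

```python
def kalman_step(x, p, u, z, q, r):
    """One predict/update cycle of a scalar Kalman filter.

    x, p : previous position estimate and its variance
    u    : displacement predicted from inertial data since the last step
    z    : new position measurement (e.g. from GPS)
    q, r : process and measurement noise variances (assumed values)
    """
    # Predict: move the estimate forward and grow its uncertainty.
    x_pred = x + u
    p_pred = p + q
    # Update: weight the measurement by the Kalman gain.
    k = p_pred / (p_pred + r)
    x_new = x_pred + k * (z - x_pred)
    p_new = (1.0 - k) * p_pred
    return x_new, p_new


# Start with a vague estimate and fuse three hypothetical measurements.
x, p = 0.0, 4.0
for u, z in [(0.5, 0.62), (0.5, 1.05), (0.5, 1.48)]:
    x, p = kalman_step(x, p, u, z, q=0.01, r=0.25)
```

The variance `p` shrinks with every update, which is the mechanism that lets a filter of this family settle on a position estimate more reliable than any single sensor. The EKF extends the same predict/update cycle to nonlinear motion and measurement models.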
On the ground control station (GCS), telemetry data is fed into the ground-based PC through Mission Planner, an open-source mission planning application. The software is used for entering operation waypoints, configuring the navigation firmware, monitoring real-time output from the mission control module and logging mission data.
Fig. 3 Zoomed view of ArduPilot architecture re-configured for ASV application (blocks shown: Main Loop, Background thread, Inertial Sensor, Extended Kalman Filter, Barometer, Flight Mode, GPS, Position Control, Motor & Servo Control, Hardware Abstraction Layer (HAL), Hardware PWM Input, Hardware PWM Output)
4.2 Data Collection
Payload sensor measurement is the essential part of the whole ASV system for this application. The data received from the vessel are recorded and displayed on the ground PC so that the surveyor in operation can monitor data quality. The sensor readings are processed, displayed on a purpose-built GUI, and pushed to an online server to allow data access from the control station on the ground.
5 System Integration
The navigation and data collection subsystems are initially tested separately on the vessel to validate the performance of each system. Once verified, both systems are installed in the vessel for integrated system testing in a pool environment.
The small size of the vessel, with its limited internal space and payload capacity, is among the main challenges in this development. In addition, an off-the-shelf vessel is used to reduce cost, which limits how far the design can be customized.
Table 2 Parameters in measurement of longitudinal center of gravity (LCG) for existing setup. Calculation for lateral COG is neglected due to positions of components being in the center of body

List of components        Weight (kg)   LCG (mm)   Moment (kg.mm)
GPS                       1.15          390        448.5
Singlebeam sonar          1.30          440        572.0
IMU                       0.28          600        168.0
Compass                   0.07          670        46.9
Mission control module    0.09          800        72.0
Batteries                 10.00         540        5400.0
TOTAL:                    12.89         –          6707.4
To overcome the limitation of space and payload, a balanced weight distribution of the on-board payloads is critical. Each component is weighed and positioned according to arrangements calculated from its moment, as shown in Table 2. The moment of each component is calculated using Eq. (1). The longitudinal center of gravity (COG) of all the components on board is calculated using Eq. (2) to be 520.36 mm from the transom, while the longitudinal center of gravity of the vessel is 520 mm. This setup is near the ideal arrangement for balanced longitudinal weight.

\[ \text{Moment} = \text{Weight} \times \text{LCG} \tag{1} \]

\[ \text{Distance}_{\text{Ref. to COG}} = \frac{\text{Moment}_{\text{total}}}{\text{Weight}_{\text{total}}} \tag{2} \]
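The arithmetic of Eqs. (1) and (2) can be reproduced directly from the values in Table 2. The short sketch below does so; the variable names are illustrative only.

```python
# (weight in kg, LCG in mm measured from the transom), values from Table 2
components = {
    "GPS": (1.15, 390),
    "Singlebeam sonar": (1.30, 440),
    "IMU": (0.28, 600),
    "Compass": (0.07, 670),
    "Mission control module": (0.09, 800),
    "Batteries": (10.00, 540),
}

# Eq. (1): moment of each component = weight x LCG
moments = {name: w * lcg for name, (w, lcg) in components.items()}

# Eq. (2): combined COG = total moment / total weight
total_weight = sum(w for w, _ in components.values())
total_moment = sum(moments.values())
combined_lcg = total_moment / total_weight

print(f"Total weight: {total_weight:.2f} kg")
print(f"Total moment: {total_moment:.1f} kg.mm")
print(f"Combined LCG: {combined_lcg:.2f} mm from transom")
```

Because the batteries account for most of the weight, their position dominates the result, which is why the combined LCG lies close to the 540 mm battery position.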
To address the limitation of using a pre-fabricated vessel, the use of space is optimized. The singlebeam sonar is mounted externally on a support extending from the top side of the vessel to the bottom, as shown in Fig. 4. This design minimizes the amount of support material required, and hence the weight, while positioning the payload as close to the COG as possible.
Fig. 4 Design of ASV with mounting for singlebeam sensor on the external body from isometric
and side view. Singlebeam sensor is below the vessel and GPS is positioned above the vessel
Fig. 5 Fabricated final design of ASV
6 Results and Discussion
The final overall design is fabricated and integrated with all components required for the navigation and data collection systems. The setup is shown in Fig. 5. The developed ASV is compared with existing commercial ASVs for shallow and calm water in terms of specifications and capabilities, as summarized in Table 3.
Table 3 Comparison of specifications between Suraya-1 and recent commercial ASVs of the small-sized class for singlebeam hydrographic survey: USV Inception MK1 [13], Teledyne Z-Boat 1250 [12] and OceanAlpha SL20 [11]

Specifications                          Suraya-1         Inception MK1    Z-Boat 1250      OceanAlpha SL20
Length                                  1.04 m           1.40 m           1.27 m           1.05 m
Width                                   0.35 m           1.32 m           0.94 m           0.55 m
Weight                                  20 kg            37 kg            22 kg            24 kg
Hull type                               Mono hull        Twin hull        Tri hull         Mono hull
Power                                   18.7 V DC        12 V DC          14.4 V DC        33 V DC
Speed                                   2–4 knots        2–3 knots        2–3 knots        2–5 knots
Endurance                               Up to 2 h        Up to 4 h        Up to 4 h        Up to 6 h
Range                                   Up to 2 km       750 m            750 m            Up to 2 km
Launch/recovery
  Transport                             Via car or van   Via car or van   Via car or van   Via car or van
  Launch from slipway/launching cradle  1 person         1 person         1 person         1 person
  Launch from pontoon/river edge        1 person         2 persons        1 person         1 person
Autonomous                              Yes              Yes              No               Yes
The comparison shows that our ASV has the smallest length and width, which contributes to it being the lightest of the compared designs. With a mid-range powered motor, Suraya-1 delivers the speed needed for survey work, which is in the range of 2 to 3 knots. However, its endurance is currently the lowest of the four ASVs. To increase battery capacity, another battery of the same rating can be connected in parallel; the downside of this approach is a significant increase in total weight. Typically, 2 h is sufficient for survey work in small areas. The range for data collection and navigation is up to 2 km, which is on par with the SL20 and longer than that of the other two competitors.
For transportation, all the compared vessels are designed to fit in a car or van. For launch and recovery, only the MK1 requires an extra operator when the ASV is released from a pontoon or river edge; otherwise, a single operator is sufficient. In terms of navigation, our vessel, the MK1 and the SL20 are capable of autonomous operation, whereas the Z-Boat only allows remote-controlled navigation.
To further evaluate the performance and capabilities of Suraya-1, an experiment is conducted in the freshwater swimming pool at the International Islamic University Malaysia (IIUM) Male Sports Complex in Gombak, Malaysia. The specific objective of the trial is to assess the functionality of the ASV in performing a hydrographic survey in a shallow inland water environment. The depth of the test pool ranges from 1.5 to 2.5 m, which is within the detection range of our singlebeam sonar sensor.
Fig. 6 Sample pool test implementation showing path travelled for pre-defined waypoints. The
yellow line is the travelled path and red line is the target path. Green pinpoints are the target points
The path navigated by our ASV in the pool test is shown in Fig. 6. A qualitative evaluation shows that the navigation system of the ASV prototype is able to follow the defined waypoint path with minimal deviation from the pre-calculated target path.
Sample raw data output from the payload sensors during the experiment is extracted and tabulated in Table 4. The data is received as NMEA strings synchronised by timestamp. The GPS coordinates are received in real time and provide accurate positioning of the vessel. The depth and temperature readings also agree with manual measurements. For the IMU data, based on our observation, the heave values obtained are as expected, since the vessel experiences very little motion in the pool. At this stage, the compass performance is likewise evaluated by observation only; further analysis will be conducted in future work.
Table 4 Sample of data output collected from payload sensors for pool test hydrographic survey conducted in IIUM

                           GPS                             Singlebeam sonar                IMU         Compass
Timestamp                  Latitude (°)   Longitude (°)    Depth (m)   Temperature (°C)    Heave (m)   Compass (°)
2019-08-27 11:11:52.245    3.2504278      101.7405416      1.50        30.08               −0.01       138.2
2019-08-27 11:11:52.740    3.2504278      101.7405394      1.50        30.08               −0.01       137.2
2019-08-27 11:11:53.730    3.2504289      101.7405369      1.50        30.08               −0.01       136.2
2019-08-27 11:11:54.769    3.2504317      101.7405336      1.50        30.08               −0.01       135.3
2019-08-27 11:11:55.264    3.2504333      101.7405319      1.51        30.08               −0.01       134.5
2019-08-27 11:11:55.761    3.2504355      101.7405292      1.51        30.08               −0.01       133.2
2019-08-27 11:11:56.752    3.2504391      101.7405257      1.51        30.08               −0.01       132.7
2019-08-27 11:11:57.749    3.2504419      101.7405227      1.51        30.08               −0.01       132.3
2019-08-27 11:11:58.240    3.2504429      101.7405212      1.51        30.08               −0.01       131.9
2019-08-27 11:11:59.772    3.2504465      101.7405171      1.51        30.08               −0.01       131.6
7 Conclusion and Future Works
In this paper, the design and development of a light-weight class autonomous surface vessel (ASV) for hydrographic survey is presented. The realized vessel is to be tested and applied in real industrial survey applications, where a robust and stable control system is critical. The targeted test environment is specifically calm inland water bodies.
A modular architecture for various payloads is adopted in designing the systems, both software and hardware. This makes it possible to extend the ASV to other sensors and applications supportable by the designed platform. With the existing setup and capacity, it can support extra payloads of up to 20 kg in total.
In comparison to existing commercial ASVs of the same class and application, our ASV stands out as the smallest in dimension and lightest in weight. This contributes to a more efficient vessel that reaches speeds of up to 4 knots with mid-range batteries. However, the endurance of the vessel is currently low compared to its counterparts and requires further improvement.
To improve the capability of the ASV in operation, the propellers and battery capacities can be upgraded to better specifications. As a replacement for the current brushed DC motor, a brushless motor would be a more efficient solution. For the power source, installing more parallel-connected batteries of the same capacity will improve operation time, with the trade-off of increased weight. Depending on the requirements of operation, these improvements shall be considered in future development.
Acknowledgements This research paper is supported by research initiative grant scheme with the
number RIGS16-348-0512, International Islamic University Malaysia (IIUM) with equipment and
additional financial support by Temasek Hidroteknik.
References
1. Dunbabin M, Grinham A (2017) Quantifying spatiotemporal greenhouse gas emissions using
autonomous surface vehicles. J Field Robot 34(1):151–169
2. Gürsel KT, Taner M, Ünsalan D, Neşer G (2018) Design of a marine autonomous surface
vehicle for geological and geophysical surveys. Sci. Bull. Nav. Acad. 21:20–36
3. Han J, Park J, Kim T, Kim J (2015) Precision navigation and mapping under bridges with an
unmanned surface vehicle. Auton Robots 38(4):349–362
4. Johnston P, Poole M (2017) Marine surveillance capabilities of the AutoNaut wave-propelled
unmanned surface vessel (USV). In: OCEANS 2017 – Aberdeen, pp 1–46
5. Maawali WA, Al Naabi A, Yaruubi Al M, Saleem A, Maashri A.A (2019) Design and
implementation of an unmanned surface vehicle for oil spill handling. In: 2019 1st
International Conference on Unmanned Vehicle Systems-Oman (UVS), pp 1–6
6. Mat Idris MH, Sahalan MI, Abdullah MA, Zainal Abidin Z (2015) Development and initial
testing of an autonomous surface vehicle for shallow water mapping. ARPN J Eng Appl Sci
10:7113–7118
7. Matos A, Silva E, Cruz N, Alves JC, Almeida D, Pinto M, Martins A, Almeida J, Machado D
(2013) Development of an unmanned capsule for large-scale maritime search and rescue. In:
2013 OCEANS - San Diego, pp 1–8
8. Prainetr S, Surface A (2017) Development of mini hydrography survey robot, pp 2–5
9. Stateczny A, Burdziakowski P (2019) Universal autonomous control and management system
for multipurpose unmanned surface vessel. Polish Marit Res 26(1):30–39
10. Vasilj J, Stancic I, Grujic T, Music J (2017) Design, development and testing of the modular
unmanned surface vehicle platform for marine waste detection. J Multimed Inf Syst 4:195–204
11. Autonomous Survey Boat SL20. https://www.oceanalpha.com/product-item/sl20/. Accessed
11 Nov 2019
12. Teledyne Z-Boat 1250. http://www.teledynemarine.com/zboat1250. Accessed 11 Nov 2019
13. The Inception Class MK1 USV. https://www.unmannedsurveysolutions.com/usv-inceptionmki/. Accessed 11 Nov 2019
Control, Instrumentation and Artificial
Intelligent Systems
Optimal Power Flow Solutions
for Power System Operations Using
Moth-Flame Optimization Algorithm
Salman Alabd, Mohd Herwan Sulaiman,
and Muhammad Ikram Mohd Rashid
Abstract This article proposes a recent metaheuristic optimization technique, the Moth-Flame Optimizer (MFO), to solve one of the most important problems in power systems, namely optimal power flow (OPF). Three objectives are solved simultaneously using weighting factors: minimizing fuel cost, transmission loss, and voltage deviation. To show the effectiveness of the proposed MFO in solving this problem, the IEEE 30-bus test system is used. The results obtained from the MFO algorithm are then compared with those of other selected well-known algorithms. The comparison shows that MFO gives better results than the other algorithms: MFO reduces the weighted objective value by 14.50% relative to the initial setting, compared to 13.38% and 14.15% for artificial bee colony (ABC) and Improved Grey Wolf Optimizer (IGWO) respectively.
Keywords Optimal power flow · MFO · Economic dispatch · Optimal reactive power
S. Alabd · M. H. Sulaiman (✉) · M. I. M. Rashid (✉)
Faculty of Electrical and Electronics Engineering Technology, Universiti Malaysia Pahang, 26600 Pekan, Pahang, Malaysia
e-mail: herwan@ump.edu.my
M. I. M. Rashid
e-mail: mikram@ump.edu.my
S. Alabd
e-mail: slmnamn2014@gmail.com
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666, https://doi.org/10.1007/978-981-15-5281-6_15

1 Introduction
Optimal power flow (OPF) has attracted increasing interest from electrical researchers since it is a key tool that helps power system utilities determine the optimal economic and secure operation of the electric grid. The predominant purpose of OPF is to optimize certain objective functions, such as minimizing fuel cost, emission, transmission loss or voltage deviation, while meeting operational constraints such as line capacity, bus voltage, generator capability, and power flow balance. These objective functions can be solved as a single-objective or multi-objective problem.
Optimal reactive power dispatch (ORPD) is a part of optimal power flow (OPF). ORPD has a substantial impact on the security and the economic operation of the electric grid. The ORPD problem contains both continuous and discrete variables, so it is considered a mixed nonlinear problem. The control variables of the ORPD problem are the reactive power outputs of generators and static VAR compensators, bus voltage magnitudes, and angles. Another sub-problem of OPF is economic dispatch (ED), one of the complex problems in power systems, which aims to find the optimal allocation of generator unit outputs to meet the load demand at the lowest generation cost while satisfying the equality and inequality constraints.
Several optimization techniques have been used to solve the OPF problem, ranging from traditional to metaheuristic optimization algorithms. In recent years, metaheuristic optimization algorithms have been developed by simulating chemical, physical and biological phenomena. Lately, many nature-inspired metaheuristic algorithms have been applied to solve the OPF problem and its sub-problems ORPD and ED. Artificial Bee Colony (ABC) [1], Opposition-Based Gravitational Search Algorithm (OGSA) [2], Grey Wolf Optimizer (GWO) [3] and Harmony Search Algorithm (HSA) [4] have been used to solve ORPD separately. ED, on the other hand, has been solved by many metaheuristics such as Grey Wolf Optimizer (GWO) [5], Moth-Flame Optimization (MFO) algorithm [6], Particle Swarm Optimization (PSO) [7], and Genetic Algorithm (GA) [8]. Moreover, many optimization techniques have been implemented to solve the ED and ORPD problems simultaneously, such as the improved grey wolf optimizer (IGWO) [9], Modified Sine-Cosine Algorithm (MSCA) [10], Gravitational Search Algorithm (GSA) [11] and Particle Swarm Optimization (PSO) [12].
According to the no free lunch theorem, no single metaheuristic algorithm is best for every problem [13], so in this paper the Moth-Flame Optimizer is considered for solving the optimal power flow (OPF) problem. The performance of the proposed technique is tested on the standard IEEE 30-bus test system, where the objective functions are the minimization of generation fuel cost, the minimization of power losses and voltage profile improvement.
2 Problem Formulation
The OPF problem is a nonlinear complex optimization problem that minimizes certain objective functions while subject to equality and inequality constraints. It can be expressed as follows:

\[ \min f(y, x) \tag{1} \]

while subject to

\[ h(x) = 0 \tag{2} \]

\[ g(x) \le 0 \tag{3} \]
In this paper, economic dispatch, optimal reactive power dispatch, and voltage profile improvement are taken into consideration as objective functions, as follows:
2.1 Economic Dispatch
The main objective of economic dispatch is to reduce the generation cost, which can be formulated as a quadratic function [14]:

\[ F_1 = \min\left(\sum_{i=1}^{N} F_i(P_i)\right) = \sum_{i=1}^{N} \left(a_i + b_i P_i + c_i P_i^2\right) \tag{4} \]

where $F_1$ is the total fuel cost, $N$ is the total number of generating units, $F_i$ is the fuel cost of generator $i$, $P_i$ is the power generated by generator $i$, and $a_i$, $b_i$ and $c_i$ are the cost coefficients of generator $i$.
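Eq. (4) can be evaluated with a few lines of code. The following is a minimal sketch; the cost coefficients and generator outputs are placeholder values chosen for illustration and are not taken from this paper.

```python
def fuel_cost(p_mw, coeffs):
    """Total fuel cost of Eq. (4): sum of a_i + b_i*P_i + c_i*P_i^2 over all units.

    p_mw   : generator outputs P_i in MW
    coeffs : (a_i, b_i, c_i) cost coefficients per generator
    """
    return sum(a + b * p + c * p * p for p, (a, b, c) in zip(p_mw, coeffs))


# Hypothetical two-generator example (coefficients are assumptions).
example_coeffs = [(0.0, 2.0, 0.00375), (0.0, 1.75, 0.0175)]
example_output = [100.0, 50.0]
total_cost = fuel_cost(example_output, example_coeffs)  # in $/h
```

The quadratic term means that the marginal cost of each unit rises with its output, which is what drives the optimizer to share the load between units rather than loading the cheapest one to its limit.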
2.2 Optimal Reactive Power Dispatch Problem
The objective of ORPD is to minimize the real power losses of the transmission system while satisfying the equality and inequality constraints. It is formulated as follows [15]:

\[ F_2 = \min(P_{Loss}) = \min \sum_{i=1}^{N} P_L = \sum_{i=1}^{N} G_{ij}\left(V_i^2 + V_j^2 - 2 V_i V_j \cos\delta_{ij}\right) \tag{5} \]

where $P_{Loss}$ is the real power loss in the transmission system and $N$ is the number of lines. $G_{ij}$ is the conductance of the line between the i-th and j-th buses, $V_i$ and $V_j$ are the voltages at the i-th and j-th buses respectively, and $\delta_{ij}$ is the voltage phase angle difference between the i-th and j-th buses.
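The loss expression in Eq. (5) can be sketched as a sum over lines. In the example below the two-bus network, its conductance and its voltages are hypothetical, and the data layout (tuples of from-bus, to-bus, conductance) is an assumption made for illustration.

```python
import math


def line_losses(lines, v, delta):
    """Real power loss of Eq. (5), summed over all lines.

    lines : iterable of (i, j, g_ij) with g_ij the line conductance in p.u.
    v     : bus voltage magnitudes in p.u., indexed by bus number
    delta : bus voltage angles in radians, indexed by bus number
    """
    return sum(
        g * (v[i] ** 2 + v[j] ** 2 - 2.0 * v[i] * v[j] * math.cos(delta[i] - delta[j]))
        for i, j, g in lines
    )


# Hypothetical single line between bus 1 and bus 2.
demo_lines = [(1, 2, 4.0)]
flat_loss = line_losses(demo_lines, {1: 1.0, 2: 1.0}, {1: 0.0, 2: 0.0})
mismatch_loss = line_losses(demo_lines, {1: 1.05, 2: 1.0}, {1: 0.0, 2: 0.0})
```

With identical voltages and angles at both ends no current flows and the loss is zero; any magnitude or angle difference produces a positive loss, which is the quantity the ORPD objective drives down.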
2.3 Voltage Profile Enhancement
The objective of voltage profile enhancement is to minimize the voltage deviation [3]:

\[ F_3 = \min(VD) = \min \sum_{i=1}^{N_d} \left| V_i - 1 \right| \tag{6} \]

where $V_i$ is the voltage at load bus $i$ and $N_d$ is the number of load buses.
2.4 The Weighted Objective Functions
The proposed optimization objective function is formulated by combining the three aforementioned objective functions into a single objective function as follows [9]:

\[ F = F_1 + w_1 F_2 + w_2 F_3 \quad \$/\text{h} \tag{7} \]

where $w_1$ and $w_2$ are the weighting factors, which can be selected by the user [9].
2.5 Equality Constraints
The load flow balance equations are the equality constraints, which state that the total load demand plus the total power losses must equal the total power generation. The equality constraints can be described as follows [9]:

\[ P_{Gi} = P_{Di} + V_i \sum_{j \in N_i} V_j \left( G_{ij} \cos\theta_{ij} + B_{ij} \sin\theta_{ij} \right) \tag{8} \]

\[ Q_{Gi} = Q_{Di} + V_i \sum_{j \in N_i} V_j \left( G_{ij} \sin\theta_{ij} - B_{ij} \cos\theta_{ij} \right) \tag{9} \]
2.6 Inequality Constraints
Generator Limit
The voltage, real power and reactive power of each generator must be kept within their minimum and maximum limits [9]:

\[ V_{Gi}^{\min} \le V_{Gi} \le V_{Gi}^{\max}, \quad i = 1, 2, \ldots, N \tag{10} \]

\[ P_{Gi}^{\min} \le P_{Gi} \le P_{Gi}^{\max}, \quad i = 1, 2, \ldots, N \tag{11} \]

\[ Q_{Gi}^{\min} \le Q_{Gi} \le Q_{Gi}^{\max}, \quad i = 1, 2, \ldots, N \tag{12} \]

Transformer Tap Setting
The tap ratio of each transformer must be kept within its minimum and maximum limits [9]:

\[ T_i^{\min} \le T_i \le T_i^{\max}, \quad i = 1, 2, \ldots, N_T \tag{13} \]

Reactive Compensators
The output of each shunt VAR compensator must be kept within its minimum and maximum limits [9]:

\[ Q_{Ci}^{\min} \le Q_{Ci} \le Q_{Ci}^{\max}, \quad i = 1, 2, \ldots, N_C \tag{14} \]
3 Moth-Flame Optimizer (MFO)
The moth-flame optimizer is a stochastic nature-inspired algorithm proposed by Mirjalili in 2015 [16]. Moths are insects related to butterflies, and they go through two main stages in their lifetime: larva and adult. Moths use a special navigation technique to travel at night, called transverse orientation. The idea of transverse orientation is that, by maintaining a fixed angle with respect to a natural light source such as the moon, a moth can travel in a straight line. Since the moon is very far away, it appears stationary and provides a fixed reference point for moths to navigate in a straight line. With the advent of lamps, however, moths get confused: they take the lamplight as an artificial moon, try to keep a constant angle to it, and end up circling the artificial light because it is too close.
3.1 MFO Mathematical Formulation
The set of moths can be represented as a matrix [16]:

\[ M = \begin{bmatrix} m_{1,1} & m_{1,2} & \cdots & m_{1,d} \\ m_{2,1} & m_{2,2} & \cdots & m_{2,d} \\ \vdots & \vdots & \ddots & \vdots \\ m_{n,1} & m_{n,2} & \cdots & m_{n,d} \end{bmatrix} \tag{15} \]

where $n$ is the number of moths, which represent the candidate solutions, and $d$ is the number of variables.
The corresponding fitness value of each moth is stored in an array as follows [16]:

\[ OM = \begin{bmatrix} OM_1 \\ \vdots \\ OM_n \end{bmatrix} \tag{16} \]
A matrix similar to the moth matrix is defined for the flames [16]:

\[ F = \begin{bmatrix} F_{1,1} & F_{1,2} & \cdots & F_{1,d} \\ F_{2,1} & F_{2,2} & \cdots & F_{2,d} \\ \vdots & \vdots & \ddots & \vdots \\ F_{n,1} & F_{n,2} & \cdots & F_{n,d} \end{bmatrix} \tag{17} \]

where $n$ is the number of moths, which represent the candidate solutions, and $d$ is the number of variables.
The corresponding fitness value of each flame is stored in an array as follows [16]:

\[ OF = \begin{bmatrix} OF_1 \\ \vdots \\ OF_n \end{bmatrix} \tag{18} \]
It is important to note that flames and moths are both candidate solutions; they differ only in how they are updated. The moths are the actual search agents that move around the search space, whereas the flames are the best moth positions obtained so far. While searching, each moth drops a flame as a marker, searches around it, and updates it whenever a better solution is found. In this way a moth never loses its best result obtained so far. The way moths update their positions with respect to the flames can be modeled as follows [16]:

\[ M_i = S(M_i, F_j) \tag{19} \]

where $M_i$ and $F_j$ indicate the i-th moth and the j-th flame respectively, and $S$ represents the spiral function. The logarithmic spiral function used as the update mechanism is modeled as follows [16]:

\[ S(M_i, F_j) = D_i \cdot e^{bt} \cdot \cos(2\pi t) + F_j \tag{20} \]
where $D_i$ indicates the distance of the i-th moth from the j-th flame, $b$ is a constant which defines the shape of the logarithmic spiral, and $t$ is a random value within the range [−1, 1]. $D_i$ is calculated as follows [16]:

\[ D_i = \left| F_j - M_i \right| \tag{21} \]

where $M_i$ indicates the i-th moth and $F_j$ indicates the j-th flame.
To balance exploration and exploitation of the search area, moths move around the flames and are not required to stay within the space between a flame and a moth, as modeled by the spiral in Eq. (20). When the next position lies outside the space between the flame and the moth, exploration occurs; when the next position lies within that space, exploitation occurs. To reach a global optimum and avoid being trapped in local optima, every moth updates its position with respect to its corresponding flame using Eq. (20) (Fig. 1). The number of flames is decreased over the course of the iterations according to [16]:

\[ \text{flame no} = \text{round}\left(N - l \cdot \frac{N - 1}{T}\right) \tag{22} \]

where $l$ is the current iteration number, $N$ is the maximum number of flames and $T$ is the maximum number of iterations.

Fig. 1 The spiral flying path of Moth around light source [16]
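The update rules in Eqs. (19) to (22) can be put together in a compact sketch. The code below is a minimal pure-Python illustration, not the MATLAB implementation used in this work: it minimizes a simple sphere function rather than the OPF objective, draws t uniformly from [−1, 1] for every variable as described in the text, and uses b = 1 and simple clipping to the bounds as assumptions.

```python
import math
import random


def flame_count(iteration, n_flames, max_iter):
    """Eq. (22): number of flames kept at the given iteration."""
    return round(n_flames - iteration * (n_flames - 1) / max_iter)


def spiral_update(moth, flame, rng, b=1.0):
    """Eqs. (20) and (21): move a moth along a logarithmic spiral around a flame."""
    new_position = []
    for m, f in zip(moth, flame):
        t = rng.uniform(-1.0, 1.0)   # random spiral parameter per variable
        d = abs(f - m)               # Eq. (21): distance to the flame
        new_position.append(d * math.exp(b * t) * math.cos(2.0 * math.pi * t) + f)
    return new_position


def mfo(objective, dim, lower, upper, n_moths=30, max_iter=300, seed=0):
    """Minimize `objective`; returns (best fitness, best position, best-so-far history)."""
    rng = random.Random(seed)
    moths = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n_moths)]
    flames, history = [], []
    for it in range(1, max_iter + 1):
        # Keep moths inside the search bounds, then evaluate them.
        moths = [[min(max(x, lower), upper) for x in m] for m in moths]
        scored = [(objective(m), m) for m in moths]
        # Flames are the best n positions found so far (old flames + new moths).
        flames = sorted(flames + scored, key=lambda pair: pair[0])[:n_moths]
        history.append(flames[0][0])
        # Moth i follows flame i; surplus moths share the last remaining flame.
        k = flame_count(it, n_moths, max_iter)
        moths = [spiral_update(m, flames[min(i, k - 1)][1], rng)
                 for i, m in enumerate(moths)]
    return flames[0][0], flames[0][1], history


def sphere(x):
    """Simple test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_f, best_x, history = mfo(sphere, dim=2, lower=-10.0, upper=10.0,
                              n_moths=20, max_iter=100, seed=1)
```

Because the flame list always retains the best positions seen so far, the best fitness recorded in `history` can never get worse from one iteration to the next. For the OPF problem, `objective` would be replaced by a load-flow evaluation of the weighted objective with the control variables as the moth position.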
3.2 Implementing MFO in Solving ORPD and ED Problems
The MFO algorithm is applied to the ORPD and ED problems by searching for the optimal control variables that minimize the objective functions while fulfilling the equality and inequality constraints. The implementation of MFO for solving the ORPD and ED problems is shown in the flow chart in Fig. 2.
Fig. 2 MFO flow chart for solving the objective function
4 Results and Discussion
To find the optimal setting of the control variables for the OPF problem, the proposed MFO method is tested on the standard IEEE 30-bus test system. All simulations were carried out in MATLAB R2017a with the MATPOWER 6.0 software package on a personal computer with a 64-bit 1.6 GHz i5 processor and 8 GB RAM. In this paper, 30 search agents were used and the maximum number of iterations was 300. The weighting factors w1 and w2 are set to 19.50 and 200 respectively.
4.1 IEEE 30-Bus Systems
The bus and line data of the IEEE 30-bus test system are found in [18]. This test system is composed of six generators located at buses 1, 2, 5, 8, 11 and 13, and four transformers located on lines 6–9, 6–10, 4–12, and 27–28. The total load demand is 283.40 + j126.20 MVA. The total real power losses and the total reactive power losses are 5.6035 MW and 29.9294 MVAr respectively. Figure 3 shows the single line diagram of the IEEE 30-bus system, while Table 1 shows the limits of the control variables for the IEEE 30-bus system.
To evaluate the performance of the proposed MFO, its optimal results are compared with the simulation results of other popular optimization approaches, namely ABC [9] and IGWO [9]. For a fair comparison between MFO and the chosen methods, the optimization results of these methods reported in their respective references are inserted into the MATPOWER load flow to evaluate the proposed objective function.

Fig. 3 Single line diagram of the IEEE-30 bus system [18]

Table 1 Upper and lower limit of control variables for the IEEE 30-bus system

Control variable                      Lower bound   Upper bound
PG1 (MW)                              50            200
PG2 (MW)                              20            80
PG5 (MW)                              15            50
PG8 (MW)                              10            35
PG11 (MW)                             10            30
PG13 (MW)                             12            40
Generator voltages (p.u.)             0.95          1.1
Transformer tap setting (p.u.)        0.9           1.1
Reactive compensator sizing (MVAr)    −10           10
Load voltage (p.u.)                   0.95          1.05
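One simple way to keep a candidate solution inside the limits of Table 1 is to check or clip each control variable against its bounds before the load flow is run. The sketch below does this for the generator outputs only; the helper names and the sample candidate are hypothetical, and the paper does not state which constraint-handling scheme was actually used.

```python
# Generator real power limits in MW (lower, upper), taken from Table 1.
BOUNDS_MW = {
    "PG1": (50, 200),
    "PG2": (20, 80),
    "PG5": (15, 50),
    "PG8": (10, 35),
    "PG11": (10, 30),
    "PG13": (12, 40),
}


def violations(candidate, bounds):
    """Names of the variables whose value lies outside its [lower, upper] range."""
    return [name for name, value in candidate.items()
            if not bounds[name][0] <= value <= bounds[name][1]]


def clamp(candidate, bounds):
    """Copy of the candidate with every value clipped to its bounds."""
    return {name: min(max(value, bounds[name][0]), bounds[name][1])
            for name, value in candidate.items()}


# Hypothetical candidate produced by an optimizer step.
candidate = {"PG1": 210.0, "PG2": 50.0, "PG8": 5.0}
out_of_range = violations(candidate, BOUNDS_MW)
repaired = clamp(candidate, BOUNDS_MW)
```

Clipping is only one option; penalty terms added to the objective are a common alternative for dependent variables such as load bus voltages, which cannot be set directly.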
4.2 The Weighted-Objective Function
The three objective functions, namely minimizing transmission power losses, minimizing generation cost and improving the voltage profile, are combined into one single objective function using the weighting factors; this is called the weighted objective function.
Table 2 shows the results obtained by MFO against those reported for the artificial bee colony (ABC) and Improved Grey Wolf Optimizer (IGWO) methods. MFO outperforms the other two methods with 967.59 $/h, a reduction of 14.50%, compared to 980.1586 $/h (13.38%) and 971.4114 $/h (14.15%) for ABC and IGWO respectively. The convergence of MFO is shown in Fig. 4.
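The reported objective values can be cross-checked against Eq. (7) using the fuel cost, power loss and voltage deviation listed in Table 2. The sketch below assumes w1 = 19.5 and w2 = 200, the weights with which the tabulated components reproduce the reported objective values, and computes the percentage reductions relative to the initial case.

```python
W1, W2 = 19.5, 200.0  # assumed weighting factors for loss and voltage deviation

# method: (fuel cost $/h, power loss MW, voltage deviation p.u., reported objective $/h)
results = {
    "Initial": (901.3495, 5.6035, 0.6051, 1131.6336),
    "ABC": (833.9610, 6.0396, 0.1421, 980.1586),
    "IGWO": (831.38, 6.06672, 0.10867, 971.4114),
    "MFO": (830.1046, 6.1289, 0.0899, 967.59),
}


def weighted_objective(fuel, loss, deviation, w1=W1, w2=W2):
    """Eq. (7): F = F1 + w1*F2 + w2*F3."""
    return fuel + w1 * loss + w2 * deviation


recomputed = {name: weighted_objective(*vals[:3]) for name, vals in results.items()}

baseline = results["Initial"][3]
reduction_pct = {name: 100.0 * (baseline - vals[3]) / baseline
                 for name, vals in results.items() if name != "Initial"}

for name in results:
    print(f"{name:8s} recomputed F = {recomputed[name]:.2f} $/h")
for name, pct in reduction_pct.items():
    print(f"{name:8s} reduction = {pct:.2f} %")
```

The comparison shows that MFO accepts a slightly higher transmission loss than ABC and IGWO in exchange for a lower fuel cost and a smaller voltage deviation, which is what gives it the lowest weighted objective.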
Table 2 The obtained results of MFO for the weighted objective function

Control variables               Initial      ABC [9]      IGWO [9]     MFO
Generator output (MW)
  PG1                           99.00        119.338      123.3468     199.9683
  PG2                           80.00        54.8327      50.8357      50.84092
  PG5                           50.00        29.2442      30.3516      31.36332
  PG8                           20.00        35           35           35
  PG11                          20.00        30           28.3808      26.79478
  PG13                          20.00        21.041       21.5518      20.56381
Generator voltage (p.u.)
  VG1                           1.060        1.0268       1.0295       1.030482
  VG2                           1.045        1.0156       1.0171       1.016681
  VG5                           1.010        0.994        0.9974       0.999912
  VG8                           1.010        0.9981       1.0006       0.999795
  VG11                          1.082        1.0459       1.0015       1.029194
  VG13                          1.071        1.0331       1.0528       1.001948
Transformer tap ratio (p.u.)
  T4-12                         1.0780       0.98         1.0107       1.040193
  T6-9                          1.0690       0.9381       0.975        1.002741
  T6-10                         1.0320       1.0125       1.0556       0.953949
  T28-27                        1.0680       0.9672       0.978        0.979411
Capacitor bank (MVAr)
  Qc10                          0.0          1.4017       2.1785       10
  Qc12                          0.0          −6.1533      −10          −1.16987
  Qc15                          0.0          3.5496       10           2.7043
  Qc17                          0.0          0.5092       3.4209       1.314517
  Qc20                          0.0          4.8013       7.7976       8.443245
  Qc21                          0.0          −3.0998      10           10
  Qc23                          0.0          8.7841       2.256        3.742131
  Qc24                          0.0          8.4659       9.8128       10
  Qc29                          0.0          2.4237       3.5445       3.803413
Fuel cost ($/h)                 901.3495     833.9610     831.38       830.1046
Power loss (MW)                 5.6035       6.0396       6.06672      6.1289
Voltage deviation (p.u.)        0.6051       0.1421       0.10867      0.0899
Objective function ($/h)        1131.6336    980.1586     971.4114     967.59
Fig. 4 Convergence performance of MFO for Case 1 (IEEE 30-bus)
5 Conclusion
In this paper, MFO has been applied to solve the OPF problem. The three objectives, namely minimizing fuel cost, transmission loss, and voltage deviation, were combined into one weighted objective function. The performance of MFO has been tested on the standard IEEE 30-bus test system. The results show that MFO is competitive on the OPF problem compared to the other optimization techniques in the literature. Applying MFO to a multi-objective formulation is highly recommended.
Acknowledgements This work was supported by the University Malaysia Pahang (UMP) and the
Ministry of Higher Education Malaysia (MOHE) under Fundamental Research Grant
Scheme FRGS/1/2017/TK04/UMP/03/1 & RDU170129.
References
1. Ayan K, Kiliç U (2012) Artificial bee colony algorithm solution for optimal reactive power
flow. Appl Soft Comput J 12(5):1477–1482
2. Shaw B, Mukherjee V, Ghoshal SP (2014) Solution of reactive power dispatch of power
systems by an opposition-based gravitational search algorithm. Int J Electr Power Energy Syst
55:29–40
3. Sulaiman MH, Mustaffa Z, Mohamed MR, Aliman O (2015) Using the gray wolf optimizer
for solving optimal reactive power dispatch problem. Appl Soft Comput J 32:286–292
4. Khazali AH, Kalantar M (2011) Optimal reactive power dispatch based on harmony search
algorithm. Int J Electr Power Energy Syst 33(3):684–692
5. Sulaiman MH, Ing WL, Mustaffa Z, Mohamed MR (2015) Grey wolf optimizer for solving
economic dispatch problem with valve-loading effects. ARPN J Eng Appl Sci 10(21):9796–
9801
6. Sulaiman MH, Mustaffa Z, Rashid MIM, Daniyal H (2018) Economic dispatch solution using
moth-flame optimization algorithm. In: MATEC web of conferences, vol 214
7. Park J-B, Jeong Y-W, Lee W-N, Shin J-R (2008) An improved particle swarm optimization
for economic dispatch problems with non-smooth cost functions, 20(1):7
8. Chen P-H, Chang H-C (2002) Large-scale economic dispatch by genetic algorithm. IEEE
Trans Power Syst 10(4):1919–1926
9. Taha IBM, Elattar EE (2018) Optimal reactive power resources sizing for power system
operations enhancement based on improved grey wolf optimiser. IET Gener Transm Distrib
12(14):3421–3434
10. Attia AF, El Sehiemy RA, Hasanien HM (2018) Optimal power flow solution in power
systems using a novel Sine-Cosine algorithm. Int J Electr Power Energy Syst 99
(January):331–343
11. Duman S, Güvenç U, Sönmez Y, Yörükeren N (2012) Optimal power flow using gravitational
search algorithm. Energy Convers Manag 59:86–95
12. Abido MA (2002) Optimal power flow using particle swarm optimization. Int J Electr Power
Energy Syst 24(7):563–571
13. Wolpert DH, Macready WG (1997) No free lunch theorems for optimization. IEEE Trans
Evol Comput 1:67–82
14. Sulaiman MH, Ing WL, Mustaffa Z, Mohamed MR (2015) Grey wolf optimizer for solving
economic dispatch problem with valve-loading effects. ARPN J Eng Appl Sci 10(21):1619–
1628
15. Abdel-Fatah S, Ebeed M, Kamel S (2019) Optimal reactive power dispatch using modified
sine cosine algorithm. In: Proceedings of 2019 international conference on innovation trends
computer engineering, ITCE 2019, no February, pp 510–514
16. Mirjalili S (2015) Moth-flame optimization algorithm: a novel nature-inspired heuristic
paradigm. Knowl-Based Syst 89:228–249
17. Ng Shin Mei R, Sulaiman MH, Mustaffa Z, Daniyal H (2017) Optimal reactive power
dispatch solution by loss minimization using moth-flame optimization technique. Appl Soft
Comput J 59:210–222
18. Lee KY, Park YM, Ortiz JL (1985) A united approach to optimal real and reactive power
dispatch. IEEE Power Eng Rev PER-5(5):42–43
A Pilot Study on Pipeline Wall
Inspection Technology Tomography
Muhammad Nuriffat Roslee, Siti Zarina Mohd. Muji,
Jaysuman Pusppanathan, and Mohd. Fadzli Abd. Shaib
Abstract Malaysia is the world’s third-largest exporter of liquefied natural gas and
the second-largest oil and natural gas producer in Southeast Asia. The oil and gas
industry in Malaysia has therefore developed rapidly, and with it the use of steel
pipe. Steel pipe is essential and widely used for transporting fluids such as
petroleum, gas, water and steam. Corrosion and blockage are the main problems in the
oil and gas industry. However, the main inspection technique reportedly used in
Malaysia relies on radiation sources such as gamma rays or X-rays, which is dangerous
if extensive care is neglected. Hence, a thorough discussion of established pipe wall
inspection technologies is pivotal, as they are applied in different situations of
application or study. This paper focuses on the suitability, basic functionality,
advantages and disadvantages of each established pipe wall inspection technology.
Most tomography researchers in Malaysia use acrylic pipe as the test subject for
experiments with tomography hardware. Such an implementation does not entirely
portray the real process of pipeline inspection as conducted by oil and gas
companies. In this research, steel pipe is used to imitate the real situation of
pipeline inspection. The issues raised are therefore more realistic when the
experiment is conducted on real steel pipe, which could help solve the industry
problem. Based on the review conducted, a steel pipe of
M. N. Roslee S. Z. Mohd.Muji (&) Mohd.F. Abd. Shaib
Faculty of Electrical and Electronic Engineering, Universiti Tun Hussein Onn Malaysia,
Parit Raja, 86400 Batu Pahat, Johor, Malaysia
e-mail: szarina@uthm.edu.my
M. N. Roslee
e-mail: ge180118@siswa.uthm.edu.my
Mohd.F. Abd. Shaib
e-mail: fadzli@uthm.edu.my
J. Pusppanathan
Sports Innovation & Technology Centre (SiTC), Institute of Human Centered Engineering
(iHumen), Faculty of Engineering, Universiti Teknologi Malaysia, 81310 Skudai, Johor,
Malaysia
e-mail: jaysuman@utm.my
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_16
diameter 203.2 mm and thickness 7.7 mm will be used in this research to address the
industrial problem. A finite element analysis simulation with ultrasonic transducers
as the main sensors shows that the ultrasonic wave can successfully penetrate the
steel pipe. In conclusion, ultrasonic tomography can be used for this purpose, as the
simulation indicates that a frequency of 40 kHz with an excitation voltage of 20 V is
the most suitable for operating the ultrasonic tomography system.
1 Introduction
Tomography is a technology that produces an image of the interior of a system, e.g.
a process vessel or pipeline, from signals measured by sensors located around the
object of interest. In other words, tomography is a method of displaying a
representation of the interior of a solid object by means of any kind of penetrating
wave incident on the object. The word ‘tomography’ derives from the Greek words
‘tomo’, meaning ‘to slice’, and ‘graph’, meaning ‘image’ [1].
Over the years, tomography has been widely used in medical imaging, for example
X-ray, CT scan, single photon emission computed tomography (SPECT) and MRI, for
diagnosing disease, monitoring the effectiveness of therapy and many other purposes.
In the recent decade, however, tomography has evolved rapidly and become a beneficial
technology for imaging actual process materials, such as in pipelines and vessels,
and in many other fields.
A basic tomography system consists of hardware, which includes the sensors and
measurement circuits, software for image reconstruction, and a display unit for
showing the reconstructed image. There are various types of tomography, including
X-ray, gamma-ray, microwave, ultrasound, optical, positron emission tomography (PET),
nuclear magnetic resonance, capacitance, resistance, impedance and electrical
charge [1]. Tomography sensors also rely on different sensing methods, namely
attenuation, transmission, reflection, diffraction and impedance [1]. Figure 1 shows
the basic tomography system and its fields of application. Tomography has proved to
be a successful and reliable tool in industrial process applications.
Applications of tomography include inspection, concentration inspection, process
control monitoring, flow pattern identification, environmental monitoring and flow
measurement, as shown in Fig. 1. Every type of sensor has its own characteristics,
advantages and disadvantages. This paper discusses the techniques of wall thickness
inspection and multiphase imaging, both functionally and technically.
Fig. 1 Basic tomography
system and applications [1]
2 Types of Tomography Sensor
In the tomography field, different types of sensor are used to detect the desired
parameter. The sensor is essential because it determines the sensing method used in
the process vessel. Nowadays, tomography systems are expected to be non-invasive and
non-intrusive, as this eases the monitoring process. However, each type of tomography
has its advantages, limitations and drawbacks; hence the tomography sensor must be
selected according to the case studied.
Rahim and Rahiman [1] list several factors to consider when choosing a tomography
sensor:
(1) The molecular structure of the components contained in the pipeline, vessel,
reactor, or desired material (particles, gases, liquids and mixtures).
(2) The industrial environment, such as humidity, temperature, noise, maintenance,
and safety implications.
(3) The requirements such as imaging resolution, measurement speed, measurement
sensitivity and temporal resolution.
(4) The size and cost of the process equipment, and the length scale of the case
study area.
(5) The human resource requirements and any potential hazards to personnel.
The common types of tomography sensor are:
(a) X-ray
(b) Gamma rays
(c) Microwave
(d) Ultrasound
(e) Optical
(f) Positron emission tomography
(g) Nuclear magnetic resonance
(h) Capacitance
(i) Impedance
(j) Electrical charge
Each sensor is the main part of an established tomography technology, as discussed
in the next subsection.
2.1 Pipe Wall Inspection Technology
Malaysia is the world’s third-largest exporter of liquefied natural gas, the
second-largest oil and natural gas producer in Southeast Asia, and is strategically
located amid important routes for seaborne energy trade [2]. This signifies that the
oil and gas industry in Malaysia has developed rapidly, and with it the use of steel
pipe. Steel pipe is essential and widely used for transporting fluids such as
petroleum, gas, water and steam [3].
However, constant pressure, high temperature, mineral deposition along the pipe wall
and corrosion can lead to thinning of the pipe wall, cracks, or even leakage of oil
and gas as the condition worsens [4]. Such events can cause huge economic loss,
threaten production safety and bring disaster to the surrounding environment [5].
Hence, continuous monitoring of oil and gas pipelines is essential for improving
corrosion management and for making strategic progress in pipeline operation through
corrosion prediction. It can also become an important technology for assuring
production safety and is a developing trend in the digital oilfield.
Over the years, various technologies have been established for pipeline wall
inspection. Each has its own working principle and operates either invasively or
non-invasively.
2.1.1 X-Ray
X-ray computed tomography was introduced in the 1970s [6]. Since their discovery,
X-rays have become an important tool in medical diagnosis, materials testing and
many other applications [7].
Conventional X-ray imaging relies on the difference in X-ray attenuation between
strongly absorbing structures, such as bones, and weakly absorbing parts, such as
the surrounding tissue, in the examined target.
However, X-rays are not only absorbed in the object but also refracted and
scattered, producing measurable deviations from their original direction and thus
enabling the measurement of additional signal components such as phase contrast and
dark-field contrast.
For instance, the castings industry has for years used X-ray inspection to verify
the structural integrity of its castings, and the same approach is used for pipeline
inspection [8]. The first manually operated, off-line, film-based inspection systems
have been replaced with fully automatic real-time X-ray systems able to make
pass/fail decisions without operator intervention. These systems can now be
integrated directly into the manufacturing process, which makes monitoring
easier [8].
X-rays are generated by bombarding a thick target with energetic electrons. X-ray
tomography uses the ability of X-ray radiation to penetrate objects. On the way
through an object, part of the impinging radiation is absorbed. The longer the
radiographic path length through the object, the less radiation escapes from the
opposite side.
The absorption also depends on the material. An X-ray detector (sensor) captures
the escaping X-ray radiation as a two-dimensional radiographic image. With detector
sizes of approximately 50 to 400 mm, a large portion of the measured object can be
captured in a single image, as shown in Fig. 2.
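The dependence of the escaping radiation on path length and material described above
is commonly modelled with the Beer-Lambert law, I = I0 exp(-mu x). The short Python
sketch below illustrates that relation only. The attenuation coefficient is an
assumed placeholder chosen for illustration, not a value from this study, and the
function name is hypothetical.

```python
import math


def transmitted_fraction(mu_per_mm: float, path_mm: float) -> float:
    """Return I/I0 for a narrow, monoenergetic beam (Beer-Lambert law)."""
    if mu_per_mm < 0 or path_mm < 0:
        raise ValueError("attenuation coefficient and path length must be non-negative")
    return math.exp(-mu_per_mm * path_mm)


# Assumed, illustrative attenuation coefficient in 1/mm (not taken from the paper).
MU_ASSUMED = 0.1

if __name__ == "__main__":
    # Compare one pipe wall thickness (7.7 mm) with two wall thicknesses (15.4 mm).
    for path in (7.7, 15.4):
        fraction = transmitted_fraction(MU_ASSUMED, path)
        print(f"path {path:5.1f} mm -> I/I0 = {fraction:.3f}")
```

Doubling the path length squares the transmitted fraction, which is why thick or
dense walls leave little radiation for the detector.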
Figure 3 shows the flowchart of the X-ray tomography process from beginning to end.
This is the basic structure of the X-ray tomography process.
An advantage of X-ray tomography is that it can operate at higher temperatures than
other established technologies [9]. Furthermore, the image reconstructed from X-ray
measurements is reliable and precisely depicts the interior of the system [10].
X-ray is one of the tomography techniques that offers high inspection efficiency,
good economic effect and real-time problem evaluation [11].
X-ray has a number of drawbacks. The use of ionizing radiation is very dangerous
because of its hazard potential, and Brian Plonsky of the International Atomic
Energy Agency (IAEA) mentioned that radiography is the most widely used NDT
technology in Malaysia [12]. The technique is dangerous to humans if extensive
safety precautions are not observed. Furthermore, the system is usually very bulky
and hence not practical in size for pipeline inspection. Lastly, its cost and
maintenance are very high owing to the use of ionizing radiation. Therefore, a new
way of detecting the pipe condition is needed that replaces the radiation source
with ultrasonic sensors or another tomography method.
Fig. 2 The basic operation of X-ray [9]
Fig. 3 The flowchart of the whole process of X-ray tomography [9]
2.1.2 Electrical Capacitance Tomography (ECT)
The measurement principle of electrical capacitance tomography depends on the
permittivity of the material or media inside the pipeline [13]. Different materials
have different permittivity values, and these values are used to reconstruct the
image. The user can distinguish and identify each material inside the pipe from the
reconstructed image.
This system has been used in the process industries for measuring the component
fractions of multicomponent flows. It is very useful since it operates very fast,
non-invasively and without ionizing radiation. From an industrial point of view,
ionizing radiation is not favoured owing to its high cost and hazard potential.
However, according to Ruzairi et al., electrical capacitance tomography has a few
disadvantages: there is no simple linear relationship between the measured
capacitance and the dielectric distribution, and the sensitivity to changes is
small [13].
2.1.3 Electrical Impedance Tomography (EIT)
EIT was first developed for medical applications in the early 1980s and was then
extended to industrial processes such as process vessels. The basic principle of EIT
is that a current signal is injected through one pair of electrodes while the other
electrodes measure the resulting voltages; this is repeated for the other pairs of
current-injection electrodes [14]. The technique is mostly applied in industrial
processes that use a conducting fluid to carry the desired compound from one place
to another.
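The injection-and-measurement cycle described above can be sketched as a list of
electrode pairs. The sketch assumes an adjacent-pair drive and adjacent-pair
measurement pattern, which is one common EIT protocol but is not stated in the
source. The electrode count of 16 and the function name are likewise illustrative
assumptions.

```python
def adjacent_protocol(n_electrodes: int):
    """List (drive_pair, measure_pair) tuples for an adjacent EIT pattern.

    Assumption: current is driven between neighbouring electrodes and voltages
    are read between neighbouring electrodes that do not touch the drive pair.
    """
    if n_electrodes < 4:
        raise ValueError("at least four electrodes are needed")
    frames = []
    for d in range(n_electrodes):
        drive = (d, (d + 1) % n_electrodes)
        for m in range(n_electrodes):
            measure = (m, (m + 1) % n_electrodes)
            # Skip measurement pairs that share an electrode with the drive pair.
            if set(drive) & set(measure):
                continue
            frames.append((drive, measure))
    return frames


if __name__ == "__main__":
    frames = adjacent_protocol(16)
    # 16 drive pairs, each with 13 usable measurement pairs.
    print(len(frames))
```

Under this assumed pattern, each of the 16 drive pairs leaves 13 measurement pairs,
giving 208 voltage readings per frame.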
The advantages of EIT are its relatively low cost compared with other tomography
techniques, its simpler design, and the fact that it is non-invasive and
non-intrusive. However, EIT is not suitable for processes that transport fluid or
material containing a large amount of non-conducting solid material.
2.1.4 Magnetic Flux Leakage (MFL)
MFL is a technique that was established in the 1950s and has since become the most
commonly used tool for pipeline inspection [13]. It is considered the most effective
and dependable on-line method among inner corrosion-detection technologies for oil
and gas pipelines [15–18].
The working principle of MFL is shown in Figs. 4 and 5. In Fig. 4(a), the inspected
pipe is intact, without any metal loss, and the magnetic flux passes entirely
through the magnetic circuit. In Fig. 4(b), there is a defect (metal loss,
corrosion) in the pipe wall. The defect area has a different magnetic permeability
from that of intact steel pipe [15, 19]. The difference in magnetic flux between
intact and defective steel pipe is therefore pivotal, as it indicates whether the
pipe is in sound or defective condition.
As a result, most of the magnetic flux passes around the flaw, a small fraction
passes through the defect, and some departs from the top and bottom surfaces and
passes around the defect through the air [15]. This last, leaked part of the
magnetic flux can be acquired by sensors and stored in a computer for analysis,
which can be used to evaluate the dimensions and characteristics of the defects.
Figure 5 shows the operation of the pipeline intervention gadget (PIG) inside the
steel pipeline. It is an intrusive technology and hence not efficient in terms of
cost and operation time.
MFL can be used to detect corrosion before pipe failure, as well as leaks that have
occurred in pipes. MFL offers high accuracy compared with other established
technologies because it works invasively inside the pipeline, with high sensitivity
and no disturbance [20]. MFL can also provide qualitative information on the
presence of different defects in the steel pipe [13].
However, from an industry perspective, technologies that offer a low-cost inspection
process are favoured, whereas MFL inspection is costly and time consuming. In
addition, it is not a non-invasive technology, so it does not meet the current
industry interest, which leans towards non-destructive testing (NDT). Lastly,
because MFL operates invasively and the pipe diameter varies within every pipeline
system, it is not suitable for pipelines of varying diameter. MFL technology would
need to become more applicable to pipelines of varying diameter and to curved
pipelines. There has been little research on MFL tools that can change their size
according to the pipe diameter to avoid becoming stuck.
Fig. 4 Inspection principle of MFL [15]: (a) pipe without metal loss; (b) pipe with
defect
Fig. 5 MFL hardware component [13]
2.1.5 Ultrasonic Guided Wave
Guided wave tomography, as shown in Fig. 6, is a technology that has emerged in the
recent decade. It is a promising method since it can be used for long-range
inspection of up to 15 m [21–23]. Conventional point-by-point methods such as ECT,
EIT and X-ray imply a slow inspection process that becomes very expensive when full
inspection coverage is needed. It is therefore useful to introduce a quick and
sufficiently accurate method for detecting corrosion. Guided wave inspection is also
a non-destructive and non-intrusive technique [22].
Conceptually, guided waves are generated by the interference of two types of waves:
longitudinal and transverse. In longitudinal waves, the particles of the medium move
parallel to the propagation direction of the wave. In transverse waves, the
particles move perpendicular to the propagation direction, as shown in Fig. 6 [23].
In general, the working principle of ultrasonic guided waves is based on measuring
the wave velocity, attenuation and mode scattering of the received signal using fast
Fourier transform algorithms [24]. In practice, ultrasonic guided wave inspection
detects and assesses the severity of defects in steel pipe by measuring the
amplitude of the waves reflected from the defect area [25]. A quantitative study of
the reflection coefficient makes it possible to determine the defect size, given the
dimensions of the pipe and the frequency at which the wave is excited [25].
Figure 7 shows an ultrasonic guided wave travelling in a medium (steel pipe) that
contains a crack. At the crack, the guided wave experiences scattering or mode
conversion at the discontinuity. Reflection and refraction can be expected at the
defect area [26, 27]. By processing the wave that carries the discontinuity
information, the location of the defect area can be estimated, as shown in
Fig. 7 [21, 26].
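One simple way to turn a reflected signal into a defect location is a pulse-echo
time-of-flight estimate. The sketch below assumes a single, non-dispersive guided
wave mode with a known group velocity and a transducer that both sends and receives.
The velocity and echo time are assumed placeholder values, not measurements from
this study, and the function name is hypothetical.

```python
def defect_distance_m(echo_time_s: float, group_velocity_m_s: float) -> float:
    """Estimate the axial distance to a reflector from a pulse-echo arrival time.

    The wave travels to the defect and back, so the one-way distance is half
    of the total path covered during the echo time.
    """
    if echo_time_s < 0 or group_velocity_m_s <= 0:
        raise ValueError("echo time must be >= 0 and velocity must be > 0")
    return group_velocity_m_s * echo_time_s / 2.0


if __name__ == "__main__":
    ASSUMED_GROUP_VELOCITY = 3200.0  # m/s, illustrative value only
    ASSUMED_ECHO_TIME = 5.0e-3       # s, illustrative value only
    distance = defect_distance_m(ASSUMED_ECHO_TIME, ASSUMED_GROUP_VELOCITY)
    print(f"estimated defect distance: {distance:.2f} m")
```

In practice the estimate degrades when several dispersive modes overlap, which is
the limitation discussed later in this subsection.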
Fig. 6 Representation of guided waves. L = longitudinal wave. T = transverse
wave [23]
Fig. 7 Pipe model of crack defect [22]
The advantages of guided wave technology are that the guided wave travels along the
pipe without much energy attenuation and that the ultrasonic wave propagates by
vibrating the particles of both the inner and outer pipe surfaces. Thus, 100% pipe
wall inspection is achieved [21, 28].
Ultrasonic guided wave inspection has a few drawbacks. Unlike other techniques, it
cannot display the defect area of the steel pipe exactly and can only detect a
defect by observing the amplitude of the wave [24]. It cannot investigate the
multiphase flow inside the pipe because the Lamb wave travels only within the pipe
wall. Lastly, the effectiveness of this method is based on the assumption that the
leakage-induced acoustic waves propagate along the pipeline as an individual
non-dispersive guided wave with small attenuation. In reality, this assumption is
not always valid because the acoustic waves are multi-modal blended signals, which
results in missed detection and mislocation of leakage [29–31]. Recently, most
studies have focused on signal processing algorithms to increase the accuracy of the
received signal.
2.1.6 Ultrasonic Tomography
In physics, sound is produced by the vibration of an object (particles) and
propagates through a transmission medium such as a gas, liquid or solid. Humans can
hear frequencies between 20 Hz and 20 kHz. Sound waves below 20 Hz and above 20 kHz
are not perceptible by humans and are called infrasound and ultrasound respectively,
as shown in Fig. 8.
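The frequency bands defined above can be written as a small classifier. This is a
minimal sketch using the 20 Hz and 20 kHz limits stated in the text; the function
name is hypothetical.

```python
def classify_sound(frequency_hz: float) -> str:
    """Classify a frequency using the 20 Hz and 20 kHz limits of human hearing."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    if frequency_hz < 20.0:
        return "infrasound"
    if frequency_hz <= 20_000.0:
        return "audible"
    return "ultrasound"


if __name__ == "__main__":
    for f in (10.0, 1_000.0, 40_000.0):
        print(f"{f:>8.0f} Hz -> {classify_sound(f)}")
```

The 40 kHz operating frequency considered in this paper falls in the ultrasound
band.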
Fig. 8 The sound range frequency [32]
Tomography is a real-time imaging technique that has come to dominate the oil and
gas industry in recent years [33]. In general, the basic principle of tomography is
to produce a density image by exposing the material to sound waves, or any other
physical stimulus able to penetrate the material, and recording the response of the
object. Using computers and mathematical models, the internal image of the system
can then be constructed [34–36].
(i) Waves
To assess the capability of the ultrasonic sensor to penetrate the material, the
characteristics of the wave travelling inside the material should first be studied.
There are four types of ultrasonic waves: longitudinal waves, shear waves, Rayleigh
waves and Lamb waves [37, 38]. These waves are shown in Figs. 9, 10, 11 and 12.
Figure 9 shows longitudinal waves (compression waves), the type of wave that humans
can hear; they are used for testing the front end of the pipe body structure and the
integrity of pipe plate [37]. Longitudinal waves move through the material by
compression and rarefaction of the particles of the medium. Figure 10 shows shear
waves (transverse waves), which propagate more slowly and have a shorter wavelength
than longitudinal waves [39]. They are commonly used for detecting discontinuities
in both the inner and outer layers of the pipe.
Figure 11 shows the Rayleigh wave, which travels only along the surface of the
material at a velocity approximately equal to that of the shear wave [37]. Figure 12
shows the ultrasonic Lamb wave (plate wave), which vibrates the material from its
upper to its lower surface. Lamb waves can be used to detect the location and extent
of discontinuities in metal pipe.
Fig. 9 Graphical depiction of parallel motion response of particles of longitudinal
ultrasonic waves [37]
Fig. 10 Graphical depiction of perpendicular motion response of particles of shear
ultrasonic waves [37]
Fig. 11 Graphical depiction of limited detection area of Rayleigh waves, confined on
the surface of material [37]
From the discussion above, it is clear that Rayleigh waves are not suitable for
detecting cracks or corrosion inside a steel pipeline because the wave travels only
on the surface of the pipe. Any crack or corrosion located beneath the pipe surface
cannot be detected.
Longitudinal, shear and Lamb waves are the modes most widely used for ultrasonic
testing of pipelines [40, 41]. Figures 9, 10 and 12 show that these waves travel
throughout the entire medium (steel pipe), so any crack or corrosion inside the pipe
medium can be detected by measuring the received signal.
Fig. 12 Graphical depiction of ultrasonic Lamb waves (plate wave) [28, 37]
3 Result and Analysis
3.1 Frequency Selection
Frequency selection is one of the major factors in the success of an ultrasonic
tomography sensor. With the right frequency, the wave can penetrate the internal
system, the internal image of the system can be reconstructed, and an analysis can
then be made.
The higher the frequency of the ultrasonic wave, the faster the wave decays and the
shorter its wavelength [42]. Hence, it cannot travel far within the pipe
material [33]. The relation between frequency and wavelength follows from the
speed-of-sound equation below.
$$c = f\lambda$$
where:
$c$ = speed of sound
$f$ = frequency of sound
$\lambda$ = sound wavelength
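As a worked example of this relation, the sketch below computes the wavelength at
the two frequencies discussed in this section. The sound speed of 5900 m/s is an
assumed, typical textbook value for longitudinal waves in steel and is not taken
from this study; the function name is hypothetical.

```python
def wavelength_m(speed_m_s: float, frequency_hz: float) -> float:
    """Return the wavelength from c = f * lambda."""
    if speed_m_s <= 0 or frequency_hz <= 0:
        raise ValueError("speed and frequency must be positive")
    return speed_m_s / frequency_hz


if __name__ == "__main__":
    ASSUMED_SPEED_STEEL = 5900.0  # m/s, assumed typical longitudinal speed in steel
    for f_hz in (40e3, 390e3):
        lam_mm = wavelength_m(ASSUMED_SPEED_STEEL, f_hz) * 1000.0
        print(f"{f_hz / 1e3:5.0f} kHz -> wavelength = {lam_mm:.1f} mm")
```

Under this assumed speed, the wavelength at 40 kHz is roughly ten times longer than
at 390 kHz, consistent with the statement that higher frequencies give shorter
wavelengths.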
Steel pipe is commonly used in the oil and gas industry, so an appropriate
ultrasonic sensor should be chosen with regard to its performance on steel. When an
ultrasonic sensor is applied to steel pipe, the Lamb wave becomes more pronounced
and disrupts the reading of the received signal [40, 43]. The Lamb wave does not
provide any information about an object disturbance or obstruction inside the pipe
because it propagates only within the pipe wall [44, 45].
Abbaszadeh et al. ran a simulation using finite element analysis to find a suitable
frequency for steel pipe with minimum Lamb wave disturbance (noise). Their result
indicated that 40 kHz is the optimum frequency for the steel pipe [43]. However,
Nordin et al. stated that the selected frequency should be high, at 390 kHz, to
reduce the percentage of Lamb wave propagation. Accordingly, the frequency range
from 40 to 490 kHz should be tested using finite element analysis to obtain a
better, optimum frequency for image reconstruction [46]. Thus, it is crucial to
balance the trade-offs in developing an ultrasonic tomography system by considering
the optimum frequency of the system.
This paper presents a pilot study of established pipeline wall inspection
technologies and proposes the use of ultrasonic tomography, because no researcher
has yet applied ultrasonic tomography to steel pipe with an outer diameter of
203.2 mm and a thickness of 7.7 mm. A finite element analysis simulation with
ultrasonic transducers as the main sensors shows that the ultrasonic wave can
successfully penetrate the steel pipe. The simulation yields measurement results
from which a suitable frequency and excitation voltage for operating the ultrasonic
tomography system can be identified. In the next study, this method will be applied
to investigate how the ultrasonic sensor responds to a pipeline fully filled with
oil that has a clogging issue.
Table 1 shows the sound pressure level at four excitation voltages (5, 10, 20 and
24 V) and at frequencies in the range of 40 kHz to 2 MHz applied to the steel pipe
with a diameter of 203.2 mm and a thickness of 7.7 mm. The sound pressure level
changes significantly whenever the frequency and, above all, the voltage change.
This can be seen in Fig. 13, where 20 V gives one of the highest sound pressure
levels among the tested voltages; thus, it can be concluded that 20 V is the most
stable and the most suitable voltage for the ultrasonic wave to penetrate the steel
pipe from transmitter to receiver.
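As a quick arithmetic check on Table 1, the offsets between the voltage columns can
be compared with the level change that would be expected if sound pressure scaled
linearly with excitation voltage, 20 log10(V2/V1). That scaling is an illustrative
assumption, not a statement of how the simulation was configured. The values used
are the 40 kHz row of Table 1, and the function names are hypothetical.

```python
import math

# Sound pressure level at 40 kHz from Table 1, keyed by excitation voltage (V).
ROW_40KHZ = {5.0: -720.7, 10.0: -714.68, 20.0: -708.66, 24.0: -707.07}


def expected_level_change(v_from: float, v_to: float) -> float:
    """Level change if pressure is proportional to voltage (assumed model)."""
    return 20.0 * math.log10(v_to / v_from)


def compare_adjacent(row: dict) -> list:
    """Return (v_from, v_to, observed, expected) for neighbouring voltages."""
    volts = sorted(row)
    result = []
    for lo, hi in zip(volts, volts[1:]):
        observed = row[hi] - row[lo]
        result.append((lo, hi, observed, expected_level_change(lo, hi)))
    return result


if __name__ == "__main__":
    for lo, hi, observed, expected in compare_adjacent(ROW_40KHZ):
        print(f"{lo:4.0f} V -> {hi:4.0f} V: table {observed:5.2f}, model {expected:5.2f}")
```

For this row, the tabulated offsets agree with the assumed proportional model to
within the rounding of the table.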
Table 1 The sound pressure level for 5, 10, 20 and 24 V
Frequency (Hz)   Sound pressure   Sound pressure   Sound pressure   Sound pressure
                 level 5 V        level 10 V       level 20 V       level 24 V
40 k             −720.7           −714.68          −708.66          −707.07
80 k             −732.74          −726.72          −720.7           −719.11
120 k            −739.78          −733.76          −727.74          −726.16
160 k            −744.78          −738.76          −732.74          −731.16
200 k            −748.66          −742.64          −736.62          −735.03
240 k            −751.82          −745.8           −739.78          −738.2
280 k            −754.5           −748.48          −742.46          −740.88
320 k            −756.82          −750.8           −744.78          −743.2
360 k            −758.87          −752.85          −746.83          −745.34
400 k            −760.7           −754.68          −748.66          −747.07
440 k            −762.35          −756.33          −750.31          −748.73
480 k            −763.87          −757.87          −751.82          −750.24
1 M              −776.62          −770.6           −764.57          −762.99
2 M              −788.66          −782.64          −776.62          −775.03
[Figure: "Frequency vs Pressure Sound Level". Horizontal axis: frequency, 40 kHz to
2 MHz; vertical axis: sound pressure level, −800 to −660; series: sound pressure
level at 5 V, 10 V, 20 V and 24 V]
Fig. 13 The graph between frequency and pressure sound level
3.2 Transducer Projection
A crucial aspect of ultrasonic tomography is the selection of the transducer's
transmission mode and transmission beam. This ensures that a large portion of the
region of interest is illuminated by the ultrasonic wave, so that a better image can
be reconstructed and more information about the internal system of the pipeline
obtained.
Ayob et al. stated that a narrow beam gathers less information about the internal
system than a wide transmission beam. This contrasts with medical ultrasonic
sensors, which require a very narrow beam for fine lateral resolution. Hence, a very
wide beam is necessary so that the information gained is more reliable and precise.
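The effect of beam width on coverage can be illustrated with a simple geometric
sketch for the configuration used in this section: 16 transducers on a pipe of
203.2 mm diameter with a divergence angle of 125°. The sketch assumes equally spaced
transducers, a beam axis pointing through the pipe centre and straight-line
propagation, none of which is stated in the source. The function names are
hypothetical.

```python
import math


def transducer_positions(n: int, diameter_mm: float) -> list:
    """Place n transducers at equal angles around a circle of the given diameter."""
    radius = diameter_mm / 2.0
    return [
        (radius * math.cos(2.0 * math.pi * k / n), radius * math.sin(2.0 * math.pi * k / n))
        for k in range(n)
    ]


def arc_spacing_mm(n: int, diameter_mm: float) -> float:
    """Arc length between neighbouring transducers along the circumference."""
    return math.pi * diameter_mm / n


def receivers_in_beam(positions: list, tx_index: int, divergence_deg: float) -> list:
    """Indices of receivers lying inside the transmitter's divergence cone."""
    tx = positions[tx_index]
    axis = (-tx[0], -tx[1])  # assumed beam axis: from transmitter to pipe centre
    half_angle = math.radians(divergence_deg / 2.0)
    hits = []
    for i, p in enumerate(positions):
        if i == tx_index:
            continue
        chord = (p[0] - tx[0], p[1] - tx[1])
        cos_angle = (axis[0] * chord[0] + axis[1] * chord[1]) / (
            math.hypot(*axis) * math.hypot(*chord)
        )
        angle = math.acos(max(-1.0, min(1.0, cos_angle)))
        if angle <= half_angle:
            hits.append(i)
    return hits


if __name__ == "__main__":
    positions = transducer_positions(16, 203.2)
    print(f"arc spacing: {arc_spacing_mm(16, 203.2):.1f} mm")
    print("receivers in beam of transducer 0:", receivers_in_beam(positions, 0, 125.0))
```

Under these assumptions, a 125° beam from one transducer reaches 11 of the other 15
transducers, while a narrower beam would reach fewer and so yield fewer projections
per transmission.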
For an ultrasonic sensor mounted on the steel pipe, there are five significant
interactions of the ultrasonic wave with different boundaries: first, ultrasonic
transducer to couplant (lithium grease); then couplant (lithium grease) to steel
pipe; after that, steel pipe to liquid (hydrocarbon); steel pipe to corrosion; and
lastly, liquid to sand or mud. Before the real construction begins, the coefficients
of transmitted and reflected sound energy must be known theoretically to ensure the
capability and successful operation of the technology. In the recent decade,
ultrasonic technology has evolved with the development of dual-function sensors that
can act as either transmitter or receiver. In this case, 16 transducers are used.
Figure 14 shows the drawing, produced for the finite element analysis, of the
transducers mounted on the surface of a steel pipe with a diameter of 203.2 mm
(Fig. 15).
Fig. 14 The electronic transducers mounted on the surface of steel pipe
Fig. 15 The electronic transducer with divergence angle of 125° and array estimation
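The transmitted and reflected energy coefficients at a boundary can be estimated
from the acoustic impedances of the two media. The sketch below uses the standard
normal-incidence plane-wave formulas, R = ((Z2 − Z1)/(Z2 + Z1))² and T = 1 − R. The
impedance values are rough, assumed textbook figures for steel and a light
hydrocarbon liquid, not values from this study, and the function names are
hypothetical.

```python
def reflection_coefficient(z1: float, z2: float) -> float:
    """Fraction of incident sound energy reflected at a Z1 -> Z2 boundary."""
    if z1 <= 0 or z2 <= 0:
        raise ValueError("acoustic impedances must be positive")
    return ((z2 - z1) / (z2 + z1)) ** 2


def transmission_coefficient(z1: float, z2: float) -> float:
    """Fraction of incident sound energy transmitted across the boundary."""
    return 1.0 - reflection_coefficient(z1, z2)


if __name__ == "__main__":
    # Assumed, approximate acoustic impedances in MRayl (illustrative only).
    Z_STEEL = 46.0
    Z_OIL = 1.3
    r = reflection_coefficient(Z_STEEL, Z_OIL)
    t = transmission_coefficient(Z_STEEL, Z_OIL)
    print(f"steel -> oil: reflected {r:.3f}, transmitted {t:.3f}")
```

With such a large impedance mismatch, most of the energy is reflected at the
steel-liquid boundary, which is why the coefficients need to be checked for every
boundary in the chain before the hardware is built.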
4 Conclusion
Each of the existing and competing technologies has its own characteristics,
advantages and drawbacks. EIT and ECT have low spatial resolution compared with
ultrasonic and X-ray inspection technology. This indicates that EIT has low coverage
of the targeted object and hence low efficiency and accuracy for long pipeline
systems. X-ray offers the deepest penetration of all the technologies; however, the
use of radioactive material is not favoured since it requires extensive care, and
prolonged exposure to ionizing radiation is hazardous. Ultrasonic tomography is the
better option in this respect, as it has better spatial resolution and involves no
radiation. MFL is one of the established technologies that work invasively. Because
the pipe diameter varies within every pipeline system, MFL is not suitable for
pipelines of varying diameter; it would need to become more applicable to such
pipelines and to curved pipelines. Ultrasonic tomography, by contrast, has an
adjustable sensor rig that can fit any pipe diameter since it works non-invasively.
The effectiveness of ultrasonic guided wave inspection rests on the assumption that
the leakage-induced acoustic waves propagate along the pipeline as an individual
non-dispersive guided wave with small attenuation. In reality, this assumption is
not always valid because the acoustic waves are multi-modal blended signals, which
results in missed detection and mislocation of leakage. Ultrasonic tomography is
therefore the research interest of this paper among the tomography techniques used
in the oil and gas industry, and it should be given extra focus. Furthermore,
ultrasonic tomography inspection has proved to be the most beneficial option: it is
low cost, consumes less operational time, is reliable, involves no radioactive
material and, above all, has a design that is physically suitable for industrial
use. In ultrasonic tomography, the choice of instrument depends on the situation in
which it is applied. In this research a metal pipe is used; hence ultrasonic
tomography is applied with a suitable frequency of 40 kHz to minimize the effect of
Lamb waves in the solid material. Longitudinal waves travel throughout the entire
medium (steel pipe), so any crack or corrosion inside the pipe medium can be
detected by measuring the received signal. Further study will be carried out on the
suitability of ultrasonic tomography for detecting cracks and corrosion in
pipelines.
Acknowledgements The authors would like to acknowledge the support from grants FRGS
K074, MDR H499, UTMSHINE 09G18, TDR 06G17 and CRG 05G04.
References
1. Rahim RA, Rahiman MHF (2012) Ultrasonic tomography: non-invasive techniques for flow
measurement. Penerbit UTM Press Universiti Teknologi Malaysia, Johor
2. United State Energy Information Administration (2017) Country analysis brief: Malaysia.
Independent Statistic and Analysis, 26 April 2017, pp 1–23
3. Zhao J, Yang S, Li Y, Wang X (2010) Study on detection of industrial pipe network by
high-frequency ultrasound. In: Proceeding of the 2010 symposium on piezoelectricity,
acoustic wave and device application, pp 517–519
4. Yang B, Li Q, Li M, Lu Y (2012) Ultrasonic monitoring system for oil and gas pipeline
corrosion. In: 2012 fourth international conference on multimedia information networking and
security, pp 381–383
5. Qingling Y, Xuan CH, Jun Z (2009) Corrosion control technology for underground pipelines
in oil and gas station. In: Corrosion and protection in petrochemical industry, pp 16–19
6. Morton E, Mann K, Berman A, Knaup M, Kachelrieb M (2009) Ultrafast 3D reconstruction
for X-ray real-time tomography (RTT). In: 2009 IEEE nuclear science symposium conference
record, pp 4077–4088
7. Seyyedi S, Wieczorek M, Pfeiffer F, Lasser T (2018) Incorporating a noise reduction
technique into X-ray tensor tomography. IEEE Trans Comput Imaging 4(1):137–146
238
M. N. Roslee et al.
8. Chen W, Miao Z, Ming D (2011) Automated inspection using X-ray imaging. In: 2011
international joint conference of IEEE TrustCom-11/IEEE ICESS-11/FCST-11,
pp 1769–1772
9. Onel Y, Emert U, Willems P (2000) Radiographic wall thickness measurement of pipes by a
new tomography algorithms. In: 15th world conference on nondestructive testing, Roma, Italy
10. Maher KP, Edyvean S (2001) Diagnostic radiology physics: a handbook for teachers and
students. International Atomic Energy Agency, United State America
11. Deng Z, Xu F, Zhang X, Chen H (2004) The development of X-ray inspection real time
imaging pipeline robot. In: Proceedings of the 5th world congress on intelligent control and
automation, pp 4846–4850
12. Plonsky B (2015) Non-destructive testing helps Malaysia’s competitiveness. International
Atomic Energy Agency (IAEA), 25 September 2015. https://www.iaea.org/newscenter/news/
non-destructive-testing-helps-malaysias-competitiveness. Accessed 19 June 2019
13. Rostron P (2018) Critical review of pipeline scale measurement technologies. Indian J Sci
Technol 11(17):1–18
14. Qin X, Ji C, Wang Z, Wang P (2018) Reconstruction and simulation of fluid flow pattern in
pipeline based on electrical impedance tomography algorithm. In: 2018 international
symposium on computer, consumer and control (IS3C), pp 262–265
15. LiYing S, LiBo S, LingGe L (2012) Comparison of magnetic flux leakage (MFL) and
acoustic emission (AE) techniques in corrosion inspection for pressure pipeline. In:
Proceedings of the 31st Chinese control conference, pp 5375–5378
16. Kim HM, Yoo HR, Park S (2018) A new design of MFL sensors for self-driving NDT robot
to avoid getting stuck in curved underground pipelines. IEEE Trans Mag 54(11):1–5
17. Zhang Z, Udpa L, Udpa SS, Sun Y, Si J (1996) An equivalent linear model for magnetostatic
nondestructive evaluation. IEEE Trans Magn 32(3):718–721
18. Katragadda G, Lord W, Sun YS, Udpa S, Udpa L (1996) Alternative magnetic flux leakage
modalities for pipeline inspection. IEEE Trans Magn 32(3):1581–1584
19. Lu S, Feng J, Li F, Liu J, Zhang H (2017) Extracting defect signal form the MFL signal of
seamless pipeline. In: 2017 29th Chinese control and decision conference (CCDC),
pp 5209–5212
20. Rahman NA, Rahim R, Ling CP, Pusppanathan J, Rahiman MHF (2015) A review of
ultrasonic tomography for monitoring the corrosion of steel pipes. Jurnal Teknologi
73:151–158
21. Sun L, Li Y, Jin S (2006) Study on guided ultrasonic waves propagating along the pipes with
fluid loading. In: Proceeding on the 6th world congress on intelligent control
22. Yang H, Wang C (2011) Study on simulation of non-destructive test ing for pipeline defects
by ultrasonic guided waves. In: 2011 cross strait quad-regional radio science and wireless
technology conference, pp 238–242
23. Silva J, Wanzeller MG, Farias PA (2008) Neto SR (2008) Development of circuit excitation
and reception in ultrasonic transducers for generations of guided waves in hollow cylinders
for fouling detection. IEEE Trans Instrum Meas 57:1149–1153
24. Lyutak I (2005) Wavelet analysis of ultrasonic guided waves in pipeline inspection. In: IEEE
workshop on intelligent data acquisition and advanced computing systems: technology and
applications, pp 571–523
25. Dance DR, Christifides S, Maidment ADA, Mclean ID, Ng HK (2014) Diagnostic radiology
physics. International Atomic Energy Agency, Vienna
26. Li Y, Yang JH, Qiu CC, Yang JS, Shi XS, Wang FB (2017) Shear circumferential guided
waves in coated gas pipeline. In: 2017 symposium on piezoelectricity, acoustic waves, and
device applications, pp 481–485
27. Lowe PS, Sanderson RM, Boulgouris NV, Haig AG, Balachandran W (2016) Inspection of
cylindrical structures using the first longitudinal guided wave mode in isolation for higher
flaw sensitivity. IEEE Sens J 16(3):706–715
A Pilot Study on Pipeline Wall Inspection Technology Tomography ...
239
28. Huthwaite P, Ribichini R, Carley P, Lowe MJS (2013) Mode selection for corrosion detection
in pipes and vessels via guided waves tomography. IEEE Trans Ultrason Ferroelectr Freq
Control 60:1165–1177
29. Li S, Wen Y, Li P, Yang J, Wen J (2014) Modal analysis of leakage-induced acoustic
vibration in different directions for leak detection and location in fluid-filled pipelines. In:
2014 IEEE international ultrasonic symposium proceedings, pp 1412–1415
30. Wilcox PD, Lowe M, Cawley P (2001) The effect of dispersion on long-range inspection
using ultrasonic guided waves. NDT E Int 34:1–9
31. Long R, Lowe M, Cawley P (2001) Attenuation characteristics of the fundamental modes that
propagate in buried iron water pipes, vol 109, pp 1841–1847
32. Niaz A, Moshin F, Kaleem U, Kashif R, Afia SA, Ishaq BI (2009) Ultrasonic in Wet
Processing. Pak Text J 50–57
33. Shaib MFA, Rahim RA, Muji SZM (2017) The development of non-invasive ultrasonic
measuring system for monitoring multiphase flow liquid media within composite pipeline.
Int J Electr Comput Eng (IJECE) 7(6):3076–3087
34. Goh CL, Rahim RA, Tee ZC (2017) Investigation into slow scan front-end control of a
transmission mode ultrasonic system. IEEE Sens J 17(16):5136–5142
35. Rahim RA, Nyap NW, Rahiman MHF, San CK (2007) Determination of water and oil flow
composition using ultrasonic tomography. ELEKTRIKA 9(1):19–23
36. Ayob NMN, Yacoob S, Zakaria Z, Rahiman MHF, Manan MR (2010) Improving gas
component detection of an ultrasonic tomography system for monitoring liquid/gas flow. In:
2010 6th international colloquium on signal processing & its applications (CSPA), pp 278–
282
37. Alobaidi WM, Alkuam EA, Rizzoa HM, Sanguan E (2015) Applications of ultrasonic
techniques in oil and gas pipeline industries. Am J Oper Res 5:274–287
38. Gachagan A, McNab A, Reynolds P (2004) Analysis of ultrasonic wave propagation in
metallic pipe structure using finite element modelling techniques. In: 2004 IEEE international
ultrasonics, ferroelectric, and frequency, pp 938–941
39. Shivaprasad S, Balasubramaniam K, Kanna KC, Bhattachay S, Singh SP (2013) Multi-mode
tandem ultrasonic technique for tube inspection. In: 2013 joint UFFC, EFTF and PFM
symposium, pp 1307–1310
40. Abbaszadeh J, Rahim HA, Rahim RA, Sarafi S, Ayob MN, Faramarzi M (2013) Design
procedure of ultrasonic tomography system with steel pipe conveyor. Sens Actuators, A
203:215–224
41. Na J (2008) Design, fabrication, and characterization of single-element interdigital transducer
for NDT applications. Sens Actuator A 148:359–365
42. Ayob NMN, Rahiman MHF, Zakaria Z, Yaacob S (2010) Detection of small gas bubble using
ultrasonic transmission-mode tomography system. In: 2010 IEEE symposium on industrial
electronics and applications (ISIEA 2010), pp 165–169
43. Abbaszadeh J, Rahim HA, Rahim RA (2012) Optimizing the frequency of ultrasonic
tomography system with a metal pipe conveyor. In: 2012 IEEE 8th international colloquium
on signal processing and its applications, pp 52–56
44. Rahim RA, Rahiman MHF, Nyap NG, San CK (2004) On monitoring of liquid/gas using
ultrasonic tomography. Jurnal Teknologi 40:77–88
45. Gan TH, Hutchins DA, Billsum DR, Wong FC (2000) Ultrasonic tomography imaging of an
encased higly-attenuating solid media. In: 2000 IEEE ultrasonic symposium, pp 823–826
46. Nordin N, Idoras M, Zakaria Z, Ibrahim MN (2014) Tomography image reconstruction of
monitoring flaws on gas pipeline base on reverse ultrasonic tomography. In: 2014 IEEE 5th
international conference on intelligent and advanced system (ICIAS), pp 1–6
Weighted-Sum Extended Bat Algorithm
Based PD Controller Design
for Wheeled Mobile Robot
Nur Aisyah Syafinaz Suarin, Dwi Pebrianti, Nurnajmin Qasrina Ann,
and Luhur Bayuaji
Abstract The PID controller of a wheeled mobile robot (WMR) needs to be tuned as
precisely as possible so that the WMR can move from an initial position to a desired
position with a fast time response and minimum steady-state error. Weighted-sum
Extended Bat Algorithm (WS-EBA) is a multi-objective optimization method based on
the Extended Bat Algorithm. The weighted optimization approach is used to search for
the optimum Proportional-Integral-Derivative (PID) controller gains of the WMR with
respect to the x and y positions. Several experiments are conducted to test the
effect of the control variables or parameters on the PID gains and on the
performance of the WMR. These parameters are the type of PID controller, the number
of agents in WS-EBA and the objective function used to search for the optimum PID
gains. The results indicate that a PD controller, 30 searching agents and ITAE as
the objective function give the most suitable controller for the WMR: for the X
position, the rise time, settling time and overshoot are 11.00 s, 20.08 s and 0.00%,
respectively; for the Y position, they are 12.11 s, 22.08 s and 0.00%. WS-EBA is
also compared with Weighted-sum Particle Swarm Optimization (WS-PSO) and
Weighted-sum Bat Algorithm (WS-BA). WS-EBA outperformed the others, with the best
WMR performance, the most consistent solutions, the fastest convergence rate and
the best balance between the exploration and exploitation phases.
N. A. S. Suarin · D. Pebrianti (✉) · N. Q. Ann
Faculty of Electrical and Electronic Engineering, Universiti Malaysia Pahang (UMP),
26000 Pekan, Pahang, Malaysia
e-mail: dwipebrianti@ump.edu.my
L. Bayuaji
Faculty of Computer Science and Software Engineering, Universiti Malaysia Pahang (UMP),
26500 Gambang, Pahang, Malaysia
D. Pebrianti · L. Bayuaji
Magister of Computer Science, Universitas Budi Luhur, Jakarta 12260, Indonesia
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_17
Keywords Weighted-sum Extended bat algorithm · Proportional-Integral-Derivative
(PID) controller · Wheeled mobile robot
1 Introduction
Wheeled mobile robots (WMRs) have gained increasing popularity due to their ability
to move freely on wheels and their potential for numerous applications, such as
lifting and moving heavy, static objects. To reach a predefined goal or desired
location, a WMR has to be equipped with a good controller. Fast response, minimum
settling time and minimum overshoot are the criteria that determine the performance
of the controller on a WMR. Plenty of controllers are available nowadays, e.g. the
Proportional-Integral-Derivative (PID) controller, path planning, the fuzzy logic
controller and the simplest controller, the on-off controller [1–3]. A simple
solution to a problem is preferable to a complex one. However, the simplest
controller, the on-off controller, has an oscillating behaviour which limits its
usage. The ultimate aim of the controller is to maintain zero or minimum
steady-state error, which is the difference between the process output and the
desired output.
The Proportional-Integral-Derivative (PID) controller is well known for its simple
structure and its ability to produce robust system performance. It has been applied
in many applications, such as machine systems [4], flood control and mobile robot
control [5–7]. PID is a basic controller consisting of three tunable gains. The gain
that produces an output proportional to the current error is known as the P
controller. When applied alone, the P controller tends to leave a steady-state error
and needs a manual reset [8]. The integral, or I, controller is the gain that
counteracts this limitation of the P gain by eliminating the steady-state error.
However, as the I gain increases, the response speed decreases. Last is the
derivative, or D, controller. The D gain acts on the rate of change of the error,
which helps the system react when the set point changes and, in effect, anticipates
the future error. The lag in system response caused by the I gain can be compensated
by applying the D gain. However, the combination of gains used as a controller
depends on the suitability and performance required by the system. The trade-off is
between accuracy, speed and robustness [9].
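The roles of the three gains can be illustrated with a minimal discrete-time sketch.
The class name, the rectangular integration and the backward-difference derivative
are assumptions made for illustration, and the gains in the usage lines are
arbitrary, not values from this study. Setting `ki` or `kd` to zero gives a PD or PI
controller, respectively.

```python
class PID:
    """Textbook discrete PID: u = kp*e + ki*integral(e) + kd*de/dt."""

    def __init__(self, kp, ki=0.0, kd=0.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0      # accumulated error (I term memory)
        self.prev_error = None   # last error, used by the D term

    def update(self, error, dt):
        # I term: accumulate the error over time (rectangular rule).
        self.integral += error * dt
        # D term: backward difference; zero on the first call.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Usage with arbitrary gains: a PD controller reacting to a shrinking error.
pd_controller = PID(kp=2.0, kd=0.5)
u_first = pd_controller.update(error=1.0, dt=0.1)   # P term only
u_second = pd_controller.update(error=0.5, dt=0.1)  # D term opposes the fast change
```

In the second call the error is falling quickly, so the D term subtracts from the
P term and damps the response.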
The gains of a PID controller are not fixed and need to be tuned to suit the system
and process. There are several well-known tuning methods. The most basic is trial
and error, or manual tuning [9, 10]. It is the simplest method, but it is not
systematic and is time consuming. Next is the Ziegler-Nichols method. Two additional
constants need to be determined with this method, namely the constant for
oscillation and the period of oscillation [6]. The lengthy procedure and the
additional constants make the controller more complicated and harder to tune to the
optimized values. Lastly, the PID gains can be tuned by using mathematical
optimization and metaheuristic
optimization methods. Particle Swarm Optimization (PSO) has been applied to tune a
PD controller [5, 11, 12], Simulated Annealing has been applied to tune a load
frequency controller [13] and the genetic algorithm (GA) has been applied to a
unicycle-type mobile robot [7, 9]. Swarm intelligence is a group of metaheuristic
algorithms inspired by the swarm behaviour of living things. Optimization methods
are abundant because, according to the No Free Lunch (NFL) theorem, no single
optimization method can be applied to all problems. Thus, new optimization methods
are still being developed rapidly.
Extended Bat Algorithm (EBA) is a hybrid optimization method developed from the Bat
Algorithm (BA) and the Spiral Dynamic Algorithm (SDA). It was introduced in [14]
and, according to the published literature, has not yet been applied to optimizing a
PID controller. EBA improves the search by moving in a spiral as in SDA while
keeping the agent movement of BA. This helps the search avoid becoming trapped in
local minima and speeds up convergence to the best solution. The PID controller
needs to be tuned to the optimum values that give the best output with minimum
steady-state error, minimum overshoot and a fast system response. This is because
the WMR is a mobile robot that keeps moving towards the desired position.
Weighted-sum is a multi-objective optimization method used to optimize multiple
objectives in one system. It is the simplest approach to multi-objective
optimization because it linearizes the problem: the multiple objective functions are
combined into a single objective function by applying a weight to each one.
Weighted-sum Extended Bat Algorithm (WS-EBA) is a multi-objective optimization
method that solves the optimization using the EBA approach. WS-EBA is chosen in this
study because two objective functions of the WMR need to be minimized, i.e. the
errors in the x position and the y position of the WMR.
Apart from the ultimate aim, which is to tune the gains of the PID controller,
several variables need to be taken into consideration. These variables have been
reported to influence the results of the controller when it is tuned by a
metaheuristic optimization method. They are the number of agents used in the
optimization method, the objective function and the hyperparameter tuning of the
metaheuristic algorithm. Each algorithm has a different type and number of
hyperparameters. The hyperparameters of EBA are the loudness, pulse rate, spiral
radius and spiral angle, while those of the particle swarm optimization (PSO)
algorithm are the cognitive component, social component and inertia weight. The
hyperparameters are important because they determine the local and global searching
by the agents and control the exploration and exploitation phases of all the agents.
Thus, this paper discusses and investigates the application to the WMR of the PID
controller, which is the most popular controller and is simple yet able to produce
good performance. The paper is organized as follows. Section 2 presents the
experimental design and methods, while Sect. 3 discusses the simulation results and
performance comparison. Lastly, the conclusion is drawn in Sect. 4.
2 Methodology and Experimental Setup
2.1 Closed Loop System for WMR and PID Controller
Figure 1 shows the wheeled mobile robot used in this study, the mBot. The most
important parameters of the mBot for the kinematic model are L, the distance between
the two wheels, and r, the radius of the wheel. The control system is developed
entirely from the kinematic model in Fig. 1 and Eqs. (1) to (8). This is a
simulation-based study, so the real mobile robot must be converted into a control
system whose performance can be measured. The input of the control system is the
desired position (x and y) that the WMR needs to reach, and the output is the
current position of the WMR. A closed loop control system for the WMR is designed as
in Fig. 2. The objective of this system is to minimize or eliminate the error, which
is the difference between the desired position and the current position.
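$$\dot{x} = \frac{r}{2}(\omega_R + \omega_L)\cos\theta \tag{1}$$
$$\dot{y} = \frac{r}{2}(\omega_R + \omega_L)\sin\theta \tag{2}$$
$$\dot{\theta} = \frac{r}{L}(\omega_R - \omega_L) \tag{3}$$
where $r$ is the radius of the mBot's wheel, $\omega_R$ is the right wheel angular
velocity, $\omega_L$ is the left wheel angular velocity and $L$ is the distance
between the mBot's wheels.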
Fig. 1 Development of
kinematic model of mBot
(WMR)
To develop the kinematic model, the model of a single upright wheel on the plane is
given in Eqs. (4) to (6).
$$\dot{x} = v\cos\theta \tag{4}$$
$$\dot{y} = v\sin\theta \tag{5}$$
$$\dot{\theta} = \omega \tag{6}$$
By rearranging Eqs. (1) to (6), we get:
$$\omega_R = \frac{2v + \omega L}{2r} \tag{7}$$
$$\omega_L = \frac{2v - \omega L}{2r} \tag{8}$$
where $\omega_R$ is the angular velocity of the right wheel of the mBot and
$\omega_L$ is the angular velocity of the left wheel. The outputs $\omega_R$ and
$\omega_L$ are computed from the inputs $v$ and $\omega$, the linear and angular
velocities of the robot. The distance $L$ between the two wheels of the mBot is a
constant 0.113 m in the model, and the wheel radius $r$ is 0.03 m. The kinematic
model has been verified by Dwi Pebrianti et al., who conducted an experiment to
compare the developed kinematic model with the actual mBot robot performance [15].
The accuracy of the developed kinematic model is 85%.
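The kinematic model can be sketched in a few lines. This is a minimal illustration
of Eqs. (1) to (3), (7) and (8) with the L and r values given in the text; the
function names and the forward-Euler integration step are assumptions, not the
authors' simulation code.

```python
import math

L = 0.113        # distance between the two wheels in m (from the text)
R_WHEEL = 0.03   # wheel radius in m (from the text)


def wheel_speeds(v, omega):
    """Eqs. (7)-(8): wheel angular velocities from body velocities v and omega."""
    w_r = (2.0 * v + omega * L) / (2.0 * R_WHEEL)
    w_l = (2.0 * v - omega * L) / (2.0 * R_WHEEL)
    return w_r, w_l


def pose_rates(w_r, w_l, theta):
    """Eqs. (1)-(3): time derivatives of x, y and heading theta."""
    x_dot = R_WHEEL / 2.0 * (w_r + w_l) * math.cos(theta)
    y_dot = R_WHEEL / 2.0 * (w_r + w_l) * math.sin(theta)
    theta_dot = R_WHEEL / L * (w_r - w_l)
    return x_dot, y_dot, theta_dot


def step(pose, v, omega, dt):
    """Advance (x, y, theta) by one forward-Euler step (assumed scheme)."""
    x, y, theta = pose
    x_dot, y_dot, theta_dot = pose_rates(*wheel_speeds(v, omega), theta)
    return (x + x_dot * dt, y + y_dot * dt, theta + theta_dot * dt)


# Usage: drive straight along x at 0.1 m/s for 1 s.
pose = (0.0, 0.0, 0.0)
for _ in range(100):
    pose = step(pose, v=0.1, omega=0.0, dt=0.01)
```

Feeding the wheel speeds from Eqs. (7) and (8) back into Eqs. (1) to (3) recovers
the commanded v and omega, which is a quick consistency check on the derivation.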
The Proportional-Integral-Derivative (PID) controller in the system aims to reduce
the errors that are fed back into the system. Weighted-sum Extended Bat Algorithm
(WS-EBA) is a multi-objective hybrid metaheuristic algorithm. As can be seen in
Fig. 2, there are two errors: the first is for the x position and the second is for
the y position. A single-objective optimization technique cannot minimise both
errors when tuning the PID controller, so multi-objective optimization needs to be
Fig. 2 Closed loop control system, controller (PID), plant (kinematic model of WMR), input (x
and y desired position), output (x and y current position)
applied in the system. Weighted sum is the simplest multi-objective algorithm, as it
linearizes the objectives into one function, as stated in Eq. (9).
$$f_T = (W_1 f_1) + (W_2 f_2) \tag{9}$$
where $f_T$ is the total fitness function, $W_1$ is the weight for the first fitness
function, $W_2$ is the weight for the second fitness function, $f_1$ is the first
fitness from the first objective function and $f_2$ is the second fitness from the
second objective function.
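Equation (9) amounts to a one-line scalarization. In the sketch below the function
name and the equal default weights of 0.5 are assumptions for illustration; the
weights actually used are not stated in this section.

```python
def weighted_sum_fitness(f1, f2, w1=0.5, w2=0.5):
    """Eq. (9): combine the x-position fitness f1 and the y-position fitness f2
    into a single scalar that a single-objective optimizer can minimize."""
    return w1 * f1 + w2 * f2


# Usage with made-up fitness values for the x and y position errors.
total = weighted_sum_fitness(2.0, 4.0)
```

Raising one weight relative to the other makes the optimizer favour reducing that
axis's error.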
It is important to have a robust PID controller that can reduce and eliminate the
errors in a short time and produce stable WMR performance. The PID controller is a
classical, well-known and widely used controller owing to its simple structure,
convenient debugging and strong adaptability. However, the challenge with this
controller is to tune the gains to the optimized values so that the best system
performance can be produced.
2.2 Weighted-Sum Extended Bat Algorithm
Extended Bat Algorithm (EBA) is a low-level metaheuristic hybridization of the
original Bat Algorithm (BA) and the Spiral Dynamic Algorithm (SDA). The
hybridization is called low-level because it involves only one part of the Bat
Algorithm, the exploration part. The original Bat Algorithm updates the position of
an agent using Eq. (10), while SDA updates the position using Eq. (11). In EBA, the
position update combines BA and SDA, as stated in Eq. (12).
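$$x_i^{t} = x_i^{t-1} + v_i^{t} \tag{10}$$
$$x_i^{t+1} = r R(\theta)\, x_i^{t} - \left(r R(\theta) - I_n\right) x^{*} \tag{11}$$
$$x_i^{t} = r R(\theta)\, x_i^{t} - \left(r R(\theta) - I_n\right) x^{*} + v_i^{t} \tag{12}$$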
where $x$ is the agent position, $v$ is the velocity of the agent, $r$ is the step
rate between $x(t)$ and $x^{*}$ per $t$, $\theta$ is the rotation rate in
$[-\pi, \pi]$, $R(\theta)$ is the composite rotation matrix, $i$ is the index of the
agent, $t$ is the iteration number, $I_n$ is the identity matrix and $x^{*}$ is the
position of the best agent.
The combination of both algorithms is expected to improve on the performance of the
original BA, because the performance of an optimization method depends on its
ability to balance the exploration and exploitation phases. Applying the SDA search
improves the exploration phase of BA. The only part taken from SDA is the update of
the x value; the rest of the algorithm is from BA. Figure 3 shows the flowchart of
WS-EBA tuning the PID controller of the WMR.
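The combined position update of Eq. (12) can be sketched for a two-dimensional
search space, for example the pair (KP, KD) of a PD controller. The function name,
the restriction to two dimensions (so that R(θ) is the ordinary planar rotation
matrix) and the reading of the spiral angle in radians are assumptions made for
illustration; the default r and θ follow Table 1. The velocity, loudness and pulse
rate logic that BA contributes is not shown.

```python
import math


def eba_update(x, v, x_best, r=0.95, theta=1.0):
    """One EBA position update: r*R(theta)*x - (r*R(theta) - I)*x_best + v,
    for 2-D points given as (a, b) tuples."""
    c, s = math.cos(theta), math.sin(theta)

    def spiral(p):
        # Apply the scaled rotation r*R(theta) to a 2-D point.
        return (r * (c * p[0] - s * p[1]), r * (s * p[0] + c * p[1]))

    rx, rb = spiral(x), spiral(x_best)
    # SDA spiral step around the best agent, plus the BA velocity term.
    return tuple(rx[k] - (rb[k] - x_best[k]) + v[k] for k in range(2))


# Usage: with zero velocity the agent rotates about the best position
# and its distance to it shrinks by the factor r.
new_pos = eba_update(x=(1.0, 0.0), v=(0.0, 0.0), x_best=(0.0, 0.0),
                     r=0.5, theta=math.pi / 2)
```

With v set to zero the update reduces to the pure SDA step of Eq. (11), so the best
position itself is a fixed point of the spiral.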
Fig. 3 Flowchart of Weighted-sum Extended Bat Algorithm (WS-EBA) to tune PID controller for
WMR
Table 1 Parameters setup for WS-EBA, WS-PSO and WS-BA
Parameter                     WS-EBA     WS-PSO     WS-BA
Number of searching agents    10/30      30         30
Number of iterations          100        100        100
Initial loudness, A           0.5        –          0.5
Initial pulse rate, p         0.5        –          0.5
Spiral radius, r              0.95       –          –
Spiral angle, θ               1          –          –
KP boundary                   [0 50]     [0 50]     [0 50]
KI boundary                   [0 100]    [0 100]    [0 100]
KD boundary                   [0 300]    [0 300]    [0 300]
Cognitive component, c1       –          0.9        –
Social component, c2          –          0.9        –
Inertia weight, w             –          0.5        –
The objective function (optimization index) is an important component of an
optimization method because the value to be optimized (either minimized or
maximized) is defined by it. The objective functions commonly used to minimize the
error in control systems are the Integral Square Error (ISE), Integral Absolute
Error (IAE), Integral Time Squared Error (ITSE) and Integral Time Absolute Error
(ITAE). Equations (13) to (16) define these objective functions, all of which have
been tested. The best objective function should minimise the errors and give the
optimum PID gains for the WMR.
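$$\mathrm{ISE} = \int_0^{\infty} \left( e_1^2(t) + e_2^2(t) + \ldots + e_n^2(t) \right) dt \tag{13}$$
$$\mathrm{IAE} = \int_0^{\infty} \left( |e_1(t)| + |e_2(t)| + \ldots + |e_n(t)| \right) dt \tag{14}$$
$$\mathrm{ITSE} = \int_0^{\infty} t \left( e_1^2(t) + e_2^2(t) + \ldots + e_n^2(t) \right) dt \tag{15}$$
$$\mathrm{ITAE} = \int_0^{\infty} t \left( |e_1(t)| + |e_2(t)| + \ldots + |e_n(t)| \right) dt \tag{16}$$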
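In a simulation the four indices in Eqs. (13) to (16) are evaluated over a finite
run from sampled errors. The sketch below is one hedged way to do this; the function
name, the trapezoidal rule and the data layout (one tuple of error signals per
sample) are assumptions, not the authors' implementation.

```python
def error_indices(times, errors):
    """Approximate ISE, IAE, ITSE and ITAE with the trapezoidal rule.

    times  : increasing sample instants t_0 .. t_N
    errors : one tuple (e_1, ..., e_n) of error signals per sample
    """
    def integrands(t, e):
        squared = sum(ei * ei for ei in e)      # e_1^2 + ... + e_n^2
        absolute = sum(abs(ei) for ei in e)     # |e_1| + ... + |e_n|
        return (squared, absolute, t * squared, t * absolute)

    totals = [0.0, 0.0, 0.0, 0.0]
    previous = integrands(times[0], errors[0])
    for k in range(1, len(times)):
        current = integrands(times[k], errors[k])
        dt = times[k] - times[k - 1]
        for j in range(4):
            totals[j] += 0.5 * (previous[j] + current[j]) * dt
        previous = current
    return dict(zip(("ISE", "IAE", "ITSE", "ITAE"), totals))


# Usage: constant x and y errors of +1 and -1 over two seconds.
indices = error_indices([0.0, 1.0, 2.0], [(1.0, -1.0)] * 3)
```

The time-weighted indices (ITSE, ITAE) penalize errors that persist late in the run
more heavily than errors at the start.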
2.3 Experimental Setup
The parameter setup for WS-EBA is shown in Table 1. The experiment is conducted to determine the best PID based controller to be applied to the WMR, the
appropriate number of searching agents to be used to search for the PID controller
gains in the algorithm and the best objective function to be used in the
optimization process. All of these criteria are important for producing a robust PID
controller for the WMR. Table 1 also lists the parameter setup for the WS-PSO and
WS-BA optimizations. Both are needed to compare the controller tuned by the proposed
method with those tuned by other swarm-based optimization methods. For PSO, the
parameter setup follows the recommendation in [12]. For BA, the parameter setup
follows our previous research paper [16].
3 Result and Discussion
3.1 PID Controller for WMR Tuned by WS-EBA
Proportional-Integral-Derivative (PID) is a controller consisting of three gains
that need to be tuned as precisely as possible for the system. However, not every
gain of the controller, i.e. P, I and D, has to be applied; it depends on how
suitable each controller component is for the system. To develop a robust controller
for the wheeled mobile robot (WMR), different PID-based controllers are tested and
the controller with the best performance is selected.
Table 2 PID gains value tuned by WS-EBA
Controller   No. of agents   Objective function   KP      KI      KD
PID          10              ISE                  0.015   0.013   0.014
PID          10              IAE                  0.004   0.004   2.901
PID          10              ITSE                 0.000   0.004   477.447
PID          10              ITAE                 0.000   0.003   145.828
PID          30              ISE                  0.002   0.001   65.549
PID          30              IAE                  0.015   0.013   0.013
PID          30              ITSE                 0.008   0.006   152.931
PID          30              ITAE                 0.000   0.002   236.19
PI           10              ISE                  0.013   0.013   0.000
PI           10              IAE                  0.009   0.012   0.000
PI           10              ITSE                 0.014   0.012   0.000
PI           10              ITAE                 0.009   0.011   0.000
PI           30              ISE                  0.012   0.013   0.000
PI           30              IAE                  0.011   0.013   0.000
PI           30              ITSE                 0.012   0.012   0.000
PI           30              ITAE                 0.013   0.012   0.000
PD           10              ISE                  0.105   0.000   0.093
PD           10              IAE                  0.124   0.000   0.116
PD           10              ITSE                 0.117   0.000   0.108
PD           10              ITAE                 0.144   0.000   0.129
PD           30              ISE                  0.221   0.000   0.207
PD           30              IAE                  0.221   0.000   0.208
PD           30              ITSE                 0.188   0.000   0.170
PD           30              ITAE                 0.229   0.000   0.214
No PID       0               ISE                  0.000   0.000   0.000
No PID       0               IAE                  0.000   0.000   0.000
No PID       0               ITSE                 0.000   0.000   0.000
No PID       0               ITAE                 0.000   0.000   0.000
The proposed method for tuning the PID controller is a multi-objective optimisation
method, the weighted-sum Extended Bat Algorithm (WS-EBA). The aim of this experiment
is to identify the most suitable PID controller for the mBot WMR. Table 2 shows the
gain values of three types of PID controller tuned by WS-EBA. The system without a
PID controller is also included in this experiment to observe the difference. The
agents in the algorithm are assigned to search for the optimal PID gains. For the PI
and PD controllers, the D gain and the I gain are neglected, respectively. For the
case without a PID controller, all the gains are neglected and the system is run
without a controller.
The solutions obtained by WS-EBA, i.e. the gain values for the PID, PI and PD
controllers, differ with the number of agents and the objective function used. The
gains obtained for the PID and PI controllers with 10 and 30 agents show only small
differences; on the other hand, the gains obtained for the PD controller differ
considerably. The values obtained with 30 agents are higher than those obtained with
10 agents. This is because the agents search within the boundary of the search area,
and the number of agents plays an important role in obtaining the best solution,
avoiding being easily trapped in local minima and exploring the area containing the
best solution.
Table 3 shows the performance of the WMR when the gains in Table 2 are applied. The
WMR system is as shown in Fig. 2. The PD controller with 30 agents, tuned using the
ITAE objective function, outperformed the other controllers, with the best rise
time, settling time and percentage overshoot. In the WMR system, a short rise time
to the desired position is better than a long one, a short settling time is better
than a long one, and a low percentage overshoot is better than a high one. Thus, the
PD controller recorded the best result on these criteria, with the shortest rise
time and settling time and the lowest percentage overshoot.
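The three performance measures can be extracted from a sampled position response as
sketched below. The exact definitions used in the study are not stated here, so the
10 to 90% rise time and the 2% settling band are assumed conventions, and the
function name and the first-order test response are illustrative only.

```python
import math


def step_metrics(times, response, reference=1.0, band=0.02):
    """Return (rise_time, settling_time, percent_overshoot) of a step response.
    A value of None means the response never reached that condition."""
    pairs = list(zip(times, response))
    t10 = next((t for t, y in pairs if y >= 0.1 * reference), None)
    t90 = next((t for t, y in pairs if y >= 0.9 * reference), None)
    rise = None if t10 is None or t90 is None else t90 - t10

    # Settling time: earliest instant after which the response stays in the band.
    settle = None
    for t, y in reversed(pairs):
        if abs(y - reference) > band * reference:
            break
        settle = t

    overshoot = max(0.0, (max(response) - reference) / reference * 100.0)
    return rise, settle, overshoot


# Usage on an illustrative first-order response y(t) = 1 - exp(-t) over 10 s.
times = [k * 0.01 for k in range(1001)]
response = [1.0 - math.exp(-t) for t in times]
rise, settle, overshoot = step_metrics(times, response)
```

A response that is still outside the band at the end of the run yields a settling
time of `None`, which corresponds to a controller that does not settle within the
simulated time.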
The PID and PI controllers show the worst results: they do not reach the reference
position of 1 m, as indicated by the settling time and percentage overshoot. The
system is run for 60 s, and during that time it still does not settle. The
percentage overshoot is too high, which indicates that the system is not stable and
is unable to reach the desired position. The robot without a PID controller takes
the longest settling time and rise time. The WMR needs to react fast, as it is a
moving robot that needs to accomplish a task. Although the percentage overshoot of
the system without a PID controller is among the lowest, the PD controller performs
best in all aspects, including rise time and settling time.
The PD controller consists of the P and D gains, without the I gain. I is the
integral gain, whose function is to eliminate the steady-state error. The I gain
limits the speed of the response and the stability of the system. It can cause
integral windup, in which a significant error accumulates during the rise, leading
to continued overshooting. This situation can be avoided by identifying a suitable
range for the gain. For a system that requires a fast response, including the I
component in the controller is not encouraged.
For the WMR system, the presence of an I gain, as in the PID and PI controllers,
makes the error fed into the system accumulate and continue to increase the
overshoot. The superior controller for the system in this study is the PD controller, with the fastest rise time and the most accurate steady-state value.
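The windup effect described above can be reproduced with a very small simulation. This is a sketch under stated assumptions, not the WMR model used in the paper: the plant is a pure integrator driven by a saturated velocity command, and the gains, the saturation limit `u_max` and the function name `simulate_overshoot` are all hypothetical.

```python
def simulate_overshoot(kp, ki, kd, ref=1.0, u_max=0.2, dt=0.01, t_end=30.0):
    """Simulate a PID loop on an assumed integrator plant x' = u with |u| <= u_max.

    Returns the percentage overshoot of the position x relative to ref.
    """
    x = 0.0
    integral = 0.0
    e_prev = ref - x          # avoids a derivative kick on the first step
    peak = x
    for _ in range(int(round(t_end / dt))):
        e = ref - x
        integral += e * dt                      # keeps growing while u is saturated
        derivative = (e - e_prev) / dt
        u = kp * e + ki * integral + kd * derivative
        u = max(-u_max, min(u_max, u))          # actuator saturation
        x += u * dt                             # integrator plant
        e_prev = e
        peak = max(peak, x)
    return max(0.0, (peak - ref) / ref * 100.0)


pd_overshoot = simulate_overshoot(kp=2.0, ki=0.0, kd=0.1)
pid_overshoot = simulate_overshoot(kp=2.0, ki=1.0, kd=0.1)
```

With these illustrative values, the integral term accumulates throughout the saturated rise and keeps the command positive well after the reference has been crossed, whereas the PD loop approaches the reference from below.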
Karahan et al. [17] applied a PD controller integrated with a fuzzy
controller to a wheeled mobile robot system, Baral et al. [13] applied a PI controller in a load frequency control system, and Ye et al. [4] applied a PID controller to a hydraulic system for position control. The PID controller is a
well-known classical controller that is easier to implement than
other controllers and can produce reliable results. However, before it is applied to a
system, its suitability needs to be tested first. For this WMR
system, the PD controller is the best controller to apply.
The results show that every controller, apart from the no-controller case,
includes a P gain. P is the proportional gain, which makes the control action
proportional to the feedback error. Its function is to help stabilise the
system, although a steady-state error (SSE) remains. An optimal P gain is
important for controlling the oscillation of the robot. Referring to Table 2,
which lists the PID gain values for each controller, the P gain of the PD
controller is neither the lowest nor the highest, yet its performance is the best of all
in terms of rise time, settling time and steady-state error.
The searching agent is an important element of the optimization procedure, as
the agents are used to search for the best result. The gain values obtained with
10 agents are higher than those obtained with 30 agents. However, high gain
values do not indicate the best WMR performance; the optimum values do. The
search area plays an important role in this process as well. The number of
searching agents must be appropriate to the search area: too many agents in a
small area might lead to a deadlock situation, while too few might leave the
agents unable to explore the whole area and produce a poor result.
Table 3 Result analysis performance of WMR with different types of PID controller
                                      X position                      Y position
Controller  No. of agents  Obj. func.  Tr (s)  Ts (s)  Os (%)     Tr (s)  Ts (s)  Os (%)
PID         10             ISE         22.26   59.59   350.53     9.13    13.38   1.14
PID         10             IAE         20.61   49.96   1003.54    31.08   46.54   1.52
PID         10             ITSE        35.89   59.42   96.27      36.08   59.42   96.28
PID         10             ITAE        37.96   59.02   98.50      37.78   59.02   98.49
PID         30             ISE         37.94   59.35   1051.42    37.84   59.35   1.54
PID         30             IAE         22.45   59.58   97.07      8.95    13.10   97.07
PID         30             ITSE        38.05   59.39   98.50      38.03   59.39   98.50
PID         30             ITAE        36.16   59.40   93.32      36.09   59.40   93.32
PI          10             ISE         22.45   59.58   8557.86    0.33    5.67    1.45
PI          10             IAE         22.05   59.65   1035.47    9.44    14.01   1.53
PI          10             ITSE        21.97   59.64   887.99     9.47    13.84   1.42
PI          10             ITAE        22.04   59.65   8578.58    9.44    14.00   1.46
PI          30             ISE         22.43   59.58   1029.98    8.84    13.05   1.59
PI          30             IAE         22.47   59.58   1030.47    8.78    13.06   1.55
PI          30             ITSE        22.04   59.63   915.27     9.32    13.69   1.48
PI          30             ITAE        22.02   59.64   8848.02    9.33    13.85   1.53
PD          10             ISE         21.89   36.82   0.36       22.36   38.02   0.93
PD          10             IAE         18.99   32.99   0.79       19.63   34.70   0.43
PD          10             ITSE        20.47   36.72   0.06       21.16   38.45   0.08
PD          10             ITAE        16.66   29.83   0.01       17.43   31.59   0.02
PD          30             ISE         11.32   20.67   0.01       12.45   22.62   0.00
PD          30             IAE         11.31   20.57   0.00       12.43   22.55   0.01
PD          30             ITSE        12.83   23.48   0.00       13.91   25.82   0.00
PD          30             ITAE        11.00   20.08   0.00       12.11   22.08   0.00
No PID      0              –           41.28   58.97   257.09     2.31    59.94   0.01
Table 3 also shows the X-position performance of the mobile robot
using PID, PI and PD controllers, and without a PID controller, tuned by EBA with different numbers of searching agents. The PD controller tuned with 30 searching
agents produces a better result, with shorter rise time and settling time, than with 10
searching agents. The difference between the two is significant:
with 30 searching agents the response is faster by 5.66 s in
rise time and by 8.75 s in settling time.
For the Y position, the results show a large difference in performance between the PD controller tuned with 10 and with 30 searching agents. With 30 searching agents,
the settling time improved by 12.11 s. The search area of the PD
controller, which is set by the upper and lower boundaries
for both gains, is appropriate to the number of searching agents. Thus, the
searching agents are able to search for and obtain gains that give good performance for the mobile robot. The number of searching agents is one of the important
factors that influence the performance of the system.
Sahib et al. [18] reported that using the most suitable
objective function can greatly improve PID tuning optimization. The
comparison between the different objective functions is also shown in Table 3.
Referring to Table 3, ITAE (Integral Time Absolute Error) is the best objective
function for tuning the PD controller in this system. The rise time and settling
time for ITAE are the shortest, at 11.00 and 20.08 s respectively. IAE and ISE weight
all errors independently of time, which can result in a response with relatively
small overshoot compared to ITAE and ITSE.
The Y-position results for the PD controller tuned by minimizing the four objective functions show that ITAE again performs
best, as for the X position. The overshoot produced by ISE and ITSE is higher than
that produced by IAE and ITAE by 50%. The rise time and settling time for IAE
and ITAE are shorter than those produced by ITSE and ISE. These results indicate
that squaring the error term amplifies it and makes the system
unstable. Overshoot plays an important role in measuring the performance of the mobile robot.
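For reference, the four objective functions compared above can be evaluated from a sampled error signal as follows. This is a minimal sketch using the standard textbook definitions (ISE = ∫e² dt, IAE = ∫|e| dt, ITSE = ∫t·e² dt, ITAE = ∫t·|e| dt) with trapezoidal integration; the function name `error_integrals` is hypothetical and the paper's own implementation is not shown in this section.

```python
def error_integrals(t, e):
    """Return a dict with ISE, IAE, ITSE and ITAE for samples e[k] taken at times t[k]."""
    totals = {"ISE": 0.0, "IAE": 0.0, "ITSE": 0.0, "ITAE": 0.0}
    for k in range(1, len(t)):
        dt = t[k] - t[k - 1]
        # Integrand values at the two ends of the interval.
        ends = []
        for tk, ek in ((t[k - 1], e[k - 1]), (t[k], e[k])):
            ends.append({
                "ISE": ek * ek,
                "IAE": abs(ek),
                "ITSE": tk * ek * ek,
                "ITAE": tk * abs(ek),
            })
        for name in totals:
            totals[name] += 0.5 * (ends[0][name] + ends[1][name]) * dt  # trapezoid rule
    return totals


# A constant unit error over 2 s gives 2.0 for every criterion.
scores = error_integrals([0.0, 1.0, 2.0], [1.0, 1.0, 1.0])
```

The time-weighted criteria (ITSE, ITAE) penalise errors that persist late in the response more heavily, while the squared criteria (ISE, ITSE) penalise large errors more heavily than small ones.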
3.2 Performance Comparison with WS-PSO and WS-BA
The comparison in this section involves only the PD controller, tuned by WS-EBA,
WS-PSO and WS-BA, because the PD controller outperformed the other
controllers in the previous experiment. For the same reason, the number of
searching agents is set to 30 and ITAE is used as the objective function. PSO
was chosen because it was used in [12] to tune a PID controller
for a charger system, and BA was chosen because of the previous results for a
WMR tuned by BA [16]. In addition, EBA originates from BA, so it is
appropriate to compare their performance.
Figure 4 shows the convergence curves of EBA, BA and PSO. By the end of 100
iterations, none of the fitness curves is still changing. PSO, however, shows no
improvement: its curve remains steady over the whole 100 iterations and its
fitness value is the highest. BA, on the other hand, keeps improving until about 40
iterations, and EBA stops improving at 12 iterations. The fitness value for BA is the
lowest. However, although the problem is to minimise the position error, the
outcome depends on the PD controller gains tuned for the mobile robot system.
Fig. 4 Convergence curve fitness function for WS-EBA, WS-BA and WS-PSO
Table 4 Weightage values for WS-EBA, WS-PSO and WS-BA
Algorithm  Total fitness (fT)  Weightage 1 (W1)  Fitness 1 (f1)  Weightage 2 (W2)  Fitness 2 (f2)
WS-EBA     1.321               0.749             0.885           0.251             2.621
WS-PSO     3.224               0.453             1.587           0.547             4.580
WS-BA      0.742               0.642             1.032           0.358             0.222
The control parameters of each algorithm differ from one another. Good
performance depends on how well the algorithm controls
and balances the exploration and exploitation phases. Too much exploration leads the searching agents to diverge from the best solution, while excessive
exploitation makes the algorithm deadlock and become trapped in a wrong
solution. Thus, it is important to control how the searching agents search for the best
solution.
Table 4 shows the total fitness, the first and second weightages, and the fitness of
the ITAE function used for minimizing the first and second errors in the system. The
weightages sum to one. The total fitness obtained by WS-PSO is
the highest, while that obtained by WS-BA is the lowest. The values
may indicate how well each algorithm converged in its search for the best solution. Being
trapped in a local minimum leaves the agents unable to explore further and gives a value
higher than the best solution. On the other hand, an uncontrolled exploration phase makes the
agents miss the best solution by taking a solution from another low-fitness region.
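The entries of Table 4 can be cross-checked with a few lines of arithmetic. The relation fT = W1·f1 + W2·f2 is assumed here from the weighted-sum name of the method rather than quoted from this section; under that assumption the tabulated totals are reproduced to within rounding. The helper name `weighted_sum` is hypothetical.

```python
# (W1, f1, W2, f2, fT) as listed in Table 4.
TABLE_4 = {
    "WS-EBA": (0.749, 0.885, 0.251, 2.621, 1.321),
    "WS-PSO": (0.453, 1.587, 0.547, 4.580, 3.224),
    "WS-BA": (0.642, 1.032, 0.358, 0.222, 0.742),
}


def weighted_sum(w1, f1, w2, f2):
    """Assumed weighted-sum scalarisation of two fitness values."""
    return w1 * f1 + w2 * f2


for name, (w1, f1, w2, f2, f_total) in TABLE_4.items():
    computed = weighted_sum(w1, f1, w2, f2)
    print(f"{name}: W1 + W2 = {w1 + w2:.3f}, computed fT = {computed:.3f}, tabulated fT = {f_total:.3f}")
```

For WS-EBA, for example, 0.749 × 0.885 + 0.251 × 2.621 ≈ 1.321, which matches the tabulated total.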
Table 5 shows the PD gains tuned by WS-EBA, WS-PSO and WS-BA.
Each algorithm produced different results, depending on the search method its
agents use. In this optimization experiment, the best number of
searching agents, 30, and the best objective function, ITAE, are used to
optimize the gain values of the PD controller. The performance of the results can only
be determined after implementing the controller in the kinematic model of the mBot
and running the closed-loop system.
Table 5 PD gains value for comparison with different algorithm
Optimization  KP      KD
WS-EBA        14.376  20.223
WS-PSO        16.562  63.962
WS-BA         14.875  40.452
Fig. 5 Box plot of EBA, BA and PSO for result of fitness values for five times repeatability
Figure 5 shows the box plot of the fitness values over five runs of each algorithm, and
Table 6 shows the analysis of performance from the box plot. EBA produces the
most consistent data, as shown by the size of its box: the smaller the box, the more
consistent the data are with the median. PSO has the highest median and
the largest range between the maximum and minimum points.
This means that PSO is the worst in terms of consistency.
Table 6 Information of box plot for EBA, BA and PSO by referring Fig. 5
Algorithm  Maximum point  Minimum point  Median  Number of point  Outliers
WS-EBA     1.284          1.204          1.216   5                No
WS-PSO     4.514          1.229          2.845   5                No
WS-BA      1.404          1.199          1.232   5                No
Fig. 6 Graph of X position for optimizing PD controller by using different algorithm
Table 7 Result of performance with different optimization algorithm approach, X position
Optimization  Tr (s)  Ts (s)  Os (s)
WS-EBA        17.650  34.989  0
WS-PSO        21.454  41.440  0
WS-BA         19.753  38.776  0
Figure 6 and Table 7 display the results of the PD controller tuned by the different
algorithms for the X position. The mobile robot performs best with the PD controller
tuned by WS-EBA, which gives the fastest rise time, 17.65 s, and the fastest
settling time, 34.989 s. The worst result is shown by WS-PSO. The PD
controller tuned by WS-BA achieved performance close to that of WS-EBA
because BA is the algorithm from which EBA originates, and both use the same main method for
searching for the solution. WS-EBA obtains a better result owing to the specific
method it uses, applying a spiral path to search for the solution.
Figure 7 and Table 8 show the results of the PD controller tuned by the different
algorithms for the Y position. Here, the performance of the mobile robot is
essentially the same with the PD controller gains tuned by WS-EBA and by
WS-BA. The PD controller tuned by WS-PSO produced the worst performance, with the
longest rise time, 1.49 s, the longest settling time, 11.63 s, and 62% overshoot.
Although WS-BA gives good performance in the Y position, WS-EBA produced the best performance for both the X and Y positions. This makes the
controller tuned by WS-EBA the best controller compared with those from the other
algorithms. Rise time, settling time and overshoot are the three main indicators used to
determine the performance of the controller for the kinematic model of the system.
Fig. 7 Graph of Y position for optimizing PD controller by using different algorithm
Table 8 Result of performance with different optimization algorithm approach, Y position
Optimization  Tr (s)  Ts (s)  Os (s)
WS-EBA        1.328   9.811   1.352
WS-PSO        1.488   11.626  1.626
WS-BA         1.328   9.821   1.352
4 Conclusion
Extended Bat Algorithm is one of the latest hybrid algorithms and had not previously been
applied to a controller optimization problem. This
research study demonstrates the potential of EBA. Solving a multi-objective
optimization problem is a new challenge taken on with EBA.
EBA produced the best result for optimizing and tuning the gains of the PID controller.
Based on the experiments conducted, the PD controller tuned by WS-EBA using 30 searching
agents and ITAE as the fitness function is the best controller compared with the
PD controllers tuned by WS-PSO and WS-BA. The PD controller was selected as
the best among the PID and PI controllers. The PD controller with P gain 14.776 and D gain
20.223 has the best rise time, 17.65 s, and settling time, 34.989 s, and is one of the
controllers with the lowest overshoot, which is 3.5%.
References
1. Abdalla TY, Abed AA, Ahmed AA (2017) Mobile robot navigation using PSO-optimized
fuzzy artificial potential field with fuzzy control. J Intell Fuzzy Syst 32(6):3893–3908
2. Jeng JC, Tseng WL, Chiu MS (2014) A one-step tuning method for PID controllers with
robustness specification using plant step-response data. Inst Chem Eng 92(3):545–558
3. Din A, Jabeen M, Zia K, Khalid A, Saini DK (2018) Behavior-based swarm robotic search
and rescue using fuzzy controller. Comput Electr Eng 70:53–65
4. Ye Y, Yin CB, Gong Y, Zhou JJ (2017) Position control of nonlinear hydraulic system using
an improved PSO based PID controller. Mech Syst Signal Process 83:241–259
5. Kanojiya RG, Meshram PM (2012) Optimal tuning of PI controller for speed control of DC
motor drive using particle swarm optimization. In: Proceeding of 2012 international
conference on advances in power conversion and energy technologies (APCET), Mylavaram,
Andhra Pradesh, pp 1–6
6. Chia KS (2018) Ziegler-Nichols based proportional-integral-derivative controller for a line
tracking robot. Indones J Electr Eng Comput Sci 9(1):221–226
7. Majid NA, Mohamed Z, Basri MAM (2016) Velocity control of a unicycle type of mobile
robot using optimal PID controller. Jurnal Teknologi 78(7–4):7–14
8. Foley MW, Ramharack NR, Copeland BR (2005) Comparison of PI controller tuning
methods. Ind Eng Chem Res 44(17):6741–6750
9. Sujay HS, Suman R, Chaithanya S, Narayanan S, Shamanth U (2018) Tuning and analysis of
PID controllers using soft computing techniques. Int J Sci Res Sci Technol 5(3):67–71
10. Goswami NK, Padhy PK (2018) Sliding mode controller design for trajectory tracking of a
non-holonomic mobile robot with disturbance. Comput Electr Eng 72:307–323
11. Nazari MAD, Khooban MH (2015) Design of optimal Mamdani-type fuzzy controller for
nonholonomic wheeled mobile robots. J King Saud Univ Eng Sci 27(1):92–100
12. Solihin MI, Tack LF, Kean ML (2011) Tuning of PID controller using particle swarm
optimization (PSO). In: Proceeding of international conference of advance science
engineering information technology, Putra Jaya, Malaysia, pp 458–461
13. Baral KK, Barisal AK, Mohanty B (2017) Load frequency controller design via GSO
algorithm for nonlinear interconnected power system. In: Proceeding of 2016 international
conference on signal processing, communication, power and embedded system (SCOPES),
Paralakhemundi, vol 77, pp 662–668
14. Pebrianti D, Ann NQ, Bayuaji L, Abdullah NRH, Zain ZM, Riyanto I (2019) Extended bat
algorithm (EBA) as an improved searching optimization algorithm. In: Md Zain Z, Ahmad H,
Pebrianti D, Mustafa M, Abdullah NRH, Samad R, Mat Noh M (eds) Proceeding of the 10th
national technical seminar on underwater system technology 2018, vol 538. LNEE. Springer,
Heidelberg, pp 229–237
15. Pebrianti D, Hao YH, Suarin NAS, Bayuaji L, Musa Z, Syafrullah M, Riyanto I (2018)
Motion tracker based wheeled mobile robot system identification and controller design. In:
Hassan MHA (ed) Intelligent manufacturing & mechatronics, vol 538. LNME. Springer,
Heidelberg, pp 241–258
16. Suarin NAS, Pebrianti D, Ann NQ, Bayuaji L, Syafrullah M, Riyanto I (2019) Performance
evaluation of PID controller parameters gain optimization for wheel mobile robot based on bat
algorithm and particle swarm optimization. In: Md Zain Z, Ahmad H, Pebrianti D,
Mustafa M, Abdullah NRH, Samad R, Mat Noh M (eds) Proceeding of the 10th national
technical seminar on underwater system technology 2018, vol 538. LNEE. Springer,
Heidelberg, pp 323–333
17. Karahan O, Bingül Z (2011) A fuzzy logic controller tuned with PSO for 2 DOF robot
trajectory control. Expert Syst Appl 38(1):1017–1031
18. Sahib MA, Ahmed BS (2016) A new multiobjective performance criterion used in PID tuning
optimization algorithms. J Adv Res 7(1):125–134
An Analysis of State Covariance of Mobile Robot Navigation in Unstructured Environment Based on ROS
Hamzah Ahmad, Lim Zhi Xian, Nur Aqilah Othman,
Mohd Syakirin Ramli, and Mohd Mawardi Saari
Abstract This paper deals with mobile robot navigation in an unstructured environment using the Robot Operating System (ROS). ROS is a framework for developing
robotic applications, and it includes algorithms to build maps, navigate, and
interpret sensor data. The system is used to define a condition of mobile robot
navigation in a specific environment in order to evaluate the estimation performance. The
research aims to analyze and investigate mobile robot movement in an unknown
environment using the Kalman Filter approach while considering uncertainties. Only one
LiDAR sensor and one IMU sensor are used to measure the relative distance and
provide the information for estimation. An experiment with a TurtleBot
that navigates autonomously with collision avoidance was carried out to
characterize the mobile robot motions through the application of the Kalman Filter. The
experimental analysis was carried out only after the simulation had performed
as expected. The results show that the Kalman Filter can estimate the condition of the environment sufficiently well using only a LiDAR and an IMU
sensor. In addition, the calculated state covariance
agrees with the theoretical analysis.
Keywords Kalman Filter · Navigation · Mobile robot · LiDAR · Covariance
1 Mobile Robot Navigation
Working with an autonomous mobile robot is a challenging task that requires extensive
system analysis, integration of parts and sensors, and consideration of environmental conditions and
techniques. Introduced more than two decades ago, the simultaneous localization and
mapping problem, known simply as SLAM, is an integral part of navigation that
demands researchers take into account several factors that can easily affect
mobile robot performance. Issues such as computational cost, complexity,
environment dynamics and uncertainties continue to inspire researchers
to seek more reliable solution techniques.
H. Ahmad (✉) · L. Z. Xian · N. A. Othman · M. S. Ramli · M. M. Saari
Faculty of Electrical and Electronics Engineering, UMP, Pekan, Malaysia
e-mail: hamzah@ump.edu.my
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_18
Habibie and other researchers [1–3] state that SLAM, also known as
Concurrent Mapping and Localization, is an approach to solving the
“chicken-and-egg problem” of robot localization and mapping. The problem
arises because building a good map of the robot’s environment requires a precise
estimate of the robot’s own position, yet good localization can only be achieved when a
well-defined map is available. In the SLAM problem, at each observation time, a
mobile robot knows only its sensor measurements and the controls given.
From this information, the system needs to find the probability of all poses,
or mobile robot states, and the map of the environment concurrently.
As stated earlier, SLAM comprises two main issues, localization and
mapping, and each demands good and consistent results to guarantee
good mobile robot performance [4]. Park [5] stated that, in map building,
the simpler the geometry of the working area, the larger the error in the
localization estimate. In other words, less information obtained by the sensors leaves
the mobile robot more uncertain about its estimate. Therefore, most
current research applies sensor fusion across different sensor types to obtain better
results. In view of this, this research attempts to use a single LiDAR
sensor and analyze the resulting estimation performance.
Lotfy [6] states that one major problem with SLAM is that the measurements
read from the sensors invariably contain noise, and the motion performed by
the mobile robot also introduces uncertainties during its observations. This is why
the Kalman Filter, which relies on its state covariance, becomes necessary
in the SLAM problem. Since the Kalman Filter uses linear models whereas the
typical SLAM problem is nonlinear in nature, a nonlinear variant, the
Extended Kalman Filter (EKF), is applied. The EKF SLAM method consists mainly
of two steps, the prediction step and the correction step. The details of the
EKF algorithm are presented in a later section.
Much research has examined EKF performance under
various conditions, ranging from theoretical analysis to experimental verification, e.g. Huang et al. [3, 7] and Ahmad et al. [8–11]. Note that other
techniques are also available for SLAM, such as the particle filter and other members of the Kalman Filter
family. However, given their shortcomings, such as computational cost and
complexity, EKF still offers a better choice for solving the SLAM problem.
One of the important aspects of EKF is the behavior of the state covariance. To
guarantee that a good estimate is preserved, the state covariance must always converge, and this is the reason why EKF is one of the most widely applied techniques
in SLAM [7].
Applications of the Kalman Filter in ROS can also be found in a number of papers
with different environments and settings. Kokovkina et al. used EKF
to localize a mobile robot and compared the result with
data acquired from sensing devices such as a camera and a laser scanner. The
results are satisfactory, with errors at an acceptable level [12]. UAVs are
another example of successful implementation of EKF in the ROS environment
[13, 14]. Images obtained from camera observations are fed as references into
EKF to predict the landing estimate, especially when the landing
platform is not detected. In fact, the error produced by EKF is much smaller
than that of the sensors [13]. From the perspective of the sensing
devices applied for EKF in ROS, Ponce et al. claim that, using only a laser
scanner, a robot is still able to transport people in a domestic area. They
present the robot as an autonomous wheelchair that moves efficiently from a specific place to its
destination with EKF [15]. Their results point to the possibility of
reducing the computation while still providing a sufficient technique for estimation.
Inspired by the findings in the literature, this paper attempts to analyze the performance of EKF in the ROS environment using LiDAR and IMU sensors for
measurement. The LiDAR sensor is used because it provides better measurements than
a sonar sensor while reducing the computational cost of solving the
mobile robot localization and mapping problem. The IMU, meanwhile, is used to identify the
mobile robot's heading angle as it moves around the environment. The state
covariance of the estimate is also examined to understand its relation to the estimation
and to compare it with the theoretical results in the literature.
Analyses of the EKF state covariance in the preceding literature, especially in the ROS
environment, are few, and most focus on statistical error performance.
As EKF also concerns state covariance analysis, this paper
observes the overall behavior of the state covariance throughout the estimation process. For verification, a TurtleBot is used as the main
platform, as the mobile robot is easy to control and its movements are easy to estimate.
This paper is organized as follows. Section 2 describes the Kalman
Filter algorithm in SLAM and the mobile robot, the TurtleBot 3 Burger.
Section 3 then presents the simulation and experimental analysis of the proposed
system. Finally, Sect. 4 concludes the findings of the research.
2 Navigation and TurtleBot 3
2.1 SLAM and Kalman Filter
As mentioned in the previous section, SLAM consists of two main parts,
known as the process and measurement models. In this paper, the same system
configuration as in Ahmad et al. [10] is applied. The process model is stated as
follows. Consider a state $x_k \in \mathbb{R}^{3+2n}$ which consists of the mobile robot x, y position
and its heading angle, together with n landmarks marked by their x, y locations. The
kinematic model of the mobile robot is represented by
$$x_{k+1} = f(x_k, u_k, \omega) \qquad (1)$$
where $u_k$ defines the control input, which describes the mobile robot
velocity and angular acceleration, and $\omega$ represents the noise occurring during mobile
robot motion.
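As an illustration of the noise-free robot part of the process model, the sketch below propagates a planar pose with a unicycle-type kinematic model. This is an assumption made for illustration: the section does not give the explicit form of f, and the sketch takes the control input to be a linear velocity `v` and an angular rate `w` (both hypothetical names). Landmark states are omitted because stationary landmarks would not change in the prediction.

```python
import math


def wrap_angle(angle):
    """Wrap an angle to the interval (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def predict_pose(pose, v, w, dt):
    """Propagate (x, y, theta) one time step with an assumed unicycle model.

    pose : tuple (x, y, theta) in metres and radians
    v    : linear velocity in m/s (assumed control input)
    w    : angular rate in rad/s (assumed control input)
    dt   : time step in seconds
    """
    x, y, theta = pose
    x_new = x + v * math.cos(theta) * dt
    y_new = y + v * math.sin(theta) * dt
    theta_new = wrap_angle(theta + w * dt)
    return (x_new, y_new, theta_new)


# Driving straight along the x axis for one second at 1 m/s.
next_pose = predict_pose((0.0, 0.0, 0.0), v=1.0, w=0.0, dt=1.0)
```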
To observe the surrounding area, the mobile robot needs to know its environment, so sensors are important for retrieving the related information. This is
accomplished by using LiDAR to measure the relative distance between the mobile
robot and any recognized landmarks during its observations. The measurement is calculated as follows.
$$z_{k+1} = h(x_k, \nu) \qquad (2)$$
where $z_{k+1}$ is the measurement matrix, which consists of the relative distances and angles between the mobile robot and the landmarks, and $\nu$ is the measurement noise.
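A noise-free range and bearing measurement of the kind described above can be sketched as follows. The explicit form of h is not given in this section, so the formulas below are the usual range-bearing model and should be read as an assumption; the function name `range_bearing` is hypothetical.

```python
import math


def range_bearing(pose, landmark):
    """Return (distance, relative angle) from a robot pose to a landmark.

    pose     : tuple (x, y, theta) of the robot in metres and radians
    landmark : tuple (lx, ly) of the landmark position in metres
    The angle is measured relative to the robot heading and wrapped to (-pi, pi].
    """
    x, y, theta = pose
    lx, ly = landmark
    dx = lx - x
    dy = ly - y
    distance = math.hypot(dx, dy)
    bearing = math.atan2(dy, dx) - theta
    bearing = math.atan2(math.sin(bearing), math.cos(bearing))  # wrap the angle
    return distance, bearing


# A landmark 3 m ahead and 4 m to the left of a robot at the origin facing +x.
d, b = range_bearing((0.0, 0.0, 0.0), (3.0, 4.0))
```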
The two models above are essential for the system's analysis and further
calculation, especially for the Kalman Filter. The Kalman Filter generally consists of
two stages, the prediction and update steps. The prediction stage
uses the kinematic model of the mobile robot to infer the location of the
mobile robot from its movements. It is followed by the update step,
which continuously updates the mobile robot location, as well as the landmarks, at each
time frame. Compared with the process and measurement models, these two steps
look the same but with no noise considered in the calculation. The prediction
stage is given by the following equation.
$$x^-_{k+1} = \hat{x}_k + f(x_k, u_k) \qquad (3)$$
where $\hat{x}_k$ is the predicted state, with its associated state covariance matrix expressed
by
$$P^-_{k+1} = f P_k f^T + Q_k \qquad (4)$$
$P^-_{k+1}$ is the predicted state covariance with its associated noise, $Q_k$. The information
obtained in the prediction stage is then used to update the estimated state. The
updated state $x^+_{k+1}$ becomes
$$x^+_{k+1} = x^-_{k+1} + K\left(z_{k+1} - h(x^-_{k+1})\right) \qquad (5)$$
where K is the Kalman gain,
$$K = P^-_{k+1} h^T \left(h P^-_{k+1} h^T + R_k\right)^{-1} \qquad (6)$$
and $R_k$ is the covariance of the measurement error produced by the sensor. The
equations above are then used to find the updated covariance.
$$P^+_{k+1} = (I - K h) P^-_{k+1} \qquad (7)$$
One of the important criteria in the Kalman Filter is that the state covariance is always
positive semidefinite. In addition, the state covariance always converges to its
initial state, as reported by Huang et al. [7] and Ahmad et al. [10]. The state covariance
is related to the estimation errors and indicates whether or not the estimation has high accuracy. Many studies have shown that if
the state covariance becomes higher, the mobile robot can easily become
uncertain about its estimate. The problem is especially severe for one
technique, the H∞ Filter, where the state covariance
can increase abruptly. Consequently, there has been much analysis of the
state covariance for filters related to EKF, such as the particle filter and the Unscented
Kalman Filter. These properties are observed in the experimental analysis
in a later section for verification.
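The covariance behaviour discussed above can be illustrated with the covariance part of the filter alone. The sketch below iterates the prediction, gain and update formulas for a one-dimensional linear system, so that f, h, P, Q and R are all scalars. The numerical values and the function name `covariance_step` are illustrative assumptions, not the TurtleBot configuration; in the EKF, f and h in these formulas would be the Jacobians of the process and measurement models.

```python
def covariance_step(p, f=1.0, h=1.0, q=0.01, r=0.1):
    """One predict/update cycle of the scalar Kalman covariance recursion.

    Returns (p_pred, gain, p_upd).
    """
    p_pred = f * p * f + q                    # predicted covariance
    gain = p_pred * h / (h * p_pred * h + r)  # Kalman gain
    p_upd = (1.0 - gain * h) * p_pred         # updated covariance
    return p_pred, gain, p_upd


# Start from a large initial uncertainty and iterate the recursion.
history = [1.0]
for _ in range(50):
    history.append(covariance_step(history[-1])[2])

print(f"initial P = {history[0]:.4f}, final P = {history[-1]:.6f}")
```

In this scalar example the updated covariance stays non-negative at every step and settles to a constant value, which mirrors the non-negativity and convergence properties that the section sets out to verify experimentally.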
2.2 TurtleBot 3 Configuration
This research uses the TurtleBot 3 shown in Fig. 1. To prepare for the experimental
analysis, the ROS packages need to be installed on a computer. The installation
procedure can be found on the ROS wiki, and further information
is available on the website. The TurtleBot must remain connected to the
computer so that information on the system performance is received continuously. The
gmapping technique is applied for mapping analysis, and the initial results are shown
in Fig. 2.
Once the system has been prepared, the EKF package from the ROS wiki is installed
on the computer. The package takes odometry and IMU sensor data. Odometry is the
use of data from motion sensors or LiDAR to estimate the change in position over time,
while the IMU (Inertial Measurement Unit) is a sensor that determines the orientation of
the TurtleBot. The package was published in 2012 and has received
few updates since then. The system is then tested to ensure that all
information can be obtained from these two sensors.
Fig. 1 Turtlebot 3 burger model
Fig. 2 Testing the gmapping of the mobile robot
3 Analysis and Discussion of the Experimental Results
This section presents the experimental results. For evaluation, two different places are selected to assess the performance of estimation
using only the LiDAR sensor. The discussion focuses on EKF performance, in particular the state covariance conditions as the mobile robot moves
around the environment. The mobile robot moves autonomously and is monitored
through the computer for verification. The environment is assumed
to contain no dynamic objects and to be planar, as the measurements are made
in 2D.
Figure 3 shows the initial map constructed by the mobile robot in the dining hall,
which has dimensions of 20 m × 4 m. After a period of time, the mobile robot
completed the mapping, as presented in Fig. 4. These figures show
some erroneous estimation results. The error does not accumulate over time and depends strongly on the initial measurement made by the
mobile robot. Other possible causes are tyre slippage and the
initial state covariance values. It was found that higher initial state
covariance values yield higher estimation error.
Fig. 3 Before mapping of the dining hall
Fig. 4 Dining hall final mapping. Blue line shows the real environment based on odometry measurement
Fig. 5 Odometry of position x against position y with lower initial state covariance
With a lower initial state covariance, a clearer picture of the estimation results is obtained, as shown in Fig. 5 for the x-y positions. Figure 5 indicates that the measured wheel odometry and the EKF estimate coincide, so the EKF in fact reproduces errors similar to those of the odometry measurement. Even though the Kalman filter error is sufficiently low when compared with the wheel odometry measurement, the mapping result is not the best that can be achieved. Hence, the initial measurements play an important role in guaranteeing that a good estimate is maintained. In addition, the IMU reading also has a significant effect on the estimation. The state covariances for both the x and y states are also small, as depicted in Figs. 6 and 7 respectively. It can be observed that at the beginning of the measurement period,
Fig. 6 Covariance of position x by robot pose EKF against time
Fig. 7 Covariance of position y published by robot pose EKF package against time
high uncertainty was perceived, which makes the estimation erroneous. If the state covariance remains consistent at all times, the error becomes lower and the robot is able to produce better estimation results.
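The effect of the initial state covariance described above can be illustrated with a minimal sketch of a scalar Kalman filter. This is not the robot pose EKF package used in the experiment; the function name `covariance_history` and the values of `p0`, `r` and `q` are illustrative assumptions.

```python
def covariance_history(p0, r, q=0.0, steps=20):
    """Propagate the covariance of a scalar Kalman filter.

    p0: initial state covariance, r: measurement noise variance,
    q: process noise variance (all hypothetical values).
    Returns the posterior covariance after each measurement update.
    """
    p = p0
    history = []
    for _ in range(steps):
        p = p + q              # predict: uncertainty grows by the process noise
        k = p / (p + r)        # Kalman gain
        p = (1.0 - k) * p      # update: uncertainty shrinks after a measurement
        history.append(p)
    return history


low = covariance_history(p0=0.01, r=0.05)   # small initial covariance
high = covariance_history(p0=10.0, r=0.05)  # large initial covariance
```

In this simplified setting the run that starts with the larger initial covariance stays more uncertain at every step, which is in line with the observation that a high initial uncertainty degrades the early estimates.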
A second room, measuring 7 m × 7 m, was also investigated to assess the consistency of the mobile robot's performance. The same mapping procedure was applied in this room. Figures 8 and 9 show the map at the initial position of the TurtleBot and after its movements, respectively. In this experiment, the results are better because the initial state covariance is smaller and the mobile robot moves in a smaller environment. Compared with the dining hall estimation, the results are more accurate and the observed covariance is smaller (Figs. 10 and 11).
Fig. 8 Initial position of turtlebot in mapping
Fig. 9 Final mapping of the turtlebot
Fig. 10 Covariance position x against time
Fig. 11 Covariance position y against time
4 Concluding Remarks
As demonstrated above, the EKF can provide a sufficiently good estimate of the surrounding area, with approximately 90% accuracy, especially when the initial state covariance is chosen to suit the environment. This can be accomplished by observing and identifying the sensing capabilities of the mobile robot and the complexity of the environment. Even though identifying a good initial state covariance is one of the challenging factors to be considered, the results still show good estimation. In addition, the estimation agrees with the theoretical analysis provided in the literature, even in different surroundings. It was also possible to estimate an environment using a minimal yet efficient set of sensors, namely LiDAR and IMU sensors. Moreover, it was found that the design of the robot and the environment must be taken into account to ensure that a good estimate can be achieved.
Acknowledgements The research was conducted under UMP grant RDU1703139. The authors would like to thank Universiti Malaysia Pahang for the continuous support in achieving the research outcomes.
References
1. Habibie N, Nugraha AM, Anshori AZ, Ma’sum MA, Jatmiko W (2017) Fruit mapping mobile
robot on simulated agricultural area in Gazebo simulator using simultaneous localization and
mapping (SLAM). In: 2017 international symposium micro nano mechatronics and human
science (MHS), Japan. IEEE
2. Durrant-Whyte H, Bailey T (2006) Simultaneous localization and mapping: part I. IEEE
Robot Autom Mag 13(2):99–110
3. Dissanayake G, Newman P, Clark S, Durrant-Whyte H, Csorba M (2001) A solution to the
simultaneous localization and map building (SLAM) problem. IEEE Trans Robot Autom 17(3):229–241
4. Thrun S, Burgard W, Fox D (2005) Probabilistic robotics. MIT Press, Cambridge
5. Park S, Lee G (2017) Mapping and localization of cooperative robots by ROS and SLAM in
unknown working area. In: 2017 56th annual conference of the society of instrument and
control engineers of Japan (SICE), Japan. IEEE, pp 858–861
6. Saman ABSHM, Lotfy AH (2016) An implementation of SLAM with extended Kalman filter.
In: 2016 6th international conference on intelligent and advanced systems (ICIAS), Malaysia.
IEEE, pp 1–4
7. Huang S, Dissanayake G (2007) Convergence and consistency analysis for extended Kalman
filter based SLAM. IEEE Trans Robot 23(5):1036–1049
8. Ahmad H, Othman NA, Saari M, Ramli MS (2019) Investigating state covariance properties
during finite escape time in H∞ filter SLAM. In: Md Zain Z et al (eds) Proceedings of the
10th national technical seminar on underwater system technology 2018. Lecture notes in
electrical engineering, vol 538. Springer, Heidelberg
9. Ahmad H, Othman N (2015) The impact of cross-correlation on mobile robot localization.
Int J Control Autom Syst 13(5):1251–1261
10. Ahmad H, Othman NA, Saari MM, Ramli MS, Mazlan MBM, Namerikawa T (2017) A
hypothesis of state covariance decorrelation effects to partial observability SLAM. Indones J
Electr Eng Comput Sci 14(2):588–596
11. Othman N, Ahmad H, Namerikawa T (2016) Sufficient condition for estimation in designing
H∞ filter-based SLAM. Math Prob Eng 2015:1–14
12. Kokovkina VA, Antipov VA, Kirnos VP, Priorov AL (2019) The algorithm of EKF-SLAM
using laser scanning system and fisheye camera. In: 2019 systems of signal synchronization,
generating and processing in telecommunications (SYNCHROINFO), Russia. Media
Publisher, pp 1–6
13. Ruiz MS, Vargas AMP, Cano VR (2018) Detection and tracking of a landing platform for
aerial robotics applications. In: 2018 IEEE 2nd colombian conference on robotics and
automation (CCRA), Barranquilla. IEEE, pp 1–6
14. Ponce R, Mosquera Canchingre G, Velarde P, Moya M (2018) Design and construction of an
automatic transport system inside the home for people with reduced mobility. In: 2018
International conference on information systems and computer science (INCISCOS), Ecuador.
IEEE, pp 88–93
15. Li B, Liu H, Zhang J, Zhao X, Zhao B (2017) Small UAV autonomous localization based on
multiple sensors fusion. In: 2017 IEEE 2nd advanced information technology, electronic and
automation control conference (IAEAC), Chongqing. IEEE, pp 296–303
Control Strategy for Differential Drive
Wheel Mobile Robot
Nor Akmal Alias and Herdawatie Abdul Kadir
Abstract The wheeled mobile robot (WMR) is widely used nowadays. It is used not only in industry but has also been developed to aid patients in rehabilitation. Robotics is now widely adopted because it can reduce the therapist's workload and deliver efficient results. Robots used as rehabilitation devices can help patients regain the ability to walk after losing it through stroke, spinal cord injury or traumatic brain injury. A widely used gait training device is the Andago. The motivation of this study is to develop a control strategy for a differential drive wheeled mobile robot. The robot's task is to move in a straight line in the workspace even though it is powered by two non-identical electric motors. The rotational speed is regulated by the developed controller to achieve a straight trajectory of the WMR. This paper proposes a trajectory tracking control for a WMR using a sliding mode controller (SMC). SMC is well suited to trajectory tracking of nonholonomic robots. The sliding surface of the SMC converges to zero and, consequently, the error produced while the robot moves also converges to zero.
Keywords Kinematics · Dynamics · Wheel mobile robot · Differential drive robot · Sliding mode controller
1 Introduction
In recent years, developments in robotic technology have reached a significant milestone. Heavy work conventionally done by hand by our predecessors is now mostly accomplished with automated machinery. Robots are applied in diverse fields such as logistics, aerospace and medicine. In the medical field, robots are normally used to assist doctors in the rehabilitation of bedridden patients.
N. A. Alias · H. A. Kadir (✉)
Faculty of Electrical and Electronic Engineering, Universiti Tun Hussein Onn Malaysia,
83000 Batu Pahat, Johor, Malaysia
e-mail: herdawatieabdulkadir@gmail.com
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_19
During rehabilitation sessions, the patient is strapped into a device that assists them in walking from one point to another. The ergonomic features of the machine help patients to retain a normal skeletal structure. Previously, medical personnel needed to support the patients themselves and guide them slowly. This method is inefficient and time consuming. Patients rely on the therapist to carry out gait training. They are not able to do it by themselves because they have lost the ability to walk due to their illness. The degree of illness may differ from one patient to another, and a higher degree of illness requires a higher level of assistance. The therapist's workload can be reduced by using a robotic device as a gait trainer.
The non-holonomic properties of the differential drive wheeled mobile robot (DDWMR) impose some mobility restrictions in its applications, mostly regarding the trajectory tracking problem. To overcome this, variable structure control (VSC) has been successfully designed as a robust controller for diverse applications such as electrical motor control, autonomous underwater vehicles, flight stability and robotics [1]. Thus, SMC may be one of the best approaches for this non-holonomic robot.
Implementing a system for an autonomous robot is, however, not without challenges. Trajectory tracking control of a wheeled mobile robot (WMR) was initially based solely on kinematic models because of their close relation to the nonholonomic constraints [2, 3]. However, the output does not reproduce the actual situation in real life. Kinematics only ascertains the current position relative to the input. The reference points are derived from the calculation of translational and rotational velocity. This significantly reduces the reliability of the trajectory tracking control of the WMR [4]. To overcome this problem, the dynamics of the WMR must be taken into consideration. Parameters such as mass, the center of gravity and moments of inertia are added into the model matrices to calibrate the motion of the WMR.
Researchers have proposed several controller types, for example fuzzy control, neural networks and adaptive control. Among them, sliding mode control (SMC) has shown great promise in minimizing uncertainties, reducing the tracking error and giving a fast response [5, 6]. In [7], positive results were achieved in both tracking control and regulation tasks when SMC was used in a WMR system. However, that system strictly requires the moving force of the robot to be determined as one of the inputs. Although achievable, this would be impractical, mainly because of its complexity and cost.
Introducing the Lyapunov stability equation further enhances the robustness of the system, as done by [7]. Following the example of [8], other researchers have implemented Lyapunov-based controllers for their robots. Dealing with patients in the rehabilitation field requires the utmost consideration of the patient's safety.
This paper is organized as follows. In Sect. 2, trajectory tracking is introduced by developing the kinematics and dynamics of the differential drive wheeled mobile robot, and the SMC is then briefly explained in a subsection. Section 3 presents the results and discussion of the proposed controller for the DDWMR. Finally, Sect. 4 gives the conclusion.
2 System Description
The aim of this study is the design of a WMR capable of following a predefined path or trajectory. In trajectory tracking, the robot follows the path at a specified velocity using a precise controller, as studied by previous researchers. Trajectory tracking of the DDWMR with different controllers has been developed in recent years by [9–11]. Researchers have designed DDWMR controllers using backstepping, PID, sliding mode and many other methods.
In terms of stability analysis, a tracking control law developed using SMC is by far one of the best solutions [12]. SMC is insensitive to uncertainties, which makes it a reliable controller for the DDWMR, especially in rehabilitation. SMC is therefore a powerful option for a trajectory tracking controller in real applications. In this simulation, the required path can be defined by a series of points read from a data file, or it can be generated from an equation or set of equations.
For demonstration purposes, the latter approach was adopted. To synchronize the path points with the simulation time steps, a clock was introduced into the system. Imposing time constraints on the desired path yields a trajectory, as defined in earlier sections.
The main objective is to develop a control law that makes the DDWMR closely follow the reference trajectory. Figure 1 shows the block diagram of the proposed controller for this study. The first level of the control strategy obtains suitable torques to drive the left and right wheels of the robot. The DDWMR then produces a suitable velocity to track the reference trajectory.
2.1 Kinematic and Dynamics Model
The wheeled mobile robot (WMR) discussed in this paper is a differential drive robot platform. This WMR consists of two motorized wheels and one castor wheel that balances the robot. The left and right wheels are given specific input velocities so that the robot performs the desired trajectory. The free-moving castor wheel plays no role in driving or steering the robot. This type of robot is also known as a nonholonomic robot.
Fig. 1 Block diagram for DDWMR using SMC controller (blocks: trajectory, error calculation, SMC, transformation, plant, inverse kinematics; signals: x, y, φ; xe, ye, φe; S1, S2; τr, τl; v, ω)
Fig. 2 Kinematic model of differential drive WMR (global axes X, Y; robot position x, y; heading θ; velocity V)
The kinematics and the dynamics form the two control levels for this nonholonomic robot. The kinematics is used to obtain the velocity, which is then used by the robot dynamics to apply the desired torque to both the left and right wheels. The goal is to develop a control law that follows the desired trajectory of the robot. Figure 2 shows the kinematic behaviour of a differential drive WMR.
The differential drive robot has three degrees of freedom and moves in two dimensions through translational and rotational motion. The kinematic model can be written in the form below, where v and w are the linear and angular velocities of the WMR.

$$\dot{p} = J\,\dot{q}$$

$$\begin{bmatrix} \dot{x} \\ \dot{y} \\ \dot{\phi} \end{bmatrix} = \begin{bmatrix} \cos\phi & 0 \\ \sin\phi & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} v \\ w \end{bmatrix} \tag{1}$$
This WMR model is a driftless affine system with two vector fields. The vectors g1 and g2 are obtained from the Jacobian matrix of the kinematics: g1 generates the translational movement while g2 generates the rotational movement. Figure 3 shows this relationship.
$$g_1 = \begin{bmatrix} \cos\phi \\ \sin\phi \\ 0 \end{bmatrix}, \qquad g_2 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \tag{2}$$
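The kinematic model can be sketched numerically as follows. This is a minimal illustration using forward Euler integration; the function name `kinematics_step`, the step size `dt` and the input values are assumptions made for the example, not part of the original simulation.

```python
import math


def kinematics_step(pose, v, w, dt):
    """Advance pose = (x, y, phi) by one Euler step of the kinematic model."""
    _, _, phi = pose
    g1 = (math.cos(phi), math.sin(phi), 0.0)  # translational vector field
    g2 = (0.0, 0.0, 1.0)                      # rotational vector field
    # Pose rate is the combination g1 * v + g2 * w
    p_dot = tuple(a * v + b * w for a, b in zip(g1, g2))
    return tuple(p + dp * dt for p, dp in zip(pose, p_dot))


# Drive straight along the X axis for 1 s at 1 m/s (hypothetical inputs)
pose = (0.0, 0.0, 0.0)
for _ in range(100):
    pose = kinematics_step(pose, v=1.0, w=0.0, dt=0.01)
```

With zero angular velocity the heading stays constant and only the position along the heading direction changes, which is the translational motion generated by g1.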
Fig. 3 Translational and rotational movement
The dynamic model of the nonholonomic robot can be expressed in terms of the linear and angular velocities. The dynamic equations of the WMR ensure that the actual velocities match the desired velocities. The model is derived from the Lagrange dynamic equation and is given below:
$$M(q)\,\dot{\eta} + V(q,\dot{q})\,\eta = B(q)\,\tau \tag{3}$$
The equation can then be rearranged into the compact form:
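$$\dot{v} = \frac{1}{R}\left(\frac{\tau_R}{m + \frac{2I_w}{R^2}} + \frac{\tau_L}{m + \frac{2I_w}{R^2}}\right) + \frac{m_c d\,\omega^2}{m + \frac{2I_w}{R^2}} \tag{4}$$

$$\dot{\omega} = \frac{L}{R}\left(\frac{\tau_R}{I + \frac{2L^2}{R^2}I_w} - \frac{\tau_L}{I + \frac{2L^2}{R^2}I_w}\right) - \frac{m_c d\,\omega v}{I + \frac{2L^2}{R^2}I_w} \tag{5}$$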
Where,
mc = mass without wheel and actuators
mw = mass of each wheel with actuators
Ic = Inertia about vertical axis through center of mass
Iw = Inertia of each wheel with actuators about wheel axis
Im = Inertia of each wheel with actuators about wheel diameter
Equations (4) and (5) can be written in matrix form as,
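$$\begin{bmatrix} \dot{v} \\ \dot{\omega} \end{bmatrix} = \frac{1}{R}\begin{bmatrix} \dfrac{1}{m + \frac{2I_w}{R^2}} & \dfrac{1}{m + \frac{2I_w}{R^2}} \\ \dfrac{L}{I + \frac{2L^2}{R^2}I_w} & -\dfrac{L}{I + \frac{2L^2}{R^2}I_w} \end{bmatrix} \begin{bmatrix} \tau_R \\ \tau_L \end{bmatrix} + \begin{bmatrix} 0 & \dfrac{m_c d\,\omega}{m + \frac{2I_w}{R^2}} \\ -\dfrac{m_c d\,\omega}{I + \frac{2L^2}{R^2}I_w} & 0 \end{bmatrix} \begin{bmatrix} v \\ \omega \end{bmatrix} \tag{6}$$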
Table 1 shows the value of each parameter used in the dynamic equations.
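The compact dynamic model can be evaluated with a short function such as the sketch below. The signs follow the commonly used Lagrange formulation for differential drive robots and should be treated as an assumption, as should the function name `wheel_dynamics` and the value used for the total inertia I, which is not listed among the parameters of Table 1.

```python
def wheel_dynamics(v, omega, tau_r, tau_l, m, inertia, m_c, d, L, R, I_w):
    """Return (v_dot, omega_dot) for the compact dynamic model."""
    m_eff = m + 2.0 * I_w / R**2                # effective mass term
    i_eff = inertia + 2.0 * L**2 * I_w / R**2   # effective inertia term
    v_dot = (tau_r + tau_l) / (R * m_eff) + m_c * d * omega**2 / m_eff
    omega_dot = L * (tau_r - tau_l) / (R * i_eff) - m_c * d * omega * v / i_eff
    return v_dot, omega_dot


# m, m_c, d, L, R and I_w are taken from Table 1; `inertia` (total inertia I)
# is a placeholder assumption because the table does not list it.
params = dict(m=81.05522, inertia=5.0, m_c=80.4144, d=0.2, L=0.385, R=0.1, I_w=1.0)

# Equal wheel torques from rest (hypothetical 1 N m each)
v_dot, omega_dot = wheel_dynamics(0.0, 0.0, 1.0, 1.0, **params)
```

With equal torques on both wheels and the robot at rest, the angular acceleration is zero and only the linear acceleration is non-zero, which corresponds to straight-line motion.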
Table 1 DDWMR parameters and values
Parameters   Value      Unit
m            81.05522   kg
mc           80.4144    kg
mw           0.6378     kg
L            0.385      m
R            0.1        m
d            0.2        m
Ic           −0.1821    kg m²
Iw           1.0        kg m²
Im           1.0        kg m²
2.2 Sliding Mode Controller
The SMC generates a discontinuous control signal that drives the system repeatedly across the sliding surface until it finally converges to zero. A further issue once sliding motion is established is the chattering phenomenon, in which the switching prevents the states from staying on the sliding surface. This issue can be overcome by replacing the sign function (sgn) with the saturation function (sat), which smooths the control within the boundary layer and reduces the chattering effect at the same time.
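The difference between the two switching functions can be seen in a minimal sketch. The parameter `width`, standing for the boundary-layer thickness, is a hypothetical name and value chosen for illustration.

```python
def sgn(s):
    """Sign function: switches discontinuously at s = 0."""
    return (s > 0) - (s < 0)


def sat(s, width=1.0):
    """Saturation function: linear for |s| <= width, clipped to +/-1 outside.

    `width` is a hypothetical boundary-layer thickness.
    """
    return max(-1.0, min(1.0, s / width))
```

Close to the sliding surface, `sgn` jumps between −1 and +1 for arbitrarily small changes in `s`, whereas `sat` returns a value proportional to `s`, so the control effort changes gradually inside the boundary layer.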
The controller and its gains are used to drive the tracking errors to zero. When the errors are zero, the real trajectory follows the reference trajectory closely. Tracking errors appear once the real robot starts moving. The tracking errors expressed in the robot coordinates are given below:
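$$\begin{bmatrix} x_e \\ y_e \\ \phi_e \end{bmatrix} = \begin{bmatrix} \cos\phi_d & \sin\phi_d & 0 \\ -\sin\phi_d & \cos\phi_d & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x - x_d \\ y - y_d \\ \phi - \phi_d \end{bmatrix} \tag{7}$$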
Hence, the dynamic errors for trajectory tracking,
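$$\begin{aligned}
\dot{x}_e &= (\dot{x}-\dot{x}_d)\cos\phi_d + (\dot{y}-\dot{y}_d)\sin\phi_d - \dot{\phi}_d (x-x_d)\sin\phi_d + \dot{\phi}_d (y-y_d)\cos\phi_d \\
&= \dot{x}\cos\phi_d - \dot{x}_d\cos\phi_d + \dot{y}\sin\phi_d - \dot{y}_d\sin\phi_d - \dot{\phi}_d x\sin\phi_d + \dot{\phi}_d x_d\sin\phi_d + \dot{\phi}_d y\cos\phi_d - \dot{\phi}_d y_d\cos\phi_d \\
&= \dot{x}\cos\phi_d + \dot{y}\sin\phi_d + \dot{\phi}_d\left[-x\sin\phi_d + x_d\sin\phi_d + y\cos\phi_d - y_d\cos\phi_d\right] - \dot{x}_d\cos\phi_d - \dot{y}_d\sin\phi_d \\
&= \dot{x}\cos\phi_d + \dot{y}\sin\phi_d + \omega_d y_e - V_d \\
&= \dot{x}\cos(\phi-\phi_e) + \dot{y}\sin(\phi-\phi_e) + \omega_d y_e - V_d \\
&= \dot{x}(\cos\phi\cos\phi_e + \sin\phi\sin\phi_e) + \dot{y}(\sin\phi\cos\phi_e - \cos\phi\sin\phi_e) + \omega_d y_e - V_d \\
&= \cos\phi_e(\dot{x}\cos\phi + \dot{y}\sin\phi) + \sin\phi_e(\dot{x}\sin\phi - \dot{y}\cos\phi) + \omega_d y_e - V_d \\
&= V\cos\phi_e + \omega_d y_e - V_d
\end{aligned}$$

$$\begin{aligned}
\dot{y}_e &= -(\dot{x}-\dot{x}_d)\sin\phi_d + (\dot{y}-\dot{y}_d)\cos\phi_d - \dot{\phi}_d (x-x_d)\cos\phi_d - \dot{\phi}_d (y-y_d)\sin\phi_d \\
&= -\dot{x}\sin\phi_d + \dot{x}_d\sin\phi_d + \dot{y}\cos\phi_d - \dot{y}_d\cos\phi_d - \dot{\phi}_d x\cos\phi_d + \dot{\phi}_d x_d\cos\phi_d - \dot{\phi}_d y\sin\phi_d + \dot{\phi}_d y_d\sin\phi_d \\
&= -\dot{x}\sin\phi_d + \dot{y}\cos\phi_d + \dot{\phi}_d\left[-x\cos\phi_d + x_d\cos\phi_d - y\sin\phi_d + y_d\sin\phi_d\right] + \dot{x}_d\sin\phi_d - \dot{y}_d\cos\phi_d \\
&= -\dot{x}\sin\phi_d + \dot{y}\cos\phi_d - \omega_d x_e \\
&= -\dot{x}\sin(\phi-\phi_e) + \dot{y}\cos(\phi-\phi_e) - \omega_d x_e \\
&= -\dot{x}(\sin\phi\cos\phi_e - \cos\phi\sin\phi_e) + \dot{y}(\cos\phi\cos\phi_e + \sin\phi\sin\phi_e) - \omega_d x_e \\
&= -\cos\phi_e(\dot{x}\sin\phi - \dot{y}\cos\phi) + \sin\phi_e(\dot{x}\cos\phi + \dot{y}\sin\phi) - \omega_d x_e \\
&= V\sin\phi_e - \omega_d x_e
\end{aligned}$$

$$\dot{\phi}_e = \dot{\phi} - \dot{\phi}_d = \omega - \omega_d \tag{8}$$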
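The tracking errors and their rates can be computed as in the sketch below. It assumes that both the robot and the reference obey the nonholonomic (unicycle) kinematics, so that the sideways velocity components vanish; the function names `tracking_error` and `error_rates` are illustrative.

```python
import math


def tracking_error(pose, ref):
    """Pose error (xe, ye, phie) expressed in the reference frame, as in Eq. (7)."""
    x, y, phi = pose
    xd, yd, phid = ref
    c, s = math.cos(phid), math.sin(phid)
    xe = c * (x - xd) + s * (y - yd)
    ye = -s * (x - xd) + c * (y - yd)
    return xe, ye, phi - phid


def error_rates(err, V, w, Vd, wd):
    """Time derivatives of the tracking errors, as in Eq. (8).

    V, w: actual linear and angular velocity; Vd, wd: reference velocities.
    """
    xe, ye, phie = err
    xe_dot = V * math.cos(phie) + wd * ye - Vd
    ye_dot = V * math.sin(phie) - wd * xe
    phie_dot = w - wd
    return xe_dot, ye_dot, phie_dot
```

A quick way to check such an implementation is to advance both the robot and the reference by a small time step and compare the finite-difference change of `tracking_error` with the output of `error_rates`.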
The SMC is designed so that the actual velocities follow the desired velocities of the WMR, which ensures that the reference trajectory is closely tracked. In S2, the lateral error y_e and the angular error φ_e are coupled so that they converge together. C0, C1 and C2 are positive constant parameters of the system. The sliding surfaces are therefore defined as:
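$$S_i = \begin{bmatrix} S_1 \\ S_2 \end{bmatrix}, \qquad \begin{aligned} S_1 &= \dot{x}_e + C_1 x_e \\ S_2 &= \dot{y}_e + C_2 y_e + C_0\,\mathrm{sgn}(y_e)\,\phi_e \end{aligned} \tag{9}$$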
Then, the sliding surface is differentiated into:
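$$\begin{aligned}
\dot{S}_1 &= \ddot{x}_e + C_1\dot{x}_e \\
&= \dot{\omega}_d y_e + \omega_d\dot{y}_e + \dot{V}\cos\phi_e - \dot{\phi}_e V\sin\phi_e - \dot{V}_d + C_1\omega_d y_e + C_1 V\cos\phi_e - C_1 V_d \\
\dot{S}_2 &= \ddot{y}_e + C_2\dot{y}_e + C_0\,\mathrm{sgn}(y_e)\,\dot{\phi}_e \\
&= \dot{V}\sin\phi_e + \dot{\phi}_e V\cos\phi_e - \dot{\omega}_d x_e - \omega_d\dot{x}_e + C_2 V\sin\phi_e - C_2\omega_d x_e + C_0\,\mathrm{sgn}(y_e)\,\dot{\phi}_e
\end{aligned} \tag{10}$$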
The proposed controller uses the reaching law of Gao and Hung [13], who showed that the reaching speed can be controlled by choosing a suitable reaching law. When the proportional rate P is used, the state is pushed towards the switching surface faster, and more so when the boundary layer gain Q is larger. Both P and Q must be greater than zero for the sliding surface to converge smoothly to zero. The general form of the law is given by:
$$\dot{S}_i = -Q_i\,\mathrm{sgn}(S_i) - P_i S_i, \qquad i = 1, 2 \tag{11}$$
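The behaviour of this reaching law can be explored with a small simulation. The sketch below integrates the law with forward Euler; the initial value `s0`, the step size `dt` and the number of steps are assumptions, while Q = 1.0 and P = 0.1 correspond to the values listed for the second surface in Table 2.

```python
def sgn(s):
    """Sign function used by the switching term."""
    return (s > 0) - (s < 0)


def simulate_reaching(s0, Q, P, dt=1e-3, steps=5000):
    """Euler-integrate S_dot = -Q*sgn(S) - P*S starting from S(0) = s0."""
    s = s0
    history = [s]
    for _ in range(steps):
        s += dt * (-Q * sgn(s) - P * s)
        history.append(s)
    return history


trace = simulate_reaching(s0=2.0, Q=1.0, P=0.1)
```

The surface value decreases until it reaches a small neighbourhood of zero and then switches back and forth within a band of roughly `Q * dt`. This residual switching is the discrete-time counterpart of chattering.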
Equating Eq. (11) with Eq. (10) gives:
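$$\begin{aligned}
-Q_1\,\mathrm{sgn}(S_1) - P_1 S_1 &= \dot{\omega}_d y_e + \omega_d\dot{y}_e + \dot{V}\cos\phi_e - \dot{\phi}_e V\sin\phi_e - \dot{V}_d + C_1\omega_d y_e + C_1 V\cos\phi_e - C_1 V_d \\
-Q_2\,\mathrm{sgn}(S_2) - P_2 S_2 &= \dot{V}\sin\phi_e + \dot{\phi}_e V\cos\phi_e - \dot{\omega}_d x_e - \omega_d\dot{x}_e + C_2 V\sin\phi_e - C_2\omega_d x_e + C_0\,\mathrm{sgn}(y_e)\,\dot{\phi}_e
\end{aligned} \tag{12}$$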
The following equations are obtained after rearranging Eq. (12):
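$$\begin{aligned}
\dot{V} &= \frac{1}{\cos\phi_e}\left[-\dot{\omega}_d y_e - \omega_d\dot{y}_e + \dot{\phi}_e V\sin\phi_e + \dot{V}_d - C_1\omega_d y_e - C_1 V\cos\phi_e + C_1 V_d - Q_1\,\mathrm{sgn}(S_1) - P_1 S_1\right] \\
\omega &= \frac{1}{V\cos\phi_e + C_0\,\mathrm{sgn}(y_e)}\left[-Q_2\,\mathrm{sgn}(S_2) - P_2 S_2 - \dot{V}\sin\phi_e + \dot{\omega}_d x_e + \omega_d\dot{x}_e - C_2 V\sin\phi_e + C_2\omega_d x_e\right] + \omega_d
\end{aligned} \tag{13}$$

The sign function is then replaced with the saturation function, which introduces a boundary layer. By doing so, the chattering issue can be reduced.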
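$$\begin{aligned}
\dot{V} &= \frac{1}{\cos\phi_e}\left[-\dot{\omega}_d y_e - \omega_d\dot{y}_e + \dot{\phi}_e V\sin\phi_e + \dot{V}_d - C_1\omega_d y_e - C_1 V\cos\phi_e + C_1 V_d - Q_1\,\mathrm{sat}(S_1) - P_1 S_1\right] \\
\omega &= \frac{1}{V\cos\phi_e + C_0\,\mathrm{sat}(y_e)}\left[-Q_2\,\mathrm{sat}(S_2) - P_2 S_2 - \dot{V}\sin\phi_e + \dot{\omega}_d x_e + \omega_d\dot{x}_e - C_2 V\sin\phi_e + C_2\omega_d x_e\right] + \omega_d
\end{aligned} \tag{14}$$

The control law obtained for the DDWMR assumes that no uncertainties are present; uncertainties are not considered in this paper. This is the nominal control law for SMC applied to the WMR. The control law is fed into the DDWMR so that it tracks the generated reference trajectory.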
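A minimal sketch of how the nominal control law with the saturation function might be evaluated at one time step is given below. The signs are those obtained by differentiating the sliding surfaces with the error dynamics, and using the previous angular velocity `w_prev` to evaluate the angular error rate is an assumption of this sketch, as are the function and argument names. The gains are the values listed in Table 2.

```python
import math


def sat(s, width=1.0):
    """Saturation function with a hypothetical boundary-layer width."""
    return max(-1.0, min(1.0, s / width))


def smc_control(err, V, w_prev, ref, gains):
    """Return (V_dot, w) from the nominal SMC law with saturation.

    err    = (xe, ye, phie) tracking errors
    V      = current linear velocity
    w_prev = angular velocity of the previous step (assumption: used for phie_dot)
    ref    = (Vd, wd, Vd_dot, wd_dot) reference velocities and their rates
    gains  = dict with keys C0, C1, C2, P1, P2, Q1, Q2
    The law is singular when cos(phie) or the denominator of w is zero.
    """
    xe, ye, phie = err
    Vd, wd, Vd_dot, wd_dot = ref
    g = gains
    # Error rates
    xe_dot = V * math.cos(phie) + wd * ye - Vd
    ye_dot = V * math.sin(phie) - wd * xe
    phie_dot = w_prev - wd
    # Sliding surfaces (sgn replaced by sat)
    S1 = xe_dot + g["C1"] * xe
    S2 = ye_dot + g["C2"] * ye + g["C0"] * sat(ye) * phie
    # Linear acceleration command; note -C1*xe_dot collects the three C1 terms
    V_dot = (-wd_dot * ye - wd * ye_dot + phie_dot * V * math.sin(phie)
             + Vd_dot - g["C1"] * xe_dot
             - g["Q1"] * sat(S1) - g["P1"] * S1) / math.cos(phie)
    # Angular velocity command; -C2*ye_dot collects the two C2 terms
    denom = V * math.cos(phie) + g["C0"] * sat(ye)
    w = (-g["Q2"] * sat(S2) - g["P2"] * S2 - V_dot * math.sin(phie)
         + wd_dot * xe + wd * xe_dot - g["C2"] * ye_dot) / denom + wd
    return V_dot, w


gains = dict(C0=0.4, C1=0.5, C2=0.1, P1=0.003, P2=0.1, Q1=100.0, Q2=1.0)
# On the reference with matching velocities the command reduces to (0, wd)
V_dot, w = smc_control((0.0, 0.0, 0.0), V=1.0, w_prev=0.2,
                       ref=(1.0, 0.2, 0.0, 0.0), gains=gains)
```

When all tracking errors are zero and the robot already moves with the reference velocities, both sliding surfaces are zero and the law simply returns the reference angular velocity with no acceleration demand.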
2.3 Summary
This paper briefly discusses the DDWMR with SMC for rehabilitation applications. Patients who have lost the ability to walk need an assist-as-needed device to regain their normal walking behaviour. This can be achieved through frequent therapy sessions. A robot-assisted device can help patients regain their ability to walk much faster than therapist assistance alone. The Andago [14] is the closest reference for this research.
Both the kinematics and the dynamics of the DDWMR are used in this simulation. Trajectory tracking for the DDWMR is formulated for a mobile robot that moves along a desired path with a specified velocity. The kinematics describes the motion of the mobile robot, while the dynamics ensures that the physical parameters of the mobile robot are taken into account.
SMC is the controller used in this research. The control law obtained from the SMC is used by the DDWMR to follow the reference trajectory. A sliding surface equal to zero shows that the controller can follow the input given to it. This indicates that SMC is an efficient controller for the system, and it should be able to track the reference trajectory well.
3 Results and Discussions
Modelling the behaviour of a WMR on kinematics alone is likely to lead to errors, particularly as mass and velocity increase. To avoid slippage, dynamic forces must be considered. For the WMR model used in this study, dynamic constraints are imposed on the kinematic solutions in order to produce realistic results.
The model comprises both kinematic and dynamic aspects and incorporates salient components such as the SMC controller and the motors. It also takes the effects of tyre friction into consideration. With this stage completed, the behaviour of the wheeled robot in response to various assigned trajectories can be simulated. The advantage of software simulation is that trajectories and other physical parameters can be altered with ease in order to gauge the reaction of the robot.
The simulation starts with a trajectory that should be tracked by the differential drive robot. In Fig. 4, the blue line is the reference trajectory while the yellow line is the real trajectory tracked by the robot. The WMR was able to propel itself quickly from its starting point towards the pre-defined path located a certain distance away. Once the WMR has steered itself onto the path, it follows it faithfully until the end of the simulation period. The proposed control law is thus validated by the fact that the trajectory of the robot closely follows the reference trajectory (Table 2).
Fig. 4 Real and reference trajectory
Table 2 Controller parameters and values
Controller parameters   Value
Controller gain, C0     0.4
Controller gain, C1     0.5
Controller gain, C2     0.1
Reaching gain, P1       0.003
Reaching gain, P2       0.1
Boundary layer, Q1      100.0
Boundary layer, Q2      1.0
Several parameters are used in the simulation to achieve the results below. These parameters were chosen within the SMC parameter rules. The selected values of Q are used to eliminate the chattering effect that occurs in the SMC controller. The boundary layer must be thick enough to eliminate the chattering that occurs within the boundary.
The remaining results shown below can be used to validate the proposed control law. The sliding surface of the SMC should converge to zero for the robot to follow the reference trajectory. Errors occur when the robot starts to move, and if the controller is able to eliminate them, the robot can closely follow the reference trajectory. Referring to Fig. 4, the sliding surface successfully converges to zero because the error is eliminated.
Figure 5 shows results that are very similar to the figure above. The sliding surface reaches zero. The WMR follows the trajectory closely because the error is eliminated when the sliding surface reaches zero. When the switching function is introduced with a boundary layer, the system can reach zero much faster. Both figures show that the sliding surface converges to zero in a short period of time (Fig. 6).
Fig. 5 Sliding surface, S1
Fig. 6 Sliding surface, S2
Fig. 7 Walking speed for the proposed trajectory (plot of speed in m s−1 against time in s)
The results indicate that the dynamic algorithm slows down the simulated WMR when situations arise that could cause it to exceed friction limits. Nevertheless, it is responsive enough to speed up when required to match the reference trajectory. In general, the SMC model performs reasonably well in a simulated environment and demonstrates the feasibility of the idea.
It is important not only for a controller to be able to follow a prescribed trajectory, but it must be able to do it with a level of accuracy that is within acceptable
limits. In order to verify the tracking capabilities of the SMC, further simulation
runs must be conducted.
The trajectory tracking generated using SMC must also suit the patient's behaviour. The average normal gait speed is 1.34 m s−1 [15]. When dealing with patients who have difficulty walking, the speed of the gait-assist device must accommodate this condition. Figure 7 shows that the average speed achieved by the WMR is 1.25 m s−1, which is lower than the normal gait speed. This indicates that the controller is suitable both in terms of robustness and for rehabilitation purposes.
4 Conclusion
This paper discussed the application of the DDWMR with SMC in rehabilitation. SMC is a robust controller that copes well with the nonholonomic behaviour of the WMR. Hence, SMC is applied to this robotic device for gait-assist rehabilitation.
The proposed controller has been shown to handle the WMR trajectory effectively, eliminating the error and following the trajectory well. With the proposed SMC, the WMR follows the programmed desired trajectory as closely as it can. This is because the designed controller works well with this nonholonomic robot, which is subject to the rolling-without-slipping constraint. Both sliding surfaces eventually converge to zero, which drives the tracking errors to zero as well.
The simulation also showed that the controller can accommodate the condition of patients who face difficulties in walking. Patients may not be able to walk like a healthy person and may move at a lower speed in order to cope with their current situation.
Overall, the tracking performance has been evaluated in simulation. It can be concluded that the control law works well under rehabilitation conditions, specifically in gait training.
Acknowledgements The authors acknowledge support from the Advanced Mechatronic
Research (AdMire) Group.
References
1. Filipescu A et al (2011) Trajectory-tracking and discrete-time sliding-mode control of
wheeled mobile robots. In: 2011 IEEE international conference on information and
automation. IEEE
2. Nicolescu A-F, Ilie F-M, Alexandru T-G (2015) Forward and inverse kinematics study of
industrial robots taking into account constructive and functional parameter’s modeling. Proc
Manuf Syst 10(4):157
3. Chwa D (2004) Sliding-mode tracking control of nonholonomic wheeled mobile robots in
polar coordinates. IEEE Trans Control Syst Technol 12(4):637–644
4. Solea R, Nunes U (2007) Trajectory planning and sliding-mode control based
trajectory-tracking for cybercars. Integr Comput-Aided Eng 14(1):33–47
5. Asif M, Khan MJ, Cai N (2014) Adaptive sliding mode dynamic controller with integrator in
the loop for nonholonomic wheeled mobile robot trajectory tracking. Int J Control 87(5):964–
975
6. Tzafestas SG (2013) Introduction to Mobile Robot Control. Elsevier, Amsterdam
7. Belhocine M, Hamerlain M, Meraoubi F (2003) Variable structure control for a wheeled
mobile robot. Adv Robot 17(9):909–924
8. Yun X, Sarkar N (1998) Unified formulation of robotic systems with holonomic and
nonholonomic constraints. IEEE Trans Robot Autom 14(4):640–650
9. Xie D, Wang S, Wang Y (2018) Trajectory tracking control of differential drive mobile robot
based on improved kinematics controller algorithm. In: 2018 Chinese automation congress
(CAC). IEEE
10. Ibrahim AE-SB (2016) Wheeled mobile robot trajectory tracking using sliding mode control.
JCS 12(1):48–55
11. Wu H-M, Karkoub M (2019) Hierarchical fuzzy sliding-mode adaptive control for the
trajectory tracking of differential-driven mobile robots. Int J Fuzzy Syst 21(1):33–49
12. Solea R et al (2009) Sliding mode control for trajectory tracking of an intelligent wheelchair.
Ann Dunarea de Jos Univ Galati. Fascicle III Electrotech Electron Autom Control Inf
32(2):42–50
13. Gao W, Hung JC (1993) Variable structure control of nonlinear systems: a new approach.
IEEE Trans Industr Electron 40(1):45–55
14. Alias NA et al (2017) The efficacy of state of the art overground gait rehabilitation robotics: a
bird’s eye view. Procedia Comput Sci 105:365–370
15. Bohannon RW, Andrews AW (2011) Normal walking speed: a descriptive meta-analysis.
Physiotherapy 97(3):182–189
Adaptive Observer for DC Motor Fault
Detection Dynamical System
Janet Lee, Rosmiwati Mohd-Mokhtar,
and Muhammad Nasiruddin Mahyuddin
Abstract The increasing complexity of manufacturing systems increases the importance of fault detection and isolation. Fault detection is important to prevent system failure, which may affect productivity. This paper studies fault detection using an observer-based approach for a dynamical system. A direct current motor with an encoder is used to represent the dynamical system and the sensor. A linear observer and an adaptive observer are designed to detect the sensor fault. Two types of encoder fault are modelled in simulation using MATLAB Simulink. The results show that the linear observer estimates the states well but fails when a fault is present in the output signal. The adaptive observer is better at estimating the actual states of the system under additive faults but fails under gain faults. A comparative analysis was made to verify the efficacy of the observers in fault detection and estimation.
Keywords Fault detection · Adaptive observer · Sensor fault · Encoder fault
1 Introduction
Improvements in information technology led to the invention of the Internet, which in turn led to the fourth Industrial Revolution, known as Industry 4.0 [1]. This has driven the upgrading of manufacturing systems from traditional factories to smart factories, increasing both the complexity of the systems and the use of sensors [2]. Fault diagnosis techniques are becoming more important to ensure the safety of the systems as well as of human beings, including
J. Lee · R. Mohd-Mokhtar (✉) · M. N. Mahyuddin
School of Electrical and Electronic Engineering, Universiti Sains Malaysia, Engineering
Campus, 14300 Nibong Tebal, Pulau Pinang, Malaysia
e-mail: eerosmiwati@usm.my
J. Lee
e-mail: janetvenus@gmail.com
M. N. Mahyuddin
e-mail: nasiruddin@usm.my
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_20
J. Lee et al.
industrial workers and customers. Because automation relies heavily on
the control system, any fault in the system should be detected quickly to avoid
a hard failure of the system [3].
The feedback system of a control system implemented in the industry usually
relies on the information provided by sensors. Thus, a fault in the sensor may lead
to a loss in control of the system [3]. In a complex system that contains a lot of
subsystems in the system network, a small fault in a subsystem may affect the
stability of the whole network since the effect of the fault may be propagated to
other subsystems via the interconnections [4].
Various fault detection strategies have been researched, one of which uses
an observer. Observers can be used to estimate unknown states or
disturbances [5]. Their main purpose is to estimate the states of a system from its
known inputs and measured outputs. Hence they can be used in a fault detection system
to estimate the errors or faults in the sensors.
In this paper, a direct current (DC) motor is used to represent
the dynamical system, because the DC motor is one of the most commonly
used actuators in the manufacturing industry. In addition, DC motors are economical, easy to drive, and readily available in different sizes and shapes [6]. One of
the most commonly used sensors for measuring the speed and position of a DC motor is
the encoder. It gives the exact rotor speed or position of a DC motor in closed-loop
operation [3]. This research aims to design an observer to detect sensor faults in a
DC motor with an encoder and to design a model for encoder fault signals.
Overall, this paper is organized as follows. Section 2 reviews the related work
and Sect. 3 shows the research methodology. The results are discussed in Sect. 4
and conclusions in Sect. 5.
2 Related Works
Various fault diagnosis methods have been developed and proposed
in the last few decades. One of the most commonly used is the
observer-based approach. Observers play a key role in model-based fault diagnosis
for monitored systems or processes characterized by deterministic models [7]. The
basic idea of the observer-based approach is to construct observers that
estimate the states and to compare the estimates with the actual states, generating
residuals that are used to detect faults present in the system [8]. Observers were
first designed for linear systems; these are known as linear observers or Luenberger
observers. Luenberger observers are useful for linear systems and have been
applied in various applications, but they need to be modified before being applied
to the non-linear and uncertain systems that are more common in the real
world [9].
Besides linear observers, there is the adaptive observer, an algorithm that estimates
unmeasured states and unknown parameters simultaneously [10]. With some
modification to the Luenberger observer, adaptive observers can estimate the
states of a system in the presence of disturbances [11]. Another widely used observer
for fault detection in non-linear systems is the sliding mode observer, which
uses a non-linear high-gain feedback to bring the error dynamics to zero in
finite time. A sliding mode observer is usually implemented with a scaled switching
function, such as the signum of the error between the estimated output and the measured
output [12]. Its advantages are robustness to bounded disturbances and low sensitivity to parametric uncertainties [12].
Other than observer-based approaches, another model-based approach is the parity
relation approach, in which a parity vector is generated to check the consistency
between the model and the process output [7]. The stable factorization approach is
a frequency-domain fault diagnosis method. It generates a residual based on the
stable coprime factorization of the transfer function matrix of the monitored system
[7]. Both the parity relation approach and the stable factorization approach involve the
design of an observer [7]. Another fault diagnosis approach is the
non-linear geometric approach, which relies on a coordinate change in the state and
output spaces [13]. This approach requires an observable subsystem that
is affected by the fault but unaffected by disturbances and by the other faults to be
decoupled [13].
Data-driven fault detection uses the Takagi-Sugeno fuzzy model (T-S model) for
the dynamic modelling of a non-linear system [14]. This is called the kernel representation of the non-linear system. Generally, the main concept of the standard
fuzzy fault detection approach is to design the kernel representation based on
the model of the system with the aid of the fuzzy modelling technique [14]. The fault
tree analysis (FTA) approach is widely used to determine system dependability. In a
fault tree, the logical connections between faults and their causes are represented
graphically [15]. The analysis is deductive in nature: it starts with the
top event, or system failure, and works backward from the top of the fault tree to the
bottom leaves to find the root causes of the system failure [15].
In this paper, the adaptive observer is employed to detect faults in a DC
motor system. Its ability to estimate the states in the presence of
disturbances, and to estimate unmeasured states and unknown parameters
simultaneously, is the advantage of implementing this technique for DC motor fault
detection.
3 Observer Design
The transfer function and the state space model of the dc motor system can be
presented as (1) to (3).
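$$\frac{\theta(s)}{V(s)} = \frac{K_m}{s\left[(sJ_m + B_m)(R_a + sL_a) + K_m^2\right]} \tag{1}$$

$$\frac{d}{dt}\begin{bmatrix} i_a \\ \theta \\ \dot{\theta} \end{bmatrix} = \begin{bmatrix} -\frac{R_a}{L_a} & 0 & -\frac{K_m}{L_a} \\ 0 & 0 & 1 \\ \frac{K_m}{J_m} & 0 & -\frac{B_m}{J_m} \end{bmatrix} \begin{bmatrix} i_a \\ \theta \\ \dot{\theta} \end{bmatrix} + \begin{bmatrix} \frac{1}{L_a} \\ 0 \\ 0 \end{bmatrix} V \tag{2}$$

$$y = \begin{bmatrix} 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} i_a \\ \theta \\ \dot{\theta} \end{bmatrix} \tag{3}$$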
where $V$ is the source voltage, $\theta$ is the position, $R_a$ is the armature resistance, $L_a$ is
the electric inductance, $K_m$ is the motor constant, $J_m$ is the rotor moment of inertia, $B_m$
is the frictional coefficient, and $i_a$ is the armature current.
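As a minimal sketch (in Python rather than the MATLAB used by the authors), the matrices of (2) and (3) can be assembled from the motor parameters. The function name `dc_motor_state_space` is hypothetical, and the numerical values are those listed in Table 1.

```python
def dc_motor_state_space(Ra, La, Bm, Jm, Km):
    """Return (A, B, C) of Eqs. (2)-(3) for the state x = [ia, theta, theta_dot]."""
    A = [
        [-Ra / La, 0.0, -Km / La],   # armature circuit
        [0.0,      0.0, 1.0],        # d(theta)/dt = theta_dot
        [Km / Jm,  0.0, -Bm / Jm],   # rotor dynamics
    ]
    B = [1.0 / La, 0.0, 0.0]         # the source voltage drives the current equation
    C = [0.0, 1.0, 0.0]              # the encoder measures position only
    return A, B, C


# Parameter values taken from Table 1 of the paper.
A, B, C = dc_motor_state_space(Ra=1.0, La=1e-3, Bm=1e-4, Jm=5e-3, Km=0.1)
for row in A:
    print(row)
```

With these values the entries agree with the numerical matrices reported in the results section, which is a quick way to check the signs in (2).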
3.1
Luenberger Observer Design
Comparing the linear system in (4) and (5) with the state-space model
in (2) and (3) identifies the system matrices A, B and C.
$$\dot{x}(t) = Ax(t) + Bu(t) \tag{4}$$

$$y(t) = Cx(t) \tag{5}$$

where $A \in \mathbb{R}^{n \times n}$ is the system matrix, $B \in \mathbb{R}^{n \times r}$ is the input matrix, $u \in \mathbb{R}^{r}$ is the
control input that satisfies the Sufficiently Rich (SR) condition to guarantee the
Persistently Excited (PE) condition which is later to be defined, $y \in \mathbb{R}^{q}$ is the output
of the system and $C \in \mathbb{R}^{q \times n}$ is the corresponding output matrix. The observability
of the system can be determined by using the observability matrix $O$ in (6).
$$O = \begin{bmatrix} C \\ CA \\ CA^{2} \\ \vdots \\ CA^{n-1} \end{bmatrix} \tag{6}$$
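The rank test behind (6) can be sketched in a few lines of Python. This is an illustration only: the numerical A and C are the DC motor matrices reported later in (16) with the signs of model (2), and the helper names are hypothetical. For a single-output, third-order system the observability matrix is square, so a nonzero determinant is equivalent to full rank.

```python
# DC motor matrices (values as reported in the results section, signs as in Eq. (2)).
A = [[-1000.0, 0.0, -100.0],
     [0.0, 0.0, 1.0],
     [20.0, 0.0, -0.02]]
C = [0.0, 1.0, 0.0]


def row_times_matrix(row, M):
    """Multiply a row vector by a matrix."""
    return [sum(row[k] * M[k][j] for k in range(len(M))) for j in range(len(M[0]))]


def observability_matrix(A, C):
    """Stack C, CA, ..., CA^(n-1) for a single-output system."""
    rows = [list(C)]
    for _ in range(len(A) - 1):
        rows.append(row_times_matrix(rows[-1], A))
    return rows


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


O = observability_matrix(A, C)
is_observable = abs(det3(O)) > 1e-9
print(O, is_observable)
```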
The Luenberger observer is formulated as (7). The observer gain L can be
designed by using pole placement method.
$$\dot{\hat{x}}(t) = A\hat{x}(t) + Bu(t) + L\left(y - C\hat{x}(t)\right) \tag{7}$$

where $\hat{x} \in \mathbb{R}^{n}$ is the estimated state vector.
3.2
Adaptive Observer Design
Consider the linear system in (4) and (5). A fault $f(t)$ is added to the output equation
to represent the sensor fault, so the output becomes

$$y(t) = Cx(t) + f(t) \tag{8}$$

The fault signal is represented as a linear regression such that

$$f(t) = \psi(t)\rho \tag{9}$$

where $\psi(t) = [\psi_1(t), \ldots, \psi_p(t)] \in \mathbb{R}^{q \times p}$ are the regressors and
$\rho(t) = [\rho_1(t), \ldots, \rho_p(t)]^T \in \mathbb{R}^{p}$ are the unknown coefficients of the regressors. This model comes from
physical knowledge of the possible faults [11]. Let the signal $\psi(t)$ be filtered
through the filter

$$\dot{\Upsilon}(t) = [A - KC]\,\Upsilon(t) - K\psi(t) \tag{10}$$

$$\Omega(t) = C\Upsilon(t) + \psi(t) \tag{11}$$

$\Upsilon(t)$ and $\Omega(t)$ are the state and output of the filter, respectively. Assume that
$\psi(t)$ is persistently exciting, so that the filtered signal $\Omega(t)$ satisfies the following
inequality for all $t \ge t_0$ and for some positive constants $\alpha$ and $T$, where $I_p \in \mathbb{R}^{p \times p}$ is the $p \times p$
identity matrix [11].

$$\int_{t}^{t+T} \Omega^{T}(\tau)\,\Omega(\tau)\,d\tau \ge \alpha I_p \tag{12}$$
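Thus, the adaptive observer can be formulated as follows, where $\Gamma$ is a positive
definite gain matrix [11].

$$\dot{\Upsilon}(t) = [A - KC]\,\Upsilon(t) - K\psi(t) \tag{13}$$

$$\dot{\hat{x}}(t) = A\hat{x}(t) + Bu(t) + K\left[y(t) - C\hat{x}(t) - \psi(t)\hat{\rho}(t)\right] + \Upsilon(t)\dot{\hat{\rho}}(t) \tag{14}$$

$$\dot{\hat{\rho}}(t) = \Gamma\left[C\Upsilon(t) + \psi(t)\right]^{T}\left[y(t) - C\hat{x}(t) - \psi(t)\hat{\rho}(t)\right] \tag{15}$$

By considering the state-space model in (4) and (8), $K$ can be designed using the pole
placement method.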
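The structure of (13) to (15) can be illustrated with a forward-Euler sketch in Python. This is not the authors' Simulink model, and several choices are assumptions made only for illustration: the matrices and gain are the numerical values reported in the results section (with the first gain entry taken as negative), the regressor has two terms at 50 Hz instead of eight, the input is a constant 1 V, the fault coefficients `RHO_TRUE` are hypothetical, and the fault lies exactly in the span of the regressor.

```python
import math

A = [[-1000.0, 0.0, -100.0], [0.0, 0.0, 1.0], [20.0, 0.0, -0.02]]
B = [1000.0, 0.0, 0.0]
K = [-12419998.0, 199.98, 310496.0004]  # sign of the first entry is an assumption
GAMMA = 20.0                             # adaptation gain, Gamma = GAMMA * I
W = 100.0 * math.pi                      # 50 Hz regressor frequency (assumption)
RHO_TRUE = [0.5, -0.3]                   # hypothetical fault coefficients
P = len(RHO_TRUE)


def psi(t):
    """Two-term regressor; C = [0 1 0], so the output is scalar (q = 1)."""
    return [math.cos(W * t), math.sin(W * t)]


def simulate(t_end=3.0, dt=1e-4, u=1.0):
    x = [0.0] * 3                         # true state
    x_hat = [0.0] * 3                     # estimated state
    ups = [[0.0] * P for _ in range(3)]   # filter state Upsilon (3 x p)
    rho_hat = [0.0] * P                   # estimated fault coefficients
    for k in range(int(round(t_end / dt))):
        reg = psi(k * dt)
        y = x[1] + sum(reg[j] * RHO_TRUE[j] for j in range(P))            # Eq. (8)
        resid = y - x_hat[1] - sum(reg[j] * rho_hat[j] for j in range(P))
        omega = [ups[1][j] + reg[j] for j in range(P)]                     # Eq. (11)
        d_rho = [GAMMA * omega[j] * resid for j in range(P)]               # Eq. (15)
        d_ups = [[sum(A[i][m] * ups[m][j] for m in range(3)) - K[i] * omega[j]
                  for j in range(P)] for i in range(3)]                    # Eq. (13)
        d_xhat = [sum(A[i][m] * x_hat[m] for m in range(3)) + B[i] * u
                  + K[i] * resid
                  + sum(ups[i][j] * d_rho[j] for j in range(P))
                  for i in range(3)]                                       # Eq. (14)
        d_x = [sum(A[i][m] * x[m] for m in range(3)) + B[i] * u for i in range(3)]
        x = [x[i] + dt * d_x[i] for i in range(3)]
        x_hat = [x_hat[i] + dt * d_xhat[i] for i in range(3)]
        ups = [[ups[i][j] + dt * d_ups[i][j] for j in range(P)] for i in range(3)]
        rho_hat = [rho_hat[j] + dt * d_rho[j] for j in range(P)]
    return x, x_hat, rho_hat


x, x_hat, rho_hat = simulate()
print("estimated coefficients:", rho_hat)
print("position error:", x_hat[1] - x[1])
```

Note that $(A - KC)\Upsilon - K\psi$ is computed as $A\Upsilon - K\Omega$, which is the same expression rearranged. In this setting the coefficient estimates approach `RHO_TRUE` and the position estimate approaches the true position even though the measurement is corrupted.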
3.3
Encoder Fault Signal Modelling
Two types of fault are modelled. Among mechanical causes, loose mounting of the
encoder may result in a random error signal [3]. Therefore, one method of
modelling the encoder fault signal is to add a noise signal at the output of the plant.
This is easily done in Simulink by adding the Signal Generator block from
the library to generate a random signal, as discussed in the next section.
Next, among electronic causes, if one of the two channels of the quadrature encoder
malfunctions and delivers no signal, the number of counted edges is reduced to
half that of a healthy encoder [16]. The resulting output therefore becomes half of the
actual output. To represent this fault, the output of the state-space model of the DC
motor is multiplied by a gain of 0.5.
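The two fault models can be written as small functions. The sketch below is a Python stand-in for the Simulink blocks, with hypothetical function names; drawing a uniform random value on every call is an assumption and only approximates the random waveform of the Signal Generator block.

```python
import random


def random_error_fault(y, amplitude=1.0, rng=None):
    """Loose mounting: add a bounded random error to the encoder output."""
    rng = rng or random
    return y + rng.uniform(-amplitude, amplitude)


def gain_fault(y, gain=0.5):
    """One dead quadrature channel: only half of the edges are counted."""
    return gain * y


rng = random.Random(0)  # seeded generator so that runs are repeatable
print(random_error_fault(3.0, rng=rng), gain_fault(3.0))
```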
3.4
Simulations in MATLAB Simulink
MATLAB Simulink is used to construct the models of the dynamic system, the observers
and the encoder faults, and to simulate the results. The DC motor parameters in
Table 1 are used to model the whole system in the simulation. First,
the Luenberger observer is modelled. A MATLAB source file that calculates the
system matrices of the DC motor and the observer gain matrix is run. Then, the
block diagrams of the system and the observer are built in Simulink, as shown in
Fig. 1 for a healthy system. For the random error signal, the output of a Signal Generator
block is added to the x2 signal before it is fed to the observer. The block
is set to generate a random waveform with an amplitude of 1 and a frequency of
10 Hz. For the gain fault, the x2 signal passes through a gain of 0.5 before entering
the observer. Scope blocks are used to display the simulated signal for each
state and compare it with the actual signal.
Next, the adaptive observer is modelled following a procedure similar to that used
for the Luenberger observer. MATLAB code loads the workspace
with the appropriate parameters and calculates the gain matrix K. Then, the block diagram shown in Fig. 2 is built for the observer without faults. Faults are added
to the system in the same way as for the Luenberger observer. As this system is more
complex than the Luenberger observer, it is divided into four subsystems, three for
Table 1 DC motor parameters for simulation purposes
Parameters                    Values
Armature resistance, Ra       1 Ω
Electric inductance, La       1 × 10⁻³ H
Frictional coefficient, Bm    1 × 10⁻⁴ N m s
Moment of inertia, Jm         5 × 10⁻³ kg m²
Motor constant, Km            0.1 N m/A
Fig. 1 Luenberger observer
Fig. 2 Adaptive observer
each equation of (13), (14) and (15), and one that generates the regressor
$\psi(t)$, which is used to estimate the fault. The regressor is based on a Fourier series with four
frequency terms. A low-pass filter is added after the state estimate $\hat{x}$ is generated to get a better
result.
4 Results and Discussions
The observability check of the system is done and it shows that the system is
observable. The system matrices and the observer gain were calculated as follows.
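$$A = \begin{bmatrix} -1000 & 0 & -100 \\ 0 & 0 & 1 \\ 20 & 0 & -0.02 \end{bmatrix}, \quad B = \begin{bmatrix} 1000 \\ 0 \\ 0 \end{bmatrix}, \quad C = \begin{bmatrix} 0 & 1 & 0 \end{bmatrix} \tag{16}$$

$$K = L = \begin{bmatrix} -12419998 \\ 199.98 \\ 310496.0004 \end{bmatrix} \tag{17}$$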
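The paper does not list the observer poles that were placed. As a hedged cross-check, the sketch below computes the characteristic polynomial of $A - LC$ from the quoted values in Python. It assumes the signs of A from model (2) and a negative first gain entry, since minus signs do not survive in the extracted equations. The trial root $s = -200$ and the helper `char_poly_3x3` are part of this illustration, not of the source.

```python
import cmath

A = [[-1000.0, 0.0, -100.0], [0.0, 0.0, 1.0], [20.0, 0.0, -0.02]]
C = [0.0, 1.0, 0.0]
L = [-12419998.0, 199.98, 310496.0004]  # sign of the first entry is an assumption


def char_poly_3x3(M):
    """Coefficients [1, c2, c1, c0] of det(sI - M) for a 3x3 matrix M."""
    trace = M[0][0] + M[1][1] + M[2][2]
    minors = (M[0][0] * M[1][1] - M[0][1] * M[1][0]
              + M[0][0] * M[2][2] - M[0][2] * M[2][0]
              + M[1][1] * M[2][2] - M[1][2] * M[2][1])
    det = (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
           - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
           + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))
    return [1.0, -trace, minors, -det]


M = [[A[i][j] - L[i] * C[j] for j in range(3)] for i in range(3)]
coeffs = char_poly_3x3(M)

# Deflate by the trial real root s = -200 using synthetic division.
r = -200.0
b = coeffs[1] + r
c = coeffs[2] + r * b
remainder = coeffs[3] + r * c          # close to zero if r is a root
disc = cmath.sqrt(b * b - 4.0 * c)
poles = [r, (-b + disc) / 2.0, (-b - disc) / 2.0]
print(coeffs, remainder, poles)
```

Under these sign assumptions the polynomial factors cleanly, with one real pole and a well-damped complex pair in the left half-plane, which is consistent with a gain obtained by pole placement.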
For the Luenberger observer, the input signal of the simulation was
a square wave with an amplitude of −1 V and a frequency of 1 Hz. The simulation
results for the three states (current, position and speed) without fault are
shown in Fig. 3. Fig. 4 shows the results for the system with the random error
signal fault, and Fig. 5 shows the results for the system with the gain fault. The random
error signal is generated using a signal generator that produces a random signal with
an amplitude of 1 and a frequency of 10 Hz. From these results, we can see that the
Fig. 3 Actual and estimated states for Luenberger observer with no fault
Fig. 4 Actual and estimated states for Luenberger observer with random error signal fault
Fig. 5 Actual and estimated states for Luenberger observer with gain fault
Luenberger observer estimates the states very well when there is no
fault or noise. However, when the faulty signal is fed into the observer, the output
of the Luenberger observer is corrupted by the fault signal: the estimated
output signal follows the faulty signal exactly.
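This behaviour can be reproduced with a minimal forward-Euler sketch of observer (7) in Python. It is not the authors' Simulink model and rests on stated assumptions: the matrices and gain are the quoted numerical values with the first gain entry taken as negative, the input is a constant 1 V rather than the square wave, and `simulate` is a hypothetical helper.

```python
A = [[-1000.0, 0.0, -100.0], [0.0, 0.0, 1.0], [20.0, 0.0, -0.02]]
B = [1000.0, 0.0, 0.0]
L = [-12419998.0, 199.98, 310496.0004]  # sign of the first entry is an assumption


def simulate(sensor_gain, t_end=1.0, dt=1e-4, u=1.0):
    """Run plant and Luenberger observer; the encoder output is scaled by sensor_gain."""
    x = [0.0, 0.0, 0.0]       # true state [ia, theta, theta_dot]
    x_hat = [0.0, 0.0, 0.0]   # observer state
    for _ in range(int(round(t_end / dt))):
        y = sensor_gain * x[1]            # measured position (C = [0 1 0])
        innovation = y - x_hat[1]
        d_x = [sum(A[i][j] * x[j] for j in range(3)) + B[i] * u for i in range(3)]
        d_xhat = [sum(A[i][j] * x_hat[j] for j in range(3)) + B[i] * u
                  + L[i] * innovation for i in range(3)]
        x = [x[i] + dt * d_x[i] for i in range(3)]
        x_hat = [x_hat[i] + dt * d_xhat[i] for i in range(3)]
    return x, x_hat


x_ok, xh_ok = simulate(sensor_gain=1.0)   # healthy encoder
x_gf, xh_gf = simulate(sensor_gain=0.5)   # gain fault
print("healthy position error:", xh_ok[1] - x_ok[1])
print("gain fault: true", x_gf[1], "estimated", xh_gf[1], "measured", 0.5 * x_gf[1])
```

In the gain-fault run the estimated position stays close to the measured (halved) position rather than the true one, because the observer has no means of distinguishing a sensor fault from a genuine output.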
For the adaptive observer, the input signal is a square wave with an amplitude
of −1 V and a frequency of 0.5 Hz. The gain matrix $\Gamma$ is set to $20I_8$, and the fault
regression takes the form of a Fourier series with four frequency terms,
as follows.
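$$\begin{aligned} f(t) = \psi(t)\rho ={}& \rho_1 \cos 100\pi t + \rho_2 \cos 200\pi t + \rho_3 \cos 400\pi t + \rho_4 \cos 800\pi t \\ &+ \rho_5 \sin 100\pi t + \rho_6 \sin 200\pi t + \rho_7 \sin 400\pi t + \rho_8 \sin 800\pi t \end{aligned} \tag{18}$$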
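The regressor in (18) has eight entries, four cosines followed by four sines, at 50, 100, 200 and 400 Hz. A short Python sketch of it follows; the function names `regressor` and `fault_model` are hypothetical.

```python
import math

FREQS_HZ = [50.0, 100.0, 200.0, 400.0]  # 100*pi, 200*pi, 400*pi, 800*pi rad/s


def regressor(t):
    """Return the eight regressor entries of Eq. (18) at time t (seconds)."""
    cos_terms = [math.cos(2.0 * math.pi * f * t) for f in FREQS_HZ]
    sin_terms = [math.sin(2.0 * math.pi * f * t) for f in FREQS_HZ]
    return cos_terms + sin_terms


def fault_model(t, rho):
    """Evaluate f(t) as the product of the regressor row and the coefficient vector."""
    return sum(p * r for p, r in zip(regressor(t), rho))


print(regressor(0.0))
```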
The simulation results for the system with no fault are shown in Fig. 6.
Fig. 7 shows the results for the system with the random error signal fault and
Fig. 8 those for the system with the gain fault. For the system
with no fault, the estimated states follow the actual states with a slight delay caused by
the low-pass filter, which makes the estimation slower, as
mentioned in [11]. This result is acceptable and almost the same as that of the
Fig. 6 Actual and estimated states for adaptive observer with no fault
Fig. 7 Actual and estimated states for adaptive observer with random error signal fault
Fig. 8 Actual and estimated states for adaptive observer with gain fault
Luenberger observer. For the random error signal fault, however, the response was much
better than that of the Luenberger observer: the estimated states did not
follow the faulty signal exactly and instead tended toward the actual values. The results
for the gain fault were not good, as they showed the same problem as the
Luenberger observer (the estimated states followed the faulty states), and the
estimated states were delayed by the low-pass filter. This may be because the
adaptive law is not suitable for detecting a gain fault.
5 Conclusions
In this research, sensor fault detection was studied using a linear observer and
an adaptive observer, both of which were designed and
applied to fault detection. Before designing the observers, the DC motor system was
modelled and its observability was checked. For the
Luenberger observer, the pole placement method was used to design the observer gain
matrix. The adaptive observer was designed by modifying the Luenberger
observer to account for the fault in the system. By estimating the fault,
the adaptive observer can reduce or eliminate its effect and thus
estimate the actual states.
The encoder fault was studied and its effect on the output signal was investigated. Two types of encoder fault were modelled and applied in the simulations of
the fault detection system. Improper or loose mounting of the encoder, which may
lead to a random error signal fault, was modelled using a noise or random waveform
signal. The other encoder fault, a gain fault, was represented by a gain of
0.5 at the output of the motor system. In the simulations, the linear observer
estimated the states very well in the absence of a fault signal but failed to detect the
fault when one was present. The adaptive observer estimated the states
well both in the ideal system and in the system with the random error fault, but failed to detect the gain
fault. Based on this analysis, the adaptive observer needs to be modified to
overcome the issue of the gain fault. This will be the focus of the next research
investigation.
Acknowledgements The authors would like to thank Universiti Sains Malaysia for providing
space and software tool in conducting the research. This research is also partially supported by the
USM RUI Grant: 1001/PELECT/8014093.
References
1. Tjahjono B, Esplugues C, Ares E, Pelaez G (2017) What does Industry 4.0 mean to supply
chain? Procedia Manuf 13:1175–1182
2. Wang S, Wan J, Li D, Zhang C (2016) Implementing smart factory of Industry 4.0: an
outlook. Int J Distrib Sensor Netw 12(1):3159805
3. Bourogaoui M, Jlassi I, El Khil SK, Sethom HBA (2015) An effective encoder fault detection
in PMSM drives at different speed ranges. In: 2015 IEEE 10th international symposium on
diagnostics for electrical machines, power electronics and drives (SDEMPED), Guarda,
pp 90–96
4. Zhu J, Yang G (2018) Robust distributed fault estimation for a network of dynamical systems.
IEEE Trans Control Netw Syst 5(1):14–22
5. Liu C et al (2017) A state-compensation extended state observer for model predictive control.
Euro J Control 36:1–9
6. Tun HM, Aung W (2014) Analysis of control system for A 24 V PM brushed DC motor fitted
with an encoder by supplying H-bridge converter. Bahria Univ J Inf Commun Technol 7
(1):54–67
7. Gao Z, Cecati C, Ding SX (2015) A survey of fault diagnosis and fault-tolerant
techniques-part I: fault diagnosis with model-based and signal-based approaches. IEEE
Trans Industr Electron 62(6):3757–3767
8. Bo L, Tao P, Lu S, Ze-zhou H, Chao Y (2016) Multi fault diagnosis of traction motor current
sensor based on state observer. In: 2016 Chinese control and decision conference (CCDC),
Yinchuan, pp 7058–7063
9. Zhang H, Wang J (2016) Adaptive sliding-mode observer design for a selective catalytic
reduction system of ground-vehicle diesel engines. IEEE/ASME Trans Mechatron 21
(4):2027–2038
10. Oliva-Fonseca P, Rueda-Escobedo JG, Moreno JA (2016) Fixed-time adaptive observer for
linear time-invariant systems. In: 2016 IEEE 55th conference on decision and control (CDC),
Las Vegas, NV, pp 1267–1272
11. Zhang Q (2005) An adaptive observer for sensor fault estimation in linear time varying
systems. IFAC Proc Vol 38(1):137–142
12. Xia J, Guo Y, Dai B, Zhang X (2017) Sensor fault diagnosis and system reconfiguration
approach for an electric traction PWM rectifier based on sliding mode observer. IEEE Trans
Ind Appl 53(5):4768–4778
13. Baldi P, Blanke M, Castaldi P, Mimmo N, Simani S (2018) Fault diagnosis for satellite
sensors and actuators using nonlinear geometric approach and adaptive observers. Int J
Robust Nonlinear Control 29:1–27
14. Li L, Ding SX, Yang Y, Peng K, Qiu J (2018) A fault detection approach for nonlinear
systems based on data-driven realizations of fuzzy kernel representations. IEEE Trans Fuzzy
Syst 26(4):1800–1812
15. Kabir S (2017) An overview of fault tree analysis and its application in model based
dependability analysis. Expert Syst Appl 77:114–135
16. Damdoum A, Berriri H, Slama-Belkhodja I (2012) Detection of faulty incremental encoder in
a DFIM-based variable speed pump-turbine unit. In: 2012 16th IEEE mediterranean
electrotechnical conference, Yasmine Hammamet, pp 1151–1154
Water Level Classification for Flood
Monitoring System Using Convolutional
Neural Network
J. L. Gan and W. Zailah
Abstract This project proposes a new water level classification model for
flood monitoring systems by integrating an Artificial Intelligence technology, the Convolutional Neural Network. Various image pre-processing and data
augmentation techniques were applied to expand the dataset from
one image to 300 images that imitate real images captured by a
camera. The images underwent transfer learning for weight initialization with
fine tuning, as well as training from scratch, in order to compare the results and determine
the most suitable optimizer, initial learning rate and batch size for this application.
The results show that with a pretrained AlexNet, the Adam optimizer, an initial
learning rate of 0.0001 and a batch size of 16, the validation accuracy
reaches 100% at the ninth epoch, with high stability and consistency in both
training and validation accuracies. In addition, when the model is tested with
15 new images, it obtains a full score for 14 images and the average testing
accuracy is as high as 99.72%. The model outperforms previous work
by other researchers. In conclusion, this project contributes to improving
the safety of the community by creating a trustworthy and robust water
level classification model that can detect the water level, analyze its risk and
display the information using a camera, which is safer, more durable and more suitable
for placement in flood-prone areas.
Keywords Convolutional Neural Network · Image classification · Flood monitoring system
J. L. Gan (✉)
Department of Mechanical Engineering, Faculty of Engineering, UCSI University,
Kuala Lumpur, Malaysia
e-mail: fransisling@gmail.com
W. Zailah
Department of Mechatronic Engineering, Faculty of Engineering, UCSI University,
Kuala Lumpur, Malaysia
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_21
1 Introduction
Floods have been a common problem for many countries across the globe for many
decades, and Malaysia is no exception. Located in
Southeast Asia, Malaysia is subject to the monsoon season from November to March
every year. The heavy downpours strongly affect lives in the states of
Kelantan, Terengganu, Pahang and Johor [1]. Flooding caused by overflowing
rivers, high tides and flash floods are the major types of flood in this country.
As floods are an inevitable disaster, many engineers and researchers have been
working on various projects, implementing structural and non-structural measures
to mitigate their negative impacts on society, the environment and the
economy [1, 2].
As Artificial Intelligence (AI) has gained much more interest among
researchers in recent years, machine learning, one of the applications of AI, is being
widely explored and implemented in various fields. The Convolutional Neural Network
(CNN) falls under the category of supervised learning, as the machine needs to
be taught in order to learn how to execute certain tasks. CNNs are often used for
image classification, object detection, visual saliency detection, and text
detection and recognition [3, 4]. Compared to other types of neural network, the
input data of a CNN is three-dimensional (3D), representing the width and height
of the image as well as the colour channels, which allows the machine to learn all
the features present in the image instead of sacrificing the colour channels and losing
their information [5]. CNNs have proven their high performance in image classification by winning the ImageNet Large Scale Visual Recognition Challenge
(ILSVRC) since 2012 with low error rates. Multiple researchers have come up with
different CNN architectures and won the contest with strong results [6],
including AlexNet [7], GoogLeNet [8] and ResNet [9].
Even so, CNNs have not been widely implemented in the environmental field.
Previous work integrating CNNs into disaster management systems has only used them
to detect the regions where flooding has occurred, rather than to monitor the rise of
the river water level and provide early warning, even though river overflow is one of the
major types of flood in Malaysia [5, 10]. Current flood monitoring systems also integrate various ground
sensors to obtain hydrological parameters, radar and satellites for flood mapping, and
unmanned aerial vehicles to monitor and observe the disaster area [11].
Malaysia has adopted both flood mapping and flood forecasting and warning systems to manage the disaster. Many stick gauges, rainfall gauges, river gauges and
water level sensors are used to collect data and monitor the situation in selected
areas [12, 13]. However, physical sensors are rather vulnerable during a disaster.
Flood water during harsh weather is so destructive that it can ruin the structures
to which the sensors are attached [1]. Water sensors are very expensive and are subject
to a high risk of damage as well [14]. In some cases, the physical sensors may stop
working at the start of the flood. Unfortunately, such a sensor cannot be moved away from the
disaster area: the water level sensor used is uniaxial, so it must be placed right above
the water to measure the distance between the water and the sensor.
Therefore, this project proposes a new methodology that integrates
CNN technology into the flood monitoring system to monitor the river water level
by performing water risk level classification. Rather than physical sensors that
must be installed right above the water surface, a camera that can be
placed higher up and further away from the disaster region is preferred. The
camera also serves as a closed-circuit television that enables workers
to monitor the condition of the river, identify the causes of peculiar data
and even check for lives in the disaster area. The objectives of this project are to develop
the water level classification system model, implement the water level analysis
using a CNN, and evaluate the performance of the classification system by comparing it with
previous studies.
2 Methodology and Experimental Setup
The project focuses on modelling an active flood monitoring system that is able
to perform data collection, data analysis, data processing and decision-making, and to
release useful information to the target audience [11, 15]. The proposed system
consists of three layers, including a data observing layer, as shown in Fig. 1.
The overall workflow, from obtaining and creating the dataset to the final
implementation of the system on hardware for real-time testing, is shown in
Fig. 2.
2.1
Database Acquisition
Due to the limited database available from the Jabatan Pengairan dan Saliran
(JPS) Malaysia, the main image used in this project, shown in Fig. 3, was
obtained from an online resource because it shows the same type of stick gauge as that used near
Fig. 1 The proposed system layer with its components
Fig. 2 Overall workflow of the project
Fig. 3 The original image
used for this project
the light rapid transit around Masjid Jamek, Kuala Lumpur. It is
also assumed that the dataset is obtained in the daytime, right after rain, when the
visibility is high and the water current is relatively strong. The image in Sect. 3.4 is
assumed to be taken in the evening, after rain, when the visibility is low, using a
camera that has a lower resolution and needs maintenance.
Fig. 4 Images of low, medium and high risk levels created by using graphics editor
Data mining is performed in this project to create its own dataset, as actual
river images with different water levels could not be obtained from any official site.
Using professional graphics editor software, 20 images for each of the low, medium
and high risk levels are created from the image above, according to the water
level data provided by JPS for Sungai Kuantan Kajang, which defines 28.5 m as
the alert level (low risk level), 28.68 m as the warning level (medium risk level) and
29.10 m as the danger level (high risk level) [2].
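The three thresholds can be read as a simple mapping from water level to risk class. The Python sketch below is only an illustration of that reading: treating each threshold as an inclusive lower bound and returning "normal" below the alert level are assumptions, and `risk_level` is a hypothetical name. The class labels match the folder names used in this project.

```python
ALERT_M = 28.5     # alert level, low risk
WARNING_M = 28.68  # warning level, medium risk
DANGER_M = 29.10   # danger level, high risk


def risk_level(water_level_m):
    """Map a water level in metres to a risk class label."""
    if water_level_m >= DANGER_M:
        return "high"
    if water_level_m >= WARNING_M:
        return "med"
    if water_level_m >= ALERT_M:
        return "low"
    return "normal"  # hypothetical label for levels below the alert level


for level in (28.4, 28.5, 28.7, 29.10):
    print(level, risk_level(level))
```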
The images are also cropped into a square shape to match
the expected input of the CNN architecture used in this project, so that when they are
resized automatically by the machine algorithm they are not
stretched and do not deviate far from the real-life situation. Each image is then
labelled with “l”, “m” or “h” to indicate its water risk level, and the folders are
named “low”, “med” and “high” respectively. With five images from each
category set aside from the folders as the testing set, a total of 60 images are
saved in the stated folders for the subsequent model training and
validation. Samples of the resulting images are shown in Fig. 4.
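The reason for cropping before resizing can be made concrete with a little arithmetic. The Python sketch below assumes a centre crop (the paper does not say where the crop is taken), and both function names are hypothetical. It computes the crop box and the ratio between the horizontal and vertical scale factors of a resize to a square target: a ratio of 1 means no stretching.

```python
def center_square_box(width, height):
    """Return (left, top, right, bottom) of the largest centred square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


def aspect_distortion(width, height, target=227):
    """Ratio of horizontal to vertical scale when resizing to target x target."""
    return (target / width) / (target / height)


box = center_square_box(640, 480)
print(box)                              # crop box for a 640 x 480 image
print(aspect_distortion(640, 480))      # resized without cropping: stretched
print(aspect_distortion(480, 480))      # resized after cropping: ratio of 1
```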
2.2
Data Augmentation and Pre-processing
Data augmentation is often used in CNN image processing to increase the amount
of data and avoid overfitting [16]. It is also one of the techniques of CNN optimization. Data augmentation is applied to the testing and validation datasets so
that the system is able to classify new incoming images taken under various conditions [17]. In this project, rotation, translation and scaling are used. Furthermore,
in the pre-processing step, the images are resized by the machine algorithm to the input
size expected by the CNN architecture, which in this project is 227*227 pixels.
The resulting images from data augmentation are shown in Fig. 5.
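Rotation, translation and scaling are all affine transforms, so they can be illustrated with one small routine. The Python sketch below works on a tiny greyscale image stored as nested lists and uses nearest-neighbour sampling. It is a stand-in for the toolbox augmentation used in the project, and the function name and parameters are hypothetical.

```python
import math


def affine_warp(img, angle_deg=0.0, shift=(0, 0), scale=1.0, fill=0):
    """Rotate and scale about the image centre, then translate by shift = (dx, dy).

    Positive angles rotate clockwise on screen, since the row index grows downward.
    """
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    out = [[fill] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            # Inverse mapping: undo the translation, then the rotation and scaling.
            x = c - shift[0] - cx
            y = r - shift[1] - cy
            src_x = (cos_a * x + sin_a * y) / scale + cx
            src_y = (-sin_a * x + cos_a * y) / scale + cy
            sc = int(math.floor(src_x + 0.5))   # nearest source column
            sr = int(math.floor(src_y + 0.5))   # nearest source row
            if 0 <= sr < h and 0 <= sc < w:
                out[r][c] = img[sr][sc]
    return out


dot = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]       # single bright pixel at the centre
top = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]       # single bright pixel at the top
print(affine_warp(dot, shift=(1, 0)))          # pixel moves one column to the right
print(affine_warp(top, angle_deg=90.0))        # pixel moves to the right-hand edge
```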
Fig. 5 Samples of the resulted images from data augmentation
2.3
Details of CNN Workflow
AlexNet is chosen as the architecture for this project. With a depth of eight
layers, this architecture obtained an error rate as low as 15.4%, defeating all
other participants in the ImageNet Challenge in 2012 [7]. AlexNet is a series
network composed of five convolutional layers, three pooling layers and
three fully-connected layers. Two cross channel normalization layers, two dropout layers,
seven Rectified Linear Units (ReLU) and a Softmax classifier are used in the
architecture as well. Table 1 below shows the sequence of the layers along with the
settings of each layer, as given in the MATLAB deep learning toolbox [18].
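The stride, padding and filter-size settings determine how the 227*227 input shrinks through the network. The Python sketch below applies the usual output-size formula, floor((n + 2P - F) / S) + 1, to the convolution and pooling settings as printed in Table 1. The layer names and the helper `out_size` are hypothetical, and the settings are taken at face value from the table.

```python
def out_size(n, f, s, p):
    """Spatial output size for input size n, filter f, stride s and padding p."""
    return (n + 2 * p - f) // s + 1


# (name, filter size, stride, padding) as listed in Table 1.
LAYERS = [
    ("conv1", 11, 4, 2), ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2), ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1), ("conv4", 3, 1, 1), ("conv5", 3, 1, 1),
    ("pool5", 3, 2, 0),
]

size = 227
sizes = []
for name, f, s, p in LAYERS:
    size = out_size(size, f, s, p)
    sizes.append(size)
    print(name, size)

flattened = size * size * 256  # conv5 has K = 256 filters
print("inputs to the first fully-connected layer:", flattened)
```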
At this stage, the project aims to find the most suitable training hyperparameters and type of optimizer for this water level dataset through
fine-tuning, and to compare the accuracy of the models trained by different
techniques with each other and with results reported by other researchers. To find suitable
hyperparameters for this application, the chosen architecture is trained with two different techniques and their accuracies are compared: transfer
learning for weight initialization combined with fine tuning, and training from
scratch. The original pretrained CNN architecture is used for transfer learning, while
a reconstruction of the famous CNN architecture is used for training from scratch.
Transfer Learning: Pretrained Network for Weight Initialization and Fine
Tuning. In this technique, the pretrained network is first loaded into the workspace. Then the input dataset is resized to the expected input size of
the network, which is 227*227 pixels. Layer transfer needs to be performed because
Table 1 Details of AlexNet architecture
No.
Type
1
Image input
2
Convolution
3
ReLU
4
Cross channel normalization
5
Max pooling
6
Convolution
7
ReLU
8
Cross channel normalization
9
Max pooling
10
Convolution
11
ReLU
12
Convolution
13
ReLU
14
Convolution
15
ReLU
16
Max pooling
17
Fully-connected
18
ReLU
19
Dropout
20
Fully-connected
21
ReLU
22
Dropout
23
Fully-connected
24
Softmax
25
Classification output
* Note S = stride, P = padding, K = number
Settings
Zero center normalization
S = 4, P = 2, K = 96, F = 11*11*3
5 channels/element
S = 2, P = 0, F = 3*3
S = 1, P = 2, K = 256, F = 5*5*48
5 channels/element
S = 2, P = 0, F = 3*3
S = 1, P = 1, K = 384, F = 3*3*256
S = 1, P = 1, K = 384, F = 3*3*192
S = 1, P = [1 1 1 1], K = 256, F = 3*3*192
S = 2, P = [0 0 0 0], F = 3*3
50%
50%
Cross-entropy
of filters, F = filter size, W = weights, B = bias
the number of classes in this project is only three, instead of the original 1000. The
last fully connected layer, Softmax layer and classification layer of the architecture
are being replaced by a three-output fully connected layer, a new Softmax layer and
a new classification layer. There are two common ways to fine tune the model in
order to obtain the most suitable hyperparameters for the dataset. As one being trial
and error and another is to analyze by observing the graph. In this stage, trial and
error is used as the second method requires more experience to be able to perform
well. Therefore, the pretrained network undergoes trial and error in order to obtain
the most suitable learning rate, batch size and type of optimizer for this application.
Then, the set of hyperparameters finalized in this stage are being used for the
second technique, which is training from scratch. The details are stated in the next
section. The training options used for this part are referring from the works done by
[19]. The best result generated by the fine-tuned model in this section is then being
selected to compare with the results generated in the other technique in order to
306
J. L. Gan and W. Zailah
finalize the type of technique, optimizer and values of hyperparameters for the
proposed flood monitoring system as they are proven to be the most suitable to be
implemented.
Train from Scratch. The AlexNet architecture is reconstructed from scratch.
The parameters in the filters are therefore initialized randomly from Gaussian
distributions, which is the main difference between the new and the pretrained
network. Deep Network Designer is first used to build the architecture, which is then
loaded into MATLAB. The dataset is again resized to the input size of the network, and the
network is trained with the finalized training hyperparameters obtained in the transfer
learning section. The final result is carried forward for comparison as well.
3 Results and Discussion
In this section, the pretrained AlexNet architecture is first studied to finalize
the hyperparameters to be used. The effects of the type of optimizer,
learning rate and batch size on the model’s accuracy and computational time are
also analyzed. The finalized hyperparameters are then applied to the new
AlexNet. To compare the performance of the pretrained and the new architecture, the training
graphs and the learned features of both architectures are studied. Finally, both architectures
are evaluated on the testing dataset and compared with previous studies to finalize the model
to be used for this system.
3.1 Transfer Learning: Pretrained Network for Weight Initialization and Finetuning
The hyperparameters for model training at this stage are shown
in Table 2. These values follow the work of [19].
Tables 3 and 4 show the results obtained with different optimizers for
different batch sizes and initial learning rates. As many researchers have proposed
different values for these hyperparameters, this stage aims to
investigate the relationships among them and to decide the most
desirable hyperparameters for this specific case study.
The initial learning rates investigated in this project are 0.001, 0.0001
and 0.00001. A hyphen “-” in the tables indicates that the particular
result is not valid, for one of two reasons: either the batch size is too low
and outside the range that the graphics processing unit can support, or the
validation accuracy stays constant, which indicates that the optimizer is not able to
effectively update and optimize the weights in the model. These results are therefore
omitted.
Table 2 Hyperparameters set for weight initialization and finetuning
Hyperparameters | Settings
Type of optimizer | SGDM | RMSProp | Adam
Momentum | 0.9 | – | –
Max epochs | 12 | 12 | 12
Learning rate drop period (epochs) | 6 | 6 | 6
Weight decay | 0.0001 | 0.0001 | 0.0001
Gradient decay factor | – | – | 0.9
Squared gradient decay factor | – | 0.999 | 0.999
Shuffle | Every epoch | Every epoch | Every epoch
Validation frequency | 3 | 3 | 3
The results show that the SGDM optimizer works well over a wide range of
training options, while RMSProp only works when the initial learning rate is
0.00001 and Adam only works well with a low batch size and a low learning rate.
Among all the trials, SGDM obtains the highest average validation accuracy, 99.11%, with
an initial learning rate of 0.0001 and a batch size of 16, in 87.33 s. Adam obtains the
second highest average validation accuracy, 98.67%, with both 0.0001 and 0.00001 initial
learning rates and a batch size of 16, in 115.67 s and 110.67 s respectively. The
performance of RMSProp is slightly lower than that of the other two optimizers: its best
result is the third highest, with 98.22% average validation accuracy at a learning rate of
0.00001 and a batch size of 16, in 117.33 s. The results show that each optimizer behaves
differently at different settings. In general, however, all of them work better with a
lower batch size and a lower learning rate.
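The ranking quoted above can be reproduced by averaging the three trials of each setting. The following is a minimal sketch, not the authors' script: the trial values are copied from Table 3 for the best batch-size-16 setting of each optimizer, and the dictionary layout and function name are choices made here for illustration.

```python
from statistics import mean

# Validation accuracy (%) of the three trials, copied from Table 3.
# Keys are (optimizer, initial learning rate, batch size).
TRIALS = {
    ("SGDM", 1e-4, 16): [98.67, 100.00, 98.67],
    ("RMSProp", 1e-5, 16): [96.00, 98.67, 100.00],
    ("Adam", 1e-4, 16): [100.00, 98.67, 97.33],
    ("Adam", 1e-5, 16): [100.00, 98.67, 97.33],
}


def rank_settings(trials):
    """Average the trials of each setting and sort from best to worst."""
    averaged = {key: round(mean(values), 2) for key, values in trials.items()}
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for (optimizer, lr, batch), accuracy in rank_settings(TRIALS):
        print(f"{optimizer:8s} lr={lr:g} batch={batch}: {accuracy:.2f}%")
```

Ranking by average accuracy alone ignores computational time and training stability, which the discussion of Fig. 8 later uses to choose between the top settings.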
The results in Tables 3 and 4 are plotted as graphs of average accuracy
and average computational time against batch size in Figs. 6 and 7. Based
on Fig. 6, the accuracy of every optimizer decreases as the batch size increases.
In addition, when the SGDM optimizer is used, at the same batch size the
accuracy is higher for a higher initial learning rate and lower for a lower initial
learning rate, except at a batch size of 128 with a 0.0001
learning rate. It also shows that SGDM with a 0.0001 learning rate
and a batch size of 16 has the highest accuracy of all, while the same optimizer
with a batch size of 128 has the lowest accuracies in this simulation.
A large batch size might therefore have to be avoided when the available dataset is
small, as in this project.
Overall, Fig. 7 shows that the computational time is the shortest when SGDM is used.
RMSProp is the second fastest and Adam requires the most computational time. A possible
reason is that the update formula used by Adam is more complicated than those of SGDM and
RMSProp. It can also be observed that the slopes of the curves between batch sizes of 16
and 32 are much steeper for all the optimizers than at other points in the graph. For
SGDM, the slopes remain relatively similar across different learning rates.
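To make the remark about the relative complexity of the update formulas concrete, the three update rules can be written out for a single scalar weight. This is a sketch of the textbook forms, not necessarily the exact implementation in the toolbox used by the authors; the decay factors default to the values in Table 2, while the small constant eps and the function names are assumptions introduced here.

```python
import math


def sgdm_step(theta, grad, velocity, lr, momentum=0.9):
    """SGD with momentum: one running quantity (the velocity)."""
    velocity = momentum * velocity - lr * grad
    return theta + velocity, velocity


def rmsprop_step(theta, grad, sq_avg, lr, beta2=0.999, eps=1e-8):
    """RMSProp: one running average of the squared gradient."""
    sq_avg = beta2 * sq_avg + (1 - beta2) * grad ** 2
    return theta - lr * grad / (math.sqrt(sq_avg) + eps), sq_avg


def adam_step(theta, grad, m, v, t, lr, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: two running averages plus bias correction (t is the 1-based step)."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    return theta - lr * m_hat / (math.sqrt(v_hat) + eps), m, v


if __name__ == "__main__":
    theta, grad, lr = 1.0, 2.0, 1e-4
    print("SGDM   :", sgdm_step(theta, grad, 0.0, lr)[0])
    print("RMSProp:", rmsprop_step(theta, grad, 0.0, lr)[0])
    print("Adam   :", adam_step(theta, grad, 0.0, 0.0, 1, lr)[0])
```

Adam keeps two running averages per weight and applies a bias correction, whereas SGDM and RMSProp each keep one, which is consistent with the longer computational time observed for Adam.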
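Table 3 Validation results of accuracy obtained by comparing different optimizers with initial learning rate and batch size
Validation accuracy (%); columns 1, 2 and 3 are the trials and Avg is their average
Initial learning rate | Batch size | SGDM 1 | SGDM 2 | SGDM 3 | SGDM Avg | RMSProp 1 | RMSProp 2 | RMSProp 3 | RMSProp Avg | Adam 1 | Adam 2 | Adam 3 | Adam Avg
0.001 | 16 | – | – | – | – | – | – | – | – | – | – | – | –
0.001 | 32 | 98.67 | 98.67 | 96.00 | 97.78 | – | – | – | – | – | – | – | –
0.001 | 64 | 97.33 | 96.00 | 97.33 | 96.89 | – | – | – | – | – | – | – | –
0.001 | 128 | 88.00 | 86.67 | 86.67 | 87.11 | – | – | – | – | – | – | – | –
0.0001 | 16 | 98.67 | 100.00 | 98.67 | 99.11 | – | – | – | – | 100.00 | 98.67 | 97.33 | 98.67
0.0001 | 32 | 100.00 | 93.33 | 97.33 | 96.89 | – | – | – | – | 98.67 | 93.33 | 94.67 | 95.56
0.0001 | 64 | 96.00 | 88.00 | 94.67 | 92.89 | – | – | – | – | – | – | – | –
0.0001 | 128 | 90.67 | 85.33 | 89.33 | 88.44 | – | – | – | – | – | – | – | –
0.00001 | 16 | 96.00 | 96.00 | 97.33 | 96.44 | 96.00 | 98.67 | 100.00 | 98.22 | 100.00 | 98.67 | 97.33 | 98.67
0.00001 | 32 | 92.00 | 96.00 | 97.33 | 95.11 | 97.33 | 96.00 | 93.33 | 95.55 | 98.67 | 97.33 | 96.00 | 97.33
0.00001 | 64 | 92.00 | 94.67 | 90.67 | 92.45 | 93.33 | 93.33 | 96.00 | 94.22 | – | – | – | –
0.00001 | 128 | – | – | – | – | – | – | – | – | – | – | – | –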
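Table 4 Validation results of computational time obtained by comparing different optimizers with initial learning rate and batch size
Computational time (s); columns 1, 2 and 3 are the trials and Avg is their average
Initial learning rate | Batch size | SGDM 1 | SGDM 2 | SGDM 3 | SGDM Avg | RMSProp 1 | RMSProp 2 | RMSProp 3 | RMSProp Avg | Adam 1 | Adam 2 | Adam 3 | Adam Avg
0.001 | 16 | – | – | – | – | – | – | – | – | – | – | – | –
0.001 | 32 | 64 | 69 | 68 | 67.00 | – | – | – | – | – | – | – | –
0.001 | 64 | 53 | 54 | 56 | 54.33 | – | – | – | – | – | – | – | –
0.001 | 128 | 51 | 49 | 49 | 49.67 | – | – | – | – | – | – | – | –
0.0001 | 16 | 89 | 88 | 85 | 87.33 | – | – | – | – | 115 | 119 | 113 | 115.67
0.0001 | 32 | 63 | 63 | 62 | 62.67 | – | – | – | – | 107 | 107 | 106 | 109.67
0.0001 | 64 | 49 | 48 | 47 | 48.00 | – | – | – | – | – | – | – | –
0.0001 | 128 | 51 | 53 | 53 | 52.33 | – | – | – | – | – | – | – | –
0.00001 | 16 | 87 | 87 | 89 | 87.67 | 126 | 107 | 119 | 117.33 | 109 | 113 | 110 | 110.67
0.00001 | 32 | 62 | 61 | 61 | 61.33 | 74 | 73 | 85 | 77.33 | 105 | 109 | 103 | 104.67
0.00001 | 64 | 48 | 47 | 48 | 47.67 | 74 | 76 | 75 | 75.00 | – | – | – | –
0.00001 | 128 | – | – | – | – | – | – | – | – | – | – | – | –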
Fig. 6 Graph of average accuracy against batch size
Next, as the batch size increases, the computational time decreases in all cases
except for SGDM with a 0.0001 learning rate and a batch size of 128. The computational time
increases noticeably at that point (52.33 s in Table 4), even exceeding the
computational time needed by SGDM with the higher learning rate. The data at that point is
regarded as inaccurate, because a larger batch size is expected to lead to a lower
computational time, given that the optimizer requires fewer steps to pass over
the entire training set [20]. Therefore, that particular result is not considered
in the selection for the next stage.
Fig. 7 Graph of average computational time against batch size
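The expectation that a larger batch size needs fewer optimizer steps follows from simple counting: one epoch takes ceil(N / batch size) updates for N training images. The sketch below illustrates this; the training-set size of 225 images is a placeholder assumed here, since the split is not stated in this section.

```python
import math


def steps_per_epoch(num_images, batch_size):
    """Number of weight updates needed to pass over the training set once."""
    return math.ceil(num_images / batch_size)


# Assumed training-set size, for illustration only (not taken from the paper).
NUM_TRAINING_IMAGES = 225

if __name__ == "__main__":
    for batch_size in (16, 32, 64, 128):
        steps = steps_per_epoch(NUM_TRAINING_IMAGES, batch_size)
        print(f"batch size {batch_size:3d}: {steps} steps per epoch")
```

Fewer steps per epoch does not guarantee a proportionally shorter wall-clock time, because each step processes more images; the count only explains the general downward trend seen in Fig. 7.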
The graphs in Fig. 8 show the training progress of the top four results
(highlighted) in Table 3, which are SGDM with a 0.0001 learning rate,
RMSProp with a 0.00001 learning rate, and Adam with 0.0001 and 0.00001 learning
rates. The noise level in the training progress is the highest for
SGDM and the lowest for Adam. The figures show that SGDM and
Adam with a 0.00001 learning rate reach 100% validation accuracy at
the sixth epoch, while RMSProp and Adam with a 0.0001 learning rate do so at the ninth
epoch. Even though SGDM requires the least computational time for the
whole training and reaches full accuracy in the fewest epochs, its
training graph is still unsteady at termination. On
the other hand, even though Adam with a 0.0001 learning rate requires the longest
computational time for training and reaches 100% validation accuracy three
epochs later than SGDM, both its training graph and its validation graph stay
consistently at 100% accuracy from the ninth epoch onwards. Adam with a 0.00001
learning rate shows the smoothest training progress, but it is not selected,
also for reasons of consistency.
Fig. 8 From left to right, top to bottom: training progress of SGDM, RMSProp, Adam with 0.0001 and Adam with 0.00001 learning rates
Therefore, for the transfer learning technique, it is concluded that
the pretrained model performs most effectively and efficiently with the Adam
optimizer at a 0.0001 learning rate and a batch size of 16, because both its training
and validation results converge within an adequate time and show high
consistency and accuracy.
3.2 Training from Scratch
The finalized hyperparameters obtained in Sect. 3.1 are used to train
the new AlexNet. Figure 9 shows the training progress of the model. The
fluctuation at the beginning of training is higher than that seen in the
previous section. The fluctuation decreases until the eighth epoch,
which is one epoch after the learning rate drops to 0.00001. However, the training
graph (bright blue line) is still fluctuating at termination. Nonetheless,
the figure shows that, without pretraining, the model is still able to yield 100%
validation accuracy in 111 s. It does not experience underfitting or overfitting
throughout the training. The graph also shows that the model reaches full
accuracy at the eighth epoch, so the number of epochs used in this case can
be decreased. The resulting model is carried forward to the following sections,
which compare the features learned by the models trained with the two techniques as well as
their results on the testing dataset.
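The learning rate drop mentioned above can be sketched as a piecewise-constant schedule. The drop period of 6 epochs and the maximum of 12 epochs come from Table 2; the drop factor of 0.1 is inferred here from the stated change from 0.0001 to 0.00001 and is an assumption, as is the function name.

```python
def learning_rate(epoch, initial_lr=1e-4, drop_period=6, drop_factor=0.1):
    """Piecewise-constant schedule; epoch is counted from 1."""
    return initial_lr * drop_factor ** ((epoch - 1) // drop_period)


if __name__ == "__main__":
    for epoch in range(1, 13):
        print(f"epoch {epoch:2d}: learning rate {learning_rate(epoch):.0e}")
```

Under these assumptions the rate stays at 0.0001 for epochs 1 to 6 and at 0.00001 for epochs 7 to 12, which matches the observation that the drop takes effect one epoch before the eighth epoch.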
Fig. 9 Training progress for training from scratch
3.3 Extracted Features
Figure 10 shows the features extracted from different layers of the model obtained
from the pretrained AlexNet after transfer learning. From left to
right and from top to bottom, the images correspond to the five convolutional layers and the
subsequent three fully-connected layers. The last three images also indicate the features
learned for high, low and medium water levels respectively. The features from the
first convolutional layer are simpler than those of other layers because, according to
the working principle of a CNN, the features in each layer are learned from
the features of the previous layer. Deeper layers are therefore able to extract more
meaningful features. In addition, the features are complex because the pretrained
network has been trained on 1.2 million images from 1000 categories, so the newly
learned features of the stick gauge are not noticeable in the figure. Nevertheless, this
does not affect its performance in classifying water levels.
On the other hand, Fig. 11 shows the features extracted from the newly trained
AlexNet. Compared to the previous figure, the extracted features in this model
are less distinct because the model has not been pretrained on the huge database but
only trained with 300 images from three categories. The extracted features
might therefore not be as clear as those from the pretrained network. However, starting
from the third convolutional layer, the shape of the stick gauge and its numbers
become more noticeable than in the pretrained network.
Fig. 10 Features extracted from the pretrained AlexNet
The characteristics and the quality of the features extracted from the two models above
differ from each other. To verify the models’ performance, the
reserved testing dataset is used. The results are shown in the following section.
3.4 Performance on Testing Dataset
Fig. 11 Features extracted from the new AlexNet
The models selected in Sects. 3.1 and 3.2 are used to predict the labels of
the testing dataset. The results are tabulated in Table 5. The pretrained model
obtains 100% prediction accuracy on almost all the images across the different categories;
the exception is the second image of the high-risk category, shown in Fig. 12, for which the
prediction accuracy is 95.78%. When the new AlexNet is used,
the same image also fails to reach 100% prediction accuracy. The water level in this image
is 29.1 m, which is at the border of the medium-risk level. A check of the dataset shows
that the lowest water level in the high-risk
category is 29.12 m, while the highest water level in the medium-risk category is
29.06 m. The model has not been trained on sufficient data near this boundary, which leaves
it unable to fully distinguish between the two risk levels. More data is therefore
required to train the model so that it learns to classify over the full
range of water risk levels. Although the new AlexNet obtains a lower average
prediction accuracy in the testing results, its individual results are only slightly lower
than those of the pretrained network.
Table 5 Prediction accuracy on testing dataset
Prediction accuracy (%)
Type of network | Image category | Image 1 | Image 2 | Image 3 | Image 4 | Image 5 | Average
Pretrained AlexNet | High | 100 | 95.78 | 100 | 100 | 100 | 99.72
Pretrained AlexNet | Med | 100 | 100 | 100 | 100 | 100 |
Pretrained AlexNet | Low | 100 | 100 | 100 | 100 | 100 |
New AlexNet | High | 99.98 | 94.53 | 100 | 100 | 99.90 | 99.32
New AlexNet | Med | 99.81 | 99.96 | 98.48 | 97.86 | 99.86 |
New AlexNet | Low | 99.96 | 99.69 | 99.95 | 99.98 | 99.89 |
Fig. 12 Second image for high risk category
In addition, the testing is extended to obtain more information on the robustness of
the models by feeding them images of the testing dataset captured through a camera, as in the
proposed system in Sect. 2. In this experiment, the difference between the performance of the
two models widens. The pretrained AlexNet outperforms the new AlexNet by 27.21 percentage
points, as shown in Table 6. The results also show
that the overall prediction accuracy drops when a webcam is used because,
unlike the normal images used in the previous sections, the images obtained
through the webcam are blurrier and have lower intensity, as shown in Fig. 13.
Table 6 Prediction accuracy on testing dataset using webcam
Prediction accuracy (%)
Type of network | Image category | Image 1 | Image 2 | Image 3 | Image 4 | Image 5 | Average
Pretrained AlexNet | High | 99.97 | 64.01 | 100 | 100 | 93.08 | 96.98
Pretrained AlexNet | Low | 100 | 100 | 100 | 100 | 100 |
Pretrained AlexNet | Med | 99.46 | 98.16 | 100 | 100 | 99.95 |
New AlexNet | High | 24.16 | 83.49 | 87.95 | 38.38 | 43.66 | 69.77
New AlexNet | Low | 99.95 | 100 | 99.98 | 99.99 | 100 |
New AlexNet | Med | 100 | 0.07 | 68.52 | 99.95 | 0.45 |
Table 7 Comparisons of current results with previous works
Researcher | Architecture | Accuracy
Current results (transfer learning) | AlexNet | 99.72%
Current results (training from scratch) | AlexNet | 99.32%
Amit and Aoki [5] | AlexNet | 83%, 89%
Cirneanu and Popescu [21] | CNN | 95%
Fig. 13 Sample of image taken by a webcam
Lastly, the results are compared with previous works
related to flood monitoring or detection systems. Note that the results obtained with
the webcam are not considered in this final comparison, because
that test does not follow the normal testing procedure used by other
researchers. Among the works listed in Table 7, Amit and Aoki [5] trained a
network to detect disaster regions using aerial images, while Cirneanu and
Popescu [21] created a simple CNN architecture to classify flooded areas based
on the local binary pattern texture operator. Table 7 shows that the
results of the current project outperform the others.
4 Conclusion and Recommendation
Based on all the results obtained in this project, the final model chosen for the water
level classification system is the pretrained AlexNet model. This model achieves
high validation accuracy with the Adam optimizer at a 0.0001 learning rate and a batch
size of 16 during the training stage. Its training progress shows that the
model reaches 100% validation accuracy at the ninth epoch and that the result
remains stable and consistent to the end of training. Although its average testing
accuracy is only slightly higher than that of the new model, only one of its results
shows an imperfect score. The problem can therefore be addressed by training
on more data at that particular water level to increase the model’s ability to distinguish the
difference. Furthermore, the accuracies (F-scores) obtained in [5] are 89% and 83% for
two different flood locations, which are much lower than the results obtained
by the models in this project. In a nutshell, AlexNet with the Adam optimizer, an initial
learning rate of 0.0001 and a batch size of 16 is the most suitable choice for this
application.
The system can be further improved by applying heavier data augmentation to
create images with different brightness and clarity, imitating images of the water level
captured at different times and in different weather respectively. Next, in order to ease
rescue work, the system can also be trained to detect living organisms during floods.
A platform can also be built on this application to better distribute the
workforce of the different parties involved in the rescue work.
References
1. Zakaria SF, Zin RM, Mohamad I, Balubaid S, Mydin SH, MRD EMR (2017) The
development of flood map in Malaysia. In: 3rd International Conference on Construction and
Building Engineering (ICONBUILD) 2017. AIP Publishing, Malaysia, pp 1–8
2. Department of Irrigation and Drainage Malaysia Homepage. http://publicinfobanjir.water.gov.my. Accessed 28 Mar 2019
3. Gu JX, Wang ZH, Kuen J, Ma LY, Shahroudy A, Shuai B, Liu T, Wang XX, Wang G,
Cai JF, Chen TH (2018) Recent advances in convolutional neural networks. Pattern Recogn
77:354–377
4. Hadji I, Wildes RP (2018) What do we understand about convolutional networks? ArXiv,
Toronto
5. Amit SNK, Aoki Y (2017) Disaster detection from aerial imagery with convolutional neural
network. In: 2017 International Electronics Symposium on Knowledge Creation and
Intelligent Computing (IES-KCIC), pp 239–245
6. Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A,
Bernstein M, Berg AC, Fei-Fei L (2015) ImageNet large scale visual recognition challenge.
Int J Comput Vis 115(3):211–252
7. Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks, pp 1–9
8. Szegedy C, Liu W, Jia YQ, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V,
Rabinovich A (2014) Going deeper with convolutions, pp 1–12
9. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In:
The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 770–778
10. Ahamed A, Bolten JD (2017) A MODIS-based automated flood monitoring system for
southeast asia. Int J Appl Earth Obs Geoinf 61:104–117
11. Chen ZQ, Chen NC, Du WY, Gong JY (2018) An active monitoring method for flood events.
Comput Geosci 116:42–52
12. Chan NW (2012) Impacts of disasters and disasters risk management in Malaysia: the case of
floods. In: Economic and Welfare Impacts of Disasters in East Asia and Policy Responses,
pp 503–551
13. Shafiai S, Khalid MS (2016) Flood disaster management in Malaysia: a review of issues of
flood disaster relief during and post-disaster. In: International Soft Science Conference. Future
Academy, United Kingdom, pp 163–170
14. Subramaniam SK, Vigneswara RG, Subramonian S, Hamidon AH (2010) Flood level
indicator and risk warning system for remote location monitoring using Flood Observatory
System. WSEAS Trans Syst Control 5(3):153–163
15. Hannan MA, Zailah W (2012) Image extraction and data collection for solid waste bin
monitoring system. J Appl Sci Res 8(8):3908–3913
16. Krizhevsky A, Sutskever I, Hinton GE (2017) ImageNet classification with deep convolutional neural networks. Commun ACM 60(6):84–90
17. Yurtsever M, Yurtsever U (2019) Use of a convolutional neural network for the classification
of microbeads in urban wastewater. Chemosphere 216:271–280
18. Deep Learning Toolbox Model for AlexNet Network. https://www.mathworks.com/matlabcentral/fileexchange/59133-deep-learning-toolbox-model-for-alexnet-network. Accessed 15 Nov 2019
19. Mahbod A, Schaefer G, Ellinger I, Ecker R, Pitiot A, Wang C (2019) Fusing fine-tuned deep
features for skin lesion classification. Comput Med Imaging Graph 71:19–29
20. Goodfellow I, Bengio Y, Courville A (2016) Deep learning. The MIT Press, London
21. Cirneanu AL, Popescu D (2018) CNN based on LBP for evaluating natural disasters. In: 2018
15th International Conference on Control, Automation, Robotics and Vision (ICARCV).
IEEE, New Jersey, pp 568–573
Evaluation of Back-Side Slits with Sub-millimeter Resolution Using a Differential AMR Probe
M. A. H. P. Zaini, M. M. Saari, N. A. Nadzri, A. M. Halil,
A. J. S. Hanifah, and K. Tsukada
Abstract The electromagnetic method of Non-destructive Testing is one of the
approaches for crack detection on metallic samples. One of the techniques within the
electromagnetic method is Eddy Current Testing
(ECT), which utilizes the electromagnetic principle to detect cracks in metallic
components. In this research, an ECT probe made up of two AMR sensors,
two excitation coils and a set/reset circuit is developed. In addition, a digital lock-in
amplifier is developed using NI-LabVIEW and a data acquisition
(DAQ) card. A measurement system is developed that incorporates the ECT probe and the digital
lock-in amplifier as well as an amplifier circuit, a power supply, a PC and an XY
stage to which the probe is attached. Artificial slits with
different depths from 768 µm to 929 µm are then created on a galvanized steel plate
sample. The slits are evaluated from the back side of the galvanized steel plate via
two types of scanning, namely line scanning and full map scanning. From the
results of the line scan, the slits can be localized and their depths
estimated. Furthermore, a 2-D map of the sample is generated from the
back side. The 2-D map shows that the positions of the slits, as well as their depths,
can be estimated.
Keywords Non-destructive testing · NDT · Anisotropic magnetoresistance · AMR · Eddy Current Testing · ECT
M. A. H. P. Zaini (✉) · M. M. Saari · N. A. Nadzri
Faculty of Electrical & Electronics Engineering, Universiti Malaysia Pahang,
26600 Pekan, Pahang, Malaysia
e-mail: mek18006@stdmail.ump.edu.my
A. M. Halil · A. J. S. Hanifah
Faculty of Mechanical & Manufacturing Engineering, Universiti Malaysia Pahang,
Pekan Campus, 26600 Pekan, Pahang, Malaysia
K. Tsukada
Graduate School of Interdisciplinary Science and Engineering in Health Systems,
Okayama University, Okayama 700-8530, Japan
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_22
M. A. H. P. Zaini et al.
1 Introduction
In order to identify and assess cracks in metallic components, the magnetic method
is considered one of the approaches in Non-destructive Testing (NDT). The
method is widely used in industry due to its low cost and straightforward
operation [1], and it is well suited to analyzing metallic components because they
are conductive and have strong magnetic characteristics. Furthermore,
the magnetic method is contactless and can provide
real-time inspection compared to other NDT techniques [2–4]. The electromagnetic method in
NDT has been extensively researched since its emergence,
owing to the growth of technology.
A number of NDT techniques utilize electromagnetic
principles to detect cracks in metallic components. One of them is
Eddy Current Testing (ECT), which is widely used in NDT for the
detection of cracks in metallic samples such as aluminum plates [5, 6]. ECT
is among the techniques that are being extensively researched,
especially due to the promising characteristics of eddy currents. In ECT, the lift-off
between the magnetic sensor and the metallic sample can heavily affect the eddy
current signal, thus altering the eddy current readings [7]. Therefore,
a compensation method is needed to minimize or entirely overcome this effect.
In this research, a low-frequency ECT technique is used to allow deeper propagation of the
electromagnetic waves and thus deeper
penetration of the eddy currents [8]. This is because the generated eddy currents are
strongly governed by the skin depth effect: at high frequency, the eddy currents
are concentrated near the surface, which limits their ability
to penetrate deeper.
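The dependence on frequency can be made concrete with the standard skin depth expression for a good conductor, delta = sqrt(2 / (omega * mu * sigma)). The sketch below evaluates it for a few excitation frequencies. The material constants are placeholders assumed here (a copper-like conductivity and a relative permeability of 1), not the properties of the galvanized steel sample used in this work.

```python
import math

MU_0 = 4e-7 * math.pi  # permeability of free space (H/m)


def skin_depth(frequency_hz, conductivity_s_per_m, relative_permeability=1.0):
    """Standard skin depth of a good conductor, in metres."""
    omega = 2 * math.pi * frequency_hz
    mu = MU_0 * relative_permeability
    return math.sqrt(2.0 / (omega * mu * conductivity_s_per_m))


if __name__ == "__main__":
    # Assumed, illustrative material values (not taken from the paper).
    sigma = 5.8e7
    for frequency in (50, 500, 5000):
        depth_mm = skin_depth(frequency, sigma) * 1e3
        print(f"{frequency:5d} Hz: skin depth about {depth_mm:.2f} mm")
```

Because the depth scales with the inverse square root of the frequency, reducing the excitation frequency by a factor of four doubles the depth at which the eddy current density has fallen to 1/e of its surface value.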
Furthermore, a small-sized ECT probe consisting of magnetic
sensors and excitation coils is developed in this research. Its small size
allows it to be used in the evaluation of small or complex cracks [9]. Several types of
magnetic sensor can be used, such as the induction coil [10], the
Anisotropic Magnetoresistive (AMR) sensor, the Tunnel Magnetoresistance
(TMR) sensor, the Giant Magnetoresistance (GMR) sensor and the Superconducting
Quantum Interference Device (SQUID). Among these, the SQUID is known to be
one of the most sensitive magnetic sensors [11]. However, it
must be operated with complex heat insulation structures for
cooling, which makes it difficult to build compactly [12, 13]. Therefore,
the AMR sensor is proposed in this research, as it is compact in size and
offers highly sensitive sensing. The small size of the AMR sensor is
advantageous for resolving the spatial distribution of eddy currents in conductive materials
at higher resolution [14, 15].
A plate made of galvanized steel is used as the primary sample in this
research. One side of the plate is engraved with four slits of different depths at
sub-millimeter resolution. The ECT probe to be developed aims to analyze those
slits from the back side. Finally, the developed ECT probe is used to investigate
the magnetic response characteristics of the artificial slits on the sample.
2 Experimental Setup
2.1 ECT Probe
ECT probes, in general, come in different forms that vary in type
and design. In this research, an ECT probe is developed to detect and
evaluate back-side slits on a sample. The probe is designed to be
small so that it is better suited to detecting
small cracks as well as cracks that form a complex pattern. In addition, the
small size of the probe can improve its crack detection
performance.
The ECT probe is fabricated with two magnetic sensors and
two excitation coils. The AMR sensor is chosen as the magnetic sensor for this
probe thanks to its highly sensitive detection and small size, with dimensions of
11 × 4 mm². An AMR sensor consists of four magneto-resistive (MR) elements.
The MR elements are arranged in a Wheatstone bridge connection, as shown in
Fig. 1, where each MR element is wound with a set/reset strap. When the AMR sensor is
exposed to a magnetic field, the resistance of the MR elements changes,
which in turn changes the potential difference at the node
between the two MR elements.
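The way a resistance change in the MR elements turns into a voltage can be illustrated with an ideal Wheatstone bridge. This is a simplified sketch under stated assumptions: the 5 V supply, the nominal element resistance of 1000 ohm and the 1 ohm change are illustrative values chosen here, and the assumption that opposite arms change in opposite directions is the idealized full-bridge case rather than a measured property of the sensor.

```python
def bridge_output(v_supply, r1, r2, r3, r4):
    """Differential output of an ideal Wheatstone bridge.

    r1 (supply side) and r2 (ground side) form one half-bridge,
    r3 (supply side) and r4 (ground side) form the other.
    """
    v_a = v_supply * r2 / (r1 + r2)
    v_b = v_supply * r4 / (r3 + r4)
    return v_a - v_b


if __name__ == "__main__":
    v_supply = 5.0      # assumed supply voltage (V)
    r_nominal = 1000.0  # assumed nominal element resistance (ohm)
    delta = 1.0         # assumed field-induced resistance change (ohm)

    balanced = bridge_output(v_supply, r_nominal, r_nominal, r_nominal, r_nominal)
    unbalanced = bridge_output(
        v_supply,
        r_nominal - delta, r_nominal + delta,
        r_nominal + delta, r_nominal - delta,
    )
    print(f"no field:   {balanced * 1e3:.3f} mV")
    print(f"with field: {unbalanced * 1e3:.3f} mV")
```

In this idealized case the output equals the supply voltage times delta / R, so the bridge is balanced without a field and responds linearly to small resistance changes.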
However, a drawback of the AMR sensor is worth mentioning:
whenever it is exposed to a strong magnetic field, the sensor
saturates and becomes less sensitive. Therefore, a set/reset circuit is
fabricated to supply high current pulses that help the sensor regain its
sensitivity. Two AMR sensors are used instead of one in order
to introduce a differential detection technique, which may help to reduce the
background noise. The two AMR sensors are placed with a baseline of 4 mm between them.
Fig. 1 The schematic diagram of the AMR sensor (HMC1001) that is used in this research alongside a set/reset circuit and an instrumentation amplifier (AD8249) connected to it
Fig. 2 The developed ECT system (labelled blocks: amplifier circuit, DAQ, PC and power supply; the probe sketch marks the induced eddy current and the magnetic field it induces)
The AMR sensors are connected to an amplifier circuit
made up of two instrumentation amplifiers (INA). Each INA is
connected to one AMR sensor, and the output of the sensor is amplified with a
gain of 40 dB. By placing the AMR sensors between two excitation coils, the
stability of the sensors can be significantly enhanced. Each excitation coil is
wound with 100 turns of 0.65-mm magnet wire around a ferrite core with a
diameter of 6 mm and a height of 20 mm, as in Fig. 2.
2.2 Measurement System
A measurement system that incorporates the developed ECT probe is constructed
with a few other components: a power supply, an amplifier circuit, a data
acquisition (DAQ) card, an XY stage with a size of 55 cm × 45 cm, and a
personal computer (PC) for the analysis of the acquired data, as shown in Fig. 2.
The signal from the AMR sensors is pre-amplified by the amplifier circuit
before it is acquired by the DAQ card (NI-USB6212). The ECT probe is attached to
the XY stage. An XY-stage controller virtual instrument (VI) is then created in
NI-LabVIEW so that the XY stage can be controlled from the PC.
Furthermore, because this research needs an instrument that can extract a signal
from a noisy environment, a lock-in amplifier (LIA) is used. The LIA is crucial
because it is able to extract signal amplitudes and phases from a very noisy
environment. Therefore, the VI of a digital LIA is constructed in LabVIEW, as
shown in Fig. 3. Compared to an analog LIA, the digital LIA excels in terms of
size because it only requires a DAQ card for data acquisition, which may make the
measurement system simpler. Besides that, a VI that controls the power supply is
developed. This allows the measurement to be done automatically, thus reducing
the time taken for each measurement as well as minimizing human intervention.
Finally, a measurement system is produced by combining the XY-stage controller,
the developed digital LIA and the power supply controller.
Fig. 3 Block diagram of the digital LIA that is constructed in NI-LabVIEW VI
A 2-mm thick galvanized steel plate is used as the sample for this research. On
one surface of the sample, four artificial slits are fabricated with different depths
at sub-millimeter resolution, as shown in Fig. 4. First, line scans are conducted on
the sample as shown in Fig. 4, with a resolution of 1 mm. For the line scans,
sinusoidal currents with an amplitude of 300 mA and frequencies of 30 Hz, 70 Hz,
90 Hz, 110 Hz, 160 Hz, 210 Hz, 410 Hz, and 510 Hz are used to produce excitation
fields with the excitation coils. Then, the optimum frequency is determined from
the results of the line scans. After that, by using the optimum frequency, a full
map scanning is conducted for the back-side measurement to generate the 2-D
representation of the induced magnetic field of the induced eddy current.
Fig. 4 Scanning procedure of a line scan on the sample (slit depths: 929 µm, 849 µm, 817 µm and 768 µm)
3 Results and Discussions
3.1 Line Scan of the Back-Side Measurement
Compared to the magnetic field supplied by the excitation coil, the induced
magnetic field of the eddy current in the sample is delayed by 90°. The LIA output
for the differential sensors consists of two parts: the real part and the imaginary
part. The real part is the signal component that is in-phase with the reference
signal, while the imaginary part is the component that is out-of-phase with the
reference signal. For this research, the reference signal is set to be the signal
from sensor 1. Therefore, as the induced magnetic field of the eddy current is
delayed compared to the supplied magnetic field, the signal of the magnetic field
from the induced eddy current can be detected from the imaginary part of the
output of the lock-in amplifier.
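The role of the real and imaginary LIA outputs can be illustrated with a minimal digital lock-in sketch in Python. It is not the authors' LabVIEW VI: the function name `lock_in`, the sampling rate, the test signal, and the use of an ideal cosine reference at the excitation frequency (instead of the signal from sensor 1) are all assumptions made for illustration.

```python
import numpy as np

def lock_in(signal, fs, f_ref):
    """Return (real, imag) of `signal` at f_ref relative to a cosine reference.

    Assumes the record spans an integer number of reference periods, so that
    plain averaging can act as the low-pass filter.
    """
    t = np.arange(len(signal)) / fs
    ref_cos = np.cos(2 * np.pi * f_ref * t)
    ref_sin = np.sin(2 * np.pi * f_ref * t)
    real = 2.0 * np.mean(signal * ref_cos)    # in-phase component
    imag = -2.0 * np.mean(signal * ref_sin)   # out-of-phase (quadrature) component
    return real, imag

# Hypothetical test signal (not measured data): a 210 Hz component delayed by
# 90 degrees, buried under a DC offset and a much larger 50 Hz interference.
fs, f_exc = 10_000.0, 210.0
t = np.arange(int(fs)) / fs                   # 1 s record = 210 full periods
sig = (0.02 * np.cos(2 * np.pi * f_exc * t - np.pi / 2)
       + 0.5 * np.sin(2 * np.pi * 50.0 * t) + 0.1)

real, imag = lock_in(sig, fs, f_exc)
amplitude = np.hypot(real, imag)
phase_deg = np.degrees(np.arctan2(imag, real))
print(f"real={real:.5f} V, imag={imag:.5f} V, "
      f"amplitude={amplitude:.5f} V, phase={phase_deg:.1f} deg")
```

In this sketch the 90° delayed component ends up almost entirely in the imaginary output, which mirrors why the eddy current signal is read from the imaginary part.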
Figure 5 shows the raw waveforms of the induced magnetic fields of the eddy
current signals at the 849-µm slit. The slit is located at the position of 15 mm.
From the waveforms, the location of the slit can be identified as the middle of the
voltage transition from peak to trough; i.e., the position of the highest gradient
of the waveform with respect to the position of the probe. The pattern is similar
for every frequency. In terms of frequency, it can be observed that as the frequency
increases, the average magnitude of the waveform decreases. This could be due to
the skin depth effect, as the eddy current penetrates less deeply and is distributed
more on the surface as the frequency increases. Furthermore, the delta value of
voltage (ΔV), which is simply the difference between the peak and the trough, can
be calculated to characterize the sub-millimeter slits. For example, at the
frequency of 210 Hz, the delta value of voltage, ΔV210 Hz, is calculated as shown
in Fig. 5.
Fig. 5 The raw waveform signal of the 849 µm slit (voltage in V versus position in mm, for excitation frequencies from 30 Hz to 510 Hz)
Fig. 6 Delta values of voltage of the line scan back-side measurement as calculated from the raw waveform (ΔV versus slit depth in µm, for excitation frequencies from 30 Hz to 510 Hz)
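The two quantities described above, the peak-to-trough ΔV and the slit position at the steepest part of the waveform, can be sketched as follows. The function name `analyse_line_scan` and the synthetic waveform are assumptions for illustration only; the waveform is a made-up smooth curve on a 1 mm grid, not the measured data of Fig. 5.

```python
import numpy as np

def analyse_line_scan(position_mm, voltage):
    """Return (delta_v, slit_position) for one line-scan waveform.

    delta_v is the peak-to-trough voltage difference; the slit position is
    taken as the probe position where the waveform gradient is steepest.
    """
    position_mm = np.asarray(position_mm, dtype=float)
    voltage = np.asarray(voltage, dtype=float)
    delta_v = voltage.max() - voltage.min()
    gradient = np.gradient(voltage, position_mm)       # dV/dx by finite differences
    slit_position = position_mm[int(np.argmax(np.abs(gradient)))]
    return delta_v, slit_position

# Synthetic waveform (hypothetical): a peak followed by a trough, centred at 15 mm.
x = np.arange(0.0, 31.0, 1.0)                           # 0..30 mm, 1 mm steps
x0, width, slope = 15.0, 3.0, 0.004
v = -slope * (x - x0) * np.exp(-((x - x0) ** 2) / (2 * width ** 2))

delta_v, slit_position = analyse_line_scan(x, v)
print(f"delta V = {delta_v:.5f} V, estimated slit position = {slit_position:.0f} mm")
```

On real data the gradient estimate is sensitive to noise, so some smoothing before differentiation would probably be needed; that step is omitted here.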
Then, a graph of ΔV versus the depth is plotted as in Fig. 6. The graph shows a
correlation: as the depth increases, the ΔV also increases. Thus, this
characteristic can be used to estimate the depth of an unknown defect. However,
this does not hold for the ΔV at 410 Hz and 510 Hz, which appears to fluctuate.
Also, as the frequency increases, the overall ΔV increases. This, however, only
occurs in the frequency region from 30 Hz to 210 Hz. For the frequencies above
210 Hz, namely 410 Hz and 510 Hz, the overall ΔV starts to decrease. This is
suspected to be due to the skin depth effect, where the eddy current is distributed
near the surface and its distribution is not much affected by the presence of the
back-side slits. Therefore, the frequency dependency can also be utilized to
provide richer information on the slit depth. It is also worth noting that the
maximum percent error of this system can reach 15.41%, and the error variations
can be seen from the error bars in Fig. 6. Since the depth difference between the
817 µm and the 849 µm slits is quite small, approximately 32 µm, an overlap
between the error bars at both slits can be expected, which may cause their ΔV
values to be quite similar.
After that, a trendline is generated for the ΔV at each frequency. The gradient of
each trendline is calculated and plotted versus frequency, as in Fig. 7. The figure
clearly shows that as the frequency increases, the gradient of the trendline of the
ΔV of the line scan for the back-side measurement also increases. However, the
gradient starts to decrease after 210 Hz, which could be caused by the skin depth
effect. Because the gradient of the trendline is highest at 210 Hz, this frequency
can be regarded as the optimum frequency.
Fig. 7 Graph of trendline of the ΔV of the line scan for the back-side measurement versus frequency
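The selection rule described above (fit a trendline of ΔV versus depth for each frequency, then pick the frequency with the steepest trendline) can be sketched with a least-squares line fit. The slit depths are those of the sample; the ΔV values below are placeholders generated from made-up slopes, not the measured data of Fig. 6, and the function name is hypothetical.

```python
import numpy as np

depths_um = np.array([768.0, 817.0, 849.0, 929.0])   # slit depths of the sample

# Placeholder delta-V readings (V) for three frequencies (Hz). These are NOT
# measured values; they are exact straight lines with arbitrary slopes.
delta_v_by_freq = {
    30.0: 2e-5 * depths_um - 0.010,
    210.0: 1e-4 * depths_um - 0.060,
    510.0: 4e-5 * depths_um - 0.020,
}

def trendline_gradients(depths, dv_by_freq):
    """Fit dV = m * depth + c per frequency and return the gradients m."""
    return {freq: np.polyfit(depths, dv, 1)[0] for freq, dv in dv_by_freq.items()}

gradients = trendline_gradients(depths_um, delta_v_by_freq)
best_freq = max(gradients, key=gradients.get)
print(f"steepest trendline at {best_freq:.0f} Hz "
      f"(m = {gradients[best_freq]:.2e} V/um)")
```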
3.2 2-D Map of the Back-Side Measurement
Next, a full map scan for the back-side measurement is conducted to evaluate the
locality of the slits. As mentioned previously, 210 Hz is considered the optimum
frequency, so the full map scan is conducted at this frequency. As with the line
scan, the full map measurement uses a scanning resolution of 1 mm. From the full
map scan, a 2-D map of the sample is generated by using the contour function of
MATLAB. The result is shown in Fig. 8, which also compares the 2-D map with the
actual sample. In the 2-D map, each slit is located in the middle of the voltage
intensity change, which appears as the change from red intensity to blue
intensity; i.e., from minimum voltage to maximum voltage.
Moreover, the depth can also be estimated by observing the level of intensity in
both the blue and red regions. For the 768 µm slit, the intensities of the blue and
red regions are much lower than those for the 929 µm slit. Other than the
intensity change, the background signal is also lower on the left side than on the
right side. This may be caused by the magnetic field distribution inside the sample
itself. However, the background voltage differs greatly from the voltage near the
slits, so the intensity change at each slit can be clearly distinguished from the
background. From both the line scan and 2-D map measurements, it can be said that
the developed probe is able to resolve back-side slits with a resolution of up to
approximately 54 µm, showing its potential for early and sensitive back-side crack
assessment.
Fig. 8 2-D mapping of the sample from back-side measurement and comparison with the actual sample (color scale: voltage in V; slit depths: 929 µm, 849 µm, 817 µm and 768 µm)
4 Conclusions
An ECT probe with a differential AMR sensor configuration has been developed in
this research. The probe is able to detect the artificial slits created on a
galvanized steel plate sample from the back side, with a slit depth resolution of up
to approximately 54 µm. Two types of scanning are performed in this research: the
line scan and the full map scan. For the line scan, the location of the slit can be
estimated by observing the patterns in the line scan results. Furthermore, by
analyzing the results further, the depth of the slit can also be estimated. An
optimum frequency of 210 Hz is identified for detecting the artificial back-side
slits. By using the optimum frequency, a full map scan is conducted on the same
sample from the back side, and a 2-D map of the sample has been generated. The
location of each slit can be seen on the 2-D map at the transition from the minimum
to the maximum points of the acquired signal. By observing the intensity of the
blue and red colors on the 2-D map, the depth can be estimated.
Acknowledgements The authors would like to thank the Universiti Malaysia Pahang (grant no.
RDU1903100 and PGRS190321) for laboratory facilities and financial assistance.
References
1. Tsukada K, Kiwa T, Kawata T, Ishihara Y (2006) Low-frequency eddy current imaging using
mr sensor detecting tangential magnetic field components for nondestructive evaluation. IEEE
Trans Magn 42:3315–3317
2. Postolache O, Ribeiro AL, Ramos H (2009) Weld testing using eddy current probes and
image processing. In: 19th IMEKO World Congress 2009, pp 6–10
3. García-Martín J, Gómez-Gil J, Vázquez-Sánchez E (2011) Non-destructive techniques based
on eddy current testing. Sensors 11:2525–2565
4. Zaini MAHP, Saari MM, Nadzri NA, Mohd Halil A, Tsukada K (2019) An MFL probe using
shiftable magnetization angle for front and back side crack evaluation. In: Proceedings - 2019
IEEE 15th International Colloquium on Signal Processing and Its Applications, CSPA 2019,
pp 157–161
5. Sophian A, Tian G, Fan M (2017) Pulsed eddy current non-destructive testing and evaluation:
a review. Chin. J. Mech. Eng. 30:500–514
6. Ghanei S, Kashefi M, Mazinani M (2013) Eddy current nondestructive evaluation of dual
phase steel. Mater Des 50:491–496
7. Nadzri NA, Ishak M, Saari MM, Mohd Halil A (2019) Development of eddy current testing
system for welding inspection. In: Proceeding of the 2018 9th IEEE Control and System
Graduate Research Colloquium, ICSGRC 2018. IEEE, pp 94–98
8. He D, Shiwa M (2014) A magnetic sensor with amorphous wire. Sensors (Switzerland).
14:10644–10649
9. Tsukada K, Hayashi M, Nakamura Y, Sakai K, Kiwa T (2018) Small eddy current testing
sensor probe using a tunneling magnetoresistance sensor to detect cracks in steel structures.
IEEE Trans Magn 54:1–5
10. Saari MM, Zaini MAHP, Ahmad H, Che Lah NA (2019) An AC magnetometer using
automatic frequency switching of a resonant excitation coil for magnetic nanoparticles
characterization. In: Proceeding of the 2018 9th IEEE Control and System Graduate Research
Colloquium, ICSGRC 2018. IEEE, pp 207–210
11. Tumanski S (2007) Induction coil sensors—A review. Meas Sci Technol 18:R31–R46
12. Saari MM, Sakai K, Kiwa T, Sasayama T, Yoshida T, Tsukada K (2015) Characterization of
the magnetic moment distribution in low-concentration solutions of iron oxide nanoparticles
by a high-Tc superconducting quantum interference device magnetometer. J Appl Phys
117:17B321
13. Saari MM, Ishihara Y, Tsukamoto Y, Kusaka T, Morita K, Sakai K, Kiwa T, Tsukada K
(2015) Optimization of an AC/DC high-Tc SQUID magnetometer detection unit for
evaluation of magnetic nanoparticles in solution. IEEE Trans Appl Supercond 25:1–4
14. Jander A, Smith C, Schneider R (2005) Magnetoresistive sensors for nondestructive
evaluation (Invited Paper). In: Advanced Sensor Technologies for Nondestructive Evaluation
and Structural Health Monitoring, p 1
15. Tsukada K, Haga Y, Morita K, Song N, Sakai K, Kiwa T, Cheng W (2016) Detection of inner
corrosion of steel construction using magnetic resistance sensor and magnetic spectroscopy
analysis. IEEE Trans Magn 52:1–4
Model-Free Tuning of Laguerre
Network for Impedance Matching
in Bilateral Teleoperation System
Mohd Syakirin Ramli, Hamzah Ahmad, Addie Irawan,
and Nur Liyana Ibrahim
Abstract This paper addresses a tuning method to attain symmetry between the
master and slave manipulators of a bilateral teleoperation system. In the proposed
structure, an equalizer based on the Laguerre network is connected in a feedback
loop to the master manipulator. A set of input-output data is first generated and
recorded, and is later used in a two-step tuning procedure. A fictitious reference
signal is formulated based on these data. In addition, a metaheuristic optimization
algorithm, namely Particle Swarm Optimization, is employed to seek the optimal
controller parameters. Numerical analyses utilizing Matlab software have been
performed. The results show that the dynamics of the master manipulator with the
added controller are almost identical to the dynamics of the slave system. Hence,
it is verified that the proposed tuning technique is feasible for achieving symmetry
between both sides of the manipulators.
Keywords Fictitious signal · PID controller · Two-port networks · Velocity matching · Particle Swarm Optimization
1 Introduction
A teleoperator system comprises two robots: a master robot controlled by the
human operator, and a remote slave robot that tracks the motion of the master
while concurrently transmitting the environment's force back to the human
operator. The teleoperation system extends the human operator's capability to
conduct tasks remotely from a base station. Teleoperation systems have found wide
application in underwater exploration [1, 2], telesurgery [3], and the
military [4].
M. S. Ramli (✉) · H. Ahmad · A. Irawan · N. L. Ibrahim
Instrumentation and Control Engineering (ICE) Research Cluster, Faculty of Electrical and
Electronics Engineering Technology, Universiti Malaysia Pahang, 26600 Pekan, Malaysia
e-mail: syakirin@ump.edu.my
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_23
Various studies have been carried out on the four-channel architecture of
bilateral teleoperation systems. The works in [5, 6] discussed some of the earlier
ideas of the four-channel structure and emphasized that proper utilization of all
channels is crucial in achieving accurate transmission of the task impedance to the
operator. The work in [7] focused on designing a symmetric impedance matched with
position tracking. Meanwhile, the authors of [8] surveyed the implementation of
wave variable control in the four-channel structure of a bilateral teleoperation
system. On the other hand, the work in [9] considered the implementation of wave
variable control for the four-channel architecture in a multilateral framework. In
addition, our recent work in [10] investigated the potential of introducing a
controller connected in feedback to a single master manipulator, to attain a matched
impedance with the Locked-system derived from multiple slave manipulators formed by
a multi-agent system.
In this paper, we focus on obtaining a matched impedance between the master
and slave sides of a bilateral teleoperator system by using a model-free approach.
Assuming that the human and the remote task at the environment form the two sides
of the divide, a symmetry between both sides can be established by introducing a
feedback controller to the master system. For this purpose, a Laguerre network
structure is selected as the controller because of the orthonormal properties of
its filters, which simplify the tuning process to finding only the optimal values
of the basis of the filters. Here, the task of tuning the basis of the Laguerre
network is performed by employing the Fictitious-Reference-Iterative-Tuning (FRIT)
and Particle Swarm Optimization (PSO) algorithms. FRIT requires only a set of
input-output data acquired from a single-shot experiment for the tuning
process [11]. Hence, the mathematical modeling of the complex system, which is
normally needed in conventional controller design, can partly be eliminated
through the employment of FRIT.
PSO, on the other hand, is a metaheuristic optimization technique for finding
the optimal solution in a predefined search space. First introduced by Kennedy
and Eberhart [12] in 1995, the algorithm mimics the behavior of a school of fish
or a flock of birds in minimizing or maximizing a specified fitness function. Our
work focuses on implementing the algorithm to minimize a cost function formulated
from the fictitious signals using the recorded data.
The organization of this paper is as follows. In Sect. 2, we provide the problem
formulation, where an overview of the two-ports network and the basic teleoperation
structure is presented. In Sect. 3, we discuss our proposed algorithm to achieve
impedance matching. Next, in Sect. 4, a numerical example illustrating the
effectiveness of the proposed method is discussed. Finally, we conclude the
findings in Sect. 5.
Mathematical Preliminaries: We denote $\mathbb{R}$ and $\mathbb{R}^n$ as the set of real numbers and the set of real vectors with dimension $n$, respectively. Suppose $v \in \mathbb{R}^n$; then the vector norm is defined by $\|v\| := \sqrt{v^T v}$, where $T$ denotes transposition. Meanwhile, the notation $\|v(k)\|_K^2$ implies
$$\|v(k)\|_K^2 := \sum_{k=1}^{K} \|v(k)\|^2 = \|v(1)\|^2 + \|v(2)\|^2 + \cdots + \|v(K)\|^2. \tag{1}$$
Finally, we define $1_m = [1, \ldots, 1] \in \mathbb{R}^{1 \times m}$ as the $m$-dimensional row vector with all elements equal to 1.
2 Problem Formulation
2.1 Overview of the Two-Ports Network
The general model of a two-ports network in bilateral teleoperation is depicted in
Fig. 1. In the bilateral teleoperation mechanism, the operator's force on the master
$f_h$ is transmitted to the remote task through the teleoperation system $T$, and at
the same time the environment force $f_e$ is transmitted back to the operator.
Considering the master velocity $\dot{x}_m$ and the slave velocity $\dot{x}_s$,
perfect transparency is achieved if $f_h \equiv f_e$ for $\dot{x}_m = \dot{x}_s$.
The relation between the forces and motions in a bilateral teleoperation system can
be generalized in the hybrid matrix [13] of
$$\begin{bmatrix} f_h(s) \\ \dot{x}_m(s) \end{bmatrix} = \begin{bmatrix} h_{11}(s) & h_{12}(s) \\ h_{21}(s) & h_{22}(s) \end{bmatrix} \begin{bmatrix} \dot{x}_s(s) \\ -f_e(s) \end{bmatrix} \tag{2}$$
where $h_{ij}(s)$ is a SISO transfer function. From (2), it can be shown that
$$f_h = \underbrace{(h_{11} - h_{12} Z_e)(h_{21} - h_{22} Z_e)^{-1}}_{Z_T}\, \dot{x}_m. \tag{3}$$
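As a hedged sketch of the intermediate step behind (3): assuming the environment obeys $f_e = Z_e \dot{x}_s$ (the usual impedance definition, which is not written out in the text) and that the second entry of the right-hand vector in (2) carries a minus sign, the two rows of (2) can be expanded as follows.

```latex
\begin{aligned}
f_h       &= h_{11}\,\dot{x}_s - h_{12}\,f_e = (h_{11} - h_{12} Z_e)\,\dot{x}_s, \\
\dot{x}_m &= h_{21}\,\dot{x}_s - h_{22}\,f_e = (h_{21} - h_{22} Z_e)\,\dot{x}_s .
\end{aligned}
```

Solving the second row for $\dot{x}_s$ and substituting it into the first row gives (3), provided that $h_{21} - h_{22} Z_e$ is invertible.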
Fig. 1 General two-ports model of a bilateral teleoperation system [13]
To achieve perfect transparency such that the transmitted impedance $Z_T$ equals
the environment impedance $Z_e$, the necessary and sufficient conditions are
$h_{22} = 0$, $h_{21} Z_e = Z_e(-h_{12})$, and $h_{11} = 0$. Hence, for an ideal
case, perfect transparency for all frequencies implies
$$\begin{bmatrix} h_{11} & h_{12} \\ h_{21} & h_{22} \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}. \tag{4}$$
2.2 Basic Structure of a Teleoperation System
We modelled the motion of the master manipulator by a simple mass-damper-spring system given by
$$m_m \ddot{x}_m + d_m \dot{x}_m + k_m x_m = f_m + f_h \tag{5}$$
where $m_m$, $d_m$, $k_m$ are the mass, damping factor, and spring constant, respectively. Meanwhile, $f_m$ and $x_m$ are the master's exerted force and total displacement, respectively. In a similar form, the slave manipulator is governed by the equation of motion
$$m_s \ddot{x}_s + d_s \dot{x}_s + k_s x_s = f_s - f_e \tag{6}$$
where $m_s$, $d_s$, $k_s$ are the mass, damping factor, and spring constant. The signals $f_s$ and $x_s$ are the slave's exerted force and the total displacement of the manipulator.
Figure 2 illustrates the general structure of a four-channel bilateral teleoperation
system. The total impedances of the human and the environment are denoted by $Z_h$
and $Z_e$, respectively. Meanwhile, $Z_m$ and $Z_s$ are the impedances of the master
and slave manipulators. The local controllers for the master and slave manipulators
are denoted by $C_m$ and $C_s$. On the other hand, the controllers $C_1$ to $C_4$
dictate the communication link between the master and the slave sides. Zhu and
Salcudean [13] reported that perfect transparency can be achieved by properly
designing $C_1$ to $C_4$. For transparency under position control, a fully
transparent teleoperator system satisfies the condition given in Eq. (4) with the
selection of $C_1 = Z_s + C_s$, $C_2 = C_3 = 1$, and $C_4 = -(Z_m + C_m)$. However,
this control strategy requires acceleration measurement to implement $C_1$ and
$C_4$. To overcome this issue, the "intervenient impedance" was introduced to
eliminate the need for acceleration measurement [13]. With low-gain PD control for
$C_m$ and $C_s$, and with the selection of $C_1 = C_s$, $C_2 = C_3 = 1$,
$C_4 = -C_m$, nearly perfect transparency is achievable when the master impedance
matches the slave impedance such that $Z_m \approx Z_s$. However, in most cases
$Z_m \neq Z_s$. Hence, this paper discusses our proposed method to reach the similar
behavior of $Z_m \approx Z_s$.
Fig. 2 Four-channels structure proposed by Zhu and Salcudean [13]
2.3 Improvement to the Existing Structure
To improve the existing structure of the four-channel teleoperation system, Tsuji
et al. [14] introduced an additional equalizer or controller connected in feedback
to the master manipulator. By using the same local controller $C_m$ for both the
master and slave manipulators, the equalizer $F$ can be properly tuned so that there
exists symmetry between the impedances of the master and slave systems. The new
structure of the four-channel teleoperation system is depicted in Fig. 3. With this
implementation, the controllers $C_1$ to $C_4$ can be chosen as $C_1 = C_m$,
$C_2 = C_3 = 1$, and $C_4 = -C_m$. Now, the aim is to design an optimal controller
$F$ to achieve $Z_F := Z_m + F \approx Z_s$. In the next section, we present the
structure of the Laguerre network that forms the basic structure of $F$. The tuning
method, which combines the metaheuristic optimization algorithm with the generation
of the fictitious-reference signal, is also briefly discussed.
Remark 1: Even though the modeling of manipulators is presented in this paper, it
is not a necessity in implementing our proposed algorithm. It will be discussed
further in the next section to illustrate that only the recorded input-output data are
required in the process of tuning the controllers. Hence, this technique is totally a
model-free approach.
Fig. 3 Four-channel structure illustrating the additional equalizer F
3 Algorithm for Impedance Matching
3.1 Particle Swarm Optimization
PSO is an optimization method based on the metaphor of the social behavior of
flocks of birds or schools of fish. First introduced by Kennedy and Eberhart [12],
the algorithm starts by initializing a pool of particles (agents) with random
positions and velocities in a multi-dimensional space. Let
$p_i(k) \in \mathbb{R}^{1 \times D}$ and $q_i(k) \in \mathbb{R}^{1 \times D}$,
$i = 1, 2, \ldots, N$, denote the position and velocity of each agent $i$ in $D$
dimensions at iteration $k$. Let the fitness function value associated with the
position $p_i(k)$ be denoted by $\mathit{Fit} \in \mathbb{R}$. Each agent is assumed
to optimize the fitness function $\mathit{Fit}$ by evaluating its best position so
far ($\mathit{pbest}_i \in \mathbb{R}^{1 \times D}$) and its current position. The
velocity of each agent $i$ is updated based on the following equation
$$q_i(k+1) = \omega\, q_i(k) + g_1 r_1 \left(\mathit{pbest}_i - p_i(k)\right) + g_2 r_2 \left(\mathit{gbest} - p_i(k)\right) \tag{7}$$
where $\omega \in \mathbb{R}$ is the weighting function, $g_1, g_2 \in \mathbb{R}$
are the weighting factors, and $r_1, r_2 \in \mathbb{R}$ are the cognitive and social
learning parameters generated randomly between 0 and 1. Meanwhile,
$\mathit{pbest}_i$ is the pbest value of agent $i$, and
$\mathit{gbest} \in \mathbb{R}^{1 \times D}$ is the best value so far in the group
among the pbests of all agents. The following function is used to update the
weighting function $\omega$ in Eq. (7):
$$\omega = \omega_{max} - \frac{\omega_{max} - \omega_{min}}{\mathit{iter}_{max}}\, \mathit{iter} \tag{8}$$
where $\omega_{max}, \omega_{min} \in \mathbb{R}$ are the initial and final weights,
$\mathit{iter}_{max} \in \mathbb{R}$ is the maximum number of iterations, and
$\mathit{iter}$ is the current iteration number. Thus, based on the updated velocity
in (7), each agent $i$ updates its position such that
$$p_i(k+1) = p_i(k) + q_i(k+1). \tag{9}$$
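The update rules (7) to (9) can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not the authors' Matlab implementation: the function name `pso_minimize`, the initial velocity range, the clipping of positions to the search bounds, and the quadratic test function are all hypothetical choices.

```python
import numpy as np

def pso_minimize(fitness, p_min, p_max, n_agents=30, iter_max=150,
                 g1=1.4, g2=1.4, w_max=0.9, w_min=0.4, seed=0):
    """Minimise `fitness` over the box [p_min, p_max] with a basic PSO loop."""
    rng = np.random.default_rng(seed)
    p_min = np.asarray(p_min, dtype=float)
    p_max = np.asarray(p_max, dtype=float)
    dim = p_min.size

    # Random initial positions; small random initial velocities (assumption).
    p = rng.uniform(p_min, p_max, size=(n_agents, dim))
    q = 0.1 * (p_max - p_min) * rng.uniform(-1.0, 1.0, size=(n_agents, dim))

    pbest = p.copy()
    pbest_val = np.array([fitness(x) for x in p])
    best = int(np.argmin(pbest_val))
    gbest, gbest_val = pbest[best].copy(), float(pbest_val[best])

    for it in range(1, iter_max + 1):
        w = w_max - (w_max - w_min) / iter_max * it          # Eq. (8)
        r1 = rng.random((n_agents, 1))                        # one scalar per agent
        r2 = rng.random((n_agents, 1))
        q = w * q + g1 * r1 * (pbest - p) + g2 * r2 * (gbest - p)   # Eq. (7)
        p = np.clip(p + q, p_min, p_max)                      # Eq. (9) plus clipping (assumption)

        vals = np.array([fitness(x) for x in p])
        improved = vals < pbest_val
        pbest[improved] = p[improved]
        pbest_val[improved] = vals[improved]
        best = int(np.argmin(pbest_val))
        if pbest_val[best] < gbest_val:
            gbest, gbest_val = pbest[best].copy(), float(pbest_val[best])
    return gbest, gbest_val

# Hypothetical test problem: a quadratic bowl with its minimum at (0.3, -0.2).
target = np.array([0.3, -0.2])
best_p, best_val = pso_minimize(lambda x: float(np.sum((x - target) ** 2)),
                                p_min=[-1.0, -1.0], p_max=[1.0, 1.0])
print("best position:", best_p, "fitness:", best_val)
```

The default values $g_1 = g_2 = 1.4$, $\omega_{max} = 0.9$ and $\omega_{min} = 0.4$ are taken from the numerical example in Sect. 4; everything else is an illustrative default.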
At the end of the iterations, the agents shall all converge to the optimal position $p^*$, where
$$p^* := \arg\min_{p_i} \mathit{Fit}, \quad \forall i. \tag{10}$$
3.2 Equalizer $F(z)$ in the Form of a Laguerre Network
A discrete-time SISO system can be approximated by a series of Laguerre filters [15]
$$L_i(z) = \sqrt{(1 - a^2)\, t_s}\; \frac{(z^{-1} - a)^{i-1}}{(1 - a z^{-1})^{i}} \tag{11}$$
so as to form $y(z) = F(z)s(z) = \sum_{i=1}^{M} c_i L_i(z)\, s(z)$, as shown in Fig. 4. The parameter $a \in \mathbb{R}$ is the pole of the Laguerre network, with $0 \le a < 1$ for the stability of the network [16], and $t_s$ is the sampling time. The input and output signals of the network are denoted by $s(k) = \mathcal{Z}^{-1}[s(z)]$ and $y(k) = \mathcal{Z}^{-1}[y(z)]$, respectively. Here, we use $\mathcal{Z}^{-1}[\cdot]$ to denote the inverse z-transform operator. The parameters $c_i \in \mathbb{R}$, $i = 1, \ldots, M$, are the coefficients that form the basis of the Laguerre network. Meanwhile, the signal $l_i \in \mathbb{R}$, $i = 1, \ldots, M$, is the output of the $i$th-order filter in the Laguerre network.
Fig. 4 Structure of the Laguerre network
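A minimal sketch of the filter bank in (11) is given below, written as a time-domain recursion. It relies on the observation that each stage in (11) equals the previous stage multiplied by the all-pass factor $(z^{-1} - a)/(1 - a z^{-1})$; the function names and the test values are hypothetical, and the code is an illustration rather than the authors' implementation.

```python
import numpy as np

def laguerre_outputs(s, a, M, ts=1.0):
    """Return an array of shape (K, M) holding l_1(k)..l_M(k) for input s(k).

    Stage 1:  l_1(k) = a*l_1(k-1) + sqrt((1 - a^2)*ts) * s(k)
    Stage i:  l_i(k) = a*l_i(k-1) + l_{i-1}(k-1) - a*l_{i-1}(k)
    """
    s = np.asarray(s, dtype=float)
    gain = np.sqrt((1.0 - a ** 2) * ts)
    out = np.zeros((len(s), M))
    prev = np.zeros(M)                       # filter outputs at time k-1
    for k in range(len(s)):
        cur = np.empty(M)
        cur[0] = a * prev[0] + gain * s[k]
        for i in range(1, M):
            cur[i] = a * prev[i] + prev[i - 1] - a * cur[i - 1]
        out[k] = cur
        prev = cur
    return out

def laguerre_network(s, a, c, ts=1.0):
    """Output y(k) = sum_i c_i * l_i(k) of the truncated Laguerre network."""
    c = np.asarray(c, dtype=float)
    return laguerre_outputs(s, a, len(c), ts) @ c

# Check on hypothetical values: the impulse responses of the filters should be
# mutually orthogonal, each with energy ts.
a, M, ts = 0.5, 4, 0.01
impulse = np.zeros(400)
impulse[0] = 1.0
basis = laguerre_outputs(impulse, a, M, ts)
gram = basis.T @ basis
print(np.round(gram / ts, 6))
```

The Gram matrix printed at the end is a quick way to see the orthonormality that motivates choosing the Laguerre structure: only the coefficients $c_i$ and the pole $a$ remain to be tuned.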
By this notation, the SISO state-space model of the overall network can be represented by
$$F(z):\ \begin{cases} l(k+1) = A\,l(k) + B\,u(k) \\ y(k) = C\,l(k) \end{cases} \tag{12}$$
where $l = [l_1, \ldots, l_M]^T \in \mathbb{R}^M$ is the state vector, $A \in \mathbb{R}^{M \times M}$ is the system matrix, $B \in \mathbb{R}^M$ is the input matrix, and $C = [c_1, \ldots, c_M] \in \mathbb{R}^{1 \times M}$ is the output matrix. The elements of $A$ and $B$ are given by
$$[A]_{ij} := \begin{cases} a & \text{if } i = j \\ (-1)^{(i-j+1)}\, a^{(i-j-1)}\, (1 - a^2) & \text{if } i > j \\ 0 & \text{otherwise} \end{cases} \tag{13}$$
$$[B]_i := (-a)^{(i-1)} \sqrt{(1 - a^2)\, t_s}. \tag{14}$$
3.3 Fictitious-Reference-Iterative Tuning
The equalizer $F$ needs to be properly designed and tuned to attain $Z_F \approx Z_s$. A tuning procedure similar to the one discussed in [14] was adopted in this work. Figure 5 illustrates the two-step tuning process that was carried out to obtain the optimal controllers. In the first step (see Fig. 5(a)), an equalizer $H$ was determined to match the velocities $\dot{x}_m$ and $\dot{x}_s$. Similar to our previous work in [10], we selected $H(z) := P(z)/Q(z)$ as a bi-proper transfer function in the form of
$$H(z) = \frac{1 + \hat{a}_1 z^{-1} + \cdots + \hat{a}_p z^{-p}}{1 + \hat{b}_1 z^{-1} + \cdots + \hat{b}_p z^{-p}}. \tag{15}$$
In the second step, a fictitious signal was formulated using (15) (see Fig. 5(b)). The fictitious signal can be defined as
$$\tilde{f}_s(k) = H^*(z)\left(u^0(k) + F(z)\,\dot{x}_m^0(k)\right) \tag{16}$$
where $H^*(z)$ is the transfer function $H(z)$ with the optimal parameters. Meanwhile, $u^0$ and $\dot{x}_m^0$ are the recorded input-output data measured from the master manipulator.
Fig. 5 Two-steps of tuning: (a) to attain $H^*$, (b) to attain $F^*$
3.4 Attaining a Matched Impedance via PSO and FRIT
To obtain the optimal transfer function $H^*(z)$, we need to solve the constrained optimization problem defined by
$$\min_{H(z)} J_H \quad \text{s.t. } |z| \le 1 \tag{17}$$
where, for the recorded initial data $\dot{x}_m^0(k)$ and $\dot{x}_s^0(k)$,
$$J_H := \left\| \dot{x}_s^0(k) - H(z)\,\dot{x}_m^0(k) \right\|_K^2. \tag{18}$$
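A minimal sketch of how the cost (18) and the constraint in (17) might be evaluated for one candidate parameter vector is shown below. Several points are assumptions rather than statements from the paper: the constraint is interpreted as requiring all poles and zeros of $H(z)$ to satisfy $|z| \le 1$, violations are handled by returning an infinite cost, and the function names and synthetic signals are hypothetical.

```python
import numpy as np

def apply_H(a_hat, b_hat, x):
    """Filter x through H(z) of Eq. (15) using its difference equation:
    y(k) = x(k) + sum_j a_j x(k-j) - sum_j b_j y(k-j)."""
    p = len(a_hat)
    y = np.zeros(len(x))
    for k in range(len(x)):
        acc = x[k]
        for j in range(1, p + 1):
            if k - j >= 0:
                acc += a_hat[j - 1] * x[k - j] - b_hat[j - 1] * y[k - j]
        y[k] = acc
    return y

def poles_zeros_inside(a_hat, b_hat):
    """True if all zeros and poles of H(z) lie in or on the unit circle."""
    zeros = np.roots(np.concatenate(([1.0], a_hat)))
    poles = np.roots(np.concatenate(([1.0], b_hat)))
    return bool(np.all(np.abs(zeros) <= 1.0) and np.all(np.abs(poles) <= 1.0))

def cost_JH(params, xm0, xs0):
    """J_H of Eq. (18) for params = [a_1..a_p, b_1..b_p]; inf if constraint fails."""
    params = np.asarray(params, dtype=float)
    p = len(params) // 2
    a_hat, b_hat = params[:p], params[p:]
    if not poles_zeros_inside(a_hat, b_hat):
        return np.inf                       # penalty handling is an assumption
    err = xs0 - apply_H(a_hat, b_hat, xm0)
    return float(np.sum(err ** 2))

# Synthetic "recorded" velocities (not real data): the slave signal is the
# master signal passed through a known first-order H(z).
k = np.arange(200)
xm0 = np.sin(0.05 * k)
true_params = np.array([0.5, -0.3])         # a_1 = 0.5, b_1 = -0.3
xs0 = apply_H(true_params[:1], true_params[1:], xm0)

print(cost_JH(true_params, xm0, xs0))       # cost at the generating parameters
print(cost_JH([0.1, 0.1], xm0, xs0))        # cost at a mismatched candidate
```

A function of this kind could serve as the fitness function passed to a PSO routine in the first tuning step.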
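Meanwhile, to attain the optimal $F^*(z)$, we solve the second optimization problem given by
$$\min_{F(z)} J_F \tag{19}$$
where, for the recorded initial data $f_m^0(k)$ and $f_s^0(k)$,
$$\begin{aligned}
J_F &:= \left\| f_s^0(k) - \tilde{f}_s(k) \right\|_K^2 \\
&= \left\| f_s^0(k) - H^*(z)\left(u^0(k) + F(z)\,\dot{x}_m^0(k)\right) \right\|_K^2 \\
&= \left\| H^{*-1}(z)\, f_s^0(k) - u^0(k) - F(z)\,\dot{x}_m^0(k) \right\|_K^2, \\
u^0(k) &= f_m^0(k) - F^0(z)\,\dot{x}_m^0(k).
\end{aligned} \tag{20}$$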
The following algorithm has been implemented to obtain the optimal controllers $H^*(z)$ and $F^*(z)$:
Step 1. Let the tunable parameters of the controller $F(z)$ be defined as $q = [a, c_1, \ldots, c_M] \in \mathbb{R}^{1 \times D_1}$. By arbitrarily selecting the initial value $q^0$, the set of data $\dot{x}_m^0$, $\dot{x}_s^0$, $f_s^0$ and $u^0$ are then generated.
Step 2. First, we tune the equalizer $H$ by employing the PSO algorithm. Let $p_i := [\hat{a}_1, \ldots, \hat{a}_p, \hat{b}_1, \ldots, \hat{b}_p] \in \mathbb{R}^{1 \times D_2}$, $p_i \in [p_{Hmin}, p_{Hmax}]$, $\forall i$. Initialize the positions of the PSO agents in the specified search space. Define the fitness function $\mathit{Fit}$ for each agent according to Eq. (18), such that $\mathit{Fit} = J_H$.
Step 3. Update the agents' velocities based on Eq. (7) and the agents' positions based on Eq. (9) at each iteration. At the final iteration, all agents shall converge to the optimal position $p^*$ corresponding to the optimization problem defined in Eq. (17). Assign the coefficients of the transfer function in (15) with $p^*$. Repeat from Step 2 if the results are not satisfactory.
Step 4. Next, we tune the controller $F$ by also employing the PSO algorithm. Let $p_i := q \in \mathbb{R}^{1 \times D_1}$, $p_i \in [p_{Fmin}, p_{Fmax}]$, $\forall i$. Initialize the positions of the PSO agents in the specified search space. Define the fitness function $\mathit{Fit}$ for each agent according to Eq. (20), such that $\mathit{Fit} = J_F$.
Step 5. Update the agents' velocities based on Eq. (7) and the agents' positions based on Eq. (9) at each iteration. At the final iteration, all agents shall converge to the optimal position $p^*$ corresponding to the optimization problem defined in Eq. (19). Assign $q = q^*$. Repeat from Step 4 if the results are not satisfactory.
4 Numerical Results and Analysis
To illustrate the effectiveness of our proposed method, we present an example in this section. We conducted a numerical analysis using the Matlab simulation package to execute the developed theoretical models. The parameters used in the teleoperation system are summarized in Table 1. The impedance of the human operator was defined as $Z_h = s^2 + 5s + 10$. Meanwhile, the number of basis functions of the truncated Laguerre filters was chosen as $M = 10$, and the sample time as $t_s = 0.01$ s. We assume there was no time delay in the communication link, and the environment's impedance was set to zero to imply that the slave manipulator moves freely without any attached load. The transfer function of the local controllers for both master and slave was chosen as $C_m = 2\left(1 + \frac{1}{100s} + 0.2s\right)$. Meanwhile, the controllers $C_1$ to $C_4$ were selected based on the description provided in Sect. 2.3.
Table 1 Parameters values of the manipulator systems
Manipulator   Mass (kg)     Damper (Ns/m)     Spring (N/m)
Master        $m_m = 1.5$   $d_m = 0.4952$    $k_m = 0$
Slave         $m_s = 3$     $d_s = 2.4762$    $k_s = 1.4621$
In Table 2, we provide the parameters of the PSO algorithm that were used in
the tuning process. For both procedures, we used the weighting factors
$g_1 = g_2 = 1.4$, together with $x_{\min} = 0.4$ and $x_{\max} = 0.9$.
Figure 6 illustrates the performance of the equalizer $H(z)$ with $p = 6$ in equalizing
the velocities of the manipulators. In the figure, the initial
recorded velocity signals of the master and slave manipulators are indicated by the
blue and red lines, respectively. It can clearly be seen that the velocity $\dot{x}_{0m}$ was
matched with $\dot{x}_{0s}$ through the equalizer $H(z)$ (as indicated by the dashed black line).
The convergence of the cost function (18) is shown in Fig. 7, where $J_H = 7.8817$
at the final iteration $k = 150$. Meanwhile, Fig. 8 shows the locations of the poles
and zeros of $H(z)$, all of which lie inside the unit circle, indicating that both $H(z)$ and $H(z)^{-1}$ are
stable.
Table 2 Tuning parameters used in PSO algorithm
            Number of       Number of   Maximum iteration   Minimum range             Maximum range
            parameters D    agents N    iter_max            p_min                     p_max
Tuning H    $D_2 = 12$      200         150                 $-1$                      $1$
Tuning F    $D_1 = 11$      100         400                 $[0,\ -200 \cdot 1_M]$    $[1,\ 50 \cdot 1_M]$

Fig. 6 Velocity matching through equalizer $H(z)$
Fig. 7 Convergence of the cost function $J_H$
Fig. 8 Location of poles and zeros of $H(z)$
The positions, velocities and exerted forces of the master and
slave manipulators before and after tuning are compared in Fig. 9(a) and (b),
respectively. From Fig. 9(b), it can be observed that the velocity trends of both
manipulators are almost identical for all time t. The exception is the position of the master
manipulator, which slightly lags behind the position of the slave. A similar
observation can be made for the exerted force responses of the manipulators,
which show almost identical patterns. An additional result
illustrating the convergence of the cost function (20) is provided in Fig. 10. The cost
function value obtained was $J_F = 30294.2498$ at the final iteration
$k = 400$.
Fig. 9 Performance comparison before and after tuning: (a) before tuning, (b) after tuning
Fig. 10 Convergence of the cost function $J_F$
5 Conclusion
In this paper, a tuning algorithm based on a model-free approach to improve
transparency through impedance matching between the master and slave manipulators
of a bilateral teleoperation system has been demonstrated. Introducing a
controller connected in feedback to the master manipulator makes it possible
to obtain a symmetric impedance between both sides of the teleoperation
system. Furthermore, the use of FRIT eliminates the need to
obtain the plant model through mathematical modeling when designing the
controllers. Hence, it is truly a model-free approach. Meanwhile, the implementation of
the PSO algorithm further simplifies the process of obtaining the optimal controller
parameters. From the presented numerical results, it can be concluded that the
proposed algorithm shows promising results in achieving a matched impedance
between the master and slave manipulators. However, the formulation of the cost
function warrants further investigation to ensure convergence of its
value towards zero.
Acknowledgements This research work has been supported by Research & Innovation
Department, Universiti Malaysia Pahang through short-term grant of RDU1703139.
References
1. Zhang J, Li W, Yu J, Zhang Q, Cui S, Li Y, Li S, Chen G (2017) Development of a virtual
platform for telepresence control of an underwater manipulator mounted on a submersible
vehicle. IEEE Trans Ind Electron 64:1716–1727
2. Saltaren R, Barroso AR, Yakrangi O (2018) Robotics for seabed teleoperation:
part-1-conception and practical implementation of a hybrid seabed robot. IEEE Access
6:60559–60569
Model-Free Tuning of Laguerre Network …
343
3. Berthet-Rayne P, Leibrandt K, Gras G, Fraisse P, Crosnier A, Yang G-Z (2018) Inverse
kinematics control methods for redundant snakelike robot teleoperation during minimally
invasive surgery. IEEE Robot Autom Lett 3:2501–2508
4. Chen JYC, Barnes MJ (2008) Robotics operator performance in a multi-tasking environment.
In: Human-robot interactions in future military operations, pp 293–314
5. Lawrence DA (1992) Designing teleoperator architectures for transparency. In: Proceedings
1992 IEEE international conference on robotics and automation. IEEE Computer Society
Press, pp 1406–1411
6. Lawrence DA (1993) Stability and transparency in bilateral teleoperation. IEEE Trans Robot
Autom 9:624–637
7. Namerikawa T, Kawada H (2006) Symmetric impedance matched teleoperation with position
tracking. In: Proceedings of the 45th IEEE conference on decision and control, pp 4496–4501
8. Sun D, Naghdy F, Du H (2014) Application of wave-variable control to bilateral teleoperation
systems: a survey. Ann Rev Control 38:12–31
9. Kanno T, Yokokohji T (2012) Multilateral teleoperation control over time-delayed computer
networks using wave variables. In: Haptics symposium (HAPTICS). IEEE, pp 125–131
10. Ramli MS, Ahmad H (2018) Data-driven impedance matching in multilateral teleoperation
systems. Indones J Electr Eng Comput Sci 10:713–724
11. Kaneko O, Soma S, Fujii T (2003) A fictititous reference iterative tuning (FRIT) in the
two-degree of freedom control scheme and its application to closed loop system identification.
Instrum Control Soc 42:17–25
12. Kennedy J, Eberhart R (1995) Particle swarm optimization. In: Proceedings of ICNN 1995 international conference on neural networks, pp 1942–1948
13. Zhu M, Salcudean SE (1995) Achieving transparency for teleoperator systems under position
and rate control. In: Proceedings 1995 IEEE/RSJ international conference on intelligent robots
and systems. Human robot interaction and cooperative robots. IEEE Computer Society Press,
pp 7–12
14. Tsuji M, Yamamoto S, Kaneko O (2014) A tuning method of a 4-channel bilateral control
system. In: 46th SICE Hokkaido branch academic symposium, pp 1–4. (in Japanese)
15. Wang Q, Zhang J (2011) Wiener model identification and nonlinear model predictive control
of a pH neutralization process based on Laguerre filters and least squares support vector
machines. J Zhejiang Univ Sci C 12:25–35
16. Wang L (2009) Model predictive control system design and implementation using MATLAB.
Springer, London
Identification of Liquid Slosh Behavior
Using Continuous-Time Hammerstein
Model Based Sine Cosine Algorithm
Julakha Jahan Jui, Mohd Helmi Suid, Zulkifli Musa,
and Mohd Ashraf Ahmad
Abstract This paper presents the identification of a liquid slosh plant using the
Hammerstein model based on the Sine Cosine Algorithm (SCA). A remote-controlled car
carrying a container of liquid is considered as the liquid slosh experimental rig. In
contrast to other research works, this paper considers a piece-wise affine function as
the nonlinear function of the Hammerstein model, which is a more general
function. Moreover, a continuous-time transfer function is utilized in the Hammerstein
model, which is more suitable for representing a real system. The SCA method is used
to tune the coefficients of both the nonlinear function and the transfer function of the
Hammerstein model such that the error between the identified output and the real
experimental output is minimized. The effectiveness of the proposed framework is
assessed in terms of the convergence curve response, the output response, and the
stability of the identified model through the pole-zero map. The results show that
the SCA based method is able to produce a Hammerstein model that yields an
identified output response close to the real experimental slosh output, with an 80.44%
improvement in the sum of quadratic error.
Keywords Slosh behavior · Sine Cosine Algorithm · Hammerstein model
J. J. Jui (✉) · M. H. Suid · Z. Musa · M. A. Ahmad
Faculty of Electrical and Electronics Engineering Technology, Universiti Malaysia Pahang,
26600 Pekan, Pahang, Malaysia
e-mail: julakha.ump@gmail.com
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_24
1 Introduction
Liquid slosh inside cargo occurs in many situations. For
example, ships with liquid container carriers are at high risk of generating sloshing
loads during operation [1]. In the metal industries, high oscillation can spill molten
metal, which is dangerous to the operator [2]. Meanwhile, sloshing of fuel and other
liquids in moving vehicles may cause instability and undesired dynamics [3].
Hence, it is necessary to thoroughly study the behavior of the residual slosh
induced by the container motion. One may study the behavior of liquid slosh
by developing an exact mathematical model of liquid slosh. So far, many
researchers have focused on the first-principles approach to model the slosh behavior, while
few works discuss it from the perspective of nonlinear system
identification.
On the other hand, block-oriented nonlinear system identification has become a
popular technique for modeling complex plants. Block-oriented nonlinear models
can be classified into three categories: the Hammerstein model, the Wiener
model and the Hammerstein-Wiener model. In particular, the Hammerstein model
consists of a nonlinear function followed by a linear dynamic sub-plant,
the Wiener model consists of a linear dynamic sub-plant followed by a nonlinear
function, and the Hammerstein-Wiener model contains a linear dynamic
sub-plant inserted between two or more nonlinear functions in series. Among these
three block-oriented models, the Hammerstein model is well known for its simple
structure and has been widely used for nonlinear system identification.
Specifically, the Hammerstein model has been applied to model real plants such as
a fuel cell stack [4], a bidirectional DC motor [5], oxygen uptake estimation [6],
stretch reflex dynamics [7], a turntable servo system [8], pneumatic muscle actuators
[9], amplified piezoelectric actuators [10] and multi-axis piezoelectric
micropositioning stages [11]. Many tools have also been utilized
to identify the Hammerstein model. These include the iterative method [12–14], the
subspace method [15–17], the least squares method [18], the blind approach [19] and
the parametric instrumental variables method [20]. Moreover, many works consider
optimization tools for the Hammerstein model, such as the Bacterial Foraging
algorithm [21], the Cuckoo search algorithm [22], Particle Swarm optimization [23], and
the Genetic algorithm [24].
Based on the above literature, several limitations are apparent in these works:
(i) Most of the Hammerstein models used in these studies are
discrete-time models, while many real plants can be easily represented by a
continuous-time model.
(ii) Almost all the methods assume a known structure of the nonlinear function, which
consists of several basis functions.
In contrast, our proposed work can solve a more general class of continuous-time
Hammerstein models by assuming an unknown structure of the nonlinear function. In
particular, a piece-wise affine function with many basis functions is adopted. Due
to the introduction of the piece-wise affine function, a high-dimensional design
parameter tuning problem is considered in this study, which makes the identification problem
more complex. On the other hand, the Sine Cosine Algorithm (SCA) [25] has become a
well-regarded optimization algorithm that has solved various types of engineering
problems [25–27]. To the best of our knowledge, there are still few works that discuss
the SCA for identification of the Hammerstein model. Moreover, other recent
optimization methods are quite complex compared to the SCA, which may contribute
to high computation time in obtaining the result. This motivates us to examine the
effectiveness of the SCA in modeling the liquid slosh plant from real
experimental data.
This paper presents the identification of a liquid slosh plant using the
Hammerstein model based on the SCA method. A remote-controlled car carrying a container
of liquid is considered as the liquid slosh experimental rig. The SCA method is used
to tune the coefficients of both the nonlinear function and the transfer function of the
Hammerstein model such that the error between the identified output and the real
experimental output is minimized. The effectiveness of the proposed framework is
assessed in terms of the convergence curve response, the output response, and the
stability of the identified model through the pole-zero map.
2 Liquid Slosh Experimental Rig
In this study, a mobile liquid slosh plant is considered to replicate the real situation of a
moving container carrying liquid, as shown in Fig. 1. In particular, a remote control
car is used to carry a small tank filled with liquid. The tank is also equipped with
four plastic wheels so that it can move smoothly, as shown in Fig. 1(a). Moreover,
three accelerometer sensors (ADXL335) floating on the surface of the liquid are
used to measure the liquid oscillation, as shown in Fig. 1(b). For simplicity,
the liquid slosh data from only one of the sensors is recorded and only the z-axis output
data is considered. Figure 2 shows a general schematic diagram of the liquid slosh
experimental rig. In particular, an Arduino UNO is used as a data acquisition
platform to process the input and output data. Here, we generate a voltage from the
Arduino UNO to the remote car and, concurrently, the Arduino UNO
acquires the slosh data from the accelerometer. Both the input and output data can be
monitored and analyzed on a personal computer using the LabView software.
In order to identify the model of liquid slosh, the remote car is required to move a
certain distance and then stop suddenly to generate a liquid oscillation, or slosh, inside the
tank. Thus, we apply the input voltage shown in Fig. 3 to move the remote car.
Concurrently, the liquid slosh data is recorded, as shown in Fig. 4. These two data sets
are then used to develop the Hammerstein model based on SCA, which is discussed in
the next section.
Fig. 1 Liquid slosh experimental rig: (a) side view, (b) plan view
Fig. 2 Schematic diagram of liquid slosh experimental rig
3 Identification of Liquid Slosh Using Hammerstein Based SCA
In this section, the proposed Sine Cosine Algorithm (SCA) for identification of the
liquid slosh plant in Sect. 2 based on the Hammerstein model is presented. Firstly, the
problem formulation for identifying the liquid slosh plant is explained. Then, it is
shown how to apply the SCA method to identify the liquid slosh based on the
Hammerstein model.
Fig. 3 Input voltage applied to the remote car
Fig. 4 Output slosh from the accelerometer
Figure 5 shows a complete block diagram for identifying the liquid slosh model in
Sect. 2. The proposed Hammerstein model consists of a nonlinear function h(u)
followed by the transfer function G(s). The nonlinear function is a piece-wise affine
function given by
$$
h(u) =
\begin{cases}
c_0 + m_1 (u - d_0) & \text{if } d_0 \le u < d_1, \\
c_1 + m_2 (u - d_1) & \text{if } d_1 \le u < d_2, \\
\qquad \vdots & \\
c_{r-1} + m_r (u - d_{r-1}) & \text{if } d_{r-1} \le u < d_r,
\end{cases}
\tag{1}
$$
and the transfer function G(s) is given by
$$
G(s) = \frac{B(s)}{A(s)} = \frac{s^m + b_{m-1} s^{m-1} + \cdots + b_0}{a_m s^m + a_{m-1} s^{m-1} + \cdots + a_0}.
\tag{2}
$$
Fig. 5 Block diagram of Hammerstein model based SCA
In (1), the symbols $m_i = (c_i - c_{i-1})/(d_i - d_{i-1})$ $(i = 1, 2, \ldots, r)$ are the segment
slopes, with connecting input and output points $d_i$ $(i = 0, 1, \ldots, r)$ and
$c_i$ $(i = 0, 1, \ldots, r)$, respectively. For simplicity of notation, let $d = [d_0, d_1, \ldots, d_r]^T$
and $c = [c_0, c_1, \ldots, c_r]^T$. The input of both the real liquid slosh plant and the identified
model is denoted by $u(t)$, while the outputs of the real liquid slosh plant and the
identified model are denoted by $y(t)$ and $\tilde{y}(t)$, respectively. Thus, the expression
for the identified output can be written as
$$
\tilde{y}(t) = G(s)\, h(u(t)).
\tag{3}
$$
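The piece-wise affine map in (1) can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' code: the function name `piecewise_affine` is hypothetical, and assigning the right endpoint $u = d_r$ to the last segment is an assumption, since (1) leaves that point open. The example vectors `d` and `c` are the input points and the tuned output points reported in Sect. 4 and Table 1.

```python
from bisect import bisect_right

def piecewise_affine(u, d, c):
    """Evaluate h(u) from Eq. (1) for breakpoints d[0..r] and values c[0..r]."""
    if len(d) != len(c) or len(d) < 2:
        raise ValueError("d and c must have the same length (at least 2)")
    if not d[0] <= u <= d[-1]:
        raise ValueError("u lies outside [d_0, d_r]")
    # Segment index i with d[i-1] <= u < d[i]; u = d_r is mapped to the
    # last segment (assumption, Eq. (1) does not cover this endpoint).
    i = min(bisect_right(d, u), len(d) - 1)
    m_i = (c[i] - c[i - 1]) / (d[i] - d[i - 1])  # segment slope
    return c[i - 1] + m_i * (u - d[i - 1])

# Input points d (Sect. 4) and identified output points c (Table 1, column P).
d = [0, 0.2, 0.4, 0.6, 0.8, 1, 2, 3, 4, 5]
c = [-4.8859, -0.0219, 3.3211, -4.7295, -0.3240,
     -4.4858, -0.0002, 0.0000, 0.1679, -4.3282]

h_mid = piecewise_affine(0.1, d, c)  # halfway along the first segment
```

Because each segment interpolates linearly between $(d_{i-1}, c_{i-1})$ and $(d_i, c_i)$, the function is continuous and fully determined by the vectors d and c, which is why only c needs to be tuned once d is fixed.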
Moreover, several assumptions are adopted in this work:
(i) The orders of the polynomials A(s) and B(s) are assumed to be known.
(ii) The nonlinear function h(u(t)) is a one-to-one map of the input u(t), and the
values of $d_i$ $(i = 1, 2, \ldots, r)$ are pre-determined according to the response of the
input u(t).
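Under these assumptions, the identified output in (3) can be simulated numerically from sampled input data. The sketch below is one possible way to do this and is not taken from the paper: it assumes a strictly proper G(s), realizes it in controllable canonical form, holds the nonlinearity output constant over each sample, and integrates with a fixed-step fourth-order Runge-Kutta scheme. The name `simulate_hammerstein` and the `substeps` parameter are hypothetical.

```python
def simulate_hammerstein(u_samples, ts, h, num, den, substeps=10):
    """Return samples of y~(t) = G(s) h(u(t)) for a strictly proper G(s).

    num and den hold the coefficients of B(s) and A(s), lowest power first.
    """
    n = len(den) - 1
    if len(num) > n:
        raise ValueError("this sketch handles strictly proper G(s) only")
    b = list(num) + [0.0] * (n - len(num))

    def deriv(z, v):
        # Controllable canonical form: z1' = z2, ...,
        # a_n * zn' = v - (a_0 z1 + ... + a_{n-1} zn).
        top = (v - sum(den[i] * z[i] for i in range(n))) / den[n]
        return z[1:] + [top]

    def shifted(z, k, scale):
        return [zi + scale * ki for zi, ki in zip(z, k)]

    z = [0.0] * n              # zero initial conditions
    dt = ts / substeps
    y_tilde = []
    for u in u_samples:
        y_tilde.append(sum(b[i] * z[i] for i in range(n)))
        v = h(u)               # static nonlinearity, held over one sample
        for _ in range(substeps):
            k1 = deriv(z, v)
            k2 = deriv(shifted(z, k1, 0.5 * dt), v)
            k3 = deriv(shifted(z, k2, 0.5 * dt), v)
            k4 = deriv(shifted(z, k3, dt), v)
            z = [zi + dt / 6.0 * (p + 2 * q + 2 * r + s)
                 for zi, p, q, r, s in zip(z, k1, k2, k3, k4)]
    return y_tilde

# Toy check: G(s) = 1/(s + 1) with h(u) = u and a unit step input.
y_step = simulate_hammerstein([1.0] * 101, 0.01, lambda u: u,
                              [1.0], [1.0, 1.0])
```

For the toy first-order system, the sample at t = 1 should approach 1 − e^(−1), which gives a quick sanity check before the routine is used with a higher-order G(s).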
Next, let $t_s$ be the sampling time of the real experimental input and output data
$(u(t), y(t))$ $(t = 0, t_s, 2t_s, \ldots, Nt_s)$. Then, in order to accurately identify the liquid slosh
model, the following objective function is adopted in this study:
$$
E(G, h) = \sum_{g=0}^{N} \left( y(g t_s) - \tilde{y}(g t_s) \right)^2 .
\tag{4}
$$
Note that the objective function in (4) is based on the sum of quadratic error,
which has been widely used in the literature [28, 29]. Finally, our problem
formulation can be described as follows.
Problem 1. Based on the given real experimental data (u(t), y(t)) in Figs. 3 and 4, find the
nonlinear function h(u) and the transfer function G(s) such that the objective
function in (4) is minimized.
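As a small illustration, the objective in (4) amounts to the following computation on sampled data. The function name `sum_of_quadratic_error` and the sample values are hypothetical and serve only to show the arithmetic.

```python
def sum_of_quadratic_error(y, y_tilde):
    """Objective (4): sum of squared differences between the real output
    samples y and the identified output samples y_tilde."""
    if len(y) != len(y_tilde):
        raise ValueError("y and y_tilde must contain the same number of samples")
    return sum((a - b) ** 2 for a, b in zip(y, y_tilde))

# Hypothetical samples: only the last pair differs, by 2, so E = 4.
E = sum_of_quadratic_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```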
Next, it is shown how to apply the SCA in solving Problem 1. For
simplicity, let the design parameter of Problem 1 be defined as
$x = [\, b_0 \;\; b_1 \;\; \cdots \;\; b_{m-1} \;\; a_0 \;\; a_1 \;\; \cdots \;\; a_m \;\; c_0 \;\; \cdots \;\; c_r \,]^T$, where the elements
of the design parameter are the coefficients of both the nonlinear function and the
transfer function of the continuous-time Hammerstein model. In the SCA framework,
let $x_i$ $(i = 1, 2, \ldots, M)$ be the design parameter of agent i, for a total of M
agents. Then, let $x_{ij}$ $(j = 1, 2, \ldots, D)$ be the j-th element of the vector
$x_i$ $(i = 1, 2, \ldots, M)$, where D is the size of the design parameter. Thus, by
adopting the objective function in (4), the minimization problem is expressed as
$$
\arg \min_{x_i(1),\, x_i(2),\, \ldots} E(x_i(k))
\tag{5}
$$
for iterations k = 1, 2, …, until the maximum iteration $k_{\max}$. Finally, the procedure of
the SCA in solving Problem 1 is shown below:
Step 1: Determine the total number of agents M and the maximum iteration $k_{\max}$.
Set k = 0 and initialize the design parameter $x_i(0)$ $(i = 1, 2, \ldots, M)$ according to
the upper bound $x_{up}$ and lower bound $x_{low}$ values of the design parameter.
Step 2: Calculate the objective function in (4) for each search agent i.
Step 3: Update the values of the best design parameter P based on the objective
function values generated in Step 2.
Step 4: For each agent, update the design parameter using the following
equation:
$$
x_{ij}(k+1) =
\begin{cases}
x_{ij}(k) + r_1 \sin(r_2)\, \left| r_3 P_j - x_{ij}(k) \right| & \text{if } r_4 < 0.5, \\
x_{ij}(k) + r_1 \cos(r_2)\, \left| r_3 P_j - x_{ij}(k) \right| & \text{if } r_4 \ge 0.5,
\end{cases}
\tag{6}
$$
where
$$
r_1 = 2 \left( 1 - \frac{k}{k_{\max}} \right)
\tag{7}
$$
for maximum iteration $k_{\max}$, where the constant positive value a of the original
SCA [25] is set to 2. Note that $r_2$, $r_3$ and $r_4$ are
random values that are generated independently and uniformly in the ranges $[0, 2\pi]$,
[0, 2] and [0, 1], respectively. The detailed justification for the selection of the
coefficients $r_1$, $r_2$, $r_3$ and $r_4$ is given in [25]. In (6), the symbol $P_j$
(j = 1, 2, …, D) denotes the j-th element of the current best design parameter
P that is kept during the tuning process.
Step 5: Once the maximum iteration is reached, record the best design
parameter P and obtain the continuous-time Hammerstein model in Fig. 5.
Otherwise, repeat from Step 2.
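Steps 1 to 5 can be sketched as the short routine below. It is a minimal sketch under stated assumptions, not the authors' implementation: the name `sca_minimize`, the toy objective, the default numbers of agents and iterations, and the clipping of updated parameters to $[x_{low}, x_{up}]$ are all hypothetical choices that the procedure above does not specify.

```python
import math
import random

def sca_minimize(objective, x_low, x_up, n_agents=20, k_max=300, a=2.0, seed=0):
    """Sine Cosine Algorithm sketch; returns (best_parameter, best_value)."""
    rng = random.Random(seed)
    dim = len(x_low)
    # Step 1: random initialization inside the bounds.
    x = [[rng.uniform(x_low[j], x_up[j]) for j in range(dim)]
         for _ in range(n_agents)]
    best, best_val = None, float("inf")
    for k in range(k_max):
        # Steps 2-3: evaluate every agent and keep the best parameter P.
        for xi in x:
            val = objective(xi)
            if val < best_val:
                best, best_val = xi[:], val
        r1 = a * (1.0 - k / k_max)          # Eq. (7) with a = 2
        # Step 4: sine/cosine update of every element, Eq. (6).
        for xi in x:
            for j in range(dim):
                r2 = rng.uniform(0.0, 2.0 * math.pi)
                r3 = rng.uniform(0.0, 2.0)
                r4 = rng.random()
                trig = math.sin(r2) if r4 < 0.5 else math.cos(r2)
                step = r1 * trig * abs(r3 * best[j] - xi[j])
                # Clipping to the bounds is an assumption of this sketch.
                xi[j] = min(max(xi[j] + step, x_low[j]), x_up[j])
    # Step 5: final evaluation after the last update.
    for xi in x:
        val = objective(xi)
        if val < best_val:
            best, best_val = xi[:], val
    return best, best_val

def toy_objective(x):
    """Hypothetical stand-in for E(G, h): a quadratic bowl with minimum 0."""
    return sum(v * v for v in x)

P, E_best = sca_minimize(toy_objective, [-5.0, -5.0], [5.0, 5.0])
```

For the identification problem, `toy_objective` would be replaced by a function that builds G(s) and h(u) from the candidate vector, simulates the identified output, and returns the sum of quadratic error in (4).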
4 Results and Analysis
In this section, the effectiveness of the SCA based method for identifying the liquid
slosh system using the continuous-time Hammerstein model is demonstrated. In
particular, the convergence curve response of the objective function in (4), the
pole-zero map of the linear function and the plot of the nonlinear function are
presented and analyzed.
Based on the experimental setup in Sect. 2, the input response u(t) shown in
Fig. 3 is applied to the liquid slosh plant, and the output response y(t) is recorded as
shown in Fig. 4. Here, the input and output data are sampled at $t_s = 0.02$ for
N = 450. In this study, the structure of G(s) is selected as
$$
G(s) = \frac{B(s)}{A(s)} = \frac{s^3 + b_2 s^2 + b_1 s + b_0}{a_4 s^4 + a_3 s^3 + a_2 s^2 + a_1 s + a_0}
\tag{8}
$$
after performing several preliminary tests on the given data (u(t), y(t)). The fourth
order system is obtained by considering a cascade of second order systems for the DC motor
of the remote car and the slosh dynamics. Meanwhile, the input points for the piece-wise
affine function h(u(t)) are given by $d = [0, 0.2, 0.4, 0.6, 0.8, 1, 2, 3, 4, 5]^T$. The
vector d is selected after several preliminary experiments. The design
parameter $x \in \mathbb{R}^{18}$ with its corresponding transfer function and nonlinear function coefficients is
shown in Table 1. Next, the SCA is applied to tune the design parameter,
with the initial values of the design parameter randomly selected between the upper
bound $x_{up}$ and lower bound $x_{low}$ shown in Table 1. Note that the values of $x_{up}$ and
$x_{low}$ are obtained after performing several preliminary experiments. Here, we
choose the number of agents M = 40 with maximum iterations $k_{\max} = 5000$.
Table 1 Design parameter of liquid slosh plant
x      Coefficients   x_low     x_up    P
x1     b2             −5        35      −3.7948
x2     b1             −5        35      10.7153
x3     b0             −5        35      −0.9059
x4     a4             −5        35      −0.6154
x5     a3             −2200     −1      −5.3112
x6     a2             −2200     −1      −139.8711
x7     a1             −2200     −1      −1132.2883
x8     a0             −2200     −1      −839.7621
x9     c0             −5        5       −4.8859
x10    c1             −5        5       −0.0219
x11    c2             −5        5       3.3211
x12    c3             −5        5       −4.7295
x13    c4             −5        5       −0.3240
x14    c5             −5        5       −4.4858
x15    c6             −5        5       −0.0002
x16    c7             −5        5       0.0000
x17    c8             −5        5       0.1679
x18    c9             −5        5       −4.3282

Fig. 6 Convergence curve response
Figure 6 shows the convergence of the objective function, which reaches the
value E(G, h) = 0.1616 at $k_{\max} = 5000$, an 80.44% improvement of the objective
function, and produces the best design parameter P shown in the final column
of Table 1. This shows that the SCA based method is able to minimize the objective
function in (4) and produce an identified output response $\tilde{y}(t)$ that is quite close to the
real output $y(t)$, as can be clearly seen in Fig. 7. Note that the identified output
response tends to yield high oscillation when the input is injected into the system and
starts to attenuate when the input is zero, which is quite similar to the response of
the real experimental output.
Fig. 7 Response of the identified output $\tilde{y}(t)$ and real output $y(t)$
Fig. 8 Pole-zero map of transfer function G(s)
Fig. 9 Resultant of piece-wise affine function h(u)
In the real experimental setup, we can say that the liquid slosh system is stable
since the liquid slosh output decays gradually as $t \to \infty$. In order to validate the
stability of our model, we use the pole-zero map of the identified transfer
function G(s) shown in Fig. 8. In the pole-zero map, all the poles are located
in the left half of the s-plane, i.e., to the left of the imaginary axis. In particular, the obtained poles are
−0.1190 ± j14.8001, −7.5621 and −0.8229, while the obtained zeros are
0.0872 and 1.8538 ± j2.6373. We can also observe the shape
of the nonlinear function by plotting the obtained piece-wise affine function, as depicted in
Fig. 9. Note that our nonlinear function is not restricted to any particular form
(e.g., quadratic), which makes it more general and provides more flexibility in
searching for a justifiable function.
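The stability conclusion can be cross-checked directly from the coefficients in Table 1 without computing the roots. The sketch below uses the Routh-Hurwitz criterion, which is a substitute for the pole-zero map used in the paper, not the authors' procedure: the number of sign changes in the first column of the Routh array equals the number of roots in the right half-plane. The function names are hypothetical, and the sketch assumes that no zero appears in the first column.

```python
def routh_first_column(coeffs):
    """First column of the Routh array; coeffs are given highest power first."""
    n = len(coeffs) - 1
    width = n // 2 + 1
    row0 = list(coeffs[0::2])
    row1 = list(coeffs[1::2])
    row0 += [0.0] * (width - len(row0))
    row1 += [0.0] * (width - len(row1))
    rows = [row0, row1]
    for _ in range(2, n + 1):
        prev2, prev1 = rows[-2], rows[-1]
        if prev1[0] == 0:
            raise ValueError("zero pivot: special case not handled in this sketch")
        new = [(prev1[0] * prev2[j + 1] - prev2[0] * prev1[j + 1]) / prev1[0]
               for j in range(width - 1)]
        new.append(0.0)
        rows.append(new)
    return [row[0] for row in rows]

def count_rhp_roots(coeffs):
    """Number of right-half-plane roots = sign changes in the first column."""
    col = routh_first_column(coeffs)
    return sum(1 for p, q in zip(col, col[1:]) if p * q < 0)

# Identified coefficients from Table 1 (column P), highest power first.
A = [-0.6154, -5.3112, -139.8711, -1132.2883, -839.7621]  # a4 ... a0
B = [1.0, -3.7948, 10.7153, -0.9059]                      # s^3, b2, b1, b0

unstable_poles = count_rhp_roots(A)
rhp_zeros = count_rhp_roots(B)
```

With the tabulated values, the first column for A(s) keeps a single sign, which agrees with all four reported poles lying in the left half-plane, while B(s) shows three sign changes, in line with the three reported zeros having positive real parts.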
5 Conclusion
In this paper, the identification of a liquid slosh plant using a continuous-time
Hammerstein model based on the Sine Cosine Algorithm (SCA) has been presented.
The results demonstrate that the proposed generic Hammerstein model based on
SCA has good potential for identifying real liquid slosh behavior. In particular,
it is shown that the proposed method is able to produce an identified
output that is quite close to the real liquid slosh output. Moreover, the resultant linear model has been
shown to be stable based on the pole-zero map. It is also shown that the use of a
piece-wise affine function gives more flexibility for the SCA to search for a more generic
nonlinear function. In the future, our work can be extended to other types of
nonlinear models such as continuous-time Wiener and Hammerstein-Wiener models.
Acknowledgements The authors gratefully acknowledge the Research and Innovation Department
of Universiti Malaysia Pahang for financial support under grant RDU1703153.
References
1. Rizzuto E, Tedeschi R (1997) Surveys of actual sloshing loads on board of ships at sea. In:
Proceedings of International Conference on Ship and Marine Research, pp 7.29–7.37
2. Terashima K, Schmidt G (1994) Sloshing analysis and suppression control of tilting-type
automatic pouring machine. In: Proceedings of IEEE International Symposium on Industrial
Electronics, pp 275–280
3. Acarman T, Ozguner U (2006) Rollover prevention for heavy trucks using frequency shaped
sliding mode control. Vehi Syst Dyn 44(10):737–762
4. Li C, Zhu X, Cao G, Sui S, Hu M (2008) Identification of the Hammerstein model of a
PEMFC stack based on least squares support vector machines. J Power Sour 175:303–316
5. Kara T, Eker I (2004) Nonlinear modeling and identification of a DC motor for bidirectional
operation with real time experiments. Energy Convers Manag 45(7–8):1087–1106
6. Su SW, Wang L, Celler BG, Savkin AV (2007) Oxygen uptake estimation in humans during
exercise using a Hammerstein model. Ann Biomed Eng 35(11):1898–1906
7. Westwick DT, Kearney RE (2001) Separable least squares identification of nonlinear
Hammerstein models: Application to stretch reflex dynamics. Ann Biomed Eng 29(8):707–718
8. Zhang Q, Wang Q, Li G (2016) Nonlinear modeling and predictive functional control of
Hammerstein system with application to the turntable servo system. Mech Syst Signal Process
72:383–394
9. Ai Q, Peng Y, Zuo J, Meng W, Liu Q (2019) Hammerstein model for hysteresis
characteristics of pneumatic muscle actuators. Int J Intell Robot Appl 3(1):33–44
10. Saleem A, Mesbah M, Al-Ratout S (2017) Nonlinear Hammerstein model identification of
amplified piezoelectric actuators (APAs): Experimental considerations. In: 2017 4th
International Conference on Control, Decision and Information Technologies (CoDIT),
pp 0633–0638
11. Zhang HT, Hu B, Li L, Chen Z, Wu D, Xu B, Huang X, Gu G, Yuan Y (2018) Distributed
Hammerstein modeling for cross-coupling effect of multiaxis piezoelectric micropositioning
stages. IEEE/ASME Trans Mechatron 23(6):2794–2804
12. Bai EW, Li D (2004) Convergence of the iterative Hammerstein system identification
algorithm. IEEE Trans Autom Control 49(11):1929–1940
13. Hou J, Chen F, Li P, Zhu Z (2019) Fixed point iteration-based subspace identification of
Hammerstein state-space models. IET Control Theory Appl 13(8):1173–1181
14. Ge Z, Ding F, Xu L, Alsaedi A, Hayat T (2019) Gradient-based iterative identification method
for multivariate equation-error autoregressive moving average systems using the decomposition technique. J Frankl Inst 356(3):1658–1676
15. Hou J, Liu T, Wahlberg B, Jansson M (2018) Subspace Hammerstein model identification
under periodic disturbance. IFAC-PapersOnLine 51(15):335–340
16. Hou J, Liu T, Wang QG (2019) Subspace identification of Hammerstein-type nonlinear
systems subject to unknown periodic disturbance. Int J Control, 1–29 (Just-accepted)
17. Jamaludin IW, Wahab NA (2017) Recursive subspace identification algorithm using the
propagator based method. Indones J Electr Eng Comput Sci 6(1):172–179
18. Wang D, Zhang W (2015) Improved least squares identification algorithm for multivariable
Hammerstein systems. J Frankl Inst 352(11):5292–5307
19. Bai EW (2002) A blind approach to the Hammerstein-Wiener model identification.
Automatica 38(6):967–979
20. Ma L, Liu X (2015) A nonlinear recursive instrumental variables identification method of
Hammerstein ARMAX system. Nonlinear Dyn 79(2):1601–1613
21. Lin W, Liu PX (2006) Hammerstein model identification based on bacterial foraging. Electron
Lett 42(23):1332–1333
22. Gotmare A, Patidar R, George NV (2015) Nonlinear system identification using a cuckoo
search optimized adaptive Hammerstein model. Expert Syst Appl 42(5):2538–2546
23. Al-Duwaish HN (2011) Identification of Hammerstein models with known nonlinearity
structure using particle swarm optimization. Arab J Sci Eng 36(7):1269–1276
24. Zhang H, Zhang H (2013) Identification of hammerstein model based on Quantum Genetic
Algorithm. Telkomnika 11(12):7206–7212
25. Mirjalili S (2016) SCA: A sine cosine algorithm for solving optimization problems.
Knowl-Based Syst 96:120–133
26. Suid MH, Tumari MZ, Ahmad MA (2019) A modified sine cosine algorithm for improving
wind plant energy production. Indones J Electr Eng Comput Sci 16(1):101–106
27. Suid MH, Ahmad MA, Ismail MRTR, Ghazali MR, Irawan A, Tumari MZ (2018) An
improved sine cosine algorithm for solving optimization problems. In: IEEE Conference on
Systems, Process and Control (ICSPC), pp 209–213
28. Mjahed M, Ayad H (2019) Quadrotor identification through the cooperative particle swarm
optimization-cuckoo search approach. Comput Intell Neurosci 2019:1–10
29. Gupta S, Gupta R, Padhee S (2018) Parametric system identification and robust controller
design for liquid–liquid heat exchanger system. IET Control Theory Appl 12(10):1474–1482
Cardiotocogram Data Classification
Using Random Forest Based Machine
Learning Algorithm
M. M. Imran Molla, Julakha Jahan Jui, Bifta Sama Bari,
Mamunur Rashid, and Md Jahid Hasan
Abstract Cardiotocography is the most widely used technique in obstetric
practice for monitoring fetal health. The foremost purpose of monitoring is
to detect fetal hypoxia at an early stage. This modality is also widely used to record
fetal heart rate and uterine activity. Accurate analysis of cardiotocograms is critical
for further treatment. For this reason, fetal state evaluation using machine
learning techniques on cardiotocogram data has attracted significant attention. In
this paper, we implement a model-based CTG data classification system using a
supervised Random Forest (RF), which classifies the CTG data based on its
training data. According to the results, the supervised machine learning based
classification approach performed well. In this study, Precision, Recall and F-Score
have been employed as the metrics
to evaluate the performance. It was found that the RF based classifier could identify
normal, suspicious and pathologic conditions from the CTG data with
94.8% accuracy. We also highlight the major features based on Mean Decrease
Accuracy and Mean Decrease Gini.
Keywords Fetal heart rate · Random forest classifier · Cardiotocography
M. M. Imran Molla
Faculty of Computer Science and Engineering, Khwaja Yunus Ali University,
6751 Enayetpur, Sirajganj, Bangladesh
J. J. Jui (✉) · B. S. Bari · M. Rashid
Faculty of Electrical and Electronics Engineering, Universiti Malaysia Pahang,
26600 Pekan, Pahang, Malaysia
e-mail: julakha.ump@gmail.com
M. J. Hasan
Faculty of Mechanical and Manufacturing Engineering, Universiti Malaysia Pahang,
26600 Pekan, Pahang, Malaysia
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_25
1 Introduction
Cardiotocography is a technique used to monitor fetal health during pregnancy.
A cardiotocogram (CTG) comprises two signals, namely the fetal heart rate (FHR) and
uterine activity (UA). The aim of CTG monitoring is to identify fetal hypoxia at an
early stage, so that further examinations of the fetal condition can be performed or
the baby can be delivered surgically.
A standardized nomenclature has been adopted for reading cardiotocographs
[1]. It covers the baseline fetal heart rate (110 to 160 beats/minute), uterine
activity, baseline FHR variability (5 to 25 beats/minute above and below the stable
FHR baseline), periods of decreased and increased FHR variability, and the presence of
any acceleration or deceleration [2]. Fetal hypoxia (lack of oxygen, normally in the
range of 1 to 5%) can be recognized by observing the FHR. If fetal hypoxia is
prolonged, the risk of disability in the newborn becomes high and, in some cases, it
may lead to death. Consequently, it is essential to detect abnormal FHR patterns and
take suitable action to avoid prenatal morbidity and mortality [3, 4].
Cardiotocography can be used to examine the health condition of the fetus, normoxia [5]
(oxygen tensions between 10–21%) and the normal or abnormal acid-base status of the
fetus [6]. Thus, numerous indicators (occurring days or hours before fetal death) can
be identified promptly and lead to appropriate obstetric intervention, which could
assist in delivering a healthy baby.
CTG interpretation is done manually, which may introduce human error. A computerized
CTG may provide automatic interpretation and thereby help decrease the fetal mortality
rate [7, 8]. Various techniques have been used for the classification of CTG data.
Czabanski et al. [9] reported a two-step mechanism, consisting of weighted fuzzy
scoring and an LSVM algorithm, applied to FHR to predict the risk of acidemia.
Artificial neural networks were applied to assess fetal wellbeing by Georgieva et
al. [10] and Jezewski et al. [11]. Karabulut and Ibrikci [12] used an adaptive boosting
ensemble of decision trees to analyze cardiotocograms and detect the pathologic fetus.
A neuro-fuzzy method [13] and a naïve Bayes classifier [14] have also been applied to
cardiotocogram classification. Random forest [15] is a classifier built on multiple
trees grown from randomly sampled subspaces of the input features, whose outputs are
combined using bagging. It has been applied to many real-life problems, including
protein sequencing [16], classification of Alzheimer’s disease [17], cancer detection
[18], physical activity classification [19] and classification of cardiotocograms
[20]. Fetal state classification from cardiotocography with feature extraction using
hybrid K-Means and support vector machine was reported in [21] with 90.64% accuracy.
Fetal state assessment from cardiotocogram data using artificial neural networks was
presented in [22]. Zhang et al. [23] assessed the fetal state from cardiotocography
parameters by applying PCA and AdaBoost, with 93% accuracy. In [24], a decision tree
is used to analyze cardiotocogram data for fetal distress determination. Fetal state
identification from cardiotocograms using LS-SVM with PSO (Particle Swarm
Optimization) and a binary decision tree was reported in [25]; the method provides
91.62% classification accuracy, and good classification accuracy was obtained using
only ten important features out of twenty-one [25]. A mathematical modeling strategy
to simulate early deceleration in CTG was presented by Beatrijs et al. [26]; their
results for the uncompromised fetus show that the partial oxygen pressure decreases
with the strength and duration of the contraction. Sundar et al. [27] proposed
classification of cardiotocogram data using a neural network, with an accuracy of 91%.
A feature group weighting method for subspace clustering of high-dimensional data was
reported in [28], achieving an F-measure of 0.77. Zhou and Sun [29] proposed active
learning of Gaussian processes, with an accuracy of 89%. Cruz et al. [30] proposed the
META-DES ensemble classifier, with an accuracy of 84.6%. In this paper, a random
forest classifier is applied to classify cardiotocograms into normal, suspicious and
pathological classes, and a feature importance index is used to identify the important
features of the database.
2 Research Methodology
Figure 1 depicts the complete working procedure with the Random Forest algorithm. To
build any model, the dataset must first be imported. In this research, the CTG dataset
[31] from the UCI Machine Learning Repository is used. The dataset is then checked for
missing values and misleading data. After that, the dataset is split into an 80%
training set and a 20% test set. The Random Forest classifier is trained on the
training set. After the training phase, a testing phase is performed to validate the
predictions using the test data. Finally, various measures are used to evaluate the
performance of the model.
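The 80/20 partition described above can be sketched as follows. This is a minimal illustration only: the function name `train_test_split`, the seed and the stand-in records are assumptions, not the authors' implementation, and real rows would hold the 21 features plus a class label.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Randomly partition rows into (train, test) without replacement."""
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    n_test = round(len(rows) * test_fraction)
    test = [rows[i] for i in indices[:n_test]]
    train = [rows[i] for i in indices[n_test:]]
    return train, test

# Stand-in for the 2126 CTG records (hypothetical placeholder data).
records = list(range(2126))
train, test = train_test_split(records)
print(len(train), len(test))
```

Because every index is used exactly once, no record can appear in both sets, which is what "without replacement" requires.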
2.1 Dataset Description
A freely accessible CTG data set [31] from the UCI Machine Learning Repository
is used in this study. The data set comprises 2126 instances described
by 22 attributes. The last two attributes are class codes for the FHR pattern and the
fetal condition, respectively. Each instance can be grouped by FHR pattern or by
fetal condition. The attributes are presented in Table 1. CTG is a technique for
recording the fetal heartbeat and the uterine contractions during pregnancy, typically
in the last trimester.
Fig. 1 Working principle of Random Forest regression
The data set comprises 2126 cardiotocograms collected from
the Maternity and Gynecological Clinic [32]. The CTGs were classified by three expert
obstetricians, and the class of each cardiotocogram was decided by their majority.
Each record is labeled as one of three classes, Normal (N), Suspicious (S) or
Pathological (P), as shown in Table 2.
Table 1 Explanation of features
Symbol     Description
LB         FHR baseline (beats/min)
AC         Number of accelerations/second
FM         Number of fetal movements/second
UC         Number of uterine contractions/second
DL         Number of light decelerations/second
DS         Number of severe decelerations/second
DP         Number of prolonged decelerations/second
ASTV       Percentage of time with abnormal short-term variability
MSTV       Mean value of short-term variability
ALTV       Percentage of time with abnormal long-term variability
MLTV       Mean value of long-term variability
Width      Width of FHR histogram
Min        Minimum of FHR histogram
Max        Maximum of FHR histogram
Nmax       Number of histogram peaks
Nzeros     Number of histogram zeros
Mode       Histogram mode
Mean       Histogram mean
Median     Histogram median
Variance   Histogram variance
Tendency   Histogram tendency
Table 2 Class distribution of CTGs
Fetal state   Class   Numeric class   Number of FHR recordings
Normal        N       1               1655
Suspect       S       2               295
Pathologic    P       3               176
Total                                 2126
2.2 Random Forest Classifier
A random forest classifier builds a set of decision trees from randomly chosen subsets
of the training dataset. It aggregates the votes of the individual decision trees to
decide the final class of the test objects [33]. Each tree is grown as follows:
1. If the number of cases in the training set is N, sample N cases at random, but with
replacement, from the original data. This sample is the training set for growing the
tree.
2. If there are M input variables, a number m << M is specified such that at each
node, m variables are chosen at random out of the M and the best split on these m is
used to split the node. The value of m is held constant while the forest is grown.
3. Each tree is grown to the largest extent possible. There is no pruning.
Decreasing m decreases both the correlation between trees and the strength of the
individual trees; increasing it increases both. Somewhere in between is an “optimal”
range of m, which is usually quite wide. Using the OOB error rate discussed in
Sect. 3, a value of m in this range can quickly be found. This is the only adjustable
parameter to which random forests are somewhat sensitive.
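The three ingredients of the procedure above (bootstrap sampling of N cases, random choice of m out of M variables at a node, and vote aggregation) can be sketched without growing actual trees. The helper names below are hypothetical, N = 1701 assumes an 80% share of the 2126 records, and m = 4 is an illustrative value rather than the one used in this study.

```python
import random
from collections import Counter

def bootstrap_indices(n, rng):
    """Step 1: draw n case indices at random with replacement."""
    return [rng.randrange(n) for _ in range(n)]

def out_of_bag(n, in_bag):
    """Cases never drawn for a tree; they form that tree's OOB set."""
    drawn = set(in_bag)
    return [i for i in range(n) if i not in drawn]

def candidate_features(M, m, rng):
    """Step 2: choose m of the M input variables (no repeats) for one node."""
    return rng.sample(range(M), m)

def majority_vote(votes):
    """Aggregate the class votes of the individual trees."""
    return Counter(votes).most_common(1)[0][0]

rng = random.Random(42)
N, M, m = 1701, 21, 4          # illustrative sizes (assumptions)
in_bag = bootstrap_indices(N, rng)
oob = out_of_bag(N, in_bag)
features = candidate_features(M, m, rng)
winner = majority_vote(["N", "S", "N", "P", "N"])
print(len(oob) / N, features, winner)
```

On average a fraction of about (1 - 1/N)^N, roughly 0.37, of the cases is left out of each bootstrap sample, which is what makes the OOB error estimate possible without a separate validation set.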
In layman’s terms, assume that the training set is represented as
[X1, X2, X3, X4, ..., Xn] with corresponding labels [L1, L2, L3, L4, ..., Ln]. A random
forest may build three decision trees, each taking a subset as input, for example:
\[ [X_1, X_2, X_3, \ldots, X_n] \tag{1} \]
\[ [X_1, X_2, X_4, \ldots, X_n] \tag{2} \]
\[ [X_2, X_3, X_4, \ldots, X_n] \tag{3} \]
Thus, the forest predicts the class that receives the most votes from the individual
decision trees. Classification results are reported using precision, recall and
F-measure. Precision, or positive predictive value (PPV), is the proportion of
instances that truly belong to a class (TP: True Positive) out of all instances, TP
plus FP (False Positive), that the classifier assigned to that class.
\[ \text{Precision} = \frac{TP}{TP + FP} \tag{4} \]
Recall, or sensitivity, is the proportion of instances correctly classified into a
class out of the total number of instances belonging to that class, which is the sum
of TP and FN (False Negative).
\[ \text{Recall} = \frac{TP}{TP + FN} \tag{5} \]
The F-measure combines precision and recall as their harmonic mean:
\[ \text{F-Measure} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{6} \]
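Equations (4) to (6) can be evaluated per class directly from a confusion matrix. The sketch below uses the counts of Table 4, read with rows as true classes and columns as predicted classes; that orientation is an assumption, chosen because it gives values within about 0.001 of the test columns of Table 3. The function names are illustrative, and classes with no predictions (zero denominators) are not handled.

```python
def per_class_metrics(cm, labels):
    """Precision, recall and F-measure per class from a square confusion matrix.

    cm[i][j] = number of cases of true class i predicted as class j
    (assumed orientation).
    """
    metrics = {}
    for k, name in enumerate(labels):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(len(labels))) - tp  # column total minus TP
        fn = sum(cm[k]) - tp                                 # row total minus TP
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f_measure = 2 * precision * recall / (precision + recall)
        metrics[name] = (precision, recall, f_measure)
    return metrics

def accuracy(cm):
    """Fraction of cases on the diagonal."""
    return sum(cm[k][k] for k in range(len(cm))) / sum(map(sum, cm))

labels = ["Normal", "Suspect", "Pathologic"]
cm = [[146, 4, 1],
      [2, 19, 0],
      [1, 2, 18]]
metrics = per_class_metrics(cm, labels)
for name in labels:
    print(name, [round(v, 3) for v in metrics[name]])
print("accuracy", round(accuracy(cm), 3))
```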
3 Results and Discussions
A random forest classifier is used to classify the three classes, Normal (N),
Suspicious (S) and Pathological (P). In this experiment, 10-fold cross validation
of the random forest model was performed; the resulting accuracy for different numbers
of randomly selected predictors is shown in
Fig. 2.
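A 10-fold cross-validation loop of the kind mentioned above can be sketched as follows. Only the index bookkeeping is shown; the model fitting and scoring inside each fold are omitted, and the function name `k_fold_indices` and the seed are assumptions.

```python
import random

def k_fold_indices(n, k=10, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]   # k nearly equal parts
    for i in range(k):
        validation = folds[i]
        training = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield training, validation

for fold, (train_idx, val_idx) in enumerate(k_fold_indices(2126)):
    # A model would be trained on train_idx and scored on val_idx here.
    print(fold, len(train_idx), len(val_idx))
```

Selecting the number of randomly chosen predictors would repeat this loop once per candidate value and keep the value with the best mean validation accuracy.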
The out-of-bag (OOB) error, together with the error for each class, is also evaluated
and shown in Fig. 3. The OOB error, also called the out-of-bag estimate, is a
method of measuring the prediction error of random forests. The error rate
is high when the number of trees is small and decreases as more trees are added. The
errors are almost constant once the number of trees reaches 300.
The number of nodes for the trees is shown in Fig. 4, which depicts the size of the
trees (number of nodes) in the ensemble and the relationship between tree size and the
corresponding terminal nodes.
Training and testing data sets are created by randomly separating the whole
data set into an 80-20 split without replacement. The random forest classifier is
trained on the training set, and the class labels of the testing set are predicted
by the trained classifier. The mean and standard deviation of precision, recall and
F-measure are reported for the training and testing data (Table 3). The random forest
classifier shows very good performance on the training data, achieving
large values of precision, recall and F-measure. The weighted averages of the values
are given in the last row of Table 3: precision, recall and F-measure are all
0.948 for the testing data. The Suspect class (S) shows the lowest precision (0.760)
and F-measure (0.826) of the three classes on the testing data. This is
understandable, since the experts themselves placed these cardiotocograms in the
suspect class. It is therefore easier for the classifier to confuse this class with
either the normal (N) class or the pathological (P) class.
Fig. 2 Randomly selected predictor vs Accuracy (cross validation)
Fig. 3 Classification error with increase of tree
Fig. 4 Number of nodes for trees
Table 3 Classification result for training and testing dataset; values are represented as mean
(standard deviation)
Class              Precision        Recall           F-Measure
                   Train   Test     Train   Test     Train   Test
Normal             0.999   0.979    0.999   0.967    0.999   0.973
Suspect            0.996   0.760    0.996   0.905    0.996   0.826
Pathologic         1.00    0.947    1.00    0.857    1.00    0.900
Weighted average   0.999   0.948    0.999   0.948    0.999   0.948
Table 4 shows the confusion matrix for one of the testing data sets. Most of the
Normal class is identified as Normal, whereas 4 cases of the suspect (S) class are
confused with the normal (N) class. Only 1 case of the pathologic (P) class is
confused with the normal class.
The overall classification accuracy is 94.8% for the testing data set (Table 3).
There are 21 features in the data set. Not all features are equally important
for the classification. Thus, it is necessary to study the impact of
Table 4 Confusion matrix for one of testing data set
Class        Normal   Suspect   Pathologic
Normal       146      4         1
Suspect      2        19        0
Pathologic   1        2         18
Fig. 5 Important variable among the 21 variables
features in the classification for all three classes. The 10 most important variables
based on Mean Decrease Accuracy and Mean Decrease Gini are shown in Fig. 5. The Mean
Decrease Accuracy of a variable is determined during the out-of-bag error calculation
phase. The more the accuracy of the random forest decreases when a variable is
excluded (or permuted), the more important that variable is considered to be.
Variables with a large mean decrease in accuracy are therefore more important for
classification. ALTV has the highest mean decrease in accuracy. The Mean Decrease Gini
is the average (mean) of a variable’s total decrease in node impurity, weighted by the
proportion of samples reaching that node in each decision tree of the random forest.
It is an effective measure of how important a variable is for estimating
the value of the target variable across all of the trees that make up the forest.
A higher Mean Decrease Gini indicates higher variable importance.
MSTV has the highest mean decrease in accuracy among the remaining variables.
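Both importance measures can be illustrated on toy data. In the sketch below, the permutation step mimics Mean Decrease Accuracy with a stand-in "model" that uses only feature 0, and `gini` computes the node impurity on which Mean Decrease Gini is based, evaluated here for the class counts of Table 2. The data, the model and the function names are hypothetical; in a random forest the accuracy drop is measured on the OOB cases of each tree and then averaged.

```python
import random

def accuracy(predict, X, y):
    """Fraction of rows whose predicted label equals the true label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(predict, X, y, feature, rng):
    """Drop in accuracy after shuffling one feature column."""
    baseline = accuracy(predict, X, y)
    column = [row[feature] for row in X]
    rng.shuffle(column)
    X_perm = [row[:feature] + [v] + row[feature + 1:] for row, v in zip(X, column)]
    return baseline - accuracy(predict, X_perm, y)

def gini(counts):
    """Gini impurity of a node given its per-class case counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(200)]   # toy data, two features
y = [int(row[0] > 0.5) for row in X]                     # label depends on feature 0 only

def predict(row):
    """Stand-in model that looks at feature 0 and ignores feature 1."""
    return int(row[0] > 0.5)

imp0 = permutation_importance(predict, X, y, 0, rng)
imp1 = permutation_importance(predict, X, y, 1, rng)
root_gini = gini([1655, 295, 176])   # impurity before any split (Table 2 counts)
print(round(imp0, 3), imp1, round(root_gini, 4))
```

Shuffling the ignored feature leaves the accuracy unchanged, so its importance is zero, while shuffling the informative feature produces a large drop.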
The partial dependence on an important variable is shown in Fig. 6. A partial
dependence plot provides a graphical representation of the marginal effect of a
variable on the class probability. Negative values (on the y-axis) indicate that the
positive class is less likely for that value of the independent variable (x-axis)
according to the model. Similarly, positive values indicate that the positive class is
more likely for that value of the independent variable. Zero
implies no average impact on class probability according to the model.
Fig. 6 Partial dependencies on ASTV
Table 5 Comparison with previous works
References             Method                                  Reported performance
Sundar et al. [27]     Neural network                          Precision (0.91), Recall (0.90) and F-Measure (0.90)
Jezewski et al. [11]   LSVM classifier                         Sensitivity (83%), Specificity (92%)
Chen et al. [28]       FG-Kmeans                               Precision (0.76), Recall (0.81), F-measure (0.77)
Cruz et al. [30]       META-DES Ensemble Classifier            Overall accuracy 84.6%
Arif [20]              Random Forest (Full Features)           Precision, Recall and F-measure are 0.936; overall accuracy 93.6%
Zhou and Sun [29]      Active learning of Gaussian Processes   Overall accuracy 89% (small training dataset of 140 examples only)
Chamidah [21]          1. SVM                                  76.72%
                       2. K-Means+SVM                          90.64%
Zhang [23]             PCA and AdaBoost                        Overall accuracy 93%
Proposed method        Random Forest (Full Features)           Precision, Recall and F-measure are 0.948; overall accuracy 94.8%
The proposed work is compared with previous works in
Table 5. In this study, the whole dataset is partitioned into 80% (training set) and
20% (testing set). The classification accuracy is reported as the average value of 10
independent runs. It can be concluded that the overall classification accuracy is
higher than that of the previous results.
4 Conclusions
The cardiotocograms (CTG) were classified by three expert obstetricians. The data set
was collected from the Maternity and Gynecological Clinic (University Hospital of
Porto in Portugal) (Ayres-de-Campos, Bernardes et al. 2000 [32]). The performance
of the random forest classifier is analyzed using three performance
measures, Precision, Recall and F-measure, to distinguish the pathological and
suspicious conditions of the fetus from the normal condition. The dataset is
partitioned randomly into training and testing datasets (80% for training and 20%
for testing). As the classifier is stochastic, ten-fold cross validation is used
with the 80%-20% split of the CTG dataset. The proposed technique achieves a
classification accuracy of 94.8% when the complete feature set is supplied to the
classifier. The classifier performance has also been evaluated in terms of precision,
recall and F-measure, which are all 0.948.
Acknowledgements The authors would like to acknowledge the great support of the Faculty of
Electrical & Electronics Engineering and Universiti Malaysia Pahang, Malaysia.
References
1. Macones GA, Hankins GD, Spong CY, Hauth J, Moore T (2008) The 2008 National Institute
of Child Health and Human Development workshop report on electronic fetal monitoring:
update on definitions, interpretation, and research guidelines. J Obstetric Gynecol Neonatal
Nurs 37(5):510–515
2. Ugwumadu A (2013) Understanding cardiotocographic patterns associated with intrapartum
fetal hypoxia and neurologic injury. Best Practice Res Clin Obstetric Gynaecol 27(4):509–536
3. Chen HY, Chauhan SP, Ananth CV, Vintzileos AM, Abuhamad AZ (2011) Electronic fetal
heart rate monitoring and its relationship to neonatal and infant mortality in the United States.
Am J Obstetric Gynecol 204(6):491–501
4. Lees C, Marlow N, Arabin B, Bilardo CM, Brezinka C, Derks JB, Wolf H (2013) Perinatal
morbidity and mortality in early-onset fetal growth restriction: cohort outcomes of the trial of
randomized umbilical and fetal flow in Europe (TRUFFLE). Ultrasound Obstetric Gynecol 42
(4):400–408
5. Carbonne B, Langer B, Goffinet F, Audibert F, Tardif D, Le Goueff F (1997) Multicenter
study on the clinical value of fetal pulse oximetry. Am J Obstetric Gynecol 177(3):593–598
6. Spencer JA (1993) Clinical overview of cardiotocography. BJOG Int J Obstetrics Gynaecol
100(9):4–7
7. Grivell RM, Alfirevic Z, Gyte GM, Devane D (2010) Antenatal cardiotocography for fetal
assessment. Cochrane Database Syst Rev 1
8. Brown R, Wijekoon JH, Fernando A, Johnstone ED, Heazell AE (2014) Continuous objective
recording of fetal heart rate and fetal movements could reliably identify fetal compromise,
which could reduce stillbirth rates by facilitating timely management. Med Hypotheses 83
(3):410–417
9. Czabanski R, Jezewski J, Matonia A, Jezewski M (2012) Computerized analysis of fetal heart
rate signals as the predictor of neonatal acidemia. Expert Syst Appl 39(15):11846–11860
10. Georgieva A, Payne SJ, Moulden M, Redman CW (2013) Artificial neural networks applied
to fetal monitoring in labour. Neural Comput Appl 22(1):85–93
11. Jezewski M, Wrobel J, Labaj P, Leski J, Henzel N, Horoba K, Jezewski J (2007) Some
practical remarks on neural networks approach to fetal cardiotocograms classification. In:
IEEE 29th annual international conference of the engineering in medicine and biology society
(EMBS 2007), pp 5170–5173
12. Karabulut EM, Ibrikci T (2014) Analysis of cardiotocogram data for fetal distress
determination by decision tree based adaptive boosting approach. J Comput Commun 2
(09):32–37
13. Czabanski R, Jezewski M, Wrobel J, Horoba K, Jezewski J (2008) A neuro-fuzzy approach to
the classification of fetal cardiotocograms. In: 14th nordic-baltic conference on biomedical
engineering and medical physics. Springer, Heidelberg, pp 446–449
14. Menai MEB, Mohder FJ, Al-mutairi F (2013) Influence of feature selection on naïve Bayes
classifier for recognizing patterns in cardiotocograms. J Med Bioeng 2(1):66–70
15. Breiman L (2001) Random forests. Mach Learn 45(1):5–32
16. Kandaswamy KK, Chou KC, Martinetz T, Möller S, Suganthan PN, Sridharan S,
Pugalenthi G (2011) AFP-Pred: a random forest approach for predicting antifreeze proteins
from sequence-derived properties. J Theo Biol 270(1):56–62
17. Gray KR, Aljabar P, Heckemann RA, Hammers A, Rueckert D (2013) Random forest-based
similarity measures for multi-modal classification of Alzheimer’s disease. Neuroimage
65:167–175
18. Ozcift A (2012) Enhanced cancer recognition system based on random forests feature
elimination algorithm. J Med Syst 36(4):2577–2585
19. Arif M, Bilal M, Kattan A, Ahamed SI (2014) Better physical activity classification using
Smartphone acceleration sensor. J Med Syst 38(9):1–10
20. Arif M (2015) Classification of cardiotocograms using random forest classifier and selection
of important features from cardiotocogram signal. Biomater Biomech Bioeng 2:173–183
21. Chamidah N, Wasito I (2015) Fetal state classification from cardiotocography based on
feature extraction using hybrid K-Means and support vector machine. In: 2015 international
conference on advanced computer science and information systems (ICACSIS), pp 37–41
22. Yılmaz E (2016) Fetal state assessment from cardiotocogram data using artificial neural
networks. J Med Biol Eng 36(6):820–832
23. Zhang Y, Zhao Z (2017) Fetal state assessment based on cardiotocography parameters using
PCA and AdaBoost. In: 2017 10th international congress on image and signal processing,
BioMedical engineering and informatics (CISP-BMEI), pp 1–6
24. Permanasari AE, Nurlayli A (2017) Decision tree to analyze the cardiotocogram data for fetal
distress determination. In: 2017 international conference on sustainable information
engineering and technology (SIET), pp 459–463
25. Yılmaz E, Kılıkçıer Ç (2013) Determination of fetal state from cardiotocogram using
LS-SVM with particle swarm optimization and binary decision tree. Comput Math Methods
Med 1–8
26. Beatrijs H, GuidOei S, Bovendeerd PHM (2013) Simulation of reflex late decelerations in
labor with a mathematical model. Early Human Dev 89(1):7–19
27. Sundar C, Chitradevi M, Geetharamani G (2013) An overview of research challenges for
classification of cardiotocogram data. J Comput Sci 9(2):198–206
28. Chen X, Ye Y, Xu X, Huang JZ (2012) A feature group weighting method for subspace
clustering of high-dimensional data. Pattern Recogn 45(1):434–446
29. Zhou J, Sun S (2014) Active learning of Gaussian processes with manifold-preserving graph
reduction. Neural Comput Appl 25(7–8):1615–1625
30. Cruz RM, Sabourin R, Cavalcanti GD, Ren TI (2015) META-DES: a dynamic ensemble
selection framework using meta-learning. Pattern Recogn 48(5):1925–1935
31. UCI Machine Learning Repository, 13 March 2019. http://archive.ics.uci.edu/ml/datasets/
Cardiotocography
32. Ayres-de Campos D, Bernardes J, Garrido A, Marques-de-Sá J, Pereira-Leite L (2000)
SisPorto 2.0: a program for automated analysis of cardiotocograms. J Mater Fetal Med
9(5):311–318
33. Patel S (2017) Random Forest Classifier, 18 May 2017. Chapter 5
FPGA Implementation of Sensor Data
Acquisition for Real-Time Human Body
Motion Measurement System
Zarina Tukiran, Afandi Ahmad, Herdawatie Abd. Kadir,
and Ariffudin Joret
Abstract In most sensor-based human body motion measurement systems, a
microcontroller and a general-purpose unit are used to acquire and process the sensor
data. These processing devices, however, have some limitations in obtaining data in
parallel, especially from various sensors. This paper discusses the
use of an FPGA as a processing device to acquire real-time sensor data from various
sensors concurrently. An architecture for real-time sensor data acquisition is
proposed that exploits the parallelism of an FPGA. The architecture is also designed to
stream the sensor data from the FPGA to the host. This paper also investigates the
performance of the proposed architecture in terms of FPGA resource usage
and speed for various optimisation techniques. The implementation
results show that the synthesis optimisation technique affects the overall FPGA
performance. In addition, the experimental findings show promising results
for implementing a state-of-the-art FPGA-based human body motion measurement system.
Keywords Sensor data acquisition · Body motion measurement · FPGA
1 Introduction
In human motion analysis, most researchers have focused on acquiring sensor data with
a microcontroller and processing the sensor data with a general-purpose
unit [1–8]. A field programmable gate array (FPGA) is another type of processing
unit that can be used to process data obtained from the sensor. Some of the FPGA
Z. Tukiran (corresponding author) · A. Ahmad · A. Joret
Microelectronics and Nanotechnology Shamsuddin Research Centre (MINT-SRC), Institut
Kejuruteraan Integrasi (I2E), Universiti Tun Hussein Onn Malaysia, Johor, Malaysia
e-mail: zarin@uthm.edu.my
H. Abd.Kadir
Advanced Mechatronics Research Group (ADMIRE) Focus Group, Faculty of Electrical and
Electronic Engineering, Universiti Tun Hussein Onn Malaysia, Johor, Malaysia
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_26
advantages over a microcontroller are the ability to perform parallel computing and
fast real-time operation [9]. In this work, multiple wearable sensors were mounted
on the human lower limb to measure the motion. Since sensor data must be acquired
from multiple sensors while other tasks are performed at the same time,
an FPGA is chosen as the processing device.
As illustrated in Fig. 1, the proposed architecture of the real-time human body
measurement system consists of three (3) main units: sensing, processing, and
displaying the measurement data. In this study, the sensing unit uses four
(4) tri-axial accelerometer sensors to measure the lower body movement of the left
and right shanks and the left and right thighs. These sensors are connected to the
processing unit, the FPGA board, via the FPGA analogue I/O pin connectors. The
processing unit contains two main modules: the ResDAQ and joint measurement
modules. The ResDAQ module acquires the sensor data in
real time, whilst the latter module processes the sensor data to obtain the results of joint
Fig. 1 Proposed architecture
angle before they are streamed to the host via a fast Ethernet cable. At the host, the
GUI module displays the measurement results to the user. The GUI module is also
programmed to save the measurement results for future reference.
This study uses LabVIEW FPGA 2011 to implement the ResDAQ and joint
measurement modules, whilst LabVIEW 2011 Service Pack 1 (SP1) is used to
implement the GUI module. However, this paper focuses only on the implementation of
the ResDAQ module, as discussed in Sect. 2. Section 3 discusses the implementation and
experimental findings. Section 4 presents the conclusion and future
works.
2 Proposed FPGA-Based Sensor Data Acquisition
2.1 Hardware Configuration Between the Sensors, FPGA Board and Host
The proposed real-time sensor data acquisition is designed and implemented in two
(2) phases: hardware and software. The hardware phase involves configuring the
physical connection between the sensors, the FPGA and the host. This physical
connection is needed for streaming sensor data to the host via the FPGA board. The
software phase involves programming the acquisition and processing of sensor data on
the FPGA board.
In this work, as depicted in Fig. 1, the hardware configuration has two (2) parts:
(i) the configuration between the FPGA board and the personal computer (PC), and
(ii) the configuration between the sensors and the FPGA board. The FPGA board and
the host are connected via an Ethernet cable plugged into the RJ-45
ports of the FPGA board and the host. The configuration between the FPGA and the host
is performed automatically by the Measurement & Automation (MAX) software [10].
The sensors are connected to the FPGA board via the FPGA analogue I/O pin connectors.
Since the FPGA board supplies 5 V and the sensors use 3.3 V, a
voltage regulator (LM117-T) is used to reduce the 5 V supply to 3.3 V. The
overall physical connections are shown in Fig. 2. The pin connections
between the FPGA board, the voltage regulator and the sensors are listed in
Table 1.
Fig. 2 The physical connection between the sensors and FPGA board
Table 1 Configuration between the FPGA, voltage regulator and sensors pin connector
FPGA I/O analogue pin connector   Voltage regulator pin connector   Sensor pin connector
–                                 OUT                               ACC
5 V                               IN                                –
AI GND                            GND                               GND
AI0–AI3                           –                                 X
AI4–AI7                           –                                 Y
AI8–AI11                          –                                 Z
2.2 Implementation of ResDAQ Module
In this work, the main task of the ResDAQ module is to acquire, filter and calibrate
sensor data in real time. The sensor data are obtained from multiple sensors
mounted on the human body. A second-order Butterworth filter
removes unwanted components from the signal. Conversion to the output voltage and
calibration are performed before the data are processed for the next task or
streamed to the host.
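As an illustration of the filtering step, a second-order Butterworth filter can be written as a single biquad section. The sketch below assumes a low-pass response designed with the bilinear transform; the 5 Hz cutoff and 100 Hz sample rate are hypothetical values, and the floating-point Python code is not the fixed-point LabVIEW FPGA implementation used in this work.

```python
import math

def butterworth_lowpass_coeffs(cutoff_hz, sample_hz):
    """Second-order Butterworth low-pass biquad via the bilinear transform."""
    k = math.tan(math.pi * cutoff_hz / sample_hz)
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    b0 = k * k * norm
    b = (b0, 2.0 * b0, b0)                       # feed-forward coefficients
    a = (2.0 * (k * k - 1.0) * norm,             # a1
         (1.0 - math.sqrt(2.0) * k + k * k) * norm)  # a2
    return b, a

def filter_samples(samples, b, a):
    """Direct form I: y[n] = b0 x[n] + b1 x[n-1] + b2 x[n-2] - a1 y[n-1] - a2 y[n-2]."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out

b, a = butterworth_lowpass_coeffs(cutoff_hz=5.0, sample_hz=100.0)  # assumed values
step = filter_samples([1.0] * 200, b, a)
print(round(step[-1], 4))
```

A constant input passes through with unit gain once the filter has settled, which is a quick sanity check on the coefficients.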
According to [11], the output voltage of the accelerometer sensor is related to the
acceleration of a particular axis by the relationship in Eq. (1).
\[ V_{\text{offset}} + S A_i = V_{\text{out}} \tag{1} \]
where V_out is the output voltage of the accelerometer, V_offset is the offset of the
accelerometer at 0 g, S is the sensitivity of the accelerometer in volts per unit of
acceleration (volts per g when A_i is expressed in g), and A_i is the acceleration of a
particular axis in g. Thus, the acceleration is determined as in Eq. (2), which is
then applied to design and implement the ResDAQ module on the FPGA platform using
LabVIEW FPGA 2011 via an FPGA VI.
\[ (V_{\text{out}} - V_{\text{offset}}) / S = A_i \tag{2} \]
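Equation (2) amounts to one subtraction and one division per axis. The sketch below shows the conversion with hypothetical calibration constants (a 1.65 V offset and a sensitivity of 0.33 V per g); the actual values depend on the accelerometer used and are not given in the text.

```python
def acceleration_g(v_out, v_offset, sensitivity):
    """Eq. (2): acceleration in g from the accelerometer output voltage."""
    if sensitivity == 0:
        raise ValueError("sensitivity must be non-zero")
    return (v_out - v_offset) / sensitivity

V_OFFSET = 1.65      # volts at 0 g (hypothetical)
SENSITIVITY = 0.33   # volts per g (hypothetical)

readings = {"X": 1.65, "Y": 1.98, "Z": 1.32}   # example output voltages
accel = {axis: acceleration_g(v, V_OFFSET, SENSITIVITY) for axis, v in readings.items()}
print(accel)
```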
The ResDAQ module is also designed to transfer the data from the
FPGA to the host. LabVIEW FPGA provides two (2) methods of
data transfer between the FPGA and the host: (i) FPGA host interface front panel
controls and indicators (FPCIs) and (ii) FPGA host interface FIFOs.
The FPGA host interface has registers for the top-level FPGA VI controls and
indicators. These registers are created by LabVIEW FPGA and are accessible to
the host via the FPGA host interface [12]. Figure 3 illustrates the implementation of
the ResDAQ module with FPCIs.
The FPGA host interface FIFOs, in contrast, use DMA to buffer and transfer data to the
host system memory at high speed with little processor involvement [12]. This is a
more efficient mechanism than front panel controls and indicators for sending large
blocks of data. The FPGA host interface FIFOs are a unidirectional transfer
mechanism and can be configured to transfer host-to-FPGA or FPGA-to-host. The
implementation of the ResDAQ module with DMA is illustrated in Fig. 4.
Fig. 3 ResDAQ module with FPCIs interfacing method
Fig. 4 ResDAQ module with FIFOs interfacing method
3 Implementation Results
3.1 Results on FPGA Resources and Performance on the Implementation of ResDAQ
LabVIEW FPGA VI provides two (2) Xilinx settings for synthesis optimisation
upon compilation: (i) speed (SS) and (ii) area (SA). The synthesis optimisation
setting governs how the G-code is translated into hardware circuitry. The LabVIEW FPGA
Module uses speed as the default optimisation technique. Once the optimisation
technique is selected, the Xilinx compiler performs the compilation for
the targeted FPGA device. When the process is complete, a report containing
the FPGA resource usage and the maximum frequency is
generated.
In this study, the ResDAQ with FPCIs and the ResDAQ with FIFOs are compiled with both
Xilinx settings for synthesis optimisation. The motivation
is to investigate the impact of the Xilinx synthesis optimisation settings on the
ResDAQ architecture. Two parameters are selected to evaluate the performance of
the proposed ResDAQ architecture: FPGA resource usage and maximum
frequency.
Based on Table 2, in the design of ResDAQ with FPCIs, synthesis optimisation by area (SA) reduces the usage of FPGA resources by approximately 1% (from 24.1% to 23.4% of total slices), while the maximum frequency decreases by 1.38 MHz (from 41.91 MHz to 40.53 MHz). Conversely, when the same design is optimised for speed (SS), the usage of FPGA resources increases by 0.7% and the maximum frequency improves by approximately 1.4 MHz. These findings show that the optimisation method offers a trade-off between area and speed in overall FPGA performance.

Table 2 Comparison of the usage of FPGA resources and speed

| FPGA performances | FPCIs, SS | FPCIs, SA | FIFOs, SS | FIFOs, SA |
|---|---|---|---|---|
| (A) Usage of FPGA resources: total slices (%) | 24.1 | 23.4 | 25.3 | 23.5 |
| (B) Maximum frequency (MHz) | 41.91 | 40.53 | 40.88 | 40.78 |

Note SS: synthesis optimization by speed, SA: synthesis optimization by area, FPCIs: ResDAQ with FPCIs, FIFOs: ResDAQ with FIFOs

Table 3 Details on total slices of FPGA resources usage

| FPGA performances | FPCIs, SS | FPCIs, SA | FIFOs, SS | FIFOs, SA |
|---|---|---|---|---|
| Slice registers (%) | 9.7 | 8.7 | 10.4 | 10.4 |
| Slice LUTs (%) | 18.9 | 18.6 | 20.4 | 20.7 |
| Mult18X18s (%) | 92.5 | 92.5 | 92.5 | 92.5 |
| Block RAMs (%) | 0 | 0 | 5 | 5 |

Note SS: synthesis optimization by speed, SA: synthesis optimization by area, FPCIs: ResDAQ with FPCIs, FIFOs: ResDAQ with FIFOs
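The area versus speed trade-off can be checked directly from the Table 2 figures. The short Python sketch below is only an illustration: the dictionary layout and the function name `speed_vs_area` are assumptions introduced here, while the numbers are transcribed from Table 2.

```python
# Values transcribed from Table 2: (total slices in %, maximum frequency in MHz).
table2 = {
    "FPCIs": {"SS": (24.1, 41.91), "SA": (23.4, 40.53)},
    "FIFOs": {"SS": (25.3, 40.88), "SA": (23.5, 40.78)},
}

def speed_vs_area(design):
    """Return (extra slices in percentage points, extra MHz) of SS relative to SA."""
    ss_slices, ss_fmax = table2[design]["SS"]
    sa_slices, sa_fmax = table2[design]["SA"]
    return round(ss_slices - sa_slices, 2), round(ss_fmax - sa_fmax, 2)

for design in table2:
    d_slices, d_fmax = speed_vs_area(design)
    print(f"{design}: SS uses {d_slices} more points of slices and gains {d_fmax} MHz over SA")
```

For both designs the speed setting costs some area and gains some frequency, although the gain for the FIFOs design is much smaller than for the FPCIs design.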
Table 3 shows further details on total slices in terms of registers, lookup tables (LUTs), multipliers and block random access memory (RAM) usage for both designs, ResDAQ with FPCIs and ResDAQ with FIFOs.
Based on Table 3, under both optimisation methods (area and speed), the design with FIFOs uses more registers, LUTs and block RAMs for data storage than the design with FPCIs, by approximately 1%, 2% and 5%, respectively. However, multiplier usage is the same (92.5%) for both designs and both synthesis optimisation methods.
3.2 Measurement Results
Two (2) tri-axial accelerometers were used in this study. The outputs of both sensors were processed by the FPGA board, which in turn was connected to the computer by an Ethernet cable. The sampling frequency was 1 kHz. The two (2) sensors were mounted on simple Velcro straps and placed on the shank and thigh as shown in Fig. 5. Before measurement, the sensors were calibrated on a flat surface parallel to the ground so that both sensors have the same zero reference. The thigh and shank segments were assumed to lie in the same plane.
Fig. 5 Accelerometer sensors and goniometer placement
Table 4 Experimental results of 500 sample data

| Number of samples | Actual knee joint angle (degrees) | RMSE of actual vs. estimated measurement (degrees) | Mean of estimated measurement (degrees) | Standard deviation of estimated measurement (degrees) |
|---|---|---|---|---|
| 500 | 125 | 0.0959 | 125.0739 | 0.0610 |
A total of 500 samples of sensor data were collected during static motion with the knee flexed, over five (5) cycles. The collected data were saved in a .CSV file and processed offline using MS Excel 2016. To validate the joint angle estimated from the accelerometers, a goniometer was used to measure the actual knee angle.
The root mean square error (RMSE), in degrees, is used to quantify the difference between the actual and estimated measurements over the 500 samples. The calculated RMSE is small, approximately 0.1°, as shown in Table 4.
As shown in Table 4, the mean and standard deviation of the estimated measurements are also calculated from the 500 samples. The minimum and maximum of the accepted range of estimated data were then determined. Figure 6 shows how far the estimated measurements deviate from the calculated mean.
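The statistics reported in Table 4 can be reproduced with a few lines of Python. This is a minimal sketch, not the processing used in the study (which was done in MS Excel): the sample values in `estimated` are hypothetical, and the acceptance band of two standard deviations around the mean is an assumption, since the text does not state how the accepted range was chosen.

```python
import math
import statistics

def rmse(actual, estimated):
    """Root mean square error between paired actual and estimated values."""
    squared = [(a - e) ** 2 for a, e in zip(actual, estimated)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical estimated knee angles in degrees (the real data set has 500 samples).
estimated = [125.02, 125.11, 125.08, 124.99, 125.15]
actual = [125.0] * len(estimated)   # goniometer reading taken as the reference

mean = statistics.fmean(estimated)
std = statistics.stdev(estimated)
low, high = mean - 2 * std, mean + 2 * std   # assumed acceptance band, not from the paper

print(f"RMSE = {rmse(actual, estimated):.4f} deg, mean = {mean:.4f} deg, std = {std:.4f} deg")
print(f"assumed accepted range: {low:.4f} to {high:.4f} deg")

# Consistency check on Table 4: RMSE is roughly sqrt(bias^2 + std^2).
table4_check = math.hypot(125.0739 - 125.0, 0.0610)
```

The last line combines the bias (0.0739°) and the standard deviation (0.0610°) from Table 4, which gives a value close to the reported RMSE of 0.0959°.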
Fig. 6 Distribution of estimated measurement from the calculated mean
4 Conclusion
In conclusion, this study proposed the architecture of a real-time sensor data acquisition (ResDAQ) module on the FPGA platform to obtain data from multiple sensors in parallel. The proposed ResDAQ architecture also considered two (2) communication methods for transferring the sensor data from the FPGA to the host: FPCIs and FIFOs. The G-code of the proposed architecture is converted to a hardware circuit using two (2) synthesis optimisation methods: optimise for FPGA area (SA) and optimise for FPGA speed (SS).
The implementation findings show that the optimisation methods offer a trade-off in overall FPGA performance between area and speed. The experimental findings show that the measurement data produce a small RMSE of approximately 0.1°. These findings are promising for the use of the FPGA platform as a data acquisition and processing device in human body motion measurement applications. In addition, the optimisation method for FPGA speed (SS) is suitable for future work on the real-time measurement of human body motion.
References
1. Nwaizu H, Saatchi R, Burke D (2016) Accelerometer based human joints’ range of movement
measurement. In: 10th international symposium on communication systems, networks and
digital signal processing (CSNDSP). IEEE, Prague, pp 1–6
2. Kardos S, Balog P, Slosarcik S (2017) Gait dynamics sensing using IMU sensor array system.
Adv Electr Electron Eng 15(1):71–76
3. Wagner JF (2018) About motion measurement in sports based on gyroscopes and
accelerometers—an engineering point of view. Gyroscopy Navig 9(1):1–18
4. Ong ZC, Seet YC, Khoo SY, Noroozi S (2018) Development of an economic wireless human
motion analysis device for quantitative assessment of human body joint. Measurement
115:306–315
5. Zhang J, Cao Y, Qiao M, Ai L, Sun K, Mi Q, Zang S, Zuo Y, Yuan X, Wang Q (2018)
Human motion monitoring in sports using wearable graphene-coated fibre sensors. Sens
Actuators A 274:132–140
6. Taghavi N, Luecke GR, Jeffery ND (2018) A wearable body controlling device for
application of functional electrical stimulation. Sensors 18(4):1251
7. Zhang Y, Fei Y, Xu L, Sun G (2015) Micro-IMU-based motion tracking system for virtual
training. In: Chinese control conference (CCC 2015), Sept, pp 7753–7758
8. Tu Y, Liu L, Li M, Chen P, Mao Y (2018) A review of human motion monitoring methods
using wearable sensors. Int J Online Eng (iJOE) 14(10):168–179
9. Tripathi K, Narkhede P, Kottath R, Kumar V, Poddar S (2018) Design considerations of
orientation estimation system. In: 2016 5th international conference on wireless networks and
embedded systems. IEEE, pp 1–6
10. NI sbRIO-961x/963x/964x and NI sbRIO-9612XT/9632XT/9642XT National Instruments.
http://www.ni.com/pdf/manuals/375052c.pdf. Accessed 30 Sept 2019
11. Lee GX, Low KS, Taher T (2010) Unrestrained measurement of arm motion based on a
wearable wireless sensor network. IEEE Trans Instrum Meas 59(5):1309–1317
12. NI LabVIEW high-performance FPGA developer’s guide – recommended practices for
optimizing LabVIEW RIO applications. Rev No 1, 1 February 2014. http://download.ni.com/
pub/gdc/tut/labview_high_perf_fpga_v1.1.1.pdf. Accessed 30 Sept 2019
Pulse Modulation (PM) Ground Penetrating Radar (GPR) System Development by Using Envelope Detector Technique
Maryanti Razali, Ariffuddin Joret, M. F. L. Abdullah, Elfarizanis Baharudin, Asmarashid Ponniran, Muhammad Suhaimi Sulong, Che Ku Nor Azie Hailma Che Ku Melor, and Noor Azwan Shairi
Abstract Ground penetrating radar (GPR) equipment is used to detect objects embedded beneath the earth's surface. The system is based on the reflection of the electromagnetic wave produced by a dipole antenna. To obtain a clear GPR radargram, the output signal of the GPR antenna is processed using the envelope detector (ED) technique. In this study, the frequency range used in the GPR system simulation is from 0.07 to 0.08 GHz. The simulations were designed to scan for an iron object embedded in dry sandy soil at depths of 0, 10, 100, 500, 900, 1000 and 1500 mm. Based on the GPR radargrams, the only embedded object that cannot be detected in the simulation is the object embedded at 1500 mm. Comparison of the GPR radargrams produced with and without the envelope detector technique shows that the technique generates a radargram that displays embedded objects more clearly.
M. Razali A. Joret (&) M. F. L. Abdullah E. Baharudin A. Ponniran C. K. N. A. H. C. K. Melor
Faculty of Electrical and Electronics Engineering, Universiti Tun Hussein Onn Malaysia,
Parit Raja, Malaysia
e-mail: ariff@uthm.edu.my
M. S. Sulong
Faculty of Technical and Vocational Education, Universiti Tun Hussein Onn Malaysia,
Parit Raja, Malaysia
N. A. Shairi
Faculty of Electronic and Computer Engineering, Universiti Teknikal Malaysia Melaka,
Malacca, Malaysia
A. Joret M. S. Sulong
Internet of Things (IoT) Focus Group, Universiti Tun Hussein Onn Malaysia, Parit Raja,
Malaysia
A. Ponniran
Power Electronics Converters (PECs) Focus Group, Universiti Tun Hussein Onn Malaysia,
Parit Raja, Malaysia
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_27
Keywords Pulse modulation · GPR system · Dipole antenna · Envelope detector
1 Introduction
Ground Penetrating Radar (GPR) is a RADAR (Radio Detection and Ranging) system used to detect the presence of objects embedded beneath the earth's surface. The ability of the system to detect not only metal objects but also non-metal objects has been studied by many researchers [1–5]. According to Joret [4], the basic equipment of a GPR system consists of an electromagnetic wave signal transmission system (the transmitter system) and an electromagnetic wave signal receiving system (the receiver system), as shown in Fig. 1. The proposed transmitter equipment is an alternating current signal generator, while the receiver equipment is an oscilloscope. The GPR system equipment shown in Fig. 1 is known as a bistatic GPR system, in which two identical antennas are used, one as the electromagnetic signal transmitter and the other as the detector.
In a GPR system, the antenna used as the electromagnetic wave signal detector can be classified into two categories: wide band antennas and narrow band antennas. The classification depends on the value of the antenna's fractional bandwidth. If the value is 0.2 or greater, the antenna is classified as a wide band antenna; otherwise it is a narrow band antenna [4, 6, 7].
GPR systems usually use wide band antennas such as the microstrip patch antenna, horn antenna, frame antenna, monopole antenna, bow-tie antenna and circular disc monopole [4, 8–10]. According to Sato [11] and Ghafoor [12], wide band antennas used as GPR antennas are capable of providing a good GPR radargram. However, the wide band antenna is difficult to design because of its complex geometric shape compared to the narrow band antenna.
Antennas that produce electromagnetic radiation at low frequencies are essential for detecting objects embedded at greater depths. For example, Florian used an antenna operating at 500 MHz to study the maximum depth of embedded objects detectable by a GPR system [13]. Theoretically, referring to Daniels [14] and Joret et al. [15], the depth of an embedded object that can be detected by a GPR system is given by Eq. (1)
Fig. 1 Basic equipment of the GPR system [4]
$$ d = \frac{v t}{2\sqrt{\varepsilon_r}} \tag{1} $$
where $d$ is the depth in metres, $v$ is the velocity of the electromagnetic wave ($3 \times 10^8$ m/s), $t$ is the time taken by the electromagnetic wave signal to travel from the GPR system to the surface of the embedded object in the earth, and $\varepsilon_r$ is the relative permittivity of the medium in which the electromagnetic wave propagates.
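Equation (1) is easy to evaluate numerically. The sketch below is an illustration only: the function name is hypothetical, the relative permittivity of 4 is an assumed round value for dry sand rather than a figure from this paper, and the travel time is read as a two-way time, an interpretation suggested by the factor 2 in the denominator.

```python
import math

C = 3e8  # velocity of the electromagnetic wave in free space, m/s

def depth_m(travel_time_s, eps_r, v=C):
    """Depth from Eq. (1): d = v * t / (2 * sqrt(eps_r))."""
    return v * travel_time_s / (2.0 * math.sqrt(eps_r))

# Assumed example: 20 ns travel time in a medium with eps_r = 4.
print(f"{depth_m(20e-9, 4.0):.2f} m")
```

Because the depth scales with the inverse square root of the permittivity, the same travel time corresponds to a shallower object in a wetter, higher permittivity soil.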
This paper discusses the development of a GPR system simulation that uses a dipole antenna to detect an object embedded in sandy soil. The GPR system simulation was designed using CST Studio Suite software, while the antenna output signal was processed using MATLAB software. An envelope detector (ED) was used as the signal processing technique to recover the pulse signal from the pulse modulation signal of the antenna. The results show that the GPR radargram produced with ED based signal processing displays the embedded object more clearly than the radargram produced without it.
1.1 Antenna Input Signal
The production of an electromagnetic wave signal by an antenna requires an alternating current or pulse signal as the antenna input signal, as shown in Fig. 2. Figure 2(a) shows the Gaussian modulated pulse used as the input signal for a wideband antenna with a fractional bandwidth of 1, whereas Fig. 2(b) shows the input signal for a wideband antenna with a fractional bandwidth of 0.4 [4]. Based on Fig. 2, the antenna input signal is affected by the fractional bandwidth of the antenna: the higher the fractional bandwidth, the less ripple the signal contains. This directly affects the narrow band antenna, whose input signal will have more ripple.
Fig. 2 Antenna input signal: (a) Input signal for wideband antenna with value of 1 for its fractional bandwidth, (b) Input signal for wideband antenna with value of 0.4 for its fractional bandwidth
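The link between fractional bandwidth and ripple can be illustrated with a synthetic Gaussian modulated pulse. The sketch below is an assumption-laden illustration, not the excitation signal generated by CST: the envelope width is derived from the fractional bandwidth using a -6 dB reference level (a convention chosen here), and `count_ripples` is a hypothetical helper that counts carrier peaks higher than 10% of the maximum.

```python
import numpy as np

def gaussian_modulated_pulse(t, fc, frac_bw, ref_db=-6.0):
    """Cosine carrier at fc with a Gaussian envelope set by the fractional bandwidth."""
    ref = 10.0 ** (ref_db / 20.0)
    a = -(np.pi * fc * frac_bw) ** 2 / (4.0 * np.log(ref))  # envelope decay rate
    return np.exp(-a * t ** 2) * np.cos(2 * np.pi * fc * t)

def count_ripples(signal, level=0.1):
    """Count local maxima whose height exceeds `level` times the largest value."""
    mid = signal[1:-1]
    is_peak = (mid > signal[:-2]) & (mid >= signal[2:]) & (mid > level * signal.max())
    return int(is_peak.sum())

fc = 75e6                                   # centre frequency used in this study
t = np.linspace(-200e-9, 200e-9, 40001)     # time axis in seconds
for bw in (1.0, 0.4, 0.133):
    print(f"fractional bandwidth {bw}: {count_ripples(gaussian_modulated_pulse(t, fc, bw))} peaks")
```

As the fractional bandwidth falls, the envelope widens in time and more carrier cycles fit under it, which is the extra ripple attributed to the narrow band antenna.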
Compared to wideband antennas, narrow band antennas are easy to design, which is one of the reasons this type of antenna was chosen for this study. However, the GPR radargram obtained with a narrow band antenna will appear blurred because the input signal contains a lot of ripple. Commonly designed narrow band antennas include the dipole antenna, loop antenna, dish antenna and Yagi-Uda antenna [16].
The dipole antenna was selected as the GPR antenna in this study because it is simple in design, lightweight, easy to fabricate and economical [17]. The antenna consists of two cylindrical copper wires. As the operating frequency is inversely proportional to the antenna length, the dipole can be made to operate in the megahertz range by adjusting its length. At this frequency range, the radiated electromagnetic wave is said to be able to penetrate the soil to a depth of approximately 1–2 m.
1.2 Signal Processing Technique for PM GPR System
The use of a narrow band antenna together with amplitude modulation using a pulse modulation signal in the GPR system results in too much ripple in the input and output signals. This ripple can be minimized using signal processing techniques. One of the techniques used in GPR systems to process the signal is the envelope detector technique [4].
The distinctive feature of the amplitude modulation signal used in the PM GPR system is that its envelope contains the information signal. Referring to [4, 8, 18], the amplitude modulation signal $A(t)$ can be written as
$$ A(t) = A_c \cos(\omega_c t) + \mu A_m \cos(\omega_m t)\cos(\omega_c t) \tag{2} $$
where $\mu$ is the modulation index (a positive constant), $A_c$ is the amplitude of the carrier signal, $\omega_c$ is the angular frequency of the carrier signal, $A_m$ is the amplitude of the information signal, $\omega_m$ is the angular frequency of the information signal and $t$ is time. Figure 3(c) shows the amplitude modulation signal generated using Eq. (2) with $A_c$ and $A_m$ set to 1, $\mu$ set to 0.6, $\omega_c$ set to $0.6\pi$, $\omega_m$ set to $2\pi$ and $t$ measured from 0 to 2. The information signal used to derive this modulation signal is shown in Fig. 3(a).
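A minimal numerical sketch of the amplitude modulation signal is given below. It takes the standard form in which the information term multiplies the carrier cos(ω_c t), which is an interpretation of Eq. (2), and uses the parameter values quoted in the text as defaults. The function name `am_signal` is hypothetical.

```python
import numpy as np

def am_signal(t, a_c=1.0, a_m=1.0, mu=0.6, w_c=0.6 * np.pi, w_m=2.0 * np.pi):
    """Carrier plus information signal multiplied onto the carrier."""
    carrier = np.cos(w_c * t)
    return a_c * carrier + mu * a_m * np.cos(w_m * t) * carrier

t = np.linspace(0.0, 2.0, 2001)   # time axis from 0 to 2, as in the text
a = am_signal(t)
print(f"peak amplitude = {np.abs(a).max():.2f}")
```

With unit amplitudes and a modulation index of 0.6, the envelope swings between 0.4 and 1.6, so the magnitude of the signal never exceeds 1.6.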
Fig. 3 Generation of amplitude modulation signal: (a) Message signal, (b) Carrier signal, (c) AM signal [4]
Referring to the amplitude modulation signal in Fig. 3(c), the ED technique can be used to detect the information signal [4, 8]. There are three kinds of envelope detector techniques for retrieving the information signal from amplitude modulation signals: Asynchronous Full-Wave (AFW), Asynchronous Half-Wave (AHW) and Asynchronous Real Square Law (ARSL).
Fig. 4 Block diagram for AHW envelope detector technique (blocks: Input signal, Low pass filter, Thresholding, Output signal)
In this paper, the AHW type of envelope detector technique was used to detect the information signal, which is the pulse signal, from the amplitude modulation signal. Figure 4 shows the block diagram of the AHW envelope detector technique, while Fig. 5 shows an example of signal extraction from the amplitude modulation signal using this technique. Figure 5(c) shows the message signal recovered by the AHW envelope detector technique.
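The AHW processing chain can be sketched in Python as half-wave rectification followed by a low pass filter and a threshold. This is an illustrative sketch under stated assumptions rather than the MATLAB implementation used in the study: the moving-average filter, the scaling by π and the test signal (a Gaussian pulse on a 200 Hz carrier) are all choices made here for demonstration.

```python
import numpy as np

def ahw_envelope(signal, samples_per_cycle, threshold=0.0):
    """Asynchronous half-wave envelope detector: rectify, low-pass, threshold."""
    rectified = np.maximum(signal, 0.0)                      # keep positive half-cycles only
    kernel = np.ones(samples_per_cycle) / samples_per_cycle  # moving average over one carrier cycle
    smoothed = np.convolve(rectified, kernel, mode="same")   # simple low pass filter
    envelope = np.pi * smoothed                              # a half-wave rectified sinusoid averages to amplitude/pi
    envelope[envelope < threshold] = 0.0                     # thresholding stage
    return envelope

# Hypothetical test signal: a Gaussian pulse on a 200 Hz carrier sampled at 10 kHz.
t = np.arange(10000) * 1e-4
message = np.exp(-((t - 0.5) / 0.05) ** 2)
modulated = message * np.cos(2 * np.pi * 200 * t)
envelope = ahw_envelope(modulated, samples_per_cycle=50, threshold=0.05)
print(f"envelope peak {envelope.max():.3f} at t = {t[np.argmax(envelope)]:.3f} s")
```

Averaging over a whole number of carrier cycles removes the carrier ripple, and the threshold sets the low-level residue outside the pulse to zero.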
Fig. 5 Signal extraction information from amplitude modulation using AHW envelope detector
technique
2 Development of GPR System Simulation
Narrow band antennas are less popular in PM GPR systems because their input signal has high ripple, which causes the radar image, known as the GPR radargram, to be blurred. In this study, the antenna output signal generated by the GPR system simulation in CST software is exported to MATLAB software. The output signal of the antenna is then processed with the envelope detector technique to obtain the GPR radargram of the simulation. Besides the dipole antenna, the GPR system simulation uses dry sandy soil as the background material and an iron plate as the embedded object.
2.1 Antenna Design and GPR System Simulation Using CST Software
The operating frequency of the dipole antenna used in the GPR system simulation is 0.075 GHz. This frequency requires a dipole length of approximately 1500 mm with a radius of 50 mm. This length does not include the port distance of the antenna, which has been set to 200 mm. Figure 6 shows the dipole antenna developed in this study using CST software.
Fig. 6 Dipole antenna using CST software
Fig. 7 Simulation design of GPR system using CST software
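For orientation, the dimensions above can be compared with the ideal half-wave dipole length. The sketch below is a rough check under assumptions: it uses the free space wavelength and treats the overall antenna length as the 1500 mm dipole length plus the 200 mm port distance. The function names are hypothetical.

```python
C = 3e8  # velocity of the electromagnetic wave in free space, m/s

def wavelength_m(frequency_hz):
    """Free space wavelength."""
    return C / frequency_hz

def half_wave_length_m(frequency_hz):
    """Ideal half-wave dipole length, ignoring end effects and wire thickness."""
    return wavelength_m(frequency_hz) / 2.0

ideal = half_wave_length_m(75e6)
overall = 1.5 + 0.2                       # dipole length plus port distance, in metres
print(f"ideal half wave: {ideal:.2f} m, modelled overall length: {overall:.2f} m")
print(f"ratio: {overall / ideal:.2f}")
```

The modelled antenna comes out somewhat shorter than the ideal half wavelength. The text does not say whether this shortening comes from the thick 50 mm radius or from tuning the design in CST.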
After the dipole antenna was designed to obtain an appropriate reflection parameter (S11) of less than −10 dB at 0.075 GHz, the background model and the embedded object were added to model the GPR system simulation. The background is a block of dry sand measuring 3000 mm in length, width and height. The iron plate used as the embedded object measures approximately 800 mm × 800 mm × 400 mm in width, length and height, respectively. The schematic diagram of the GPR system simulation model presented in this study is shown in Fig. 7.
Fig. 8 Scanning direction procedure of the GPR system simulation
The scanning procedure was performed by running the GPR system simulation several times, once for each antenna position. A total of 16 antenna positions, referred to as GPR system scanning points, were defined over the background model. The movement of the scan point is shown in Fig. 8, where the distance from one antenna position to the next is about 162.5 mm.
Several GPR system simulations were carried out with the iron plate embedded in the dry sand at depths of 0, 10, 100, 500, 900, 1000 and 1500 mm. These simulations were performed to assess the effectiveness of the dipole antenna as the GPR antenna in detecting an embedded iron object. Figure 9 shows the GPR system simulation, using the dipole antenna developed in this study, for detecting the iron object embedded in dry sand at a depth of 1000 mm.
Fig. 9 Simulation of the GPR system in scanning object immersed in dry sand at 1000 mm depth using dipole antenna
2.2 GPR System Simulation Output Signal Processing
The antenna output signal calculated by the CST software in the GPR system simulation is exported to MATLAB software. The simulation frequency range was set from 0.07 GHz to 0.08 GHz. With this frequency range, the CST software produces a modulated Gaussian pulse as the information signal, with a sinusoidal carrier frequency of 0.075 GHz.
Next, to generate the GPR radargram of the GPR system simulation, the antenna output signal for each antenna position was arranged in a column, where the first position corresponds to the first column and so on. To obtain a clearer GPR radargram, the envelope detector technique was applied to these antenna output signals. The selected envelope detector technique is the Asynchronous Half-Wave (AHW) technique, whose block diagram is shown in Fig. 4.
Fig. 10 GPR system algorithm based on magnitude calculation
2.3 GPR System Reconstruction Image
To reconstruct the GPR system image, the output signals from the GPR system are arranged according to the antenna position in the scanning process. The antenna scanning position is set as the y-value of each signal, while the signal sample is set as the x-value, forming an $(x, y)$ position over the 16 unmodulated signals from the scan in Fig. 8. The output image $i$ is produced by the mapping procedure of Eq. (3)
$$ i(x, y) = [\, y_1(n, 1) \quad y_2(n, 2) \quad y_3(n, 3) \quad \ldots \,] \tag{3} $$
where $y_1, y_2, y_3, \ldots$ are the unmodulated signals, $x$ is the position along the image width, $y$ is the position along the image length and $n$ is the sample number of the output signal. The algorithm shown in Fig. 10 reconstructs the image through signal processing using Eq. (3).
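The mapping of Eq. (3) amounts to stacking one processed trace per antenna position as a column of the image. The sketch below illustrates that step. The function name `build_radargram`, the use of the magnitude of each trace and the normalisation to a peak of 1 are assumptions made for illustration, and the random traces stand in for the real antenna output signals.

```python
import numpy as np

def build_radargram(traces):
    """Stack one trace per antenna position as the columns of an image."""
    columns = [np.abs(np.asarray(trace, dtype=float)) for trace in traces]
    image = np.column_stack(columns)      # rows: time samples, columns: scan positions
    peak = image.max()
    return image / peak if peak > 0 else image

# Hypothetical data: 16 antenna positions, 100 time samples each.
rng = np.random.default_rng(0)
traces = [rng.standard_normal(100) for _ in range(16)]
image = build_radargram(traces)
print(image.shape)
```

Each column then corresponds to one of the 16 scanning points and each row to one time sample, which matches the axes used when reading the radargrams.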
3 Result and Discussion
3.1 Result of Dipole Antenna Design
Figures 11 and 12 show the input and output signals obtained from the simulation of the dipole antenna. Both signals are modulated Gaussian pulse signals.
The simulation result shown in Fig. 13 is the magnitude of the reflection parameter (S11) of the designed dipole antenna. The graph shows that the antenna can transmit signals effectively over the spectrum from 0.07 GHz to 0.08 GHz. At the selected centre frequency of 0.075 GHz, S11 is about −31.0955 dB. Based on the radiation pattern shown in Fig. 14, the dipole antenna has an omnidirectional radiation pattern with a gain of 2.03 dBi at this centre frequency.
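To put the S11 values in perspective, a reflection parameter in decibels can be converted to the fraction of incident power that is reflected at the antenna port. The sketch below uses the standard decibel relations. The function names are hypothetical.

```python
def reflected_power_fraction(s11_db):
    """Fraction of incident power reflected at the port."""
    return 10.0 ** (s11_db / 10.0)

def reflection_magnitude(s11_db):
    """Magnitude of the voltage reflection coefficient."""
    return 10.0 ** (s11_db / 20.0)

for s11_db in (-10.0, -31.0955):
    fraction = reflected_power_fraction(s11_db)
    print(f"S11 = {s11_db} dB: {100 * fraction:.3f}% of the power is reflected")
```

At the −10 dB design limit one tenth of the incident power is reflected, while at the centre frequency of the designed dipole the reflected fraction is below one thousandth.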
Fig. 11 Dipole antenna input signal
Fig. 12 Dipole antenna output signal
Fig. 13 S11 parameter of designed dipole antenna
Fig. 14 Radiation pattern of the designed dipole antenna
3.2 GPR System Simulation Result
According to the simulation outcome, embedded objects in dry sand soil at depths from 0 mm to 1000 mm were successfully detected and displayed in the GPR radargram. However, the embedded object at a depth of 1500 mm could not be detected. Figure 15(a–c) shows the GPR radargrams of the simulation with the embedded object at depths of 100, 500 and 1500 mm, respectively, processed without the envelope detector based technique, while Fig. 16(a–c) shows the radargrams processed with it. Figures 15 and 16 show that applying the envelope detector based technique to the antenna output signal produces a clearer GPR radargram and makes the embedded object easier to identify.
When the envelope detector technique is not used to process the signal from the GPR system, it is hard to identify the exact position of the embedded iron object from the reflection of the electromagnetic wave. Without the envelope detector technique, the GPR system can only display a vague radargram image, as can be seen in Fig. 15. In Fig. 16, the position of the object and the electromagnetic wave reflection can be seen clearly because the envelope detector has been applied. At depths of 100 mm and 500 mm, the embedded object detected by the GPR system appears at time samples from 3000 to 3500 and at scanning points 7 to 11. However, the depth of the embedded object cannot be determined in detail, and the size of the detected object differs slightly from the size set in the GPR system simulation.
Fig. 15 GPR radargram of the GPR system simulation with an embedded object in dry sand soil, processed without the ED based technique: (a) object at 100 mm depth, (b) object at 500 mm depth, (c) object at 1500 mm depth
Fig. 16 GPR radargram of the GPR system simulation with an embedded object in dry sand soil, processed with the ED based technique: (a) object at 100 mm depth, (b) object at 500 mm depth, (c) object at 1500 mm depth
For validation, before the GPR system simulation was run with an embedded object, it was run over the dry sand area with no embedded object. This reference simulation is used to distinguish whether or not the system detects the presence of an embedded object in the dry sand area. Figure 17 shows the GPR radargram of the GPR system simulation with no embedded object.
Fig. 17 GPR radargram of the GPR system simulation without embedded object: (a) without ED based technique, (b) with ED based technique

Table 1 Depth of the scanned embedded object using dipole antenna

| Metal depth (mm) | With envelope detector | Without envelope detector |
|---|---|---|
| 0 | ✓ | ✓ |
| 10 | ✓ | ✓ |
| 100 | ✓ | ✓ |
| 500 | ✓ | ✓ |
| 900 | ✓ | ✓ |
| 1000 | ✓ | ✓ |
| 1500 | ✗ | ✗ |
Based on the GPR radargram in Fig. 17, it can be concluded that this GPR system can only detect the presence of the embedded object in the dry sand area at depths of up to 1000 mm. If the depth exceeds 1000 mm, the GPR radargram shows no sign of the embedded object, as the image has almost the same pattern as the GPR radargram without an embedded object. In this simulation, there are two possible explanations when the depth exceeds 1000 mm: either there is no embedded object in the dry sand area, or the electromagnetic wave signal of the GPR system cannot penetrate deep enough into the dry sand to detect the embedded object.
Table 1 shows the scanning results of the GPR system using the simulation model designed in this study, including the position of the embedded object in the dry sand area. The embedded object is detected by the GPR system at depths of 0, 10, 100, 500, 900 and 1000 mm. The embedded object at a depth of 1500 mm cannot be detected by the GPR system, whether or not the envelope detector technique is used.
4 Conclusion
The GPR system simulation was designed using a dipole antenna. The use of this antenna, which is a narrow band antenna, as a GPR antenna has not received much attention because the GPR radargram produced is not smooth. Apart from replacing the narrow band antenna with a wide band antenna, the GPR radargram can be smoothed by applying an envelope detector based signal processing technique to the antenna output signal, which contains high ripple.
Acknowledgement This paper acknowledges the funding from UTHM under the internal Postgraduate Research Grant (GPPS) Scheme, Vot No. H403. The experimentation and testing were done at the UTHM research project laboratory.
References
1. Daniels JJ (2000) Ground penetrating radar fundamentals, pp 1–21
2. Baker GS, Jordan TE, Pardy J (2007) An introduction to ground penetrating radar (GPR). In:
Special paper 432 stratigraphic analysis using GPR, vol 2432, pp 1–18
3. Lai WWL, Derobert X, Annan P (2018) A review of ground penetrating radar application in
civil engineering: a 30-year journey from locating and testing to imaging and diagnosis.
NDT E Int 96:58–78
4. Joret A (2018) Modulation techniques for GPR system radargram module technique GPR
system radargram, p 283
5. Jazayeri S, Saghafi A, Esmaeili S, Tsokos CP (2019) Automatic object detection using
dynamic time warping on ground penetrating radar signals. Expert Syst Appl 122:102–107
6. Breed G (2005) A summary of FCC rules for ultra wideband communications. High Freq
Electron 4(1):42–44
7. Wiesbeck W, Adamiuk G, Sturm C (2009) Basic properties and design principles of UWB
antennas. Proc IEEE 97(2):372–385
8. Carlson AB, Crilly PB, Rutledge JC (2002) Communication systems: an introduction to
signals and noise in electrical communication, 2nd edn. McGraw-Hill, New York
9. Sharif A, Chattha HT, Aftab N, Saleem R, Rehman S (2015) A tree shaped monopole antenna
for GPR applications, pp 3–5
10. Shebalkova LV, Markov MA, Romodin VB (2018) Broadband antenna for ground
penetrating radar application in soil. In: IOP conference series: earth and environmental
science, vol 134, no 1
11. Sato M, Yarovoy A (2008) GPR (ground penetrating radar) into real world 2. In:
Fundamentals of GPR 3 new technologies in GPR, p 4
12. Riaz MM, Ghafoor A (2012) Information theoretic criterion based clutter reduction for ground
penetrating radar. Progr Electromagnet Res 45:147–164
13. Florian F (2003) Introduction of a ground penetrating radar system, vol 14, pp 35–44
14. Daniels DJ (2004) Ground penetrating radar, 2nd edn. IET London, UK
15. Joret A, Sulong MS, Abdullah MFL, Madun A, Dahlan SH (2018) Design and simulation of
horn antenna using CST software for GPR system. In: Journal of physics: conference series,
vol 995, no 1
16. Zivkovic I, Scheffler K (2013) A new innovative antenna concept for both narrow band and
UWB applications. Progr Electromagnet Res 139:121–131
17. Wu D, Yin Y, Guo M, Shen R (2006) Wideband dipole antenna for 3G base stations, pp 454–457
18. Ziemer RE, Tranter WH (2014) Principles of communication systems, modulation, and noise.
Wiley, Hoboken
An Overview of Modeling and Control
of a Through-the-Road Hybrid Electric
Vehicle
M. F. M. Sabri, M. H. Husin, M. I. Jobli,
and A. M. N. A. Kamaruddin
Abstract Heavy reliance on fossil fuels poses a challenge to environmental
preservation, as hazardous by-products of fuel burning are released unchecked into the atmosphere. The introduction of hybrid electric vehicles (HEV) in the
transportation sector serves as a contemporary solution towards the realization of
emission-free vehicles of the future. In this paper, a Through-the-Road (TtR) HEV
configuration with in-wheel motors (IWM) fitted in the rear wheels is proposed and
tested in simulation over standard drive cycles. Due to its simpler configuration,
TtR HEV has a lower efficiency compared to other conventional HEVs but the
architecture also grants several redeeming features such as enhanced acceleration
and stability courtesy of its 4-wheel drive (4WD) setup. Further research is needed
to improve the TtR architecture so that its efficiency approaches that of
conventional HEVs. Modeling of the TtR HEV uses established
mathematical equations in MATLAB® using Simulink. This is achieved through a
modification of a power-split HEV model in Simulink into a TtR architecture
through the elimination of the planetary gear system, the addition of IWM to the
rear wheels and a slight modification of the EMS. The main objective of this
exercise is to develop a robust simulation platform for future works such as drivetrain optimization and development of energy management strategy
(EMS) controller. Simulation results have shown that the proposed TtR HEV is
capable of satisfying the driver’s demand with acceptable fuel consumption.
Keywords Hybrid electric vehicle · Through-the-road HEV · Robust simulation platform · Energy management strategy · 4-wheel drive
M. F. M. Sabri (corresponding author) · M. H. Husin · M. I. Jobli · A. M. N. A. Kamaruddin
Universiti Malaysia Sarawak, 94300 Kota Samarahan, Sarawak, Malaysia
e-mail: msmfaizrizwan@unimas.my
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_28
M. F. M. Sabri et al.
1 Introduction
1.1
HEV as Key for the Future of Transportation
The burning of fossil fuels is widely regarded as the main contributor to air pollution
and, subsequently, to global warming. The transportation
sector is among the biggest fossil fuel consumers and is one of the biggest producers
of greenhouse gases (GHG). These emissions comprise hazardous fumes such as nitrogen
oxides (NOx), carbon monoxide (CO), sulfur oxides (SOx), unburned hydrocarbons and other pollutants [1–3]. In an effort to lessen the catastrophic impact
on the environment and to achieve the 2 °C Scenario (2DS) advocated by the
Paris Agreement of 2015, industry leaders and researchers are actively striving to
cut fuel consumption and emissions towards realizing the zero-emission target
within this century [4].
In the wake of the technological and logistical challenges faced by
battery-powered electric vehicles (BEV) and fuel cell vehicles (FCV) [5–8], hybrid
electric vehicles (HEV) are thriving in a market that has yet to settle on the
best route to the zero-emission vehicles of the future. An HEV is a
type of vehicle that combines the traits of conventional vehicles powered by
internal combustion engines (ICE) and BEVs in a single package to deliver
uncompromised performance while producing fewer emissions [3, 9–11]. An HEV is
equipped with an ICE and one or more electric motors (EM) connected to the
vehicle’s final drive in a certain configuration. The driver’s request for speed and
power in an HEV is administered through a carefully designed energy management
strategy (EMS), which determines the optimal power delivery from the two energy
forms, fuel and electricity, to the wheels while taking their respective efficiency curves
into consideration [3, 9]. HEVs are the result of the synergy between mechanical,
electrical, electronic and power engineering, which work in sync to produce a
short-term solution for the global fuel consumption and emission problems.
In the current market, HEVs offer a better proposition in the competition for
market share in the green vehicle or energy-efficient vehicle (EEV) segment compared to BEVs and FCVs. Even though the data show that the BEV adoption
rate is on the rise, with more than two million BEVs sold by 2016, BEVs only managed a
0.2% share of the global passenger vehicle market, with FCVs almost a
non-factor [12, 13]. In terms of the core technology being used, HEVs have the
advantage of taking a midway approach among what is currently available on the
market. BEVs are hindered by expensive battery technology and the limited availability of
dedicated charging stations, whereas FCVs are stalled by immature and expensive
fuel cell technology and by the production and storage of hydrogen (H2), which
is essential for fuel cell operation [6, 8, 14].
An Overview of Modeling and Control of a TtR HEV …
1.2
Overview of Hybrid Electric Vehicle Architectures
There are generally three main types of HEVs—series, parallel and power-split,
which are distinguished by their source-to-wheel arrangement of the ICE-driven
mechanical path and the EM-driven electrical path. A series HEV has its
mechanical path and electrical path arranged in a serial configuration with the ICE
only used to spin the generator (GEN) to charge the battery pack that acts as the
secondary energy storage system (ESS) which is the power source for the EM. As
presented in Fig. 1, the EM is the only source for traction driving the wheels. Series
HEVs are very similar to BEVs but with the addition of a generator. As the ICE is
not connected to the final drive, series HEVs achieve great fuel economy by having
the ICE operating at its highest efficiency throughout its operations [3, 10, 11].
Parallel HEVs adopt the parallel arrangement of the two energy paths from their
sources to the wheel. As shown in Fig. 2, both the ICE and EM are connected to the
transmission through a mechanical torque coupling device which blends the torque
output from both sources before delivering it to the final drive. The coupling device
is also necessary to allow ESS recharging by diverting a portion of torque from the
ICE, but this process is only present when the vehicle is in motion. Parallel HEVs
offer a higher degree of flexibility for the choice of ICE and EM capacity compared
to the limited downscaling options in series HEV. However, series HEVs do not
require any mechanical coupling device and gearbox as EMs are generally high
revving and efficient over a wide range of speed [3, 10, 11].
Series and parallel HEVs possess a contrasting set of advantages and disadvantages as tabulated in Table 1.
Fig. 1 Power flow for series HEV
Fig. 2 Power flow for parallel HEV
Table 1 Series vs Parallel vs Power-split HEV: a comparison
Series HEV
Advantages:
• Simpler design
• Lesser component requirement
• ESS recharging always available
• Simpler control design
• Most suitable for city driving
Disadvantages:
• Less flexible component sizing
• Needs separate EM and GEN units
• ESS capacity tied to EM capability
• Unsuited for highway driving
Parallel HEV
Advantages:
• Flexible component sizing
• Needs only one EM/GEN unit
• ESS capacity not dictated by EM
• Most suitable for highway driving
Disadvantages:
• More complex design
• More component requirement
• More complicated control design
• ESS recharging only when moving
• Unsuited for city driving
Power-split HEV
Advantages:
• Flexible component sizing
• Needs only one EM/GEN unit
• ESS recharging always available
• ESS capacity not dictated by EM
• Suitable for all types of driving
Disadvantages:
• Most complex design requirement
• Requires the most components
• Most expensive implementation
• Most complicated control design
As vehicles should be designed to suit all types of driving conditions, a new
configuration combining series and parallel HEV was introduced. The aim is to
overcome the weaknesses of the individual designs and harness their strengths.
From Table 1, it can be observed that power-split HEVs solve most of the problems
for the previous two architectures but introduce new areas of concern of their own.
Figure 3 shows the power flow in a power-split HEV and it is very similar to a
parallel HEV but it uses a planetary gear system instead of a simple torque coupling
system. The planetary gear is a complex system that enables the HEV to operate as
both a series and parallel HEV at the same time. However, the inclusion of the
complex planetary gear system with its complex control requirements is also the
root cause of the perceived disadvantages of this configuration [3, 10, 11].
Fig. 3 Power flow for power-split HEV
The introduction of plug-in HEVs (PHEV) has elevated the
fuel-saving capabilities of HEVs to a whole new level. PHEVs are HEVs that can
be connected to the grid for direct ESS recharging like a BEV [3, 9, 15]. PHEVs
differ from standard HEVs in the hierarchy of their energy sources, where the battery
pack is the primary ESS instead. Due to the external charging capability, battery
packs in PHEVs are generally larger, with higher energy density, than those in standard
HEVs. PHEVs are also fitted with a bigger and more powerful electrical drivetrain
to match their large ESS. With these additions, PHEVs are capable of
all-electric range (AER) driving for a certain distance [3, 9]. The result is a
zero-emission vehicle with no fuel consumption for trips within the AER. This
breakthrough in HEV technology has attracted great interest from manufacturers,
with new models introduced every year. The impact of PHEVs and the technology
behind them will be one of the keys to unlocking the true potential of all HEV
configurations.
In this paper, the focus is on a particular HEV architecture called the through-the-road
(TtR) HEV. The TtR HEV is a derivative of the parallel HEV,
but the link between the mechanical and electrical paths is established through
contact with the road surface rather than through a mechanical torque coupling device
in the drivetrain [3, 9, 16, 17]. This architecture is explained further in the next section by
comparing it with the other configurations. The merits and
shortcomings of the TtR architecture are discussed based on recent publications
in the hope of evaluating the true potential of this architecture and promoting it as
the configuration of choice for accelerating the adoption of green vehicles.
2 Through-the-Road Hybrid Electric Vehicles
2.1
Synopsis on Concept and Design
TtR HEVs have no mechanical torque coupling device linking the mechanical path
and electrical path of the vehicle. To make up for the absence of the in-transmission
torque coupling mechanism, the link between the two drivetrains is established
externally through the road contact while the vehicle is in motion, hence the name
“through-the-road”. This unconventional coupling mechanism grants a simpler and
cheaper foundation for HEV implementation compared to any other configurations
[3, 9, 16–19].
Configuration-wise, a TtR HEV, also known as separate axle parallel HEV,
obtains propulsion power through two independent propulsion systems compared to
only one in conventional HEV. Taking advantage of the separate axle setup, instead
of a big chassis-mounted EM turning the rear axle, smaller and highly efficient
in-wheel motors (IWMs) are fitted in the rear wheels to provide power directly to
the wheels for minimal losses. The smaller IWMs also have the benefit of being
lighter than a conventional chassis-mounted EM, giving a TtR HEV a much-needed
advantage in terms of vehicle mass. The smaller size also means that
IWMs are theoretically gentler on the ESS. The space normally
occupied by the EM is now vacant and can accommodate a larger ESS, depending
on the budget allocation [3, 9]. For this paper, the design considerations are summarized in Fig. 4.
Fig. 4 Design considerations for the proposed TtR HEV
2.2
Advantages and Disadvantages of Retrofit TtR HEVs
Among the advantages of this configuration is the 4-wheel drive (4WD) capability,
which provides a higher level of stability to the vehicle and exceptional
acceleration. Next is the appeal of retrofitting conventional ICE vehicles to
transform them into HEVs. This property is an excellent motivation for
consumers to start embracing green vehicles at a reasonable cost, considerably
lower than that of buying a whole new vehicle. However, it is not an enticing
prospect for car manufacturers aiming to keep selling new vehicles unless radical
measures and policies are imposed [3, 9, 18].
One of the trade-offs for the simpler architecture is the lower efficiency of ESS
recharging compared to conventional HEVs: the extra torque needed to
recharge the ESS from the ICE is supplied externally through forced interaction
with the road surface, which is possible only while the vehicle is in motion and incurs large losses. Even
with the assistance of regenerative braking, the amount of energy that can be harvested internally is significantly less than what is possible with a conventional
HEV. The result is a much smaller window for optimum EMS operation and a
reduced electrical energy supply for the IWMs, which affects the HEV
performance target. Another drawback of the TtR architecture is that both of its axles are
constrained to spin at a matching speed that is always tied to the vehicle speed. In
a conventional HEV, the EM speed is not dictated by the vehicle speed, which allows the EM to
operate at its highest efficiency. However, as the road surface becomes the torque
coupling medium for the two drivetrains in a TtR HEV, this poses a problem because an EM
usually rotates at a higher number of revolutions per minute (RPM) than the ICE to
produce the same amount of power [3, 9, 16, 17].
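The speed constraint described above can be made concrete with the rigid-wheel relation between vehicle speed and wheel speed. The worked example below is only an illustrative sketch: the wheel radius r_w = 0.3 m is an assumed value that does not appear in this paper, tire slip is neglected, and the speed used is the HWFET average of 77.7 km/h from Table 3.

```latex
% Rigid-wheel relation (no tire slip assumed)
\omega_w = \frac{v}{r_w}, \qquad n_w = \frac{60}{2\pi}\,\omega_w

% Assumed wheel radius r_w = 0.3 m (hypothetical, not from the paper)
% v = 77.7 km/h = 21.58 m/s
\omega_w = \frac{21.58\ \text{m/s}}{0.3\ \text{m}} \approx 71.9\ \text{rad/s},
\qquad n_w \approx 687\ \text{RPM}
```

Under these assumptions, an IWM that drives the wheel directly turns at only a few hundred RPM at highway speed, and its operating speed cannot be moved to a more favourable point without changing the vehicle speed.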
One of the attractive features of a conventional HEV is the option to downsize the
ICE to further enhance its fuel-saving potential. However, with retrofitted TtR
HEVs, that option is unavailable as they are limited by existing mechanical
drivetrains which were not originally designed for HEV application [16]. This
puts TtR HEVs at a disadvantage in terms of fuel efficiency compared to natively
designed HEVs. It also leads to the practical limitation that, when the
state-of-charge (SOC) level is very low, the ICE can only be expected to recharge
the ESS enough to keep it at the lower threshold rather than replenishing it for
further hybrid mode operation [20]. These lingering issues with the TtR HEV
architecture need to be addressed, as they are just as important as the EMS side
of the system in ensuring a successful development process [3, 9, 16]. The pros and
cons of the architecture are summarized in Table 2.
In the next section, the modeling process for the proposed TtR HEV model is
shown. The process takes these considerations fully into account in an effort to
understand the behaviour of the vehicle in responding to the driver’s demand. It is
an important step in progressing further with the research because the results are
expected to show the strengths and weaknesses of the developed model. The simulation is also an opportunity to identify the areas that require further optimization
and can offer a performance boost.
Table 2 Pros and cons of TtR HEV architecture
TtR HEV
Advantages:
• Simplest design concept
• Cheapest cost of entry into HEV
• The least demanding component requirement
• ESS size not tied to IWM capability
• IWM is gentler to the ESS
• 4WD capability
Disadvantages:
• Component sizing only limited to EM and ESS
• ESS recharging only available while moving
• Lower efficiency/high operational loss
• Front and rear axles speed matching
3 Simulation Platform Modification and Setup
3.1
Design Considerations
The design of choice for the TtR HEV proposed in this research is a PHEV, in order
to maximize the EMS potential by using the external charging feature to provide the
best possible SOC window for optimal EMS operation [9, 19]. This design choice
consequently offsets the limited onboard ESS recharging capability of a
TtR HEV. The use of a deep-discharge, high energy density battery as
the ESS is also being considered to further enhance the EMS potential. The main
focus of this research is to synthesize an EMS controller capable of performing
favourably in a heavily modified HEV architecture given the best possible conditions. From there onwards, the controller will be optimized towards a more realistic
target using MATLAB® as the simulation tool.
In order to identify the challenges and the most suitable EMS for the proposed
TtR HEV, it is first important to take into consideration every possible operating
mode for a TtR HEV. By design, a direct ESS recharging mechanism by the ICE
is unavailable; therefore, recharging of the ESS is only achievable when the
vehicle is in motion. This design choice
also excludes operation modes exclusive to series HEVs. The possible
operating modes for the proposed TtR HEV model are:
1. Load obtains power from ICE alone
2. Load obtains power from IWM alone
3. Load obtains power from both ICE and IWM (hybrid mode)
4. Load returns power to ESS (regenerative braking)
5. Load obtains power from ICE and delivers power to ESS (TtR exclusive)
In this research, a deterministic rule-based strategy is used to carefully work
within these operating modes [9]. The power flow solution for the proposed
TtR HEV is as illustrated in Fig. 5.
Fig. 5 Power flow in the proposed TtR HEV
Fig. 6 Proposed TtR HEV model
3.2
Development of Simulation Model
The simulation platform for the TtR HEV is built in Simulink for efficient development. The proposed TtR HEV model is a modification of the original
series-parallel HEV model available in [21]. Lookup tables are used in
various parts of the model for quicker system response. The balance between model
fidelity and simulation speed is critical for efficient development. The vehicle model
and controllers are modeled in a single environment to enable system-level optimization. The modeling covers the electrical, mechanical, thermal
and control systems of the vehicle. The simulation is run in Simulink over
standard drive cycles.
The main modification needed for TtR HEV is the removal of the power split
device from the original model. By this removal, the ICE (mechanical path) and the
IWM (electrical path) now have direct connections to the front and rear wheels
respectively as shown in Fig. 6.
Fig. 7 TtR HEV architecture in Simulink
Fig. 8 Mode Logic for the EMS
As per Fig. 7, the final drive model is also modified into a 4WD configuration to
ensure the ICE is connected to the front wheels and the IWMs are connected to the
rear wheels through two input ports: Port “Conn1” for the IWMs directly to the rear
wheels and Port “Conn2” for the connection from the ICE to the front wheels. As the
IWMs are efficient over a wide range of speeds, they do not need a gearbox.
3.3
Energy Management for TtR HEV
The control system used to test the response of the proposed TtR HEV model is of the
rule-based type. It contains multiple proportional-integral (PI) controllers as well as
a controller block containing the rule-based EMS programmed in state-flow. PI
controllers are used in various parts of the main controller to make the system
iterate quickly. Figure 8 illustrates the rule-based EMS controller used here. Basic
rules are imposed on the system based on four inputs, namely current vehicle
speed, brake signal, current SOC level and current ICE speed, and the controller outputs three
switching signals controlling the ICE, IWM and GEN respectively.
Fig. 9 State-flow diagram for the rule-based EMS
The state-flow diagram is shown in Fig. 9. There are two main modes available,
brake mode and motion mode. Motion mode is further detailed into four
sub-modes. Start mode applies during the initial movement, where only the IWMs are
used because the ICE stall speed is yet to be exceeded. Once the ICE stall speed is
exceeded, the mode changes to the next sub-mode, the normal mode, where
the ICE is turned ON. The normal mode is further divided into cruise mode and
acceleration mode. These modes are used throughout the operation in response to the driver’s demand. Acceleration is done in hybrid mode with both the
ICE and IWMs supplying power to the wheels. In cruise mode, the GEN is switched OFF when the SOC is
high, but at the lower threshold of 30% the GEN is
switched ON to replenish the ESS. Brake mode is active when the brake pedal is pressed;
the ICE and IWM are turned OFF to allow the GEN to recover energy
through regenerative braking.
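The mode logic described above can be sketched as a small decision function. This is a minimal illustration in Python, not the authors' state-flow chart: the stall-speed value, the `accelerating` flag used to tell cruise from acceleration, and the choice to switch the IWMs off during cruise are all assumptions made here for the sake of a runnable example. Only the 30% SOC threshold and the ON/OFF behaviour in start, acceleration and brake modes come from the text.

```python
from dataclasses import dataclass

SOC_LOW_THRESHOLD = 30.0     # percent, lower SOC threshold stated in the text
ICE_STALL_SPEED_RPM = 800.0  # hypothetical value, not given in the paper


@dataclass(frozen=True)
class Switches:
    """The three switching signals output by the rule-based EMS."""
    ice_on: bool
    iwm_on: bool
    gen_on: bool


def ems_mode(brake_active, ice_speed_rpm, soc_percent, accelerating):
    """Return (mode name, switching signals) for the current inputs.

    `accelerating` is an assumed helper flag; the paper does not state
    how cruise and acceleration are distinguished.
    """
    if brake_active:
        # Brake mode: ICE and IWM off, GEN recovers energy.
        return "brake", Switches(ice_on=False, iwm_on=False, gen_on=True)
    if ice_speed_rpm <= ICE_STALL_SPEED_RPM:
        # Start mode: only the IWMs drive the vehicle.
        return "start", Switches(ice_on=False, iwm_on=True, gen_on=False)
    if accelerating:
        # Acceleration is done in hybrid mode: ICE and IWMs together.
        return "acceleration", Switches(ice_on=True, iwm_on=True, gen_on=False)
    # Cruise mode: GEN is switched on only at or below the SOC threshold.
    # Keeping the IWMs off here is an assumption of this sketch.
    gen_on = soc_percent <= SOC_LOW_THRESHOLD
    return "cruise", Switches(ice_on=True, iwm_on=False, gen_on=gen_on)


if __name__ == "__main__":
    print(ems_mode(False, 2500.0, 25.0, False))
```

Because the rules are evaluated in a fixed order (brake first, then start, then acceleration, then cruise), exactly one mode is active for any combination of inputs, which is the defining property of a deterministic rule-based strategy.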
The detailed modification and modeling process is reported in a separate publication [9], so it is not explained further in this paper.
Table 3 Drive cycles data
Name | Type | Distance | Average speed
Urban Drive Cycle (ECE R15) | Low speed, stop-go urban driving | 995 m | 18.4 km/h
Extra Urban Drive Cycle (EUDC) | High-speed highway driving | 6955 m | 62.6 km/h
New European Drive Cycle (NEDC) | Combined urban and high-speed highway driving | 11017 m | 33.6 km/h
Highway Fuel Economy Test (HWFET) | High-speed highway driving | 16503 m | 77.7 km/h
4 Simulation Results and Discussions
4.1
Experiment Setups
In this section, the proposed TtR HEV model is put to the test in simulations
of four standard drive cycles, which are
listed in Table 3.
These simulation runs give a brief picture of how the proposed TtR HEV
model would perform in real-life driving situations, in a controlled environment where
various performance indicators such as drivability, power flow, fuel consumption
and battery SOC are observed over the duration of the simulations. For this
research, the emphasis is put on the drivability and fuel economy of the TtR HEV to
ensure, first and foremost, that the proposed TtR HEV is capable of responding to the
driver’s demand while maintaining an acceptable level of fuel economy. The
initial SOC is set to the optimum value of 90% to ensure the best possible vehicle
performance without an SOC bottleneck, as a detailed battery management strategy
for the proposed TtR HEV requires separate research and has been identified as
one of the areas to be focused on in the future. The basic parameters of the
proposed TtR HEV used in this simulation are presented in Table 4. The
fuel consumption is calculated from the fuel flow rate (g/s) provided
by the ICE block, divided by the density of gasoline (750 kg/m³, equivalent to 750 g/L), times the
total time taken by the drive cycle in seconds, to obtain the total
fuel consumption in liters.
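The fuel consumption calculation described above can be sketched as follows. This is a minimal illustration that assumes the ICE block's fuel flow rate is available as a list of samples taken at a fixed time step; the function name and the sample trace are hypothetical. Summing the samples and multiplying by the time step is equivalent to "flow rate times total time" when the flow rate is constant, and it generalizes that description to a varying flow rate.

```python
GASOLINE_DENSITY_G_PER_L = 750.0  # 750 kg/m^3 from the text, i.e. 750 g/L


def fuel_volume_litres(flow_g_per_s, dt_s, density_g_per_l=GASOLINE_DENSITY_G_PER_L):
    """Convert a sampled fuel mass flow rate (g/s) into a fuel volume (L).

    flow_g_per_s: fuel flow samples in g/s (hypothetical logged signal)
    dt_s:         fixed sampling interval in seconds
    """
    # Rectangle-rule integration of mass flow gives total fuel mass in grams.
    total_mass_g = sum(flow_g_per_s) * dt_s
    # Dividing mass by density gives volume in litres.
    return total_mass_g / density_g_per_l


if __name__ == "__main__":
    # Hypothetical trace: a constant 0.5 g/s for 200 s, sampled every second.
    trace = [0.5] * 200
    print(round(fuel_volume_litres(trace, dt_s=1.0), 4))
```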
Table 4 TtR HEV parameters
Body
Mass: 1200 kg
Frontal area: 2.16 m²
ICE
Max. power: 114 kW
Speed at max. power: 5000 RPM
Max. speed: 6000 RPM
Fuel consumption: By speed and torque
IWM
Max. power: 30 kW
Max. torque: 400 Nm
ESS
Type: Li-Ion
Nominal voltage: 200 V
Rated capacity: 22 Ah
4.2
Results and Discussions
ECE R15 Drive Cycle
For the ECE R15 drive cycle, it can be observed in Fig. 10(a) that the TtR HEV
follows the speed demand with minor difficulties after the first acceleration. Figure 10(b) shows the power flow throughout the simulation, with the
heavy lifting done mostly by the IWMs and the ICE only activated during the last
section of the drive cycle due to the higher speed demand, as evidenced by the
spike in fuel consumption shown in Fig. 10(c). When the ICE is activated, a small
amount of energy is replenished by the GEN, as shown by the slight bump in the SOC
level, but as hypothesized, the amount of energy that can be recovered
through regenerative braking in a TtR HEV is limited. The
simulation result shows that the model performs acceptably on a
low-speed cycle. At the end of the simulation, the total fuel consumption is
0.1609 L and the final SOC level is 84.3%. Given that the
SOC level is still high at the end of the simulation and considering the short
trip distance, the IWMs could have been utilized more to save fuel. However,
in the current model, trip distance is not among the considerations for the rule-based
EMS; thus, that kind of fine adjustment is not possible unless a drastic
change is made to the EMS algorithm.
Fig. 10 Test run on ECE R15 drive cycle: (a) vehicle speed response, (b) power flow and SOC, (c) fuel consumption pattern
EUDC Drive Cycle
The EUDC drive cycle provides insight into the TtR HEV performance on higher speed
cycles. From Fig. 11(a), the proposed model appears to have no issue in responding
to the driver’s demand. The power flow plot in Fig. 11(b) shows the ICE as the
main contributor of power, with the IWMs assisting during accelerations. As
the ICE runs while cruising at a constant speed, it can be observed that the
ESS is recharged. The final SOC stands at 78.93% and, as per Fig. 11(c), the
total fuel consumption is 0.9167 L. There is no apparent issue that needs to be
highlighted in this part of the simulation, but the fuel consumption looks a
little high as the ICE is used heavily here.
NEDC Drive Cycle
On a longer drive cycle such as the NEDC, the proposed model shows
performance similar to that of the previous two drive cycles combined. Figure 12(a)
shows that the model faces some instability in the lower speed region but
performs more smoothly in the high-speed regions. Figure 12(b) shows the power flow and
an SOC level of 57.32% at the end, which means that the ESS is used heavily, especially during the low-speed section of the drive cycle, with limited regeneration.
The observation that can be made after the three simulations is that the proposed model
performs best with the ICE as the main source of traction, whereas the performance
of the IWMs needs more attention; further investigation is needed to find the
source of the unsteady performance. However, with the ICE taking centre stage,
fuel consumption takes a hit, as can be seen in Fig. 12(c) with the steep increase in
consumption during the later part of the cycle, and the simulation ended with
1.561 L of consumption.
Fig. 11 Test run on EUDC drive cycle: (a) vehicle speed response, (b) power flow and SOC, (c) fuel consumption pattern
HWFET Drive Cycle
HWFET serves as the drive cycle with the highest demand in terms of speed and
power. From Fig. 13(a), it can be clearly observed that the model exhibits instability that takes a while to be corrected before it is able to follow the speed
profile. From Fig. 13(b), it can be deduced that this instability is caused by
the spike of power coming from the IWMs as they try to respond to the steep power
request by the driver, which results in overshoots. As the ICE is used
during much of the drive cycle, the performance of the proposed
model is otherwise smooth and the ESS is not put under too much strain; the simulation
ended with an SOC of 68.49%. However, as expected from the heavy usage of the
ICE, the fuel consumption is high at 1.966 L, as shown in Fig. 13(c). From the
two long drive cycles above, it can be concluded that the fuel saving is higher in
lower speed regions, but the performance is a little unsteady due to the
inconsistency shown by the IWMs. In higher speed regions, the ICE helps maintain
smoother vehicle performance, but at the cost of unfavourable fuel
consumption.
Fig. 12 Test run on NEDC drive cycle: (a) vehicle speed response, (b) power flow and SOC, (c) fuel consumption pattern
The summary of the results is presented in Table 5. An interesting point from the
summary is that the proposed TtR HEV model has a better fuel consumption rate on the
higher speed cycles, EUDC and HWFET. This is due to the ICE operating
more efficiently at high speed than at low speed, which also results
in longer SOC preservation as the ESS is not drained as aggressively as when the
proposed model relies on the IWMs for power in the lower speed sections.
However, when comparing the performance obtained here with other publications, such as [22–24], which report fuel consumption between 2.01 L/100 km
and 4.25 L/100 km on the NEDC and HWFET drive cycles, the fuel-saving capability of
the proposed model is still far from satisfactory. This is due to the rule-based EMS
used here, compared to the more advanced EMS approaches in the publications
mentioned above, and to the choice of ICE, which does not favour the proposed
model as the downsizing option is unavailable.
Fig. 13 Test run on HWFET drive cycle: (a) vehicle speed response, (b) power flow and SOC, (c) fuel consumption pattern
Table 5 Simulation summary
Features/Drive cycles | ECE R15 | EUDC | NEDC | HWFET
Fuel consumption (L) | 0.1609 | 0.9167 | 1.561 | 1.966
Fuel consumption (L/100 km) | 16.17 | 13.18 | 14.16 | 11.91
Final SOC (%) | 84.3 (−5.7) | 78.93 (−11.07) | 57.32 (−32.68) | 68.49 (−21.51)
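The derived rows of Table 5 can be reproduced from the cycle distances in Table 3, the reported fuel volumes and the 90% initial SOC. The sketch below also estimates the net electrical energy drawn from the ESS; that last figure is not reported in the paper and rests on the assumption that the ESS energy can be approximated by its nominal voltage times rated capacity from Table 4 (200 V × 22 Ah), ignoring voltage variation with SOC. The function and variable names are illustrative.

```python
INITIAL_SOC = 90.0                     # percent, initial SOC used in the simulations
NOMINAL_ENERGY_KWH = 200 * 22 / 1000   # assumed: nominal voltage x rated capacity

# cycle name: (distance in m, fuel used in L, final SOC in %)
CYCLES = {
    "ECE R15": (995, 0.1609, 84.3),
    "EUDC": (6955, 0.9167, 78.93),
    "NEDC": (11017, 1.561, 57.32),
    "HWFET": (16503, 1.966, 68.49),
}


def summarise(distance_m, fuel_l, final_soc):
    """Return (L/100 km, SOC change in percentage points, ESS energy used in kWh)."""
    l_per_100km = fuel_l / (distance_m / 1000.0) * 100.0
    soc_delta = final_soc - INITIAL_SOC
    # Net energy taken from the ESS, under the nominal-energy assumption above.
    ess_energy_kwh = -soc_delta / 100.0 * NOMINAL_ENERGY_KWH
    return l_per_100km, soc_delta, ess_energy_kwh


if __name__ == "__main__":
    for name, data in CYCLES.items():
        rate, delta, energy = summarise(*data)
        print(f"{name}: {rate:.2f} L/100 km, SOC {delta:+.2f}, ESS {energy:.2f} kWh")
```

Normalizing by distance is what makes the cycles comparable: the HWFET run uses the most fuel in absolute terms but the least per 100 km, which is the observation made in the summary paragraph.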
5 Conclusions
Finally, it can be concluded that the modeling of the proposed TtR HEV has been a
success as far as the ability of the model to respond to the driver’s demand is
concerned, albeit with some minor instabilities which can be remedied through further
optimization. The focus will be put on the IWMs’ performance because safety is at
risk when a vehicle does not perform as the driver intends. The simulation results
have shown that the development of the simulation platform is successful and that it
can be used for further research on the proposed TtR model, especially in
developing a new EMS controller to replace the rule-based EMS and take full
advantage of the design approach taken here, in the pursuit of achieving the best
fuel consumption possible without sacrificing vehicle performance.
From the results, several areas have been identified as prospective research
focuses for the future, aimed at performance gains and increased fuel-saving
potential for the proposed architecture. The EMS is certainly the main area in which
these goals can be achieved, as the rule-based EMS currently used by the model has
too many limitations and is not capable of adapting to the different characteristics of
different drive cycles. Another area of interest is hardware-based drivetrain
optimization by component sizing to increase operational efficiency and minimize losses.
Acknowledgements This research work and publication is supported and funded by UNIMAS
under Special MyRA Assessment Funding (Project ID: F02/SpMYRA/1719/2018).
References
1. Atabani AE, Badruddin IA, Mekhilef S, Silitonga AS (2011) A review on global fuel
economy standards, labels and technologies in the transportation sector. Renew Sustain
Energy Rev 15:4586–4610
2. Mohr SH, Wang J, Ellem G, Ward J, Giurco D (2015) Projection of world fossil fuels by
country. Fuel 141:120–135
3. Sabri MFM, Danapalasingam KA, Rahmat MF (2016) A review on hybrid electric vehicles
architecture and energy management strategies. Renew Sustain Energy Rev 53:1433–1442
4. UNFCCC (2015) Conference of the parties (COP): Paris climate change
conference-November 2015, COP 21. Adoption of the Paris Agreement. Proposed by
President 21932, 32 (2015)
5. Un-Noor F, Padmanaban S, Mihet-Popa L, Mollah M, Hossain E (2017) A comprehensive
study of key electric vehicle (EV) components, technologies, challenges, impacts, and future
direction of development. Energies 10:1217
6. Manoharan Y, Hosseini SE, Butler B, Alzhahrani H, Senior BTF, Ashuri T, Krohn J (2019)
Hydrogen fuel cell vehicles; current status and future prospect. Appl. Sci. 9:2296
7. Williamson SS, Rathore AK, Musavi F (2015) Industrial electronics for electric transportation: current state-of-the-art and future challenges. IEEE Trans Ind Electron 62:3021–3032
8. Cano ZP, Banham D, Ye S, Hintennach A, Lu J, Fowler M, Chen Z (2018) Batteries and fuel
cells for emerging electric vehicle markets. Nat Energy 3:279–289
9. Mohd Sabri MF, Danapalasingam KA, Rahmat MF (2018) Improved fuel economy of
through-the-road hybrid electric vehicle with fuzzy logic-based energy management strategy.
Int J Fuzzy Syst 20:2677–2692
10. Enang W, Bannister C (2017) Modelling and control of hybrid electric vehicles (a
comprehensive review). Renew Sustain Energy Rev 74:1210–1239
11. Hannan MA, Azidin FA, Mohamed A (2014) Hybrid electric vehicles and their challenges: a
review. Renew Sustain Energy Rev 29:135–150
12. International Energy Agency (IEA) (2017) Global EV outlook 2017: two million and
counting. IEA Publication, New Delhi, pp 1–71
13. Rezvani Z, Jansson J, Bodin J (2015) Advances in consumer electric vehicle adoption
research: a review and research agenda. Transp Res Part D Transp Environ 34:122–136
14. Nykvist B, Nilsson M (2015) Rapidly falling costs of battery packs for electric vehicles. Nat
Clim Change 5:329–332
15. Axsen J, Kurani KS (2013) Hybrid, plug-in hybrid, or electric-What do car buyers want?
Energy Policy 61:532–543
16. Pisanti C, Rizzo G, Marano V (2014) Energy management of through-the-road parallel hybrid
vehicles. IFAC Proc 47:2118–2124
17. Galvagno E, Morina D, Sorniotti A, Velardocchia M (2013) Drivability analysis of
through-the-road-parallel hybrid vehicles. Meccanica 48:351–366
18. Rashid MIM, Danial H (2017) ADVISOR simulation and performance test of split plug-in
hybrid electric vehicle conversion. Energy Procedia 105:1408–1413
19. Meisel J, Shabbir W, Evangelou SA (2013) Evaluation of the through-the-road architecture
for plug-in hybrid electric vehicle powertrains. In: 2013 IEEE international electric vehicle
conference (IEVC). IEEE, pp 1–5
20. Mathews JC, Walp KJ, Molen GM (2006) Development and implementation of a control
system for a parallel hybrid powertrain. In: 2006 IEEE vehicle power and propulsion
conference, pp 1–6
21. Miller S. Hybrid-Electric Vehicle Model in Simulink. https://www.mathworks.com/
matlabcentral/fileexchange/28441-hybrid-electric-vehicle-model-in-simulink. Accessed 10
Oct 2019
22. Dubois MR, Desrochers A, Denis N (2015) Fuzzy-based blended control for the energy
management of a parallel plug-in hybrid electric vehicle. IET Intell Transp Syst 9:30–37
23. Zhang Y, Liu H-P (2012) Fuzzy multi-objective control strategy for parallel hybrid electric
vehicle. IET Electr Syst Transp 2:39
24. Adhikari S, Halgamuge SK, Watson HC (2010) An online power-balancing strategy for a
parallel hybrid electric vehicle assisted by an integrated starter generator. IEEE Trans Veh
Technol 59:2689–2699
Euler-Lagrange Based Dynamic Model
of Double Rotary Inverted Pendulum
Mukhtar Fatihu Hamza, Jamilu Kamilu Adamu,
and Abdulbasid Ismail Isa
Abstract The double rotary inverted pendulum (DRIP) is an important member of the
class of nonlinear, unstable, non-minimum-phase, and under-actuated mechanical
systems. The DRIP is widely known as an experimental setup for testing different
kinds of control algorithms. This paper describes the development of the nonlinear
dynamical equations of the DRIP system using the Euler-Lagrange method. The
Euler-Lagrange method does not require complicated and tedious formulation, since
the DRIP is not a large multi-body system. The linear model and its state-space
representation are also presented. A Simulink model of the DRIP was developed based
on the derived equations. A simulation study was carried out, and the results
indicate that the DRIP system is inherently nonlinear and unstable. The difficulties
and limitations of the previous dynamic equations of the DRIP proposed in the
literature are eliminated. The Euler-Lagrange method can be regarded as an
alternative method for finding the dynamic model of such systems.
Keywords Rotary inverted pendulum · Nonlinear system · Dynamic model · Euler-Lagrange
M. F. Hamza (&)
Department of Mechanical Engineering, University of Malaya, Kuala Lumpur, Malaysia
e-mail: mfhamza@siswa.um.edu.my
M. F. Hamza
Department of Mechatronics Engineering, Bayero University, Kano, Nigeria
J. K. Adamu
Department of Engineering Services, Federal Ministry of Power, Works and Housing, Abuja,
Nigeria
A. I. Isa
Department of Electrical and Electronics Engineering, Usmanu Danfodiyo University Sokoto,
Sokoto, Nigeria
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_29
1 Introduction
The double rotary inverted pendulum (DRIP) has two inverted pendulums connected to
each other, one of which is attached to a rotating arm, as shown in Fig. 1. The
plane of the two pendulums is orthogonal to the radial arm [1]. The rotary arm is
actuated by a control torque with the objective of balancing the two pendulums
in the inverted position. The system therefore has three degrees of freedom (DOF).
The actuated joint angle has a complete azimuth revolution range to stabilize the
double inverted pendulum [2]. The DRIP is an important member of the class of
nonlinear, unstable, non-minimum-phase, and under-actuated mechanical systems. The
schematic diagram of the experimental setup is shown in Fig. 2.
DRIP systems are relevant to a wide range of real-life applications such as
aerospace systems, robotics, marine systems, mobile systems, flexible systems,
pointing control, and locomotive systems [3]. Moreover, when the pendulums of the
DRIP are in the hanging position, the system represents a simplified model of an
industrial crane [4].
The control objectives of the DRIP can be grouped into four categories [5, 6]:
1. Swing-up control: driving the two pendulums from the downward stable position
to the upward unstable position [7].
2. Stabilization control: regulating the pendulums so that they remain at the
unstable position [8].
3. Switching control: switching between swing-up control and stabilization
control [6].
4. Trajectory tracking control: controlling the DRIP so that the arm tracks a
desired time-varying trajectory while the pendulums remain at the unstable
position [9].
Fig. 1 Picture of experimental setup
Fig. 2 Schematic diagram of experimental setup
The study of system dynamics resides in modeling its behavior. The dynamic
equations of any mechanical system can be obtained from classical Newtonian
mechanics [10–12]. Newtonian dynamics is a mathematical model whose purpose is to
predict the motions of the various objects encountered in the world around us [13].
The drawback of this formalism is its use of variables in vector form, which
complicates the analysis considerably as the number of joints increases or when
rotations are present in the system. In these cases, it is favorable to employ the
Lagrange equations, which have a scalar formalism, facilitating the analysis of any
mechanical system [14, 15].
This study describes in detail the development of the nonlinear and linear
dynamical equations of the DRIP system using the Euler-Lagrange method. The
state-space representation of the developed linear model is also presented. A
nonlinear Matlab model of the DRIP was developed based on the derived equations. A
simulation study was carried out, and the results indicate that the DRIP system is
inherently nonlinear and unstable. The difficulties and limitations of the previous
dynamic equations of the DRIP proposed in the literature are eliminated.
2 Double Rotary Inverted Pendulum Modelling
The DRIP consists of a series of two pendulums attached to a rotary arm that rotates
around the motor shaft axis. It has three DOF, namely the rotary arm angle $\theta$,
the lower pendulum angle $\alpha$, and the upper pendulum angle $\gamma$. The
schematic diagram of the DRIP is shown in Fig. 3. The derivation of the mathematical
equations describing the dynamics of the DRIP system is based on the Euler-Lagrange
equation of motion [16].
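2.1 Euler-Lagrange Equation
As described in [17], the Euler-Lagrange equation is given in Eq. (1):
\[
\tau_i = \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_i}\right) - \frac{\partial L}{\partial q_i} + \frac{\partial W}{\partial \dot{q}_i} \tag{1}
\]
\[
W = \frac{1}{2} b_i \dot{q}_i^2 \;\Rightarrow\; \frac{\partial W}{\partial \dot{q}_i} = b_i \dot{q}_i \tag{2}
\]
where $q(t)$ are the generalized coordinates, $\dot{q}(t)$ are the generalized
velocities, $\tau_i$ is the external force or load vector, $L$ is the Lagrangian and
$W$ is the loss energy.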
2.1.1 Kinetic Energy
The kinetic energy of the DRIP consists of translational and rotational components
for the two pendulums and a rotational component for the rotary arm [18].
The total kinetic energy can be expressed in terms of the generalized coordinates
and their first time derivatives. To describe the position and motion of the system
under consideration, standard Cartesian coordinates (x, y, z) and polar coordinates
$(r, \theta)$ can be used for each of the three links. Each point in these planes
corresponds to a unique instantaneous state of the DRIP. The kinetic energy of
each link is obtained as follows:
Arm (Link 1)
The kinetic energy of the arm consists only of a rotational component. The arm is
constrained to move in the x-o-z plane and rotates around the y-axis through an angle
$\theta$ [19]. The instantaneous position of the arm, and hence its kinetic energy
$K_1$, is most conveniently specified in terms of the plane polar coordinates $r$
and $\theta$:
\[
K_1 = \frac{1}{2} J_a \dot{\theta}^2 \tag{3}
\]
Fig. 3 Schematic diagram of DRIP
Pendulum
The movements of the two pendulums are constrained to a vertical plane perpendicular
to Link 1. A Cartesian coordinate system allows position and direction in space to be
represented conveniently. Let us define the usual Cartesian coordinates (x, y, z) and
let the origin of the coordinate system correspond to the equilibrium position of
each pendulum. The arrows on the arcs in Fig. 3 that indicate the angular
displacements show the positive direction of rotation of the links. The straight
dashed lines in Fig. 4 represent the reference position of the link angles
(i.e. $\theta = \alpha = \gamma = 0$).
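Lower pendulum (Link 2)
If the lower pendulum is deflected from the upward vertical position by a small
angle $\alpha$, then it is easily seen that:
\[ X_1 = r\theta + l_1 \sin\alpha \tag{4} \]
\[ \dot{X}_1 = r\dot{\theta} + l_1 \cos\alpha\,\dot{\alpha} \tag{5} \]
\[ Y_1 = l_1 \cos\alpha \tag{6} \]
\[ \dot{Y}_1 = -l_1 \sin\alpha\,\dot{\alpha} \tag{7} \]
The translational kinetic energy of link 2 is given by:
\[ K_{t2} = \frac{1}{2} m_1 \left(\dot{X}_1^2 + \dot{Y}_1^2\right) \tag{8} \]
Substituting (5) and (7) into (8) yields
\[ K_{t2} = \frac{1}{2} m_1 \left[\left(r\dot{\theta} + l_1\cos\alpha\,\dot{\alpha}\right)^2 + \left(-l_1\sin\alpha\,\dot{\alpha}\right)^2\right] \tag{9} \]
The rotational kinetic energy of link 2 is:
\[ K_{r2} = \frac{1}{2} J_1 \dot{\alpha}^2 \tag{10} \]
The total kinetic energy $K_2$ of link 2 is the sum of the rotational and
translational kinetic energies:
\[ K_2 = K_{t2} + K_{r2} \]
\[ K_2 = \frac{1}{2} J_1 \dot{\alpha}^2 + \frac{1}{2} m_1 \left[\left(r\dot{\theta} + l_1\cos\alpha\,\dot{\alpha}\right)^2 + \left(-l_1\sin\alpha\,\dot{\alpha}\right)^2\right] \tag{11} \]
Fig. 4 Position analyses for link 2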
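Upper pendulum (Link 3)
If the upper pendulum is deflected from the upward vertical position by a small
angle $\gamma$, then it is easily seen that
\[ X_2 = r\theta + L_1 \sin\alpha + l_2 \sin\gamma \tag{12} \]
\[ \dot{X}_2 = r\dot{\theta} + L_1 \cos\alpha\,\dot{\alpha} + l_2 \cos\gamma\,\dot{\gamma} \tag{13} \]
\[ Y_2 = L_1 \cos\alpha + l_2 \cos\gamma \tag{14} \]
\[ \dot{Y}_2 = -L_1 \sin\alpha\,\dot{\alpha} - l_2 \sin\gamma\,\dot{\gamma} \tag{15} \]
The translational kinetic energy of link 3 is given by:
\[ K_{t3} = \frac{1}{2} m_2 \left(\dot{X}_2^2 + \dot{Y}_2^2\right) \tag{16} \]
\[ K_{t3} = \frac{1}{2} m_2 \left[\left(r\dot{\theta} + L_1\cos\alpha\,\dot{\alpha} + l_2\cos\gamma\,\dot{\gamma}\right)^2 + \left(-L_1\sin\alpha\,\dot{\alpha} - l_2\sin\gamma\,\dot{\gamma}\right)^2\right] \tag{17} \]
The rotational kinetic energy of link 3 is:
\[ K_{r3} = \frac{1}{2} J_2 \dot{\gamma}^2 \tag{18} \]
The total kinetic energy $K_3$ of link 3 is the sum of the rotational ($K_{r3}$) and
translational ($K_{t3}$) kinetic energies:
\[ K_3 = \frac{1}{2} J_2 \dot{\gamma}^2 + \frac{1}{2} m_2 \left[\left(r\dot{\theta} + L_1\cos\alpha\,\dot{\alpha} + l_2\cos\gamma\,\dot{\gamma}\right)^2 + \left(-L_1\sin\alpha\,\dot{\alpha} - l_2\sin\gamma\,\dot{\gamma}\right)^2\right] \tag{19} \]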
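The total kinetic energy of the system $K$ is the sum of the translational and
rotational kinetic energies of the individual components making up the system, as
shown below.
\[ K = K_1 + K_2 + K_3 \]
\[
\begin{aligned}
K ={} & \frac{1}{2} J_a \dot{\theta}^2 + \frac{1}{2} J_1 \dot{\alpha}^2 + \frac{1}{2} J_2 \dot{\gamma}^2
 + \frac{1}{2} m_1 \left[\left(r\dot{\theta} + l_1\dot{\alpha}\cos\alpha\right)^2 + \left(-l_1\dot{\alpha}\sin\alpha\right)^2\right] \\
 & + \frac{1}{2} m_2 \left[\left(r\dot{\theta} + L_1\dot{\alpha}\cos\alpha + l_2\dot{\gamma}\cos\gamma\right)^2 + \left(-L_1\dot{\alpha}\sin\alpha - l_2\dot{\gamma}\sin\gamma\right)^2\right]
\end{aligned} \tag{20}
\]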
2.1.2 The Potential Energy
The potential energy of the individual links of the DRIP is given below.
Arm
Since the center of mass of the arm is balanced at the origin (y = 0), the
potential energy of the arm $P_1$ is zero.
\[ P_1 = 0 \tag{21} \]
Lower Pendulum
\[ P_2 = m_1 g l_1 \cos\alpha \tag{22} \]
Upper Pendulum
\[ P_3 = g m_2 L_1 \cos\alpha + g m_2 l_2 \cos\gamma \tag{23} \]
The total potential energy of the system $P$ is given by:
\[ P = P_1 + P_2 + P_3 \]
\[ P = g m_1 l_1 \cos\alpha + g m_2 L_1 \cos\alpha + g m_2 l_2 \cos\gamma \tag{24} \]
where $\theta$, $\dot{\theta}$, $\ddot{\theta}$ are the angular position, velocity and
acceleration of the motor shaft around the vertical axis; $\alpha$, $\dot{\alpha}$,
$\ddot{\alpha}$ are the angular position, velocity and acceleration of the lower
pendulum around the motor shaft axis; and $\gamma$, $\dot{\gamma}$, $\ddot{\gamma}$
are the angular position, velocity and acceleration of the upper pendulum around the
motor shaft axis.
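2.1.3 Lagrangian Formulation (L)
The Lagrangian is defined as
\[ L = K - P \tag{25} \]
Therefore, substituting Eqs. (20) and (24), we have:
\[
\begin{aligned}
L ={} & \frac{1}{2} J_a \dot{\theta}^2 + \frac{1}{2} J_1 \dot{\alpha}^2 + \frac{1}{2} J_2 \dot{\gamma}^2
 + \frac{1}{2} m_1 \left[\left(r\dot{\theta} + l_1\dot{\alpha}\cos\alpha\right)^2 + \left(-l_1\dot{\alpha}\sin\alpha\right)^2\right] \\
 & + \frac{1}{2} m_2 \left[\left(r\dot{\theta} + L_1\dot{\alpha}\cos\alpha + l_2\dot{\gamma}\cos\gamma\right)^2 + \left(-L_1\dot{\alpha}\sin\alpha - l_2\dot{\gamma}\sin\gamma\right)^2\right] \\
 & - \left[g m_1 l_1 \cos\alpha + g m_2 L_1 \cos\alpha + g m_2 l_2 \cos\gamma\right]
\end{aligned} \tag{26}
\]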
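Applying the Euler-Lagrange Eq. (1) to the Lagrangian (26) results in three
coupled nonlinear equations. The Euler-Lagrange equation of motion of each link
thus becomes:
For the arm ($\theta$), substituting $\theta$ in Eq. (1):
\[
\tau_a = \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\theta}}\right) - \frac{\partial L}{\partial \theta} + b_a \dot{\theta} \tag{27}
\]
\[
\begin{aligned}
\tau_a ={} & \left[J_a + r^2 (m_1 + m_2)\right]\ddot{\theta} + r (m_1 l_1 + m_2 L_1)\cos\alpha\,\ddot{\alpha} + m_2 l_2 r\,\ddot{\gamma}\cos\gamma + b_a \dot{\theta} \\
 & - r (m_1 l_1 + m_2 L_1)\sin\alpha\,\dot{\alpha}^2 - m_2 l_2 r \sin\gamma\,\dot{\gamma}^2
\end{aligned} \tag{28}
\]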
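As a sketch of the intermediate step behind Eq. (28), which the text does not spell out: the Lagrangian (26) contains $\dot{\theta}$ but not $\theta$ itself, so only the generalized-momentum term contributes. The worked lines below follow directly from (26) and use no symbols beyond those already defined.

```latex
\begin{aligned}
\frac{\partial L}{\partial \theta} &= 0, \\
\frac{\partial L}{\partial \dot{\theta}} &= J_a \dot{\theta}
  + m_1 r \left(r\dot{\theta} + l_1 \dot{\alpha}\cos\alpha\right)
  + m_2 r \left(r\dot{\theta} + L_1 \dot{\alpha}\cos\alpha + l_2 \dot{\gamma}\cos\gamma\right), \\
\frac{d}{dt}\frac{\partial L}{\partial \dot{\theta}} &=
  \left[J_a + r^2 (m_1 + m_2)\right]\ddot{\theta}
  + r (m_1 l_1 + m_2 L_1)\left(\ddot{\alpha}\cos\alpha - \dot{\alpha}^2 \sin\alpha\right)
  + m_2 l_2 r \left(\ddot{\gamma}\cos\gamma - \dot{\gamma}^2 \sin\gamma\right).
\end{aligned}
```

Adding the damping term $b_a \dot{\theta}$ from Eq. (2) to the last line gives the right-hand side of Eq. (28).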
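For the lower pendulum ($\alpha$), substituting $\alpha$ in Eq. (1):
\[
0 = \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\alpha}}\right) - \frac{\partial L}{\partial \alpha} + b_1 \dot{\alpha} \tag{29}
\]
\[
\begin{aligned}
0 ={} & r (m_1 l_1 + m_2 L_1)\cos\alpha\,\ddot{\theta} + \left(J_1 + m_1 l_1^2 + m_2 L_1^2\right)\ddot{\alpha} + m_2 L_1 l_2 \cos(\alpha - \gamma)\,\ddot{\gamma} \\
 & + b_1 \dot{\alpha} + m_2 L_1 l_2 \sin(\alpha - \gamma)\,\dot{\gamma}^2 - g (m_1 l_1 + m_2 L_1)\sin\alpha
\end{aligned} \tag{30}
\]
For the upper pendulum ($\gamma$), substituting $\gamma$ in Eq. (1):
\[
0 = \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\gamma}}\right) - \frac{\partial L}{\partial \gamma} + b_2 \dot{\gamma} \tag{31}
\]
\[
\begin{aligned}
0 ={} & m_2 l_2 r \cos\gamma\,\ddot{\theta} + m_2 L_1 l_2 \cos(\alpha - \gamma)\,\ddot{\alpha} + \left(J_2 + m_2 l_2^2\right)\ddot{\gamma} + b_2 \dot{\gamma} \\
 & - m_2 L_1 l_2 \sin(\alpha - \gamma)\,\dot{\alpha}^2 - g m_2 l_2 \sin\gamma
\end{aligned} \tag{32}
\]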
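Equations (28), (30) and (32) are three nonlinear, coupled, second-order
differential equations of motion describing the dynamics of the DRIP system. These
dynamic equations can be reduced to the following equations:
\[ \tau_a = z_1 \ddot{\theta} + z_2 \cos\alpha\,\ddot{\alpha} + z_3 \ddot{\gamma}\cos\gamma + b_a \dot{\theta} - z_2 \sin\alpha\,\dot{\alpha}^2 - z_3 \sin\gamma\,\dot{\gamma}^2 \tag{33} \]
\[ 0 = z_2 \cos\alpha\,\ddot{\theta} + z_4 \ddot{\alpha} + z_5 \cos(\alpha - \gamma)\,\ddot{\gamma} + b_1 \dot{\alpha} + z_5 \sin(\alpha - \gamma)\,\dot{\gamma}^2 - z_7 \sin\alpha \tag{34} \]
\[ 0 = z_3 \cos\gamma\,\ddot{\theta} + z_5 \cos(\alpha - \gamma)\,\ddot{\alpha} + z_6 \ddot{\gamma} + b_2 \dot{\gamma} - z_5 \sin(\alpha - \gamma)\,\dot{\alpha}^2 - z_8 \sin\gamma \tag{35} \]
where:
\[ z_1 = J_a + r^2 (m_1 + m_2) \tag{36} \]
\[ z_2 = r (m_1 l_1 + m_2 L_1) \tag{37} \]
\[ z_3 = m_2 l_2 r \tag{38} \]
\[ z_4 = J_1 + m_1 l_1^2 + m_2 L_1^2 \tag{39} \]
\[ z_5 = L_1 l_2 m_2 \tag{40} \]
\[ z_6 = J_2 + m_2 l_2^2 \tag{41} \]
\[ z_7 = g (m_1 l_1 + m_2 L_1) \tag{42} \]
\[ z_8 = g m_2 l_2 \tag{43} \]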
The torque at the load shaft from an applied motor torque can be expressed as:
\[
\tau_m(t) = \frac{\eta_g K_g \eta_m k_t \left(V_m(t) - K_g k_m \dot{\theta}(t)\right)}{R_m} \tag{44}
\]
Substituting the parameter values of the system under consideration into Eq. (44)
gives:
\[
\tau_a = 0.117238\,v - 0.063\,\dot{\theta} \ \ \text{N m} \tag{45}
\]
2.2 System Specifications
The system specifications and their descriptions are given in Table 1 (SRV02 DRIP
module).
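The numerical coefficients in Eq. (45) can be checked against Eq. (44) with the motor and gearbox values listed in Table 1. The short Python sketch below does only that arithmetic; the function and constant names are illustrative and are not taken from the paper.

```python
# Motor and gearbox values as listed in Table 1 (SRV02 DRIP module).
ETA_G = 0.9      # gear efficiency
ETA_M = 0.63     # motor efficiency
K_G = 70         # total gear ratio
K_T = 0.00768    # motor torque constant
K_M = 0.00768    # back-emf constant, V/(rad/s)
R_M = 2.6        # motor armature resistance, ohm


def torque_coefficients():
    """Return (voltage gain, speed gain) of Eq. (44) written as kv*v - kw*theta_dot."""
    voltage_gain = ETA_G * K_G * ETA_M * K_T / R_M
    speed_gain = voltage_gain * K_G * K_M
    return voltage_gain, speed_gain


def load_torque(v, theta_dot):
    """Load-shaft torque in N m for input voltage v and arm speed theta_dot."""
    kv, kw = torque_coefficients()
    return kv * v - kw * theta_dot


kv, kw = torque_coefficients()
print(f"tau_a = {kv:.6f} v - {kw:.4f} theta_dot  [N m]")
```

With these values the two gains come out close to the 0.117238 and 0.063 printed in Eq. (45).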
2.3 MATLAB Modelling
For the purpose of controller design and analysis, the DRIP Simulink model was
developed in Matlab/Simulink using the nonlinear, parameterized mathematical
model, as shown in Fig. 5. Rearranging the coupled nonlinear equations of motion
(33), (34) and (35) and substituting the parameter values gives:
\[
\ddot{\theta} = 0.8085\,v - 0.6138\cos\alpha\,\ddot{\alpha} - 0.2966\cos\gamma\,\ddot{\gamma} - 4.5103\,\dot{\theta} + 0.6138\sin\alpha\,\dot{\alpha}^2 + 0.2966\sin\gamma\,\dot{\gamma}^2 \tag{46}
\]
Table 1 SRV02 DRIP specifications
Symbol  Description                                                          Value     Unit
Ja      Rotary arm moment of inertia about its center of mass                0.0041    kg m^2
J1      First pendulum moment of inertia about center of mass                0.00032   kg m^2
J2      Second pendulum moment of inertia about center of mass               0.0012    kg m^2
r       Rotary arm length from pivot to tip                                  0.2159    m
L1      Lower pendulum length from pivot to tip                              0.2       m
l1      Lower pendulum length from pivot to center of mass                   0.097     m
l2      Upper pendulum length from pivot to center of mass                   0.156     m
ba      Viscous damping coefficient of the motor arm                         0.0024    N m/(rad/s)
b1      Lower pendulum viscous damping coefficient as seen at the pivot axis 0.0024    N m/(rad/s)
b2      Upper pendulum viscous damping coefficient as seen at the pivot axis 0.0024    N m/(rad/s)
Vnom    Motor nominal input voltage                                          6.0       V
Rm      Motor armature resistance                                            2.6       Ω
ηm      Motor efficiency                                                     0.63
ηg      Gear efficiency                                                      0.9
Kg      Total gear ratio                                                     70
Km      Back-emf constant                                                    0.00768   V/(rad/s)
Kt      Motor torque constant                                                0.00768   N m/A
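\[
\ddot{\alpha} = -1.1266\cos\alpha\,\ddot{\theta} - 0.4937\cos(\alpha - \gamma)\,\ddot{\gamma} - 0.3038\,\dot{\alpha} - 0.4937\sin(\alpha - \gamma)\,\dot{\gamma}^2 + 51.2405\sin\alpha \tag{47}
\]
\[
\ddot{\gamma} = -\cos\gamma\,\ddot{\theta} - 0.9096\cos(\alpha - \gamma)\,\ddot{\alpha} - 0.5581\,\dot{\gamma} + 0.9096\sin(\alpha - \gamma)\,\dot{\alpha}^2 + 45.2093\sin\gamma \tag{48}
\]
2.4 Linearization of Nonlinear Model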
In most situations where a linearized model is sought, the nominal state is an
equilibrium point, that is, a state in which the system remains unless perturbed
[5]. Therefore, to linearize the model [27], the following approximations, based on
the Taylor series expansion, are applied: $\cos\theta \approx 1$,
$\cos\alpha \approx 1$, $\cos\gamma \approx 1$, $\sin\theta \approx \theta$,
$\sin\alpha \approx \alpha$, $\sin\gamma \approx \gamma$ and
$\dot{\theta}^2 = \dot{\alpha}^2 = \dot{\gamma}^2 \approx 0$.
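As a rough cross-check of this linearization, the sketch below applies the small-angle approximations to Eqs. (46)-(48), which leaves three linear equations that are coupled through the accelerations, and solves them for a unit deflection of each pendulum. It uses only the numerical coefficients printed in (46)-(48). The helper `solve3` and the variable names are illustrative, not from the paper. The resulting accelerations should agree, to within rounding of the printed coefficients (about 1-2%), with the angle-dependent entries of the state matrix A reported in this section.

```python
def det3(a):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def solve3(m, b):
    """Solve the 3x3 linear system m x = b by Cramer's rule."""
    d = det3(m)
    solution = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = b[r]      # replace one column with the right-hand side
        solution.append(det3(mc) / d)
    return solution


# Acceleration coupling of (46)-(48) with every cosine set to 1:
# rows are the theta, alpha and gamma equations; columns multiply
# theta_ddot, alpha_ddot and gamma_ddot.
M0 = [[1.0,    0.6138, 0.2966],
      [1.1266, 1.0,    0.4937],
      [1.0,    0.9096, 1.0]]

# Gravity terms with sin(x) ~ x: 51.2405*alpha in (47), 45.2093*gamma in (48).
acc_per_alpha = solve3(M0, [0.0, 51.2405, 0.0])   # accelerations per rad of alpha
acc_per_gamma = solve3(M0, [0.0, 0.0, 45.2093])   # accelerations per rad of gamma

print("per unit alpha:", [round(x, 2) for x in acc_per_alpha])
print("per unit gamma:", [round(x, 2) for x in acc_per_gamma])
```

The signs are informative: a positive lower-pendulum deflection produces a positive lower-pendulum acceleration, which is the destabilizing effect expected at the upright equilibrium.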
Fig. 5 Simulink model of DRIP
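The linearized model of the nonlinear equations (33), (34) and (35) in matrix
form is:
\[
\begin{bmatrix} z_1 & z_2 & z_3 \\ z_2 & z_4 & z_5 \\ z_3 & z_5 & z_6 \end{bmatrix}
\begin{bmatrix} \ddot{\theta} \\ \ddot{\alpha} \\ \ddot{\gamma} \end{bmatrix}
+
\begin{bmatrix} b_a & 0 & 0 \\ 0 & b_1 & 0 \\ 0 & 0 & b_2 \end{bmatrix}
\begin{bmatrix} \dot{\theta} \\ \dot{\alpha} \\ \dot{\gamma} \end{bmatrix}
+
\begin{bmatrix} 0 \\ -\alpha z_7 \\ -\gamma z_8 \end{bmatrix}
=
\begin{bmatrix} \tau_a \\ 0 \\ 0 \end{bmatrix} \tag{49}
\]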
By defining the state variables as $x_1 = \theta$, $x_2 = \alpha$, $x_3 = \gamma$,
$x_4 = \dot{\theta}$, $x_5 = \dot{\alpha}$, $x_6 = \dot{\gamma}$ and substituting the
parameter values, the linear state-space system can be represented as:
\[
A = \begin{bmatrix}
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 \\
0 & -103.7924 & 1.7156 & -14.6318 & 0.6154 & -0.0212 \\
0 & 211.7365 & -42.3798 & 16.7688 & -1.2554 & 0.5232 \\
0 & -88.2477 & 81.9312 & -0.5772 & 0.5232 & -1.0115
\end{bmatrix},
\quad
B = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 26.2209 \\ -30.0506 \\ 1.0343 \end{bmatrix},
\quad
C = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0
\end{bmatrix}
\]
3 Test for Stability
As pointed out in [20], the necessary and sufficient condition for stability of a
system is that all the roots of the characteristic equation ($\lambda$, also referred
to as eigenvalues) have negative real parts. If any root has a positive real part,
the contribution from the corresponding exponential term grows with time, the output
response is unbounded, and the entire system is regarded as unstable.
The characteristic equation $P(\lambda)$ is given as:
\[
P(\lambda) = \det(\lambda I - A) =
\begin{vmatrix}
\lambda & 0 & 0 & -1 & 0 & 0 \\
0 & \lambda & 0 & 0 & -1 & 0 \\
0 & 0 & \lambda & 0 & 0 & -1 \\
0 & 103.7924 & -1.7156 & \lambda + 14.6318 & -0.6154 & 0.0212 \\
0 & -211.7365 & 42.3798 & -16.7688 & \lambda + 1.2554 & -0.5232 \\
0 & 88.2477 & -81.9312 & 0.5772 & -0.5232 & \lambda + 1.0115
\end{vmatrix} = 0 \tag{50}
\]
\[
\lambda_1 = 0, \quad \lambda_2 = -22.5049, \quad \lambda_3 = 12.8716, \quad
\lambda_4 = 6.2333, \quad \lambda_5 = -3.3489, \quad \lambda_6 = -10.1498
\]
From the eigenvalues $\lambda_i$ obtained, two of the poles lie in the right half of
the s-plane. Hence, the system is confirmed to be unstable.
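The stability test can be illustrated numerically. The sketch below evaluates the characteristic determinant $\det(\lambda I - A)$ for the state matrix of this section and checks that it changes sign in a small interval around each reported non-zero eigenvalue, which indicates a real root there. It uses only the Python standard library. The signs of the entries of A and of the eigenvalues are written as implied by Eqs. (46)-(48) and by the trace of A, which is an assumption of this sketch. The helper names are illustrative.

```python
# State matrix A of the linearized DRIP model (signs as implied by Eqs. 46-48).
A = [
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
    [0.0, -103.7924, 1.7156, -14.6318, 0.6154, -0.0212],
    [0.0, 211.7365, -42.3798, 16.7688, -1.2554, 0.5232],
    [0.0, -88.2477, 81.9312, -0.5772, 0.5232, -1.0115],
]


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for k in range(n):
        pivot = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[pivot][k] == 0.0:
            return 0.0                      # singular matrix
        if pivot != k:
            a[k], a[pivot] = a[pivot], a[k]
            result = -result                # a row swap flips the sign
        result *= a[k][k]
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
    return result


def char_poly(lam):
    """Evaluate P(lam) = det(lam*I - A)."""
    n = len(A)
    return det([[(lam if i == j else 0.0) - A[i][j] for j in range(n)]
                for i in range(n)])


def has_root_near(lam, half_width=0.05):
    """True if P changes sign on [lam - half_width, lam + half_width]."""
    return char_poly(lam - half_width) * char_poly(lam + half_width) < 0.0


reported = [-22.5049, 12.8716, 6.2333, -3.3489, -10.1498]
unstable = [lam for lam in reported if lam > 0.0 and has_root_near(lam)]
print("confirmed roots:", [lam for lam in reported if has_root_near(lam)])
print("right-half-plane roots:", unstable)
```

A sign-change test is used instead of a general eigenvalue solver to keep the example dependency-free. It only confirms real roots that are already suspected, so it is a check on the reported values rather than a method for finding them.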
4 Open Loop Response
The system dynamic model was derived on the assumption that the tilt angles of all
three links (arm, lower pendulum and upper pendulum) are at the reference zero
(0 rad) position. When the motor is energized with a step signal and without
control, which serves as a disturbance to the DRIP at its unstable equilibrium, the
two pendulums are unable to maintain the unstable equilibrium position and fall to
the downward stable equilibrium, equivalent to 180°. These behaviors are shown in
Figs. 6, 7 and 8.
Fig. 6 Rotary arm open loop response
Fig. 7 Lower pendulum open loop response
Fig. 8 Upper pendulum open loop response
5 Conclusion
This study presented the development of the nonlinear dynamical equations of the
DRIP system using the Euler-Lagrange method. A MATLAB/Simulink model of the DRIP was
developed based on the derived equations. A simulation study was carried out, and
the results show that the DRIP system is inherently nonlinear and unstable. The
developed models can be used by researchers for the application of linear or
nonlinear controllers. The method used can also be applied to the modelling of
other nonlinear systems.
References
1. Casanova V, Salt J, Piza R, Cuenca A (2012) Controlling the double rotary inverted
pendulum with multiple feedback delays. Int J Comput Commun Control 7(1):20–38
2. Pakdeepattarakorn P, Thamvechvitee P, Songsiri J, Wongsaisuwan M, Banjerdpongchai D
(2004) Dynamic models of a rotary double inverted pendulum system. In: 2004 IEEE region
10 conference (TENCON 2004), vol 500. IEEE, pp 558–561 (2004)
3. Hamza MF, Yap HJ, Choudhury IA, Isa, AI (2016) Application of Kane’s method for
dynamic modeling of rotary inverted pendulum system. In: 2016 MNTMSim conference, vol
1. IEEE, Malaysia, pp 20–27 (2016)
4. Moreno-Valenzuela J, Aguilar-Avelar C (2018) Motion control of underactuated mechanical
systems. Springer, Cham
5. Hamza MF, Yap HJ, Choudhury IA (2015) Genetic algorithm and particle swarm
optimization based cascade interval type 2 fuzzy PD controller for rotary inverted pendulum
system. Math Probl Eng 12(2015):279–462
6. Hamza MF, Yap HJ, Choudhury IA (2017) Cuckoo search algorithm based design of interval
Type-2 Fuzzy PID Controller for Furuta pendulum system. Eng Appl Artif Intell 2(62):134–
151
7. Yang X, Zheng X (2018) Swing up and stabilization control design for an underactuated
rotary inverted pendulum system: theory and experiments. IEEE Trans Ind Electron
65(9):7229–7238
8. Fantoni I, Lozano R (2002) Stabilization of the Furuta pendulum around its homoclinic orbit.
Int J Control 6(75):390–398
9. Casanova V, Alcaína J, Salt J, Pizá R, Cuenca Á (2015) Control of the rotary inverted
pendulum through threshold-based communication. ISA Trans 1(62):357–366
10. Isa AI, Hamza MF (2014) Effect of sampling time on PID controller design for a heat
exchanger system. In: 6th international conference on adaptive science & technology. IEEE,
pp 1–8
11. Isa AI, Hamza MF, Zimit AY, Adamu JK (2018) Modelling and fuzzy control of ball and
beam system. In: 7th international conference on adaptive science & technology. IEEE, pp 1–
6 (2018)
12. Zimit AY, Yap HJ, Hamza MF, Siradjuddin I, Hendrik B, Herawan T (2018) Modelling and
experimental analysis two-wheeled self balance robot using PID controller. In: International
conference on computational science and its applications. Springer, pp 683–698 (2018)
13. Georgiadis MC, Macchietto S (2000) Dynamic modelling and simulation of plate heat
exchangers under milk fouling. Chem Eng Sci 9(55):1605–1619
14. Dhaouadi R, Hatab AA (2013) Dynamic modelling of differential-drive mobile robots using
lagrange and newton-euler methodologies: a unified framework. Adv Robot Autom 2(2):1–7
15. Hamza MF, Yap HJ, Choudhury IA, Isa AI, Zimit AY, Kumbasar T (2019) Current
development on using rotary inverted pendulum as a benchmark for testing linear and
nonlinear control algorithms. Mech Syst Sig Process 2(16):347–369
16. García-Alarcon O, Puga-Guzman S, Moreno-Valenzuela J (2012) On parameter identification
of the Furuta pendulum. Procedia Eng 1(35):77–84
17. Hamill P (2015) A student’s guide to Lagrangians and Hamiltonians. J Geom Symmetry Phys
2(37):101–105
18. Madrid JLD, Henao PO, Querubín EG (2017) Dynamic modeling and simulation of an
underactuated system. In: Journal of physics: conference series, vol 1, no 850. IOP
Publishing, p 012005
19. Li B (2013) Rotational double inverted pendulum. University of Dayton (2013)
20. Mandal AK (2006) Introduction to control engineering: modeling, analysis and design. New
Age International, New Delhi
Network-Based Cooperative
Synchronization Control of 3 Articulated
Robotic Arms for Industry 4.0
Application
Kam Wah Chan, Muhammad Nasiruddin Mahyuddin, and Bee Ee Khoo
Abstract This project presents a Controller Area Network (CAN) based cooperative
synchronization control of three articulated robotic arms for Industry 4.0
applications. Demand for multi-robot systems is increasing as a result of their
flexibility and ability to handle complex tasks, especially as our nation approaches
Industry 4.0. In this project, three robotic arms are commissioned to synchronize
with each other to perform a cooperative task. The cooperative setup employs a
multi-agent-inspired framework. A leader agent, which has full knowledge of the
desired trajectory signal, is assigned to one of the robotic arms, whereas the
follower agents have only partial information. CAN bus is used as the means of
communication between the three robotic arms because of its ease of configuration
and future extension. An intelligent cooperative phase-lead controller is designed,
developed and implemented to guarantee smooth synchronized motion of the robot
arms. An experimental frequency response approach is used to identify the
input-output model of each joint of each robot agent i. A discrete phase-lead
controller is designed from the transfer function obtained. The CAN bus network is
designed so that the slave robots receive the cooperative consensus error from each
other as an input signal. The distributed cooperative control robot system was
successfully developed, and the slave robots track the master robot successfully.
Keywords Cooperative control · Robotics · Control system · Phase-lead
compensator · Multi-robot · Distributed control
K. W. Chan · M. N. Mahyuddin (B) · B. E. Khoo
School of Electrical and Electronic Engineering, Universiti Sains Malaysia, 14300 Nibong Tebal,
Pulau Pinang, Malaysia
e-mail: nasiruddin@usm.my
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_30
1 Introduction
Efficiency, productivity, interconnectivity and the capability to handle complex
tasks seem to be the final targets of the industrial revolution. Industry 4.0 is
based on the technological concepts of cyber-physical systems and the Internet of
Things (IoT), which enable the Factory of the Future (FoF) [1, 2].
The nine pillars of technological advancement are autonomous robots, simulation,
big data and analytics, horizontal and vertical system integration, the industrial
internet of things, cybersecurity, the cloud, additive manufacturing, and augmented
reality [3]. Automated robots make companies more competitive, provide better
quality with lower requirements for post-processing and quality control, speed up
processing operations, decrease occupational injuries, and provide a better working
environment [4].
In order to manage uncertainties, machine operations have become more flexible
and more autonomous in handling their problems [5]. Cooperative control of robotic
arms is vital in manufacturing when it comes to complex tasks, such as assembling
two parts of a semi-finished product, in which the motions of the operating robot
arms differ from one another. A distributed architecture is natural for cooperative
control.
In a distributed architecture, the key aspect is how communication requests are
handled [6]. The communication concern is focused on suggesting solutions that
enable data exchange between the internal elements of the system [11]. The CAN bus
protocol allows distributed network-based control of robot arms with higher
efficiency and robustness, as well as reduced system complexity [13].
With network-based control, the robot arms work together to complete a cooperative
task [14–18]. The six main requirements discussed were modularity, interoperability,
decentralization, virtualization, service orientation, and responsiveness [7].
Haddara and Elragal emphasized the need for machine-to-machine communication to
ensure the effectiveness and the objectives of the smart factory as promoted in
Industry 4.0 [2].
Multi-robot systems (MRSs) have been widely investigated in recent years due to
their appealing characteristics in terms of flexibility, redundancy, fault
tolerance, and the possibility they offer for using distributed sensing and
actuation [8, 19–37]. One application uses a control procedure and a two-level
control algorithm to solve the control problem of a cooperating multi-arm robotic
system, such as a gripper with n fingers manipulating an ordinary object [9]. A
decentralised control system that requires no communication between robots has been
applied in a collaborative controller for a team of mobile manipulators designed to
transport a rigid workpiece to a desired position and orientation [10].
In a distributed system, the implementation of network-based communication is
easier than the implementation of a pure sensor system, and the communication cost
is reduced as well [11]. A distributed controller–observer scheme with first-order
dynamics for tracking control of the centroid and of the relative formation of a
multi-robot system has been implemented and can potentially be used as a bridge to
the solution of the tracking problem with additional control objectives, including
complex tasks such as exploration and deployment [8].
Kocan et al. implemented CAN bus on the L601-KT robotic arm with six degrees
of freedom [12]. A plug-in architecture robot platform [13] was designed using
STM32 series chips as the microcontroller and CAN bus as the communication medium.
Multi-robot systems have been extensively studied in the past decade due to the
flexibility and redundancy they offer, providing a viable solution for complex
tasks. The use of communication devices instead of a pure sensor system brings
distinct advantages in both cost and system complexity. The use of CAN bus provides
higher reliability in terms of robustness and ease of implementation.
2 System Setup
Figure 1 shows the system setup in the lab used to demonstrate a cooperative task
carried out by three articulated robot arms. Each robot joint is controlled by the
cooperative control algorithm to be designed, as shown in the block diagram in
Fig. 2. The joint angles of each robot arm are passed among the robot agents for
control purposes, according to the CAN bus communication topology.
2.1 Robot Arm Joint Model
A frequency response experiment is carried out to estimate the transfer function of the
system. The experiment observes the output response, in terms of the angular position
of the robot actuator (a DC motor in this case), when sinusoidal voltage signals of
varying frequencies are fed into the motor. The sinusoidal voltage input ν(t), varying
with time t, is described by
\nu(t) = A \sin(\omega t)    (1)
where A is the voltage amplitude and ω is the frequency in radians per second.
The conventional frequency response method was employed for each robot joint
to obtain the input-output model. It is assumed that each robot arm link imposes
minimal coupling effects on its connected linkages. Therefore, it is permissible to
model each joint in the linear form (2), under the assumption that no abrupt motion
or demanding joint acceleration is commanded in this work.
Fig. 1 The system setup (top view) for 3 DOF articulated robot arms commissioned for a cooperative task
Fig. 2 Block diagram showing the cooperative control scheme
f(j\omega) = \frac{A_k}{j\omega (j\omega + k)}    (2)
where A_k is the system gain and k is the system pole. Such a transfer function is deemed
suitable when an input-output relationship relating the angular position to the
voltage input of a DC motor is desired.
The experimental result for one robot arm joint is recorded in Table 1.
From the data tabulated in Table 1, a Bode plot (shown in Fig. 3) is drawn to
conclude the frequency response experiment.
From the Bode plot in Fig. 3, we may identify the uncompensated system DC gain
as in (3) and (4).
20 \log K = -27.2984 dB    (3)
\rightarrow K = 10^{-27.2984/20} = 0.04316    (4)
and from 0.8 = 10ω_p, the first-order pole is calculated to be ω_p = 0.08 rad/s, yielding the
following transfer function for one of the joint angles,
G(j\omega) = \frac{\theta_i(j\omega)}{\nu_i(j\omega)} = \frac{0.04316}{j\omega (j\omega + 0.08)}    (5)
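As a quick numerical check of the identified model, the sketch below evaluates the magnitude and phase of (5) at a chosen frequency. The function name `joint_response` is illustrative and not part of the original work; only the gain 0.04316 and the pole 0.08 rad/s are taken from the text.

```python
import cmath
import math

K = 0.04316   # gain identified in (4)
POLE = 0.08   # first-order pole in rad/s, from the text

def joint_response(omega):
    """Return (magnitude in dB, phase in degrees) of G(jw) as written in (5)."""
    s = 1j * omega
    g = K / (s * (s + POLE))
    return 20.0 * math.log10(abs(g)), math.degrees(cmath.phase(g))

mag_db, phase_deg = joint_response(POLE)
print(f"at {POLE} rad/s: {mag_db:.2f} dB, {phase_deg:.1f} deg")
```

Evaluating the model over the frequencies in Table 1 gives one way to compare the fitted transfer function against the measured data.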
Table 1 Frequency response data for one of the robot arm agent i's link.
Freq (rad/s) | νi (V) | θi (°) | 20log(θi/νi) (dB) | t (s) | T (s) | φ (°)
0.03 | 25.5 | 468 | 5.274 | −61 | 209 | −105.071
0.04 | 25.5 | 338 | 2.447 | −45 | 157 | −103.184
0.05 | 25.5 | 299 | 1.382 | −39 | 126 | −111.428
0.06 | 25.5 | 238 | −0.599 | −30.75 | 105 | −105.428
0.07 | 25.5 | 207 | −1.811 | −25 | 90 | −100
0.08 | 25.5 | 186 | −2.740 | −22.75 | 79 | −103.670
0.09 | 25.5 | 151 | −4.551 | −20 | 70 | −102.857
0.10 | 25.5 | 132 | −5.719 | −19 | 63 | −108.571
0.11 | 25.5 | 121 | −6.470 | −16 | 57 | −101.052
0.12 | 25.5 | 111 | −7.224 | −15.2 | 52 | −105.230
0.13 | 25.5 | 99 | −8.218 | −14.25 | 48 | −106.875
0.15 | 25.5 | 95 | −8.576 | −13 | 42 | −111.428
0.17 | 25.5 | 78 | −10.288 | −11 | 37 | −107.027
0.19 | 25.5 | 74 | −10.746 | −10 | 33 | −109.090
0.20 | 25.5 | 67 | −11.609 | −9.5 | 31.5 | −108.571
0.25 | 25.5 | 57 | −13.013 | −8 | 25 | −115.2
0.30 | 25.5 | 44 | −15.261 | −7 | 21 | −120
0.35 | 25.5 | 37 | −16.766 | −6.6 | 18 | −132
0.40 | 25.5 | 32 | −18.027 | −6 | 16 | −135
0.50 | 25.5 | 26 | −19.831 | −4.5 | 13 | −124.615
0.60 | 25.5 | 21 | −21.686 | −4 | 10.5 | −137.142
0.70 | 25.5 | 18 | −23.025 | −4 | 9 | −160
0.80 | 25.5 | 16 | −24.048 | −4 | 8 | −180
0.90 | 25.5 | 12 | −26.547 | −3.5 | 7 | −180
1.00 | 25.5 | 11 | −27.302 | −3 | 6 | −180
1.10 | 25.5 | 9 | −29.045 | −3 | 6 | −180
1.30 | 25.5 | 9 | −29.045 | −2.5 | 5 | −180
1.50 | 25.5 | 5 | −34.151 | −2 | 4 | −180
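The phase column of Table 1 appears consistent with converting the measured time lag t into a phase angle using the period T, that is, φ = 360° · t / T. This relation is inferred from the tabulated values rather than stated in the text, and the helper name below is illustrative.

```python
def phase_from_delay(t_lag, period):
    """Phase angle in degrees for a time lag t_lag (s) within one period (s)."""
    return 360.0 * t_lag / period

# First and fourth rows of Table 1
print(round(phase_from_delay(-61, 209), 3))     # -105.072 (tabulated: -105.071)
print(round(phase_from_delay(-30.75, 105), 3))  # -105.429 (tabulated: -105.428)
```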
3 Cooperative Control Design
3.1 Consensus Error
The cooperative error signal, which measures the difference between the joint angles of
neighbouring robot arms, can be written in a consensus-like formulation (inspired
by multi-agent theory),
e_i = \sum_{j=1}^{N} a_{ij} (\theta_j - \theta_i) + b_i (\theta_0 - \theta_i)    (6)
where e_i = [e_{i1} e_{i2} · · · e_{in}]^T ∈ R^n is the consensus error vector for each agent i with
n degrees of freedom, and a_{ij} is the element of the adjacency matrix A ∈ R^{N×N}, i.e.
the matrix (from graph theory) that describes how the 3 robot agents are connected
to each other in their communication link through the CAN bus. b_i is the pinning gain
that weights agent i's direct link to the leader, θ_i ∈ R^n is the angular
position of the current agent i, θ_j is the angular position of the neighbouring agent
j and θ_0 is the leader agent's angular position.
Fig. 3 Bode plot representing frequency response for joint i
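To make (6) concrete, the following minimal sketch computes the consensus error for two follower agents and one leader. The chain topology (leader to agent 0, agent 0 to agent 1), the gains and the joint angles are hypothetical values chosen for illustration; they are not taken from the experiment.

```python
def consensus_error(i, theta, theta0, A, b):
    """Consensus error (6) for agent i.

    theta  : list of joint-angle vectors, one per follower agent
    theta0 : joint-angle vector of the leader
    A      : adjacency matrix, A[i][j] = a_ij
    b      : leader (pinning) gains, b[i] = b_i
    """
    n = len(theta0)
    e = [0.0] * n
    for j, theta_j in enumerate(theta):
        for k in range(n):
            e[k] += A[i][j] * (theta_j[k] - theta[i][k])
    for k in range(n):
        e[k] += b[i] * (theta0[k] - theta[i][k])
    return e

# Hypothetical chain topology: leader -> agent 0 -> agent 1
A = [[0, 0],
     [1, 0]]
b = [1, 0]
theta0 = [30.0, 45.0, 10.0]                        # leader joint angles (deg)
theta = [[28.0, 44.0, 10.0], [25.0, 40.0, 12.0]]   # follower joint angles (deg)

print(consensus_error(0, theta, theta0, A, b))  # [2.0, 1.0, 0.0]
print(consensus_error(1, theta, theta0, A, b))  # [3.0, 4.0, -2.0]
```

In this sketch agent 0 sees only the leader, while agent 1 sees only agent 0, so any measurement error at agent 0 propagates into the error signal of agent 1.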
3.2 Cooperative Control
The cooperative control algorithm feeds the consensus error (which contains information
about the neighbouring agents' states) into the discrete phase-lead controller, giving
the following expression,
u_{coop_i} = D_{lead}(j\omega) I_n e_i    (7)
where I_n ∈ R^{n×n} is the identity matrix and u_{coop_i} is the control input, in the form of
the voltage ν_i, to each robot agent. It is assumed that the coupling between joints is
minimal provided that there is no sudden abrupt motion.
3.3 Discrete Phase Lead Controller Algorithm
The discrete phase-lead compensator is designed and implemented to minimise the
response time and maximise the effect of synchronisation based on the transfer function
obtained from the frequency response experiment. The phase-lead compensator's
transfer function is given in the form of
D_{lead}(j\omega) = \frac{1}{\sqrt{\beta}} \frac{j\omega + \omega_l}{j\omega + \omega_h}    (8)
where ω_l and ω_h are, respectively, the lower and higher break frequencies of the
controller to be designed, and β is a controller gain coefficient to be designed to
satisfy the performance criteria.
Since the system output is in discrete form, the phase-lead compensator
is transformed into discrete form, and long division is applied to the resulting
compensator so that it can be implemented in Arduino code.
Recalling the transfer function obtained in (5), we may identify the natural bandwidth
of the hardware joint system by observing the frequency at which the magnitude is
−6 dB in the open-loop plot shown in Fig. 3.
The first step in the design of the phase-lead controller is to satisfy the steady-state
performance requirement by increasing the system gain to 1. A Bode plot is drawn
again (see Fig. 4) for this adjusted transfer function, which satisfies the steady-state
performance requirement.
From Fig. 4, the phase margin is observed to be
\phi_{M_{uncomp}} = 4.5812^\circ    (9)
The observed phase margin is too low and requires compensation, which can be
achieved by the discrete phase-lead controller. The additional phase-lead contribution
φ_{M_lead} and the coefficient β are calculated accordingly,
\phi_{M_{lead}} = \phi_{M_{comp}} - \phi_{M_{uncomp}} + \phi_{cor} = 50.4188^\circ    (10)
\phi_{M_{lead}} = \sin^{-1}\left(\frac{1 - \beta}{1 + \beta}\right) \rightarrow \beta = \frac{1 - 0.7117}{1 + 0.7117} = 0.1684    (11)
where φ_cor is the phase correction factor in the range of 5°–12°. The compensator's
magnitude contribution can be computed at the peak of the phase curve,
|G_{lead}(j\omega_{max})| = \frac{1}{\sqrt{\beta}} = 2.437 \; (\approx 7.74 dB)    (12)
From the negative of |G_lead(jω_max)|, we may determine the new gain crossover
frequency, ω_max, from the Bode plot,
\omega_{max} = \frac{\omega_l}{\sqrt{\beta}} = 1.145 rad/s    (13)
\rightarrow \omega_l = \omega_{max} \sqrt{\beta} = 0.47 rad/s    (14)
\rightarrow \omega_h = \frac{\omega_l}{\beta} = 2.791 rad/s    (15)
Fig. 4 Normalised bode plot for one of the joint of robot agent i
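The arithmetic in (11) to (15) can be retraced with a few lines of code. This is only a sketch: the sine value 0.7117 and the crossover frequency 1.145 rad/s are taken from the text as given, and the variable names are illustrative.

```python
import math

sin_phi = 0.7117    # sine value used in (11)
omega_max = 1.145   # new gain crossover frequency in rad/s, read from the Bode plot

beta = (1.0 - sin_phi) / (1.0 + sin_phi)      # (11)
peak_gain = 1.0 / math.sqrt(beta)             # (12), as a ratio
peak_gain_db = 20.0 * math.log10(peak_gain)   # the same gain expressed in dB
omega_l = omega_max * math.sqrt(beta)         # (14)
omega_h = omega_l / beta                      # (15)

print(f"beta = {beta:.4f}, peak gain = {peak_gain:.3f} ({peak_gain_db:.2f} dB)")
print(f"omega_l = {omega_l:.2f} rad/s, omega_h = {omega_h:.2f} rad/s")
```

Small differences in the last digit relative to the printed values arise because the text rounds ω_l to 0.47 rad/s before computing ω_h.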
Consequently, the phase-lead compensator can be written as
D_{lead} = \frac{2.1277\omega + 1}{0.3593\omega + 1}    (16)
The sampling time specified at the microcontroller (adhering to the Nyquist sampling
theorem) is T_s = 0.01 s. Applying the bilinear transformation of the form
\omega = \frac{T_s}{2} \frac{(z - 1)}{(z + 1)}    (17)
to discretise the controller for the purpose of hardware implementation, we
arrive at the following discrete version of the phase-lead compensator,
D_{lead}(z) = \frac{1.00883z + 0.98758}{z + 0.99642}    (18)
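The substitution of (17) into (16) can be checked numerically. The sketch below follows the mapping exactly as written in (17), which reproduces the coefficients of (18) up to rounding; for reference, the bilinear (Tustin) transform is more commonly stated as s = (2/T_s)(z − 1)/(z + 1). The function name `discretise_lead` is illustrative.

```python
def discretise_lead(a, b, Ts):
    """Substitute w = (Ts/2)(z-1)/(z+1), as written in (17), into (a*w + 1)/(b*w + 1).

    Returns (n1, n0, d0) such that D(z) = (n1*z + n0) / (z + d0).
    """
    c = Ts / 2.0
    num = (1.0 + a * c, 1.0 - a * c)   # coefficients of z^1 and z^0 in the numerator
    den = (1.0 + b * c, 1.0 - b * c)   # coefficients of z^1 and z^0 in the denominator
    # Normalise so that the leading denominator coefficient is 1
    return num[0] / den[0], num[1] / den[0], den[1] / den[0]

n1, n0, d0 = discretise_lead(2.1277, 0.3593, 0.01)
print(f"D(z) = ({n1:.5f} z + {n0:.5f}) / (z + {d0:.5f})")
```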
The designed phase-lead compensator in (18) can be coded in the microcontroller
or digital signal processor by simply performing a long division to establish the
corresponding difference equation,
D_{lead}(z) = 1.00883 - 0.01764 z^{-1} + 0.01758 z^{-2} - 0.017515 z^{-3} + \dots    (19)
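The long division leading to (19) and the equivalent recursive update can be sketched as follows. The function names are illustrative, and the original Arduino implementation is not shown in the paper, so this is an assumed form rather than the authors' code.

```python
def series_coefficients(n1, n0, d0, terms=4):
    """Coefficients of z^0, z^-1, ... from dividing (n1 + n0 z^-1) by (1 + d0 z^-1)."""
    coeffs = [n1, n0 - d0 * n1]
    while len(coeffs) < terms:
        coeffs.append(-d0 * coeffs[-1])
    return coeffs[:terms]

def lead_step(e_k, e_prev, u_prev, n1=1.00883, n0=0.98758, d0=0.99642):
    """One update of u[k] = -d0*u[k-1] + n1*e[k] + n0*e[k-1], equivalent to (18)."""
    return -d0 * u_prev + n1 * e_k + n0 * e_prev

print([round(c, 5) for c in series_coefficients(1.00883, 0.98758, 0.99642)])
```

The recursive form needs only the previous error and previous output, so it stores two values per joint instead of a truncated series of past errors.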
Remark. For the sake of brevity, the control design and analysis are shown here for only
one of the robot joints. It should be noted that, in practice, although all the robot agents
commissioned are identical in terms of kinematic configuration, the actuators and
feedback sensors exhibit entirely different characteristics due to wear and tear.
The frequency responses of the other joints also differ from that shown in Table 1
in magnitude and phase. Certain joints were observed to operate
in a narrower joint angle band. Backlash characteristics were also observed to differ
from one robot agent platform to another. However, the principal approach
to system modelling and control design elucidated in this paper remains
valid for other types of robotic arm. Future work employing more advanced
nonlinear control design techniques and enhanced robotics instrumentation is
feasible.
4 Results
With a low baud rate, the current angular position was sent from the leader robot
agent to the neighbouring robot agents more slowly than with a high baud rate, and the
slave response was correspondingly slower. To maximise the real-time synchronisation
effect, the maximum baud rate (1000 kbps) was selected for the CAN bus.
As observed from Figs. 5 and 6, despite the inherent noise in the aged encoder signal,
all the robot agents (i = 0, 1, 2) are able to track the desired trajectory satisfactorily.
Robot 1 read its own sensor feedback, while robot 2 took its input from the sensor
feedback of robot 1 through the CAN bus. Thus, the errors due to sensor feedback were
also amplified, affecting the tracking performance of robot 2. A hardware or software
filter is recommended to eliminate the noise, since the input to each slave depends on
the sensor feedback of each robot. As mentioned, the phase-lead controller is a linear
controller, which poses some limitations in handling nonlinearities such as backlash and
saturation. It should be noted that in a practical system (Fig. 7) such nonlinearities do
exist, and it depends entirely on the knowledge and experience of the control engineer
to formulate a suitable compensator to overcome them. Actuator saturation may exist
and can be overcome by means of dead-zone compensation. Nonlinearities exist across
all the link members and can be resolved by nonlinear feedback compensation that takes
into account the robot arm mass inertia, the Coriolis/centrifugal effect and the gravity
effect.
Fig. 5 Preliminary evaluation on the selection of CAN bus communication bandwidth
Fig. 6 Hardware setup of the cooperative controlled of 3 articulated robotic arms.
Fig. 7 Consensus tracking error for joint angle k all robot agents
5 Conclusion
The network-based cooperative synchronization control has been developed for a
cooperative task between the master, slave 1 and slave 2 robot arms. The frequency
response approach was successfully applied to obtain the system transfer function.
The phase-lead compensator was chosen as the most suitable controller for the
synchronization task because it improves the transient response with only a small
change in steady-state error, although it also tends to amplify high-frequency noise.
The synchronization control was validated through performance analysis using data
logged to Excel using PLX-DAQ. The implementation of the CAN bus as the communication
medium allows information exchange between the robot arms, which send and receive
signals to and from each other through the two CAN bus nodes. By using the error
equation developed from the communication network, the cooperative task is performed.
In conclusion, the objectives of this project have been achieved. The outcomes of this
project show that distributed network-based cooperative control using phase-lead
compensation has a considerable effect on synchronization control between
robot arms.
References
1. Gereald M, Peter Z (2017) Industrial Robots Meet Industry 4.0, Hadmérnök (XII) IV, pp 230–
238
2. Haddara M, Elragal A (2015) The readiness of ERP systems for the factory of the future.
Procedia Comput Sci 6(64):721–728
3. Hecker M, Howe K, Russo M, Küpper D, Spindelndreier D, Whiteman S, Zinser M (2015)
Industry 4.0: The Future of Productivity and Growth in Manufacturing Industries
4. Vysocky A, Novak P (2016) Human-robot collaboration in industry. Sci J 9:903–906
5. Anussornnitisarn P, Nof SY, Etzion O (2005) Decentralized control of cooperative and
autonomous agents for solving the distributed resource allocation problem. Int J Prod Econ
98(2):114–128
6. Migliavacca M, Bonarini A, Matteucci M (2013) RTCAN: a real-time CAN-bus protocol for
robotic applications. In: Proceedings of the 10th international conference on informatics in
control, automation and robotics, ICINCO 2013, vol 2, pp 353–360
7. Mabkhot MM, Al-Ahmari AM, Salah B, Alkhalefah H (2018) Requirements of the smart
factory system: a survey and perspective. Machines 6:23
8. Antonelli G, Arrichiello F, Caccavale F (2014) Decentralized time-varying formation control
for multi-robot. Int J Robot Res 33(7):1029–1043
9. Stoian V, Bobasu E (2015) Control algorithm for a cooperative robotic system in fault conditions. In: 2015 12th international conference on informatics in control, automation and robotics
(ICINCO), Colmar, France
10. He Y, Wu M, Liu S (2018) Decentralised cooperative mobile manipulation with adaptive
control parameters. In: 2018 IEEE conference on control technology and applications (CCTA),
Denmark, Copenhagen
11. Lu M, Liu L (2018) Adaptive leader-following consensus of networked uncertain Euler-Lagrange systems with dynamic leader based on sensory feedback. In: 2018 15th international
conference on control, automation, robotics and vision (ICARCV), Singapore, Singapore
12. Kocian J, Skovajsa L, Vojcinak P, Kotzian J (2009) Robotic arm controlled by CAN bus. In:
9th IFAC workshop on programmable devices and embedded systems, pp 92–95
13. Lin Z, Wang T, Gao Q, Liu Y (2011) Design of robot platform based on CAN bus. In: 2011
international conference on electrical and control engineering
14. Marino A (2018) Distributed adaptive control of networked cooperative mobile manipulators.
IEEE Trans Control Syst Technol 26(5):1646–1660
15. Khan SG, Bendoukha S, Mahyuddin MN (2018) Dynamic control for human-humanoid interaction. In: Humanoid robotics: a reference. Springer, Heidelberg, pp 1–29
16. Mahyuddin MN, Herrmann G (2013) Distributed motion synchronisation control of humanoid
arms. In: 2013 FIRA RoboWorld congress. Springer, Heidelberg, pp 21–35
17. Mahyuddin MN, Herrmann G, Lewis FL (2013) Distributed adaptive leader-following control
for multi-agent multi-degree manipulators with finite-time guarantees. In: 52nd IEEE conference on decision and control, Florence, pp 1496–1501
18. Mahyuddin MN, Herrmann G (2013) Cooperative robot manipulator control with human ‘pinning’ for robot assistive task execution. In: Herrmann G, Pearson MJ, Lenz A, Bremner P,
Spiers A, Leonards U (eds) Social robotics, ICSR 2013. LNCS, vol 8239. Springer, Cham
19. Zhang HW, Lewis FL (2012) Adaptive cooperative tracking control of higher-order nonlinear
systems with unknown dynamics. Automatica 48(7):1432–1439
20. Peng Z, Wang D, Sun G, Wang H (2014) Distributed cooperative stabilisation of continuous
time uncertain nonlinear multi-agent systems. Int J Syst Sci 45(10):2031–2041
21. Wang W, Wang D, Peng ZH (2015) Cooperative fuzzy adaptive output feedback control
for synchronisation of nonlinear multi-agent systems under directed graphs. Int J Syst Sci
46(16):2982–2995
22. Wang J, Chen K, Lewis FL (2017) Coordination of multi-agent systems on interacting physical
and communication topologies. Syst Control Lett 100:56–65
23. Lewis FL, Zhang H, Hengster-Movric K, Das A (2014) Cooperative control of multi-agent
systems. Springer, London
24. Jiao Q, Modares H, Lewis FL, Xu S, Xie L (2016) Distributed L2-gain output-feedback control
of homogeneous and heterogeneous systems. Automatica 71:361–368
25. Roman RC, Radac MB, Precup R-E (2016) Multi-input-multi-output system experimental
validation of model-free control and virtual reference feedback tuning techniques. IET Control
Theory Appl 10(12):1395–1403
26. Safaei A, Koo YC, Mahyuddin MN (2017) Adaptive model-free control for robotic manipulators. In: Proceedings of the IEEE international symposium on robotics and intelligent sensors
(IRIS2017), Ottawa, Canada, October 2017, pp 7–12
27. Safaei A, Mahyuddin MN (2018) Adaptive model-free control based on an ultra-local model
with model-free parameter estimations for a generic SISO system. IEEE Access 6:4266–4275
28. Safaei A, Mahyuddin MN (2018) Optimal model-free control for a generic MIMO nonlinear
system with application to autonomous mobile robots. Int J Adapt Control Signal Process
29. Cai H, Lewis FL, Hu G, Huang J (2017) The adaptive distributed observer approach to the
cooperative output regulation of linear multi-agent systems. Automatica 75:299–305
30. Modares H, Nageshrao SP, Delgado Lopes GA, Babuska R, Lewis FL (2016) Optimal model-free output synchronization of heterogeneous systems using off-policy reinforcement learning.
Automatica 71:334–341
31. Peng ZH, Wang D, Sun G, Wang H (2014) Distributed cooperative stabilisation of continuous
time uncertain nonlinear multi-agent systems. Int J Syst Sci 45(10):2031–2041
32. Mahyuddin MN, Herrmann G, Lewis FL (2013) Distributed adaptive leader-following control
for multi-agent multi-degree manipulators with finite-time guarantees. In: 2013 IEEE 52nd
conference on decision and control (CDC2013), Florence, Italy, pp 1496–1501
33. Mahyuddin MN, Herrmann G, Na J, Lewis FL (2012) Finite-time adaptive distributed control
for double integrator leader-agent synchronisation. In: 2012 IEEE international symposium on
intelligent control (ISIC), Dubrovnik, Croatia, pp 714–720
34. Mahyuddin MN, Safaei A (2017) Robust adaptive cooperative control for formation-tracking
problem in a network of non-affine nonlinear agents. In: Rocha J (ed) Multi-agent systems.
InTech
35. Safaei A, Mahyuddin MN (2017) Adaptive model-free consensus control for a network of
nonlinear agents under the presence of measurement noise. In: Asian control conference
(ASCC2017), Gold Coast, Australia, December 2017, pp 1701–1706
36. Li Z, Duan Z (2015) Cooperative control of multi-agent systems. CRC Press/Taylor and Francis
Group, Boca Raton
37. Safaei A, Mahyuddin MN (2017) An optimal adaptive model-free control with a Kalman-filter-based observer for a generic nonlinear MIMO system. In: Proceedings of the 2017 IEEE 2nd
international conference on automatic control and intelligent systems (I2CACIS 2017), Kota
Kinabalu, Malaysia, October 2017, pp 56–61
EEG Signal Denoising Using Hybridizing
Method Between Wavelet Transform
with Genetic Algorithm
Zaid Abdi Alkareem Alyasseri, Ahamad Tajudin Khader,
Mohammed Azmi Al-Betar, Ammar Kamal Abasi,
and Sharif Naser Makhadmeh
Abstract The wavelet transform (WT) is the most common and successful technique for
denoising non-stationary signals, such as the electroencephalogram (EEG) and
electrocardiogram (ECG). The success of WT depends on the optimal configuration of its
control parameters, which are often set experimentally. Fortunately, the optimality of
a combination of these parameters can be measured in advance by using the mean squared
error (MSE) function. In this paper, a genetic algorithm (GA) is proposed to find the
optimal WT parameters for EEG signal denoising. It is worth mentioning that this is an
initial investigation of using an optimization method for WT parameter configuration.
The paper then examines which algorithm obtains the minimum MSE and the best WT
parameter configuration. The performance of the proposed algorithm is tested using the
standard EEG Motor Movement/Imagery dataset. The results of the proposed algorithm are
evaluated using five common criteria: signal-to-noise ratio (SNR), SNR improvement,
mean square error (MSE), root mean square error (RMSE), and percentage root mean square
difference (PRD). In conclusion, the results show that the proposed method for EEG
signal denoising can produce better results than manual configurations based on an
ad hoc strategy. Therefore, using metaheuristic approaches to optimize the parameters
for EEG signals positively affects the denoising performance of the WT method.
Keywords EEG · Signal denoising · Wavelet transform · Metaheuristic
algorithms · Genetic algorithm
Z. A. A. Alyasseri (B) · A. T. Khader · A. K. Abasi · S. N. Makhadmeh
School of Computer Sciences, Universiti Sains Malaysia, Gelugor, Pulau Pinang, Malaysia
e-mail: zaid.alyasseri@uokufa.edu.iq
Z. A. A. Alyasseri
ECE Department, Faculty of Engineering, University of Kufa, Najaf, Iraq
M. A. Al-Betar
IT Department, Al-Huson University College, Al-Balqa Applied University, Irbid, Jordan
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_31
1 Introduction
The electroencephalogram (EEG) is a graphical recording of the brain's electrical
activity measured from the scalp. This recording represents the voltage fluctuations
resulting from ionic current flows within the neurons of the brain [1, 2]. Therefore, EEG
signals can provide most of the required information about brain activity. Signals
from the brain are captured using invasive or non-invasive techniques [3]. The main
difference between these techniques is that the invasive approach involves the use of
electrode arrays implanted inside the brain, such as ECoG BCI for arm movement
control [4, 5]. Meanwhile, brain activity can also be captured non-invasively using
different types of signal capturing devices, including EEG
for electrical activity from the scalp, MEG for magnetic field fluctuations caused by
electrical activity in the brain, and fMRI and fNIR for changes in blood oxygenation
level resulting from neural activity [4, 6, 7]. In [8], Berger proposed for the first time
the use of EEG signals as a non-invasive technique for capturing brain activities. Over
the past several decades, researchers have developed Berger's technique to suit multiple
applications. For instance, EEG signals have been used in medical applications for
prevention, detection, diagnosis, rehabilitation and restoration. This technique has
also been used for non-medical applications, such as education and self-regulation,
neuromarketing and advertisement, neuroergonomics and smart environment, games
and entertainment, and learning and education [9, 10]. Recently, EEG signals have
been used as a new biometric technique in security and authentication applications
[1, 9].
In general, several artifact noises can corrupt the original EEG signal during its
recording time, such as eye blink, eye movements, muscle activity, and interference
of electronic device signals [11]. Therefore, the EEG signal must be processed to
reduce such noise. Several EEG noise removal techniques have been proposed in the
literature, such as filtering and adaptive thresholding. Recently, wavelet transform
(WT) has been successfully applied for denoising non-stationary signals, including
ECG and EEG [12–16].
Kumari et al. [1] proposed a user identification system on the basis of EEG
signals collected from six users using an Emotiv EPOC headset with 14 channels.
These researchers used the wavelet transform (WT) for EEG signal denoising, where a
db4 mother wavelet function (MWF) is used with five levels of signal decomposition.
They tested their method using the EEG dataset established in [17]. Afterwards, the
same authors investigated several cognitive tasks to design an individual
identification system [18]. These researchers used standard EEG datasets related to
motor/movement and imaginary tasks [19] with only one channel (i.e. Cz) to obtain
an input signal. In addition, the authors used WT to decompose the EEG signal into
five levels and then extract four features from each EEG sub-band. Al-Qazzaz et
al. [13, 20] conducted a comparative study to determine the efficient MWFs that
can provide high signal characteristics for an EEG channel. These authors tested 45
MWFs that are categorized into the Daubechies, Symlets and Coiflets families. An MWF
called 'sym9' showed efficient results in nearly all brain regions. The same team of
researchers applied WT with independent component analysis to decompose the
EEG signals for obtaining an efficient feature for discriminating stroke-related mild
cognitive impairment and vascular dementia [21]. Reddy et al. [22] proposed WT
for processing the EEG signal. These authors applied WT to EEG signal denoising
and used db8 as an MWF with eight EEG signal decomposition levels. Furthermore,
these authors classified the EEG signal on the basis of the features that are extracted
from the WT signal denoising process [23].
Mowla et al. [24] introduced a new method for removing EMG and electrooculogram (EOG)
artifacts from the original EEG signal. The proposed method used two
scenarios for removing these artifacts. In the first scenario, the EMG artifacts were
processed using a combination method in which the EEG signal was first processed
using canonical correlation analysis, and the output signal was then reprocessed
by a stationary WT (SWT). In the second scenario, a second-order blind identification
approach followed by SWT was used for removing EOG artifacts. The results of the
proposed method showed that combining the techniques provided more effective results
than using each technique individually.
Yang et al. [25] proposed a method for removing EOG artifacts from raw EEG.
The proposed method (CCA-EEMD) involves three steps. In
the first step, the input EEG signal is processed using CCA to isolate the EOG. In the
second step, the EOG is decomposed into multiple levels of intrinsic mode
functions (IMFs) using the EEMD approach. Finally, the clean EEG data are ready for use
and for further feature extraction. CCA-EEMD was tested using seven subjects. The results
show that CCA-EEMD not only removes EOG but also preserves the EEG features to the
maximum extent. Torabi et al. [26] introduced a method that combines nonlinear EEG
features with wavelet coefficients to improve the recognition rate in classification.
The proposed method applied a linear SVM classifier, and the combined technique
showed a significant improvement in the classification results from 54% to 73%.
Furthermore, the proposed method was also applied to feature selection for the
same problem, where it selected up to 44% of the nonlinear features.
Several techniques have been proposed for EEG feature extraction. A comprehensive
analysis and review of EEG decomposition methods for feature extraction has
been presented in [27]. For example, Wang et al. [28] introduced a new method for EEG
feature extraction using spatiotemporal analysis with multivariate linear regression
to improve the detection accuracy of SSVEP features. Zhang et al. [29] proposed a
new algorithm for EEG feature extraction on the basis of common spatial pattern
with motor imagery classification. The proposed method used boost classification to
improve the accuracy rate of the MI EEG. The proposed method was tested using
three public EEG datasets from the BCI competition. The performance of the TSGSP
reached 88.5% for these datasets. Jiao et al. [30] proposed a new technique (SGRM)
for EEG classification that is based on reducing the number of training samples for
EEG data by implementing a new representation for the non-zero coefficient samples.
For EEG classification, Zhang et al. [31] proposed a combination of classification
methods based on sparse Bayesian learning and Laplace priors.
In general, WT has five parameters, each of which can take different values
(Table 1). The efficiency of EEG signal denoising depends on the selection of the
best combination of WT parameters. The selection is usually performed based on
experience or empirical evidence. In previous research, the WT parameter configuration
was formulated as an optimization problem with MSE as its objective function
[15]. As mentioned above, the five WT parameters are (i) MWF Φ, (ii) decomposition
level L, (iii) thresholding function β, (iv) threshold selection rule λ, and
(v) threshold re-scaling method ρ. Each of these parameters has several values and
is used for a specific denoising level. The optimal values of these parameters are
required to empower WT in the denoising process. For ECG signals, El-Dahshan
[12] attempted to obtain the optimal configuration using GA, and the results were better
than those produced experimentally. Alyasseri et al. [14, 32] proposed
a hybrid scheme for denoising non-stationary signals, such as ECG and EEG, that
is based on the β-hill climbing (βhc) optimization algorithm [33] with WT to obtain
the optimal wavelet parameters. The proposed method (βhc-WT) was tested using
the MIT-BIH dataset [34], where the original ECG signal was corrupted with white
Gaussian noise (WGN) at input SNR levels from 0 to 40 dB. The performance of the
βhc-WT method was evaluated using mean squared error (MSE) and SNR. The proposed
method successfully removed WGN from the ECG and EEG signals [14–16, 32].
The main objective of this paper is to propose a genetic algorithm (GA) for the optimal
setting of WT parameters. Therefore, a new GA version of WT, called GA-WT,
is tested in an experiment. The original EEG signal benchmark taken from the Motor
Movement/Imagery dataset¹ is used for the evaluation process [19]. To evaluate
the performance of the GA, EEG signals are corrupted using three different noise
mechanisms, namely power line noise (PLN), electromyogram (EMG), and white
Gaussian noise (WGN) [12, 35, 36]. Initially, the GA generates optimal parameter
settings for WT to denoise the EEG signal of each dataset. Afterward, the denoised
results are evaluated using five measurement factors, namely, SNR, SNR improvement,
MSE, RMSE, and PRD. For comparative evaluation, the denoising results of
the GA method. Interestingly, FPA-WT achieves efficient EEG signal denoising for
EMG and WGN datasets. In addition, FPA-WT and GA-WT obtain the best denoising
levels for PLN dataset. In conclusion, FPA is the best algorithm that can be
incorporated with WT to achieve an efficient EEG signal denoising.
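The five evaluation criteria can be sketched in code as below. The formulas are the commonly used definitions of these measures and are an assumption here, since this part of the paper does not spell them out; the function name and the toy signals are illustrative. The sketch also assumes the noisy and denoised signals differ from the clean one, so that no error sum is zero.

```python
import math

def denoising_metrics(clean, noisy, denoised):
    """Return SNR_in, SNR_out, SNR improvement (dB), MSE, RMSE and PRD (%)."""
    n = len(clean)
    signal_power = sum(x * x for x in clean)
    err_in = sum((x - y) ** 2 for x, y in zip(clean, noisy))
    err_out = sum((x - y) ** 2 for x, y in zip(clean, denoised))
    snr_in = 10.0 * math.log10(signal_power / err_in)
    snr_out = 10.0 * math.log10(signal_power / err_out)
    mse = err_out / n
    return {
        "snr_in": snr_in,
        "snr_out": snr_out,
        "snr_improvement": snr_out - snr_in,
        "mse": mse,
        "rmse": math.sqrt(mse),
        "prd": 100.0 * math.sqrt(err_out / signal_power),
    }

# Toy signals for illustration only
clean = [1.0, -1.0, 1.0, -1.0]
noisy = [1.2, -0.8, 1.2, -0.8]
denoised = [1.1, -0.9, 1.1, -0.9]
for name, value in denoising_metrics(clean, noisy, denoised).items():
    print(f"{name}: {value:.3f}")
```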
This paper is organized as follows. Section 2 provides a background to the wavelet
transform (WT). Section 2.1 presents the wavelet denoising principle for EEG signal
denoising. The genetic algorithm is presented in Sect. 3. The hybrid scheme between
metaheuristic algorithms and WT is explained in Sect. 4. The results and discussion are
presented in Sect. 5. Finally, the conclusions and future work are described in Sect. 6.
1 https://www.physionet.org/physiobank/database/eegmmidb/.
2 Wavelet Transform
Wavelet Transform (WT) is a common and powerful tool for representing signals
in the time-frequency domain. WT has been successfully used for non-stationary
signals, such as ECG and EEG, to address several problems, such as those related to
signal compression, feature selection, and signal denoising [14, 37, 38]. Recently,
WT has been extensively tailored for non-stationary signals because of its powerful
performance in removing several EEG artifact noises that can corrupt the original
EEG signal during its recording time. These noises include eye blinking noise, eye
movement noise, muscle activity noise, electromyogram (EMG) noise, and interference of electronic device signals [39–41].
2.1 Wavelet Denoising Principle for Non-stationary Signals
As mentioned in Sect. 2, WT is a powerful tool for time-frequency domain
representation. This technique represents the signal on the basis of the correlation
between the translation and the dilation of the MWF [12, 42, 43]. In general, WT
can be categorized into two versions, namely, continuous wavelet
transform (CWT) and discrete wavelet transform (DWT) [44]. In this paper, DWT
is used for EEG signal decomposition, whereby inverse DWT (iDWT)
is used for EEG signal reconstruction. DWT was originally established in [45] as
the so-called Donoho's approach. In general, DWT decomposes a signal by using
a set of filters (i.e., low-pass and high-pass filters) to produce the approximation and
detail coefficients, respectively. The main objective of using DWT is to decompose
the input signal via different coefficient levels to correct the high frequency of the
input signals [46]. In other words, DWT decomposes the EEG signal into several
frequency bands because it is assumed that the artifacts will have large amplitudes
in the respective frequency bands. Normally, the denoising process involves three
phases:
– EEG signal decomposition phase: The original EEG signal with n
samples, x(t) = [x(1), x(2), ..., x(n)], is decomposed over three levels. At each
level, the signal is split into two parts, namely, approximation coefficients (cA)
and detail coefficients (cD). cD is obtained using a high-pass filter, while
cA continues to be decomposed at the next level.
$$
cA_i(t) = \sum_{k=-\infty}^{\infty} cA_{i-1}(k)\,\phi_i(t-k), \qquad
cD_i(t) = \sum_{k=-\infty}^{\infty} cD_{i-1}(k)\,\Psi_i(t-k) \tag{1}
$$
where cA_i(t) and cD_i(t) denote the approximation and detail coefficients of level i,
and φ and Ψ denote the scaling (low-pass) and wavelet (high-pass) functions, respectively.
– Applying thresholding phase: A threshold value is defined for each level according
to the noise level of the coefficient.
– Reconstruction phase: The EEG denoised signal is reconstructed using iDWT.
The formula of iDWT as follows [24]:
$$
\mathrm{EEG}_{clean}(t) = \sum_{k=-\infty}^{\infty} cA_L(k)\,\phi_i(t-k) + \sum_{i=1}^{L} \sum_{k=-\infty}^{\infty} cD_{i+1}(k)\,\Psi_i(t-k)
$$
where EEG_clean(t) denotes the reconstructed EEG signal, and i refers to the decomposition level (Fig. 1).
Fig. 1 EEG denoising process taken from [2, 7]
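To make the band-splitting idea concrete, the following minimal sketch lists the nominal frequency band covered by each coefficient set of a dyadic DWT. It assumes ideal half-band filters, and the function name `dwt_bands`, the 160 Hz sampling rate, and the five levels are illustrative choices rather than values prescribed at this point in the text.

```python
def dwt_bands(fs, levels):
    """Nominal (name, low_hz, high_hz) sub-bands of a dyadic DWT.

    Assumes ideal half-band filters, so each level halves the band.
    """
    bands = []
    high = fs / 2.0  # Nyquist frequency of the sampled signal
    for i in range(1, levels + 1):
        low = high / 2.0
        bands.append((f"cD{i}", low, high))  # detail: upper half of the band
        high = low                            # approximation keeps the lower half
    bands.append((f"cA{levels}", 0.0, high))
    return bands


if __name__ == "__main__":
    # Hypothetical values: 160 Hz sampling rate, five decomposition levels.
    for name, low, high in dwt_bands(fs=160.0, levels=5):
        print(f"{name}: {low:g} to {high:g} Hz")
```

Real wavelet filters overlap at the band edges, so these ranges are only approximate guides to where an artifact is likely to appear.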
Table 1 The ranges of the wavelet denoising parameters

| WT denoising parameters | Method (range) |
|---|---|
| Mother wavelet function Φ | Symlet (sym1..sym45), Coiflet (coif1..coif5), Daubechies (db1..db45), and Biorthogonal (bior1.1..bior1.5, bior2.2..bior2.8, bior3.1..bior3.9) |
| Thresholding function β | soft or hard threshold |
| Decomposition level L | 5 |
| Thresholding selection rule λ | Heursure, Rigrsure, Sqtwolog, and Minimax |
| Re-scaling approach ρ | one, sln, mln |
Signal noise removal is considered a challenging task in signal processing [47,
48]. Therefore, researchers have developed several approaches to solve this problem,
such as filtering techniques [49, 50], thresholding techniques [6, 51, 52], and
other techniques [53]. WT is one of the powerful techniques for non-stationary signal
denoising [43, 54, 55]. WT has five parameters, each of which can take different
types (Table 1), and the success of EEG signal denoising relies on the selection of these
parameters. The wavelet denoising parameters are defined in three phases. In the
decomposition phase, the first parameter, namely, MWF (Φ), is used in the EEG
signal decomposition task. The second WT parameter, namely, the decomposition
level (L), is also selected in the decomposition phase based on the EEG signal and
experience.
The third parameter, namely, the thresholding function (β), can be either
hard or soft thresholding [45, 51]. The thresholding type (soft or hard) in the
second phase must be selected along with the fourth parameter, namely, the selection rule (λ), and the fifth parameter, namely, the rescaling method (ρ). These
threshold mechanisms must be chosen carefully because the selection affects the global
denoising performance. The thresholding value is generally defined based on the
standard deviation (σ) of the noise amplitude [12]. Tables 2 and 3 list the available types of thresholding selection rules and rescaling methods.
The thresholding rules are selected according to the noise model in Eq. (2).
$$
\mathrm{EEG}_{noisy}(n) = x(n) + \sigma e(n) \tag{2}
$$
where x(n) is the original EEG signal, e is the noise, σ is the amplitude of the noise,
and n is the sample index. The wavelet parameters (β, λ, and ρ) must be applied
separately to each level of wavelet coefficients (approximation and detail).
In the last phase, the denoised EEG signal is reconstructed by iDWT using the
reconstruction formula given earlier in this section.
Table 2 Thresholding selection rules

| Thresholding selection rule | Description |
|---|---|
| Rule 1: Rigrsure | Threshold is selected using the principle of Stein’s Unbiased Risk Estimate (SURE) |
| Rule 2: Sqtwolog | Threshold is selected equal to √(2 log M) |
| Rule 3: Heursure | Threshold is selected according to a mixture (Rigrsure and Sqtwolog) |
| Rule 4: Minimaxi | Threshold is selected equal to Max(MSE) |

Table 3 The wavelet thresholding rescaling methods

| Wavelet threshold rescaling methods ρ | Rescaling |
|---|---|
| one | No scaling |
| sln | Single level |
| mln | Multiple level |
3 Genetic Algorithm
GA was developed in [56] to mimic the natural phenomenon of Darwin's theory of
evolution. Based on the ‘survival of the fittest’ principle, GA starts with many solutions,
with each solution being a vector of decision variables and each decision variable
having a specific range of values. In the evolutionary context, the set of solutions is equivalent to a population, each solution is analogous to a chromosome, each decision variable
is analogous to a gene, and each value of a decision variable is analogous to an allele.
Algorithm 1. Genetic Algorithm pseudo-code
1: X_chrom ← Generate_Initial_Population
2: Evaluate(X_chrom)
3: while (Stopping criterion is not met) do
4:     X'_chrom ← Selection(X_chrom)
5:     X''_chrom ← Crossover(X'_chrom)
6:     X'''_chrom ← Mutation(X''_chrom)
7:     Evaluate(X'''_chrom)
8:     X_chrom ← Replacement(X_chrom ∪ X'''_chrom)
9: end while
In order to apply GA successfully to COPs, both the objective function and the problem representation must be properly adjusted together with parameter tuning. GA
typically has a set of parameters, including the size of the population P_size, the number
of generations P_no, the crossover rate P_crossover, and the mutation rate P_mutation. In
order to build an efficient and robust GA, the parameter settings for each COP must
be closely examined.
Algorithm 1 shows the high-level pseudo-code of GA, which starts with a
population of candidate solutions X_chrom, where X_chrom is an augmented matrix of
size P_size × N and N is the number of decision variables in each solution. Initially, the
population X_chrom is filled with random candidate solutions across the problem search
space, that is, X_chrom = {X_chrom,1, X_chrom,2, ..., X_chrom,P_size}. Each candidate solution
X_chrom,i is evaluated based on an objective function. The improvement loop in GA
(see Algorithm 1, lines 3 to 9) repeats the following steps until a termination criterion
is met. First, the parents (a new population X'_chrom) that will be used to generate
the next population are selected. The parents then undergo pairwise crossover with a probability of P_crossover to
produce a new population X''_chrom. Afterward, each solution is
mutated with probability P_mutation to produce X'''_chrom.
The new population is evaluated, and the members of X_chrom are replaced by members of
X'''_chrom according to the chosen replacement method, which
determines whether the offspring are fit enough to survive. This process is repeated for several
generations until the stopping criterion is met.
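The loop of Algorithm 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not fix the operators, so binary tournament selection, one-point crossover, random-reset mutation, and elitist replacement are choices made here, and the function name `genetic_algorithm` and its default parameter values are hypothetical.

```python
import random


def genetic_algorithm(objective, ranges, pop_size=20, generations=50,
                      p_crossover=0.8, p_mutation=0.1, seed=0):
    """Minimise `objective` over vectors whose j-th gene is drawn from ranges[j]."""
    rng = random.Random(seed)
    n = len(ranges)
    # Lines 1-2: random initial population (fitness is evaluated on demand).
    pop = [[rng.choice(r) for r in ranges] for _ in range(pop_size)]
    history = []

    def tournament():
        # Selection operator (assumed): binary tournament.
        a, b = rng.sample(pop, 2)
        return a if objective(a) <= objective(b) else b

    for _ in range(generations):  # lines 3-9
        offspring = [tournament()[:] for _ in range(pop_size)]
        # Crossover operator (assumed): one-point, on consecutive pairs.
        for i in range(0, pop_size - 1, 2):
            if n > 1 and rng.random() < p_crossover:
                cut = rng.randrange(1, n)
                c1, c2 = offspring[i], offspring[i + 1]
                c1[cut:], c2[cut:] = c2[cut:], c1[cut:]
        # Mutation operator (assumed): reset a gene to a random allele.
        for child in offspring:
            for j in range(n):
                if rng.random() < p_mutation:
                    child[j] = rng.choice(ranges[j])
        # Replacement (assumed): keep the best pop_size of old and new solutions.
        pop = sorted(pop + offspring, key=objective)[:pop_size]
        history.append(objective(pop[0]))
    return pop[0], objective(pop[0]), history
```

Because replacement keeps the best members of the union of the old and new populations, the best objective value recorded in `history` never gets worse from one generation to the next.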
4 Meta-Heuristic Algorithms and Wavelet Transform for
EEG Signal Denoising: Proposed Method
This section provides a full discussion of the proposed methodology, which combines meta-heuristic algorithms with the wavelet transform to solve the EEG signal denoising problem.
Algorithm 2 shows the pseudocode of the proposed framework. The proposed
methodology runs through four phases, where the result of each phase is the input to the
next one. The four phases are presented in Fig. 2 and described
as follows:
Algorithm 2. Tuning WT parameters using a meta-heuristic algorithm for EEG signal denoising
1: Initialize noisy EEG signal (nEEG), calculate the SNR, MSE, RMSE, and PRD for input EEG signal.
2: Initialize meta-heuristic operators, initialize solution(s) X_i (i = 1, 2, .., N), N = 5 wavelet parameters, the initial solution X_i(Φ, L, β, λ, ρ)
3: X_opt = Metaheuristic(X, X_i)
4: EEGDenoiseSignals = WT(X_opt, nEEG)
5: EEGOutSignals = Evaluate(EEGDenoiseSignals, SNR_out, SNR_imp, MSE, RMSE, PRD).
Fig. 2 Proposed method for EEG denoising
Phase I: Initialization. This phase involves three steps. Firstly, the input
EEG signal x(n) is read from its source. The WT denoising approach was developed
on the basis of the original EEG signal being corrupted with white Gaussian noise
(WGN), power line noise (PLN), and electromyogram (EMG) noise
[12, 35, 36]. These noises simulate those that
corrupt the original EEG signal during recording, such as eye blink
noise, eye movement noise, and electrical signal distortion. In this paper, the original EEG signals are corrupted by PLN using Eq. (3),
by EMG using Eq. (4), and by WGN using Eq. (5). The EEG signals corrupted by these three types of noise
are used as a dataset to evaluate the performance of the proposed
method.
$$
N(t) = A \sin(2 \pi f t) \tag{3}
$$
$$
N(t) = E \cdot \mathrm{rand}(t) \tag{4}
$$
$$
N(t) = x(t) + \sigma \tag{5}
$$
where A = 60 μV, E = (0–10) μV, f = 60 Hz, e is the noise, and σ is the amplitude
of the noise; in this work, σ = 15 μV. The N signal is added to the original EEG
signal x to simulate PLN, EMG, and WGN, respectively.
Secondly, the WT denoising parameters (Φ, L, β, λ, ρ), which are shown in
Table 4, are initialized, along with the parameters of the genetic algorithm. Finally,
the signal-to-noise ratio (SNR) is computed by Eq. (15), the percentage root mean
square difference (PRD) by Eq. (14), the mean square error (MSE) by Eq. (6), and
the root mean square error (RMSE) by Eq. (17). These values record the results of the EEG
signals before and after the denoising process (Fig. 3).
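The three corruption models can be sketched as follows. This is a hedged illustration, not the authors' code: the function names are hypothetical, rand(t) is assumed to be uniform on [0, 1), the WGN term is assumed to follow the σ e(n) form of Eq. (2) with e standard normal, and the sampling rate `fs` must be supplied by the caller.

```python
import math
import random


def add_pln(x, fs, amp=60.0, f=60.0):
    """Eq. (3): add A*sin(2*pi*f*t), with t = k / fs seconds."""
    return [xk + amp * math.sin(2.0 * math.pi * f * k / fs)
            for k, xk in enumerate(x)]


def add_emg(x, e_amp=10.0, rng=None):
    """Eq. (4): add E*rand(t), with rand(t) uniform in [0, 1) (assumed)."""
    rng = rng or random.Random()
    return [xk + e_amp * rng.random() for xk in x]


def add_wgn(x, sigma=15.0, rng=None):
    """WGN in the form of Eq. (2): add sigma*e(t), e(t) standard normal (assumed)."""
    rng = rng or random.Random()
    return [xk + sigma * rng.gauss(0.0, 1.0) for xk in x]


if __name__ == "__main__":
    clean = [0.0] * 160  # placeholder signal, one second at a hypothetical 160 Hz
    noisy = add_wgn(add_emg(add_pln(clean, fs=160.0)))
    print(len(noisy))
```

Passing a seeded `random.Random` instance as `rng` makes the corrupted signals reproducible, which is convenient when the same noisy input must be reused across compared methods.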
Fig. 3 EEG signal corrupted using PLN, EMG, and WGN noise. Panels: Original EEG Signal; Noisy EEG Signal with PLN; Noisy EEG Signal with EMG; Original EEG Signal + WGN Noise. Axes: amplitude (μV) versus time (ms).
Phase II: Tuning WT parameters by GA. In the proposed methodology, GA is
adapted to find the optimal WT parameters for the EEG signal
denoising problem. Initially, a configuration of WT parameters is represented as a vector x = (x1, x2, ..., xn), where n is the total number of WT parameters, which is normally equal to 5. Here, x1 represents the value of the mother
wavelet function parameter Φ, x2 denotes the value of the decomposition level parameter L, x3 refers to the thresholding method β, x4 represents the value of the thresholding selection rule parameter λ, and x5 represents the re-scaling approach ρ,
where the possible range of each parameter is taken from Table 1. Figure 4
shows an example solution of WT parameters for denoising EEG signals. The
selected meta-heuristic algorithm evaluates the solution using the MSE objective
function, which is formulated in Eq. (6).
$$
MSE = \frac{1}{N} \sum_{n=1}^{N} [x(n) - \hat{x}(n)]^2 \tag{6}
$$
where x(n) denotes the original EEG signal and x̂(n) is the denoised EEG signal
obtained by tuning the wavelet parameters using the meta-heuristic algorithm.
Iteratively, the randomly generated solution(s) undergo refinement using the
selected meta-heuristic algorithm. The final output of this phase is an optimized
solution x_opt = (x1, x2, ..., xn), which is passed to the next phase.
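The solution representation used in Phase II can be illustrated by encoding each parameter as an index into its range from Table 1. The sketch below is an assumption-laden example: the dictionary `SEARCH_SPACE`, the helper names, and the lower-case rule names are hypothetical, and the biorthogonal list assumes the usual family members (bior1.1, bior1.3, bior1.5, and so on), which Table 1 only gives as ranges.

```python
# Candidate mother wavelets, following the ranges in Table 1.
MOTHER_WAVELETS = (
    [f"sym{i}" for i in range(1, 46)]
    + [f"coif{i}" for i in range(1, 6)]
    + [f"db{i}" for i in range(1, 46)]
    # Assumed members of the biorthogonal ranges listed in Table 1.
    + ["bior1.1", "bior1.3", "bior1.5",
       "bior2.2", "bior2.4", "bior2.6", "bior2.8",
       "bior3.1", "bior3.3", "bior3.5", "bior3.7", "bior3.9"]
)

SEARCH_SPACE = {
    "phi": MOTHER_WAVELETS,                                       # x1
    "L": [5],                                                     # x2
    "beta": ["soft", "hard"],                                     # x3
    "lambda": ["heursure", "rigrsure", "sqtwolog", "minimaxi"],   # x4
    "rho": ["one", "sln", "mln"],                                 # x5
}


def decode(solution):
    """Map a vector of indices (x1..x5) to named WT parameter values."""
    return {key: values[i]
            for (key, values), i in zip(SEARCH_SPACE.items(), solution)}


def space_size():
    """Number of distinct WT parameter configurations."""
    size = 1
    for values in SEARCH_SPACE.values():
        size *= len(values)
    return size


if __name__ == "__main__":
    print(decode([8, 0, 0, 1, 1]))
    print(space_size())
```

Under these assumptions the space holds a few thousand configurations, so exhaustive search is possible in principle, although each evaluation requires a full decomposition, thresholding, and reconstruction pass.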
Phase III: EEG denoising using WT based on x_opt. As mentioned in Sect. 2.1,
the denoising process of WT involves three main steps, which are described in more
detail below:
• EEG signal decomposition using DWT. In this step, DWT is applied to
decompose the noisy input EEG signal x(n). In the decomposition process,
we must use the first two x_opt parameters, namely, the mother wavelet function
(Φ) and the decomposition level (L). The noisy EEG signal is divided at each level
into cA and cD. The latter is obtained using a high-pass filter, while the former
is obtained using a low-pass filter and is decomposed further at the next level.
The EEG signal is convolved with the high-pass and low-pass filters, while
the block (↓2), which represents the downsampling operator, keeps only
the even-indexed elements of the filtered signal. The EEG signals are thereby separated into
cA and cD based on their frequency and amplitude.
• The second step of EEG denoising is thresholding, which is applied based on
the noise level of the coefficients. In this step, the last three wavelet parameters,
namely, the thresholding type (β), the thresholding selection rule (λ), and the
re-scaling method (ρ), must be taken from x_opt.
According to [57], applying a thresholding operation to the input noisy non-stationary signal X yields an estimate of the denoised EEG signal as follows:
$$
Z = \mathrm{THR}(X, \delta), \tag{7}
$$
where THR denotes a thresholding function, while δ denotes a threshold value.
The EEG denoising performance in the wavelet domain depends on the estimation
of δ. Therefore, several methods have been proposed for estimating δ. Donoho and
Johnstone [45] calculated the threshold δ on an orthonormal basis as follows:
$$
\delta = \sigma \sqrt{2 \log M} \tag{8}
$$
where σ represents the standard deviation of the DWT detail coefficients, while M
denotes the length of the vector of DWT coefficients. Given that the threshold value
δ depends only on cD, and that cA contains the low-frequency EEG signal and the highest
amount of energy, we estimate the value of δ for each coefficient level as
follows:
$$
\hat{x}_d(l) = \mathrm{THR}(x_d(l), \delta_l), \quad l = 1, 2, \ldots \tag{9}
$$
where x̂_d represents a vector of thresholded DWT detail coefficients, l denotes a
wavelet decomposition level, and δ_l denotes the threshold value determined for
that level. The wavelet generally provides two standard types of thresholding
functions (β), namely, hard and soft thresholding [45, 51]. The difference between
soft thresholding, Eq. (10), and hard thresholding, Eq. (11), is described as follows:
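$$
\hat{x}_{d_i}(l) =
\begin{cases}
|x_{d_i}(l)| - \delta_l, & |x_{d_i}(l)| \ge \delta_l \\
0, & |x_{d_i}(l)| < \delta_l
\end{cases} \tag{10}
$$
$$
\hat{x}_{d_i}(l) =
\begin{cases}
x_{d_i}(l), & |x_{d_i}(l)| \ge \delta_l \\
0, & |x_{d_i}(l)| < \delta_l
\end{cases} \tag{11}
$$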
where i denotes the index of the DWT detail coefficients at level l. The thresholded DWT coefficients can be expressed as follows:
$$
\hat{X} = [\hat{x}_d(1) \;\; \hat{x}_d(2) \;\; x_a(2)] \tag{12}
$$
• Reconstruction of the denoised EEG signal by iDWT. We estimate
the original EEG signal X by applying iDWT to the thresholded coefficients X̂ as follows:
$$
z[n] = \sum_{k=-\infty}^{\infty} cA_L(k)\,\phi_i(n-k) + \sum_{i=1}^{L} \sum_{k=-\infty}^{\infty} cD_{i+1}(k)\,\Psi_i(n-k) \tag{13}
$$
The reconstruction convolves the EEG signals after upsampling (↑2), which
involves the insertion of zeros at the even-indexed elements of the EEG signals. Figure 1
shows the iDWT procedure for five levels as an example.
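The three steps of Phase III (decomposition, thresholding, reconstruction) can be sketched end to end with the Haar wavelet (db1), which is simple enough to write without a wavelet library. This is an illustrative sketch under assumptions: the function names are hypothetical, σ is taken as the standard deviation of the detail coefficients at each level as described for Eq. (8), the logarithm is natural, and the soft rule keeps the sign of each coefficient (the standard form), which differs slightly from Eq. (10) as printed.

```python
import math
import statistics

SQRT2 = math.sqrt(2.0)


def haar_dwt(x):
    """One DWT level: low-pass and high-pass filtering followed by (↓2)."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """One iDWT level: inverts haar_dwt exactly."""
    x = []
    for a, d in zip(approx, detail):
        x += [(a + d) / SQRT2, (a - d) / SQRT2]
    return x


def threshold(coeffs, delta, mode="soft"):
    """Hard rule as in Eq. (11); soft rule shrinks magnitudes and keeps the sign."""
    if mode == "hard":
        return [c if abs(c) >= delta else 0.0 for c in coeffs]
    return [math.copysign(max(abs(c) - delta, 0.0), c) for c in coeffs]


def denoise(x, levels=3, mode="soft"):
    """Decompose, threshold each detail level with Eq. (8), then reconstruct."""
    if len(x) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(d)
    cleaned = []
    for d in details:
        sigma = statistics.pstdev(d)  # assumed noise estimate per level
        delta = sigma * math.sqrt(2.0 * math.log(len(d)))
        cleaned.append(threshold(d, delta, mode))
    for d in reversed(cleaned):  # the approximation coefficients stay untouched
        approx = haar_idwt(approx, d)
    return approx
```

A tuned configuration from Phase II would replace the fixed Haar filters, the level count, the thresholding type, and the threshold rule used here.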
Phase IV: EEG denoising evaluation. The final phase evaluates the EEG output
of WT. The evaluation is based on five criteria: signal-to-noise ratio (SNR), SNR improvement, mean square error (MSE) as in Eq. (6), root mean
square error (RMSE), and percentage root mean square difference (PRD).
$$
PRD = 100 \cdot \sqrt{\frac{\sum_{n=1}^{N} [x(n) - \hat{x}(n)]^2}{\sum_{n=1}^{N} [x(n)]^2}} \tag{14}
$$
$$
SNR_{out} = 10 \log_{10} \frac{\sum_{n=1}^{N} [x(n)]^2}{\sum_{n=1}^{N} [x(n) - \hat{x}(n)]^2} \tag{15}
$$
Fig. 4 Solution of WT parameters for denoising EEG signals using MOFPA
$$
SNR_{imp} = 10 \log_{10} \frac{\sum_{n=1}^{N} [\delta(n) - x(n)]^2}{\sum_{n=1}^{N} [x(n) - \hat{x}(n)]^2} \tag{16}
$$
$$
RMSE = \sqrt{\frac{1}{N} \sum_{n=1}^{N} [x(n) - \hat{x}(n)]^2} \tag{17}
$$
where x(n) denotes the original EEG signal, x̂(n) is the denoised EEG signal
obtained by tuning the wavelet parameters through the selected meta-heuristic
algorithm, and N is the number of samples.
The final decision about the denoising results is made by comparing the original
criteria (i.e., SNR, MSE, RMSE, PRD) with the improved ones (i.e., SNR_out, SNR_imp,
MSE, RMSE, PRD).
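The five criteria of Eqs. (6) and (14) to (17) translate directly into code. The sketch below uses hypothetical function names and assumes that δ(n) in Eq. (16) stands for the noisy input signal, which the text does not state explicitly; with that reading, SNR_imp compares the noise power before and after denoising.

```python
import math


def _sq_err(x, y):
    """Sum of squared differences between two equal-length signals."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def mse(x, x_hat):
    """Eq. (6)."""
    return _sq_err(x, x_hat) / len(x)


def rmse(x, x_hat):
    """Eq. (17)."""
    return math.sqrt(mse(x, x_hat))


def prd(x, x_hat):
    """Eq. (14), in percent."""
    return 100.0 * math.sqrt(_sq_err(x, x_hat) / sum(a * a for a in x))


def snr_out(x, x_hat):
    """Eq. (15), in dB."""
    return 10.0 * math.log10(sum(a * a for a in x) / _sq_err(x, x_hat))


def snr_imp(x, noisy, x_hat):
    """Eq. (16), in dB; `noisy` plays the role of delta(n) (assumed)."""
    return 10.0 * math.log10(_sq_err(noisy, x) / _sq_err(x, x_hat))


if __name__ == "__main__":
    original = [1.0, 2.0, 3.0, 4.0]   # toy values for illustration only
    noisy = [1.0, 2.0, 3.0, 8.0]
    denoised = [1.0, 2.0, 3.0, 2.0]
    print(mse(original, denoised), rmse(original, denoised))
    print(prd(original, denoised), snr_out(original, denoised))
    print(snr_imp(original, noisy, denoised))
```

Note that the logarithmic measures are undefined when the denoised signal matches the original exactly, so a practical implementation would guard against a zero error term.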
5 Results and Discussions
5.1 EEG Dataset
The EEG Motor Movement/Imagery dataset (see footnote 1) [19] contains EEG
signals collected from 109 healthy subjects using a brain-computer interface software system called
BCI2000 [58]. The EEG signals are recorded from 64 electrodes (EEG
channels) at a sampling rate of 160 Hz, and each signal is stored in
a separate EDF file. Each volunteer performs several motor/imagery tasks of the kind that are
mainly used in different fields, such as neurological rehabilitation and brain-computer
interface applications. In general, these tasks consist of imagining or simulating a
given action, such as opening and closing the eyes. The EEG signals are recorded
from each volunteer by asking them to perform four tasks according to the position
of a target that appears on a screen placed in front of them. If the target appears on
the right or left side of the screen, then the volunteer must open and close the fist
corresponding to the position of the target. If the target appears on the
top or bottom of the screen, then the volunteer must open and close both fists or both feet.
Figure 5 shows the distribution of electrodes in the EEG Motor Movement/Imagery
Dataset.
Fig. 5 Distribution of electrodes in EEG Motor Movement/Imagery Dataset
5.2 Comparing the Proposed Method (GA-WT) with
State-of-the-Art Methods
In this section, two state-of-the-art methods for EEG signal denoising are discussed,
namely, the Al-Qazzaz method [13] and the Kumari method [1]. These methods use
WT for solving EEG signal denoising problems in which the WT parameters are set
based on a comparative study. The best parameter configurations for WT as identified
by these two methods are shown in Table 4.
We compare the results of these two methods with those generated by our proposed
GA-WT method. The comparison is performed based on Keirn’s dataset [17], where
the original EEG signal is corrupted with WGN, PLN, and EMG [12, 35, 36]. The
final results are evaluated using five criteria, namely, MSE, RMSE, SNR_out, SNR_imp,
and PRD. Table 5 shows the EEG signal denoising results of the Al-Qazzaz, Kumari,
and GA-WT methods. The first column presents the ranking of each method based
on the evaluation criteria adopted.
The results show that the proposed method (GA-WT) achieves better outputs than the two
state-of-the-art methods [1, 13], as summarized in Table 5, in terms
of the overall EEG signal denoising criteria.
Figure 6 shows that the proposed GA-WT method outperforms both the Al-Qazzaz and Kumari methods for EEG signal denoising under different noises.
GA-WT obtains the best results for WGN and EMG based on MSE, RMSE, SNR_out,
SNR_imp, and PRD. For PLN, GA-WT outperforms the Al-Qazzaz method [13] in
terms of MSE (0.0144) and RMSE (0.1200). Meanwhile, the SNR_out, SNR_imp, and
PRD values of these two methods are very close. In general, finding optimal param-
Table 4 Wavelet parameters range for Al-Qazzaz and Kumari methods

| Wavelet parameters | Al-Qazzaz method | Kumari method |
|---|---|---|
| Mother wavelet (φ) | Symlet (sym9) | Daubechies (db4) |
| Decomposition level (L) | 5 | 5 |
| Thresholding type (β) | soft and hard | soft and hard |
| Selection method (λ) | Rigrsure | Rigrsure |
| Rescaling approach (ρ) | sln, one | sln, one |

Fig. 6 Comparative analysis between GA-WT, Sym9 and db4. Panels, for each of PLN, EMG, and WGN: MSE and RMSE values, SNR output (dB), SNR improvement (dB), and PRD (%), plotted per method (GA, Sym9, db4).
Al-Qazzaz
method [13]
Kumari method
[1]
Al-Qazzaz
method [13]
Proposed method PLN
GA-WT
Kumari method
[1]
Proposed method EMG
GA-WT
Kumari method
[1]
Al-Qazzaz
method [13]
Proposed method EOG
GA-WT
Al-Qazzaz
method [13]
Kumari method
[1]
2
3
1
2
3
1
2
3
1
2
3
MSE
2.2045
SNR
4.6352
3.8699
0.001
0.019144
0.015076
0.0098
0.030888
0.0144
0.025316
SNRimp
PRD
4.3497
RMSE
21.6583
22.4421
36.3513
13.3562
13.5106
15.6052
8.262
7.5491
1.5221
2.153
1.9672
0.0329
0.138361
0.0990
32.021900 −4.034729 2.505561
2.0793
0.122786
−2.4149
0.196744
33.059211 −2.99741 2.223511
33.6418
29.944328 −4.386341 3.182610
2.9700
0.1200
30.5449
−3.7858
94.106513 5.196744
92.668167 5.117316
78.7682
0.1591
0.6592
0.792952
2.0730
30.808240 −3.522428 2.881296
27.006156 0.527605
26.186927 0.661388
24.7403
db4
sym9
bior3.9
sym9
db4
db1
db4
db27
sym9
db4
sym9
db35
(φ)
L
5
5
5
5
5
5
5
5
5
5
5
5
Bold value indicates best results where for SNR, SNRimp, highest is best and for MSE, RMSE, and PRD, lowest is best
EOG
EOG
EMG
EMG
PLN
PLN
WGN
WGN
Proposed method WGN
GA-WT
1
Noise
Method
Rank
hard
hard
soft
hard
hard
hard
hard
hard
hard
Soft
Soft
Soft
β
Table 5 Comparing the proposed GA-WT method with state-of-the-art methods for EEG signals denoising with different noises
rigrsure
rigrsure
heursure
rigrsure
rigrsure
rigrsure
rigrsure
heursure
rigrsure
rigrsure
rigrsure
heursure
λ
one
one
one
one
one
one
one
one
one
sln
sln
sln
ρ
eter configurations for WT by using metaheuristic-based algorithms especially GA,
can directly improve the performance of WT in the EEG signal denoising process.
The results show that the proposed method (GA-WT) for EEG signal denoising can produce better results than manual configurations based on ad hoc strategy.
Therefore, using metaheuristic approaches to optimize the parameters for EEG signals positively affects the denoising process performance of the WT method.
6 Conclusions and Future Work
This paper proposes a variation of the wavelet transform (WT) method for EEG signal
denoising based on the genetic algorithm, called GA-WT. As previously mentioned,
the denoising performance of WT depends on its five main parameters, each of which
can take different types. Selecting suitable WT parameters is a challenging task that is usually performed based on empirical evidence or experience. The
proposed method (GA-WT) aims to find the optimal WT parameters that yield
the minimum MSE between the original and denoised EEG signals.
GA-WT is evaluated using a standard EEG dataset, the EEG Motor Movement/Imagery dataset. This dataset contains recordings from 109 volunteers and captures EEG signals from
64 EEG channels based on different mental tasks. These EEG signals are corrupted
using three different noises, namely, PLN, EMG, and WGN [12, 35, 36]. Five evaluation criteria are used, namely, SNR, SNR improvement, MSE, RMSE, and PRD.
Several experiments are conducted to assess whether GA-WT can
support WT in producing efficient EEG signal denoising outcomes. In these experiments,
GA-WT outperforms the other compared methods.
Acknowledgements This research has been done under USM Grant (1001/PKOMP/8014016).
Also, the first author would like to thank The World Academic Science (TWAS) and the University
Science Malaysia (USM) for supporting his study (TWAS-USM Postgraduate Fellowship 2015, FR
number: 3240287134).
References
1. Kumari P, Vaish A (2015) Brainwave based user identification system: a pilot study in robotics
environment. Robot Auton Syst 65:15–23
2. Alyasseri ZAA, Khader AT, Al-Betar MA, Papa JP, ahmad Alomari O (2018) EEG-based person
authentication using multi-objective flower pollination algorithm. In: 2018 IEEE congress on
evolutionary computation (CEC). IEEE, pp 1–8
3. Ramadan RA, Vasilakos AV (2017) Brain computer interface: control signals review. Neurocomputing 223:26–44
4. Rao RP (2013) Brain-computer interfacing: an introduction. Cambridge University Press, Cambridge
5. Alyasseri ZAA, Khader AT, Al-Betar MA, Papa JP, Alomari OA, Makhadme SN (2018) An
efficient optimization technique of EEG decomposition for user authentication system. In: 2018
2nd international conference on biosignal analysis, processing and systems (ICBAPS). IEEE,
pp 1–6
6. Alyasseri ZAA, Khader AT, Al-Betar MA, Papa JP, Alomari OA, Makhadmeh SN (2018)
Classification of EEG mental tasks using multi-objective flower pollination algorithm for person
identification. Int J Integr Eng 10(7):7
7. Alyasseri ZAA, Khader AT, Al-Betar MA, Papa JP, Alomari OA (2018) EEG feature extraction
for person identification using wavelet decomposition and multi-objective flower pollination
algorithm. IEEE Access 6:76007–76024
8. Berger H (1929) Über das Elektrenkephalogramm des Menschen. Eur Arch Psychiatry Clin
Neurosci 87(1):527–570
9. Abdulkader SN, Atia A, Mostafa M-SM (2015) Brain computer interfacing: applications and
challenges. Egypt Inform J 16(2):213–230
10. Tareq Z, Zaidan B, Zaidan A, Suzani M (2018) A review of disability EEG based wheelchair
control system: coherent taxonomy, open challenges and recommendations. Comput Methods
Programs Biomed 164:221–237
11. Adeli H, Ghosh-Dastidar S, Dadmehr N (2007) A wavelet-chaos methodology for analysis of
EEGs and EEG subbands to detect seizure and epilepsy. IEEE Trans Biomed Eng 54(2):205–211
12. El-Dahshan E-SA (2011) Genetic algorithm and wavelet hybrid scheme for ECG signal denoising. Telecommun Syst 46(3):209–215
13. Al-Qazzaz NK, Hamid Bin Mohd Ali S, Ahmad SA, Islam MS, Escudero J (2015) Selection
of mother wavelet functions for multi-channel EEG signal analysis during a working memory
task. Sensors 15(11):29015–29035
14. Alyasseri ZAA, Khader AT, Al-Betar MA, Abualigah LM (2017) ECG signal denoising using
β-hill climbing algorithm and wavelet transform. In: ICIT 2017 the 8th international conference
on information technology, pp 1–7
15. Alyasseri ZAA, Khader AT, Al-Betar MA (2017) Optimal electroencephalogram signals
denoising using hybrid β-hill climbing algorithm and wavelet transform. In: Proceedings of the
international conference on imaging, signal processing and communication. ACM, pp 106–112
16. Alyasseri ZAA, Khader AT, Al-Betar MA (2017) Electroencephalogram signals denoising
using various mother wavelet functions: a comparative analysis. In: Proceedings of the international conference on imaging, signal processing and communication. ACM, pp 100–105
17. Keirn ZA, Aunon JI (1990) A new mode of communication between man and his surroundings.
IEEE Trans Biomed Eng 37(12):1209–1214
18. Sharma PK, Vaish A (2016) Individual identification based on neuro-signal using motor movement and imaginary cognitive process. Opt Int J Light Electron Opt 127(4):2143–2148
19. Goldberger AL, Amaral LA, Glass L, Hausdorff JM, Ivanov PC, Mark RG, Mietus JE, Moody
GB, Peng C-K, Stanley HE (2000) Physiobank, physiotoolkit, and physionet. Circulation
101(23):e215–e220
20. Al-Qazzaz NK, Ali S, Ahmad SA, Islam MS, Ariff MI (2014) Selection of mother wavelets
thresholding methods in denoising multi-channel EEG signals during working memory task. In:
2014 IEEE conference on biomedical engineering and sciences (IECBES). IEEE, pp 214–219
21. Al-Qazzaz NK, Ali SHBM, Ahmad SA, Islam MS, Escudero J (2018) Discrimination of stroke-related mild cognitive impairment and vascular dementia using EEG signal analysis. Med Biol
Eng Comput 56(1):137–157
22. Reddy CSP et al (2017) Analysis of EEG signal for the detection of brain abnormalities. Int J
Res 4(17):1947–1950
23. Kumari P, Vaish A (2016) Feature level fusion of mental tasks brain signal for an efficient
identification system. Neural Comput Appl 27(3):659–669
24. Mowla MR, Ng S-C, Zilany MS, Paramesran R (2015) Artifacts-matched blind source separation and wavelet transform for multichannel EEG denoising. Biomed Signal Process Control
22:111–118
25. Yang B, Zhang T, Zhang Y, Liu W, Wang J, Duan K (2017) Removal of electrooculogram artifacts from electroencephalogram using canonical correlation analysis with ensemble empirical
mode decomposition. Cogn Comput 9(5):626–633
26. Torabi A, Jahromy FZ, Daliri MR (2017) Semantic category-based classification using nonlinear features and wavelet coefficients of brain signals. Cogn Comput 9(5):702–711
27. Zhou G, Zhao Q, Zhang Y, Adali T, Xie S, Cichocki A (2016) Linked component analysis from
matrices to high-order tensors: applications to biomedical data. Proc IEEE 104(2):310–331
28. Wang H, Zhang Y, Waytowich NR, Krusienski DJ, Zhou G, Jin J, Wang X, Cichocki A (2016)
Discriminative feature extraction via multivariate linear regression for SSVEP-based BCI.
IEEE Trans Neural Syst Rehabil Eng 24(5):532–541
29. Zhang Y, Nam CS, Zhou G, Jin J, Wang X, Cichocki A (2018) Temporally constrained sparse
group spatial patterns for motor imagery BCI. IEEE Trans Cybern 49(9):3322–3332
30. Jiao Y, Zhang Y, Chen X, Yin E, Jin J, Wang X, Cichocki A (2018) Sparse group representation
model for motor imagery EEG classification. IEEE J Biomed Health Inform 23(2):631–641
31. Zhang Y, Zhou G, Jin J, Zhao Q, Wang X, Cichocki A (2016) Sparse bayesian classification
of EEG for brain-computer interface. IEEE Trans Neural Netw Learn Syst 27(11):2256–2267
32. Alyasseri ZAA, Khader AT, Al-Betar MA, Awadallah MA (2018) Hybridizing β-hill climbing
with wavelet transform for denoising ECG signals. Inf Sci 429:229–246
33. Al-Betar MA (2017) β-hill climbing: an exploratory local search. Neural Comput Appl
28(1):153–168
34. Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PC, Mark RG, Mietus
JE, Moody GB, Peng C-K, Stanley HE (2000) PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation 101(23):e215–e220. https://doi.org/10.1161/01.CIR.101.23.e215 Circulation Electronic
Pages: http://circ.ahajournals.org/content/101/23/e215.fullPMID:1085218
35. Wang J, Ye Y, Pan X, Gao X (2015) Parallel-type fractional zero-phase filtering for ECG signal
denoising. Biomed Signal Process Control 18:36–41
36. Jenkal W, Latif R, Toumanari A, Dliou A, El Bcharri O, Maoulainine FM (2016) An efficient
algorithm of ECG signal denoising using the adaptive dual threshold filter and the discrete
wavelet transform. Biocybern Biomed Eng 36(3):499–508
37. Subasi A, Ercelebi E (2005) Classification of EEG signals using neural network and logistic
regression. Comput Methods Programs Biomed 78(2):87–99
38. Kumar H, Pai SP, Vijay G, Rao R (2014) Wavelet transform for bearing condition monitoring
and fault diagnosis: a review. Int J COMADEM 17(1):9–23
39. Mamun M, Al-Kadi M, Marufuzzaman M (2013) Effectiveness of wavelet denoising on electroencephalogram signals. J Appl Res Technol 11(1):156–160
40. Al-Kadi MI, Reaz MBI, Ali MAM, Liu CY (2014) Reduction of the dimensionality of the
EEG channels during scoliosis correction surgeries using a wavelet decomposition technique.
Sensors 14(7):13046–13069
41. Borse S (2015) EEG de-noising using wavelet transform and fast ICA. IJISET Int J Innov Sci
Eng Technol 2:200–205
42. Poornachandra S, Kumaravel N (2005) Hyper-trim shrinkage for denoising of ECG signal.
Digit Signal Proc 15(3):317–327
43. Yang R, Ren M (2011) Wavelet denoising using principal component analysis. Expert Syst
Appl 38(1):1073–1076
44. Sawant C, Patii HT (2014) Wavelet based ECG signal de-noising. In: 2014 first international
conference on networks & soft computing (ICNSC). IEEE, pp 20–24
45. Donoho DL, Johnstone JM (1994) Ideal spatial adaptation by wavelet shrinkage. Biometrika
81(3):425–455
46. Singh BN, Tiwari AK (2006) Optimal selection of wavelet basis function applied to ECG signal
denoising. Digit Signal Proc 16(3):275–287
47. McSharry PE, Clifford GD, Tarassenko L, Smith LA (2003) A dynamical model for generating
synthetic electrocardiogram signals. IEEE Trans Biomed Eng 50(3):289–294
48. Alyasseri ZAA, Khadeer AT, Al-Betar MA, Abasi A, Makhadmeh S, Ali NS (2019) The effects
of EEG feature extraction using multi-wavelet decomposition for mental tasks classification.
In: Proceedings of the international conference on information and communication technology.
ACM, pp 139–146
49. Feng J, Wang Z, Zeng M (2013) Distributed weighted robust kalman filter fusion for uncertain
systems with autocorrelated and cross-correlated noises. Inf Fusion 14(1):78–86
50. Sun X-J, Gao Y, Deng Z-L, Li C, Wang J-W (2010) Multi-model information fusion Kalman
filtering and white noise deconvolution. Inf Fusion 11(2):163–173
51. Donoho DL (1995) De-noising by soft-thresholding. IEEE Trans Inf Theory 41(3):613–627
52. Ustundaug M, Gokbulut M, Sengur A, Ata F (2012) Denoising of weak ECG signals by
using wavelet analysis and fuzzy thresholding. Netw Model Anal Health Inform Bioinform
1(4):135–140
53. Zeng K, Dong M (2014) A novel cuboid method with particle swarm optimization for real-life
noise attenuation from heart sound signals. Expert Syst Appl 41(15):6839–6847
54. Lagha M, Tikhemirine M, Bergheul S, Rezoug T, Bettayeb M (2013) De-noised estimation of
the weather doppler spectrum by the wavelet method. Digit Signal Proc 23(1):322–328
55. Vazquez RR, Velez-Perez H, Ranta R, Dorr VL, Maquin D, Maillard L (2012) Blind source
separation, wavelet denoising and discriminant analysis for EEG artefacts and noise cancelling.
Biomed Signal Process Control 7(4):389–400
56. Holland JH (1992) Adaptation in natural and artificial systems: an introductory analysis with
applications to biology, control, and artificial intelligence. MIT Press, Cambridge
57. Kabir MA, Shahnaz C (2012) Denoising of ECG signals based on noise reduction algorithms
in EMD and wavelet domains. Biomed Signal Process Control 7(5):481–489
58. Schalk G, McFarland DJ, Hinterberger T, Birbaumer N, Wolpaw JR (2004) BCI 2000: a generalpurpose brain-computer interface (BCI) system. IEEE Trans Biomed Eng 51(6):1034–1043
Neural Network Ammonia-Based
Aeration Control for Activated Sludge
Process Wastewater Treatment Plant
M. H. Husin, M. F. Rahmat, N. A. Wahab, and M. F. M. Sabri
Abstract This paper proposes improved effluent control for a biological wastewater
treatment plant using neural network ammonia-based aeration control. The main
advantages of this control method are its simplicity and its nonlinear approximation
ability, which allow it to outperform the static-gain Proportional Integral (PI)
controller. The trained neural network controller uses the measured values of
dissolved oxygen and ammonium in compartment 5 of the Benchmark Simulation Model
No. 1 (BSM1) to regulate the oxygen transfer coefficient in compartment 5. The
effectiveness of the proposed neural network controller is verified by comparing the
performance of the activated sludge process with that of the benchmark PI controller
under the dry weather file. Simulation results indicate that effluent violations are
reduced by 22% for Ntot,e and by 4% for SNH,e. The improvement in effluent
violations and in the effluent quality index of the BSM1 confirms the advantage of
the proposed method over the benchmark PI. In future research, the method could also
be applied to nitrate control in activated sludge wastewater treatment plants.
Keywords Aeration control · Activated sludge · Wastewater treatment plant
Nomenclature
AE Aeration Energy
BSM1 Benchmark Simulation Model No. 1
DO Dissolved Oxygen
EQ Effluent Quality
MPC Model Predictive Control
OCI Overall Cost Index
M. H. Husin (✉) · M. F. M. Sabri
Department of Electrical and Electronic Engineering, Faculty of Engineering,
Universiti Malaysia Sarawak, 94300 Kota Samarahan, Sarawak, Malaysia
e-mail: hhmaimun@unimas.my
M. F. Rahmat · N. A. Wahab
Department of Control and Mechatronics Engineering, Faculty of Electrical Engineering,
Universiti Teknologi Malaysia (UTM), 81310 Johor Bahru, Johor, Malaysia
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_32
PI/PID Proportional Integral/Proportional Integral Derivative
WWTP Wastewater treatment plant
1 Introduction
1.1 Activated Sludge Wastewater Treatment Plant
A wastewater treatment plant (WWTP) removes contaminants from wastewater and
converts it into an effluent that is safe for, or has minimal impact on, the
environment. The activated sludge process is a biological treatment process that
handles wastewater using aeration and bacteria, and it is the most commonly applied
technology in WWTPs [1, 2].
Energy expenditure in the activated sludge process can be minimized by controlling
the aeration system. Energy consumption may account for 30–50% of the total
operating cost of a WWTP [3–5], with over half of the energy requirement coming from
the aeration section. Aeration is a costly process [4–9], and increases in the cost
of energy escalate the total operating cost even further. The performance of the
wastewater treatment process depends on how effectively the dissolved oxygen (DO)
concentration is maintained at a reasonable level. The DO concentration has a strong
influence on treatment effectiveness, operational cost, and system stability.
A WWTP is a process industry subject to influent variations and large disturbances.
Unlike other process industries, a WWTP cannot restrict the raw material entering
the plant. Standards have been established for the quality of the effluent
discharged from the WWTP to receiving waters. As discharge thresholds become
stricter, process control at WWTPs is becoming increasingly essential. To meet the
new requirements, automatic control has been used to improve water quality and to
minimize operational costs, so that treatment remains sustainable.
Two ways of controlling the aeration process are proposed in [10]: adjusting the
total aerobic volume and adjusting the aeration intensity. A common method of
changing the aeration intensity is to adjust the DO concentration level based on the
ammonium concentration in the effluent. With the ammonium concentration as the
target variable, ammonium feedback control adjusts the aeration intensity as
required by the process. Ammonium feedback control can achieve energy savings of
3–7% [11] compared to constant DO control.
The BSM1 [12] has been established as a simulation model and protocol, and a
handful of papers on the control of WWTPs use this benchmark. Since its
establishment, both in earlier literature and in recent years [13–15], WWTP
operation has usually been assessed in terms of the overall cost index (OCI) and
effluent quality (EQ). The control schemes applied in most of those works directly
attempt to control the DO and nitrate concentrations, which are the variables that
define the attributes of the effluent and the cost of WWTP operation.
The Proportional Integral (PI) or Proportional Integral Derivative (PID) control
strategy is the most commonly used strategy in the process control of WWTPs.
However, the performance of a linear PI/PID controller may be degraded by
disturbances or changes in operating conditions. Various solutions have been
proposed to improve DO concentration control performance. Limiting the literature to
works that use BSM1 as the working scenario, most solutions to the above-mentioned
drawback of PI/PID controllers use different types of controllers, such as nonlinear
PI controllers [16, 17], model predictive control [18–22], and artificial
intelligence control [23–27]. However, the control strategy remains the same, which
is to control the DO and nitrate concentrations.
Generally, the improved control performance of the nonlinear PI controller yields
only a marginal improvement in EQ and rarely achieves a cost reduction. Model
predictive control (MPC) and artificial intelligence control, on the other hand,
usually achieve better EQ and offer a cost reduction. However, these methods have
complicated structures, and the MPC algorithm requires a large number of
computations because, at every control interval, it attempts to optimize future
plant behavior by calculating a sequence of future manipulated variable adjustments.
With these implementations, the overall performance of the WWTP can be said to have
improved. However, a detailed analysis from the environmental perspective is not
discussed further in most of the papers. The imposed pollution limits must also be
analyzed to ensure that the effluent discharged from the WWTP is safe for, or has
minimal impact on, the environment.
1.2 Ammonium Based Aeration Control
Specifically, the most critical pollutants are ammonium and ammonia nitrogen, and
total nitrogen. Few research works have so far taken the imposed pollution limits
into account. All the elements necessary for advanced control are now available and
within reach of any wastewater treatment utility. The arrival of in situ
ion-selective electrodes (ISEs) for measuring ammonium is an important development
for the process industries. This technology is mature and continues to develop and
improve. Thus, ammonia-based aeration control is becoming an increasingly popular
aeration strategy for WWTPs. It has been made possible by the availability of
numerous sensors, e.g., ammonia ISE probes, which determine the activity of the
ammonia ion in solution. Ammonia-based aeration control would be beneficial for many
wastewater treatment utilities. However, the applicable control strategy for a
particular wastewater treatment facility depends on factors such as system
configuration, discharge limits, and wastewater characteristics.
Utilities have implemented ammonia-based aeration control based on feedback and
feedforward strategies. Feedback is very common in the process industry, but it can
have limitations in a highly dynamic system such as a WWTP. Feedforward is more
complex, but it offers the possibility of attaining the best effluent at the lowest
energy cost. Studies on ammonia-based aeration control applied to WWTPs can be found
in a few papers [5, 10, 28]. Two of these studies were implemented in real WWTPs
(Kappala WWTP and an industrial WWTP), while the other was implemented using BSM1.
In all of these papers, the focus is on reducing the aeration cost while maintaining
high-quality effluent. Details on the imposed pollution limits, e.g., for ammonium
and total nitrogen, are not given in these papers. In all of them, PI controllers
are applied. As stated earlier, a PI/PID controller might not respond well to
disturbances, and a WWTP is a highly nonlinear plant with large disturbances. Thus,
a more advanced controller is needed to tackle these issues. For this study, the
reference articles are [11, 29, 30]. Previous related research focusing mainly on
the aeration control of activated sludge WWTPs is summarized in Table 1.
Table 1 Summary of aeration control for activated sludge WWTP
References | Approach | Major findings
Åmand and Carlsson [11] | Supervisory PI ammonium feedback control with a DO profile created from a mathematical minimization of the daily air flow rate | i. Achieved 1–3.5% savings in the airflow rate compared to constant DO control; ii. Uses a modified version of BSM1 (no zones for denitrification included)
Santin et al. [29] | Hierarchical control architecture. Lower level: MPC regulates the DO of the three aerated tanks based on the ammonium and ammonia nitrogen concentration in tank 5. Higher level: affine function determines the DO setpoint | i. Complete elimination of total nitrogen violations is achieved by adding carbon at tank 1; ii. Manipulating the internal recirculation flow rate (Qrin) with a combination of linear and exponential functions makes removal of ammonia violations possible
Santin et al. [30] | Prediction of effluent pollutant concentrations using an ANN | i. A logical signal is generated at the instants where risk is detected; ii. Simulation is done in BSM2
Uprety et al. [28] | Ammonia PID control calculates the DO setpoint based on the difference between the ammonia probe feedback and the ammonia setpoint | i. Implemented at a real industrial WWTP; ii. Significant reduction in the supplemental carbon necessary for denitrification, with a reduction in plant energy consumption; iii. Reduced the need for increased reactor volume
Várhelyi et al. [5] | Combination of PI ammonia-based aeration control with the control of nitrate and return activated sludge recycle | i. Potential to achieve a cost reduction of about 43%; ii. Data collected from a municipal WWTP
Fig. 1 Ammonium cascade control. The NH controller determines the DO setpoint
Most of these studies deal with PI/PID controllers in an ammonium cascade control
structure, as shown in Fig. 1. In this configuration, the ammonia sensor is located
in the aerated zones (reactors 3 to 5). The ammonia probe continuously transmits the
ammonia measurement to an ammonia PI/PID controller, which then computes a DO
setpoint based on the difference between the ammonia probe reading and the required
ammonia setpoint. The ammonia setpoint in the aeration effluent ranges from 1 to
5 mg NH4/l [28], depending on the permit limits. The calculated DO setpoint is then
relayed to the DO controller. Ammonia PI/PID control therefore requires two
controllers in cascade.
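The cascade structure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in the cited studies: the class name `PI`, the function `cascade_step`, the tuning values, the output limits and the conditional-integration anti-windup are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class PI:
    """Discrete PI controller with output clamping (illustrative only)."""
    gain: float            # controller gain K
    integral_time: float   # integral time Ti, in days
    u_min: float           # lower output limit
    u_max: float           # upper output limit
    integral: float = 0.0  # running integral of the error

    def step(self, error: float, dt: float) -> float:
        candidate = self.integral + error * dt
        u = self.gain * (error + candidate / self.integral_time)
        if self.u_min <= u <= self.u_max:
            self.integral = candidate  # integrate only while unsaturated
            return u
        return min(max(u, self.u_min), self.u_max)


def cascade_step(nh_ctrl, do_ctrl, nh_meas, do_meas, nh_setpoint, dt):
    """One sampling step of the ammonium cascade loop."""
    # Outer loop: ammonia above its setpoint calls for more oxygen,
    # so the error is (measurement - setpoint).
    do_setpoint = nh_ctrl.step(nh_meas - nh_setpoint, dt)
    # Inner loop: the DO controller drives the oxygen transfer coefficient.
    kla = do_ctrl.step(do_setpoint - do_meas, dt)
    return do_setpoint, kla


# Hypothetical tuning and limits, chosen only to make the example run.
nh_ctrl = PI(gain=1.0, integral_time=0.1, u_min=0.5, u_max=3.0)
do_ctrl = PI(gain=25.0, integral_time=0.002, u_min=0.0, u_max=360.0)
do_sp, kla = cascade_step(nh_ctrl, do_ctrl, nh_meas=6.0, do_meas=1.0,
                          nh_setpoint=2.0, dt=15 / 1440)
print(do_sp, kla)
```

In this example the outer controller saturates at its upper DO setpoint limit, and the inner controller converts the resulting DO error into an aeration command.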
2 Benchmark Simulation Model No. 1 (BSM1)
The BSM1 is a simulation environment defining a plant layout, a simulation model,
influent loads, test procedures, and evaluation criteria. The BSM1 is based on the
ASM1, and the layout is shown in Fig. 2. The first component of BSM1 is a biological
activated sludge reactor, which consists of five compartments: two non-aerated
compartments and three aerated compartments. Each non-aerated compartment has a
volume of 1000 m3, and each aerated compartment has a volume of 1333 m3. The
secondary settler is a non-reactive unit modeled with 10 layers, in which no
biological reaction takes place. The settler volume is 6000 m3.
The influent data defined in BSM1 consist of dry weather, rain weather, and storm
weather files. The influent data are sampled with a sampling period of 15 min, with
the columns in the following order:
[time SI SS XI XS XBH XBA XP SO SNO SNH SND XND SALK Q0]
In every influent file: SO = 0 g (−COD) m−3; XBA = 0 g COD m−3; SNO = 0 g N m−3;
XP = 0 g COD m−3; SALK = 7 mol m−3. The influent variables are described in
Table 2. For this study, only the dry weather file is considered. The dry weather
file comprises fourteen days of dynamic dry weather influent data (see Fig. 3).
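As a hedged illustration of the column order listed above, the following Python sketch parses whitespace-separated influent rows into named fields and checks the values that are fixed in every influent file. The function names and the two sample rows are made up for the example; the rows are not real benchmark data, and the actual file format should be checked against the benchmark distribution.

```python
INFLUENT_COLUMNS = [
    "time", "SI", "SS", "XI", "XS", "XBH", "XBA", "XP",
    "SO", "SNO", "SNH", "SND", "XND", "SALK", "Q0",
]


def parse_influent(text):
    """Parse whitespace-separated influent rows into a list of dicts."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(INFLUENT_COLUMNS):
            raise ValueError(
                f"expected {len(INFLUENT_COLUMNS)} columns, got {len(fields)}"
            )
        rows.append(dict(zip(INFLUENT_COLUMNS, map(float, fields))))
    return rows


def has_fixed_values(row):
    """Check the components that are constant in every influent file."""
    return (row["SO"] == 0.0 and row["XBA"] == 0.0 and row["SNO"] == 0.0
            and row["XP"] == 0.0 and row["SALK"] == 7.0)


# Two made-up rows, 15 min (1/96 day) apart, for illustration only.
sample = """
0.0        30 70 50 200 28 0 0 0 0 32 7 10 7 18000
0.01041667 30 68 51 198 28 0 0 0 0 31 7 10 7 17500
"""
rows = parse_influent(sample)
print(len(rows), rows[0]["Q0"], all(has_fixed_values(r) for r in rows))
```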
Fig. 2 Default control strategy in BSM1
Table 2 Description of variables
Symbol | Description | Symbol | Description
SI | Soluble inert organic matter | SO | Dissolved oxygen
SS | Readily biodegradable substrate | SNO | Nitrate
XI | Particulate inert organic matter | SNH | Ammonium and ammonia nitrogen
XS | Slowly biodegradable substrate | SND | Soluble biodegradable organic nitrogen
XBH | Active heterotrophic biomass | XND | Particulate biodegradable organic nitrogen
XBA | Active autotrophic biomass | SALK | Alkalinity
XP | Particulate products arising from biomass decay | Q0 | Input flowrate
The simulation setup starts with initialization, in which a 100-day stabilization
run in closed loop (using constant inputs with no noise on the measurements) has to
be completed. This is followed by a simulation using the dry weather file and,
lastly, by a simulation with the weather file to be tested. Noise on the
measurements must be used with the dynamic files. The system is considered
stabilized once the steady state is attained.
A simulation procedure is defined so that results can be assessed fairly. To compare
different control strategies, a few criteria are outlined for the plant performance
assessment. These include the Effluent Quality Index (EQI) and the OCI, which weighs
the operating cost. The assessment also includes the percentage of the operating
time during which the concentration of a pollutant in the discharge is above its
limit, as shown in Table 3. Total nitrogen (Ntot) is the sum of nitrate nitrogen
(SNO) and Kjeldahl nitrogen (NKj).
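The time-over-limit criterion can be illustrated with a short Python sketch. The function names, the zero-order-hold treatment of samples and the sample series below are assumptions made for this example and are not taken from the benchmark code; the limit of 18 g N m−3 for Ntot is the value listed in Table 3.

```python
def violation_percentage(times, values, limit):
    """Percentage of the evaluation period during which values exceed limit.

    Each sample is held until the next sample time (zero-order hold).
    """
    total = times[-1] - times[0]
    over = 0.0
    for i in range(len(times) - 1):
        if values[i] > limit:
            over += times[i + 1] - times[i]
    return 100.0 * over / total


def violation_occasions(values, limit):
    """Number of separate excursions above the limit."""
    count = 0
    above = False
    for v in values:
        if v > limit and not above:
            count += 1  # a new excursion starts here
        above = v > limit
    return count


# Made-up Ntot,e samples (g N m^-3) over one day, for illustration only.
times = [7.0, 7.25, 7.5, 7.75, 8.0]
ntot = [17.0, 18.5, 19.0, 17.5, 16.0]
print(violation_percentage(times, ntot, 18.0))  # 50.0
print(violation_occasions(ntot, 18.0))          # 1
```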
Fig. 3 Dry weather influent plotted against time (days): (a) Q0, input flowrate (m3 d−1) of dry weather influent; (b) SS, SNH and SND concentration (g m−3) of dry weather influent; (c) XBH, XS, XI and XND concentration (g m−3) of dry weather influent
Table 3 Concentration thresholds of pollutants in the effluent
Variables | Maximum accepted values
Ntot [g N/m3] | 18
CODt [g COD/m3] | 100
NH [g N/m3] | 4
TSS [g SS/m3] | 30
BOD5 [g BOD/m3] | 10
2.1 PI Control
The default controller in BSM1 is the PI controller. The primary control objectives
are to maintain the nitrate concentration in tank two at a setpoint value of 1 g m−3
and the DO concentration in tank five at a setpoint value of 2 g(-COD) m−3.
The PI controllers are of the following form:
$$u(t) = K\left(e(t) + \frac{1}{T_i}\int_0^t e(s)\,ds + T_d\,\frac{de(t)}{dt}\right), \qquad u_{\min} < u(t) < u_{\max} \tag{1}$$
where u(t) is the controller output, K is the controller gain, Ti is the integral
time, Td is the derivative time (zero for a PI controller), e(t) is the control
error, and umin and umax are the lower and upper limits of the controller output,
respectively.
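For simulation, the continuous control law of Eq. (1) is usually evaluated at discrete sampling instants. One common choice, assumed here for illustration and not prescribed by the benchmark, is a rectangular (backward Euler) approximation with sampling interval Δt, where e_k denotes the control error at sample k:

```latex
u_k = \operatorname{sat}\!\left[ K\left( e_k
      + \frac{\Delta t}{T_i}\sum_{j=0}^{k} e_j
      + T_d\,\frac{e_k - e_{k-1}}{\Delta t} \right) \right],
\qquad
\operatorname{sat}(u) = \min\bigl(u_{\max},\, \max(u_{\min},\, u)\bigr)
```

With Td = 0 this reduces to a discrete PI controller whose output is clamped to the limits umin and umax.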
2.2 Ammonia Sensor
For the ammonia-based aeration control, the ammonia sensor used is of class B0
(see Fig. 4), with a measurement span of 0–20 g N m−3 and measurement noise
δ = 0.5 g N m−3, as recommended by BSM1 [12]. This sensor is located in the final
aerated compartment and continuously sends the ammonia measurement to the neural
network controller.
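A heavily simplified Python sketch of such a measurement is given below. It only adds noise and clips the reading to the 0–20 g N m−3 span; treating δ as the standard deviation of zero-mean Gaussian noise is an assumption, the function name `noisy_reading` is hypothetical, and the response-time dynamics of a class B0 sensor are deliberately not modeled.

```python
import random


def noisy_reading(true_value, rng, span=(0.0, 20.0), noise_std=0.5):
    """Add zero-mean Gaussian noise and clip to the measurement span.

    Simplification: no sensor delay or response-time dynamics are modeled.
    """
    low, high = span
    reading = true_value + rng.gauss(0.0, noise_std)
    return min(max(reading, low), high)


rng = random.Random(0)  # fixed seed so the example is repeatable
readings = [noisy_reading(v, rng) for v in (0.0, 4.0, 20.0)]
print(readings)
```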
2.3 Performance Assessment
The BSM1 performance assessment provides measures for evaluating the outcome of the
proposed control strategy. According to the benchmark, these measures can be divided
into a few categories: EQ, cost factors for operation (aeration energy (AE), pumping
energy, sludge production, consumption of external carbon source, and mixing
energy), influent quality, and OCI. For this study, only three categories are
highlighted: EQ, AE, and OCI.
Fig. 4 Simulink model of sensor class B0
The EQ is averaged over the 7-day evaluation period of each weather file and is
based on a weighting of the effluent loads of the compounds that have the main
impact on the quality of the receiving water and that are included in regional
legislation. It is expressed as:
$$EQ = \frac{1}{T \cdot 1000}\int_{t=7\,\mathrm{days}}^{t=14\,\mathrm{days}} \Bigl(B_{SS}\,SS_e(t) + B_{COD}\,COD_e(t) + B_{NKj}\,S_{NKj,e}(t) + B_{NO}\,S_{NO,e}(t) + B_{BOD5}\,BOD_e(t)\Bigr)\,Q_e(t)\,dt \tag{2}$$
where
$$S_{NKj,e} = S_{NH,e} + S_{ND,e} + X_{ND,e} + i_{XB}\left(X_{BH,e} + X_{BA,e}\right) + i_{XP}\left(X_{P,e} + X_{I,e}\right)$$
$$SS_e = 0.75\left(X_{S,e} + X_{I,e} + X_{BH,e} + X_{BA,e} + X_{P,e}\right)$$
$$BOD_{5,e} = 0.25\left(S_{S,e} + X_{S,e} + (1 - f_P)\left(X_{BH,e} + X_{BA,e}\right)\right)$$
$$COD_e = S_{S,e} + S_{I,e} + X_{S,e} + X_{I,e} + X_{BH,e} + X_{BA,e} + X_{P,e}$$
The AE takes into account the plant peculiarities and is computed from the KLa
according to the following relation:
$$AE = \frac{S_O^{sat}}{T \cdot 1.8 \cdot 1000}\int_{t=7\,\mathrm{days}}^{t=14\,\mathrm{days}} \sum_{i=1}^{5} V_i\,K_La_i(t)\,dt \tag{3}$$
with KLa given in d−1 and i referring to the compartment number.
Finally, the OCI is calculated as:
$$OCI = AE + PE + 5 \cdot SP + 3 \cdot EC + ME \tag{4}$$
where PE is the pumping energy, SP is the sludge production to be disposed of, EC
is the consumption of external carbon source, and ME is the mixing energy. Further
details on the equations for all of these terms can be found in [12].
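The three indices above can be evaluated numerically from sampled simulation outputs. The following Python sketch uses trapezoidal integration; the function names are hypothetical, the weighting factors are passed in as a parameter because their values are not given in this paper (they should be taken from [12]), and the saturation concentration of 8 g m−3 used in the example is an assumed value.

```python
def integrate(times, values):
    """Trapezoidal integral of sampled values over times (in days)."""
    total = 0.0
    for i in range(len(times) - 1):
        total += 0.5 * (values[i] + values[i + 1]) * (times[i + 1] - times[i])
    return total


def weighted_concentration(ss, cod, nkj, no, bod5, weights):
    """Weighted pollutant sum inside the integral of Eq. (2), in g m^-3."""
    return (weights["SS"] * ss + weights["COD"] * cod + weights["NKj"] * nkj
            + weights["NO"] * no + weights["BOD5"] * bod5)


def effluent_quality(times, weighted_conc, flow):
    """Eq. (2): weighted_conc in g m^-3, flow Qe in m^3 d^-1."""
    period = times[-1] - times[0]
    load = [c * q for c, q in zip(weighted_conc, flow)]
    return integrate(times, load) / (period * 1000.0)


def aeration_energy(times, kla_series, volumes, so_sat):
    """Eq. (3): kla_series[k] lists KLa (d^-1) per compartment at times[k]."""
    period = times[-1] - times[0]
    summed = [sum(v * k for v, k in zip(volumes, klas)) for klas in kla_series]
    return so_sat / (period * 1.8 * 1000.0) * integrate(times, summed)


def overall_cost_index(ae, pe, sp, ec, me):
    """Eq. (4)."""
    return ae + pe + 5.0 * sp + 3.0 * ec + me


# Constant made-up signals over the evaluation week, for illustration only.
times = [7.0, 10.5, 14.0]
eq = effluent_quality(times, [10.0] * 3, [1000.0] * 3)
ae = aeration_energy(times, [[240.0]] * 3, volumes=[1333.0], so_sat=8.0)
print(eq, ae, overall_cost_index(ae, pe=10.0, sp=2.0, ec=1.0, me=5.0))
```

With constant signals the integrals reduce to simple products, which makes the example easy to check by hand.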
3 Methodology
3.1 Feed-Forward Neural Network Ammonia-Based Aeration Control
An artificial neural network (ANN) is an approach that imitates the biological
nervous system, e.g., the brain. It uses nonlinear processing units to mimic
biological neurons and models the activity of the synapses between neurons by
fine-tuning the values of the variable weights until the network output matches the
target. The main features of ANNs are parallel processing capability and distributed
storage. ANNs offer advantages such as outstanding nonlinear mapping ability, strong
fault tolerance, self-organization, self-learning, and adaptive reasoning ability
[23].
In this study, which applies a neural network in a control system, the neural
network acts as a function approximator. The process (see Fig. 5) involves adjusting
the parameters of the network so that it produces the same response as the unknown
function when the same input is applied to both systems.
In this paper, the proposed controller (see Fig. 6), the neural network
ammonia-based aeration control (NN-ABAC), manipulates the oxygen transfer
coefficient KLa5 of reactor tank five by directly using the measured values of the
DO concentration and the ammonia concentration in tank five. This study aims to
evaluate the feedforward neural network ammonia-based aeration controller against
the benchmark PI strategy with a constant DO setpoint.
Fig. 5 The neural network as a function approximator
Fig. 6 The block diagram of the neural network ammonia-based aeration controller
Two-layer networks with sigmoid transfer functions in the hidden layer and a linear
transfer function in the output layer are universal approximators [31]. In this
study, a two-layer feed-forward neural network consisting of 10 sigmoid hidden
neurons and one linear output neuron is applied. The structure of the feed-forward
neural network is illustrated in Fig. 7.
Assume that the training samples are $\{x_i, r_i\} \in \{X, R\}$, where $x_i$
represents an input of the network, $X = [x_1(k), x_2(k), \ldots, x_n(k)]^T$ is the
input vector, $r_i$ represents an expected output of the network, and
$R = [r_1(k), r_2(k), \ldots, r_n(k)]^T$ is the expected output vector. The sigmoid
function is chosen as the activation function of the hidden layer, and a linear
function as the activation function of the output layer. $w^{L1}_{i,j} \in W^{L1}$
represents the weight connecting the ith neuron of the input layer and the jth
neuron of the hidden layer, and $w^{L2}_{i,j} \in W^{L2}$ is the weight connecting
the ith neuron of the hidden layer and the jth neuron of the output layer. With a
two-layer network and $Y = [y_1(k), y_2(k), \ldots, y_n(k)]$ as the actual output of
the network,
$$Y = W^{L2} f\left(X W^{L1}\right) \tag{5}$$
where f is the sigmoid function
$$f(x) = \frac{1}{1 + e^{-x}} \tag{6}$$
and e is a transcendental number, e = 2.71828 [32].
Fig. 7 The topological structure of the feed-forward neural network
The training index is set as:
$$J(k) = \frac{1}{2}\bigl(e(k)\bigr)^2 \tag{7}$$
This structure can fit multidimensional mapping problems well, given reliable data
and sufficient neurons in its hidden layer. The feed-forward neural network is
widely used in modeling and control applications due to its simplicity and
efficiency [14]. A higher learning rate and the avoidance of local minima can be
achieved through the nonlinear mapping from the input layer to the hidden layer and
the linear mapping from the hidden layer to the output layer [23]. The network is
trained with the Bayesian Regularization algorithm.
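The forward pass of such a network can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the weights are random and untrained, bias terms are omitted to mirror Eq. (5), the two inputs are taken to be the DO and ammonia measurements in tank five, and no input or output scaling is applied. Bayesian Regularization training itself is not shown.

```python
import math
import random


def sigmoid(x):
    """Sigmoid activation of Eq. (6)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, w1, w2):
    """Two-layer network: sigmoid hidden layer, one linear output neuron.

    w1[j] holds the input weights of hidden neuron j; w2[j] is the weight
    from hidden neuron j to the output.
    """
    hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in w1]
    return sum(w * h for w, h in zip(w2, hidden))


def training_index(error):
    """Training index of Eq. (7) for a single error value."""
    return 0.5 * error ** 2


rng = random.Random(0)
N_INPUTS, N_HIDDEN = 2, 10  # inputs: DO and ammonia in tank five
w1 = [[rng.uniform(-1.0, 1.0) for _ in range(N_INPUTS)]
      for _ in range(N_HIDDEN)]
w2 = [rng.uniform(-1.0, 1.0) for _ in range(N_HIDDEN)]

# Untrained, unscaled output for a made-up measurement pair (DO, ammonia).
kla5 = forward([2.0, 4.0], w1, w2)
print(kla5, training_index(0.5))
```

In a trained controller the weights would come from the training procedure, and the output would be scaled to the admissible KLa5 range.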
4 Results and Discussion
The ammonia-based aeration control applied in this study uses both the ammonium
concentration and the DO concentration as the controlled variables, and the oxygen
transfer coefficient as the manipulated variable. The ammonium sensor is located in
the fifth tank; it is common practice to place the sensor in the last zone of the
activated sludge process. Simulations are carried out using class B0 sensors for SNH
and SNO and a class A sensor for SO. The dry weather influent is used to evaluate
the proposed control strategy.
The pollutants SNH,e and Ntot,e are the ones that are hardest to keep under the
permitted limits. Ntot,e can be reduced by adding an external carbon flow (qEC) to
the first tank, while reducing the peaks of SNH,e requires proper manipulation of
the internal recirculation flow rate (Qrin). The proposed control strategy is
compared with the default BSM1 PI controller (see Fig. 8). The dotted line is the
Ntot,e limit, the blue line is the default BSM1 controller, and the red line is the
proposed neural network ammonia-based aeration control. With the proposed NN-ABAC
strategy, the Ntot,e peaks decrease substantially, and the number of violations is
reduced from 7 occasions to 5 during the evaluation week.
However, the proposed control strategy alone does not keep Ntot,e below the
permitted limit. Ntot,e violations can be eliminated completely only if carbon is
added at tank one, because the carbon dosage increases the anoxic growth of XBH.
Fig. 8 Ntot,e (mg N/l) versus time (days) for the one-week simulation using dry weather with the benchmark PI controller (BSM1, blue line) and with the NN-ABAC (red line), together with the Ntot limit
Fig. 9 SNH,e (mg N/l) versus time (days) for the one-week simulation using dry weather with the default PI controller (BSM1, blue line) and with the NN-ABAC (red line), together with the SNH limit
As for SNH,e violations, only a slight decrease in the SNH,e peaks is achieved using
the NN-ABAC control strategy (see the red line in Fig. 9), and the number of
violation occasions remains the same. As mentioned previously, SNH,e violations can
be controlled if Qrin is correctly manipulated; proper manipulation of Qrin is
needed to improve the nitrification process.
Table 4 shows the results for EQ, AE, OCI, and the percentage of time over the
limits for SNH,e and Ntot,e. With the proposed control strategy (NN-ABAC), the time
in violation is reduced by 22% for Ntot,e and by 4% for SNH,e. These figures are
consistent with the graphs shown in Figs. 8 and 9. In addition, EQ improves by 2%,
which is expected given the reduction in effluent violations for SNH,e and Ntot,e.
However, AE increases by 1%, mainly because the DO concentration setpoint is fixed
in the benchmark, whereas the DO concentration varies under the proposed controller.
Table 4 Results with the proposed NN-ABAC and its comparison with the benchmark BSM1 control strategy for dry weather
Criterion | BSM1 | NN-ABAC | % of reduction
EQ (kg poll. units/d) | 6096.71 | 5975.73 | 2%
AE | 3697.57 | 3749.24 | −1%
OCI | 16366.3 | 16435.9 | 0%
Ntot,e violations (% of operating time) | 17.8571 | 13.8393 | 22%
SNH,e violations (% of operating time) | 16.8155 | 16.0714 | 4%
The DO setpoint for the proposed controller depends on the ammonia reading
obtained by the ammonia sensor at tank 5. However, the slight increase in AE does
not raise the OCI appreciably.
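The percentage reductions reported in Table 4 can be reproduced from the tabulated BSM1 and NN-ABAC values with a short calculation. The sketch below is only a check of the arithmetic; the dictionary name and function name are hypothetical.

```python
# (BSM1, NN-ABAC) values copied from Table 4.
TABLE_4 = {
    "EQ": (6096.71, 5975.73),
    "AE": (3697.57, 3749.24),
    "OCI": (16366.3, 16435.9),
    "Ntot,e violations": (17.8571, 13.8393),
    "SNH,e violations": (16.8155, 16.0714),
}


def percent_reduction(baseline, proposed):
    """Relative reduction with respect to the baseline, in percent."""
    return 100.0 * (baseline - proposed) / baseline


reductions = {name: percent_reduction(*pair) for name, pair in TABLE_4.items()}
for name, value in reductions.items():
    print(f"{name}: {value:.1f}%")
```

A negative value, as obtained for AE and OCI, indicates an increase relative to the benchmark.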
5 Conclusions
This paper aims to improve the effluent control of the benchmark plant. With the
proposed control strategy (NN-ABAC), effluent violations are reduced for the two
main pollutants, SNH,e and Ntot,e, which are the ones that are difficult to keep
under the established limits. The simulation results show that violations are
reduced by 22% for Ntot,e and by 4% for SNH,e. In addition, EQ is reduced by 2%
compared to the default PI benchmark. The reduction in violations indicates that the
proposed controller improves the effluent control of the BSM1.
Nonetheless, as a future improvement, adding a carbon dosage at tank one can improve
the denitrification process and thus help eliminate more Ntot,e violations, although
the additional carbon dosage will increase the OCI. Good control of the internal
recirculation flow rate is also needed to improve the nitrification process, which
would remove more SNH,e.
Acknowledgements The authors wish to thank the Universiti Malaysia Sarawak and
Special MYRA Assessment Funding (Project ID: F02/Sp/MYRA/1719/2018) for their financial
support. Their support is gratefully acknowledged.
References
1. Mei-jin L, Fei L (2014) A nonlinear adaptive control approach for an activated sludge process
using neural networks. In: The 26th Chinese control and decision conference CCDC 2014.
IEEE, pp 2435–2440
2. Hoang BL, Tien DN, Luo F, Nguyen PH (2014) Dissolved oxygen control of the activated
sludge wastewater treatment process using Hedge Algebraic control. In: 2014 7th
international conference on biomedical engineering and informatics. IEEE, pp 827–832
3. Ghoneim WAM, Helal AA, Wahab MGA (2016) Minimizing energy consumption in
wastewater treatment plants. In: 2016 3rd international conference on renewable energies for
developing countries, REDEC 2016. Institute of Electrical and Electronics Engineers Inc.
4. Fernández FJ, Castro MC, Rodrigo MA, Cañizares P (2011) Reduction of aeration costs by
tuning a multi-set point on/off controller: a case study. Control Eng Pract 19:1231–1237
5. Várhelyi M, Brehar M, Cristea VM (2018) Control strategies for wastewater treatment plants
aimed to improve nutrient removal and to reduce aeration costs. In: Proceedings of the 2018
IEEE international conference on automation, quality and testing, robotics, AQTR 2018,
THETA 21st edn, pp 1–6
6. Amand L, Carlsson B (2012) Optimal aeration control in a nitrifying activated sludge process.
Water Res 46:2101–2110
7. Liu C, Li S, Zhang F (2011) The oxygen transfer efficiency and economic cost analysis of
aeration system in municipal wastewater treatment plant. Energy Procedia 5:2437–2443
8. Rieger L, Jones RM, Dold PL, Bott CB (2013) Ammonia-based feedforward and feedback
aeration control in activated sludge processes. Water Environ Res 86:63–73
9. Rieger L, Jones RM, Dold PL, Bott CB (2014) Ammonia-based feedforward and feedback
aeration control in activated sludge processes. Water Environ Res 86:63–73
10. Åmand L, Carlsson B (2014) Aeration control with gain scheduling in a full-scale wastewater
treatment plant. IFAC
11. Åmand L, Carlsson B (2013) The optimal dissolved oxygen profile in a nitrifying activated
sludge process – comparisons with ammonium feedback control. Water Sci Technol 68:641–
649
12. Alex J, Benedetti L, Copp J, Gernaey KV, Jeppsson U, Nopens I, Pons M, Rieger L, Rosen C,
Steyer JP, Vanrolleghem P, Winkler S (2008) Benchmark Simulation Model no. 1 (BSM1)
13. Chen W, Yao C, Lu X (2014) Optimal design activated sludge process by means of
multi-objective optimization: case study in Benchmark Simulation Model 1 (BSM1). Water
Sci Technol 69:2052–2058
14. Zhang W, Qiao J (2014) Direct adaptive neural network control for wastewater treatment
process. In: Proceeding of the 11th world congress on intelligent control and automation.
IEEE, pp 4003–4008
15. Kumar SS, Latha K (2017) A hybrid intelligent controller to reduce the energy of a
wastewater treatment plant. In: 2017 trends in industrial measurement and automation
(TIMA). IEEE, pp 1–5
16. Samsudin SI, Rahmat MF, Abdul Wahab N (2014) Nonlinear PI control with adaptive
interaction algorithm for multivariable wastewater treatment process. Math. Probl. Eng.
2014 (2014)
17. Samsudin SI, Rahmat MF, Wahab NA, Razali MC, Gaya MS, Salim SNS (2014)
Improvement of activated sludge process using enhanced nonlinear PI controller. Arab J
Sci Eng 39:6575–6586
18. Holenda B, Domokos E, Fazakas J (2008) Dissolved oxygen control of the activated sludge
wastewater treatment process using model predictive control. Comput Chem Eng 32:1270–
1278
19. Akyurek E, Yuceer M, Atasoy I (2009) Comparison of control strategies for dissolved oxygen
control in activated sludge wastewater. Elsevier B.V.
20. Han H-G, Qiao J-F, Chen Q-L (2012) Model predictive control of dissolved oxygen
concentration based on a self-organizing RBF neural network. Control Eng Pract 20:465–476
21. Wahab NA, Katebi R, Balderud J, Rahmat MF (2011) Data-driven adaptive model-based
predictive control with application in wastewater systems. IET Control Theory Appl 5:803–
812
22. Cristea MV, Agachi SP (2006) Nonlinear model predictive control of the wastewater
treatment plant, pp 1365–1370
23. Du X, Wang J, Jegatheesan V, Shi G (2018) Dissolved oxygen control in activated sludge
process using a neural network-based adaptive PID algorithm. Appl Sci 8:261
24. Han HG, Qian HH, Qiao JF (2014) Nonlinear multiobjective model-predictive control scheme
for wastewater treatment process. J Process Control 24:47–59
25. Shi X, Qiao J (2010) Neural network predictive optimal control for wastewater treatment. In:
Proceedings of the 2010 international conference on intelligent control and information
processing, ICICIP 2010, pp 248–252
26. Han H-G, Qiao J-F (2011) Adaptive dissolved oxygen control based on dynamic structure
neural network. Appl Soft Comput 11:3812–3820
27. Fu W, Qiao J, Han G, Meng X (2015) Dissolved oxygen control system based on the T-S
fuzzy neural network
28. Uprety K, Kennedy A, Balzer W, Baumler R, Duke R, Bott C (2015) Implementation of
ammonia-based aeration control (ABAC) at full-scale wastewater treatment plants. In:
Proceedings of the water environment federation 2015, pp 1–10
29. Santin I, Pedret C, Meneses M, Vilanova R (2015) Process based control architecture for
avoiding effluent pollutants quality limits violations in wastewater treatment plants. In: 2015
19th international conference on system theory, control and computing (ICSTCC). IEEE,
pp 396–402
30. Santin I, Pedret C, Meneses M, Vilanova R (2015) Artificial neural network for nitrogen and
ammonia effluent limit violations risk detection in wastewater treatment plants. In: 2015 19th
international conference on system theory, control and computing, ICSTCC 2015, joint
conference SINTES 19, SACCS 15, SIMSIS 19, pp 589–594
31. Hagan MT, Demuth HB, Jesús ODE (1966) An introduction to the use of neural networks in
control systems. Endeavour 25:58
32. Qiao JF, Han G, Han HG (2014) Neural network on-line modeling and controlling method for
multi-variable control of wastewater treatment processes. Asian J Control 16:1213–1223
A Min-conflict Algorithm for Power
Scheduling Problem in a Smart Home
Using Battery
Sharif Naser Makhadmeh, Ahamad Tajudin Khader,
Mohammed Azmi Al-Betar, Syibrah Naim, Zaid Abdi Alkareem Alyasseri,
and Ammar Kamal Abasi
Abstract Scheduling the operations of smart home appliances under an electricity pricing scheme is a primary issue facing power supplier companies and their users,
because an efficient schedule helps maintain the power system and reduces the electricity
bill (EB) for users. This problem is known as the power scheduling problem in a smart
home (PSPSH). PSPSH can be addressed by shifting appliance operation times from one
period to another. The primary objectives of addressing PSPSH are minimizing EB,
balancing power demand by reducing the peak-to-average ratio (PAR), and maximizing the satisfaction level of users. One of the most popular heuristic algorithms,
the min-conflict algorithm (MCA), is adapted in this paper to address PSPSH. A
smart home battery (SHB) is used as an additional source in an attempt to enhance the
schedule. The experimental results showed the robust performance of the proposed
MCA with SHB in achieving the PSPSH objectives. In addition, MCA is compared
with Biogeography based Optimization (BBO) to evaluate its results. The
comparison showed that MCA obtained a better schedule in terms of reducing EB and
PAR, whereas BBO performed better in improving user comfort.
Keywords Optimization · Min-conflict algorithm · Power scheduling problem in
a smart home · Smart home battery
1 Introduction
Power demand is increasing over time due to the continuous growth of the population
and the emergence of new smart home appliance technologies that need more power to
S. N. Makhadmeh (B) · A. T. Khader · S. Naim · Z. A. A. Alyasseri · A. K. Abasi
School of Computer Sciences, Universiti Sains Malaysia, Penang, Malaysia
e-mail: m_shareef_cs@yahoo.com
M. A. Al-Betar
Department of Information Technology, Al-Huson University College, Al-Balqa Applied
University, Irbid, Jordan
Z. A. A. Alyasseri
ECE Department, Faculty of Engineering, University of Kufa, Najaf, Iraq
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_33
be operated [15]. Accordingly, old power grids face several issues regarding the
stability of the power system in meeting this massive increase in power demand.
In addition, old power grids cannot accommodate more power generators to meet
power demand due to the primitive nature of their architecture [16, 23, 28].
Smart Grids (SGs) were developed to address such issues and are considered
the next generation of power grids. The communication system is the primary
system used in SGs; it provides two-way communication between users and
power supplier companies (PSCs) to enhance the distribution and power systems. This
enhancement allows PSCs to distribute more power to users and meet their power
needs.
SGs allow users to manage their power consumption using demand response
(DR) programs. DR provides several programs that motivate users to modify and
balance the power consumption curve of their appliances in order to maintain the stability of
the power system [20]. DR is categorized into incentive-based programs and dynamic
pricing programs [22]. Dynamic pricing programs provide electricity prices that vary
over a time range, with high tariffs at peak periods and low tariffs at off-peak periods. These programs motivate users to schedule appliance
operating times at off-peak periods.
The problem of scheduling the operation times of smart home appliances at suitable periods according to dynamic pricing programs is known as the power scheduling problem
in a smart home (PSPSH). PSPSH has been formulated as a scheduling optimization
problem that aims to minimize the electricity bill (EB), balance power demand by
minimizing the ratio of the highest to the average power demand, known as the
peak-to-average ratio (PAR), and maximize the satisfaction level of users.
PSPSH has been addressed in several studies using different optimization algorithms,
such as exact and metaheuristic optimization algorithms. Metaheuristic optimization algorithms are the most popular in handling PSPSH due to their ability
to efficiently explore large and rugged search spaces. In addition, metaheuristic
optimization algorithms have proved their efficiency in several domains, such as power
scheduling [23–26], text feature selection [1–3], authentication [11–13], gene selection [9, 10], and other domains [6–8]. However, most metaheuristic optimization
algorithms are not able to search locally in search spaces efficiently [5]. Heuristic
optimization algorithms have therefore been found more efficient than metaheuristic optimization algorithms in searching locally for an optimal solution, because
they concentrate on only one solution.
In this paper, one of the most popular and efficient heuristic optimization algorithms, the min-conflict
algorithm (MCA), which has not previously been used in the domain of power scheduling, is adapted to address PSPSH. In addition, a smart home battery
(SHB) is formulated to improve the quality of solutions by storing power at low pricing
periods and discharging the stored power at high pricing periods. The dataset used
to evaluate the approaches in [24] and [26] is adopted in the evaluation of
the proposed approach. The performance of the proposed approach is evaluated and
compared with another approach proposed in [26].
The rest of this paper is organized as follows. The most important studies
that addressed PSPSH are presented in Sect. 2. The PSPSH formulation is discussed in
Sect. 3. Section 4 describes MCA and its adaptation to address PSPSH. In Sect. 5, the
simulation results of the proposed method are presented and illustrated, and Sect. 6
concludes the paper.
2 Related Work
Several optimization algorithms have been adapted to address PSPSH, including
exact and metaheuristic optimization algorithms. Metaheuristic optimization algorithms are more popular than exact algorithms in addressing PSPSH. Some of the
studies that use metaheuristic optimization algorithms are discussed in this section.
The authors of [30] formulated PSPSH as a multi-objective optimization problem.
The multi-objective function of PSPSH was formulated to reduce EB and user discomfort level. Genetic algorithm (GA), binary particle-swarm algorithm (BPSO),
and ant colony optimization (ACO) algorithm were adapted to schedule 13 home
appliances within one day. GA outperformed ACO and BPSO in achieving the PSPSH
objectives.
In [31], PSPSH was formulated as a multi-objective optimization problem to
optimize EB and user comfort level simultaneously. Two dynamic pricing programs
were combined to balance power demand and maintain system stability. GA was
adapted to address PSPSH using 16 operations of appliances for 90 days. The results
demonstrated the efficiency of the proposed approach in reducing EB and improving the user comfort
level.
Biogeography based Optimization (BBO) and GA were adapted to address PSPSH
in [20]. A dynamic pricing program, namely, time-of-use pricing was used to schedule
operations of 12 appliances within one day. The simulation results showed the high
performance of BBO, where it performed better than GA in searching for an optimal
schedule.
In [4], GA and Flower Pollination Algorithm (FPA) were adapted to address
PSPSH. Sixteen appliances were used to evaluate the algorithms in terms of reducing EB and PAR and improving user comfort level in accordance with a dynamic
pricing program known as real-time price (RTP). In simulation results, FPA performed better than GA in reducing EB and PAR, whereas GA performed better than
FPA in improving comfort level.
The authors of [18] adapted the harmony search algorithm (HSA) and the BAT algorithm to
obtain a near-optimal schedule for 11 appliances. Critical peak pricing was used as
the dynamic pricing program. In the simulation results, HSA obtained a
better schedule than BAT and performed better in balancing the power consumed over
the time horizon.
The authors of [26] adapted the PSO algorithm in an attempt to obtain an optimal
schedule for 36 appliance operations using a smart battery. RTP was used as the dynamic
pricing program. In the simulation results, PSO was compared with
GA to evaluate its performance. PSO obtained a better schedule than GA both with and
without the smart battery.
Note that, to the best of the authors' knowledge, heuristic algorithms have not previously been adapted to
address PSPSH. Therefore, one of the most popular heuristic algorithms for
solving scheduling problems, the min-conflict algorithm, is adapted in this
paper.
3 Problem Formulation
PSPSH can be addressed by scheduling appliance operations at specific periods in
accordance with dynamic pricing program(s). The primary objectives of addressing
PSPSH are minimizing EB, PAR, and user discomfort level.
In this section, the PSPSH objectives are illustrated and formulated mathematically. In
addition, an SHB is formulated to improve the quality of solution(s) and obtain a more suitable schedule. The RTP program is used as the dynamic pricing program and is combined
with inclining block rate (IBR) due to the efficiency of IBR in balancing power demand
and reducing the PAR value [23].
3.1 PSPSH Objectives Formulation
Minimizing EB is the essential objective of PSPSH due to its importance in motivating
users to reschedule their appliance operations. EB is mathematically formulated in
Eq. 1.
$$\text{Cost} = \sum_{j=1}^{T} \sum_{i=1}^{S} P_i^j \times pc^j \qquad (1)$$
where $S$ is the maximum number of appliances in the home, $T$ denotes the maximum number
of time slots, $P_i^j$ is the power consumed at time slot $j$ by appliance $i$, and $pc^j$ is the electricity
tariff at time slot $j$. In the proposed approach, RTP is combined with the IBR program;
therefore, $pc^j$ has two tariffs based on the amount of power consumed, as follows:
$$pc^j = \begin{cases} a^j & \text{if } 0 \le P^j \le C \\ b^j & \text{if } P^j > C \end{cases} \qquad (2)$$
$$b^j = \lambda \times a^j \qquad (3)$$
where $P^j$ denotes the power consumption of all appliances at time slot $j$, $C$ is the threshold
of power consumed, $\lambda$ is a positive number, $a^j$ denotes the normal price at $j$, and $b^j$ is
the high price at $j$.
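As a minimal sketch of Eqs. 1 to 3, the bill can be computed slot by slot: sum the power of all appliances in the slot, pick the normal or the high tariff by comparing that sum with the threshold, and accumulate the product. The function names, argument layout, and toy numbers below are illustrative assumptions and are not taken from the paper.

```python
from typing import Sequence


def slot_tariff(slot_power: float, normal_price: float, threshold: float, lam: float) -> float:
    """Eqs. 2-3: charge a^j up to the threshold C, and b^j = lam * a^j above it."""
    if slot_power <= threshold:
        return normal_price
    return lam * normal_price


def electricity_bill(power: Sequence[Sequence[float]], normal_prices: Sequence[float],
                     threshold: float, lam: float) -> float:
    """Eq. 1: power[i][j] plays the role of P_i^j and normal_prices[j] of a^j."""
    cost = 0.0
    for j, a_j in enumerate(normal_prices):
        slot_power = sum(row[j] for row in power)  # P^j: all appliances in slot j
        cost += slot_power * slot_tariff(slot_power, a_j, threshold, lam)
    return cost


# Toy data (hypothetical, not from the paper): two appliances, three time slots.
toy_power = [[0.5, 0.0, 0.0],
             [0.5, 0.2, 0.0]]
toy_bill = electricity_bill(toy_power, normal_prices=[2.0, 3.0, 1.0],
                            threshold=0.6, lam=1.5)
```

In the toy data only the first slot exceeds the threshold, so only that slot is billed at the high tariff.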
PAR is the second objective of addressing PSPSH, which is related to balancing
the overall power consumed. PAR is formulated in Eq. 4:
$$PAR = \frac{P_{max}}{P_{avg}} \qquad (4)$$
where $P_{max}$ denotes the maximum power consumed and $P_{avg}$ is the average overall power
consumed.
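Eq. 4 can be illustrated with a short sketch that takes the total power per time slot as input. The function name and the choice to reject an all-zero profile are assumptions made for this example.

```python
from typing import Sequence


def peak_to_average_ratio(slot_power: Sequence[float]) -> float:
    """Eq. 4: maximum slot power divided by the average slot power."""
    average = sum(slot_power) / len(slot_power)
    if average == 0:
        # Assumption of this sketch: PAR is treated as undefined for an empty load.
        raise ValueError("PAR is undefined when no power is consumed")
    return max(slot_power) / average
```

A perfectly flat consumption profile gives a PAR of 1, which is the lower bound; any peak raises the ratio.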
User comfort level can be improved by reducing the waiting time rate ($WTR$) of
appliances because users always prefer to finish appliances' operations as soon as
possible. $WTR$ is formulated as follows:
$$WTR_i = \frac{st_i - OTPs_i}{OTPe_i - OTPs_i - l_i}, \quad \forall i \in S \qquad (5)$$
where $WTR_i$ denotes the $WTR$ for appliance $i$, $st_i$ is the starting operation time of appliance $i$,
$OTPs_i$ and $OTPe_i$ are the beginning and ending of the allowable period for appliance $i$ to be
scheduled, respectively, and $l_i$ is the length of the operation cycle of appliance $i$. The average
$WTR$ for all appliances is calculated as follows:
$$WTR_{avg} = \frac{\sum_{i=1}^{m} (st_i - OTPs_i)}{\sum_{i=1}^{m} (OTPe_i - OTPs_i - l_i)} \qquad (6)$$
The components of $WTR_{avg}$ are presented and illustrated in Fig. 1.
In this study, the percentage of satisfaction (comfort) of users ($UC_p$) is calculated
based on $WTR$ as follows:
$$UC_p = (1 - WTR_{avg}) \times 100\% \qquad (7)$$
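Eqs. 5 to 7 can be sketched as below. The function names are hypothetical, and returning a rate of zero for an appliance whose allowable period leaves no slack is an assumption of this sketch rather than something stated in the paper.

```python
from typing import Sequence


def waiting_time_rate(st: float, otps: float, otpe: float, length: float) -> float:
    """Eq. 5 for one appliance: delay after OTPs divided by the available slack."""
    slack = otpe - otps - length
    if slack <= 0:
        return 0.0  # assumption: an appliance that cannot be shifted never waits
    return (st - otps) / slack


def average_wtr(sts: Sequence[float], otps: Sequence[float],
                otpe: Sequence[float], lengths: Sequence[float]) -> float:
    """Eq. 6: summed delays divided by summed slack over all appliances."""
    total_delay = sum(s - a for s, a in zip(sts, otps))
    total_slack = sum(e - a - l for a, e, l in zip(otps, otpe, lengths))
    return total_delay / total_slack


def user_comfort(wtr_avg: float) -> float:
    """Eq. 7: comfort percentage derived from the average waiting time rate."""
    return (1.0 - wtr_avg) * 100.0
```

For instance, an operation allowed in the period 540–780 with a cycle of 105 slots has a slack of 135 slots, so starting it at slot 600 gives a waiting time rate of 60/135.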
3.2 Smart Home Battery (SHB)
SHB contains a battery management system, which allows it to
charge and discharge automatically based on predefined constraints. In this section,
SHB is formulated to enhance the quality of solution(s) and attempt to achieve the PSPSH
objectives optimally. The proposed SHB can efficiently reduce the power consumed at
peak periods, since it is formulated to store power at off-peak periods and discharge
the stored power at peak periods.
Fig. 1 Illustration of the components in Eq. 6
The proposed SHB stores power at low pricing periods if it is not completely
charged, and discharges at high pricing periods if it is not empty. In addition, the power
consumed by the charging operation should not exceed $C$. The charging and discharging
states of SHB are formulated as follows:
$$X_{SHB} = \begin{cases} 1 & \text{if } pc^j \le pc_{avg} \text{ and } N_{SHB} \neq 0 \text{ and } P^j < C \\ 0 & \text{if } pc^j > pc_{avg} \text{ and } CH_{SHB} > 0 \end{cases} \qquad (8)$$
where $X_{SHB}$ is the state of SHB, with 1 denoting the charging mode and 0
the discharging mode. The power charged and discharged at each time slot should not
exceed a maximum allowable limit. $pc_{avg}$ is the average tariff over all time slots, $CH_{SHB}$
is the total power charged in SHB, and $N_{SHB}$ is the power needed by SHB to be full, which
is formulated as follows:
$$N_{SHB} = C_{SHB} - CH_{SHB} \qquad (9)$$
where $C_{SHB}$ is the capacity of SHB.
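The charging rule described in this section can be sketched as a single decision per time slot. Everything beyond the stated conditions is an assumption of this sketch: the function name, the extra "idle" outcome when neither condition holds, the per-slot limit `max_rate`, and the reading that charging is capped so that the slot total stays within the threshold C.

```python
from typing import Tuple


def shb_step(price: float, avg_price: float, slot_power: float, threshold: float,
             charged: float, capacity: float, max_rate: float) -> Tuple[str, float]:
    """Return the SHB mode for one slot and the amount charged or discharged."""
    needed = capacity - charged  # power still needed for a full battery
    if price <= avg_price and needed > 0 and slot_power < threshold:
        # Charge at a low price, limited by the rate, the remaining capacity,
        # and (assumed here) the headroom left below the threshold C.
        return "charge", min(max_rate, needed, threshold - slot_power)
    if price > avg_price and charged > 0:
        # Discharge at a high price, limited by the rate and the stored power.
        return "discharge", min(max_rate, charged)
    return "idle", 0.0  # hypothetical third state, not part of the formulation


mode, amount = shb_step(price=1.0, avg_price=2.0, slot_power=0.01, threshold=0.0333,
                        charged=0.0, capacity=13.5, max_rate=0.08)
```

With these toy values the battery charges, and the amount is bounded by the headroom below the threshold rather than by the rate limit.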
4 Min-conflict Heuristic Algorithm (MCA) for PSPSH
MCA is one of the most popular heuristic optimization algorithms proposed to
address scheduling problems, owing to its simplicity and speed [14]. MCA has been adapted
to address different problems such as scheduling sensor resources [19], job shop
scheduling [21], and n-queens [27].
In PSPSH, an MCA solution is a vector of appliances' starting operation
times (st). MCA for PSPSH starts by initializing the PSPSH and SHB parameters
and then initializing the solution vector, as shown in steps 1 and 2 of Algorithm 1. Note
that MCA is a local search algorithm, so its population consists of only one solution
vector of size S × 1. In the third step, the solution is updated by choosing an appliance randomly, calculating its operation cost at each time slot, and then updating its st
so that it operates at the time slot with the least cost. Recall that each appliance must operate subject to several constraints, such as OTPs, OTPe, and l (see Fig. 1);
therefore, these constraints are considered during the updating step. In step 4,
the allowable periods and the power with which SHB can be charged and discharged are determined
by calculating the power consumed by each appliance (see step 4 of Algorithm 1). Steps
3 and 4 are repeated until the maximum number of iterations is reached, as shown in step 5 of
Algorithm 1.
Algorithm 1. Pseudo code of MCA for PSPSH using SHB
//Step 1:
Initialize PSPSH and SHB parameters
//Step 2:
Initialize MCA population of size (S × 1)
k = 0
//Step 3:
while (k < Maximum number of iterations) do
    Choose an appliance randomly
    Calculate the appliance operation cost at each time slot, respecting its OTPs, OTPe, and l
    Update the appliance starting time to operate at the time slot with the least cost
    //Step 4:
    Calculate power consumed by each appliance
    Determine allowable periods and power with which SHB can be charged and discharged
    Operate SHB
    Calculate fitness value of the solution
    //Step 5:
    k = k + 1
    // the loop ends when the maximum number of iterations is reached
end while
Return fitness value;
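The core update of step 3 can be sketched in a few lines. This is a deliberately simplified illustration and not the authors' implementation: it uses a flat tariff per slot (no IBR coupling between appliances), omits the SHB of step 4, and returns the plain cost instead of a fitness value. The dictionary keys, the 0-based slots with an exclusive `otpe`, and the function names are all assumptions of the sketch.

```python
import random
from typing import Dict, List, Sequence, Tuple


def run_cost(start: int, length: int, power: float, prices: Sequence[float]) -> float:
    """Cost of running one appliance for `length` consecutive slots from `start`."""
    return sum(power * prices[t] for t in range(start, start + length))


def min_conflict_schedule(appliances: List[Dict[str, float]], prices: Sequence[float],
                          iterations: int = 200, seed: int = 0) -> Tuple[List[int], float]:
    rng = random.Random(seed)
    # Step 2: one solution vector of random feasible starting times.
    st = [rng.randint(a["otps"], a["otpe"] - a["length"]) for a in appliances]
    for _ in range(iterations):
        # Step 3: pick one appliance and move it to its cheapest feasible start.
        i = rng.randrange(len(appliances))
        a = appliances[i]
        candidates = range(a["otps"], a["otpe"] - a["length"] + 1)
        st[i] = min(candidates,
                    key=lambda s: run_cost(s, a["length"], a["power"], prices))
    total = sum(run_cost(s, a["length"], a["power"], prices)
                for s, a in zip(st, appliances))
    return st, total


# Toy problem (hypothetical): six slots, cheap prices in slots 2 and 3.
toy_prices = [5.0, 5.0, 1.0, 1.0, 5.0, 5.0]
toy_appliances = [
    {"otps": 0, "otpe": 6, "length": 2, "power": 1.0},
    {"otps": 0, "otpe": 3, "length": 1, "power": 2.0},
]
toy_st, toy_cost = min_conflict_schedule(toy_appliances, toy_prices)
```

Because the simplified cost of each appliance is independent of the others, every appliance ends at its cheapest start once it has been selected at least once; with IBR the slot tariff depends on the total load, so the order of updates would matter.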
5 Experiments and Results
This section presents and discusses the experimental results. It
begins with a description of the dataset used to evaluate the proposed
approach. The effects of SHB on the scheduling process are presented
as well. In addition, the adapted MCA is compared with BBO to assess its performance.
The simulations were run in MATLAB on a PC with an Intel Core2 Quad CPU
at 2.66 GHz and 8 GB of memory (RAM).
5.1 Dataset: Dynamic Pricing Program
In this study, the time horizon covers 24 h divided into 1440 slots, where
each slot equals 1 min. RTP is used as the dynamic pricing program, with
the pricing curve of the 1st of June 2016 adopted from the Commonwealth Edison
Company [17]. The RTP curve used is presented in Fig. 2.
As mentioned previously, RTP is combined with IBR to disperse the power consumed
and maintain the stability of the power system. IBR has two parameters,
C and λ (see Eqs. 2 and 3). C is set to 0.0333 for each
slot and λ to 1.543 [24, 26].
Fig. 2 RTP curve of the 1st of June 2016
5.2 Dataset: Smart Home Appliances
Generally, appliances can be operated several times in a time horizon. Therefore,
36 operations of nine appliances are used in the evaluation results. The primary
parameters of these operations are presented in Table 1.
Table 1 Parameters of appliances used in the experiments
| No. | Appliance | l (min) | OTPs–OTPe (min) | Power (kW) |
|-----|-----------|---------|-----------------|------------|
| 1 | Dishwasher | 105 | 540–780 | 0.6 |
| 2 | Dishwasher | 105 | 840–1080 | 0.6 |
| 3 | Dishwasher | 105 | 1200–1440 | 0.6 |
| 4 | Air conditioner | 30 | 1–120 | 1 |
| 5 | Air conditioner | 30 | 120–240 | 1 |
| 6 | Air conditioner | 30 | 240–360 | 1 |
| 7 | Air conditioner | 30 | 360–480 | 1 |
| 8 | Air conditioner | 30 | 480–600 | 1 |
| 9 | Air conditioner | 30 | 600–720 | 1 |
| 10 | Air conditioner | 30 | 720–840 | 1 |
| 11 | Air conditioner | 30 | 840–960 | 1 |
| 12 | Air conditioner | 30 | 960–1080 | 1 |
| 13 | Air conditioner | 30 | 1080–1200 | 1 |
| 14 | Air conditioner | 30 | 1200–1320 | 1 |
| 15 | Air conditioner | 30 | 1320–1440 | 1 |
| 16 | Washing machine | 55 | 60–300 | 0.38 |
| 17 | Clothes dryer | 60 | 300–480 | 0.8 |
| 18 | Refrigerator | 1440 | 1–1440 | 0.5 |
| 19 | Dehumidifier | 30 | 1–120 | 0.05 |
| 20 | Dehumidifier | 30 | 120–240 | 0.05 |
| 21 | Dehumidifier | 30 | 240–360 | 0.05 |
| 22 | Dehumidifier | 30 | 360–480 | 0.05 |
| 23 | Dehumidifier | 30 | 480–600 | 0.05 |
| 24 | Dehumidifier | 30 | 600–720 | 0.05 |
| 25 | Dehumidifier | 30 | 720–840 | 0.05 |
| 26 | Dehumidifier | 30 | 840–960 | 0.05 |
| 27 | Dehumidifier | 30 | 960–1080 | 0.05 |
| 28 | Dehumidifier | 30 | 1080–1200 | 0.05 |
| 29 | Dehumidifier | 30 | 1200–1320 | 0.05 |
| 30 | Dehumidifier | 30 | 1320–1440 | 0.05 |
| 31 | Electric Water Heater | 35 | 300–420 | 1.5 |
| 32 | Electric Water Heater | 35 | 1100–1440 | 1.5 |
| 33 | Coffee Maker | 10 | 300–450 | 0.8 |
| 34 | Coffee Maker | 10 | 1020–1140 | 0.8 |
| 35 | Robotic Pool Filter | 180 | 1–540 | 0.54 |
| 36 | Robotic Pool Filter | 180 | 900–1440 | 0.54 |
Fig. 3 EB using MCA with and without SHB
For SHB, the usable capacity $C_{SHB}$ is 13.5 kWh and the maximum allowable limit to
charge and discharge is 5 kW [29].
5.3 The Enhancement of SHB
In this section, the efficiency of SHB in attaining the PSPSH objectives is examined and evaluated using MCA. The results with and without SHB are compared to show
whether SHB can improve the quality of the schedule. Figure 3 presents the EB obtained
by MCA with and without considering SHB in the scheduling process. EB is reduced
from 44.79 cents in unscheduled mode (i.e., a random schedule) to 41.12 cents
using MCA and 28.85 cents using MCA with SHB. The results show that
SHB improves the quality of the schedule and reduces EB.
In terms of PAR reduction, the PAR value is reduced from 3.32 in unscheduled
mode to 2.53 using MCA and 2.60 using MCA with SHB, as shown in Fig. 4. The
results show that MCA without SHB obtained a better PAR value than MCA with
SHB. This result arises because SHB charges, and thus draws
power, only at low pricing periods, which increases the power consumed in those periods
and raises the value of $P_{max}$ (see Eq. 4).
As discussed, user comfort can be improved by reducing
the WTR value because users prefer to finish appliances' operations as soon
as possible. The proposed SHB reduced WTR and enhanced the user comfort level:
the WTR value is reduced from 0.4615 in unscheduled mode
to 0.3581 using MCA and 0.3368 using MCA with SHB, as shown in
Fig. 5. By Eq. 7, the corresponding user comfort levels are 53.85%, 64.19%, and 66.32%,
respectively. The results demonstrate the efficiency of
the proposed MCA with SHB in reducing waiting time for appliances and improving
user comfort level.
Fig. 4 PAR using MCA with and without SHB
Fig. 5 WTR with and without SHB
Table 2 Comparison between MCA and BBO
| | BBO EB (cent) | BBO PAR | BBO WTR | MCA EB (cent) | MCA PAR | MCA WTR |
|---|---|---|---|---|---|---|
| Without SHB | 42.46 | 2.64 | 0.3534 | 41.12 | 2.53 | 0.3581 |
| With SHB | 28.95 | 2.60 | 0.3352 | 28.85 | 2.60 | 0.3368 |
5.4 Comparison Study Between MCA and BBO
This section compares the adapted MCA with the BBO algorithm to
evaluate its performance.
The results obtained by MCA and BBO without and with SHB are compared in
Table 2. MCA obtained a lower EB than BBO in both cases and a lower PAR without
SHB (2.53 against 2.64), while the two algorithms reached the same PAR of 2.60 with SHB.
BBO, in turn, obtained a lower WTR than MCA in both cases and therefore performed
better in improving user comfort level.
6 Conclusion and Future Work
PSPSH is a primary issue facing power supplier companies and their users, because
an efficient schedule helps maintain the power system and reduces EB for users.
PSPSH can be addressed by shifting appliance operation times from one period to another
according to a time horizon and a dynamic pricing program. The primary objectives of
addressing PSPSH are minimizing EB and PAR and maximizing the satisfaction level
of users.
In this paper, MCA is adapted to address PSPSH according to a time horizon
divided into 1440 time slots and the RTP program. RTP is combined with the IBR
program to efficiently balance power demand throughout the time horizon. SHB is formulated and used as an additional source in an attempt to enhance the quality of the solution.
In the simulation results, the schedule using SHB is compared with the schedule
without SHB. SHB proves its efficiency in enhancing the schedule in
terms of EB and WTR: MCA using SHB reduces EB and WTR by up to 29.8%
and 6%, respectively, relative to MCA without SHB. However, MCA without SHB
obtains a better schedule than MCA with SHB in terms of reducing PAR. In addition,
MCA is compared with BBO to evaluate its results. The comparison showed
that MCA obtained a better schedule in terms of reducing EB and PAR, whereas BBO
performed better in improving user comfort level.
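The quoted reductions can be checked against the MCA values reported in Sect. 5.3 (EB of 41.12 and 28.85 cents, WTR of 0.3581 and 0.3368, without and with SHB, respectively). The short derivation below assumes the percentages are relative to MCA without SHB:

```latex
\frac{41.12 - 28.85}{41.12} = \frac{12.27}{41.12} \approx 0.298 \;(29.8\%),
\qquad
\frac{0.3581 - 0.3368}{0.3581} = \frac{0.0213}{0.3581} \approx 0.059 \;(\approx 6\%)
```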
In the future, different datasets can be considered in the scheduling process to evaluate MCA and SHB more thoroughly. Besides, renewable energy sources can be integrated
with the proposed SHB to improve the quality of the schedule.
Acknowledgments This work has been partially funded by Universiti Sains Malaysia under Grant
1001/PKOMP/8014016.
References
1. Abasi AK, Khader AT, Al-Betar MA, Naim S, Makhadmeh SN, Alyasseri ZAA (2019) Link-based multi-verse optimizer for text documents clustering. Appl Soft Comput 87:1–36
2. Abasi AK, Khader AT, Al-Betar MA, Naim S, Makhadmeh SN, Alyasseri ZAA (2019) A
text feature selection technique based on binary multi-verse optimizer for text clustering. In:
2019 IEEE Jordan international joint conference on electrical engineering and information
technology (JEEIT). IEEE, pp 1–6
3. Abasi AK, Khader AT, Al-Betar MA, Naim S, Makhadmeh SN, Alyasseri ZAA (2020) An
improved text feature selection for clustering using binary grey wolf optimizer. In: Proceedings of the 11th national technical seminar on unmanned system technology 2019. Springer,
Heidelberg, pp 1–13
4. Abbasi BZ, Javaid S, Bibi S, Khan M, Malik MN, Butt AA, Javaid N (2017) Demand side
management in smart grid by using flower pollination algorithm and genetic algorithm. In:
International conference on P2P, parallel, grid, cloud and internet computing. Springer, Heidelberg, pp 424–436
5. Al-Betar MA (2017) β-hill climbing: an exploratory local search. Neural Comput Appl
28(1):153–168
6. Al-Betar MA, Alyasseri ZAA, Khader AT, Bolaji AL, Awadallah MA (2016) Gray image
enhancement using harmony search. Int J Comput Intell Syst 9(5):932–944
7. Al-Betar MA, Awadallah MA, Bolaji AL, Alijla BO (2017) β-hill climbing algorithm for
sudoku game. In: 2017 Palestinian international conference on information and communication
technology (PICICT). IEEE, pp 84–88
8. Al-Betar MA, Khader AT (2012) A harmony search algorithm for university course timetabling.
Ann Oper Res 194(1):3–31
9. Alomari OA, Khader AT, Al-Betar MA, Abualigah LM (2017) Gene selection for cancer classification by combining minimum redundancy maximum relevancy and bat-inspired algorithm.
Int J Data Min Bioinform 19(1):32–51
10. Alomari OA, Khader AT, Al-Betar MA, Alyasseri ZAA (2018) A hybrid filter-wrapper gene
selection method for cancer classification. In: 2018 2nd international conference on biosignal
analysis, processing and systems (ICBAPS). IEEE, pp 113–118
11. Alyasseri ZAA, Khader AT, Al-Betar MA, Papa JP, Ahmad Alomari O (2018) EEG-based person authentication using multi-objective flower pollination algorithm. In: 2018 IEEE congress
on evolutionary computation (CEC). IEEE, pp 1–8
12. Alyasseri ZAA, Khader AT, Al-Betar MA, Papa JP, Alomari OA, Makhadme SN (2018) An
efficient optimization technique of EEG decomposition for user authentication system. In: 2018
2nd international conference on biosignal analysis, processing and systems (ICBAPS). IEEE,
pp 1–6
13. Alyasseri ZAA, Khader AT, Al-Betar MA, Papa JP, Alomari OA, Makhadmeh SN (2018)
Classification of EEG mental tasks using multi-objective flower pollination algorithm for person
identification. Int J Integr Eng 10(7)
14. Bouhouch A, Loqman C, El Qadi A (2019) CHN and min-conflict heuristic to solve scheduling
meeting problems. In: Bioinspired heuristics for optimization. Springer, Heidelberg, pp 171–
184
15. Briefing US (2013) International energy outlook 2013. US Energy Information Administration
16. Colak I, Kabalci E, Fulli G, Lazarou S (2015) A survey on the contributions of power electronics
to smart grid systems. Renew Sustain Energy Rev 47:562–579
17. ComED Company (2017). https://hourlypricing.comed.com/live-prices/
18. Farooqi M, Awais M, Abdeen ZU, Batool S, Amjad Z, Javaid N (2017) Demand side management using harmony search algorithm and bat algorithm. In: International conference on
intelligent networking and collaborative systems. Springer, Heidelberg, pp 191–202
19. Gage A, Murphy RR (2004) Sensor scheduling in mobile robots using incomplete information
via min-conflict with happiness. IEEE Trans Syst Man Cybern Part B (Cybern) 34(1):454–467
20. Iftikhar H, Asif S, Maroof R, Ambreen K, Khan HN, Javaid N (2014) Biogeography based optimization for home energy management in smart grid. In: International conference on network-based information systems. Springer, Heidelberg, pp 177–190
21. Johnston M, Minton S et al (1994) Analyzing a heuristic strategy for constraint satisfaction
and scheduling. Intell Sched 257–289
22. Khan AR, Mahmood A, Safdar A, Khan ZA, Khan NA (2016) Load forecasting, dynamic
pricing and dsm in smart grid: a review. Renew Sustain Energy Rev 54:1311–1322
23. Makhadmeh SN, Khader AT, Al-Betar MA, Naim S (2018) Multi-objective power scheduling
problem in smart homes using grey wolf optimiser. J Ambient Intell Hum Comput 1–25
24. Makhadmeh SN, Khader AT, Al-Betar MA, Naim S (2018) An optimal power scheduling for
smart home appliances with smart battery using grey wolf optimizer, pp 1–6
25. Makhadmeh SN, Khader AT, Al-Betar MA, Naim S, Abasi AK, Alyasseri ZAA (2019) Optimization methods for power scheduling problems in smart home: survey. Renew Sustain Energy
Rev 115:109362
26. Makhadmeh SN, Khader AT, Al-Betar MA, Naim S, Alyasseri ZAA, Abasi AK (2019) Particle
swarm optimization algorithm for power scheduling problem using smart battery. In: 2019 IEEE
Jordan international joint conference on electrical engineering and information technology
(JEEIT). IEEE, pp 672–677
27. Minton S, Johnston MD, Philips AB, Laird P (1992) Minimizing conflicts: a heuristic repair
method for constraint satisfaction and scheduling problems. Artif Intell 58(1–3):161–205
28. Nexans (2010) Deploying a smarter grid through cable solutions and services. http://www.
nexans.com/Corporate/2010/WHITEPAPERSMARTGRIDS2010.pdf
29. Powerwall T (2018). https://www.tesla.com/powerwall
30. Rahim S, Javaid N, Ahmad A, Khan SA, Khan ZA, Alrajeh N, Qasim U (2016) Exploiting
heuristic algorithms to efficiently utilize energy management controllers with renewable energy
sources. Energy Build 129:452–470
31. Zhao Z, Lee WC, Shin Y, Song KB (2013) An optimal power scheduling method for demand
response in home energy management system. IEEE Trans Smart Grid 4(3):1391–1400
An Improved Text Feature Selection
for Clustering Using Binary Grey Wolf
Optimizer
Ammar Kamal Abasi, Ahamad Tajudin Khader, Mohammed Azmi Al-Betar,
Syibrah Naim, Sharif Naser Makhadmeh, and Zaid Abdi Alkareem Alyasseri
Abstract Text feature selection (FS) is a significant step in text clustering (TC).
Machine learning applications eliminate unnecessary features in order to enhance
learning effectiveness. This work proposes a binary grey wolf optimizer (BGWO)
algorithm to tackle the text FS problem. The method introduces a new implementation of the GWO algorithm that selects informative features from the text. These
informative features are evaluated using a clustering technique (i.e., k-means) so
that time complexity is reduced and the clustering algorithm's efficiency is improved.
The performance of BGWO is examined on six published datasets: Tr41,
Tr12, Wap, Classic4, 20Newsgroups, and CSTR. The results showed that BGWO
outperformed the compared algorithms, such as GA and BPSO,
on the evaluation measures. The experiments also showed that the BGWO
method could achieve an average purity of 46.29% and an F-measure of 42.23%.
Keywords Binary grey wolf optimizer · Text mining · K-means · Text feature
selection problem · Text clustering
1 Introduction
The number of digital documents is increasing rapidly day by day due to the proliferation of the internet, and such volumes cannot be examined by humans alone [3]. Therefore,
text mining tools can assist in addressing this issue. Automatic systems, which are
not affected by the text explosion, can replace the human reader. Text mining examines
massive document collections to detect previously unknown information. Text
A. K. Abasi (B) · A. T. Khader · S. Naim · S. N. Makhadmeh · Z. A. A. Alyasseri
School of Computer Sciences, Universiti Sains Malaysia, Gelugor, Penang, Malaysia
e-mail: ammar_abasi@student.usm.my
M. A. Al-Betar
Department of Information Technology, Al-Huson University College, Al-Balqa Applied
University, Irbid, Jordan
Z. A. A. Alyasseri
ECE Department-Faculty of Engineering, University of Kufa, Najaf, Iraq
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_34
document clustering (TDC) is, among other techniques, an effective method used
in the fields of text mining, topic extraction, machine learning, text summarization, and pattern recognition [16]. An efficient TDC technique allows automatic
classification of a corpus of documents into semantic cluster hierarchies. It is the
method through which documents are organized into meaningful groups. This
means that documents in the same cluster are closer to each other than to documents in
different clusters [11].
The application of TDC algorithms requires the conversion of raw text files (i.e.,
terms) into a numerical format that captures the document characteristics. Document representation is the most fundamental stage for extracting trends and ideas from the documents [17]. In
TDC, the Vector Space Model (VSM) is commonly used to represent the documents,
with the terms serving as the features (dimensions) of the VSM [29].
The conversion process produces a huge number of features, which include informative features as well as
uninformative ones, in other words, irrelevant, redundant, and noisy features [12]. FS determines the main informative features of the documents. However,
the high-dimensional space represents the key difficulty: non-informative features
must be removed in order to reduce the dimension space and
improve the clustering performance [18]. A text collection can contain hundreds of thousands of
textual features, and the dimensionality of the documents
determines the efficiency of TDC. Figure 1 shows the overall steps of TDC.
FS techniques fall into three categories, namely the filter method, the wrapper method, and the hybrid method, according to the approach used to obtain
an informative subset of features. The filter method examines the feature
set using statistical measures so that a discriminative feature subset is chosen
Fig. 1 Text clustering steps.
regardless of the machine learning algorithm. Examples of filter methods include mean-median [15], mean
absolute difference [15], and odds ratio [23].
These methods are widely used in FS because of their low computational complexity, particularly when the dimension of the text feature space
is vast. Wrapper methods use a search approach to evaluate subsets of
features so that effective informative features are obtained. These techniques include
the plus-l-take-away-r process [25] and sequential forward selection/backward elimination [26]. These techniques are computationally
more expensive than the filter methods. Another class of FS is the hybrid
technique. Hybrid methods incorporate various FS techniques to select
informative subsets of features. They exploit the advantages of one strategy and reduce
the disadvantages of another in choosing the subset.
FS is formulated as an NP-hard (nondeterministic polynomial-time hard) optimization
problem [12]. In combinatorial optimization problems, the surest way to reach the
optimal solution is exhaustive search [5]. However, an exhaustive search over the full search space is not practical because of its overwhelmingly
high computational complexity [9, 21, 22]. Recently, many studies have investigated metaheuristic algorithms to address combinatorial optimization problems [2, 7, 20]. These algorithms are extensively used to explore
the unknown search space of a problem and to obtain the best global solution;
therefore, they are becoming more and more popular. Numerous metaheuristic algorithms are available, such as particle swarm optimization [19], binary multi-verse optimizer
[1], ant lion optimizer [19], and harmony search (HS) [6], among others [8, 28]. They have been used to
address the FS problem.
Grey Wolf Optimizer (GWO) is a recent swarm-based metaheuristic optimization technique that emulates the hunting and social behaviour of grey wolf packs. It was proposed by
Mirjalili et al. [24]. This algorithm provides several advantages over other swarm-based
intelligence techniques. It has fewer parameters and does not require derivative information. Besides, the exchange of decision variables and the cooperation
between swarm members are significant advantages. Consequently,
GWO has been successfully adapted to several types
of optimization problems such as engineering, robotics, scheduling [22], economic
dispatch, planning, and feature selection for classification [13], and
many more as described in [14].
The FS problem is inherently a binary problem, whereas the original GWO was
designed for continuous optimization problems. Therefore, a binary Grey
Wolf Optimizer (BGWO) is proposed in the present paper as a novel FS application
that uses all the GWO operators.
The remainder of the paper is organized as follows. The theoretical background of this work is provided in Sect. 2. In Sect. 3, the binary grey wolf algorithm is
described. In Sect. 4, BGWO for text FS is presented. Section 5 explains the obtained
empirical results to demonstrate the efficiency of the new FS method. Finally, Sect. 6
provides the conclusion and future work.
2 Preliminaries
The preliminary research is briefly presented in this section.
2.1 Text Clustering Problem
TDC aims at finding the best partition of a large set of documents into a subset of
clusters according to the fundamental features of the clusters. The pre-processing stages of TDC
are introduced in the following paragraphs, and the k-means technique, which
produces document clusters from the obtained features, is briefly introduced in Sect. 2.2.
Pre-processing Steps. The standard pre-processing stages, which include tokenization, stop-word removal, stemming, and feature weighting,
are performed before clusters are created in order to convert the documents into a numerical
format [18]. The pre-processing substeps are briefly outlined as follows:
– Tokenization: Each word (term) in a document is extracted as a separate unit
called a token, neglecting special characters, symbols, and white space
in the text.
– Stop words removal: Stop words are common terms such as
‘in’, ‘on’, ‘at’, ‘that’, ‘the’, ‘of’, ‘an’, ‘a’, ‘she’, and ‘he’. Short words, high-frequency terms, and functional terms are also regarded as stop words in TDC. It
is vital to remove these terms because they often make up a substantial part of the document.
If they are kept, the number of features grows unnecessarily
and the efficiency of the clustering method deteriorates. The stop-word
list used here consists of 571 words.
– Stemming: Stemming maps the different forms of a word to the same root by
removing prefixes and suffixes from the term. For instance, ‘multi-coloured’ and
‘multi-media’ share the same root, i.e., /-multi-/.
– Term weighting: The TF-IDF (term frequency-inverse document frequency) weighting scheme is frequently used to transform textual data into a numerical
format.
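The four pre-processing substeps can be sketched as follows. This is only a minimal illustration and not the authors' implementation: the stop-word set contains just the examples quoted above rather than the 571-word list, `stem` is a hypothetical crude suffix stripper standing in for a real stemmer, and the weight tf × log(n/df) is one common TF-IDF variant assumed here because the paper does not state its exact formula.

```python
import math
import re
from collections import Counter

# Illustrative stop words only (the paper uses a 571-word list).
STOP_WORDS = {"in", "on", "at", "that", "the", "of", "an", "a", "she", "he"}

def tokenize(text):
    """Lower-case the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())

def stem(token):
    """Hypothetical crude suffix stripper, a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Tokenize, drop stop words, then stem the remaining tokens."""
    return [stem(tok) for tok in tokenize(text) if tok not in STOP_WORDS]

def tf_idf(docs):
    """Return (vocabulary, matrix) with matrix[i][j] = tf(i, j) * log(n / df(j))."""
    token_lists = [preprocess(d) for d in docs]
    vocab = sorted({tok for toks in token_lists for tok in toks})
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(tok for toks in token_lists for tok in set(toks))
    matrix = []
    for toks in token_lists:
        counts = Counter(toks)
        matrix.append([counts[t] * math.log(n / df[t]) for t in vocab])
    return vocab, matrix

docs = [
    "The wolves are hunting in the forest",
    "Clustering of text documents",
    "Wolves hunted the prey",
]
vocab, vsm = tf_idf(docs)
```

Each row of `vsm` is one document in the VSM and each column is one term, which is the representation that the FS step later prunes.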
2.2 K-Means Text Clustering Algorithm
K-means is one of the most widely used clustering techniques for solving
the TDC problem [16]. Algorithm 1 provides the steps of the K-means algorithm. It splits
the set of text documents $Docs = (doc_1, doc_2, doc_3, \ldots, doc_n)$ into K clusters via three main steps: (a) choosing random documents as the cluster centroids (the
number of clusters is predefined); (b) assigning each document to the nearest cluster; and
(c) recalculating the cluster centroids.
Algorithm 1. K-means clustering algorithm
Data: The number of clusters K and a set of documents Docs (after the pre-processing step)
Result: K clusters containing homogeneous documents.
Create K cluster centroids by choosing one document randomly for each cluster.
while the maximum number of iterations is not reached do
    for each document doc_i in Docs do
        Compute the distance (i.e., the similarity) between doc_i and each of the K cluster centroids.
    end
    for each document doc_i in Docs do
        Assign document doc_i to the nearest cluster k.
    end
    Recalculate the centroid of each cluster k.
end
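Algorithm 1 can be sketched in plain Python as shown below. The sketch assumes squared Euclidean distance as the dissimilarity (the paper does not fix the measure), stops early once the assignments no longer change, and uses hypothetical function names (`k_means`, `squared_distance`).

```python
import random

def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def k_means(docs, k, iterations=100, seed=0):
    """Cluster the row vectors in `docs` into k groups; returns one label per document."""
    rng = random.Random(seed)
    # Step (a): choose k distinct documents as the initial centroids.
    centroids = [list(docs[i]) for i in rng.sample(range(len(docs)), k)]
    labels = [0] * len(docs)
    for _ in range(iterations):
        # Step (b): assign every document to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(d, centroids[c]))
            for d in docs
        ]
        # Step (c): recompute each centroid as the mean of its members.
        for c in range(k):
            members = [d for d, lab in zip(docs, new_labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments are stable, so stop early
            break
        labels = new_labels
    return labels

points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
labels = k_means(points, k=2)
```

For text data, cosine similarity is a frequent alternative to the Euclidean distance assumed here; only `squared_distance` would need to change.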
2.3 Problem Formulation of Unsupervised Feature Selection
In this paper, BGWO is applied to the text FS problem
using a novel model that identifies the most informative text features and
removes the uninformative ones. The
proposed model for the FS problem is defined as follows. Let $F =
\{f_1, f_2, \ldots, f_t\}$ be the set of features, where $t$ is the number of unique features in the VSM.
Let $New\_sub\_features = \{Nf_1, Nf_2, \ldots, Nf_j, \ldots, Nf_{tn}\}$ denote the subset
of new features, i.e., the new dimension of informative features
obtained through the FS algorithm, where $tn$ is the number of new features.
3 Binary Grey Wolf Optimizer
The GWO mechanism is modelled on the lifestyle of grey wolves. Their hunting mechanisms were formulated in 2014 as an optimization algorithm by Mirjalili et al. [24] using
the four levels of the grey wolf social hierarchy, (α), (β), (δ), and (ω), which
stand for alpha, beta, delta, and omega, respectively. Alpha is the leader of the
grey wolf pack and is at the top of the social hierarchy. Beta wolves play the
leading role in advising the alpha wolf. Delta refers to the level positioned in the structure between beta and omega wolves. Omega wolves form the lowest level of the hierarchy.
To hunt prey, the wolves first surround it [22].
The intelligence of group hunting is modelled along with
this social hierarchy. It involves three main phases: chasing, encircling,
and attacking. In optimization terms, the top three solutions in the hunting group
are classified according to their fitness values: alpha (α) is the best
solution in the hunting group, beta (β) is the second-best solution, and delta (δ) is
the third-best one. All other solutions are considered omega (ω).
All the solutions are guided by these three solutions (i.e., (α), (β), and (δ)) to
explore the search space and find the optimal solution. The following equations
mathematically model the encircling behaviour.

$$\vec{X}(t+1) = \vec{X}_p(t) - \vec{A} \times \vec{D} \tag{1}$$

$$\vec{D} = |\vec{C} \times \vec{X}_p(t) - \vec{X}(t)|, \tag{2}$$

where $\vec{D}$ is as defined in Eq. 2, $t$ signifies the current iteration, $\vec{X}_p$ signifies the
position of the prey, $\vec{A}$ and $\vec{C}$ represent coefficient vectors, and $\vec{X}$ signifies the grey
wolf position.
$$\vec{C} = 2 \times \vec{r}_2 \tag{3}$$

$$\vec{A} = 2 \times \vec{a} \times \vec{r}_1 - \vec{a} \tag{4}$$

The vectors $\vec{A}$ and $\vec{C}$ are calculated using Eqs. 4 and 3, respectively. The components of $\vec{a}$
are linearly decreased from 2.0 to 0.0 over the course of iterations, and $\vec{r}_1$, $\vec{r}_2$ are
random vectors in [0, 1]. Hunting is typically led by alpha. Sometimes, beta and
delta might also take part in hunting. In order to mathematically simulate the hunting
behaviour of the grey wolves, alpha, beta, and delta (i.e., the three best solutions) are
assumed to have better knowledge of the prey location. The other search agents
follow these three best solutions, which are the best obtained so far in the hunting
process, and update their positions accordingly. The position updates
of the wolves are given by the following equations.
$$\vec{D}_\alpha = |\vec{C}_1 \times \vec{X}_\alpha - \vec{X}| \tag{5}$$

$$\vec{D}_\beta = |\vec{C}_2 \times \vec{X}_\beta - \vec{X}| \tag{6}$$

$$\vec{D}_\delta = |\vec{C}_3 \times \vec{X}_\delta - \vec{X}| \tag{7}$$

$$\vec{X}_1 = \vec{X}_\alpha - \vec{A}_1 \times \vec{D}_\alpha \tag{8}$$

$$\vec{X}_2 = \vec{X}_\beta - \vec{A}_2 \times \vec{D}_\beta \tag{9}$$

$$\vec{X}_3 = \vec{X}_\delta - \vec{A}_3 \times \vec{D}_\delta \tag{10}$$

$$\vec{X}(t+1) = \frac{\vec{X}_1 + \vec{X}_2 + \vec{X}_3}{3} \tag{11}$$
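The position update of Eqs. 3 to 11 can be sketched for a single wolf as follows. The function names `linear_a` and `gwo_step` are hypothetical, and the sketch assumes that fresh random numbers r1 and r2 are drawn for every dimension and every leader, which is a common reading of the equations rather than something the paper spells out.

```python
import random

def linear_a(t, max_iter):
    """Component of a, decreased linearly from 2.0 to 0.0 over the iterations."""
    return 2.0 - 2.0 * t / max_iter

def gwo_step(wolf, alpha, beta, delta, a, rng):
    """Return the new continuous position of one wolf (Eqs. 3 to 11)."""
    new_position = []
    for j, x in enumerate(wolf):
        candidates = []
        for leader in (alpha, beta, delta):
            r1, r2 = rng.random(), rng.random()
            A = 2 * a * r1 - a                     # Eq. 4
            C = 2 * r2                             # Eq. 3
            D = abs(C * leader[j] - x)             # Eqs. 5 to 7
            candidates.append(leader[j] - A * D)   # Eqs. 8 to 10
        new_position.append(sum(candidates) / 3)   # Eq. 11
    return new_position

rng = random.Random(1)
moved = gwo_step([5.0, 5.0], [1.0, 2.0], [2.0, 3.0], [3.0, 4.0], a=0.0, rng=rng)
```

With a = 0 the coefficient A vanishes, so the wolf moves exactly to the average of the three leaders; larger values of a allow steps away from the leaders, which corresponds to exploration early in the run.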
This paper proposes a modification of GWO, the binary GWO (BGWO), to handle the
binary variables of the search space (the nature of the FS problem). The solution generation
function, as well as the new-position equation (i.e., $\vec{X}(t+1)$ in Eq. 11),
are adjusted so that feasible solutions are produced during the execution of BGWO
as follows:

$$Sig(\vec{X}(t+1)) = \frac{1}{1 + e^{-\vec{X}(t+1)}}, \tag{12}$$

where $Sig(\vec{X}(t+1))$ is the probability that a decision variable of solution $X$ takes the value
‘1’ rather than ‘0’. Equation 13 is then used to update the decision variables of solution $X$.

$$\vec{X}(t+1) = \begin{cases} 1 & \text{if } r < Sig(\vec{X}(t+1)) \\ 0 & \text{otherwise,} \end{cases} \tag{13}$$

where the sigmoid function in Eq. 12 maps the value of $\vec{X}(t+1)$ from Eq. 11
into the range [0, 1], and $r$ is a random number in (0, 1). Figure 2 illustrates
the sigmoid function of $\vec{X}(t+1)$.
Fig. 2 Sigmoid function.
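The binarization of Eqs. 12 and 13 can be sketched as below. The helper names are hypothetical, and the two-branch form of the sigmoid is an implementation choice made here to avoid floating-point overflow for large negative inputs; it computes the same function as Eq. 12.

```python
import math
import random

def sigmoid(x):
    """Eq. 12, written in a numerically stable two-branch form."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def binarize(position, rng):
    """Eq. 13: each variable becomes 1 with probability sigmoid(x), else 0."""
    return [1 if rng.random() < sigmoid(x) else 0 for x in position]

rng = random.Random(0)
bits = binarize([1000.0, -1000.0, 0.3], rng)
```

A strongly positive continuous value is almost always mapped to 1 and a strongly negative one to 0, while values near zero are decided essentially by a fair coin.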
Fig. 3 Solution representation.
4 BGWO for the Text FS Problem
4.1 Solution Representation
Figure 3 illustrates the BGWO solution representation proposed for the text
FS problem. In this representation, a solution encodes a subset of the text features: the
binary value at each position indicates whether the corresponding feature is selected or not
[3, 18]. BGWO starts by creating a set of random solutions and then improves these
solutions so that the best solution (i.e., the most informative
features) can be found.
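The representation can be illustrated with a short sketch. It assumes that a bit value of 1 means "feature selected" and uses hypothetical helper names (`random_population`, `apply_mask`); the document-term matrix is a plain list of rows.

```python
import random

def random_population(n_solutions, n_features, rng):
    """Create N random binary solutions, each of length F."""
    return [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_solutions)]

def apply_mask(vsm, solution):
    """Keep only the columns (features) whose bit in `solution` is 1 (assumed encoding)."""
    keep = [j for j, bit in enumerate(solution) if bit == 1]
    return [[row[j] for j in keep] for row in vsm]

rng = random.Random(0)
population = random_population(4, 5, rng)
reduced = apply_mask([[1, 2, 3], [4, 5, 6]], [1, 0, 1])
```

The reduced matrix is what a clustering algorithm such as k-means would receive once a solution has been chosen.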
4.2 Fitness Function
The mean absolute difference (MAD) [18] is used by the BGWO algorithm
as a fitness function to evaluate each solution in the population for the text FS
problem. MAD gives a weight (i.e., a significance rating) to each feature in the
subset $New\_sub\_features$, and all the scores are then summed. The feature weight
is computed using Eq. 14.

$$MAD(U_i) = \frac{1}{n_i} \sum_{j=1}^{t} |U_{i,j} - \bar{U}_i|, \tag{14}$$

where

$$\bar{U}_i = \frac{1}{n_i} \sum_{j=1}^{t} U_{i,j}, \tag{15}$$

where $n_i$ refers to the number of selected features in text document $i$, $U_{i,j}$ signifies the
value of feature $j$ in document $i$, $\bar{U}_i$ refers to the mean value defined in Eq. 15, and $t$ refers
to the total number of features. The methodology proposed in this paper is
described briefly in Algorithm 2.
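A minimal sketch of a MAD-based fitness is given below. Several details are assumptions rather than statements from the paper: n_i is taken to be the number of features selected by the solution, the sums in Eqs. 14 and 15 run over those selected features only, the per-document scores are added to obtain the fitness, and a larger value is treated as better. The function names are hypothetical.

```python
def mad_document(row, solution):
    """MAD of one document row restricted to the selected features (Eqs. 14 and 15)."""
    selected = [w for w, bit in zip(row, solution) if bit == 1]
    n_i = len(selected)  # assumption: number of selected features
    if n_i == 0:
        return 0.0
    mean = sum(selected) / n_i                             # Eq. 15
    return sum(abs(w - mean) for w in selected) / n_i      # Eq. 14

def mad_fitness(vsm, solution):
    """Assumed aggregate: the sum of the per-document MAD scores."""
    return sum(mad_document(row, solution) for row in vsm)

vsm = [[1.0, 3.0, 100.0], [2.0, 2.0, 5.0]]
score = mad_fitness(vsm, [1, 1, 0])
```

In this toy matrix the first document contributes 1.0 (values 1 and 3 around a mean of 2) and the second contributes 0.0, so the mask [1, 1, 0] scores 1.0.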
Algorithm 2. Pseudocode of the proposed BGWO algorithm for the FS problem
Initialize the GWO and FS problem parameters (a, A, C, number of solutions (N), number of iterations, number of features (F))
Create a population matrix of size (N × F)
Calculate the fitness function for all solutions
Assign the best solution to X_α
Assign the second-best solution to X_β
Assign the third-best solution to X_δ
for each iteration (t) do
    for each solution (i) do
        Update solution (i) using Eq. 13
    end for
    Update a, A, C
    Calculate the fitness function for all solutions
    Update X_α
    Update X_β
    Update X_δ
end for
Return the best solution X_α
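Algorithm 2 can be sketched end to end as follows. The sketch is not the authors' code: it assumes that the fitness is maximized, keeps the three leaders as the best solutions found so far (so the returned solution never gets worse), draws new random coefficients per dimension and leader, and takes the fitness as a caller-supplied function. The name `bgwo` and its parameters are hypothetical.

```python
import math
import random

def sigmoid(x):
    """Eq. 12 in a numerically stable form."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def bgwo(fitness, n_features, n_solutions=10, n_iterations=50, seed=0):
    """Return (best_solution, best_fitness); requires n_solutions >= 3."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_solutions)]
    # Leaders: (fitness, solution) pairs for alpha, beta, and delta.
    leaders = sorted(((fitness(s), s) for s in pop), key=lambda p: p[0], reverse=True)[:3]
    for t in range(n_iterations):
        a = 2.0 - 2.0 * t / n_iterations  # linearly decreased from 2 to 0
        new_pop = []
        for wolf in pop:
            child = []
            for j, x in enumerate(wolf):
                total = 0.0
                for _, leader in leaders:
                    A = 2 * a * rng.random() - a      # Eq. 4
                    C = 2 * rng.random()              # Eq. 3
                    D = abs(C * leader[j] - x)        # Eqs. 5 to 7
                    total += leader[j] - A * D        # Eqs. 8 to 10
                # Eq. 11 (average), then Eqs. 12 and 13 (binarization).
                child.append(1 if rng.random() < sigmoid(total / 3) else 0)
            new_pop.append(child)
        pop = new_pop
        scored = leaders + [(fitness(s), s) for s in pop]
        leaders = sorted(scored, key=lambda p: p[0], reverse=True)[:3]
    return leaders[0][1], leaders[0][0]

# Toy fitness for demonstration only: the number of selected features.
best, best_fit = bgwo(sum, n_features=8, n_solutions=6, n_iterations=20, seed=3)
```

In the text FS setting, `fitness` would be a function that scores a binary mask against the document-term matrix, for example a MAD-based score.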
5 Experimental Setup
The proposed BGWO is tested on six standard datasets for the text FS problem.
The results are compared with those of GA [27] and BPSO [18]. The parameter settings of
the compared algorithms are given in Table 1. It should be noted that the
values of the GWO control parameters are set according to the recommendation given by
the authors of GWO in [24].
Table 1 The parameter setting for each algorithm of comparison.
Algorithm         Parameter                      Value
GA                Crossover rate                 0.70
GA                Mutation rate                  0.04
binary PSO        C1                             2
binary PSO        C2                             2
binary PSO        Max weight                     0.9
binary PSO        Min weight                     0.2
BGWO, BPSO, GA    Population size                60
BGWO, BPSO, GA    Maximum number of iteration    1000
BGWO, BPSO, GA    Runs                           30
Table 2 Text datasets details.
Datasets        ID     No. documents (d)    No. clusters (K)    No. features or terms (t)
tr41            DS1    878                  10                  6743
tr12            DS2    313                  8                   5329
Wap             DS3    1560                 20                  7512
Classic4        DS4    2000                 4                   6500
20Newsgroups    DS5    300                  3                   2275
CSTR            DS6    299                  4                   1725
5.1 Standard Datasets and Evaluation Metric
The BGWO algorithm is tested on six benchmark datasets, namely Tr41, Tr12, and Wap (footnote 1)
and Classic4, 20Newsgroups, and CSTR (footnote 2), and it is compared with
state-of-the-art algorithms in the experiment. These datasets exhibit several characteristics such as sparsity and skewness. Table 2 gives the description of the datasets.
The Purity and F-measure measures are used as standards to evaluate the TDC
algorithms [16]. These measures are
commonly used to validate and compare clusterings across various
datasets [4]. Both measures are calculated after the clustering outcomes are obtained. The following paragraphs describe these measures in detail.
Purity. The purity measure computes, for every cluster, the share of documents that
belong to its dominant class; the best purity score is close to 1. Under this measure, each cluster is assigned its most frequent
class [1]. The purity over all clusters is calculated by Eq. 16.

$$purity = \frac{1}{n} \sum_{j=1}^{k} \max(i, j), \tag{16}$$

where $n$ refers to the total number of documents in the dataset, $\max(i, j)$ refers
to the size of the largest class $i$ in cluster $j$, and $k$ refers to the number of clusters.
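Purity as in Eq. 16 can be sketched as below, assuming that the sum runs over the k clusters and that max(i, j) is the count of the most frequent true class inside cluster j. The function name and the toy labels are hypothetical.

```python
from collections import Counter

def purity(true_classes, cluster_labels):
    """Eq. 16: sum of the dominant-class counts of all clusters, divided by n."""
    n = len(true_classes)
    per_cluster = {}
    for cls, clu in zip(true_classes, cluster_labels):
        per_cluster.setdefault(clu, Counter())[cls] += 1
    return sum(max(counts.values()) for counts in per_cluster.values()) / n

score = purity(["a", "a", "b", "b", "b", "c"], [0, 0, 0, 1, 1, 1])
```

In this toy example each of the two clusters has a dominant class with two documents, so the purity is 4/6.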
F-measure. The F-measure is the harmonic mean of the precision
(P) and the recall (R). An F-measure value close
to 1 indicates a robust clustering algorithm. Conversely, when the F-measure
value is close to 0, the clustering algorithm is considered weak [10]. The F-measure
is calculated using the following equations:
1 glaros.dtc.umn.edu/gkhome/fetch/sw/cluto/datasets.tar.gz.
2 sites.labic.icmc.usp.br/text_collections/.
$$P(i, j) = \frac{n_{i,j}}{n_j}, \tag{17}$$

where $n_{i,j}$ refers to the number of correct documents of class $i$ in cluster $j$, and $n_j$ refers
to the total number of documents in cluster $j$.

$$R(i, j) = \frac{n_{i,j}}{n_i}, \tag{18}$$

where $n_{i,j}$ refers to the number of correct documents of class $i$ in cluster $j$, and $n_i$ refers
to the total number of documents in class $i$.

$$F(i, j) = \frac{2 \times P(i, j) \times R(i, j)}{P(i, j) + R(i, j)}, \tag{19}$$

where $R(i, j)$ refers to the recall of class $i$ in cluster $j$, and $P(i, j)$ refers to the precision
of class $i$ in cluster $j$. For all clusters, the F-measure is calculated by Eq. 20.

$$F = \sum_{j=1}^{k} \frac{n_j}{n} \max_{i} F(i, j) \tag{20}$$
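Eqs. 17 to 20 can be sketched as follows. The sketch assumes that the sum in Eq. 20 runs over the clusters j, that each cluster is weighted by n_j/n, and that the maximum is taken over the classes i; the function name and the toy labels are hypothetical.

```python
from collections import Counter

def f_measure(true_classes, cluster_labels):
    """Overall F-measure: cluster-size-weighted best F(i, j) per cluster."""
    n = len(true_classes)
    class_sizes = Counter(true_classes)                   # n_i
    cluster_sizes = Counter(cluster_labels)               # n_j
    joint = Counter(zip(true_classes, cluster_labels))    # n_{i,j}
    total = 0.0
    for j, n_j in cluster_sizes.items():
        best = 0.0
        for i, n_i in class_sizes.items():
            n_ij = joint[(i, j)]
            if n_ij == 0:
                continue  # class i does not occur in cluster j
            p = n_ij / n_j                                # Eq. 17
            r = n_ij / n_i                                # Eq. 18
            best = max(best, 2 * p * r / (p + r))         # Eq. 19
        total += (n_j / n) * best                         # Eq. 20
    return total

score = f_measure(["a", "a", "b", "b", "b", "c"], [0, 0, 0, 1, 1, 1])
```

For the toy labels, the best per-cluster values are 0.8 and 2/3, each weighted by 1/2.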
5.2 Results and Discussion
The results achieved by BGWO were compared with those of BPSO and
GA. In order to make a fair comparison, every algorithm was run
30 times, and the parameter settings of the algorithms are as shown
in Table 1.
Table 3 provides the average Purity and F-measure over 30 runs,
obtained on each of the six standard text benchmarks by the FS
algorithms GA, BPSO, and BGWO, together with k-means without FS. BGWO obtained the best rank on five of the six datasets,
and on most of them it exhibited higher purity and F-measure than GA and BPSO.
This indicates that BGWO is effective and efficient in searching for the globally
optimal solution. The exception in Table 3 is DS5, where BPSO obtained the best purity as
well as the best F-measure; on DS2, BPSO also obtained a higher F-measure than BGWO. Overall,
BGWO outperformed the other algorithms in purity and F-measure, with the best average rank (1.16).
Figure 4 shows the percentage of features selected by each of the
methods on the different datasets. According to the findings, the proposed
algorithm discovers a feature subset that yields better text clustering than
the subsets found by the other algorithms. Feature selection,
however, aims at improving the quality of the clustering while
removing uninformative features; a feature subset that is too small
may reduce the clustering quality. For example, BPSO obtained the smallest subset of features
for the DS3 text dataset. However, its purity and F-measure were lower (please
Table 3 Comparison of BPSO, GA, BGWO results for different datasets based on k-means clustering algorithm in terms of Purity and F-measure
Dataset          Measure      K-means without FS    BPSO      GA        BGWO
DS1              Purity       0.4108                0.4358    0.4139    0.4400
                 F-measure    0.3876                0.4004    0.3904    0.4286
                 Rank         4                     2         3         1
DS2              Purity       0.3908                0.4083    0.4012    0.4354
                 F-measure    0.3222                0.3471    0.3250    0.3299
                 Rank         4                     2         3         1
DS3              Purity       0.4759                0.4981    0.4887    0.5010
                 F-measure    0.4315                0.4507    0.4436    0.4627
                 Rank         4                     2         3         1
DS4              Purity       0.5938                0.5970    0.6035    0.6074
                 F-measure    0.5472                0.5579    0.5504    0.5801
                 Rank         4                     3         2         1
DS5              Purity       0.3741                0.4014    0.3810    0.3953
                 F-measure    0.3406                0.3499    0.3481    0.3418
                 Rank         4                     1         3         2
DS6              Purity       0.3525                0.3702    0.3558    0.3986
                 F-measure    0.3460                0.3662    0.3512    0.3962
                 Rank         4                     2         3         1
Average ranks                 4                     2.00      2.83      1.16
Final rank                    4                     2         3         1
Fig. 4 Features selected percentage between GA, BPSO, BGWO
refer to Table 3). In contrast, BGWO selected a larger subset, which provided higher
purity and F-measure than BPSO on the same dataset. Figure 4 also shows that neither the
smallest nor the largest feature subset guarantees the best or the worst
clustering performance.
6 Conclusion
This paper proposed a binary grey wolf optimizer (BGWO) to solve the FS problem in
TDC and to address its binary nature. BGWO uses the original features
to produce a subset that contains the most informative text features. The k-means
clustering technique takes these features as input in the clustering step so that
the new subset is evaluated. The proposed algorithm was tested on six benchmark
document datasets using the purity and F-measure criteria. In the experiments,
the BGWO algorithm achieved better overall results than the compared FS techniques. Therefore, the proposed FS algorithm enhanced the outcome of TDC
by obtaining more homogeneous groups. Hybridizing this algorithm with
other metaheuristic algorithms may improve the results by increasing the search capabilities of the algorithm. Another future enhancement is
to apply various fitness functions, which may further
improve the results.
Acknowledgements This work was supported by Universiti Sains Malaysia (USM) under Grant
(1001/PKOMP/ 8014016).
References
1. Abasi AK, Khader AT, Al-Betar MA, Naim S, Makhadmeh SN, Alyasseri ZAA (2019) A
text feature selection technique based on binary multi-verse optimizer for text clustering. In:
2019 IEEE Jordan international joint conference on electrical engineering and information
technology (JEEIT). IEEE, pp 1–6
2. Abasi AK, Khader AT, Al-Betar MA, Naim S, Makhadmeh SN, Alyasseri ZAA (2020) Link-based multi-verse optimizer for text documents clustering. Appl Soft Comput 87:106002
3. Abualigah LM, Khader AT (2017) Unsupervised text feature selection technique based on
hybrid particle swarm optimization algorithm with genetic operators for the text clustering. J
Supercomput 73(11):4773–4795
4. Abualigah LM, Khader AT, Al-Betar MA (2016) Multi-objectives-based text clustering technique using k-mean algorithm. In: 2016 7th international conference on computer science and
information technology (CSIT). IEEE, pp 1–6
5. Al-Betar MA, Awadallah MA (2018) Island bat algorithm for optimization. Expert Syst Appl
107:126–145
6. Al-Betar MA, Awadallah MA, Khader AT, Bolaji AL, Almomani A (2018) Economic load
dispatch problems with valve-point loading using natural updated harmony search. Neural
Comput Appl 29(10):767–781
7. Alomari OA, Khader AT, Al-Betar MA, Awadallah MA (2018) A novel gene selection method
using modified MRMR and hybrid bat-inspired algorithm with β-hill climbing. Appl Intell
48(11):4429–4447
8. Alyasseri ZAA, Khader AT, Al-Betar MA, Awadallah MA, Yang XS (2018) Variants of the
flower pollination algorithm: a review. In: Yang XS (ed) Nature-inspired algorithms and applied
optimization. Springer, Cham, pp 91–118
9. Alyasseri ZAA, Khader AT, Al-Betar MA, Papa JP, Alomari OA, Makhadme SN (2018) An
efficient optimization technique of EEG decomposition for user authentication system. In: 2018
2nd international conference on biosignal analysis, processing and systems (ICBAPS). IEEE,
pp 1–6
10. Bharti KK, Singh PK (2015) Hybrid dimension reduction by integrating feature selection with
feature extraction method for text clustering. Expert Syst Appl 42(6):3105–3114
11. Bharti KK, Singh PK (2016) Chaotic gradient artificial bee colony for text clustering. Soft
Comput 20(3):1113–1126
12. Bharti KK, Singh PK (2016) Opposition chaotic fitness mutation based adaptive inertia weight
BPSO for feature selection in text clustering. Appl Soft Comput 43:20–34
13. Emary E, Zawbaa HM, Hassanien AE (2016) Binary grey wolf optimization approaches for
feature selection. Neurocomputing 172:371–381
14. Faris H, Aljarah I, Al-Betar MA, Mirjalili S (2018) Grey wolf optimizer: a review of recent
variants and applications. Neural Comput Appl 30(2):413–435
15. Ferreira AJ, Figueiredo MA (2012) Efficient feature selection filters for high-dimensional data.
Pattern Recogn Lett 33(13):1794–1804
16. Forsati R, Mahdavi M, Shamsfard M, Meybodi MR (2013) Efficient stochastic algorithms for
document clustering. Inf Sci 220:269–291
17. Karaa WBA, Ashour AS, Sassi DB, Roy P, Kausar N, Dey N (2016) Medline text mining:
an enhancement genetic algorithm based approach for document clustering. In: Hassanien
AE, Grosan C, Fahmy Tolba M (eds) Applications of intelligent optimization in biology and
medicine. Springer, Cham, pp 267–287
18. Kushwaha N, Pant M (2018) Link based BPSO for feature selection in big data text clustering.
Future Gener Comput Syst 82:190–199
19. Mafarja MM, Mirjalili S (2019) Hybrid binary ant lion optimizer with rough set and approximate entropy reducts for feature selection. Soft Comput. 23(5):1–17
20. Makhadmeh SN, Khader AT, Al-Betar MA, Naim S, Alyasseri ZAA, Abasi AK (2019) Particle
swarm optimization algorithm for power scheduling problem using smart battery, pp 1–6
21. Makhadmeh SN, Khader AT, Al-Betar MA, Naim S, Alyasseri ZAA, Abasi AK (2020) A min-conflict algorithm for power scheduling problem in a smart home using battery. In: Proceedings
of the 11th national technical seminar on underwater system technology 2019. Springer, pp
1–12
22. Makhadmeh SN, Khader AT, Al-Betar MA, Naim S (2019) Multi-objective power scheduling
problem in smart homes using grey wolf optimiser. J Ambient Intell Human Comput. 10:3643–
3667
23. Mengle SS, Goharian N (2009) Ambiguity measure feature-selection algorithm. J Am Soc
Inform Sci Technol 60(5):1037–1050
24. Mirjalili S, Mirjalili SM, Lewis A (2014) Grey wolf optimizer. Adv Eng Softw 69:46–61
25. Nakariyakul S, Casasent DP (2009) An improvement on floating search algorithms for feature
subset selection. Pattern Recogn 42(9):1932–1940
26. Pudil P, Novovičová J, Kittler J (1994) Floating search methods in feature selection. Pattern
Recogn Lett 15(11):1119–1125
27. Shamsinejadbabki P, Saraee M (2012) A new unsupervised feature selection method for text
clustering based on genetic algorithms. J Intell Inf Syst 38(3):669–684
28. Sheikhpour R, Sarram MA, Gharaghani S, Chahooki MAZ (2017) A survey on semi-supervised
feature selection methods. Pattern Recogn 64:141–158
29. Song W, Qiao Y, Park SC, Qian X (2015) A hybrid evolutionary computation approach with
its application for optimizing text document clustering. Expert Syst Appl 42(5):2517–2524
Applied Electronics and Computer
Engineering
Metamaterial Antenna for Biomedical
Application
Mohd Aminudin Jamlos, Nur Amirah Othman, Wan Azani Mustafa,
and Maswani Khairi Marzuki
Abstract In this paper, a metamaterial element is applied to an antenna for
biomedical application. The metamaterial unit cell is constructed using the circular split
ring resonator (CSRR) technique and is attached to the ground of the antenna. The
metamaterial antenna is designed to operate at frequencies between 0.5 and 3.0 GHz,
which is suitable for biomedical applications such as wireless patient movement
monitoring, telemetry, and telemedicine, including micro-medical imaging and
Magnetic Resonance Imaging (MRI). The design and simulation were carried
out using Computer Simulation Technology Microwave Studio (CST MWS), while
the fabricated antenna was measured using a Vector Network Analyzer (VNA) to
analyse the overall performance.
Keywords Biomedical · Metamaterial antenna
1 Introduction
Metamaterials have been a popular research topic for almost two decades.
Although several definitions exist, most researchers agree on the basic characteristics that define a metamaterial [1]. Metamaterials are artificially engineered media
that are not generally found in nature; they can exhibit negative permittivity and permeability,
and hence a negative refractive index, as well as other
properties that are either absent or seldom found in natural materials [1–3].
Various metamaterials have been designed from radio frequencies up to optical
frequencies, and different functions have been realized, such as negative refractive
index (NRI), huge chirality, anisotropy, and bianisotropy [4]. As an interdisciplinary topic, metamaterials can be classified into different categories based on
different criteria. From an operating frequency point of view, they can be classified
M. A. Jamlos (&) N. A. Othman W. A. Mustafa M. K. Marzuki
Faculty of Engineering Technology, Universiti Malaysia Perlis, UniCITI ALAM Campus,
Sungai Chuchuh, 02100 Padang Besar, Perlis, Malaysia
e-mail: mohdaminudin@unimap.edu.my
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_35
as microwave metamaterials, terahertz metamaterials, and photonic metamaterials.
From a spatial arrangement point of view, there are 1D metamaterials, 2D metamaterials, and 3D metamaterials. From a material point of view, there are metallic
and dielectric metamaterials. In this work, we will concentrate on the electromagnetic properties, and introduce several important types of metamaterials [5].
In antenna design, metamaterial concepts mainly focus on size reduction and on improving the
characteristics of the conventional patch antenna [6, 7].
For some years the metamaterials idea has mostly been considered as a means of
engineering the electromagnetic response of passive micro- and nanostructured
materials. Remarkable results have been achieved so far including negative-index
media that refract light in the opposite direction from that of conventional materials,
chiral materials that rotate the polarization state of light hundreds of thousands of
times more strongly than natural optical crystals, and structured thin films with
remarkably strong dispersion that can slow light in much the same way as resonant
atomic systems with electromagnetically induced transparency [11–13]. These great
achievements in applications of metamaterials encouraged the biomedical scientists
to use these novel materials and their electromagnetic application in medicine.
2 Metamaterial Unit Cell
The dimension layout of the proposed G-shape Ring Resonator (GSRR) metamaterial unit cell [8] is depicted in Fig. 1. The gap between the splits (W2)
plays a significant role in determining the stop-band phenomenon of the proposed
metamaterial unit cell. Figure 2 illustrates that, with a proper gap of W2 = 0.5 mm, the stop-band
phenomenon of the structure is observed at 3.3 GHz. At 3.3 GHz the
reflection coefficient (S11) is close to zero and the transmission coefficient (S21) is
below −10 dB.
Similar to the GSRR unit cell, a Hexagon Split Ring Resonator (HSRR) unit cell is
also analyzed, as shown in Fig. 3, while the S-parameters of the HSRR design
are illustrated in Fig. 4.
On the other hand, a schematic view and the design parameters of the proposed
double-negative square-circular ring resonator (SCRR) metamaterial unit cell are
depicted in Fig. 5 [9]. This SCRR metamaterial unit cell is made by combining
split circular and split square ring structures on the front side and a metal strip on
the backside of the substrate. The metal strip on the backside is treated as a wire.
The square, circle and wire structures are made of copper with a
thickness of 0.035 mm. Arlon AD 350 (lossy) is used as the substrate material,
which has a dielectric constant of 3.5 and a loss tangent of 0.003. The square-circular
rings behave as inductors, whereas the splits in the square and circular rings behave
as capacitors, which are responsible for the resonance characteristics. The magnetic and
electric fields induced in the SRR and the wire, respectively, are responsible for negative
permeability (μ) and negative permittivity (ε). Due to these characteristics, the metamaterial exhibits left-handed properties.
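The inductor and capacitor picture of the rings and splits can be made concrete with the textbook lumped resonance formula f0 = 1/(2π√(LC)). The sketch below is illustrative only: the function name and the inductance and capacitance values are hypothetical and are not taken from the SCRR design.

```python
import math


def lc_resonant_frequency(inductance_h: float, capacitance_f: float) -> float:
    """Resonant frequency in Hz of an ideal lumped LC circuit."""
    if inductance_h <= 0 or capacitance_f <= 0:
        raise ValueError("inductance and capacitance must be positive")
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


# Hypothetical equivalent-circuit values, chosen only for illustration.
ring_inductance_h = 5e-9      # assumed ring inductance
gap_capacitance_f = 0.15e-12  # assumed split (gap) capacitance

f0_hz = lc_resonant_frequency(ring_inductance_h, gap_capacitance_f)
print(f"Resonant frequency: {f0_hz / 1e9:.2f} GHz")
```

Because the frequency scales as 1/√C, a narrower split (larger capacitance) lowers the resonance, which is consistent with the gap dimension being used as a tuning parameter.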
Metamaterial Antenna for Biomedical Application
Fig. 1 Detailed dimension layout of GSRR
Fig. 2 S-parameter of proposed design
Fig. 3 HSRR unit cell
Fig. 4 S-parameter of HSRR
Figure 6 shows the simulation setup for the proposed square-circular unit cell. The
frequency-domain solver of the electromagnetic simulator CST Microwave Studio
has been used to calculate the reflection and transmission coefficients of the
proposed design.
The unit cell is placed between two waveguide ports on the positive and negative
X-axis. Perfect Electric Conductor (PEC) and Perfect Magnetic Conductor
(PMC) boundary conditions are applied along the Y and Z-axes. The electromagnetic
properties are obtained from the simulated S11 and S21 characteristics of the SCRR metamaterial
unit cell. Several methods are suitable for parameter extraction, such
Fig. 5 SCRR unit cell structure. a Front view. b Back view
Fig. 6 Simulation setup of unit cell
as the TR method, the Nicolson-Ross method and many others. By using a transfer matrix,
the effective parameters of the proposed SCRR metamaterial structure, such as complex
permittivity and complex permeability, are extracted [10].
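As a rough illustration of how effective parameters can be recovered from S11 and S21, the sketch below uses the standard homogeneous-slab retrieval (impedance z and index n first, then ε = n/z and μ = nz). It is not necessarily the transfer-matrix procedure of [10]. The function names, the e^(−iωt) time convention, the choice of the principal branch (valid only for electrically thin cells) and the numerical values are all assumptions made for this example.

```python
import numpy as np

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def slab_s_parameters(n, z, freq_hz, thickness_m):
    """Forward model: S11 and S21 of a homogeneous slab in free space."""
    n, z = complex(n), complex(z)
    k0d = 2.0 * np.pi * freq_hz / C0 * thickness_m
    r01 = (z - 1.0) / (z + 1.0)          # interface reflection coefficient
    x = np.exp(1j * n * k0d)             # one-pass propagation factor
    denom = 1.0 - r01**2 * x**2
    return r01 * (1.0 - x**2) / denom, (1.0 - r01**2) * x / denom


def retrieve_parameters(s11, s21, freq_hz, thickness_m):
    """Return (n, z, eps, mu) from complex S11 and S21 of a thin slab."""
    s11, s21 = complex(s11), complex(s21)
    k0d = 2.0 * np.pi * freq_hz / C0 * thickness_m
    z = np.sqrt(((1.0 + s11) ** 2 - s21**2) / ((1.0 - s11) ** 2 - s21**2))
    if z.real < 0:                       # passive medium: Re(z) >= 0
        z = -z
    r01 = (z - 1.0) / (z + 1.0)
    x = s21 / (1.0 - s11 * r01)          # equals exp(i * n * k0 * d)
    # Principal branch only: assumes |Re(n) * k0 * d| < pi.
    n = (np.angle(x) - 1j * np.log(np.abs(x))) / k0d
    return n, z, n / z, n * z


# Round-trip check with hypothetical slab values (not from the paper).
n_true, z_true = -1.5 + 0.1j, 0.8 + 0.1j
s11, s21 = slab_s_parameters(n_true, z_true, 5.8e9, 2e-3)
n, z, eps, mu = retrieve_parameters(s11, s21, 5.8e9, 2e-3)
print(n, z, eps, mu)
```

For thicker cells the logarithm has further branches, and the correct one has to be selected by continuity over frequency. Simulators that use the opposite time convention require the S-parameters to be conjugated first.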
Figure 7 presents the transmission (S21) and reflection (S11) characteristics of the
simulated unit cell structure. The transmission characteristic (|S21| < −10 dB) shows
that it can be used from 3.36 to 5.88 GHz, which belongs to the C-band. Meanwhile,
Fig. 8 shows the phase response of S11 and S21. In Fig. 9, a negative refractive index
is obtained from 5.7 to 6 GHz, with the maximum negative value at 5.816 GHz. In
Fig. 10, the real part of the permittivity is negative from 3.22 to 6 GHz, while Fig. 11
shows that the real part of the permeability is negative from 5.824 to 6.1 GHz.
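As a quick check on the ranges quoted above, the band in which both real parts are negative is simply the overlap of the two intervals. The helper name below is hypothetical; the band edges are the ones stated in the text.

```python
def overlap(band_a, band_b):
    """Return the (low, high) overlap of two frequency bands in GHz, or None."""
    low = max(band_a[0], band_b[0])
    high = min(band_a[1], band_b[1])
    return (low, high) if low < high else None


negative_permittivity_ghz = (3.22, 6.0)    # range quoted for Fig. 10
negative_permeability_ghz = (5.824, 6.1)   # range quoted for Fig. 11

double_negative_ghz = overlap(negative_permittivity_ghz, negative_permeability_ghz)
print(double_negative_ghz)
```

The overlap, 5.824 to 6 GHz, lies inside the 5.7 to 6 GHz band quoted for the negative refractive index.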
For biomedical applications, an attractive property of metamaterials is that a plane
wave propagating in the medium has its phase velocity antiparallel to its group
velocity, so that the medium supports backward waves. In this paper we propose a
periodic rectangular split ring metamaterial (RSRM) structure, a unit cell of which is depicted in
Fig. 12. This RSRM unit cell is composed of two nested split rings,
which are etched on an FR4 substrate with a dielectric constant of 4.4. The resonance
frequency of this rectangular split ring unit cell structure depends on the gap
dimension (g).
Normally, slot-loaded miniaturized patch antennas are used in biomedical
applications. Such patch antennas have not been extended and analyzed with a metamaterial structure. Hence, a rectangular split ring metamaterial structure is loaded on the
ground plane of a conventional circular microstrip antenna so that the antenna
achieves a 75% size reduction and a good bandwidth and gain for
biomedical and wireless applications. The designed metamaterial circular microstrip
patch antenna is shown in Fig. 13. Parametric studies, varying the width and gap of the metamaterial structure, were carried out to improve the
bandwidth, gain and efficiency of the antenna under
test (AUT) for biomedical applications.
Fig. 7 The transmission (S21) and reflection (S11) characteristics
Fig. 8 Phase response of S11 and S21
Fig. 9 Refractive index
Fig. 10 Real part of permittivity
Fig. 11 Real part of permeability
Fig. 12 RSRM unit cell
Fig. 13 Metamaterial circular microstrip patch antenna as AUT (top and bottom view)
3 Conclusion
In conclusion, a variety of metamaterial antenna designs for biomedical applications
has been discussed. The competency of a metamaterial is determined by evaluating its
performance in terms of resonant frequency, gain, efficiency, radiation pattern,
reflection coefficient magnitude, power ratio and bandwidth. Among the challenges in
realizing ideal metamaterial designs are obtaining optimum efficiency and a compact
antenna size. Further effort in designing ideal
metamaterials must therefore be carried on alongside metamaterial antenna designs.
References
1. Gangwar K, Gangwar R (2014) Metamaterials: characteristics, process and applications. Adv
Electron Electric Eng 4:97–106
2. Mendhe SE, Kosta YP (2011) Metamaterial properties and applications. Int J Inf Technol
Knowl Manag 4(1):85–89
3. Sihvola A (2007) Metamaterials in electromagnetics. Metamaterials 1(1):2–11
4. Yan S (2015) Metamaterial design and its application for antennas. KU Leuven, Science,
Engineering & Technology
5. Anandhimeena B, Selvan PT, Raghavan S (2016) Compact metamaterial antenna with high
directivity for bio-medical systems. Circuits Syst 7:4036–4045
6. Islam MM, Islam MT, Samsuzzaman M, Faruque MRI, Misran N, Mansor MF (2015) A
miniaturized antenna with negative index metamaterial based on modified SRR and CLS unit
cell for UWB microwave imaging applications. Materials 8:392–407
7. Ali T, Subhash BK, Biradar RC (2018) Design and analysis of two novel metamaterial unit
cell for antenna engineering. In: Proceedings of 2018 2nd international conference on
advances in electronics, computers and communications, pp 1–4
8. Khombal M, Bagchi S, Harsh R, Chaudhari A (2018) Metamaterial unit cell with negative
refractive index at C band. In: 2018 2nd international conference on electronics, materials
engineering and nano-technology, IEMENTech 2018, pp 1–4
9. Rajput GS, Gwalior S (2012) Design and analysis of rectangular microstrip patch antenna
using metamaterial for better efficiency. Int J Adv Technol Eng Res 2:51–58
10. Koutsoupidou M, Karanasiou IS, Uzunoglu N (2013) Rectangular patch antenna on split-ring
resonators substrate for THz brain imaging: modeling and testing. In: 13th IEEE international
conference on bioinformatics and bioengineering, BIBE 2013. IEEE, pp 1–4
11. Singh G, Marwaha A (2015) A review of metamaterials and its applications. Int J Eng Trends
Technol 19(6):305–310
12. Hosseinzadeh HR (2018) Metamaterials in medicine: a new era for future orthopedics. Orthop
Res Online J 2(5):1–3
13. Tütüncu B, Torpi H, Urul B (2018) A comparative study on different types of metamaterials
for enhancement of microstrip patch antenna directivity at the Ku-band (12 GHz). Turk J
Electr Eng Comput Sci 26:1171–1179
Refraction Method of Metamaterial
for Antenna
Maswani Khairi Marzuki, Mohd Aminudin Jamlos,
Wan Azani Mustafa, and Khairul Najmy Abdul Rani
Abstract This paper reviews several refraction methods of metamaterial.
A metamaterial is an engineered structure that produces electromagnetic properties that
do not occur naturally in ordinary materials, such as negative permittivity, negative permeability and negative refractive index. This review focuses on
negative refractive index applications in the microwave and optical
frequency ranges. Each method covers a different frequency range. The split-ring resonator is used at microwave frequencies to enhance the gain, while the fishnet-chiral planar
structure is used at photonic frequencies. The photonic metamaterial acts similarly to a
lens, which leads to an enhancement of the gain.
Keywords Refraction method · Metamaterial · Antenna
1 Introduction
Metamaterial is an artificial material that was introduced to the world by researchers in the 19th century. It is known for its unique properties, which do not occur naturally in other materials [1]. It is formed from multiple composite materials, or
meta-atoms, arranged in a repeating pattern; the repeated element is known as a unit cell. The
meta-atoms are much larger than conventional atoms but much
smaller than the wavelength of the incident waves. The wavelength is in the millimeter range for microwave
radiation and in the nanometer range for photonic metamaterials [2]. Each
design provides different properties and is capable of manipulating electromagnetic waves, for example by blocking, absorbing, enhancing or bending the incident
wave. It also affects electromagnetic radiation or light [3].
The idea of creating an unusual material such as a metamaterial arose because of the
limited abilities of natural materials, which have only positive characteristics, such as
M. K. Marzuki M. A. Jamlos (&) W. A. Mustafa K. N. A. Rani
Faculty of Engineering Technology, Universiti Malaysia Perlis, UniCITI ALAM Campus,
Sungai Chuchuh, 02100 Padang Besar, Perlis, Malaysia
e-mail: mohdaminudin@unimap.edu.my
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_36
positive dielectric permittivity and positive magnetic permeability, and are therefore
known as “double positive” materials. Metamaterials can be divided into two
types. The first is “single negative”, where either the permittivity or the permeability
is negative; this type of metamaterial supports evanescent waves. The
other type has negative values for both permittivity and permeability, also known as “double
negative” metamaterial, which leads to a negative
refractive index [4].
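The classification above can be sketched in a few lines for an idealized lossless medium with real, nonzero relative permittivity and permeability. The function names are hypothetical and the sign rule (negative root for double negative media) is the usual textbook convention; real metamaterials are lossy and dispersive, so this is only a simplified illustration.

```python
import cmath


def classify_medium(eps_r: float, mu_r: float) -> str:
    """Classify a lossless medium by the signs of eps_r and mu_r (both nonzero)."""
    if eps_r > 0 and mu_r > 0:
        return "double positive"
    if eps_r < 0 and mu_r < 0:
        return "double negative"
    return "single negative"


def refractive_index(eps_r: float, mu_r: float) -> complex:
    """n = sqrt(eps_r * mu_r), taking the negative root for double negative media."""
    root = cmath.sqrt(eps_r * mu_r)
    if eps_r < 0 and mu_r < 0:
        return -root
    return root


for eps_r, mu_r in [(4.4, 1.0), (-4.0, 1.0), (-1.0, -1.0)]:
    print(classify_medium(eps_r, mu_r), refractive_index(eps_r, mu_r))
```

For the single negative case the index is purely imaginary, which corresponds to the evanescent (non-propagating) waves mentioned in the text.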
The focus of this paper is to explore the applications of the negative index
metamaterial (NIM). Theoretically, a NIM is referred to as a left-handed material
(LHM), where the Poynting vector is antiparallel to the wave vector. It is different from
a right-handed material, where the Poynting vector is parallel to the wave vector
and the permittivity and permeability are positive [5]. The important property of a NIM is that it
bends or refracts the light passing through it differently from a common positive index
material. The refracted light lies on the same side of the normal as the incident
light. A NIM with a refractive index of −1 would provide ultrahigh resolution and give
the superlensing effect. NIMs are used in a variety of applications and can be distinguished by different methods [6].
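The statement that the refracted ray lies on the same side of the normal follows directly from Snell's law, n1 sin θi = n2 sin θt, once n2 is allowed to be negative. The sketch below assumes lossless media with real indices; the function name is hypothetical.

```python
import math


def refraction_angle_deg(n1: float, n2: float, incidence_deg: float):
    """Refraction angle from Snell's law; None if there is no propagating ray."""
    s = n1 * math.sin(math.radians(incidence_deg)) / n2
    if abs(s) > 1.0:
        return None  # total internal reflection
    return math.degrees(math.asin(s))


# Air to ordinary glass-like medium versus air to an ideal n = -1 medium.
print(refraction_angle_deg(1.0, 1.5, 30.0))
print(refraction_angle_deg(1.0, -1.0, 30.0))
```

A negative angle means the transmitted ray is mirrored about the normal compared with ordinary refraction, which is the behaviour exploited in flat NIM lenses.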
2 Refraction Method
Several refraction methods of metamaterial are discussed in this section. Each
method is used in different applications depending on the design of the unit cell. The
first method uses a cylindrical lens antenna, as shown in Fig. 1. The researchers use
this method to replace the array antenna used at the base station for the next
generation mobile system (5G). It supports the application of multi-beam and
Fig. 1 Cylindrical lens antenna
Fig. 2 Huygens’ metasurfaces
multi-frequency use. Besides that, the negative refractive index reduces the thickness of the lens, and the value obtained for the application is n = 2 [7].
The Huygens’ metasurface method also produces a negative refractive index, which is
used to focus the beam of the signal. The metasurface is printed on two bonded boards
using standard PCB fabrication techniques, even though there are many stacked and
interspaced layers, as shown in Fig. 2 [8].
The split-ring resonator (SRR) is commonly used in metamaterial antennas for
many applications, depending on the design, as shown in Fig. 3. Many researchers tend to
use this method because of its design characteristics. The permeability value is
controlled by the radius and width of the ring [9]. Several different designs are
discussed for this method. The first design uses a double circular slot ring resonator. It acts as a planar surface lens, and a 3-dB transmission band of 2 GHz is
obtained between 8.55 and 10.55 GHz. A high gain antenna is then realized by
placing a double stacked meta-surface lens over a microstrip patch antenna; the gain is
enhanced by 8.55 dB in the H-plane and 6.20 dB in the E-plane, and the cross polarization is
improved by 8 dB [10]. There is also a squared SRR design, which is used to synthesize a
negative refractive index lens and a parabolic lens. This method uses 90 unit cells to get
n = ∞ at 11.6 GHz. The combination of these two meta-surfaces is able to focus the
energy at a point despite the power losses in the air [11]. Besides that, a combination of square and circular shapes is designed to exhibit a negative refractive index
in the 5.7 to 6 GHz frequency band [12], and other researchers have used this design to
produce a negative refractive index in the S-band between 2.2 and 3.3 GHz,
resonating at 2.5 GHz. Radiation directivity was also enhanced, and the design could be used for
wireless power transfer applications [13]. Lastly, the SRR design is not limited to
Fig. 3 a Double circular slot ring resonator. b Squared split-ring resonator. c Square-circular
split-ring resonator. d S-shape resonator
Fig. 4 a Chiral planar. b Fishnet structure. c Fishnet-like chiral metamaterial
circular or square shapes only. One researcher managed to design an SRR in an
S-shape, as shown in Fig. 3d. The negative refractive index occurred at a higher
frequency, between 5 and 9 GHz [14].
All the methods discussed above are
used to obtain a negative index at microwave frequencies. However, none of them
is used at optical frequencies. Therefore, the fishnet-chiral planar method is
introduced, as shown in Fig. 4. Three designs are reviewed in this section. The
first is the chiral planar design used at optical frequencies. It managed to reduce the
losses of the negative index metamaterial and exhibit polarization effects for the light
field [2]. Then, the fishnet structure design was introduced, and the researchers found
that this method can be used to obtain negative permeability and to achieve the highest
figure of merit (FOM) without loss compensation. Besides that, light passing
through the structure undergoes negative refraction at the interface and focuses in the far
field, so the negative index metamaterial (NIM) slab acts similarly to a lens. Lastly, the
combination of the fishnet and chiral planar designs, known as the fishnet-like
chiral metamaterial, was designed. It was used to reduce the losses exhibited by the chiral metamaterial and exhibits negative refractive indices in three frequency bands [15].
3 Conclusion
Metamaterial capabilities have been explored in many applications of the negative index metamaterial, as reviewed in this paper.
However, most of the applications are in the
microwave frequency range. Therefore, it is worthwhile to explore photonic
systems further. As reviewed, the fourth method, the fishnet-chiral planar design, is able to
manipulate electromagnetic radiation or light. This method has three different capabilities depending on its design: it can exhibit polarization
effects of light, bend and focus the light at a point, and act similarly to a lens. With
these properties, it can be used for further exploration of electromagnetic radiation and for
manipulating the properties of light.
References
1. Kuse R, Hori T, Fujimoto M (2015) Variable reflection angle meta-surface using double
layered FSS. In: 2015 IEEE international symposium on antennas and propagation & USNC/
URSI national radio science meeting, Canada. IEEE, pp 872–873
2. Linden S, Wegener M (2007) Photonic metamaterials. In: Conference proceedings of the
international symposium on signals, systems and electronics, USA, pp 147–150
3. Zhu B, Huang C, Zhao J, Jiang T, Feng Y (2010) Manipulating polarization of
electromagnetic waves through controllable metamaterial absorber. In: 2010 Asia-pacific
microwave conference, Japan. IEEE, pp 1525–1528
4. Duan ZY, Guo C, Guo X, Chen M (2016) Double negative-metamaterial based terahertz
radiation excited by a sheet beam bunch. Phys Plasmas 20(9):1–6
5. Solymar L, Shamonina E (2009) Waves in metamaterial. Oxford University Press, Oxford A
bird’s-eye view of metamaterials
6. Yang J, Xu F, Yao S (2018) A dual frequency Fabry-Perot antenna based on metamaterial
lens. In: 2018 12th international symposium on antennas, propagation and EM theory
(ISAPE), China. CRIRP, pp 1–3
7. Hamid S, Ali MT, Abd Rahman NH, Pasya I, Yamada Y, Michishita N (2016) Accuracy
estimations of a negative refractive index cylindrical lens antenna designing. In: Proceedings
of the 2016 6th IEEE-APS topical conference on antennas and propagation in wireless
communications, APWC, USA. IEEE, pp 23–26
8. Wong Joseph PS (2015) Design of Huygens’ metasurfaces for refraction and focusing.
A dissertation submitted to the faculty of The University of Toronto in partial fulfillment of
requirement for the degree of Doctor of Philosophy in Electrical and Computer Engineering
9. Singh AK, Abegaonkar MP, Koul SK (2017) A negative index metamaterial lens for antenna
gain enhancement. In: International symposium on antennas and propagation, USA. IEEE,
pp 1–2
10. Yang J, Xu F, Yao S (2018) A dual frequency Fabry-Perot antenna based on metamaterial
lens. In: 12th international symposium on antennas, propagation and EM theory (ISAPE),
China. IEEE, pp 1–3
11. Pan CW, Kehn MNM, Quevedo-Teruel O (2015) Microwave focusing lenses by synthesized
with positive or negative refractive index split-ring resonator metamaterials. In: International
workshop on electromagnetics: applications and student innovation competition, IWEM,
pp 1–2
12. Khombal M, Bagchi S, Harsh R, Chaudhari A (2018) Metamaterial unit cell with negative
refractive index at C band. In: 2nd international conference on electronics, materials
engineering and nano-technology, India. IEEE, pp 1–4
13. Baghel AK, Nayak SK (2018) Negative refractive index metamaterial for enhancing radiation
directivity in S-band. In 3rd international conference on microwave and photonics, India.
IEEE, pp 1–2
14. Fiddy MA, Adams R, Weldon TP (2017) Exploiting metamaterials: fundamentals and
applications. A dissertation submitted to the faculty of The University of North Carolina at
Charlotte in partial fulfillment of the requirements for the degree of Doctor of Philosophy in
Electrical Engineering
15. Fernández O, Gómez Á, Vegas A, Molina-Cuberos GJ, García-Collado AJ (2017) Novel
fishnet-like chiral metamaterial structure with negative refractive index and low losses. In:
IEEE antennas and propagation society international symposium proceedings, USA, pp 1959–
1960
Circular Polarized 5.8 GHz Directional
Antenna Design for Base Station
Application
Mohd Aminudin Jamlos, Nurasma Husna Mohd Sabri,
Wan Azani Mustafa, and Maswani Khairi Marzuki
Abstract Research, development and utilization of directional antennas
with circular polarization have grown rapidly for base station applications.
The High Gain Antenna (HGA) is a directional antenna with a narrow
beam width. It permits more precise targeting of
the radio signal and is usually placed in an open area so that the transmitted radio waves
are not interrupted. In this paper, methods for circularly polarized
microstrip patch antenna design are reviewed. To realize a circularly
polarized antenna, the patch undergoes some design modification, while an array
antenna is designed to improve the antenna performance and realize a high gain so that
it is suitable for use in base station applications.
Keywords Circularly polarized · Base station · Antenna
1 Introduction
A circularly polarized 5.8 GHz directional antenna is designed for base
station applications. Such an antenna must have very wide band impedance matching, a stable radiation pattern over a wide frequency band and a high
cross-polarization ratio over a wide angle range [1–3]. For this research, a circularly
polarized microstrip patch antenna is designed since it is suitable for wireless
communication. To obtain a circularly polarized design, the patch must
undergo some modification, such as adding a perturbation, a slot or slit, or
truncated corners [1, 4].
For the antenna to work in a base station, it must
have a very high gain so that the signal can be transmitted and received
consistently. Thus, an array antenna is designed to improve the antenna gain
M. A. Jamlos (&) N. H. Mohd Sabri W. A. Mustafa M. K. Marzuki
Faculty of Engineering Technology, Universiti Malaysia Perlis, UniCITI ALAM Campus,
Sungai Chuchuh, 02100 Padang Besar, Perlis, Malaysia
e-mail: mohdaminudin@unimap.edu.my
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_37
performance in base station applications [5], where a rectangular microstrip patch
antenna array is designed and some modification of the patch is made in order to
obtain a circularly polarized antenna for use in base station applications.
Besides, a directional radiation pattern is important since it
provides increased performance and reduced interference during transmission and
reception [6]. A directional antenna is designed to function more
effectively in some directions than in others. The reason for this directionality is to improve the transmission and reception of the communication signal as well as to reduce interference
[5]. The antenna for base station applications operates at 5.8 GHz to meet the
requirement of large bandwidth and gain.
2 Microstrip Antenna
Microstrip antennas are low cost, light weight, conformal antennas
which can be integrated with feed networks and active devices. The basic structure
of a microstrip antenna consists of a radiating patch on one side of a dielectric
substrate and a ground plane on the other side of the substrate [1, 3, 5]. A microstrip
patch antenna structure is shown in Fig. 1. The patch is generally made of a conducting material such as copper or gold. The patch
and the feed lines are photo-etched on the substrate, so the patch can take any desired shape. A rectangular patch is the simplest
shape to etch and analyze. The microstrip antenna has the advantages of low
profile, light weight, low cost and ease of integration with active components and radio
frequency devices [3, 7].
However, the microstrip antenna also has disadvantages, namely low gain,
low efficiency and low power handling capability. All of these disadvantages can be
overcome by using an array concept or by making a MIMO antenna [5, 8]. Besides, the
radiation pattern of an antenna depends on its dimensions. It also depends on the
effective permittivity of the substrate, which in turn depends on the width of the patch and the height of
the substrate.
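The dependence of the effective permittivity on patch width and substrate height can be illustrated with the widely used transmission-line design equations for a rectangular patch, as found in standard antenna textbooks such as [7]. The sketch below is a first-pass estimate only. The function name is hypothetical, and the FR4 permittivity of 4.4 and the 1.6 mm substrate height are assumed values for illustration, not the parameters of a specific design reviewed here.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def rectangular_patch_dimensions(freq_hz, eps_r, substrate_height_m):
    """Return (width_m, length_m, eps_eff) from the transmission-line model."""
    h = substrate_height_m
    width = C0 / (2.0 * freq_hz) * math.sqrt(2.0 / (eps_r + 1.0))
    # Effective permittivity depends on the ratio of substrate height to patch width.
    eps_eff = (eps_r + 1.0) / 2.0 + (eps_r - 1.0) / 2.0 / math.sqrt(1.0 + 12.0 * h / width)
    # Length extension caused by fringing fields at the radiating edges.
    delta_l = 0.412 * h * ((eps_eff + 0.3) * (width / h + 0.264)) / (
        (eps_eff - 0.258) * (width / h + 0.8)
    )
    length = C0 / (2.0 * freq_hz * math.sqrt(eps_eff)) - 2.0 * delta_l
    return width, length, eps_eff


# Assumed example: 5.8 GHz on FR4 (eps_r = 4.4), 1.6 mm thick.
w, l, e_eff = rectangular_patch_dimensions(5.8e9, 4.4, 1.6e-3)
print(f"W = {w * 1e3:.2f} mm, L = {l * 1e3:.2f} mm, eps_eff = {e_eff:.3f}")
```

These closed-form values are normally used only as a starting point and are then tuned in a full-wave simulator.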
Fig. 1 Microstrip patch antenna structure [1]
Fig. 2 Types of polarization. a Linear. b Circular. c Elliptical [9]
3 Antenna Polarization
Polarization is the property of an electromagnetic wave describing the time-varying
direction and relative magnitude of the electric field vector as observed along the
direction of propagation. There are three types of polarization:
linear polarization, circular polarization and elliptical polarization.
Figure 2 shows the three types of polarization [9].
The transmitting and receiving antennas should be similarly polarized, otherwise there will
be more losses. The use of linear polarization requires the transmitting and receiving antennas to be well aligned. This limitation of alignment can be
removed by using circular polarization, which suits this research project
since it requires circular polarization in its design [10]. Circularly polarized antennas used
to be an exotic microwave technology for communication. The field of a CP antenna is always
rotating. Circular polarization (CP) can be achieved by
making the axial ratio equal to one. Besides, other researchers claim that a circularly
polarized antenna has an axial ratio of less than 3 dB at a 90° phase shift [11]. Circular
polarization has two types, Right Hand CP (RHCP) and Left Hand CP
(LHCP). For practical implementation, it is necessary to consider whether the antenna
is an LHCP or an RHCP antenna: if the transmitting antenna is LHCP and the
receiving antenna is RHCP, there will be a 25 dB gain difference between them.
Polarization losses also exist when the transmitting and
receiving antenna polarizations are different [12].
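The axial ratio criterion can be illustrated numerically. The sketch below computes the axial ratio of the polarization ellipse formed by two orthogonal field components, using the standard major and minor axis expressions, and compares it with the 3 dB threshold mentioned above. The function name and the sample amplitudes are hypothetical.

```python
import math


def axial_ratio_db(amp_x: float, amp_y: float, phase_deg: float) -> float:
    """Axial ratio in dB for orthogonal components with a given phase difference."""
    ex2, ey2 = amp_x**2, amp_y**2
    inner = ex2**2 + ey2**2 + 2.0 * ex2 * ey2 * math.cos(2.0 * math.radians(phase_deg))
    root = math.sqrt(max(inner, 0.0))
    major_sq = 0.5 * (ex2 + ey2 + root)
    minor_sq = 0.5 * (ex2 + ey2 - root)
    if minor_sq <= 1e-12 * (ex2 + ey2):
        return math.inf  # linear polarization: the minor axis vanishes
    return 10.0 * math.log10(major_sq / minor_sq)


for amp_y, phase in [(1.0, 90.0), (0.8, 90.0), (1.0, 0.0)]:
    ar = axial_ratio_db(1.0, amp_y, phase)
    print(amp_y, phase, ar, "CP" if ar < 3.0 else "not CP")
```

Equal amplitudes with a 90° phase difference give an axial ratio of one (0 dB), while a moderate amplitude imbalance still stays below the 3 dB limit.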
4 Methods for Circular Polarized Antenna Design
Circular polarization (CP) antennas are increasingly attractive in wireless communication systems [13]. Circular polarization can be obtained if two orthogonal modes
with equal amplitudes are excited with a 90° time-phase difference. This can be
accomplished, for instance, by adjusting the physical dimensions of the microstrip
patch or by various feed arrangements [14, 15]. The figures below show some of the
antenna designs with which researchers have obtained circular polarization.
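The condition of two orthogonal modes with equal amplitudes and a 90° time-phase difference can be checked with a short numerical sketch: the tip of the resulting field vector keeps a constant magnitude while it rotates. The function name and the sampling of eight time instants are choices made for this illustration only.

```python
import math


def field_vector(amp_x: float, amp_y: float, phase_deg: float, omega_t: float):
    """Instantaneous (Ex, Ey) of two orthogonal modes at phase angle omega_t."""
    ex = amp_x * math.cos(omega_t)
    ey = amp_y * math.cos(omega_t + math.radians(phase_deg))
    return ex, ey


# Sample one period at eight instants.
samples = [2.0 * math.pi * k / 8 for k in range(8)]
circular = [math.hypot(*field_vector(1.0, 1.0, 90.0, t)) for t in samples]
linear = [math.hypot(*field_vector(1.0, 1.0, 0.0, t)) for t in samples]

print([round(m, 6) for m in circular])
print([round(m, 6) for m in linear])
```

With a 90° phase difference the magnitude stays constant over the period, whereas with zero phase difference it oscillates and passes through zero, which is linear polarization.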
Some researchers have modified the antenna design to obtain circular polarization. As presented by Thoetphan Kingsuwannaphong, the design of a 5.7 GHz circular polarization antenna uses a double feeder in order to avoid interference
from the adjacent channels of other wireless devices. However, that antenna requires two input
ports with 0° and 90° phase inputs to achieve the circular polarization property. Since it is
possible to create two output signals with a 90° phase difference, a compact
circularly polarized antenna with an inset feed and a slot is designed, as shown in Fig. 3. The
slot at the edge of the circular patch is made to achieve circular polarization. The resulting
axial ratio is shown in Fig. 4. From the simulation, the axial
ratio is acceptable: at a 90° phase, the AR is below 3 dB, so the design is
circularly polarized.
Another way to achieve circular polarization is to make an inclined or
diagonal slot at the centre of the patch. The slot technique is one way to obtain
circular polarization [16–18]. In one such design, the
antenna element is a square with an inclined slot at the center. The antenna is
fed by a microstrip line with a characteristic impedance of 100 Ω and
is mounted on an FR4 substrate. The antenna dimensions are presented in
Fig. 5. Besides, by introducing asymmetrical slits in the diagonal direction of the
square microstrip patch [18], a single coaxial-feed microstrip patch antenna is
realized for circularly polarized radiation with a compact antenna size. The impedance and axial ratio bandwidths are small, around 2.5% and 0.5%, respectively.
Besides, to make a circularly polarized antenna, some modification of
the patch is made, such as truncating the patch or cutting a slot.
In previous research, the proposed antenna is developed by combining two array antennas excited from a 50 Ω coaxial feed probe; the array
Fig. 3 Circular polarized
antenna design [15]
Fig. 4 Simulation result of axial ratio
Fig. 5 Patch antenna design
with inclined slot [16]
Fig. 6 Circular polarized array antenna design [12]
antenna is designed with four patch elements on the substrate, and each element is
truncated at the corner of the patch to achieve circular polarization [12, 19, 20].
The designed antenna is shown in Fig. 6.
A single-feed CP U-slot microstrip antenna is proposed in [21]. The asymmetrical U-slot structure is able to generate two orthogonal modes for CP operation
without truncating any corner of the square patch. CP radiation can also be achieved by
etching a complementary split-ring resonator on the patch. The orientation of the etched gap relative to the current propagation direction causes the antenna to generate CP
waves. By cutting asymmetrical slots into the square patch, a single probe-feed
microstrip antenna is realized for CP radiation [22]. A new technique for designing a
single-feed CP microstrip antenna using a Fractal Defected Ground Structure (FDGS)
has also been presented [21, 23]. By using this method, the level
of the linearly polarized microstrip antenna is increased to the level required for CP
radiation.
Another technique to obtain a circularly polarized antenna is given in [24]. In that paper, a
circular microstrip patch antenna and its two-element array are proposed for
ISM band applications. The proposed antenna and its array operate in the
5.8 GHz ISM band. The antenna consists of a circular patch which has an elliptical
slot and a vertical strip at the center of the patch, as shown in Fig. 7. The
antenna shows a circularly polarized radiation pattern with the best return loss
characteristics.
Fig. 7 Circular polarized
array antenna design [24]
5 Conclusion
In conclusion, the paper describes methods for circularly polarized microstrip
patch antenna design and ways to improve its performance to enhance its applicability in base station applications. The bandwidth of the microstrip
antenna is its main limitation, since a large bandwidth is needed for the base station.
In this paper, methods for obtaining circular polarization are described, including modifying the shape of the patch antenna and
using different feeding techniques, which
help to increase the bandwidth of the antenna, as does arranging the antenna
in an array configuration. Slotted antennas with different slot shapes and sizes
also help in achieving increased bandwidth, improved efficiency and gain.
References
1. Kingsuwannaphong T, Sittakul V (2018) Compact circularly polarized inset-fed circular
microstrip antenna for 5 GHz band. Comput Electr Eng 65:554–563
2. Chen W-S, Wu C-K, Wong K-L (2002) Compact circularly-polarised circular microstrip
antenna with cross-slot and peripheral cuts. Electron Lett 34:1040
3. Nayan MKA, Jamlos MF, Jamlos MA (2014) Circular polarized phased shift 90° MIMO array
antenna for 5.8 GHz application. In: IEEE international symposium on telecommunication
technologies, ISTT, vol 76, pp 169–173
4. Karvekar S, Deosarkar S, Deshmukh V (2014) Design of compact probe fed slot loaded
microstrip antenna. In: International conference on communication and signal processing,
ICCSP, pp 387–390
5. Midasala V, Siddaiah P (2016) Microstrip patch antenna array design to improve better gains.
Procedia Comput Sci 85:401–409
6. Fauzi DLN, Hariyadi T (2018) Design of a directional microstrip antenna at UHF-band for
passive radar application. IOP Conf Ser Mater Sci Eng 384:012006
7. Balanis CA (2005) Antenna theory analysis and design, 3rd edn. Wiley, Hoboken
8. Nayan MKA, Jamlos MF, Jamlos MA, Lago H (2014) MIMO 2 × 2 RHCP array antenna for
point-to-point communication. In: IEEE symposium on wireless technology and applications,
ISWTA, pp 121–124
9. Orban D, Moernaut GJK (2006) The basics of patch antennas. Orban Microwave Products,
pp 1–4
10. Lacoste R (2010) Robert Lacoste’s the darker side: practical applications for electronic design
concepts. Elsevier Inc., Amsterdam
11. Fujita K, Yoshitomi K, Yoshida K, Kanaya H (2015) A circularly polarized planar antenna on
flexible substrate for ultra-wideband high-band applications. AEU Int J Electron Commun
69:1381–1386
12. Kunooru B, Nandigama SV, Rani SS, Ramakrishna D (2019) Analysis of LHCP and RHCP
for microstrip patch antenna. In: International conference on communication and signal
processing (ICCSP), pp 0045–0049
13. Jamlos MA, Jamlos MF, Ismail AH (2015) High performance of coaxial feed UWB antenna
with parasitic element for microwave imaging. Microw Opt Technol Lett 57:649–653
14. Jackson DR, Long SA, Williams JT, Davis VB (1997) Computer aided design of rectangular
microstrip antennas of advances in microstrip and printed antennas, 2nd edn. Wiley, Hoboken
15. Garg AIR, Bhartia P, Bahl I (2001) Microstrip antenna design handbook. Artech House,
Boston
16. Nayan MK, Jamlos MF, Lago H, Jamlos MA (2015) Two-port circular polarized antenna
array for point-to-point communication. Microw Opt Technol Lett 57:2328–2332
17. Madhuri S, Tiwari VN (2016) Review of circular polarization techniques for design of
microstrip patch antenna. In: International conference on recent cognizance in wireless
communication & image processing, pp 663–669
18. Nasimuddin, Chen ZN, Esselle KP (2008) Wideband circularly polarized microstrip antenna
array using a new single feed network. Microw Opt Technol Lett 50:1784–1789
19. Liang D, Hosung C, Robert WH, Hao L (2005) Simulation of MIMO channel capacity with
antenna polarization. IEEE Trans Wireless Commun 4(4):1869–1873
20. Wei K, Li JY, Wang L, Xu R, Xing ZJ (2017) A new technique to design circularly polarized
microstrip antenna by fractal defected ground structure. IEEE Trans Antennas Propag
65:3721–3725
21. Nasimuddin, Qing X, Chen ZN (2011) Compact asymmetric-slit microstrip antennas for
circular polarization. IEEE Trans Antennas Propag 59:285–288
22. Gupta K, Jain K, Singh P (2014) Analysis and design of circular microstrip patch antenna at
5.8 GHz. Int J Comput Sci Inf Technol 5:3895–3898
23. Nayan MK, Jamlos MF, Jamlos MA (2015) Circularly polarized MIMO antenna array for
point-to-point communication. Microw Opt Technol Lett 57:242–247
24. Singh N, Yadav DP, Singh S, Sarin RK (2010) Compact corner truncated triangular patch
antenna for WiMax application. In: Mediterranean microwave symposium, MMS, pp 163–
165
Medical Image Enhancement
and Deblurring
Reza Amini Gougeh, Tohid Yousefi Rezaii, and Ali Farzamnia
Abstract One of the most common image artifacts is blurring. Blind methods have
been developed to restore a clear image from a blurred input. In this paper, we introduce
a new method that optimizes previous works and adapts them to medical images.
Optimized non-linear anisotropic diffusion is used to reduce noise by choosing
its constants correctly. After de-noising, edge sharpening is done using shock filters.
An enhanced method called coherence-enhancing shock filters helps us to
obtain strongly sharpened edges. To obtain the blur kernel, we use a coarse-to-fine
method. In the last step, we use a spatial prior before restoring the unblurred image.
Experiments with images show that combining these methods may outperform previous image restoration techniques and yield reliable accuracy.
Keywords Medical images · Blind deconvolution · Deblurring
1 Introduction
Medical images are an indispensable component of the diagnosis and treatment
system, so accurate images are needed. Blur is a type of medical image artifact that
has various sources, such as body movement or the detector.
The blur kernel determines the effect of the blur on the image. If the blur is
shift-invariant, it can be modeled as a convolution of the original image with
the blur kernel; thus, obtaining a clear image becomes a deconvolution problem. In
non-blind deconvolution, the blur function is known, and the problem is to find the
original image from the blurred image. In blind deconvolution, the blur function is
unknown [1]. Among the non-blind methods, we can refer to the Wiener filter and
the Lucy-Richardson method, which were introduced decades ago with the initial
R. Amini Gougeh · T. Yousefi Rezaii
Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz, Iran
A. Farzamnia (✉)
Faculty of Engineering, Universiti Malaysia Sabah, Kota Kinabalu, Sabah, Malaysia
e-mail: ali-farzamnia@ieee.org
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_38
assumptions about the blur function. Among the newer blind methods, we can mention
the method of Fergus et al. [2].
In this article, we investigate the blind deconvolution approach and try to
obtain an efficient method for use in the medical field, building on previous improvements. A clear image can be recovered fully and correctly only in the absence of noise in the blurred
image and of error in the blur kernel estimate, so the proposed algorithm tries to approach
this ideal. Because blurry images are also noisy, we have the following
equation for the blurry image:
$$b = i \otimes k + n \tag{1}$$
where b is the blurry image, i is the clear image, k is the blur kernel, n is noise, and
$\otimes$ denotes the convolution operator. In the Fourier domain, Eq. (1)
becomes the following relation:
$$B = I \cdot K + N \tag{2}$$
Figure 1 illustrates this equation on a carotid MRI image.
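The blur model of Eqs. (1) and (2) can be illustrated with a minimal NumPy sketch. It assumes circular (periodic) boundary handling so that the convolution theorem holds exactly; the function name `blur_circular`, the toy image, and the box kernel are hypothetical choices made for illustration and are not taken from the paper.

```python
import numpy as np

def blur_circular(image, kernel, noise_sigma=0.0, rng=None):
    """Return b = i (*) k + n using circular, FFT-based convolution."""
    rng = np.random.default_rng(0) if rng is None else rng
    # Zero-pad the kernel to the image size so both spectra have the same shape.
    k_pad = np.zeros_like(image, dtype=float)
    kh, kw = kernel.shape
    k_pad[:kh, :kw] = kernel
    # Move the kernel center to the origin so the blur does not shift the image.
    k_pad = np.roll(k_pad, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    noise = noise_sigma * rng.standard_normal(image.shape)
    # Eq. (2): B = I . K + N, an element-wise product in the Fourier domain.
    B = np.fft.fft2(image) * np.fft.fft2(k_pad) + np.fft.fft2(noise)
    return np.real(np.fft.ifft2(B)), k_pad, noise

# Toy example (assumed data): one bright pixel blurred by a 3x3 box kernel.
image = np.zeros((8, 8))
image[4, 4] = 1.0
kernel = np.full((3, 3), 1.0 / 9.0)
blurred, k_pad, noise = blur_circular(image, kernel, noise_sigma=0.0)
```

In the noise-free case the bright pixel is spread into a copy of the kernel centered at the same location, which is the behavior Eq. (1) describes.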
Projection-based and maximum-likelihood methods are the two major types of
blind deconvolution. The projection-based approach retrieves the blur function and
the true image simultaneously. It iterates until it meets a
predefined criterion; the first step is estimating the blur function. One of the benefits
of this method is that it is not sensitive to noise. The second approach computes the
maximum-likelihood estimate of blur parameters, such as the covariance matrix.
Since the estimated blur function is not unique, it is possible to introduce constraints
by considering the size and symmetry of the estimated function. One of the significant
advantages of this method is its low computational complexity; it also
helps to determine the blur, noise, and true image power spectra [3].
Blur kernel estimation is an ill-posed problem, so various types of regularization
terms have been used in the models. Fergus et al. [2] used a heavy-tailed distribution,
with a mixture of Gaussians and Bayes' theorem to estimate the kernel. Shan et al. [4]
developed a parametric model to approximate the heavy-tailed distribution of natural
image gradients. Levin et al. [5] used hyper-Laplacian regularization terms for image
gradient approximation. Cho and Lee [1] used a coarse-to-fine method to determine the
blur kernel. They used this iterative method with a bilateral filter and
Gaussian regularization terms. Notably, our method is an adaptation of this method.
Fig. 1 Practical Eq. (1)
According to previous studies of blur kernel estimation, the presence of
appropriate edges makes the estimation more accurate. Combined methods, such as
shock filters with bilateral filters, have been used by Money and Kang [6] and
Alvarez and Mazorra [7]. Xu et al. [8] used the L0 norm in the equations for kernel
estimation, which handles noise well and prevents errors that appear around
the edges.
The paper is organized as follows. In Sect. 2, we describe the structure of our
algorithm and the methods we used. Numerical aspects and results are briefly
sketched in Sect. 3. The last section summarizes and concludes the
paper.
2 Materials and Methods
The primary purpose of the iterative alternating optimization is to refine the motion
blur kernel progressively. The final deblurring result is obtained by the last
non-blind de-convolution operation that is performed with the final kernel K and the
given blurred image B. The intermediate latent images estimated during the iterations have no direct influence on the deblurring result. They only affect the result
indirectly by contributing to the refinement of kernel K. The success of previous
iterative methods comes from two essential properties, including sharp edge
restoration and noise suppression in smooth regions. These attributes help to estimate the kernel accurately [1].
Coarse-to-fine methods have also been developed for medical images. For example, Chen et al.
[9] developed a new coarse-to-fine framework for 3D brain MR image registration. We used
another method based on spatial priors.
2.1 Noise Reduction
In the first phase of blur function estimation, we denoise the blurry image.
The method used in this study is based on the Perona-Malik method [10], which
relies on partial derivatives in image analysis. The values of the conduction coefficient and diffusion rate play an important role in noise reduction. A
weakness of conventional methods is the manual selection of these constants. In our
method, the image gradient is calculated over the four major neighbors of each pixel, and then the
differences between the gradients are calculated in the horizontal and vertical directions.
By calculating the average gradient value and its variance, we obtain an
appropriate measure of the magnitude of the image gradient changes,
which has a linear relationship with the diffusion rate. Choosing the right values is
critical to preserving the edges of the image: larger values over-smooth the image,
while at low values noise reduction is not possible.
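Equation (3) specifies the output image of this method at iteration t + 1:
$$I_{i,j}^{t+1} = I_{i,j}^{t} + \lambda \left[ C_N \cdot \nabla_N I + C_S \cdot \nabla_S I + C_E \cdot \nabla_E I + C_W \cdot \nabla_W I \right]_{i,j}^{t} \tag{3}$$
where $0 \le \lambda \le 0.25$ for the numerical scheme to be stable, N, S, E, and W are the
subscripts for the North, South, East, and West neighbors, and the symbol $\nabla$ indicates
nearest-neighbor differences:
$$\begin{aligned}
\nabla_N I_{i,j} &\equiv I_{i-1,j} - I_{i,j} \\
\nabla_S I_{i,j} &\equiv I_{i+1,j} - I_{i,j} \\
\nabla_E I_{i,j} &\equiv I_{i,j+1} - I_{i,j} \\
\nabla_W I_{i,j} &\equiv I_{i,j-1} - I_{i,j}
\end{aligned} \tag{4}$$
The conduction coefficients are updated at every iteration as a function of the
brightness gradient:
$$\begin{aligned}
C^{t}_{N_{i,j}} &= g\left( (\nabla I)^{t}_{i+\frac{1}{2},\,j} \right) \\
C^{t}_{S_{i,j}} &= g\left( (\nabla I)^{t}_{i-\frac{1}{2},\,j} \right) \\
C^{t}_{E_{i,j}} &= g\left( (\nabla I)^{t}_{i,\,j+\frac{1}{2}} \right) \\
C^{t}_{W_{i,j}} &= g\left( (\nabla I)^{t}_{i,\,j-\frac{1}{2}} \right)
\end{aligned} \tag{5}$$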
Figure 2 illustrates the four major neighbors of a pixel.
We used the function of Black et al. [11] as g(·):
$$g(\nabla I) = f(x) = \begin{cases} 0.67\left[ 1 - \left( \dfrac{x}{k\sqrt{5}} \right)^{2} \right]^{2}, & x \le k\sqrt{5} \\ 0, & \text{otherwise} \end{cases} \tag{6}$$
where k is the diffusion rate, which controls the sensitivity to edges.
$$\begin{aligned}
\nabla_{NS} I &= \nabla_N I - \nabla_S I \\
\nabla_{EW} I &= \nabla_E I - \nabla_W I
\end{aligned} \tag{7}$$
Fig. 2 Discrete computational structure for simulation of the diffusion equation of Perona and Malik [10]
We calculated the gradient in the vertical and horizontal directions using Eq. (7); the
average gradient value is then calculated as follows:
$$\nabla I = \sqrt{(\nabla_{NS} I)^{2} + (\nabla_{EW} I)^{2}} \tag{8}$$
According to the results of Hasanpor et al. [12], k has a linear relationship with the
variance of the gradients, so we have:
$$k = \alpha \cdot \mathrm{Var}(\nabla I) \tag{9}$$
With respect to the noise properties, we can suggest an optimum value of $\alpha$, so that
k can be calculated more precisely and easily. After applying the modified Perona-Malik method, we
obtain an image with less noise, without removing image features such as edges, which are
essential for blur kernel estimation.
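The noise-reduction step of Eqs. (3) to (9) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: the conduction coefficients are evaluated on the absolute nearest-neighbor differences (the usual discrete approximation), borders are replicated, and the default `alpha`, the small floor on k, and all function names are hypothetical.

```python
import numpy as np

def tukey_g(x, k):
    """Edge-stopping function of Eq. (6); zero beyond the threshold k*sqrt(5)."""
    s = k * np.sqrt(5.0)
    g = 0.67 * (1.0 - (x / s) ** 2) ** 2
    return np.where(np.abs(x) <= s, g, 0.0)

def neighbor_differences(img):
    """Nearest-neighbor differences of Eq. (4), with replicated borders."""
    p = np.pad(img, 1, mode="edge")
    d_n = p[:-2, 1:-1] - img   # I[i-1, j] - I[i, j]
    d_s = p[2:, 1:-1] - img    # I[i+1, j] - I[i, j]
    d_e = p[1:-1, 2:] - img    # I[i, j+1] - I[i, j]
    d_w = p[1:-1, :-2] - img   # I[i, j-1] - I[i, j]
    return d_n, d_s, d_e, d_w

def adaptive_perona_malik(img, n_iter=5, lam=0.25, alpha=1.0):
    """Perona-Malik diffusion with k chosen from the gradient variance."""
    out = img.astype(float).copy()
    for _ in range(n_iter):
        d_n, d_s, d_e, d_w = neighbor_differences(out)
        # Eqs. (7)-(8): directional differences and their combined magnitude.
        grad = np.sqrt((d_n - d_s) ** 2 + (d_e - d_w) ** 2)
        # Eq. (9): diffusion rate from the gradient variance
        # (the floor is an assumption that avoids division by zero on flat images).
        k = max(alpha * np.var(grad), 1e-8)
        flux = sum(tukey_g(np.abs(d), k) * d for d in (d_n, d_s, d_e, d_w))
        out = out + lam * flux  # Eq. (3)
    return out
```

Because each pixel exchanges equal and opposite flux with its neighbors, the mean intensity is preserved while local variations below the threshold $k\sqrt{5}$ are smoothed.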
2.2 Shock Filter
The shock filter, introduced by Osher and Rudin [13], is used to restore salient edges. One of
its disadvantages is that it enhances remnant noise. Money and Kang [6] used a shock filter
to find sharp edges and estimated a blur kernel. Weickert [14] introduced an
enhanced version of shock filters called coherence-enhancing shock filters. We
used this method in our research.
The basis of the shock filter is the transfer of gray values to the edge from both
sides by applying morphological operations to the image so as to satisfy the conditions
of the differential equation. The two main operations in image morphology are dilation
and erosion.
The shock filter uses the sign function, which takes values in {−1, 0, +1}, to select
between the two states (dilation and erosion). Applying such a method creates a severe
discontinuity, called a shock, at the boundary between the two zones of influence. We
use a Gaussian filter to smooth the image and solve the shock filter equation:
$$\frac{\partial I_s}{\partial t} = -\operatorname{sgn}(\Delta I_s)\,\|\nabla I_s\| \tag{10}$$
where $\Delta I_s$ and $\nabla I_s$ are the Laplacian and gradient of $I_s$, respectively. $I_s$ is the filtered
image, which results from the following equation:
$$I_s = G_\sigma \otimes I_p \tag{11}$$
where $I_p$ is the image after the de-noising step and $G_\sigma$ is a Gaussian filter with
standard deviation $\sigma$. $\sigma$ determines the size of the resulting patterns. Often $\sigma$ is
chosen in the range between 0.5 and 2 pixel units. It is the main parameter of the
method and has a strong impact on the result.
If the right edges are not selected, the estimated blur kernel will be less
accurate. Several modifications have been proposed in order to improve the performance of shock filters. For instance, replacing $\nabla I_s$ with other expressions can give
a better edge detector.
The shock filter and the Perona-Malik method are both iterative processes,
so we need to define the number of iterations. Furthermore, it has been shown that a larger
number of salient edges does not always lead to more accurate estimates. The impact of
the number of iterations is shown in Fig. 3.
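The dilation/erosion reading of the shock filter can be sketched in NumPy as shown next. This is a simplified discrete morphological variant, assuming a 3 × 3 structuring element and a Laplacian computed on a Gaussian-smoothed copy of the image; it is not Weickert's coherence-enhancing filter, and the function names and default parameters are hypothetical.

```python
import numpy as np

def gaussian_smooth(img, sigma):
    """Separable Gaussian filter G_sigma with replicated borders (Eq. 11)."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    g /= g.sum()
    p = np.pad(img, radius, mode="edge")
    # Filter rows, then columns; "valid" mode removes the padding again.
    p = np.apply_along_axis(lambda r: np.convolve(r, g, mode="valid"), 1, p)
    p = np.apply_along_axis(lambda c: np.convolve(c, g, mode="valid"), 0, p)
    return p

def laplacian(img):
    """Five-point Laplacian with replicated borders."""
    p = np.pad(img, 1, mode="edge")
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * img

def shock_filter(img, n_iter=5, sigma=1.0):
    """Dilate where the smoothed Laplacian is negative, erode where it is positive."""
    out = img.astype(float).copy()
    h, w = out.shape
    for _ in range(n_iter):
        sign = np.sign(laplacian(gaussian_smooth(out, sigma)))
        p = np.pad(out, 1, mode="edge")
        # Stack the nine shifted copies that make up each 3x3 neighborhood.
        stack = np.stack([p[i:i + h, j:j + w] for i in range(3) for j in range(3)])
        dilated, eroded = stack.max(axis=0), stack.min(axis=0)
        out = np.where(sign < 0, dilated, np.where(sign > 0, eroded, out))
    return out
```

On a smooth ramp, each iteration pushes gray values toward the edge from both sides, so the transition steepens while the output stays within the input range.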
2.3 Edge Selection
In order to obtain useful edges, Xu and Jia [15] considered an h × h window
centered at pixel x and moving over all parts of the blurred image. We can then obtain a
criterion for choosing the correct gradients as follows:
$$r(x) = \frac{\sum_{y \in N_h(x)} \nabla B(y)}{\sum_{y \in N_h(x)} \|\nabla B(y)\| + 0.5} \tag{12}$$
B is the blurred image and $N_h(x)$ is the window mentioned above. The numerator is the
sum of the gradients in the window centered at x, and the denominator is the sum of their
absolute values, which gives an estimate of the structure of the image in that window. Flat areas of the image, where
the pixel differences are negligible, and areas where the pixel sharpness is
high (such as an impulse) have small r(x) values, because the gradients cancel
each other out. Note that the above equation is evaluated separately for the x
and y coordinates (the derivatives in the two directions). The constant 0.5 was used for gray levels in [0,
1]; for gray levels in [0, 255], 20 can be used instead of 0.5.
Fig. 3 The output of the shock filter. (a) Input image (b) shock filter iteration = 5 (c) iteration = 50
(d) iteration = 150 (e) iteration = 250
The absolute value is:
$$r(x) = \sqrt{(r_x)^{2} + (r_y)^{2}} \tag{13}$$
Figure 4 shows the calculated r(x) for the depicted image.
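The following sketch computes the measure r(x) of Eqs. (12) and (13) on a grayscale image in [0, 1]. It assumes central-difference gradients (`np.gradient`), zero padding at the borders, and an odd window size h; the names `box_sum` and `usefulness_map` are hypothetical.

```python
import numpy as np

def box_sum(a, h):
    """Sum of `a` over an h x h window centered at each pixel (zero padding)."""
    r = h // 2
    p = np.pad(a, r, mode="constant")
    rows, cols = a.shape
    return sum(p[i:i + rows, j:j + cols] for i in range(h) for j in range(h))

def usefulness_map(B, h=5, eps=0.5):
    """Return |r(x)| of Eq. (13) together with its x and y components (Eq. 12)."""
    gy, gx = np.gradient(B.astype(float))      # derivatives along rows (y) and columns (x)
    denom = box_sum(np.hypot(gx, gy), h) + eps  # sum of gradient magnitudes + 0.5
    rx = box_sum(gx, h) / denom                 # signed sums, so opposite gradients cancel
    ry = box_sum(gy, h) / denom
    return np.hypot(rx, ry), rx, ry
```

For a step edge the signed gradients add up and r(x) approaches 1 near the edge, whereas for a one-pixel-wide line the gradients on its two sides cancel and r(x) drops to zero at the line center, which matches the behavior described for impulses.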
We have the phase as follows:
$$\theta = \arctan\left( \frac{r_x}{r_y} \right) \tag{14}$$
where $\theta \in [-\pi/2, \pi/2]$. The r values were then sorted in descending order into four groups
according to $\theta$: $[-\pi/2, -\pi/4)$, $[-\pi/4, 0)$, $[0, \pi/4)$, and $[\pi/4, \pi/2]$. Then the threshold value was defined to ensure that a
minimum number of pixels is selected in each group:
$$\tau_r = 0.5\sqrt{P_I P_K} \tag{15}$$
where $P_I$ is the total number of pixels in the input image and $P_K$ is the total number
of pixels in the kernel.
Using the Heaviside function H(·), the threshold is applied:
$$M = H(r - \tau_r) \tag{16}$$
Another threshold was defined, which works with the gradient magnitude.
The selected edges are determined as:
$$\tau_s = 2\sqrt{P_K} \tag{17}$$
$$\nabla I_s = \nabla I_{sh} \cdot H\left( M \|\nabla I_{sh}\| - \tau_s \right) \tag{18}$$
Fig. 4 (a) Input image (b) Calculated r(x)
$I_{sh}$ is the shock-filtered image, and $\tau_s$ is the threshold that guarantees that at least
$2\sqrt{P_K}$ pixels participate in kernel estimation in each group. It also excludes segments
depending on $M \|\nabla I_{sh}\|$.
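One possible reading of Eqs. (14) to (18) is sketched below. It assumes that each threshold is chosen as the largest value that still keeps the stated minimum number of pixels in every angular group, as the surrounding text describes; this interpretation, the square-kernel assumption ($P_K$ = `kernel_size` squared), and the function names are assumptions made for illustration.

```python
import numpy as np

def threshold_for_min_count(values, groups, min_count):
    """Largest threshold that keeps at least `min_count` pixels in every group."""
    tau = np.inf
    for g in np.unique(groups):
        v = np.sort(values[groups == g])[::-1]          # descending order
        tau = min(tau, v[min(min_count, v.size) - 1])
    return tau

def select_edges(r, rx, ry, grad_x, grad_y, kernel_size):
    """Mask M (Eq. 16) and thresholded shock-filtered gradients (Eq. 18)."""
    P_I, P_K = r.size, kernel_size ** 2
    safe_ry = np.where(ry == 0, 1e-12, ry)               # avoid division by zero
    theta = np.arctan(rx / safe_ry)                      # Eq. (14)
    # Four angular groups: [-pi/2,-pi/4), [-pi/4,0), [0,pi/4), [pi/4,pi/2].
    groups = np.digitize(theta, [-np.pi / 4, 0.0, np.pi / 4])
    n_r = max(1, int(0.5 * np.sqrt(P_I * P_K)))          # count from Eq. (15)
    tau_r = threshold_for_min_count(r, groups, n_r)
    M = (r >= tau_r).astype(float)                       # Eq. (16)
    mag = M * np.hypot(grad_x, grad_y)
    n_s = max(1, int(2 * np.sqrt(P_K)))                  # count from Eq. (17)
    tau_s = threshold_for_min_count(mag, groups, n_s)
    keep = (mag >= tau_s) & (mag > 0)                    # Eq. (18)
    return grad_x * keep, grad_y * keep, M
```

Here `grad_x` and `grad_y` stand for the x and y derivatives of the shock-filtered image $I_{sh}$, and `r`, `rx`, `ry` for the quantities of Eqs. (12) and (13).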
We have now calculated the required edges. The next step is the blur kernel estimation. Our
target is the kernel k. Since this problem is ill-posed, we need
regularization terms to solve it correctly. Following Xu and Jia [15], the problem is
modeled as:
$$E(k) = \|\nabla I_s \otimes k - \nabla B\|^{2} + \gamma \|k\|^{2} \tag{19}$$
To solve this problem, we separate the dimensions and express the convolution in
matrix form. The convolution can be carried out by flipping both the rows and columns of the
image and then multiplying locally similar entries and summing:
$$E(k) = \|A_x k - \nabla_x B\|^{2} + \|A_y k - \nabla_y B\|^{2} + \gamma \|k\|^{2} \tag{20}$$
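Taking the first-order derivative gives:
$$\frac{\partial E(k)}{\partial k} = 2A_x^{T}(A_x k - \nabla_x B) + 2A_y^{T}(A_y k - \nabla_y B) + 2\gamma k \tag{21}$$
We set Eq. (21) equal to zero and then apply the Fast Fourier Transform
(FFT) to all variables:
$$(A_x^{T} A_x + A_y^{T} A_y + \gamma)\,k = A_x^{T} \nabla_x B + A_y^{T} \nabla_y B \tag{22}$$
Using Parseval's theorem:
$$k = F^{-1}\left( \frac{\overline{F(\partial_x I_s)}\,F(\partial_x B) + \overline{F(\partial_y I_s)}\,F(\partial_y B)}{|F(\partial_x I_s)|^{2} + |F(\partial_y I_s)|^{2} + \gamma} \right) \tag{23}$$
where $F(\cdot)$ and $F^{-1}(\cdot)$ denote the FFT and inverse FFT, respectively, and $\overline{F(\cdot)}$ is the
complex conjugate operator. The blur kernel is thus restored with Eq. (23).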
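The closed-form kernel estimate of Eq. (23) can be sketched with NumPy FFTs. The sketch assumes circular boundary conditions and forward differences for the derivatives, and it adds a final clip-and-normalize step that is a common practical choice but is not stated in the text; the helper names and the default `gamma` are hypothetical.

```python
import numpy as np

def psf_to_otf(psf, shape):
    """Zero-pad a small filter to `shape`, center it at the origin, and take its FFT."""
    pad = np.zeros(shape)
    kh, kw = psf.shape
    pad[:kh, :kw] = psf
    pad = np.roll(pad, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.fft.fft2(pad)

def grad_fft(img):
    """FFTs of circular forward differences along x (columns) and y (rows)."""
    dx = np.roll(img, -1, axis=1) - img
    dy = np.roll(img, -1, axis=0) - img
    return np.fft.fft2(dx), np.fft.fft2(dy)

def estimate_kernel(Is, B, ksize, gamma=1e-3):
    """Evaluate Eq. (23) and crop the result to a ksize x ksize kernel."""
    Fx_s, Fy_s = grad_fft(Is)
    Fx_b, Fy_b = grad_fft(B)
    num = np.conj(Fx_s) * Fx_b + np.conj(Fy_s) * Fy_b
    den = np.abs(Fx_s) ** 2 + np.abs(Fy_s) ** 2 + gamma
    k_full = np.real(np.fft.ifft2(num / den))
    # Undo the origin centering, then crop the kernel support.
    r = ksize // 2
    k = np.roll(k_full, (r, r), axis=(0, 1))[:ksize, :ksize]
    # Assumed post-processing: clip negative entries and normalize to unit sum.
    k = np.maximum(k, 0.0)
    return k / k.sum()
```

As a usage example, blurring a sharp test image with a known kernel through `psf_to_otf` and passing the result to `estimate_kernel` should recover that kernel up to a small regularization error.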
To restore the image, we need to model the ill-posed problem again, but this time we use a
spatial prior:
$$E(I) = \|I \otimes k - B\|^{2} + \lambda \|\nabla I - \nabla I_s\|^{2} \tag{24}$$
where $\|\nabla I - \nabla I_s\|^{2}$ is the new prior, which restores the selected sharp edges properly. Using the
same approach as before gives:
$$I = F^{-1}\left( \frac{\overline{F(k)}\,F(B) + \lambda\,\overline{F(\partial_x)}\,F(I_s^{x}) + \lambda\,\overline{F(\partial_y)}\,F(I_s^{y})}{\overline{F(k)}\,F(k) + \lambda\,\overline{F(\partial_x)}\,F(\partial_x) + \lambda\,\overline{F(\partial_y)}\,F(\partial_y)} \right) \tag{25}$$
Fig. 5 (a) Blurred input (b) γ = 15, λ = 0.005 (c) γ = 15, λ = 0.05 (d) γ = 15, λ = 0.5
(e) γ = 15, λ = 5 (f) γ = 5, λ = 0.005 (g) γ = 10, λ = 0.005 (h) γ = 20, λ = 0.005 (i) γ = 30,
λ = 0.005
I is the latent image, and we still need a non-blind deconvolution technique to
restore the detailed image. Various methods for reaching the final image have been developed,
and we used the method of Cho and Lee [1].
The effect of the λ and γ values is illustrated in Fig. 5.
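The latent-image update with the spatial prior, Eq. (25), can be sketched in the same way. The sketch assumes circular boundary conditions and forward-difference derivative filters; `Isx` and `Isy` stand for the x and y components of the selected edge map $\nabla I_s$, and the function names are hypothetical.

```python
import numpy as np

def psf_to_otf(psf, shape):
    """Zero-pad a small filter to `shape`, center it at the origin, and take its FFT."""
    pad = np.zeros(shape)
    kh, kw = psf.shape
    pad[:kh, :kw] = psf
    pad = np.roll(pad, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.fft.fft2(pad)

def restore_latent(B, kernel, Isx, Isy, lam=0.005):
    """Closed-form minimizer of Eq. (24), evaluated as in Eq. (25)."""
    K = psf_to_otf(kernel, B.shape)
    # Transfer functions of the circular forward-difference filters.
    d = np.zeros(B.shape)
    d[0, 0] = 1.0
    Dx = np.fft.fft2(np.roll(d, -1, axis=1) - d)
    Dy = np.fft.fft2(np.roll(d, -1, axis=0) - d)
    num = (np.conj(K) * np.fft.fft2(B)
           + lam * (np.conj(Dx) * np.fft.fft2(Isx) + np.conj(Dy) * np.fft.fft2(Isy)))
    den = np.abs(K) ** 2 + lam * (np.abs(Dx) ** 2 + np.abs(Dy) ** 2)
    return np.real(np.fft.ifft2(num / den))
```

When the edge map equals the true image gradients and there is no noise, this update returns the original image, which is a convenient sanity check for the formula.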
3 Discussion and Results
The parameters in the calculations play an important role in predicting the blur
kernel. For example, if the threshold values selected for the function r(x) and for the
final edges are either too large or too small, the image will be smoothed, and therefore
important edges will not be selected for kernel estimation. In this paper, we
attempted to improve performance by selecting these values automatically.
Figure 6 depicts the effect of these values on the estimated kernel.
We also tried our algorithm on images that contain text, such as Fig. 7.
Fig. 6 (a) Output image and estimated kernel with γ = 15 (b) Output image and estimated kernel
with γ = 5 (c) Output image and estimated kernel with γ = 1
Fig. 7 Deblurring an image with text (a) blurred input (b) Perona-Malik output (c) deconvolution
output
Our algorithm was implemented in MATLAB R2016a on a 1.8 GHz AMD A10 6th-generation CPU, and the image restoration durations are reported in Table 1.
Table 1 Calculation speed
Image                      Restoration       Iterations
                           duration (sec)    Perona-Malik    Shock filter    Coarse to fine
Vessels (Fig. 1)           22.5              5               8               7
Foot (Fig. 3)              31.8              5               8               7
Arm (Fig. 5)               42                5               8               7
Brain (Fig. 6)             27.4              5               8               7
Faculty façade (Fig. 7)    38                5               8               7
4 Summary
Image processing has improved dramatically in recent decades. The rate of
development has increased with the advent of more advanced machine vision
technologies in daily life. Medical imaging, as one of the pillars of the modern
medical diagnosis system, also benefits from this technology.
Different imaging methods have different sensitivities to noise, camera movement, beam source, and other factors. Blur degrades these
images. For example, a slight movement in an MRI or x-ray machine results in
blurry images. Figure 1 is used to detect blockage of the vein, which results in
relative blindness. Therefore, these images must be accurate so that the physician can diagnose the disease with less error. The current method, in contrast to
conventional methods, can compute the blur kernel and help to reduce the costs of
re-imaging by restoring the original image. Proper edges and reduced initial noise
in blurry images lead to an accurate estimation of the blur kernel. According to the
results, using nonlinear noise reduction methods increases accuracy. The method
of Perona and Malik has basic parameters that are selected by the user.
Choosing these parameters automatically reduces error and leads to optimal results.
The next factor in the accuracy of the blur kernel, after noise reduction, is the selection of
appropriate edges as input to the estimator function. Shock filters introduced by
Osher and Rudin [13] perform better than other methods, such as Canny. Our
iterative algorithm modifies itself at every step and results in a clearer
output.
Local deblurring is one of the accurate approaches that lead to clear images. In
addition, a fast algorithm for shift-variant blur models is needed in future
work.
Acknowledgements The authors appreciate those who contributed to make this research successful. This research is supported by Center for Research and Innovation (PPPI) and Faculty of
Engineering, Universiti Malaysia Sabah (UMS) under the Research Grant (SBK0393-2018).
References
1. Cho S, Lee S (2009) Fast motion deblurring. ACM Trans Graph (TOG) 28(5):145
2. Fergus R, Singh B, Hertzmann A, Roweis ST, Freeman WT (2006) Removing camera shake
from a single photograph. ACM Trans Graph (TOG) 25(3):787–794
3. Yadav S, Jain C, Chugh A (2016) Evaluation of image deblurring techniques. Int J Comput
Appl 139(12):32–36
4. Shan Q, Jia J, Agarwala A (2008) High-quality motion deblurring from a single image. ACM
Trans Graph (TOG) 27(3)
5. Levin A, Weiss Y, Durand F, Freeman WT (2009) Understanding and evaluating blind
deconvolution algorithms. In: IEEE conference on computer vision and pattern recognition,
pp 1964–1971
6. Money J, Kang S (2008) Total variation minimizing blind deconvolution with shock filter
reference. Image Vis Comput 26(2):302–314
7. Alvarez L, Mazorra L (1994) Signal and image restoration using shock filters and anisotropic
diffusion. SIAM J Numer Anal 31(2):590–605
8. Xu L, Zheng S, Jia J (2013) Unnatural l0 sparse representation for natural image deblurring.
In: Computer vision and pattern recognition, pp 1107–1114
9. Chen T, Huang TS, Yin W, Zhou XS (2005) A new coarse-to-fine framework for 3D brain
MR image registration. In: International workshop on computer vision for biomedical image
applications, pp 114–124. Springer, Heidelberg, October 2005
10. Perona P, Malik J (1990) Scale-space and edge detection using anisotropic diffusion. IEEE
Trans Pattern Anal Mach Intell 12(7):629–639
11. Black MJ, Sapiro G, Marimont DH, Heeger D (1998) Robust anisotropic diffusion. IEEE
Trans Image Process 7(3):421–432
12. Hasanpor H, Nikpour M (2008) Using adaptive diffusion coefficient to eliminate image noise
using partial equations. Iranian J Electr Comput Eng 6(4)
13. Osher S, Rudin LI (1990) Feature-oriented image enhancement using shock filters. SIAM J
Numer Anal 27(4):919–940
14. Weickert J (2003) Coherence-enhancing shock filters. In: Joint pattern recognition
symposium. Springer, Berlin, pp 1–8
15. Xu L, Jia J (2010) Two-phase kernel estimation for robust motion deblurring. In: European
conference on computer vision. Springer, Berlin, pp 157–170
A Fast and Efficient Segmentation
of Soil-Transmitted Helminths Through
Various Color Models and k-Means
Clustering
Norhanis Ayunie Ahmad Khairudin, Aimi Salihah Abdul Nasir,
Lim Chee Chin, Haryati Jaafar, and Zeehaida Mohamed
Abstract Soil-transmitted helminths (STH) are one of the causes of health problems
in children and adults. Given the large number of helminthiasis cases that have been
diagnosed, a productive system is required for the identification and classification of
STH to safeguard public health. This paper presents a fast and
efficient method to segment two types of STH, Ascaris Lumbricoides Ova (ALO) and
Trichuris Trichiura Ova (TTO), based on the analysis of various color models.
Firstly, the ALO and TTO images are enhanced using the modified global contrast
stretching (MGCS) technique, followed by the extraction of color components from
various color models. In this study, segmentation based on various color models such
as RGB, HSV, L*a*b and NTSC has been used to identify, simplify and extract the
particular color needed. Then, k-means clustering is used to segment the color
component images into three cluster regions, namely the target (helminth eggs),
unwanted and background regions. Then, additional processing steps are applied to
the segmented images to remove the unwanted regions and to restore
the information of the images. The proposed techniques have been evaluated on 100
images of ALO and TTO. The results show that the saturation component of the HSV color
model is the most suitable color component to be used with the k-means clustering
technique on ALO and TTO images, achieving a segmentation performance of
99.06% for accuracy, 99.31% for specificity and 95.06% for sensitivity.
Keywords Soil-transmitted helminths · Color models · k-Means clustering · Modified global contrast stretching
N. A. A. Khairudin (✉) · A. S. A. Nasir · H. Jaafar
Faculty of Engineering Technology, Universiti Malaysia Perlis, UniCITI Alam Campus,
Sungai Chuchuh, 02100 Padang Besar, Perlis, Malaysia
e-mail: hanisayunie@yahoo.com
L. C. Chin
School of Mechatronic Engineering, University Malaysia Perlis, Pauh Putra Campus,
02600 Arau, Perlis, Malaysia
Z. Mohamed
Department of Microbiology and Parasitology, School of Medical Sciences, Health Campus,
Universiti Sains Malaysia, 16150 Kubang Kerian, Kelantan, Malaysia
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_39
1 Introduction
Soil-transmitted helminths (STH) are a group of intestinal parasitic worms that
affect humans through contact with larvae or ingestion of infective eggs. Human
infections are common in underprivileged communities where overcrowding, poor environmental sanitation and lack of access to clean and safe water
are prevalent [1, 2].
The STH eggs most commonly found in the human body are Ascaris
Lumbricoides ova (ALO) and Trichuris Trichiura ova (TTO). STH inhabit the
intestine, liver, lungs and blood vessels of their hosts, while the adult worms inhabit
the intestine to mate and release their eggs in feces [3], from which they spread into the soil.
The eggs are microscopic, and their sizes vary by species [4].
Helminth eggs can remain viable for 1 to 2 months in crops and many months in
soil, freshwater and sewage [5]. They can remain viable for several years in feces,
night soil, sludge and wastewater. STH eggs can be transmitted to the human body
through direct contact with polluted sludge or fecal material, exposure to contaminated food, water and also from an animal body or their fur [6].
These parasites can multiply in the human body and this could lead to a serious
illness such as filariasis and cysts. They also might increase the susceptibility to
other illnesses such as tuberculosis, malaria and HIV infection. For children, the
STH infection may cause malnutrition, education deficits and intellectual retardation [7, 8]. Studies have shown such infections have a high consequence on school
performance and attendance and future economic productivity [9].
In 2016, around 2.5 billion people around the world were affected by
helminthiases, and over 530 million children, representing 63% of the
world's total, were treated [10]. Given the high number of helminthiasis cases,
the identification and classification of the types of helminth eggs is
of paramount importance in the healthcare industry.
Early diagnosis is fundamental for patient recovery, especially in children.
Helminth eggs can be diagnosed through patients' stool, blood and tissue samples.
Parasitologists need to examine these samples in fresh condition within a limited
time. Problems occur because the procedures take a great amount of time and the
observer must maintain good concentration while observing the samples [11]. The results
obtained are often neither accurate nor reliable. These limitations have motivated
improvements in helminth egg recognition using digital
image processing and computer algorithms.
Hadi et al. [12] used the median filter twice to reduce the artifacts and noises in
the image while edge enhancement based on sharpness and edge detection with
canny filter have been used to detect the edge of the hard sharp objects. Threshold
with Logical Classification Method (TLCM) has been proposed for the automatic
identification process by using shape, shell smoothness and size of the eggs as
features in the feature extraction process. The classifying accuracy obtained for
ALO species is 93% while TTO species is 94%.
Then, Suzuki et al. [13] identified 15 types of human intestine parasites through
a system that automatically segmented and classified the human intestinal parasites
from microscopy images. The proposed system explores image foresting transform
and ellipse matching for segmentation and optimum-path forest classifier for object
recognition. This system obtained 90.38% for sensitivity, 98.19% for efficiency and
98.32% for specificity.
Kamarul et al. [14] proposed a new classification using Filtration with Steady
Determinations Thresholds System (F-SDTS) classifier. This classifier is applied in
the feature extraction stage by using the ranges of feature values as a database to
identify and classify the type of parasite. The overall success rate for this classification system is 94%.
Jimenez et al. [11] proposed a system that identifies and quantifies seven species
of helminth eggs in wastewater. Gray-scale profile segmentation is used to identify
the shape and thus to differentiate genera and species of the helminth eggs. The
system shows a specificity of 99% and a sensitivity of 80% to 90%.
The systems proposed by previous researchers showed improvements in the
identification and classification of human intestinal parasites. However, the segmentation stage can still be improved in order to achieve efficient results. One
such improvement is to manipulate the color conversion of an image to differentiate the features of helminths from the artifacts. This suggestion is based on the outcomes obtained when color conversion is applied to
images in other medical studies, such as cancer, cyst, leukemia and malaria [15–20].
Ghimire and Lee [15] applied the HSV color model to images, keeping the H and S
components unchanged and using only the V component of the HSV color image to
prevent changes in the color balance among the HSV components. The color of the
enhanced image is not altered because H and S are not changed. The proposed
method obtained a better image compared to other methods such as histogram
equalization and the adaptive and integrated neighborhood-dependent approach for nonlinear
enhancement (AINDANE).
Kulkarni et al. [16] applied color conversion after the pre-processing step in order to recognize Acute Lymphoblastic Leukemia (ALL) images. The RGB color space is converted into the HSV color space to reduce the color dimension from three to two. The Saturation (S) plane is selected because it shows better contrast than the Hue (H) and Value (V) components. Otsu's thresholding method is used for segmentation and is able to segment the ALL cells into two parts: nucleus and cytoplasm.
Poostchi et al. [17] listed RGB, HSV, YCbCr, LAB and intensity as color features when they analyzed feature computation for classifying malaria parasites in both thin and thick blood smears. Color features are the most natural choice for stained parasites, both to acquire information and to describe the morphological features in red blood cells.
The usability of color models in image processing has been analyzed by Sharma and Nayyer [18]. Color components provide a rational way to specify, order, manipulate and effectively display the color of the object under consideration. Thus, the selected color model should be appropriate for the problem statement and its solution. The process of selecting the best color representation involves knowing how color signals are generated and what information is needed from these signals. Color models are widely used to facilitate the specification of color in some standard, generally accepted way.
Aris et al. [19] have analyzed color components in color spaces to improve the
counting performance of malaria parasites based on thick blood smear images. Y,
Cb, R, G, C, M, S and L components have been extracted from YCbCr, RGB,
CMY, HSV and HSL color models in order to identify which color component
shows the most accurate counting for malaria parasites. Based on results obtained,
Y component of YCbCr shows the best segmentation result with 98.48% of average
counting accuracy for 100 images of malaria thick blood smear.
A new method that exchanges color components across different color spaces for image segmentation has been proposed by Dai and Li [20] in order to segment hematocyte images. The method exchanges the order of the color components after they have been extracted from the original image. The newly formed image is then segmented using Otsu thresholding and region segmentation techniques. The proposed method can differentiate the segmentation targets of a hematocyte image, namely the nucleus and cytoplasm of hematocytes, erythrocytes and leukocytes, from the background. However, the method is unsuitable for sample images that have different staining methods and magnifications.
Based on the previous studies, it can be seen that color models play a major role in improving the segmentation performance of an image. Therefore, this study explores the potential of various color components for the segmentation process in order to improve STH segmentation performance.
2 Methodology
Most researchers have focused on segmentation and classification techniques to achieve the most accurate results. However, the most crucial part lies in the pre-processing step, because it affects every subsequent processing step. In this paper, several color models are applied to the enhanced images in order to identify which color component is the most suitable for segmenting the ALO and TTO images. The methodological steps for segmenting these images are explained in this section.
2.1 Image Acquisition
The STH samples are acquired from stool samples of helminthiasis patients. The samples of ALO and TTO are obtained from the Department of Microbiology and Parasitology, Hospital Universiti Sains Malaysia (HUSM). The stool samples are freshly prepared on slides and analyzed under 40X magnification using a Leica DLMA digital microscope. Normal saline is used as the staining medium to obtain a clear view of the eggs. In this study, 100 images of each species, ALO and TTO, have been captured and saved in .jpg format.
2.2 Image Enhancement Technique Using Modified Global Contrast Stretching (MGCS)
The acquired samples may differ in luminance, which needs to be standardized. The differences are caused by the color of the stool sample or by the lighting from the microscope. To standardize the luminance, a contrast enhancement technique, namely modified global contrast stretching (MGCS), is used [21]. This technique standardizes the lighting in the image and improves the quality of the targeted image.
One advantage of the MGCS technique is its ability to enhance the contrast of the image without affecting the color structure of the original image. In addition, it preserves as much of the information in the original image as possible. MGCS is adapted from global contrast stretching (GCS) and overcomes the weakness of GCS by adjusting the minimum and maximum values of the R, G and B components, which are obtained through a calculation based on the total number of pixels in the image. The original equation of GCS is shown in Eq. (1) [22].
\[ out_{RGB}(x, y) = 255 \times \frac{in_{RGB}(x, y) - min_{RGB}}{max_{RGB} - min_{RGB}} \tag{1} \]
Several parameters are required in order to obtain the new minimum and maximum values: the minimum percentage, minp; the maximum percentage, maxp; the number of pixels at each pixel level, Tpix; the total number of pixels that lie within the specified minimum percentage, Tmin; and the total number of pixels that lie within the specified maximum percentage, Tmax. The procedure for the MGCS technique is as follows [22]:
1. Select the preferred values for minp and maxp.
2. Initialize Tmin = 0 and Tmax = 0. Set the value of k = 0, where k is the current
pixel level.
3. Estimate the histogram for the red component.
4. Find the number of pixels, Tpix[k], at level k. If Tpix[k] ≥ 1, set Tmin = Tmin + Tpix[k].
5. Check the following condition:
\[ \frac{T_{min}}{\text{total number of pixels in image}} \times 100 \ge min_p \tag{2} \]
6. If Tmin fulfills Eq. (2), set the new minimum value, Nmin, for the red component in the image to the k value that satisfies this condition; else set k = k + 1.
7. Repeat steps 4 to 6 for the next pixel levels until Nmin is obtained based on the k value that satisfies Eq. (2).
8. Set the value of k = 255.
9. Find Tpix[k] at level k. If Tpix[k] ≥ 1, set Tmax = Tmax + Tpix[k].
10. Check the following condition:
\[ \frac{T_{max}}{\text{total number of pixels in image}} \times 100 \ge max_p \tag{3} \]
11. If Tmax satisfies Eq. (3), set the new maximum value, Nmax, for the red component in the image to the k value that satisfies this condition; else set k = k − 1.
12. Repeat steps 9 to 11 for the next pixel levels until Nmax is obtained based on the k value that satisfies Eq. (3).
13. Repeat steps 2 to 12 in order to calculate the Nmin and Nmax for the green and
blue components.
14. Nmin and Nmax then are used to replace the original min and max in the GCS
formula in Eq. (1).
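The procedure above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the default percentages for minp and maxp, the rounding, and the clipping of stretched values to the range 0 to 255 are all choices made here for the example. The percentage test is written in cross-multiplied form to avoid floating-point rounding at the threshold.

```python
import numpy as np

def mgcs_limits(channel, minp, maxp):
    """Return (Nmin, Nmax) for one 8-bit channel, following steps 2 to 12."""
    hist = np.bincount(channel.ravel(), minlength=256)  # Tpix[k] for k = 0..255
    total = channel.size
    # Steps 2-7: accumulate pixels upward from k = 0 until minp percent is reached.
    tmin, nmin = 0, 0
    for k in range(256):
        tmin += hist[k]
        if tmin * 100 >= minp * total:  # same test as Tmin / total * 100 >= minp
            nmin = k
            break
    # Steps 8-12: accumulate pixels downward from k = 255 until maxp percent is reached.
    tmax, nmax = 0, 255
    for k in range(255, -1, -1):
        tmax += hist[k]
        if tmax * 100 >= maxp * total:
            nmax = k
            break
    return nmin, nmax

def mgcs(image, minp=1.0, maxp=1.0):
    """Stretch each channel of a uint8 H x W x 3 image with its own Nmin and Nmax.

    The default minp and maxp are illustrative assumptions, not values from the paper.
    """
    out = np.empty_like(image)
    for c in range(3):  # step 13: repeat for R, G and B
        nmin, nmax = mgcs_limits(image[..., c], minp, maxp)
        if nmax <= nmin:  # flat channel: nothing to stretch
            out[..., c] = image[..., c]
            continue
        # Step 14: the GCS formula with Nmin and Nmax in place of min and max.
        stretched = 255.0 * (image[..., c].astype(np.float64) - nmin) / (nmax - nmin)
        # Assumption: values outside [Nmin, Nmax] are clipped to the 8-bit range.
        out[..., c] = np.clip(np.rint(stretched), 0, 255).astype(np.uint8)
    return out
```

Because each channel is stretched between its own Nmin and Nmax rather than its absolute extremes, a few very dark or very bright pixels no longer dictate the stretch.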
2.3 Color Conversion of STH Image Using Various Color Models
Color conversion identifies the colors that are present in an image. A color model is generally built from a 3D coordinate system and a subspace in which each color is represented by a single point [22]. In image processing, a color model is used to identify, simplify, extract and edit the particular color needed.
Various color models such as RGB (Red, Green, Blue), HSV (Hue, Saturation, Value) and L*a*b* are used in applications such as cell detection, lane detection, face detection and many more. Sharma et al. [23] stated that a color space provides a rational way to effectively display the color of the objects under consideration.
Fig. 1 Results of R, G and B components on STH image: (a) enhanced image, (b) R component, (c) G component, (d) B component
RGB Color Model. The RGB color model is based on the theory that all visible colors can be created from the primary colors red, green and blue [22]. This color model is commonly used to recognize, represent and display images in electronic systems such as televisions, computers and photography. Figure 1 shows the results of the RGB color model on an STH image. The R, G and B components are all suitable for use on STH images.
HSV Color Model. The HSV model is built on the hue, saturation and value attributes. It is illustrated as a hex-cone, and its coordinate system is cylindrical. H describes the hue or true color in the image, while S represents the amount of white color in the image [24]: the higher the amount of white, the lower the saturation. V describes the degree of brightness, or luminance, in the image. The top of the HSV hex-cone is a projection along the main diagonal of the RGB color cube [25]. Figure 2 shows the hex-cone shape of HSV.
Hue is determined by the one or two largest of the R, G and B values, and H ranges from 0° to 360°. S can be controlled by varying the collective minimum value of R, G and B, whereas V is controlled by varying their magnitudes while keeping a constant ratio [23, 25].
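\[ H = \begin{cases} H_1, & \text{if } B \le G \\ 360^\circ - H_1, & \text{if } B > G \end{cases} \tag{4} \]
\[ S = \frac{\max(R, G, B) - \min(R, G, B)}{\max(R, G, B)} \tag{5} \]
\[ V = \frac{\max(R, G, B)}{255} \tag{6} \]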
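As an illustration of how the S and V planes can be extracted from an 8-bit RGB image, the following NumPy sketch assumes the usual definitions S = (max − min) / max and V = max / 255. The function name and the convention of setting S to 0 for pure black pixels (where the maximum is 0) are assumptions made for this example.

```python
import numpy as np

def saturation_value(rgb):
    """Return the S and V planes (both in [0, 1]) of a uint8 H x W x 3 RGB image."""
    rgb = np.asarray(rgb, dtype=np.float64)
    cmax = rgb.max(axis=-1)  # max(R, G, B) per pixel
    cmin = rgb.min(axis=-1)  # min(R, G, B) per pixel
    s = np.zeros_like(cmax)
    nonzero = cmax > 0       # assumption: S = 0 where max(R, G, B) = 0
    s[nonzero] = (cmax[nonzero] - cmin[nonzero]) / cmax[nonzero]
    v = cmax / 255.0
    return s, v
```

A gray pixel has equal R, G and B values and therefore S = 0, while a pure primary color has S = 1, which matches the description of saturation as the absence of white.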
The advantage of HSV is its conceptual simplicity: each of its attributes corresponds directly to a basic color concept. The disadvantage is that the saturation attribute corresponds to the mixture of a color with white (tinting), so desaturating a color increases its intensity [26]. In this paper, the S and V components are applied to the STH images. The H component is unsuitable for the STH images because it shows low contrast between the foreground and background, as can be seen in Fig. 3(b).
Fig. 2 Hex-cone shape of HSV color space
Fig. 3 Results of H, S and V components on STH image: (a) enhanced image, (b) H component, (c) S component, (d) V component
Fig. 4 Results of Y, I and Q components on STH image: (a) enhanced image, (b) Y component, (c) I component, (d) Q component
CIE 1976 L*a*b* Color Model. This color model is derived from CIE XYZ and is used to linearize the perceptibility of color differences. The Lab color space is designed to approximate human vision, with the L* component closely matching the human perception of lightness [27]. L* stands for luminosity, a* is the red-green axis and b* is the blue-yellow axis. CIE Lab is popular for measuring reflective and transmissive objects [25, 27].
NTSC Color Model. The National Television System Committee (NTSC) uses YIQ as its color space, in which the Y component represents the luma information while I and Q represent the chrominance information for the television receiver. Luminance can be obtained from a linear combination of the three primaries. Equation (7) shows the conversion from the RGB color space to the YIQ color space, while Eq. (8) shows the luminance formula determined by the colorimetry of the display system [28].
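\[ \begin{bmatrix} Y \\ I \\ Q \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ 0.5959 & -0.2746 & -0.3213 \\ 0.2115 & -0.5227 & 0.3112 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix} \tag{7} \]
\[ Y = 0.299R + 0.587G + 0.114B \tag{8} \]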
In this study, only the Y and I components are applied to the enhanced STH images. This is because Y and I are able to differentiate the foreground from the background in the image, whereas the foreground and background have the same color in the Q component. Figure 4 shows the results obtained from the NTSC color model based on the Y, I and Q components.
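The RGB to YIQ conversion is a single matrix product per pixel. The sketch below uses the standard NTSC coefficients; the function name and the scaling of the 8-bit input to the range 0 to 1 are assumptions made for this example.

```python
import numpy as np

# Standard NTSC RGB-to-YIQ matrix (rows give Y, I and Q).
RGB_TO_YIQ = np.array([
    [0.299,   0.587,   0.114],
    [0.5959, -0.2746, -0.3213],
    [0.2115, -0.5227,  0.3112],
])

def rgb_to_yiq(rgb):
    """Convert a uint8 H x W x 3 RGB image to YIQ.

    Assumption: the input is scaled to [0, 1] before the matrix is applied.
    """
    scaled = np.asarray(rgb, dtype=np.float64) / 255.0
    return scaled @ RGB_TO_YIQ.T  # applies the matrix to every pixel

# Usage: pick the planes used in this study.
# yiq = rgb_to_yiq(image); y_plane, i_plane = yiq[..., 0], yiq[..., 1]
```

For any gray pixel the I and Q rows sum to zero, so the chrominance vanishes and only the Y plane carries information.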
Arithmetic Between Color Models. The components of the color models are combined through addition and subtraction to increase the likelihood that the enhanced image is segmented accurately. Among the arithmetic combinations of color model components, two show a good improvement in differentiating the color components in the enhanced STH image. The first is the addition of the G component of the RGB color model to the Lab color model (GLab), Eq. (9). The second is the subtraction of the G component of the RGB color model from the S component of the HSV color model (SG), Eq. (10).
\[ GLab = G + Lab \tag{9} \]
\[ SG = S - G \tag{10} \]
2.4 Image Segmentation of STH Image Using k-Means Clustering
The main purpose of segmenting an STH image is to divide it into the region of interest and the background region. The segmentation process is important because it serves as the basic step for all subsequent analyses.
In this paper, k-means clustering is used to identify which color component gives the best STH segmentation result. The k-means algorithm assigns each data point to the cluster center with the shortest Euclidean distance. It is one of the most popular unsupervised clustering methods because of its simplicity [20]. The k-means clustering is built on minimizing the objective function, J, given in Eq. (11).
\[ J = \sum_{i=1}^{n} \sum_{j=1}^{k} \left\| x_i - c_j \right\|^2 \tag{11} \]
where n is the number of data points, k is the number of clusters, xi is the ith sample and cj is the center of the jth cluster. In this paper, three clusters are used for the segmentation process in order to differentiate between the target, unwanted and background regions.
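A minimal NumPy sketch of k-means clustering on a single color component is given below. It is an illustration only: the function names, the deterministic initialization (centers spread evenly between the minimum and maximum intensity) and the iteration limit are assumptions, and the paper does not state how its centers were initialized. The returned objective sums the squared distance of each sample to its assigned center.

```python
import numpy as np

def kmeans_1d(values, k=3, n_iter=50):
    """Cluster scalar intensities into k groups; return labels, centers, objective."""
    x = np.asarray(values, dtype=np.float64).ravel()
    # Assumption: centers start evenly spaced over the intensity range.
    centers = np.linspace(x.min(), x.max(), k)
    for _ in range(n_iter):
        # Assignment step: nearest center (Euclidean distance is |x - c| in 1-D).
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        # Update step: each center moves to the mean of its members.
        new_centers = centers.copy()
        for j in range(k):
            members = x[labels == j]
            if members.size:
                new_centers[j] = members.mean()
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
    objective = float(np.sum((x - centers[labels]) ** 2))
    return labels, centers, objective

def segment(component, k=3):
    """Return a label image with the same shape as the input color component."""
    labels, centers, _ = kmeans_1d(component, k)
    return labels.reshape(np.shape(component)), centers
```

With three clusters, one label would correspond to the target eggs, one to unwanted artifacts and one to the background; which label is which has to be decided from the center intensities.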
2.5 Post-processing Steps After Segmentation Process
After the segmented images have been obtained from k-means clustering, the unwanted pixels and regions are removed from the binary image using an object remover technique. This technique removes objects smaller than 17,000 pixels and larger than 70,000 pixels in order to achieve an accurate diagnosis of STH. However, pixels inside the target object are also likely to disappear. A fill holes operation is therefore applied to overcome this side effect of the object remover on the segmented image by filling areas of dark pixels that are surrounded by lighter pixels.
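The two post-processing steps can be sketched as follows. This is a plain Python and NumPy illustration, not the authors' code: the function names, the use of 4-connectivity and the inclusive area bounds are assumptions. The explicit breadth-first labelling is written out for clarity; on full-size images a library connected-component routine would normally be used instead.

```python
from collections import deque
import numpy as np

def label_regions(mask):
    """Label 4-connected True regions; return the label image and region areas."""
    h, w = mask.shape
    labels = np.zeros((h, w), dtype=np.int32)
    areas = []
    for r0 in range(h):
        for c0 in range(w):
            if not mask[r0, c0] or labels[r0, c0]:
                continue
            current = len(areas) + 1
            labels[r0, c0] = current
            queue, area = deque([(r0, c0)]), 0
            while queue:  # breadth-first flood fill of one region
                r, c = queue.popleft()
                area += 1
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if 0 <= rr < h and 0 <= cc < w and mask[rr, cc] and not labels[rr, cc]:
                        labels[rr, cc] = current
                        queue.append((rr, cc))
            areas.append(area)
    return labels, areas

def remove_by_area(mask, min_area=17000, max_area=70000):
    """Keep only objects whose pixel count lies within [min_area, max_area]."""
    mask = np.asarray(mask, dtype=bool)
    labels, areas = label_regions(mask)
    keep = np.zeros(len(areas) + 1, dtype=bool)  # index 0 is the background
    for i, area in enumerate(areas, start=1):
        keep[i] = min_area <= area <= max_area
    return keep[labels]

def fill_holes(mask):
    """Fill background regions that do not touch the image border."""
    mask = np.asarray(mask, dtype=bool)
    labels, areas = label_regions(~mask)
    border = np.concatenate([labels[0, :], labels[-1, :], labels[:, 0], labels[:, -1]])
    outside = np.zeros(len(areas) + 1, dtype=bool)
    outside[border] = True
    outside[0] = False  # label 0 marks the original foreground
    return ~outside[labels]
```

Applying `remove_by_area` first and `fill_holes` second follows the order described above: size filtering discards artifacts, and hole filling restores interior pixels of the remaining objects.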
2.6 Segmentation Performance
Segmentation performance measures how successful the segmentation is. In this paper, it is used to compare the segmentation results obtained when different color components are applied with the k-means clustering technique. Segmentation performance is divided into three measures: accuracy, specificity and sensitivity. These measures are calculated by comparing the pixels of the resultant segmented image with those of the manually segmented image. Accuracy, specificity and sensitivity are defined in Eqs. (12), (13) and (14), respectively.
\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100 \tag{12} \]
\[ \text{Specificity} = \frac{TN}{TN + FP} \times 100 \tag{13} \]
\[ \text{Sensitivity} = \frac{TP}{TP + FN} \times 100 \tag{14} \]
Accuracy is the ratio of correctly classified pixels to the entire area of the STH image, while sensitivity is the true positive rate, i.e. the proportion of pixels in the helminth egg region that have been classified correctly. Specificity is the percentage of pixels that are correctly segmented as the negative region [29].
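The three measures can be computed from a predicted mask and a manually segmented reference mask as sketched below. The function name is hypothetical, and treating egg pixels as the positive class and returning NaN when a denominator is zero are assumptions made for this example.

```python
import numpy as np

def segmentation_scores(predicted, reference):
    """Return (accuracy, specificity, sensitivity) in percent for two binary masks."""
    pred = np.asarray(predicted, dtype=bool)
    ref = np.asarray(reference, dtype=bool)
    tp = int(np.sum(pred & ref))    # egg pixels segmented as egg
    tn = int(np.sum(~pred & ~ref))  # background pixels segmented as background
    fp = int(np.sum(pred & ~ref))   # background pixels segmented as egg
    fn = int(np.sum(~pred & ref))   # egg pixels segmented as background
    accuracy = 100.0 * (tp + tn) / (tp + tn + fp + fn)
    # Assumption: an undefined ratio (empty class) is reported as NaN.
    specificity = 100.0 * tn / (tn + fp) if (tn + fp) else float("nan")
    sensitivity = 100.0 * tp / (tp + fn) if (tp + fn) else float("nan")
    return accuracy, specificity, sensitivity
```

Because the background usually covers far more pixels than the eggs, accuracy and specificity can stay high even when sensitivity is low, so the three values are best read together.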
3 Results and Discussion
In this study, the MGCS technique has been applied to 100 ALO images and 100 TTO images. Nine color components have then been applied to the enhanced images. The resulting color component images have been used as input images for k-means clustering in order to pinpoint the most suitable color component for segmentation. The segmented images have then been assessed through qualitative and quantitative evaluations.
Fig. 5 Original ALO and TTO images: (a) ALO_1, (b) ALO_2, (c) TTO_1, (d) TTO_2
Fig. 6 Results of R component and k-means clustering on enhanced ALO and TTO images. Rows: ALO_1 (a–d), ALO_2 (e–h), TTO_1 (i–l), TTO_2 (m–p); columns: MGCS image, R component, k-means result, PPS result
Fig. 7 Results of G component and k-means clustering on enhanced ALO and TTO images. Rows: ALO_1 (a–d), ALO_2 (e–h), TTO_1 (i–l), TTO_2 (m–p); columns: MGCS image, G component, k-means result, PPS result
Figure 5 shows samples of the original ALO and TTO images. The lighting differs from image to image: ALO_1 and TTO_2 are darker than ALO_2 and TTO_1. The artifacts also come in different colors and sizes in each image. These differences increase the difficulty of the segmentation process. However, the MGCS technique eases this problem by enhancing and standardizing the lighting in the images. Figures 6 to 15 show the resultant images when the proposed color components and k-means clustering are applied to the MGCS images of ALO and TTO.
Fig. 8 Results of B component and k-means clustering on enhanced ALO and TTO images. Rows: ALO_1 (a–d), ALO_2 (e–h), TTO_1 (i–l), TTO_2 (m–p); columns: MGCS image, B component, k-means result, PPS result
From the resultant images, it can be seen that each color component has its advantages and disadvantages when applied to the MGCS images. The results obtained from the color components are crucial for the k-means clustering and post-processing steps. Observation of the enhanced images shows that the MGCS technique improves the quality of the original images. The target objects stand out and can be distinguished from the artifacts, while the lighting in each image is balanced.
Fig. 9 Results of S component and k-means clustering on enhanced ALO and TTO images. Rows: ALO_1 (a–d), ALO_2 (e–h), TTO_1 (i–l), TTO_2 (m–p); columns: MGCS image, S component, k-means result, PPS result
The results show that the R, V, Lab and GLab components are unsuitable for STH segmentation. The information in the target images is greatly affected when the images go through the post-processing procedure, because most of the information lost from the target images cannot be restored. Figures 6, 10, 11 and 14 show resultant images that have lost information that could not be restored; most of them are TTO images.
Fig. 10 Results of V component and k-means clustering on enhanced ALO and TTO images. Rows: ALO_1 (a–d), ALO_2 (e–h), TTO_1 (i–l), TTO_2 (m–p); columns: MGCS image, V component, k-means result, PPS result
The images are successfully segmented when the G, B, Lab and Y components are applied to the enhanced images in combination with the k-means clustering technique. However, the final results show that artifacts are still present in the images even though the target images are successfully segmented. These artifacts are difficult to remove because their sizes fall within the size range of the target images. This increases the possibility of misleading results in the segmentation performance analysis.
Fig. 11 Results of Lab color model and k-means clustering on enhanced ALO and TTO images. Rows: ALO_1 (a–d), ALO_2 (e–h), TTO_1 (i–l), TTO_2 (m–p); columns: MGCS image, Lab image, k-means result, PPS result
The S, I and SG components produce better resultant images than the other components when applied to the MGCS images. Artifacts are present, but only in minimal amounts. Figure 9 shows the resultant images for the S component. The target images are successfully segmented, with only a small portion of artifacts remaining because they fall in the same cluster as the target images. The results for the I component in Fig. 13 show good segmentation, but the target objects in the final images are smaller than in the original images. The results for the SG component in Fig. 15 show that some information is missing, although the target images are successfully segmented with fewer artifacts.
Fig. 12 Results of Y component and k-means clustering on enhanced ALO and TTO images. Rows: ALO_1 (a–d), ALO_2 (e–h), TTO_1 (i–l), TTO_2 (m–p); columns: MGCS image, Y component, k-means result, PPS result
Table 1 shows the average performance of each proposed color component over all ALO and TTO images. The highest accuracy is 99.06%, obtained by the S and SG color components. The highest specificity is 99.96%, obtained by the I component, while the highest sensitivity is 97.33%, obtained by the G component. Comparing the overall performance, the S component achieved the best segmentation performance when applied with k-means clustering, with an accuracy of 99.06%, a specificity of 99.31% and a sensitivity of 95.06%.
Fig. 13 Results of I component and k-means clustering on enhanced ALO and TTO images. Rows: ALO_1 (a–d), ALO_2 (e–h), TTO_1 (i–l), TTO_2 (m–p); columns: MGCS image, I component, k-means result, PPS result
Table 1 Results of segmentation performances based on different color components and k-means clustering
Color component   Accuracy   Specificity   Sensitivity
R                 96.76%     98.06%        67.81%
G                 98.24%     98.29%        97.33%
B                 98.53%     98.64%        96.54%
S                 99.06%     99.31%        95.06%
V                 96.97%     99.54%        91.46%
Lab               98.02%     98.35%        89.97%
Y                 98.01%     98.12%        95.19%
I                 97.40%     99.96%        56.24%
GLab              96.50%     99.41%        40.83%
SG                99.06%     99.54%        91.46%
Fig. 14 Results of GLab arithmetic component and k-means clustering on enhanced ALO and TTO images. Rows: ALO_1 (a–d), ALO_2 (e–h), TTO_1 (i–l), TTO_2 (m–p); columns: MGCS image, GLab component, k-means result, PPS result
Fig. 15 Results of SG component and k-means clustering on enhanced ALO and TTO images. Rows: ALO_1 (a–d), ALO_2 (e–h), TTO_1 (i–l), TTO_2 (m–p); columns: MGCS image, SG component, k-means result, PPS result
4 Conclusions
In this paper, the results of applying the proposed color models with k-means clustering have been presented. Color components from various color models are used as input to k-means clustering segmentation to ease the identification of the target objects and so achieve good segmentation results. A good segmentation result helps to achieve more accurate classification and diagnosis of STH. The S component of the HSV color model gave the best segmentation of ALO and TTO images, with an accuracy of 99.06%, a specificity of 99.31% and a sensitivity of 95.06%. These results can be used as a reference for the morphology of ALO and TTO in future work such as classification and identification.
Acknowledgements The authors would like to acknowledge the support from the Fundamental Research Grant Scheme for Research Acculturation of Early Career Researchers (FRGS-RACER) under grant number RACER/1/2019/ICT02/UNIMAP//2 from the Ministry of Higher Education Malaysia. The authors gratefully acknowledge the team members and thank Hospital Universiti Sains Malaysia (HUSM) for providing the helminth egg samples.
References
1. Mohd-Shaharuddin N, Lim YAL, Hassan N-A, Nathan S, Ngui R (2018) Soil-transmitted
helminthiasis among indigenous communities in Malaysia: is this the endless malady with no
solution? Trop Biomed 35(1):168–180
2. Mehraj V, Hatcher J, Akhtar S, Rafique G, Beg MA (2008) Prevalence and factors associated
with intestinal parasitic infection among children in an urban slum of Karachi. PLoS ONE 3
(11):e3680
3. Ghate DA, Jadhav C (2012) Automatic detection of malaria parasite from blood images.
Department of Computer, College of Engineering, Pimpri, Pune, Maharashtra, India, TIJCSA
4. Ghazali KH, Hadi RS, Zeehaida M (2013) Microscopy image processing analysis for
automatic detection of human intestinal parasites ALO and TTO. In: International conference
on electronics computer and computation, ICECCO 2013, pp 40–43
5. World Health Organization (2004) Division of control of tropical diseases. Schistosomiasis
and intestinal parasites unit: training manual on diagnosis of intestinal parasites, tutor’s guide
electronic resource. CD-ROM
6. Amoah ID, Singh G, Stenström TA, Reddy P (2017) Detection and quantification of
soil-transmitted helminths in environmental samples: a review of current state-of-the-art and
future perspectives. Acta Trop 169(February):187–201
7. World Health Organization (WHO) (2005) Deworming for health and development. Report of
the third global meeting of the partners for parasite control. WHO, Geneva
8. World Health Organization (WHO) (2015) Third WHO report on neglected diseases:
investing to overcome the global impact of neglected tropical diseases. World Health
Organization, p 191
9. Bleakly H (2003) Disease and development. Evidence from hookworm eradication in the
American South. Q J Econ 1:376–386
10. Kaewpitoon SJ, Sangwalee W, Kujapun J, Norkaew J, Chuatanam J, Ponphimai S,
Chavengkun W, Padchasuwan N, Meererksom T, Tongtawee T, Matrakool L,
Panpimanmas S, Wakkhuwatapong P, Kaewpitoon N (2018) Active screening of gastrointestinal helminth infection in migrant workers in Thailand. J Int Med Res 46:4560–4568
11. Jiménez B, Maya C, Velásquez G, Torner F, Arambula F, Barrios JA, Velasco M (2016)
Identification and quantification of pathogenic helminth eggs using a digital image system.
Exp Parasitol 166:164–172
12. Hadi RS, Ghazali KH, Khalidin IZ, Zeehaida M (2012) Human parasitic worm detection
using image processing technique. In: ISCAIE 2012 - 2012 IEEE symposium on computer
applications & industrial electronics, no Iscaie, pp 196–201
13. Suzuki CTN, Gomes JF, Falcão AX, Papa JP, Hoshino-Shimizu S (2013) Automatic
segmentation and classification of human intestinal parasites from microscopy images. IEEE
Trans Biomed Eng 60(3):803–812
14. Kamarul HG, Raafat SH, Zeehaida M (2013) Automated system for diagnosis intestinal
parasites by computerized image analysis. Modern Appl Sci 7(5):98–114
15. Ghimire D, Lee J (2011) Nonlinear transfer function based local approach for color image
enhancement. IEEE Trans Consum Electron 57(2):858–865
16. Kulkarni TA, Bhosale DS, Yadav DM (2014) A fast segmentation method for the recognition
of acute lymphoblastic leukemia using thresholding algorithm. Int J Electron Commun
Comput Eng 5(4):364–368
17. Poostchi M, Silamut K, Maude RJ, Jaeger S, Thoma G (2018) Image analysis and machine
learning for detecting malaria. Transl Res 194(2018):36–55
18. Sharma B, Nayyer R (2015) Use and analysis of color models in image processing. J Food
Process Technol 7(01):1–2
19. Aris TA, Nasir ASA, Mohamed Z, Jaafar H, Mustafa WA, Khairunizam W, Jamlos MA,
Zunaidi I, Razlan ZM, Shahriman AB (2019) Colour component analysis approach for
malaria parasites detection based on thick blood smear images. In: MEBSE 2018- IOP
conference series: materials science and engineering, vol 557. IOP
20. Dai H, Li X (2010) The color components’ exchanging on different color spaces for image
segmentation of hematocyte. In: 2nd international conference on multimedia information
networking and security, MINES 2010. IEEE, pp 10–13
21. Abdul-Nasir AS, Mashor MY, Mohamed Z (2012) Modified global and modified linear
contrast stretching algorithms-new colour contrast enhancement techniques for microscopic
analysis of malaria slide images. Comput Math Methods Med
22. Miller E (2017) Understanding the RGB colour model, graphic design 101. https://www.
thoughtco.com/colour-models-rgb-1697461
23. Sharma B, Nayyer B (2015) Use and analysis of color model in image processing. J Food
Process Control Technol
24. Nasir ASA, Mashor MY, Rosline H (2011) Detection of acute leukaemia cells using variety of
features and neural networks. In: 5th Kuala Lumpur international conference on biomedical
engineering. International Federation for Medical and Biological Engineering (IFMBE),
Kuala Lumpur, pp 40–46
25. Latoschik ME (2006) Realtime 3D computer graphic/virtual reality
26. Puniani S, Arora S (2015) Performance evaluation of image enhancement techniques. Int J
Signal Process Image Process Pattern Recogn 8(8):251–262
27. Erich LM (2006) Colour models CIE space for colour matching. CIE 1931 Model
International C
28. Hong Yan NL (2006) Improved method for color image enhancement based on luminance
and color contrast. J Electron Imaging 3(2):190–197
29. Khairudin NAA, Ariff FNM, Nasir ASA, Mustafa WA, Khairunizam W, Jamlos, MA,
Zunaidi I, Razlan ZM, Shahriman AB (2019) Image segmentation approach for acute and
chronic leukaemia based on blood sample images. In: MEBSE 2018- IOP conference series:
materials science and engineering, vol 557. IOP
Machine Learning Calibration for Near
Infrared Spectroscopy Data: A Visual
Programming Approach
Mahmud Iwan Solihin, Zheng Zekui, Chun Kit Ang, Fahri Heltha,
and Mohamed Rizon
Abstract Spectroscopy, including near infrared spectroscopy (NIRS), is a non-destructive and rapid technique that is increasingly applied to food quality evaluation, medical diagnosis, manufacturing, etc. Qualitative or quantitative information is obtained from NIRS only after a spectra data calibration process based on mathematical knowledge in chemometrics and statistics. This process naturally involves multivariate statistical analysis. Machine learning, as a subset of artificial intelligence (AI), is becoming increasingly popular for chemometric calibration of NIRS data alongside conventional multivariate statistical tools. However, the software and toolboxes used in chemometrics are often commercial versions that are not free. For the free versions, programming skills are required to apply machine learning to spectra data calibration. Therefore, this paper introduces a different approach to spectra data calibration based on visual programming using Orange data mining, a free software package that is still rarely used by the research community in spectroscopy. Two data sets are used: pesticide sprayed on cabbage (to classify between pure cabbage and cabbage sprayed with different levels of pesticide solution) and mango sweetness assessment (to predict the soluble sugar content of mango based on the Brix degree value). These two data sets represent classification and regression, respectively. This approach is intended mainly for researchers who want to apply machine learning calibration to their spectroscopy data without rigorous programming work, i.e. for non-programmers.
M. I. Solihin (✉) · C. K. Ang · F. Heltha
Mechatronics Engineering, Faculty of Engineering, UCSI University, Kuala Lumpur,
Malaysia
e-mail: mahmudis@ucsiuniversity.edu.my
M. Rizon
Electrical and Electronics Engineering, Faculty of Engineering, UCSI University,
Kuala Lumpur, Malaysia
Z. Zekui
TUM (Technical University of Munich) Asia, Singapore, Singapore
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_40
Keywords Machine learning calibration · Near infrared spectroscopy · Orange free software · Handheld near infrared spectrometer
1 Introduction
Machine learning, including deep learning, has recently become a highly discussed topic
in the digital data world. It has tremendous potential to solve complex human
problems. Thus, many application fields call for machine learning, and artificial
intelligence more broadly, to solve their respective problems [1–3]. Spectroscopy data
applications are no exception. Spectroscopy is the study of the interaction between matter
and electromagnetic radiation; it originated from the study of visible light dispersed
according to its wavelength by a prism.
In particular, near infrared spectroscopy (NIRS) is a non-destructive and rapid
technique that has been increasingly applied in recent years to food quality evaluation,
medical diagnosis, manufacturing, etc. [4–15]. It can provide quantitative (determination
of substance concentrations) and qualitative (raw material identification, detection of
product adulteration) information about samples for in situ analysis and online
applications [4, 5]. For example, it can provide moisture, protein, fat, and starch
content information. NIR applications vary by industry and are tailored to suit different
companies and their products and needs [16–18].
In spectroscopy, absorption spectra of chemical species (atoms, molecules, or ions)
are generated when a beam of electromagnetic energy (i.e. light) is passed through a
sample and the chemical species absorb a portion of the photons passing through it.
The Beer–Lambert law states that the absorbance of a dissolved substance is directly
proportional to its concentration in the solution. The relationship can be expressed as
shown in Eq. (1) [19].
$$A = \log_{10}\frac{I_0}{I} = \varepsilon l c \qquad (1)$$
where:
A = absorbance
ε = the molar extinction coefficient
l = length of the path light must travel in the solution, in centimeters
c = concentration of a given solution
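As a worked illustration of Eq. (1), the short sketch below computes the absorbance from an incident and a transmitted intensity and then solves for the concentration. The function names and the numerical values (intensities, molar extinction coefficient, path length) are hypothetical and chosen only to make the arithmetic easy to follow; the units assumed are L/(mol·cm) for ε, cm for l and mol/L for c.

```python
import math


def absorbance(incident: float, transmitted: float) -> float:
    """Absorbance A = log10(I0 / I), the left-hand part of Eq. (1)."""
    if incident <= 0 or transmitted <= 0:
        raise ValueError("intensities must be positive")
    return math.log10(incident / transmitted)


def concentration(a: float, epsilon: float, path_cm: float) -> float:
    """Solve A = epsilon * l * c for the concentration c."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return a / (epsilon * path_cm)


# Hypothetical reading: 10% of the incident light is transmitted.
A = absorbance(incident=100.0, transmitted=10.0)
# Hypothetical epsilon = 50 L/(mol*cm) and a 1 cm path length.
c = concentration(A, epsilon=50.0, path_cm=1.0)
print(f"A = {A:.3f}, c = {c:.3f} mol/L")
```

With these assumed values, A = 1 and c = 1/50 = 0.02 mol/L; doubling the concentration at the same ε and l would double the absorbance, which is the proportionality that calibration relies on.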
Qualitative or quantitative information is obtained from NIRS only after the spectra
data have been calibrated using chemometrics, a process that naturally involves
multivariate statistical analysis. Machine learning, as a subset of artificial
intelligence (AI), is becoming more popular for chemometric calibration of NIRS data
alongside conventional multivariate statistical tools, owing to its well-known capability
to perform complex classification and regression tasks [20–22]. This development may be
summed up under the name intelligent chemometrics. Popular machine learning methods in
this regard include the support vector machine (SVM) and artificial neural networks (ANN).
Research in this area, including literature reviews, can be found in [23–28].
Many software tools for chemometrics can accommodate machine learning, such as
Unscrambler, MATLAB, the R language, WEKA, SIMCA and Python. However, these software
packages and toolboxes are often commercial products that are not free. Free software is
attractive for such applications because of cost [29]. For the free options, programming
skills are required to apply machine learning to spectra data calibration.
Therefore, this paper introduces a different approach to spectra data calibration
based on visual programming using Orange, a free software package developed by
Biolab [30] that is still rarely used by the spectroscopy research community.
The approach is intended for researchers who want to apply machine learning
calibration to their spectroscopy data without heavy programming work,
i.e. for non-programmers.
This paper demonstrates the results of machine learning calibration for NIRS data
in both classification and regression modes. The NIRS data were obtained using a micro
handheld spectrometer, a new type of NIR spectroscopy instrument. Two data sets are used:
pesticide sprayed on cabbage (to classify pure cabbage and cabbage sprayed with different
concentrations of pesticide solution) and mango sweetness assessment (to predict the
soluble sugar content of mango as a Brix degree value). These two data sets represent
classification and regression, respectively.
2 Instrument and Software
A spectrometer is the instrument used to collect spectra data of objects/samples by
directing an infrared light source at them. The spectra data obtained are unique to each
sample, reflecting the uniqueness of its chemical composition. Therefore, an NIR
spectrometer in particular can be used as a means of studying material fingerprints.
The spectra data can be plotted with wavelength (in nm) or wavenumber (in cm−1) on the
x axis versus intensity or absorbance (arbitrary units) on the y axis. Figure 1 shows
an example of spectra data obtained from a spectrometer.
The NIR spectrometer used in this study is a handheld (palm-sized) type covering the
NIR wavelength range of 900–1700 nm. The optical-electrical board of this spectrometer
was developed by Texas Instruments. Figure 2 shows the handheld micro spectrometer used
in this study. The device is connected via a USB port so that the user can acquire the
spectra of the samples on a personal computer using GUI software. How the data were
collected is explained in the next section for the respective case studies.
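Since spectra may be plotted in nm or in cm−1, it can help to see how the two x-axis units relate. The sketch below converts the end points of the instrument's 900–1700 nm range to wavenumbers using the standard relation wavenumber (cm−1) = 10^7 / wavelength (nm); the function name is a hypothetical helper and is not part of the instrument's GUI software.

```python
def nm_to_wavenumber(wavelength_nm: float) -> float:
    """Convert a wavelength in nm to a wavenumber in cm^-1 (1 cm = 1e7 nm)."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return 1e7 / wavelength_nm


# End points of the spectrometer range stated in the text.
for nm in (900.0, 1700.0):
    print(f"{nm:.0f} nm -> {nm_to_wavenumber(nm):.1f} cm^-1")
```

Note that the axis direction reverses: the shortest wavelength corresponds to the largest wavenumber.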
Fig. 1 An example of spectra data obtained from spectrometer readings on many samples
Fig. 2 The handheld palm-sized NIR spectrometer
For the multivariate spectra data calibration, Orange data mining software is
used [30]. This software is open source and can be downloaded freely. It features a
visual programming front-end for explorative data analysis and interactive data
visualization, and can also be used as a Python library. Visual programming in
Orange is performed as a workflow. Orange workflow components are called widgets,
and they range from simple data visualization, subset selection, and pre-processing
to empirical evaluation of learning algorithms and predictive modelling. Workflows are
thus created by linking predefined or user-designed widgets, while advanced users can
use Orange as a Python library for data manipulation and widget alteration [31].
Figure 3 shows a typical Orange workflow.
The widgets for spectroscopy can be installed through the Add-ons option, which also
offers add-ons for other applications such as Image Analyses, Time-Series, Geo, etc.
The widgets contained in the Spectroscopy add-on are shown in Fig. 4.
Fig. 3 An example of workflow visual programming in Orange
Fig. 4 The Orange software widgets available in the Spectroscopy add-ons
3 Case Studies
In this section, two case studies of NIR spectra data calibration are presented.
One case represents a classification problem (qualitative analysis) using machine
learning, and the other represents a regression problem (quantitative analysis).
The first case, for qualitative analysis, is an experiment on pesticide solution sprayed
on cabbage samples. The second case, for quantitative analysis, is mango sweetness
assessment based on sugar content (Brix value).
3.1 Pesticide Solution Sprayed on Cabbage
This experiment is motivated by the effort to develop a rapid, non-destructive
approach to detecting pesticide residue on agricultural crops. It is carried out as initial
research to examine whether NIRS is a suitable tool for pesticide residue detection.
Monitoring of pesticides in fruit and vegetable samples has increased in recent
years, since most countries have established maximum residue levels (MRL) for
pesticides in food products [32, 33].
Figure 5 shows the cabbage sample and the pesticide solution used, i.e.
Potassium oleate solution (285 g/1000 mL). The experiment procedure can be
summarized as follows:
1. The instrument is set up.
2. A high-concentration pesticide solution (28.5%, the original product ratio) is prepared.
3. The pesticide solution is sprayed on the cabbage.
4. The cabbage sample is scanned 6 times to verify the result.
5. The spectrum is saved as a .csv file.
6. Steps 3 to 5 are repeated 50 times, each on a different cabbage leaf.
7. Steps 2 to 6 are repeated for 5% pesticide, 1% pesticide and water.
8. Steps 4 to 5 are repeated 30 times on different leaves of pure (unsprayed) cabbage.
Fig. 5 Cabbage and the pesticide solution
Fig. 6 The Orange workflow in the experiment for classification task
In total, 230 NIR spectra are collected: 30 samples from 30 pure cabbage leaves, and
50 samples each of cabbage sprayed with 28.5% (original product ratio) pesticide solution,
5% pesticide solution, 1% pesticide solution, and water only. The machine learning model
therefore classifies the recorded NIR spectra into five classes. Of these 230 samples,
180 are randomly selected for training and the remaining 50 are used for testing.
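The random 180/50 partition described above can be sketched in a few lines of Python. The class sizes are taken from the text; the class label strings, the helper name `random_split` and the fixed seed are assumptions made for illustration, and this is not necessarily how the split was produced in the Orange workflow.

```python
import random
from collections import Counter

# Class sizes as reported in the text: 30 pure leaves plus 50 per sprayed class.
CLASS_SIZES = {
    "pure": 30,
    "pesticide_28.5%": 50,
    "pesticide_5%": 50,
    "pesticide_1%": 50,
    "water": 50,
}


def random_split(n_samples, n_train, seed=0):
    """Shuffle sample indices and return (train_indices, test_indices)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seed fixed only for repeatability
    return indices[:n_train], indices[n_train:]


# One label per sample, in class order.
labels = [name for name, n in CLASS_SIZES.items() for _ in range(n)]
train_idx, test_idx = random_split(len(labels), n_train=180)

print(len(labels), len(train_idx), len(test_idx))
print(Counter(labels[i] for i in test_idx))  # class make-up of the test set
```

A purely random split does not guarantee that the five classes are equally represented in the 50 test samples, which is why the sketch prints the class counts of the test set.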
Figure 6 shows the Orange workflow for this experiment, in which three classifiers
are used, namely ANN, SVM and KNN (k-nearest neighbor). The classification
results are readily available from the Confusion Matrix widget and the Test & Score
widget, as shown in Figs. 7 and 8. Figure 7 shows the confusion matrix of the
classification performed by SVM. The results of the other classifiers (ANN and KNN) can
be viewed by selecting them on the left side of the widget. Note that other classifiers
such as Random Forest, Naïve Bayes, Decision Tree, etc. can also be used.
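To give non-programmers a sense of what one of these classifiers does behind the widget, the sketch below implements a bare-bones k-nearest-neighbor vote in plain Python. It is an illustration of the general technique only, not Orange's implementation; the four-point "spectra", the two class labels and the choice k = 3 are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training spectra."""
    # Euclidean distance from the query to every training spectrum.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 4-point "spectra"; real NIR spectra have hundreds of points.
train_x = [
    [0.10, 0.20, 0.30, 0.20],
    [0.11, 0.21, 0.29, 0.19],
    [0.12, 0.19, 0.31, 0.21],
    [0.40, 0.50, 0.45, 0.35],
    [0.41, 0.52, 0.44, 0.36],
    [0.39, 0.49, 0.46, 0.34],
]
train_y = ["pure", "pure", "pure", "sprayed", "sprayed", "sprayed"]

print(knn_predict(train_x, train_y, [0.10, 0.20, 0.30, 0.21]))
print(knn_predict(train_x, train_y, [0.42, 0.50, 0.45, 0.35]))
```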
Fig. 7 Confusion matrix of classification performed by SVM
Fig. 8 Screenshot of Test & Score widget that shows classification results
Furthermore, the Test & Score widget can be used to check the classification
accuracy. As can be seen in Fig. 8, the results are mostly expressed in data science
terminology such as AUC (area under the curve), CA (classification accuracy),
Precision and Recall, etc. The highest CA on the test data is achieved by SVM, followed
by KNN and ANN, at 92, 86 and 72%, respectively. These results could be fine-tuned by
changing the model parameters, and the performance might then differ. However, the focus
of this study at the moment is on the use of the software rather than on the performance
of the machine learning algorithms. In addition, other algorithms can also be used and
analyzed easily.
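The CA, Precision and Recall figures reported by such a widget can all be derived from a confusion matrix. The following sketch shows the arithmetic on a small made-up three-class matrix; the numbers are hypothetical and are not the results of this study, and the convention of rows as actual classes and columns as predicted classes is an assumption of the sketch.

```python
def classification_accuracy(cm):
    """CA = correctly classified samples (the diagonal) / all samples."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def precision_recall(cm, k):
    """Precision and recall of class k (rows = actual, columns = predicted)."""
    true_positive = cm[k][k]
    predicted_k = sum(row[k] for row in cm)  # column sum
    actual_k = sum(cm[k])                    # row sum
    precision = true_positive / predicted_k if predicted_k else 0.0
    recall = true_positive / actual_k if actual_k else 0.0
    return precision, recall


# Hypothetical 3-class confusion matrix (not the paper's results).
cm = [
    [8, 1, 1],
    [0, 9, 1],
    [2, 0, 8],
]
print(f"CA = {classification_accuracy(cm):.3f}")
print("class 0 precision, recall =", precision_recall(cm, 0))
```

Here 25 of the 30 samples lie on the diagonal, so CA = 25/30; for class 0, 8 of the 10 samples predicted as class 0 are correct and 8 of the 10 actual class 0 samples are found, so precision and recall are both 0.8.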
3.2 Brix Value Prediction on Mango
The second case study is a regression problem, part of research on non-destructive
fruit quality assessment using NIR spectroscopy. For this project, three different
types of mango were selected, namely Chokonan, Rainbow, and Kai Te. A total of
60 samples were prepared to be scanned by the spectrometer.
The samples were scanned in reflectance mode to record the absorbance spectra
data. Each sample spectrum was measured for 3 s. Some samples were scanned twice in
different environments, and some were scanned only once. A total of 80 spectra were
collected from the 60 samples. The training and testing data sets consist of 60 and
20 spectra, respectively.
Short wave near-infrared (NIR) spectroscopy (900–1700 nm) has been investigated for
assessing the maturity of mango fruit and as a guide to final food quality. To
obtain a predictive model from spectroscopy data, measured reference data need to be
collected so that the prediction model can be calibrated and its accuracy validated.
A refractometer is a device used to measure the refractive index of plant juices in
order to determine the mineral/sugar ratio of the plant cell protoplasm. The
refractometer measures in units called Brix. NIRS is used to predict the Brix values
in mango fruit. The mango samples are of three different types, namely Chokonan,
Rainbow, and Kai Te.
The MA871 is an optical refractometer that uses measurement of the refractive index
to determine the % Brix of sugar in aqueous solutions, as shown in Fig. 9 [34]. In this
project, the NIR spectra of the mango samples are calibrated by machine learning (the
AdaBoost ensemble algorithm for regression in this case) to predict the Brix value
non-invasively.
Fig. 9 MA871 digital refractometer
Fig. 10 Raw spectral data of Mango
Fig. 11 Pre-processed spectra data of Mango
Fig. 12 Orange workflow for regression experiment and the regression result
Fig. 13 Actual %Brix value vs predicted value (by AdaBoost)
Figures 10 and 11 show the raw and the pre-processed spectra data of the mango
samples. Two pre-processing steps are applied here, namely Gaussian smoothing and
EMSC (extended multiplicative scatter correction). The Test & Score widget can be
used to show the regression accuracy in terms of R2 (coefficient of determination).
The regression performance obtained by AdaBoost ensemble regression is R2 = 0.99
(training), indicating a very good fit. However, R2 = 0.64 is obtained for testing.
This lower value indicates overfitting of the prediction model, which needs to be
remedied. However, that discussion is beyond the scope of this paper.
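For readers curious about what the Gaussian smoothing pre-processing step does to a spectrum, a minimal plain-Python sketch follows. It is a generic illustration and not the implementation used by the Orange Spectroscopy add-on; the kernel width (sigma = 1, in units of data points), the edge handling (repeating the end values) and the sample values are all assumptions. EMSC is not covered by this sketch.

```python
import math


def gaussian_kernel(sigma, radius):
    """Discrete Gaussian weights on [-radius, radius], normalized to sum to 1."""
    weights = [
        math.exp(-(i * i) / (2.0 * sigma * sigma))
        for i in range(-radius, radius + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def gaussian_smooth(spectrum, sigma=1.0):
    """Smooth a 1-D spectrum; edges are handled by repeating the end values."""
    radius = max(1, int(3 * sigma))
    kernel = gaussian_kernel(sigma, radius)
    n = len(spectrum)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - radius, 0), n - 1)  # clamp at the edges
            acc += w * spectrum[idx]
        smoothed.append(acc)
    return smoothed


# Hypothetical noisy absorbance values.
raw = [0.50, 0.58, 0.49, 0.61, 0.52, 0.63, 0.55, 0.66]
print([round(v, 3) for v in gaussian_smooth(raw, sigma=1.0)])
```

Each output point is a weighted average of its neighbors, so point-to-point noise is reduced while the broad shape of the spectrum is retained; a larger sigma smooths more strongly but can also flatten genuine absorption features.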
Figure 12 shows the Orange workflow (visual programming) used to generate the
results for this regression process. Figure 13 shows the regression plot for the testing
data, which indicates the relation between the actual %Brix and the predicted value.
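The R2 value that summarizes such an actual-versus-predicted plot can be computed directly from the two lists of values. The sketch below uses the usual definition R2 = 1 − SS_res / SS_tot; the function name and the %Brix numbers are hypothetical and are not data from this study.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: R2 = 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical %Brix values (not measurements from the paper).
actual = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]
print(f"R2 = {r_squared(actual, predicted):.2f}")
```

For these values SS_res = 2 and SS_tot = 20, giving R2 = 0.9. Computing the metric separately on training and testing data, as done in the text, is what exposes a gap such as 0.99 versus 0.64.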
4 Conclusions and Discussions
This paper introduces a different approach to the calibration of spectra data,
particularly near infrared spectroscopy data, based on visual programming using Orange
data mining, a free software package that is still rarely used by the spectroscopy
research community. This software tool is particularly useful for non-programmer
researchers who want to apply machine learning algorithms to spectroscopy data,
which leads to an intelligent chemometrics approach. No coding was involved in
the calibration and analysis, which may attract the interest of non-programmers.
However, there are some recommendations for future improvement, particularly for
Orange software development, that the research community and the authors
should pursue, such as the development of a PLS (partial least squares) regression
widget and a deep learning (e.g. convolutional neural network) widget. PLS in
particular is among the most popular multivariate regression methods in
chemometrics and spectroscopy. Such widgets can only be developed with knowledge of
the Python programming language.
References
1. Ang CK, Tey WY, Kiew PL, Fauzi M (2017) An artificial intelligent approach using fuzzy
logic for sleep quality measurement. J Mech Eng SI 4(2):31–47
2. Tang SH, Ang CK, Ariffin MKABM, Mashohor SB (2014) Predicting the motion of a robot
manipulator with unknown trajectories based on an artificial neural network. Int J Adv Robot
Syst 11(10):176
3. Hong TS, Kit AC, Nia DN, Ariffin MKAM, Khaksar W (2013) Planning for redundant
manipulator based on back-propagation neural network. Adv Sci Lett 19(11):3307–3310
4. Cen H, He Y (2007) Theory and application of near infrared reflectance spectroscopy in
determination of food quality. Trends Food Sci Technol 18(2):72–83
5. Teixeira Dos Santos CA, Lopo M, Páscoa RNMJ, Lopes JA (2013) A review on the
applications of portable near-infrared spectrometers in the agro-food 