DaveVishv - Path Planning using Image Processing

Determination of Path Planning using Image Processing
Dave Vishv Divyang
20BCE1472
Guide - Dr. S Rajkumar
Introduction
● The rise of automated vehicles marks a significant shift in the automotive landscape, requiring specialized systems for seamless navigation.
● Path planning is a crucial task for autonomous vehicles, as it determines the optimal route to reach a desired destination while avoiding obstacles and complying with traffic rules.
● Image processing is a powerful technique that can help extract useful information from the visual data captured by cameras mounted on the vehicle.
● This project centers on developing intelligent algorithms using image processing, machine learning, and computer vision to create an efficient path planning system for real-time decision-making.
Literature Review

1. Paper Title: Y. Xiao, L. Daniel and M. Gashinova, "Image Segmentation and Region Classification in Automotive High-Resolution Radar Imagery," in IEEE Sensors Journal, vol. 21, no. 5, pp. 6698-6711, 1 March 2021, doi: 10.1109/JSEN.2020.3043586.
   Summary: Proposed a method for automatic segmentation of automotive radar images.
   Algorithm Used: 1) Unsupervised image pre-segmentation using marker-based watershed transformation; 2) Supervised segmentation and classification of regions containing objects and surfaces.
   Results: Good performance of the proposed algorithm on single (standalone) radar image frames based on F1 and JSC scores.

2. Paper Title: E. Shelhamer, J. Long and T. Darrell, "Fully Convolutional Networks for Semantic Segmentation," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 4, pp. 640-651, 1 April 2017, doi: 10.1109/TPAMI.2016.2572683.
   Summary: Takes input of arbitrary size and produces correspondingly sized output with efficient inference and learning. Defines a novel architecture that combines deep, coarse, semantic information with shallow, fine, appearance information.
   Algorithm Used: 1) AlexNet; 2) VGG net; 3) GoogLeNet.
   Results: Tested FCN on semantic segmentation and scene parsing, exploring PASCAL VOC, NYUDv2, and SIFT Flow.

3. Paper Title: Z. Yue, F. Gao, Q. Xiong, J. Wang, A. Hussain and H. Zhou, "A Novel Attention Fully Convolutional Network Method for Synthetic Aperture Radar Image Segmentation," in IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 13, pp. 4585-4598, 2020, doi: 10.1109/JSTARS.2020.3016064.
   Summary: Employed CNNs in pixel-wise image segmentation tasks, where pixels are predicted with labels.
   Algorithm Used: 1) Fully convolutional network (FCN); 2) Attention fully convolutional network (AFCN); 3) Multiscale attention network (MANet).
   Results: Adopted the fully connected CRF to capture the spatial information in the SAR images. 3 feature optimization strategies: multiscale feature, channel attention, and spatial attention extraction.
4. Paper Title: D.-H. Paek, S.-H. Kong and K. T. Wijaya, "K-Radar: 4D Radar Object Detection for Autonomous Driving in Various Weather Conditions," in Advances in Neural Information Processing Systems, 2022.
   Summary: Introduced K-Radar, a novel object detection dataset containing 35K frames of 4D radar tensor (4DRT) data with power measurements along the Doppler, range, azimuth, and elevation dimensions, together with 3D bounding box labels of objects on the roads.
   Algorithm Used: 1) K-Radar, for 3D object detection; 2) 3D object detection baseline NN that directly consumes 4DRT.
   Results: K-Radar provides 3D bounding box labels and tracking IDs for 93.3K objects of five classes at distances of up to 120 m.

5. Paper Title: M. Ye, D. Lyu and G. Chen, "Scale-Iterative Upscaling Network for Image Deblurring," in IEEE Access, vol. 8, pp. 18316-18325, 2020, doi: 10.1109/ACCESS.2020.2967823.
   Summary: Used a scale-iterative upscaling network (SIUN) that restores sharp images in an iterative manner. Applied a super-resolution structure instead of the upsampling layer between two consecutive scales to restore a detailed image.
   Algorithm Used: 1) Modified RDN, combined with a U-Net; 2) Upscaling network, scale-iterative structure.
   Results: The method can produce better results on both benchmark datasets and real-world blurred images, compared with both traditional and learning-based methods.

6. Paper Title: S. Song, Z. Jia, J. Yang and N. K. Kasabov, "A Fast Image Segmentation Algorithm Based on Saliency Map and Neutrosophic Set Theory," in IEEE Photonics Journal, vol. 12, no. 5, pp. 1-16, Oct. 2020, Art no. 3901016, doi: 10.1109/JPHOT.2020.3026973.
   Summary: A fast image segmentation algorithm combining the saliency map with NS theory to obtain more accurate image segmentation. The algorithm overcame the defects of under- and over-segmentation.
   Algorithm Used: 1) Saliency map; 2) Neutrosophic set model.
   Results: The SMNS model was able to de-noise noisy images to obtain better segmentation results. The SMNS model is very fast.
7. Paper Title: C. Dewi, R.-C. Chen, Y.-T. Liu, X. Jiang and K. D. Hartomo, "Yolo V4 for Advanced Traffic Sign Recognition With Synthetic Training Data Generated by Various GAN," in IEEE Access, vol. 9, pp. 97228-97242, 2021, doi: 10.1109/ACCESS.2021.3094201.
   Summary: The work combines synthetic images with original images to enhance datasets and verify the effectiveness of synthetic datasets. Structural Similarity Index (SSIM) and Mean Square Error (MSE) were employed to assess picture quality.
   Algorithm Used: 1) DCGAN; 2) LSGAN; 3) WGAN.
   Results: The highest SSIM value was achieved when using 200 total images.

8. Paper Title: J. Redmon, S. Divvala, R. Girshick and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 2016, pp. 779-788, doi: 10.1109/CVPR.2016.91.
   Summary: Framed object detection as a regression problem to spatially separated bounding boxes and associated class probabilities. A single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation.
   Algorithm Used: 1) Combined R-CNN & YOLO.
   Results: The architecture is extremely fast. It outperforms other detection methods, including DPM and R-CNN.

9. Paper Title: C.-J. Lin and J.-Y. Jhang, "Intelligent Traffic-Monitoring System Based on YOLO and Convolutional Fuzzy Neural Networks," in IEEE Access, vol. 10, pp. 14120-14133, 2022, doi: 10.1109/ACCESS.2022.3147866.
   Summary: Records traffic volume and vehicle type information from the road using YOLO and CFNN.
   Algorithm Used: 1) Modified YOLOv4-tiny; 2) Convolutional fuzzy neural network (CFNN & Vector CFNN).
   Results: The proposed method achieved an accuracy of 90.45% on the Beijing Institute of Technology public dataset.
10. Paper Title: T. Brophy et al., "A Review of the Impact of Rain on Camera-Based Perception in Automated Driving Systems," in IEEE Access, vol. 11, pp. 67040-67057, 2023, doi: 10.1109/ACCESS.2023.3290143.
   Summary: A framework is used to understand the degree to which adverse weather conditions affect the cameras used in automated vehicles for sensing and perception. The effects of rain on each element of the model are reviewed.
   Algorithm Used: 1) Faster R-CNN; 2) YOLOv3.
   Results: The effect of using rain-degraded data as input for subsequent data processing is studied. Rain impacts a wide variety of under-explored aspects of an autonomous vehicle environment.

11. Paper Title: Hoang, VD., Jo, KH. Path planning for autonomous vehicle based on heuristic searching using online images. Vietnam J Comput Sci 2, 109–120 (2015).
   Summary: The paper proposes a method that constructs the shortest path for vehicle auto-navigation in outdoor environments. The global path for vehicle motion is self-constructed using road map and satellite images.
   Algorithm Used: 1) Dijkstra algorithm + greedy breadth-first search algorithm.
   Results: It focuses on the estimation of the path in global coordinates without using expensive commercial services. One disadvantage is that the method depends on the road information being kept up to date.

12. Paper Title: F. Yu and Z. Lu, "Road Traffic Marking Extraction Algorithm Based on Fusion of Single Frame Image and Sparse Point Cloud," in IEEE Access, vol. 11, pp. 88881-88894, 2023, doi: 10.1109/ACCESS.2023.3306423.
   Summary: Applies the improved road traffic marking extraction algorithm for sparse point clouds to the road surface cloud together with a single frame image, and constructs the road traffic marking extraction algorithm.
   Algorithm Used: 1) Mask R-CNN framework; 2) Road traffic marking extraction algorithm fused with single frame image and point cloud.
   Results: The average recall rate of the algorithm was 0.841, the average accuracy was 85.4%, and the operation speed was 125.6 seconds. Its performance was superior to the other algorithms compared.
13. Paper Title: S. Liu, E. Johns and A. J. Davison, "End-To-End Multi-Task Learning With Attention," 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 2019, pp. 1871-1880, doi: 10.1109/CVPR.2019.00197.
   Summary: The proposed method allows task-specific features to be learned from the global features, whilst simultaneously allowing features to be shared across different tasks.
   Algorithm Used: 1) MTAN based on VGG-16 + encoder half of SegNet.
   Results: The architecture is state-of-the-art in multi-task learning compared to existing methods, and is also less sensitive to weighting schemes in the multi-task loss function.

14. Paper Title: MARKER BASED WATERSHED TRANSFORMATION FOR IMAGE SEGMENTATION, Aman Kumar Sharma and Anju Bala, 2013, CorpusID:44245854
   Summary: A new three-step methodology for image segmentation using watershed transformation: first pre-segmentation processing, then detecting the edges of the image, and then computing the watershed transformation.
   Algorithm Used: 1) Adaptive Histogram Equalization; 2) Non-Local Means Filter; 3) Marker-based watershed transformation.
   Results: The traditional watershed transformation approach has a problem of over-segmentation. To solve this problem, a new algorithm was proposed, which gave good results when compared to traditional results.

15. Paper Title: N. Wojke, A. Bewley and D. Paulus, "Simple online and realtime tracking with a deep association metric," 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 2017, pp. 3645-3649, doi: 10.1109/ICIP.2017.8296962.
   Summary: An approach for multiple object tracking that integrates appearance information to improve the performance of SORT.
   Algorithm Used: 1) Simple Online and Realtime Tracking (SORT).
   Results: The extensions reduce the number of identity switches by 45%, achieving overall competitive performance at high frame rates.
Problem Statement
● Achieving robust and efficient route planning is essential for safe and effective navigation in the field of autonomous robotics and unmanned vehicles, particularly in unstructured and dynamic situations.
● Conventional route planning techniques often depend on explicit sensor data or pre-existing maps, which may not be adequate for making decisions in real time in complicated settings. One potential way to overcome these obstacles is to incorporate image processing methods into the route planning procedure.
● The principal aim of this study is to advance the domain of self-navigating systems by designing and implementing a vision-based route planning system built on image processing.
Research Challenges
● Dealing with noise, occlusion, illumination, and perspective changes in the images.
● Fusing multiple images from different sources and perspectives to create a comprehensive map of the environment without relying on any additional sensors.
● Designing efficient and robust algorithms that can generate feasible and optimal paths for automated vehicles, considering various constraints and objectives such as safety, speed, and fuel consumption.
● Handling dynamic and uncertain environments, such as traffic, weather, and human factors.
● Ensuring the accuracy and stability of the vehicle’s localization and navigation system.
Research Objectives
● To design and implement an image acquisition and processing system that can capture and analyze images of the environment from different sensors, i.e., cameras.
● To extract and fuse useful features and information from the images, such as terrain, obstacles, lanes, traffic, etc., using advanced computer vision and machine learning techniques.
● To formulate and solve the path planning problem as an optimization problem, considering various constraints and objectives such as safety, speed, and fuel consumption.
● To obtain quantitative measures demonstrating that the model performs better than existing models.
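
To make the "path planning as an optimization problem" objective concrete, here is a minimal sketch that finds the cheapest route across a small cost grid with Dijkstra's algorithm (the algorithm named in literature review entry 11). The slides do not specify a planner, so the grid representation, the cost values, and the function name `plan_path` are all illustrative assumptions, not the project's actual method.

```python
import heapq

def plan_path(grid, start, goal):
    """Return (total_cost, path) for the cheapest 4-connected path.

    grid[r][c] is the (hypothetical) cost of entering cell (r, c);
    None marks an obstacle. Returns (inf, []) when no path exists.
    """
    rows, cols = len(grid), len(grid[0])
    best = {start: 0.0}          # cheapest known cost to reach each cell
    parent = {}                  # back-pointers used to rebuild the path
    frontier = [(0.0, start)]    # priority queue ordered by cost so far
    while frontier:
        cost, cell = heapq.heappop(frontier)
        if cell == goal:
            path = [cell]
            while cell in parent:
                cell = parent[cell]
                path.append(cell)
            return cost, path[::-1]
        if cost > best.get(cell, float("inf")):
            continue  # stale queue entry; a cheaper route was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] is not None:
                new_cost = cost + grid[nr][nc]
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    parent[(nr, nc)] = cell
                    heapq.heappush(frontier, (new_cost, (nr, nc)))
    return float("inf"), []

# Illustrative grid: higher numbers stand for less desirable cells
# (for example slower or less safe), None for obstacles.
X = None
demo_grid = [
    [1, 1, 1, 1],
    [2, X, X, 1],
    [1, 1, 5, 1],
    [X, 1, 1, 1],
]
demo_cost, demo_path = plan_path(demo_grid, (0, 0), (3, 3))
```

Objectives such as safety, speed, and fuel consumption could be folded into the per-cell cost as a weighted sum; how to weight them is left open here because the slides do not state it.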
METHODOLOGY
MODULES
● ROAD SEGMENTATION
● OBJECT DETECTION
● 3D VISUALIZATION & PATH PLANNING
DATA COLLECTION DESCRIPTION
● The data to be processed can be collected from cameras attached to vehicles, capturing high-quality images and videos while driving.
● Various such datasets are available.
● Dataset used: KITTI Vision Benchmark Suite.
● It is a dataset for road and lane segmentation using images captured by cameras mounted on a vehicle.
● The dataset consists of 289 training and 290 test images, with 3 different categories of road scenes: urban unmarked, urban marked, and urban multiple marked lanes.
Proposed System
● The project will integrate road segmentation, object detection, and 3D visualization into a comprehensive system that enables vehicles to navigate complex environments with precision, safety, and efficiency.
● The proposed system first preprocesses the data to improve its quality and thereby the results.
● The initial stage is image segmentation, in which the lanes are separated from the rest of the environment.
● Second, all the objects captured in the frame are detected.
● Finally, the traffic and other objects are visualized in 3D so that the system can generate efficient results that can assist vehicles in self-driving.
Fig 1.1 Proposed Architecture
MODULES
● ROAD SEGMENTATION
● 2D OBJECT DETECTION
● OBJECT TRACKING
● MULTITASK LEARNING
● 3D OBJECT DETECTION
MODULES - DESCRIPTION
ROAD SEGMENTATION
● A Fully Convolutional Network (FCN) takes the input image and produces an output that differentiates the road from the rest of the environment.
● Here, predictions derived from the last 3 pooling stages are upsampled and combined.
● Finally, each pixel is classified into one of 2 categories, i.e., road or not road.
Fig 1.2 FCN8 Architecture
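
The final two-category decision can be sketched as follows: given one score map per class, each pixel takes the label with the higher score. The score maps here are hand-written stand-ins for network output, and `road_mask` and `mask_iou` are hypothetical helper names. Mask IoU is shown only as one common way to compare a predicted mask with a true mask; the slides do not name the metric used.

```python
def road_mask(road_scores, other_scores):
    """Label each pixel 1 (road) when its road score is higher, else 0."""
    return [
        [1 if road > other else 0 for road, other in zip(road_row, other_row)]
        for road_row, other_row in zip(road_scores, other_scores)
    ]

def mask_iou(predicted, truth):
    """Intersection over union of the road pixels in two binary masks."""
    intersection = union = 0
    for pred_row, true_row in zip(predicted, truth):
        for pred, true in zip(pred_row, true_row):
            intersection += pred & true   # both masks mark the pixel as road
            union += pred | true          # at least one mask marks it as road
    return intersection / union if union else 1.0

# Tiny 2x2 example with made-up scores.
demo_road = [[0.9, 0.2], [0.6, 0.4]]
demo_other = [[0.1, 0.8], [0.4, 0.6]]
demo_mask = road_mask(demo_road, demo_other)
```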
MODULES - DESCRIPTION
ROAD SEGMENTATION
● Upscaling is an important part of the FCN model; it is carried out using bilinear interpolation, which upscales the frames for better results.
● Bilinear interpolation estimates the value of a signal or pixel at a non-integer position between neighboring points (the four nearest pixels in a 2D image).
● The weight of each neighboring point is calculated from its distance to the target position, and the weighted sum gives the value of the pixel.
Fig 1.3 Mathematical example of bilinear interpolation
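
The interpolation described above is standard bilinear interpolation: the value at a fractional position is a distance-weighted sum of the four surrounding pixels. A minimal pure-Python sketch follows. The function names and the corner-aligned coordinate mapping in `upscale` are assumptions made for illustration; image libraries differ in how they map output pixels to input coordinates.

```python
def bilinear_sample(image, x, y):
    """Sample a grayscale image (list of rows) at fractional column x, row y.

    Assumes 0 <= x <= width - 1 and 0 <= y <= height - 1.
    """
    height, width = len(image), len(image[0])
    x0, y0 = int(x), int(y)                    # top-left neighbor
    x1 = min(x0 + 1, width - 1)                # clamp at the right edge
    y1 = min(y0 + 1, height - 1)               # clamp at the bottom edge
    fx, fy = x - x0, y - y0                    # fractional offsets in [0, 1)
    # Interpolate along x on the two rows, then along y between them.
    top = (1 - fx) * image[y0][x0] + fx * image[y0][x1]
    bottom = (1 - fx) * image[y1][x0] + fx * image[y1][x1]
    return (1 - fy) * top + fy * bottom

def upscale(image, factor):
    """Upscale by an integer factor >= 2, keeping the corner pixels fixed."""
    height, width = len(image), len(image[0])
    out_h, out_w = height * factor, width * factor
    return [
        [
            bilinear_sample(
                image,
                col * (width - 1) / (out_w - 1),
                row * (height - 1) / (out_h - 1),
            )
            for col in range(out_w)
        ]
        for row in range(out_h)
    ]

demo_image = [[0, 10], [20, 30]]
demo_center = bilinear_sample(demo_image, 0.5, 0.5)   # equal weight on all four pixels
demo_upscaled = upscale(demo_image, 2)                # 4x4 result
```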
MODULES - DESCRIPTION
● 2D OBJECT DETECTION
○ YOLO (You Only Look Once) is a state-of-the-art object detection algorithm that simultaneously predicts bounding boxes and class probabilities for multiple objects in an image.
○ Here it can be applied to detect objects on the road such as vehicles, signs, boards, pedestrians, and others.
Fig 1.4 YOLO example
MODULES - DESCRIPTION
● OBJECT TRACKING
● Object tracking is crucial for analyzing the movement of objects over time, i.e., optical flow.
● Deep SORT (Simple Online and Realtime Tracking with a Deep Association Metric) is a tracking algorithm that combines deep learning with traditional tracking methods to achieve robust and accurate object tracking.
● It consists of 3 components:
○ Bounding box prediction (done using YOLO)
○ The Kalman filter: predicts the state of an object from its previous position and velocity (linear approximation)
○ IOU matching: IOU measures the degree of overlap between two regions, i.e., bounding boxes or segmentation masks. The IOU score needs to be maximized for better results.
Fig 1.5 Kalman filter
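
The Kalman prediction and IOU matching components listed above can be sketched in a few lines. This is a deliberately reduced illustration, not Deep SORT itself: Deep SORT tracks an eight-dimensional box state and also uses an appearance metric, while the sketch predicts a single coordinate with a constant-velocity model and matches by IOU alone. The function names, the process-noise value, and the `min_iou` threshold are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0

def kalman_predict(position, velocity, cov, dt=1.0, process_noise=0.01):
    """Predict step of a constant-velocity Kalman filter for one coordinate.

    cov is the 2x2 covariance [[pp, pv], [vp, vv]] of (position, velocity).
    """
    # State transition x' = F x with F = [[1, dt], [0, 1]].
    new_position = position + dt * velocity
    (pp, pv), (vp, vv) = cov
    # Covariance P' = F P F^T + Q, with Q = process_noise * I (an assumption).
    new_cov = [
        [pp + dt * (pv + vp) + dt * dt * vv + process_noise, pv + dt * vv],
        [vp + dt * vv, vv + process_noise],
    ]
    return new_position, velocity, new_cov

def best_match(predicted_box, detections, min_iou=0.3):
    """Index of the detection that overlaps the predicted box most, or None."""
    scores = [iou(predicted_box, det) for det in detections]
    if not scores or max(scores) < min_iou:
        return None
    return scores.index(max(scores))
```

In a full tracker, the predicted state of each track would be turned back into a box and compared against the detector's boxes for the new frame; a one-to-one assignment across all tracks would replace the per-track `best_match` used here.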
MODULES - DESCRIPTION
● MULTITASK LEARNING
● A multi-task attention network will be used to integrate the work of object detection, object classification, and object segmentation.
Fig 1.5 Multitask Attention Network
MODULES - DESCRIPTION
● 3D OBJECT DETECTION
● SFA 3D (Super Fast and Accurate 3D Object Detection) will be employed to add a further dimension of understanding of the surrounding environment.
● It can be divided into 3 steps:
○ Keypoint FPN - feature extraction and feature maps
○ Calculating losses - Focal, L1, Balanced L1
○ Learning rate scheduling - Cosine Annealing
● Moreover, this project will incorporate UNetXST to perform camera to bird's eye view transformation, providing a top-down view of the scene for enhanced perception and path planning.
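
Two of the training ingredients named in the steps above, cosine annealing and the focal and L1 losses, have compact standard forms that can be sketched directly. The code below uses the textbook definitions with commonly used default values for `gamma` and `alpha`; the exact variants and hyperparameters used by SFA 3D are not given in the slides and may differ. Balanced L1 is omitted.

```python
import math

def cosine_annealing_lr(step, total_steps, lr_max, lr_min=0.0):
    """Learning rate that decays from lr_max to lr_min along half a cosine wave."""
    cos_term = math.cos(math.pi * step / total_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + cos_term)

def focal_loss(p, target, gamma=2.0, alpha=0.25):
    """Binary focal loss for one predicted probability p in (0, 1).

    target is 1 for the positive class and 0 otherwise. The (1 - p_t) ** gamma
    factor shrinks the loss of well-classified examples.
    """
    p_t = p if target == 1 else 1 - p
    alpha_t = alpha if target == 1 else 1 - alpha
    return -alpha_t * (1 - p_t) ** gamma * math.log(p_t)

def l1_loss(prediction, target):
    """Absolute error between a regressed value and its target."""
    return abs(prediction - target)

# Hypothetical schedule: 100 steps starting at a learning rate of 0.1.
demo_schedule = [cosine_annealing_lr(step, 100, 0.1) for step in (0, 50, 100)]
```

With `gamma=0` and `alpha=0.5`, `focal_loss` reduces to half the ordinary cross-entropy, which is a convenient sanity check.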
Further Work
● Integrating all the modules into a fully functioning program that can assist automated vehicles in self-driving.
● With lane segmentation done, the other modules are yet to be implemented.
● Completing the research paper component along with the implementation component.
Project Status
Lane Segmentation
Applied bilinear interpolation and then FCN8: achieved the true mask, and working to achieve a better predicted mask.
Fig 1.5 Lane Segmentation
Research Paper Status
● Completed the write-up for the abstract, introduction, and motivation of the work.
● Completed the literature review.
● Defined the algorithm and architecture to some extent, but they need to be completed in depth.
● Started the coding part, but the implementation must be 100% complete in order to get quantitative results.
Guide Approval
References
[1] Y. Xiao, L. Daniel and M. Gashinova, "Image Segmentation and Region Classification in Automotive High-Resolution Radar Imagery," in IEEE Sensors Journal, vol. 21, no. 5, pp. 6698-6711, 1 March 2021, doi: 10.1109/JSEN.2020.3043586.
[2] E. Shelhamer, J. Long and T. Darrell, "Fully Convolutional Networks for Semantic Segmentation," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 4, pp. 640-651, 1 April 2017, doi: 10.1109/TPAMI.2016.2572683.
[3] Z. Yue, F. Gao, Q. Xiong, J. Wang, A. Hussain and H. Zhou, "A Novel Attention Fully Convolutional Network Method for Synthetic Aperture Radar Image Segmentation," in IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 13, pp. 4585-4598, 2020, doi: 10.1109/JSTARS.2020.3016064.
[4] D.-H. Paek, S.-H. Kong and K. T. Wijaya, "K-Radar: 4D Radar Object Detection for Autonomous Driving in Various Weather Conditions," in Advances in Neural Information Processing Systems, 2022.
[5] M. Ye, D. Lyu and G. Chen, "Scale-Iterative Upscaling Network for Image Deblurring," in IEEE Access, vol. 8, pp. 18316-18325, 2020, doi: 10.1109/ACCESS.2020.2967823.
[6] S. Song, Z. Jia, J. Yang and N. K. Kasabov, "A Fast Image Segmentation Algorithm Based on Saliency Map and Neutrosophic Set Theory," in IEEE Photonics Journal, vol. 12, no. 5, pp. 1-16, Oct. 2020, Art no. 3901016, doi: 10.1109/JPHOT.2020.3026973.
[7] C. Dewi, R.-C. Chen, Y.-T. Liu, X. Jiang and K. D. Hartomo, "Yolo V4 for Advanced Traffic Sign Recognition With Synthetic Training Data Generated by Various GAN," in IEEE Access, vol. 9, pp. 97228-97242, 2021, doi: 10.1109/ACCESS.2021.3094201.
[8] J. Redmon, S. Divvala, R. Girshick and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 2016, pp. 779-788, doi: 10.1109/CVPR.2016.91.
[9] C.-J. Lin and J.-Y. Jhang, "Intelligent Traffic-Monitoring System Based on YOLO and Convolutional Fuzzy Neural Networks," in IEEE Access, vol. 10, pp. 14120-14133, 2022, doi: 10.1109/ACCESS.2022.3147866.
[10] T. Brophy et al., "A Review of the Impact of Rain on Camera-Based Perception in Automated Driving Systems," in IEEE Access, vol. 11, pp. 67040-67057, 2023, doi: 10.1109/ACCESS.2023.3290143.
[11] Hoang, VD., Jo, KH. Path planning for autonomous vehicle based on heuristic searching using online images. Vietnam J Comput Sci 2, 109–120 (2015).
[12] F. Yu and Z. Lu, "Road Traffic Marking Extraction Algorithm Based on Fusion of Single Frame Image and Sparse Point Cloud," in IEEE Access, vol. 11, pp. 88881-88894, 2023, doi: 10.1109/ACCESS.2023.3306423.
[13] S. Liu, E. Johns and A. J. Davison, "End-To-End Multi-Task Learning With Attention," 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 2019, pp. 1871-1880, doi: 10.1109/CVPR.2019.00197.
[14] MARKER BASED WATERSHED TRANSFORMATION FOR IMAGE SEGMENTATION, Aman Kumar Sharma and Anju Bala, 2013, CorpusID:44245854
[15] N. Wojke, A. Bewley and D. Paulus, "Simple online and realtime tracking with a deep association metric," 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 2017, pp. 3645-3649, doi: 10.1109/ICIP.2017.8296962.