Determination of Path Planning using Image Processing Dave Vishv Divyang 20BCE1472 Guide - Dr. S Rajkumar 1 Introduction ● ● ● ● The rise of automated vehicles marks a significant shift in the automotive landscape, requiring specialized systems for seamless navigation. Path planning is a crucial task for autonomous vehicles, as it determines the optimal route to reach a desired destination while avoiding obstacles and complying with traffic rules. Image processing is a powerful technique that can help extract useful information from the visual data captured by cameras mounted on the vehicle. This project centers on developing intelligent algorithms using image processing, machine learning, and computer vision to create an efficient path planning system for real-time decision-making. 2 Literature Review S. No Paper Title Summary Algorithm Used Results 1 Y. Xiao, L. Daniel and M. Gashinova, "Image Segmentation and Region Classification in Automotive High-Resolution Radar Imagery," in IEEE Sensors Journal, vol. 21, no. 5, pp. 6698-6711, 1 March1, 2021, doi: 10.1109/JSEN.2020.3043586. Proposed a method of automatic segmentation of automotive radar images 1)Unsupervised image presegmentation using marker-based watershed transformation 2)supervised segmentation and classification of regions containing objects and surfaces Good performance of the proposed algorithm on single (standalone) radar image frames based on F1 and JSC score. 2 E. Shelhamer, J. Long and T. Darrell, "Fully Convolutional Networks for Semantic Segmentation," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 4, pp. 640-651, 1 April 2017, doi: 10.1109/TPAMI.2016.2572683. Takes input of arbitrary size and produce correspondingly-sized output with efficient inference and learning define a novel architecture to combine deep, coarse, semantic information and shallow, fine, appearance information 1)AlexNet, 2) VGG net and 3) GoogLeNet Tested FCN on semantic segmentation and scene parsing, exploring PASCAL VOC, NYUDv2, and SIFT Flow 3 Z. Yue, F. Gao, Q. Xiong, J. Wang, A. Hussain and H. Zhou, "A Novel Attention Fully Convolutional Network Method for Synthetic Aperture Radar Image Segmentation," in IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 13, pp. 45854598, 2020, doi: 10.1109/JSTARS.2020.3016064. Employed CNN in the pixel-wise image segmentation tasks where pixels are predicted with labelsfully convolutional network (FCN) attention fully convolutional network (AFCN) multiscale attention network (MANet) Adopted the fully connected CRF to capture the spatial information in the SAR images 3 feature optimization strategies: Multiscale feature, channel attention, and spatial attention 3 extraction Literature Review S. No Paper Title Summary Algorithm Used Results 4 Paek, Dong-Hee and KONG, SEUNG-HYUN and Wijaya, Kevin Tirta, Advances in Neural Information Processing Systems, K-Radar: 4D Radar Object Detection for Autonomous Driving in Various Weather Conditions, 2022 Introduced (K-Radar)- a novel object detection dataset and contains 35K frames of 4DRT data with power measurements along the Doppler, range, azimuth, and elevation dimensions with 3D bounding box labels of objects on the roads 1)K-Radar, for 3D object detection 2)3D object detection baseline NN that directly consumes 4DRT K-Radar provides 3D bounding box labels and tracking ID for 93.3K objects of five classes with distance of up to 120 m 5 M. Ye, D. Lyu and G. Chen, "Scale-Iterative Upscaling Network for Image Deblurring," in IEEE Access, vol. 8, pp. 18316-18325, 2020, doi: 10.1109/ACCESS.2020.2967823 Used scale-iterative upscaling network (SIUN) that restores sharp images in an iterative manner Apploed super-resolution structure instead of the upsampling layer between two consecutive scales to restore a detailed image 1)Modified RDN, combined with a UNet 2)Upscaling network scale-iterative structure method can produce better results on both benchmark datasets and real-world blurred images, compared with both traditional and learning-based methods 6 S. Song, Z. Jia, J. Yang and N. K. Kasabov, "A Fast Image Segmentation Algorithm Based on Saliency Map and Neutrosophic Set Theory," in IEEE Photonics Journal, vol. 12, no. 5, pp. 1-16, Oct. 2020, Art no. 3901016, doi: 10.1109/JPHOT.2020.3026973.. A fast image segmentation algorithm combining the saliency map with NS theory to obtain a more accurate image segmentation. Algorithm overcame the defects of under and over segmentation 1)saliency map 2)neutrosophic set model SMNS model was able to de-noise to obtain better segmentation results of noise image. SMNS model is very fast 4 Literature Review S. No Paper Title Summary Algorithm Used Results 7 C. Dewi, R. -C. Chen, Y. -T. Liu, X. Jiang and K. D. Hartomo, "Yolo V4 for Advanced Traffic Sign Recognition With Synthetic Training Data Generated by Various GAN," in IEEE Access, vol. 9, pp. 97228-97242, 2021, doi: 10.1109/ACCESS.2021.3094201. Work combines synthetic images with original images to enhance datasets and verify the effectiveness of synthetic datasets Structural Similarity Index (SSIM) and Mean Square Error (MSE) were employed to assess picture quality 1)DCGAN, 2) LSGAN, 3) WGAN Highest SSIM value was achieved when using 200 total images 8 J. Redmon, S. Divvala, R. Girshick and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 2016, pp. 779788, doi: 10.1109/CVPR.2016.91. Framed object detection as a regression problem to spatially separated bounding boxes and associated class probabilities. A single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation 1) Combined R-CNN & YOLO The architecture is extremely fast. It outperforms other detection methods, including DPM and R-CNN 9 C. -J. Lin and J. -Y. Jhang, "Intelligent Traffic-Monitoring System Based on YOLO and Convolutional Fuzzy Neural Networks," in IEEE Access, vol. 10, pp. 14120-14133, 2022, doi: 10.1109/ACCESS.2022.3147866. Record traffic volume, and vehicle type information from the road using YOLO and CFNN 1)Modified YOLOv4-tiny 2)Convolutional fuzzy neural network (CFNN & Vector CFNN) proposed method achieved an accuracy of 90.45% on the Beijing Institute of Technology public dataset 5 Literature Review S. No Paper Title Summary Algorithm Used Results 10 T. Brophy et al., "A Review of the Impact of Rain on Camera-Based Perception in Automated Driving Systems," in IEEE Access, vol. 11, pp. 67040-67057, 2023, doi: 10.1109/ACCESS.2023.3290143. Framework is used to understand degree to which adverse weather conditions affect the cameras used in automated vehicles for sensing and perception. The effects of rain on each element of the model are reviewed. 1)FasterRCNN 2)YOLOv3 Subsequent effect of using raindegraded data as input for subsequent data processing is studied Rain impacts a wide variety of under-explored aspects of an autonomous vehicle environment. 11 Hoang, VD., Jo, KH. Path planning for autonomous vehicle based on heuristic searching using online images. Vietnam J Comput Sci 2, 109–120 (2015). paper proposes a method that constructs the shortest path for vehicle auto-navigation in outdoor environments. Global path for vehicle motion is self-constructed using road map and satellite images 1)Dijkstra algorithm + greedy breadth-first search algorithm It focuses on the estimation of the path in the global coordinates without using expensive commercial services. One disadvantage of this method is that the method depends on the updating road information 12 F. Yu and Z. Lu, "Road Traffic Marking Extraction Algorithm Based on Fusion of Single Frame Image and Sparse Point Cloud," in IEEE Access, vol. 11, pp. 8888188894, 2023, doi: 10.1109/ACCESS.2023.3306423. Applies the improved road traffic marking extraction algorithm of sparse point cloud to the road surface cloud with single frame image, and construct the road traffic marking extraction algorithm 1) Mask RCNN framework 2) Road traffic marking extraction algorithm fused with single frame image and point cloud Average recall rate of the algorithm was 0.841, the average accuracy was 85.4%, and the operation speed was 125.6 seconds. Its performance was superior to other algorithms compared 6 Literature Review S. No Paper Title Summary Algorithm Used Results 13 S. Liu, E. Johns and A. J. Davison, "End-To-End Multi-Task Learning With Attention," 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 2019, pp. 1871-1880, doi: 10.1109/CVPR.2019.00197. Proposed method allows for learning of task specific features from the global features, whilst simultaneously allowing for features to be shared across different tasks 1) MTAN based on VGG-16 + encoder half of SegNet The architecture is state-of-the-art in multi-task learning compared to existing methods, and is also less sensitive to weighing schemes in multitask loss function 14 MARKER BASED WATERSHED TRANSFORMATION FOR IMAGE SEGMENTATION,Aman Kumar Sharma and Anju Bala,2013, CorpusID:44245854 A new three step methodology for image segmentation using watershed transformation working firstly with pre-segmentation processing then detecting edges of image and then computing watershed transformation 1)Adaptive Histogram Equalization 2) Non Local Means Filter 3) Marker based watershed transformation The traditional watershed transformation approach has a problem of over segmentation. To solve this problem a new algorithm was proposed which gave good results when compared to traditional results. 15 N. Wojke, A. Bewley and D. Paulus, "Simple online and realtime tracking with a deep association metric," 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 2017, pp. 3645-3649, doi: 10.1109/ICIP.2017.8296962. Approach for multiple object tracking Integrate appearance information to improve the performance of SORT 1)Simple Online and Realtime Tracking (SORT) Extensions reduce the number of identity switches by 45%, achieving overall competitive performance at high frame rates 7 Problem Statement ● ● ● Achieving robust and efficient route planning is essential for safe and effective navigation in the field of autonomous robotics and unmanned vehicles, particularly in unstructured and dynamic situations. Conventional route planning techniques often depend on explicit sensor data or pre-existing maps, which may not be adequate for making decisions in real time in complicated settings. One potential way to overcome these obstacles is to include image processing methods into the route planning procedure. The principal aim of this study is to progress the domain of self-navigating systems by creating and executing an image processing-based route planning system that is vision-based. 8 Research Challenges ● ● ● ● ● Dealing with noise, occlusion, illumination, and perspective changes in the images. Fusing multiple images from different sources and perspectives to create a comprehensive map of the environment without using any sensors which are taken in use. Designing efficient and robust algorithms that can generate feasible and optimal paths for automated vehicles, considering various constraints and objectives, such as safety, speed, fuel consumption Handling dynamic and uncertain environments, such as traffic, weather, and human factors Ensuring the accuracy and stability of the vehicle’s localization and navigation system 9 Research Objectives ● ● ● ● To design and implement an image acquisition and processing system that can capture and analyze images of the environment from different sensors i.e., cameras. To extract and fuse useful features and information from the images, such as terrain, obstacles, lanes, traffic, etc., using advanced computer vision and machine learning techniques. To formulate and solve the path planning problem as an optimization problem, considering various constraints and objectives, such as safety, speed, fuel consumption Getting quantitative measures to get a better state of the model which is better than existing models. 10 METHODOLOGY 11 MODULES ● ROAD SEGMENTATION ● OBJECT DETECTION ● 3D VISUALIZATION & PATH PLANNING 12 DATA COLLECTION DESCRIPTION ● ● ● ● ● Data to be worked and processed can be collected from cameras attached to vehicles and capturing high quality images and videos while driving. Various such datasets are available. Dataset used - Kitti Vision Benchmark Suite Dataset for road and lane segmentation using images captured by cameras mounted on a vehicle. The dataset consists of 289 training and 290 test images, with 3 different categories of road scenes: urban unmarked, urban marked, and urban multiple marked lanes. 13 Proposed System 14 ● Project will seamlessly integrate road segmentation, object detection, and 3D visualization,to achieve the goal of creating comprehensive system that will empower vehicles to navigate complex environments with precision, safety, and efficiency. ● The proposed system will firstly do the preprocessing on data to enhance the quality for better results. ● Initial stage comprises of image segmentation where in the image lane detection can be done from the rest of environment ● Secondly we can the detect all the objects captured in the frame. Finally we can visualize the traffic and other objects in 3D so that system can generate efficient result which can vehicles for self driving 15 xx% Fig 1.1 Proposed Architecture 16 MODULES ● ● ● ● ● ROAD SEGMENTATION 2D OBJECT DETECTION OBJECT TRACKING MULTITASK LEARNING 3D OBJECT DETECTION 17 MODULES - DESCRIPTION ROAD SEGMENTATION ● ● ● Fully Convolutional Network will take the input image and give out output which will differentiate between road and other environment Here last 3 convolutional layers are used as pooling layers which will be upsampled Finally each pixel will be divided into 2 categories i.e., whether it is road or not Fig 1.2 FCN8 Architecture 18 MODULES - DESCRIPTION ROAD SEGMENTATION ● ● ● Upscaling is important part of FCN model and hence it is carried out using bipolar Interpolation is used for upscaling the frames for better results Used to estimate the value of a signal or pixel at a non-integer position between two neighboring points. Calculate the weights for each neighbouring point based on their distance and then weighted sum gives the value of pixel Fig 1.3 Mathematical example of bipolar interpolation 19 MODULES - DESCRIPTION ● 2D OBJECT DETECTION ○ ○ YOLO (You only look once) state-of-theart object detection algorithm that is capable of simultaneously predicting bounding boxes and class probabilities for multiple objects in an image. Here it can be applied to detect all the objects such as vehicles, signs, boards, pedestrians and other objects on road Fig 1.4 YOLO example 20 MODULES - DESCRIPTION ● OBJECT TRACKING ● ● ● Object tracking is a crucial for analysis of the movement of objects over time i.e, optical flow. Deep SORT (Simple Online and Realtime Tracking with a Deep Association Metric) - tracking algorithm that combines deep learning with traditional tracking methods to achieve robust and accurate object tracking. Classified into 3 components: ○ ○ ○ Bounding box prediction(done using YOLO) The Kalman filter - to predict the state of object wrt its previous position and velocity - linear approximation IOU matching - degree of overlap between two regions, i.e., bounding boxes or segmentation masks. Need maximize the IOU score for better results. Fig 1.5 Kalman filter 21 MODULES - DESCRIPTION ● MULTI TASK LEARNING ● Multi Task attention network will be used to integrate the work of object detection, object classification and object segmentation Fig 1.5 Multitask Attention Network 22 MODULES - DESCRIPTION ● 3D OBJECT DETECTION ● ● ● Employing SFA 3D (Super Fast and Accurate for 3D Object Detection) - for additional dimension of understanding to the surrounding environment Can be divided into 3 steps: ○ Keypoint FPN - feature extraction and feature maps ○ Calculating loses - Focal, L1, Balanced L1 ○ Learning rate Scheduling - Cosine Annealing Moreover, this project will incorporate UNetXST to perform camera to bird's eye view transformation, providing a top-down view of the scene for enhanced perception and path planning. 23 Further Work ● ● ● Implementation of all the modules integrated together to get fully functioning program which can assist automated vehicles for self driving After completing Lane Segmentation, other modules are yet to be implemented. Completing the research paper component along with the implementation component. 24 Project Status Lane Segmentation Applying bipolar interpolation and later on FCN8 - achieved true mask, working to achieve better predicted mask Fig 1.5 Lane Segmentation 25 Research Paper Status ● ● ● ● Completed the write up for abstract, introduction, motivation of work. Completed literature review. Defined the algorithm and architecture to some extent but need to complete it in depth. Started the coding part but need to complete 100% implementation in order to get quantitative results. 26 Guide Approval 27 References [1] Y. Xiao, L. Daniel and M. Gashinova, "Image Segmentation and Region Classification in Automotive High-Resolution Radar Imagery," in IEEE Sensors Journal, vol. 21, no. 5, pp. 6698-6711, 1 March1, 2021, doi: 10.1109/JSEN.2020.3043586. [2] E. Shelhamer, J. Long and T. Darrell, "Fully Convolutional Networks for Semantic Segmentation," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 4, pp. 640-651, 1 April 2017, doi: 10.1109/TPAMI.2016.2572683. [3] Z. Yue, F. Gao, Q. Xiong, J. Wang, A. Hussain and H. Zhou, "A Novel Attention Fully Convolutional Network Method for Synthetic Aperture Radar Image Segmentation," in IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 13, pp. 4585-4598, 2020, doi: 10.1109/JSTARS.2020.3016064. [4] Paek, Dong-Hee and KONG, SEUNG-HYUN and Wijaya, Kevin Tirta, Advances in Neural Information Processing Systems, K-Radar: 4D Radar Object Detection for Autonomous Driving in Various Weather Conditions,2022 [5] M. Ye, D. Lyu and G. Chen, "Scale-Iterative Upscaling Network for Image Deblurring," in IEEE Access, vol. 8, pp. 18316-18325, 2020, doi: 10.1109/ACCESS.2020.2967823. [6] S. Song, Z. Jia, J. Yang and N. K. Kasabov, "A Fast Image Segmentation Algorithm Based on Saliency Map and Neutrosophic Set Theory," in IEEE Photonics Journal, vol. 12, no. 5, pp. 1-16, Oct. 2020, Art no. 3901016, doi: 10.1109/JPHOT.2020.3026973. [7] C. Dewi, R. -C. Chen, Y. -T. Liu, X. Jiang and K. D. Hartomo, "Yolo V4 for Advanced Traffic Sign Recognition With Synthetic Training Data Generated by Various GAN," in IEEE Access, vol. 9, pp. 97228-97242, 2021, doi: 10.1109/ACCESS.2021.3094201. [8] J. Redmon, S. Divvala, R. Girshick and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 2016, pp. 779-788, doi: 10.1109/CVPR.2016.91. [9] C. -J. Lin and J. -Y. Jhang, "Intelligent Traffic-Monitoring System Based on YOLO and Convolutional Fuzzy Neural Networks," in IEEE Access, vol. 10, pp. 14120-14133, 2022, doi: 10.1109/ACCESS.2022.3147866. [10] T. Brophy et al., "A Review of the Impact of Rain on Camera-Based Perception in Automated Driving Systems," in IEEE Access, vol. 11, pp. 67040-67057, 2023, doi: 10.1109/ACCESS.2023.3290143. [11] Hoang, VD., Jo, KH. Path planning for autonomous vehicle based on heuristic searching using online images. Vietnam J Comput Sci 2, 109–120 (2015). [12] F. Yu and Z. Lu, "Road Traffic Marking Extraction Algorithm Based on Fusion of Single Frame Image and Sparse Point Cloud," in IEEE Access, vol. 11, pp. 88881-88894, 2023, doi: 10.1109/ACCESS.2023.3306423. [13] S. Liu, E. Johns and A. J. Davison, "End-To-End Multi-Task Learning With Attention," 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 2019, pp. 1871-1880, doi: 10.1109/CVPR.2019.00197. [14] MARKER BASED WATERSHED TRANSFORMATION FOR IMAGE SEGMENTATION,Aman Kumar Sharma and Anju Bala,2013, CorpusID:44245854 [15] N. Wojke, A. Bewley and D. Paulus, "Simple online and realtime tracking with a deep association metric," 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 2017, pp. 3645-3649, doi: 10.1109/ICIP.2017.8296962. 28