Performance Evaluation using YOLOv5
ECE 4300 Computer Engineering
Team members: Diego Ramirez Pimienta, Jonathan Rosas, Peter Gabradilla, Edward Enriquez

Overview
● Objective
● Methodology
● YOLOv5
● COCO Dataset
● Profiling Process
● Data
● Analysis
● Conclusion

Objective
● This study examines how different computer architectures affect the performance of object detection models.
● Provide insights for selecting optimal hardware based on performance, accuracy, and resources
● YOLOv5 selected as the computer vision testbench
● Metrics: single-precision GFLOPs, execution time, IPC, and frames per second of the YOLOv5 model

Methodology
● Hardware Specifications and Configurations
○ AMD Ryzen 5 3500U with Radeon Vega Mobile Gfx 2.10 GHz
○ AMD Ryzen 7 5825U with Radeon Graphics 2.00 GHz
○ AMD Ryzen 7 5800H with Radeon Graphics 3.20 GHz
○ Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz 2.59 GHz
● YOLO and Dataset
○ YOLOv5 algorithm by Ultralytics
○ COCO dataset with 300 images for profiling
● Profiling Tools
○ Intel VTune profiler for Intel Core architecture
○ AMDuProf profiler for AMD processors

YOLOv5
● You Only Look Once (YOLO) is an open-source AI image recognition model. It served as our test program because processing images with it requires significant computing power.
● YOLOv5 was installed on 4 different systems using the same steps and the same Ultralytics GitHub repository.

COCO Dataset
● The COCO (Common Objects in Context) dataset is a large-scale image dataset for object detection, segmentation, and captioning tasks.
● It is designed to represent a wide range of object categories commonly encountered in real-world contexts.
● Out of the more than 200,000 labeled images in this dataset, 300 were selected as our standardized subset to test YOLOv5

Profiling Process
● YOLO Installation
○ Cloning yolov5-master from the official GitHub yolov5 repo
○ Installing Python
○ Installing the YOLOv5 environment
● Intel Process
○ Access the yolov5-master directory in the command prompt
○ Obtain the process ID and start a profiling session
○ Run the YOLOv5 detection script on the 300-image COCO subset folder
○ Obtain the results printed in the administrator command prompt
● AMD Process
○ Access the yolov5-master directory and start a command prompt
○ Run the YOLOv5 detection script on the 300-image COCO subset folder
○ Look for the running python process in the AMDuProf profiler
○ Obtain the results in AMDuProf after the 300-image COCO subset has been processed

Data
● Processor with AMD Ryzen 5 3500U with Radeon Vega Mobile Gfx 2.10 GHz

Data
● Processor with AMD Ryzen 7 5825U with Radeon Graphics 2.00 GHz

Data
● Processor with AMD Ryzen 7 5800H with Radeon Graphics 3.20 GHz

Data
● Processor with Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz 2.59 GHz

Frames Per Second Measurement
● Processor with AMD Ryzen 5 3500U with Radeon Vega Mobile Gfx 2.10 GHz

Frames Per Second Measurement
● Processor with AMD Ryzen 7 5825U with Radeon Graphics 2.00 GHz

Frames Per Second Measurement
● Processor with AMD Ryzen 7 5800H with Radeon Graphics 3.20 GHz

Frames Per Second Measurement
● Processor with Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz 2.59 GHz

Processors List

Summary of results
Processor   SP GFLOPs   Execution Time   IPC     CPIavg   Time per Frame
P1          41.298      86 sec           0.895   1.11     244 ms
P2          44          43 sec           0.82    1.22     123 ms
P3          42.545      120 sec          0.99    1.01     340 ms
P4          43.192      62 sec           0.70    1.43     195 ms

Analysis
● The table above shows that Processor 3 achieved the best (lowest) cycles per instruction, while Processor 2 had both the best execution time and the best time per frame, and therefore the highest frames per second.
● Comparing processor manufacturers, there was no significant difference between the AMD and Intel architectures. The sole Intel processor's performance was average compared to the AMD processors, only notably being outperformed in GFLOPs.
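
The derived columns in the summary can be cross-checked with a few lines of Python. The sketch below is only an illustration, not the tooling used in the study: it assumes CPIavg is the reciprocal of IPC and that frames per second is the reciprocal of the per-frame time, and the `Result` class and its field names are hypothetical. The numbers are copied from the summary of results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """One row of the summary table (field names are illustrative)."""
    name: str
    sp_gflops: float
    exec_time_s: float
    ipc: float
    ms_per_frame: float

    @property
    def cpi(self) -> float:
        # Assumption: average CPI is simply the reciprocal of IPC.
        return 1.0 / self.ipc

    @property
    def fps(self) -> float:
        # Assumption: frames per second is 1000 ms divided by ms per frame.
        return 1000.0 / self.ms_per_frame


# Values copied from the summary of results.
RESULTS = [
    Result("P1", 41.298, 86, 0.895, 244),
    Result("P2", 44.0, 43, 0.82, 123),
    Result("P3", 42.545, 120, 0.99, 340),
    Result("P4", 43.192, 62, 0.70, 195),
]

if __name__ == "__main__":
    for r in RESULTS:
        print(f"{r.name}: CPI={r.cpi:.2f}  FPS={r.fps:.2f}")
    # Lowest CPI and highest FPS identify the "best" processors per metric.
    print("Lowest CPI:", min(RESULTS, key=lambda r: r.cpi).name)
    print("Highest FPS:", max(RESULTS, key=lambda r: r.fps).name)
```

Under these assumptions the ranking agrees with the analysis: P3 has the lowest CPI and P2 the highest frame rate. The recomputed CPI values match the table to within 0.01.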
Conclusion
● The experimental results matched our hypothesis regarding which processor would perform best.
● Possible Sources of Error
○ Manually timing the start and end of the process could introduce some margin of error
● Future Work and Improvements
○ Explore other profiling tools and compare capabilities
○ Fine-tune profiling settings for more accurate performance analysis
○ Consider real-time performance analysis and algorithm performance in real-world scenarios
○ Utilize resource utilization tools for optimizing memory usage and parallelizing computations
○ Diversify results by examining other architecture properties or different image datasets

References
● https://github.com/ultralytics/yolov5
● https://wandb.ai/onlineinference/YOLO/reports/YOLOv5-ObjectDetection-on-Windows-Step-By-Step-Tutorial---VmlldzoxMDQwNzk4
● https://www.amd.com/en/developer/uprof.html
● https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html
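
One way to reduce the manual start/stop timing error listed under the possible sources of error would be to time the run programmatically. The following is a minimal sketch under stated assumptions, not the procedure used in this study: `timed_run` is a hypothetical helper, and the demo times a stand-in command rather than the real detection script.

```python
import subprocess
import sys
import time


def timed_run(command, num_frames):
    """Run `command` to completion and return (elapsed seconds, ms per frame).

    Hypothetical helper: measures wall-clock time around the whole process,
    so startup and model loading are included in the result.
    """
    start = time.perf_counter()
    subprocess.run(command, check=True)  # raises if the command fails
    elapsed = time.perf_counter() - start
    return elapsed, elapsed / num_frames * 1000.0


if __name__ == "__main__":
    # Stand-in workload so the sketch runs anywhere. For a real measurement,
    # the command would instead invoke the YOLOv5 detection script, e.g.
    # [sys.executable, "detect.py", "--source", "<image folder>"], where the
    # image folder is a placeholder for the 300-image COCO subset.
    demo_command = [sys.executable, "-c", "print('stand-in workload')"]
    elapsed_s, ms_per_frame = timed_run(demo_command, num_frames=300)
    print(f"elapsed: {elapsed_s:.3f} s, {ms_per_frame:.3f} ms per frame")
```

Because this measures the entire process, the per-frame figure it produces would include one-time costs such as loading the model, so it is not directly comparable to per-image inference times reported by the detection script itself.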