Uploaded by Edward Enriquez

YOLOv5 Performance Evaluation on Different Architectures

Performance Evaluation using YOLOv5
ECE 4300 Computer Engineering
Team members:
Diego Ramirez Pimienta,
Jonathan Rosas,
Peter Gabradilla,
Edward Enriquez
Overview
● Objective
● Methodology
● YOLOv5
● COCO Dataset
● Profiling Process
● Data
● Analysis
● Conclusion
Objective
● The motivation of this study was to examine how different computer architectures affect the performance of object detection models.
● Provide insights for selecting optimal hardware based on performance, accuracy, and resources.
● YOLOv5 was selected as the computer vision testbench.
● Metrics: single-precision GFLOPs, execution time, IPC, and frames per second of the YOLOv5 model.
Methodology
● Hardware Specifications and Configurations
    ○ AMD Ryzen 5 3500U with Radeon Vega Mobile Gfx 2.10 GHz
    ○ AMD Ryzen 7 5825U with Radeon Graphics 2.00 GHz
    ○ AMD Ryzen 7 5800H with Radeon Graphics 3.20 GHz
    ○ Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz 2.59 GHz
● YOLO and Dataset
    ○ YOLOv5 algorithm by Ultralytics
    ○ COCO dataset with 300 images for profiling
● Profiling Tools
    ○ Intel VTune profiler for Intel Core architecture
    ○ AMDuProf profiler for AMD processors
YOLOv5
● You Only Look Once (YOLO) is an open-source object detection model. We used it as the workload for testing computer performance because processing images with it requires significant computing power.
● YOLOv5 was installed on 4 different systems using the same steps and the same Ultralytics GitHub repository.
COCO Dataset
● The COCO (Common Objects in Context) dataset is a large-scale image dataset for object detection, segmentation, and captioning tasks.
● It is designed to represent a wide range of object categories commonly encountered in real-world contexts.
● Out of the 200,000 images in this dataset, 300 were selected as our standardized subset to test YOLOv5.
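The slides do not say how the 300 images were chosen. One way to make such a subset repeatable across all four machines is to sample with a fixed random seed. The sketch below assumes a local folder of COCO JPEG files; the helper name `select_subset`, the seed, and the folder layout are hypothetical and not taken from the original study.

```python
import random
import shutil
from pathlib import Path


def select_subset(image_dir, out_dir, count=300, seed=0):
    """Copy a reproducible random sample of JPEG images into out_dir.

    Hypothetical helper: the source does not describe its selection method.
    """
    # Sort first so the same seed yields the same sample on every machine.
    images = sorted(Path(image_dir).glob("*.jpg"))
    if len(images) < count:
        raise ValueError(f"need {count} images, found {len(images)}")

    chosen = random.Random(seed).sample(images, count)

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for src in chosen:
        shutil.copy2(src, out / src.name)

    # Return the selected file names so the subset can be recorded.
    return sorted(p.name for p in chosen)
```

Calling `select_subset` with the same seed on each system would give every processor the same 300 files, which keeps the workload identical across runs.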
Profiling Process
● YOLO Installation
    ○ Clone yolov5-master from the official YOLOv5 GitHub repo
    ○ Install Python
    ○ Install the YOLOv5 environment
● Intel Process
    ○ Access the yolov5-master directory in a command prompt
    ○ Obtain the process ID and start a profiling session
    ○ Run the YOLOv5 detection script with the 300-image COCO subset folder
    ○ Obtain the results printed in the administrator command prompt
● AMD Process
    ○ Access the yolov5-master directory and start a command prompt
    ○ Run the YOLOv5 detection script with the 300-image COCO subset folder
    ○ Look for the running Python process in the AMDuProf profiler
    ○ Obtain the results in AMDuProf after the run over the 300-image COCO subset finishes
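The "run the detection script" and "obtain the process ID" steps above can be sketched in Python. This is a minimal sketch under stated assumptions: `detect.py` with its `--weights` and `--source` options comes from the Ultralytics repository, but the weights file (`yolov5s.pt`), the folder name `coco_subset`, and the helper names `build_detect_command` and `launch` are illustrative choices, since the slides do not state which model size or paths were used.

```python
import subprocess
import sys


def build_detect_command(image_dir, weights="yolov5s.pt"):
    """Build the command line for detect.py from the Ultralytics YOLOv5 repo.

    The weights file is an assumption; the source does not name the model size.
    """
    return [sys.executable, "detect.py", "--weights", weights, "--source", str(image_dir)]


def launch(command, cwd=None):
    """Start the command and return its process handle.

    The printed PID is what a profiler would be pointed at when attaching
    to an already running process.
    """
    proc = subprocess.Popen(command, cwd=cwd)
    print(f"Started PID {proc.pid}")
    return proc
```

A run from the repository folder would then look like `launch(build_detect_command("coco_subset"), cwd="yolov5-master").wait()`. Attaching after launch misses whatever work happens before the profiler connects, so the measured window should be treated as approximate.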
Data
● [Figures: profiler output for each processor: AMD Ryzen 5 3500U with Radeon Vega Mobile Gfx 2.10 GHz; AMD Ryzen 7 5825U with Radeon Graphics 2.00 GHz; AMD Ryzen 7 5800H with Radeon Graphics 3.20 GHz; Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz 2.59 GHz]
Frames Per Second Measurement
● [Figures: frames-per-second measurement for each of the same four processors]
Processors List
Summary of results
Processor   SP GFLOPs   Execution Time   IPC     CPIavg   Time Per Frame
P1          41.298      86 sec           0.895   1.11     244 ms
P2          44          43 sec           0.82    1.22     123 ms
P3          42.545      120 sec          0.99    1.01     340 ms
P4          43.192      62 sec           0.70    1.43     195 ms
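Several columns in the summary are related by simple arithmetic: CPI is the reciprocal of IPC, and frames per second is the reciprocal of the per-frame time. The sketch below recomputes these from the reported values as a consistency check. It assumes the execution time covers all 300 images; the names `ROWS`, `cpi`, and `fps` are illustrative.

```python
FRAMES = 300  # images in the COCO subset

# (processor, IPC, execution time in s, reported per-frame time in ms),
# copied from the summary of results.
ROWS = [
    ("P1", 0.895, 86, 244),
    ("P2", 0.82, 43, 123),
    ("P3", 0.99, 120, 340),
    ("P4", 0.70, 62, 195),
]


def cpi(ipc):
    """Cycles per instruction is the reciprocal of instructions per cycle."""
    return 1.0 / ipc


def fps(ms_per_frame):
    """Frames per second from a per-frame time in milliseconds."""
    return 1000.0 / ms_per_frame


for name, ipc, exec_s, ms in ROWS:
    wall_ms = 1000.0 * exec_s / FRAMES  # whole-run time spread over all images
    print(f"{name}: CPI={cpi(ipc):.2f}  FPS={fps(ms):.2f}  wall-clock ms/image={wall_ms:.0f}")
```

The recomputed CPI values agree with the reported ones to within rounding (P1 comes out at 1.12 rather than 1.11). The wall-clock time per image is somewhat higher than the reported per-frame time for every processor, possibly because the execution time also includes start-up work outside the per-frame measurement.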
Analysis
● The summary of results indicates that Processor 3 obtained the best cycles-per-instruction figure (CPI 1.01, IPC 0.99); however, Processor 2 had the best execution time (43 sec) as well as the best per-frame time (123 ms).
● Comparing processor manufacturers, there was no significant difference between the AMD and Intel architectures. The sole Intel processor’s performance was average compared to the AMD processors, only notably being outperformed in GFLOPs.
Conclusion
● The experiment showed results that matched our hypothesis with regard to which processor would be the best performing.
● Possible Source of Errors
    ○ Manual start and end timing of the process could introduce some margin of error
● Future Work and Improvements
    ○ Explore other profiling tools and compare capabilities
    ○ Fine-tune profiling settings for more accurate performance analysis
    ○ Consider real-time performance analysis and algorithm performance in real-world scenarios
    ○ Utilize resource utilization tools for optimizing memory usage and parallelizing computations
    ○ Diversify results by examining other architecture properties or different image datasets
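The manual-timing error noted above could be reduced by measuring the run in code rather than by hand. The following is a minimal sketch, not the method used in the study: it wraps any command in a monotonic timer and derives a per-frame figure. The helper name `timed_run` is hypothetical.

```python
import subprocess
import sys
import time


def timed_run(command, frames, cwd=None):
    """Run a command to completion and return (elapsed seconds, ms per frame).

    The elapsed time covers the whole process, including start-up, so the
    per-frame figure is an upper bound on pure inference time.
    """
    start = time.perf_counter()
    subprocess.run(command, cwd=cwd, check=True)
    elapsed = time.perf_counter() - start
    return elapsed, 1000.0 * elapsed / frames
```

For example, `timed_run([sys.executable, "detect.py", "--source", "coco_subset"], frames=300, cwd="yolov5-master")` would time one pass over the subset, where the folder names are placeholders. Because `time.perf_counter` is monotonic and started immediately before the process launches, the result does not depend on how quickly a person starts and stops a stopwatch.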
References
● https://github.com/ultralytics/yolov5
● https://wandb.ai/onlineinference/YOLO/reports/YOLOv5-ObjectDetection-on-Windows-Step-By-Step-Tutorial---VmlldzoxMDQwNzk4
● https://www.amd.com/en/developer/uprof.html
● https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html