Literature Review & Research Approach Report
Student: Areena Haq
Supervisor: [Supervisor Name]
Degree Program: Master's in Transportation Engineering (AI)
Report Title: Fine-Grained Vehicle Classification Using Cross-Part Learning
1. Introduction
Accurate vehicle recognition is a critical component of intelligent transportation systems in
modern smart cities. It supports tasks such as law enforcement, traffic flow analysis, and
automated tolling systems. Traditional approaches to vehicle classification often struggle in
real-world scenarios because the task is fine-grained: vehicles of different makes and models
can look nearly identical (high inter-class similarity), while images of a single model can
vary widely with viewpoint, lighting, and occlusion (high intra-class variation).
This report outlines the literature review and the methodological evolution of the proposed
study, from an initial two-stage approach using YOLOv8 and Cross-Part Learning (CPL) to
a final streamlined method based solely on Cross-Part Learning. The change was motivated by
implementation feasibility, novelty refinement, and academic value.
2. Initial Proposed Two-Stage Approach
Stage 1: YOLOv8 for Vehicle Detection
YOLOv8 is a recent iteration of the YOLO family, known for fast and accurate object
detection. It was initially planned to detect and localize vehicles in traffic surveillance
images and to provide cropped vehicle images for downstream fine-grained classification.
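The hand-off from detection to classification described above can be sketched as follows. This is a minimal illustration, not the integration originally planned: it assumes detections have already been converted to `((x1, y1, x2, y2), confidence)` pairs in pixel coordinates, and the names `clamp_box`, `crop_vehicles`, `min_conf`, and `pad` are hypothetical. The image is modeled as a plain list of pixel rows so the sketch runs without any detection library.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels (assumed format)


def clamp_box(box: Box, width: int, height: int, pad: float = 0.0) -> Tuple[int, int, int, int]:
    """Expand a box by a relative padding and clip it to the image bounds."""
    x1, y1, x2, y2 = box
    pad_x = (x2 - x1) * pad
    pad_y = (y2 - y1) * pad
    left = max(0, int(round(x1 - pad_x)))
    top = max(0, int(round(y1 - pad_y)))
    right = min(width, int(round(x2 + pad_x)))
    bottom = min(height, int(round(y2 + pad_y)))
    return left, top, right, bottom


def crop_vehicles(
    image: Sequence[Sequence[int]],
    detections: Sequence[Tuple[Box, float]],
    min_conf: float = 0.5,
    pad: float = 0.0,
) -> List[List[List[int]]]:
    """Return one crop per detection whose confidence is at least min_conf."""
    height = len(image)
    width = len(image[0]) if height else 0
    crops = []
    for box, conf in detections:
        if conf < min_conf:
            continue  # drop low-confidence detections
        left, top, right, bottom = clamp_box(box, width, height, pad)
        if right <= left or bottom <= top:
            continue  # box lies outside the image or is empty
        crops.append([list(row[left:right]) for row in image[top:bottom]])
    return crops
```

Each crop would then be resized and passed to the classifier. Maintaining this glue code alongside two separate models is the kind of integration overhead listed under the challenges of the two-stage design.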
Stage 2: Cross-Part Learning (CPL) for Fine-Grained Classification
CPL introduces a weakly-supervised approach for fine-grained image classification by
automatically discovering discriminative parts and transferring those features to enhance
classification without manual annotations.
Challenges:
- Required manual integration of YOLOv8 outputs with CPL
- High resource cost in managing both detection and classification models
- Limited novelty; supervisor feedback suggested refining the approach
3. Final Approach: Cross-Part Learning for Fine-Grained Vehicle Classification
Motivation:
- CPL allows weakly supervised learning
- Avoids complexity and overhead of object detection
- Focuses directly on fine-grained model-level distinctions
Working Principle:
1. Cross-Part Discovery (CPD)
2. Cross-Part Feature Transfer (CPFT)
3. Cross-Part Attention (CPA)
CPL is well suited to traffic surveillance datasets and real-world applications because it is
simple and efficient and needs only class labels for training, with no part annotations.
4. Literature Review
General Object Detection:
- Faster R-CNN [3]: Two-stage detector with Region Proposal Networks
- YOLO [1]: One-stage, real-time detector
- SSD [4]: One-stage detector that predicts from feature maps at multiple scales
Fine-Grained Recognition:
- Stanford Cars [5] and CompCars [6] are widely used benchmark datasets
- Lu et al. [8] used part-level optimization with annotations
- Zhang et al. [2] proposed CPL without part annotations
- Chen et al. [7] and Balasubramanian & Rathore [9] explored open-set recognition and
contrastive learning methods
5. Datasets Considered
- Stanford Cars: 196 labeled vehicle classes
- CompCars: Large-scale dataset with both web-nature and surveillance-nature images
- BoxCars116k: Additional source (may need preprocessing)
6. Proposed Methodology Workflow
1. Data Preprocessing
2. Baseline CNN Training
3. Implement CPL modules (CPD, CPFT, CPA)
4. Evaluation using Top-1 Accuracy, Precision, Recall, and Confusion Matrix
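The evaluation metrics named in step 4 can all be derived from the confusion matrix. The sketch below is a minimal pure-Python illustration, assuming class labels are integers in `0..num_classes-1`; the function names are hypothetical, and a real experiment would more likely use an established metrics library.

```python
from typing import List, Sequence, Tuple


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int], num_classes: int) -> List[List[int]]:
    """matrix[t][p] counts samples of true class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def top1_accuracy(matrix: List[List[int]]) -> float:
    """Fraction of samples on the diagonal (correct top-1 predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_precision_recall(matrix: List[List[int]]) -> Tuple[List[float], List[float]]:
    """Precision = TP / column sum, recall = TP / row sum, per class."""
    n = len(matrix)
    precision, recall = [], []
    for k in range(n):
        tp = matrix[k][k]
        predicted_k = sum(matrix[i][k] for i in range(n))  # all samples predicted as k
        actual_k = sum(matrix[k])                          # all samples truly of class k
        precision.append(tp / predicted_k if predicted_k else 0.0)
        recall.append(tp / actual_k if actual_k else 0.0)
    return precision, recall
```

For fine-grained vehicle data, the off-diagonal entries are often the most informative part: large counts between two specific classes point to visually similar models that the classifier confuses.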
7. Contribution and Novelty
- Pure CPL-based classification (no detection required)
- No part-level annotations or bounding boxes
- Scalable to real-world traffic systems
- Applies weakly supervised part discovery to vehicle imagery
8. Timeline & Implementation Plan
Week 1–2: Dataset selection and preprocessing
Week 3–4: Baseline model training
Week 5–7: CPL implementation
Week 8–9: Evaluation
Week 10: Paper writing and revisions
9. References
1. Redmon, J., et al. (2016). You Only Look Once: Unified, Real-Time Object Detection. CVPR.
2. Zhang, H., et al. (2021). Cross-Part Learning for Fine-Grained Image Classification. CVPR.
3. Ren, S., et al. (2015). Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. PAMI.
4. Liu, W., et al. (2016). SSD: Single Shot MultiBox Detector. ECCV.
5. Krause, J., et al. (2013). 3D Object Representations for Fine-Grained Categorization. ICCV.
6. Yang, L., et al. (2015). A Large-Scale Car Dataset for Fine-Grained Categorization and Verification. CVPR.
7. Chen, X., et al. (2021). Knowledge-Distillation-Based Label Smoothing.
8. Lu, J., et al. (2022). Part-Level Feature Optimization.
9. Balasubramanian, M., & Rathore, S. (2022). Contrastive Learning for Object Detection.