Model Card for DETR Finetuned on CPPE-5
Model Overview
This model is a fine-tuned version of facebook/detr-resnet-50. The card title and repository name indicate that it was fine-tuned on CPPE-5, a dataset of personal protective equipment (PPE) items, so that it recognizes PPE elements such as face shields, masks, gloves, and goggles.
The model uses the DEtection TRansformer (DETR) architecture with a ResNet-50 backbone for feature extraction. It keeps DETR's general object detection design but is tuned specifically to detect PPE items relevant to occupational safety.
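A minimal inference sketch, assuming the checkpoint is available on the Hugging Face Hub under the ID shown in this card's model tree (ashaduzzaman/detr_finetuned_cppe5) and that the `transformers` and `Pillow` packages are installed. The helper names `load_detector`, `count_by_label`, and `run_demo`, the 0.5 score cutoff, and the image path are illustrative assumptions, not part of the original card.

```python
def load_detector(model_id="ashaduzzaman/detr_finetuned_cppe5"):
    """Build an object-detection pipeline (downloads weights on first use)."""
    # Imported lazily so count_by_label stays usable without transformers.
    from transformers import pipeline

    return pipeline("object-detection", model=model_id)


def count_by_label(detections, min_score=0.5):
    """Count detections per label, ignoring those scored below min_score.

    Each detection is a dict with 'score', 'label', and 'box' keys, which is
    the format returned by the object-detection pipeline.
    """
    counts = {}
    for det in detections:
        if det["score"] >= min_score:
            counts[det["label"]] = counts.get(det["label"], 0) + 1
    return counts


def run_demo(image_path):
    """Detect PPE items in one image and print a per-label summary."""
    detector = load_detector()
    # A low pipeline threshold keeps weak detections so that the cutoff
    # applied in count_by_label is the one that actually decides.
    detections = detector(image_path, threshold=0.1)
    print(count_by_label(detections, min_score=0.5))
```

Calling `run_demo("worker.jpg")` with a real image path would print a dictionary of label counts. Given the moderate mAP reported in this card, the cutoff is worth tuning on held-out images rather than taking 0.5 as given.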
Model Performance
The model achieves the following metrics on its evaluation set:
- Loss: 1.2294
- mAP (mean Average Precision):
  - Overall: 0.2366
  - At IoU threshold 0.50: 0.4852
  - At IoU threshold 0.75: 0.2032
  - Small objects: 0.1082
  - Medium objects: 0.2086
  - Large objects: 0.3408
- mAR (mean Average Recall):
  - At 1 detection: 0.2819
  - At 10 detections: 0.4463
  - At 100 detections: 0.4665
  - Small objects: 0.249
  - Medium objects: 0.4004
  - Large objects: 0.5893
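Precision and recall also vary by category (face shields, gloves, goggles, masks), although per-category values are not listed here. There is room for improvement, particularly for small objects, where mAP is 0.1082 against 0.3408 for large objects; this is likely to affect small items such as goggles most.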
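The two IoU thresholds reported above decide when a predicted box counts as a match for a ground-truth box. The sketch below computes intersection over union (IoU) for two boxes, assuming `(xmin, ymin, xmax, ymax)` pixel coordinates; the function name and the example boxes are illustrative and not taken from the model's evaluation code.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    # Corners of the overlapping rectangle, if the boxes overlap at all.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width and height are clamped at zero for non-overlapping boxes.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical prediction shifted 5 px from a 50x50 ground-truth box.
predicted = (10, 10, 60, 60)
ground_truth = (15, 15, 65, 65)
iou = box_iou(predicted, ground_truth)

matches_at_050 = iou >= 0.50  # counts as a hit at the looser threshold
matches_at_075 = iou >= 0.75  # counts as a miss at the stricter threshold
```

In this example the IoU is about 0.68, so the same prediction is a hit at 0.50 and a miss at 0.75. The gap between the model's two scores (0.4852 versus 0.2032) suggests that many of its detections find the right object but localize it loosely.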
Intended Use and Limitations
Intended Use
- Detecting personal protective equipment (PPE) in images or video streams.
- Monitoring workplace safety by ensuring proper usage of PPE items such as masks, gloves, face shields, and goggles.
- Suitable for industries like construction, healthcare, and manufacturing where PPE detection is critical for compliance and safety.
Limitations
- The model may not generalize well to non-PPE items or general object detection tasks.
- Performance on small or occluded objects can be limited. For small objects the evaluation shows mAP 0.1082 and mAR 0.249, against 0.3408 and 0.5893 for large objects.
- The model was trained on a dataset specific to PPE detection, so its performance on images outside of this domain might be inconsistent.
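To make the small, medium, and large categories concrete, the sketch below buckets a box by pixel area using the COCO evaluation convention (small below 32², medium below 96², large otherwise). That this model's metrics follow the COCO convention is an assumption based on the metric names; the function name is illustrative, and handling of the exact boundary values is simplified.

```python
SMALL_MAX_AREA = 32 ** 2   # 1024 px^2, COCO upper bound for "small"
MEDIUM_MAX_AREA = 96 ** 2  # 9216 px^2, COCO upper bound for "medium"


def size_bucket(box):
    """Classify an (xmin, ymin, xmax, ymax) box as small, medium, or large."""
    area = (box[2] - box[0]) * (box[3] - box[1])
    if area < SMALL_MAX_AREA:
        return "small"
    if area < MEDIUM_MAX_AREA:
        return "medium"
    return "large"
```

Under this convention a 20x20 px pair of goggles is "small" while a 100x100 px face shield is "large". The bucket depends on image resolution and camera distance as much as on the item itself.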
Training and Evaluation Data
The fine-tuning data is not documented in detail here. The card title and repository name (detr_finetuned_cppe5) indicate CPPE-5, a PPE dataset covering items such as face shields, masks, goggles, and gloves.
Training Procedure
Hyperparameters:
- Learning rate: 5e-05
- Train batch size: 8
- Eval batch size: 8
- Optimizer: Adam (betas=(0.9, 0.999), epsilon=1e-08)
- Learning rate scheduler: Cosine decay
- Number of epochs: 30
- Seed: 42
Training ran for 30 epochs with the Adam optimizer at a learning rate of 5e-05 under cosine decay, using a batch size of 8 for both training and evaluation.
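The cosine decay schedule can be sketched as a function of the training step. This assumes a single half-cosine from the base learning rate down to zero with no warmup; the card specifies neither warmup nor the number of steps per epoch, so `steps_per_epoch = 100` below is a hypothetical value used only for illustration.

```python
import math


def cosine_lr(step, total_steps, base_lr=5e-5):
    """Learning rate at `step` under half-cosine decay from base_lr to 0."""
    # Clamp progress to [0, 1] so out-of-range steps stay well defined.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


steps_per_epoch = 100                # hypothetical; depends on dataset size
total_steps = 30 * steps_per_epoch   # 30 epochs, as listed above

lr_start = cosine_lr(0, total_steps)                 # full base rate
lr_middle = cosine_lr(total_steps // 2, total_steps) # half the base rate
lr_end = cosine_lr(total_steps, total_steps)         # decayed to zero
```

Under these assumptions the rate starts at 5e-05, reaches 2.5e-05 at the midpoint, and falls to zero by the final step, so the last epochs make only small updates.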
Evaluation Results
The following metrics were recorded on the validation set at selected epochs during training:
| Epoch | Validation Loss | mAP | mAP 50 | mAP 75 | mAR | Comments |
|---|---|---|---|---|---|---|
| 1 | 2.1073 | 0.0518 | 0.1075 | 0.0423 | 0.2819 | Initial training |
| 5 | 1.6220 | 0.1223 | 0.2258 | 0.1115 | 0.4463 | Significant improvement |
| 10 | 1.5033 | 0.155 | 0.3265 | 0.1325 | 0.5032 | Steady improvement |
| 20 | 1.2649 | 0.2211 | 0.4427 | 0.1952 | 0.5867 | Continued improvement |
| 25 | 1.2347 | 0.2333 | 0.4831 | 0.1989 | 0.5966 | Best of the rows shown; close to the final evaluation (loss 1.2294, mAP 0.2366) |
Limitations and Ethical Considerations
Limitations:
- Domain-specific: The model is tuned for PPE object detection, where it reaches an overall mAP of 0.2366, and may not generalize to other tasks.
- Bias: If the dataset is skewed or limited, certain PPE items may be under-represented, leading to poorer performance for some categories.
- Real-time Applications: The model might not meet the latency requirements for real-time detection in high-throughput environments.
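Whether the latency concern applies depends on the target hardware, so it is worth measuring. The sketch below times any prediction callable and reports median and 95th-percentile latency. The function name, the warmup and run counts, and the returned keys are illustrative assumptions; `predict` stands in for whatever wraps the model in a given deployment.

```python
import statistics
import time


def measure_latency(predict, sample, warmup=3, runs=20):
    """Time predict(sample) and return latency statistics in seconds."""
    # Warmup calls absorb one-time costs such as lazy initialization.
    for _ in range(warmup):
        predict(sample)

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(sample)
        timings.append(time.perf_counter() - start)

    timings.sort()
    median = statistics.median(timings)
    p95 = timings[int(0.95 * (len(timings) - 1))]
    return {
        "median_s": median,
        "p95_s": p95,
        # Frames per second implied by the median latency.
        "fps": 1.0 / median if median > 0 else float("inf"),
    }
```

For a live camera feed, the p95 figure is usually the more useful one to compare against the frame interval, since occasional slow frames are what cause dropped detections.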
Ethical Considerations:
- Privacy: Using this model in surveillance scenarios (e.g., workplaces) may raise concerns about employee privacy, especially if applied without clear consent.
- Misuse: Improper use of this model could lead to incorrect enforcement of safety regulations.
Future Work
- Dataset Improvements: Expanding the dataset to include more diverse PPE items, environments, and object scales could improve model performance, especially for smaller objects.
- Model Efficiency: Further fine-tuning or model distillation may help make the model more suitable for real-time applications.
Model tree for ashaduzzaman/detr_finetuned_cppe5
Base model
facebook/detr-resnet-50