Training Background
Understanding what happens before inference helps you make better decisions about model selection, fine-tuning, and deployment.
Why pre-training matters
The models in YOLO-Toys share a common three-stage lifecycle:
- Pre-training: Learn general-purpose features from large datasets
- Fine-tuning: Adapt those features to a specific task or domain
- Inference: Deploy the adapted model for real-world use
The pre-training phase largely determines three properties of the final model:
- Generalization: How well the model handles inputs unlike those seen in training
- Feature quality: How rich the learned representations are
- Transferability: How easily the model adapts to new domains with limited data
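The pre-train, fine-tune, infer pattern can be illustrated with a deliberately tiny example. The sketch below is not how any model in YOLO-Toys is trained: it assumes a one-dimensional linear model, noise-free synthetic data, and hypothetical helper names (`fit`, `mse`, `sample`). It only shows why starting from pre-trained weights helps when target-domain data is scarce.

```python
import random

def fit(data, w=0.0, b=0.0, lr=0.05, epochs=1):
    """Fit y ~ w*x + b with per-sample gradient steps on squared error."""
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return w, b

def mse(data, w, b):
    """Mean squared error of the line (w, b) on data."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def sample(n, slope, intercept, rng):
    """Draw n noise-free points from y = slope*x + intercept, x in [-1, 1]."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, slope * x + intercept) for x in xs]

rng = random.Random(0)
source_data = sample(1000, 2.0, 1.0, rng)   # large, generic "pre-training" set
target_train = sample(5, 2.0, 1.5, rng)     # small target-domain set
target_test = sample(200, 2.0, 1.5, rng)    # held-out target-domain data

# Stage 1: pre-train on the large source set.
w_pre, b_pre = fit(source_data, epochs=5)
# Stage 2: fine-tune from the pre-trained weights on the few target samples.
w_ft, b_ft = fit(target_train, w=w_pre, b=b_pre, epochs=3)
# Baseline: the same fine-tuning budget, but starting from scratch.
w_scratch, b_scratch = fit(target_train, epochs=3)

# Stage 3: "inference" on held-out target data.
print(f"fine-tuned MSE:   {mse(target_test, w_ft, b_ft):.4f}")
print(f"from-scratch MSE: {mse(target_test, w_scratch, b_scratch):.4f}")
```

With the same small fine-tuning budget, the run that starts from pre-trained weights should reach a lower held-out error, because the slope was already learned from the source data and only the intercept has to shift.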
Articles in this section
Loss Functions
Covers the mathematical objectives that shape model behavior:
- YOLO family losses: Complete IoU (CIoU), Varifocal Loss (VFL), and Distribution Focal Loss (DFL)
- DETR bipartite matching between predictions and ground-truth objects
- Contrastive losses for vision-language models
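As a preview of the objectives listed above, here is a minimal pure-Python sketch of three of them: CIoU for box regression, one-to-one matching of predictions to targets, and a symmetric CLIP-style contrastive loss. It makes several assumptions: boxes are `(x1, y1, x2, y2)` tuples, the matching uses brute-force search as a stand-in for the Hungarian algorithm that DETR uses (only practical for a handful of boxes), and all function names are illustrative rather than taken from any library. VFL and DFL are not shown.

```python
import math
from itertools import permutations

def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def ciou_loss(pred, gt, eps=1e-9):
    """1 - CIoU: IoU term + center-distance term + aspect-ratio term."""
    i = iou(pred, gt)
    # Squared distance between the two box centers.
    rho2 = (((pred[0] + pred[2]) - (gt[0] + gt[2])) ** 2
            + ((pred[1] + pred[3]) - (gt[1] + gt[3])) ** 2) / 4.0
    # Squared diagonal of the smallest box enclosing both.
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    c2 = cw ** 2 + ch ** 2 + eps
    # Aspect-ratio consistency term v and its trade-off weight alpha.
    angle_gt = math.atan((gt[2] - gt[0]) / (gt[3] - gt[1] + eps))
    angle_pred = math.atan((pred[2] - pred[0]) / (pred[3] - pred[1] + eps))
    v = (4.0 / math.pi ** 2) * (angle_gt - angle_pred) ** 2
    alpha = v / (1.0 - i + v + eps)
    return 1.0 - i + rho2 / c2 + alpha * v

def match(cost):
    """Lowest-cost one-to-one assignment; cost[i][j] is prediction i vs target j.

    Brute force over permutations; assumes at least as many predictions as targets.
    Returns a list of (prediction_index, target_index) pairs.
    """
    n_pred, n_tgt = len(cost), len(cost[0])
    best = min(permutations(range(n_pred), n_tgt),
               key=lambda p: sum(cost[p[j]][j] for j in range(n_tgt)))
    return [(best[j], j) for j in range(n_tgt)]

def contrastive_loss(sim, temperature=0.07):
    """Symmetric cross-entropy over a square image-text similarity matrix.

    sim[i][j] is the similarity of image i and text j; pair (i, i) is the positive.
    """
    def cross_entropy(rows):
        total = 0.0
        for i, row in enumerate(rows):
            logits = [s / temperature for s in row]
            m = max(logits)  # subtract the max for numerical stability
            log_z = m + math.log(sum(math.exp(l - m) for l in logits))
            total += log_z - logits[i]
        return total / len(rows)
    columns = [list(c) for c in zip(*sim)]
    return 0.5 * (cross_entropy(sim) + cross_entropy(columns))

print(ciou_loss((0, 0, 2, 2), (1, 1, 3, 3)))
print(match([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]))
print(contrastive_loss([[1.0, 0.0], [0.0, 1.0]]))
```

In a real training loop these would operate on batched tensors, and the matching step would typically use an optimized assignment solver instead of enumerating permutations.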
Pre-training data scale
| Model | Pre-training Data | Scale |
|---|---|---|
| YOLOv8 | COCO | 118K images |
| DETR | COCO | 118K images |
| OWL-ViT | Web image-text pairs (CLIP-style backbone pre-training) | 400M image-text pairs |
| BLIP | LAION + CC3M | ~130M image-text pairs |
What to read next
- Loss Functions for detailed loss explanations
- Model Matrix for practical specifications