
Training Background

Understanding what happens before inference helps you make better decisions about model selection, fine-tuning, and deployment.

Why pre-training matters

The models in YOLO-Toys share a common pattern:

  1. Pre-training: Learn general features from large datasets
  2. Fine-tuning: Adapt to specific tasks/domains
  3. Inference: Deploy for real-world use
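
The three stages above can be illustrated with a deliberately tiny, pure-Python sketch. Nothing here comes from YOLO-Toys itself: `backbone`, `fine_tune`, and `predict` are hypothetical names, the "pre-trained" features are hand-written rather than learned, and the data is synthetic. The point is only the division of labor: the backbone stays frozen, and fine-tuning fits a small task-specific head on top of it.

```python
def backbone(x):
    """Hypothetical frozen 'pre-trained' feature extractor; never updated below."""
    return (x, x * x)


def predict(head, x):
    """Inference: frozen features followed by the fine-tuned linear head."""
    weights, bias = head
    return sum(w * f for w, f in zip(weights, backbone(x))) + bias


def fine_tune(samples, lr=0.05, epochs=1000):
    """Fit only the linear head (weights and bias) with plain SGD on squared error."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in samples:
            feats = backbone(x)
            error = predict((weights, bias), x) - target
            # Gradient step on the head only; the backbone has no parameters to update.
            weights = [w - lr * error * f for w, f in zip(weights, feats)]
            bias -= lr * error
    return weights, bias


# Synthetic "downstream task" (an assumption for illustration): y = 2x^2 + 1.
samples = [(x, 2 * x * x + 1) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]
head = fine_tune(samples)
print(predict(head, 0.75))  # should land close to 2 * 0.75**2 + 1 = 2.125
```

In a real model the backbone has millions of learned parameters and fine-tuning may also update some or all of them, but the split between reusable features and a task-specific head is the same idea.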

The pre-training phase largely determines three properties of the final model:

  • Generalization: How well the model handles novel inputs
  • Feature quality: How rich the learned representations are
  • Transferability: How easily the model adapts to new domains

Articles in this section

Loss Functions

Understanding the mathematical objectives that shape model behavior:

  • YOLO family losses (CIoU, VFL, DFL)
  • DETR bipartite matching
  • Contrastive losses for vision-language models
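
As a concrete taste of the first item, here is a minimal sketch of the CIoU loss for a single pair of boxes, written in plain Python. It follows the standard published formulation (IoU, minus a normalized center-distance penalty, minus an aspect-ratio penalty). It is not taken from any particular YOLO codebase: the function name `ciou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are assumptions made for illustration.

```python
import math


def ciou_loss(box_a, box_b, eps=1e-7):
    """Return 1 - CIoU for two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    aw, ah = ax2 - ax1, ay2 - ay1
    bw, bh = bx2 - bx1, by2 - by1

    # Plain IoU: intersection area over union area.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter + eps
    iou = inter / union

    # Distance penalty: squared center distance over the squared diagonal
    # of the smallest box enclosing both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    diag_sq = enclose_w ** 2 + enclose_h ** 2 + eps
    center_dist_sq = ((ax1 + ax2 - bx1 - bx2) ** 2
                      + (ay1 + ay2 - by1 - by2) ** 2) / 4.0

    # Aspect-ratio penalty v and its trade-off weight alpha.
    v = (4.0 / math.pi ** 2) * (
        math.atan(bw / (bh + eps)) - math.atan(aw / (ah + eps))
    ) ** 2
    alpha = v / (1.0 - iou + v + eps)

    ciou = iou - center_dist_sq / diag_sq - alpha * v
    return 1.0 - ciou


same = ciou_loss((0, 0, 2, 2), (0, 0, 2, 2))      # identical boxes
apart = ciou_loss((0, 0, 1, 1), (2, 2, 3, 3))     # disjoint boxes
print(same, apart)
```

Unlike plain IoU, the loss keeps growing as disjoint boxes move further apart, which is why it still gives a useful training signal when the prediction does not overlap the target at all.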

Pre-training data scale

Model     Pre-training Data    Scale
YOLOv8    COCO + Objects365    ~2M images
DETR      COCO                 118K images
OWL-ViT   LAION-400M           400M image-text pairs
BLIP      LAION + CC3M         ~130M image-text pairs

Released under the MIT License.