Image classification, object detection, segmentation, and generative models
CNN architecture evolution, data augmentation, and modern training strategies
Bounding boxes, YOLO, R-CNN family, anchor boxes, and evaluation metrics
Semantic, instance, and panoptic segmentation with U-Net, Mask R-CNN, and SAM
GANs, VAEs, diffusion models, Stable Diffusion, and ControlNet
ViT, DINO, CLIP, and multimodal models bridging vision and language