Car Key-Points Detection with MobileNetV1
Predicts windshield and headlight key-points on car images using a fine-tuned MobileNetV1 backbone with custom regression heads, trained on a tiny dataset (90 images) heavily augmented to 9,000 samples.
Feb 2020 | Python, TensorFlow, MobileNetV1, Keras, Augmentation
Objective
Predict precise key-point coordinates for windshield corners and headlight centers across diverse car images. The model outputs continuous (x, y) coordinates rather than classification labels, framing the task as a multi-output regression problem.
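To make the regression framing concrete, here is a minimal sketch of how key-points might be encoded as a training target and scored. It assumes that pixel coordinates are normalized by image width and height and flattened into one vector; the helper names `encode_target`, `decode_prediction`, and `mse` are illustrative and not taken from the project code.

```python
def encode_target(points_px, width, height):
    """Flatten pixel key-points into a normalized [x0, y0, x1, y1, ...] vector."""
    target = []
    for x, y in points_px:
        # Dividing by the image size maps each coordinate into [0, 1].
        target.extend((x / width, y / height))
    return target


def decode_prediction(vector, width, height):
    """Turn a normalized output vector back into pixel (x, y) pairs."""
    return [
        (vector[i] * width, vector[i + 1] * height)
        for i in range(0, len(vector), 2)
    ]


def mse(prediction, target):
    """Mean squared error over all coordinates of one sample."""
    squared = [(p - t) ** 2 for p, t in zip(prediction, target)]
    return sum(squared) / len(squared)
```

Normalizing keeps every output on the same scale regardless of image resolution, so a single MSE value weights all key-points equally.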
Approach
- Backbone: MobileNetV1 with all layers trainable (no frozen feature extractor).
- Head: Custom dense layers producing key-point coordinates.
- Parameters: ~24M trainable.
- Training data: 90 base images augmented to 9,000 samples (about 100 variants per image) using rotations, translations, brightness changes, and horizontal flips, with key-points remapped for every geometric transform.
- Optimization: 20 epochs, MSE loss on normalized coordinates.
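The setup above could be expressed in Keras roughly as follows. This is a sketch under stated assumptions, not the project's actual code: six key-points (four windshield corners plus two headlight centers), a 224×224 input, global average pooling, a single 256-unit hidden layer, a sigmoid output, and the Adam optimizer are all guesses, and `build_keypoint_model` is a hypothetical name. This head is far smaller than the ~24M parameters reported, so the original head was presumably larger.

```python
import tensorflow as tf


def build_keypoint_model(num_keypoints=6, input_shape=(224, 224, 3), weights=None):
    """MobileNetV1 backbone plus a dense regression head (illustrative sizes)."""
    # weights=None avoids a download; pass weights="imagenet" to fine-tune
    # from pretrained features.
    backbone = tf.keras.applications.MobileNet(
        input_shape=input_shape, include_top=False, weights=weights
    )
    backbone.trainable = True  # all layers trainable, no frozen extractor

    # Inputs are expected to be preprocessed with
    # tf.keras.applications.mobilenet.preprocess_input.
    inputs = tf.keras.Input(shape=input_shape)
    x = backbone(inputs)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dense(256, activation="relu")(x)  # assumed head size
    # Sigmoid keeps outputs in [0, 1], matching normalized coordinates.
    outputs = tf.keras.layers.Dense(2 * num_keypoints, activation="sigmoid")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mse")  # optimizer is an assumption
    return model
```

Training would then be a call along the lines of `model.fit(images, targets, epochs=20)`, where `images` and `targets` stand in for the augmented samples and their normalized key-point vectors.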
Results
The model generalizes well across viewpoints despite the tiny base dataset, which suggests that aggressive geometric augmentation can partly compensate for a small labeled set in key-point regression tasks.
Takeaways
- MobileNetV1 is a strong choice when inference latency matters and the task is geometrically simple.
- Per-key-point augmentation correctness is the hard part: every geometric transform must remap the ground-truth coordinates exactly, or the model trains on wrong labels.
- Depth-wise separable convolutions keep the backbone's parameter count low, which may help limit overfitting in small-data regression.
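The key-point remapping takeaway can be illustrated with two transforms. The sketch below is an assumption-laden example rather than the project's pipeline: the key-point ordering in `FLIP_PERMUTATION` is hypothetical, and `flip_horizontal` and `rotate` are illustrative helper names. It shows the two places where labels commonly go wrong: a horizontal flip must also swap left and right key-point roles, and a rotation must use the same center and direction as the image transform.

```python
import math

# Hypothetical key-point order (an assumption, not the project's layout):
# 0-3 windshield corners (top-left, top-right, bottom-right, bottom-left),
# 4-5 headlight centers (left, right).
FLIP_PERMUTATION = [1, 0, 3, 2, 5, 4]


def flip_horizontal(points, permutation=FLIP_PERMUTATION):
    """Mirror normalized (x, y) points and swap left/right labels."""
    mirrored = [(1.0 - x, y) for x, y in points]
    # Output slot i takes the mirrored point that now plays role i.
    return [mirrored[j] for j in permutation]


def rotate(points, angle_deg, width, height):
    """Rotate pixel-space points about the image center.

    With y pointing down, a positive angle turns points clockwise on screen;
    the sign must match whatever routine rotates the image itself.
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    rotated = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        rotated.append((cx + dx * cos_t - dy * sin_t,
                        cy + dx * sin_t + dy * cos_t))
    return rotated
```

Without the permutation step, a flipped image would label the car's right headlight as the left one, which is exactly the kind of label noise described above. A cheap sanity check is that flipping twice returns the original points.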