Car Key-Points Detection with MobileNetV1

Predicts windshield and headlight key-points on car images using a fine-tuned MobileNetV1 backbone with custom regression heads, trained on a tiny dataset (90 images) heavily augmented to 9,000 samples.

Feb 2020 · Python · TensorFlow · MobileNetV1 · Keras · Augmentation

Objective

Predict precise key-point coordinates for windshield corners and headlight centers across diverse car images. The model outputs continuous (x, y) coordinates rather than classification labels, framing the task as a multi-output regression problem.
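
To make the regression framing concrete, here is a minimal sketch of how pixel key-points might be encoded as one flat, normalized target vector and scored with mean squared error. The key-point count (six: four windshield corners plus two headlight centers), the 224×224 image size, the sample coordinates, and the helper names are illustrative assumptions, not details taken from the project.

```python
import numpy as np

def encode_keypoints(points_px, width, height):
    """Scale (x, y) pixel coordinates to [0, 1] and flatten to one vector."""
    pts = np.asarray(points_px, dtype=np.float32)
    scale = np.array([width, height], dtype=np.float32)
    return (pts / scale).reshape(-1)

def decode_keypoints(vector, width, height):
    """Inverse of encode_keypoints: back to an (N, 2) array of pixel coordinates."""
    vec = np.asarray(vector, dtype=np.float32).reshape(-1, 2)
    return vec * np.array([width, height], dtype=np.float32)

def mse(y_true, y_pred):
    """Mean squared error over all coordinates."""
    diff = np.asarray(y_true, dtype=np.float32) - np.asarray(y_pred, dtype=np.float32)
    return float(np.mean(diff ** 2))

# Hypothetical annotation for one 224x224 image (values invented for illustration).
points = [(60, 80), (164, 80), (176, 120), (48, 120), (70, 160), (154, 160)]
target = encode_keypoints(points, width=224, height=224)  # shape (12,)
```

Normalizing to [0, 1] keeps the loss scale independent of image resolution, which is one common reason to regress normalized rather than raw pixel coordinates.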

Approach

  • Backbone: MobileNetV1 with all layers trainable (no frozen feature extractor).
  • Head: Custom dense layers producing key-point coordinates.
  • Parameters: ~24M trainable.
  • Training data: 90 base images → augmented to 9,000 samples (rotations, translations, and horizontal flips, each with the corresponding key-point remapping, plus brightness changes).
  • Optimization: 20 epochs, MSE loss on normalized coordinates.
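
The key-point remapping mentioned in the training-data bullet can be sketched for the horizontal-flip case. This is a minimal NumPy illustration, not the project's augmentation code: the key-point names, their order, and the `FLIP_ORDER` table are hypothetical. It shows the two things a flip has to get right: mirroring the x coordinate and swapping left/right labels.

```python
import numpy as np

# Hypothetical key-point order; the project's actual layout is not stated.
KEYPOINT_NAMES = [
    "windshield_top_left", "windshield_top_right",
    "windshield_bottom_right", "windshield_bottom_left",
    "headlight_left", "headlight_right",
]
# After mirroring, each "left" point becomes the "right" one and vice versa.
FLIP_ORDER = [1, 0, 3, 2, 5, 4]

def hflip(image, keypoints):
    """Mirror an H x W x C image and its (N, 2) pixel key-points left-right."""
    width = image.shape[1]
    flipped_image = image[:, ::-1].copy()
    flipped = np.asarray(keypoints, dtype=np.float32).copy()
    # Integer pixel-index convention: column x maps to column (W - 1) - x.
    flipped[:, 0] = (width - 1) - flipped[:, 0]
    # Reorder so slot i still means KEYPOINT_NAMES[i] in the mirrored image.
    return flipped_image, flipped[FLIP_ORDER]

# Tiny 4x6 example with invented coordinates.
image = np.zeros((4, 6, 3), dtype=np.uint8)
keypoints = np.array(
    [[0, 0], [5, 0], [5, 3], [0, 3], [1, 2], [3, 1]], dtype=np.float32
)
flipped_image, flipped_keypoints = hflip(image, keypoints)
```

Skipping the label swap is an easy mistake: the coordinates would still land on the car, but the "left headlight" target would point at the right headlight in every flipped sample.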

Figure: MobileNetV1 architecture diagram
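
The setup described in the Approach bullets could look roughly like the following Keras sketch. Several details are assumptions rather than the project's actual configuration: the 224×224 input, the six key-points, the pooling layer, the 256-unit hidden layer, the sigmoid output, and the Adam optimizer. Because the head is a guess, this sketch is not expected to reproduce the ~24M parameter count. `weights=None` keeps the example offline; passing `weights="imagenet"` would load pretrained weights for fine-tuning.

```python
import tensorflow as tf

NUM_KEYPOINTS = 6  # assumption: 4 windshield corners + 2 headlight centers

def build_model(input_shape=(224, 224, 3), num_keypoints=NUM_KEYPOINTS):
    """MobileNetV1 backbone plus a small dense regression head (hypothetical head)."""
    backbone = tf.keras.applications.MobileNet(
        input_shape=input_shape,
        include_top=False,  # drop the ImageNet classifier
        weights=None,       # assumption for an offline sketch; see lead-in
    )
    backbone.trainable = True  # all layers trainable, no frozen feature extractor

    x = tf.keras.layers.GlobalAveragePooling2D()(backbone.output)
    x = tf.keras.layers.Dense(256, activation="relu")(x)  # hypothetical width
    # One (x, y) pair per key-point; sigmoid keeps outputs in [0, 1]
    # to match normalized coordinate targets.
    outputs = tf.keras.layers.Dense(num_keypoints * 2, activation="sigmoid")(x)

    model = tf.keras.Model(inputs=backbone.input, outputs=outputs)
    model.compile(optimizer="adam", loss="mse")  # MSE on normalized coordinates
    return model

model = build_model()
```

Training would then be a call such as `model.fit(images, targets, epochs=20)`, where `images` and `targets` are placeholders for the augmented samples and their normalized key-point vectors.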

Results

The model generalizes well across viewpoints despite the tiny base dataset, which suggests that aggressive geometric augmentation can partly substitute for a large labeled corpus in key-point regression tasks like this one.

Figure: Car image with predicted key-points overlaid

Takeaways

  • MobileNetV1 is a strong choice when inference latency matters and the task is geometrically simple.
  • Per-key-point augmentation correctness is the hard part: every geometric transform must remap ground-truth coordinates exactly, or the model learns noise.
  • Depth-wise separable convolutions need far fewer parameters than standard convolutions of the same shape, which may help limit overfitting in small-data regression.
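
As a rough check on the last point, the weight count of a standard convolution can be compared with its depth-wise separable counterpart. The sketch below uses a 3×3 layer with 512 input and 512 output channels, a shape that occurs in MobileNetV1, and ignores biases and batch normalization. The function names are illustrative.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a k x k standard convolution (no bias)."""
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    """Weights in a k x k depth-wise conv followed by a 1 x 1 point-wise conv (no bias)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise

k, c_in, c_out = 3, 512, 512
std = standard_conv_params(k, c_in, c_out)   # 2,359,296
sep = separable_conv_params(k, c_in, c_out)  # 266,752
ratio = sep / std                            # equals 1/k**2 + 1/c_out, about 0.113
```

For this layer the separable form uses roughly 11% of the weights. Fewer weights means less capacity to memorize a small training set, though the saving alone does not guarantee better generalization.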