feat(dataset): add LabelMe format support to DetectionDataset#2299

Open
madhavcodez wants to merge 2 commits into roboflow:develop from madhavcodez:feat/dataset-labelme-format

Conversation

@madhavcodez

Contributor
Before submitting
  • Self-reviewed the code
  • Updated documentation, following Google style
  • Added docs entry for autogeneration (if new functions/classes)
  • Added/updated tests
  • All tests pass locally

Description

Adds LabelMe annotation support to DetectionDataset: from_labelme() to load and as_labelme() to export, following the existing from_<format> / as_<format> convention already used for COCO, YOLO, and Pascal VOC.

On load, rectangle shapes become bounding boxes and polygon shapes become masks (plus their boxes). Masks are loaded for any file that contains a polygon shape, or for every image when force_masks=True. Unsupported shape types (circle, line, point, linestrip) are skipped with a warning. On export, masked detections are written as polygon shapes (one per connected component) and box-only detections as rectangle shapes.
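
For reviewers unfamiliar with the format, the load-side mapping described above can be sketched in plain Python. This is only an illustration, not the code in this PR: `parse_labelme_shapes` is a hypothetical name, the input is assumed to be the `shapes` list of a LabelMe JSON file (each entry carrying `label`, `shape_type`, and `points`), and polygons are returned as point lists instead of being rasterized into masks.

```python
import warnings

SUPPORTED_SHAPE_TYPES = {"rectangle", "polygon"}


def parse_labelme_shapes(shapes):
    """Hypothetical helper: split LabelMe shapes into labels, xyxy boxes, and polygons.

    polygons[i] is None for rectangle shapes and the raw point list for polygon shapes.
    """
    labels, boxes, polygons = [], [], []
    for shape in shapes:
        shape_type = shape.get("shape_type")
        if shape_type not in SUPPORTED_SHAPE_TYPES:
            # circle, line, point, linestrip, ... are skipped with a warning
            warnings.warn(f"Skipping unsupported LabelMe shape type: {shape_type!r}")
            continue
        xs = [point[0] for point in shape["points"]]
        ys = [point[1] for point in shape["points"]]
        # min/max gives the box for both cases: a rectangle's two corners
        # (in any order) and a polygon's enclosing box
        boxes.append((min(xs), min(ys), max(xs), max(ys)))
        labels.append(shape["label"])
        polygons.append(shape["points"] if shape_type == "polygon" else None)
    return labels, boxes, polygons


example_shapes = [
    {"label": "cat", "shape_type": "rectangle", "points": [[30.0, 40.0], [10.0, 20.0]]},
    {"label": "dog", "shape_type": "polygon", "points": [[0, 0], [4, 0], [4, 3]]},
    {"label": "dot", "shape_type": "point", "points": [[1, 1]]},
]

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    labels, boxes, polygons = parse_labelme_shapes(example_shapes)
```

In this example the `point` shape is dropped with one warning, the rectangle yields a box only, and the polygon yields both its point list and the enclosing box.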

Type of Change

  • ✨ New feature (non-breaking change which adds functionality)

Motivation and Context

LabelMe is a widely used open-source annotation tool whose per-image JSON format covers both detection and segmentation. DetectionDataset can already read and write COCO, YOLO, and Pascal VOC, but LabelMe users currently have to convert their annotations to one of those formats before loading them into supervision. This closes that gap directly.

There is no open tracking issue for this — opening it as a feature addition. Happy to file one if you'd prefer.

Changes Made

  • src/supervision/dataset/formats/labelme.py (new) — load_labelme_annotations / save_labelme_annotations plus the helpers labelme_shapes_to_detections / detections_to_labelme_shapes. Two behaviors worth calling out:
    • Each image is located by the basename of the JSON's imagePath; the directory portion (which LabelMe stores relative to the JSON file) is ignored, so an annotation-supplied path cannot traverse outside images_directory_path.
    • A masked detection whose mask produces no polygon contour (empty or sub-pixel mask) is exported as a rectangle instead of being dropped.
  • src/supervision/dataset/core.py: DetectionDataset.from_labelme() and DetectionDataset.as_labelme(), plus the format import. docs/datasets/core.md already renders DetectionDataset via mkdocstrings, so the new methods appear in the API docs automatically.
  • tests/dataset/formats/test_labelme.py (new) — 31 tests.
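
The two behaviors called out for labelme.py can be illustrated with a rough standard-library sketch. The function names `resolve_image_path` and `detection_to_shapes` are hypothetical and do not come from this PR, the use of `ntpath` is an assumption about one way to handle both separator styles, and the shape dictionaries omit optional LabelMe fields for brevity.

```python
import ntpath
from pathlib import Path


def resolve_image_path(images_directory_path, image_path_field):
    """Hypothetical helper: keep only the basename of a LabelMe imagePath value."""
    # ntpath.basename splits on both "/" and "\\", so paths written on
    # Windows are reduced to a file name on any operating system.
    name = ntpath.basename(image_path_field)
    if not name or name in {".", ".."}:
        raise ValueError(f"imagePath has no usable file name: {image_path_field!r}")
    return Path(images_directory_path) / name


def detection_to_shapes(label, box_xyxy, polygons):
    """Hypothetical helper: build LabelMe shape dicts for one detection.

    polygons is a list of point lists (one per connected component),
    or None / empty when there is no mask or no contour was found.
    """
    if polygons:
        return [
            {"label": label, "shape_type": "polygon", "points": [list(p) for p in polygon]}
            for polygon in polygons
        ]
    # Fallback: box-only detections and masks without a contour become rectangles
    x1, y1, x2, y2 = box_xyxy
    return [{"label": label, "shape_type": "rectangle", "points": [[x1, y1], [x2, y2]]}]


traversal = resolve_image_path("images", "../../etc/passwd")
windows_style = resolve_image_path("images", "..\\data\\a.jpg")
fallback = detection_to_shapes("cat", (1, 2, 3, 4), [])
two_parts = detection_to_shapes("cat", (0, 0, 9, 9), [[(0, 0), (1, 0), (1, 1)], [(5, 5), (6, 5), (6, 6)]])
```

Because the directory portion is discarded before joining, the first call resolves to a file inside `images` rather than escaping it, and the empty polygon list in the third call produces a single rectangle shape instead of nothing.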

Testing

  • I have tested this code locally
  • I have added unit tests that prove the feature works
  • All new and existing tests pass

The 31 tests cover the helpers, loader, and exporter: box and mask round-trips (single- and multi-image, float coordinates, and a mask-IoU fidelity check), force_masks, mixed rectangle/polygon files, multi-class id ordering, multi-component masks, the unsupported-shape warning, and the imagePath / image-dimension / malformed-shape guards.

pytest tests/dataset/ → 194 passed locally (the full dataset suite including the new tests and the --doctest-modules doctests). ruff check, ruff format --check, and mypy --strict are clean on the touched files.

Additional Notes

The mask round-trip is not bit-exact: export goes through mask_to_polygons and re-import rasterizes the polygon back, so a reloaded mask is a polygon approximation of the original — consistent with how the other mask-capable formats round-trip in this library.
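
Since a reloaded mask is only an approximation, a fidelity check has to compare masks by overlap rather than by equality. A minimal sketch of an intersection-over-union comparison follows; `mask_iou` is a hypothetical name, masks are plain nested lists here to keep the example dependency-free, and this is not the assertion used in the PR's tests.

```python
def mask_iou(mask_a, mask_b):
    """Hypothetical helper: intersection-over-union of two same-sized binary masks."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            intersection += bool(a and b)
            union += bool(a or b)
    # Two empty masks are treated as identical
    return intersection / union if union else 1.0


original = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]
# Stand-in for a mask that lost one pixel in a polygon round-trip
reloaded = [
    [0, 1, 1],
    [0, 1, 0],
    [0, 0, 0],
]

iou = mask_iou(original, reloaded)  # 3 shared pixels / 4 pixels in the union
```

A test built on this idea would assert that the IoU stays above a chosen threshold instead of requiring the two masks to match pixel for pixel.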

@madhavcodez madhavcodez requested a review from SkalskiP as a code owner June 7, 2026 20:18
madhavcodez added a commit to madhavcodez/supervision that referenced this pull request Jun 7, 2026
@codecov

codecov Bot commented Jun 7, 2026

Codecov Report

❌ Patch coverage is 99.09091% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 80%. Comparing base (8a40630) to head (f6cc31c).

Additional details and impacted files
@@           Coverage Diff            @@
##           develop   #2299    +/-   ##
========================================
  Coverage       80%     80%            
========================================
  Files           66      67     +1     
  Lines         8787    8897   +110     
========================================
+ Hits          7046    7155   +109     
- Misses        1741    1742     +1     

@madhavcodez madhavcodez force-pushed the feat/dataset-labelme-format branch from 30b771f to f6cc31c Compare June 10, 2026 16:10
Add from_labelme() and as_labelme() so DetectionDataset can load and
export LabelMe per-image JSON, alongside the existing COCO, YOLO, and
Pascal VOC support. Rectangle shapes map to boxes and polygon shapes to
masks; unsupported shape types are skipped with a warning. The loader
resolves images by imagePath basename to prevent path traversal.