Coco dataset, fix for grayscale images, convert them to RGB #45

Open
ExtReMLapin wants to merge 1 commit into tlpss:main from ExtReMLapin:patch-1

Conversation

@ExtReMLapin

No description provided.

@tlpss tlpss requested a review from Copilot May 15, 2025 12:59

Copilot AI left a comment

Pull Request Overview

This PR fixes the handling of grayscale images in the COCO dataset by converting them to 3-channel RGB before further processing.

  • Added conversion of grayscale images to RGB by duplicating the single channel.
  • Ensured that images with an alpha channel are reduced to RGB.
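
The diff itself is not shown in this conversation, so the following is only a minimal sketch of the two behaviours summarised above, assuming images are NumPy arrays in height × width × channel layout. The function name `to_rgb` is hypothetical and is not taken from `coco_dataset.py`.

```python
import numpy as np


def to_rgb(image: np.ndarray) -> np.ndarray:
    """Return an (H, W, 3) array for grayscale, RGB, or RGBA input.

    Hypothetical helper: assumes channel-last layout, which may differ
    from what the actual dataset class uses.
    """
    if image.ndim == 2:
        # (H, W) grayscale: add a trailing channel axis -> (H, W, 1)
        image = image[..., np.newaxis]
    if image.shape[2] == 1:
        # Single channel: duplicate it three times -> (H, W, 3)
        image = np.repeat(image, 3, axis=2)
    elif image.shape[2] == 4:
        # RGBA: drop the alpha channel and keep the first three channels
        image = image[..., :3]
    return image
```

Inputs that already have three channels pass through unchanged, so the helper can be applied to every image regardless of its original mode.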

Comment thread keypoint_detection/data/coco_dataset.py
@tlpss
Owner

tlpss commented May 15, 2025

@ExtReMLapin thanks for your contribution, looks like a useful addition! I'll merge it soon.

@ExtReMLapin
Author

ExtReMLapin commented May 15, 2025

🖖🏻

A pleasure.
I also have more changes in staging for another PR which adds a max_image_size param.
I'm training it on forensic images that come in different resolutions, often high ones, which causes:

  1. OOM during training (because of big resolution)
  2. Error during validation because of torch.stack trying to stack up different sizes.

@tlpss
Owner

tlpss commented May 15, 2025

Hi @ExtReMLapin

Sounds like an interesting project.

I consider these steps part of the preprocessing to reduce the burden on the ML codebase (can't support everything in the training loops) and to increase data loading speeds (loading a huge image from disk and then resizing it can bottleneck the GPU because it has to wait on the CPU, which is not desirable).

I will probably not accept a PR that does image resizing in the dataloader (as a separation of concerns).

You should consider resizing the images upfront into a separate dataset and only then training a detector on them.

I have some code for this here if you are interested.
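
For readers following along, here is a minimal sketch of the annotation side of such an upfront resize. It is not the code referred to above: the function name `rescale_coco` and the `max_size` parameter are hypothetical, and it assumes the standard COCO keypoints layout (flat `[x, y, v, ...]` lists and `[x, y, w, h]` bounding boxes). The image files themselves would still need to be resized to the new `width` and `height` with an image library of choice.

```python
import copy


def rescale_coco(coco: dict, max_size: int) -> dict:
    """Return a copy of a COCO dict scaled so each image's longer side
    is at most max_size pixels. Images are never upscaled."""
    out = copy.deepcopy(coco)
    scales = {}  # image id -> (scale_x, scale_y)
    for img in out["images"]:
        w, h = img["width"], img["height"]
        s = min(1.0, max_size / max(w, h))
        new_w, new_h = max(1, round(w * s)), max(1, round(h * s))
        # Use per-axis factors so coordinates match the rounded image size
        scales[img["id"]] = (new_w / w, new_h / h)
        img["width"], img["height"] = new_w, new_h

    for ann in out["annotations"]:
        sx, sy = scales[ann["image_id"]]
        # Scale x and y, leave the visibility flag (every third value) as is
        ann["keypoints"] = [
            v * sx if i % 3 == 0 else v * sy if i % 3 == 1 else v
            for i, v in enumerate(ann.get("keypoints", []))
        ]
        if "bbox" in ann:
            x, y, bw, bh = ann["bbox"]
            ann["bbox"] = [x * sx, y * sy, bw * sx, bh * sy]
        if "area" in ann:
            ann["area"] = ann["area"] * sx * sy
    return out
```

Writing the result to a new JSON file next to the resized images keeps the original dataset untouched, which matches the "separate dataset" suggestion.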

@tlpss
Owner

tlpss commented May 16, 2025

@ExtReMLapin can you take a look at the CI failures? Apparently one of the tests was broken by an update in torch, but the fix should be straightforward.

Btw, I'm at a conference next week, so it will take some time for me to get back to you! But I do appreciate the PRs 🙂

@ExtReMLapin
Author

ExtReMLapin commented May 16, 2025

No worries about the delay.

To be frank, I've been working on this forensic minutiae detector for two years, and you have no idea how much of a pain in the ass it sometimes is to:

  • set up the whole repository env
  • transform your dataset
  • discover their undocumented training examples are not working

Here it just works, with wandb integration. A few issues with DDP, but it's fine tbf.

3 participants