
Expose detections_per_img and topk_candidates in config #1314

Open
vickysharma-prog wants to merge 1 commit into weecology:main from
vickysharma-prog:add-retinanet-detection-params

Conversation

@vickysharma-prog
Contributor

This PR exposes detections_per_img and topk_candidates as configurable parameters so users can adjust them for dense scenes.

Currently, these values are hardcoded to the torchvision defaults (300 / 1000). In very dense imagery (e.g., large bird colonies or dense tree canopies), this limit can truncate valid detections.

Changes

  • Added detections_per_img and topk_candidates to config.yaml with current defaults.
  • Passed these parameters through Model.create_model() → RetinaNetHub() → torchvision RetinaNet.
  • Added tests in test_main.py to verify the parameters are correctly applied to the model.

Backwards Compatibility

Defaults remain unchanged (300 / 1000), so existing workflows are unaffected.

Benchmarks

See issue #1309 for details. On CPU, increasing the limit showed only modest overhead (~3–8%). The sample images in the repo did not reach the 300-detection cap, so I was not able to demonstrate detection differences locally. Happy to test further with denser imagery if available.

Fixes #1309


  • Used AI tools for guidance
  • I understand all submitted code
  • I reviewed and validated all AI-generated suggestions

@bw4sz
Collaborator

bw4sz commented Feb 24, 2026

I found an image. I was using a different model, but I believe the general bird detector should get close; if not, I'll share the checkpoint.

C7_L1_F213_T20241219_150708_678

@vickysharma-prog
Contributor Author

vickysharma-prog commented Feb 24, 2026

Thanks @bw4sz! Testing now; will share results shortly.

@vickysharma-prog
Contributor Author

vickysharma-prog commented Feb 24, 2026

@bw4sz Test results with the shared image:

Setup:

  • Image: 4852 × 6464 pixels
  • Model: weecology/deepforest-bird (HuggingFace)
  • predict_tile: patch_size=400, patch_overlap=0.25

Results:

| detections_per_img | Detections | Time |
|---|---|---|
| 300 (default) | 396 | ~29 min |
| 1000 | 396 | ~25 min |

Across all tiles, this image yields 719 raw predictions, reduced to 396 after NMS, so it doesn't appear to hit the per-patch limit with this model.
You mentioned using a different model. Would you like me to test with your checkpoint? That might show the limit being hit.

PR #1314 is ready whenever you'd like to review!

@bw4sz
Collaborator

bw4sz commented Feb 25, 2026

But didn't you write above that the limit is 300 detections per image?

@vickysharma-prog
Contributor Author

vickysharma-prog commented Feb 25, 2026

The detections_per_img=300 limit is applied per tile, not to the full image.

Since predict_tile() splits the image into multiple patches (352 in this case), each tile can return up to 300 detections independently. After stitching tiles together and applying NMS across overlaps, the combined count can exceed 300.

In this test, the model produced 719 total raw predictions across all tiles, reduced to 396 after NMS — so individual tiles weren't hitting the 300 cap with this image/model combination.
If you'd like, I can test with your alternate checkpoint to see a case where the limit is hit more clearly. Let me know!
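For readers following the per-tile arithmetic, here is a rough sketch of how the tile count follows from the image size, patch_size, and patch_overlap. The helper `count_tiles` is hypothetical and is not DeepForest's actual windowing code; it assumes a plain sliding window whose stride is the non-overlapping part of the patch, with one extra window to cover any remainder along each axis. Under that assumption it reproduces the 352 patches quoted above for the 4852 × 6464 image.

```python
import math


def count_tiles(width, height, patch_size=400, patch_overlap=0.25):
    """Estimate how many sliding-window patches cover an image.

    Hypothetical helper: a simplified model of tiling, not DeepForest's code.
    """
    # Each step advances by the non-overlapping part of the patch.
    stride = int(patch_size * (1 - patch_overlap))

    def along(dim):
        # One window is enough if the image is no larger than the patch.
        if dim <= patch_size:
            return 1
        # Full strides needed to reach the far edge, plus the first window.
        return math.ceil((dim - patch_size) / stride) + 1

    return along(width) * along(height)


tiles = count_tiles(4852, 6464)
print(tiles)        # 352
print(tiles * 300)  # 105600: upper bound on detections before cross-tile NMS
```

Because the cap applies to each tile separately, the stitched result is bounded by tiles × detections_per_img rather than by detections_per_img itself.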

@vickysharma-prog
Contributor Author

@bw4sz Just following up: would you like me to test with your checkpoint to demonstrate the limit being hit, or is this ready for review as-is? Happy to resolve conflicts and mark ready whenever.

@vickysharma-prog vickysharma-prog force-pushed the add-retinanet-detection-params branch from 6993b05 to 8d20acd Compare March 3, 2026 06:04
@vickysharma-prog vickysharma-prog marked this pull request as ready for review March 3, 2026 06:05
@vickysharma-prog vickysharma-prog force-pushed the add-retinanet-detection-params branch 2 times, most recently from 97e6ef3 to 565ced1 Compare March 3, 2026 07:05
@codecov

codecov bot commented Mar 3, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.84%. Comparing base (408e150) to head (04c7939).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1314      +/-   ##
==========================================
- Coverage   86.87%   86.84%   -0.04%     
==========================================
  Files          24       24              
  Lines        3064     3185     +121     
==========================================
+ Hits         2662     2766     +104     
- Misses        402      419      +17     
| Flag | Coverage Δ |
|---|---|
| unittests | 86.84% <100.00%> (-0.04%) ⬇️ |


@vickysharma-prog
Contributor Author

vickysharma-prog commented Mar 3, 2026

@bw4sz Rebased and added detections_per_img and topk_candidates to the config schema; all CI checks are passing now.

Testing with the shared image didn't hit the per-tile 300 cap. If you'd like, I can test with your alternate checkpoint to demonstrate the limit in a denser scenario.
Ready for review!

@bw4sz
Collaborator

bw4sz commented Mar 12, 2026

@jveitchmichaelis drop one of those tree images here.

@vickysharma-prog
Contributor Author

@jveitchmichaelis Hi! Would you mind sharing one of those tree images when you get a chance? Would love to test this out properly. Thanks!

@jveitchmichaelis
Collaborator

Try this one:
5f059147ce2c9900068d83ee_824.tif

@vickysharma-prog
Contributor Author

Thanks @jveitchmichaelis! Testing now; will share results shortly.

@vickysharma-prog
Contributor Author

@jveitchmichaelis Results on your image: the default 300 limit is being hit on dense trees.

  • Image: 5f059147ce2c9900068d83ee_824.tif (2048 × 2048)
  • Model: weecology/deepforest-tree
  • predict_tile: patch_size=400, patch_overlap=0.25 (49 patches)

| detections_per_img | Raw preds | After NMS | Time (CPU) |
|---|---|---|---|
| 300 (default) | 2151 | 1172 | 170s |
| 1000 | 2880 | 1617 | 193s |
| 2000 | 3235 | 1836 | 244s |

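With the default of 300, at least 445 detections are silently dropped (1172 vs. 1617 after NMS). Raising the limit from 300 to 1000 recovers about 38% more trees for roughly 13% more runtime.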

Verified m.model.detections_per_img correctly reads the configured value after create_model(). Let me know if you'd like any changes to the PR or further testing!
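As a quick check on the percentages quoted in this comment, the sketch below recomputes the gains and overheads relative to the default limit. The numbers are copied from the results table; the helper name `summarize` is only illustrative.

```python
# (detections after NMS, CPU seconds), keyed by detections_per_img.
# Values copied from the results table in this comment.
RESULTS = {300: (1172, 170), 1000: (1617, 193), 2000: (1836, 244)}


def summarize(limit, baseline=300):
    """Return (extra detections, % more detections, % more time) vs. baseline."""
    base_dets, base_time = RESULTS[baseline]
    dets, seconds = RESULTS[limit]
    extra = dets - base_dets
    gain_pct = round(100 * extra / base_dets, 1)
    overhead_pct = round(100 * (seconds - base_time) / base_time, 1)
    return extra, gain_pct, overhead_pct


for limit in (1000, 2000):
    extra, gain, overhead = summarize(limit)
    print(f"{limit}: +{extra} detections ({gain}% more), {overhead}% more time")
```

For the 300 → 1000 comparison this gives 445 extra detections, about 38% more trees, at about 13.5% more CPU time, which matches the summary above.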

@bw4sz bw4sz requested a review from jveitchmichaelis April 7, 2026 17:00
Collaborator

@jveitchmichaelis jveitchmichaelis left a comment

The config + passing to the model is fine.

There is a small issue that is likely to pop up in future as we introduce more architectures. This is a specific argument for retinanet. DETR-based models have the same sort of config option, but it is a very different mechanic (changing the number of object queries will require completely retraining because of how the detection head works). And we'll encounter the same with other models I imagine.

I think the naming is confusing for users: how many raw detections the model produces (which is related to anchors), versus how many outputs we pass to NMS (topk_candidates), versus how many we keep after NMS (detections_per_img). This is from retinanet, but we can map it to whatever we like.

For example, in DETR (and family) "num_queries" defaults to 300. We also have an optional NMS stage which seems required for some datasets where the object count varies a lot.

One option would be to expose configs specifically for different architectures, which makes the config a little larger but is at least unambiguous.

@bw4sz any thoughts here?

@vickysharma-prog
Contributor Author

vickysharma-prog commented Apr 9, 2026

@jveitchmichaelis @bw4sz Agreed, these are fundamentally different kinds of parameters. detections_per_img and topk_candidates are inference-time filters for RetinaNet: topk_candidates controls how many anchor boxes survive to NMS, and detections_per_img caps the final output after NMS. Neither touches model weights, so users can change them freely without retraining.
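To make the order of the two filters concrete, here is a minimal pure-Python sketch of the sequence described above: keep the top-scoring candidates, run NMS, then cap the output. It is a simplification and not torchvision's implementation; to my understanding, torchvision also applies a score threshold, takes the top-k per feature-pyramid level, and runs NMS per class. The function names `iou` and `postprocess` and the `nms_thresh` default are illustrative assumptions.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def postprocess(boxes, scores, topk_candidates=1000,
                detections_per_img=300, nms_thresh=0.5):
    """Return indices of kept boxes, highest score first (simplified sketch)."""
    # 1. Pre-NMS filter: only the best-scoring candidates go on to NMS.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    order = order[:topk_candidates]

    # 2. Greedy NMS: drop a box if it overlaps an already-kept box too much.
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= nms_thresh for j in keep):
            keep.append(i)

    # 3. Post-NMS cap: truncate the surviving detections.
    return keep[:detections_per_img]


# Five non-overlapping boxes with decreasing scores.
boxes = [(i * 10, 0, i * 10 + 5, 5) for i in range(5)]
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
print(postprocess(boxes, scores, detections_per_img=3))  # [0, 1, 2]
print(postprocess(boxes, scores, topk_candidates=2))     # [0, 1]
```

Both limits only select among predictions the model has already produced, which is why they can be changed at inference time without retraining.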

DETR's num_queries is a different story: it defines the number of learned object queries in the detection head itself. Change it and the weight dimensions change, which means retraining from scratch. Putting both under the same flat config namespace would be genuinely confusing for users switching between architectures.

The nested approach makes sense to me:
retinanet:
  detections_per_img: 300
  topk_candidates: 1000

detr:
  num_queries: 300

Implementation would mean a RetinaNetConfig dataclass in schema.py, updating both init paths in retinanet.py to use self.config.retinanet.detections_per_img, and adjusting the test accordingly. All straightforward once direction is confirmed.
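A minimal sketch of what such a nested schema could look like, using only the standard library. The class name RetinaNetConfig and the two field names come from this thread; the parent `Config` class, the validation in `__post_init__`, and the use of plain dataclasses are assumptions for illustration, and the real schema.py may be organised differently or rely on a config library.

```python
from dataclasses import dataclass, field


@dataclass
class RetinaNetConfig:
    """RetinaNet-only inference limits; defaults match torchvision (300 / 1000)."""
    detections_per_img: int = 300
    topk_candidates: int = 1000

    def __post_init__(self):
        # Illustrative validation: both limits must be positive integers.
        for name in ("detections_per_img", "topk_candidates"):
            value = getattr(self, name)
            if not isinstance(value, int) or value <= 0:
                raise ValueError(f"{name} must be a positive integer, got {value!r}")


@dataclass
class Config:
    """Hypothetical top-level config holding one section per architecture."""
    retinanet: RetinaNetConfig = field(default_factory=RetinaNetConfig)


# Defaults are unchanged unless the user overrides the nested section.
default_cfg = Config()
dense_cfg = Config(retinanet=RetinaNetConfig(detections_per_img=1000))
print(default_cfg.retinanet.detections_per_img)  # 300
print(dense_cfg.retinanet.detections_per_img)    # 1000
```

Keeping the RetinaNet limits in their own section leaves room for a separate section per architecture later, so a parameter such as num_queries is not mistaken for an inference-time filter.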

Happy to implement; just waiting on @bw4sz's thoughts before touching anything.

@bw4sz
Collaborator

bw4sz commented Apr 15, 2026

That's fine with me: per-model config. We only have a couple of models at the moment; when we get to AutoModel, that may make this more complicated. I think this is pressing enough, so let's get this in.

@vickysharma-prog vickysharma-prog force-pushed the add-retinanet-detection-params branch from e94397b to 565ced1 Compare April 16, 2026 10:55
@vickysharma-prog vickysharma-prog force-pushed the add-retinanet-detection-params branch from 565ced1 to 04c7939 Compare April 16, 2026 11:54
@bw4sz
Collaborator

bw4sz commented Apr 16, 2026

Okay, we discussed, and per-model in the config is fine. We want to get this in, and that will then lead us to a larger conversation about config files for each model and per-model documentation. That is too big an idea to tie to this small and important PR.

@vickysharma-prog
Contributor Author

@jveitchmichaelis rebased onto latest main, all checks passing. Ben confirmed the current config approach is fine for now. Ready for your approval!


Development

Successfully merging this pull request may close these issues.

Feature request: allow users to change retinanet candidate images.

3 participants