Expose detections_per_img and topk_candidates in config#1314
vickysharma-prog wants to merge 1 commit into weecology:main from
Conversation
Thanks @bw4sz! Testing now, will share results shortly.
@bw4sz Test results with the shared image: Setup:
Results:
This image produced 719 raw predictions, reduced to 396 after NMS, so it doesn't appear to hit the per-patch limit with this model. PR #1314 is ready whenever you'd like to review!
But didn't you write above that the limit is 300 detections per image? |
The limit is applied per tile, not per image. Since each tile is processed independently, the model produced 719 total raw predictions across all tiles in this test, reduced to 396 after NMS, so individual tiles weren't hitting the 300 cap with this image/model combination.
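For intuition, the per-tile behaviour described above can be sketched with toy numbers. Only the 719 total and the 445-detection truncation come from this thread; the per-tile split is hypothetical:

```python
# Sketch of a per-tile detection cap; the split across tiles is made up,
# chosen only so the total matches the 719 raw predictions mentioned above.
PER_TILE_CAP = 300  # torchvision's default detections_per_img

raw_per_tile = [250, 269, 200]  # hypothetical counts for three tiles
kept_per_tile = [min(n, PER_TILE_CAP) for n in raw_per_tile]

total_raw = sum(raw_per_tile)    # 719
total_kept = sum(kept_per_tile)  # 719: no single tile exceeds the cap

# The cap only bites when one tile alone is denser than 300 detections:
dense_tile = 745
print(min(dense_tile, PER_TILE_CAP))  # 300, silently truncating 445
```

So a total well above 300 is compatible with never hitting the limit, as long as no individual tile crosses it.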
@bw4sz Just following up: would you like me to test with your checkpoint to demonstrate the limit being hit, or is this ready for review as-is? Happy to resolve conflicts and mark ready whenever.
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #1314 +/- ##
==========================================
- Coverage 86.87% 86.84% -0.04%
==========================================
Files 24 24
Lines 3064 3185 +121
==========================================
+ Hits 2662 2766 +104
- Misses 402 419 +17
Flags with carried forward coverage won't be shown.
@bw4sz Rebased and added the requested changes. Testing with the shared image didn't hit the per-tile 300 cap. If you'd like, I can test with your alternate checkpoint to demonstrate the limit in a denser scenario.
@jveitchmichaelis drop one of those tree images here. |
@jveitchmichaelis Hi! Would you mind sharing one of those tree images when you get a chance? Would love to test this out properly. Thanks!
Try this one:
Thanks @jveitchmichaelis! Testing now, will share results shortly.
@jveitchmichaelis Results on your image: the default 300 limit is being hit on dense trees.
The cap of 300 silently drops 445+ detections. Raising it from 300 to 1000 recovers ~38% more trees with ~13% overhead. Verified that m.model.detections_per_img correctly reads the configured value after create_model(). Let me know if you'd like anything else tested.
The config + passing to the model is fine.
There is a small issue that is likely to pop up in future as we introduce more architectures. This is a specific argument for retinanet. DETR-based models have the same sort of config option, but it is a very different mechanic (changing the number of object queries will require completely retraining because of how the detection head works). And we'll encounter the same with other models I imagine.
I think the naming is confusing for users: how many raw detections does the model produce (which is related to anchors), vs. how many outputs we pass to NMS (topk_candidates), vs. how many we keep after NMS (detections_per_img). This is from retinanet, but we can map it to whatever we like.
For example, in DETR (and family) "num_queries" defaults to 300. We also have an optional NMS stage which seems required for some datasets where the object count varies a lot.
One option would be to expose configs specifically for different architectures, which makes the config a little larger but is at least unambiguous.
@bw4sz any thoughts here?
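To make the per-architecture option above concrete, one possible shape is a nested config with one block per model family. All key names here are illustrative sketches, not keys from this PR:

```yaml
# Illustrative per-architecture config; key names are hypothetical.
model:
  retinanet:
    topk_candidates: 1000     # candidates passed into NMS
    detections_per_img: 300   # final cap applied after NMS
  detr:
    num_queries: 300          # baked into the detection head; changing it requires retraining
    use_nms: true             # optional NMS stage for datasets with high object-count variance
```

This keeps the config slightly larger, but each key is unambiguous about which architecture and pipeline stage it controls.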
@jveitchmichaelis @bw4sz Agreed that these are fundamentally different kinds of parameters. detections_per_img and topk_candidates are inference-time filters for RetinaNet: topk_candidates controls how many anchor boxes survive to NMS, and detections_per_img caps the output after NMS. DETR's num_queries is a different story: it defines the number of learned object queries in the detection head itself. The nested approach makes sense to me, with a retinanet: block now and a detr: block later. Implementation would mean a RetinaNetConfig dataclass in schema.py and updating both init paths in retinanet.py to use it. Happy to implement, just waiting on @bw4sz's thoughts before touching anything.
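A minimal sketch of the RetinaNetConfig dataclass mentioned above. The field names follow the torchvision parameters discussed in this thread, but the class itself and its module layout are assumptions, not code from the PR:

```python
from dataclasses import dataclass


# Hypothetical sketch of a RetinaNetConfig for schema.py; not the PR's actual code.
@dataclass
class RetinaNetConfig:
    # Number of candidate boxes kept before NMS (torchvision default: 1000).
    topk_candidates: int = 1000
    # Maximum detections returned per image/tile after NMS (torchvision default: 300).
    detections_per_img: int = 300


cfg = RetinaNetConfig(detections_per_img=1000)
print(cfg.topk_candidates, cfg.detections_per_img)  # 1000 1000
```

Both init paths in retinanet.py would then read these fields instead of relying on torchvision's hardcoded defaults.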
That's fine with me: per-model config. We only have a couple of models at the moment; when we get to AutoModel, that may make this more complicated. I think this is pressing enough, let's get this in.
Okay, we discussed and per-model config is fine. We want to get this in, and then that will lead us to a larger conversation about config files for each model and per-model documentation. That is too big of an idea to tie to this small and important PR.
@jveitchmichaelis rebased onto latest main, all checks passing. Ben confirmed the current config approach is fine for now. Ready for your approval!

This PR exposes detections_per_img and topk_candidates as configurable parameters so users can adjust them for dense scenes. Currently, these values are hardcoded to the torchvision defaults (300 / 1000). In very dense imagery (e.g., large bird colonies or dense tree canopies), this limit can truncate valid detections.
Changes
- Added detections_per_img and topk_candidates to config.yaml with current defaults.
- Wired them through Model.create_model() → RetinaNetHub() → torchvision RetinaNet.
- Updated test_main.py to verify the parameters are correctly applied to the model.
Backwards Compatibility
Defaults remain unchanged (300 / 1000), so existing workflows are unaffected.
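To illustrate where the two defaults act, here is a deliberately simplified stand-in for the filtering order in torchvision's RetinaNet postprocessing (the real implementation works per feature level on score tensors; the halving "NMS" below is a placeholder, not real NMS):

```python
# Simplified sketch of the two filter stages exposed by this PR.
# topk_candidates trims the candidate pool BEFORE NMS;
# detections_per_img caps the final output AFTER NMS.
def filter_detections(scores, topk_candidates=1000, detections_per_img=300):
    # Stage 1: keep at most topk_candidates highest-scoring boxes for NMS.
    pre_nms = sorted(scores, reverse=True)[:topk_candidates]
    # Stage 2: stand-in for NMS; pretend suppression keeps every other box.
    post_nms = pre_nms[::2]
    # Stage 3: cap the final detections per image/tile.
    return post_nms[:detections_per_img]


scores = [i / 2000 for i in range(2000)]  # 2000 hypothetical confidences
print(len(filter_detections(scores)))                            # 300
print(len(filter_detections(scores, detections_per_img=1000)))   # 500
```

With the defaults, a dense tile is cut to 300 regardless of how many boxes survive NMS; raising detections_per_img lets the extra survivors through, which is the recovery effect reported in the benchmarks above.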
Benchmarks
See issue #1309 for details. On CPU, increasing the limit showed only modest overhead (~3–8%). The sample images in the repo did not reach the 300-detection cap, so I was not able to demonstrate detection differences locally. Happy to test further with denser imagery if available.
Fixes #1309