Skip to content

fix: prevent duplicate request processing in delayRequest#3504

Open
dearlordylord wants to merge 1 commit intoapify:masterfrom
dearlordylord:fix/delay-request-race
Open

fix: prevent duplicate request processing in delayRequest#3504
dearlordylord wants to merge 1 commit intoapify:masterfrom
dearlordylord:fix/delay-request-race

Conversation

@dearlordylord
Copy link
Copy Markdown

@dearlordylord dearlordylord commented Mar 20, 2026

delayRequest removes request from inProgress immediately, then re-adds via timer. Between those two points another worker can fetch the same request, causing duplicate processing.

Fix: stop manipulating inProgress in delayRequest. The request stays guarded from fetchNextRequest through reclaimRequest's natural 3s cleanup timer.

The inProgress delete/add was introduced in #2045 to "clean up inProgress cache" during delays. But even at that commit, reclaimRequest had an inProgress.has() guard that passed fine without the delete - the manipulation was never needed. It only made inProgressCount appear lower during delays, while opening the race window. Keeping the request in inProgress during the delay is semantically correct: it is pending indeed.

Regression test included (red/green)

refs #3427

@janbuchar janbuchar self-requested a review March 23, 2026 12:37
@janbuchar
Copy link
Copy Markdown
Contributor

Hi @dearlordylord and thanks for your contribution. At a glance, the functionality that you're removing seems intentional, can you investigate why it was added in the first place?

@dearlordylord
Copy link
Copy Markdown
Author

Hi @dearlordylord and thanks for your contribution. At a glance, the functionality that you're removing seems intentional, can you investigate why it was added in the first place?

Hi @janbuchar thank you for checking. Could you please check the last paragraph of my pr description where I have the investigation of the removed functionality? Would that be sufficient?

@janbuchar
Copy link
Copy Markdown
Contributor

Hi @dearlordylord and thanks for your contribution. At a glance, the functionality that you're removing seems intentional, can you investigate why it was added in the first place?

Hi @janbuchar thank you for checking. Could you please check the last paragraph of my pr description where I have the investigation of the removed functionality? Would that be sufficient?

Yeah, I missed that, I'm sorry. @B4nan do you remember why that change was needed, 2.5 years ago? It looks like the issue this PR is dealing with has actually appeared almost immediately after PR #2045.

Also, the code snippet in #3427 does not contain sameDomainDelaySecs, so I don't think that the delayed re-insertion of requests should even happen there.

@B4nan
Copy link
Copy Markdown
Member

B4nan commented Mar 23, 2026

Can't say I remember it exactly, but reading the PR title, I think it was solving some deadlocks caused by stale inProgress cache. Stale requests in there were causing the zero-concurrency bugs back in the day (but that was likely long before you joined, a bug that was so hard to reproduce that it often took many hours of running an actor).

This was surely not about "making things work", this was about "not leaking", and "not causing dead locks".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants