CASSSIDECAR-454: Sidecar should wake up immediately on the instance r…#346

Open
mansikhara wants to merge 9 commits into apache:trunk from mansikhara:CASSSIDECAR-454

@mansikhara mansikhara commented May 11, 2026

Summary

UpdateRestoreJobHandler now immediately notifies the restore system after writing a phase signal (STAGE_READY or IMPORT_READY) to the DB, eliminating the 5–10 minute idle wait for the next RestoreJobDiscoverer polling cycle.

  • IMPORT_READY: calls RestoreJobManagerGroup.updateRestoreJob() to propagate the new status to in-memory trackers, allowing RestoreProcessor to unblock staged
    ranges on its next 1-second tick.
  • STAGE_READY: calls RestoreJobDiscoverer.processJobNow(), which discovers slices from the DB and submits them to RestoreProcessor immediately.
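
The routing described in the two bullets above can be sketched roughly as follows. This is a minimal illustration, not the handler code from this PR: the `Status` enum, the `ManagerGroup` and `Discoverer` interfaces, and the String job id parameters are hypothetical stand-ins for RestoreJobManagerGroup and RestoreJobDiscoverer, whose real signatures may differ.

```java
import java.util.ArrayList;
import java.util.List;

final class PhaseSignalDispatchSketch
{
    // Hypothetical subset of restore job statuses; only the two phase signals matter here.
    enum Status { CREATED, STAGE_READY, IMPORT_READY, SUCCEEDED }

    // Hypothetical stand-in for RestoreJobManagerGroup (the real signature may differ).
    interface ManagerGroup
    {
        void updateRestoreJob(String jobId);
    }

    // Hypothetical stand-in for RestoreJobDiscoverer (the real signature may differ).
    interface Discoverer
    {
        void processJobNow(String jobId);
    }

    // Routes a freshly written status to the matching notification.
    // Returns true if a notification was sent, false if the status is not a phase signal.
    static boolean notifyPhaseSignal(Status status, String jobId, ManagerGroup managers, Discoverer discoverer)
    {
        switch (status)
        {
            case IMPORT_READY:
                // Propagate the new status to the in-memory trackers.
                managers.updateRestoreJob(jobId);
                return true;
            case STAGE_READY:
                // Discover slices now instead of waiting for the next polling cycle.
                discoverer.processJobNow(jobId);
                return true;
            default:
                // Any other status is left to the regular discovery loop.
                return false;
        }
    }

    public static void main(String[] args)
    {
        List<String> calls = new ArrayList<>();
        ManagerGroup managers = id -> calls.add("updateRestoreJob:" + id);
        Discoverer discoverer = id -> calls.add("processJobNow:" + id);

        notifyPhaseSignal(Status.IMPORT_READY, "job-1", managers, discoverer);
        notifyPhaseSignal(Status.STAGE_READY, "job-1", managers, discoverer);
        notifyPhaseSignal(Status.CREATED, "job-1", managers, discoverer);

        System.out.println(calls);
    }
}
```

Statuses other than the two phase signals fall through without any notification, so the periodic discovery loop remains the only path for them.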

Design decisions

  • Notification is fire-and-forget on a worker thread (executeBlocking), so it does not block the event loop or delay the HTTP 200 response.
  • The full job is re-read from the DB inside notifyPhaseSignalMaybe because RestoreJobDatabaseAccessor.update() returns a sparse object missing fields such as
    keyspace and consistency level.
  • Both calls are additive and safe to miss: the DB write remains the durable source of truth, and the discovery loop still picks up the signal on its next cycle if
    the immediate notification fails.
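
To illustrate the three decisions above, here is a rough sketch of a fire-and-forget notification that re-reads the full job and swallows failures. It is not the code in this PR: a plain `ExecutorService` with `CompletableFuture` is swapped in for Vert.x's executeBlocking so the example runs on the JDK alone, and the `Job` class, `notifyAsync`, and its `readFullJob` and `notifier` parameters are hypothetical names.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;
import java.util.function.Function;

final class FireAndForgetNotifySketch
{
    // Hypothetical full job record; per the PR, update() returns a sparse object without these fields.
    static final class Job
    {
        final String id;
        final String keyspace;
        final String consistencyLevel;

        Job(String id, String keyspace, String consistencyLevel)
        {
            this.id = id;
            this.keyspace = keyspace;
            this.consistencyLevel = consistencyLevel;
        }
    }

    // Schedules the notification on the worker pool and returns without waiting for it.
    // The future yields true if the notifier ran, false if the job was not found
    // or if the read or the notifier threw. Failures are mapped to false, not rethrown.
    static CompletableFuture<Boolean> notifyAsync(ExecutorService workers,
                                                  String jobId,
                                                  Function<String, Optional<Job>> readFullJob,
                                                  Consumer<Job> notifier)
    {
        return CompletableFuture.supplyAsync(() -> {
            // Re-read the complete job rather than trusting the sparse update result.
            Optional<Job> job = readFullJob.apply(jobId);
            if (!job.isPresent())
            {
                return false;
            }
            notifier.accept(job.get());
            return true;
        }, workers).exceptionally(cause -> false);
    }

    public static void main(String[] args)
    {
        ExecutorService workers = Executors.newSingleThreadExecutor();
        Job stored = new Job("job-1", "ks1", "LOCAL_QUORUM");

        // The caller can respond to the client right away; the future is only joined here for the demo.
        CompletableFuture<Boolean> ok =
            notifyAsync(workers, "job-1", id -> Optional.of(stored), job -> System.out.println("notified " + job.id));
        CompletableFuture<Boolean> failed =
            notifyAsync(workers, "job-1", id -> Optional.of(stored), job -> { throw new IllegalStateException("boom"); });

        System.out.println("notified=" + ok.join() + ", failedNotification=" + failed.join());
        workers.shutdown();
    }
}
```

Because a failed notification only resolves to false, a caller that ignores the future loses nothing durable: the status is already in the DB and the next discovery cycle would pick it up.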

Test plan

  • UpdateRestoreJobHandlerTest — new tests verify IMPORT_READY triggers updateRestoreJob and STAGE_READY triggers processJobNow
  • RestoreJobDiscovererPhaseSignalIntTest — new integration tests verify ranges are created immediately on STAGE_READY without explicit discovery, and
    IMPORT_READY does not create duplicates
  • Existing RestoreJobDiscovererNode*IntTest tests pass (regression)

Contributor

yifan-c commented May 13, 2026

Checkstyle is failing. There is a Gradle task to run the check quickly on your local machine:

./gradlew codeCheckTasks

Contributor

@sarankk sarankk left a comment

Thanks Mansi, looks good. Left some comments.

mkhara added 7 commits May 20, 2026 15:20
…eceiving a phase signal instead of waiting for the discovery loop
…eceiving a phase signal instead of waiting for the discovery loop
…eceiving a phase signal instead of waiting for the discovery loop
…eceiving a phase signal instead of waiting for the discovery loop
…eceiving a phase signal instead of waiting for the discovery loop
…eceiving a phase signal instead of waiting for the discovery loop
…eceiving a phase signal instead of waiting for the discovery loop
mkhara added 2 commits May 21, 2026 10:43
…eceiving a phase signal instead of waiting for the discovery loop
…eceiving a phase signal instead of waiting for the discovery loop