Fetch deadline callback context via Execution API at runtime#66608
Open
seanghaeli wants to merge 7 commits into
…in DB

Replace the simple context workaround from apache#55241 that stored serialized context in trigger kwargs. Now that apache#55068 gives the triggerer API access, fetch the DagRun and build context at execution time. This avoids DB bloat from serialized context, provides fresh (not stale) context, and enables richer context information.

The CallbackTrigger now uses SUPERVISOR_COMMS.asend(GetDagRun(...)) to fetch the DagRun details from the Execution API when it runs, rather than receiving a pre-built context dict from the scheduler.

Changes:
- deadline.py: Store only identifiers (dag_id, run_id, deadline_id, deadline_time) in callback kwargs instead of serialized context
- callback.py: Add _build_context() that fetches DagRun via Execution API; maintain backward compat for old callbacks with "context" key
- triggerer_job_runner.py: Add GetDagRun/DagRunResult to triggerer comms
- callback_supervisor.py: Add GetDagRun to executor callback comms

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
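For readers skimming the thread, the runtime fetch described in this commit message can be sketched roughly as follows. This is not the code from the PR: GetDagRun and DagRunResult here are local stand-ins for the real message classes, FakeComms is a hypothetical replacement for SUPERVISOR_COMMS so the example runs without Airflow, and the way ds and ts are derived from logical_date is an assumption.

```python
import asyncio
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Optional


@dataclass
class GetDagRun:
    """Stand-in for the request message (hypothetical, not the Airflow class)."""
    dag_id: str
    run_id: str


@dataclass
class DagRunResult:
    """Stand-in for the response, limited to the fields the PR says are read."""
    dag_id: str
    run_id: str
    logical_date: Optional[datetime]
    data_interval_start: Optional[datetime]
    data_interval_end: Optional[datetime]
    conf: dict = field(default_factory=dict)


class FakeComms:
    """Hypothetical replacement for SUPERVISOR_COMMS so the sketch runs offline."""

    async def asend(self, msg: GetDagRun) -> DagRunResult:
        when = datetime(2026, 5, 8, tzinfo=timezone.utc)
        return DagRunResult(msg.dag_id, msg.run_id, when, when, when, {"env": "dev"})


async def build_context(comms: FakeComms, kwargs: dict) -> dict:
    # Backward compatibility: callbacks queued before the change carry a
    # pre-built "context" dict, so return it and skip the API call.
    if "context" in kwargs:
        return kwargs["context"]

    # New path: only identifiers are stored, so fetch the DagRun at run time.
    dag_run = await comms.asend(GetDagRun(kwargs["dag_id"], kwargs["run_id"]))
    context: dict[str, Any] = {
        "dag_id": dag_run.dag_id,
        "run_id": dag_run.run_id,
        "logical_date": dag_run.logical_date,
        "data_interval_start": dag_run.data_interval_start,
        "data_interval_end": dag_run.data_interval_end,
        "conf": dag_run.conf,
    }
    if dag_run.logical_date is not None:
        # Assumed derivation of the ds/ts convenience keys.
        context["ds"] = dag_run.logical_date.strftime("%Y-%m-%d")
        context["ts"] = dag_run.logical_date.isoformat()
    return context


fresh = asyncio.run(build_context(FakeComms(), {"dag_id": "etl", "run_id": "manual_1"}))
legacy = asyncio.run(build_context(FakeComms(), {"context": {"dag_id": "old"}}))
```

In this sketch, fresh is built from the fetched DagRun, while legacy is returned untouched because its kwargs still contain a "context" key.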
Contributor
Author
@ramitkataria I incorporated your feedback from #64984. Your review would be much appreciated!
added 5 commits
May 8, 2026 22:54
The CallbackTrigger legitimately imports from airflow.sdk to communicate with the supervisor via the Execution API at runtime, similar to triggers/base.py and jobs/triggerer_job_runner.py, which are already excluded.
ferruzzi
reviewed
May 12, 2026
Just a quick question, otherwise LGTM.
Address review feedback: only include deadline keys that have non-None values, preventing the callback from receiving unexpected None entries.
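The filtering described in this commit can be illustrated with a small sketch. The helper name deadline_entries and the flat dict shape are assumptions made for illustration; the PR may structure the deadline info differently.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def deadline_entries(
    deadline_id: Optional[str], deadline_time: Optional[datetime]
) -> dict:
    """Return only the deadline keys that have a value (hypothetical helper)."""
    candidates: dict[str, Any] = {
        "deadline_id": deadline_id,
        "deadline_time": deadline_time,
    }
    # Drop None values so the callback never sees a key mapped to None.
    return {key: value for key, value in candidates.items() if value is not None}


only_id = deadline_entries("d1", None)
nothing = deadline_entries(None, None)
both = deadline_entries("d1", datetime(2026, 5, 12, tzinfo=timezone.utc))
```

A callback can then test for a key with `in` instead of also checking for None.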
ferruzzi
approved these changes
May 12, 2026
Approved pending CI passing
Summary
Replace the simple context workaround from #55241 that stored serialized context in trigger kwargs (DB). Now that #55068 gives the triggerer API access, fetch the DagRun at execution time via the Execution API and build context fresh.
This avoids DB bloat from serialized context, provides fresh (not stale) context, and builds a richer context dict including logical_date, ds, ts, conf, data_interval_start/end, and the deadline info.
Changes
- deadline.py: Remove get_simple_context(). Store only identifiers (dag_id, run_id, deadline_id, deadline_time) in callback kwargs.
- callback.py: Add _build_context() that fetches DagRun via SUPERVISOR_COMMS.asend(GetDagRun(...)). Backward compat: old callbacks with "context" key still work.
- triggerer_job_runner.py: Add GetDagRun to ToTriggerSupervisor union, DagRunResult to ToTriggerRunner union, handler in _handle_request.
- callback_supervisor.py: Add GetDagRun to CallbackToSupervisor union + handler for executor callback path.
- GetDagRun handler test.
Testing
Ran in Breeze to verify the comms plumbing works e2e:
- GetDagRun round-trips through the triggerer's ToTriggerSupervisor → _handle_request → DagRunResult response path without breaking existing trigger handling
- SUPERVISOR_COMMS.asend() is the correct async calling pattern: it uses TriggerCommsDecoder from init_comms() with an async lock for coroutine safety in the trigger event loop
- DagRun generated model has all fields accessed in _build_context: logical_date, data_interval_start, data_interval_end, conf
- Old callbacks with "context" key (queued before this change) still work
Motivation
Per @ramitkataria's feedback on #64984: context should not be stored in the DB. The triggerer now has API access (#55068), so fetch it at runtime like tasks do.
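The supervisor side of the GetDagRun round trip listed under Testing could look something like the sketch below. Every name here is a local stand-in: the unions, the GetVariable request, FakeApiClient, and handle_request are hypothetical and only mirror the idea of adding one more message type to a request union and one more branch to its handler.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class GetDagRun:
    dag_id: str
    run_id: str


@dataclass
class GetVariable:
    """A pre-existing request type, included to show it keeps working."""
    key: str


@dataclass
class DagRunResult:
    dag_id: str
    run_id: str
    state: str


@dataclass
class VariableResult:
    key: str
    value: str


# Adding a request means extending both unions and the handler below.
ToSupervisor = Union[GetDagRun, GetVariable]
ToRunner = Union[DagRunResult, VariableResult]


class FakeApiClient:
    """Hypothetical stand-in for an Execution API client."""

    def get_dag_run(self, dag_id: str, run_id: str) -> dict:
        return {"dag_id": dag_id, "run_id": run_id, "state": "running"}

    def get_variable(self, key: str) -> str:
        return "value-of-" + key


def handle_request(msg: ToSupervisor, client: FakeApiClient) -> ToRunner:
    # Dispatch on the message type; unknown types fail loudly.
    if isinstance(msg, GetDagRun):
        return DagRunResult(**client.get_dag_run(msg.dag_id, msg.run_id))
    if isinstance(msg, GetVariable):
        return VariableResult(msg.key, client.get_variable(msg.key))
    raise TypeError(f"unsupported request: {type(msg).__name__}")


dag_run_reply = handle_request(GetDagRun("etl", "manual_1"), FakeApiClient())
variable_reply = handle_request(GetVariable("region"), FakeApiClient())
```

The second call shows that a request type that existed before the new branch is handled the same way as before.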
Related