[WIP] Bug 2051180 - Track which pushes are actually backfilled#9662

Open
junngo wants to merge 1 commit into mozilla:master from junngo:track-backfilled-push

@junngo junngo commented Jul 3, 2026

This PR is a work in progress. Please don't merge it yet.


Bugzilla: https://bugzilla.mozilla.org/show_bug.cgi?id=2051180
This PR adds tracking for which pushes were actually backfilled when Sherlock triggers a backfill.

How it works

  1. When Sherlock triggers a backfill, the Bk task ID is returned and saved to a new BackfillRequest table.
  2. On the next Sherlock run (~1 hour later), the BackfillTracker class picks up any PENDING requests and parses the live_backing_log of each Bk task.
  • We don't track immediately after triggering because the Bk task takes a few minutes to complete.
  3. Two log lines are parsed to track the backfilled pushes [1]:
Backfill started: label=test-windows11-64-24h2-asan/opt-reftest, strategy=standard, slices=0, target_pushes=['270508', '270509', '270510', ...]

BACKFILL_DATA: {"push_id":"270508","decision_task_id":"PxSdYjhtTeegC-w531-17w","label":"test-windows11-64-24h2-asan/opt-reftest","strategy":"standard","slices":0,"push_count":1,"total_pushes":9}
BACKFILL_DATA: {"push_id":"270509","decision_task_id":"bt18xdpDS92wYIsrhL8HXw","label":"test-windows11-64-24h2-asan/opt-reftest","strategy":"standard","slices":0,"push_count":2,"total_pushes":9}
...
  • Backfill started: is logged once, before backfilling begins, and contains the overall backfill plan.
  • BACKFILL_DATA: is logged once per push as it gets backfilled. We use its decision_task_id to look up which push was actually backfilled in Treeherder.
  • Note 1: A patch to change the Backfill started format to JSON is currently under review: https://phabricator.services.mozilla.com/D309317
  • Note 2: We can't use the push_id parsed from the log directly because that ID is the hg (CI-side) push ID.
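
To make the parsing step concrete, here is a minimal sketch of how the two log lines could be extracted. It is not the code in this PR: the function names, the regular expression, and the assumption that each marker may be preceded by a log prefix (such as a timestamp) are illustrative only.

```python
import json
import re

# Hypothetical pattern for the current (non-JSON) "Backfill started" format.
BACKFILL_STARTED_RE = re.compile(
    r"Backfill started: label=(?P<label>\S+), strategy=(?P<strategy>\w+), "
    r"slices=(?P<slices>\d+), target_pushes=(?P<target_pushes>\[.*\])"
)
BACKFILL_DATA_PREFIX = "BACKFILL_DATA: "


def parse_backfill_started(line):
    """Return the backfill plan as a dict, or None if the line does not match."""
    match = BACKFILL_STARTED_RE.search(line)
    if match is None:
        return None
    return {
        "label": match["label"],
        "strategy": match["strategy"],
        "slices": int(match["slices"]),
        # Pull the quoted numeric IDs out of the Python-style list literal.
        "target_pushes": re.findall(r"'(\d+)'", match["target_pushes"]),
    }


def parse_backfill_data(line):
    """Return the JSON payload of a BACKFILL_DATA line, or None."""
    index = line.find(BACKFILL_DATA_PREFIX)
    if index == -1:
        return None
    try:
        return json.loads(line[index + len(BACKFILL_DATA_PREFIX):])
    except json.JSONDecodeError:
        return None


def parse_log(text):
    """Scan a whole log and return (plan, per_push_entries)."""
    plan = None
    entries = []
    for line in text.splitlines():
        started = parse_backfill_started(line)
        if started is not None:
            plan = started
            continue
        data = parse_backfill_data(line)
        if data is not None:
            entries.append(data)
    return plan, entries
```

Once the Backfill started line is emitted as JSON (Note 1), the regular expression could presumably be replaced by the same prefix-plus-`json.loads` approach used for BACKFILL_DATA.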

New tables

  • BackfillRequest: one row per backfill trigger, stores the Bk task ID and parsed metadata from Backfill started
  • BackfilledPush: one row per actual backfilled push, linked to the Treeherder Push via decision_task_id
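
As a rough illustration of how the two tables relate, the sketch below models them with plain dataclasses rather than the actual Django models in this PR. All field names, the `PROCESSED` status, and the `push_by_decision_task` lookup are assumptions made for the example; only `PENDING`, the Bk task ID, and the `decision_task_id` link come from the description above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Status(Enum):
    PENDING = "pending"
    PROCESSED = "processed"  # hypothetical; the PR only mentions PENDING


@dataclass
class BackfillRequest:
    """One row per backfill trigger."""
    bk_task_id: str
    label: str = ""
    strategy: str = ""
    slices: int = 0
    target_pushes: List[str] = field(default_factory=list)
    status: Status = Status.PENDING


@dataclass
class BackfilledPush:
    """One row per push that was actually backfilled."""
    request: BackfillRequest
    decision_task_id: str
    treeherder_push_id: Optional[int] = None


def record_backfilled_pushes(
    request: BackfillRequest,
    entries: List[dict],
    push_by_decision_task: Dict[str, int],
) -> List[BackfilledPush]:
    """Link each BACKFILL_DATA entry to a Treeherder push via decision_task_id.

    `push_by_decision_task` stands in for the real database lookup.
    """
    rows = []
    for entry in entries:
        task_id = entry["decision_task_id"]
        rows.append(
            BackfilledPush(
                request=request,
                decision_task_id=task_id,
                # The hg-side push_id in the entry is deliberately not used.
                treeherder_push_id=push_by_decision_task.get(task_id),
            )
        )
    request.status = Status.PROCESSED
    return rows
```

The lookup is keyed on `decision_task_id` rather than the logged `push_id`, which mirrors Note 2 above.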

This will serve as the foundation for a follow-up patch that improves Sherlock's outcome check so that it only checks the pushes that were actually backfilled rather than the full push range.
Suggestions and opinions are welcome at any time.

[1] example
https://firefoxci.taskcluster-artifacts.net/Rymag17MSy2fMS-egTfdoA/0/public/logs/live_backing.log
https://treeherder.mozilla.org/jobs?repo=autoland&searchStr=bk&fromchange=47b90175fca9b0eb0f34d554e2226857016359eb&selectedTaskRun=Rymag17MSy2fMS-egTfdoA.0

request.save()

@staticmethod
def _parse_backfill_started(line: str) -> dict:


This helper will be removed after the format is changed to JSON by https://phabricator.services.mozilla.com/D309317
