Skip to content

Fix flaky test_compute_derivative_rows_mem_usage#23059

Open
AAraKKe wants to merge 2 commits intomasterfrom
aarakke/fix-flake-statement-metrics-cache-test
Open

Fix flaky test_compute_derivative_rows_mem_usage#23059
AAraKKe wants to merge 2 commits intomasterfrom
aarakke/fix-flake-statement-metrics-cache-test

Conversation

@AAraKKe
Copy link
Copy Markdown
Contributor

@AAraKKe AAraKKe commented Mar 26, 2026

What does this PR do?

Replaces the tracemalloc peak-memory threshold check in test_compute_derivative_rows_mem_usage with a direct assertion on the contents of _previous_statements.

Motivation

The test was failing on master (job link) with a measured peak of ~105 MB against a 10 MB threshold, on Python 3.13 with ddtrace profiling active.

The root cause is that the old approach measured tracemalloc peak memory, which is sensitive to:

  • Python version (constant folding behaviour changed in 3.13)
  • GC timing (whether previous rounds' rows are freed before new ones are allocated)
  • ddtrace profiler keeping frame/object references alive longer than expected

Why the new assertion is correct

The property we actually care about is that _previous_statements does not hold on to large non-metric data (e.g. full query strings) between check runs. compute_derivative_rows achieves this by explicitly building the cache from only the metric columns:

self._previous_statements = {
    row_key: {col: row[col] for col in metrics if col in row}
    for row_key, row in merged_rows.items()
}

So after a call with metrics = ['count', 'time'], the cache looks like:

{'sig3767': {'count': 100, 'time': 2005}, 'sig4317': {'count': 200, 'time': 4010}, ...}

The new test directly asserts this: each cached entry's keys are a subset of the requested metrics — meaning the 3000-char query strings, errors, db, user, etc. are never retained. This is deterministic, independent of Python version and profiler state.

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • Add the qa/skip-qa label if the PR doesn't need to be tested during QA.
  • If you need to backport this PR to another branch, you can add the backport/<branch-name> label to the PR and it will automatically open a backport PR once this one is merged

Replace tracemalloc peak threshold check with direct assertion on
_previous_statements contents. The old test was sensitive to Python
version, GC timing, and ddtrace profiler overhead, causing it to exceed
the 10 MB threshold on Python 3.13 in CI (measured ~105 MB).

The new test directly verifies the property that matters: after calling
compute_derivative_rows, the cache only contains metric columns (e.g.
count, time) and not the full row data (e.g. the large query strings).
@AAraKKe AAraKKe added the qa/skip-qa Automatically skip this PR for the next QA label Mar 26, 2026
@AAraKKe AAraKKe requested review from a team as code owners March 26, 2026 09:42
@AAraKKe AAraKKe added the qa/skip-qa Automatically skip this PR for the next QA label Mar 26, 2026
@AAraKKe
Copy link
Copy Markdown
Contributor Author

AAraKKe commented Mar 26, 2026

Added no-changelog since this is only tests. Will involve DBM in the review as well to make sure this is not breaking something I am missing.

.claude

# PR review artifacts
.agint-review/
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is added to ignore any review we do with Claude in the repo, since worktrees are created it can become a pain to ensure none of this is considered.

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 26, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.26%. Comparing base (969b7f0) to head (578371b).

Additional details and impacted files
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

@eric-weaver eric-weaver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm 👍 this does change the goal of this test which was originally introduced here, but given the non-deterministic behavior of tracemalloc on CI this is a better test

@AAraKKe AAraKKe enabled auto-merge March 26, 2026 13:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants