fix: use merge commit timestamp for PR benchmark entries#343
fix: use merge commit timestamp for PR benchmark entries#343
Conversation
Replace pr.head.repo.updated_at (fork sync time) with the timestamp from the local merge commit (HEAD), which is always accessible even with shallow clones and is accurate to within seconds of the actual commit time.
📝 WalkthroughWalkthroughThe pull request fixes incorrect commit timestamp extraction for GitHub pull requests. Instead of using Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Poem
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #343 +/- ##
==========================================
+ Coverage 89.24% 89.27% +0.02%
==========================================
Files 16 16
Lines 958 960 +2
Branches 203 204 +1
==========================================
+ Hits 855 857 +2
+ Misses 103 99 -4
- Partials 0 4 +4 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@henrikingo I think we can fix the issue reported by you without using the GH API. It's still an approximation as explained in the PR description but should be only seconds off the real commit timestamp. WDYT? |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
test/extract.spec.ts (1)
215-245: Consider asserting the exact git command invocation.To harden this regression test, assert that the mocked
cmdwas called with([], 'log', '-1', '--format=%aI', 'HEAD').Suggested test strengthening
+import { cmd } from '../src/git'; ... it('collects the commit information from pull_request payload as fallback', async function () { ... const { commit } = await extractResult(config); + expect(cmd).toHaveBeenCalledWith([], 'log', '-1', '--format=%aI', 'HEAD'); ... A.equal(commit.timestamp, '2024-01-15T10:30:00+00:00');🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/extract.spec.ts` around lines 215 - 245, Add an assertion after the call to extractResult to verify the mocked cmd was invoked with the exact git args; specifically assert that the mocked cmd function (mocked `cmd`) was called with ([], 'log', '-1', '--format=%aI', 'HEAD') to ensure the test validates the precise git command used by extractResult.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@test/extract.spec.ts`:
- Around line 215-245: Add an assertion after the call to extractResult to
verify the mocked cmd was invoked with the exact git args; specifically assert
that the mocked cmd function (mocked `cmd`) was called with ([], 'log', '-1',
'--format=%aI', 'HEAD') to ensure the test validates the precise git command
used by extractResult.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a4948451-6135-457c-9658-385ede1e145a
📒 Files selected for processing (2)
src/extract.tstest/extract.spec.ts
Good call, I was pretty novice back then I guess.
Well, you have to! I didn't realize back then that giving the GITHUB_TOKEN to random bypassers is not safe!
I went back to check how I had fixed this in the mean time. It turns out I also use the local git checkout to grab the latest git commit. Since PR commits are kinda ephemeral anyway, and more importantly, in a pull request context they will always be leaf nodes, I at least convinced myself, that using the merge commit from the pr branch is sufficient, because it will necessarily be a later timestamp than the tip of the base branch. And that is sufficient. I also found in the pr api another timestamp called Both of these are poorly documented and I can't remember exactly why I found it useable, but it seems to be the timestamp of the most recent push to the head repo. This too feels like it will necessarily be large enough that it's always >= the base commit timestamp. However, this timestamp is wrong / arbitrary, because if someone else pushes something into a completely unrelated branch, you will get their timestamp. So in practice I use that value as a default or fallback timestamp and get the quasi-accurate timestamp from the locally checked out git repo, just like you here. |
Problem
getCommitFromPullRequestPayloadwas usingpr.head.repo.updated_atas the commit timestamp. This field reflects when the repository's metadata was last updated — for forks, that's typically the last time the fork was synced with upstream, which can be months in the past. The result is a wildly incorrect timestamp in the benchmark chart tooltip.Solution analysis
We considered three approaches:
1. GitHub API (
getCommitFromGitHubAPIRequest) — accurate, but requiresgithub-tokento be provided. The issue reporter suggested making the token always required for PR contexts, but we wanted to avoid that.2.
git log -1 --format=%aI <pr.head.sha>— accurate (real commit author date), no token needed. However,actions/checkoutdefaults tofetch-depth: 1(shallow clone), which fetches the synthetic merge commit (refs/pull/N/merge) but not its parents.pr.head.shais a parent of the merge commit and is not available as a git object in a shallow clone, so this fails in the common case.3.
git log -1 --format=%aI HEAD(chosen) — reads the timestamp from the merge commit GitHub creates for the PR. This commit is alwaysHEADafteractions/checkout, accessible even withfetch-depth: 1. GitHub sets the merge commit's timestamp at creation time, which is within seconds of the actual push. Verified with a real PR: merge commit was 3 seconds newer than the PR head commit.Trade-off
The timestamp is the merge commit time, not the developer's commit author time. The difference is consistently < 30 seconds in practice — acceptable for a chart tooltip, and vastly better than the current bug which can be months off for fork PRs.
No token required. No fork detection needed. No fallback complexity.
Fixes #327
Summary by CodeRabbit