fix: stop leaving stray commas when parsing legacy multi-group PR links by hmalik88 · Pull Request #286 · MetaMask/auto-changelog

hmalik88 · 2026-05-13T20:50:33Z

Summary

extractPrLinks leaves stray commas in change descriptions when parsing the legacy form ([#A](url)), ([#B](url)). The mangled output surfaces whenever auto-changelog re-stringifies a changelog that contains historical entries in that form. The release tool (@metamask/create-release-branch) hits this on every release run because updateChangelog calls parseChangelog({ shouldExtractPrLinks: true }).

Repro

Older changelog entry:

- Add new dependency `@metamask/keyring-sdk@1.2.0` ([#478](.../pull/478)), ([#482](.../pull/482)), ([#496](.../pull/496))

After parse + re-stringify:

- Add new dependency `@metamask/keyring-sdk@1.2.0`,,, ([#478](.../pull/478), [#482](.../pull/482), [#496](.../pull/496))

One stray comma per inter-group separator. PR references are still extracted correctly. Only the description text is polluted.

Root cause

src/parse-changelog.ts matches long PR-link groups with this regex:

\s+\(\s*(\[#\d+\]\([^)]*${repoName}[^)]*\)\s*,?\s*)+\)

The pattern matches a single parenthesized group. When the input is ([#A]), ([#B]), ([#C]), each ([#X]) is matched as its own group. The , separators sit between matches, outside the pattern, so String.replace(pattern, '') leaves them in place and they end up tacked onto the description.

Fix

Extend the pattern so one match consumes the full sequence of adjacent groups, including the comma separators between them:

const longLinkGroup = `\\(\\s*(?:\\[#\\d+\\]\\([^)]*${repoName}[^)]*\\)\\s*,?\\s*)+\\)`;
new RegExp(`\\s+${longLinkGroup}(?:\\s*,\\s*${longLinkGroup})*`, 'gu');

Three shapes are now handled correctly:

Canonical, one group with comma-separated refs: ([#A](url), [#B](url))
Legacy, adjacent single-ref groups: ([#A](url)), ([#B](url))
Mixed: ([#A](url), [#B](url)), ([#C](url))

longMatchPattern (PR-number extraction) is unchanged. It finds individual [#X](url) parts regardless of grouping.

Use cases this fixes

Any caller of parseChangelog / updateChangelog whose input contains historical entries in the legacy multi-group form. Concretely:

@metamask/create-release-branch running on monorepos. It calls updateChangelog per changed package, so any package with legacy-form entries in older release sections gets mangled on its next release.
Direct callers that pass shouldExtractPrLinks: true to parseChangelog.

PR numbers are still extracted correctly, so no data is lost. The damage is to the description text, which has to be cleaned up by hand before merging the release PR.

Note

Medium Risk
Medium risk because it changes the core regex used to strip long-form PR links from changelog entries, which could affect parsing edge cases; added tests cover the new legacy and mixed formats.

Overview
Fixes extractPrLinks long-form PR link stripping to treat multiple adjacent parenthesized link groups separated by commas as a single match, preventing leftover comma characters in the resulting change description.

Adds regression tests for legacy ([#A](url)), ([#B](url)) and mixed canonical/legacy groupings, and documents the fix in CHANGELOG.md.

^{Reviewed by Cursor Bugbot for commit 71bee80. Bugbot is set up for automated code reviews on this repo. Configure here.}

mcmire · 2026-05-13T21:28:49Z

I don't know that I like that we are adding even more regex parsing to what is already a pretty wild function. I wonder if it would be a lot easier to read if we parsed a changelog entry in steps (first, separate words from links; then strip out parentheses; then split by comma; then attempt to parse the rest).

But.... I guess that would require a larger refactor, so, maybe this is fine for now. Thanks for fixing this.

One question I have is, does this fix stray commas, or just prevent them from showing up going forward?

hmalik88 · 2026-05-13T23:24:44Z

I don't know that I like that we are adding even more regex parsing to what is already a pretty wild function. I wonder if it would be a lot easier to read if we parsed a changelog entry in steps (first, separate words from links; then strip out parentheses; then split by comma; then attempt to parse the rest).

But.... I guess that would require a larger refactor, so, maybe this is fine for now. Thanks for fixing this.

One question I have is, does this fix stray commas, or just prevent them from showing up going forward?

Thanks for the review!

Two answers:

Legacy form sitting in older release sections (([#A](url)), ([#B](url)), never been corrupted yet): yes, this fix means the next create-release-branch run produces clean canonical output. Self-heals.
Already-corrupted descriptions (Add new dep, ([#A](url), [#B](url)), where the previous buggy parser wrote stray commas into the file): no, this fix doesn't retroactively clean them. From the parser's perspective those commas are now part of the description prose. Affected repos need a one-time manual scrub.

I considered adding a heuristic to also strip dangling commas in the parser, but decided against it, it'd be permanent code that exists only to compensate for a finite historical bug.

On the refactor: agreed the function is wild. Maybe we can refactor in a follow up PR?

mcmire · 2026-05-15T16:44:45Z

@hmalik88 Okay no problem. I have plans to update this tool so it's more strict about formatting and so that the autofix functionality is more useful. So we can address fixing the extra commas then.

mcmire

LGTM!

hmalik88 added 4 commits May 13, 2026 16:46

fix: consume comma separators between adjacent legacy PR link groups

1c902bc

test: add tests

01c61dd

chore: add changelog entry

b900032

chore: add PR number

71bee80

hmalik88 marked this pull request as ready for review May 13, 2026 20:51

hmalik88 requested a review from a team as a code owner May 13, 2026 20:51

github-advanced-security AI found potential problems May 13, 2026

View reviewed changes

Comment thread src/parse-changelog.ts Dismissed

mcmire approved these changes May 15, 2026

View reviewed changes

hmalik88 merged commit 16f7742 into main May 15, 2026
26 checks passed

hmalik88 deleted the hm/fix-oxfmt-formatting branch May 15, 2026 16:46

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: stop leaving stray commas when parsing legacy multi-group PR links#286

fix: stop leaving stray commas when parsing legacy multi-group PR links#286
hmalik88 merged 4 commits into
mainfrom
hm/fix-oxfmt-formatting

hmalik88 commented May 13, 2026 •

edited by cursor Bot

Loading

Uh oh!

Uh oh!

mcmire commented May 13, 2026 •

edited

Loading

Uh oh!

hmalik88 commented May 13, 2026

Uh oh!

mcmire commented May 15, 2026 •

edited

Loading

Uh oh!

mcmire left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

hmalik88 commented May 13, 2026 • edited by cursor Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Repro

Root cause

Fix

Use cases this fixes

Uh oh!

Uh oh!

mcmire commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

hmalik88 commented May 13, 2026

Uh oh!

mcmire commented May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mcmire left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

hmalik88 commented May 13, 2026 •

edited by cursor Bot

Loading

mcmire commented May 13, 2026 •

edited

Loading

mcmire commented May 15, 2026 •

edited

Loading