plans: §4 sub-PR 2.2 spike memo (discogs-cache match score shape)#794
Open
jakebromberg wants to merge 1 commit into
Audits the two questions §4 sub-PR 2.2 spelled out as preconditions for
implementation. Both findings invalidate the plan's central assumption:
1. Neither flowsheet_match nor fuzzy_resolved has a trgm_score column.
   - flowsheet_match is an exact equi-join on normalized strings; there is
     no fuzzy score because there is no fuzzy match. The plan's
     `0.7 + 0.3 * trgm_score` trigram fallback is dead code; alias_match 0.75
     is the only viable mapping for distinct_entities > 1.
   - fuzzy_resolved discards the trigram score during its resolve step;
     the score lives only on the un-persisted `fuzzy_full` staging table
     as `combined = similarity(artist) + similarity(album) ∈ [1.55, 2.0]`.
2. fuzzy_resolved carries no Discogs ID at all, only resolved_library_id.
   The master-vs-release decision applies only to flowsheet_match (S3),
   where the upstream prefers master and falls back to release; the
   recommended mapping is option (c): write whichever the source row pins,
   never both.
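Option (c) can be sketched as a tagged union that makes "never both" unrepresentable. This is an illustration only: the type and field names (`DiscogsPin`, `masterId`, `releaseId`, `pinFromRow`) are hypothetical and not taken from the schema or the memo.

```typescript
// Hypothetical shape of the Discogs columns on a flowsheet_match row.
// Per the findings, the upstream prefers master and falls back to release.
interface FlowsheetMatchRow {
  masterId: number | null;
  releaseId: number | null;
}

// Exactly one kind of ID can be carried; a value holding both cannot be built.
type DiscogsPin =
  | { kind: "master"; masterId: number }
  | { kind: "release"; releaseId: number };

// Write whichever ID the source row pins. If a row ever carried both,
// this sketch mirrors the upstream preference and keeps the master.
function pinFromRow(row: FlowsheetMatchRow): DiscogsPin | null {
  if (row.masterId !== null) {
    return { kind: "master", masterId: row.masterId };
  }
  if (row.releaseId !== null) {
    return { kind: "release", releaseId: row.releaseId };
  }
  return null; // the row pins nothing
}
```

A writer built on this shape would have to branch on `kind`, so a row can never end up with both a master and a release ID.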
Recommendation: split sub-PR 2.2 into 2.2a (flowsheet_match) + 2.2b
(fuzzy_resolved). The two mapping logics diverge enough (confidence
formula, Discogs ID handling, writer contract) that a second PR is
warranted to keep bisection clean.
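On the confidence-formula divergence: even if the staged `combined` value were persisted, it could not be fed into the plan's formula unchanged. The arithmetic below assumes `trgm_score` was meant to be a single similarity in [0, 1]. The function and constant names are hypothetical.

```typescript
// The plan's trigram fallback as quoted in the findings: 0.7 + 0.3 * trgm_score.
function planFallbackConfidence(trgmScore: number): number {
  return 0.7 + 0.3 * trgmScore;
}

// Bounds of `combined` on the un-persisted fuzzy_full staging table.
const COMBINED_MIN = 1.55;
const COMBINED_MAX = 2.0;

// Plugging `combined` straight into the formula overshoots 1.0 at both ends:
// roughly 1.165 at the lower bound and 1.3 at the upper bound.
const lowConfidence = planFallbackConfidence(COMBINED_MIN);
const highConfidence = planFallbackConfidence(COMBINED_MAX);

console.log(lowConfidence > 1, highConfidence > 1); // true true
```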
Force-pushed from 0840a35 to f7d4086.
This was referenced May 10, 2026
Summary
Spike memo at `plans/library-hook-canonicalization/audits/discogs-cache-match-score-shape.md` auditing the two questions §4 sub-PR 2.2 spelled out as preconditions for implementation.
The original second commit on this branch (a plan amendment splitting §4 sub-PR 2.2 into 2.2a + 2.2b) has been dropped because the architecture pivot in #800 supersedes it: Backend will no longer implement source-leg backfills directly. LML owns identity resolution end-to-end; Backend caches the verdict via a single bulk-resolve endpoint.
The spike findings stand on their own as reference for LML's own resolution logic, so this PR is narrowed to the memo only.
Spike findings (still valid as reference)
- Neither `flowsheet_match` nor `fuzzy_resolved` has a `trgm_score` column. The original plan's `0.7 + 0.3 * trgm_score` trigram fallback was dead code. `flowsheet_match` is an exact equi-join (no fuzzy score by construction); `fuzzy_resolved` discards the score during its resolve step (only `array_agg(library_id ORDER BY combined DESC)` survives).
- `fuzzy_resolved` carries no Discogs ID, only `resolved_library_id`. The master-vs-release decision applies only to `flowsheet_match`, where the upstream pins one or the other.
These observations belong to LML's resolution domain post-pivot but are documented here so the reasoning isn't lost.
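To show why the score is unrecoverable downstream, here is a minimal in-memory analogue of `array_agg(library_id ORDER BY combined DESC)`. It is a sketch, not the SQL in the scripts: the row shape, the sample values, and the name `resolveLibraryIds` are hypothetical, and the per-group aggregation the real query presumably performs is omitted.

```typescript
// Hypothetical shape of a staged fuzzy_full row.
interface FuzzyFullRow {
  libraryId: number;
  combined: number; // similarity(artist) + similarity(album)
}

// Order candidates by combined score, best first, then keep only the IDs.
// The score decides the ordering but is not part of the result.
function resolveLibraryIds(rows: FuzzyFullRow[]): number[] {
  return [...rows]
    .sort((a, b) => b.combined - a.combined)
    .map((row) => row.libraryId);
}

const staged: FuzzyFullRow[] = [
  { libraryId: 7, combined: 1.62 },
  { libraryId: 3, combined: 1.91 },
  { libraryId: 9, combined: 1.55 },
];

const ranked = resolveLibraryIds(staged);
console.log(ranked); // [ 3, 7, 9 ]
```

The output carries the ranking but no numeric score, so nothing downstream of the resolve step has a value to plug in as `trgm_score`.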
Test plan
- `npm run format:check`: clean.
- `Backend-Service/scripts/discogs-bridge-flowsheet.sql` and `fuzzy-trigram-flowsheet.sql`.
Refs #663, #800.