Auto-reconcile duplicate relationship types in LLM-generated schemas#536

Merged
NathalieCharbel merged 4 commits into
neo4j:mainfrom
NathalieCharbel:fix/auto-reconcile-duplicate-rel-types
Jun 1, 2026

Conversation

@NathalieCharbel NathalieCharbel commented May 29, 2026

Description

This PR guarantees that a GraphSchema auto-generated from text by an LLM (SchemaFromTextExtractor) never ends up with two or more relationship_types entries that share the same label.

Rather than hard failing an auto-generated schema, this PR reconciles duplicate relationship types:

  • validate_extraction_dict_to_graph_schema now merges same-label relationship_types into a single entry whose properties are the union of the duplicates' properties (de-duplicated by name, first occurrence wins on conflicts), keeps the first entry's non-property fields, and logs a warning for each merged label. The merge runs on the raw extraction dict before the cross-reference filters and model_validate, so both see the unioned property set: a constraint that was previously dropped or misreported now validates correctly.
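
The merge rule described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper name `merge_duplicate_relationship_types` is hypothetical, and it assumes each `relationship_types` entry is a dict with a `label` key and an optional `properties` list of dicts keyed by `name`.

```python
import logging
from typing import Any

logger = logging.getLogger(__name__)


def merge_duplicate_relationship_types(extraction: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: merge same-label relationship_types entries.

    Assumes entries look like {"label": str, "properties": [{"name": str, ...}]}.
    Returns a new dict; the input is not modified.
    """
    merged: dict[str, dict[str, Any]] = {}  # label -> merged entry, in first-seen order
    duplicated: list[str] = []

    for rel in extraction.get("relationship_types", []):
        label = rel["label"]
        if label not in merged:
            # First occurrence: keep its non-property fields, copy the property list.
            merged[label] = {**rel, "properties": list(rel.get("properties", []))}
            continue

        if label not in duplicated:
            duplicated.append(label)

        # Union properties by name; the first occurrence wins on conflicts.
        seen = {prop["name"] for prop in merged[label]["properties"]}
        for prop in rel.get("properties", []):
            if prop["name"] not in seen:
                merged[label]["properties"].append(prop)
                seen.add(prop["name"])

    for label in duplicated:
        logger.warning("Merged duplicate relationship type %r", label)

    return {**extraction, "relationship_types": list(merged.values())}


raw = {
    "relationship_types": [
        {"label": "WORKS_AT", "description": "first", "properties": [{"name": "since", "type": "DATE"}]},
        {"label": "WORKS_AT", "description": "second", "properties": [{"name": "since", "type": "STRING"}, {"name": "role", "type": "STRING"}]},
    ]
}
reconciled = merge_duplicate_relationship_types(raw)
```

In this sketch, `reconciled` holds a single `WORKS_AT` entry that keeps the first description and the first `since` definition, and gains `role` from the duplicate. Running such a step before any cross-reference filtering is what lets later checks see the full property set.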

PS:

  • Why merging instead of splitting: splitting one label into distinct per-pattern types can't be done reliably. Nothing binds a relationship-type definition to a specific pattern, so it would require fragile heuristics. Merging is deterministic and keeps the type global per name, as Neo4j requires.
  • Relation to PR#535: that PR makes user-provided schemas hard-fail on duplicate relationship types; this PR handles the LLM-generated path by auto-reconciling instead of erroring. The two are independent and compose cleanly.

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Documentation update
  • Project configuration change

Complexity

Complexity: low

How Has This Been Tested?

  • Unit tests
  • E2E tests
  • Manual tests

Checklist

The following requirements should have been met (depending on the changes in the branch):

  • Documentation has been updated
  • Unit tests have been updated
  • E2E tests have been updated
  • Examples have been updated
  • New files have copyright header
  • CLA (https://neo4j.com/developer/cla/) has been signed
  • CHANGELOG.md updated if appropriate

@NathalieCharbel NathalieCharbel marked this pull request as ready for review June 1, 2026 08:28
@NathalieCharbel NathalieCharbel requested a review from a team as a code owner June 1, 2026 08:28
@NathalieCharbel NathalieCharbel force-pushed the fix/auto-reconcile-duplicate-rel-types branch from bc005b6 to 25b011f Compare June 1, 2026 12:39
@NathalieCharbel NathalieCharbel merged commit 35ff071 into neo4j:main Jun 1, 2026
11 checks passed