fix(schema): raise a clear error for duplicate relationship types #535

Open
NathalieCharbel wants to merge 2 commits into neo4j:main from NathalieCharbel:fix/improve-schemavalidationerror

@NathalieCharbel

Contributor

Description

Improves the SchemaValidationError raised for graph schemas that define the same relationship-type label more than once in relationship_types, with different properties and constraints. For example:

"patterns": [
  {"source": "Patient",   "relationship": "PARTICIPATED_IN", "target": "Encounter"},
  {"source": "Encounter", "relationship": "PARTICIPATED_IN", "target": "Physician"}
],
"relationship_types": [
  {"label": "PARTICIPATED_IN", "properties": [
    {"name": "date"},
    {"name": "patient_name"},
    {"name": "encounter_name"}
  ]},
  {"label": "PARTICIPATED_IN", "properties": [
    {"name": "encounter_name"},
    {"name": "physician_name"}
  ]}
],
"constraints": [
  {"relationship_type": "PARTICIPATED_IN", "type": "KEY",
   "property_names": ["date", "patient_name", "encounter_name"]}
]

Currently, during schema validation, duplicate relationship-type labels are silently collapsed by a last-write-wins lookup index. When a constraint then references a property that lives on the shadowed (overwritten) duplicate, validation fails with a misleading error that lists the surviving entry's properties as the only "valid" ones:

  File "/workspace/.venv/lib/python3.12/site-packages/neo4j_graphrag/experimental/components/schema.py", line 708, in validate_constraints_against_node_types
    _validate_constraint_property_defined(
  File "/workspace/.venv/lib/python3.12/site-packages/neo4j_graphrag/experimental/components/schema.py", line 376, in _validate_constraint_property_defined
    raise SchemaValidationError(
neo4j_graphrag.exceptions.SchemaValidationError: KEY constraint references undefined property 'date' on relationship type 'PARTICIPATED_IN'. Valid properties: {'encounter_name', 'physician_name'}

This is incorrect on two counts: the property does exist (on the other duplicate), and, per Neo4j semantics, a relationship type is global per name, so two entries with the same label and different definitions are an invalid model in the first place.
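The shadowing described above can be reproduced in a few lines. This is only an illustrative sketch: it assumes the lookup index behaves like a plain dict keyed by label, which is a guess at the shape of the code in schema.py rather than a copy of it. The `relationship_types` data is taken from the example schema in this description.

```python
# Assumption: the index is a plain dict keyed by label, so a later entry
# with the same label silently overwrites the earlier one.
relationship_types = [
    {"label": "PARTICIPATED_IN", "properties": [
        {"name": "date"}, {"name": "patient_name"}, {"name": "encounter_name"}]},
    {"label": "PARTICIPATED_IN", "properties": [
        {"name": "encounter_name"}, {"name": "physician_name"}]},
]

# Last-write-wins: only the second PARTICIPATED_IN entry survives.
index = {rt["label"]: rt for rt in relationship_types}

# Properties the validator would consider "valid" for the label.
valid_properties = {p["name"] for p in index["PARTICIPATED_IN"]["properties"]}

# The KEY constraint references 'date', which lived on the shadowed entry.
missing = [name for name in ["date", "patient_name", "encounter_name"]
           if name not in valid_properties]
```

With this index, `date` and `patient_name` are reported as undefined even though the first entry declares them, which matches the misleading message in the traceback.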

GraphSchema now detects duplicate relationship_types labels up front and raises a clear, actionable SchemaValidationError that names the duplicated label, its property sets, the underlying Neo4j rule, and a suggestion to fix it. The check runs before constraint validation, so the clear duplicate-label error replaces the old misleading one.
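A minimal sketch of such an up-front check follows. It is not the code in this PR: the function name `check_duplicate_relationship_labels`, the local `SchemaValidationError` stand-in class, and the wording of the message are all hypothetical, and the input is assumed to be a list of plain dicts shaped like the example schema.

```python
from collections import defaultdict


class SchemaValidationError(ValueError):
    """Local stand-in for neo4j_graphrag.exceptions.SchemaValidationError."""


def check_duplicate_relationship_labels(relationship_types):
    """Raise if any relationship-type label is defined more than once.

    Hypothetical helper; intended to run before constraint validation.
    """
    # Collect the property-name set of every definition, grouped by label.
    props_by_label = defaultdict(list)
    for rt in relationship_types:
        names = sorted(p["name"] for p in rt.get("properties", []))
        props_by_label[rt["label"]].append(names)

    duplicates = {label: sets for label, sets in props_by_label.items()
                  if len(sets) > 1}
    if not duplicates:
        return

    details = "; ".join(
        f"'{label}' is defined {len(sets)} times with property sets {sets}"
        for label, sets in sorted(duplicates.items())
    )
    raise SchemaValidationError(
        f"Duplicate relationship type labels: {details}. "
        "In Neo4j a relationship type is global per name, so merge the "
        "definitions into a single entry or rename one of them."
    )
```

Running the check before the label index is built means the duplicate is reported directly, instead of surfacing later as an "undefined property" error on whichever entry happened to survive.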

This path is intended for user-provided schemas, where users should be told what is wrong with the model they supplied. Follow-up PRs will handle the LLM schema-generation path: the prompt will be tuned slightly to avoid producing duplicates, and a post-extraction validation step will automatically reconcile duplicate relationship types.

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Documentation update
  • Project configuration change

Complexity

Note

Please provide an estimated complexity of this PR of either Low, Medium or High

Complexity: Low

How Has This Been Tested?

  • Unit tests
  • E2E tests
  • Manual tests

Checklist

The following requirements should have been met (depending on the changes in the branch):

  • Documentation has been updated
  • Unit tests have been updated
  • E2E tests have been updated
  • Examples have been updated
  • New files have copyright header
  • CLA (https://neo4j.com/developer/cla/) has been signed
  • CHANGELOG.md updated if appropriate

@NathalieCharbel force-pushed the fix/improve-schemavalidationerror branch from fb71886 to 8fc47af on June 1, 2026 13:07