Version: 1.0 | Author: Matthias Buchhorn-Roth | Date: March 2026 | License: MIT
This document defines a production-ready, layered Neo4j graph schema for health dataspaces that integrates:
- HL7 FHIR R4 (clinical data exchange and primary use)
- OMOP CDM (research analytics and secondary use)
- SNOMED CT / LOINC / ICD-10 (clinical terminology backbone)
- HealthDCAT-AP (metadata discovery and catalog layer)
- Dataspace Protocol (DSP) metadata for marketplace operations
The schema is designed for EHDS-compliant Health Data Access Bodies (HDABs), Contract Research Organizations (CROs), data preparation agencies, and clinical data marketplaces where data providers, consumers, and intermediaries exchange clinical and research data under sovereignty contracts.
- Architecture Overview
- Layer 1: Dataspace Marketplace Metadata
- Layer 2: HealthDCAT-AP Metadata
- Layer 3: FHIR Clinical Knowledge Graph
- Layer 4: OMOP Research Analytics
- Layer 5: Clinical Ontology Backbone
- Cross-Layer Integration Patterns
- Implementation Guide
- Query Patterns
- Validation Rules
- Migration from Existing Systems
The schema follows these Neo4j best practices:
- Node Labels as Entity Types: singular nouns in PascalCase (Patient, Condition, Dataset)
- Relationships as Verbs: action-oriented names in UPPER_SNAKE_CASE (HAS_CONDITION, PART_OF_COHORT)
- Properties as Attributes: camelCase property names (birthDate, resourceType)
- Layered Model: semantic separation between dataspace, metadata, clinical, research, and ontology layers
- Bidirectional Traceability: every clinical node traces back to dataset and catalog metadata
- FHIR ↔ OMOP Mappings: explicit transformation relationships preserve provenance
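
The naming conventions above can be checked mechanically before a schema change is merged. The following is a minimal sketch, not part of the schema itself: the regular expressions are one possible reading of "PascalCase", "UPPER_SNAKE_CASE" and "camelCase" (acronyms such as ICD10Code are allowed), and the helper names `check_name` and `violations` are hypothetical.

```python
import re

# Illustrative patterns; they are an assumption about how strictly to read the conventions.
LABEL_RE = re.compile(r"^[A-Z][A-Za-z0-9]*$")                   # PascalCase, acronyms allowed
RELATIONSHIP_RE = re.compile(r"^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$")  # UPPER_SNAKE_CASE
PROPERTY_RE = re.compile(r"^[a-z][A-Za-z0-9]*$")                # camelCase

PATTERNS = {"label": LABEL_RE, "relationship": RELATIONSHIP_RE, "property": PROPERTY_RE}


def check_name(kind: str, name: str) -> bool:
    """Return True if `name` follows the convention for `kind`."""
    return bool(PATTERNS[kind].match(name))


def violations(names: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Collect (kind, name) pairs that break their convention."""
    return [
        (kind, name)
        for kind, group in names.items()
        for name in group
        if not check_name(kind, name)
    ]
```

Under these patterns a property written with an underscore, such as period_start, would be reported as a violation, so a check like this can also surface places where a schema deliberately mirrors FHIR element paths instead of strict camelCase.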
┌────────────────────────────────────────────────────┐
│ Layer 1: Dataspace Marketplace Metadata │
│ (DSP Catalog, DataProduct, Contract, Participant) │
└────────────────────┬───────────────────────────────┘
│ DESCRIBED_BY
┌────────────────────┴───────────────────────────────┐
│ Layer 2: HealthDCAT-AP Metadata │
│ (Dataset, Distribution, HealthDataset) │
└────────────────────┬───────────────────────────────┘
│ CONTAINS_RESOURCE
┌────────────────────┴───────────────────────────────┐
│ Layer 3: FHIR Clinical Knowledge Graph │
│ (Patient, Condition, Observation, Medication...) │
└────────────────────┬───────────────────────────────┘
│ MAPPED_TO
┌────────────────────┴───────────────────────────────┐
│ Layer 4: OMOP Research Analytics │
│ (Person, ConditionOccurrence, Measurement...) │
└────────────────────┬───────────────────────────────┘
│ CODED_BY
┌────────────────────┴───────────────────────────────┐
│ Layer 5: Clinical Ontology Backbone │
│ (SnomedConcept, LoincCode, ICD10Code) │
└────────────────────────────────────────────────────┘
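
As a reading aid for the diagram, the layer stack can be written down as data. This is only a sketch: the `LAYERS` table copies the layer names and edge labels from the diagram, and `relationship_path` is a hypothetical helper that lists the relationship types crossed when descending from one layer to a lower one.

```python
# (layer name, relationship linking it to the next layer), in diagram order.
LAYERS = [
    ("Dataspace Marketplace Metadata", "DESCRIBED_BY"),
    ("HealthDCAT-AP Metadata", "CONTAINS_RESOURCE"),
    ("FHIR Clinical Knowledge Graph", "MAPPED_TO"),
    ("OMOP Research Analytics", "CODED_BY"),
    ("Clinical Ontology Backbone", None),  # bottom layer, nothing below it
]


def relationship_path(from_layer: int, to_layer: int) -> list[str]:
    """Relationship types crossed when going from layer `from_layer` down to `to_layer` (1-5)."""
    if not (1 <= from_layer <= to_layer <= len(LAYERS)):
        raise ValueError("expected 1 <= from_layer <= to_layer <= 5")
    return [rel for _, rel in LAYERS[from_layer - 1 : to_layer - 1]]
```

For example, `relationship_path(1, 5)` yields the four edge labels shown between the boxes. The labels are taken from the diagram only; the per-layer relationship listings remain the authoritative source for direction and naming.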
Represents a dataspace participant (clinic, CRO, HDAB, data preparation agency).
Properties:
- participantId: String! - Unique identifier (DID or X.509 DN)
- legalName: String! - Official organization name
- participantType: String! - One of: CLINIC, CRO, HDAB, DATA_AGENCY, RESEARCHER
- jurisdiction: String - ISO 3166-1 alpha-2 country code (e.g., DE, FR)
- vcUrl: String - Verifiable Credential endpoint for DCP attestation
- catalogUrl: String - DSP catalog endpoint

Indexes:
CREATE CONSTRAINT participant_id IF NOT EXISTS FOR (p:Participant) REQUIRE p.participantId IS UNIQUE;
CREATE INDEX participant_type IF NOT EXISTS FOR (p:Participant) ON (p.participantType);

A data product offered in the dataspace marketplace.
Properties:
- productId: String! - Unique product identifier
- title: String! - Human-readable product name
- description: String - Product description
- version: String - Semantic version (e.g., 1.2.0)
- productType: String! - One of: COHORT, REGISTRY, SYNTHETIC, REALWORLD
- sensitivity: String! - One of: ANONYMOUS, PSEUDONYMIZED, IDENTIFIED
- createdAt: DateTime!
- updatedAt: DateTime

Indexes:
CREATE CONSTRAINT product_id IF NOT EXISTS FOR (dp:DataProduct) REQUIRE dp.productId IS UNIQUE;
CREATE INDEX product_type IF NOT EXISTS FOR (dp:DataProduct) ON (dp.productType);

Represents a dataspace usage contract negotiated via DSP.
Properties:
- contractId: String! - DSP contract agreement ID
- providerId: String! - Participant ID of data provider
- consumerId: String! - Participant ID of data consumer
- agreementDate: DateTime!
- validUntil: DateTime
- usagePurpose: String! - EHDS Article 53 permitted purpose
- accessType: String! - One of: QUERY, EXTRACT, FEDERATED

Indexes:
CREATE CONSTRAINT contract_id IF NOT EXISTS FOR (c:Contract) REQUIRE c.contractId IS UNIQUE;

Represents a formal data access request submitted to an HDAB by a data consumer, as required by EHDS Articles 45–52. An AccessApplication is a prerequisite for contract negotiation: the HDAB must approve the stated purpose before a Contract can be formed.
Properties:
- applicationId: String! - Unique application reference number
- applicantId: String! - Participant ID of the data consumer
- datasetId: String! - URI of the requested HealthDataset
- requestedPurpose: String! - EHDS Article 53 permitted purpose (e.g., SCIENTIFIC_RESEARCH)
- submittedAt: DateTime! - Submission timestamp
- status: String! - One of: SUBMITTED, UNDER_REVIEW, APPROVED, REJECTED, REVOKED
- justification: String - Scientific or public-interest justification text
- ethicsCommitteeRef: String - Reference to ethics committee approval
- dataMinimisationStatement: String - GDPR Art. 5(1)(c) justification

Indexes:
CREATE CONSTRAINT access_application_id IF NOT EXISTS FOR (aa:AccessApplication) REQUIRE aa.applicationId IS UNIQUE;
CREATE INDEX access_application_status IF NOT EXISTS FOR (aa:AccessApplication) ON (aa.status);

An approval decision issued by a Health Data Access Body authorising a specific data consumer to access a dataset for a stated purpose. Bridges the AccessApplication to a Contract and provides the formal legal basis for data exchange under EHDS.
Properties:
- approvalId: String! - Unique approval decision identifier
- applicationId: String! - Reference back to the AccessApplication
- approvedAt: DateTime! - Date of formal HDAB decision
- validUntil: DateTime! - Approval expiry date
- permittedPurpose: String! - Granted EHDS Article 53 purpose (must match AccessApplication.requestedPurpose)
- conditions: String[] - Any conditions or restrictions attached to the approval
- hdabOfficer: String - Name or ID of the issuing HDAB officer
- legalBasisArticle: String - EHDS legal basis (e.g., EHDS_Art_46, GDPR_Art_9_2_j)

Indexes:
CREATE CONSTRAINT hdab_approval_id IF NOT EXISTS FOR (ha:HDABApproval) REQUIRE ha.approvalId IS UNIQUE;

(:Participant)-[:OFFERS]->(:DataProduct)
(:Participant)-[:CONSUMES]->(:DataProduct)
(:Contract)-[:GOVERNS]->(:DataProduct)
(:Contract)-[:PROVIDER {participantId}]->(:Participant)
(:Contract)-[:CONSUMER {participantId}]->(:Participant)
// EHDS HDAB approval chain (Articles 45–52)
(:Participant)-[:SUBMITTED]->(:AccessApplication)
(:AccessApplication)-[:REQUESTS_ACCESS_TO]->(:DataProduct)
(:Participant {participantType: 'HDAB'})-[:REVIEWED]->(:AccessApplication)
(:HDABApproval)-[:APPROVES]->(:AccessApplication)
(:HDABApproval)-[:APPROVED {approvalId, permittedPurpose}]->(:Contract)

Layer 2 implements the HealthDCAT-AP vocabulary, a health-domain application profile that extends DCAT-AP 3.0 (itself a profile of W3C DCAT 3) with the extensions required by the EHDS Regulation.
A catalog of datasets, following dcat:Catalog.
Properties:
- catalogId: String! - URI identifier (dct:identifier)
- title: String! - Catalog name (dct:title)
- description: String - Catalog description (dct:description)
- license: String - License URI (dct:license)
- homepage: String - Homepage URL (foaf:homepage)
- createdAt: DateTime
- modifiedAt: DateTime

Indexes:
CREATE CONSTRAINT catalog_id IF NOT EXISTS FOR (cat:Catalog) REQUIRE cat.catalogId IS UNIQUE;

A dataset described using HealthDCAT-AP (healthdcatap:Dataset extending dcat:Dataset).
Properties (DCAT-AP mandatory):
- datasetId: String! - URI identifier (dct:identifier)
- title: String! - Dataset name (dct:title)
- description: String - Dataset description (dct:description)
- issued: Date - Publication date (dct:issued)
- modified: Date - Last modification (dct:modified)
- language: String[] - ISO 639-1 language codes (dct:language)
- themes: String[] - EuroVoc theme URIs (dcat:theme)

Properties (DCAT-AP recommended):
- dctSpatial: String - Geographic coverage ISO 3166 (dct:spatial)
- dctTemporalStart: Date - Temporal coverage start (dcat:startDate within dct:temporal)
- dctTemporalEnd: Date - Temporal coverage end (dcat:endDate within dct:temporal)
- conformsTo: String - Standard URI (dct:conformsTo, e.g., http://hl7.org/fhir/R4)
- landingPage: String - Landing page URL (dcat:landingPage)

Properties (HealthDCAT-AP health extensions):
- hdcatapDatasetType: String - Dataset type (healthdcatap:datasetType, e.g., SyntheticData, ClinicalTrial, Registry)
- hdcatapPersonalData: Boolean - Contains personal data (healthdcatap:personalData)
- hdcatapSensitiveData: Boolean - Contains sensitive data (healthdcatap:sensitiveData)
- hdcatapLegalBasisForAccess: String - EHDS article reference (healthdcatap:legalBasisForAccess)
- hdcatapPurpose: String - Permitted purpose description (healthdcatap:purpose)
- hdcatapPopulationCoverage: String - Population description (healthdcatap:populationCoverage)
- hdcatapNumberOfRecords: Long - Total record count (healthdcatap:numberOfRecords)
- hdcatapNumberOfUniqueIndividuals: Long - Unique individual count (healthdcatap:numberOfUniqueIndividuals)
- hdcatapHealthCategory: String[] - EEHRxF priority categories (healthdcatap:healthCategory)
- hdcatapHealthTheme: String[] - MeSH / ICD-10 / SNOMED URIs (healthdcatap:healthTheme)
- hdcatapMinTypicalAge: Integer - Minimum typical age (healthdcatap:minTypicalAge)
- hdcatapMaxTypicalAge: Integer - Maximum typical age (healthdcatap:maxTypicalAge)
- hdcatapPublisherType: String - Publisher role (healthdcatap:publisherType, e.g., DataHolder, HDAB, Researcher)

Properties (provenance):
- source: String - Source URL
- generator: String - Data generator tool
- fhirVersion: String - FHIR version (e.g., R4)
- omopCdmVersion: String - OMOP CDM version (e.g., 5.4)
- createdAt: DateTime
- modifiedAt: DateTime

Indexes:
CREATE CONSTRAINT dataset_id IF NOT EXISTS FOR (hd:HealthDataset) REQUIRE hd.datasetId IS UNIQUE;

A specific representation/format of a dataset (dcat:Distribution).
Properties:
- distributionId: String! - URI identifier (dct:identifier)
- title: String - Distribution name (dct:title)
- format: String! - MIME type (dcat:mediaType, e.g., application/fhir+json, text/csv)
- accessUrl: String - DSP access endpoint (dcat:accessURL)
- accessService: String - Service description (dcat:accessService)
- byteSize: Long - Size in bytes (dcat:byteSize)
- checksum: String - SHA-256 hash (spdx:checksum)
- conformsTo: String - Standard URI (dct:conformsTo, e.g., http://hl7.org/fhir/R4)
- description: String - Distribution description (dct:description)
- createdAt: DateTime

Indexes:
CREATE CONSTRAINT distribution_id IF NOT EXISTS FOR (d:Distribution) REQUIRE d.distributionId IS UNIQUE;

Contact information for a dataset (vcard:ContactPoint).
Properties:
- contactId: String! - URI identifier
- name: String - Contact name (vcard:fn)
- email: String - Contact email (vcard:hasEmail)
- url: String - Contact URL (vcard:hasURL)
- role: String - Contact role (e.g., DataSteward, DPO)

Indexes:
CREATE CONSTRAINT contact_point_id IF NOT EXISTS FOR (cp:ContactPoint) REQUIRE cp.contactId IS UNIQUE;

An organization that publishes or owns datasets (foaf:Organization).
Properties:
- organizationId: String! - URI identifier
- name: String! - Organization name (foaf:name)
- jurisdiction: String - ISO 3166 country code
- ehdsRole: String - EHDS role (DataHolder, HDAB, Researcher)
- createdAt: DateTime

Indexes:
CREATE CONSTRAINT organization_id IF NOT EXISTS FOR (org:Organization) REQUIRE org.organizationId IS UNIQUE;

// Catalog structure
(:Organization)-[:PUBLISHES]->(:Catalog)
(:Catalog)-[:HAS_DATASET]->(:HealthDataset)
(:Organization)-[:OWNS_DATASET]->(:HealthDataset)
// Dataset metadata
(:DataProduct)-[:DESCRIBED_BY]->(:HealthDataset)
(:HealthDataset)-[:HAS_DISTRIBUTION]->(:Distribution)
(:HealthDataset)-[:PUBLISHED_BY]->(:Participant)
(:HealthDataset)-[:HAS_CONTACT_POINT]->(:ContactPoint)
(:HealthDataset)-[:HAS_THEME]->(:EEHRxFCategory)
(:HealthDataset)-[:SUBJECT_TO_PURPOSE]->(:EhdsPurpose)

The European Electronic Health Record Exchange Format (EEHRxF) defines 6 priority categories for cross-border health data exchange under the EHDS Regulation. HL7 Europe publishes FHIR R4 Implementation Guides implementing these categories.
An EHDS priority category for electronic health data exchange.
Properties:
- categoryId: String! - Kebab-case identifier (e.g., patient-summary)
- name: String! - Human-readable name
- description: String - Category scope
- ehdsDeadline: String - EHDS implementation deadline (e.g., 2029-03)
- ehdsGroup: Integer - Rollout group (1 = 2029, 2 = 2031, 3 = TBD)
- status: String - Current status (available, partial, gap)

Indexes:
CREATE CONSTRAINT eehrxf_category_id IF NOT EXISTS FOR (c:EEHRxFCategory) REQUIRE c.categoryId IS UNIQUE;

An HL7 Europe FHIR profile implementing part of the EEHRxF specification.
Properties:
- profileId: String! - Kebab-case identifier (e.g., patient-eu-core)
- name: String! - Profile display name (e.g., Patient (EU core))
- igName: String! - Parent IG name (e.g., HL7 Europe Base and Core)
- igPackage: String - FHIR package ID (e.g., hl7.fhir.eu.base#0.1.0)
- fhirVersion: String - Target FHIR version (R4 or R5)
- status: String - Maturity status (STU, Ballot, Draft)
- url: String - Canonical URL of the StructureDefinition
- baseResource: String - FHIR resource type this profile constrains (e.g., Patient)
- description: String - Profile purpose
- coverage: String - Current data coverage status (full, partial, none)

Indexes:
CREATE CONSTRAINT eehrxf_profile_id IF NOT EXISTS FOR (p:EEHRxFProfile) REQUIRE p.profileId IS UNIQUE;
CREATE INDEX eehrxf_profile_base IF NOT EXISTS FOR (p:EEHRxFProfile) ON (p.baseResource);

(:EEHRxFProfile)-[:PART_OF_CATEGORY]->(:EEHRxFCategory)
(:EEHRxFProfile)-[:PROFILES_RESOURCE {count: Integer, coverage: String}]->(fhirNode)
(:EEHRxFProfile)-[:DEPENDS_ON]->(:EEHRxFProfile)

- PART_OF_CATEGORY - Links a profile to its EHDS priority category
- PROFILES_RESOURCE - Links a profile to a representative FHIR node of the matching resource type; count holds the number of matching resources, coverage indicates full / partial / none
- DEPENDS_ON - Inter-profile dependency (e.g., Lab Report depends on Base Patient)

Based on HL7 FHIR R4 resource types. Node labels match FHIR resource names.
Properties (Patient):
- resourceId: String! - FHIR id
- identifier: String[] - FHIR identifier array (e.g., MRN, national ID)
- birthDate: Date
- gender: String - FHIR gender (male, female, other, unknown)
- deceasedBoolean: Boolean
- deceasedDateTime: DateTime
- active: Boolean

Indexes:
CREATE CONSTRAINT patient_id IF NOT EXISTS FOR (p:Patient) REQUIRE p.resourceId IS UNIQUE;
CREATE INDEX patient_identifier IF NOT EXISTS FOR (p:Patient) ON (p.identifier);

Properties (Condition):
- resourceId: String!
- clinicalStatus: String - FHIR ValueSet (active, recurrence, relapse, inactive, remission, resolved)
- verificationStatus: String - (unconfirmed, provisional, differential, confirmed, refuted, entered-in-error)
- category: String[] - FHIR condition-category (problem-list-item, encounter-diagnosis)
- code: String! - SNOMED CT / ICD-10 code
- codeSystem: String! - e.g., http://snomed.info/sct
- codeDisplay: String - Human-readable term
- onsetDateTime: DateTime
- abatementDateTime: DateTime
- recordedDate: DateTime

Indexes:
CREATE CONSTRAINT condition_id IF NOT EXISTS FOR (c:Condition) REQUIRE c.resourceId IS UNIQUE;
CREATE INDEX condition_code IF NOT EXISTS FOR (c:Condition) ON (c.code);

Properties (Observation):
- resourceId: String!
- status: String! - (registered, preliminary, final, amended, corrected, cancelled, entered-in-error, unknown)
- category: String[] - (vital-signs, laboratory, imaging, survey, social-history)
- code: String! - LOINC / SNOMED CT code
- codeSystem: String!
- codeDisplay: String
- valueQuantity: Float
- valueUnit: String
- valueCodeableConcept: String - Coded result
- effectiveDateTime: DateTime
- issued: DateTime
- interpretation: String[] - (normal, abnormal, critical, high, low)

Indexes:
CREATE CONSTRAINT observation_id IF NOT EXISTS FOR (o:Observation) REQUIRE o.resourceId IS UNIQUE;
CREATE INDEX observation_code IF NOT EXISTS FOR (o:Observation) ON (o.code);
CREATE INDEX observation_category IF NOT EXISTS FOR (o:Observation) ON (o.category);

Properties (MedicationRequest):
- resourceId: String!
- status: String! - (active, on-hold, cancelled, completed, entered-in-error, stopped, draft, unknown)
- intent: String! - (proposal, plan, order, original-order, reflex-order, filler-order, instance-order, option)
- medicationCode: String! - RxNorm / ATC code
- medicationCodeSystem: String!
- medicationDisplay: String
- authoredOn: DateTime
- dosageText: String
- dosageQuantity: Float
- dosageUnit: String

Indexes:
CREATE CONSTRAINT medication_request_id IF NOT EXISTS FOR (mr:MedicationRequest) REQUIRE mr.resourceId IS UNIQUE;
CREATE INDEX medication_code IF NOT EXISTS FOR (mr:MedicationRequest) ON (mr.medicationCode);

Properties (Encounter):
- resourceId: String!
- status: String! - (planned, arrived, triaged, in-progress, onleave, finished, cancelled, entered-in-error, unknown)
- class: String! - (ambulatory, emergency, field, home health, inpatient, observation, virtual)
- type: String[] - Encounter type codes
- period_start: DateTime
- period_end: DateTime
- serviceProvider: String - Organization reference

Indexes:
CREATE CONSTRAINT encounter_id IF NOT EXISTS FOR (e:Encounter) REQUIRE e.resourceId IS UNIQUE;

Properties (Procedure):
- resourceId: String!
- status: String!
- code: String! - CPT / SNOMED CT / ICD-10-PCS
- codeSystem: String!
- codeDisplay: String
- performedDateTime: DateTime
- performedPeriod_start: DateTime
- performedPeriod_end: DateTime

(:Patient)-[:HAS_CONDITION]->(:Condition)
(:Patient)-[:HAS_OBSERVATION]->(:Observation)
(:Patient)-[:HAS_MEDICATION_REQUEST]->(:MedicationRequest)
(:Patient)-[:HAS_ENCOUNTER]->(:Encounter)
(:Patient)-[:HAS_PROCEDURE]->(:Procedure)
(:Condition)-[:RECORDED_DURING]->(:Encounter)
(:Observation)-[:PART_OF]->(:Encounter)
(:Procedure)-[:PERFORMED_DURING]->(:Encounter)
(:Observation)-[:RELATES_TO]->(:Observation) // hasMember, derivedFrom
(:Condition)-[:CAUSED_BY]->(:Condition) // dueTo extension

(:Patient)-[:FROM_DATASET]->(:HealthDataset)
(:Condition)-[:FROM_DATASET]->(:HealthDataset)
(:Observation)-[:FROM_DATASET]->(:HealthDataset)

Based on OMOP CDM v5.4. Node labels are prefixed with OMOP to avoid collisions with FHIR node labels where names overlap.
Maps to OMOP person table.
Properties:
- personId: Long! - OMOP person_id
- genderConceptId: Long! - OMOP Concept ID for gender
- yearOfBirth: Int!
- monthOfBirth: Int
- dayOfBirth: Int
- birthDatetime: DateTime
- raceConceptId: Long
- ethnicityConceptId: Long
- locationId: Long
- providerIdPrimary: Long
- careSiteIdPrimary: Long

Indexes:
CREATE CONSTRAINT omop_person_id IF NOT EXISTS FOR (op:OMOPPerson) REQUIRE op.personId IS UNIQUE;

Maps to OMOP condition_occurrence table.
Properties:
- conditionOccurrenceId: Long!
- personId: Long!
- conditionConceptId: Long! - Standard SNOMED concept
- conditionStartDate: Date!
- conditionStartDatetime: DateTime
- conditionEndDate: Date
- conditionEndDatetime: DateTime
- conditionTypeConceptId: Long! - Provenance (EHR, claim, registry)
- conditionStatusConceptId: Long
- stopReason: String
- visitOccurrenceId: Long
- conditionSourceValue: String - Original code
- conditionSourceConceptId: Long - Source vocabulary concept

Indexes:
CREATE CONSTRAINT omop_condition_occurrence_id IF NOT EXISTS FOR (oco:OMOPConditionOccurrence) REQUIRE oco.conditionOccurrenceId IS UNIQUE;
CREATE INDEX omop_condition_concept IF NOT EXISTS FOR (oco:OMOPConditionOccurrence) ON (oco.conditionConceptId);

Maps to OMOP measurement table (lab results, vital signs).
Properties:
- measurementId: Long!
- personId: Long!
- measurementConceptId: Long! - Standard LOINC concept
- measurementDate: Date!
- measurementDatetime: DateTime
- measurementTime: String
- measurementTypeConceptId: Long!
- operatorConceptId: Long - =, >=, <=, <, >
- valueAsNumber: Float
- valueAsConceptId: Long
- unitConceptId: Long
- rangeHigh: Float
- rangeLow: Float
- visitOccurrenceId: Long
- measurementSourceValue: String
- measurementSourceConceptId: Long
- unitSourceValue: String
- unitSourceConceptId: Long
- valueSourceValue: String

Indexes:
CREATE CONSTRAINT omop_measurement_id IF NOT EXISTS FOR (om:OMOPMeasurement) REQUIRE om.measurementId IS UNIQUE;
CREATE INDEX omop_measurement_concept IF NOT EXISTS FOR (om:OMOPMeasurement) ON (om.measurementConceptId);

Maps to OMOP drug_exposure table.
Properties:
- drugExposureId: Long!
- personId: Long!
- drugConceptId: Long! - Standard RxNorm ingredient
- drugExposureStartDate: Date!
- drugExposureStartDatetime: DateTime
- drugExposureEndDate: Date
- drugExposureEndDatetime: DateTime
- verbatimEndDate: Date
- drugTypeConceptId: Long!
- stopReason: String
- refills: Int
- quantity: Float
- daysSupply: Int
- sig: String - Dosage instructions
- routeConceptId: Long
- lotNumber: String
- visitOccurrenceId: Long
- drugSourceValue: String
- drugSourceConceptId: Long
- routeSourceValue: String
- doseUnitSourceValue: String

Maps to OMOP procedure_occurrence table.
Properties:
- id: String!
- name: String - Display name of the procedure
- procedureDate: String - Date the procedure was performed (YYYY-MM-DD)
- procedureSourceValue: String - Source code (SNOMED CT or ADA CDT)
- procedureConceptId: Long - OMOP standard concept ID (0 = unmapped)
- personId: String - Reference to OMOPPerson

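
To illustrate the "0 = unmapped" convention, the sketch below projects the Layer 3 Procedure properties onto the properties listed above. It is not the project's transformation logic: the function name `to_omop_procedure`, the `concept_lookup` table, the reuse of the FHIR resourceId as id, and the assumption that dates arrive as ISO 8601 strings are all illustrative choices.

```python
from typing import Any, Mapping

UNMAPPED_CONCEPT_ID = 0  # the property list uses 0 to mark an unmapped concept


def to_omop_procedure(
    procedure: Mapping[str, Any],
    person_id: str,
    concept_lookup: Mapping[tuple[str, str], int],
) -> dict[str, Any]:
    """Project FHIR Procedure properties onto OMOPProcedureOccurrence properties."""
    # Prefer the point-in-time value, fall back to the start of the period.
    performed = procedure.get("performedDateTime") or procedure.get("performedPeriod_start")
    key = (procedure["codeSystem"], procedure["code"])
    return {
        "id": procedure["resourceId"],  # assumption: reuse the FHIR id
        "name": procedure.get("codeDisplay"),
        "procedureDate": performed[:10] if performed else None,  # keep YYYY-MM-DD only
        "procedureSourceValue": procedure["code"],
        "procedureConceptId": concept_lookup.get(key, UNMAPPED_CONCEPT_ID),
        "personId": person_id,
    }
```

Keeping the source code in procedureSourceValue even when no standard concept is found means unmapped rows can be re-mapped later without going back to the FHIR source.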
Maps to OMOP visit_occurrence table.
Properties:
- visitOccurrenceId: Long!
- personId: Long!
- visitConceptId: Long! - Inpatient, Outpatient, ER, etc.
- visitStartDate: Date!
- visitStartDatetime: DateTime
- visitEndDate: Date
- visitEndDatetime: DateTime
- visitTypeConceptId: Long!
- providerId: Long
- careSiteId: Long
- visitSourceValue: String
- visitSourceConceptId: Long
- admittingSourceConceptId: Long
- admittingSourceValue: String
- dischargeToConceptId: Long
- dischargeToSourceValue: String
- precedingVisitOccurrenceId: Long

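
The visit properties split each timestamp into a date and a datetime. A minimal sketch of deriving them from a Layer 3 Encounter follows; `to_omop_visit`, the `class_to_concept` mapping and the caller-supplied `visit_type_concept_id` are hypothetical, and the input timestamps are assumed to be ISO 8601 strings with an explicit offset such as +00:00.

```python
from datetime import datetime
from typing import Any, Mapping, Optional


def to_omop_visit(
    encounter: Mapping[str, Any],
    visit_occurrence_id: int,
    person_id: int,
    class_to_concept: Mapping[str, int],
    visit_type_concept_id: int,
) -> dict[str, Any]:
    """Derive OMOPVisitOccurrence properties from FHIR Encounter properties."""
    start = datetime.fromisoformat(encounter["period_start"])
    end_raw: Optional[str] = encounter.get("period_end")
    end = datetime.fromisoformat(end_raw) if end_raw else None
    return {
        "visitOccurrenceId": visit_occurrence_id,
        "personId": person_id,
        # 0 marks an encounter class without a mapped concept (assumption)
        "visitConceptId": class_to_concept.get(encounter["class"], 0),
        "visitStartDate": start.date(),
        "visitStartDatetime": start,
        "visitEndDate": end.date() if end else None,
        "visitEndDatetime": end,
        "visitTypeConceptId": visit_type_concept_id,
        "visitSourceValue": encounter["class"],
    }
```

A caller might pass a mapping such as `{"inpatient": 9201, "ambulatory": 9202, "emergency": 9203}`, which are the OMOP visit concepts usually used for inpatient, outpatient and emergency room visits; both this mapping and the visit type concept should be checked against the vocabulary release actually loaded.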
(:OMOPPerson)-[:HAS_CONDITION_OCCURRENCE]->(:OMOPConditionOccurrence)
(:OMOPPerson)-[:HAS_MEASUREMENT]->(:OMOPMeasurement)
(:OMOPPerson)-[:HAS_DRUG_EXPOSURE]->(:OMOPDrugExposure)
(:OMOPPerson)-[:HAS_PROCEDURE_OCCURRENCE]->(:OMOPProcedureOccurrence)
(:OMOPPerson)-[:HAS_VISIT_OCCURRENCE]->(:OMOPVisitOccurrence)
(:OMOPConditionOccurrence)-[:DURING_VISIT]->(:OMOPVisitOccurrence)
(:OMOPMeasurement)-[:DURING_VISIT]->(:OMOPVisitOccurrence)
(:OMOPDrugExposure)-[:DURING_VISIT]->(:OMOPVisitOccurrence)

These relationships preserve bidirectional traceability and transformation provenance.
(:Patient)-[:MAPPED_TO]->(:OMOPPerson)
(:Condition)-[:MAPPED_TO]->(:OMOPConditionOccurrence)
(:Observation)-[:MAPPED_TO {observationType}]->(:OMOPMeasurement)
(:MedicationRequest)-[:MAPPED_TO]->(:OMOPDrugExposure)
(:Procedure)-[:MAPPED_TO]->(:OMOPProcedureOccurrence)
(:Encounter)-[:MAPPED_TO]->(:OMOPVisitOccurrence)
// Properties on MAPPED_TO relationship:
// - transformationRule: String (FML rule ID)
// - transformedAt: DateTime
// - lossOfDetail: String[] (list of FHIR elements not mappable to OMOP)

Properties (SnomedConcept):
- conceptId: Long! - SNOMED CT concept ID
- fsn: String! - Fully Specified Name
- preferredTerm: String!
- active: Boolean!
- effectiveTime: Date
- moduleId: Long

Indexes:
CREATE CONSTRAINT snomed_concept_id IF NOT EXISTS FOR (sc:SnomedConcept) REQUIRE sc.conceptId IS UNIQUE;

Properties (LoincCode):
- loincNumber: String! - LOINC code (e.g., 85354-9)
- longCommonName: String!
- shortName: String
- component: String
- property: String
- timeAspect: String
- system: String
- scaleType: String
- methodType: String
- class: String
- versionLastChanged: String

Indexes:
CREATE CONSTRAINT loinc_code IF NOT EXISTS FOR (lc:LoincCode) REQUIRE lc.loincNumber IS UNIQUE;

Properties (ICD10Code):
- code: String! - ICD-10 code (e.g., E11.9)
- description: String!
- category: String
- subcategory: String
- version: String - ICD-10-CM, ICD-10-WHO

Indexes:
CREATE CONSTRAINT icd10_code IF NOT EXISTS FOR (icd:ICD10Code) REQUIRE icd.code IS UNIQUE;

Properties (RxNormConcept):
- rxcui: String! - RxNorm Concept Unique Identifier
- name: String!
- tty: String - Term Type (IN = ingredient, SCD = semantic clinical drug)
- active: Boolean

Indexes:
CREATE CONSTRAINT rxnorm_rxcui IF NOT EXISTS FOR (rx:RxNormConcept) REQUIRE rx.rxcui IS UNIQUE;

(:SnomedConcept)-[:IS_A]->(:SnomedConcept)
(:SnomedConcept)-[:FINDING_SITE]->(:SnomedConcept)
(:SnomedConcept)-[:CAUSATIVE_AGENT]->(:SnomedConcept)
(:SnomedConcept)-[:ASSOCIATED_MORPHOLOGY]->(:SnomedConcept)

(:LoincCode)-[:HAS_COMPONENT]->(:LoincCode)
(:LoincCode)-[:HAS_METHOD]->(:LoincCode)

(:RxNormConcept)-[:HAS_INGREDIENT]->(:RxNormConcept)
(:RxNormConcept)-[:HAS_DOSE_FORM]->(:RxNormConcept)

(:Condition)-[:CODED_BY]->(:SnomedConcept)
(:Condition)-[:CODED_BY]->(:ICD10Code)
(:Observation)-[:CODED_BY]->(:LoincCode)
(:Observation)-[:CODED_BY]->(:SnomedConcept)
(:MedicationRequest)-[:CODED_BY]->(:RxNormConcept)
(:OMOPConditionOccurrence)-[:STANDARD_CONCEPT]->(:SnomedConcept)
(:OMOPMeasurement)-[:STANDARD_CONCEPT]->(:LoincCode)
(:OMOPDrugExposure)-[:STANDARD_CONCEPT]->(:RxNormConcept)

Traverse from marketplace → metadata → clinical → research → ontology in a single query.

MATCH (dp:DataProduct {productId: 'cardio-cohort-2024'})-[:DESCRIBED_BY]->(hd:HealthDataset)
MATCH (p:Patient)-[:FROM_DATASET]->(hd)
MATCH (p)-[:HAS_CONDITION]->(c:Condition)-[:CODED_BY]->(sc:SnomedConcept)
WHERE sc.conceptId = 38341003 // Hypertensive disorder
MATCH (c)-[:MAPPED_TO]->(oco:OMOPConditionOccurrence)
RETURN p.resourceId, c.onsetDateTime, sc.preferredTerm, oco.conditionOccurrenceId

Verify that a data consumer has an active contract before returning data.

MATCH (consumer:Participant {participantId: $consumerId})
MATCH (provider:Participant)-[:OFFERS]->(dp:DataProduct {productId: $productId})
MATCH (contract:Contract)-[:GOVERNS]->(dp)
WHERE contract.consumerId = $consumerId
AND contract.validUntil > datetime()
AND contract.accessType IN ['QUERY', 'EXTRACT']
RETURN contract.contractId, contract.usagePurpose

Find patients across multiple datasets matching research criteria.

MATCH (hd:HealthDataset)
WHERE hd.hdcatapPurpose CONTAINS 'SCIENTIFIC_RESEARCH'
MATCH (p:Patient)-[:FROM_DATASET]->(hd)
MATCH (p)-[:HAS_CONDITION]->(c:Condition)-[:CODED_BY]->(sc:SnomedConcept)
WHERE sc.conceptId IN [73211009, 44054006] // Diabetes mellitus, Type 2 diabetes mellitus
MATCH (p)-[:HAS_OBSERVATION]->(o:Observation)-[:CODED_BY]->(lc:LoincCode)
WHERE lc.loincNumber = '4548-4' // HbA1c
AND o.valueQuantity >= 6.5
WITH hd.datasetId AS datasetId, count(DISTINCT p) AS patientCount
RETURN datasetId, patientCount

- Layer 5: Ontologies: SNOMED CT, LOINC, ICD-10, RxNorm (via neosemantics or batch CSV)
- Layer 1: Participants & Contracts: bootstrap the dataspace marketplace structure
- Layer 2: HealthDCAT-AP: register datasets and distributions
- Layer 3: FHIR Data: load via CyFHIR plugin or custom ETL
- Layer 4: OMOP Transform: run FHIR → OMOP mapping logic (TermX FML or custom)
- Cross-Layer Links: create MAPPED_TO, CODED_BY, and FROM_DATASET relationships

| Layer | Tool | Purpose |
|---|---|---|
| Layer 5 Ontology | Neosemantics (n10s) | Import SNOMED CT / LOINC RDF |
| Layer 3 FHIR | CyFHIR | Native FHIR Bundle → Neo4j |
| Layer 4 OMOP | TermX + FML or Custom Cypher | FHIR → OMOP transformation |
| Layer 2 Metadata | rdflib-neo4j | HealthDCAT-AP RDF → Neo4j |
| Layer 1 Marketplace | Custom API + Cypher | DSP catalog ingestion |
Create the core constraints and indexes (each node section above lists its full set):
// Layer 1
CREATE CONSTRAINT participant_id IF NOT EXISTS FOR (p:Participant) REQUIRE p.participantId IS UNIQUE;
CREATE CONSTRAINT product_id IF NOT EXISTS FOR (dp:DataProduct) REQUIRE dp.productId IS UNIQUE;
CREATE CONSTRAINT contract_id IF NOT EXISTS FOR (c:Contract) REQUIRE c.contractId IS UNIQUE;
// Layer 2 (HealthDCAT-AP)
CREATE CONSTRAINT catalog_id IF NOT EXISTS FOR (cat:Catalog) REQUIRE cat.catalogId IS UNIQUE;
CREATE CONSTRAINT dataset_id IF NOT EXISTS FOR (hd:HealthDataset) REQUIRE hd.datasetId IS UNIQUE;
CREATE CONSTRAINT distribution_id IF NOT EXISTS FOR (d:Distribution) REQUIRE d.distributionId IS UNIQUE;
CREATE CONSTRAINT contact_point_id IF NOT EXISTS FOR (cp:ContactPoint) REQUIRE cp.contactId IS UNIQUE;
CREATE CONSTRAINT organization_id IF NOT EXISTS FOR (org:Organization) REQUIRE org.organizationId IS UNIQUE;
// Layer 3 FHIR
CREATE CONSTRAINT patient_id IF NOT EXISTS FOR (p:Patient) REQUIRE p.resourceId IS UNIQUE;
CREATE CONSTRAINT condition_id IF NOT EXISTS FOR (c:Condition) REQUIRE c.resourceId IS UNIQUE;
CREATE CONSTRAINT observation_id IF NOT EXISTS FOR (o:Observation) REQUIRE o.resourceId IS UNIQUE;
CREATE CONSTRAINT medication_request_id IF NOT EXISTS FOR (mr:MedicationRequest) REQUIRE mr.resourceId IS UNIQUE;
CREATE CONSTRAINT encounter_id IF NOT EXISTS FOR (e:Encounter) REQUIRE e.resourceId IS UNIQUE;
CREATE INDEX condition_code IF NOT EXISTS FOR (c:Condition) ON (c.code);
CREATE INDEX observation_code IF NOT EXISTS FOR (o:Observation) ON (o.code);
// Layer 4 OMOP
CREATE CONSTRAINT omop_person_id IF NOT EXISTS FOR (op:OMOPPerson) REQUIRE op.personId IS UNIQUE;
CREATE CONSTRAINT omop_condition_occurrence_id IF NOT EXISTS FOR (oco:OMOPConditionOccurrence) REQUIRE oco.conditionOccurrenceId IS UNIQUE;
CREATE CONSTRAINT omop_measurement_id IF NOT EXISTS FOR (om:OMOPMeasurement) REQUIRE om.measurementId IS UNIQUE;
CREATE INDEX omop_condition_concept IF NOT EXISTS FOR (oco:OMOPConditionOccurrence) ON (oco.conditionConceptId);
CREATE INDEX omop_measurement_concept IF NOT EXISTS FOR (om:OMOPMeasurement) ON (om.measurementConceptId);
// Layer 5 Ontology
CREATE CONSTRAINT snomed_concept_id IF NOT EXISTS FOR (sc:SnomedConcept) REQUIRE sc.conceptId IS UNIQUE;
CREATE CONSTRAINT loinc_code IF NOT EXISTS FOR (lc:LoincCode) REQUIRE lc.loincNumber IS UNIQUE;
CREATE CONSTRAINT icd10_code IF NOT EXISTS FOR (icd:ICD10Code) REQUIRE icd.code IS UNIQUE;
CREATE CONSTRAINT rxnorm_rxcui IF NOT EXISTS FOR (rx:RxNormConcept) REQUIRE rx.rxcui IS UNIQUE;

MATCH (dp:DataProduct)-[:DESCRIBED_BY]->(hd:HealthDataset)
WHERE hd.hdcatapPurpose CONTAINS 'SCIENTIFIC_RESEARCH'
AND dp.sensitivity IN ['ANONYMOUS', 'PSEUDONYMIZED']
MATCH (provider:Participant)-[:OFFERS]->(dp)
RETURN dp.productId, dp.title, hd.dctTemporalStart, hd.dctTemporalEnd, provider.legalName
ORDER BY hd.issued DESC

MATCH (p:Patient {resourceId: 'patient-12345'})
MATCH (p)-[:HAS_CONDITION]->(c:Condition)
MATCH (c)-[:CODED_BY]->(sc:SnomedConcept)
MATCH (sc)-[:IS_A*1..3]->(parent:SnomedConcept)
RETURN p, c, sc, parent

// Find all persons with Type 2 Diabetes and HbA1c >= 7.0 since 2025-01-01
MATCH (op:OMOPPerson)-[:HAS_CONDITION_OCCURRENCE]->(oco:OMOPConditionOccurrence)
WHERE oco.conditionConceptId = 201826 // Type 2 Diabetes OMOP Standard Concept
AND oco.conditionStartDate >= date('2025-01-01')
MATCH (op)-[:HAS_MEASUREMENT]->(om:OMOPMeasurement)
WHERE om.measurementConceptId = 4184637 // HbA1c OMOP Standard Concept
AND om.valueAsNumber >= 7.0
AND om.measurementDate >= date('2025-01-01')
RETURN op.personId, oco.conditionStartDate, om.valueAsNumber, om.measurementDate

MATCH (hd:HealthDataset)<-[:FROM_DATASET]-(p:Patient)-[:HAS_CONDITION]->(c:Condition)
MATCH (c)-[:CODED_BY]->(sc:SnomedConcept)
WHERE sc.conceptId = 13645005 // Chronic obstructive pulmonary disease
WITH hd.title AS dataset, count(DISTINCT p) AS patientCount
RETURN dataset, patientCount
ORDER BY patientCount DESC

MATCH (parent:SnomedConcept {conceptId: 64572001}) // Disease (disorder)
MATCH (descendant:SnomedConcept)-[:IS_A*1..]->(parent)
RETURN descendant.conceptId, descendant.preferredTerm
LIMIT 100

Verify a data consumer has a valid, non-expired HDAB approval before executing a contract-governed query. This is the canonical pre-flight check for EHDS Articles 45–52 compliance.

// Full approval chain: Consumer → Application → HDABApproval → Contract → DataProduct
// REQUESTS_ACCESS_TO targets the DataProduct (see Layer 1); the dataset is reached via DESCRIBED_BY
MATCH (consumer:Participant {participantId: $consumerId})
MATCH (consumer)-[:SUBMITTED]->(app:AccessApplication {status: 'APPROVED'})
MATCH (app)-[:REQUESTS_ACCESS_TO]->(dp:DataProduct)-[:DESCRIBED_BY]->(hd:HealthDataset {datasetId: $datasetId})
MATCH (approval:HDABApproval)-[:APPROVES]->(app)
WHERE approval.validUntil > datetime()
AND approval.permittedPurpose = $requestedPurpose
MATCH (approval)-[:APPROVED]->(contract:Contract)-[:GOVERNS]->(dp)
WHERE contract.validUntil > datetime()
RETURN
consumer.legalName AS consumer,
approval.approvalId AS hdabApproval,
approval.permittedPurpose AS grantedPurpose,
contract.contractId AS contract,
dp.productId AS dataProduct

Enforce mandatory FHIR → OMOP mappings:

// Every FHIR Patient MUST have a corresponding OMOPPerson
MATCH (p:Patient)
WHERE NOT EXISTS { MATCH (p)-[:MAPPED_TO]->(:OMOPPerson) }
RETURN p.resourceId AS unmappedPatient

Verify all Conditions are coded:

MATCH (c:Condition)
WHERE NOT EXISTS { MATCH (c)-[:CODED_BY]->(:SnomedConcept) }
AND NOT EXISTS { MATCH (c)-[:CODED_BY]->(:ICD10Code) }
RETURN c.resourceId AS uncodedCondition, c.code

Identify data access without valid contract:

// A valid contract must name this consumer, govern this product, and be unexpired
MATCH (consumer:Participant)-[:CONSUMES]->(dp:DataProduct)
WHERE NOT EXISTS {
MATCH (consumer)<-[:CONSUMER]-(contract:Contract)-[:GOVERNS]->(dp)
WHERE contract.validUntil > datetime()
}
RETURN consumer.participantId, dp.productId

Find contracts missing a valid HDAB approval (EHDS non-compliant access):

MATCH (contract:Contract)-[:GOVERNS]->(dp:DataProduct)
WHERE NOT EXISTS {
MATCH (approval:HDABApproval)-[:APPROVED]->(contract)
WHERE approval.validUntil > datetime()
}
RETURN contract.contractId AS nonCompliantContract,
contract.consumerId AS consumer,
dp.productId AS dataProduct

Detect approved applications whose contracts have since expired:

MATCH (app:AccessApplication {status: 'APPROVED'})
MATCH (approval:HDABApproval)-[:APPROVES]->(app)
MATCH (approval)-[:APPROVED]->(contract:Contract)
WHERE contract.validUntil < datetime()
AND approval.validUntil > datetime()
RETURN app.applicationId, contract.contractId,
contract.validUntil AS contractExpired,
approval.validUntil AS approvalStillValid

List all active HDAB approvals for audit:

MATCH (hdab:Participant {participantType: 'HDAB'})-[:REVIEWED]->(app:AccessApplication)
MATCH (approval:HDABApproval)-[:APPROVES]->(app)
MATCH (consumer:Participant)-[:SUBMITTED]->(app)
WHERE approval.validUntil > datetime()
RETURN
hdab.legalName AS hdab,
consumer.legalName AS consumer,
approval.approvalId AS approvalId,
approval.permittedPurpose AS purpose,
approval.validUntil AS expiresAt
ORDER BY approval.validUntil ASC

- Export FHIR Bundles via HAPI FHIR JPA Server or custom ETL
- Load Bundles into Neo4j via CyFHIR
- Run terminology mapping scripts to create CODED_BY relationships
- Generate OMOP transform via TermX FML rules

- Export OMOP tables as CSV (person, condition_occurrence, measurement, drug_exposure, visit_occurrence)
- Load as OMOPPerson, OMOPConditionOccurrence, etc. nodes
- Import OMOP vocabulary tables as ontology nodes
- Create STANDARD_CONCEPT relationships
- Optionally back-transform to FHIR using reverse FML rules

- Query FHIR server via REST API for Patient, Condition, Observation, MedicationRequest, Encounter bundles
- POST bundles to CyFHIR /load endpoint
- Run post-load script to create FROM_DATASET provenance relationships
- Trigger FHIR → OMOP transformation pipeline

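
The post-load step that creates FROM_DATASET relationships could be scripted along the following lines. This is a sketch under stated assumptions: `from_dataset_statement` and `ALLOWED_LABELS` are hypothetical names, the function only builds a parameterized Cypher statement (it does not talk to a database), and the label is checked against a whitelist because Cypher cannot take node labels as query parameters.

```python
from typing import Any, Iterable

# Labels that carry FROM_DATASET in the relationship listing of this schema.
ALLOWED_LABELS = {"Patient", "Condition", "Observation"}


def from_dataset_statement(
    label: str, resource_ids: Iterable[str], dataset_id: str
) -> tuple[str, dict[str, Any]]:
    """Build a Cypher statement linking loaded FHIR nodes to their HealthDataset."""
    if label not in ALLOWED_LABELS:
        raise ValueError(f"unsupported label: {label}")
    query = (
        "UNWIND $resourceIds AS rid\n"
        f"MATCH (n:{label} {{resourceId: rid}})\n"
        "MATCH (hd:HealthDataset {datasetId: $datasetId})\n"
        "MERGE (n)-[:FROM_DATASET]->(hd)"  # MERGE keeps the step idempotent
    )
    return query, {"resourceIds": list(resource_ids), "datasetId": dataset_id}
```

The returned pair would typically be handed to a Neo4j driver call such as `session.run(query, params)`. Because the statement uses MERGE, re-running the post-load script after a partial failure should not create duplicate relationships.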
This schema provides a production-ready, layered Neo4j data model for health dataspaces that:
- ✅ Integrates FHIR clinical exchange and OMOP research analytics
- ✅ Supports HealthDCAT-AP metadata discovery across HDABs (built on W3C DCAT)
- ✅ Enables dataspace marketplace operations with DSP contracts
- ✅ Preserves bidirectional traceability between all layers
- ✅ Leverages SNOMED CT / LOINC / ICD-10 / RxNorm as semantic backbone
- ✅ Follows Neo4j best practices for labels, relationships, and indexes
- ✅ Supports JSON-LD serialization for DSP Federated Catalog interoperability

Next Steps:
- Implement reference code in MinimumViableDataspace health demo
- Publish as open-source schema for EHDS ecosystem
- Validate with real clinical datasets (Synthea, MIMIC-IV)
- Submit to HL7 FHIR-OMOP IG as graph-based transformation reference

Contact: Matthias Buchhorn-Roth, Solutions Architect, Sopra Steria | LinkedIn: linkedin.com/in/ma3u | GitHub: github.com/ma3u

License: MIT | Repository: github.com/ma3u/MinimumViableHealthDataspacev2