Commit 90bebe4

Merge pull request #151 from BlockScience/dev
docs: semantic web integration summary + nav
2 parents 3041c1d + 5972826 commit 90bebe4

2 files changed

Lines changed: 140 additions & 0 deletions

Lines changed: 139 additions & 0 deletions
@@ -0,0 +1,139 @@
1+
# Semantic Web Integration: What We Learned
2+
3+
A team summary of GDS + OWL/SHACL/SPARQL integration via `gds-owl`.
4+
5+
## The Short Version
6+
7+
We can export **85% of a GDS specification** to Turtle/RDF files and
import that part back without loss. The 15% we lose is Python callables
(transition functions, constraint predicates, distance functions). That loss
is a mathematical certainty, not a gap that better tooling can close.
11+
12+
## What Gets Exported (R1 -- Fully Representable)
13+
14+
Everything structural round-trips perfectly through Turtle:
15+
16+
| GDS Concept | RDF Representation | Validated By |
|---|---|---|
| Block names, roles, interfaces | OWL classes + properties | SHACL shapes |
| Port names and type tokens | Literals on Port nodes | SHACL datatype |
| Wiring topology (who connects to whom) | Wire nodes with source/target | SHACL cardinality |
| Entity/StateVariable declarations | Entity + StateVariable nodes | SHACL |
| TypeDef (name, python_type, units) | TypeDef node + properties | SHACL |
| Space fields | SpaceField blank nodes | SHACL |
| Parameter schema (names, types, bounds) | ParameterDef nodes | SHACL |
| Mechanism update targets (what writes where) | UpdateMapEntry nodes | SHACL |
| Admissibility dependencies (what reads what) | AdmissibilityDep nodes | SHACL |
| Transition read dependencies | TransitionReadEntry nodes | SHACL |
| State metric variable declarations | MetricVariableEntry nodes | SHACL |
| Canonical decomposition (h = f . g) | CanonicalGDS node | SHACL |
| Verification findings | Finding nodes | SHACL |
31+
32+
**13 SHACL shapes** enforce structural correctness on the RDF graph.
33+
**7 SPARQL query templates** enable cross-node analysis (blocks by role,
34+
dependency paths, entity update maps, parameter impact, verification summaries).
35+
36+
## What Requires SPARQL (R2 -- Structurally Representable)
37+
38+
Some properties can't be checked by SHACL alone (which validates individual
39+
nodes) but CAN be checked by SPARQL queries over the full graph:
40+
41+
| Property | SPARQL Feature | Why SHACL Can't |
|---|---|---|
| Acyclicity (G-006) | Transitive closure (`p+`) | No path traversal in SHACL-core |
| Completeness (SC-001) | `FILTER NOT EXISTS` | No "for all X, exists Y" |
| Determinism (SC-002) | `GROUP BY` + `HAVING` | No cross-node aggregation |
| Dangling wirings (G-004) | `FILTER NOT EXISTS` | Name existence, not class membership |
47+
48+
These all terminate (SPARQL over finite graphs always does) and are decidable.
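
To make the graph-level checks concrete, here is a minimal pure-Python sketch of what the transitive-closure and `GROUP BY` + `HAVING` queries compute. It does not use `gds-owl` or a SPARQL engine. The edge data, the function names, and the reading of determinism as "no variable has more than one writing mechanism" are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical wiring edges (source block, target block); not taken from gds-owl.
EDGES = {("sensor", "controller"), ("controller", "plant")}


def transitive_closure(edges):
    """Return every (a, b) such that b is reachable from a in one or more hops."""
    closure = set(edges)
    while True:
        # Join the current closure with the base edges to extend paths by one hop.
        step = {(a, d) for (a, b) in closure for (c, d) in edges if b == c}
        if step <= closure:
            return closure  # Fixed point: the graph is finite, so this is reached.
        closure |= step


def is_acyclic(edges):
    """Acyclic means no node reaches itself, which is what a `p+` query would find."""
    return all(a != b for (a, b) in transitive_closure(edges))


def nondeterministic_targets(updates):
    """Group (mechanism, variable) pairs by variable; keep variables with 2+ writers."""
    writers = defaultdict(set)
    for mechanism, variable in updates:
        writers[variable].add(mechanism)
    return {v: sorted(m) for v, m in writers.items() if len(m) > 1}
```

The closure loop only ever adds pairs drawn from a finite set of nodes, which is the same reason the corresponding queries terminate.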
49+
50+
## What Cannot Be Exported (R3 -- Not Representable)
51+
52+
These are **fundamentally** non-exportable. Not a tooling gap -- a
53+
mathematical impossibility (Rice's theorem for callables, computational
54+
class separation for string processing):
55+
56+
| GDS Concept | Why R3 | What Happens on Export |
|---|---|---|
| `TypeDef.constraint` (e.g. `lambda x: x >= 0`) | Arbitrary Python callable | Exported as boolean flag `hasConstraint`; imported as `None` |
| `f_behav` (transition functions) | Arbitrary computation | Not stored in GDSSpec -- user responsibility |
| `AdmissibleInputConstraint.constraint` | Arbitrary callable | Exported as boolean flag; imported as `None` |
| `StateMetric.distance` | Arbitrary callable | Exported as boolean flag; imported as `None` |
| Auto-wiring token computation | Multi-pass string processing | Results exported (WiringIR edges); process is not |
| Construction validation | Python `@model_validator` logic | Structural result preserved; validation logic is not |
64+
65+
**Key insight:** The *results* of R3 computation are always R1. Auto-wiring
66+
produces WiringIR edges (R1). Validation produces pass/fail (R1). Only the
67+
*process* is lost.
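
The flag-on-export, `None`-on-import behaviour can be sketched as follows. This is a hedged illustration, not the `gds-owl` code: `TypeDefSketch`, `export_typedef`, `import_typedef`, and the dictionary record are hypothetical stand-ins for the real classes and the Turtle serialization. The `str` fallback mirrors the lossy-field behaviour this summary lists for `python_type`.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TypeDefSketch:
    """Hypothetical stand-in for a TypeDef; only the fields needed for the sketch."""
    name: str
    python_type: type
    constraint: Optional[Callable[[object], bool]] = None


# Assumed set of type names that can be resolved again on import.
BUILTIN_TYPES = {"int": int, "float": float, "str": str, "bool": bool}


def export_typedef(td):
    """Keep the structural fields and reduce the callable to a presence flag."""
    return {
        "name": td.name,
        "python_type": td.python_type.__name__,
        "hasConstraint": td.constraint is not None,
    }


def import_typedef(record):
    """Rebuild the structural fields; the callable cannot be recovered, so it is None."""
    return TypeDefSketch(
        name=record["name"],
        python_type=BUILTIN_TYPES.get(record["python_type"], str),
        constraint=None,
    )
```

A caller that needs the constraint after import has to reattach the original Python callable; the flag only records that one existed.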
68+
69+
## The Boundary in One Sentence
70+
71+
> **You can represent everything about a system except what its programs
72+
> actually do.** The canonical decomposition `h = f . g` makes this
73+
> boundary explicit: `g` (topology) and `f_struct` (update targets) are
74+
> fully representable; `f_behav` (how state actually changes) is not.
75+
76+
## Practical Implications
77+
78+
### What You Can Do With the Turtle Export
79+
80+
1. **Share specs between tools** -- any RDF-aware tool (Protégé, GraphDB,
81+
Neo4j via neosemantics) can import a GDS spec
82+
2. **Validate specs without Python** -- SHACL processors (TopBraid, pySHACL)
83+
can check structural correctness
84+
3. **Query specs with SPARQL** -- find all mechanisms that update a given
85+
entity, trace dependency paths, check acyclicity
86+
4. **Version and diff specs** -- Turtle is text, diffs are meaningful
87+
5. **Cross-ecosystem interop** -- other OWL ontologies can reference GDS
88+
classes/properties
89+
90+
### What You Cannot Do
91+
92+
1. **Run simulations from Turtle** -- you need the Python callables back
93+
2. **Verify behavioral properties** -- "does this mechanism converge?" requires
94+
executing `f_behav`
95+
3. **Reproduce auto-wiring** -- the token overlap computation can't run in SPARQL
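
To illustrate why the wiring *result* survives export while the *process* does not, here is a small sketch of a token-overlap wiring step. The tokenization rule (split on ` + ` or `, `, then lowercase) and both function names are assumptions made for illustration; the actual auto-wiring in GDS may differ.

```python
import re


def tokenize(port_name):
    """Split a port name on ' + ' or ', ' and lowercase each piece (assumed rule)."""
    parts = re.split(r"\s\+\s|,\s", port_name)
    return frozenset(p.strip().lower() for p in parts if p.strip())


def auto_wire(outputs, inputs):
    """Return (output, input) edges for every pair of ports sharing a token."""
    return sorted(
        (out, inp)
        for out in outputs
        for inp in inputs
        if tokenize(out) & tokenize(inp)
    )
```

The list of edges returned by `auto_wire` is plain data and can be written to Turtle; the string processing inside `tokenize` is the part that stays in Python.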
96+
97+
### Round-Trip Fidelity
98+
99+
Tested with property-based testing (Hypothesis): each test generates 100
random GDSSpecs, exports them to Turtle, parses the result, and reimports it
(26 tests, roughly 2,600 specs in total). All structural fields survive.
Known lossy fields:
102+
103+
- `TypeDef.constraint` -> `None`
104+
- `TypeDef.python_type` -> falls back to `str` for non-builtin types
105+
- `AdmissibleInputConstraint.constraint` -> `None`
106+
- `StateMetric.distance` -> `None`
107+
- Port/wire ordering -> set-based (RDF is unordered)
108+
- Blank node identity -> content-based comparison, not node ID
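
The last two items mean a round-trip check has to compare content rather than order or node IDs. A minimal sketch of such a comparison, assuming a hypothetical dictionary layout for a spec (the helper names and the `ports`/`fields` keys are not from `gds-owl`):

```python
def canonical_ports(ports):
    """Treat ports as a set of (name, type_token) pairs, because RDF keeps no order."""
    return frozenset(ports)


def canonical_fields(fields_by_node_id):
    """Ignore blank-node IDs and keep only the content each node carries."""
    return frozenset(frozenset(fields.items()) for fields in fields_by_node_id.values())


def structurally_equal(before, after):
    """Compare two hypothetical spec dicts on content, not on order or node identity."""
    return (
        canonical_ports(before["ports"]) == canonical_ports(after["ports"])
        and canonical_fields(before["fields"]) == canonical_fields(after["fields"])
    )
```

One caveat of this approach: a set-based comparison also collapses exact duplicates, so it suits data where duplicates are not meaningful.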
109+
110+
## Numbers
111+
112+
| Metric | Count |
|---|---|
| R1 concepts (fully representable) | 12 |
| R2 concepts (SPARQL-needed) | 3 |
| R3 concepts (not representable) | 6 |
| SHACL shapes | 13 |
| SPARQL templates | 7 |
| Verification checks expressible in SHACL | 6 of 15 |
| Verification checks expressible in SPARQL | 6 more |
| Checks requiring Python | 2 of 15 |
| Round-trip PBT tests | 26 |
| Random specs tested | ~2,600 |
124+
125+
## Paper Alignment
126+
127+
The structural/behavioral split is a **framework design choice**, not a
128+
paper requirement. The GDS paper (Zargham & Shorish 2022) defines
129+
`U: X -> P(U)` as a single map; we split it into `U_struct` (dependency
130+
graph, R1) and `U_behav` (constraint predicate, R3) for ontological
131+
engineering. Same for `StateMetric` and `TransitionSignature`. The
132+
canonical decomposition `h = f . g` IS faithful to the paper.
133+
134+
## Files
135+
136+
- `packages/gds-owl/` -- the full export/import/SHACL/SPARQL implementation
137+
- `docs/research/formal-representability.md` -- the 800-line formal analysis
138+
- `docs/research/verification/r3-undecidability.md` -- proofs for the R3 boundary
139+
- `docs/research/verification/representability-proof.md` -- R1/R2 decidability + partition independence

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -384,6 +384,7 @@ nav:
 - Paper Implementation Gap: research/paper-implementation-gap.md
 - View Stratification: guides/view-stratification.md
 - Ecosystem: framework/ecosystem.md
+- Semantic Web Integration: research/semantic-web-summary.md
 - Formal Verification:
 - Verification Plan: research/verification-plan.md
 - R3 Non-Representability: research/verification/r3-undecidability.md
