Commit 0d9f1ec

docs: substantive CRG C annotation (EXPLAINME.adoc)
1 parent bd561cf commit 0d9f1ec

1 file changed

Lines changed: 176 additions & 4 deletions

EXPLAINME.adoc

@@ -1,23 +1,195 @@
11
// SPDX-License-Identifier: PMPL-1.0-or-later
2+
// SPDX-FileCopyrightText: 2026 Jonathan D.A. Jewell <j.d.a.jewell@open.ac.uk>
23
= Sanctify-PHP — Show Me The Receipts
34
:toc:
5+
:toclevels: 3
46
:icons: font
7+
:author: Jonathan D.A. Jewell
8+
:email: j.d.a.jewell@open.ac.uk
59

6-
The README makes claims. This file backs them up.
10+
The README makes claims. This file backs them up with specific module paths,
11+
an honest reading of what is verified vs. what is tested, and enough detail
12+
for an external reviewer to trace the critical analysis and transformation paths.
13+
14+
== Claim 1: Haskell-based PHP parser and hardening tool
715

816
[quote, README]
917
____
1018
Haskell-based PHP hardening and security analysis tool.
1119
____
1220

21+
*How it works.*
22+
The Haskell source tree lives under `src/Sanctify/`.
23+
`AST.hs` defines the complete PHP abstract syntax tree: `PhpFile`,
24+
`Statement`, `Expr`, `PhpType`, `ClassMember`, `TraitAdaptation`, and
25+
`SourcePos` (for error reporting), all using the `{-# LANGUAGE StrictData #-}`
26+
pragma to prevent thunk accumulation on large codebases.
27+
`Parser.hs` (and the `Parser/` subdirectory) implements a Megaparsec-based
28+
PHP parser exposing `parsePhpFile`, `parsePhpString`, `parseStatement`, and
29+
`parseExpr` as the primary API surface.
30+
The analysis pipeline in `Analysis/` contains `Security.hs` (SQL injection,
31+
XSS, CSRF, command injection detection), `Taint.hs` (full taint tracking from
32+
sources to sinks), `Advanced.hs`, `Types.hs`, and `DeadCode.hs`.
33+
Transformations live in `Transform/`: `Strict.hs` inserts
34+
`declare(strict_types=1)`, `TypeHints.hs` infers and adds parameter/return
35+
type annotations, and `Sanitize.hs` wraps unescaped `echo` output in
36+
`esc_html()`.
37+
`Report.hs` serialises analysis results to JSON, SARIF, and HTML.
38+
The `WordPress/` module enforces WordPress-specific constraints (ABSPATH
39+
checks, text-domain presence, nonce verification).
40+
`app/Main.hs` is the CLI entry point; `sanctify-php.cabal` declares the
41+
build.
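
As an illustration of how a `StrictData` AST and the `echo` sanitising transform described above might fit together, here is a minimal sketch. It assumes a heavily reduced, hypothetical AST: the constructors `Var`, `StrLit`, `Call`, `Echo`, and `ExprStmt`, and the functions `sanitizeEcho` and `needsEscape`, are invented for this example and are not taken from `AST.hs` or `Sanitize.hs`.

```haskell
{-# LANGUAGE StrictData #-}
-- Sketch only: a reduced, hypothetical AST, not the types in AST.hs.

-- With StrictData every constructor field is strict, so building a
-- large tree does not leave unevaluated thunks in the fields.
data Expr
  = Var String          -- a PHP variable such as $name
  | StrLit String       -- a string literal
  | Call String [Expr]  -- a function call f(args)
  deriving (Eq, Show)

data Statement
  = Echo Expr
  | ExprStmt Expr
  deriving (Eq, Show)

-- Functions whose result is treated as already escaped (assumed list).
escapers :: [String]
escapers = ["esc_html", "esc_attr"]

-- True when an echoed expression should be wrapped before output.
needsEscape :: Expr -> Bool
needsEscape (StrLit _) = False
needsEscape (Call f _) = f `notElem` escapers
needsEscape (Var _)    = True

-- Wrap unescaped echo output in esc_html(); leave everything else alone.
sanitizeEcho :: Statement -> Statement
sanitizeEcho (Echo e)
  | needsEscape e = Echo (Call "esc_html" [e])
sanitizeEcho s = s

main :: IO ()
main = mapM_ (print . sanitizeEcho)
  [ Echo (Var "name")                   -- gets wrapped
  , Echo (StrLit "hello")               -- literal, left as-is
  , Echo (Call "esc_attr" [Var "id"])   -- already escaped, left as-is
  ]
```

The transform is written as a pure function from `Statement` to `Statement`, so it can be mapped over a file's statements and tested in isolation. Whether the real pass is structured this way is not stated in the source.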
42+
43+
*Honest caveat.*
44+
The README includes an explicit note: "It should not be read as proof that
45+
every analysis and transform is already trustworthy enough for production
46+
security decisions."
47+
Taint analysis tests and end-to-end execution coverage are the most notable
48+
remaining gaps — `PROOF-NEEDS.md` and `TEST-NEEDS.md` list the outstanding
49+
obligations.
50+
51+
== Claim 2: Multiple output formats and infrastructure export
52+
53+
[quote, README]
54+
____
55+
Generates reports in JSON/SARIF/HTML formats. Exports infrastructure
56+
recommendations (php.ini, nginx, Guix).
57+
____
58+
59+
*How it works.*
60+
`Report.hs` implements the format serialisation: the `ReportFormat` sum type
61+
selects JSON (via `aeson`), SARIF (structured diagnostic format for IDE/SAST
62+
integration), or HTML (standalone rendered report).
63+
The `sanctify export` subcommand (wired in `app/Main.hs`) reads the analysis
64+
results and emits php.ini hardening directives, nginx security headers, or a
65+
Guix package override as plain text appended to the target config file.
66+
The `bench/` directory contains criterion benchmarks measuring parser and
67+
analysis throughput on representative PHP codebases.
68+
Test plugins in `test-plugins/` provide real-world WordPress plugin fixtures
69+
for integration tests.
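
The format dispatch described above can be sketched as a sum type plus one renderer per constructor. This is a minimal sketch using only `base`: the `Finding` record, `parseFormat`, and `render` are hypothetical stand-ins, and the JSON strings are built by hand with `show` as a simplification, whereas the source says the real `Report.hs` uses `aeson`.

```haskell
import Data.Char (toLower)
import Data.List (intercalate)

-- Hypothetical stand-ins; the real ReportFormat and Finding may differ.
data ReportFormat = JSON | SARIF | HTML
  deriving (Eq, Show)

data Finding = Finding { ruleId :: String, message :: String }
  deriving (Eq, Show)

-- Map a user-supplied format name to a format, case-insensitively.
parseFormat :: String -> Maybe ReportFormat
parseFormat s = case map toLower s of
  "json"  -> Just JSON
  "sarif" -> Just SARIF
  "html"  -> Just HTML
  _       -> Nothing

-- Escape the characters that are unsafe in HTML text.
escapeHtml :: String -> String
escapeHtml = concatMap esc
  where
    esc '<' = "&lt;"
    esc '>' = "&gt;"
    esc '&' = "&amp;"
    esc '"' = "&quot;"
    esc c   = [c]

-- One renderer per format. `show` is used as a rough string quoter;
-- it is not a complete JSON encoder (a simplification for the sketch).
render :: ReportFormat -> [Finding] -> String
render JSON fs = "[" ++ intercalate "," (map obj fs) ++ "]"
  where
    obj f = "{\"rule\":" ++ show (ruleId f)
         ++ ",\"message\":" ++ show (message f) ++ "}"
render SARIF fs =
  "{\"version\":\"2.1.0\",\"runs\":[{\"tool\":{\"driver\":{\"name\":\"sanctify-php\"}},\"results\":["
    ++ intercalate "," (map result fs) ++ "]}]}"
  where
    result f = "{\"ruleId\":" ++ show (ruleId f)
            ++ ",\"message\":{\"text\":" ++ show (message f) ++ "}}"
render HTML fs = "<ul>" ++ concatMap item fs ++ "</ul>"
  where
    item f = "<li>" ++ escapeHtml (ruleId f) ++ ": "
          ++ escapeHtml (message f) ++ "</li>"

main :: IO ()
main = do
  let findings = [Finding "xss" "Unescaped echo of $name"]
  case parseFormat "sarif" of
    Just fmt -> putStrLn (render fmt findings)
    Nothing  -> putStrLn "unknown format"
```

Because `render` pattern-matches on every constructor, adding a new format makes the compiler flag the missing case when incomplete-pattern warnings are enabled.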
70+
71+
*Honest caveat.*
72+
SARIF output enables IDE integration (VS Code problem matcher, GitHub
73+
Advanced Security upload) but end-to-end round-trip tests from PHP source to
74+
SARIF upload are absent: the feature exists in code but has not been validated end to end.
75+
76+
== Dogfooded Across The Account
77+
78+
[cols="1,3", options="header"]
79+
|===
80+
| Tool / Repo | How sanctify-php uses it
81+
82+
| `panic-attacker`
83+
| Pre-commit security scan gate (`just pre-commit`)
84+
85+
| `contractile.just` / contractiles
86+
| Standard hyperpolymath contractile integration hook
87+
88+
| `stapeln.toml`
89+
| Container build manifest for the Haskell tool image
90+
91+
| `guix.scm` / `guix/`
92+
| Guix package expression; the export subcommand also emits Guix override stanzas
93+
94+
| `flake.nix`
95+
| Nix flake for reproducible build environment (GHC, Cabal)
96+
97+
| `Hypatia CI`
98+
| `hypatia-scan.yml` workflow applies neurosymbolic security rules to each commit
99+
100+
| `PROOF-NEEDS.md`
101+
| Consumed by Hypatia and gitbot-fleet to track outstanding proof obligations
102+
|===
103+
13104
== File Map
14105

15-
[cols="1,2"]
106+
[cols="2,4", options="header"]
16107
|===
17108
| Path | What's There
18109

19-
| `src/` | Source code
20-
| `test(s)/` | Test suite
110+
| `src/Sanctify/AST.hs`
111+
| PHP AST definition: all node types from `PhpFile` to `TraitAdaptation`; uses `StrictData`
112+
113+
| `src/Sanctify/Parser.hs`
114+
| Megaparsec PHP parser entry points: `parsePhpFile`, `parsePhpString`, `parseStatement`, `parseExpr`
115+
116+
| `src/Sanctify/Parser/`
117+
| Sub-parsers: expression, statement, declaration, class-member parsers broken out by production rule group
118+
119+
| `src/Sanctify/Analysis/Security.hs`
120+
| Vulnerability detection: SQLi, XSS, CSRF, command injection pattern matching over the AST
121+
122+
| `src/Sanctify/Analysis/Taint.hs`
123+
| Taint tracking: source tagging at user input, propagation through assignments, sink detection at output/exec calls
124+
125+
| `src/Sanctify/Analysis/Advanced.hs`
126+
| Higher-level analysis passes (type-coercion risks, insecure deserialisation, SSRF patterns)
127+
128+
| `src/Sanctify/Analysis/DeadCode.hs`
129+
| Dead code detection: unreachable branches, unused variables after strict-type enforcement
130+
131+
| `src/Sanctify/Analysis/Types.hs`
132+
| Analysis result types: `Finding`, `Severity`, `Location`, `AnalysisReport`
133+
134+
| `src/Sanctify/Transform/Strict.hs`
135+
| Inserts `declare(strict_types=1)` at file head where absent
136+
137+
| `src/Sanctify/Transform/TypeHints.hs`
138+
| Infers and adds parameter and return type annotations
139+
140+
| `src/Sanctify/Transform/Sanitize.hs`
141+
| Wraps `echo $var` with `esc_html()` / `esc_attr()` for WordPress output contexts
142+
143+
| `src/Sanctify/Transform/StrictTypes.hs`
144+
| Enforces PHP 8 strict typing constraints across class hierarchies
145+
146+
| `src/Sanctify/Report.hs`
147+
| Serialises `AnalysisReport` to JSON (aeson), SARIF, and HTML
148+
149+
| `src/Sanctify/Ruleset.hs`
150+
| Ruleset configuration: which checks to enable, severity overrides, exclusion patterns
151+
152+
| `src/Sanctify/Config.hs`
153+
| Configuration loading from `.sanctify.yaml` or CLI flags
154+
155+
| `src/Sanctify/Emit.hs`
156+
| Pretty-printer: reconstructs PHP source from a transformed AST
157+
158+
| `src/Sanctify/WordPress/`
159+
| WordPress-specific checks: ABSPATH guard, text-domain presence, nonce verification, capability checks
160+
161+
| `app/Main.hs`
162+
| CLI entry point: subcommands `analyze`, `fix`, `report`, `export`
163+
164+
| `sanctify-php.cabal`
165+
| Cabal build descriptor; lists all source modules and their dependencies
166+
167+
| `bench/`
168+
| Criterion benchmarks: parser throughput, analysis latency on fixture codebases
169+
170+
| `test/`
171+
| HSpec unit tests for AST round-trips and individual transform passes
172+
173+
| `tests/`
174+
| Integration tests: full pipeline from PHP source to report output
175+
176+
| `test-plugins/`
177+
| Real-world WordPress plugin fixtures for integration and regression testing
178+
179+
| `docs/`
180+
| Design documents: taint model, SARIF schema mapping, WordPress constraint rationale
181+
182+
| `PROOF-NEEDS.md`
183+
| Outstanding formal proof obligations for security-analysis core
184+
185+
| `TEST-NEEDS.md`
186+
| Known testing gaps: taint end-to-end, SARIF upload round-trip
187+
188+
| `PRIORITY.adoc`
189+
| Prioritised work backlog for the next development phase
190+
191+
| `flake.nix` / `guix.scm`
192+
| Reproducible build environments (GHC + Cabal) via Nix and Guix
21193
|===
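
The `Taint.hs` row above describes source tagging, propagation through assignments, and sink detection. A minimal sketch of that idea over straight-line code follows. Everything here is an assumption made for illustration: the `Expr` and `Stmt` types, the `sources` and `sanitizers` lists, and the `analyse` function are hypothetical and do not reflect the real module's API or precision.

```haskell
import Data.List (foldl')

-- Hypothetical mini-AST for the sketch.
data Expr
  = Var String          -- variable or superglobal, without the leading $
  | Lit String          -- string literal
  | Call String [Expr]  -- function call
  | Concat Expr Expr    -- the PHP "." operator

data Stmt
  = Assign String Expr  -- $v = e;
  | Sink String Expr    -- an output or exec call, e.g. echo e;

-- Assumed taint sources and sanitisers.
sources, sanitizers :: [String]
sources    = ["_GET", "_POST", "_COOKIE", "_REQUEST"]
sanitizers = ["esc_html", "intval", "htmlspecialchars"]

-- Is an expression tainted, given the currently tainted variables?
tainted :: [String] -> Expr -> Bool
tainted env (Var v)       = v `elem` sources || v `elem` env
tainted _   (Lit _)       = False
tainted env (Call f args) = f `notElem` sanitizers && any (tainted env) args
tainted env (Concat a b)  = tainted env a || tainted env b

-- Process one statement: update the tainted set, record tainted sinks.
step :: ([String], [String]) -> Stmt -> ([String], [String])
step (env, hits) (Assign v e)
  | tainted env e = (v : env, hits)
  | otherwise     = (filter (/= v) env, hits)  -- clean reassignment clears taint
step (env, hits) (Sink name e)
  | tainted env e = (env, name : hits)
  | otherwise     = (env, hits)

-- Names of sinks reached by tainted data, in program order.
analyse :: [Stmt] -> [String]
analyse = reverse . snd . foldl' step ([], [])

demo :: [Stmt]
demo =
  [ Assign "id" (Var "_GET")
  , Assign "safe" (Call "intval" [Var "id"])
  , Sink "echo" (Var "safe")
  , Sink "mysqli_query" (Concat (Lit "SELECT * FROM t WHERE id = ") (Var "id"))
  ]

main :: IO ()
main = print (analyse demo)
```

The sketch ignores branches, loops, function bodies, and array indexing, all of which a real taint analysis has to handle.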
22194

23195
== Questions?
