|
1 | 1 | // SPDX-License-Identifier: PMPL-1.0-or-later |
| 2 | +// SPDX-FileCopyrightText: 2026 Jonathan D.A. Jewell <j.d.a.jewell@open.ac.uk> |
2 | 3 | = Sanctify-PHP — Show Me The Receipts |
3 | 4 | :toc: |
| 5 | +:toclevels: 3 |
4 | 6 | :icons: font |
| 7 | +:author: Jonathan D.A. Jewell |
| 8 | +:email: j.d.a.jewell@open.ac.uk |
5 | 9 |
|
6 | | -The README makes claims. This file backs them up. |
| 10 | +The README makes claims. This file backs them up with specific module paths, |
| 11 | +an honest reading of what is verified vs. what is tested, and enough detail |
| 12 | +for an external reviewer to trace the critical analysis and transformation paths. |
| 13 | + |
| 14 | +== Claim 1: Haskell-based PHP parser and hardening tool |
7 | 15 |
|
8 | 16 | [quote, README] |
9 | 17 | ____ |
10 | 18 | Haskell-based PHP hardening and security analysis tool. |
11 | 19 | ____ |
12 | 20 |
|
| 21 | +*How it works.* |
| 22 | +The Haskell source tree lives under `src/Sanctify/`. |
| 23 | +`AST.hs` defines the complete PHP abstract syntax tree: `PhpFile`, |
| 24 | +`Statement`, `Expr`, `PhpType`, `ClassMember`, `TraitAdaptation`, and |
| 25 | +`SourcePos` (for error reporting), all using the `{-# LANGUAGE StrictData #-}` |
| 26 | +pragma to prevent thunk accumulation on large codebases. |
| 27 | +`Parser.hs` (and the `Parser/` subdirectory) implements a Megaparsec-based |
| 28 | +PHP parser exposing `parsePhpFile`, `parsePhpString`, `parseStatement`, and |
| 29 | +`parseExpr` as the primary API surface. |
| 30 | +The analysis pipeline in `Analysis/` contains `Security.hs` (SQL injection, |
| 31 | +XSS, CSRF, and command injection detection), `Taint.hs` (taint tracking from |
| 32 | +sources to sinks), `Advanced.hs`, `Types.hs`, and `DeadCode.hs`. |
| 33 | +Transformations live in `Transform/`: `Strict.hs` inserts |
| 34 | +`declare(strict_types=1)`, `TypeHints.hs` infers and adds parameter/return |
| 35 | +type annotations, and `Sanitize.hs` wraps unescaped `echo` output in |
| 36 | +`esc_html()`. |
| 37 | +`Report.hs` serialises analysis results to JSON, SARIF, and HTML. |
| 38 | +The `WordPress/` module enforces WordPress-specific constraints (ABSPATH |
| 39 | +checks, text-domain presence, nonce verification). |
| 40 | +`app/Main.hs` is the CLI entry point; `sanctify-php.cabal` declares the |
| 41 | +build. |
| 42 | + |
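To make the AST-to-transform path concrete, here is a minimal sketch of the kind of rewrite attributed to `Sanitize.hs`. It is not the project's code: the toy `Expr` and `Statement` types, the `escapers` list, and the names `sanitizeStmt`, `escape`, `renderExpr`, and `renderStmt` are all assumptions made for illustration. The real node types live in `src/Sanctify/AST.hs` and are far richer.

```haskell
{-# LANGUAGE StrictData #-}
-- Hypothetical toy model, NOT the types from src/Sanctify/AST.hs.
import Data.List (intercalate)

-- | Stand-in for the real expression type.
data Expr
  = Var String          -- a PHP variable such as $name
  | StrLit String       -- a single-quoted string literal
  | Call String [Expr]  -- a function call f(args)
  deriving (Eq, Show)

-- | Stand-in for the real statement type.
data Statement
  = Echo Expr
  | Assign String Expr
  deriving (Eq, Show)

-- | Functions treated as already-escaping wrappers (assumed list).
escapers :: [String]
escapers = ["esc_html", "esc_attr"]

-- | Wrap the argument of `echo` in esc_html() unless it is a literal
-- or is already wrapped in a known escaper. Other statements pass through.
sanitizeStmt :: Statement -> Statement
sanitizeStmt (Echo e) = Echo (escape e)
sanitizeStmt s        = s

escape :: Expr -> Expr
escape e@(StrLit _) = e
escape e@(Call f _)
  | f `elem` escapers = e
escape e = Call "esc_html" [e]

-- | Render the toy AST back to PHP-like text. Literals are emitted
-- without quote escaping, which a real pretty-printer would need.
renderExpr :: Expr -> String
renderExpr (Var n)       = '$' : n
renderExpr (StrLit s)    = "'" ++ s ++ "'"
renderExpr (Call f args) = f ++ "(" ++ intercalate ", " (map renderExpr args) ++ ")"

renderStmt :: Statement -> String
renderStmt (Echo e)     = "echo " ++ renderExpr e ++ ";"
renderStmt (Assign n e) = '$' : n ++ " = " ++ renderExpr e ++ ";"
```

Under these assumptions, `renderStmt (sanitizeStmt (Echo (Var "name")))` yields `echo esc_html($name);`, while an `echo` whose argument is already an `esc_attr(...)` call is left unchanged. The sketch ignores output context (attribute vs. body), which a real transform would have to decide.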
| 43 | +*Honest caveat.* |
| 44 | +The README includes an explicit note: "It should not be read as proof that |
| 45 | +every analysis and transform is already trustworthy enough for production |
| 46 | +security decisions." |
| 47 | +Test coverage for taint analysis and for end-to-end execution is the most |
| 48 | +notable remaining gap; `PROOF-NEEDS.md` and `TEST-NEEDS.md` list the |
| 49 | +outstanding obligations. |
| 50 | + |
| 51 | +== Claim 2: Multiple output formats and infrastructure export |
| 52 | + |
| 53 | +[quote, README] |
| 54 | +____ |
| 55 | +Generates reports in JSON/SARIF/HTML formats. Exports infrastructure |
| 56 | +recommendations (php.ini, nginx, Guix). |
| 57 | +____ |
| 58 | + |
| 59 | +*How it works.* |
| 60 | +`Report.hs` implements the format serialisation: the `ReportFormat` sum type |
| 61 | +selects JSON (via `aeson`), SARIF (structured diagnostic format for IDE/SAST |
| 62 | +integration), or HTML (standalone rendered report). |
| 63 | +The `sanctify export` subcommand (wired in `app/Main.hs`) reads the analysis |
| 64 | +results and emits php.ini hardening directives, nginx security headers, or a |
| 65 | +Guix package override as plain text appended to the target config file. |
| 66 | +The `bench/` directory contains criterion benchmarks measuring parser and |
| 67 | +analysis throughput on representative PHP codebases. |
| 68 | +Test plugins in `test-plugins/` provide real-world WordPress plugin fixtures |
| 69 | +for integration tests. |
| 70 | + |
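The format dispatch described above can be sketched as a sum type plus one renderer per constructor. This is an illustration only: the constructor names, the `Finding` fields, `parseFormat`, and `render` are assumptions, and the strings are hand-built so the block needs nothing beyond `base`, whereas the text says the real `Report.hs` uses `aeson` for JSON. The SARIF branch only follows the outline of a SARIF 2.1.0 log and has not been schema-validated.

```haskell
-- Hypothetical sketch; names and fields are not taken from Report.hs.
import Data.Char (toLower)
import Data.List (intercalate)

data ReportFormat = Json | Sarif | Html
  deriving (Eq, Show)

data Severity = Low | Medium | High
  deriving (Eq, Show)

data Finding = Finding
  { ruleId   :: String
  , severity :: Severity
  , message  :: String
  } deriving (Eq, Show)

-- | Parse a CLI-style format name, case-insensitively.
parseFormat :: String -> Maybe ReportFormat
parseFormat s = case map toLower s of
  "json"  -> Just Json
  "sarif" -> Just Sarif
  "html"  -> Just Html
  _       -> Nothing

-- | Quote a string. `show` is close to JSON quoting for plain ASCII only;
-- a real serialiser should use a JSON library instead.
quote :: String -> String
quote = show

findingJson :: Finding -> String
findingJson f =
  "{\"ruleId\":" ++ quote (ruleId f)
    ++ ",\"severity\":" ++ quote (show (severity f))
    ++ ",\"message\":" ++ quote (message f) ++ "}"

-- | Map severities onto SARIF result levels.
sarifLevel :: Severity -> String
sarifLevel Low    = "note"
sarifLevel Medium = "warning"
sarifLevel High   = "error"

sarifResult :: Finding -> String
sarifResult f =
  "{\"ruleId\":" ++ quote (ruleId f)
    ++ ",\"level\":" ++ quote (sarifLevel (severity f))
    ++ ",\"message\":{\"text\":" ++ quote (message f) ++ "}}"

-- | Escape the characters that matter inside HTML text nodes.
htmlEscape :: String -> String
htmlEscape = concatMap esc
  where
    esc '&' = "&amp;"
    esc '<' = "&lt;"
    esc '>' = "&gt;"
    esc c   = [c]

-- | One renderer per format; adding a constructor forces a new case here.
render :: ReportFormat -> [Finding] -> String
render Json fs = "[" ++ intercalate "," (map findingJson fs) ++ "]"
render Sarif fs =
  "{\"version\":\"2.1.0\",\"runs\":[{\"tool\":{\"driver\":{\"name\":\"sanctify-php\"}},\"results\":["
    ++ intercalate "," (map sarifResult fs) ++ "]}]}"
render Html fs =
  "<ul>" ++ concatMap (\f -> "<li>" ++ htmlEscape (message f) ++ "</li>") fs ++ "</ul>"
```

The value of the sum type is that the compiler can warn when a new format is added but not rendered (with incomplete-pattern warnings enabled), which is a cheaper guarantee than the missing round-trip tests but still a useful one.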
| 71 | +*Honest caveat.* |
| 72 | +SARIF output is intended to enable IDE integration (VS Code problem matcher, GitHub |
| 73 | +Advanced Security upload), but there are no end-to-end round-trip tests from PHP source to |
| 74 | +SARIF upload: the serialiser exists, yet its output has not been validated against a real consumer. |
| 75 | + |
| 76 | +== Dogfooded Across The Account |
| 77 | + |
| 78 | +[cols="1,3", options="header"] |
| 79 | +|=== |
| 80 | +| Tool / Repo | How sanctify-php uses it |
| 81 | + |
| 82 | +| `panic-attacker` |
| 83 | +| Pre-commit security scan gate (`just pre-commit`) |
| 84 | + |
| 85 | +| `contractile.just` / contractiles |
| 86 | +| Standard hyperpolymath contractile integration hook |
| 87 | + |
| 88 | +| `stapeln.toml` |
| 89 | +| Container build manifest for the Haskell tool image |
| 90 | + |
| 91 | +| `guix.scm` / `guix/` |
| 92 | +| Guix package expression; the export subcommand also emits Guix override stanzas |
| 93 | + |
| 94 | +| `flake.nix` |
| 95 | +| Nix flake for reproducible build environment (GHC, Cabal) |
| 96 | + |
| 97 | +| `Hypatia CI` |
| 98 | +| `hypatia-scan.yml` workflow applies neurosymbolic security rules to each commit |
| 99 | + |
| 100 | +| `PROOF-NEEDS.md` |
| 101 | +| Consumed by Hypatia and gitbot-fleet to track outstanding proof obligations |
| 102 | +|=== |
| 103 | + |
13 | 104 | == File Map |
14 | 105 |
|
15 | | -[cols="1,2"] |
| 106 | +[cols="2,4", options="header"] |
16 | 107 | |=== |
17 | 108 | | Path | What's There |
18 | 109 |
|
19 | | -| `src/` | Source code |
20 | | -| `test(s)/` | Test suite |
| 110 | +| `src/Sanctify/AST.hs` |
| 111 | +| PHP AST definition: all node types from `PhpFile` to `TraitAdaptation`; uses `StrictData` |
| 112 | + |
| 113 | +| `src/Sanctify/Parser.hs` |
| 114 | +| Megaparsec PHP parser entry points: `parsePhpFile`, `parsePhpString`, `parseStatement`, `parseExpr` |
| 115 | + |
| 116 | +| `src/Sanctify/Parser/` |
| 117 | +| Sub-parsers for expressions, statements, declarations, and class members, split by production-rule group |
| 118 | + |
| 119 | +| `src/Sanctify/Analysis/Security.hs` |
| 120 | +| Vulnerability detection: SQLi, XSS, CSRF, command injection pattern matching over the AST |
| 121 | + |
| 122 | +| `src/Sanctify/Analysis/Taint.hs` |
| 123 | +| Taint tracking: source tagging at user input, propagation through assignments, sink detection at output/exec calls |
| 124 | + |
| 125 | +| `src/Sanctify/Analysis/Advanced.hs` |
| 126 | +| Higher-level analysis passes (type-coercion risks, insecure deserialisation, SSRF patterns) |
| 127 | + |
| 128 | +| `src/Sanctify/Analysis/DeadCode.hs` |
| 129 | +| Dead code detection: unreachable branches, unused variables after strict-type enforcement |
| 130 | + |
| 131 | +| `src/Sanctify/Analysis/Types.hs` |
| 132 | +| Analysis result types: `Finding`, `Severity`, `Location`, `AnalysisReport` |
| 133 | + |
| 134 | +| `src/Sanctify/Transform/Strict.hs` |
| 135 | +| Inserts `declare(strict_types=1)` at file head where absent |
| 136 | + |
| 137 | +| `src/Sanctify/Transform/TypeHints.hs` |
| 138 | +| Infers and adds parameter and return type annotations |
| 139 | + |
| 140 | +| `src/Sanctify/Transform/Sanitize.hs` |
| 141 | +| Wraps `echo $var` with `esc_html()` / `esc_attr()` for WordPress output contexts |
| 142 | + |
| 143 | +| `src/Sanctify/Transform/StrictTypes.hs` |
| 144 | +| Enforces PHP 8 strict typing constraints across class hierarchies |
| 145 | + |
| 146 | +| `src/Sanctify/Report.hs` |
| 147 | +| Serialises `AnalysisReport` to JSON (aeson), SARIF, and HTML |
| 148 | + |
| 149 | +| `src/Sanctify/Ruleset.hs` |
| 150 | +| Ruleset configuration: which checks to enable, severity overrides, exclusion patterns |
| 151 | + |
| 152 | +| `src/Sanctify/Config.hs` |
| 153 | +| Configuration loading from `.sanctify.yaml` or CLI flags |
| 154 | + |
| 155 | +| `src/Sanctify/Emit.hs` |
| 156 | +| Pretty-printer: reconstructs PHP source from a transformed AST |
| 157 | + |
| 158 | +| `src/Sanctify/WordPress/` |
| 159 | +| WordPress-specific checks: ABSPATH guard, text-domain presence, nonce verification, capability checks |
| 160 | + |
| 161 | +| `app/Main.hs` |
| 162 | +| CLI entry point: subcommands `analyze`, `fix`, `report`, `export` |
| 163 | + |
| 164 | +| `sanctify-php.cabal` |
| 165 | +| Cabal build descriptor; lists all source modules and their dependencies |
| 166 | + |
| 167 | +| `bench/` |
| 168 | +| Criterion benchmarks: parser throughput, analysis latency on fixture codebases |
| 169 | + |
| 170 | +| `test/` |
| 171 | +| HSpec unit tests for AST round-trips and individual transform passes |
| 172 | + |
| 173 | +| `tests/` |
| 174 | +| Integration tests: full pipeline from PHP source to report output |
| 175 | + |
| 176 | +| `test-plugins/` |
| 177 | +| Real-world WordPress plugin fixtures for integration and regression testing |
| 178 | + |
| 179 | +| `docs/` |
| 180 | +| Design documents: taint model, SARIF schema mapping, WordPress constraint rationale |
| 181 | + |
| 182 | +| `PROOF-NEEDS.md` |
| 183 | +| Outstanding formal proof obligations for security-analysis core |
| 184 | + |
| 185 | +| `TEST-NEEDS.md` |
| 186 | +| Known testing gaps: taint end-to-end, SARIF upload round-trip |
| 187 | + |
| 188 | +| `PRIORITY.adoc` |
| 189 | +| Prioritised work backlog for the next development phase |
| 190 | + |
| 191 | +| `flake.nix` / `guix.scm` |
| 192 | +| Reproducible build environments (GHC + Cabal) via Nix and Guix |
21 | 193 | |=== |
22 | 194 |
|
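The `Taint.hs` row above names three steps: source tagging, propagation through assignments, and sink detection. A minimal sketch of that shape, over straight-line code only, might look like the following. Every type and function here (`Expr`, `Stmt`, `Finding`, `sanitizers`, `isTainted`, `analyse`) is hypothetical and is not taken from the repository; branches, loops, function summaries, and aliasing are all ignored.

```haskell
-- Hypothetical straight-line taint sketch; not the project's Taint.hs.

data Expr
  = Var String          -- variable read
  | Lit String          -- constant string
  | Input String        -- user input, e.g. a request parameter (taint source)
  | Concat Expr Expr    -- string concatenation
  | Call String [Expr]  -- function call
  deriving (Eq, Show)

data Stmt
  = Assign String Expr  -- $v = e
  | Echo Expr           -- output sink
  | Exec Expr           -- execution sink (query or shell)
  deriving (Eq, Show)

-- | A finding records the 1-based statement index and the sink kind.
data Finding = Finding Int String
  deriving (Eq, Show)

-- | Calls assumed to remove taint (illustrative list).
sanitizers :: [String]
sanitizers = ["esc_html", "esc_attr", "intval"]

-- | Is an expression tainted, given the currently tainted variables?
isTainted :: [String] -> Expr -> Bool
isTainted _ (Input _)    = True
isTainted _ (Lit _)      = False
isTainted t (Var v)      = v `elem` t
isTainted t (Concat a b) = isTainted t a || isTainted t b
isTainted t (Call f args)
  | f `elem` sanitizers = False
  | otherwise           = any (isTainted t) args

-- | Process one statement: update the tainted set or record a finding.
step :: ([String], [Finding]) -> (Int, Stmt) -> ([String], [Finding])
step (t, fs) (_, Assign v e)
  | isTainted t e = (v : t, fs)              -- propagation
  | otherwise     = (filter (/= v) t, fs)    -- overwrite clears taint
step (t, fs) (n, Echo e)
  | isTainted t e = (t, fs ++ [Finding n "echo"])
  | otherwise     = (t, fs)
step (t, fs) (n, Exec e)
  | isTainted t e = (t, fs ++ [Finding n "exec"])
  | otherwise     = (t, fs)

-- | Run the analysis over a statement list, numbering statements from 1.
analyse :: [Stmt] -> [Finding]
analyse stmts = snd (foldl step ([], []) (zip [1 ..] stmts))
```

For example, assigning `Input "id"` to `id`, concatenating it into `q`, and passing `q` to `Exec` is reported at the `Exec` statement, while `Echo (Call "esc_html" [Var "id"])` is not reported. Treating every listed sanitizer as safe for every sink is a simplification that a real analysis would need to refine per context.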
23 | 195 | == Questions? |
|