| 1 | +# SPDX-License-Identifier: PMPL-1.0-or-later |
| 2 | + |
| 3 | +# panic-attacker Roadmap |
| 4 | + |
| 5 | +## Current State: v0.1.0 |
| 6 | + |
| 7 | +**2,359 lines of Rust.** Functional proof of concept, not scaffolding. |
| 8 | + |
| 9 | +| Component | Status | Notes | |
| 10 | +|---|---|---| |
| 11 | +| X-Ray static analysis | Working | 5 language-specific analyzers | |
| 12 | +| Attack executor (6 axes) | Working | CPU, memory, disk, network, concurrency, time | |
| 13 | +| Signature detection | Working | Simplified (string matching, not real Datalog) | |
| 14 | +| Report generation | Working | Robustness scoring, coloured terminal output, JSON | |
| 15 | +| CLI (4 commands) | Working | xray, attack, assault, analyze | |
| 16 | +| Pattern library | Defined but unused | Not wired into attack selection | |
| 17 | +| Tests | 2 unit tests | Effectively untested | |
| 18 | +| Constraint sets | Not started | — | |
| 19 | +| Multi-program testing | Not started | — | |
| 20 | + |
| 21 | +### Tested Against |
| 22 | + |
| 23 | +- **eclexia** (23,295 lines Rust): 66 weak points, 0 unsafe blocks, high unwrap density |
| 24 | +- **echidna** (60,248 lines Rust): 271 weak points, 15 unsafe blocks, 4 frameworks detected |
| 25 | + |
| 26 | +--- |
| 27 | + |
| 28 | +## v0.2 — Fix What's Broken |
| 29 | + |
| 30 | +**Theme: Make the existing output trustworthy** |
| 31 | + |
| 32 | +- [ ] Fix X-Ray duplicate entries (report per-file delta counts instead of running totals) |
| 33 | +- [ ] Add file paths to weak point `location` field (currently all `null`) |
| 34 | +- [ ] Wire pattern library into attack selection (currently defined but unused) |
| 35 | +- [ ] Connect `RuleSet` to signature engine (currently stored but ignored) |
| 36 | +- [ ] Handle non-UTF-8 source files gracefully (skip with warning, not crash) |
| 37 | + - Already partially fixed: `fs::read_to_string` failures now `continue` |
| 38 | + - Improvement: attempt Latin-1/ISO-8859-1 fallback before skipping |
| 39 | + - Improvement: log skipped files in verbose mode |
| 40 | + - Root cause: vendored third-party C files with non-ASCII author names |
| 41 | + (e.g. `Jørn` encoded as the single ISO-8859-1 byte `0xf8` instead of the UTF-8 byte pair `0xc3 0xb8`) |
| 42 | +- [ ] Fix the 10 compiler warnings (dead code) |
| 43 | +- [ ] Add integration test using `examples/vulnerable_program.rs` |
| 44 | +- [ ] Per-file breakdown in verbose output (which files contribute most weak points) |
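
The Latin-1 fallback item above could look roughly like the following. This is a minimal sketch, not the project's code: `decode_source` and `read_source_lossless` are hypothetical names, and the only assumption about the existing scanner is that it currently calls `fs::read_to_string`. Because every ISO-8859-1 byte maps to the Unicode code point of the same value, the fallback itself cannot fail, although it will mis-decode files that are really in some other encoding.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Decode raw bytes as UTF-8, falling back to ISO-8859-1 (Latin-1).
/// Returns the decoded text and whether the fallback was needed.
fn decode_source(bytes: Vec<u8>) -> (String, bool) {
    match String::from_utf8(bytes) {
        Ok(text) => (text, false),
        // Each Latin-1 byte is the Unicode code point of the same value,
        // so `u8 as char` performs the transcoding.
        Err(err) => (
            err.into_bytes().into_iter().map(|b| b as char).collect(),
            true,
        ),
    }
}

/// Hypothetical stand-in for a bare `fs::read_to_string` call.
/// I/O errors are still returned; only decoding errors are absorbed.
fn read_source_lossless(path: &Path, verbose: bool) -> io::Result<String> {
    let (text, used_fallback) = decode_source(fs::read(path)?);
    if used_fallback && verbose {
        eprintln!(
            "warning: {} is not valid UTF-8; decoded as Latin-1",
            path.display()
        );
    }
    Ok(text)
}
```

With this shape, the `Jørn` case from the root-cause note decodes to the expected text instead of being skipped, and verbose mode still records which files needed the fallback.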
| 45 | + |
| 46 | +**Estimated: 1–2 days** |
| 47 | + |
| 48 | +--- |
| 49 | + |
| 50 | +## v0.3 — Test Coverage |
| 51 | + |
| 52 | +**Theme: Trust the tool enough to use it on real code** |
| 53 | + |
| 54 | +- [ ] Unit tests for X-Ray (each language analyzer: Rust, C/C++, Go, Python, generic) |
| 55 | +- [ ] Unit tests for signature engine (each inference rule) |
| 56 | +- [ ] Integration tests: X-Ray → Attack → Signature pipeline |
| 57 | +- [ ] Test against example vulnerable program (panic, OOM, deadlock, race) |
| 58 | +- [ ] Regression test: eclexia baseline (66 weak points, known profile) |
| 59 | +- [ ] Regression test: echidna baseline (271 weak points, known profile) |
| 60 | +- [ ] CI with GitHub Actions |
| 61 | +- [ ] Code coverage reporting |
| 62 | + |
| 63 | +**Estimated: 2–3 days** |
| 64 | + |
| 65 | +--- |
| 66 | + |
| 67 | +## v0.4 — Constraint Sets |
| 68 | + |
| 69 | +**Theme: Composable stress profiles** |
| 70 | + |
| 71 | +Constraint sets are named combinations of conditions that simulate real |
| 72 | +failure scenarios. Real failures rarely have a single cause; they usually |
| 73 | +arise where several pressures intersect. |
| 74 | + |
| 75 | +```yaml |
| 76 | +"Production Spike": |
| 77 | + cpu: 80% + periodic 100% spikes |
| 78 | + memory: 70% with 2% leak rate |
| 79 | + network: 50ms latency + 1% loss |
| 80 | + disk: 85% full |
| 81 | +``` |
| 82 | + |
| 83 | +- [ ] YAML-based constraint set definitions |
| 84 | +- [ ] New CLI command: `panic-attacker stress ./program --profile spike.yaml` |
| 85 | +- [ ] Multi-axis simultaneous attacks (not just sequential) |
| 86 | +- [ ] Built-in profiles: |
| 87 | + - `production-spike` — CPU + memory + network pressure |
| 88 | + - `memory-leak` — gradual memory growth over time |
| 89 | + - `disk-full` — I/O with shrinking free space |
| 90 | + - `network-partition` — intermittent connectivity loss |
| 91 | + - `thundering-herd` — sudden concurrency burst |
| 92 | +- [ ] Custom profile authoring with slider-like intensity controls |
| 93 | +- [ ] Profile composition (combine two profiles into a new one) |
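
One possible in-memory shape for a constraint set, with profile composition and a slider-like intensity control, is sketched below. Everything here is an assumption for illustration: the struct, its field names and units, and the "take the harsher value per axis" composition rule are not taken from the project. The periodic CPU spikes and the memory leak rate from the "Production Spike" example are deliberately left out to keep the sketch short.

```rust
/// Hypothetical in-memory form of one constraint set.
/// Field names and units are assumptions, not the project's schema.
#[derive(Debug, Clone, PartialEq)]
struct ConstraintSet {
    name: String,
    cpu_pct: u8,          // sustained CPU load, 0 to 100
    memory_pct: u8,       // memory in use, 0 to 100
    latency_ms: u32,      // added network latency
    packet_loss_pct: f32, // network packet loss, 0 to 100
    disk_full_pct: u8,    // disk usage, 0 to 100
}

/// The steady-state values from the "Production Spike" example.
fn production_spike() -> ConstraintSet {
    ConstraintSet {
        name: "production-spike".to_string(),
        cpu_pct: 80,
        memory_pct: 70,
        latency_ms: 50,
        packet_loss_pct: 1.0,
        disk_full_pct: 85,
    }
}

impl ConstraintSet {
    /// Combine two profiles by taking the harsher value on every axis.
    fn compose(&self, other: &ConstraintSet, name: &str) -> ConstraintSet {
        ConstraintSet {
            name: name.to_string(),
            cpu_pct: self.cpu_pct.max(other.cpu_pct),
            memory_pct: self.memory_pct.max(other.memory_pct),
            latency_ms: self.latency_ms.max(other.latency_ms),
            packet_loss_pct: self.packet_loss_pct.max(other.packet_loss_pct),
            disk_full_pct: self.disk_full_pct.max(other.disk_full_pct),
        }
    }

    /// Scale every axis by `intensity`, clamped to the range 0.0..=1.0.
    fn scaled(&self, intensity: f32) -> ConstraintSet {
        let k = intensity.clamp(0.0, 1.0);
        let scale = |v: u8| (f32::from(v) * k).round() as u8;
        ConstraintSet {
            name: self.name.clone(),
            cpu_pct: scale(self.cpu_pct),
            memory_pct: scale(self.memory_pct),
            latency_ms: (self.latency_ms as f32 * k).round() as u32,
            packet_loss_pct: self.packet_loss_pct * k,
            disk_full_pct: scale(self.disk_full_pct),
        }
    }
}
```

Taking the maximum per axis is only one way to compose profiles; summing pressures or letting the second profile override the first would be equally defensible, and the YAML loader is out of scope here.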
| 94 | + |
| 95 | +**Estimated: 1 week** |
| 96 | + |
| 97 | +--- |
| 98 | + |
| 99 | +## v0.5 — Real Signature Engine |
| 100 | + |
| 101 | +**Theme: Replace string matching with actual logic programming** |
| 102 | + |
| 103 | +This is the critical path milestone. The current engine pattern-matches |
| 104 | +on stderr strings. A real engine would parse structured crash data and |
| 105 | +run inference over facts using Datalog rules. |
| 106 | + |
| 107 | +- [ ] Integrate Crepe or Datafrog (Rust Datalog engines) |
| 108 | +- [ ] Real fact extraction from crash traces (parse backtraces, not just string contains) |
| 109 | +- [ ] Proper temporal ordering of events |
| 110 | +- [ ] Confidence scoring based on evidence chain strength |
| 111 | +- [ ] User-definable rules (not just hardcoded) |
| 112 | +- [ ] Rule composition (combine rules to detect compound bugs) |
| 113 | +- [ ] Implement remaining signatures: |
| 114 | + - Integer overflow detection |
| 115 | + - Unhandled error detection |
| 116 | + - Resource leak detection (file descriptors, sockets) |
| 117 | + - Goroutine/task leak detection |
| 118 | +- [ ] Evidence chain visualisation in reports |
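
To make "inference over facts using Datalog rules" concrete, here is a hand-rolled sketch of naive bottom-up evaluation for a two-rule program. It does not use the Crepe or Datafrog APIs, and the `waits_for` relation (thread X is blocked on a lock held by thread Y) is a hypothetical fact type chosen for illustration, not one the current engine extracts. The rules are written in Datalog notation in the comments.

```rust
use std::collections::BTreeSet;

/// Rule 1: waits_for(x, z) :- waits_for(x, y), waits_for(y, z).
/// Naive evaluation: re-apply the rule until no new fact is derived.
fn transitive_closure(edges: &BTreeSet<(u32, u32)>) -> BTreeSet<(u32, u32)> {
    let mut known = edges.clone();
    loop {
        let mut derived = Vec::new();
        for &(x, y) in &known {
            for &(y2, z) in &known {
                if y == y2 && !known.contains(&(x, z)) {
                    derived.push((x, z));
                }
            }
        }
        if derived.is_empty() {
            return known; // fixpoint reached
        }
        known.extend(derived);
    }
}

/// Rule 2: deadlocked(x) :- waits_for(x, x).
/// A thread that transitively waits on itself is part of a cycle.
fn deadlocked_threads(waits_for: &BTreeSet<(u32, u32)>) -> BTreeSet<u32> {
    transitive_closure(waits_for)
        .into_iter()
        .filter(|&(x, y)| x == y)
        .map(|(x, _)| x)
        .collect()
}
```

Given the facts `waits_for(1, 2)`, `waits_for(2, 3)`, `waits_for(3, 1)` and `waits_for(4, 1)`, threads 1, 2 and 3 are reported as deadlocked, while thread 4 is merely blocked behind the cycle. A real Datalog engine would evaluate the same rules semi-naively and far more efficiently; the point of the sketch is only the shape of the rules.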
| 119 | + |
| 120 | +**Estimated: 2–3 weeks** |
| 121 | + |
| 122 | +--- |
| 123 | + |
| 124 | +## v0.6 — Multi-Program & Data Testing |
| 125 | + |
| 126 | +**Theme: Test relationships, not just programs** |
| 127 | + |
| 128 | +- [ ] Test program A under stress while program B is also stressed |
| 129 | +- [ ] Test program + its data (corrupt inputs, malformed configs) |
| 130 | +- [ ] Corpus-based testing (known-bad inputs per framework type) |
| 131 | +- [ ] Dependency chain analysis (what happens when a dependency fails) |
| 132 | +- [ ] Inter-process communication testing (what happens when IPC degrades) |
| 133 | +- [ ] Database + server combined stress (realistic service topology) |
| 134 | + |
| 135 | +**Estimated: 1–2 weeks** |
| 136 | + |
| 137 | +--- |
| 138 | + |
| 139 | +## v0.7 — Language Coverage Expansion |
| 140 | + |
| 141 | +**Theme: Support more than just Rust well** |
| 142 | + |
| 143 | +Each language gets a dedicated analyzer with language-specific |
| 144 | +vulnerability patterns and framework detection. |
| 145 | + |
| 146 | +- [ ] Deeper C/C++ analysis |
| 147 | + - AddressSanitizer integration |
| 148 | + - Valgrind integration |
| 149 | + - malloc/free pair tracking |
| 150 | +- [ ] Java/JVM analysis |
| 151 | + - Heap dump analysis |
| 152 | + - Thread dump parsing |
| 153 | + - GC pressure simulation |
| 154 | +- [ ] JavaScript/Node analysis |
| 155 | + - Event loop starvation detection |
| 156 | + - Memory leak patterns (closures, listeners) |
| 157 | + - Promise rejection handling |
| 158 | +- [ ] Erlang/BEAM analysis |
| 159 | + - Process mailbox overflow |
| 160 | + - Supervisor tree stress testing |
| 161 | + - ETS table pressure |
| 162 | +- [ ] Chapel analysis |
| 163 | + - Locale distribution imbalance |
| 164 | + - Task spawning overhead |
| 165 | + - Data distribution skew |
| 166 | + - Parallel proof search scaling |
| 167 | +- [ ] eclexia-specific analysis |
| 168 | + - Resource constraint behaviour under stress |
| 169 | + - Adaptive function degradation |
| 170 | + - `@solution` block fallback testing |
| 171 | +- [ ] Julia analysis |
| 172 | + - Type instability detection |
| 173 | + - GC pressure simulation |
| 174 | + - ccall FFI boundary testing |
| 175 | +- [ ] Non-UTF-8 source file support (Latin-1, Shift-JIS, etc.) |
| 176 | + - Detect encoding from BOM or heuristics |
| 177 | + - Transcode to UTF-8 before analysis |
| 178 | + |
| 179 | +**Estimated: 2–3 weeks** (each language ~2–3 days) |
| 180 | + |
| 181 | +--- |
| 182 | + |
| 183 | +## v0.8 — Reporting & CI/CD Integration |
| 184 | + |
| 185 | +**Theme: Make it useful in real workflows** |
| 186 | + |
| 187 | +- [ ] HTML report output (not just terminal + JSON) |
| 188 | +- [ ] Trend tracking (compare runs over time, detect regressions) |
| 189 | +- [ ] GitHub Actions integration (run panic-attacker in CI) |
| 190 | +- [ ] Exit codes that CI can act on (fail build if robustness < threshold) |
| 191 | +- [ ] SARIF output for GitHub Security tab integration |
| 192 | +- [ ] Baseline support (suppress known issues, alert on new ones) |
| 193 | +- [ ] Comparative reports (diff two X-Ray runs) |
| 194 | +- [ ] Badge generation (robustness score badge for README) |
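
The exit-code and baseline items above could combine along these lines. This is a sketch under stated assumptions: the function names, the string identifiers for weak points, the score scale, and the specific exit codes (0 pass, 1 gate failed, 2 invalid input) are all hypothetical choices, not the tool's current behaviour.

```rust
use std::collections::BTreeSet;

/// Weak points present in `current` but absent from the accepted `baseline`.
/// Identifiers are opaque strings here; a real tool would need stable IDs.
fn new_issues<'a>(current: &'a BTreeSet<String>, baseline: &BTreeSet<String>) -> Vec<&'a str> {
    current
        .iter()
        .filter(|id| !baseline.contains(*id))
        .map(String::as_str)
        .collect()
}

/// Map a run's result to a process exit code that CI can act on.
/// 0 = pass, 1 = score below threshold or new issues found, 2 = invalid input.
fn ci_exit_code(robustness: f64, threshold: f64, new_issue_count: usize) -> u8 {
    if robustness.is_nan() || threshold.is_nan() {
        return 2;
    }
    if robustness < threshold || new_issue_count > 0 {
        1
    } else {
        0
    }
}
```

Returning a plain `u8` keeps the decision testable; `main` can convert it with `std::process::ExitCode::from`. Whether new issues alone should fail a build that still clears the threshold is a policy choice that the sketch hard-codes for brevity.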
| 195 | + |
| 196 | +**Estimated: 1–2 weeks** |
| 197 | + |
| 198 | +--- |
| 199 | + |
| 200 | +## v0.9 — Performance & Polish |
| 201 | + |
| 202 | +**Theme: Make it fast and reliable enough for production use** |
| 203 | + |
| 204 | +- [ ] Parallel X-Ray analysis (rayon for file scanning) |
| 205 | +- [ ] Incremental analysis (only re-scan changed files) |
| 206 | +- [ ] Resource limits on panic-attacker itself (don't crash the host) |
| 207 | +- [ ] Graceful cleanup (kill child processes on SIGINT/SIGTERM) |
| 208 | +- [ ] Config file support (`panic-attacker.toml`) |
| 209 | +- [ ] Shell completions (bash, zsh, fish, nushell) |
| 210 | +- [ ] Man page generation |
| 211 | +- [ ] `--quiet` mode for CI pipelines |
| 212 | +- [ ] Memory-mapped file reading for large codebases |
| 213 | + |
| 214 | +**Estimated: 1 week** |
| 215 | + |
| 216 | +--- |
| 217 | + |
| 218 | +## v1.0 — Production Release |
| 219 | + |
| 220 | +**Theme: Battle-tested and documented** |
| 221 | + |
| 222 | +- [ ] Run against 50+ real-world programs and fix false positives |
| 223 | +- [ ] Comprehensive user guide (not just README) |
| 224 | +- [ ] Published to crates.io |
| 225 | +- [ ] Reproducible builds |
| 226 | +- [ ] SBOM generation |
| 227 | +- [ ] Security audit of panic-attacker itself (eat your own dogfood) |
| 228 | +- [ ] X-Ray panic-attacker with panic-attacker (meta-test) |
| 229 | +- [ ] Stable JSON output schema (versioned, documented) |
| 230 | +- [ ] Minimum supported Rust version (MSRV) policy |
| 231 | + |
| 232 | +**Estimated: 1–2 weeks of hardening** |
| 233 | + |
| 234 | +--- |
| 235 | + |
| 236 | +## Timeline Summary |
| 237 | + |
| 238 | +| Version | Theme | Effort | Cumulative (upper bound) | |
| 239 | +|---|---|---|---| |
| 240 | +| v0.2 | Fix what's broken | 1–2 days | 2 days | |
| 241 | +| v0.3 | Test coverage | 2–3 days | 1 week | |
| 242 | +| v0.4 | Constraint sets | 1 week | 2 weeks | |
| 243 | +| v0.5 | Real signature engine (Datalog) | 2–3 weeks | 5 weeks | |
| 244 | +| v0.6 | Multi-program testing | 1–2 weeks | 7 weeks | |
| 245 | +| v0.7 | Language expansion | 2–3 weeks | 10 weeks | |
| 246 | +| v0.8 | Reporting & CI/CD | 1–2 weeks | 12 weeks | |
| 247 | +| v0.9 | Performance & polish | 1 week | 13 weeks | |
| 248 | +| v1.0 | Production release | 1–2 weeks | ~15 weeks | |
| 249 | + |
| 250 | +**Roughly 4 months of focused work** (about 15 weeks at the upper-bound estimates) from v0.1 to v1.0. |
| 251 | + |
| 252 | +--- |
| 253 | + |
| 254 | +## Critical Path |
| 255 | + |
| 256 | +**v0.5 (Real Signature Engine)** is the make-or-break milestone. Everything |
| 257 | +else is incremental improvement, but the signature engine is what separates |
| 258 | +panic-attacker from "a fancy grep + stress test script." If the logic |
| 259 | +programming works well, the tool is genuinely novel. |
| 260 | + |
| 261 | +**v0.4 (Constraint Sets)** is the second most important — it's the feature |
| 262 | +that makes panic-attacker *composable* rather than just a list of |
| 263 | +individual stress tests. |
| 264 | + |
| 265 | +--- |
| 266 | + |
| 267 | +## Post v1.0 |
| 268 | + |
| 269 | +See [VISION.md](VISION.md) for the long-range roadmap: |
| 270 | + |
| 271 | +- **v1.5** — Generic constraint modelling (not software-specific) |
| 272 | +- **v2.0** — Sensor/actuator integration |
| 273 | +- **v2.5** — Physical system modelling |
| 274 | +- **v3.0** — Digital twin stress testing |
| 275 | + |
| 276 | +### Separate Products (Informed by panic-attacker) |
| 277 | + |
| 278 | +- **Resource Topology Simulator** — Cisco-like GUI for resource flow design |
| 279 | +- **Software Fuse Framework** — Rust library for building fuse components |
| 280 | +- **eclexia Profiler** — eclexia-specific stress testing integration |
| 281 | +- **Safety Priority Scheduler** — Production daemon for resource management |
| 282 | + |
| 283 | +--- |
| 284 | + |
| 285 | +## Authors |
| 286 | + |
| 287 | +- **Concept & Design:** Jonathan D.A. Jewell |
| 288 | +- **Initial Implementation:** Claude (Anthropic) + Jonathan D.A. Jewell |
| 289 | +- **Date:** 2026-02-07 |