Commit 9033f8a

Jonathan D.A. Jewell and claude committed
fix: handle non-UTF-8 source files, add ROADMAP.md
X-Ray crashed on echidna's vendored HOL/examples/muddy/muddyC/muddy.c which contains a single ISO-8859-1 byte (0xf8 = ø in "Jørn"). Now gracefully skips unreadable files instead of aborting the entire scan.

ROADMAP.md documents the v0.1→v1.0 path across 9 milestones (~15 weeks).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent b53b729 commit 9033f8a

2 files changed

Lines changed: 300 additions & 3 deletions

ROADMAP.md

Lines changed: 289 additions & 0 deletions
@@ -0,0 +1,289 @@
# SPDX-License-Identifier: PMPL-1.0-or-later

# panic-attacker Roadmap

## Current State: v0.1.0

**2,359 lines of Rust.** Functional proof of concept, not scaffolding.

| Component | Status | Notes |
|---|---|---|
| X-Ray static analysis | Working | 5 language-specific analyzers |
| Attack executor (6 axes) | Working | CPU, memory, disk, network, concurrency, time |
| Signature detection | Working | Simplified (string matching, not real Datalog) |
| Report generation | Working | Robustness scoring, coloured terminal output, JSON |
| CLI (4 commands) | Working | xray, attack, assault, analyze |
| Pattern library | Defined but unused | Not wired into attack selection |
| Tests | 2 unit tests | Effectively untested |
| Constraint sets | Not started | |
| Multi-program testing | Not started | |

### Tested Against

- **eclexia** (23,295 lines Rust): 66 weak points, 0 unsafe blocks, high unwrap density
- **echidna** (60,248 lines Rust): 271 weak points, 15 unsafe blocks, 4 frameworks detected

---

## v0.2 — Fix What's Broken

**Theme: Make the existing output trustworthy**

- [ ] Fix X-Ray duplicate entries (running counts → per-file delta counts)
- [ ] Add file paths to weak point `location` field (currently all `null`)
- [ ] Wire pattern library into attack selection (currently defined but unused)
- [ ] Connect `RuleSet` to signature engine (currently stored but ignored)
- [ ] Handle non-UTF-8 source files gracefully (skip with warning, not crash)
  - Already partially fixed: `fs::read_to_string` failures now `continue`
  - Improvement: attempt Latin-1/ISO-8859-1 fallback before skipping
  - Improvement: log skipped files in verbose mode
  - Root cause: vendored third-party C files with non-ASCII author names
    (e.g. `Jørn` encoded as the single ISO-8859-1 byte `0xf8` instead of the UTF-8 bytes `0xc3 0xb8`)
- [ ] Fix the 10 compiler warnings (dead code)
- [ ] Add integration test using `examples/vulnerable_program.rs`
- [ ] Per-file breakdown in verbose output (which files contribute most weak points)

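The Latin-1 fallback and verbose logging items above could be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in `src/xray/analyzer.rs`; the names `decode_source` and `read_source_lossy` and the `verbose` flag are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Decode raw bytes as UTF-8 when valid, otherwise as ISO-8859-1 (Latin-1).
/// The returned flag is true when the fallback was used.
pub fn decode_source(bytes: Vec<u8>) -> (String, bool) {
    match String::from_utf8(bytes) {
        Ok(text) => (text, false),
        Err(err) => {
            // Each Latin-1 byte has the same value as its Unicode code point,
            // so this conversion cannot fail.
            let text: String = err.into_bytes().into_iter().map(char::from).collect();
            (text, true)
        }
    }
}

/// Hypothetical helper: read a file, fall back to Latin-1, and report the
/// fallback on stderr when `verbose` is set.
pub fn read_source_lossy(path: &Path, verbose: bool) -> io::Result<String> {
    let (text, used_fallback) = decode_source(fs::read(path)?);
    if used_fallback && verbose {
        eprintln!(
            "warning: {} is not valid UTF-8; decoded as ISO-8859-1",
            path.display()
        );
    }
    Ok(text)
}
```

Because the Latin-1 branch accepts any byte sequence, a real binary artifact would also decode "successfully" into noise. A production version would probably need an extra check (for example, rejecting files that contain NUL bytes) before analyzing the result.
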
**Estimated: 1–2 days**

---

## v0.3 — Test Coverage

**Theme: Trust the tool enough to use it on real code**

- [ ] Unit tests for X-Ray (each language analyzer: Rust, C/C++, Go, Python, generic)
- [ ] Unit tests for signature engine (each inference rule)
- [ ] Integration tests: X-Ray → Attack → Signature pipeline
- [ ] Test against example vulnerable program (panic, OOM, deadlock, race)
- [ ] Regression test: eclexia baseline (66 weak points, known profile)
- [ ] Regression test: echidna baseline (271 weak points, known profile)
- [ ] CI with GitHub Actions
- [ ] Code coverage reporting

**Estimated: 2–3 days**

---

## v0.4 — Constraint Sets

**Theme: Composable stress profiles**

Constraint sets are named combinations of conditions that simulate real
failure scenarios. Real failures are never one thing — they're the
intersection of multiple pressures.

```yaml
"Production Spike":
  cpu: 80% + periodic 100% spikes
  memory: 70% with 2% leak rate
  network: 50ms latency + 1% loss
  disk: 85% full
```

- [ ] YAML-based constraint set definitions
- [ ] New CLI command: `panic-attacker stress ./program --profile spike.yaml`
- [ ] Multi-axis simultaneous attacks (not just sequential)
- [ ] Built-in profiles:
  - `production-spike` — CPU + memory + network pressure
  - `memory-leak` — gradual memory growth over time
  - `disk-full` — I/O with shrinking free space
  - `network-partition` — intermittent connectivity loss
  - `thundering-herd` — sudden concurrency burst
- [ ] Custom profile authoring with slider-like intensity controls
- [ ] Profile composition (combine two profiles into a new one)

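One way profile composition might work is sketched below. The struct, its field names, and the "harsher value wins" rule are all assumptions made for illustration; the roadmap does not define composition semantics, and YAML parsing is left out so the example needs only the standard library.

```rust
/// Hypothetical in-memory form of a constraint set. The fields loosely mirror
/// the axes in the YAML example above; `None` means the axis is unconstrained.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct ConstraintSet {
    pub name: String,
    pub cpu_percent: Option<u8>,
    pub memory_percent: Option<u8>,
    pub network_latency_ms: Option<u32>,
    pub disk_full_percent: Option<u8>,
}

impl ConstraintSet {
    /// Combine two profiles, keeping the harsher setting on every axis.
    /// `Option` orders `None` below any `Some`, so `max` also carries over an
    /// axis that only one of the two profiles constrains.
    pub fn compose(&self, other: &ConstraintSet) -> ConstraintSet {
        ConstraintSet {
            name: format!("{}+{}", self.name, other.name),
            cpu_percent: self.cpu_percent.max(other.cpu_percent),
            memory_percent: self.memory_percent.max(other.memory_percent),
            network_latency_ms: self.network_latency_ms.max(other.network_latency_ms),
            disk_full_percent: self.disk_full_percent.max(other.disk_full_percent),
        }
    }
}
```

Under this rule, composing a `production-spike` profile (CPU 80, no disk constraint) with a `disk-full` profile (CPU 20, disk 85) yields CPU 80 and disk 85. Other rules, such as summing pressures or letting the second profile override the first, would be equally plausible.
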
**Estimated: 1 week**

---

## v0.5 — Real Signature Engine

**Theme: Replace string matching with actual logic programming**

This is the critical path milestone. The current engine pattern-matches
on stderr strings. A real engine would parse structured crash data and
run inference over facts using Datalog rules.

- [ ] Integrate Crepe or Datafrog (Rust Datalog engines)
- [ ] Real fact extraction from crash traces (parse backtraces, not just string contains)
- [ ] Proper temporal ordering of events
- [ ] Confidence scoring based on evidence chain strength
- [ ] User-definable rules (not just hardcoded)
- [ ] Rule composition (combine rules to detect compound bugs)
- [ ] Implement remaining signatures:
  - Integer overflow detection
  - Unhandled error detection
  - Resource leak detection (file descriptors, sockets)
  - Goroutine/task leak detection
- [ ] Evidence chain visualisation in reports

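To illustrate what "inference over facts" means here, the sketch below evaluates three Datalog-style rules with a hand-rolled fixpoint loop. It does not use the Crepe or Datafrog APIs, and the fact shapes (`holds`, `waits`) and the deadlock rule are hypothetical examples rather than the project's actual signatures.

```rust
use std::collections::BTreeSet;

/// Facts a crash-trace parser might emit (hypothetical shape): (task, lock) pairs.
pub struct Facts {
    pub holds: Vec<(&'static str, &'static str)>,
    pub waits: Vec<(&'static str, &'static str)>,
}

/// Naive bottom-up evaluation of three rules:
///   blocked_by(A, B) :- waits(A, L), holds(B, L), A != B.
///   blocked_by(A, C) :- blocked_by(A, B), blocked_by(B, C).
///   deadlocked(A)    :- blocked_by(A, A).
pub fn deadlocked_tasks(facts: &Facts) -> BTreeSet<&'static str> {
    let mut blocked_by: BTreeSet<(&'static str, &'static str)> = BTreeSet::new();

    // First rule: join `waits` and `holds` on the lock name.
    for &(waiter, wanted) in &facts.waits {
        for &(holder, held) in &facts.holds {
            if wanted == held && waiter != holder {
                blocked_by.insert((waiter, holder));
            }
        }
    }

    // Second rule: apply transitivity until no new fact appears (the fixpoint).
    loop {
        let mut derived = Vec::new();
        for &(a, b) in &blocked_by {
            for &(b2, c) in &blocked_by {
                if b == b2 && !blocked_by.contains(&(a, c)) {
                    derived.push((a, c));
                }
            }
        }
        if derived.is_empty() {
            break;
        }
        blocked_by.extend(derived);
    }

    // Third rule: a task that is transitively blocked by itself sits on a cycle.
    blocked_by
        .iter()
        .filter(|pair| pair.0 == pair.1)
        .map(|&(task, _)| task)
        .collect()
}
```

Given two tasks that each wait on the lock the other holds, plus a third task waiting on one of those locks, only the first two are reported: the third is blocked but not part of the cycle. A dedicated Datalog engine would evaluate the same rules semi-naively and far more efficiently.
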
**Estimated: 2–3 weeks**

---

## v0.6 — Multi-Program & Data Testing

**Theme: Test relationships, not just programs**

- [ ] Test program A under stress while program B is also stressed
- [ ] Test program + its data (corrupt inputs, malformed configs)
- [ ] Corpus-based testing (known-bad inputs per framework type)
- [ ] Dependency chain analysis (what happens when a dependency fails)
- [ ] Inter-process communication testing (what happens when IPC degrades)
- [ ] Database + server combined stress (realistic service topology)

**Estimated: 1–2 weeks**

---

## v0.7 — Language Coverage Expansion

**Theme: Support more than just Rust well**

Each language gets a dedicated analyzer with language-specific
vulnerability patterns and framework detection.

- [ ] Deeper C/C++ analysis
  - AddressSanitizer integration
  - Valgrind integration
  - malloc/free pair tracking
- [ ] Java/JVM analysis
  - Heap dump analysis
  - Thread dump parsing
  - GC pressure simulation
- [ ] JavaScript/Node analysis
  - Event loop starvation detection
  - Memory leak patterns (closures, listeners)
  - Promise rejection handling
- [ ] Erlang/BEAM analysis
  - Process mailbox overflow
  - Supervisor tree stress testing
  - ETS table pressure
- [ ] Chapel analysis
  - Locale distribution imbalance
  - Task spawning overhead
  - Data distribution skew
  - Parallel proof search scaling
- [ ] eclexia-specific analysis
  - Resource constraint behaviour under stress
  - Adaptive function degradation
  - `@solution` block fallback testing
- [ ] Julia analysis
  - Type instability detection
  - GC pressure simulation
  - ccall FFI boundary testing
- [ ] Non-UTF-8 source file support (Latin-1, Shift-JIS, etc.)
  - Detect encoding from BOM or heuristics
  - Transcode to UTF-8 before analysis

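The BOM half of "detect encoding from BOM or heuristics" is small enough to sketch with the standard library alone. The names `BomEncoding`, `detect_bom`, and `decode_utf16` are hypothetical. Latin-1 and Shift-JIS carry no BOM, so they would still need the heuristic path, which is not shown.

```rust
/// Encodings that can be identified from a leading byte-order mark.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BomEncoding {
    Utf8,
    Utf16Le,
    Utf16Be,
}

/// Return the encoding named by a leading BOM together with the BOM's length
/// in bytes, or `None` when the input has no recognised BOM.
/// Caveat: a UTF-32 LE BOM (FF FE 00 00) also starts with FF FE and would be
/// reported as UTF-16 LE by this simplified check.
pub fn detect_bom(bytes: &[u8]) -> Option<(BomEncoding, usize)> {
    if bytes.starts_with(&[0xEF, 0xBB, 0xBF]) {
        Some((BomEncoding::Utf8, 3))
    } else if bytes.starts_with(&[0xFF, 0xFE]) {
        Some((BomEncoding::Utf16Le, 2))
    } else if bytes.starts_with(&[0xFE, 0xFF]) {
        Some((BomEncoding::Utf16Be, 2))
    } else {
        None
    }
}

/// Transcode a UTF-16 payload (BOM already stripped) to a UTF-8 `String`.
/// Returns `None` for an odd byte count or an invalid surrogate sequence.
pub fn decode_utf16(bytes: &[u8], little_endian: bool) -> Option<String> {
    if bytes.len() % 2 != 0 {
        return None;
    }
    let units: Vec<u16> = bytes
        .chunks_exact(2)
        .map(|pair| {
            let raw = [pair[0], pair[1]];
            if little_endian {
                u16::from_le_bytes(raw)
            } else {
                u16::from_be_bytes(raw)
            }
        })
        .collect();
    String::from_utf16(&units).ok()
}
```

A caller would slice off the reported BOM length before decoding, for example `decode_utf16(&bytes[2..], true)` after `detect_bom` returns `Utf16Le`.
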
**Estimated: 2–3 weeks** (each language ~2–3 days)

---

## v0.8 — Reporting & CI/CD Integration

**Theme: Make it useful in real workflows**

- [ ] HTML report output (not just terminal + JSON)
- [ ] Trend tracking (compare runs over time, detect regressions)
- [ ] GitHub Actions integration (run panic-attacker in CI)
- [ ] Exit codes that CI can act on (fail build if robustness < threshold)
- [ ] SARIF output for GitHub Security tab integration
- [ ] Baseline support (suppress known issues, alert on new ones)
- [ ] Comparative reports (diff two X-Ray runs)
- [ ] Badge generation (robustness score badge for README)

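The exit-code item above amounts to a small, testable mapping from score to status. The sketch below assumes a robustness score on a 0 to 100 scale and three status values; neither the scale nor the specific codes come from the roadmap, and the function name is hypothetical.

```rust
/// Hypothetical exit statuses for CI consumption.
pub const EXIT_OK: u8 = 0;
pub const EXIT_BELOW_THRESHOLD: u8 = 1;
pub const EXIT_INVALID_INPUT: u8 = 2;

/// Map a robustness score and a required threshold to a process exit status.
/// Both values are assumed to lie in 0.0..=100.0; anything else, including
/// NaN, is treated as invalid input rather than as a pass or a fail.
pub fn exit_status(score: f64, threshold: f64) -> u8 {
    let valid = 0.0..=100.0;
    if !valid.contains(&score) || !valid.contains(&threshold) {
        EXIT_INVALID_INPUT
    } else if score < threshold {
        EXIT_BELOW_THRESHOLD
    } else {
        EXIT_OK
    }
}
```

In a binary, `main` could return `std::process::ExitCode::from(exit_status(score, threshold))` so that a CI step fails whenever the status is non-zero. Keeping "invalid input" distinct from "below threshold" lets a pipeline tell a misconfigured run apart from a genuine regression.
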
**Estimated: 1–2 weeks**

---

## v0.9 — Performance & Polish

**Theme: Make it fast and reliable enough for production use**

- [ ] Parallel X-Ray analysis (rayon for file scanning)
- [ ] Incremental analysis (only re-scan changed files)
- [ ] Resource limits on panic-attacker itself (don't crash the host)
- [ ] Graceful cleanup (kill child processes on SIGINT/SIGTERM)
- [ ] Config file support (`panic-attacker.toml`)
- [ ] Shell completions (bash, zsh, fish, nushell)
- [ ] Man page generation
- [ ] `--quiet` mode for CI pipelines
- [ ] Memory-mapped file reading for large codebases

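The roadmap names rayon for parallel file scanning. To keep the example free of external crates, the sketch below uses `std::thread::scope` instead, which shows the same split-scan-merge shape. The function name and the `workers` parameter are hypothetical, and line counting stands in for the real per-file analysis.

```rust
use std::fs;
use std::path::PathBuf;
use std::thread;

/// Count lines across `files` using up to `workers` scoped threads.
/// Files that cannot be read as UTF-8 are skipped, as in the sequential scan.
pub fn total_lines_parallel(files: &[PathBuf], workers: usize) -> usize {
    let workers = workers.max(1);
    // Ceiling division; `chunks` panics on a size of zero, hence the `max(1)`.
    let chunk_size = ((files.len() + workers - 1) / workers).max(1);

    thread::scope(|scope| {
        // Spawn one thread per chunk; each returns its partial line count.
        let handles: Vec<_> = files
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk
                        .iter()
                        .filter_map(|path| fs::read_to_string(path).ok())
                        .map(|text| text.lines().count())
                        .sum::<usize>()
                })
            })
            .collect();

        // Merge the partial counts once every worker has finished.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("scan worker panicked"))
            .sum::<usize>()
    })
}
```

Static chunking like this balances poorly when file sizes vary widely. Work stealing, which rayon provides, is one reason to prefer it for the real implementation.
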
**Estimated: 1 week**

---

## v1.0 — Production Release

**Theme: Battle-tested and documented**

- [ ] Run against 50+ real-world programs and fix false positives
- [ ] Comprehensive user guide (not just README)
- [ ] Published to crates.io
- [ ] Reproducible builds
- [ ] SBOM generation
- [ ] Security audit of panic-attacker itself (eat your own dogfood)
- [ ] X-Ray panic-attacker with panic-attacker (meta-test)
- [ ] Stable JSON output schema (versioned, documented)
- [ ] Minimum supported Rust version (MSRV) policy

**Estimated: 1–2 weeks of hardening**

---

## Timeline Summary

| Version | Theme | Effort | Cumulative |
|---|---|---|---|
| v0.2 | Fix what's broken | 1–2 days | 2 days |
| v0.3 | Test coverage | 2–3 days | 1 week |
| v0.4 | Constraint sets | 1 week | 2 weeks |
| v0.5 | Real Datalog engine | 2–3 weeks | 5 weeks |
| v0.6 | Multi-program testing | 1–2 weeks | 7 weeks |
| v0.7 | Language expansion | 2–3 weeks | 10 weeks |
| v0.8 | Reporting & CI/CD | 1–2 weeks | 12 weeks |
| v0.9 | Performance & polish | 1 week | 13 weeks |
| v1.0 | Production release | 1–2 weeks | ~15 weeks |

**Roughly 4 months of focused work** from v0.1 to v1.0.

---

## Critical Path

**v0.5 (Real Datalog Engine)** is the make-or-break milestone. Everything
else is incremental improvement, but the signature engine is what separates
panic-attacker from "a fancy grep + stress test script." If the logic
programming works well, the tool is genuinely novel.

**v0.4 (Constraint Sets)** is the second most important — it's the feature
that makes panic-attacker *composable* rather than just a list of
individual stress tests.

---

## Post v1.0

See [VISION.md](VISION.md) for the long-range roadmap:

- **v1.5** — Generic constraint modelling (not software-specific)
- **v2.0** — Sensor/actuator integration
- **v2.5** — Physical system modelling
- **v3.0** — Digital twin stress testing

### Separate Products (Informed by panic-attacker)

- **Resource Topology Simulator** — Cisco-like GUI for resource flow design
- **Software Fuse Framework** — Rust library for building fuse components
- **eclexia Profiler** — eclexia-specific stress testing integration
- **Safety Priority Scheduler** — Production daemon for resource management

---

## Authors

- **Concept & Design:** Jonathan D.A. Jewell
- **Initial Implementation:** Claude (Anthropic) + Jonathan D.A. Jewell
- **Date:** 2026-02-07

src/xray/analyzer.rs

Lines changed: 11 additions & 3 deletions
@@ -49,8 +49,13 @@ impl Analyzer {
         let files = self.collect_source_files()?;
 
         for file in &files {
-            let content = fs::read_to_string(file)
-                .with_context(|| format!("Failed to read {}", file.display()))?;
+            let content = match fs::read_to_string(file) {
+                Ok(c) => c,
+                Err(_) => {
+                    // Skip non-UTF-8 or unreadable files (binary artifacts, etc.)
+                    continue;
+                }
+            };
 
             // Update statistics
             statistics.total_lines += content.lines().count();
@@ -329,7 +334,10 @@ impl Analyzer {
         let mut frameworks = HashSet::new();
 
         for file in files {
-            let content = fs::read_to_string(file)?;
+            let content = match fs::read_to_string(file) {
+                Ok(c) => c,
+                Err(_) => continue,
+            };
 
             // Web servers
             if content.contains("actix_web")
