Date: 2026-02-08 Author: Jonathan D.A. Jewell j.d.a.jewell@open.ac.uk Context: panic-attack v1.0.0, scanning Eclexia (10 crates, ~20,000 lines of Rust)
Scanning the Eclexia compiler toolchain with panic-attack assail exposed
several categories of false positives, misdetections, and missing features
that reduce the tool's signal-to-noise ratio on well-engineered Rust
codebases. This document captures the specific observations and proposes
concrete improvements, prioritised for implementation.
The core finding is that panic-attack currently treats all code uniformly --
test code is scored the same as production code, safe unwrap variants are
counted alongside unsafe ones, and severity ratings do not account for
language-specific safety guarantees. For a codebase like Eclexia with zero
unsafe blocks, zero production unwraps, and comprehensive error handling via
Result<_, RuntimeError>, the current output overstates risk significantly.
These improvements would benefit any Rust workspace scan, not just Eclexia.
The following observations were made during the Eclexia scan session:
-
Eclexia profile: 10 crates, ~20,000 lines of Rust, 0 unsafe blocks, 0 production unwrap calls, full
Result-based error handling with customRuntimeErrortype and?propagation throughout. -
False positive rate: The
builtins.rsmodule was flagged with 96 unwrap calls and 13 panic sites, all of which reside exclusively within#[cfg(test)]modules. This produced a "Medium" severity finding for a file with zero production panic paths. -
Framework misdetection: panic-attack reported "WebServer" as the detected framework for pure compiler crates that perform zero I/O operations (no network, no HTTP, no filesystem serving).
-
Safe unwrap variants counted as risks: Calls to
.unwrap_or(1),.unwrap_or_default(), and.unwrap_or_else(|| ...)were counted toward theunwrap_callsmetric. These are safe patterns that cannot panic and should not contribute to risk scoring. -
Severity not calibrated for Rust safety guarantees: A crate with 0 unsafe blocks, 0 production unwraps, and comprehensive
Resultpropagation still received non-trivial risk scores due to allocation site counts alone. -
No workspace-level consolidation: Scanning 10 crates individually produced 10 separate reports with no aggregate view of workspace health.
-
No differential capability: There is no way to compare a scan before and after a refactoring to show what improved or regressed.
-
Allocation sites lack context: The report stated "73 allocation sites" without distinguishing between
Vec::new()(safe, bounded) and user-controlled allocations (potentially unbounded).
Priority: HIGH
Problem: panic-attack counts unwrap() and panic!() calls inside
#[cfg(test)] modules, #[test] functions, and files in tests/
directories as production panic paths. This produces false positives for
any well-tested codebase. In the Eclexia scan, 100% of the unwrap and
panic findings in builtins.rs (96 unwrap calls, 13 panic sites) were
test-only code, yet the file received a "Medium" severity rating.
Proposed solution:
- Parse
#[cfg(test)]module boundaries and#[test]function boundaries in the Rust analyzer. - Exclude files matching
tests/**,test_*.rs,*_test.rspatterns. - Report test code metrics separately (e.g.,
test_unwrap_calls) so they remain visible but do not affect severity scoring. - Add a
--include-test-codeflag for users who want the current behaviour.
Impact: Eliminates the single largest source of false positives on well-tested Rust codebases.
Priority: HIGH
Problem: panic-attack reports "WebServer" as the detected framework for pure compiler crates with zero I/O operations. This is a misdetection that undermines trust in the tool's output. The Eclexia crates have no HTTP dependencies, no server bindings, and no network code.
Proposed solution:
- Revise framework detection heuristics to require evidence of actual I/O patterns (e.g., TCP listeners, HTTP handlers, route definitions), not just crate structure.
- Add a "None" or "CLI" framework classification for projects that do not match any framework pattern.
- Consider inspecting
Cargo.tomldependencies as a signal: presence ofactix-web,axum,warp,rocket, orhyperstrongly indicates a web server; absence of these with presence ofclap,structopt, or a[[bin]]target suggests a CLI tool. - For library crates with no binary target and no I/O dependencies, default to "Library" rather than guessing a framework.
Impact: Prevents misleading framework labels that erode confidence in the overall report.
Priority: HIGH
Problem: The Rust analyzer counts .unwrap_or(value),
.unwrap_or_default(), and .unwrap_or_else(|| ...) toward the
unwrap_calls metric. These methods are safe alternatives to .unwrap()
that cannot panic -- they provide fallback values instead. Counting them as
risks conflates safe error handling with unsafe panic paths.
Proposed solution:
- Distinguish between panic-capable unwrap calls (
.unwrap(),.expect("...")) and safe variants (.unwrap_or(...),.unwrap_or_default(),.unwrap_or_else(|| ...)). - Report safe variants in a separate metric (e.g.,
safe_unwrap_calls) that does not contribute to severity scoring. - Optionally report the ratio of safe-to-unsafe unwraps as a code quality signal: a high ratio indicates disciplined error handling.
Impact: Improves accuracy of the core unwrap metric, which is one of the primary signals in Rust analysis.
Priority: MEDIUM
Problem: For Rust codebases, 0 unsafe blocks combined with 0 production unwraps represents an exceptionally strong safety posture. However, panic-attack still assigns non-trivial risk scores based on allocation site counts alone. A crate with zero ways to reach a panic path should be classified as "clean" or "hardened," not flagged with residual risk.
Proposed solution:
- Introduce a "hardened" or "clean" severity tier for codebases that meet language-specific safety criteria.
- For Rust, the criteria would be: 0 unsafe blocks, 0 production
.unwrap()/.expect()calls (excluding test code), and presence ofResult-based error handling with?propagation. - Allocation sites alone should not elevate severity. They should be reported as informational context, not risk indicators, unless combined with unsafe blocks or unbounded allocation patterns.
- Allow language-specific scoring profiles so that a Rust crate with no unsafe code is evaluated differently from a C program with manual memory management.
Impact: Reduces false severity inflation on well-engineered Rust codebases and makes the severity metric more meaningful across languages.
Priority: MEDIUM
Problem: When scanning a Cargo workspace with 10 crates, panic-attack produces 10 independent reports with no aggregate view. Users must manually collate results to understand workspace-level health. There is no way to identify which crate contributes the most risk or how the workspace compares overall to other projects.
Proposed solution:
- Detect Cargo workspaces (presence of
[workspace]in rootCargo.toml) and automatically scan all member crates. - Produce a consolidated report with:
- Per-crate breakdowns (individual scores and findings).
- Workspace-level totals (aggregate weak points, overall severity).
- A "top offenders" summary listing the crates with the highest risk.
- Shared dependency analysis (which dependencies appear across crates).
- Support
--workspaceflag for explicit workspace scanning. - In JSON output, nest per-crate results under a
workspaceobject with metadata about the workspace itself.
Impact: Enables meaningful assessment of multi-crate projects, which represent the majority of non-trivial Rust codebases.
Priority: MEDIUM
Problem: There is no way to compare two scans to show what improved or regressed between them. This limits the tool's usefulness in CI pipelines, where the primary question is "did this change make things better or worse?"
Proposed solution:
- Add a
panic-attack diff <baseline.json> <current.json>subcommand that compares two scan results. - Output should show:
- New findings (present in current but not baseline).
- Resolved findings (present in baseline but not current).
- Changed severity (findings that moved between severity tiers).
- Net change in weak point count and severity score.
- Support exit codes for CI: exit 0 if no regressions, exit 1 if new findings or severity increases detected.
- Optionally accept a
--baselineflag in theassailcommand to perform the comparison inline:panic-attack assail . --baseline previous.json.
Impact: Makes panic-attack viable as a CI gate that detects regressions without requiring manual report comparison.
Priority: MEDIUM
Problem: Reporting "73 allocation sites" without context is not
actionable. A Vec::new() in a function with bounded iteration is safe. A
Vec::with_capacity(user_input) is a potential denial-of-service vector.
The current report does not distinguish between these cases.
Proposed solution:
- Classify allocation sites into categories:
- Bounded: Allocation size is a compile-time constant or derived from
a bounded source (e.g.,
Vec::with_capacity(256)). - Internally bounded: Allocation depends on internal state that the
program controls (e.g.,
Vec::new()in a loop with a fixed upper bound). - User-controlled: Allocation size depends on external input (e.g.,
Vec::with_capacity(header.length)). - Unknown: Allocation size cannot be statically determined.
- Bounded: Allocation size is a compile-time constant or derived from
a bounded source (e.g.,
- Only flag user-controlled and unknown allocations as potential risk.
- Report bounded allocations as informational.
Impact: Transforms allocation site reporting from noise into a useful signal for identifying actual denial-of-service vectors.
Priority: LOW
Problem: Eclexia is a domain-specific language with first-class resource types (energy, carbon, latency). Programs written in Eclexia manage resource budgets as part of their core semantics. panic-attack has no awareness of these domain-specific resource dimensions and therefore cannot detect unbounded resource consumption patterns in Eclexia programs.
Proposed solution:
- Add an extensible resource dimension system that allows language-specific analyzers to define custom resource types beyond CPU/memory/disk/network.
- For Eclexia specifically, detect:
@resource_constraintblocks and verify they have bounded consumption.@solutionblocks and check that fallback paths exist.- Resource type declarations and flag any that lack upper bounds.
- Implement this as a plugin or analyzer extension rather than hard-coding Eclexia-specific logic into the core tool.
- Generalize the pattern: any language with resource annotations (Rust's
#[must_use], Ada's resource management, etc.) could benefit from similar awareness.
Impact: Extends panic-attack's value proposition to domain-specific languages with resource semantics. Requires language-specific investment but aligns with the long-term vision of universal constraint testing.
Priority: LOW
Problem: panic-attack does not recognize structured error handling
patterns. A crate that consistently uses Result<T, RuntimeError> with
? propagation throughout its public API has a fundamentally different
risk profile from one that uses .unwrap() liberally. The current analysis
treats both the same.
Proposed solution:
- Detect and score the following safe error handling patterns:
Result<T, E>return types on public functions.- Consistent use of
?operator for error propagation. - Custom error types with
Fromimplementations for error conversion. thiserrororanyhowusage patterns.
- Introduce an "error handling maturity" metric:
- Level 0: No structured error handling (raw panics).
- Level 1: Partial
Resultusage with frequent.unwrap(). - Level 2: Consistent
Resultusage with occasional.unwrap(). - Level 3: Full
Resultpropagation, custom error types, no production.unwrap()calls.
- Use this metric as a positive modifier on severity scoring: higher maturity should reduce overall severity.
Impact: Rewards disciplined error handling and produces more accurate risk assessments for Rust codebases that invest in proper error management.
Priority: LOW
Problem: panic-attack produces severity ratings but does not support project-specific pass/fail criteria. Different projects have different standards -- a safety-critical system might require 0 unsafe blocks while a prototype might tolerate higher risk. Without configurable thresholds, CI integration requires external scripting to interpret results.
Proposed solution:
-
Add a
[thresholds]section topanic-attacker.toml:[thresholds] max_unsafe_blocks = 0 max_production_unwraps = 5 max_severity = "low" max_weak_points = 20 require_error_handling_level = 2
-
When thresholds are configured,
panic-attack assailshould produce a pass/fail verdict in addition to the detailed report. -
Exit code 0 for pass, exit code 1 for fail, with clear indication of which thresholds were violated.
-
Support per-crate threshold overrides in workspace mode.
-
Allow threshold inheritance: define workspace-level defaults with per-crate overrides for crates with special requirements.
Impact: Makes panic-attack directly usable as a CI gate without external wrapper scripts, supporting project-specific quality standards.
Improvements 1-4 (test code exclusion, framework detection, unwrap variant
distinction, and severity calibration) are modifications to the existing
Rust analyzer in src/assail/mod.rs. They can be implemented incrementally
without architectural changes.
Improvement 5 (workspace consolidation) requires a new scanning mode in
src/main.rs and a new report aggregation layer, but reuses the existing
per-crate scan logic.
Improvement 6 (differential scanning) is a new subcommand that operates on JSON output files. It is independent of the scan engine and can be implemented at any time.
Improvements 7-9 (allocation context, resource dimensions, error handling patterns) require deeper static analysis capabilities. They should be planned for post-v1.0 milestones.
Improvement 10 (configurable thresholds) extends the existing
panic-attacker.toml configuration and is straightforward to implement
once the metrics it depends on (from improvements 1-4) are accurate.
- Test code exclusion (improvement 1) -- highest impact, lowest effort.
- Safe unwrap variant distinction (improvement 3) -- high impact, low effort, same code area as improvement 1.
- Framework detection accuracy (improvement 2) -- high impact, moderate effort.
- Language-specific severity calibration (improvement 4) -- depends on improvements 1 and 3 being complete.
- Workspace consolidation (improvement 5) -- independent track, can proceed in parallel with 1-4.
- Differential scanning (improvement 6) -- independent track, high CI value.
- Allocation site context (improvement 7) -- post-v1.0.
- Error handling patterns (improvement 9) -- post-v1.0.
- Configurable thresholds (improvement 10) -- depends on 1-4.
- Resource dimension awareness (improvement 8) -- long-term.
This document complements ROADMAP.md and VISION.md. The improvements
described here are specific, actionable findings from real-world usage,
whereas the roadmap covers broader feature milestones. Several items
overlap with planned roadmap work:
- Improvement 5 overlaps with the planned
sweepsubcommand. - Improvement 6 overlaps with v0.8 "Comparative reports (diff two Assail runs)".
- Improvement 10 overlaps with v0.8 "Exit codes that CI can act on."
These improvements should be incorporated into the relevant roadmap milestones rather than tracked separately.
- Analysis and writing: Jonathan D.A. Jewell
- Scan session: 2026-02-08
- Tool version: panic-attack v1.0.0