[BUG] --recursive --format json emits per-skill summary, drops the full per-issue schema (issues[], components[], analysis_completeness) #228

Description

@ThePlenkov

What I'm seeing

When I run skillspector scan --recursive --format json, the output document is a per-skill summary — not a concatenated version of the single-skill JSON document. Specifically, the per-skill entries only carry:

  • name
  • path
  • risk_score
  • risk_severity
  • finding_count

…and drop every per-issue and document-level field that single-skill --format json exposes (and that --format sarif consequently can't carry either):

  • id, category, severity, confidence
  • location.file / start_line / end_line
  • explanation, remediation, code_snippet, intent, pattern, finding
  • tags (OWASP / MITRE / CWE / Agent Snooping / MCP Least Privilege)
  • analysis_completeness, components[], metadata, risk_assessment, suppressed[], suppressed_count

The multi-skill top-level document also differs structurally: {"max_risk_score", "multi_skill", "skill_count", "skills": [...]} instead of one document per skill.

Reproduction (skillspector 2.3.7)

skillspector scan ./skill-collection/ --no-llm --recursive --format json --out report.json

jq '.skills[0]' report.json
# → { "name", "path", "risk_score", "risk_severity", "finding_count" }
#   (no issues[], no components, no analysis_completeness)

skillspector scan ./skill-collection/act/ --no-llm --format json | jq 'keys'
# → [ "analysis_completeness", "components", "issues", "metadata",
#      "risk_assessment", "skill", "suppressed", "suppressed_count" ]

The act/ skill is part of the same collection, and both runs use the same --no-llm and --format json, but the recursive run loses the entire per-issue payload.
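To make the gap checkable in a script, here is a minimal Python sketch that reports which single-skill top-level keys are absent from each skills[] entry of a recursive report. The key set is copied from the jq 'keys' output above; the function name missing_keys is hypothetical and not part of skillspector.

```python
# Top-level keys of the single-skill document, copied from the
# `jq 'keys'` output in the reproduction.
SINGLE_SKILL_KEYS = {
    "analysis_completeness", "components", "issues", "metadata",
    "risk_assessment", "skill", "suppressed", "suppressed_count",
}


def missing_keys(recursive_doc: dict) -> dict[str, list[str]]:
    """Map each skills[] entry's name to the single-skill keys it lacks.

    `missing_keys` is a hypothetical helper, not a skillspector API.
    """
    report = {}
    for entry in recursive_doc.get("skills", []):
        missing = sorted(SINGLE_SKILL_KEYS - set(entry))
        if missing:
            report[entry.get("name", "<unnamed>")] = missing
    return report
```

Loading report.json with json.load and passing it to missing_keys should list all eight keys for every skill while the bug is present, and return an empty dict once each entry embeds them at its top level.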

Why it matters

This is the obvious automation path: scan every skill in a repo at once. Anyone trying to feed recursive output into a downstream tool (CodeQL, Snyk, custom CI, SARIF converters, GitHub PR annotations) gets only finding_count and can't surface remediation, code snippets, OWASP tags, or confidence — which is the actual signal that makes findings actionable.

In our case we had to drop --recursive and run single-skill scans in a loop just to keep the per-issue fields. That produces correct output, but it costs ~43× the subprocess overhead and is not how a "scan everything" flag should behave.
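For anyone who needs the same workaround, the loop looks roughly like the Python sketch below. It rests on two assumptions that are not confirmed by skillspector itself: every immediate subdirectory of the collection is one skill, and the JSON document is written to stdout when --out is omitted (as the jq pipe in the reproduction suggests). The exit code is deliberately not checked, since I don't know whether the CLI exits non-zero when it reports findings. The helper names scan_command and scan_each are hypothetical.

```python
import json
import subprocess
from pathlib import Path


def scan_command(skill_dir: Path) -> list[str]:
    # Same flags as the single-skill reproduction in this issue.
    return ["skillspector", "scan", str(skill_dir), "--no-llm", "--format", "json"]


def scan_each(collection: Path, run=subprocess.run) -> list[dict]:
    """Run one single-skill scan per subdirectory and collect the full documents.

    Assumption: each immediate subdirectory of `collection` is one skill.
    `run` is injectable so the loop can be exercised without the CLI installed.
    """
    documents = []
    for skill_dir in sorted(p for p in collection.iterdir() if p.is_dir()):
        # One subprocess per skill: this is the overhead --recursive should avoid.
        proc = run(scan_command(skill_dir), capture_output=True, text=True)
        documents.append(json.loads(proc.stdout))
    return documents
```

Each element of the returned list is a full single-skill document, so issues[], components[] and analysis_completeness are all preserved.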

What I think the fix is

When --recursive --format json is requested, emit a document shaped like {"results": [<full single-skill document #1>, <full single-skill document #2>, …]} (or the existing skills[] shape but with each entry being a full single-skill document embedded under a key). That way recursive output is strictly a superset of what single-skill output carries, and downstream tools can iterate results[] the same way regardless of which mode produced the document.
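As an illustration of the first proposed shape, here is a minimal Python sketch of the wrapper document and of a consumer that iterates issues the same way in both modes. Only the "results" key and the single-skill issues[] array come from this issue; the helper names wrap_results and iter_issues are hypothetical, and treating a single-skill document as a one-element results list is my assumption about how a consumer would unify the two modes.

```python
from typing import Iterator


def wrap_results(documents: list[dict]) -> dict:
    """Build the proposed recursive document from full single-skill documents."""
    return {"results": list(documents)}


def iter_issues(document: dict) -> Iterator[tuple[dict, dict]]:
    """Yield (skill_document, issue) pairs from either output mode."""
    # Assumption: a document without "results" is a single-skill document,
    # so it is handled as a one-element results list.
    skill_docs = document["results"] if "results" in document else [document]
    for skill_doc in skill_docs:
        for issue in skill_doc.get("issues", []):
            yield skill_doc, issue
```

With this shape a SARIF converter or PR annotator needs a single code path: every issue arrives with its id, severity, location and remediation intact, plus the skill document it belongs to.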
