Commit 159613e

[codex] sbom-diff-and-risk v0.3.0 (#9)
* sbom-diff-and-risk v0.3.0
* Normalize v0.3.0 release files for CI
1 parent 78df03e commit 159613e

56 files changed

Lines changed: 7707 additions & 131 deletions

.github/workflows/sbom-diff-and-risk-ci.yml

Lines changed: 3 additions & 3 deletions
@@ -24,10 +24,10 @@ jobs:
2424
working-directory: tools/sbom-diff-and-risk
2525
steps:
2626
- name: Check out repository
27-
uses: actions/checkout@v6
27+
uses: actions/checkout@v4
2828

2929
- name: Set up Python
30-
uses: actions/setup-python@v6
30+
uses: actions/setup-python@v5
3131
with:
3232
python-version: "3.11"
3333

@@ -57,7 +57,7 @@ jobs:
5757
5858
build-and-attest:
5959
# Keep provenance publication on trusted non-PR runs so consumers verify
60-
# workflow-produced wheel and sdist artifacts from this repository workflow.
60+
# workflow-produced wheel/sdist artifacts from this repository workflow.
6161
if: github.event_name != 'pull_request'
6262
needs: test
6363
runs-on: ubuntu-latest

.github/workflows/sbom-diff-and-risk-code-scanning.yml

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ jobs:
1818
working-directory: tools/sbom-diff-and-risk
1919
steps:
2020
- name: Check out repository
21-
uses: actions/checkout@v6
21+
uses: actions/checkout@v5
2222

2323
- name: Set up Python
2424
uses: actions/setup-python@v6

tools/sbom-diff-and-risk/README.md

Lines changed: 87 additions & 15 deletions
@@ -1,6 +1,6 @@
11
# sbom-diff-and-risk
22

3-
v0.2.0 adds policy-based enforcement, SARIF export, GitHub code scanning integration, and deterministic parser hardening for Python dependency inputs.
3+
v0.3.0 adds opt-in PyPI provenance enrichment, provenance-aware policy and reporting, optional advisory Scorecard signals, and self-provenance verification guidance for workflow-built artifacts.
44

55
`sbom-diff-and-risk` is a local, deterministic CLI for comparing two SBOMs or dependency manifests and producing JSON plus Markdown reports.
66

@@ -156,9 +156,82 @@ sbom-diff-risk compare \
156156
- `--warn-on rule[,rule...]`
157157
- `--strict`
158158
- `--enrich-pypi`
159+
- `--pypi-timeout seconds`
160+
- `--enrich-scorecard`
161+
- `--scorecard-timeout seconds`
159162
- `--source-allowlist pypi.org,files.pythonhosted.org,github.com`
160163

161-
`--enrich-pypi` is reserved for future work and currently returns a clear error.
164+
Offline mode remains the default. No network access occurs unless `--enrich-pypi` or `--enrich-scorecard` is set explicitly.
165+
166+
## Opt-in Provenance Enrichment
167+
168+
PyPI provenance and integrity enrichment is explicit and additive in this PR:
169+
170+
- only Python / PyPI packages are queried
171+
- no hidden network access occurs in default mode
172+
- enrichment results are captured as evidence and summarized in the reports
173+
- per-component `evidence.provenance` records stable lookup fields such as `supported`, `lookup_performed`, and per-file attestation totals
174+
- lack of attestation is treated as unavailable metadata, not as proof of compromise
175+
- policy evaluation can use these signals explicitly when configured
176+
- SARIF stays conservative and only emits selected high-signal provenance policy violations
177+
178+
When enabled, the tool queries PyPI-facing release metadata plus file-level provenance data and records stable evidence fields under component `evidence.provenance`, along with run metadata under `metadata.enrichment` and the top-level trust-signal report fields in the JSON report.
179+
180+
```bash
181+
sbom-diff-risk compare \
182+
--before examples/requirements_before.txt \
183+
--after examples/requirements_after.txt \
184+
--enrich-pypi \
185+
--pypi-timeout 3 \
186+
--out-json outputs/report-enriched.json
187+
```
188+
189+
## Provenance-Aware Reporting
190+
191+
When provenance enrichment is enabled, the reports surface trust signals directly instead of burying them in component evidence:
192+
193+
- JSON includes `provenance_summary`, `attestation_summary`, `enrichment_metadata`, `trust_signal_notes`, and `provenance_policy_impact`
194+
- Markdown includes `Provenance summary`, `Attestation gaps`, `Policy impact for provenance-related rules`, and `Trust signal notes`
195+
- core diff semantics do not change when enrichment is enabled
196+
- SARIF maps only selected high-signal provenance decisions such as `provenance_required`, blocking `missing_attestation`, and blocking `unverified_provenance`
197+
- provenance-related SARIF alerts prefer file-level locations that point to the relevant compared manifest or SBOM input
198+
199+
Routine enrichment outcomes remain JSON and Markdown evidence for review. Non-blocking enrichment facts do not automatically become SARIF alerts.
200+
201+
## Opt-in Scorecard Enrichment
202+
203+
OpenSSF Scorecard enrichment is also explicit and advisory:
204+
205+
- no Scorecard requests are made unless `--enrich-scorecard` is set
206+
- lookups only occur when a component can be mapped to a repository with high confidence from explicit metadata
207+
- repository registry pages and ambiguous URLs are treated as unmapped instead of inferred
208+
- Scorecard results are auxiliary trust signals, not proof of safety
209+
- Scorecard-only SARIF alerts are emitted only when policy explicitly turns a threshold breach into a violation
210+
211+
```bash
212+
sbom-diff-risk compare \
213+
--before examples/cdx_before.json \
214+
--after examples/cdx_after.json \
215+
--enrich-scorecard \
216+
--scorecard-timeout 3 \
217+
--out-json outputs/report-scorecard.json
218+
```
219+
220+
If you want policy gating, make it explicit with a v3 policy such as [policy-scorecard-minimal.yml](D:/OneDrive/Code/scientific-computing-toolkit/tools/sbom-diff-and-risk/examples/policy-scorecard-minimal.yml), which sets `minimum_scorecard_score` and opts into the `scorecard_below_threshold` rule.
221+
222+
Setting `minimum_scorecard_score` alone is advisory metadata for review. It only affects policy outcomes when `scorecard_below_threshold` is configured explicitly in `block_on`, `warn_on`, or `ignore_rules`.
223+
224+
## Self-provenance
225+
226+
This repository also records provenance for `sbom-diff-and-risk` itself by generating GitHub artifact attestations for the wheel and source distribution produced by the `sbom-diff-and-risk-ci` workflow.
227+
228+
- the attested files are the wheel and source distribution built by `python -m build` from `tools/sbom-diff-and-risk`
229+
- the build files are uploaded together as the `sbom-diff-and-risk-dist` workflow artifact
230+
- only trusted non-PR runs publish the attestation
231+
- consumers can verify provenance with GitHub's attestation tooling after downloading one of those artifacts
232+
- this complements the tool's analysis of third-party supply-chain inputs, but it does not replace that analysis
233+
234+
See [docs/self-provenance.md](D:/OneDrive/Code/scientific-computing-toolkit/tools/sbom-diff-and-risk/docs/self-provenance.md) for the exact attested filenames, where the evidence appears in GitHub, and a run-by-run verification flow for consumers.
162235

163236
## Examples
164237

@@ -167,11 +240,15 @@ The [examples/](D:/OneDrive/Code/scientific-computing-toolkit/tools/sbom-diff-an
167240
- before/after inputs for CycloneDX JSON, SPDX JSON, `requirements.txt`, and `pyproject.toml`
168241
- dependency-group examples at `examples/pyproject_groups_before.toml` and `examples/pyproject_groups_after.toml`
169242
- example policies at `examples/policy-minimal.yml` and `examples/policy-strict.yml`
243+
- provenance-aware policy examples at `examples/policy-provenance-minimal.yml` and `examples/policy-provenance-strict.yml`
244+
- a Scorecard-aware policy example at `examples/policy-scorecard-minimal.yml`
170245
- a sample pass JSON report at [sample-report.json](D:/OneDrive/Code/scientific-computing-toolkit/tools/sbom-diff-and-risk/examples/sample-report.json)
171246
- a sample pass Markdown report at [sample-report.md](D:/OneDrive/Code/scientific-computing-toolkit/tools/sbom-diff-and-risk/examples/sample-report.md)
172247
- sample policy-warn reports at [sample-policy-warn-report.json](D:/OneDrive/Code/scientific-computing-toolkit/tools/sbom-diff-and-risk/examples/sample-policy-warn-report.json) and [sample-policy-warn-report.md](D:/OneDrive/Code/scientific-computing-toolkit/tools/sbom-diff-and-risk/examples/sample-policy-warn-report.md)
173248
- sample policy-fail reports at [sample-policy-fail-report.json](D:/OneDrive/Code/scientific-computing-toolkit/tools/sbom-diff-and-risk/examples/sample-policy-fail-report.json) and [sample-policy-fail-report.md](D:/OneDrive/Code/scientific-computing-toolkit/tools/sbom-diff-and-risk/examples/sample-policy-fail-report.md)
174249
- a sample SARIF export at [sample-sarif.sarif](D:/OneDrive/Code/scientific-computing-toolkit/tools/sbom-diff-and-risk/examples/sample-sarif.sarif)
250+
- provenance-aware sample reports at [sample-provenance-report.json](D:/OneDrive/Code/scientific-computing-toolkit/tools/sbom-diff-and-risk/examples/sample-provenance-report.json), [sample-provenance-report.md](D:/OneDrive/Code/scientific-computing-toolkit/tools/sbom-diff-and-risk/examples/sample-provenance-report.md), and [sample-provenance-report.sarif](D:/OneDrive/Code/scientific-computing-toolkit/tools/sbom-diff-and-risk/examples/sample-provenance-report.sarif)
251+
- Scorecard-aware sample reports at [sample-scorecard-report.json](D:/OneDrive/Code/scientific-computing-toolkit/tools/sbom-diff-and-risk/examples/sample-scorecard-report.json), [sample-scorecard-report.md](D:/OneDrive/Code/scientific-computing-toolkit/tools/sbom-diff-and-risk/examples/sample-scorecard-report.md), and [sample-scorecard-report.sarif](D:/OneDrive/Code/scientific-computing-toolkit/tools/sbom-diff-and-risk/examples/sample-scorecard-report.sarif)
175252
- requirements-based sample reports at [sample-requirements-report.json](D:/OneDrive/Code/scientific-computing-toolkit/tools/sbom-diff-and-risk/examples/sample-requirements-report.json) and [sample-requirements-report.md](D:/OneDrive/Code/scientific-computing-toolkit/tools/sbom-diff-and-risk/examples/sample-requirements-report.md)
176253

177254
## Enforcement Mode
@@ -214,9 +291,10 @@ SARIF export is intentionally conservative. The current renderer emits a GitHub-
214291
- `suspicious_source`
215292
- `unknown_license`
216293
- `major_upgrade`
217-
- selected blocking policy results such as `max_added_packages` and `allow_sources`
294+
- selected policy results such as `max_added_packages`, `allow_sources`, `provenance_required`, and blocking provenance violations like `missing_attestation` or `unverified_provenance`
295+
- explicit Scorecard policy violations such as `scorecard_below_threshold`
218296

219-
It does not turn every diff or informational heuristic into a code scanning alert.
297+
It does not turn every enrichment fact, diff, or informational heuristic into a code scanning alert.
220298

221299
```bash
222300
sbom-diff-risk compare \
@@ -228,17 +306,8 @@ sbom-diff-risk compare \
228306

229307
For GitHub code scanning integration guidance and a minimal upload workflow, see [docs/github-code-scanning.md](D:/OneDrive/Code/scientific-computing-toolkit/tools/sbom-diff-and-risk/docs/github-code-scanning.md).
230308

231-
## Self-provenance
232-
233-
This repository also records provenance for `sbom-diff-and-risk` itself by generating GitHub artifact attestations for the wheel and source distribution produced by the `sbom-diff-and-risk-ci` workflow.
309+
For details on how this repository attests the tool's own wheel and source distribution artifacts, see [docs/self-provenance.md](D:/OneDrive/Code/scientific-computing-toolkit/tools/sbom-diff-and-risk/docs/self-provenance.md).
234310

235-
- the attested files are the wheel and source distribution built by `python -m build` from `tools/sbom-diff-and-risk`
236-
- the build files are uploaded together as the `sbom-diff-and-risk-dist` workflow artifact
237-
- only trusted non-PR runs publish the attestation
238-
- consumers can verify provenance with GitHub's attestation tooling after downloading one of those artifacts
239-
- this complements the tool's analysis of third-party supply-chain inputs, but it does not replace that analysis
240-
241-
See [docs/self-provenance.md](D:/OneDrive/Code/scientific-computing-toolkit/tools/sbom-diff-and-risk/docs/self-provenance.md) for the exact attested filenames, where the evidence appears in GitHub, and a run-by-run verification flow for consumers.
242311
## Parser Boundaries
243312

244313
Deterministic local mode intentionally supports a conservative subset of packaging syntax. The detailed matrix lives in [docs/parser-boundaries.md](D:/OneDrive/Code/scientific-computing-toolkit/tools/sbom-diff-and-risk/docs/parser-boundaries.md).
@@ -266,9 +335,12 @@ Deterministic local mode intentionally supports a conservative subset of packagi
266335
## Limitations
267336

268337
- default mode is local-file based only.
338+
- PyPI provenance enrichment is opt-in only via `--enrich-pypi`; default runs stay offline.
269339
- `generated_at` remains `null` to preserve deterministic report output.
270340
- `stale_package` is not resolved offline. The report emits `not_evaluated` instead.
271-
- SARIF export intentionally covers only a conservative subset of findings in v0.2.
341+
- provenance evidence is recorded for supported PyPI packages only; unsupported and failed lookups remain explicit evidence gaps.
342+
- SARIF export intentionally covers only a conservative subset of findings in v0.2, including only selected high-signal provenance policy violations.
343+
- Scorecard enrichment is opt-in only via `--enrich-scorecard`, uses only high-confidence repository mappings, and remains advisory unless policy explicitly gates it.
272344
- No vulnerability database integration, CVE matching, or advisory enrichment.
273345
- `requirements.txt` support intentionally covers a conservative subset: plain PEP 508 requirement entries, comments, extras, markers, and line continuations.
274346
- `requirements.txt` intentionally rejects include/constraint directives, editable installs, direct URL/path refs, index/source options, and other pip-only install flags in deterministic mode.
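
As a reading aid for the README changes above, here is a minimal Python sketch (not part of this commit) that checks which of the top-level trust-signal fields are present in a JSON report. The key names come from the README's "Provenance-Aware Reporting" section. The helper names and any report path passed in are assumptions. The internal shape of each field is not shown in this diff, so the sketch does not inspect it.

```python
import json
from pathlib import Path

# Top-level JSON report fields named in the README's
# "Provenance-Aware Reporting" section.
TRUST_SIGNAL_KEYS = (
    "provenance_summary",
    "attestation_summary",
    "enrichment_metadata",
    "trust_signal_notes",
    "provenance_policy_impact",
)


def present_trust_signal_keys(report: dict) -> list[str]:
    """Return the trust-signal keys found at the top level of a report."""
    return [key for key in TRUST_SIGNAL_KEYS if key in report]


def load_report(path: str) -> dict:
    """Load a JSON report written via --out-json (hypothetical helper)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

For example, `present_trust_signal_keys(load_report("outputs/report-enriched.json"))` would list the fields found in the report produced by the `--enrich-pypi` example. Whether these fields also appear in offline runs is not stated in this diff, so a missing key should not be read as an error.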

tools/sbom-diff-and-risk/docs/dependency-risk-heuristics.md

Lines changed: 1 addition & 0 deletions
@@ -25,6 +25,7 @@ The current rules are intentionally conservative:
2525
## Deferred work
2626

2727
- real `stale_package` evaluation behind explicit enrichment
28+
- provenance-based policy gates over opt-in enrichment evidence
2829
- ecosystem-specific trust rules
2930
- advisory and CVE enrichment
3031
- configurable risk policy profiles

tools/sbom-diff-and-risk/docs/policy-schema.md

Lines changed: 80 additions & 5 deletions
@@ -1,15 +1,17 @@
11
# Policy schema
22

3-
`sbom-diff-and-risk` supports a YAML-only policy schema in v1.
3+
`sbom-diff-and-risk` supports YAML-only policy schemas in versions `1`, `2`, and `3` for the local, provenance-aware, and optional Scorecard-aware policy flows described here.
44

55
The schema is intentionally conservative and fail-closed:
66

77
- unknown rule ids are rejected
88
- unknown top-level keys are rejected
99
- invalid types are rejected
10-
- only schema version `1` is supported
10+
- version `1` remains the v0.2-compatible schema and existing v0.2 policies continue to work unchanged
11+
- version `2` adds provenance-aware gating for explicit PyPI enrichment evidence
12+
- version `3` adds optional Scorecard-aware gating for explicitly requested Scorecard enrichment
1113

12-
## Fields
14+
## Version 1 fields
1315

1416
- `version: 1`
1517
- `block_on: [rule_id, ...]`
@@ -18,7 +20,7 @@ The schema is intentionally conservative and fail-closed:
1820
- `allow_sources: [host, ...]`
1921
- `ignore_rules: [rule_id, ...]`
2022

21-
## Supported rule ids
23+
## Version 1 supported rule ids
2224

2325
- `new_package`
2426
- `major_upgrade`
@@ -29,6 +31,41 @@ The schema is intentionally conservative and fail-closed:
2931
- `max_added_packages`
3032
- `allow_sources`
3133

34+
## Version 2 fields
35+
36+
Version `2` supports every version `1` field plus:
37+
38+
- `require_attestations_for_new_packages: bool`
39+
- `require_provenance_for_suspicious_sources: bool`
40+
- `allow_unattested_packages: [package_name, ...]`
41+
- `allow_provenance_publishers: [publisher_kind, ...]`
42+
- `allow_unattested_publishers: [publisher_kind, ...]` as an accepted compatibility alias for `allow_provenance_publishers`
43+
44+
`allow_provenance_publishers` is the canonical publisher override field. The parser also accepts `allow_unattested_publishers` as an alias when teams want a more explicit override-style name in review. Neither field treats missing attestations as trusted; they only constrain which attested publisher kinds count as verified provenance.
45+
46+
## Version 2 supported rule ids
47+
48+
Version `2` supports every version `1` rule id plus:
49+
50+
- `missing_attestation`
51+
- `unverified_provenance`
52+
- `provenance_unavailable`
53+
- `provenance_required`
54+
55+
## Version 3 fields
56+
57+
Version `3` supports every version `1` and `2` field plus:
58+
59+
- `minimum_scorecard_score: float`
60+
61+
`minimum_scorecard_score` is advisory by itself. It only affects policy outcomes when you also opt into the `scorecard_below_threshold` rule through `block_on`, `warn_on`, or `ignore_rules`.
62+
63+
## Version 3 supported rule ids
64+
65+
Version `3` supports every version `1` and `2` rule id plus:
66+
67+
- `scorecard_below_threshold`
68+
3269
## Semantics
3370

3471
- `block_on` turns matching rule ids into blocking violations.
@@ -37,8 +74,20 @@ The schema is intentionally conservative and fail-closed:
3774
- `max_added_packages` enforces a deterministic threshold on the added component count.
3875
- `allow_sources` enforces exact host matches against `source_url` hosts for added and changed components.
3976
- `ignore_rules` suppresses matching rule ids entirely.
77+
- `missing_attestation` means PyPI release metadata was fetched successfully but no attestations were present.
78+
- `provenance_unavailable` means the run did not have usable provenance evidence for that package, for example because enrichment was disabled, unsupported, or failed.
79+
- `unverified_provenance` means attestations were present, but the provenance could not be verified against publisher metadata.
80+
- `provenance_required` is a policy-only rule emitted when an explicit provenance requirement was not satisfied.
81+
- `require_attestations_for_new_packages` applies only to added PyPI packages.
82+
- `require_provenance_for_suspicious_sources` applies only when the component also triggered `suspicious_source`.
83+
- `allow_unattested_packages` is a narrow package-name override for explicit missing-attestation exceptions only.
84+
- `allow_unattested_packages` does not waive `provenance_unavailable` or `unverified_provenance`; those remain separate, reviewable policy decisions.
85+
- `allow_provenance_publishers` and `allow_unattested_publishers` apply only when attestations exist and publisher kinds are available to verify.
86+
- when enrichment is disabled, deterministic local mode is unchanged unless a provenance-aware policy explicitly turns unavailable evidence into a warning or block.
87+
- `minimum_scorecard_score` does not create alerts or blocks on its own; it only becomes enforceable when `scorecard_below_threshold` is configured explicitly.
88+
- Scorecard evidence remains an auxiliary trust signal. A high score is not proof of safety, and missing Scorecard data is not proof of risk.
4089

41-
## Example
90+
## Version 1 example
4291

4392
```yaml
4493
version: 1
@@ -54,3 +103,29 @@ allow_sources:
54103
ignore_rules:
55104
- major_upgrade
56105
```
106+
107+
## Version 2 example
108+
109+
```yaml
110+
version: 2
111+
block_on:
112+
- provenance_required
113+
- provenance_unavailable
114+
warn_on:
115+
- missing_attestation
116+
require_attestations_for_new_packages: true
117+
require_provenance_for_suspicious_sources: true
118+
allow_unattested_packages:
119+
- pip
120+
allow_unattested_publishers:
121+
- github actions
122+
```
123+
124+
## Version 3 example
125+
126+
```yaml
127+
version: 3
128+
warn_on:
129+
- scorecard_below_threshold
130+
minimum_scorecard_score: 7.0
131+
```
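
The three provenance rule ids described under Semantics can be told apart with a small decision function. The sketch below is illustrative only and is not the tool's implementation. The function name and all parameter names are hypothetical; only the returned rule ids and their meanings come from the schema document.

```python
from typing import Optional


def classify_provenance(
    enrichment_enabled: bool,
    supported: bool,
    lookup_succeeded: bool,
    attestation_count: int,
    publisher_verified: bool,
) -> Optional[str]:
    """Map hypothetical evidence flags to a provenance rule id, or None."""
    # Enrichment disabled, unsupported, or failed: no usable evidence.
    if not (enrichment_enabled and supported and lookup_succeeded):
        return "provenance_unavailable"
    # Release metadata was fetched, but no attestations were present.
    if attestation_count == 0:
        return "missing_attestation"
    # Attestations exist but could not be verified against publisher metadata.
    if not publisher_verified:
        return "unverified_provenance"
    # Attested and verified: nothing to report.
    return None
```

A returned rule id is only a candidate finding. Per the Semantics list, it becomes a warning or a block only if the policy names it in `warn_on` or `block_on`, which is why offline runs are unchanged unless a policy explicitly acts on `provenance_unavailable`.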

tools/sbom-diff-and-risk/docs/self-provenance.md

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ The attested subjects are the exact Python distributables built from `tools/sbom
1111

1212
Those two files are uploaded together as the workflow artifact named `sbom-diff-and-risk-dist`. The attestation applies to the built files themselves, not just to the artifact bundle name shown in the Actions UI.
1313

14-
Current attestations cover workflow-built wheel and sdist artifacts, not GitHub Release assets or PyPI-published distributions.
14+
This repository does not currently publish PyPI Trusted Publishing provenance or immutable GitHub release attestations as part of this workflow. The current self-provenance coverage is limited to the workflow-produced wheel and source distribution files.
1515

1616
## Workflow and permissions
1717

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
1+
# Missing attestation remains a review signal, not proof of compromise.
2+
version: 2
3+
warn_on:
4+
- missing_attestation
5+
- provenance_required
6+
require_attestations_for_new_packages: true
7+
allow_unattested_packages:
8+
- pip
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
1+
# Explicit provenance requirements for enriched PyPI evidence.
2+
version: 2
3+
block_on:
4+
- provenance_required
5+
- provenance_unavailable
6+
- unverified_provenance
7+
warn_on:
8+
- missing_attestation
9+
require_attestations_for_new_packages: true
10+
require_provenance_for_suspicious_sources: true
11+
allow_unattested_publishers:
12+
- github actions
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
1+
version: 3
2+
warn_on:
3+
- scorecard_below_threshold
4+
minimum_scorecard_score: 7.0
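
To illustrate how a version 3 policy like the one above might be interpreted, the following Python sketch shows the documented rule that `minimum_scorecard_score` is advisory until `scorecard_below_threshold` is configured explicitly. It is a sketch under stated assumptions, not the tool's code. The function name and return labels are hypothetical, and checking `block_on` before `warn_on` is an assumed precedence. That `ignore_rules` suppresses the rule entirely follows the schema document.

```python
from typing import Optional

RULE = "scorecard_below_threshold"


def scorecard_decision(policy: dict, score: Optional[float]) -> str:
    """Return a hypothetical outcome label for one component's Scorecard score."""
    threshold = policy.get("minimum_scorecard_score")
    # Missing Scorecard data is not proof of risk, so it yields no finding.
    if threshold is None or score is None:
        return "no_finding"
    if score >= threshold:
        return "no_finding"
    # ignore_rules suppresses the rule entirely.
    if RULE in policy.get("ignore_rules", []):
        return "ignored"
    # Assumed precedence: a block outranks a warning.
    if RULE in policy.get("block_on", []):
        return "block"
    if RULE in policy.get("warn_on", []):
        return "warn"
    # Threshold set but rule not opted into: advisory metadata only.
    return "advisory"
```

With the policy above (`warn_on` containing `scorecard_below_threshold` and a threshold of `7.0`), a score of `5.5` maps to `"warn"` in this sketch. Removing the `warn_on` entry would leave the same score as `"advisory"`.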
