Open
30 commits
7eb4c36
feat(02.1-01): add --ecosystem flag and buildPypiProbeFor
simonhj May 20, 2026
0c0ed28
feat(02.1-01): create bazel-pypi-discovery module with tests
simonhj May 20, 2026
f2b9079
feat(02.1-02): create PyPI parser, extraction orchestrator, and ecosy…
simonhj May 20, 2026
f00db2d
test(bazel-pypi): fix constructed fixture test and add oracle
simonhj May 20, 2026
bd4a93d
feat(02.1-03): wire PyPI branch into auto-manifest dispatch with mock…
simonhj May 20, 2026
acc8ec1
docs(02.1-04): document Bazel PyPI extraction in README and CHANGELOG
simonhj May 20, 2026
d2dc321
fix(02.1): CR-01 honor socket.json ecosystem default (normalize strin…
simonhj May 20, 2026
2c0089f
fix(02.1): CR-02/WR-01 fail loudly on ecosystem hard failures in both…
simonhj May 20, 2026
fb16f98
fix(02.1): WR-05/WR-06 enrich native candidates with parsed metadata,…
simonhj May 20, 2026
3eeb9a3
fix(02.1): WR-09 add oversized .bzl file rejection test
simonhj May 20, 2026
147946b
fix(02.1): WR-04 extract outcome matrix to pure helper and add unit t…
simonhj May 20, 2026
b95ec5a
fix(02.1): regenerate PyPI oracle to byte-equal live sortPackageLines…
simonhj May 20, 2026
c94720e
style: fix lint errors in bazel PyPI extraction files
simonhj May 21, 2026
d62cecc
fix(02.1): use Bazel repo mapping for visible repos
simonhj May 21, 2026
dc8af61
fix(02.1): restrict PyPI hubs to static candidates
simonhj May 21, 2026
323e01d
fix(api): preserve http apiFetch support
simonhj May 21, 2026
295960b
fix(bazel): parse single-quoted pypi attrs
simonhj May 21, 2026
06a3849
fix(bazel): reject conflicting pypi lock duplicates
simonhj May 21, 2026
e1c8ad1
test(bazel): enforce exact pypi oracle match
simonhj May 21, 2026
04a7820
fix(manifest): keep extractor exit status composable
simonhj May 21, 2026
3b5f086
fix(bazel): resolve pypi spoke metadata targets
simonhj May 21, 2026
d524fb2
fix(bazel): harden review-found edge cases
simonhj May 21, 2026
0c49273
fix(bazel): close final review findings
simonhj May 21, 2026
6a0de95
fix(bazel): make pypi extraction opt-in
simonhj May 21, 2026
faf78ca
feat(bazel): prefer command-driven pypi discovery
simonhj May 21, 2026
5615f5c
feat(bazel): add bounded verbose diagnostics
simonhj May 21, 2026
8543f68
Revert "fix(api): preserve http apiFetch support"
simonhj May 21, 2026
5630b05
fix(bazel): keep pypi generation explicit
simonhj May 21, 2026
a3f1742
docs(bazel): clarify pypi parser scope
simonhj May 22, 2026
5992511
fix(bazel): address pypi review nits
simonhj May 22, 2026
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -9,6 +9,10 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
### Added
- **`socket manifest bazel [beta]`** — Generate Bazel JVM SBOM manifests by running `bazel query` against discovered Maven repos in a Bazel workspace. Closes the inline-Maven-declaration gap that lockfile-only parsing misses for repos like envoy, ray, tensorflow, tink-java, and or-tools. Auto-detects Bzlmod and legacy `WORKSPACE`.
- **`socket scan create --auto-manifest`** now covers Bazel workspaces in addition to Gradle/Scala/Kotlin/Conda. Repos with `MODULE.bazel`, `WORKSPACE`, or `WORKSPACE.bazel` are detected automatically and their Maven dependencies extracted as part of the standard scan-create flow.
- **Bazel PyPI extraction** — `socket manifest bazel --ecosystem pypi` now generates `requirements.txt` for Python Bazel workspaces. Discovers custom `rules_python` pip hub names with Bazel command output first, queries `py_library` / `py_binary` / `py_test` dependencies, resolves canonical pinned versions from `requirements_lock.txt`, and emits PEP 503-normalized `name==version` lines. Supports both Bzlmod (`pip.parse`) and legacy `WORKSPACE` (`pip_parse` / `pip_install`) configurations. PyPI remains explicit opt-in for `socket scan create --auto-manifest` until real-world no-lockfile recovery is validated.

### Changed
- **Bazel diagnostics** — `socket manifest bazel --verbose` now emits bounded subprocess traces with argv, cwd, duration, exit status, output sizes, and failure stderr tails to make customer log-only triage safer and faster.

## [1.1.101](https://github.com/SocketDev/socket-cli/releases/tag/v1.1.101) - 2026-05-22

92 changes: 79 additions & 13 deletions src/commands/manifest/README.md
@@ -16,13 +16,14 @@ manifest generator. Useful when you do not want to spell out the language.

## socket manifest bazel [beta]

Generates Bazel JVM SBOM manifests (`maven_install.json`-shaped) by running
`bazel query` against discovered Maven repos in a Bazel workspace. Output is
consumed by `socket scan create` and closes the
inline-Maven-declaration gap that lockfile-only parsing misses.
Generates Bazel SBOM manifests (Maven `maven_install.json` and/or PyPI
`requirements.txt`) by running `bazel query` against discovered ecosystem
hubs in a Bazel workspace. Output is consumed by `socket scan create` and
closes the inline-declaration gap that lockfile-only parsing misses for
Bazel monorepos.

> **Note**: This command generates Maven dependency manifests for Bazel JVM
> workspaces. It does not run reachability analysis.
> **Note**: This command generates dependency manifests for Bazel
> workspaces (Maven and PyPI). It does not run reachability analysis.

### Usage

@@ -36,33 +37,98 @@ socket manifest bazel [options] [DIR=.]
- `--bazel-rc <path>` — path to additional `.bazelrc` fragments forwarded to bazel.
- `--bazel-flags <str>` — flags forwarded to every bazel invocation (single quoted string).
- `--bazel-output-base <dir>` — Bazel `--output_base` for read-only-cache CI environments.
- `--ecosystem <name>` — ecosystem(s) to extract; repeatable. Supported values: `maven`, `pypi`. When omitted, Maven is generated by default; PyPI is explicit opt-in.
- `--out <dir>` — output directory; default `./.socket/bazel-manifests/`.
- `--dry-run`, `--verbose` — standard diagnostic flags.
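
As a rough illustration of the `--ecosystem` semantics described above (repeatable, Maven when omitted, PyPI only on request), the selection logic can be sketched in Python. This is not the CLI's implementation: `resolve_ecosystems` is a hypothetical helper, and de-duplicating repeated values is an assumption made for the example.

```python
import argparse

SUPPORTED_ECOSYSTEMS = ("maven", "pypi")


def resolve_ecosystems(argv):
    """Return the ecosystems to extract for a list of CLI arguments."""
    parser = argparse.ArgumentParser(prog="manifest-bazel-sketch")
    # action="append" makes the flag repeatable; default=None lets us
    # tell "flag omitted" apart from an explicit selection.
    parser.add_argument(
        "--ecosystem",
        action="append",
        choices=SUPPORTED_ECOSYSTEMS,
        default=None,
    )
    args = parser.parse_args(argv)
    # Maven is the default; PyPI must be requested explicitly.
    selected = args.ecosystem or ["maven"]
    # Assumption: repeated values collapse, keeping first-seen order.
    return list(dict.fromkeys(selected))
```

With this model, passing only `--ecosystem pypi` yields a PyPI-only run, which matches the "Generate only the PyPI manifest" example in this README.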

> **Upload**: This subcommand only generates manifests. To generate and
> upload in one step, use `socket scan create --auto-manifest .` — it
> detects the workspace, runs the same extraction this subcommand performs,
> and uploads the result.
> detects the workspace, generates Bazel Maven manifests, and uploads the
> result. Generate Bazel PyPI manifests explicitly with `socket manifest bazel
> --ecosystem pypi`, then scan the generated output with `socket scan create`.

### Examples

```bash
# Generate maven manifests from the current Bazel workspace.
# Generate the default Bazel Maven manifest from the current workspace.
socket manifest bazel .

# Generate only the PyPI manifest.
socket manifest bazel . --ecosystem pypi

# Generate both Maven and PyPI manifests explicitly.
socket manifest bazel . --ecosystem maven --ecosystem pypi

# Use bazelisk explicitly.
socket manifest bazel --bazel=/usr/local/bin/bazelisk .
```

### Python/PyPI Extraction

When `--ecosystem pypi` is selected, the command:

1. Discovers `rules_python` pip hubs from Bazel's `mod show_extension` output when available. Bounded static parsing of `MODULE.bazel` (`pip.parse(hub_name = "...")`) and legacy `WORKSPACE` (`pip_parse(name = "...")` / `pip_install(name = "...")`) is retained as a fallback. Hub names are never hardcoded; custom names like `my_pypi` are detected automatically.
2. Validates each candidate hub by probing it with `bazel query` for `:pkg` targets / `alias(` rules. Invalid candidates are dropped.
3. Runs `bazel query 'deps(kind("py_library|py_binary|py_test", //...))'` to determine which PyPI packages are actually reached by Python rules in the repo (test dependencies included for whole-repo scope).
4. Reads `requirements_lock.txt` (the path discovered from `pip.parse(requirements_lock = "...")`) for canonical pinned versions. When the lockfile is unavailable, falls back to parsing `pypi_name=` and `pypi_version=` tags from the spoke `py_library` rules in the hub-and-spoke architecture.
5. Emits a sorted canonical `requirements.txt` containing `name==version` lines for every reached package.
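
The static fallback in step 1 can be approximated with a small scanner. The sketch below is illustrative Python, not the extractor's code: `discover_hub_candidates` is a hypothetical helper, it ignores Starlark comments and string escapes, and it does not model the size bounds or the `bazel query` validation from step 2.

```python
import re

# Call name -> attribute that carries the hub name (per step 1 above).
HUB_ATTRS = {"pip.parse": "hub_name", "pip_parse": "name", "pip_install": "name"}
CALL_RE = re.compile(r"(?<![\w.])(pip\.parse|pip_parse|pip_install)\s*\(")


def _call_body(text, open_idx):
    """Return the text between the '(' at open_idx and its matching ')'."""
    depth = 0
    quote = None
    for i in range(open_idx, len(text)):
        ch = text[i]
        if quote:
            # Inside a string literal: only the closing quote matters.
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return text[open_idx + 1:i]
    return None  # Unbalanced call; treat it as unparseable.


def discover_hub_candidates(text):
    """Collect candidate hub names from MODULE.bazel or WORKSPACE text."""
    hubs = []
    for match in CALL_RE.finditer(text):
        body = _call_body(text, match.end() - 1)
        if body is None:
            continue
        attr = HUB_ATTRS[match.group(1)]
        # Accept double- and single-quoted values; the lookbehind keeps
        # `name` from matching inside `hub_name`.
        found = re.search(
            r"""(?<![\w.])""" + attr + r"""\s*=\s*(["'])([^"']+)\1""", body
        )
        if found and found.group(2) not in hubs:
            hubs.append(found.group(2))
    return hubs
```

The names returned here are only candidates. In the flow described above, each one is still probed with `bazel query` before it is trusted.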

### PyPI Name and Version Semantics

- **PEP 503 normalization.** Package matching uses PEP 503 normalization
(lowercase, then any run of `-`, `_`, or `.` is collapsed to a single
`-`). Bazel target names use underscores (`charset_normalizer`); PyPI
canonical names use hyphens (`charset-normalizer`). The emitted
`requirements.txt` always uses the canonical hyphenated form.
- **Lockfile pins win.** When the lockfile and spoke-repo tags disagree on
a version, the lockfile wins because that is the version Bazel actually
resolves at analysis time. A `--verbose` warning is logged for the
divergence.
- **Conflict detection.** When two reached packages normalize to the same
PyPI name with different versions, the command fails clearly: a single
`requirements.txt` cannot represent both versions, and silently
picking one would produce a misleading SBOM.
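
The three rules above can be sketched together in a few lines of Python. This is a minimal illustration under stated assumptions, not the extractor's code: `normalize_pep503`, `collapse`, `merge_pins`, and `VersionConflict` are hypothetical names, plain lexicographic sorting stands in for the real line ordering, and merging spoke-only packages alongside lockfile pins is an assumption of the example.

```python
import re


class VersionConflict(Exception):
    """Two entries normalize to the same name but pin different versions."""


def normalize_pep503(name):
    # PEP 503: collapse runs of "-", "_", "." to a single "-" and lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()


def collapse(pins):
    """Map normalized name -> version, failing on conflicting duplicates."""
    out = {}
    for name, version in pins:
        key = normalize_pep503(name)
        if out.setdefault(key, version) != version:
            raise VersionConflict(f"{key}: {out[key]} vs {version}")
    return out


def merge_pins(lock_pins, spoke_pins):
    """Merge (name, version) pairs; lockfile pins override spoke tags."""
    lock = collapse(lock_pins)
    spoke = collapse(spoke_pins)
    warnings = [
        f"{key}: lockfile {lock[key]} overrides spoke tag {version}"
        for key, version in sorted(spoke.items())
        if key in lock and lock[key] != version
    ]
    merged = {**spoke, **lock}  # Later dict wins, so lockfile pins win.
    lines = sorted(f"{key}=={version}" for key, version in merged.items())
    return lines, warnings
```

Raising on conflicting duplicates rather than picking one mirrors the reasoning above: a single `requirements.txt` has no way to express two versions of the same package.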

### Unsupported PyPI Forms

The PyPI extractor is intentionally narrow in this phase:

- **Direct URL, editable (`-e`), and unpinned requirements** are not
emitted. Only canonical `name==version` lines from the resolved
lockfile are produced. Repositories that rely on unpinned or
URL-pinned requirements will see those packages omitted from
`requirements.txt`.
- **Private corpus validation** requires authenticated GitHub access.
When credentials are unavailable, the bazel-bench harness's private
PyPI case skips cleanly with a distinct reason rather than failing.
- **Whole-repo extraction.** The initial PyPI implementation emits one
whole-workspace manifest. Per-target PyPI slicing is not currently
supported.
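
To make the first bullet concrete, the sketch below shows one way to keep only `name==version` pins from lockfile text while setting aside editable, direct-URL, and unpinned entries. It is an illustrative Python approximation, not the parser shipped in this change: `pinned_requirements` and `PIN_RE` are hypothetical, and the pattern covers only common pip-compile style output.

```python
import re

# name, optional [extras], "==", version, then whitespace / ";" / end of line.
PIN_RE = re.compile(
    r"^([A-Za-z0-9][A-Za-z0-9._-]*)(?:\[[^\]]*\])?\s*==\s*"
    r"([A-Za-z0-9][A-Za-z0-9.!+_-]*)(?=\s|;|$)"
)


def pinned_requirements(lock_text):
    """Split lockfile text into (pins, skipped) per the bullet above."""
    pins, skipped = [], []
    # Join backslash continuations, which pip-compile uses for --hash lines.
    logical = lock_text.replace("\\\n", " ")
    for raw in logical.splitlines():
        line = raw.split(" #", 1)[0].strip()  # Drop trailing comments.
        if not line or line.startswith("#"):
            continue
        if line.startswith("-"):
            # Option lines: editable installs are recorded as skipped,
            # other options (for example --index-url) are ignored.
            if line.startswith(("-e ", "--editable")):
                skipped.append(line)
            continue
        match = PIN_RE.match(line)
        if match:
            pins.append((match.group(1), match.group(2)))
        else:
            # Direct URLs, unpinned specifiers, wildcards, and similar.
            skipped.append(line)
    return pins, skipped
```

Returning the skipped lines alongside the pins is a design choice of this sketch: it makes omissions visible to the caller instead of dropping them silently.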

### Cross-Language Edges

Bazel repos with cross-language dependencies (e.g. `rust_library` →
`py_library` via PyO3 / cffi / etc.) are **not** traversed by the PyPI
extractor in this phase. The PyPI extractor only covers Python rule
dependencies reachable from `py_library`, `py_binary`, and `py_test`
targets. Cross-language edges are assigned to Phase 4. The bazel-bench
fixture `constructed/python-pypi` includes Go/Rust sidecars as
validation context only; they are intentionally not asserted by the
PyPI correctness cases.

### Requirements

- `bazel` or `bazelisk` on `PATH` (or pass `--bazel <path>`).
- Network access on cold cache. Bazel and `rules_jvm_external` own their own
retry policy for transient Maven resolution failures — `socket manifest bazel`
does not retry on top of them.
- Network access on cold cache. Bazel and `rules_jvm_external` /
`rules_python` own their own retry policy for transient resolution
failures — `socket manifest bazel` does not retry on top of them.
- Writable Bazel output base; pass `--bazel-output-base` for read-only-cache CI.
- For PyPI extraction: a Python 3 interpreter on `PATH` so the
rules_python toolchain can analyze the workspace.

This is the user-visible entry point for Bazel JVM SBOM support; the [beta] label and "Bazel JVM SBOM support" wording must stay consistent across release notes and docs.
This is the user-visible entry point for Bazel SBOM support (Maven and
PyPI); the [beta] label and "Bazel SBOM support" wording must stay
consistent across release notes and docs.

## socket manifest cdxgen
