Add publish command#249

Merged
consideRatio merged 16 commits into sensmetry:main from consideRatio:publish
Apr 14, 2026

@consideRatio
Collaborator

@consideRatio consideRatio commented Apr 1, 2026

Summary

This PR introduces a new sysand publish command to upload .kpar artifacts to a sysand package index.

What’s Included

  • Adds the CLI command sysand publish --index <URL> [PATH], where --index is required
  • Uses [PATH] if given; otherwise defaults to the path a build command would write to by default
  • Uses a new client/server API: a POST request with a multipart form payload to <index>/api/v1/upload, with two form fields:
    • metadata (application/json) with:
      • normalized_publisher
      • normalized_name
      • version
      • license
      • kpar_sha256_digest
    • kpar with the .kpar data blob (application/zip)
  • Validates publish inputs before upload:
    • publisher/name must be normalizable based on strict rules
    • version must be valid SemVer 2.0
    • license must be valid SPDX expression
  • Normalizes published ID fields (lowercase, spaces -> hyphens; dots in name are preserved)
  • Validates Index URL:
    • only http/https
    • rejects query/fragment
    • rejects URLs that already include /api/v1/upload
  • Makes use of publish-specific auth behavior:
    • only bearer-token credentials are used
    • credential matching is performed against the resolved upload URL
    • clear failures for missing/ambiguous bearer credentials
  • Maps HTTP errors to user-friendly CLI errors (401/403/404/409)
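
A minimal sketch of the ID normalization and index URL checks listed above. It is written in Python purely for illustration and is not the code from this PR. The names normalize_id, resolve_upload_url and UPLOAD_PATH are hypothetical, and only the rules stated in the list are covered (the "strict rules" for what counts as normalizable are not spelled out here, so they are left out).

```python
from urllib.parse import urlsplit

# Endpoint path taken from the PR description.
UPLOAD_PATH = "/api/v1/upload"


def normalize_id(value: str) -> str:
    """Lowercase and turn spaces into hyphens; dots are left untouched."""
    return value.lower().replace(" ", "-")


def resolve_upload_url(index_url: str) -> str:
    """Validate an index URL and return the upload endpoint beneath it."""
    parts = urlsplit(index_url)
    if parts.scheme not in ("http", "https"):
        raise ValueError("index URL must use http or https")
    if parts.query or parts.fragment:
        raise ValueError("index URL must not contain a query or fragment")
    if UPLOAD_PATH in parts.path:
        raise ValueError("index URL must not already include " + UPLOAD_PATH)
    # Strip a trailing "/" so the joined URL has no double slash.
    return f"{parts.scheme}://{parts.netloc}{parts.path.rstrip('/')}{UPLOAD_PATH}"
```

With these assumptions, resolve_upload_url("https://index.example.org/") returns "https://index.example.org/api/v1/upload", and normalize_id("Lib.Core") returns "lib.core".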

Docs

  • Adds docs/src/commands/publish.md
  • Adds sysand publish entry to docs summary navigation

Tests

  • Adds dedicated core publish tests (URL handling, validation)
  • Adds extensive CLI integration tests for success paths and failure modes
  • New tests added

Dependency/Config Updates

  • Enables multipart features for reqwest / reqwest-middleware where needed for upload support

@consideRatio consideRatio marked this pull request as ready for review April 1, 2026 13:51
@andrius-puksta-sensmetry
Collaborator

andrius-puksta-sensmetry commented Apr 2, 2026

Uploads multipart form (purl + file) to /api/v1/upload

Why not use something like Cargo publish protocol?

  • Reads and validates project metadata from the .kpar:
    • publisher present and canonicalizable
    • name canonicalizable
    • version valid SemVer 2.0

What about license? Should require it to be an SPDX expression. (EDIT: Agree! RESOLVED)

@andrius-puksta-sensmetry
Collaborator

andrius-puksta-sensmetry commented Apr 2, 2026

Uploads multipart form (purl + file) to /api/v1/upload

Why not use something like Cargo publish protocol?

Alternatively, why not be even simpler: pass the PURL in some header (e.g. X-Sysand-publish-PURL) and the KPAR as the application/zip body?

EDIT by Erik: Discussed a bit further in #249 (comment) and #249 (comment). We now have metadata as JSON and kpar as application/zip, where metadata includes a checksum and the purl.
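
For illustration, here is a rough Python sketch of how the two form fields could be assembled. It uses the field names from the PR summary, where the purl was eventually replaced by separate normalized fields. The function name build_upload_fields, the file name package.kpar and the hex encoding of the digest are assumptions, not details confirmed in this thread.

```python
import hashlib
import json


def build_upload_fields(kpar_bytes, publisher, name, version, license_expr):
    """Return the two multipart fields as (filename, content, content type)."""
    metadata = {
        "normalized_publisher": publisher,
        "normalized_name": name,
        "version": version,
        "license": license_expr,
        # Assumption: the digest is sent as a lowercase hex string.
        "kpar_sha256_digest": hashlib.sha256(kpar_bytes).hexdigest(),
    }
    return {
        # Metadata is listed first so it precedes the file in the dict.
        "metadata": (None, json.dumps(metadata), "application/json"),
        # Hypothetical file name; the thread does not specify one.
        "kpar": ("package.kpar", kpar_bytes, "application/zip"),
    }
```

The tuple layout matches what some HTTP client libraries accept for multipart uploads, but nothing here sends a request.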

Signed-off-by: Erik Sundell <erik.sundell+2025@sensmetry.com>
Signed-off-by: Erik Sundell <erik.sundell+2025@sensmetry.com>
Signed-off-by: Erik Sundell <erik.sundell+2025@sensmetry.com>
Signed-off-by: Erik Sundell <erik.sundell+2025@sensmetry.com>
@andrius-puksta-sensmetry
Collaborator

Looked through everything. One question/suggestion still not addressed:

I think it would be better to have separate fields for publisher, name and version. It's also unclear to me if these are required to match the actual KPAR metadata (IMO they should be the same). If yes, combining them into a PURL can be done server side, as publisher/name/version will have to be checked anyway.

Signed-off-by: Erik Sundell <erik.sundell+2025@sensmetry.com>
@consideRatio
Collaborator Author

consideRatio commented Apr 13, 2026

Looked through everything. One question/suggestion still not addressed:

I think it would be better to have separate fields for publisher, name and version. It's also unclear to me if these are required to match the actual KPAR metadata (IMO they should be the same). If yes, combining them into a PURL can be done server side, as publisher/name/version will have to be checked anyway.

They must match with the content. The idea of having the purl field was to do an authorization check before reading the .kpar archive, as it may be a bit more expensive operation, thinking about DoS protection etc.

With a purl field it's clear we mean the normalized parts. Do you want name and publisher to be normalized or not?

We could also just remove the purl field and go with it, because we won't process the request until the .kpar is received anyhow etc, so I think my DoS thinking wasn't very relevant.

Decision options:

  1. Remove purl, just having checksum in metadata
  2. Remove purl, but add publisher and name without normalization
  3. Remove purl, but add normalized_publisher and normalized_name with normalization
  4. Keep things as they are
  5. Something else

I think I lean towards 1 if anything, but more than anything I lean towards momentum, so I'm fine with quickly adjusting to any decision.

EDIT: I've pushed a commit implementing option 1. I can still switch to another option; I just wanted to avoid being idle.

Signed-off-by: Erik Sundell <erik.sundell+2025@sensmetry.com>
@consideRatio
Collaborator Author

consideRatio commented Apr 13, 2026

I've checked that this functionality works end to end against the index server under development, so I consider it ready, but I can also adjust to a decision on the comment above.

@andrius-puksta-sensmetry
Collaborator

we won't process the request until the .kpar is received anyhow etc, so I think my DoS thinking wasn't very relevant.

Is it feasible to process the request before fully receiving it (not now, just asking about whether this would be doable in the future if the need arises)? Even if not, it would still be nice to be able to quickly reject invalid requests (e.g. trying to publish to the wrong publisher/name, or especially publishing to a not-yet-existing/typoed package, which will happen for legitimate users) without unzipping the kpar. This doesn't prevent DoS by malicious users (they can simply include different names in the json than in the package, so we still must check the package), but most invalid requests come from legitimate users. Therefore I lean towards option 3 + version (to quickly detect duplicate publishes to the same version).
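
To make the "quick reject" idea concrete, here is a small Python sketch of a server-side check that looks only at the metadata field, before the archive is opened. Everything in it is hypothetical: the function name, the in-memory index shape, and the choice of which status code goes with which condition are assumptions for illustration.

```python
from typing import Dict, Optional, Set, Tuple


def quick_reject(
    metadata: dict,
    index: Dict[Tuple[str, str], Set[str]],
    allowed_publishers: Set[str],
) -> Optional[int]:
    """Return an HTTP status to reject with, or None to continue."""
    publisher = metadata["normalized_publisher"]
    name = metadata["normalized_name"]
    version = metadata["version"]

    # Caller may not publish under this publisher at all.
    if publisher not in allowed_publishers:
        return 403
    # Package does not exist yet (for example, a typo in the name).
    published_versions = index.get((publisher, name))
    if published_versions is None:
        return 404
    # This exact version has already been published.
    if version in published_versions:
        return 409
    # Nothing obviously wrong; the archive itself must still be checked.
    return None
```

A None result only means that the cheap checks passed; as noted above, the package contents still have to be checked.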

@andrius-puksta-sensmetry
Collaborator

Even if not, it would still be nice to be able to quickly reject invalid requests

For this to be effective, we also have to require that metadata "field" comes before the file in the request.

@consideRatio
Collaborator Author

consideRatio commented Apr 14, 2026

@andrius-puksta-sensmetry comment above:

For this to be effective, we also have to require that metadata "field" comes before the file in the request.

AI responses to misc questions:

The multipart/form-data spec (RFC 7578) does not guarantee that servers, proxies, or client libraries preserve field order. Client libraries (requests, curl, etc.) may reorder fields, especially when mixing file and non-file parts.

The WSGI server (gunicorn, uwsgi, etc.) reads the entire request body into memory (or a temp file for large uploads) before passing it to Django.


It is possible to have the server stream-read a POST request and abort mid-read, but this really sidesteps the conveniences of most web server software we build upon. If we are looking for this level of robustness, I think the even more solid strategy is to not have the index server receive the file at all. You can implement a multi-step flow: the server first generates a pre-signed URL so the client can upload directly to object storage; the client then uploads and reports back to the server, referencing the initial upload request; the server then processes the upload.

However, we are now in the weeds of optimization beyond even what crates.io, npm, pypi, etc do, and it raises the bar in general on implementing an index.
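
For readers unfamiliar with the multi-step flow described above, here is a toy in-memory Python simulation of it. It was not implemented in this PR. The class and method names are hypothetical, and a plain storage key stands in for a real pre-signed URL.

```python
import hashlib
import secrets


class ObjectStorage:
    """Stand-in for an object store that clients upload to directly."""

    def __init__(self):
        self.blobs = {}

    def put(self, key, data):
        self.blobs[key] = data


class IndexServer:
    """Stand-in for an index server that never receives the file body."""

    def __init__(self, storage):
        self.storage = storage
        self.pending = {}
        self.published = {}

    def request_upload(self, metadata):
        # Step 1: record the intent and hand back an upload location.
        upload_id = secrets.token_hex(8)
        self.pending[upload_id] = metadata
        return upload_id, f"uploads/{upload_id}"

    def complete_upload(self, upload_id):
        # Step 3: the client reports back; verify what landed in storage.
        metadata = self.pending.pop(upload_id)
        data = self.storage.blobs[f"uploads/{upload_id}"]
        if hashlib.sha256(data).hexdigest() != metadata["kpar_sha256_digest"]:
            return False
        key = (metadata["normalized_name"], metadata["version"])
        self.published[key] = upload_id
        return True
```

Step 2 is the client calling storage.put(key, data) with the key returned by request_upload; the index server only sees the metadata and the final confirmation.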

@andrius-puksta-sensmetry
Collaborator

Yeah, it was just a question about feasibility. I would still prefer option 3 + version to be implemented to catch user errors without having to unzip .project.json.
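
As discussed earlier in the thread, the metadata must match the archive content, so the full check still has to read the archive. The following Python sketch shows what that comparison might look like. The keys publisher, name and version inside .project.json, its location at the archive root, and the function names are assumptions made for illustration.

```python
import hashlib
import io
import json
import zipfile


def normalize_id(value: str) -> str:
    """Lowercase and turn spaces into hyphens, as in the PR summary."""
    return value.lower().replace(" ", "-")


def verify_kpar(metadata: dict, kpar_bytes: bytes) -> list:
    """Return a list of mismatches between metadata and archive content."""
    if hashlib.sha256(kpar_bytes).hexdigest() != metadata["kpar_sha256_digest"]:
        # Do not bother unzipping data that fails the checksum.
        return ["checksum mismatch"]

    with zipfile.ZipFile(io.BytesIO(kpar_bytes)) as archive:
        # Assumption: .project.json sits at the root of the archive.
        project = json.loads(archive.read(".project.json"))

    problems = []
    if normalize_id(project.get("publisher", "")) != metadata["normalized_publisher"]:
        problems.append("publisher mismatch")
    if normalize_id(project.get("name", "")) != metadata["normalized_name"]:
        problems.append("name mismatch")
    if project.get("version") != metadata["version"]:
        problems.append("version mismatch")
    return problems
```

An empty list means the declared metadata agrees with what is inside the archive.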

Erik Sundell added 2 commits April 14, 2026 11:35
Signed-off-by: Erik Sundell <erik.sundell@sensmetry.com>
Signed-off-by: Erik Sundell <erik.sundell@sensmetry.com>
@consideRatio consideRatio merged commit e6c0a6d into sensmetry:main Apr 14, 2026
51 checks passed