A privacy-preserving gateway and reference implementation for machine learning inference that supports both a plaintext baseline path and a CKKS-based encrypted inference path.
The system allows a client to discover supported models, construct a compatible CKKS session locally, encrypt inputs on the client side, submit ciphertexts to the server, retrieve encrypted inference results, and decrypt outputs locally. The server never receives plaintext inputs or the client’s secret key.
This repository focuses on protocol clarity, validation correctness, implementation conformance, and measurable privacy-performance tradeoffs between plaintext and encrypted inference.
Encrypted inference systems can become ambiguous: protocol rules get mixed with backend details, validation becomes implicit, and clients become tightly coupled to a single implementation. This project instead defines a versioned wire contract first, then provides a Python reference implementation that demonstrates one valid way to satisfy that contract.
The goal is to make encrypted inference easier to reason about, easier to test, and easier to implement correctly.
This project is intended for:
- engineers building privacy-preserving ML systems
- researchers evaluating encrypted inference protocols
- teams implementing compatible clients, servers, or SDKs
The client is responsible for:
- discovering model metadata
- constructing a compatible CKKS session locally
- encrypting input data locally
- submitting ciphertexts
- retrieving encrypted results
- decrypting results locally
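The session-construction step can be sketched as a small pure-Python helper that derives CKKS parameters from discovered model metadata. The metadata field names below (`poly_modulus_degree`, `coeff_mod_bit_sizes`, `scale_bits`) are illustrative assumptions, not the protocol's published keys:

```python
# Illustrative sketch: derive client-side CKKS session parameters from model
# metadata returned by GET /models. Field names here are hypothetical.

def build_session_params(model_meta: dict) -> dict:
    """Extract the CKKS parameters a client session must match."""
    enc = model_meta["encryption"]
    required = ("poly_modulus_degree", "coeff_mod_bit_sizes", "scale_bits")
    missing = [k for k in required if k not in enc]
    if missing:
        raise ValueError(f"metadata missing required fields: {missing}")
    return {
        "scheme": "CKKS",
        "n": enc["poly_modulus_degree"],
        "qi_sizes": list(enc["coeff_mod_bit_sizes"]),
        "scale": 2 ** enc["scale_bits"],
    }

# Hypothetical metadata payload for the reference model.
meta = {
    "model_id": "logistic_v1",
    "encryption": {
        "poly_modulus_degree": 8192,
        "coeff_mod_bit_sizes": [60, 30, 30, 60],
        "scale_bits": 30,
    },
}
params = build_session_params(meta)
print(params["n"], params["qi_sizes"])  # 8192 [60, 30, 30, 60]
```

The point of the helper is the failure mode: a client should refuse to build a session when required parameters are absent, rather than guessing defaults.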
The server is responsible for:
- exposing the protocol endpoints
- validating request envelopes
- validating ciphertext structure and compatibility
- performing homomorphic evaluation
- storing and returning encrypted results
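In the same spirit, envelope validation can be sketched as a pure function that returns error codes; the field names and error codes here are hypothetical, not the published schema:

```python
# Illustrative envelope check in the spirit of the server's validation layer.
# Field names and error codes are hypothetical, not the published schema.

REQUIRED_FIELDS = ("protocol_version", "model_id", "ciphertext_b64")

def validate_envelope(envelope: dict, known_models: set) -> list:
    """Return a list of error codes; an empty list means the envelope passed."""
    errors = []
    for field in REQUIRED_FIELDS:
        if field not in envelope:
            errors.append(f"missing_field:{field}")
    if "model_id" in envelope and envelope["model_id"] not in known_models:
        errors.append("unknown_model")
    return errors

ok = validate_envelope(
    {"protocol_version": "1.0", "model_id": "logistic_v1", "ciphertext_b64": "..."},
    known_models={"logistic_v1"},
)
bad = validate_envelope({"model_id": "missing_model"}, known_models={"logistic_v1"})
print(ok)   # []
print(bad)  # ['missing_field:protocol_version', 'missing_field:ciphertext_b64', 'unknown_model']
```

Returning all errors at once, rather than failing on the first, is one reasonable design choice for a protocol gateway; the reference implementation may behave differently.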
For the reference logistic_v1 model (8 input features, 20 measured runs), the plaintext and encrypted inference paths have a direct benchmark comparison:
- Plaintext mean latency: 3.76 ms
- Encrypted mean end-to-end latency: 151.40 ms
- Plaintext throughput: 266.13 req/s
- Encrypted throughput: 6.60 req/s
- Plaintext request size: 87 B
- Encrypted request size: 12,585,046 B
- Mean absolute error: 2.30e-6
- Max absolute error: 6.82e-6
See docs/benchmarking.md for methodology, and docs/benchmark_results.md for charts, raw metrics, and interpretation.
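Under the assumption of a sequential, single-worker benchmark, the reported throughput and latency figures can be cross-checked arithmetically:

```python
# Sanity-check the reported numbers: for a sequential single-worker benchmark,
# throughput (req/s) is approximately 1000 / mean latency (ms).

plain_latency_ms = 3.76   # reported plaintext mean latency
enc_latency_ms = 151.40   # reported encrypted mean end-to-end latency

print(round(1000 / plain_latency_ms, 2))  # 265.96, close to the reported 266.13
print(round(1000 / enc_latency_ms, 2))    # 6.61, close to the reported 6.60

# Ciphertext expansion: the encrypted request is roughly 144,656x larger
# than the 87 B plaintext request.
print(round(12_585_046 / 87))  # 144656
```

The roughly 40x latency gap and five-orders-of-magnitude request-size expansion are the core privacy-performance tradeoff the benchmarks document.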
```mermaid
flowchart LR
    subgraph CLIENT["Client / SDK"]
        direction TB
        C1["Discovery Client<br/>GET /models"]
        C2["Model Metadata"]
        C3["CKKS Session Builder"]
        C4["Local Encryption"]
        C5["Inference Submitter<br/>POST /infer"]
        C6["Jobs Client<br/>GET /jobs/{id}"]
        C7["Local Decryption"]
    end
    subgraph SERVER["Server / Reference Backend"]
        direction TB
        S1["/models Route"]
        S2["/infer Route"]
        S3["/jobs/{id} Route"]
        S4["Model Registry"]
        S5["Envelope Validation"]
        S6["Ciphertext Validation"]
        S7["CKKS Backend"]
        S8["HE Execution"]
        S9["Job Store"]
    end
    C1 --> S1
    S1 --> S4
    S4 --> C2
    C2 --> C3
    C3 --> C4
    C4 --> C5
    C5 --> S2
    S2 --> S5
    S5 --> S4
    S5 --> S6
    S6 --> S7
    S7 --> S8
    S8 --> S9
    C6 --> S3
    S3 --> S9
    S9 --> C7
```
A more detailed description is available in docs/architecture/architecture.md.
Create a virtual environment:

```
python -m venv .venv
```

Activate it:

Windows PowerShell:

```
.venv\Scripts\Activate.ps1
```

Linux / macOS:

```
source .venv/bin/activate
```

Install dependencies:

```
pip install -r requirements.txt
```

Run the server:

```
uvicorn server.app.main:app --reload
```

The server will start at http://127.0.0.1:8000.

Run the test suite:

```
pytest
```

For live integration tests:

```
pytest tests/integration -v
```

The end-to-end inference workflow:

- Client calls /models
- Client selects a supported model and reads its encryption requirements
- Client constructs a compatible CKKS session locally
- Client encrypts input features locally
- Client submits an inference request to /infer
- Server validates the request envelope, model identity, constraints, and ciphertext compatibility
- Server performs homomorphic evaluation
- Server stores the encrypted result in job state
- Client retrieves the encrypted result and decrypts locally
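The workflow above can be sketched against a stubbed transport; the class and function names are hypothetical and do not reflect the actual `heapi_client` SDK surface:

```python
# Illustrative end-to-end flow with a stubbed server. Names are hypothetical
# and do not reflect the actual heapi_client SDK surface.

class StubServer:
    """Stands in for the gateway: stores one job per submitted ciphertext."""
    def __init__(self):
        self.jobs = {}

    def infer(self, model_id, ciphertext):
        job_id = f"job-{len(self.jobs)}"
        # Pretend homomorphic evaluation happened; echo a tagged result.
        self.jobs[job_id] = {"status": "done", "result": f"enc({ciphertext})"}
        return job_id

    def get_job(self, job_id):
        return self.jobs[job_id]

def run_round_trip(server, model_id, features):
    ciphertext = f"ct[{features}]"       # stand-in for local CKKS encryption
    job_id = server.infer(model_id, ciphertext)
    job = server.get_job(job_id)         # stand-in for polling GET /jobs/{id}
    assert job["status"] == "done"
    return job["result"]                 # a real client would decrypt locally here

result = run_round_trip(StubServer(), "logistic_v1", [0.1] * 8)
print(result)
```

The shape to notice is that plaintext features and decryption never cross the `server` boundary; the stub only ever sees the opaque `ciphertext` string.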
```text
benchmarks/
├── benchmark_inference.py        Benchmark harness
├── generate_report_assets.py     Static chart generator
└── results/                      Raw benchmark outputs
docs/
├── api/examples/                 Example protocol payloads
├── api.md                        Human-readable protocol description
├── architecture/architecture.md  Detailed architecture notes
├── benchmarking.md               Benchmark methodology + results
└── assets/                       Generated charts + diagrams
schemas/                          JSON Schemas (requests/responses/errors)
openapi.yaml                      OpenAPI 3.1 protocol definition
server/
├── app/
│   └── routes/                   FastAPI route handlers
├── core/
│   ├── crypto/
│   │   └── crypto_backends/
│   │       └── ckks_pyfhel/      Pyfhel CKKS backend implementation
│   ├── he_execution/             Homomorphic model execution
│   ├── plain_execution/          Plaintext execution (testing/debug)
│   ├── jobs/                     Job lifecycle + queue management
│   ├── model_registry/           Model metadata loading/validation
│   ├── protocol/                 Envelope + schema validation
│   └── security/                 Rate limiting + tenant helpers
client/
├── examples/                     Example client usage
└── src/heapi_client/
    ├── api.py                    Low-level HTTP wrapper
    ├── client.py                 High-level SDK entry point
    ├── discovery.py              Model discovery client
    ├── infer.py                  Inference submission client
    ├── jobs.py                   Job polling/waiting logic
    └── ckks/                     CKKS session + wire helpers
tests/
├── sdk/                          SDK/unit tests
├── server/                       Server/unit + route tests
├── integration/                  End-to-end protocol tests
└── plain/                        Plain execution tests
notebooks/                        Demo + E2E walkthroughs
```
- Formal JSON Schemas and OpenAPI protocol contract
- Encrypted inference route with ciphertext validation
- Plaintext baseline inference route
- Python client-side encryption/decryption workflow
- Plaintext vs encrypted agreement tests
- Reproducible benchmark harness and result artifacts
- Benchmark report and static visual results page
The normative protocol artifacts are:
- JSON Schemas
- openapi.yaml
- documented invariants and error semantics
These define the wire contract.
The reference backend is included to demonstrate one valid implementation of that contract.
The repository currently includes:
- a versioned encrypted inference protocol
- JSON Schemas and an OpenAPI 3.1 contract
- a Python reference gateway/server
- a Python client SDK
- a CKKS reference backend using Pyfhel
- a plaintext baseline inference path
- plaintext vs encrypted agreement testing
- reproducible benchmark tooling and result documentation
A full live round-trip is working:
discover model → build CKKS session → encrypt locally → submit ciphertext → validate and evaluate server-side → retrieve ciphertext result → decrypt locally
The repository also supports side-by-side plaintext vs encrypted benchmarking for the reference model.
The intended security posture is:
- plaintext inputs remain client-side
- secret decryption material remains client-side
- server operates only on ciphertexts
- malformed or incompatible ciphertexts should be rejected before evaluation
- model requirements are explicit in metadata rather than implied
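The "reject before evaluation" posture can be illustrated with a minimal structural check; the size bound and error codes are assumptions, not the reference implementation's actual rules:

```python
# Illustrative pre-evaluation check: reject structurally malformed ciphertext
# payloads before doing any homomorphic work. Bound and codes are assumptions.

import base64
from typing import Optional

def check_ciphertext(ct_b64: str, max_bytes: int = 16_000_000) -> Optional[str]:
    """Return an error code, or None when the payload passes basic checks."""
    try:
        raw = base64.b64decode(ct_b64, validate=True)
    except Exception:
        return "invalid_base64"
    if not raw:
        return "empty_ciphertext"
    if len(raw) > max_bytes:
        return "ciphertext_too_large"
    return None

print(check_ciphertext("not base64!!"))                            # invalid_base64
print(check_ciphertext(base64.b64encode(b"\x01" * 64).decode()))   # None
```

Cheap structural checks like these run before the CKKS backend is touched, so malformed input fails fast instead of surfacing as an evaluation error.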
This repository is not a hardened production deployment guide. See docs/threat-model.md for the current threat model and assumptions.
This repository does not aim to:
- provide a production-ready serving platform
- train or fine-tune machine learning models
- handle key generation UX or key distribution workflows
- expose cryptographic internals such as noise budgets to clients
- guarantee exact decrypted numeric equality
Approximation error is expected under CKKS and is not treated as a protocol failure.
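An agreement test in this spirit compares baseline and decrypted outputs under a tolerance rather than exact equality; the tolerance below is an illustrative choice informed by the measured max absolute error (6.82e-6):

```python
# Illustrative agreement check: CKKS results are approximate, so compare
# against the plaintext baseline with a tolerance, not exact equality.

import math

def outputs_agree(plain, decrypted, abs_tol=1e-4):
    """True when every decrypted value is within abs_tol of the baseline."""
    return all(math.isclose(p, d, abs_tol=abs_tol, rel_tol=0.0)
               for p, d in zip(plain, decrypted))

plain_out = [0.8231474]       # hypothetical plaintext model output
ckks_out = [0.8231409]        # differs by ~6.5e-6, within CKKS expectations
print(outputs_agree(plain_out, ckks_out))   # True
print(outputs_agree(plain_out, [0.83]))     # False
```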
An implementation is considered protocol-compliant if it:
- accepts and emits payloads matching the published schemas
- implements the documented endpoints and response shapes
- preserves required validation semantics
- preserves documented error behavior at the protocol layer
Implementation details such as scheduling, persistence, or execution strategy are non-normative unless explicitly documented as part of the protocol.
The major next-step areas are:
- stronger adversarial ciphertext hardening
- refinement of synchronous versus job-based execution semantics
- production-oriented persistence and queueing
The protocol uses semantic versioning at the API/protocol level.
- breaking changes belong in a new major version
- minor versions should preserve the documented wire contract unless explicitly versioned otherwise
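A minimal sketch of that versioning rule, assuming a plain MAJOR.MINOR.PATCH version string:

```python
# Illustrative protocol-version compatibility check under semantic versioning:
# same major version is compatible; a different major version is breaking.

def compatible(client_version: str, server_version: str) -> bool:
    """Compare only the major component of 'MAJOR.MINOR.PATCH' strings."""
    client_major = int(client_version.split(".")[0])
    server_major = int(server_version.split(".")[0])
    return client_major == server_major

print(compatible("1.2.0", "1.4.1"))  # True: minor bumps preserve the contract
print(compatible("1.2.0", "2.0.0"))  # False: major bump signals a breaking change
```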
Licensed under the Apache License, Version 2.0.