Commit e158e0a

kamir and claude authored
docs: add LFS Proxy, Helm, and SDK pages to kafscale.io (#129)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 6ba0d78 commit e158e0a

5 files changed

Lines changed: 688 additions & 0 deletions

_docs/index.md

Lines changed: 7 additions & 0 deletions
@@ -39,6 +39,13 @@ KafScale keeps brokers stateless and moves processing into add-on services, so y
 - [Storage format](/storage-format/)
 - [S3 health](/s3-health/)

+## Large File Support (LFS)
+
+- [LFS Proxy](/lfs-proxy/) — Claim-check proxy for large binary payloads via S3 + Kafka
+- [LFS Helm Deployment](/lfs-helm/) — Kubernetes deployment and configuration
+- [LFS Client SDKs](/lfs-sdks/) — Java, Python, JS, and browser SDKs
+- [LFS Demos](/lfs-demos/) — Runnable demos from local IDoc processing to full Kubernetes pipelines
+
 ## Processors

 - [Iceberg Processor](/processors/iceberg/)

_docs/lfs-demos.md

Lines changed: 238 additions & 0 deletions
@@ -0,0 +1,238 @@
---
layout: doc
title: LFS Demos
description: Runnable demos for the KafScale Large File Support system — from local IDoc processing to full Kubernetes pipelines.
permalink: /lfs-demos/
nav_title: LFS Demos
nav_order: 4
nav_group: LFS
---

<!--
Copyright 2026 Alexander Alten (novatechflow), NovaTechflow (novatechflow.com).
This project is supported and financed by Scalytics, Inc. (www.scalytics.io).

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# LFS Demos

KafScale ships with a set of runnable demos that exercise the LFS pipeline end-to-end. They range from a lightweight local demo (MinIO only, no cluster) to full Kubernetes deployments with industry-specific content-explosion patterns.

## Quick reference

| Make target | What it runs | Needs cluster? |
|---|---|---|
| `make lfs-demo` | Baseline LFS proxy flow | Yes (kind) |
| `make lfs-demo-idoc` | SAP IDoc explode via LFS | No (MinIO only) |
| `make lfs-demo-medical` | Healthcare imaging (E60) | Yes (kind) |
| `make lfs-demo-video` | Video streaming (E61) | Yes (kind) |
| `make lfs-demo-industrial` | Industrial IoT mixed payloads (E62) | Yes (kind) |
| `make e72-browser-demo` | Browser SPA uploads (E72) | Yes (kind) |

All demos are environment-driven — override any setting via env vars without touching code.

---

## Baseline LFS demo

```bash
make lfs-demo
```

The baseline demo bootstraps a kind cluster, deploys MinIO and the LFS proxy, then runs the full claim-check flow:

1. Build and load the `kafscale-lfs-proxy` image into kind
2. Deploy broker, etcd, MinIO, and LFS proxy into the `kafscale-demo` namespace
3. Create a demo topic and upload binary blobs via the LFS proxy HTTP API
4. Verify pointer envelopes arrive in Kafka and blobs exist in S3

| Variable | Default | Description |
|---|---|---|
| `LFS_DEMO_TOPIC` | `lfs-demo-topic` | Kafka topic for pointer records |
| `LFS_DEMO_BLOB_SIZE` | `524288` (512 KB) | Size of each test blob |
| `LFS_DEMO_BLOB_COUNT` | `5` | Number of blobs to upload |
| `LFS_DEMO_TIMEOUT_SEC` | `120` | Test timeout |
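
Since the demo is configured entirely through environment variables, a wrapper script can read the same settings. The following is a minimal Python sketch that uses the variable names and defaults from the table above. The `load_demo_config` helper and the key names in its return value are hypothetical and are not part of KafScale.

```python
import os

# Defaults copied from the table above; the helper itself is hypothetical.
DEFAULTS = {
    "LFS_DEMO_TOPIC": "lfs-demo-topic",
    "LFS_DEMO_BLOB_SIZE": "524288",
    "LFS_DEMO_BLOB_COUNT": "5",
    "LFS_DEMO_TIMEOUT_SEC": "120",
}


def load_demo_config(env=None):
    """Return demo settings, letting environment values override the defaults."""
    env = os.environ if env is None else env
    raw = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    return {
        "topic": raw["LFS_DEMO_TOPIC"],
        "blob_size": int(raw["LFS_DEMO_BLOB_SIZE"]),
        "blob_count": int(raw["LFS_DEMO_BLOB_COUNT"]),
        "timeout_sec": int(raw["LFS_DEMO_TIMEOUT_SEC"]),
    }


if __name__ == "__main__":
    print(load_demo_config())
```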

---

## IDoc explode demo (local, no cluster)

```bash
make lfs-demo-idoc
```

This is the fastest way to see the LFS pipeline in action. It only needs a local MinIO container — no Kubernetes cluster required.

The demo walks through the complete data flow that the [LFS Proxy](/lfs-proxy/) performs in production:

**Step 1 — Blob upload.** The demo uploads a realistic SAP ORDERS05 IDoc XML (a purchase order with 3 line items, 3 business partners, 3 dates, and 2 status records) to MinIO, simulating what the LFS proxy does when it receives a large payload.

**Step 2 — Envelope creation.** A KafScale LFS envelope is generated — the compact JSON pointer record that Kafka consumers receive instead of the raw blob:

```json
{
  "kfs_lfs": 1,
  "bucket": "kafscale",
  "key": "lfs/idoc-demo/idoc-inbound/0/0-idoc-sample.xml",
  "content_type": "application/xml",
  "size": 2706,
  "sha256": "96985f1043a285..."
}
```
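
To make the envelope concrete, here is a minimal Python sketch that derives the same six fields from a blob held in memory. It is an illustration only, not the proxy's implementation: `build_envelope` is a hypothetical helper, and the value `1` for `kfs_lfs` is simply copied from the example above because this page does not define its meaning.

```python
import hashlib
import json


def build_envelope(blob: bytes, bucket: str, key: str, content_type: str) -> str:
    """Build a JSON pointer record with the fields shown in the example above."""
    envelope = {
        "kfs_lfs": 1,  # copied from the example; semantics are not defined on this page
        "bucket": bucket,
        "key": key,
        "content_type": content_type,
        "size": len(blob),  # payload size in bytes
        "sha256": hashlib.sha256(blob).hexdigest(),  # hex digest of the raw blob
    }
    return json.dumps(envelope)


if __name__ == "__main__":
    print(build_envelope(b"<IDOC/>", "kafscale", "lfs/example/key.xml", "application/xml"))
```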

**Step 3 — Resolve and explode.** The `idoc-explode` processor reads the envelope, resolves the blob from S3 via `pkg/lfs.Resolver`, validates the SHA-256 checksum, then parses the XML and routes segments to topic-specific streams:

```
Segment routing:
  E1EDP01, E1EDP19  -> idoc-items     (order line items)
  E1EDKA1           -> idoc-partners  (business partners)
  E1STATS           -> idoc-status    (processing status)
  E1EDK03           -> idoc-dates     (dates/deadlines)
  EDI_DC40, E1EDK01 -> idoc-segments  (all raw segments)
  (root)            -> idoc-headers   (IDoc metadata)
```
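
The routing table above can be sketched as a dictionary lookup over the parsed XML. This Python version is a simplified illustration and not the `idoc-explode` processor itself: `route_segments` is a hypothetical function, it omits the `(root) -> idoc-headers` rule and the `path` field, and it treats `idoc-segments` as receiving only the two segment types listed, which may differ from the real behaviour.

```python
import xml.etree.ElementTree as ET

# Routing table transcribed from the listing above (root/header rule omitted).
SEGMENT_TOPICS = {
    "E1EDP01": "idoc-items",
    "E1EDP19": "idoc-items",
    "E1EDKA1": "idoc-partners",
    "E1STATS": "idoc-status",
    "E1EDK03": "idoc-dates",
    "EDI_DC40": "idoc-segments",
    "E1EDK01": "idoc-segments",
}


def route_segments(idoc_xml: str):
    """Yield (topic, record) pairs for every known segment, in document order."""
    root = ET.fromstring(idoc_xml)
    for segment in root.iter():
        topic = SEGMENT_TOPICS.get(segment.tag)
        if topic is None:
            continue  # unknown element or plain field, not routed in this sketch
        # Only leaf children count as fields; nested segments are routed separately.
        fields = {child.tag: (child.text or "") for child in segment if len(child) == 0}
        yield topic, {"name": segment.tag, "fields": fields}
```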

**Step 4 — Topic streams.** Each output file maps to a Kafka topic. Routed records carry their child fields as a self-contained JSON object:

```json
{"name":"E1EDP01","path":"IDOC/E1EDP01","fields":{
  "POSEX":"000010","MATNR":"MAT-HYD-4200",
  "ARKTX":"Hydraulic Pump HP-4200","MENGE":"5",
  "NETWR":"12500.00","WAERS":"EUR"}}
```

```json
{"name":"E1EDKA1","path":"IDOC/E1EDKA1","fields":{
  "PARVW":"AG","NAME1":"GlobalParts AG",
  "STRAS":"Industriestr. 42","ORT01":"Stuttgart",
  "LAND1":"DE"}}
```
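
Because each routed record is self-contained, a downstream consumer can parse it without touching the original blob. The sketch below, in Python, turns an `E1EDP01` record like the one above into typed values. `line_item_summary` and its output keys are hypothetical names chosen for illustration, and the field meanings follow common SAP usage (`MENGE` as quantity, `NETWR` as net value, `WAERS` as currency).

```python
import json
from decimal import Decimal


def line_item_summary(record_json: str) -> dict:
    """Convert a routed E1EDP01 record into typed line-item values."""
    record = json.loads(record_json)
    if record["name"] != "E1EDP01":
        raise ValueError(f"expected an E1EDP01 record, got {record['name']}")
    fields = record["fields"]
    return {
        "position": fields["POSEX"],
        "material": fields["MATNR"],
        "description": fields["ARKTX"],
        "quantity": Decimal(fields["MENGE"]),   # numeric values arrive as strings
        "net_value": Decimal(fields["NETWR"]),
        "currency": fields["WAERS"],
    }
```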

The sample IDoc produces 94 records across 6 topics.

---

## Industry demos

The three industry demos build on the baseline flow and demonstrate the **content explosion pattern** — a single large upload that produces multiple derived topic streams for downstream analytics.

### Medical imaging (E60)

```bash
make lfs-demo-medical
```

Simulates a radiology department uploading DICOM imaging files (CT/MRI scans, whole-slide pathology images). A single scan upload explodes into:

| Topic | Content | LFS blob? |
|---|---|---|
| `medical-images` | Original DICOM blob pointer | Yes |
| `medical-metadata` | Patient ID, modality, study date | No |
| `medical-audit` | Access timestamps, user actions | No |

Demonstrates checksum integrity validation and audit trail logging relevant to healthcare compliance scenarios.
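
Checksum validation of this kind can be sketched in a few lines of Python. The example assumes the pointer record carries the `size` and `sha256` fields shown in the IDoc demo envelope; `verify_blob` is a hypothetical helper and does not reflect how the demo or `pkg/lfs.Resolver` is actually implemented.

```python
import hashlib
import hmac
import json


def verify_blob(envelope_json: str, blob: bytes) -> bool:
    """Return True only if the blob matches the envelope's size and SHA-256."""
    envelope = json.loads(envelope_json)
    if envelope["size"] != len(blob):
        return False  # cheap length check before hashing
    actual = hashlib.sha256(blob).hexdigest()
    # Compare hex digests case-insensitively and in constant time.
    return hmac.compare_digest(actual, envelope["sha256"].lower())
```

A consumer would call `verify_blob(envelope_json, blob)` after fetching the object from S3 and reject the record when it returns `False`.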

### Video streaming (E61)

```bash
make lfs-demo-video
```

Simulates a media platform ingesting large video files. A single video upload explodes into:

| Topic | Content | LFS blob? |
|---|---|---|
| `video-raw` | Original video blob pointer | Yes |
| `video-metadata` | Codec, duration, resolution, bitrate | No |
| `video-frames` | Keyframe timestamps and S3 references | No |
| `video-ai-tags` | Scene detection, object labels | No |

Uses HTTP streaming upload for memory-efficient transfer of multi-gigabyte files.
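
The memory saving comes from reading and sending the file in fixed-size chunks rather than loading it whole. The Python sketch below shows only that client-side chunking, with a running SHA-256 computed along the way. `stream_chunks` and the 1 MiB chunk size are assumptions for illustration, and the proxy's endpoint and headers are deliberately left out because this page does not specify them.

```python
import hashlib
import io


def stream_chunks(fileobj, hasher, chunk_size=1024 * 1024):
    """Yield fixed-size chunks from fileobj while updating the running hash."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break  # end of file
        hasher.update(chunk)
        yield chunk


if __name__ == "__main__":
    hasher = hashlib.sha256()
    source = io.BytesIO(b"x" * 2_500_000)  # stand-in for a large video file
    sent = sum(len(chunk) for chunk in stream_chunks(source, hasher))
    print(sent, hasher.hexdigest())
```

Many HTTP clients can send an iterator like this as a chunked request body, so peak memory stays near one chunk regardless of file size.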

### Industrial IoT (E62)

```bash
make lfs-demo-industrial
```

Simulates a factory floor with **mixed payload sizes** flowing through the same Kafka interface — small telemetry readings alongside large thermal/visual inspection images:

| Topic | Content | LFS blob? | Typical size |
|---|---|---|---|
| `sensor-telemetry` | Real-time sensor readings | No | ~1 KB |
| `inspection-images` | Thermal/visual inspection photos | Yes | ~200 MB |
| `defect-events` | Anomaly detection alerts | No | ~2 KB |
| `quality-reports` | Aggregated quality metrics | No | ~10 KB |

Demonstrates automatic routing based on the `LFS_BLOB` header — small payloads flow through Kafka directly, large payloads are offloaded to S3.

---

## SDK demo applications

### E70 — Java SDK

Located in `examples/E70_java-lfs-sdk-demo/`. Demonstrates the Java LFS producer SDK uploading files via the HTTP API, consuming pointer records from Kafka, and resolving blobs from S3.

```bash
cd examples/E70_java-lfs-sdk-demo
make run-all   # builds SDK, starts port-forwards, runs demo
```

Key env vars: `LFS_HTTP_ENDPOINT`, `LFS_TOPIC`, `KAFKA_BOOTSTRAP`, `LFS_PAYLOAD_SIZE`.

### E71 — Python SDK

Located in `examples/E71_python-lfs-sdk-demo/`. Demonstrates the Python LFS SDK with three payload size presets for video upload testing:

```bash
cd examples/E71_python-lfs-sdk-demo
make run-small     # 1 MB
make run-midsize   # 50 MB
make run-large     # 200 MB
```

Requires `pip install -e lfs-client-sdk/python` for the SDK.

### E72 — Browser SDK

Located in `examples/E72_browser-lfs-sdk-demo/`. A single-page application that uploads files directly from the browser to the LFS proxy — no backend server required.

```bash
make e72-browser-demo       # local with port-forward
make e72-browser-demo-k8s   # in-cluster deployment
```

Features drag-and-drop upload, real-time progress, SHA-256 verification, presigned URL download, and an inline video player for MP4 content.

---

## Common environment

All demos share these defaults:

| Variable | Default | Description |
|---|---|---|
| `KAFSCALE_DEMO_NAMESPACE` | `kafscale-demo` | Kubernetes namespace |
| `MINIO_BUCKET` | `kafscale` | S3 bucket |
| `MINIO_REGION` | `us-east-1` | S3 region |
| `MINIO_ROOT_USER` | `minioadmin` | MinIO credentials |
| `MINIO_ROOT_PASSWORD` | `minioadmin` | MinIO credentials |
| `LFS_PROXY_IMAGE` | `ghcr.io/kafscale/kafscale-lfs-proxy:dev` | Proxy container image |

## Related docs

- [LFS Proxy](/lfs-proxy/) — Architecture and configuration
- [LFS Helm Deployment](/lfs-helm/) — Kubernetes deployment guide
- [LFS Client SDKs](/lfs-sdks/) — SDK API reference for Go, Java, Python, JS
