devopsdefender · posix4e · Apr 18, 2026 · Apr 18, 2026
diff --git a/apps/README.md b/apps/README.md
@@ -1,6 +1,15 @@
-# apps/ — workload specs
+# apps/ — worked example of a DD agent VM
 
-This directory is DD's canonical reference for **how to deploy a workload**. Every directory here is one workload — a process easyenclave runs inside a TDX-sealed VM. The specs are both the live deployment configuration and the worked example for operators writing their own.
+This directory is **a worked example**, not a bundle DD ships to users. Every
+directory here is one easyenclave workload. Together they describe a complete
+DD agent VM: the minimum infra to boot podman, run one demo container
+(`web-nvidia-smi`), register with a control plane, and expose the demo on a
+stable hostname.
+
+The goal is to be the shortest legible "agent VM from scratch" that you can
+copy and adapt. For orchestrating many workloads, assembling them from
+templates, and the run / teardown lifecycle, see
+[slopandmop](https://github.com/slopandmop/slopandmop).
 
 ## Layout
 
@@ -14,7 +23,8 @@ apps/
 
 ## What a workload looks like
 
-A **workload** is a JSON object consumed by easyenclave's `DeployRequest` (see `src/easyenclave/src/workload.rs`). Minimum shape:
+A **workload** is a JSON object consumed by easyenclave's `DeployRequest` (see
+`src/easyenclave/src/workload.rs`). Minimum shape:
 
 ```json
 {
@@ -23,7 +33,9 @@ A **workload** is a JSON object consumed by easyenclave's `DeployRequest` (see `
 }
 ```
 
-Add `github_release` to fetch a binary asset directly from a GitHub release — no OCI registry, no Dockerfile. The asset lands in `/var/lib/easyenclave/bin/` and is spawned by `cmd`:
+Add `github_release` to fetch a binary asset directly from a GitHub release —
+no OCI registry, no Dockerfile. The asset lands in `/var/lib/easyenclave/bin/`
+and is spawned by `cmd`:
 
 ```json
 {
@@ -44,15 +56,41 @@ Add `env` to inject config:
 }
 ```
 
+Add `expose` to ask DD to route a public hostname to a workload's port:
+
+```json
+{
+  "app_name": "web-nvidia-smi",
+  "expose": { "hostname_label": "gpu", "port": 8081 },
+  "cmd": [...]
+}
+```
+
+When `apps/_infra/local-agents.sh` builds an agent's config image, it collects
+every `expose` entry from the baked boot workloads into `DD_EXTRA_INGRESS`.
+dd-agent forwards the entries on `/register`, and the CP prepends them to the
+agent's cloudflared tunnel ingress. A workload declaring
+`{"hostname_label": "gpu", "port": 8081}` becomes reachable at
+`gpu.<agent-host>`, in addition to the default dashboard at `<agent-host>`.
+easyenclave itself ignores the field; it's a DD-level hint about tunnel routing.
+
+Per-workload ingress is **boot-time only** today. Workloads POSTed later via
+`/deploy` are not auto-exposed, so declare `expose` on the boot workloads in
+this tree.
+
 ## Templates
 
 Files ending in `.json.tmpl` carry `${VAR}` placeholders. At bake time:
 
-1. `envsubst` substitutes every uppercase `${VAR}` that appears in the template using the caller's environment.
-2. `jq` drops env-array entries whose value ended up empty (so you can make OAuth creds / optional secrets conditional by just leaving them unset).
+1. `envsubst` substitutes every uppercase `${VAR}` that appears in the
+   template using the caller's environment.
+2. `jq` drops env-array entries whose value ended up empty (so you can make
+   OAuth creds / optional secrets conditional by just leaving them unset).
 3. The result is a plain `workload.json` ready for EE.
 
-Only uppercase placeholders get substituted — shell locals like `$i` or `$((n+1))` inside `cmd` strings are left alone. The bake helper is duplicated inline in two places so both lifecycle points behave identically:
+Only uppercase placeholders get substituted — shell locals like `$i` or
+`$((n+1))` inside `cmd` strings are left alone. The bake helper is duplicated
+inline in two places so both lifecycle points behave identically:
 
 - `.github/workflows/deploy-cp.yml` (CI, for CP workloads)
 - `apps/_infra/local-agents.sh` (tdx2 host, for agent VMs)
@@ -62,26 +100,30 @@ Only uppercase placeholders get substituted — shell locals like `$i` or `$((n+
 | workload | CP VM | agent VM (preview) | agent VM (prod) |
 |---|---|---|---|
 | `cloudflared` | ✅ | ✅ | ✅ |
-| `dd-management` | ✅ | | |
 | `dd-agent` | | ✅ | ✅ |
-| `mount-models` | | ✅ | ✅ |
+| `dd-management` | ✅ | | |
 | `nv` | | | ✅ (GPU insmod) |
 | `podman-static` | | ✅ | ✅ |
 | `podman-bootstrap` | | ✅ | ✅ |
-| `ollama` | | ✅ (CPU, preview.json) | ✅ (GPU, prod.json) |
-| `openclaw` | | ✅ (qwen2.5:0.5b) | ✅ (qwen2.5:7b) |
+| `web-nvidia-smi` | | | ✅ (`gpu.<agent-host>`) |
 
-CP stays slim: just `cloudflared` + `dd-management`. Containerised LLM serving lives on agent VMs where the `vdc` ext4 disk holds models + image storage.
+CP stays slim: just `cloudflared` + `dd-management`. Preview agent VMs run a
+bare agent + podman for CI to prove registration end-to-end. Prod agent VMs
+add the GPU insmod and the `web-nvidia-smi` demo on `gpu.<agent-host>`.
 
 ## Ordering
 
-EasyEnclave spawns boot workloads concurrently — there's no declared dependency graph. Dependents self-sequence by polling for their prerequisites. Worked example from this tree:
+EasyEnclave spawns boot workloads concurrently — there's no declared
+dependency graph. Dependents self-sequence by polling for their prerequisites.
+Worked examples from this tree:
 
-- `podman-bootstrap` waits for `podman-static`'s tarball (`until [ -x $SRC/usr/local/bin/podman ]; do sleep 1; done`).
-- `ollama`'s cmd waits for the wrapper (`until [ -x /var/lib/easyenclave/bin/podman ]; do sleep 2; done`).
-- `openclaw`'s cmd waits for ollama's HTTP endpoint (`until wget -q -O- http://127.0.0.1:11434/api/tags; do sleep 5; done`) before pulling the model and launching the gateway.
+- `podman-bootstrap` waits for `podman-static`'s tarball
+  (`until [ -x $SRC/usr/local/bin/podman ]; do sleep 1; done`).
+- `web-nvidia-smi`'s cmd waits for the wrapper
+  (`until [ -x /var/lib/easyenclave/bin/podman ]; do sleep 2; done`).
 
-Costs seconds of wasted polling at boot; easy to reason about; no workload-runner changes needed.
+This costs seconds of polling at boot, but it is easy to reason about and
+needs no workload-runner changes.
 
 ## Deploying your own
 
@@ -91,9 +133,12 @@ Costs seconds of wasted polling at boot; easy to reason about; no workload-runne
    $EDITOR apps/myapp/workload.json
    ```
 2. Decide where it runs:
-   - **CP VM**: add a `bake apps/myapp/workload.json` line to the workload-building `run:` step in `.github/workflows/deploy-cp.yml`.
-   - **Agent VM**: add the same call to `apps/_infra/local-agents.sh` in `build_config_iso()`.
-   - **Ad-hoc, runtime-only**: POST the baked JSON to `/deploy` on a running agent:
+   - **CP VM**: add a `bake apps/myapp/workload.json` line to the
+     workload-building `run:` step in `.github/workflows/deploy-cp.yml`.
+   - **Agent VM**: add the same call to `apps/_infra/local-agents.sh` in
+     `build_config_iso()`.
+   - **Ad-hoc, runtime-only**: POST the baked JSON to `/deploy` on a running
+     agent:
      ```
      curl -H "Authorization: Bearer $DD_PAT" \
           -H "Content-Type: application/json" \
@@ -103,6 +148,17 @@ Costs seconds of wasted polling at boot; easy to reason about; no workload-runne
 
 ## Reference
 
-- Schema source of truth: [`src/easyenclave/src/workload.rs`](../src/easyenclave/src/workload.rs) — the `DeployRequest` struct EE deserializes on `/deploy`.
-- CP deploy caller: [`.github/workflows/deploy-cp.yml`](../.github/workflows/deploy-cp.yml) — inline `bake()` + CP workload set.
-- Agent VM builder: [`apps/_infra/local-agents.sh`](_infra/local-agents.sh) — inline `bake()` + agent workload set per kind.
+- Schema source of truth:
+  [`src/easyenclave/src/workload.rs`](../src/easyenclave/src/workload.rs) —
+  the `DeployRequest` struct EE deserializes on `/deploy`. `expose` is not in
+  this struct; EE silently ignores it. DD reads it at the bake + register
+  boundary.
+- CP deploy caller:
+  [`.github/workflows/deploy-cp.yml`](../.github/workflows/deploy-cp.yml) —
+  inline `bake()` + CP workload set.
+- Agent VM builder:
+  [`apps/_infra/local-agents.sh`](_infra/local-agents.sh) — inline `bake()` +
+  agent workload set per kind.
+- Ingress plumbing: `src/cf.rs` (`create()` takes per-workload ingress),
+  `src/cp.rs` (`register` handler accepts `extra_ingress`), `src/agent.rs`
+  (reads `DD_EXTRA_INGRESS`, forwards on `/register`).
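
The self-sequencing pattern described in the README's Ordering section above can be tried outside a VM. This is a minimal sketch, not the workloads' actual `cmd` strings: the temporary directory, the `prerequisite` and `dependent` functions, and the `wrapper-ready` output are hypothetical stand-ins for `podman-bootstrap` installing the wrapper and `web-nvidia-smi` waiting for it.

```bash
#!/usr/bin/env bash
# Sketch of two concurrently started "workloads" that self-sequence by polling.
# All names and paths here are illustrative assumptions, not taken from the repo.
set -euo pipefail

workdir=$(mktemp -d)
marker="$workdir/podman"   # stand-in for /var/lib/easyenclave/bin/podman

# Prerequisite workload: installs an executable after a short delay.
prerequisite() {
  sleep 1
  printf '#!/bin/sh\necho wrapper-ready\n' > "$marker.tmp"
  chmod +x "$marker.tmp"
  mv "$marker.tmp" "$marker"   # rename last, so the poller never sees a partial file
}

# Dependent workload: polls until the executable exists, then runs it.
dependent() {
  until [ -x "$marker" ]; do sleep 1; done
  "$marker"
}

prerequisite &             # started first here, but the order does not matter
result=$(dependent)
wait
echo "$result"
rm -rf "$workdir"
```

Writing to a temporary name and renaming at the end is a design choice of this sketch: an `-x` test on a file that is still being written could otherwise succeed too early.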
diff --git a/apps/_infra/local-agents.sh b/apps/_infra/local-agents.sh
@@ -1,8 +1,14 @@
 #!/usr/bin/env bash
 # local-agents.sh — define two local TDX agent VMs on this host:
 #
-#   dd-local-preview : no GPU, registers with the PR-preview CP
-#   dd-local-prod    : H100 passthrough, registers with production
+#   dd-local-preview : no GPU, registers with the PR-preview CP. Bare
+#                      agent + podman — no demo workload — so the release
+#                      pipeline can prove registration + tunnel end-to-end
+#                      against per-PR CPs without needing GPU hardware.
+#   dd-local-prod    : H100 passthrough, registers with production. Boots
+#                      the web-nvidia-smi demo workload + declares a
+#                      `gpu.<agent-host>` ingress so the output is reachable
+#                      from the public internet.
 #
 # Both reuse the existing easyenclave base qcow2 via copy-on-write
 # overlays; each gets its own config.iso baking in DD_CP_URL + DD_PAT +
@@ -46,9 +52,7 @@ BASE_DOMAIN="easyenclave-local"
 #
 # envsubst is restricted to the ALL-CAPS `${VAR}` references that
 # appear in the template itself. Lowercase `$i`, `${i}`, and bare
-# `$((…))` arithmetic inside shell cmd strings are left alone —
-# otherwise envsubst would eat shell locals in openclaw's `until`
-# loop and produce broken scripts.
+# `$((…))` arithmetic inside shell cmd strings are left alone.
 bake() {
   case "$1" in
     *.json.tmpl)
@@ -67,6 +71,13 @@ bake() {
   esac
 }
 
+# Extract `expose` entries from a stream of baked workloads and emit
+# them as a compact JSON array of `{hostname_label, port}` — the
+# shape dd-agent expects in $DD_EXTRA_INGRESS.
+extract_extra_ingress() {
+  jq -cs '[.[] | select(.expose) | .expose]'
+}
+
 [ -r "$BASE" ] || { echo "missing $BASE" >&2; exit 1; }
 virsh dominfo "$BASE_DOMAIN" >/dev/null 2>&1 || {
   echo "base libvirt domain '$BASE_DOMAIN' not defined — rebuild the EE image first" >&2
@@ -90,44 +101,41 @@ build_config_iso() {
   tmp=$(mktemp -d)
   trap "rm -rf $tmp" RETURN
 
-  # Boot workload chain (EE spawns concurrently; each uses `until`
-  # loops to self-sequence):
-  #   nv             — insmod nvidia driver (prod only, first so the
-  #                    device nodes exist by the time ollama runs)
-  #   mount-models   — mount /dev/vdc at /var/lib/easyenclave/ollama
-  #   podman-static  — fetch the podman binary tarball into /var/lib/easyenclave/bin
-  #   podman-bootstrap — stage binaries, write containers.conf + policy.json,
-  #                    install /var/lib/easyenclave/bin/podman as the wrapper
-  #                    (symlinked from dd-podman for back-compat)
-  #   ollama         — run docker.io/ollama/ollama:latest serve via the wrapper
-  #   openclaw       — wait for ollama, pull $MODEL, launch openclaw gateway
-  #   cloudflared    — fetch cloudflared binary (dd-register spawns it)
-  #   dd-agent       — run devopsdefender agent, register with CP, serve workloads
-  #
-  # Prod gets the GPU model; preview gets the tiny CPU-friendly one.
-  local model ollama_spec
-  if [ "$with_gpu" = "yes" ]; then
-    model="qwen2.5:7b"
-    ollama_spec="$REPO_ROOT/apps/ollama/workload.prod.json"
-  else
-    model="qwen2.5:0.5b"
-    ollama_spec="$REPO_ROOT/apps/ollama/workload.preview.json"
-  fi
-
-  local workloads
-  workloads=$({
+  # Boot workload chain (EE spawns concurrently; dependents self-sequence
+  # via `until` loops):
+  #   nv             — insmod nvidia driver (prod only, first so device
+  #                    nodes exist by the time web-nvidia-smi runs)
+  #   podman-static  — fetch the podman tarball into /var/lib/easyenclave/bin
+  #   podman-bootstrap — stage binaries, install /var/lib/easyenclave/bin/podman
+  #                    wrapper + containers.conf + policy.json
+  #   web-nvidia-smi — prod only. Run nvidia/cuda container, serve
+  #                    `nvidia-smi` output on :8081.
+  #   cloudflared    — fetch binary (agent spawns the tunnel process)
+  #   dd-agent       — register with CP, serve workloads. Requests the
+  #                    gpu.<agent-host> ingress via $DD_EXTRA_INGRESS,
+  #                    computed below from `expose` entries on the
+  #                    baked workloads.
+  local bare_workloads
+  bare_workloads=$({
     [ "$with_gpu" = "yes" ] && bake "$REPO_ROOT/apps/nv/workload.json"
-    bake "$REPO_ROOT/apps/mount-models/workload.json"
     bake "$REPO_ROOT/apps/podman-static/workload.json"
     bake "$REPO_ROOT/apps/podman-bootstrap/workload.json"
-    bake "$ollama_spec"
-    MODEL="$model" bake "$REPO_ROOT/apps/openclaw/workload.json.tmpl"
+    [ "$with_gpu" = "yes" ] && bake "$REPO_ROOT/apps/web-nvidia-smi/workload.json"
     bake "$REPO_ROOT/apps/cloudflared/workload.json"
+  })
+
+  local extra_ingress
+  extra_ingress=$(echo "$bare_workloads" | extract_extra_ingress)
+
+  local workloads
+  workloads=$({
+    echo "$bare_workloads"
     DD_CP_URL="$cp" \
       DD_PAT="$DD_PAT" \
       DD_ITA_API_KEY="$DD_ITA_API_KEY" \
       DD_ENV="$env" \
       DD_VM_NAME="dd-local-$name" \
+      DD_EXTRA_INGRESS="$extra_ingress" \
       bake "$REPO_ROOT/apps/dd-agent/workload.json.tmpl"
   } | jq -cs '.')
 
@@ -139,7 +147,7 @@ build_config_iso() {
   # ext4 — EE rootfs has no iso9660 module.
   truncate -s 4M "$out"
   mkfs.ext4 -q -d "$tmp" "$out"
-  echo "  wrote $out (env=$env, gpu=$with_gpu, model=$model)"
+  echo "  wrote $out (env=$env, gpu=$with_gpu, extra_ingress=$extra_ingress)"
 }
 
 build_overlay() {
@@ -154,24 +162,6 @@ build_overlay() {
   echo "  wrote $overlay (backing $BASE)"
 }
 
-# Persistent models disk — survives VM relaunch, so ollama doesn't
-# re-download the model each time. Pre-formatted ext4 on the host;
-# the guest just mounts it.
-build_models_disk() {
-  # $1=name, $2=size_gb
-  local name="$1" size_gb="$2"
-  local models="$IMG_DIR/dd-local-$name-models.qcow2"
-  if [ -f "$models" ]; then
-    echo "  models disk $models already exists (reusing)"
-    return
-  fi
-  qemu-img create -q -f raw "$models.raw" "${size_gb}G"
-  mkfs.ext4 -q -F "$models.raw"
-  qemu-img convert -q -f raw -O qcow2 "$models.raw" "$models"
-  rm -f "$models.raw"
-  echo "  wrote $models (${size_gb}G ext4)"
-}
-
 render_domain_xml() {
   # $1=name, $2=with_gpu (yes/no)
   local name="$1" with_gpu="$2"
@@ -193,22 +183,11 @@ render_domain_xml() {
   sed -i "s|/var/log/ee-local\\.log|/var/log/ee-local-$name.log|g" "$out"
 
   # Size the VM for the workload. Base easyenclave-local is 4 GiB /
-  # 2 vCPU — fine for a bare agent, undersized for podman + ollama
-  # + the openclaw gateway on a 900 MB container image. Host has
-  # 243 GiB / 64 cores, so we can be generous.
-  #
-  #   prod:    32 GiB / 16 vCPU  (GPU handles the model; host RAM
-  #                               for podman, openclaw, image pull
-  #                               scratch, model load spill)
-  #   preview: 16 GiB / 8 vCPU   (CPU-only inference; qwen2.5:0.5b
-  #                               + 64k ctx + gateway)
-  if [ "$with_gpu" = "yes" ]; then
-    local mem_kib=33554432  # 32 GiB
-    local vcpus=16
-  else
-    local mem_kib=16777216  # 16 GiB
-    local vcpus=8
-  fi
+  # 2 vCPU — fine for a bare agent. The demo workloads are modest
+  # (web-nvidia-smi just runs nvidia-smi on demand + one apt-get at
+  # boot for netcat). Host has 243 GiB / 64 cores.
+  local mem_kib=8388608   # 8 GiB
+  local vcpus=8
   sed -i -E "s|<memory unit='KiB'>[0-9]+</memory>|<memory unit='KiB'>$mem_kib</memory>|" "$out"
   sed -i -E "s|<currentMemory unit='KiB'>[0-9]+</currentMemory>|<currentMemory unit='KiB'>$mem_kib</currentMemory>|" "$out"
   sed -i -E "s|<vcpu placement='static'>[0-9]+</vcpu>|<vcpu placement='static'>$vcpus</vcpu>|" "$out"
@@ -231,20 +210,6 @@ render_domain_xml() {
          /<\/hostdev>/{skip=0}' "$out" > "$out.tmp" && mv "$out.tmp" "$out"
   fi
 
-  # Add a persistent models disk as vdc. EE will mount it at
-  # /var/lib/easyenclave/ollama via the mount-models boot workload.
-  local models="$IMG_DIR/dd-local-$name-models.qcow2"
-  local disk_block="    <disk type='file' device='disk'>
-      <driver name='qemu' type='qcow2'/>
-      <source file='$models'/>
-      <target dev='vdc' bus='virtio'/>
-    </disk>"
-  # Insert before </devices>.
-  awk -v block="$disk_block" '
-    /<\/devices>/ { print block }
-    { print }
-  ' "$out" > "$out.tmp" && mv "$out.tmp" "$out"
-
   echo "$out"
 }
 
@@ -256,12 +221,6 @@ define_agent() {
 
   echo "== dd-local-$name → $cp (env=$env_label, gpu=$with_gpu) =="
   build_overlay "$name"
-  # Models disk: prod holds the GPU model (few GB), preview holds the small CPU one.
-  if [ "$with_gpu" = "yes" ]; then
-    build_models_disk "$name" 40
-  else
-    build_models_disk "$name" 10
-  fi
   build_config_iso "$name" "$cp" "$env_label" "$with_gpu"
   local xml
   xml=$(render_domain_xml "$name" "$with_gpu")

diff --git a/apps/dd-agent/workload.json.tmpl b/apps/dd-agent/workload.json.tmpl
@@ -17,6 +17,7 @@
     "DD_OWNER=devopsdefender",
     "DD_ENV=${DD_ENV}",
     "DD_VM_NAME=${DD_VM_NAME}",
-    "DD_PORT=8080"
+    "DD_PORT=8080",
+    "DD_EXTRA_INGRESS=${DD_EXTRA_INGRESS}"
   ]
 }
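
One point worth checking in the template change above: `DD_EXTRA_INGRESS` carries compact JSON, which contains double quotes, while the template places it inside a JSON string. If `bake()` fills the template by plain `envsubst` substitution (its body is not shown in this diff, so this is an assumption), the value would need escaping before it is substituted. A minimal sketch, where `json_string_escape` is a hypothetical helper and not part of the repository:

```bash
#!/usr/bin/env bash
# Sketch: escape a compact JSON value so it can sit inside a JSON string literal.
# Assumes plain text substitution with no escaping; bake() is not shown in the diff.
set -euo pipefail

# Hypothetical helper: escape backslashes first, then double quotes.
json_string_escape() {
  sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Same shape that extract_extra_ingress emits for one exposed workload.
extra_ingress='[{"hostname_label":"gpu","port":8081}]'
escaped=$(printf '%s' "$extra_ingress" | json_string_escape)

# The env-array entry as it would then appear in the baked workload.
entry="\"DD_EXTRA_INGRESS=${escaped}\""
echo "$entry"
```

With an empty `[]` (the preview case) the value contains no quotes, so the question only arises once a workload declares `expose`.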
diff --git a/apps/mount-models/workload.json b/apps/mount-models/workload.json
diff --git a/apps/ollama/workload.preview.json b/apps/ollama/workload.preview.json