74 changes: 74 additions & 0 deletions .github/workflows/ci.yml
@@ -36,6 +36,80 @@ jobs:
      - run: pnpm run build
      - run: pnpm run test

  e2e:
    runs-on: ubuntu-latest
    services:
      redis:
        image: redis:7
        ports:
          - 6390:6379
        options: >-
          --health-cmd "redis-cli ping"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
      postgres:
        image: postgres:17-alpine
        env:
          POSTGRES_USER: storage
          POSTGRES_PASSWORD: storage
          POSTGRES_DB: storage
        ports:
          - 5432:5432
        options: >-
          --health-cmd "pg_isready -U storage -d storage"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    strategy:
      matrix:
        node-version: [22, 24, 26]
    env:
      REDIS_URL: redis://127.0.0.1:6390
      PG_URL: postgresql://storage:storage@127.0.0.1:5432/storage
    steps:
      - uses: actions/checkout@v4
      - uses: pnpm/action-setup@v4
        with:
          version: latest
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
          cache: 'pnpm'
      - run: pnpm install
      - run: pnpm run test:e2e

  docker-smoke:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Bring up storage-db stack
        working-directory: examples/storage-db
        run: docker compose up --build -d --wait
      - name: Wait for coordinator to be ready
        run: |
          for i in $(seq 1 30); do
            if curl -sf http://127.0.0.1:8080/pods > /dev/null; then
              echo "coordinator ready after ${i}s"
              exit 0
            fi
            sleep 1
          done
          echo "coordinator did not become ready in 30s"
          docker compose -f examples/storage-db/docker-compose.yml logs
          exit 1
      - name: Run smoke script
        working-directory: examples/storage-db
        run: ./scripts/smoke.sh
      - name: Dump logs on failure
        if: failure()
        working-directory: examples/storage-db
        run: docker compose logs
      - name: Tear down
        if: always()
        working-directory: examples/storage-db
        run: docker compose down -v

  lint:
    runs-on: ubuntu-latest
    steps:
52 changes: 47 additions & 5 deletions coordinator-pattern.md
@@ -231,14 +231,56 @@ The coordinator picks among a destination's pod set on the request path using th

Reducing fan-out (removing a pod from a destination's set when load drops) is not a runtime concern. It is an operator action or a slow background reconciler; running it from the data path risks thrash under bursty load.

## Why Kubernetes
## Platform requirements

The architecture leans on two K8s features that some other platforms (ECS, plain VM fleets) do not directly provide:
The pattern is platform-neutral. It runs anywhere these three properties hold:

- **Headless Services**: the coordinator addresses resource pods by pod IP, read from Valkey. Platforms that route traffic through load balancers or DNS service discovery typically do not expose a stable per-pod endpoint for the coordinator to dial directly.
- **Stable pod identity** (StatefulSet-style): the identity registered into Valkey is the pod's K8s identity. Ephemeral task IPs force the heartbeat path to do more work after every restart.
1. **Each resource pod has a dialable address.** A scheme + host + port the coordinator can open a TCP connection to. Host can be a per-pod DNS name (Kubernetes Headless Service), a per-task IP (ECS `awsvpc` ENI, Nomad alloc IP), or any other stable-enough endpoint.
2. **Each resource pod can compute its own address at startup** and write it into Valkey. The pod tells the registry where it is; the registry is the discovery layer.
3. **The coordinator and all resource pods can reach Valkey** and can open TCP connections to the addresses they read from it.

The in-process co-location of caller and coordinator inside one process is a separate constraint and works on either platform. ECS can be made to work with Cloud Map for service discovery and a custom registration path, but at meaningful cost to operational simplicity.
Nothing in `@platformatic/coordinator` calls a Kubernetes API, parses downward-API files, or relies on K8s-specific behavior. The library is Valkey + HTTP + a `memberAddress` string.
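
The three properties reduce to a small contract: a member writes its own address into the registry, and the coordinator reads it back and dials it. The sketch below illustrates that contract only. It assumes an in-memory `Map` as a stand-in for Valkey, and the names `Registry`, `register`, `lookup`, and the TTL-based expiry are hypothetical, not taken from `@platformatic/coordinator`.

```typescript
// Illustrative only: an in-memory stand-in for the Valkey registry.
// Registry, register, lookup and the TTL scheme are assumptions, not the library's API.
type Entry = { address: string; expiresAt: number }

class Registry {
  private entries = new Map<string, Entry>()

  // Property 2: the member computes its own address and writes it.
  register(memberId: string, address: string, ttlMs: number, now: number): void {
    this.entries.set(memberId, { address, expiresAt: now + ttlMs })
  }

  // Property 3: the coordinator reads back an address it can then dial.
  lookup(memberId: string, now: number): string | undefined {
    const entry = this.entries.get(memberId)
    if (entry === undefined || entry.expiresAt <= now) return undefined
    return entry.address
  }
}

const registry = new Registry()
registry.register('pod-0', 'http://10.0.1.17:3000', 5_000, 0)
const live = registry.lookup('pod-0', 1_000) // still within the TTL
const expired = registry.lookup('pod-0', 6_000) // TTL elapsed without a refresh
```

Passing `now` explicitly keeps the sketch deterministic; a real member would refresh its entry on a timer.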

### Kubernetes

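The natural fit. A `StatefulSet` + Headless Service gives each pod a predictable DNS name of the form `<pod>.<service>.<namespace>.svc.cluster.local`, for example `pod-0.svc.namespace.svc.cluster.local` (Service `svc`, namespace `namespace`). The downward API exposes the pod name as `POD_NAME`, and Kubernetes expands the `$(POD_NAME)` reference to compose the address into an env var: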

```yaml
env:
  - name: POD_NAME
    valueFrom: { fieldRef: { fieldPath: metadata.name } }
  - name: MEMBER_ADDRESS
    value: "http://$(POD_NAME).svc.namespace.svc.cluster.local:3000"
```

The pod reads `MEMBER_ADDRESS` from config and passes it to `Member`.
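
A startup check on `MEMBER_ADDRESS` can catch a misconfigured manifest before the address is registered. The following is a minimal sketch, not the example's actual config code: `readMemberAddress` is a hypothetical helper, and restricting the scheme to `http`/`https` is an assumption. It relies on the fact that Kubernetes leaves an unresolved `$(VAR)` reference in the value unchanged.

```typescript
// Hypothetical helper: validates MEMBER_ADDRESS before it is handed to Member.
function readMemberAddress(env: Record<string, string | undefined>): string {
  const raw = env.MEMBER_ADDRESS?.trim()
  if (!raw) throw new Error('MEMBER_ADDRESS is not set')
  // Kubernetes leaves an unresolved $(VAR) reference in the value unchanged.
  if (raw.includes('$(')) {
    throw new Error(`MEMBER_ADDRESS has an unexpanded variable: ${raw}`)
  }
  const url = new URL(raw) // throws on a malformed address
  // Assumption: only HTTP(S) addresses are dialable by the coordinator.
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    throw new Error(`unsupported scheme in MEMBER_ADDRESS: ${url.protocol}`)
  }
  return url.origin // scheme + host + port, no path
}

const address = readMemberAddress({
  MEMBER_ADDRESS: 'http://pod-0.svc.namespace.svc.cluster.local:3000'
})
```

In a pod this would be called as `readMemberAddress(process.env)`, with the result passed to `Member`.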

### ECS / Fargate

Each task in `awsvpc` mode gets its own ENI with a private VPC IP. The task fetches that IP from the ECS task metadata endpoint at startup, composes its address, and self-registers. A minimal entrypoint:

```sh
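#!/bin/sh
set -eu
# Ask the ECS task metadata endpoint (v4) for this task's private IPv4.
# Assumes wget and jq are installed in the image.
IP=$(wget -qO- "${ECS_CONTAINER_METADATA_URI_V4}/task" \
  | jq -r '.Containers[0].Networks[0].IPv4Addresses[0]')
# jq prints "null" when the path is missing; fail fast instead of registering it.
[ -n "$IP" ] && [ "$IP" != "null" ] || { echo "could not determine task IP" >&2; exit 1; }
export MEMBER_ADDRESS="http://${IP}:3000"
exec node start.js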
```

The coordinator dials the IP directly, so Cloud Map and ALBs are not in the path. Two Fargate tasks in the same VPC can reach each other as long as their security group allows inbound traffic from itself on the member port (a self-referencing rule). Task IDs are random rather than ordinal, but the library does not care: `memberId` is opaque.
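
The same lookup can be done in the Node process itself, without a shell wrapper. The sketch below shows only the parsing step, under assumptions: `extractTaskIp` and `TaskMetadata` are hypothetical names, the type covers only the fields read here, and the sample IP is made up. Unlike the entrypoint's fixed `[0]` indexes, it scans all containers and networks and takes the first address it finds.

```typescript
// Subset of the task metadata (v4) response that this sketch reads.
type TaskMetadata = {
  Containers?: Array<{ Networks?: Array<{ IPv4Addresses?: string[] }> }>
}

// Hypothetical helper: returns the first IPv4 address found in the task metadata.
function extractTaskIp(metadata: TaskMetadata): string {
  for (const container of metadata.Containers ?? []) {
    for (const network of container.Networks ?? []) {
      const ip = network.IPv4Addresses?.[0]
      if (ip) return ip
    }
  }
  throw new Error('no IPv4 address in task metadata')
}

// Trimmed sample payload; the IP is made up for illustration.
const sample: TaskMetadata = {
  Containers: [{ Networks: [{ IPv4Addresses: ['10.0.1.17'] }] }]
}
const memberAddress = `http://${extractTaskIp(sample)}:3000`
```

In a real task the payload would be fetched from `${ECS_CONTAINER_METADATA_URI_V4}/task` at startup and parsed as JSON before being passed in.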

### Nomad, plain VMs, Docker Compose, local dev

Same shape. Each instance discovers its own address (alloc IP, instance metadata, container name, `localhost:<port>`), exports it as `MEMBER_ADDRESS`, and registers. The `storage-db` example in this repo runs on plain Docker Compose with explicit per-pod hostnames, and its code is the same as it would be on Fargate or Kubernetes.
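
One way to express "each instance discovers its own address" is an ordered fallback: an explicit `MEMBER_ADDRESS` wins, then a detected non-loopback IPv4, then `localhost` for local dev. This is a sketch under assumptions, not code from the example: `resolveMemberAddress` and `firstExternalIPv4` are hypothetical helpers, and `Iface` mirrors only the fields of Node's `os.networkInterfaces()` entries that are used here.

```typescript
// Mirrors the fields of os.networkInterfaces() entries that this sketch reads.
type Iface = { address: string; family: string | number; internal: boolean }
type Interfaces = Record<string, Iface[] | undefined>

// Returns the first IPv4 address that is not a loopback/internal one.
function firstExternalIPv4(interfaces: Interfaces): string | undefined {
  for (const list of Object.values(interfaces)) {
    for (const iface of list ?? []) {
      const isV4 = iface.family === 'IPv4' || iface.family === 4
      if (isV4 && !iface.internal) return iface.address
    }
  }
  return undefined
}

// Hypothetical helper: explicit MEMBER_ADDRESS, then a detected IP, then localhost.
function resolveMemberAddress(
  env: Record<string, string | undefined>,
  interfaces: Interfaces,
  port: number
): string {
  if (env.MEMBER_ADDRESS) return env.MEMBER_ADDRESS
  const host = firstExternalIPv4(interfaces) ?? 'localhost'
  return `http://${host}:${port}`
}

// Fake interface table standing in for os.networkInterfaces().
const fakeInterfaces: Interfaces = {
  lo: [{ address: '127.0.0.1', family: 'IPv4', internal: true }],
  eth0: [{ address: '172.18.0.5', family: 'IPv4', internal: false }]
}
const detected = resolveMemberAddress({}, fakeInterfaces, 3000)
const explicit = resolveMemberAddress({ MEMBER_ADDRESS: 'http://pod-a:3000' }, fakeInterfaces, 3000)
const localDev = resolveMemberAddress({}, {}, 3000)
```

At runtime the arguments would be `process.env` and `os.networkInterfaces()` from `node:os`. On hosts with several interfaces the first match may not be the reachable one, which is why the explicit variable takes priority.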

### What you give up off-Kubernetes

Off Kubernetes you lose two ergonomic conveniences that K8s provides without effort:

- **Ordinal pod names.** Kubernetes `StatefulSet` gives you stable, human-friendly identifiers like `pod-0`, `pod-1`. ECS and most schedulers give you random IDs. This is a debugging convenience, not a functional requirement.
- **Stable DNS.** K8s Headless Service publishes per-pod DNS names that survive across IP changes. On ECS the address is the task IP, which changes on restart. Both are fine for the coordinator pattern because the registry is the source of truth, but K8s gives you a second, stable way to locate a pod for free.

Neither shortfall changes the protocol or the code. They affect only the deployment glue around the library.

## Scaling

5 changes: 5 additions & 0 deletions examples/storage-db/.dockerignore
@@ -0,0 +1,5 @@
node_modules
dist
.eslintcache
.git
*.log
4 changes: 4 additions & 0 deletions examples/storage-db/.gitignore
@@ -0,0 +1,4 @@
node_modules
dist
.eslintcache
*.log
26 changes: 26 additions & 0 deletions examples/storage-db/Dockerfile
@@ -0,0 +1,26 @@
# Build context must be the workspace root (coordinator/).
# See docker-compose.yml in this directory.
FROM node:24-alpine

RUN corepack enable

WORKDIR /workspace

COPY pnpm-workspace.yaml package.json ./
COPY examples/storage-db/package.json examples/storage-db/

RUN --mount=type=cache,target=/root/.local/share/pnpm/store \
    pnpm install

COPY src ./src
COPY tsconfig.json ./
COPY examples/storage-db/src examples/storage-db/src
COPY examples/storage-db/migrations examples/storage-db/migrations
COPY examples/storage-db/tsconfig.json examples/storage-db/

RUN pnpm --filter @platformatic/coordinator run build

WORKDIR /workspace/examples/storage-db
EXPOSE 3000 8080

CMD ["node", "src/bin/pod.ts"]