mongodb-developer/ai-coding-with-mongodb-evaluator

Schema Summary Evaluator

The evaluator is the system that grades attendee schema summaries from Lab 4 and triggers Credly badge issuance for passing submissions.

Architecture

The evaluator is a generic Express application with a vanilla HTML+JS frontend. Two routes:

  • GET / serves frontend/index.html. The page requires a ?session=... query parameter; without it, the page refuses to load. The session identifier is opaque to the evaluator (it's used purely for tracking which event a submission came from) and is forwarded to the backend with each submission.

  • POST /api/evaluate accepts a JSON body with session, name, email, and summary. It calls Anthropic via the Grove gateway with the rubric baked into the system prompt, parses the structured verdict, and on a passing verdict triggers a Credly badge issuance. Returns the verdict to the frontend.
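As a rough illustration of the POST /api/evaluate contract described above, the sketch below validates the four body fields before any grading happens. The field names come from this README; the `EvaluateRequest` type, the `parseEvaluateRequest` helper, and the loose email check are hypothetical and may not match what backend/server.ts actually does.

```typescript
// Hypothetical request shape: field names are from the README,
// the types and validation rules are assumptions.
interface EvaluateRequest {
  session: string;
  name: string;
  email: string;
  summary: string;
}

type ParseResult =
  | { ok: true; value: EvaluateRequest }
  | { ok: false; error: string };

const REQUIRED_FIELDS = ["session", "name", "email", "summary"] as const;

// Checks that the parsed JSON body carries all four fields as non-empty strings.
function parseEvaluateRequest(body: unknown): ParseResult {
  if (typeof body !== "object" || body === null) {
    return { ok: false, error: "body must be a JSON object" };
  }
  const record = body as Record<string, unknown>;
  const fields: Record<string, string> = {};
  for (const field of REQUIRED_FIELDS) {
    const raw = record[field];
    if (typeof raw !== "string" || raw.trim() === "") {
      return { ok: false, error: `missing or empty field: ${field}` };
    }
    fields[field] = raw.trim();
  }
  // Deliberately loose email check (assumption): one "@" with text on both sides.
  if (!/^[^@\s]+@[^@\s]+$/.test(fields.email)) {
    return { ok: false, error: "email is not well-formed" };
  }
  return {
    ok: true,
    value: {
      session: fields.session,
      name: fields.name,
      email: fields.email,
      summary: fields.summary,
    },
  };
}
```

In an Express handler, a failed parse would typically map to a 400 response, so malformed submissions never reach the model call.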

Files

evaluator/
├── README.md                  This file.
├── tsconfig.json              TypeScript config; compiles backend/ → dist/.
├── frontend/
│   └── index.html             Single-file frontend, vanilla JS, no build step.
├── backend/
│   ├── server.ts              Express server. Evaluation endpoint. System prompt with the full rubric.
│   └── credly.ts              Credly Acclaim API integration for badge issuance.
└── credly-setup-guide.md      Pre-event checklist for Credly configuration.

Backend is TypeScript. npm run dev runs via tsx --watch (no build step). npm start runs the compiled output from dist/; run npm run build first or rely on the Dockerfile's build stage in production.

Required environment variables

| Variable | Purpose |
| --- | --- |
| GROVE_API_KEY | Server-side API key for the Grove gateway (grove-gateway-prod.azure-api.net), which proxies Anthropic. Separate from the per-event proxy key used by Claude Code. |
| ANTHROPIC_MODEL | Optional. Defaults to claude-opus-4-7. The model ID is still Anthropic-shaped since Grove forwards to Anthropic. |
| CREDLY_ORGANIZATION_ID | Credly organization that owns the badge template. |
| CREDLY_BADGE_TEMPLATE_ID | The badge template attendees receive on pass. |
| CREDLY_API_KEY | Auth credential for the Credly Acclaim API. |
| PORT | Optional. Defaults to 8080. |

The Grove key is intentionally separate from the per-event proxy key used by Claude Code. The evaluator runs as a long-lived service; the proxy key rotates per event.

Deploying

The evaluator is a vanilla Node.js / Express app. It supports four deployment targets:

| Target | Config file | Use for |
| --- | --- | --- |
| Local dev | package.json (npm start script) | Iteration, rehearsal |
| Vercel | vercel.json | Test / preview deploys |
| GCP Cloud Run (via Cloud Build) | Dockerfile + cloudbuild.yaml | Production |
| AWS App Runner / ECS / Lambda (container) | Dockerfile | Production |

Local

cd evaluator
npm install
# Put GROVE_API_KEY=... in evaluator/.env (already gitignored)
npm run dev          # tsx --watch, no build step needed
# or:
npm run build && npm start   # production-shaped: compile then run dist/
# → http://localhost:8080/?session=local-rehearsal

npm run dev runs tsx --env-file=.env --watch backend/server.ts, so the .env file must exist. npm start runs the compiled dist/backend/server.js, so run npm run build first. Requires Node 20.6 or later.

Vercel (test)

cd evaluator
vercel             # link / preview deploy
vercel --prod      # production deploy

Set environment variables in the Vercel dashboard (Project Settings → Environment Variables):

  • GROVE_API_KEY (required)
  • ANTHROPIC_MODEL (optional; defaults to claude-opus-4-7)
  • CREDLY_DRY_RUN=1 (recommended for test environments)
  • CREDLY_TOKEN, CREDLY_ORG_ID, CREDLY_BADGE_TEMPLATE_ID (only if issuing real badges)

The vercel.json routes all traffic to backend/server.ts via @vercel/node (which compiles TS on deploy), and the Express server's static middleware serves frontend/index.html and frontend/styles.css from the same function.

GCP Cloud Run (production via Cloud Build)

# One-time: create the Artifact Registry repo and store the Grove key as a secret.
gcloud artifacts repositories create workshop \
  --repository-format=docker --location=us-central1

echo -n "$GROVE_API_KEY" | gcloud secrets create GROVE_API_KEY \
  --data-file=-

# Build + push + deploy.
cd evaluator
gcloud builds submit --config cloudbuild.yaml \
  --substitutions=_REGION=us-central1,_REPO=workshop,_SERVICE=evaluator

The cloudbuild.yaml runs three steps: build the container, push to Artifact Registry, deploy to Cloud Run. Cloud Run gets GROVE_API_KEY from Secret Manager and is configured for CREDLY_DRY_RUN=1 by default (override with --update-env-vars in a follow-up deploy when going live).

AWS (production via container image)

The same Dockerfile runs on AWS App Runner, ECS Fargate, or Lambda (container image runtime). Build locally or in CodeBuild, push to ECR, then point your chosen service at the image. Set env vars via the service's standard mechanism (App Runner: environment variables, ECS: task definition, Lambda: function configuration).

cd evaluator
docker build -t evaluator:latest .

# Push to ECR (one-time setup of repo elided)
aws ecr get-login-password --region us-east-1 \
  | docker login --username AWS --password-stdin <account>.dkr.ecr.us-east-1.amazonaws.com
docker tag evaluator:latest <account>.dkr.ecr.us-east-1.amazonaws.com/evaluator:latest
docker push <account>.dkr.ecr.us-east-1.amazonaws.com/evaluator:latest

Environment variables required at runtime

Every target needs at minimum:

GROVE_API_KEY=...

Optional:

ANTHROPIC_MODEL=claude-opus-4-7   # default
PORT=8080                         # platform usually sets this
CREDLY_DRY_RUN=1                  # set to 0 only for real badge issuance
CREDLY_TOKEN=...                  # required if CREDLY_DRY_RUN=0
CREDLY_ORG_ID=...                 # required if CREDLY_DRY_RUN=0
CREDLY_BADGE_TEMPLATE_ID=...      # required if CREDLY_DRY_RUN=0
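The rules above (one always-required key, two defaults, and Credly credentials that become mandatory once dry-run is off) could be checked at startup roughly as follows. This is a minimal sketch, not the evaluator's actual code: `loadConfig` and `EvaluatorConfig` are hypothetical names, and treating any CREDLY_DRY_RUN value other than "0" as dry-run is an assumption.

```typescript
// Hypothetical config shape; variable names follow the listing above.
interface EvaluatorConfig {
  groveApiKey: string;
  model: string;
  port: number;
  credlyDryRun: boolean;
}

type Env = Record<string, string | undefined>;

// Validates the environment once so a misconfigured deploy fails fast.
function loadConfig(env: Env): EvaluatorConfig {
  const missing: string[] = [];

  const groveApiKey = env.GROVE_API_KEY ?? "";
  if (groveApiKey === "") missing.push("GROVE_API_KEY");

  // Assumption: only an explicit "0" switches real badge issuance on.
  const credlyDryRun = env.CREDLY_DRY_RUN !== "0";
  if (!credlyDryRun) {
    for (const name of ["CREDLY_TOKEN", "CREDLY_ORG_ID", "CREDLY_BADGE_TEMPLATE_ID"]) {
      if (!env[name]) missing.push(name);
    }
  }

  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }

  const port = Number(env.PORT ?? "8080");
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error(`PORT must be a positive integer, got: ${env.PORT}`);
  }

  return {
    groveApiKey,
    model: env.ANTHROPIC_MODEL ?? "claude-opus-4-7",
    port,
    credlyDryRun,
  };
}
```

A server would call this once at startup, passing process.env, so a missing key surfaces at deploy time rather than on the first submission.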

How attendees reach the evaluator

The evaluator URL is announced via Instruqt at the start of Lab 4. The URL includes the session identifier as a query parameter:

https://workshop-evaluator.example.com/?session=ai-coding-with-mongodb-devday-20260120-newyork

The session identifier follows the standard MongoDB Education badge format: <workshop-slug>-<event-type>-<YYYYMMDD>-<city>. The evaluator treats the identifier as opaque; it is logged with each submission and passed to Credly for tracking.
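The evaluator itself never parses the identifier, but someone analysing logs after an event might want to split it into its parts. The sketch below is one way to do that, assuming the event type contains no hyphens and the date is always eight digits; `parseSessionId` and `SessionParts` are hypothetical names, not part of the evaluator.

```typescript
// Hypothetical breakdown of <workshop-slug>-<event-type>-<YYYYMMDD>-<city>.
interface SessionParts {
  workshopSlug: string;
  eventType: string;
  date: string; // YYYYMMDD, kept as a string
  city: string;
}

// The slug may contain hyphens, so the pattern anchors on the 8-digit date
// and assumes the event type is a single hyphen-free token.
const SESSION_PATTERN = /^(.+)-([^-]+)-(\d{8})-(.+)$/;

function parseSessionId(session: string): SessionParts | null {
  const match = SESSION_PATTERN.exec(session);
  if (match === null) return null;
  const [, workshopSlug, eventType, date, city] = match;
  return { workshopSlug, eventType, date, city };
}
```

For the example URL above, this splits the identifier into the slug ai-coding-with-mongodb, the event type devday, the date 20260120, and the city newyork. Identifiers that do not fit the pattern return null, which suits a format the evaluator accepts as opaque.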

Resubmission

Resubmissions are unlimited. Each submission is graded independently. The evaluator logs every submission with a timestamp, the session identifier, the email, and the verdict, so a DRI can correlate badge issuance against attempts post-event.
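To illustrate the post-event correlation, the sketch below groups log records by email and reports how many attempts each attendee made and when they first passed. The record shape is an assumption built from the fields listed above: `SubmissionLog`, `summarizeAttempts`, the boolean `passed` field, and ISO 8601 UTC timestamps are all hypothetical, and the real log format may differ.

```typescript
// Hypothetical log record; the README names the fields but not their format.
interface SubmissionLog {
  timestamp: string; // assumed ISO 8601 in UTC, so string order is time order
  session: string;
  email: string;
  passed: boolean;
}

interface AttemptSummary {
  attempts: number;
  firstPassAt: string | null;
}

// Groups submissions by lower-cased email and records the first passing attempt.
function summarizeAttempts(logs: SubmissionLog[]): Map<string, AttemptSummary> {
  const ordered = [...logs].sort((a, b) =>
    a.timestamp < b.timestamp ? -1 : a.timestamp > b.timestamp ? 1 : 0
  );
  const byEmail = new Map<string, AttemptSummary>();
  for (const log of ordered) {
    const key = log.email.toLowerCase();
    const summary = byEmail.get(key) ?? { attempts: 0, firstPassAt: null };
    summary.attempts += 1;
    if (log.passed && summary.firstPassAt === null) {
      summary.firstPassAt = log.timestamp;
    }
    byEmail.set(key, summary);
  }
  return byEmail;
}
```

Comparing each `firstPassAt` against Credly's issuance records would show whether every passing attendee actually received a badge.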

Pass threshold

A submission passes with at least 80 of 100 weighted points and no single criterion scoring 0. See rubric.md at the workshop root for the full rubric.
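The pass rule combines two conditions, which the following sketch makes explicit. It assumes each criterion's score is already expressed in weighted points that sum to at most 100; `CriterionScore`, `isPassing`, and the criterion names in the usage note are hypothetical, and the real criteria live in rubric.md.

```typescript
// Hypothetical per-criterion score, already converted to weighted points.
interface CriterionScore {
  criterion: string;
  points: number;
}

const PASS_THRESHOLD = 80; // out of 100 weighted points

// Passing needs both: total at or above the threshold, and no criterion at 0.
function isPassing(scores: CriterionScore[]): boolean {
  if (scores.length === 0) return false;
  const total = scores.reduce((sum, score) => sum + score.points, 0);
  const anyZero = scores.some((score) => score.points === 0);
  return total >= PASS_THRESHOLD && !anyZero;
}
```

For example, scores of 50, 35, and 0 across three made-up criteria total 85 points but still fail, because one criterion scored 0.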

What the DRI must do before first delivery

See credly-setup-guide.md for the full Credly checklist. In summary:

  • Create the badge template in the Credly admin UI.
  • Generate an API key with badge issuance permission.
  • Set CREDLY_ORGANIZATION_ID, CREDLY_BADGE_TEMPLATE_ID, and CREDLY_API_KEY in the evaluator's environment.
  • Set GROVE_API_KEY in the evaluator's environment, separate from the per-event proxy key used by Claude Code.
  • Deploy the evaluator and confirm the URL is reachable.
  • Smoke-test with a known-good schema summary that should pass and a known-bad one that should fail.

About

Schema Summary Evaluator for the AI Coding with MongoDB Workshop
