TASKS.md — Smart Commerce Agent

This file is the source of truth for what's done, what's active, and what's next. The coding agent MUST update this file after completing any task.

Last updated: 2026-03-03 - Phase 7 COMPLETE! 🎉

✅ Completed Phases

Phase 1-2: Monorepo Scaffold

Turborepo + pnpm workspaces
Shared packages: @smart-commerce/types, @smart-commerce/errors

Phase 3: commerce-api

Hono + GraphQL Yoga + MCP server
Prisma integration
Commit: 161c325f

Phase 4: agent-core

FastAPI + Python LangGraph
classify + shopper + support agents
Commit: 254d451b

Phase 5: Web Proxy Layer

/api/agent route (SSE to agent-core)
/api/copilotkit route
Deleted lib/agents/ + lib/llm/ (moved to agent-core)

Phase 6: GenUI Components

ProductGrid, CartDrawer, ActionConfirm, OrderTimeline
Registered in chat.tsx with useCopilotAction
18 component tests passing

Phase 7: Docker + Makefile + Env

docker-compose.yml (4 services: postgres, redis, commerce-api, agent-core)
docker-compose.langfuse.yml (separate — optional)
Updated Dockerfiles for monorepo build context
.env.example with all required vars
Makefile with memory-safe targets
9 docker-compose structure tests passing

⏳ Remaining Phases

Phase 8: E2E Verification

Full stack smoke test
Playwright E2E tests

Phase 9: Taste Vector

pgvector embeddings for recommendations

Phase 10: Stripe MCP Payment Flow

checkout-wizard GenUI component

Phase 11: Proactive Agent

cx-proactive.ts port + cron triggers

Phase 12: Rate Limiting + Circuit Breaker

Proxy route protections

Phase 13: Production Hardening

Secrets management
TLS
Health dashboards

🐛 Known Issues

88 pre-existing integration test failures (require running infrastructure)
UCP module exists — should be deleted (replaced by Stripe MCP)
README still shows the old RAG metrics (44%/38%)

📝 Architecture Decisions Log

| Date | Decision | Reason |
| --- | --- | --- |
| Feb 21 | Dropped UCP, using Stripe MCP | UCP is custom/unknown, Stripe MCP is real + official |
| Feb 21 | Dropped Qdrant, pgvector only | Redundant infra, pgvector sufficient at portfolio scale |
| Feb 21 | LangGraph active (not disabled) | Agent orchestration is the core of the project |
| Feb 21 | Azure AI Foundry over Ollama | Production-ready, industry-standard |
| Feb 21 | TDD enforced via CLAUDE.md | Stops hallucination, ensures quality |
| Feb 21 | Real infra for all tests | No mocks for DB/Redis/LLM in integration tests |
| Mar 03 | Monorepo with 3 apps | Clean separation: web, commerce-api, agent-core |

Goal: Real working chat → DB → Azure AI Foundry pipeline

Core Infrastructure

Verify Docker stack runs (docker ps -a shows all 3 containers up)
- ✅ smart-commerce-postgres (pgvector:pg16) - healthy
- ✅ smart-commerce-redis (redis:7-alpine) - healthy
- ✅ smart-commerce-langfuse (langfuse/latest) - running
Verify Azure AI Foundry responds (test with curl)
- ✅ Model: gpt-oss-120b responding successfully
Prisma schema v1 (Customer, Product, Order, Cart, CartItem, Ticket)
- ✅ 15 tables created successfully
Run migrations + seed 20 products
- ✅ Migration: 20260221060801_init
- ✅ 20 realistic products seeded (MacBook, Sony, iPhone, etc.)
pgvector HNSW index + tsvector trigger (migration.sql)
- ✅ pgvector extension created
- ✅ HNSW index created (manual SQL)
- ✅ GIN index for full-text search created
Verify pgvector works: SELECT vector_dims('[1,2,3]'::vector);
- ✅ Returns: 3 (pgvector working)

✅ Phase 2: LangGraph Agent - COMPLETE! 🎉

LangGraph Agent (activate — don't disable)

lib/agents/state.ts — AgentState type with 14 intent types
- ✅ Message, toolResults, uiComponents reducers
- ✅ Entities, Sentiment, ToolResult, UIHint types
- ✅ Tests: 7/7 passing
lib/agents/nodes/classify.ts — intent + entity extraction with Azure AI
- ✅ 14 intent types supported
- ✅ Entity extraction (products, prices, orderIds, emails)
- ✅ Sentiment detection (positive, neutral, negative, frustrated)
- ✅ Fallback to keyword classification on error
- ✅ Tests: 11/11 passing
lib/agents/supervisor.ts — graph with routing logic
- ✅ StateGraph assembly with 8 nodes
- ✅ Intent-based routing (product_search → search_node, etc.)
- ✅ State accumulation through workflow
- ✅ Error handling with fallback
- ✅ Tests: 10/10 passing
Test: classifyIntent("find wireless headphones") → intent="product_search"
- ✅ Verified with mock Azure AI
Test: graph persists state across workflow
- ✅ Messages accumulated
- ✅ userId preserved
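
The keyword fallback and intent-based routing listed above can be sketched roughly as follows. This is a minimal illustration, not the code in lib/agents/: the keyword table, the helper names (`classifyByKeyword`, `classifyWithFallback`, `routeIntent`), and every intent or node name other than `product_search` and `search_node` are assumptions.

```typescript
// Hypothetical subset of the 14 intents; only product_search is named in this file.
type Intent = "product_search" | "order_status" | "general";

// Assumed keyword table, consulted only when the LLM classifier fails.
const KEYWORDS: Array<[Intent, string[]]> = [
  ["product_search", ["find", "search", "looking for"]],
  ["order_status", ["order", "tracking", "delivery"]],
];

function classifyByKeyword(message: string): Intent {
  const text = message.toLowerCase();
  for (const [intent, words] of KEYWORDS) {
    if (words.some((w) => text.indexOf(w) !== -1)) return intent;
  }
  return "general";
}

// Try the LLM classifier first; on any error, fall back to keywords.
// llmClassify is injected so the sketch runs without Azure AI.
async function classifyWithFallback(
  message: string,
  llmClassify: (m: string) => Promise<Intent>,
): Promise<Intent> {
  try {
    return await llmClassify(message);
  } catch {
    return classifyByKeyword(message);
  }
}

// Intent-based routing: product_search → search_node (other node names assumed).
const ROUTES: Record<Intent, string> = {
  product_search: "search_node",
  order_status: "orders_node",
  general: "respond_node",
};

function routeIntent(intent: Intent): string {
  return ROUTES[intent];
}
```

Injecting the classifier keeps the fallback path testable with a stub that throws, which matches the "verified with mock Azure AI" note above.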

✅ Phase 3: MCP Tool Layer - COMPLETE! 🎉

MCP Tool Layer

lib/mcp/server.ts — auth wrapper + Langfuse tracing
- ✅ Tool registration system
- ✅ User authentication (userId requirement)
- ✅ Rate limiting interface
- ✅ Zod argument validation
- ✅ Langfuse tracing integration
- ✅ Error handling
- ✅ Execution metadata (timing, userId, traced)
- ✅ Tests: 17/17 passing
lib/mcp/tools.ts — existing tools integrated with server
Test: catalog.search with userId enforced (no userId → UNAUTHORIZED)
Test: cart.add_item with real Docker Postgres

🟡 Phase 2: Search Pipeline (Week 1-2)

Goal: Hybrid FTS + pgvector search with Azure embeddings

RAG Enhancements (IMPLEMENTED - needs integration)

lib/rag/semantic-chunker.ts — semantic chunking with similarity merging
lib/rag/reranker.ts — cross-encoder reranking
lib/rag/query-transform.ts — query rewriting + HyDE
lib/rag/semantic-cache.ts — Redis-backed semantic cache
Integrate semantic chunking into indexDocument
Wire reranker into ragQuery
Hook query transforms into MCP tools
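
For orientation, here is a rough in-memory sketch of what a semantic cache lookup does. The real module is Redis-backed; the `SemanticCache` class, the 0.95 similarity threshold, and the injectable clock are assumptions for illustration. The 5-minute TTL is taken from the Search Implementation list.

```typescript
interface CacheEntry {
  embedding: number[];
  value: string;
  expiresAt: number;
}

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

class SemanticCache {
  private entries: CacheEntry[] = [];
  private ttlMs: number;
  private threshold: number;

  // Defaults: 5 minute TTL (from this file), 0.95 threshold (assumed).
  constructor(ttlMs: number = 5 * 60 * 1000, threshold: number = 0.95) {
    this.ttlMs = ttlMs;
    this.threshold = threshold;
  }

  set(embedding: number[], value: string, now: number = Date.now()): void {
    this.entries.push({ embedding, value, expiresAt: now + this.ttlMs });
  }

  // Returns the most similar unexpired entry at or above the threshold.
  get(embedding: number[], now: number = Date.now()): string | undefined {
    this.entries = this.entries.filter((e) => e.expiresAt > now);
    let best: CacheEntry | undefined;
    let bestScore = this.threshold;
    for (const e of this.entries) {
      const score = cosine(embedding, e.embedding);
      if (score >= bestScore) {
        best = e;
        bestScore = score;
      }
    }
    return best?.value;
  }
}
```

The difference from a plain key-value cache is that a lookup matches on embedding similarity, so a rephrased query can still hit a cached answer.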

Search Implementation

lib/search/embeddings.ts — Azure text-embedding-3-small
lib/search/hybrid.ts — FTS candidates → pgvector rerank
Test: hybridSearch("wireless headphones") returns ranked results
Test: FTS fallback when query returns 0 semantic matches
Test: filter by maxPrice works at SQL level (not post-filter)
Semantic cache in Redis (5min TTL)
Test: second identical search hits Redis cache
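
The two-stage flow described above (FTS candidates → pgvector rerank, with an FTS fallback and a price filter applied before ranking) can be sketched in memory like this. It is an illustration under assumptions: the real implementation runs in SQL against Postgres, and `ftsCandidates`, the `minScore` default of 0.2, and the `Product` shape are hypothetical.

```typescript
interface Product {
  id: string;
  name: string;
  price: number;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Stage 1: keyword candidates (stand-in for Postgres full-text search).
// maxPrice is applied here, the way a SQL WHERE clause would, not after ranking.
function ftsCandidates(products: Product[], query: string, maxPrice?: number): Product[] {
  const terms = query.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
  return products.filter((p) => {
    if (maxPrice !== undefined && p.price > maxPrice) return false;
    const name = p.name.toLowerCase();
    return terms.some((t) => name.indexOf(t) !== -1);
  });
}

// Stage 2: rerank candidates by cosine similarity to the query embedding.
function hybridSearch(
  products: Product[],
  query: string,
  queryEmbedding: number[],
  opts: { maxPrice?: number; minScore?: number } = {},
): Product[] {
  const candidates = ftsCandidates(products, query, opts.maxPrice);
  const minScore = opts.minScore ?? 0.2;
  const scored = candidates
    .map((p) => ({ product: p, score: cosineSimilarity(queryEmbedding, p.embedding) }))
    .filter((s) => s.score >= minScore)
    .sort((a, b) => b.score - a.score);
  // FTS fallback: no semantic match cleared the threshold, so keep keyword order.
  if (scored.length === 0) return candidates;
  return scored.map((s) => s.product);
}
```

Filtering by price in the candidate stage keeps over-budget products from consuming rerank slots, which is the point of the "at SQL level (not post-filter)" test.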

🟡 Phase 3: Cart + Checkout (Week 2)

Goal: Full cart cycle + Stripe MCP checkout

Cart MCP Tools

cart.get, cart.add_item, cart.update_quantity, cart.remove_item, cart.clear
Idempotency: adding same product twice updates quantity (not duplicate row)
Test: cart total recalculated correctly after update
Test: remove last item → empty cart (not null cart)
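
The cart rules above (same product twice updates quantity, total recalculated on every change, removing the last item leaves an empty cart rather than null) might look like this as pure functions. This is a sketch: the `Cart` and `CartItem` shapes and the use of integer cents are assumptions, and the real tools persist to Postgres.

```typescript
interface CartItem {
  productId: string;
  unitPriceCents: number;
  quantity: number;
}

interface Cart {
  items: CartItem[];
  totalCents: number;
}

// Recompute the total from the items so it can never drift.
function recalc(items: CartItem[]): Cart {
  const totalCents = items.reduce((sum, i) => sum + i.unitPriceCents * i.quantity, 0);
  return { items, totalCents };
}

// Adding a product already in the cart bumps its quantity instead of adding a row.
function addItem(cart: Cart, productId: string, unitPriceCents: number, quantity: number = 1): Cart {
  const exists = cart.items.some((i) => i.productId === productId);
  const items = exists
    ? cart.items.map((i) =>
        i.productId === productId ? { ...i, quantity: i.quantity + quantity } : i,
      )
    : [...cart.items, { productId, unitPriceCents, quantity }];
  return recalc(items);
}

// Removing the last item yields an empty cart object, never null.
function removeItem(cart: Cart, productId: string): Cart {
  return recalc(cart.items.filter((i) => i.productId !== productId));
}
```

Integer cents avoid floating-point drift in the total; the database equivalent of `addItem` would presumably be an upsert on a unique (cart, product) pair.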

Stripe MCP Integration

lib/payments/stripe-mcp.ts — toolkit init
lib/payments/idempotency.ts — key generation + Redis storage
checkout.start MCP tool → Stripe payment intent via toolkit
Test: idempotency key prevents duplicate payment intents
Test: Stripe webhook → order.create_from_cart → order in DB
Add stripe-mcp container to docker-compose.dev.yml
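
A minimal sketch of how an idempotency key can prevent duplicate payment intents follows. It assumes a deterministic key derived from the checkout request and uses an in-memory `Map` in place of Redis; `CheckoutRequest`, `idempotencyKey`, and `IdempotencyStore` are hypothetical names, not the contents of lib/payments/idempotency.ts.

```typescript
interface CheckoutRequest {
  userId: string;
  cartId: string;
  amountCents: number;
  currency: string;
}

// Same request fields → same key, so a retry maps to the same payment intent.
function idempotencyKey(req: CheckoutRequest): string {
  return ["checkout", req.userId, req.cartId, req.amountCents, req.currency].join(":");
}

// In-memory stand-in for the Redis store.
class IdempotencyStore {
  private seen = new Map<string, string>();

  // Returns the stored intent id for a known key; otherwise creates and stores one.
  getOrCreate(key: string, create: () => string): string {
    const existing = this.seen.get(key);
    if (existing !== undefined) return existing;
    const created = create();
    this.seen.set(key, created);
    return created;
  }
}
```

With Redis, a `SET` with the `NX` option gives the same first-writer-wins behaviour across processes.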

🟡 Phase 4: GenUI + CopilotKit (Week 2-3)

Goal: Agent renders React components, not markdown

shadcn Components

ProductCard — existing in app/dashboard/components/genui/
OrderCard — existing
TicketStatus — existing
ProductGrid — grid of ProductCards with add-to-cart
CartDrawer — slide-in cart with quantity controls
CheckoutWizard — Stripe Elements embedded
OrderConfirmation — post-purchase summary
OrderTracking — status timeline

CopilotKit Actions

useCopilotAction("catalog.search") → renders
useCopilotAction("cart.add_item") → renders
useCopilotAction("checkout.start") → renders
useCopilotReadable: expose cart + visible products to agent

🟡 Phase 5: Orders + Support (Week 3)

orders.list, orders.get, orders.track MCP tools
support.create_ticket, support.get_ticket MCP tools
LangGraph refund_node → Stripe MCP refunds.create
Azure Language NER on support tickets (sentiment tagging)

🟡 Phase 6: Observability + Evals (Week 3-4)

Goal: RAGAS scores ≥ 70% relevancy, ≥ 75% faithfulness

Observability

lib/observability/rag-trace.ts — per-span RAG tracing
lib/observability/llm-judge.ts — LLM-as-judge scoring
Per-span Langfuse tracing: classify → search → rerank → generate
scripts/llm_eval.py — RAGAS metrics (replace current 44%/38% scores)
Azure Content Safety on LLM outputs
Target metrics: relevancy ≥ 70%, faithfulness ≥ 75%
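
The phase goal can be turned into a simple pass/fail gate. The sketch below assumes scores are fractions in [0, 1] and follows the "≥" wording of the Goal line; `EvalScores` and `failedTargets` are hypothetical names, and the real metrics come from scripts/llm_eval.py.

```typescript
interface EvalScores {
  relevancy: number;
  faithfulness: number;
}

// Thresholds from the phase goal: relevancy ≥ 70%, faithfulness ≥ 75%.
const TARGETS: EvalScores = { relevancy: 0.7, faithfulness: 0.75 };

// Returns the names of the metrics that miss their target (empty means pass).
function failedTargets(scores: EvalScores): string[] {
  const failures: string[] = [];
  if (scores.relevancy < TARGETS.relevancy) failures.push("relevancy");
  if (scores.faithfulness < TARGETS.faithfulness) failures.push("faithfulness");
  return failures;
}
```

A CI step could fail the build whenever `failedTargets` returns a non-empty list.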

🟡 Phase 7: Azure AI Services (Week 4)

Azure Language NER on search queries (enrich before FTS)
Azure SignalR for real-time cart updates
Azure Event Grid for order.placed → async workers
Azure Functions: price alerts, abandoned cart, inventory

✅ Completed

Infrastructure

Removed Supabase client files
Azure AI Foundry .env configured
docker-compose.dev.yml created (PostgreSQL + Redis + Langfuse)
Prisma adapter for local Postgres
HLD + LLD documented

RAG Enhancements (IMPLEMENTED)

Semantic chunking with similarity merging (22 tests passing)
Cross-encoder reranker (15 tests passing)
Query transformation (rewriting + HyDE)
Semantic cache with Redis

Guardrails (IMPLEMENTED)

Pydantic schemas for validation
LangChain guard chains
DSPy signatures for optimization
PII, toxicity, jailbreak detection (24 tests passing)

MCP Tools (IMPLEMENTED)

Cart tools (update_quantity, remove_item, clear, apply_coupon)
Checkout tool (checkout.create)
Order tools (create_from_cart, cancel)

Testing

61 unit tests passing
22 integration tests passing
Test files created for all new modules

Documentation

CLAUDE.md — agent instructions
AGENTS.md — architecture context
TASKS.md — living task board
TRANSFORMATION_REPORT.md — gaps analysis
IMPLEMENTATION_COMPLETE_SUMMARY.md

🐛 Known Issues / Blockers

App won't start — Supabase middleware blocking (needs removal or mock)
LangGraph disabled — routes temporarily disabled, need activation
RAG metrics in README — showing old 44%/38% (needs update after integration)
UCP module exists — should be deleted (replaced by Stripe MCP)

