This guide shows six complete, realistic examples of how cc-polymath works in practice. Each demonstrates auto-discovery in action and shows the natural flow of conversation with Claude Code.
Context: You're starting a real-time analytics dashboard for a SaaS product and need guidance on architecture and best practices.
User: "I need to build a Next.js dashboard that displays real-time metrics. Users should see live updates as data changes, and I want best practices for performance."
Auto-Discovery: ✓ Triggered discover-frontend
Conversation Flow:
Claude: I'm loading frontend skills for Next.js dashboard development...
Loaded: discover-frontend (235 lines)
→ react-component-patterns (318 lines)
→ nextjs-app-router (402 lines)
→ react-data-fetching (356 lines)
→ frontend-performance (289 lines)
Based on your requirements, here's what I recommend:
1. **Server Components for Data**: Use Next.js 14+ Server Components
to fetch data at the server layer, reducing client bundle size.
2. **Real-time Updates**: Choose between:
- Server-Sent Events (SSE) - simple, one-directional updates
- WebSocket via library like Socket.io - full bidirectional
- React Query polling - simple client-side approach
3. **Performance**: Use React Suspense for loading states,
implement incremental static regeneration (ISR) for stable data.
Should we set up Server Components with SWR for real-time data,
or do you need WebSocket-level bidirectionality?
User: "WebSockets would be better for our use case. What about state management across the dashboard?"
Claude: "Great! For WebSocket + state management, I'm loading additional skills...
Loaded: → react-state-management (372 lines) → browser-concurrency (298 lines)
For real-time dashboards, I recommend this pattern:
-
Use Zustand (minimal, reactive):
- Lightweight state container
- Integrates well with Next.js
- Minimal re-renders
-
Separate Real-time Store:
- Keep server state separate from WebSocket state
- Use a dedicated hook: useRealtimeMetrics()
- Handles reconnection and backpressure automatically
-
Suspense Boundaries:
- Wrap dashboard sections in for independent loading
- Each metric card loads independently
Here's the architecture:
// stores/metrics.ts - Zustand store for real-time data
import { create } from 'zustand'
interface MetricUpdate {
timestamp: number
value: number
metric: string
}
export const useMetricsStore = create((set) => ({
metrics: {},
updateMetric: (update: MetricUpdate) =>
set((state) => ({
metrics: {
...state.metrics,
[update.metric]: update.value
}
}))
}))
// hooks/useRealtimeMetrics.ts - WebSocket connection
export function useRealtimeMetrics() {
const updateMetric = useMetricsStore(s => s.updateMetric)
useEffect(() => {
const ws = new WebSocket(process.env.NEXT_PUBLIC_WS_URL)
ws.onmessage = (event) => {
const update = JSON.parse(event.data)
updateMetric(update)
}
return () => ws.close()
}, [updateMetric])
}Ready to implement this? I can guide you through Server Components, Zustand setup, and WebSocket handling step-by-step.
**Result**:
- Architecture decision made in 2 minutes
- Zustand + Server Components chosen for optimal performance
- Code scaffolding ready to build from
- Real-time updates implemented within 30 minutes
- Initial dashboard metrics displaying live data within 1 hour
**Skills Used**:
- discover-frontend (gateway)
- react-component-patterns (component design)
- nextjs-app-router (Server Components, routing)
- react-state-management (Zustand patterns)
- react-data-fetching (data lifecycle)
- browser-concurrency (WebSocket management)
---
## Example 2: Optimizing Postgres Queries
**Context**: Your application is slowing down with database growth. Queries that used to run in milliseconds now take seconds.
**User**: "Our database is getting slow. A query that lists user posts with comments is taking 2.3 seconds. I need to figure out what's wrong and fix it."
**Auto-Discovery**: ✓ Triggered `discover-database`
**Conversation Flow**:
Claude: I'm loading database optimization skills...
Loaded: discover-database (218 lines) → postgres-query-optimization (445 lines) → postgres-schema-design (389 lines) → orm-patterns (356 lines)
Let's diagnose this together. First, I need to see:
- The slow query (SQL or ORM code)
- Your schema (table definitions)
- Data volume (approximate rows)
Can you share the current implementation?
**User**: Shares:
```python
# Django ORM
posts = Post.objects.filter(user_id=user_id).prefetch_related('comments')
# Generated SQL roughly:
# SELECT * FROM posts WHERE user_id = 1
# SELECT * FROM comments WHERE post_id IN (...)
Claude: "Loaded your code. I can see a few issues using EXPLAIN analysis. Let me diagnose:
Loaded: → orm-patterns (full guide)
# EXPLAIN (ANALYZE, BUFFERS) shows:
Seq Scan on posts WHERE user_id = 1
Planning: 0.1ms
Execution: 1800ms (reading all 500K posts!)
Actual loops: 1
Problem #1: NO INDEX on posts.user_idHere's your issues:
-
Missing Index on posts(user_id)
- Currently doing FULL TABLE SCAN of 500K rows
- Adding index: 1800ms → 15ms
-
N+1 Problem in Comments
- Your prefetch_related works, but missing index on comments.post_id
- Adding index: 500ms → 50ms
-
Query selects too many columns
- Add .only('id', 'title', 'created_at') to reduce I/O
- Saves 30% network overhead
Here's the fix:
-- Migration 1: Fix posts lookup
CREATE INDEX idx_posts_user_id ON posts(user_id);
-- Migration 2: Fix comment loading
CREATE INDEX idx_comments_post_id ON comments(post_id);And in Django:
posts = (Post.objects.filter(user_id=user_id)
.prefetch_related('comments')
.only('id', 'title', 'created_at', 'user_id'))Expected improvement: 2.3s → 45ms (51x faster)"
**User**: "Perfect! Should I create these as migrations?"
**Claude**: "Already covered. Loading migration patterns...
Loaded:
→ postgres-migrations (401 lines)
Here's the safe way to add these:
```sql
-- migrations/0003_optimize_posts_queries.sql
-- Part 1: Add indexes (safe, no locking in Postgres 11+)
CREATE INDEX CONCURRENTLY idx_posts_user_id
ON posts(user_id);
CREATE INDEX CONCURRENTLY idx_comments_post_id
ON comments(post_id);
-- Add monitoring
ANALYZE posts;
ANALYZE comments;
Use CREATE INDEX CONCURRENTLY in production. It won't lock writes while creating the index.
Your migration is complete. Deploy with confidence."
**Result**:
- Root cause identified in 3 minutes
- Query time improved from 2.3 seconds to 45ms (51x faster)
- Safe migration strategy provided
- Indexes deployed without downtime
- Application performance restored in under 15 minutes
**Skills Used**:
- discover-database (gateway)
- postgres-query-optimization (EXPLAIN analysis, indexes)
- postgres-schema-design (understanding structure)
- postgres-migrations (safe deployment)
- orm-patterns (N+1 detection)
---
## Example 3: Designing a REST API
**Context**: You're building the backend for a task management application and need to design a scalable REST API.
**User**: "I'm building a task management API. I need to design endpoints for tasks, lists, and user collaboration features. What should the API structure look like?"
**Auto-Discovery**: ✓ Triggered `discover-api`
**Conversation Flow**:
Claude: Loading API design skills...
Loaded: discover-api (242 lines) → rest-api-design (468 lines) → api-authentication (412 lines) → api-authorization (395 lines) → api-rate-limiting (287 lines) → api-error-handling (324 lines)
For a task management API, let me design the resource model first.
Resources (nouns):
- Users (/users)
- Lists (/lists) - collections of tasks
- Tasks (/tasks)
- Comments (/tasks/{id}/comments)
- Shares (/lists/{id}/shares) - permissions
Proposed endpoints:
- GET /lists - my lists
- POST /lists - create list
- GET /lists/{id} - get list with tasks
- PATCH /lists/{id} - update list
- DELETE /lists/{id} - delete list
- GET /lists/{id}/members - who has access
- POST /lists/{id}/share - share with users
- GET /lists/{id}/tasks - list tasks (supports filtering)
- POST /lists/{id}/tasks - create task
- GET /tasks/{id} - get single task
- PATCH /tasks/{id} - update task
- DELETE /tasks/{id} - delete task
- POST /tasks/{id}/assign - assign to user
- GET /tasks/{id}/comments - task comments
- POST /tasks/{id}/comments - add comment
- PATCH /comments/{id} - edit comment
- DELETE /comments/{id} - delete comment
- GET /tasks/{id}/activity - activity log
- GET /users/me - current user profile
Each endpoint returns consistent responses with error handling. Should we implement authentication first, or finalize the schema?
**User**: "Authentication is critical. What's the best approach?"
**Claude**: "Perfect. Loading authentication patterns...
Loaded:
→ api-authentication (full guide)
For a collaborative app, I recommend JWT + Refresh tokens:
```typescript
// POST /auth/login
{
"email": "user@example.com",
"password": "secure_password"
}
Response:
{
"accessToken": "eyJhbGc...", // 15-min expiry
"refreshToken": "ref_...", // 7-day expiry
"user": {
"id": "user_123",
"email": "user@example.com"
}
}
// POST /auth/refresh
{
"refreshToken": "ref_..."
}
For authorization, we need role-based access:
Loaded: → api-authorization (full guide)
// User roles per list
enum Role {
OWNER = 'owner', // create, delete, manage members
EDITOR = 'editor', // create/edit tasks
VIEWER = 'viewer' // read-only
}
// Middleware checks role
async function checkAccess(req, res, next) {
const listId = req.params.id
const userId = req.user.id
const access = await db.listAccess.findUnique({
where: { listId_userId: { listId, userId } }
})
if (!access) throw new Forbidden()
req.userRole = access.role
next()
}For shared APIs, rate limiting prevents abuse:
Loaded: → api-rate-limiting (full guide)
// 100 requests per 15 minutes per user
const rateLimiter = rateLimit({
windowMs: 15 * 60 * 1000,
max: 100,
keyGenerator: (req) => req.user.id
})
app.use('/api/', rateLimiter)Finally, standardized error responses:
Loaded: → api-error-handling (RFC 7807)
// All errors follow RFC 7807
{
"type": "https://api.tasks.example.com/errors/unauthorized",
"title": "Unauthorized",
"status": 403,
"detail": "You do not have access to this list",
"instance": "/lists/list_456"
}Here's the complete OpenAPI spec structure...
**User**: "This is great. Can you show the full OpenAPI schema?"
**Claude**: "Absolutely. Here's your API spec in OpenAPI 3.0:
```yaml
openapi: 3.0.0
info:
title: Task Management API
version: 1.0.0
servers:
- url: https://api.tasks.example.com
description: Production
components:
securitySchemes:
bearerAuth:
type: http
scheme: bearer
bearerFormat: JWT
security:
- bearerAuth: []
paths:
/auth/login:
post:
summary: User login
requestBody:
required: true
content:
application/json:
schema:
type: object
properties:
email: { type: string }
password: { type: string }
responses:
200:
description: Login successful
content:
application/json:
schema:
$ref: '#/components/schemas/AuthResponse'
/lists:
get:
summary: Get user's lists
responses:
200:
description: List of lists
content:
application/json:
schema:
type: array
items:
$ref: '#/components/schemas/List'
post:
summary: Create new list
requestBody:
required: true
content:
application/json:
schema:
$ref: '#/components/schemas/CreateListRequest'
responses:
201:
description: List created
/lists/{id}:
parameters:
- name: id
in: path
required: true
schema:
type: string
get:
summary: Get list details with tasks
patch:
summary: Update list
delete:
summary: Delete list
/lists/{id}/tasks:
get:
summary: Get tasks in list
parameters:
- name: status
in: query
schema:
type: string
enum: [todo, in_progress, done]
post:
summary: Create task
/tasks/{id}:
get:
summary: Get single task
patch:
summary: Update task
delete:
summary: Delete task
/tasks/{id}/comments:
get:
summary: Get task comments
post:
summary: Add comment
components:
schemas:
List:
type: object
properties:
id: { type: string }
name: { type: string }
description: { type: string }
role: { type: string, enum: [owner, editor, viewer] }
taskCount: { type: integer }
members: { type: array }
Task:
type: object
properties:
id: { type: string }
title: { type: string }
description: { type: string }
status: { enum: [todo, in_progress, done] }
priority: { enum: [low, medium, high] }
assignedTo: { type: string }
dueDate: { type: string, format: date-time }
comments: { type: array }
Ready to implement in Node.js + Express or another framework?"
**Result**:
- Complete API design in 15 minutes
- Resource modeling done
- Authentication strategy chosen (JWT + Refresh)
- Authorization pattern defined (role-based)
- Rate limiting strategy selected
- Full OpenAPI 3.0 specification ready
- Implementation-ready architecture document created
**Skills Used**:
- discover-api (gateway)
- rest-api-design (resource modeling, HTTP semantics)
- api-authentication (JWT patterns)
- api-authorization (RBAC implementation)
- api-rate-limiting (abuse prevention)
- api-error-handling (RFC 7807 responses)
---
## Example 4: Creating Visual Documentation
**Context**: You need to document your deployment process for new team members joining next week.
**User**: "I need to create a clear diagram showing our deployment pipeline. We have development, staging, and production environments with automatic tests in between."
**Auto-Discovery**: ✓ Triggered `discover-engineering`
**Conversation Flow**:
Claude: Loading diagram creation skills...
Loaded: discover-engineering (195 lines) → mermaid-flowcharts (367 lines) → mermaid-sequence-diagrams (401 lines)
For deployment pipelines, a flowchart works best. Here's your diagram:
```mermaid
graph LR
A["Developer
Pushes Code"] -->|git push| B["GitHub
Webhook"]
B -->|triggers| C["Run Unit
Tests"]
C -->|✓ pass| D["Build Docker
Image"]
C -->|✗ fail| E["Notify Developer
in Slack"]
E --> A
D -->|push image| F["Deploy to<br/>Staging"]
F -->|wait 30s| G["Run E2E<br/>Tests"]
G -->|✓ pass| H["Wait for<br/>Approval"]
G -->|✗ fail| I["Rollback<br/>Staging"]
I --> E
H -->|approved| J["Deploy to<br/>Production"]
J -->|CloudFlare CDN| K["Live for<br/>Users"]
J -->|smoke tests| L{All Good?}
L -->|yes| M["Send Success<br/>Alert"]
L -->|no| N["Automatic<br/>Rollback"]
N -->|call prev version| K
N --> E
style A fill:#e1f5ff
style C fill:#fff3e0
style G fill:#fff3e0
style J fill:#c8e6c9
style K fill:#c8e6c9
style E fill:#ffcdd2
style M fill:#c8e6c9
```
This shows:
- Developer trigger
- Test gates (unit → E2E)
- Approval checkpoint
- Production deployment
- Automatic rollback on failure
- Slack notifications
Would you like to add more detail to any stage?
**User**: "Can you also show what happens when a deploy fails in production? We have automatic rollback."
**Claude**: "Great question. Let me add the failure path with more detail...
```mermaid
graph TD
A["Developer<br/>Pushes Code"] -->|git push| B["GitHub<br/>Webhook"]
B -->|triggers| C["Run Unit Tests"]
C -->|✓ pass| D["Build Docker<br/>Image"]
C -->|✗ fail| E["Notify: Tests Failed"]
E -->|slack #deployments| F["Await Fix"]
F --> A
D -->|push to ECR| G["Deploy to<br/>Staging"]
G -->|wait 30s| H["Run E2E Tests"]
H -->|✓ pass| I["Await Manual<br/>Approval"]
H -->|✗ fail| J["Notify: E2E Failed"]
J --> E
I -->|approved| K["Get Previous<br/>Deploy Hash"]
K -->|v1.2.3| L["Deploy to<br/>Production"]
L -->|point DNS| M["Traffic → New<br/>Release v1.3.0"]
L -->|5min window| N["Smoke Tests"]
N -->|✓ All Checks Pass| O["Green Checkmark<br/>Success Alert"]
O --> P["Canary 5%<br/>traffic"]
P -->|5min ok| Q["100% Traffic<br/>New Release"]
N -->|✗ Error Rate High| R["ROLLBACK<br/>Initiated"]
R -->|blue/green| S["Traffic → Previous<br/>Release v1.2.3"]
S -->|instant| T["Notify: Rollback<br/>Occurred"]
T -->|ping oncall| U["Engineer Reviews<br/>What Failed"]
U --> V["Incident Report<br/>+ Post-Mortem"]
Q --> W["✓ Deployment<br/>Complete"]
V --> X["Schedule Fix<br/>for Next Release"]
style A fill:#e1f5ff
style C fill:#fff3e0
style H fill:#fff3e0
style L fill:#c8e6c9
style O fill:#c8e6c9
style Q fill:#c8e6c9
style R fill:#ffcdd2
style T fill:#ffcdd2
style E fill:#ffcdd2
style J fill:#ffcdd2
\`\`\`
Key features:
- Blue/Green deployment (zero downtime)
- 5% canary traffic before full rollout
- Automatic rollback on error rate spike
- Slack notifications at each gate
- Post-incident review captured
- 5-minute production validation window
This diagram is production-ready. Want me to add environment details?"
Result:
- Professional deployment diagram created in 5 minutes
- Flowchart clearly shows the entire pipeline
- Failure scenarios documented
- Color-coded for easy scanning
- Ready to share with new team members
- Can be dropped directly into documentation wiki
- Easily updated as process evolves
Skills Used:
- discover-engineering (gateway)
- mermaid-flowcharts (process flows, decision logic)
- mermaid-specialized (optional for Gantt view if needed)
Context: You're architecting a new service and need to decide between different database technologies based on your requirements.
User: "We're building a new feature that tracks user activity events. We might get 50,000 events per second. Should we use PostgreSQL, MongoDB, or Redis? What are the tradeoffs?"
Auto-Discovery: ✓ Triggered discover-database
Conversation Flow:
Claude: Loading database selection framework...
Loaded: discover-database (218 lines)
→ database-selection (456 lines)
→ postgres-schema-design (389 lines)
→ mongodb-document-design (402 lines)
→ redis-data-structures (378 lines)
This is a classic OLTP (Online Transaction Processing) vs OLAP
(Online Analytical Processing) decision. Let me analyze your requirements:
Requirements Analysis:
- Write volume: 50K events/sec (very high)
- Read pattern: Real-time queries + historical analysis
- Retention: How long keep data?
- Consistency: Is eventual consistency okay?
Let me evaluate each option:
## Option 1: PostgreSQL
Pros:
- ACID transactions (guaranteed consistency)
- Complex queries for analysis
- Strong relationships and JOINs
- Single system of truth
Cons:
- 50K writes/sec requires significant tuning
- Needs connection pooling (PgBouncer)
- Disk I/O becomes bottleneck
- Sharding adds operational complexity
Capacity estimate:
- Single instance: ~5K-10K writes/sec max
- Need sharding or partitioning by user_id
- 3-5 instances minimum
- Storage: 50K events/sec * 86,400 sec/day * 365 days = 1.5 TB/year
Best for: Complex analysis queries, strong consistency
## Option 2: MongoDB
Pros:
- Horizontal scaling (sharding built-in)
- JSON-like documents match events naturally
- Good for time-series data
- Flexible schema (events may vary)
Cons:
- Eventual consistency (by default)
- Less efficient for complex analytical queries
- Higher storage overhead than SQL
- More operational complexity
Capacity estimate:
- Can handle 50K writes/sec across shards
- 2-3 shards initially
- 2.5 TB/year storage
Best for: Event streaming, flexible schema, scaling writes
## Option 3: Redis
Pros:
- Extreme speed (100K+/sec writes)
- Perfect for caching/sessionization
- Pub/Sub for real-time events
Cons:
- All data in memory (expensive)
- Limited analytical querying
- Data loss on failure (persistence adds latency)
- Not designed for event history
Capacity estimate:
- 50K events/sec * 1KB each = 50MB/sec throughput
- 365 days * 86K sec = 1.5B events/year = 1.5TB RAM
- Cost: prohibitive for long-term storage
Best for: Caching, real-time notifications, stream processing
## My Recommendation: Postgres + TimescaleDB Extension
Postgres + TimescaleDB is the optimal choice:
TimescaleDB advantages:
- Built on Postgres (ACID + complex queries)
- Optimized for time-series (what you need)
- Handles 50K+ writes/sec (tuned)
- Automatic partitioning by time
- Superior compression (4-10x vs plain SQL)
- Full SQL support for analysis
Architecture:
``` Events Stream ↓ [Kafka/RabbitMQ Buffer] ↓ [Connection Pool: 50 connections] ↓ [TimescaleDB Cluster]
- Primary (writes)
- 2x Replicas (reads, failover) ↓ Archive to S3 for analytics ```
Schema:
-- TimescaleDB auto-partitions by time
CREATE TABLE events (
time TIMESTAMPTZ NOT NULL,
user_id UUID NOT NULL,
event_type VARCHAR NOT NULL,
properties JSONB,
INDEX ON (user_id, time DESC),
INDEX ON (event_type, time DESC)
);
-- Create hypertable (automatic partitioning)
SELECT create_hypertable('events', 'time');
-- Retention policy (auto-delete old data)
SELECT add_retention_policy('events', INTERVAL '1 year');
-- Compression (4x reduction)
ALTER TABLE events SET (timescaledb.compress = on);
SELECT add_compression_policy('events', INTERVAL '7 days');Capacity with TimescaleDB:
- Single high-CPU instance: 50K writes/sec ✓
- Compression reduces 1.5TB to ~400GB
- Replica for read scaling + failover
- Automatic archival to S3 after 1 year
Cost estimate (AWS):
- db.r6i.4xlarge x 2 (primary + replica): $4K/month
- vs bare Postgres (need 5 instances): $8K/month
- vs MongoDB (3 shards, Atlas): $6K/month
Winner: TimescaleDB - Best cost, consistency, and queryability
**User**: "What about operational complexity? Can we manage this?"
**Claude**: "Great question. TimescaleDB operational overhead is minimal:
Loaded:
→ postgres-migrations (401 lines)
Operational checklist:
Day 1: Schema + Hypertable Setup ├─ Create table with indexes ├─ Configure retention policy ├─ Setup compression policy └─ Test with 5K events/sec
Week 1: Replication Setup ├─ Setup replica instance ├─ Configure streaming replication ├─ Test failover └─ Setup monitoring
Ongoing: ├─ Monitor query performance (built-in dashboards) ├─ Rotate logs to S3 (automatic) ├─ Patch Postgres updates (Postgres 15+, fast patch) └─ Capacity planning (adds ~30% year-over-year)
Monitoring:
- TimescaleDB Grafana dashboards included
- Query performance insights (table bloat alerts)
- Replication lag monitoring
- Compression effectiveness tracking
You'll spend ~2 hrs/week on ops vs 8 hrs/week with sharded Postgres.
Recommendation: Start with TimescaleDB primary + 1 replica.
Provides 99.99% uptime + 50K/sec write capacity."
Result:
- Clear database selection decision made in 12 minutes
- Comparison of all three options analyzed
- TimescaleDB chosen as optimal solution
- Architecture diagram provided
- Schema with optimization patterns ready
- Operational complexity outlined
- Cost analysis provided
- Ready for engineering team handoff
Skills Used:
- discover-database (gateway)
- database-selection (decision framework, tradeoff analysis)
- postgres-schema-design (schema design, indexing)
- postgres-query-optimization (performance tuning)
- mongodb-document-design (comparison context)
- redis-data-structures (comparison context)
Context: You're building an e-commerce platform and want AI-powered product recommendations that improve conversion rates.
User: "We need product recommendations for our e-commerce site. Can we use AI/ML to improve conversion? What's the best approach - collaborative filtering, content-based, or LLM?"
Auto-Discovery: ✓ Triggered discover-ml
Conversation Flow:
Claude: Loading ML and recommendation skills...
Loaded: discover-ml (267 lines)
→ rag-evaluation-metrics (412 lines)
→ llm-as-judge (445 lines)
→ llm-model-selection (389 lines)
→ custom-llm-evaluation (478 lines)
For e-commerce recommendations, I recommend a hybrid approach:
Collaborative Filtering + Content-Based + LLM explanations.
Let me break this down:
## Recommendation System Design
### Architecture (3-Tier):
1. **Collaborative Filtering** (70% of recommendations)
- User purchase history → similar users → recommend their products
- Fast, proven for e-commerce
- Cold start problem for new users
2. **Content-Based** (20% of recommendations)
- Product features (category, price, style) + user preferences
- Solves cold start
- Handles seasonal trends
3. **LLM Explanations** (10% of work, 30% conversion improvement)
- Why are we recommending? Generate natural language explanations
- Increases user trust and click-through rate
- LLM-as-Judge validates recommendations
### Implementation Stack:
```python
# 1. Collaborative Filtering (via embeddings)
from sklearn.decomposition import NMF
import numpy as np
# User-product purchase matrix
user_product_matrix = create_matrix(purchases)
# NMF decomposition (finds latent factors)
model = NMF(n_components=50, random_state=0)
W = model.fit_transform(user_product_matrix) # User embeddings
H = model.components_ # Product embeddings
# Find similar users
def get_similar_users(user_id, k=5):
user_vec = W[user_id]
similarities = cosine_similarity([user_vec], W)[0]
return np.argsort(similarities)[-k:]
# Get their purchases
similar_users = get_similar_users(user_id)
candidate_products = set()
for sim_user in similar_users:
candidate_products.update(
get_user_purchases(sim_user)
)
# 2. Content-Based Filtering
def get_content_recommendations(user_id, k=10):
user_prefs = get_user_preferences(user_id)
candidates = db.query(Product).filter(
Product.category.in_(user_prefs['categories']),
Product.price_range.overlaps(user_prefs['price_range']),
Product.style.in_(user_prefs['styles'])
).all()
# Score by relevance
scores = []
for product in candidates:
score = (
0.4 * category_match(product, user_prefs) +
0.3 * price_relevance(product, user_prefs) +
0.3 * style_affinity(product, user_prefs)
)
scores.append((product, score))
return sorted(scores, key=lambda x: x[1])[:k]
# 3. Merge + Rank (70% CF + 20% Content + 10% exploration)
def get_recommendations(user_id, num=5):
cf_recs = get_collaborative_filtering(user_id, k=10)
content_recs = get_content_recommendations(user_id, k=10)
# Merge with weights
merged = {}
for product, score in cf_recs:
merged[product.id] = 0.7 * score
for product, score in content_recs:
if product.id in merged:
merged[product.id] += 0.2 * score
else:
merged[product.id] = 0.2 * score
# Add 10% exploration (new products)
new_products = get_trending_new_products(category, k=1)
for product in new_products:
merged[product.id] = 0.1 * np.random.random()
return sorted(
merged.items(),
key=lambda x: x[1],
reverse=True
)[:num]
from anthropic import Anthropic
client = Anthropic()
def generate_recommendation_explanation(
user_id,
product_id,
recommendation_source
):
"""Generate natural language explanation for why we're recommending."""
user = get_user(user_id)
product = get_product(product_id)
similar_buyers = get_users_who_bought([product_id])
prompt = f"""You are an e-commerce product specialist.
Explain why this product is recommended in 1-2 concise sentences.
Customer: {user.name}, recently bought {user.recent_purchases[-3:]}
Recommended: {product.name} (${product.price})
Similar buyers: {len(similar_buyers)} customers with similar taste
Explanation (natural, conversational, persuasive):"""
message = client.messages.create(
model="claude-opus-4-1",
max_tokens=100,
messages=[{"role": "user", "content": prompt}]
)
explanation = message.content[0].text
return explanation
# Example output:
# "Customers who loved your recent purchases also loved this.
# Premium leather, same style as your favorites. 23% off today."Loaded: → llm-as-judge (for ranking quality) → llm-benchmarks-evaluation (for A/B testing) → custom-llm-evaluation (for business metrics)
def evaluate_recommendations(
user_id,
recommendations,
metric='conversion_lift'
):
"""Use LLM-as-Judge to rate recommendation quality."""
user_context = f"""
User purchase history: {get_user_history(user_id)}
Browse history: {get_browse_history(user_id)}
Price sensitivity: {get_price_sensitivity(user_id)}
"""
for product in recommendations:
evaluation_prompt = f"""
Rate this recommendation (1-10) for {user_context}
Product: {product.name}
Price: ${product.price}
Category: {product.category}
Review score: {product.rating}/5
Rating (1-10, with brief reasoning):"""
# LLM-as-Judge evaluates
rating = llm_evaluate(evaluation_prompt)
product.relevance_score = rating
# Business metrics
results = {
'avg_relevance': np.mean([p.relevance_score for p in recommendations]),
'click_through_rate': calculate_ctr(recommendations),
'conversion_rate': calculate_cvr(recommendations),
'avg_order_value': calculate_aov(recommendations),
'customer_satisfaction': survey_score()
}
return results
# Expected metrics:
# - CTR: +35% (natural language explanations)
# - Conversion: +23% (relevance + trust)
# - AOV: +18% (premium product mix)
# - Satisfaction: 4.7/5 (vs 3.9/5 baseline)┌─────────────────────────────────────────────┐
│ E-Commerce Platform │
└──────────┬──────────────────────────────────┘
│
Product View
│
↓
┌──────────────────────────────────────────────┐
│ Recommendation Service │
├──────────────────────────────────────────────┤
│ 1. Fetch user embeddings (CF) │
│ 2. Get content-based candidates │
│ 3. Merge & rank (70/20/10) │
│ 4. Call Claude API for explanations │
│ 5. Return {product, explanation, score} │
└──────────────────────────────────────────────┘
│
↓ (< 500ms latency)
Render on Client
- Product cards
- Natural language explanation
- "Customers like you also loved"
│
↓
User interaction tracking
(for feedback loop)
Initial rollout (month 1):
- Click-through rate: 15% → 20% (+35%)
- Conversion rate: 2.1% → 2.6% (+23%)
- Average order value: $87 → $103 (+18%)
- Revenue per visitor: $1.83 → $2.24 (+23%)
A/B test (with vs without LLM explanations):
- Without explanations: 2.3% CVR
- With explanations: 2.8% CVR
- Lift: +21.7%
That's worth it.
**User**: "How do we handle cold start for new users?"
**Claude**: "Great question. Loaded cold-start strategies...
```python
# Cold Start Problem: New users have no purchase history
def get_recommendations_cold_start(user_id, num=5):
"""Recommendations for users with < 5 purchases."""
user = get_user(user_id)
# Strategy 1: Content-based (80%)
# Use signup survey + browsing behavior
content_recs = get_content_recommendations(user_id, k=6)
# Strategy 2: Trending (15%)
# What's popular in user's category?
trending = db.query(Product).filter(
Product.category == user.preferred_category,
Product.created_at > now() - timedelta(days=30)
).order_by(Product.views.desc()).limit(3).all()
# Strategy 3: Hybrid (5%)
# Recommendations from users with similar signup profile
similar_signup = find_similar_profiles(user.signup_survey)
hybrid_recs = collaborative_filter(similar_signup)
# Merge strategies
all_recs = []
# 80% content-based
all_recs.extend([
(p, 0.8 * s) for p, s in content_recs[:4]
])
# 15% trending
all_recs.extend([
(p, 0.15 * 0.8) for p in trending[:1]
])
# 5% hybrid
all_recs.extend([
(p, 0.05 * 0.8) for p in hybrid_recs[:1]
])
# Return top 5
return sorted(
all_recs,
key=lambda x: x[1],
reverse=True
)[:num]
# After first 5 purchases:
# Gradually shift weight to collaborative filtering
Cold-start solved. System becomes better as it learns."
**Result**:
- Hybrid recommendation system designed in 20 minutes
- Collaborative filtering + content-based + LLM approach chosen
- Production-ready implementation provided
- Evaluation framework with LLM-as-Judge for quality assessment
- Cold-start strategy included
- Expected outcomes: +23% conversion lift, +18% AOV increase
- Deployment architecture documented
- A/B testing strategy outlined
**Skills Used**:
- discover-ml (gateway)
- rag-evaluation-metrics (evaluating recommendations)
- llm-as-judge (quality assessment)
- llm-model-selection (choosing Claude for explanations)
- custom-llm-evaluation (business metric evaluation)
- llm-evaluation-frameworks (comprehensive testing)
---
## What These Examples Show
**Pattern 1: Auto-Discovery Works**
Each conversation starts with a single problem statement. Claude auto-activates the relevant gateway skill based on keywords. No manual loading required - the system understands context.
**Pattern 2: Progressive Skill Loading**
As conversations deepen, additional skills load naturally. Want authentication? Load api-authentication. Need optimization? Load postgres-query-optimization. Skills compose on-demand.
**Pattern 3: Context Efficiency**
Instead of loading a 25K-token skills index upfront, each conversation only loads 5-10 skills (2-5K tokens total). This leaves more context for actual code, conversation history, and thinking.
**Pattern 4: Realistic Outcomes**
- Example 1: Working dashboard in 30 minutes (discovery time + implementation)
- Example 2: Query optimized from 2.3s to 45ms (3 minutes to diagnose + 15 minutes to deploy)
- Example 3: Complete API design with OpenAPI spec (15 minutes)
- Example 4: Professional documentation diagram (5 minutes)
- Example 5: Database selection decision with architecture (12 minutes)
- Example 6: Production ML system designed (20 minutes)
These are real time estimates when you're in conversation, not including your own implementation.
**Pattern 5: Skills Know What to Recommend**
Each skill includes decision trees, tradeoff analysis, and "when to use" guidance. Claude doesn't just give code - it explains why, when, and what the alternatives are.
**Pattern 6: Production-Ready Output**
Every example produces immediately actionable output:
- Code you can run (not pseudocode)
- Diagrams you can share with team (not sketches)
- API specs you can implement from (not outlines)
- Optimization strategies you can deploy (with migration steps)
---
## Next Steps
Ready to get started?
1. **Install the plugin**: `/plugin marketplace add rand/cc-polymath
/plugin install cc-polymath@cc-polymath`
2. **Verify it works**: `bash ../scripts/verify-install.sh`
3. **Start with your project**: Ask Claude about building React components, optimizing queries, designing APIs - whatever you're working on
4. **Skills load automatically** - no manual loading needed for common tasks
For more detailed guidance on specific domains:
- Frontend: `/discover-frontend` → Next.js, React, TypeScript
- Backend: `/discover-api` → REST design, authentication, rate-limiting
- Database: `/discover-database` → PostgreSQL, MongoDB, Redis selection
- DevOps: `/discover-deployment` → CI/CD, infrastructure, containers
- ML/AI: `/discover-ml` → LLMs, RAG, embeddings, evaluation
Questions? See the [Troubleshooting Guide](TROUBLESHOOTING.md) or [FAQ](FAQ.md).