Solution Guide (Interviewer Reference)

Bug Summary

Bug 1: Missing Environment Variable (Easy)

File: docker-compose.yml
Issue: DEVICE_SERVICE_URL not set for workflow-service
Symptom: workflow-service crashes on startup with KeyError: 'DEVICE_SERVICE_URL'
Fix: Add - DEVICE_SERVICE_URL=http://device-service:5001 to workflow-service environment

Bug 2: Wrong API Endpoint Path (Easy)

File: services/workflow-service/app.py, line ~134
Issue: Uses /device/{id}/reserve instead of /devices/{id}/book
Symptom: Starting a workflow returns 404 from device service
Fix: Change to f"{DEVICE_SERVICE_URL}/devices/{device_id}/book"

Bug 3: Race Condition (Medium)

File: services/device-service/app.py, book_device()
Issue: time.sleep(0.1) between status check and update, no DB locking
Symptom: ./test-race-condition.sh shows multiple bookings for same device
Fix: Use SELECT ... FOR UPDATE:

with db.session.begin():
    device = Device.query.filter_by(id=device_id).with_for_update().one()
    if device.status != "available":
        return error
    device.status = "busy"

Bug 4: N+1 Query Problem (Medium)

File: services/workflow-service/app.py, list_workflows()
Issue: For each workflow, individually queries WorkflowSample then Sample
Symptom: GET /workflows generates ~150+ SQL queries (visible with SQLALCHEMY_ECHO=true)
Fix: Use eager loading or a joined query:

from sqlalchemy.orm import joinedload
workflows = Workflow.query.options(
    joinedload(Workflow.steps)
).all()
# Then batch-load samples with a single IN query

Bug 5: Missing Index (Medium)

File: services/device-service/models.py, ExecutionLog.started_at
Issue: No index on started_at column, 500k rows
Symptom: GET /execution-logs/search?start=2025-01-01&end=2025-01-02 takes 5-15 seconds
Fix: Add index=True to the column definition or create index via SQL

Bug 6: OOM - Unbounded Query (Hard)

File: services/device-service/app.py, get_all_execution_logs()
Issue: ExecutionLog.query.all() loads 500k rows into memory
Symptom: GET /execution-logs crashes the container (256M memory limit)
Fix: Add pagination:

page = request.args.get("page", 1, type=int)
per_page = min(request.args.get("per_page", 50, type=int), 100)
pagination = ExecutionLog.query.order_by(
    ExecutionLog.started_at.desc()
).paginate(page=page, per_page=per_page)

Phase 2: Device Queue Implementation

Expected Schema

CREATE TABLE device_queue (
    id SERIAL PRIMARY KEY,
    device_id VARCHAR(50) NOT NULL,
    workflow_id VARCHAR(36) NOT NULL,
    priority VARCHAR(10) NOT NULL DEFAULT 'normal',
    priority_value INTEGER NOT NULL DEFAULT 5,
    status VARCHAR(20) NOT NULL DEFAULT 'queued',
    queued_at TIMESTAMP NOT NULL DEFAULT NOW(),
    booked_at TIMESTAMP
);

Expected Endpoints

POST /devices/{id}/queue - Queue a booking request with priority
GET /devices/{id}/queue - View current queue
GET /devices/{id}/wait-time - Estimated wait time

Key Things to Look For

Proper ordering: priority_value ASC, queued_at ASC
Transaction safety on release + auto-book
Handling edge cases (cancelled queue entries, device in maintenance)
Concurrency handling on queue operations

Evaluation Criteria

Phase 1 (Debugging)

Systematic approach: Do they check logs first? Follow the error trail?
Understanding: Can they explain WHY each bug exists?
DB knowledge: Do they recognize N+1 pattern? Know about indexes?
Fix quality: Are fixes correct and complete?

Phase 2 (Feature)

Schema design: Appropriate columns, types, constraints?
Priority logic: Correct ordering algorithm?
Concurrency: Do they consider race conditions in the queue?
Code quality: Clean, well-structured, handles edge cases?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Solution Guide (Interviewer Reference)

Bug Summary

Bug 1: Missing Environment Variable (Easy)

Bug 2: Wrong API Endpoint Path (Easy)

Bug 3: Race Condition (Medium)

Bug 4: N+1 Query Problem (Medium)

Bug 5: Missing Index (Medium)

Bug 6: OOM - Unbounded Query (Hard)

Phase 2: Device Queue Implementation

Expected Schema

Expected Endpoints

Key Things to Look For

Evaluation Criteria

Phase 1 (Debugging)

Phase 2 (Feature)

FilesExpand file tree

SOLUTION_GUIDE.md

Latest commit

History

SOLUTION_GUIDE.md

File metadata and controls

Solution Guide (Interviewer Reference)

Bug Summary

Bug 1: Missing Environment Variable (Easy)

Bug 2: Wrong API Endpoint Path (Easy)

Bug 3: Race Condition (Medium)

Bug 4: N+1 Query Problem (Medium)

Bug 5: Missing Index (Medium)

Bug 6: OOM - Unbounded Query (Hard)

Phase 2: Device Queue Implementation

Expected Schema

Expected Endpoints

Key Things to Look For

Evaluation Criteria

Phase 1 (Debugging)

Phase 2 (Feature)