Skip to content

atomicol/distributed-tasks

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Distributed Task Processing System

A scalable, production-ready distributed task processing system built with Go, PostgreSQL, and Redis.

Features

  • Horizontal Scaling: Deploy multiple worker processes across different machines
  • Priority Queue: Redis-based job queuing with priority support and atomic operations
  • Persistent Storage: PostgreSQL for reliable job storage and worker management
  • REST API: Comprehensive HTTP API for job submission, monitoring, and management
  • Built-in Handlers: Ready-to-use job processors for common tasks
  • Retry Logic: Automatic retry with exponential backoff for failed jobs
  • Worker Health: Heartbeat monitoring and automatic worker registration
  • Graceful Shutdown: Safe handling of in-flight jobs during shutdown
  • Real-time Stats: Job processing metrics and queue monitoring

Architecture

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   API Gateway   │    │     Redis       │    │   PostgreSQL    │
│   (Port 8080)   │◄──►│   (Queue)       │    │   (Storage)     │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                       │                       │
         │              ┌─────────────────┐              │
         └──────────────►│    Worker 1     │◄─────────────┘
                        └─────────────────┘
                        ┌─────────────────┐
                        │    Worker 2     │◄─────────────┘
                        └─────────────────┘
                        ┌─────────────────┐
                        │    Worker N     │◄─────────────┘
                        └─────────────────┘

Quick Start

Prerequisites

# Install dependencies
brew install postgresql redis go

# Start services
brew services start postgresql
brew services start redis

# Create database
createdb distributed_tasks

Build and Deploy

# Build all components
make build

# Start all services in background
make start-all

# Or start services individually
make start-worker    # Terminal 1
make start-api      # Terminal 2

Test the System

# Submit a test job
curl -X POST http://localhost:8080/api/jobs \
  -H "Content-Type: application/json" \
  -d '{
    "type": "math",
    "payload": {
      "operation": "multiply",
      "a": 6,
      "b": 7
    },
    "priority": 10,
    "queue_name": "default"
  }'

# Check job status
curl http://localhost:8080/api/jobs/{job_id}

# Monitor queue stats
curl http://localhost:8080/api/stats

Built-in Job Types

The system includes several production-ready job handlers:

echo - Echo Handler

Echoes back the job payload for testing and debugging.

{
  "type": "echo",
  "payload": {
    "message": "Hello, World!",
    "data": [1, 2, 3]
  }
}

math - Mathematical Operations

Performs arithmetic operations with proper error handling.

{
  "type": "math",
  "payload": {
    "operation": "add|subtract|multiply|divide",
    "a": 10,
    "b": 5
  }
}

http_request - HTTP Client

Makes HTTP requests with configurable methods and headers.

{
  "type": "http_request",
  "payload": {
    "url": "https://api.example.com/data",
    "method": "GET",
    "headers": {
      "Authorization": "Bearer token123"
    }
  }
}

delay - Delay/Sleep

Simulates work with configurable delays (useful for testing and rate limiting).

{
  "type": "delay",
  "payload": {
    "duration": "5s",
    "message": "Processing delayed task"
  }
}

hello_world - Internationalization

Generates greetings in multiple languages.

{
  "type": "hello_world",
  "payload": {
    "name": "Alice",
    "language": "spanish"
  }
}

fail - Failure Testing

Always fails for testing error handling and retry logic.

{
  "type": "fail",
  "payload": {
    "reason": "Testing failure scenarios",
    "retryable": true
  }
}

API Endpoints

Job Management

  • POST /api/jobs - Submit a new job
  • GET /api/jobs/{id} - Get job details
  • PUT /api/jobs/{id} - Update job
  • DELETE /api/jobs/{id} - Delete job
  • GET /api/jobs - List jobs with filtering

Queue Management

  • GET /api/queues - List all queues
  • GET /api/queues/{name} - Get queue details
  • DELETE /api/queues/{name} - Purge queue

Worker Management

  • GET /api/workers - List active workers
  • GET /api/workers/{id} - Get worker details

System Status

  • GET /health - Health check
  • GET /api/stats - System statistics
  • GET /docs - API documentation

Configuration

Configure the system using environment variables:

# Database
export DB_HOST=localhost
export DB_PORT=5432
export DB_NAME=distributed_tasks
export DB_USER=postgres
export DB_PASSWORD=""

# Redis
export REDIS_HOST=localhost
export REDIS_PORT=6379
export REDIS_PASSWORD=""

# Worker Configuration
export WORKER_MAX_CONCURRENCY=10
export WORKER_QUEUES=default,high_priority,background
export WORKER_POLL_INTERVAL=1s
export WORKER_JOB_TIMEOUT=30m
export WORKER_MAX_RETRIES=3
export WORKER_RETRY_DELAY=5s

# API Gateway
export API_HOST=0.0.0.0
export API_PORT=8080
export API_READ_TIMEOUT=30s
export API_WRITE_TIMEOUT=30s

# Logging
export LOG_LEVEL=info
export LOG_FORMAT=json

Scaling Workers

Single Machine

# Start multiple workers with different concurrency
WORKER_MAX_CONCURRENCY=5 ./bin/worker &
WORKER_MAX_CONCURRENCY=3 ./bin/worker &

Multiple Machines

# Machine 1 - High priority jobs
WORKER_QUEUES=high_priority,default ./bin/worker

# Machine 2 - Background processing
WORKER_QUEUES=background,default ./bin/worker

# Machine 3 - Mixed workload  
WORKER_QUEUES=default,high_priority,background ./bin/worker

Docker Deployment

FROM golang:1.19-alpine AS builder
WORKDIR /app
COPY . .
RUN make build

FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=builder /app/bin/worker .
COPY --from=builder /app/bin/api-gateway .
CMD ["./worker"]

Development

Project Structure

distributed/
├── cmd/                    # Application entry points
│   ├── api-gateway/        # REST API server
│   └── worker/             # Worker process
├── internal/               # Internal packages
│   ├── api/                # HTTP handlers and routes
│   ├── config/             # Configuration management
│   ├── health/             # Health checks
│   ├── metrics/            # Metrics collection
│   ├── queue/              # Redis queue implementation
│   ├── storage/            # PostgreSQL storage
│   └── worker/             # Worker logic and handlers
├── pkg/                    # Public packages
│   ├── errors/             # Error handling utilities
│   ├── logger/             # Structured logging
│   └── models/             # Data models and types
├── deployments/            # Deployment configurations
├── docs/                   # Documentation
└── monitoring/             # Monitoring and observability

Adding Custom Job Handlers

package main

import (
    "context"
    "time"
    "eskev/distributed/internal/worker"
    "eskev/distributed/pkg/models"
)

// CustomHandler implements a custom job type
type CustomHandler struct{}

func (h *CustomHandler) GetJobType() string {
    return "image_resize"
}

func (h *CustomHandler) Process(ctx context.Context, job *models.Job) (*worker.JobResult, error) {
    // Extract parameters from job.Payload
    imageURL := job.Payload["image_url"].(string)
    width := int(job.Payload["width"].(float64))
    height := int(job.Payload["height"].(float64))
    
    // Process the image (your custom logic here)
    outputURL, err := resizeImage(imageURL, width, height)
    if err != nil {
        return &worker.JobResult{
            Success: false,
            Error:   err.Error(),
            Retryable: true,
        }, nil
    }
    
    return &worker.JobResult{
        Success: true,
        Result: map[string]interface{}{
            "input_url":  imageURL,
            "output_url": outputURL,
            "width":      width,
            "height":     height,
            "processed_at": time.Now(),
        },
    }, nil
}

func (h *CustomHandler) GetTimeout() time.Duration {
    return 5 * time.Minute
}

func (h *CustomHandler) IsRetryable(err error) bool {
    // Retry on network errors, not on invalid image formats
    return !strings.Contains(err.Error(), "invalid format")
}

// Register the handler
func main() {
    w := worker.NewWorker(cfg, store, queue, log)
    w.RegisterHandler(&CustomHandler{})
    w.Start()
}

Monitoring and Observability

The system provides comprehensive monitoring:

# Worker Statistics
curl http://localhost:8080/api/workers/{worker_id}/stats

# Queue Depth Monitoring
curl http://localhost:8080/api/queues/default/size

# Job Processing Rates
curl http://localhost:8080/api/stats/processing_rates

# Failed Job Analysis
curl "http://localhost:8080/api/jobs?status=failed&limit=10"

Performance Tuning

  1. Worker Concurrency: Adjust WORKER_MAX_CONCURRENCY based on CPU cores and job types
  2. Queue Strategy: Use multiple queues for different job priorities
  3. Database: Optimize PostgreSQL for your job volume
  4. Redis: Configure Redis persistence and memory settings
  5. Timeouts: Set appropriate timeouts for different job types

Production Deployment

High Availability Setup

  • Deploy multiple API gateway instances behind a load balancer
  • Run workers across multiple machines for fault tolerance
  • Use Redis Cluster for queue high availability
  • Set up PostgreSQL replication for data durability

Monitoring Stack

  • Metrics: Prometheus + Grafana
  • Logging: ELK Stack or structured JSON logs
  • Alerting: Monitor queue depth, worker health, job failure rates
  • Tracing: Add distributed tracing for complex job workflows

Available Commands

make build          # Build all binaries
make start-worker   # Start worker (blocking)
make start-api      # Start API gateway (blocking)
make start-all      # Start all services in background
make stop           # Stop background services
make clean          # Clean build artifacts
make help           # Show all commands

License

MIT License - see LICENSE file for details.

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Ensure all tests pass
  5. Submit a pull request

For questions and support, please open an issue on GitHub.

About

distributed task-processing system written in Go

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors