Cloud Architect | Event-Driven Systems Builder | Marathon Runner | Learning in Public
I design distributed systems for cloud platforms and explore how resilience principles from endurance sports apply to building reliable software. This is my laboratory, where architecture thinking meets code and theory meets the constraints of running real applications.
If you're interested in:
→ Start with EventsTracker
A production-grade multi-service platform exploring RabbitMQ choreography, ShedLock coordination, and Kubernetes operations. It now runs under both local and prod profiles on Spring Boot 3.5.7 and Java 21, with spring-cloud-config integration. Built to answer: how do you handle distributed transactions and race conditions at scale?
→ Start with Runs AI Analyzer (Active Development)
Uses semantic caching (PgVector + Claude API + Ollama embeddings) to analyze running data as a testbed for RAG patterns and real-time anomaly detection. It accepts Garmin payloads and publishes RabbitMQ events into the EventsTracker topology. Recent work: EventsTracker integration, ECS deployment configs, multi-service orchestration. Why? Because marathons taught me that resilience is a system property, not a component.
→ Coming Soon: EKS Terraform Labs (Learning Phase)
Reverse-engineering click-next cloud clusters into versioned, reviewed, reproducible infrastructure. Learning to go from "eksctl create cluster" to "infrastructure as a git-reviewed system."
→ Explore AI Agent Experiments
Auto-triaging stale branches, reconciling Terraform state with live resources, drafting ADRs from commit history. Early-stage exploration of how AI agents can reduce toil.
I write longer pieces at sathishjayapal.me (canonical source) and cross-post to Medium @dotsky.
- From eksctl to Terraform: Making Sense of EKS Resources
  How to take an EKS cluster created with eksctl and reverse-engineer it into maintainable Terraform modules. The gap between "click-next cloud" and "infrastructure you can version and review."
- Designing Scalable Queues for Real-World Workloads
  Patterns for moving RabbitMQ from hobby projects to resilient production-like setups: dead-lettering, backpressure, observability. This thinking is baked into EventsTracker.
- Tackling Distributed Transactions in Microservices (cross-posted)
  Using ShedLock for distributed task scheduling and avoiding race conditions in Kubernetes. Real constraints. Real solutions.
- From Marathon Dreams to Injury Recovery: A Runner's Journey
  How systems thinking from distributed systems applies to running recovery, feedback loops, and building resilience into training design.
- Semantic Caching for Intelligent Running Analysis
  Using PgVector with Ollama embeddings and the Claude API to avoid re-analyzing past running data. RAG patterns at personal scale.

→ See all posts
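The dead-lettering and backpressure ideas from the queue post above can be illustrated without a broker. What follows is a minimal in-memory sketch, not the EventsTracker implementation and not RabbitMQ's API: the class `DeadLetterSketch`, the `Message` record, and the retry limit are all hypothetical names chosen for illustration. A bounded queue refuses new work when full (backpressure), and a message that keeps failing is moved to a dead-letter list after a fixed number of attempts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Predicate;

/** In-memory model of a bounded work queue with retries and a dead-letter list. */
final class DeadLetterSketch {
    // Hypothetical message type: a payload plus the number of failed attempts so far.
    record Message(String payload, int attempts) {}

    private final BlockingQueue<Message> work;
    private final List<Message> deadLetters = new ArrayList<>();
    private final int maxAttempts;

    DeadLetterSketch(int capacity, int maxAttempts) {
        this.work = new ArrayBlockingQueue<>(capacity);
        this.maxAttempts = maxAttempts;
    }

    /** Backpressure: returns false when the queue is full instead of growing without bound. */
    boolean publish(String payload) {
        return work.offer(new Message(payload, 0));
    }

    /** Drains the queue; a failed message is retried until maxAttempts, then dead-lettered. */
    void drain(Predicate<String> handler) {
        Message m;
        while ((m = work.poll()) != null) {
            if (handler.test(m.payload())) {
                continue; // handled successfully
            }
            int attempts = m.attempts() + 1;
            if (attempts >= maxAttempts) {
                deadLetters.add(new Message(m.payload(), attempts));
            } else {
                work.offer(new Message(m.payload(), attempts)); // requeue for another try
            }
        }
    }

    List<Message> deadLetters() {
        return List.copyOf(deadLetters);
    }

    public static void main(String[] args) {
        DeadLetterSketch queue = new DeadLetterSketch(2, 3);
        queue.publish("ok");
        queue.publish("poison");
        System.out.println("accepted third message: " + queue.publish("extra"));
        queue.drain(payload -> !payload.equals("poison"));
        System.out.println("dead letters: " + queue.deadLetters());
    }
}
```

In a real RabbitMQ setup the broker and its dead-letter exchange would play the role of the list here; the sketch only shows the control flow a consumer has to reason about.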
A multi-service event ingestion platform with config server integration, production profiles, and Kubernetes-native design.
- Why: To understand how production systems handle distributed transactions, race conditions, and resilience at small scale before enterprise scale.
- Tech: Java 21 • Spring Boot 3.5.7 • Spring Cloud Config • RabbitMQ • PostgreSQL/Flyway • Kubernetes • Maven
- Focus: Event-driven choreography, ShedLock coordination, zero-trust microservice security, production env support.
- Status: Core event ingestion stable and production-ready; config-server integration tested; running locally with spring profiles (local/prod).
- Recent: Production profile support, env-based config sourcing, script-driven deployment, CI policy enforcement for README updates.
- Next: Zero-downtime deployments, comprehensive observability (metrics/tracing/logging), Kubernetes Helm charts.

→ Go to EventsTracker | Read the blog post
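The ShedLock coordination listed under Focus rests on a simple idea: every replica tries to claim a named lock with an expiry, and only the replica that succeeds runs the scheduled task. The sketch below models that idea in memory. It is an assumption-laden illustration, not ShedLock's API: `LeaseLockSketch` and `tryLock` are hypothetical names, and the map stands in for the shared database table a real deployment would use. The parameter name `lockAtMostFor` is borrowed from ShedLock's terminology for the maximum lease duration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** In-memory model of a lease-style lock: one holder per name until the lease expires. */
final class LeaseLockSketch {
    // lock name -> instant until which the lock is held (stand-in for a shared lock table)
    private final ConcurrentMap<String, Instant> lockedUntil = new ConcurrentHashMap<>();

    /**
     * Tries to take the named lock at time {@code now}.
     * Succeeds only if nobody holds it or the previous lease has expired.
     */
    boolean tryLock(String name, Instant now, Duration lockAtMostFor) {
        Instant newExpiry = now.plus(lockAtMostFor);
        boolean[] acquired = {false};
        // compute() runs atomically per key, so two callers cannot both win the same lease.
        lockedUntil.compute(name, (key, currentExpiry) -> {
            if (currentExpiry == null || !currentExpiry.isAfter(now)) {
                acquired[0] = true;
                return newExpiry;
            }
            return currentExpiry;
        });
        return acquired[0];
    }

    /** Releases the lock early (simplified: the entry is removed outright). */
    void unlock(String name) {
        lockedUntil.remove(name);
    }

    public static void main(String[] args) {
        LeaseLockSketch locks = new LeaseLockSketch();
        Instant now = Instant.now();
        Duration lease = Duration.ofMinutes(5);
        System.out.println("replica A acquired: " + locks.tryLock("ingest-job", now, lease));
        System.out.println("replica B acquired: " + locks.tryLock("ingest-job", now.plusSeconds(1), lease));
    }
}
```

Passing `now` explicitly keeps the sketch deterministic and testable; the expiry is what prevents a crashed replica from holding the lock forever.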
A multi-service platform for ingesting Garmin running data, analyzing via Claude API, storing in PgVector, and publishing events.
- Why: Marathons taught me that resilience is a system property. I'm applying that insight to real-time athletic performance analytics using RAG patterns.
- Tech: Java 21 • Spring Boot 4.0.1 • Spring AI 2.0.0-M1 (Claude + Ollama) • PgVector • PostgreSQL • RabbitMQ • OpenAPI/Swagger
- Focus: RAG-based semantic caching, EventsTracker integration (RabbitMQ topology), force-refresh for fresh analysis, Garmin payload compatibility.
- Status: Core analysis stable; PgVector RAG cache working; EventsTracker event publishing integrated; Ollama embeddings live; ECS deployment configs added.
- Recent: Multi-service orchestration with EventsTracker, integration test suite (three-service topology), event payload schemas, ECS task definitions for cloud deployment.
- Next: Kubernetes deployment (Helm), multi-region event consistency patterns, anomaly detection for injury prevention signals.

→ Go to Runs App | Read the blog post
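The semantic caching described above comes down to one check: before paying for a fresh analysis, look for a previously analyzed run whose embedding is close enough to the new one. Here is a minimal in-memory sketch of that check, assuming cosine similarity and a fixed threshold. `SemanticCacheSketch`, its methods, and the threshold value are hypothetical; in the real project the vectors would live in PgVector (which provides a cosine distance operator) and the embeddings would come from Ollama.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** In-memory model of a semantic cache keyed by embedding similarity. */
final class SemanticCacheSketch {
    private record Entry(double[] embedding, String analysis) {}

    private final List<Entry> entries = new ArrayList<>();
    private final double threshold; // minimum cosine similarity that counts as a cache hit

    SemanticCacheSketch(double threshold) {
        this.threshold = threshold;
    }

    void put(double[] embedding, String analysis) {
        entries.add(new Entry(embedding.clone(), analysis));
    }

    /** Returns the cached analysis of the most similar entry at or above the threshold. */
    Optional<String> lookup(double[] query) {
        Entry best = null;
        double bestScore = threshold;
        for (Entry e : entries) {
            double score = cosine(query, e.embedding());
            if (score >= bestScore) {
                best = e;
                bestScore = score;
            }
        }
        return Optional.ofNullable(best).map(Entry::analysis);
    }

    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0; // treat a zero vector as dissimilar to everything
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        SemanticCacheSketch cache = new SemanticCacheSketch(0.95);
        cache.put(new double[] {1.0, 0.0}, "steady aerobic run");
        System.out.println(cache.lookup(new double[] {0.99, 0.05})); // near-duplicate: expected hit
        System.out.println(cache.lookup(new double[] {0.0, 1.0}));   // orthogonal: expected miss
    }
}
```

A force-refresh option, as mentioned under Focus, would simply skip `lookup` and overwrite the stored entry with a new analysis.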
Reverse-engineering EKS clusters created with eksctl into clean, versioned Terraform modules.
- Why: Too many teams run "cloud click-next" deployments. This is how you move from ad-hoc to reviewable infrastructure.
- Tech: Terraform • AWS EKS • Kubernetes • Infrastructure as Code
- Status: Early exploration; learning the mapping from eksctl-generated resources to idiomatic Terraform.

→ Read the blog post
Exploring AI agents to reduce engineering toil:
- Auto-triaging stale branches and PRs
- Reconciling Terraform state with live Kubernetes/EKS/AKS resources
- Drafting ADRs and changelogs from commit history

→ Browse AI experiments
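Reconciling Terraform state with live resources is, at its core, a comparison of two sets of resource identifiers. The sketch below shows only that comparison, under the assumption that both sides have already been collected as plain ID strings. `DriftSketch`, `Drift`, and the example IDs are hypothetical, and nothing here calls Terraform or a cloud API.

```java
import java.util.Set;
import java.util.TreeSet;

/** Compares resource IDs recorded in state with resource IDs found live. */
final class DriftSketch {
    /**
     * missingLive: tracked in state but not found live.
     * unmanaged: found live but not tracked in state.
     */
    record Drift(Set<String> missingLive, Set<String> unmanaged) {}

    static Drift compare(Set<String> inState, Set<String> live) {
        Set<String> missingLive = new TreeSet<>(inState);
        missingLive.removeAll(live);
        Set<String> unmanaged = new TreeSet<>(live);
        unmanaged.removeAll(inState);
        return new Drift(missingLive, unmanaged);
    }

    public static void main(String[] args) {
        // Hypothetical resource IDs, for illustration only.
        Set<String> inState = Set.of("nodegroup/blue", "cluster/main");
        Set<String> live = Set.of("cluster/main", "nodegroup/green");
        System.out.println(compare(inState, live));
    }
}
```

An agent would sit around this comparison, gathering the two sets and proposing what to do with each difference; the diff itself stays deterministic and reviewable.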
Languages & Frameworks
Java • Spring Boot • Spring Cloud • Spring AI • REST APIs • Event-Driven Architectures
Cloud & Infrastructure
AWS (EKS, RDS, S3, ECS) • Azure • Kubernetes • Terraform • Infrastructure as Code • Spring Cloud Config
Data & Patterns
PostgreSQL • RabbitMQ/Kafka • Distributed Transactions • PgVector/Semantic Search • Real-Time Analytics • RAG Caching
Architecture Styles
Microservices • Event-Driven • Domain-Driven Design • CQRS • Zero-Trust Security
Marathoner (Transitioning): 9 marathon finishes; now training for the Flying Pig Half Marathon (Cincinnati, May 2026), with a focus on injury recovery and systems-based training design. Every long run is a lesson in system design: feedback loops, resilience, constraint management, recovery.
Thesis: The principles that make distributed systems resilient (redundancy, graceful degradation, observability, feedback loops) are the same principles that make training cycles effective. I explore this at the intersection of both domains.
Location: Madison/Sun Prairie, Wisconsin. Always happy to discuss architecture over South Indian coffee.
Blog → sathishjayapal.me (canonical source of all posts)
Medium → @dotsky (cross-posted, always with canonical link back)
Interested in collaborating, discussing architecture, or connecting on cloud modernization?
→ Open an issue on any repo or reach out at contact@sathishjayapal.me
- EventsTracker: Production profiles working; config server integration live; local/prod env switching via scripts
- Runs AI Analyzer: EventsTracker integration complete; semantic caching with PgVector stable; ECS deployment configs added; three-service integration tests passing
- Learning: CKAD certification prep; Terraform EKS reverse-engineering; Spring AI + Claude API patterns
- Writing: In-progress piece on RAG pattern trade-offs and multi-region event consistency
- Running: Training cycle 2026 (half-marathon focus); injury recovery + systems-based periodization model
- Learn from the code: Each project has a detailed README explaining the "why" alongside the "how."
- Read the architecture posts first: Blog posts provide context for why code is structured the way it is.
- Follow the learning journey: From CKAD exploration → EventsTracker → Kubernetes ops patterns → RAG systems.
- Engage & discuss: Open issues for questions, architecture debates, or alternative approaches.
- Contribute: Forks, PRs, and improvements welcome.
This is not a portfolio of finished products. It's a learning laboratory in public:
- Real constraints (Kubernetes, distributed transactions, RAG patterns, Spring AI integration)
- Real decisions (documented in Architecture Decision Records)
- Real friction (MapStruct compilation, reconciling Terraform state, Ollama embedding complexity)
- Real outcomes (blog posts, working applications, operational insights)

The goal is to show how I think, not just what I've built.
Always learning. Always building. Always honest.


