🛒 RoboShop — Production-Grade AWS Infrastructure as Code

Enterprise-ready e-commerce platform on AWS. Complete IaC with Terraform + Ansible. VPC to Auto Scaling. Zero manual steps. Includes CDN, VPN, and multi-tier security.


🏗 Architecture Overview

                              INTERNET
                                 │
                    ┌────────────┴────────────┐
                    │                         │
                    ▼                         ▼
              [CloudFront CDN]          [Frontend ALB]
              (Static Assets)           HTTPS :443 (SSL terminated)
                    │                         │
                    │                         ▼
                    │                  [Frontend ASG]
                    │                  Private Subnet
                    │                         │
                    └────────────┬────────────┘
                                 │
                                 ▼
                          [Backend ALB]
                          HTTP :80 (Internal)
                                 │
        ┌────────┬────────┬──────┼──────┬────────┬──────────┐
        │        │        │      │      │        │          │
        ▼        ▼        ▼      ▼      ▼        ▼          ▼
    [Catalogue] [User]  [Cart] [Ship] [Pay]  [Backend]  [Bastion]
    Node.js     Node.js  Node.js Java   Python (Private) SSH Jump
    :8080       :8080    :8080   :8080  :8080  Subnets    Public
        │        │ ║      │      │      │
        └────────┼─╫──────┴──────┘      │
                 ║ ║                    │
        ┌────────╨─╨────┬──────┬────────┘
        │               │      │
        ▼               ▼      ▼
    [MongoDB]      [Redis]  [MySQL]  [RabbitMQ]
    :27017         :6379    :3306    :5672
    Private EC2 Instances

Key Points:

  • ✅ All application EC2 instances in private subnets
  • ✅ Databases in private subnets, no internet access
  • ✅ Only the Frontend ALB and Bastion have public IPs
  • ✅ Backend ALB is internal-only, reachable only from the frontend tier
  • ✅ NAT Gateway enables outbound internet for private instances
  • ✅ All external traffic encrypted (TLS at the ALB, SSH for the Bastion)

📦 What's Included

| Layer | Folder | Provisions | Status |
|-------|--------|------------|--------|
| 00 | 00-VPC | VPC, 2x public/private subnets, IGW, NAT, route tables | ✅ Complete |
| 10 | 10-SG | 10 empty security groups (one per component) | ✅ Complete |
| 20 | 20-SG-Rules | All ingress/egress rules wiring SGs together | ✅ Complete |
| 30 | 30-Bastion | Bastion EC2 jump host in public subnet | ✅ Complete |
| 40 | 40-Databases | MongoDB, Redis, MySQL, RabbitMQ on EC2 + Route 53 DNS | ✅ Complete |
| 50 | 50-backend-alb | Internal ALB + target groups for backend services | ✅ Complete |
| 60 | 60-catalogue | Catalogue microservice (EC2 → AMI → ASG) | ✅ Complete |
| 70 | 70-acm | ACM wildcard certificate for *.opsora.space | ✅ Complete |
| 80 | 80-frontend-alb | Public-facing ALB with HTTPS listener + target groups | ✅ Complete |
| 90 | 90-components | Reusable module for User, Cart, Shipping, Payment, Frontend | ✅ Complete |
| 95 | 95cdn | CloudFront distribution for static assets + S3 | ✅ Complete |
| 99 | 99-vpn | OpenVPN server for secure remote access | ✅ Complete |

✅ Prerequisites

Tools Required

| Tool | Version | Purpose |
|------|---------|---------|
| Terraform | ≥ 1.5 | Infrastructure as Code provisioning |
| AWS CLI | ≥ 2.13 | Authentication, parameter retrieval |
| Ansible | ≥ 2.10 | Configuration management (EC2-based) |
| Git | any | Clone repos, pull Ansible roles |
| jq | any | JSON parsing (optional, for scripting) |

AWS Permissions Required

  • EC2: Full (instances, AMI, snapshots, launch templates)
  • VPC: Full (VPC, subnets, gateways, route tables, security groups)
  • ELB: Full (ALB, target groups, listeners)
  • Auto Scaling: Full (ASG, scaling policies, instance refresh)
  • ACM: Full (certificate provisioning)
  • Route 53: Full (hosted zones, DNS records)
  • IAM: Full (roles, policies, instance profiles)
  • SSM Parameter Store: Full (read/write parameters)
  • CloudFront: Full (distributions, cache behaviors)
  • S3: Full (bucket creation, policies)

AWS Account Setup

# Configure AWS credentials
aws configure
# or use a profile
export AWS_PROFILE=your-profile

# Verify access
aws sts get-caller-identity

Pre-deployment Checklist

  • Route 53 hosted zone opsora.space already created
  • Route 53 zone ID noted (required in variables)
  • Local IP address known (for Bastion SSH access)
  • SSL certificate domain (*.opsora.space) ready for ACM validation
  • Terraform backend configured (optional but recommended)

🚀 Quick Start

1. Clone Repository

git clone https://github.com/viho-kernel/Roboshop-Architecture
cd Roboshop-Architecture

2. Set Variables

Edit terraform.tfvars in each folder (or create one at root):

# terraform.tfvars
environment  = "dev"
project      = "roboshop"
owner        = "Your Name"
domain_name  = "opsora.space"
zone_id      = "Z1234567890ABC"  # Your Route 53 zone ID
app_version  = "v3"
my_ip        = "203.0.113.0/32"  # Your public IP for Bastion SSH

3. Deploy in Strict Order

#!/bin/bash
# deploy.sh - Deploy all layers in order

LAYERS=(
  "00-VPC"
  "10-SG"
  "20-SG-Rules"
  "30-Bastion"
  "40-Databases"
  "50-backend-alb"
  "60-catalogue"
  "70-acm"
  "80-frontend-alb"
  "90-components"
  "95cdn"
  "99-vpn"
)

for layer in "${LAYERS[@]}"; do
  echo "▶ Deploying $layer..."
  cd "$layer" || exit 1
  if ! terraform init || ! terraform apply -auto-approve; then
    echo "❌ $layer failed!"
    exit 1
  fi
  cd ..
  sleep 10  # Brief pause between layers
done

echo "✅ All layers deployed successfully!"
Make the script executable and run it:

chmod +x deploy.sh
./deploy.sh

⚠️ Critical: Each layer reads outputs from the previous layer via AWS SSM Parameter Store. Deploy strictly in numerical order or resources will fail to find dependencies.
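The handoff works roughly like this — a minimal sketch, with parameter paths and resource names assumed for illustration (they may differ from the repo's actual code):

```hcl
# A lower layer writes its outputs to SSM (parameters.tf) — illustrative names
resource "aws_ssm_parameter" "vpc_id" {
  name  = "/roboshop/dev/vpc-id"
  type  = "String"
  value = aws_vpc.main.id
}

# A later layer reads the value back (data.tf)
data "aws_ssm_parameter" "vpc_id" {
  name = "/roboshop/dev/vpc-id"
}
```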


📁 Folder Structure & Deployment Order

Dependency Graph

00-VPC
  ↓
10-SG  →  20-SG-Rules
  ↓
30-Bastion
  ↓
40-Databases
  ↓
50-backend-alb
  ↓
60-catalogue  ←┐
  ↓            │
70-acm         │
  ↓            │
80-frontend-alb
  ↓
90-components  ←┘ (uses 60-catalogue module pattern)
  ↓
95cdn
  ↓
99-vpn

Layer Details

00-VPC — Network Foundation

├── main.tf          # VPC, subnets, IGW, NAT Gateway
├── variables.tf     # CIDR blocks, AZ configuration
├── outputs.tf       # VPC ID, subnet IDs → SSM parameters
├── parameters.tf    # Writes outputs to SSM
├── provider.tf      # AWS provider config
└── locals.tf        # Local variables, tags

Creates:

  • VPC (10.0.0.0/16)
  • 2x Public subnets for ALB/Bastion
  • 2x Private subnets for applications
  • 1x NAT Gateway for outbound internet
  • Route tables with proper routing
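A minimal sketch of the core network resources; CIDRs come from this README, but resource and variable names are assumptions, not necessarily the repo's actual code:

```hcl
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
}

resource "aws_subnet" "public" {
  count                   = 2
  vpc_id                  = aws_vpc.main.id
  cidr_block              = var.public_subnet_cidrs[count.index]
  availability_zone       = var.availability_zones[count.index]
  map_public_ip_on_launch = true
}

# Single NAT Gateway in the first public subnet for private-subnet egress
resource "aws_nat_gateway" "main" {
  allocation_id = aws_eip.nat.id
  subnet_id     = aws_subnet.public[0].id
}
```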

10-SG — Security Group Shells

├── main.tf          # 10 empty SGs (no rules yet)
├── variables.tf
├── parameters.tf    # Writes SG IDs to SSM
├── provider.tf
├── locals.tf        # SG names: frontend_alb, backend_alb, etc.
└── data.tf          # Reads VPC from SSM

Creates Security Groups for:

  • frontend_alb_sg — Public ALB
  • backend_alb_sg — Internal ALB
  • frontend_sg — Frontend instances
  • catalogue_sg — Catalogue service
  • user_sg — User service
  • cart_sg — Cart service
  • shipping_sg — Shipping service
  • payment_sg — Payment service
  • bastion_sg — Bastion jump host
  • database_sg — Shared database security group
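Creating ten empty shells is a natural fit for `for_each` — a sketch under assumed names, not necessarily the repo's exact code:

```hcl
locals {
  sg_names = [
    "frontend_alb", "backend_alb", "frontend", "catalogue", "user",
    "cart", "shipping", "payment", "bastion", "database",
  ]
}

# One empty security group per component; rules are attached later in 20-SG-Rules
resource "aws_security_group" "this" {
  for_each = toset(local.sg_names)
  name     = "${each.key}_sg"
  vpc_id   = data.aws_ssm_parameter.vpc_id.value
}
```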

20-SG-Rules — Security Group Wiring

├── main.tf          # All ingress/egress rules
├── variables.tf
├── provider.tf
├── locals.tf
└── data.tf          # Reads SG IDs from SSM

Establishes allowed traffic:

Internet (0.0.0.0/0) → frontend_alb_sg (443)
frontend_alb_sg → frontend_sg (80)
frontend_sg → backend_alb_sg (80)
backend_alb_sg → [catalogue|user|cart|shipping|payment]_sg (8080)
catalogue_sg → database_sg (27017 MongoDB)
user_sg → database_sg (27017 MongoDB + 6379 Redis)
cart_sg → database_sg (6379 Redis)
shipping_sg → database_sg (3306 MySQL)
payment_sg → database_sg (5672 RabbitMQ)
bastion_sg → all_sg (22 SSH)
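Each arrow above maps to one rule that references a source security group rather than a CIDR — e.g. catalogue → MongoDB (a sketch; the SG ID locals are illustrative):

```hcl
resource "aws_security_group_rule" "catalogue_to_mongodb" {
  type                     = "ingress"
  from_port                = 27017
  to_port                  = 27017
  protocol                 = "tcp"
  security_group_id        = local.database_sg_id   # destination SG
  source_security_group_id = local.catalogue_sg_id  # allowed source — no hardcoded IPs
}
```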

30-Bastion — SSH Jump Host

├── main.tf          # EC2 instance, IAM role, SSM agent
├── bastion.sh       # Bootstrap script (updates, SSM agent)
├── variables.tf
├── parameters.tf
├── provider.tf
├── locals.tf
└── data.tf          # Reads VPC, subnet, SG from SSM

Launches:

  • t3.micro EC2 in public subnet
  • Elastic IP for stable SSH access
  • IAM role with SSM Systems Manager permissions
  • Route 53 DNS record bastion-dev.opsora.space
  • Available immediately for SSH jump access

40-Databases — Stateful Data Layer

├── main.tf           # 4 EC2 instances for databases
├── bootstrap.sh      # Installs Docker + services
├── variables.tf
├── parameters.tf
├── provider.tf
├── locals.tf
├── data.tf           # Reads VPC, subnets, SGs from SSM
├── iam.tf            # IAM role + policies
├── r53.tf            # Route 53 DNS records
├── outputs.tf        # Database host IPs → SSM
├── mysql-iam-policy.json
├── rabbitmq-iam-user-policy.json
└── rabbitmq-iam-user-password.json

Launches 4 EC2 Instances in Private Subnets:

| Service | Port | Instance Type | Storage | Notes |
|---------|------|---------------|---------|-------|
| MongoDB | 27017 | t3.small | 20 GB EBS | Used by Catalogue, User |
| Redis | 6379 | t3.micro | 10 GB EBS | Used by User, Cart |
| MySQL | 3306 | t3.small | 30 GB EBS | Used by Shipping |
| RabbitMQ | 5672 | t3.small | 20 GB EBS | Used by Payment |

Key Features:

  • Ansible configures each via user data
  • Route 53 DNS names (e.g., mongodb-dev.opsora.space)
  • SSM Parameter Store exports of host IPs
  • IAM roles for secure credential retrieval
  • EBS volumes persistent across reboots
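The DNS + SSM export pattern for one database host looks roughly like this (record name from this README; resource names and parameter path are assumptions):

```hcl
# Internal DNS name so services never hardcode the IP
resource "aws_route53_record" "mongodb" {
  zone_id = var.zone_id
  name    = "mongodb-dev.opsora.space"
  type    = "A"
  ttl     = 60
  records = [aws_instance.mongodb.private_ip]
}

# Export the host for later layers to read
resource "aws_ssm_parameter" "mongodb_host" {
  name  = "/roboshop/dev/mongodb-host"
  type  = "String"
  value = aws_instance.mongodb.private_ip
}
```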

50-backend-alb — Internal Load Balancer

├── main.tf          # ALB, target groups, listener
├── variables.tf
├── parameters.tf
├── provider.tf
├── r53.tf           # DNS alias
├── locals.tf
└── data.tf          # Reads VPC, subnets, SGs from SSM

Creates:

  • Internal ALB in private subnets
  • Listener on HTTP :80 (no HTTPS, internal only)
  • 5 Target Groups:
    • catalogue-tg (port 8080)
    • user-tg (port 8080)
    • cart-tg (port 8080)
    • shipping-tg (port 8080)
    • payment-tg (port 8080)
  • Route 53 alias record backend-alb-dev.opsora.space (internal resolution)
  • Health checks every 5s
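One target group with its 5-second health check might look like this (a sketch; the health-check path is an assumption):

```hcl
resource "aws_lb_target_group" "catalogue" {
  name     = "catalogue-tg"
  port     = 8080
  protocol = "HTTP"
  vpc_id   = data.aws_ssm_parameter.vpc_id.value

  health_check {
    path                = "/health"  # assumed endpoint
    interval            = 5
    timeout             = 2
    healthy_threshold   = 2
    unhealthy_threshold = 2
  }
}
```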

60-catalogue — First Microservice (Template)

├── main.tf          # EC2, AMI baking, ASG
├── bootstrap.sh     # Ansible installation + pull
├── variables.tf
├── provider.tf
└── data.tf          # Reads from SSM

Deployment Pipeline:

  1. Spin up temporary EC2 in private subnet
  2. SSH in via Bastion or Systems Manager Session Manager
  3. Run bootstrap.sh — installs Ansible
  4. Ansible runs ansible-pull from GitHub (roboshop-ansible repo)
  5. Ansible applies role for Catalogue service
  6. Instance is stopped
  7. Golden AMI created from stopped instance
  8. Launch Template created pointing to AMI
  9. Auto Scaling Group (Min: 1, Max: 10, Desired: 2)
  10. ASG registers with Backend ALB Target Group
  11. Temporary instance terminated
  12. Service is now live and auto-scaling

70-acm — SSL Certificate

├── main.tf          # ACM certificate request
├── variables.tf
├── parameters.tf
├── provider.tf
└── locals.tf

Provisions:

  • Wildcard ACM certificate for *.opsora.space
  • Domain validation via Route 53 CNAME
  • Auto-renews before expiry
  • Exports certificate ARN to SSM Parameter Store
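DNS-validated ACM certificates in Terraform follow a well-known pattern — sketched here with illustrative resource names:

```hcl
resource "aws_acm_certificate" "wildcard" {
  domain_name       = "*.opsora.space"
  validation_method = "DNS"
}

# Create the validation CNAME(s) in Route 53
resource "aws_route53_record" "validation" {
  for_each = {
    for dvo in aws_acm_certificate.wildcard.domain_validation_options :
    dvo.domain_name => dvo
  }
  zone_id = var.zone_id
  name    = each.value.resource_record_name
  type    = each.value.resource_record_type
  ttl     = 60
  records = [each.value.resource_record_value]
}

# Block until ACM confirms validation
resource "aws_acm_certificate_validation" "wildcard" {
  certificate_arn         = aws_acm_certificate.wildcard.arn
  validation_record_fqdns = [for r in aws_route53_record.validation : r.fqdn]
}
```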

80-frontend-alb — Public HTTPS Load Balancer

├── main.tf          # Public ALB, HTTPS listener, redirects
├── variables.tf
├── parameters.tf
├── provider.tf
├── r53.tf           # Public DNS alias
├── locals.tf
└── data.tf          # Reads VPC, subnets, SGs, ACM cert from SSM

Creates:

  • Public ALB in public subnets
  • HTTPS Listener on port 443 (SSL termination with ACM cert)
  • HTTP Listener on port 80 → redirects to 443
  • Default Target Group for Frontend (port 80)
  • Route 53 public alias dev.opsora.space → ALB DNS
  • Security group allows 0.0.0.0/0 on ports 80/443
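The HTTP→HTTPS redirect is a listener default action — roughly (a sketch; the ALB resource name is assumed):

```hcl
resource "aws_lb_listener" "http_redirect" {
  load_balancer_arn = aws_lb.frontend.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type = "redirect"
    redirect {
      port        = "443"
      protocol    = "HTTPS"
      status_code = "HTTP_301"  # permanent redirect
    }
  }
}
```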

90-components — Reusable Module for All Services

├── main.tf          # Module instantiation for all 5 services
├── variables.tf     # Component list, ASG config
├── provider.tf
└── bootstrap.sh     # Generic bootstrap (component-agnostic)

Deploys 5 Microservices:

| Component | Language | Port | Module Calls |
|-----------|----------|------|--------------|
| Frontend | React + Node | 80 | 1x module call |
| User | Node.js | 8080 | 1x module call |
| Cart | Node.js | 8080 | 1x module call |
| Shipping | Java (Spring Boot) | 8080 | 1x module call |
| Payment | Python (Flask) | 8080 | 1x module call |

Module Workflow (identical to 60-catalogue):

  1. Create temporary EC2 in private subnet
  2. Pass component variable to bootstrap.sh
  3. Ansible reads component name, pulls role from GitHub
  4. Configures service (runtime, app download, systemd)
  5. Stop instance → bake AMI
  6. Create Launch Template from AMI
  7. Create ASG (1-10 instances, 70% CPU scaling)
  8. Register ASG with appropriate Backend ALB Target Group
  9. Setup Path-based routing (e.g., /user/* → user TG)
  10. Terminate temporary instance

Reusability: Every component uses the same module code, just with different component variable passed.
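In Terraform terms that typically means one module block fanned out with `for_each` — a sketch with an assumed module path and variable shape:

```hcl
module "component" {
  source   = "./modules/component"  # assumed path
  for_each = var.components         # e.g. { user = {...}, cart = {...}, ... }

  component     = each.key                  # passed through to bootstrap.sh
  rule_priority = each.value.rule_priority  # for path-based ALB routing
}
```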


95cdn — CloudFront Distribution

├── main.tf          # CloudFront, S3 bucket, cache behaviors
├── variables.tf
├── provider.tf
├── locals.tf
└── data.tf          # Reads VPC, subnets from SSM

Provides:

  • S3 bucket for static assets (CSS, JS, images)
  • CloudFront distribution with origin access control
  • Cache behaviors:
    • Images (.jpg, .png, .gif) → 30 days TTL
    • Stylesheets (.css) → 7 days TTL
    • JavaScript (.js) → 7 days TTL
    • HTML → 1 hour TTL
  • Route 53 alias cdn-dev.opsora.space
  • Compression enabled (gzip, brotli)
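Each TTL above corresponds to an `ordered_cache_behavior` block inside the `aws_cloudfront_distribution` resource — e.g. for stylesheets (a fragment; TTL values from this README, origin ID assumed):

```hcl
ordered_cache_behavior {
  path_pattern           = "*.css"
  target_origin_id       = "s3-static"  # assumed origin ID
  viewer_protocol_policy = "redirect-to-https"
  allowed_methods        = ["GET", "HEAD"]
  cached_methods         = ["GET", "HEAD"]
  compress               = true
  min_ttl                = 0
  default_ttl            = 604800  # 7 days
  max_ttl                = 604800

  forwarded_values {
    query_string = false  # cache by URL only
    cookies { forward = "none" }
  }
}
```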

99-vpn — OpenVPN Server

├── main.tf          # OpenVPN EC2, security group, route
├── openvpn.sh       # OpenVPN installation + config generation
├── variables.tf
├── provider.tf
├── locals.tf
└── data.tf          # Reads VPC, subnet, SG from SSM

Sets Up:

  • t3.small EC2 instance in public subnet
  • OpenVPN server (UDP :1194)
  • Client certificate generation
  • Routes all VPN traffic through EC2
  • Client config downloadable from SSM Parameter Store
  • Allows secure access to private subnets (RDP, SSH to databases)

🔬 Component Deep Dive

Microservice Deployment Pattern

Every component (catalogue, user, cart, shipping, payment, frontend) follows this immutable infrastructure pattern:

Trigger: terraform apply
    │
    ▼
Module creates temporary EC2
    │
    ▼
UserData runs bootstrap.sh
    │
    ├─→ aws ssm get-parameter → fetch ENV vars from SSM
    ├─→ apt update && apt install ansible git
    ├─→ mkdir -p /opt/ansible
    ├─→ git clone roboshop-ansible-repo
    │
    ▼
ansible-pull -i inventory ${COMPONENT}
    │
    ├─→ Install runtime (Node.js / Java / Python)
    ├─→ Download app code
    ├─→ Install dependencies
    ├─→ Create systemd service
    ├─→ Start service on :8080
    │
    ▼
Instance is STOPPED
    │
    ▼
Golden AMI created
    │
    ├─→ New Launch Template points to AMI
    ├─→ ASG Updated to use new template
    │
    ▼
ASG Triggers Instance Refresh
    │
    ├─→ Spins up new instances from AMI
    ├─→ Old instances slowly terminated (50% min healthy)
    ├─→ Register with Backend ALB
    ├─→ Health checks verify :8080 responding
    │
    ▼
Zero Downtime Deployment Complete
    │
    ▼
Temporary instance TERMINATED

Why This Approach?

| Benefit | Why It Matters |
|---------|----------------|
| Immutable | Every instance is created from an identical AMI |
| Reproducible | Terraform + Ansible = deterministic builds |
| Fast scaling | No bootstrap on scale-out (AMI ready to go) |
| Rollback | Previous AMI still exists; revert the Launch Template |
| Observable | CloudWatch logs from Ansible runs |
| Testable | AMI can be validated before rollout |

🔒 Security Model

Security Group Chaining

No hardcoded IPs. Everything flows through named Security Groups:

┌─────────────────────────────────────────────────────────────┐
│                        INTERNET                             │
│                      (0.0.0.0/0)                            │
└────────────────┬──────────────────────────────┬─────────────┘
                 │                              │
         (HTTPS :443)                    (SSH :22, Bastion only)
                 │                              │
        ┌────────▼────────┐            ┌────────▼─────────┐
        │ frontend_alb_sg │            │  bastion_sg      │
        │ (public)        │            │  (public)        │
        └────────┬────────┘            └──────┬───────────┘
                 │                            │
         (HTTP :80)                    (SSH :22)
                 │                            │
        ┌────────▼────────┐                   │
        │  frontend_sg    │                   │
        │  (private)      │◀──────────────────┘
        └────────┬────────┘
                 │
         (HTTP :80)
                 │
        ┌────────▼────────┐
        │ backend_alb_sg  │
        │ (private)       │
        └────────┬────────┘
                 │
         (HTTP :8080)
                 │
     ┌─────────────┬──────────┬──────────┬─────────────┬────────────┐
     ▼             ▼          ▼          ▼             ▼            │
[catalogue_sg] [user_sg]  [cart_sg] [shipping_sg] [payment_sg]     │
     │             │          │          │             │           │
     └─────────────┴─────┬────┴──────────┴─────────────┘           │
                         ▼                                         ▼
                  [database_sg]  ◀──────────────────────────── (all SGs,
            (27017 / 6379 / 3306 / 5672)                      SSH :22 via
                                                               bastion_sg)

IAM Roles & Policies

EC2 Bastion Role:

  • ssm:GetParameter — Fetch connection strings from SSM
  • ec2messages:* — SSM agent message channel
  • ssmmessages:* — Session Manager websocket channel
  • s3:GetObject — Pull Ansible roles from S3 (if needed)

Microservice Instance Role:

  • ssm:GetParameter — Fetch app secrets, DB connection strings
  • s3:GetObject — Download app code from S3
  • cloudwatch:PutMetricData — Custom metrics
  • logs:* — CloudWatch Logs agent

Database EC2 Role:

  • ssm:GetParameter → Store/retrieve credentials
  • ec2:* — Basic EC2 permissions

Network Segregation

| Tier | Subnet Type | Internet Access | Access From |
|------|-------------|-----------------|-------------|
| Frontend ALB | Public | ✅ Via IGW | Internet (0.0.0.0/0 on 443) |
| Bastion | Public | ✅ Via IGW | Your IP only (SSH :22) |
| Frontend ASG | Private | Via NAT (outbound only) | Frontend ALB only |
| Backend ALB | Private | Via NAT (outbound only) | Frontend ASG only |
| Microservices | Private | Via NAT (outbound only) | Backend ALB only |
| Databases | Private | Via NAT (outbound only) | Microservices + Bastion |

📦 Auto Scaling & AMI Baking

Auto Scaling Policy

Target Tracking (TargetTrackingScaling) on CPU Utilization:

If Average CPU > 70% (for 2 min)
  └─→ Scale OUT: Add instances (Max: 10)

If Average CPU < 30% (for 5 min)
  └─→ Scale IN: Remove instances (Min: 1)

Warmup Period: 120 seconds (new instance not counted until ready)
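As a target-tracking policy in Terraform, this is roughly (a sketch; resource names are assumed):

```hcl
resource "aws_autoscaling_policy" "cpu_target" {
  name                      = "cpu-target-tracking"
  autoscaling_group_name    = aws_autoscaling_group.this.name
  policy_type               = "TargetTrackingScaling"
  estimated_instance_warmup = 120  # new instances excluded from metrics for 120s

  target_tracking_configuration {
    target_value = 70  # keep average CPU around 70%
    predefined_metric_specification {
      predefined_metric_type = "ASGAverageCPUUtilization"
    }
  }
}
```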

Scaling Lifecycle

1. New instance launches from Golden AMI
   ↓
2. Warmup period (120s) — not counted in metrics
   ↓
3. Health check passes (HTTP GET /health on :8080)
   ↓
4. Instance added to ALB Target Group
   ↓
5. Counted toward Auto Scaling metrics
   ↓
6. If CPU still high → more instances spin up

AMI Refresh (Rolling Update)

When Terraform detects a configuration change:

Current State:         New State:
┌────────┐           ┌──────────────┐
│ Old AMI│           │New Code/AMI  │
│ v1.2   │           │v1.3          │
└────────┘           └──────────────┘
    │                     │
    ├─ Instance 1         ├─ Instance 1' (new)
    ├─ Instance 2    →    ├─ Instance 2' (new)
    └─ Instance 3         └─ Instance 3' (new)

Timeline:
t=0   : Instance Refresh triggered
t=30s : Instance 1 stopped, Instance 1' starts (50% capacity)
t=60s : Instance 1' healthy, Instance 2 stopped
t=90s : Instance 2' healthy, Instance 3 stopped
t=120s: All instances running on new AMI
Result: Zero downtime deployment

🌐 DNS, SSL & CDN

DNS Architecture

Hosted Zone: opsora.space (Route 53)

| Record | Type | Points To | Purpose |
|--------|------|-----------|---------|
| dev.opsora.space | A (Alias) | Frontend ALB DNS | Public entrypoint (user-facing) |
| backend-alb-dev.opsora.space | A (Alias) | Backend ALB DNS | Internal ALB (private) |
| cdn-dev.opsora.space | CNAME | CloudFront distribution | Static assets CDN |
| bastion-dev.opsora.space | A | Bastion Elastic IP | SSH jump host |
| mongodb-dev.opsora.space | A | MongoDB private IP | Database access (internal) |
| redis-dev.opsora.space | A | Redis private IP | Cache access (internal) |
| mysql-dev.opsora.space | A | MySQL private IP | Database access (internal) |
| rabbitmq-dev.opsora.space | A | RabbitMQ private IP | Queue access (internal) |

SSL/TLS Encryption

Frontend ALB (Public):

  • Listener on HTTPS :443 with ACM wildcard cert *.opsora.space
  • HTTP :80 automatically redirects to HTTPS (301 Moved Permanently)
  • HTTP/2 enabled
  • TLS 1.2+ only

Backend Tier (Private):

  • No HTTPS (internal-only; traffic never leaves the VPC)
  • Isolation enforced by private subnets and security groups (not transport encryption)

Database Tier (Private):

  • No encryption in transit (could add mTLS with HAProxy if needed)
  • Data at rest: EBS encryption enabled (AES-256)

CloudFront CDN

User Request
    │
    ▼
CloudFront Edge Location (global)
    │
    ├─→ Cache hit? Return cached content (30 days)
    │
    └─→ Cache miss? Fetch from S3 origin
            │
            ▼
        S3 Bucket (private, origin access control)
            │
            ▼
        Return to CloudFront → Cache → Return to User

Behaviors:
  *.jpg, *.png, *.gif  → 30 days TTL
  *.css               → 7 days TTL
  *.js                → 7 days TTL
  *.html              → 1 hour TTL
  / (index)           → 1 hour TTL
  Compression: gzip + brotli enabled
  Query String: Ignored (cache by URL only)

🔐 VPN Access

OpenVPN Server (99-vpn)

Purpose: Secure remote access to private database instances + internal resources

Setup:

  1. OpenVPN EC2 instance in public subnet (t3.small)
  2. UDP listener on :1194
  3. Generates client certificate & key
  4. Stores client .ovpn config in SSM Parameter Store
  5. Route 53 A record → VPN instance public IP

Client Access:

# On your local machine
aws ssm get-parameter --name /roboshop/dev/vpn/client-config \
  --query 'Parameter.Value' --output text > client.ovpn

# Launch the VPN client (root is needed to create the tun interface)
sudo openvpn --config client.ovpn

# Now SSH directly to a private database instance
ssh -i my-key.pem ec2-user@mongodb-dev.opsora.space

# Or RDP to Windows instances (if added later)

Routing:

  • VPN tunnel routes all traffic destined for VPC CIDR (10.0.0.0/16)
  • Client can reach private EC2 instances directly
  • Bastion SSH no longer needed for DB access (VPN provides direct access)

📝 Variables Reference

Global Variables (in terraform.tfvars or environment)

| Variable | Type | Default | Description |
|----------|------|---------|-------------|
| environment | string | dev | Deployment environment (dev/prod) |
| project | string | roboshop | Project name, used in all resource names |
| owner | string | Vihari | Owner tag applied to all resources |
| domain_name | string | opsora.space | Base Route 53 domain |
| zone_id | string | (required) | Route 53 hosted zone ID |
| app_version | string | v3 | Application version to deploy |
| my_ip | string | (required) | Your public IP (/32 CIDR for Bastion SSH) |

VPC Variables (00-VPC)

| Variable | Type | Default | Description |
|----------|------|---------|-------------|
| vpc_cidr | string | 10.0.0.0/16 | VPC CIDR block |
| public_subnet_cidrs | list(string) | ["10.0.1.0/24", "10.0.2.0/24"] | Public subnets |
| private_subnet_cidrs | list(string) | ["10.0.10.0/24", "10.0.11.0/24"] | Private subnets |
| availability_zones | list(string) | ["us-east-1a", "us-east-1b"] | AZs for subnets |
| enable_nat_gateway | bool | true | Create NAT Gateway for private subnet egress |
| enable_dns_hostnames | bool | true | Enable DNS hostnames in VPC |

Component Variables (90-components)

| Variable | Type | Default | Description |
|----------|------|---------|-------------|
| components | map(object) | See below | Component config (name, rule priority, instance count) |
| instance_type | string | t3.small | Instance type for component ASG |
| asg_min_size | number | 1 | ASG minimum instances |
| asg_max_size | number | 10 | ASG maximum instances |
| asg_desired_size | number | 2 | ASG desired instances |
| scaling_target_cpu | number | 70 | CPU % to trigger scale-out |

🐛 Troubleshooting

Deployment Fails at Layer N

Symptom: terraform apply fails with "resource not found"

Cause: Previous layer didn't write outputs to SSM Parameter Store

Fix:

# Check if parameters exist
aws ssm get-parameters-by-path \
  --path "/roboshop/dev" \
  --recursive \
  --query 'Parameters[].Name'

# Re-run previous layer
cd ../[N-1]-layer
terraform apply -auto-approve

# Verify parameters written
aws ssm get-parameter --name "/roboshop/dev/vpc-id"

# Then retry current layer
cd ../N-layer
terraform apply -auto-approve

ASG Instances Stuck in "Pending" State

Symptom: Instances launch but never become healthy

Cause: Security group rules blocking ALB health checks

Fix:

# Check target group health
aws elbv2 describe-target-health \
  --target-group-arn arn:aws:elasticloadbalancing:... \
  --query 'TargetHealthDescriptions[*].[Target.Id, TargetHealth.State, TargetHealth.Reason]'

# SSH to the unhealthy instance via the Bastion,
# then check whether the service is listening on :8080
sudo systemctl status service-name
sudo ss -tlnp | grep 8080

# Check application logs
sudo tail -n 100 -f /var/log/service-name/app.log

Bastion SSH Fails

Symptom: Permission denied (publickey)

Cause:

  1. Wrong security group rule (check port 22 open from your IP)
  2. Key pair doesn't exist in EC2

Fix:

# Verify security group rules
aws ec2 describe-security-groups \
  --group-ids sg-xxxxx \
  --query 'SecurityGroups[0].IpPermissions[?FromPort==`22`]'

# SSH with verbose output
ssh -vvv -i my-key.pem ec2-user@bastion-dev.opsora.space

Database EC2 Can't Connect to Internet

Symptom: curl google.com hangs/times out

Cause: NAT Gateway not properly routing outbound traffic

Fix:

# Check route table
aws ec2 describe-route-tables \
  --filters "Name=association.subnet-id,Values=subnet-xxxxx" \
  --query 'RouteTables[0].Routes'

# Should have 0.0.0.0/0 → NAT Gateway

💣 Teardown

Destroy in REVERSE Order

#!/bin/bash
# destroy.sh - Destroy all layers in reverse order

LAYERS=(
  "99-vpn"
  "95cdn"
  "90-components"
  "80-frontend-alb"
  "70-acm"
  "60-catalogue"
  "50-backend-alb"
  "40-Databases"
  "30-Bastion"
  "20-SG-Rules"
  "10-SG"
  "00-VPC"
)

read -p "⚠️  This will DELETE all resources. Type 'yes' to confirm: " confirm
if [ "$confirm" != "yes" ]; then
  echo "Aborted."
  exit 1
fi

for layer in "${LAYERS[@]}"; do
  echo "▶ Destroying $layer..."
  cd "$layer" || exit 1
  if ! terraform destroy -auto-approve; then
    echo "❌ $layer destroy failed!"
    exit 1
  fi
  cd ..
done

echo "✅ All destroyed!"

👤 Author

Vihari@viho-kernel


Built with ❤️ using Terraform + Ansible on AWS Last updated: March 24, 2026
