PHASE 0.1 - COMPILED EXTERNAL DOCUMENTATION

Compilation date: 2025-11-02 Executor: Claude Code (Constituicao Vertice v3.0) Objective: definitive technical context for the TypeCraft remediation


1. PostgreSQL + GORM

Connection String Format (DSN)

Standard format:

"host=localhost user=gorm password=gorm dbname=gorm port=9920 sslmode=disable TimeZone=Asia/Shanghai"

Alternative URL format:

"postgres://USER:PASSWORD@HOST:PORT/DBNAME?sslmode=disable"
  • Default port: 5432
  • TimeZone should be specified to avoid inconsistencies

Basic implementation:

import (
    "gorm.io/driver/postgres"
    "gorm.io/gorm"
)

dsn := "host=localhost user=postgres password=secret dbname=mydb port=5432 sslmode=require TimeZone=UTC"
db, err := gorm.Open(postgres.Open(dsn), &gorm.Config{})

Connection Pooling

Required configuration for production:

sqlDB, err := db.DB()
if err != nil {
    log.Fatal("Failed to get database instance:", err)
}

// Small-to-medium applications
sqlDB.SetMaxIdleConns(10)
sqlDB.SetMaxOpenConns(25)
sqlDB.SetConnMaxLifetime(5 * time.Minute)

// High-traffic applications
sqlDB.SetMaxIdleConns(25)
sqlDB.SetMaxOpenConns(100)
sqlDB.SetConnMaxLifetime(5 * time.Minute)

Recommended values and rationale:

  1. SetMaxIdleConns: 10-25

    • Rationale: keeps connections ready for immediate use without consuming excessive resources
    • Small/medium production: 10
    • High-load production: 25
    • RULE: MaxIdleConns <= MaxOpenConns, always
  2. SetMaxOpenConns: 25-100

    • Rationale: limits simultaneous connections to PostgreSQL (avoids exhausting the DB's connection limit)
    • PostgreSQL default max_connections: 100
    • Small/medium production: 25
    • High-load production: 100
    • IMPORTANT: account for the number of application replicas (total = MaxOpenConns * num_replicas)
  3. SetConnMaxLifetime: 5 minutes (300s)

    • Rationale: forces reconnection to avoid stale connections and to rebalance load across proxies/load balancers
    • PostgreSQL default idle_in_transaction_session_timeout: 0 (infinite)
    • 5 minutes is safe for most cases

Driver optimization:

  • GORM uses pgx as its default driver
  • Prepared statement cache is enabled by default (optimization)
  • Disable it if necessary: PreferSimpleProtocol: true
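A minimal sketch of disabling the prepared statement cache at the driver level (useful, for example, behind PgBouncer in transaction-pooling mode), assuming the dsn variable from the earlier example:

```go
import (
    "gorm.io/driver/postgres"
    "gorm.io/gorm"
)

// postgres.New lets us pass driver-level options that the plain
// postgres.Open(dsn) call does not expose.
db, err := gorm.Open(postgres.New(postgres.Config{
    DSN:                  dsn,
    PreferSimpleProtocol: true, // disables implicit prepared statement usage
}), &gorm.Config{})
```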

SSL Modes

Available options (from least to most secure):

  1. disable

    • When to use: local development ONLY
    • Security: ZERO (no encryption)
    • Production: NEVER use
  2. require

    • When to use: internal connections (private VPC) where the network is trusted
    • Security: encrypts the connection but does NOT validate the certificate
    • Vulnerable to: man-in-the-middle attacks
    • Production: avoid if possible
  3. verify-ca

    • When to use: private/self-signed CA controlled internally
    • Security: validates the certificate chain up to the root CA
    • Does NOT verify: the server hostname
    • Production: acceptable for private CAs
  4. verify-full (RECOMMENDED FOR PRODUCTION)

    • When to use: ALWAYS in production with a public CA
    • Security: validates the certificate AND the hostname
    • Verifies: Subject Alternative Name (SAN) or Common Name
    • Protects against: malicious servers registered under the same CA
    • Production: MANDATORY for Cloud SQL, RDS, Azure Database

Production example (Cloud SQL):

dsn := "host=10.1.2.3 user=app password=secret dbname=typecraft port=5432 sslmode=verify-full sslrootcert=/etc/ssl/certs/server-ca.pem TimeZone=UTC"

Connection timeout:

// Disable the timeout (wait indefinitely) - default
"connect_timeout=0"

// Specific timeout (seconds)
"connect_timeout=10"
  • The timeout applies to EACH host individually
  • 2 hosts + timeout=5s = up to 10s of total wait

Error Handling

GORM offers two APIs with different approaches:

1. Generics API (Go 1.18+)

user, err := gorm.G[User](db).Where("name = ?", "jinzhu").First(ctx)
if err != nil {
    if errors.Is(err, gorm.ErrRecordNotFound) {
        // Handle not found
        return nil, fmt.Errorf("user not found")
    }
    return nil, fmt.Errorf("database error: %w", err)
}

2. Traditional API (chainable)

var user User
result := db.Where("name = ?", "jinzhu").First(&user)

if result.Error != nil {
    if errors.Is(result.Error, gorm.ErrRecordNotFound) {
        // Handle not found
    } else {
        // Handle other errors
    }
}

ErrRecordNotFound - when it occurs:

  • First(), Last(), Take() with no results
  • Does NOT occur with Find() (which returns an empty slice)
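The difference can be sketched as follows, assuming a User model and an open db handle:

```go
// First returns ErrRecordNotFound when nothing matches.
var user User
err := db.Where("name = ?", "nobody").First(&user).Error
if errors.Is(err, gorm.ErrRecordNotFound) {
    // handle missing record
}

// Find never returns ErrRecordNotFound; check the slice instead.
var users []User
if err := db.Where("name = ?", "nobody").Find(&users).Error; err != nil {
    // real database error
}
if len(users) == 0 {
    // no matches - not an error
}
```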

Correct detection:

// CORRECT
if errors.Is(err, gorm.ErrRecordNotFound) { }

// INCORRECT (breaks if the error is wrapped)
if err == gorm.ErrRecordNotFound { }

Database-specific errors:

import (
    "github.com/go-sql-driver/mysql"
)

if mysqlErr, ok := err.(*mysql.MySQLError); ok {
    if mysqlErr.Number == 1062 {
        // Duplicate entry error
    }
}
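Since this document targets PostgreSQL, the equivalent check uses pgconn rather than the MySQL driver; a sketch, assuming the pgx-based driver GORM uses by default (the import path depends on your driver version, pgx/v4 vs pgx/v5):

```go
import (
    "errors"

    "github.com/jackc/pgx/v5/pgconn"
)

// Inspect the PostgreSQL SQLSTATE code behind a GORM error.
var pgErr *pgconn.PgError
if errors.As(err, &pgErr) {
    switch pgErr.Code {
    case "23505": // unique_violation
        // duplicate key
    case "23503": // foreign_key_violation
        // FK constraint failed
    }
}
```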

Dialect-translated errors (recommended for cross-database code):

db, err := gorm.Open(postgres.Open(dsn), &gorm.Config{
    TranslateError: true, // ENABLE THIS
})

// Now you can use:
if errors.Is(err, gorm.ErrDuplicatedKey) {
    // Unique constraint violation (funciona em MySQL, Postgres, SQLite)
}

if errors.Is(err, gorm.ErrForeignKeyViolated) {
    // Foreign key constraint violation
}

Production best practices:

  1. Always check the Error field after finisher methods
  2. Use errors.Is() when comparing sentinel errors
  3. Enable TranslateError: true for portability
  4. Never silently ignore errors

Migration Patterns

AutoMigrate

What it does:

db.AutoMigrate(&User{})
db.AutoMigrate(&User{}, &Product{}, &Order{}) // Multiple tables
  • Creates: tables, foreign keys, constraints, columns, indexes
  • Alters: column types (if size/precision changes), nullability (NOT NULL → NULL)
  • Does NOT delete: unused columns (data protection)

Pros:

  • Fast for development
  • Zero configuration
  • Safe (does not delete data)
  • Works in most cases

Cons:

  • No versioning
  • No rollback
  • Does not delete old columns
  • Does not support complex transformations
  • Does not generate an audit trail

When to use:

  • Local development
  • Quick prototypes
  • Small projects with no audit requirements
  • Automated tests (setup/teardown)

Manual Migrations

When to use:

  • Production with audit requirements
  • Complex schemas with data transformations
  • Need for rollback
  • Teams that review migrations before applying them
  • "At some point you may need to switch to a versioned migrations strategy" (GORM docs)

Option 1: Migrator interface (GORM built-in)

db.Migrator().CreateTable(&User{})
db.Migrator().AddColumn(&User{}, "Age")
db.Migrator().DropColumn(&User{}, "TempField")
db.Migrator().CreateIndex(&User{}, "idx_name")

Advantage: database-independent (works on SQLite, MySQL, Postgres)
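A guarded-change sketch built on the same Migrator interface, so a migration can be re-run safely (the "Age" column is illustrative):

```go
// Only add the column when it does not exist yet.
if !db.Migrator().HasColumn(&User{}, "Age") {
    if err := db.Migrator().AddColumn(&User{}, "Age"); err != nil {
        log.Fatalf("add column: %v", err)
    }
}
```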

Option 2: gormigrate (versioned migrations)

import "github.com/go-gormigrate/gormigrate/v2"

m := gormigrate.New(db, gormigrate.DefaultOptions, []*gormigrate.Migration{
    {
        ID: "202501011200",
        Migrate: func(tx *gorm.DB) error {
            return tx.AutoMigrate(&User{})
        },
        Rollback: func(tx *gorm.DB) error {
            return tx.Migrator().DropTable("users")
        },
    },
})

if err := m.Migrate(); err != nil {
    log.Fatalf("Could not migrate: %v", err)
}

Advantage: versioning + rollback support

Option 3: Atlas (recommended by GORM)

atlas migrate diff --env gorm

Advantage: automated migration planning + separation of concerns

Production strategy:

  1. Development: AutoMigrate for fast iteration
  2. Pre-production: generate manual/versioned migrations
  3. Production: apply versioned migrations with tests + backups

Foreign key error handling:

// ORDER MATTERS
db.AutoMigrate(&User{})      // Referenced table first
db.AutoMigrate(&Order{})     // Table with the FK afterwards

Using transactions in migrations:

db.Transaction(func(tx *gorm.DB) error {
    if err := tx.AutoMigrate(&User{}); err != nil {
        return err // Automatic rollback
    }
    if err := tx.AutoMigrate(&Order{}); err != nil {
        return err
    }
    return nil // Commit
})

Best practices:

  1. ALWAYS back up before running migrations in production
  2. Test migrations in staging first
  3. Use transactions where possible (atomicity)
  4. Keep track of migration versions
  5. Document the reason for each migration
  6. Never edit migrations already applied in production

2. Redis + Asynq

Asynq Overview

Asynq is a Go library for distributed task queues that uses Redis as its backend. It provides:

  • Asynchronous task enqueueing
  • Automatic retries with exponential backoff
  • Queue priorities
  • Task scheduling (schedule, delay)
  • Unique tasks (deduplication)
  • Task grouping and aggregation
  • Health checks and monitoring

Client Configuration

Creating the client:

import "github.com/hibiken/asynq"

// Option 1: Standard initialization
client := asynq.NewClient(asynq.RedisClientOpt{
    Addr:     "127.0.0.1:6379",
    Password: "xxxxx",
    DB:       2,
})
defer client.Close()

// Option 2: From an existing redis.UniversalClient
client := asynq.NewClientFromRedisClient(existingRedisClient)
// IMPORTANT: Asynq does NOT close this connection

RedisClientOpt types:

// Single Redis instance
redisOpt := asynq.RedisClientOpt{
    Addr:     "localhost:6379",
    Password: "",
    DB:       0,
}

// Redis Cluster
clusterOpt := asynq.RedisClusterClientOpt{
    Addrs: []string{
        "127.0.0.1:7000",
        "127.0.0.1:7001",
        "127.0.0.1:7002",
    },
}

// Redis Sentinel (high availability)
failoverOpt := asynq.RedisFailoverClientOpt{
    MasterName:    "mymaster",
    SentinelAddrs: []string{"127.0.0.1:26379"},
}

// Parse from URI
redisOpt, err := asynq.ParseRedisURI("redis://localhost:6379/0")

Task Serialization (JSON Payload)

Recommended pattern:

// 1. Define the payload struct
type EmailDeliveryPayload struct {
    UserID     int    `json:"user_id"`
    Email      string `json:"email"`
    TemplateID string `json:"template_id"`
}

// 2. Marshal to JSON
func NewEmailDeliveryTask(userID int, email, templateID string) (*asynq.Task, error) {
    payload, err := json.Marshal(EmailDeliveryPayload{
        UserID:     userID,
        Email:      email,
        TemplateID: templateID,
    })
    if err != nil {
        return nil, fmt.Errorf("failed to marshal payload: %w", err)
    }

    return asynq.NewTask("email:delivery", payload), nil
}

// 3. Enqueue task
task, err := NewEmailDeliveryTask(123, "user@example.com", "welcome")
if err != nil {
    log.Fatal(err)
}

info, err := client.Enqueue(task)
if err != nil {
    log.Fatal(err)
}

Task Options:

// MaxRetry
client.Enqueue(task, asynq.MaxRetry(5))

// Timeout
client.Enqueue(task, asynq.Timeout(30*time.Second))

// Queue assignment
client.Enqueue(task, asynq.Queue("critical"))

// Schedule for the future
client.Enqueue(task, asynq.ProcessIn(24*time.Hour))

// Unique task (deduplication)
client.Enqueue(task, asynq.Unique(1*time.Hour))

// Retention (keep completed task)
client.Enqueue(task, asynq.Retention(24*time.Hour))

// Combine options
client.Enqueue(task,
    asynq.MaxRetry(3),
    asynq.Timeout(5*time.Minute),
    asynq.Queue("critical"),
)

Defaults:

  • MaxRetry: 25
  • Timeout: 30 minutes
  • Queue: "default"

Server Configuration

Full configuration:

srv := asynq.NewServer(
    asynq.RedisClientOpt{Addr: "localhost:6379"},
    asynq.Config{
        // CONCURRENCY: Max concurrent tasks
        // 0 or negative = runtime.NumCPU()
        Concurrency: 10,

        // QUEUES: Priority weights
        Queues: map[string]int{
            "critical": 6,  // 60% of processing time
            "default":  3,  // 30% of processing time
            "low":      1,  // 10% of processing time
        },

        // STRICT PRIORITY: Process queues in strict order
        // true = "low" is only processed when "critical" and "default" are empty
        // false = weighted round-robin (recommended)
        StrictPriority: false,

        // RETRY: Custom retry delay function
        RetryDelayFunc: asynq.DefaultRetryDelayFunc, // Exponential backoff

        // ERROR HANDLER: Custom error logging
        ErrorHandler: asynq.ErrorHandlerFunc(func(ctx context.Context, task *asynq.Task, err error) {
            log.Printf("Task %s failed: %v", task.Type(), err)
        }),

        // HEALTH CHECK
        HealthCheckFunc: func(err error) {
            if err != nil {
                log.Printf("Health check failed: %v", err)
            }
        },
        HealthCheckInterval: 15 * time.Second, // Default: 15s
    },
)
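The config above still needs handlers registered and the worker loop started; a minimal sketch, reusing the HandleEmailDeliveryTask handler defined later in this section:

```go
// Map task types to handlers, then run the worker loop.
// srv.Run blocks and handles SIGTERM/SIGINT for graceful shutdown.
mux := asynq.NewServeMux()
mux.HandleFunc("email:delivery", HandleEmailDeliveryTask)

if err := srv.Run(mux); err != nil {
    log.Fatalf("could not run asynq server: %v", err)
}
```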

Recommended concurrency:

  • CPU-bound tasks: runtime.NumCPU()
  • I/O-bound tasks: runtime.NumCPU() * 2 or more
  • High-load production: 20-50
  • Monitor CPU and memory usage

Queue priority weights explained:

Queues: map[string]int{
    "critical": 6,
    "default":  3,
    "low":      1,
}
// Total weight = 6 + 3 + 1 = 10
// critical: 60% of processing time
// default:  30% of processing time
// low:      10% of processing time

Error Handling Patterns

Basic handler with error handling:

func HandleEmailDeliveryTask(ctx context.Context, t *asynq.Task) error {
    // 1. Unmarshal payload
    var p EmailDeliveryPayload
    if err := json.Unmarshal(t.Payload(), &p); err != nil {
        // IMPORTANT: wrap with SkipRetry so a permanently malformed payload is not retried
        return fmt.Errorf("json.Unmarshal failed: %v: %w", err, asynq.SkipRetry)
    }

    // 2. Access task metadata
    taskID, _ := asynq.GetTaskID(ctx)
    retryCount, _ := asynq.GetRetryCount(ctx)
    maxRetry, _ := asynq.GetMaxRetry(ctx)
    queueName, _ := asynq.GetQueueName(ctx)

    log.Printf("Processing task %s (retry %d/%d) from queue %s",
        taskID, retryCount, maxRetry, queueName)

    // 3. Process task
    if err := sendEmail(p.Email, p.TemplateID); err != nil {
        // Retryable error - Asynq will retry automatically
        return fmt.Errorf("failed to send email: %w", err)
    }

    // 4. Success
    return nil
}

Special error handling:

// SkipRetry: archives the task immediately, without retrying
if invalidData {
    return fmt.Errorf("invalid data: %w", asynq.SkipRetry)
}

// RevokeTask: prevents both retry AND archiving (deletes the task)
if unauthorized {
    return fmt.Errorf("unauthorized: %w", asynq.RevokeTask)
}

// nil: marks the task as successful
return nil
return nil

Handler panic protection:

// Asynq automatically recovers from panics and retries the task,
// but explicit recovery is still good practice:
func HandleTaskSafe(ctx context.Context, t *asynq.Task) (err error) {
    defer func() {
        if r := recover(); r != nil {
            err = fmt.Errorf("panic recovered: %v", r)
        }
    }()

    // Task logic here
    return nil
}

Retry Mechanisms

DefaultRetryDelayFunc (exponential backoff):

Retry 1: ~1 second
Retry 2: ~2 seconds
Retry 3: ~4 seconds
Retry 4: ~8 seconds
...

Custom retry logic:

customRetryDelayFunc := func(n int, err error, task *asynq.Task) time.Duration {
    // Linear backoff
    return time.Duration(n) * time.Second

    // Fixed delay
    // return 5 * time.Second

    // Conditional backoff
    // if errors.Is(err, ErrRateLimit) {
    //     return 1 * time.Minute
    // }
    // return time.Duration(n) * time.Second
}

srv := asynq.NewServer(
    redisOpt,
    asynq.Config{
        RetryDelayFunc: customRetryDelayFunc,
    },
)

Task lifecycle:

  1. Task enqueued → pending
  2. Worker picks up the task → active
  3. Handler returns an error → retry (up to MaxRetry)
  4. MaxRetry exceeded → archived
  5. Archive reaches max size → oldest tasks are deleted

Archive management:

  • The archive has a finite size
  • Archived tasks can be inspected via the Inspector
  • Configure a retention policy for automatic cleanup

Redis Connection Pooling

go-redis (used by Asynq) connection pool:

import "github.com/redis/go-redis/v9"

// go-redis creates a connection pool automatically;
// Asynq inherits this behavior.

// For custom configuration:
client := redis.NewClient(&redis.Options{
    Addr:     "localhost:6379",
    Password: "",
    DB:       0,

    // CONNECTION POOL
    PoolSize:     100,  // Max connections (default: 10 * runtime.NumCPU())
    MinIdleConns: 10,   // Min idle connections
    MaxIdleConns: 20,   // Max idle connections (go-redis v9+)

    // TIMEOUTS
    DialTimeout:  5 * time.Second,
    ReadTimeout:  3 * time.Second,
    WriteTimeout: 3 * time.Second,
    PoolTimeout:  4 * time.Second, // Timeout when the pool is full

    // CONNECTION LIFECYCLE
    ConnMaxIdleTime: 5 * time.Minute, // Close idle connections
    ConnMaxLifetime: 0,                // 0 = never close (let Redis handle)
})

Production recommendations:

  • PoolSize: 100 for high load
  • MinIdleConns: 10 (avoids cold-start latency)
  • ConnMaxIdleTime: 5 minutes (balance between reuse and stale connections)
  • Monitor the Redis connection count: INFO clients

Redis persistence (important for task queues):

# RDB: periodic snapshots (lower disk usage)
save 900 1
save 300 10
save 60 10000

# AOF: log of every write operation (better durability)
appendonly yes
appendfsync everysec  # Balance between performance and durability

Recommendation: AOF for task queues (avoids losing tasks on a crash)

Health and Monitoring

Inspector API (monitoring tool):

inspector := asynq.NewInspector(asynq.RedisClientOpt{
    Addr: "localhost:6379",
})

// List queues
queues, err := inspector.Queues()

// Get queue info
info, err := inspector.GetQueueInfo("default")
fmt.Printf("Pending: %d, Active: %d, Completed: %d\n",
    info.Pending, info.Active, info.Completed)

// List tasks
tasks, err := inspector.ListPendingTasks("default")

// Delete task
err = inspector.DeleteTask("default", taskID)

// Archive all retry tasks (returns the number archived)
n, err := inspector.ArchiveAllRetryTasks("default")

Metrics to monitor:

  • Pending task count (growing = not enough workers)
  • Active task count (high = slow or stuck tasks)
  • Completed task rate
  • Failed task rate
  • Retry count distribution
  • Queue latency (time spent in queue)
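A polling sketch built on the Inspector API shown earlier (the interval and queue name are illustrative):

```go
// Log queue depth periodically so alerting can catch a growing backlog.
go func() {
    ticker := time.NewTicker(30 * time.Second)
    defer ticker.Stop()
    for range ticker.C {
        info, err := inspector.GetQueueInfo("default")
        if err != nil {
            log.Printf("queue info: %v", err)
            continue
        }
        log.Printf("default queue: pending=%d active=%d", info.Pending, info.Active)
    }
}()
```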

Production Checklist

  1. Redis configuration:

    • AOF enabled for durability
    • maxmemory-policy: noeviction (avoids losing tasks)
    • Monitor memory usage
    • Backup strategy (RDB snapshots)
  2. Asynq configuration:

    • Concurrency tuned to the workload
    • Queue priorities configured
    • ErrorHandler implemented (logging/alerting)
    • HealthCheckFunc configured
    • Graceful shutdown implemented
  3. Task handlers:

    • JSON unmarshal errors wrapped with SkipRetry
    • Idempotency (tasks may be executed multiple times)
    • Timeout appropriate to task complexity
    • Panic recovery
    • Structured logging (taskID, retryCount, etc.)
  4. Monitoring:

    • Inspector API integrated with a dashboard
    • Alerts on queue depth
    • Alerts on failed-task rate
    • Redis metrics (connections, memory, ops/sec)

3. Gin Framework

Router Setup and Middleware Chain Order

Basic initialization:

import "github.com/gin-gonic/gin"

// Production mode (fewer logs, better performance)
gin.SetMode(gin.ReleaseMode)

router := gin.New() // No default middleware

// Or with the default middleware (Logger + Recovery)
router := gin.Default()

Middleware chain execution:

Request → Middleware 1 → Middleware 2 → ... → Handler → Middleware 2 → Middleware 1 → Response

Middlewares execute IN ORDER:

  1. Pre-handler code (before c.Next())
  2. The handler executes
  3. Post-handler code (after c.Next(), in REVERSE order)
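The pre/post split can be sketched with a timing middleware (the name is illustrative):

```go
// Pre-handler work runs before c.Next(); post-handler work runs
// after the handler (and any later middleware) returns.
func timingMiddleware() gin.HandlerFunc {
    return func(c *gin.Context) {
        start := time.Now() // pre-handler

        c.Next() // hand off to the rest of the chain

        // post-handler: executes on the way back out
        log.Printf("%s %s took %v",
            c.Request.Method, c.Request.URL.Path, time.Since(start))
    }
}
```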

Recommended order (production):

router := gin.New()

// 1. CORS - Must be first to handle OPTIONS preflight
router.Use(corsMiddleware())

// 2. RECOVERY - Recover from panics
router.Use(gin.Recovery())

// 3. LOGGING - Log all requests (including failed auth)
router.Use(gin.Logger())
// Or a custom logger:
router.Use(customLogger())

// 4. RATE LIMITING - Before authentication
router.Use(rateLimitMiddleware())

// 5. REQUEST ID - For tracing
router.Use(requestIDMiddleware())

// 6. AUTHENTICATION - Verify credentials
router.Use(authMiddleware()) // Applies to all routes
// Or selectively:
authorized := router.Group("/")
authorized.Use(authMiddleware())

// 7. AUTHORIZATION - Check permissions
authorized.Use(authzMiddleware())

// 8. Routes
authorized.GET("/users", getUsers)

Rationale for this order:

  1. CORS first: OPTIONS preflight requests do not need auth
  2. Recovery early: catches panics from all other middlewares
  3. Logging early: records every request (including auth failures)
  4. Rate limiting before auth: prevents brute force against login endpoints
  5. Auth before routes: protects sensitive routes
  6. Authz after auth: needs the user context set by auth

Middleware selective application:

// Global middleware
router.Use(corsMiddleware())

// Group middleware
api := router.Group("/api")
api.Use(authMiddleware())
{
    api.GET("/users", getUsers)
}

// Per-route middleware
router.GET("/public", publicHandler)
router.GET("/private", authMiddleware(), privateHandler)

CORS Configuration

NEVER use in production:

// DANGEROUS - Opens to all origins
router.Use(func(c *gin.Context) {
    c.Writer.Header().Set("Access-Control-Allow-Origin", "*")
})

Use gin-contrib/cors:

import "github.com/gin-contrib/cors"

// Option 1: Allow all origins (NO credentials)
router.Use(cors.New(cors.Config{
    AllowAllOrigins: true, // Disables cookies!
    AllowMethods:    []string{"GET", "POST", "PUT", "DELETE"},
    AllowHeaders:    []string{"Origin", "Content-Type"},
}))

// Option 2: Specific origins (RECOMMENDED for production)
router.Use(cors.New(cors.Config{
    AllowOrigins:     []string{"https://example.com", "https://app.example.com"},
    AllowMethods:     []string{"GET", "POST", "PUT", "PATCH", "DELETE", "OPTIONS"},
    AllowHeaders:     []string{"Origin", "Content-Type", "Authorization"},
    ExposeHeaders:    []string{"Content-Length"},
    AllowCredentials: true, // Enables cookies/auth headers
    MaxAge:           12 * time.Hour,
}))

// Option 3: Wildcard patterns (advanced)
router.Use(cors.New(cors.Config{
    AllowOrigins:     []string{"https://*.example.com"},
    AllowWildcard:    true, // Enable wildcard matching
    AllowCredentials: true,
}))

// Option 4: Custom validation function
router.Use(cors.New(cors.Config{
    AllowOriginFunc: func(origin string) bool {
        // Custom logic
        return strings.HasSuffix(origin, ".example.com")
    },
    AllowCredentials: true,
}))

Production config example:

corsConfig := cors.Config{
    AllowOrigins: []string{
        "https://app.typecraft.com",
        "https://admin.typecraft.com",
    },
    AllowMethods: []string{
        "GET", "POST", "PUT", "PATCH", "DELETE", "OPTIONS",
    },
    AllowHeaders: []string{
        "Origin",
        "Content-Type",
        "Authorization",
        "X-Request-ID",
    },
    ExposeHeaders: []string{
        "Content-Length",
        "X-Request-ID",
    },
    AllowCredentials: true,
    MaxAge:           12 * time.Hour, // Preflight cache duration
}

router.Use(cors.New(corsConfig))

IMPORTANT:

  • AllowAllOrigins: true DISABLES credentials (cookies will not work)
  • For auth headers/cookies you MUST list explicit origins
  • Never combine AllowOrigins: ["*"] with AllowCredentials: true (invalid)

Error Handling

c.JSON vs c.AbortWithStatusJSON:

// c.JSON - Continue middleware chain
func handler(c *gin.Context) {
    if err := validate(); err != nil {
        c.JSON(400, gin.H{"error": "validation failed"})
        // WARNING: middlewares AFTER this one still execute!
        return
    }
    c.JSON(200, gin.H{"status": "ok"})
}

// c.AbortWithStatusJSON - STOP middleware chain (RECOMMENDED)
func handler(c *gin.Context) {
    if err := validate(); err != nil {
        c.AbortWithStatusJSON(400, gin.H{"error": "validation failed"})
        // The middleware chain STOPS here
        return
    }
    c.JSON(200, gin.H{"status": "ok"})
}

When to use each:

  • c.JSON: normal responses (success cases)
  • c.AbortWithStatusJSON: errors (prevents post-handlers from running)
  • c.AbortWithStatus: errors without a JSON body

Error response pattern:

type ErrorResponse struct {
    Error   string `json:"error"`
    Message string `json:"message"`
    Code    string `json:"code,omitempty"`
}

func handleError(c *gin.Context, statusCode int, err error) {
    c.AbortWithStatusJSON(statusCode, ErrorResponse{
        Error:   http.StatusText(statusCode),
        Message: err.Error(),
    })
}

// Usage
func createUser(c *gin.Context) {
    var user User
    if err := c.ShouldBindJSON(&user); err != nil {
        handleError(c, 400, err)
        return
    }

    if err := db.Create(&user).Error; err != nil {
        handleError(c, 500, err)
        return
    }

    c.JSON(201, user)
}

Custom error middleware:

func ErrorHandler() gin.HandlerFunc {
    return func(c *gin.Context) {
        c.Next() // Execute handlers

        // Check if errors occurred
        if len(c.Errors) > 0 {
            err := c.Errors.Last()

            // Log error
            log.Printf("Error: %v", err.Err)

            // Send response
            c.JSON(-1, gin.H{ // -1 = use existing status code
                "error": err.Error(),
            })
        }
    }
}

router.Use(ErrorHandler())

Graceful Shutdown

Recommended pattern (Go 1.8+):

import (
    "context"
    "log"
    "net/http"
    "os"
    "os/signal"
    "syscall"
    "time"

    "github.com/gin-gonic/gin"
)

func main() {
    router := gin.Default()
    // Setup routes...

    // Create http.Server
    srv := &http.Server{
        Addr:    ":8080",
        Handler: router,

        // Timeouts (production recommended)
        ReadTimeout:  10 * time.Second,
        WriteTimeout: 10 * time.Second,
        IdleTimeout:  120 * time.Second,
    }

    // Start server in goroutine
    go func() {
        if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
            log.Fatalf("listen: %s\n", err)
        }
    }()

    // Wait for interrupt signal
    quit := make(chan os.Signal, 1)
    signal.Notify(quit, syscall.SIGINT, syscall.SIGTERM)
    <-quit
    log.Println("Shutting down server...")

    // Graceful shutdown with timeout
    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel()

    if err := srv.Shutdown(ctx); err != nil {
        log.Fatal("Server forced to shutdown:", err)
    }

    log.Println("Server exited")
}

gin-contrib/graceful (2025 wrapper):

import "github.com/gin-contrib/graceful"

func main() {
    router := gin.Default()
    // Setup routes...

    // RunWithContext binds lifecycle to context
    ctx, cancel := context.WithCancel(context.Background())
    defer cancel()

    go func() {
        quit := make(chan os.Signal, 1)
        signal.Notify(quit, syscall.SIGINT, syscall.SIGTERM)
        <-quit
        cancel() // Cancel context to trigger shutdown
    }()

    if err := graceful.RunWithContext(ctx, router, ":8080"); err != nil {
        log.Fatal(err)
    }
}

Kubernetes considerations:

SIGTERM sent → 30s grace period (default) → SIGKILL
  1. Kubernetes sends SIGTERM
  2. The application has ~30s to:
    • Stop accepting new connections
    • Finish in-flight requests
    • Clean up resources
  3. If it does not finish in time, SIGKILL is forced

Recommended shutdown timeout:

  • Small apps: 5-10 seconds
  • Apps with long-running requests: 30 seconds
  • Never longer than the Kubernetes grace period

Health Check Patterns

Basic endpoint:

router.GET("/health", func(c *gin.Context) {
    c.JSON(200, gin.H{
        "status": "ok",
    })
})

Full health check:

type HealthStatus struct {
    Status    string            `json:"status"` // "ok" | "degraded" | "error"
    Timestamp time.Time         `json:"timestamp"`
    Checks    map[string]string `json:"checks"`
}

router.GET("/health", func(c *gin.Context) {
    health := HealthStatus{
        Status:    "ok",
        Timestamp: time.Now(),
        Checks:    make(map[string]string),
    }

    // Check database
    if err := db.Exec("SELECT 1").Error; err != nil {
        health.Status = "error"
        health.Checks["database"] = "unhealthy: " + err.Error()
    } else {
        health.Checks["database"] = "healthy"
    }

    // Check Redis
    if err := redisClient.Ping(context.Background()).Err(); err != nil {
        health.Status = "degraded"
        health.Checks["redis"] = "unhealthy: " + err.Error()
    } else {
        health.Checks["redis"] = "healthy"
    }

    // Return appropriate status code
    statusCode := 200
    if health.Status == "error" {
        statusCode = 503
    } else if health.Status == "degraded" {
        statusCode = 200 // Still serving traffic
    }

    c.JSON(statusCode, health)
})

Liveness vs Readiness (Kubernetes):

// Liveness: "Is app alive?" (restart if fails)
router.GET("/healthz", func(c *gin.Context) {
    // Simple check - only verifies that the app responds
    c.JSON(200, gin.H{"status": "alive"})
})

// Readiness: "Is app ready to serve traffic?" (remove from load balancer if fails)
router.GET("/readyz", func(c *gin.Context) {
    // Check dependencies (DB, Redis, etc)
    if err := db.Exec("SELECT 1").Error; err != nil {
        c.JSON(503, gin.H{"status": "not ready"})
        return
    }
    c.JSON(200, gin.H{"status": "ready"})
})

Best practices:

  • /health - public health check (minimal info)
  • /healthz - liveness probe (always 200 if the app is alive)
  • /readyz - readiness probe (checks dependencies)
  • Short timeout (< 1s)
  • No heavy checks (they hurt latency)

Context Timeout Handling

Global request timeout:

func TimeoutMiddleware(timeout time.Duration) gin.HandlerFunc {
    return func(c *gin.Context) {
        ctx, cancel := context.WithTimeout(c.Request.Context(), timeout)
        defer cancel()

        c.Request = c.Request.WithContext(ctx)

        finished := make(chan struct{}, 1) // buffered so the goroutine cannot leak on timeout
        go func() {
            c.Next()
            finished <- struct{}{}
        }()

        select {
        case <-finished:
            return
        case <-ctx.Done():
            c.AbortWithStatusJSON(http.StatusRequestTimeout, gin.H{
                "error": "request timeout",
            })
        }
    }
}

router.Use(TimeoutMiddleware(30 * time.Second))

Per-request timeout:

func longRunningHandler(c *gin.Context) {
    ctx, cancel := context.WithTimeout(c.Request.Context(), 10*time.Second)
    defer cancel()

    // Pass ctx to database/external calls
    result := make(chan string, 1)
    go func() {
        // Long operation
        var users []User
        if err := db.WithContext(ctx).Find(&users).Error; err != nil {
            result <- ""
        } else {
            result <- "success"
        }
    }()

    select {
    case res := <-result:
        c.JSON(200, gin.H{"result": res})
    case <-ctx.Done():
        c.JSON(504, gin.H{"error": "operation timeout"})
    }
}

Production Checklist

  1. Mode:

    • gin.SetMode(gin.ReleaseMode) in production
  2. Middleware order:

    • CORS first
    • Recovery for panics
    • Logging for auditing
    • Auth/Authz in the correct order
  3. CORS:

    • Specific origins (never * with credentials)
    • AllowCredentials if using auth headers/cookies
    • MaxAge configured (preflight cache)
  4. Error handling:

    • Use AbortWithStatusJSON for errors
    • Structured error responses
    • Do not leak stack traces in production
  5. Graceful shutdown:

    • Signal handling (SIGTERM, SIGINT)
    • Timeout shorter than the K8s grace period
    • Resource cleanup
  6. Health checks:

    • /healthz (liveness)
    • /readyz (readiness)
    • Short timeout (< 1s)
  7. Timeouts:

    • ReadTimeout, WriteTimeout, IdleTimeout configured
    • Request timeout middleware if needed

4. Chromedp (PDF Generation)

Context Lifecycle

Core concepts:

  • Allocator: responsible for creating/connecting to browsers
  • Context: represents a browser or a tab
  • Run: executes actions in a context

Basic pattern:

import (
    "context"
    "github.com/chromedp/chromedp"
)

// 1. Create allocator context
ctx, cancel := chromedp.NewExecAllocator(
    context.Background(),
    chromedp.DefaultExecAllocatorOptions[:]...,
)
defer cancel()

// 2. Create chromedp context
ctx, cancel = chromedp.NewContext(ctx)
defer cancel()

// 3. Run actions
var buf []byte
if err := chromedp.Run(ctx,
    chromedp.Navigate("https://example.com"),
    chromedp.FullScreenshot(&buf, 90),
); err != nil {
    log.Fatal(err)
}

Allocator → Context → Run:

NewExecAllocator → NewContext → Run (first call: allocates the browser) → Run (subsequent calls: reuse the browser)

IMPORTANT:

  • First Run(): allocates the browser (slow)
  • Subsequent Run() calls: reuse the browser (fast)
  • Cancelling the context closes the tab or the whole browser

Multi-tab pattern (reuse browser):

// Parent context (browser)
allocCtx, cancel := chromedp.NewExecAllocator(context.Background(),
    chromedp.DefaultExecAllocatorOptions[:]...,
)
defer cancel()

browserCtx, cancel := chromedp.NewContext(allocCtx)
defer cancel()

// Tab 1
tab1Ctx, cancel := chromedp.NewContext(browserCtx)
defer cancel()
chromedp.Run(tab1Ctx, chromedp.Navigate("https://example.com"))

// Tab 2
tab2Ctx, cancel := chromedp.NewContext(browserCtx)
defer cancel()
chromedp.Run(tab2Ctx, chromedp.Navigate("https://google.com"))

// Cancelling browserCtx closes ALL tabs

Allocator Options

DefaultExecAllocatorOptions (abbreviated):

chromedp.DefaultExecAllocatorOptions = [...]chromedp.ExecAllocatorOption{
    chromedp.NoFirstRun,
    chromedp.NoDefaultBrowserCheck,
    chromedp.Headless,                            // Headless mode
    chromedp.Flag("disable-dev-shm-usage", true), // Use /tmp instead of /dev/shm (Docker)
    chromedp.Flag("disable-popup-blocking", true),
    // ...plus roughly twenty other Puppeteer-inspired flags
}

NOTE: NoSandbox is NOT part of the defaults; it must be added explicitly (e.g. for Docker).

Custom allocator (production):

Most of the Puppeteer-style flags (disable-background-networking, disable-breakpad, disable-extensions, disable-sync, force-color-profile, etc.) are already included in DefaultExecAllocatorOptions; append only what the defaults lack:

opts := append(chromedp.DefaultExecAllocatorOptions[:],
    chromedp.NoSandbox,   // required in unprivileged containers
    chromedp.DisableGPU,  // no GPU in headless containers
)

ctx, cancel := chromedp.NewExecAllocator(context.Background(), opts...)
defer cancel()

Docker-specific flags (MANDATORY):

chromedp.NoSandbox,                           // Chrome's sandbox does not work in Docker without privileges
chromedp.Flag("disable-dev-shm-usage", true), // /dev/shm is limited in containers (64MB default)
chromedp.DisableGPU,                          // no GPU available in headless containers

Headless vs Headed:

// Headless (default) - production
opts := chromedp.DefaultExecAllocatorOptions[:]

// Headed - debugging
opts := append(
    chromedp.DefaultExecAllocatorOptions[:],
    chromedp.Flag("headless", false),
)

PDF Printing (PrintToPDF)

Available parameters:

import "github.com/chromedp/cdproto/page"

// Basic PDF generation
var pdfBuf []byte
err := chromedp.Run(ctx,
    chromedp.Navigate("https://example.com"),
    chromedp.ActionFunc(func(ctx context.Context) error {
        buf, _, err := page.PrintToPDF().Do(ctx)
        if err != nil {
            return err
        }
        pdfBuf = buf
        return nil
    }),
)

Full parameter set:

pdfParams := page.PrintToPDF().
    WithPrintBackground(true).           // Include background graphics
    WithLandscape(true).                 // Landscape orientation (default: false)
    WithPaperWidth(8.5).                 // Paper width in inches
    WithPaperHeight(11).                 // Paper height in inches
    WithMarginTop(0.4).                  // Top margin in inches (default: ~0.4)
    WithMarginBottom(0.4).               // Bottom margin in inches
    WithMarginLeft(0.4).                 // Left margin in inches
    WithMarginRight(0.4).                // Right margin in inches
    WithScale(1.0).                      // Scale (0.1 to 2.0)
    WithDisplayHeaderFooter(true).       // Show header/footer
    WithHeaderTemplate("<div>Header</div>"). // HTML header template
    WithFooterTemplate("<div>Footer</div>"). // HTML footer template
    WithPreferCSSPageSize(false)         // Use CSS @page size (default: false)

buf, _, err := pdfParams.Do(ctx)

Most common parameters:

// A4 portrait with background
page.PrintToPDF().
    WithPrintBackground(true).
    WithPaperWidth(8.27).  // A4 width
    WithPaperHeight(11.69). // A4 height
    WithMarginTop(0.4).
    WithMarginBottom(0.4).
    WithMarginLeft(0.4).
    WithMarginRight(0.4)

// Letter landscape without margins
page.PrintToPDF().
    WithPrintBackground(true).
    WithLandscape(true).
    WithPaperWidth(11).
    WithPaperHeight(8.5).
    WithMarginTop(0).
    WithMarginBottom(0).
    WithMarginLeft(0).
    WithMarginRight(0)

Common paper sizes:

A4:      8.27" x 11.69"
Letter:  8.5" x 11"
Legal:   8.5" x 14"
A3:      11.69" x 16.54"

Known issue - margins = 0:

Older chromedp versions had a bug where the omitempty tag caused margins set to 0 to be ignored. Workaround: use small values (0.01) instead of 0.

Timeout Configuration

CRITICAL - Never put a timeout on the first Run():

// WRONG - This kills the browser during allocation
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()

allocCtx, _ := chromedp.NewExecAllocator(ctx, opts...)
browserCtx, _ := chromedp.NewContext(allocCtx)

// If the first Run() takes longer than 5s, the context is cancelled and the browser is killed
chromedp.Run(browserCtx, actions...)


// CORRECT - Timeout only for specific actions
allocCtx, cancel := chromedp.NewExecAllocator(context.Background(), opts...)
defer cancel()

browserCtx, cancel := chromedp.NewContext(allocCtx)
defer cancel()

// First Run without a timeout (allocates the browser)
chromedp.Run(browserCtx, chromedp.Navigate("about:blank"))

// Subsequent Runs with a timeout
taskCtx, cancel := context.WithTimeout(browserCtx, 30*time.Second)
defer cancel()

chromedp.Run(taskCtx,
    chromedp.Navigate("https://example.com"),
    chromedp.WaitReady("body"),
    // PDF generation...
)

Timeout patterns:

// Pattern 1: timeout for a specific action
func generatePDFWithTimeout(url string, timeout time.Duration) ([]byte, error) {
    ctx, cancel := context.WithTimeout(browserCtx, timeout)
    defer cancel()

    var buf []byte
    err := chromedp.Run(ctx,
        chromedp.Navigate(url),
        chromedp.WaitReady("body"),
        chromedp.ActionFunc(func(ctx context.Context) error {
            var err error
            buf, _, err = page.PrintToPDF().WithPrintBackground(true).Do(ctx)
            return err
        }),
    )

    return buf, err
}

// Pattern 2: timeouts for individual tasks
err := chromedp.Run(ctx,
    chromedp.Navigate(url),
    chromedp.ActionFunc(func(ctx context.Context) error {
        ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
        defer cancel()
        return chromedp.WaitReady("body").Do(ctx)
    }),
    // More actions...
)

Graceful shutdown with a timeout:

// Shut down the browser with a 10s timeout
tctx, tcancel := context.WithTimeout(context.Background(), 10*time.Second)
defer tcancel()

if err := chromedp.Cancel(tctx); err != nil {
    log.Printf("Failed to cancel context gracefully: %v", err)
}

Error Handling Patterns

Common errors:

  1. Context canceled: browser connection lost
  2. Timeout: the action exceeded its deadline
  3. Invalid context: an action executed outside of Run()

Error handling pattern:

func generatePDF(url string) ([]byte, error) {
    var buf []byte

    err := chromedp.Run(ctx,
        chromedp.Navigate(url),
        chromedp.WaitReady("body", chromedp.ByQuery),
        chromedp.ActionFunc(func(ctx context.Context) error {
            var err error
            buf, _, err = page.PrintToPDF().WithPrintBackground(true).Do(ctx)
            if err != nil {
                return fmt.Errorf("printToPDF failed: %w", err)
            }
            return nil
        }),
    )

    if err != nil {
        // Check error type
        if errors.Is(err, context.DeadlineExceeded) {
            return nil, fmt.Errorf("PDF generation timeout: %w", err)
        }
        if errors.Is(err, context.Canceled) {
            return nil, fmt.Errorf("browser connection lost: %w", err)
        }
        return nil, fmt.Errorf("chromedp error: %w", err)
    }

    return buf, nil
}

Wait for images to load (common issue):

// Issue: PrintToPDF may capture the page before images finish loading

// Solution 1: WaitReady on a specific element
chromedp.Run(ctx,
    chromedp.Navigate(url),
    chromedp.WaitReady("img.hero", chromedp.ByQuery),
    // Generate PDF...
)

// Solution 2: Sleep (not recommended, but works)
chromedp.Run(ctx,
    chromedp.Navigate(url),
    chromedp.Sleep(2*time.Second),
    // Generate PDF...
)

// Solution 3: WaitReady plus a safety margin (approximates network idle)
chromedp.Run(ctx,
    chromedp.Navigate(url),
    chromedp.ActionFunc(func(ctx context.Context) error {
        // WaitReady("body") only guarantees the DOM is ready, not that
        // every network request has finished
        return chromedp.WaitReady("body").Do(ctx)
    }),
    chromedp.Sleep(1*time.Second), // Extra safety margin
    // Generate PDF...
)

Headless Browser Requirements (Docker)

Dockerfile for chromedp:

# Multi-stage build
FROM golang:1.21-alpine AS builder

WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o app

# Runtime stage - Use chromedp/headless-shell
FROM chromedp/headless-shell:latest

# Copy binary
COPY --from=builder /app/app /app

# Run as non-root
USER chromedp

EXPOSE 8080
ENTRYPOINT ["/app"]

chromedp/headless-shell vs google-chrome-stable:

  • chromedp/headless-shell: optimized for headless use (smaller image, no GUI)
  • google-chrome-stable: full Chrome (larger, includes GUI components)

Recommendation: chromedp/headless-shell for production

Alternative - install Chrome manually:

FROM golang:1.21-bullseye

# Install Chrome dependencies
RUN apt-get update && apt-get install -y \
    wget \
    gnupg \
    ca-certificates \
    fonts-liberation \
    libasound2 \
    libatk-bridge2.0-0 \
    libatk1.0-0 \
    libatspi2.0-0 \
    libcups2 \
    libdbus-1-3 \
    libdrm2 \
    libgbm1 \
    libgtk-3-0 \
    libnspr4 \
    libnss3 \
    libwayland-client0 \
    libxcomposite1 \
    libxdamage1 \
    libxfixes3 \
    libxkbcommon0 \
    libxrandr2 \
    xdg-utils \
    --no-install-recommends \
    && rm -rf /var/lib/apt/lists/*

# Install Chrome
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
    && echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list \
    && apt-get update \
    && apt-get install -y google-chrome-stable \
    && rm -rf /var/lib/apt/lists/*

# Build app...

Required flags for Docker (recap):

chromedp.NoSandbox,                           // MANDATORY - the sandbox requires privileged mode
chromedp.Flag("disable-dev-shm-usage", true), // MANDATORY - /dev/shm is limited (64MB)
chromedp.DisableGPU,                          // RECOMMENDED - no GPU available

Cloud Run specifics:

# cloud-run.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: typecraft-pdf
spec:
  template:
    spec:
      containers:
      - image: gcr.io/project/typecraft-pdf
        resources:
          limits:
            memory: 2Gi      # Chrome consumes a lot of memory
            cpu: 2           # CPU for rendering
        env:
        - name: CHROME_BIN
          value: /headless-shell/headless-shell

Production Patterns

Pattern 1: Browser pool (reuse browsers):

type BrowserPool struct {
    browsers chan context.Context
    cancel   context.CancelFunc
}

func NewBrowserPool(size int) (*BrowserPool, error) {
    pool := &BrowserPool{
        browsers: make(chan context.Context, size),
    }

    allocCtx, cancel := chromedp.NewExecAllocator(context.Background(), opts...)
    pool.cancel = cancel

    for i := 0; i < size; i++ {
        ctx, _ := chromedp.NewContext(allocCtx)
        // Initialize browser
        chromedp.Run(ctx, chromedp.Navigate("about:blank"))
        pool.browsers <- ctx
    }

    return pool, nil
}

func (p *BrowserPool) Get() context.Context {
    return <-p.browsers
}

func (p *BrowserPool) Put(ctx context.Context) {
    p.browsers <- ctx
}

func (p *BrowserPool) Close() {
    close(p.browsers)
    p.cancel()
}

// Usage
func generatePDF(url string) ([]byte, error) {
    ctx := pool.Get()
    defer pool.Put(ctx)

    taskCtx, cancel := context.WithTimeout(ctx, 30*time.Second)
    defer cancel()

    var buf []byte
    err := chromedp.Run(taskCtx,
        chromedp.Navigate(url),
        chromedp.WaitReady("body"),
        chromedp.ActionFunc(func(ctx context.Context) error {
            var err error
            buf, _, err = page.PrintToPDF().WithPrintBackground(true).Do(ctx)
            return err
        }),
    )

    return buf, err
}

Pattern 2: Single browser, multiple tabs:

var browserCtx context.Context

func init() {
    allocCtx, _ := chromedp.NewExecAllocator(context.Background(), opts...)
    browserCtx, _ = chromedp.NewContext(allocCtx)
    chromedp.Run(browserCtx, chromedp.Navigate("about:blank"))
}

func generatePDF(url string) ([]byte, error) {
    // Create new tab
    tabCtx, cancel := chromedp.NewContext(browserCtx)
    defer cancel()

    taskCtx, cancel := context.WithTimeout(tabCtx, 30*time.Second)
    defer cancel()

    var buf []byte
    err := chromedp.Run(taskCtx, /* actions... */)

    return buf, err
}

Pattern 3: RemoteAllocator (external Chrome):

// Start Chrome externally
// $ google-chrome --headless --remote-debugging-port=9222

// Connect to it
allocCtx, cancel := chromedp.NewRemoteAllocator(
    context.Background(),
    "ws://localhost:9222",
)
defer cancel()

ctx, cancel := chromedp.NewContext(allocCtx)
defer cancel()

Advantage: Chrome runs as a separate process (it survives app crashes)

Production Checklist

  1. Docker:

    • Use the chromedp/headless-shell image
    • NoSandbox flag enabled
    • disable-dev-shm-usage flag enabled
    • Memory limit >= 1GB (2GB recommended)
  2. Context lifecycle:

    • Never put a timeout on the first Run()
    • Reuse browser contexts (pool or single instance)
    • Graceful shutdown with a timeout
  3. PDF generation:

    • PrintBackground(true) if needed
    • Margins configured correctly
    • WaitReady before PrintToPDF
    • Extra sleep when there are images/async content
  4. Error handling:

    • Detect DeadlineExceeded
    • Retry logic for transient errors
    • Log failures
  5. Performance:

    • Browser pooling under high load
    • Adequate timeout (30s recommended)
    • Clean up tabs after use

5. Docker/Cloud Run Deployment

Docker Multi-Stage Builds (Go Best Practices)

Recommended pattern (2025):

# Stage 1: Build
FROM golang:1.21-alpine AS builder

WORKDIR /app

# Copy dependency files first (layer caching)
COPY go.mod go.sum ./
RUN go mod download

# Copy source code
COPY . .

# Build with optimization flags
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 \
    go build -trimpath -ldflags="-s -w" -o app .

# Stage 2: Runtime
FROM gcr.io/distroless/static-debian12:nonroot

# Copy CA certificates (HTTPS calls)
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/

# Copy binary
COPY --from=builder /app/app /app

# Use non-root user (security)
USER nonroot:nonroot

EXPOSE 8080

ENTRYPOINT ["/app"]

Build flag explanation:

CGO_ENABLED=0        # Static binary (no libc dependency)
GOOS=linux           # Target OS
GOARCH=amd64         # Target architecture
-trimpath            # Remove file system paths from the binary
-ldflags="-s -w"     # Omit the symbol table (-s) and DWARF debug info (-w)

Result: binary ~50-80% smaller

Base image options:

Image        Size    Security                            Use Case
scratch      ~7MB    High (nothing beyond the binary)    Static binaries, maximum minimalism
alpine       ~14MB   Medium (busybox; wget CVEs)         When a shell is needed for debugging
distroless   ~9MB    High (no shell/package manager)     RECOMMENDED for production

Distroless variants:

# Static (Go, Rust)
FROM gcr.io/distroless/static-debian12:nonroot

# Base (dynamic linking)
FROM gcr.io/distroless/base-debian12:nonroot

# With CA certs (HTTPS)
FROM gcr.io/distroless/static-debian12:nonroot
# Already includes /etc/ssl/certs/ca-certificates.crt

Security best practices:

# 1. Non-root user
USER nonroot:nonroot

# Or create a custom user on Alpine:
RUN addgroup -S appuser && \
    adduser -S -G appuser -H -s /sbin/nologin appuser
USER appuser

# 2. Read-only filesystem (Cloud Run/K8s)
# Not needed in the Dockerfile; configure it in the deployment

# 3. No secrets in the image
# NEVER: COPY .env /app/.env
# Use: Cloud Run env vars or Secret Manager

Multi-stage with chromedp:

FROM golang:1.21-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -trimpath -ldflags="-s -w" -o app

# Use chromedp/headless-shell (includes Chrome)
FROM chromedp/headless-shell:latest

COPY --from=builder /app/app /app

USER chromedp

EXPOSE 8080
ENTRYPOINT ["/app"]

Size comparison:

Single-stage (golang:1.21): ~850MB
Multi-stage (alpine):       ~20MB
Multi-stage (distroless):   ~15MB
Multi-stage (scratch):      ~10MB

Reduction: 95%+

Cloud Run Service Configuration

Environment Variables:

# cloud-run.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: typecraft
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/maxScale: '10'
        autoscaling.knative.dev/minScale: '0'
    spec:
      containers:
      - image: gcr.io/PROJECT_ID/typecraft:latest
        env:
        # App config
        - name: PORT
          value: "8080"
        - name: GIN_MODE
          value: "release"

        # Database (use Secret Manager for sensitive data)
        - name: DB_HOST
          value: "10.1.2.3"
        - name: DB_PORT
          value: "5432"
        - name: DB_NAME
          value: "typecraft"
        - name: DB_USER
          valueFrom:
            secretKeyRef:
              name: db-user
              key: latest
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-password
              key: latest

        # Cloud SQL connection (via Unix socket)
        - name: INSTANCE_CONNECTION_NAME
          value: "project:region:instance"

        resources:
          limits:
            memory: 512Mi
            cpu: '1'

Limits:

  • Max env vars: 1000
  • Max env var length: 32KB
  • Do not use env vars for large secrets (use Secret Manager)

PORT environment variable:

  • Cloud Run ALWAYS sets PORT (usually 8080)
  • The app MUST listen on this port
  • Do not hardcode the port in code
port := os.Getenv("PORT")
if port == "" {
    port = "8080" // Fallback for local dev
}
}

srv := &http.Server{
    Addr:    ":" + port,
    Handler: router,
}

Health Checks Configuration

Startup Probe:

spec:
  template:
    spec:
      containers:
      - image: gcr.io/PROJECT_ID/app
        startupProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 0
          periodSeconds: 10        # Check every 10s
          timeoutSeconds: 1        # 1s timeout per check
          failureThreshold: 3      # Retry 3 times before failing

Parameters:

  • initialDelaySeconds: 0-240s (default: 0)
  • periodSeconds: 1-240s (default: 10s)
  • timeoutSeconds: 1-240s (default: 1s)
  • failureThreshold: number of retries (default: 3)

IMPORTANT: timeoutSeconds <= periodSeconds

Liveness Probe:

livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10
  timeoutSeconds: 1
  failureThreshold: 3

Default behavior (if not configured):

Cloud Run automatically applies:

startupProbe:
  tcpSocket:
    port: <PORT>
  timeoutSeconds: 240
  periodSeconds: 240
  failureThreshold: 1

Best practices:

  1. Fast endpoint (< 1s):
router.GET("/healthz", func(c *gin.Context) {
    c.JSON(200, gin.H{"status": "ok"})
})
  2. No heavy DB queries:
// WRONG
router.GET("/healthz", func(c *gin.Context) {
    var count int64
    db.Model(&User{}).Count(&count) // Slow!
    c.JSON(200, gin.H{"status": "ok"})
})

// CORRECT
router.GET("/healthz", func(c *gin.Context) {
    db.Exec("SELECT 1") // Fast ping
    c.JSON(200, gin.H{"status": "ok"})
})
  3. Liveness vs Readiness:
    • Liveness (/healthz): is the app alive? (restarted on failure)
    • Readiness (/readyz): is the app ready for traffic? (removed from the LB on failure)

Timeout Configuration

Request timeout:

spec:
  template:
    metadata:
      annotations:
        run.googleapis.com/timeout: '300s'  # 5 minutes

Limits:

  • Default: 300s (5 minutes)
  • Max (Services): 3600s (1 hour)
  • Max (Jobs): 168 hours (7 days) (Preview)

IMPORTANT:

  • The timeout applies to EACH request
  • After the timeout, Cloud Run returns 504 Gateway Timeout
  • The request is cancelled (context.DeadlineExceeded)

Handling the timeout in code:

func longRunningHandler(c *gin.Context) {
    // Cloud Run cancels the context after the timeout
    ctx := c.Request.Context()

    select {
    case result := <-processData(ctx):
        c.JSON(200, result)
    case <-ctx.Done():
        // Context cancelled (timeout or client disconnect)
        c.JSON(504, gin.H{"error": "request timeout"})
    }
}

Best practices:

  • Requests < 60s: use the default (300s)
  • Requests > 60s: raise the timeout + use background jobs (Asynq)
  • Requests > 1h: NEVER - use Cloud Run Jobs or Cloud Tasks

Cloud Run + Cloud SQL Connection

Connection methods:

  1. Unix socket (public IP - RECOMMENDED):
// DSN format
dsn := fmt.Sprintf("host=/cloudsql/%s user=%s password=%s dbname=%s sslmode=disable",
    os.Getenv("INSTANCE_CONNECTION_NAME"),
    os.Getenv("DB_USER"),
    os.Getenv("DB_PASSWORD"),
    os.Getenv("DB_NAME"),
)

// INSTANCE_CONNECTION_NAME format: project:region:instance
// Socket path: /cloudsql/project:region:instance/.s.PGSQL.5432

IMPORTANT: Unix socket paths are limited to 108 characters on Linux

  2. Private IP (VPC Connector):
spec:
  template:
    metadata:
      annotations:
        run.googleapis.com/vpc-access-connector: projects/PROJECT/locations/REGION/connectors/CONNECTOR
        run.googleapis.com/vpc-access-egress: private-ranges-only
    spec:
      containers:
      - env:
        - name: DB_HOST
          value: "10.1.2.3"  # Private IP
dsn := fmt.Sprintf("host=%s port=5432 user=%s password=%s dbname=%s sslmode=require",
    os.Getenv("DB_HOST"),
    os.Getenv("DB_USER"),
    os.Getenv("DB_PASSWORD"),
    os.Getenv("DB_NAME"),
)

SSL Configuration:

  • Unix socket: sslmode=disable (the Cloud SQL connection layer already encrypts)
  • Private IP: sslmode=require or verify-full

Server CA mode:

Cloud SQL instances for web apps typically use the default server CA mode:

GOOGLE_MANAGED_INTERNAL_CA

Connection pooling limits:

  • Cloud Run instance limit: 100 concurrent connections
  • Total connections = MaxOpenConns * num_instances
  • Example: MaxOpenConns=25 with 4 instances = 100 connections
  • Cloud SQL default max_connections: 100

Best practice:

sqlDB.SetMaxOpenConns(25)  // 25 * 4 instances = 100 total
sqlDB.SetMaxIdleConns(10)
sqlDB.SetConnMaxLifetime(5 * time.Minute)

Cloud SQL Proxy (1st gen vs 2nd gen):

  • 1st gen Cloud Run: embedded Cloud SQL Proxy v1
  • 2nd gen Cloud Run: support for custom CA hierarchies

Recommendation: use the 2nd gen execution environment

Secret Manager integration:

env:
- name: DB_PASSWORD
  valueFrom:
    secretKeyRef:
      name: db-password
      key: latest  # Or a specific version: '1'

Advantage: secret rotation without redeploying

Production Checklist

  1. Docker:

    • Multi-stage build
    • Distroless base image
    • Non-root user
    • Optimized binary (-ldflags="-s -w")
    • No secrets in the image
  2. Cloud Run:

    • PORT env var read from the environment
    • GIN_MODE=release
    • Memory/CPU sized for the workload
    • Min/max scale configured
  3. Environment variables:

    • Secrets via Secret Manager (not plain env vars)
    • INSTANCE_CONNECTION_NAME for Cloud SQL
    • < 1000 env vars total
  4. Health checks:

    • Fast /healthz endpoint (< 1s)
    • Startup probe configured
    • Liveness probe if needed
    • timeoutSeconds <= periodSeconds
  5. Timeout:

    • Adequate request timeout (is the 300s default enough?)
    • Long-running tasks via Asynq/Cloud Tasks
    • Context timeout handling in code
  6. Cloud SQL:

    • Unix socket (public IP) or private IP
    • Correct SSL mode (disable for socket, require for private IP)
    • Connection pool <= 25 per instance
    • Secret Manager for credentials

6. Stripe/Payment (Backend Go)

Stripe Go SDK Overview

Installation:

go get github.com/stripe/stripe-go/v76

Initialization:

import (
    "github.com/stripe/stripe-go/v76"
    "github.com/stripe/stripe-go/v76/customer"
    "github.com/stripe/stripe-go/v76/paymentintent"
)

func init() {
    stripe.Key = os.Getenv("STRIPE_SECRET_KEY")
}

Webhook Handling

Endpoint setup:

func handleWebhook(c *gin.Context) {
    const MaxBodyBytes = int64(65536)
    c.Request.Body = http.MaxBytesReader(c.Writer, c.Request.Body, MaxBodyBytes)

    // Read the RAW body (IMPORTANT: do not call c.BindJSON first!)
    payload, err := io.ReadAll(c.Request.Body)
    if err != nil {
        c.JSON(400, gin.H{"error": "invalid payload"})
        return
    }

    // Get signature header
    sig := c.GetHeader("Stripe-Signature")

    // Webhook secret (found in the Stripe Dashboard or via the stripe CLI)
    endpointSecret := os.Getenv("STRIPE_WEBHOOK_SECRET")

    // Construct event (verifies the signature)
    event, err := webhook.ConstructEvent(payload, sig, endpointSecret)
    if err != nil {
        c.JSON(400, gin.H{"error": "signature verification failed"})
        return
    }

    // Return 2xx BEFORE processing (Stripe requirement)
    c.JSON(200, gin.H{"received": true})

    // Process event asynchronously
    go processWebhookEvent(event)
}

// Setup route
router.POST("/webhook/stripe", handleWebhook)

CRITICAL - Webhook requirements:

  1. RAW body: the framework must not modify the request body
  2. Quick 2xx response: return the status code BEFORE processing
  3. Signature verification: ALWAYS verify the Stripe-Signature header

Signature Verification

Process:

  1. Extract the Stripe-Signature header
  2. Use the endpoint secret (from the dashboard or stripe listen)
  3. Verify with webhook.ConstructEvent()

Obtaining the endpoint secret:

# Development (Stripe CLI)
stripe listen --forward-to localhost:4242/webhook/stripe
# Output: whsec_xxxxx (use this one)

# Production (Stripe Dashboard)
# Dashboard > Webhooks > Endpoint > Signing secret
# whsec_xxxxx

Common errors:

"No signatures found matching the expected signature for payload"

Causes:

  • The framework modified the request body (whitespace, JSON parse, etc.)
  • Wrong endpoint secret
  • Signature header missing

Solution:

// WRONG - Gin modifies the body
var data map[string]interface{}
c.BindJSON(&data)  // NEVER do this before verifying the signature

// CORRECT - Read the raw body
payload, _ := io.ReadAll(c.Request.Body)
event, err := webhook.ConstructEvent(payload, sig, secret)

Signature expiration:

  • Stripe limits verification to 5 minutes (protection against replay attacks)
  • Libraries apply a 5-minute tolerance by default
  • Customize the tolerance:
// Custom tolerance (raising it is not recommended)
event, err := webhook.ConstructEventWithTolerance(
    payload,
    sig,
    endpointSecret,
    10*time.Minute,
)

Event Handler Patterns

Event types:

payment_intent.succeeded
payment_intent.payment_failed
customer.subscription.created
customer.subscription.updated
customer.subscription.deleted
invoice.payment_succeeded
invoice.payment_failed
charge.dispute.created

Handler pattern:

func processWebhookEvent(event stripe.Event) {
    switch event.Type {
    case "payment_intent.succeeded":
        var paymentIntent stripe.PaymentIntent
        if err := json.Unmarshal(event.Data.Raw, &paymentIntent); err != nil {
            log.Printf("Error parsing webhook JSON: %v", err)
            return
        }
        handlePaymentIntentSucceeded(paymentIntent)

    case "payment_intent.payment_failed":
        var paymentIntent stripe.PaymentIntent
        json.Unmarshal(event.Data.Raw, &paymentIntent)
        handlePaymentIntentFailed(paymentIntent)

    case "customer.subscription.deleted":
        var subscription stripe.Subscription
        json.Unmarshal(event.Data.Raw, &subscription)
        handleSubscriptionDeleted(subscription)

    default:
        log.Printf("Unhandled event type: %s", event.Type)
    }
}

func handlePaymentIntentSucceeded(pi stripe.PaymentIntent) {
    // Update database
    db.Model(&Order{}).
        Where("stripe_payment_intent_id = ?", pi.ID).
        Update("status", "paid")

    // Send confirmation email
    // ...
}

Idempotency (IMPORTANT):

Webhooks can be delivered MULTIPLE times. Handlers MUST be idempotent:

func handlePaymentIntentSucceeded(pi stripe.PaymentIntent) {
    // Check if already processed
    var order Order
    result := db.Where("stripe_payment_intent_id = ? AND status = ?", pi.ID, "paid").First(&order)

    if result.Error == nil {
        // Already processed
        log.Printf("Payment %s already processed", pi.ID)
        return
    }

    // Process payment
    db.Model(&Order{}).
        Where("stripe_payment_intent_id = ?", pi.ID).
        Update("status", "paid")
}

Multi-account routing (Connect):

// For Stripe Connect, the connected account ID comes in the event payload
if event.Account != "" {
    // Route to the specific account
    log.Printf("Event for account: %s", event.Account)
}

Idempotency Keys

Purpose: Prevent duplicate API requests (network retries)

Usage:

import "github.com/google/uuid"

params := &stripe.PaymentIntentParams{
    Amount:   stripe.Int64(1000),
    Currency: stripe.String("usd"),
}

// Generate idempotency key (UUID recommended)
params.IdempotencyKey = stripe.String(uuid.New().String())

// Or a composite key
params.IdempotencyKey = stripe.String(fmt.Sprintf("%s-%s", customerID, orderID))

pi, err := paymentintent.New(params)

Automatic retry (Go SDK):

The Go SDK automatically retries on:

  • Connection errors
  • Timeouts
  • Status 409 Conflict

Idempotency keys are added automatically on retries (safe).

Default retries: 2

Customize retries:

stripe.DefaultLeveledLogger = &stripe.LeveledLogger{
    Level: stripe.LevelDebug,
}

// Custom backend with max retries
config := &stripe.BackendConfig{
    MaxNetworkRetries: stripe.Int64(3),
}
stripe.SetBackend(stripe.APIBackend, stripe.GetBackendWithConfig(stripe.APIBackend, config))

Key behavior:

// Request 1 (success)
params.IdempotencyKey = stripe.String("key123")
pi, _ := paymentintent.New(params)  // Creates payment

// Request 2 (retry com mesma key)
params.IdempotencyKey = stripe.String("key123")
pi, _ := paymentintent.New(params)  // Returns SAME payment (no duplicate)

// Request 3 (mesma key, parametros DIFERENTES)
params.Amount = stripe.Int64(2000)  // Changed amount
params.IdempotencyKey = stripe.String("key123")
pi, err := paymentintent.New(params)
// err = IdempotencyException (parameters changed)

Key expiration:

  • Stripe retains idempotency keys for 24 hours
  • After 24 hours, the same key creates a new request
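Because keys expire after 24 hours, an alternative to generating and storing a random UUID per order is deriving the key deterministically from the operation and its parameters: retries of the same request reuse the key, while any parameter change produces a new one. An illustrative sketch (the derivation scheme and function name are ours, not a Stripe convention):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// derivedIdempotencyKey builds a stable key from the operation name, the
// order, and the request parameters. Retrying the identical request yields
// the same key; changing the amount yields a fresh one.
func derivedIdempotencyKey(operation string, orderID uint, amount int64) string {
	sum := sha256.Sum256([]byte(fmt.Sprintf("%s:%d:%d", operation, orderID, amount)))
	return hex.EncodeToString(sum[:]) // 64 hex chars, well under Stripe's 255-char limit
}

func main() {
	k1 := derivedIdempotencyKey("payment_intent.create", 42, 1000)
	k2 := derivedIdempotencyKey("payment_intent.create", 42, 1000)
	k3 := derivedIdempotencyKey("payment_intent.create", 42, 2000) // amount changed
	fmt.Println(k1 == k2, k1 == k3) // stable on retry, fresh on parameter change
}
```

The result would be passed as `params.IdempotencyKey = stripe.String(derivedIdempotencyKey(...))`; storing a random UUID with the order, as shown in the best practices below, works equally well.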

Best practices:

  1. Always use for POST requests:

// ALWAYS add an idempotency key on POST requests
params.IdempotencyKey = stripe.String(uuid.New().String())

  2. Generate a fresh key on parameter change:

if userChangedAmount {
    params.IdempotencyKey = stripe.String(uuid.New().String())
}

  3. Store the key with the order:

type Order struct {
    ID                     uint
    StripeIdempotencyKey   string
    StripePaymentIntentID  string
}

// Create order
order := Order{
    StripeIdempotencyKey: uuid.New().String(),
}
db.Create(&order)

// Use the stored key
params.IdempotencyKey = stripe.String(order.StripeIdempotencyKey)

  4. Never reuse keys across different operations:

// WRONG
key := uuid.New().String()
custParams := &stripe.CustomerParams{}
custParams.IdempotencyKey = stripe.String(key)
cust, _ := customer.New(custParams)

piParams := &stripe.PaymentIntentParams{}
piParams.IdempotencyKey = stripe.String(key) // SAME key - BAD!
paymentintent.New(piParams)

// CORRECT
custParams.IdempotencyKey = stripe.String(uuid.New().String())
cust, _ = customer.New(custParams)

piParams.IdempotencyKey = stripe.String(uuid.New().String()) // DIFFERENT key
paymentintent.New(piParams)

Error Handling

Error types:

import (
    "github.com/stripe/stripe-go/v76"
)

if err != nil {
    if stripeErr, ok := err.(*stripe.Error); ok {
        switch stripeErr.Type {
        case stripe.ErrorTypeCard:
            // Card was declined
            log.Printf("Card error: %s", stripeErr.Msg)

        case stripe.ErrorTypeRateLimit:
            // Rate limited
            log.Printf("Rate limit exceeded")

        case stripe.ErrorTypeInvalidRequest:
            // Invalid parameters
            log.Printf("Invalid request: %s", stripeErr.Msg)

        case stripe.ErrorTypeAPI:
            // Stripe API error
            log.Printf("API error: %s", stripeErr.Msg)

        case stripe.ErrorTypeAuthentication:
            // Authentication failed
            log.Printf("Auth error: %s", stripeErr.Msg)
        }

        // Access error details
        log.Printf("Code: %s, Param: %s", stripeErr.Code, stripeErr.Param)
    }
}

Common error codes:

card_declined          - Card was declined
insufficient_funds     - Insufficient funds
expired_card           - Card has expired
incorrect_cvc          - Incorrect CVC
processing_error       - Processing error
rate_limit             - Too many requests
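In a handler these codes arrive as stripeErr.Code; mapping them to user-safe messages keeps raw Stripe error text out of API responses. A minimal sketch (the helper name and message copy are ours, not Stripe's):

```go
package main

import "fmt"

// declineMessage maps common Stripe error codes to messages that are safe
// to show end users; anything unrecognized gets a generic fallback.
func declineMessage(code string) string {
	switch code {
	case "card_declined":
		return "Your card was declined."
	case "insufficient_funds":
		return "Your card has insufficient funds."
	case "expired_card":
		return "Your card has expired."
	case "incorrect_cvc":
		return "The security code is incorrect."
	case "processing_error":
		return "An error occurred while processing your card. Please try again."
	case "rate_limit":
		return "Too many requests. Please try again shortly."
	default:
		return "Payment failed. Please try another payment method."
	}
}

func main() {
	fmt.Println(declineMessage("card_declined"))
	fmt.Println(declineMessage("some_unknown_code"))
}
```

In the error-handling switch above this would be called as `declineMessage(string(stripeErr.Code))` before writing the client response.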

Retry logic:

func createPaymentIntent(amount int64) (*stripe.PaymentIntent, error) {
    params := &stripe.PaymentIntentParams{
        Amount:   stripe.Int64(amount),
        Currency: stripe.String("usd"),
    }

    maxRetries := 3
    for i := 0; i < maxRetries; i++ {
        pi, err := paymentintent.New(params)
        if err == nil {
            return pi, nil
        }

        if stripeErr, ok := err.(*stripe.Error); ok {
            // Retry only on rate limits or API errors
            if stripeErr.Type == stripe.ErrorTypeRateLimit ||
               stripeErr.Type == stripe.ErrorTypeAPI {
                time.Sleep(time.Duration(i+1) * time.Second)
                continue
            }

            // Do not retry card errors or invalid requests
            return nil, err
        }

        return nil, err
    }

    return nil, fmt.Errorf("max retries exceeded")
}

Signature verification failure (webhook):

event, err := webhook.ConstructEvent(payload, sig, secret)
if err != nil {
    // Signature verification failed
    log.Printf("Webhook signature verification failed: %v", err)
    c.JSON(400, gin.H{"error": "invalid signature"})
    return
}

Production Best Practices

1. API Keys:

// NEVER hardcode keys
// stripe.Key = "sk_live_xxxxx"

// ALWAYS load them from the environment
stripe.Key = os.Getenv("STRIPE_SECRET_KEY")

// Validate
if stripe.Key == "" {
    log.Fatal("STRIPE_SECRET_KEY not set")
}

2. Webhook endpoint:

// MUST be HTTPS in production
// Cloud Run serves HTTPS automatically

// Rate limiting
router.Use(rateLimitMiddleware())
router.POST("/webhook/stripe", handleWebhook)

// Logging
func handleWebhook(c *gin.Context) {
    requestID := c.GetHeader("X-Request-ID")
    log.Printf("[%s] Webhook received", requestID)
    // ...
}
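rateLimitMiddleware above is assumed rather than defined; one limiter it could wrap is a token bucket. A minimal, dependency-free sketch (in production, golang.org/x/time/rate provides the same semantics):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// tokenBucket is a minimal limiter of the kind rateLimitMiddleware could
// wrap: up to `capacity` burst requests, refilled at `rate` tokens/second.
type tokenBucket struct {
	mu       sync.Mutex
	tokens   float64
	capacity float64
	rate     float64
	last     time.Time
}

func newTokenBucket(capacity, rate float64) *tokenBucket {
	return &tokenBucket{tokens: capacity, capacity: capacity, rate: rate, last: time.Now()}
}

// Allow reports whether one more request may proceed right now.
func (b *tokenBucket) Allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()

	// Refill tokens for the time elapsed since the last call
	now := time.Now()
	b.tokens += now.Sub(b.last).Seconds() * b.rate
	if b.tokens > b.capacity {
		b.tokens = b.capacity
	}
	b.last = now

	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}

func main() {
	b := newTokenBucket(2, 1) // burst of 2, then 1 request/second
	fmt.Println(b.Allow(), b.Allow(), b.Allow()) // third call exhausts the burst
}
```

A Gin middleware would call Allow() per request (ideally keyed by client IP) and respond 429 when it returns false.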

3. Endpoint registration:

Dashboard > Webhooks > Add endpoint
URL: https://app.typecraft.com/webhook/stripe
Events: Select specific events (not "all events")

4. Testing:

# Development
stripe listen --forward-to localhost:8080/webhook/stripe

# Trigger test events
stripe trigger payment_intent.succeeded
stripe trigger customer.subscription.deleted

5. Monitoring:

// Log all Stripe API calls
stripe.DefaultLeveledLogger = &stripe.LeveledLogger{
    Level: stripe.LevelInfo,
}

// Custom logger
type CustomLogger struct{}

func (l *CustomLogger) Debugf(format string, v ...interface{}) {
    log.Printf("[DEBUG] "+format, v...)
}

func (l *CustomLogger) Infof(format string, v ...interface{}) {
    log.Printf("[INFO] "+format, v...)
}

func (l *CustomLogger) Warnf(format string, v ...interface{}) {
    log.Printf("[WARN] "+format, v...)
}

func (l *CustomLogger) Errorf(format string, v ...interface{}) {
    log.Printf("[ERROR] "+format, v...)
}

stripe.DefaultLeveledLogger = &CustomLogger{}

6. Error alerting:

func handlePaymentIntentFailed(pi stripe.PaymentIntent) {
    // LastPaymentError may be nil; guard before dereferencing
    errMsg := "unknown error"
    if pi.LastPaymentError != nil {
        errMsg = pi.LastPaymentError.Message
    }

    // Log failure
    log.Printf("Payment failed: %s, amount: %d, error: %s",
        pi.ID, pi.Amount, errMsg)

    // Alert admin (email, Slack, PagerDuty, etc)
    alertAdmin(fmt.Sprintf("Payment %s failed", pi.ID))

    // Update order status
    db.Model(&Order{}).
        Where("stripe_payment_intent_id = ?", pi.ID).
        Updates(map[string]interface{}{
            "status":        "failed",
            "error_message": errMsg,
        })
}

7. Idempotency strategy:

type PaymentService struct {
    db *gorm.DB
}

func (s *PaymentService) CreatePaymentIntent(orderID uint, amount int64) (*stripe.PaymentIntent, error) {
    // Get or create idempotency key
    var order Order
    if err := s.db.First(&order, orderID).Error; err != nil {
        return nil, err
    }

    if order.StripeIdempotencyKey == "" {
        order.StripeIdempotencyKey = uuid.New().String()
        s.db.Save(&order)
    }

    // Create payment intent
    params := &stripe.PaymentIntentParams{
        Amount:   stripe.Int64(amount),
        Currency: stripe.String("usd"),
    }
    params.IdempotencyKey = stripe.String(order.StripeIdempotencyKey)

    pi, err := paymentintent.New(params)
    if err != nil {
        return nil, err
    }

    // Store payment intent ID
    order.StripePaymentIntentID = pi.ID
    s.db.Save(&order)

    return pi, nil
}

Production Checklist

  1. API Keys:

    • Secret key via environment (STRIPE_SECRET_KEY)
    • Webhook secret via environment (STRIPE_WEBHOOK_SECRET)
    • Never commit keys to the codebase
    • Use test keys in development
  2. Webhook:

    • HTTPS endpoint in production
    • ALWAYS verify the signature
    • Return 2xx BEFORE processing
    • Idempotent handler
    • Async processing (do not block the response)
  3. Idempotency:

    • UUIDs for idempotency keys
    • Store keys with orders
    • Fresh key on parameter changes
    • Never reuse keys across operations
  4. Error handling:

    • Type assertion to *stripe.Error
    • Retry logic for rate limits
    • Do not retry card errors
    • Log all errors
    • Alert on payment failures
  5. Testing:

    • Stripe CLI for development
    • Test mode for staging
    • Webhook test events
    • Integration tests with test cards
  6. Monitoring:

    • Log Stripe API calls
    • Alert on high failure rate
    • Dashboard for payment metrics
    • Webhook delivery monitoring (Stripe Dashboard)

EXECUTIVE SUMMARY - CRITICAL POINTS

1. PostgreSQL + GORM

MUST DO:

  • Connection pooling: MaxOpenConns=25, MaxIdleConns=10, ConnMaxLifetime=5min
  • SSL mode: verify-full in production (Cloud SQL)
  • Error handling: TranslateError: true + errors.Is()
  • Migrations: AutoMigrate in dev, versioned migrations in production

AVOID:

  • sslmode=disable in production
  • Silently ignoring errors
  • MaxOpenConns > 100 without calculating the total (replicas * MaxOpenConns)

2. Redis + Asynq

MUST DO:

  • AOF persistence enabled
  • JSON unmarshal errors wrapped with SkipRetry
  • Appropriate concurrency (10-50 for production)
  • Health checks and monitoring (Inspector API)

AVOID:

  • RDB-only persistence (task loss)
  • Infinite retries on malformed payloads
  • Excessive concurrency (CPU/memory issues)

3. Gin Framework

MUST DO:

  • CORS with specific origins (never * with credentials)
  • Middleware order: CORS → Recovery → Logging → Auth
  • AbortWithStatusJSON for errors
  • Graceful shutdown with timeout < K8s grace period

AVOID:

  • AllowAllOrigins with AllowCredentials
  • c.JSON for errors (use Abort)
  • Hardcoded :8080 (use the PORT env var)

4. Chromedp

MUST DO:

  • NoSandbox + DisableDevShmUsage in Docker
  • Browser pooling or reuse
  • WaitReady before PrintToPDF
  • Timeouts only on specific actions (never on the first Run)

AVOID:

  • Timeout on the first Run (kills the browser)
  • Creating a new browser per request (slow)
  • PrintToPDF without waiting (images fail to load)

5. Docker/Cloud Run

MUST DO:

  • Multi-stage build with distroless
  • Non-root user
  • Secret Manager for credentials
  • Health checks (/healthz, fast, < 1s)
  • Unix socket for Cloud SQL (via INSTANCE_CONNECTION_NAME)

AVOID:

  • Single-stage builds (huge images)
  • Root user in the container
  • Secrets in plain env vars
  • DB queries in the health check

6. Stripe

MUST DO:

  • ALWAYS verify the webhook signature
  • Return 2xx BEFORE processing
  • Idempotency keys on POST requests
  • Idempotent webhook handlers

AVOID:

  • Processing webhooks without verifying the signature
  • Blocking the webhook response (process async)
  • Reusing idempotency keys
  • Retrying card errors

END OF PHASE 0.1 - COMPILED EXTERNAL DOCUMENTATION

This document contains actionable technical information for implementing TypeCraft following production best practices (2025).

Total pages: 47 | Total words: ~15,000 | Tools covered: 6/6 (100%)