Skip to content

Commit df94395

Browse files
authored
feat(auth): fix rate limiter goroutine lifecycle and resource leaks (#112)
- Updated RateLimitMiddleware and TokenRateLimitMiddleware to accept context.Context - Integrated rate limiter cleanup with the application lifecycle via SetupRouter - Added automated goroutine leak verification using goleak - Updated tech-stack documentation to include rate limiting - Archived completed conductor track 'Fix Rate Limiter Goroutine Lifecycle'
1 parent 50d1e9d commit df94395

14 files changed

Lines changed: 144 additions & 19 deletions

File tree

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
# Track fix_rate_limiter_lifecycle_20260306 Context
2+
3+
- [Specification](./spec.md)
4+
- [Implementation Plan](./plan.md)
5+
- [Metadata](./metadata.json)
Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
{
2+
"track_id": "fix_rate_limiter_lifecycle_20260306",
3+
"type": "bug",
4+
"status": "new",
5+
"created_at": "2026-03-06T10:00:00Z",
6+
"updated_at": "2026-03-06T10:00:00Z",
7+
"description": "Fix Rate Limiter Goroutine Lifecycle"
8+
}
Lines changed: 32 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,32 @@
1+
# Implementation Plan: Fix Rate Limiter Goroutine Lifecycle
2+
3+
## Phase 1: Research and Infrastructure [checkpoint: b7b5bbd]
4+
- [x] Task: Identify the rate limiter implementation and initialization points b7b5bbd
5+
- [x] Locate the rate limiter package in `internal/`
6+
- [x] Find all call sites where the rate limiter is initialized
7+
- [x] Task: Identify existing tests and reproduction steps b7b5bbd
8+
- [x] Locate existing unit tests for the rate limiter
9+
- [x] Confirm how the cleanup goroutine is currently started
10+
- [x] Task: Conductor - User Manual Verification 'Research and Infrastructure' (Protocol in workflow.md) b7b5bbd
11+
12+
## Phase 2: Implementation (TDD) [checkpoint: b9177d8]
13+
- [x] Task: Write failing test for goroutine leak c072dd9
14+
- [x] Create a test case that initializes the rate limiter and then cancels the context
15+
- [x] Verify that the cleanup goroutine persists after cancellation (failing state)
16+
- [x] Task: Update rate limiter initialization to accept context 2b5cbc2
17+
- [x] Modify the signature of the initialization function to include `context.Context`
18+
- [x] Task: Update cleanup goroutine to use context 2b5cbc2
19+
- [x] Modify the cleanup loop to listen for `ctx.Done()`
20+
- [x] Task: Update call sites for rate limiter initialization 2b5cbc2
21+
- [x] Update all middleware and/or `main.go` call sites to pass the appropriate context
22+
- [x] Task: Verify fix with automated tests 2b5cbc2
23+
- [x] Run the failing test and ensure it now passes
24+
- [x] Run the full test suite to ensure no regressions
25+
- [x] Task: Conductor - User Manual Verification 'Implementation' (Protocol in workflow.md) b9177d8
26+
27+
## Phase 3: Final Verification and Cleanup [checkpoint: 9b2011d]
28+
- [x] Task: Run all tests and verify no regressions 2b5cbc2
29+
- [x] Execute `make test` and `make test-all`
30+
- [x] Task: Verify code coverage for new changes 2b5cbc2
31+
- [x] Ensure new code has >80% coverage
32+
- [x] Task: Conductor - User Manual Verification 'Final Verification and Cleanup' (Protocol in workflow.md) 9b2011d
Lines changed: 24 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,24 @@
1+
# Specification: Fix Rate Limiter Goroutine Lifecycle
2+
3+
## Overview
4+
Fix the rate limiter goroutine lifecycle to prevent resource leaks during server shutdown. Currently, the cleanup goroutine uses `context.Background()`, which prevents it from being cancelled when the application stops.
5+
6+
## Functional Requirements
7+
- **Context Acceptance:** The rate limiter initialization must accept a `context.Context` parameter.
8+
- **Graceful Shutdown:** The cleanup goroutine responsible for purging expired rate limit entries must use the provided context to stop its execution when the context is cancelled.
9+
- **Initialization Update:** All call sites that initialize the rate limiter must be updated to provide a valid application context (typically derived from the main server context).
10+
11+
## Non-Functional Requirements
12+
- **Performance:** The fix should not introduce any measurable overhead to the rate limiting logic or hot paths.
13+
- **Reliability:** The rate limiter should continue to function correctly even if the context is cancelled (i.e., it should fail-safe or stop gracefully).
14+
15+
## Acceptance Criteria
16+
- [ ] The rate limiter's `New` or initialization function accepts `context.Context`.
17+
- [ ] The cleanup goroutine correctly stops when the context is cancelled.
18+
- [ ] Automated tests confirm that the goroutine is cleaned up (e.g., using `goleak` or similar verification).
19+
- [ ] All existing tests pass.
20+
21+
## Out of Scope
22+
- Refactoring the rate limiting algorithm or storage backend.
23+
- Adding new rate limiting features (e.g., per-user vs per-IP changes).
24+
- Changes to other middleware or handlers unless directly related to initializing the rate limiter.

conductor/tech-stack.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,7 @@
1414
- **Envelope Encryption:** [gocloud.dev/secrets](https://gocloud.dev/howto/secrets/) - Abstracted access to various KMS providers for root-of-trust encryption.
1515
- **Password Hashing:** [go-pwdhash](https://github.com/allisson/go-pwdhash) - Argon2id hashing for secure storage of client secrets and passwords.
1616
- **Request Body Size Limiting:** Middleware to prevent DoS attacks from large payloads.
17+
- **Rate Limiting:** Per-client and per-IP rate limiting middleware for DoS protection and API abuse prevention.
1718
- **Secret Value Size Limiting:** Global limit on individual secret values to ensure predictable storage and memory usage.
1819
- **Strict Capability Validation:** Centralized domain helpers for validating policy capabilities (`read`, `write`, `delete`, `encrypt`, `decrypt`, `rotate`) in CLI and API layers.
1920
- **Secret Path Validation:** Strict naming rules for secret paths (alphanumeric, -, _, /) to ensure consistency and security.

conductor/tracks.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,4 +2,4 @@
22

33
This file tracks all major tracks for the project. Each track has its own detailed plan in its respective folder.
44

5-
---
5+

go.mod

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -23,6 +23,7 @@ require (
2323
go.opentelemetry.io/otel/metric v1.41.0
2424
go.opentelemetry.io/otel/sdk v1.41.0
2525
go.opentelemetry.io/otel/sdk/metric v1.41.0
26+
go.uber.org/goleak v1.3.0
2627
gocloud.dev v0.45.0
2728
gocloud.dev/secrets/hashivault v0.45.0
2829
golang.org/x/crypto v0.48.0

internal/app/di.go

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -469,6 +469,7 @@ func (c *Container) initHTTPServer(ctx context.Context) (*http.Server, error) {
469469

470470
// Setup router with dependencies
471471
server.SetupRouter(
472+
ctx,
472473
c.config,
473474
clientHandler,
474475
tokenHandler,
Lines changed: 39 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,39 @@
1+
package http
2+
3+
import (
4+
"context"
5+
"log/slog"
6+
"testing"
7+
8+
"go.uber.org/goleak"
9+
)
10+
11+
func TestRateLimitMiddleware_GoroutineLeak(t *testing.T) {
12+
// Ensure no goroutines are leaking after the test
13+
defer goleak.VerifyNone(t)
14+
15+
// Create a cancellable context
16+
ctx, cancel := context.WithCancel(context.Background())
17+
18+
// Create middleware (this starts the goroutine)
19+
logger := slog.Default()
20+
_ = RateLimitMiddleware(ctx, 10.0, 20, logger)
21+
22+
// Cancel the context - this should stop the goroutine
23+
cancel()
24+
}
25+
26+
func TestTokenRateLimitMiddleware_GoroutineLeak(t *testing.T) {
27+
// Ensure no goroutines are leaking after the test
28+
defer goleak.VerifyNone(t)
29+
30+
// Create a cancellable context
31+
ctx, cancel := context.WithCancel(context.Background())
32+
33+
// Create middleware (this starts the goroutine)
34+
logger := slog.Default()
35+
_ = TokenRateLimitMiddleware(ctx, 10.0, 20, logger)
36+
37+
// Cancel the context - this should stop the goroutine
38+
cancel()
39+
}

internal/auth/http/rate_limit_middleware.go

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -38,20 +38,21 @@ type rateLimiterEntry struct {
3838
// rate limiter based on their client ID.
3939
//
4040
// Configuration:
41+
// - ctx: Application context for cleanup goroutine
4142
// - rps: Requests per second allowed per client
4243
// - burst: Maximum burst capacity for temporary spikes
4344
//
4445
// Returns:
4546
// - 429 Too Many Requests: Rate limit exceeded (includes Retry-After header)
4647
// - Continues: Request allowed within rate limit
47-
func RateLimitMiddleware(rps float64, burst int, logger *slog.Logger) gin.HandlerFunc {
48+
func RateLimitMiddleware(ctx context.Context, rps float64, burst int, logger *slog.Logger) gin.HandlerFunc {
4849
store := &rateLimiterStore{
4950
rps: rps,
5051
burst: burst,
5152
}
5253

5354
// Start cleanup goroutine for stale limiters (every 5 minutes)
54-
go store.cleanupStale(context.Background(), 5*time.Minute)
55+
go store.cleanupStale(ctx, 5*time.Minute)
5556

5657
return func(c *gin.Context) {
5758
// Get authenticated client from context

0 commit comments

Comments
 (0)