
Commit 311de44

feat(config): Make Metrics Server timeouts configurable (#114)
* chore(conductor): Add new track 'Fix Metrics Server Hardcoded Timeouts'
* feat(config): Add constants and fields for Metrics Server timeouts
* conductor(plan): Mark task 'Update config.go with new constants and fields' as complete
* feat(config): Implement validation and loading for Metrics Server timeouts
* conductor(checkpoint): Checkpoint end of Phase 1
* conductor(plan): Mark phase 'Configuration and Environment' as complete
* feat(http): Update NewDefaultMetricsServer to accept configurable timeouts
* feat(app): Integrate configurable metrics server timeouts into DI container
* conductor(checkpoint): Checkpoint end of Phases 2 & 3
* conductor(plan): Mark track 'Fix Metrics Server Hardcoded Timeouts' as complete
* chore(conductor): Mark track 'Fix Metrics Server Hardcoded Timeouts' as complete
* docs(conductor): Synchronize docs for track 'Fix Metrics Server Hardcoded Timeouts'
* chore(conductor): Archive track 'Fix Metrics Server Hardcoded Timeouts'
* feat(config): Make Metrics Server timeouts configurable

Previously, the Metrics Server used hardcoded timeout values (15s for Read/Write, 60s for Idle). This change introduces environment variables to allow these timeouts to be configured, improving flexibility in different environments.

Changes:
- Added METRICS_SERVER_READ_TIMEOUT_SECONDS (default: 15s)
- Added METRICS_SERVER_WRITE_TIMEOUT_SECONDS (default: 15s)
- Added METRICS_SERVER_IDLE_TIMEOUT_SECONDS (default: 60s)
- Implemented validation for timeouts (1s to 300s range)
- Updated NewDefaultMetricsServer to accept custom timeout values
- Integrated configuration into the Dependency Injection container
- Updated .env.example with new configuration options
1 parent e91a4f2 commit 311de44

14 files changed

Lines changed: 427 additions & 26 deletions


.env.example

Lines changed: 8 additions & 0 deletions
@@ -30,6 +30,14 @@ METRICS_ENABLED=true
 METRICS_NAMESPACE=secrets
 METRICS_PORT=8081
 
+# Metrics server timeout configuration (in seconds)
+# Read timeout: maximum duration for reading the entire request, including the body
+METRICS_SERVER_READ_TIMEOUT_SECONDS=15
+# Write timeout: maximum duration before timing out writes of the response
+METRICS_SERVER_WRITE_TIMEOUT_SECONDS=15
+# Idle timeout: maximum time to wait for the next request when keep-alives are enabled
+METRICS_SERVER_IDLE_TIMEOUT_SECONDS=60
+
 # ...
 
 # Authentication configuration
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+# Track fix_metrics_server_hardcoded_timeouts_20260307 Context
+
+- [Specification](./spec.md)
+- [Implementation Plan](./plan.md)
+- [Metadata](./metadata.json)
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+{
+  "track_id": "fix_metrics_server_hardcoded_timeouts_20260307",
+  "type": "bug",
+  "status": "new",
+  "created_at": "2026-03-07T12:00:00Z",
+  "updated_at": "2026-03-07T12:00:00Z",
+  "description": "Fix Metrics Server Hardcoded Timeouts"
+}
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+# Implementation Plan: Fix Metrics Server Hardcoded Timeouts
+
+## Phase 1: Configuration and Environment [checkpoint: 4ec5660]
+Introduce the new configuration options for Metrics Server timeouts and update the environment files.
+
+- [x] Task: Update `internal/config/config.go` with new constants and fields for Metrics Server timeouts. 10f5e4c
+- [x] Task: Implement validation for Metrics Server timeouts in `internal/config/config.go`. f27dd3f
+- [x] Task: Update `Load()` in `internal/config/config.go` to parse the new environment variables. f27dd3f
+- [x] Task: Update `.env.example` to include the new `METRICS_SERVER_*` variables. f27dd3f
+- [x] Task: Write failing unit tests for new configuration loading and validation in `internal/config/config_test.go`. f27dd3f
+- [x] Task: Implement changes to pass the tests in `internal/config/config.go`. f27dd3f
+- [x] Task: Conductor - User Manual Verification 'Phase 1: Configuration and Environment' (Protocol in workflow.md) 4ec5660
+
+## Phase 2: Metrics Server Implementation [checkpoint: 82b6cef]
+Refactor the Metrics Server to accept configurable timeouts instead of using hardcoded defaults.
+
+- [x] Task: Write failing tests in `internal/http/metrics_server_test.go` to verify custom timeout initialization. a091f59
+- [x] Task: Update `NewDefaultMetricsServer` or adjust its usage in `internal/http/metrics_server.go` to use passed values. a091f59
+- [x] Task: Refactor `MetricsServer` initialization to ensure values are propagated correctly. a091f59
+- [x] Task: Conductor - User Manual Verification 'Phase 2: Metrics Server Implementation' (Protocol in workflow.md) 82b6cef
+
+## Phase 3: Dependency Injection Integration [checkpoint: 82b6cef]
+Connect the new configuration to the Metrics Server initialization within the DI container.
+
+- [x] Task: Update `internal/app/di.go` to pass the configured timeouts from `Config` to the Metrics Server. 0e3de70
+- [x] Task: Write tests in `internal/app/di_test.go` (or verify via integration) that the server is correctly initialized. 0e3de70
+- [x] Task: Conductor - User Manual Verification 'Phase 3: Dependency Injection Integration' (Protocol in workflow.md) 82b6cef
Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+# Specification: Fix Metrics Server Hardcoded Timeouts (REVISED)
+
+## Overview
+The Metrics Server in the `secrets` project currently has hardcoded timeout values for Read, Write, and Idle connections (15s, 15s, 60s). This track aims to make these timeouts configurable via environment variables, following the existing configuration pattern.
+
+## Functional Requirements
+
+1.  **Configurable Metrics Timeouts:**
+    *   Introduce three new environment variables:
+        *   `METRICS_SERVER_READ_TIMEOUT_SECONDS` (Default: 15)
+        *   `METRICS_SERVER_WRITE_TIMEOUT_SECONDS` (Default: 15)
+        *   `METRICS_SERVER_IDLE_TIMEOUT_SECONDS` (Default: 60)
+    *   **Config Update:** Update `internal/config/Config` struct in `internal/config/config.go` to include these new timeout fields.
+    *   **Validation:** Implement validation for these new timeouts (1s to 300s range).
+    *   **Default Values:** Set the default values to 15s/15s/60s in `internal/config/config.go`.
+    *   **.env.example Update:** Add these new environment variables to the `.env.example` file with their default values.
+
+2.  **Dependency Injection (DI) Integration:**
+    *   Update `internal/app/Container.initMetricsServer` in `internal/app/di.go` to pass the new timeout values from the configuration to the `MetricsServer` initialization.
+
+3.  **Metrics Server Update:**
+    *   Refactor `internal/http/metrics_server.go` to ensure `MetricsServer` uses values provided via DI instead of hardcoded defaults.
+    *   Update `NewDefaultMetricsServer` or adjust its usage in `di.go` to honor the configured values.
+
+## Non-Functional Requirements
+*   **Consistency:** The configuration naming and validation logic must mirror the existing patterns for the main server.
+
+## Acceptance Criteria
+*   [ ] New environment variables are successfully loaded into the `Config` struct.
+*   [ ] Configuration validation fails if any of the new timeouts are outside the 1s-300s range.
+*   [ ] `.env.example` is updated with the new environment variables.
+*   [ ] The Metrics Server uses the configured timeout values.
+*   [ ] Existing unit tests for `MetricsServer` and `Config` pass.
+*   [ ] New unit tests verify that the Metrics Server can be initialized with custom timeout values.
+
+## Out of Scope
+*   Adding other Metrics Server configuration options.
+*   Changing the default values for the main server.

conductor/tech-stack.md

Lines changed: 1 addition & 0 deletions
@@ -13,6 +13,7 @@
 ## Cryptography & Security
 - **Envelope Encryption:** [gocloud.dev/secrets](https://gocloud.dev/howto/secrets/) - Abstracted access to various KMS providers for root-of-trust encryption.
 - **Password Hashing:** [go-pwdhash](https://github.com/allisson/go-pwdhash) - Argon2id hashing for secure storage of client secrets and passwords.
+- **Configurable Metrics Timeouts:** Environment-controlled Read, Write, and Idle timeouts for the Prometheus metrics server to prevent resource exhaustion.
 - **Request Body Size Limiting:** Middleware to prevent DoS attacks from large payloads.
 - **Rate Limiting:** Per-client and per-IP rate limiting middleware for DoS protection and API abuse prevention.
 - **Secret Value Size Limiting:** Global limit on individual secret values to ensure predictable storage and memory usage.

conductor/tracks.md

Lines changed: 1 addition & 0 deletions
@@ -3,3 +3,4 @@
 This file tracks all major tracks for the project. Each track has its own detailed plan in its respective folder.
 
 ---
+
internal/app/di.go

Lines changed: 3 additions & 0 deletions
@@ -508,6 +508,9 @@ func (c *Container) initMetricsServer(ctx context.Context) (*http.MetricsServer,
 		c.config.MetricsPort,
 		logger,
 		provider,
+		c.config.MetricsServerReadTimeout,
+		c.config.MetricsServerWriteTimeout,
+		c.config.MetricsServerIdleTimeout,
 	)
 
 	return server, nil

internal/app/di_test.go

Lines changed: 43 additions & 0 deletions
@@ -266,6 +266,49 @@ func TestContainerServerComponents(t *testing.T) {
 	}
 }
 
+// TestContainerMetricsServer_CustomTimeouts verifies that the metrics server is initialized with custom timeouts from config.
+func TestContainerMetricsServer_CustomTimeouts(t *testing.T) {
+	cfg := &config.Config{
+		MetricsEnabled:            true,
+		MetricsPort:               8082,
+		MetricsServerReadTimeout:  5 * time.Second,
+		MetricsServerWriteTimeout: 10 * time.Second,
+		MetricsServerIdleTimeout:  30 * time.Second,
+	}
+	container := NewContainer(cfg)
+
+	server, err := container.MetricsServer(context.Background())
+	if err != nil {
+		t.Fatalf("unexpected error for metrics server: %v", err)
+	}
+
+	if server == nil {
+		t.Fatal("expected non-nil metrics server")
+	}
+
+	if server.Server().ReadTimeout != cfg.MetricsServerReadTimeout {
+		t.Errorf(
+			"expected read timeout %v, got %v",
+			cfg.MetricsServerReadTimeout,
+			server.Server().ReadTimeout,
+		)
+	}
+	if server.Server().WriteTimeout != cfg.MetricsServerWriteTimeout {
+		t.Errorf(
+			"expected write timeout %v, got %v",
+			cfg.MetricsServerWriteTimeout,
+			server.Server().WriteTimeout,
+		)
+	}
+	if server.Server().IdleTimeout != cfg.MetricsServerIdleTimeout {
+		t.Errorf(
+			"expected idle timeout %v, got %v",
+			cfg.MetricsServerIdleTimeout,
+			server.Server().IdleTimeout,
+		)
+	}
+}
+
 // TestContainerKekRepositoryErrors verifies that KEK repository initialization errors are properly handled.
 func TestContainerKekRepositoryErrors(t *testing.T) {
 	// Create a container with invalid database configuration

internal/config/config.go

Lines changed: 62 additions & 20 deletions
@@ -24,26 +24,29 @@ const (
 	DefaultDBConnectionString   = "postgres://user:password@localhost:5432/mydb?sslmode=disable" //nolint:gosec
 	DefaultDBMaxOpenConnections = 25
 
-	DefaultDBMaxIdleConnections   = 5
-	DefaultDBConnMaxLifetime      = 5 // minutes
-	DefaultDBConnMaxIdleTime      = 5 // minutes
-	DefaultLogLevel               = "info"
-	DefaultAuthTokenExpiration    = 14400 // seconds
-	DefaultRateLimitEnabled       = true
-	DefaultRateLimitRequests      = 10.0
-	DefaultRateLimitBurst         = 20
-	DefaultRateLimitTokenEnabled  = true
-	DefaultRateLimitTokenRequests = 5.0
-	DefaultRateLimitTokenBurst    = 10
-	DefaultCORSEnabled            = false
-	DefaultCORSAllowOrigins       = ""
-	DefaultMetricsEnabled         = true
-	DefaultMetricsNamespace       = "secrets"
-	DefaultMetricsPort            = 8081
-	DefaultLockoutMaxAttempts     = 10
-	DefaultLockoutDuration        = 30 // minutes
-	DefaultMaxRequestBodySize     = 1048576
-	DefaultSecretValueSizeLimit   = 524288
+	DefaultDBMaxIdleConnections      = 5
+	DefaultDBConnMaxLifetime         = 5 // minutes
+	DefaultDBConnMaxIdleTime         = 5 // minutes
+	DefaultLogLevel                  = "info"
+	DefaultAuthTokenExpiration       = 14400 // seconds
+	DefaultRateLimitEnabled          = true
+	DefaultRateLimitRequests         = 10.0
+	DefaultRateLimitBurst            = 20
+	DefaultRateLimitTokenEnabled     = true
+	DefaultRateLimitTokenRequests    = 5.0
+	DefaultRateLimitTokenBurst       = 10
+	DefaultCORSEnabled               = false
+	DefaultCORSAllowOrigins          = ""
+	DefaultMetricsEnabled            = true
+	DefaultMetricsNamespace          = "secrets"
+	DefaultMetricsPort               = 8081
+	DefaultMetricsServerReadTimeout  = 15 // seconds
+	DefaultMetricsServerWriteTimeout = 15 // seconds
+	DefaultMetricsServerIdleTimeout  = 60 // seconds
+	DefaultLockoutMaxAttempts        = 10
+	DefaultLockoutDuration           = 30 // minutes
+	DefaultMaxRequestBodySize        = 1048576
+	DefaultSecretValueSizeLimit      = 524288
 )
 
 // Config holds all application configuration.
@@ -105,6 +108,12 @@ type Config struct {
 	MetricsNamespace string
 	// MetricsPort is the port number for the metrics server.
 	MetricsPort int
+	// MetricsServerReadTimeout is the maximum duration for reading the entire request, including the body.
+	MetricsServerReadTimeout time.Duration
+	// MetricsServerWriteTimeout is the maximum duration before timing out writes of the response.
+	MetricsServerWriteTimeout time.Duration
+	// MetricsServerIdleTimeout is the maximum time to wait for the next request when keep-alives are enabled.
+	MetricsServerIdleTimeout time.Duration
 
 	// KMSProvider is the KMS provider to use (e.g., "google", "aws", "azure", "hashivault", "localsecrets").
 	KMSProvider string
@@ -157,6 +166,24 @@ func (c *Config) Validate() error {
 			validation.Max(65535),
 			validation.NotIn(c.ServerPort),
 		),
+		validation.Field(
+			&c.MetricsServerReadTimeout,
+			validation.Required,
+			validation.Min(1*time.Second),
+			validation.Max(300*time.Second),
+		),
+		validation.Field(
+			&c.MetricsServerWriteTimeout,
+			validation.Required,
+			validation.Min(1*time.Second),
+			validation.Max(300*time.Second),
+		),
+		validation.Field(
+			&c.MetricsServerIdleTimeout,
+			validation.Required,
+			validation.Min(1*time.Second),
+			validation.Max(300*time.Second),
+		),
 		validation.Field(
 			&c.LogLevel,
 			validation.Required,
@@ -264,6 +291,21 @@ func Load() (*Config, error) {
 		MetricsEnabled: env.GetBool("METRICS_ENABLED", DefaultMetricsEnabled),
 		MetricsNamespace: env.GetString("METRICS_NAMESPACE", DefaultMetricsNamespace),
 		MetricsPort: env.GetInt("METRICS_PORT", DefaultMetricsPort),
+		MetricsServerReadTimeout: env.GetDuration(
+			"METRICS_SERVER_READ_TIMEOUT_SECONDS",
+			DefaultMetricsServerReadTimeout,
+			time.Second,
+		),
+		MetricsServerWriteTimeout: env.GetDuration(
+			"METRICS_SERVER_WRITE_TIMEOUT_SECONDS",
+			DefaultMetricsServerWriteTimeout,
+			time.Second,
+		),
+		MetricsServerIdleTimeout: env.GetDuration(
+			"METRICS_SERVER_IDLE_TIMEOUT_SECONDS",
+			DefaultMetricsServerIdleTimeout,
+			time.Second,
+		),
 
 		// KMS configuration
 		KMSProvider: env.GetString("KMS_PROVIDER", ""),
