Commit b98d9c7

General code cleanup and more test coverage (#60)
* Only record response body in log
  Fix header/status order
  fix url shadowing
  just log tmpl errors since the status order is tricky
  typo
* General code cleanup
* more tests
* f
* more tests
* codecov
* fixup cleanup
1 parent 5e7d988 commit b98d9c7

10 files changed

Lines changed: 1165 additions & 99 deletions

.github/workflows/lint-test.yml

Lines changed: 12 additions & 1 deletion
@@ -13,7 +13,7 @@ jobs:
 
       - uses: actions/setup-go@44694675825211faa026b3c33043df3e48a5fa00 # v6
         with:
-          go-version: '>=1.24.0'
+          go-version: ">=1.25.0"
 
       - name: golangci-lint
         uses: golangci/golangci-lint-action@4afd733a84b1f43292c63897423277bb7f4313a9 # v8
@@ -42,6 +42,17 @@ jobs:
       - name: unit test
         run: go test -v -race ./...
 
+      - name: generate coverage
+        run: go test -coverprofile=coverage.out -covermode=atomic ./...
+
+      - name: upload coverage to codecov
+        uses: codecov/codecov-action@v5
+        with:
+          files: ./coverage.out
+          fail_ci_if_error: false
+        env:
+          CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
+
   integration-test:
     needs: [run]
     permissions:

CLAUDE.md

Lines changed: 171 additions & 0 deletions
@@ -0,0 +1,171 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Overview
+
+This is a Traefik middleware plugin that protects websites from bot traffic by challenging individual IPs with CAPTCHAs when traffic spikes are detected from their subnet. The plugin supports Cloudflare Turnstile, Google reCAPTCHA, and hCaptcha.
+
+**Key concept**: Instead of rate limiting individual IPs, this plugin monitors traffic at the subnet level (e.g., /16 for IPv4, /64 for IPv6) and only challenges specific IPs when their entire subnet exceeds a configured rate limit.
+
+## Architecture
+
+### Core Components
+
+- **main.go** (`main.go:1-761`): Contains the entire middleware implementation in a single file
+  - `CaptchaProtect` struct: Main middleware handler with rate limiting, bot detection, and challenge serving
+  - `Config` struct: Configuration from Traefik labels
+  - Three in-memory caches (using `github.com/patrickmn/go-cache`):
+    - `rateCache`: Tracks request counts per subnet
+    - `verifiedCache`: Stores IPs that have passed challenges (24h default TTL)
+    - `botCache`: Caches reverse DNS lookups for bot verification
+
+### Request Flow Decision Tree
+
+The middleware follows this decision order (see `shouldApply()` at `main.go:422-453`):
+
+1. Check if HTTP method is protected (default: GET, HEAD)
+2. Check if IP already verified (passed challenge recently)
+3. Check if IP is in exemptIps (private ranges + configured exemptions)
+4. Check if IP is a good bot (reverse DNS matches goodBots list)
+5. Check if user agent is exempt
+6. Check if route matches protection rules (prefix/suffix/regex matching)
+7. If protected, increment subnet counter and check rate limit
+8. If rate limit exceeded, serve challenge or redirect to challenge page
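The decision order listed above can be sketched in Go as follows. This is an illustration only, not the plugin's `shouldApply()` code: the `request` struct, `shouldChallenge`, and the choice of `>` for "exceeded" are all assumptions made for the example.

```go
package main

import "fmt"

// request is a hypothetical, flattened view of what the middleware knows
// about an incoming request; the real plugin derives this from *http.Request.
type request struct {
	Method    string
	Verified  bool // IP passed a challenge recently
	ExemptIP  bool // IP is in exemptIps
	GoodBot   bool // reverse DNS matched the goodBots list
	ExemptUA  bool // user agent is exempt
	Protected bool // route matches the protection rules
}

// shouldChallenge applies the eight documented checks in order.
// counts maps a subnet to the number of requests seen in the current window.
func shouldChallenge(r request, subnet string, counts map[string]int, rateLimit int) bool {
	switch {
	case r.Method != "GET" && r.Method != "HEAD": // 1. method not protected
		return false
	case r.Verified: // 2. already verified
		return false
	case r.ExemptIP: // 3. exempt IP
		return false
	case r.GoodBot: // 4. good bot
		return false
	case r.ExemptUA: // 5. exempt user agent
		return false
	case !r.Protected: // 6. route not protected
		return false
	}
	counts[subnet]++                  // 7. count the request against its subnet
	return counts[subnet] > rateLimit // 8. challenge once the limit is exceeded (assumed strict >)
}

func main() {
	counts := map[string]int{}
	r := request{Method: "GET", Protected: true}
	for i := 1; i <= 3; i++ {
		fmt.Printf("request %d challenged: %v\n", i, shouldChallenge(r, "203.0.0.0", counts, 2))
	}
}
```

Because the cheap bypass checks run first, exempt traffic never touches the subnet counter in this sketch.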
+
+### Internal Packages
+
+- **internal/helper/**: Utility functions
+  - `ip.go`: IP parsing, CIDR matching, reverse DNS lookups for bot verification
+  - `tmpl.go`: Default challenge template (embedded fallback)
+- **internal/log/**: Structured logging with slog
+- **internal/state/**: State serialization for persistent storage across restarts
+
+### Challenge Modes
+
+Two modes for serving challenges:
+
+1. **Redirect mode** (default): `challengeURL: "/challenge"` - Redirects to dedicated challenge page
+2. **Inline mode**: `challengeURL: ""` - Serves challenge on the same page that triggered rate limit
+
+## Development Commands
+
+### Running Tests
+
+```bash
+# Run unit tests
+go test -v -race ./...
+
+# Run single test
+go test -v -race -run TestParseIp
+
+# Run integration tests (requires Docker)
+cd ci && go run test.go
+```
+
+### Linting and Formatting
+
+```bash
+# Run golangci-lint locally
+golangci-lint run
+
+# Format code
+gofmt -w .
+
+# Check if go.mod is tidy
+go mod tidy && git diff --exit-code go.mod go.sum
+
+# Update vendored dependencies
+go mod vendor
+```
+
+### CI/CD
+
+The GitHub Actions workflow (`.github/workflows/lint-test.yml`) runs on every push:
+1. golangci-lint
+2. Validates `.traefik.yml` with yq
+3. Checks `go mod tidy` and `go mod vendor` are up-to-date
+4. Runs unit tests with race detector
+5. Runs integration tests against Traefik v2.11, v3.0, v3.1, v3.2, v3.3, v3.4
+
+### Integration Testing
+
+The `ci/` directory contains a full integration test:
+- Spins up Traefik + nginx with docker-compose
+- Generates 100 unique public IPs from different subnets
+- Makes parallel requests to verify rate limiting behavior
+- Tests state persistence across container restarts
+- Validates stats endpoint JSON
+
+To run: `cd ci && go run test.go`
+
+## Key Implementation Details
+
+### Route Matching Modes
+
+Three modes configured via `mode` parameter (defaults to "prefix"):
+
+1. **prefix**: Fast string prefix matching (`strings.HasPrefix`)
+2. **suffix**: Matches route suffixes (useful for specific endpoints)
+3. **regex**: Full regex support (13x slower than prefix, use only when needed)
+
+Regex is significantly slower (~41ns vs ~3.4ns per operation) - see README benchmark section.
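A rough sketch of how the three matching modes could be dispatched, assuming a hypothetical `matcher` type that is not taken from `main.go`. Regexes are compiled once at construction so that the per-request cost is only the match itself.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// matcher is a hypothetical holder for the configured mode and routes.
type matcher struct {
	mode   string
	routes []string
	regex  []*regexp.Regexp // populated only in "regex" mode
}

// newMatcher precompiles the routes when mode is "regex".
func newMatcher(mode string, routes []string) (*matcher, error) {
	m := &matcher{mode: mode, routes: routes}
	if mode == "regex" {
		for _, r := range routes {
			re, err := regexp.Compile(r)
			if err != nil {
				return nil, fmt.Errorf("invalid route regex %q: %w", r, err)
			}
			m.regex = append(m.regex, re)
		}
	}
	return m, nil
}

// matches reports whether path is covered by any configured route.
func (m *matcher) matches(path string) bool {
	switch m.mode {
	case "suffix":
		for _, r := range m.routes {
			if strings.HasSuffix(path, r) {
				return true
			}
		}
	case "regex":
		for _, re := range m.regex {
			if re.MatchString(path) {
				return true
			}
		}
	default: // "prefix" is the documented default
		for _, r := range m.routes {
			if strings.HasPrefix(path, r) {
				return true
			}
		}
	}
	return false
}

func main() {
	m, err := newMatcher("regex", []string{`^/node/\d+$`})
	if err != nil {
		panic(err)
	}
	fmt.Println(m.matches("/node/42"), m.matches("/node/abc"))
}
```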
+
+### IP Subnet Calculation
+
+- IPv4: Masks IPs to configured subnet (default /16 means `192.168.x.x` → `192.168.0.0`)
+- IPv6: Default /64 subnet mask
+- Implementation at `main.go:621-642`
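The masking described above can be reproduced with the standard `net` package. This is a minimal sketch, not the code at `main.go:621-642`; `maskSubnet` and its parameters are hypothetical names.

```go
package main

import (
	"fmt"
	"net"
)

// maskSubnet returns the network address of ipStr, using v4Bits for IPv4
// addresses and v6Bits for IPv6 addresses.
func maskSubnet(ipStr string, v4Bits, v6Bits int) (string, error) {
	ip := net.ParseIP(ipStr)
	if ip == nil {
		return "", fmt.Errorf("invalid IP %q", ipStr)
	}
	if v4 := ip.To4(); v4 != nil {
		// To4 yields the 4-byte form, so the mask must be 32 bits wide.
		return v4.Mask(net.CIDRMask(v4Bits, 32)).String(), nil
	}
	return ip.Mask(net.CIDRMask(v6Bits, 128)).String(), nil
}

func main() {
	for _, s := range []string{"192.168.12.34", "2001:db8:1:2:3:4:5:6"} {
		subnet, err := maskSubnet(s, 16, 64)
		if err != nil {
			panic(err)
		}
		fmt.Println(s, "is counted under", subnet)
	}
}
```

With the default masks, every address in `192.168.0.0/16` shares one counter key, which is what lets a spike spread across many IPs in the same subnet trip the limit.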
+
+### State Persistence
+
+When `persistentStateFile` is configured:
+- State saves every 1 minute to JSON file (`saveState()` at `main.go:695-727`)
+- On startup, loads previous state from file (`loadState()` at `main.go:729-756`)
+- Contains: rate limits per subnet, bot verification cache, verified IPs
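A save/load round trip along these lines might look like the sketch below. The `snapshot` type, its JSON field names, and the write-then-rename step are assumptions for illustration; the plugin's actual `saveState()`/`loadState()` and file layout may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// snapshot is a hypothetical on-disk shape for the three caches.
type snapshot struct {
	Rate     map[string]int  `json:"rate"`
	Bots     map[string]bool `json:"bots"`
	Verified map[string]bool `json:"verified"`
}

// saveSnapshot writes to a temporary file and renames it into place, so a
// crash mid-write does not leave a truncated state file behind.
func saveSnapshot(path string, s snapshot) error {
	data, err := json.Marshal(s)
	if err != nil {
		return err
	}
	tmp := path + ".tmp"
	if err := os.WriteFile(tmp, data, 0o600); err != nil {
		return err
	}
	return os.Rename(tmp, path)
}

// loadSnapshot reads a previously saved snapshot back into memory.
func loadSnapshot(path string) (snapshot, error) {
	var s snapshot
	data, err := os.ReadFile(path)
	if err != nil {
		return s, err
	}
	err = json.Unmarshal(data, &s)
	return s, err
}

// roundTrip saves s into a throwaway directory and loads it back.
func roundTrip(s snapshot) (snapshot, error) {
	dir, err := os.MkdirTemp("", "captcha-state")
	if err != nil {
		return snapshot{}, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "state.json")
	if err := saveSnapshot(path, s); err != nil {
		return snapshot{}, err
	}
	return loadSnapshot(path)
}

func main() {
	out, err := roundTrip(snapshot{Rate: map[string]int{"192.168.0.0": 7}})
	fmt.Println(out.Rate, err)
}
```

In a long-running process, the one-minute save interval could be driven by a `time.Ticker` calling the save function from a background goroutine.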
+
+### Good Bot Detection
+
+To avoid SEO impact, the plugin allows "good bots" to bypass rate limits:
+- Performs reverse DNS lookup on IP (`internal/helper/ip.go`)
+- Checks if hostname ends with configured second-level domain (e.g., "googlebot.com")
+- Results cached in `botCache` to avoid repeated DNS lookups
+- Optional `protectParameters: "true"` forces rate limiting even for good bots if URL contains query parameters
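The lookup-then-cache flow above could be sketched like this. `botChecker`, `hostMatchesGoodBot`, and the injected `lookup` function are hypothetical; in real use `lookup` would be `net.LookupAddr`, whose results typically carry a trailing dot. Requiring a `.` boundary before the domain is an assumption of this sketch, added so that a host such as `evilgooglebot.com` does not match `googlebot.com`.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// hostMatchesGoodBot reports whether a reverse-DNS hostname equals, or is a
// subdomain of, one of the allowed domains.
func hostMatchesGoodBot(hostname string, goodBots []string) bool {
	host := strings.ToLower(strings.TrimSuffix(hostname, "."))
	for _, bot := range goodBots {
		bot = strings.ToLower(bot)
		if host == bot || strings.HasSuffix(host, "."+bot) {
			return true
		}
	}
	return false
}

// botChecker caches one verdict per IP so each address is resolved once.
type botChecker struct {
	goodBots []string
	cache    map[string]bool
	lookup   func(ip string) ([]string, error) // net.LookupAddr in real use
}

func (c *botChecker) isGoodBot(ip string) bool {
	if verdict, ok := c.cache[ip]; ok {
		return verdict
	}
	good := false
	if names, err := c.lookup(ip); err == nil {
		for _, n := range names {
			if hostMatchesGoodBot(n, c.goodBots) {
				good = true
				break
			}
		}
	}
	c.cache[ip] = good // negative results are cached too
	return good
}

func main() {
	c := &botChecker{
		goodBots: []string{"googlebot.com"},
		cache:    map[string]bool{},
		// Fake resolver so the sketch runs offline.
		lookup: func(ip string) ([]string, error) {
			if ip == "66.249.66.1" {
				return []string{"crawl-66-249-66-1.googlebot.com."}, nil
			}
			return nil, errors.New("no PTR record")
		},
	}
	fmt.Println(c.isGoodBot("66.249.66.1"), c.isGoodBot("203.0.113.9"))
}
```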
+
+### File Extension Filtering
+
+By default, only HTML files are rate-limited (to prevent CSS/JS/images from consuming rate limit quota). Configure `protectFileExtensions` to add more file types.
+
+## Configuration
+
+Configuration comes from Traefik labels. See `.traefik.yml` for the plugin manifest.
+
+Key defaults:
+- `rateLimit: 20` requests per subnet
+- `window: 86400` seconds (24 hours)
+- `ipv4subnetMask: 16` (/16 = 65,536 IPs)
+- `ipv6subnetMask: 64`
+- `challengeStatusCode: 200` (or 429 for inline challenges)
+
+## Testing Strategy
+
+Unit tests (`main_test.go`) cover:
+- IP parsing and subnet masking
+- Route protection logic (prefix/suffix/regex)
+- Client IP extraction from forwarded headers with depth traversal
+- User agent exemption matching
+- Challenge page serving with different status codes
+
+Integration tests (`ci/test.go`) verify:
+- Full request lifecycle with real Traefik/nginx
+- Rate limiting behavior across multiple subnets
+- State persistence across container restarts
+- Stats endpoint functionality
+
+## Traefik Plugin Constraints
+
+- Must implement `http.Handler` interface
+- Entry point: `New(ctx context.Context, next http.Handler, config *Config, name string)`
+- Plugin loaded via Traefik's `--experimental.plugins` flag
+- No external state allowed (must use in-memory caches or file persistence)
+- Must be compatible with Traefik v2.11.1+

README.md

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 # Captcha Protect
 [![lint-test](https://github.com/libops/captcha-protect/actions/workflows/lint-test.yml/badge.svg)](https://github.com/libops/captcha-protect/actions/workflows/lint-test.yml)
 [![Go Report Card](https://goreportcard.com/badge/github.com/libops/captcha-protect)](https://goreportcard.com/report/github.com/libops/captcha-protect)
+[![codecov](https://codecov.io/gh/libops/captcha-protect/branch/main/graph/badge.svg)](https://codecov.io/gh/libops/captcha-protect)
 
 Traefik middleware to challenge individual IPs in a subnet when traffic spikes are detected from that subnet, using a captcha of your choice for the challenge (turnstile, recaptcha, or hcaptcha). **Requires traefik `v2.11.1` or above**
 

go.mod

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 module github.com/libops/captcha-protect
 
-go 1.24.0
+go 1.25.0
 
 require github.com/patrickmn/go-cache v2.1.0+incompatible

internal/helper/ip_test.go

Lines changed: 52 additions & 0 deletions
@@ -191,3 +191,55 @@ func parseCIDR(cidr string, t *testing.T) *net.IPNet {
 	}
 	return block
 }
+
+func TestParseCIDR(t *testing.T) {
+	tests := []struct {
+		name      string
+		cidr      string
+		expectErr bool
+	}{
+		{
+			name:      "Valid IPv4 CIDR",
+			cidr:      "192.168.1.0/24",
+			expectErr: false,
+		},
+		{
+			name:      "Valid IPv6 CIDR",
+			cidr:      "2001:db8::/32",
+			expectErr: false,
+		},
+		{
+			name:      "Invalid CIDR - no mask",
+			cidr:      "192.168.1.0",
+			expectErr: true,
+		},
+		{
+			name:      "Invalid CIDR - bad format",
+			cidr:      "not-a-cidr",
+			expectErr: true,
+		},
+		{
+			name:      "Invalid CIDR - empty string",
+			cidr:      "",
+			expectErr: true,
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			result, err := ParseCIDR(tt.cidr)
+			if tt.expectErr {
+				if err == nil {
+					t.Errorf("Expected error for CIDR %q, got nil", tt.cidr)
+				}
+			} else {
+				if err != nil {
+					t.Errorf("Unexpected error for CIDR %q: %v", tt.cidr, err)
+				}
+				if result == nil {
+					t.Errorf("Expected non-nil result for valid CIDR %q", tt.cidr)
+				}
+			}
+		})
+	}
+}

internal/helper/tmpl_test.go

Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
+package helper
+
+import (
+	"strings"
+	"testing"
+)
+
+func TestGetDefaultTmpl(t *testing.T) {
+	tmpl := GetDefaultTmpl()
+
+	// Verify it returns a non-empty string
+	if tmpl == "" {
+		t.Error("GetDefaultTmpl returned empty string")
+	}
+
+	// Verify it contains expected HTML elements
+	expectedElements := []string{
+		"<html>",
+		"</html>",
+		"<head>",
+		"</head>",
+		"<body>",
+		"</body>",
+		"<form",
+		"</form>",
+		"{{ .FrontendJS }}",
+		"{{ .SiteKey }}",
+		"{{ .ChallengeURL }}",
+		"{{ .Destination }}",
+		"{{ .FrontendKey }}",
+		"captchaCallback",
+	}
+
+	for _, elem := range expectedElements {
+		if !strings.Contains(tmpl, elem) {
+			t.Errorf("Template missing expected element: %s", elem)
+		}
+	}
+
+	// Verify it's valid HTML structure (basic check)
+	if !strings.HasPrefix(tmpl, "<html>") {
+		t.Error("Template should start with <html>")
+	}
+	if !strings.HasSuffix(strings.TrimSpace(tmpl), "</html>") {
+		t.Error("Template should end with </html>")
+	}
+}

internal/log/log_test.go

Lines changed: 73 additions & 0 deletions
@@ -0,0 +1,73 @@
+package log
+
+import (
+	"log/slog"
+	"testing"
+)
+
+func TestNew(t *testing.T) {
+	tests := []struct {
+		name          string
+		levelStr      string
+		expectedLevel slog.Level
+	}{
+		{"DEBUG level", "DEBUG", slog.LevelDebug},
+		{"INFO level", "INFO", slog.LevelInfo},
+		{"WARN level", "WARN", slog.LevelWarn},
+		{"WARNING level", "WARNING", slog.LevelWarn},
+		{"ERROR level", "ERROR", slog.LevelError},
+		{"debug lowercase", "debug", slog.LevelDebug},
+		{"Unknown level defaults to INFO", "UNKNOWN", slog.LevelInfo},
+		{"Empty level defaults to INFO", "", slog.LevelInfo},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			logger := New(tt.levelStr)
+			if logger == nil {
+				t.Error("Expected non-nil logger")
+			}
+			// Logger is created successfully, we can't easily test the exact level
+			// but we verify it doesn't panic or error
+		})
+	}
+}
+
+func TestParseLogLevel(t *testing.T) {
+	tests := []struct {
+		name      string
+		level     string
+		expected  slog.Level
+		expectErr bool
+	}{
+		{"DEBUG", "DEBUG", slog.LevelDebug, false},
+		{"debug lowercase", "debug", slog.LevelDebug, false},
+		{"INFO", "INFO", slog.LevelInfo, false},
+		{"info lowercase", "info", slog.LevelInfo, false},
+		{"WARN", "WARN", slog.LevelWarn, false},
+		{"WARNING", "WARNING", slog.LevelWarn, false},
+		{"warning lowercase", "warning", slog.LevelWarn, false},
+		{"ERROR", "ERROR", slog.LevelError, false},
+		{"error lowercase", "error", slog.LevelError, false},
+		{"Unknown level", "INVALID", slog.LevelInfo, true},
+		{"Empty string", "", slog.LevelInfo, true},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			level, err := parseLogLevel(tt.level)
+			if tt.expectErr {
+				if err == nil {
+					t.Errorf("Expected error for level %q, got nil", tt.level)
+				}
+			} else {
+				if err != nil {
+					t.Errorf("Unexpected error for level %q: %v", tt.level, err)
+				}
+			}
+			if level != tt.expected {
+				t.Errorf("Expected level %v, got %v", tt.expected, level)
+			}
+		})
+	}
+}
