RFC 9111 compliant client-side HTTP caching for HttpClient, powered by .NET's HybridCache for efficient L1 (memory) and L2 (distributed) caching.
- Features
- Installation
- Quick Start
- Handler Pipeline Configuration
- Configuration Options
- Cache Behavior
- Performance & Memory
- Metrics
- Benchmarks
- Samples
- RFC 9111 Compliant: Full implementation of HTTP caching specification for client-side caching
- HybridCache Integration: Leverages .NET's HybridCache for efficient L1 (memory) and L2 (distributed) caching
- Transparent Operation: Works seamlessly with existing HttpClient code
Request Directives:
max-age: Control maximum acceptable response agemax-stale: Accept stale responses within specified staleness tolerancemin-fresh: Require responses to remain fresh for specified durationno-cache: Force revalidation with origin serverno-store: Bypass cache completelyonly-if-cached: Return cached responses or 504 if not cached
Response Directives:
max-age: Define response freshness lifetimeno-cache: Store but require validation before useno-store: Prevent cachingpublic/private: Control cache visibilitymust-revalidate: Enforce validation when stale
- Conditional Requests: Automatic ETag (
If-None-Match) and Last-Modified (If-Modified-Since) validation - Vary Header Support: Content negotiation with multiple cache entries per resource
- Freshness Calculation: Supports
Expiresheader,Ageheader, and heuristic freshness (Last-Modified based) - Stale Response Handling:
stale-while-revalidate: Serve stale content while updating in backgroundstale-if-error: Serve stale content when origin is unavailable
- Configurable Limits: Per-item content size limits (default 10MB)
- Metrics: Built-in metrics via
System.Diagnostics.Metricsfor hit/miss rates and cache operations - Custom Cache Keys: Extensible cache key generation for advanced scenarios
- Request Collapsing: Prevents cache stampede via
HybridCache.GetOrCreateAsyncautomatic request coalescing
dotnet add package DamianH.HttpHybridCacheHandlervar services = new ServiceCollection();
services.AddHttpHybridCacheHandler(options =>
{
options.FallbackCacheDuration = TimeSpan.FromMinutes(5);
options.MaxCacheableContentSize = 10 * 1024 * 1024; // 10MB
options.CompressionThreshold = 1024; // Compress cached content >1KB
});
services.AddHttpClient("MyClient")
.ConfigurePrimaryHttpMessageHandler(() => new SocketsHttpHandler
{
// Enable automatic decompression - server compression handled transparently
AutomaticDecompression = DecompressionMethods.All,
// DNS refresh every 5 minutes - critical for cloud/microservices
PooledConnectionLifetime = TimeSpan.FromMinutes(5),
// Close idle connections after 2 minutes
PooledConnectionIdleTimeout = TimeSpan.FromMinutes(2),
// Reasonable connection timeout
ConnectTimeout = TimeSpan.FromSeconds(10)
})
.AddHttpMessageHandler(sp => sp.GetRequiredService<HttpHybridCacheHandler>());
var client = services.BuildServiceProvider()
.GetRequiredService<IHttpClientFactory>()
.CreateClient("MyClient");
var response = await client.GetAsync("https://api.example.com/data");Always use SocketsHttpHandler with AutomaticDecompression enabled (better performance, DNS refresh, and connection pooling than legacy HttpClientHandler):
.ConfigurePrimaryHttpMessageHandler(() => new SocketsHttpHandler
{
AutomaticDecompression = DecompressionMethods.All,
PooledConnectionLifetime = TimeSpan.FromMinutes(5),
PooledConnectionIdleTimeout = TimeSpan.FromMinutes(2)
})Two different compressions:
-
Transport Compression (Server → Client)
- Controlled by:
AutomaticDecompressiononSocketsHttpHandler - Purpose: Reduce network bandwidth
- Result: Handler receives decompressed content
- Controlled by:
-
Cache Storage Compression (This library)
- Controlled by:
CompressionThresholdin options - Purpose: Reduce cache storage size
- Result: Content compressed before storing in cache
- Controlled by:
Example Flow:
Server sends: gzipped 512 bytes
↓
SocketsHttpHandler: auto-decompresses → 2048 bytes
↓
HttpHybridCacheHandler: receives decompressed content
↓
Our compression: compresses → 600 bytes
↓
Cache: stores 600 bytes (no Base64 overhead!)
Benefits:
- Cache handler can inspect and validate response content
- Cache-Control, ETag, and Last-Modified headers are readable
- Enables intelligent caching decisions
- Storage compression is optional and configurable
Pipeline structure:
HttpClient → [Outer Handlers] → HttpHybridCacheHandler → SocketsHttpHandler → Network
.AddHttpMessageHandler(sp => new HttpHybridCacheHandler(...))
.AddStandardResilienceHandler(options =>
{
options.Retry.MaxRetryAttempts = 3;
options.CircuitBreaker.SamplingDuration = TimeSpan.FromSeconds(30);
});Order: Polly (outer) → Cache → SocketsHttpHandler
Why: Cache hit = fast path, Polly never invoked. Cache miss + network failure = Polly retries.
.AddHttpMessageHandler(() => new AuthenticationHandler())
.AddHttpMessageHandler(sp => sp.GetRequiredService<HttpHybridCacheHandler>());To include auth headers in cache key, configure VaryHeaders in the AddHttpHybridCacheHandler options.
Auth applied before caching, headers included in cache key via Vary.
Wrong: Not enabling AutomaticDecompression
new SocketsHttpHandler() // Defaults to None!Problem: Cache handler receives compressed content, can't inspect properly.
Correct: Explicitly enable decompression
new SocketsHttpHandler
{
AutomaticDecompression = DecompressionMethods.All
}Wrong: Using legacy HttpClientHandler
new HttpClientHandler() // Legacy, less efficientCorrect: Use modern SocketsHttpHandler
new SocketsHttpHandler { /* ... */ }Wrong: Cache handler after Polly
.AddStandardResilienceHandler() // Outer
.AddHttpMessageHandler(sp => new HttpHybridCacheHandler(...)) // Inner - Wrong!Correct: Cache handler before Polly
.AddHttpMessageHandler(sp => new HttpHybridCacheHandler(...)) // Inner - Correct!
.AddStandardResilienceHandler() // OuterGolden Rule: HttpHybridCacheHandler should receive decompressed, ready-to-use content.
The library supports two cache modes, following RFC 9111 semantics:
Browser-like cache behavior suitable for client applications:
Use Cases:
- HttpClient in web applications, APIs, background services
- Scaled-out clients sharing cache (multiple instances, serverless/Lambda)
- Per-user/per-tenant caching scenarios
Behavior:
- Caches responses with
Cache-Control: private - Uses
max-agedirective (ignoress-maxage) - Caches authenticated requests if marked
privateormax-age - Each cache key is client-specific (Vary headers applied)
Example:
new HttpHybridCacheHandlerOptions
{
Mode = CacheMode.Private, // Shares cache across app instances via Redis L2
FallbackCacheDuration = TimeSpan.FromMinutes(5)
}Proxy/CDN-like cache behavior suitable for gateways:
Use Cases:
- Reverse proxies (YARP, Envoy)
- API gateways
- Edge caches / CDN-like scenarios
Behavior:
- Does NOT cache responses with
Cache-Control: private - Prefers
s-maxageovermax-age - Only caches authenticated requests with
publicors-maxage - Cache is shared across all clients/users
Example:
new HttpHybridCacheHandlerOptions
{
Mode = CacheMode.Shared, // RFC 9111 shared cache semantics
MaxCacheableContentSize = 50 * 1024 * 1024 // 50MB
}- Mode: Cache mode determining caching behavior (default:
CacheMode.Private). UseCacheMode.Sharedfor proxy/CDN scenarios - HeuristicFreshnessPercent: Heuristic freshness percentage for responses with Last-Modified but no explicit freshness info (default: 0.1 or 10%)
- VaryHeaders: Headers to include in Vary-aware cache keys (default: Accept, Accept-Encoding, Accept-Language, User-Agent)
- MaxCacheableContentSize: Maximum size in bytes for cacheable response content (default: 10 MB). Responses larger than this will not be cached
- FallbackCacheDuration: Fallback cache duration for responses without explicit caching headers (default:
TimeSpan.MinValue, meaning responses without caching headers are not cached) - CompressionThreshold: Minimum content size in bytes to enable compression (default: 1024 bytes). Set to 0 or negative value to disable compression
- CompressibleContentTypes: Content types eligible for compression (default:
text/*,application/json,application/json+*,application/xml,application/javascript,image/svg+xml) - CacheableContentTypes: Content types eligible for caching (default:
text/*,application/json,application/json+*,application/xml,application/javascript,application/xhtml+xml,image/*) - ContentKeyPrefix: Prefix for content cache keys (default:
"httpcache:content:"). Content is stored separately from metadata to avoid Base64 encoding overhead - IncludeDiagnosticHeaders: Whether to include diagnostic headers (
X-Cache-Diagnostic, etc.) in responses (default:false)
The handler emits the following counters via System.Diagnostics.Metrics under the meter named DamianH.HttpHybridCacheHandler:
| Counter | Description |
|---|---|
cache.hits |
Number of cache hits (fresh, revalidated, stale-while-revalidate, stale-if-error) |
cache.misses |
Number of cache misses (including cache errors and failed revalidations) |
cache.stale |
Number of stale cache entries served (stale-while-revalidate or stale-if-error) |
cache.size_exceeded |
Number of responses that exceeded MaxCacheableContentSize and were not cached |
All counters include the following tags (following OpenTelemetry semantic conventions):
| Tag | Description | Example |
|---|---|---|
http.request.method |
HTTP method | GET, HEAD |
url.scheme |
URL scheme | http, https |
server.address |
Server hostname | api.example.com |
server.port |
Server port | 443 |
When IncludeDiagnosticHeaders is enabled in options, the handler adds diagnostic information to responses:
- X-Cache-Diagnostic: Indicates cache behavior for the request
HIT-FRESH: Served from cache, content is freshHIT-REVALIDATED: Served from cache after successful 304 revalidationHIT-STALE-WHILE-REVALIDATE: Served stale while background revalidation occursHIT-STALE-IF-ERROR: Served stale due to backend errorHIT-ONLY-IF-CACHED: Served from cache with only-if-cached directiveMISS: Not in cache, fetched from backendMISS-REVALIDATED: Cache entry was stale and resource changedMISS-CACHE-ERROR: Cache operation failed, bypassedMISS-ONLY-IF-CACHED: Not in cache with only-if-cached directive (504 Gateway Timeout)BYPASS-METHOD: Request method not cacheable (POST, PUT, etc.)BYPASS-NO-STORE: Request has no-store directive
- X-Cache-Age: Age of cached content in seconds (only for cache hits)
- X-Cache-MaxAge: Maximum age of cached content in seconds (only for cache hits)
- X-Cache-Compressed: "true" if content was stored compressed (only for cache hits)
Example:
var options = new HttpHybridCacheHandlerOptions
{
IncludeDiagnosticHeaders = true
};Only GET and HEAD requests are cached. Responses are cached when:
- Status code is 200 OK
- Cache-Control allows caching (not no-store, not no-cache without validation)
- Content size is within MaxContentSize limit
Cache keys are generated from:
- HTTP method
- Request URI
- Vary header values from the response
When serving stale content, the handler automatically adds:
If-None-Matchheader with cached ETagIf-Modified-Sinceheader with cached Last-Modified date
If the server responds with 304 Not Modified, the cached response is refreshed and served.
The handler is designed for high-performance scenarios with several key optimizations:
Eliminates Base64 overhead in distributed cache:
- Metadata (small, ~1-2KB): Status code, headers, timestamps → Stored as JSON
- Content (large, variable): Response body → Stored as raw
byte[]- No Base64 encoding = 33% size savings
- Content deduplication via SHA256 hash
- Same content shared across cache entries (different Vary headers)
Trade-offs:
- Two cache lookups (metadata + content) vs one lookup
- Acceptable: L1 (memory) cache makes second lookup very fast (~microseconds)
- Benefit: Zero Base64 overhead on all cached content
- Stampede Prevention (via
HybridCache.GetOrCreateAsync): Multiple concurrent requests for the same resource are automatically collapsed into a single backend request - Automatic Deduplication: Only one request hits the backend while others await the cached result
- Built-in HybridCache feature - no additional configuration needed
- L1/L2 Strategy: Fast in-memory (L1) + optional distributed (L2) via HybridCache
- Size Limits: Configurable per-item limits (default: 10MB) prevent memory issues
- Conditional Requests: ETags and Last-Modified enable efficient 304 responses
See /benchmarks for comprehensive memory allocation benchmarks:
| Response Size | Allocations | Gen2 (LOH) | Notes |
|---|---|---|---|
| 1-10KB | ~10-20 KB | 0 | No LOH, optimal |
| 10-85KB | ~20-100 KB | 0 | No LOH, good |
| >85KB | ~100KB+ | >0 | LOH expected, acceptable for reliability |
Run benchmarks: cd benchmarks && .\run-memory-tests.ps1
Run benchmarks to measure performance:
dotnet run --project benchmarks/Benchmarks.csproj -c ReleaseSee the /samples directory for complete examples:
HttpClientFactorySample: Integration with IHttpClientFactoryYarpCachingProxySample: Building a caching reverse proxy with YARPFusionCacheSample: Using FusionCache via its HybridCache adapter for enhanced caching featuresFileDistributedCacheSample: File-based L2 cache with HttpHybridCacheHandler