diff --git a/src/content/docs/cache/interaction-cloudflare-products/workers.mdx b/src/content/docs/cache/interaction-cloudflare-products/workers.mdx
index 0b90c7883ab5180..248beef80c01579 100644
--- a/src/content/docs/cache/interaction-cloudflare-products/workers.mdx
+++ b/src/content/docs/cache/interaction-cloudflare-products/workers.mdx
@@ -14,6 +14,10 @@ head:
 
 You can use [Workers](/workers/) to customize cache behavior on Cloudflare's network. Workers run as middleware in the request lifecycle — a single Worker handles both the request and response phases. When a request arrives, it hits the Worker before the cache is checked. The Worker can modify the incoming request (for example, rewrite the URL or add headers), then call `fetch()` to continue the request through the cache. When the response comes back — whether from cache or from the origin server — the Worker can also modify the response before it is sent to the visitor.
 
+:::note
+This page describes how Workers interact with a zone's Cloudflare Cache. Workers can also opt in to **[Workers Caching](/workers/cache/)** — a cache that sits in front of the Worker itself, so that Cloudflare returns a cached response without running the Worker. Use Workers Caching to cache the output of a Worker's logic directly, independent of any zone configuration.
+:::
+
 The diagram below illustrates a common interaction flow between Workers and Cache.
 
 ![Workers and cache flow example flow diagram.](~/assets/images/cache/workers-cache-flow.png)
diff --git a/src/content/docs/workers/cache/cache-keys.mdx b/src/content/docs/workers/cache/cache-keys.mdx
new file mode 100644
index 000000000000000..efed694674f8456
--- /dev/null
+++ b/src/content/docs/workers/cache/cache-keys.mdx
@@ -0,0 +1,166 @@
+---
+title: Cache keys
+pcx_content_type: concept
+description: How Workers Caching builds cache keys, with guidance for service bindings, multi-tenant Workers, and gradual deployments.
+sidebar:
+  order: 2
+---
+
+import { TypeScriptExample } from "~/components";
+
+Every cached response is stored under a **cache key**. When a request arrives, Cloudflare computes a cache key for it and looks it up — on a hit, the stored response is returned; on a miss, your Worker runs and its response is stored under that key for next time.
+
+Two requests that produce the same cache key share the same cached response. Two requests that produce different cache keys get independent cached entries.
+
+This page explains what Workers Caching puts into the cache key, why each component is there, and how to reason about it when designing your Worker.
+
+## What goes into the cache key
+
+Workers Caching keys responses by:
+
+- The **target entrypoint** — which specific [named entrypoint](/workers/runtime-apis/bindings/service-bindings/rpc/#named-entrypoints) of the Worker received the request. A `default` export and an exported class are different entrypoints and do not share a cache even if they produce identical responses.
+- The **path and query string** of the request URL. Query parameter order matters — `?a=1&b=2` and `?b=2&a=1` are different cache keys. Trailing slashes matter too.
+- The invocation's [`ctx.props`](/workers/runtime-apis/bindings/service-bindings/rpc/#ctxprops), when the Worker is invoked through a service binding or RPC. See [Multi-tenant safety with `ctx.props`](#multi-tenant-safety-with-ctxprops).
+- The `x-http-method-override`, `x-http-method`, and `x-method-override` request headers.
+- The `x-forwarded-host`, `x-host`, `x-forwarded-scheme` (unless its value is `http` or `https`), `x-original-url`, `x-rewrite-url`, and `forwarded` request headers.
+
+The last two bullets are a safety measure, not something you should need to reason about. Some frameworks interpret those headers as overriding the effective method or URL of a request, which can lead to [cache poisoning](https://portswigger.net/research/practical-web-cache-poisoning) if two requests differ only in those headers but produce materially different responses. Including them in the cache key ensures a poisoned entry only affects requests that carry the same poisoned header.
+
+Requests that differ only in request headers **not** listed above (for example, `User-Agent` or `Accept-Language`) return the same cached response. This is usually what you want — you do not want every user agent string or language preference producing a separate cache entry. If you do need content negotiation, handle it inside your Worker and produce a canonical response per URL.
+
+Notably, the cache key does **not** include:
+
+- **The request's host.** The Worker's cache is keyed by path and query string, not the full URL. See [The cache belongs to the Worker, not to a domain](#the-cache-belongs-to-the-worker-not-to-a-domain).
+- **The currently invoked Worker version.** This is intentional — see [Invalidating cache across deployments](#invalidating-cache-across-deployments).
+
+At launch, you cannot inspect the exact cache key Cloudflare computed for a request. The primary signals you have for understanding cache behavior are the `Cf-Cache-Status` response header and per-invocation cache-hit information in the [Workers observability dashboard](/workers/observability/). See [Inspecting the cache key](#inspecting-the-cache-key).
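+
+The composition above can be pictured as a small function. The sketch below is a conceptual model for reasoning about which requests share an entry, not Cloudflare's implementation: `modelCacheKey`, the `KeyInput` shape, and the JSON encoding are hypothetical names and choices made for illustration, and the real key cannot be inspected.
+
+```ts
+// Conceptual model only. Cloudflare does not expose how the real key is computed.
+interface KeyInput {
+	entrypoint: string; // for example "default" or the name of an exported class
+	url: string;
+	props?: unknown; // ctx.props, when invoked through a service binding or RPC
+	headers?: Record<string, string>;
+}
+
+// Request headers that this page lists as cache key components.
+const KEYED_HEADERS = [
+	"x-http-method-override",
+	"x-http-method",
+	"x-method-override",
+	"x-forwarded-host",
+	"x-host",
+	"x-forwarded-scheme",
+	"x-original-url",
+	"x-rewrite-url",
+	"forwarded",
+];
+
+function modelCacheKey(input: KeyInput): string {
+	const { pathname, search } = new URL(input.url);
+	const headers = new Map(
+		Object.entries(input.headers ?? {}).map(
+			([name, value]): [string, string] => [name.toLowerCase(), value],
+		),
+	);
+	const keyed = KEYED_HEADERS.flatMap((name) => {
+		const value = headers.get(name);
+		if (value === undefined) return [];
+		// x-forwarded-scheme is left out when its value is "http" or "https".
+		if (name === "x-forwarded-scheme" && (value === "http" || value === "https")) return [];
+		return [`${name}=${value}`];
+	});
+	// The host is deliberately dropped. Path and query string are kept verbatim,
+	// so parameter order and trailing slashes still matter.
+	return JSON.stringify([input.entrypoint, pathname + search, input.props ?? null, keyed]);
+}
+```
+
+In this model, `https://api.example.com/api/users/42` and `http://internal/api/users/42` map to the same key, while `?a=1&b=2` and `?b=2&a=1` do not.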
+
+## The cache belongs to the Worker, not to a domain
+
+A Worker is a zoneless entity. It can be invoked through several different paths:
+
+- Directly on a `workers.dev` subdomain.
+- Through a [route](/workers/configuration/routing/routes/) on any zone you control.
+- Through a [custom domain](/workers/configuration/routing/custom-domains/) — and you can bind the same Worker to many custom domains.
+- Through a [service binding](/workers/runtime-apis/bindings/service-bindings/) from another Worker, with an arbitrary placeholder hostname in the URL.
+
+Workers Caching treats all of these as the same Worker and uses a single shared cache across them. The cache key does not include the host, so a request to `/api/users/42` hits the same cached entry whether it came in through `api.example.com`, `api.example.net`, a service binding, or a `workers.dev` URL.
+
+This is the behavior you almost always want. A Worker's responses are a function of its code and its inputs, not of which domain the request arrived through — so caching them once and serving that response back to every ingress path maximizes the cache hit rate without losing correctness.
+
+If you genuinely need different cached responses for the same path on different hostnames — for example, white-labeled tenants where `tenant-a.example.com/index` and `tenant-b.example.com/index` must produce different content — the cache key does not do this for you automatically. Instead, distinguish the tenants at your gateway Worker and pass the tenant identifier via `ctx.props`, which _is_ part of the cache key.
+
+## Invalidating cache across deployments
+
+The currently invoked Worker version is **not** part of the cache key. This surprises some developers at first, so it is worth explaining why.
+
+Most deployments do not change response content or freshness semantics. They change implementation details, fix bugs, or adjust internals. If the cache were invalidated on every deployment, you would throw away a lot of correct cached data for no benefit. During a [gradual deployment](/workers/configuration/versions-and-deployments/gradual-deployments/), version-keyed cache entries would also split the cache between the old and new versions: while the rollout tests changes on a slice of traffic, each version would populate its own cache from cold.
+
+So by default, cached responses are shared across versions. A response written by version A is still served after version B is deployed, as long as its TTL has not expired. In most cases this is what you want.
+
+When it is not what you want — for example, after a rollback, or when a release changes response content in a way that matters — you have two tools.
+
+### Tag responses by version, purge the tag on rollback
+
+If you want fine-grained control, tag each cached response with the Worker version that produced it. Later, purging that version tag removes every entry that version wrote, without affecting cached responses from other versions.
+
+This uses the [version metadata binding](/workers/runtime-apis/bindings/version-metadata/) to read the current version ID at request time, and prepends it as a `Cache-Tag` value. See [Version-specific purging](/workers/cache/purge/#version-specific-purging) for the full pattern with code.
+
+This is the best option if your Worker is on a gradual rollout or you might need to roll back a specific version without blowing away cached content from working versions.
+
+### Purge everything after deploy
+
+The simpler approach: after each deploy, hit a small Worker endpoint from your CI that calls [`ctx.cache.purge({ purgeEverything: true })`](/workers/cache/purge/#purge-everything). The next request after the purge re-populates the cache from whichever Worker version is live at that moment.
+
+This is coarser but requires no per-response tagging logic in your Worker. Use it if every deployment should invalidate the cache, or if you are not using gradual rollouts and do not need per-version granularity.
+
+## Multi-tenant safety with `ctx.props`
+
+When your Worker is invoked through a [service binding](/workers/runtime-apis/bindings/service-bindings/) or [RPC](/workers/runtime-apis/bindings/service-bindings/rpc/), the caller's [`ctx.props`](/workers/runtime-apis/bindings/service-bindings/rpc/#ctxprops) is part of the cache key. Two callers that invoke your Worker with different `ctx.props` get **separate cached entries** — one caller can never receive another caller's cached response.
+
+This is the mechanism that makes caching safe for multi-tenant Workers invoked over a service binding. If you use `ctx.props` to carry per-caller authorization context — user ID, tenant ID, organization, role — caching is safe by default. Responses that logically belong to one caller cannot leak to another through the cache.
+
+<TypeScriptExample filename="src/backend.ts">
+
+```ts
+import { WorkerEntrypoint } from "cloudflare:workers";
+
+interface Props {
+	userId: string;
+}
+
+export default class Backend extends WorkerEntrypoint<Env, Props> {
+	async fetch(request: Request): Promise<Response> {
+		// ctx.props.userId is set by the caller (for example, an auth gateway).
+		// Because it is part of the cache key, User A and User B requesting the
+		// same URL get separate cache entries — there is no way for one to
+		// see the other's response.
+		const { userId } = this.ctx.props;
+		const data = { userId, timestamp: Date.now() };
+
+		return new Response(JSON.stringify(data), {
+			headers: {
+				"Content-Type": "application/json",
+				"Cache-Control": "public, s-maxage=300",
+			},
+		});
+	}
+}
+```
+
+</TypeScriptExample>
+
+:::caution
+If you authenticate callers through some mechanism other than `ctx.props` (for example, by reading a custom header your gateway Worker attaches), that input is **not** automatically part of the cache key. Two requests that are identical except for that header value share a single cached entry, which means one caller can receive another caller's response.
+
+The fix is to move per-caller authorization state into `ctx.props`. Your gateway Worker should populate `ctx.props` with whatever distinguishes callers before invoking the cached Worker. Cloudflare's standard [automatic bypass rules](/cache/concepts/cache-responses/#bypass) (`Set-Cookie`, `Authorization`) can also prevent the problem by disabling caching entirely for authenticated requests, but `ctx.props` is the recommended approach because it still lets you cache.
+:::
+
+### Service binding URL
+
+Service binding calls deserve a specific note because the URL you pass does not mean what you might think it means.
+
+When you call a service binding with [`fetch()`](/workers/runtime-apis/bindings/service-bindings/#use-the-fetch-method), the hostname in the URL is a placeholder. The request is routed via the binding, not by DNS — the hostname is never resolved. And because the host is not part of the cache key (as described in [The cache belongs to the Worker, not to a domain](#the-cache-belongs-to-the-worker-not-to-a-domain)), the placeholder has no effect on caching either. Only the **path** (and query string) contribute to the cache key, alongside the target entrypoint and `ctx.props`:
+
+<TypeScriptExample filename="src/gateway.ts">
+
+```ts
+interface Env {
+	BACKEND: Fetcher;
+}
+
+export default {
+	async fetch(request, env, ctx): Promise<Response> {
+		// "internal" here is just a placeholder — it is not routed anywhere
+		// and is not part of the cache key.
+		//
+		// What identifies this cached response is:
+		//   - the BACKEND entrypoint
+		//   - the path "/api/users/42"
+		//   - whatever ctx.props the gateway passes along
+		return env.BACKEND.fetch("http://internal/api/users/42");
+	},
+} satisfies ExportedHandler<Env>;
+```
+
+</TypeScriptExample>
+
+If you want cached responses to differ for different callers, vary `ctx.props`. If you want them to differ by request, vary the path or query string. Varying the hostname does nothing.
+
+## Inspecting the cache key
+
+At launch, two signals give you visibility into cache behavior:
+
+1. **The `Cf-Cache-Status` response header.** `HIT` means Cloudflare returned a cached response without running your Worker. `MISS` means your Worker ran and the response was stored. `UPDATING` means the cached response was stale and your Worker ran in the background to refresh it. `BYPASS` means caching was disabled for this request. Refer to [Cloudflare cache responses](/cache/concepts/cache-responses/) for the full set of values.
+
+2. **Cache hits in the [Workers observability dashboard](/workers/observability/).** Each invocation surfaces whether it was served from cache, so you can filter and aggregate cache-hit behavior across your Worker's traffic.
+
+Cloudflare does not currently expose the cache key composition itself. If two requests you expected to share a cached response do not, you have to reason about what part of the key differed from the components listed in [What goes into the cache key](#what-goes-into-the-cache-key). For a walkthrough of common caching problems and how to diagnose them, refer to [Debugging](/workers/cache/debugging/).
+
+## Custom cache keys
+
+At launch, the cache key is composed from the components listed in [What goes into the cache key](#what-goes-into-the-cache-key). You cannot customize the cache key directly — for example, to ignore a tracking query parameter, or to include a session cookie.
+
+{/* TODO(dan): Document the custom cache key API once available. Track use cases: ignore query strings, key on a session cookie, exclude specific query parameters, include Accept-Language or other content-negotiation headers. */}
+
+As a workaround, you can shape the request your Worker receives (for example, by stripping tracking parameters in a gateway Worker before passing the request on) or you can incorporate discriminating values into `ctx.props` so they become part of the cache key.
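+
+A minimal sketch of the first workaround follows, assuming a gateway Worker that rewrites the URL before forwarding. The `canonicalizeUrl` helper and the `TRACKING_PARAMS` list are hypothetical; adjust the list to the parameters your application actually ignores. The helper also sorts the remaining parameters, because query parameter order is part of the cache key.
+
+```ts
+// Hypothetical list of parameters that never change the response.
+const TRACKING_PARAMS = new Set([
+	"utm_source",
+	"utm_medium",
+	"utm_campaign",
+	"fbclid",
+	"gclid",
+]);
+
+function canonicalizeUrl(input: string): string {
+	const url = new URL(input);
+	// Copy the names first so that deleting does not disturb the iteration.
+	for (const name of Array.from(url.searchParams.keys())) {
+		if (TRACKING_PARAMS.has(name)) url.searchParams.delete(name);
+	}
+	// Sort so that ?a=1&b=2 and ?b=2&a=1 resolve to the same URL.
+	url.searchParams.sort();
+	return url.toString();
+}
+```
+
+A gateway could then forward the request with something like `env.BACKEND.fetch(new Request(canonicalizeUrl(request.url), request))`, where `BACKEND` is the service binding name used in the earlier example.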
diff --git a/src/content/docs/workers/cache/configuration.mdx b/src/content/docs/workers/cache/configuration.mdx
new file mode 100644
index 000000000000000..c3460495a9c2a98
--- /dev/null
+++ b/src/content/docs/workers/cache/configuration.mdx
@@ -0,0 +1,237 @@
+---
+title: Configuration
+pcx_content_type: configuration
+description: Enable and configure Workers Caching.
+sidebar:
+  order: 1
+---
+
+import { TypeScriptExample, WranglerConfig } from "~/components";
+
+Workers Caching is configured per Worker, in your Wrangler configuration file. When enabled, caching applies to every HTTP invocation: requests from end users (eyeball requests), service binding calls, and loopback calls via [`ctx.exports`](/workers/runtime-apis/bindings/service-bindings/rpc/).
+
+This is **your Worker's cache** — configured through your Worker's code and Wrangler file. Your Worker controls its cache entirely through:
+
+- The `cache.enabled` flag in your Wrangler configuration, which turns caching on or off.
+- The `Cache-Control` (and `cdn-cache-control`, `cloudflare-cdn-cache-control`) headers your Worker sets on its responses, per [RFC 9111](https://www.rfc-editor.org/rfc/rfc9111).
+- The optional `Cache-Tag` response header for bulk purging, and [`ctx.cache.purge()`](/workers/cache/purge/) for programmatic invalidation.
+
+That is the entire configuration surface.
+
+Workers Caching requires Wrangler version 4.69.0 or later.
+
+## Enable caching
+
+Add a `cache` block to your Wrangler configuration:
+
+<WranglerConfig>
+
+```jsonc
+{
+	"name": "my-worker",
+	"main": "src/index.ts",
+	"compatibility_date": "$today",
+	"cache": {
+		"enabled": true,
+	},
+}
+```
+
+</WranglerConfig>
+
+Setting `cache.enabled` to `true` causes Cloudflare to check the cache before invoking your Worker on every HTTP request.
+
+## Disable caching
+
+To turn caching off, set `cache.enabled` to `false` (or remove the `cache` block) and redeploy:
+
+<WranglerConfig>
+
+```jsonc
+{
+	"name": "my-worker",
+	"main": "src/index.ts",
+	"compatibility_date": "$today",
+	"cache": {
+		"enabled": false,
+	},
+}
+```
+
+</WranglerConfig>
+
+Disabling caching does not purge previously cached responses — it only stops Cloudflare from consulting or populating the cache on subsequent requests. If you re-enable caching later, any entries that are still within their TTL become usable again. If you need cached responses to stop being served immediately, [purge the cache](/workers/cache/purge/) after disabling.
+
+## Versioned deployments
+
+The `cache` configuration is part of your Worker version:
+
+- Each version uploaded with [`wrangler deploy`](/workers/wrangler/commands/#deploy) or [`wrangler versions upload`](/workers/wrangler/commands/#versions-upload) captures whatever `cache.enabled` value is in its Wrangler configuration.
+- Rolling back to a previous version also rolls back the `cache` setting attached to that version.
+- You can use [gradual deployments](/workers/configuration/versions-and-deployments/gradual-deployments/) to turn caching on for a percentage of traffic before applying it to 100%. During a gradual rollout from a version with caching disabled to a version with caching enabled, traffic routed to the old version runs uncached as it did before, and traffic routed to the new version consults and populates the cache. Cached entries written during the rollout remain valid once the new version reaches 100% because cache entries are not keyed on the Worker version.
+
+## Environment-specific configuration
+
+The `cache` block can be set at the top level and overridden per [environment](/workers/wrangler/environments/). The typical pattern is to turn caching on in production once you are confident it is safe, while keeping staging uncached for easier debugging:
+
+<WranglerConfig>
+
+```jsonc
+{
+	"name": "my-worker",
+	"main": "src/index.ts",
+	"compatibility_date": "$today",
+	"cache": {
+		"enabled": false,
+	},
+	"env": {
+		"production": {
+			"cache": {
+				"enabled": true,
+			},
+		},
+	},
+}
+```
+
+</WranglerConfig>
+
+## Cache-Control semantics
+
+With caching enabled, your Worker is the origin for Cloudflare's cache. Standard HTTP `Cache-Control` directives on the response your Worker returns determine whether and for how long Cloudflare caches it. For the full list of directives and how they interact, refer to [Cache-Control](/cache/concepts/cache-control/).
+
+### Set `s-maxage` and `max-age` separately
+
+Use `s-maxage` to control how long Cloudflare caches a response; use `max-age` to control how long browsers cache it:
+
+<TypeScriptExample filename="src/index.ts">
+
+```ts
+export default {
+	async fetch(request): Promise<Response> {
+		const body = await renderPage(request);
+
+		return new Response(body, {
+			headers: {
+				"Content-Type": "text/html",
+				// Browsers cache for 5 minutes; Cloudflare caches for 1 hour.
+				"Cache-Control": "public, max-age=300, s-maxage=3600",
+			},
+		});
+	},
+} satisfies ExportedHandler;
+
+// Replace with your own rendering logic.
+async function renderPage(request: Request): Promise<string> {
+	return `<!doctype html><title>Home</title><h1>Hello</h1>`;
+}
+```
+
+</TypeScriptExample>
+
+### Use `stale-while-revalidate` for low-latency refreshes
+
+When a cached response becomes stale, `stale-while-revalidate` lets Cloudflare return the stale response immediately and refresh it in the background:
+
+<TypeScriptExample filename="src/index.ts">
+
+```ts
+export default {
+	async fetch(request): Promise<Response> {
+		const data = { timestamp: Date.now() };
+
+		return new Response(JSON.stringify(data), {
+			headers: {
+				"Content-Type": "application/json",
+				// Fresh for 10 minutes; may be served stale for up to 1 minute
+				// while a background revalidation runs.
+				"Cache-Control": "public, s-maxage=600, stale-while-revalidate=60",
+			},
+		});
+	},
+} satisfies ExportedHandler;
+```
+
+</TypeScriptExample>
+
+### Choose TTL and stale-while-revalidate values
+
+High cache hit rate and high freshness are in tension. Background revalidation hides the latency of refreshing the cache, but your Worker still runs once per revalidation — it is not free.
+
+Two common patterns:
+
+- **Mostly static content with a small tolerance for staleness.** Use a short `s-maxage` (for example, 60 seconds) and a longer `stale-while-revalidate` window (for example, 3600 seconds). Most requests are `HIT`s; occasional requests trigger a background refresh.
+- **"Always serve from cache" for high-traffic endpoints.** Use `max-age=0, s-maxage=0, stale-while-revalidate=<large>`. Every request returns the previously cached response immediately and triggers a background refresh. Your Worker runs once per request to revalidate, so CPU costs are close to running the Worker every time. Freshness drops as request volume drops — if no request arrives for a long time, the next request will see stale content.
+
+### Header precedence
+
+When multiple cache headers are present, the most specific wins:
+
+1. `cloudflare-cdn-cache-control` — Cloudflare-specific, highest precedence. Consumed by Cloudflare and stripped from the response returned to clients.
+2. `cdn-cache-control` — standard header for CDN-only directives. Respected by Cloudflare and passed through to downstream CDNs.
+3. `Cache-Control` — standard HTTP header. Respected by Cloudflare and passed through to clients.
+
+Use `cloudflare-cdn-cache-control` when you want a longer edge TTL than you expose to browsers without leaking the directive downstream.
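+
+The precedence order can be sketched as a lookup that returns the first header present. This is an illustrative model under the assumption that the highest-precedence header is used on its own; the `effectiveCacheDirectives` name is hypothetical and this page does not specify whether directives from lower-precedence headers are ever merged.
+
+```ts
+// Header names in precedence order, most specific first (from the list above).
+const PRECEDENCE = [
+	"cloudflare-cdn-cache-control",
+	"cdn-cache-control",
+	"cache-control",
+] as const;
+
+function effectiveCacheDirectives(
+	headers: Headers,
+): { source: string; value: string } | null {
+	for (const name of PRECEDENCE) {
+		// Headers.get() matches names case-insensitively.
+		const value = headers.get(name);
+		if (value !== null) return { source: name, value };
+	}
+	return null;
+}
+```
+
+For a response that sets both `Cache-Control: public, max-age=300` and `Cloudflare-CDN-Cache-Control: max-age=3600`, this model selects the Cloudflare-specific header for the edge, while browsers still receive `Cache-Control`.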
+
+## Response headers
+
+### `Cf-Cache-Status`
+
+Every response carries a `Cf-Cache-Status` header indicating what happened for that request. The most common values are `HIT`, `MISS`, `UPDATING`, and `BYPASS`. For the full set of values and their meanings, refer to [Cloudflare cache responses](/cache/concepts/cache-responses/).
+
+### `Cache-Tag`
+
+The [`Cache-Tag`](/cache/how-to/purge-cache/purge-by-tags/) response header attaches tags to a cached response so you can purge it later in bulk. Cloudflare consumes this header and strips it before the response reaches the client.
+
+<TypeScriptExample filename="src/index.ts">
+
+```ts
+export default {
+	async fetch(request): Promise<Response> {
+		const html = `<!doctype html><title>Post</title>`;
+
+		return new Response(html, {
+			headers: {
+				"Content-Type": "text/html",
+				"Cache-Control": "public, s-maxage=3600",
+				"Cache-Tag": "blog,posts,post-123",
+			},
+		});
+	},
+} satisfies ExportedHandler;
+```
+
+</TypeScriptExample>
+
+The `Cache-Tag` header value is a comma-separated list of tags. For limits on the number, length, and character set of tags, refer to [Cache tag limits](/cache/how-to/purge-cache/purge-by-tags/#a-few-things-to-remember).
+
+### Automatic bypass conditions
+
+Workers Caching inherits Cloudflare's standard [cache bypass rules](/cache/concepts/cache-responses/#bypass). The most common triggers:
+
+- The response includes a `Set-Cookie` header.
+- The request includes an `Authorization` header (the response is treated as private unless `Cache-Control: public` explicitly overrides it).
+- The response `Cache-Control` header includes `private`, `no-store`, or `no-cache`.
+
+When any of these apply, `Cf-Cache-Status` is `BYPASS` and your Worker runs on every request.
+
+### `Vary`
+
+When your Worker returns a `Vary` response header, Cloudflare stores a separate cached variant per distinct combination of the listed request header values, and only returns a variant whose stored values match the incoming request. This implements [RFC 9110](https://www.rfc-editor.org/rfc/rfc9110.html#name-vary) and the cache-key calculation in [RFC 9111](https://www.rfc-editor.org/rfc/rfc9111.html#name-calculating-cache-keys-with). For an introduction with example code, refer to [Content negotiation with `Vary`](/workers/cache/#content-negotiation-with-vary).
+
+How `Vary` is processed for Workers Caching:
+
+- **All header names are honored.** Any header name your Worker lists in `Vary` participates in the variant key. There is no allowlist.
+- **Values are compared verbatim.** Cloudflare does not normalize the listed request headers before keying. `Accept-Encoding: gzip, br` and `Accept-Encoding: br, gzip` produce two separate variants even though they are semantically identical. If you need to fold equivalent values onto the same variant, normalize the headers your Worker sees in a gateway Worker before passing the request on, or canonicalize them inside the Worker that sets `Vary`.
+- **`Vary: *` disables caching.** A wildcard variance cannot be satisfied deterministically from request headers, so the response is treated as uncacheable and `Cf-Cache-Status` is `BYPASS`.
+- **Variants share a single purge identity.** [Purging](/workers/cache/purge/) by tag or path prefix invalidates every variant of a URL together. All variants must therefore use the same `Cache-Tag` values — assigning different tags to different variants results in inconsistent purges.
+- **Image transformation features take precedence.** Responses produced by Polish or Image Resizing already generate their own variants, and `Vary` on those responses is ignored.
+
+### `Accept-Encoding` and `Content-Encoding`
+
+Your Worker controls its own content negotiation. Whatever `Content-Encoding` your Worker sets on the response is what Cloudflare stores and serves to subsequent requests.
+
+If your Worker needs to return different encodings to different clients, you have two options:
+
+- **Pick one canonical encoding inside your Worker.** Decide based on the `Accept-Encoding` request header, encode the body once, and return a single representation. Subsequent requests for that URL hit the same cached entry regardless of what they accept. This produces the highest cache hit rate, but because every client receives the same representation, you must choose an encoding that all of your clients can decode.
+- **Vary on `Accept-Encoding`.** Return a different `Content-Encoding` per request and set `Vary: Accept-Encoding`. Cloudflare stores one variant per distinct `Accept-Encoding` value the Worker has seen. Because comparison is verbatim, clients that send semantically equivalent values in different orders or with different quality factors produce separate variants — keep cache fan-out under control by normalizing `Accept-Encoding` (for example, in a gateway Worker) before the response is generated.
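+
+A minimal sketch of that normalization step is shown below. It assumes the Worker only ever produces Brotli or gzip output and prefers them in that order; the `normalizeAcceptEncoding` name and the `SUPPORTED` list are hypothetical, and the wildcard `*` token is ignored for simplicity.
+
+```ts
+// Assumption: the backend only produces these encodings, in this preference order.
+const SUPPORTED = ["br", "gzip"];
+
+function normalizeAcceptEncoding(header: string | null): string {
+	const accepted = new Set<string>();
+	for (const part of (header ?? "").split(",")) {
+		const [token, ...params] = part.trim().toLowerCase().split(";");
+		const q = params.map((p) => p.trim()).find((p) => p.startsWith("q="));
+		// A quality value of 0 means the client does not accept this encoding.
+		const rejected = q !== undefined && Number(q.slice(2)) === 0;
+		if (token && !rejected) accepted.add(token.trim());
+	}
+	// Collapse every equivalent header onto one canonical token.
+	return SUPPORTED.find((encoding) => accepted.has(encoding)) ?? "identity";
+}
+```
+
+With this in place, `gzip, br` and `br, gzip` both normalize to `br`, so a gateway that overwrites `Accept-Encoding` with the normalized value before forwarding would produce at most three variants per URL.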
diff --git a/src/content/docs/workers/cache/debugging.mdx b/src/content/docs/workers/cache/debugging.mdx
new file mode 100644
index 000000000000000..f3d8b61eb638c4f
--- /dev/null
+++ b/src/content/docs/workers/cache/debugging.mdx
@@ -0,0 +1,104 @@
+---
+title: Debugging
+pcx_content_type: troubleshooting
+description: Diagnose why Workers Caching is not behaving the way you expect.
+sidebar:
+  order: 5
+---
+
+If caching is not behaving the way you expect, the `Cf-Cache-Status` response header is the first place to look. Every response carries it, and its value tells you exactly what happened for that request.
+
+:::note
+Workers Caching is **your Worker's cache**. When debugging, look at your Worker's traces and responses. Zone-level cache controls and dashboards operate on a separate cache and will not affect what Workers Caching stores or serves.
+:::
+
+## Inspect `Cf-Cache-Status`
+
+Send two requests to the same URL and compare the headers:
+
+```sh
+curl -I https://my-worker.example.workers.dev/api/users/42
+curl -I https://my-worker.example.workers.dev/api/users/42
+```
+
+Match the status value against the scenarios below.
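+
+As a rough aid, the scenarios on this page can be condensed into a lookup on the status of the second request. The `hintFor` helper below is a hypothetical sketch that only restates the guidance in the sections that follow; it is not a Cloudflare API.
+
+```ts
+// Maps the Cf-Cache-Status of a repeated request to the checks described on this page.
+function hintFor(status: string | null): string {
+	switch (status?.toUpperCase()) {
+		case undefined:
+			return "Header missing: check the Wrangler version and that cache.enabled is true.";
+		case "HIT":
+			return "Served from cache without running the Worker.";
+		case "UPDATING":
+			return "Served stale while the Worker refreshes the entry in the background.";
+		case "MISS":
+			return "Still a miss: check Cache-Control, the status code, and whether the cache key differs.";
+		case "BYPASS":
+		case "DYNAMIC":
+			return "Not cached: check Cache-Control, the request method, Set-Cookie, and Authorization.";
+		default:
+			return `Refer to the cache responses reference for ${status}.`;
+	}
+}
+
+// Example usage against a deployed Worker (the URL is a placeholder):
+// const res = await fetch("https://my-worker.example.workers.dev/api/users/42");
+// console.log(hintFor(res.headers.get("Cf-Cache-Status")));
+```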
+
+## My Worker runs on every request
+
+If `Cf-Cache-Status` is not present, check that your Wrangler version is 4.69.0 or above and that your
+`wrangler.toml` or `wrangler.jsonc` file sets `cache.enabled` to `true` for that Worker.
+
+If `Cf-Cache-Status` is `MISS` on every request, or is `DYNAMIC` or `BYPASS`, either nothing is being stored or a bypass rule is firing.
+
+**Check your `Cache-Control` header.** The response must carry directives that make it cacheable:
+
+- `public, s-maxage=N` — cached in Cloudflare for `N` seconds.
+- `public, max-age=N` — cached in Cloudflare and browsers for `N` seconds (if `s-maxage` is not set).
+
+A response with no `Cache-Control` header, or with `Cache-Control: private`, `no-store`, or `no-cache`, is not cached. Refer to [Cache-Control](/cache/concepts/cache-control/) for the full directive reference.
+
+**Check the request method.** Only `GET` and `HEAD` requests are cached. Everything else is `BYPASS`.
+
+**Check for automatic bypass conditions.** Cloudflare bypasses the cache when:
+
+- The response includes a `Set-Cookie` header.
+- The request includes an `Authorization` header, unless the response explicitly sets `Cache-Control: public`.
+
+If your Worker unconditionally sets `Set-Cookie` (for example, a session cookie on every response), the response is never cached. Either remove the cookie from cacheable responses, or separate cookie-setting and cacheable responses into different routes.
+
+**Check the status code.** Workers Caching follows [RFC 9111](https://www.rfc-editor.org/rfc/rfc9111). Responses with status codes that are not cacheable by default (for example, `401`, `403`, `500`) are not stored.
+
+## My Worker runs even after the first request
+
+`Cf-Cache-Status` is `MISS` on the first request and still `MISS` on subsequent requests.
+
+**The cache is likely partitioned.** The cache key includes the request path and query string, the target entrypoint, and the invocation's `ctx.props`. Two requests that look the same to you may produce different cache keys if any of these differ.
+
+Common causes:
+
+- The URL path or query string differs between requests (even trailing slashes matter).
+- The calling Worker passes different `ctx.props` for each request — for example, a different user ID.
+- Requests are hitting different [named entrypoints](/workers/runtime-apis/bindings/service-bindings/rpc/#named-entrypoints) of the same Worker.
+
+Cloudflare does not currently expose the cache key composition, so you cannot see the computed key directly. Instead, walk through the components listed in [Cache keys](/workers/cache/cache-keys/#what-goes-into-the-cache-key) and verify each one is the same for both requests.
+
+## My cache does not update after a deployment
+
+By design, deploying a new Worker version does not invalidate cached responses, because the cache key does not include the invoked Worker version. If you want a deployment to remove cached content, you have two options:
+
+- **Call [`ctx.cache.purge({ purgeEverything: true })`](/workers/cache/purge/#purge-everything)** after deploying. This is the simplest approach.
+- **Tag each cached response with the producing version** using the [version metadata binding](/workers/runtime-apis/bindings/version-metadata/), then purge that tag on rollback. Refer to [Version-specific purging](/workers/cache/purge/#version-specific-purging).
+
+## My cache never updates after content changes
+
+If your origin data changed but requests still return stale content:
+
+- **Check the TTL.** The response stays cached for `s-maxage` seconds (or `max-age` if `s-maxage` is absent). You may be looking at a response that is still within its freshness window.
+- **Purge the affected responses.** Use `ctx.cache.purge()` with tags or a path prefix to invalidate specific entries. Refer to [Purging the cache](/workers/cache/purge/).
+- **Add tags at write time.** If you did not set `Cache-Tag` headers, you cannot purge by tag. Add tags to your cached responses, deploy, and once new entries are written they become purgeable.
+
+## Two callers receive each other's cached responses
+
+This should not happen if you use `ctx.props` for per-caller authorization context. If it does, one of the following is true:
+
+- You are authenticating callers with a header or query parameter that is not part of the cache key. Move the authorization input into `ctx.props`. Refer to [Multi-tenant safety with `ctx.props`](/workers/cache/cache-keys/#multi-tenant-safety-with-ctxprops).
+- You expect a user-specific query parameter to separate callers, but the parameter is not actually present on the request sent through the service binding. The query string is part of the cache key; make sure each caller's request URL actually differs.
+
+## `Cf-Cache-Status: UPDATING` appears constantly
+
+`UPDATING` means the response was served from cache while stale and your Worker is running in the background to refresh it. This is expected behavior when using `stale-while-revalidate`.
+
+If you see `UPDATING` more often than you expect:
+
+- Your `s-maxage` is shorter than the typical interval between requests. Every request that arrives after `s-maxage` elapses triggers a revalidation.
+- With `max-age=0, s-maxage=0, stale-while-revalidate=<large>`, **every** request triggers a revalidation. This is "always serve from cache" behavior, not "don't run the Worker." Refer to [Choose TTL and stale-while-revalidate values](/workers/cache/configuration/#choose-ttl-and-stale-while-revalidate-values).
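+
+The arithmetic behind both cases can be sketched as a small model. This is a simplified illustration that assumes the `stale-while-revalidate` window starts when `s-maxage` elapses (the RFC 5861 definition); `classify` and `CacheState` are hypothetical names, not part of Workers Caching.
+
+```ts
+type CacheState = "fresh" | "stale-revalidate" | "expired";
+
+// ageSeconds is the time elapsed since the response was stored.
+function classify(
+	ageSeconds: number,
+	sMaxage: number,
+	staleWhileRevalidate: number,
+): CacheState {
+	// Within s-maxage: served from cache, no Worker run.
+	if (ageSeconds < sMaxage) return "fresh";
+	// Past s-maxage but inside the stale window: served stale while the
+	// Worker runs in the background (the case reported as UPDATING).
+	if (ageSeconds < sMaxage + staleWhileRevalidate) return "stale-revalidate";
+	// Past both windows: the stored response can no longer be served.
+	return "expired";
+}
+```
+
+With `s-maxage=0`, no age is ever smaller than `s-maxage`, so every request inside the stale window lands in the `stale-revalidate` branch.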
+
+## My response is larger than the size limit
+
+If a response is too large to cache, Cloudflare does not store it. You will see `Cf-Cache-Status: MISS` on every request even though the response otherwise looks cacheable.
+
+For per-plan response size limits, refer to [Cacheable size limits](/cache/concepts/default-cache-behavior/#cacheable-size-limits). Note that at launch all Workers Caching responses are subject to the Free plan size limit — refer to [Response size](/workers/cache/limitations/#response-size) for details.
+
+## I need more visibility
+
+At launch, the primary debugging surfaces are the `Cf-Cache-Status` response header and per-invocation cache-hit information in the [Workers observability dashboard](/workers/observability/).
diff --git a/src/content/docs/workers/cache/examples.mdx b/src/content/docs/workers/cache/examples.mdx
new file mode 100644
index 000000000000000..9360824f28d89e5
--- /dev/null
+++ b/src/content/docs/workers/cache/examples.mdx
@@ -0,0 +1,525 @@
+---
+title: Examples
+pcx_content_type: example
+description: Patterns for combining Workers Caching with authentication, request normalization, and Durable Objects.
+sidebar:
+  order: 4
+---
+
+import { TypeScriptExample } from "~/components";
+
+Workers Caching is **a cache that is itself a Worker primitive**. It sits in front of every Worker entrypoint — the default export and every named [`WorkerEntrypoint`](/workers/runtime-apis/bindings/service-bindings/rpc/#named-entrypoints) — and it also sits in front of [`ctx.exports`](/workers/runtime-apis/bindings/service-bindings/rpc/) calls between entrypoints in the same Worker. That second fact is the one that makes the rest of this page possible.
+
+When one entrypoint invokes another via `ctx.exports`, the cache evaluates that call the same way it would evaluate a request from a browser. A hit returns the cached response without the callee running. A miss runs the callee and stores the response under its own cache key, keyed by the callee's entrypoint, path, query string, and [`ctx.props`](/workers/cache/cache-keys/#multi-tenant-safety-with-ctxprops). The caller still runs on every request — but anything the caller hands off to the callee is cacheable independently.
+
+That gives you a primitive you can compose. You can author a Worker as a chain of small entrypoints — auth, normalization, routing, the expensive read, the data layer — and let Workers Caching slot in wherever you want it. Each cached entrypoint is a unit of memoization with its own key, its own TTL, and its own tag namespace for purging. Anything you would want to configure about caching — when it runs, what it keys on, when it invalidates — is expressed as ordinary Worker code: which entrypoint you call, what request you forward, what `ctx.props` you pass, what `Cache-Control` you set.
+
+The examples on this page all use the same shape: an outer entrypoint that runs on every request, plus one or more inner entrypoints that are cached. The outer entrypoint does something cheap (authenticate, rewrite a header, pick a route); the inner entrypoint does something expensive (look up data, transform it, run a Durable Object). They are written as classes in one source file, deployed as one Worker, and billed as one Worker. The cache stage that connects them requires no configuration; you only have to decide where it sits.
+
+## Two rules to keep in mind
+
+Two facts shape every pattern below. They follow directly from "the cache is in front of every entrypoint":
+
+**Strip request headers that would force a bypass.** Cloudflare's standard [bypass rules](/cache/concepts/cache-responses/#bypass) apply to the inner entrypoint's cache too — an `Authorization` header on the forwarded request will turn every inner call into a `BYPASS`, and nothing will ever be stored. When the outer entrypoint authenticates the request and decides it is safe to cache, it must strip `Authorization` (and anything else that triggers automatic bypass) before invoking the inner entrypoint.
+
+**Mark the outer entrypoint as `no-store`.** Because the cache also sits in front of the outer entrypoint, a cacheable response returned from `default.fetch` will itself be cached — and the next request will be served from that outer cache without ever entering your gateway logic. To keep the outer entrypoint running on every request, overlay `Cache-Control: no-store` on the response it returns:
+
+```ts
+const response = await ctx.exports.Inner.fetch(forwarded);
+
+// Prevent the OUTER entrypoint's cache from storing this response.
+// The INNER entrypoint's cache has already done its work above.
+const headers = new Headers(response.headers);
+headers.set("Cache-Control", "no-store");
+
+return new Response(response.body, {
+	status: response.status,
+	statusText: response.statusText,
+	headers,
+});
+```
+
+Every example below applies the `no-store` overlay. The authentication examples also strip `Authorization` before forwarding.
+
+## Cache authenticated responses
+
+Caching authenticated APIs has historically been awkward. The standard [bypass rules](/cache/concepts/cache-responses/#bypass) treat any request with an `Authorization` header as private and refuse to cache it — which is the safe default, but means a token-authenticated endpoint that returns identical responses to thousands of users runs your Worker every single time.
+
+The pattern below lets you authenticate every request and still serve cache hits without running the cacheable handler:
+
+1. The outer (default) entrypoint receives the request and authenticates it.
+2. On success, it strips the `Authorization` header and forwards the request to a named entrypoint via `ctx.exports`.
+3. Workers Caching sits in front of the named entrypoint. On a hit, the cached response is returned to the outer entrypoint, which returns it to the client — without the named entrypoint ever running.
+
+<TypeScriptExample filename="src/index.ts">
+
+```ts
+import { WorkerEntrypoint } from "cloudflare:workers";
+
+interface Env {
+	API_TOKEN: string;
+}
+
+// Cached entrypoint. Workers Caching sits in front of this — on a hit,
+// the cached response is returned and `fetch` below is never invoked.
+export class CachedAPI extends WorkerEntrypoint<Env> {
+	async fetch(request: Request): Promise<Response> {
+		const data = await loadExpensiveData(request);
+
+		return new Response(JSON.stringify(data), {
+			headers: {
+				"Content-Type": "application/json",
+				// All authenticated callers see this same response on a hit.
+				"Cache-Control": "public, s-maxage=60",
+			},
+		});
+	}
+}
+
+// Default entrypoint. Runs on every request to authenticate the caller,
+// then forwards to the cached entrypoint.
+export default {
+	async fetch(request, env, ctx): Promise<Response> {
+		if (!(await authenticate(request, env))) {
+			return new Response("Unauthorized", { status: 401 });
+		}
+
+		// Strip the Authorization header before forwarding. Otherwise the
+		// request would trigger Cloudflare's automatic bypass for
+		// authenticated requests, and nothing would ever be cached.
+		const forwarded = new Request(request);
+		forwarded.headers.delete("Authorization");
+
+		const response = await ctx.exports.CachedAPI.fetch(forwarded);
+
+		// Prevent this outer entrypoint from caching the response itself —
+		// see "Mark the outer entrypoint as `no-store`" above.
+		const headers = new Headers(response.headers);
+		headers.set("Cache-Control", "no-store");
+		return new Response(response.body, {
+			status: response.status,
+			statusText: response.statusText,
+			headers,
+		});
+	},
+} satisfies ExportedHandler<Env>;
+
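+async function authenticate(request: Request, env: Env): Promise<boolean> {
+	const token = request.headers.get("Authorization")?.replace(/^Bearer\s+/, "");
+	// Reject a missing header or an unset secret explicitly, so that
+	// `undefined === undefined` can never authenticate a request.
+	if (!token || !env.API_TOKEN) return false;
+	// Note: `===` is not a constant-time comparison.
+	return token === env.API_TOKEN;
+}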
+
+async function loadExpensiveData(request: Request): Promise<unknown> {
+	// Replace with your real data source — D1, KV, an origin, and so on.
+	return { timestamp: Date.now() };
+}
+```
+
+</TypeScriptExample>
+
+A few things to notice:
+
+- **The cache is in the right place.** It sits between the outer entrypoint and the cached entrypoint, so cache hits skip the expensive work entirely. Only the auth check runs.
+- **`Authorization` is stripped before forwarding.** This is what makes the response cacheable — Cloudflare's bypass rule fires on the inbound request, not on the response, so removing the header before the request reaches the cached entrypoint is what lets the cached entrypoint's `Cache-Control: public` take effect. It also prevents tokens from contributing to any future cache key.
+- **The cached response is shared across users.** Every caller who passes the auth check sees the same cached body.
+
+:::caution
+This example is intentionally minimal to show the shape of the pattern. Sharing a single cached response across every authenticated caller is only correct when the response really is the same for all of them — for example, a public catalog behind a token gate, or an internal endpoint where every caller is equally privileged. If different callers should see different data, the next section shows how to partition the cache per user with `ctx.props`. Workers Caching does not infer per-user isolation for you; you have to ask for it.
+:::
+
+### Per-user authenticated responses
+
+If your endpoint returns user-specific data, pass the user identifier via `ctx.props`. Workers Caching includes `ctx.props` in the cache key, so each user gets their own cache entry and one user can never receive another user's cached response:
+
+<TypeScriptExample filename="src/index.ts">
+
+```ts
+import { WorkerEntrypoint } from "cloudflare:workers";
+
+interface Env {
+	API_TOKEN: string;
+}
+
+interface Props {
+	userId: string;
+}
+
+export class CachedAPI extends WorkerEntrypoint<Env, Props> {
+	async fetch(request: Request): Promise<Response> {
+		// ctx.props.userId is part of the cache key, so this response
+		// is cached separately for every userId.
+		const { userId } = this.ctx.props;
+		const data = await loadUserData(userId);
+
+		return new Response(JSON.stringify(data), {
+			headers: {
+				"Content-Type": "application/json",
+				"Cache-Control": "public, s-maxage=60",
+			},
+		});
+	}
+}
+
+export default {
+	async fetch(request, env, ctx): Promise<Response> {
+		const userId = await authenticate(request, env);
+		if (!userId) {
+			return new Response("Unauthorized", { status: 401 });
+		}
+
+		const forwarded = new Request(request);
+		forwarded.headers.delete("Authorization");
+
+		// Pass the authenticated userId to the cached entrypoint via props.
+		// This becomes part of the cache key.
+		const response = await ctx.exports.CachedAPI.fetch(forwarded, {
+			props: { userId },
+		});
+
+		// Prevent the outer entrypoint from caching this response.
+		const headers = new Headers(response.headers);
+		headers.set("Cache-Control", "no-store");
+		return new Response(response.body, {
+			status: response.status,
+			statusText: response.statusText,
+			headers,
+		});
+	},
+} satisfies ExportedHandler<Env>;
+
+async function authenticate(
+	request: Request,
+	env: Env,
+): Promise<string | null> {
+	// Replace with your real auth — JWT verification, token lookup, and so on.
+	return "user-42";
+}
+
+async function loadUserData(userId: string): Promise<unknown> {
+	return { userId, timestamp: Date.now() };
+}
+```
+
+</TypeScriptExample>
+
+For more on cache isolation between callers, refer to [Multi-tenant safety with `ctx.props`](/workers/cache/cache-keys/#multi-tenant-safety-with-ctxprops).
+
+In this example, the outer entrypoint places a value (the user's identity) into the cache key by passing it through `ctx.props`. The next example uses the same shape to influence a different part of the key.
+
+## Normalize `Accept-Encoding` for `Vary`
+
+[`Vary`](/workers/cache/#content-negotiation-with-vary) lets a single URL cache multiple representations — for example, a Brotli-encoded and gzip-encoded variant of the same asset. Cloudflare keys variants on the **verbatim value** of each `Vary`-listed request header, so two requests with semantically equivalent but textually different `Accept-Encoding` headers produce two separate variants.
+
+For requests that arrive through Cloudflare's network, this matters even more: the `Accept-Encoding` request header your Worker sees has typically been rewritten by Cloudflare to a canonical value (such as `gzip, br`) for cache efficiency. The original value is preserved at [`request.cf.clientAcceptEncoding`](/workers/runtime-apis/request/#incomingrequestcfproperties), but if your Worker varies on `Accept-Encoding` without first restoring the client's original value, every cached variant ends up keyed on the rewritten string. The cache can then return a Brotli variant to clients that only accept gzip, or the other way around.
+
+The fix is a gateway entrypoint that restores `Accept-Encoding` from `request.cf.clientAcceptEncoding` before forwarding to the cached entrypoint:
+
+<TypeScriptExample filename="src/index.ts">
+
+```ts
+import { WorkerEntrypoint } from "cloudflare:workers";
+
+export class CachedAssets extends WorkerEntrypoint {
+	async fetch(request: Request): Promise<Response> {
+		const accept = request.headers.get("Accept-Encoding") ?? "";
+		const wantsBrotli = accept.includes("br");
+
+		const { body, encoding } = wantsBrotli
+			? await loadBrotli(request)
+			: await loadGzip(request);
+
+		return new Response(body, {
+			headers: {
+				"Content-Type": "application/javascript",
+				"Content-Encoding": encoding,
+				"Cache-Control": "public, s-maxage=86400, immutable",
+				// One variant per distinct Accept-Encoding value the cached
+				// entrypoint sees. The gateway below normalizes that value.
+				Vary: "Accept-Encoding",
+			},
+		});
+	}
+}
+
+export default {
+	async fetch(request, env, ctx): Promise<Response> {
+		// On Cloudflare, the eyeball's Accept-Encoding is usually rewritten
+		// to a canonical value before the Worker runs. Restore it from
+		// request.cf.clientAcceptEncoding so the cached entrypoint sees
+		// what the client actually sent — and so Vary keys variants on
+		// the real value.
+		const original = request.cf?.clientAcceptEncoding;
+
+		const forwarded = new Request(request);
+		if (original) {
+			forwarded.headers.set("Accept-Encoding", original);
+		}
+
+		const response = await ctx.exports.CachedAssets.fetch(forwarded);
+
+		// Prevent the outer entrypoint from caching this response — if it
+		// did, future requests would skip the Accept-Encoding restoration
+		// above and the wrong variant could be served.
+		const headers = new Headers(response.headers);
+		headers.set("Cache-Control", "no-store");
+		return new Response(response.body, {
+			status: response.status,
+			statusText: response.statusText,
+			headers,
+		});
+	},
+} satisfies ExportedHandler;
+
+async function loadBrotli(
+	request: Request,
+): Promise<{ body: ArrayBuffer; encoding: string }> {
+	// Replace with your real asset loader (R2, KV, fetch, and so on).
+	return { body: new ArrayBuffer(0), encoding: "br" };
+}
+
+async function loadGzip(
+	request: Request,
+): Promise<{ body: ArrayBuffer; encoding: string }> {
+	return { body: new ArrayBuffer(0), encoding: "gzip" };
+}
+```
+
+</TypeScriptExample>
+
+Things to notice:
+
+- **The gateway runs on every request, but it is small.** It only restores one header and calls `ctx.exports`. The expensive work — picking the encoding, loading the asset — runs only on cache misses.
+- **Variants share a single purge identity.** Purging by tag or path prefix invalidates every variant of a URL together, so all variants must use the same [`Cache-Tag`](/workers/cache/configuration/#cache-tag) values. Refer to the notes in [Content negotiation with `Vary`](/workers/cache/#content-negotiation-with-vary).
+- **The same pattern applies to other normalizable headers.** If you want to vary on `Accept-Language` and you receive a long, complex value from browsers, normalize it in the gateway (for example, fold it down to the primary language tag) before forwarding. This keeps the cache fan-out bounded.
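+
+As a sketch of that last point, the helper below folds an `Accept-Language` value down to one supported primary language tag. It is illustrative only: `primaryLanguage`, the `SUPPORTED` list, and the `FALLBACK` value are assumptions, and the parsing handles only simple `tag;q=value` entries.
+
+```ts
+// Assumption: the languages your application actually serves.
+const SUPPORTED = ["en", "de", "fr"];
+const FALLBACK = "en";
+
+// Hypothetical helper: pick the highest-weighted supported primary tag.
+function primaryLanguage(header: string | null): string {
+	if (!header) return FALLBACK;
+	const candidates = header
+		.split(",")
+		.map((part) => {
+			const [tag, ...params] = part.trim().split(";");
+			const q = params.map((p) => p.trim()).find((p) => p.startsWith("q="));
+			const weight = q ? Number(q.slice(2)) : 1;
+			return {
+				// "de-CH" folds down to "de".
+				lang: tag.trim().toLowerCase().split("-")[0],
+				weight: Number.isNaN(weight) ? 0 : weight,
+			};
+		})
+		.filter((c) => c.weight > 0 && SUPPORTED.includes(c.lang))
+		.sort((a, b) => b.weight - a.weight);
+	return candidates[0]?.lang ?? FALLBACK;
+}
+```
+
+In the gateway, you would set the folded value on the forwarded request with `forwarded.headers.set("Accept-Language", primaryLanguage(request.headers.get("Accept-Language")))`, so the cached entrypoint sees at most one value per supported language.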
+
+If you do not need per-encoding variants — for example, if your Worker always returns Brotli when the client accepts it and otherwise falls back to gzip — you do not need `Vary` at all. Pick a canonical encoding inside the cached entrypoint based on the restored `Accept-Encoding`, and let the cache store a single variant. Refer to [`Accept-Encoding` and `Content-Encoding`](/workers/cache/configuration/#accept-encoding-and-content-encoding) for that variant of the pattern.
+
+So far the inner entrypoint has been a function of the request. The next example puts a stateful component — a Durable Object — behind the same cache stage, with the same shape.
+
+## Cache Durable Object responses
+
+[Durable Objects](/durable-objects/) are never cached directly by Workers Caching — they are stateful, and caching their responses would defeat the point. But many Durable Object endpoints serve read-heavy traffic where a short cache TTL is perfectly acceptable: leaderboards, counters, aggregated stats, configuration that changes a few times an hour.
+
+You can cache those responses by wrapping the Durable Object behind a named entrypoint and letting Workers Caching sit in front of the entrypoint. On a cache hit, the wrapper never runs and the Durable Object is never touched.
+
+<TypeScriptExample filename="src/index.ts">
+
+```ts
+import { DurableObject, WorkerEntrypoint } from "cloudflare:workers";
+
+interface Env {
+	LEADERBOARD: DurableObjectNamespace<Leaderboard>;
+}
+
+// A Durable Object that maintains an expensive-to-compute leaderboard.
+export class Leaderboard extends DurableObject<Env> {
+	async fetch(request: Request): Promise<Response> {
+		const url = new URL(request.url);
+
+		if (url.pathname === "/top") {
+			const top = await this.computeTop();
+			return new Response(JSON.stringify(top), {
+				headers: { "Content-Type": "application/json" },
+			});
+		}
+
+		if (url.pathname === "/record" && request.method === "POST") {
+			const { userId, score } = await request.json<{
+				userId: string;
+				score: number;
+			}>();
+			await this.record(userId, score);
+			return new Response("Recorded");
+		}
+
+		return new Response("Not found", { status: 404 });
+	}
+
+	private async computeTop(): Promise<unknown> {
+		// Pretend this is expensive — a sorted scan of stored state, an
+		// aggregation across many keys, a call to another service.
+		return { top: [], computedAt: Date.now() };
+	}
+
+	private async record(userId: string, score: number): Promise<void> {
+		await this.ctx.storage.put(`score:${userId}`, score);
+	}
+}
+
+// Cached entrypoint. Forwards GET /top to the Durable Object and tags
+// the response so it can be purged when scores change.
+export class CachedLeaderboard extends WorkerEntrypoint<Env> {
+	async fetch(request: Request): Promise<Response> {
+		const id = this.env.LEADERBOARD.idFromName("global");
+		const stub = this.env.LEADERBOARD.get(id);
+		const response = await stub.fetch(request);
+
+		// Copy the body and headers into a new Response so we can attach
+		// cache headers. The DO's body stream is consumed once here.
+		return new Response(response.body, {
+			status: response.status,
+			headers: {
+				...Object.fromEntries(response.headers),
+				"Cache-Control": "public, s-maxage=30",
+				"Cache-Tag": "leaderboard",
+			},
+		});
+	}
+}
+
+// Default entrypoint. Routes reads through the cached entrypoint
+// and writes directly to the Durable Object, purging the cache on write.
+export default {
+	async fetch(request, env, ctx): Promise<Response> {
+		const url = new URL(request.url);
+
+		if (request.method === "GET" && url.pathname === "/top") {
+			// Read path — goes through Workers Caching. On a hit,
+			// CachedLeaderboard never runs and the Durable Object
+			// is never touched.
+			const response = await ctx.exports.CachedLeaderboard.fetch(request);
+
+			// Prevent the outer entrypoint from caching this response —
+			// otherwise the router itself would be bypassed and writes
+			// could not invalidate it.
+			const headers = new Headers(response.headers);
+			headers.set("Cache-Control", "no-store");
+			return new Response(response.body, {
+				status: response.status,
+				statusText: response.statusText,
+				headers,
+			});
+		}
+
+		if (request.method === "POST" && url.pathname === "/record") {
+			// Write path — bypass the cached entrypoint, hit the Durable
+			// Object directly, then invalidate the cached leaderboard
+			// so the next read returns fresh data.
+			const id = env.LEADERBOARD.idFromName("global");
+			const stub = env.LEADERBOARD.get(id);
+			const result = await stub.fetch(request);
+
+			await ctx.cache.purge({ tags: ["leaderboard"] });
+
+			return result;
+		}
+
+		return new Response("Not found", { status: 404 });
+	},
+} satisfies ExportedHandler<Env>;
+```
+
+</TypeScriptExample>
+
+Why this works:
+
+- **Reads pay nothing on a cache hit.** Workers Caching sits in front of `CachedLeaderboard`, so a hit returns the cached body without invoking the wrapper, without invoking the Durable Object, and without doing the expensive aggregation. The default entrypoint still runs to dispatch the request, but it is a thin router.
+- **Writes invalidate the cache immediately.** The POST handler updates the Durable Object and then calls [`ctx.cache.purge({ tags: ["leaderboard"] })`](/workers/cache/purge/#purge-by-tag). The very next GET misses the cache, reruns the wrapper, and stores a fresh response.
+- **The cached entrypoint owns the cache contract.** All cache-control headers are set in `CachedLeaderboard`, including the `Cache-Tag` that lets the writer invalidate things. The Durable Object stays unaware of caching.
+
+If you have many independent Durable Object instances — for example, one per tenant — pass the tenant identifier via `ctx.props` when invoking the cached entrypoint, the same way [Per-user authenticated responses](#per-user-authenticated-responses) does. Each tenant gets its own cache entry, and a purge on one tenant does not invalidate any other.
+
+## Cache an origin you do not control
+
+Sometimes the origin you depend on is not yours. A third-party API, a SaaS endpoint, a public dataset, a vendor service behind a slow CDN — its caching headers are whatever the owner decided to ship, and you cannot change them. Maybe it sends `Cache-Control: no-store` to be safe. Maybe it sends nothing at all. Maybe it caches aggressively in a way that does not match your application's read patterns. Either way, you pay the latency and the request cost on every call.
+
+Workers Caching lets you put your own cache layer in front of that origin without changing anything on the origin side. The pattern is the same outer-plus-inner shape as the rest of this page: a thin entrypoint that forwards to the origin, with Workers Caching sitting in front of it and applying the `Cache-Control` directives you choose. The origin keeps its own caching contract with the rest of the world; your Worker just adds a second, user-controlled layer between your application and that origin.
+
+<TypeScriptExample filename="src/index.ts">
+
+```ts
+import { WorkerEntrypoint } from "cloudflare:workers";
+
+const ORIGIN = "https://api.example.com";
+
+// Cached entrypoint. Fetches the upstream origin and overlays your own
+// Cache-Control on the response. Workers Caching sits in front of this,
+// so on a hit the upstream origin is never contacted.
+export class CachedOrigin extends WorkerEntrypoint {
+	async fetch(request: Request): Promise<Response> {
+		const url = new URL(request.url);
+		const upstream = new URL(url.pathname + url.search, ORIGIN);
+
+		// Forward the request to the third-party origin. The origin's own
+		// caching headers (or lack of them) are about to be overwritten —
+		// they apply to the origin's relationship with the public internet,
+		// not to your cache layer.
+		const response = await fetch(upstream, {
+			method: request.method,
+			headers: request.headers,
+			body: request.body,
+		});
+
+		// Replace the origin's Cache-Control with your own. This is the
+		// whole point of the pattern: you decide how long Workers Caching
+		// stores this response, regardless of what the origin says.
+		const headers = new Headers(response.headers);
+		headers.set("Cache-Control", "public, s-maxage=300");
+		headers.set("Cache-Tag", "origin:example");
+
+		return new Response(response.body, {
+			status: response.status,
+			statusText: response.statusText,
+			headers,
+		});
+	}
+}
+
+// Default entrypoint. Forwards every request through the cached entrypoint.
+export default {
+	async fetch(request, env, ctx): Promise<Response> {
+		const response = await ctx.exports.CachedOrigin.fetch(request);
+
+		// Prevent the outer entrypoint from caching this response — the
+		// inner entrypoint's cache has already done its work.
+		const headers = new Headers(response.headers);
+		headers.set("Cache-Control", "no-store");
+		return new Response(response.body, {
+			status: response.status,
+			statusText: response.statusText,
+			headers,
+		});
+	},
+} satisfies ExportedHandler;
+```
+
+</TypeScriptExample>
+
+What is happening here:
+
+- **The cache layer is yours.** The origin's `Cache-Control` is replaced before the response reaches Workers Caching, so the TTL, freshness directives, and `Cache-Tag` namespace are all controlled by your code. You decide when the cache holds onto a response, and you decide when to purge it via [`ctx.cache.purge()`](/workers/cache/purge/).
+- **The origin's own caching model is untouched.** Your Worker is the only thing that sees the rewritten `Cache-Control`. The origin still serves its other clients with whatever caching contract it published. You have not changed its behavior or its security model; you have only added a layer in front of it for your application.
+- **Cache hits never touch the origin.** Workers Caching sits in front of `CachedOrigin`, so a hit returns the stored response without invoking `fetch` against the upstream. This is what cuts the origin request volume and the latency of every cached call.
+
+A few common extensions to this pattern:
+
+- **Per-resource TTLs.** If different paths on the upstream should have different freshness, branch on `url.pathname` inside `CachedOrigin` and set a different `s-maxage` (and a different `Cache-Tag`) for each. The cache key already includes the path and query string, so each resource gets its own entry.
+- **Per-user caching.** If your application authenticates the caller and the upstream returns user-specific data, authenticate in the outer entrypoint and pass the user identifier via `ctx.props` to `CachedOrigin` — the same shape as [Per-user authenticated responses](#per-user-authenticated-responses). Each user gets their own cache entry, and one user can never receive another user's cached response.
+- **Stale-while-revalidate.** If the origin is slow or flaky, set `Cache-Control: public, s-maxage=60, stale-while-revalidate=600` on the cached response. Most requests return the cached body immediately, and Workers Caching refreshes the cached response from the origin in the background. Refer to [Use `stale-while-revalidate` for low-latency refreshes](/workers/cache/configuration/#use-stale-while-revalidate-for-low-latency-refreshes).
+- **Targeted invalidation.** Tag responses with `Cache-Tag` values that reflect your application's data model (for example, `Cache-Tag: origin:example, product:42`). When you know the upstream has changed — a webhook fires, an admin action runs — call `ctx.cache.purge({ tags: ["product:42"] })` and the next request repopulates the cache.
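+
+The per-resource TTL extension can be sketched as a lookup from path to header values. The path prefixes, TTLs, and tag names below are illustrative assumptions, and `policyFor` is a hypothetical helper rather than part of Workers Caching.
+
+```ts
+interface CachePolicy {
+	cacheControl: string;
+	cacheTag: string;
+}
+
+// Hypothetical routing table: adjust prefixes and TTLs to your upstream.
+function policyFor(pathname: string): CachePolicy {
+	if (pathname.startsWith("/products/")) {
+		return {
+			cacheControl: "public, s-maxage=300",
+			cacheTag: "origin:example,products",
+		};
+	}
+	if (pathname.startsWith("/prices/")) {
+		// Short freshness, long stale window for a fast-changing resource.
+		return {
+			cacheControl: "public, s-maxage=30, stale-while-revalidate=600",
+			cacheTag: "origin:example,prices",
+		};
+	}
+	return { cacheControl: "public, s-maxage=60", cacheTag: "origin:example" };
+}
+```
+
+Inside `CachedOrigin`, the two `headers.set(...)` calls would then take their values from `policyFor(url.pathname)` instead of fixed strings.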
+
+This is the same building block as every other example on this page. The only difference is that the "expensive work" the cached entrypoint does on a miss is a `fetch` to somebody else's server. The control over how long that response lives, how it is keyed, and when it is invalidated stays entirely in your Worker.
+
+## Composing the patterns
+
+All four examples are the same architecture seen through four lenses:
+
+| Outer entrypoint          | What the cache stage is doing                 | Inner entrypoint                                  |
+| ------------------------- | --------------------------------------------- | ------------------------------------------------- |
+| Authenticate the request  | Caching an expensive computation per user     | Loads or computes the user's data                 |
+| Restore `Accept-Encoding` | Caching one variant per real encoding         | Loads the correctly-encoded asset                 |
+| Route reads vs. writes    | Caching reads, invalidating them on writes    | Wraps a Durable Object behind a `Cache-Tag`       |
+| Forward the request as-is | Caching a third-party origin under your terms | Fetches the upstream and overlays `Cache-Control` |
+
+The only thing that changes between rows is what the outer entrypoint does before the call and what the inner entrypoint does on a miss. The cache stage in the middle is the same primitive every time — keyed by the inner entrypoint, the request path and query string, and `ctx.props`; configured by the inner entrypoint's `Cache-Control` and `Cache-Tag`; invalidated by `ctx.cache.purge()` from whichever entrypoint owns the data.
+
+That uniformity is what makes the patterns compose. Nothing stops you from stacking them in a single Worker:
+
+- An outer entrypoint that authenticates and routes.
+- A normalization entrypoint that strips tracking query parameters, restores `Accept-Encoding`, and shapes the request into a canonical form.
+- A cached entrypoint that fronts a Durable Object, tagged for purging.
+- A separate cached entrypoint for an unauthenticated public endpoint, also reachable through the same outer entrypoint, with its own cache key and `Cache-Tag` namespace.
+
+Each call between these entrypoints goes through its own cache stage. The chain is built out of the same three building blocks — `WorkerEntrypoint`, `ctx.exports`, and a `Cache-Control` header — and the cache is a stage of the chain rather than a separate system bolted on. Whatever you would have configured in a cache rules engine, you now write as code: which entrypoint runs, what request gets forwarded, what props get passed, what `Cache-Control` gets returned, what gets purged.
+
+There is no fixed list of patterns. Workers Caching gives you a cache between every Worker entrypoint — what you build with that is up to you.
diff --git a/src/content/docs/workers/cache/index.mdx b/src/content/docs/workers/cache/index.mdx
new file mode 100644
index 000000000000000..4c41ee574cfbf2d
--- /dev/null
+++ b/src/content/docs/workers/cache/index.mdx
@@ -0,0 +1,370 @@
+---
+title: Workers Caching
+pcx_content_type: concept
+description: Workers Caching lets you cache Worker responses to reduce latency and Workers usage.
+sidebar:
+  order: 10
+  label: Overview
+---
+
+import {
+ DirectoryListing,
+ TypeScriptExample,
+ WranglerConfig,
+} from "~/components";
+
+Workers Caching lets Cloudflare return cached HTTP responses from your Worker without executing your Worker code. When an incoming request matches a cached response, Cloudflare serves the response directly from its edge cache — reducing latency and Workers CPU usage.
+
+Caching works for eyeball requests (requests from browsers and API clients), requests sent through [service bindings](/workers/runtime-apis/bindings/service-bindings/), and loopback calls via [`ctx.exports`](/workers/runtime-apis/bindings/service-bindings/rpc/). You control caching with standard HTTP `Cache-Control` directives on your responses.
+
+## Your Worker's cache
+
+Workers Caching is **your Worker's cache**. It is owned by your Worker, operated by your Worker, and private to your Worker.
+
+A Worker is a zoneless entity — a Worker can be bound to any number of zones, run on `workers.dev`, or be invoked entirely through service bindings without ever touching a zone. The cache follows the Worker, not a zone, so:
+
+- **No zone configuration for caching applies to Workers Caching.** [Cache Rules](/cache/how-to/cache-rules/), [Cache Response Rules](/cache/how-to/cache-response-rules/), Page Rules, cache level settings, the zone's default cached-file-extensions list, and every other zone-level cache control have no effect on a Worker's cache.
+- **Your Worker is in full control.** You set `Cache-Control` headers on your responses, and Cloudflare honors them per [RFC 9111](https://www.rfc-editor.org/rfc/rfc9111). That is the entire configuration surface.
+- **The cache is shared across every way the Worker can be invoked.** A Worker that is bound to `api.example.com` and `api.example.net`, and is also invoked over a service binding, serves the same cached responses on all three paths. The cache is keyed by the entrypoint, the request path and query string, and `ctx.props`, not by hostname. See [Cache keys](/workers/cache/cache-keys/).
+
+### The Worker is the configuration surface
+
+A Worker is **already infinitely customizable**. You can change response bodies, rewrite headers, branch on any request attribute, call out to other Workers via service bindings or [`ctx.exports`](/workers/runtime-apis/bindings/service-bindings/rpc/), and compose logic across an entire system.
+
+Workers Caching leans on that. Instead of introducing a separate configuration layer for caching behavior, it lets your Worker express that intent directly — through the `Cache-Control` headers it returns, the `ctx.props` it accepts, and the programmatic purges it issues. Anything you might want to configure about caching, you can configure in code:
+
+- Want a longer TTL for certain paths? Branch on the path in your Worker and set a different `s-maxage`.
+- Want to strip a tracking query parameter before caching? Rewrite the URL or `ctx.props` in a gateway Worker before dispatching.
+- Want per-tenant cache partitioning? Set the tenant identifier in `ctx.props` — that is in the cache key.
+- Want to bypass the cache for authenticated users? Return `Cache-Control: private`, or rely on the [automatic bypass](/cache/concepts/cache-responses/#bypass) triggered by `Set-Cookie` and `Authorization`.
+
+The Worker you already wrote is the configuration mechanism. Workers Caching runs in front of it and honors whatever headers the Worker returns.
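+
+A minimal sketch of the first bullet, assuming a hypothetical policy table: `TTL_BY_PREFIX`, `cacheControlFor`, and the specific prefixes and TTL values are illustrative choices, not defaults of Workers Caching.
+
+```ts
+// Hypothetical policy: path prefixes mapped to shared-cache TTLs in seconds.
+// The first matching prefix wins, so list more specific prefixes first.
+const TTL_BY_PREFIX: Array<[string, number]> = [
+	["/static/", 86400],
+	["/api/", 60],
+];
+
+// Returns the Cache-Control value for a request path. Paths that match no
+// prefix are marked private so that a shared cache does not store them.
+function cacheControlFor(pathname: string): string {
+	for (const [prefix, ttl] of TTL_BY_PREFIX) {
+		if (pathname.startsWith(prefix)) {
+			return `public, s-maxage=${ttl}`;
+		}
+	}
+	return "private";
+}
+```
+
+In a `fetch` handler, the returned string would be set as the `Cache-Control` header, for example with `cacheControlFor(new URL(request.url).pathname)`.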
+
+## When caching helps
+
+Caching is a good fit for Workers that:
+
+- Perform CPU-intensive work whose result can be reused across requests — content generation, template rendering, data transformation.
+- Fetch data from a slow origin or third-party API and want to absorb that latency for subsequent requests.
+- Power a server-rendered or statically generated site where many requests produce identical responses.
+
+Caching is not useful for per-user responses that change on every request, state-changing requests (`POST`, `PUT`, `DELETE`), or responses that must be computed fresh every time.
+
+## How it works
+
+With caching enabled, Cloudflare checks the cache before running your Worker. On a hit, the cached response is returned directly. On a miss, your Worker runs, and if the response is cacheable per its `Cache-Control` header, Cloudflare stores it for the next request.
+
+```mermaid
+flowchart LR
+    accTitle: Cache before a Worker request flow
+    accDescr: Request arrives at Cloudflare, cache is consulted before Worker execution.
+
+    Request["Request"] --> Cache{"Cache"}
+    Cache -- Hit --> Response["Cached response returned"]
+    Cache -- Miss --> Worker["Worker runs"]
+    Worker --> Store["Response stored in cache"]
+    Store --> Response2["Response returned"]
+```
+
+## Tiered cache
+
+Workers Caching is **tiered by default**. Cloudflare operates two layers of cache for your Worker:
+
+- **Lower tier** — a cache in the Cloudflare data center closest to the eyeball. Every data center that receives traffic for your Worker has its own lower-tier cache.
+- **Upper tier** — a smaller set of data centers that every lower tier consults on a miss. The upper tier aggregates cache fills across the whole network.
+
+A request is served from the lower tier if it is a hit there. If it is a miss, the lower tier asks the upper tier. If the upper tier also misses, your Worker finally runs to generate the response — and that response is stored in **both** tiers on the way back out, so subsequent requests from any data center benefit.
+
+```mermaid
+flowchart LR
+    accTitle: Tiered cache for Workers
+    accDescr: A request hits the lower-tier cache first, then the upper-tier cache, then the Worker.
+
+    Request["Request"] --> Lower{"Lower-tier cache<br/>(near eyeball)"}
+    Lower -- Hit --> Response["Cached response returned"]
+    Lower -- Miss --> Upper{"Upper-tier cache"}
+    Upper -- Hit --> Lower
+    Upper -- Miss --> Worker["Worker runs"]
+    Worker --> Upper
+```
+
+This is the same topology that powers [Tiered Cache](/cache/how-to/tiered-cache/) for zones, applied automatically to your Worker. You do not configure it, and the tiering runs regardless of whether your Worker uses [Smart Placement](#smart-placement-and-the-cache).
+
+**Why this matters:** the first request for a given cache key anywhere on Earth populates the upper tier. While that entry stays fresh, later requests from any Cloudflare data center can be served from the upper tier without running your Worker, even if the lower tier at that location has never seen the request before. Cache hit ratios are therefore substantially higher than with a single flat cache layer.
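+
+The lookup order can be illustrated with a toy in-memory model. This is only a sketch for building intuition, not how Cloudflare implements tiering: the `TieredCacheModel` class and the location names are invented for the example, and expiry, revalidation, and request coalescing are deliberately left out.
+
+```ts
+// Toy in-memory model of a two-tier read-through cache. This is not how
+// Cloudflare implements tiering; it only illustrates the lookup order.
+class TieredCacheModel {
+	private upper = new Map<string, string>();
+	private lower = new Map<string, Map<string, string>>();
+	workerRuns = 0;
+
+	// `location` names the data center that received the request and
+	// `runWorker` stands in for executing the Worker on a full miss.
+	get(location: string, key: string, runWorker: () => string): string {
+		let local = this.lower.get(location);
+		if (!local) {
+			local = new Map<string, string>();
+			this.lower.set(location, local);
+		}
+
+		const lowerHit = local.get(key);
+		if (lowerHit !== undefined) return lowerHit;
+
+		let value = this.upper.get(key);
+		if (value === undefined) {
+			// Full miss: the Worker runs once and the result fills the upper tier.
+			value = runWorker();
+			this.workerRuns++;
+			this.upper.set(key, value);
+		}
+
+		// Fill the lower tier on the way back to the client.
+		local.set(key, value);
+		return value;
+	}
+}
+
+const model = new TieredCacheModel();
+model.get("lisbon", "/posts", () => "rendered page");
+model.get("tokyo", "/posts", () => "rendered page");
+model.get("tokyo", "/posts", () => "rendered page");
+console.log(model.workerRuns); // 1
+```
+
+In this model the request from the second location is answered by the upper tier, and the repeat request is answered by that location's lower tier, so the stand-in Worker runs only once.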
+
+## Quickstart
+
+This quickstart walks you through enabling caching, deploying, and observing the cache in action.
+
+### 1. Enable caching in your Wrangler configuration
+
+<WranglerConfig>
+
+```jsonc
+{
+ "name": "my-worker",
+ "main": "src/index.ts",
+ "compatibility_date": "$today",
+ "cache": {
+  "enabled": true,
+ },
+}
+```
+
+</WranglerConfig>
+
+### 2. Return a cacheable response from your Worker
+
+Use `s-maxage` to control how long Cloudflare caches each response:
+
+<TypeScriptExample filename="src/index.ts">
+
+```ts
+export default {
+ async fetch(request): Promise<Response> {
+  const body = JSON.stringify({
+   timestamp: new Date().toISOString(),
+   random: Math.random(),
+  });
+
+  return new Response(body, {
+   headers: {
+    "Content-Type": "application/json",
+    // Cache for 1 hour; serve stale for up to 5 minutes while revalidating.
+    "Cache-Control": "public, s-maxage=3600, stale-while-revalidate=300",
+   },
+  });
+ },
+} satisfies ExportedHandler;
+```
+
+</TypeScriptExample>
+
+### 3. Deploy and observe the cache
+
+Deploy your Worker:
+
+```sh
+npx wrangler deploy
+```
+
+Then send two requests and look at the `Cf-Cache-Status` response header:
+
+```sh
+curl -I https://my-worker.example.workers.dev/
+```
+
+```txt title="First request — expected"
+HTTP/2 200
+cache-control: public, s-maxage=3600, stale-while-revalidate=300
+cf-cache-status: MISS
+```
+
+```sh
+curl -I https://my-worker.example.workers.dev/
+```
+
+```txt title="Second request — expected"
+HTTP/2 200
+cache-control: public, s-maxage=3600, stale-while-revalidate=300
+cf-cache-status: HIT
+```
+
+The second request is served from cache. To confirm that your Worker did not run, fetch the body twice with `curl` (without `-I`): the `timestamp` and `random` values are identical across both responses, even though the Worker generates fresh ones on every run.
+
+## What gets cached
+
+- All HTTP invocations of the Worker are eligible for caching, including eyeball requests, service binding calls, and loopback calls via `ctx.exports`.
+- Only `GET` and `HEAD` requests are cached. Other methods always invoke your Worker.
+- Cacheability is determined by the response headers your Worker returns. Workers Caching follows the semantics defined in [RFC 9111](https://www.rfc-editor.org/rfc/rfc9111). Refer to [Cache-Control](/cache/concepts/cache-control/) for the full list of directives Cloudflare respects.
+- Cloudflare's standard [cache bypass conditions](/cache/concepts/cache-responses/#bypass) apply. In particular, responses with a `Set-Cookie` header and requests with an `Authorization` header trigger automatic bypass.
+- [Preview URLs](/workers/configuration/previews/) are supported. Each preview caches independently of your production deployment, so testing a cache-affecting change in a preview never touches production's cached responses.
+- [Workers for Platforms](/cloudflare-for-platforms/workers-for-platforms/) is supported. Each user Worker has its own cache, isolated from the dispatcher and from other user Workers in the namespace.
+
+The `Cf-Cache-Status` response header tells you what happened for each request (for example, `HIT`, `MISS`, `UPDATING`, or `BYPASS`). Refer to [Cloudflare cache responses](/cache/concepts/cache-responses/) for the full set of values.
+
+## Content negotiation with `Vary`
+
+Workers Caching honors the [`Vary`](https://www.rfc-editor.org/rfc/rfc9110.html#name-vary) response header as defined in [RFC 9110](https://www.rfc-editor.org/rfc/rfc9110.html) and [RFC 9111](https://www.rfc-editor.org/rfc/rfc9111.html#name-calculating-cache-keys-with). When your Worker returns a `Vary` header, Cloudflare stores a separate cached variant per distinct combination of the listed request header values, and only returns a cached variant when the incoming request's headers match the ones the variant was stored under.
+
+This lets a single URL cache multiple representations — for example, different encodings, different content types, or different languages — without your Worker coordinating content negotiation by hand:
+
+<TypeScriptExample filename="src/index.ts">
+
+```ts
+export default {
+ async fetch(request): Promise<Response> {
+  const accept = request.headers.get("Accept") ?? "";
+  const wantsWebp = accept.includes("image/webp");
+
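+  // fetchWebpImage() and fetchJpegImage() are placeholders for your own
+  // image-loading logic; they are not defined in this snippet.
+  const body = wantsWebp
+   ? await fetchWebpImage()
+   : await fetchJpegImage();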
+
+  return new Response(body, {
+   headers: {
+    "Content-Type": wantsWebp ? "image/webp" : "image/jpeg",
+    "Cache-Control": "public, s-maxage=3600",
+    // Cache a separate variant per distinct Accept header value.
+    Vary: "Accept",
+   },
+  });
+ },
+} satisfies ExportedHandler;
+```
+
+</TypeScriptExample>
+
+Notes:
+
+- `Vary: *` disables caching for the response. A wildcard variance cannot be satisfied deterministically from request headers, so Cloudflare does not store the response.
+- Variants share a single cache entry for purge purposes — [purging](/workers/cache/purge/) a tag or path prefix that matches any variant invalidates all variants of that URL. All variants of a URL must therefore use the same `Cache-Tag` values.
+- `Vary` is not compatible with image-transformation features that already produce their own variants (Polish, Image Resizing). Responses rewritten by those features ignore `Vary`.
+- Variants are stored per exact request-header value. Clients that send semantically equivalent but textually different values — for example `Accept-Encoding: gzip, br` and `Accept-Encoding: br, gzip` — produce separate variants. Shape the headers your Worker sees (for example, by normalizing them in a gateway Worker before passing the request on) if you need to reduce variant fan-out.
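+
+One way to reduce variant fan-out is to canonicalize the header before the cache sees it. The sketch below is a simplified illustration: `SUPPORTED_ENCODINGS` and `normalizeAcceptEncoding` are hypothetical names, the supported list is an assumption about what your Worker can serve, and wildcard (`*`) and weighted preferences other than `q=0` are ignored.
+
+```ts
+// Encodings this hypothetical Worker can serve, in the order they should
+// appear in the normalized header.
+const SUPPORTED_ENCODINGS = ["br", "gzip"];
+
+// Reduces an Accept-Encoding header to a canonical string so that
+// "gzip, br" and "br, gzip" map to the same cached variant.
+function normalizeAcceptEncoding(header: string | null): string {
+	const accepted = new Set<string>();
+	for (const part of (header ?? "").split(",")) {
+		const [name, ...params] = part.trim().toLowerCase().split(";");
+		// Skip encodings the client explicitly refuses with q=0.
+		const refused = params.some((p) => /^\s*q=0(\.0{0,3})?\s*$/.test(p));
+		if (name && !refused) accepted.add(name.trim());
+	}
+	return SUPPORTED_ENCODINGS.filter((enc) => accepted.has(enc)).join(", ");
+}
+```
+
+A gateway Worker could write the normalized value back onto the forwarded request before calling the cached entrypoint, so that only a handful of distinct `Accept-Encoding` values ever reach the cache.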
+
+## Caching between Workers
+
+When one Worker calls another over a [service binding](/workers/runtime-apis/bindings/service-bindings/), the **callee's** cache is consulted. If the callee has caching enabled and has a matching cached response, the caller receives it without invoking the callee.
+
+```mermaid
+flowchart LR
+    accTitle: Cache between Workers
+    accDescr: Worker A calls Worker B; Worker B's cache is consulted before Worker B runs.
+
+    Request["Request"] --> WorkerA["Worker A"]
+    WorkerA --> CacheB{"Worker B's cache"}
+    CacheB -- Hit --> WorkerA
+    CacheB -- Miss --> WorkerB["Worker B"]
+    WorkerB --> CacheB
+```
+
+The cache key for service binding calls includes the caller's [`ctx.props`](/workers/runtime-apis/bindings/service-bindings/rpc/#ctxprops), so different callers with different authorization context are cached separately. For details, refer to [Cache keys](/workers/cache/cache-keys/).
+
+## Cache Durable Object responses
+
+[Durable Objects](/durable-objects/) are never cached directly by Workers Caching. However, since Workers Caching runs in front of any Worker entrypoint, you can cache a Durable Object's HTTP responses by wrapping the Durable Object behind a [named Worker entrypoint](/workers/runtime-apis/bindings/service-bindings/rpc/#named-entrypoints) and caching the entrypoint.
+
+The wrapper entrypoint forwards the request into the Durable Object and sets `Cache-Control` on the response it returns. Because Workers Caching sits in front of the entrypoint, subsequent requests are served from cache without re-entering the Durable Object:
+
+<TypeScriptExample filename="src/index.ts">
+
+```ts
+import { WorkerEntrypoint } from "cloudflare:workers";
+
+interface Env {
+ COUNTER: DurableObjectNamespace;
+}
+
+// Cached entrypoint. Requests to this entrypoint are served from cache
+// when possible; on a miss, the Durable Object is invoked and its
+// response is stored.
+export class CachedCounter extends WorkerEntrypoint<Env> {
+ async fetch(request: Request): Promise<Response> {
+  const id = this.env.COUNTER.idFromName("global");
+  const stub = this.env.COUNTER.get(id);
+  const response = await stub.fetch(request);
+
+  // Copy the headers into a new, mutable Headers object and overwrite
+  // Cache-Control, so any value set by the Durable Object is replaced
+  // rather than merged.
+  const headers = new Headers(response.headers);
+  headers.set("Cache-Control", "public, s-maxage=30");
+
+  return new Response(response.body, {
+   status: response.status,
+   headers,
+  });
+ }
+}
+
+// Default entrypoint. Delegates to the cached entrypoint via ctx.exports,
+// which routes through the cache.
+export default {
+ async fetch(request, env, ctx): Promise<Response> {
+  return ctx.exports.CachedCounter.fetch(request);
+ },
+} satisfies ExportedHandler<Env>;
+```
+
+</TypeScriptExample>
+
+## Smart Placement and the cache
+
+[Smart Placement](/workers/configuration/placement/) changes **where your Worker runs**, typically moving execution closer to a slow origin or database. It does not move the cache. Workers Caching always has a lower tier near the eyeball and an upper tier aggregating the network, exactly as described in [Tiered cache](#tiered-cache) above, whether or not Smart Placement is enabled.
+
+The cache is always consulted before Smart Placement is considered. Concretely:
+
+- **Lower-tier hit:** the response is returned from the data center nearest the eyeball. Your Worker does not run. Smart Placement is not consulted.
+- **Lower-tier miss, upper-tier hit:** the response is returned from the upper tier. Your Worker does not run. Smart Placement is not consulted.
+- **Both tiers miss:** Smart Placement routes execution of your Worker to the placement target (for example, near your origin). The resulting response is stored in both cache tiers on the way back to the eyeball.
+
+Importantly, the **upper tier and the Smart Placement target are independent locations**. The upper tier is chosen by Cloudflare to aggregate cache fills across the network; the Smart Placement target is chosen to minimize latency between your Worker and its backend. They are generally not in the same data center.
+
+```mermaid
+flowchart LR
+    accTitle: Tiered cache with Smart Placement across three locations
+    accDescr: The eyeball, the upper-tier cache, and the Smart Placement target are three independent locations. Requests traverse them in order on a full cache miss.
+
+    subgraph EyeballColo["Data center near eyeball"]
+        Request["Request"] --> Lower{"Lower-tier cache"}
+    end
+
+    subgraph UpperColo["Upper-tier data center"]
+        Upper{"Upper-tier cache"}
+    end
+
+    subgraph PlacedColo["Smart Placement target"]
+        Placed["Worker runs"]
+        Origin["Origin / backend"]
+        Placed <--> Origin
+    end
+
+    Lower -- Hit --> Response["Response"]
+    Lower -- Miss --> Upper
+    Upper -- Hit --> Lower
+    Upper -- Miss --> Placed
+    Placed --> Upper
+```
+
+On a full cache miss, a request therefore traverses three locations: the lower-tier data center near the eyeball, the upper-tier data center, and the Smart Placement target. The cache tiers absorb this cost: while a cached response stays fresh, the slow trip to the placement target is paid once per cache key for the whole network, because the upper tier shields the placement target from every lower-tier miss.
+
+:::note[Future optimization]
+
+Because the upper tier and the Smart Placement target are chosen independently today, a cache miss pays for an extra hop between them. Future updates may bring these choices closer together — for example, by co-locating the upper tier with the placement target, or by having a single placement decision account for both — so that full cache misses incur one wide-area trip instead of two.
+
+:::
+
+## Purging the cache
+
+Your Worker can invalidate its own cache at any time using `ctx.cache.purge()`. Tags are the most flexible mechanism — tag responses with `Cache-Tag` when returning them, and purge those tags later:
+
+<TypeScriptExample filename="src/index.ts">
+
+```ts
+export default {
+ async fetch(request, env, ctx): Promise<Response> {
+  await ctx.cache.purge({ tags: ["blog-posts"] });
+  return new Response("Purged", { status: 200 });
+ },
+} satisfies ExportedHandler;
+```
+
+</TypeScriptExample>
+
+You can also [import `cache` from `cloudflare:workers`](/workers/cache/purge/#two-ways-to-call-purge) and call `cache.purge({...})` when you do not have `ctx` in scope — for example, from a utility module. For all purge modes and patterns, refer to [Purging the cache](/workers/cache/purge/).
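+
+A purge by tag only matches responses that carried the tag when they were stored. The sketch below shows the tagging side under stated assumptions: `taggedJson` is a hypothetical helper, and the tag names and TTL are illustrative.
+
+```ts
+// Builds a cacheable JSON response that carries the given cache tags.
+// Cache-Tag takes a comma-separated list of tag names.
+function taggedJson(data: unknown, tags: string[]): Response {
+	return new Response(JSON.stringify(data), {
+		headers: {
+			"Content-Type": "application/json",
+			"Cache-Control": "public, s-maxage=3600",
+			"Cache-Tag": tags.join(","),
+		},
+	});
+}
+
+const response = taggedJson({ title: "Hello" }, ["blog-posts", "post-123"]);
+console.log(response.headers.get("Cache-Tag")); // blog-posts,post-123
+```
+
+Tagging each response with both a broad tag and a specific one lets a later purge target either a whole collection or a single item.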
+
+## Billing
+
+Requests to a Worker with caching enabled are billed at the standard [Workers request rate](/workers/platform/pricing/) whether the response comes from cache or your Worker. **CPU time is only billed when your Worker runs** — cache hits do not consume CPU time.
+
+| Request type                                                            | Request charge | CPU time charge |
+| ----------------------------------------------------------------------- | -------------- | --------------- |
+| Cache `HIT` (Worker does not run)                                       | Standard rate  | Not billed      |
+| Cache `MISS` (Worker runs)                                              | Standard rate  | Billed          |
+| Cache `BYPASS` (Worker runs)                                            | Standard rate  | Billed          |
+| [Static asset request](/workers/static-assets/billing-and-limitations/) | Free           | Not billed      |
+
+For an example, refer to [Pricing example: Worker with caching](/workers/platform/pricing/#example-5-worker-with-caching).
+
+## Next steps
+
+<DirectoryListing />
diff --git a/src/content/docs/workers/cache/limitations.mdx b/src/content/docs/workers/cache/limitations.mdx
new file mode 100644
index 000000000000000..83250bd23e952eb
--- /dev/null
+++ b/src/content/docs/workers/cache/limitations.mdx
@@ -0,0 +1,95 @@
+---
+title: Limitations
+pcx_content_type: reference
+description: Current limitations, unsupported scenarios, and how Workers Caching relates to other Cloudflare caches.
+sidebar:
+  order: 6
+---
+
+This page lists the scenarios where Workers Caching does not apply, followed by notes on how it relates to other caches you may already be using.
+
+## Unsupported scenarios
+
+### HTTP methods
+
+Only `GET` and `HEAD` requests are cached. `POST`, `PUT`, `PATCH`, `DELETE`, and other methods always invoke your Worker.
+
+If you need to cache responses to requests that use other methods, do so explicitly in your Worker. For example, hash the request body into a synthetic URL and make an internal `GET` subrequest.
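+
+The hashing step could look like the sketch below. It covers only the derivation of the synthetic URL: `syntheticUrlFor` and the `/__body/` path prefix are hypothetical, and how the original body reaches the code that handles the internal `GET` is left out.
+
+```ts
+// Derives a synthetic GET URL from a request body by hashing it with SHA-256.
+// Identical bodies produce identical URLs, so they share one cache entry.
+async function syntheticUrlFor(baseUrl: string, body: string): Promise<string> {
+	const bytes = new TextEncoder().encode(body);
+	const digest = await crypto.subtle.digest("SHA-256", bytes);
+	const hex = Array.from(new Uint8Array(digest))
+		.map((b) => b.toString(16).padStart(2, "0"))
+		.join("");
+	const url = new URL(baseUrl);
+	url.pathname = `/__body/${hex}`;
+	return url.toString();
+}
+```
+
+Because the URL depends only on the body's bytes, semantically equal bodies that differ in whitespace or key order produce different URLs. Canonicalize the body first if that matters for your hit ratio.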
+
+### Other invocation types
+
+Workers Caching only applies to HTTP requests handled by a `fetch` handler on a [Worker entrypoint](/workers/runtime-apis/bindings/service-bindings/rpc/#named-entrypoints). The following invocation types always run without cache involvement:
+
+- [Cron Triggers](/workers/configuration/cron-triggers/) — scheduled invocations via the `scheduled` handler.
+- [Queue consumers](/queues/configuration/javascript-apis/#consumer) — messages delivered via the `queue` handler.
+- [Workflows](/workflows/) — workflow step execution.
+- [Tail Workers](/workers/observability/logs/tail-workers/) — trace event handlers.
+- [Durable Objects](/durable-objects/) — Durable Object invocations are never cached, regardless of handler or method. To cache a Durable Object's HTTP responses, wrap it behind a Worker entrypoint with caching enabled. See [Cache Durable Object responses](/workers/cache/#cache-durable-object-responses).
+
+### Purge by host
+
+There is no "purge by host" mode. The cache [belongs to the Worker, not to a domain](/workers/cache/cache-keys/#the-cache-belongs-to-the-worker-not-to-a-domain) — the host is not part of the cache key, so purging by host would not map onto anything the cache stores. Use [purge by tag](/workers/cache/purge/#purge-by-tag), [purge by path prefix](/workers/cache/purge/#purge-by-path-prefix), or [`purgeEverything`](/workers/cache/purge/#purge-everything) instead.
+
+### Cache pre-warming
+
+There is no API to pre-populate the cache with responses generated at build time. A response enters the cache only after your Worker has generated it in response to a request. If you need pre-rendered content available to the first requester, use [Static Assets](/workers/static-assets/).
+
+## Limits
+
+### Response size
+
+Response size limits are the same as for Cloudflare's zone cache. For per-plan limits, refer to [Cacheable size limits](/cache/concepts/default-cache-behavior/#cacheable-size-limits).
+
+:::caution
+At launch, all Workers Caching responses are subject to the Free plan size limit regardless of your account's plan. This restriction is temporary and will be lifted in a future update, after which the limits in [Cacheable size limits](/cache/concepts/default-cache-behavior/#cacheable-size-limits) will apply based on your account.
+:::
+
+### `Cache-Tag` limits
+
+Limits on the number, length, and character set of `Cache-Tag` values are the same as for Cloudflare's zone cache. Refer to [Cache tag limits](/cache/how-to/purge-cache/purge-by-tags/#a-few-things-to-remember) for the full list.
+
+### Purge rate limits
+
+`ctx.cache.purge()` uses the same rate-limiting system as the zone purge API. Refer to [Availability and limits](/cache/how-to/purge-cache/#availability-and-limits) for the rates that apply to your account.
+
+## Relationship to other caches
+
+### Zone-level cache configuration
+
+Workers Caching is **your Worker's cache**, not your zone's cache. It uses your Worker itself as the configuration surface, so there is no separate layer of rules or settings to configure alongside it. None of the following zone-level features applies to Workers Caching; the table lists what to do in your Worker instead:
+
+| Zone-level feature                                                                                        | Equivalent in Workers Caching                                                                                                                                                                                                                              |
+| --------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| [Cache Rules](/cache/how-to/cache-rules/) and [Cache Response Rules](/cache/how-to/cache-response-rules/) | Set `Cache-Control` headers in your Worker, or branch on the request and return different headers per path.                                                                                                                                                |
+| [Cache key customization in Cache Rules](/cache/how-to/cache-rules/settings/#cache-key)                   | Workers Caching has its own key composition; see [Cache keys](/workers/cache/cache-keys/). Shape the key by shaping the request (for example, by rewriting the URL or setting `ctx.props` in a gateway Worker).                                            |
+| Zone-level cache level settings (bypass / standard / aggressive / ignore query string)                    | `Cache-Control` headers on the response express the same intent at a per-request level.                                                                                                                                                                    |
+| The zone's default cached-file-extensions list                                                            | Workers Caching caches any response whose headers say it is cacheable, regardless of file extension.                                                                                                                                                       |
+| Custom tiered cache topologies                                                                            | Workers Caching uses a generic tiered cache topology by default. Because a Worker can execute anywhere, a fixed custom topology does not apply — future integrations with [Smart Placement](/workers/configuration/placement/) may tailor tiering further. |
+| [Rulesets](/ruleset-engine/) that modify request or response before cache                                 | Transform the request or response in your Worker's code before returning it.                                                                                                                                                                               |
+
+To influence your Worker's cache, change your Worker. `Cache-Control` headers, `ctx.props`, service binding composition, and [`ctx.cache.purge()`](/workers/cache/purge/) cover the configuration surface.
+
+### Cache API (`caches.default`)
+
+The [Cache API](/workers/runtime-apis/cache/) is a separate programmatic cache store. It is independent of Workers Caching: operations on one do not affect the other, and `ctx.cache.purge()` invalidates only Workers Caching entries.
+
+For new Workers, prefer Workers Caching. The Cache API, by design, is a lower-level primitive:
+
+- It does not read through — responses are only cached when your Worker explicitly calls `put()`, and every request still executes your Worker on the way in.
+- It does not coalesce concurrent requests for the same resource.
+- It does not participate in [tiered caching](/cache/how-to/tiered-cache/).
+
+Workers Caching provides all three automatically. The Cache API remains useful when you need fine-grained programmatic control.
+
+### `fetch()` subrequest caching
+
+Workers Caching is a server-side cache **in front of** your Worker. It is unrelated to the cache controls on outgoing [`fetch()`](/workers/runtime-apis/fetch/) subrequests (`cf.cacheTtl`, `cf.cacheKey`, `cf.cacheEverything`), which govern the cache for requests your Worker makes to its own origins.
+
+The two caches operate independently: a `fetch()` subrequest hit saves a trip to your origin, while a Workers Caching hit saves your Worker from running at all.
+
+## Coming soon
+
+The following surfaces are in development:
+
+- **Dashboard UI** for enabling caching without Wrangler.
+- **Cache Analytics in Workers Observability**.
diff --git a/src/content/docs/workers/cache/purge.mdx b/src/content/docs/workers/cache/purge.mdx
new file mode 100644
index 000000000000000..cca40438c473dc8
--- /dev/null
+++ b/src/content/docs/workers/cache/purge.mdx
@@ -0,0 +1,348 @@
+---
+title: Purging the cache
+pcx_content_type: how-to
+description: Invalidate cached responses using ctx.cache.purge() — purge by tag, by path prefix, or purge everything.
+sidebar:
+  order: 3
+---
+
+import { TypeScriptExample, WranglerConfig } from "~/components";
+
+Your Worker can invalidate its own cached responses at any time using the purge API. Purging is useful when data changes and the new value is more important than the performance benefit of continuing to serve the cached response — for example, after a content update, a user action, or a webhook from an upstream system.
+
+Because Workers Caching is **your Worker's cache**, purging is scoped to the Worker that owns the cache. Within a Worker, purges are further scoped to the [entrypoint](/workers/runtime-apis/bindings/service-bindings/rpc/#named-entrypoints) that called `purge()`. A Worker cannot reach into another Worker's cache, an entrypoint cannot reach into another entrypoint's cache, and no zone-level purge (via the dashboard, [API](/cache/how-to/purge-cache/), or Terraform) affects Workers Caching content.
+
+## Two ways to call purge
+
+There are two equivalent ways to trigger a purge from inside your Worker:
+
+- **`ctx.cache.purge(...)`** — available on the execution context passed to every handler. Use this when you already have `ctx` in scope.
+- **`cache.purge(...)`** — imported from `cloudflare:workers`. Use this when you want to call purge from code that does not receive `ctx` — for example, a utility module shared across multiple handlers, or a framework adapter that does not thread the execution context through its internals.
+
+Both forms call into the same API and behave identically. Pick whichever reads more cleanly for your code.
+
+<TypeScriptExample filename="src/index.ts">
+
+```ts
+import { cache } from "cloudflare:workers";
+
+export default {
+	async fetch(request, env, ctx): Promise<Response> {
+		// Using the module import — no need to thread ctx through helper functions.
+		await cache.purge({ tags: ["blog-posts"] });
+
+		// Equivalent, using ctx directly:
+		// await ctx.cache.purge({ tags: ["blog-posts"] });
+
+		return new Response("Purged", { status: 200 });
+	},
+} satisfies ExportedHandler;
+```
+
+</TypeScriptExample>
+
+The remaining examples on this page use `ctx.cache.purge(...)` because they already have `ctx` in scope. If you prefer the import form, substitute `cache.purge(...)`; nothing else changes.
+
+## Purge modes
+
+`purge()` accepts one or more of the following fields:
+
+| Field             | Purges                                                                                         | Scope          |
+| ----------------- | ---------------------------------------------------------------------------------------------- | -------------- |
+| `tags`            | Every cached response tagged with one of the given values via `Cache-Tag`.                     | Per entrypoint |
+| `pathPrefixes`    | Every cached response whose request path starts with one of the given prefixes.                | Per entrypoint |
+| `purgeEverything` | Every cached response for the entrypoint that called `purge()`.                                | Per entrypoint |
+
+All three modes are scoped to the [entrypoint](/workers/runtime-apis/bindings/service-bindings/rpc/#named-entrypoints) that called `purge()`. A purge from `PublicAPI` does not affect cached responses stored by `AdminAPI`, even if they share tag names or path prefixes. To invalidate across every entrypoint of a Worker, call `purge()` from each entrypoint.
+
+The returned promise resolves to a result object you can inspect to confirm success or handle failures — see [Return value](#return-value).
+
+Purge after a write by calling `ctx.cache.purge()` at the end of any handler that mutates data:
+
+<TypeScriptExample filename="src/index.ts">
+
+```ts
+export default {
+	async fetch(request, env, ctx): Promise<Response> {
+		if (request.method === "POST") {
+			const body = await request.json<{ postId: string }>();
+
+			// Mutate your data source (D1, KV, an origin, and so on), then invalidate
+			// every cached response tagged for this post.
+			await ctx.cache.purge({
+				tags: [`post-${body.postId}`, "post-list"],
+			});
+
+			return new Response("Updated", { status: 200 });
+		}
+
+		// Handle cacheable reads here.
+		return new Response("Hello", {
+			headers: { "Cache-Control": "public, s-maxage=3600" },
+		});
+	},
+} satisfies ExportedHandler;
+```
+
+</TypeScriptExample>
+
+{/* TODO(zaidoon): Confirm whether multiple fields can be combined in a single call (e.g., `{ tags: [...], pathPrefixes: [...] }`) or whether each call must use a single field. */}
+
+## Purge by tag
+
+Tags are attached to responses via the `Cache-Tag` response header, and purged later by name. This is the most flexible and commonly used purge method.
+
+### Attach tags on write
+
+<TypeScriptExample filename="src/index.ts">
+
+```ts
+export default {
+	async fetch(request): Promise<Response> {
+		const url = new URL(request.url);
+		const postId = url.pathname.split("/").pop() ?? "unknown";
+		const body = { id: postId, title: `Post ${postId}` };
+
+		return new Response(JSON.stringify(body), {
+			headers: {
+				"Content-Type": "application/json",
+				"Cache-Control": "public, s-maxage=3600",
+				"Cache-Tag": `post,post-${postId},blog`,
+			},
+		});
+	},
+} satisfies ExportedHandler;
+```
+
+</TypeScriptExample>
+
+The `Cache-Tag` header value is a comma-separated list of tags. Cloudflare strips this header before returning the response to clients.
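+
+Because the header is comma-separated, a tag that itself contains a comma would be read as two tags. The following is a minimal sketch of a helper that joins tags and rejects that case. `cacheTagHeader` is a hypothetical name, and the checks cover only the comma rule described above, not the full set of tag limits:
+
+<TypeScriptExample filename="src/tags.ts">
+
+```ts
+// Hypothetical helper: builds a Cache-Tag header value from a list of tags.
+function cacheTagHeader(tags: string[]): string {
+	// Drop duplicates while keeping first-seen order.
+	const unique = Array.from(new Set(tags));
+	for (const tag of unique) {
+		// A comma separates tags, so it cannot appear inside one.
+		if (tag === "" || tag.includes(",")) {
+			throw new Error(`Invalid cache tag: "${tag}"`);
+		}
+	}
+	return unique.join(",");
+}
+```
+
+</TypeScriptExample>
+
+For example, `cacheTagHeader(["post", "post-42", "post"])` returns `post,post-42`.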
+
+### Trigger the purge
+
+<TypeScriptExample filename="src/admin.ts">
+
+```ts
+export default {
+	async fetch(request, env, ctx): Promise<Response> {
+		const postId = new URL(request.url).searchParams.get("id");
+		if (!postId) return new Response("Missing id", { status: 400 });
+
+		await ctx.cache.purge({ tags: [`post-${postId}`] });
+
+		return new Response("Purged", { status: 200 });
+	},
+} satisfies ExportedHandler;
+```
+
+</TypeScriptExample>
+
+### Tag scope across entrypoints
+
+Tags are scoped to the entrypoint that called `purge()`. A tag named `user-42` applied to responses in two different entrypoints is **not** invalidated by a single `purge({ tags: ["user-42"] })` call — it only affects the entrypoint the call originated from. If you need to invalidate the same tag across several entrypoints, call `purge()` from each entrypoint, or centralize purge calls in a shared entrypoint that caches every response you later need to invalidate.
+
+### Use hierarchical tags
+
+To invalidate groups of related responses in one call, tag each response with multiple tags representing every level of hierarchy it belongs to — sometimes called "soft tags":
+
+<TypeScriptExample filename="src/index.ts">
+
+```ts
+export default {
+	async fetch(request): Promise<Response> {
+		const path = new URL(request.url).pathname;
+
+		// Build a list of hierarchical tags for the current path.
+		// A response at /blog/2025/02/hello gets tags for:
+		//   _path:/blog/, _path:/blog/2025/, _path:/blog/2025/02/, _path:/blog/2025/02/hello/
+		const segments = path.split("/").filter(Boolean);
+		const tags = segments.map(
+			(_, i) => `_path:/${segments.slice(0, i + 1).join("/")}/`,
+		);
+
+		const body = `<!doctype html><title>${path}</title>`;
+
+		return new Response(body, {
+			headers: {
+				"Content-Type": "text/html",
+				"Cache-Control": "public, s-maxage=3600",
+				"Cache-Tag": tags.join(","),
+			},
+		});
+	},
+} satisfies ExportedHandler;
+```
+
+</TypeScriptExample>
+
+Purging the tag `_path:/blog/2025/` then invalidates every cached response that carries it, which with this tagging scheme is every response whose request path starts with `/blog/2025/`.
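+
+As a standalone sketch, the tag-building step can be pulled into a helper so you can check which tags a given path receives. `hierarchicalTags` and the `_path:` prefix are illustrative names, not part of the Workers API:
+
+<TypeScriptExample filename="src/tags.ts">
+
+```ts
+// Hypothetical helper: returns one tag per level of the path hierarchy.
+function hierarchicalTags(path: string): string[] {
+	// "/blog/2025/02/hello" -> ["blog", "2025", "02", "hello"]
+	const segments = path.split("/").filter(Boolean);
+	// Each tag covers the path up to and including segment i.
+	return segments.map(
+		(_, i) => `_path:/${segments.slice(0, i + 1).join("/")}/`,
+	);
+}
+```
+
+</TypeScriptExample>
+
+`hierarchicalTags("/blog/2025/02/hello")` returns four tags, from `_path:/blog/` through `_path:/blog/2025/02/hello/`. Every tag ends with a slash, so purging `_path:/blog/2025/` does not touch a sibling such as `/blog/2025-archive/`.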
+
+For limits on the number, length, and character set of tags, refer to [Cache tag limits](/cache/how-to/purge-cache/purge-by-tags/#a-few-things-to-remember).
+
+### Version-specific purging
+
+Workers Caching does not partition the cache by Worker version — a cached response written by version A may still be served after version B is deployed, and purging is not version-specific by default. If you need to purge the cached entries that a specific version wrote, tag each response with the version that produced it and purge that tag later.
+
+Add the [version metadata binding](/workers/runtime-apis/bindings/version-metadata/) to your Wrangler configuration:
+
+<WranglerConfig>
+
+```jsonc
+{
+	"name": "my-worker",
+	"main": "src/index.ts",
+	"compatibility_date": "$today",
+	"cache": { "enabled": true },
+	"version_metadata": { "binding": "CF_VERSION_METADATA" },
+}
+```
+
+</WranglerConfig>
+
+Then add a version tag to each response:
+
+<TypeScriptExample filename="src/index.ts">
+
+```ts
+interface Env {
+	CF_VERSION_METADATA: WorkerVersionMetadata;
+}
+
+export default {
+	async fetch(request, env, ctx): Promise<Response> {
+		const { id: versionId } = env.CF_VERSION_METADATA;
+		const postId = new URL(request.url).pathname.split("/").pop() ?? "unknown";
+
+		return new Response(JSON.stringify({ id: postId }), {
+			headers: {
+				"Content-Type": "application/json",
+				"Cache-Control": "public, s-maxage=3600",
+				// Include the version ID as a tag so you can purge by version later.
+				"Cache-Tag": `post,post-${postId},v:${versionId}`,
+			},
+		});
+	},
+} satisfies ExportedHandler<Env>;
+```
+
+</TypeScriptExample>
+
+When you want to invalidate everything a specific version wrote — for example, after a rollback — purge the version tag:
+
+<TypeScriptExample filename="src/admin.ts">
+
+```ts
+export default {
+	async fetch(request, env, ctx): Promise<Response> {
+		const versionId = new URL(request.url).searchParams.get("version");
+		if (!versionId) return new Response("Missing version", { status: 400 });
+
+		await ctx.cache.purge({ tags: [`v:${versionId}`] });
+
+		return new Response("Purged", { status: 200 });
+	},
+} satisfies ExportedHandler;
+```
+
+</TypeScriptExample>
+
+## Purge by path prefix
+
+`pathPrefixes` invalidates every cached response whose **request path** begins with one of the given prefixes:
+
+<TypeScriptExample filename="src/index.ts">
+
+```ts
+export default {
+	async fetch(request, env, ctx): Promise<Response> {
+		// Invalidate everything under /blog/2025/ for the current entrypoint.
+		await ctx.cache.purge({
+			pathPrefixes: ["/blog/2025/"],
+		});
+
+		return new Response("Purged", { status: 200 });
+	},
+} satisfies ExportedHandler;
+```
+
+</TypeScriptExample>
+
+Entries in `pathPrefixes` are **paths**, not full URLs. Do not include a scheme or host: `https://example.com/blog/` will not match. Start each prefix with a leading slash so that it matches from the root of the request path.
+
+`pathPrefixes` is scoped to the entrypoint that makes the purge call. A `purge({ pathPrefixes: ["/blog/"] })` from `PublicAPI` will not affect cached responses stored by `AdminAPI`, even when their paths also start with `/blog/`.
+
+### Purge a single URL
+
+There is no dedicated "purge by URL" mode. To invalidate a single cached URL, pass its path as a single-element `pathPrefixes` array:
+
+<TypeScriptExample filename="src/index.ts">
+
+```ts
+export default {
+	async fetch(request, env, ctx): Promise<Response> {
+		// Invalidate the cached response for /blog/2026/hello-world
+		// (and any cached path that extends it, because this is a prefix match).
+		await ctx.cache.purge({
+			pathPrefixes: ["/blog/2026/hello-world"],
+		});
+
+		return new Response("Purged", { status: 200 });
+	},
+} satisfies ExportedHandler;
+```
+
+</TypeScriptExample>
+
+Because `pathPrefixes` matches on the start of the request path, passing the full path matches that path and any path that extends it (for example, `/blog/2026/hello-world-2`). If you need exact-match semantics with no risk of over-purging, use a [tag](#purge-by-tag) instead.
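+
+One way to get exact-match behavior is to tag each response with a tag derived from its full path and purge that tag. The sketch below rests on assumptions: `exactPathTag` and the `url:` prefix are hypothetical names, and `matchesPrefix` only models the prefix matching described above; it is not the platform's implementation.
+
+<TypeScriptExample filename="src/tags.ts">
+
+```ts
+// Hypothetical helper: a tag that identifies exactly one request path.
+function exactPathTag(path: string): string {
+	return `url:${path}`;
+}
+
+// Models prefix matching as described above (illustration only).
+function matchesPrefix(path: string, prefix: string): boolean {
+	return path.startsWith(prefix);
+}
+
+const targetPath = "/blog/2026/hello-world";
+const siblingPath = "/blog/2026/hello-world-2";
+
+// A prefix purge for the target also matches the sibling path.
+const prefixOverPurges = matchesPrefix(siblingPath, targetPath);
+
+// The two paths produce different tags, so a tag purge leaves the sibling alone.
+const tagOverPurges = exactPathTag(siblingPath) === exactPathTag(targetPath);
+```
+
+</TypeScriptExample>
+
+To use this approach, include `exactPathTag(url.pathname)` in the `Cache-Tag` header when you write the response, then call `ctx.cache.purge({ tags: [exactPathTag("/blog/2026/hello-world")] })`. Paths that contain commas or other characters outside the tag limits would need to be encoded first.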
+
+## Purge everything
+
+Invalidate every cached response stored by the calling entrypoint:
+
+<TypeScriptExample filename="src/index.ts">
+
+```ts
+export default {
+	async fetch(request, env, ctx): Promise<Response> {
+		await ctx.cache.purge({ purgeEverything: true });
+
+		return new Response("Purged", { status: 200 });
+	},
+} satisfies ExportedHandler;
+```
+
+</TypeScriptExample>
+
+Use this sparingly. After a purge of everything, requests miss the cache until it is refilled, which temporarily increases load on your Worker and any upstream services it calls.
+
+## Purge propagation
+
+Purges triggered by `ctx.cache.purge()` use Cloudflare's [Instant Purge](/cache/how-to/purge-cache/) infrastructure and propagate globally with the same guarantees as zone-level purges.
+
+## Return value
+
+`purge()` resolves to a result object. Check `success` to confirm the purge was accepted, and inspect `errors` if it was not:
+
+<TypeScriptExample filename="src/index.ts">
+
+```ts
+export default {
+	async fetch(request, env, ctx): Promise<Response> {
+		const result = await ctx.cache.purge({ tags: ["blog-posts"] });
+
+		if (!result.success) {
+			console.error("Cache purge failed", result.errors);
+			return new Response("Purge failed", { status: 500 });
+		}
+
+		return new Response("Purged", { status: 200 });
+	},
+} satisfies ExportedHandler;
+```
+
+</TypeScriptExample>
+
+On failure, each error in `errors` carries a numeric `code` and a human-readable `message` you can log or surface to the caller.
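+
+If you log purge failures in several places, a small formatter keeps the output consistent. This is a sketch only: the `PurgeResult` and `PurgeError` interfaces are assumed shapes based on the fields described on this page, and `describePurge` is a hypothetical helper.
+
+<TypeScriptExample filename="src/purge-log.ts">
+
+```ts
+// Assumed shapes, based on the fields described on this page.
+interface PurgeError {
+	code: number;
+	message: string;
+}
+
+interface PurgeResult {
+	success: boolean;
+	errors: PurgeError[];
+}
+
+// Hypothetical helper: summarizes a purge outcome as a single log line.
+function describePurge(result: PurgeResult): string {
+	if (result.success) return "purge accepted";
+	const details = result.errors
+		.map((e) => `[${e.code}] ${e.message}`)
+		.join("; ");
+	return `purge failed: ${details}`;
+}
+```
+
+</TypeScriptExample>
+
+You could then write `console.error(describePurge(result))` in place of logging the raw `errors` array.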
+
+## Rate limits
+
+`purge()` uses the same rate-limiting system as Cloudflare's zone purge API. For the rate limits that apply to your account's plan, refer to [Availability and limits](/cache/how-to/purge-cache/#availability-and-limits). When a purge is rate-limited, `success` is `false` and `errors` contains an entry describing the rejection.
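+
+If a rejected purge must eventually go through, one option is to retry it with a short backoff. The sketch below is an assumption-heavy illustration: `purgeWithRetry` is a hypothetical helper, `PurgeResult` is an assumed shape, and because this page does not specify which error `code` signals rate limiting, the helper retries on any failure. Keep the attempt count low so retries do not add to the pressure on the rate limit.
+
+<TypeScriptExample filename="src/purge-retry.ts">
+
+```ts
+// Assumed shape, based on the fields described on this page.
+interface PurgeResult {
+	success: boolean;
+	errors: { code: number; message: string }[];
+}
+
+// Hypothetical helper: retries a failed purge with exponential backoff.
+// `attempt` wraps the real call, for example () => ctx.cache.purge({ tags }).
+async function purgeWithRetry(
+	attempt: () => Promise<PurgeResult>,
+	maxAttempts = 3,
+	baseDelayMs = 100,
+): Promise<PurgeResult> {
+	let result = await attempt();
+	for (let i = 1; i < maxAttempts && !result.success; i++) {
+		// Wait baseDelayMs, then 2x, then 4x, and so on before trying again.
+		const delayMs = baseDelayMs * 2 ** (i - 1);
+		await new Promise((resolve) => setTimeout(resolve, delayMs));
+		result = await attempt();
+	}
+	// Either the first successful result or the last failure.
+	return result;
+}
+```
+
+</TypeScriptExample>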
diff --git a/src/content/docs/workers/platform/pricing.mdx b/src/content/docs/workers/platform/pricing.mdx
index d306aaec2558390..4a7bf7c575d6b75 100644
--- a/src/content/docs/workers/platform/pricing.mdx
+++ b/src/content/docs/workers/platform/pricing.mdx
@@ -27,9 +27,9 @@ All [Pages Functions](/pages/functions/) are billed as Workers. All pricing and
 
 Users on the Workers Paid plan have access to the Standard usage model. Workers Enterprise accounts are billed based on the usage model specified in their contract. To switch to the Standard usage model, contact your Account Manager.
 
-|              | Requests<sup>1, 2, 3</sup>                                         | Duration                        | CPU time                                                                                                                                                                                                                                                                                                                                                                                                  |
-| ------------ | ------------------------------------------------------------------ | ------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| **Free**     | 100,000 per day                                                    | No charge for duration          | 10 milliseconds of CPU time per invocation                                                                                                                                                                                                                                                                                                                                                                |
+|              | Requests<sup>1, 2, 3, 4</sup>                                      | Duration                        | CPU time                                                                                                                                                                                                                                                                                                                                                                                                        |
+| ------------ | ------------------------------------------------------------------ | ------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| **Free**     | 100,000 per day                                                    | No charge for duration          | 10 milliseconds of CPU time per invocation                                                                                                                                                                                                                                                                                                                                                                      |
 | **Standard** | 10 million included per month <br /> +$0.30 per additional million | No charge or limit for duration | 30 million CPU milliseconds included per month<br /> +$0.02 per additional million CPU milliseconds<br /><br/> Max of [5 minutes of CPU time](/workers/platform/limits/#account-plan-limits) per invocation (default: 30 seconds)<br /> Max of 15 minutes of CPU time per [Cron Trigger](/workers/configuration/cron-triggers/) or [Queue Consumer](/queues/configuration/javascript-apis/#consumer) invocation |
 
 <sup>1</sup> Inbound requests to your Worker. Cloudflare does not bill for
@@ -41,6 +41,11 @@ WebSocket messages routed through a Worker do not count as requests.
 
 <sup>3</sup> Requests to static assets are free and unlimited.
 
+<sup>4</sup> When [Workers Caching](/workers/cache/) is enabled, requests served
+from the Worker's cache are billed at the same per-request rate as requests that
+invoke the Worker. CPU time is only billed when the Worker runs (on a cache miss
+or bypass).
+
 ### Example pricing
 
 #### Example 1
@@ -97,6 +102,19 @@ A high traffic Worker that serves 100 million requests per month, and uses an av
 | **CPU time**     | $13.40        | ((7 ms of CPU time per request \* 100,000,000 requests) - 30,000,000 included CPU ms) / 1,000,000 \* $0.02 |
 | **Total**        | $45.40        |                                                                                                            |
 
+#### Example 5: Worker with caching
+
+The same Worker as Example 4, but with [Workers Caching](/workers/cache/) enabled and an 80% cache hit rate. 80 million requests are served from cache and 20 million invoke the Worker. Cache hits count as requests but do not consume CPU time.
+
+|                  | Monthly Costs | Formula                                                                                                   |
+| ---------------- | ------------- | --------------------------------------------------------------------------------------------------------- |
+| **Subscription** | $5.00         |                                                                                                           |
+| **Requests**     | $27.00        | (100,000,000 requests - 10,000,000 included requests) / 1,000,000 \* $0.30                                |
+| **CPU time**     | $2.20         | ((7 ms of CPU time per request \* 20,000,000 requests) - 30,000,000 included CPU ms) / 1,000,000 \* $0.02 |
+| **Total**        | $34.20        |                                                                                                           |
+
+For details on what is cached and how to enable caching, refer to [Cache](/workers/cache/).
+
 :::note[Custom limits]
 
 To prevent accidental runaway bills or denial-of-wallet attacks, configure the maximum amount of CPU time that can be used per invocation by [defining limits in your Worker's Wrangler file](/workers/wrangler/configuration/#limits), or via the Cloudflare dashboard (**Workers & Pages** > Select your Worker > **Settings** > **CPU Limits**).
diff --git a/src/content/docs/workers/reference/how-the-cache-works.mdx b/src/content/docs/workers/reference/how-the-cache-works.mdx
index 53aad4dd28b3f81..1dfd4e13106a262 100644
--- a/src/content/docs/workers/reference/how-the-cache-works.mdx
+++ b/src/content/docs/workers/reference/how-the-cache-works.mdx
@@ -12,6 +12,12 @@ By allowing developers to write to the cache, Workers provide a way to customize
 
 Cloudflare Workers run before the cache but can also be utilized to modify assets once they are returned from the cache. Modifying assets returned from cache allows for the ability to sign or personalize responses while also reducing load on an origin and reducing latency to the end user by serving assets from a nearby location.
 
+:::note
+This page describes how Workers interact with a **zone's** Cloudflare Cache — for example, when a Worker runs on a zone with Cache Rules configured, or when a Worker uses the [Cache API](/workers/runtime-apis/cache/) or `fetch()` to store and retrieve responses.
+
+To cache responses from a Worker itself — so that Cloudflare returns the cached response without executing the Worker — refer to [Cache](/workers/cache/).
+:::
+
 ## Interact with the Cloudflare Cache
 
 Conceptually, there are two ways to interact with Cloudflare’s Cache using a Worker:
diff --git a/src/content/docs/workers/runtime-apis/cache.mdx b/src/content/docs/workers/runtime-apis/cache.mdx
index cc26647ff0a0df2..7650e6ae7415427 100644
--- a/src/content/docs/workers/runtime-apis/cache.mdx
+++ b/src/content/docs/workers/runtime-apis/cache.mdx
@@ -12,6 +12,10 @@ products:
 
 The [Cache API](https://developer.mozilla.org/en-US/docs/Web/API/Cache) allows fine grained control of reading and writing from the [Cloudflare global network](https://www.cloudflare.com/network/) cache.
 
+:::note
+The Cache API is a programmatic interface for reading from and writing to Cloudflare's cache from inside a Worker. To cache responses from your Worker so that Cloudflare returns them without executing your Worker, use [Workers Caching](/workers/cache/) instead. The two mechanisms are independent.
+:::
+
 The Cache API is available globally but the contents of the cache do not replicate outside of the originating data center. A `GET /users` response can be cached in the originating data center, but will not exist in another data center unless it has been explicitly created.
 
 :::caution[Tiered caching]