groundy
infrastructure & runtime

Vercel Now Honors stale-if-error: Serving Stale Cache When the Origin Dies

Vercel's CDN now serves a cached copy when an origin fails, but only inside the stale-if-error window you set. It masks transient failures; it does not fix the origin.


On 2026-02-13, Vercel announced that its CDN now honors the stale-if-error Cache-Control directive on every response. When a request to the origin ends in a 500, a network failure, or a DNS error, the edge serves the last good cached copy instead of an error page, but only for as long as the delta-seconds window you set allows. That moves transient-failure tolerance out of application retry code and into the cache layer. It masks failures inside the stale window; it does not fix the origin.

What Vercel shipped

Vercel’s CDN now supports stale-if-error as a Cache-Control extension on all responses, with no plan-level toggle required. Per Vercel’s changelog, you set the directive to declare how many seconds a stale cached response may be served when a request to the origin fails; when the directive is present and the origin returns an error, the CDN “may serve a previously cached response instead of returning the error to the client.”

The feature is config-driven: it lives entirely in Cache-Control headers, so a route opts in by emitting the directive alongside its normal max-age. There is no separate dashboard flag to flip. The changelog names three triggering cases: 500 Internal Server Errors, network failures, and DNS errors.
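As a minimal sketch of what opting in looks like, the helper below assembles a Cache-Control value that carries both windows. The function name build_cache_control and its parameters are illustrative assumptions, not part of any Vercel API; a route would simply send the resulting string as its Cache-Control response header.

```python
from typing import Optional


def build_cache_control(max_age: int, stale_if_error: Optional[int] = None) -> str:
    """Assemble a Cache-Control value; both arguments are in seconds.

    Hypothetical helper for illustration only.
    """
    if max_age < 0 or (stale_if_error is not None and stale_if_error < 0):
        raise ValueError("delta-seconds values must be non-negative")
    directives = [f"max-age={max_age}"]
    # Leaving stale_if_error unset keeps the route opted out of stale fallback.
    if stale_if_error is not None:
        directives.append(f"stale-if-error={stale_if_error}")
    return ", ".join(directives)


# Fresh for 5 minutes, with a 24-hour fallback window on origin errors.
print(build_cache_control(max_age=300, stale_if_error=86400))
```

Keeping the fallback window as an explicit optional argument makes the opt-in visible at each call site, which matters because not every route should be served stale.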

This is Vercel catching up to a directive the rest of the CDN field has implemented for years (more on that below), rather than inventing new behavior. What is new is that it is now first-party, documented, and applied uniformly to responses on Vercel’s own edge.

How stale-if-error works

stale-if-error is a Cache-Control extension defined in RFC 5861, an Informational RFC from May 2010 authored by Mark Nottingham. It is explicitly not on the Internet Standards Track, which matters: “Informational” means the IETF published it for the community to consider, not that every CDN implements it identically. Treat the RFC as a reference behavior, not a contract.

The mechanism is two windows stacked on top of each other: a fresh window governed by max-age, and a stale window governed by stale-if-error. The RFC’s introduction describes the extension as letting a cache “return a stale response when an error… is encountered, rather than returning a ‘hard’ error,” which improves availability. A community worked example makes the arithmetic concrete: with Cache-Control: max-age=300, stale-if-error=86400, the response is fresh for 300 seconds, and if the origin then fails, the cache may keep serving the stale copy for up to 86400 more seconds (24 hours) instead of returning an error page.
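The two stacked windows can be sketched as a small classifier over the age of a cached response. This is an illustration of the arithmetic only: the function name, the state labels, and the exact boundary handling (strict less-than at each edge) are assumptions of the sketch, not behavior confirmed for any particular CDN.

```python
def cache_state(age: int, max_age: int, stale_if_error: int) -> str:
    """Classify a cached response by its age in seconds.

    "fresh"                 -> inside the max-age window
    "stale-usable-on-error" -> past max-age, still inside the stale-if-error window
    "expired"               -> past both windows; may not be served on error
    """
    if age < max_age:
        return "fresh"
    if age < max_age + stale_if_error:
        return "stale-usable-on-error"
    return "expired"


# Cache-Control: max-age=300, stale-if-error=86400
for age in (120, 300, 86699, 86700):
    print(age, cache_state(age, max_age=300, stale_if_error=86400))
```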

The directive’s value is a delta-seconds ceiling on how long stale content may be served on error, not a guarantee that the cache will serve it that long. A cache may decline to serve stale at all, or stop serving it before the ceiling is reached. What it is not allowed to do is keep serving stale past the ceiling once the origin has failed.

stale-if-error vs stale-while-revalidate

These two directives get conflated constantly, and they solve different problems. RFC 5861 defines both, and they are independent mechanisms that can appear in the same Cache-Control header.

stale-while-revalidate (for example max-age=600, stale-while-revalidate=30) serves a stale response immediately while revalidating in the background, then swaps in the fresh response. Its job is to hide latency for the normal case: the origin is healthy, the response is slightly stale, and you want the client to get a fast answer rather than wait for a round trip.

stale-if-error only fires when something has gone wrong. The origin did not answer acceptably, so the cache falls back to the last good copy rather than surfacing the failure. In the healthy path it does nothing.

The practical effect is that the two compose. A community example shows the combined header: Cache-Control: public, max-age=3600, stale-while-revalidate=60, stale-if-error=86400. Fresh for an hour, then a one-minute stale-while-revalidate window to hide latency under normal load, and a 24-hour stale-if-error window to fall back to if the origin actually fails. One optimizes the happy path; the other covers the sad one.
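To make the composition concrete, the sketch below splits a combined header into its directives so each window can be read off separately. The parser is deliberately naive and is an assumption of this example: it splits on commas and does not handle quoted-string arguments that themselves contain commas.

```python
from typing import Dict, Union


def parse_cache_control(value: str) -> Dict[str, Union[int, str, bool]]:
    """Parse a Cache-Control value into {directive: argument}.

    Numeric arguments become ints, other arguments stay strings, and
    bare directives such as "public" map to True. Naive by design.
    """
    directives: Dict[str, Union[int, str, bool]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        name = name.strip().lower()
        arg = arg.strip().strip('"')
        if not sep:
            directives[name] = True
        elif arg.isdigit():
            directives[name] = int(arg)
        else:
            directives[name] = arg
    return directives


parsed = parse_cache_control(
    "public, max-age=3600, stale-while-revalidate=60, stale-if-error=86400"
)
print(parsed["max-age"], parsed["stale-while-revalidate"], parsed["stale-if-error"])
```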

What shifts architecturally

The interesting consequence is not “fewer error pages.” It is where transient-failure tolerance now lives.

Before this, a team protecting users from a flaky origin wrote that tolerance into the application: retry-with-backoff around origin calls, circuit breakers, failover to a degraded-mode renderer, a fallback cache read inside the request handler. That code runs per request, in your function, against your concurrency and cold-start budget, and it is the thing you debug at 2 a.m. when a dependency hiccups.

With stale-if-error honored at the edge, that same tolerance can move to the CDN. Once the origin attempt fails, the serve-stale decision is made at the edge against cached bytes, without any fallback logic running inside your function. For read-heavy, cacheable responses, that can delete a whole class of application-level fallback code.

Read literally against the changelog, the mechanism that enables this is simple: the origin returned an error, the CDN had a cached copy, and the directive permitted serving it. The architectural consequence is an inference from that mechanism. It changes how teams budget for origin reliability: the relevant question stops being “what is our origin’s uptime” and becomes “how stale are we willing to serve, and for how long.” You are now budgeting for staleness, not just for failures. Whether you actually get to delete your retry logic depends on whether your tolerance window, your max-age, and your cache hit rate line up.

The hard boundary: it masks, it does not fix

This is the part the resilience framing glosses over. stale-if-error masks failures inside the window you define. It does nothing beyond it.

Run the RFC math to its end: once the cached response is older than max-age + stale-if-error, the cache must stop serving it and hand the client the origin’s error. If your origin outage outlasts your stale window, users see 500s again. If the origin fails but you never had a cached copy in the first place (a cache miss on a cold key), there is nothing stale to serve and the error propagates immediately.
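The boundary cases above can be sketched as a toy model of the edge's decision. Everything here is an assumption made for illustration: the CachedEntry type, the respond function, the choice to treat 500, 502, 503, and 504 as errors (the set RFC 5861 names; Vercel's changelog names 500s, network failures, and DNS errors), and the 502 returned when the origin is unreachable. It is not Vercel's implementation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

ERROR_STATUSES = {500, 502, 503, 504}  # assumed trigger set, per RFC 5861


@dataclass
class CachedEntry:
    body: str
    age: int             # seconds since the response was stored
    max_age: int         # fresh window, seconds
    stale_if_error: int  # stale-on-error window, seconds


def respond(entry: Optional[CachedEntry],
            origin_status: Optional[int],
            origin_body: str = "") -> Tuple[int, str]:
    """Return (status, body) for the client.

    origin_status=None models a network or DNS failure.
    """
    # Fresh hit: served from cache, the origin result is irrelevant.
    if entry is not None and entry.age < entry.max_age:
        return 200, entry.body
    failed = origin_status is None or origin_status in ERROR_STATUSES
    if not failed:
        return origin_status, origin_body
    # Origin failed: fall back only if a copy exists and is inside the window.
    if entry is not None and entry.age < entry.max_age + entry.stale_if_error:
        return 200, entry.body
    # Cold miss or window exhausted: the error reaches the client.
    return (origin_status if origin_status is not None else 502), origin_body


inside = CachedEntry("cached page", age=400, max_age=300, stale_if_error=86400)
outside = CachedEntry("cached page", age=90000, max_age=300, stale_if_error=86400)
print(respond(inside, 500))   # stale copy masks the failure
print(respond(outside, 500))  # window exhausted, error propagates
print(respond(None, 500))     # cold key, nothing to fall back to
```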

This means the directive is a time-bounded safety net, not a substitute for origin reliability, retries, or circuit breakers. A February 2026 analysis frames the feature as keeping sites functional during maintenance, outages, and network glitches by reducing bounce and origin load. That framing is accurate for the transient case. It breaks down for sustained failures: stale content served for minutes can be acceptable; stale content served for hours usually is not, and past the window it stops being served at all.

The corollary is observability. If the edge silently serves stale while the origin is on fire, your error-rate dashboards may look clean precisely when they should be red. Whatever you serve during the window, you still want it instrumented so an operator knows the origin is failing even while users see cached content.

When you should not use it

stale-if-error is opt-in per route for a reason. Some responses must not be served stale.

Community guidance names the standard exclusions. Financial transactions: a stale bank balance or order-confirmation page can mislead a user into acting on a number that is no longer true. Real-time data: stock prices, live scores, auction bids, and inventory counts where staleness is a correctness bug, not a graceful degradation. And authentication surfaces: a stale login form or token endpoint can submit to the wrong endpoint after a deployment, or surface a cached state that no longer reflects the session.

The general rule is data-currency. If “stale” and “wrong” are the same word for your endpoint, do not set the directive. stale-if-error is for content where “slightly old and correct” is strictly better than an error page: marketing pages, documentation, catalog browse views, cached API aggregates. For everything else, let the error propagate.
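One way to keep that rule enforceable is a per-route-class policy that fails closed. The sketch below is hypothetical throughout: the class names, the window values, and the choice of no-store for excluded or unknown routes are placeholders chosen for illustration, not recommendations from Vercel or the RFC.

```python
# Route classes where "slightly old and correct" is acceptable (illustrative).
STALE_OK = {"marketing", "docs", "catalog", "api-aggregate"}

# Route classes where stale means wrong (illustrative).
NEVER_STALE = {"financial", "real-time", "auth"}


def cache_control_for(route_class: str) -> str:
    """Pick a Cache-Control value for a route class.

    Unknown classes are treated like NEVER_STALE so that a new route
    cannot be served stale by accident.
    """
    if route_class in STALE_OK:
        return "public, max-age=300, stale-if-error=86400"
    # Excluded or unrecognized: no caching, so no stale fallback either.
    return "no-store"


for name in ("docs", "auth", "something-new"):
    print(name, "->", cache_control_for(name))
```

Failing closed costs some cacheability on unclassified routes, but it keeps the opt-in explicit, which is the property the exclusions above depend on.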

How Vercel compares to Cloudflare, Fastly, Akamai, and CloudFront

stale-if-error is not a Vercel invention; it is RFC 5861, and the rest of the CDN field has implemented it for years. The comparison that matters is how each vendor exposes it.

Per the same community source, Cloudflare supports stale-if-error and has enabled it by default on some plans; Fastly exposes it through Varnish configuration (VCL), where the behavior is whatever you program; Akamai supports it within its caching product. AWS CloudFront’s support is described as limited and dependent on custom error responses rather than header-driven fallback. Treat all of these as drifting facts: CDN feature matrices change quarter to quarter, and “supported” can mean anything from honoring the header verbatim to requiring plan-specific configuration. Re-verify against each vendor’s current docs before architecting around a specific behavior.

What Vercel shipped, then, is the boring and useful thing: honoring a long-standing RFC directive, in the obvious config-driven way, on all responses, without a separate plan toggle. The differentiator is not the feature. It is that teams running their origin on Vercel can now express transient-failure tolerance in a header and let the edge enforce it, while keeping the boundary straight in their heads: serve stale inside the window, and nowhere past it.

Frequently Asked Questions

What is the difference between RFC 5861 and the formal HTTP caching standard?

The two directives, stale-if-error and stale-while-revalidate, were never promoted into the standards-track HTTP caching specifications and remain defined only in RFC 5861, a 2010 Informational document. The IETF never gave them normative weight, so their exact behavior is whatever each CDN actually implements.

How can you tell from the response that the edge served a stale copy rather than a fresh one?

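RFC 5861 says a stale response that is served should still be visibly stale, meaning it carries a non-zero Age header and a Warning header. That guidance sits in the directive definitions rather than in Security Considerations, and it is a SHOULD, not a hard requirement. In HTTP/1.1, warn-code 110 means “Response is Stale” and 111 means “Revalidation Failed”. The Warning header has since been obsoleted by RFC 9111, so an Age value at or above max-age is the more dependable signal. An operator can detect stale-but-served traffic by logging or alerting on those headers rather than relying on client-visible status codes, after confirming which of them Vercel actually emits.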
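One way to act on the Age signal, sketched under assumptions: compare the Age header against max-age on responses you sample. The function name served_stale is hypothetical, and the check assumes the response reaches you with Age and Cache-Control intact and that max-age, rather than another directive such as s-maxage, governs freshness at the cache you are observing. Verify both against real traffic before alerting on it.

```python
import re
from typing import Mapping, Optional


def served_stale(headers: Mapping[str, str]) -> Optional[bool]:
    """Return True if Age is at or past max-age, False if not,
    and None when the headers do not allow a decision."""
    lowered = {name.lower(): value for name, value in headers.items()}
    age = lowered.get("age", "").strip()
    match = re.search(r"\bmax-age=(\d+)",
                      lowered.get("cache-control", ""),
                      re.IGNORECASE)
    if not age.isdigit() or match is None:
        return None
    return int(age) >= int(match.group(1))


sample = {"Age": "400", "Cache-Control": "max-age=300, stale-if-error=86400"}
print(served_stale(sample))
```

Returning None instead of False when a header is missing keeps "could not tell" separate from "definitely fresh", so a monitoring rule does not read absent headers as a healthy origin.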

Why would a team choose Fastly’s VCL-based stale serving over Vercel’s header-driven version?

On Fastly the fallback is written in Varnish configuration, so it can fire on conditions a Cache-Control header cannot express, such as origin latency crossing a threshold or a custom health check, and the stale window can vary per request. Vercel, by contrast, honors only the directive’s fixed, error-triggered semantics.

Will a browser or a downstream corporate proxy also honor the stale-if-error header Vercel sets?

The directive is effectively CDN-side. No major browser implements stale-if-error, and intermediate proxies honor it only if their operator configured them to. The fallback you set on Vercel therefore protects edge traffic, but not requests served from a user’s browser cache or an unconfigured downstream proxy.

sources · 4 cited

  1. RFC 5861: HTTP Cache-Control Extensions for Stale Content datatracker.ietf.org primary accessed 2026-06-24
  2. stale-if-error for Origin Server Resilience dev-toolbox.tech community accessed 2026-06-24
  3. Vercel Boosts Resilience with stale-if-error Cache Control vectordynamic.com analysis accessed 2026-06-24