Introduction
Varnish cache is a popular HTTP reverse proxy and HTTP accelerator. It sits in front of our web/application services and uses a range of caching, prefetching and compression techniques to reduce the load on upstream backend services and provide improved client response times to our customers.
In addition to improving performance, a perhaps lesser-known feature of Varnish is that it can be configured to protect clients from upstream failures and outages. If an upstream service is temporarily unavailable or begins to fail, Varnish can fall back to provide stale responses from its cache, provided the cached objects' grace period has not expired.
It's no secret that Varnish Cache forms the basis of our infrastructure and our Hosting Performance services.
Cache Control
The role of the cache-control header is to instruct downstream caches (browsers and proxy caches like Varnish) how to cache responses. The s-maxage value tells Varnish the number of seconds the response can be cached, otherwise known as the time to live (TTL). The max-age value is used by browser caches (and also by proxy caches when s-maxage is not specified). Internally, Varnish uses the value to set the beresp.ttl variable.
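For illustration, a response header using these directives might look like the following (the s-maxage and stale-while-revalidate values match the example discussed below; the max-age value is our own choice):

Cache-Control: max-age=0, s-maxage=600, stale-while-revalidate=172800

With max-age=0, browsers revalidate on every request, while Varnish keeps serving the response from its cache for up to 600 seconds.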
The stale-while-revalidate value informs the cache how long it is acceptable to reuse a stale response. In the example above, the response becomes stale after 10 minutes (600s), so the cache can reuse it for any request made within the next 2 days (172800s). This is also known as the grace period; internally, Varnish uses the value to set the beresp.grace variable. The first request that arrives during the grace period triggers an asynchronous background fetch, making the cached object fresh again without passing the latency cost of the revalidation on to the client.
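Grace does not have to come from the backend header alone. As a minimal VCL sketch (the one-hour duration is an assumption, not a value from our setup), a fallback grace period can be set for backends that send no stale-while-revalidate directive:

sub vcl_backend_response {
    # Assumed fallback: if the backend granted no (or only a short) grace
    # period via stale-while-revalidate, keep stale objects usable for 1h.
    if (beresp.grace < 1h) {
        set beresp.grace = 1h;
    }
}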
Also, if the backend service is down or responding slowly, clients will be shielded from those failures until the grace period expires, hopefully giving the service adequate time to recover, or the engineers adequate time to fix the problem, before it negatively affects the customer experience. It should not be overlooked that this is a trade-off: while it provides faster responses and greater resilience, it also increases the chances of serving outdated or even incorrect responses.
Setting a high stale-while-revalidate lifetime is a judgment call and may not be appropriate for responses containing highly dynamic server-side rendered data, where freshness is critical. We've tried to maximize use of the feature on responses containing relatively static data, such as our editorial content and category landing pages.
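Where the backend cannot vary the header per content type, a similar split can be sketched directly in VCL. The URL prefixes and durations below are hypothetical, purely to illustrate the idea:

sub vcl_backend_response {
    # Hypothetical paths: long grace only for relatively static pages.
    if (bereq.url ~ "^/editorial/" || bereq.url ~ "^/category/") {
        set beresp.grace = 2d;
    } else {
        # Highly dynamic responses: keep the stale window short.
        set beresp.grace = 10m;
    }
}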
Varnish configuration
By default, Varnish will fall back to serving a stale response during the grace period if it can't connect to the backend service or if the request times out. The behavior can be extended to 5xx errors with the following code in sub vcl_backend_response:

sub vcl_backend_response {
    if (beresp.status >= 500 && bereq.is_bgfetch) {
        return (abandon);
    }
    ...
}
beresp.status contains the status code returned by the backend service, bereq.is_bgfetch is true if the backend request was sent asynchronously after the client received a cached response, and return (abandon) instructs Varnish to abandon the backend request and do nothing, leaving the stale object in the cache.
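A refinement worth considering is to record these events so that failing background fetches stay visible to engineers. A sketch using the std vmod bundled with Varnish (the message text is our own):

import std;

sub vcl_backend_response {
    if (beresp.status >= 500 && bereq.is_bgfetch) {
        # Shows up in varnishlog as a VCL_Log record.
        std.log("bgfetch abandoned, backend returned " + beresp.status);
        return (abandon);
    }
}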
In conclusion
An out-of-date cached response, if available, will in most cases provide a better customer experience than an error page. Where the server-side rendered content is static enough to allow a high stale-while-revalidate lifetime, caching can be a useful tool for increasing a service's resilience.