Caching might be one of the hard things in computer science, but wow, it's fun when it works! Let's dive into some cool new caching strategies!
Sometimes, old stuff is the coolest. In a world saturated with new technology every day, I find solace in learning something new about a technology I have been using for decades. The fact that even the building blocks of the technologies I use every day still have something new to teach me, just feels good for some reason.
This year, my team and I spent three months implementing a new version of a fairly popular website in Norway, and during that process I spent a bunch of time trying to make everything as fast, responsive and available as possible. Since we were using Remix to implement the new web platform, I started looking into more of the core web fundamentals. What I learned turned out to be incredibly useful for me, and hopefully it will be for you too.
This article will teach you two (fairly) new caching strategies that helped us save money, reduce load time by a ton, and greatly simplify our code.
I know I should know how caching works on the web, but ehm…
Before we dive into my new cool findings, I want to give a quick introduction on what caching is, and how caching works on the web.
Caching is a way to temporarily store data, so we can retrieve it quicker than we would from the original source. We do it all the time, often without noticing. Our CPUs cache the data they need within a few nanoseconds, memory chips store data we need almost as quickly, and even our hard drives cache stuff we’ve accessed recently so that we don’t have to look it up from the storage unit itself.
When doing web development, there are a few ways we can cache data as well. We can cache data in our JavaScript applications (hello useState), we can ask the browser to cache stuff for us, and we can use a globally distributed cache layer, called a Content Delivery Network (or CDN, for short) to cache documents and assets closer to the user than our origin servers. There are lots of other ways, of course, but these seem to be the most common ones.
When caching data on the web, we typically use something called caching headers to specify what we want to cache, and how we want to cache it. There are lots of fun cache headers to play with, but the most important one, at least for the strategies mentioned in this article, is the Cache-Control header. Headers are part of every request and response passed between the client and the server, providing meta information (information about information) to the receiver.
The Cache-Control header has a bunch of functionality, but here are a few examples:
Cache-Control: max-age=3600 // Cache something for 3600 seconds (1 hour)
Cache-Control: no-store // Don't cache this!
Cache-Control: private // Only cache in private caches (e.g. the browser), never in shared caches like a CDN
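To make the directive syntax a bit more concrete, here is a minimal sketch of how a cache might split a Cache-Control value into its individual directives. The function name parseCacheControl is made up for this illustration, and the parsing is deliberately simplified (it ignores quoted values, for instance), so treat it as a toy rather than a spec-compliant parser.

```typescript
// Hypothetical helper: splits a Cache-Control value into a directive map.
// Simplified on purpose: quoted values and duplicate directives are not handled.
function parseCacheControl(value: string): Map<string, string | true> {
  const directives = new Map<string, string | true>();
  for (const part of value.split(",")) {
    const trimmed = part.trim();
    if (trimmed === "") continue;
    const eq = trimmed.indexOf("=");
    if (eq === -1) {
      // Flag-style directive such as "private" or "no-store"
      directives.set(trimmed.toLowerCase(), true);
    } else {
      // Key/value directive such as "max-age=3600"
      directives.set(trimmed.slice(0, eq).toLowerCase(), trimmed.slice(eq + 1));
    }
  }
  return directives;
}

const parsed = parseCacheControl("private, max-age=3600");
console.log(parsed.get("max-age")); // the string 3600
console.log(parsed.has("private")); // true
```

Note that the values come back as strings, so a real cache would still need to convert max-age to a number before comparing it with the age of a response.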
There is lots more to learn about this header, and I highly recommend diving into the MDN documentation for more details.
However, there are a few strategies that turned out to be great tools for us to reduce response time, server load and cost.
A few words about CDNs
The site we implemented had a CDN (content delivery network, remember?) between the site’s servers and the end user. A CDN is a network of globally distributed machines, working as intermediate storage devices for any resources.
You typically see CDNs storing assets that never change, like images, videos and hashed JavaScript files. This lets the user download these assets from a nearby server (typically within the same state or part of the country) without going to the origin server. This is very practical, since it drastically reduces both the response time for the user and the load on the origin servers.
Our CDN (AWS CloudFront, if you were wondering) receives every request to our domain, checks whether it has that particular resource (file path) stored, and if so, returns it to the user without ever bothering the origin server. However, we also use it to serve files that do change, like CMS-backed HTML pages and JSON responses, which is what I want to talk about in this article.
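As a rough mental model, the lookup described above can be sketched as a table keyed by request path. This is a toy, not how CloudFront is actually implemented: createEdgeCache and fetchFromOrigin are names invented for the example, and entries never expire here.

```typescript
// Toy model of a CDN edge node: a lookup table keyed by request path.
// createEdgeCache and fetchFromOrigin are illustrative names, not a real CDN API.
function createEdgeCache(fetchFromOrigin: (path: string) => string) {
  const store = new Map<string, string>();
  let originRequests = 0;

  return {
    get(path: string): string {
      const cached = store.get(path);
      if (cached !== undefined) {
        return cached; // Cache hit: the origin is never contacted
      }
      // Cache miss: ask the origin once and remember the answer
      originRequests += 1;
      const body = fetchFromOrigin(path);
      store.set(path, body);
      return body;
    },
    originRequestCount(): number {
      return originRequests;
    },
  };
}

const edge = createEdgeCache((path) => `<html>content for ${path}</html>`);
edge.get("/articles/1");
edge.get("/articles/1");
console.log(edge.originRequestCount()); // 1
```

Two requests for the same path only reach the origin once, which is where the savings in both response time and server load come from.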
stale-while-revalidate
As I mentioned earlier, the Cache-Control header accepts a bunch of different directives. One of the most powerful ones is called stale-while-revalidate. This strategy tells our cache (CDN) to do the following:
- If the CDN contains the resource that’s still “fresh” (hasn’t expired), return that resource. Done.
- If the CDN contains the resource, but it has expired, return the expired resource to the user. Then, ask the origin server for a fresh version of that resource in the background, so that the next request for the same resource will be “fresh”
The tradeoff is basically this: Some users might get stale content, but will still get it quickly.
It looks like this:
Cache-Control: max-age=3600, stale-while-revalidate=3600
What happens here is the following:
- If a request comes in, and there’s less than 1 hour since the data was placed in the cache, return the resource. Don’t ask the origin server, just return the data as is.
- If a request comes in, and there’s less than 1 hour since the cache expired (for instance, 1 hour and 30 minutes since the data was placed in the cache), return the stale data to the user. Next, ask the origin server for a fresh response. Save that data in the cache, and mark it as fresh again.
- If a new request comes in right after (for instance, 1 hour and 31 minutes after the original data was stored in the cache), the user will receive fresh data without waiting for the origin server at all. The origin server won’t be asked for a new copy for another hour.
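The three cases above boil down to comparing the age of the cached response with two thresholds. Here is a minimal sketch of that decision, assuming the cache already knows how long it has held the response. The names classify and CacheDecision are invented for the example, and the exact boundary handling is simplified compared to a real cache.

```typescript
type CacheDecision = "fresh" | "stale-while-revalidate" | "miss";

// ageSeconds: how long the response has been sitting in the cache.
// maxAge and staleWhileRevalidate are the two values from the header, in seconds.
function classify(
  ageSeconds: number,
  maxAge: number,
  staleWhileRevalidate: number,
): CacheDecision {
  if (ageSeconds < maxAge) {
    return "fresh"; // Serve from cache, do not contact the origin
  }
  if (ageSeconds < maxAge + staleWhileRevalidate) {
    return "stale-while-revalidate"; // Serve the stale copy, refresh in the background
  }
  return "miss"; // Too old: the user has to wait for the origin
}

const minutes = (n: number): number => n * 60;
console.log(classify(minutes(30), 3600, 3600)); // fresh
console.log(classify(minutes(90), 3600, 3600)); // stale-while-revalidate
console.log(classify(minutes(150), 3600, 3600)); // miss
```

With these numbers, a stale response can be anywhere between one and two hours old when it is served. Once the age passes both windows, the cache falls back to a regular request to the origin.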
This is a great strategy to use when you have a bunch of consistent traffic, or when changes to the content you’re caching can wait a few seconds. If you have consistent traffic, the cache will always be updated, since any stale request will trigger what’s called a revalidation (“go check the origin server in the background”). Cool stuff!
Our site was based on a CMS, and any changes in the returned HTML page (or JSON data) would typically be changes to a landing page, articles etc. Yes, it would be cool if the updated page were available right away, but it doesn’t really break anyone’s experience if they are a few seconds behind the breaking news. This seemed like a wonderful strategy for us.
The result is that, for resources that change, we have a cache hit rate above 95%. The only pages that are retrieved from the server are the ones that are rarely updated. We’ve set the cache threshold to about 1 hour, which means you might get 1 hour old content if you’re the first visitor to visit that page in a while. If that’s the case, chances are the page is not the most frequently updated one you have anyway.
The stale-while-revalidate directive is such a useful technique in combination with CDNs that I don’t think I’ll ever serve a site without it again.
stale-if-error
We all make mistakes from time to time. In my experience, making sure the production environment we serve is resilient to errors is the best investment you’ll ever make.
Meet the stale-if-error directive. It works in a pretty similar fashion to the stale-while-revalidate directive, but with a very specific difference:
If the origin server returns a 5xx server error (500 Internal Server Error, 503 Service Unavailable etc.) for a specific resource, or if there’s a network error (a timeout), return the last cached successful response.
In other words, we can specify that if our website goes down for some time (it happens), we can return the last working version of that website for a specified amount of time. Sure, the user won’t get any new data, but at least they’ll get some data back. Your server will hopefully trigger an alarm, and you can get your server back online without the user ever noticing.
In our case, we set this cache timeout to be 12 hours:
Cache-Control: stale-if-error=43200 // 12 hours * 60 minutes * 60 seconds
In other words – if our website goes down, we have 12 hours to get it fixed before we ever show a 500 error page to the user. Neat!
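To illustrate the fallback, here is a small sketch of the decision a cache makes when the origin misbehaves. It assumes the cache still holds the last successful response and knows its age. The names serveWithStaleIfError, OriginResult and CachedEntry are invented for the example, and treating every 5xx status as an error follows the description above rather than any particular CDN's exact rules.

```typescript
interface OriginResult { status: number; body: string; }
interface CachedEntry { body: string; ageSeconds: number; }
interface Served { source: "origin" | "stale-cache" | "error"; body: string; }

// Hypothetical decision function: what to send to the user when the cached
// copy is no longer fresh and the origin has just been asked for a new one.
function serveWithStaleIfError(
  origin: OriginResult | "network-error",
  cached: CachedEntry | undefined,
  maxAge: number,
  staleIfError: number,
): Served {
  if (origin !== "network-error" && origin.status < 500) {
    return { source: "origin", body: origin.body }; // Origin is healthy
  }
  // Origin failed: fall back to the last good copy if it is within the window
  if (cached !== undefined && cached.ageSeconds < maxAge + staleIfError) {
    return { source: "stale-cache", body: cached.body };
  }
  return { source: "error", body: "Service unavailable" };
}

const TWELVE_HOURS = 12 * 60 * 60;
const lastGood: CachedEntry = { body: "<html>last good page</html>", ageSeconds: 2 * 60 * 60 };
console.log(serveWithStaleIfError({ status: 503, body: "" }, lastGood, 3600, TWELVE_HOURS).source); // stale-cache
console.log(serveWithStaleIfError("network-error", undefined, 3600, TWELVE_HOURS).source); // error
```

The second call shows the limit of the technique: if nothing was cached before the outage started, there is nothing to fall back on.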
The previous version of our website had some intricate code that dealt with downtime in downstream systems, like our previous CMS or other internal APIs. Now, however, we could delete all that code and just use The Platform like it was supposed to work.
These two directives have done wonders for our users, servers and pocketbooks. We spent a bunch of time optimizing the different parameters to make them just right, and we’ll probably keep on doing so for the foreseeable future, as we learn more about our users’ usage patterns.
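Both directives can live in the same Cache-Control header. Here is a minimal sketch of a helper that assembles such a header value from a small policy object. The names buildCacheControl and CachePolicy are made up for this example, and the numbers are simply the ones used in the examples in this article.

```typescript
interface CachePolicy {
  maxAge: number; // seconds the response counts as fresh
  staleWhileRevalidate?: number; // seconds a stale copy may be served while refreshing
  staleIfError?: number; // seconds a stale copy may be served if the origin fails
}

// Hypothetical helper: turns a policy object into a Cache-Control value.
function buildCacheControl(policy: CachePolicy): string {
  const parts: string[] = [`max-age=${policy.maxAge}`];
  if (policy.staleWhileRevalidate !== undefined) {
    parts.push(`stale-while-revalidate=${policy.staleWhileRevalidate}`);
  }
  if (policy.staleIfError !== undefined) {
    parts.push(`stale-if-error=${policy.staleIfError}`);
  }
  return parts.join(", ");
}

const HOUR = 60 * 60;
console.log(
  buildCacheControl({ maxAge: HOUR, staleWhileRevalidate: HOUR, staleIfError: 12 * HOUR }),
);
// max-age=3600, stale-while-revalidate=3600, stale-if-error=43200
```

Keeping the durations in one place like this makes it easier to tune them later, since each value is named and expressed in terms of hours instead of raw seconds.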
Summary
Caching stuff is fun. Sure, it might be one of the two hard things in computer science, but it sure is a great tool to keep in your toolbox.
I hope this article inspired you to start experimenting with these caching directives on your own site, saving your team and users a bunch of headaches.