Frank DENIS random thoughts.

Application-controlled browser cache using local storage.

Client-side caching is web development 101: it reduces latency and improves the user experience.

Setting the right HTTP headers server-side, to instruct the web browser to keep content in its local cache, has become straightforward with modern web servers (for static content), frameworks and middleware layers (Rack!).

However, this is only a hint. The actual caching policy remains up to the web browser. Setting HTTP headers gives applications neither control over, nor feedback about, what’s really going on client-side.

Specifically:

  • You can’t tell whether a resource is already in the client cache (ok, you can, but only by making extra requests, which, outside of a demo, is exactly what you’re trying to avoid).
  • You can’t tell whether what you asked the browser to cache was actually cached.
  • You can’t invalidate a cached resource.
  • You have no way to prioritize cached content. A huge JavaScript file should probably stay in the client cache as long as possible, while a 20-byte transparent GIF is no biggie if it eventually has to be reloaded. The only knob you have on the browser cache is an expiration date (which is almost always, and should be, “immediately” or “never”), and it doesn’t allow any real prioritization.
  • Some web browsers, like Safari (and UIWebView components) on the iPhone, have a ridiculously small cache.
  • Watching your web server deliver so many 304 replies is driving you nuts.

Fortunately, the HTML5 specs bring an exciting feature: client-side storage. Thanks to localStorage et al., web browsers now provide a convenient way to persistently store sizable chunks of data, giving applications full control of the data store. While the primary target was offline web applications, client-side storage can also be extremely helpful in a bunch of other situations.

When you think about it, what’s the difference between local storage and cached resources? Not much, except that local storage is fully application-controlled, and can solve every issue listed above!

A good example might be a huge JavaScript file that you would really love to keep cached. That doesn’t mean additional content like pictures shouldn’t be cached as well, but you know that this specific script is critical and that users will see nothing but a blank page until it is ready for action.

Here’s the trick. Instead of using a script element, this very script can be loaded using XHR, then eval()’d. Since the script is available as a regular string, storing it as an entry in the browser data store is a piece of cake. Checking whether the script is already in the cache is as easy as checking a key for existence. Invalidation is as easy as deleting the entry. This way, we get a totally application-controlled browser cache.
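The operations just described map directly onto the storage API. Here is a minimal sketch, assuming a hypothetical `createCache` helper (not part of any library) that wraps any object exposing `getItem`, `setItem` and `removeItem`, such as `window.localStorage`:

```javascript
// Hypothetical helper: wraps any Storage-like object (e.g. window.localStorage).
function createCache(storage) {
    return {
        // Is the resource already cached? No network request needed.
        has: function (key) {
            return storage.getItem(key) !== null;
        },
        get: function (key) {
            return storage.getItem(key);
        },
        // Returns true if the write succeeded, false if the browser
        // refused it (quota exceeded, storage disabled, ...).
        put: function (key, text) {
            try {
                storage.setItem(key, text);
                return true;
            } catch (e) {
                return false;
            }
        },
        // Invalidation is just a delete.
        invalidate: function (key) {
            storage.removeItem(key);
        }
    };
}
```

In a browser, `var cache = createCache(window.localStorage);` followed by `cache.has("geoapi_js")` tells you whether the script is cached without touching the network, and the return value of `cache.put()` tells you whether it was actually stored.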

Here’s a real-life example: Geobar

Geobar was specifically designed for iPhone and Android devices. It relies on the OpenLayers scripts plus some custom Geoportail scripts. All packed together, we get about 1 MB of data that is absolutely mandatory for the page to display. And this is where local storage works wonders.

    function load_scripts() {
        var cached = window.localStorage.getItem("geoapi_js");
        if (cached) {
            // Cache hit: run the stored script without any network request.
            window.eval(cached);
            return;
        }
        var xhr = new XMLHttpRequest();
        xhr.onreadystatechange = function() {
            if (this.readyState != 4) {
                return; // request not finished yet
            }
            if (this.status != 200) {
                alert("Error: " + this.status);
                return;
            }
            try {
                window.localStorage.setItem("geoapi_js", this.responseText);
            } catch (e) {
                // Storage is full or disabled: run the script anyway.
            }
            window.eval(this.responseText);
        };
        xhr.open("GET", "geoapi.js", true);
        xhr.send();
    }

This is enough to keep the script in the data store instead of relying on the traditional HTTP cache.
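Note that the function above never refreshes the stored copy once it is there. One possible refinement, sketched below, keeps a version string next to the script so that a new release invalidates the old entry. The names `loadCached` and `fetchText` and the `_version` key suffix are assumptions made for this illustration; `fetchText(url, callback)` stands for any function that retrieves a URL as text, such as the XHR code above.

```javascript
// Sketch only: loadCached, fetchText and the "_version" suffix are illustrative names.
// storage:   a Storage-like object, e.g. window.localStorage
// fetchText: function (url, callback), which calls callback(error, text)
// done:      function (error, text, fromCache)
function loadCached(storage, fetchText, key, version, url, done) {
    var cached = storage.getItem(key);
    if (cached !== null && storage.getItem(key + "_version") === version) {
        done(null, cached, true);
        return;
    }
    // Missing or stale: drop the old entry, then fetch a fresh copy.
    storage.removeItem(key);
    storage.removeItem(key + "_version");
    fetchText(url, function (error, text) {
        if (error) {
            done(error, null, false);
            return;
        }
        try {
            storage.setItem(key, text);
            storage.setItem(key + "_version", version);
        } catch (e) {
            // Storage full or unavailable: the text is still usable this time.
        }
        done(null, text, false);
    });
}
```

In the page, `done` would `window.eval()` the text it receives. Changing the `version` argument is then enough to force every client to fetch the script again.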

Why not use an HTML5 manifest instead, you may ask? To start with, the manifest still doesn’t give you much control over the cached content. And further HTTP requests made from a resource loaded this way have no Referer, which might be a showstopper, as was the case here.

Of course, the same trick can be used to store other kinds of resources, such as stylesheets and, why not, hex-encoded images.
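For a stylesheet, the cached text goes into a `style` element instead of being evaluated. A small sketch, where `applyCachedCss` is a hypothetical name and `doc` is expected to be the page's `document`:

```javascript
// Hypothetical helper: injects the CSS text stored under `key` into the page.
// Returns true if something was injected, false if the key is not cached.
function applyCachedCss(doc, storage, key) {
    var css = storage.getItem(key);
    if (css === null) {
        return false;
    }
    var style = doc.createElement("style");
    style.appendChild(doc.createTextNode(css));
    doc.head.appendChild(style);
    return true;
}
```

When it returns `false`, the page would fetch the stylesheet over XHR, store it, and inject it the same way, exactly like the script above.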

While HTML5 makes local storage easy and powerful, almost every major web browser out there has had some kind of support for local storage for ages: userData (since IE 5.0), globalStorage (since Firefox 2), and for those lagging behind, Flash comes to the rescue with its Shared Objects.