How to Set Up an HTML5 Web Page for Offline Cache - dummies

By Andy Harris

Most web-based applications work only while you're online, which makes a certain kind of sense. But HTML5 has a mechanism for storing a web page and the resources it depends on locally, so that you can view the page while offline. A page opts in to this behavior by naming a manifest file in its <html> tag, and the browser then attempts to save a local copy automatically; for example:

<!DOCTYPE html>
<html lang = "en"
      manifest = "cache.manifest">
<head>
  <meta charset = "UTF-8" />
  <title>Offline Storage Demo</title>
  <link rel = "stylesheet"
        type = "text/css"
        href = "offline.css" />
  <script type = "text/javascript"
          src = "offline.js">
  </script>
</head>
<body onload = "setCaption()">
  <h1>Offline Storage Demo</h1>
    <img src = "pot.jpg"
         alt = "hand-etched pottery" />
    <p id = "caption"></p>
</body>
</html>

While extremely simple, this page manages to draw resources from several different files. Of course, it requires the image pot.jpg, but it also uses an external JavaScript file (offline.js) and an external style sheet (offline.css). HTML5 offers a mechanism that allows the browser to automatically save not only the HTML file, but all the other resources it needs to display properly.

You wouldn’t build such a simple page with so many external dependencies, but that’s the point of this particular exercise.
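The article does not show offline.js, so the following is only a guess at what a minimal version might look like. The caption text and the optional doc parameter are assumptions made for illustration; the only things taken from the page itself are the function name setCaption and the caption element's id.

```javascript
// offline.js (hypothetical contents; the original file is not shown)

// Caption text is an assumption made for illustration.
const CAPTION_TEXT = "Hand-etched pottery, viewable even when you are offline.";

// Fill in the empty <p id="caption"> element once the page has loaded.
// `doc` defaults to the browser's document; accepting it as a parameter
// lets the function be exercised outside a browser as well.
function setCaption(doc) {
  const target = doc || document;
  const caption = target.getElementById("caption");
  if (caption === null) {
    return false; // no caption element on this page
  }
  caption.textContent = CAPTION_TEXT;
  return true;
}
```

Because the body tag calls setCaption() with no arguments, the function falls back to the real document when it runs in the page.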

The secret is in a file called cache.manifest. This is simply a text file that lists the other files the page needs. For this particular example, cache.manifest names the three external resources the page uses:

CACHE MANIFEST
offline.css
offline.js
pot.jpg

The file must begin with the phrase CACHE MANIFEST (all in capital letters). Each subsequent line should contain the name of a file needed to complete the page. It’s easiest if all the files are in the same directory, but relative references are acceptable.
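To make the format concrete, here is a small sketch of a checker for the basic manifest layout described above. The function name parseManifest is hypothetical, and it handles only the simple form in this article: a header line followed by one file per line, with lines starting with # treated as comments.

```javascript
// Parse the text of a simple cache manifest into a list of file names.
// Only the basic form described in the text is handled.
function parseManifest(text) {
  const lines = text.split(/\r?\n/).map((line) => line.trim());
  if (lines[0] !== "CACHE MANIFEST") {
    throw new Error("Manifest must begin with CACHE MANIFEST");
  }
  // Skip the header, blank lines, and comment lines.
  return lines
    .slice(1)
    .filter((line) => line !== "" && !line.startsWith("#"));
}

// Example input built from the file names used in this article.
const example = [
  "CACHE MANIFEST",
  "# version 1",
  "offline.css",
  "offline.js",
  "pot.jpg",
].join("\n");

console.log(parseManifest(example));
```

A check like this can catch the most common mistake, a missing or lowercase header line, before you upload the file.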

Follow these steps to set up a page for offline cache:

  1. Set up your server to manage caches.

    The cache mechanism uses the text/cache-manifest MIME type. Your server may not yet be set up to serve this type of data. If you're using Apache, this is easy to fix. Look for (or create) a text file called .htaccess in the root directory of your web server. Add the following line to this file:

    AddType text/cache-manifest .manifest

    If you do not have permission to add or modify .htaccess or you are using another server, you might have to ask your server administrator to add this MIME type.

  2. Create your manifest file by building a text file called cache.manifest in the same directory as your project.

    Make the first line read CACHE MANIFEST. On each subsequent line, list one of the assets your page will need. You may need to look through your source code to find the various elements (normally images, CSS, and JavaScript files) that your page will need.

  3. Build the page in the normal way.

    Keep track of any external resources you might need.

  4. Indicate that your page should be cached by adding the manifest attribute to the <html> tag, with the path to your cache.manifest file as its value.

  5. Load your page.

    You can't test the cache by opening the page straight from your local file system; the page has to be served by a web server (your own server at the http://localhost address will do). Otherwise, you'll need to upload your files to a server. The first time you access the page, your browser may ask permission to save data locally. Grant permission to do so.

  6. Test offline.

    To see if the page works, disconnect from the Internet (by turning off your wireless or unplugging your network cable). Try to load the page again. If you are successful, the entire page will load.
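While working through steps 5 and 6, it can help to watch what the cache is doing. The following is a minimal sketch, assuming a browser that still exposes window.applicationCache (not all do). The function name watchAppCache and the log callback are hypothetical; the event names are the ones defined for the application cache object.

```javascript
// Event names fired on the application cache object.
const APP_CACHE_EVENTS = [
  "checking", "noupdate", "downloading", "progress",
  "cached", "updateready", "obsolete", "error",
];

// Attach a listener for every event and report each one through `log`.
// Returns false if no usable cache object is available.
function watchAppCache(cache, log) {
  if (!cache || typeof cache.addEventListener !== "function") {
    return false;
  }
  APP_CACHE_EVENTS.forEach((name) => {
    cache.addEventListener(name, () => log("appcache: " + name));
  });
  return true;
}

// In a browser page (guarded so the file also loads elsewhere):
if (typeof window !== "undefined") {
  watchAppCache(window.applicationCache, (msg) => console.log(msg));
}
```

On a first successful load you would expect the console to end with a cached message, and an error message usually points to a file listed in the manifest that the server could not deliver.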

Browsers already have a form of cache that automatically stores pages the user has visited, but the manifest-based cache is different and more intentional: the page author states explicitly which files the browser should keep.

Note that you can’t put links to server-side assets in the cache. A local cache can’t store a PHP program or database. However, you can store any data you need on the client so your project will still work without a server connection.

If you change your cache.manifest file and retest, the browser may not pick up the change immediately. That's because the browser's ordinary cache can hold on to the old copy of the manifest for a while, often a couple of hours, depending on the server's settings. If you test again later, you will see the changes reflected. During testing, you can make cached files expire immediately by adding these lines to your .htaccess file (Apache's mod_expires module must be enabled):

ExpiresActive On
ExpiresDefault "access"

It only makes sense to turn off browser caching on a test server. Leave the caching behavior at its default level for a production server.

If one of the files changes but the cache.manifest file does not, the browser will not know to reload the changed file. If you want to trigger a complete file reload, you need to change at least one character in the cache.manifest file. This can be a comment, so you can just add a version number like this:

#version 3

Changing the version number will trigger the reload mechanism the next time you’re online, so the offline version will be up to date.
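If you update files often, the version bump can be scripted. This is only a sketch: the function name bumpManifestVersion is hypothetical, and it assumes the comment follows the "#version N" pattern shown above.

```javascript
// Increment the number in a "#version N" comment line, or append
// "#version 1" if the manifest has no such line yet.
function bumpManifestVersion(text) {
  const pattern = /^#[ \t]*version[ \t]+(\d+)[ \t]*$/m;
  const match = text.match(pattern);
  if (match === null) {
    // No version comment found: add one at the end of the file.
    return text.trimEnd() + "\n#version 1\n";
  }
  const next = Number(match[1]) + 1;
  return text.replace(pattern, "#version " + next);
}

const before = "CACHE MANIFEST\n#version 3\noffline.css\n";
console.log(bumpManifestVersion(before));
```

Any change to the manifest text has the same effect; the version comment simply makes the change easy to see.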