301s, Drupal, Varnish, and SSL Load Balancers

Posted in Articles on Oct 03, 2015

Hi all,

Today I'm just going to share a snippet I used to handle Varnish caching for a client who required a Drupal install on a platform that ran Varnish in front of Apache, all behind a cloud SSL load balancer. The client needed not only to redirect all bare-domain requests to "www", but also to redirect all non-HTTPS requests to their HTTPS equivalent.

This seemed to work pretty well straight out of the box, by just setting the $base_url variable in the Drupal settings file to point to the HTTPS domain and setting up a Rewrite in the .htaccess. It was only when we cleared the cache that things started to go awry.

The www redirect was exactly as you'd expect, uncommented character for character in the default Drupal .htaccess file. The SSL redirect was a custom entry implemented as follows:


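# Redirect to HTTPS unless the load balancer reports the original request was already HTTPS.
RewriteCond %{HTTP:X-Forwarded-Proto} !=https
# REQUEST_URI already begins with "/", and the flag list must not contain spaces.
RewriteRule .* https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]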

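X-Forwarded-Proto is the de facto standard header that SSL-terminating load balancers use to tell the backend which protocol the client originally used. No RFC defines the header itself (RFC 7239 standardises a separate Forwarded header that carries the same information), but it has been widely adopted by the major providers, so you're generally going to be pretty safe assuming it's there.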

The problem appeared when the Varnish cache was cleared (which happened automatically on an unattended deploy) and the first request for a page arrived over HTTP, as it would for any visitor who simply typed the URL into their browser. Varnish cached the 301 redirect as the response for that URL, then served the same cached redirect to HTTPS requests as well, because the protocol was not part of the cache key. Needless to say, this completely broke everything: the site just went into a redirect loop.
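
For reference, this is roughly what the built-in hashing logic looks like in Varnish 4.x (an assumption on my part that you're on 4.x; check the builtin.vcl that ships with your version). The cache key is built from the URL and the Host header only, so an HTTP request and an HTTPS request for the same page look identical to Varnish:

```vcl
# Sketch of the built-in vcl_hash in Varnish 4.x, shown for illustration only.
sub vcl_hash {
    # The path and query string are part of the key...
    hash_data(req.url);

    # ...and so is the Host header (or the server IP if there is none)...
    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }

    # ...but nothing here records whether the client used HTTP or HTTPS.
    return (lookup);
}
```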

This could have been resolved with complex iptables rules, or any number of other smart but unnecessary solutions. Instead, we chose to add the X-Forwarded-Proto header to the Varnish hash. To do this, just a couple of lines needed adding to our /etc/varnish/default.vcl. Because it's a shared hosting server with various sites and configurations on it, we elected to restrict this modification to just the site in question, and came up with something resembling the following:

sub vcl_hash {
    hash_data(req.url);

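    # Begin SSL load balancer config
    # Match on the Host header (req.url only holds the path) and hash the
    # protocol header, so HTTP and HTTPS responses are cached separately.
    if (req.http.host ~ "www\.mydomain\.url" && req.http.X-Forwarded-Proto) {
        hash_data(req.http.X-Forwarded-Proto);
    }
    # End SSL load balancer config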

    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }

    return (lookup);
}

This means that Varnish will store two different cached variants of each page: one for requests that came in via HTTP, and another for those that came in via HTTPS. In this case, the only thing cached for the HTTP side of things is the redirect to the secure site. Always consider the impact this will have on your cache storage before making changes like this, particularly if the server hosts more than one site.
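
One way to keep that storage impact predictable is to normalise the header before it reaches the hash, so that each URL can only ever have two variants, whatever the load balancer sends. The following is a minimal sketch in Varnish 4.x syntax, not something we ran on the site described here. The host name is the same placeholder used above, and treating a missing or unexpected header value as plain HTTP is an assumption you should check against your own load balancer:

```vcl
# Hypothetical addition to default.vcl: collapse X-Forwarded-Proto to
# exactly "https" or "http" for the one site that hashes on it.
sub vcl_recv {
    if (req.http.host ~ "www\.mydomain\.url") {
        if (req.http.X-Forwarded-Proto ~ "(?i)^https$") {
            # Any capitalisation of "https" becomes the canonical form.
            set req.http.X-Forwarded-Proto = "https";
        } else {
            # Missing or unexpected values are treated as plain HTTP.
            set req.http.X-Forwarded-Proto = "http";
        }
    }
    # No return statement here, so the built-in vcl_recv still runs afterwards.
}
```

Because the backend sees the normalised value too, the .htaccess condition keeps working unchanged: anything that isn't exactly "https" still gets redirected.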