Redirecting to a www subdomain


I decided to put this website on the www subdomain, since it's a clue to users as to what to expect. The Bytemark account I'm borrowing from my friend's training company provides the same content without the subdomain. It's good practice, when both domains lead to the same content, to redirect one to the other. That will keep URLs consistent and make sure each page has only one canonical URL.

I've done this kind of thing many times before, but not for a while. I seem to have forgotten everything I knew about Apache mod_rewrite configuration, so it took me a while to figure out how to do this. There are of course other sites with recipes to put in .htaccess files, but several of the ones I tried didn't work. Here's my version, so I'll know where to find it next time:

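RewriteEngine On
# Only act when the host name does not already start with "www."
RewriteCond %{HTTP_HOST} !^www\. [NC]
# Capture the whole host name as %1
RewriteCond %{HTTP_HOST} ^([-.a-z0-9]+)$ [NC]
# Permanently redirect to the same path on the www host
RewriteRule ^(.*)$ http://www.%1/$1 [R=301,L]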

The first RewriteCond checks that the www. prefix is missing. The second one captures the whole host name, which the RewriteRule can then refer to as %1. Finally, the RewriteRule issues a redirect to the canonical URL, appending the requested path ($1) to the www host.
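
To make the logic concrete, here's a rough shell sketch of the decision those directives make. It isn't how Apache implements it: canonical_url is a made-up helper name, example.com is just a placeholder host, and the sketch skips the character-class check on the host name.

```shell
# canonical_url HOST PATH
# Hypothetical helper mimicking the rewrite rules: prints the redirect
# target, or prints nothing and returns 1 if no redirect is needed.
canonical_url() {
    # [NC] makes the match case-insensitive; this sketch simply lowercases
    # the host (Apache itself would keep the original case in %1).
    host=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
    # In .htaccess context the rule sees the path without its leading slash.
    path=${2#/}
    case $host in
        www.*) return 1 ;;   # already on the www host: leave it alone
    esac
    printf 'http://www.%s/%s\n' "$host" "$path"
}

# Usage (placeholder host):
#   canonical_url example.com /about
```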

The 301 HTTP status code indicates that this is a permanent redirect. Browsers should cache it for future use, and I believe Google treats it as a pointer to the canonical version of a URL, which is what you want in this kind of situation. The caching might cause problems if you ever change your mind and swap the canonical domain with the redirecting one. In that case it's probably a good idea to take the redirect out some time before you put the new reversed redirect in, so that a cached 301 response can't lead to an infinite loop of redirects.

Testing redirects

It's easiest to check these things by looking at the actual HTTP response headers, rather than just what your browser does with them. I like to set up a command like this, which sends a request to the server through netcat (the nc command):

host=example.com   # placeholder: substitute the host you want to test
echo -e "GET / HTTP/1.1\nHost: $host\n" \
    | nc "$host" 80 -q3

The -q3 option tells netcat to wait three seconds after it's read all the input, which gives enough time (probably) for the response to arrive and be printed. There's probably a nicer way to do that. It's not necessary if you're actually typing the request in while netcat's running, instead of piping it in, because then netcat won't see an EOF on stdin until you're finished.
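
One possibly nicer alternative, assuming curl is installed (nothing above depends on it), is to ask curl for just the headers and pick out the two lines that matter for a redirect. This is only a sketch: redirect_summary is a made-up helper name and example.com is a placeholder host.

```shell
# redirect_summary: hypothetical helper that reads raw HTTP response headers
# on stdin and prints the status code plus the Location header, if present.
redirect_summary() {
    # Strip the carriage returns that end HTTP header lines, then let awk
    # pick the status code from the first line and any Location header.
    tr -d '\r' | awk '
        NR == 1                    { print "status: " $2 }
        tolower($1) == "location:" { print "location: " $2 }
    '
}

# Usage (placeholder host; -s silences progress output, -I asks for headers only):
#   curl -s -I http://example.com/ | redirect_summary
```

Note that curl -I sends a HEAD request rather than a GET, which is fine here because the rewrite rule doesn't look at the request method.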