Catching 404 Visitors
March 22, 2008 – 5:44 pm
Running a site requires maintenance from time to time, and perhaps an upgrade to the file structure too. Even if you've not altered the file structure since the original setup, a simple mistake such as an incorrect link in your content can cause a 404 Not Found error. Unless you've set up a custom 404 error page that offers proper navigation back to your main site (which I would recommend too, if you've got the option), you could lose each visitor, a potentially paying customer, when they get that dreaded 404 page and leave your site for good. Also, when a file structure is updated, the old files will still be known to the search engine spiders, and they'll continue to request them.
Find the Missing Files
So what files do you need to cover? Ideally, if you change your file structure then you'll accommodate every old path/filename, creating a 301 redirect to the new path. Incorrectly written links, whether your own on your site or external links from other sites, are harder to predict, but they can be found in two places: your statistics package, which should let you view all the 404 errors, and Google Sitemaps, which lists all the pages the spider has tried to visit and found to be 404s.
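If your statistics package doesn't list 404s directly, the raw access log holds the same information. The following is a minimal Python sketch, assuming an Apache log in the Common or Combined Log Format; the function name `count_404s` and the sample log lines are made up for illustration.

```python
import re
from collections import Counter

# Matches the quoted request and the status code that follows it, e.g.
#   "GET /old-filename.html HTTP/1.1" 404
LOG_LINE = re.compile(r'"[A-Z]+ (?P<path>\S+) [^"]*" (?P<status>\d{3})\b')


def count_404s(lines):
    """Count how often each requested path returned a 404."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == "404":
            hits[match.group("path")] += 1
    return hits


# Invented sample lines; in practice you would pass an open log file instead.
sample = [
    '10.0.0.1 - - [22/Mar/2008:17:44:00 +0000] "GET /old-filename.html HTTP/1.1" 404 209',
    '10.0.0.2 - - [22/Mar/2008:17:45:10 +0000] "GET /index.html HTTP/1.1" 200 5120',
    '10.0.0.3 - - [22/Mar/2008:17:46:30 +0000] "GET /old-filename.html HTTP/1.1" 404 209',
    '10.0.0.4 - - [22/Mar/2008:17:47:05 +0000] "GET /old-filename.php?id=1234 HTTP/1.1" 404 209',
]

# Most frequently missed paths first.
for path, count in count_404s(sample).most_common():
    print(count, path)
```

Each path that shows up in the output is a candidate for a 301 redirect.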
301s with htaccess
If you run your site on an Apache server then you will most likely have access to edit your .htaccess file. This is the best method to control the redirection. To create a straightforward 301 redirect from one file to another, simply add the following, changing the old-filename and new-filename to suit what you need. Duplicate this line for every old to new file redirection you need to set up. The path is relative to the root of the site.
- Simple 301 Redirect
-
- Redirect 301 /old-filename.html /new-filename.html
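With more than a handful of files, typing each line by hand gets error-prone. As a small sketch, the lines could be generated from a list of old and new paths. The `redirect_lines` helper and the example paths below are hypothetical, and the sketch assumes none of the paths contain spaces.

```python
def redirect_lines(moves):
    """Build one 'Redirect 301' line per (old path, new path) pair."""
    lines = []
    for old_path, new_path in moves:
        # Both paths are relative to the site root, so they must start with "/".
        if not (old_path.startswith("/") and new_path.startswith("/")):
            raise ValueError(f"paths must start with '/': {old_path} -> {new_path}")
        lines.append(f"Redirect 301 {old_path} {new_path}")
    return lines


# Hypothetical old-to-new pairs.
moves = [
    ("/old-filename.html", "/new-filename.html"),
    ("/about.htm", "/about-us.html"),
]
print("\n".join(redirect_lines(moves)))
```

The printed lines can then be pasted into the .htaccess file.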
Okay, so that's for all standard pages, be they html, htm, php or others. But it doesn't work for file paths that include a query string, e.g. /old-filename.php?id=1234, because Redirect only matches the path and not the query string. So the code for this would be:
- 301 Redirect for query string paths
-
- Options +FollowSymLinks
- RewriteEngine On
- RewriteBase /
- RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /old-filename\.php\?id=1234\ HTTP/
- RewriteRule ^.*$ /new-filename.html? [R=301,L]
The first 3 lines are only needed once at the top of your .htaccess file and are needed for URL rewriting (note they're not needed for the basic redirects in the first code block). The RewriteCond and RewriteRule lines then do the following:
- Check the value of THE_REQUEST, the full request line sent by the browser (which could be, for example, "GET /old-filename.php?id=1234 HTTP/1.1")
- Check whether that string matches the pattern
- If it's a match, send a 301 redirect to /new-filename.html
The optional element in this code is the question mark at the end of the RewriteRule, after new-filename.html. If you have a question mark there then any query string from the old address will not be carried over; if you remove the question mark then it will append the old query string to the end of the new URL.
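To see why the condition only fires for that exact address, the pattern can be tried outside Apache. This is only an illustration in Python, whose regular expressions should behave the same as Apache's for a pattern this simple; the sample request lines are invented.

```python
import re

# The same pattern as in the RewriteCond. The "\ " is an escaped space:
# Apache needs the backslash because a bare space would end the pattern.
PATTERN = re.compile(r"^[A-Z]{3,9}\ /old-filename\.php\?id=1234\ HTTP/")


def condition_matches(request_line):
    """Return True if the RewriteCond pattern would match this request line."""
    return PATTERN.search(request_line) is not None


for request_line in [
    "GET /old-filename.php?id=1234 HTTP/1.1",   # the exact old address
    "GET /old-filename.php?id=12345 HTTP/1.1",  # a different id
    "GET /old-filename.php HTTP/1.1",           # no query string at all
]:
    print(request_line, "->", condition_matches(request_line))
```

Only the first request line matches, so only that address would be redirected. The other two fall through to whatever rules come next.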
Now, you may ask, what happens if you have a lot of old filenames/paths that need redirecting to a new file structure, but there's a common theme to them? For example, moving a handful of addresses from one directory to another. This can be done using just a couple of lines. Say you want to redirect from /olddir/oldfile.php?p=X to /newdir/newfile.php?p=X:
- Multiple 301 Redirects
-
- RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /olddir/oldfile\.php\?p=([^&]+)\ HTTP/
- RewriteRule ^.*$ /newdir/newfile.php?p=%1 [R=301,L]
The slight change from the previous code block is in the request string: the brackets capture the value of p in the query string (which could be anything except an ampersand), and whatever they find is appended to the end of the redirect address using %1.
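The capture can be sketched the same way outside Apache. In this Python illustration, `redirect_target` is a hypothetical helper and `match.group(1)` stands in for Apache's %1; the request lines used to try it are invented.

```python
import re

# Same pattern as the RewriteCond: the brackets capture the value of p.
PATTERN = re.compile(r"^[A-Z]{3,9}\ /olddir/oldfile\.php\?p=([^&]+)\ HTTP/")


def redirect_target(request_line):
    """Return the new address for a matching request line, or None."""
    match = PATTERN.search(request_line)
    if match is None:
        return None
    # match.group(1) plays the role of %1 in the RewriteRule.
    return "/newdir/newfile.php?p=" + match.group(1)


print(redirect_target("GET /olddir/oldfile.php?p=42 HTTP/1.1"))
print(redirect_target("GET /olddir/oldfile.php?p=42&sort=asc HTTP/1.1"))
```

Note that the second request is not redirected: because the pattern expects the request to end straight after the value of p, an address with extra parameters after it does not match.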
There are other methods of doing this; these are just two of the methods I use, but they should work for you.
301 Redirects with PHP
You may not have .htaccess available to you, which limits you to using a server-side script to do the same work. Unfortunately this means that you need to have a file online for every old file you may have had in the past. Whilst that's tedious, it only needs doing once and will hopefully be worthwhile even if you catch just one new customer. So how to do this with PHP? Create a new file and add the following to it:
- PHP 301 Redirect Code
-
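- <?php
- // Send the permanent-redirect status, then the new address
- header("HTTP/1.1 301 Moved Permanently");
- header("Location: http://www.domain.com/directory/path-to-new-file.php");
- // Stop here so nothing else is output after the headers
- exit;
- ?>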
Save the file under the old filename and place it on your server where the old file used to be. It will then return a 301 and redirect all visitors from the old file through to the new one. If your file accepted query strings then, for a quick and simple solution, you can use the following, which will append any query string on to the end of the redirection URL:
- PHP 301 Redirect Code with Query String
-
- <?php
- header("HTTP/1.1 301 Moved Permanently");
- // Carry the visitor's original query string over to the new address
- header("Location: http://www.domain.com/directory/path-to-new-file.php?".$_SERVER['QUERY_STRING']);
- exit;
- ?>
301 Redirects with ASP
Now my ASP is pretty much non-existent so I can't give much help on that. The following will do a basic redirect, and maybe someone a bit more knowledgeable can recreate the above code block in ASP.
- ASP 301 Redirect
-
- <%
- response.status = "301 Moved Permanently"
- response.addheader "Location", "http://www.domain.com/directory/path-to-file.asp"
- %>
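Whichever method you use, it's worth confirming that the old address really answers with a 301 and the right Location header, rather than a 200 or a 302. Here is a rough Python sketch using only the standard library; `check_redirect` is a made-up helper name, and the domain in the comment is the same placeholder used in the examples above.

```python
import http.client
from urllib.parse import urlsplit


def check_redirect(url, timeout=10):
    """Request url once, without following redirects.

    Returns a (status code, Location header or None) tuple.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    try:
        # Rebuild the path plus query string, as it appears in the request line.
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


# Example call (placeholder domain, so not run here):
#   status, location = check_redirect("http://www.domain.com/old-filename.php?id=1234")
#   A working permanent redirect should give status 301 and the new address.
```

Because http.client does not follow redirects on its own, the status you get back is the one the old address itself returns.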


5 Responses to “Catching 404 Visitors”
Really handy summary Sarah, cheers. Wasn't sure about the ASP 301 redirect code so that'll be most helpful.
Also in terms of finding broken links, if you don't have a stats package and you aren't happy to hand additional info about your site/s to Google and their webmaster tools team you could use "Xenu Link Sleuth" (google that term and you'll find it). Or if you're running Linux you could run the "wget" command as a spider on your site if you wanted.
By HB on Mar 25, 2008
HB, you're right, not everyone has a stats package. Although signing up to Google Sitemaps (or Webmaster Central, whatever it's called these days) isn't so bad. You don't even need to add a sitemap. Of course linking all those sites under your name could be a concern, but that's only if your sites are dubious.
By Sarah on Mar 26, 2008
Nice article. Does Google Analytics help with this at all?
By Robert Morgen on May 14, 2008
Hi Robert, I doubt it can, as Analytics is only aware of the pages that you've placed the analytics code on, so unless you have a 404 page with the analytics code on it, it wouldn't know about it. The Webmaster Tools will however show you all of the 404 pages it knows about.
By Sarah on May 15, 2008