404s Building Up?
Sat, 3 February, 2007 – 7:31 pm
I hate seeing a lot of hits for 404 errors in my stats. Clients ask why there are any 404s at all, and I have to explain that a 404 is logged whenever someone requests a file that doesn't exist, whether intentionally or not. As we know, 404s can build up when someone's trying to find a security hole in your site, but they can also build up when you remove files or change file names. Luckily the latter can be worked on to cut down the 404 errors.
There are a few reasons a 404 will arise. The first is that you've changed the filename. Perhaps you redesigned the site and moved from .html to .php extensions, or you wanted to use better filenames. Whatever the reason, the old address is still being visited even though there's a replacement. Remember, within a few days to a week or so of a file going online, if a search engine spider can find that file it will most likely index it, and visitors may have bookmarked the page too. The remedy is to automatically redirect all visitors going to the old address through to the new one. There are several ways to do this depending on your server.
htaccess
If you're on Apache and can edit your .htaccess file then this is the best way to deal with the redirect. Simply add the following at the bottom of your .htaccess file (note that there are several ways to do a permanent (301) redirect; this is just one method):
RedirectMatch permanent ^/category/web-sites/css-xhtml$ http://www.sarahfreelance.co.uk/category/css-xhtml
As you can see, this is a line from the .htaccess file on this website. I used to have a category 'web-sites' which contained subcategories. I then moved all of the subcategories up to the root to make them plain categories, so the paths on my site had to change.
Often you'll find with older sites that the filenames contain spaces (thank you Microsoft!!). Spaces will cause validation errors within your code and they're just not a good idea. The line above won't work if the old address contains a space, so surround the old address with quotes:
RedirectPermanent "/products/products mm.htm" http://www.domain.com/newfile.php
Above you can see another way to create a permanent redirect.
No .htaccess?
You may not have a .htaccess file (perhaps you're on Microsoft IIS) or you cannot change the file for whatever reason. This can then be dealt with by using PHP or ASP (this is definitely possible if the page you are changing can be parsed as PHP or ASP).
For PHP capable sites you'll need to recreate the old page on the server and simply drop the PHP code in:
<?php
header("HTTP/1.1 301 Moved Permanently");
header("Location: http://www.domain.com/path/to/new/page.php");
exit;
?>
For ASP I believe (although don't hold me to it) that the following should work:
<%
Response.Status = "301 Moved Permanently"
Response.AddHeader "Location", "http://www.domain.com/path/to/page.php"
Response.End
%>
What happens when your old file was just an htm(l) file? Well, on an Apache server you should be able to at least ask your host to set up parsing of html files as PHP, which means you could use the PHP solution in an html page. On IIS you would again need to ask your host to set up html pages to be parsed as ASP or PHP. I've done the latter on the server we use at my contract job, and it has cut the 404 errors generated by visitors to pages that are over 3 years old! Every month I check the 404 stats and simply add any pages not already set up. It's annoying to have these extra pages sitting on the server just to redirect visitors, but at least it prevents losing a possible visitor.
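To make that monthly chore a little easier, one option (a rough sketch only, not what I actually run) is to keep every old-to-new mapping in a single PHP file and include it at the top of each old page. The file name, the find_redirect function and the addresses in the list are all made up for illustration.

```php
<?php
// redirects.php (hypothetical file name)
// Lookup table: old path => new address. The entries are placeholders.
$redirects = array(
    '/old-page.html'            => 'http://www.domain.com/new-page.php',
    '/products/products mm.htm' => 'http://www.domain.com/newfile.php',
);

// Return the new address for a requested URI, or null if none is listed.
function find_redirect($request_uri, $redirects)
{
    // Drop any query string, then decode %20 and friends so that
    // filenames containing spaces match the keys above.
    $path = parse_url($request_uri, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }
    $path = rawurldecode($path);
    return isset($redirects[$path]) ? $redirects[$path] : null;
}

// REQUEST_URI is only set when the script runs under a web server.
$request_uri = isset($_SERVER['REQUEST_URI']) ? $_SERVER['REQUEST_URI'] : '';
$target = find_redirect($request_uri, $redirects);

if ($target !== null) {
    header("HTTP/1.1 301 Moved Permanently");
    header("Location: " . $target);
    exit;
}
?>
```

You still need a page sitting at each old address, but each one only has to include this file, and the list of new addresses lives in one place.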
410 Gone
So that's great for pages that have been replaced, but what about pages that no longer exist? For visitors there's not much you can do. You could set up a temporary (302) redirect to the front page of your site, or set up a custom 404 error page so that visitors are told the page is missing and are still given a link to the front page, in the hope that they'll click through and stay on your site. However, spiders often don't seem to catch on to a 404 error (note my earlier comment about visitors still going to pages that have been gone for over 3 years!). So another option is to tell the search spider that the page has gone by returning a status code of 410.
A 410 returns a response of 'Gone' to spiders and human visitors alike. This tells the spider that the page has gone and won't be coming back. Again, this can be done in .htaccess, with PHP and, I would assume, ASP. For .htaccess you have two options. To target a single page you can use the following:
RedirectMatch 410 ^/path/to/file\.htm$
Alternatively you can target a whole directory (thanks to Dave S. for the exact code, saved me the time of doing each individual page!):
RedirectMatch 410 ^/directory
Of course, for the PHP and ASP you will have to target each individual page.
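For the PHP version, something along these lines at the very top of the old page should do it. Treat it as a minimal sketch: the wording of the message and the link back to the front page are my own assumptions, so change them to suit.

```php
<?php
// Tell browsers and spiders that this page has been removed for good.
// This must run before any output is sent.
header("HTTP/1.1 410 Gone");
?>
<html>
<head><title>Page removed</title></head>
<body>
<p>Sorry, this page has been removed.</p>
<!-- Hypothetical link: point it at your own front page -->
<p><a href="/">Return to the front page</a></p>
</body>
</html>
```

Human visitors still get a short message and a way back into the site, while spiders see the 410 status.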
So whilst it will be tedious if you have a lot to sort out, at least it will help to clean up those constant 404 errors, and you never know, you may get extra visitors staying because of your work. Remember, not everyone thinks to alter the URL in their address bar to get back to the front of a site if they've landed straight on a 404 error page. I know I don't.

