404 – what to do about missing www documents
This article describes how I dealt with missing documents when I moved my
websites around. The approach lets you display a custom page depending on which part of the site the missing URL falls in.
This is only a simple solution. You can get more complex with CGI code, but that
wasn’t an option I wished to pursue.
The problem
Until recently, my website was shared by four domains (see Not much new here. What’s going on? for
details). I decided that it was time to give each domain its own website.
Until recently, I could not afford to run separate websites for each domain.
Splitting up the mega-website into four separate websites was easy, but very time consuming.
The resulting changes to URLs and file locations meant that people’s bookmarks
and the data in the search engines would all be wrong. For example, you could have
seen the racing system stuff by using:
http://www.racingsystem.com/racesys/
http://www.freebsddiary.org/racesys/
http://www.freebsddiary.com/racesys/
http://www.dvl-software.com/racesys/
But after the split, only the following URL would work:
http://www.racingsystem.com/
Similarly, the FreeBSD Diary URLs ended with /freebsd/, the PowerBuilder tips ended
with /pbtips/, and the free web space ended with /freespace/. I wanted to ensure
that appropriate messages were given no matter what URL you were looking for.
The working solutions
In this section, you’ll see working examples of the above strategy.
In the following section you’ll see how I did this.
If you go to the missing page
you will see the page which is displayed whenever a URL request cannot be satisfied.
This is a missing document. If you try this fake URL
that page will be displayed.
But if you visit apache.htm you’ll automagically
be redirected to the new URL apache.html. This is a
redirection, which is handled in Redirecting URL requests with
Apache.
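The redirection itself is covered in that other article. For orientation only, here is a minimal sketch of what such a rule can look like using the Redirect directive from mod_alias. The host name and the assumption that apache.htm sat at the top level of the site are made up for illustration and are not taken from the actual configuration.

```apache
# Hypothetical: assumes apache.htm lived at the top level of the site
# and that the new page is served from www.freebsddiary.org.
Redirect permanent /apache.htm http://www.freebsddiary.org/apache.html
```

Unlike an ErrorDocument, a redirect sends the browser to the new location, so bookmarks and search engines can pick up the new URL.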
How to do this
To create a default error page, you need to use the ErrorDocument
directive. Here is what I added to /usr/local/etc/apache/httpd.conf:
    <Directory "/www/freebsddiary.org">
        AllowOverride All
        ErrorDocument 404 /missing.php
    </Directory>
For more information on the ErrorDocument feature, please see the Apache documentation.
For a complete virtual host example, see the following
section. You may or may not need the AllowOverride. That directive controls which
directives, ErrorDocument included, may be used in .htaccess files (see the Apache documentation for more info).
It is not required for an ErrorDocument placed directly in httpd.conf.
The above bit of code tells Apache that if any 404 errors (document not found) are
generated in the indicated directory, then the page /missing.php should be
displayed. You can see that file at this URL testing404.txt.
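The second argument to ErrorDocument does not have to be a local page. As a rough sketch of the forms the Apache documentation describes, the alternatives look like this. The message text and the www.example.org host are placeholders invented for illustration.

```apache
# Pick ONE form per error code; these are alternatives, not a sequence.

# 1. Local URL path (leading slash, relative to the DocumentRoot)
ErrorDocument 404 /missing.php

# 2. Plain text message (Apache 2.x quoting; Apache 1.3 expected only
#    the opening quote)
# ErrorDocument 404 "Sorry, that document is not here."

# 3. Full URL on another server (hypothetical host). The browser is
#    redirected, so it does not receive the original 404 status.
# ErrorDocument 404 http://www.example.org/missing.html
```

The local path form is usually the safest choice for a 404 page, because the client still receives the 404 status along with your custom page.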
Different messages for different parts of the website
It is possible to create different messages for each directory of the
website. So I added the following sections to the virtual host definition:
    <VirtualHost 10.0.0.45>
        <Directory "/www/freebsddiary.com/testing/freebsd">
            AllowOverride All
            ErrorDocument 404 /missing.php
        </Directory>
        <Directory "/www/freebsddiary.com/testing/racesys">
            AllowOverride All
            ErrorDocument 404 /testing/racesys/missing.html
        </Directory>
        <Directory "/www/freebsddiary.com/testing/pbtips">
            AllowOverride All
            ErrorDocument 404 /testing/pbtips/missing.html
        </Directory>
        <Directory "/www/freebsddiary.com/testing/freespace">
            AllowOverride All
            ErrorDocument 404 /testing/freespace/missing.html
        </Directory>
        <Directory "/www/freebsddiary.com">
            AllowOverride All
            ErrorDocument 404 /missing.php
        </Directory>
    </VirtualHost>
As you can see, for each directory (freebsd, racesys, pbtips, freespace), we show a custom NOT
FOUND page.
Please note the following points:
- the file does not have to be named "missing.html". Use whatever name suits your needs.
- the file does not have to reside within the directory in which the ErrorDocument directive resides. That is, you could use "/missing.html" instead of /testing/freespace/missing.html.
- you can use the same file within multiple ErrorDocument directives (e.g. each of the above statements could refer to the same "/missing.html").
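To illustrate the last two points, here is a minimal sketch in which two of the directories from the example above share one page at the top of the site. It assumes a file named missing.html exists in the document root, which is not part of the configuration shown earlier.

```apache
# Both sections point at the same top-level page (assumed to exist
# as missing.html in the document root).
<Directory "/www/freebsddiary.com/testing/racesys">
    ErrorDocument 404 /missing.html
</Directory>

<Directory "/www/freebsddiary.com/testing/pbtips">
    ErrorDocument 404 /missing.html
</Directory>
```

Sharing one file means less to maintain. Separate files let each part of the site point visitors to its own new home.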
The following URLs demonstrate the httpd.conf entries.
This is quite a simple solution. I’m sure you can get quite complex if you want.
Order is important
The order of your ErrorDocument directives is important. If you put
the directive for "/www/freebsddiary.com" before the directive for
"/www/freebsddiary.com/testing/pbtips", then the latter will never be invoked.
The ErrorDocument processing takes the first match. So put your directives bottom-up
rather than top-down.
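A minimal sketch of the bottom-up layout, using two of the sections from the virtual host above, might look like the following. One caveat: the Apache documentation describes non-regex Directory sections as being applied from the shortest path to the longest, with the more specific section overriding, so ordering may matter less on your version than described here. Writing the sections bottom-up is harmless either way, but test on your own server.

```apache
# Most specific directory first...
<Directory "/www/freebsddiary.com/testing/pbtips">
    ErrorDocument 404 /testing/pbtips/missing.html
</Directory>

# ...and the catch-all for the rest of the site last.
<Directory "/www/freebsddiary.com">
    ErrorDocument 404 /missing.php
</Directory>
```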
.htaccess can also be used
The above can also be done with an entry in an .htaccess file.
Here’s what I put in the .htaccess file at http://www.dvl-software.com/racesys:
ErrorDocument 404 /racesys/missing.php
This accomplishes the same as the previous section but requires that you allow at least
FileInfo to be overridden. That means you need an entry something like this in your
virtual host:
    <Directory "/www/freebsddiary.com/racesys">
        AllowOverride FileInfo
    </Directory>
I prefer the all-in-httpd.conf example. But if you are web-hosting for other
people, you might want the .htaccess solution.
Things that can go wrong
If you see this message in your browser:
    500 Internal Server Error
    Internal Server Error
    The server encountered an internal error or misconfiguration and was unable to complete your request.
    Please contact the server administrator, webmaster@example.com and inform them of the time the error occurred, and anything you might have done that may have caused the error.
    More information about this error may be available in the server error log.
    ______________________________________________________________
    Apache/1.3.9 Server at www.freebsddiary.com Port 80
And this in your logs:
/www/freebsddiary.com/racesys/.htaccess: ErrorDocument not allowed here
It means the Directory section covering that .htaccess file has no AllowOverride
directive that permits ErrorDocument (at least FileInfo). Check that and try again.
Not all links work as advertised. I think the webserver is getting out of date with respect to the article. The article content is still correct and accurate AFAIK. It’s just that the working examples aren’t working.
Thank you for a most clear and concise ‘how to’. I had been using the htaccess method with mixed results. The httpd method is much better and your explanation quite clear.
You might want to mention to the pure novice that a reload of the web server is necessary after changes to the httpd.conf file. For example, on our system:
/etc/rc.d/init.d/httpd reload
was necessary.