404 – what to do about missing www documents
This article discusses how I dealt with missing articles when I moved my websites around. The approach lets you display a custom page depending on the situation. This is only a simple solution. You can get complex with CGI code, but that wasn’t an option I wished to pursue.
The problem
Until recently, my website was shared by four domains (see Not much new here. What’s going on? for details), as I could not afford to run a separate website for each domain. I decided that it was time to give each domain its own website.
Splitting up the mega-website into four separate websites was easy, but very time-consuming. But the resulting changes to URLs and file locations meant that people’s bookmarks and the data in the search engines would all be wrong. For example, you could have seen the racing system stuff by using:
http://www.racingsystem.com/racesys/
http://www.freebsddiary.org/racesys/
http://www.freebsddiary.com/racesys/
http://www.dvl-software.com/racesys/
But after the split, only the following URL would work:
http://www.racingsystem.com/
Similarly, the FreeBSD Diary URLs ended with /freebsd/, the PowerBuilder tips ended with /pbtips/, and the free web space ended with /freespace/. I wanted to ensure that appropriate messages were given no matter what URL you were looking for.
The working solutions
In this section, you’ll see working examples of the above strategy. In the following section you’ll see how I did this.
If you go to the missing page you will see the page which is displayed whenever a URL request cannot be satisfied. This is a missing document. If you try this fake URL that page will be displayed.
But if you visit apache.htm you’ll automagically be redirected to the new URL apache.html. This is a redirection, which is handled in Redirecting URL requests with Apache.
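For a rough idea of what such a redirection can look like, here is a minimal sketch using the Redirect directive from mod_alias. The host name and paths are assumptions made for illustration, and the article mentioned above may well do it differently.

```apache
# Hypothetical example: the host name and paths are assumed, not taken from the real config.
# Send requests for the old file name to the new one with a permanent (301) redirect.
Redirect permanent /apache.htm http://www.freebsddiary.org/apache.html
```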
How to do this
To create a default error page, you need to use the ErrorDocument directive. Here is what I added to /usr/local/etc/apache/httpd.conf:
<Directory "/www/freebsddiary.org">
    AllowOverride All
    ErrorDocument 404 /missing.php
</Directory>
For more information on the ErrorDocument feature, please see the Apache documentation. For a complete virtual host example, see the following section. You may or may not need the AllowOverride. It controls which directives an .htaccess file may use, so it only matters if you set ErrorDocument in an .htaccess file (see the Apache documentation for more info).
The above bit of code tells Apache that if any 404 errors (document not found) are generated in the indicated directory, then the page /missing.php should be displayed. You can see that file at this URL testing404.txt.
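ErrorDocument accepts more than a local path, and the choice affects what the client sees. The following is only a sketch: /error500.html and www.example.com are made-up names used for illustration, not part of my configuration.

```apache
# Local URL-path: Apache serves the page itself, and the client
# normally still receives the original 404 status.
ErrorDocument 404 /missing.php

# Other status codes can be handled the same way
# (/error500.html is a hypothetical file).
ErrorDocument 500 /error500.html

# Alternative form, commented out here: a full URL.
# Apache then sends the client a redirect to that address,
# so the client does not see the original 404 status.
# (www.example.com is a placeholder host.)
# ErrorDocument 404 http://www.example.com/missing.html
```

A local path is probably the better choice when you want search engines to notice that the old URL is gone, since they still get the 404.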
Different messages for different parts of the website
It is possible to create different messages for each directory of the website. So I added the following sections to the virtual host definition:
<VirtualHost 10.0.0.45>
    <Directory "/www/freebsddiary.com/testing/freebsd">
        AllowOverride All
        ErrorDocument 404 /missing.php
    </Directory>
    <Directory "/www/freebsddiary.com/testing/racesys">
        AllowOverride All
        ErrorDocument 404 /testing/racesys/missing.html
    </Directory>
    <Directory "/www/freebsddiary.com/testing/pbtips">
        AllowOverride All
        ErrorDocument 404 /testing/pbtips/missing.html
    </Directory>
    <Directory "/www/freebsddiary.com/testing/freespace">
        AllowOverride All
        ErrorDocument 404 /testing/freespace/missing.html
    </Directory>
    <Directory "/www/freebsddiary.com">
        AllowOverride All
        ErrorDocument 404 /missing.php
    </Directory>
</VirtualHost>
As you can see, for each directory (freebsd, racesys, pbtips), we show a custom NOT FOUND page.
Please note the following points:
- the file does not have to be named "missing.html". Use whatever name suits your needs.
- the file does not have to reside within the directory in which the ErrorDocument directive resides. That is, you could use "/missing.html" instead of /testing/freespace/missing.html.
- you can use the same file within multiple ErrorDocument directives (e.g. each of the above statements could refer to the same "/missing.html").
The following URLs demonstrate the httpd.conf entries.
This is quite a simple solution. I’m sure you can get quite complex if you want.
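The second and third notes above suggest a shorter configuration. The following is only a sketch under stated assumptions, not the configuration used on this site: it assumes a single shared /missing.html (a hypothetical file) as the default for the virtual host, with one directory overriding it.

```apache
<VirtualHost 10.0.0.45>
    # Default for the whole virtual host; /missing.html is a hypothetical shared page.
    ErrorDocument 404 /missing.html

    # Only the directory that needs its own message overrides the default.
    <Directory "/www/freebsddiary.com/testing/racesys">
        ErrorDocument 404 /testing/racesys/missing.html
    </Directory>
</VirtualHost>
```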
Order is important
The order of your ErrorDocument directives is important. If you put the directive for "/www/freebsddiary.com" before the directive for "/www/freebsddiary.com/testing/pbtips", then the latter will never be invoked. The ErrorDocument processing takes the first match. So put your directives bottom-up rather than top-down.
.htaccess can also be used
The above can also be done with an entry in an .htaccess file. Here’s what I put in the .htaccess file at http://www.dvl-software.com/racesys:
ErrorDocument 404 /racesys/missing.php
This accomplishes the same as the previous section but requires that you allow at least FileInfo to be overridden. That means you need an entry something like this in your virtual host:
<Directory "/www/freebsddiary.com/racesys">
    AllowOverride FileInfo
</Directory>
I prefer the all-in-httpd.conf example. But if you are web-hosting for other people, you might want the .htaccess solution.
Things that can go wrong
If you see this message in your browser:
500 Internal Server Error
Internal Server Error
The server encountered an internal error or misconfiguration and was unable to complete your request.
Please contact the server administrator, webmaster@example.com and inform them of the time the error occurred, and anything you might have done that may have caused the error.
More information about this error may be available in the server error log.
______________________________________________________________
Apache/1.3.9 Server at www.freebsddiary.com Port 80
And this in your logs:
/www/freebsddiary.com/racesys/.htaccess: ErrorDocument not allowed here
It means you didn’t include the AllowOverride directive in the correct place. Check that and try again.
Not all links work as advertised. I think the webserver is getting out of date with respect to the article. The article content is still correct and accurate AFAIK. It’s just that the working examples aren’t working.
Thank you for a most clear and concise ‘how to’. I had been using the htaccess method with mixed results. The httpd method is much better and your explanation quite clear.
You might want to mention to the pure novice that a server reload is necessary after changes to the httpd.conf file. For example, on our system this was necessary:
/etc/rc.d/init.d/httpd reload