Jan 19, 2000
 

Rewriting URLs within Apache

This article shows you how you can rewrite URLs as they arrive at your Apache server.  I used this on my website.  I moved everything from /freebsd/ to / and renamed all the files from *.htm to *.html.  I told Apache how to process the incoming URLs so the correct files were found.  This is way-cool stuff!

Note: although I was doing the rewrites for quite some time after reorganizing the website, I have now removed the rewrites.

This solution requires mod_rewrite (which is included in the Apache port I used).  Make sure the following are present in your httpd.conf:

LoadModule rewrite_module     libexec/apache/mod_rewrite.so
AddModule mod_rewrite.c

NOTE *** WARNING ***: if you are using FrontPage Extensions, you can break things when using mod_rewrite.  See FrontPage doesn’t like RewriteRule for how to avoid this problem.

Different solutions for different situations

If you are using your ISP’s webserver, then the .htaccess solution is what you will probably need.

If you have renamed files from .htm to .html, then the simple solution is all you need.

If you have also changed directory names, then you want the more complex solution.

Useful Resources

I found that the following URLs were useful.
Users Guide to URL Rewriting: http://www.engelschall.com/pw/apache/rewriteguide/
Redirecting old file to new file: http://www.engelschall.com/pw/apache/rewriteguide/#ToC23
Apache URL Rewriting Engine: http://www.apache.org./docs/mod/mod_rewrite.html

A single file

This section deals with the renaming of files from *.htm to *.html.   Here is what I added to the virtual host section of /usr/local/etc/apache/httpd.conf as an example:
<Directory "/www/test.freebsddiary.org">
     RewriteEngine   on
     RewriteBase     /
     RewriteRule     ^rewrite\.htm$  rewrite.html [R=permanent]
</Directory>

When a request arrives for rewrite.htm, it is rewritten to rewrite.html.  This is demonstrated by http://test.freebsddiary.org/rewrite.htm which will rewrite the URL to http://test.freebsddiary.org/rewrite.html.   This is a simple solution and works well for one file.

All files

We will now look at how we can do rewrite rules for all files.  But in this example, we’ll use a silly file extension name, just because we can.
<Directory "/www/test.freebsddiary.org">
        RewriteEngine  on
        RewriteBase     /
        RewriteRule     ^(.*)\.xyz$   $1.html [R=permanent]
</Directory>

The above translates any request for an .xyz file to a .html file.  As I had renamed all such files, this is enough for me.  It also changes the URL in the user’s browser.  If they had requested foo.xyz, their browser will display foo.html.   If foo.html doesn’t exist, they will get the normal error screen.

The "=permanent" indicates to the client that this is a permanent change in the URL (HTTP status 301).  If you don’t supply this option, the relocation is deemed temporary (HTTP status 302).

To demonstrate this rewrite, click on http://test.freebsddiary.org/rewrite.xyz which will rewrite the URL to http://test.freebsddiary.org/rewrite.html.   Given the above rule, http://test.freebsddiary.org/rewrite2.xyz will not work because there is no file named rewrite2.html at the test webserver.
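
To get a feel for what the pattern does before touching httpd.conf, the matching can be approximated outside Apache.  The following is a minimal Python sketch, not Apache itself: it assumes Python's re module treats this simple pattern the same way mod_rewrite does, and the name rewrite_xyz is made up for illustration.  Like a rule inside a <Directory> block, it expects the path without a leading slash.

```python
import re

# Python stand-in for: RewriteRule ^(.*)\.xyz$ $1.html [R=permanent]
XYZ_RULE = re.compile(r"^(.*)\.xyz$")


def rewrite_xyz(path):
    """Return (status, new_path) when the rule matches, otherwise None."""
    match = XYZ_RULE.search(path)
    if match is None:
        return None  # the rule does not apply; the request is left alone
    # $1 is whatever the parentheses captured; R=permanent means status 301
    return 301, match.group(1) + ".html"


print(rewrite_xyz("rewrite.xyz"))   # (301, 'rewrite.html')
print(rewrite_xyz("rewrite.html"))  # None
```

Note that the sketch, like the rule, never checks whether the .html file exists; that is why a request for rewrite2.xyz is redirected and then fails.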

I have seen a solution which first checks whether foo.html exists and, if it does, returns foo.html.  If foo.html does not exist, the URL in the browser remains unchanged at foo.htm.  I wrote about it here.

A more complex solution

NOTE: In recent testing, I was unable to get this solution to work.

This solution deals with moving files from one directory to another as well as renaming the extension.  Here is what I added to the virtual host section of /usr/local/etc/apache/httpd.conf:

<Directory "/www/test.freebsddiary.org/example">
       RewriteEngine  on
       RewriteBase    /
       RewriteRule    ^(.*)\.htm$  $1.html [R=permanent]
</Directory>

Now, http://test.freebsddiary.org/example/rewrite.htm will rewrite the URL to http://test.freebsddiary.org/rewrite.html.

Redirect and rewrite – the file is on another server in another directory

If you want to redirect racesys to http://www.racingsystem.com/, here is what I used.  Within the .htaccess on freebsddiary.org, I placed this:
Redirect  permanent /racesys http://www.racingsystem.com/racesys

This redirects the client to http://www.racingsystem.com/racesys.   At that website, I have this:

<Directory "/www/racingsystem.com/racesys">
        AllowOverride   All
        RewriteEngine   on
        RewriteBase     /
        RewriteRule     ^$     /      [R=permanent]
</Directory>

This rewrite says that for an empty string (i.e. ^$), rewrite the URL to be just /.   And the URL becomes http://www.racingsystem.com/.

Why did we not redirect straight to http://www.racingsystem.com/ in the first place?  Because I also have these types of rewrites on the website in addition to the above:

RewriteRule ^booksmags\.htm$      booksmags.html  [R=permanent]
RewriteRule ^download\.htm$       download.html   [R=permanent]
RewriteRule ^enhance\.htm$        enhance.html    [R=permanent]

I preferred to put those rewrites in the /racesys/ directory rather than in the main directory.

.htaccess solution

These solutions can also be accomplished with .htaccess entries.   This is useful to know if you are not running your own webserver and do not have access to httpd.conf.  Here’s what I put into my .htaccess for this solution:
RewriteEngine  on
RewriteBase    /
RewriteRule    ^(.*)\.htm$  $1.html [R=permanent]

The virtual host in question must allow FileInfo to be overridden.  AllowOverride All, as in the following, includes FileInfo:

<Directory "/www/freebsddiary.org/freebsd">
        AllowOverride   All
</Directory>

Changing the file extensions – if needed

This set of rules is based on an example from http://www.engelschall.com/pw/apache/rewriteguide/#ToC21 and can be used when you have renamed files from .htm to .html.   It first checks to see if a file with the new extension exists.  If it does, it returns that URL.  Otherwise, it returns the original URL.
RewriteEngine on
RewriteBase   /
RewriteRule   ^(.*)\.htm$              $1      [C,E=WasHTM:yes]
RewriteCond   %{REQUEST_FILENAME}.html -f
RewriteRule   ^(.*)$ $1.html                   [S=1,R]
RewriteCond   %{ENV:WasHTM}            ^yes$
RewriteRule   ^(.*)$ $1.htm
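
The decision those rules make can be sketched in a few lines of Python.  This is only a simplified model of the outcome (prefer the .html file if it is there, otherwise leave the .htm request alone), not of how Apache chains and skips rules; the function name resolve and the docroot argument are made up for illustration.

```python
import tempfile
from pathlib import Path


def resolve(docroot, path):
    """Return the path to serve: NAME.html when that file exists, else the original."""
    if not path.endswith(".htm"):
        return path  # the first rule does not match; nothing else happens
    candidate = path[: -len(".htm")] + ".html"
    if (Path(docroot) / candidate).is_file():  # plays the role of the "-f" test
        return candidate
    return path  # no new-style file, so the original .htm name is kept


with tempfile.TemporaryDirectory() as docroot:
    (Path(docroot) / "about.html").write_text("renamed page")
    print(resolve(docroot, "about.htm"))  # about.html
    print(resolve(docroot, "old.htm"))    # old.htm
```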

Redirecting/rewriting for a specific file

If you have moved a file from one server to another, this is my favorite method for redirecting.  I put this within the virtual host section of the website in question.
Redirect  permanent /cats/   http://www.freebsddiary.org/cats/

You should also read Redirecting URL requests with Apache for more information on redirects.

If you have renamed a file, and wish to redirect incoming requests, you can do this:

RewriteEngine   on
RewriteBase     /
RewriteRule     ^about\.htm$  about.html   [R=permanent]

The above will result in requests for about.htm being redirected to about.html.   The "^" anchors the pattern to the start of the requested path.  The "\" is an escape which makes the "." match a literal period rather than any character.  The "$" anchors the pattern to the end of the path.
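
If the role of those three characters is unclear, it can be tried out with any regular expression engine.  Here is a small Python sketch, assuming Python's re module behaves like mod_rewrite for a pattern this simple; the helper name matches is made up for illustration.

```python
import re


def matches(pattern, path):
    """True if the pattern is found anywhere in path (unanchored unless ^ and $ are used)."""
    return re.search(pattern, path) is not None


print(matches(r"^about\.htm$", "about.htm"))      # True: exact match
print(matches(r"^about\.htm$", "about.html"))     # False: "$" wants the path to end after "htm"
print(matches(r"^about.htm$", "about-htm"))       # True: an unescaped "." matches any character
print(matches(r"about\.htm", "old/about.html"))   # True: without anchors a partial match is enough
```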

Redirects vs rewrites

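When should you use a redirect?  When should you use a rewrite?   If the file is on the same website, you should use a rewrite.  If the file is on another server, you should use a redirect.  Why?  A simple answer is bandwidth.  A redirect sends the new URL back to the client and the client must reissue the URL request, which creates more traffic.  With an internal rewrite (a RewriteRule without the R flag), Apache serves the new file in response to the original request.  The client does not have to reissue anything, and the URL shown in the browser does not change.  Note that the rules in this article use [R=permanent], which makes mod_rewrite send a redirect to the client, so those requests are reissued just as they are with Redirect.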
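
The difference in traffic can be illustrated with a toy model.  The Python sketch below is not a web server; it only counts how many requests a client would send, assuming the rewrite is an internal one (no R flag).  All the function names are made up for illustration.

```python
def external_redirect(path):
    """Like a rule with [R=permanent]: answer 301 and name the new location."""
    if path.endswith(".htm"):
        return 301, path + "l"
    return 200, path


def internal_rewrite(path):
    """Like a rule without the R flag: the .html file is served for the original request."""
    return 200, path  # the client never learns about the new name


def fetch(path, handler, max_hops=5):
    """Follow redirects the way a browser would; return (final URL, requests sent)."""
    for requests_sent in range(1, max_hops + 1):
        status, location = handler(path)
        if status not in (301, 302):
            return path, requests_sent
        path = location  # reissue the request for the new location
    raise RuntimeError("too many redirects")


print(fetch("/search.htm", external_redirect))  # ('/search.html', 2)
print(fetch("/search.htm", internal_rewrite))   # ('/search.htm', 1)
```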

What I’m using now

NOTE: Since writing this article, I have removed these rewrites from my webserver.

When I rearranged the Diary, I moved everything from /freebsd/ into /.  I wanted the old URLs to still work.  The following is the contents of /freebsd/.htaccess:

RewriteEngine on
RewriteRule ^$ / [R=permanent]
RewriteBase /
RewriteRule ^(.*)\.htm$ $1.html [R=permanent]

The following describes each of the above lines:

  1. turn on the rewrite engine.
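  2. if the URL does not contain a file name, then redirect to /.
  3. sets the base URL that is prepended to relative substitutions.  In a .htaccess file, a substitution such as $1.html would otherwise be resolved relative to the /freebsd/ directory.  With RewriteBase / it is resolved relative to /, which is what removes the /freebsd/ directory from the URL.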
  4. converts any .htm extension to a .html extension.

Line 2 allows for http://www.freebsddiary.org/freebsd/ to take you to the home page.  Lines 3/4 allow http://www.freebsddiary.org/freebsd/ed1.htm to still work.
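
How the four lines above work together can be approximated in Python.  This is a rough sketch under two assumptions: that rules in a .htaccess file see the path with the directory prefix (/freebsd/) removed, and that a relative result then has the RewriteBase put in front of it.  The names DIR_PREFIX, REWRITE_BASE and redirect_target are made up for illustration.

```python
import re

DIR_PREFIX = "/freebsd/"  # the directory that holds the .htaccess
REWRITE_BASE = "/"        # RewriteBase /

RULES = [
    (re.compile(r"^$"), "/"),                   # RewriteRule ^$ / [R=permanent]
    (re.compile(r"^(.*)\.htm$"), r"\1.html"),   # RewriteRule ^(.*)\.htm$ $1.html [R=permanent]
]


def redirect_target(url_path):
    """Return the path the client is redirected to, or None if no rule applies."""
    if not url_path.startswith(DIR_PREFIX):
        return None  # this .htaccess is never consulted
    local = url_path[len(DIR_PREFIX):]  # what the patterns are matched against
    for pattern, substitution in RULES:
        match = pattern.search(local)
        if match:
            result = match.expand(substitution)
            if not result.startswith("/"):
                result = REWRITE_BASE + result  # relative result: prepend the base
            return result
    return None


print(redirect_target("/freebsd/"))         # /
print(redirect_target("/freebsd/ed1.htm"))  # /ed1.html
```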

Coming soon to a log file near you!

Here’s what I get in my log files if someone browses to http://www.freebsddiary.org/freebsd/search.htm:
"GET /search.htm HTTP/1.0" 302 335 "-" "Mozilla/3.01Gold"
"GET /search.html HTTP/1.0" 200 2304 "-" "Mozilla/3.01Gold"

As you can see, the first request for search.htm is shown.  The code 302 is the HTTP status for a temporary redirect (a rule with R=permanent returns 301 instead).  Then you can see the real page being requested, search.html.
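
If you want to pick the redirected requests out of a larger log, a short script will do it.  The following Python sketch assumes the log lines contain the request, status and size fields in the form shown in the excerpt above; the names LOG_FIELDS and redirected_paths are made up for illustration.

```python
import re

# Matches the quoted request line followed by the status code and response size.
LOG_FIELDS = re.compile(
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def redirected_paths(lines):
    """Return the paths of requests that were answered with a 301 or 302."""
    found = []
    for line in lines:
        match = LOG_FIELDS.search(line)
        if match and match.group("status") in ("301", "302"):
            found.append(match.group("path"))
    return found


sample = [
    '"GET /search.htm HTTP/1.0" 302 335 "-" "Mozilla/3.01Gold"',
    '"GET /search.html HTTP/1.0" 200 2304 "-" "Mozilla/3.01Gold"',
]
print(redirected_paths(sample))  # ['/search.htm']
```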

You can also log the rewrites by putting the following within your virtual host definition:

RewriteLog      /var/log/apache/racingsystem.com-rewrite.log 
RewriteLogLevel 1

Rewrite logging should only be used for debugging, as high levels of logging can dramatically affect performance.  See http://www.apache.org./docs/mod/mod_rewrite.html for details.

  9 Responses to “Rewriting URLs within Apache”

  1. I found this article to be helpful. I have been scouring the web for info on rewrite. Are there any principal resources other than the apache doc and the rewriting guide?

    At one point the author used [R=permanent], what would happen if he/she didn’t use permanent??

    thanks, and again, I enjoyed the article.

    -john

    • Practical examples are what we’re all about. Documentation is fine, but without a practical example, it is all too frequently too difficult to translate the documentation into something useful.

      Permanent, AFAIK, is a signal to the browser, perhaps that could be useful to the browser.

      Also, if you don’t do permanent, AFAIK, the URL in the browser remains unchanged. If you do it as permanent, the URL shown to the user is actually the new URL, not the old one.

      It would be good if someone could verify the above with their own testing.

    • R=permanent sends a 301, which is basically "Document has moved", and a plain R sends a 302, which is a plain redirect. Little difference that I can see, except maybe caches and proxies handle it differently!

      Owen Hindman

    • can anyone give me a quick example on how to remove or replace spaces in a url before it hits php? can i do that with mod_rewrite? i have to pass a generated variable to php and it sometimes contains spaces. most browsers will correct this, but the one i have to use does not.

    • Hi there, found your article to be one of the most informative on the subject. Thanks for that.

      I have a particular problem and thought perhaps you could shed some light on it for me.

      I am playing with a CMS package called Sitellite. There is an included .htaccess file to rewrite the URLs, but it seems to be written for Apache2.

      My server is running 1.3.x and I have been trying to get a version of their file to work, but to no avail. I’ve read quite a few articles and although learning, am worried about the order of events as well as the various options. Having real trouble NOT getting server error 500 most of the time.

      Is it possible you could show me, by way of example, the correct htaccess code for my Apache 1.3.x?

      Here is what they’ve provided;

      <IfDefine APACHE2>
      AcceptPathInfo On
      </IfDefine>

      # Let Apache know that ‘index’ is really a PHP script in disguise.
      <Files index>
      ForceType application/x-httpd-php
      </Files>

      # Let Apache know that ‘sitellite’ is also a PHP script in disguise.
      <Files sitellite>
      ForceType application/x-httpd-php
      </Files>

      # Make SCS the directory index handler (instead of index.html or index.php).
      DirectoryIndex index index.html index.php

      # Instruct Apache to treat XT templates as HTML files upon direct access.
      # Useful for previewing.
      AddType text/html .tpl

      • David wrote:

        > Hi there, found your article to be one of the most informative
        > on the subject. Thanks for that.

        You are welcome.

        > I have a particular problem and thought perhaps you could shed
        > some light on it for me.

        Not here I can’t. This is for article feedback. If you’re looking for help, please try the Support Forum. That is where you will find help.


        The Man Behind The Curtain

  2. I’ve tried implementing a variety of solutions to rewrite .htm requests to .php files and serve the original .htm if a .php file isn’t there.

    I got the suggestion from this site to work fine in an .htaccess file (although could not get it to work in my Virtual Hosts config):

    RewriteEngine on
    RewriteBase /
    RewriteRule ^(.*)\.htm$ $1 [C,E=WasHTM:yes]
    RewriteCond %{REQUEST_FILENAME}.php -f
    RewriteRule ^(.*)$ $1.php [S=1,R=permanent]
    RewriteCond %{ENV:WasHTM} ^yes$
    RewriteRule ^(.*)$ $1.htm

    The problem is that if someone requests an index.htm file somewhere on the site where there is neither an index.htm nor an index.php, the server shows a 403 Forbidden error rather than a 404 File not Found. This is definitely caused by this rewrite, as without it I get 404s. How can I stop this from happening??

    Sally.

  3. Hello. I want to install a PPTP server but I can’t. I installed it but it produces some errors. The error is:

    /usr/sbin/pppdU unknown host: loop
    Mar 14 20:36:40 pptp pppd[805]: unknown host: loop
    Mar 14 20:36:40 pptp pppd[804]: GRE: read(fd05,buffer=804d500,len=8196) from PTY failed: status = 0 error = No error
    Mar 14 20:36:40 pptp pppd[804]: CTRL: PTY read or GRE write failed (pty,gre)=(5,6)

    You can help me.
    My configured file.

    My internal LAN IP is 192.168.0.1

    external LAN 202.55.*.*

    Client IP 192.168.0.2
    PPTP server’s config files
    /usr/local/etc/pptpd.conf

    option /etc/ppp/ppp.conf
    #bcrelay eth1
    debug
    nobsdcomp
    proxyarp
    localip 192.168.0.1
    remoteip 192.168.0.200-238
    pidfile /var/run/pptp.pid
    ??v2
    mppe-40
    mppe-128
    mppe-stateless

    etc/ppp/ppp.conf

    loop:
    set timeout 0
    set log phase chat connect lcp ipcp command
    set device localhost:pptp
    set dial
    set login
    # Server (local) IP address, Range for Client, and Netmask
    # if you want to use NAT use private IP addresses
    set ifaddr 192.168.0.1 192.168.0.200-192.168.0.254 255.255.255.0
    add default HISADDR
    set server /tmp/loop "" 0177

    loop-in:
    set timeout 0
    set log phase lcp ipcp command
    allow mode direct

    pptp:
    load loop
    disable pap
    # Authenticate against /etc/passwd
    enable passwdauth
    disable ipv6cp
    enable proxy
    accept dns
    enable MSChapV2
    enable mppe
    disable deflate pred1
    deny deflate pred1
    set dns 202.55.176.10
    set device !/etc/ppp/secure

    /etc/ppp/ppp.secret

    #user #password
    manlai manlai
    1982 1

    /etc/ppp/secret

    #!/bin/sh
    exec /usr/sbin/ppp -direct loop-in