Ben's Web Programming Pages

Please note that these pages date from 2003 and are near prehistoric in internet terms. It was good stuff when it was written, but old hat now.

These pages are not maintained and I no longer deal with queries about them. They remain here for historical interest.

.htaccess files

A feature of the Apache server is the ability to do fancy operations on a per-directory basis. The .htaccess file is used to do this. Put the .htaccess file in the directory to which you want the directives to apply, and it will also apply to all the subdirectories of that directory.

Note that not all of the following operations may be permitted on your web server. That depends on your ISP's server configuration and policies.

 

Redirection

Alongside the virtual host facility described on the hosting page, I want to make sure that people who try to access the subdirectories directly are redirected to the correct URL. This ensures that links within the sites that are given relative to the site root work correctly. I.e., I want the link <a href="/"> in hannah.edgingtonfamily.org always to point to /hannah/, not to the real site root. This would be a problem if someone accessed the site as http://www.edgingtonfamily.org/hannah/. Therefore I arrange for that URL to redirect the user agent (UA) to the correct place by putting the following in the real root .htaccess file.

/.htaccess

Redirect permanent /hannah/ http://hannah.edgingtonfamily.org/
Redirect permanent /ben/ http://ben.edgingtonfamily.org/

Redirection can also be useful if you move things around within your site to point people transparently to the new location. But don't do this too much; it easily gets very complicated!
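To make the effect of the two Redirect lines concrete, here is a rough Python model of the prefix matching they perform. It is only a sketch of the behaviour, not Apache's implementation: the function name redirect_target is made up for illustration, and the real directive has more options than are modelled here.

```python
# Simplified model of the two "Redirect permanent" lines in /.htaccess.
# Each entry is (URL-path prefix, target URL), copied from the file above.
REDIRECTS = [
    ("/hannah/", "http://hannah.edgingtonfamily.org/"),
    ("/ben/", "http://ben.edgingtonfamily.org/"),
]


def redirect_target(path):
    """Return the URL the request path is redirected to, or None if no rule applies."""
    for prefix, target in REDIRECTS:
        if path.startswith(prefix):
            # Whatever follows the matched prefix is carried over to the target.
            return target + path[len(prefix):]
    return None
```

Under this model a request for /hannah/photos/index.html (a hypothetical page) is sent to http://hannah.edgingtonfamily.org/photos/index.html, so deep links survive the redirect.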

 

URL rewriting

Another use of the .htaccess file is to perform URL rewriting. This is different from redirection as the UA knows nothing about it: it is entirely internal to the web server. Rewriting is a bit of black magic that I use primarily to hide from browsers and search engines that they are accessing dynamic content rather than static content. The main point of this is to encourage caching of the content by the browser and intermediate web caches, which is kind to the internet, and should improve the loading speed of the web pages. This is an adjunct to sending the correct HTTP headers.

For example a request to the GIF-server might look like http://www.edginet.org/gifs/tl-ff0000.gif - a perfectly normal URL. This, however, is rewritten by the web server in accordance with the rules in the /gifs/.htaccess file to point to the GIF-server program: http://www.edginet.org/gifs/gif.pl/tl-ff0000.gif. This request invokes the gif.pl script which then treats the final part of the request as a program parameter, and generates the gif accordingly.

/gifs/.htaccess

Options ExecCGI FollowSymLinks
RewriteEngine on
RewriteRule ^([tb]?[lr]-[0-9a-f]{6}\.gif)$ gif.pl/$1

Note the Apache options at the beginning of the file. The ExecCGI option allows the Perl CGI script, gif.pl, to be executed by the server. The FollowSymLinks option is necessary to allow the rewrite engine to work. If your ISP will not allow you to set FollowSymLinks in a directory then you will not be able to use URL rewriting.
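The pattern in the RewriteRule can be tried out away from the server. The following Python sketch applies the same regular expression to the part of the path that follows /gifs/, which is what a rule in a per-directory .htaccess file is matched against. The function name rewrite_gif is invented for this illustration, and Python's regular expression engine is assumed to behave like Apache's for a pattern this simple.

```python
import re

# Same pattern as the RewriteRule in /gifs/.htaccess.
GIF_RULE = re.compile(r"^([tb]?[lr]-[0-9a-f]{6}\.gif)$")


def rewrite_gif(relative_path):
    """Return the rewritten path, or None if the rule does not match."""
    match = GIF_RULE.match(relative_path)
    if match is None:
        return None
    # $1 in the rule is the first capture group.
    return "gif.pl/" + match.group(1)
```

Note that the rewritten path, gif.pl/tl-ff0000.gif, does not itself match the pattern, so in this model the rule cannot be applied a second time to its own output.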

Another example is the sermon-server. I transformed the original static pages into pages created dynamically. The original pages, however, had already been indexed by search engines, so I was reluctant to change the URLs. Rewriting to the rescue!

/christian/sermons/.htaccess

Options FollowSymLinks
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !index\.html
RewriteBase /christian/sermons/
RewriteRule ^(.+)\.html$ sermons.php?file=$1

This set of rules will rewrite a request http://www.edginet.org/christian/sermons/gen40.html into a request http://www.edginet.org/christian/sermons/sermons.php?file=gen40. The RewriteCond prevents the transformation from being applied to requests for the directory index file.

Again we need the FollowSymLinks Apache option, but PHP scripts are not run under CGI, so we do not need the ExecCGI option in this directory.
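The interplay of the RewriteCond and the RewriteRule can be modelled in a few lines of Python. This is a sketch only: rewrite_sermon is a made-up name, and the filesystem path used in the usage note is hypothetical, since the real value of REQUEST_FILENAME depends on where the server keeps the site.

```python
import re

RULE = re.compile(r"^(.+)\.html$")   # RewriteRule pattern
COND = re.compile(r"index\.html")    # RewriteCond pattern, negated by "!"


def rewrite_sermon(relative_path, request_filename):
    """Return the rewritten path, or the original path if nothing applies."""
    match = RULE.match(relative_path)
    if match is None:
        return relative_path
    # The "!" in the RewriteCond means: only rewrite when the pattern is NOT found.
    if COND.search(request_filename):
        return relative_path
    return "sermons.php?file=" + match.group(1)
```

For example, rewrite_sermon("gen40.html", "/var/www/christian/sermons/gen40.html") gives "sermons.php?file=gen40", whereas the same call for index.html returns the path untouched.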

The .htaccess file in this directory is similar,

Options FollowSymLinks
RewriteEngine on
RewriteBase /techie/website/
# The file basename is the "file=" parameter to index.php
RewriteCond %{REQUEST_FILENAME} !index\.html
RewriteRule ^(.+)\.html$ index.php?file=$1 [QSA]

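The QSA ("query string append") flag instructs the server to append any query string from the original URL to the rewritten URL. Without it, the original query string would be discarded, because the rewritten URL supplies a query string of its own. In the case of this directory the original query string might be a "skin=..." parameter.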
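The difference QSA makes is easiest to see side by side. The sketch below models only the rule itself (the RewriteCond is left out for brevity); the function name, the page name htaccess.html, and the value skin=blue are assumptions chosen for illustration.

```python
import re


def rewrite(relative_path, original_query="", qsa=True):
    """Return (path, query string) after applying the rule, with or without QSA."""
    match = re.match(r"^(.+)\.html$", relative_path)
    if match is None:
        return relative_path, original_query
    new_query = "file=" + match.group(1)
    if qsa and original_query:
        # QSA: keep the visitor's own parameters after the new ones.
        new_query += "&" + original_query
    return "index.php", new_query
```

With QSA, rewrite("htaccess.html", "skin=blue") yields ("index.php", "file=htaccess&skin=blue"); with qsa=False the skin parameter is lost and only "file=htaccess" remains.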

For the gory details about URL rewriting see the mod_rewrite manual.

 

Password protection

I also use .htaccess files to control access to certain directories. For example, I have a script which marks up my server log files as HTML, but I want to prevent you from looking at my log files. Therefore it is password protected by means of the .htaccess file.

A basic authorisation file for a directory looks like this,

.htaccess

AuthType Basic
AuthName "Caption for dialogue box"
AuthUserFile /full/path/to/.htpasswd
require valid-user

The .htpasswd file looks like this

.htpasswd

user1:$apr1$.VSrd/..$QyTXXcU258jvxUjtIrlCS.
user2:$apr1$AR1hy...$jLcxh.ubgsmkEcAe6.KVJ0

I created this using the htpasswd utility that comes with Apache. The command line looks like this,

$ htpasswd -n -m user1 > .htpasswd
New password: 
Re-type new password: 
$ htpasswd -n -m user2 >> .htpasswd
New password: 
Re-type new password: 

The -m option generates MD5-based password hashes, which seem to be the most portable between platforms. You may wish to store your .htpasswd file outside the web server's directory structure if possible so that it cannot be retrieved by a potential attacker. Just be sure to specify the full path to the .htpasswd file in your .htaccess file.
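Each line of the .htpasswd file shown above has the form user:$apr1$salt$digest, where the apr1 tag marks the MD5-based scheme that -m selects. As a small illustration, this Python sketch splits one such line into its fields. The function name is invented, and the sketch only takes the line apart; it makes no attempt to verify a password.

```python
def parse_htpasswd_line(line):
    """Split 'user:$scheme$salt$digest' into its component fields."""
    user, _, stored = line.rstrip("\n").partition(":")
    parts = stored.split("$")
    # "$apr1$salt$digest" splits into ["", "apr1", salt, digest].
    if len(parts) == 4 and parts[0] == "":
        return {"user": user, "scheme": parts[1], "salt": parts[2], "digest": parts[3]}
    # Any other format is returned whole, with the scheme left unknown.
    return {"user": user, "scheme": None, "salt": None, "digest": stored}
```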

In this example I created two users, user1 and user2, who are given different passwords. For many applications it would be sufficient to have only one. Sometimes, however, multiple users may be useful. First, so that they don't have to share a password. Second, because you may wish to deliver different data to or assign different privileges to different users. For example, a normal user may have read-only access to a database; an administrator user may be given write access as well.

PHP gives you access to the username somebody has authenticated under in the $_SERVER['REMOTE_USER'] variable. So in a database application I've written I check a user's privileges as follows,

$admin_priv = ($_SERVER['REMOTE_USER'] == 'admin');

If a user has succeeded in authenticating under the name "admin" then $admin_priv is set to true. Otherwise it is set to false.
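The same check is not tied to PHP. For a CGI script the server passes the authenticated name in the REMOTE_USER environment variable, so a script in another language can do the equivalent. A minimal Python sketch, with a made-up function name, might look like this:

```python
import os


def has_admin_privilege(environ=os.environ):
    """True only if the server authenticated the request under the name 'admin'."""
    # REMOTE_USER is absent when the request was not authenticated at all.
    return environ.get("REMOTE_USER") == "admin"
```

Passing the environment in as a parameter is just a convenience so that the function can be exercised with a plain dictionary.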

 

File security

If you want to prevent people from viewing or downloading any files in a particular directory you can just place a .htaccess file like this in that directory.

# prevent reading of all files
<Files *>
    Deny From All
</Files>

For example, I have a directory which holds include files containing various passwords and web-service keys that I don't want anybody to be able to see. Ideally these would be stored outside the directory tree that the web server can serve up, but on some configurations that is not possible. In this case the .htaccess file above prevents anyone from viewing the files over the web whilst still allowing them to be included in PHP scripts, since PHP reads them directly from disk.

You can be more selective about what you allow people to see. This .htaccess file will prevent people from seeing just the files with the .inc suffix. Anything else is accessible.

# prevent reading of .inc files
<Files *.inc>
    Deny From All
</Files>
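
The wildcard in a Files section is matched against the last component of the file name, not the whole path. The following Python sketch approximates that with shell-style matching, to show which names a pattern such as *.inc would and would not cover. The function name and the example file names are hypothetical, and Python's fnmatch is assumed to be close enough to Apache's matching for simple patterns like these.

```python
import posixpath
from fnmatch import fnmatchcase


def is_blocked(path, pattern="*.inc"):
    """True if the file's basename matches the wildcard pattern."""
    return fnmatchcase(posixpath.basename(path), pattern)
```

Under this model is_blocked("lib/db.inc") is true, but is_blocked("db.inc.php") is false, so a file renamed with a trailing .php would fall outside the *.inc rule.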
 

PHP options

You may not have access to the PHP configuration files at your ISP, but you still might be able to change PHP configuration options in your .htaccess files.

For example, for security reasons you may wish to ensure that the PHP functionality "register_globals" is turned off. This is the mechanism whereby URL query parameters are automatically turned into PHP variables. It's fantastically convenient, but it too easily allows the inadvertent creation of security holes by means of uninitialised variables. So from PHP version 4.2.0 it is turned off by default. But my guess is that many ISPs will turn it back on again, as many scripts would otherwise break completely.

In any case, you can turn off "register_globals" on a per-directory basis by putting the following in the .htaccess file. (Or turn it back on again if you must.)

php_flag register_globals off
 

MIME types

Yet another configuration option that can be set in the directory .htaccess file is the MIME type that the server will supply for a given file extension when the response does not set one itself (as a script can, by sending its own Content-Type header).

For example, the M'Cheyne server provides XML output which links to an XSL stylesheet, server.xsl. To persuade Mozilla and Netscape to apply this stylesheet it needs to be served with MIME type text/xml. Unfortunately my web server was not configured to do this and served it up with MIME type text/plain (leading to much head-scratching). Never fear! I put the following line into the directory .htaccess file to associate the .xsl file extension with MIME type text/xml and all was well.

AddType text/xml xsl

You can check what MIME types your pages are being served with by looking at the Content-Type header using my HTTP header viewer.
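
If a header viewer is not to hand, the same check can be scripted. This is a minimal sketch using only the Python standard library; the function names are made up, and it assumes the server answers HEAD requests, which most do.

```python
import urllib.request


def media_type(content_type_header):
    """Strip parameters such as charset and return just the MIME type."""
    return content_type_header.split(";", 1)[0].strip().lower()


def fetch_media_type(url):
    """Ask the server for the headers of url and return its MIME type."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return media_type(response.headers.get("Content-Type", ""))
```

Calling fetch_media_type on the URL of a stylesheet should then report text/xml once the AddType line has taken effect.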


Copyright © 2003 Ben Edgington.