How to Use Mod-Rewrite to Simplify URL Rewriting in Apache - A Basic Guide to the Mod-Rewrite Module

Friday, November 2, 2012

Introduction
URL Rewriting is the process of manipulating a URL or link that has been sent to a web server, so that the server dynamically modifies it to include additional parameters and information and then serves a different resource. The web server performs all these manipulations on the fly, so the browser is kept out of the loop regarding the change made to the URL and the internal redirection.
URL Rewriting can benefit your websites and web-based applications by providing better security and better visibility with search engines, and it helps keep the structure of the website easier to maintain when changes are needed later.
In this article we will be taking a look at how we can implement URL Rewriting on an Apache based web server environment using the mod_rewrite module for Apache.
What is mod_rewrite?
Mod_rewrite is one of the most popular modules for the Apache web server, and many web developers and administrators would vote it the best thing to happen to Apache. The module has so many tricks up its sleeve that it is often called the Swiss Army Knife of Apache modules. Apart from providing simple URL Rewriting functionality for an Apache based website, it arms the website with better URL protection, better search engine visibility, protection against bandwidth thieves by stopping hot linking, hassle-free restructuring, and friendlier URLs for the website's users. Because of its versatility the module can at times feel a bit daunting, but a thorough understanding of the basics will take you a long way towards mastering URL Rewriting.
Let's Begin! - A look at all the stuff you need to have on your test environment to get mod_rewrite alive and kicking.
First and foremost you should have a properly configured Apache web server on your test machine. Mod_rewrite is usually installed along with the Apache server, but in case it is missing - this can be the case on a Linux machine where the mod_rewrite module was not compiled along with the installation - you will have to get it installed. To use mod_rewrite on your Apache box you will have to configure Apache to load the module dynamically. On a shared server you will have to contact your web hosting company to get this module installed and loaded on Apache.
On your local machine you can find out whether the module is installed along with Apache by having a look at the modules directory of Apache. Check for a file named mod_rewrite.so; if it is there, the module can be loaded into the Apache server dynamically. By default this module is not loaded when Apache starts, and you need to tell Apache to load it by making a change in the web server's configuration file, which is explained below.
How to Enable mod_rewrite on Apache?
You can make the mod_rewrite module load dynamically into the Apache web server environment using the LoadModule directive in the httpd.conf file. Open this file in a text editor and find a line similar to the one given below.
#LoadModule rewrite_module modules/mod_rewrite.so
Uncomment this line by removing the # and save the httpd.conf file. Restart your Apache server and if all went well mod_rewrite module will now be enabled on your web server.
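As a rough sketch of what the relevant part of httpd.conf might look like once the module is enabled, consider the fragment below. The directory path /var/www/html is only an assumption for illustration; use your own document root. The Directory block is included because rewrite rules placed in .htaccess files only take effect when AllowOverride permits them (FileInfo or All) and symbolic link following is enabled for that directory.

```apache
# httpd.conf (sketch; the directory path is an assumption)
LoadModule rewrite_module modules/mod_rewrite.so

<Directory "/var/www/html">
    # Per-directory rewriting requires symlink following to be enabled
    Options FollowSymLinks
    # Allow .htaccess files in this tree to use mod_rewrite directives
    AllowOverride FileInfo
</Directory>
```

On many installations, running apachectl -M after the restart lists the loaded modules, and rewrite_module should appear in that list.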
Let's Rewrite our first URL using mod_rewrite
OK, now the mod_rewrite module is loaded on your server. Let's have a look at how to switch the rewrite engine on and make it work for us.
In order to turn the rewrite engine on you have to add a single line to your .htaccess file. The .htaccess files are configuration files with Apache directives defined in them, and they provide distributed, directory-level configuration for a website. Create a .htaccess file in your web server's test directory - or any other directory in which you want to make URL Rewriting active - and add the line given below to it.
RewriteEngine on
Now we have the rewrite engine turned on and Apache is ready to rewrite URLs for you. Let's look at a sample rewrite instruction that makes a request to our server for first.html get answered with second.html at server level. Add the line given below to your .htaccess file, after the RewriteEngine directive that we added before.
RewriteRule ^first.html$ second.html
I will explain what we have done here in the next section, but if all went well then any request for first.html made on your server will be answered with second.html. This is one of the simplest forms of URL Rewriting.
A point to note here is that the rewrite is kept totally hidden from the client, and this differs from the classic HTTP redirect. The client or the browser is given the impression that the content of second.html is being fetched from first.html. This enables websites to generate URLs on the fly without the client's awareness and is what makes URL Rewriting very powerful.
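Putting the two directives together, a complete .htaccess for this first experiment could look like the sketch below. It escapes the dot in the pattern (\.) so that it matches only a literal dot; that is a small refinement of the rule shown above, not a different technique.

```apache
# .htaccess (sketch)
RewriteEngine on

# Internal rewrite: the browser asks for first.html, Apache serves second.html,
# and the address bar still shows first.html.
RewriteRule ^first\.html$ second.html
```

One way to see the difference from an HTTP redirect is to look at the response headers for first.html: with an internal rewrite the status is a plain 200 and there is no Location header pointing at second.html.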
Basics of mod_rewrite module
Now we know that mod_rewrite can be enabled for an entire website or a specific directory by using a .htaccess file, and we have written a basic rewrite directive in the previous example. Here I will explain what exactly we did in that first sample rewrite.
The mod_rewrite module provides a set of configuration directives for URL Rewriting, and the RewriteRule directive - which we saw in the previous sample - is the most important one. The mod_rewrite engine uses pattern-matching substitutions for making the translations, which means a good grasp of Regular Expressions can help you a lot.
Note: Regular Expressions are so vast that they will not fit into the scope of this article. I will try to write another article on that topic someday.
1. The RewriteRule Directive
The general syntax of the RewriteRule is very straightforward.
RewriteRule Pattern Substitution [Flags]
The Pattern is the regular expression that the rewrite engine tries to match against the incoming URL. In our first sample ^first.html$ is the Pattern. (Strictly speaking, an unescaped dot matches any character, so ^first\.html$ is the more precise way to write it.)
The Substitution is the replacement or translation that is applied when the pattern matches the URL. In our sample second.html is the Substitution part.
Flags are optional, and they make the rewrite engine do certain other tasks apart from just doing the substitution on the URL string. The flags, if present, are written within square brackets and separated by commas.
Let's take a look at a more complex rewrite rule. Take a look at the following URL.
yourwebsiteurl/articles.php?category=stamps&id=122
Now we will convert the above URL into a search engine and user friendly URL like the one given below.
yourwebsiteurl/articles/stamps/122
Create a page called articles.php with the following code:
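<?php
// Read the two query string parameters filled in by the rewrite rule.
$category = $_GET['category'];
$id = $_GET['id'];
// Escape both values before printing, since they come straight from the URL.
echo "Category : " . htmlspecialchars($category) . " ";
echo "ID : " . htmlspecialchars($id);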
This page simply prints the two GET variables passed to it on the webpage.
Open the .htaccess file and write in the below given Rule.
RewriteEngine on
RewriteRule ^articles/(\w+)/([0-9]+)$ /articles.php?category=$1&id=$2
The pattern ^articles/(\w+)/([0-9]+)$ can be broken down as:
^articles/ - checks that the request starts with 'articles/'
(\w+)/ - checks that this part is one or more word characters (letters, digits or underscore) followed by a forward slash. The parentheses are used for extracting the parameter values, which we need for building the actual query string in the substituted URL. Whatever matches the part of the pattern inside a pair of parentheses is stored in a special variable, which can be back-referenced in the substitution as $1, $2 and so on, one for each pair of parentheses counted from left to right.
([0-9]+)$ - checks for one or more digits at the end of the URL.
Try requesting the articles.php file in your test server with the below given url.
yourwebsiteurl/articles/coins/1222
The URL Rewrite rule you have written will kick in and you will see the same result as if the URL requested were:
yourwebsiteurl/articles.php?category=coins&id=1222
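Now you can build on this sample to write more and more complex URL Rewriting rules. By using URL rewriting in the above example we have achieved a search engine and user friendly URL. Because the pattern accepts only word characters for the category and only digits for the id, malformed values in the friendly URL never reach the script, which discourages casual script kiddie tampering. The script should still validate its input, since articles.php can also be requested directly.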
What does the Flags parameter of RewriteRule directive do?
RewriteRule flags give us a way to control how mod_rewrite handles each rule. The flags are written inside a single pair of square brackets, separated by commas, and there are about 15 flags to choose from. They range from those which control the way rules are interpreted to more complex ones, like those which send specific HTTP headers back to the client when the pattern matches.
Let's look at some of the basic flags.
  • [NC] flag (nocase) - This makes mod_rewrite treat the pattern in a case-insensitive manner.
  • [F] flag (forbidden) - This makes Apache send a forbidden HTTP response - status 403 - back to the client.
  • [R] flag (redirect) - This flag makes mod_rewrite use a formal HTTP redirect instead of the internal Apache rewrite. You can use this flag to inform the client about the redirection. It sends a Moved Temporarily - status 302 - by default, but the flag takes an optional parameter which you can use to change the response code. If you wish to send a response code of 301 - Moved Permanently - then this flag can be written as [R=301].
  • [G] flag (gone) - This flag makes Apache respond with an HTTP status 410 - Gone.
  • [L] flag (last) - This makes mod_rewrite stop processing the rules that follow once the current rule has matched and been applied.
  • [N] flag (next) - This flag makes the rewrite engine stop processing and loop back to the start of the rule list. A point to note is that the URL used for pattern matching on the next pass will be the rewritten one. This flag can create an endless loop, so extreme care should be taken while using it.
There are other flags too, but they are too complex to explain within the scope of this article; you can find more information on them in the mod_rewrite manual.
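To make the flags above a little more concrete, here is a small sketch of how some of them might be combined in a .htaccess file. The file names (old-page.html, new-page.html, retired.html) and the .bak extension are made up for illustration. A substitution of - means "do not change the URL", which is the usual companion of [F] and [G].

```apache
RewriteEngine on

# [NC]: match old-page.html regardless of letter case,
# [R=301]: answer with an external "Moved Permanently" redirect,
# [L]: do not process any further rules once this one has been applied.
RewriteRule ^old-page\.html$ /new-page.html [NC,R=301,L]

# [F]: refuse any request for a .bak file with 403 Forbidden.
RewriteRule \.bak$ - [F]

# [G]: report a retired page as 410 Gone.
RewriteRule ^retired\.html$ - [G]
```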
2. The RewriteCond Directive
This directive gives you the additional power of conditional checking on a range of parameters and conditions. When combined with RewriteRule, it lets you rewrite URLs only when certain conditions are met. RewriteCond directives are like the if() statement in your programming language, but here they decide whether a RewriteRule directive's substitution should take place or not. Things like preventing hot linking, or checking whether the client meets certain criteria before rewriting the URL, can be achieved by using this directive.
The general syntax of the RewriteCond is:
RewriteCond string-to-test condition-pattern
The string-to-test part of the RewriteCond has access to a large set of variables, such as HTTP header variables, request variables, server variables and time variables, so you can do a lot of complex conditional checking while writing directives. You can use any of these variables as the string to test by writing it in the %{NAME} format. For example, the HTTP_REFERER variable is written as %{HTTP_REFERER}.
The condition-pattern part can be anything from a simple string to a very complex regular expression.
Let's take a look at an example of conditional rewriting using the RewriteCond directive:
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/4(.*)MSIE
RewriteRule ^index.html$ /index.ie.html [L]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/5(.*)Gecko
RewriteRule ^index.html$ /index.netscape.html [L]
RewriteRule ^index.html$ /index.other.html [L]
This example uses HTTP_USER_AGENT as the test string of the RewriteCond directive. It reads the User-Agent header sent by the visitor's browser, matches it against a set of known values to detect the browser, and serves a different page depending on the result. A RewriteCond applies only to the RewriteRule that immediately follows it. The first RewriteCond checks HTTP_USER_AGENT for a match with the ^Mozilla/4(.*)MSIE pattern. This match will occur when a user visits the page using IE as the browser. The RewriteRule given just under that condition will then kick in and rewrite the URL to serve the index.ie.html page to the IE visitor.
Similarly, the second RewriteCond checks for Mozilla specific browsers, and its RewriteRule substitutes index.netscape.html when a positive match is made on the ^Mozilla/5(.*)Gecko pattern. The third RewriteRule has no condition and is there to catch all other browsers: if both the first and the second RewriteCond fail, this last RewriteRule is the one that applies. A point to note in the above example is the usage of the [L] flag with all the RewriteRule directives. This stops the remaining rules from being applied once one RewriteRule has matched.
Two flags which can be used to further control the way the RewriteCond directive behaves are [NC] - case-insensitive matching - and [OR] - chaining of multiple RewriteCond directives with a logical OR. Without [OR], consecutive RewriteCond directives are combined with a logical AND.
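The hot linking case mentioned earlier, together with the [NC] and [OR] condition flags, can be sketched roughly as follows. The domain example.com stands in for your own site, and the user agent names BadBot and EvilScraper are invented purely for illustration.

```apache
RewriteEngine on

# Hot link protection (sketch).
# Only consider requests that actually send a Referer header ...
RewriteCond %{HTTP_REFERER} !^$
# ... and whose Referer is not one of our own pages (both conditions are ANDed).
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
# Such requests for image files get 403 Forbidden.
RewriteRule \.(gif|jpe?g|png)$ - [F,NC]

# [OR]: refuse the request if either user agent condition matches.
RewriteCond %{HTTP_USER_AGENT} ^BadBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EvilScraper [NC]
RewriteRule ^ - [F]
```

Keep in mind that the Referer and User-Agent headers are supplied by the client and can be forged, so rules like these deter casual abuse rather than determined visitors.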
By using these two directives - RewriteRule and RewriteCond - you can implement a lot of powerful URL Rewriting functionality on your website.
Other mod_rewrite Directives
  1. RewriteBase Directive - This directive can solve the problem of RewriteRule creating non-existent URLs due to a difference between the physical file system structure on the web server and the structure of the website's URLs. Setting this directive as given below can solve the problem: RewriteBase /
  2. RewriteMap Directive - This directive is very powerful, as it allows you to map unique values to a set of replacement values taken from a table and to use the result in the substitution to generate URLs on the fly. This can be especially useful for huge e-commerce or CMS kinds of applications where you need to replace each section name or category name in the URL with a corresponding id taken from a database. RewriteMap has to be declared in the server or virtual host configuration, not in a .htaccess file.
  3. RewriteLog Directive - This directive sets the log file that the mod_rewrite engine will use to log all the actions taken while processing client requests. The syntax is: RewriteLog /path/to/logfile This directive should be defined in the httpd.conf file, as it is applied on a per-server basis.
  4. RewriteLogLevel Directive - This directive tells the mod_rewrite module how much information about its internal processing should be logged while rewriting URLs. It takes values from 0 to 9, where 0 means no logging and 9 means all the information is logged. A higher level of logging can make Apache run slowly, so a level above 2 should be used only for debugging purposes. The syntax is: RewriteLogLevel levelnumber Note that RewriteLog and RewriteLogLevel belong to Apache 2.2 and earlier; Apache 2.4 removed them in favor of the LogLevel directive with a rewrite:traceN setting.
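
As a rough sketch of RewriteMap and of rewrite logging on Apache 2.4, the fragment below belongs in the server or virtual host configuration rather than in .htaccess. The map name categories, the file path, the map contents and the category_id parameter are all assumptions made for illustration.

```apache
# Server or virtual host configuration (sketch; names and paths are assumptions).
# /etc/apache2/category-map.txt holds one "name id" pair per line, for example:
#   stamps 12
#   coins  13
RewriteEngine on
RewriteMap categories "txt:/etc/apache2/category-map.txt"

# Look up the category name captured in $1; use 0 if it is not in the map.
RewriteRule "^/articles/(\w+)/([0-9]+)$" "/articles.php?category_id=${categories:$1|0}&id=$2" [L]

# Apache 2.4 and later: rewrite logging is configured through LogLevel
# (trace1 to trace8) and the messages go to the error log.
LogLevel alert rewrite:trace3
```

Note that in this context the pattern is matched against the full URL path, so it starts with a leading slash, unlike the .htaccess rules shown earlier.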

Conclusion
In this article we have taken only a brief look at the power of the mod_rewrite module. It only scratches the surface, but I hope it is enough to get you started on using this module in your web server environment.

Get 25 Facebook Fans to Register a Short URL For Your Facebook Page

Wednesday, September 12, 2012
For the past two months millions of Facebook users and Page owners have been able to register short URLs, like facebook.com/yourcompany. From the start Facebook has stated that once registered, the URL can't be changed. However, recently the rules have been updated and now users can change their short URLs. This only applies to profiles. Pages have to settle with the URL they have already grabbed.
Currently you need more than 25 fans to qualify for a short URL, but the limit might be raised soon. Originally it was at 1000 fans for several weeks, then it was suddenly lowered to 0, then to 25, and within hours it was raised to 100 fans, where it stayed for two months. In the beginning of September it was lowered to 25 fans, but there's no telling how long this will last. Facebook is still testing the most profitable number, at which Page owners are most willing to pay for ads to get visitors.
So you had better hurry up and register your special URL before the minimum requirement goes back to 100 fans.
How to register a short Facebook URL
Once you have acquired 26 fans, you can register a short URL at facebook.com/username. By default there are some suggestions for a username based on the name of your Page. If you administer many Pages you can choose a username for every one of them. Username is the name Facebook has given to these short URLs; some call them Vanity URLs. To create a username you must use the alphanumeric characters a-z and 0-9 or a period (.). The minimum length is 5 characters. Some words, like screw, are censored and can't be used. You also can't register generic words like "pizza."
You should get a URL that is closely related to your brand name. If you don't, it might be reclaimed later. Some trademark owners have prevented their name from being used. If you manage to get a trademarked URL, it can be taken away at any time by the real owner.
In order to register the URL you need to verify your account with a code that will be sent to your cell phone. If you have many Pages you only need one cell phone and one verification.
If you want to change your already registered URL to something else, then go to Settings... Account Settings... and click to change the username. If a user changes his short URL, the old one will instantly become available for anyone to register. If on the other hand your account is deleted, the old username will not become available. Unlike profiles, Pages can't change their short URLs. Hopefully this will be allowed in the future.
If you are an application developer, you can register a short URL for your application, too. But instead of the URL being apps.Facebook.com/application, it is Facebook.com/application.
If a Page or application profile has many administrators, the first one to register gets to choose it and after that it can't be changed. So make sure your team is on the "same page" with your thoughts.

Free URL Redirection - Features

Tuesday, September 11, 2012
There are a large number of features that a Free Url Redirection service can theoretically provide. In practice, Free Short Url services usually provide only some of them.
I'd like to tell you about the most important features of free url redirection services. The more features from the list below a free url service can provide, the better it is.
- No ads at all is a very important feature, and it means that the Free Url Redirection service provider does not force any ads into users' websites. Don't confuse it with "No banner and popup ads", as there are many other kinds of ads that are neither banners nor pop-ups, e.g. pop-under or exit ads. Be careful here :) Some service providers may place a very small frame at the bottom of the page with a link back to their site. This is perfectly acceptable, as they have to be able to get new members!
- Free Domain Name - it means the free url provided looks very professional, like a real paid domain name.
- Free Url Cloaking, also known as free url masking, is used to mask your real website address with the free short url provided. Your short url will always be shown in the location bar of your website visitors, and nobody will know you are using url redirection.
- Free Email Forwarding means you get a branded email address with your free short url, like webmaster@yourshort.url, and the emails sent to this email address are automatically redirected to the email address you choose, e.g. to your_name@yahoo.com.
- Free Path Forwarding is the feature that allows you to hide that you are using short url service even more. With path forwarding turned ON you can access the files and subdirectories of your website using your free short url, e.g. www.yourshort.url/forums/ will be in the location bar of your website visitors and will actually point to your real, long address.
- Free Subdomains is a feature that allows you to create subdomains like forums.yourshort.url and point them to different websites and/or to different folders of your hosting account (in other words, you are allowed to create subdomain path forwards of your free short url).
- Meta Tags Support. Meta Tags are the tags located in the head of HTML pages. The most important are the Title, Description and Keywords tags, and they are very important for getting your website indexed by search engines.
- Free Website Statistics. Some Free Url Redirection services provide website statistics for free - number of visitors, referrals, webpage hits, etc. In this case you will not have to use a 3rd party statistics service (counter) to monitor your website visitors.
- Dynamic IP support. This feature is useful if you would like to run a webserver from your home computer but, unfortunately, have a dynamic IP address.
- Full DNS Support is a very rare feature among Free Url Redirection service providers, and it means that you are allowed to modify all DNS records (A, NS, MX) for the free short urls provided.
There are also a lot of less important features provided by Free Url Redirection providers, like the ability to use your free short url with and without WWW, support for other less important meta tags, etc., plus some very rare features like POP3 boxes, included guestbooks, chats, etc.