Search Engine Friendly URLs


Introduction to IIS URL Rewriting


ISAPI_Rewrite is a powerful regular-expression-based URL manipulation engine. It works much like Apache’s mod_rewrite, but is designed specifically for Microsoft Internet Information Server and Microsoft Internet Security and Acceleration Server 2004. If you have ever wanted to change your web site’s URL scheme, this product is for you!

Some key benefits of ISAPI_Rewrite:

  • Speed

    ISAPI_Rewrite is an extremely fast and highly scalable solution. It is written in pure C/C++ using only the Win32 API and ISAPI, and it uses an intelligent configuration cache. All work is done in a single stage, with no recursive requests or other operations that may take a long time.

  • Security

    ISAPI_Rewrite is designed to operate in a shared environment and can serve as many sites as you have. ISPs and hosting providers can safely let their users configure ISAPI_Rewrite, knowing that any configuration change affects only that user’s own environment. ISAPI_Rewrite can even solve many security problems: for example, it can block access to certain folders or file extensions, or enforce more complex rules.

  • Power

    The flexibility and power of ISAPI_Rewrite come from its regular-expression nature. With regular expressions you don’t need to write thousands of check strings; URLs can be compared and replaced with a few patterns. As a result, ISAPI_Rewrite can do many things that cannot be done with other solutions available for IIS. See the examples section for more information.

IIS URL Rewrite Main concept

ISAPI_Rewrite provides a rule-based rewriting engine that rewrites requested URLs on the fly. It supports a virtually unlimited number of rules and an unlimited number of conditions attached to each rule, which makes for a flexible and powerful URL manipulation mechanism. (In practice, the configuration file size is capped at 2 MB to limit parsing overhead.) URL manipulations can depend on tests of HTTP headers, server variables, the Request-URI, and the method and version information of a client request.

In most cases ISAPI_Rewrite is used to rewrite the Request-URI (defined in RFC 2616) and common HTTP request headers. For example, if a client requests the resource http://www.somesite.com/path/file.ext?parameter=value, ISAPI_Rewrite operates on the /path/file.ext?parameter=value part. In addition, ISAPI_Rewrite can rewrite, create or remove any other HTTP header of the request. The outcome may be rewriting, proxying, redirecting, or blocking the original request to the server.

The rewriting engine goes through the ruleset rule by rule (RewriteRule and RewriteHeader directives). A rule is applied only if it matches the Request-URI and all of its conditions (RewriteCond directives) match their test strings (headers or server variables). ISAPI_Rewrite uses a MATCH algorithm: a test string is NOT searched for the rule pattern; instead, the whole test string is matched against the pattern. For example, the pattern a*b will not match the string aaaaaaaabbbbbbbb, because the pattern can consume only one b and the remaining characters are left unmatched.
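The difference between whole-string matching and searching can be tried out with any regular expression library. The following is a minimal Python sketch that uses the standard re module as a stand-in; it only illustrates the two modes and says nothing about how ISAPI_Rewrite implements them internally.

```python
import re

pattern = re.compile(r"a*b")
test_string = "aaaaaaaabbbbbbbb"

# Whole-string matching: the pattern must account for every character.
whole = pattern.fullmatch(test_string)

# Searching: the pattern may match any substring of the test string.
found = pattern.search(test_string)

print(whole is None)         # True: the trailing b characters are not covered
print(found is not None)     # True: a search finds the a's followed by one b
print(pattern.fullmatch("aaaaaaaab") is not None)  # True: nothing is left over
```

To accept the longer string under whole-string matching, the pattern itself has to cover the rest of the input, for example a*b+ or a*b.*.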

The result of a successful rule application is saved in the original header and is visible to all subsequent rules. Rule processing stops when a last rule (a redirect, proxy or forbid rule, or a rule marked with the L flag) is matched.
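The processing order described here can be pictured with a short Python sketch. The three rules, their URLs and the apply_rules helper are hypothetical and exist only for illustration; the sketch assumes whole-string matching and models a last rule with a simple boolean.

```python
import re

# Hypothetical rules: (pattern, replacement, is_last).
RULES = [
    (r"/old/(.*)", r"/new/\1", False),
    (r"/new/(.*)\.html", r"/pages.asp?name=\1", True),
    (r"/pages\.asp.*", r"/never-reached", False),
]

def apply_rules(uri, rules=RULES):
    """Apply rules in order; each rule sees the result of the previous ones."""
    for pattern, replacement, is_last in rules:
        match = re.fullmatch(pattern, uri)
        if match is None:
            continue  # rule does not apply, try the next one
        uri = match.expand(replacement)
        if is_last:
            break  # a last rule stops further processing
    return uri

print(apply_rules("/old/about.html"))
print(apply_rules("/other"))
```

The first call is rewritten by the first rule, then by the second rule, which sees the already rewritten URI and stops processing, so the third rule never runs. The second call matches no rule and is returned unchanged.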

Rewriting causes the server to continue processing the request with the new URI, as if the client had originally requested it. The new URI can include a query string section (following a question mark) and may point to any plain file (such as an image), script (such as ASP), program (such as an EXE), etc.

Proxying causes the resulting URI to be treated internally as a target on another server and passed immediately (i.e. rule processing stops here) to the ISAPI extension that handles proxy requests. You have to make sure that the resulting string is a valid URL including protocol, host, etc. Otherwise you will get an error from the proxy.

Redirection sends an immediate response with a redirect instruction (HTTP response code 302 or 301) that has the resulting URI as the new location. You can use an absolute URL (as RFC 2616 requires) in a redirect instruction to send the request to a different host, port or protocol. A redirect instruction always causes the rewriting engine to stop processing the rule sequence.

Rules are processed in the order in which they appear in the configuration file. ISAPI_Rewrite applies server-wide (global) rules first, and then the rules specific to the IIS web site handling the request (if site-level rules are present). Because there are no recursive requests or rollbacks in the processing order (other than explicitly generated loops), request processing will never fall into an infinite loop.

Before modifying the URI, ISAPI_Rewrite saves the original Request-URI in an HTTP header named X-Rewrite-URL. It can then be retrieved in ASP as Request.ServerVariables("HTTP_X_REWRITE_URL").

Multiple RewriteCond directives followed by a RewriteRule (or RewriteProxy) directive affect only that RewriteRule, so those conditions should be considered part of one complex rule.

Whenever you put parentheses in any regular expression that belongs to a complex rule (a rule with conditions), you mark a submatch that can be used in a format string (using $N syntax) or as a back-reference (using \N syntax) in subsequent conditions. These submatches are global for the whole complex rule (the RewriteRule directive and its corresponding RewriteCond directives). Submatches are numbered from top to bottom and from left to right, beginning with the first RewriteCond directive (if one exists) that corresponds to the RewriteRule.
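The numbering scheme can be illustrated with a small Python sketch. The collect_submatches helper, the host name and the patterns are hypothetical; the sketch only models the numbering order described above (conditions first, top to bottom, then the rule, each left to right) and does not model back-references between conditions.

```python
import re

def collect_submatches(conditions, rule_pattern, request_uri):
    """Number submatches across a complex rule, starting from 1.

    conditions is a list of (pattern, test_string) pairs in file order.
    Returns None if any condition or the rule itself does not match.
    """
    submatches = []
    for pattern, test_string in conditions:
        match = re.fullmatch(pattern, test_string)
        if match is None:
            return None
        submatches.extend(match.groups())
    match = re.fullmatch(rule_pattern, request_uri)
    if match is None:
        return None
    submatches.extend(match.groups())
    return {number: value for number, value in enumerate(submatches, start=1)}

# One condition on a hypothetical Host header, then the rule pattern.
numbered = collect_submatches(
    conditions=[(r"([^.]+)\.example\.com", "shop.example.com")],
    rule_pattern=r"/([^/]+)/(.*)",
    request_uri="/items/list.asp",
)
print(numbered)
```

Here the submatch from the condition is number 1, and the two submatches from the rule pattern are numbers 2 and 3.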

To simplify rules and strengthen server security, it is strongly recommended that you disable parent paths in the IIS settings.

IIS URL Rewrite Examples

Note: All rules in these examples are intended for the httpd.conf file. In ISAPI_Rewrite, as in Apache mod_rewrite, the base path for rules depends on the directory where the .htaccess file is placed. The leading slash is present only if you put rules in httpd.conf; in .htaccess files, the virtual path to the file is stripped. Rules that rely on a root path are preceded by a RewriteBase / directive so that they work in any location, both in httpd.conf and in directory-level .htaccess files.

Simple search engine friendly URLs

This example demonstrates how to hide query string parameters using the loop flag. Suppose you have a URL like http://www.myhost.com/foo.asp?a=A&b=B&c=C and you want to access it as http://www.myhost.com/foo.asp/a/A/b/B/c/C

Try the following rule to achieve the desired result:

 

RewriteEngine on
RewriteRule ^(.*?\.asp)/([^/]*)/([^/]*)(/.+)? $1$4?$2=$3 [NC,LP,QSA]
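To see what repeated application does to the example URL, the following Python sketch re-applies the same pattern (with the dot before asp escaped) until it no longer matches. It is an illustration under stated assumptions, not ISAPI_Rewrite's implementation: the pattern is applied to the path only, each new name=value pair is placed in front of the query string accumulated so far, and max_iterations is a hypothetical safety cap.

```python
import re

PATTERN = re.compile(r"^(.*?\.asp)/([^/]*)/([^/]*)(/.+)?", re.IGNORECASE)

def rewrite_with_loop(url, max_iterations=32):
    """Move /name/value path pairs into the query string, one pair per pass."""
    path, _, query = url.partition("?")
    for _ in range(max_iterations):
        match = PATTERN.match(path)
        if match is None:
            break  # nothing left to convert
        base, name, value = match.group(1), match.group(2), match.group(3)
        rest = match.group(4) or ""
        path = base + rest  # corresponds to $1$4
        pair = f"{name}={value}"  # corresponds to $2=$3
        # Assumption: the existing query string is appended after the new pair.
        query = f"{pair}&{query}" if query else pair
    return f"{path}?{query}" if query else path

print(rewrite_with_loop("/foo.asp/a/A/b/B/c/C"))
```

Each pass peels off the first pair after the script name, so three passes are needed for this URL. Under the ordering assumption above, the parameters end up in reverse order, which normally does not matter to the script that reads them.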

 

Note that this rule may break page-relative links to CSS files, images, etc. This happens because the base path (the parent folder of the page), which the browser uses to compute complete resource URIs, changes. The problem occurs only if you use the directory separator as the replacement character. There are three possible solutions:

  1. Use the rule given below. It does not affect the base path because it does not use the directory separator character ‘/’.
  2. Directly specify correct base path for a page with the help of <base href="/folder/"> tag.
  3. Change all page-relative links to either root-relative or absolute form.

Many variations of this rule exist, with different separator characters and file extensions. For example, to use URLs like http://www.myhost.com/foo.asp~a~A~b~B~c~C the following rule can be used:

 

RewriteEngine on
RewriteRule ^(.*?\.asp)~([^~]*)~([^~]*)(.*) $1$4?$2=$3 [NC,LP,QSA]
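The effect of the separator on page-relative links can be checked with Python's standard urllib.parse.urljoin, which resolves a relative reference against a base URL in the same way a browser does. The relative link style.css is a hypothetical example; the page URLs are the ones used in this section.

```python
from urllib.parse import urljoin

link = "style.css"  # hypothetical page-relative link inside foo.asp

# Original URL with a query string: the base path is the site root.
original = urljoin("http://www.myhost.com/foo.asp?a=A&b=B&c=C", link)

# Slash-separated URL: every extra segment becomes part of the base path.
slashes = urljoin("http://www.myhost.com/foo.asp/a/A/b/B/c/C", link)

# Tilde-separated URL: no extra path segments, so the base path is unchanged.
tildes = urljoin("http://www.myhost.com/foo.asp~a~A~b~B~c~C", link)

print(original)
print(slashes)
print(tildes)
```

With the slash form the stylesheet is requested from a folder that does not exist, while the tilde form resolves to the same location as the original URL.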

 

Keyword rich URLs

In the previous example we used a general technique that simply hides query string markers. For search engine optimization, however, it is much more useful to make your URLs keyword rich. Consider the following URL: http://www.mysite.com/productpage.asp?productID=127 This is a very common situation for most web sites. You can improve the ranking of your page in search engines by using the following URL format instead: http://www.mysite.com/products/our_super_tool.asp The keywords “our super tool” in this URL will be indexed and can improve the page rank. But “our_super_tool” cannot be used to retrieve productID=127 directly. Several solutions to this problem exist.

The first solution, which we recommend if you have a short URL format with only a few parameters, is to include both the keywords and the numeric identifier in the URL. In this case your URL may look like this: http://www.mysite.com/products/our_super_tool_127.asp Only one rule is needed to achieve this rewrite:

 

RewriteEngine on
RewriteBase /
RewriteRule ^products/[^?/]*_(\d+)\.asp /productpage.asp?productID=$1 [QSA]
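The pattern can be tried out in Python before it goes into a configuration file. This is a minimal sketch: it assumes the pattern is written with \d+ for the digits and an escaped dot, that the path is given relative to the base (no leading slash), and rewrite_product is a hypothetical helper name.

```python
import re

PRODUCT_RULE = re.compile(r"^products/[^?/]*_(\d+)\.asp")

def rewrite_product(path):
    """Map products/<keywords>_<id>.asp to the real product page."""
    match = PRODUCT_RULE.match(path)
    if match is None:
        return path  # the rule does not apply
    return f"/productpage.asp?productID={match.group(1)}"

print(rewrite_product("products/our_super_tool_127.asp"))
print(rewrite_product("products/our_super_tool.asp"))
```

Only the digits after the last underscore are captured, so the keyword part can change freely without affecting which product is shown. A URL without a numeric suffix is left untouched.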

 

Another, more complex solution is to create a one-to-one map file and use it to map “our_super_tool” to 127. This solution is useful for long URLs with many parameters, and it lets you hide even the numeric identifier. The URL will look like http://www.mysite.com/products/our_super_tool.asp. Please note that the “our_super_tool” part must uniquely identify the product and its identifier. Here is an example of this solution:

 

RewriteEngine on
RewriteBase /
RewriteMap mapfile txt:mapfile.txt
RewriteRule ^products/([^?/]+)\.asp /productpage.asp?productID=${mapfile:$1}

 

You will also need to create the mapfile.txt map file with the following content:

 

one_product       1
another_product   2
our_super_tool    127
more_products     335
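The lookup performed by the map can be sketched in Python. This is an illustration only: load_map and rewrite_with_map are hypothetical helpers, the map content is the one shown above, and returning an empty identifier for an unknown key is an assumption modeled on the behavior described for Apache text maps without a default value.

```python
import re

MAP_TEXT = """\
one_product       1
another_product   2
our_super_tool    127
more_products     335
"""

def load_map(text):
    """Parse whitespace-separated key/value lines into a dictionary."""
    mapping = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            mapping[parts[0]] = parts[1]
    return mapping

RULE = re.compile(r"^products/([^?/]+)\.asp")

def rewrite_with_map(path, mapping):
    """Replace the keyword part of the URL with the mapped product ID."""
    match = RULE.match(path)
    if match is None:
        return path
    # Assumption: an unknown key yields an empty identifier.
    return "/productpage.asp?productID=" + mapping.get(match.group(1), "")

products = load_map(MAP_TEXT)
print(rewrite_with_map("products/our_super_tool.asp", products))
```

Because the key is the whole keyword part of the URL, each key in the map has to be unique, which matches the requirement stated above.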

 

The advantage of this method is that you can use it to combine quite complex URL transformations, but that is beyond the scope of this short examples guide.

 
