
Module mod_rewrite Tutorial (Part 3)


Module mod_rewrite Tutorial (Part 3): Rewriting URLs by Dirk Brockhausen

In the two preceding parts of this tutorial we explained the basics of Rules and Conditions.

We will now follow up with two examples to illustrate their use for somewhat more complex applications.

The first example deals with dynamically generated pages, while the second covers requests for ".txt" files.

For our first example, let's assume that you want to sell several items of merchandise on your web site.

Your clients are guided to various detailed product descriptions via a script:

http://www.yoursite.com/cgi-bin/shop.cgi?product1
http://www.yoursite.com/cgi-bin/shop.cgi?product2
http://www.yoursite.com/cgi-bin/shop.cgi?product3

These URLs are included as links on your site.

If you want to submit these dynamic pages to the search engines, you are confronted with the problem that most of them will not accept URLs containing the "?" character.

However, it would be perfectly possible to submit a URL of the following format:

http://www.yoursite.com/cgi-bin/shop.cgi/product1

Here, the "?" character has been replaced by "/".

Yet more pleasing to the eye would be a URL of this type:

http://www.yoursite.com/shop/product1

To the search engine, this appears to be just another acceptable hyperlink, with "shop" representing a directory containing the files "product1", "product2", etc.

If a visitor clicks this link on a search engine's results page, this URL must be reconverted to make sure that "shop.cgi?product1" will actually be called.

To this effect we will make use of mod_rewrite with the following entries:

RewriteEngine on
Options +FollowSymlinks
RewriteBase /
RewriteRule ^(.*)shop/(.*)$ $1cgi-bin/shop.cgi?$2

The variables $1 and $2 are so-called "backreferences". Each one refers to a group of text enclosed in parentheses in the search pattern.

In the clicked URL, everything located before "shop/" is stored in $1, and everything following "shop/" is stored in $2.
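To see what the two backreferences capture, the rule can be imitated outside Apache. The following Python sketch is only an approximation: it uses Python's "re" module rather than mod_rewrite itself, and the helper name rewrite_shop_url is made up for illustration. It assumes the path is given the way a rule in ".htaccess" sees it, i.e. without the leading slash.

```python
import re

# Same pattern as in the RewriteRule: group 1 is everything before "shop/",
# group 2 is everything after it.
PATTERN = re.compile(r"^(.*)shop/(.*)$")


def rewrite_shop_url(path):
    """Return the rewritten path, or None if the rule would not apply.

    Hypothetical helper: mimics the substitution $1cgi-bin/shop.cgi?$2.
    """
    match = PATTERN.match(path)
    if match is None:
        return None
    before, after = match.group(1), match.group(2)
    return f"{before}cgi-bin/shop.cgi?{after}"


if __name__ == "__main__":
    for path in ("shop/product1", "shop/product2", "index.html"):
        print(path, "->", rewrite_shop_url(path))
```

For "shop/product1" the first group is empty and the second is "product1", so the sketch returns "cgi-bin/shop.cgi?product1". A path that does not contain "shop/" is left alone.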

Up to this point our given examples made use of rules such as this one:

RewriteRule ^\.htaccess$ - [F]

However, we did not yet achieve a true rewrite in the sense that one URL would be switched to another.

For the entry in our current example:

RewriteRule ^(.*)shop/(.*)$ $1cgi-bin/shop.cgi?$2

this general syntax applies:

RewriteRule currentURL rewrittenURL

As you can see, this command executes a real rewrite.

In addition to installing the ".htaccess" file, all links in your normal HTML pages which follow the format "cgi-bin/shop.cgi?product" must be changed to: "shop/product" (without the quotes).
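If a site has many pages, this link conversion could be automated. Below is a minimal Python sketch, assuming the links appear literally as "cgi-bin/shop.cgi?NAME" in the HTML and that product names consist only of letters, digits, underscores, and hyphens. The function name convert_links is hypothetical.

```python
import re

# Matches the old dynamic link form and captures the product name.
# Assumption: product names contain only letters, digits, "_" and "-".
OLD_LINK = re.compile(r"cgi-bin/shop\.cgi\?([\w-]+)")


def convert_links(html):
    """Replace every "cgi-bin/shop.cgi?NAME" with "shop/NAME"."""
    return OLD_LINK.sub(r"shop/\1", html)


if __name__ == "__main__":
    page = '<a href="http://www.yoursite.com/cgi-bin/shop.cgi?product1">Product 1</a>'
    print(convert_links(page))
```

Run such a conversion on copies of your pages first, since a pattern this simple will also rewrite any occurrence of the old link form outside of hyperlinks.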

When a spider visits a normal HTML page of this kind it will also follow or crawl the product links because there is no question mark contained in the link anymore to prevent it from doing so.

So employing this method you can convert dynamically generated product descriptions into seemingly static web pages and feed them to the search engines.

-

In our second example we will discuss how to redirect calls for ".txt" files to a program script.

Many webspace providers running Apache offer system log files only in common log format. This means that these logs do not record visitors' Referrers and UserAgents.

However, for "robots.txt" requests it is preferable to have access to this information in order to learn more about visiting spiders than merely their IP addresses.

To effect this, the entries in ".htaccess" should be as follows:

RewriteEngine on
Options +FollowSymlinks
RewriteBase /
RewriteRule ^robots\.txt$ /text.cgi?%{REQUEST_URI}

Now, when "robots.txt" is requested, the Rule will hand the request over to the program script "text.cgi". This happens internally on the server, so the visitor does not notice any redirection.

Furthermore, a variable is passed to the script as its query string, to be processed by the program.

"REQUEST_URI" holds the path of the file that was requested. In our example this is "/robots.txt".

The script will now read the contents of "robots.txt" and will forward them to the web browser or the search engine spider.

Finally, the visitor's hit is recorded in the log file. To this effect, the script reads the environment variables "$ENV{'HTTP_USER_AGENT'}" etc., which provide the required information.
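The rewrite step described above can be imitated in a few lines to show what the script receives as its query string. This is a rough Python sketch, not Apache's own logic: the helper name rewrite_robots is made up, and it assumes the rule lives in the ".htaccess" file of the DocumentRoot, where the pattern is tested against the path without its leading slash.

```python
import re

# Pattern from the rule, as matched in ".htaccess" (no leading slash).
PATTERN = re.compile(r"^robots\.txt$")


def rewrite_robots(request_uri):
    """Return (script_path, query_string), or None if the rule does not match.

    Hypothetical helper imitating a substitution of /text.cgi?%{REQUEST_URI}.
    """
    # Strip the single leading slash, as happens in per-directory context.
    local_path = request_uri[1:] if request_uri.startswith("/") else request_uri
    if PATTERN.match(local_path) is None:
        return None
    # REQUEST_URI keeps its leading slash when used in the substitution.
    return "/text.cgi", request_uri


if __name__ == "__main__":
    print(rewrite_robots("/robots.txt"))
    print(rewrite_robots("/index.html"))
```

Under these assumptions the script sees "/robots.txt" in QUERY_STRING, which is the value it writes to the log as the document name.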

Here is the source code for the CGI script mentioned above:

<BEGIN SOURCE CODE>
#!/usr/bin/perl
# If required, adjust line above to point to Perl 5.
#####################################
# (c) Copyright 2000 by fantomaster.com #
# All rights reserved. #
#####################################

# Directory and file name of the log, relative to the script.
$stats_dir = "stats";
$log_file = "stats.log";

# Visitor details supplied by the web server.
$remote_host = "$ENV{'REMOTE_HOST'}";
$remote_addr = "$ENV{'REMOTE_ADDR'}";
$user_agent = "$ENV{'HTTP_USER_AGENT'}";
$referer = "$ENV{'HTTP_REFERER'}";
$document_name = "$ENV{'QUERY_STRING'}";

# Read the real robots.txt so that it can be passed through unchanged.
open (FILE, "robots.txt");
@TEXT = <FILE>;
close (FILE);

&get_date;

# One log line per hit.
&log_hits ("$date $remote_host $remote_addr $user_agent $referer $document_name\n");

# The header must be followed by an empty line.
print "Content-type: text/plain\n\n";
print @TEXT;

exit;

# Builds a time stamp of the form "YYYY-MM-DD, hh:mm:ss" in $date.
sub get_date {
  ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst)=localtime();
  $mon++;
  $sec = sprintf ("%02d", $sec);
  $min = sprintf ("%02d", $min);
  $hour = sprintf ("%02d", $hour);
  $mday = sprintf ("%02d", $mday);
  $mon = sprintf ("%02d", $mon);
  # Take the four-digit year from the textual date.
  $year = scalar localtime;
  $year =~ s/.*?(\d{4})/$1/;
  $date="$year-$mon-$mday, $hour:$min:$sec";
}

# Appends its arguments to the log file.
sub log_hits {
  open (HITS, ">>$stats_dir/$log_file");
  print HITS @_;
  close (HITS);
}

<END SOURCE CODE>

To install the script, upload it by FTP to your web site's main (DocumentRoot) directory and change its file permissions to 755.

Next, create the directory "stats".

A more detailed description of how to install a script can be found in our online manuals, e.g. here:

< http://www.fantomaster.com/fantomasSuite/logFrog/lfhelp.txt >

If your server's configuration does not permit execution of Perl or CGI scripts in the main directory (DocumentRoot), you may wish to try the following RewriteRule instead:

RewriteRule ^robots\.txt$ /cgi-bin/text.cgi?%{REQUEST_URI}

Note, however, that in this case you will have to modify the paths accordingly in the program script!

Finally, here's the solution to our quiz from the previous issue of fantomNews:

RewriteCond %{REMOTE_ADDR} ^216.32.64
RewriteRule ^.*$ - [F]

Quiz question: - If we don't write "^216.32.64." for a regular expression in the configuration above, but "^216.32.64" instead, will we get the identical effect, i.e. will this exclude the same IPs?

The regular expression ^216.32.64 will apply e.g. to the following strings:

216.32.64
216.32.640
216.32.641
216.32.64a
216.32.64abc
216.32.64.12
216.32.642.12

Hence, "4" may be followed by any character string.

However, IP addresses can only have the maximal value 255.255.255.255 - which implies that e.g. 216.32.642.12 is not a valid IP. The only valid IP in the list above is 216.32.64.12!

Although the two regular expressions "^216.32.64." and "^216.32.64" allow for different strings, due to the technical limitation of IP addresses to 0-255 this range of IPs will remain excluded.
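The argument can be checked mechanically. The Python sketch below is an illustration only: it uses Python's "re" and "ipaddress" modules instead of Apache, the function names are made up, and the dots are left unescaped exactly as written above, so each dot matches any character. It filters the example strings down to those that a pattern matches and that are also well-formed IP addresses.

```python
import ipaddress
import re

SHORT = re.compile(r"^216.32.64")   # pattern without the trailing dot
LONG = re.compile(r"^216.32.64.")   # pattern with the trailing dot

# The example strings from the list above.
CANDIDATES = [
    "216.32.64", "216.32.640", "216.32.641", "216.32.64a",
    "216.32.64abc", "216.32.64.12", "216.32.642.12",
]


def is_valid_ipv4(text):
    """True if text is a well-formed IPv4 address (four parts, each 0-255)."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False


def blocked_valid_ips(pattern):
    """Candidates that the pattern matches and that are real IPv4 addresses."""
    return [c for c in CANDIDATES if pattern.match(c) and is_valid_ipv4(c)]


if __name__ == "__main__":
    print("short pattern:", blocked_valid_ips(SHORT))
    print("long pattern: ", blocked_valid_ips(LONG))
```

Both calls return the same single entry, "216.32.64.12", even though the shorter pattern matches all seven strings and the longer one matches six of them.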

(to be continued ...)

Dirk Brockhausen is the co-founder and principal of fantomaster.com Ltd. (UK) and fantomaster.com GmbH (Belgium), a company specializing in webmasters software development, industrial-strength cloaking and search engine positioning services. He holds a doctorate in physics and has worked as an SAP consultant and software developer since 1994. He is also Technical Editor of fantomNews, a free newsletter focusing on search engine optimization, available at: < http://fantomaster.com/fantomnews-sub.html > You can contact him at mailto:fntecheditor@fantomaster.com (c) copyright 2000 by fantomaster.com
