
.htaccess redirect rule for clean URLs


Josh

I want this URL:

https://www.leadwerks.com/learn/API-Reference/Object/Entity/SetPosition

 

To go to this location:

https://www.leadwerks.com/learn?page=API-Reference_Object_Entity_SetPosition

 

Any idea what the .htaccess rewrite rule to do this would be?

My job is to make tools you love, with the features you want, and performance you can't live without.


  • 3 weeks later...

Yeah, they look like gibberish to me.

 

I assume you were never a UNIX hacker. And yes, regular expressions are basically write-only... :-)

 

 

I'm not familiar with configuring Apache servers, but I may be able to shed some light on the gibberish. (I own a copy of the book "Mastering Regular Expressions"!)

 

I'm guessing (from reading a bit of the classic obfuscation that is the Apache documentation) that you want something like the following:

 

RewriteEngine On

RewriteBase "/learn/"

RewriteRule "^/?learn/([^/]+)/([^/]+)/([^/]+)/([^/]+)$" "/learn?page=$1_$2_$3_$4"

 

Let's step through this gibberish:

 

"^": This is an "anchor" that matches the beginning of the string being tested (here, the URL path). It makes sure we match "<pattern>", but not "yadayadayada<pattern>".

 
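
"/?": Match a "/" optionally. This handles an inconsistency in the webserver, where the path to be matched sometimes starts with a slash and sometimes does not. It depends on where you place the rule: in server or virtual-host context the path starts with a slash, while in .htaccess or <Directory> context the leading directory prefix (including that slash) is stripped before matching.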

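If you want to convince yourself that the optional slash does what I describe, here is a small sketch. It uses Python's re module purely as a stand-in for the PCRE engine in Apache (an assumption of convenience, the two behave the same for this pattern), and the two sample paths are just the URL from the question with and without the leading slash.

```python
import re

# Same pattern as in the RewriteRule; Python's re is only a stand-in
# for Apache's PCRE engine here.
PATTERN = re.compile(r"^/?learn/([^/]+)/([^/]+)/([^/]+)/([^/]+)$")

# Path as seen in server/virtual-host context (leading slash present).
with_slash = "/learn/API-Reference/Object/Entity/SetPosition"
# Path as seen from an .htaccess in the document root (slash stripped).
without_slash = "learn/API-Reference/Object/Entity/SetPosition"

matches_with_slash = PATTERN.match(with_slash) is not None
matches_without_slash = PATTERN.match(without_slash) is not None

# The "^" anchor still rejects anything in front of the pattern.
matches_with_prefix = PATTERN.match("yadayada/learn/a/b/c/d") is not None

print(matches_with_slash, matches_without_slash, matches_with_prefix)
```

Both sample paths match, while the one with junk in front does not.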
 

"learn/": match the characters "learn/"

 

"([^/]+)": The "[]" represents a "character class", meaning "match any one of these characters". When the character class starts with a "^", the meaning is reversed to "any character except those listed".

The "+" means one or more of whatever is specified to the left of the "+".

Parentheses are used for "capturing", meaning that whatever is matched is available in a variable $<n>. So here $1 would contain "API-Reference".

So this whole group means: capture all characters up to the next "/", and require at least one.
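A quick way to see the capture group in isolation, again sketched with Python's re module as a stand-in for PCRE (my choice for illustration, not something Apache uses). The input is just the tail of the URL from the question.

```python
import re

# One capturing group: one or more characters that are not "/".
SEGMENT = re.compile(r"([^/]+)")

tail = "API-Reference/Object/Entity/SetPosition"
first_segment = SEGMENT.match(tail).group(1)

# "+" requires at least one character, so a string that starts with "/"
# gives no match at position 0.
empty_case = SEGMENT.match("/Object")

print(first_segment)
print(empty_case)
```

The group stops at the first "/", so first_segment holds "API-Reference", and the second call finds nothing to capture.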

 

 

"/": Match a "/"

 

"$": This is an "anchor" that matches the end of the string. (Thus, if the URL still contains more characters after the fourth segment, this expression will not match.)

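To illustrate the effect of the end anchor, here is a small sketch (Python's re standing in for PCRE, with made-up short paths like "learn/a/b/c/d" as test input). Only a path with exactly four segments after "learn/" gets through.

```python
import re

PATTERN = re.compile(r"^/?learn/([^/]+)/([^/]+)/([^/]+)/([^/]+)$")

def is_match(path: str) -> bool:
    """Return True if the whole path fits the four-segment pattern."""
    return PATTERN.match(path) is not None

four_segments = is_match("learn/a/b/c/d")      # exactly four segments
five_segments = is_match("learn/a/b/c/d/e")    # one segment too many
three_segments = is_match("learn/a/b/c")       # one segment too few
trailing_slash = is_match("learn/a/b/c/d/")    # extra "/" before the end

print(four_segments, five_segments, three_segments, trailing_slash)
```

Note that this also means URLs with fewer or more than four levels are not rewritten by this rule, so they would need rules of their own.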
 

 

 

The second part of the RewriteRule specifies the replacement, using the variables $<n> containing the strings that were captured in the match.
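The whole match-and-replace step can be simulated outside Apache. This is only a sketch: the rewrite() helper is a hypothetical name of mine, Python's re stands in for PCRE, and \1..\4 play the role of $1..$4.

```python
import re

PATTERN = re.compile(r"^/?learn/([^/]+)/([^/]+)/([^/]+)/([^/]+)$")

def rewrite(path: str) -> str:
    """Mimic the RewriteRule: substitute if the pattern matches,
    otherwise return the path unchanged."""
    # \1..\4 here correspond to $1..$4 in the Apache substitution string.
    return PATTERN.sub(r"/learn?page=\1_\2_\3_\4", path)

rewritten = rewrite("/learn/API-Reference/Object/Entity/SetPosition")
untouched = rewrite("/forum/index.php")

print(rewritten)
print(untouched)
```

The first call produces /learn?page=API-Reference_Object_Entity_SetPosition, which is the target from the original question, and the second path is left alone because the pattern does not match.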

 

 

A final note: When doing web searches on the topic of RewriteRule you will find stuff like "(.*)/(.*)/(.*)/(.*)". Technically this works, but it sure looks NOOB-ish to me. The "." matches any character and "*" matches zero or more of whatever is to its left, so the first "(.*)" will match the rest of the line (as the PCRE quantifiers are "greedy") and then the matcher has to start backtracking. After backing up one character it will fail again trying to match the rest of the pattern, so it backs up yet another character, and then... Well, you probably see my point. CPU cycles matter! :-)

(Capturing zero characters also makes no sense in this case, and "." happily matches "/" as well, so the groups can swallow extra path levels. But these are small points.)
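To make the difference concrete, here is a small comparison of the two styles. It is a sketch using Python's re as a stand-in for PCRE, and the names GREEDY and STRICT as well as the test strings are mine.

```python
import re

GREEDY = re.compile(r"^(.*)/(.*)/(.*)/(.*)$")
STRICT = re.compile(r"^([^/]+)/([^/]+)/([^/]+)/([^/]+)$")

five_levels = "a/b/c/d/e"

# ".*" also matches "/", so after backtracking the first group ends up
# holding two path levels.
greedy_groups = GREEDY.match(five_levels).groups()

# "[^/]+" can never cross a "/", so five levels simply do not match.
strict_match = STRICT.match(five_levels)

# "*" allows zero characters, so empty segments are accepted too.
empty_groups = GREEDY.match("a///").groups()
strict_empty = STRICT.match("a///")

print(greedy_groups)
print(strict_match)
print(empty_groups)
print(strict_empty)
```

So apart from the extra backtracking, the ".*" version quietly accepts inputs (extra levels, empty segments) that you most likely did not intend to rewrite.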

 

Here is a short introduction/reference of the more helpful kind: https://httpd.apache.org/docs/current/rewrite/intro.html#regex
