This is my pattern:
^(\w{2}-\w{2})/questions(?:/(\w+))?(?:/(\d+))?(?:/.*?)?$
These are the URLs I'm testing:
en-us/questions/ask
en-us/questions/newest/15
en-us/questions/12/just-a-text-to-be-ignored
It works perfectly, here is the demo:
https://regex101.com/r/yC3tI8/1
but the following rewrite rule:
<rule name="en-us questions" enabled="true" stopProcessing="true">
<match url="^(\w{2}-\w{2})/questions(?:/(\w+))?(?:/(\d+))?(?:/.*?)?$" />
<action type="Rewrite" url="/questions.aspx?lang={R:1}&amp;tab={R:2}&amp;pid={R:3}" />
</rule>
When I request en-us/questions/newest, it is rewritten to: /questions.aspx?lang=en-us&tab=&pid=
What is wrong with this? I have been reviewing the same things for about 5 hours now.
Since you have three different possible URL endings that ultimately affect the rewritten URL, you can either set up one all-inclusive rule that will hopefully match everything you want, or set up three rules to handle each case separately:
One Rule:
^(\w{2}-\w{2})/questions/(\w+)/?(\d+)?.*$
https://regex101.com/r/dN8bM9/1 - tries to handle all cases
<rule name="en-us questions" enabled="true" stopProcessing="true">
<match url="^(\w{2}-\w{2})/questions/(\w+)/?(\d+)?.*$" />
<action type="Rewrite" url="/questions.aspx?lang={R:1}&amp;tab={R:2}&amp;pid={R:3}" />
</rule>
* Note: one possible reason the original pattern failed to capture the second group is its use of (?:...), which groups and matches without capturing; leaving that out might solve most of the issues there.
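To see what the single-rule pattern captures for the three test URLs from the question, here is a minimal sketch using Python's `re` module. Python is only a stand-in here: IIS uses its own ECMAScript-style engine, so treat the result as an approximation. The helper name `captures` is made up for this example.

```python
import re

# The single-rule pattern from above, tried with Python's re module
# (assumption: behaviour is close enough to the IIS engine for these inputs).
ONE_RULE = re.compile(r"^(\w{2}-\w{2})/questions/(\w+)/?(\d+)?.*$")


def captures(path):
    """Return the (lang, tab, pid) groups for a matching path, else None."""
    m = ONE_RULE.match(path)
    return m.groups() if m else None


for path in (
    "en-us/questions/ask",
    "en-us/questions/newest/15",
    "en-us/questions/12/just-a-text-to-be-ignored",
):
    # An unmatched optional group shows up as None in Python.
    print(path, "->", captures(path))
```

In this sketch the third URL puts 12 into the second group (the one mapped to tab) and leaves the third group empty, which is one reason the separate rules may be easier to reason about.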
Three Rules:
^(\w{2}-\w{2})/questions/(\w+)$
https://regex101.com/r/lI8bQ1/1 - en-us/questions/[single word]
^(\w{2}-\w{2})/questions/(\d+)/.*$
https://regex101.com/r/hV5fK3/1 - en-us/questions/[digits]/discard
^(\w{2}-\w{2})/questions/(\w+)/(\d+)$
https://regex101.com/r/kO0dJ0/1 - en-us/questions/[single word]/[digits]
Putting it all together into a ruleset:
<rule name="en-us questions case one" enabled="true" stopProcessing="true">
<match url="^(\w{2}-\w{2})/questions/(\w+)$" />
<action type="Rewrite" url="/questions.aspx?lang={R:1}&amp;tab={R:2}" />
</rule>
<rule name="en-us questions case two" enabled="true" stopProcessing="true">
<match url="^(\w{2}-\w{2})/questions/(\d+)/.*$" />
<action type="Rewrite" url="/questions.aspx?lang={R:1}&amp;pid={R:2}" />
</rule>
<rule name="en-us questions case three" enabled="true" stopProcessing="true">
<match url="^(\w{2}-\w{2})/questions/(\w+)/(\d+)$" />
<action type="Rewrite" url="/questions.aspx?lang={R:1}&amp;tab={R:2}&amp;pid={R:3}" />
</rule>
* Note: you might need to adjust this in some way, but it should give you an idea of how to accommodate the three different variations (as you seem to have) when rewriting your URLs.
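The three-rule idea can be checked outside IIS with a rough sketch like the one below. It uses Python's `re` as a stand-in for the IIS engine, tries the patterns in order, and stops at the first match, which is roughly what stopProcessing="true" does. The `rewrite` function is hypothetical, and the second rule is assumed to feed its digits into pid.

```python
import re

# (pattern, target template) pairs mirroring the three rules above.
# They are tried in order and the first match wins.
RULES = [
    (re.compile(r"^(\w{2}-\w{2})/questions/(\w+)$"),
     "/questions.aspx?lang={0}&tab={1}"),
    # Assumption: the digits captured by case two are meant for pid.
    (re.compile(r"^(\w{2}-\w{2})/questions/(\d+)/.*$"),
     "/questions.aspx?lang={0}&pid={1}"),
    (re.compile(r"^(\w{2}-\w{2})/questions/(\w+)/(\d+)$"),
     "/questions.aspx?lang={0}&tab={1}&pid={2}"),
]


def rewrite(path):
    """Return the rewritten URL for the first matching rule, else None."""
    for pattern, template in RULES:
        m = pattern.match(path)
        if m:
            return template.format(*m.groups())
    return None


for path in (
    "en-us/questions/ask",
    "en-us/questions/12/just-a-text-to-be-ignored",
    "en-us/questions/newest/15",
):
    print(path, "->", rewrite(path))
```

Order matters in this sketch: a path such as en-us/questions/12 (digits with nothing after them) is caught by the first rule and ends up in tab.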
Note you have three lazy captures:
(?:/(\w+))?
(?:/(\d+))?
(?:/.*?)?
ASP.NET's regex implementation interprets ? as:
In addition to specifying that a given pattern may occur exactly 0 or 1 time, the ? character also forces a pattern or subpattern to match the minimal number of characters when it might match several in an input string.
So ASP.NET assigns no characters to the first, no characters to the second, and collects the rest of the characters in the third.
To use greedy matching instead of the lazy matching that ? forces, use {0,1}.
So your regex should look like:
^(\w{2}-\w{2})/questions(?:/(\w+)){0,1}(?:/(\d+)){0,1}(?:/.*?)?$
I have the following URL rewrite rule configured in my web.config file:
<rule name="Test Rule" stopProcessing="true">
<match url="^$" />
<conditions>
<add input="{QUERY_STRING}" pattern="^(item=1|all|none|first).*$" />
</conditions>
<action type="Rewrite" url="/newsite/test.asp?{C:0}" />
</rule>
The following source url matches as expected.
http://domainname?item=1&param1=x&param2=y
However the rewritten url includes param1 and param2 in its query string. Shouldn't {C:0} only translate to what is matched in the regex group i.e. (item=1|all|none|first) and not anything that comes after it?
The rewrite includes everything after the ( ) because {C:0} holds the entire match, not just the group. Since you only want what is within the ( ), which is capture group 1, you would use:
{C:1}
Another example might be:
^(item=1|all|none|first)/(abc).*$
So to capture only abc for example (capture group 2):
{C:2}
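The difference between the whole match and a single group can be reproduced with any regex engine. The sketch below uses Python's `re` purely as an illustration (IIS has its own engine), with the condition pattern and query string from the question; the variable names are made up.

```python
import re

# Condition pattern from the question, applied to the query string.
condition = re.compile(r"^(item=1|all|none|first).*$")
m = condition.match("item=1&param1=x&param2=y")

whole_match = m.group(0)  # counterpart of {C:0}: everything the pattern matched
first_group = m.group(1)  # counterpart of {C:1}: only the parenthesised part

print(whole_match)  # item=1&param1=x&param2=y
print(first_group)  # item=1
```

Because the pattern ends in .*$, the whole match runs to the end of the query string, which is why param1 and param2 show up when {C:0} is used.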
I've not been able to understand the purpose of {R:N}. Could anyone please clarify when to use
{R:0} vs. {R:1}
usage example:
<action type="Redirect" url="http://www.{HTTP_HOST}/{R:0}" />
I've seen ScottGu using {R:1}
http://weblogs.asp.net/scottgu/archive/2010/04/20/tip-trick-fix-common-seo-problems-using-the-url-rewrite-extension.aspx
Whereas, below has {R:0}
http://weblogs.asp.net/owscott/archive/2009/11/27/iis-url-rewrite-rewriting-non-www-to-www.aspx
I had a look at the IIS documentation linked below but could not quite digest this definition:
Back-references to condition patterns are identified by {C:N} where N is from 0 to 9; back-references to rule pattern are identified by {R:N} where N is from 0 to 9. Note that for both types of back-references, {R:0} and {C:0}, will contain the matched string
As per the documentation:
When an ECMAScript pattern syntax is used, a back-reference can be
created by putting parenthesis around the part of the pattern that
must capture the back-reference.
So taking the example that follows in the documentation:
^(www\.)(.*)$
And using the input string www.foo.com in the conditions, you will have:
{C:0} - www.foo.com
{C:1} - www.
{C:2} - foo.com
To make it simple:
{R:x} is used as back reference from the rule pattern (<match url="...">).
{C:x} is used as back reference from the condition pattern (<conditions><add input="{HTTP_HOST}" pattern="..."></conditions>)
The 0 reference contains the whole matched string
The 1 reference contains the part of the string matched by the first pair of parentheses (), the 2 reference the second one, and so on up to reference number 9
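As a rough illustration of how the two families of back-references feed into an action URL, the sketch below fills {R:n} from one match and {C:n} from another. It is written in Python and is not how IIS implements it; the `expand` helper, the some/page path, and the target template are all invented for the example.

```python
import re


def expand(template, rule_match, cond_match):
    """Replace {R:n} with groups of the rule match and {C:n} with groups
    of the condition match (hypothetical helper, illustration only)."""
    def substitute(ref):
        source = rule_match if ref.group(1) == "R" else cond_match
        # An optional group that did not participate becomes an empty string.
        return source.group(int(ref.group(2))) or ""
    return re.sub(r"\{([RC]):(\d)\}", substitute, template)


# Rule pattern matched against the URL path (what <match url="..."> sees).
rule_match = re.match(r"(.*)", "some/page")
# Condition pattern matched against the host (what {HTTP_HOST} would supply).
cond_match = re.match(r"^(www\.)(.*)$", "www.foo.com")

print(expand("http://{C:2}/{R:0}", rule_match, cond_match))
```

With these inputs {C:2} expands to foo.com and {R:0} to some/page, so the two kinds of reference never mix: each one looks only at its own pattern's match.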
Note:
When "Wildcard" pattern syntax is used, the back-references are always
created when an asterisk symbol (*) is used in the pattern. No
back-references are created when "?" is used in the pattern.
http://www.iis.net/learn/extensions/url-rewrite-module/url-rewrite-module-configuration-reference#Using_back-references_in_rewrite_rules
I'm trying to evaluate a URL Rewrite pattern, but the first group captures too much:
My URL:
news/group/?page=2
My pattern:
group/([^\.]+)/?page=([0-9]+)
Group value:
{R:0} news/group/?page=2
{R:1} group/?
{R:2} 2
How can I modify my pattern so that {R:1} contains just "group"?
The answer was simple, since I wanted to keep the query string as is.
I changed the pattern to:
^group/([^\.]+)/$
And on the action I set:
<action ... appendQueryString="true" />
This simply appends the ?page=2 query string to the rewritten URL.
We use the IIS URL Rewrite module like this:
<rule name="RewriteSearch" stopProcessing="true">
<match url="^Search/([_0-9a-z+-]+)" />
<action type="Rewrite" url="CommonPages/Search.aspx?term={R:1}" />
</rule>
http://www.tickettail.com/Search/NormalText123
Works fine
But...
http://www.tickettail.com/Search/ราคัดมาใ
(This is Thai)
Will not. How can I modify the match to allow foreign text?
Thanks
The regular expression you are matching against only accepts the characters _, 0 to 9, a to z, + and -. To accept all characters, change the regular expression to e.g. (.+), which accepts any character and requires at least one.
Second, for any character to be passed properly to the search page, you have to URL-encode the term with the built-in {UrlEncode:{}} function. Also make sure your page can handle and output UTF-8.
The following rule works:
<rule name="RewriteSearch" stopProcessing="true">
<match url="^Search/(.+)" />
<action type="Rewrite" url="CommonPages/Search.aspx?term={UrlEncode:{R:1}}" />
</rule>
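The effect of both changes can be sketched outside IIS. The Python below is only an approximation: `re` stands in for the IIS regex engine, `urllib.parse.quote` plays the role of {UrlEncode:{R:1}}, and re.IGNORECASE is an assumption standing in for the case-insensitive matching that lets NormalText123 match the original rule. The function name `rewrite_search` is made up.

```python
import re
from urllib.parse import quote, unquote

# Original character class versus the broader (.+) from the answer.
NARROW = re.compile(r"^Search/([_0-9a-z+-]+)", re.IGNORECASE)
BROAD = re.compile(r"^Search/(.+)")


def rewrite_search(path, pattern):
    """Return the rewritten URL with a percent-encoded term, or None."""
    m = pattern.match(path)
    if m is None:
        return None
    # quote() percent-encodes the UTF-8 bytes of the captured term.
    return "CommonPages/Search.aspx?term=" + quote(m.group(1), safe="")


thai = "Search/ราคัดมาใ"
print(rewrite_search("Search/NormalText123", NARROW))
print(rewrite_search(thai, NARROW))  # None: no Thai letters in the class
print(rewrite_search(thai, BROAD))
```

Decoding the encoded term with `unquote` gives back the original Thai text, which is what the search page would need to do on its side.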
In my application I do URL rewriting with Intelligencia.
My URL Pattern is
<rewrite url="~/([A-Za-z0-9-_ ]+)$" to="~/xxxxxx.aspx?id=$1"/>
Now I want the rewrite to happen only if the URL doesn't contain iphonestatelist or iphonepropertytype.
I have tried the code below, but it does not work.
<rewrite url="~/(^((?!\b(iphonestatelist|iphonepropertytype)\b).)*[A-Za-z0-9-_ ]+)$" to="~/xxxxxx.aspx?id=$1"/>
Eg:
A URL that should not be rewritten:
www.xxxx.com/abc/sef/iphonestatelist
A URL that should be rewritten:
www.xxxx.com/FX1234
Try
<rewrite
url="^(?!.*?\b(?:iphonestatelist|iphonepropertytype)\b)~/([A-Za-z0-9-_ ]+)$"
to="~/xxxxxx.aspx?id=$1"
/>
Generally: To check if a string contains (or does not contain) a certain sub-string, use a look-ahead (or negative look-ahead, respectively) like this:
^(?=.*?pattern-required)pattern-you-look-for
^(?!.*?pattern-disallowed)pattern-you-look-for
These patterns can also be chained:
^(?!.*?not-this)(?!.*?not-that)pattern-you-look-for
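Here is a small sketch of the negative look-ahead in action, using Python's `re` rather than the Intelligencia rewriter. A leading / stands in for ~/, the hyphen is moved to the end of the character class to keep it unambiguous, and `extract_id` is a made-up helper name.

```python
import re

BLOCKED = r"(?:iphonestatelist|iphonepropertytype)"
# The negative look-ahead rejects any input containing a blocked word;
# only then is the actual id pattern tried.
PATTERN = re.compile(r"^(?!.*?\b" + BLOCKED + r"\b)/([A-Za-z0-9_ -]+)$")


def extract_id(path):
    """Return the captured id, or None when the path must not be rewritten."""
    m = PATTERN.match(path)
    return m.group(1) if m else None


for path in ("/FX1234", "/iphonestatelist", "/abc/sef/iphonestatelist"):
    print(path, "->", extract_id(path))
```

Because of the \b word boundaries, a path like /iphonestatelist2 is not treated as containing the blocked word in this sketch and would still be rewritten.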