IIS URL Rewrite {R:N} clarification - iis-7

I've not been able to understand the purpose of {R:N}. Could anyone please clarify when to use
{R:0} vs. {R:1}?
Usage example:
<action type="Redirect" url="http://www.{HTTP_HOST}/{R:0}" />
I've seen ScottGu using {R:1}
http://weblogs.asp.net/scottgu/archive/2010/04/20/tip-trick-fix-common-seo-problems-using-the-url-rewrite-extension.aspx
Whereas, below has {R:0}
http://weblogs.asp.net/owscott/archive/2009/11/27/iis-url-rewrite-rewriting-non-www-to-www.aspx
Had a look at the IIS link below but could not quite digest the definition below:
Back-references to condition patterns are identified by {C:N} where N is from 0 to 9; back-references to rule pattern are identified by {R:N} where N is from 0 to 9. Note that for both types of back-references, {R:0} and {C:0}, will contain the matched string

As per the documentation:
When an ECMAScript pattern syntax is used, a back-reference can be
created by putting parenthesis around the part of the pattern that
must capture the back-reference.
So taking the example that follows in the documentation:
^(www\.)(.*)$
And using the input string www.foo.com in the conditions, you will have:
{C:0} - www.foo.com
{C:1} - www.
{C:2} - foo.com
To make it simple:
{R:x} is used as back reference from the rule pattern (<match url="...">).
{C:x} is used as back reference from the condition pattern (<conditions><add input="{HTTP_HOST}" pattern="..."></conditions>)
The 0 reference contains the whole matched string.
The 1 reference contains the part of the string matched by the first pair of parentheses (), the 2 reference the part matched by the second pair, and so on, up to reference number 9.
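As a rough illustration of how the numbered references line up with the parentheses, here is a minimal sketch that runs the documentation's pattern through Python's `re` module. Python is only a stand-in here (an assumption for the sake of a runnable example); IIS uses its own ECMAScript engine, but numbered groups behave the same way for this pattern.

```python
import re

# Pattern and input taken from the documentation example above.
pattern = re.compile(r"^(www\.)(.*)$")
match = pattern.match("www.foo.com")

# group(0) corresponds to {C:0} / {R:0}: the whole matched string.
whole = match.group(0)
# group(1) and group(2) correspond to {C:1} and {C:2}.
first = match.group(1)
second = match.group(2)

print(whole, first, second)
```

So in a redirect such as `url="http://www.{HTTP_HOST}/{R:0}"`, {R:0} carries the entire matched URL path, while {R:1} only carries whatever the first pair of parentheses in `<match url="...">` captured.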
Note:
When "Wildcard" pattern syntax is used, the back-references are always
created when an asterisk symbol (*) is used in the pattern. No
back-references are created when "?" is used in the pattern.
http://www.iis.net/learn/extensions/url-rewrite-module/url-rewrite-module-configuration-reference#Using_back-references_in_rewrite_rules

Related

Extract a certain element from URL using regular expressions

I need to extract the first element ("adidas-originals") after "designers" in the following URL using regular expressions.
xxx/en-ca/men/designers/adidas-originals/shorts
This needs to be done in Google BigQuery (standard SQL). I have tried several ways to get the desired value without any success. Below is the best solution I have found so far, which is obviously not the right one as it returns "/adidas-originals/shorts".
REGEXP_EXTRACT(hits.page.pagePath, r'designers([^\n]*)')
Thanks!
The [^\n]* matches 0 or more chars other than a newline, LF, so no wonder it matches too much.
You need a pattern to match up to the next /, so you may use
designers/([^/]+)
Or a more precise:
(?:^|/)designers/([^/]+)
See the regex demo
Details
(?:^|/) - either start of a string or / (you may just use / if designers is always preceded with /)
designers/ - a designers/ substring
([^/]+) - Capturing group 1 (just what will be returned with the REGEXP_EXTRACT function): one or more chars other than /.
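A quick way to sanity-check the pattern outside BigQuery is to run it in Python. This is only a sketch: Python's `re` is not the RE2 engine BigQuery uses, but this particular pattern uses no syntax that differs between the two. The helper name `extract_designer` is made up for the example.

```python
import re
from typing import Optional

# Same pattern as in the answer above.
DESIGNER_RE = re.compile(r"(?:^|/)designers/([^/]+)")

def extract_designer(page_path: str) -> Optional[str]:
    """Return the path segment right after 'designers/', or None."""
    m = DESIGNER_RE.search(page_path)
    return m.group(1) if m else None

sample = "xxx/en-ca/men/designers/adidas-originals/shorts"
print(extract_designer(sample))

# The original attempt keeps everything up to the end of the line.
too_much = re.search(r"designers([^\n]*)", sample).group(1)
print(too_much)
```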

IIS UrlRewrite RegEx differences with .NET RegEx

I have a valid RegEx pattern in .NET:
(?>.*param1=value1.*)(?<!.*param2=\d+.*) which matches if:
query string contains param1=value1
but does not contain param2= a number
It works in .NET. However IIS URLRewrite complains that it is not a valid pattern.
Can I not use zero-width negative look behind (?<! ) expressions with IIS URLRewrite?
Note that I tried to apply this pattern both in web.config (properly changing < and > to &lt; and &gt; respectively) and in IIS Manager, all without success.
The default regex syntax in IIS URL Rewrite is ECMAScript, which is not compatible with .NET regex syntax. See URL Rewrite Module Configuration Reference:
ECMAScript – Perl compatible (ECMAScript standard compliant) regular expression syntax. This is a default option for any rule.
You cannot use a lookbehind at all, you will have to rely on lookaheads only:
^(?!.*param2=\d).*param1=value1.*
Pattern explanation:
^ - start of string
(?!.*param2=\d) - a negative lookahead: if param2= followed by a digit (\d) appears anywhere after 0+ characters other than a newline, the match fails (no match is returned)
.*param1=value1.* - match a whole line that contains param1=value1
You can enhance this rule by adding \b around param1=value1 to only match it as a whole word.
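To check how the lookahead version behaves on a few query strings, here is a small sketch in Python. Python's `re` is used purely as a convenient test bench (an assumption; IIS runs its own ECMAScript engine), and the helper name `rule_applies` is hypothetical.

```python
import re

# Lookahead-only pattern from the answer above.
QUERY_RE = re.compile(r"^(?!.*param2=\d).*param1=value1.*")

def rule_applies(query_string: str) -> bool:
    """True when param1=value1 is present and param2=<digit> is absent."""
    return QUERY_RE.search(query_string) is not None

for qs in [
    "param1=value1",
    "param1=value1&param2=abc",
    "param1=value1&param2=42",
    "param2=7&param1=value1",
]:
    print(qs, rule_applies(qs))
```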

URL Rewrite problem, Regex expression

In my application I rewrite URLs using Intelligencia.
My URL Pattern is
<rewrite url="~/([A-Za-z0-9-_ ]+)$" to="~/xxxxxx.aspx?id=$1"/>
Now I want the rewrite to happen only if the URL doesn't contain iphonestatelist or iphonepropertytype.
I have used the code below, but it is not working.
<rewrite url="~/(^((?!\b(iphonestatelist|iphonepropertytype)\b).)*[A-Za-z0-9-_ ]+)$" to="~/xxxxxx.aspx?id=$1"/>
E.g.:
A URL that should not be rewritten:
www.xxxx.com/abc/sef/iphonestatelist
A URL that should be rewritten:
www.xxxx.com/FX1234
Try
<rewrite
url="^(?!.*?\b(?:iphonestatelist|iphonepropertytype)\b)~/([A-Za-z0-9-_ ]+)$"
to="~/xxxxxx.aspx?id=$1"
/>
Generally: To check if a string contains (or does not contain) a certain sub-string, use a look-ahead (or negative look-ahead, respectively) like this:
^(?=.*?pattern-required)pattern-you-look-for
^(?!.*?pattern-disallowed)pattern-you-look-for
These patterns can also be chained
^(?!.*?not-this)(?!.*?not-that)pattern-you-look-for
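As a sketch of how the suggested rule behaves, the snippet below runs it in Python. Two assumptions are baked in: the rewriter hands the pattern an app-relative path starting with `~/`, and Python's `re` is close enough to the .NET engine for this pattern. The function name `rewrite` is made up for the example.

```python
import re
from typing import Optional

# Pattern from the answer above, with the hyphen moved to the end of the
# character class (same set of characters, just unambiguous to read).
RULE_RE = re.compile(
    r"^(?!.*?\b(?:iphonestatelist|iphonepropertytype)\b)~/([A-Za-z0-9_ -]+)$"
)

def rewrite(path: str) -> Optional[str]:
    """Return the rewritten path, or None when the rule does not apply."""
    m = RULE_RE.match(path)
    if m is None:
        return None
    return "~/xxxxxx.aspx?id=" + m.group(1)

print(rewrite("~/FX1234"))
print(rewrite("~/iphonestatelist"))
```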

How to use a rewrite map to replace a single querystring value with IIS 7 URL Rewrite

We have incoming URLs with query strings like this:
?query=word&param=value
where we would like to use a rewrite map to build a table of substitution values for "word".
For example:
word = newword
such that the new querystring would be:
?query=newword&param=value
"word" may be any urlencoded value, including %'s
I'm having trouble with the regex match and substitution - I seem to get it to match, but then the substituted value doesn't get passed through.
My current rule looks like:
Match URL: Matches the Pattern: .*
Conditions: match all:
Condition 1: {QUERY_STRING} matches "query=(.+)\&+?(.+)$"
Condition 2: {rewritemap:{C:1}} matches the pattern (.+)
"Track capture groups across conditions" is enabled.
Action: rewrite:
rewrite url: ?query={C:1}&param=value
(I've hard coded the &param=value because it isn't changing... having that just pass through from the input would be ideal, I was just being lazy)
So right now, using failed request tracing, I can see it match and seemingly replace with the mapped value, but the URL that is output still has the original value.
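To pin down the intended transformation independently of IIS, here is a minimal Python sketch of the behaviour described above. It is not an IIS configuration: the dictionary `REWRITE_MAP` is a hypothetical stand-in for the rewrite map, and the input is assumed to be the query string without the leading `?`.

```python
import re

# Hypothetical substitution table standing in for the rewrite map.
REWRITE_MAP = {"word": "newword"}

# Capture the value of "query" up to the next "&" (or end of string).
QUERY_RE = re.compile(r"(^|&)query=([^&]*)")

def rewrite_query(query_string: str) -> str:
    """Replace the value of 'query' when the map has an entry for it."""
    def substitute(m: "re.Match[str]") -> str:
        value = m.group(2)
        # Unknown values pass through unchanged.
        return f"{m.group(1)}query={REWRITE_MAP.get(value, value)}"
    return QUERY_RE.sub(substitute, query_string, count=1)

print(rewrite_query("query=word&param=value"))
```

Note that the capture here stops at the first `&`, so the rest of the query string passes through untouched instead of being hard coded.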

Rewriting a URL with RegEx

I have a RegEx problem. Consider the following URL:
http://ab.cdefgh.com/aa-BB/index.aspx
I need a regular expression that looks at "aa-BB" and, if it doesn't
match a number of specific values, say:
rr-GG
vv-VV
yy-YY
zz-ZZ
then the URL should redirect to some place. For example:
http://ab.cdefgh.com/notfound.aspx
In web.config I have urlrewrite rules. I need to know what
the regex would be between the tags.
<urlrewrites>
<rule>
<url>?</url>
<rewrite>http://ab.cdefgh.com/notfound.aspx</rewrite>
</rule>
</urlrewrites>
Assuming you don't care about the potential for the replacement pattern to be in the domain name or some other level of the directory structure, this should select on the pattern you're interested in:
http:\/\/ab\.cdefgh\.com\/(?:aa\-BB|rr\-GG|vv\-VV|yy\-YY|zz\-ZZ)\/index\.aspx
where the aa-BB, etc. patterns are simply "or"ed together using the | operator.
To further break this apart, all of the /, ., and - characters need to be escaped with a \ to prevent the regex from interpreting them as syntax. The (?: notation means to group the things being "or"ed without storing it in a backreference variable (this makes it more efficient if you don't care about retaining the value selected).
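The alternation can be tried out with a short Python sketch. Python's `re` is only a test bench here (the regex flavour used by the rewriter in the question is not stated), and the helper name `is_known_locale` is invented for the example.

```python
import re

# Pattern from the answer above, copied verbatim.
KNOWN_RE = re.compile(
    r"http:\/\/ab\.cdefgh\.com\/(?:aa\-BB|rr\-GG|vv\-VV|yy\-YY|zz\-ZZ)\/index\.aspx"
)

def is_known_locale(url: str) -> bool:
    """True when the URL carries one of the listed locale segments."""
    return KNOWN_RE.fullmatch(url) is not None

print(is_known_locale("http://ab.cdefgh.com/aa-BB/index.aspx"))
print(is_known_locale("http://ab.cdefgh.com/qq-QQ/index.aspx"))
```

Keep in mind that this pattern selects the known values; a rule that should fire for everything else needs the condition inverted.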
Here is a link to a demonstration (maybe this can help you play around with the regex here to get to exactly which character combinations you want)
http://rubular.com/r/UfB65UyYrj
Will this help?
^([a-z])\1-([A-Z])\2.*
It matches:
uu-FF/
aa-BB/
bb-CC/index
But not
aaBB
asdf
ba-BB
aA-BB
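The match lists above can be reproduced with a short Python sketch (Python's `re` is an assumed stand-in for whatever engine the rewriter uses; `\1` and `\2` work the same way in most flavours).

```python
import re

# \1 and \2 repeat whatever the first and second groups captured.
DOUBLED_RE = re.compile(r"^([a-z])\1-([A-Z])\2.*")

should_match = ["uu-FF/", "aa-BB/", "bb-CC/index"]
should_not_match = ["aaBB", "asdf", "ba-BB", "aA-BB"]

matched = [s for s in should_match + should_not_match if DOUBLED_RE.match(s)]
print(matched)
```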
(Edit based on comment)
Just pipe-delimit your desired URLs inside of () and escape any special characters.
Eg.
^(xx-YY|yy-ZZ|aa-BB|goodStuff)/.*
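But I think you might actually want the opposite: a pattern that matches anything other than the URLs you specify, so that everything else goes to notfound.aspx. Wrapping the alternatives in [^...] would create a character class rather than negate them, so use a negative lookahead instead (assuming your regex engine supports lookaheads):
^(?!(?:xx-YY|yy-ZZ|aa-BB|goodStuff)/).*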
Assuming you want anything but xx-XX, yy-YY and zz-ZZ to redirect, and that your regex engine supports negative lookaheads:
^(?!(?:xx-XX|yy-YY|zz-ZZ)/)
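To see the difference between a bracketed character class and a negative lookahead for this "anything but these prefixes" check, here is a small Python sketch. It assumes the pattern is applied to the path after the host name (the question does not say what the rewriter matches against) and that the engine supports lookaheads; `should_redirect` is a made-up helper name.

```python
import re

# Negative lookahead: succeed only when the path does NOT start with
# one of the allowed prefixes followed by "/".
NOT_ALLOWED_RE = re.compile(r"^(?!(?:xx-XX|yy-YY|zz-ZZ)/).*")

# Square brackets build a character class: this matches any ONE character
# that is not in the set ( ) | - x X y Y z Z, nothing more.
CHAR_CLASS_RE = re.compile(r"[^(xx\-XX)|(yy\-YY)|(zz\-ZZ)]")

def should_redirect(path: str) -> bool:
    """True when the path falls outside the allowed prefixes."""
    return NOT_ALLOWED_RE.match(path) is not None

print(should_redirect("aa-BB/index.aspx"))
print(should_redirect("xx-XX/index.aspx"))
# "xY-zZ" is not an allowed prefix, yet every one of its characters is
# in the bracketed set, so the character class finds nothing to match.
print(CHAR_CLASS_RE.search("xY-zZ"))
```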

Resources