regular expression needed to rewrite URLs - asp.net

I am trying to make a rewrite rule to check whether the URL ends with .htm or .html, but does not contain Archive.aspx.
The url starts out like
www.contoso.com/test.htm (or .html)
and ends up like
www.contoso.com/Archive.aspx?page=/test.htm
How can I do this with a regular expression?

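You can use a negative lookahead, anchored at the start of the string so that it scans the whole URL: ^(?!.*Archive\.aspx).*\.html?$
Without the ^ and the leading .* inside the lookahead, the check applies only to the position where the match begins, so a URL that contains Archive.aspx further along would still match.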
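As a rough check of the lookahead approach, here is a minimal Python sketch. Python's re module is used only for illustration; the lookahead syntax is the same in .NET regular expressions, but the rewrite module you use may anchor or apply patterns differently. The helper names needs_rewrite and rewrite are hypothetical.

```python
import re

# Anchored negative lookahead: reject any URL containing "Archive.aspx"
# anywhere, then require the URL to end in .htm or .html.
PATTERN = re.compile(r"^(?!.*Archive\.aspx).*\.html?$")

# Unanchored variant, for comparison: the lookahead guards only the start
# position, so the engine can still find a match in an Archive.aspx URL.
UNANCHORED = re.compile(r"(?!Archive\.aspx).*\.html?$")


def needs_rewrite(url: str) -> bool:
    """Return True when the URL ends in .htm/.html and has no Archive.aspx."""
    return PATTERN.search(url) is not None


def rewrite(url: str) -> str:
    """Hypothetical helper: assumes the URL has the form host/path."""
    if not needs_rewrite(url):
        return url
    host, _, path = url.partition("/")
    return f"{host}/Archive.aspx?page=/{path}"


if __name__ == "__main__":
    for candidate in ("www.contoso.com/test.htm",
                      "www.contoso.com/test.html",
                      "www.contoso.com/Archive.aspx?page=/test.htm"):
        print(candidate, "->", rewrite(candidate))
```

The already rewritten URL is left alone by PATTERN, whereas UNANCHORED would match it again and cause a rewrite loop.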

Related

Nginx redirect rule with arabic characters

I need to create an Nginx rule that redirects /ae/ar/أسافر%20من%20أجل/ to /ae/en. I've tried with these two rules:
rewrite (?i)^/ae/ar/أسافر%20من%20أجل(.*)$ http://$host/ae/en? permanent;
rewrite (?i)^/ae/ar/%D8%A3%D8%B3%D8%A7%D9%81%D8%B1%20%D9%85%D9%86%20%D8%A3%D8%AC%D9%84(.*)$ https://$host/ae/en? permanent;
But neither of them worked. It looks as if the rule is ignored. Any clues on how to make this rule work properly?
Thanks in advance!
The rewrite directive matches against the decoded (normalized) URI, so it never sees the percent-encoded characters. You should be able to leave the Arabic characters as they are (UTF-8 characters within the configuration file) and replace each %20 with a single space. Because the pattern then contains spaces, the entire regular expression must be enclosed in double quotes.
Alternatively, use a location statement:
location = "/ae/ar/أسافر من أجل/" { return 301 /ae/en; }
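To see why the percent-encoded rule can never match, the following Python sketch decodes the requested path the way a server would before matching, then tries both patterns against the decoded form. This only models the behaviour described above; it is not nginx itself, and the variable names are assumptions for illustration.

```python
import re
from urllib.parse import unquote

# Path as it travels over the wire (percent-encoded UTF-8).
requested = ("/ae/ar/%D8%A3%D8%B3%D8%A7%D9%81%D8%B1%20%D9%85%D9%86"
             "%20%D8%A3%D8%AC%D9%84/")

# What the matching step works on: the decoded path.
decoded = unquote(requested)

# The Arabic segment written with escapes so the source stays ASCII;
# it spells: أسافر من أجل
SEGMENT = "\u0623\u0633\u0627\u0641\u0631 \u0645\u0646 \u0623\u062c\u0644"

# Rule written with decoded characters and literal spaces.
decoded_rule = re.compile("^/ae/ar/" + SEGMENT + "(.*)$")

# Rule written with percent-encoding, as in the second attempt above.
encoded_rule = re.compile(
    r"^/ae/ar/%D8%A3%D8%B3%D8%A7%D9%81%D8%B1%20%D9%85%D9%86"
    r"%20%D8%A3%D8%AC%D9%84(.*)$"
)

if __name__ == "__main__":
    print("decoded rule matches:", decoded_rule.match(decoded) is not None)
    print("encoded rule matches:", encoded_rule.match(decoded) is not None)
```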

Why is my url rewriting rule not working properly?

I'm trying to write this redirection
images/catalog/1002/10002/main-200x250.12345.jpg to url images/catalog/1002/10002/main.jpg?w=200&h=250&vw=main
I tried this rule:
rewrite "^/images/(.*)/([a-z0-9]+)-([0-9])x([0-9]).([0-9]{5}).(jpg|jpeg|png|gif|ico)$" /images/$1/$2.$6?w=$3&h=$4&vw=$2 break;
It is not working; it returns a 404 Not Found error. I don't know what I'm missing.
Also when I remove double quotes (") I got this error
directive "rewrite" is not terminated by ";"
I also don't clearly understand what the " characters are for, or when I should use or avoid them.
I'm working on a Mac with MAMP Pro v 5.2.2.
You forgot to add a quantifier to the width and height groups in your regex, so each of them matches only a single digit. Try this (I added a + to each group; use {X} instead if each number always has exactly X digits):
rewrite "^/images/(.*)/([a-z0-9]+)-([0-9]+)x([0-9]+).([0-9]{5}).(jpg|jpeg|png|gif|ico)$" /images/$1/$2.$6?w=$3&h=$4&vw=$2 break;
Your regular expression needs to be quoted because there is a } in it.
The nginx documentation for the rewrite directive explains when a regular expression needs to be quoted:
If a regular expression includes the “}” or “;” characters, the whole
expressions should be enclosed in single or double quotes.
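The effect of the missing quantifiers can be checked outside nginx. The sketch below uses Python's re module as a stand-in for the PCRE engine nginx uses; the two engines agree on this syntax, but this is an illustration rather than a test of the actual server. It also escapes the literal dots as \. (the rules above leave them unescaped, where . matches any character).

```python
import re

# Original rule: ([0-9]) matches exactly one digit, so "200x250" cannot match.
BROKEN = re.compile(
    r"^/images/(.*)/([a-z0-9]+)-([0-9])x([0-9])\.([0-9]{5})\."
    r"(jpg|jpeg|png|gif|ico)$"
)

# Corrected rule: + lets width and height span one or more digits.
FIXED = re.compile(
    r"^/images/(.*)/([a-z0-9]+)-([0-9]+)x([0-9]+)\.([0-9]{5})\."
    r"(jpg|jpeg|png|gif|ico)$"
)

# Python spells nginx's $1..$6 as \1..\6 in the replacement string.
TARGET = r"/images/\1/\2.\6?w=\3&h=\4&vw=\2"

uri = "/images/catalog/1002/10002/main-200x250.12345.jpg"
rewritten = FIXED.sub(TARGET, uri)

if __name__ == "__main__":
    print("broken rule matches:", BROKEN.match(uri) is not None)
    print("rewritten:", rewritten)
```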

Nginx Regular expression to match first segment of a url

I have some urls like these:
domain.com/article/1234-newstitle
domain.com/article/1234-newstitle/
domain.com/article/1234-newstitle/abc
domain.com/article/1234-newstitle/xpto/abc
domain.com/article/1234-newstitle/qwerty/abc/xyz
I want to capture only the /1234-newstitle/ segment to redirect these URLs, and ignore everything after the slash that follows it (whatever the number of segments), so I can have:
domain.com/news/newstitle-1234/
The best I could get is:
rewrite ^article/(\d+)-(.*)[^\/]* /new/$2-$1/ permanent;
but $2 matches everything after 1234-
How am I able to match only the first segment "1234-newstitle" and ignore the rest?
Looks like you've got a greedy match after the hyphen. Try this:
rewrite ^/article/(\d+)-([^/]+) /new/$2-$1/ permanent;
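The difference between the greedy (.*) and the segment-limited ([^/]+) can be seen in a short Python sketch. Python's re stands in for nginx's regex engine here, and redirect_target is a hypothetical helper that mimics the rewrite replacement.

```python
import re
from typing import Optional

# Greedy version from the question: group 2 swallows every later segment.
GREEDY = re.compile(r"^/article/(\d+)-(.*)")

# Version from the answer: [^/]+ stops at the next slash.
RULE = re.compile(r"^/article/(\d+)-([^/]+)")


def redirect_target(uri: str) -> Optional[str]:
    """Return the redirect path, or None if the URI is not an article."""
    match = RULE.match(uri)
    if match is None:
        return None
    article_id, slug = match.groups()
    return f"/new/{slug}-{article_id}/"


PATHS = [
    "/article/1234-newstitle",
    "/article/1234-newstitle/",
    "/article/1234-newstitle/abc",
    "/article/1234-newstitle/xpto/abc",
    "/article/1234-newstitle/qwerty/abc/xyz",
]

if __name__ == "__main__":
    for path in PATHS:
        print(path, "->", redirect_target(path))
```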

Regex to match url not for certain file types

I want my Regex to match all valid URLs that do not end with
.gif
.jpg
.jpeg
.pdf
.doc
I tried
http(s)?://([\w-]+\.)+[\w-]+(/[\w- ./?%&=;]*)?((?!jpg)|(?!gif)|(?!doc))
You need a lookbehind for that. A lookahead placed at the end of the pattern has nothing left to inspect, so it always succeeds. Try:
http(s)?://([\w-]+\.)+[\w-]+(/[\w- ./?%&=;]*)?(?<!jpg)(?<!gif)(?<!doc)$
You also need the $ anchor at the end. It matches the end of the string, which fixes the position from which each lookbehind looks back. The example covers only jpg, gif and doc; add (?<!jpeg) and (?<!pdf) in the same way for the remaining extensions.
See it here on Regexr
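A minimal Python sketch of the lookbehind technique, covering all five extensions from the question. The URL part of the pattern is a simplified variant, not the exact expression above: Python rejects a character class written as [\w- ...], so the hyphen is moved to the end of the class. The example.com URLs are placeholders.

```python
import re

# One fixed-width negative lookbehind per excluded extension, checked at the
# end of the string.
URL_NOT_FILE = re.compile(
    r"https?://(?:[\w-]+\.)+[\w-]+(?:/[\w ./?%&=;-]*)?"
    r"(?<!\.gif)(?<!\.jpg)(?<!\.jpeg)(?<!\.pdf)(?<!\.doc)$"
)


def is_allowed(url: str) -> bool:
    """True for URLs that do not end in one of the excluded extensions."""
    return URL_NOT_FILE.match(url) is not None


if __name__ == "__main__":
    for url in ("http://example.com/page.html",
                "http://example.com",
                "http://example.com/pic.jpg",
                "https://example.com/docs/report.pdf"):
        print(url, "->", is_allowed(url))
```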

Rewriting a URL with RegEx

I have a RegEx problem. Consider the following URL:
http://ab.cdefgh.com/aa-BB/index.aspx
I need a regular expression that looks at "aa-BB" and, if it doesn't
match a number of specific values, say:
rr-GG
vv-VV
yy-YY
zz-ZZ
then the URL should redirect to some place. For example:
http://ab.cdefgh.com/notfound.aspx
In web.config I have urlrewrite rules. I need to know what
the regex would be between the tags.
<urlrewrites>
<rule>
<url>?</url>
<rewrite>http://ab.cdefgh.com/notfound.aspx</rewrite>
</rule>
</urlrewrites>
Assuming you don't care about the potential for the replacement pattern to be in the domain name or some other level of the directory structure, this should select on the pattern you're interested in:
http:\/\/ab\.cdefgh\.com\/(?:aa\-BB|rr\-GG|vv\-VV|yy\-YY|zz\-ZZ)\/index\.aspx
where the aa-BB, etc. patterns are simply "or"ed together using the | operator.
To break this down further: the . characters must be escaped with a \ so that the regex does not interpret them as "any character". Escaping / and - is harmless here, but it is only required when / is the regex delimiter or when - appears inside a character class. The (?: notation groups the alternatives being "or"ed without storing the match in a backreference (slightly more efficient if you don't need the selected value).
Here is a link to a demonstration (maybe this can help you play around with the regex here to get to exactly which character combinations you want)
http://rubular.com/r/UfB65UyYrj
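The alternation with a non-capturing group can be tried in Python as follows. This is a sketch using Python's re rather than the .NET engine behind the web.config rule; the slashes and hyphens are left unescaped because Python does not require escaping them here.

```python
import re

# Non-capturing alternation of the listed locale segments.
LOCALE_URL = re.compile(
    r"http://ab\.cdefgh\.com/(?:aa-BB|rr-GG|vv-VV|yy-YY|zz-ZZ)/index\.aspx"
)

match = LOCALE_URL.fullmatch("http://ab.cdefgh.com/rr-GG/index.aspx")
miss = LOCALE_URL.fullmatch("http://ab.cdefgh.com/qq-QQ/index.aspx")

if __name__ == "__main__":
    # (?:...) groups without capturing, so there are no numbered groups.
    print("listed locale matches:", match is not None, match.groups())
    print("unlisted locale matches:", miss is not None)
```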
Will this help?
^([a-z])\1-([A-Z])\2.*
It matches:
uu-FF/
aa-BB/
bb-CC/index
But not
aaBB
asdf
ba-BB
aA-BB
(Edit based on comment)
Just pipe-delimit your desired URLs inside () and escape any special characters.
Eg.
^(xx-YY|yy-ZZ|aa-BB|goodStuff)/.*
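But I think you might actually want the following, which matches anything other than the URLs that you specify, so that everything else goes to notfound.aspx. Note that square brackets would not work for this: [^...] is a negated character class that matches a single character, so a negative lookahead is needed instead:
^(?!(?:xx-YY|yy-ZZ|aa-BB|goodStuff)/).*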
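Assuming you want anything but xx-XX, yy-YY and zz-ZZ to redirect, use a negative lookahead (a bracketed [^...] expression would only exclude individual characters, not whole values):
^(?!(?:xx-XX|yy-YY|zz-ZZ)/)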
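To express "anything except these values", a negative lookahead over the alternation is the usual tool, since a negated character class only ever matches one character. The sketch below assumes the rule is tested against the path without the host and without a leading slash, which may differ from how your rewrite module presents the URL; ALLOWED and should_redirect are hypothetical names.

```python
import re

# Segments that must NOT be redirected (taken from the question).
ALLOWED = ("rr-GG", "vv-VV", "yy-YY", "zz-ZZ")

# Matches only when the first path segment is none of the allowed values.
NOT_ALLOWED = re.compile(
    r"^(?!(?:" + "|".join(re.escape(value) for value in ALLOWED) + r")/)"
)


def should_redirect(path: str) -> bool:
    """True when the path should be sent to notfound.aspx."""
    return NOT_ALLOWED.search(path) is not None


if __name__ == "__main__":
    for path in ("aa-BB/index.aspx", "rr-GG/index.aspx", "zz-ZZ/index.aspx"):
        print(path, "-> redirect" if should_redirect(path) else "-> keep")
```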
