How do I use regex in `proxy_cookie_domain` - nginx

I'm using the proxy_cookie_domain directive, and the docs say I should be able to use regex:
The directive can also be specified using regular expressions. In this
case, domain should start from the “~” symbol. A regular expression
can contain named and positional captures, and replacement can
reference them:
proxy_cookie_domain ~\.(?P<sl_domain>[-0-9a-z]+\.[a-z]+)$ $sl_domain;
How do I use regex to replace every instance of example.com with mydomain.com, e.g.:
anything.example.com with anything.mydomain.com
any.thing.example.com with any.thing.mydomain.com
literally.any.thing.example.com with literally.any.thing.mydomain.com
And so on...

Turns out nginx uses the PCRE flavor of regex, which you can experiment with at regex101.
Here's what I came up with:
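proxy_cookie_domain ~^(?<cookie_pre>.*\.)?example\.com$ "${cookie_pre}mydomain.com";
The pattern is anchored with ^ so that cookie_pre captures the whole subdomain prefix (including its trailing dot), and the dots in example.com are escaped so they only match literal dots.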
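A quick way to sanity-check a pattern like this outside nginx is to run it against the sample domains from the question. The sketch below uses Python's re module, which is not PCRE but behaves the same way for this pattern (Python spells named groups (?P<name>...)). The anchored pattern and the rewrite_cookie_domain helper are illustrative assumptions, not something nginx provides; the helper assumes the replacement stands in for the whole cookie domain when the pattern matches.

```python
import re

# Hypothetical stand-in for the directive: an anchored pattern whose optional
# named group captures everything before "example.com", trailing dot included.
COOKIE_DOMAIN = re.compile(r"^(?P<cookie_pre>.*\.)?example\.com$", re.IGNORECASE)


def rewrite_cookie_domain(domain: str) -> str:
    """Return the domain with example.com swapped for mydomain.com, if it matches."""
    match = COOKIE_DOMAIN.match(domain)
    if match is None:
        return domain  # unrelated domains are left untouched
    prefix = match.group("cookie_pre") or ""  # group is None for bare example.com
    return prefix + "mydomain.com"


if __name__ == "__main__":
    samples = (
        "anything.example.com",
        "any.thing.example.com",
        "literally.any.thing.example.com",
        "example.com",
        "other.org",
    )
    for sample in samples:
        print(sample, "->", rewrite_cookie_domain(sample))
```

Without the ^ anchor, an unanchored leading \. can match at any dot in the domain, so part of a multi-level prefix may fall outside the capture.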

Related

Use script instead of regex for Nginx proxy_redirect to modify Location header

For Nginx proxy_redirect, can we use a script instead of a string or regex? I need to replace multiple occurrences of text in the Location string using proxy_redirect, and I am not sure how to do it with a regex. So I am checking whether proxy_redirect can make use of a script, which would make this easy.
You can use multiple proxy_redirect statements within a single block containing the proxy_pass statement. Nginx evaluates statements containing regular expressions in order until a match is found, so place the more specific regular expressions before the less specific ones.
To replace a single occurrence of a pattern within the Location header of a 3xx response, you would use:
proxy_redirect ~^(.*)origin.example.com(.*)$ $1main.example.com$2;
To replace two occurrences of the same pattern, you would use:
proxy_redirect ~^(.*)origin.example.com(.*)origin.example.com(.*)$ $1main.example.com$2main.example.com$3;
To replace one, two or three occurrences of the same pattern, you would use:
proxy_redirect ~^(.*)origin.example.com(.*)origin.example.com(.*)origin.example.com(.*)$ $1main.example.com$2main.example.com$3main.example.com$4;
proxy_redirect ~^(.*)origin.example.com(.*)origin.example.com(.*)$ $1main.example.com$2main.example.com$3;
proxy_redirect ~^(.*)origin.example.com(.*)$ $1main.example.com$2;
Use ~* to make the regular expression case-insensitive. See this document for details.
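The "first matching rule wins, most specific first" behavior described above can be tried out in a small Python sketch. This only imitates the ordering logic with Python's re module; the RULES list and the rewrite_location helper are hypothetical names for illustration, and the dots in the host name are escaped here, which the nginx lines above do not do.

```python
import re

HOST = r"origin\.example\.com"
NEW = "main.example.com"

# Most specific rule first, mirroring the order of the proxy_redirect lines.
RULES = [
    (rf"^(.*){HOST}(.*){HOST}(.*){HOST}(.*)$",
     rf"\g<1>{NEW}\g<2>{NEW}\g<3>{NEW}\g<4>"),
    (rf"^(.*){HOST}(.*){HOST}(.*)$",
     rf"\g<1>{NEW}\g<2>{NEW}\g<3>"),
    (rf"^(.*){HOST}(.*)$",
     rf"\g<1>{NEW}\g<2>"),
]


def rewrite_location(location: str, rules=RULES) -> str:
    """Apply the first rule whose pattern matches; leave other values alone."""
    for pattern, template in rules:
        match = re.match(pattern, location)
        if match:
            return match.expand(template)
    return location


if __name__ == "__main__":
    url = "http://origin.example.com/login?next=http://origin.example.com/home"
    print(rewrite_location(url))              # all rules, most specific first
    print(rewrite_location(url, RULES[-1:]))  # single-occurrence rule only
```

Running the single-occurrence rule alone on a value with two occurrences shows why the order matters: the greedy first group swallows the earlier occurrence, so only the last one is replaced.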

IIS UrlRewrite RegEx differences with .NET RegEx

I have a valid RegEx pattern in .NET:
(?>.*param1=value1.*)(?<!.*param2=\d+.*) which matches if:
query string contains param1=value1
but does not contain param2= a number
It works in .NET. However IIS URLRewrite complains that it is not a valid pattern.
Can I not use zero-width negative look behind (?<! ) expressions with IIS URLRewrite?
Note that I tried to apply this pattern both in web.config (properly changing < and > to &lt; and &gt; respectively) and in IIS Manager, all without success.
The default regex syntax in IIS URLRewrite is ECMAScript, which is not compatible with .NET regex syntax. See the URL Rewrite Module Configuration Reference:
ECMAScript – Perl compatible (ECMAScript standard compliant) regular expression syntax. This is a default option for any rule.
You cannot use a lookbehind at all, so you will have to rely on lookaheads only:
^(?!.*param2=\d).*param1=value1.*
Pattern explanation:
^ - start of string
(?!.*param2=\d) - negative lookahead: fail the match (return no match) if param2= followed by a digit (\d) appears anywhere after 0+ characters other than a newline
.*param1=value1.* - match a whole line that contains param1=value1
You can enhance this rule by adding \b around param1=value1 to only match it as a whole word.
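The lookahead-only pattern can be checked against a few query strings before putting it into a rewrite rule. The sketch below uses Python's re module rather than the IIS engine, on the assumption that both treat this negative lookahead the same way; rule_matches is a hypothetical helper name.

```python
import re

# Same pattern as in the answer: reject any input that contains param2=<digit>,
# then require param1=value1 somewhere in the line.
PATTERN = re.compile(r"^(?!.*param2=\d).*param1=value1.*")


def rule_matches(query: str) -> bool:
    """True when the query has param1=value1 and no numeric param2."""
    return PATTERN.search(query) is not None


if __name__ == "__main__":
    for query in (
        "param1=value1",
        "a=1&param1=value1&b=2",
        "param1=value1&param2=42",
        "param2=7&param1=value1",
        "param1=value1&param2=abc",
    ):
        print(query, "->", rule_matches(query))
```

Because the lookahead sits right after ^, it scans the entire string, so the position of param2 relative to param1 does not matter.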

Limitations on regex in nginx?

I had a rewrite rule on Apache for /year/month/date links in a form that specifically defined 4 digits, then 2 digits, then 2 digits, that looked like this:
^([0-9]{4})/([0-9]{2})/([0-9]{2})/$
On nginx this regex causes an error saying that the whole line is not terminated by a ; sign, until I remove the {} quantifiers and leave the regex like this:
^/([0-9]+)/([0-9]+)/([0-9]+)/$
Is this limitation intentional on nginx's part or some mistake on my part?
The whole line from Apache:
RewriteRule ^([0-9]{4})/([0-9]{2})/([0-9]{2})/$ index.php?page=date&year=$1&month=$2&day=$3
The whole (working) line from nginx:
rewrite ^/([0-9]+)/([0-9]+)/([0-9]+)/$ /index.php?page=date&year=$1&month=$2&day=$3;
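This is intentional: braces and semicolons are part of nginx's configuration syntax, so a regex containing them must be quoted, for example:
rewrite "^/([0-9]{4})/([0-9]{2})/([0-9]{2})/$" /index.php?page=date&year=$1&month=$2&day=$3;
From the documentation:
If a regular expression includes the “}” or “;” characters, the whole expressions should be enclosed in single or double quotes.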
http://nginx.org/en/docs/http/ngx_http_rewrite_module.html#rewrite
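Keeping the {4}/{2}/{2} quantifiers is worth the quoting, because the + workaround accepts more URLs than the original rule did. A small Python sketch of the difference, with STRICT, LOOSE and parse_date_path as hypothetical names (Python's re stands in for PCRE here):

```python
import re

# The quantified pattern from the Apache rule, with nginx's leading slash.
STRICT = re.compile(r"^/([0-9]{4})/([0-9]{2})/([0-9]{2})/$")
# The workaround pattern that avoids braces.
LOOSE = re.compile(r"^/([0-9]+)/([0-9]+)/([0-9]+)/$")


def parse_date_path(path: str, pattern=STRICT):
    """Return (year, month, day) as strings, or None when the path does not match."""
    match = pattern.match(path)
    return match.groups() if match else None


if __name__ == "__main__":
    for path in ("/2024/05/17/", "/1/2/3/", "/20240/5/170/"):
        print(path, parse_date_path(path), parse_date_path(path, LOOSE))
```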

Rewriting a URL with RegEx

I have a RegEx problem. Consider the following URL:
http://ab.cdefgh.com/aa-BB/index.aspx
I need a regular expression that looks at "aa-BB" and, if it doesn't
match a number of specific values, say:
rr-GG
vv-VV
yy-YY
zz-ZZ
then the URL should redirect to some place. For example:
http://ab.cdefgh.com/notfound.aspx
In web.config I have urlrewrite rules. I need to know what
the regex would be between the tags.
<urlrewrites>
<rule>
<url>?</url>
<rewrite>http://ab.cdefgh.com/notfound.aspx</rewrite>
</rule>
</urlrewrites>
Assuming you don't care about the potential for the replacement pattern to be in the domain name or some other level of the directory structure, this should select on the pattern you're interested in:
http:\/\/ab\.cdefgh\.com\/(?:aa\-BB|rr\-GG|vv\-VV|yy\-YY|zz\-ZZ)\/index\.aspx
where the aa-BB, etc. patterns are simply "or"ed together using the | operator.
To further break this apart: the . characters need to be escaped with a \ so the regex treats them as literal dots. Escaping / and - as well is harmless here, and / does need escaping in flavors that use it as the regex delimiter. The (?: notation means to group the things being "or"ed without storing them in a backreference variable (this makes it more efficient if you don't care about retaining the value selected).
Here is a link to a demonstration (maybe this can help you play around with the regex here to get to exactly which character combinations you want)
http://rubular.com/r/UfB65UyYrj
Will this help?
^([a-z])\1-([A-Z])\2.*
It matches:
uu-FF/
aa-BB/
bb-CC/index
But not
aaBB
asdf
ba-BB
aA-BB
(Edit based on comment)
Just pipe-delimit your desired URLs inside () and escape any special characters.
E.g.
^(xx-YY|yy-ZZ|aa-BB|goodStuff)/.*
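But I think you actually want the opposite: a pattern that matches anything other than the URLs you specify, so that everything else goes to notfound.aspx. A negated character class ([^...]) cannot do that, because it excludes single characters rather than whole alternatives; use a negative lookahead instead:
^(?!(?:xx-YY|yy-ZZ|aa-BB|goodStuff)/).*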
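Assuming you want anything but xx-XX, yy-YY and zz-ZZ to redirect, use a negative lookahead (a [^...] character class would only exclude individual characters, not whole values):
^(?!(?:xx-XX|yy-YY|zz-ZZ)/).*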
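To see the difference between excluding whole values and excluding single characters, here is a minimal Python sketch. It assumes the rule is tested against the path without a leading slash (for example aa-BB/index.aspx), which the question does not state; ALLOWED, NOT_ALLOWED, CHAR_CLASS and should_redirect are hypothetical names, and Python's re stands in for whatever engine the rewrite module uses.

```python
import re

# The values from the question that should NOT be redirected.
ALLOWED = ("rr-GG", "vv-VV", "yy-YY", "zz-ZZ")

# Negative lookahead: match only when the first path segment is not allowed.
NOT_ALLOWED = re.compile(r"^(?!(?:%s)/).*" % "|".join(map(re.escape, ALLOWED)))

# A negated character class: matches any ONE character that is not in the set
# "(", "x", "-", "X", ")", "|", "y", "Y", "z", "Z". It knows nothing about words.
CHAR_CLASS = re.compile(r"[^(xx\-XX)|(yy\-YY)|(zz\-ZZ)]")


def should_redirect(path: str) -> bool:
    """True when the path should go to notfound.aspx."""
    return NOT_ALLOWED.match(path) is not None


if __name__ == "__main__":
    for path in ("aa-BB/index.aspx", "rr-GG/index.aspx", "zz-ZZ/"):
        print(path, "->", should_redirect(path))
    # The character class still finds a match in a path that should be excluded.
    print(CHAR_CLASS.search("xx-XX/index.aspx") is not None)
```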

regular expression needed to rewrite URLs

I am trying to make a rewrite rule to check whether the URL ends with .htm or .html, but does not contain Archive.aspx.
The url starts out like
www.contoso.com/test.htm (or .html)
and ends up like
www.contoso.com/Archive.aspx?page=/test.htm
How can I do this with a regular expression?
You can use a negative lookahead, anchored at the start so that it checks the whole URL: ^(?!.*Archive\.aspx).*\.html?$
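A lookahead only tests the position where it sits, so where it is anchored matters. The sketch below compares an anchored form with an unanchored one using Python's re module (an assumption, since the question does not say which engine runs the rule); ANCHORED, UNANCHORED and needs_rewrite are hypothetical names.

```python
import re

# Anchored: the lookahead runs once at the start and scans the whole URL.
ANCHORED = re.compile(r"^(?!.*Archive\.aspx).*\.html?$")
# Unanchored: the lookahead only checks that the match does not START with
# "Archive.aspx", so the engine can begin the match somewhere else.
UNANCHORED = re.compile(r"(?!Archive\.aspx).*\.html?$")


def needs_rewrite(url: str) -> bool:
    """True for URLs ending in .htm/.html that do not already contain Archive.aspx."""
    return ANCHORED.search(url) is not None


if __name__ == "__main__":
    for url in (
        "www.contoso.com/test.htm",
        "www.contoso.com/test.html",
        "www.contoso.com/Archive.aspx?page=/test.htm",
    ):
        print(url, needs_rewrite(url), UNANCHORED.search(url) is not None)
```

The anchored form avoids re-rewriting a URL that has already been sent to Archive.aspx, which would otherwise still end in .htm.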
