Nginx Regular expression to match first segment of a url - nginx

I have some urls like these:
domain.com/article/1234-newstitle
domain.com/article/1234-newstitle/
domain.com/article/1234-newstitle/abc
domain.com/article/1234-newstitle/xpto/abc
domain.com/article/1234-newstitle/qwerty/abc/xyz
I want to capture only the 1234-newstitle segment when redirecting these URLs and ignore everything after the slash that follows it (however many segments there are), so that I get:
domain.com/news/newstitle-1234/
The best I could get is:
rewrite ^article/(\d+)-(.*)[^\/]* /new/$2-$1/ permanent;
but $2 matches everything after 1234-
How can I match only the first segment "1234-newstitle" and ignore the rest?

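Looks like you've got a greedy match after the hyphen: (.*) consumes the rest of the URI, slashes included. Match only non-slash characters instead, and note that the URI nginx tests the pattern against begins with a /. Try this: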
rewrite ^/article/(\d+)-([^/]+) /new/$2-$1/ permanent;
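The difference between the greedy (.*) and the segment-limited ([^/]+) can be checked outside nginx. The following is a minimal sketch in Python, assuming Python's re module behaves like nginx's PCRE for these simple patterns; the helper name redirect_target is hypothetical and only mimics how rewrite builds the new URI from the captures.

```python
import re

# Greedy version: (.*) swallows every remaining segment.
GREEDY = re.compile(r"^/article/(\d+)-(.*)")
# Suggested version: [^/]+ stops at the first slash after the title.
SEGMENT = re.compile(r"^/article/(\d+)-([^/]+)")


def redirect_target(pattern, uri):
    """Hypothetical helper: build the replacement URI the way `rewrite` would."""
    m = pattern.match(uri)
    return m.expand(r"/new/\2-\1/") if m else None


uris = [
    "/article/1234-newstitle",
    "/article/1234-newstitle/",
    "/article/1234-newstitle/abc",
    "/article/1234-newstitle/xpto/abc",
    "/article/1234-newstitle/qwerty/abc/xyz",
]

for uri in uris:
    print(uri, "->", redirect_target(SEGMENT, uri))
```

With the segment-limited pattern every sample URI maps to the same target, whereas the greedy pattern carries the trailing segments into $2.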

Related

How to rewrite a single word in Nginx many times

I have a URL like
https://ex.com/dir/?word=2022%2F11%2F19&another=2022%2F11%2F19
How can I replace every %2F with - so that it becomes
https://ex.com/dir/?word=2022-11-19&another=2022-11-19
rewrite ^(.*)%2f(.*)$ $1-$2;

nginx rewrite regex matching first group instead of all

I'm working on an nginx rewrite rule to redirect:
/collections/collection-name/products/product-handle-with-dashes
to:
/products/product-handle-with-dashes
I've got it almost working. The only issue I have right now is that the part of my rule that matches the product handle only captures the string before the first hyphen.
My rule:
rewrite ^(/collections/.*)/products/(\w+)\.?.*$ /products/$2 permanent;
With this rule if I hit the following path: /collections/collection-name/products/some-product-handle it will redirect me to /products/some
What am I missing in my regex so that the second capture group matches the entire handle, dashes included?
The \w metacharacter in PCRE/PCRE2 patterns matches characters in the ranges a-z, A-Z, 0-9 and the underscore. It does not match a hyphen. You probably want [-\w] instead. The whole rewrite rule can be the following:
rewrite ^/collections/[^/]+(/products/[-\w]+)(\.\w+)?$ $1 permanent;
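To see why \w+ stops at the first hyphen while [-\w]+ does not, here is a small sketch in Python. It assumes Python's re with the re.ASCII flag is close enough to PCRE for these patterns; the names ORIGINAL and FIXED are illustrative only.

```python
import re

# The rule from the question: \w+ cannot cross a hyphen.
ORIGINAL = re.compile(r"^(/collections/.*)/products/(\w+)\.?.*$", re.ASCII)
# The rule from the answer: hyphen added to the character class.
FIXED = re.compile(r"^/collections/[^/]+(/products/[-\w]+)(\.\w+)?$", re.ASCII)

path = "/collections/collection-name/products/some-product-handle"

original_target = ORIGINAL.match(path).expand(r"/products/\2")
fixed_target = FIXED.match(path).group(1)

print(original_target)
print(fixed_target)
```

The optional (\.\w+)? group in the answer's rule also lets a path with a file extension match, while keeping the extension out of the redirect target.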

Rewrite Rule for 301 redirections structure

I have developed a project where the URL structure has changed from the old website, and I need to create 301 redirects to avoid SEO penalties. After reading a lot, I still can't find how to do these rewrites.
Old URL
/es/madrid/comprar/893134/prop-712/
New URL
/es/property/prop-712/
My idea for an approach
RewriteRule ^/$1/property/$5 /$1/$2/$3/$4/$5/
What I need is to use only the first path segment (/es/) and the last (/prop-712/) as parameters to restructure the URL as /$first/property/$second, removing $2, $3 and $4.
As you will see we share the last param (prop-712) of the URL. Any idea if this is possible?
Try the following at the top of the root .htaccess file using mod_rewrite:
RewriteRule ^([a-z]{2})(?:/[\w-]+){3}/([\w-]+?)/?$ https://example.com/$1/property/$2/ [R=301,L]
This will redirect a URL of the form /<lang>/<one>/<two>/<three>/<prop>/ (trailing slash optional) to https://example.com/<lang>/property/<prop>/. Where <lang> is any two lowercase letter language code and example.com is your canonical hostname. This matches exactly 3 path segments in the middle that are removed.
The regex [\w-]+ matches each path segment, including the property (last path segment). This matches characters in the range a-z, A-Z, 0-9, _ (underscore) and - (hyphen).
Only the first and last path segments are captured and later referenced using the $1 and $2 backreferences respectively. The parenthesised subpattern in the middle (ie. (?:/[\w-]+){3}) that matches the 3 inner path segments is non-capturing (indicated by the (?: prefix).
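The pattern described above can be exercised against sample paths before it goes into .htaccess. This is a rough sketch in Python, assuming Python's re handles this pattern the same way Apache's PCRE does; the function name new_url is hypothetical, and example.com stands in for the canonical hostname as in the rule.

```python
import re

# Same pattern as the RewriteRule; in .htaccess the matched path has no leading slash.
RULE = re.compile(r"^([a-z]{2})(?:/[\w-]+){3}/([\w-]+?)/?$", re.ASCII)


def new_url(url_path):
    """Hypothetical helper: return the redirect target, or None if the rule does not apply."""
    m = RULE.match(url_path)
    if m is None:
        return None
    lang, prop = m.group(1), m.group(2)
    return f"https://example.com/{lang}/property/{prop}/"


print(new_url("es/madrid/comprar/893134/prop-712/"))
print(new_url("es/madrid/comprar/893134/prop-712"))
print(new_url("es/property/prop-712/"))
```

The last call returns None because the new URL has only one segment between the language code and the property, so the redirect cannot loop on its own target.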
You do not need to repeat the RewriteEngine directive, since this already occurs later in the file inside the WordPress code block.
Test first with a 302 (temporary) redirect to prevent potential caching issues. Only change to a 301 (permanent) redirect when you are sure it is working as intended.
A quick look at your "idea":
RewriteRule ^/$1/property/$5 /$1/$2/$3/$4/$5/
In .htaccess files, the URL-path that the RewriteRule pattern matches against does not start with a slash.
Backreferences of the form $n are used to reference capturing groups in the RewriteRule pattern (first argument). Backreferences can only be used in the substitution string (second argument)*1. You can't use backreferences in the RewriteRule pattern itself (which is a standard regex) - the $ carries special meaning here as an end-of-string anchor (regex syntax).
(*1 ...and the TestString (first) argument of any preceding RewriteCond directives, but this does not apply here.)
Reference:
https://httpd.apache.org/docs/2.4/rewrite/intro.html
https://httpd.apache.org/docs/current/mod/mod_rewrite.html#rewriterule

Nginx redirect rule with arabic characters

I need to create an Nginx rule that redirects /ae/ar/أسافر%20من%20أجل/ to /ae/en. I've tried with these two rules:
rewrite (?i)^/ae/ar/أسافر%20من%20أجل(.*)$ http://$host/ae/en? permanent;
rewrite (?i)^/ae/ar/%D8%A3%D8%B3%D8%A7%D9%81%D8%B1%20%D9%85%D9%86%20%D8%A3%D8%AC%D9%84(.*)$ https://$host/ae/en? permanent;
But neither one worked. It looks like the rule is ignored. Any clues on how to make this rule work properly?
Thanks in advance!
The rewrite directive does not see the percent-encoded characters, because nginx matches against the decoded URI. You should be able to leave the Arabic characters as they are (UTF-8 characters within the configuration file) and replace each %20 with a single space. Because the pattern then contains spaces, the entire regular expression should be enclosed in double quotes.
Alternatively, use a location statement:
location = "/ae/ar/أسافر من أجل/" { return 301 /ae/en; }
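The effect of decoding can be illustrated with a short Python sketch. It assumes, as the answer states, that the pattern is compared with the percent-decoded URI; urllib.parse.unquote stands in for nginx's decoding step, and the variable names are illustrative only.

```python
import re
from urllib.parse import unquote

# The request path as it appears on the wire (percent-encoded UTF-8).
encoded_path = (
    "/ae/ar/%D8%A3%D8%B3%D8%A7%D9%81%D8%B1%20%D9%85%D9%86%20%D8%A3%D8%AC%D9%84/"
)
# Stand-in for the decoded URI that the regex is actually tested against.
decoded_path = unquote(encoded_path)

# Pattern written with percent escapes, like the second attempt in the question.
encoded_pattern = re.compile(
    "^" + re.escape(encoded_path.rstrip("/")) + "(.*)$", re.IGNORECASE
)
# Pattern written with the literal UTF-8 characters and real spaces.
decoded_pattern = re.compile(
    "^" + re.escape(decoded_path.rstrip("/")) + "(.*)$", re.IGNORECASE
)

encoded_hit = encoded_pattern.match(decoded_path) is not None
decoded_hit = decoded_pattern.match(decoded_path) is not None
print(encoded_hit, decoded_hit)
```

Only the pattern containing the literal characters and spaces matches, which is why both rules in the question (each of which still contains %20) are never triggered.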

Nginx rewrite with and without trailing slash

I'm trying to create a set of rules that match a url with and without a trailing slash
Most of the answers were pointing me to use something similar to this.
location /node/file/ {
rewrite ^/node/file/(.*)/(.*)$ /php/node-file.php?file=$1&name=$2;
rewrite ^/node/file/(.*)/(.*)/?$ /php/node-file.php?file=$1&name=$2;
}
But this does not match the trailing slash url.
How can I write a rule that matches urls that look like
http://example.com/node/file/abcd/1234/
http://example.com/node/file/abcd/1234
The first rewrite statement includes (.*) as the last capture, which will match any string, including one with a trailing slash.
Use the character class [^/] to match any character except the /:
rewrite ^/node/file/([^/]*)/([^/]*)$ /php/node-file.php?file=$1&name=$2;
rewrite ^/node/file/([^/]*)/([^/]*)/?$ /php/node-file.php?file=$1&name=$2;
Now you will notice that the first rewrite statement is unnecessary, as the second rewrite statement matches URIs both with and without a trailing /.
So all you need is:
rewrite ^/node/file/([^/]*)/([^/]*)/?$ /php/node-file.php?file=$1&name=$2;
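The behaviour of the two capture styles can be compared with a quick Python sketch, assuming Python's re matches these patterns the same way nginx's PCRE does; the helper name split_uri is hypothetical.

```python
import re

# First rule from the question: both captures are greedy (.*).
GREEDY = re.compile(r"^/node/file/(.*)/(.*)$")
# Final rule from the answer: captures exclude "/", trailing slash is optional.
SEGMENTS = re.compile(r"^/node/file/([^/]*)/([^/]*)/?$")


def split_uri(pattern, uri):
    """Hypothetical helper: return the (file, name) captures, or None if there is no match."""
    m = pattern.match(uri)
    return m.groups() if m else None


for uri in ("/node/file/abcd/1234", "/node/file/abcd/1234/"):
    print(uri, split_uri(GREEDY, uri), split_uri(SEGMENTS, uri))
```

With the greedy rule, the trailing-slash URI does match, but the first capture absorbs "abcd/1234" and the second is empty, so the PHP script receives the wrong parameters. The [^/]* version yields the same two values for both URIs.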

Resources