Canonical URLs with Nginx - nginx

We're working on removing the directory index files from our URLs to clean things up and provide more consistency to improve our SEO.
However, I'm not familiar with how to take care of this in Nginx.
I found the following for Apache (we're just looking for the Nginx equivalent)
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php\ HTTP/
RewriteRule ^(([^/]+/)*)index\.php$ http://www.%{HTTP_HOST}/ [R=301,NS,L]
I've read the docs and tried several different options - the closest I can get will still return an infinite loop error.

The snippet you posted for Apache uses the immutable global variable %{THE_REQUEST} to determine the original URI requested by the client. However, this variable contains the entire request, including the HTTP method, version, and query string. Therefore, parsing this variable is a bit messy, as seen in the example you posted.
However, nginx has a dedicated variable that holds the original request URI received from the client: $request_uri. This allows you to do the following:
## REDIRECT foo/index(.html) to foo/
if ($request_uri ~ ^(.*/)index(?:\.html)?$) {
return 301 $1;
}
If you wanted to also strip the file suffix, e.g. .html, you could use the following snippet:
## REDIRECT foo/bar.html to foo/bar
if ($request_uri ~ ^(.+)\.html$) {
return 301 $1;
}
Now, in order for nginx to still be able to serve the correct file, one uses the try_files directive, which checks all given URIs in sequence until one matches:
## Rewrite internal requests for foo/bar to foo/bar.html
try_files $uri $uri.html =404;
So a request for /foo/bar would be handled as follows:
return $uri = /foo/bar, if that file exists in the document root, otherwise
return $uri.html = /foo/bar.html if it exists, finally
issue a 404 error.
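Putting the pieces of this answer together, a minimal sketch of a complete server block might look like the following. The server_name and root values are placeholders that do not come from the question, and the index directive is an assumption added so that a request for foo/ is served from foo/index.html.

```nginx
server {
    listen 80;
    server_name example.com;       # assumption: placeholder host name
    root /var/www/example;         # assumption: placeholder document root
    index index.html;              # assumption: directory index file name

    ## REDIRECT foo/index(.html) to foo/
    ## $request_uri holds the original client request, so internal
    ## redirects (for example those done by "index") never match this rule.
    if ($request_uri ~ ^(.*/)index(?:\.html)?$) {
        return 301 $1;
    }

    ## REDIRECT foo/bar.html to foo/bar
    if ($request_uri ~ ^(.+)\.html$) {
        return 301 $1;
    }

    location / {
        ## Serve foo/bar from foo/bar.html when no exact match exists
        try_files $uri $uri.html =404;
    }
}
```

Because $request_uri also contains the query string, both patterns as written only match requests without one; a request such as /foo/index.html?x=1 passes through without being redirected.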

Related

Nginx remove special character

My URL is: https://example.com/data=123456
I added the following line of code to the index.php file: var_dump($_GET['data']);
But this only works if I add a ? character to the URL.
(example: https://example.com/?data=123456)
This is in the root of index.php.
I tried to add this to the nginx.conf file:
location / {
try_files $uri $uri/ /index.php$is_args$args;
rewrite ^(.*)\?(.*)$ $1$2 last;
}
location ~ \.php$ {
fastcgi_pass 127.0.0.1:9000;
rewrite ^(.*)\?(.*)$ $1$2 last;
include fastcgi.conf;
}
But it is not working. How do I remove the "?" character from the URL?
When your URL is https://example.com/data=123456, the data=123456 is part of the URL path. When your URL is https://example.com/?data=123456, the data=123456 is the query part of the URL, where the query argument data has a value of 123456. Check at least the Wikipedia article.
Next, only query arguments can be accessed via the $_GET array. The full request URI (including both path and query parts) is available via $_SERVER['REQUEST_URI']. You can analyze that variable in your PHP script and skip the rewrite part completely.
If you want to rewrite your URI and make that data accessible via the $_GET['data'] variable, that is also possible. However, your main mistake is that the rewrite directive works only with the (normalized) path of the given URL and not with its query part. Nevertheless, it can be used to append (or replace) query arguments. To do so, you can use
rewrite ^(.*/)(data=[^/]*)$ $1?$2;
(to rewrite only the data parameter) or the more generic
rewrite ^(.*/)([^/=]+=[^/]*)$ $1?$2;
(to match any param=value form of the URL path suffix). You can place this rewrite rule at the server configuration level.
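As a rough sketch of where the rule could sit, the following server block combines it with the location blocks from the question. The root path and the index directive are assumptions, not taken from the question, and TLS configuration is left out.

```nginx
server {
    listen 80;                     # assumption: TLS configuration omitted for brevity
    server_name example.com;       # host taken from the question's URL
    root /var/www/example;         # assumption: placeholder document root
    index index.php;               # assumption: lets "/" resolve to /index.php

    # Server-level rewrite: /data=123456 becomes / with query string data=123456.
    # Any query string already present on the request is appended after it.
    rewrite ^(.*/)([^/=]+=[^/]*)$ $1?$2;

    location / {
        try_files $uri $uri/ /index.php$is_args$args;
    }

    location ~ \.php$ {
        fastcgi_pass 127.0.0.1:9000;
        include fastcgi.conf;
    }
}
```

With this in place, the path suffix ends up in the query string before a location is chosen, so the PHP script should see it in $_GET['data'] without any "?" in the requested URL.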

nginx - Redirect to urls without extension and add slash at the end

Using nginx, I would like to redirect all my static HTML URLs (mydomain.com/index.html; mydomain.com/contact.html; mydomain.com/about.html) to the same URLs without the extension but with a slash at the end.
The resulting URLs should look like this:
mydomain.com/index/
mydomain.com/contact/
mydomain.com/about/
....
How can I achieve that? Thanks.
There are two parts to this question. The first is how to make the server return contact.html in response to the URI /contact/.
There are a number of ways to achieve this. For example:
location / {
    try_files $uri @rewrite;
}
location @rewrite {
    rewrite ^(/.+)/$ $1.html last;
    return 404;
}
Any URI ending with a / will be internally mapped to the same URI with the / replaced by .html.
The second part: if people are still accessing the .html URIs using the old scheme, you may want to externally redirect them to the new scheme.
The best way to achieve this without creating a redirection loop is to check the original request stored in $request_uri. For example:
if ($request_uri ~* "\.html(\?|$)") {
    rewrite ^(.*)\.html $1/ permanent;
}
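To show how the two parts fit together, here is a minimal sketch of a single server block. The root path is a placeholder and not taken from the question; named locations are written with a leading @.

```nginx
server {
    listen 80;
    server_name mydomain.com;      # host taken from the question
    root /var/www/mydomain;        # assumption: placeholder document root

    # Part two: externally redirect old /contact.html requests to /contact/.
    # Only the client's original request is tested, so the internal
    # rewrite to .html in @rewrite cannot cause a loop.
    if ($request_uri ~* "\.html(\?|$)") {
        rewrite ^(.*)\.html $1/ permanent;
    }

    # Part one: serve real files directly, otherwise map /contact/ to /contact.html.
    location / {
        try_files $uri @rewrite;
    }

    location @rewrite {
        rewrite ^(/.+)/$ $1.html last;
        return 404;
    }
}
```

A request for /contact.html is answered with a 301 to /contact/, and the follow-up request for /contact/ is served from contact.html without the client ever seeing the extension.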

NGINX conditional Rewrite extensionless URLS

I am trying to provide extensionless URLs for a client. The system's URLs will be generated without the extension in the navigation elements and links, so I will have links that look like this:
www.somesite.com
www.somesite.com/foo
www.somesite.com/foo/bar
www.somesite.com/bar/foo/barfoo
Let's pretend for a moment that the calls will either be routed to a proxy that can handle a defined file extension or simply serve the HTML page if it exists. If the URL is correctly rewritten, then I would think a location block with a matching regex for the extension should work.
So behind the scenes we have:
www.somesite.com/index.abc
www.somesite.com/foo.def
www.somesite.com/foo/bar.abc
www.somesite.com/bar/foo/barfoo.def
...
With Apache .htaccess I can solve this problem by first testing for the existence of the page with the desired file type.
RewriteCond %{REQUEST_FILENAME}.abc -f
RewriteRule ^(.*)$ $1.abc [L]
I would also make sure that directory browsing is off and that trailing slashes would be removed
#ensure trailing slash is removed
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^(.*)(?:/)$ $1 [R=301,L]
All well and good for Apache, and for me relatively intuitive, but this is NGINX and quite frankly I have no idea how to solve this use case.
All the similar use cases I have found deal with html & php (How to remove both .php and .html extensions from url using NGINX?) and simply use try_files until they fall through to a named location that rewrites the URI with a .php extension. This works if one is only dealing with a single dynamic language and fails miserably if we have two dynamic languages.
So the question is: how do I do something in NGINX similar to what can be done with the .htaccess condition/rewrite above?
any help would be appreciated!
Cheers
Gary
UPDATE:
I have been able to get it "mostly" working by using the standard php approach. The issue is www.somesite.com is being directed to www.somesite.com/.php instead of serving the default document. Trailing slashes are also being removed correctly.
so to recap:
www.somesite.com - not working - www.somesite.com/.php
www.somesite.com/foo - working
www.somesite.com/foo/bar - working
www.somesite.com/bar/foo/barfoo - working
Here is my config:
location / {
    index index.php index.html;
    autoindex off;
    #remove trailing slash
    rewrite ^/(.*)(?:/)$ /$1 permanent;
    #try html files or route to named location
    try_files $uri $uri.html @php;
}
location ~ \.php$ {
    ...
}
location @php {
    rewrite ^(.*)$ $1.php last;
}
In other posts the try_files line looks like this: try_files $uri $uri.html $uri/ @php; The problem is that if I add $uri/, it works for the default document (e.g. it serves www.somesite.com), but all other URLs like www.somesite.com/foo/ or www.somesite.com/foo/bar/, which are also directories and have files of the same name, are redirected endlessly instead of serving their respective pages.
Assuming that you have three ways to process an extensionless URI, you would need three location blocks. You can use try_files to test for the existence of a file by type, and cascade to the next location block in the chain, if the file is not found. See this document for details.
For example:
root /path/to/document/root;
location / {
    try_files $uri $uri.html $uri/ @php;
}
location @php {
    try_files $uri.php @proxy;
    fastcgi_pass ...;
    ...
}
location @proxy {
    proxy_pass ...;
}
The first block processes normal files within the document root. You probably have a location ~ \.php$ block to process PHP files, and the second block is essentially a replica. The third block sends everything else upstream.
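For illustration only, the chain could be filled in roughly as follows. The FastCGI and proxy addresses are hypothetical stand-ins for the parts the answer leaves open, and the fastcgi_param line assumes the usual PHP-FPM setup where the script path is derived from the URI that try_files selected.

```nginx
root /path/to/document/root;

location / {
    # static files, .html files, then directories; otherwise try PHP
    try_files $uri $uri.html $uri/ @php;
}

location @php {
    # foo -> foo.php if that script exists, otherwise hand off to the proxy
    try_files $uri.php @proxy;
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass 127.0.0.1:9000;        # assumption: hypothetical PHP-FPM address
}

location @proxy {
    proxy_set_header Host $host;
    proxy_pass http://127.0.0.1:8080;   # assumption: hypothetical upstream
}
```

Note that proxy_pass inside a named location must not carry a URI part, which is why the hypothetical upstream is given as scheme, host and port only.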

double try_files directive for the same location in nginx

I discovered an nginx config snippet in several gists and config examples (mostly for PHP apps):
#site root is redirected to the app boot script
location = / {
    try_files @site @site;
}
#all other locations try other files first and go to our front controller if none of them exists
location / {
    try_files $uri $uri/ @site;
}
But I just do not get it: Does the first try_files directive ever match? To me this looks like some nonsense hacking.
Please confirm or explain why not - thanks :)
This is what happens here:
The first location = / is only used when the path in the request is /, e.g. http://example.com/. The second location / is used for all other URLs, e.g. http://example.com/foo or http://example.com/bar.
The reason for the first location is to avoid any interference from index-directives that do a redirect to index.html or something similar.
Inside the first location, the try_files directive first looks for a file named @site, which does not exist, and then redirects to the named location @site. The reason for this construct is that the redirect to the named location is purely internal, i.e. the @site location cannot be accessed directly from the client, and the $uri is kept unmodified during this redirect (which would not be the case for other redirects). The first parameter @site can be anything except a real existing file. I prefer to call it DUMMY for clarity.
The second location tries static files first and, if not found, then also redirects to the named location.
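A minimal sketch of the whole construct, including a possible definition of the named location, could look like this. The root path, the index.php boot script name and the PHP-FPM address are assumptions, since the question's snippet does not show them; named locations are written with a leading @.

```nginx
root /var/www/app;                 # assumption: placeholder document root

# Site root: the first parameter never exists as a file,
# so the request always falls through to the front controller.
location = / {
    try_files DUMMY @site;
}

# Everything else: static files and directories first, then the front controller.
location / {
    try_files $uri $uri/ @site;
}

# Named location: reachable only through internal redirects.
location @site {
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root/index.php;  # assumption: hypothetical boot script
    fastcgi_pass 127.0.0.1:9000;                             # assumption: hypothetical PHP-FPM address
}
```

Since $uri is left untouched by the jump into the named location, the front controller still sees the path the client asked for.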

Rewrite all params requests to index.php with Nginx

I have a URL like this:
http://www.example.com/index.php?page=profile&id=324&opt=edit&cat=23&...
I wish I had
http://www.example.com/profile/324/edit/23...
I have read several tutorials on how to remove the .php extension, but I do not know how to pass the other parameters.
Thanks to everyone
UPDATE
Solved with a PHP solution.
nginx:
location / {
try_files $uri $uri/ /index.php;
}
PHP:
$params = explode("/",$_SERVER['REQUEST_URI']);
and use them
Actually it's pretty easy if those params are always in the same order as in your example.
Add a rewrite rule in your nginx config.
rewrite ^/index.php?page=(.*)&id=(.*)&opt=(.*)&cat=(.*)(&.*)$ /$1/$2/$3/$4;
But if the order of params changes in every request, we may need another solution.
UPDATE
nginx stores request args in variables like $arg_name, where the part after $arg_ is the actual request parameter name. So you just need to rewrite requests to index.php with a trailing question mark to the URL style you want.
rewrite ^/index.php? /$arg_page/$arg_id/$arg_opt/$arg_cat;
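For the direction asked about in the question (a request for /profile/324/edit/23 served by index.php), a minimal sketch follows. It relies on the fact that rewrite matches only the path, never the query string, so the path segments are captured and the query string is built in the replacement. It assumes exactly four path segments in the order page, id, opt, cat; the further parameters elided in the question are not handled, and the named location @pretty and the PHP-FPM address are hypothetical.

```nginx
location / {
    # real files and directories win; everything else is treated as a pretty URL
    try_files $uri $uri/ @pretty;
}

location @pretty {
    # /profile/324/edit/23 -> /index.php?page=profile&id=324&opt=edit&cat=23
    rewrite ^/([^/]+)/([^/]+)/([^/]+)/([^/]+)/?$ /index.php?page=$1&id=$2&opt=$3&cat=$4 last;
    return 404;
}

location ~ \.php$ {
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass 127.0.0.1:9000;   # assumption: hypothetical PHP-FPM address
}
```

With this mapping the script can keep reading $_GET['page'], $_GET['id'] and so on, while paths with a different number of segments end in a 404.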
