Set fastcgi_cache_key using original $request_uri without $args - wordpress

I am trying to integrate a WordPress plugin (Jetpack's Related Posts module) which adds query strings to the end of post URLs. I would like to cache URLs with FastCGI while completely ignoring the query strings/$args.
My current config is: fastcgi_cache_key "$scheme$request_method$host$request_uri";
I am aware of using the solution mentioned here to turn my $skip_cache variable off for URLs containing a certain $arg, which works. However, I want to cache the same result regardless of the value of the $args rather than using a unique cache key for each set of $args.
I am also aware of some suggestions to just use $uri in the fastcgi_cache_key rather than $request_uri; however, because $uri is not just the original requested URI minus the $args, something in the WordPress architecture (likely pretty links) forces all requested URIs to return the same cache result (rather than a different result for each page).
Is there any way to truly use the originally requested URI without including the $args in the cache key?

I just ran into a similar problem.
Here is my solution.
In the nginx config, add the following to the http block:
http {
...
map $request_uri $request_path {
~(?<captured_path>[^?]*) $captured_path;
}
...
}
This gives you a variable, $request_path, that contains $request_uri without the query string.
Use $request_path in the cache key:
fastcgi_cache_key "$scheme$request_method$host$request_path";
Important: the map directive is only valid in the http context, so $request_path is defined for every server block. Its value is computed only for requests that actually use the variable.
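To show where the pieces go, here is a minimal sketch of a complete layout. The zone name WORDPRESS, the cache path, the socket path, the server name and the 10-minute validity are assumptions for illustration; the map and the cache key are the ones from the answer.

```nginx
http {
    # Strip the query string from the original request URI.
    map $request_uri $request_path {
        ~(?<captured_path>[^?]*) $captured_path;
    }

    # Hypothetical cache zone; path, name and sizes are placeholders.
    fastcgi_cache_path /var/cache/nginx/fastcgi levels=1:2
                       keys_zone=WORDPRESS:100m inactive=60m;

    server {
        listen 80;
        server_name example.com;  # placeholder

        location ~ \.php$ {
            include fastcgi_params;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
            fastcgi_pass unix:/run/php/php-fpm.sock;  # placeholder socket

            fastcgi_cache WORDPRESS;
            fastcgi_cache_valid 200 10m;
            # /some-post/ and /some-post/?foo=1 now share one cache entry.
            fastcgi_cache_key "$scheme$request_method$host$request_path";
        }
    }
}
```

With this key, whichever response is cached first is served for every query string on that path, which is what the question asks for.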

Related

Fastest redirect on nginx for many urls?

I'm new to nginx.
We have migrated a site and need to redirect about 800 URLs to the new site.
Would it be faster for nginx to do this through maps, through return, or some other method?
It's not exactly true that map is very fast. It is known that (quoting docs):
Since variables are evaluated only when they are used, the mere declaration even of a large number of “map” variables does not add any extra costs to request processing.
However, in your case, you need a single variable and check it against a lot of strings. Sure it may be still fast, but it won't scale well with the more strings you have to compare. Moreover, although quite negligible, those strings have to be stored in memory.
So instead of potentially slowing down every single request, you might be better off coding the redirects in your app's language.
server {
# ...
rewrite ^/article/(\d+)/.* /old-format.php?article_id=$1;
location = /old-format.php {
internal;
fastcgi_pass unix:/your/php/socket;
# all the fastcgi directives as usual
}
# ...
}
Inside /old-format.php, check $_REQUEST['article_id'] and emit the appropriate redirect.
Why this is better:
Pages that conform to the new URL format won't have their request URI evaluated against 800 URLs. They only incur a single regular expression mismatch, which is much faster than the given map, especially if you put the pcre_jit on; directive at the top of your nginx.conf file.
You can (likely) write the redirect logic in your PHP file in a few lines of code (fetching the relation between article and category IDs from the database). If access speed for old URLs is a concern, you can apply FastCGI caching to the old-format.php location with something like fastcgi_cache_valid 301 24h; (and define a FastCGI cache zone in the http context).
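The caching idea in the last point could be sketched as follows. The zone name REDIRECTS, the cache path and the cache key are assumptions for illustration; only fastcgi_cache_valid 301 24h; comes from the answer.

```nginx
http {
    # Hypothetical cache zone for the redirect script.
    fastcgi_cache_path /var/cache/nginx/redirects keys_zone=REDIRECTS:10m;

    server {
        # ...
        rewrite ^/article/(\d+)/.* /old-format.php?article_id=$1;

        location = /old-format.php {
            internal;
            include fastcgi_params;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
            fastcgi_pass unix:/your/php/socket;

            fastcgi_cache REDIRECTS;
            # After the rewrite, $args carries article_id, so each
            # article gets its own cache entry.
            fastcgi_cache_key "$request_method$host$uri?$args";
            # Cache the 301 responses emitted by the script for a day.
            fastcgi_cache_valid 301 24h;
        }
    }
}
```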
In this case you will need to use map. If you get an error about map_hash_bucket_size being too small, increase it inside the http block, like this: map_hash_bucket_size 256;
First create a file /etc/nginx/header-maps.conf like this:
map $request_uri $redirect_uri {
/path-1 /path-1-new;
/path-2 /path-2-new;
/path-3 /path-3-new;
}
Include that file from the http block (map is only allowed there), then add this to nginx.conf or your domain-specific config file:
server {
# other things here...
# if there is a $redirect_uri set for the current URI:
if ($redirect_uri) {
return 301 $redirect_uri;
}
# other things here...
}
map is very fast: exact-string entries like these are looked up in a hash table rather than compared one by one, so it should not affect performance much.
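Putting the two files together, the overall layout might look like the sketch below. The server name is a placeholder, and the include line is an assumption about how header-maps.conf gets loaded, since the answer does not show that step.

```nginx
http {
    # Only needed if nginx complains that the bucket size is too small.
    map_hash_bucket_size 256;

    # Defines $redirect_uri; map blocks must live in the http context.
    include /etc/nginx/header-maps.conf;

    server {
        listen 80;
        server_name old.example.com;  # placeholder

        # $redirect_uri is empty unless the map has an entry for this URI.
        # Note: $request_uri includes the query string, so /path-1?x=1
        # does not match the /path-1 entry.
        if ($redirect_uri) {
            return 301 $redirect_uri;
        }
    }
}
```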

Nginx - 301 redirect for pages with specific GET param to the same page without get params

I have:
many pages that search engines indexed with a junk GET parameter such as _escaped_fragment_ (for more about escaped fragments, see the Yandex manual page)
nginx as a reverse proxy in front of many different frontend apps
So I need a 301 redirect from every page requested with such a GET parameter to the same page without any GET parameters. For example,
example.com/some/long/path?_escaped_fragment_=
should be 301-redirected to
example.com/some/long/path
I can do this by adding the logic to each frontend app, or I can do it in the nginx configuration. I prefer the second option.
A potential solution could include:
using http rewrite module
using if in rewrite to check GET-params
using map module to define request uri path without args from $request_uri
I fixed my initial issue as follows.
Add the code below to the http context to define the request URI path without args from $request_uri (this uses the map module):
map $request_uri $request_uri_path {
"~^(?P<path>[^?]*)(\?.*)?$" $path;
}
Add the code below to the server context to issue a 301 redirect from pages with _escaped_fragment_ in the GET parameters to the same pages without any GET parameters (this uses the rewrite module):
if ($args ~* "_escaped_fragment_") {
rewrite ^ $scheme://$host$request_uri_path? permanent;
}
And this works for me.

Share lua data between blocks?

I'm trying to use Lua to serve 301 redirects directly from nginx instead of having to go through PHP or something else.
I'm inspired by this article here:
http://www.agileweboperations.com/supporting-millions-of-pretty-url-rewrites-in-nginx-with-lua-and-redis
The idea is that I can save a list of redirects directly to redis, then match and serve them with lua directly within nginx to increase performance.
Since the backend project is Symfony, I have to find a way to tweak the code a bit to suit my need, below is what I have:
Here I try to match the usual mysite.com/this/that requests. I include the Lua script first to handle the redirection; if nothing matches, I let nginx fall through to try_files:
location / {
include /etc/nginx/include.d/lua_st_redis_rewrites.lua;
try_files $uri @rewriteapp;
}
location @rewriteapp {
rewrite ^(.*)$ /app.php/$1 last;
}
Since I want to support the dev environment, I have to handle URLs like this as well: mysite.com/app_dev.php/this/that
These URLs will not match the location / block, so I have to include the Lua script here again. The problem is that requests for mysite.com/this/that will now run the Lua script twice.
My idea is to initialise a true/false flag in the first call and then check it in the second call to see whether the script has already run. At this stage, however, I'm quite confused about variable scope:
# pass the PHP scripts to FastCGI server from upstream phpfcgi
location ~ ^/(app|app_dev)\.php(/|$) {
# # Setup var defaults
# set $no_cache "";
include /etc/nginx/include.d/lua_st_redis_rewrites.lua;
# some more usual code for symfony here
}
Should I use global variables in this case to share data between the two blocks of Lua code? I've read that the use of global variables is strongly discouraged.
If I include the Lua script twice, can I safely assume that variables declared in the script will be re-declared every time it is called?
Thank you. I'm completely new to this, so please forgive my obvious question.
As far as I know, you can't include Lua files directly like that. Assuming you are using OpenResty, you need to use the relevant *_by_lua phase to process the request, in this case rewrite_by_lua.
Different blocks can't access each other's globals, but you can use the ngx.ctx table, which sticks around for the duration of the request.
There's a handy diagram of the openresty phases here.
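A minimal sketch of the ngx.ctx approach, assuming OpenResty (lua-nginx-module) is available. The field name redirect_checked and the X-Redirect-Checked header are hypothetical and only there to make the shared value visible. One caveat from the lua-nginx-module documentation: an internal redirect starts with an empty ngx.ctx table, so a flag set in location / may not survive the try_files fallback into the app.php location; an nginx variable read through ngx.var is a possible alternative in that case.

```nginx
location ~ ^/(app|app_dev)\.php(/|$) {
    rewrite_by_lua_block {
        -- Lua 'local' variables are created afresh on every invocation;
        -- ngx.ctx is a per-request table shared across phases.
        -- (The Redis lookup and ngx.redirect() call would go here.)
        ngx.ctx.redirect_checked = true
    }

    header_filter_by_lua_block {
        -- A later phase of the same request sees the value set above.
        if ngx.ctx.redirect_checked then
            ngx.header["X-Redirect-Checked"] = "1"
        end
    }

    # some more usual code for symfony here
}
```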

Fix duplicate query string in nginx

I've got a strange case where my app is now for some reason appending a duplicate query string onto a URL that already has one. I don't have the access or the time to dig into the application code right now, but I do have access to Nginx configs.
What I need is a rewrite rule that will ignore the duplicate query string, ie
http://foo.com?test=bar?test=bar will redirect to http://foo.com?test=bar
Is this even possible? Thank you.
Here, add this inside your server block, before any other location or if:
if ($query_string ~ ^([^?]+)\?\1$) {
set $clean $1;
rewrite . $uri?$clean?;
}
It's quite tricky to get right, because there are more than a few obscure behaviours:
$query_string is the query string without the initial ?
the \1 in the regexp matches a second copy of the first group (...)
you need to save $1 into a new variable because rewrite will overwrite it
you need to use $uri and not $request_uri in the rewrite, because the latter contains a copy of the original query string
you need to add a trailing ? to the rewrite, otherwise the rewrite would append a copy of the original query string! (that's probably the most obscure bit of Nginx syntax, ever)
The simple form of rewrite (without the redirect flag) should be enough for most applications, plus it doesn't require an additional round-trip to the browser.
If it does not work for you, you can try passing the $clean query string to the appropriate backend directive (such as fastcgi_param QUERY_STRING) or, as a last resort, issuing a browser redirect using the redirect rewrite flag.
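The last-resort browser redirect could be sketched like this. It reuses the regex and the $clean variable from the snippet above; the only change is the redirect flag, which makes nginx answer with a 302 instead of rewriting internally.

```nginx
server {
    # ...
    # Matches a query string of the form X?X, e.g. test=bar?test=bar.
    if ($query_string ~ ^([^?]+)\?\1$) {
        # Save the capture first: rewrite's own regex resets $1.
        set $clean $1;
        # The trailing ? stops nginx from appending the original args.
        rewrite ^ $uri?$clean? redirect;
    }
    # ...
}
```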

proxy_set_header not working as expected

I'm trying to have nginx proxy various applications on subpaths of the same domain.
My problem is that the links generated by the application use / as their root instead of their subdir.
My configuration is :
location /wiki/ {
proxy_pass http://localhost:4567/;
proxy_set_header SCRIPT_NAME /wiki;
}
I believe proxy_set_header SCRIPT_NAME /wiki; should set the header SCRIPT_NAME, which is used by the application to generate links, but instead HTTP_SCRIPT_NAME is set, which is ignored by the application.
How can I set SCRIPT_NAME so that my links are generated correctly ?
As per the CGI specification, HTTP headers are available with the HTTP_ prefix:
Meta-variables with names beginning with "HTTP_" contain values read
from the client request header fields, if the protocol used is HTTP.
The HTTP header field name is converted to upper case, has all
occurrences of "-" replaced with "_" and has "HTTP_" prepended to
give the meta-variable name.
That is, a header named Some-Header will be seen as HTTP_SOME_HEADER in your application. So everything works as expected: you added an HTTP header, and it became available with the HTTP_ prefix.
The SCRIPT_NAME variable is special and isn't set by any header. Instead, it's constructed from the URI by the code that runs your application. To change it, you have to actually change the URI seen by your backend, i.e. you need
proxy_pass http://localhost:4567/wiki/;
Or simply leave the URI out of proxy_pass, since the request URI already starts with /wiki/ inside location /wiki/, i.e.
location /wiki/ {
proxy_pass http://localhost:4567;
}
The catch is that you probably changed the URI from /wiki/ to / for a reason, i.e. your backend application expects /. There are several possible solutions to this problem:
Actually move the application to /wiki/. Usually this is simple to do.
Change your app to accept its base URL, used for generating links etc., through some out-of-band method. Many applications already support this via a configuration option.
Have nginx itself rewrite what your application returns. There are several nginx directives for this, in particular proxy_redirect, proxy_cookie_path, and the sub filter (sub_filter). It's the most fragile method, though, and not recommended unless you know what your application returns and exactly what has to be replaced.
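For the third option, a rough sketch might look like the following. It assumes nginx was built with the sub module (--with-http_sub_module) and is recent enough to accept several sub_filter directives in one location. The href/src patterns are guesses about the application's HTML and will almost certainly need adjusting.

```nginx
location /wiki/ {
    # The backend still sees / as its root.
    proxy_pass http://localhost:4567/;

    # Rewrite relative Location headers and cookie paths issued
    # by the backend so they point back under /wiki/.
    proxy_redirect / /wiki/;
    proxy_cookie_path / /wiki/;

    # Ask the backend for uncompressed responses so sub_filter can
    # see the HTML.
    proxy_set_header Accept-Encoding "";

    # Rewrite root-relative links in HTML responses (text/html is
    # the default type handled by sub_filter).
    sub_filter 'href="/' 'href="/wiki/';
    sub_filter 'src="/'  'src="/wiki/';
    sub_filter_once off;
}
```

Links built by JavaScript or embedded in other content types are not covered, which is part of why this approach is the most fragile.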
