Disable logging for specific useragent in nginx conf - nginx

I want to disable logging for a specific useragent. This is a part of my current conf-file.
if ($http_user_agent ~ (bingbot|AhrefsBot|DotBot|Exabot|Baiduspider|SemrushBot) ) {
return 403;
}
I've tried adding access_log off; but get the following error:
nginx: [emerg] "access_log" directive is not allowed here
I'm assuming this is because I only have a server block. I need a location block also. I've tried the following code:
location / {
if ($http_user_agent ~ (bingbot|AhrefsBot|DotBot|Exabot|Baiduspider|SemrushBot) ) {
return 403;
}
}
But I get the following error:
duplicate location "/"
In my conf-file I already have this code:
location / {
try_files $uri $uri/ =404;
}
Can I combine the two location snippets into one? Or how do I proceed?

As your question indicates, the access_log directive cannot be used within an if block unless that block is enclosed within a location. However, the access_log directive accepts an if=condition parameter, which can be controlled by a map. The access_log section of the nginx manual ends with an example of this.
For example:
map $http_user_agent $goodagent {
default 1;
~(bingbot|AhrefsBot|DotBot|Exabot|Baiduspider|SemrushBot) 0;
}
server {
access_log ... if=$goodagent;
if ($goodagent = 0) { return 403; }
...
}
The map directive must be placed outside of the server block. The access_log statement can be placed inside or outside the server block depending on whether it applies to all server blocks or just one.

At the http level declare a map like so.
map $http_user_agent $ignore_status_checks {
default 0;
"~Pingdom.*" 1;
"~*\(StatusCake\)" 1;
"~*mod_pagespeed*" 1;
"~*NodePing*" 1;
}
Then in your server's location block add:
if ($ignore_status_checks) {
access_log off;
}
This will turn off the access_log for any user agent that maps to 1. Of course, you can do whatever you want inside the if.
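If the if block is awkward to combine with other directives in the same location, the same map can feed the if= parameter of access_log instead. The following is a minimal sketch, assuming the $ignore_status_checks map above is already defined. The $log_request variable and the log path are illustrative names, not part of the original answer. Because if= skips logging when the condition evaluates to "0" or an empty string, the flag has to be inverted first:

```nginx
# http context: invert the flag, because if= logs only when the value
# is non-empty and not "0"
map $ignore_status_checks $log_request {
    default 1;
    1       0;
}

server {
    # hypothetical log path; requests from status-check agents are skipped
    access_log /var/log/nginx/access.log combined if=$log_request;
}
```

The format name (combined here) has to be given before if=, since nginx reads the second argument of access_log as the log format.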

Related

nginx logs for location

Need some help in setting up nginx logs so that they are not duplicated.
My configuration is as follows. I would like all requests for, say, http://example.com/app to be logged to app.access.log and requests for the rest of the site to be logged to main.access.log.
However, the following configuration writes app requests to both app.access.log and main.access.log.
server {
access_log /var/log/nginx/main.access.log;
location /app {
access_log /var/log/nginx/app.access.log;
}
}
Any idea how to fix this?
You could use a negated regexp to catch all requests NOT directed to app, and define the access_log directive there. Then define the other location for app:
location ~ ^((?!app).)*$ {
access_log /var/log/nginx/not-an-app.access.log;
}
location /app {
access_log /var/log/nginx/app.access.log;
}
I think it's a bit of a stretch though, and I would test the hell out of this before putting it in production.
The access_log directive includes an if=condition which can be used to control logging.
For example:
map $request_uri $loggable {
~^/app 0;
default 1;
}
server {
access_log /var/log/nginx/main.access.log combined if=$loggable;
...
}
See this document for details.
The alternative is to log everything together and split it into two separate files later using grep.
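The if= approach can also be extended so that both files are written from the server level, each with its own condition. This is only a sketch under stated assumptions: the $is_app and $is_main variable names are made up for illustration, and the paths are taken from the question.

```nginx
# http context
map $request_uri $is_app {
    default 0;
    ~^/app  1;
}

# inverse flag for everything that is not under /app
map $is_app $is_main {
    default 1;
    1       0;
}

server {
    # several access_log directives may sit on the same level;
    # each request is written only where its condition is true
    access_log /var/log/nginx/app.access.log  combined if=$is_app;
    access_log /var/log/nginx/main.access.log combined if=$is_main;
}
```

Note that an access_log directive inside a location replaces the ones inherited from the server level, so with this layout the location blocks should not declare their own logs.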
Inspired by @Andrea's solution, you could also use this pattern:
server {
location / {
access_log /var/log/nginx/main.access.log;
location /foo { ... }
location /bar { ... }
...
}
location /app {
access_log /var/log/nginx/app.access.log;
}
}
So there are just two top-level location blocks, and all other location blocks are nested within the default location / block.

Conditional url rewrite and proxy

I have a JavaScript app, and for social sharing purposes I would like to return the result from another server when the Facebook crawler visits the page.
Basically, if I detect a Social bot, I would like to do:
GET https://example.net/shared/route/123 -> GET https://rendered.example.net/robots/shared/route/123
I am failing to complete this, as I cannot use proxy_pass or try_files inside an if statement.
What I've tried:
location /shared/route/ {
if ($http_user_agent ~ (?!(facebookexternalhit|Facebot|Twitterbot|Pinterest|Google.*snippet))) {
try_files $uri $uri/ /index.html =404;
}
rewrite /shared/(.*) /robots/$1 break;
proxy_pass https://rendered.example.net/;
}
But I receive the error:
"try_files" directive is not allowed here in /etc/nginx/sites-enabled/webapp:22
And I've also tried
location /shared/route/ {
if ($http_user_agent ~ (?!(facebookexternalhit|Facebot|Twitterbot|Pinterest|Google.*snippet))) {
rewrite /shared(.*) /robots/$1 break;
proxy_pass https://rendered.example.net/;
}
}
But then I receive:
"proxy_pass" cannot have URI part in location given by regular expression, or inside named location, or inside "if" statement, or inside "limit_except" block in /etc/nginx/sites-enabled/app:24
How can I solve this?
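One possible direction, sketched here rather than tested against this setup, is to move the user-agent test into a map and hand bot requests to a named location through error_page, so that neither try_files nor proxy_pass has to live inside an if. The $is_social_bot variable and the @social_bot location name are illustrative assumptions. The rewrite prefixes the whole URI with /robots to match the mapping described in the question.

```nginx
# http context: 1 for the crawlers listed in the question, 0 otherwise
map $http_user_agent $is_social_bot {
    default 0;
    "~*(facebookexternalhit|Facebot|Twitterbot|Pinterest|Google.*snippet)" 1;
}

server {
    location /shared/route/ {
        # 418 is used only as an internal signal to jump to the named location
        error_page 418 = @social_bot;
        if ($is_social_bot) {
            return 418;
        }
        # regular visitors get the JavaScript app
        try_files $uri $uri/ /index.html;
    }

    location @social_bot {
        # /shared/route/123 -> /robots/shared/route/123
        rewrite ^(.*)$ /robots$1 break;
        # no URI part here: proxy_pass in a named location must not have one
        proxy_pass https://rendered.example.net;
    }
}
```

Only return is used inside the if, which is one of the directives generally considered safe there.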

How to disable logging images in nginx but still allow the get request?

I'm trying to log only requests for JavaScript files in the nginx access_log.
I tried using the following code I found on this site:
location ~* ^.+.(jpg|jpeg|gif|css|png|html|htm|ico|xml|svg)$ {
access_log off;
}
The problem is that it doesn't allow the GET request at all, and I get a 404 error when loading the HTML file that runs the JS file in the browser.
I want everything to work just the same, but with the access log recording only requests for JS files.
How do I do that?
Put it in the server block and make sure that the "root" is correctly set up. It does work.
Working example:
location ~* \.(js|css|png|jpg|jpeg|gif|ico)$ {
expires +60d;
access_log off;
}
I have this in the server block and not a location block.
Alternatively, you can keep all requests within a single location and use access_log with the conditional if= parameter to disable logging of images:
map $request_uri $is_loggable {
~*^.+\.(jpg|jpeg|gif|css|png|html|htm|ico|xml|svg)$ 0;
default 1;
}
server {
location / {
access_log /path/to/log/file combined if=$is_loggable;
...
}
}
Here combined is the name of the default log format.
You say that you want to log only JavaScript files, so you can use an even simpler solution:
map $request_uri $is_loggable {
~*^.+\.js$ 1;
default 0;
}
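One caveat with matching on $request_uri is that it includes the query string, so a request such as /app.js?v=2 would not match a pattern that ends in \.js$. A hedged variant that tolerates an optional query string might look like this (the log path is the placeholder from the answer above):

```nginx
# http context: log only JavaScript requests, with or without a query string
map $request_uri $is_loggable {
    default         0;
    ~*\.js(\?.*)?$  1;
}

server {
    location / {
        access_log /path/to/log/file combined if=$is_loggable;
    }
}
```

Note that the regex follows the ~* prefix directly. A space between them would make nginx read them as separate map parameters.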

Nginx proxy pass and url rewriting

How can I trigger this rule only when the URL has GET parameters (a query string)?
Otherwise I want the request to match an alias.
location ~^/static/photos/.* {
rewrite ^/static/photos/(.*)$ /DynamicPhotoQualitySwitch/photos/$1 break;
expires 7d;
proxy_pass http://foofoofoo.com;
include /etc/nginx/proxy.conf;
}
The 1st way that I know of is using a regex against the $args parameter like so:
if ($args ~ "^(\w+)=") {
Or the 2nd way is to use the convenient $is_args like so:
if ($is_args != "") {
Remember that in both styles you need a space between the if and the opening parenthesis ("if (", not "if(") as well as a space between the closing parenthesis and the opening brace (") {" rather than "){").
Full example using the 1st style above, nginx.conf:
location ~^/static/photos/.* {
include /etc/nginx/proxy.conf;
if ($args ~ "^(\w+)=") {
rewrite ^/static/photos/(.*)$ /DynamicPhotoQualitySwitch/photos/$1 break;
expires 7d;
proxy_pass http://foofoofoo.com;
}
}
Full example using the 2nd style above, nginx.conf:
location ~^/static/photos/.* {
include /etc/nginx/proxy.conf;
if ($is_args != "") {
rewrite ^/static/photos/(.*)$ /DynamicPhotoQualitySwitch/photos/$1 break;
expires 7d;
proxy_pass http://foofoofoo.com;
}
}
Note that the proxy.conf include goes outside of the if statement.
Version:
[nginx@hip1 ~]$ nginx -v
nginx version: nginx/1.2.6
And some info on the $args and $is_args variables:
http://nginx.org/en/docs/http/ngx_http_core_module.html
Reading the docs is always useful, I just discovered that $query_string is the same as $args, so where I have $args above, you could also use $query_string according to the docs.
IMPORTANT
It is important to note however, that If can be Evil!
And therefore either test thoroughly or use the recommendation provided in the link above to change the URL inside location statement in a way similar to the example provided there, something like:
location ~^/static/photos/.* {
error_page 418 = @dynamicphotos;
recursive_error_pages on;
if ($is_args != "") {
return 418;
}
# Your default, if no query parameters exist:
...
}
location @dynamicphotos {
# If query parameters are present:
rewrite ^/static/photos/(.*)$ /DynamicPhotoQualitySwitch/photos/$1 break;
expires 7d;
include /etc/nginx/proxy.conf;
proxy_pass http://foofoofoo.com;
}

nginx: auth_basic for everything except a specific location

How can I enable HTTP Basic Auth for everything except for a certain file?
Here is my current server block configuration for the location:
location / {
auth_basic "The password, you must enter.";
auth_basic_user_file /etc/nginx/htpasswd;
}
location /README {
auth_basic off;
}
However, on /README, it is still prompting for a password.
How can we fix this?
Thanks!
Mark
Try using the = modifier for an exact match; that should help:
location = /README {
auth_basic off;
allow all; # Allow all to see content
}
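Put together with the configuration from the question, the exact-match exception could be sketched as follows. Declaring auth_basic at the server level and using try_files as the handler for location / are assumptions made for illustration. The realm text and htpasswd path come from the question.

```nginx
server {
    # inherited by every location unless a location overrides it
    auth_basic           "The password, you must enter.";
    auth_basic_user_file /etc/nginx/htpasswd;

    location / {
        try_files $uri $uri/ =404;   # assumed handler, not from the question
    }

    # exact match: applies to /README only and ends the location search
    location = /README {
        auth_basic off;
    }
}
```

Because = matches the exact request URI, a path such as /README.txt would still require a password.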
I am doing something similar using "map" instead of "if" to assign the auth_basic realm variable and htpasswd file:
map $http_host $siteenv {
default dev;
~^(?<subdomain>.+)\.dev dev;
~^(?<subdomain>.+)\.devprofile devprofile;
~^(?<subdomain>.+)\.devdebug devdebug;
~^(?<subdomain>.+)\.test test;
~^(?<subdomain>.+)\.demo demo;
~^(?<subdomain>.+)\.stage stage;
# Live
~^(?<subdomain>.+)\.live live;
~^.*\.(?P<subdomain>.+)\.[a-zA-Z]* live;
}
map $http_host $auth_type {
default "Restricted";
~^(?<subdomain>.+)\.dev "Development";
~^(?<subdomain>.+)\.devprofile "Development";
~^(?<subdomain>.+)\.devdebug "Development";
~^(?<subdomain>.+)\.test "Testing";
~^(?<subdomain>.+)\.stage "Stage";
~^(?<subdomain>.+)\.demo "Demo";
# Live
~^(?<subdomain>.+)\.live "off";
~^.*\.(?P<subdomain>.+)\.[a-zA-Z]* "off";
}
server {
.. etc ..
auth_basic $auth_type;
auth_basic_user_file /etc/nginx/conf.d/htpasswd-$siteenv;
}
I'm doing the following:
location = /hc.php {
auth_basic "off";
}
location / {
try_files $uri $uri/ =404;
}
The narrow match, location = /somefile.txt {}, comes first, so location / {} can capture the remaining requests.
I write auth_basic "off" with the quotes, although nginx treats the quoted and unquoted forms of off the same way.
I also use the exact (full, if you like) match in order to stop the search through the other locations defined in the config (see the quote below for more on what it does).
This would also work in a different order, since nginx does not choose between exact and prefix locations by their position in the file, but why not do things as correctly and completely as possible?
The most important modifiers are:
(none) No modifier at all means that the location is interpreted as a prefix. To determine a match, the location will now be matched against the beginning of the URI.
=: The equal sign can be used if the location needs to match the exact request URI. When this modifier is matched, the search stops right here.
~: Tilde means that this location will be interpreted as a case-sensitive RE match.
~*: Tilde followed by an asterisk modifier means that the location will be processed as a case-insensitive RE match.
^~: Assuming this block is the best non-RE match, a caret followed by a tilde modifier means that RE matching will not take place.
quoted from here: https://www.keycdn.com/support/nginx-location-directive
auth_basic off on its own didn't work for me.
To skip auth for all URIs under a given path:
location ^~ /some/location/to_skip/ {
auth_basic off;
try_files $uri $uri/ /index.html;
}

Resources