Nginx doesn't block bot by user-agent - nginx

I'm trying to ban an annoying bot by its user agent. I put this into the server section of my nginx config:
server {
    listen 80 default_server;
    ....
    if ($http_user_agent ~* (AhrefsBot)) {
        return 444;
    }
Checking with curl:
[root@vm85559 site_avaliable]# curl -I -H 'User-agent: Mozilla/5.0 (compatible; AhrefsBot/5.2; +http://ahrefs.com/robot/)' localhost/
curl: (52) Empty reply from server
So I checked /var/log/nginx/access.log and saw that some connections get 444, but others get 200:
51.255.65.78 - - [25/Jun/2017:15:47:36 +0300 - -] "GET /product/kovriki-avtomobilnie/volkswagen/?PAGEN_1=10 HTTP/1.1" 444 0 "-" "Mozilla/5.0 (compatible; AhrefsBot/5.2; +http://ahrefs.com/robot/)" 1498394856.155
217.182.132.60 - - [25/Jun/2017:15:47:50 +0300 - 2.301] "GET /product/bryzgoviki/toyota/ HTTP/1.1" 200 14500 "-" "Mozilla/5.0 (compatible; AhrefsBot/5.2; +http://ahrefs.com/robot/)" 1498394870.955
How is it possible?

OK, got it!
I added $server_name and $server_addr to the nginx log format and saw that the cunning bot connects by IP address, without a server name (the log shows _ as $server_name):
51.255.65.40 - _ *myip* - [25/Jun/2017:16:22:27 +0300 - 2.449] "GET /product/soyuz_96_2/mitsubishi/l200/ HTTP/1.1" 200 9974 "-" "Mozilla/5.0 (compatible; AhrefsBot/5.2; +http://ahrefs.com/robot/)" 1498396947.308
So I added this server block, and the bot can't connect anymore:
server {
    listen *myip*:80;
    server_name _;
    return 403;
}
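The same idea can be written so that the user-agent check does not depend on which server block ends up handling the request. The following is only a sketch under assumptions, not the asker's actual configuration: the log format name with_vhost, the variable $blocked_bot, and the site name example.com are made up for illustration.

```nginx
# All of this goes inside the http { } context.

# Log which virtual server handled each request (format name is hypothetical).
log_format with_vhost '$remote_addr - $server_name $server_addr [$time_local] '
                      '"$request" $status $body_bytes_sent '
                      '"$http_referer" "$http_user_agent"';
access_log /var/log/nginx/access.log with_vhost;

# Evaluate the user agent once; $blocked_bot is a made-up variable name.
map $http_user_agent $blocked_bot {
    default      0;
    ~*AhrefsBot  1;
}

# Catch-all server: handles requests whose Host header matches no
# server_name, for example requests made directly to the IP address.
server {
    listen 80 default_server;
    server_name _;
    return 444;
}

# The real site (example.com is a placeholder).
server {
    listen 80;
    server_name example.com;

    if ($blocked_bot) {
        return 444;
    }
    # ...
}
```

Note that nginx picks the default server per listen address:port pair. If other server blocks listen on a specific IP address, the catch-all needs a matching listen directive, which is what the server block in the answer above does with *myip*:80.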

Related

change nginx response code from 413

Is there any way to change the response code nginx sends? When the server receives a file that exceeds its client_max_body_size as defined in the config, can I have it return a 403 code instead of a 413 code?
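The configuration below works fine for me:
events {
    worker_connections 1024;
}
http {
    server {
        listen 80;
        # named location that produces the replacement response
        location @change_upload_error {
            return 403 "File uploaded too large";
        }
        location /post {
            client_max_body_size 10K;
            # hand 413 errors to the named location and use its status code
            error_page 413 = @change_upload_error;
            # echo comes from the echo module bundled with OpenResty
            echo "you reached here";
        }
    }
}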
Results of posting a 50 KB file:
$ curl -vX POST -F file=@test.txt vm/post
Note: Unnecessary use of -X or --request, POST is already inferred.
* Trying 192.168.33.100...
* TCP_NODELAY set
* Connected to vm (192.168.33.100) port 80 (#0)
> POST /post HTTP/1.1
> Host: vm
> User-Agent: curl/7.54.0
> Accept: */*
> Content-Length: 51337
> Expect: 100-continue
> Content-Type: multipart/form-data; boundary=------------------------67df5f3ef06561a5
>
< HTTP/1.1 403 Forbidden
< Server: openresty/1.11.2.2
< Date: Mon, 11 Sep 2017 17:58:55 GMT
< Content-Type: text/plain
< Content-Length: 23
< Connection: close
<
* Closing connection 0
File uploaded too large%
And the nginx logs:
web_1 | 2017/09/11 17:58:55 [error] 5#5: *1 client intended to send too large body: 51337 bytes, client: 192.168.33.1, server: , request: "POST /post HTTP/1.1", host: "vm"
web_1 | 192.168.33.1 - - [11/Sep/2017:17:58:55 +0000] "POST /post HTTP/1.1" 403 23 "-" "curl/7.54.0"
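An alternative that avoids the named location is to let error_page replace the status itself with the =403 form. This is an untested sketch rather than the configuration from the answer above, and the URI /upload_too_large is a name chosen here purely for illustration.

```nginx
server {
    listen 80;

    location /post {
        client_max_body_size 10k;
        # On 413, redirect internally to /upload_too_large and
        # report 403 to the client instead.
        error_page 413 =403 /upload_too_large;
    }

    # Hypothetical internal-only location that produces the response body.
    location = /upload_too_large {
        internal;
        return 403 "File uploaded too large";
    }
}
```

The internal directive keeps clients from requesting /upload_too_large directly; it is reachable only through the error_page redirect.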

Redirected to compute.amazonaws.com

I'm running nginx on a basic AWS EC2 instance.
My nginx http server block looks like this:
server {
listen 80 default_server;
listen [::]:80;
# TLS redirect
server_name ec2-xx-xx-xxx-xxx.rr-rrrr-r.compute.amazonaws.com;
return 301 https://$host$request_uri;
}
My https server block works fine, so when I enter:
https://ec2-xx-xx-xxx-xxx.rr-rrrr-r.compute.amazonaws.com
In my browser, everything works, but when I enter:
http://ec2-xx-xx-xxx-xxx.rr-rrrr-r.compute.amazonaws.com (http, not https)
I get redirected to
https://compute.amazonaws.com
Is there any obvious reason for this? Thanks in advance.
Edit: curl output:
[ec2-user@ip-xxx-xx-xx-xxx ~]$ curl -v http://ec2-xx-xx-xxx-xxx.rr-rrrr-r.compute.amazonaws.com
* Rebuilt URL to: http://ec2-xx-xx-xxx-xxx.rr-rrrr-r.compute.amazonaws.com/
* Trying 172.31.14.246...
* TCP_NODELAY set
* Connected to ec2-xx-xx-xxx-xxx.rr-rrrr-r.compute.amazonaws.com (xxx.xx.xx.xxx) port 80 (#0)
> GET / HTTP/1.1
> Host: ec2-xx-xx-xxx-xxx.rr-rrrr-r.compute.amazonaws.com
> User-Agent: curl/7.51.0
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< Server: nginx/1.10.2
< Date: Mon, 24 Apr 2017 22:09:39 GMT
< Content-Type: text/html
< Content-Length: 185
< Connection: keep-alive
< Location: https://ec2-xx-xx-xxx-xxx.rr-rrrr-r.compute.amazonaws.com
<
<html>
<head><title>301 Moved Permanently</title></head>
<body bgcolor="white">
<center><h1>301 Moved Permanently</h1></center>
<hr><center>nginx/1.10.2</center>
</body>
</html>
* Curl_http_done: called premature == 0
* Connection #0 to host ec2-xx-xx-xxx-xxx.rr-rrrr-r.compute.amazonaws.com left intact

How to point a domain into another website's page?

For example, I have a website under the domain example.com. In that site, I have a page like this example.com/hello. Now I need to point my second domain hello.com to that page example.com/hello. It should not be a re-direct. The visitor should stay in hello.com but see the content from the page example.com/hello. Is this possible? Can we do it in dns or in nginx?
The access log after using proxy_pass:
123.231.120.120 - - [10/Mar/2016:19:53:18 +0530] "GET / HTTP/1.1" 200 1598 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36"
123.231.120.120 - - [10/Mar/2016:19:53:18 +0530] "GET /a4e1020a9f19bd46f895c136e8e9ecb839666e7b.js?meteor_js_resource=true HTTP/1.1" 404 44 "http://swimamerica.lk/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.$
123.231.120.120 - - [10/Mar/2016:19:53:18 +0530] "GET /9b342ac50483cb063b76a0b64df1e2d913a82675.css?meteor_css_resource=true HTTP/1.1" 200 73 "http://swimamerica.lk/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.262$
123.231.120.120 - - [10/Mar/2016:19:53:18 +0530] "GET /images/favicons/favicon-16x16.png HTTP/1.1" 200 1556 "http://swimamerica.lk/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36"
123.231.120.120 - - [10/Mar/2016:19:53:19 +0530] "GET /images/favicons/favicon-96x96.png HTTP/1.1" 200 1556 "http://swimamerica.lk/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36"
123.231.120.120 - - [10/Mar/2016:19:53:19 +0530] "GET /images/favicons/favicon-32x32.png HTTP/1.1" 200 1556 "http://swimamerica.lk/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36"
123.231.120.120 - - [10/Mar/2016:19:53:19 +0530] "GET /images/favicons/android-icon-192x192.png HTTP/1.1" 200 1556 "http://swimamerica.lk/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36"
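You can use the proxy_pass directive. Create a new server block for the domain hello.com, and in location = / set proxy_pass to http://example.com/hello/:
server {
    server_name hello.com;
    # ...
    location = / {
        proxy_pass http://example.com/hello/;
    }
    # serve static content (ugly way)
    # note: when proxy_pass contains variables, the host name is resolved at
    # request time, so a resolver directive (or a matching upstream block) is needed
    location ~* \.(jpg|jpeg|gif|png|css|js|ico|xml|rss|txt)$ {
        # $uri already starts with "/", so no extra slash after /hello
        proxy_pass http://example.com/hello$uri$is_args$args;
    }
    # serve static content (better way,
    # but requires collecting all assets under a common root);
    # a prefix location is used here because proxy_pass with a URI part
    # is not allowed in a regex location
    location /static/ {
        proxy_pass http://example.com/static/;
    }
}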
UPD: Here is an exact solution for your situation:
server {
    server_name swimamerica.lk;
    location = / {
        proxy_pass http://killerwhales.lk/swimamerica;
    }
    # serve static content (ugly way) - added woff and woff2 extensions
    location ~* \.(jpg|jpeg|gif|png|css|js|ico|xml|rss|txt|woff|woff2)$ {
        proxy_pass http://killerwhales.lk$uri$is_args$args;
    }
    # added location for web sockets
    location ~* sockjs {
        proxy_pass http://killerwhales.lk$uri$is_args$args;
    }
}
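If everything the second domain needs lives under one path on the first site, a single prefix location may be enough. This is a minimal sketch under that assumption, using the hello.com and example.com names from the question; it has not been tested against the asker's application.

```nginx
server {
    listen 80;
    server_name hello.com;

    # With a prefix location and a URI part in proxy_pass, nginx replaces
    # the matched prefix ("/") with "/hello/", so a request for
    # /about is fetched from http://example.com/hello/about.
    location / {
        proxy_pass http://example.com/hello/;
    }
}
```

The trade-off is that every request is mapped under /hello/, including assets the page references with absolute paths. A request for /static/app.css would be fetched from http://example.com/hello/static/app.css, so assets served from the site root still need their own location blocks, as in the answer above.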

What is wrong with my GET request?

Sorry to bother with something that should be easy.
I have this HTTP GET request:
GET /ip HTTP/1.1
Host: httpbin.org
Connection: close
Accept: */*
User-Agent: Mozilla/4.0 (compatible; esp8266 Lua; Windows NT 5.1)
When I send this request via my ESP8266 it returns a 404 error:
HTTP/1.1 404 Not Found
Date: Fri, 04 Sep 2015 16:34:46 GMT
Server: Apache
Content-Length: 1363
X-Frame-Options: deny
Connection: close
Content-Type: text/html
But when I (and you) go to http://httpbin.org/ip it works perfectly!
What is wrong?
DETAILS
I construct my request in Lua:
conn:on("connection", function(conn, payload)
print('\nConnected')
req = "GET /ip"
.." HTTP/1.1\r\n"
.."Host: httpbin.org\r\n"
.."Connection: close\r\n"
.."Accept: */*\r\n"
.."User-Agent: Mozilla/4.0 (compatible; esp8266 Lua; Windows NT 5.1)\r\n"
.."\r\n"
print(req)
conn:send(req)
end)
And if I use another host (the one given in this example), it works:
conn:on("connection", function(conn, payload)
print('\nConnected')
conn:send("GET /esp8266/test.php?"
.."T="..(tmr.now()-Tstart)
.."&heap="..node.heap()
.." HTTP/1.1\r\n"
.."Host: benlo.com\r\n"
.."Connection: close\r\n"
.."Accept: */*\r\n"
.."User-Agent: Mozilla/4.0 (compatible; esp8266 Lua; Windows NT 5.1)\r\n"
.."\r\n")
end)
Are you actually connecting to httpbin.org? Or somewhere else?
I just tried issuing your request by typing it into telnet, and it worked for me. But the server that responded was nginx, whereas your example shows Apache.
$ telnet httpbin.org 80
Trying 54.175.219.8...
Connected to httpbin.org.
Escape character is '^]'.
GET /ip HTTP/1.1
Host: httpbin.org
Connection: close
Accept: */*
User-Agent: Mozilla/4.0 (compatible; esp8266 Lua; Windows NT 5.1)
HTTP/1.1 200 OK
Server: nginx
Date: Wed, 07 Oct 2015 06:08:40 GMT
Content-Type: application/json
Content-Length: 32
Connection: close
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true
{
"origin": "124.149.55.34"
}
Connection closed by foreign host.
When I try another request with some other URI to force a 404 response, I see this:
HTTP/1.1 404 NOT FOUND
Server: nginx
Date: Wed, 07 Oct 2015 06:12:21 GMT
Content-Type: text/html
Content-Length: 233
Connection: close
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true
Which again is nothing like the response you said you got from httpbin.org.
http.get("http://httpbin.org/ip", nil, function(code, data)
if (code < 0) then
print("HTTP request failed")
else
print(code, data)
end
end)
http.post('http://httpbin.org/post',
'Content-Type: application/json\r\n',
'{"hello":"world"}',
function(code, data)
if (code < 0) then
print("HTTP request failed")
else
print(code, data)
end
end)
See the NodeMCU http module reference.
It's your request line the server doesn't like.
This will do the job:
GET http://httpbin.org/ip HTTP/1.1
Host: httpbin.org

nginx = / location pattern not working

I am trying to configure nginx to serve a static html page on the root domain, and proxy everything else to uwsgi. As a quick test I tried to divert to two different static pages:
server {
    server_name *.example.dev;
    index index.html index.htm;
    listen 80;
    charset utf-8;
    location = / {
        root /www/src/;
    }
    location / {
        root /www/test/;
    }
}
This seems to be what http://nginx.org/en/docs/http/ngx_http_core_module.html#location says you can do. But I'm always getting sent to the test site, even on the / request by visiting http://www.example.dev in my browser.
Curl output:
$ curl http://www.example.dev -v
* Rebuilt URL to: http://www.example.dev/
* Hostname was NOT found in DNS cache
* Trying 192.168.50.51...
* Connected to www.example.dev (192.168.50.51) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.37.1
> Host: www.example.dev
> Accept: */*
>
< HTTP/1.1 200 OK
* Server nginx/1.8.0 is not blacklisted
< Server: nginx/1.8.0
< Date: Tue, 19 May 2015 01:11:10 GMT
< Content-Type: text/html; charset=utf-8
< Content-Length: 415
< Last-Modified: Wed, 15 Apr 2015 02:53:27 GMT
< Connection: keep-alive
< ETag: "552dd2a7-19f"
< Accept-Ranges: bytes
<
<!DOCTYPE html>
<html>
...
And the output from the nginx access log:
192.168.50.1 - - [19/May/2015:01:17:05 +0000] "GET / HTTP/1.1" 200 415 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36" "-"
So I commented out the test location, leaving only the location = / { ... } block. Nginx now returns 404 and logs the following error:
2015/05/19 01:24:12 [error] 3116#0: *6 open() "/etc/nginx/html/index.html" failed (2: No such file or directory), client: 192.168.50.1, server: *.example.dev, request: "GET / HTTP/1.1", host: "www.example.dev"
That path is presumably the default root from the original nginx conf file. I guess this confirms my location = / pattern is not matching.
I added $uri to the access log and saw that it shows /index.html, which I guess means the first location pattern matches, but the request then goes into the second location block. So now I just need to figure out how to serve my index.html from the / block, or add another block like location = /index.html.
As @Alexey Ten commented, the ngx_http_index_module documentation says:
It should be noted that using an index file causes an internal redirect, and the request can be processed in a different location. For example, with the following configuration:
location = / {
    index index.html;
}
location / {
    ...
}
a “/” request will actually be processed in the second location as “/index.html”.
In your case, a request to "/" will not get /www/src/index.html but /www/test/index.html.
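One way to act on this, following the asker's own idea of adding a location = /index.html block, is sketched below. It reuses the paths and server name from the question and assumes that /www/src/index.html exists; treat it as an illustration rather than a verified fix.

```nginx
server {
    listen 80;
    server_name *.example.dev;
    charset utf-8;
    index index.html index.htm;

    # Matches only the bare "/" request; the index module then
    # redirects internally to "/index.html".
    location = / {
        root /www/src/;
    }

    # Catches the internally redirected request, so it is served
    # from /www/src/ instead of falling through to "location /".
    location = /index.html {
        root /www/src/;
    }

    # Everything else.
    location / {
        root /www/test/;
    }
}
```

With this layout a direct request for /index.html is also served from /www/src/, which may or may not be what you want.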

Resources