Strange requests to web server - nginx

I have a VPS running Nginx, which currently serves only static content.
While looking at the access log, I noticed some strange requests:
216.244.66.239 - - [03/Jan/2019:15:04:26 +0100] "GET /en/profile/Souxy HTTP/1.1" 200 4650 "-" "Mozilla/5.0 (compatible; DotBot/1.1; http://www.opensiteexplorer.org/dotbot, help@moz.com)"
216.244.66.239 - - [03/Jan/2019:15:04:28 +0100] "GET /en/view/8gIi2vad8Y HTTP/1.1" 200 4650 "-" "Mozilla/5.0 (compatible; DotBot/1.1; http://www.opensiteexplorer.org/dotbot, help@moz.com)"

This is a crawler: DotBot, operated by Moz. It is described at https://moz.com/help/moz-procedures/crawlers/dotbot. It is probably indexing your website.
You can block these requests at the firewall, or add a robots.txt file with the following content:
User-agent: dotbot
Disallow: /
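To confirm that a rule like this matches the crawler before deploying it, the file can be checked with Python's standard-library urllib.robotparser. This is only a minimal sketch: the path comes from the log lines in the question, the is_allowed helper is illustrative, and the parser is given the bot's product token (DotBot) rather than the full User-Agent header. Keep in mind that robots.txt only affects crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content suggested in the answer.
ROBOTS_TXT = """\
User-agent: dotbot
Disallow: /
"""


def is_allowed(agent: str, path: str) -> bool:
    """Return True if the robots.txt above lets `agent` fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # DotBot is matched case-insensitively by the "dotbot" rule.
    print("DotBot:", is_allowed("DotBot", "/en/profile/Souxy"))
    # Other crawlers have no matching rule, so they stay allowed.
    print("Googlebot:", is_allowed("Googlebot", "/en/profile/Souxy"))
```

If the crawler ignores the file, the firewall option from the answer is the fallback.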

Related

HTTP 301 in Apache logs but 404 in browser/curl

I am dealing with a client who has been blacklisted by Google AdWords because the Googlebot crawlers that are crawling their site are finding tons of weird links that redirect via a 301 code. Some examples:
216.244.66.238 - - [15/Mar/2022:00:22:33 +0000] "GET /ffdd1g/hytera-phone.html HTTP/1.1" 301 - "-" "Mozilla/5.0 (compatible; DotBot/1.2; +https://opensiteexplorer.org/dotbot; help@moz.com)"
216.244.66.238 - - [15/Mar/2022:00:22:34 +0000] "GET /ffdd1g/od-tools-houdini.html HTTP/1.1" 301 - "-" "Mozilla/5.0 (compatible; DotBot/1.2; +https://opensiteexplorer.org/dotbot; help@moz.com)"
216.244.66.238 - - [15/Mar/2022:00:22:42 +0000] "GET /7oh5yny/fujifilm-classic-chrome.html HTTP/1.1" 301 - "-" "Mozilla/5.0 (compatible; DotBot/1.2; +https://opensiteexplorer.org/dotbot; help@moz.com)"
216.244.66.238 - - [15/Mar/2022:00:22:45 +0000] "GET /7oh5yny/fusion-360-join-line-segments.html HTTP/1.1" 301 - "-" "Mozilla/5.0 (compatible; DotBot/1.2; +https://opensiteexplorer.org/dotbot; help@moz.com)"
But when I recreate any of the requests in my browser or with curl, it 404s correctly. The fact that Google is seeing 301s is what caused them to be blacklisted by Google AdWords. Why could this be happening, and how can I make sure that all invalid links always return 404 instead of 301?
This is a WordPress website, by the way, in case it makes a difference. Thank you.

How does nginx know which site I was redirected from?

I was checking my nginx access.log and found the entries below.
They show that I reached my site webcovid19.live through a link on another portal, sme.sk.
How is this possible? Is it hidden somewhere in the HTTP protocol?
How does the web server know about the redirection?
Does Google Analytics use the same logic to tell direct traffic from referral traffic?
85.216.x.x - - [24/May/2020:08:50:52 +0000] "GET / HTTP/1.1" 200 1358 "https://domov.sme.sk/diskusie/3671287/2/koronavirus-slovensko-minuta-po-minute-23-maj-2020.html" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:76.0) Gecko/20100101 Firefox/76.0"
85.216.x.x - - [24/May/2020:08:50:52 +0000] "GET /styles.css HTTP/1.1" 200 725 "https://webcovid19.live/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:76.0) Gecko/20100101 Firefox/76.0"
Many Thanks Incognito :)
Issue solved:
en.wikipedia.org/wiki/HTTP_referer
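
For reference, the browser itself sends the referring page's URL in the Referer request header, and the log format shown in the question records it as the second-to-last quoted field. Below is a minimal Python sketch of pulling that field out of a log line. It assumes the combined log format seen in the excerpt; the regular expression and the parse_line name are illustrative and not part of nginx.

```python
import re
from typing import Optional

# Combined log format:
# ip - user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line: str) -> Optional[dict]:
    """Split one access-log line into named fields, or return None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # A logged "-" means the client sent no Referer header.
    if fields["referer"] == "-":
        fields["referer"] = None
    return fields


if __name__ == "__main__":
    sample = (
        '85.216.x.x - - [24/May/2020:08:50:52 +0000] "GET /styles.css HTTP/1.1" '
        '200 725 "https://webcovid19.live/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; '
        'rv:76.0) Gecko/20100101 Firefox/76.0"'
    )
    print(parse_line(sample)["referer"])
```

A request typed directly into the address bar carries no Referer header, which is why such entries show "-" in that position.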

Apache losing referer information in some cases

Update
After digging through my Apache access logs I believe the issue is related to hosting, not Google Analytics. Clicking a link to my site sometimes results in the Referer information getting dropped.
Here are a few example entries. Some contain a valid referer (Twitter, for example) while others just contain a -. For some of these cases, I was tailing the access log while clicking a link from another site so I know it should have a valid referer.
X.X.X.X - - [10/Jun/2019:03:06:10 +0000] "GET / HTTP/1.0" 200 12153 "https://twitter.com/PxJVBrEB7T" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:67.0) Gecko/20100101 Firefox/67.0"
X.X.X.X - - [10/Jun/2019:03:12:37 +0000] "GET / HTTP/1.0" 200 13535 "https://twitter.com/PTVIpLWqE9" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:67.0) Gecko/20100101 Firefox/67.0"
X.X.X.X - - [10/Jun/2019:03:50:39 +0000] "GET / HTTP/1.0" 200 12308 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:67.0) Gecko/20100101 Firefox/67.0"
This is a Plesk-managed hosting account using Apache with an Nginx cache. Is it possible that Apache is misconfigured and dropping the referer information? Or is it more likely something to do with the referring website?
There are multiple referring websites where links aren't showing up properly in the access log, so I suspect it's on the server side but I'm not sure what more to check.

How to get the real IP in nginx access.log

I have a serious problem. My website is being spammed through its Joomla contact form. In the nginx access.log I see only:
10.50.0.1 - - [06/Sep/2017:19:57:32 +0200] "GET /index.php/en/kontakt HTTP/1.1" 200 16132 "http://polskaszkolaslough.org/index.php/en/kontakt" "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)"
10.50.0.1 - - [06/Sep/2017:19:57:32 +0200] "POST /index.php/en/kontakt HTTP/1.1" 301 193 "http://polskaszkolaslough.org/index.php/en/kontakt" "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)"
10.50.0.1 - - [06/Sep/2017:19:57:34 +0200] "POST /index.php/en/kontakt HTTP/1.1" 301 193 "http://polskaszkolaslough.org/index.php/en/kontakt" "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)"
10.50.0.1 - - [06/Sep/2017:19:57:34 +0200] "GET /index.php/en/kontakt HTTP/1.1" 301 193 "http://polskaszkolaslough.org/index.php/en/kontakt" "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)"
10.50.0.1 - - [06/Sep/2017:19:57:34 +0200] "GET /index.php/en/kontakt HTTP/1.1" 301 193 "http://polskaszkolaslough.org/index.php/en/kontakt" "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)"
10.50.0.1 - - [06/Sep/2017:19:57:36 +0200] "GET /index.php/en/kontakt HTTP/1.1" 200 16132 "http://polskaszkolaslough.org/index.php/en/kontakt" "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)"
10.50.0.1 - - [06/Sep/2017:19:57:37 +0200] "GET /index.php/en/kontakt HTTP/1.1" 301 193 "http://polskaszkolaslough.org/index.php/en/kontakt" "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)"
10.50.0.1 - - [06/Sep/2017:19:57:37 +0200] "GET /index.php/en/kontakt HTTP/1.1" 200 16132 "http://polskaszkolaslough.org/index.php/en/kontakt" "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)"
10.50.0.1 - - [06/Sep/2017:19:57:37 +0200] "GET /index.php/en/kontakt HTTP/1.1" 301 193 "http://polskaszkolaslough.org/index.php/en/kontakt" "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)"
When I run tail on the log, new requests come in one after another. It is shocking, and my website is very slow. I have a private server with a public IP. My local IP is 10.50.0.6 and the gateway is 10.50.0.1. DNS is hosted at my domain provider; the A record points to my public IP, and the router then forwards the traffic to my local IP. I would like to block the IP range that is spamming my domain, but I cannot see the original address, only my gateway IP. I installed fail2ban and added reCAPTCHA to the contact form, but it did not help. How can I resolve this problem?
You need access to the router.
The router should be capable of logging the address translations that it makes, and by comparing these logs with your nginx logs you should be able to identify the originating IP address. The router should also be capable of implementing an access list so that you can block the originating IP address.
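The comparison described above can be sketched in Python. This assumes the router's translation log has already been exported and parsed into (timestamp, source IP) pairs; that shape is hypothetical, since routers differ in what they log and how. The nginx timestamp format is taken from the log excerpt in the question, and the addresses in the example are placeholders from the documentation ranges.

```python
from datetime import datetime, timedelta

# Timestamp layout used inside the [...] field of the nginx access log.
NGINX_TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def parse_nginx_time(stamp: str) -> datetime:
    """Convert e.g. '06/Sep/2017:19:57:32 +0200' into a timezone-aware datetime."""
    return datetime.strptime(stamp, NGINX_TIME_FORMAT)


def candidate_sources(request_time, nat_entries, tolerance=timedelta(seconds=1)):
    """Return the source IPs whose translation time is within `tolerance` of the request.

    `nat_entries` is an iterable of (datetime, source_ip) pairs taken from the
    router log (hypothetical format, see the note above).
    """
    return sorted({ip for ts, ip in nat_entries if abs(ts - request_time) <= tolerance})


if __name__ == "__main__":
    request_time = parse_nginx_time("06/Sep/2017:19:57:32 +0200")
    nat_entries = [  # placeholder data, not real router output
        (parse_nginx_time("06/Sep/2017:19:57:32 +0200"), "203.0.113.7"),
        (parse_nginx_time("06/Sep/2017:19:40:00 +0200"), "198.51.100.23"),
    ]
    print(candidate_sources(request_time, nat_entries))
```

An address that keeps turning up as a candidate across many spam requests is the one to put on the router's access list.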

Symfony route path /config - is it reserved?

When I try to add a route with the path /config, it shows 404 Not Found. The strange thing is that it is not the regular Symfony 404 error that appears when I enter a nonexistent route. Here is the Apache access log:
127.0.0.1 - - [03/Feb/2015:11:32:26 +0100] "GET /config HTTP/1.1" 404 499 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:34.0) Gecko/20100101 Firefox/34.0"
127.0.0.1 - - [03/Feb/2015:11:35:00 +0100] "GET /configasd HTTP/1.1" 404 743 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:34.0) Gecko/20100101 Firefox/34.0"
127.0.0.1 - - [03/Feb/2015:11:36:42 +0100] "GET /configasdasd HTTP/1.1" 404 743 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:34.0) Gecko/20100101 Firefox/34.0"
You can see that accessing /config generates 404 499, while accessing another nonexistent route generates 404 743.
My question is: is "config" a reserved word in routes? Is there a complete list of such words in Symfony?
EDIT: Route configuration:
In app/config/routing.yml:
    myapp_config:
        resource: "@MyappConfigBundle/Resources/config/routing.yml"
        prefix: /config
In MyappConfigBundle/Resources/config/routing.yml:
    myapp_config:
        path: /
        defaults: { _controller: MyappConfigBundle:Config:index }
The response status code is just 404 in both cases; 499 and 743 are the response sizes in bytes.
Your server is configured to serve some other resource at the /config path. There may be a global alias, or there may simply be a file, folder, symlink, or hard link named config in your web directory.
Check each of these cases and you should find the cause.
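One of these cases, a filesystem entry named config inside the web directory, is quick to rule out. Here is a minimal Python sketch; the find_conflict helper is illustrative, and the web root path "web" is an assumption, so substitute the document root your Apache virtual host actually uses. It does not detect aliases defined in the Apache configuration.

```python
from pathlib import Path
from typing import Optional


def find_conflict(web_root: str, route_prefix: str) -> Optional[Path]:
    """Return the filesystem entry under `web_root` that shadows `route_prefix`, if any."""
    candidate = Path(web_root) / route_prefix.strip("/")
    # is_symlink() also catches dangling symlinks, for which exists() is False.
    if candidate.exists() or candidate.is_symlink():
        return candidate
    return None


if __name__ == "__main__":
    # "web" is an assumed document root; adjust to your setup.
    print(find_conflict("web", "/config"))
```

If this prints a path, renaming or removing that entry (or choosing a different route prefix) should let the request reach Symfony.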

Resources