Prerender.io: Returning 404 instead of 200 - prerender

When I access a page in the browser I get a proper 200 from the server:
xx.xxx.xxx.xxx - - [02/May/2019:19:53:50 +0200] "GET /retourneren HTTP/1.1" 200 2889 "https://mysite.nl/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.0 Safari/605.1.15" "-"
However when I add the url in prerender I get a 400:
3.90.111.223 - - [02/May/2019:19:50:39 +0200] "GET /retourneren HTTP/1.1" 404 10050 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/61.0.3163.59 Safari/537.36 Prerender (+https://github.com/prerender/prerender)" "-"
therefor the page is not getting cached. Does anyone have an idea?

That seems like you might be setting the prerender-status-code meta tag on the page and setting it to "404", which would make Prerender.io return a 404 response code directly.
Can you confirm whether or not that meta tag is being set in the HTML of the page?

Related

How to block in NGINX all request starting with question mark

My website is getting attacked with such request as
66.249.75.242 - - [12/Jan/2023:00:29:11 +0800] "GET /?bailiffry/1529595 HTTP/1.1" 200 57100 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.5304.115 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.75.236 - - [12/Jan/2023:00:29:14 +0800] "GET /?Diphysite-7105-hwfLs/328609048 HTTP/1.1" 200 57097 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.75.236 - - [12/Jan/2023:00:29:16 +0800] "GET /?hypothermal/sealant313919.html HTTP/1.1" 200 57100 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.5304.115 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.75.236 - - [12/Jan/2023:00:29:17 +0800] "GET /?dianilid/elated357845.html HTTP/1.1" 200 57100 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.5304.115 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
I have blocked other patterns, I just wish to block this for now as I have been solving this for hours and wish a quick fix for now.
How do I block request starting with question mark?
Yes you can. See the question below
Drop unwanted connections
if (condition) ) {
return 444;
}
You Can also put a WAF (Web Application Firewall) in your front, if your request is coming for a security issue you're facing.
you can see NAXSI. It's Open-source and compatible with any nginx version.

AMP clear cache returning - Public key not found due to ingestion error: 499 error from origin That’s all we know

I am trying to clear the cache of amp page but I am getting this error:
Public key not found due to ingestion error: 499 error from origin
That’s all we know.
What I have checked:
.well-known/amphtml/apikey.pub is publicly available
the file is not roboted(allowed for google bots)
the file has content type plain/text
bypassed cloudflare cache
Suspiction:
There are also 301 requests from google bots. I guess it is because google bot is requesting the file in HTTP initially and redirected to https.
64.233.173.70 - - [13/Jun/2022:16:08:22 +0800] "GET /.well-known/amphtml/apikey.pub HTTP/1.1" 301 193 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.61 Mobile Safari/537.36 (compatible; Google-AMPHTML)"
64.233.173.204 - - [13/Jun/2022:16:08:22 +0800] "GET /.well-known/amphtml/apikey.pub HTTP/1.1" 200 451 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.61 Mobile Safari/537.36 (compatible; Google-AMPHTML)"
66.249.70.56 - - [13/Jun/2022:16:10:55 +0800] "GET /.well-known/amphtml/apikey.pub HTTP/1.1" 301 193 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.61 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.64.21 - - [13/Jun/2022:16:11:14 +0800] "GET /.well-known/amphtml/apikey.pub HTTP/1.1" 200 451 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.61 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.64.55 - - [13/Jun/2022:17:07:18 +0800] "GET /.well-known/amphtml/apikey.pub HTTP/1.1" 301 193 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.61 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.64.114 - - [13/Jun/2022:17:07:19 +0800] "GET /.well-known/amphtml/apikey.pub HTTP/1.1" 200 451 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.61 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
Another suspicion:
The .well-known/amphtml/apikey.pub cannot be curled and it returns 403. But from above logs from Google bot, it seems it does not have any problem fetching the file.
I don't understand what I miss here. Please help!

All http request to a particular script returns GET /false HTTP/1.1

Whenever I'm trying to open web application my Nginx access log shows "GET /false HTTP/1.1" 404 206 "https://www.example.com/FeedifySW.js" "Mozilla/5.0 (Linux; Android 6.0; Micromax Q4260 Build/MRA58K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.85 Mobile Safari/537.36" - -
I have searched through all way possible and still couldn't find the solution. Can anyone help?
This is for an nginx server where i have hosted my web application. due to high request counts, sometimes the application crashes
"GET /false HTTP/1.1" 404 206 "https://www.example.com/FeedifySW.js" "Mozilla/5.0 (Linux; Android 6.0; Micromax Q4260 Build/MRA58K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.85 Mobile Safari/537.36" - -

Strange google bot attack (e.g. /123456-12345678-123abc)

An abandoned outdated wordpress website that i thought the internet didn't know about got hacked. The attack got resolved quickly and had no real damage as for as I know now. But it does give an opportunity to study the attack used.
One thing that caught my attention is an upload of a malicous sitemap.xml causing google to do many (250k/day) requests to strange urls matching a specific pattern. 6 digits - 8 digits - 6 char hex
Examples:
66.249.76.33 - - [03/Oct/2018:14:12:13 +0200] "GET /035742-41258563-3329f7 HTTP/1.1" 302 244 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.76.33 - - [03/Oct/2018:14:12:13 +0200] "GET /042913-72193084-e8a20a HTTP/1.1" 302 244 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.76.33 - - [03/Oct/2018:14:12:14 +0200] "GET /012527-34165946-30e419 HTTP/1.1" 302 244 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.76.33 - - [03/Oct/2018:14:12:14 +0200] "GET /064248-52623737-8691d5 HTTP/1.1" 302 244 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.64.2 - - [03/Oct/2018:14:12:15 +0200] "GET /052839-44405924-68722a HTTP/1.1" 302 244 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.64.2 - - [03/Oct/2018:14:12:15 +0200] "GET /065830-65437791-de5b61 HTTP/1.1" 302 244 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.76.33 - - [03/Oct/2018:14:12:16 +0200] "GET /013227-70693694-023293 HTTP/1.1" 302 244 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.76.33 - - [03/Oct/2018:14:12:16 +0200] "GET /125539-43521853-8481a2 HTTP/1.1" 302 244 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.76.33 - - [03/Oct/2018:14:12:17 +0200] "GET /033515-14477539-24816a HTTP/1.1" 302 244 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.76.33 - - [03/Oct/2018:14:12:17 +0200] "GET /104450-28458335-28053c HTTP/1.1" 302 244 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
I've verified it's a real google bot by using host and ping. I'm curious if anyone has seen this attack before and how it works!
Never seen the attack before but it's execution is pretty simple: upload a massive sitemap of randomly generated "page urls" and let google do the rest.
Google will go "ooh look at that they must have added loads of new content, I want that" and will hit the site a LOT to try and get it, hence crippling the site. We see this when taking large sites to a new url structure all the time.
The only way I've found to compensate is to use NGINX rate limiting to stop any single IP making too many requests per second.

Request to URL defined with ingress.kubernetes.io/auth-url annotation is done with HTTP/1.0

I am using auth-url and auth-signin annotation for authenticating access to app. Problem is that request to URL defined with auth-url is always done with HTTP/1.0 and not with HTTP/1.1 as expected. From logs you can see that all other requests are done with HTTP/1.1.
Version used: nginx-ingress-controller:0.9.0-beta.19
Logs from ELB:
2017-11-30T14:28:30.606436Z dev-sandbox-2cb4 201.137.96.59:58692 10.10.0.101:80 0.000044 0.031215 0.000039 302 302 0 154 "GET https://example.net:443/testing/ HTTP/1.1" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36" ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2
2017-11-30T14:28:30.623944Z dev-sandbox-2cb4 24.134.104.23:40704 10.10.7.144:80 0.000029 0.01263 0.000068 401 401 0 21 "GET https://example.net:443/oauth2/auth HTTP/1.0" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36" ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2
2017-11-30T14:28:30.699239Z dev-sandbox-2cb4 201.137.96.59:58692 10.10.3.6:80 0.000028 0.001223 0.000046 302 302 0 395 "GET https://example.net:443/oauth2/start?rd=https://example.net/testing/ HTTP/1.1" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36" ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2
Annotation:
annotations:
ingress.kubernetes.io/auth-url: "https://$host/oauth2/auth"
ingress.kubernetes.io/auth-signin: "https://$host/oauth2/start"
Problem is that in the environment I need to use only 1.1 is allowed.
Is this something to be expected or am I doing something wrong?
Issue can be solved by adding
proxy_http_version 1.1;
under location = {{ $authPath }} block in nginx ingress template.
See https://github.com/kubernetes/ingress-nginx/pull/1787.

Resources