In corporate environments, antivirus or web-filtering software such as McAfee inspects client traffic and blocks disallowed requests. One way it does this is by responding with a 403 instead of the real response.
My problem is that the frontend cannot tell a 403 coming from the real backend apart from a 403 injected by the antivirus software.
I was thinking about using their custom status texts, but I couldn't find much documentation listing the possible values. Is there a standard or a public list of possible status texts?
Examples:
URLBlocked
URLCategoryBlocked
...
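The idea of telling the two kinds of 403 apart by status text could be sketched roughly as follows. This is a minimal illustration in Python rather than frontend code, and the set of texts contains only the two examples quoted above; the names `PROXY_STATUS_TEXTS` and `classify_403` are made up for this sketch. Keep in mind that HTTP/2 responses carry no status text at all, so the check can only work where the text actually reaches the client.

```python
# Hypothetical allow-list: only the two status texts quoted in the question.
# A real filter product may use others; this is not an official list.
PROXY_STATUS_TEXTS = frozenset({"URLBlocked", "URLCategoryBlocked"})


def classify_403(status: int, reason: str) -> str:
    """Guess who produced a 403, based only on the status text."""
    if status != 403:
        return "not-403"
    if reason.strip() in PROXY_STATUS_TEXTS:
        return "proxy"
    # Anything else (including an empty status text) is treated as the backend.
    return "backend"
```

A more robust variant, if you control the backend, would be to have the backend mark its own 403 responses (for example with a custom header) instead of relying on texts chosen by a third party.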
I've recently deployed a public website and looking at the nginx access logs I see hackers trying to access different php admin pages (which is fine, I don't use php), but I also see requests like this:
85.239.221.75 - - [27/Dec/2019:14:52:42 +0000] "k\xF7\xE9Y\xD3\x06)\xCF\xA92N\xC7&\xC4Oq\x93\xDF#\xBF\x88:\xA9\x97\xC0N\xAC\xFE>)9>\x0Cs\xC1\x96RB,\xE1\xE2\x16\xB9\xD1_Z-H\x16\x08\xC8\xAA\xAF?\xFB4\x91%\xD9\xDD\x15\x16\x8E\xAB\xF5\xA6'!\xF8\xBB\xFBBx\x85\xD9\x8E\xC9\x22\x176\xF0E\x8A\xCDO\xD1\x1EnW\xEB\xA3D|.\xAC\x1FB\xC9\xFD\x89a\x88\x93m\x11\xEB\xE7\xA9\xC0\xC3T\xC5\xAEF\xF7\x8F\x9E\xF7j\x03l\x96\x92t c\xE4\xB5\x10\x1EqV\x0C5\xF8=\xEE\xA2n\x98\xB4" 400 182 "-" "-"
What is this hacker sending and what are they trying to do? And what should I do to stay ahead of this type of attack?
The request in that log line is binary data, which nginx writes with the non-printable bytes escaped as \xNN hex sequences. It is most likely the result of someone making an HTTPS request to a plain HTTP endpoint: the server expects plain-text HTTP but receives encrypted data, which is why you see a bunch of gibberish in the log.
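If you want to check this for a given log entry, one option is to turn the escaped field back into raw bytes and test whether it looks like a plain-text HTTP request line. The sketch below assumes nginx's default access log escaping (`\xNN` for quotes, backslashes and non-printable bytes); both function names are made up for illustration.

```python
import re

# Matches one nginx-style hex escape such as \xF7 inside a logged field.
_HEX_ESCAPE = re.compile(rb"\\x([0-9A-Fa-f]{2})")

# A very rough shape check for "METHOD target HTTP/x.y".
_REQUEST_LINE = re.compile(rb"^[A-Z]+ \S+ HTTP/\d\.\d$")


def decode_nginx_escapes(field: str) -> bytes:
    """Convert a logged request field back into the raw bytes the client sent."""
    raw = field.encode("latin-1")
    return _HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)


def looks_like_http_request_line(data: bytes) -> bool:
    """Return True if the bytes resemble a plain-text HTTP request line."""
    return _REQUEST_LINE.match(data) is not None
```

For the logged request above, the decoded bytes do not form a request line, which is consistent with nginx answering 400.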
I have a small ecommerce site (LEMP stack), and I used a route like
my.domain.com/makecart?return_url=....
as a means of returning to a point in the previous page to assist selection for the cart.
Over a period of months I started getting thousands of GET requests with unwanted external URLs passed in the return_url parameter.
I have now reprogrammed this route without the use of any parameters, but my site is still getting the unwanted hits.
e.g. 76.175.182.173 - - [14/Nov/2018:19:36:08 +0000] "GET /makecart?return_url=http://www.nailartdeltona.com/ HTTP/1.0" 302 364 "http://danielcraig.2bb.ru/click.php
I am redirecting such requests to an error page, and have it 'under control' with fail2ban but I am gradually filling up memory with banning information.
Is there a way to prevent these hits before they are plucked back out of the access log?
Furthermore what are they doing anyway?
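For context, a route that redirects to whatever return_url contains can be used to bounce visitors to third-party sites, which may be why such a parameter attracts this kind of traffic. If a return parameter were ever reintroduced, one common precaution is to accept only same-site relative paths. The following is a minimal sketch of that check, not the site's actual code; `is_safe_return_url` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def is_safe_return_url(value: str) -> bool:
    """Accept only same-site relative paths such as /products?page=2."""
    # Must start with a single slash; "//host/..." is a scheme-relative URL.
    if not value.startswith("/") or value.startswith("//"):
        return False
    # Some browsers treat backslashes like slashes, so reject them outright.
    if "\\" in value:
        return False
    parts = urlsplit(value)
    return parts.scheme == "" and parts.netloc == ""
```

Requests that fail the check can be answered with a fixed fallback page instead of a redirect to the supplied value.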
On a popular WordPress site, I'm getting a constant stream of requests for these paths (where author-name is the first and last name of one of the WordPress users):
GET /author/index.php?author=author-name HTTP/1.1
GET /index.rdf HTTP/1.0
GET /rss HTTP/1.1
The first two URLs don't exist, so the server is constantly returning 404 pages. The third is a redirect to /feed.
I suspect the requests are coming from RSS readers or search engine crawlers, but I don't know why they keep using these specific, nonexistent URLs. I don't link to them anywhere, as far as I can tell.
Does anybody know (1) where this traffic is coming from and (2) how I can stop it?
Check Apache logs to get the "where" part.
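As a rough illustration of what checking the logs could look like, the sketch below counts requests per client IP and user agent for a set of path prefixes. It assumes the common "combined" log format; the regular expression and the name `summarize_sources` are written for this example and may need adjusting to your actual LogFormat.

```python
import re
from collections import Counter

# Assumed layout: Apache/nginx "combined" access log format.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarize_sources(lines, prefixes):
    """Count hits per (client IP, user agent) for paths starting with a prefix."""
    counts = Counter()
    wanted = tuple(prefixes)
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("path").startswith(wanted):
            counts[(match.group("ip"), match.group("agent"))] += 1
    return counts
```

Calling `summarize_sources(open_log_file, ["/author/index.php", "/index.rdf", "/rss"]).most_common(10)` would then list the most frequent sources, and the user agent strings often show whether they are feed readers or crawlers.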
Stopping random internet traffic is hard. You could try serving them a different error code to see whether it stops, but it probably won't.
Most of my sites get these. Most of the time I trace them to Asia or the Americas. Blocking the IP works, but if the requests are few and far between, that is just a waste of resources.
I'm creating a bulletin board system, and now I'm implementing a 'delete topic' feature for admins. If someone opens a deleted topic, the server cannot find it, so the response should be 404. On the other hand, the topic did exist at some point, so it should arguably be 410. Implementing the 410 would require a new table called deleted_topics, and so would require more space. However, I think 410 is better for search engines. What do you think? Should I use 404 or 410?
404 Not found
The server has not found anything matching the Request-URI. No indication is given of whether the condition is temporary or permanent. The 410 (Gone) status code SHOULD be used if the server knows, through some internally configurable mechanism, that an old resource is permanently unavailable and has no forwarding address. This status code is commonly used when the server does not wish to reveal exactly why the request has been refused, or when no other response is applicable.
410 Gone
The requested resource is no longer available at the server and no forwarding address is known. This condition is expected to be considered permanent. Clients with link editing capabilities SHOULD delete references to the Request-URI after user approval. If the server does not know, or has no facility to determine, whether or not the condition is permanent, the status code 404 (Not Found) SHOULD be used instead. This response is cacheable unless indicated otherwise.
The 410 response is primarily intended to assist the task of web maintenance by notifying the recipient that the resource is intentionally unavailable and that the server owners desire that remote links to that resource be removed. Such an event is common for limited-time, promotional services and for resources belonging to individuals no longer working at the server's site. It is not necessary to mark all permanently unavailable resources as "gone" or to keep the mark for any length of time -- that is left to the discretion of the server owner.
Thanks,
Showing a 410 requires a little more effort than a 404, because to know that it's a 410 you need to keep at least a "ghost" of the former page in your database. If this is not a problem for you, I'd consider the 410 "better" and "friendlier" because it presents more information. If you don't want to be hassled with maintaining a graveyard in your database, then 404 is acceptable too, of course.
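To make the "ghost" idea concrete, here is a minimal sketch using SQLite. It assumes a topics table plus the deleted_topics table mentioned in the question; the column layout and the function names are invented for illustration and are not taken from any particular forum software.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the two tables this sketch assumes."""
    conn.execute("CREATE TABLE topics (id INTEGER PRIMARY KEY, title TEXT)")
    # The "graveyard": only the id is kept, so it stays small.
    conn.execute("CREATE TABLE deleted_topics (id INTEGER PRIMARY KEY)")


def delete_topic(conn: sqlite3.Connection, topic_id: int) -> None:
    """Delete a topic and leave a tombstone behind if it existed."""
    with conn:  # commit both statements together, or roll both back
        cur = conn.execute("DELETE FROM topics WHERE id = ?", (topic_id,))
        if cur.rowcount:
            conn.execute(
                "INSERT OR IGNORE INTO deleted_topics (id) VALUES (?)",
                (topic_id,),
            )


def topic_status(conn: sqlite3.Connection, topic_id: int) -> int:
    """Return 200 for a live topic, 410 for a deleted one, 404 otherwise."""
    if conn.execute("SELECT 1 FROM topics WHERE id = ?", (topic_id,)).fetchone():
        return 200
    if conn.execute(
        "SELECT 1 FROM deleted_topics WHERE id = ?", (topic_id,)
    ).fetchone():
        return 410
    return 404
```

Storing only the id keeps the extra space small; a deletion timestamp or a pointer to a related category could be added if the 410 page should offer more context.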
I don't like Alohci's approach of redirecting to a different page. The end result looks like the user ended up on the "input new topic" page (or whatever) by accident. This works, but I think a preferable solution would be to create a custom 410 page (or 404 page, if you don't want to support 410) with specific information for the case at hand. I.e. your 410 shouldn't just say "gone", it should say "this post has been deleted, here's a link to similar posts or a link to create a new post". Your "404" wouldn't have quite as much information available but it could still offer a subset of such information and links.
I guess the "custom 410 page" comes close in appearance to "redirecting with 301" but an important difference is that robotic users of your site (of which there are many!) will get the more accurate status, and know to purge the old link from their crawl index – this will ultimately save them and you some unnecessary traffic.
I think the correct way to do this is to send 410 Gone for some time and then, after a few weeks or months, switch to 404 Not Found. Of course, it is for you to decide whether that is worth the time and effort.
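The time-limited variant could look roughly like this, assuming the deletion time of each topic is stored somewhere. The 90-day window is an arbitrary placeholder for "a few weeks/months", and `status_for_deleted` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

# Assumed retention period; pick whatever "a few weeks/months" means for you.
GONE_WINDOW = timedelta(days=90)


def status_for_deleted(
    deleted_at: datetime, now: datetime, window: timedelta = GONE_WINDOW
) -> int:
    """Return 410 while the deletion is recent, 404 once the window has passed."""
    return 410 if now - deleted_at <= window else 404
```

Once a record has aged past the window, it can also be purged, so the list of deleted topics does not grow without bound.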
Neither. Since you tagged your question 'SEO' I'm assuming you want the best SEO answer. If there are any backlinks (coming from outside sites) to your deleted topic all the 'link juice' will be lost with 404 and 410 status.
Instead you should definitely create some 301 redirects which point to the root of the site, the root of the forum, or a related category. You will thus preserve the link juice and you get to decide which pages of your site will benefit most.