I have a WordPress site that I manage. I recently received a Qualys vulnerability security scan (non-authenticated scan) that has a large number of "Path Based Vulnerability" findings. Almost all of the paths listed follow this format:
https://www.example.com/search/SomeString
https://www.example.com/search/1/feed/rss2
Some examples include:
https://www.example.com/search/errors
https://www.example.com/search/admin
https://www.example.com/search/bin
When I go to these URLs, I get an appropriate search page response stating, for example, "Search for Admin produced no results".
But if I go to https://www.example.com/search/ without a string parameter, I get a 404 error (custom error page) stating the page could not be found. All of this works the way I would expect. No sensitive data or pages are being exposed.
An example of the Qualys finding is:
150004 Path-Based Vulnerability
URL: https://www.example.com/search/1/feed/rss2/
Finding #: 8346060 (130736429)
Severity: Confirmed Vulnerability - Level 2
Unique #: redacted
Group: Path Disclosure
Detection Date: 22 Mar 2021 18:16 GMT-0400
CWE: CWE-22
OWASP: A5 Broken Access Control
WASC: WASC-15 Application Misconfiguration, WASC-16 Directory Indexing, WASC-17 Improper Filesystem Permissions
CVSS V3 Base: 5.3
CVSS V3 Temporal: 5
CVSS V3 Attack Vector: Network
Details
Threat: A potentially sensitive file, directory, or directory listing was discovered on the Web server.
Impact: The contents of this file or directory may disclose sensitive information.
Solution: Verify that access to this file or directory is permitted. If necessary, remove it or apply access controls to it.
Detection Information
Parameter: No param has been required for detecting the information.
Authentication: In order to detect this vulnerability, no authentication has been required.
Access Path: Here is the path followed by the scanner to reach the exploitable URL:
https://www.example.com
https://www.example.com/?s=1
Payloads
#1
#1 Request
GET https://www.example.com/search/tools/
Referer: https://www.example.com
Cookie: [removed in case its sensitive];
caosLocalGa= [removed in case its sensitive];
Host: https://www.example.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.1.1
Safari/605.1.15
Accept: */*
Based on the findings, this seems to be a false positive. But my CIO insists that I prove it as such. First, is there any documentation on this that might be helpful? Second, does anyone know of any updates to WP that could hide/remove these findings?
(I'd comment, but my rep isn't high enough.)
I can partially answer this, as I am fighting the same battle right now with a different web app. If you run the request in a browser with the developer tools open, I'll bet you'll see that the response code from the server is 200 even though it is actually doing a redirect.
The scanner sees that the response code is OK and, based on that, assumes the request succeeded as-is when it really didn't. You have to return a different response code when doing a "silent" redirect.
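If you want something concrete to show your CIO, one option is to request a flagged URL yourself and record the raw status code. Below is a minimal Java sketch (just an illustration, using java.net.HttpURLConnection and the URL from the Qualys payload - swap in your own host) with redirect-following turned off, so you see exactly what the server returns rather than where the browser ends up:
import java.net.HttpURLConnection;
import java.net.URL;

public class CheckStatus {
    public static void main(String[] args) throws Exception {
        // URL taken from the Qualys payload above; substitute your own host/path.
        URL url = new URL("https://www.example.com/search/tools/");

        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        // Don't follow redirects, so a 301/302 "silent" redirect is reported
        // as-is instead of being replaced by the status of the final page.
        conn.setInstanceFollowRedirects(false);

        System.out.println("Status: " + conn.getResponseCode() + " " + conn.getResponseMessage());
        String location = conn.getHeaderField("Location");
        if (location != null) {
            System.out.println("Redirects to: " + location);
        }
    }
}
If the output is a plain 200 with the "no results" search page, that supports the point above: the scanner is treating an ordinary 200 response as evidence that a sensitive path exists.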
Related
I've recently deployed a public website and, looking at the nginx access logs, I see hackers trying to access various PHP admin pages (which is fine, I don't use PHP), but I also see requests like this:
85.239.221.75 - - [27/Dec/2019:14:52:42 +0000] "k\xF7\xE9Y\xD3\x06)\xCF\xA92N\xC7&\xC4Oq\x93\xDF#\xBF\x88:\xA9\x97\xC0N\xAC\xFE>)9>\x0Cs\xC1\x96RB,\xE1\xE2\x16\xB9\xD1_Z-H\x16\x08\xC8\xAA\xAF?\xFB4\x91%\xD9\xDD\x15\x16\x8E\xAB\xF5\xA6'!\xF8\xBB\xFBBx\x85\xD9\x8E\xC9\x22\x176\xF0E\x8A\xCDO\xD1\x1EnW\xEB\xA3D|.\xAC\x1FB\xC9\xFD\x89a\x88\x93m\x11\xEB\xE7\xA9\xC0\xC3T\xC5\xAEF\xF7\x8F\x9E\xF7j\x03l\x96\x92t c\xE4\xB5\x10\x1EqV\x0C5\xF8=\xEE\xA2n\x98\xB4" 400 182 "-" "-"
What is this hacker sending and what are they trying to do? And what should I do to stay ahead of this type of attack?
The data you are seeing is hex-encoded binary. It most likely shows up because a client made an HTTPS request against an endpoint that only speaks plain HTTP. HTTP expects plain-text data, but the client is sending encrypted TLS bytes, which is why you see a bunch of gibberish in that log line.
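If you want to inspect what was actually sent, here is a rough Java sketch (purely illustrative) that turns nginx's \xNN escapes back into raw bytes so you can examine them as a hex dump; the string below is just a short prefix of the log line above:
public class DecodeLogBytes {
    public static void main(String[] args) {
        // Short prefix of the logged request line; nginx escapes
        // non-printable bytes as \xNN in the access log.
        String logged = "k\\xF7\\xE9Y\\xD3\\x06)\\xCF\\xA92N\\xC7";

        StringBuilder hexDump = new StringBuilder();
        int i = 0;
        while (i < logged.length()) {
            int b;
            if (logged.startsWith("\\x", i)) {
                // Convert the two hex digits after "\x" into a byte value.
                b = Integer.parseInt(logged.substring(i + 2, i + 4), 16);
                i += 4;
            } else {
                // Printable characters were logged literally.
                b = logged.charAt(i) & 0xFF;
                i += 1;
            }
            hexDump.append(String.format("%02x ", b));
        }
        // Prints e.g. "6b f7 e9 59 d3 06 ..." for inspection.
        System.out.println(hexDump);
    }
}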
Recently I have seen a huge increase in referral traffic in GA that comes from spammy domains like bidvertiser . com, easyhits4u . com or trafficswirl . com. These are badly skewing the data in GA, triggering a sudden decrease in conversion rate and rendering the data unusable.
You can easily see which referrals are bad because they share a few characteristics:
high bounce rate
low time spent on pages (even fewer pageviews per user)
0 conversions (if you measure such a thing)
Looking in the logs I found lines like this
52.33.56.250 - - [10/May/2017:08:39:05 +0000] "GET / HTTP/1.0" 200 18631 "http://ptp4all.com/ptp/promote.aspx?id=628" "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; WOW64; Trident/6.0; .NET4.0E; .NET4.0C; .NET CLR 3.5.30729; .NET CLR 2.0.50727; .NET CLR 3.0.30729; MALCJS)"
74.73.253.77 - - [10/May/2017:08:39:05 +0000] "GET / HTTP/1.0" 200 18631 "http://secure.bidvertiser.com/performance/bdv_rd.dbm?enparms2=7523,1871496,2463272,7474,7474,8973,7684,0,0,7478,0,1870757,475406,91376,112463629579,78645910,nlx.lwlgwre&ioa=0&ncm=1&bd_ref_v=www.bidvertiser.com&TREF=1&WIN_NAME=&Category=1000&ownid=627368&u_agnt=&skter=vgzouvw%2B462c%2B40v10h%2Bghru%2Bmlir%2Bhoveizn%2Bsxgzd&skwdb=ooz_wvvu" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
How to handle this?
There are 2 main things that you need to do:
1. Server level - you must block the spammy requests from the beginning.
I thought it would be best to prepare dynamic filters that block requests from the specific IPs generating the spammy traffic.
I am using fail2ban for this purpose, but there is no rule that will do this out of the box. First you need to create a new jail filter (I am using Plesk, so here is how to do that in Plesk: https://docs.plesk.com/en-US/onyx/administrator-guide/server-administration/protection-against-brute-force-attacks-fail2ban/fail2ban-jails-management.73382/). For those who do not use Plesk and work over SSH, you can have a look here: https://www.fail2ban.org/wiki/index.php/MANUAL_0_8
The definition of a jail is like this:
[Definition]
failregex =
ignoreregex =
Be sure to include ignoreregex as well otherwise it will not save.
After that, search your access log for the domains you see in Google Analytics. You will find a lot of requests like the one above.
Once you identify the domain you need to add rules like this:
failregex = <HOST>.+bidvertiser\.com
<HOST>.+easyhits4u\.com
<HOST> - a keyword that tells fail2ban to capture the IP address from the log line.
Please note the ".+" - this lets fail2ban skip any text until it finds the domain you are looking for on that line.
bidvertiser.com - the domain causing the trouble, with the "." escaped by "\".
Each new line (new domain) must start with a TAB character before the rule, otherwise it will not save.
My rule looks like this:
[Definition]
failregex = <HOST>.+bidvertiser\.com
<HOST>.+easyhits4u\.com
<HOST>.+sitexplosion\.com
<HOST>.+ptp4all\.com
<HOST>.+trafficswirl\.com
<HOST>.+bdv_rd\.dbm
ignoreregex =
You can see the bdv_rd\.dbm entry. That is not a domain but a script they used to produce the spam, so it would be easy for them to change the domain and keep using the same script. Adding it gives an extra layer of filtering, because fail2ban will match any line containing that string.
Note 1: be sure you do not interfere with your own website's URLs, because that would block legitimate users and you do not want that.
Note 2: you can test your regex over SSH like this:
:# fail2ban-regex path/to/log/access_log "<HOST>.+bidvertiser\.com"
This should produce the following output:
Running tests
=============
Use failregex line : <HOST>.+bidvertiser\.com
Use log file : access_log
Use encoding : UTF-8
Results
=======
Failregex: 925 total
|- #) [# of hits] regular expression
| 1) [925] <HOST>.+bidvertiser\.com
`-
Ignoreregex: 0 total
Date template hits:
|- [# of hits] date format
| [4326] Day(?P<_sep>[-/])MON(?P=_sep)Year[ :]?24hour:Minute:Second(?:\.Microseconds)?(?: Zone offset)?
`-
Lines: 4326 lines, 0 ignored, 925 matched, 3401 missed
[processed in 3.14 sec]
Missed line(s): too many to print. Use --print-all-missed to print all 3401 lines
This means your filter found 925 requests matching that domain (a lot, if you ask me), which translates into 925 hits from the referral bidvertiser.com in your Google Analytics.
You can verify this by downloading the log and searching it with a tool like Notepad++.
Now that your definition is ready you should add a jail and a rule.
I use the definition above with the action to block all ports for that IP for 24 hours.
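For reference, a jail entry along these lines should work - this is only a sketch, and the filter name, log path and ban time are assumptions you will need to adapt (the filter name must match the file you saved the [Definition] above into, e.g. filter.d/referral-spam.conf):
[referral-spam]
enabled  = true
# must match the filter file that contains the [Definition] above
filter   = referral-spam
# path to the access log of the affected site (adjust for your server)
logpath  = /var/www/vhosts/example.com/logs/access_log
# ban on the first matching request
maxretry = 1
# block all ports for 24 hours (86400 seconds)
bantime  = 86400
action   = iptables-allports[name=referral-spam]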
Within just a few hours of installing this, I had close to 850 blocked IPs. Some are in the Amazon AWS network, so I filed an abuse complaint here: https://aws.amazon.com/forms/report-abuse .
You can use this service https://ipinfo.io/ to find the owner of an IP.
2. Google Analytics level
Here you have a few options that I will not describe, because this is not the place and there are well-written resources on the subject:
https://moz.com/blog/how-to-stop-spam-bots-from-ruining-your-analytics-referral-data
https://www.optimizesmart.com/geek-guide-removing-referrer-spam-google-analytics/
A few notes:
These guides use .htaccess blocking in some places. That is an option as well, but I did not use it here because my filters also match script names, not only domains.
Fail2Ban uses iptables to block any further request from these IPs, not only on the http/https ports.
The first request will always get through and create 1 hit in Analytics; then, depending on whether the script keeps accessing your website, another hit once the ban expires.
You can use the recidive filter to permanently ban repeat offenders (see the sketch after these notes): https://wiki.meurisse.org/wiki/Fail2Ban#Recidive
The Analytics filters will not filter out historic data.
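For the recidive approach, a jail roughly like the following is the usual shape - again only a sketch, and the ban length is an assumption (fail2ban ships a recidive filter that watches its own log for repeat offenders; on newer versions bantime = -1 makes the ban permanent, otherwise pick a long value):
[recidive]
enabled   = true
# fail2ban's own log, where previous bans are recorded
logpath   = /var/log/fail2ban.log
banaction = iptables-allports
# an IP banned 3 times within a day gets a much longer ban
findtime  = 86400
maxretry  = 3
# one week; use -1 for a permanent ban if your fail2ban version supports it
bantime   = 604800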
I've been trying to connect to the REST API of Woocommerce (using HTTP Basic Auth) but fail to do so.
I'm probably doing stuff wrong (first-timer with REST APIs), but here is what I've been doing:
I'm using a GET with an url consisting of: https://example.com/wc-api/v2/
I'm using an Authorization header with the consumer key and secret base64 encoded
I've enabled the REST API in the WooCommerce settings and enabled secure checkout. I've also put some products in the shop. But whenever I try to request the URL described above, the connection is simply refused.
I do not receive an error, but it looks like the page cannot even be reached. Can someone help me out?
I've followed the docs (http://woothemes.github.io/woocommerce-rest-api-docs/#requestsresponses) up to the Authentication-section, but that's where I've been stuck up till now.
The complete url I'm using is:
http://[MYDOMAIN]/wc-api/v2/orders
With the HTTP-header looking like:
GET /wc-api/v2/ HTTP/1.1
Authorization: Basic [BASE64 encoded_key:BASE64 encoded_secret]
Host: [MYDOMAIN]
Connection: close
User-Agent: Paw/2.1.1 (Macintosh; OS X/10.10.2) GCDHTTPRequest
Then, after I run the request, I get the following: [screenshot omitted - the connection fails and no HTTP response comes back]
Given the screenshot that you posted, it seems that the server is not responding on HTTPS. So you'll need to configure your webserver to respond to HTTPS requests, and to do that you'll need to install an SSL certificate.
You can either generate one yourself, which is free but won't work for the general public, or you can buy one - most domain registrars and hosts will let you buy a certificate, and they usually start at around $50 per year.
I'm using a GET with an url consisting of: https://example.com/wc-api/v2/
In this example, you're using HTTPS. Is that where you're trying to connect?
I highly recommend going straight to the HTTPS connection; it's a thousand times easier to accomplish. Documentation for authenticating over HTTPS can be found in the REST API docs linked in the question - follow the directions for "OVER HTTPS". From there you can use something like Postman to test, if you'd like.
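If it helps as a sanity check once HTTPS is working, here is a rough Java sketch of the same request (the endpoint is the one from the question; the consumer key and secret are placeholders). The important detail is that Basic auth wants one base64 string built from key:secret, not the key and secret encoded separately:
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class WooOrdersCheck {
    public static void main(String[] args) throws Exception {
        // Placeholders - substitute your own consumer key and secret.
        String consumerKey = "ck_xxxxxxxx";
        String consumerSecret = "cs_xxxxxxxx";

        // Basic auth: base64 of "key:secret" as a single string.
        String credentials = Base64.getEncoder().encodeToString(
                (consumerKey + ":" + consumerSecret).getBytes(StandardCharsets.UTF_8));

        // Endpoint from the question; this only works once the server answers on HTTPS.
        URL url = new URL("https://example.com/wc-api/v2/orders");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("Authorization", "Basic " + credentials);

        int status = conn.getResponseCode();
        System.out.println("Status: " + status);

        // Read the success body, or the error body if the server returned 4xx/5xx.
        InputStream body = (status < 400) ? conn.getInputStream() : conn.getErrorStream();
        if (body == null) {
            return;  // no response body to read
        }
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(body, StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
If this fails before any status code is printed, the problem is the connection itself (no HTTPS listener on the server), not the authentication.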
Exception in thread "main" org.jsoup.HttpStatusException: HTTP error fetching URL. Status=403, URL=(site)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:449)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:465)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:424)
at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:178)
at org.jsoup.helper.HttpConnection.get(HttpConnection.java:167)
at plan.URLReader.main(URLReader.java:21)
Hello all!
I have been looking up a way to read a directory on a website of mine for an application I'm developing.
I can read the files themselves and work with them if I hardcode it, but if I try to grab the list of files from the directory I get this error.
I've tried a few ways, but this is the code I am currently working with.
String url = ""//(removed site for privacy);
print("Fetching %s...", url);
Document doc = Jsoup.connect(url).userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36").get();
Elements links = doc.select("a[href]");
Elements media = doc.select("[src]");
Elements imports = doc.select("link[href]");
...
...
...
Now, if I use the main site, e.g. www.google.com/, it reads the links. The problem is that I want a directory, e.g. www.google.com/something/something/..., and when I try that for my site I get this error.
Any idea why I can access my main site, but not directories within it?
I also notice that '/' is needed at the end.
Just curious whether I am missing something, or whether I need to do this another way?
Thank you for your time.
String mylink = "http://www.imdb.com/search/title?genres=action";
Connection connection = Jsoup.connect(mylink);
connection.userAgent("Mozilla/5.0");
Document doc = connection.get();
//Elements elements = doc.body().select("tr.even detailed");
Elements elements = doc.getElementsByClass("results");
System.out.println(elements.toString());
This is likely a problem with (or deliberate attempt to block access using) the server's configuration, not your application. From the tag wiki excerpt for the http-status-code-403 tag:
The 403 or "Forbidden" error message is a HTTP standard response code indicating that the request was legal and understood but the server refuses to respond to the request.
From the tag wiki itself:
A 403 Forbidden may be returned by a web server due to an authorization issue or other constraint related to the request. File permissions, lack of encryption, and maximum number of users reached (among others) can all be the cause of a 403 response.
If the target site is attempting to block screen-scraping, another possibility is an unrecognized user-agent string, but you're setting the user-agent string to one (I presume) you've obtained from an actual browser, so that shouldn't be the cause.
It's not clear from your question if you expect to fetch a regular (HTML) web page, or a special "directory listing" page generated by the server when an index.html is not present in a directory. If it's the latter, note that many servers have these listings disabled to avoid leaking the names of files in the directory that aren't linked to from the web site itself. Again, this is a server configuration issue, not something your application can work around.
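If it would help to see exactly what the server sends back with the 403, one option (just a sketch, with a hypothetical directory URL you should replace with your own) is to tell Jsoup not to throw on HTTP errors, so you can inspect the status line and any error page:
import org.jsoup.Connection;
import org.jsoup.Jsoup;

public class DirectoryStatusCheck {
    public static void main(String[] args) throws Exception {
        // Hypothetical directory URL - replace with the one that fails for you.
        String url = "https://www.example.com/something/something/";

        // execute() with ignoreHttpErrors(true) returns the response even on 403,
        // instead of throwing HttpStatusException like get() does.
        Connection.Response res = Jsoup.connect(url)
                .userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36")
                .ignoreHttpErrors(true)
                .execute();

        System.out.println("Status: " + res.statusCode() + " " + res.statusMessage());
        // The body of the 403 page often says whether it's an auth issue,
        // a disabled directory listing, or a bot-blocking rule.
        System.out.println(res.body());
    }
}
The response headers (res.headers()) can also hint at whether the server itself or some security layer in front of it is doing the blocking.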
One other possible reason is that your Java code does not have direct access to external websites and needs to go through a proxy to connect:
System.setProperty("http.proxyHost", "<<proxy host>>");
System.setProperty("http.proxyPort", "<<proxy port>>");
Recently I put some hidden links in a web site in order to trap web crawlers. (I used the CSS visibility: hidden style so that human users would not access them.)
Anyway, I found that there were plenty of HTTP requests, carrying ordinary browser user-agent strings, which accessed the hidden links.
E.g : "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.64 Safari/537.31"
So now my problems are:
(1) Are these web crawlers? or else what can be?
(2) Are they malicious?
(3) Is there a way to profile their behaviour?
I searched the web but couldn't find any valuable information. Can you please point me to some resources? Any help would be appreciated.
That is an HTTP user-agent string, and it is not malicious in itself. It follows the standard pattern, for example Mozilla/<version> and so on; a browser is one kind of user agent. However, user-agent strings can be used (and spoofed) by attackers, and that can be identified by looking for anomalies. You can read this paper.
The Hypertext Transfer Protocol (HTTP) identifies the client software
originating the request, using a "User-Agent" header, even when the
client is not operated by a user.
The answers to your questions, in order:
(1) They are not web crawlers as such; they are user agents - a common term among web developers.
(2) Generally they aren't malicious, but they can be; as I suggested, have a look at the paper.
(3) I don't understand what you mean by profiling their behaviour - they aren't malware!