Whitelisting to stop undesirable bots using IIS - asp.net

Basically I want to do this in IIS:
In Apache you can block many bots by simply changing your .htaccess files to OPT-IN instead of OPT-OUT, basically whitelisting instead of blacklisting. You let in Google, Yahoo, MSN, etc. and IE, Opera, Firefox, Netscape and bounce EVERYTHING else by default. The beauty here is you don't have to keep looking for bots anymore as anything that identifies itself as a bot will be bounced.
How do I achieve that in IIS? Can you please point me to an example? Thanks!
references: http://www.spanishseo.org/how-to-identify-user-agents-and-ip-addresses-for-bot-blocking
http://incredibill.blogspot.com/2011/05/whitelisting-not-blacklisting-to-stop.html

There's no native way of doing this in IIS. If you're using ASP.NET, it's easy enough to write an HttpModule that does the filtering, although on anything older than IIS 7 only requests handled by ASP.NET will be filtered.
Outside of that, you're looking at an ISAPI filter, written in C++, Delphi or another language that can compile a native DLL. They're not easy to write either.
I wrote something similar that uses Project Honeypot (http://projecthoneypot.org/) to block spammy IP addresses. You can get it here: http://code.google.com/p/blacklistprotector/
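Whatever hosts the filter (an HttpModule or an ISAPI filter), the opt-in decision itself is small. Here is a minimal sketch of that logic in Python, purely to illustrate the idea; it is not ASP.NET code, and the patterns, `is_allowed` and `status_for` are made-up names rather than a vetted allow-list. Keep in mind that user-agent strings are trivially spoofed, so this only bounces clients that identify themselves honestly.

```python
import re

# Hypothetical allow-list: the patterns are illustrative only.
ALLOWED_PATTERNS = [
    re.compile(r"^Mozilla/\d"),  # most browsers, plus crawlers such as Googlebot
    re.compile(r"^Opera/\d"),    # older Opera releases
]


def is_allowed(user_agent):
    """Opt-in check: only user agents matching a known pattern get through."""
    if not user_agent:
        # A missing or empty User-Agent header is bounced by default.
        return False
    return any(p.search(user_agent) for p in ALLOWED_PATTERNS)


def status_for(user_agent):
    """HTTP status the filter would answer with: 200 to continue, 403 to bounce."""
    return 200 if is_allowed(user_agent) else 403
```

In an HttpModule the same check would presumably run early in the request, before any page code executes.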

Related

app_offline alternative

I usually place an app_offline.htm in my root directory when I am releasing a website to a production environment. However, sometimes when there have been a few big changes to the site, I would like to click around first to make sure it's stable without allowing access to anyone other than me.
As far as I am aware this isn't possible, but I'm hoping someone has a neat solution...
The solution has to include if someone has a deeplink into the site, so using a default.htm/asp page in the root won't do the trick unfortunately.
I agree with the staging environment answer, but otherwise here's one possible approach: temporarily block all IP addresses besides your own. This can be achieved through the IIS Directory Security configuration, or programmatically in any number of ways.
You can redirect all the non-authorized users to an Under Construction page of some sort. Meanwhile, you can happily browse the site from your IP. When the site is vetted, you remove that IP restriction and the site becomes available to the world at large.
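If you go the programmatic route, the decision boils down to something like the sketch below. It is written in Python only to show the logic, not as ASP.NET code; the address, the page name and `route` are placeholders I made up. Because the check runs on every request, deep links are covered as well.

```python
import ipaddress

# Hypothetical values: replace with your own address and maintenance page.
ALLOWED_NETWORKS = [ipaddress.ip_network("203.0.113.7/32")]
MAINTENANCE_PAGE = "/under-construction.htm"


def route(client_ip, requested_path):
    """Return the path to serve: the requested one for allowed IPs,
    the maintenance page for everyone else."""
    addr = ipaddress.ip_address(client_ip)
    if any(addr in net for net in ALLOWED_NETWORKS):
        return requested_path
    return MAINTENANCE_PAGE
```

Removing the restriction is then just a matter of dropping the check (or emptying the list and inverting the default) once the site is vetted.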
It's a difficult thing to achieve. That's why you should have a staging environment where everything should be validated before shipping into production. Then during the deployment process (if it takes long, but it shouldn't) you could use an App_Offline file. This staging environment should be as close as possible to your production environment (in terms of software, patches and configurations installed, not in terms of hardware power of course).
Another quick suggestion that would let you control things from the web.config: a custom module that redirects all requests to a static page, except those matched by a filter (e.g. hostname or URL sniffing) configured in the web.config.

My hosting is messing up my urls

Usually when I get the URL of a request I use Request.RawUrl.
This gives /default.aspx for example.
However, recently my host changed something and now the name of the application directory is included as well, so I get /appdirname/default.aspx.
Now why does it give me the directory of the application? It looks as if my website is a subapplication of another website. So when you go to mydomain.com the RawUrl will be:
/appdirname/default.aspx
I believe each domain has its own website defined in IIS, or am I mistaken?
I am not asking for a workaround, which should be pretty straightforward. I am asking why this is happening and how: what kind of IIS setup causes this to happen?
PS.
And the worst part is I had this issue with GoDaddy and I was happy my host didn't have it, but now both hosts have the same problem.
The Request.RawUrl property returns everything after the host (and port) part of the URL, so if your full URL is:
http://www.yourdomain.com:8080/directory/Page.aspx
then the property will return
/directory/Page.aspx
That's all it does. That's all it claims to do. As you say, your hosting provider must have changed something, which is very naughty, and the workaround should be easy. There is a good chance that they have introduced some kind of url redirection, but the best way to find out is to get in touch with their helpdesk and ask them what is happening. I find that most successful hosting companies tend to respond in good time to this kind of question. Otherwise they tend to become formerly-successful hosting companies.
Rick Strahl has this to say about it: http://www.west-wind.com/weblog/posts/132081.aspx
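To make the behaviour concrete, here is a rough Python sketch (not ASP.NET code) of what the raw URL contains and of the straightforward workaround of stripping the application directory. `raw_url` and `app_relative` are names I made up for illustration, and the comparison is case-sensitive here, which is a simplification.

```python
from urllib.parse import urlsplit


def raw_url(full_url):
    """Everything after the host and port: the path plus any query string."""
    parts = urlsplit(full_url)
    return parts.path + ("?" + parts.query if parts.query else "")


def app_relative(raw, app_path):
    """Strip the application directory (e.g. "/appdirname") from a raw URL."""
    prefix = app_path.rstrip("/")
    if prefix and (raw == prefix or raw.startswith(prefix + "/")):
        return raw[len(prefix):] or "/"
    return raw
```

For example, `app_relative("/appdirname/default.aspx", "/appdirname")` gives back `/default.aspx`, while a site running at the root is left untouched.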

ASP.Net reverse proxy, what to do with external resources?

I'm currently working on a concept for a reverse proxy to basically relay responses and requests between the user and an otherwise invisible website. So basically the user goes to a site, let's say www.myproxyportal.com, where it is possible to access a website (in an iframe) in the webserver's intranet which isn't made public (for example internal.myproxyportal.com).
I've been working on a solution where I translate request objects to the desired location and return that response to the website. Works great, except for stuff like CSS links, IMG's, etc. I can do the translation of course, but then the link would go to internal.myproxyportal.com/css/style.css and this will never work from the outside.
How to approach such a thing?
Are there any out of the box solutions maybe?
EDIT: I found this, which is very similar to what I have written so far, but it also lacks support for external images, css, javascript, etc.
You can change the settings in IIS to route all requests through the ASP.NET pipeline, not just .aspx pages. Then simply create an HttpHandler to handle those in your proxy.
By default, IIS doesn't run "static" content requests through the ASP.NET engine.
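Once static requests also reach your handler, the remaining piece is pointing the CSS, image and script links in the relayed HTML back at the proxy rather than at the internal host. Below is a rough sketch of that rewriting step, in Python just to show the idea (not ASP.NET code). The `/proxy` prefix and `rewrite_links` are assumptions of mine, and a regex is a shortcut here: a real implementation would probably want a proper HTML parser and would also need to handle URLs inside CSS and JavaScript.

```python
import re

INTERNAL_HOST = "internal.myproxyportal.com"
# Hypothetical public prefix under which the proxy handler is mapped.
PUBLIC_PREFIX = "http://www.myproxyportal.com/proxy"

# Matches href="..." and src="..." attributes with single or double quotes.
_ATTR = re.compile(r'\b(href|src)=(["\'])(.*?)\2', re.IGNORECASE)
# Matches the scheme-and-host part of URLs that point at the internal site.
_INTERNAL = re.compile(
    r"^(?:https?:)?//" + re.escape(INTERNAL_HOST) + r"(?=/|$)", re.IGNORECASE
)


def rewrite_links(html):
    """Point internal and root-relative links at the public proxy prefix."""
    def fix(match):
        attr, quote, url = match.groups()
        if _INTERNAL.match(url):
            url = _INTERNAL.sub(PUBLIC_PREFIX, url, count=1)
        elif url.startswith("/") and not url.startswith("//"):
            url = PUBLIC_PREFIX + url
        return f"{attr}={quote}{url}{quote}"
    return _ATTR.sub(fix, html)
```

Links to genuinely external sites are left alone, so only resources that live on the internal server get relayed.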
Apache has a pretty slick reverse proxy built-in, I use it extensively.
See more here: http://www.apachetutor.org/admin/reverseproxies

ASP.NET Friendly URLs

In my research, I found 2 ways to do them.
Both required modifications to the Application_BeginRequest procedure in the Global.Asax, where you would run your code to do the actual URL mapping (mine was with a database view that contained all the friendly URLs and their mapped 'real' URLs). Now the trick is to get your requests run through the .NET engine without an aspx extension. The 2 ways I found are:
1. Run everything through the .NET engine with a wildcard application extension mapping.
2. Create a custom aspx error page and tell IIS to send 404's to it.
Now here's my question:
Is there any reason one of these is better than the other?
When playing around on my dev server, the first thing I noticed about #1 was it botched frontpage extensions, not a huge deal but that's how I'm used to connecting to my sites. Another issue I have with #1 is that even though my hosting company is lenient with me (as I'm their biggest client) and will consider doing things such as this, they are wary of any security risks it might present.
#2 works great, but I just have this feeling it's not as efficient as #1. Am I just being delusional?
Thanks
I've used #2 in the past too.
It's more efficient because unlike the wildcard mapping, the ASP.NET engine doesn't need to 'process' requests for all the additional resources like image files, static HTML, CSS, Javascript etc.
Alternatively if you don't mind .aspx extension in your URL's you could use: http://myweb/app/idx.aspx/products/1 - that works fine.
Having said that, the real solution is using IIS 7, where the ASP.NET runtime is a fully fledged part of the IIS HTTP module stack.
If you have the latest version of IIS there is a rewrite module for it - see here. If not, there are free third-party binaries you can use with older IIS (i.e. version 6) - I have used one that reads the rewrite rules from an .ini file and supports regular expressions, but I can't remember its name, sorry (it's possibly this). I'd recommend this over cheaping out with the 404 page.
You have to map all requests through the ASP.NET engine. The way IIS processes requests is by the file extension. By default it only processes the .aspx, .ashx, etc extensions that are meant to only be processed by ASP.NET. The reason is it adds overhead to the processing of the request.
I wrote how to do it with IIS 6 a while back, http://professionalaspnet.com/archive/2007/07/27/Configure-IIS-for-Wildcard-Extensions-in-ASP.NET.aspx.
You are right in doing your mapping from the database. RegEx rewriting, like that used out of the box in MVC, is less suitable. This is because it more or less forces you to put the primary key in the URL and does not have a good way to map characters that are not allowed in URLs, like '.
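The lookup itself is simple whichever way the request reaches your code. Here is a minimal sketch in Python, purely to show the logic (not ASP.NET code); the dictionary stands in for the database view, and its entries and `resolve` are invented for illustration.

```python
# Hypothetical stand-in for the database view of friendly -> real URLs.
FRIENDLY_URLS = {
    "/products/blue-widget": "/product.aspx?id=1",
    "/about-us": "/page.aspx?id=7",
}


def resolve(path):
    """Return the real URL for a friendly one, or the path unchanged if unmapped."""
    # Normalise case and a trailing slash before the lookup.
    key = path.rstrip("/").lower() or "/"
    return FRIENDLY_URLS.get(key, path)
```

Anything not in the table (images, CSS, real .aspx pages) falls through untouched, which is roughly what the Application_BeginRequest code would do before rewriting the path.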
Have you checked out the ASP.NET MVC Framework? With that framework all your URLs are automatically mapped to controllers, which can perform any desired action (including redirecting to other URLs or controllers). You can also set up custom routes with custom parameters. If you haven't seen it yet, it may be worth a look.

Why does ASP.NET framework add the 'X-Powered-By:ASP.NET' HTTP Header in responses?

I am just curious to know if there is a specific reason why the .Net Framework adds the 'X-Powered-By:ASP.NET' Http Header in its responses? Do other web servers (Apache, httpd) do the same thing?
EDIT: I know that it can be changed. I want to know if there is a reason to keep it or leave it as it is?
I know that PHP does this. I guess there is no real purpose, other than marketing and making it easier for script kiddies to find suitable victims. For PHP it's better to disable the flag entirely since it shows the PHP version and therefore makes the server more vulnerable to attacks.
Edit: Who knows, it might also lead to better search results on bing... ;-)
It is a default custom header when using IIS. It is a setting in IIS, you can change it if you wish.
Using IIS6 -
Click on the HTTP Headers tab
You can edit or remove the header in the Custom HTTP Headers box.
It is probably there so that sites like Netcraft can pull together statistics for the number of servers running IIS and ASP.NET. This used to be considered an important thing when .NET was released. By stating that n number of sites started using ASP.NET Microsoft could provide metrics for companies that only adopt technology based on the number of other users out there.
I don't believe there is a strong technical reason for having it since a PHP app could imitate an ASP.NET application, by setting the same header in Apache. I could imagine some naive client applications like FrontPage 2003, or SharePoint Designer might use headers like this to validate that they are indeed connecting to an ASP.NET enabled site but that is speculation on my part.
It is fairly common to see a signature for the server/executing engine sent with the headers of a page whether you're running Apache and PHP or IIS and ASP.NET. Just acts as some free publicity, I suppose.
"X-Powered-By:" isn't a standard header, but "Server: " is (and it clearly serves the same purpose).
In a world of SaaS and Cloud services, Web frameworks are 'strategic' assets, and every little piece of real estate is avidly conquered... sometimes the cheating way.
Tomcat, Apache, WebSphere, JBoss, you name it..
Apparently, it's not actually a standard HTTP header field.
If "why" is meant in the sense of "how to change it": go to the IIS properties of your site, open the "HTTP Headers" tab and edit the Custom HTTP Header.

Resources