How to tell if a Request is coming from a Proxy? - asp.net

Is it possible to detect if an incoming request is being made through a proxy server? If a web application "bans" users via IP address, they could bypass this by using a proxy server. That is just one reason to block these requests. How can this be achieved?

IMHO there's no 100% reliable way to achieve this, but the presence of any of the following headers is a strong indication that the request was routed through a proxy server:
via:
forwarded:
x-forwarded-for:
client-ip:
You could also look for "proxy" or "pxy" in the client's domain name.
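A minimal sketch of such a check in ASP.NET (VB.NET, System.Web assumed; the function name LooksLikeProxy is my own). It tests the headers listed above and does a reverse-DNS lookup for "proxy"/"pxy"; a hit is a hint, not proof:

Private Function LooksLikeProxy(request As HttpRequest) As Boolean
    ' Any of these request headers is a strong hint of an intermediary.
    Dim proxyHeaders() As String = {"Via", "Forwarded", "X-Forwarded-For", "Client-IP"}
    For Each name As String In proxyHeaders
        If Not String.IsNullOrEmpty(request.Headers(name)) Then Return True
    Next
    Try
        ' Reverse-DNS the client address and look for "proxy" or "pxy" in the name.
        Dim hostName As String = System.Net.Dns.GetHostEntry(request.UserHostAddress).HostName
        If hostName.IndexOf("proxy", StringComparison.OrdinalIgnoreCase) >= 0 _
           OrElse hostName.IndexOf("pxy", StringComparison.OrdinalIgnoreCase) >= 0 Then
            Return True
        End If
    Catch ex As System.Net.Sockets.SocketException
        ' No reverse-DNS entry for this address; inconclusive either way.
    End Try
    Return False
End Function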

If a proxy server is set up properly to avoid detection, you won't be able to tell.
Most proxy servers supply the headers others have mentioned, but those are absent on proxies meant to completely hide the user.
You will need to combine several detection methods, such as cookies, proxy-header detection, and perhaps IP heuristics, to catch such cases. Check out http://www.osix.net/modules/article/?id=765 for some information on this situation. Also consider using a proxy blacklist; they are published by many organizations.
However, nothing is 100% certain. The tactics above will catch most simple cases, but at the end of the day it's merely a series of packets forming a TCP/IP transaction, and TCP/IP was not designed with today's ideas about security and authentication in mind.
Keep in mind that many corporations deploy company-wide proxies for various reasons, so if you block proxies as a general rule you necessarily limit your audience, and that may not always be desirable. Since these corporate proxies usually announce themselves with the appropriate headers, you may end up blocking legitimate users rather than the users who are good at hiding themselves.
-Adam

Did a bit of digging on this after a copy of my domain got hosted on Google's AppSpot.com with nice hardcore porn ads injected into it (thanks, Google).
Taking a leaf from this htaccess idea, I'm doing the following, which seems to be working. I added a specific rule for AppSpot, which injects an HTTP_X_APPENGINE_COUNTRY server variable.
Dim varys As New List(Of String)
' These are server-variable names (request headers show up in the
' ServerVariables collection with an HTTP_ prefix), so they must be
' read from ServerVariables rather than Headers.
varys.Add("HTTP_VIA")
varys.Add("HTTP_FORWARDED")
varys.Add("HTTP_USERAGENT_VIA")
varys.Add("HTTP_X_FORWARDED_FOR")
varys.Add("HTTP_PROXY_CONNECTION")
varys.Add("HTTP_XPROXY_CONNECTION")
varys.Add("HTTP_PC_REMOTE_ADDR")
varys.Add("HTTP_CLIENT_IP")
varys.Add("HTTP_X_APPENGINE_COUNTRY")
For Each vary As String In varys
    If Not String.IsNullOrEmpty(HttpContext.Current.Request.ServerVariables(vary)) Then
        ' Redirect ends the response by default, so the first hit wins.
        HttpContext.Current.Response.Redirect("http://www.your-real-domain.com")
    End If
Next

You can look for these headers in the Request object and decide accordingly whether the request came via a proxy or not:
1) Via
2) X-Forwarded-For
Note that this is not a 100% sure-shot trick; it depends on whether these proxy servers choose to add the above headers.

Related

HTTP Response Header to identify actual server that responded to request

I was about to add an HTTP header to all responses in our web application that would identify which physical node behind our load balancer has serviced the request.
I thought maybe there's a standard (or de facto standard) header that has been traditionally used for this purpose.
Is there?
One potential response header you could use is the "Server" header. Per RFC 2616, it identifies the software (and any sub-products) used to handle the request, and it should not be modified by any proxies or load balancers between the server and the client.
Typically this header carries information some consider sensitive (the name and version number of the HTTP server), and many suggest removing it entirely to improve security (see this Stack Overflow question about someone doing this).
You could almost view this as killing two birds with one stone: it gives you a way of identifying the server that processed each request, and it lightly obfuscates the real server software from attackers, satisfying certain security concerns (though, FWIW, I'm not convinced of the value of this from a security perspective, but it's worth mentioning).
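If you'd rather not repurpose Server, a common convention (not any standard) is a custom response header stamped with the machine name. A minimal Global.asax sketch, assuming ASP.NET; the header name X-Served-By is an illustrative choice, not a de facto standard:

Sub Application_BeginRequest(sender As Object, e As EventArgs)
    ' Stamp each response with the name of the node that handled it.
    ' "X-Served-By" is just a self-chosen, conventional name.
    Response.AppendHeader("X-Served-By", Environment.MachineName)
End Sub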

The reason for a mandatory 'Host' clause in HTTP 1.1 GET

Last week I started quite a fuss in my Computer Networks class over the need for a mandatory Host clause in the header of HTTP 1.1 GET messages.
The reason I'm given, be it written on the Web or shouted at me by my classmates, is always the same: the need to support virtual hosting. However, and I'll try to be as clear as possible, this does not appear to make sense.
I understand that in order to allow two domains to be hosted in a single machine (and by consequence, share the same IP address), there has to exist a way of differentiating both domain names.
What I don't understand is why it isn't possible to achieve this without a Host clause (HTTP 1.0 style) by using an absolute URL (e.g. GET http://www.example.org/index.html) instead of a relative one (e.g. GET /index.html).
When the HTTP message got to the server, it (the server) would direct the message to the appropriate host, not by looking at the Host clause but by looking at the hostname in the URL present in the message's request line.
I would be very grateful if any of you hardcore hackers could help me understand what exactly I am missing here.
This was discussed in this thread:
modest suggestions for HTTP/2.0 with their rationale.
Add a header to the client request that indicates the hostname and
port of the URL which the client is accessing.
Rationale: One of the most requested features from commercial server
maintainers is the ability to run a single server on a single port
and have it respond with different top level pages depending on the
hostname in the URL.
Making an absolute request URI required (because there's no way for the client to know beforehand whether the server homes one or more sites) was suggested:
Re the first proposal, to incorporate the hostname somewhere. This
would be cleanest put into the URL itself :-
GET http://hostname/fred http/2.0
This is the syntax for proxy redirects.
To which this argument was made:
Since there will be a mix of clients, some supporting host name reporting
and some not, it just doesn't matter how this info gets to the server.
Since it doesn't matter, the easier to implement solution is a new HTTP
request header field. It allows all clients and servers to operate as they
do now with NO code changes. Clients and servers that actually need host
name information can have tiny mods made to send the extra header field
containing the URL and process it.
[...]
All I'm suggesting is that there is a better way to
implement the delivery of host name info to the server that doesn't involve
hacking the request syntax and can be backwards compatible with ALL clients
and servers.
Feel free to read on to discover the final decision yourself. But be warned, it's easy to get lost in there.
The reason for adding support for specifying a host in an HTTP request was the limited supply of IP addresses (which was not yet an issue when HTTP 1.0 came out).
If your question is "why specify the host in a Host header as opposed to on the Request-Line", the answer is the need for interoperability between HTTP/1.0 and 1.1.
If the question is "why is the Host header mandatory", this has to do with the desire to speed up the transition away from assigned IP addresses.
Here's some background on Internet address conservation with respect to HTTP/1.1.
The reason for the 'Host' header is to make explicit which host this request refers to. Without 'Host', the server must know ahead of time that it is supposed to route 'http://joesdogs.com/' to Joe's Dogs while it is supposed to route 'http://joscats.com/' to Jo's Cats even though they are on the same webserver. (What if a server has 2 names, like 'joscats.com' and 'joescats.com' that should refer to the same website?)
Having an explicit 'Host' header makes these kinds of decisions much easier to program.
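To make that concrete, here is a hypothetical sketch of Host-based dispatch in ASP.NET (VB.NET; the site names come from the example above):

Dim host As String = If(HttpContext.Current.Request.Headers("Host"), "")
Select Case host.ToLowerInvariant()
    Case "joesdogs.com", "www.joesdogs.com"
        ' Serve Joe's Dogs.
    Case "joscats.com", "joescats.com"
        ' Two names, one website: both route to Jo's Cats.
    Case Else
        ' Unknown host: fall back to a default site or reject the request.
End Select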

Checking if a computer is hacked via the headers

If your computer is infected, apparently Google will tell you so by displaying a warning above your search results.
According to this article, Google uses HTTP headers to work this out. But how do they do it? What sort of headers should we look for?
Thank you!
The Google Security blog post you linked doesn't mention HTTP headers.
A key point in the blog post is this:
This particular malware causes infected computers to send traffic to Google through a small number of intermediary servers called “proxies.”
And this:
...taking steps to notify users whose traffic is coming through these proxies...
Google doesn't say much about the proxies, for instance if they were standards-compliant(ish) HTTP proxies or just servers echoing the user requests.
The "unusual" traffic that originated from Google would have been from a small set of IP addresses. No special HTTP headers would be necessary. Google only had to add the warning message to pages being served to the suspect IP addresses. That's it.
The term "signature" in the the follow up link from your comments is used very informally, probably alluding to the IP addresses of the proxy servers. If you want to imagine something more complicated than that, then I suppose it's possible that these proxies (like many HTTP clients) could be detected by some pattern of HTTP headers unique to them. For example the User-Agent or Via headers, or even something more subtle like the ordering or capitalization of headers. I doubt it came to that though, and I don't see much value in speculating, especially two years after the fact.

What Necessitates a Different Protocol for Email?

In what way is HTTP inappropriate for E-mail? How (for example) does the statefulness of IMAP benefit client development?
What actually are the arguments for keeping them separate, other than historical and backwards-compatibility reasons?
SMTP, IMAP, and HTTP are specialized application-level protocols. If there were a generic application-level protocol from which all of them could inherit, you could usefully refactor things; but since that is not the case, wedging the other protocols into one of the existing ones is hardly worth the effort and would hardly simplify anything.
As things are now, the history and backwards compatibility is not just a cultural heritage, it is also a long and complex process of defining application-specific features for each protocol. SMTP is store-and-forward, which introduces the need for audit headers (Received: et al.). IMAP was designed for concurrent access to a data store, which is what made it necessary to introduce state (who are you, where are you authorized to connect, which folder are you connected to, what have you already seen, read, or deleted). HTTP is fundamentally a pull protocol (pull down a web page) and the POST facility carries with it a lot of functionality specific to the CGI protocol and the overall content model of HTTP.
SMTP is a protocol that identifies the sender and the recipients in order to deliver individual mail messages; each mail server along the way accepts (or refuses) the mail it is asked to forward, until the message eventually reaches its destination. HTTP is meant for anybody to connect to a server and look at (mostly the same) content. They are fundamentally different, so it makes a lot of sense to use different protocols.
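For a concrete sense of the difference, here is a hypothetical SMTP exchange (hosts and addresses made up). Note how the envelope names a sender and recipients up front, something HTTP's request/response model has no equivalent for:

S: 220 mail.example.org ESMTP
C: HELO client.example.com
S: 250 mail.example.org
C: MAIL FROM:<alice@example.com>
S: 250 OK
C: RCPT TO:<bob@example.org>
S: 250 OK
C: DATA
S: 354 End data with <CRLF>.<CRLF>
C: (message headers and body, then a line containing only ".")
S: 250 OK: queued for delivery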

The transfer attempted appeared to contain a data leak?

Getting this error message in the browser:
Attention!!!
The transfer attempted appeared to contain a data leak!
URL=http://test-login.becreview.com/domain/User_Edit.aspx?UserID=b5d77644-b10e-44e0-a007-3b9a5e0f4fff
I've seen this before but I'm not sure what causes it. It doesn't look like a browser error or an asp.net error. Could it be some sort of proxy error? What causes it?
That domain is internal, so you won't be able to go to it. Also, the page has almost no styling: an h1 for "Attention!!!", with the other two lines wrapped in p tags, if that helps any.
For anyone else investigating this message, it appears to be a Fortinet firewall's default network data-leak prevention message.
It doesn't look like an ASP.NET error that I've ever seen.
If you think it might be a proxy message you should reconfigure your browser so it does not use a proxy server, or try to access the same URL from a machine that has direct access to the web server (and doesn't use the same proxy).
This is generated from an inline IPS sensor (usually an appliance or a VM) that is also configured to scan traffic for sensitive data (CC info, SSNs etc). Generally speaking, the end user cannot detect or bypass this proxy as it is deployed to be transparent. It is likely also inspecting all SSL traffic. In simple terms, it is performing a MITM attack because your organizational policy has specified that all traffic to and from your network be inspected.
