Do HTTP proxy servers modify request packets? - http

Is any request header added or modified to the HTTP request before forwarding to the server by a proxy server?
If so, are the changes done to the same packets, or are the contents used to create new request packets with the modifications?

There are a few different types of proxy servers. Because you've mentioned request headers, I'm going to assume that you're talking about HTTP proxy servers, which forward HTTP requests, not packets.
NOTE: In the special case of HTTPS requests (TLS/SSL via CONNECT), proxy servers will just forward the content of the TCP packets (and are unable to inspect the packets unless acting as a man-in-the-middle proxy).
Of course it depends on the proxy software and its configuration, but HTTP proxies are expected to follow the W3C Guidelines for Web Content Transformation Proxies, which state many things, most relevantly:
Other than to convert between HEAD and GET, proxies must not alter request methods.
If the request contains a Cache-Control: no-transform directive, proxies must not alter the request other than to comply with transparent HTTP behavior defined in RFC 2616 sections 14.9.5 and 13.5.2 and to add header fields as described in 4.1.6 Additional HTTP Header Fields.
Other than the modifications required by RFC 2616, proxies should not modify the values of header fields other than the User-Agent, Accept, Accept-Charset, Accept-Encoding, and Accept-Language header fields and must not delete header fields.
Proxies should add the IP address of the initiator of the request to the end of a comma separated list in an X-Forwarded-For HTTP header field.
Proxies must (in accordance with RFC 2616) include a Via HTTP header field.
In summary, with a standards-compliant proxy you can generally expect the first five of these HTTP headers to be possibly modified, and the last two to be added or extended:
User-Agent
Accept
Accept-Charset
Accept-Encoding
Accept-Language
X-Forwarded-For
Via
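To make the two "added" headers concrete, here is a minimal sketch of what a forwarding proxy might do to the request headers before sending them upstream. The function name, the plain dict of headers, and the proxy pseudonym are assumptions for illustration, not any particular proxy's API; header names are assumed to already use the casing shown.

```python
def forward_request_headers(headers, client_ip, proxy_id, http_version="1.1"):
    """Return a copy of `headers` as a forwarding proxy might send them upstream.

    Hypothetical helper: `headers` is a plain dict with canonically cased names.
    """
    out = dict(headers)  # copy so the caller's mapping is not mutated

    # Append the address of the connecting client to the end of any
    # existing comma-separated X-Forwarded-For list.
    prior_xff = out.get("X-Forwarded-For")
    out["X-Forwarded-For"] = f"{prior_xff}, {client_ip}" if prior_xff else client_ip

    # Append this hop to the Via list as "<protocol-version> <pseudonym>".
    hop = f"{http_version} {proxy_id}"
    prior_via = out.get("Via")
    out["Via"] = f"{prior_via}, {hop}" if prior_via else hop
    return out
```

For a request that already passed through one proxy, the second proxy would extend both lists rather than overwrite them.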

Related

HTTP Proxy behavior for X-Forwarded-Host header

I have a question regarding how HTTP proxies usually handle the "X-Forwarded-Host" header. Consider the following scenario where a client sends an HTTP request through a chain of proxies:
Client -> Proxy 1 (http://proxy1.com) -> Proxy 2 (http://proxy2.com) -> Server (http://server.com)
Proxy 1 sets the "X-Forwarded-Host" header to its own endpoint:
X-Forwarded-Host: proxy1.com
What behavior is expected from the second proxy? Does it replace the header and use its own endpoint as the value (proxy2.com), or is it supposed to detect that an "X-Forwarded-Host" header is already present in the request and simply forward it? From https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-Forwarded-Host:
The X-Forwarded-Host (XFH) header is a de-facto standard header for identifying the original host requested by the client in the Host HTTP request header.
I suspect it is the latter (simply forward the existing header), but I am not sure if this assumption is correct. Any help is appreciated.

Is there a way to distinguish the intermediate-system-related HTTP request and response headers?

The Wikipedia list of HTTP header fields
lists many HTTP header fields, but I did not find a way to tell which headers relate to intermediate systems (such as a CDN or an HTTP proxy).
Is there a way to distinguish the intermediate-system-related HTTP header fields, or is there a link that explains this?
All HTTP headers can be sent to a proxy. They can only be distinguished by whether a proxy passes them on: end-to-end headers and hop-by-hop headers.
End-to-end headers
These headers must be transmitted to the final recipient of the message: the server for a request, or the client for a response. Intermediate proxies must retransmit these headers unmodified and caches must store them.
Hop-by-hop headers
These headers are meaningful only for a single transport-level connection, and must not be retransmitted by proxies or cached. Note that only hop-by-hop headers may be set using the Connection general header.
If you want more proxy-related detail, developer.mozilla.org - HTTP Headers says:
Headers can also be grouped according to how proxies handle them:
The hop-by-hop headers it lists are:
Connection
Keep-Alive
Proxy-Authenticate
Proxy-Authorization
TE
Trailer
Transfer-Encoding
Upgrade (see also Protocol upgrade mechanism).
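As a rough illustration of the hop-by-hop rule, the sketch below filters a header list before forwarding. The function name and the list-of-pairs representation are assumptions; the header set is the one listed above, plus any header names the sender nominates in Connection.

```python
# Hop-by-hop header names from the list above, lowercased for comparison.
HOP_BY_HOP = frozenset({
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
})


def strip_hop_by_hop(headers):
    """Return only the end-to-end headers from a list of (name, value) pairs."""
    # Any header named in a Connection field is hop-by-hop for this
    # connection as well, so collect those names first.
    nominated = set()
    for name, value in headers:
        if name.lower() == "connection":
            nominated.update(
                token.strip().lower() for token in value.split(",") if token.strip()
            )
    drop = HOP_BY_HOP | nominated
    # Header names are case-insensitive, so compare in lowercase.
    return [(name, value) for name, value in headers if name.lower() not in drop]
```

Everything that survives the filter is end-to-end and would be retransmitted as received.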

Should HTTP proxy copy Content-Encoding header back to client?

It's said that Go's http.Transport handles Content-Encoding automatically (for example, decompressing transparently when you read from resp.Body).
It's also said that Content-Encoding is an end-to-end HTTP header, not a hop-by-hop one.
Therefore, if a proxy copies Content-Encoding into the response headers it sends to the client, and also uses io.Copy on the upstream response body (which may be decompressed automatically, since io.Copy reads from resp.Body), won't the result be inconsistent for the client? (Content-Encoding is copied from the upstream response, but the body has been decompressed.)
In general, the Content-Encoding response header should not be altered by a proxy.
Different encodings of the same URI are deemed to be different representations and have different ETags. So, changing Content-Encoding would not play well with caching.
But if the proxy and the client are both part of your own ecosystem, you could do it, since you know what's going on. In that case, if your proxy decompresses the data before sending it to the client, you need to strip the Content-Encoding header.
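The consistency requirement can be sketched in a few lines. This is a language-neutral illustration written in Python rather than the asker's Go code; `relay_response` and its `decompress` flag are hypothetical names, and only gzip is handled.

```python
import gzip


def relay_response(headers, body, decompress=False):
    """Return the (headers, body) pair a proxy would send downstream.

    Hypothetical helper: `headers` is a dict, `body` is the raw upstream bytes.
    """
    out = dict(headers)
    if decompress and out.get("Content-Encoding", "").lower() == "gzip":
        body = gzip.decompress(body)
        # The body is no longer gzip-encoded, so the header describing that
        # encoding must be removed and the length recomputed.
        del out["Content-Encoding"]
        out["Content-Length"] = str(len(body))
    # Otherwise the bytes and the headers are passed through untouched.
    return out, body
```

The default path leaves both the body and Content-Encoding alone, which is the behavior the answer recommends in general; an ETag copied from upstream would still describe the compressed representation, which is part of why decompressing in the proxy does not play well with caching.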

Which changes does a browser make when using an HTTP Proxy?

Imagine a web browser that makes an HTTP request to a remote server, such as site.example.com.
If the browser is then configured to use a proxy server, let's call it proxy.example.com using port 8080, in which ways are the request now different?
Obviously the request is now sent to proxy.example.com:8080, but there must surely be other changes to enable the proxy to make a request to the original url?
RFC 7230 - Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing, Section 5.3.2. absolute-form:
When making a request to a proxy, other than a CONNECT or server-wide
OPTIONS request (as detailed below), a client MUST send the target
URI in absolute-form as the request-target.
absolute-form = absolute-URI
The proxy is requested to either service that request from a valid
cache, if possible, or make the same request on the client's behalf
to either the next inbound proxy server or directly to the origin
server indicated by the request-target. Requirements on such
"forwarding" of messages are defined in Section 5.7.
An example absolute-form of request-line would be:
GET http://www.example.org/pub/WWW/TheProject.html HTTP/1.1
So, without proxy, the connection is made to www.example.org:80:
GET /pub/WWW/TheProject.html HTTP/1.1
Host: www.example.org
With proxy it is made to proxy.example.com:8080:
GET http://www.example.org/pub/WWW/TheProject.html HTTP/1.1
Host: www.example.org
In the latter case an HTTP/1.1 client must still send the Host header (only HTTP/1.0 clients may omit it), but the proxy must ignore the received value and regenerate it from the request-target when forwarding.
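The difference between the two request forms shown above can be sketched as a small helper. The function name and the `via_proxy` flag are assumptions for illustration; a real client would also add its other headers.

```python
from urllib.parse import urlsplit


def build_request_head(url, via_proxy, method="GET"):
    """Build the request line and Host header for `url` (illustrative only)."""
    parts = urlsplit(url)
    # origin-form: path plus optional query, used when talking to the server directly.
    origin_form = parts.path or "/"
    if parts.query:
        origin_form += "?" + parts.query
    # absolute-form: the full target URI (without any fragment), used when
    # the request is sent to a proxy.
    absolute_form = url.split("#", 1)[0]
    target = absolute_form if via_proxy else origin_form
    return f"{method} {target} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"
```

Only the request-target changes; the TCP connection itself goes to proxy.example.com:8080 instead of www.example.org:80, which is outside what this helper models.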
The proxy simply makes the request on behalf of the original client, hence the name "proxy", which has the same meaning as in legalese. The browser sends its request to the proxy; the proxy makes a request to the requested server (or not, depending on whether it decides to forward or deny the request); the server returns a response to the proxy; and the proxy returns that response to the original client. There's no fundamental difference in what the server will see, except that the originating client will appear to be the proxy server. The proxy may or may not alter the request, and it may or may not cache the response, meaning the server may not receive a request at all if the proxy decides to deliver a cached version instead.

Reason for browser not showing X-Forwarded-For headers

Note: Please do read the full question
I'm trying to understand why the browser doesn't show me any X-Forwarded-For header when I request a page.
BTW, here is what my request headers look like:
Request URL:http://localhost:3000/users/sign_in
Request Method:GET
Status Code:304 Not Modified
Request Headers:
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding:gzip,deflate,sdch
Accept-Language:en-GB,en-US;q=0.8,en;q=0.6
Cache-Control:max-age=0
Connection:keep-alive
Cookie:undefined=0; poasterapp=s%3A4faaa6b1723e7c6fbd949083532c52598652547b.sNX%2BKOEed2TEQkQN7I7K5lgpoHMRpwerKFvUegMnTVI; _minerva_session=BAh7CUkiD3Nlc3Npb25faWQGOgZFRkkiJWEyM2Q0ZTViMWEyODBiYmFmODEwZTJhZmUwNWU5ODk5BjsAVEkiE3VzZXJfcmV0dXJuX3RvBjsARiIGL0kiCmZsYXNoBjsARm86JUFjdGlvbkRpc3BhdGNoOjpGbGFzaDo6Rmxhc2hIYXNoCToKQHVzZWRvOghTZXQGOgpAaGFzaHsGOgphbGVydFQ6DEBjbG9zZWRGOg1AZmxhc2hlc3sGOwpJIgAGOwBUOglAbm93MEkiEF9jc3JmX3Rva2VuBjsARkkiMUN0Uk56SXU0dUdIdzgwcFZJM3R0L2N4dlovRllTSGRrQ2o1R0VVanhIaVk9BjsARg%3D%3D--6bd89ce9d29e9bdcf56573f9a153dc663a8fe755
Host:localhost:3000
If-None-Match:"785d34e3998360353567fc710af123fb"
User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.102 Safari/537.36
Response Headers (not needed, but included anyway):
Cache-Control:max-age=0, private, must-revalidate
Connection:close
ETag:"785d34e3998360353567fc710af123fb"
Server:thin 1.5.0 codename Knife
Set-Cookie:_minerva_session=BAh7CEkiD3Nlc3Npb25faWQGOgZFRkkiJWEyM2Q0ZTViMWEyODBiYmFmODEwZTJhZmUwNWU5ODk5BjsAVEkiE3VzZXJfcmV0dXJuX3RvBjsARiIGL0kiEF9jc3JmX3Rva2VuBjsARkkiMUN0Uk56SXU0dUdIdzgwcFZJM3R0L2N4dlovRllTSGRrQ2o1R0VVanhIaVk9BjsARg%3D%3D--dfb3ce9f5c97463cfcd0229a133654e6cc606d98; path=/; HttpOnly
X-Request-Id:41a6f3062dc8bc36b7b3eae71dc5075d
X-Runtime:89.238257
X-UA-Compatible:IE=Edge
As I said, I don't see any X-Forwarded-For in the request headers.
Reading the wiki pages on X-Forwarded-For makes me think it is something done by a caching server (which in my case I believe is my ISP). So am I safe to believe that the X-Forwarded-For header is added on the caching server side (by the ISP)?
If so, one thing still bugs me:
why is the same true (i.e. X-Forwarded-For not appearing in the request headers) for a server running locally on my machine that I access from the browser at http://localhost:3000?
X-Forwarded-For is not a standard request header as specified in RFC 2616 Section 5.3 that addresses the protocol standard request headers, which are (as specified in the RFC)
Accept
Accept-Charset
Accept-Encoding
Accept-Language
Authorization
Expect
From
Host
If-Match
If-Modified-Since
If-None-Match
If-Range
If-Unmodified-Since
Max-Forwards
Proxy-Authorization
Range
Referer
TE
User-Agent
In order for your incoming request to have the custom [X-Forwarded-For] header, it must be explicitly added to that request by the calling client. The easiest explanation of why you are not seeing that header is that the client sending the request did not manually add it.
The tricky thing is that you should not necessarily expect to receive this header unless there is a contract in place between your service and the caller, separate from the HTTP protocol, stating that an X-Forwarded-For value will be specified in the request headers. As others have already stated, the XFF header is typically set by a proxy server or a load balancer to indicate who the real requester is that is acting through the proxy.
As a service provider, if you demand that an [X-Forwarded-For] header be set in all requests, you must enforce it at a service policy level. If you do not want to service proxy accounts that do not identify who they are shielding with their proxy IP, bounce their request with a 403 Forbidden. If you are in a situation where you must service these requests but depend on this header being set, then you're going to have to come up with a custom process where you could communicate their error back.
Here is what the HTTP protocol has to say about anonymity:
Because the source of a link might be private information or might
reveal an otherwise private information source, it is strongly
recommended that the user be able to select whether or not the
Referer field is sent. For example, a browser client could have a
toggle switch for browsing openly/anonymously, which would
respectively enable/disable the sending of Referer and From
information.
Clients SHOULD NOT include a Referer header field in a (non-secure)
HTTP request if the referring page was transferred with a secure
protocol.
Authors of services which use the HTTP protocol SHOULD NOT use GET
based forms for the submission of sensitive data, because this will
cause this data to be encoded in the Request-URI. Many existing
servers, proxies, and user agents will log the request URI in some
place where it might be visible to third parties. Servers can use
POST-based form submission instead
...
Elaborate user-customized accept header fields sent in every request,
in particular if these include quality values, can be used by servers
as relatively reliable and long-lived user identifiers. Such user
identifiers would allow content providers to do click-trail tracking,
and would allow collaborating content providers to match cross-server
click-trails or form submissions of individual users. Note that for
many users not behind a proxy, the network address of the host
running the user agent will also serve as a long-lived user
identifier. In environments where proxies are used to enhance
privacy, user agents ought to be conservative in offering accept
header configuration options to end users. As an extreme privacy
measure, proxies could filter the accept headers in relayed requests.
General purpose user agents which provide a high degree of header
configurability SHOULD warn users about the loss of privacy which can
be involved.
Personally, I would bounce the request with a 401.2 and route the requester to a challenge screen via the WWW-Authenticate response header that presents them with notification that they will not be allowed anonymous access to your site. It's kind of a bastardized way of using the WWW-Authenticate header, but it seems like you're expecting the X-Forwarded-For header to acknowledge and identify the real requester and allowing public non-anonymous access to your service. To me, that's an authentication concern.
ISPs do not add X-Forwarded-For.
X-Forwarded-For is not for the end user to identify the application behind a proxy/balancer.
X-Forwarded-For is for the application behind a proxy/balancer to identify the user.
For example:
You have a web application (PHP, Java, etc.)
and you also have an HTTP server (Apache, nginx, etc.). Then:
The user sends a request to the HTTP server.
The HTTP server forwards the request to the web application with X-Forwarded-For set to the user's IP.
Your web application knows that it is behind the HTTP server, so it reads X-Forwarded-For as the user's IP.
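The last step above can be sketched from the application's side. The trusted-proxy set and the function name are hypothetical; the sketch assumes a single front-end server that appends the address it saw to the end of the X-Forwarded-For list.

```python
# Hypothetical: addresses of front-end servers this application trusts.
TRUSTED_PROXIES = {"127.0.0.1"}


def client_ip(peer_ip, headers):
    """Best-effort client address for an application behind one trusted proxy."""
    xff = headers.get("X-Forwarded-For", "")
    # If the connection did not come from our own proxy, the header could
    # have been written by anyone, so fall back to the TCP peer address.
    if peer_ip not in TRUSTED_PROXIES or not xff.strip():
        return peer_ip
    # The trusted proxy appended the address it saw at the end of the list;
    # earlier entries were supplied by the client side and are not verified.
    return xff.split(",")[-1].strip()
```

When the browser talks to localhost:3000 directly there is no such proxy in the path, so the header is simply absent and the peer address is used.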
Why are you expecting X-Forwarded-For to appear in the first place? You are connecting to a web server running on localhost, so there is no ISP provider involved at all. Even if you were connecting to your web server via an ISP, it is still not likely to add X-Forwarded-For to the requests. X-Forwarded-For is typically added by an HTTP proxy server or load balancer, neither of which you are going through. X-Forwarded-For is never included by a web browser.
