The JVM provides the proxy properties http.proxyHost and http.proxyPort for specifying an HTTP proxy server, and https.proxyHost and https.proxyPort for specifying an HTTPS proxy server.
I was wondering whether there are any advantages to using an HTTPS proxy server compared to an HTTP proxy server?
Is accessing an https URL via an HTTPS proxy less cumbersome than accessing it via an HTTP proxy?
An HTTP proxy gets a plain-text request and [in most but not all cases] sends a different HTTP request to the remote server, then returns information to the client.
An HTTPS proxy is a relay: it receives a special HTTP request (the CONNECT verb) and builds an opaque tunnel to the destination server (which is not necessarily even an HTTPS server). The client then sends its SSL/TLS request to the server, and the two continue with the SSL handshake and then with HTTPS (if requested).
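To make the CONNECT exchange concrete, here is a minimal Python sketch of the bytes involved. The function names are made up for illustration, and the code only builds and inspects the messages; it does not open a socket or handle proxy authentication.

```python
def build_connect_request(host: str, port: int) -> bytes:
    """Return the bytes a client sends to the proxy to ask for a tunnel."""
    authority = f"{host}:{port}"
    lines = [
        f"CONNECT {authority} HTTP/1.1",
        f"Host: {authority}",
        "",  # blank line ends the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def tunnel_established(response_head: bytes) -> bool:
    """True if the proxy's status line carries a 2xx status code."""
    status_line = response_head.split(b"\r\n", 1)[0]
    parts = status_line.split(b" ", 2)
    if len(parts) < 2 or not parts[0].startswith(b"HTTP/"):
        return False
    code = parts[1]
    return len(code) == 3 and code.isdigit() and code[:1] == b"2"
```

Once the proxy has answered with a 2xx status, everything the client writes on that connection (starting with the TLS ClientHello) is forwarded to the destination as opaque bytes.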
As you see, these are two completely different proxy types with different behavior and different design goals. An HTTPS proxy can't cache anything, as it doesn't see the request sent to the server. With an HTTPS proxy you have a channel to the server, and the client receives and validates the server's certificate (and optionally vice versa). An HTTP proxy, on the other hand, sees and has control over the request it received from the client.
While an HTTPS request can be sent via an HTTP proxy, this is almost never done. In this scenario the proxy validates the server's certificate, but the client can receive and validate only the proxy's certificate. Because the name in the proxy's certificate will not match the address the socket connected to, in most cases an alert is raised and the SSL handshake fails (I am not going into details of how to try to address this).
Finally, as an HTTP proxy can look into the request, this invalidates the idea of security provided by the HTTPS channel, so using an HTTP proxy for HTTPS requests is normally done only for debugging purposes (again, we omit cases of paranoid company security policies which require monitoring of all HTTPS traffic of company employees).
Addition: also read my answer on the similar topic here.
There are no pros or cons.
And there is no "HTTPS proxy" server.
You can tell the protocol handlers which proxy server to use for different protocols. This can be done for http, https, ftp and socks. Not more and not less.
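As an illustration of that selection rule, here is a small Python sketch. It is not the JVM's implementation; `proxy_for` and the plain dict standing in for system properties are assumptions made for the example. It covers only the http and https properties, ignores http.nonProxyHosts, and uses the documented default ports (80 and 443).

```python
from urllib.parse import urlsplit

# Defaults used when the *.proxyPort property is absent.
DEFAULT_PROXY_PORT = {"http": 80, "https": 443}


def proxy_for(url: str, props: dict):
    """Pick the proxy for a URL based on the URL's scheme.

    Returns (host, port), or None when no proxy is configured for that scheme.
    """
    scheme = urlsplit(url).scheme.lower()
    host = props.get(f"{scheme}.proxyHost")
    if not host or scheme not in DEFAULT_PROXY_PORT:
        return None
    port = int(props.get(f"{scheme}.proxyPort", DEFAULT_PROXY_PORT[scheme]))
    return (host, port)
```

The point of the sketch is that the https.* properties select a proxy by the scheme of the target URL; they say nothing about whether the connection to the proxy itself is encrypted.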
I can't tell you if you should use a different proxy for https connections or not. It depends.
I can only explain the difference between an HTTP and an HTTPS request to a proxy.
Since the HTTP proxy (or web proxy) understands HTTP (hence the name), the client can just send the request to the proxy server instead of the actual destination.
This does not work for HTTPS.
This is because the proxy can't perform the TLS handshake, which happens first.
Therefore the client must send a CONNECT request to the proxy.
The proxy establishes a TCP connection and just sends the packets back and forth without touching them.
So the TLS handshake happens between the client and the destination.
The HTTP proxy server does not see everything and does not validate the destination server's certificate at all.
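The difference shows up in the very first line the client sends to the proxy. The following Python sketch (the function name is hypothetical, and real clients add headers and more checks) returns that line for a given URL:

```python
from urllib.parse import urlsplit


def first_line_to_proxy(method: str, url: str) -> str:
    """Return the request line a client would send to an HTTP proxy."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        # The client asks for a tunnel; the real request is sent later,
        # inside TLS, where the proxy cannot read it.
        port = parts.port or 443
        return f"CONNECT {parts.hostname}:{port} HTTP/1.1"
    # Plain HTTP: the full URL goes to the proxy in the request line.
    return f"{method} {url} HTTP/1.1"
```

For an http URL the proxy sees the method and the full URL; for an https URL it sees only the host and port to connect to.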
There can be some confusion with this whole http, https, proxy thing.
It is possible to connect to an HTTP proxy with https.
In this case, the communication between the client and the proxy is encrypted.
There are also so-called TLS-terminating or interception proxy servers, like Squid's SSL Peek and Splice or Burp, which see everything.
But this should not work out of the box, because the proxy uses its own certificates, which are not signed by trusted CAs.
References
https://docs.oracle.com/javase/8/docs/technotes/guides/net/proxies.html
https://parsiya.net/blog/2016-07-28-thick-client-proxying---part-6-how-https-proxies-work/
http://dev.chromium.org/developers/design-documents/secure-web-proxy
https://www.rfc-editor.org/rfc/rfc2817#section-5
https://www.rfc-editor.org/rfc/rfc7231#section-4.3.6
If by HTTPS proxy you mean connecting to an HTTP proxy server over TLS, then:
I was wondering whether there are any advantages to using an HTTPS proxy server compared to an HTTP proxy server?
The advantage is that your client's connection to the proxy server is encrypted. For example, a firewall cannot see which host you connect to with the CONNECT method.
Is accessing an https URL via an HTTPS proxy less cumbersome than accessing it via an HTTP proxy?
Everything is the same, except that with an HTTPS proxy the browser-to-proxy connection is encrypted.
But you need to deploy a certificate on your proxy server, as an https website does, and use a PAC file to configure the browser to connect to the proxy over SSL.
For more details and a practical example, see my question and answer here: HTTPs proxy server only works in SwitchOmega
Unfortunately, "HTTPS proxy" has two distinct meanings:
1. A proxy that can forward HTTPS traffic to the destination. This proxy itself uses the HTTP protocol to set up the forwarding.
When the browser is trying to connect to a website using HTTPS, it sends a CONNECT request to the proxy. The proxy sets up a TCP connection with the website and mirrors all TCP traffic sent on the connection from the browser to the proxy onto the connection between the proxy and the website, and similarly mirrors the response TCP packet payload from the website onto the connection with the browser. Hypothetically, the same mechanism using CONNECT could be used with HTTP traffic, but practically speaking browsers don't do that. For HTTP traffic, they send the actual HTTP request to the proxy, including the full path in the HTTP command (as well as setting the Host header): https://stackoverflow.com/a/38259076/10026
So, by this definition, an HTTPS proxy is a proxy that understands the CONNECT directive and can support HTTPS traffic going between the browser and the website.
2. A proxy that uses the HTTPS protocol to secure client communication.
In this mode (sometimes referred to as "Secure Proxy"), the browser uses the proxy's own certificate to perform a TLS handshake with the proxy, and then sends either HTTP or HTTPS traffic (including CONNECT requests) on that connection as per (1). So, the connection between the browser and the proxy is always protected with a TLS key derived using the proxy's certificate, regardless of whether the traffic itself is encrypted with a key negotiated between the browser and the website. If HTTPS traffic is proxied via a secure proxy, it is double-encrypted on the connection between the browser and the proxy.
For example, the Proxy Switcher Chrome plugin has two separate settings to control each of these functionalities.
As of 2022, the option to use a secure proxy is not available in the macOS and Windows manual proxy configuration UI. But a secure proxy may be specified in a PAC file used in automatic proxy configuration using the HTTPS proxy directive. It is up to the consuming application to support the HTTPS directive; most major browsers, except Safari, and many desktop apps support it.
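To show what the HTTPS directive looks like next to the classic PROXY directive, here is a small Python sketch that splits a PAC result string into its entries. The parser and the host names are hypothetical, and it assumes every proxy entry is written as `KIND host:port`; real consumers accept more forms than this.

```python
def parse_pac_result(result: str):
    """Split a PAC return value such as 'HTTPS p:443; PROXY p:8080; DIRECT'.

    Returns a list of (kind, host, port) tuples; DIRECT has no host or port.
    """
    entries = []
    for item in result.split(";"):
        tokens = item.split()
        if not tokens:
            continue  # tolerate a trailing ';'
        kind = tokens[0].upper()
        if kind == "DIRECT":
            entries.append(("DIRECT", None, None))
            continue
        host, _, port = tokens[1].rpartition(":")
        entries.append((kind, host, int(port)))
    return entries
```

An entry of kind HTTPS means "speak TLS to this proxy" (meaning 2 above), while PROXY means a plain-text connection to the proxy; either kind can still carry CONNECT tunnels for https:// sites (meaning 1).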
NOTE: Things get a bit more complicated because some proxies that proxy HTTPS traffic don't simply forward TCP packet payload, as described in (1), but act as Intercepting Proxies. Using a spoofed website certificate, they effectively perform a Man-in-the-Middle attack (well, it's not necessarily an attack because it's expected behavior). Whereas the browser thinks it's using the website's certificate to set up a TLS tunnel with a website, it's actually using a spoofed certificate to set up TLS tunnel with the proxy, and the proxy sets up the TLS tunnel with the website. Then proxy has visibility into the HTTPS requests/responses. But all of that is completely orthogonal to whether the proxy is acting as a secure proxy as per (2).
Related
I wrote a proxy server which works well. But when looking at the log, there are some weird requests like:
POST https://vortex.data.microsoft.com/collect/v1 HTTP/1.1
There are also some GET requests over https. I thought only CONNECT is allowed over https; am I wrong? If I am wrong, how should I deal with these requests? (I just dropped these requests in my app.)
Another thing, maybe unrelated: according to the log, all these requests are related to Microsoft.
There isn't any problem handling any HTTP Method with HTTPS within a proxy.
Every request with an https:// URL is received and forwarded to port 443 unless another port is indicated.
Whether you run a server with HAProxy, NGINX, or Apache Web Server deployed, or you wrote a proxy yourself (like this one in JavaScript), the only thing you have to do is proxy the requests to the destination server address.
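As a rough Python sketch of that routing step (the function name is made up, and error handling is left out), a proxy could derive the upstream host and port from the request line like this:

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def upstream_target(request_line: str):
    """Return (kind, host, port) for the first line of a proxied request."""
    method, target, _version = request_line.split(" ", 2)
    if method.upper() == "CONNECT":
        # Authority form: the target is just host:port.
        host, _, port = target.rpartition(":")
        return ("tunnel", host, int(port))
    # Absolute form: the target is a full URL, whatever the method is.
    parts = urlsplit(target)
    return (parts.scheme, parts.hostname, parts.port or DEFAULT_PORTS[parts.scheme])
```

With the log line from the question, this yields an https target on port 443, which is what the proxy would have to open a TLS connection to.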
Regarding encryption: HTTPS exists precisely to ensure that there are no eavesdroppers between the client and the actual target, so the proxy would act as the initial target and then transparently intercept the connection.
Client starts HTTPS session to Proxy
Proxy intercepts and returns its certificate, signed by a CA trusted by the client.
Proxy starts HTTPS session to Target
Target returns its certificate, signed by a CA trusted by the Proxy.
Proxy streams content, decrypting it and re-encrypting it with its certificate.
Basically it's a concatenation of two HTTPS sessions, one between the client and the proxy and the other between the proxy and the final destination.
Many proxy programs provide multiple protocols over one port.
1. Is there any byte (above or in the TCP/UDP packet) reserved to mark which protocol the client is using?
As far as I know, the HTTP protocol is just carried in TCP's data segment, and there is no other mark.
So how does proxy software tell the protocols apart when it receives a request?
(By guessing from the first one or two bytes received? That doesn't sound like a good idea.)
2. What's the difference between an HTTP proxy and an HTTPS proxy?
Here are my guesses:
"HTTP proxy" only means a service that can proxy the HTTP protocol, and an "HTTPS proxy" can proxy the HTTPS protocol? (Meaning the only difference is whether they can handle the HTTP CONNECT method.) So the HTTPS proxy is just a functionally enhanced HTTP proxy.
Or
An HTTPS proxy offers an extra security layer between the client and the proxy server? (To protect the HTTP CONNECT method headers?) So the communication process is quite different between an HTTP proxy and an HTTPS proxy, and both HTTP proxies and HTTPS proxies can serve the HTTP/HTTPS protocols?
Is there any byte (above or in TCP/UDP package) reserved to mark which protocol client is using?
It is not a single byte, but an HTTP request is clearly distinguishable from a SOCKS request, as a look at the respective standards (RFC 7230, RFC 1928) would show. Not all protocols can be distinguished this easily, but SOCKS and HTTP can.
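A minimal Python sketch of such a check follows. It is an assumption about how a multi-protocol listener might do it, not how any particular proxy does; the method list is deliberately incomplete. It relies on SOCKS messages starting with a version byte (0x05 for SOCKS5, 0x04 for SOCKS4), a TLS record starting with 0x16 0x03, and an HTTP request starting with an ASCII method name.

```python
HTTP_METHODS = (
    b"GET ", b"POST ", b"PUT ", b"HEAD ", b"DELETE ",
    b"OPTIONS ", b"PATCH ", b"TRACE ", b"CONNECT ",
)


def sniff_protocol(first_bytes: bytes) -> str:
    """Guess the protocol from the first bytes a client sends."""
    if first_bytes[:1] == b"\x05":
        return "socks5"  # SOCKS5 greeting starts with version 5
    if first_bytes[:1] == b"\x04":
        return "socks4"
    if first_bytes[:2] == b"\x16\x03":
        return "tls"  # handshake record, e.g. a client speaking TLS to the proxy
    if first_bytes.startswith(HTTP_METHODS):
        return "http"
    return "unknown"
```

No HTTP method begins with the bytes 0x04, 0x05 or 0x16, so these cases cannot be confused with each other.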
What's the difference between HTTP proxy and HTTPS proxy?
An HTTPS proxy is an HTTP proxy used for https:// requests. This is done by using the CONNECT method to create a tunnel through the proxy to the final server and then doing end-to-end HTTPS (TLS+HTTP) inside this tunnel.
... Or HTTPS proxy offers extra security layer between client and proxy server?
This exists too, but it is usually called "HTTP proxy over TLS", "HTTP proxy over HTTPS", "encrypted proxy connection" or similar, but not "HTTPS proxy".
I understand that HTTPS is secured and that it requires an SSL certificate issued by a CA to make the application secure. But what I do not understand is how it differs from HTTP in depth.
My question: as a user, if I make a request to an application over HTTP, or make the same request over HTTPS, what is the actual difference? The traffic remains the same in both cases. Is there any traffic filtering happening if I use HTTPS?
Thanks
HTTPS, as an application protocol, is just HTTP over TLS, so there are very few differences: the s in the URL and some consequences for proxies, that is all.
Now you are speaking about the traffic and the filtering. Here you have a big difference because using TLS adds confidentiality and integrity: passive listeners will see nothing about the HTTP data exchanged, including headers. The only thing visible will be the hostname (taken from the https:// URL) as this is needed at the TLS level before HTTP even happens, through a mechanism called SNI (Server Name Indication) that is now used everywhere to be able to install multiple services using TLS under different names but with a single IP address.
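To see that the hostname really does travel in the clear, here is a small Python sketch that produces a ClientHello in memory, without any network connection, using the standard ssl module's MemoryBIO support. The function name is made up for the example, and it assumes the local TLS library does not encrypt the ClientHello.

```python
import ssl


def client_hello_bytes(hostname: str) -> bytes:
    """Return the first TLS bytes a client would send for this hostname."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming = ssl.MemoryBIO()  # bytes "from the server" (none here)
    outgoing = ssl.MemoryBIO()  # bytes the client wants to send
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the client has written its ClientHello and now
        # waits for a server reply that never comes.
        pass
    return outgoing.read()
```

Searching the returned bytes for the hostname finds it as plain ASCII inside the SNI extension, which is what a passive listener or a filtering proxy can read.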
The question is about HTTP vs HTTPS.
If I want to anonymously load a website that forces HTTPS, like Google.com, do I need HTTPS proxies, or can I get away with HTTP proxies?
If your proxy is SOCKS it will not care what kind of socket is connecting through it. It has its own handshake and it does not care about what happens after the handshake. Whether after the SOCKS handshake an SSL handshake (HTTPS) is started it is not a SOCKS proxy problem, it will just pass through.
Several HTTP proxies, on the other hand, expect HTTP headers to guide them. Such an HTTP proxy will not allow HTTPS, since it needs to read the headers.
On the third hand (ekhm... well, foot?), an HTTP proxy that supports HTTP CONNECT can also set up the transfer of arbitrary data. Therefore such a proxy can set up any type of socket, which can have an SSL handshake, which can then be used for HTTPS transfer.
An HTTP proxy server supports the CONNECT verb, which allows HTTPS connections through the HTTP proxy. You don't need a special HTTPS proxy server or any other setup.
The CONNECT verb lets you create a binary socket tunnel to any given IP:port address. So any HTTP client (all browsers) will open a secure tunnel and communicate securely through the proxy server. No one can control or see anything that goes through the tunnel unless they mount a man-in-the-middle attack by sending you self-signed certificates.
Most firewalls these days automatically perform man-in-the-middle interception with self-signed certificates that are deployed on the work network, so you probably have to dig further to determine whether the connection is really secure. So it may not be that anonymous.
If you're trying to access a service anonymously, you won't get this by running your own proxy. It's not clear from the original question what is meant by "proxy", e.g. local service, or remote service. You won't get anonymity by surfing through a proxy that's on your network, unless it's something like a TOR proxy which relays out through the TOR network.
As for whether proxies can support HTTPS or not, that's been covered here, it would be unusual to find a proxy that doesn't support CONNECT. However if it's a remote anonymizing service you're using, I doubt they would do MitM, since you'd need to install the signing cert into your trusted root store, so they couldn't do that surreptitiously.
I understand that a SOCKS proxy only establishes a connection at the TCP level while an HTTP proxy interprets traffic at the HTTP level. Thus a SOCKS proxy can work for any kind of protocol while an HTTP proxy can only handle HTTP traffic. But why can an HTTP proxy like Squid support protocols like IRC and FTP? When we use an HTTP proxy for an IRC or FTP connection, what specifically happens? Is there any metadata added to the packet when it is sent to the proxy over the HTTP protocol?
An HTTP proxy is able to support high-level protocols other than HTTP if it supports the CONNECT method, which is primarily used for HTTPS connections. Here is the description from the Squid wiki:
The CONNECT method is a way to tunnel any kind of connection through an HTTP proxy. By default, the proxy establishes a TCP connection to the specified server, responds with an HTTP 200 (Connection Established) response, and then shovels packets back and forth between the client and the server, without understanding or interpreting the tunnelled traffic
If client software supports connection through 'HTTP CONNECT'-enabled (HTTPS) proxy it can be any high level protocol that can work with such a proxy (VPN, SSH, SQL, version control, etc.)
As others have mentioned, the "HTTP CONNECT" method allows you to establish any TCP-based connection via a proxy. This functionality is needed primarily for HTTPS connections, since for HTTPS connections, the entire HTTP request is encrypted (so it appears to the proxy as a "meaningless" TCP connection). In other words, an HTTPS session over a proxy, or a SSH/FTPS session over a proxy, will both appear as "encrypted sessions" to the proxy, and it won't be able to tell them apart, so it has to either allow them all or none of them.
During normal operation, the HTTP proxy receives the HTTP request, and is "smart enough" to understand the request to be able to do high level things with it (e.g. search its cache to see if it can serve the response without going to the destination server, or consult a whitelist/blacklist to see if this URL is allowed, etc.). In "CONNECT" mode, none of this happens. The proxy establishes a TCP connection to the destination server, and simply forwards all traffic from the client to the destination server and all traffic from the destination server to the client. That means any TCP protocol can work (HTTPS, SSH, FTP, even plain HTTP).
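The forwarding in "CONNECT" mode can be sketched in a few lines of Python. This is a simplified illustration with hypothetical function names, using one thread per direction; a real proxy would add timeouts, error handling, and usually non-blocking I/O.

```python
import socket
import threading


def pump(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes from src to dst until src reports end-of-stream."""
    while True:
        chunk = src.recv(4096)
        if not chunk:
            break
        dst.sendall(chunk)  # forwarded as-is, never parsed
    try:
        dst.shutdown(socket.SHUT_WR)  # pass the end-of-stream along
    except OSError:
        pass


def relay(client: socket.socket, upstream: socket.socket):
    """Start forwarding in both directions; returns the two worker threads."""
    workers = [
        threading.Thread(target=pump, args=(client, upstream), daemon=True),
        threading.Thread(target=pump, args=(upstream, client), daemon=True),
    ]
    for worker in workers:
        worker.start()
    return workers
```

Nothing in `pump` looks at the payload, which is why the tunnel works equally well for TLS, SSH, or any other TCP protocol.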