tomcat compression - http

If compression is setup on tomcat, will it also compress data that is uploaded by the client - via browser/applet ?

No, it won't. It only applies to the server response. The client has to compress the request data itself; it makes no sense to send the data uncompressed over the network to the server first and only compress it there, since that gives none of the benefits (saving network bandwidth and so on).
Compression of HTTP requests is, however, not something the client can negotiate with the server: it can't know beforehand whether the server supports it, because it would have to fire a whole request first. Negotiation is only specified for HTTP responses, where the server can determine from the Accept-Encoding request header whether the client supports compression and handle it accordingly.
In an applet, you could consider sending the data compressed using a GZIPOutputStream. You'll only need to develop a specific servlet on the server side which listens for requests from the applet only and knows that it needs to wrap HttpServletRequest#getInputStream() in a GZIPInputStream before reading it.
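A minimal sketch of what such a servlet could look like (the /gzipUpload mapping and class name are made up for illustration, and it assumes the applet always sends a gzip-compressed body):

    import java.io.IOException;
    import java.io.InputStream;
    import java.util.zip.GZIPInputStream;
    import javax.servlet.ServletException;
    import javax.servlet.annotation.WebServlet;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Hypothetical servlet that only the applet is expected to call.
    @WebServlet("/gzipUpload")
    public class GzipUploadServlet extends HttpServlet {

        @Override
        protected void doPost(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            // Wrap the raw request body in a GZIPInputStream so the compressed
            // upload is transparently decompressed while reading.
            try (InputStream in = new GZIPInputStream(request.getInputStream())) {
                byte[] buffer = new byte[8192];
                long total = 0;
                for (int read; (read = in.read(buffer)) != -1; ) {
                    // ... store or process the decompressed bytes here ...
                    total += read;
                }
                response.getWriter().println("Received " + total + " decompressed bytes");
            }
        }
    }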

Related

How to act as a middleman server to add HTTP headers between client and remote server?

I have a server which acts as a middle man between an HTTP client that I don't control and a remote file hosting server I don't control. I want to expose a URL through which the client can download a chunk (specified by HTTP range headers my server provides) of a file on the remote server.
There are two important constraints here: I'd like to facilitate this partial download without having the response flow back through my server (response goes straight to client) and without writing a custom client. How can I accomplish this?
One option I tried was having my endpoint send a redirect response with the range headers set on the response, but unfortunately those do not get forwarded onto the subsequent request from the client and as a result the entire file is downloaded. Are there any other hacky tricks / network wizardry I can employ to achieve this end given the constraints?
I have also been thinking about this for five days. The way it works is that the server only gives you the file when you supply the required header from your side; without that header it will deny your request. If your middleman does make the request with the required header, then the file becomes accessible to the client through your middleman. But you are trying to have the client get the file from the server rather than from your custom server, and your custom server is the one that would be passing the headers to the server on behalf of your client.
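For reference, the attempted redirect described above would look roughly like this sketch (servlet-style, with a made-up mapping and placeholder remote URL); the comment marks the part that does not survive the redirect:

    import java.io.IOException;
    import javax.servlet.ServletException;
    import javax.servlet.annotation.WebServlet;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Hypothetical middleman endpoint that tries to hand the client off to the remote host.
    @WebServlet("/chunk")
    public class ChunkRedirectServlet extends HttpServlet {

        @Override
        protected void doGet(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            // Redirect the client straight to the file host (placeholder URL).
            response.setStatus(HttpServletResponse.SC_FOUND);
            response.setHeader("Location", "https://files.example.com/big-file.bin");

            // This Range header only describes the redirect response itself;
            // clients do not copy it onto the follow-up request, so the remote
            // server still sees a plain GET and returns the whole file.
            response.setHeader("Range", "bytes=0-1048575");
        }
    }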

How to check the communication options of an entire web server using the HTTP OPTIONS method?

According to the documentation of the HTTP OPTIONS method, one can check the communication options of an entire web server, assuming the server supports such a check. My understanding is that one needs to make an HTTP request to the server to be checked with the first line of the request being OPTIONS * HTTP/1.1. How can one make such a request with a common HTTP client? I wasn't able to do it with the Postman client or the Requests HTTP client library for Python. Specifically, specifying the asterisk * along with the server's location (http://<host>/*, for example) didn't work.
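One way to see what such a request looks like on the wire is to write it by hand over a raw socket, since most client libraries build the request line from a URL and never emit the asterisk form. A rough sketch in Java (the host name and port are placeholders):

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.io.OutputStream;
    import java.net.Socket;
    import java.nio.charset.StandardCharsets;

    public class ServerWideOptions {
        public static void main(String[] args) throws Exception {
            String host = "example.com";   // placeholder host
            try (Socket socket = new Socket(host, 80)) {
                OutputStream out = socket.getOutputStream();
                // The asterisk request target applies to the server as a whole,
                // not to any particular resource.
                String request = "OPTIONS * HTTP/1.1\r\n"
                        + "Host: " + host + "\r\n"
                        + "Connection: close\r\n"
                        + "\r\n";
                out.write(request.getBytes(StandardCharsets.US_ASCII));
                out.flush();

                BufferedReader in = new BufferedReader(
                        new InputStreamReader(socket.getInputStream(), StandardCharsets.US_ASCII));
                for (String line; (line = in.readLine()) != null; ) {
                    System.out.println(line);   // status line and headers, e.g. an Allow header
                }
            }
        }
    }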

The use of HTTP headers

The documentation on developer.mozilla.org says:
HTTP headers allow the client and the server to pass additional information with the request or the response
but I don't understand what the use of that is. What is the need to pass additional information with the request or the response?
This is a hard question to answer concisely because of the many different types of HTTP headers and what they do, but here's an attempt at a one-line answer:
HTTP headers allow a client and server to understand each other better, meaning they can communicate more effectively.
So then if you look at individual headers, it becomes clearer why each is needed:
User-Agent header
Sent by the client
Tells the server about the client's setup (browser, OS etc.)
Mostly used to improve client experience, e.g. tailoring responses for mobile devices or dealing with browser compatibility issues
Set-Cookie header
Sent by the server
Tells the browser to set a cookie
Host header
Sent by the client
Specifies the exact domain name of the site the client wants to reach; this is used when a single server hosts multiple websites (a.k.a. virtual hosting)
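As a concrete illustration, here is a small sketch using Java's built-in HttpClient that sends a User-Agent request header and prints any Set-Cookie headers the server returns (the URL and the User-Agent value are placeholders):

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class HeaderDemo {
        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();

            // The client identifies its setup via the User-Agent request header.
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("https://example.com/"))   // placeholder URL
                    .header("User-Agent", "HeaderDemo/1.0")
                    .GET()
                    .build();

            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

            // The server may ask the client to store cookies via Set-Cookie response headers.
            response.headers().allValues("Set-Cookie")
                    .forEach(cookie -> System.out.println("Set-Cookie: " + cookie));
        }
    }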

Varnish + Static HTML Pages

I've recently come across an HTTP web accelerator called Varnish. From what I've read, Varnish speeds up delivery of a website by optimizing every step of the HTTP communication with the HTTP server, using a reverse proxy configuration.
My question is: if you have a website whose caching mechanism is configured all the way down to static HTML files, how much more of an effect will Varnish have on it? Does a reverse proxy cut down the work that is performed by the HTTP server to process the request? If you have everything extensively cached on the server side (HTTP headers, ETags, Expires headers, database caching, fragment and page caching), then what more will an HTTP accelerator do to improve on this?
Firstly, we should differentiate between two different types of caching that go on in a normal web system: HTTP caching and server-side caching.
HTTP caching is controlled by HTTP headers, notably as you point out ETag and the various expiry mechanisms (including Expires and various aspects of Cache-Control). This is all covered in RFC 2616 (HTTP), section 13, and allows HTTP caches to return a response to an HTTP request from a client without having to go back to the origin server. In effect, the HTTP caching mechanism allows another machine between client and server to act as if it's the server, in certain cases. This is actually what varnish is doing, as we'll see in a minute; another common use that many people are familiar with is when ISPs provide an HTTP cache within their network, that can generally respond faster to their subscribers (and so improve perceived performance) than the origin servers outside their network.
Server-side caching includes database caching, and fragment and page caching, which are really all just ways of the web server avoiding doing some expensive operation (say, a database query, or rendering a particular piece of a template) by doing it once then keeping the result in a cache for a while.
I said earlier that varnish was an HTTP cache, which means that straight away it's able to be more efficient than a web server serving even a static file. Consider what a web server has to do:
parse the HTTP request
map the URI (and any relevant request headers, such as Accept-Encoding) onto a file
pull up information about the file to build the HTTP headers in the response; these are known as entity headers (RFC 2616 section 7.1, which include things such as Content-Length, Content-Type and the Expires and Last-Modified headers used in HTTP caching)
figure out what additional response headers (RFC 2616 section 6.2; these include ETag and Vary, both important parts of HTTP caching) and general header fields (RFC 2616 section 4.5) are needed
write the HTTP status line and headers out to the network
write the file's contents out to the network
By comparison, varnish sits in front of all of this, so all it has to do is:
parse the HTTP request
map the URI (and any relevant request headers) onto an entry in its internal cache
see if there's an entry; if there is, write it to the network; the HTTP headers will have been stored in the cache
If there isn't an entry, varnish has to do a little more work:
connect to a web server behind it that will run through all the steps 1-6 in the first list to generate a response
write the response to the network, including all the HTTP headers
store the response in its cache
In particular, because the HTTP headers and entity body (the entire response) can be cached by varnish, it has less work to do whenever it can serve out of its cache. When you start generating the response dynamically on your server, the difference can become even more pronounced: say you have a page that takes 5 seconds to generate but is the same for everyone hitting your site; varnish should be able to serve that in at most milliseconds out of the cache (plus whatever time it takes to get the response across the network to the HTTP client), and it has a neat mechanism (the grace period) so it can keep on doing so while hitting the backend server only once to refresh the cached version of the page.
Of course, you can introduce server-side caching to improve the speed with which your web server can process a request, but if you have a response you can cache in varnish it's generally going to be faster to do that. (There are various things that are hard to cache in varnish, particularly if you're using cookies or have pages that change depending on which user is looking at them. While it's possible to continue using varnish in these cases, unless you need really incredible speed, as far as I'm aware most people start optimising those cases using server-side caching and other techniques before hitting up varnish.)
(Note that varnish can also edit headers and indeed data going in and out of the cache, which complicates things. But the main points still stand, and even while editing things on the fly varnish can be incredibly fast.)
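To make the HTTP-caching side of this concrete, here is a rough servlet sketch that emits the kinds of headers discussed above (Cache-Control, Expires, ETag) so that an HTTP cache such as varnish, or the browser itself, can reuse the response; the mapping, page content and freshness values are made up for illustration:

    import java.io.IOException;
    import javax.servlet.ServletException;
    import javax.servlet.annotation.WebServlet;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Hypothetical servlet serving a page that is the same for every visitor.
    @WebServlet("/cached-page")
    public class CacheablePageServlet extends HttpServlet {

        @Override
        protected void doGet(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            String body = "<html><body>Same page for everyone</body></html>";
            // A simple validator derived from the content; real code would use
            // something cheaper and more stable than hashCode().
            String etag = "\"" + Integer.toHexString(body.hashCode()) + "\"";

            // If the client (or an intermediate cache) already has this version,
            // answer 304 Not Modified without resending the body.
            if (etag.equals(request.getHeader("If-None-Match"))) {
                response.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
                return;
            }

            // Freshness information: caches may serve this response for 5 minutes.
            response.setHeader("Cache-Control", "public, max-age=300");
            response.setDateHeader("Expires", System.currentTimeMillis() + 300_000L);
            response.setHeader("ETag", etag);
            response.setContentType("text/html");
            response.getWriter().write(body);
        }
    }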

Is it possible to use Content-Encoding: gzip in a HTTP POST request?

I'm trying to upload some files with compression to a server. The files will be fairly large and the server is a standard HTTP server where the interface defines that they're not compressed. Is it possible to use something like Content-Encoding to indicate that the upload request is compressed, much like it is used for downstream compression?
Apache supports it with the mod_deflate module, but it does not look like a common web server feature. If you have access to the server, you can enable this module or rewrite the server-side code to handle your compressed data (e.g. a special servlet/PHP script which calls the original servlet/PHP with the decompressed data).
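On the client side, sending such a request is straightforward; a rough sketch using HttpURLConnection (the URL and payload are placeholders, and it assumes the server has been set up to decompress the body as described above):

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;
    import java.util.zip.GZIPOutputStream;

    public class GzipPost {
        public static void main(String[] args) throws Exception {
            byte[] payload = "some large request body".getBytes(StandardCharsets.UTF_8);  // placeholder data

            HttpURLConnection connection =
                    (HttpURLConnection) new URL("http://example.com/upload").openConnection();  // placeholder URL
            connection.setRequestMethod("POST");
            connection.setDoOutput(true);
            // Tell the server the request body itself is gzip-compressed.
            connection.setRequestProperty("Content-Encoding", "gzip");
            connection.setRequestProperty("Content-Type", "application/octet-stream");

            // Compress the payload on the fly while writing it to the request body.
            try (OutputStream out = new GZIPOutputStream(connection.getOutputStream())) {
                out.write(payload);
            }

            System.out.println("Server responded with HTTP " + connection.getResponseCode());
        }
    }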
