Intentionally bad HTTP clients for testing - http

What are good "bad HTTP client"-s I can use to test your HTTP servers?
For instance, there are servers like
https://httpbin.org/
https://badssl.com/
which allow you to test client against different, sometimes intentionally bad, behavior.
I seek for HTTP client utility for testing HTTP servers. It may send wrong Content-Length or close connection in the middle of request, or do other bad things which robust HTTP server should handle.

In the past, I have used Tamper Data (for Firefox). I see that there is an equivalent for Chrome - Tamper Chrome.
These plugins allow you to edit the HTTP request prior to sending it to the server. This way you can do a number of test cases e.g.:
Changing the content length or type
Bypassing the client-side field validation
These are great for manual, exploratory testing.

Related

Can I do something similar to pipelining on HTTP 2?

Pipelining is a technique in HTTP/1.1 where multiple requests are sent at once without waiting for a response, on a keepalive connection. The responses are then returned in order by the server, without waiting for a round-trip-time between a response being sent and the next request being received.
HTTP/2 adds a feature called multiplexing, which similarly allows the client to send off multiple requests at once. In this case however, the server can send responses all at once.
Without control of the server, Can I achieve something similar to pipelining (i.e. receiving responses in order one-at-a-time without latency between responses) when using HTTP/2?
This would be useful when downloading many large files, without much available memory to buffer several partially-completed responses.
Without control of the server, Can I achieve something similar to pipelining (i.e. receiving responses in order one-at-a-time without latency between responses) when using HTTP/2?
No you cannot, unless the server cooperates (for example the server can be configured to handle requests sequentially or something similar).
As a side note, while request pipelining was allowed in HTTP/1.1, it has always been considered a bad idea and as such made irrelevant by all major implementations (i.e. browsers don't do it, servers don't really support it, etc.).
The main problem is error handling and buggy proxy servers.
HTTP/2 allows a client to set priorities on requests so that requests are processed in priority order.
However, this feature is optional and servers may not implement it, so again you need to carefully choose/configure the server in order to get the behavior you want.
If you can control a little the server side, for both HTTP/1.1 and HTTP/2, a better solution would be to ask the server for all the files in a single request, and have the server reply with a multipart response.

How to tell a proxy a connection is still used using HTTP communication?

I have a client side GUI app for human usage that consumes some SOAP web services and uses cURL as the underlying HTTP communication lib. Depending on the input, processing a request can take some large amount of time, even one hour. Neither the client nor server time out for that reason on their own and that's tested and works. Most of the requests get processed in some minutes anyway, so this is an edge case.
One of my users is forced to use a proxy between my client app and my server and for various reasons has no control over it. That proxy has a time out configured and closes the connection to my client after 4 minutes of no data transfer. So the user can (and did) upload data for e.g. 30 minutes, afterwards the server starts to process the data and after 4 minutes the proxy closes the connection, the server will silently continue to process the request, but the user is left with some error message AND won't get the processing result. My app already uses TCP Keep Alive, so that shouldn't be the problem, but instead the time out seems to be defined for higher level data. It works the same like the option read_timeout for squid, which I used to reproduce the behaviour in our internal setup.
What I would like to do now is start a background thread in my web service which simply outputs some garbage data to my client over all the time the request is processed, which is ignored by the client and tells the proxy that the connection is still active. I can recognize my client using the user agent and can configure if to ouput that data or not server side and such, so other clients consuming the web service wouldn't get a problem.
What I'm asking for is, if there's any HTTP compliant method to output such garbage data before the actual HTTP response? So e.g. would it be enough to simply output \r\n without any additional content over and over again to be HTTP compliant with all requesting libs? Or maybe even binary 0? Or some full fledged HTTP headers stating something like "real answer about to come, please be patient"? From my investigation this pretty much sounds like chunked HTTP encoding, but I'm not sure yet if this is applicable.
I would like to have the following, where all those "Wait" stuff is simply ignored in the end and the real HTTP response at the end contains Content-Length and such.
Wait...\r\n
Wait...\r\n
Wait...\r\n
[...]
HTTP/1.1 200 OK\r\n
Server: Apache/2.4.23 (Win64) mod_jk/1.2.41\r\n
[...]
<?xml version="1.0" encoding="UTF-8"?><soap:Envelope[...]
Is that possible in some standard HTTP way and if so, what's the approach I need to take? Thanks!
HTTP Status 102
Isn't HTTP Status 102 exactly what I need? As I understand the spec, I can simply print that response line over and over again until the final response is available?
HTTP Status 102 was a dead-end, two things might work, depending on the proxy used: A NPH script can be used to regularly print headers directly to the client. The important thing is that NPH scripts normally bypass header buffers from the web server and can therefore be transferred over the wire as needed. They "only" need be correct HTTP headers and depending on the web server and proxy and such it might be a good idea to create incrementing, unique headers. Simply by adding some counter in the header name.
The second thing is chunked transfer-encoding, in which case small chunks of dummy data can be printed to the client in the response body. The good thing is that such small amount of data can be transferred over the wire as needed using server side flush and such, the bad thing is that the client receives this data and by default behaves as if it was part of the expected response body. That might break the application of course, but most HTTP libs provide callbacks for processing received data and if you print some unique one, the client should be able to filter the garbage out.
In my case the web service is spawning some background thread and depending on the entry point of the service requested it either prints headers using NPH or chunks of data. In both cases the data can be the same, so a NPH-header can be used for chunked transfer-encoding as well.
My NPH solution doesn't work with Squid, but the chunked one does. The problem with Squid is that its read_timeout setting is not low level for the connection to receive data at all, but instead some logical HTTP thing. This means that Squid does receive my headers, but it expects a complete HTTP header within the period of time defined using read_timeout. With my NPH approach this isn't the case, simply because by design I only want to send some garbage headers to ignore until the real headers arrive.
Additionally, one has to be careful about NPH in Apache httpd, but in my use case it works. I can see the individual headers in Squid's log and without any garbage after the response body or such. Avoid the Action directive.
Apache2 sends two HTTP headers with a mapped "nph-" CGI

Can the HTTP-Header be send long before the whole body using HTTPS

I've heared that you (in some cases) can prevent timeouts by sending the HTTP-header back to the client before the whole HTTP-body is prepared.
I know that this is impossible using gzip ... but is this possible using HTTPS?
I read in some posts that the secure part of HTTPS is done in the transport-layer (TLS/SSL) - therefore it should be possible, right?
Sorry for mixing gzip in here - it's a completely different level - I know ... and it may is more confusing than giving an example ;)
In HTTP 1.1 it's possible to send the response header before preparing of the body of the response is completed . To do this one normally uses chunked encoding.
Some servers also stream the data as is by not specifying the content length and indicating the end of stream by closing connection, but this is quite a brutal way to do things (chunked encoding was designed exactly for sending the data before it's completely available).
As HTTP(S) is HTTP running over SSL/TLS channel, TLS doesn't affect the above behavior in any way.
Yes, you can do this. HTTPS is just HTTP over an TLS/SSL transport, the HTTP protocol is exactly the same.

Is SPDY any different than http multiplexing over keep alive connections

HTTP 1.1 supports keep alive connections, connections are not closed until "Connection: close" is sent.
So, if the browser, in this case firefox has network.http.pipelining enabled and network.http.pipelining.maxrequests increased isn't the same effect in the end?
I know that these settings are disabled because for some websites this could increase load but I think a simple http header flag could tell the browser that is ok tu use multiplexing and this problem can be solved easier.
Wouldn't be easier to change default settings in browsers than invent a new protocol that increases complexity especially in the http servers?
SPDY has a number of advantages that go beyond what HTTP pipelining can offer, which are described in the SPDY whitepaper:
With pipelining, the server still has to return the responses one at a time in the order they were requested. This can be a problem if the client requests a resource that's dynamically generated before one that is static: the server cannot send any of the "easy" static responses until the dynamically generated one has been generated and sent. With SPDY, responses can be returned out of order or in parallel as they are generated, lowering the total time to receive all resources.
As you noted in your question, not all servers are able to deal with pipelining: it's not just load, some servers actually behave incorrectly when the client requests pipelining. Using a header to indicate that it's okay to do pipelining is too late to get the maximum benefit: you are already receiving the first response at that point, so while you can use it on future connections it's already too late for this one.
SPDY compresses headers using an algorithm which is specific to that task (stateful and with knowledge of what is normally in HTTP headers); while yes, SSL already includes compression, just compressing them with deflate is not as efficient. Most HTTP requests have no bodies and only a short GET line, so the headers make up virtually the entire request: any compression you can get is an improvement. Many responses are also small compared to their headers.
SPDY allows servers to send back additional responses without the client asking for them. For example, a server might start sending back the CSS for a page along with the original HTML, before the client has had a chance to receive and parse the HTML to determine the stylesheet URL. This can speed up page loads even further by eliminating the need for the client to actually parse the HTML before requesting other resources needed to render the page. It also supports a less bandwidth-heavy version of this feature where it can "hint" about which resources might be needed, and allow the client to decide: this allows, for example, clients that don't care about images to not bother to request them, but clients that want to display images can still request the images using the given URLs without needing to wait for the HTML.
Other things too: see William Chan's answer for even more.
HTTP pipelining is susceptible to head of line blocking (http://en.wikipedia.org/wiki/Head-of-line_blocking) at the HTTP transaction level whereas SPDY only has head of line blocking at the transport level, due to its use of multiplexing.
HTTP pipelining has deployability issues. See https://datatracker.ietf.org/doc/html/draft-nottingham-http-pipeline-01 which describes a number of different workarounds and heuristics to mitigate this. SPDY as deployed in the wild does not have this problem since it is generally deployed over SSL (port 443) using NPN (http://technotes.googlecode.com/git/nextprotoneg.html) to negotiate SPDY support. SSL is key, since it prevents intermediaries from interfering.
SPDY has header compression. See http://dev.chromium.org/spdy/spdy-whitepaper which discusses some benchmark results of the benefits of header compression. Now, it's useful to note that bandwidth is less and less of an issue (see http://www.belshe.com/2010/05/24/more-bandwidth-doesnt-matter-much/), but it's also useful to remember that bandwidth is still key for mobile. Check out https://developers.google.com/speed/articles/spdy-for-mobile which shows how beneficial SPDY is for mobile.
SPDY supports features like server push. See http://dev.chromium.org/spdy/spdy-best-practices for ways to use server push to improve cacheability of content and still reduce roundtrips.
HTTP pipelining has ill-defined failure semantics. When the server closes the connection, how do you know which requests have been successfully processed? This is a major reason why POST and other non-idempotent requests are not allowed over pipelined connections. SPDY provides semantics to cancel individual streams on the same connection, and also has a GOAWAY frame which indicates the last stream to be successfully processed.
HTTP pipelining has difficulty, often due to intermediaries, in allowing deep pipelines. This (in addition to many other reasons like HoL blocking) means that you still need to utilize multiple TCP connections to achieve maximal parallelization. Using multiple TCP connections means that congestion control information cannot be shared, that compression contexts cannot be shared (like SPDY does with headers), is worse for the internet (more costly for intermediaries and servers).
I could go on and on about HTTP pipelining vs SPDY. But I'd recommend just reading up on SPDY. Check out http://dev.chromium.org/spdy and our tech talk on SPDY at http://www.youtube.com/watch?v=TNBkxA313kk&list=PLE0E03DF19D90B5F4&index=2&feature=plpp_video.
See Difference between HTTP pipeling and HTTP multiplexing with SPDY

What things should be kept in mind while desigining an HTTP based protocol?

I have heard that http is a nice way to design my own protocol. although i can design a binary protocol, i would prefer to follow the HTTP standard to design my protocol.
basically the flow of the application is that with the request the client sends some parameter strings to the server, the server sends the response string to the application. this procedure continues several times, before the connection thread terminates.
i am using java servlets for the above.
how should the client send the HTTP parameters so that parsing is easy at the server.
Get /a HTTP/1.1
Host: localhost
??? what comes here
??? what comes here
Since that is a GET request, nothing.
I'd suggest using querystring parameters, then you can access them using ServletRequest.getParameterNames(), getParameterValues(), getParameterMap() etc.
So, your request line would take the form:
GET /a?x=1&y=1 HTTP/1.1
since this is the standard way of passing parameter data, other clients, such as web browsers, will be able to use your service easily.
This assumes that the operation does not cause side-effects on the server. If it does then you should be using a POST, PUT or DELETE request depending on the exact nature of the operation.
HTTP Made Really Easy is a useful document since, at least initially, the HTTP Spec can be a bit daunting.
Why not base your protocol on something that already exists for example SOAP?
What you're designing is a data exchange format, not a protocol really.
So the question is, really, what sort of data do you want to send? To answer that, you need to consider who is receiving it. If it's yourself, then just keep it simple.

Resources