HTTPListener response - how to define packet boundaries - multipart

I am using the HTTPListener class to implement a basic web server. My response uses a Content-Type of "multipart/x-mixed-replace" to return a stream of JPEG images. I am able to construct my response correctly; however, my web client does not interpret it properly because of the way the response is split across IP packet boundaries.
Using a separate server implemented in Python, I am able to generate a good, working case. The response to the client's HTTP GET request looks like this:
packet 1:
HTTP/1.1 200 OK
packet 2:
Server: (myServer)
Date: (the date)
Connection: close
(other header info)
Content-Type: multipart/x-mixed-replace; boundary=myboundary
packet 3:
--myboundary
Content-Length: 1042
Content-Type: image/jpeg
(jpeg data)
In the failing case, using the HTTPListener, everything is sent in a single packet:
packet 1:
HTTP/1.1 200 OK
Server: (myServer)
Date: (the date)
Connection: close
(other header info)
Content-Type: multipart/x-mixed-replace; boundary=myboundary
--myboundary
Content-Length: 1042
Content-Type: image/jpeg
(jpeg data)
So my question is, how do I manipulate the HTTPListenerResponse's OutputStream to force a packet boundary? I want to be able to specify some data, manually tell the OutputStream to push out a packet, then specify some more data and push out another packet. Does the HTTPListener offer this level of control, or do I need to instead use a TCPListener? I've not been able to find a solution; please help!

If your client doesn't work because of IP packet boundaries it is severely broken. Fix the client, don't add crutches for it in a place where there isn't a problem. HTTP is defined over TCP, and TCP is a byte-stream protocol. Period. Any correctly written TCP program, let alone an HTTP client, doesn't care where the packet boundaries are. If your client misbehaves in this way it will misbehave in other ways as well. Fix it.
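To illustrate what "doesn't care where the packet boundaries are" means in practice, here is a minimal Python sketch of a client-side parser for the stream shown in the question. It buffers whatever bytes arrive and only emits a part once the whole part is present. The class name `MultipartStreamParser` and its `feed` method are made up for this example, and the sketch assumes CRLF line endings and a Content-Length header on every part, as in the capture above.

```python
class MultipartStreamParser:
    """Extracts parts from a multipart/x-mixed-replace body, however it is sliced."""

    def __init__(self, boundary: bytes):
        self._delimiter = b"--" + boundary
        self._buffer = bytearray()

    def feed(self, data: bytes) -> list:
        """Add newly received bytes; return the bodies of all parts now complete."""
        self._buffer += data
        parts = []
        while True:
            part = self._try_parse_part()
            if part is None:
                return parts
            parts.append(part)

    def _try_parse_part(self):
        buf = self._buffer
        start = buf.find(self._delimiter)
        if start < 0:
            return None  # delimiter not (fully) received yet
        header_start = buf.find(b"\r\n", start)
        if header_start < 0:
            return None  # delimiter line not terminated yet
        header_start += 2
        header_end = buf.find(b"\r\n\r\n", header_start)
        if header_end < 0:
            return None  # part headers still incomplete
        headers = {}
        for line in bytes(buf[header_start:header_end]).split(b"\r\n"):
            name, _, value = line.partition(b":")
            headers[name.strip().lower()] = value.strip()
        # Assumption: every part declares its length, as in the question.
        length = int(headers[b"content-length"])
        body_start = header_end + 4
        if len(buf) < body_start + length:
            return None  # body still incomplete, wait for more bytes
        body = bytes(buf[body_start:body_start + length])
        del buf[:body_start + length]  # keep any bytes of the next part
        return body
```

Feeding the same stream in one call or one byte at a time yields the same parts, which is the property a TCP client needs.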

Related

Confusion with HTTP

I am new to web development and came across the term HTTP. I have done some research and want to make sure I understand it correctly. Is it true that HTTP is, in simple words, a letter containing information in a language that both client and server can understand? That letter is sent to the server thanks to TCP/IP, which serves as a car that takes the letter to the server. After the letter is delivered, the server reads its content and, if it is a GET request, the server takes the necessary data, ATTACHES that data to the letter, and sends it back to the client, again via TCP/IP. But if it was a POST request, then the client ATTACHES the DATA to the letter and sends it to the server so that the server saves that data in the database. Is that true?
HTTP is a text protocol in which the client (usually a browser) sends a request, which is just properly formatted text. The server then sends a response, which is also properly formatted text. TCP/IP is only used to carry the bytes, which the browser and the web server interpret as this text.
Both the GET and POST methods send text requests and receive text responses. The most significant difference between the two is that with the POST method, the request can have a body.
It would look like this:
> POST /post HTTP/1.1
> Host: example.com
> Content-Length: 19
> Content-Type: application/x-www-form-urlencoded
>
> foo1=bar1&foo2=bar2 <- this is the request's body; this is what it would look like if a form was submitted. A GET request doesn't have this part.
< HTTP/1.1 200 OK
< Content-Type: application/json; charset=utf-8
< Date: Mon, 17 Feb 2020 22:24:29 GMT
< Content-Length: 27
<
< {"response":{"text": "OK"}} <- this is a response's body
(> marks a request, text sent to the server; < marks a response, text sent back to the user. Those characters are not actually sent; they are only here so we can tell the two apart.)
You may say that images are sent over HTTP too, and they are not text. This is true, but as we all know, for computers everything is just bytes. It's only how we interpret them that makes a distinction between text and image. This is why you can open an image file in Notepad and you will see ... text. Weirdly formatted and impossible to understand, but still text.
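To make the "it is all just formatted text turned into bytes" point concrete, here is a small Python sketch that builds the POST request from the example and splits it back into start line, headers and body. The helper names `build_post_request` and `split_message` are invented for this illustration; nothing is sent over the network.

```python
def build_post_request(host: str, path: str, form_body: str) -> bytes:
    """Format a form-style POST request exactly as it travels on the wire."""
    body = form_body.encode("ascii")
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        "\r\n"  # the blank line separates the headers from the body
    ).encode("ascii")
    return head + body


def split_message(raw: bytes):
    """Split raw message bytes into (start line, header dict, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    start_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return start_line, headers, body


raw = build_post_request("example.com", "/post", "foo1=bar1&foo2=bar2")
start_line, headers, body = split_message(raw)
print(start_line)
print(headers)
print(body)
```

The body stays as bytes on purpose: for a form it happens to be readable text, for an image it would not be.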

Why does my HTTP/1.0 GET request return OK but no body content?

I'm making a simple http page-requester in C. It uses sockets to send HTTP/1.0 GET requests to hosts, and parses the answer to effectively download the html file.
However, when I send a request like this:
GET http://stackoverflow.com/questions HTTP/1.0
User-Agent: myRequester/1.0
It returns this
HTTP/1.1 200 OK
Cache-Control: private
Content-Type: text/html; charset=utf-8
Date: Mon, 19 Dec 2011 15:28:08 GMT
Content-Length: 54362
Connection: close
But no body content.
Yes, I've put CRLF on the end of every line and a blank line at the end.
I use only one socket over a single connection, and I also have to stick to HTTP/1.0.
The most likely explanation is that the server is actually sending a body but you are not reading all of it. Most networking APIs do not necessarily return the whole response in one function call: they hand back whatever data is available immediately, even if more is expected.
The Unix system call recv returns zero when the connection has ended. You should keep calling it until you get zero or an error.
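The read loop described above can be sketched as follows. The question is about C, but Python's `socket.recv` wraps the same system call and signals a closed connection with an empty result, so the shape of the loop is the same. The function name `recv_until_closed` is made up for this example.

```python
import socket


def recv_until_closed(sock: socket.socket, bufsize: int = 4096) -> bytes:
    """Keep calling recv until the peer closes the connection."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)  # may return fewer bytes than requested
        if not chunk:  # b"" means the peer has closed its side
            break
        chunks.append(chunk)
    return b"".join(chunks)
```

With `Connection: close`, as in the response shown, the server closes the socket after the body, so this loop ends once the headers and the full body have been read.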

Chunked Transfer Encoding problem with Apache Abdera

I'm using Apache Abdera to POST atom multipart data to my server, and am having some odd problems that I can't pin down.
It looks like an issue with chunked transfer encoding, but I'm insufficiently experienced to be certain. The problem manifests as the server throwing an error indicating that the request I sent it contains only one mime part, not two as required. I attached Wireshark to the interface and captured the conversation, and it went like this:
POST /sss/col-uri/2ee98ea1-f9ad-4f01-9b1c-cfa3c4a6dc3c HTTP/1.1
Host: localhost
Expect: 100-continue
Transfer-Encoding: chunked
Content-Type: multipart/related; boundary="1306399868259";type="application/atom+xml;type=entry"
The server's response:
HTTP/1.1 100 Continue
My client continues:
198
--1306399868259
Content-Type: application/atom+xml;type=entry
Content-Disposition: attachment; name="atom"
<entry xmlns="http://www.w3.org/2005/Atom"><title xmlns="http://purl.org/dc/terms/">Richard Woz Ere</title><bibliographicCitation xmlns="http://purl.org/dc/terms/">this is my citation</bibliographicCitation><content type="application/zip" src="cid:48bd9436-e8b6-4f68-aa83-5c88eda52fd4" /></entry>
0
b0e9
--1306399868259
Content-Type: application/zip
Content-Disposition: attachment; name="payload"; filename="example.zip"
Content-ID: <48bd9436-e8b6-4f68-aa83-5c88eda52fd4>
Packaging: http://purl.org/net/sword/package/SimpleZip
And at this point the server responds with:
HTTP/1.1 400 Bad Request
Date: Thu, 26 May 2011 08:51:08 GMT
Server: Apache/2.2.17 (Unix) mod_ssl/2.2.17 OpenSSL/0.9.8l DAV/2 mod_wsgi/3.3 Python/2.6.1
Connection: close
Transfer-Encoding: chunked
Content-Type: text/xml
Indicating the error (which is well understood). My client goes on to stream a pile of base64-encoded bits onto the output stream, but in the meantime the server is not listening; it has already decided that the request was erroneous.
Unfortunately, I'm not in charge of the HTTP layer - this is all handled by Abdera using Apache httpclient. My code that does this looks like this:
client.execute("POST", url.toString(), new SWORDMultipartRequestEntity(deposit), options);
Here, the SWORDMultipartRequestEntity is a copy of the standard Abdera MultipartRequestEntity class, with a few extra headers thrown in (see, for example, Packaging in the above snippet); the "deposit" argument is just an object holding the atom part and the inputstream.
When attaching a debugger I get to this line of code fine, and then it disappears into a rat hole and then I get this error back.
Any hints or tips? I've pretty much exhausted my angles of attack!
The only thing that stands out for me is that immediately after the atom:entry document, there is a newline with "0" on it alone, which appears to be chunked transfer encoding speak for "I'm finished". Not sure how it got there, or whether it really has any effect. Help much appreciated.
Cheers,
Richard
The lonely 0 may indeed be a problem. My uninformed guess is that it results from some call to flush(), which then writes the whole buffer as another HTTP chunk. Unfortunately, at the point where flush is called, the buffer had already been flushed and its size is therefore zero. So the HttpChunkedOutputFilter (or whatever it is called) should be taught that an empty buffer does not need to be flushed.
[update:] You should set a breakpoint in the ChunkedOutputStream class, especially the flush method. I just looked at its code and it seems to be ok, but maybe I missed something.
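To show why an empty flush matters, here is a minimal Python sketch of a chunked encoder. It is not the httpclient or Abdera implementation; the class name `ChunkedWriter` is invented for this illustration. In chunked transfer encoding a chunk of size 0 marks the end of the body, so a flush on an empty buffer must write nothing.

```python
import io


class ChunkedWriter:
    """Writes data to `raw` using HTTP/1.1 chunked transfer encoding."""

    def __init__(self, raw):
        self._raw = raw
        self._pending = bytearray()

    def write(self, data: bytes) -> None:
        self._pending += data

    def flush(self) -> None:
        if not self._pending:
            return  # an empty chunk would be read as "end of body"
        self._raw.write(b"%x\r\n" % len(self._pending))  # size in hex
        self._raw.write(bytes(self._pending))
        self._raw.write(b"\r\n")
        self._pending.clear()

    def close(self) -> None:
        self.flush()
        self._raw.write(b"0\r\n\r\n")  # the one legitimate zero-size chunk


out = io.BytesIO()
writer = ChunkedWriter(out)
writer.write(b"hello")
writer.flush()
writer.flush()  # second flush has nothing to send and writes nothing
writer.write(b"!")
writer.close()
print(out.getvalue())
```

Without the empty-buffer check, the second flush would emit `0\r\n\r\n` in the middle of the request, which matches the symptom in the capture: the server sees one part and then the end of the body.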

Http keep-alive connection - response delimiter

Hi, I am writing a custom HTTP server. At first I allowed only one request per connection (Connection: close), and everything was OK. But once I reworked it to use Connection: keep-alive logic (multiple requests per connection), my images stopped displaying. I think it may be a problem with HTTP response delimiters. Are there any? Or how can the browser detect that the current HTTP response is complete? Thanks.
The size of the response body is determined by the Content-Length header or by using chunked transfer encoding.
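As a rough sketch of how a client applies those two rules on a keep-alive connection, the Python function below reads exactly one body from a buffered byte stream and leaves the next response untouched. The name `read_body` and the lowercase header dict are assumptions for this example; for a real socket the stream could come from `sock.makefile("rb")`.

```python
import io


def read_body(stream, headers: dict) -> bytes:
    """Read one message body; `headers` maps lowercase names to values."""
    if headers.get("transfer-encoding", "").lower() == "chunked":
        body = bytearray()
        while True:
            size_line = stream.readline()
            # The chunk size is hexadecimal; ignore any ";extension" after it.
            size = int(size_line.split(b";")[0].strip(), 16)
            if size == 0:
                # Skip optional trailer lines up to the final blank line.
                while stream.readline() not in (b"\r\n", b""):
                    pass
                return bytes(body)
            body += stream.read(size)
            stream.readline()  # consume the CRLF that follows the chunk data
    # Otherwise the body is exactly Content-Length bytes long.
    return stream.read(int(headers["content-length"]))


pipelined = io.BytesIO(b"5\r\nhello\r\n0\r\n\r\nHTTP/1.1 200 OK\r\n")
print(read_body(pipelined, {"transfer-encoding": "chunked"}))
print(pipelined.readline())  # the next response starts right here
```

A server that sends neither header on a keep-alive connection gives the browser no way to find the end of the image, which would explain images that never finish loading.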

Difference between Content-Range and Range headers?

What is the difference between HTTP headers Content-Range and Range? When should each be used?
I am trying to stream an audio file from a particular byte offset. Should I use Content-Range or Range header?
Actually, the accepted answer is not complete. Content-Range is not only used in responses. It is also legal in requests that provide an entity body.
For example, an HTTP PUT provides an entity body, and it might provide only a portion of an entity. The PUT request can then include a Content-Range header telling the server where the partial entity body should be merged into the entity.
For example, let's first create and then append to a file using HTTP:
Request 1:
PUT /file HTTP/1.1
Host: server
Content-Length: 1
a
Request 2:
PUT /file HTTP/1.1
Host: server
Content-Range: bytes 1-1/*
Content-Length: 1
a
Now, let's see the file's contents...
Request 3:
GET /file HTTP/1.1
Host: server
Response:
HTTP/1.1 200 OK
Content-Length: 2
aa
This allows random file access, both READING and WRITING over HTTP. I just wanted to clarify, as I was researching the use of Content-Range in a WebDAV client I am developing, so perhaps this expanded information will prove useful to somebody else.
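On the server side, merging such a partial PUT into the stored entity could look like the Python sketch below. This is only an illustration: the function name `apply_ranged_put` is made up, and whether a given server accepts a PUT with Content-Range at all is an assumption that has to be checked per server. Byte positions in Content-Range are inclusive, so a one-byte body written at offset 1 covers the range 1-1.

```python
import re


def apply_ranged_put(stored: bytearray, content_range: str, body: bytes) -> None:
    """Merge `body` into `stored` at the position named by Content-Range."""
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", content_range)
    if match is None:
        raise ValueError("unsupported Content-Range value")
    first, last = int(match.group(1)), int(match.group(2))
    if last - first + 1 != len(body):
        raise ValueError("range length does not match body length")
    if first > len(stored):
        raise ValueError("range would leave a gap in the stored entity")
    stored[first:last + 1] = body  # overwrite in place, or append at the end


entity = bytearray(b"a")                      # state after the first PUT
apply_ranged_put(entity, "bytes 1-1/*", b"a")  # append one byte at offset 1
print(bytes(entity))
```

The same call overwrites bytes in the middle of the entity when the range falls inside it, which is the random write access described above.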
Range is used in the request, to ask for a particular range (or ranges) of bytes. Content-Range is used in the response, to indicate which bytes the server is giving you (which may be different than the range you requested), as well as how long the entire content is (if known).
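For the streaming case in the question, the client sends Range and the server answers with Content-Range. The Python sketch below shows one way a server might turn a single-range request into a 206 response; the function name `serve_range` is hypothetical, and multi-range requests are simply answered with the full content to keep the example short.

```python
def serve_range(content: bytes, range_header: str):
    """Return (status, headers, body) for a request carrying a Range header."""
    total = len(content)
    unit, _, spec = range_header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        # Unknown unit or several ranges: fall back to the whole content.
        return 200, {"Content-Length": str(total)}, content
    first_text, _, last_text = spec.strip().partition("-")
    if first_text == "":
        # Suffix form "bytes=-N": the last N bytes.
        first = max(total - int(last_text), 0)
        last = total - 1
    else:
        first = int(first_text)
        # An open-ended or too-large end is clipped to the last byte.
        last = min(int(last_text), total - 1) if last_text else total - 1
    if first > last or first >= total:
        return 416, {"Content-Range": f"bytes */{total}"}, b""
    body = content[first:last + 1]  # both positions are inclusive
    headers = {
        "Content-Range": f"bytes {first}-{last}/{total}",
        "Content-Length": str(len(body)),
    }
    return 206, headers, body


status, headers, body = serve_range(b"0123456789", "bytes=5-100")
print(status, headers["Content-Range"], body)
```

The request asked for bytes 5-100 but the Content-Range in the reply covers only what exists, which is the "may be different than the range you requested" case.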

Resources