HTTP GET response - http

I wrote an HTTP client in C. I want to get just the data sent by the server, which normally comes after the empty line (\r\n\r\n). The problem is that when I GET an HTML page, I get a number right after the empty line, and the response ends with the line \n0.
I don't know what these two numbers mean.
When I GET an image file, I don't get these two numbers.
Can someone explain this to me?

Does the response have a "Transfer-Encoding: chunked" header?
If so, the response uses chunked encoding, and the numbers are probably the chunk-size and the last-chunk. The body is split into chunks; each chunk is preceded by a chunk-size line giving its size in hexadecimal, and the HTTP/1.1 specification requires the last-chunk to be "0\r\n".
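To make the chunk layout concrete, here is a minimal sketch of a decoder, assuming the whole body (everything after the blank line) is already in memory as bytes. The function name decode_chunked is made up for this illustration, and any trailer headers after the last chunk are simply ignored.

```python
def decode_chunked(body: bytes) -> bytes:
    """Decode a chunked HTTP/1.1 body; trailers after the last chunk are ignored."""
    decoded = bytearray()
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        # The size line is hexadecimal and may carry ";extension" parameters.
        size = int(body[pos:line_end].split(b";", 1)[0].strip(), 16)
        if size == 0:
            break  # last-chunk: "0\r\n"
        start = line_end + 2
        decoded += body[start:start + size]
        pos = start + size + 2  # skip the CRLF that ends the chunk data
    return bytes(decoded)


raw = b"5\r\nHello\r\n7\r\n, world\r\n0\r\n\r\n"
print(decode_chunked(raw))  # b'Hello, world'
```

The "number after the empty line" from the question corresponds to the first size line here, and the final "0" is the last-chunk.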

Related

Correct Way to Manually Parse HTTP Response

I am working in a language that has extremely low-level TCP support (if you must know, it's UnrealScript). The response received after making a POST request includes the entire HTTP header, status code, body, etc. as a string.
So, I need to parse the response to extract the body text manually. The HTTP 1.1 specification says:
Response = Status-Line
*(( general-header
| response-header
| entity-header ) CRLF)
CRLF
[message-body]
Am I correct in assuming that the best way to do this is to split the string along a double CRLF (carriage return/line feed) and return the second part of this split?
Or are there weird HTTP edge cases I should be aware of?
Am I correct in assuming that the best way to do this is to split the string along a double CRLF
Yes, but the body may be compressed with one of three different compression methods, even if you told the server you don't accept compressed responses.
Furthermore, the body may be split into chunks, with an indicator of the size of the next chunk in front of each one.
Do you really have no scope for using an off-the-shelf component for parsing? (I would recommend libcurl.)
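For illustration, here is a rough sketch of the split in Python rather than UnrealScript, assuming the full response is already in memory. The name split_response is invented for this example. It splits only at the first blank line, lower-cases header names, and keeps just the last value of a repeated header, which is a simplification.

```python
def split_response(raw: bytes):
    """Split a raw HTTP response into (status_code, headers, body)."""
    # partition() cuts at the FIRST blank line, so a body containing
    # "\r\n\r\n" is left intact.
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no blank line found; the header block is incomplete")
    lines = head.decode("iso-8859-1").split("\r\n")
    status_code = int(lines[0].split(" ", 2)[1])  # e.g. "HTTP/1.1 200 OK"
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive
    return status_code, headers, body


raw = b"HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nContent-Length: 5\r\n\r\nhello"
status, headers, body = split_response(raw)
print(status, headers.get("transfer-encoding"), headers.get("content-encoding"), body)
```

After the split, the Transfer-Encoding and Content-Encoding headers tell you whether the body still needs de-chunking or decompression before it is usable.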

How to configure `Content-Length' header in HTTP protocol

I'm not clear on how to compute the `Content-Length' header in HTTP.
Take this example:
HEADER
...
Content-Type: text/html
(blank line `\r\n')
<html></html>
(blank line `\r\n')
This is a working HTTP request sending an empty HTML page (correct me if there is any problem :-)). What should the content length be: 15, or 17 (taking the blank line between the headers and the entity into account)?
Thanks in advance. Best regards.
According to W3, Content-Length is defined as follows:
The Content-Length entity-header field indicates the size of the
entity-body, in decimal number of OCTETs, sent to the recipient or, in
the case of the HEAD method, the size of the entity-body that would
have been sent had the request been a GET.
As far as I understand it, you count everything after the blank line that ends the headers. So my answer to your question would be 15.
15 is the correct answer. That count includes the line break at the END of the entity data, which means that line break is part of the entity, not of the HTTP protocol. DO NOT count the blank line between the headers and the entity.
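The arithmetic can be checked with a short sketch: "<html></html>" is 13 octets, and the trailing CRLF adds 2. The status line below is an assumption added only to make the message complete; the rest follows the example in the question.

```python
entity = b"<html></html>\r\n"  # the trailing CRLF is part of the entity here
content_length = len(entity)  # counted in octets (bytes), not characters

response = (
    b"HTTP/1.1 200 OK\r\n"  # assumed status line, not in the question
    + b"Content-Type: text/html\r\n"
    + b"Content-Length: %d\r\n" % content_length
    + b"\r\n"  # blank line: separates headers from the entity, not counted
    + entity
)
print(content_length)  # 15
```

Computing the length from the encoded bytes, rather than from a character count, also keeps the header correct when the body contains non-ASCII text.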

How do I properly add a "Connection : close" header to an Http request?

I keep getting an error when I'm adding the "Connection : close" header to an Http request...
The error is:
com.mysql.jdbc.MysqlDataTruncation: Data truncation: Data too long for column 'content' at row 1
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3601)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3535)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1989)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2150)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2626)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2119)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2415)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2333)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2318)
The reason for this error is that the web server returns a very long response (larger than MAX_LONG) to the request to which I added the "Connection : close" header, so the database insert fails because the data exceeds the size allowed for that column (the content of the request). If I leave out this header, the responses are fine and of reasonable length.
Anybody got a clue?
tnx, Itamar
Maybe the server is HTTP 1.0 and doesn't handle the "Connection : close" header correctly? Or perhaps you forgot a CRLF after one of your headers, or the blank line that marks the end of the headers.
Anyway, the general structure of an HTTP request is:
initial line, followed by a CRLF (a new line: carriage return, line feed)
zero or more header lines, each followed by a CRLF
a blank line (i.e. a CRLF by itself)
[optional] message body (e.g. a file, query data, or query output).
quick http guide
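The request structure listed above can be sketched as a small builder. The function name build_request and the host example.com are placeholders for illustration. Note that it writes the header as "Connection: close" with no space before the colon; HTTP/1.1 does not allow whitespace between a header name and the colon, which may be worth checking in the failing request.

```python
def build_request(method, path, host, extra_headers=None, body=b""):
    """Assemble a raw HTTP/1.1 request: initial line, headers, blank line, body."""
    headers = {"Host": host, "Connection": "close"}
    headers.update(extra_headers or {})
    if body:
        headers["Content-Length"] = str(len(body))
    lines = [f"{method} {path} HTTP/1.1"]  # initial line
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Every line ends with CRLF; one extra CRLF forms the blank line.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


print(build_request("GET", "/", "example.com"))
# b'GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n'
```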

Error code redirect when returning a response with chunked encoding?

My web application uses chunked encoding. I'd like to have the behavior where if any chunk generates an error, I can properly set the error code and redirect to an error page using that. Right now it seems like that can only happen if the error occurs during the first chunk because the response headers must be sent in the first chunk. Is there any way to make this work or get the behavior that I want? Thanks.
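The HTTP spec allows you to send additional headers as a "trailer" after your last chunk, which should be treated just like headers at the top of the response (the status code itself is in the status line, which has already been sent, so a trailer cannot change it):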
https://www.rfc-editor.org/rfc/rfc2616#section-3.6.1
Here's an example:
http://www.jmarshall.com/easy/http/#http1.1c2
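As a rough sketch of what a trailer looks like on the wire, the following encodes a chunked body and appends trailer headers after the last chunk. The function name encode_chunked and the X-Processing-Status header are invented for this example; the response would also need to announce the trailer name in a "Trailer" header up front, and a client may ignore trailers, so this is not a substitute for an error status.

```python
def encode_chunked(chunks, trailers=None):
    """Encode byte chunks as a chunked body, with optional trailer headers."""
    out = bytearray()
    for chunk in chunks:
        if chunk:  # a zero-length chunk would end the body early
            out += b"%x\r\n" % len(chunk) + chunk + b"\r\n"
    out += b"0\r\n"  # last-chunk
    for name, value in (trailers or {}).items():
        out += f"{name}: {value}\r\n".encode("ascii")
    out += b"\r\n"  # blank line ends the trailer section
    return bytes(out)


# X-Processing-Status is a hypothetical header used only for illustration.
print(encode_chunked([b"Hello", b", world"], {"X-Processing-Status": "failed"}))
# b'5\r\nHello\r\n7\r\n, world\r\n0\r\nX-Processing-Status: failed\r\n\r\n'
```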

How can I find out whether a server supports the Range header?

I have been trying to stream audio from a particular point by using the Range header values but I always get the song right from the beginning. I am doing this through a program so am not sure whether the problem lies in my code or on the server.
How can I find out whether the server supports the Range header param?
Thanks.
The way the HTTP spec defines it, if the server knows how to support the Range header, it will. That in turn, requires it to return a 206 Partial Content response code with a Content-Range header, when it returns content to you. Otherwise, it will simply ignore the Range header in your request, and return a 200 response code.
This might seem silly, but are you sure you're crafting a valid HTTP request header? All too commonly, I forget to specify HTTP/1.1 in the request, or forget to specify the Range specifier, such as "bytes".
Oh, and if all you want to do is check, then just send a HEAD request instead of a GET request. Same headers, same everything, just "HEAD" instead of "GET". If you receive a 206 response, you'll know Range is supported, and otherwise you'll get a 200 response.
This is for others searching how to do this. You can use curl:
curl -I http://exampleserver.com/example_video.mp4
In the header you should see
Accept-Ranges: bytes
You can go further and test retrieving a range
curl --header "Range: bytes=100-107" -I http://exampleserver.com/example_video.mp4
and in the headers you should see
HTTP/1.1 206 Partial Content
and
Content-Range: bytes 100-107/10000000
Content-Length: 8
[instead of 10000000 you'll see the length of the file]
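To check those numbers in a program, a small sketch like the following parses the Content-Range value and confirms that the range is inclusive (100 to 107 is 8 bytes). The function name parse_content_range is made up for this example, and it only handles the "bytes first-last/total" form, not the "bytes */total" form sent with a 416 response.

```python
import re


def parse_content_range(value):
    """Parse 'bytes first-last/total' into (first, last, total); total is None for '*'."""
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if match is None:
        raise ValueError(f"unsupported Content-Range: {value!r}")
    first, last = int(match.group(1)), int(match.group(2))
    total = None if match.group(3) == "*" else int(match.group(3))
    return first, last, total


first, last, total = parse_content_range("bytes 100-107/10000000")
print(last - first + 1)  # 8, matching Content-Length
```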
Although I am a bit late in answering this question, I think my answer will help future visitors. Here is a python method that detects whether a server supports range queries or not.
    def accepts_byte_ranges(self, effective_url):
        """Test if the server supports multi-part file download. Method expects effective (absolute) url."""
        import re
        from io import BytesIO

        import pycurl

        c = pycurl.Curl()
        header = BytesIO()
        # Get the HTTP headers only (NOBODY makes this a HEAD request)
        c.setopt(c.URL, effective_url)
        c.setopt(c.NOBODY, 1)
        c.setopt(c.HEADERFUNCTION, header.write)
        c.perform()
        c.close()
        header_text = header.getvalue().decode('iso-8859-1')
        header.close()
        print(header_text)  # Show the raw headers for debugging

        # Check if the server accepts byte ranges (header names are case-insensitive)
        if re.search(r'Accept-Ranges:\s*bytes', header_text, re.IGNORECASE):
            return True
        # If the server explicitly specifies "Accept-Ranges: none" in the header, we do not attempt a partial download.
        if re.search(r'Accept-Ranges:\s*none', header_text, re.IGNORECASE):
            return False
        # There is still hope, try a simple byte range query
        c = pycurl.Curl()
        c.setopt(c.RANGE, '0-0')  # First byte
        c.setopt(c.URL, effective_url)
        c.setopt(c.NOBODY, 1)
        c.perform()
        http_code = c.getinfo(c.HTTP_CODE)
        c.close()
        # HTTP status code 206 means byte ranges are accepted
        return http_code == 206
One way is just to try, and check the response. In your case, it appears the server doesn't support ranges.
Alternatively, do a GET or HEAD on the URI, and check for the Accept-Ranges response header.
You can use the GET method with a 0-0 Range request header and check whether the response code is 206; if so, the response contains only the requested bytes of the body (here, the first byte).
You can also use the HEAD method to do the same thing as in the first step; you get the same response headers and status code, without the response body.
Furthermore, you can check the Accept-Ranges response header to judge whether the server supports ranges. Note that a value of none in the Accept-Ranges field means ranges are not supported, but if the response has no Accept-Ranges field at all, you cannot conclude from that alone that ranges are unsupported.
One more thing to know: if you use a 0- Range request header with the GET method to check the response code, the response body will be buffered automatically in the TCP receive window until that buffer is full.
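The three possible outcomes described above (supported, explicitly unsupported, inconclusive) can be sketched as a small decision function. It works on a status code and a header dictionary you have already obtained from a response; the name range_support is invented for this example, and treating anything other than "bytes" or "none" as unknown is a simplifying assumption.

```python
def range_support(status_code, headers):
    """Return True, False, or None (unknown) from one response's status and headers."""
    # Header names are case-insensitive, so normalise them first.
    normalised = {name.lower(): value.strip().lower() for name, value in headers.items()}
    if status_code == 206:
        return True  # the server honoured a Range request
    accept_ranges = normalised.get("accept-ranges")
    if accept_ranges == "none":
        return False  # explicit refusal
    if accept_ranges == "bytes":
        return True
    return None  # no Accept-Ranges field: inconclusive


print(range_support(200, {"Accept-Ranges": "bytes"}))  # True
print(range_support(200, {}))  # None
```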
