why http gzip content decompress with zlib failed? (C++) - http

The data comes from here, I take tcp socket to get it.
the response like:
HTTP/1.1 200 OK
Server: nginx/0.7.67
Date: Tue, 06 Aug 2013 08:25:48 GMT
Content-Type: text/html; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
P3P: CP="IDC DSP COR ADM DEVi TAIi PSA PSD IVAi IVDi CONi HIS OUR IND CNT"
Content-Encoding: gzip
2e2
?
And then I decompress used the zlib function "uncompress", but get a Z_DATA_ERROR returned code. It looks like the data started position "2e2" was not a validate gzip stream data?

The transfer encoding is chunked. Each chunk of data is preceded by the chunk size specified in hexadecimal followed by a line terminator. Then, that many bytes should be read in for the content. The chunk data is followed by another line terminator. The next chunk has the same format (size followed by data) until a 0 sized chunk is sent.
You need to decode each chunk and append it to the decompress buffer. Leaving the chunk size in your data stream would not be treated as valid input by zlib.

Related

HTTP Response: Google misplaced the content-length

When receiving a raw response from http://www.google.com/ the Content-Length header is missing. Instead the number of bytes to receive is placed after the end of header code \r\n\r\n but before the actual content.
I looked at the raw response and the 8000 contains \r\n for an end of line.
Partial Google Response
Header
HTTP/1.1 200 OK
Date: Tue, 28 Oct 2014 18:38:37 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Set-Cookie: ...
Set-Cookie: ...
P3P: ...
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Alternate-Protocol: 80:quic,p=0.01
Transfer-Encoding: chunked
End of Header (signified by '\r\n\r\n')
8000 # has '\r\n', I am assuming this is the content-length?
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content...`
End Response
So my question is why is Google so damn special, that they can screw up my re-invention of the HTTP wheel. And if I should account for this happening in all my responses or just from Google.
You should account for this possibility from all responses, as it is quite common. Nothing about this is "special" and this is indeed completely legitimate behavior.
What you're assuming is the content-length is actually the chunk size, as per the Chunked Transfer Encoding that the server is responding with.
The response is complete when a chunk size of 0 is encountered, thus the effective content-length of a chunked response is equal to the sum of chunk sizes.
This has been part of the HTTP spec since 1999.

HTTP Content length less than File byte-size, did it fully download?

Trying to determine if a user actually downloaded an executable file from a website. I examined the pcap and I see that the Content-Length field = 784,536 but the Server->User is 430,380 bytes. This tells me that the user did not fully download the file. I also downloaded the file myself and see that it is 766 KB. Is it possible that the content-length value based on the HTTP header will not be EQUAL TO the file size of that EXE file if it is downloaded (the local file size)? Is this correct?
Packet Capture Data (I can't post screenshots)
GET /ChromasLite211Setup.exe HTTP/1.1
Host: www.technelysium.com.au
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:17.0) Firefox/17.0
Accept: text/html, application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us
Accept-Enconding: gzip, deflate
Connection: keep-alive
Referrer: http://technelysium.com.au/
HTTP/1.1 200 OK
Date: Thu, 01 Aug 2013 17:28:17 GMT
Server: Apache
Last-Modified: Mon, 15 Apr 2013 08:29:57 GMT
Accept-Ranges: bytes
Content-Length: 784536
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: application/x-msdownload
MZP........................#.............................!..L..This program must be run under Win32
Entire Conversation (430722 bytes)
Users IP -> Server IP (342 bytes)
Server IP -> Users IP (430380)
When I download the file from the site it shows as, "Binary FIle (766 KB)"
Converting Bytes to Kilobytes
784,536/1024 = 766.14
No. The user did not download all the bytes.
If a server sends a Content-Length header, that is exactly how many bytes of content it intends to send as the HTTP Response Body. If less than that number was sent, then something happened (Client terminated connection, Client timed out, etc.)

How to unzip HTTP response body by ngrep?

I'm trying to capture HTTP messages between my laptop and github.com with ngrep, but some responses are not human readable because they are sent in chunked encoding and zipped, like:
T 207.97.227.239:80 -> 192.168.0.175:41372 [AP]
HTTP/1.1 404 Not Found.
Server: GitHub.com.
Date: Sun, 31 Mar 2013 09:50:25 GMT.
Content-Type: text/plain.
Transfer-Encoding: chunked.
Connection: keep-alive.
Content-Encoding: gzip.
.
25.
..........
J-./.,./.T../QH./.K........
How can I unzip the response? Or is there better tool than ngrep to capture HTTP messages?
ngrep is implementation of grep for network (incoming/outgoing messages). It is not suited to capture HTTP messages. Here are two tools you can use for capturing message and viewing compressed content.
Fiddler (windows) FAQ for viewing compressed data
Wireshark (all OS) FAQ for viewing compressed data

Debugging Multiple Disposition Headers

I am developing a web application using Java and keep getting this error in Chrome on some particular pages:
net::ERR_RESPONSE_HEADERS_MULTIPLE_CONTENT_DISPOSITION
So, I checked WireShark for the corresponding TCP stream and this was the header of the response:
HTTP/1.0 200 OK
Date: Mon, 10 Sep 2012 08:48:49 GMT
Server: Apache-Coyote/1.1
Content-Disposition: attachment; filename=KBM 80 U (50/60Hz,220/230V)_72703400230.pdf
Content-Type: application/pdf
Content-Length: 564449
X-Cache: MISS from my-company-proxy.local
X-Cache-Lookup: MISS from my-company-proxy.local:8080
Via: 1.0 host-of-application.com, 1.1 my-company-proxy.local:8080 (squid/2.7.STABLE5)
Connection: keep-alive
Proxy-Connection: keep-alive
%PDF-1.4
[PDF data ...]
I only see one content disposition header in there. Why does chrome tell me there were several?
Because the filename parameter is unquoted, and contains a comma character (which is not allowed in unquoted values, and in this case indicates that multiple header values have been folded into a single one).
See http://greenbytes.de/tech/webdav/rfc2616.html#rfc.section.4.2.p.5 and http://greenbytes.de/tech/webdav/rfc6266.html

Uploading a file to Google Docs Api getting error 504

I'm working on a delphi api for Google docs and having a hard time getting the upload to work. I'm following Google's development guide here and from what I understand it looks like the process should go like this:
Make a POST request to this url: https://docs.google.com/feeds/upload/create-session/default/private/full/?access_token=my_access_token&v=3&convert=false with these headers: X-Upload-Content-Type and X-Upload-Content-Length
Get a 200 OK response with the next upload location stored in the Location header
Make a PUT request to the Location header with the header Content-Type set to whatever I had X-Upload-Content-Type set to in step 1 and the header Content-Range set to something like this: bytes 0-524287/2097152 and the first 512kb of data in the body
Get a 308 Resume Incomplete Response that has the next upload location in the Location header
Go back to 3 until all bytes are uploaded, at which point I will receive a 201 Created response that will have the xml data describing the file I uploaded
Everything up to and including step 3 works fine. It is at step 4 that things start to go wrong.
The one thing that confuses me the most is that the response on step 4 doesn't contain a Location header. I figured that meant I should just send the next request to the same url, but that causes me to get a 504 error. I tried the entire process with fiddler just to see if it was the delphi code, a lack of understanding on my part, or something that google is doing.
Here's the requests and responses I sent and received using fiddler:
POST https://docs.google.com/feeds/upload/create-session/default/private/full/?access_token=my_access_token&v=3&convert=false HTTP/1.1
Content-Type: application/x-www-form-urlencoded
X-Upload-Content-Type: application/octet-stream
X-Upload-Content-Length: 2097152
Content-Length: 0
Host: docs.google.com
HTTP/1.1 200 OK
Server: HTTP Upload Server Built on May 16 2012 12:03:24 (1337195004)
Location: https://docs.google.com/feeds/upload/create-session/default/private/full/?access_token=my_access_token&v=3&convert=false&upload_id=AEnB2Ur9-9VxMSI6kaFzbybY2qiyzK6kVoKzcZ6Yo02H8Ni4FlQFl_N06DdjZXzp3vSjOPH3CEb_4vDlKZp7VlC0hxpkypzlKg
Date: Tue, 22 May 2012 16:53:27 GMT
Pragma: no-cache
Expires: Fri, 01 Jan 1990 00:00:00 GMT
Cache-Control: no-cache, no-store, must-revalidate
Content-Length: 0
Content-Type: text/html
PUT https://docs.google.com/feeds/upload/create-session/default/private/full/?access_token=my_access_token&v=3&convert=false&upload_id=AEnB2Ur9-9VxMSI6kaFzbybY2qiyzK6kVoKzcZ6Yo02H8Ni4FlQFl_N06DdjZXzp3vSjOPH3CEb_4vDlKZp7VlC0hxpkypzlKg HTTP/1.1
Content-Type: application/octet-stream
Content-Length: 524288
Content-Range: bytes 0-524287/2097152
Host: docs.google.com
[first 512kb of data here]
HTTP/1.1 308 Resume Incomplete
Server: HTTP Upload Server Built on May 16 2012 12:03:24 (1337195004)
Range: bytes=0-524287
X-Range-MD5: bd9d4ee7afa24b7da0e685f05b5f1f44
Date: Tue, 22 May 2012 16:54:29 GMT
Pragma: no-cache
Expires: Fri, 01 Jan 1990 00:00:00 GMT
Cache-Control: no-cache, no-store, must-revalidate
Content-Length: 0
Content-Type: text/html
PUT https://docs.google.com/feeds/upload/create-session/default/private/full/?access_token=my_access_token&v=3&convert=false&upload_id=AEnB2Ur9-9VxMSI6kaFzbybY2qiyzK6kVoKzcZ6Yo02H8Ni4FlQFl_N06DdjZXzp3vSjOPH3CEb_4vDlKZp7VlC0hxpkypzlKg HTTP/1.1
Content-Type: application/octet-stream
Content-Length: 524288
Content-Range: bytes 524288-1048575/2097152
Host: docs.google.com
[next 512kb of data]
HTTP/1.1 504 Fiddler - Send Failure
Content-Type: text/html; charset=UTF-8
Connection: close
Timestamp: 10:54:14.056
The only thing I was able to do was to be able to say for a fact that it is not just the delphi code that is wrong, and since I don't think it's google, I'm going to have to go with I don't understand something that should be happening. What am I missing?
Edit
I was able to get the upload working, I'm not entirely sure what I did differently, but the documentation is a little misleading. At least it is to me. When you send a PUT request, you don't get a new location, you just continue to upload to the same one. Also, when you finish the upload, the 201 response doesn't contain the actual XML data, instead, it has a Location header that points to where you can grab the XML data from. Not a huge deal but a little confusing.
It seems like the 504 error is returned by Fiddler, these two links should help:
https://urda.com/blog/2010/09/28/iis-services-504s-and-fiddler/
https://urda.com/blog/2010/09/30/follow-up-iis-services-504s-and-fiddler/

Resources