How do I prevent GAE from ungzipping a gzipped XML feed?

I have a script on GAE that requests an XML feed from a partner that's typically 40MB but only 5MB gzipped. GAE is automatically unzipping this content and throwing an error that the response is too big:
HTTP response was too large: 46677241. The limit is: 33554432.
The script is set up to decompress the response itself. How do I stop GAE from decompressing it first and failing?
Here's the response header from my partner:
HTTP/1.0 200 OK
Expires: Wed, 27 Jun 2012 05:42:07 GMT
Cache-Control: max-age=10368000
Content-Type: application/x-gzip
Accept-Ranges: bytes
Last-Modified: Wed, 22 Feb 2012 11:06:09 GMT
Content-Length: 5263323
Date: Tue, 28 Feb 2012 05:42:07 GMT
Server: lighttpd
X-Cache: MISS from static01
X-Cache-Lookup: MISS from static01:80
Via: 1.0 static01:80 (squid)

Most likely your partner's server is responding with plain XML because it assumes that the HTTP client sending the request (i.e. the GAE URL Fetch service) does not support gzip. Hence the "response was too large" error.
To announce that you actually want to receive gzipped content, set the Accept-Encoding: gzip header when using the URL Fetch service.
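A minimal sketch of that idea, written against Python's standard library rather than the GAE URL Fetch API (on GAE the same header would go into the headers argument of the fetch call). The URL and the helper names build_feed_request and decompress_feed are hypothetical, chosen only for illustration.

```python
import gzip
import urllib.request

FEED_URL = "http://partner.example.com/feed.xml.gz"  # hypothetical URL


def build_feed_request(url: str) -> urllib.request.Request:
    """Build a request that explicitly asks for gzip-encoded content."""
    return urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})


def decompress_feed(body: bytes) -> bytes:
    """Decompress a gzipped body; pass through data that is not gzipped."""
    if body[:2] == b"\x1f\x8b":  # gzip magic number
        return gzip.decompress(body)
    return body


if __name__ == "__main__":
    # Round-trip demo on in-memory data, so no network access is needed.
    xml = b"<feed><item>example</item></feed>"
    assert decompress_feed(gzip.compress(xml)) == xml
    print(build_feed_request(FEED_URL).header_items())
```

Checking the magic number before decompressing keeps the script working whether or not the server honors the header.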

Related

How can I make Firefox cache a large image?

I am having trouble convincing Firefox 71 to cache a large (>4MB) image. Both the developer tools (in the network log) and normal use (judging by the loading delay) show that the image is loaded every time the page is accessed.
Although I thought I provided all the necessary response headers, Firefox is not sending If-Modified-Since or If-None-Match request headers.
These are the HTTP headers my server is sending:
$ HEAD https://😉/image.png
200 OK
Cache-Control: public, max-age=31536000, immutable
Connection: close
Date: Sat, 04 Jan 2020 19:52:20 GMT
Accept-Ranges: bytes
ETag: "564cd5fb-4484b0"
Server: nginx/1.14.0 (Ubuntu)
Content-Length: 4490416
Content-Type: image/png
Last-Modified: Wed, 18 Nov 2015 19:48:11 GMT
Client-Date: Sat, 04 Jan 2020 19:52:20 GMT
Client-Peer: 😛
Client-Response-Num: 1
Client-SSL-Cert-Issuer: /C=US/O=Let's Encrypt/CN=Let's Encrypt Authority X3
Client-SSL-Cert-Subject: /CN=😉
Client-SSL-Cipher: ECDHE-RSA-CHACHA20-POLY1305
Client-SSL-Socket-Class: IO::Socket::SSL
The web page loads the image via JavaScript:
let mapImg = new Image();
mapImg.src = 'image.png';
I believe I did everything according to documentation and wonder if I made some wrong combination of response headers, encryption, compression, and loading method?

Is the HTTP version (1.0 or 1.1) defined by the web server? How does the HTTP protocol definition work?

I have a quick question. I have read RFC 2616 Section 14.23 about the Host HTTP header, but I still do not understand what should be changed in httpd.conf or the web server's configuration file. Please correct me if I'm wrong.
Look at the following two HTTP GET requests I made to an Apache server. The first is a GET using HTTP 1.0, the second a GET using HTTP 1.1. Here is the output:
HTTP/1.0 200 OK
Date: Thu, 24 Oct 2013 03:46:22 GMT
Server: Apache/1.3.41 (Unix) mod_gzip/1.3.26.1a PHP/5.2.9 mod_throttle/3.1.2 mod_psoft_traffic/0.2 mod_ssl/2.8.31 OpenSSL/0.9.8b
Vary: *
Last-Modified: Fri, 10 Aug 2012 20:22:30 GMT
ETag: "17c815b-3b-50256d86"
Accept-Ranges: bytes
Content-Length: 59
Connection: close
Content-Type: text/html
<html>
<body>
<center>webli7</center>
</body>
</html>
HTTP/1.1 400 Bad Request
Date: Thu, 24 Oct 2013 04:04:40 GMT
Server: Apache/1.3.41 (Unix) mod_gzip/1.3.26.1a PHP/5.2.9 mod_throttle/3.1.2 mod_psoft_traffic/0.2 mod_ssl/2.8.31 OpenSSL/0.9.8b
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=iso-8859-1
16e
The HTTP protocol version is decided dynamically, not through configuration files. The client sends a request specifying the highest protocol version that it supports. The server must then respond with either the version requested by the client or any earlier version that it prefers.
Since Apache does support HTTP/1.1, it should therefore match exactly the version provided by the client.
There is a flag that you can set in Apache's config to force Apache to use HTTP/1.0 in certain situations, even though the browser requested HTTP/1.1. It is used to work around bugs in the HTTP/1.1 handling of some very old browsers. Today, you should not need to touch this flag.
As for your error, I would suggest making sure that your GET provides the Host: header. This header is required in HTTP/1.1 but optional in HTTP/1.0, and leaving it out would certainly result in a 400 error.
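As a rough illustration of the difference, the sketch below builds the raw request text for both versions. The function name build_get_request and the host www.example.com are hypothetical; the only point is that the HTTP/1.1 variant carries a Host: line.

```python
def build_get_request(host: str, path: str = "/", version: str = "1.1") -> bytes:
    """Return the raw bytes of a minimal GET request.

    A Host header is added for HTTP/1.1, where it is required;
    it is left out for HTTP/1.0, where it is optional.
    """
    lines = [f"GET {path} HTTP/{version}"]
    if version == "1.1":
        lines.append(f"Host: {host}")
    lines.append("Connection: close")
    # A blank line terminates the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


if __name__ == "__main__":
    print(build_get_request("www.example.com", version="1.0").decode())
    print(build_get_request("www.example.com", version="1.1").decode())
```

Sending the HTTP/1.1 form without the Host: line is the case that would be expected to produce the 400 response shown in the question.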

IIS not returning HTTP/304 on conditional request made with If-None-Match

I have made a request for a video, and the response includes an ETag.
When I request the same video again, I can see the browser send the If-None-Match header with the ETag, but instead of a 304 the video is downloaded again with a 200 OK response.
In fiddler for the very first request for the video, the response is:
HTTP/1.1 200 OK
Cache-Control: max-age=10
Content-Length: 76278442
Content-Type: video/mp4
Last-Modified: Wed, 21 Aug 2013 08:47:29 GMT
ETag: "2117329216"
Server: Microsoft-IIS/7.5
X-Mod-H264-Streaming: version=2.2.7
X-Powered-By: ASP.NET
Date: Fri, 23 Aug 2013 21:20:34 GMT
On the second request, the GET headers are:
GET http://test/video.mp4 HTTP/1.1
Accept: */*
Accept-Language: en-GB
x-flash-version: 11,8,800,94
Accept-Encoding: gzip, deflate
If-Modified-Since: Wed, 21 Aug 2013 08:47:29 GMT
If-None-Match: "2117329216"
Connection: Keep-Alive
But in this case, the whole video is downloaded again rather than a 304 Not Modified response being returned.
I noticed that X-Mod-H264-Streaming was used; I am not sure whether this has something to do with it.
Edit
I opened the video URL directly in IE 10 (not through the Flex application we were using before) and I get the same behavior: the first request returns the complete video, and after pressing F5 the whole video is returned again rather than a 304 response.
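For reference, the behavior the question expects from the server can be sketched as follows. This is only an illustration of the If-None-Match comparison, not a description of how IIS or the H264 streaming module is implemented; the function names are hypothetical, and splitting the header on commas is a simplification.

```python
def etag_matches(if_none_match: str, current_etag: str) -> bool:
    """Weak comparison of the If-None-Match value against the current ETag."""

    def opaque(tag: str) -> str:
        # Drop the weak-validator prefix so W/"x" and "x" compare equal.
        tag = tag.strip()
        return tag[2:] if tag.startswith("W/") else tag

    if if_none_match.strip() == "*":
        return True
    candidates = [opaque(t) for t in if_none_match.split(",")]
    return opaque(current_etag) in candidates


def status_for(request_headers: dict, current_etag: str) -> int:
    """Return 304 when the client's cached copy is still current, else 200."""
    inm = request_headers.get("If-None-Match")
    if inm is not None and etag_matches(inm, current_etag):
        return 304
    return 200


if __name__ == "__main__":
    print(status_for({"If-None-Match": '"2117329216"'}, '"2117329216"'))
```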

Why is my browser doing a request if I've configured the expires and cache control headers?

This is an example response from my amazon bucket.
$ curl -I http://amazon_bucket/image.jpg
HTTP/1.1 200 OK
x-amz-id-2: Tmr9SynKe8ztlB/Jix1hNrclwyc/k4NVHyqK3B0vNKUoPFIxfzwALi0XQRwEjhzO
x-amz-request-id: DCFDBCF510988AFB
Date: Wed, 27 Mar 2013 13:06:34 GMT
Cache-Control: public, max-age=2629000
Expires: Wed, 26 Mar 2014 23:00:00 GMT
Last-Modified: Wed, 27 Mar 2013 13:00:19 GMT
ETag: "52dd53ea738c7824b3f67cfea6a3af2a"
Accept-Ranges: bytes
Content-Type: image/jpeg
Content-Length: 627046
Server: AmazonS3
I would expect the browser to cache the image and serve it from cache. Instead, when I reload the page, my browser makes a request, which yields a 304 Not Modified response. Why is it acting as if the must-revalidate option had been passed? Why isn't the browser serving the image directly from cache? The options I've configured on the image, from my S3 client, are these:
Cache-Control: public, max-age=2629000
Expires: Wed, 26 Mar 2014 23:00:00 GMT
Is there some other option I should be passing to the S3 files? It might be a dumb question, but I see that the requests my browser makes to get these pictures all have the following headers:
Cache-Control:no-cache
Pragma:no-cache
Why is my browser sending those?
I was hitting refresh, and apparently, this always triggers an If-Modified-Since request. If you visit the page normally, the asset is served from browser cache.
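The freshness check a cache applies on a normal visit can be sketched roughly like this. It is a simplification under stated assumptions: it uses only the Date header and max-age, ignores the Age header and clock skew, and the name is_fresh is hypothetical. A reload skips this check and revalidates anyway, which matches the behavior described above.

```python
from datetime import datetime, timedelta
from email.utils import parsedate_to_datetime


def is_fresh(date_header: str, max_age: int, now: datetime) -> bool:
    """Return True while the stored response is younger than max-age seconds."""
    stored_at = parsedate_to_datetime(date_header)  # timezone-aware for "GMT"
    age_seconds = (now - stored_at).total_seconds()
    return age_seconds < max_age


if __name__ == "__main__":
    date_header = "Wed, 27 Mar 2013 13:06:34 GMT"  # Date from the S3 response
    stored_at = parsedate_to_datetime(date_header)
    # One day later the copy is still within max-age=2629000.
    print(is_fresh(date_header, 2629000, stored_at + timedelta(days=1)))
    # Thirty-one days later (2678400 seconds) it is no longer fresh.
    print(is_fresh(date_header, 2629000, stored_at + timedelta(days=31)))
```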

Uploading a file to Google Docs Api getting error 504

I'm working on a Delphi API for Google Docs and having a hard time getting the upload to work. I'm following Google's development guide here, and from what I understand the process should go like this:
1. Make a POST request to this url: https://docs.google.com/feeds/upload/create-session/default/private/full/?access_token=my_access_token&v=3&convert=false with these headers: X-Upload-Content-Type and X-Upload-Content-Length
2. Get a 200 OK response with the next upload location stored in the Location header
3. Make a PUT request to the Location header with the header Content-Type set to whatever I had X-Upload-Content-Type set to in step 1 and the header Content-Range set to something like this: bytes 0-524287/2097152 and the first 512kb of data in the body
4. Get a 308 Resume Incomplete response that has the next upload location in the Location header
5. Go back to 3 until all bytes are uploaded, at which point I will receive a 201 Created response that will have the xml data describing the file I uploaded
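The Content-Range values used in step 3 can be computed as in the sketch below. It is written in Python for brevity rather than Delphi, and the function name content_ranges is hypothetical; it only illustrates the byte arithmetic, not the HTTP calls.

```python
def content_ranges(total_size: int, chunk_size: int):
    """Yield one Content-Range header value per chunk of the upload."""
    for start in range(0, total_size, chunk_size):
        # The end offset is inclusive, and the last chunk may be shorter.
        end = min(start + chunk_size, total_size) - 1
        yield f"bytes {start}-{end}/{total_size}"


if __name__ == "__main__":
    # 2 MB file uploaded in 512 KB chunks, as in the question.
    for value in content_ranges(2097152, 524288):
        print("Content-Range:", value)
```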
Everything up to and including step 3 works fine. It is at step 4 that things start to go wrong.
The one thing that confuses me the most is that the response in step 4 doesn't contain a Location header. I figured that meant I should just send the next request to the same URL, but that causes me to get a 504 error. I tried the entire process with Fiddler just to see whether the problem was the Delphi code, a lack of understanding on my part, or something that Google is doing.
Here's the requests and responses I sent and received using fiddler:
POST https://docs.google.com/feeds/upload/create-session/default/private/full/?access_token=my_access_token&v=3&convert=false HTTP/1.1
Content-Type: application/x-www-form-urlencoded
X-Upload-Content-Type: application/octet-stream
X-Upload-Content-Length: 2097152
Content-Length: 0
Host: docs.google.com
HTTP/1.1 200 OK
Server: HTTP Upload Server Built on May 16 2012 12:03:24 (1337195004)
Location: https://docs.google.com/feeds/upload/create-session/default/private/full/?access_token=my_access_token&v=3&convert=false&upload_id=AEnB2Ur9-9VxMSI6kaFzbybY2qiyzK6kVoKzcZ6Yo02H8Ni4FlQFl_N06DdjZXzp3vSjOPH3CEb_4vDlKZp7VlC0hxpkypzlKg
Date: Tue, 22 May 2012 16:53:27 GMT
Pragma: no-cache
Expires: Fri, 01 Jan 1990 00:00:00 GMT
Cache-Control: no-cache, no-store, must-revalidate
Content-Length: 0
Content-Type: text/html
PUT https://docs.google.com/feeds/upload/create-session/default/private/full/?access_token=my_access_token&v=3&convert=false&upload_id=AEnB2Ur9-9VxMSI6kaFzbybY2qiyzK6kVoKzcZ6Yo02H8Ni4FlQFl_N06DdjZXzp3vSjOPH3CEb_4vDlKZp7VlC0hxpkypzlKg HTTP/1.1
Content-Type: application/octet-stream
Content-Length: 524288
Content-Range: bytes 0-524287/2097152
Host: docs.google.com
[first 512kb of data here]
HTTP/1.1 308 Resume Incomplete
Server: HTTP Upload Server Built on May 16 2012 12:03:24 (1337195004)
Range: bytes=0-524287
X-Range-MD5: bd9d4ee7afa24b7da0e685f05b5f1f44
Date: Tue, 22 May 2012 16:54:29 GMT
Pragma: no-cache
Expires: Fri, 01 Jan 1990 00:00:00 GMT
Cache-Control: no-cache, no-store, must-revalidate
Content-Length: 0
Content-Type: text/html
PUT https://docs.google.com/feeds/upload/create-session/default/private/full/?access_token=my_access_token&v=3&convert=false&upload_id=AEnB2Ur9-9VxMSI6kaFzbybY2qiyzK6kVoKzcZ6Yo02H8Ni4FlQFl_N06DdjZXzp3vSjOPH3CEb_4vDlKZp7VlC0hxpkypzlKg HTTP/1.1
Content-Type: application/octet-stream
Content-Length: 524288
Content-Range: bytes 524288-1048575/2097152
Host: docs.google.com
[next 512kb of data]
HTTP/1.1 504 Fiddler - Send Failure
Content-Type: text/html; charset=UTF-8
Connection: close
Timestamp: 10:54:14.056
The only thing I could establish is that the Delphi code is not the sole problem, and since I don't think Google is at fault, I must be misunderstanding something about how the process should work. What am I missing?
Edit
I was able to get the upload working. I'm not entirely sure what I did differently, but the documentation is a little misleading, at least to me. When you send a PUT request, you don't get a new location; you just continue to upload to the same one. Also, when you finish the upload, the 201 response doesn't contain the actual XML data. Instead, it has a Location header that points to where you can fetch the XML data. Not a huge deal, but a little confusing.
It seems the 504 error is returned by Fiddler itself. These two links should help:
https://urda.com/blog/2010/09/28/iis-services-504s-and-fiddler/
https://urda.com/blog/2010/09/30/follow-up-iis-services-504s-and-fiddler/

Resources