Cross-domain chunked uploads using CORS - nginx

I have user-submitted files that I'm trying to upload in 10 MB chunks. I'm currently using raw XMLHttpRequest (and XDomainRequest) to push each individual slice (File.prototype.slice) on the front end. The back end is Nginx using the upload module.
Just for reference, here's the synopsis of how I'm using slice:
element.files[0].slice(...)
I understand the cross-browser prefixed methods webkitSlice and mozSlice and all that.
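For reference, the per-chunk upload looks roughly like the sketch below; the chunk size, endpoint path, and custom header names mirror the request headers shown further down, but the exact values are only illustrative, not my literal code.
var CHUNK_SIZE = 10 * 1024 * 1024; // 10 MB per slice
var file = element.files[0];

function sendChunk(start) {
  var end = Math.min(start + CHUNK_SIZE, file.size);
  var blob = file.slice(start, end); // webkitSlice/mozSlice on prefixed browsers
  var xhr = new XMLHttpRequest();
  xhr.open('POST', 'https://upload.server.local:8443/path/to/asset', true);
  // Custom headers like these are what trigger the OPTIONS preflight
  xhr.setRequestHeader('Content-Type', 'application/octet-stream');
  xhr.setRequestHeader('X-Content-Range', 'bytes ' + start + '-' + (end - 1) + '/' + file.size);
  xhr.onload = function () {
    // A real implementation would check xhr.status before continuing
    if (end < file.size) sendChunk(end);
  };
  xhr.send(blob);
}

sendChunk(0);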
The problem I have is with actually making the cross-domain request. I'm uploading from server.local to upload.server.local. In Firefox, the options request goes through fine and then the actual post fails. In Chrome and Opera, the options request fails with
OPTIONS https://URL Resource failed to load
Here are the headers from Firefox:
Request Headers
OPTIONS /path/to/asset HTTP/1.1
Host: upload.server.local:8443
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Origin: https://server.local:8443
Access-Control-Request-Method: POST
Access-Control-Request-Headers: content-disposition,content-type,x-content-range,x-session-id
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache
Response Headers
HTTP/1.1 204 No Content
Server: nginx/1.2.6
Date: Wed, 13 Feb 2013 03:27:44 GMT
Connection: keep-alive
access-control-allow-origin: https://server.local:8443
Access-Control-Allow-Methods: POST, OPTIONS
Access-Control-Allow-Headers: x-content-range, origin, content-disposition, x-session-id, content-type, cache-control, pragma, referrer, host
access-control-allow-credentials: true
Access-Control-Max-Age: 10000
The actual POST request never leaves the browser; the Nginx access logs never see it. The browser halts it for some reason. How do I figure out why this POST is being blocked?
Chromium 24
Firefox 18
Opera 12.14
I've verified all browsers support CORS properly here.
By pointing my uploads to https://cors-test.appspot.com/test, I have confirmed that the problem is definitely with the server-side headers.

The POST won't leave the browser if the preflight check does not return sufficient permissions and thus the POST request is not fully authorized. The request/response included in the question does look sufficient to me.
Are you sure you are setting withCredentials = true in your XMLHttpRequest?
Are you sure that you have valid (not self-signed) SSL certificates on your servers? The HTTPS might fail the CORS check even if you have added an exception for browsing the site with an invalid certificate.
Have you tried emptying your cache? You have Access-Control-Max-Age: 10000 set in your response headers, which is close to 3 hours. I know you've been working on this longer than that, but especially while testing, set that header to zero instead so you don't go crazy with the browser caching old access permissions.
In general I'd start by going as permissive as possible with the CORS headers and slowly ratcheting up the security to see where it fails. However, this is not completely straightforward. For example, according to the MDN documentation on CORS,
When responding to a credentialed request, server must specify a domain, and cannot use wild carding. The above example would fail if the header was wildcarded as: Access-Control-Allow-Origin: *
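As a starting point, a deliberately permissive nginx sketch for the upload location might look something like the following. The directive values are illustrative, and echoing $http_origin back satisfies the explicit-origin requirement for credentialed requests while you debug:
location /path/to/asset {
    # Reflect the caller's origin rather than "*", since credentialed requests forbid a wildcard
    add_header Access-Control-Allow-Origin $http_origin;
    add_header Access-Control-Allow-Credentials true;
    add_header Access-Control-Allow-Methods 'POST, OPTIONS';
    add_header Access-Control-Allow-Headers 'Content-Disposition, Content-Type, X-Content-Range, X-Session-Id';
    # Zero while testing so the browser never reuses stale preflight permissions
    add_header Access-Control-Max-Age 0;

    if ($request_method = OPTIONS) {
        return 204;
    }

    # ... upload module directives go here ...
}
Once the upload works end to end, tighten the allowed origin and raise Max-Age again.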
When I send the request part of your question to https://cors-test.appspot.com/test, I get back the following:
HTTP/1.1 200 OK
Cache-Control: no-cache
Access-Control-Allow-Origin: https://server.local:8443
Access-Control-Allow-Headers: content-disposition,content-type,x-content-range,x-session-id
Access-Control-Allow-Methods: POST
Access-Control-Max-Age: 0
Access-Control-Allow-Credentials: true
Cache-Control: no-cache
Expires: Fri, 01 Jan 1990 00:00:00 GMT
Content-Type: application/json
Content-Encoding: gzip
Content-Length: 35
Vary: Accept-Encoding
Date: Thu, 23 May 2013 06:37:34 GMT
Server: Google Frontend
So you can start from there and add more and more security until it breaks, to figure out what the culprit is.

Related

Cors Blocks Request with status 403 on Nginx

I am facing a strange issue running CORS on Nginx. CORS works fine for everything except one scenario: when the server responds with a 403 HTTP response.
Basically, when I log in with correct credentials the CORS request works fine; however, when I provide wrong credentials for the login, the server (backend) responds with a 403 status and I get the following error:
"NetworkError: 403 Forbidden - http://mydomain.com/v1/login"
login
Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at http://mydomain.com/v1/login. This can be fixed by moving the resource to the same domain or enabling CORS.
If the credentials are correct I don't get this error and everything works perfectly.
I have done the configuration for enabling CORS and it seems to be working fine for everything else.
Following are the Request Headers
Request Headers
User-Agent:Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:29.0) Gecko/20100101 Firefox/29.0
Referer:http://abc.mydomain.com/
Pragma: no-cache
Origin: http://abc.mydomain.com
Host: www.mydomain.com
Content-Type: application/json;charset=utf-8
Content-Length: 74
Connection: keep-alive
Cache-Control: no-cache
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Accept: application/json, text/plain, */*
Response Headers
Server: nginx/1.4.1
Date: Tue, 10 Jun 2014 05:28:30 GMT
Content-Type: application/json; charset=utf-8
Content-Length: 76
Connection: keep-alive
An option for nginx (>= 1.7.5) is to specify the always parameter in add_header:
If the always parameter is specified (1.7.5), the header field will be
added regardless of the response code.
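For example, something along these lines keeps the CORS headers on error responses such as your 403 (the header names and values here are only illustrative):
add_header Access-Control-Allow-Origin $http_origin always;
add_header Access-Control-Allow-Headers 'Content-Type, Origin, Accept, Cookie' always;
add_header Access-Control-Allow-Credentials true always;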
I assume that you are using add_header directive to add CORS headers in the nginx configuration.
Nginx modules reference states about add_header:
Adds the specified field to a response header provided that the response code equals 200, 201, 204, 206, 301, 302, 303, 304, or 307.
To fix the problem, you could use the ngx_headers_more module to set the CORS headers on error responses as well.
more_set_headers 'Access-Control-Allow-Origin: $http_origin';
more_set_headers 'Access-Control-Allow-Headers: Content-Type, Origin, Accept, Cookie';

client side caching happening despite version in url

We seem to be having an issue with a CSS file being cached on the client. I generally stop this from causing issues by adding a version number to the file, i.e.
<link href="Default.css?4.31.0.17051" rel="stylesheet" type="text/css">
But in this circumstance this isn't working and I don't understand why.
The version number was incremented last night from 4.30.0.xxxxx to 4.31.0.17051
Some users, and I've seen it myself, are getting an HTTP 304 response. What's strange is that if I inspect it using the IE dev tools it shows an HTTP 304, but if I fire up Fiddler it doesn't show any request at all.
Content caching is not enabled on the server.
Here's the HTTP header if I do a ctrl-f5:
HTTP/1.1 200 OK
Content-Length: 45861
Content-Type: text/css
Last-Modified: Tue, 23 Jul 2013 14:19:40 GMT
Accept-Ranges: bytes
ETag: "0a61aabaf87ce1:bdba"
Vary: Accept-Encoding
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
Date: Tue, 30 Jul 2013 16:05:54 GMT
So:
Why don't I see this in Fiddler? Is it not sending the request at all, as this answer suggests (https://stackoverflow.com/a/8958279/542251)?
Why isn't this cache control (i.e. adding the ?4.31.0.17051 to the file) working?
EDIT
I've now touched the file to update the Last-Modified date, but it's still not being requested.
The page itself is returning an HTTP 200, so this isn't the page being cached, as suggested below:
Request:
GET http://www.mysite.com/Agent/Hotel HTTP/1.1
Accept: text/html, application/xhtml+xml, */*
Referer: http://www.mysite.com/Agent/Flights
Accept-Language: en-GB
User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)
Accept-Encoding: gzip, deflate
Host: www.mysite.com
Connection: Keep-Alive
Response:
HTTP/1.1 200 OK
Cache-Control: private
Content-Length: 53184
Content-Type: text/html; charset=utf-8
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 4.0.30319
X-AspNetMvc-Version: 3.0
Date: Wed, 31 Jul 2013 08:33:25 GMT
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<link href="Default.css?4.31.0.17051" rel="stylesheet" type="text/css" />
.........
As noted in my answer that you cited, the F12 tools can show an HTTP 304 when the response was really served from the cache. If you don't see the request in Fiddler, it wasn't sent over the network.
Are you sure that the page that refers to the CSS file wasn't pulled from the cache? If it were, then you'd still have the old URL reference. (Look carefully at the CSS request's URL in the F12 tools, as the URL will be accurate even if the "304" was not).
Two points:
- What does "Why isn't this cache control working?" mean? Your HTTP-response headers don't include any Cache-Control directives.
- Changing the Last-Modified date on the server obviously isn't something the client will know about unless it actually issues a Conditional GET request to the server.
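For illustration, a conditional revalidation of the stylesheet would look roughly like this on the wire, reusing the validators from the response quoted above (the path is abbreviated):
GET /Default.css?4.31.0.17051 HTTP/1.1
Host: www.mysite.com
If-Modified-Since: Tue, 23 Jul 2013 14:19:40 GMT
If-None-Match: "0a61aabaf87ce1:bdba"

HTTP/1.1 304 Not Modified
ETag: "0a61aabaf87ce1:bdba"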
So I got a solution but not necessarily an answer.
In the end I updated the tag on the site to:
<link href="Default.css?v=4.31.0.17051" rel="stylesheet" type="text/css">
I'm not sure whether adding the v= (turning it into a valid query string) was the solution, or whether it was simply the fact that I altered the URL again that solved the issue.
I haven't been able to replicate this issue in staging so I don't really know how or why this started happening.

Difference between Pragma and Cache-Control headers?

I read about Pragma header on Wikipedia which says:
"The Pragma: no-cache header field is an HTTP/1.0 header intended for
use in requests. It is a means for the browser to tell the server and
any intermediate caches that it wants a fresh version of the resource,
not for the server to tell the browser not to cache the resource. Some
user agents do pay attention to this header in responses, but the
HTTP/1.1 RFC specifically warns against relying on this behavior."
But I haven't understood what it actually does. What is the difference between the Cache-Control header whose value is no-cache and the Pragma header whose value is also no-cache?
Pragma is the HTTP/1.0 implementation and Cache-Control is the HTTP/1.1 implementation of the same concept. They are both meant to prevent the client from caching the response. Older clients may not support HTTP/1.1, which is why that header is still in use.
There is no difference, except that Pragma is only defined as applicable to requests made by the client, whereas Cache-Control may be used in both client requests and server replies.
So, as far as the standards go, they can only be compared from the perspective of a client making a request and a server receiving that request. RFC 2616 §14.32 (http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.32) defines the scenario as follows:
HTTP/1.1 caches SHOULD treat "Pragma: no-cache" as if the client had
sent "Cache-Control: no-cache". No new Pragma directives will be
defined in HTTP.
Note: because the meaning of "Pragma: no-cache" as a response
header field is not actually specified, it does not provide a
reliable replacement for "Cache-Control: no-cache" in a response.
The way I would read the above:
if you're writing a client and need no-cache:
- just use Pragma: no-cache in your requests, since you may not know if Cache-Control is supported by the server;
- but in replies, to decide whether to cache, check for Cache-Control;
if you're writing a server:
- when parsing requests from clients, check for Cache-Control; if not found, check for Pragma: no-cache and execute the Cache-Control: no-cache logic;
- in replies, provide Cache-Control.
Of course, reality might be different from what's written or implied in the RFC!
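To make the server-side half of that checklist concrete, here is a small sketch (plain JavaScript with a hypothetical helper name, not taken from any framework) of deciding whether a request demands a fresh response:
// Prefer Cache-Control; fall back to the HTTP/1.0 Pragma header only when it is absent.
function clientWantsFresh(headers) {
  var cacheControl = headers['cache-control'];
  if (cacheControl !== undefined) {
    return /\bno-cache\b/.test(cacheControl);
  }
  return headers['pragma'] === 'no-cache';
}

// e.g. clientWantsFresh({ 'pragma': 'no-cache' }) returns true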
Stop using (HTTP 1.0) → Replaced with (HTTP 1.1, since 1999):
Expires: [date] → Cache-Control: max-age=[seconds]
Pragma: no-cache → Cache-Control: no-cache
If it's after 1999, and you're still using Expires or Pragma, you're doing it wrong.
I'm looking at you Stackoverflow:
200 OK
Pragma: no-cache
Content-Type: application/json
X-Frame-Options: SAMEORIGIN
X-Request-Guid: a3433194-4a03-4206-91ea-6a40f9bfd824
Strict-Transport-Security: max-age=15552000
Content-Length: 54
Accept-Ranges: bytes
Date: Tue, 03 Apr 2018 19:03:12 GMT
Via: 1.1 varnish
Connection: keep-alive
X-Served-By: cache-yyz8333-YYZ
X-Cache: MISS
X-Cache-Hits: 0
X-Timer: S1522782193.766958,VS0,VE30
Vary: Fastly-SSL
X-DNS-Prefetch-Control: off
Cache-Control: private
tl;dr: Pragma is a legacy of HTTP/1.0 and hasn't been needed since Internet Explorer 5 or Netscape 4.7. Unless you expect some of your users to be using IE5, it's safe to stop using it.
Expires: [date] (deprecated - HTTP 1.0)
Pragma: no-cache (deprecated - HTTP 1.0)
Cache-Control: max-age=[seconds]
Cache-Control: no-cache (must re-validate the cached copy every time)
And the conditional requests:
Etag (entity tag) based conditional requests
Server: Etag: W/"1d2e7-1648e509289"
Client: If-None-Match: W/"1d2e7-1648e509289"
Server: 304 Not Modified
Modified date based conditional requests
Server: last-modified: Thu, 09 May 2019 19:15:47 GMT
Client: If-Modified-Since: Fri, 13 Jul 2018 10:49:23 GMT
Server: 304 Not Modified
last-modified: Thu, 09 May 2019 19:15:47 GMT

LiveHttpHeaders: which cache-control info is right

Using LiveHttpHeaders for Firefox 6, I was trying to see whether my CSS and JS files are being cached, using the Headers module from Apache via .htaccess. But I'm confused: there are two values for 'Cache-Control' in the data:
GET /proz/css/global.css HTTP/1.1
Host: localhost
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:5.0) Gecko/20100101 Firefox/5.0
Accept: text/css,*/*;q=0.1
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Connection: keep-alive
Referer: http://localhost/proz/
Cookie: PHPSESSID=el34de37pe3bnp4rdtbst1kd43
If-Modified-Since: Fri, 16 Sep 2011 21:15:32 GMT
If-None-Match: "400000000b06a-2999-4ad157e5b4583"
Cache-Control: max-age=0
HTTP/1.1 304 Not Modified
Date: Sat, 17 Sep 2011 03:04:50 GMT
Server: Apache/2.2.17 (Win32) PHP/5.2.8
Connection: Keep-Alive
Keep-Alive: timeout=5, max=99
Etag: "400000000b06a-2999-4ad157e5b4583"
Cache-Control: max-age=604800, public
Vary: Accept-Encoding
Which one is the true value: the first Cache-Control (max-age=0) or the latter one?
I would also like to know how to make sure that my JS, CSS, and HTML files are being compressed after I enable the deflate module in .htaccess. And yes, both the headers and deflate modules are turned on.
There are two parts in this listing:
The part before the blank line is the request, sent by your browser
The part after the blank line is the response, sent by the server
The Cache-Control: max-age=0 sent by the client (your browser) tells the server (or any proxy in the middle) to send the most fresh version of the file. The browser usually sends this when you hit the refresh button.
The Cache-Control: max-age=604800, public sent by the server tells the client (your browser or a proxy) that the file is valid for 604800 seconds and can be cached for that time. (The browser won't even attempt to ask the server whether a newer version exists, unless you hit refresh, as you did in this case.)
The server replied 304 Not Modified, which means that your browser already has the most recent version and doesn't need to download it again (and it did not download it again).
The Vary: Accept-Encoding header indicates that the server made some decisions based on the client's Accept-Encoding header. This suggests that, had the server not replied 304 Not Modified, it would have compressed the file.
To verify this last point, clear your cache, request the file again, and look at the content of the Content-Encoding header (it must be gzip or deflate if the data is compressed).
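With the cache cleared, so that no 304 short-circuits the transfer, a compressed reply would include something along these lines:
HTTP/1.1 200 OK
Content-Type: text/css
Content-Encoding: gzip
Vary: Accept-Encoding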

HTTP Expires header not respected by browser?

I have a situation where my (embedded) web server is sending Expires header, but the browser does not seem to respect the header setting, i.e., if I refresh the page, the browser requests the resources that are supposed to be cached. Following are the headers that are getting exchanged:
https://192.168.1.180/scgi-bin/ajax/ajax.cgi
GET /scgi-bin/ajax/ajax.cgi HTTP/1.1
Host: 192.168.1.180
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.11) Gecko/2009060215 Firefox/3.0.11 (.NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cache-Control: max-age=0
HTTP/1.x 200 OK
Date: Wed, 24 Jun 2009 20:26:47 GMT
Server: Embedded HTTP Server.
Connection: close
Content-Type: text/html
----------------------------------------------------------
https://192.168.1.180/scgi-bin/ajax/static.cgi?fn=images/logo.jpg&ts=20090624201057
GET /scgi-bin/ajax/static.cgi?fn=images/logo.jpg&ts=20090624201057 HTTP/1.1
Host: 192.168.1.180
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.11) Gecko/2009060215 Firefox/3.0.11 (.NET CLR 3.5.30729)
Accept: image/png,image/*;q=0.8,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: https://192.168.1.180/scgi-bin/ajax/ajax.cgi
Cache-Control: max-age=0
HTTP/1.x 200 OK
Date: Wed, 24 Jun 2009 20:26:47 GMT
Server: Embedded HTTP Server.
Connection: close
Expires: Wed, 1 Jun 2011 20:00:00 GMT
Content-Type: image/jpg
----------------------------------------------------------
The ajax.cgi returns an html page with a logo graphic (via the static.cgi script), which I'd like cached, but the browser is asking for the logo on every refresh.
The browser ignores the Expires header if you refresh the page. It always checks whether the cache entry is still valid by contacting the web server. Ideally, it will use the If-Modified-Since request header so that the server can return '304 Not modified' if the cache entry is still valid.
You're not setting the Last-Modified header, so the browser has to perform an unconditional GET of the content to ensure that it is up to date.
Some rules of thumb for setting Expires and Last-Modified are described in this blog post:
http://blog.httpwatch.com/2007/12/10/two-simple-rules-for-http-caching/
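As an illustration (the dates here are placeholders, not taken from your actual server), sending a validator alongside Expires would let a refresh be answered cheaply:
HTTP/1.1 200 OK
Date: Wed, 24 Jun 2009 20:26:47 GMT
Expires: Wed, 01 Jun 2011 20:00:00 GMT
Last-Modified: Wed, 24 Jun 2009 18:00:00 GMT
Content-Type: image/jpg
On a refresh the browser can then send If-Modified-Since with that date, and the server can answer 304 Not Modified instead of re-sending the image.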
What are you doing in your browser? It looks like you're clicking the reload button, or even something like Shift+Reload. Normally, the browser wouldn't send a Cache-Control: max-age=0 header. That means the browser has thrown away the cached image and wants to get it again.
If you just navigate to another page and then back again, the browser should respect your Expires header.
Additionally, you could add a Cache-control: public header to your response. That allows proxies and the browser explicitly to cache the image.
Any errors in your https certificate will cause the browser to not respect your headers.
Try it without https and see if it works over plain http.
See this answer https://stackoverflow.com/a/17716911
The CGI script looks like it has a timestamp parameter...this isn't changing, is it? The browser should be treating each unique URL as a different object in the cache, so if that is updating with every request, it won't match with the cached image.
Additionally, the Expires field is not exactly in RFC 1123 format, because you need two digits for the date. This may or may not be an issue, but it's something to check. The browser is including Cache-Control: max-age=0, which indicates that it believes its cache to be potentially out of date.
Once the server gets this validation request, it can return 304 (Not Modified), or 200 (OK), as it is doing currently.
