Caching images with different query strings (S3 signed URLs)

I'm trying to figure out if I can get browsers to cache images with signed URLs.
What I want is to generate a new signed URL for every request (same image, but with an updated signature), but have the browser not re-download it every time.
So, assuming the cache-related headers are set correctly, and all of the URL is the same except for the query string, is there any way to make the browser cache it?
The URLs would look something like:
http://example.s3.amazonaws.com/magic.jpg?AWSAccessKeyId=stuff&Signature=stuff&Expires=1276297463
http://example.s3.amazonaws.com/magic.jpg?AWSAccessKeyId=stuff&Signature=stuff&Expires=1276297500
We plan to set the ETag to the file's MD5 sum, so will the browser at least figure out it's the same image at that point?
My other option is to keep track of when we last gave out a URL, then start giving out new ones slightly before the old ones expire, but I'd prefer not to deal with session info.
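For context, these legacy (signature v2) signed URLs are generated roughly like this - a minimal Node.js sketch with placeholder credentials, following S3's query-string authentication scheme:
    const crypto = require('crypto');
    // Build a signature-v2 presigned GET URL (AWSAccessKeyId/Signature/Expires).
    function signedUrl(bucket, key, accessKeyId, secretKey, expires) {
      // StringToSign for a plain GET: verb, blank MD5/Content-Type, expiry, resource
      const stringToSign = `GET\n\n\n${expires}\n/${bucket}/${key}`;
      const signature = crypto.createHmac('sha1', secretKey)
        .update(stringToSign)
        .digest('base64');
      return `http://${bucket}.s3.amazonaws.com/${key}` +
        `?AWSAccessKeyId=${accessKeyId}` +
        `&Signature=${encodeURIComponent(signature)}` +
        `&Expires=${expires}`;
    }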

The browser will use the entire URL for caching purposes, including request parameters. So if you change a request parameter it will effectively be a new "key" in the cache and will always download a new copy of that image. This is a popular technique in the ad-serving world - you add a random number (or the current timestamp) to the end of the URL as a parameter to ensure the browser always goes back to the server to make a new request.
The only way you might get this to work is if you can make the URL static - e.g. by using Apache rewrite rules or a proxy of some sort.
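A minimal Node/Express sketch of the proxy idea: the browser-facing URL stays static (so normal caching headers apply), and the server fetches the freshly signed S3 URL behind the scenes. Here signUrl is a hypothetical helper that produces a signed URL, e.g. as sketched under the question above:
    const express = require('express');
    const { Readable } = require('stream');
    const app = express();
    app.get('/images/:key', async (req, res) => {
      const signed = signUrl(req.params.key);      // hypothetical signing helper
      const upstream = await fetch(signed);        // Node 18+ global fetch
      res.set('Content-Type', upstream.headers.get('content-type'));
      res.set('Cache-Control', 'public, max-age=86400');
      Readable.fromWeb(upstream.body).pipe(res);   // stream the image through
    });
    app.listen(3000);
The trade-off is that your server now serves the image bytes instead of S3.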

I've been having exactly the same issue with S3 signed URLs. The only solution I came up with is to have batches of URLs expire on the same day. This is not ideal, but at least it provides caching for some time.
For example, all URLs signed during April get an expiry of May 10th, and all URLs signed during June are set to expire on July 10th. This means the signed URLs stay identical for a whole month.
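A minimal sketch of that scheme (Node.js; the signing step itself is omitted): pin the Expires value to the 10th of the month after signing, so the value - and therefore the signature - is identical for every URL generated within a given month.
    function monthlyExpiry(now = new Date()) {
      // Epoch seconds for midnight UTC on the 10th of the following month;
      // Date.UTC handles the December -> January rollover automatically.
      return Date.UTC(now.getUTCFullYear(), now.getUTCMonth() + 1, 10) / 1000;
    }
    // Pass monthlyExpiry() as the Expires parameter when signing.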

Just stumbled on this problem and found a way to solve it. Here's what you need to do:
Store the first URL string (in localStorage, for example);
When you receive the image URL the next time, check whether the base URLs match (str1.split('?')[0] === str2.split('?')[0]);
If they do, use the first one as the img src attribute.
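A minimal sketch of those steps, assuming newSignedUrl is the freshly generated URL you just received:
    function cachedSrc(newSignedUrl) {
      const base = newSignedUrl.split('?')[0];
      const stored = localStorage.getItem(base);
      // Reuse the first signed URL seen for this image so the cache key matches.
      // Caveat: the stored URL's signature eventually expires, so real code
      // should also compare its Expires parameter against the current time.
      if (stored && stored.split('?')[0] === base) return stored;
      localStorage.setItem(base, newSignedUrl);
      return newSignedUrl;
    }
    // usage: img.src = cachedSrc(newSignedUrl);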
Hope it helps someone.

Related

How do websites change content that has already been labeled with very long expiry dates?

What would happen if the server sets a far-future expiry date for a resource (e.g. 20 years later) and then, after new requirements emerge, decides that the resource must be changed? For example, the CSS files of some websites seem to have such long expiry dates. Is there another header the website can send to invalidate a specific previously cached resource?
No. Once you have served a resource with a very long expiry date, you have no way of knowing whether people will keep using their cached copy right up until that date.
The way to “change” something generally falls into two categories (see the sketch below):
Create a unique filename so it's versioned. For example styles.57ab85ca183.css instead of styles.css, or perhaps styles.css?v=12345. This requires you to refer to the specific version in your code, so it adds a little complexity, but there are tools for this.
Have a short expiry date. This is generally what people do with the main page (as it's not possible to change that location with a versioned path).
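A minimal Express sketch of both categories, assuming versioned assets live under /assets: versioned files get a far-future lifetime, while the page that references them stays short-lived.
    const express = require('express');
    const app = express();
    // 1. Versioned assets (e.g. styles.57ab85ca183.css): cache aggressively.
    app.use('/assets', express.static('assets', { maxAge: '365d', immutable: true }));
    // 2. The main page: short expiry, so new asset versions are picked up quickly.
    app.get('/', (req, res) => {
      res.set('Cache-Control', 'public, max-age=300');
      res.sendFile('index.html', { root: __dirname });
    });
    app.listen(3000);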

Scene7 URL parameters

Our business uses Adobe Scene7. One of the things we need to be able to do is share the image URLs for all products that have an image with a vendor.
We have worked out the structure of the URLs so we can predict each link, and we ping each image URL to ensure it is valid and available for viewing.
Lately we've run into a problem where many of the images are not rendering...
Most images:
http://s7d5.scene7.com/is/image/LuckyBrandJeans/7W64372_960_1
Some images:
https://s7d9.scene7.com/is/image/LuckyBrandJeans/7Q64372_960_1
The only difference appears to be that s7d5 becomes s7d9 for some images. What drives that?
How do we get a list of all of those URLs if we can't predict d9 vs. d5?
I'm not sure it matters. I think all you need is the filename. It looks like if you take the filename "7W64372_960_1" it works on both s7d5 and s7d9:
http://s7d5.scene7.com/is/image/LuckyBrandJeans/7W64372_960_1
http://s7d9.scene7.com/is/image/LuckyBrandJeans/7W64372_960_1
In fact, you can change it to s7d1, s7d2, s7d3, etc. and it still works.
So I think if you were to build some sort of template, you could pick whichever URL you wanted and append the filename on the end, like:
http://s7d5.scene7.com/is/image/LuckyBrandJeans/{{imageFileName}}
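A sketch of that template, with the host and company path taken from the URLs above (pick whichever sNdN host you like, since they appear interchangeable):
    function scene7Url(imageFileName) {
      return `http://s7d5.scene7.com/is/image/LuckyBrandJeans/${imageFileName}`;
    }
    // scene7Url('7W64372_960_1')
    //   -> http://s7d5.scene7.com/is/image/LuckyBrandJeans/7W64372_960_1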
We have the same thing with our company.
One domain serves the images for the lower "sandbox" environment (d5) and the other serves the images to your live environment (d9).

How to expire Branch.io link within specific time? (Deep linking via branch metrics)

I am using Branch.io links for deep linking, generating them via their public API endpoint.
Here is their endpoint: https://api.branch.io/v1/url
I append my branch key and data that I need to associate in this link. Everything is working fine but I need to expire this link within one hour.
Reading up here: https://github.com/BranchMetrics/branch-deep-linking-public-api#creating-a-deep-linking-url
I added "duration" key also, but it didnt expire the link.
It will be great if anyone could help me in figuring out how to expire branch.io link.
Alex from Branch.io here: the duration parameter is used for something different, so it's not going to be able to do what you want. We don't have a built-in feature to expire links, but you could create something close to it yourself:
Add a custom link parameter containing a timestamp for when the link was created.
Check for that timestamp when handling the link at the destination, and do something different if it is more than an hour old. I'm guessing this would be inside your app, and also on whatever fallback URL you have specified for when the app isn't installed or the user is on desktop.
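A sketch of that check, assuming you added a created_at epoch-milliseconds value (a hypothetical key name) when creating the link, and that the link data arrives as a params object:
    function isLinkFresh(params, maxAgeMs = 60 * 60 * 1000) {
      const created = Number(params.created_at);   // set when the link was generated
      return Number.isFinite(created) && Date.now() - created < maxAgeMs;
    }
    // if (!isLinkFresh(params)) { /* treat the link as expired */ }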
A mail from the Branch.io support team suggested the following:
If you found out about the $exp_date parameter from here, then the parameters in that list are only used for iOS Spotlight Indexing but will be used by Branch in the future. A better solution than utilizing $exp_date is to code logic into your client to determine what to do with link data based on date. This way, your deep links will always work and always carry data through, and you won't have to worry about users clicking empty links.
This way, you would include date as an extra meta key/value pair, and examine this date in your client when receiving link params to determine if you want to honor the link's contents or not.

imgix.com downloads images instead of browsing to them

I am using the imgix.com CDN for a test project, and for some reason it keeps downloading the images instead of browsing to them and applying the rules.
So if I type in myprefix.imgix.net/myimage.png it simply downloads it, and if I type https://myprefix.imgix.net/myimage.png~text?txtsize=44&txt=470%C3%97480&w=450&h=480 nothing happens.
Has anyone come across this problem?
Thanks
These are two separate issues:
1) If you request an imgix URL without adding any query parameters, imgix will just act as a passthrough to your source. If your images are being treated as a download by the browser rather than as images to display, something must be misconfigured at the source level. Not knowing anything about your source, I really can't offer any better advice here.
2) The myimage.png~text URL isn't working because you shouldn't be using ~text at all here. Take those five characters out of your URL (https://myprefix.imgix.net/myimage.png?txtsize=44&txt=470%C3%97480&w=450&h=480) and it should work as you expect.
Imgix's ~text endpoint is a way to request an image where the "base image" is text rather than a real image. In trying to combine a real base image (myimage.png, in your URL above) with this text-only endpoint (~text), you're making a request that imgix doesn't know how to handle.
If you've got further questions about your imgix integration, especially if they're configuration questions that involve your specific account and settings, I'd encourage you to send your questions to support@imgix.com instead of Stack Overflow. While SO is a great place to answer one-off questions, writing into our support-ticket system allows us to answer account-specific questions much more easily.
Once your Source has been configured and deployed, you can begin making image requests to imgix. These requests differ slightly for each imgix Source type, but they all have the same basic structure:
https://example.imgix.net/products/desk.jpg?w=600&exp=1
(imgix domain: example.imgix.net | path: /products/desk.jpg | query string: w=600&exp=1)
The hostname, or domain, of the imgix URL will have the form YOUR_SOURCE_NAME.imgix.net. In the above URL, the name of the Source is example, so the hostname takes the form of example.imgix.net. Different hostnames can be set in your Source by clicking Manage under the Domains header.
The path consists of any additional directory information required to locate your image within your image storage (e.g. if you have different subfolders for your images). In this example, /products/desk.jpg completes the full path to the image.
imgix’s parameters are added to the query string of the URL. In the above example, the query string begins with ?w=600 and the additional parameters are linked with ampersands. These parameters dictate how images are processed. In the above URL, w=600 specifies the width of the image and exp=1 adjusts the exposure setting.
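Putting those pieces together in code - a small sketch using the example source name, path, and parameters from the excerpt above:
    const host = 'https://example.imgix.net';   // YOUR_SOURCE_NAME.imgix.net
    const path = '/products/desk.jpg';          // location within your image storage
    const params = new URLSearchParams({ w: '600', exp: '1' });
    console.log(`${host}${path}?${params}`);
    // -> https://example.imgix.net/products/desk.jpg?w=600&exp=1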

URL hash is persisting between redirects

For some reason, non-IE browsers seem to persist a URL hash (if present) when a server-side redirect is sent (using the Location header). Example:
Test.aspx:
    // a simple redirect
    Response.Redirect("http://www.yahoo.com");
If I visit:
Test.aspx#foo
In Firefox/Chrome, I'm taken to:
http://www.yahoo.com#foo
Can anyone explain why this happens? I've tried this with various server side redirects in different platforms as well (all resulting in the Location header, though) and this always seems to happen. I don't see it anywhere in the HTTP spec, but it really seems to be a problem with the browsers themselves. The URL hash (as expected) is never sent to the server, so the server redirect isn't polluted by it, the browsers are just persisting it for some reason.
Any ideas?
I suggest that this is the correct behaviour. The 302 and 307 status codes indicate that the resource is to be found elsewhere. #bookmark is a location within the resource.
Once the resource (html document) has been located it is for the browser to locate the #bookmark within the document.
The analogy is this: You want to look something up in a book in chapter 57, so you go to the library to get the book. But there is a note on the shelf saying the book has moved, it is now in the other building. So you go to the new location. You still want chapter 57 - it is irrelevant where you got the book.
This is an aspect that was not covered by previous HTTP specifications but has been addressed in the later HTTP development:
If the server returns a response code of 300 ("multiple choice"), 301 ("moved permanently"), 302 ("moved temporarily") or 303 ("see other"), and if the server also returns one or more URIs where the resource can be found, then the client SHOULD treat the new URIs as if the fragment identifier of the original URI was added at the end. The exception is when a returned URI already has a fragment identifier. In that case the original fragment identifier MUST NOT be added to it.
So the fragment of the original URI should also be used for the redirection URI, unless the redirection URI already contains a fragment of its own.
Although this was just a draft that expired in 2000, it seems that the behavior described above is the de facto standard among today's web browsers.
@Julian Reschke or @Mark Nottingham probably know more about this.
From what I have found, it doesn't seem clear what the exact behaviour should be. There are plenty of people having problems with this; some of them want to keep the bookmark through the redirect, some of them want to get rid of it.
Different browsers handle this differently, so in practice it's not useful to rely on either behaviour.
It definitely is a browser issue. The browser never sends the bookmark part of the URL to the server, so there is nothing the server could do to find out whether there is a bookmark or not, and nothing that could be done about it reliably.
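A small Express sketch that demonstrates this: the fragment never appears in the request, and the browser re-attaches it to the redirect target on its own.
    const express = require('express');
    const app = express();
    app.get('/test', (req, res) => {
      console.log(req.url);              // "/test" - no fragment ever arrives
      res.redirect(302, 'http://www.yahoo.com/');
    });
    app.listen(3000);
    // Visiting http://localhost:3000/test#foo in Firefox/Chrome lands on
    // http://www.yahoo.com/#foo - the browser carried the fragment across.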
When I put the full URL in the action attribute of the form, it keeps the hash. But when I use just the query string, it drops the hash. E.g.,
Keeps the hash:
    https://example.com/edit#alrighty
    <form action="https://example.com/edit?ok=yes">
Drops the hash:
    https://example.com/edit
    <form action="?ok=yes">
