Caching a resource in Varnish Cache after a specific number of requests - nginx

Is there any functionality in Varnish Cache like Nginx's proxy_cache_min_uses, which caches a resource only after a specific number of requests to it?
Here is a similar solution in Varnish Cache Plus (based on slimhazard's comment on this issue):
import vsthrottle;

sub vcl_recv {
    if (req.url ~ "^/min/use/me" && vsthrottle.is_denied(req.url, 50, 2h, 1h)) {
        # If the URL was requested more than 50 times during the last two hours,
        # then go to cache lookup for the next hour.
        return (hash);
    } else {
        # Otherwise bypass the cache.
        return (pass);
    }
}
Is there any similar solution that could be used in Varnish-Cache itself?

Not in Varnish Cache core itself, but you can achieve this with a VMOD, like this counter VMOD.
It lets you increment a counter each time a resource is requested, then check the counter's value and apply whatever caching logic you need.
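In open-source Varnish that could look roughly like the sketch below. The VMOD name and API (counter.inc and its return value) are assumptions for illustration, so check the actual module's documentation:

```vcl
import counter;  # hypothetical VMOD name/API

sub vcl_recv {
    if (req.url ~ "^/min/use/me") {
        # Increment a per-URL counter; once the resource has been
        # requested more than 50 times, start doing cache lookups.
        if (counter.inc(req.url) > 50) {
            return (hash);
        }
        # Below the threshold: bypass the cache.
        return (pass);
    }
}
```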


NGINX: how to process requests with different http methods to one location differently

(The situation that prompted this question is quite specific, not abstract. But I already know workarounds for it, so I dropped the details in order to get a general pattern that answers the question.)
Problem description: imagine you have a single NGINX location block and you want to respond differently in this block depending on the request's HTTP method. But NGINX does not support syntax like location /:POST {...}, so some more complex approach is needed.
I tried to find the answer to this question, and all my findings can be classified into these groups:
Group 1: if ($request_method ...) inside location
But If is Evil.
In my experience, the pure if approach does not work without copying directives into the if clause from the surrounding "common" context. Moreover, none of those copied directives are approved for use inside if by NGINX's developers.
Only return and rewrite are approved inside if, and that is not enough.
Group 2: limit_except
This approach works if you want to restrict certain methods for a location, but it does not allow distinguishing behavior per HTTP method within the same location.
Group 3: let's map everything
Many if cases can be replaced by one or two (or more) proper map blocks, and $request_method can also be mapped. But what if that is not enough, or you would simply have to map several times? How, for example, do you proxy_pass requests with one HTTP method and perform try_files for requests to the same location with another HTTP method, without if? (Not my situation, just an example of two directives that are incompatible with if.) And how do you avoid the 10+ maps that can be needed for even one condition?
Group 4: proxying to different upstream blocks
This way is quite complex and looks like overkill. Moreover, if you need something other than plain proxying (for endpoint selection, map is enough), you have to create a new virtual host. Also, the pieces of the implementation end up located far apart.
While I was writing the question I came up with the idea of error pages and named locations. However, I submitted the question anyway, to gather more solutions and opinions and to help somebody else.
Own idea 1: Error pages and named locations
return allows returning an error code without an explicit redirect, and by using a non-standard error code you can define your own error handler that simply extends the functionality.
Some time ago I found this technique used as a solution to the "duplicated in-location directives" problem: the shared directives were placed in the error handler, and each location returned the handler's error code in addition to its own directives.
Usage example:
map $request_method $handler_code {
    GET     598;
    POST    599;
    default 405; # Method Not Allowed
}

server {
    ...

    error_page 598 = @get_handler;
    location @get_handler {
        # Only the GET routine.
    }

    error_page 599 = @post_handler;
    location @post_handler {
        # Only the POST routine.
    }

    location / {
        ... # Common routine.
        return $handler_code;
        ...
    }

    ...
}

Forcing TTLs in Varnish 6.2

I've been looking at the latest version of Varnish (6.2) and am having problems with the removal of return (miss) from vcl_hit.
Our use case is that we want to cache things for a set amount of time, then force Varnish to retrieve new content. In previous versions, the following worked fine:
sub vcl_hit {
    if (obj.ttl >= 0s) {
        return (deliver);
    } else {
        return (miss);
    }
}
However, in 6.2 return (miss) has been removed, and we want content to always be refreshed correctly.
I looked at return (pass), but the documentation suggests the response will then not be cached, which is not what we want.
And return (fetch) has not been an option for some time, so I'm struggling to find an alternative: return (restart), suggested in the docs, will just loop back to the same place.
Should I be looking elsewhere, and trying to disable grace/saint instead?
vcl_hit is the wrong subroutine for a handful of reasons, the main one being that you are using a complicated way, with side effects, to do something trivial. Just do:
sub vcl_backend_response {
    # Set the TTL.
    set beresp.ttl = 5m;
    # After the TTL expires, grace kicks in, during which
    # content is revalidated asynchronously.
    set beresp.grace = 2h;
    # After grace, keep kicks in, during which
    # content is revalidated synchronously.
    set beresp.keep = 3d;
}
Your snippet is equivalent to setting grace and keep to 0s.

How do I prevent hotlinking but allow Google in nginx?

I want to setup a reverse proxy for serving images stored in S3.
I don't want to allow access to the images if the referrer is not example.com.
But I want to allow multiple crawlers, for example Google bot, Bing bot, etc. (based on User-Agent), to access the images.
I also want to allow my Android app to access the images (based on a custom header, say X-Application: ExampleApp).
How do I configure nginx to do so?
That comes down to using 3 ifs, which is not going to work due to if's limitations.
What you can do instead is two things: set up map blocks to handle the 3 tests (setting true or false values), then inside the server block use Lua to combine the 3 test values into one, and use a single if (or pure Lua) in the location block to allow/deny access.
map $http_referer $usestring1 {
    default    0;
    ~^google$  1;
}
map $http_user_agent $usestring2 {
    default    0;
    ~^google$  1;
}
etc....
location / {
    content_by_lua '
        local s = tonumber(ngx.var.usestring1);
        local t = tonumber(ngx.var.usestring2);
        if s + t == 2 then return ngx.exit(503); end;
    ';
    etc...........
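Alternatively, the three tests can be combined without Lua by concatenating the map results into one flag and matching on it. A sketch under assumptions: the domain, the bot list, the header value, and the S3 upstream name are all placeholders to adapt:

```nginx
# Flag requests whose Referer is example.com.
map $http_referer $ref_ok {
    default                     0;
    "~^https?://example\.com/"  1;
}

# Flag known crawlers by User-Agent (list is illustrative).
map $http_user_agent $bot_ok {
    default                  0;
    "~*(googlebot|bingbot)"  1;
}

# Flag the Android app by its custom header.
map $http_x_application $app_ok {
    default       0;
    "ExampleApp"  1;
}

# Access is allowed if any flag is set, i.e. the combined string contains a 1.
map "$ref_ok$bot_ok$app_ok" $deny {
    default  1;
    "~1"     0;
}

server {
    location /images/ {
        if ($deny) {
            return 403;
        }
        proxy_pass https://your-bucket.s3.amazonaws.com/;
    }
}
```

nginx evaluates map variables lazily, so chaining maps like this is cheap; the single if only does a return, which is one of the two directives considered safe inside if.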

How to set request headers asynchronously in typeahead/bloodhound

Environment:
I am using typeahead/bloodhound for a search field in my mobile app (steroids/cordova)
Every request from my app to the API needs to be signed and the signature added to auth headers
Obviously, setting the headers in the ajax settings won't work, as each request Bloodhound sends is different and requires a different signature.
In my first implementation, I used the beforeSend ajax setting to achieve this: simply calculate the signature in that function and add it to the request headers.
However, this was not very secure, so I decided to move the secret and the signature calculation into a Cordova custom plugin's native code, to be compiled. Not bulletproof, but a reasonable amount of security.
As Cordova plugins are asynchronous, beforeSend became useless in this case: the function completes before the signing and setting of the headers are done.
So, in summary, the question is: how can I asynchronously calculate and set those headers with typeahead/bloodhound?
OK, the solution seems to be to fork and hack. First, modify _getFromRemote to remove the need for beforeSend by adding a remote.headers option, similar to remote.replace except that it returns a deferred object:
if (this.remote.headers) {
    $.when(
        this.remote.headers(url, query, this.remote.ajax)
    ).done(function(headers) {
        that.remote.ajax.headers = headers;
        deferred.resolve(that.transport.get(url, that.remote.ajax, handleRemoteResponse));
    });
} else {
    deferred.resolve(this.transport.get(url, this.remote.ajax, handleRemoteResponse));
}
and then modify the get function that uses it to handle the deferred:
if (matches.length < this.limit && this.transport) {
    var that = this;
    cacheHitPromise = this._getFromRemote(query, returnRemoteMatches);
    cacheHitPromise.done(function(hit) {
        if (!hit) {
            (matches.length > 0 || !that.transport) && cb && cb(matches);
        }
    });
}
Now I'm free to use asynchronous native code to sign and set the request auth headers. :)
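The contract the patched remote.headers option relies on can be sketched on its own: an async function that resolves to a headers object once the native signer finishes. The signRequest function below stands in for the (hypothetical) Cordova plugin call; plain Promises are used instead of jQuery deferreds to keep it self-contained:

```javascript
// Stand-in for the asynchronous native signing call (hypothetical plugin).
function signRequest(url, callback) {
  setTimeout(function () {
    callback('sig-for-' + url);
  }, 0);
}

// A remote.headers-style option: returns a promise of a headers object,
// which the patched _getFromRemote waits on before firing the request.
function headersFor(url) {
  return new Promise(function (resolve) {
    signRequest(url, function (signature) {
      resolve({ 'X-Signature': signature });
    });
  });
}

// Usage: attach the resolved headers to the ajax settings, then fetch.
headersFor('/search?q=foo').then(function (headers) {
  console.log(headers['X-Signature']); // prints "sig-for-/search?q=foo"
});
```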

Create a timed cache in Drupal

I am looking for more detailed information on how I can get the following caching behavior in Drupal 7.
I want a block that renders information retrieved from an external service. As the block is rendered for many users, I do not want to continually request data from that service, but instead cache the result. However, this data changes relatively frequently, so I'd like to retrieve the latest data every 5 or 10 minutes, then cache it again.
Does anyone know how to achieve such caching behavior without writing too much of the code oneself? I also haven't found much good documentation on how to use caching in Drupal 7, so any pointers on that are appreciated as well.
Keep in mind that cache_get() does not actually check if an item is expired or not. So you need to use:
if (($cache = cache_get('your_cache_key')) && $cache->expire >= REQUEST_TIME) {
    return $cache->data;
}
Also make sure to use the REQUEST_TIME constant rather than time() in D7.
The functions cache_set() and cache_get() are what you are looking for. cache_set() has an expire argument.
You can use them basically like this:
<?php
if ($cached_data = cache_get('your_cache_key')) {
    // Return from cache.
    return $cached_data->data;
}
// No or outdated cache entry, refresh data.
$data = _your_module_get_data_from_external_service();
// Save data in cache with 5 min expiration time.
cache_set('your_cache_key', $data, 'cache', time() + 60 * 5);
return $data;
?>
Note: You can also use a different cache bin (see documentation links) but you need to create a corresponding cache table yourself as part of your schema.
I think this should be $cache->expire, not $cache->expires. I didn't have luck with this example when setting REQUEST_TIME + 300 in cache_set(), since $cache->expires will always be less than REQUEST_TIME. This works for me:
if (($cache = cache_get('your_cache_key', 'cache')) && (REQUEST_TIME < $cache->expire)) {
    return $cache->data;
}
