Varnish: Hiding internal backend requests

This is my scenario:
1) Varnish (172.16.217.131:80) receives a request from a client, e.g.:
http://172.16.217.131:80/a.png
2) The request is forwarded to the default backend (127.0.0.1:8000)
3) The default backend receives the request and processes it
4) That processing results in a new URL, e.g.: http://172.16.217.132:80/a.png (as you can see, the IP has changed)
5) 172.16.217.132:80 is another backend in Varnish's config file
6) The new URL points to a resource that should be served by Varnish (that resource is generally an image)
My problem is: the client needs to execute 2 GETs to obtain the image.
My question: how can I configure Varnish to internally receive the response from the first backend (127.0.0.1:8000), fetch the data from the second backend (172.16.217.132:80), and then send the data to the client?
Thanks.

Regarding step 4:
4) That processing results in a new URL, e.g.: http://172.16.217.132:80/a.png (as you can see, the IP has changed)
do you mean that it results in an HTTP redirect? Then you could check the backend response status code in vcl_fetch (check for 301, 302, etc.), use the Location header as your new URL, and do a restart. I found a great example of this in the Varnish Book:
sub vcl_fetch {
    # Only follow the redirect once (req.restarts == 0), and only for
    # GET requests that came back as a 301 from the backend.
    if (req.restarts == 0 &&
        req.request == "GET" &&
        beresp.status == 301) {
        # Strip the scheme, then split the Location header into a new
        # Host header and a new URL for the restarted request.
        set beresp.http.location = regsub(beresp.http.location, "^http://", "");
        set req.http.host = regsub(beresp.http.location, "/.*$", "");
        set req.url = regsub(beresp.http.location, "[^/]*", "");
        return (restart);
    }
}
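After the restart the request runs through vcl_recv again, so you still have to point it at the second backend there. A minimal sketch of that routing, assuming the same Varnish 3 syntax as the example above and a hypothetical backend declaration named image_backend for 172.16.217.132:80:

backend image_backend {
    .host = "172.16.217.132";
    .port = "80";
}

sub vcl_recv {
    # On the restarted request, route to the image backend based on the
    # Host header that vcl_fetch rewrote from the Location header.
    if (req.restarts > 0 && req.http.host ~ "^172\.16\.217\.132") {
        set req.backend = image_backend;
    }
}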

Related

How to get the client IP address using HTTP.jl

I'm trying to get both the client request and the IP address from HTTP requests to my HTTP.jl server (based on the basic server example in the docs).
using HTTP
using Sockets

const APP = HTTP.Router()

# My request handler function can see the request's method
# and target but not the IP address it came from
HTTP.@register(APP, "GET", "/", req::HTTP.Request -> begin
    println("$(req.method) request to $(req.target)")
    "Hello, world!"
end)

HTTP.serve(
    APP,
    Sockets.localhost,
    8081;
    # My tcpisvalid function can see the client's
    # IP address but not the HTTP request
    tcpisvalid = sock::Sockets.TCPSocket -> begin
        host, port = Sockets.getpeername(sock)
        println("Request from $host:$port")
        true
    end
)
My best guess would be that there's a way to parse the TCPSocket.buffer into an HTTP request, but I can't find any methods to do it.
Can you suggest a way to get an HTTP.Request from a TCPSocket or a different way to approach this problem?
Thanks in advance!
The router (APP) is a (collection of) "request handler(s)", which can only access the HTTP.Request -- you cannot get the stream from it. Instead you can define a "stream handler", which is passed the stream. From the stream you can get the client's IP address using Sockets.getpeername (this requires HTTP.jl version 0.9.7 when called on an HTTP.Stream, as in the examples below).
using HTTP, Sockets

const APP = HTTP.Router()

function request_handler(req::HTTP.Request)
    println("$(req.method) request to $(req.target)")
    return "Hello, world!"
end
HTTP.@register APP "GET" "/" request_handler

function stream_handler(http::HTTP.Stream)
    host, port = Sockets.getpeername(http)
    println("Request from $host:$port")
    return HTTP.handle(APP, http) # regular handling
end

# HTTP.serve with stream=true to specify that stream_handler is a function
# that expects an HTTP.Stream as input (and not an HTTP.Request)
HTTP.serve(stream_handler, Sockets.localhost, 8081; stream=true) # <-- Note stream=true
# or HTTP.listen
HTTP.listen(stream_handler, Sockets.localhost, 8081)
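As a quick sanity check (a hedged example reusing the port from the answer above), hitting the server from a second Julia session should print the peer address on the server side:

using HTTP

# The server should print something like "Request from 127.0.0.1:54321",
# while the client receives the "Hello, world!" body.
r = HTTP.get("http://127.0.0.1:8081/")
println(String(r.body))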

Is it possible to change a response at the proxy level using Varnish?

For example, we have a setup like this:
user -> api gateway -> (specific endpoint) varnish -> backend service
If the backend returns a 500 response with {"message":"error"}, I want to patch this response and return 200 with "[]" instead.
Is it possible to do something like this using Varnish or some other proxy?
It is definitely possible to intercept backend errors, and convert them into regular responses.
A very simplistic example is the following:
sub vcl_backend_error {
    # Synthesize a 200 JSON response instead of Varnish's default error page.
    set beresp.http.Content-Type = "application/json";
    set beresp.status = 200;
    set beresp.body = "[]";
    return (deliver);
}

sub vcl_backend_response {
    # Turn a real HTTP/500 from the backend into the synthetic response above.
    if (beresp.status == 500) {
        return (error(200, "OK"));
    }
}
Whenever your backend fails and Varnish would normally return an HTTP/503 error, we will send an HTTP/200 response with [] as the output instead.
This output template for backend errors is also triggered when the backend does reply, but with an HTTP/500 error.
In real-world scenarios, I would add some conditional logic in vcl_backend_error to only return the JSON output template when specific criteria are matched, for example when a certain URL pattern is matched.
I would advise the same in vcl_backend_response: maybe you don't want to convert all HTTP/500 errors into regular HTTP/200 responses, so you may want to add conditional logic there as well.
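A minimal sketch of that conditional variant, assuming (purely as an illustration) that the JSON endpoints live under a /api/ URL prefix; note that the return (error(...)) transition from vcl_backend_response requires a reasonably recent Varnish (6.0+):

sub vcl_backend_response {
    # Only rewrite 500s for the hypothetical /api/ endpoints.
    if (beresp.status == 500 && bereq.url ~ "^/api/") {
        return (error(200, "OK"));
    }
}

sub vcl_backend_error {
    # Only serve the JSON template for the same endpoints.
    if (bereq.url ~ "^/api/") {
        set beresp.http.Content-Type = "application/json";
        set beresp.status = 200;
        set beresp.body = "[]";
        return (deliver);
    }
}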

Unable to modify request in middleware using Scrapy

I am in the process of scraping public data regarding meteorology for a project (data science), and in order to do that effectively I need to change the proxy used for my Scrapy requests in the event of a 403 response code.
For this, I have defined a downloader middleware to handle such situations, which is as follows:
from scrapy import Request

class ProxyMiddleware(object):
    def process_response(self, request, response, spider):
        if response.status == 403:
            f = open("Proxies.txt")
            proxy = random_line(f)  # Just returns a random line from the file with a valid structure ("http://IP:port")
            new_request = Request(url=request.url)
            new_request.meta['proxy'] = proxy
            spider.logger.info("[Response 403] Changed proxy to %s" % proxy)
            return new_request
        return response
After properly adding the class to settings.py, I expected this middleware to deal with 403 responses by generating a new request with the new proxy, hence finishing with a 200 response. The observed behaviour is that it actually gets executed (I can see the logger info about the changed proxy), but the new request does not seem to be made. Instead, I'm getting this:
2018-12-26 23:33:19 [bot_2] INFO: [Response] Changed proxy to https://154.65.93.126:53281
2018-12-26 23:33:26 [bot_2] INFO: [Response] Changed proxy to https://176.196.84.138:51336
... and so on indefinitely with random proxies, which makes me think that I'm still getting 403 errors and the proxy is not actually changing anything.
Reading the documentation regarding process_response, it states:
(...) If it returns a Request object, the middleware chain is halted and the returned request is rescheduled to be downloaded in the future. This is the same behavior as if a request is returned from process_request().
Is it possible that "in the future" is not "right after it is returned"? What should I do to change the proxy for all requests from that moment on?
Scrapy drops duplicate requests to the same URL by default, so that's probably what's happening in your spider. To check whether this is your case you can set these settings:
DUPEFILTER_DEBUG=True
LOG_LEVEL='DEBUG'
To solve this you should add dont_filter=True:
new_request = Request(url=request.url, dont_filter=True)
Try this:
from scrapy import Request

class ProxyMiddleware(object):
    def process_response(self, request, response, spider):
        if response.status == 403:
            f = open("Proxies.txt")
            proxy = random_line(f)
            # dont_filter=True so the duplicate filter does not drop the retried URL
            new_request = Request(url=request.url, dont_filter=True)
            new_request.meta['proxy'] = proxy
            spider.logger.info("[Response 403] Changed proxy to %s" % proxy)
            return new_request
        else:
            return response
A better approach would be to use the scrapy-rotating-proxies module instead:
DOWNLOADER_MIDDLEWARES = {
    'rotating_proxies.middlewares.RotatingProxyMiddleware': 610,
    'rotating_proxies.middlewares.BanDetectionMiddleware': 620,
}
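One hedged wiring note: scrapy-rotating-proxies also needs a proxy list setting; reusing the question's Proxies.txt path below is just an assumption about where your proxies live.

# settings.py (sketch): point the middleware at a proxy list file with
# one "http://IP:port" entry per line (Proxies.txt is the file from the question)
ROTATING_PROXY_LIST_PATH = 'Proxies.txt'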

Is it possible for an nginx reverse proxy server to compress the request body before sending it to backend servers?

I was trying to compress request body data before sending it to the backend server.
To achieve this, I added my own module to the official nginx. My module has a rewrite-phase handler, which I want to use to rewrite the request body; the code is shown below:
static ngx_int_t
ngx_http_leap_init(ngx_conf_t *cf)
{
    ngx_http_handler_pt        *h;
    ngx_http_core_main_conf_t  *cmcf;

    cmcf = ngx_http_conf_get_module_main_conf(cf, ngx_http_core_module);

    h = ngx_array_push(&cmcf->phases[NGX_HTTP_REWRITE_PHASE].handlers);
    if (h == NULL) {
        return NGX_ERROR;
    }

    *h = ngx_http_leap_rewrite_handler;
    return NGX_OK;
}
In the ngx_http_leap_rewrite_handler method, I have the following line:
rc = ngx_http_read_client_request_body(r, ngx_http_leap_request_body_read);
The ngx_http_leap_request_body_read handler is able to compress the request body when data is posted as application/x-www-form-urlencoded, but not for multipart/form-data.
Since what I really want to do is compress posted files, not form fields, does anyone have any ideas?

Apache CXF WebClient multiple requests with www-authenticate header

I have a simple JAX-RS resource and I'm using the Apache CXF WebClient as a client, with HTTP basic authentication. When authentication fails on the server, a typical 401 UNAUTHORIZED response is sent along with a WWW-Authenticate header.
The strange behavior happens in the WebClient when this (WWW-Authenticate) header is received: the WebClient internally repeats the same request multiple times (20 times) and then fails.
WebClient webClient = WebClientFactory.newClient("http://myserver/auth");
try {
    webClient.get(SimpleResponse.class);
    // inside GET, 20 HTTP GET requests are invoked
} catch (ServerWebApplicationException ex) {
    // data are present when the WWW-Authenticate header is not sent from the server;
    // if the header is present, unmarshalling fails
    AuthError err = ex.toErrorObject(webClient, AuthError.class);
}
I found the same problem in CXF 3.1.
In my case, for every asynchronous HTTP REST request whose response came back as 401/407, the thread went into an infinite loop printing "WWW-Authenticate is not set in response".
When I analysed the code, I found the following:
in the case of an asynchronous call, control flows from HttpConduit.handleRetransmits -> processRetransmit -> AsyncHTTPConduit.authorizationRetransmit, which returns true, and in HttpConduit the code is:
int maxRetransmits = getMaxRetransmits();
updateCookiesBeforeRetransmit();
int nretransmits = 0;
while ((maxRetransmits < 0 || nretransmits < maxRetransmits) && processRetransmit()) {
    nretransmits++;
}
If maxRetransmits is -1 and processRetransmit() keeps returning true, the thread goes into an infinite loop.
So to overcome this issue we set the max retransmits value to 0 via HttpConduit.getClient().
Hope it will help others.
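A minimal sketch of that workaround, assuming the standard CXF JAX-RS WebClient/HTTPConduit API; the URL is the one from the question, and WebClient.create stands in for the custom WebClientFactory used there:

import org.apache.cxf.jaxrs.client.WebClient;
import org.apache.cxf.transport.http.HTTPConduit;
import org.apache.cxf.transports.http.configuration.HTTPClientPolicy;

WebClient webClient = WebClient.create("http://myserver/auth");

// Grab the HTTP conduit behind the client and cap retransmits at 0,
// so a 401/407 is reported immediately instead of being retried.
HTTPConduit conduit = WebClient.getConfig(webClient).getHttpConduit();
HTTPClientPolicy policy = conduit.getClient();
policy.setMaxRetransmits(0);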
This has been fixed in the latest versions of CXF:
https://issues.apache.org/jira/browse/CXF-4815
