Nginx memcached with fallback to remote service - nginx

I can't get Nginx working with memcached module, the requirement is to query remote service, cache data in memcached and never fetch remote endpoint until backend invalidates the cache. I have 2 containers with memcached v1.4.35 and one with Nginx v1.11.10.
The configuration is the following:
upstream http_memcached {
server 172.17.0.6:11211;
server 172.17.0.7:11211;
}
upstream remote {
server api.example.com:443;
keepalive 16;
}
server {
listen 80;
location / {
set $memcached_key "$uri?$args";
memcached_pass http_memcached;
error_page 404 502 504 = #remote;
}
location #remote {
internal;
proxy_pass https://remote;
proxy_http_version 1.1;
proxy_set_header Connection "";
}
}
I tried to set memcached upstream incorrectly but I get HTTP 499 instead and warnings:
*3 upstream server temporarily disabled while connecting to upstream
It seems with described configuration Nginx can reach memcached successfully but can't write or read from it. I can write and read to memcached with telnet successfully.
Can you help me please?

My guesses on what's going on with your configuration
1. 499 codes
HTTP 499 is nginx' custom code meaning the client terminated connection before receiving the response (http://lxr.nginx.org/source/src/http/ngx_http_request.h#0120)
We can easily reproduce it, just
nc -k -l 172.17.0.6 172.17.0.6:11211
and curl your resource - curl will hang for a while and then press Ctrl+C — you'll have this message in your access logs
2. upstream server temporarily disabled while connecting to upstream
It means nginx didn't manage to reach your memcached and just removed it from the pool of upstreams. Suffice is to shutdown both memcached servers and you'd constantly see it in your error logs (I see it every time with error_log ... info).
As you see these messages your assumption that nginx can freely communicate with memcached servers doesn't seem to be true.
Consider explicitly setting http://nginx.org/en/docs/http/ngx_http_memcached_module.html#memcached_bind
and use the -b option with telnet to make sure you're correctly testing memcached servers for availability via your telnet client
3. nginx can reach memcached successfully but can't write or read from it
Nginx can only read from memcached via its built-in module
(http://nginx.org/en/docs/http/ngx_http_memcached_module.html):
The ngx_http_memcached_module module is used to obtain responses from
a memcached server. The key is set in the $memcached_key variable. A
response should be put in memcached in advance by means external to
nginx.
4. overall architecture
It's not fully clear from your question how the overall schema is supposed to work.
nginx's upstream uses weighted round-robin by default.
That means your memcached servers will be queried once at random.
You can change it by setting memcached_next_upstream not_found so a missing key will be considered an error and all of your servers will be polled. It's probably ok for a farm of 2 servers, but unlikely is it what your want for 20 servers
the same is ordinarily the case for memcached client libraries — they'd pick a server out of a pool according to some hashing scheme => so your key would end up on only 1 server out of the pool
5. what to do
I've managed to set up a similar configuration in 10 minutes on my local box - it works as expected. To mitigate debugging I'd get rid of docker containers to avoid networking overcomplication, run 2 memcached servers on different ports in single-threaded mode with -vv option to see when requests are reaching them (memcached -p 11211 -U o -vv) and then play with tail -f and curl to see what's really happening in your case.
6. working solution
nginx config:
https and http/1.1 is not used here but it doesn't matter
upstream http_memcached {
server 127.0.0.1:11211;
server 127.0.0.1:11212;
}
upstream remote {
server 127.0.0.1:8080;
}
server {
listen 80;
server_name server.lan;
access_log /var/log/nginx/server.access.log;
error_log /var/log/nginx/server.error.log info;
location / {
set $memcached_key "$uri?$args";
memcached_next_upstream not_found;
memcached_pass http_memcached;
error_page 404 = #remote;
}
location #remote {
internal;
access_log /var/log/nginx/server.fallback.access.log;
proxy_pass http://remote;
proxy_set_header Connection "";
}
}
server.py:
this is my dummy server (python):
from random import randint
from flask import Flask
app = Flask(__name__)
#app.route('/')
def hello_world():
return 'Hello: {}\n'.format(randint(1, 100000))
This is how to run it (just need to install flask first)
FLASK_APP=server.py [flask][2] run -p 8080
filling in my first memcached server:
$ telnet 127.0.0.1 11211
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
set /? 0 900 5
cache
STORED
quit
Connection closed by foreign host.
checking:
note that we get a result every time although we stored data
only in the first server
$ curl http://server.lan && echo
cache
$ curl http://server.lan && echo
cache
$ curl http://server.lan && echo
cache
this one is not in the cache so we'll get a response from server.py
$ curl http://server.lan/?q=1 && echo
Hello: 32337
whole picture:
the 2 windows on the right are
memcached -p 11211 -U o -vv
and
memcached -p 11212 -U o -vv

Related

NGINX : Using Upstream server name

I am running nginx/1.19.6 on Ubuntu.
I am struggling to get the upstream module to work without returning a 404.
My *.conf files are located in /etc/nginx/conf.d/
FILE factory_upstream.conf:
upstream factoryserver {
server factory.service.corp.com:443;
}
FILE factory_service.conf:
server
{
listen 80;
root /data/www;
proxy_cache factorycache;
proxy_cache_min_uses 1;
proxy_cache_methods GET HEAD POST;
proxy_cache_valid 200 72h;
#proxy_cache_valid any 5m;
location /factory/ {
access_log /var/log/nginx/access-factory.log main;
proxy_set_header x-apikey abcdefgh12345678;
### Works when expressed as a literal.# proxy_pass https://factory.service.corp.com/;
### 404 when using the upstream name.
proxy_pass https://factoryserver/;
}
}
I have error logging set to debug, but after reloading the configuration and attempting a call, there are no new records in the error log.
nginx -t # Scans OK
nginx -s reload # no errors
cat /var/log/nginx/error.log
...
2021/03/16 11:29:52 [notice] 26157#26157: signal process started
2021/03/16 11:38:20 [notice] 27593#27593: signal process started
The access-factory.log does record the request :
127.1.2.3 --;[16/Mar/2021:11:38:50 -0400];";GET /factory/api/manifest/get-full-manifest/PID/904227100052 HTTP/1.1" ";/factory/api/manifest/get-full-manifest/PID/904227100052" ;404; - ;/factory/api/manifest/get-full-manifest/PID/904227100052";-" ";ID="c4cfc014e3faa803c8fe136d6eae347d ";-" ";10.8.9.123:443" ";404" ";-"
To help with debugging, I cached the 404 error, "proxy_cache_valid any 5m;" commented out in the example above:
When I use the upstream name, the cache file contains the followiing:
<##$ non-printable characters $%^>
KEY: https://factoryserver/api/manifest/get-full-manifest/PID/904227100052
HTTP/1.1 404 Not Found
Server: nginx/1.17.8
Date: Tue, 16 Mar 2021 15:38:50 GMT
...
The key contains the name 'factoryserver' I don't know if that matters or not. Does it?
The server version is different than what I see when I enter the command nginx -v, which is: nginx version: nginx/1.19.6
Does the difference in version in the cache file and the command line indicate anything?
When I switch back to the literal server name in the proxy_pass, I get a 200 response with the requested data. The Key in the cache file then contains the literal upstream server name.
<##$ non-printable characters $%^>
KEY: https://factory.service.corp.com/api/manifest/get-full-manifest/PID/904227100052
HTTP/1.1 200
Server: nginx/1.17.8
Date: Tue, 16 Mar 2021 15:59:30 GMT
...
I will have multiple upstream servers, each providing different services. The configuration will be deployed to multiple factories, each with its own upstream servers.
I would like for the deployment team to only have to update the *_upstream.conf files, and keep the *_service.conf files static from deployment site to deployment site.
factory_upstream.conf
product_upstream.conf
shipping_upstream.conf
abc123_upstream.conf
Why do I get a 404 when using a named upstream server?
Based on the nginx version in the cached response not matching what you see on the command line, it seems that maybe the 404 is coming from the upstream server. I.e, your proxying is working, but the upstream server is returning a 404. To troubleshoot further, I would check the nginx logs for the upstream server and if the incoming request is what you expect.
Note that when using proxy_pass, it makes a big difference whether you have a / at the end or not. With a trailing slash, nginx treats this as the URI it should send the upstream request to and it doesn't include the URI matched by the location block (/factory/) while without, it includes the full URI as-is.
proxy_pass https://factoryserver/ results in https://factory.service.corp.com:443/
proxy_pass https://factoryserver results in https://factory.service.corp.com:443/factory/
Docs: https://docs.nginx.com/nginx/admin-guide/web-server/reverse-proxy/
So maybe when switching between using an upstream and specifying the literal server name, you're inadvertently being inconsistent with the trailing slash. It's a really easy detail to miss, especially when you don't know it's important.
Please provide more information about your server configuration where you proxy pass the requests. Only difference I see now is that you specify the port (443) in your upstream server.

Accessing GraphDB Workbench throught the internet (not localhost) in a Nginx server

I have GraphDb (standlone server version) in Ubuntu Server 16 running in localhost (with the command ./graphdb -d in /etc/graphdb/bin). But I only have ssh access to the server in the terminal, can't open Worbench in localhost:7200 locally.
I have many websites running on this machine with Ningx. If I try to access the machine's main IP with the port 7200 through the external web it doesn't work (e.g. http://193.133.16.72:7200/ = "connection timmed out").
I have tried to make a reverse proxy with Nginx with this code ("xxx" = domain):
listen 7200;
listen [::]:7200;
server_name sparql.xxx.com;
location / {
proxy_pass http://127.0.0.1:7200;
}
}
But all this fails. I checked and port 7200 is open in firewall (ufw). In the logs I get info that GraphDB is working locally in some testes. But I need Workbench access to import and create repositories (not sure how to do it or if it is possible without the Workbench GUI).
Is there a way to connect through the external web to the Workbench using a domain/IP and/or Nginx?
Read all the documentation and searched all day, but could not find a way to deal with this sorry. I only used GraphDB locally (the simple installer version), never used the standalone server in production before, sorry.
PS: two extra questions related:
a) For creating an URI endpoint it is the same procedure?
b) For GraphDB daemon to start automattically at boot time (with the command ./graphdb -d in graph/bin folder), what is the recommended way and configuration? (tryed the line "/etc/graphdb/bin ./graphdb -d" in rc.local but it didn't worked).
In case it is usefull for someone, I was able to make it work with this Nginx configuration:
server {
listen 80;
server_name sparql.xxxxxx.com;
location / {
proxy_pass http://localhost:7200;
proxy_set_header Host $host;
}
}
I think it was the "proxy_set_header Host $host;" that solved it (tried the rest without it before and didn't worked). I think GraphDB uses some headers to set configurations, and they were not passing.
I wounder if I am forgetting something else important to forward in the proxy, but in this moment the Worbench seams to work and opens in the domain used "sparql.xxxxxx.com".

Send Uncompressed Request to Nginx Proxy

My website is running Nginx as a reverse proxy that is in front of a ASP.net application server. The ASP app server does not support large request bodies compressed with with gzip.
Is it possible for Nginx to decompress the HTTP request before sending it to the ASP application server?
If you have nginx with lua support or openresty, you can use lua-zlib module to do this.
-- Handle request
local zlib = require 'zlib'
myStr = nil
if ngx.req.get_headers()['Content-Encoding'] == 'gzip' then
ngx.req.read_body()
local myStr = zlib.inflate()(ngx.var.request_body, 'finish')
ngx.req.clear_header('Content-Encoding')
ngx.req.clear_header('Content-Length')
ngx.req.set_body_data(myStr)
end
Source:
https://notes.ayushsharma.in/2016/10/decompressing-request-using-gzip-with-nginx
http://www.pataliebre.net/howto-make-nginx-decompress-a-gzipped-request.html#.VfgySp2qqko
I would direct you to this site http://www.pataliebre.net/howto-make-nginx-decompress-a-gzipped-request.html#.Xgslz0czZPZ, it contains usefule snippets of code that should work for what you want.
I think what you are looking for is client_max_body_size
step to change
1> vi /etc/nginx/nginx.conf
2> add this
http {
#all other configurations . . .
client_max_body_size: 10M; #add this thing
}
3> sudo nginx -t
if every thing ok then restart your server and check otherwise check for your syntex error.
4> sudo service nginx restart
In case above steps not worked for you can stop gzip by this
server {
gzip off;
...
}
and repeat step 3 and 4

Nginx will not start with host not found in upstream

I use nginx to proxy and hold persistent connections to far away servers for me.
I have configured about 15 blocks similar to this example:
upstream rinu-test {
server test.rinu.test:443;
keepalive 20;
}
server {
listen 80;
server_name test.rinu.test;
location / {
proxy_pass https://rinu-test;
proxy_http_version 1.1;
proxy_set_header Connection "";
proxy_set_header Host $http_host;
}
}
The problem is if the hostname can not be resolved in one or more of the upstream blocks, nginx will not (re)start. I can't use static IPs either, some of these hosts explicitly said not to do that because IPs will change. Every other solution I've seen to this error message says to get rid of upstream and do everything in the location block. That it not possible here because keepalive is only available under upstream.
I can temporarily afford to lose one server but not all 15.
Edit:
Turns out nginx is not suitable for this use case. An alternative backend (upstream) keepalive proxy should be used. A custom Node.js alternative is in my answer. So far I haven't found any other alternatives that actually work.
Earlier versions of nginx (before 1.1.4), which already powered a huge number of the most visited websites worldwide (and some still do even nowdays, if the server headers are to be believed), didn't even support keepalive on the upstream side, because there is very little benefit for doing so in the datacentre setting, unless you have a non-trivial latency between your various hosts; see https://serverfault.com/a/883019/110020 for some explanation.
Basically, unless you know you specifically need keepalive between your upstream and front-end, chances are it's only making your architecture less resilient and worse-off.
(Note that your current solution is also wrong because a change in the IP address will likewise go undetected, because you're doing hostname resolution at config reload only; so, even if nginx does start, it'll basically stop working once IP addresses of the upstream servers do change.)
Potential solutions, pick one:
The best solution would seem to just get rid of upstream keepalive as likely unnecessary in a datacentre environment, and use variables with proxy_pass for up-to-date DNS resolution for each request (nginx is still smart-enough to still do the caching of such resolutions)
Another option would be to get a paid version of nginx through a commercial subscription, which has a resolve parameter for the server directive within the upstream context.
Finally, another thing to try might be to use a set variable and/or a map to specify the servers within upstream; this is neither confirmed nor denied to have been implemented; e.g., it may or may not work.
Your scenario is very similar to the one when using aws ELB as uptreams in where is critical to resolve the proper IP of the defined domain.
The first thing you need to do and ensure is that the DNS servers you are using can resolve to your domains, then you could create your config like this:
resolver 10.0.0.2 valid=300s;
resolver_timeout 10s;
location /foo {
set $foo_backend_servers foo_backends.example.com;
proxy_pass http://$foo_backend_servers;
}
location /bar {
set $bar_backend_servers bar_backends.example.com;
proxy_pass http://$bar_backend_servers;
}
Notice the resolver 10.0.0.2 it should be IP of the DNS server that works and answer your queries, depending on your setup this could be a local cache service like unbound. and then just use resolve 127.0.0.1
Now, is very important to use a variable to specify the domain name, from the docs:
When you use a variable to specify the domain name in the proxy_pass directive, NGINX re‑resolves the domain name when its TTL expires.
You could check your resolver by using tools like dig for example:
$ dig +short stackoverflow.com
In case is a must to use keepalive in the upstreams, and if is not an option to use Nginx +, then you could give a try to openresty balancer, you will need to use/implement lua-resty-dns
A one possible solution is to involve a local DNS cache. It can be a local DNS server like Bind or Dnsmasq (with some crafty configuration, note that nginx can also use specified dns server in place of the system default), or just maintaining the cache in hosts file.
It seems that using hosts file with some scripting is quite straightforward way. The hosts file should be spitted into the static and dynamic parts (i.e. cat hosts.static hosts.dynamic > hosts), and the dynamic part should be generated (and updated) automatically by a script.
Perhaps it make sense to check from time to time the hostnames for changing IPs, and update hosts file and reload configuration in nginx on changes. In case of some hostname cannot be resolved the old IP or some default IP (like 127.0.1.9) should be used.
If you don't need the hostnames in the nginx config file (i.e., IPs are enough), the upstream section with IPs (resolved hostnames) can be generated by a script and included into nginx config — and no need to touch the hosts file in such case.
I put the resolve parameter on server and you need to set the Nginx Resolver in nginx.conf as below:
/etc/nginx/nginx.conf:
http {
resolver 192.168.0.2 ipv6=off valid=40s; # The DNS IP server
}
Site.conf:
upstream rinu-test {
server test.rinu.test:443;
keepalive 20;
}
My problem was container related. I'm using docker compose to create the nginx container, plus the app container. When setting network_mode: host in the app container config in docker-compose.yml, nginx was unable to find the upstream app container. Removing this fixed the problem.
we can resolve it temporarily
cd /etc
sudo vim resolv.conf
i
nameserver 8.8.8.8
:wq
then do sudo nginx -t
restart nginx it will work for the momment
An alternative is to write a new service that only does what I want. The following replaces nginx for proxying https connections using Node.js
const http = require('http');
const https = require('https');
const httpsKeepAliveAgent = new https.Agent({ keepAlive: true });
http.createServer(onRequest).listen(3000);
function onRequest(client_req, client_res) {
https.pipe(
protocol.request({
host: client_req.headers.host,
port: 443,
path: client_req.url,
method: client_req.method,
headers: client_req.headers,
agent: httpsKeepAliveAgent
}, (res) => {
res.pipe(client_res);
}).on('error', (e) => {
client_res.end();
})
);
}
Example usage:
curl http://localhost:3000/request_uri -H "Host: test.rinu.test"
which is equivalent to:
curl https://test.rinu.test/request_uri

Nginx: Setting up SSL-passthorugh

I'm trying to configure SSL-passthrough for multiple webapps using the same nginx server (nginx version: nginx/1.13.6), but when restarting the nginx server, I get an error complaining that
nginx: [emerg] "stream" directive is duplicate
The configuration I have is the following:
2 files for the ssl passthrough that look like this:
server1.conf:
stream {
upstream workers {
server 192.168.1.10:443;
server 192.168.1.11:443;
server 192.168.1.12:443;
}
server {
listen server1.com:8443;
proxy_pass workers;
}
}
and server2.conf:
stream {
upstream workers {
server 192.168.1.20:443;
server 192.168.1.21:443;
server 192.168.1.22:443;
}
server {
listen server2.com:8443;
proxy_pass workers;
}
}
If I remove one of the two files, then nginx starts correctly.
How can this be achieved?
Thanks,
Cristi
Streams work on Layer 5, and cannot read encrypted traffic (which is Layer 6 on the OSI model), and thus cannot tell apart requests hitting server1.com and server2.com unless they are pointing to different IPs.
This can be solved by one of the following solutions
Decrypt the traffic on nginx, then proxy-pass it to backend processes/wockers using HTTP.
Bind server1.com to a port that is different to server2.com.
Get an additional IP address and bind server2.com on that.
Get an additional load balancer and move server2.com there.

Resources