I'm wondering if it is possible to reproduce nginx's proxy_next_upstream mechanism on an F5 BIG-IP.
As a reminder, here is how it works in nginx:
Given a pool of upstream servers, let's call it webservers, composed of 2 instances:
upstream webservers {
server 192.168.1.10:8080 max_fails=1 fail_timeout=10s;
server 192.168.1.20:8080 max_fails=1 fail_timeout=10s;
}
With the following directive (proxy_next_upstream error), if a TCP connection fails on the first instance while routing a request (for example because the instance is down), nginx automatically forwards the request to the second instance (the user doesn't see any error).
Furthermore, instance 1 is blacklisted for 10 seconds (fail_timeout=10s).
Every 10 seconds, nginx will try to route one request to instance 1 (to check whether it is back) and mark the instance available again if that request succeeds; otherwise it waits another 10 seconds before trying again.
location / {
proxy_next_upstream error;
proxy_pass http://webservers/$1;
}
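Put together, the nginx side of this behaviour is just the following; a minimal sketch (the listen port and the optional proxy_next_upstream_tries line are only illustrative):
upstream webservers {
server 192.168.1.10:8080 max_fails=1 fail_timeout=10s;
server 192.168.1.20:8080 max_fails=1 fail_timeout=10s;
}
server {
listen 80;
location / {
# on a connection error, silently retry the request on the next instance
proxy_next_upstream error;
# optionally cap how many instances a single request may be tried on
proxy_next_upstream_tries 2;
proxy_pass http://webservers;
}
}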
I hope I'm clear enough...
Thanks for your help.
Here is something interesting: https://support.f5.com/kb/en-us/solutions/public/10000/600/sol10640.html
Related
I have a simple stream block to proxy MySQL TCP traffic to MaxScale instances. The 2nd instance acts as a failover only, like:
stream {
upstream maxscale {
zone upstream_maxscale 64k;
server 10.1.0.11:3307;
server 10.1.0.12:3307 backup;
}
server {
listen 3307;
proxy_pass maxscale;
}
}
When the connection count is low (<30), everything goes fine. But when it is high (>40, if we can call 40 connections high...), the nginx error log keeps complaining about something that I don't know how to debug.
recv() failed (104: Connection reset by peer) while proxying and reading from upstream, client: 10.1.0.16, server: 10.1.0.15:3307, upstream: "10.1.0.11:3307", bytes from/to client:15738/64316, bytes from/to upstream:64316/15738
I've tried playing with options like reuseport, worker_connections and so_keepalive, but no luck.
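For reference, here is where such directives would sit in the stream server block; a minimal sketch (the values below are just illustrative, not what I consider correct):
stream {
upstream maxscale {
zone upstream_maxscale 64k;
server 10.1.0.11:3307;
server 10.1.0.12:3307 backup;
}
server {
listen 3307 so_keepalive=on;
# how long nginx keeps an idle proxied connection open (the default is 10 minutes)
proxy_timeout 10m;
proxy_connect_timeout 5s;
proxy_pass maxscale;
}
}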
https://nginx.org/en/docs/stream/ngx_stream_core_module.html
https://docs.nginx.com/nginx/admin-guide/load-balancer/tcp-udp-load-balancer/
Could it be a problem on the MaxScale side?
Here is the MaxScale 2.4 listener and service configuration:
# Listener
[listener-rw]
type=listener
service=readwritesplit
protocol=MariaDBClient
address=10.1.0.11
port=3307
ssl=required
ssl_ca_cert=/var/lib/maxscale/ssl/ca-cert.pem
ssl_cert=/var/lib/maxscale/ssl/server.pem
ssl_key=/var/lib/maxscale/ssl/server.key
ssl_version=MAX
# Service
[readwritesplit]
type=service
router=readwritesplit
servers=sql1,sql2,sql3
user=maxscale
password=324F74A347291B3BE79956AD5F4BB2FAD65E1F9052A976722917701742729400
enable_root_user=1
max_sescmd_history=150
max_slave_connections=100%
lazy_connect=true
slave_selection_criteria=LEAST_CURRENT_OPERATIONS
optimistic_trx=true
connection_keepalive=300
master_failure_mode=fail_on_write
The MaxScale log (in /var/log/maxscale/maxscale.log) most likely contains either an answer as to why you receive such errors or at least will help you determine what the problem might be.
In case you can't find out the reason for this from the logs alone, I would suggest opening a bug report on the MariaDB Jira under the MaxScale project.
I'm trying to configure SSL-passthrough for multiple webapps using the same nginx server (nginx version: nginx/1.13.6), but when restarting the nginx server, I get an error complaining that
nginx: [emerg] "stream" directive is duplicate
The configuration I have is the following:
2 files for the ssl passthrough that look like this:
server1.conf:
stream {
upstream workers {
server 192.168.1.10:443;
server 192.168.1.11:443;
server 192.168.1.12:443;
}
server {
listen server1.com:8443;
proxy_pass workers;
}
}
and server2.conf:
stream {
upstream workers {
server 192.168.1.20:443;
server 192.168.1.21:443;
server 192.168.1.22:443;
}
server {
listen server2.com:8443;
proxy_pass workers;
}
}
If I remove one of the two files, then nginx starts correctly.
How can this be achieved?
Thanks,
Cristi
Streams work at the transport layer (Layer 4) and cannot read the encrypted traffic above them, so nginx cannot tell apart requests hitting server1.com and server2.com unless they point to different IPs.
This can be solved by one of the following solutions:
Decrypt the traffic on nginx, then proxy_pass it to the backend processes/workers over HTTP.
Bind server1.com to a port that is different from server2.com's (see the sketch after this list).
Get an additional IP address and bind server2.com on that.
Get an additional load balancer and move server2.com there.
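For example, combining both files into a single stream block along the lines of the second option also avoids the duplicate stream directive error; a minimal sketch (the port 9443 for server2.com is purely illustrative):
stream {
upstream workers_server1 {
server 192.168.1.10:443;
server 192.168.1.11:443;
server 192.168.1.12:443;
}
upstream workers_server2 {
server 192.168.1.20:443;
server 192.168.1.21:443;
server 192.168.1.22:443;
}
server {
listen 8443;
proxy_pass workers_server1;
}
server {
listen 9443;
proxy_pass workers_server2;
}
}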
I can't get nginx working with the memcached module. The requirement is to query a remote service, cache the data in memcached and never fetch the remote endpoint again until the backend invalidates the cache. I have 2 containers with memcached v1.4.35 and one with nginx v1.11.10.
The configuration is the following:
upstream http_memcached {
server 172.17.0.6:11211;
server 172.17.0.7:11211;
}
upstream remote {
server api.example.com:443;
keepalive 16;
}
server {
listen 80;
location / {
set $memcached_key "$uri?$args";
memcached_pass http_memcached;
error_page 404 502 504 = @remote;
}
location @remote {
internal;
proxy_pass https://remote;
proxy_http_version 1.1;
proxy_set_header Connection "";
}
}
If I deliberately set the memcached upstream incorrectly, I get HTTP 499 instead, along with warnings:
*3 upstream server temporarily disabled while connecting to upstream
It seems that with the described configuration nginx can reach memcached successfully but can't write to or read from it. I can write to and read from memcached with telnet successfully.
Can you help me please?
My guesses on what's going on with your configuration
1. 499 codes
HTTP 499 is nginx's custom code meaning the client terminated the connection before receiving the response (http://lxr.nginx.org/source/src/http/ngx_http_request.h#0120)
We can easily reproduce it, just
nc -k -l 172.17.0.6 11211
and curl your resource; curl will hang for a while. Press Ctrl+C and you'll see this code in your access logs.
2. upstream server temporarily disabled while connecting to upstream
It means nginx didn't manage to reach your memcached server and temporarily removed it from the pool of upstreams. It suffices to shut down both memcached servers and you'll see it constantly in your error logs (I see it every time with error_log ... info).
Since you do see these messages, your assumption that nginx can freely communicate with the memcached servers doesn't seem to hold.
Consider explicitly setting http://nginx.org/en/docs/http/ngx_http_memcached_module.html#memcached_bind
and using the -b option with telnet to make sure you're correctly testing the memcached servers' availability via your telnet client.
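For instance, on the nginx side it would look like this (the bind address 172.17.0.1 is purely illustrative):
location / {
set $memcached_key "$uri?$args";
# pin the source address nginx uses when connecting to memcached
memcached_bind 172.17.0.1;
memcached_pass http_memcached;
error_page 404 502 504 = @remote;
}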
3. nginx can reach memcached successfully but can't write or read from it
Nginx can only read from memcached via its built-in module
(http://nginx.org/en/docs/http/ngx_http_memcached_module.html):
The ngx_http_memcached_module module is used to obtain responses from
a memcached server. The key is set in the $memcached_key variable. A
response should be put in memcached in advance by means external to
nginx.
4. overall architecture
It's not fully clear from your question how the overall schema is supposed to work.
nginx's upstream uses weighted round-robin by default.
That means each request will be sent to just one of your memcached servers, picked in turn.
You can change it by setting memcached_next_upstream not_found so a missing key is considered an error and all of your servers are polled. It's probably OK for a farm of 2 servers, but it's unlikely to be what you want for 20 servers.
The same is ordinarily the case for memcached client libraries: they pick a server out of the pool according to some hashing scheme, so a given key ends up on only one server in the pool.
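If you want nginx to behave like those client libraries and keep each key on a single memcached server, one option (a sketch, not part of your original configuration) is a hash-based upstream:
upstream http_memcached {
# consistently route each key to the same server
hash $memcached_key consistent;
server 172.17.0.6:11211;
server 172.17.0.7:11211;
}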
5. what to do
I've managed to set up a similar configuration in 10 minutes on my local box; it works as expected. To make debugging easier, I'd get rid of the Docker containers to avoid networking overcomplication, run 2 memcached servers on different ports in single-threaded mode with the -vv option to see when requests reach them (memcached -p 11211 -U 0 -vv), and then play with tail -f and curl to see what's really happening in your case.
6. working solution
nginx config:
https and http/1.1 are not used here, but it doesn't matter
upstream http_memcached {
server 127.0.0.1:11211;
server 127.0.0.1:11212;
}
upstream remote {
server 127.0.0.1:8080;
}
server {
listen 80;
server_name server.lan;
access_log /var/log/nginx/server.access.log;
error_log /var/log/nginx/server.error.log info;
location / {
set $memcached_key "$uri?$args";
memcached_next_upstream not_found;
memcached_pass http_memcached;
error_page 404 = @remote;
}
location @remote {
internal;
access_log /var/log/nginx/server.fallback.access.log;
proxy_pass http://remote;
proxy_set_header Connection "";
}
}
server.py:
this is my dummy server (python):
from random import randint
from flask import Flask
app = Flask(__name__)
@app.route('/')
def hello_world():
return 'Hello: {}\n'.format(randint(1, 100000))
This is how to run it (just need to install flask first)
FLASK_APP=server.py flask run -p 8080
filling in my first memcached server:
$ telnet 127.0.0.1 11211
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
set /? 0 900 5
cache
STORED
quit
Connection closed by foreign host.
checking:
note that we get a result every time although we stored data
only in the first server
$ curl http://server.lan && echo
cache
$ curl http://server.lan && echo
cache
$ curl http://server.lan && echo
cache
this one is not in the cache so we'll get a response from server.py
$ curl http://server.lan/?q=1 && echo
Hello: 32337
Whole picture (screenshot not included here): the 2 terminal windows on the right are running
memcached -p 11211 -U 0 -vv
and
memcached -p 11212 -U 0 -vv
I'm trying to determine if it is possible to tell Nginx to choose another server for a specified upstream when the first server selected returns a specific error.
Eg:
try upstream server 0
if it returns a certain error code (eg: 503)
try upstream 1
else
return response to client
Here's a sample of what you need to do; you can read more details in this answer:
upstream myservers {
#the first server is the main server
server xxx.xxx.xxx.xxx weight=999 fail_timeout=5s max_fails=1;
server xxx.xxx.xxx.xxx;
}
server {
# all config
location / {
proxy_pass http://myservers;
}
}
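Since the question is specifically about reacting to a 503, you would also tell nginx which conditions justify passing a request to the next server. A minimal sketch (the exact condition list is illustrative, not part of the original sample):
location / {
# retry on the next upstream when the primary fails to connect,
# times out, or answers with a 503
proxy_next_upstream error timeout http_503;
proxy_pass http://myservers;
}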
We are using nginx as a load balancer for multiple Riak nodes. The setup worked fine for some time (a few hours) before nginx started giving 502 Bad Gateway errors. On checking, the individual nodes seemed to be working. We found out that the problem was with the nginx buffer size, so we increased it to 16k; it worked fine for one more day before we started getting 502 errors for everything.
My Nginx configuration is as follows
upstream riak {
server 127.0.0.1:8091 weight=3;
server 127.0.0.1:8092;
server 127.0.0.1:8093;
server 127.0.0.1:8094;
}
server {
listen 8098;
server_name 127.0.0.1:8098;
location / {
proxy_pass http://riak;
proxy_buffer_size 16k;
proxy_buffers 8 16k;
}
}
Any help is appreciated, thank you.
Check if you are running out of file descriptors on the nginx box. Check with netstat whether you have too many connections in the TIME_WAIT state. If so, you will need to reduce your tcp_fin_timeout value from the default 60 seconds to something smaller.
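On the nginx side, reusing connections to the Riak nodes also keeps the number of TIME_WAIT sockets down. A minimal sketch, assuming the same pool as above (the keepalive size is illustrative and this is not part of the original answer):
upstream riak {
server 127.0.0.1:8091 weight=3;
server 127.0.0.1:8092;
server 127.0.0.1:8093;
server 127.0.0.1:8094;
# keep a small pool of idle connections open to each node
keepalive 32;
}
server {
listen 8098;
location / {
proxy_pass http://riak;
# HTTP/1.1 and an empty Connection header are needed for upstream keepalive
proxy_http_version 1.1;
proxy_set_header Connection "";
proxy_buffer_size 16k;
proxy_buffers 8 16k;
}
}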