I am running an nginx-ingress controller in a kubernetes cluster and one of my log statements for the request looks like this:
upstream_response_length: 0, 840
upstream_response_time: 60.000, 0.760
upstream_status: 504, 200
I cannot quite understand what that means. Does nginx have a response timeout of 60 seconds, retry the request once after that (successfully this time), and log both attempts?
P.S. Config for log format:
log-format-upstream: >-
{
...
"upstream_status": "$upstream_status",
"upstream_response_length": "$upstream_response_length",
"upstream_response_time": "$upstream_response_time",
...
}
According to the split_upstream_var method of ingress-nginx, these variables hold one comma-separated entry per upstream that nginx contacted while processing the request.
Since nginx can try several upstreams for a single request, your log can be interpreted this way:
First upstream is dead (504)
upstream_response_length: 0 // response from the dead upstream has zero length
upstream_response_time: 60.000 // nginx dropped the connection after 60 s
upstream_status: 504 // response code; the upstream doesn't answer
Second upstream works (200)
upstream_response_length: 840 // the healthy upstream returned 840 bytes
upstream_response_time: 0.760 // the healthy upstream responded in 0.760 s
upstream_status: 200 // response code; the upstream is OK
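For context, here is a minimal plain-nginx sketch (not the actual ingress-controller template; upstream names and addresses are made up) of the kind of configuration that produces such comma-separated pairs: the 60-second read timeout expires on the first server, proxy_next_upstream makes nginx retry the second one, and both attempts end up in the same log line.

    upstream backend {
        server 10.0.0.1:8080;    # first attempt: no answer -> 504, 0 bytes, 60.000 s
        server 10.0.0.2:8080;    # retry: answers -> 200, 840 bytes, 0.760 s
    }

    server {
        location / {
            proxy_read_timeout   60s;              # matches the 60.000 in the log
            proxy_next_upstream  error timeout;    # on timeout, try the next server
            proxy_pass           http://backend;
        }
    }

    # $upstream_status, $upstream_response_time and $upstream_response_length then each
    # contain one comma-separated value per attempt: "504, 200", "60.000, 0.760", "0, 840"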
P.S. JFYI, here's a cool HTTP headers state diagram
I have a very simple Flask app that is deployed on GKE and exposed via a Google external load balancer, and I am getting random 502 responses from the backend-service. (I added custom headers on the backend-service and on nginx to identify the source, and I can see the backend-service's header but not nginx's.)
The setup is:
LB -> backend-service -> NEG -> pod (nginx -> uwsgi), where the pod is the Flask application deployed via uwsgi and nginx.
The scenario is handling image uploads in a simple, token-secured way. The sender includes a token with the upload request.
My Flask app:
receives the request and checks the sent token via another service using "requests";
if the token is valid, proceeds to handle the image and returns 200;
if the token is not valid, stops and sends back a 401 response.
First, I got suspicious about the mix of 200s and 401s, so I reverted all responses to 200. After a few of the expected responses, the server starts responding with 502 and keeps sending it ("some of the messages at the very beginning succeeded").
The nginx error logs contain lines like the one below:
2023/02/08 18:22:29 [error] 10#10: *145 readv() failed (104: Connection reset by peer) while reading upstream, client: 35.191.17.139, server: _, request: "POST /api/v1/imageUpload/image HTTP/1.1", upstream: "uwsgi://127.0.0.1:21270", host: "example-host.com"
My uwsgi.ini file is as below:
[uwsgi]
socket = 127.0.0.1:21270
master
processes = 8
threads = 1
buffer-size = 32768
stats = 127.0.0.1:21290
log-maxsize = 104857600
logdate
log-reopen
log-x-forwarded-for
uid = image_processor
gid = image_processor
need-app
chdir = /server/
wsgi-file = image_processor_application.py
callable = app
py-auto-reload = 1
pidfile = /tmp/uwsgi-imgproc-py.pid
My nginx.conf is as below:
location ~ ^/api/ {
client_max_body_size 15M;
include uwsgi_params;
uwsgi_pass 127.0.0.1:21270;
}
Lastly, my app has a healthcheck method with a simple JSON response. It does nothing extra and simply returns. This never fails, as explained above.
Edit: the nginx access logs in the pod show the response as 401 while the client receives 502.
For those who are going to face the same issue: the problem was reading the POST data (or rather, not reading it).
nginx expects the proxied app (uwsgi in our case) to read the POST data, but according to my logic I was not reading it in some cases and was returning the response right away.
Setting uwsgi post-buffering solved the issue.
post-buffering = %(16 * 1024 * 1024)
Which led me to this solution:
https://stackoverflow.com/a/26765936/631965
Nginx uwsgi (104: Connection reset by peer) while reading response header from upstream
I want to block excessive requests on a per-IP basis, allowing at maximum 12 requests per second. To that end, I have the following in /etc/nginx/nginx.conf:
http {
    ##
    # Basic Settings
    ##
    ...
    limit_req_zone $binary_remote_addr zone=perip:10m rate=12r/s;

    server {
        listen 80;

        location / {
            limit_req zone=perip nodelay;
        }
    }
}
Now, when I run the command ab -k -c 100 -n 900 'https://www.mywebsite.com/' in the terminal, the output reports only 179 non-2xx responses out of 900.
Why isn't Nginx blocking most requests?
Taken from here:
with nodelay nginx processes all burst requests instantly, and without
this option nginx makes excess requests wait so that the overall
rate is no more than 1 request per second; the last successful
request took 5 seconds to complete.
rate=6r/s actually means one request per 1/6th of a second. So if
you send 6 requests simultaneously, you'll get 5 of them answered with 503.
The problem might be in your testing method (or your expectations of it).
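For reference, a hedged sketch (same zone and rate as in the question) of the more common pattern with an explicit burst, which makes it easier to reason about what gets rejected during a load test:

    http {
        limit_req_zone $binary_remote_addr zone=perip:10m rate=12r/s;

        server {
            listen 80;

            location / {
                # up to 20 requests above the rate are admitted immediately (nodelay),
                # anything beyond that is rejected right away
                limit_req zone=perip burst=20 nodelay;
                limit_req_status 429;   # optional: reply 429 instead of the default 503
            }
        }
    }

Roughly speaking, a single IP should then get about rate × test duration plus the burst size in 2xx responses, with the rest rejected.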
Problem
I am getting an error message in my Kong error log reporting that the upstream server has timed out. But I know that the upstream process was just taking over a minute, and when it completes (after Kong has logged the error) it logs a java error "Broken Pipe", implying that Kong was no longer listening for the response.
This is the behavior when the upstream process takes longer than 60 seconds. In some cases, it takes less than 60 seconds and everything works correctly.
How can I extend Kong's timeout?
Details
Kong Version
1.1.2
Kong's Error Message (slightly edited):
2019/12/06 09:57:10 [error] 1421#0: *1377 upstream timed out (110: Connection timed out) while reading response header from upstream, client: xyz.xyz.xyz.xyz, server: kong, request: "POST /api/...... HTTP/1.1", upstream: "http://127.0.0.1:8010/api/.....", host: "xyz.xyz.com"
Here is the error from the upstream server log (Java / Tomcat via SpringBoot)
Dec 06 09:57:23 gateway-gw001-99 java[319]: org.apache.catalina.connector.ClientAbortException: java.io.IOException: Broken pipe
Dec 06 09:57:23 gateway-gw001-99 java[319]: at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:364) ~[tomcat-embed-core-8.5.42.jar!/
Dec 06 09:57:23 gateway-gw001-99 java[319]: at org.apache.catalina.connector.OutputBuffer.flushByteBuffer(OutputBuffer.java:833) ~[tomcat-embed-core-8.5.42.jar!
...
My kong.conf (slightly edited)
trusted_ips = 0.0.0.0/0
admin_listen = 0.0.0.0:8001
proxy_listen = 0.0.0.0:8080 proxy_protocol, 0.0.0.0:8443 ssl proxy_protocol
database = postgres
pg_host = 127.0.0.1
pg_port = 5432
pg_user = kong
pg_password = xyzxyzxyzxyzxyz
pg_database = kong
plugins = bundled,session
real_ip_header = proxy_protocol
A little more Context
Kong and the Upstream Server are hosted on the same Ubuntu VM
The Ubuntu VM is hosted as a linux container (LXC) inside another Ubuntu VM
The outer VM uses NGinX to receive public traffic and reverse-proxies it to Kong. It does this using the stream module. This allows Kong to be my SSL demarcation point.
The Outer NGinX Stream Config:
stream {
server {
listen 80;
proxy_pass xyz.xyz.xyz.xyz:8080;
proxy_protocol on;
}
server {
listen 443;
proxy_pass xyz.xyz.xyz.xyz:8443;
proxy_protocol on;
}
}
What I've Tried
I've tried adding the following lines to kong.conf. In version 1.1.2 of Kong you alter the underlying NGinX settings by prefixing NGinX directives and placing them in kong.conf (https://docs.konghq.com/1.1.x/configuration/#injecting-individual-nginx-directives). None of them seemed to do anything:
nginx_http_keepalive_timeout=300s
nginx_proxy_proxy_read_timeout=300s
nginx_http_proxy_read_timeout=300s
nginx_proxy_send_timeout=300s
nginx_http_send_timeout=300s
Per the documentation, Kong version 0.10 has three properties that you can set for managing proxy connections:
upstream_connect_timeout: defines in milliseconds the timeout for establishing a connection to your upstream service.
upstream_send_timeout: defines in milliseconds a timeout between two successive write operations for transmitting a request to your upstream service.
upstream_read_timeout: defines in milliseconds a timeout between two successive read operations for receiving a response from your upstream service.
In this case, as Kong is timing out waiting for the response from the upstream, you would need to add a property setting for upstream_read_timeout.
In the Kong version 1.1 documentation, the Service object includes these timeout attributes with slightly different names:
connect_timeout: The timeout in milliseconds for establishing a connection to the upstream server. Defaults to 60000.
write_timeout: The timeout in milliseconds between two successive write operations for transmitting a request to the upstream server. Defaults to 60000.
read_timeout: The timeout in milliseconds between two successive read operations for receiving a response from the upstream server. Defaults to 60000.
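For example, a sketch using the Admin API (assuming it listens on localhost:8001; the service name my-api is made up) that raises all three timeouts to five minutes:

    curl -X PATCH http://localhost:8001/services/my-api \
      --data "connect_timeout=300000" \
      --data "write_timeout=300000" \
      --data "read_timeout=300000"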
If you use Kubernetes, you must specify a special annotation on the Service:
konghq.com/override: {{ ingressName }}
Not obvious, though.
I discovered it here https://github.com/Kong/kubernetes-ingress-controller/issues/905#issuecomment-739927116
Example of a Service:
apiVersion: v1
kind: Service
metadata:
  name: websocket
  annotations:
    konghq.com/override: timeout-kong-ingress
spec:
  selector:
    app: websocket
  ports:
    - port: 80
      targetPort: 8010
For a detailed explanation, please follow the link above.
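For completeness, a sketch of the KongIngress object that the konghq.com/override annotation above points at (the timeout values are examples; they map to the Service attributes described earlier and are in milliseconds):

    apiVersion: configuration.konghq.com/v1
    kind: KongIngress
    metadata:
      name: timeout-kong-ingress
    proxy:
      connect_timeout: 300000
      read_timeout: 300000
      write_timeout: 300000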
I am wondering whether it is possible to reproduce the NGinx proxy_next_upstream mechanism on F5 BIG-IP.
As a reminder, here is how it works on NGinx:
Given a pool of upstream servers, let's call it webservers, composed of 2 instances:
upstream webservers {
server 192.168.1.10:8080 max_fails=1 fail_timeout=10s;
server 192.168.1.20:8080 max_fails=1 fail_timeout=10s;
}
With the following instruction (proxy_next_upstream error), if a TCP connection fails on the first instance when routing a request (for example because the instance is down), NGinx automatically forwards the request to the second instance (THE USER DOESN'T SEE ANY ERROR).
Furthermore, instance 1 is blacklisted for 10 seconds (fail_timeout=10s).
Every 10 seconds, NGinx will try to route one request to instance 1 (to find out whether it has come back) and makes the instance available again if it succeeds; otherwise it waits another 10 seconds before trying again.
location / {
proxy_next_upstream error;
proxy_pass http://webservers/$1;
}
I hope I'm clear enough...
Thanks for your help.
Here is something interesting: https://support.f5.com/kb/en-us/solutions/public/10000/600/sol10640.html
Nginx buffers the request body before passing it upstream; according to the mailing list (Nov 2013), this could not be disabled at the time:
The proxy_buffering directive disables response buffering, not request
buffering.
As of now, there is no way to prevent request body buffering in nginx.
It's always fully read by nginx before a request is passed to an
upstream server. It's basically a part of nginx being a web
accelerator - it handles slow communication with clients by itself and
only asks a backend to process a request when everything is ready.
Since version 1.7.11 (Mar 2015), nginx has the proxy_request_buffering directive; see details:
Update: in nginx 1.7.11 the proxy_request_buffering directive is
available, which allows disabling buffering of the request body. It
should be used with care though, see docs.
See docs for more details:
Syntax: proxy_request_buffering on | off;
Default: proxy_request_buffering on;
Context: http, server, location
When buffering is enabled, the entire request body is read from the
client before sending the request to a proxied server.
When buffering is disabled, the request body is sent to the proxied
server immediately as it is received. In this case, the request cannot
be passed to the next server if nginx already started sending the
request body.
When HTTP/1.1 chunked transfer encoding is used to send the original
request body, the request body will be buffered regardless of the
directive value unless HTTP/1.1 is enabled for proxying.
The question is about buffering for Nginx (Load Balancer). For instance, we have the following scheme:
Nginx (LB) -> Nginx (APP) -> Backend
Nginx (APP) buffers the request from the load balancer and also buffers the response from the Backend. But does it make sense to buffer the request and response on the Nginx (LB) side if both Nginx nodes are physically located close to each other (less than 2 ms ping) with a pretty fast network connection in between?
I measured the following benchmarks (note that this is for illustration only; production loads would be significantly higher):
siege --concurrent=50 --reps=50 https://domain.com
proxy_request_buffering on;
Transactions: 2500 hits
Availability: 100.00 %
Elapsed time: 58.57 secs
Data transferred: 577.31 MB
Response time: 0.55 secs
Transaction rate: 42.68 trans/sec
Throughput: 9.86 MB/sec
Concurrency: 23.47
Successful transactions: 2500
Failed transactions: 0
Longest transaction: 2.12
Shortest transaction: 0.10
proxy_request_buffering off;
Transactions: 2500 hits
Availability: 100.00 %
Elapsed time: 57.80 secs
Data transferred: 577.31 MB
Response time: 0.53 secs
Transaction rate: 43.25 trans/sec
Throughput: 9.99 MB/sec
Concurrency: 22.75
Successful transactions: 2500
Failed transactions: 0
Longest transaction: 2.01
Shortest transaction: 0.09