I'm trying to find the most effective way of redirecting all POST data to /dev/null. The goal is to send huge amounts of data across the internet to measure network performance, with the server discarding everything immediately.
I tried the solution offered by https://devnull-as-a-service.com/code/, but the issue I'm having is that nginx responds without waiting for all of the POST data to be received, which doesn't create the network traffic I'm hoping to see.
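For reference, here's the kind of thing I've been experimenting with: a minimal sketch that assumes nginx is built with the OpenResty Lua module (plain nginx's "return" directive answers before the body has arrived, which is exactly my problem):

    location /dev/null {
        client_max_body_size 0;            # accept request bodies of any size
        content_by_lua_block {
            ngx.req.read_body()            -- force nginx to consume the whole body
            ngx.exit(ngx.HTTP_NO_CONTENT)  -- then reply 204, discarding the data
        }
    }

As far as I can tell, ngx.req.read_body() does pull in the entire body (spooling it to a temp file if it's large, then deleting it), but I'd welcome a lighter-weight approach.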
Any ideas, nginx gurus?
I had the following issue on a system that I supported ~7 years ago. We never got to the bottom of it, and focus shifted onto other issues. I was recently reminded of it, and wondered if anyone would know what was going on. But alas I'll be a little short on details. Sorry.
The Setup
I had a farm of web servers sitting behind a load balancer. The servers were hosting a system that would receive HTTP requests (XML and/or SOAP) from clients, then for each one kick off a bunch of further HTTP requests to third-party suppliers, wait for the suppliers' responses, process and combine the results, and respond to the client's request.
Think insurance comparison, but as Business-To-Business XML service.
The whole process would take several seconds, from receiving the initial client request to sending back a response to that original HTTP request, and each server would be processing tens or hundreds of requests in parallel (i.e. at any given point, a given webserver would have many client requests that had come in and been logged, but not yet been responded to).
We had detailed logging which recorded the receipt of each request, including the origin IP and which server was processing it, and recorded when a response was sent.
All client requests were sent to a single IP address (well, URL), which was the address of the load balancer, which would then forward requests to the webservers; the webservers weren't individually accessible from the internet (they didn't have public IP addresses).
Our load balancer would allow us to take individual web-servers out of rotation, for maintenance.
When we did that, we could watch the DB logs and see new requests stop coming in and the existing requests gradually get completed, until there were no outstanding requests and the server was idle.
The problem
We found that sometimes, when we took a server out of rotation ... it wouldn't entirely stop receiving requests. You could see the bulk of requests suddenly stop coming in, but the server would still receive a trickle of fresh requests (I don't know ... maybe 0.1% of normal load, maybe less?). I think the longest we left it going was maybe ... 10 minutes?
Notably we realised that all of those requests were coming from a single client/IP address (I don't remember which).
I forget whether other (still-in-rotation) webservers were still receiving requests from this client, but I think they were?
If we rebooted the webserver, no further requests would come in after restarting.
Web stack was Windows, IIS, ASP.NET; pretty old school even at the time. All servers individually owned and configured.
What was happening?
We vaguely waved our hands and asserted that the client's integration with us was "holding an HTTP tunnel open and sending multiple requests through it", rather than sending each request separately, and thus was maintaining that tunnel even after the LB stopped sending new requests to that server. But that was BS-waffle, and since we never needed to actually understand what was going on, we ignored it and moved on with our lives :)
But I'd still like to know what we were seeing, if anyone can diagnose it from that description.
We vaguely waved our hands and asserted that the client's integration with us was "holding an HTTP tunnel open and sending multiple requests through it", rather than sending each request separately, and thus was maintaining that tunnel even after the LB stopped sending new requests to that server.
That sounds like a good explanation.
Normally, a LB will refuse new connections to a removed server, but will allow open connections to live on until they naturally close. This is known as "connection draining" or "graceful shutdown".
If one of your clients had HTTP keepalive on, and was holding a TCP connection open and sending HTTP requests through it for a long time, it would give the symptoms you describe.
Most LBs will have a configuration knob for how long to wait for connections to close before force-closing them during this "connection draining" time. You can set a timeout here to avoid this scenario if it is a problem for you.
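As a purely illustrative sketch in nginx terms (your stack was IIS, so the exact knobs will differ), the server-side counterpart is to bound how long, and for how many requests, a single keepalive connection may be reused:

    http {
        keepalive_timeout  75s;     # close idle keepalive connections after 75 seconds
        keepalive_requests 1000;    # close a connection once it has served 1000 requests
    }

With a cap like that in place, even a chatty client's connection gets recycled eventually, so a drained server ends up fully idle.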
HTTP connection handling behaviour is, to a large extent, at the client's discretion. Perhaps most of your clients were of one type (say, web browsers) and weren't holding a single connection open for 10 minutes, but perhaps one client was different (say, a programmatic HTTP API client)?
Further reading about "connection draining" on AWS Load Balancers here (the exact details will vary by LB vendor): https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/config-conn-drain.html
Further reading about HTTP keep alive here: https://en.wikipedia.org/wiki/HTTP_persistent_connection
I've set up NGINX as a warm cache server in front of a Wowza HTTP-Origin application, to act as an edge server. The config is working great, streaming over HTTPS with nDVR and adaptive streaming support.
I've combed the internet looking for examples and help on configuring NGINX (and/or other solutions) to give me live statistics (number of viewers per stream_name), as well as to parse the logs for stream duration per stream_name/session and data transferred per stream_name/session. NGINX's logging for HLS streams records each video chunk. With Wowza it is a bit easier to get this data, by reading the duration or bytes-transferred values from the logs when the stream is destroyed...
Any help on this subject would be hugely appreciated. Thank you.
Nginx isn't aware of what the chunks are. It's only serving resources to clients over HTTP, and doesn't know or care that they're interrelated. Therefore, you'll have to derive the data you need from the logs.
To associate client requests together as one, you need some way to track state between requests, and then log that state. Cookies are a common way to do this. Alternatively, you could put some sort of session identifier in the request URI, but this hurts your caching ability since each client is effectively requesting a different resource.
Once you have some sort of session ID logged, you can process those logs with tools such as Elastic Stack to piece together the reports you're looking for.
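For instance, assuming the player sends a cookie named session_id (the cookie name and format string here are illustrative):

    log_format hls_session '$remote_addr $cookie_session_id [$time_local] '
                           '"$request" $status $bytes_sent';
    access_log /var/log/nginx/hls_access.log hls_session;

Grouping rows by the session ID then gives you bytes per session (summing $bytes_sent) and session duration (first to last chunk timestamp).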
Depending on your goals, you might find it better to get your data client-side. There, you have a better idea of what a session actually is, and you can log client-side metrics such as buffer levels and latency. The HTTP requests don't really tell you much about the experience end users are getting; if that's what you want to know, use the logs from the clients, not from your HTTP servers. Your HTTP server logs are much more useful for debugging underlying technical infrastructure issues.
I have the following situation: a very expensive data-calculation task that is performed for every HTTP request made against a REST resource.
Let's say a customer sends 1000 requests. Nginx then takes up to 2 days to respond to all of the requests.
We're fine with that.
But that shouldn't block the endpoint for all other customers.
I'm looking for a way around the "first come, first served" way that Nginx deals with incoming requests, so that other customers will still be served during that period.
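To make the goal concrete: nginx's connection limiting looks like it's in the right direction, though I'm not sure it's the right tool (a sketch, with illustrative zone names, paths, and values):

    limit_conn_zone $binary_remote_addr zone=perip:10m;

    server {
        location /api/ {
            limit_conn perip 10;         # at most 10 in-flight requests per client IP
            proxy_pass http://backend;   # hypothetical upstream
        }
    }

As I understand it, though, requests over the limit get a 503 rather than being queued, which isn't quite the behaviour I'm after.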
I've tried Google but so far no luck.
Thanks in advance for anything you can suggest that might solve my issue!
We have recently implemented an nginx-based reverse proxy.
While debugging our access logs, we are seeing quite a few status code 400 results.
They look something like this:
[07/Sep/2011:05:49:04 -0700] - "400" 0 "-" "-" "-"
We have enabled debug error logging, and they usually correspond to something like this:
2011/09/07 05:09:28 [info] 5937#0: *30904 client closed prematurely connection while reading client request line
We have tried raising a number of the buffers, as mentioned by a few pages we were able to google up.
http://www.ruby-forum.com/topic/173362
or
http://blog.craz8.com/articles/2009/06/17/nginx-400-bad-request-errors-due-to-cookies-and-what-to-do-about-them
To no avail.
Why is this happening?
This is a standard nginx reverse proxy -> Apache backend setup.
Worth mentioning: the variety of content on our site is fairly minimal. We have tested this using many browsers and are not personally receiving any of these 400 results.
Thanks!
Further URLs detailing similar entries in their logs:
http://blog.rayfoo.info/2009/10/weird-web-server-access-log-entries
I found this was caused by using Chrome, which apparently opens extra connections occasionally without sending any data.
Here's some more info: http://www.ruby-forum.com/topic/2953545
Now the question is what to do about them - the answer provided there wasn't very satisfying.
Are you handling SSL connections? Can you add $ssl_cipher $ssl_protocol to your access log format?
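Something along these lines (the format name and log path are illustrative):

    log_format ssl_debug '$remote_addr [$time_local] "$request" $status '
                         '$ssl_protocol $ssl_cipher';
    access_log /var/log/nginx/ssl_debug.log ssl_debug;

Both variables are empty for plain-HTTP connections, so blank values on the 400 lines would itself be a clue.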
First, it's quite possible that your clients are sending requests with really big HTTP headers or URLs. Maybe an older version of your application set some (probably big) cookies which are now unused, and some clients are still trying to send them back.
I'd set the header buffers to a really big value and, on the application side, log the size of the headers/requests, plus the complete request if they are bigger than usual. Or take nginx out of the chain completely and log the headers/requests under the same conditions. If you can, take nginx out only for those IPs/subnets the 400 errors came from. I suppose nginx can log the source IP for these 400 errors.
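In nginx, that experiment would look something like this (the sizes are deliberately oversized for debugging, not a recommendation):

    client_header_buffer_size 16k;       # default is 1k
    large_client_header_buffers 8 64k;   # default is 4 8k

If the 400s disappear with buffers this large, oversized headers (e.g. stale cookies) are almost certainly the culprit.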
I began studying protocols recently.
I learned that in the old method, incoming data is first delivered to the SSL proxy, where it is decrypted and then sent to the HTTP proxy through another TCP connection. For every packet that passes through this connection, a connection table lookup is needed to determine the other endpoint of the connection.
But pipe setup and teardown require one function call each, with no packets sent. Sending data through the pipe does not require a connection table lookup, as the data structures are already tied together with pointers.
I tried to find the answer to my own question, but couldn't find a good way to understand it. I guess it may be related to the structure of TCP or pipes. Could anyone tell me why exactly a pipe is simpler than a TCP connection between the SSL proxy and the HTTP proxy? Or suggest a book to read, or how I can come to understand it?
Two pictures related to this question:
http://www.tripntale.com/pic/19254/857880/pipe-jpg#pid-857880
http://www.tripntale.com/pic/19254/857880/pipe-jpg#pid-857882
So what you want to know is how these two diagrams compare?
I'm sorry to say that these diagrams don't make much sense to me either; hopefully they made sense alongside the text that accompanied them when they were published.
The diagrams relate to software engineering approaches to a problem, but the objects in them aren't defined functionally, appear to me to be used in different ways, and it isn't clear what problem they are approaches to.
HTTP proxies can be used as:
Forward proxies (the client sends its HTTP requests to the proxy; the proxy fetches the responses and returns them to the client)
Or
Reverse proxies (the proxy sits in front of server(s) for service engineering reasons)
The term "SSL Proxy" could refer to either application and would have differing implications to how it was designed.
See here for more explanation: http://en.wikipedia.org/wiki/SSL_Proxy
Do you just want to understand these diagrams? Or are you trying to solve a problem and think that these diagrams can help you? If so, what is the problem you are trying to solve?
"For every packet that passes through this connection, a connection table lookup is needed"

Why? I've written several proxies without a connection table.