Randomize/Shuffle HTTP requests made to Nginx - nginx

I have the following situation: a very expensive data calculation task, which is performed for every http request made against a REST resource.
Let's say a customer sends 1000 requests. Nginx then takes up to 2 days to respond to all of the requests.
We're fine with that.
But that shouldn't block the endpoint for all other customers.
I'm looking for a way to cheat the first-come-first-served order in which Nginx handles incoming requests, so that other customers are still served during that period.
I've tried Google but so far no luck.
I'd be grateful for any suggestion that could solve this. Thanks a lot!
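For what it's worth, one way to get this kind of fairness is to take the scheduling out of Nginx entirely: accept requests cheaply, park the expensive jobs in per-customer queues, and drain those queues round-robin so one customer's 1000 jobs can't starve everyone else. Below is a minimal Python sketch of that idea, not a drop-in solution; the names (FairScheduler, run_expensive_job, the customer IDs) are all illustrative, and wiring it up to an actual REST endpoint is left out.

```python
# A minimal sketch of per-customer round-robin scheduling (illustrative
# names throughout; run_expensive_job stands in for the real calculation).
import asyncio
from collections import OrderedDict, deque


async def run_expensive_job(job):
    print("running", job)
    await asyncio.sleep(1)  # stand-in for the very expensive calculation


class FairScheduler:
    def __init__(self):
        self.queues = OrderedDict()  # customer_id -> deque of pending jobs
        self.not_empty = asyncio.Event()

    def submit(self, customer_id, job):
        self.queues.setdefault(customer_id, deque()).append(job)
        self.not_empty.set()

    async def worker(self):
        while True:
            await self.not_empty.wait()
            # One job per customer per pass: a flood from one customer
            # only delays that customer's own remaining jobs.
            for customer_id in list(self.queues):
                queue = self.queues[customer_id]
                if queue:
                    await run_expensive_job(queue.popleft())
                if not queue:
                    del self.queues[customer_id]
            if not self.queues:
                self.not_empty.clear()


async def main():
    sched = FairScheduler()
    for i in range(5):
        sched.submit("customer_a", f"a{i}")  # one customer floods the queue
    sched.submit("customer_b", "b0")         # another sends a single job
    worker = asyncio.create_task(sched.worker())
    await asyncio.sleep(7)                   # let the demo jobs drain
    worker.cancel()


if __name__ == "__main__":
    asyncio.run(main())
```

In the demo, customer_b's single job runs right after customer_a's first one instead of behind all five. With something like this in place, Nginx only has to hand each request off quickly (e.g. to an endpoint that enqueues the job and returns), rather than holding 1000 connections open for days.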

Related

How to handle HTTP STATUS CODE 429 (Rate limiting) for a particular domain in Scrapy?

I am using Scrapy to build a broad crawler which will crawl a few thousand pages from 50-60 different domains.
I sometimes encounter a 429 status code and am thinking about ways of dealing with it. I am setting polite policies for concurrent requests per domain and AutoThrottle, but 429s can still happen, so what follows is about handling that worst case.
By default, Scrapy drops the request.
If we add 429 to RETRY_HTTP_CODES, Scrapy will use the default retry middleware, which schedules the request at the end of the queue. This still lets other requests to the same domain hit the server - does that prolong the temporary block imposed by the rate limiting? If not, why not just use this approach instead of the more complex solutions described below?
Another approach is to block the spider when it encounters a 429. However, one of the comments mentions that this will lead to timeouts in other active requests, and it blocks requests to all domains - which is inefficient, since requests to other domains should continue normally. Does it make sense to temporarily reschedule requests to the particular domain instead of continuing to ping its server with other requests? If so, how can this be implemented in Scrapy?
Specifically:
1. When rate limiting has already been triggered, does sending more requests (which will receive 429 responses) prolong the period for which the rate limiting is applied, or does it have no effect?
2. How can I pause Scrapy's requests to a particular domain while it continues its other tasks, including requests to other domains?
EDIT:
The default retry middleware cannot be used as-is because it has a maximum retry counter, RETRY_TIMES. Once that is exhausted for a particular request, the request is dropped - something we don't want in the case of a 429.
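For the "how to implement it in Scrapy" part, here is a minimal sketch of a custom downloader middleware under the constraints discussed above: on a 429 it pauses the whole engine (Scrapy has no built-in per-domain pause), sleeps, and then re-issues a copy of the request so the RETRY_TIMES counter is never consumed. The class name and the 60-second fallback delay are my own choices, not Scrapy defaults.

```python
import time

from scrapy.downloadermiddlewares.retry import RetryMiddleware
from scrapy.utils.response import response_status_message


class TooManyRequestsRetryMiddleware(RetryMiddleware):
    """Pause the crawl on HTTP 429 and retry the request indefinitely."""

    def __init__(self, crawler):
        super().__init__(crawler.settings)
        self.crawler = crawler

    @classmethod
    def from_crawler(cls, crawler):
        return cls(crawler)

    def process_response(self, request, response, spider):
        if request.meta.get("dont_retry", False):
            return response
        if response.status == 429:
            # Assumes Retry-After is given in seconds (it may also be an
            # HTTP date, which this sketch does not handle).
            delay = int(response.headers.get(b"Retry-After", 60))
            self.crawler.engine.pause()
            time.sleep(delay)  # blocks the reactor: nothing is sent meanwhile
            self.crawler.engine.unpause()
            # Re-issue a copy instead of calling self._retry(), so the
            # RETRY_TIMES counter is not incremented and the request is
            # never dropped because of 429s.
            return request.replace(dont_filter=True)
        if response.status in self.retry_http_codes:
            reason = response_status_message(response.status)
            return self._retry(request, reason, spider) or response
        return response
```

You would enable this in DOWNLOADER_MIDDLEWARES in place of the stock RetryMiddleware (set the stock one to None). Note the trade-off: because time.sleep() blocks the reactor, this pauses requests to all domains, so it only partially answers question 2 - a true per-domain pause would need a scheduler-level solution.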

Why are only six HTTPS requests allowed to be sent simultaneously?

One of my front-end developer co-workers asked for my help because of the browser not sending HTTPS requests asynchronously. At least that is what he thought at first.
By opening up the Firefox network tab, I realized that the requests do get sent asynchronously, but for some reason only six HTTPS requests are allowed to be sent in parallel.
I assume this is a technical limit of the HTTPS protocol, though I do not know the cause.
I searched for the cause on the web, but I have not been able to find it so far.
In the following picture, the red parts of the lines mark the duration spent being blocked by another request:
My understanding is that this is a courtesy to the server you're connecting to: you wouldn't want to overload a server and prevent others from simultaneously connecting to it. It is a per-host limit that browsers impose on HTTP/1.x connections, not part of the HTTPS protocol itself, and the exact limit varies by browser.
See also: Max parallel http connections in a browser?

NGINX as warm cache in front of wowza for HLS live streams - Get per stream data duration and data transferred?

I've set up NGINX as a warm cache server in front of a Wowza HTTP-Origin application to act as an edge server. The config is working great, streaming over HTTPS with nDVR and adaptive streaming support. I've combed the internet looking for examples and help on configuring NGINX (or other solutions) to give me live statistics (number of viewers per stream_name), as well as to parse the logs for stream duration per stream_name/session and data transferred per stream_name/session. NGINX's logging for HLS streams records each video chunk. With Wowza it is a bit easier to get this data, by reading the duration or bytes-transferred values from the logs when the stream is destroyed. Any help on this subject would be hugely appreciated. Thank you.
Nginx isn't aware of what the chunks are. It's only serving resources to clients over HTTP, and doesn't know or care that they're interrelated. Therefore, you'll have to derive the data you need from the logs.
To associate client requests together as one, you need some way to track state between requests, and then log that state. Cookies are a common way to do this. Alternatively, you could put some sort of session identifier in the request URI, but this hurts your caching ability since each client is effectively requesting a different resource.
Once you have some sort of session ID logged, you can process those logs with tools such as Elastic Stack to piece together the reports you're looking for.
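To make that processing step concrete, here is a minimal Python sketch of the aggregation, assuming you have configured an nginx log_format that writes a session ID, a Unix timestamp ($msec), and $bytes_sent as tab-separated fields; the field order and file name are illustrative.

```python
# Sketch: per-session duration and bytes from an nginx access log whose
# (assumed) log_format is: session_id <TAB> $msec <TAB> $bytes_sent
import csv
from collections import defaultdict

sessions = defaultdict(lambda: {"first": None, "last": None, "bytes": 0})

with open("access.log") as fh:
    for session_id, msec, bytes_sent in csv.reader(fh, delimiter="\t"):
        s = sessions[session_id]
        ts = float(msec)
        s["first"] = ts if s["first"] is None else min(s["first"], ts)
        s["last"] = ts if s["last"] is None else max(s["last"], ts)
        s["bytes"] += int(bytes_sent)

for sid, s in sessions.items():
    print(f"{sid}\tduration_s={s['last'] - s['first']:.1f}\tbytes={s['bytes']}")
```

The "duration" here is really last-chunk time minus first-chunk time, which for HLS is a reasonable proxy for viewing time as long as the client keeps fetching segments.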
Depending on your goals with this, you might find it better to get your data client-side. There, you have a better idea of what a session actually is, and then you can log client-side items such as buffer levels and latency and what not. The HTTP requests don't really tell you much about the experience the end users are getting. If that's what you want to know, you should use the log from the clients, not from your HTTP servers. Your HTTP server log is much more useful for debugging underlying technical infrastructure issues.

Nginx send all POST data to /dev/null

I'm trying to find the most effective way of redirecting all POST data to /dev/null. The goal is to have a way of sending huge amounts of data to the internet, to monitor network performance, but all data should be discarded immediately by the server.
I tried the solution offered by https://devnull-as-a-service.com/code/, but the issue I'm having is that nginx responds without waiting for all of the POST data to be received, which doesn't create the network traffic I'm hoping to see.
Any ideas, nginx gurus?
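One approach (a sketch, not the devnull-as-a-service config): since nginx won't read a body it has no use for, proxy the POSTs to a trivial backend that reads and discards the entire body before answering; with a plain proxy_pass in front of it, the whole body is pulled from the client, which produces the traffic you're after. The port, chunk size, and handler below are arbitrary choices.

```python
# Minimal sketch of a body-draining backend to sit behind nginx's proxy_pass.
# Assumes Content-Length is set; chunked transfer encoding is not handled.
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class DevNullHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        remaining = int(self.headers.get("Content-Length", 0))
        while remaining > 0:
            chunk = self.rfile.read(min(remaining, 64 * 1024))
            if not chunk:
                break
            remaining -= len(chunk)  # read and discard: our /dev/null
        self.send_response(200)
        self.send_header("Content-Length", "2")
        self.end_headers()
        self.wfile.write(b"OK")

    def log_message(self, fmt, *args):
        pass  # keep the benchmark quiet


if __name__ == "__main__":
    ThreadingHTTPServer(("127.0.0.1", 8080), DevNullHandler).serve_forever()
```

If you want the client, rather than nginx, to be the one pushing bytes end to end, proxy_request_buffering off makes nginx stream the body through instead of spooling it to disk first.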

Random "Connection reset" error when calling 2 or more RemoteObject function

I know this question is difficult to resolve, but I have been walking through sources, googling, etc., and haven't found anything clear about the problem I'm going to expose.
I have an application that uses PHP as the backend, with AMF as the transport protocol. The problem is that when I make several requests to the server backend through remote objects, I randomly receive a connection reset on all or several of the requests. This does not happen when just one remote service request is made - or at least it has never happened to me. The problem becomes more visible as more concurrent calls are executed at once.
Could anybody guess what is happening from this little info? At first I thought it was an Apache problem resetting the connection because of a burst of requests, but I only make 3 or 4 concurrent requests, not more.
Thanks in advance for your time.
