Can nginx reverse proxy isolate cache per endpoint?

I'm using nginx reverse proxy to cache content from two endpoints, one of which is very reliable; the other has frequent timeouts.
I've found that those timeouts can sometimes use up all available connections or cause other issues, degrading performance for the server as a whole and leading to increased latency for the reliable endpoint as well.
I've tweaked some settings (worker_rlimit_nofile, worker_connections), but what I'd really like to do is isolate the caching and connections for the two endpoints as much as possible: give each a share of the available cache, and a share of the available connections, and operate as if they're hitting two separate servers, to reduce the chances that issues with one endpoint affect the performance of the other.
If I were to create two location blocks, one for each endpoint, can I designate each block's share of the cache (e.g. number of files, or total size) and share of available connections?
Or is there a better way of achieving this goal of isolation to ensure reliable performance for the good endpoint, even if the bad endpoint is experiencing lots of timeouts?

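Most of the proxy_cache_* directives can be set per location block, which lets you do just that: declare a separate cache zone for each endpoint with proxy_cache_path (valid only at the http level, and it accepts a max_size limit), then select the zone in each location with proxy_cache.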
https://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache
It may also help others answer if an example config is provided that reflects what you're currently doing.

Related

Advanced HTTP/2 proxy for load balancing of distributed scraping solution

I have built a distributed HTTP scraper solution that uses different "exit addresses" by design in order to balance the network load.
The solution supports IPv4, IPv6 and HTTP proxy to route the traffic.
Each processor was responsible for defining the most efficient route to balance the traffic; this was implemented manually as a temporary measure for prototyping. The solution is now growing, and as the number of processors increases, so does the complexity of the load balancing task, which is why I need a dedicated component for it.
I did some rather extensive research, but seem to have failed in finding a solution for load balancing traffic between IPv6, IPv4 (thousands of local addresses) and public HTTP proxies. The solution needs to support weights, app-level response checks and cool-down periods.
Does anyone know of an existing solution to this problem before I start developing a custom one?
Thanks for your help!
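For illustration, the requirements listed above (weights and cool-down periods driven by app-level response checks) could be sketched roughly as follows. This is only a minimal sketch, not an existing product: `Route`, `Balancer`, `pick` and `report` are hypothetical names, and the 60-second cool-down is an arbitrary assumption.

```python
import random
import time


class Route:
    """One exit route (a local IPv4/IPv6 address or an HTTP proxy). Hypothetical type."""

    def __init__(self, name, weight):
        self.name = name
        self.weight = weight
        self.cooling_until = 0.0  # clock value until which the route is skipped


class Balancer:
    def __init__(self, routes, cooldown_seconds=60.0, clock=time.monotonic):
        self.routes = list(routes)
        self.cooldown_seconds = cooldown_seconds  # assumed value, tune as needed
        self.clock = clock

    def pick(self):
        """Weighted random choice among routes that are not cooling down."""
        now = self.clock()
        available = [r for r in self.routes if r.cooling_until <= now]
        if not available:
            raise RuntimeError("all routes are cooling down")
        weights = [r.weight for r in available]
        return random.choices(available, weights=weights, k=1)[0]

    def report(self, route, response_ok):
        """Feed back the app-level response check; a failure starts a cool-down."""
        if not response_ok:
            route.cooling_until = self.clock() + self.cooldown_seconds
```

The caller would use `pick()` before each request and `report()` after validating the response body, so a route that returns bad content is rested for a while instead of being removed outright.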

If you search for load balancing proxies you'll discover the Cache Array Routing Protocol (CARP). CARP might not be what you're looking for, and there are servers dedicated solely to proxy caching, which I never knew until now.
Nevertheless, those servers have their own load balancers too, and perhaps that's a detail worth looking into further.
I also found a presentation mentioning CARP as an outstanding solution: https://cs.nyu.edu/artg/internet/Spring2004/lectures/lec_8b.pdf
Example: for proxy-arrays in Netra Proxy Cache Server: https://docs.oracle.com/cd/E19957-01/805-3512-10/6j3bg665f/index.html
There are also several concepts for load balancing (https://link.springer.com/article/10.1023/A:1020943021842):
The three proposed methods can broadly be divided into centralized and decentralized approaches. The centralized history (CH) method makes use of the transfer rate of each request to decide which proxy can provide the fastest turnaround time for the next job. The route transfer pattern (RTP) method learns from the past history to build a virtual map of traffic flow conditions of the major routes on the Internet at different times of the day. The map information is then used to predict the best path for a request at a particular time of the day. The two methods require a central executive to collate information and route requests to proxies. Experimental results show that self-organization can be achieved (Tsui et al., 2001). The drawback of the centralized approach is that a bottleneck and a single point of failure is created by the central executive. The decentralized approach—the decentralized history (DH) method—attempts to overcome this problem by removing the central executive and put a decision maker in every proxy (Kaiser et al., 2000b) regarding whether it should fetch a requested object or forward the request to another proxy.
Since you use public proxy servers, you probably won't use decentralized history (DH) but rather centralized history (CH) or the route transfer pattern (RTP).
Perhaps it would even be useful to replace your own solution completely, e.g. with this: https://github.blog/2018-08-08-glb-director-open-source-load-balancer/. I have no particular reason for picking this example; it just came up in my search results.
As I don't work with proxy servers, this post is just a collection of findings, but perhaps there is a usable detail in it for you. If not, never mind; you probably know most or all of it already, in which case it adds nothing new for you. I also don't point to any one concrete solution.

Have you checked out this project? https://Traefik.io supports HTTP/2 and TCP load balancing. The project is open source and available on GitHub. It is built using Go. I'm now using it as my reverse proxy with load balancing for almost everything.
I also wrote a small blog post on docker and Go where I showcase the usage of Traefik. That also might help you in your search. https://marcofranssen.nl/docker-tips-and-tricks-for-your-go-projects/
In the Traefik code base you might find your answer, or you might decide to use Traefik to achieve your goal instead of a home-grown solution.
See here for a nice explanation of the soon-to-arrive Traefik 2.0 with TCP support.
https://blog.containo.us/back-to-traefik-2-0-2f9aa17be305

http2 domain sharding without hurting performance

Most articles consider domain sharding to hurt performance, but that's not entirely true. A single connection can be reused for different domains under certain conditions:
they resolve to the same IP
in case of secure connection the same certificate should cover both domains
https://www.rfc-editor.org/rfc/rfc7540#section-9.1.1
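To make the two conditions concrete, here is a rough sketch of the check a client might perform before reusing a connection. It is a simplification under stated assumptions: `can_coalesce` and `hostname_matches` are hypothetical helpers, the certificate is reduced to a plain list of names, and real browsers differ in exactly how they compare IP addresses.

```python
def hostname_matches(pattern, hostname):
    """Match a certificate name against a hostname.

    A leading '*.' is treated as covering exactly one leftmost label.
    """
    pattern, hostname = pattern.lower(), hostname.lower()
    if pattern.startswith("*."):
        parts = hostname.split(".", 1)
        return len(parts) == 2 and parts[0] != "" and parts[1] == pattern[2:]
    return pattern == hostname


def can_coalesce(ips_a, ips_b, cert_names, host_b):
    """Return True if a connection opened for host A could also serve host B.

    ips_a / ips_b: addresses each hostname resolves to.
    cert_names: names listed in the certificate presented on A's connection.
    """
    shares_ip = bool(set(ips_a) & set(ips_b))              # condition 1: same IP
    cert_covers_b = any(hostname_matches(n, host_b) for n in cert_names)  # condition 2
    return shares_ip and cert_covers_b
```

For example, `can_coalesce(["192.0.2.1"], ["192.0.2.1"], ["*.example.com"], "img.example.com")` is true, while a certificate that only names `www.example.com` would force a second connection.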
Is that correct? Is anyone using it?
And what about CDN? Can I have some guarantees that they direct a user to the same server (IP)?

Yup that’s one of the benefits of HTTP/2 and in theory allows you to keep sharding for HTTP/1.1 users and automatically unshard for HTTP/2 users.
The reality is a little more complicated as always, due mostly to implementation issues and servers resolving to different IP addresses, as you state. This blog post is a few years old now but describes some of the issues: https://daniel.haxx.se/blog/2016/08/18/http2-connection-coalescing/. Maybe it’s improved since then, but I would imagine issues still exist. Also, new features like the ORIGIN frame should help but are not widely supported yet.
I think, however, it’s worth revisiting the assumption that sharding is actually good for HTTP/1.1. The costs of setting up new connections (DNS lookup, TCP setup, TLS handshake and then actually sending the HTTP messages) are not immaterial, and studies have shown that the browser’s 6-connection limit is rarely fully used, never mind the extra connections added by sharding. Concatenation, spriting and inlining are usually much better options, and these can still be used for HTTP/2. Trying it on your site and measuring is the best way of being sure of this!
Incidentally, it is for these reasons (and security) that I’m less keen on using common libraries (e.g. jquery, bootstrap...etc.) from their CDNs instead of hosting them locally. In my opinion the performance benefit of a user already having the version your site uses cached is overstated.
With all these things, HTTP/1.1 will still work without sharded domains. It may (arguably) be slower but it won’t break. But most users are likely on HTTP/2, so is it really worth adding the complexity for the minority of users? Is this not a way of progressively enhancing your site for people on modern browsers (and encouraging those who are not to upgrade)? For larger sites (e.g. Google, Facebook... etc.) the minority may still represent a large number of users and the complexity is worth it (and they have the resources and expertise to deal with it). For the rest of us, my recommendation is not to shard, and to upgrade to new protocols like HTTP/2 when they become common (as it is now!), but otherwise to keep complexity down.

Is it realistic for a dedicated server to send out many requests per second?

TL;DR
Is it appropriate for a (dedicated) web server to be sending many requests out to other servers every second (naturally with permission from said server)?
I'm asking this purely to save myself spending a long time implementing an idea that won't work, as I hope that people will have some more insight into this than me.
I'm developing a solution which will allow clients to monitor the status of their server. I need to constantly (24/7) obtain more recent logs from these servers. Unfortunately, I am limited to getting the last 150 entries to their logs. This means that for busy clients I will need to poll their servers more.
I'm trying to make my solution scalable so that if it gets a number of customers, I won't need to concern myself with rewriting it, so my benchmark is 1000 clients, as I think this is a realistic upper limit.
If I have 1000 clients and need to poll their servers every, let's give it a number, two minutes, I'm going to be sending requests off to more than 8 servers every second. The returned result will average about 15,000 characters, though it could be more or less.
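As a quick back-of-the-envelope check of those figures (a sketch only; the variable names are made up, and it assumes the polls are spread evenly over the interval):

```python
clients = 1000
poll_interval_seconds = 2 * 60   # every two minutes
avg_response_chars = 15_000      # average size of one response

# Evenly spread polling: each client is hit once per interval.
requests_per_second = clients / poll_interval_seconds
chars_per_second = requests_per_second * avg_response_chars

print(f"{requests_per_second:.2f} requests/s")    # 8.33 requests/s
print(f"{chars_per_second:,.0f} characters/s")    # 125,000 characters/s
```

So the inbound volume is on the order of 125,000 characters per second on average, before any of the optimisations.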
Bear in mind that this server will also need to cope with clients visiting it to see their server information, and thus will need to be lag-free.
Some optimisations I've been considering, which I would probably need to implement relatively early on:
Only asking for 50 log items. If we find one already stored (They are returned in chronological order), we can terminate. If not, we throw out another request for the other 100. This should cut down traffic by around 3/5ths.
Detecting which servers get less traffic and requesting their logs less often (i.e. if a server only gets 10 logged events every hour, we don't want to keep asking for 150 every few minutes)
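The first optimisation might look something like the sketch below. It rests on assumptions not stated above: `fetch(n)` is a hypothetical callable that returns the latest `n` entries newest first, each entry carries a unique `"id"`, and the remote API has no offset parameter, so the second request simply asks for the full 150 rather than only "the other 100".

```python
def fetch_new_entries(fetch, stored_ids, first_batch=50, full_batch=150):
    """Return log entries that are not stored yet, trying the small batch first.

    fetch: hypothetical callable, fetch(n) -> latest n entries, newest first.
    stored_ids: set of entry ids already saved locally.
    """
    entries = fetch(first_batch)
    new = [e for e in entries if e["id"] not in stored_ids]
    if len(new) < len(entries) or len(entries) < first_batch:
        # We hit an entry we already have (or the log is short),
        # so nothing older can be missing: stop here.
        return new
    # Every entry in the small batch was new, so older unseen entries may exist.
    entries = fetch(full_batch)
    return [e for e in entries if e["id"] not in stored_ids]
```

If even the full batch contains no known entry, some log lines have already been lost, which would be a signal to poll that server more often.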
I'm basically asking if sending out this many requests per second is considered a bad thing and whether my future host might start asking questions or trying to throttle my server. I'm aiming to go shared for the first few customers, then if it gets popular enough, move to a dedicated server.
I know this has a slight degree of opinion involved, so I fear that it might be a candidate for closure, but I do feel that there is a definite degree of factuality required in the answer that should make it an okay question.
I'm not sure if there's a networking SE or if this might be more appropriate on SuperUser or something, but it feels right on SO. Drop me a comment ASAP if it's not appropriate here and I'll delete it and post to a suggested new location instead.

You might want to read about the C10K Problem. The article compares several I/O strategies. A certain number of threads that each handle several connections using nonblocking I/O is the best approach imho.
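As a small illustration of the non-blocking idea, here is a sketch in Python using `asyncio`. Note that it swaps in a single-threaded event loop rather than the several-threads arrangement described above, and `poll_server` only simulates a request with a sleep; a real version would use an asynchronous HTTP client of your choice.

```python
import asyncio


async def poll_server(name, delay):
    """Stand-in for one non-blocking HTTP request; `delay` simulates latency."""
    await asyncio.sleep(delay)  # yields to the event loop instead of blocking
    return name, "ok"


async def poll_all(servers, max_in_flight=100):
    """Poll every server concurrently, capping the number of requests in flight."""
    limit = asyncio.Semaphore(max_in_flight)

    async def guarded(name, delay):
        async with limit:
            return await poll_server(name, delay)

    # gather() returns results in the same order as the input list.
    return await asyncio.gather(*(guarded(n, d) for n, d in servers))


if __name__ == "__main__":
    servers = [(f"client-{i}", 0.01) for i in range(1000)]
    results = asyncio.run(poll_all(servers))
    print(len(results))
```

The point is that a slow or timing-out server only delays its own coroutine; the others keep making progress without needing a thread each.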
Regarding your specific project I think it is a bad idea to poll for a limited number of log items. When there is a peak in log activity you will miss potentially critical data, especially when you apply your optimizations.
It would be way better if the clients you are monitoring pushed their new log items to your server. That way you won't miss something important.
I am not familiar with the performance of ASP.NET, so I can't say whether a single dedicated server is enough, especially because I do not know what the specs of your server are. With a reasonably strong server it should be possible. If it turns out not to be enough, you should distribute your project across multiple servers.

http charging high volume of http requests

Sorry for the banal question, but I would like to know: if I launch a very large number of HTTP requests at a site, will the site go down, or will I just be banned after some request limit n (according to the server settings)?

As is almost always the case, the answer is: it depends :)
Now, seriously, it depends on a few things, but if you're just hitting "a" server with a single client computer, you most likely will not be able to make it go down, as servers usually have more bandwidth (networking and processing power) available to them than most client computers. Plus, when you're making requests to a website you might actually be making requests to a load-balanced network of servers, which definitely means that a single client will not be able to overwhelm the site. If this is the case your IP will probably get blacklisted or maybe just ignored.
The other possibility is to make a lot of requests with a number of different clients at the same time - using different IPs. This is what is usually called a coordinated or distributed denial of service attack. In that case you might be able to make the web server(s) go down for a while.
This is just a simple answer but I hope you get the point.

Harvesting Dynamic HTTP Content to produce Replicating HTTP Static Content

I have a slowly evolving dynamic website served from J2EE. The response time and load capacity of the server are inadequate for client needs. Moreover, ad hoc requests can unexpectedly affect other services running on the same application server/database. I know the reasons and can't address them in the short term. I understand HTTP caching hints (expiry, etags....) and for the purpose of this question, please assume that I have maxed out the opportunities to reduce load.
I am thinking of doing a brute force traversal of all URLs in the system to prime a cache and then copying the cache contents to geodispersed cache servers near the clients. I'm thinking of Squid or Apache HTTPD mod_disk_cache. I want to prime one copy and (manually) replicate the cache contents. I don't need a federation or intelligence amongst the slaves. When the data changes, invalidating the cache, I will refresh my master cache and update the slave versions, probably once a night.
Has anyone done this? Is it a good idea? Are there other technologies that I should investigate? I can program this, but I would prefer a solution built by configuring open source technologies.
Thanks

I've used Squid before to reduce load on dynamically-created RSS feeds, and it worked quite well. It just takes some careful configuration and tuning to get it working the way you want.

Using a primed cache server is an excellent idea (I've done the same thing using wget and Squid). However, it is probably unnecessary in this scenario.
It sounds like your data is fairly static and the problem is server load, not network bandwidth. Generally, the problem exists in one of two areas:
Database query load on your DB server.
Business logic load on your web/application server.
Here is a JSP-specific overview of caching options.
I have seen huge performance increases by simply caching query results. Even adding a cache with a duration of 60 seconds can dramatically reduce load on a database server. JSP has several options for in-memory cache.
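To show the idea of a short-lived query cache, here is a minimal sketch. It is written in Python for brevity rather than as JSP code, and `TTLCache` and `get_or_load` are hypothetical names, not part of any particular library.

```python
import time


class TTLCache:
    """Keeps each result for ttl_seconds, then reloads it on the next request."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (expiry time, value)

    def get_or_load(self, key, load):
        """Return the cached value for key, calling load() only if it is missing or expired."""
        now = self.clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]
        value = load()  # e.g. run the actual database query
        self._entries[key] = (now + self.ttl_seconds, value)
        return value
```

With a 60-second lifetime, a query requested hundreds of times a minute reaches the database only about once a minute, at the cost of results being up to a minute stale.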
Another area available to you is output caching. This means that the content of a page is created once, but the output is used multiple times. This reduces the CPU load of a web server dramatically.
My experience is with ASP, but the exact same mechanisms are available on JSP pages. In my experience, with even a small amount of caching you can expect a 5-10x increase in max requests per sec.

I would use tiered caching here; deploy Squid as a reverse proxy server in front of your app server as you suggest, but then deploy a Squid at each client site that points to your origin cache.
If geographic latency isn't a big deal, then you can probably get away with just priming the origin cache like you were planning to do and then letting the remote caches prime themselves off that one based on client requests. In other words, just deploying caches out at the clients might be all you need to do beyond priming the origin cache.
