Handling multiple requests on a single resource - HTTP

Recently I came across a question: if we have around 1000 users hitting a REST endpoint of a microservice, and that endpoint fetches the same data from some other slow process, how could we optimize the requests in this use case? Caching is the obvious answer, but how can it be optimized for a large number of concurrent requests?

Obviously, as you say, you have to go through the cache options. When you have a single application instance, the fastest option can be Ehcache; it is a fairly simple framework to implement.
But if you want to increase the availability of the service, you have to cluster it. For that you need a load balancer such as nginx, and to centralize the cache you can use a Redis database.
                  Redis (cache)
                        ^
                        |
Nginx (LB) -> cluster of your app -> other service
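Beyond where the cache lives, with ~1000 concurrent requests for the same data you also want to collapse duplicate in-flight fetches, so the slow process is called once per key rather than a thousand times while the first call is pending. A minimal sketch of that idea in plain Java (all names here are hypothetical; you would combine this with Ehcache or Redis to keep serving completed values):

    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    // Collapses concurrent requests for the same key into one upstream call.
    public class CoalescingFetcher {
        private final ConcurrentMap<String, CompletableFuture<String>> inFlight =
                new ConcurrentHashMap<>();

        // Stand-in for the "other slow process" the endpoint depends on.
        private String fetchFromSlowProcess(String key) {
            return "value-for-" + key;
        }

        public CompletableFuture<String> get(String key) {
            return inFlight.computeIfAbsent(key, k ->
                    CompletableFuture.supplyAsync(() -> fetchFromSlowProcess(k))
                            // Remove the entry when done; a real cache in front
                            // of this keeps serving the completed value.
                            .whenComplete((value, error) -> inFlight.remove(k)));
        }
    }

All concurrent callers of get("x") share the same CompletableFuture until it completes, so the slow process sees one request per key, not one per user.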

Related

Setting max connections per route for an HTTP connection pool

I am writing a crawler to crawl some forum content, and all my HTTP connections use Apache HttpClient.
As suggested by the official documentation, I'm using a single HttpClient per forum server; this client, equipped with a PoolingHttpClientConnectionManager instance, can execute multiple requests from multiple execution threads at the same time.
One important attribute of this pooling connection manager is the maximum number of connections per route (which is 2 by default). I am unsure what the optimal (general) limit is that ensures crawling speed without overloading the server.
(By general, I mean an average number that works for a typical forum server in different cases, because I will set it statically when I initialize the connection manager.)
Besides that, I would really appreciate it if someone knows how to dynamically manage the per-route limit based on server feedback in HttpClient 4.5 or a similar library.
Thanks very much for helping!
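There is no universally safe number (it depends on the target server), but for reference, setting these limits in HttpClient 4.5 looks roughly like this; the host name and the values below are placeholders, not recommendations:

    import org.apache.http.HttpHost;
    import org.apache.http.conn.routing.HttpRoute;
    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClients;
    import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

    public class CrawlerClientFactory {
        public static CloseableHttpClient create() {
            PoolingHttpClientConnectionManager cm =
                    new PoolingHttpClientConnectionManager();
            cm.setMaxTotal(50);           // total connections across all routes
            cm.setDefaultMaxPerRoute(2);  // the default the question mentions
            // Raise the limit for the one forum host being crawled:
            HttpRoute forum = new HttpRoute(new HttpHost("forum.example.com", 80));
            cm.setMaxPerRoute(forum, 8);
            return HttpClients.custom()
                    .setConnectionManager(cm)
                    .build();
        }
    }

Note that setMaxPerRoute and setDefaultMaxPerRoute can also be called later at runtime, so a feedback loop that watches response times and adjusts the limit is possible, though you would have to implement it yourself.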

How to limit maximum concurrent connections / requests / sessions in ASP.NET WebAPI

I'm looking for a mechanism to limit the number of concurrent connections to a service exposed using ASP.NET WebAPI.
Why? Because this service performs operations that are expensive on hardware resources, and I would like to prevent degradation under stress.
More info:
I don't know how many requests will be issued per period of time.
This service runs in its own IIS application pool and limiting the maximum connections on the parent site in IIS is not an option.
I found this suite, but the supported algorithms do not include the one that I'm interested in.
I'm looking for something out of the box (something as straightforward as an IIS config setting) but I could not find exactly what I need.
Any clues?
Thanks!
Scaling your service would probably be a better idea than limiting the number of requests. You could send the heavy processing to some background jobs and keep your API servicing requests.
But assuming the above cannot be done, you will need to use one of the throttling packages available, or write your own if none meets your requirements.
I suggest starting with the ThrottlingHandler from WebApiContrib.
You might be able to meet your needs by properly implementing the GetUserIdentifier method.
If not, you will need to implement your own MessageHandler and the handler mentioned would be a good starting point.
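Whatever package you end up with, the heart of such a handler is a counted gate around the expensive path: admit up to N concurrent requests and reject the rest, rather than letting everything degrade. A framework-neutral sketch of the pattern (written in Java for illustration; all names are hypothetical):

    import java.util.concurrent.Semaphore;
    import java.util.function.Supplier;

    // Admits at most maxConcurrent callers; the rest are rejected immediately,
    // which an HTTP layer would translate into a 429/503 response.
    public class ConcurrencyGate {
        private final Semaphore permits;

        public ConcurrencyGate(int maxConcurrent) {
            this.permits = new Semaphore(maxConcurrent);
        }

        public String handle(Supplier<String> expensiveWork) {
            if (!permits.tryAcquire()) {
                return "503 Service Unavailable"; // shed load instead of degrading
            }
            try {
                return expensiveWork.get();
            } finally {
                permits.release();
            }
        }
    }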

Can nginx be used to front sharded, HTTP-based resources?

Info
CouchDB is a RESTful, HTTP-based NoSQL datastore. Responses are sent back in simple JSON and it is capable of utilizing ETags in the generated responses to help caching servers tell if data has changed or not.
Question
Is it possible to use nginx to front a collection of CouchDB servers where each Couch server is a shard of a larger collection (and not replicas of each other) and have it determine the target shard based on a particular aspect of the query string?
Example Queries:
http://db.mysite.com?id=1
http://db.mysite.com?id=2
Shard Logic:
shard = ${id} % 2; // even/odd
This isn't a straightforward "load balancing" question, because I need the same requests to always end up at the same servers; but I am curious whether this type of simple routing logic can be written into an nginx site configuration.
If it can be, what makes this solution so attractive is that you can then turn on nginx caching of the JSON responses from the Couch servers and have the entire setup nicely packaged and deployed in a very capable and scalable manner.
You could cobble something together or you could use BigCouch (https://github.com/cloudant/bigcouch).
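If you do cobble it together, here is a sketch of how it might look in nginx: a map block can pick the upstream from the last digit of the id query parameter, which is equivalent to the even/odd shard = ${id} % 2 logic (the host names and ports below are made up):

    # In the http {} context:
    upstream couch_even { server couch0.internal:5984; }
    upstream couch_odd  { server couch1.internal:5984; }

    # Route on the last digit of ?id=... (even/odd == id % 2)
    map $arg_id $couch_shard {
        default    couch_even;   # fallback when id is missing or malformed
        ~[02468]$  couch_even;
        ~[13579]$  couch_odd;
    }

    server {
        listen 80;
        server_name db.mysite.com;
        location / {
            proxy_pass http://$couch_shard;
        }
    }

With proxy caching enabled in the same server block, identical ids always land on the same shard and the same cache entries, which is exactly the stickiness the question asks for.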

Web service that can withstand 1000 concurrent users with responses in 25 milliseconds

Our client's requirement is to develop a WCF service which can withstand 1-2k concurrent website users with a response time of around 25 milliseconds.
This service reads a couple of columns from a database and will be consumed by different vendors.
Can you suggest an architecture, or any extra measures I need to take while developing it? And how do we calculate the server hardware configuration needed to cope with this?
Thanks in advance.
Hardly possible. You need a network connection to the service, service activation, business logic processing, a database connection (another network connection), and the database query itself. Because of the 2000 concurrent users you need several application servers, so every network connection also passes through a load balancer. I can't imagine a network and hardware infrastructure that could complete such an operation within 25 ms for 2000 concurrent users. Such a requirement is not realistic.
I suspect that if you simply try to run the database query from your computer against the remote DB, you will see that even that simple task alone will not complete in 25 ms.
A few principles:
Test early, test often.
Successful systems get more traffic.
Reliability is usually important.
Caching is often a key to performance.
To elaborate: build a simple system right now. Even if the business logic is very simplified, if it's a web service with database access you can performance-test it. Test with one user. What do you see? Where does the time go? As you develop the system, adding in real code, keep doing that test. Reasons: (a) right now you learn whether 25 ms is even achievable; (b) you spot any code changes that hurt performance immediately. Then test with lots of users: what degradation patterns do you hit? This starts to give you an indication of your platform's capabilities.
I suspect that the outcome will be that a single machine won't cut it for you. And even if it will, if you're successful you will get more traffic. So plan to use more than one server.
In any case, for reliability reasons you need more than one server. And all sorts of interesting implementation details fall out when you can't assume a single server - e.g. you don't have Singletons any more ;-)
Most times we get good performance by using a cache. Will many users ask for the same data? Can you cache it? Are there updates to consider? In that case, do you need a distributed cache with clustered invalidation? That multi-server case emerges again.
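On the "test early" point: even a crude harness will show you the latency curve long before you invest in proper load-testing tools. A minimal sketch in Java (the URL and user counts are placeholders):

    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    // Fires N concurrent GETs and reports rough latency percentiles.
    public class CrudeLoadTest {
        public static void main(String[] args) throws Exception {
            int users = 200; // scale toward 1000-2000 and watch the degradation
            ExecutorService pool = Executors.newFixedThreadPool(users);
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 0; i < users; i++) {
                tasks.add(() -> {
                    long start = System.nanoTime();
                    HttpURLConnection c = (HttpURLConnection)
                            new URL("http://localhost:8080/service").openConnection();
                    c.getInputStream().readAllBytes();
                    return (System.nanoTime() - start) / 1_000_000; // ms
                });
            }
            List<Long> latencies = new ArrayList<>();
            for (Future<Long> f : pool.invokeAll(tasks)) {
                latencies.add(f.get());
            }
            Collections.sort(latencies);
            System.out.println("p50=" + latencies.get(users / 2) + "ms"
                    + " p99=" + latencies.get(users * 99 / 100) + "ms");
            pool.shutdown();
        }
    }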
Why do you need WCF?
Could you shift as much of that service as possible into static serving and cache lookups?
If I understand your question, thousands of users will be hitting your website and executing queries on your DB. You should definitely look into connection pooling for your WCF connections, but your best bet will be to avoid doing DB lookups altogether and have your website return data from cache hits.
I'd also look into why you couldn't just connect directly to the database for your lookups - do you actually need a WCF service in the way at all?
Look into Memcached.
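To make the Memcached suggestion concrete, the read-through pattern the answers above describe looks roughly like this with the spymemcached client (the server address, key, and TTL are illustrative only):

    import java.net.InetSocketAddress;
    import net.spy.memcached.MemcachedClient;

    public class CachedLookup {
        public static void main(String[] args) throws Exception {
            MemcachedClient cache =
                    new MemcachedClient(new InetSocketAddress("localhost", 11211));
            String key = "vendor-columns:42";
            Object value = cache.get(key);   // try the cache first
            if (value == null) {
                value = queryDatabase();     // hypothetical DB lookup
                cache.set(key, 60, value);   // cache for 60 seconds
            }
            System.out.println(value);
            cache.shutdown();
        }

        private static Object queryDatabase() {
            return "row-data"; // stand-in for the real couple-of-columns query
        }
    }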

Harvesting Dynamic HTTP Content to Produce Replicated Static HTTP Content

I have a slowly evolving dynamic website served from J2EE. The response time and load capacity of the server are inadequate for client needs. Moreover, ad hoc requests can unexpectedly affect other services running on the same application server/database. I know the reasons and can't address them in the short term. I understand HTTP caching hints (expiry, ETags, ...) and, for the purpose of this question, please assume that I have maxed out the opportunities to reduce load.
I am thinking of doing a brute force traversal of all URLs in the system to prime a cache and then copying the cache contents to geodispersed cache servers near the clients. I'm thinking of Squid or Apache HTTPD mod_disk_cache. I want to prime one copy and (manually) replicate the cache contents. I don't need a federation or intelligence amongst the slaves. When the data changes, invalidating the cache, I will refresh my master cache and update the slave versions, probably once a night.
Has anyone done this? Is it a good idea? Are there other technologies that I should investigate? I could program this myself, but I would prefer a solution built from a configuration of open-source technologies.
Thanks
I've used Squid before to reduce load on dynamically-created RSS feeds, and it worked quite well. It just takes some careful configuration and tuning to get it working the way you want.
Using a primed cache server is an excellent idea (I've done the same thing using wget and Squid). However, it is probably unnecessary in this scenario.
It sounds like your data is fairly static and the problem is server load, not network bandwidth. Generally, the problem exists in one of two areas:
Database query load on your DB server.
Business logic load on your web/application server.
Here is a JSP-specific overview of caching options.
I have seen huge performance increases by simply caching query results. Even adding a cache with a duration of 60 seconds can dramatically reduce load on a database server. JSP has several options for in-memory cache.
Another area available to you is output caching. This means that the content of a page is created once, but the output is used multiple times. This reduces the CPU load of a web server dramatically.
My experience is with ASP, but the exact same mechanisms are available on JSP pages. In my experience, even a small amount of caching can yield a 5-10x increase in maximum requests per second.
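To illustrate how little machinery a short-lived query cache needs, here is an in-memory sketch in plain Java (the class is hypothetical; the JSP caching options mentioned above give you the same thing with eviction and memory limits handled for you):

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;
    import java.util.function.Supplier;

    // Caches query results for a fixed time-to-live (e.g. 60 seconds).
    public class TtlQueryCache {
        private static final class Entry {
            final Object value;
            final long expiresAt;
            Entry(Object value, long expiresAt) {
                this.value = value;
                this.expiresAt = expiresAt;
            }
        }

        private final ConcurrentMap<String, Entry> entries = new ConcurrentHashMap<>();
        private final long ttlMillis;

        public TtlQueryCache(long ttlMillis) {
            this.ttlMillis = ttlMillis;
        }

        public Object get(String key, Supplier<Object> query) {
            long now = System.currentTimeMillis();
            Entry e = entries.get(key);
            if (e == null || e.expiresAt < now) {
                e = new Entry(query.get(), now + ttlMillis);
                entries.put(key, e); // last writer wins; acceptable for a cache
            }
            return e.value;
        }
    }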
I would use tiered caching here: deploy Squid as a reverse proxy server in front of your app server as you suggest, but then deploy a Squid at each client site that points to your origin cache.
If geographic latency isn't a big deal, then you can probably get away with just priming the origin cache like you were planning to do and then letting the remote caches prime themselves off that one based on client requests. In other words, just deploying caches out at the clients might be all you need to do beyond priming the origin cache.
