caching and load-balancing approaches - soa

Let's say I have one WCF service running on one server.
We introduced some caching in the WCF service for performance reasons.
Now, if I want to load balance the service, is there an existing solution that keeps the cache in sync when it lives on different servers?
How do we deal with these sorts of issues?
Maybe a solution is to create a separate CachingService hosted on another server. But then, if I want to load balance that service too, I'd need to somehow sync the caches that reside on different machines.
Or maybe it doesn't even make sense to load balance the caching servers, only the WCF services?

Yes, there are many solutions/products that (usually) sync cache between servers (memcached, Ehcache and more). Personally, I found that for closely located servers a referenced cache performs better. That is a cache where (possibly multiple) cache servers share the load, store each object on the server it was first requested from, and then share references to the objects between all cache servers.

Related

Spring Redis cache expiration in memory

I am using Spring Redis cache and wonder if it is possible to set a cache duration for some data in memory: a cache of the cache. If I know that data in Redis will not change for 5 minutes, I don't need the Spring Redis cache to touch Redis every time a @Cacheable method is called.
Is Redisson the answer?
AFAICT, Redisson is simply a client-side facade, or enhanced Redis (Java) client, used to interface with a Redis node (or cluster) in a more powerful and convenient way, not unlike Spring Data Redis. One example, as you already know, is using Redis as a caching provider in Spring's Cache Abstraction.
Redis does seem to support client-side caching (a local cache in addition to the remote, server-side cache) when using a Redis client/server topology. This would be transparent to your application (e.g. @Cacheable) and configured in the Redis client driver, AFAIK.
However, given my lack of experience with Redis, or even Redisson for that matter, I cannot speak to this feature in detail. Redis client-side caching may need to be supported by the Redis client driver (e.g. Jedis, Lettuce, even Redisson, etc.).
NOW THE LONG-WINDED ANSWER FOR THE INTERESTED READER:
What you are describing as a "cache of cache" is really a "locally available cache" in addition to the "remote, or server-side cache". This assumes, of course, that you are running Redis in a client/server (not embedded) and possibly distributed/clustered (maybe HA) capacity in the first place.
Ideally, you would choose a caching provider that supported this sort of arrangement out-of-the-box, natively. And, despite popular belief (for example), much of what Redis "reinvented" (horizontal scale-out or cluster, HA, even persistence) already existed in other, more mature solutions, built from the ground up with these concerns in mind.
SIDENOTE: Granted, the referenced article above is dated, but also a bit naive.
A "cache of (a) cache" is technically referred to as the Near Caching pattern.
It is where the "local" (application/client-side) cache mirrors the "remote" (server-side and primary) cache to avoid [a] network hop(s), i.e. latency, by only accessing the remote cache when necessary (e.g. cache miss), preferably in a "single-hop", "fault-tolerant" fashion, when the server-side is distributed and clustered.
However, a fundamental difference between the local cache and server-side, remote cache is that the local cache only stores a subset of the data from the remote cache based on "interests".
NOTE: In Redis's documentation, they referred to this as "tracking". There are different ways, across different providers, to express "interests" or track what the client has accessed. Be mindful of the different approaches here since they consume different system resources.
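To make the pattern concrete, here is a minimal, in-process sketch of a near cache in Python. It is not how Redis, Redisson or any other provider implements it: `RemoteCache` and `NearCache` are hypothetical stand-ins, the "remote" cache is just a dictionary, and the 300-second TTL mirrors the 5 minutes from the question. The local cache only holds keys this client has actually read, and the remote side remembers which clients to notify about each key.

```python
import time


class RemoteCache:
    """Hypothetical stand-in for the remote (server-side) cache."""

    def __init__(self):
        self._data = {}
        self._listeners = {}  # key -> callbacks of clients that read the key
        self.reads = 0        # counts simulated network round trips

    def get(self, key, listener=None):
        self.reads += 1
        if listener is not None:
            # Track that this client is interested in the key.
            self._listeners.setdefault(key, set()).add(listener)
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value
        # Push an invalidation to every client that read this key.
        for notify in self._listeners.pop(key, set()):
            notify(key)


class NearCache:
    """Local cache in front of RemoteCache, with a TTL per entry."""

    def __init__(self, remote, ttl_seconds=300.0, clock=time.monotonic):
        self._remote = remote
        self._ttl = ttl_seconds
        self._clock = clock
        self._local = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._local.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry[0]  # local hit, no network hop
        # Local miss or expired entry: go to the remote cache.
        value = self._remote.get(key, listener=self._invalidate)
        if value is not None:
            self._local[key] = (value, self._clock() + self._ttl)
        return value

    def _invalidate(self, key):
        # Called by the remote cache when the key changes.
        self._local.pop(key, None)
```

Repeated `near.get("user:1")` calls within the TTL are served locally, and a `remote.put("user:1", ...)` drops the local copy so the next read fetches the new value. A real provider does the notification over the network, which is where the resource trade-offs mentioned in the note come from.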
You might have a distributed (Web / Microservice) application architecture where several client application instances serve different demographics or populations of end-users. Clearly, those client application instances might use shared, but different, subsets of the primary dataset stored in the servers. This is where the local cache and "registering interest" only in the data that matters to, or is used by, the client application comes into play.
"Registering interest" is important because more than one client might be interested in and use the same data (e.g. a "record", or the intersection of data). When data that a client is interested in changes on the server, the server-side, remote cache can notify every client hosting a local cache (a "push", rather than the client "pulling").
So, how do we properly address this concern without unnecessarily introducing extra (layers of) complexity into our system/application architecture?
Well, for one, it starts by choosing the right caching provider for the problem at hand.
DISCLAIMER: my experience stems from Apache Geode, which is the OSS variant of VMware Tanzu GemFire, and I am responsible for all things Spring for Apache Geode at VMware.
While I am a bit biased here, it is not uncommon for other caching providers (and complete IMDG solutions) to support the same arrangement. For example, one of my personal favorites is Hazelcast.
Hazelcast calls this particular caching arrangement, or topology, an "embedded" cache and even refers to this as "near cache" in the documentation.
The nice thing about a local, embedded "Near Cache" is that it avoids latency from unnecessary network hops. However, interest registration is key to keeping data as consistent as possible.
I have documented, talked about and even demonstrated different caching patterns when using Spring for Apache Geode in the Spring Boot for Apache Geode documentation, and Near Caching in particular, along with the Near Caching Sample (in the Samples, with the other caching patterns).
I am sure you can find similar resources with other caching providers, even Redis.
At any rate, this documentation should help you understand different concerns to be aware of (e.g. memory consumption) when choosing any topology and configuration.
Good luck!

Planning server infrastructure when hosting duplicated web-product over multiple servers

We have a web-application product that we sell to companies and host on our servers.
The product contains a couple of web applications, Windows services and a SQL Server DB.
Right now we have only one client that uses our product. We have two servers: one for the web apps and services and the other for the DB.
In order to add the product for another client, we have to 'duplicate' all the apps and the DB and run them separately.
As we start expanding, and some companies will require more server power than others, I need to plan the server infrastructure.
Having two servers for each client sounds ridiculous. Hosting costs will be huge. What will happen when I have 10 clients? And probably some clients will take more power than others, leaving some servers at 30% of their capacity while others run at 70%.
One thing I really care about is separating the DB from each product so in case of server compromise, only one db will be at risk.
So... I thought about Virtual Machines...
Does that sound right?
Do I need two super servers to hold virtual machine instances? (one for web and other for db?)
What about Load balancing / etc..?
Will it require more maintenance time only because I use virtual machines?
Are there any hardware recommendations?
Any help will be appreciated
Many thanks
Virtual machines are definitely the safest way to separate clients, and they give you the flexibility to allocate a specific percentage of resources to specific clients.
However, using separate processes on the same physical machine will perform better (but not always significantly) and will allow more dynamic use of resources (i.e., if one client spikes, it will use the resources it needs). This setup will not let you control resource allocation nearly as easily, though. You'll also have to build your own monitoring tools to see and analyze which processes (clients) are using which resources (piggyback on perfmon).
Using separate processes is also dangerous if your application wasn't designed for it. Anywhere the application caches data on the file system, or accesses anything besides memory and the database, needs to be thoroughly scrubbed to make sure data from different clients is not commingled or shared.
Separate virtual machines are more work to manage: each one is pretty much like its own computer. So you have to manage all the VMs plus the physical machine.
You may also want to consider hosting in a more dynamic environment like Amazon AWS or Microsoft's Azure which will allow you to more easily scale up/down as necessary than a VM at a traditional host.

ASP.Net load balancing

I am working on asp.net (newbie) and I am trying to understand what it means to do "load balancing" for a web site. The website will be used by multiple users and resources (database, web service, ...).
If anyone could help me understand the concept of load balancing for an asp.net web site, I would really appreciate it.
Thanks.
One load-balancing-related issue you may want to be aware of at development time: where you store your session state. This MSDN article gives a good overview of your options.
If you implement your asp.net system using "out-of-process" or "sql-server-mode" session state management, that will give you some additional flexibility later, if you decide to introduce a load-balancer to your deployed system:
Your load balancer needn't handle session affinity. As one poster mentioned above, all modern load-balancers handle it anyway, so this is a minor consideration in any case.
Web-gardens (a sort of IIS/server-implemented load-balancer) REQUIRE "out-of-process" or "sql-server-mode" session state management. So if your system is already configured that way, you'll be one step closer to being able to use web-gardens.
What is it?
Load balancing simply refers to distributing a workload between two or more computers. As a concept, it's not unique to asp.net. Although having separate machines for your database and web server could be called "load balancing" it more commonly refers to using multiple machines to serve a single role, such as having multiple web servers.
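As a rough illustration of "multiple machines serving a single role", here is a minimal round-robin sketch in Python. It only shows the selection logic; the class name and the server names `web1` to `web3` are made up for the example, and a real load balancer would also handle health checks, weights and session affinity.

```python
import itertools


class RoundRobinBalancer:
    """Hands out servers from a fixed pool in rotating order."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(servers)

    def next_server(self):
        # Each call returns the next server, wrapping around at the end.
        return next(self._cycle)


balancer = RoundRobinBalancer(["web1", "web2", "web3"])
targets = [balancer.next_server() for _ in range(4)]  # web1, web2, web3, web1
```

Each incoming request would be forwarded to whatever `next_server()` returns, so over time every machine in the pool receives about the same number of requests.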
Should you worry about it? Probably not. Do you already have a performance problem? Are your database and web server on their own machines? If you do find that your server resources are strained, it would probably be easier to scale up (a more powerful single machine) than out (load balancing). These days, a dedicated box can handle a LOT of traffic if your code is decent.
Load balancing, in the programming sense, is not specific to ASP.NET; it is a technique for distributing server load across two or more machines rather than handling it all on one machine. Unless you will have many thousands (millions?) of users, you probably do not need to worry about it.
Check the Wikipedia article for more information.
Load balancing is not specific to any one technology stack, be it asp.net, jsp etc. To load balance is to spread the incoming requests to a web site over more than one server. This is typically done with a software or hardware load balancer. The load balancer sits in front of two or more web servers and delegates the incoming traffic, although the technique is not limited to web servers.
Enjoy!
I've never used it, but an option is IIS Application Request Routing.
IIS Application Request Routing (ARR) 2.0 enables Web server administrators, hosting providers, and Content Delivery Networks (CDNs) to increase Web application scalability and reliability through rule-based routing, client and host name affinity, load balancing of HTTP server requests, and distributed disk caching.
In a typical web server/database scenario, the db is almost always guaranteed to load up the machine first. This is because dealing with storing data requires more resources. Before you even start looking at load balancing your web server, you need to think about how to load balance the database.
Spreading one database across multiple servers is a lot harder than load balancing a web server. One technique that can be used is sharding (or horizontal partitioning), where some records are stored on one server and other records on another. For example, records with IDs 1-900000 are on server 1 and records with IDs 900001 and up are on server 2.
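The ID-range example can be sketched as a small routing function in Python. The 900000 boundary comes from the example above; the names `server1` and `server2` and the function `shard_for` are placeholders, and in practice this decision would live in your data access layer rather than in a free function.

```python
import bisect

# Inclusive upper bound of every shard except the last, in ascending order.
SHARD_UPPER_BOUNDS = [900000]
# One more name than bounds: the last shard takes everything above.
SHARD_NAMES = ["server1", "server2"]


def shard_for(record_id):
    """Return the name of the server that stores the given record ID."""
    if record_id < 1:
        raise ValueError("record IDs start at 1")
    # bisect_left finds the first bound that is >= record_id.
    index = bisect.bisect_left(SHARD_UPPER_BOUNDS, record_id)
    return SHARD_NAMES[index]
```

Adding a third server later means appending a bound and a name, but records that fall in the new range also have to be moved, which is part of why this kind of partitioning needs planning.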
In comparison to DB load balancing, spreading the load across multiple ASP.NET servers is not overly complicated. Most of the session issues can be easily mitigated by using out-of-process session state and/or never talking to Application.Cache directly. Data load balancing, on the other hand, is hard and requires a lot of planning and trial and error. In most cases, talking to a load-balanced DB requires using an ORM that supports it (e.g. NHibernate) or your own Data Access Layer. The reason is that you need to separate establishing a connection from the code that uses the database, so that the decision about which DB to talk to is handled in one place.
The exact solution is to save the session in SQL Server using a stored procedure. To read the session, call the 'SessionCheck' stored procedure.
I'd add that it really isn't something to worry about. By the time you need a load balancer, you can probably afford one of the neato newfangled ones with sticky sessions so you don't even have to deal with the session boogeyman.

Physically Separating Secure and Non Secure Web Requests

We have been doing some research into physically isolating the secure and non-secure sections of our web application into two applications. All "http" requests would be served by one server (or cluster) and all "https" requests would be served by another server (or cluster).
The reason that we are looking into this is partially for the survivability of the application. Since the secure section of the application is revenue generating we could, for example, have a larger and/or more powerful cluster to serve the requests. Conversely, when we upgrade the hardware in the secure application, it could be re-purposed to serve the non-secure site - basically extending the life of the servers.
Has anyone worked with this approach? We had an RFP out to a (well known) vendor last year for an architectural assessment and this was one of the possible paths that was recommended. While I see the potential upside, I worry about things such as maintenance, deployment, version control, etc.
Depending how your app is architected, it seems to me that if you used virtualisation / load balancing you could have the same benefits of guaranteed resources and isolation for the paid area, while also being able to dynamically burst resources to deal with spikes in load in either area. Your current proposal allows you to guarantee and prioritise resources, but it may result in some of them being idle.
Plus it would be easier to manage load through configuration, as it would then be a pure deployment issue and an entirely separate concern. You'd also be more independent of your hardware upgrade path as you'd just be adding/assigning virtual machines to the new hardware.

Anyone using Memcached with ASP.NET on a distributed farm?

We have 22 HTTP servers each running their own individual ASP.NET Caches. They read from a read only DB that is only updated off peak hours.
We use a file dependency to invalidate the cache, prompting the servers to "new up" their caches...If this is accidentally done during peak hours, it risks bringing down our DB cluster due to the sudden deluge of open connections.
Has anyone used memcached with ASP.NET in this distributed form? It seems to me that it would offer a huge advantage of having to only build up one cache (and hit the DB 21 times less), while memcached would handle distributing it on each box.
If you have, do you place it on the same box as the HTTP boxes, or do you run a separate cache tier? How well does it scale, can we expect it to need powerful servers? Our working dataset is not huge (We fit it into 4 gigs of memory on each HTTP box just fine).
How do you handle invalidation?
Looking for experiences and war stories.
EDIT: Win2k3, IIS6, 64-bit servers...4 gigs per box (I believe, we may have upped it to 16 gigs when we changed to 64-bit servers).
"memcached would handle distributing it on each box"
memcached does not distribute or replicate a cache to each box in a memcached farm. The memcached client basically hashes the key and chooses a cache server based on that hash. When one of the memcached servers fails, you will lose whatever cached items existed on that server; however, the client will recognize the failure and begin writing values to a different server. This being the case, your code needs to account for missing items in the cache and reset them if necessary.
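Here is a small Python simulation of that behaviour. It is not a real memcached client: `HashingCacheClient` is a hypothetical class that keeps each "server" as an in-memory dictionary, and it uses plain modulo hashing, whereas real clients often use consistent hashing so that fewer keys move when a server drops out. The `get_or_load` helper shows the "account for missing items" part.

```python
import hashlib


class HashingCacheClient:
    """Simulated client: picks one server per key by hashing the key."""

    def __init__(self, server_names):
        self._stores = {name: {} for name in server_names}
        self._alive = list(server_names)

    def server_for(self, key):
        if not self._alive:
            raise RuntimeError("no cache servers available")
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % len(self._alive)
        return self._alive[index]

    def get(self, key):
        return self._stores[self.server_for(key)].get(key)

    def set(self, key, value):
        self._stores[self.server_for(key)][key] = value

    def mark_failed(self, name):
        # Everything cached on the failed server is lost.
        self._alive.remove(name)
        self._stores[name] = {}


def get_or_load(client, key, loader):
    """Read from the cache; on a miss, reload the value and cache it again."""
    value = client.get(key)
    if value is None:
        value = loader(key)  # e.g. a database query
        client.set(key, value)
    return value
```

After `mark_failed` on the server that owned a key, `client.get` returns `None` for it, and the next `get_or_load` call repopulates the value on one of the remaining servers.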
This article discusses the memcached architecture in more detail: How memcached works.
Best practice (according to the memcached site) is to run memcached on the same box as your web server app, or else you're making network calls (which isn't all that bad, but it's not optimal). If you're running a 64-bit app server (which you probably should if you're going to be running memcached), then you can load up each of the servers with loads of memory and it will be available to memcached. There's not much in the way of CPU resources used by memcached, so if your current app server isn't very taxed, it will remain that way.
Haven't used them together, but I've used them both on separate projects.
Last I saw the documentation explicitly said that sharing with the web server was ok.
Memcached really only needs RAM, and if you take your asp.net cache out of the equation, how much RAM is your web server actually using? Probably not much. It won't compete much with your web server for CPU and it doesn't need disk at all. You might consider segmenting off the network traffic (if you don't already) from the incoming web requests.
It worked well and was fast; I didn't have any problems with it.
Oh, invalidation was explicit on the project I used it on. Not sure what other modes there are for that.
If you want replication across your memcached servers, then it may be worth a look at repcached. It's a patch for memcached that handles the replication part.
Worth checking out Velocity, which is a distributed cache provided by Microsoft. I cannot give you a point-by-point comparison to memcached, but Velocity is integrated with ASP.NET and will continue to get more development and integration.
