Which .NET architecture should I implement for 10,000 concurrent users for a web application? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 3 years ago.
I need to create a web application for task delegation, monitoring, and report generation for an organization.
I am thinking that ASP.NET with MVC should be the architecture to support that many concurrent users. Is there a better architecture I should be looking at?
What kind of server configuration will be required to host this application?
How do I test that many connected users before I launch the application? Are there any free or economical testing tools available?
Thanks in advance.
anil
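As an aside on the load-testing question: Apache JMeter is a well-known free option. The core idea — many concurrent clients hammering one endpoint while you count failures — can be sketched with nothing but the Python standard library. The tiny server below is only a stand-in for a real application; all names here are illustrative:

```python
import http.server
import threading
import urllib.request
from concurrent.futures import ThreadPoolExecutor

class StubHandler(http.server.BaseHTTPRequestHandler):
    """Stand-in for the application under test: always answers 200 OK."""
    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep per-request logging out of the way

server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), StubHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_address[1]

def one_request(_):
    with urllib.request.urlopen(url) as resp:
        return resp.status

# 500 requests from 50 concurrent workers; count the successes
with ThreadPoolExecutor(max_workers=50) as pool:
    statuses = list(pool.map(one_request, range(500)))

ok = statuses.count(200)
print("%d/%d requests succeeded" % (ok, len(statuses)))
server.shutdown()
```

A real tool would also track latency percentiles and ramp up the user count gradually, but the measure-under-concurrency loop is the same.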

The choice of MVC versus WebForms has little to nothing to do with the app's ability to handle load. Your problems will be reading from and writing to the DB, and that doesn't change no matter which of the two you choose.
Ideas for improving the ability to handle load:
First and foremost: the absolute minimum is two servers, a web server and a DB server; they should NEVER run on the same box.
DB:
Efficient queries against the DB, indexes in the DB, denormalizing tables that are hit a lot, CACHE, CACHE, CACHE, running the DB in a cluster, oh, and did I mention CACHING?
Processing:
If you need heavy processing, do it in web services that can run on machines separate from the web servers, so that you can scale out (buy more servers and put them behind a load balancer if needed).
WEB:
Avoid the need for server affinity (so that it doesn't matter which web server serves a given user at any given time). This means using the DB or a StateServer to store sessions, and syncing the MachineKey on the servers.
The decision to use MVC or not has NO impact on the ability to handle 10k concurrent users; however, it's a HUGE benefit to use MVC if you want the site to be unit-testable.
Remember: applications are either testable or detestable. Your choice.

Cache, cache, cache, cache :-) A smart caching policy will make even one server go a long way. Aside from that, you will need to find out where your bottleneck will be. If your application is database heavy, then you will need to consider scaling your database, either by clustering or by sharding. If you expect your web server to be the bottleneck (for example, if you are doing a lot of processing, like image processing), then you can put a load balancer in front to distribute requests between the N servers in your web farm.
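To make the "cache, cache, cache" point concrete, here is a minimal time-based cache sketch in Python (illustrative only; the names and TTL policy are made up, not any particular library's API):

```python
import time

class TTLCache:
    """Tiny time-based cache: serve from memory until the entry expires."""
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expiry_timestamp)

    def get_or_load(self, key, loader):
        entry = self._store.get(key)
        now = time.monotonic()
        if entry is not None and entry[1] > now:
            return entry[0]            # cache hit: no DB round trip
        value = loader(key)            # cache miss: hit the database once
        self._store[key] = (value, now + self.ttl)
        return value

db_calls = 0
def load_from_db(key):
    """Stand-in for an expensive database query."""
    global db_calls
    db_calls += 1
    return "row-for-%s" % key

cache = TTLCache(ttl_seconds=60)
for _ in range(1000):                  # 1000 requests, one DB query
    cache.get_or_load("product:42", load_from_db)
print("database calls:", db_calls)
```

A thousand requests cost one database round trip; that ratio is why caching is the first lever to pull before buying hardware.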

For a setup this large I would highly recommend using a distributed memory caching provider layered above your database. I would also strongly suggest using an ORM with built-in support for the memory cache, like NHibernate, since in an application of this scale your biggest bottleneck will definitely be your database.
You will most likely need a web farm for this scenario. Even if a single server is strong enough for it currently, at some point in the near future you will most likely outgrow a single box, which is why it's important to architect in the distributed cache first, so you can grow your farm out without having to re-architect your entire system.


Which of the design patterns (with or without SQS) below should they use to process spiked data volume in AWS? [closed]

I am doing my AWS study and came across this seemingly debatable question online; I'm wondering if posting here can get more input:
A company is building a voting system for a popular TV show. Viewers will watch the performances, then visit the show's website to vote for their favorite performer. It is expected that in a short period of time after the show has finished, the site will receive millions of visitors. The visitors will first log in to the site using their Amazon.com credentials and then submit their vote. After the voting is completed, the page will display the vote totals. The company needs to build the site such that it can handle the rapid influx of traffic while maintaining good performance, but also wants to keep costs to a minimum. Which of the design patterns below should they use?
A. Use CloudFront and an Elastic Load Balancer in front of an
auto-scaled set of web servers; the web servers will first call the
Login With Amazon service to authenticate the user, then process the
user's vote and store the result in a multi-AZ Relational Database
Service instance.
B. Use CloudFront and the static website hosting feature of S3 with
the JavaScript SDK to call the Login With Amazon service to
authenticate the user, and use IAM Roles to gain permissions to a
DynamoDB table to store the user's vote.
C. Use CloudFront and an Elastic Load Balancer in front of an
auto-scaled set of web servers; the web servers will first call the
Login With Amazon service to authenticate the user, then process the
user's vote and store the result in a DynamoDB table, using IAM
Roles for EC2 instances to gain permissions to the DynamoDB table.
D. Use CloudFront and an Elastic Load Balancer in front of an
auto-scaled set of web servers; the web servers will first call the
Login With Amazon service to authenticate the user, then process the
user's vote and store the result in an SQS queue, using IAM Roles
for EC2 instances to gain permissions to the SQS queue. A set of
application servers will then retrieve the items from the queue and
store the result in a DynamoDB table.
The original question is from
http://www.briefmenow.org/amazon/which-of-the-design-patterns-below-should-they-use/#comment-25822
and it is very debatable there.
My thought is to limit the number of AWS services so that the cost is minimal. RDS is excluded here, since three of the options point to DynamoDB (I am not a cloud guru; this is my instinctive judgement). S3 is only good for static websites, so B is excluded.
Between C and D I chose C because: 1. it doesn't need SQS, which is an extra cost; and 2. does SQS even have the capability to process the volume at the expected IOPS/throughput? I don't know, but both C and D use DynamoDB, which I think is a good solution, and D doesn't indicate the permissions needed for the application servers (i.e. EC2 instances) to access DynamoDB. So D is excluded.
Am I missing anything here?
There is no standard, authoritative answer provided for this question.
Thank you very much for the discussion.
Highly opinionated question.
This is one way to solve the problem.
Design considerations
High number of requests in a short span of time
a. We must be able to autoscale.
b. Write traffic should be spread out so that we don't have to provision a lot of DynamoDB capacity.
c. Reads should not require database join and count operations, so that we don't have to provision a lot of RAM or DynamoDB capacity.
Assumptions
Eventual consistency of votes is fine.
Writes should be strongly consistent. (If votes are immutable (cannot be changed once cast), it may be fine to use SQS; otherwise, bringing in SQS adds a lot of complexity to the system, as explained below.)
Architecture
Components
Use CloudFront and the static website hosting feature of S3 to power the website (so that it is horizontally scalable). (PS: this approach is called client-side rendering and has multiple cons, such as SEO; do your research before choosing it. If you need server-side rendering, have another server behind an AWS ELB which calls the other services and builds the page. Both approaches have pros and cons; for the rest of the answer I assume you are doing server-side rendering.)
Website calls your services to render the page.
Your service is deployed on ec2, with auto scaling enabled.
All the reads are served from ElastiCache (deployed in a master/slave configuration so there is no single point of failure).
All the writes go to DynamoDB with strong consistency. (Most services need to read a consistent state to decide the next state; if we have an SQS queue in between, it becomes impossible to determine the exact state of the system at any given point in time. But this also means we need to pay more for DynamoDB; enable auto-scaling on DynamoDB for this.)
Since the major share of the load will be reads, keep the aggregates in ElastiCache. To update ElastiCache you can have a Lambda function subscribed to the DynamoDB change stream.
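A rough Python sketch of that last step: a handler folds new vote rows from the change stream into cached totals, so a read never needs a count query. A Counter stands in for ElastiCache, and while the record layout mimics a DynamoDB Streams record, treat the shapes and names here as illustrative:

```python
from collections import Counter

# running vote totals kept in the cache layer (dict stands in for ElastiCache)
totals = Counter()

def handle_stream_records(records):
    """Fold newly inserted vote rows into the cached aggregates,
    so reads never need a table scan or a count query."""
    for rec in records:
        if rec["eventName"] == "INSERT":
            performer = rec["dynamodb"]["NewImage"]["performer"]["S"]
            totals[performer] += 1

# simulated batch of change-stream records (shape is illustrative)
batch = [
    {"eventName": "INSERT", "dynamodb": {"NewImage": {"performer": {"S": "alice"}}}},
    {"eventName": "INSERT", "dynamodb": {"NewImage": {"performer": {"S": "bob"}}}},
    {"eventName": "INSERT", "dynamodb": {"NewImage": {"performer": {"S": "alice"}}}},
]
handle_stream_records(batch)
print(dict(totals))
```

The write path stays a single cheap item insert, and the expensive aggregation happens once per change rather than once per page view.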
Contingencies
You should have a plan to repopulate ElastiCache in case it goes down and you have to rehydrate the state.
Here are the cons of your options
Option A cons:
It will require huge read capacity (RCUs), since you will be aggregating results.
Option B cons:
It will require huge read capacity (RCUs), since you will be aggregating results.
Directly calling DynamoDB from the frontend might not be a good idea, considering that how you store and retrieve data may evolve over time.
See the cons of client-side rendering above.
Option C cons:
It will require huge read capacity (RCUs), since you will be aggregating results.
Option D cons:
It will require huge read capacity (RCUs), since you will be aggregating results.
Asynchronous applications have their own complexities. For example:
If a user upvotes and refreshes the page, you may show them that they haven't cast any vote, since your application has yet to process the SQS event.
If a user first upvotes and then downvotes, your system may store that the user upvoted, since SQS events are not processed in order.
Despite the above cons, if you still want to take this approach, what you are trying to do here is called Event Sourcing. You should use Kafka instead of SQS so that your events are ordered.
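The ordering hazard in the second example can be shown in a few lines of Python (a toy last-write-wins vote store; the event shape and names are made up for illustration):

```python
def apply_event(state, event):
    # last-write-wins: each event overwrites the user's current vote
    state[event["user"]] = event["vote"]

events = [
    {"seq": 1, "user": "u1", "vote": "up"},
    {"seq": 2, "user": "u1", "vote": "down"},  # the user changed their mind
]

# delivered in order: the final vote is "down", as intended
in_order = {}
for e in events:
    apply_event(in_order, e)

# standard SQS may deliver in any order: the stale "up" wins
out_of_order = {}
for e in reversed(events):
    apply_event(out_of_order, e)

print(in_order["u1"], "vs", out_of_order["u1"])
```

Either the consumer must resolve conflicts (e.g. by comparing sequence numbers before applying) or the transport must guarantee ordering, which is the argument for an ordered log like Kafka.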

AppFabric vs asp.net cache with sqldependency performance

I'm working on a plan to increase the performance and scalability of a web app by caching a user database for a WCF web service. The goals are to increase performance by accessing this data in-proc versus a round trip to the database server, as well as to increase scalability of the service by reducing the load on the database server, thus allowing more web servers to be added to increase scale.
In researching AppFabric, I really don't see the value in my situation, because it seems like for the most part I'm just replacing a round trip to the database with a round trip to a cache cluster (which seems like it might even have more overhead than the DB, to keep nodes in sync).
On the performance question, it seems like using the ASP.NET cache (in process) would be much faster than a round trip to the cache cluster, even though the data is in memory on those servers, and even if some of it is cached locally (I believe that would still be out of process from the web app).
On the scalability issue, it also seems easier to add identical web servers to a web farm (each caching the user data in process) rather than manage a cache cluster separately, which adds complexity.
With that said, could someone explain why I would choose one approach over the other, given my stated goals? If you recommend the AppFabric approach, can you explain how the performance would be better than storing data in the ASP.NET cache in process?
Thanks!
You are right that the AppFabric cache is stored out of process.
When a request comes in for an AppFabric cache item, there is first a lookup to find where the item is, then a WCF net.tcp call to get the item. Therefore, it will be slower than ASP.NET caching. But there are times when AppFabric caching is better:
You do not lose the cache when the application pool is recycled.
If you have 100 web servers, you need to get the data from the database once, not 100 times.
If you are running the Enterprise Edition of Windows, you do not lose the cache if a machine goes down.
I found this topic on CodeProject; hope it can answer your question.
You should consider NCache as another option. NCache is an extremely fast in-memory distributed cache which reduces the performance bottlenecks associated with the database and enhances the scalability of the app.
As far as the use of the ASP.NET cache is concerned, you should keep its limitations in mind as well. It is good for small web farms only, but when the number of servers grows, the ASP.NET cache may end up with performance and scalability issues due to its in-process nature. In a larger web farm you need an in-memory distributed cache. Read this for reference.

How can I use caching to improve performance?

My scenario is : WebApp -> WCF Service -> EDMX -> Oracle DB
When I want to bind a grid, I fetch records from the Oracle DB using the EDMX, i.e. a LINQ query. But this degrades performance, as multiple layers sit between the WebApp and the Oracle DB. Can I use a caching mechanism to improve performance? But as far as I know, the cache is shared across the whole application, so if I update the cache, other users might receive wrong information. Can we use caching per user? Or is there any other way to improve the performance of the application?
Yes, you can definitely use caching techniques to improve performance. Generally speaking, caching is “application wide” (or it should be) and the same data is available to all users. But this really depends on your scenario and implementation. I don't see how adding the extra caching layer will degrade performance, it's a sound architecture and well worth the extra complexity.
ASP.NET Caching has a concept of "cache dependencies" which is a method to notify the caching mechanism that the underlying source has changed, and the cached data should be flushed and reloaded on the next request. ASP.NET has a built-in cache dependency for SQL Server, and a quick Google search revealed there’s probably also something you can use with Oracle.
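The cache-dependency idea can be sketched outside ASP.NET as well. In this minimal Python version, a change counter stands in for the database's change notification; everything here is illustrative, not any ASP.NET or Oracle API:

```python
class DependencyCache:
    """Cache entries tagged with the version of the source they were read
    from; when the source's version moves, entries are treated as stale."""
    def __init__(self):
        self._store = {}     # key -> (value, version_seen)

    def get(self, key, current_version, loader):
        entry = self._store.get(key)
        if entry is not None and entry[1] == current_version:
            return entry[0]                     # still fresh: no reload
        value = loader()                        # stale or missing: reload
        self._store[key] = (value, current_version)
        return value

table_version = 0   # stands in for a change-notification counter in the DB
loads = 0
def load_grid_data():
    """Stand-in for the expensive multi-layer fetch behind the grid."""
    global loads
    loads += 1
    return "snapshot-%d" % table_version

cache = DependencyCache()
cache.get("grid", table_version, load_grid_data)
cache.get("grid", table_version, load_grid_data)   # served from cache
table_version += 1                                 # underlying data changed
cache.get("grid", table_version, load_grid_data)   # dependency forces reload
print("loads:", loads)
```

Because stale entries are flushed when the source changes, all users see consistent data from one shared cache, which addresses the "other users might receive wrong information" concern without needing per-user caches.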
As Jakob mentioned, application-wide caching is a great way to improve performance. Generally, user-context-agnostic data (e.g. reference data) can be cached at the application level.
You can also cache user-context data by storing it in the user's session when they log in. The data is then cached for the duration of that user's session (HttpContext.Session).
Session data can be configured to be stored in the web application's process memory, in a state server (a special WCF service), or in a SQL Server database, depending on the architecture and infrastructure.

SQLite: use it for websites, but not for client/server apps?

After reading this question and the suggested link explaining when it is more appropriate to use SQLite versus another DB, one simple thing is still unclear to me, and I hope someone can clarify it.
They say:
Situations Where SQLite Works Well
Websites: SQLite usually will work great as the database engine for low to medium traffic websites...
...
Situations Where Another RDBMS May Work Better
Client/Server Applications: ...If you have many client programs accessing a common database over a network...
Isn't a website also a client/server app?
I mean, I don't understand: a website is exactly a situation where I have many client programs (users with their web browsers) concurrently accessing a common DB via one server application.
Just to keep it simple: at the end of the day, is it possible, for instance, to use SQLite for an ecommerce site, an online catalog, or a CMS site with about 1000 products/pages?
The users' web browsers don't directly access the database; the web application does. And normally the request/response cycle for each page the user views will be very fast, usually lasting a fraction of a second.
IIRC, a transaction in SQLite locks the whole database file, meaning that if a web app request requires a blocking transaction, all traffic will effectively be serialized. This is fine, for a low-to-medium traffic website, because many requests per second can still be handled.
In a client-server database application, however, multiple users may need to keep connections open for longer periods of time, and may also need to perform transactions. This is far less of a problem for bigger RDBMS systems because locking can be performed in a more fine-grained way.
SQLite allows multiple clients to read, but only a single client to write at a time. See: https://www.sqlite.org/faq.html
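That behavior is easy to observe with Python's built-in sqlite3 module (using SQLite's default rollback-journal mode):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

writer = sqlite3.connect(path)
writer.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
writer.execute("INSERT INTO products (name) VALUES ('widget')")
writer.commit()

# two independent read connections can query simultaneously
r1 = sqlite3.connect(path)
r2 = sqlite3.connect(path)
count = r1.execute("SELECT COUNT(*) FROM products").fetchone()[0]
name = r2.execute("SELECT name FROM products").fetchone()[0]

# but only one connection may hold the write lock: a second writer
# with a zero busy-timeout fails immediately instead of queueing
writer.execute("BEGIN IMMEDIATE")
blocked = sqlite3.connect(path, timeout=0)
error = None
try:
    blocked.execute("BEGIN IMMEDIATE")
except sqlite3.OperationalError as exc:
    error = str(exc)
writer.rollback()

print(count, name, "| second writer:", error)
```

In a web app the busy-timeout normally queues the second writer for a moment rather than failing, which is why short, fast write transactions keep a low-traffic site perfectly usable on SQLite.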
Client/server is when multiple clients do simultaneous writes to the database, such as order entry, where there are multiple users simultaneously inserting and updating information, or a multi-user blog where there are multiple simultaneous editors.
A website, in the case of read-only, is not client/server but rather simply a server with multiple requests. In many cases, a website is heavily cached and the database is not even accessed, or rarely.
In the case of a slightly used ecommerce website, say a few simultaneous shoppers, this could be supported by SQLite, or by MySQL. Somewhere there is a line where performance is better for a highly-concurrent database as opposed to SQLite.
Note that the number of products/pages is not a great way to determine the requirement for MySQL over SQLite, rather it is the number of concurrent users, and at what point their concurrent behavior experiences slowness due to waiting for locks to clear.
A website isn't necessarily a client/server application in this sense.
I think when they say website, they mean that the web application will directly manage the database. That is, the database file will live within the website and will not be accessed by any other means (a single point of access, put simply).
In contrast, a client/server app may have the website accessing the data store as well as another website, a SOAP client, or even a smart client. In this context, you have multiple clients accessing one database (server). This is where the website becomes (yet another) client.
Another aspect to consider when contrasting the two is the percentage of writes compared to reads. I think SQLite will perform happily when there is little writing going on compared to the amount of reading. SQLite, I understand, doesn't do well in a multiple-writer scenario; it's intended for a single process (or a handful?) to be manipulating it.
I mainly only use SQLite in embedded applications (iOS, Android). For larger, more complex websites (like you're describing) I would use something like MySQL.

Advantages and disadvantages of using caching in an asp.net application?

What are the advantages and disadvantages of using caching in an asp.net application?
Answers will vary based on environment and technology.
Advantages
Reduce load on Web Services/ Database
Increase Performance
Reliability (assuming a DB-backed cache: if a server goes down, there is no time wasted repopulating an in-memory cache)
Disadvantages
Could run into issues syncing caches
Increased Maintenance
Scalability Issues
With great power comes great responsibility ;). We ran into an issue where we decided to use the HttpContext.Cache (bad idea) in an application that was distributed. Early on in the project someone decided to just throw it in there, and we didn't run into issues until we went live. Whenever it comes to caching, you need to look at the big picture. Ask yourself: do we have enough data, enough users, or a performance requirement that warrants implementing caching?
If you answer yes then you are probably going to need a farm of servers so choose your caching provider wisely.
With all that being said, Microsoft has a new cache API, AppFabric/Velocity, that you could leverage, which handles the distribution and syncing of the cache auto-magically.
AppFabric caching lets you do timeout eviction, or even built-in notification eviction, so as your data changes the caching server takes note of it, and periodically the cache client checks in with the server and gets a list of the items it needs to sync.
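That check-in/sync loop can be sketched in a few lines of Python; this is a toy stand-in for the idea, not the AppFabric API:

```python
class CacheServer:
    """Toy stand-in for a notification-based cache server: it remembers
    which keys changed and hands each client the delta since its last check-in."""
    def __init__(self):
        self._version = 0
        self._changes = []          # (version, key)
        self._data = {}

    def put(self, key, value):
        self._version += 1
        self._data[key] = value
        self._changes.append((self._version, key))

    def changes_since(self, version):
        keys = {k for v, k in self._changes if v > version}
        return self._version, {k: self._data[k] for k in keys}

class CacheClient:
    """Local in-process copy that periodically pulls only what changed."""
    def __init__(self, server):
        self.server = server
        self.version = 0
        self.local = {}

    def check_in(self):
        self.version, delta = self.server.changes_since(self.version)
        self.local.update(delta)    # only stale entries cross the wire

server = CacheServer()
server.put("a", 1)
server.put("b", 2)
client = CacheClient(server)
client.check_in()                   # initial sync pulls everything
server.put("a", 99)                 # data changes on the server
client.check_in()                   # next check-in pulls just 'a'
print(client.local)
```

The point of the delta exchange is that each check-in moves only the changed entries, not the whole cache, which is what makes periodic syncing cheap enough to run constantly.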
http://msdn.microsoft.com/en-us/library/xsbfdd8c%28VS.71%29.aspx
Advantage: performance.
Disadvantage: new data is not displayed immediately.
