Implementing a New Message Notification Feature in a Server Farm Scenario - asp.net

I'm working on a forum-based website that also supports onsite messaging (i.e. users can send private messages to other users). What I'm trying to do is notify a member when they have new messages, for example by displaying the inbox link in bold along with the number of messages, e.g. Inbox(3).
I'm a little confused about how this can be implemented for a website running on a server farm. Querying the database on every request seems like overkill to me, so that is out of the question; probably a shared cache should be used for this. I tend to think this is a common feature on many sites, including many of the large ones running on server farms, and I wonder how they implement it. Any ideas are appreciated.

SO caches the questions, however every postback requeries your reputation. This can be seen by writing a couple of good answers quickly, then refreshing the front page.
The questions will only change every minute or so, but you can watch your rep go up each time.

Waleed, I recommend you read the articles on high scalability. They have specific case studies on the architectures of various mega scale web applications. (See the side bar on the right side of the main page.)
The general consensus these days is that RDBMS usage in this type of application is a bottleneck. It is also probably safe to say that most of the highly scalable web applications sacrifice consistency to achieve availability.
This series should be informative of various views on the topic. A word on scalability is highly cited.
In all this, keep in mind that these folks are dealing with Flickr, Amazon, and Twitter scale issues and architectures. The solutions are somewhat radical departures from the (previously accepted) norms, and unless your forum application is the next Big Thing, you may wish to first test out the conventional approach to determine whether it can handle the load.
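As a rough illustration of the shared-cache idea raised in the question, here is a minimal cache-aside sketch in Python. Everything in it is an assumption made for illustration: the `UnreadCountCache` class, the `load_count` callback standing in for the database query, and the in-process dictionary standing in for a cache that all servers in the farm can reach (memcached, Redis, or similar).

```python
import time
from typing import Callable, Dict, Tuple


class UnreadCountCache:
    """Cache-aside wrapper around a (hypothetical) database count query."""

    def __init__(self, load_count: Callable[[int], int], ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load_count = load_count   # stand-in for the real "count unread" query
        self._ttl = ttl_seconds
        self._clock = clock
        # Stand-in for a shared cache; maps user_id -> (count, expires_at).
        self._store: Dict[int, Tuple[int, float]] = {}

    def get(self, user_id: int) -> int:
        entry = self._store.get(user_id)
        now = self._clock()
        if entry is not None and entry[1] > now:
            return entry[0]                 # cache hit: no database round trip
        count = self._load_count(user_id)   # cache miss: query once, then remember
        self._store[user_id] = (count, now + self._ttl)
        return count

    def invalidate(self, user_id: int) -> None:
        """Call when a message is sent to, or read by, this user."""
        self._store.pop(user_id, None)
```

With this shape, the page that renders Inbox(3) calls `get`, and the code paths that send or read a message call `invalidate` for the recipient. The time-to-live bounds how stale the number can get if an invalidation is missed.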

Related

Approach for disconnected application development

Our company has people in every catastrophic event here in the U.S. and parts of Canada. An example is they were quite prevalent in Katrina immediately after the event.
We are constructing an application to improve their job in the field which may be either ASP.NET or WPF, and the disconnect requirement makes us believe it will be a WPF application. Our people need to be able to create their jobs, provide all of the insurance and measurement data, and save it as if in the database whether or not the internet is available.
The issue we are trying to get our heads around is that when at catastrophic events our people need to be able to use our new application even when the internet is not available. (They were offline for 3 days in Katrina)
Has anyone else had to address requirements like this and suggestions on how they approached functioning on small-footprint devices while saving data as if they were still connected to the backend services and database? We also have to incorporate security into this as well, and do it well enough that their entered data loads into the connected database without issues.
Our long-term goal is to also provide this application for Android and iPad tablet devices as well as laptops. Our initial desire for ASP.NET was that it gave us an immediate application for the tablet environment. In the old application they have, they run a local server, run remote connections on the tablets, and run the application through terminal server. Not pretty.
I feel this is a serious question that is not subjective so hopefully this won't get deleted.
Our current architecture on the server side is Entity Framework with a repository pattern, WCF services to satisfy CRUD requests returning composite data transfer objects, and a proxy for use by the clients.
I'm interested in hearing other developers' input on this design puzzle.
Additional Information Added to the Discussion
Lots of good information provided!!! I'll have to look at Microsoft Sync for sure. For the disconnected database I would be placing only list tables (enumerations) in the initial database. Jobs and, if needed, an item we call dry books, will be added for each client we are helping. (though I hope the internet returns by the time we are cleaning and drying out the homes) These are the tables that would then populate back to the host once we have a stable link. In the case of Katrina we also lost internet connectivity in our offices which meant the office provided no communication relief for days as well.
Last night I realized that our client proxy is the key to everything working! The client remains unaware of the fact that it is online or offline and leaves the synchronization process within that library. We are discovering how much data we are talking about today. I also want to make it clear that ASP.NET was a like-to-have but a thick client (actually WPF with XAML) may end up being our end state.
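To make the proxy idea concrete, here is a minimal sketch in Python of a client-side proxy that hides the online/offline distinction from its caller. The `JobProxy` name, the `send` and `is_online` callbacks, and the in-memory outbox are all hypothetical; a real client would persist the outbox locally so queued records survive a restart.

```python
from typing import Callable, Dict, List


class JobProxy:
    """Hypothetical client proxy: sends records when online, queues them when not."""

    def __init__(self, send: Callable[[Dict], None], is_online: Callable[[], bool]) -> None:
        self._send = send              # stand-in for a call to the backend service
        self._is_online = is_online    # connectivity check supplied by the caller
        self.outbox: List[Dict] = []   # would be persisted locally in a real app

    def save(self, record: Dict) -> None:
        if self._is_online():
            self.flush()               # keep ordering: drain older records first
            self._send(record)
        else:
            self.outbox.append(record)

    def flush(self) -> int:
        """Send queued records in order; return how many were sent."""
        sent = 0
        while self.outbox and self._is_online():
            self._send(self.outbox[0])
            self.outbox.pop(0)         # remove only after a successful send
            sent += 1
        return sent
```

The caller only ever uses `save`; whether the record went to the server or into the outbox is a detail of the library.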
Now -- for multiple updates. The disconnected work will be going to individual homes by a single franchise. In fact our home office dispatches specific franchises to specific events. So we have a reduced likelihood (if any) of the problem of multiple people updating a record. The reason is that they are creating records for each job (person's home/office/business) and only that one franchise will deal with it. Of course this also means that if they are disconnected for days that the device that creates the job (record of who, where, condition, insurance company, etc) is also the only device that knows of the job. But that can be lived with. In fact we may be able to have a facility to sync the franchise devices on a hub.
I'm looking forward to hearing additional stories of how you've implemented your disconnected environment.
Thanks!!!
Looking at new technology from Microsoft
I was directed to look at a video from TechEd 2012 and thought I might have an answer. The talk was on using ASP.NET and MVC4 along with 2 libraries for disconnected behavior. At first I thought it would be great but then as it continued it worried me quite a bit.
First the use of a javascript backend to support disconnected I/O does not generate confidence. As a compiler guy (and one who wrote two interpretive languages) I really do not like having a critical business model reliant upon interpretive javascript. And script at that! It may be me but it just makes me shudder.
Then they show their "great"(???) programming model having your ViewModel exist as just javascript. I do not care for an application (asp.net and javascript) that can be, and may as well be (for lack of IntelliSense), written in Notepad.
No offense meant to any asp lovers, but a well written C# program that has been syntactically and type checked gives me stronger confidence in software than something written with a hope and prayer that a class namespace has been properly typed without any means of cross check. I've seen too many hours of debugging spent looking for a bug that ended up being a huge namespace with a transposed "ie" in its name. I ran my thought past the other senior developers in my group and we are all in consensus on this technology.
But we continue to look. (I feel this is becoming more of a diary than a question) :)
Looks like a perfect example for Microsoft Sync Framework
http://msdn.microsoft.com/en-us/sync/bb736753.aspx
A comprehensive synchronization platform that enables collaboration
and offline access for applications, services, and devices with
support for any data type, any data store, any transfer protocol, and
any network topology.
I often find that building a lightweight framework to fit my specific needs is more beneficial to me than using an existing one. However, always look at what's available and weigh the pros and cons before making that decision.
I haven't used the Microsoft Sync Framework, but it sounds like a good one to research first. If you have SQL Server Standard (or any edition other than Express) then replication might also be an option.
If you want to develop your own homegrown solution, then be sure to put lastupdated and dateadded fields on any tables that need to stay in sync. It doesn't 'sound' like your scenario will be burdened by concurrency issues (i.e. if person A and B both modify a field at the same time, who wins?). If that's the case then developing your own lightweight solution will be pretty straightforward.
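A minimal sketch of how the lastupdated and dateadded fields could drive a homegrown sync, written in Python against an in-memory SQLite database. The `jobs` table, its columns other than the two date fields, and the `changed_since` helper are assumptions made for illustration; dates are stored as ISO-8601 strings so that plain string comparison orders them correctly.

```python
import sqlite3


def changed_since(conn: sqlite3.Connection, table: str, last_sync: str) -> list:
    """Return rows added or updated after last_sync (ISO-8601 string)."""
    # The table name is interpolated, so it must come from trusted code, not user input.
    query = (
        f"SELECT id, dateadded, lastupdated FROM {table} "
        "WHERE dateadded > ? OR lastupdated > ? ORDER BY id"
    )
    return conn.execute(query, (last_sync, last_sync)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, dateadded TEXT, lastupdated TEXT)")
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?)",
    [
        (1, "2012-06-01T08:00:00", "2012-06-01T08:00:00"),  # unchanged since last sync
        (2, "2012-06-01T09:00:00", "2012-06-03T10:30:00"),  # updated while offline
        (3, "2012-06-04T07:15:00", "2012-06-04T07:15:00"),  # created while offline
    ],
)
pending = changed_since(conn, "jobs", "2012-06-02T00:00:00")
```

Here `pending` holds rows 2 and 3, which are the ones that would be pushed to the host. After a successful push, the client would record the new sync time and use it as `last_sync` next time.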
As Jeremy pointed out, you will need a way to get the changes. In addition to using a web service, you can also use WCF which is similar to a web service in some ways. But my personal bias would be towards just accessing a SQL server remotely over the internet. The downside of that solution is added security concerns, while the upside is decreased development overhead (i.e. faster/easier development now and less maintenance over time). Also, the direct SQL solution is also assuming that this is an internal application... that you're in charge of all development and not working with 3rd parties who need access to your data and wouldn't be allowed to access it this way.
Not really a full answer but too much for a comment.
I have two apps, one that synchs one way and the other two way.
I do a one-way synch to the client for disconnected operation. At the server it is full SQL Server and at the client Compact Edition. A timestamp column is perfect for finding any rows that need to be synched. I also don't copy the whole database, as some of the largest tables are nonessential. The common use is that the user marks the identified records they want to synch.
If synch does what you need great +1 for Jakub. For me I don't have the option to synch the whole MSSQL both based on size and security.
I have another smaller application that synchs two way, but in this case it has regions and updates are only within the region. So a region only synchs its own data, and in disconnected mode they can only add new records. Updates to existing records must be performed in connected mode. That was manageable. In that case I used MSSQL for the master and XML for the client.
No news to you but the hard part of a raw synch is that two parties may have added or revised the same record.

When to use load balancing?

I am just getting into the more intricate parts of web development. This may not be the best place to ask. However, when is it best to get load balancing for a web project? I understand that it depends on good or bad design as to how many users can visit a site without it REALLY affecting the performance. However, I am planning to code a new project that could potentially have a lot of users, and I wondered if I should be thinking about load balancing right off the bat. Opinions welcome; thanks in advance!
I should also note that the project will most likely be asp.net (webforms or mvc not yet decided) with a backend of mongodb or pgsql (again, still deciding).
Load balancing can also be a form of high availability. What if your web server goes down? It can take a long time to replace it.
Generally, when you need to think about throughput you are already rich because you have an enormous amount of users.
Stack Overflow is serving 10m unique users a month with a few servers (6 or so). Think about how many requests per day you would have if you were constantly generating 10 HTTP responses per second for 8 hot hours: 10*3600*8=288000 page impressions per day. You won't have that many users soon.
And if you do, you optimize your app to 20 requests per second per CPU core, which means you get 80 requests per second on a commodity server (assuming four cores). That is a lot.
Adding a load balancer later is usually easy. LBs can tag each user with a cookie so they get pinned to one particular target. Your app will not notice the difference. Usually.
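The arithmetic above can be written out as a small Python calculation. The function names are hypothetical, and the core count of four is an assumption inferred from 80 / 20 rather than something the answer states.

```python
def page_impressions_per_day(requests_per_second: float, busy_hours: float) -> int:
    """Total responses served if the rate is sustained for busy_hours."""
    return int(requests_per_second * 3600 * busy_hours)


def server_throughput(requests_per_second_per_core: float, cores: int) -> float:
    """Requests per second one server can sustain, assuming linear scaling per core."""
    return requests_per_second_per_core * cores


daily = page_impressions_per_day(10, 8)                    # 10 * 3600 * 8 = 288000
per_server = server_throughput(20, 4)                      # 20 * 4 = 80
daily_capacity = page_impressions_per_day(per_server, 8)   # 80 * 3600 * 8 = 2304000
```

So under these assumptions a single tuned server has roughly eight times the headroom of the 10 requests per second scenario.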
Is this for an e-commerce site? If so, then the real question to ask is "for every hour that the site is down, how much money are you losing?" If that number is substantial, then I would make load balancing a priority.
One of the more important architecture decisions that I have seen affect this is the use of session variables. You need to be able to provide a seamless experience if your user ends up on different servers during their visit. In-process session variables won't transfer from server to server, so I would avoid using them.
I support a solution like this at work. We run four (used to be eight) .NET e-commerce websites on three Windows 2k8 servers (backed by two primary/secondary SQL Server 2008 databases), taking somewhere around 1300 (combined) orders per day. Each site is load-balanced, and kept "in the farm" by a keep-alive. The nice thing about this, is that we can take one server down for maintenance without the users really noticing anything. When we bring it back, we re-enable our replication service and our changes get pushed out to the other two servers fairly quickly.
So yes, I would recommend giving a solution like that some thought.
The parameters here that may affect one another and slow down performance are:
Bandwidth
Processing
Synchronization
Bandwidth has to do with how many users you have, together with the media you want to serve.
So if you have a lot of video or files to deliver, you need many servers to deliver them. Let's say you do not; the next things to check are the users and the processing.
From my experience, what slows down the processing is the locking of the session. So one big step to speed up processing is to implement fully custom session handling, so that your pages do not lock one another and you can handle many users without issue.
For the next step, let's say that you have a database that keeps all the data. To gain from load balancing across many computers, the trick is to keep a local cache of what you are going to show.
So the first idea is to avoid too much locking, which makes users wait on one another, and the second idea is to have a local cache on each computer that is built dynamically from the main database data.
ref:
Web app blocked while processing another web app on sharing same session
Replacing ASP.Net's session entirely
call aspx page to return an image randomly slow
Always online
One more parameter is that you can build a solution in the "one for all, and all for one" :) style, where the extra servers also act as backups. So if one server goes offline for any reason (e.g. for an update and restart), the rest can still work and serve.
As you said, it depends if/when load balancing should be introduced. It depends on performance and how many users you want to serve. LB also improves the reliability of your app: it will not stop when one system goes crashing down. If you can see your project growing to be really big and serving lots of users, I would suggest designing your application so that it can be upgraded to LB, so do not do anything non-standard. Try to steer away from home-made solutions and always follow good practice. If later on you really need LB, it should not be necessary to change your app.
UPDATE
You may need to think ahead but not at a cost of complicating your application too much. Do not go paranoid and prepare everything to work lightning fast 'just in case'. For example, do not worry about sessions - session management can be easily moved to SQL Server at any time and this is the way to go with LB. Caching will also help if you hit some bottlenecks in the future but you do not need to implement it straight away - good design (stable interfaces), separation and decoupling will allow for the cache to be added later on. So again - stick to good practices, do not close doors but also do not open all of them straight away.
You may find this article interesting.

ASP.NET What's the best way to produce a trial version for customers to download?

I've written an ASP.NET app that I hope to sell to businesses. I could host the trial, but it's designed to connect to the customer's data, so customers will certainly want to install it to do a successful evaluation.
I've never produced anything commercial before, so I'm looking for advice on how best to limit the trial. A 30 day trial seems most common; do you simply rely on the clock of the PC/server they install it on? Any other suggestions are welcome. Please keep in mind this is an ASP.NET app, so it will be installed on their web server.
Thanks
Craig
I would just do it via the PC's clock. At the end of the day, they could just change the clock and continue to use your software, though it's probably not going to work in practice (i.e. most software actually uses the date/time for other things as well, and changing it is going to screw that up).
Generally, you can trust businesses more than you can trust the general public. The liability of a business is much higher than that of an individual, so if it came to it, you could potentially sue them for quite a bit. That alone means most businesses will purchase licenses for all of their software: a few hundred (or even a few thousand) dollars for a software license is much better than risking getting sued.
When they sign up for the demo, make sure you get all of their contact details and so on.
I would set up a web service on your server to authenticate the demo application. The web service should get called periodically and, if the check fails, the application should shut down. That way you have complete control over the trial (you can extend it or shut it down remotely).
You should give them some sort of key which they will place in your web.config that will identify them as a customer.
Make sure you take the usual precautions of encrypting / using hashes with both the key and the web service so it's not bypassed.
This sort of thing has been well covered on SO in the past.
You cannot make it unbreakable, but you can make it very difficult for the client to break your trial period.
One way to do it is to take the first run time and encrypt that info and store it either in your web.config or database. This has a weakness though: what do you do if the value is not present where you expect it to be?
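A minimal sketch of storing the first-run time in a tamper-evident form, in Python. Note that it signs the timestamp with an HMAC rather than encrypting it, which is a swapped-in technique: the value stays readable, but editing it is detectable. The `SECRET_KEY`, the 30-day `TRIAL_LENGTH`, and both function names are assumptions for illustration, and because the key ships inside the application, a determined customer could still extract it.

```python
import hashlib
import hmac
from datetime import datetime, timedelta, timezone

SECRET_KEY = b"replace-with-a-real-secret"   # hypothetical; ships inside the app
TRIAL_LENGTH = timedelta(days=30)


def issue_token(first_run: datetime) -> str:
    """Serialize the first-run time with an HMAC so edits are detectable."""
    stamp = first_run.astimezone(timezone.utc).isoformat()
    tag = hmac.new(SECRET_KEY, stamp.encode(), hashlib.sha256).hexdigest()
    return f"{stamp}|{tag}"


def trial_active(token: str, now: datetime) -> bool:
    """Return True while inside the trial; raise ValueError on a bad token."""
    stamp, sep, tag = token.rpartition("|")
    expected = hmac.new(SECRET_KEY, stamp.encode(), hashlib.sha256).hexdigest()
    if not sep or not hmac.compare_digest(tag.encode(), expected.encode()):
        raise ValueError("trial token missing or tampered with")
    first_run = datetime.fromisoformat(stamp)
    return now < first_run + TRIAL_LENGTH
```

The token returned by `issue_token` is what would be written to web.config or the database on first run. The sketch deliberately raises on a missing or altered token and leaves the policy question (treat as expired, or restart the trial) to the caller.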
Another option is to ping a webservice that you host. If the webservice says their trial is over then you can render the appropriate page to tell them that. This has the advantage that the webservice is beyond their control and cannot be messed with. It has the disadvantage that not every client will want to be allowing their web app to phone home, and there may be connectivity issues which would interfere with the functioning of your app.
So you might want to come up with a variety of options, and then implement a licencing module using the Provider pattern, so that you can swap in the licencing module most suitable for that client.
Put a counter in the web.config, and of course give the counter an unrelated name so the customer does not know what it is for. Every time they access the application you can increment the counter. Give them x number of logins.
If you want you can encrypt the counter if you do not want the customer to figure out that the counter is incrementing.

Determining Website Capacity

A client of mine has a website and they need to determine how 'scalable' the site currently is. What I mean by this is the number of users browsing around the site concurrently.
It's a custom e-commerce app in .net, not written by myself, and the code is... well, let's just say, a bit dubious.
A much bigger company is looking to buy them / throw funding their way but they need some form of metrics to show how much load it can take before it falls apart. This big company has the ability to 'turn on the taps' to a huge user base - and obviously doesn't want to do that if the site is going to fall over with a sneeze of traffic.
What is a good metric to provide here? And how can I obtain it?
Edit: Question revised
I always use Apache's "ab" tool.
Run it from a different machine, preferably a BSD or Linux machine with no firewall rules that will limit the performance of the tool; otherwise the result might not be as reliable. If you use a Windows machine, make sure you're using one that isn't limiting the number of active TCP connections.
When using "ab", the number you're looking for is "Requests per second". Experiment with the concurrency switch to see how many concurrent users you can handle before you start getting a lot of errors, or before the requests per second start dropping rapidly.
When you are noticing the webserver is having serious issues you should restart the webserver, and let it rest for a while before continuing the test.
You'd be better off with a hosted load test, as this might give you more insight into real-world scenarios (something like http://www.scl.com/software-quality/hosted-load-test, no experience with them though).
Furthermore, scalability is, as far as I know, not how many concurrent users can be served, but how easy it is to serve more as the site grows (by adding extra servers, etc.): how easy is it for the website to scale up, does the codebase allow an unlimited number of servers to be used, and so on.
Well, I suppose it'll depend on what the client cares about.
Do they care about how many users can access the site at once? Report on that by running simultaneous requests from another server until the site dies, then take the number.
Do they care about something else?
For me, when someone says they want it to 'scale', it really means they have no idea what they want. So try and talk to them, and get specific details of what, exactly, they want to see 'scaling', and then, once you find the areas to analyse, you can do so trivially, and attempt to improve them.

How would I go about figuring out the maximum load my server(s) can handle?

In Joel's article for Inc. entitled How Hard Could It Be?: The Unproven Path, he wrote:
...it turns out that Jeff and his
programmers were so good that they
built a site that could serve 80,000
visitors a day (roughly 755,000 page
views)
How would I go about figuring out the maximum load my server(s) can handle?
Benchmarking your software is often a lot harder than it seems. Sure, it's easy to produce some numbers that say something about the performance of your software, but unless it was calculated using a very accurate representation of the actual usage patterns of your end users, it might be completely different from the actual results you will get in the wild. Websites are notoriously hard to benchmark correctly. Sure, you can run a script that measures the time it takes to generate a page but it will be a very different number from what you will see under real world usage.
In order to create a solid benchmark of what your servers can handle, you first need to figure out what the usage patterns of your users are. If your site is already running, you can easily collect this data from your logs. Next, you need to create a simulation that emulates exactly the same patterns your real users exhibit, that is: view front page, log in, view status page, and so forth. Different pages create different loads on the servers, so you need to fetch the correct set of pages when simulating load. Finally, you need to figure out which resources are cached by your users; you can do this again by looking through your access log or by using a tool such as Firebug.
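As a minimal sketch of collecting usage patterns from logs, the Python below counts how often each path is requested and turns the counts into fractions that a simulation could use as weights. It assumes log lines in Common Log Format; IIS logs use a different layout, so the regular expression would need adapting. The `page_mix` helper and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Matches the request part of a Common Log Format line, e.g. "GET /login HTTP/1.1".
REQUEST = re.compile(r'"(?:GET|POST) (\S+) HTTP/[\d.]+"')


def page_mix(log_lines):
    """Return {path: fraction of all requests} for the given log lines."""
    hits = Counter()
    for line in log_lines:
        match = REQUEST.search(line)
        if match:
            hits[match.group(1)] += 1
    total = sum(hits.values())
    return {path: count / total for path, count in hits.items()} if total else {}


sample = [
    '10.0.0.1 - - [01/Jan/2024:10:00:00 +0000] "GET / HTTP/1.1" 200 5120',
    '10.0.0.1 - - [01/Jan/2024:10:00:05 +0000] "POST /login HTTP/1.1" 302 0',
    '10.0.0.2 - - [01/Jan/2024:10:00:07 +0000] "GET / HTTP/1.1" 200 5120',
    '10.0.0.1 - - [01/Jan/2024:10:00:09 +0000] "GET /status HTTP/1.1" 200 2048',
]
mix = page_mix(sample)
```

For the four sample lines, half of the requests go to the front page and a quarter each to `/login` and `/status`, so a simulated user would request those pages in the same proportions.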
JMeter, ab, or httperf
You can create several "stress tests" and run them, as the other posters suggest.
Apache has a tool called JMeter where you can create these tests and run them several times.
http://jmeter.apache.org/
Greetings.
Jason, Have you looked at the Load Test built in to Visual Studio 2008 Team System? Check out this video to see a demo.
Edit: Here's another video that has better resolution.
Apache has a tool called ab that you can use to benchmark a server. It can simulate request loads and concurrency situations for you.
Basically you need to mimic the behavior of a user and keep ramping up the number of users being mimicked until the server response is no longer acceptable.
There are a variety of tools that can do this, but essentially you want to record a few sessions' activity on your site and then play those sessions back (adding some randomisation to reflect real user behaviour) lots of times.
You will want to log the performance of each session and keep increasing the load until the performance becomes unacceptable.
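The ramp-up loop described above can be sketched in Python using only the standard library. This is a simplification under stated assumptions: it replays a single URL rather than recorded sessions, the `fetch` and `ramp` names and the one-second threshold are made up for illustration, and any request error simply propagates instead of being counted. A dedicated tool such as JMeter or ab is the better choice for real measurements.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch(url: str) -> float:
    """Request url once and return the elapsed time in seconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=10) as response:
        response.read()
    return time.perf_counter() - start


def ramp(url, levels, requests_per_user=5, max_avg_seconds=1.0, fetch_once=fetch):
    """Raise concurrency level by level; stop when the average gets too slow.

    Returns a list of (concurrent_users, average_seconds) for each level tried.
    """
    results = []
    for users in levels:
        with ThreadPoolExecutor(max_workers=users) as pool:
            timings = list(pool.map(fetch_once, [url] * (users * requests_per_user)))
        average = sum(timings) / len(timings)
        results.append((users, average))
        if average > max_avg_seconds:
            break   # response time is no longer acceptable
    return results
```

A run might look like `ramp("http://localhost:8080/", [1, 5, 10, 25, 50])`, where the URL is a placeholder for the site under test. The last entry in the returned list is the first concurrency level at which the average response time crossed the threshold, or the highest level tried if it never did.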

Resources