I am building a chat application that has to show which users are currently online. I have a column 'IsOnline' in the database whose value toggles between 1 and 0 as a user logs in or out. I need a function that hits an API every 15 seconds to fetch the latest set of users who are currently online.
Since I am using Entity Framework, which does not support SignalR or SqlDependency, I have decided to go this way.
How can I have a method that runs every 15 seconds on a separate thread, so that it does not interfere with my other CRUD operations, for as long as I have a user in session?
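A minimal sketch of such a poller, using System.Threading.Timer so the work runs on a thread-pool thread; the endpoint URL and the response handling are assumptions for illustration:

    using System;
    using System.Net.Http;
    using System.Threading;

    public class OnlineUsersPoller : IDisposable
    {
        private readonly HttpClient _client = new HttpClient();
        private readonly Timer _timer;

        public OnlineUsersPoller()
        {
            // Fires immediately, then every 15 seconds on a thread-pool
            // thread, so it never blocks the threads serving CRUD requests.
            _timer = new Timer(Poll, null, TimeSpan.Zero, TimeSpan.FromSeconds(15));
        }

        private async void Poll(object state)
        {
            try
            {
                // Hypothetical endpoint that returns the users with IsOnline = 1.
                string json = await _client.GetStringAsync("http://example.com/api/users/online");
                // ...deserialize and refresh the user list here...
            }
            catch (HttpRequestException)
            {
                // Ignore transient failures; the next tick retries.
            }
        }

        public void Dispose()
        {
            _timer.Dispose();   // stop polling when the session ends
            _client.Dispose();
        }
    }

Dispose the poller when the user's session ends, so abandoned sessions don't keep hitting the API.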
Polling every 15 seconds is not a good solution, especially if the call hits the database. Think about the latency of this approach. I think you need to look for a different approach rather than querying the DB every 15 seconds.
If you want to track online/offline status, maintain it in memory rather than persisting it to the DB (persist every hour or two if you want to keep it in the DB at all).
Store the status in memory, for instance in memcached or Redis. Have the client issue a request every 15 seconds. The online status is transient; it does not need to be stored in the DB.
It's hard to advise in depth because you did not describe the architecture of your app.
In general, efficient implementation of presence notifications is tricky. It may be easier to take something off the shelf instead of developing your own.
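For a single web server, a minimal in-memory sketch of the heartbeat idea above (the names and the 30-second cutoff are illustrative; on a farm you would back this with Redis or memcached instead):

    using System;
    using System.Collections.Concurrent;
    using System.Linq;

    public static class PresenceTracker
    {
        private static readonly ConcurrentDictionary<string, DateTime> LastSeen =
            new ConcurrentDictionary<string, DateTime>();

        // Call this from the client's 15-second heartbeat request.
        public static void Heartbeat(string userId)
        {
            LastSeen[userId] = DateTime.UtcNow;
        }

        // A user counts as online if we heard from them within two poll intervals.
        public static string[] OnlineUsers()
        {
            DateTime cutoff = DateTime.UtcNow.AddSeconds(-30);
            return LastSeen.Where(kv => kv.Value > cutoff)
                           .Select(kv => kv.Key)
                           .ToArray();
        }
    }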
Related
I am trying to make six asynchronous jQuery AJAX calls to my .NET page methods, all at once on document.ready, to request different sets of data from the database and render them as charts.
The problem is that one slow chart blocks generation of the next five. For instance, when each chart takes one minute to generate, the user ends up waiting roughly six minutes, instead of the one to two minutes I expected from async AJAX calls and page methods processed in parallel.
After reading a lot of useful posts on this forum, I found that this is because I read and write session objects within the page methods, and ASP.NET locks the session for the whole request, which makes the requests run sequentially.
I have seen people suggest setting the session state to read-only in the @Page directive, but that won't solve my problem because I need to write to the session as well. I have considered moving from InProc session to SQL database session, but my session object is not serializable and is used across the whole project. I also cannot switch to Cache instead, because the session contains user-specific details.
Can anyone please point me in the right direction? I have spent days investigating this inefficiency and still haven't found a nice solution.
Thanks in advance
From my personal experience, switching to SQL session will NOT help this problem: all of the concurrent threads will block in SQL instead, because the first thread in will hold an exclusive lock on one or more rows in the database.
I'm curious as to why your session object isn't serializable. The only solution I can think of is to use a database table to store the user-specific data you are keeping in session, and then hold the database lock only for as long as it takes to update the user data.
You can use the ASP.NET session id or other unique cookie value as the database key.
The problem may not be server side at all.
Browsers have a built-in limit on how many concurrent HTTP requests they will make; this is part of the HTTP/1.1 spec, which suggests a limit of 2.
In IE7 the limit is 2; in IE8 it is 6. But when a page loads you can easily hit that limit because of the concurrent requests for CSS, JS, images, etc.
A good source of info about these limits is BrowserScope (see Connections per Hostname column).
What about combining those six requests into one request? The page would also load a little faster, as sketched below.
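One way to do the combining, sketched under the assumption that the six chart queries are independent (the method and type names are placeholders, not your code): a single page method that runs the queries in parallel server-side and returns everything in one response, so the session lock is only taken once.

    using System.Linq;
    using System.Threading.Tasks;
    using System.Web.Services;

    public partial class Dashboard : System.Web.UI.Page
    {
        [WebMethod]
        public static object[] GetAllCharts()
        {
            // One HTTP request means one session lock; the six independent
            // queries still run in parallel inside it.
            var tasks = Enumerable.Range(1, 6)
                                  .Select(i => Task.Run(() => LoadChart(i)))
                                  .ToArray();
            Task.WaitAll(tasks);
            return tasks.Select(t => t.Result).ToArray();
        }

        private static object LoadChart(int chartId)
        {
            // Placeholder for the real per-chart database query.
            return new { ChartId = chartId };
        }
    }

The client then makes one $.ajax call and renders all six charts from the single response.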
A part of the application I'm working on is an SWF that shows a test with some 80 questions. Each question is saved to SQL Server through WebORB and ASP.NET.
If a candidate finishes the test, the session needs to be validated. The problem is that sometimes 350 candidates finish their tests at the same moment, and CPU usage on the web server and SQL Server spikes (350 concurrent validations).
Now, how should I implement queuing here? In the database there's a table with a record for each session; one column holds the status: 1 is finished, 2 is validated.
I see two ways to implement the queuing (maybe you have other suggestions):
A process checks the table for records with status 1. When it finds one, it validates the session, so sessions are validated one after another.
When a candidate finishes a session, a message is sent to an MSMQ queue. Another process listens to the queue and validates sessions one after another.
Now:
What would be the best approach?
Where do you start the process that will validate sessions? In your global.asax (Application_Start)? As a Windows service? As an exe in the root of the website that is started in Application_Start?
To me, using the table and looking for records with status 1 seems the easiest way.
The MSMQ approach decouples your web-facing application from the validation logic service and the database.
This brings many advantages, a few of which:
It would be easier to handle situations where the validation logic can handle 5 sessions per second but receives 300 all at once. Otherwise you would have to handle complicated timeouts, re-attempts, etc.
It would be easier to do maintenance on the validation service without having to interrupt the rest of the application. When the validation service is brought down, messages queue up in MSMQ and get processed again as soon as it is brought back up.
The same as above applies for database maintenance.
If you don't have experience using MSMQ and no infrastructure set up, I would advise against it. Sure, it might be the "proper" way of doing queueing on the Microsoft platform, but it is not very straightforward and has quite a learning curve.
The same goes for creating a Windows service; don't do it unless you are familiar with it. For simple cases such as this I would argue that the pain outweighs the rewards.
The simplest solution would probably be to use the table and run the process on a background thread that you start up in global.asax. You probably also want an admin page that reports status information about the process (number of pending jobs, etc.) and maybe a button to restart the process if it fails for some reason.
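A minimal sketch of that worker, assuming the status values from the question plus a hypothetical in-progress status 3; the connection string, table name, and ValidateSession body are placeholders:

    using System;
    using System.Data.SqlClient;
    using System.Threading;

    public static class ValidationWorker
    {
        // Call this once from Application_Start in global.asax.
        public static void Start()
        {
            new Thread(Run) { IsBackground = true }.Start();
        }

        private static void Run()
        {
            while (true)
            {
                int sessionId = TryDequeue();
                if (sessionId == 0)
                {
                    Thread.Sleep(TimeSpan.FromSeconds(5)); // nothing pending
                    continue;
                }
                ValidateSession(sessionId); // the expensive part, one at a time
            }
        }

        private static int TryDequeue()
        {
            using (var con = new SqlConnection("...your connection string..."))
            // Atomically claim one finished session (status 1) by moving it
            // to the assumed "validating" status 3, so a crash/restart
            // cannot process the same session twice.
            using (var cmd = new SqlCommand(
                @"UPDATE TOP (1) Sessions SET Status = 3
                  OUTPUT inserted.SessionId
                  WHERE Status = 1", con))
            {
                con.Open();
                object id = cmd.ExecuteScalar();
                return id == null ? 0 : (int)id;
            }
        }

        private static void ValidateSession(int sessionId)
        {
            // Placeholder: run the validation, then set Status = 2.
        }
    }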
What does validation actually involve? Before working on your queuing strategy, I would try to make validation as fast as possible, including making it set-based if it isn't already.
I have recently been investigating this myself, so I wanted to mention my findings. The location of the database relative to your application is a big factor in deciding which option is faster.
I timed inserting 100 database entries versus logging the exact same data as local MSMQ messages, and averaged the results over several runs of the test.
What I found was that when the database is on the local network, inserting a row was up to 4 times faster than logging to an MSMQ.
When the database was being accessed over a decent internet connection, inserting a row into the database was up to 6 times slower than logging to an MSMQ.
So:
If the database is on the local network, the DB is faster; otherwise MSMQ is.
I have seen several websites that show you a real time update of what's going on in the database. An example could be
A stock ticker website that shows stock prices in real time
Showing data like "What other users are searching for currently.."
I'd assume this involves some kind of polling mechanism that queries the database every few seconds and renders the result on a web page. But the thought scares me from a performance standpoint.
In an application I am working on, I need to display the real time status of an operation that a user has submitted. Users wait for the process to be completed. As and when an operation is completed, the status is updated by another process (could be a windows service). Should I query the database every second to get the updated status?
It's not necessarily done in the DB; as you suggested, that's expensive. The DB may be the backing store, but typically a more efficient mechanism accompanies the polling, such as keeping the real-time status in memory in addition to eventually persisting it to the DB. You can poll memory far more cheaply than running SELECT status FROM Table every second.
Also, as I mentioned in a comment, in some circumstances you can get a lot of mileage out of forging the appearance of status updates through animations and estimation, checking the data source less often.
Edit
(an optimization that uses fewer DB resources for real-time status)
Instead of polling the database per user every X seconds to check job status, alter the behaviour slightly: each time a job is added to the database, read the database once and put metadata about all jobs into a cache. So, for example, the in-memory cache would reflect [user49 ... user3, user2, user1, userCurrent], i.e. 50 users' jobs at one job each. (Maybe I should have written it as [job49 ... job2, job1, job_current], but it's the same idea.)
Then individual users' web pages poll that cache, which is always kept current. In this example the DB was read just 50 times into the cache (once per job submission). If those 50 users wait an average of 1 minute for job processing and poll for status every second, the user base polls the cache a total of 50 users x 60 secs = 3000 times.
That's 50 database reads instead of 3000 over a period of 50 minutes (on average one per minute). The cache is always fresh and yet handles the load; it's much less scary than hitting the DB every second per user. You can store other stats and info in the cache to help with estimation and such. As long as the fresh cache provides more efficiency, it's a viable alternative to massive DB hits.
Note: by cache I mean a global store such as Application state or a third-party solution, not the ASP.NET page cache, which goes out of scope quickly. Caching via ASP.NET's page mechanisms might not suit your situation.
Another note: the DB will know when another job record is appended, no matter where it comes from, so a trigger can initiate the cache update.
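A minimal sketch of such a cache (the job id key, status string, and "unknown" default are illustrative; any global store would do):

    using System.Collections.Concurrent;

    public static class JobStatusCache
    {
        private static readonly ConcurrentDictionary<int, string> Status =
            new ConcurrentDictionary<int, string>();

        // Called once per job event (submission, progress, completion),
        // e.g. from the windows service or a trigger-driven notification.
        public static void Update(int jobId, string status)
        {
            Status[jobId] = status;
        }

        // Users' pages poll this instead of the database; it is a plain
        // in-memory read, so per-second polling is cheap.
        public static string Get(int jobId)
        {
            string s;
            return Status.TryGetValue(jobId, out s) ? s : "unknown";
        }
    }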
Even with a good database solution, that many users polling frequently is likely to create problems at the web server connection level, and you might need a different solution there, depending on traffic.
Maybe keep a cache and work against it, so you don't hit the database each time the data is modified, and flush to the database every few seconds or minutes, whatever you like.
The problem touches many layers of a web application.
On the client, you either use an iframe whose content auto-refreshes every n seconds via the meta refresh tag (HTML), or JavaScript that is triggered by a timer and updates a named div (AJAX).
On the server, you have at least two places to cache your data:
One is in the Application object, where you keep a timestamp of the last update and refresh the cached data once your refresh interval elapses (see the sketch after this list).
If you want to present data from a database, keep aggregated values or cache relevant data for faster retrieval.
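A minimal sketch of the Application-object variant, assuming a 30-second refresh interval and a placeholder LoadFromDatabase query:

    using System;
    using System.Web;

    public static class RealtimeData
    {
        private static readonly object Gate = new object();

        public static object Get(HttpApplicationState app)
        {
            lock (Gate)
            {
                var stamp = app["DataStamp"] as DateTime?;
                if (stamp == null || DateTime.UtcNow - stamp.Value > TimeSpan.FromSeconds(30))
                {
                    app["Data"] = LoadFromDatabase();   // at most one DB hit
                    app["DataStamp"] = DateTime.UtcNow; // per interval
                }
                return app["Data"];
            }
        }

        private static object LoadFromDatabase()
        {
            // Placeholder for the aggregated query mentioned above.
            return null;
        }
    }

Pages call RealtimeData.Get(HttpContext.Current.Application), so every request after the first within the interval is served from memory.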
I'm running an ASP.NET app in which I have added an insert/update query to the [global] Page_Load. So, each time the user hits any page on the site, it updates the database with their activity (session ID, time, page they hit). I haven't implemented it yet, but this was the only suggestion given to me as to how to keep track of how many people are currently on my site.
Is this going to kill my database and/or IIS in the long run? We figure that the site averages between 30,000 and 50,000 users at one time. I can't have my site constantly locking up over a database hit with every single page hit for every single user. I'm concerned that's what will happen, however this is the first time I have attempted a solution like this so I may just be overly paranoid.
Do it async.
Create a DLL that handles the update, and in Page_Load do a fire-and-forget call with the parameters.
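A minimal fire-and-forget sketch (LogActivity is a placeholder for whatever the DLL does): the page queues the work to the thread pool and returns without waiting for the insert.

    using System;
    using System.Threading;

    public partial class AnyPage : System.Web.UI.Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
            // Capture request data first; the HttpContext must not be
            // touched from the background thread.
            string sessionId = Session.SessionID;
            string path = Request.Path;

            ThreadPool.QueueUserWorkItem(_ =>
            {
                try { LogActivity(sessionId, path, DateTime.UtcNow); }
                catch (Exception) { /* trace only; never crash the pool thread */ }
            });
        }

        private static void LogActivity(string sessionId, string path, DateTime at)
        {
            // Placeholder for the actual insert/update against the database.
        }
    }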
Insert-based designs incur less locking than update-based designs.
So if a user logs in and then logs out, in an insert-based design you would have multiple rows, each with a SessionId, one per activity; in an update-based design you would have SessionId, LoginTime, and LogoutTime columns and would update the LogoutTime based on the SessionId.
I have seen many more locking and contention problems caused by update activity than by insert activity.
The trade-off is that activities such as counting and linking logins to logouts take more complex queries and a little more resources.
It goes without saying that your queries, especially the ones that run on every page, should be as fast as possible so that the site doesn't appear slow to users.
To keep track of how many users are currently on your site you could use performance counters. What you describe though sounds more like a full fledged logging of every page hit.
Let's say you really have 50k users connected at any one time.
As long as you don't have contention between the updates (trying to lock the same record), a database can sustain a very high number of inserts and updates. You need to do some capacity planning to ensure the load can be carried. 50k users visiting a page every minute gives you 50k inserts and 50k updates per minute, roughly 833 inserts and 833 updates per second, all of which have to commit (flush the log). Does your DB I/O subsystem support such write pressure, in addition to serving all the read requests?
Also, 50k users doing one page hit per minute adds up to 72 million hits per day, i.e. 72 million logging inserts. At that rate you need to plan the database's size capacity carefully and consider what kind of analysis you'll run over the collected data, since ad-hoc querying of 2 billion rows (one month of data) will get you nowhere fast (actually, quite slowly).
Doing it async can give you some relief over very short spikes, but not in the long run. If your DB system cannot handle the load, async calls just create a backlog queue in the application process (in the ASP.NET app pool), and this will grow until you run out of memory, at which point the ever-vigilant IIS will 'recycle' the app pool, losing all pending async updates.
I think updating the database in Session_Start and Session_End will do the job. That will reduce the number of statements dramatically.
I think it makes no difference whether you track hits or session begin/end; with hits you'd also need additional logic to subtract inactive users.
EDIT: Session_End is not always fired. I would suggest calling an update statement/stored procedure in the Session_Start event (in addition to the insert statement) to clean up invalid sessions.
I don't think calling this "fix routine" on every page load is necessary, because you can't count the "current no. of visitors" exactly anyway.
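A sketch of that in Global.asax; the Db helper and its methods are hypothetical stand-ins for the real statements/stored procedures:

    using System;
    using System.Web;

    public class Global : HttpApplication
    {
        protected void Session_Start(object sender, EventArgs e)
        {
            // e.g. INSERT INTO ActiveSessions (SessionId, LastSeen) VALUES (...)
            Db.InsertActiveSession(Session.SessionID, DateTime.UtcNow);

            // The "fix routine": expire rows whose Session_End never ran,
            // e.g. DELETE FROM ActiveSessions WHERE LastSeen < @cutoff.
            Db.RemoveStaleSessions(TimeSpan.FromMinutes(Session.Timeout));
        }

        protected void Session_End(object sender, EventArgs e)
        {
            // Only fires for InProc session state, and not on app recycle,
            // which is why the fix routine above is needed.
            Db.DeleteActiveSession(Session.SessionID);
        }
    }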
I would keep this in application state instead, if possible. On Application_Start, create a data structure saved to application state that you can update from anywhere in your application: Session_Start, Page_Load, wherever. Keep it out of the database. It sounds like you are only using it to track "currently online" info anyway.
If you have multiple instances of your app, or a requirement to maintain historical info beyond the IIS logs, this obviously won't work. Go with chris' fire-and-forget solution in that case.
What's wrong with IIS Logs?
2009-05-01 12:30:31 207.219.27.35 GET /assocadmin/ibb-reg.asp - usernameremoved 544.566.570.575 Mozilla/4.0+(compatible;+MSIE+7.0;+Windows+NT+6.0;+SLCC1;+.NET+CLR+2.0.50727;+Media+Center+PC+5.0;+.NET+CLR+3.5.30729;+.NET+CLR+3.0.30618) 200 0 0 40058
EDIT: I'd like to close this answer, but I want the comments to stay. Consider this answer withdrawn.
How about adding a small object to the session?
Something like LoggedInUserFlag : IDisposable
In the constructor, increment your counter however you decide to implement it.
Then in the Dispose method, decrement the counter.
This way, regardless of how the session is ended, the counter will always be (eventually) decremented.
See http://weblogs.asp.net/cnagel/archive/2005/01/23/359037.aspx for info on using IDisposable.
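A sketch of such a flag (note: ASP.NET does not dispose session items for you automatically, so something, e.g. Session_End or a Cache removal callback, still has to trigger Dispose):

    using System;
    using System.Threading;

    public sealed class LoggedInUserFlag : IDisposable
    {
        private static int _online;  // process-wide "currently online" count
        private int _disposed;       // guards against double decrement

        public LoggedInUserFlag()
        {
            Interlocked.Increment(ref _online);
        }

        public static int OnlineCount
        {
            get { return _online; }
        }

        public void Dispose()
        {
            if (Interlocked.Exchange(ref _disposed, 1) == 0)
                Interlocked.Decrement(ref _online);
        }
    }

    // Usage, e.g. in Session_Start:  Session["flag"] = new LoggedInUserFlag();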
I am not an ASP guy at all, but rather than logging all that other info, what about just inserting their IP address?
If their IP address is already in there, update a last_seen timestamp, and on each refresh delete any row that hasn't been seen in the last 10 minutes.
This is how I would take a shot at it. It is much more space-efficient, but I am not sure about doing that much checking and deleting on such a high-traffic site.
As a direct answer to your question, yes, running a database query in-line with every request is a bad idea:
Synchronous requests will tie up a thread, which will reduce your scalability (fewer simultaneous activities)
DB inserts (or updates) are writes to the DB, which will put a load on your log volume
DB accesses shouldn't be required in a single server / single AppPool scenario
I answered your question about how to count users in the other thread:
Best way to keep track of current online users
If you are operating in a multi-server / load-balanced environment, then DB accesses may in fact be required. In that case:
Queue them to a background thread so the foreground request thread doesn't have to wait
Use Resource Governor in SQL 2008 to reduce contention with other DB accesses
Collect several updates / inserts together into a single batch, in a single transaction, to minimize log disk I/O pressure (see the sketch after this list)
Return the current count with each DB access, to minimize round-trips
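A sketch of the queue-and-batch approach from the points above (the entry type, batch size, and WriteBatch body are placeholders): request threads enqueue and return immediately, and one background thread drains the queue into a single batched write.

    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Threading;

    public static class ActivityLogger
    {
        private static readonly BlockingCollection<string> Queue =
            new BlockingCollection<string>();

        // Called from the request thread; returns immediately.
        public static void Enqueue(string entry)
        {
            Queue.Add(entry);
        }

        // Start once, e.g. from Application_Start.
        public static void Start()
        {
            new Thread(() =>
            {
                var batch = new List<string>();
                while (true)
                {
                    // Block for the first item, then drain whatever else
                    // has arrived, up to a cap.
                    batch.Add(Queue.Take());
                    string next;
                    while (batch.Count < 500 && Queue.TryTake(out next))
                        batch.Add(next);

                    WriteBatch(batch); // one transaction, one log flush
                    batch.Clear();
                }
            }) { IsBackground = true }.Start();
        }

        private static void WriteBatch(List<string> batch)
        {
            // Placeholder: single round-trip INSERT of the whole batch.
        }
    }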
In case it's of any interest, I cover sync/async threading issues and the techniques above in detail in my book, along with code examples: Ultra-Fast ASP.NET.
I have a form with a list that shows information from a database. I want the list to update at run time (or almost real time) every time something changes in the database. These are the three ways I can think of to accomplish this:
Set up a timer on the client to check every few seconds: I know how to do this now, but it would involve making and closing a new connection to the database hundreds of times an hour, regardless of whether anything changed.
Build something like a TCP/IP chat server: every time a program updates the database, it would also send a message to the TCP/IP server, which in turn would send a message to the client's form. I have no idea how to do this right now.
Create a web service that returns the date and time the table was last changed, and have the client compare that time to the last time it updated: I could figure out how to build a web service, but I don't know how to do this without making a connection to the database anyway.
The second option doesn't seem like it would be very reliable, and the first seems like it would consume more resources than necessary. Is there some way to tell the client every time there is a change in the database without making a connection every few seconds, or is it not that big of a deal to make that many connections to a database?
I would imagine connection pooling would make this a non-issue. Depending on your database, it probably won't even notice it.
Are you making the update to the database? Or is the update happening from an external source?
Generally, hundreds of updates per hour won't even bother the DB. Even Access, which is pretty slow, won't cause a performance issue.
Here's a rough idea if you really want to optimize it and you're the one making the data updates. Store an application variable on the server side called, say, LastUpdateTime. When you make updates to the database, also set LastUpdateTime to the current time. Since LastUpdateTime is a very lightweight object in server memory, clients can request the last update time hundreds if not thousands of times per second without any round trip to the database. By comparing the last time the client retrieved new information with the last update time on the server, you can then decide whether to go fetch the updated info.
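A minimal sketch of that variable (held in a static rather than literally in Application state; the names are illustrative):

    using System;
    using System.Threading;

    public static class ChangeStamp
    {
        private static long _lastUpdateTicks = DateTime.UtcNow.Ticks;

        // Call this wherever you write to the table.
        public static void Touch()
        {
            Interlocked.Exchange(ref _lastUpdateTicks, DateTime.UtcNow.Ticks);
        }

        public static DateTime LastUpdate
        {
            get { return new DateTime(Interlocked.Read(ref _lastUpdateTicks), DateTimeKind.Utc); }
        }
    }

    // Client check (e.g. in a web service method): hit the database only
    // if ChangeStamp.LastUpdate is newer than the client's last sync time.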
We have a similar question: Polling database for updates from C# application. Another idea (maybe not a proper solution) would be to use the Microsoft Sync Framework, with a timer to sync the DB.