Running a query in Page Load a bad idea? - asp.net

I'm running an ASP.NET app in which I have added an insert/update query to the [global] Page_Load. So, each time the user hits any page on the site, it updates the database with their activity (session ID, time, page they hit). I haven't implemented it yet, but this was the only suggestion given to me as to how to keep track of how many people are currently on my site.
Is this going to kill my database and/or IIS in the long run? We figure that the site averages between 30,000 and 50,000 users at one time. I can't have my site constantly locking up over a database hit with every single page hit for every single user. I'm concerned that's what will happen, however this is the first time I have attempted a solution like this so I may just be overly paranoid.

Do it Async.
Create a dll that handles the update, and in the page load do a fire and forget with parameters.

Insert-Based designs have less locking than Update-Based designs.
So if a user logged-in and then logged-out, in an Insert-Based design you would have multiple rows with a SessionID in each, one for each activity whereas in an Update-Based design, you would have a SessionId, LoginTime and a LogoutTime column and you would update the LogoutTime based on the SessionId.
I have seen many more locking and contention problems caused by Update activity more than Insert activity.
Activities such as counting and linking logins to logouts etc take more complex queries and a little more resources.
It goes without saying that your queries, especially the ones that run on every page, should be as fast as possible so that the site doesn't appear slow to users.

To keep track of how many users are currently on your site you could use performance counters. What you describe though sounds more like a full fledged logging of every page hit.
Lets say you realy have 50k users connected at any one time.
As long as you don't have contention between the updates (trying to lock the same record) a database can track a very high number of inserts and updates. You need to do some capacity planning to assure the load can be carried. 50k users visiting a page every minute will give you 50k inserts and 50k updates per minute, roughly 850 inserts and 850 updates per second, which have to commit (flush the log). Does your DB I/O subsytem support such a write pressure load, in addition to responding to all the requests (reads)?
Also 50k users doing 1 page hit per minute adds up to 72 mil hits per day, 72 mil. logging inserts, at such a rate you need to carefully plan the size capacity of the database and consider what kind of analysis you'll do on the collected data since querying ad-hoc 2 billion rows (one month data) will get you nowhere fast (actually... quite slow).
Doing it async can give you some relief over very short spikes, but not on the long run. If your DB system cannot handle the load then doing async calls will just create a backlog queue in the application process (in the ASP app pool) and this will grow until out of memory, at which moment the all vigilant IIS will 'recycle' the app pool, thus loosing all pending async updates.

I think updating the database in the begin session and end session will do the job. that will reduce the count of statements dramatically.
I think it makes no difference if you track hits or begin/end session. with hits you'll also need additional logic to subtract inactive users
EDIT: session end is not fired always. I would suggest to call an update statement/stored procedure in another session begin event (in addition to the other insert statement) that will fix invalid sessions.
I don't think that calling this "fix routine" is necessary in every page load event because I think you cant exactly count "current no. of visitors".

I would keep this in Application state instead - if possible. On ApplicationStart create some data structure saved to App state that you can update from anywhere in your application - session start, page load, wherever. Keep it out of the database. You are just using it to track "currently online" info anyway it sounds like.
If you have multiple instances of your app, or if there is a requirement to maintain historical info beyond the IIS logs, this won't work obviously. Go with chris' fire-and-forget solution in that case.

What's wrong with IIS Logs?
2009-05-01 12:30:31 207.219.27.35 GET /assocadmin/ibb-reg.asp - usernameremoved 544.566.570.575 Mozilla/4.0+(compatible;+MSIE+7.0;+Windows+NT+6.0;+SLCC1;+.NET+CLR+2.0.50727;+Media+Center+PC+5.0;+.NET+CLR+3.5.30729;+.NET+CLR+3.0.30618) 200 0 0 40058
EDIT: I'd like to close this answer, but I want the comments to stay. Consider this answer withdrawn.

How about adding a small object to the session?
Something like LoggedInUserFlag:IDisposable
In the constructor, increment your counter however you decide to implement it.
Then in the Dispose method, decrement the counter.
This way, regardless of how the session is ended, the counter will always be (eventually) decremented.
see:
http://weblogs.asp.net/cnagel/archive/2005/01/23/359037.aspx
for info on using IDisposable.

I am not an ASP guy at all, but what about rather than logging all that other info, and insert their IP address?
If they have an IP address already in there, have a last_seen timestamp, and on each refresh just delete any row that isn't 10 minutes ago?
This is how I would take a shot at it. It is much more space efficient, but I am not sure about the checking and deleting so much on such a high profile site.

As a direct answer to your question, yes, running a database query in-line with every request is a bad idea:
Synchronous requests will tie up a thread, which will reduce your scalability (fewer simultaneous activities)
DB inserts (or updates) are writes to the DB, which will put a load on your log volume
DB accesses shouldn't be required in a single server / single AppPool scenario
I answered your question about how to count users in the other thread:
Best way to keep track of current online users
If you are operating in a multi-server / load-balanced environment, then DB accesses may in fact be required. In that case:
Queue them to a background thread so the foreground request thread doesn't have to wait
Use Resource Governor in SQL 2008 to reduce contention with other DB accesses
Collect several updates / inserts together into a single batch, in a single transaction, to minimize log disk I/O pressure
Return the current count with each DB access, to minimize round-trips
In case it's of any interest, I cover sync/async threading issues and the techniques above in detail in my book, along with code examples: Ultra-Fast ASP.NET.

Related

Where do I cache ASP.NET data to avoid Application state being reset?

I have an application that performs complex queries against what amounts to data organized in a "star schema". The gold-owner keeps adding new "axes" to perform searches on, with the result that performance becomes worse over time. Currently, the execution of a search operation, using a stored procedure to do all the work on the SQL server, takes about 2 seconds, which doesn't fit the gold-owner's desire to have the code be interactive (<0.1 sec response time). Looking at the SQL Server query analyzer, the search is IO-bound on 9 table scans of 100,000 records, and then doing brutal joins. Due to the nature of the queries I need to perform and the limitations of SQL, this cannot be improved.
In desperation, I've rewritten the query processor so that it sucks in the 100,000 records into a cache at application start, then perform the complex queries against the cached memory. Loading all the records from the database takes about 12 seconds. This expensive initial load is mitigated by my rewritten query processor. It now only needs to do a single scan through the records, and gives a response time of 0.02 seconds.
This good news is tainted by the gold-owner's discovery that the 12-second hit for populating the cache is being experienced every hour or so. I'm currently storing the data in the ASP.NET application state, as Application["FactTable"]. It seems the application state is being reset after the ASP.NET application is idle for longer than a dozen minutes or so.
If I move the 100,000 records into the ASP.NET application cache, will I be experiencing these evictions just as often, or can I rely on the data remaining in memory for the fast retrievals for longer periods of time? If the ASP.NET cache is also victim to application resets, what other mechanism should I use? A separate app domain hosting an instance of my database cache comes to mind, but I don't want to go down that route unless my other options are closed off.
I realise that you have a lot of data and processing and you must have tried a few things to speed this scenario up, but using Application State which is managed by IIS will be volatile...
Have you thought of running the your calculations etc in another process, ie, create a windows service that periodically runs the queries to organise your data and save that "flat" data to a database cache. When the user requests the data, they will just get the last DB cached results... and then further speed this up by holding those results in the Application state which can just refresh itself if that gets destroyed?

ASP Website runs slow when number of users Increases

I need some information from you.I have used session.TimeOut=540 in application.Is that effects on my Application performance after some time.When number of users increases its getting very slow. response time nearly more that 2 minutes for a button click also.This is hosted in server in Application pool .I don't know about Application pool much.If Session Timeout is the problem i will remove it.Please suggest me the way to for more users.
Job Numbers,CustomerID,Tasks will come from one database.when the user click start Button then the data saved in another Database.I need this need to be faster for more Users
I think that you have some page(s) that make some work that takes time, or for some reason or a bug is keep open for more time than the usual.
This page is keep lock the session and hold the rest page from response because the session holds all the pages.
Now, together with the increase of the timeout this page is lock everything and here is you response time near to 2 minutes.
The solution is to locate the page that have the long running problem and fix it or make it faster by optimize the process, or if this page must keep the long time running, then disable the session for that one.
relative:
Web app blocked while processing another web app on sharing same session
What perfmon counters are useful for identifying ASP.NET bottlenecks?
Replacing ASP.Net's session entirely
Trying to make Web Method Asynchronous
Does ASP.NET Web Forms prevent a double click submission?
About server
Now from the other hand, if your server suffer from hardware, or bad setup then here is one other answer with points that you need to check to make it faster.
Find out where the time is spent
add the StopWatch in the method which you said "more that 2 minutes for a button click". you can find which statment spent the most time.
If it is a query on DB that cost time. Check your sql statement.
are you using "SELECT Count(*)" instead of "SELECT Count(Id)"? the * is always slower. also, don't try "SELECT * FROM...."
Use cache.
there are many ways to do cache. both in ASPX pages and your biz layer.
the OutputCache is the most easy way.
and also, cache the page (for example a blog post) on the first time when a user visit it.
Did you use memory paging?
be careful when doing paging on gridview or other list. If you just call DataSource=xxx and DataBind(), even with PagedDataSource, this is likely a memory paging. It cost a lot of performance. Please use stored procedures to do paging.
Check your server environment
where did you deploy the website? many ISP will limit brandwide and IIS connection count and also CPU time to your account.
if you have RD access to your server. you can watch CPU and memory usage to see if they are high when many user comes to your site. If the site is slow and neither CPU nor memory useage is high, it may be a network brandwide problem.
Here are some simple steps to narrow down the issue -
1) Get HTTPWatch (theres a free Basic version) available and check whats really taking time from an end user perspective. Look at number of requests, number of resources downloaded, and the payload. If there is nothing to worry move on to next
2) If its not client, then its usually the processing time on the server. Jump on to DB first - since this is quite easier to eliminate quickly. Look at how many DB calls are made (run profiler in staging or dev) and see if there are any long running queries, missing indexes or statistics, and note the IO. If all is well, move on
3) Check your app code. You could get on with VS.NET in build profiler or professional tools such as Ants. If code is fine then its your network or external calls that you make, check your network bandwidth. If you still cannot narrow down, check your environment/hardware
The best way to get to it is to apply load - You could use simple tools such as ab.exe (that comes as part of Apache Web server) to have concurrent hits on your server and run the App, DB profilers in the background to get to the issue.
Hope this helps!

Session state blocking async ajax call from processed concurrently

I am trying to make 6 asynchronous jQuery ajax calls to my .NET Page Method all at once on document.ready to request for different sets of data from the database and in return render as charts to users.
Problem is that when one chart takes a long time to generate, it locks up generation of the next 5 charts, for instance, when each chart takes 1 min to generate, the user will be approx waiting for 6 mins, instead of 1 - 2 mins which i thought it will be when using async ajax calls and page method gets processed in parallel.
After reading a lot of useful posts in this forum, i found that this is because I have to read and write to session objects within the page methods, and asp.net will lock the whole request, as a result making them run sequentially.
I have seen people suggesting to set the session state to read only in #Page tag, but it will not address my problem because i need write to the session as well. I have considered moving from inProc session to sql database session, but my session object is not serializable and is used across the whole project. I also cannot change to use Cache instead because the session contains user specific details.
Can anyone please help and point me to the right direction? I have been spending days to investigate this page inefficiency and still haven't yet found a nice way yet.
Thanks in advance
From my personal experience, switching to SQL session will NOT help this problem as all of the concurrent threads will block in SQL as the first thread in will hold an exclusive lock on one or more rows in the database.
I'm curious as to why your session object isn't serializable. The only solution that I can think of is use a database table to store the user specific data that you are keeping in session and then only holding onto a database lock for as long as it takes you to update the user data.
You can use the ASP.NET session id or other unique cookie value as the database key.
The problem may not be server side at all.
Browsers have a built in limit on how many concurrent HTTP requests they will make - this is part of the HTTP/1.1 spec which sugests a limit of 2.
In IE7 the limit is 2. in IE8 it is 6. But when a page loads you could easily hit 6 due to the concurrent requests for CSS, JS, images etc.
A good source of info about these limits is BrowserScope (see Connections per Hostname column).
What about combining those 6 requests into 1 request? This will also load a little faster.

How to implement real time updates in ASP.NET

I have seen several websites that show you a real time update of what's going on in the database. An example could be
A stock ticker website that shows stock prices in real time
Showing data like "What other users are searching for currently.."
I'd assume this would involve some kind of polling mechanism that queries the database every few seconds and renders it on a web page. But the thought scares me when I think about it from the performance standpoint.
In an application I am working on, I need to display the real time status of an operation that a user has submitted. Users wait for the process to be completed. As and when an operation is completed, the status is updated by another process (could be a windows service). Should I query the database every second to get the updated status?
It's not necessarily done in the db. As you suggested that's expensive. Although db might be a backing store, likely a more efficient mechanism is used to accompany the polling operation like storing the real-time status in memory in addition to finally on the db. You can poll memory much more efficiently than SELECT status from Table every second.
Also as I mentioned in a comment, in some circumstances, you can get a lot of mileage out of forging the appearance of status update through animations and such, employ estimation, checking the data source less often.
Edit
(an optimization to use less db resources for real time)
Instead of polling the database per user to check job status every X seconds, slightly alter the behaviour of the situation. Each time a job is added to the database, read the database once to put meta data about all jobs in the cache. So , for example, memory cache will reflect [user49 ... user3, user2, user1, userCurrent] 50 user's jobs if 1 job each. (Maybe I should have written it as [job49 ... job2, job1, job_current] but same idea)
Then individual users' web pages will poll that cache which is always kept current. In this example the db was read just 50 times into the cache (once per job submission). If those 50 users wait an average 1 minute for job processing and poll for status every second then user base polls the cache a total of 50 users x 60 secs = 3000 times.
That's 50 database reads instead of 3000 over a period of 50 min. (avg. one per min.) The cache is always fresh and yet handles the load. It's much less scary than considering hitting the db every second for each user. You can store other stats and info in the cache to help out with estimation and such. As long as fresh cache provides more efficiency then it's a viable alternative to massive db hits.
Note: By cache I mean a global store like Application or 3rd party solution, not the ASP.NET page cache which goes out of scope quickly. Caching using ASP.NET's mechanisms might not suit your situation.
Another Note: The db will know when another job record is appended no matter from where, so a trigger can init the cache update.
Despite a good database solution, so many users polling frequently is likely to create problems with web server connections and you might need a different solution at that level, depending on traffic.
Maybe have a cache and work with it so yo don't hit the database each time the data is modified and update the database every few seconds or minutes or what you like
The problem touches many layers of a web application.
On the client, you either use an iframe whose content autorefreshes every n seconds using the meta refresh tag (HTML), or a javascript which is triggered by a timer and updated a named div (AJAX).
On the server, you have at least two places to cache your data:
One is in the Application object, where you keep a timestamp of the last update, and refresh the cached data as your refresh interval elapses.
If you want to present data from a database, keep aggregated values or cache relevant data for faster retrieval.

How to find how many people are online in a website

I have to show how many people are online in that site. The site has developed by using ASP.NET 2.0. Currently I am using the Session start (increase by 1) and Session End Event (decrease by 1) in Global.asax. But the Session_End Event not calling properly most of the times. Is there any better way to achieve this?
You're not going to get much better then incrementing in Session_Start and decrementing in Session_End unless you use some other means like a database.
When you are authenticating your requests you could update a timestamp in the database to show that the user has been active, and after a given time (say 15 minutes), the query that collects the number of concurrent users ignores that row in the database (thus decrementing the count).
A quick Google search revealed a handy way of doing this with a HttpModule.
All in all, Yohann did a great job with this. It does implement a set timeout that was suggested above, but otherwise there is no set way of doing this accurately outside of checking the server's perfmon.exe and checking the WebService >> WebAppPool's count of current connections.
When I implemented this myself I used a SQL Server table to store a date/time and user info on authentication. I decremented the count by re-assesing and matching the IP addresses whenever I had to refresh the data cache (once every 10 mins).
We have the same issue in a project, after tried several methods, we end with tracking idle time of each user, when the idle time is over the session timeout, we consider the user is not online anymore. of course you also need to consider the other issue such as the user log off, or log back in after timeout...
I've done this before and can attest that Session_End will only be called if you manually destroy the session (Session.Abandon). When you think about it, it makes sense. If the user isn't on the website, the code never gets executed.
What I did was store a hashtable in Application state that contained the Username and a Datetime for the last time the user was seen on the site. For every page load I would call a function that would either insert or update this value for the current user. Then it would cull the entire list and remove all of the entries that are older than the session timeout (20 minutes or whatever). Remember to use lock or sync to avoid race conditions when making changes to this list.
This has the added benefit of not only knowing how many people, but specifically which users.
If you don't have something unique like a Username, you can use Session.SessionID instead. It should be unique per visitor of your site.
But be careful, using an Application or App Instance state variable has its own share of problems since it won't share between processes in "Web Garden Mode" or in a multi-server setup. You would need a more persistent medium such as a database or distributed cache for larger scale setups.

Resources