Is there any way to monitor a single class of object in the cache? - asp.net

I am trying to determine which implementation of the data structure would be best for my web application. Basically, I will maintain one "State" object for each unique user; the State is cached for some time when the user logs in, and after the non-sliding expiration period it is saved to the DB. So, to balance the DB load against IIS memory, I have to determine the best (expected) timeout for the cache.
My question is: how do I monitor cache activity for one particular set of objects? I tried perfmon, which gives roughly the percentage of the total memory limit used, but no idea of size (even better would be a list of all cached objects together with their size and other performance data).
One last thing: I expect the application to handle 100,000+ cached users, each of whom may make a request every 10-60 seconds, so performance does matter to me.

What exactly are you trying to measure here? If you just want the size of your in-memory State instances at any given time, you can use an application-level counter and add/subtract every time you create/remove an instance of State. Since you know the size of one State, the counter tells you how many State instances you have. And if you already count on getting 100,000+ users each requesting at least once per minute, you can actually do the math.
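A minimal sketch of such a counter, assuming a State class you control (the Release hook is hypothetical; call it from whatever code evicts the instance or saves it to the DB):

    using System.Threading;

    public class State
    {
        // Application-wide count of live State instances.
        private static long _liveCount;

        public State()
        {
            Interlocked.Increment(ref _liveCount);
        }

        // Call when the instance is evicted from the cache or saved to the DB.
        public void Release()
        {
            Interlocked.Decrement(ref _liveCount);
        }

        public static long LiveCount
        {
            get { return Interlocked.Read(ref _liveCount); }
        }
    }

Multiply LiveCount by the measured size of one representative State instance to estimate how much memory the cached set is using.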

Related

How to handle offline aggregation using Firestore?

I have been scouring the internet for days for a solution to this problem: how do you handle aggregation when there is no network connection? I have a task management app that aggregates metadata about user tasks. For example, a task can contain tags that are aggregated and shown to the user in a daily dashboard. This would be easy if the user were always online, since I could use a transaction or a Cloud Function to aggregate, but when the user is offline the aggregation will appear incorrect until the network connection is restored.
Aggregation queries are explained here:
https://firebase.google.com/docs/firestore/solutions/aggregation
Which states a limitation:
Offline support - Client-side transactions will fail when the user's device is offline, which means you need to handle this case in your app and retry at the appropriate time.
However, there has yet to be any example or documentation on how to 'handle this case'. How would I go about addressing this problem?
Some thoughts:
I could cache the item if a transaction fails; the cached item would then be aggregated on top of the stored aggregation. However, going down this route would mean I can't take advantage of Firestore's "offline mode", because I'd be using my own cache for every write while offline anyway.
I could aggregate on demand, i.e. never store the aggregation. This would be very read-heavy, depending on how many tasks a user has. Furthermore, if the aggregation needs to be shared as insights with other users, this option will not work, because other users do not have access to the tasks.
I'm at a loss and any help would be appreciated, thanks!
After a lot of research and trial and error I found a solution that can address this problem gracefully.
FieldValue.increment to the rescue.
What FieldValue.increment does is bypass the need for a transaction while respecting Firestore's default offline cache behaviour. It requires using set or update on the field directly. The drawback is that you can't use 'withConverter' on the collection for type safety. I'm willing to live with that, considering how useful FieldValue.increment is.
I've done multiple tests and can confirm that the values can be incremented/decremented multiple times locally while offline. This offline value is reflected in a get or snapshot call to the cache. When the network connection is restored, the values are updated on the server.
The value itself is not stored in the cache; the cache simply stores the "difference" in the FieldValue sentinel for when it is time to update the server.
This method only works for incrementing and decrementing values; storing averages directly is not possible, because the true total number of items is not known at the time of calculation while offline.
Instead, the total item count is stored alongside the total value, and the average is calculated when and as needed. This way the average is always accurate from a local perspective when offline, and it is also accurate online once the total value and count have been synced.
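To illustrate, here is a minimal sketch using the .NET SDK (Google.Cloud.Firestore); the collection and field names are hypothetical, and note that the offline queueing described above is a feature of the client SDKs (JS/Android/iOS), which expose the same increment sentinel:

    using System.Collections.Generic;
    using System.Threading.Tasks;
    using Google.Cloud.Firestore;

    public static class DashboardAggregates
    {
        // Increment the running total and the item count without a
        // transaction; compute the average as totalValue / itemCount on read.
        public static Task AddTaskAsync(FirestoreDb db, string userId, double taskValue)
        {
            DocumentReference stats = db.Collection("dashboards").Document(userId);
            return stats.SetAsync(new Dictionary<string, object>
            {
                { "totalValue", FieldValue.Increment(taskValue) },
                { "itemCount", FieldValue.Increment(1) },
            }, SetOptions.MergeAll);
        }
    }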

Firebase database high delay after a long standby

I'm currently testing Firebase on a non-production Firebase app that I am the only one working on.
When I query the database after no queries have been made for the last 24 hours, the query takes about 8 seconds. Once a query has been made, subsequent queries take a normal amount of time (about 100ms).
This is not about query caching; by "next queries" I mean new, different queries.
To reproduce it:
Create a database node called users, whose children are user records (first name, last name, age, gender, etc.)
Add 500,000 users to this node
Get a user by its UID and measure the time. (It should take about 100ms)
Wait 24 hours (I don't know the exact threshold, but 24 hours is certainly enough)
Get any user by its UID and measure the time. (It should take about 8sec)
Get any user by its UID and measure the time. (It should take about 100ms)
Is this a known issue with the Firebase Realtime Database?
I reached out to Firebase support; they were able to reproduce the issue and observed a wait time of about 6 seconds. Here is their answer after the investigation:
It looks like this is intended behavior. Realtime Database queries work by building the index in-memory, which takes time linear in the number of nodes at that location. Once the index is built things are very fast, but the initial build can take a while, especially for large locations.
If you want the index to stay in memory on the database, you should have a listener always listening for this query.
So basically the database takes a long time to process the query because of indexing the large database.
The problem can be solved by keeping a listener on the database or querying the database every few hours.
In production you are unlikely to face this problem, because the database is being accessed by users all the time; but if your database is not accessed continuously and you don't want users to experience that long wait, you should use one of the solutions above.
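A minimal sketch of the periodic-query variant, using the Realtime Database REST API from a small background worker (the URL, UID and interval are placeholders, and you may need to append an auth token):

    using System;
    using System.Net.Http;
    using System.Threading;
    using System.Threading.Tasks;

    public static class RtdbWarmup
    {
        private static readonly HttpClient Http = new HttpClient();

        // Periodically re-run the kind of query the app performs so the
        // in-memory index for the large location stays built.
        public static async Task RunAsync(CancellationToken ct)
        {
            while (!ct.IsCancellationRequested)
            {
                await Http.GetStringAsync(
                    "https://your-app.firebaseio.com/users/some-known-uid.json");
                await Task.Delay(TimeSpan.FromHours(3), ct);
            }
        }
    }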
Firebase keeps recently used data in its internal cache. This cache is cleared after a few minutes.
But the exact numbers depend on how much data you're loading and how you're loading that data. Without seeing a specific setup that shows how to reproduce these numbers there really isn't much anyone can say.

Storing DataSet in Session

I have read a lot about storing a DataTable/DataSet in Session or ViewState, and the general consensus seems to be that it's not a good idea because it slows down the page, although it has its advantages.
I am making a website that allows users to create/manage/host quizzes. I want to retrieve a certain (configurable) number of questions from the database and store them in a DataTable maintained in Session; the maximum number of questions is 120.
So the total data stored in Session = 120 questions + options + correct answers, along with minor things like the candidate's score and user data.
My question is: considering a maximum of 120 questions, will this much data seriously affect the performance of my page? If so, what would be an alternative approach? Thanks.
The view state is certainly not a good place to store this kind of data, as the data set would be serialized into a (very long) string and round-tripped to the client on every request. In general, I'd avoid view state whenever possible (in my opinion it has far more disadvantages than advantages).
Storing the data set in session should be fine as long as you use in-process session mode. In this case session data is maintained in memory and the only disadvantage of this approach (as I see it without going into specifics of your task) is that sometimes you might consider releasing the memory that is not needed at the moment. If you're using SQL Server to store session data, I'd choose another storage as serializing/deserializing your data set in this case will probably have an impact on the app performance.
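A minimal sketch of the in-proc approach (the page and helper names are hypothetical):

    using System;
    using System.Data;
    using System.Web.UI;

    public partial class QuizPage : Page
    {
        private const string QuestionsKey = "QuizQuestions";

        protected void Page_Load(object sender, EventArgs e)
        {
            DataTable questions = Session[QuestionsKey] as DataTable;
            if (questions == null)
            {
                // Load once per user session; at most 120 rows.
                questions = LoadQuestionsFromDb(120);
                Session[QuestionsKey] = questions;
            }
            // ... bind questions to the page ...
        }

        // DB access elided; returns Question/Options/CorrectAnswer rows.
        private DataTable LoadQuestionsFromDb(int maxQuestions)
        {
            throw new NotImplementedException();
        }
    }

Releasing the memory when the quiz ends is then a one-liner: Session.Remove(QuestionsKey).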
That depends on where you want to put the limit on the application.
Storing the data in memory will limit the application to a certain number of simultaneous users.
Getting the data from the database each time will lower the number of requests that the application can handle per second.
As with any optimisation, you should only optimise when you know what the need is, so I would suggest that you just get the data from the database, and add caching when there actually is a need for it.

ASP.NET data caching design

I have a method in my BLL that interacts with the database and retrieves data based on the defined criteria.
The returned data is a collection of FAQ objects which is defined as follows:
FAQID,
FAQContent,
AnswerContent
I would like to cache the returned data to minimize the DB interaction.
Now, based on the user selected option, I have to return either of the below:
ShowAll: all data.
ShowAnsweredOnly: faqList.Where(f => f.AnswerContent != null)
ShowUnansweredOnly: faqList.Where(f => f.AnswerContent == null)
My Question:
Should I cache only the full result set returned from the DB (e.g. FAQ_ALL) and derive the other two modes by filtering the cached list (one DB round-trip, in-memory filtering for the other two modes)? Or should I keep three cache items, FAQ_ALL, FAQ_ANSWERED and FAQ_UNANSWERED (one DB round-trip per mode), and return the matching cache item for each mode?
I'd be pleased if anyone tells me about pros/cons of each approach.
Food for thought.
How many records are you caching, how big are the tables?
How much mid-tier resources can be reserved for caching?
How many of each type data exists?
How fast will filtering on the client side be?
How often does the data change?
how often is it changed by the same application instance?
how often is it changed by other applications or server side jobs?
What is your cache invalidation policy?
What happens if you return stale data?
Can you/Should you leverage active cache invalidation, like SqlDependency or LinqToCache?
If the dataset is large then filtering on the client side will be slow, and you'll need to cache two separate results (no need for a third if ALL is the union of the other two). If the data changes often then the cache will frequently return stale items unless proactive cache invalidation is in place. Active cache invalidation is achievable in the mid-tier if you control all the update paths and there is only one mid-tier application instance, but becomes really hard if either prerequisite is not satisfied.
It basically depends how volatile the data is, how much of it there is, and how often it's accessed.
For example, if the answered data didn't change much then you'd be safe caching that for a while; but if the unanswered data changed a lot (and more often) then your caching needs might be different. If this was the case it's unlikely that caching it as one dataset will be the best option.
It's not all bad though - if the discrepancy isn't too huge then you might be OK caching the lot.
The other point to think about is how the data is related. If the FAQ items toggle between answered and unanswered then it'd make sense to cache the base data as one - otherwise the items would be split where you wanted it together.
Alternatively, work with the data in-memory and treat the database as an add-on...
What do I mean? Well, typically the user will hit "save" this will invoke code which saves to the DB; when the next user comes along they will invoke a call which gets the data out of the DB. In terms of design the DB is a first class citizen, everything has to go through it before anyone else gets a look in. The alternative is to base the design around data which is held in-memory (by the BLL) and then saved (perhaps asynchronously) to the DB. This removes the DB as a bottleneck but gives you a new set of problems - like what happens if the database connection goes down or the server dies with data only in-memory?
Pros and Cons
Getting all the data in one call might be faster (fewer calls).
Getting all the data at once if it's related makes sense.
Granularity: data that is related and has a similar "cachability" can be cached together, otherwise you might want to keep them in separate cache partitions.
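For concreteness, a minimal sketch of the single-item option (cache FAQ_ALL, filter in memory); the type and helper names are hypothetical and the 10-minute expiry is an arbitrary placeholder:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Web;
    using System.Web.Caching;

    public enum FaqMode { ShowAll, ShowAnsweredOnly, ShowUnansweredOnly }

    public class Faq
    {
        public int FaqId { get; set; }
        public string FaqContent { get; set; }
        public string AnswerContent { get; set; }
    }

    public class FaqRepository
    {
        private const string CacheKey = "FAQ_ALL";

        public IEnumerable<Faq> GetFaqs(FaqMode mode)
        {
            List<Faq> all = HttpRuntime.Cache[CacheKey] as List<Faq>;
            if (all == null)
            {
                all = GetFaqsFromDb(); // the single DB round-trip
                HttpRuntime.Cache.Insert(CacheKey, all, null,
                    DateTime.UtcNow.AddMinutes(10), Cache.NoSlidingExpiration);
            }

            switch (mode)
            {
                case FaqMode.ShowAnsweredOnly:
                    return all.Where(f => f.AnswerContent != null);
                case FaqMode.ShowUnansweredOnly:
                    return all.Where(f => f.AnswerContent == null);
                default:
                    return all;
            }
        }

        // DB access elided.
        private List<Faq> GetFaqsFromDb() { return new List<Faq>(); }
    }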

Best way to keep track of current online users

I have a requirement that my site always display the number of users currently online. For example, "35741 Users Currently Online". This is not based on a login, simply how many users are currently on my site. I have tried using Session Start/Session End for this; however, Session End is not reliable, so I get inflated numbers: Session Start adds to the count, but Session End doesn't fire to remove from it.
There is no additional information to be gathered from this (reporting, etc), it's simply requested that the number show up. Very simple request that's turning into a huge deal. Any help is appreciated.
EDIT:
I should specify that I have also tried using a database for this. Simple table that contains a session ID and a last activity column. With each page hit, I check to see if the session is in my database. If not, insert. If so, update with activity time. Then I run a procedure that sweeps the database looking for sessions with no activity in the last 20 minutes. This approach seemed to kill my SQL server and/or IIS. Had to restart the site.
The best way is what you're doing, but time users out based on activity: if a given session hasn't accessed a page within 5 minutes or so, consider the user no longer active.
If you're using ASP.Net membership, take a look at GetNumberOfUsersOnline.
For every user action that you can record, you consider the user "online" for a certain window of time; depending on the site, you might set that to 5 minutes. The actual web request should take less than a second, so you have to make an assumption about how long a user might stay on a page doing nothing while still being considered online.
This approach requires that you keep track of the time of each user's last activity.
Use Performance Counters:
State Server Sessions Active: the number of active user sessions.
Expanding on what silky said in his answer: since HTTP is stateless, to determine whether a user is currently "online" you can really only track how long it has been since they last accessed your site, and decide what gap between requests you still consider active.
Since you stated that this isn't based on users logging in, maybe it's as simple as counting how many distinct IP addresses you received requests from in the past 5 minutes (or however long you consider the "online" timeout to be).
Don't use sessions for this unless you also need sessions for something else; it's overkill otherwise.
Assuming a single-server installation, do something like this:
For each user, issue a cookie that contains a unique ID
Maintain a static table of unique IDs and their last access time
In an HttpModule (or Global.asax), enter new users into the table and update their access times (use appropriate locking to prevent race conditions)
Periodically, either from a background thread or in-line with a user request, remove entries from the table that haven't made a request within the last N minutes. You might also want to support an explicit "log out" feature.
Report the number of people online as the size of the table
If you do use sessions, you can use the Session ID as the unique identifier. However, keep in mind that Session IDs aren't issued until you store something in the Session dictionary, unless you have a Session_Start() event configured.
In a load balanced or web garden scenario, it gets a little more complicated, but you can use the same basic idea, just persisting the info in a database instead of in memory.
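A minimal single-server sketch of that approach (the cookie name, window and inline cleanup are placeholders; a ConcurrentDictionary stands in for the "appropriate locking"):

    using System;
    using System.Collections.Concurrent;
    using System.Linq;
    using System.Web;

    public class OnlineUsersModule : IHttpModule
    {
        // Cookie ID -> last access time; static, so shared app-wide.
        private static readonly ConcurrentDictionary<string, DateTime> Users =
            new ConcurrentDictionary<string, DateTime>();

        private static readonly TimeSpan Window = TimeSpan.FromMinutes(5);

        // Report this as the number of people online.
        public static int OnlineCount
        {
            get { return Users.Count(u => DateTime.UtcNow - u.Value < Window); }
        }

        public void Init(HttpApplication app)
        {
            app.BeginRequest += delegate(object sender, EventArgs e)
            {
                HttpContext ctx = ((HttpApplication)sender).Context;
                HttpCookie cookie = ctx.Request.Cookies["uid"];
                if (cookie == null)
                {
                    cookie = new HttpCookie("uid", Guid.NewGuid().ToString());
                    ctx.Response.Cookies.Add(cookie);
                }
                Users[cookie.Value] = DateTime.UtcNow;

                // Opportunistic cleanup; could also run on a background timer.
                foreach (var stale in Users.Where(
                    u => DateTime.UtcNow - u.Value > Window).ToList())
                {
                    DateTime removed;
                    Users.TryRemove(stale.Key, out removed);
                }
            };
        }

        public void Dispose() { }
    }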
When the user logs in, write his user name into HttpContext.Current.Cache with a sliding expiration (say 20 minutes).
Then, in Global.asax.cs, in Application_PreRequestHandlerExecute, "touch" the cache entry for the current user so the sliding expiration resets.
When a user explicitly logs out, remove his username from HttpContext.Current.Cache.
If you do this, at any given time HttpContext.Current.Cache.Count will give you the # of current users.
Note: this is assuming you aren't using the Cache for other purposes.
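A minimal sketch of this technique (the method names are hypothetical; as the note above says, it only works if the Cache holds nothing else):

    using System;
    using System.Web;
    using System.Web.Caching;

    public static class OnlineUsers
    {
        // Insert or refresh the entry; the 20-minute sliding window restarts
        // on each call (e.g. from Application_PreRequestHandlerExecute).
        public static void Touch(string userName)
        {
            HttpContext.Current.Cache.Insert(
                userName, DateTime.UtcNow, null,
                Cache.NoAbsoluteExpiration, TimeSpan.FromMinutes(20));
        }

        public static void Logout(string userName)
        {
            HttpContext.Current.Cache.Remove(userName);
        }

        // Valid only while nothing else is stored in the Cache.
        public static int Count
        {
            get { return HttpContext.Current.Cache.Count; }
        }
    }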
