Need suggestion for ASP.Net in-memory queue - asp.net

I've a requirement of creating a HttpHandler that will serve an image file (simple static file) and also it'll insert a record in the SQL Server table. (e.g http://site/some.img, where some.img being a HttpHandler) I need an in-memory object (like Generic List object) that I can add items to on each request (I also have to consider a few hundreds or thousands requests per second) and I should be able unload this in-memory object to sql table using SqlBulkCopy.
List --> DataTable --> SqlBulkCopy
I thought of using the Cache object. Create a Generic List object and save it in the HttpContext.Cache and insert every time a new Item to it. This will NOT work as the CacheItemRemovedCallback would fire right away when the HttpHandler tries to add a new item. I can't use Cache object as in-memory queue.
Anybody can suggest anything? Would I be able to scale in the future if the load is more?

Why would CacheItemRemovedCalledback fire when you ADD something to the queue? That doesn't make sense to me... Even if that does fire, there's no requirement to do anything here. Perhaps I am misunderstanding your requirements?
I have quite successfully used the Cache object in precisely this manner. That is what it's designed for and it scales pretty well. I stored a Hashtable which was accessed on every app page request and updated/cleared as needed.
Option two... do you really need the queue? SQL Server will scale pretty well also if you just want to write directly into the DB. Use a shared connection object and/or connection pooling.

How about just using the Generic List to store requests and using different thread to do the SqlBulkCopy?
This way storing requests in the list won't block the response for too long, and background thread will be able to update the Sql on it's own time, each 5 min so.
you can even base the background thread on the Cache mechanism by performing the work on CacheItemRemovedCallback.
Just insert some object with remove time of 5 min and reinsert it at the end of the processing work.

Thanks Alex & Bryan for your suggestions.
Bryan: When I try to replace the List object in the Cache for the second request (now, count should be 2), the CacheItemRemovedCalledback gets fire as I'm replacing the current Cache object with the new one. Initially, I also thought this is weird behavior so I gotta look deeper into it.
Also, for the second suggestion, I will try to insert record (with the Cached SqlConnection object) and see what performance I get when I do the stress test. I doubt I'll be getting fantastic numbers as it's I/O operation.
I'll keep digging on my side for an optimal solution meanwhile with your suggestions.

You can create a conditional requirement within the callback to ensure you are working on a cache entry that has been hit from an expiration instead of a remove/replace (in VB since I had it handy):
Private Shared Sub CacheRemovalCallbackFunction(ByVal cacheKey As String, ByVal cacheObject As Object, ByVal removalReason As Web.Caching.CacheItemRemovedReason)
Select Case removalReason
Case Web.Caching.CacheItemRemovedReason.Expired, Web.Caching.CacheItemRemovedReason.DependencyChanged, Web.Caching.CacheItemRemovedReason.Underused
' By leaving off Web.Caching.CacheItemRemovedReason.Removed, this will exclude items that are replaced or removed explicitly (Cache.Remove) '
End Select
End Sub
Edit Here it is in C# if you need it:
private static void CacheRemovalCallbackFunction(string cacheKey, object cacheObject, System.Web.Caching.CacheItemRemovedReason removalReason)
{
switch(removalReason)
{
case System.Web.Caching.CacheItemRemovedReason.DependencyChanged:
case System.Web.Caching.CacheItemRemovedReason.Expired:
case System.Web.Caching.CacheItemRemovedReason.Underused:
// This excludes the option System.Web.Caching.CacheItemRemovedReason.Removed, which is triggered when you overwrite a cache item or remove it explicitly (e.g., HttpRuntime.Cache.Remove(key))
break;
}
}

To expand on my previous comment... I get the picture you are thinking about the cache incorrectly. If you have an object stored in the Cache, say a Hashtable, any update/storage into that Hashtable will be persisted without you explicitly modifying the contents of the Cache. You only need to add the Hashtable to the Cache once, either at application startup or on the first request.
If you are worried about the bulkcopy and page request updates happening simultaneously, then I suggest you simple have TWO cached lists. Have one be the list which is updated as page requests come in, and one list for the bulk copy operation. When one bulk copy is finished, swap the lists and repeat. This is similar to double-buffering video RAM for video games or video apps.

Related

MDriven ECO_ID duplicates

We appear to have a problem with MDriven generating the same ECO_ID for multiple objects. For the most part it seems to happen in conjunction with unexpected process shutdowns and/or server shutdowns, but it does also happen during normal activity.
Our system consists of one ASP.NET application and one WinForms application. The ASP.NET app is setup in IIS to use a single worker process. We have a mixture of WebForms and MVC, including ApiControllers. We're using a rather old version of the ECO packages: 7.0.0.10021. We're on VS 2017, target framework is 4.7.1.
We have it configured to use 64 bit integers for object id:s. Database is Firebird. SQL configuration is set to use ReadCommitted transaction isolation.
As far as I can tell we have configured EcoSpaceStrategyHandler with EcoSpaceStrategyHandler.SessionStateMode.Never, which should mean that EcoSpaces are not reused at all, right? (Why would I even use EcoSpaceStrategyHandler in this case, instead of just creating EcoSpace normally with the new keyword?)
We have created MasterController : Controller and MasterApiController : ApiController classes that we use for all our controllers. These have a EcoSpace property that simply does this:
if (ecoSpace == null)
{
if (ecoSpaceStrategyHandler == null)
ecoSpaceStrategyHandler = new EcoSpaceStrategyHandler(
EcoSpaceStrategyHandler.SessionStateMode.Never,
typeof(DiamondsEcoSpace),
null,
false
);
ecoSpace = (DiamondsEcoSpace)ecoSpaceStrategyHandler.GetEcoSpace();
}
return ecoSpace;
I.e. if no strategy handler has been created, create one specifying no pooling and no session state persisting of eco spaces. Then, if no ecospace has been fetched, fetch one from the strategy handler. Return the ecospace. Is this an acceptable approach? Why would it be better than simply doing this:
if (ecoSpace = null)
ecoSpace = new DiamondsEcoSpace();
return ecoSpace;
In aspx we have a master page that has an EcoSpaceManager. It has been configured to use a pool but SessionStateMode is Never. It has EnableViewState set to true. Is this acceptable? Does it mean that EcoSpaces will be pooled but inactivated between round trips?
It is possible that we receive multiple incoming API calls in tight succession, so that one API call hasn't been completed before the next one comes in. I assume that this means that multiple instances of MasterApiController can execute simultaneously but in separate threads. There may of course also be MasterController instances executing MVC requests and also the WinForms app may be running some batch job or other.
But as far as I understand id reservation is made at the beginning of any UpdateDatabase call, in this way:
update "ECO_ID" set "BOLD_ID" = "BOLD_ID" + :N;
select "BOLD_ID" from "ECO_ID";
If the returned value is K, this will reserve N new id:s ranging from K - N to K - 1. Using ReadCommitted transactions everywhere should ensure that the update locks the id data row, forcing any concurrent save operations to wait, then fetches the update result without interference from other transactions, then commits. At that point any other pending save operation can proceed with its own id reservation. I fail to see how this could result in the same ID being used for multiple objects.
I should note that it does seem like it sometimes produces id duplicates within one single UpdateDatabase, i.e. when saving a set of new related objects, some of them end up with the same id. I haven't really confirmed this though.
Any ideas what might be going on here? What should I look for?
The issue is most likely that you use ReadCommitted isolation.
This allows for 2 systems to simultaneously start a transaction, read the current value, increase the batch, and then save after each other.
You must use Serializable isolation for key generation; ie only read things not currently in a write operation.
MDriven use 2 settings for isolation level UpdateIsolationLevel and FetchIsolationLevel.
Set your UpdateIsolationLevel to Serializable

Slowdown issue in web project

I just need suggestion in this case. There is a PIN code field in my project in asp.net environment. I have stored 50,000 around pin code in sql server database. When I run project in local host, it becomes slow down. Since I have a drop-down to get value from database. I think it is because of huge data is being rendered into html, since when I click on view source at run-time, I can see all the PIN-code inside it.
Moreover, I have also done this for Select CITY, and STATE from database in a same way.
I will really appreciate you, if you get me any logic or technique to lessen this slowdown
If you are using all the Pincode in the single page then You have multiple option to optimized this slow down If this is in initialized phase then Try MongoDB ,No SQL DB otherwise go for Solr , Redis that gives fast accessing of the data. If you are not able to using these then You can optimised it by eager loading , Cache Storing of data.
If its not in single page then break it to batch via paginate the pincode.
This is common problem with any website where we deal with large amount of data. To be frank there is no code level solution for this. You need to select any of following approach.
You can try multiple options for faster retrieval.
Caching -
Use redis or memcache - in simpler words, on the first request cache manager will read and store your data from SQL server. For subsequent requests, data will be served from cache.
Also, don't forget to make a provision to invalidate the data when new pin codes are added.
Edit: You can also use object caching provided by .Net framework. Refer: object caching
Code will be something like.
if (Cache["key_pincodes"] == null)
{
// if No object is present in Cache, add it to the cache with expiry time of 10 minutes
// Read data to datatable or any object
DataTable pinCodeObject = GetPinCodesFromdatabase();
Cache.Insert("key_pincodes", pinCodeObject, null, DateTime.MaxValue, TimeSpan.FromMinutes(10));
}
else // If pinCodes are cached, dont make Database call and read it from cache
{
// This will get execute
DataTable pinCodeObject = (DataTable)Cache["key_pincodes"];
}
// bind it your dropdown
No-sql database-
MongoDB, XML, Txt files could be used to read the data. It will take much lesser time than the database hit.

Does any asp.net data cache support background population of cache entries?

We have a data driven ASP.NET website which has been written using the standard pattern for data caching (adapted here from MSDN):
public DataTable GetData()
{
string key = "DataTable";
object item = Cache[key] as DataTable;
if((item == null)
{
item = GetDataFromSQL();
Cache.Insert(key, item, null, DateTime.Now.AddSeconds(300), TimeSpan.Zero;
}
return (DataTable)item;
}
The trouble with this is that the call to GetDataFromSQL() is expensive and the use of the site is fairly high. So every five minutes, when the cache drops, the site becomes very 'sticky' while a lot of requests are waiting for the new data to be retrieved.
What we really want to happen is for the old data to remain current while new data is periodically reloaded in the background. (The fact that someone might therefore see data that is six minutes old isn't a big issue - the data isn't that time sensitive). This is something that I can write myself, but it would be useful to know if any alternative caching engines (I know names like Velocity, memcache) support this kind of scenario. Or am I missing some obvious trick with the standard ASP.NET data cache?
You should be able to use the CacheItemUpdateCallback delegate which is the 6th parameter which is the 4th overload for Insert using ASP.NET Cache:
Cache.Insert(key, value, dependancy, absoluteExpiration,
slidingExpiration, onUpdateCallback);
The following should work:
Cache.Insert(key, item, null, DateTime.Now.AddSeconds(300),
Cache.NoSlidingExpiration, itemUpdateCallback);
private void itemUpdateCallback(string key, CacheItemUpdateReason reason,
out object value, out CacheDependency dependency, out DateTime expiriation,
out TimeSpan slidingExpiration)
{
// do your SQL call here and store it in 'value'
expiriation = DateTime.Now.AddSeconds(300);
value = FunctionToGetYourData();
}
From MSDN:
When an object expires in the cache,
ASP.NET calls the
CacheItemUpdateCallback method with
the key for the cache item and the
reason you might want to update the
item. The remaining parameters of this
method are out parameters. You supply
the new cached item and optional
expiration and dependency values to
use when refreshing the cached item.
The update callback is not called if
the cached item is explicitly removed
by using a call to Remove().
If you want the cached item to be
removed from the cache, you must
return null in the expensiveObject
parameter. Otherwise, you return a
reference to the new cached data by
using the expensiveObject parameter.
If you do not specify expiration or
dependency values, the item will be
removed from the cache only when
memory is needed.
If the callback method throws an
exception, ASP.NET suppresses the
exception and removes the cached
value.
I haven't tested this so you might have to tinker with it a bit but it should give you the basic idea of what your trying to accomplish.
I can see that there's a potential solution to this using AppFabric (the cache formerly known as Velocity) in that it allows you to lock a cached item so it can be updated. While an item is locked, ordinary (non-locking) Get requests still work as normal and return the cache's current copy of the item.
Doing it this way would also allow you to separate out your GetDataFromSQL method to a different process, say a Windows Service, that runs every five minutes, which should alleviate your 'sticky' site.
Or...
Rather than just caching the data for five minutes at a time regardless, why not use a SqlCacheDependency object when you put the data into the cache, so that it'll only be refreshed when the data actually changes. That way you can cache the data for longer periods, so you get better performance, and you'll always be showing the up-to-date data.
(BTW, top tip for making your intention clearer when you're putting objects into the cache - the Cache has a NoSlidingExpiration (and a NoAbsoluteExpiration) constant available that's more readable than your Timespan.Zero)
First, put the date you actually need in a lean class (also known as POCO) instead of that DataTable hog.
Second, use cache and hash - so that when your time dependency expires you can spawn an async delegate to fetch new data but your old data is still safe in a separate hash table (not Dictionary - it's not safe for multi-reader single writer threading).
Depending on the kind of data and the time/budget to restructure SQL side you could potentially fetch only things that have LastWrite younger that your update window. you will need 2-step update (have to copy dats from the hash-kept opject into new object - stuff in hash is strictly read-only for any use or the hell will break loose).
Oh and SqlCacheDependency is notorious for being unreliable and can make your system break into mad updates.

Viewstate in a .ashx Handler?

I've got a handler (list.ashx for example) that has a method that retrieves a large dataset, then grabs only the records that will be shown on any given "page" of data. We are allowing the users to do sorting on these results. So, on any given page run, I will be retrieving a dataset that I just got a few seconds/minutes ago, but reordering them, or showing the next page of data, etc.
My point is that my dataset really hasn't changed. Normally, the dataset would be stuck into the viewstate of a page, but since I'm using a handler, I don't have that convenience. At least I don't think so.
So, what is a common way to store the viewstate associated with a current user's given page when using a handler? Is there a way to take the dataset, encode it somehow and send that back to the user, and then on the next call, pass it back and then rehydrate a dataset from those bits?
I don't think Session would be a good place to store it since we might have 1000 users all viewing different datasets of different data, and that could bring the server to its knees. At least I think so.
Does anyone have any experience with this kind of situation, and can you give me any advice?
In this situation I would use a cache with some type of user and query info as the key. The reason being is you say it is a large dataset. Right there is something you don't want to be pushing up and down the pipe constantly. Remember your server still has to received the data if it is in ViewState and handle it. I would do something like this which would cache it for a specific user and have a short expiry:
public DataSet GetSomeData(string user, string query, string sort)
{
// You could make the key just based on the query params but figured
// you would want the user in there as well.
// You could user just the user if you want to limit it to one cached item
// per user too.
string key = string.Format("{0}:{1}", user, query);
DataSet ds = HttpContext.Current.Cache[key] as DataSet;
if (ds == null)
{
// Need to reload or get the data
ds = LoadMyData(query);
// Now store it and make the expiry short so it doesn't bog up your server
// needlessly... worst case you have to retrieve it again because the data
// has expired.
HttpContext.Current.Cache.Insert(key, ds, null,
DateTime.UtcNow.AddMinutes(yourTimeout),
System.Web.Caching.Cache.NoSlidingExpiration);
}
// Perform the sort or leave as default sorting and return
return (string.IsNullOrEmpty(sort) ? ds : sortSortMyDataSet(ds, sort));
}
When you say 1000's of users, does that mean concurrent users? If your expiration time was 1 minute how many concurrent users would make that call in a minute and require sorting. I think offloading the data to something like similar to ViewState is just trading some cache memory for bandwidth and processing load of larget requests back and forth. The less you have to transmit back and forth the better in my opinion.
Why don't you implement a server side caching?
A I understand, you're retrieving a large amount of data and then returns only necessary records from this data to different clients. So you could use HttpContext.Current.Cache property for this.
E.g. a property which encapsulates a data retrieving logic (gets from the original data store with the first request, then puts to cache and gets from cache with every next request) could be used. In this case all the necessary data manipulations (paging, etc.) may be done much more quicker than retrieving a large amount of data with the each request.
In the case when clients have different data sources (mean each client have its own data source) the solution above may also be implemented. I suppose each client has at least identifier, so you could use different caches for different clients (client identifier as a part of cache key).
The best you could do is "grow your own" by including the serialized data set in the body of the request to the ASHX handler. Your handler would then check to see if the request does indeed have a body by checking Request.ContentLength and then reading from Request.InputStream, and if it does serializing that body back into the data set instead of reading from your database.

Page View Counter like on StackOverFlow

What is the best way to implement the page view counter like the ones they have here on the site where each question has a "Views" counter?
Factoring in Performance and Scalability issues.
I've made two observations on the stackoverflow views counter:
There's a link element in the header that handles triggering the count update. For this question, the markup looks like this:
<link href="/questions/246919/increment-view-count" type="text/css" rel="stylesheet" />
I imagine you could hit that url to update the viewcount without ever actually viewing the page, but I haven't tried it.
I had a uservoice ticket, where the response from Jeff indicated that views are not incremented from the same ip twice in a row.
The counter i optimized works like this:
UPDATE page_views SET counter = counter + 1 WHERE page_id = x
if (affected_rows == 0 ) {
INSERT INTO page_views (page_id, counter) VALUES (x, 1)
}
This way you run 2 query for the first view, the other views require only 1 query.
An efficient way may be :
Store your counters in the Application object, you may persist it to file/DB periodically and on application close.
Instead of making a database call everytime the database is hit, I would increment a counter using a cache object and depending on how many visits you get to your site every day you have send the page hits to the database every 100th hit to the site. This is waay faster then updating the database on every single hit.
Or another solution is analyzing the IIS log file and updating hits every 30min through a windows service. This is what I have implemented and it work wonders.
You can implement an IHttpHandler to do that.
I'm a fan of #Guillaume's style of implementation. I use a transparent GIF handler and in-memory queues to batch-up sets of changes that are then periodically flushed using a seperate thread created in global.asax.
The handler implements IHttpHandler, processes the request parameters e.g. page id, language etc., updates the queue, then response.writes the transparent GIF.
By moving persistent changes to a seperate thread than the user-request you also deal much better with potential serialization issues from running multiple servers etc.
Of course you could just pay someone else to do the work too e.g. with transparent gifs.
For me the best way is to have a field in the question table and update it when the question is accessed
UPDATE Questions SET views = views + 1 WHERE QuestionID = x
Application Object: IMO is not scalable because the may end with lots of memory consumption as more questions are accessed.
Page_views table: no need to, you have to do a costly join after

Resources