EF Caching: How to detach objects *completely* before inserting them into HttpRuntime cache? - asp.net

Some background:
Working with:
.NET 4.5 (thinking of migrating to 4.5.1 if it's painless)
Web Forms
Entity Framework 5, Lazy Loading enabled
Context Per Request
IIS 8
Windows 2012 Datacenter
Point of concern: Memory Usage
In the project we are currently working on, probably our first bigger one, we often read larger chunks of data that come from CSV imports and are likely to stay the same for very long periods of time.
Unless someone explicitly re-imports the CSV data, it is guaranteed not to change. This happens in more than one place in our project, and a similar approach is used for some regular documents that are read often by users. We've decided to cache this data in the HttpRuntime cache.
It goes like this (we pull about 15,000 records, consisting mostly of strings):
//myObject and related methods are placeholders
public static List<myObject> GetMyCachedObjects()
{
    if (CacheManager.Exists(KeyConstants.keyConstantForMyObject))
    {
        return CacheManager.Get(KeyConstants.keyConstantForMyObject) as List<myObject>;
    }
    else
    {
        List<myObject> myObjectList = framework.objectProvider.GetMyObjects();
        CacheManager.Add(KeyConstants.keyConstantForMyObject, myObjectList, true, 5000);
        return myObjectList;
    }
}
The data retrieval for the above method is very simple and looks like this:
public List<myObject> GetMyObjects()
{
    return context.myObjectsTable.AsNoTracking().ToList();
}
There are probably things to be said about the code structure, but that's not my concern at the moment.
I began profiling our project as soon as I saw the high memory usage and found many parts where our code could be optimized. I had never faced 300 simultaneous users before, and our internal tests were not enough to expose the memory issues. I've identified and fixed numerous memory leaks, but I'd like to understand some Entity Framework related unknowns.
Given the above example, and using ANTS Profiler, I've noticed that 'myObject' and other similar objects reference many System.Data.Entity.DynamicProxies.myObject instances; additionally there are lots of EntityKeys holding on to integers. They don't take up much space individually, but their count is relatively high.
For instance, 124 instances of 'myObject' reference nearly 300 System.Data.Entity.DynamicProxies.
Usually the reference chain looks like this, whatever the object is:
some cache entry → the object I've cached (many of which, I now notice, were detached from the dbContext prior to caching) → the dynamic proxies → the ObjectContext. I've no idea how to untie them.
My progress:
I did some research and found out that I might be caching something Entity Framework related together with those objects. I've pulled them with AsNoTracking, but those DynamicProxies are still in memory and probably hold on to other things as well.
Important: I've observed some live instances of ObjectContext (74), slowly growing, but no instances of my unitOfWork, which holds the dbContext. Those seem to be disposed properly on a per-request basis.
I know how to detach, attach or modify the state of an entry on my dbContext, which is wrapped in a unitOfWork, and I do it often. However, that doesn't seem to be enough, or I am asking for the impossible.
Questions:
Basically, what am I doing wrong with my caching approach when it comes to Entity Framework?
Is the growing number of ObjectContexts in memory a concern? I know the cache will eventually expire, but I'm worried about open connections or anything else those contexts might be holding.
Should I be detaching everything from the context before inserting it into the cache?
If yes, what is the best approach? Especially with a List, I cannot think of anything else but iterating over the collection and calling Detach one by one (see the sketch below for what I mean).
Bonus question: About 40% of the consumed memory is free (unallocated); I've no idea why .NET is reserving so much free memory in advance.
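To show what I mean by that, roughly this (a sketch only; unitOfWork.Context stands in for however my DbContext is exposed, it is not our real API):
// Sketch: detach every entity in the list before handing it to the cache.
List<myObject> myObjectList = framework.objectProvider.GetMyObjects();

foreach (var item in myObjectList)
{
    // Setting the state to Detached removes the entry from the change tracker,
    // but the instance itself can still be of a DynamicProxies type.
    unitOfWork.Context.Entry(item).State = EntityState.Detached;
}

CacheManager.Add(KeyConstants.keyConstantForMyObject, myObjectList, true, 5000);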

You can try using a non-entity class with only the specific properties you need, populated via a Select projection.
public class MyObject2
{
    public int ID { get; set; }
    public string Name { get; set; }
}

public List<MyObject2> GetObjects()
{
    return framework.provider.GetObjects().Select(
        x => new MyObject2
        {
            ID = x.ID,
            Name = x.Name
        }).ToList();
}
Since you will be storing plain C# objects, you will not have to worry about dynamic proxies, and you will not have to call Detach on anything at all. You can also store only the few properties you actually need.
Even if you disable tracking, you will still see dynamic proxies, because EF uses a dynamic class derived from your class which stores extra metadata for the entity (relations, e.g. names of foreign keys to other entities).

Steps to reduce memory here:
Re-new the context, often.
Don't try to delete content from the context or set entries to detached; it hangs around like a fart in a phone box.
e.g. context = new MyContext();
But if possible you should be using:
using (var context = new MyContext()) { ... } // short-lived contexts are best practice
On your context you can set configuration options:
this.Configuration.LazyLoadingEnabled = false;
this.Configuration.ProxyCreationEnabled = false; //<<<<<<<<<<< THIS one
this.Configuration.AutoDetectChangesEnabled = false;
You can disable proxies if you still feel they are hogging memory, but that may be unnecessary if you scope the context with using in the first place.
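For illustration, a minimal sketch of where those flags usually go, i.e. the context's constructor (the MyContext name and connection string name are placeholders):
using System.Data.Entity;

public class MyContext : DbContext
{
    public MyContext() : base("name=MyConnectionString") // placeholder connection string name
    {
        // Turn off the EF features that add per-entity overhead when the data is only read.
        this.Configuration.LazyLoadingEnabled = false;
        this.Configuration.ProxyCreationEnabled = false;   // no DynamicProxies are generated
        this.Configuration.AutoDetectChangesEnabled = false;
    }

    public DbSet<myObject> myObjectsTable { get; set; }
}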

I would redesign the solution a bit:
You are storing all the data as a single entry in the cache.
I would move this and have an entry per cached item.
You are using the HttpRuntime cache.
I would use AppFabric Caching, also from MS, also free.
Not sure where you are calling that code from.
I would call it on application start, so all the data is in memory when the user needs it (see the sketch after the links below).
You are using Entity SQL.
For this I would use an EntityDataReader: http://msdn.microsoft.com/en-us/library/system.data.entityclient.entitydatareader(v=vs.110).aspx
See also:
http://msdn.microsoft.com/en-us/data/hh949853.aspx
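A rough sketch of the application-start idea, reusing the GetMyCachedObjects method from the question (MyObjectCache is a placeholder for whatever class holds that method):
using System;

// Global.asax.cs -- prime the cache once, before the first user request needs the data.
public class Global : System.Web.HttpApplication
{
    protected void Application_Start(object sender, EventArgs e)
    {
        // The first call misses the cache, loads the ~15,000 records and stores them;
        // every later call is then served from memory.
        var warmed = MyObjectCache.GetMyCachedObjects(); // MyObjectCache is a placeholder name
    }
}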

Related

Getting IMetadataDetailsProviders to Run More than Once in ASP.NET Core

This is a tricky question which will require some deep knowledge of the ASP.NET Core framework. I'll first explain what is happening in our application in the MVC 3 implementation.
There was a complex requirement which needed to be solved involving the ModelMetaData for our ViewModels on a particular view. This is a highly configurable application. So, for one "Journal Type", a property may be mandatory, whereas for another, the exact same property may be non-mandatory. Moreover, it may be a radio-button for one "Journal Type" and a select list for another. As there was a huge number of combinations, mixing and matching for all these configuration options, it was not practical to create a separate ViewModel type for each and every possible permutation. So, there was one ViewModel type and the ModelMetaData was set on the properties of that type dynamically.
This was done by creating a custom ModelMetadataProvider (by inheriting DataAnnotationsModelMetadataProvider).
Smash-cut to now, where we are upgrading the application and writing the server stuff in ASP.NET Core. I have identified that implementing IDisplayMetadataProvider is the equivalent way of modifying Model Metadata in ASP.NET Core.
The problem is, the framework has caching built into it and any class which implements IDisplayMetadataProvider only runs once. I discovered this while debugging the ASP.NET Core framework and this comment confirms my finding. Our requirement will no longer be met with such caching, as the first time the ViewModel type is accessed, the MetadataDetailsProvider will run and the result will be cached. But, as mentioned above, owing to the highly dynamic configuration, I need it to run prior to every ModelBinding. Otherwise, we will not be able to take advantage of ModelState. The first time that endpoint is hit, the meta-data is set in stone for all future requests.
And we kinda need to leverage that recursive process of going through all the properties using reflection to set the meta-data, as we don't want to have to do that ourselves (a massive endeavour beyond my pay-scale).
So, if anyone thinks there's something in the new Core framework which I have missed, by all means let me know. Even if it is as simple as removing that caching feature of ModelBinders and IDisplayMetadataProviders (that is what I'll be looking into over the next couple of days by going through the ASP.NET source).
Model metadata is cached for performance reasons. The DefaultModelMetadataProvider class, which is the default implementation of the IModelMetadataProvider interface, is responsible for this caching. If your application logic requires that metadata be rebuilt on every request, you should substitute this implementation with your own.
You will make your life easier if you inherit your implementation from DefaultModelMetadataProvider and override the bare minimum needed to achieve your goal. It seems that GetMetadataForType(Type modelType) should be enough:
public class CustomModelMetadataProvider : DefaultModelMetadataProvider
{
    public CustomModelMetadataProvider(ICompositeMetadataDetailsProvider detailsProvider)
        : base(detailsProvider)
    {
    }

    public CustomModelMetadataProvider(ICompositeMetadataDetailsProvider detailsProvider, IOptions<MvcOptions> optionsAccessor)
        : base(detailsProvider, optionsAccessor)
    {
    }

    public override ModelMetadata GetMetadataForType(Type modelType)
    {
        // Optimization for intensively used System.Object
        if (modelType == typeof(object))
        {
            return base.GetMetadataForType(modelType);
        }

        var identity = ModelMetadataIdentity.ForType(modelType);
        DefaultMetadataDetails details = CreateTypeDetails(identity);

        // This part contains the same logic as DefaultModelMetadata.DisplayMetadata property
        // See https://github.com/aspnet/Mvc/blob/dev/src/Microsoft.AspNetCore.Mvc.Core/ModelBinding/Metadata/DefaultModelMetadata.cs
        var context = new DisplayMetadataProviderContext(identity, details.ModelAttributes);

        // Here your implementation of IDisplayMetadataProvider will be called
        DetailsProvider.CreateDisplayMetadata(context);
        details.DisplayMetadata = context.DisplayMetadata;

        return CreateModelMetadata(details);
    }
}
To replace DefaultModelMetadataProvider with your CustomModelMetadataProvider, add the following in ConfigureServices():
services.AddSingleton<IModelMetadataProvider, CustomModelMetadataProvider>();
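For completeness, a minimal sketch of the kind of IDisplayMetadataProvider that would then be invoked on every request; the JournalTypeDisplayMetadataProvider and JournalEntryViewModel names and the configuration lookup are hypothetical, not from the question:
using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.Mvc.ModelBinding.Metadata;

public class JournalEntryViewModel { } // hypothetical placeholder view model

// Hypothetical provider: adjusts display metadata based on the currently active "Journal Type".
public class JournalTypeDisplayMetadataProvider : IDisplayMetadataProvider
{
    public void CreateDisplayMetadata(DisplayMetadataProviderContext context)
    {
        if (context.Key.ModelType != typeof(JournalEntryViewModel))
            return;

        // Look up the active configuration here and tweak context.DisplayMetadata accordingly,
        // e.g. context.DisplayMetadata.ShowForEdit or the template to use.
    }
}

// Registration (Startup.ConfigureServices), alongside the custom metadata provider:
// services.AddMvc(options =>
//     options.ModelMetadataDetailsProviders.Add(new JournalTypeDisplayMetadataProvider()));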

Side effects of SessionFactory.Evict?

Every so often I am tasked with making alterations to one or more business entities in our database that may already be cached in our internal application. To get the application to reflect these changes without cycling the app pool, I figured I'd embed the ability for dev/administrators to evict the cache from within the app's UI (either entirely or for certain objects), but I noticed the comments for the method state the following...
/// <summary>
/// Evict an entry from the process-level cache. This method occurs outside
/// of any transaction; it performs an immediate "hard" remove, so does not respect
/// any transaction isolation semantics of the usage strategy. Use with care.
/// </summary>
void ISessionFactory.Evict(Type persistentClass, object id);
What exactly does that mean? What could go wrong if I try to evict one or more objects that may be involved in a transaction, and is there any way to avoid these side effects if they are destructive? I'm currently using SysCache2 and am looking for implementation details on how to use SqlDependency, but I'm still curious about the effects of Evict in the meantime.
UPDATE: Upon looking closer at the comments, it appears that SessionFactory.Evict() and SessionFactory.EvictCollection() remove from the process-level cache, and SessionFactory.EvictEntity() removes from the second-level cache. However, the same disclaimer exists on either flavor, so my original question still stands: what dangers are there in evicting an entity from the cache (process or second level) if it's currently in use in another transaction?
Evict detaches the entity from the session.
private static Book CreateAndSaveBook(ISessionFactory sessionFactory)
{
    var book = new Book()
    {
        Name = "Book1",
    };

    using (var session = sessionFactory.OpenSession())
    {
        using (var tx = session.BeginTransaction())
        {
            session.Save(book);
            tx.Commit();
            session.Evict(book);
        }
    }

    return book;
}
In CreateAndSaveBook, we create a book and save it to the database. We commit our transaction, evict the book from the session, close the session, and return the book. This sets up our problem: we now have an entity without a session. Changes to this entity are not being tracked. It's just a plain, ordinary book object.
We continue to change the book object, and now we want to save those changes. NHibernate doesn't know what we've done to this book. It could have been passed through other layers or tiers of a large application. We don't know which session it's associated with, if any. We may not even know if the book exists in the database.
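To make the consequence concrete, a minimal sketch continuing the same Book example, showing how such a detached object is typically brought back under NHibernate's control later; Update and Merge are standard ISession members, but how you reconcile state depends on your application:
// Later, in some other layer, we want to persist changes made to the detached book.
book.Name = "Book1 (revised)";

using (var session = sessionFactory.OpenSession())
using (var tx = session.BeginTransaction())
{
    // Update() re-associates the detached instance with this new session and schedules
    // an UPDATE; Merge() would instead copy its state onto a tracked copy.
    session.Update(book);
    tx.Commit();
}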

ASP.NET SessionState mode SQLServer serialization with protobuf-net

Problem Background
I have been thinking of ways to optimize the out-of-process storage of session state within SQL Server, and a few I ran across are:
Disable session state on pages that do not require the session. Also, use read-only on pages that are not writing to the session.
In ASP.NET 4.0 use gzip compression option.
Try to keep the amount of data stored in the session to a minimum.
etc.
Right now, I have a single object (a class called SessionObject) stored in the session. The good news is, is that it is completely serializable.
Optimizing using protobuf-net
An additional way I thought might be good to optimize the storage of sessions would be to use protocol buffers (protobuf-net) serialization/deserialization instead of the standard BinaryFormatter. I understand I could have all of my objects implement ISerializable, but I'd rather not create DTOs or clutter up my domain layer with serialize/deserialize logic.
Any suggestions using protobuf-net with session state SQL server mode would be great!
If the existing session-state code uses BinaryFormatter, then you can cheat by getting protobuf-net to act as an internal proxy for BinaryFormatter, by implementing ISerializable on your root object only:
[ProtoContract]
class SessionObject : ISerializable {
    public SessionObject() { }

    protected SessionObject(SerializationInfo info, StreamingContext context) {
        Serializer.Merge(info, this);
    }

    void ISerializable.GetObjectData(SerializationInfo info, StreamingContext context) {
        Serializer.Serialize(info, this);
    }

    [ProtoMember(1)]
    public string Foo { get; set; }
    ...
}
Notes:
only the root object needs to do this; any encapsulated objects will be handled automatically by protobuf-net
it will still add a little type metadata for the outermost object, but not much
you will need to decorate the members (and encapsulated types) accordingly (this is best done explicitly per member; there is an implicit "figure it out yourself" mode, but it is brittle if you add new members)
this will break existing state; changing the serialization mechanism is fundamentally a breaking change
If you want to ditch the type metadata from the root object, you would have to implement your own state provider (I think there is an example on MSDN);
advantage: smaller output
advantage: no need to implement ISerializable on the root object
disadvantage: you need to maintain your own state provider ;p
(all the other points raised above still apply)
Note also that the effectiveness of protobuf-net here will depend a bit on what the data is that you are storing. It should be smaller, but if you have a lot of overwhelmingly large strings it won't be much smaller, as protobuf still uses UTF-8 for strings.
If you do have lots of strings, you might consider additionally using gzip - I wrote a state provider for my last employer that tried gzip, and stored whichever (original or gzip) was the smallest - obviously with a few checks, for example:
don't gzip if it is smaller than [some value]
short-circuit the gzip compression early if the gzip ever exceeds the original
The above can be used in combination with protobuf-net quite happily - and if you are writing a state-provider anyway you can drop the ISerializable etc for maximum performance.
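A minimal sketch of that "store whichever is smaller" idea, assuming the payload has already been serialized to a byte[] (by protobuf-net or anything else); the threshold value is arbitrary, and the early short-circuit check is omitted for brevity:
using System.IO;
using System.IO.Compression;

static class PayloadCompressor
{
    // Returns the smaller of the raw payload and its gzipped form.
    public static byte[] PickSmaller(byte[] raw, out bool compressed)
    {
        const int MinSizeToCompress = 512; // arbitrary: tiny payloads aren't worth gzipping
        compressed = false;
        if (raw.Length < MinSizeToCompress)
            return raw;

        using (var buffer = new MemoryStream())
        {
            using (var gzip = new GZipStream(buffer, CompressionMode.Compress, leaveOpen: true))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            // Only keep the gzipped form if it actually ended up smaller.
            if (buffer.Length < raw.Length)
            {
                compressed = true;
                return buffer.ToArray();
            }
        }
        return raw;
    }
}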
A final option, if you really want, would be for me to add a "compression mode" property, [ProtoContract(..., CompressionMode = ...)], which:
would only apply for the ISerializable usage (for technical reasons, it doesn't make sense to change the primary layout, but this scenario would be fine)
automatically applies gzip during serialization/deserialization of the above [perhaps with the same checks I mention above]
would mean you don't need to add your own state provider
However, this is something I'd only really want to apply for "v2" (I'm being pretty brutal about bugfix only in v1, so that I can keep things sane).
Let me know if that would be of interest.

Business Logic Layer

I am programming data-driven applications using ASP.NET with Telerik controls (v2009 Q2).
I have a class named BLL which contains (almost only) static classes that return different objects, taking some id as a parameter. They generally return groups of objects as Lists.
My question is: are there any architectural flaws to this, always using static? I know people make their Business Layer and Data Access layer separate projects. What is the advantage of them being separate projects? Is it so I can add more functionality, or is it just tidier that way?
Thanks in advance
Using static methods as your method of entry is not a particularly big concern. It really depends on whether you have areas of work where you need to store state, as static definitions may not allow you to store or separate state information. Fortunately, going backward from having used static declarations to member declarations is usually less painful than the reverse. You might not even encounter this as an issue if the items returned from such methods are solely responsible for state.
Separate libraries/projects are useful for partitioning units of work. There are no strict requirements that everything must be separated into different libraries, although you may see quirks with static member variables, particularly in multi-threaded apps, as mentioned by Dave Swersky.
Having separate libraries also gives you the following benefits:
Better separation of changes during development, as project boundaries usually coincide with source-control boundaries, allowing more people to work concurrently over the entire surface of your platform.
Separate parts that may be updated independently in production, provided layout and interfaces are compatible.
Better organization of what behaviors, features, and roles intersect for a given segment at each layer, whether BLL or DAL. Some developers prefer to strictly isolate components based on what users are allowed to operate on items provided in a given BLL.
However, some parties have found that large monolithic libraries work better for them. Here are some benefits that are important in this scenario.
Faster compile times for projects where older components and dependencies rarely change (especially important for C/C++ devs!). Source files that don't change, collectively, can hint and allow the compiler to avoid recompiling whole projects.
Single (or low-count) file upgrades and management, for projects where it is important to minimize the amount of objects present at a given location. This is highly desirable for people who provide libraries for consumption by other parties, as one is less susceptible to individual items being published or updated out of order.
Automatic namespace layout in Visual Studio .NET projects, where using sub-folders automatically implies the initial namespace that will be present for new code additions. Not a particularly great perk, but some people find this useful.
Separation of groups of BLLs and DALs by database or server abstraction. This is somewhat middle ground, but as a level of organization, people find this level to be more comfortable for long-term development. This allows people to identify things by where they are stored or received. But, as a trade-off, the individual projects can be more complex- though manageable via #3.
Finally, one thing I noticed is that it sounds like you have implemented nested static classes. If other people using your work are in an environment with intellisense or other environment shortcuts unavailable, they may find this setup to be highly troublesome to use. You might consider unrolling some levels of nesting into separate (or nested) namespaces instead. This is also beneficial in reducing the amount of typing required to declare items of interest, as namespace declarations only need to be present once, where static nested items need to be present every time. Your counterparts will like this.
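To illustrate that last point, a hypothetical before/after sketch (the BLL/Customers/Customer names here are made up, not taken from the question):
using System.Collections.Generic;

public class Customer { } // placeholder entity

// Before: nested static classes force callers to repeat the full chain every time.
public static class BLL
{
    public static class Customers
    {
        public static List<Customer> GetByRegion(int regionId) { return new List<Customer>(); }
    }
}
// call site: BLL.Customers.GetByRegion(5);

// After: namespaces carry the organization, so a using directive removes the repetition.
namespace Company.Bll.Customers
{
    public static class CustomerQueries
    {
        public static List<Customer> GetByRegion(int regionId) { return new List<Customer>(); }
    }
}
// call site (with "using Company.Bll.Customers;"): CustomerQueries.GetByRegion(5);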
Having the BLL and DAL in separate projects (i.e. separate assemblies) means that they can be used with different user interfaces without re-compiling, but more importantly that the boundary interfaces and dependencies of the DLLs are relatively well-defined (although it doesn't guarantee a great design, it at least enforces a separation). It's still possible to have a single assembly with great logical separation, so it's not required nor sufficient.
As far as the static methods versus business object classes, that could be unusual and it could have drawbacks, but it doesn't really affect whether your layers are separated or not.
If your application is stateless, all-static methods/classes shouldn't be a problem. However, if your app is multi-threaded and the BLL does read and commit, you could run into thread safety issues.
One advantage of a separate project is that if you need to update your application but only change the BLL, you can make your change, recompile the DLL, and drop it in the bin folder where the application is deployed in IIS without having to redeploy the whole web application.
"My question is: are there any architectural flaws to this, always using static?"
One flaw with that approach is that you can't apply Interfaces to static methods. A possible work-around is to use a Singleton pattern, although you will need to be careful about threading issues.
"I know people make their Business Layer and Data Access layer separate projects. What is the advantage of them being separate projects? Is it so I can add more functionality, or is it just tidier that way?"
Advantages:
Easier for multiple developers to work on (depending on your environment and source control)
Forced separation of logic / protection levels from the rest of your solution
Easier to group and manage if your BLL gets large
namespace BLL
{
    public class tblCity
    {
        public tblCity()
        {
            //
            // TODO: Add constructor logic here
            //
        }

        private int iCityId;
        private string sCityName;

        public int CityId
        {
            get { return iCityId; }
            set { iCityId = value; }
        }

        public string CityName
        {
            get { return sCityName; }
            set { sCityName = value; }
        }

        public int InserttblCity()
        {
            DBAccess db = new DBAccess();
            //db.AddParameter("#iSid", iSid);
            db.AddParameter("#sCityName", sCityName);
            return db.ExecuteNonQuery("tblCity_Insert", true);
        }

        public DataSet SelectAlltblCity()
        {
            DBAccess db = new DBAccess();
            return db.ExecuteDataSet("tblCity_SelectAll");
        }

        public DataSet CheckCityName()
        {
            DBAccess db = new DBAccess();
            db.AddParameter("#sCityName", sCityName);
            return db.ExecuteDataSet("tblCity_CheckCity");
        }

        public DataSet SelectDistinctCityWithId()
        {
            DBAccess db = new DBAccess();
            //db.AddParameter("#iCityName", iCityName);
            return db.ExecuteDataSet("tblCity_getLastId");
        }

        public int UpdatetblCity()
        {
            DBAccess db = new DBAccess();
            db.AddParameter("#iCityId", iCityId);
            db.AddParameter("#sCityName", sCityName);
            return db.ExecuteNonQuery("[tblCity_Update]", true);
        }

        public int DeletetbltblCity()
        {
            DBAccess db = new DBAccess();
            db.AddParameter("#iCityId", iCityId);
            return db.ExecuteNonQuery("[tblCity_Delete]", true);
        }

        public DataSet FindPropertyLocationSubCategory()
        {
            DBAccess db = new DBAccess();
            db.AddParameter("#iCityId", iCityId);
            return db.ExecuteDataSet("tblPropertyDetails_FindPropertyLocationSubCategory");
        }

        public DataSet SelectDistinctPLCNAmeWithId()
        {
            DBAccess db = new DBAccess();
            return db.ExecuteDataSet("tblCity_getLastId");
        }
    }
}

Singleton vs Cache ASP.NET

I have created a Registry class in .NET which is a singleton. Apparently this singleton behaves as if it were kept in the Cache (the singleton object is available to every session). Is this good practice, or should I add this singleton to the Cache?
Also, do I need to watch out for concurrency problems with the GetInstance() function?
namespace Edu3.Business.Registry
{
    public class ExamDTORegistry
    {
        private static ExamDTORegistry instance;
        private Dictionary<int, ExamDTO> examDTODictionary;

        private ExamDTORegistry()
        {
            examDTODictionary = new Dictionary<int, ExamDTO>();
        }

        public static ExamDTORegistry GetInstance()
        {
            if (instance == null)
            {
                instance = new ExamDTORegistry();
            }
            return instance;
        }
    }
}
Well, your GetInstance method certainly isn't thread-safe - if two threads call it at the same time, they may well end up with two different instances. I have a page on implementing the singleton pattern, if that helps.
Does your code rely on it being a singleton? Bear in mind that if the AppDomain is reloaded, you'll get a new instance anyway.
I don't really see there being much benefit in putting the object in the cache though. Is there anything you're thinking of in particular?
Despite their presence in GoF singletons are generally considered bad practice. Is there any reason why you wish to have only one instance?
HttpContext.Cache is available to all sessions, but items in the cache can be removed from memory when they expire or if there is memory pressure.
HttpContext.Application is also available to all sessions and is a nice place to store persistent, application-wide objects.
Since you've already created a singleton and it works, I don't see why you should use one of the built-in collections instead, unless you need the extra functionality that Cache gives you.
Not sure what you mean by cache... if you want this cached (as in, kept in memory so that you don't have to fetch it again from some data store) then yes, you can put it in the cache and it will be global for all users. Session means per user, so I don't think this is what you want.
I think the original question spoke to which was preferred. If you have data that remains static or essentially immutable, then HTTP caching or the singleton pattern makes a lot of sense. If the singleton is loaded on application start-up, then there is no threading issue at all: once the singleton is in place, you will receive the same instance you requested. The problem with a lot of the implementations I see is that people use both without fully thinking it through. Why would you expire immutable configuration data? I had one client that cached their data and still created ADO DB objects etc. just to check whether it was in the cache. Effectively both of these solutions will work for you, but to gain any positive effect, make sure you actually use the cache/singleton. In either case, if your data is not available, both should be refreshed at that moment.
I would make it like:
private static readonly ExamDTORegistry instance = new ExamDTORegistry();
Then you don't need to check for null, and it's thread-safe.
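Putting that suggestion together with the class from the question, a minimal sketch of the thread-safe version (ExamDTO is the same type used above; the CLR guarantees the static initializer runs only once):
using System.Collections.Generic;

namespace Edu3.Business.Registry
{
    public class ExamDTORegistry
    {
        // Initialized eagerly by the type initializer; no null check or locking needed.
        private static readonly ExamDTORegistry instance = new ExamDTORegistry();

        private readonly Dictionary<int, ExamDTO> examDTODictionary;

        private ExamDTORegistry()
        {
            examDTODictionary = new Dictionary<int, ExamDTO>();
        }

        public static ExamDTORegistry GetInstance()
        {
            return instance;
        }
    }
}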
