Every so often I am tasked with making alterations to one or more business entities in our database that may already be cached in our internal application. To get the application to reflect these changes without cycling the app pool, I figured I'd embed the ability for dev/administrators to evict the cache from within the app's UI (either entirely or for certain objects), but I noticed the comments for the method state the following...
/// <summary>
/// Evict an entry from the process-level cache. This method occurs outside
/// of any transaction; it performs an immediate "hard" remove, so does not respect
/// any transaction isolation semantics of the usage strategy. Use with care.
/// </summary>
void ISessionFactory.Evict(Type persistentClass, object id);
What exactly does that mean? What could go wrong if I try to evict one or more objects that may be involved in a transaction, and is there any way to avoid these side effects if they are destructive? I'm currently using SysCache2 and am looking for implementation details on how to use SqlDependency, but I'm still curious about the effects of Evict in the meantime.
UPDATE: Upon looking closer at the comments, it appears that SessionFactory.Evict() and SessionFactory.EvictCollection() remove from the process-level cache, and SessionFactory.EvictEntity() removes from the second-level cache. However, the same disclaimer exists on each flavor, so my original question still stands: what dangers are there in evicting an entity from the cache (process-level or second-level) if it's currently in use in another transaction?
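For context, the kind of admin-triggered eviction I have in mind looks roughly like this (a sketch only; the entity types are placeholders and the ISessionFactory is assumed to be injected by our container):

using System;
using NHibernate;

// Sketch of an admin-facing cache flush. The type passed in and the injected
// ISessionFactory are placeholders, not production code.
public class CacheAdminService
{
    private readonly ISessionFactory _sessionFactory;

    public CacheAdminService(ISessionFactory sessionFactory)
    {
        _sessionFactory = sessionFactory;
    }

    // Evict a single cached instance by type and id.
    public void Evict(Type persistentClass, object id)
    {
        _sessionFactory.Evict(persistentClass, id);
    }

    // Evict everything cached for a type, plus any cached query results.
    public void EvictAll(Type persistentClass)
    {
        _sessionFactory.Evict(persistentClass);
        _sessionFactory.EvictQueries();
    }
}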
Evict detaches the entity from the session.
private static Book CreateAndSaveBook(ISessionFactory sessionFactory)
{
    var book = new Book()
    {
        Name = "Book1",
    };

    using (var session = sessionFactory.OpenSession())
    {
        using (var tx = session.BeginTransaction())
        {
            session.Save(book);
            tx.Commit();
            session.Evict(book);
        }
    }

    return book;
}
In CreateAndSaveBook, we create a book and save it to the database. We commit our transaction, evict the book from the session, close the session, and return the book. This sets up our problem: we now have an entity without a session. Changes to this entity are not being tracked; it's just a plain, ordinary book object.
We continue to change the book object, and now we want to save those changes. NHibernate
doesn't know what we've done to this book. It could have been passed through other layers or tiers of a large application. We don't know with which session it's associated, if any. We may not even know if the book exists in the database.
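A minimal sketch of getting such a detached book back under NHibernate's control (UpdateBook is a hypothetical helper; Update and Merge are standard ISession methods):

// Hypothetical helper: reassociate a detached Book with a new session and
// persist whatever changes were made while it was detached.
private static void UpdateBook(ISessionFactory sessionFactory, Book book)
{
    using (var session = sessionFactory.OpenSession())
    using (var tx = session.BeginTransaction())
    {
        // Update() reattaches this exact instance to the session;
        // session.Merge(book) would instead copy its state onto a managed copy.
        session.Update(book);
        tx.Commit();
    }
}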
Related
I have several entities, each with its own form type. Instead of saving an entity straight away on save, I want to save a copy of the changes we intend to perform and store it in the DB.
We'd then send a message to the user who can approve the change; that user reviews the original and the changed field(s) and approves or rejects them. If approved, the entity would be properly flushed.
To solve the issue I was thinking about:
1) doing a persist
2) getting the changesets (both the one related to "normal" fields and the one related to collections)
3) storing them in the DB
4) performing $em->refresh() to discard the changes
Later, what I need is to get the changeset(s) back, ask the (other) user to approve them, and flush them.
Is this doable? What I'm especially concerned about is that the entity manager that generated the first changeset is not the same one we are going to use to perform the flush; I basically need to "load" a changeset.
Any idea how to solve the issue (this way, or another way ;) )?
Another solution (working only for "normal" fields, not references that point from other entities to the current one, like a many-to-many) would be to clone the current entity, store it, and then, once approved, copy the field(s) from the clone to the original. But it does not work for all fields (if the previous solution doesn't work, we'd limit the feature to just "normal" fields).
Thank you!
SN
Well, you could just treat the modifications as entities themselves, so that every change is stored in the database, and then all the changes that were approved are executed against the entity.
So, for example, if you have some Books stored in the database, and you want to make sure that all the modifications made to these are approved, just add a model that would contain the changeset that has to be processed, and a handler that would apply these changes:
<?php

class UpdateBookCommand
{
    // If you'll store these commands in a database, perhaps this field would be a relation,
    // or you could just store the ID
    public $bookId;

    public $newTitle;
    public $newAuthor;

    // Perhaps this field should be somehow protected from unauthorized changes
    public $isApproved;
}

class UpdateBookHandler
{
    private $bookRepository;
    private $em;

    public function handle(UpdateBookCommand $command)
    {
        if (!$command->isApproved) {
            throw new NotAuthorizedException();
        }

        $book = $this->bookRepository->find($command->bookId);

        $book->setTitle($command->newTitle);
        $book->setAuthor($command->newAuthor);

        $this->em->persist($book);
        $this->em->flush();
    }
}
Next, in your controller you would just have to make sure that the commands are somehow stored (in a database or maybe even in a message queue), and the handler gets called when the changesets could possibly get applied.
P.S. Perhaps I could have explained this a bit better, but mostly the inspiration for this solution comes from the CQRS pattern that's explained quite well by Martin Fowler. However, I guess in your case a full-blown CQRS implementation is unnecessary and a simpler solution should work.
In Short
I seem to have landed on a MAJOR anti-pattern of saving objects WAY too many times. I've read through the limited Objectify docs and can't seem to find the right pattern to use.
Details
I have multiple objects I want to store. They are all transient (they don't exist in the database yet) and they have a one-to-many relationship. I don't want to sit and call ofy().save() on every last object in my hierarchy.
In the following example, a Player has a List of Cards.
My Model:
@Entity
public class Player {
    @Id private Long id = null; // will be generated
    private List<Ref<Card>> cards = new ArrayList<Ref<Card>>();
    // getters and setters here
}

@Entity
public class Card {
    @Id private Long id = null; // will be generated
    // lots of other fields and getters and setters here
}
My Operation:
I need to create a new player and new card, with the player having a reference to the card in his List "cards."
IDEAL SOLUTION:
I would like to just create the player and card java objects, set their relationships, and pass them to Objectify to be saved. Like this:
Player player = new Player();
Card card = new Card();
player.getCards().add(Ref.create(card));
ofy().save().entity(player).now();
That will fail. The 3rd line attempts to create a new Ref for the Card, which cannot be done because the Card doesn't have an Id yet; an Id will only be assigned once it has been persisted. It seems I must never associate one object with another until the referenced one has already been saved.
Current Crappy Solution
So, my solution must be to save the Card first, and then relate it to the Player, then save the player.
Player player = new Player();
Card card = new Card();
ofy().save().entity(card).now();
player.getCards().add(Ref.create(card));
ofy().save().entity(player).now();
This is insane. It seems reasonable at first, but my app is dealing with many more relationships than just this, and with this pattern my algorithm will be a spiderweb of checking for transient objects inside collections before saving the entity I'm actually concerned with.
There MUST be some way to tell Objectify to just SAVE all child/related entities along with the entity I've requested, and furthermore generate the Ids necessary instead of throwing an Exception at me.
Furthermore, I'll also need this sort of "recursive save" solution even when none of my objects are transient (i.e. they all have IDs already). I can't waste my time iterating through collections, and then all the collections WITHIN those collections, saving them all. I'm going to need some way of telling Objectify to just SAVE THIS WHOLE HIERARCHY OF OBJECTS I just passed you.
I've been reading around this @Load annotation and I feel like maybe there's something in there I'm missing... I don't know. Need help. Documentation is slim.
UPDATED SOLUTION
For posterity -
Using the allocateId() method decouples the entire ID-generation constraint from the database save, and you get a VERY clean pattern, particularly if you do as I did:
All database @Entity classes get a private constructor and a static public factory for creating transient objects. This static factory method (createTransient()) always allocates a new ID. All client code can then use this method for acquiring new transient objects, or the obvious Objectify load for acquiring existing persisted instances. Simple. Done. Lovely.
I recommend two things:
Allocate ids manually when you construct your objects using ObjectifyFactory.allocateId(). Do not use the "save with null autogenerates" feature. As you've noticed, it's a PITA to deal with entity objects that have null ids, so don't allow them to exist.
Use deferred saves. ofy().defer().save().entity(blah); You can save almost any number of things this way and they'll only get saved once on commit (or closing of the objectify session). Deferring save on the same entity multiple times produces only a single save.
This pattern of leaving ids null and filling them in on save is a holdover from the JPA days. It didn't work very well with JPA either; there were plenty of frustrating edge cases dealing with entities missing ids (especially when you wanted to put them in maps or sets). The best solution is to simply guarantee that no entity is ever missing an id in the first place.
Note that you'll want to allocate the id in a custom constructor, not the no-args constructor that Objectify uses to build your entity on load. Allocating an id is cheap but still a call to the GAE service layer and you don't want to do this on every load.
Some background:
Working with:
.NET 4.5 (thinking of migrating to 4.5.1 if it's painless)
Web Forms
Entity Framework 5, Lazy Loading enabled
Context Per Request
IIS 8
Windows 2012 Datacenter
Point of concern: Memory Usage
On the project we are currently working on, probably our first bigger one, we often read large chunks of data that come from CSV imports and are likely to stay the same for very long periods of time.
Unless someone explicitly re-imports the CSV data, it is guaranteed to be the same. This happens in more than one place in our project, and a similar approach is used for some regular documents that are often read by users. We've decided to cache this data in the HttpRuntime cache.
It goes like this; we pull about 15,000 records consisting mostly of strings.
//myObject and related methods are placeholders
public static List<myObject> GetMyCachedObjects()
{
    if (CacheManager.Exists(KeyConstants.keyConstantForMyObject))
    {
        return CacheManager.Get(KeyConstants.keyConstantForMyObject) as List<myObject>;
    }
    else
    {
        List<myObject> myObjectList = framework.objectProvider.GetMyObjects();
        CacheManager.Add(KeyConstants.keyConstantForMyObject, myObjectList, true, 5000);
        return myObjectList;
    }
}
The data retrieving for the above method is very simple and looks like this:
public List<myObject> GetMyObjects()
{
    return context.myObjectsTable.AsNoTracking().ToList();
}
There are probably things to be said about the code structure, but that's not my concern at the moment.
I began profiling our project as soon as I saw high memory usage and found many parts where our code could be optimized. I had never faced 300 simultaneous users before, and our internal tests, done by ourselves, were not enough to reveal the memory issues. I've identified and fixed numerous memory leaks, but I'd like to understand some Entity Framework-related unknowns.
Given the above example, and using ANTS Profiler, I've noticed that 'myObject', and other similar objects, are referencing many System.Data.Entity.DynamicProxies.myObject instances; additionally, there are lots of EntityKeys which hold on to integers. They aren't taking up much memory, but their count is relatively high.
For instance, 124 instances of 'myObject' are referencing nearly 300 System.Data.Entity.DynamicProxies instances.
Usually the reference chain looks like this, whatever the object is:
Some cache entry, then the object I've cached (many of which, I now notice, had been detached from the dbContext prior to caching), then the dynamic proxies, and then the ObjectContext. I've no idea how to untie them.
My progress:
I did some research and found out that I might be caching something Entity Framework-related together with those objects. I've pulled them with AsNoTracking, but those DynamicProxies are still in memory and probably hold on to other things as well.
Important: I've observed some live instances of ObjectContext (74), slowly growing, but no instances of my unitOfWork, which is what holds the dbContext. Those seem to be disposed properly on a per-request basis.
I know how to detach, attach or modify state of an entry from my dbContext, which is wrapped in a unitOfWork, and I often do it. However that doesn't seem to be enough or I am asking for the impossible.
Questions:
Basically, what am I doing wrong with my caching approach when it comes to Entity Framework?
Is the growing number of ObjectContexts in memory a concern? I know the cache will eventually expire, but I'm worried about open connections or anything else those contexts might be holding.
Should I be detaching everything from the context before inserting it into the cache?
If yes, what is the best approach? Especially with a List, I cannot think of anything other than iterating over the collection and calling Detach one by one (see the sketch below).
Bonus question: About 40% of the consumed memory is free (unallocated); I've no idea why .NET reserves so much free memory in advance.
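Here is the per-item detach I had in mind for that third question, sketched under the assumption that my unitOfWork exposes the underlying DbContext:

// Sketch only: detach every freshly loaded entity from the DbContext before
// putting the list into the cache, so the cached objects stop referencing the
// context's change tracker. Assumes EF 5's DbContext.Entry API.
foreach (var item in myObjectList)
{
    context.Entry(item).State = System.Data.EntityState.Detached;
}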
You can try using a non-entity class with only the specific properties you need, populated via a Select projection.
public class MyObject2
{
    public int ID { get; set; }
    public string Name { get; set; }
}

public List<MyObject2> GetObjects()
{
    return framework.provider.GetObjects().Select(
        x => new MyObject2
        {
            ID = x.ID,
            Name = x.Name
        }).ToList();
}
Since you will be storing plain C# objects, you will not have to worry about dynamic proxies, and you will not have to call Detach on anything at all. You can also store only the few properties you actually need.
Even if you disable tracking, you will still see dynamic proxies, because EF uses a dynamically generated class derived from your class that stores extra metadata for the entity (e.g. relationship information such as the names of foreign keys to other entities).
Steps to reduce memory here:
Re-new the context often. Don't try to delete content from the context or set entries to detached; the context hangs around like a fart in a phone box.
e.g. context = new MyContext();
But if possible you should be using:
using (var context = new MyContext()) { .... } // short-lived contexts are best practice
On your context you can set configuration options:
this.Configuration.LazyLoadingEnabled = false;
this.Configuration.ProxyCreationEnabled = false; //<<<<<<<<<<< THIS one
this.Configuration.AutoDetectChangesEnabled = false;
You can disable proxies if you still feel they are hogging memory, but that may be unnecessary if you wrap the context in a using block in the first place.
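A minimal sketch of where those settings usually go, assuming a context (named MyReadOnlyContext here purely for illustration) that is only used for these read-only cache loads:

using System.Data.Entity;

// Sketch: a context dedicated to read-only cache loads, with lazy loading,
// proxy creation and automatic change detection switched off in the constructor.
public class MyReadOnlyContext : DbContext
{
    public MyReadOnlyContext()
    {
        Configuration.LazyLoadingEnabled = false;
        Configuration.ProxyCreationEnabled = false;
        Configuration.AutoDetectChangesEnabled = false;
    }

    // Assumed to match the myObjectsTable set used in the question.
    public DbSet<myObject> myObjectsTable { get; set; }
}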
I would redesign the solution a bit:
You are storing all the data as a single cache entry; I would instead have one entry per item.
You are using the HttpRuntime cache; I would use AppFabric Caching, also from MS, also free.
Not sure where you are calling that code from; I would call it on application start, so all the data is in memory by the time a user needs it (see the sketch below).
You are using Entity SQL; for this I would use an EntityDataReader: http://msdn.microsoft.com/en-us/library/system.data.entityclient.entitydatareader(v=vs.110).aspx
See also:
http://msdn.microsoft.com/en-us/data/hh949853.aspx
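For the "call it on application start" point, a rough sketch (assuming the GetMyCachedObjects method from the question lives on a class I'm calling CachedObjectProvider here) would be:

// Global.asax.cs sketch: warm the cache once at startup so the first request
// doesn't pay the cost of loading ~15,000 records.
protected void Application_Start(object sender, EventArgs e)
{
    // Populates the HttpRuntime cache on a miss; subsequent calls are cheap.
    CachedObjectProvider.GetMyCachedObjects();
}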
I'm beginning to work on the caching infrastructure for my ASP.NET MVC site. The problem is, I can't seem to find a reasonable place for data caching (other than 'everywhere')
Right now my architecture looks like this:
Controller -> Service Layer -> Repository. The repository uses Linq to SQL for data access.
The repository exposes generic methods like Insert, GetById, and GetQueryable, which returns an IQueryable that the service layer can further refine.
I like the idea of putting caching in the repository layer, since the service layer shouldn't really care where the data comes from. The problem though is with cache invalidation. The service layer has more information about when data becomes stale than the repository. For instance:
Suppose we have a Users table and an Orders table (the canonical example). The service layer offers methods like GetOrder(int id), which would call the repository layer:
public Order GetOrder(int id)
{
    using (var repo = _repoFactory.Create<Order>())
    {
        return repo.GetById(id);
    }
}
or
repo.GetQueryable(order => order.Id == id && order.HasShipped == false).Single();
If we cache in the repository layer, it seems like the repository would be very limited in knowing when that order data has changed. Suppose the user was deleted, causing all their orders to be deleted by a CASCADE. The service layer could invalidate the Orders cache, since it knows the user was just removed. The repository, though (since it's a Unit of Work), wouldn't be aware. (Ignore the fact that we shouldn't be querying orders for a deleted user; it's just an example.)
There are other situations where I think this shows itself. Suppose we want to fetch all of a user's orders:
repo.GetQueryable(order => order.UserId == userId).ToList()
The repository can cache the results of this query, but if another order is added, the cached result is no longer valid; only the service layer is aware of this, though.
It's also possible that my understanding of the repository layer is wrong. I sort of view it as a facade around the data source (i.e. if we switch from L2SQL to EF to whatever, the service layer is unaware of the underlying source).
Realistically, you will need another layer; the data caching layer. It will be used by your service layer when requesting data. Upon such a request, it will decide if it has the data in cache or if it needs to query the appropriate repository. Likewise, your service layer can tell this new data caching layer of an invalidation (the deletion of a particular user, etc.).
What this can mean for your architecture though, is that your data caching layer will implement the same interface(s) your repositories do. A fairly simple implementation would cache the data by entity type and key. However, if you are using a more sophisticated ORM behind the scenes (NHibernate, EF 4, etc.), it should have caching as an option for you.
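A hedged sketch of that idea, as a caching decorator that sits in front of a repository and exposes the same interface (the IRepository<T> shape and the cache-key scheme here are assumptions for illustration, not your actual code):

using System;
using System.Runtime.Caching;

// Assumed repository shape for illustration only.
public interface IRepository<T> where T : class
{
    T GetById(int id);
}

// The data caching layer: same interface as the real repository, so the service
// layer doesn't know (or care) whether data came from cache or the database.
public class CachedRepository<T> : IRepository<T> where T : class
{
    private readonly IRepository<T> _inner;
    private readonly ObjectCache _cache = MemoryCache.Default;

    public CachedRepository(IRepository<T> inner)
    {
        _inner = inner;
    }

    public T GetById(int id)
    {
        string key = typeof(T).Name + ":" + id;

        var cached = _cache.Get(key) as T;
        if (cached != null)
            return cached;

        T entity = _inner.GetById(id);
        if (entity != null)
            _cache.Set(key, entity, DateTimeOffset.Now.AddMinutes(5));

        return entity;
    }

    // Called by the service layer when it knows the data went stale
    // (e.g. the user owning these orders was deleted). Exposed on the decorator
    // or on a separate invalidation interface, since it isn't part of IRepository<T>.
    public void Invalidate(int id)
    {
        _cache.Remove(typeof(T).Name + ":" + id);
    }
}

The service layer would then be handed a CachedRepository<Order> through the usual repository interface and would call Invalidate when it knows the underlying data changed.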
You could put an event on the objects returned by your repositories, and have the repository subscribe the cache invalidation to a handler.
For example,
public class SomethingRepository
{
    public Something GetById(int id)
    {
        var something = _table.Single(x => x.Id == id);
        something.DataChanged += this.InvalidateCache;
        return something;
    }

    public void InvalidateCache(object sender, EventArgs e)
    {
        // invalidate your cache
    }
}
And your Something object needs to have a DataChanged event and some public method for your service layer to call to trigger it. Like,
public class Something
{
    private int _id;

    public int Id
    {
        get { return _id; }
        set
        {
            if (_id != value)
            {
                _id = value;
                OnDataChanged();
            }
        }
    }

    public event EventHandler DataChanged;

    public void OnDataChanged()
    {
        if (DataChanged != null)
            DataChanged(this, EventArgs.Empty);
    }
}
So, all your service layer needs to know is that the data is being changed, and the repository handles the cache invalidation.
I also suggest you take ventaur's advice and put the cache invalidation logic in a separate service. You don't need to go so far as to create a separate "data caching layer", but the logic would be cleaner if kept in a different class.
I have an ASP.NET web application and I want to be able to take items from a master list and store them temporarily in one of four other lists. The 'other' lists need to survive postbacks so that more items can be added to them. What direction would you suggest going in?
I have thought of using a generic list stored in memory, temporarily storing the items in the database and calling them back on postback, or storing them in the ViewState, but I have a feeling there is some solution I'm missing that might be easier or better.
Josh laid out the state options pretty well. My recommendation for a smaller list, like he said, would be Session state. Using the DB would be a little messy because you'd have to maintain those temp tables and worry about multi-session access to them; likewise, the cache has the same problem. ViewState gives you the same thing with extra client traffic and insecure data. So if you're talking about less than a few thousand instances on a low-traffic server, Session is likely fine.
To make Session easier to work with (and you can do this with caching and application state as well), set up a container object that manages the lists.
//To use it in your page, you can easily access it via:
ListManagerContext.Current.MasterList.Add(4);
[Serializable]
public class ListManagerContext
{
    public List<int> MasterList { get; set; }
    public List<int> SubList1 { get; set; }
    public List<int> SubList2 { get; set; }
    public List<int> SubList3 { get; set; }

    /// <summary>
    /// Key used for the list manager context session variable.
    /// </summary>
    public const string ListManagerContextKey = "ListManagerContext";

    /// <summary>
    /// Gets the current ListManagerContext for this session.
    /// If none exists, it returns a brand new one.
    /// </summary>
    [XmlIgnore]
    public static ListManagerContext Current
    {
        get
        {
            HttpContext context = HttpContext.Current;
            if (context != null && context.Session != null)
            {
                ListManagerContext data = null;
                if (context.Session[ListManagerContextKey] == null)
                {
                    data = new ListManagerContext();
                    context.Session[ListManagerContextKey] = data;
                }
                else
                {
                    data = context.Session[ListManagerContextKey] as ListManagerContext;
                }
                return data;
            }

            throw new ApplicationException(
                "No session available for list manager context.");
        }
    }
}
The first thing I would suggest is to see if you can remove the need of keeping state across postbacks.
If you can't do so (and ViewState is not applicable for some reason, like bandwidth limitations or the need to preserve the data even without a postback from a server form), I suggest considering Session. You can configure session state to use a SQL Server database backend whenever you want, without having to change source code.
The database idea is likely a poor one (assuming you're not dealing with large amounts of data).
Perhaps your best method would be to store the main list in ViewState, and have the other lists be lists of indexes into the first list (see the sketch below).
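A rough sketch of that index-list idea in a Web Forms code-behind (types simplified to string/int and property names made up; assumes System.Collections.Generic and System.Linq):

// Sketch: the master list lives in ViewState once; each sub-list stores only
// indexes into it, which keeps the ViewState payload small.
private List<string> MasterList
{
    get { return ViewState["MasterList"] as List<string>; }
    set { ViewState["MasterList"] = value; }
}

private List<int> SubList1Indexes
{
    get { return ViewState["SubList1Indexes"] as List<int>; }
    set { ViewState["SubList1Indexes"] = value; }
}

// Resolve the actual items of sub-list 1 when needed.
private IEnumerable<string> GetSubList1Items()
{
    return SubList1Indexes.Select(i => MasterList[i]);
}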
The lists should automatically store their values in ViewState. If they don't, you probably need to turn ViewState on for those controls.
If you want to make the data survive the round trip manually, you can store it either in Session or in ViewState yourself. Technically ViewState makes the most sense, but if there's a lot of data it can make the ViewState very large and slow down the round trip. The only issue with Session is that you'll have to make sure you clear it once you leave the page.
Don't use the database; that's not what it's for.
You could store the list in ViewState or Session and assign it to a property. Here's a simple example using a generic list of string, but it can be any serializable type.
private List<String> MyTempList
{
    get { return Session["mylist"] as List<String>; }
    set { Session["mylist"] = value; }
}

protected void Page_Load(object source, EventArgs e)
{
    if (!IsPostBack)
    {
        MyTempList = new List<String>();
    }
    else
    {
        MyTempList.Add("Something");
    }
}
All of those are options, and all have pros and cons:
Database:
Storing items in the database is a fairly easy and consistent option. You do have to worry about making a round-trip call to the database, but at least you have a centralized location to store the data that will scale easily with your web load. However, if this is short-lived data, you will have to worry about cleaning up your database, as it might begin to get unwieldy.
Session/Cache:
Session affords a quick solution for in-memory storage, but scaling can become problematic if the amount of data is very large. The more information you store in memory, the less capacity you have for concurrent users. Also, if you start to add multiple web servers, you will have to look into some sort of session state server to make sure users don't spontaneously lose their session.
Cache has basically the same pros and cons, except for the additional complexity of having to make sure you expire cache items and manage concurrency issues.
Again, these are both easy-to-implement solutions, but they don't scale as well under heavy load or with large amounts of data.
ViewState:
ViewState is also an easy-to-implement solution; it moves the load off the server and onto the client, but it can result in longer load times for the end user. It is also important to remember that ViewState can be tampered with, so if security is a concern, you'll want to take extra precautions to ensure data integrity.
Conclusion:
All in all, figure out what you want to accomplish and choose the solution that best fits your needs. Shove it behind some abstraction layer like an interface so you can easily change the details later, and then you won't have to worry as much. It's all about knowing what will work best in your particular scenario.
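To make that abstraction-layer suggestion concrete, here is a minimal sketch; the IListStore name and shape are invented for illustration, with Session as one possible backing store:

using System.Collections.Generic;
using System.Web;

// Invented interface: the page only talks to IListStore, so the backing store
// (Session, ViewState, cache, database) can be swapped later.
public interface IListStore
{
    List<string> Load(string key);
    void Save(string key, List<string> items);
}

// One possible implementation backed by Session state.
public class SessionListStore : IListStore
{
    public List<string> Load(string key)
    {
        return HttpContext.Current.Session[key] as List<string> ?? new List<string>();
    }

    public void Save(string key, List<string> items)
    {
        HttpContext.Current.Session[key] = items;
    }
}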