Ways to store an object across multiple postbacks - ASP.NET

For the sake of argument assume that I have a webform that allows a user to edit order details. User can perform the following functions:
Change shipping/payment details (all simple text/dropdowns)
Add/Remove/Edit products in the order - this is done with a grid
Add/Remove attachments
Products and attachments are stored in separate DB tables with foreign key to the order.
Entity Framework (4.0) is used as ORM.
I want to allow the users to make whatever changes they want to the order and only when they hit 'Save' do I want to commit the changes to the database. This is not a problem with textboxes/checkboxes etc. as I can just rely on ViewState to get the required information. However the grid is presenting a much larger problem for me as I can't figure out a nice and easy way to persist the changes the user made without committing the changes to the database. Storing the Order object tree in Session/ViewState is not really an option I'd like to go with as the objects could get very large.
So the question is - how can I go about preserving the changes the user made until ready to 'Save'.
Quick note - I have searched SO to try to find a solution, however all I found were suggestions to use Session and/or ViewState - both of which I would rather not use due to potential size of my object trees

If you have control over the schema of the database and the other applications that utilize order data, you could add a flag or status column to the orders table that differentiates between temporary and finalized orders. Then, you can simply store your intermediate changes to the database. There are other benefits as well; for example, a user that had a browser crash could return to the application and be able to resume the order process.
I think sticking to the database for storing data is the only reliable way to persist data, even temporary data. Using session state, control state, cookies, temporary files, etc., can introduce a lot of things that can go wrong, especially if your application resides in a web farm.
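A minimal sketch of that idea, assuming your EF 4.0 model exposes the Order entity with a new Status column through a context called OrdersEntities (the context name, column, and enum here are placeholders, not from the question):

using System;
using System.Linq;

// Hypothetical status values stored in a new Orders.Status column.
public enum OrderStatus
{
    Draft = 0,      // still being edited; hidden from the rest of the system
    Finalized = 1   // the user clicked 'Save'
}

public static class DraftOrders
{
    // Persist intermediate edits on every postback; the page only needs to
    // remember the order id (ViewState, query string, or a cookie).
    public static void SaveDraft(OrdersEntities db, Order order)
    {
        order.Status = (int)OrderStatus.Draft;
        db.SaveChanges();
    }

    // Promote the draft when the user finally hits 'Save'.
    public static void FinalizeOrder(OrdersEntities db, int orderId)
    {
        Order order = db.Orders.Single(o => o.OrderId == orderId);
        order.Status = (int)OrderStatus.Finalized;
        db.SaveChanges();
    }
}

Queries elsewhere in the system would then filter on Status, and a scheduled job could purge drafts that were abandoned after a browser crash or timeout.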

If using the Session is not your preferred solution, which is probably wise, the best possible solution would be to create your own temporary database tables (or as others have mentioned, add a temporary flag to your existing database tables) and persist the data there, storing a single identifier in the Session (or in a cookie) for later retrieval.

First, you may want to segregate your specific state management implementation into its own class so that you don't have to replicate it throughout your systems.
Second, you may want to consider a hybrid approach - use session state (or cache) for a short time to avoid unnecessary trips to a DB or other external store. After some amount of inactivity, write the cached state out to disk or the DB. The simplest way to do this is to serialize your objects to text (using either built-in serialization or a library like protocol buffers). This helps you avoid creating a redundant or duplicate data structure to capture the in-progress data relationally. If you don't need to query the content of this data, it's a reasonable approach.
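A rough sketch of that segregated state manager, assuming a hypothetical DB-backed IDraftStore for idle drafts and an OrderDraft placeholder type (all names below are invented, not from the question):

using System;
using System.Web;

// Hides where in-progress order state actually lives (session now, DB later),
// so pages only ever talk to this one class.
public class OrderDraftStateManager
{
    private readonly IDraftStore _store; // hypothetical DB-backed store for idle/expired drafts

    public OrderDraftStateManager(IDraftStore store) { _store = store; }

    public void Save(string key, OrderDraft draft)
    {
        // Hot path: keep recently touched drafts in session to avoid DB round-trips.
        HttpContext.Current.Session[key] = draft;
    }

    public OrderDraft Load(string key)
    {
        var cached = HttpContext.Current.Session[key] as OrderDraft;
        if (cached != null)
            return cached;

        // Fallback: after a period of inactivity the draft was serialized to text
        // and flushed to the DB (e.g. from Session_End or a background job), so
        // rehydrate it from there.
        return _store.Load(key);
    }
}

public interface IDraftStore { OrderDraft Load(string key); }

[Serializable]
public class OrderDraft { /* edited products, attachments, etc. */ }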
As an aside, in the database world, the problem you describe is called a long-running transaction. You essentially want to avoid making changes to the data until you reach a user-defined commit point. There are techniques you can use in the database layer, like hypothetical views and instead-of triggers, to encapsulate the fact that you aren't actually committing the change. The data is in the DB (in the real tables), but is only visible to the user operating on it. This is probably a more complicated implementation than you may be willing to undertake, and requires intrusive changes to your persistence layer and data model - but it allows the application to be ignorant of the issue.

Have you considered storing the information in a JavaScript object and then sending that information to your server once the user hits save?

Use domain events to capture the user's actions and then replay those actions over a snapshot of the order model (effectively the current state of the order before the user started changing it).
Store each change as a series of events e.g. UserChangedShippingAddress, UserAlteredLineItem, UserDeletedLineItem, UserAddedLineItem.
These events can be saved after each postback and only need a link to the related order. Rebuilding the current state of the order is then as simple as replaying the events over the currently stored order objects.
When the user clicks save, you can replay the events and persist the updated order model to the database.
You are using the database - no session or viewstate is required - so you can significantly reduce page weight and server memory load, at the expense of some page performance (if you choose to rebuild the model on each postback).
Maintenance is incredibly simple: thanks to the ease with which you can implement domain objects, automated testing can be used to ensure the system behaves as you expect (while also documenting your intentions for other developers).
Because you are leveraging the database, the solution scales well across multiple web servers.
Using this approach does not require any alterations to your existing domain model, therefore the impact on existing code is minimal. The biggest downside is getting your head around the concept of domain events and how they are used and abused =)
This is effectively the same approach as described by Freddy Rios, with a little more detail about how it works and some nice keywords for you to search with =)
http://jasondentler.com/blog/2009/11/simple-domain-events/ and http://www.udidahan.com/2009/06/14/domain-events-salvation/ are some good background reading about domain events. You may also want to read up on event sourcing, as this is essentially what you would be doing (snapshot object, record events, replay events, snapshot object again).
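A minimal sketch of the event-replay idea; the event classes and the Order/OrderLine model below are purely illustrative and not taken from the question:

using System;
using System.Collections.Generic;

// Each user action is captured as a small, persistable event linked to the order.
public abstract class OrderEvent
{
    public int OrderId { get; set; }
    public abstract void Apply(Order order);
}

public class UserChangedShippingAddress : OrderEvent
{
    public string NewAddress { get; set; }
    public override void Apply(Order order) { order.ShippingAddress = NewAddress; }
}

public class UserAddedLineItem : OrderEvent
{
    public int ProductId { get; set; }
    public int Quantity { get; set; }
    public override void Apply(Order order)
    {
        order.Lines.Add(new OrderLine { ProductId = ProductId, Quantity = Quantity });
    }
}

// Snapshot model - effectively the order as currently stored in the database.
public class Order
{
    public Order() { Lines = new List<OrderLine>(); }
    public string ShippingAddress { get; set; }
    public List<OrderLine> Lines { get; private set; }
}

public class OrderLine
{
    public int ProductId { get; set; }
    public int Quantity { get; set; }
}

public static class OrderProjection
{
    // Rebuild the in-progress state by replaying the saved events over the snapshot.
    // On 'Save', the same replay produces the model that gets persisted.
    public static Order Replay(Order snapshot, IEnumerable<OrderEvent> events)
    {
        foreach (var e in events)
            e.Apply(snapshot);
        return snapshot;
    }
}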

How about serializing your domain object (the contents of your grid/shopping cart) to JSON and storing it in a hidden field? ScottGu has a nice article on how to serialize objects to JSON. It scales across a server farm and I'd guess it would not add much payload to your page. Maybe you can write your own JSON serializer to do a "compact serialization" (you would not need the product name, SKU, etc.; maybe you can just serialize the product ID and quantity).
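A sketch of that compact serialization using the built-in JavaScriptSerializer; the CartLine type and the choice of fields are made up for illustration:

using System.Collections.Generic;
using System.Web.Script.Serialization;

// Only the fields needed to rebuild the order later - no names, descriptions or SKUs.
public class CartLine
{
    public int ProductId { get; set; }
    public int Quantity { get; set; }
}

public static class CartStateHelper
{
    private static readonly JavaScriptSerializer Serializer = new JavaScriptSerializer();

    // Assign the result to a HiddenField.Value before the page renders,
    // e.g. [{"ProductId":7,"Quantity":2},{"ProductId":12,"Quantity":1}]
    public static string ToJson(List<CartLine> lines)
    {
        return Serializer.Serialize(lines);
    }

    // Read the hidden field back on postback / when the user hits Save.
    public static List<CartLine> FromJson(string json)
    {
        return Serializer.Deserialize<List<CartLine>>(json);
    }
}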

Have you considered using a User Profile? .NET comes with the SqlProfileProvider right out of the box. This would allow you to, for each user, grab their profile and save the temporary data as a variable off in the profile. Unfortunately, I think this does require your "Order" to be serializable, but I believe all of the options except Session thus far would require the same.
The advantage of this is it would persist through crashes, sessions, server down time, etc and it's fairly easy to set up. Here's a site that runs through an example. Once you set it up, you may also find it useful for storing other user information such as preferences, favorites, watched items, etc.

You should be able to create a temp file and serialize the object to that, then save only the temp file name to the viewstate. Once they successfully save the record back to the database then you could remove the temp file.
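Roughly, in a Web Forms code-behind, that could look like the sketch below; PendingOrder is a placeholder type standing in for whatever you need to round-trip, and it must be marked [Serializable] for BinaryFormatter:

using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class PendingOrder { /* edited products, attachments, etc. */ }

public partial class EditOrder : System.Web.UI.Page
{
    private string TempFilePath
    {
        get
        {
            // Create the temp file once and remember only its path in ViewState.
            if (ViewState["OrderTempFile"] == null)
                ViewState["OrderTempFile"] = Path.GetTempFileName();
            return (string)ViewState["OrderTempFile"];
        }
    }

    private void SaveDraft(PendingOrder draft)
    {
        using (var stream = File.Create(TempFilePath))
            new BinaryFormatter().Serialize(stream, draft);
    }

    private PendingOrder LoadDraft()
    {
        using (var stream = File.OpenRead(TempFilePath))
            return (PendingOrder)new BinaryFormatter().Deserialize(stream);
    }

    private void CommitAndCleanUp(PendingOrder draft)
    {
        // ... persist the draft to the database here ...
        File.Delete(TempFilePath); // remove the temp file after a successful save
    }
}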

Single server: serialize to the filesystem. This also allows you to let the user resume later.
Multiple servers: serialize it, but store the serialized value in the db.
This is something that's specific to that user, so when you persist it to the db you don't really need all the relational stuff for it.
Alternatively, if the set of data is very large and the amount of changes is usually small, you can store the history of changes made by the user instead. With this you can also show the change history and support undo.

Two approaches. The first is to create a complex AJAX application that stores everything on the client and only submits the entire package of changes to the server. I did this once a few years ago with moderate success. The application is not something I would want to maintain, though. You have a hard time syncing your client code with your server code, and passing fields that are added/deleted/changed is nightmarish.
The second approach is to store changes in the database in a temp table or in "pending" mode. The advantage is that your code is more maintainable. The disadvantage is that you have to have a way to clean up abandoned changes due to session timeouts, power failures, and other crashes. I would take this approach for any new development. You can have separate tables for "pending" and "committed" changes, which opens up a whole new level of features you can add. What if? What changed? etc.

I would go for viewstate, regardless of what you've said before. If you only store the stuff you need, like { id: XX, numberOfProducts: 3 }, and ditch every item that is not selected by the user at this point; the viewstate size will hardly be an issue as long as you aren't storing the whole object tree.
When storing attachments, put them in a temporary storage location, and reference the filename in your viewstate. You can have a scheduled task that cleans out of the temp folder every file last saved more than a day ago, or something similar.
This is basically the approach we use for storing information when users are adding floorplan information and attachments in our backend.
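For example, a trimmed-down item like the sketch below (property names mirror the { id, numberOfProducts } example above) keeps each grid row down to a couple of integers in view state:

using System;
using System.Collections.Generic;

[Serializable]
public class SelectedProduct
{
    public int Id { get; set; }
    public int NumberOfProducts { get; set; }
}

// In the page: keep only the rows the user actually selected, not the whole object tree.
// ViewState["SelectedProducts"] = selected;                                // store on postback
// var selected = (List<SelectedProduct>)ViewState["SelectedProducts"];     // read back later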

Are the end-users internal or external clients? If your clients are internal users, it may be worthwhile to look at an alternate set of technologies. Instead of webforms, consider using a platform like Silverlight and implementing a rich GUI there.
You could then store complex business objects within the applet, provide persistent "in progress" edit tracking across multiple sessions via offline storage, and easily integrate with back-end services that provide saving/processing of the finalised order. All whilst maintaining access via the web (albeit closing out most *nix clients).
Alternatives include Adobe Flex or AJAX, depending on resources and needs.

How large do you consider large? If you are talking about session state (so it doesn't go back and forth to the actual user, like view state does) then session state is often a pretty good option. Everything except the in-process state provider uses serialization, but you can influence how it is serialized. For example, I would tend to create a local model that represents just the state I care about for that operation (plus any id/rowversion information), rather than the full domain entities, which may carry extra overhead.
To reduce the serialization overhead further, I would consider using something like protobuf-net; this can be used as the implementation for ISerializable, allowing very light-weight serialized objects (generally much smaller than BinaryFormatter, XmlSerializer, etc), that are cheap to reconstruct at page requests.
When the page is finally saved, I would update my domain entities from the local model and submit the changes.
For info, to use a protobuf-net attributed object with the state serializers (typically BinaryFormatter), you can use:
using System;
using System.Runtime.Serialization;
using ProtoBuf;

// a simple, session-state friendly, light-weight UI model object
[Serializable, ProtoContract]
public class MyType : ISerializable
{
    [ProtoMember(1)]
    public int Id { get; set; }
    [ProtoMember(2)]
    public string Name { get; set; }
    [ProtoMember(3)]
    public double Value { get; set; }
    // etc

    public MyType() { } // default constructor

    // deserialization constructor used by BinaryFormatter
    protected MyType(SerializationInfo info, StreamingContext context)
    {
        Serializer.Merge(info, this);
    }

    // BinaryFormatter delegates the actual serialization work to protobuf-net
    void ISerializable.GetObjectData(SerializationInfo info, StreamingContext context)
    {
        Serializer.Serialize(info, this);
    }
}

Related

Storing large object to InProc session rather than reloading on every page

This is my first post/question so please let me know if/how I can improve it. I found similar questions but nothing quite covered this.
When you store to InProc session you're just storing a reference to the data. So, if I have a public property foo, and I store it in Session("foo") = foo, then I haven't really taken up any additional memory (aside from the 32/64 bits used by the pointer)?
In my case, we are currently reloading foo on every page of our website, so if I were to instead store it in session, then it should be taking the same amount of space, but without needing to reload on every page. I'm seeing a lot of people say not to store large objects in session, but if that large object already exists, what difference does it make to have a pointer to it? Of course I would remove the object from session the moment it was no longer needed.
The data we are trying to store is an object specific to the user's current work, but not user data. As an analogy, say the user was a car dealer, and he is looking at all the data for a particular customer. We have multiple pages for this customer, and we want to keep all the customer info loaded on each page. All the customer data is stored in a single XML data column in a SQL table, which we parse on every page.
We have tried binary serialization instead of parsing xml, so we could store with session in state server mode, but we found the performance to actually be worse.
We are running on a single web server.
First off, no. When you store something in session state, all the data required to store that object is consumed by the website process(es). Just because .NET treats variables like references doesn't mean it actually uses less memory than a no-GC language. It just means that copying that variable around is done efficiently without using reference operators or pointers.
Your question is a bit vague, but you have a few options for persisting data:
1) Send the data to the client as JSON and store it on the browser if it should be per-user and is needed more on the client side than the server side. You can then send pieces of the data with different requests if you need to (put it in hidden fields if you have to use ASPX web forms).
2) Store it in the session state if it is a small bit of per user data.
3) Store it in the ASP.NET cache if it is large and common to all users, see here (https://msdn.microsoft.com/en-us/library/6hbbsfk6.aspx).
4) If it is large and user-specific and is used primarily on the server, then you have more of a performance problem. You should see if you can break the user-specific stuff out from the static stuff. If you do that and it's still large, then a database may not be a bad solution. If you are already using DB calls in your application then looking up this data on every request won't cause too much overhead, and you won't have to regenerate it from scratch (you should only do this if the data takes considerable time to generate, as a DB call could be slower than just regenerating the data itself). I recommend writing some sort of middleware (HttpModule or OwinMiddleware) that uses whatever user identity you use for auth to look up the data and then set it on the HttpContext.Current.Items collection; a sketch of such a module follows at the end of this answer. This way the data is usable for the entire request and you can add logic in the middleware to figure out when to set it.
I would think that having a large chunk of user-specific data would be a red flag as user data should just be a list of what the user can/can't do and what their preferences are.
If this is static data then it's super simple. The application cache is what you want. The only complications would be if you have multiple servers that need synced data.
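Here is a rough sketch of the per-request lookup described in option 4, with a hypothetical LoadUserWorkData call standing in for whatever repository/DAL you already have:

using System;
using System.Web;

// Registered in web.config under <httpModules> (classic) or <system.webServer>/<modules> (integrated).
public class UserWorkDataModule : IHttpModule
{
    public void Init(HttpApplication context)
    {
        // Runs after authentication so the user identity is available.
        context.PostAuthenticateRequest += OnPostAuthenticateRequest;
    }

    private void OnPostAuthenticateRequest(object sender, EventArgs e)
    {
        var app = (HttpApplication)sender;
        if (!app.Context.Request.IsAuthenticated)
            return;

        string userName = app.Context.User.Identity.Name;

        // Hypothetical data-access call - replace with your own lookup.
        object workData = LoadUserWorkData(userName);

        // Available to every page/handler for the remainder of this request.
        app.Context.Items["UserWorkData"] = workData;
    }

    private static object LoadUserWorkData(string userName)
    {
        // ... query the database here ...
        return null;
    }

    public void Dispose() { }
}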

Cache data until changed

I have a legacy website that needs a little optimization because of poor performance. It is an ASP.NET shopping website with LINQ to SQL as the data layer and the MVP pattern for the UI.
The most costly entities in the db are the product and category tables, which have a one-to-many relationship. These two entities might not change regularly unless a user in the admin group decides to add a product or category, etc. I was wondering how resource-costly it would be to create and fetch everything from these two entities for each request, so I'd like a way to keep my data alive…
First I thought, well, let's use AJAX for data retrieval so I only create those entities that I need to query or bind to; but wait, how can I do that without creating a new DataContext instance?!
On the other hand, caching the whole DataContext is considered a bad decision because of the memory cost. So what would be the best option here? How can I improve things?
UPDATE
1) Doing what #HatSoft suggested.
Cons: those approaches will not help your code, only the database. Besides this, there might be memory issues since we're putting data in memory instead of rendered HTML; however, this might be the best option regarding de-coupling.
2) Using output caching, we have this code in an HTTP handler with an *.aspx wildcard:
string pagePath = Context.Request.Url.AbsolutePath;
object cacheKey = application[pagePath];
if (cacheKey == null)
    return; // application restarted/first run, so cache the stuff
else
    Context.Response.RemoveOutputCacheItem(pagePath);
Cons: now we should link the pagePath to each database entity that the page uses, but if I do so then I'm coupling things instead of de-coupling them. This approach will also involve a little hard-coding.
3) Another solution would be output caching in post-cache substitution mode instead of control-cache mode, using the Substitution element and setting the OutputCache Duration to 86400 so the page will be re-created every 24 hours.
Cons: hard-coding user controls to produce the HTML output for the Substitution element dynamically.
So what do you suggest?
I would suggest you look into the SqlDependency class; please read this article: http://www.asp.net/web-forms/tutorials/data-access/caching-data/using-sql-cache-dependencies-cs
Also, I would suggest you look into loading data into the cache at application startup if it suits your application. Please see a good example here: http://www.asp.net/web-forms/tutorials/data-access/caching-data/caching-data-at-application-startup-cs
With Linq2SQL you can use LinqToCache, which offers a SqlDependency-powered cache for your LINQ queries. It transforms the IQueryable<Products> into an IEnumerable<Products> and enumerates from memory after the first access (the first iteration of the underlying IQueryable). Based on SqlDependency data-change notifications it invalidates the list; subsequent accesses will query the DB again and cache the result.
My recommendation would be to cache the Products list and Categories in memory, since they seldom change and I expect them to be of a fairly constrained size.
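A sketch of caching both lists at application startup in Global.asax, assuming a hypothetical ShopDataContext LINQ to SQL context with Products and Categories tables:

using System;
using System.Linq;
using System.Web;

public class Global : HttpApplication
{
    protected void Application_Start(object sender, EventArgs e)
    {
        using (var db = new ShopDataContext()) // hypothetical LINQ to SQL DataContext
        {
            // Materialize once; the lists live in memory until you invalidate them.
            HttpRuntime.Cache.Insert("Products", db.Products.ToList());
            HttpRuntime.Cache.Insert("Categories", db.Categories.ToList());
        }
    }
}

The admin pages that add or edit a product would then refresh or remove those two cache entries, or you could attach a SqlCacheDependency as described in the first link.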

Solution for previewing user changes and allowing rollback/commit over a period of time

I have asked a few questions today as I try to think through to the solution of a problem.
We have a complex data structure where all of the various entities are tightly interconnected, with almost all entities heavily reliant/dependant upon entities of other types.
The project is a website (MVC3, .NET 4), and all of the logic is implemented using LINQ-to-SQL (2008) in the business layer.
What we need to do is have a user "lock" the system while they make their changes (there are other reasons for this which I won't go into here that are not database related). While this user is making their changes we want to be able to show them the original state of entities which they are updating, as well as a "preview" of the changes they have made. When finished, they need to be able to rollback/commit.
We have considered these options:
Holding open a transaction for the length of time a user takes to make multiple changes stinks, so that's out.
Holding a copy of all the data in memory (or cached to disk) is an option, but there is a heck of a lot of it, so that seems unreasonable.
Maintaining a set of secondary tables, or attempting to use session state to store changes, but this is complex and difficult to maintain.
Using two databases, flipping between them by connection string, and using T-SQL to manage replication, putting them back in sync after commit/rollback. I.e. switching on/off, forcing snapshot, reversing direction etc.
We're a bit stumped for a solution that is relatively easy to maintain. Any suggestions?
Our solution to a similar problem is to use a locking table that holds locks per entity type in our system. When the client application wants to edit an entity, we do a "GetWithLock" which gets the client the most up-to-date version of the entity's data as well as obtaining a lock (a GUID that is stored in the lock table along with the entity type and the entity ID). This prevents other users from editing the same entity. When you commit your changes with an update, you release the lock by deleting the lock record from the lock table. Since stored procedures are the API we use for interacting with the database, this allows a very straightforward way to lock/unlock access to specific entities.
On the client side, we implement IEditableObject on the UI model classes. Our model classes hold a reference to the instance of the service entity that was retrieved on the service call. This allows the UI to do a Begin/End/Cancel Edit and do the commit or rollback as necessary. By holding the instance of the original service entity, we are able to see the original and current data, which would allow the user to get that "preview" you're looking for.
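A minimal sketch of that IEditableObject pattern on a UI model class (the single Name field is just for illustration):

using System.ComponentModel;

public class CustomerEditModel : IEditableObject
{
    private string _name;       // current (possibly edited) value
    private string _savedName;  // snapshot taken when editing began
    private bool _editing;

    public string Name
    {
        get { return _name; }
        set { _name = value; }
    }

    public void BeginEdit()
    {
        if (_editing) return;
        _savedName = _name;     // keep the original for preview/rollback
        _editing = true;
    }

    public void CancelEdit()
    {
        if (!_editing) return;
        _name = _savedName;     // rollback to the original value
        _editing = false;
    }

    public void EndEdit()
    {
        if (!_editing) return;
        _savedName = null;      // commit: the edited value becomes the real one
        _editing = false;
    }
}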
While our solution does not implement LINQ, I don't believe there's anything unique in our approach that would prevent you from using LINQ as well.
HTH
Consider this:
Long transactions make the system less scalable. If you issue an UPDATE command, the update locks last until commit/rollback, preventing other transactions from proceeding.
Secondary tables/databases can be modified by concurrent transactions, so you cannot rely on the data in those tables. The only way around that is to lock them => see no. 1.
Serializable transactions in some database engines use versions of the data in your tables, so after the first command is executed the transaction sees exactly the data that was available at execution time. This might help you show the changes made by the user, but you have no guarantee of being able to save them back into storage.
DataSets contain old/new versions of the data, but that is unfortunately outside the technology you're aiming for.
Use a set of secondary tables.
The problem is that your connection should see two versions of data while the other connections should see only one (or two, one of them being their own).
While it is possible theoretically and is implemented in Oracle using flashbacks, SQL Server does not support it natively, since it has no means to query previous versions of the records.
You can issue a query like this:
SELECT *
FROM mytable
AS OF TIMESTAMP
TO_TIMESTAMP('2010-01-17')
in Oracle but not in SQL Server.
This means that you need to implement this functionality yourself (placing the new versions of rows into your own tables).
Sounds like an ugly problem, and raises a whole lot of questions you won't be able to go into on SO. I got the following idea while reading your problem, and while it "smells" as bad as the others you list, it may help you work up an eventual solution.
First, have some kind of locking system, as described by #user580122, to flag/record the fact that one of these transactions is going on. (Be sure to include some kind of periodic automated check, to test for lost or abandoned transactions!)
Next, for every change you make to the database, log it somehow, either in the application or in a dedicated table somewhere. The idea is, given a copy of the database at state X, you could re-run the steps submitted by the user at any time.
Next up is figuring out how to use database snapshots. Read up on these in BOL; the general idea is you create a point-in-time snapshot of the database, do whatever you want with it, and eventually throw it away. (These are only available in SQL Server 2005 and up, Enterprise edition.)
So:
A user comes along and initiates one of these meta-transactions.
A flag is marked in the database showing what is going on. A new transaction cannot be started if one is already in process. (Again, check for lost transactions now and then!)
Every change made to the database is tracked and recorded in such a fashion that it could be repeated.
If the user decides to cancel the transaction, you just drop the snapshot, and nothing is changed.
If the user decides to keep the transaction, you drop the snapshot, and then immediately re-apply the logged changes to the "real" database. This should work, since your requirements imply that, while someone is working on one of these, no one else can touch the related parts of the database.
Yep, this sure smells, and it may not apply too well to your problem. Hopefully the ideas here help you work something out.

ASP.NET 2 Session State Between Authenticated Users

I am developing a website for a client (ASP.NET, T-SQL). It is a data-entry website allowing many of their users to login and manipulate records in the same database.
There are instructions (basically a list of strings) throughout the form, telling the users what to do for each section; these instructions are themselves present in the database.
On each login, I store these instructions in the Session[] object per authenticated user. The instructions are identical for everyone.
I've looked at a solution which suggested storing a common session identifier in the database and then querying it to re-use that particular session but this seems very hacky. What is a best-practices solution to accomplish this? Is there a 'common' object available to all users?
Firstly, does it matter at this point? Yes, it's bad practice and inefficient, but if you're storing 20Kb of strings in memory and have a maximum of 100 users, that's 2,000Kb of data. Hardly a lot of memory "wasted". Even at 200Kb of strings, that's 20,000Kb of data. Again, not a lot. Is it worth your time, and the client waiting for you to solve it, right now?
If you decide it is then you could:
Store the strings in the Application object or a static class so that they're retrieved once and used many times.
Retrieve the strings on every page view. This may not be as performance damaging as it seems.
Use something like the Cache class in System.Web.Caching.
Make use of Output Caching.
Make use of Windows Server AppFabric "Velocity" memory cache.
Sounds to me like you're looking for the Application Cache. Like the Session, it is an in-memory cache of data. Unlike the session, it is shared among all users; each user doesn't get their own individual copy of the data. Also, when you add data elements to the cache, you can specify criteria which will automatically invalidate that data, and cause it to be reloaded/refreshed (useful when your seldom-changing data actually does change :).
Here's some articles which should give you everything you need to know about using the Application cache (and some other caching options within ASP.NET as well):
ASP.NET Caching Overview
Using the ASP.NET Application Cache to Make Your Applications Scream
Caching Data at Application Startup
.NET Data Caching
I would suggest using the application-level Cache object. It is available everywhere as part of HttpContext. You can populate it on App_Start.
You can put any kind of object into Cache, though obviously, the smaller the better.
Here are some examples of how to populate it using C#:
1) Add items to the cache as you would add items to a dictionary by specifying the item's key & value.
Example: add the current Value property of a text box to the cache.
Cache["txt1"] = txtName.value;
or
Cache["result"] = dataset;
2) The Insert method is overloaded, allowing you to define values for the parameters of the version you're using.
Example: add only an item key & value:
Cache.Insert("MyData1", connectionString);
3) The Add method takes the same parameters as the fullest Insert overload (all of them are required) and returns the cached object if an item with the same key already exists, otherwise null.
Cache.Add("MyData1", connectionString, null,
    Cache.NoAbsoluteExpiration, Cache.NoSlidingExpiration,
    CacheItemPriority.Default, null);
To retrieve an item from the cache:
stringName = (string)Cache["MyData1"];
The indexer returns object, so cast the result to the stored data type, for example:
result = (DataSet)Cache["result"];
One of the benefits of using the Cache object as opposed to the Application object is that the CLR will dump contents of Cache if the system is in danger of running out of memory.

Where should IsChanged functionality be handled?

I'm having an internal debate about where I should handle checking for changes to data and can't decide on the practice that makes the most sense:
Handling IsChanged in the GUI - This requires persistence of data between page load and posting of data which is potentially a lot of bandwidth/page delivery overhead. This isn't so bad in a win forms application, but in a website, this could start having a major financial impact for bandwidth costs.
Handling it in the DAL - This requires multiple calls to the database to check if any data has changed prior to saving it. This means an extra, needless call to the database, potentially causing scalability issues through unnecessary queries.
Handling it in a Save() stored proc - This would require the stored proc to potentially make an extra needless call to the table to check, but would save the extra call from the DAL to the database. This could potentially scale better than having the DAL handle it, but my gut says this can be done better.
Handling it in a trigger - This would require using a trigger (which I'm emotionally averse to, I tend to avoid triggers except where absolutely necessary).
Don't handle IsChanged functionality at all - It not only becomes hard to handle the "LastUpdated" date, but saving data unnecessarily to the database seems like a bad practice for scalability in itself.
So each approach has its drawbacks and I'm at a loss as to which is the best of this bad bunch. Does anyone have any more scalable ideas for handling data persistence for the specific purpose of seeing if anything has changed?
Architecture: SQL Server 2005, ASP.NET MVC, IIS7, High scalability requirements for non-specific global audience.
Okay, here's another solution - I've not thought through all the repercussions, but it could work I think:
Think about the GetHashCode() comparison functionality:
At page load time, you calculate the hash code for your page data. You store the hashcode in the page data or viewstate/session if that's what you prefer.
At data post (postback) you calculate the hash code of the data that was posted and compare it to the original hash. If it's different, the user changed something and you can save it back to the database.
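A sketch of that comparison; rather than Object.GetHashCode(), which isn't stable across requests or processes, this uses a cryptographic hash over the posted values (the field names and view state key are made up):

using System;
using System.Security.Cryptography;
using System.Text;

public static class ChangeDetector
{
    // Hash the concatenated field values; any edit produces a different hash.
    public static string ComputeHash(params string[] fieldValues)
    {
        string combined = string.Join("|", fieldValues);
        using (var sha = SHA256.Create())
        {
            byte[] bytes = sha.ComputeHash(Encoding.UTF8.GetBytes(combined));
            return Convert.ToBase64String(bytes);
        }
    }
}

// At page load:  ViewState["DataHash"] = ChangeDetector.ComputeHash(name, address, phone);
// At postback:   bool changed = (string)ViewState["DataHash"]
//                    != ChangeDetector.ComputeHash(txtName.Text, txtAddress.Text, txtPhone.Text);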
Pros
You don't have to store all your original data in the page load which cuts down on bandwidth/page delivery overhead.
You don't have to have your DAL do multiple calls to the database to determine if something's changed.
The record will only be updated in the database if something's changed and hence maintain your correct LastUpdated date.
Cons
You still have to load any original data from the database into your business object that wasn't stored in the "viewstate" that is necessary to save a valid record to your database.
A change to one field will change the hash, but you don't know which field unless you pull the original data from the database to compare. On a side note, perhaps you don't need to: if you have to update any of the fields the timestamp changes anyway, and overwriting a field that hasn't changed has, for all intents and purposes, no effect.
You can't completely rule out the chance of collisions, but they would be rare. This comes down to whether the repercussions of a collision are acceptable or not.
Either/Or
If you store the hash in the session, then that saves bandwidth, but increases server resources so you have a potential scalability issue in either case to consider.
Unknowns
Is the overhead of updating a single column any different than that of updating multiple/all columns in a record? I don't know what that performance overhead is.
I handle it in the DAL - it has the original values in it so no need to go to the database.
For each entity in your system, introduce an additional Version field. With this field you will be able to check for changes at the database level.
Since you have a web application, and scalability usually matters for web applications, I would suggest avoiding IsChanged logic at the UI level. The LastUpdated date can be set at the database level during the Save operation.
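One way to sketch that Version/LastUpdated idea with a plain ADO.NET update; the Orders table, ShippingAddress column, and connection handling are illustrative only:

using System.Data.SqlClient;

public static class OrderRepository
{
    // The row is written only when the incoming value actually differs, so
    // LastUpdated and Version only move when something really changed.
    // NULL handling and the remaining columns are omitted for brevity.
    public static bool SaveIfChanged(string connectionString, int orderId, string shippingAddress)
    {
        const string sql =
            @"UPDATE Orders
              SET ShippingAddress = @address,
                  LastUpdated = GETUTCDATE(),
                  Version = Version + 1
              WHERE OrderId = @id
                AND ShippingAddress <> @address";

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            cmd.Parameters.AddWithValue("@address", shippingAddress);
            cmd.Parameters.AddWithValue("@id", orderId);
            conn.Open();
            return cmd.ExecuteNonQuery() == 1; // 0 rows => nothing had changed
        }
    }
}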
