Why add a new entity just to modify another one? - asp.net

I'm reading the source of an ASP.NET Core example project (MSDN) and trying to understand all of it.
There's an Edit Razor page which shows the values of an entity record in <input> fields, allowing the user to see and change different fields of a given record. It contains these lines:
Movie = await _context.Movie.FirstOrDefaultAsync(m => m.ID == id);
...
_context.Attach(Movie).State = EntityState.Modified;
I don't understand why it attaches a new entity and changes its EntityState to Modified, instead of fetching the record, changing it, and then calling SaveChanges().

My guess is that their example is loading the movie in one call, passing it to the view, then in another update action, passing the modified entity from the view back to the controller, which is attaching it, setting its state to modified, and then calling save changes.
IMHO this is an extremely bad practice with EF for a number of reasons, and I have no idea why Microsoft uses it in examples (other than that it makes CRUD look easy-peasy).
By serializing entities to the view, you are typically sending far more data across the wire than your view actually needs, and you give malicious or merely curious users far more information about your system than you should.
You are bound to run into serializer errors with bi-directional references ("A" has a reference to "B", which has a reference back to "A"). Serializers (JSON, for example) generally don't like these.
You are bound to run into performance issues with lazy loading calls as the serializer "touches" references. When dealing with collections of results, the resulting lazy load calls can blow performance completely out of the water.
Without lazy loading enabled, you can easily run into issues where referenced data comes back as null or as potentially incomplete collections, because the Context may have some of the referenced data in its cache that it can associate with the entity, but not the complete set of child records.
By passing entities back to the controller, you expose your system to unintended modifications: an attacker can alter the entity data in ways you never intended, and when you attach it, set the state to Modified, and save, you overwrite the stored data - i.e. change FKs, or otherwise alter data that is not supported, or even displayed, by your UI.
You are bound to run into stale data issues where data can have changed between the point you read the entity initially and the point it is saved. Attaching and saving takes a brutal "last-in-wins" approach to concurrent data.
My general advice is to:
1. Leverage Select or AutoMapper's ProjectTo to populate ViewModel classes with just the fields your view will need. This avoids the risks of lazy loads, and minimizes the data passed to the client. (Faster, and reveals nothing extra about your system)
2. Trust absolutely nothing coming back from the client. Validate the returned view model object, then only after you're satisfied it is legit, load the entity(ies) from the context and copy the applicable fields across. This also gives you the opportunity to evaluate row versioning to handle concurrency issues.
Disclaimer: You certainly can address most of the issues I pointed out while still passing entities back & forth, but you definitely leave the door wide open to vulnerabilities and bugs creeping in when someone just defaults to an attach & save, or lazy loads start creeping in.
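To make points 1 and 2 concrete, here is a rough sketch (not the sample's actual code) against the same Movie entity. MovieEditViewModel is an invented name, and Title/Price stand in for whatever fields the form really edits:
public class MovieEditViewModel
{
    public int Id { get; set; }
    public string Title { get; set; }
    public decimal Price { get; set; }
}

// Inside the Edit page model (the _context field comes from the sample):
[BindProperty]
public MovieEditViewModel Movie { get; set; }

public async Task<IActionResult> OnGetAsync(int id)
{
    // Project only the fields the view needs; no entity is serialized to the client.
    Movie = await _context.Movie
        .Where(m => m.ID == id)
        .Select(m => new MovieEditViewModel { Id = m.ID, Title = m.Title, Price = m.Price })
        .FirstOrDefaultAsync();

    if (Movie == null)
        return NotFound();
    return Page();
}

public async Task<IActionResult> OnPostAsync()
{
    if (!ModelState.IsValid)
        return Page();

    // Load the tracked entity and copy across only the fields the form is allowed to edit.
    var movie = await _context.Movie.FirstOrDefaultAsync(m => m.ID == Movie.Id);
    if (movie == null)
        return NotFound();

    movie.Title = Movie.Title;
    movie.Price = Movie.Price;
    await _context.SaveChangesAsync();

    return RedirectToPage("./Index");
}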

Related

cache data until changed

I have a legacy website that needs a little optimization because of poor performance. It is an ASP.NET shopping website with LINQ to SQL as the data layer and the MVP pattern for the UI.
The most costly entities in the db are the product and category tables, which have a one-to-many relationship. These two entities don't change regularly unless a user in the admin group decides to add a product or category, etc. I was wondering how costly it is, resource-wise, to create and fetch everything from these two entities on each request! So I wish I had a way to keep this data alive between requests…
First I thought: let's use AJAX for data retrieval, so I only create the entities I need to query or bind to. But wait, how can I do that without creating a new DataContext instance?!
On the other hand, caching the whole DataContext is considered a bad idea because of the memory cost. So what would be the best option here? How can I improve things?
UPDATE
1) Doing what @HatSoft suggested.
Cons: those approaches only help the database, not my code. Besides this, there might be memory issues since we're putting data in memory instead of rendered HTML; however, this might be the best option with regard to de-coupling.
2) Using output caching, we have this code in an HTTP handler mapped to the *.aspx wildcard:
string pagePath = Context.Request.Url.AbsolutePath;
object cacheKey = application[pagePath];
if (cacheKey == null)
    return; // application restarted/first run, so cache the stuff
else
    Context.Response.RemoveOutputCacheItem(pagePath);
Cons: now we would have to link the pagePath to each database entity that the page uses, but if I do that I'm coupling things instead of de-coupling them. This approach also involves a little hard-coding.
3) Another solution would be output caching in post-cache substitution mode instead of control cache mode: using the Substitution element and setting the OutputCache Duration to 86400, so the page is re-created every 24 hours.
Cons: hard-coding user controls to produce the HTML output for the Substitution element dynamically.
So what do you suggest?
I would suggest you look into the SqlDependency class; please read this article: http://www.asp.net/web-forms/tutorials/data-access/caching-data/using-sql-cache-dependencies-cs
Also, I would suggest you look into loading data into the cache at application startup if it suits your application. Please see a good example here: http://www.asp.net/web-forms/tutorials/data-access/caching-data/caching-data-at-application-startup-cs
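A minimal sketch of the SqlCacheDependency idea, assuming notifications have been enabled for the Products table with aspnet_regsql and that web.config has a <sqlCacheDependency> database entry named "ShopDb" (the cache key and names are illustrative; Product is the LINQ to SQL entity you already have):
using System.Collections.Generic;
using System.Web;
using System.Web.Caching;

public static class ProductCache
{
    public static List<Product> GetProducts()
    {
        var products = (List<Product>)HttpRuntime.Cache["Products"];
        if (products == null)
        {
            products = LoadProductsFromDb();
            // Entry is evicted automatically when the Products table changes.
            HttpRuntime.Cache.Insert(
                "Products",
                products,
                new SqlCacheDependency("ShopDb", "Products"));
        }
        return products;
    }

    private static List<Product> LoadProductsFromDb()
    {
        // your existing DataContext query goes here
        return new List<Product>();
    }
}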
With Linq2SQL you can use LinqToCache, which offers a SqlDependency-powered cache for your LINQ queries. It transforms the IQueryable<Products> into an IEnumerable<Products> and enumerates from memory after first access (the first iteration of the underlying IQueryable). Based on SqlDependency data-change notifications, it invalidates the list, and subsequent accesses will query the DB again and cache the result.
My recommendation would be to cache the Products and Categories lists in memory, since they seldom change and I expect them to be of a fairly constrained size.

How do I check which values in my Form have changed before saving?

The situation is like this. We have a form with a large number of fields (over 30 spread over several tabs) and what I want to do is find which values have changed before saving with minimum impact on performance. What happens right now is, for editing, single records are queried from several databases. The values are passed over to the client side as value objects. At the moment they are not bound to any fields in the form.
My initial idea was to have a boolean flag for each field, set whenever that field is changed. At save time the program would run through the list of flags to see which fields had changed. This seems more than a bit clunky to me, so I was thinking maybe it could be done on the server side. But then I don't want to go through each field one by one, checking which ones don't match the db records.
Any ideas on what to do here?
This is a very common problem for a lot of Flex applications. Because it happens so often, there are a number of commercial Data Management implementations. Query results are stored in entities, and those entities are bound to a form on the client side. Whenever a field is updated, the framework automatically performs the steps to persist the changes to the db, and it can roll them back when requested.
Adobe LCDS Data Management - If you are dealing with a Java environment
WebOrb - If you are dealing with a .net, php, java, rails environment
Of course you can re-invent the wheel and roll your own: set up PropertyChangeEvent listeners on each field, listen for the change events as they are dispatched, and write handlers for each one.
This sounds exactly like what we're doing with one of the projects I'm working on for a client.
What we do is dupe the value objects once they get back to the UI. Then, when calling the update service, I send both the original object and the new object. In the service, I do a field-by-field compare on the server to determine which values should be sent to the database.
If you need to update every field/property conditionally based on whether or not it changed, then I don't see a way to avoid checking every field/property. Even if you implement your Boolean idea and flip the flag in the UI whenever anything changes, you're still going to have to check those Boolean values when creating your query to determine what should be updated.
In my situation, three different databases are queried to create the value object that gets sent back to the UI. Field updates are saved in one of those databases and given first preference when doing the select. So, we have an explicit field-by-field comparison happening inside a stored procedure.
If you don't need field-by-field comparisons, but rather a record-by-record comparison, then the Boolean approach of flagging that the record/Value Object has changed is going to save you some time and coding.
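If your service layer happens to be .NET (WebOrb was mentioned above), a reflection-based diff is one way to do that server-side field-by-field compare. This is only an illustrative sketch:
using System.Collections.Generic;

public static class ValueObjectDiff
{
    public static Dictionary<string, object> GetChangedFields<T>(T original, T edited)
    {
        var changes = new Dictionary<string, object>();
        foreach (var prop in typeof(T).GetProperties())
        {
            object oldValue = prop.GetValue(original, null);
            object newValue = prop.GetValue(edited, null);
            if (!object.Equals(oldValue, newValue))
                changes[prop.Name] = newValue; // only these columns need updating
        }
        return changes;
    }
}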

Solution for previewing user changes and allowing rollback/commit over a period of time

I have asked a few questions today as I try to think through to the solution of a problem.
We have a complex data structure where all of the various entities are tightly interconnected, with almost all entities heavily reliant/dependent upon entities of other types.
The project is a website (MVC3, .NET 4), and all of the logic is implemented using LINQ-to-SQL (2008) in the business layer.
What we need to do is have a user "lock" the system while they make their changes (there are other, non-database-related reasons for this which I won't go into here). While this user is making their changes we want to be able to show them the original state of entities which they are updating, as well as a "preview" of the changes they have made. When finished, they need to be able to rollback/commit.
We have considered these options:
Holding open a transaction for the length of time a user takes to make multiple changes stinks, so that's out.
Holding a copy of all the data in memory (or cached to disk) is an option, but there is a heck of a lot of it, so that seems unreasonable.
Maintaining a set of secondary tables, or attempting to use session state to store changes, but this is complex and difficult to maintain.
Using two databases, flipping between them by connection string, and using T-SQL to manage replication, putting them back in sync after commit/rollback. I.e. switching on/off, forcing snapshot, reversing direction etc.
We're a bit stumped for a solution that is relatively easy to maintain. Any suggestions?
Our solution to a similar problem is to use a locking table that holds locks per entity type in our system. When the client application wants to edit an entity, we do a "GetWithLock" which gets the client the most up-to-date version of the entity's data as well as obtaining a lock (a GUID that is stored in the lock table along with the entity type and the entity ID). This prevents other users from editing the same entity. When you commit your changes with an update, you release the lock by deleting the lock record from the lock table. Since stored procedures are the API we use for interacting with the database, this allows a very straightforward way to lock/unlock access to specific entities.
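A rough sketch of the lock/unlock calls (all table, column and method names are invented here, and our real implementation sits behind stored procedures rather than inline SQL):
using System;
using System.Data.SqlClient;

public static class EntityLocks
{
    // A unique constraint on (EntityType, EntityId) makes the INSERT fail
    // if another user already holds the lock.
    public static Guid AcquireLock(SqlConnection conn, string entityType, int entityId)
    {
        var lockId = Guid.NewGuid();
        using (var cmd = new SqlCommand(
            "INSERT INTO EntityLock (LockId, EntityType, EntityId, LockedAt) " +
            "VALUES (@lockId, @type, @id, GETUTCDATE())", conn))
        {
            cmd.Parameters.AddWithValue("@lockId", lockId);
            cmd.Parameters.AddWithValue("@type", entityType);
            cmd.Parameters.AddWithValue("@id", entityId);
            cmd.ExecuteNonQuery();
        }
        return lockId;
    }

    // Called when the update is committed or the edit is abandoned.
    public static void ReleaseLock(SqlConnection conn, Guid lockId)
    {
        using (var cmd = new SqlCommand(
            "DELETE FROM EntityLock WHERE LockId = @lockId", conn))
        {
            cmd.Parameters.AddWithValue("@lockId", lockId);
            cmd.ExecuteNonQuery();
        }
    }
}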
On the client side, we implement IEditableObject on the UI model classes. Our model classes hold a reference to the instance of the service entity that was retrieved on the service call. This allows the UI to do a Begin/End/Cancel Edit and do the commit or rollback as necessary. By holding the instance of the original service entity, we are able to see the original and current data, which would allow the user to get that "preview" you're looking for.
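A sketch of what the UI model side can look like (CustomerEntity and its Name property are invented stand-ins for a real service entity and its fields):
using System.ComponentModel;

public class CustomerEntity { public string Name { get; set; } } // stand-in for the real service entity

public class CustomerModel : IEditableObject
{
    private readonly CustomerEntity _original; // instance returned by the service call
    private string _nameBackup;
    private bool _editing;

    public CustomerModel(CustomerEntity original)
    {
        _original = original;
        Name = original.Name;
    }

    public string Name { get; set; }                              // current (edited) value
    public string OriginalName { get { return _original.Name; } } // shown in the "preview"

    public void BeginEdit()
    {
        if (_editing) return;
        _nameBackup = Name;
        _editing = true;
    }

    public void CancelEdit()   // rollback
    {
        if (!_editing) return;
        Name = _nameBackup;
        _editing = false;
    }

    public void EndEdit()      // commit - the caller then pushes Name to the update service
    {
        _editing = false;
    }
}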
While our solution does not implement LINQ, I don't believe there's anything unique in our approach that would prevent you from using LINQ as well.
HTH
Consider this:
Long transactions make the system less scalable. If you issue an UPDATE command, the update locks last until commit/rollback, preventing other transactions from proceeding.
Secondary tables/databases can be modified by concurrent transactions, so you cannot rely on the data in those tables. The only way around that is to lock them => see no. 1.
Serializable transactions in some database engines use versions of the data in your tables, so after the first command is executed, the transaction sees exactly the data that was available at execution time. This might help you show the changes made by the user, but you have no guarantee of being able to save them back into storage.
DataSets contain old/new versions of the data, but that is unfortunately outside your chosen technology.
Use a set of secondary tables.
The problem is that your connection should see two versions of data while the other connections should see only one (or two, one of them being their own).
While it is possible theoretically and is implemented in Oracle using flashbacks, SQL Server does not support it natively, since it has no means to query previous versions of the records.
You can issue a query like this:
SELECT *
FROM mytable
AS OF TIMESTAMP
TO_TIMESTAMP('2010-01-17')
in Oracle but not in SQL Server.
This means that you need to implement this functionality yourself (placing the new versions of rows into your own tables).
Sounds like an ugly problem, and raises a whole lot of questions you won't be able to go into on SO. I got the following idea while reading your problem, and while it "smells" as bad as the others you list, it may help you work up an eventual solution.
First, have some kind of locking system, as described by @user580122, to flag/record the fact that one of these transactions is going on. (Be sure to include some kind of periodic automated check, to test for lost or abandoned transactions!)
Next, for every change you make to the database, log it somehow, either in the application or in a dedicated table somewhere. The idea is, given a copy of the database at state X, you could re-run the steps submitted by the user at any time.
Next up is figuring out how to use database snapshots. Read up on these in BOL; the general idea is you create a point-in-time snapshot of the database, do whatever you want with it, and eventually throw it away. (Only available in SQL 2005 and up, Enterprise edition only.)
So:
A user comes along and initiates one of these meta-transactions.
A flag is marked in the database showing what is going on. A new transaction cannot be started if one is already in process. (Again, check for lost transactions now and then!)
Every change made to the database is tracked and recorded in such a fashion that it could be repeated.
If the user decides to cancel the transaction, you just drop the snapshot, and nothing is changed.
If the user decides to keep the transaction, you drop the snapshot, and then immediately re-apply the logged changes to the "real" database. This should work, since your requirements imply that, while someone is working on one of these, no one else can touch the related parts of the database.
Yep, this sure smells, and it may not apply too well to your problem. Hopefully the ideas here help you work something out.

ASP.NET 2 Session State Between Authenticated Users

I am developing a website for a client (ASP.NET, T-SQL). It is a data-entry website allowing many of their users to login and manipulate records in the same database.
There are instructions (basically a list of strings) throughout the form, telling the users what to do for each section; these instructions are themselves present in the database.
On each login, I store these instructions in the Session[] object per authenticated user. The instructions are identical for everyone.
I've looked at a solution which suggested storing a common session identifier in the database and then querying it to re-use that particular session, but this seems very hacky. What is a best-practices solution to accomplish this? Is there a 'common' object available to all users?
Firstly, does it matter at this point? Yes, it's bad practice and inefficient, but if you're storing 20Kb of strings in memory and have a maximum of 100 users, that's 2,000Kb of data. Hardly a lot of memory "wasted". Even at 200Kb of strings, that's 20,000Kb of data. Again, not a lot. Is it worth your time, and the client waiting for you to solve it, right now?
If you decide it is then you could:
Store the strings in the Application object or a static class so that they're retrieved once and used many times (see the sketch after this list).
Retrieve the strings on every page view. This may not be as performance damaging as it seems.
Use something like the Cache class in System.Web.Caching.
Make use of Output Caching.
Make use of Windows Server AppFabric "Velocity" memory cache.
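As a minimal sketch of the first option above: a static class can load the shared instructions once and hand the same list to every user (InstructionStore and LoadInstructionsFromDb are invented names):
using System.Collections.Generic;

public static class InstructionStore
{
    private static readonly object Sync = new object();
    private static List<string> _instructions;

    public static IList<string> Instructions
    {
        get
        {
            if (_instructions == null)
            {
                lock (Sync)
                {
                    if (_instructions == null)
                        _instructions = LoadInstructionsFromDb(); // one query, shared by all users
                }
            }
            return _instructions;
        }
    }

    private static List<string> LoadInstructionsFromDb()
    {
        // query the instructions table here
        return new List<string>();
    }
}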
Sounds to me like you're looking for the Application Cache. Like the Session, it is an in-memory cache of data. Unlike the session, it is shared among all users; each user doesn't get their own individual copy of the data. Also, when you add data elements to the cache, you can specify criteria which will automatically invalidate that data, and cause it to be reloaded/refreshed (useful when your seldom-changing data actually does change :).
Here are some articles which should give you everything you need to know about using the Application cache (and some other caching options within ASP.NET as well):
ASP.NET Caching Overview
Using the ASP.NET Application Cache to Make Your Applications Scream
Caching Data at Application Startup
.NET Data Caching
I would suggest using the application-level Cache object. It is available everywhere as part of HttpContext. You can populate it in Application_Start.
You can put any kind of object into Cache, though obviously, the smaller the better.
Here are some examples of how to populate it using C#:
1) Add items to the cache as you would add items to a dictionary by specifying the item's key & value.
Example: add the current Value property of a text box to the cache.
Cache["txt1"] = txtName.Value;
or
Cache["result"] = dataset;
2) The Insert method is overloaded, allowing you to define values for the parameters of the version you're using.
Example: add only an item key & value:
Cache.Insert("MyData1", connectionString);
3) The Add method takes the same parameters as the fullest overload of Insert, but it returns the object already stored in the cache under that key (or null if the key was free).
Cache.Add("MyData1", connectionString, null,
    Cache.NoAbsoluteExpiration, Cache.NoSlidingExpiration,
    CacheItemPriority.Default, null);
To retrieve an item from the cache:
stringName = (string)Cache["MyData1"];
The indexer returns object, so cast the result to the proper data type.
result = (DataSet)Cache["result"];
One of the benefits of using the Cache object as opposed to the Application object is that ASP.NET will evict items from the Cache if the system is in danger of running out of memory.

Ways to store an object across multiple postbacks

For the sake of argument assume that I have a webform that allows a user to edit order details. User can perform the following functions:
Change shipping/payment details (all simple text/dropdowns)
Add/Remove/Edit products in the order - this is done with a grid
Add/Remove attachments
Products and attachments are stored in separate DB tables with foreign key to the order.
Entity Framework (4.0) is used as ORM.
I want to allow the users to make whatever changes they want to the order and only when they hit 'Save' do I want to commit the changes to the database. This is not a problem with textboxes/checkboxes etc. as I can just rely on ViewState to get the required information. However the grid is presenting a much larger problem for me as I can't figure out a nice and easy way to persist the changes the user made without committing the changes to the database. Storing the Order object tree in Session/ViewState is not really an option I'd like to go with as the objects could get very large.
So the question is - how can I go about preserving the changes the user made until ready to 'Save'.
Quick note - I have searched SO to try to find a solution; however, all I found were suggestions to use Session and/or ViewState, both of which I would rather not use due to the potential size of my object trees.
If you have control over the schema of the database and the other applications that utilize order data, you could add a flag or status column to the orders table that differentiates between temporary and finalized orders. Then, you can simply store your intermediate changes to the database. There are other benefits as well; for example, a user that had a browser crash could return to the application and be able to resume the order process.
I think sticking to the database for storing data is the only reliable way to persist data, even temporary data. Using session state, control state, cookies, temporary files, etc., can introduce a lot of things that can go wrong, especially if your application resides in a web farm.
If using the Session is not your preferred solution, which is probably wise, the best possible solution would be to create your own temporary database tables (or as others have mentioned, add a temporary flag to your existing database tables) and persist the data there, storing a single identifier in the Session (or in a cookie) for later retrieval.
First, you may want to segregate your specific state management implementation into its own class so that you don't have to replicate it throughout your systems.
Second, you may want to consider a hybrid approach - use session state (or cache) for a short time to avoid unnecessary trips to a DB or other external store. After some amount of inactivity, write the cached state out to disk or the DB. The simplest way to do this is to serialize your objects to text (using either .NET serialization or a library like protocol buffers). This helps you avoid creating redundant or duplicate data structures to capture the in-progress data relationally. If you don't need to query the content of this data, it's a reasonable approach.
As an aside, in the database world, the problem you describe is called a long-running transaction. You essentially want to avoid making changes to the data until you reach a user-defined commit point. There are techniques you can use in the database layer, like hypothetical views and instead-of triggers, to encapsulate the fact that you aren't actually committing the change. The data is in the DB (in the real tables), but is only visible to the user operating on it. This is probably a more complicated implementation than you may be willing to undertake, and requires intrusive changes to your persistence layer and data model - but it allows the application to be ignorant of the issue.
Have you considered storing the information in a JavaScript object and then sending that information to your server once the user hits save?
Use domain events to capture the user's actions and then replay those actions over a snapshot of the order model (effectively the current state of the order before the user started changing it).
Store each change as a series of events e.g. UserChangedShippingAddress, UserAlteredLineItem, UserDeletedLineItem, UserAddedLineItem.
These events can be saved after each postback and only need a link to the related order. Rebuilding the current state of the order is then as simple as replaying the events over the currently stored order objects.
When the user clicks save, you can replay the events and persist the updated order model to the database.
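A hand-rolled sketch of that idea (Order, OrderLine and the event classes here are invented names, not an existing framework):
using System.Collections.Generic;

public class OrderLine
{
    public int ProductId { get; set; }
    public int Quantity { get; set; }
}

public class Order
{
    public Order() { Lines = new List<OrderLine>(); }
    public string ShippingAddress { get; set; }
    public List<OrderLine> Lines { get; private set; }
}

public abstract class OrderEvent
{
    public int OrderId { get; set; } // link back to the related order
    public abstract void Apply(Order order);
}

public class UserChangedShippingAddress : OrderEvent
{
    public string NewAddress { get; set; }
    public override void Apply(Order order) { order.ShippingAddress = NewAddress; }
}

public class UserAddedLineItem : OrderEvent
{
    public int ProductId { get; set; }
    public int Quantity { get; set; }
    public override void Apply(Order order)
    {
        order.Lines.Add(new OrderLine { ProductId = ProductId, Quantity = Quantity });
    }
}

public static class OrderEventReplay
{
    // Rebuild the pending state by replaying the saved events over the stored order;
    // run the same replay once more at save time before persisting.
    public static Order BuildPendingOrder(Order storedOrder, IEnumerable<OrderEvent> events)
    {
        foreach (var e in events)
            e.Apply(storedOrder);
        return storedOrder;
    }
}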
You are using the database - no session or viewstate is required - therefore you can significantly reduce page weight and server memory load, at the expense of some page performance (if you choose to rebuild the model on each postback).
Maintenance is incredibly simple: because domain objects are so easy to implement, automated testing can readily ensure the system behaves as you expect it to (while also documenting your intentions for other developers).
Because you are leveraging the database, the solution scales well across multiple web servers.
Using this approach does not require any alterations to your existing domain model, therefore the impact on existing code is minimal. Biggest downside is getting your head around the concept of domain events and how they are used and abused =)
This is effectively the same approach as described by Freddy Rios, with a little more detail about how and some nice keyword for you to search with =)
http://jasondentler.com/blog/2009/11/simple-domain-events/ and http://www.udidahan.com/2009/06/14/domain-events-salvation/ are some good background reading about domain events. You may also want to read up on event sourcing as this is essentially what you would be doing ( snapshot object, record events, replay events, snapshot object again).
How about serializing your domain object (the contents of your grid/shopping cart) to JSON and storing it in a hidden field? ScottGu has a nice article on how to serialize objects to JSON. It's scalable across a server farm, and I guess it would not add much payload to your page. Maybe you can write your own JSON serializer to do a "compact serialization" (you would not need product name, product ID, SKU id, etc.; maybe you can just "serialize" productID and quantity).
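For example, a sketch of that compact serialization using the built-in JavaScriptSerializer (System.Web.Script.Serialization); CartLine, hdnCart and the order/line property names are all illustrative:
public class CartLine { public int Id { get; set; } public int Qty { get; set; } }

// Before render: keep only product id and quantity, and stash the JSON in a hidden field.
hdnCart.Value = new JavaScriptSerializer().Serialize(
    order.Lines.Select(l => new CartLine { Id = l.ProductId, Qty = l.Quantity }).ToList());

// On postback: rebuild the pending changes from the hidden field.
var pending = new JavaScriptSerializer().Deserialize<List<CartLine>>(hdnCart.Value);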
Have you considered using a User Profile? .Net comes with SqlProfileProvider right out of the box. This would allow you to, for each user, grab their profile and save the temporary data as a variable off in the profile. Unfortunately, I think this does require your "Order" to be serializable, but I believe all of the options except Session thus far would require the same.
The advantage of this is it would persist through crashes, sessions, server down time, etc and it's fairly easy to set up. Here's a site that runs through an example. Once you set it up, you may also find it useful for storing other user information such as preferences, favorites, watched items, etc.
You should be able to create a temp file and serialize the object to that, then save only the temp file name to the viewstate. Once the record is successfully saved back to the database, you can remove the temp file.
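A sketch of that approach, assuming the Order graph is marked [Serializable] (uses System.IO and System.Runtime.Serialization.Formatters.Binary; pendingOrder and the ViewState key are illustrative):
string tempFile = Path.Combine(Path.GetTempPath(), Guid.NewGuid() + ".bin");
using (var stream = File.Create(tempFile))
{
    new BinaryFormatter().Serialize(stream, pendingOrder);
}
ViewState["PendingOrderFile"] = tempFile; // only the file name goes into ViewState

// On a later postback:
var savedFile = (string)ViewState["PendingOrderFile"];
using (var stream = File.OpenRead(savedFile))
{
    pendingOrder = (Order)new BinaryFormatter().Deserialize(stream);
}

// After the record is successfully saved to the database:
File.Delete(savedFile);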
Single server: serialize to the filesystem. This also allows you to let the user resume later.
Multiple server: serialize it but store the serialized value in the db.
This is something that's for that specific user, so when you persist it to the db you don't really need all the relational stuff for it.
Alternatively, if the set of data is very large and the amount of changes is usually small, you can store the history of changes done by the user instead. With this you can also show the change history and support undo.
Two approaches. The first: create a complex AJAX application that stores everything on the client and only submits the entire package of changes to the server. I did this once a few years ago with moderate success, but the application is not something I would want to maintain. You have a hard time syncing your client code with your server code, and passing fields that are added/deleted/changed is nightmarish.
The second approach is to store changes in the database in a temp table or in a "pending" mode. The advantage is that your code is more maintainable. The disadvantage is that you need a way to clean up abandoned changes due to session timeouts, power failures, and other crashes. I would take this approach for any new development. Having separate tables for "pending" and "committed" changes opens up a whole new level of features you can add: what if? what changed? etc.
I would go for viewstate, regardless of what you've said before. If you only store the stuff you need, like { id: XX, numberOfProducts: 3 }, and ditch every item that is not selected by the user at this point, the viewstate size will hardly be an issue as long as you aren't storing the whole object tree.
When storing attachments, put them in a temporary storage location and reference the filename in your viewstate. You can have a scheduled task that cleans the temp folder of every file last saved more than a day ago, or something similar.
This is basically the approach we use for storing information when users are adding floorplan information and attachments in our backend.
Are the end-users internal or external clients? If your clients are internal users, it may be worthwhile to look at an alternate set of technologies. Instead of webforms, consider using a platform like Silverlight and implementing a rich GUI there.
You could then store complex business objects within the applet, provide persistent "in progress" edit tracking across multiple sessions via offline storage, and easily integrate with back-end services that provide saving/processing of the finalised order - all whilst maintaining access via the web (albeit closing out most *nix clients).
Alternatives include Adobe Flex or AJAX, depending on resources and needs.
How large do you consider large? If you are talking session-state (so it doesn't go back and forth to the actual user, like view-state does), then state is often a pretty good option. Everything except the in-process state provider uses serialization, but you can influence how it is serialized. For example, I would tend to create a local model that represents just the state I care about (plus any id/rowversion information) for that operation (rather than the full domain entities, which may have extra overhead).
To reduce the serialization overhead further, I would consider using something like protobuf-net; this can be used as the implementation for ISerializable, allowing very light-weight serialized objects (generally much smaller than BinaryFormatter, XmlSerializer, etc), that are cheap to reconstruct at page requests.
When the page is finally saved, I would update my domain entities from the local model and submit the changes.
For info, to use a protobuf-net attributed object with the state serializers (typically BinaryFormatter), you can use:
using System;
using System.Runtime.Serialization;
using ProtoBuf;

// a simple, session-state friendly, light-weight UI model object
[Serializable, ProtoContract]
public class MyType : ISerializable
{
    [ProtoMember(1)]
    public int Id { get; set; }
    [ProtoMember(2)]
    public string Name { get; set; }
    [ProtoMember(3)]
    public double Value { get; set; }
    // etc

    public MyType() { } // default constructor

    // deserialization constructor used by BinaryFormatter
    protected MyType(SerializationInfo info, StreamingContext context)
    {
        Serializer.Merge(info, this);
    }

    // delegate ISerializable to protobuf-net
    void ISerializable.GetObjectData(
        SerializationInfo info, StreamingContext context)
    {
        Serializer.Serialize(info, this);
    }
}
