We have a very large HTML form (> 100 fields) that updates a SQL Server database with user-entered data. It will take the user a long time to fill out the form, but every piece of information they submit is very valuable to the business process. Even if the user gives up on the form, we want to retain everything they have entered.
We plan to attach an onblur event to each field and use jQuery/AJAX to post each piece of data back to the application server immediately. That part is pretty straightforward. The question we have is when and how to best save this application-level information to the database. Again, our priority is data retention as opposed to performance but we also want to do this as efficiently as possible.
Options as I see it are:
1. Have the web service immediately post each piece of data to the database server.
2. Store the information in a custom class on the application server, then periodically call an update method to post new data to the database.
3. Store the information in view or session state, then run a routine to post this information to the database server.
4. Something else that we haven't thought of.
Option 1 seems the most obviously failsafe, but also the most resource intensive. Option 2 seems the most elegant, but can we be absolutely certain that the custom class instance can't be destroyed without first running its update method?
Thanks for your help!
IMHO, I'd really cut up the form into sections (if possible). Since this is ASP.NET, if you are using Web Forms then look into using wizards (cut up the form into logical steps).
You can do the same without the Wizard control and still cut up the process into logical steps, client-side. You can probably do this in pure JavaScript, but it would likely be easier if you used a framework (jQuery, Knockout, etc.). The concept remains the same: cut up the form entry process into sections (aka "steps"), e.g. using display toggles, a div for each step, etc.
"Retain everything even if abandoned later": I assume the steps are "hierarchical", with the most critical inputs at the beginning. That makes the steps approach even more important: each step is a logical group of inputs you really want, so you can save that step's data to the DB in whatever fashion you deem appropriate (e.g. Ajax, or an ASP.NET post/postback).
Hth...
I would package everything up as XML (or in a DataSet, via .GetXml()) and pass the XML to a stored procedure.
How to pass XML from C# to a stored procedure in SQL Server 2008?
And maybe put the call on a background thread.
http://code.msdn.microsoft.com/CSASPNETBackgroundWorker-dda8d7b6
Passing the XML in one call will be faster than sending the values row by row (RBAR).
You can save just the XML, or shred the XML into one or more relational tables.
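The same batching idea can be applied before the data ever reaches the database. Below is a minimal TypeScript sketch, not a definitive implementation: it coalesces per-field edits on the client and hands them to a transport as one batch. The names ChangeBuffer, FieldChange and SendBatch are hypothetical, and the transport is a stand-in for whatever AJAX call posts to the application server.

```typescript
// Hypothetical sketch: coalesce per-field edits and flush them as one batch.
type FieldChange = { field: string; value: string };
type SendBatch = (changes: FieldChange[]) => void;

class ChangeBuffer {
  private pending = new Map<string, string>();
  private send: SendBatch;

  constructor(send: SendBatch) {
    this.send = send;
  }

  // Call from each field's blur handler; a later edit to the same field
  // overwrites the earlier one, so only the latest value is sent.
  record(field: string, value: string): void {
    this.pending.set(field, value);
  }

  // Call on a timer, on a step change, or before the page unloads.
  // Returns the number of field changes that were handed to the transport.
  flush(): number {
    if (this.pending.size === 0) return 0;
    const batch: FieldChange[] = [];
    this.pending.forEach((value, field) => batch.push({ field, value }));
    this.pending.clear();
    this.send(batch);
    return batch.length;
  }
}

// Usage with a stand-in transport that only collects what would be posted.
const sentBatches: FieldChange[][] = [];
const buffer = new ChangeBuffer((changes) => {
  sentBatches.push(changes);
});
buffer.record("firstName", "Ann");
buffer.record("lastName", "Lee");
buffer.record("firstName", "Anne"); // overwrites the earlier value
buffer.flush();
```

The trade-off is the one raised in the question: anything still sitting in the buffer is lost if the browser closes before a flush, so the flush interval should be short when retention matters more than efficiency.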
I've been building data-driven applications for about 18 years, and for the past two I've been successfully using Angular for my large forms/CRUD-based apps. You know, the classic SQL Server DB with hundreds of tables with millions of records. So far, so good.
Now I'm porting/re-engineering a desktop app with about 50 forms, all complex, all fully functional, "smart". My approach for the last couple years was to simply work tightly with the backend rest API to retrieve, insert or update data as needed and everything works fine.
Then I stumbled across ngrx and I understand exactly how it works, what it does and why it is good for a "reactive" app.
My problem is the following: in the usual lifecycle of the kind of systems I mentioned, you always have to deal with fresh data and always have to tell everything to the server. Almost no data in such apps can be safely "stored" locally, since transactional systems rely on centralized data interactions. There's no such thing as "hey, let's keep this employee's sales here for later use".
So why would it be so important to manage a local "store" when most of my data is volatile? I understand why it would be useful for global app data like the user profile or general UI-related state, but for the core data itself? I don't get it. You query for data, plug that data into the form, it gets processed by the user and sent back to the server. That data is no longer needed, and if you do need it, you ask for it again, as it could have changed its state since the last time you interacted with it.
I do not understand the great lengths I have to go to in order to maintain a local store, and all the boilerplate, if that state is so volatile.
They say change detection does not scale, but I've built some really large web apps with a simple "http service" pattern and it works just fine, because most of the component tree is destroyed anyway as you go somewhere else in the app, and any previous subscriptions become useless. Even with large, bulky, kinky forms, the inner workings of a form are never so big a problem as to require external "aid" from a store. The way I see it, the "state" of a form is a concern of that form in that moment alone. Is it to keep the component tree in sync? I never had problems with that before... even for complicated trees with lots of shared data, master-detail is kind of a flat pattern in the end if all the data is there.
For other components, such as grids, charts, reports, etc., the same thing applies. They get the data they need and then "poof", gone.
So now you see my mindset. I AM trying to change it to something better. What am I missing about the redux pattern?
I have a bit of experience here! It's all subjective, so what I've done may not suit you. My system is a complex system that sounds like it's on a similar scale as yours. I battled at first with the same issues of "why build complex logic on the front end and back end", and "why bother keeping stuff in state".
A redux/NGRX approach works for me because there are multiple ways data can be changed - perhaps it's a single user using the front end, perhaps it's another user making a change and I want to respond to that change straight away to avoid concurrency issues down the track. Perhaps there are multiple parts within my front end that can manipulate the same data.
At the back end, I use a CQRS pattern instead of a traditional REST API. Typically, one might suggest to re-implement the commands/queries to "reduce" changes to the state, however I opted for a different approach. I don't just want to send a big object graph back to the server and have it blindly insert, and I don't want to re-implement logic on the client and server.
My basic "use case" life cycle looks a bit like:
Load a list of data (limited size, not all attributes).
User selects item from list
Client requests "full" object/view/dto from server
Client stores response in object entity state
User starts modifying data
These changes are stored as "in progress" changes in a different part of state. The system is now responding to the data in the "in progress" part
If another change comes in from server, it doesn't overwrite the "in progress" data, but it does replace what is in the object entity state.
If required, UI shows that the underlying data has changed / is different to what user has entered / whatever.
User clicks on the "perform action" button, or otherwise triggers a command to be sent to server
Server performs the command. Any errors are returned, or success.
Server notifies the client that the change was successful; the client clears the "in progress" information.
Server notifies the client that Entity X has been updated; the client re-requests Entity X and puts it into the object entity state. This notification is sent to all connected clients, so they can all behave appropriately.
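The life cycle above can be sketched as a small reducer. This is a minimal TypeScript sketch written without the NGRX library itself. The Entity shape, the action names and the currentView selector are all hypothetical and only illustrate keeping the server copy and the "in progress" edits in separate parts of state.

```typescript
// Hypothetical state shape: the server copy and the user's unsaved edits are kept apart.
interface Entity {
  id: number;
  name: string;
  price: number;
}

interface EditState {
  entity: Entity | null;       // last version received from the server
  inProgress: Partial<Entity>; // fields the user has changed but not yet submitted
}

type EditAction =
  | { type: "entityLoaded"; entity: Entity } // initial load or a server notification
  | { type: "fieldEdited"; changes: Partial<Entity> }
  | { type: "commandSucceeded" };

const initialEditState: EditState = { entity: null, inProgress: {} };

function reduceEdit(state: EditState, action: EditAction): EditState {
  switch (action.type) {
    case "entityLoaded":
      // Replace the server copy but leave the user's in-progress edits untouched.
      return { ...state, entity: action.entity };
    case "fieldEdited":
      return { ...state, inProgress: { ...state.inProgress, ...action.changes } };
    case "commandSucceeded":
      // The server accepted the command, so the pending edits can be dropped.
      return { ...state, inProgress: {} };
    default:
      return state;
  }
}

// What the form displays: the server copy overlaid with the pending edits.
function currentView(state: EditState): Entity | null {
  return state.entity ? { ...state.entity, ...state.inProgress } : null;
}

// Usage: a server push arrives while the user is still editing the name.
let editState = initialEditState;
editState = reduceEdit(editState, { type: "entityLoaded", entity: { id: 1, name: "Widget", price: 10 } });
editState = reduceEdit(editState, { type: "fieldEdited", changes: { name: "Widget XL" } });
editState = reduceEdit(editState, { type: "entityLoaded", entity: { id: 1, name: "Widget", price: 12 } });
const shown = currentView(editState);
```

After the last action the form still shows the user's edited name while picking up the new price from the server copy. That is the behaviour described in the steps, and it is hard to get from a plain "fetch, bind, post" service.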
I have an ASP.NET page which, on first load, does the following:
1: Makes a DB call and gets data as an XML string (this chunk can go beyond 100kb). This DB call is a bit expensive and takes about 5-10 secs.
2: Loops through this XML and creates a custom collection with values from the XML, then binds it to a Repeater control.
The Repeater control has one text input per row. The user is free to enter values in one, several or all of the text boxes, or leave them all blank, and then hits the Save button.
On the Save postback, I have to loop through all rows in the Repeater, collect all the rows that have some input in them, generate an XML document using the initial values and the new input values, and save it to the DB.
Problem:
So I will need a reference to all the initial XML values. I can think of these options and am looking for input on selecting the best one.
1: ViewState: Store my collection or XML string in ViewState - I'm sure it will be too huge
2: Session: Use Session to store the collection or XML string - Again
3: DB Call: Make a DB call to get the data again - as I said, it is an expensive call and my DBA is asking me to avoid this
4: HiddenField: Store the essential data from the XML in HiddenFields and use that for the Save manipulation, i.e. in each Repeater item find all the hidden fields
Which one is best in terms of better request response and less resource utilization on server?
Or is there a better way I am missing?
PS: Have to use ASP.NET 2.0 WebForms only.
Update1:
I tried the following with ViewState:
1: Store entire xml string: ViewState length = 97484 and FireBug shows pagesize - 162Kb
2: Store a stripped-down version of the collection with only the required data: ViewState length = 27372 and FireBug shows pagesize - 94Kb, and with gzip compression it reduces to 13kb.
With the existing website FireBug shows Size 236Kb.
So option 2 is definitely better, and my new version is better than the current website.
So any inputs?
A quick question - who is your target audience for this page? If it's an internal website for a company then just storing the data in viewstate might be acceptable. If it's for external people, e.g. potential customers, then speed and performance probably matter to you more.
Viewstate - have you tried adding your XML to viewstate? How much did it increase the page size by? If you're gzipping all of your pages rather than sending them over the wire uncompressed then you could see about a 70% reduction in size - 30kb isn't that much these days...
Session - it is worth remembering that the server can and will drop data from sessions if it runs out of space. They can also expire. Do you trust your users not to log in again in a new tab and then submit the page that they've had open for the last 10 hours? While using session results in less data on the wire you might need to re-pull the data from the db if the value does end up being dropped for whatever reason. Also, if you're in a web farm environment etc there are complications involving synchronizing sessions across servers.
DB Call - can the query be optimised in any way? Are there indices on all the fields that need them? Maybe you and your DBA can make it less painful to pull. But then again, if the data can change between you pulling it the first time and the user submitting their changes then you wouldn't want to re-pull it, I suspect.
Hidden Fields - With these you'd be saving less data than if you put the whole string in Viewstate. The page wouldn't be depending on the state of the webserver like with session and nor would you be racing against other users changing the state of the db.
On the whole, I think 4 is probably the best bet if you can afford to slow your pages down a little. Use Firebug/YSlow and compare how much data is transmitted before and after implementing 4.
One final thought - how are things like this persisted between postbacks in the rest of your webapp? Assuming that you haven't written the whole thing on your own/only just started it you might be able to find some clues as to how other developers in a similar situation solved the problem.
Edit:
there is a load-balancer, not sure how it will play with Session
If you have a load balancer then you need to make sure that session state is stored in a state server or similar and not in the process ("inproc"). If the session is stored on the webserver then option 2 will play very badly with the load balancer.
While I'm not a huge fan of overusing session, this will probably be your best bet as it will be your fastest option from the user's standpoint.
Since session state does have its own inherent issues, you could load the data you need into session, and if your session drops for whatever reason, just do another database hit and reload it.
I would really stay away from options 1 and 4 just because of the amount of unnecessary data you will be sending to the client, and potentially slowing down their experience.
Option 3 will also slow down the user experience, so I would stay away from that if at all possible unless you can speed up your query time.
I'm looking to build an ajax page; it's a reporting page. By default, load today's report. On the page there's a calendar control and when the user clicks on a date, reload the gridview with the corresponding data. Is it considered good practice to do the following:
1) on the first page load, query the data for the page
2) put the query result in the session object and display it in a gridview
3) if the user requests new data, get new data from the query with different parameters
4) put the result of the second query in the session object and display it
5) if the user then requests the data from the first query, get it from the session object
6) do the sorting and paging with the data held in the session.
Note: the data of each query will contain about 300-500 rows and about 15 columns. I'd like to do all this with ajax calls. What are some suggestions, and what pitfalls should I avoid?
Thanks.
I would use Backbone.js:
Server produces report in JSON format.
Client has a Backbone.js Model for this report, which binds to the JSON endpoint.
Client renders the Report Model as a Backbone view.
Client reloads the report from server only when appropriate.
Reports from previously viewed days will still be around in the client as Backbone Model instances, so you don't need to reload from server unless the user forces. I believe this is your main concern?
You're probably still in the realm of can-do-without-a-client-side-framework, but if you plan on doing more of these pages or getting any more complex, you can go to spaghetti pretty quickly without something like Backbone.js.
PS. I just noticed this is .NET related. I know nothing about .NET so maybe there's a built-in client-side framework that can do something similar.
EDIT (updated after reading comment):
For server-side caching, I think either a denormalized report table in the DB or a separate dedicated cache store (e.g. memcache) is a better practice than the session object.
It depends though. If there was say, 1 possible report per-user per-day, and you didn't have memcache set up, and you don't want to use the DB for whatever reason, then it could make sense to store it in their session object. However, if each day's report is the same for all users, you're now caching it N times instead of 1. It could also be hard to invalidate from an external hook and the user loses their cache when they logout.
So I would probably just have a typical get-or-set pattern to try and load report from cache first, and fallback to DB. Then invalidate/update the cached report only when the user forces, or if data used to create the report has changed. AJAX call requests the report by date or however a report is identified.
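A minimal TypeScript sketch of that get-or-set pattern follows. The in-memory Map stands in for memcache or a report table, and ReportCache, Report and the loader callback are hypothetical names used only for illustration.

```typescript
// Hypothetical report type; the Map stands in for memcache or a denormalized report table.
type Report = { date: string; rows: number[] };

class ReportCache {
  private store = new Map<string, Report>();
  private loadFromDb: (date: string) => Report;

  constructor(loadFromDb: (date: string) => Report) {
    this.loadFromDb = loadFromDb;
  }

  // Get-or-set: return the cached report, or load it once and remember it.
  get(date: string): Report {
    const cached = this.store.get(date);
    if (cached !== undefined) return cached;
    const fresh = this.loadFromDb(date);
    this.store.set(date, fresh);
    return fresh;
  }

  // Call when the underlying data changes or the user forces a refresh.
  invalidate(date: string): void {
    this.store.delete(date);
  }
}

// Usage with a stand-in loader that counts how often the "database" is hit.
let dbHits = 0;
const reportCache = new ReportCache((date) => {
  dbHits += 1;
  return { date, rows: [1, 2, 3] };
});
reportCache.get("2024-01-01"); // loads from the stand-in DB
reportCache.get("2024-01-01"); // served from the cache
reportCache.invalidate("2024-01-01");
reportCache.get("2024-01-01"); // loads again after invalidation
```

Because the cache is keyed by report date rather than by user, a report that is the same for all users is stored once instead of N times.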
Since you are hoping to use the data in JavaScript Ajax scenarios, it would make the most sense to create an HTTP Handler to query and return the needed result sets on demand.
Using the session object is not a solution because it cannot be accessed asynchronously. As a result, your page would not be able to query this data to feed back to your Javascript objects (unless you created an HTTP Handler to send it back, but that would be pointless when you could just query the data in the HTTP Handler directly).
You are forgetting about windows. A client isn't a window; a client is a browser, and it can contain many windows/tabs. You need to make sure you are rendering/feeding the correct window. I usually handle this by submitting hidden values.
The problem is distinguishing between resuming a session and starting a new window.
I wouldn't bother holding more than one copy of the query in the Session. The primary reason you'd want to hold it in Session is to improve the sorting/paging speed, presumably. Users expect those to be relatively fast, but choosing new dates can be slower. Plus, what's the likelihood that they'll really reload the first query?
The other answers raise good pitfalls with storing in Session in general.
I have a web app, that consumes a web service. The main page runs a search - by passing parameters to a particular web service method, and I bind the results to a gridview.
I have implemented sorting and paging on the grid by putting the DataTable that the grid is bound to in the viewstate, then reading / sorting / filtering it as necessary and rebinding to the grid.
As the amount of data coming back from the web service has increased dramatically, when I try to page/sort etc I receive the following errors.
The connection was reset
The connection to the server was reset while the page was loading.
I have searched around a bit, and it seems that a very large viewstate is to blame for this.
But surely the only other options are to
Limit the results
Stick the datatable in the session rather than the viewstate
Something else I am unaware of
Previously I did have the DataTable in the session, as some of this data needed to persist from page to page (it was not being posted, however, so viewstate was not an option). As the amount of data rose and the necessity to persist it was removed, I used the viewstate instead, thinking this was a better option than the session because of the amount of data the session would have to hold and the number of users using the app.
It appears maybe not.
I thought that when the viewstate got very big, that .net split it over more than one hidden viewstate field, but it seems all I'm getting is one mammoth viewstate that I have trouble viewing in the source.
Can anyone enlighten me as to how to avoid the error I'm getting? If it is indeed to do with the amount of data in the viewstate?
It sounds like you're caching the whole dataset for all pages even though you are only presenting one page of that data. I would change your pagination to only require the data for the current page the user is on.
If the query is heavy and you don't want to be calling it over and over because there is a lot of paging back and forth (you should test a typical usage pattern), then I would implement some type of caching on the web service end to cache page by page (per user, if the data is specific to a user) and have it expire rather quickly (e.g. after a few minutes).
I think you need to limit the total amount of data you're dealing with. Changing your code so it does not pass back extra data that might never be needed is a good place to start.
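As a rough illustration of "only the data for the current page", here is a minimal TypeScript sketch of a paging helper. The getPage function and Page type are hypothetical names. It assumes the full result set is available wherever the helper runs (for example on the web service end), so that only one slice travels to the page.

```typescript
// Hypothetical paging helper: return only the slice needed for the requested page.
interface Page<T> {
  items: T[];
  page: number; // 1-based page number actually returned (clamped to the valid range)
  totalPages: number;
}

function getPage<T>(all: readonly T[], page: number, pageSize: number): Page<T> {
  if (pageSize <= 0) throw new RangeError("pageSize must be positive");
  const totalPages = Math.max(1, Math.ceil(all.length / pageSize));
  // Clamp so that out-of-range requests return the nearest valid page.
  const clamped = Math.min(Math.max(1, page), totalPages);
  const start = (clamped - 1) * pageSize;
  return { items: all.slice(start, start + pageSize), page: clamped, totalPages };
}

// Usage: 95 rows at 10 per page gives 10 pages, and the last page holds 5 rows.
const allRows = Array.from({ length: 95 }, (_, i) => `row ${i + 1}`);
const lastPage = getPage(allRows, 10, 10);
```

Sorting and filtering would be applied to the full set before slicing, so each postback still only carries one page of rows.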
EDIT: Based on your comments:
You can't change the web service
The user can manipulate the query by filtering or sorting
There is a large amount of data returned by the web service
The data is user specific
Well, I think you have a perfect case for using the Session then. This can be taxing on the server with large numbers of users and large amounts of data, so you might want to implement some logic to clear the data from the Session rather than wait for it to expire (for example, clear the session data on certain landing pages you know the user will go to when they are done).
You really want to get it out of the ViewState because it is a huge bandwidth hog. Just look at your physical page size; that data is being passed back and forth with every action. Moving it to the Session would eliminate that bandwidth usage and allow you to do everything you need.
You could also look at the data the web service is bringing back and store it in a custom object that you make as "thin" as possible. If you're storing a DataSet or a DataTable in your Session, those objects have some extra overhead you probably don't need, so store the data as an array of some custom thin object and just bind to that. You would need to map the result from the WS to your custom object, but this is a good option if you want to cut down on memory usage.
Let me know if there is something else I am missing.
I wouldn't put the data in either the view state or the session. Instead, store the bare minimum information needed to re-request the dataset from the web service (in either view state or session, or even on the URL). Then call the web service using that information and re-fetch the data on each request. If necessary, look to use some form of caching (memCache) to improve performance.
For the sake of argument assume that I have a webform that allows a user to edit order details. User can perform the following functions:
Change shipping/payment details (all simple text/dropdowns)
Add/Remove/Edit products in the order - this is done with a grid
Add/Remove attachments
Products and attachments are stored in separate DB tables with foreign key to the order.
Entity Framework (4.0) is used as ORM.
I want to allow the users to make whatever changes they want to the order and only when they hit 'Save' do I want to commit the changes to the database. This is not a problem with textboxes/checkboxes etc. as I can just rely on ViewState to get the required information. However the grid is presenting a much larger problem for me as I can't figure out a nice and easy way to persist the changes the user made without committing the changes to the database. Storing the Order object tree in Session/ViewState is not really an option I'd like to go with as the objects could get very large.
So the question is - how can I go about preserving the changes the user made until ready to 'Save'.
Quick note - I have searched SO to try to find a solution, however all I found were suggestions to use Session and/or ViewState - both of which I would rather not use due to potential size of my object trees
If you have control over the schema of the database and the other applications that utilize order data, you could add a flag or status column to the orders table that differentiates between temporary and finalized orders. Then, you can simply store your intermediate changes to the database. There are other benefits as well; for example, a user that had a browser crash could return to the application and be able to resume the order process.
I think sticking to the database for storing data is the only reliable way to persist data, even temporary data. Using session state, control state, cookies, temporary files, etc., can introduce a lot of things that can go wrong, especially if your application resides in a web farm.
If using the Session is not your preferred solution, which is probably wise, the best possible solution would be to create your own temporary database tables (or as others have mentioned, add a temporary flag to your existing database tables) and persist the data there, storing a single identifier in the Session (or in a cookie) for later retrieval.
First, you may want to segregate your specific state management implementation into its own class so that you don't have to replicate it throughout your systems.
Second, you may want to consider a hybrid approach: use session state (or cache) for a short time to avoid unnecessary trips to a DB or other external store. After some amount of inactivity, write the cached state out to disk or DB. The simplest way to do this is to serialize your objects to text (using either serialization or a library like Protocol Buffers). This lets you avoid creating redundant or duplicate data structures to capture the in-progress data relationally. If you don't need to query the content of this data, it's a reasonable approach.
As an aside, in the database world, the problem you describe is called a long running transaction. You essentially want to avoid making changes to the data until you reach a user-defined commit point. There are techniques you can use in the database layer, like hypothetical views and instead-of triggers to encapsulate the behavior that you aren't actually committing the change. The data is in the DB (in the real tables), but is only visible to the user operating on it. This is probably a more complicated implementation than you may be willing to undertake, and requires intrusive changes to your persistence layer and data model - but allows the application to be ignorant of the issue.
Have you considered storing the information in a JavaScript object and then sending that information to your server once the user hits save?
Use domain events to capture the users actions and then replay those actions over the snapshot of the order model ( effectively the current state of the order before the user started changing it).
Store each change as a series of events e.g. UserChangedShippingAddress, UserAlteredLineItem, UserDeletedLineItem, UserAddedLineItem.
These events can be saved after each postback and only need a link to the related order. Rebuilding the current state of the order is then as simple as replaying the events over the currently stored order objects.
When the user clicks save, you can replay the events and persist the updated order model to the database.
You are using the database - no session or viewstate is required therefore you can significantly reduce page-weight and server memory load at the expense of some page performance ( if you choose to rebuild the model on each postback ).
Maintenance is incredibly simple: because domain objects are so easy to implement, automated testing is easily used to ensure the system behaves as you expect it to (while also documenting your intentions for other developers).
Because you are leveraging the database, the solution scales well across multiple web servers.
Using this approach does not require any alterations to your existing domain model, therefore the impact on existing code is minimal. Biggest downside is getting your head around the concept of domain events and how they are used and abused =)
This is effectively the same approach as described by Freddy Rios, with a little more detail about how, and some nice keywords for you to search with =)
http://jasondentler.com/blog/2009/11/simple-domain-events/ and http://www.udidahan.com/2009/06/14/domain-events-salvation/ are some good background reading about domain events. You may also want to read up on event sourcing as this is essentially what you would be doing ( snapshot object, record events, replay events, snapshot object again).
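To make the replay idea concrete, here is a minimal TypeScript sketch. The event names are the ones listed above. The Order and LineItem shapes and the applyEvent and replay functions are hypothetical and only illustrate the technique; they are not the API of any particular library.

```typescript
// Hypothetical order model; only the fields needed for the illustration.
interface LineItem {
  productId: number;
  quantity: number;
}

interface Order {
  shippingAddress: string;
  items: LineItem[];
}

// One event per user action, as stored after each postback.
type OrderEvent =
  | { type: "UserChangedShippingAddress"; address: string }
  | { type: "UserAddedLineItem"; item: LineItem }
  | { type: "UserAlteredLineItem"; productId: number; quantity: number }
  | { type: "UserDeletedLineItem"; productId: number };

// Apply a single event to an order without mutating the input.
function applyEvent(order: Order, event: OrderEvent): Order {
  switch (event.type) {
    case "UserChangedShippingAddress":
      return { ...order, shippingAddress: event.address };
    case "UserAddedLineItem":
      return { ...order, items: [...order.items, event.item] };
    case "UserAlteredLineItem": {
      const { productId, quantity } = event;
      return {
        ...order,
        items: order.items.map((i) => (i.productId === productId ? { ...i, quantity } : i)),
      };
    }
    case "UserDeletedLineItem": {
      const { productId } = event;
      return { ...order, items: order.items.filter((i) => i.productId !== productId) };
    }
    default:
      return order;
  }
}

// Rebuild the current state by replaying the stored events over the snapshot.
function replay(snapshot: Order, events: OrderEvent[]): Order {
  return events.reduce(applyEvent, snapshot);
}

// Usage: the snapshot is the order as stored before the user started editing.
const snapshot: Order = { shippingAddress: "1 Old Road", items: [{ productId: 1, quantity: 2 }] };
const storedEvents: OrderEvent[] = [
  { type: "UserChangedShippingAddress", address: "2 New Street" },
  { type: "UserAddedLineItem", item: { productId: 7, quantity: 1 } },
  { type: "UserAlteredLineItem", productId: 1, quantity: 5 },
  { type: "UserDeletedLineItem", productId: 7 },
];
const currentOrder = replay(snapshot, storedEvents);
```

Because applyEvent never mutates its input, the snapshot stays intact, and discarding the stored events is all it takes to abandon the user's changes.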
How about serializing your domain object (the contents of your grid/shopping cart) to JSON and storing it in a hidden field? ScottGu has a nice article on how to serialize objects to JSON. It is scalable across a server farm, and I guess it would not add much payload to your page. Maybe you can write your own JSON serializer to do a "compact serialization" (you would not need product name, SKU id, etc.; maybe you can just serialize productID and quantity).
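A minimal sketch of such a compact serialization, in TypeScript for the client side. The CartLine shape and the function names are hypothetical. The idea is only that the hidden field carries [productId, quantity] pairs and everything else is looked up again on the server.

```typescript
// Hypothetical full cart line as displayed in the grid.
interface CartLine {
  productId: number;
  name: string;
  sku: string;
  quantity: number;
}

// Compact wire form: [productId, quantity]. Names and SKUs can be looked up again server-side.
type CompactLine = [number, number];

function toCompactJson(lines: CartLine[]): string {
  return JSON.stringify(lines.map((l): CompactLine => [l.productId, l.quantity]));
}

function fromCompactJson(json: string): { productId: number; quantity: number }[] {
  const pairs = JSON.parse(json) as CompactLine[];
  return pairs.map(([productId, quantity]) => ({ productId, quantity }));
}

// Usage: the string below is what would be written into the hidden field.
const hiddenFieldValue = toCompactJson([
  { productId: 7, name: "Widget", sku: "W-7", quantity: 3 },
  { productId: 9, name: "Gadget", sku: "G-9", quantity: 1 },
]);
// hiddenFieldValue is "[[7,3],[9,1]]"
const restored = fromCompactJson(hiddenFieldValue);
```

Since the hidden field is client-controlled, the server would still need to validate the ids and quantities it reads back.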
Have you considered using a User Profile? .Net comes with SqlProfileProvider right out of the box. This would allow you to, for each user, grab their profile and save the temporary data as a variable off in the profile. Unfortunately, I think this does require your "Order" to be serializable, but I believe all of the options except Session thus far would require the same.
The advantage of this is it would persist through crashes, sessions, server down time, etc and it's fairly easy to set up. Here's a site that runs through an example. Once you set it up, you may also find it useful for storing other user information such as preferences, favorites, watched items, etc.
You should be able to create a temp file and serialize the object to that, then save only the temp file name to the viewstate. Once they successfully save the record back to the database then you could remove the temp file.
Single server: serialize to the filesystem. This also allows you to let the user resume later.
Multiple server: serialize it but store the serialized value in the db.
This is something that's for that specific user, so when you persist it to the db you don't really need all the relational stuff for it.
Alternatively, if the set of data is v. large and the amount of changes is usually small, you can store the history of changes done by the user instead. With this you can also show the change history + support undo.
Two approaches. The first is to create a complex AJAX application that stores everything on the client and only submits the entire package of changes to the server. I did this once a few years ago with moderate success. The application is not something I would want to maintain, though. You have a hard time syncing your client code with your server code, and passing fields that are added/deleted/changed is nightmarish.
The second approach is to store changes in the database in a temp table or "pending" mode. The advantage is that your code is more maintainable. The disadvantage is that you need a way to clean up changes abandoned due to session timeouts, power failures and other crashes. I would take this approach for any new development. You can have separate tables for "pending" and "committed" changes, which opens up a whole new level of features you can add. What if? What changed? etc.
I would go for viewstate, regardless of what you've said before. If you only store the stuff you need, like { id: XX, numberOfProducts: 3 }, and ditch every item that is not selected by the user at this point; the viewstate size will hardly be an issue as long as you aren't storing the whole object tree.
When storing attachments, put them in a temporary storing location, and reference the filename in your viewstate. You can have a scheduled task that cleans the temp folder for every file that was last saved over 1 day ago or something.
This is basically the approach we use for storing information when users are adding floorplan information and attachments in our backend.
Are the end-users internal or external clients? If your clients are internal users, it may be worthwhile to look at an alternate set of technologies. Instead of webforms, consider using a platform like Silverlight and implementing a rich GUI there.
You could then store complex business objects within the applet, provide persistent "in progress" edit tracking across multiple sessions via offline storage, and easily integrate with back-end services that provide saving / processing of the finalised order. All whilst maintaining access via the web (albeit closing out most *nix clients).
Alternatives include Adobe Flex or AJAX, depending on resources and needs.
How large do you consider large? If you are talking about session state (so it doesn't go back and forth to the actual user, as view state does), then session state is often a pretty good option. Everything except the in-process state provider uses serialization, but you can influence how it is serialized. For example, I would tend to create a local model that represents just the state I care about (plus any id/rowversion information) for that operation (rather than the full domain entities, which may have extra overhead).
To reduce the serialization overhead further, I would consider using something like protobuf-net; this can be used as the implementation for ISerializable, allowing very light-weight serialized objects (generally much smaller than BinaryFormatter, XmlSerializer, etc), that are cheap to reconstruct at page requests.
When the page is finally saved, I would update my domain entities from the local model and submit the changes.
For info, to use a protobuf-net attributed object with the state serializers (typically BinaryFormatter), you can use:
using System;
using System.Runtime.Serialization;
using ProtoBuf;

// a simple, session-state-friendly, light-weight UI model object
[ProtoContract, Serializable]
public class MyType : ISerializable {
    [ProtoMember(1)]
    public int Id {get;set;}
    [ProtoMember(2)]
    public string Name {get;set;}
    [ProtoMember(3)]
    public double Value {get;set;}
    // etc

    // called by the state serializer: write this object out as protobuf data
    void ISerializable.GetObjectData(
        SerializationInfo info, StreamingContext context)
    {
        Serializer.Serialize(info, this);
    }
    public MyType() {} // default constructor

    // deserialization constructor: read the protobuf data back into this instance
    protected MyType(SerializationInfo info, StreamingContext context)
    {
        Serializer.Merge(info, this);
    }
}