I was just learning how to use the LINQ/EDM combination to retrieve and update data in a simple user-thread-and-comment webapp, as part of evaluating it.
When I turn on SQL Profiler, I rarely see a SQL statement executed by my app.
I'm starting to really like how well it keeps things cached, because as soon as I add new data, it magically updates itself in the blink of an eye.
But is that something I should be scared of?
My concern is what happens when I use this to build a webapp with real traffic (whatever hit count pushes this approach to its limits).
Should I keep a single context object at app-level, so that different sessions can benefit from each other's cache entries?
Or should I do the create-and-release on each page submission?
I know this sounds like an open-ended question, but I really do have that as a question: how does the Entity Framework cache its data when queried through LINQ?
On the ObjectContext question: you should use a lifetime of one per page-request cycle or smaller. It's designed for a unit of work, not for the application lifetime. Search SO for "ObjectContext lifetime" or "DataContext lifetime" and you'll see this is a common question.
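To make that concrete, here is a minimal sketch of the per-request pattern. "MyEntities" stands in for whatever ObjectContext subclass the EDM designer generates, and the entity set and repeater control are hypothetical too:

    using System;
    using System.Linq;

    public partial class ThreadList : System.Web.UI.Page
    {
        // "MyEntities", "Threads" and "ThreadRepeater" are hypothetical stand-ins.
        protected void Page_Load(object sender, EventArgs e)
        {
            using (var context = new MyEntities())
            {
                ThreadRepeater.DataSource = context.Threads
                    .OrderByDescending(t => t.CreatedOn)
                    .Take(20)
                    .ToList();
                ThreadRepeater.DataBind();
            } // the context and its cached entities die with the request
        }
    }

Each request gets a fresh context, so you never serve another session's stale identity-map entries, and the "magic" caching you observed stays scoped to a single unit of work.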
Search is the most used feature on our website, and the search query is the most CPU-intensive, complex and frequent query that executes on our db, causing heavy CPU usage on the db server. To reduce the load on the db we have been looking at various caching strategies. For now, we intend to use the ASP.NET Cache.
The idea is to have an in-memory db of the most frequently/recently created/accessed objects in the cache, and then query the in-memory db using LINQ to come up with search results. My initial thought was to cache a List of the Users and then query or modify this List using LINQ. But given the complexities of multiple threads accessing or trying to modify a List, I was looking at other options.
Which is when I thought that instead of caching a List, I could cache the individual User objects with their Ids as keys and try to query the Cache. At http://msdn.microsoft.com/en-us/library/system.web.caching.cache.aspx I see that the Cache has an extension method AsQueryable, but I am not sure what that means. The Cache is a key-value store, so with AsQueryable will I be able to query the keys and get a set of User objects, or will I be able to query the User objects and get my desired result?
Before you start this you really need to have some measurability in place around it -- there is no way to figure out if your changes help or hurt without having some good, solid data to make that judgement on. Performance, especially performance at scale, isn't something you can think or guess your way through. You have to measure your way through it.
As for your solution, I think you might well make the problem worse, or at least create another problem here. Your database server is theoretically designed to handle arbitrary user queries across vast information sets efficiently. LINQ is awesome, but it is not really meant to be an ad-hoc search engine -- it doesn't have the sorts of indexing capabilities one really expects from a search engine. Just because it can expose things as an IQueryable doesn't mean you should treat it that way. And even if you've got a way to efficiently search the cache, you've got another problem to get past -- how do you identify what is most frequently used? And how do you manage the ASP.NET cache so it doesn't start ejecting things when it gets low on memory?
You would probably be better served here by:
Starting with some good old fashioned database tuning -- why are your queries so slow and expensive? Are you missing an index somewhere?
Looking at caching the results page output, especially if your search URLs are GET-able as that is pretty easy to manage. This is a great short term solution if the site is melting.
Look at building the search bits properly. Using LIKE '%whatever%' is not a proper search. Full-text indexes in your database are a good start; something like Lucene.NET is probably better.
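For a sense of the difference, here is a hedged sketch of swapping a LIKE '%term%' scan for a SQL Server full-text CONTAINS query. It assumes a full-text index already exists on a hypothetical dbo.Users(Name) column:

    using System.Collections.Generic;
    using System.Data.SqlClient;

    static List<string> SearchUserNames(string connectionString, string term)
    {
        var names = new List<string>();
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(
            "SELECT Name FROM dbo.Users WHERE CONTAINS(Name, @term)", conn))
        {
            // @term takes a word or a quoted phrase; unlike LIKE '%x%',
            // CONTAINS is answered from the full-text index, not a table scan.
            cmd.Parameters.AddWithValue("@term", term);
            conn.Open();
            using (var reader = cmd.ExecuteReader())
                while (reader.Read())
                    names.Add(reader.GetString(0));
        }
        return names;
    }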
No, I could not use AsQueryable to query User objects and get the desired result I was looking for. So for now I will be using a static List, though I know I will have to change it sooner rather than later.
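For anyone landing here later, a minimal sketch of what that static-list approach might look like, with the multi-threading concern handled by building a new list on each refresh and swapping the reference (an atomic operation), so readers never see a half-updated list. The User shape is a hypothetical stand-in:

    using System.Collections.Generic;
    using System.Linq;

    public class User { public int Id; public string Name; } // hypothetical shape

    public static class UserCache
    {
        // Readers and the refresher never share a mutating list: a refresh
        // builds a new list and swaps the reference, which is atomic.
        private static volatile List<User> _users = new List<User>();

        public static void Refresh(IEnumerable<User> freshUsers)
        {
            _users = freshUsers.ToList();
        }

        public static List<User> SearchByName(string name)
        {
            var snapshot = _users; // capture once; safe during a concurrent Refresh
            return snapshot.Where(u => u.Name.Contains(name)).ToList();
        }
    }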
I guess I need a really good explanation of some Model-related concepts.
In general, does the model, as described by frameworks like Robotlegs, play the role of an application-state holder or a domain-state holder? I originally thought that models are entirely domain-based, e.g. UserModel, LocationModel, playing the same role that DAO classes play on the server. The more source code I look at, though, the more I see things like UserAccountModel, ShoppingCartModel, etc., full of properties and methods related to the state of the client application, not the domain state.
I see that people do not bother to add complex relationships to the VO classes; e.g. if a User has a lot of photos, the photos collection is obviously omitted from the UserVO class. Instead, a bunch of PhotoVO objects are loaded from the server whenever necessary, based on a service call with the user ID. Is that some sort of rule of thumb: in general, keep VOs as "bare" as possible? Doesn't that increase the number of calls that must be made to the server to fetch all the data? Moreover, doesn't that fragment the domain model in general? (An entity User class on the server will always have a photos property.)
With so many calls to the server, it is normal to fetch objects that might already be in client storage. Does it make sense to build a client-side cache and check whether the object about to be fetched is already there, or is the overhead of fetching it again generally paid off by the benefit of getting a fully synced object from the server? Otherwise, every object stored in the client-side cache must be cared for when a change occurs. I personally think the overhead of re-fetching an object that might already have been retrieved is not that big. Better to have fresh and synced data, I'd say.
I do not believe your question is answerable, because so many of the answers are "it depends." It depends on the application you're building and the needs of the UI.
I don't really understand your distinction between "Domain State" and "Application State." However, I believe that any "Value Object" style classes implemented in the UI should focus on holding the state of specific views. It is extremely rare that a single view has a one-to-one relationship to database tables. As such, my UI data objects may not be identical to server-side data objects. It is very common, though, that I will map UI objects to server-side objects using AMF. But it doesn't mean that every object in the UI is implemented server-side, or that every server object is implemented in the UI.
I see that people do not bother to add complex relationships to the VO classes
I'm not sure where you see that; I will often do exactly this. However, it depends on what the view is supposed to display. If the view is not displaying a lot of photos related to the user, then I won't make a remote call to retrieve the user information with all their photos.
With so many calls to the server, it is normal to fetch objects that might already be in client storage
It depends. I would say that in the apps I write, calls to the server are made as needed, and attempts are made to limit them as appropriate. If I already fetched data and have it cached on the client, then I am going to try to use that cache instead of retrieving the data again.
I'll restate my original assessment: I think the answers to most of your questions depend on the situation and on the app. You seem to start with overly broad generalizations about how things are done. However, I do not believe they are universal truths. Developers fight about application architecture issues all the time.
I've been using this site for quite a while, usually being able to sort out my questions by browsing through the questions and following tags. However, I've recently come across a question that is rather hard to look up amongst the great number of questions asked - a question I hope some of you might be able to share your opinion on.
As my problem is a bit hard to fit into the single line that goes in the title, I'll try to give a bit more detail on the problem I've encountered. So, as the title says, I need to filter, or limit, some of the response data my standard ASP.NET SOAP-based web service returns when invoking various web methods. The web service returns data used by other systems (a data repository, more or less), where the client today is able to specify a few parameters on how the data should be filtered, and in return gets a full set of data back.
Well, easy enough, I thought: just put additional filtering options on the existing web methods that need more filtering applied, make adjustments on the server side, and we are all set to go. Well, unfortunately it turned out to be a bit more tricky than this.
The problem I am facing is that I'm working on a web service running in a production environment, which needs to be extended so that additional filters can be applied to existing web methods without affecting the calls already being made by other systems the customer uses through their client stubs. This is where I am a bit troubled, since I can't seem to find the "right solution" for extending the current web service.
Today, the filter is sent as a custom data structure which holds information on how data should be filtered, but I am not sure whether I can simply add more information to this data structure without breaking code at the clients. One of my co-workers suggested that I could extend the web.config on the server side with a section detailing which data should be excluded (filtered out), but I don't find this to be a viable long-term solution, and I don't trust customers with such an option since it is likely to go wrong at some point. So I am looking for a way to apply a "second filter" to the data the client requests, so that instead of a full set of data it gets back only a fraction, implemented such that the filter can be easily modified and does not affect the current client calls.
Any suggestions on how I should approach this problem?
Thanks!
Kind regards,
E.
A pretty common practice is to create another instance of the application, or use part of the URL to signify the version of the endpoint clients are connecting to; perhaps the virtual directory is the date. That way old calls will go to the old API and new calls will come in on the new API.
http://api.example.com/dostuff
vs
http://api.example.com/6-7-2011/dostuff
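As a hedged sketch of the companion server-side move (not necessarily what the answerer had in mind): keep the existing web method's signature frozen and add a new method carrying the extended filter, so generated client stubs keep working. All type names here are hypothetical:

    using System.Web.Services;

    public class Filter { public string Region; }            // existing contract (hypothetical)
    public class ExtendedFilter { public string Category; }  // new, additive contract
    public class Record { public int Id; }

    [WebService(Namespace = "http://example.com/repository")]
    public class DataService : WebService
    {
        // Existing method keeps its exact signature so old client stubs keep working.
        [WebMethod]
        public Record[] GetRecords(Filter filter)
        {
            return Search(filter, null);
        }

        // New clients opt in to the extra filtering through a new method.
        [WebMethod]
        public Record[] GetRecordsEx(Filter filter, ExtendedFilter extra)
        {
            return Search(filter, extra);
        }

        private Record[] Search(Filter filter, ExtendedFilter extra)
        {
            // ... existing query logic, applying "extra" only when it is non-null ...
            return new Record[0];
        }
    }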
This is quite a lengthy post, so bear with me. I'm not sure whether it is primarily about ASP.NET Session State behaviour, NInject, application design, or refactoring. Read on and then you can decide... :-)
Background
First, a bit of background. We are working on trying to refactor a large webshop into a more maintainable, structured design. The webshop is currently running on .NET 3.5, but the design is more of a hangover from the classic ASP days. Obviously we cannot tackle everything in one go, so many of the features / technologies / approaches have to be taken as a given. With that in mind...
The app maintains everything to do with the current session (user profile, cart, session choices, etc.) in a context object which is simply a large XML document that gets serialized to and deserialized from the Session as a string. The XML format is also important because the rendering is done via XSLT.
This has led to a number of problems:

- It's a kind of God object with far too many concerns.
- It's loosely typed and relies too much on XML manipulation / XPath.
- There is no standard way / pattern for retrieving the session XML document or for writing it back. We have a horrible mixture of methods that take the document in as a parameter, modify it and return it; methods that retrieve it themselves, modify it and save it back to session; etc. This has led to a lot of hard-to-trace bugs and over-use of serializing/deserializing from the Session.
Our Solution
What we have done is try to introduce a strongly-typed wrapper around the XML document, which breaks it up into different concerns, and to manage its lifecycle transparently to the rest of the app.
What we are aiming for is the following workflow:
1. Beginning of the request, we populate the session document from the xml string stored in the session.
2. The rest of the app interacts with it only through the strongly typed wrapper. The whole app uses the same instance and does not have to worry about when to retrieve or save the state back to session.
3. At the end of the request, the underlying xml document is serialized back to the Session.
Since we are using NInject(v1) as the IOC of choice, we decided to use this to manage the lifecycle of our context object. The context object was wrapped with the OnePerRequest attribute and the dispose method was hooked up to a method that would save the xml document back to Session as a string.
It doesn't work...
We soon encountered a problem: the NInject OnePerRequest module didn't appear to have access to session state. The first thing we tried was a hack where we kept the Session object in a variable to make sure we could still write to it. This appeared to work on a development machine, but it became obvious it didn't once we moved to out-of-process session state.
It still doesn't work...
We tried inheriting from the OnePerRequest behaviour / module and adding the IRequiresSessionState marker interface (OnePerRequestRequiresSessionState). However, this was not enough, as the method NInject uses to release references and clean up is hooked up to the EndRequest event. Session is available in EndRequest, but it has already been serialized to the out-of-process state server, so changing something now is not reflected when the session string is retrieved at the beginning of the next request.
We then decided to change the event to hook up to. We ditched EndRequest and hooked our OnePerRequestRequiresSessionState "release all" method up to the PostRequestHandlerExecute event, which fires BEFORE the session data gets serialized out of process.
It works... then it doesn't...
This seemed to work, on a single server and on a web farm. Then we noticed weird behaviour. There seemed to be two different versions of the context, and you would randomly switch between them. Add something to the cart: it's not there. Browse to another product and the previous product would show up in the cart.
After some tracing, we discovered the culprit: Response.Redirect. Sprinkled throughout the site in literally hundreds of places is Response.Redirect(url);. With this version of the redirect, the execution of the page is stopped immediately. This means that PostRequestHandlerExecute is not fired and the current version of the Context object is not thrown away by NInject... and everything falls apart. New versions are not created properly, etc. EndRequest is fired which is why the normal NInject OnePerRequest module works fine with it, just not our bastardized version that tries to use session state.
Of course, there is an override to Response.Redirect where you can pass a boolean value in to tell it whether to terminate the existing page or continue to execute - Response.Redirect(url,false). Continuing obviously fires our event and everything works but... it continues to execute the rest of the page! This means executing everything that comes after the call to Redirect and we have absolutely no idea what that means (since the existing site expects it to stop).
What next?
So, any suggestions on what to do? So far we've discussed:

- Abstracting our redirect behaviour and going through a central method that controls the redirect (perhaps hacking out a way to call the PostRequestHandlerExecute event, or maybe a custom Redirect event that our NInject module can also subscribe to and clean up).
- Seeing if there is a way we can force the Session object to save in EndRequest if it hasn't been saved previously in PostRequestHandlerExecute, and doing the NInject clean-up in EndRequest.
- Removing our dependency on Session completely and using another storage mechanism: DB, document DB, distributed hash table, etc.

Any advice? Suggestions we haven't thought of? Things you've tried that have / haven't worked?
I think you're on the right track. Here are some thoughts I had:
In addition to the strongly typed wrapper you have, I'd suggest a facade for accessing the context object that returns your wrapper, something like an IContextProvider. That way you can introduce it piecemeal, and then when it's fully integrated, you can refactor the provider without breaking the things that use it. I can't tell, but you might have already done this. It'll also be easier to change your persistence mechanism if you choose to. Once you get all the dependencies isolated from the context object, I would suggest changing it to not persist as XML. The SessionState will store a binary object much faster, and you can always serialize to XML if you need to do transforms.
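A minimal sketch of what such a facade might look like; every name here (SessionContext, FromXml, the keys) is a hypothetical stand-in, and the point is the seam, not the specifics:

    using System.Web;

    public interface IContextProvider
    {
        SessionContext Current { get; } // the strongly typed wrapper
    }

    public class HttpSessionContextProvider : IContextProvider
    {
        public SessionContext Current
        {
            get
            {
                // One instance per request, created lazily on first access.
                var items = HttpContext.Current.Items;
                if (items["SessionContext"] == null)
                    items["SessionContext"] = SessionContext.FromXml(
                        (string)HttpContext.Current.Session["ContextXml"]);
                return (SessionContext)items["SessionContext"];
            }
        }
    }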
I don't think that Ninject is the correct mechanism for what you're trying to do. It's difficult to signal the end of the request in Ninject, since garbage collection can't be depended on. Have you considered using an IHttpModule instead? You can use AcquireRequestState and ReleaseRequestState or EndRequest to handle getting/setting the context in Session, and only allow the app to get to the context object through the facade.
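And a hedged sketch of that module idea, reusing the hypothetical SessionContext names from above. Note the Response.Redirect caveat from the question applies to any handler hooked to PostRequestHandlerExecute, so redirects would still need to use endResponse: false or a central helper:

    using System;
    using System.Web;

    // Sketch only; SessionContext with FromXml/ToXml is assumed for illustration.
    public class SessionContextModule : IHttpModule
    {
        public void Init(HttpApplication app)
        {
            // Session is populated by the time PostAcquireRequestState fires;
            // PostRequestHandlerExecute fires before out-of-process session
            // state is serialized (but is skipped by Response.End).
            app.PostAcquireRequestState += (s, e) => Load(app.Context);
            app.PostRequestHandlerExecute += (s, e) => Save(app.Context);
        }

        private static void Load(HttpContext ctx)
        {
            if (ctx.Session == null) return; // e.g. handlers without session state
            ctx.Items["SessionContext"] =
                SessionContext.FromXml(ctx.Session["ContextXml"] as string);
        }

        private static void Save(HttpContext ctx)
        {
            var wrapper = ctx.Items["SessionContext"] as SessionContext;
            if (wrapper != null && ctx.Session != null)
                ctx.Session["ContextXml"] = wrapper.ToXml();
        }

        public void Dispose() { }
    }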
If you're on a web farm, you're probably using a database for your Session storage anyway, so putting your context into a DB won't be much different.
Firstly, while it's good to demonstrate you've put in the work (and I and others may not have replied if it wasn't clear how much you're interested in a resolution)... that's a massive wall of text! Here's a +1 on your way towards funding a bounty for a complete response that talks about the Ninject ASP.NET extensions and how they apply to each individual element of your issue. Having said that, hopefully someone will come along with a real resolution for you.
Even though it's [very] 2.0 specific, Nate's Cache and Collect post is required reading. While it seems you're pretty au fait with the tradeoffs involved and have debugged deeply, the article is well worth a few reads.
I'd also consider moving to V2 of Ninject - a lot of this stuff has been revised significantly. It's not magically going to work, but represents a mature rewrite based on a lot of learning from V1. Have you read the (V1 or) V2 unit tests for Ninject? They'll show you the low level tools at your disposal in order to realise your goals.
Bottom line for me is that you need to work out a strategy for your state management independent of DI, and then by all means use the container/DI system as a part of the implementation.
The Problem
In the stack that we re-use between projects, we are putting a little bit too much data in the session for passing data between pages. This was good in theory because it prevents tampering, replay attacks, and so on, but it creates as many problems as it solves.
Session loss itself is an issue, although it's mostly handled by implementing Session State Server (or by using SQL Server). More importantly, it's tricky to make the back button work correctly, and it's also extra work to create a situation where a user can, say, open the same screen in three tabs to work on different records.
And that's just the tip of the iceberg.
There are workarounds for most of these issues, but as I grind away, all this friction gives me the feeling that passing data between pages using session is the wrong direction.
What I really want to do here is come up with a best practice that my shop can use all the time for passing data between pages, and then, for new apps, replace key parts of our stack that currently rely on Session.
It would also be nice if the final solution did not result in mountains of boilerplate plumbing code.
Proposed Solutions
Session
As mentioned above, leaning heavily on Session seems like a good idea, but it breaks the back button and causes some other problems.
There may be ways to get around all the problems, but it seems like a lot of extra work.
One thing that's very nice about using session is the fact that tampering is just not an issue. Compared to passing everything via the unencrypted QueryString, you end up writing much less guard code.
Cross-Page Posting
In truth I've barely considered this option. I have a problem with how tightly coupled it makes the pages -- if I start doing PreviousPage.FindControl("SomeTextBox"), that seems like a maintenance problem if I ever want to get to this page from another page that maybe does not have a control called SomeTextBox.
It seems limited in other ways as well. Maybe I want to get to the page via a link, for instance.
QueryString
I'm currently leaning towards this strategy, like in the olden days. But I probably want my QueryString to be encrypted to make it harder to tamper with, and I would like to handle the problem of replay attacks as well.
On 4 Guys from Rolla, there's an article about this.
However, it should be possible to create an HttpModule that takes care of all this and removes all the encryption sausage-making from the page. Sure enough, Mads Kristensen has an article where he released one. However, the comments make it sound like it has problems with extremely common scenarios.
Other Options
Of course this is not an exhaustive look at the options, but rather the main ones I'm considering. This link contains a more complete list. The ones I didn't mention, such as Cookies and the Cache, are not appropriate for the purpose of passing data between pages.
In Closing...
So, how are you handling the problem of passing data between pages? What hidden gotchas did you have to work around, and are there any pre-existing tools around this that solve them all flawlessly? Do you feel like you've got a solution that you're completely happy with?
Thanks in advance!
Update: Just in case I'm not being clear enough, by 'passing data between pages' I'm talking about, for instance, passing a CustomerID key from a CustomerSearch.aspx page to Customers.aspx, where the Customer will be opened and editing can occur.
First, the problems you are dealing with relate to handling state in a stateless environment. The struggles you are having are not new, and it is probably one of the things that makes web development harder than Windows development or the development of an executable.
With respect to web development, you have five choices, as far as I'm aware, for handling user-specific state which can all be used in combination with each other. You will find that no one solution works for everything. Instead, you need to determine when to use each solution:
Query string - Query strings are good for passing pointers to data (e.g. primary key values) or state values. Query strings by themselves should not be assumed to be secure, even if encrypted, because of replay. In addition, some browsers have a limit on the length of the URL. However, query strings have some advantages: they can be bookmarked and emailed to people, and they are inherently stateless if not used with anything else.
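As a small illustrative sketch (using the CustomerID example from the question's update), guarding a query-string pointer might look like this:

    // Hedged sketch: treat the query-string pointer as untrusted input.
    protected void Page_Load(object sender, EventArgs e)
    {
        int customerId;
        if (!int.TryParse(Request.QueryString["CustomerID"], out customerId)
            || customerId <= 0)
        {
            Response.Redirect("CustomerSearch.aspx"); // bad or missing key: back to search
            return;
        }
        // ... load the customer by primary key and verify this user may view it ...
    }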
Cookies - Cookies are good for storing very tiny amounts of information for a particular user. The problem is that cookies have a size limitation beyond which the data is simply truncated, so you have to be careful about putting custom data in a cookie. In addition, users can kill cookies or block their use (although that would prevent use of standard Session as well). Similar to query strings, cookies are better, IMO, for pointers to data than for the data itself, unless the data is tiny.
Form data - Form data can carry quite a bit of information, at the cost of post times and, in some cases, reload times. ASP.NET's ViewState uses hidden form variables to maintain information. Passing data between pages using something like ViewState has the advantage of working more nicely with the back button, but can easily create ginormous pages which slow down the experience for the user. In general, the ASP.NET model does not work on cross-page posting (although it is possible) but instead works on posts back to the same page, and from there navigating to the next page.
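A tiny sketch of this (illustrative only): stash a key in ViewState on first load, and later postbacks read it back from the hidden form field instead of re-trusting the query string:

    protected void Page_Load(object sender, EventArgs e)
    {
        if (!IsPostBack)
            ViewState["CustomerID"] = Request.QueryString["CustomerID"]; // stash once

        string customerId = (string)ViewState["CustomerID"];
        // ... use customerId; note every byte stored here rides along with each post ...
    }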
Session - Session is good for information that relates to a process with which the user is progressing or for general settings. You can store quite a bit of information into session at the cost of server memory or load times from the databases. Conceptually, Session works by loading the entire wad of data for the user all at once either from memory or from a state server. That means that if you have a very large set of data you probably do not want to put it into session. Session can create some back button problems which must be weighed against what the user is actually trying to accomplish. In general you will find that the back button can be the bane of the web developer.
Database - The last solution (which again can be used in combination with others) is that you store the information in the database in its appropriate schema with a column that indicates the state of the item. For example, if you were handling the creation of an order, you could store the order in the Order table with a "state" column that determines whether it was a real order or not. You would store the order identifier in the query string or session. The web site would continue to write data into the table to update the various parts and child items until eventually the user is able to declare that they are done and the order's state is marked as being a real order. This can complicate reports and queries in that they all need to differentiate "real" items from ones that are in process.
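A hedged sketch of that state-column pattern, with hypothetical table and column names and SQL Server syntax assumed:

    using System.Data.SqlClient;

    static int CreateDraftOrder(string connectionString, int customerId)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(
            "INSERT INTO Orders (CustomerId, OrderState) " +
            "OUTPUT INSERTED.OrderId VALUES (@customerId, 'Draft')", conn))
        {
            cmd.Parameters.AddWithValue("@customerId", customerId);
            conn.Open();
            // The returned key travels in the query string or session; later pages
            // UPDATE the row, and the final page sets OrderState = 'Submitted'.
            // Reports then filter on OrderState = 'Submitted' to skip drafts.
            return (int)cmd.ExecuteScalar();
        }
    }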
One of the items mentioned in your later link was Application Cache. I wouldn't consider this to be user-specific since it is application wide. (It can obviously be shoe-horned into being user-specific but I wouldn't recommend that either). I've never played with storing data in the HttpContext outside of passing it to a handler or module but I'd be skeptical that it was any different than the above mentioned solutions.
In general, there is no one solution to rule them all. The best approach is to assume on each page that the user could have navigated to that page from anywhere (as opposed to assuming they got there by using a link on another page). If you do that, back button issues become easier to handle (although still a pain). In my development, I use the first four extensively and on occasion resort to the last solution when the need calls for it.
Alright, so I want to preface my answer with this: Thomas clearly has the most accurate and comprehensive answer so far for people starting fresh. This answer isn't in the same vein at all. My answer is coming from a "business developer's" standpoint. As we all know too well, sometimes it's just not feasible to spend money re-writing something that already exists and "works"... at least not all in one shot. Sometimes it's best to implement a solution which will let you migrate to a better alternative over time.
The only thing I'd say Thomas is missing is client-side JavaScript state. Where I work we've found customers are coming to expect "Web 2.0"-type applications more and more. We've also found these sorts of applications typically result in much higher user satisfaction. With a little practice, and the help of some really great JavaScript libraries like jQuery (we've even started using GWT and found it to be AWESOME), communicating with JSON-based REST services implemented in WCF can be trivial. This approach also provides a very nice way to start moving towards an SOA-based architecture and a clean separation of UI and business logic.
But I digress.
It sounds to me as though you already have an application, and you've already stretched the limits of ASP.NET's built-in session state management. So... here's my suggestion (assuming you've already tried ASP.NET's out-of-process session management, which scales significantly better than the in-process/on-box session management, and it sounds like you have because you mentioned it): NCache.
NCache provides you with a "drop-in" replacement for ASP.NET's session management options. It's super easy to implement, and could "band-aid" your application more than well enough to get you through - without any significant investment in refactoring your existing codebase immediately.
You can use the extra time and money to start reducing your technical debt by focusing new development on things with immediate business-value - using a new approach (such as any of the alternatives offered in the other answers, or mine).
Just my thoughts.
Several months later, I thought I would update this question with the technique I ended up going with, since it has worked out so well.
After playing with more involved session state handling (which resulted in a lot of broken back buttons and so on) I ended up rolling my own code to handle encrypted QueryStrings. It's been a huge win -- all of my problem scenarios (back button, multiple tabs open at the same time, lost session state, etc) are solved and the complexity is minimal since the usage is very familiar.
This is still not a magic bullet for everything but I think it's good for about 90% of the scenarios you run into.
Details
I built a class called CorePage that inherits from Page. It has methods called SecureRequest and SecureRedirect.
So you might call:
SecureRedirect(String.Format("Orders.aspx?ClientID={0}&OrderID={1}", ClientID, OrderID));
CorePage parses out the QueryString and encrypts it into a QueryString variable called CoreSecure. So the actual request looks like this:
Orders.aspx?CoreSecure=1IHXaPzUCYrdmWPkkkuThEes%2fIs4l6grKaznFGAeDDI%3d
If available, the currently logged in UserID is added to the encryption key, so replay attacks are not as much of a problem.
From there, you can call:
X = SecureRequest("ClientID")
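For readers who want a starting point before the actual code is posted, here is a rough sketch of how such a SecureRedirect/SecureRequest pair could be built. This is a reconstruction, not the actual CorePage implementation; a hardened version would also authenticate the ciphertext (e.g. with an HMAC) and mix the logged-in UserID into the key as described above:

    using System;
    using System.Security.Cryptography;
    using System.Text;
    using System.Web;
    using System.Web.UI;

    public class CorePage : Page
    {
        // Hypothetical 128-bit key; a real one would come from configuration.
        private static readonly byte[] Key =
            Convert.FromBase64String("u8flqSOVpd27T0S9AmbwQg==");

        protected void SecureRedirect(string urlWithQuery)
        {
            int i = urlWithQuery.IndexOf('?');
            string page = i < 0 ? urlWithQuery : urlWithQuery.Substring(0, i);
            string query = i < 0 ? "" : urlWithQuery.Substring(i + 1);
            Response.Redirect(page + "?CoreSecure=" +
                HttpUtility.UrlEncode(Encrypt(query)));
        }

        protected string SecureRequest(string name)
        {
            string payload = Decrypt(Request.QueryString["CoreSecure"]);
            return HttpUtility.ParseQueryString(payload)[name];
        }

        private static string Encrypt(string plain)
        {
            using (var aes = Aes.Create())
            {
                aes.Key = Key;
                aes.GenerateIV(); // fresh IV per request, prepended to the payload
                using (var enc = aes.CreateEncryptor())
                {
                    byte[] data = Encoding.UTF8.GetBytes(plain);
                    byte[] cipher = enc.TransformFinalBlock(data, 0, data.Length);
                    byte[] blob = new byte[aes.IV.Length + cipher.Length];
                    Buffer.BlockCopy(aes.IV, 0, blob, 0, aes.IV.Length);
                    Buffer.BlockCopy(cipher, 0, blob, aes.IV.Length, cipher.Length);
                    return Convert.ToBase64String(blob);
                }
            }
        }

        private static string Decrypt(string encoded)
        {
            byte[] blob = Convert.FromBase64String(encoded);
            using (var aes = Aes.Create())
            {
                aes.Key = Key;
                byte[] iv = new byte[16];
                Buffer.BlockCopy(blob, 0, iv, 0, 16);
                aes.IV = iv;
                using (var dec = aes.CreateDecryptor())
                {
                    byte[] plain = dec.TransformFinalBlock(blob, 16, blob.Length - 16);
                    return Encoding.UTF8.GetString(plain);
                }
            }
        }
    }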
Conclusion
Everything works seamlessly, using familiar syntax.
Over the last several months I've also adapted this code to work with edge cases, such as hyperlinks that trigger a download - sometimes you need to generate a hyperlink on the client that has a secure QueryString. That works really well.
Let me know if you would like to see this code and I will put it up somewhere.
One last thought: it's weird to accept my own answer over some of the very thoughtful posts other people put on here, but this really does seem to be the ultimate answer to my problem. Thanks to everyone who helped get me there.
After going through all the above scenarios and answers, and this link on data passing methods, my final advice would be:
COOKIES for:

- ENCRYPT[userIds]
- ENCRYPT[productId]
- ENCRYPT[xyzIds...]
- ENCRYPT[etc...]

DATABASE for:

- datasets BY COOKIE ID
- datatables BY COOKIE ID
- all other large chunks BY COOKIE ID
My advice also depends on the statistics detailed in this link on data passing methods.
I would never do this. I have never had any issues storing all session data in the database, loading it based on the user's cookie. It's a session as far as anything is concerned, but I maintain control over it. Don't give up control of your session data to your web server...
With a little work, you can support sub sessions, and allow multi-tasking in different tabs/windows.
As a starting point, I find that critical data elements, such as a Customer ID, are best put into the query string for processing. You can easily track/filter bad data coming in on these elements, and it also allows for some integration with e-mail or other related sites/applications.
In a previous application, the only way to view an employee or a request record involving them was to log into the application and do a search for the employee, or search recent records to find the record in question. This became problematic and a big time sink when somebody from a related department needed to do a simple view of records for auditing purposes.
In the rewrite, I made both the employee Ids and request Ids available through basic URLs of "ViewEmployee.aspx?Id=XXX" and "ViewRequest.aspx?Id=XXX". The application was set up to A) filter out bad Ids and B) authenticate and authorize the user before allowing them onto these pages. What this allowed the primary application users to do was send simple e-mails to the auditors with a URL in the e-mail. When the auditors were in a big hurry, during their bulk processing time, they were able to simply click down a list of URLs and do the appropriate processing.
Other session-related data, such as modification dates and maintaining the "state" of the user's interaction with the application, gets a little more complex, but hopefully this provides a starting point for you.