Best way to persist data? [closed] - asp.net

Closed. This question is opinion-based. It is not currently accepting answers. Closed 5 years ago.
I have a complex JSON object which I need to persist over two POST requests. Currently I'm storing the serialized JSON in TempData, but the second POST never succeeds because of Error 400 (the size of the request headers is too long). Viewing the cookies in the Chrome debugger confirms they have grown very large.
Am I doing MVC wrong? The data is probably too complex to be stored in TempData. For this example, however, the JSON is only 234 lines (I'm unsure whether that reflects the cookie size accurately). I know I could increase the allowed header size, but that wouldn't fix the real issue.
Should I be storing the data some other way?
Basically, in my project I'm posting a value to the controller (many times, via POST), which then uses the value to get a certain part of the JSON. Is Session the only alternative?
I'm still a novice at MVC, so forgive me if I've made a simple mistake.

First, TempData and Session are the same thing. The only difference is the length of persistence: TempData lasts just until the next request, while Session lasts for the life of the session.
Second, session storage has to be configured. If you don't configure it, then something like TempData will fall back to cookies to persist the data; otherwise, it will use your session store. Simply by using an actual session store, you should have no issues with the size of the data.
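To make the configuration point concrete, here is a minimal sketch, assuming ASP.NET Core. With a session store configured, TempData can use the session-based provider instead of the cookie provider, so large values no longer travel in the request/response headers:

```csharp
// A minimal sketch, assuming ASP.NET Core. With a session store configured,
// TempData uses the session-based provider rather than cookies.
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        services.AddDistributedMemoryCache();        // backing store for session state
        services.AddSession();                       // enables HttpContext.Session
        services.AddControllersWithViews()
                .AddSessionStateTempDataProvider();  // TempData -> session, not cookies
    }

    public void Configure(IApplicationBuilder app)
    {
        app.UseRouting();
        app.UseSession();                            // must run before the endpoints
        app.UseEndpoints(endpoints => endpoints.MapDefaultControllerRoute());
    }
}
```

In production you would swap the in-memory backing store for a distributed one (e.g. Redis or SQL Server), but the TempData wiring stays the same.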
Third, you have not provided much information about what you're actually doing here, but for the most part, sessions (Session or TempData) are a poor choice for persistence. The data you're trying to store between requests does not sound like it is user-specific, which makes sessions a particularly poor choice. Most likely, you want a distributed cache here, though you could potentially get by with an in-memory cache. You should also consider whether you need to persist this data at all. It's far too common to over-optimize by worrying about, for example, running the same query against a database multiple times. Databases are designed to efficiently retrieve large amounts of data and, properly set up, can handle many thousands of simultaneous queries without breaking a sweat. Ironically, sometimes caching a query doesn't actually save you anything over just running the query, especially with distributed caching mechanisms.
Simple is better than complex. Start simple. Solve the problem in the most straightforward way possible. If that involves issuing the same query multiple times, do so. It doesn't matter. Then, once you have a working solution, profile. If it's running slower than you'd like, or starts to fall down when fielding thousands of requests, then look into ways to optimize it by caching, etc. Developers waste an enormous amount of time and energy trying to optimize things that aren't actually even problems.
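The in-memory cache option mentioned above can be sketched as follows, assuming ASP.NET Core's IMemoryCache; the cache key, expiry, and LoadReportJson are hypothetical placeholders for your own data access:

```csharp
// A minimal sketch of the in-memory cache option, assuming ASP.NET Core's
// IMemoryCache. Key, expiry, and loader are hypothetical placeholders.
using System;
using Microsoft.Extensions.Caching.Memory;

public class ReportService
{
    private readonly IMemoryCache _cache;

    public ReportService(IMemoryCache cache) => _cache = cache;

    public string GetReportJson(int reportId)
    {
        return _cache.GetOrCreate($"report:{reportId}", entry =>
        {
            entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5);
            return LoadReportJson(reportId);     // runs only on a cache miss
        });
    }

    private string LoadReportJson(int reportId)
    {
        // Stand-in for the real database query.
        return "{}";
    }
}
```

The same shape works with IDistributedCache if you later move to a cache shared across servers.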


How to store big data? [closed]

Closed. This question is opinion-based. It is not currently accepting answers. Closed 6 months ago.
Suppose we have a web service that aggregates 20,000 users, each linked to 300 unique user-data entities containing whatever. Here's a naive approach to designing an example relational database that could store the above data:
Create table for users.
Create table for user data.
Thus, the user data table contains 6,000,000 rows.
Querying tables that have millions of rows is slow, especially since we have to deal with hierarchical data and do some uncommon computations quite different from SELECT * FROM userdata. At any given point we only need a specific user's data, not the whole thing - getting it is fast - but we have to do weird stuff with it afterwards. Multiple times.
I'd like our web service to be fast, so I thought of the following approaches:
Optimize the hell out of the queries, do a lot of caching, etc. This is nice, but these are just temporary workarounds; when the database grows even further, they will cease to work.
Rewrite our model layer to use NoSQL technology. This is not possible due to the lack of relational-database features, and even if we wanted this approach, early tests made some functionality even slower than it already was.
Implement some kind of scalability. (You hear a lot about cloud computing nowadays.) This is the most desirable option.
Implement some manual solution. For example, I could store all users whose names begin with the letters "A..M" on server 1, while all other users would belong to server 2. The problem with this approach is that I would have to redesign our architecture quite a lot, and I'd like to avoid that.
Ideally, I'd have some kind of transparent solution that would let me query a seemingly uniform database server with no changes to the code whatsoever. The database server would scatter its table data across many workers in a smart way (much like database optimizers do), effectively speeding everything up. (Is this even possible?)
In both cases, achieving interoperability seems like a lot of trouble...
Switch from SQLite to a Postgres or Oracle solution. This isn't going to be cheap, so I'd like some kind of confirmation before doing it.
What are my options? I want all my SELECTs and JOINs over indexed data to be real-time, but the bigger the userdata table gets, the more expensive the queries become.
I don't think you should default to NoSQL just because you have this amount of data. What kind of issue are you expecting it to solve?
IMHO, this depends on your queries. You haven't mentioned any kind of massive write load, so SQL is still appropriate so far.
It sounds like you want to perform queries using JOINs. These can be slow on very large datasets even with appropriate indexes. What you can do is lower your level of decomposition and duplicate data, so that everything lives in one database row and is fetched together from disk. If you are concerned about latency, avoiding joins is a good approach. But that still does not rule out SQL, since you can denormalize within SQL as well.
The structure of your queries should weigh heavily in your decision. Do you want to SELECT only a few fields per query (SQL), or do you always want to fetch the whole document (e.g. MongoDB and JSON)?
The second significant criterion is scalability. NoSQL often relaxes guarantees that SQL provides (settling for eventual consistency, for example) so that it can deliver better results by scaling out.

What cache strategy do I need in this case?

I have what I consider to be a fairly simple application. A service returns some data based on another piece of data. A simple example, given a state name, the service returns the capital city.
All the data resides in a SQL Server 2008 database. The majority of this "static" data will rarely change. It will occasionally need to be updated and, when it does, I have no problem restarting the application to refresh the cache, if one is implemented.
Some data, which is more "dynamic", will be kept in the same database. This data includes contacts, statistics, etc. and will change more frequently (anywhere from hourly to daily to weekly). This data will be linked to the static data above via foreign keys (just like a SQL JOIN).
My question is, what exactly am I trying to implement here, and how do I get started doing it? I know the static data will be cached, but I don't know where to start with that. I tried searching but came up with so much stuff that I'm not sure where to begin. Recommendations for tutorials would also be appreciated.
You don't need to cache anything until you have a performance problem. Only once you have a noticeable problem, and have measured your application tiers to determine that your database is in fact the bottleneck (which it rarely is), should you start looking into caching data. It is always a tradeoff: memory vs. CPU vs. real-time data availability. There is no reason to make your application more complicated than it needs to be just because.
An extremely simple 'win' here (I assume you're using WCF here) would be to use the declarative attribute-based caching mechanism built into the framework. It's easy to set up and manage, but you need to analyze your usage scenarios to make sure it's applied at the right locations to really benefit from it. This article is a good starting point.
Beyond that, I'd recommend looking into one of the many WCF books that deal with higher-level concepts like caching and try to figure out if their implementation patterns are applicable to your design.
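The attribute-based mechanism mentioned above can be sketched roughly as follows, assuming a .NET Framework 4 WebHttp (REST) WCF service with ASP.NET compatibility enabled; the contract, operation, and profile name are hypothetical, and "StaticDataCache" would need to be defined as an outputCacheProfile in web.config:

```csharp
// A minimal sketch of WCF's declarative output caching. Contract,
// operation, and cache-profile name are hypothetical placeholders.
using System.ServiceModel;
using System.ServiceModel.Web;

[ServiceContract]
public interface IStateService
{
    [OperationContract]
    [WebGet(UriTemplate = "capital/{state}")]
    string GetCapital(string state);
}

public class StateService : IStateService
{
    [AspNetCacheProfile("StaticDataCache")]   // cache duration comes from config
    public string GetCapital(string state)
    {
        return LookupCapital(state);          // e.g. a SQL Server query
    }

    private string LookupCapital(string state)
    {
        // Stand-in for the real database lookup.
        return "Sacramento";
    }
}
```

The appeal of this approach is that cache durations live in configuration rather than code, so the "static" data's lifetime can be tuned without recompiling.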

asp .net Application performance [closed]

Closed. This question needs details or clarity. It is not currently accepting answers. Closed 8 years ago.
I have an ASP.NET 4.0 application with an .mdf file in my App_Data folder where I store some data. There is a "User" table with 15 fields and an "Answers" table with about 30 fields. In most scenarios on my website, the user retrieves some data from the "User" table and writes some data to the "Answers" table.
I want to test the performance of my application when about 10,000 users use the system. What will happen if 10,000 users log in and use the system at the same time, and how will performance be affected? What is the best practice for testing the performance of ASP.NET pages in general?
Any help will be appreciated.
Thanks in advance.
It reads like performance testing/engineering is not your core discipline. I would recommend hiring someone to either run this effort or assist you with it. Performance testing is a specialized development practice with specific requirement sets, tool expertise and analytical methods. It takes quite a while to become effective in the discipline even in the best case conditions.
In short, you begin with your load profile. You progress to definitions of the business process in your load profile. You then select a tool that can exercise the interfaces appropriately. You will need to set a defined initial condition for your testing efforts. You will need to set specific, objective measures to determine system performance related to your requirements. Here's a document which can provide some insight as a benchmark on the level of effort often required, http://www.tpc.org/tpcc/spec/tpcc_current.pdf
Something which disturbs me greatly is your use case of "at the same time," which is a practical impossibility for systems where the user agent is not synchronized to a clock tick. Users can be close, concurrent within a defined window, but true simultaneity is exceedingly rare.

What kind of data should never go into session? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. Closed 10 years ago.
What kinds of data should never be kept in a session?
I really wish it were clearer what kind of session you mean. Depending on the answer, I can come up with a couple:
Passwords of any sort
Large amounts of data, especially 4 GB+ on a 32-bit OS (guaranteed out of memory if it has to be loaded into RAM)
Executable code
Raw SQL
Swear words
Things likely to get government agencies angry ("Free Tibet" in China, threats to the president in the US)
Your bank account PIN or credit card number
A rabid badger. Actually, ANY kind of badger.
If possible, store nothing in the Session. It is an unreliable way to maintain state, especially if you need to move to a web farm. Also, I believe it encourages poor design. HTTP is stateless, and web sites should be designed in a way where you assume that for any request, you could be starting over from scratch.
COM or complex objects.
This link can also be useful: ASP.NET 2.0 Performance Inspection Questions - Session State
This answer is for PHP Sessions.
If you mean $_SESSION, it is stored on the server's hard drive, so it is not directly exposed to the client the way cookies are.
However, on a shared host, it can sometimes be trivial to access session files from other websites.
I would not store anything in the session you wouldn't want anyone else on your shared host to see.
This can be a pretty subjective question. Technically, anything that's serializable can be stored in session, but there are definitely scenarios where you don't want to add things to it: complex objects, objects with large collections as properties, etc. All of these are serialized into byte arrays and kept in memory (for InProc session state), then deserialized when needed in code again. The more complex the object, the more resource-intensive that round trip becomes.
Depending on how many users you have, you may wish to limit the number of items that go into session and perhaps use ViewState or other means of persistence. If it's truly something meant for multiple pages, then it's probably a good candidate for session. If it's only used in a page or two, then ViewState, QueryString, etc. may be better.
I would not put the session inside the session also!
You can store anything in Session as long as you keep the SessionMode="InProc" in the web.config. This stores any session data in the web server's memory in a user specific context.
However, if you want to scale up one day and run your web app in a farm, you will have to use another SessionMode. Then you can no longer store objects of complex types that are not serializable (unfortunately, dictionaries are a common candidate) and you will have to change your design.
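A minimal sketch of that constraint, assuming classic ASP.NET (System.Web): with InProc mode any object can go into Session, but with StateServer or SQLServer modes the stored type must be serializable, or storing it fails when the session is persisted. UserProfile here is a hypothetical type:

```csharp
// A minimal sketch, assuming classic ASP.NET (System.Web).
// UserProfile is a hypothetical type.
using System;
using System.Collections.Generic;

[Serializable]   // required once you leave InProc mode
public class UserProfile
{
    public string Name { get; set; }
    public List<string> Roles { get; set; }
}

// Inside a page or controller:
// Session["profile"] = new UserProfile { Name = "Ada", Roles = new List<string> { "admin" } };
// var profile = (UserProfile)Session["profile"];
```

Marking your session-bound types [Serializable] from the start keeps the door open for moving off InProc later.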
DataSets: serialising a DataSet to store in session can take an order of magnitude more memory than the DataSet itself (e.g. a 1 MB DataSet can need 20 MB to serialise/deserialise, and it does that on every request).
Controls: Storing controls (and their collections) in session means that ASP.NET can't clean them up properly at the end of the page request, leading to memory leaks.
See Tess Ferrandez's blog for other examples of things you should never put in session, along with reasons why.
Stock tips, pirated CDs, full-length movies (except "Clerks", that movie was awesome), analog information, ...
This question seems kind of vague -- I can think of countless kinds of information that shouldn't be stored in the session!

What are some ways to optimize your use of ASP.NET caching? [closed]

Closed. This question is opinion-based. It is not currently accepting answers. Closed 1 year ago.
I have been doing some reading on this subject, but I'm curious to see what the best ways are to optimize your use of the ASP.NET cache, and what the tips are for determining what should and should not go in the cache. Also, are there any rules of thumb for determining how long something should stay in the cache?
Some rules of thumb
Think in terms of the cache miss-to-request ratio each time you contemplate using the cache. If cache requests for an item will miss most of the time, the benefits may not outweigh the cost of maintaining that cache item.
Weigh the query expense vs. the cache retrieval expense (e.g. for simple reads, SQL Server is often faster than a distributed cache due to serialization costs).
Some tricks
Gzip strings before sticking them in the cache. This effectively expands the cache and reduces network traffic in a distributed-cache situation.
If you're worried about how long to cache aggregates (e.g. counts), consider having non-expiring (or long-lived) cached aggregates and proactively updating them when the underlying data changes. This is a controversial technique and you should really consider your request/invalidation ratio before proceeding, but in some cases the benefits can be worth it (e.g. SO rep for each user might be a good candidate depending on implementation details; the number of unanswered SO questions would probably be a poor candidate).
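The gzip trick above can be sketched like this; the helper class is hypothetical, and the compressed byte[] is what you would hand to your distributed cache, trading a little CPU for cache space and bandwidth:

```csharp
// A minimal sketch of the "gzip before caching" trick.
using System.IO;
using System.IO.Compression;
using System.Text;

public static class CacheCompression
{
    public static byte[] Compress(string value)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                byte[] raw = Encoding.UTF8.GetBytes(value);
                gzip.Write(raw, 0, raw.Length);
            }
            return output.ToArray();   // store this byte[] in the cache
        }
    }

    public static string Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Note that gzip only pays off for values that compress well (JSON, HTML); small or already-compressed values can come out larger.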
Don't implement caching yet.
Put it off until you've exhausted all the indexing, query tuning, page simplification, and other more pedestrian means of boosting performance. If you flip caching on before it's the last resort, you're going to have a much harder time figuring out where the performance bottlenecks really live.
And, of course, if you have the backend tuned right when you finally do turn on caching, it will work a lot better for a lot longer than it would if you did it today.
The best quote I've heard about performance tuning and caching is that it's an art, not a science. Sorry, I can't remember who said it, but the point is that there are so many factors affecting the performance of your app that you need to evaluate each situation case by case and make considered tweaks until you reach the desired outcome.
I realise I'm not giving any specifics here, but I don't really think you can.
I will give one example, though. I worked on an app that made a lot of calls to web services to build up a client profile, e.g.:
GET client
GET client quotes
GET client quote
Each object returned by the web service contributed to a higher-level object that was then used to build the resulting page. At first we gathered all the objects into the master object and cached that. However, when things were not as quick as we would like, we realised it would make more sense to cache each called object individually; this way it could be re-used on the next page the client sees, e.g.:
[Cache] client
[Cache] client quotes
[Cache] client quote
GET client quote upgrades
Unfortunately there are no pre-established rules, but as common-sense guidance, I would say that you can easily cache:
Application Parameters (list of countries, phone codes, etc...)
Any other application non-volatile data (list of roles even if configurable)
Business data that is often read and does not change much (or not a big deal if it is not 100% accurate)
What you should not cache:
Volatile data that change frequently (usually the business data)
As for the cache duration, I tend to use different durations depending on the type of data and its size. Application Parameters can be cached for several hours or even days.
For some business data, you may want to have smaller cache duration (minutes to 1h)
One last thing: always challenge the amount of data you manipulate. Remember that the end user won't read thousands of records at once.
Hope this will give you some guidance.
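The duration guidance above can be sketched with System.Runtime.Caching.MemoryCache (.NET Framework); the keys and Load* methods are hypothetical placeholders:

```csharp
// A minimal sketch of tiered cache durations. Keys and loaders are
// hypothetical placeholders for real data access.
using System;
using System.Runtime.Caching;

public static class AppCache
{
    private static readonly MemoryCache Cache = MemoryCache.Default;

    public static void Warm()
    {
        // Application parameters: stable, cache for a day.
        Cache.Set("countries", LoadCountries(),
                  new CacheItemPolicy { AbsoluteExpiration = DateTimeOffset.Now.AddHours(24) });

        // Often-read business data: minutes to an hour.
        Cache.Set("topProducts", LoadTopProducts(),
                  new CacheItemPolicy { AbsoluteExpiration = DateTimeOffset.Now.AddMinutes(15) });
    }

    private static object LoadCountries() => new[] { "FR", "US" };   // stand-in
    private static object LoadTopProducts() => new object();         // stand-in
}
```

The key design choice is matching the expiration to the data's volatility rather than picking one global duration.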
It's very hard to generalize this sort of thing. The only hard-and-fast rule to follow is not to waste time optimizing something unless you know it needs to be done. Then the proper course of action is going to be very much dependent on the nitty gritty details of your application.
That said... I'll almost always cache global application parameters in some easy-to-use object. This is certainly more of a programming convenience than an optimization.
The one time I've written specific data caching code was for an app that interfaced with a very slow accounting database, and then it was read-only for data that didn't change very often. All writes went to the DB. With SQL Server, I've never run into a situation where the built-in ASP.NET-to-SQL Server interface was the slow part of the equation.
