We generate reports in our web application by querying our SQL Server database for data returned as XML and then processing it with XSLT to create the final output.
To speed up the system, we removed all the static information from the returned SQL XML and cached a large XDocument containing all the static info. Right before performing the XSL transform, we append the XDocument with the static info to the end of the XML that came from SQL Server. The static XDocument is about 50 MB and takes many seconds to build from SQL Server.
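Roughly, the caching and append step looks like the sketch below; the cache key, BuildStaticData() and the element names are placeholders for illustration, not our actual implementation.

```csharp
using System;
using System.Web;
using System.Web.Caching;
using System.Xml.Linq;

public static class ReportBuilder
{
    // Returns the cached static XDocument, rebuilding it from SQL Server
    // only when it has fallen out of the cache.
    static XDocument GetStaticData()
    {
        var cached = (XDocument)HttpRuntime.Cache["StaticReportData"];
        if (cached == null)
        {
            cached = BuildStaticData(); // slow: many seconds against SQL Server
            HttpRuntime.Cache.Insert("StaticReportData", cached, null,
                Cache.NoAbsoluteExpiration, TimeSpan.FromHours(4));
        }
        return cached;
    }

    // Appends the cached static markup to the dynamic XML from SQL Server
    // right before the XSL transform runs.
    public static XDocument BuildReportInput(XDocument sqlXml)
    {
        sqlXml.Root.Add(GetStaticData().Root);
        return sqlXml;
    }

    // Placeholder for the expensive query that builds the ~50 MB document.
    static XDocument BuildStaticData()
    {
        return new XDocument(new XElement("StaticInfo"));
    }
}
```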
Our problem is that once we started caching a few of these large XDocuments, we hit the cache's private-bytes limit and the cache was cleared. Rebuilding these XDocuments is too time-consuming to do while people are running reports. I have not tried saving them to a physical file because they are needed for every report run, which happens constantly throughout the day.
I've thought of installing AppFabric Cache, but I'm not sure it's a great idea to store 5 to 10 of these large items in it.
Any ideas? If I install more memory on the web server, will it automatically be available to ASP.NET for a larger cache? I've tried compressing the data before storing it in the cache (it shrank by a factor of 5), but decompressing it and reparsing the XDocument slowed the server way down.
In the end, just save it to a file as-is and then reload it as-is, because it is already serialized.
protobuf-net is super fast and lightweight, and I have tested and used it, but it doesn't help here because the data is already serialized.
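A minimal sketch of that idea; BuildStaticData() and the path are placeholders, not anything from the question.

```csharp
using System.Xml.Linq;

// Build the large static XDocument once (expensive), then persist it as XML.
XDocument staticDoc = BuildStaticData();
staticDoc.Save(@"D:\ReportCache\static-report-data.xml");

// On a cold start, reload it from disk instead of rebuilding it from SQL Server.
XDocument reloaded = XDocument.Load(@"D:\ReportCache\static-report-data.xml");
```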
You can serialize the XML in a binary form and store it in the database using a varbinary(max) column. I'm not sure about the performance of that, but it might be worth trying since it won't take very long to implement.
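A sketch of what the write side might look like; the table and column names (ReportCache, CacheKey, Payload) are made up for illustration.

```csharp
using System.Data;
using System.Data.SqlClient;
using System.Text;

static class ReportCacheStore
{
    // Store the XML as bytes in a varbinary(max) column.
    public static void Save(string connectionString, string key, string xml)
    {
        byte[] payload = Encoding.UTF8.GetBytes(xml);
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(
            "UPDATE ReportCache SET Payload = @p WHERE CacheKey = @k", conn))
        {
            cmd.Parameters.Add("@p", SqlDbType.VarBinary, -1).Value = payload; // -1 = varbinary(max)
            cmd.Parameters.AddWithValue("@k", key);
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}
```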
Something else that you might want to address is the performance penalty for the first user accessing the report. In order to avoid this, you could pre-generate the reports so they are cached for everyone.
Related
I have some data obtained from an API which I display via a master-detail web page. The data I receive from the API is in JSON format, and I currently cache a serialised version of it to disk. All files are stored in a single folder. Each file is used for a maximum of one week, as new content is released every week. There can be up to a maximum of 40,000 files. Each file is about 12 KB, and a GUID is used as the filename.
What is the best caching strategy?
Keep as is.
Store the raw JSON instead of serialised data.
Replace the disk caching solution with a NoSQL solution like Redis.
Organise the files into folders.
Use faster serialisation / deserialisation techniques.
If you have plenty of RAM, then to retrieve the data faster you can avoid serialisation and deserialisation and keep the data directly in Redis as key-value pairs.
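A rough sketch of that approach with StackExchange.Redis; the connection string and the key format are assumptions, not from the question.

```csharp
using System;
using StackExchange.Redis;

static class JsonCache
{
    // "localhost" and the "release:{guid}" key format are assumptions.
    static readonly ConnectionMultiplexer Redis = ConnectionMultiplexer.Connect("localhost");

    public static void Store(Guid id, string json)
    {
        IDatabase db = Redis.GetDatabase();
        // One-week expiry to match the weekly content release cycle.
        db.StringSet("release:" + id, json, TimeSpan.FromDays(7));
    }

    public static string Load(Guid id)
    {
        return Redis.GetDatabase().StringGet("release:" + id);
    }
}
```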
I have a couple of action methods that return content from the database which doesn't change very often (e.g. a polygon list of available ZIP areas, returned as JSON; it changes twice per year).
I know there is the [OutputCache(...)] attribute, but it has some disadvantages (long client-side caching is not good, and if the server/IIS/process gets restarted, the server-side cache is also lost).
What I want is for MVC to store the result in the file system, calculate a hash, and if the hash hasn't changed, return HTTP status code 304, just like it is done with images by default.
Does anybody know a solution for that?
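Roughly, the behaviour I have in mind would look something like this sketch (the controller, action and helper names are just illustrative):

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using System.Web;
using System.Web.Mvc;

public class ZipAreaController : Controller
{
    public ActionResult Polygons()
    {
        // GetZipAreaJson() stands in for however the JSON is produced or cached.
        string json = GetZipAreaJson();

        // Derive an ETag from the content hash.
        string etag;
        using (var md5 = MD5.Create())
            etag = "\"" + Convert.ToBase64String(md5.ComputeHash(Encoding.UTF8.GetBytes(json))) + "\"";

        // If the client already has this version, answer 304 without a body.
        if (Request.Headers["If-None-Match"] == etag)
            return new HttpStatusCodeResult(304);

        Response.Cache.SetCacheability(HttpCacheability.Private);
        Response.Cache.SetETag(etag);
        return Content(json, "application/json");
    }

    string GetZipAreaJson() { return "[]"; } // placeholder
}
```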
I think it's a bad idea to try to cache data on the file system because:
It is not going to be much faster to read your data from the file system than to get it from the database, even if you already have it in JSON format.
You are going to add a lot of logic to calculate and compare the hash, and more to read the data from a file. That means new bugs and more complexity.
If I were you, I would keep it as simple as possible. Store your data in the Application container. Yes, you will have to reload it every time the application starts, but that should not be a problem at all, as the application is not supposed to restart often. Also consider using a distributed cache like AppFabric if you have a web farm, so that you don't end up with different data in the Application containers on different servers.
And one more important note: caching means really fast access, and you can't achieve that with file-system or database storage; it is in-memory storage you should be considering.
I am facing a situation where I have to handle a very heavy traffic load while keeping performance high at the same time. Here is my scenario; please read it and advise me with your valuable opinion.
I am going to have three-way communication between my server, my client and the visitor. When a visitor visits my client's website, he will be detected and sent to an intermediate rule engine that performs some tasks and outputs a filtered list of visitors on my server. On the other side, I have a client who will access those lists. My initial idea was to have a web service on my server that acts as the rule engine and outputs the resulting lists on an ASPX page. But this seems inefficient, because there will be huge traffic coming in and the clients will continuously be requesting data from those lists, so it will be a performance overhead. Kindly suggest what approach I should take so that no deadlock happens and things work smoothly. I also considered writing to and fetching from an XML file, but that is not a very good approach in my case either.
NOTE: Please remember that no DB will be involved initially; all work will remain outside the DB.
Wow, storing data efficiently without a database will be tricky. What you can possibly consider is the following:
Store the visitor data in an object list of some sort and keep it in the application cache on the server.
Periodically flush this list (say, after 100 items) to a file on the server - possibly storing it as XML for ease of access (you can associate a schema with it as well to make sure you always get the structure you need). You can perform this file-writing asynchronously so as to avoid keeping the thread locked while writing the file (see the sketch after this list).
The web service sounds like a good idea - make it feed off the XML file. Possibly consider breaking the XML file up into several files as well. You can even cache the contents of this file separately, so the service feeds off the cached data for added performance benefits.
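To make the flush step concrete, here is a rough sketch; the Visitor type, the 100-item threshold and the file path are placeholders for whatever the rule engine actually produces.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using System.Xml.Linq;

// Placeholder type for whatever the rule engine produces per visitor.
public class Visitor { public string Id; public string Source; }

public static class VisitorBuffer
{
    static readonly List<Visitor> Buffer = new List<Visitor>();
    static readonly object Sync = new object();

    public static void Add(Visitor v)
    {
        List<Visitor> toFlush = null;
        lock (Sync)
        {
            Buffer.Add(v);
            if (Buffer.Count >= 100)           // flush threshold from the answer
            {
                toFlush = new List<Visitor>(Buffer);
                Buffer.Clear();
            }
        }
        if (toFlush != null)
            Task.Run(() => WriteXml(toFlush)); // write off the request thread
    }

    static void WriteXml(IEnumerable<Visitor> visitors)
    {
        var doc = new XDocument(new XElement("Visitors",
            visitors.Select(v => new XElement("Visitor",
                new XAttribute("Id", v.Id ?? ""),
                new XAttribute("Source", v.Source ?? "")))));

        // One file per flush, so the web service can pick up several smaller files.
        doc.Save(@"D:\Data\visitors-" + DateTime.UtcNow.Ticks + ".xml");
    }
}
```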
My system was mentioned in this question:
https://stackoverflow.com/questions/5633634/best-index-strategies-for-read-only-table
Because the data is read-only, and at certain times a part of it (50-200k rows, about 200 bytes each) is queried intensively, I think allowing clients to connect to the database and hit it for every row/query is way too expensive. It would be a better choice to cache the intensively queried part of the data in RAM, which is much faster than SQL Server.
The problem is, the system I'm currently working on is a web service, so I'm not sure it allows large static data to be cached. Is my idea a good choice? How can my data "survive" when IIS recycles?
Thank you very much.
Load the data into RAM in the Application_Start event. Then you don't need to worry about IIS restarts.
Here is an MS guide about Caching Data at Application Startup
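A minimal sketch of that, assuming a hypothetical LoadHotRows() query and MyRow type:

```csharp
using System;
using System.Collections.Generic;
using System.Web;
using System.Web.Caching;

public class Global : HttpApplication
{
    protected void Application_Start(object sender, EventArgs e)
    {
        // Load the intensively queried, read-only rows once per app domain.
        // LoadHotRows() is a placeholder for the actual SQL query.
        List<MyRow> rows = LoadHotRows();
        HttpRuntime.Cache.Insert("HotRows", rows, null,
            Cache.NoAbsoluteExpiration, Cache.NoSlidingExpiration,
            CacheItemPriority.NotRemovable, null);
        // After an IIS recycle, Application_Start runs again and repopulates the cache.
    }

    static List<MyRow> LoadHotRows() { return new List<MyRow>(); }
}

public class MyRow { /* ~200 bytes of read-only data per row */ }
```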
I have to port a small Windows Forms application (a product configurator) to an ASP.NET app that will be used on a large company's website. Demand should be moderate, because it's for a specialized product line.
I don't have access to a database and using XML is a requirement from their web developers.
There are roughly 30 different products with roughly 300 possible configurations stored in the XML files, plus linked questions and answers that lead to a product recommendation, as well as some production options. The app is available in 6 languages.
How would you solve the 'data access' layer, if you can call it that? I thought of reading/deserializing the XML files into their objects, storing them in ASP.NET's cache if they're not there already, and then reading from the cache on subsequent requests. But that would mean all objects live in memory day and night.
Is that even necessary, or smart, performance-wise? As I said before, the app is not that big and the XML files are not that large. Could I just create a repository class that reads the XML files whenever an object is requested (i.e. 'product details' or 'next question') and returns it that way, to drive memory consumption down?
The whole approach seems to assume a single server. First consider whether that is appropriate: you mentioned a "large company's website", which raises a red flag for me. If you need the site to scale, you will end up having more than a single server, which rules out relying on a simple local file.
If you are constrained to using that, analyze which data is more appropriate to keep in cache (it does not change often, it is long-lived, the same info is requested many times). Try to keep the cached content separated from the non-cached, which will reduce the amount of info in the more dynamic files. If you expect large amounts of information, consider splitting the files in whatever way is appropriate to your domain.
I use the Cache whenever I can. I cache objects upon their first request. If memory is of any concern, I set an expiration policy. And whether it is or not, when memory runs short the framework will unload cache items anyway.
Since it is per application and not per user, it makes sense to have it, especially if the relative footprint is small.
If you have to expand to multiple servers later, you can access the same file over the network or modify the DA layer to retrieve data by other means (services, DB, etc.). The caching code will stay the same and performance will be virtually unaffected.
If you set a dependency, objects will always stay current (see the sketch below).
I'm for it.
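To make the expiration/dependency point concrete, a minimal sketch; the file name and cache key are placeholders.

```csharp
using System.Web;
using System.Web.Caching;
using System.Xml.Linq;

public static class ConfiguratorData
{
    // Cache the parsed XML with a file dependency, so the cached entry is
    // evicted automatically whenever products.xml changes on disk.
    public static XDocument GetProductXml()
    {
        var doc = (XDocument)HttpRuntime.Cache["ProductXml"];
        if (doc == null)
        {
            string path = HttpContext.Current.Server.MapPath("~/App_Data/products.xml");
            doc = XDocument.Load(path);
            HttpRuntime.Cache.Insert("ProductXml", doc, new CacheDependency(path));
        }
        return doc;
    }
}
```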
Using the cache, and setting an appropriate expiration policy as advised by others, is a sound approach. I'd suggest you look at using LINQ to XML as the basis for your data-access code, as it is much easier to use than traditional methods of querying XML. You can find a decent introduction here.
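As a taste of what that looks like, a small sketch; the <Product> and <Name> element names and the attributes are assumptions, not the real schema of the configurator's files.

```csharp
using System;
using System.Linq;
using System.Web;
using System.Xml.Linq;

// Load the product XML (or pull it from the cache as shown above).
XDocument doc = XDocument.Load(
    HttpContext.Current.Server.MapPath("~/App_Data/products.xml"));

// Query for products in one of the six languages, e.g. German.
var germanProducts =
    from p in doc.Descendants("Product")
    where (string)p.Attribute("lang") == "de"
    select new
    {
        Id = (string)p.Attribute("id"),
        Name = (string)p.Element("Name")
    };

foreach (var product in germanProducts)
    Console.WriteLine("{0}: {1}", product.Id, product.Name);
```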