In a system with virtual memory, when the pages of a process are swapped between RAM and the hard disk, is it true that all the pages are always put in the swap area? Or are only the pages that do not fit in RAM put in the swap area?
Which one of these two situations happens?
Swapping is the memory management technique where the entire process is stored on disk. Swapping was common in the days of 64 KB address spaces, when it did not take many disk I/O operations to store an entire process. Swapped-out processes are stored in a swap file.
Paging is the memory management technique where individual pages are stored on disk. Paged-out pages are stored in a page file.
Some systems use both swapping and paging. For example, Windows has recently reintroduced swapping.
I know next to nothing about Drupal but I do have a question. We had a site, written in straight HTML and PHP, that loaded the main page in 1-2 seconds and made 25 requests to the server to get the data it needed. A new Drupal version of the site takes 5-6 seconds to load the main page, which is no more complicated than the old page, and makes 127 requests (I'm watching Firebug NET) to the server to get the data it needs.
Is this typical?
Thanks.
Yep, a 3x performance hit is normal for Drupal, or for most large-scale PHP application frameworks. Bootstrapping Drupal is a costly operation, as it requires loading a lot of files. Drupal is also known to perform too many DB queries to produce a single page.
The first step is to enable page caching and JS/CSS aggregation. This can be done from the administration page at Administration >> Configuration >> Performance (in Drupal 7).
But a 1-2 second load time on a lightweight PHP site is a sign of either an overloaded or a badly tuned host. You should ensure your site is running on a recent PHP version (PHP is getting faster with each release). Also enable APC (or any other opcode cache); even with the default settings it can greatly improve Drupal's performance. With APC, try increasing the shared memory size (e.g. apc.shm_size = 64 in php.ini).
You should also try profiling your site to identify the actual bottlenecks. With Drupal issuing many DB queries per page, the database quickly becomes the bottleneck. Drupal supports using multiple slave servers for read queries.
About the database: Drupal uses an internal cache which, by default, is stored in the database, so this cache does not cope well with an overloaded database. Drupal's cache is pluggable, and it can be configured to use Memcached, Redis, or MongoDB for its storage. This could greatly reduce the load on the database.
Yes, Drupal is slow.
That's why we use caching mechanisms if your page is making too many requests:
See if you can aggregate your CSS and JS (this will reduce the number of HTTP calls; you can do this from the admin UI).
Use a CDN.
Use Memcache or Varnish.
Use page caching in Apache.
Note: please provide some actual data, broken down using load-testing tools.
How many requests are sent to the server also matters, but Drupal has solutions for it. Drupal can combine all CSS files into a single file to reduce server calls, and similarly for JS files.
But speed also depends on server-side code and database operations. Drupal is a powerful system that makes complex things easy (and, yes, easy things complex) and provides capabilities that let a user build a complete portal without a line of code. But all these features come at the cost of performance. Internally, Drupal performs a lot of operations, and that makes it slow.
Those operations include view and block processing, and the more complex the view / block / form is, the more operations there are and the longer it takes.
Also, as the site's content grows, the site becomes slower. Drupal treats every piece of content as a node, and for all of your content types (for example news, CMS pages, testimonials, and so on) the data is stored in a single node table (some other tables are also used, but your main content lives in the node table). So as the content grows, the load on that single table grows, which slows down database operations: the bigger the table, the longer each operation takes.
I may be wrong, but Drupal is slow :P
Firstly, I appreciate that this question could be seen as subjective, but I strongly believe that there should be, and probably is, a definitive answer to my question.
At work we are currently implementing a strategy for dynamically resizing and serving images using a generic handler and the question of caching has become something of a contentious issue.
In my original implementation the resized image is cached in memory with a cache dependency based on the original image.
e.g.
using (MemoryStream ms = new MemoryStream())
{
    imageEditor.Image.Save(ms, imageFormat);

    // Add the file to the cache.
    context.Cache.Insert(key,
        ms.ToArray(),
        new System.Web.Caching.CacheDependency(path));

    imageEditor.Dispose();

    // Set the context headers and serve.
    SetHeaders(ms.GetHashCode(), context, responseType);
    context.Response.BinaryWrite(ms.ToArray());
}
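The read side of this is roughly the following (a simplified sketch; key, SetHeaders and responseType are as in the snippet above):

// Serve straight from the in-memory cache when the resized image is already
// there; otherwise fall through to the resize-and-insert code above.
byte[] cachedImage = context.Cache[key] as byte[];
if (cachedImage != null)
{
    SetHeaders(cachedImage.GetHashCode(), context, responseType);
    context.Response.BinaryWrite(cachedImage);
    return;
}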
This has its downsides, though.
Every time the Application Pool Worker Process is recycled (every 1740 minutes by default) we'll lose anything that is in the cache.
If there are a lot of images we could be in danger of overloading the system memory and causing an out of memory exception. (Does IIS prevent this by recycling the App pool if the usage hits a certain level?)
One of my colleagues has suggested that we implement a file caching system instead, one that saves the resized file and serves that file on subsequent requests, which should (I don't know the intricacies of operating system I/O caching and memory management) reduce memory usage. Whilst this would allow us to persist the resized image across recycles, I see a few major problems with that approach:
We cannot track the original file any more, so if someone uploads a new image of the same name, our resized images will be incorrect.
The file server becomes polluted over time with thousands of images.
Reading a file is slow compared to reading from memory. Especially if you are reading multiple files.
What would be the best overall approach? Is there a standard defined somewhere by Microsoft? The sites we build are generally very busy so we'd really like to get this right and to the best possible standard.
We have a similar system on my website. Regarding your objections to the file caching system:
1,2) On my site I have a single class through which all file saving/loading passes. You could implement something similar that clears all the cached, resized images whenever a user uploads a new image. If you name the files in a predictable manner, this isn't hard to do (see the sketch after this list). If storage space is a concern for you, you could implement something to remove all cached images with a last access date that is too old.
3) This depends on how your sites work. My site has a vast number of images, so it isn't feasible to store them all in memory. If your site has fewer images, it might be the better solution.
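A rough sketch of that naming/invalidation idea (the cache folder and the GetCachedImagePath helper are hypothetical, not from the original code; it assumes using System.IO):

// Derive the cached file name from the original path and the requested size,
// and treat the cached copy as stale if the original has changed since.
string GetCachedImagePath(string originalPath, int width, int height)
{
    string cacheDir = @"D:\ImageCache"; // assumed dedicated cache folder
    string name = string.Format("{0}_{1}x{2}{3}",
        Path.GetFileNameWithoutExtension(originalPath), width, height,
        Path.GetExtension(originalPath));
    string cachedPath = Path.Combine(cacheDir, name);

    // Regenerate if the cached copy is missing or older than the source image.
    if (!File.Exists(cachedPath) ||
        File.GetLastWriteTimeUtc(cachedPath) < File.GetLastWriteTimeUtc(originalPath))
    {
        return null; // caller resizes the original and saves it to cachedPath
    }
    return cachedPath;
}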
This is not a complete answer to your question, but you shouldn't be able to cause an out-of-memory exception by putting too much stuff into the cache. If system memory starts running low, the application cache will automatically start removing unimportant and seldom-used items in order to avoid causing any memory issues.
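If you want to be explicit about what gets evicted first, the Cache.Insert overload that takes a priority and a sliding expiration can be used; a minimal sketch, with key, resizedImageBytes and path as placeholders:

// Low-priority, sliding-expiration cache entry: evicted first under memory
// pressure, and dropped automatically if unused for 30 minutes.
context.Cache.Insert(
    key,
    resizedImageBytes,
    new System.Web.Caching.CacheDependency(path),
    System.Web.Caching.Cache.NoAbsoluteExpiration,
    TimeSpan.FromMinutes(30),
    System.Web.Caching.CacheItemPriority.Low,
    null);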
You may use ASP.NET Generated Image. There's an interesting article by Scott Hanselman: ASP.NET Futures - Generating Dynamic Images with HttpHandlers gets Easier.
Recently our customers started to complain about poor performance on one of our servers.
This server hosts multiple large CMS implementations and a lot of small websites using Sitefinity.
Our hosting team is now trying to find the bottlenecks in our environments, since there are some major issues with load times. I've been given the task of specifying one big list of things to look out for, divided into different parts (IIS, ASP.NET, web-specific).
I think it would be good to find out how many instances of the Sitecore CMS we can run on one server according to the Sitecore documentation, etc. We want to be able to monitor and find out where our bottleneck is at this point. Some of our websites load terribly slowly, other websites load very fast. Most of our Sitecore implementations that run on this server have poor back-end performance, and have terrible load times after a compilation.
Our Sitecore solutions run on a 64-bit Windows Server 2008 machine with Microsoft SQL Server 2008 for the databases.
I understand that it might help to give more detailed information about our setup, but I'm hoping we can get some useful basic information on how to monitor and find bottlenecks, etc.
What tools / hints / tips & tricks do you have?
Do NOT use too many different ASP.NET pools (called a dedicated pool in Plesk). Place more sites in the same pool.
Add more memory, or stop unused programs/services on the server.
Check whether you have memory limits on the application pool that cause it to auto-restart continuously.
On the database, set the recovery model to Simple.
Shrink the database files and reindex the database from inside SQL Server.
After all that, defragment your disks.
Check the memory with Process Explorer.
To check what starts with your server, use Autoruns, but be careful not to stop any critical service or the computer may never start again. Do not stop services from Autoruns; use the Services manager to change the startup type to Manual. Also, many SQL Server services do not need to run if you have never used them.
Some other tips
Move the temporary files, and maybe the ASP.NET build directory, to a different disk.
Delete all files from the temporary directory (cd %temp%).
Make sure the free physical memory is not zero, using Process Explorer. If it's near zero, then your server needs more memory, or you need to stop unused programs from running.
To place many sites under the same pool, you need to change the permissions of the sites under the new shared pool. It's not difficult; it just takes some time and organization to know which site runs under which pool. Say you have 10 sites: it's better to use 2 different pools and spread the sites across those pools based on the load of each site.
There is no immediate answer to Sitecore performance tuning. But here are some vital tips:
1) CACHING
Caching is everything. The default Sitecore cache parameters are rarely correct for any application. If you have lots of memory, you should increase the cache sizes:
http://learnsitecore.cmsuniverse.net/en/Developers/Articles/2009/07/CachingOverview.aspx
http://sitecorebasics.wordpress.com/2011/03/05/sitecore-caching/
http://blog.wojciech.org/?p=9
Unfortunately this is something the developer should be aware of when deploying an installation, not something the system admin should care about...
2) DATABASE
The database is the last bottleneck to check. I rarely touch the database. However, the DB performance can be increased with the proper settings:
Database properties that improve performance:
http://www.theclientview.net/?p=162
This article on index fragmentation is very helpful:
http://www.theclientview.net/?p=40
I can't speak for Sitefinity, but here are some tips for Sitecore.
Use Sitecore's caching whenever possible, especially on XSLT renderings (they tend to be simpler than layouts & sublayouts, so Sitecore caching doesn't break them the way it breaks ASP.NET postbacks). Of course, this will only help if renderings & sublayouts etc. are accessed a lot. Use /sitecore/admin/stats.aspx?site=website to check what isn't cached.
Use Sitecore's profiler: open up an item in the profiler and see which sublayouts etc. are taking time.
Only use XSLTs for the simplest content; if it gets any more complicated than that, I'd go for sublayouts (ASP.NET controls). This is a bit biased, as I'm not fond of XSLT, but experience indicates that .ascx files are faster.
Use IIS's content expiration on the static files (probably all of /sitecore and, if you have them, your images, JavaScript & CSS files). This is for IIS 6: msdn link
Check database access times with Sitecore's Databasetest.aspx (the one for Sitecore 6 is a lot better than the simple one that works on Sitecore 5 & 6): Sitecore SDN link
And that's what I can think of off the top of my head.
Sitecore has a major flaw: it uses GUIDs for primary keys (amongst other poorly chosen data types). This fragments the table from the first insert, and if you have a heavily utilised Sitecore database, the fragmentation can be greater than 90% within an hour. This is not a well-designed database, and I recommend looking at other products until they fix this; it is causing us a major performance headache (time and money).
We are at a standstill: we cannot add any more RAM and cannot rebuild the indexes any more often.
Also, set IIS to recycle the app pool ONLY once a day, at a specific time. I usually set mine for 3am. This way the application never goes to sleep, gets recycled, etc. This is the best way to reduce spin-up times.
Additionally, configure IIS to 'always running' instead of 'on startup'. This way, when the application restarts, it recompiles immediately and, again, is ready to roar.
Sitefinity is really a fantastic piece of software (hopefully my tips above get the thumbs up, and not my endorsement of the product). haha
I googled forever, and I couldn't find an answer to this; the answer is either obvious (and I need more training) or it's buried deep in documentation (or not documented). Somebody must know this.
I've been arguing with somebody who insisted on caching some static files on an ASP.NET site, where I thought it's not necessary for the simple fact that all other files that produce dynamic HTML are not cached (by default; let's ignore output caching for now; let's also ignore the caching mechanism that person had in mind [in-memory or out on the network]). In other words, why cache some XML file (regardless of how frequently it's accessed) when all ASPX files are read from disk on every request that maps to them? If I'm right, by caching such static files very little would be gained (fewer disk-read operations), but more memory would be spent (if cached in memory) or more network operations would be caused (if cached on an external machine). Does somebody know what in fact happens when an ASPX file is [normally] requested? Thank you.
If I'm not mistaken, ASPX files are compiled at run time, on first access. After the page is compiled into an in-memory Page class, requests to the same resource (ASPX page) are serviced by that compiled class in memory. So, in essence, they are cached with respect to disk access.
Obviously the dynamic content is generated for every request, unless otherwise cached using output caching mechanisms.
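For completeness, output caching can be enabled declaratively with the OutputCache page directive or programmatically from inside a page or handler; a minimal programmatic sketch (the 60-second window is just an example value):

// Cache the rendered output of this response on the server for 60 seconds.
Response.Cache.SetCacheability(HttpCacheability.Server);
Response.Cache.SetExpires(DateTime.UtcNow.AddSeconds(60));
Response.Cache.SetValidUntilExpires(true);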
Regarding memory consumption vs disk access time, I have to say that from the performance standpoint it makes sense to store objects in memory rather than reading them from disk every time if they are used often. Disk access is several orders of magnitude slower than access to RAM, although inappropriate caching strategies could push frequently used objects out of memory to make room for seldom-used objects, which could hurt performance for obvious reasons. That being said, caching is really important for a high-performance website or web application.
As an update, consider this:
Typical DRAM access times are between 50 and 200 nanoseconds.
Average disk access times are in the range of 10 to 20 milliseconds.
That means that, without caching, a hit against disk will be on the order of 100,000 times slower than accessing RAM (10 ms / 100 ns = 100,000). Of course, the operating system, the hard drive and possibly other components in between may do some caching of their own, so the slow-down may only occur on the first hit if you only have a couple of such files you're reading from.
Finally, the only way to be certain is to do some benchmarking. Stress-test both implementations and choose the version that works best in your case!
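For instance, a crude Stopwatch comparison along these lines gives a ballpark figure (the file path and iteration count here are made up):

// Crude comparison of repeated file reads vs. reads from an in-memory copy.
using System;
using System.Diagnostics;
using System.IO;

class CacheBenchmark
{
    static void Main()
    {
        const string path = @"C:\temp\test.xml"; // hypothetical test file
        const int iterations = 1000;

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            byte[] data = File.ReadAllBytes(path); // goes through the file system every time
        }
        sw.Stop();
        Console.WriteLine("File reads:   {0} ms", sw.ElapsedMilliseconds);

        byte[] cached = File.ReadAllBytes(path);   // read once, keep in memory
        long total = 0;
        sw.Restart();
        for (int i = 0; i < iterations; i++)
        {
            total += cached.Length;                // touch the cached copy
        }
        sw.Stop();
        Console.WriteLine("Memory reads: {0} ms (checksum {1})", sw.ElapsedMilliseconds, total);
    }
}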
IIS does a large amount of caching, so directly, no. But IIS checks for ANY changes in the web directory and reloads any changed files as they get changed. Sometimes IIS gets borked and you have to restart it to detect changes, but usually it works pretty well.
P.S. The caching mechanisms may flush data frequently based on server usage, but the caching works for all files in the web directory. Any detected change to source code causes IIS to flush the web application and re-compile/re-load it as well.
I believe that the answer to your question depends on both the version of IIS you're using, and configuration settings.
But I believe that it's possible to configure some combinations of IIS/.NET to avoid checking the files - there's an option to pre-compile sites, so no source code actually needs to be deployed to the web server.
Assumptions: Microsoft stack (ASP.NET; SQL Server).
Some content management systems handle user-generated content (images, file attachments) by storing it in the file system. Others store these items in the back end database.
Some examples of both:
In the filesystem: Community Server, Graffiti CMS
In the database: Microsoft Sharepoint
I can see pros and cons of each approach.
In the filesystem
Lightweight
Avoids bloating the database
Backup and restore potentially simpler
In the Database
All content together in one repository (the database)
Complete separation of concerns (content vs format)
Easier deployment of web site (e.g. directly from Subversion repository)
What's the best approach, and why? What are the pros and cons of keeping user files in the database? Is there another approach?
I'm making this question Community Wiki because it is somewhat subjective.
If you are using SQL Server 2008 or higher, you can use the FILESTREAM functionality to get the best of both worlds. That is, you can access documents from the database (for queries, etc.), but still have access to the file via the file system (using SMB). More details here.
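A rough sketch of what the file-system side of that looks like from .NET code (the Documents table, FileData column and ReadDocument helper here are hypothetical, and error handling is omitted):

// Stream a document out of a FILESTREAM column via the NTFS path SQL Server exposes.
// Requires: using System; using System.Data; using System.Data.SqlClient;
//           using System.Data.SqlTypes; using System.IO;
static byte[] ReadDocument(string connectionString, int documentId)
{
    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open();
        using (var tx = conn.BeginTransaction())
        {
            string path;
            byte[] txContext;
            using (var cmd = new SqlCommand(
                "SELECT FileData.PathName(), GET_FILESTREAM_TRANSACTION_CONTEXT() " +
                "FROM Documents WHERE Id = @id", conn, tx))
            {
                cmd.Parameters.AddWithValue("@id", documentId);
                using (var reader = cmd.ExecuteReader())
                {
                    reader.Read();
                    path = reader.GetString(0);
                    txContext = (byte[])reader[1];
                }
            }

            // Open the underlying file directly through the file system,
            // while the transaction keeps the FILESTREAM data consistent.
            using (var stream = new SqlFileStream(path, txContext, FileAccess.Read))
            using (var ms = new MemoryStream())
            {
                stream.CopyTo(ms);
                tx.Commit();
                return ms.ToArray();
            }
        }
    }
}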
Erick
I picked the file system because it made editing documents in place easier; that is, when the user edits a file or document, it can be saved back to the location it was loaded from with no intervention by the program or user.
IMO, as of right now with the current functionality available in databases, the file system is the better choice.
The file system has no limit on the size of the files, and with user content these could easily be larger than 2 GB.
It makes the database size much smaller, which means less pressure on memory.
You can design your system to use UNC paths and NAS devices, or even cloud storage, whereas you cannot do this with FILESTREAM.
The biggest downside with using the file system is the potential for orphaning files and keeping the database information on files in sync with the actual files on disk. Admittedly, this is a huge issue but until solutions like FILESTREAM are more flexible, it is the price you have to pay.
Actually, it's door #3, Chuck.
I think storing images in the database is bad news unless you need to keep them private; otherwise, just put them on a CDN and store the URLs of the images instead. I've built some huge sites for ecommerce, and putting the load on a CDN like Akamai or Amazon CloudFront is a really nice way to speed up your website dramatically. I'm not a big fan of burning your bandwidth, CPU and memory on serving up images. It seems a silly waste of resources nowadays, since CDNs are so cheap. It also means deployment doesn't have to care, because your stuff is already in a globally accessible region. You can take a look at my profile to see the sites I've done and see how they are using CDNs to offload static requests. It just makes sense, and it gets even better if you can gzip it.
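As a sketch of what I mean (the CDN host name and the ProductImage class are made up), the application stores and renders only the CDN URL rather than the image bytes:

// Persist only the CDN URL for each image, not the image itself.
public class ProductImage
{
    public int ProductId { get; set; }
    public string CdnUrl { get; set; } // e.g. "https://cdn.example.com/products/123-large.jpg"
}

// Rendering is then just an <img> tag pointing at the CDN.
static string ImgTag(ProductImage image)
{
    return string.Format("<img src=\"{0}\" alt=\"\" />", image.CdnUrl);
}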