I know next to nothing about Drupal but I do have a question. We had a site, written in straight HTML and PHP, that loaded the main page in 1-2 seconds and made 25 requests to the server to get the data it needed. A new Drupal version of the site takes 5-6 seconds to load the main page, which is no more complicated than the old page, and makes 127 requests (I'm watching Firebug NET) to the server to get the data it needs.
Is this typical?
Thanks.
Yep, a 3x performance hit is normal for Drupal, or for most large-scale PHP application frameworks. Bootstrapping Drupal is a costly operation because it requires loading a lot of files. Drupal is also known to run a lot of DB queries to produce a single page.
The first step is to enable page caching and JS/CSS aggregation. This can be done from the administration page at Administration >> Configuration >> Performance (in Drupal 7).
But a 1-2 second load time on a lightweight PHP site is a sign of either overloaded or badly tuned hosting. You should make sure your site is running on a recent PHP version (PHP gets faster with each version). Also enable APC (or any other opcode cache); even with the default settings it can greatly improve Drupal's performance. With APC, try increasing the shared memory size (e.g. apc.shm_size = 64 in php.ini).
You should also try profiling your site to identify the actual bottlenecks. With Drupal making several database queries per page, the DB quickly becomes the bottleneck. Drupal supports using multiple slave servers for read queries.
About the database: Drupal uses an internal cache which by default is stored in the database, so this cache does not cope well with an overloaded database. Drupal's cache is pluggable. It can be configured to use memcache, Redis or MongoDB for its storage. This could greatly reduce the load on the database.
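To make the "pluggable cache" idea concrete, here is a minimal sketch in Python (chosen only for illustration; this is not Drupal's cache API). The names `CacheBackend`, `MemoryBackend` and `cached_query` are hypothetical. The point is that callers talk to one small interface, so moving the storage out of the database does not change the calling code.

```python
from typing import Callable, Optional, Protocol


class CacheBackend(Protocol):
    """Hypothetical storage interface; not Drupal's real cache API."""

    def get(self, key: str) -> Optional[str]: ...

    def set(self, key: str, value: str) -> None: ...


class MemoryBackend:
    """Stand-in for an external store such as memcache or Redis."""

    def __init__(self) -> None:
        self._data = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


def cached_query(cache: CacheBackend, key: str, run_query: Callable[[], str]) -> str:
    """Return the cached value, running the expensive query only on a miss."""
    value = cache.get(key)
    if value is None:
        value = run_query()
        cache.set(key, value)
    return value
```

Any object with the same `get`/`set` methods could be passed in place of `MemoryBackend`, which is roughly what swapping the cache storage amounts to.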
Yes, Drupal is slow.
That's why we use caching mechanisms. If your page is making too many requests:
See if you can aggregate your CSS and JS (this will reduce the number of HTTP calls; you can do this from the admin).
Use a CDN.
Use memcache or Varnish cache.
Use page cache in Apache.
Note: please provide some actual data, broken down with a load testing tool.
How many requests are sent to the server also matters, but Drupal has a solution for it: it can combine all CSS files into a single file to keep the number of server calls low, and similarly for JS files.
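As a rough illustration of what aggregation buys you, the Python sketch below concatenates several stylesheets into one body. It is not how Drupal implements aggregation; the function name `aggregate` and the file names are made up for the example.

```python
def aggregate(stylesheets: dict) -> str:
    """Concatenate stylesheets (name -> CSS text) into one file body, in order."""
    parts = []
    for name, css in stylesheets.items():
        # Keep a comment marker so the origin of each rule stays traceable.
        parts.append(f"/* {name} */\n{css.strip()}\n")
    return "\n".join(parts)


# Hypothetical file names, for illustration only.
sheets = {
    "reset.css": "body { margin: 0; }",
    "layout.css": ".page { width: 960px; }",
    "theme.css": "a { color: #036; }",
}

bundle = aggregate(sheets)
requests_before = len(sheets)  # one HTTP request per stylesheet
requests_after = 1             # one HTTP request for the bundle
```

The same reasoning applies to JS files: the browser fetches one file instead of many, so the request count drops even though the total bytes stay about the same.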
But speed also depends on server-side code and database operations. Drupal is a powerful system which makes complex things easy (and yes, easy things complex) and provides enough features that a user can build a complete portal without writing a line of code. But all these features come at the cost of performance. Internally Drupal performs a lot of operations, and that makes it slow.
Those operations include views and block operations, and the more complex the view / block / form is, the more operations there will be, and hence the more time it will take.
Also, as the site's content grows it will become slower. Drupal treats every piece of content as a node, and for all of your content types (for example news, CMS pages, testimonials and so on) the data is stored in a single node table (some other tables are also used, but your main content is stored in the node table). So as the content grows, the load on that single table grows, which slows down database operations: the bigger the table, the longer each operation takes.
I may be wrong, but Drupal is slow :P
Related
My company uses SilverStripe v3.1.21, along with the Subsite module, to display and administer a number of clients' websites that sell products. This results in close to 200 subsites and a page count in the tens of thousands. The websites are very slow to load, and tools such as Google's PageSpeed tell us page speeds are poor. We've already taken steps like combining and minimising the JS and compressing resources such as images, which gave some improvement, but the pages remain slow. The system was handed to us in this state; further hardware upgrades are not on the table, nor is gaining additional resources for redevelopment.
We've taken a look at the static publish module (https://github.com/silverstripe/silverstripe-staticpublisher) and found that when we generate static pages, the pages become fast and get a good score on the various tools. However, the process to regenerate all of these pages takes over 14 hours, which is unacceptable given these products are updated from an external source daily. We also find that the regeneration process is a memory hog, as the module builds all of the pages in memory before dumping them to file, causing the process to crash. We've had to alter the process to go subsite-by-subsite just to make it run.
We then took a look at the static publishing queue module (https://github.com/silverstripe/silverstripe-staticpublishqueue), which seemed to address our issues by having it queue pages as needed for regeneration, making it much more responsive to changes. However, the module seems to be very buggy and often crashes when generating pages.
Has anyone had experience using these modules (or similar) with larger sites and may be able to provide any pointers or ideas on how to implement static publishing successfully?
We are currently using staticpublishqueue on several sites. The only problem we've had with it is crashing due to long builds and poor locking. Or, to be precise, it doesn't actually crash but keeps spawning more and more instances until the server becomes unresponsive.
I think we have a fix for this in our fork. At least we haven't had any problems after using the modified locking. You could try installing the fork instead of the official version. If this fixes things for you maybe we should make a pull request :)
First off: we only use staticpublishqueue, and I don't have any experience with the Subsite module, so I can't speak for your exact combination.
We are using staticpublishqueue on a huge site. Setup: We have multiple servers running the SilverStripe Website. They share a MySQL Database and use Redis as a session store.
One great thing about staticpublishqueue: you can run it in parallel. So the servers all run an instance of staticpublishqueue and publish into a shared folder, which is then synced to an nginx load balancer in front of the actual webservers. This works quite nicely, but it does not scale indefinitely. At some point the staticpublishqueue instances start to pick the same record to render and waste resources. I think about 6 is the max for us.
A couple of things we learned regarding staticpublishqueue:
do not run too many instances at the same time (see above)
make sure it has enough RAM
make sure it runs as the same user as the website
the record lock it uses is not compatible with a MariaDB Galera Cluster
If possible switch to SilverStripe 3.6.x and PHP7. The performance gain is huge.
We are migrating away from staticpublishqueue to Cloudflare (or maybe another CDN). Why? Because if a requested page has not been rendered yet, the server renders it for each request individually and then throws the result away, until the queue does a separate render for the cache. That is a total waste of resources, especially if you purge your cache after a sitewide layout change or something similar.
So I've attached the resource usage of my WordPress website over the past 30 days.
You can see the I/O usage has been getting higher and more frequent. I think this is a problem that has caused a massive drop in visits to my site.
I asked my host why this is, and he said backups usually contribute largely to this. The only thing is, I back up once a month, not every day.
I've tried optimising my database and disabling plugins, but I don't understand why it keeps getting higher.
I have an analytics plugin that refreshes every hour, but I've had that all year and I/O usage only started getting high recently.
The only thing I can think of is WP Super Cache and CloudFlare not working well together. I've tried different configurations, but that hasn't helped.
Any help would be appreciated.
I think this is a pretty standard I/O log. Over time your database gets a lot bigger, and so does your user base, which ends up using a lot of I/O. I don't think there is any reason to panic right away, but obviously if this is a very big difference from what you normally see, then you should look into it seriously. I take caching very seriously, and I usually use W3 Total Cache for this kind of performance optimization. It's a bit tricky in the beginning, but once you are used to it, it's very easy.
I know you might just want to improve the I/O, for which you mostly just need caching, but here are some things that I would do to get the most performance out of a site.
1) If you are using a VPS or dedicated server, install memcache or something like Redis, and then configure your plugin to use it. You might have to enable it in your php.ini file, but once it is installed you will see the difference. It executes the code and saves the results in RAM; on the next request, instead of executing the PHP code again, it just hands over the same results. Whether you want to cache a given page depends on your website. You can set up caching for individual pages as well.
2) If your plugin has options to automatically minify and combine HTML/CSS/JS files, then use them. If not, you should minify and combine them into a single file, or as few files as possible, and then upload them to your server manually. This cuts a lot of the time spent requesting a file and waiting for the response. It's usually milliseconds per file, but if you have a lot of files it adds up to seconds, plus unnecessary load on the server.
3) If your plugin has a gzip feature, then enable it. It will allow your users to download gzipped CSS and JS files instead of the original large files. This greatly reduces the number of bytes a browser has to download on every request.
4) Enable caching of files in the browser. Your plugin might already do this, but if not, you will have to set some headers which tell the browser to cache the CSS and JS files. Then, when the user goes to the next page on your website, instead of fetching the CSS/JS files from the server, the browser loads them directly from its cache.
5) Upload your CSS/JS/image files to a CDN. That way, whenever someone requests a file, it takes the shortest route to reach your user's browser.
6) If your site is not just a personal blog and is making serious money, or you just want to keep a fast-growing number of users happy, then I would suggest you look into auto-scaling server platforms, where you set some triggers and the number of servers automatically increases when facing a lot of users / I/O, and automatically scales back down once the number of users returns to normal. Two of the big players for this sort of service are AWS Elastic Beanstalk and Microsoft Azure. Or you can use beanstalkd with DigitalOcean, which is a cheap alternative.
7) WordPress is quite compatible with Facebook's HHVM, an open-source virtual machine that runs PHP using just-in-time (JIT) compilation. PHP is an interpreted language whose interpreter is written in C (you can check out the code on GitHub), so whenever you refresh a page, hundreds of lines of PHP code are parsed, compiled and executed again (unless an opcode cache is in use). HHVM compiles the code and keeps it in memory, so when someone else requests the same page it already has a compiled version and just executes and serves it. By my estimate that removes 30-40% of the compile time from every request, which makes your site noticeably faster. PHP 7 also came out last month and has a lot of performance improvements, so if you are still not sure about HHVM you should definitely try upgrading to PHP 7.
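To give a feel for point 3 (gzip), here is a small Python sketch using the standard `gzip` module. The CSS is made up and deliberately repetitive, so it compresses far better than a real stylesheet would; treat the ratio as an illustration of the mechanism, not as a measurement.

```python
import gzip

# Made-up, repetitive CSS standing in for a real stylesheet.
css = (
    ".btn-primary { color: #fff; background: #036; padding: 4px 8px; }\n" * 200
).encode("utf-8")

compressed = gzip.compress(css)
ratio = len(compressed) / len(css)  # fraction of the original size sent over the wire

# The browser reverses this transparently when the response carries
# the header "Content-Encoding: gzip".
restored = gzip.decompress(compressed)

print(f"original: {len(css)} bytes, gzipped: {len(compressed)} bytes")
```

In practice the web server or the caching plugin does this for you; the sketch only shows why text assets such as CSS and JS shrink so much.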
Some pages in Drupal use more memory than others. I think it's a waste of server resources to reserve 64M or more for all pages in Drupal only because the modules page (or a section with graphics) reaches this peak and I want to avoid white pages when making changes.
So, my question is: is it good practice to manage different memory limits programmatically, depending on the section or page? Some pages use 32M or less, so I think it's better to optimize specific sections of a web app with specific limits.
I've read a lot about optimization practices, but I haven't found anything about handling memory limits dynamically, or a Drupal module that deals with this or applies this approach.
The memory limit is really just telling Drupal how much memory it's allowed to use. Drupal isn't going to "reserve" memory. It doesn't really "manage" memory at all the way you think. It'll use whatever it needs (up to the allowed amount) when it needs it, and if it needs more, you'll get an error. If it needs less than the memory limit, it'll use less.
The minimum required available memory for Drupal 7 to run is 32MB, but a recommended number would be closer to 128MB.
http://drupal.org/requirements#php
PHP memory requirements can vary significantly depending on the modules in use on your site. Drupal 6 core requires PHP's memory_limit to be at least 16MB. Drupal 7 core requires 32MB. Warning messages will be shown if the PHP configuration does not meet these requirements. However, while these values may be sufficient for a default Drupal installation, a production site with a number of commonly used modules enabled (CCK, Views etc.) could require 64 MB or more. Some installations may require much more, especially with media-rich implementations. If you are using a hosting service it is important to verify that your host can provide sufficient memory for the set of modules you are deploying or may deploy in the future. (See the Increase PHP memory limit page in the Troubleshooting FAQ for additional information on modifying the PHP memory limit.)
Drupal is very memory heavy. When warming up an instance of Drupal for the very first time, it tries to allocate memory for the views, cache, etc.
Be sure to place this inside sites/default/settings.php
ini_set('memory_limit','128M');
Recently our customers started to complain about poor performance on one of our servers.
It hosts multiple large CMS implementations and a lot of small websites using Sitefinity.
Our hosting team is now trying to find the bottlenecks in our environments, since there are some major issues with load times. I've been given the task of putting together one big list of things to look out for, divided into different parts (IIS, ASP.NET, web-specific).
I think it'd be good to find out how many instances of the Sitecore CMS we can run on one server according to the Sitecore documentation etc. We want to be able to monitor and find out where our bottleneck is at this point. Some of our websites load terribly slowly, while others load very fast. Most of our Sitecore implementations that run on this server have poor back-end performance and have terrible load times after a compilation.
Our Sitecore solutions run on a Windows 2008 64-bit server with Microsoft SQL Server 2008 for the databases.
I understand that it might be handy to give more detailed information about our setup, but I'm hoping we can get some useful basic information on how to monitor and find bottlenecks etc.
What tools / hints / tips & tricks do you have?
Do NOT use too many different ASP.NET application pools (called "dedicated pool" in Plesk). Place more sites in the same pool.
Add more memory, or stop unused programs/services on the server.
Check whether you have memory limits on the application pool that make the pool restart continuously.
On the database, set the recovery model to Simple.
Shrink the database files and reindex the database, from inside the program.
After all that, defrag your disks.
Check the memory with Process Explorer.
To check what starts with your server, use Autoruns, but be careful not to stop a critical service, or the computer may never start again. Do not stop services from Autoruns; use the service manager to change the startup type to Manual. Also, many SQL Server services do not need to run if you never use them.
Some other tips
Move the temporary files, and maybe the ASP.NET build directory, to a different disk.
Delete all files from the temporary directory ( cd %temp% ).
Make sure the free physical memory is not zero, using Process Explorer. If it's near zero, then your server needs more memory, or you need to stop unused programs from running.
To place many sites under the same pool, you need to change the permissions of the sites under the new shared pool. It's not difficult; just take some time to organize things so you know which site runs under which pool. Say you have 10 sites: it's better to use 2 different pools and spread the sites across them based on the load of each site.
There is no quick answer to Sitecore performance tuning. But here are some vital tips:
1) CACHING
Caching is everything. The default Sitecore cache parameters are rarely correct for any application. If you have lots of memory, you should increase the cache sizes:
http://learnsitecore.cmsuniverse.net/en/Developers/Articles/2009/07/CachingOverview.aspx
http://sitecorebasics.wordpress.com/2011/03/05/sitecore-caching/
http://blog.wojciech.org/?p=9
Unfortunately this is something the developer should be aware of when deploying an installation, not something the system admin should care about...
2) DATABASE
The database is the last bottleneck to check. I rarely touch the database. However, DB performance can be increased with the proper settings:
Database properties that improve performance:
http://www.theclientview.net/?p=162
This article on index fragmentation is very helpful:
http://www.theclientview.net/?p=40
Can't speak for Sitefinity, but will come with some tips for Sitecore.
Use Sitecore's caching whenever possible, especially on XSLTs (they tend to be simpler than layouts and sublayouts, so Sitecore caching doesn't break them the way it breaks ASP.NET postbacks). This will of course only help if the renderings, sublayouts etc. are accessed a lot. Use /sitecore/admin/stats.aspx?site=website to check for things that aren't cached.
Use Sitecore's profiler: open up an item in the profiler and see which sublayouts etc. are taking time.
Only use XSLTs for the simplest content; if it gets any more complicated than that, I'd go for sublayouts (ASP.NET controls). This is a bit biased as I'm not fond of XSLT, but experience indicates that .ascx controls are faster.
Use IIS's content expiration on the static files (probably all of /sitecore, plus any images, JavaScript and CSS files you have). This is for IIS 6: msdn link
Check database access times with Sitecore's Databasetest.aspx (the one for Sitecore 6 is a lot better than the simple one that works on Sitecore 5 & 6): Sitecore SDN link
And that's what I can think of from the top of my head.
Sitecore has a major flaw: it uses GUIDs for primary keys (amongst other poorly chosen data types). This fragments the table from the first insert, and if you have a heavily utilised Sitecore database the fragmentation can exceed 90% within an hour. This is not a well-designed database, and I recommend looking at other products until they fix this; it is causing us a major performance headache (time and money).
We are at a standstill: we cannot add any more RAM and cannot rebuild the indexes more often.
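For readers wondering why random GUID keys fragment an ordered index, here is a rough Python model. It does not simulate SQL Server's storage engine; it only counts how often a new key lands somewhere before the current end of a sorted list, which is a crude proxy for the mid-index inserts that lead to page splits. The function name `mid_inserts` is made up for this sketch.

```python
import bisect
import uuid


def mid_inserts(keys) -> int:
    """Count keys that land before the current end of a sorted index."""
    index = []
    count = 0
    for key in keys:
        pos = bisect.bisect_left(index, key)
        if pos < len(index):
            count += 1  # the key goes into the middle of the index
        index.insert(pos, key)
    return count


n = 1000
sequential = mid_inserts(range(n))                               # every key appends at the end
random_guids = mid_inserts(uuid.uuid4().hex for _ in range(n))   # keys land almost anywhere

print(f"sequential keys: {sequential} mid-index inserts out of {n}")
print(f"random GUIDs:    {random_guids} mid-index inserts out of {n}")
```

Sequential keys always append, while nearly every random GUID has to be squeezed in between existing entries, which is the behaviour the fragmentation complaint above is describing.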
Also, set IIS to recycle the app pool ONLY once a day at a specific time. I usually set mine for 3am. This way the application doesn't go to sleep or recycle at unpredictable times, which helps reduce spin-up times.
Additionally, configure the IIS start mode as 'AlwaysRunning' instead of 'OnDemand'. This way, when the application restarts, it recompiles immediately and is ready to roar again.
Sitefinity is really a fantastic piece of software (hopefully my tips above get the thumbs up, and not my endorsement of the product). haha
I'm working on an ASP.NET MVC project and I've come to the point where I want to start considering my caching strategy. I've tried to leave my framework as open as possible for the use in caching.
From what I heard during Scott Hanselman's podcast StackOverflow.com uses page output caching and zips that content and puts it into RAM. This sounds like this would be great for user-wide cache but for something like personalized pages you would have to cache a version for each user and that could get out of control very quickly.
So, for a caching strategy, which should be used: output caching, data caching, or a combination? My first thought is both, but as far as cache dependencies go it sounds like it could get a bit complex.
We're doing API and output caching on a large-scale (3 million visits a day) website (a news portal). The site is primarily used by anonymous users, but we do have authenticated users, and we cache a complete copy of the site just for them because of some personalized parts. I must admit that we have had absolutely no problems with memory pressure.
So, my advice would be to cache everything you can in the API cache so that rebuilding your output cache is even faster.
Of course, pay close attention to your cache ratio values in the performance counters. You should see cache hit ratios above 95%.
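For clarity, the ratio in question is simply hits divided by total lookups. A tiny Python sketch with made-up counter values (the function name `hit_ratio` is hypothetical; the real numbers come from the performance counters):

```python
def hit_ratio(hits: int, misses: int) -> float:
    """Fraction of lookups served from the cache (0.0 when there were none)."""
    total = hits + misses
    return hits / total if total else 0.0


# Made-up counter values, for illustration only.
ratio = hit_ratio(hits=9_700, misses=300)
healthy = ratio > 0.95  # the rule of thumb from the paragraph above
```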
Another thing to pay attention to is cache invalidation, which is a big issue if you have a lot of related content. For example, say you cache music content, and information about one album or song might be displayed and cached on a few hundred pages. If anything changes in that song, you have to invalidate all of these pages, which can be problematic.
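One way to keep that manageable is to record, for each cached page, which content items it displayed. The Python sketch below illustrates the bookkeeping only; it is not ASP.NET's cache dependency API, and the class `PageCache` and its methods are invented for the example.

```python
from collections import defaultdict


class PageCache:
    """Toy output cache that remembers which content items each page used."""

    def __init__(self):
        self.pages = {}                         # page URL -> rendered HTML
        self.pages_by_item = defaultdict(set)   # content id -> page URLs

    def store(self, url, html, item_ids):
        """Cache a rendered page and register the items it depends on."""
        self.pages[url] = html
        for item_id in item_ids:
            self.pages_by_item[item_id].add(url)

    def invalidate_item(self, item_id):
        """Drop every cached page that displayed `item_id`; return how many were removed."""
        removed = 0
        for url in self.pages_by_item.pop(item_id, set()):
            if self.pages.pop(url, None) is not None:
                removed += 1
        return removed
```

With this in place, a change to one song translates into a single `invalidate_item` call instead of a hunt for every page that might have shown it.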
Bottom line, caching is one of the best features of ASP.NET, it's done superbly and you can rely on it.
Be careful about over-aggressive caching. Although caching is a tool for helping performance, when used incorrectly, it can actually make performance worse.
I can't answer whether output caching or data caching would work for you better without knowing more details about your project.
I can help provide a couple examples of when to use one over another.
If you have a specific data set which you would use often in many different views, you'd be better off using data caching. You'd use this if your data fetch operation was very common and expensive relative to your data rendering. If you had multiple views which used the same data, you would save your data fetching time.
If you had a view which used a very specific data set, the rendering of the view was complicated, and the view was requested very often (for example, Stack Overflow's home page), then you would benefit a lot from output caching.
So in the end, it really depends on your needs and be careful about using caching incorrectly.
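The two cases above can be contrasted in a short Python sketch. It uses `functools.lru_cache` as a stand-in for a real cache, and all function names and data are made up; it is not ASP.NET code, just the same idea in miniature.

```python
import functools

calls = {"fetch": 0, "render": 0}  # counters so the effect of caching is visible


@functools.lru_cache(maxsize=None)
def get_products():
    """Data caching: the expensive fetch runs once and every view reuses the result."""
    calls["fetch"] += 1
    return ("apple", "pear")  # stand-in for a slow database query


def sidebar_view():
    return ", ".join(get_products())


def listing_view():
    return "\n".join(f"* {p}" for p in get_products())


@functools.lru_cache(maxsize=None)
def home_page():
    """Output caching: the finished markup itself is stored and served again."""
    calls["render"] += 1
    return "<ul>" + "".join(f"<li>{p}</li>" for p in get_products()) + "</ul>"
```

`sidebar_view` and `listing_view` render differently but share one fetch (data caching), while repeated calls to `home_page` skip both the fetch and the rendering (output caching).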