Drupal 6.15 and memcache running on a RHEL 5.4 server. The memcache miss percentage is 32%, which I think is high. What can be done to improve it?
Slightly expanded form of the comment below.
A cache hit ratio will depend on a number of factors, such as:
Cache size
Cache timeout
Cache clearing frequency
Traffic
Using memcached is most beneficial when you have a high number of hits on a small amount of content. That way the cache is built quickly and then used frequently, giving you a high hit ratio.
If you don't get much traffic, cache items will have gone stale before they are requested again, so they will need to be re-cached.
If your traffic is spread across a lot of different content, the cache can either fill up (forcing evictions) or go stale before items are requested again.
memcached is only something you really need if you are having, or anticipating, scalability issues. It is not buggy, but it adds another application layer that needs to be monitored and configured.
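Before tuning anything, it is worth checking which of those factors is actually in play. Here is a rough sketch (assuming memcached listens on the default 127.0.0.1:11211) that reads memcached's own counters over the text protocol: a high eviction count usually means the cache is too small for the working set, while many misses with few evictions points at items expiring or being flushed.
```
# Pull memcached's counters and compute the miss ratio.
# Assumes memcached on the default 127.0.0.1:11211; adjust for your box.
import socket

def memcached_stats(host="127.0.0.1", port=11211):
    """Return the STAT lines from memcached's text protocol as a dict."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(b"stats\r\n")
        data = b""
        while not data.endswith(b"END\r\n"):
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    stats = {}
    for line in data.decode().splitlines():
        parts = line.split()
        if parts and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats

stats = memcached_stats()
hits = int(stats.get("get_hits", 0))
misses = int(stats.get("get_misses", 0))

if hits + misses:
    print("miss ratio: %.1f%%" % (100.0 * misses / (hits + misses)))
print("evictions:", stats.get("evictions", "0"))
print("limit_maxbytes:", stats.get("limit_maxbytes", "?"))
```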
We have seen several occasions where a simple Screaming Frog crawl would almost take down our server (it does not actually go down, but it slows almost to a halt and the PHP processes go crazy). We run Magento ;)
We have now applied this Nginx ruleset: https://gist.github.com/denji/8359866
But I was wondering if there is a stricter or better way to kick out overly greedy crawlers and Screaming Frog crawl episodes. Say, after 2 minutes of intense requesting we should already know that someone is running some automated system making too many requests (not blocking the Google bot, of course).
Help and ideas appreciated
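For reference, even just counting requests per client IP over a recent slice of the access log shows who is hammering the site. A minimal sketch (the log path and threshold are only examples, and it assumes the client IP is the first field, as in the standard combined log format):
```
# Usage: tail -n 20000 /var/log/nginx/access.log | python3 spot_greedy.py
# Counts requests per client IP and flags the heaviest ones.
import collections
import sys

THRESHOLD = 600  # e.g. roughly 5 requests/second sustained over 2 minutes

counts = collections.Counter(
    line.split(" ", 1)[0] for line in sys.stdin if line.strip()
)

for ip, n in counts.most_common(10):
    flag = "  <-- candidate for limit_req / deny" if n > THRESHOLD else ""
    print("%8d  %s%s" % (n, ip, flag))
```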
Seeing how a simple SEO utility scan can bring your server to a crawl, you should realize that blocking spiders isn't the real solution. Suppose you have managed to block every spider in the world, or created a sophisticated ruleset to decide that this number of requests per second is legitimate and that one is not.
It is still obvious that your server can't handle even a few visitors at the same time. A few more visitors will bring it down whenever your store receives more traffic.
You should address the main performance bottleneck, which is PHP.
PHP is slow. With Magento it's slower. That's it.
Imagine: every request to your Magento store causes dozens and dozens of PHP files to be scanned and parsed. This hits the CPU hard.
If you have an unoptimized PHP-FPM configuration, it hits your RAM hard as well.
These are the things that should be done, in order of priority, to ease the strain on PHP:
Make use of Full Page Cache
Really, it's a must with Magento. You don't lose anything; you only gain performance. Common choices are:
Lesti FPC. This is the easiest to install and configure. It works most of the time, even if your theme is badly coded. The result: your server will no longer go down and will be able to serve more visitors. It can even store its cache in Redis if you have enough RAM and are willing to configure it. It will cache, and it will cache fast.
Varnish. It is the fastest caching server, but it's tricky to configure if you're running Magento 1.9, where you will need the Turpentine Magento plugin. The latter is quite picky to get working if your theme is not well coded. Magento 2 is compatible with Varnish out of the box and quite easy to configure.
Adjust PHP-FPM pool settings
Make sure that your PHP-FPM pool is configured properly. A value for pm.max_children that is too small will make for slow page requests; a value that is too high might hang your server because it will run out of RAM. As a starting point, set it to 50% of total RAM divided by 128 MB (see the quick calculation below).
Make sure to adjust pm.max_requests and set it to a sane number, e.g. 500. Leaving it at 0 (the default) often leads to "fat" PHP processes that eventually eat all of the RAM on the server.
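To make that rule of thumb concrete, a quick back-of-the-envelope calculation (the 16 GB total and 128 MB per worker are just example figures; measure your own average PHP-FPM process size):
```
# Rough pm.max_children sizing following the 50%-of-RAM rule above.
# The numbers are hypothetical; check your real per-worker memory with ps/top.
total_ram_mb = 16 * 1024       # e.g. a 16 GB box
per_worker_mb = 128            # typical Magento PHP-FPM worker

pm_max_children = int((total_ram_mb * 0.5) / per_worker_mb)
print(pm_max_children)         # -> 64 on a 16 GB server
```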
PHP 7
If you're running Magento 2, you really should be using PHP 7 since it's twice as fast as PHP 5.5 or PHP 5.6.
The same suggestions, with configs, are in my blog post Magento 1.9 performance checklist.
I'm optimizing a very popular website and since the user base is constantly growing I'm interested in what matters when it comes to scaling.
Currently I am scaling by adding more CPU power / RAM to the server. This works nicely: even though the site is quite popular, CPU usage is currently at 10%.
So, if possible, I'd keep doing that. What I am worried about is whether I could get to the point where CPU usage is low but users have problems connecting because of the number of HTTP connections. Is it better to scale horizontally, by adding more servers to the cluster?
Thanks!
Eventually, just adding more memory won't be enough. The concurrent connection limits come from TCP rather than IIS (though both factors come into play; IIS can handle about 3,000 connections without strain).
You probably won't hit the situation you describe, where CPU usage is low but the number of HTTP connections is high, unless the site is largely static; in general, the more connections are open, the higher the CPU usage.
Regardless of this, what a popular site needs is redundancy. There is nothing more annoying to users than the site being down because your sole server has gone offline for some reason. With two servers behind a load balancer you can grow the site, and even take a server offline, with far less fear of the site going down.
Previously I was using http://domainname.com.
I had a security issue, so I moved to https://domainname.com.
The panel used to load very quickly; after converting to https:// it is very slow.
Is there a known problem with http vs https?
Please give me some suggestions on this.
Thanks
Each transaction over SSL carries additional overhead for encryption; however, the real killer is latency.
For plain HTTP there are roughly two complete round trips for each request, but over SSL there are at least four, because the TLS handshake needs its own round trips before any data flows. Although bandwidth has increased massively in recent years, latency has not changed much. The only practical solution is to be closer to the server.
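If you want to see the cost yourself, you can time the TCP connect and the TLS handshake separately; a small sketch (the hostname is just an example):
```
# Time the TCP connect and the TLS handshake separately to see the extra
# round trips. The hostname is an example; point it at your own panel.
import socket, ssl, time

host = "example.com"

t0 = time.time()
raw = socket.create_connection((host, 443), timeout=10)
t1 = time.time()                       # TCP connect done

ctx = ssl.create_default_context()
tls = ctx.wrap_socket(raw, server_hostname=host)
t2 = time.time()                       # TLS handshake done

print("TCP connect:   %.0f ms" % ((t1 - t0) * 1000))
print("TLS handshake: %.0f ms" % ((t2 - t1) * 1000))
tls.close()
```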
I have a slowly evolving dynamic website served from J2EE. The response time and load capacity of the server are inadequate for client needs. Moreover, ad hoc requests can unexpectedly affect other services running on the same application server/database. I know the reasons and can't address them in the short term. I understand HTTP caching hints (expiry, ETags, etc.), and for the purpose of this question please assume that I have maxed out the opportunities to reduce load.
I am thinking of doing a brute force traversal of all URLs in the system to prime a cache and then copying the cache contents to geodispersed cache servers near the clients. I'm thinking of Squid or Apache HTTPD mod_disk_cache. I want to prime one copy and (manually) replicate the cache contents. I don't need a federation or intelligence amongst the slaves. When the data changes, invalidating the cache, I will refresh my master cache and update the slave versions, probably once a night.
Has anyone done this? Is it a good idea? Are there other technologies that I should investigate? I can program this, but I would prefer a solution built from a configuration of open source technologies.
Thanks
I've used Squid before to reduce load on dynamically-created RSS feeds, and it worked quite well. It just takes some careful configuration and tuning to get it working the way you want.
Using a primed cache server is an excellent idea (I've done the same thing using wget and Squid). However, it is probably unnecessary in this scenario.
It sounds like your data is fairly static and the problem is server load, not network bandwidth. Generally, the problem exists in one of two areas:
Database query load on your DB server.
Business logic load on your web/application server.
Here is a JSP-specific overview of caching options.
I have seen huge performance increases by simply caching query results. Even adding a cache with a duration of 60 seconds can dramatically reduce load on a database server. JSP has several options for in-memory cache.
Another area available to you is output caching. This means that the content of a page is created once, but the output is used multiple times. This reduces the CPU load of a web server dramatically.
My experience is with ASP, but the same mechanisms are available for JSP pages. Even a small amount of caching can give a 5-10x increase in maximum requests per second.
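The idea behind a short-lived query cache is the same in any language; a minimal sketch (the 60-second TTL and the run_query() stand-in are placeholders) of what such a cache does:
```
# Minimal time-bounded cache for expensive query results.
import time

_cache = {}  # key -> (expires_at, value)

def cached(key, ttl, compute):
    """Return the cached value for key, recomputing at most once per ttl seconds."""
    now = time.time()
    entry = _cache.get(key)
    if entry and entry[0] > now:
        return entry[1]
    value = compute()
    _cache[key] = (now + ttl, value)
    return value

def run_query():
    time.sleep(0.5)          # stand-in for the expensive database call
    return ["row1", "row2"]

# Every request within a 60-second window reuses the first result, so the
# database sees at most one of these queries per minute.
rows = cached("popular-products", ttl=60, compute=run_query)
```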
I would use tiered caching here; deploy Squid as a reverse proxy server in front of your app server as you suggest, but then deploy a Squid at each client site that points to your origin cache.
If geographic latency isn't a big deal, then you can probably get away with just priming the origin cache like you were planning to do and then letting the remote caches prime themselves off that one based on client requests. In other words, just deploying caches out at the clients might be all you need to do beyond priming the origin cache.
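Priming the origin cache can be as simple as walking a list of known URLs through the proxy. A rough sketch (the urls.txt file and the proxy address are assumptions; wget through the proxy does the same job):
```
# Warm the origin cache by requesting every known URL through the reverse proxy.
import urllib.request

PROXY = {"http": "http://cache.example.com:3128"}  # your Squid / mod_disk_cache front end
opener = urllib.request.build_opener(urllib.request.ProxyHandler(PROXY))

with open("urls.txt") as fh:                       # one URL per line
    for url in (line.strip() for line in fh):
        if not url:
            continue
        try:
            with opener.open(url, timeout=30) as resp:
                resp.read()                        # fetch the body so the proxy stores the full object
            print("primed", url)
        except Exception as exc:
            print("failed", url, exc)
```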
We have 22 HTTP servers, each running its own individual ASP.NET cache. They read from a read-only DB that is only updated during off-peak hours.
We use a file dependency to invalidate the cache, prompting the servers to "new up" their caches... If this is accidentally done during peak hours, it risks bringing down our DB cluster due to the sudden deluge of open connections.
Has anyone used memcached with ASP.NET in this distributed form? It seems to me that it would offer a huge advantage of having to only build up one cache (and hit the DB 21 times less), while memcached would handle distributing it on each box.
If you have, do you place it on the same box as the HTTP boxes, or do you run a separate cache tier? How well does it scale; can we expect it to need powerful servers? Our working dataset is not huge (we fit it into 4 GB of memory on each HTTP box just fine).
How do you handle invalidation?
Looking for experiences and war stories.
EDIT: Win2k3, IIS6, 64-bit servers...4 gigs per box (I believe, we may have upped it to 16 gigs when we changed to 64-bit servers).
"memcached would handle distributing it on each box"
memcached does not distribute or replicate a cache to each box in a memcached farm. The memcached client basically hashes the key and chooses a cache server based on that hash. When one of the memcached servers fails, you will lose whatever cached items existed on that server; however, the client will recognize the failure and begin writing values to a different server. This being the case, your code needs to account for missing items in the cache and reset them if necessary.
This article discusses the memcached architecture in more detail: How memcached works.
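To make the point concrete, this is roughly what a client does when deciding where a key lives (a simplified sketch; real clients use consistent/ketama hashing so that losing a server remaps only part of the keyspace):
```
# Simplified illustration of client-side key distribution: each key maps to
# exactly one server, and nothing is replicated.
import hashlib

servers = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]  # example farm

def server_for(key):
    digest = hashlib.md5(key.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

print(server_for("user:42:profile"))   # always the same server for this key
print(server_for("catalog:homepage"))  # possibly a different server

# If that server goes down, the item is simply gone: the application has to
# treat the miss as normal and rebuild the value from the database.
```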
Best practice (according to the memcached site) is to run memcached on the same box as your web server app; otherwise you're making extra network hops (which isn't all that bad, but it's not optimal). If you're running a 64-bit app server (which you probably should be if you're going to be running memcached), then you can load each server up with plenty of memory and it will be available to memcached. memcached uses very little CPU, so if your current app server isn't very taxed, it will remain that way.
Haven't used them together, but I've used them both on separate projects.
Last I saw, the documentation explicitly said that sharing a box with the web server was OK.
Memcached really only needs RAM, and if you take your ASP.NET cache out of the equation, how much RAM is your web server actually using? Probably not much. It won't compete much with your web server for CPU, and it doesn't need disk at all. You might consider segmenting its network traffic off from the incoming web requests (if you don't already).
It worked well and was fast; I didn't have any problems with it.
Oh, invalidation was explicit on the project I used it on. Not sure what other modes there are for that.
If you want replication across your memcached servers, it may be worth a look at repcached. It's a patch for memcached that handles the replication part.
Worth checking out Velocity, which is a distributed cache provided by Microsoft. I cannot give you a point-by-point comparison to memcached, but Velocity is integrated with ASP.NET and will continue to get more development and integration.