How do you set up a large scale Alfresco CIFS server? - alfresco

Alfresco provides a CIFS connector so it can act just like a normal file server in your intranet.
Compared with a "normal" (Windows/Samba) based file server, certain operations can really hurt the system, e.g. listing a folder with a few thousand files using Windows Explorer. Not quite sure, but I think permission checking is the primary reason in this case. Anyways, now assume you have a big filesystem hierarchy exposed and many users using CIFS, stressing the system, effectively "knocking it down".
What is the suggested approach to scale / improve performance?

In my experience Windows Explorer is part of the CIFS performance issue. I don't have exact numbers, but I remember working on an instance with roughly 500 GB of data, mostly small images and a few text files in a poorly balanced folder tree, where listing a folder with a thousand children took around a minute to display in Explorer. The same operation took around 3 s in the Chrome browser.
We never had time to investigate the issue thoroughly, but we saw an impressive amount of traffic generated by Explorer because it prefetches information about the subfolders of the currently open folder.

Been revisiting the issue a little, and I guess the best answer I can give for now is: Tweak the cache(s).
I used a space with 5k children and default cache values, and benchmarked "ls -alrt" on the CIFS mount, running Alfresco 4.0.d.
The first execution took roughly two minutes, bombarding the (lightning-fast) MySQL database with approx. 200k queries.
The second execution took "only" around 40 seconds, but the number of queries did not change significantly.
After increasing the CIFS fileinfo cache, I got the second run down to 30 seconds, but I still see 160k DB queries firing. I'm fairly sure the lion's share of these has to do with permissions/ACLs, and it should be possible to improve the situation a lot.
PS: Windows Explorer definitely behaves a little unexpectedly, but I cannot confirm that it makes a significant difference regarding user experience.
PPS: https://issues.alfresco.com/jira/browse/ALFCOM-2951
PPPS: I'll look into this further when I find the time - should be this year. ;)
Update: The massive number of queries is not a permission issue.
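For anyone who wants to repeat the cold/warm comparison described above, here is a minimal Python sketch. It is only an approximation of what "ls -alrt" does (list the folder and stat every child), and the function names are my own, not anything from Alfresco.

```python
import os
import time


def time_listing(path):
    """List `path` and stat every entry, roughly what `ls -l` has to do.

    Returns (entry_count, elapsed_seconds).
    """
    start = time.perf_counter()
    count = 0
    with os.scandir(path) as entries:
        for entry in entries:
            # stat() forces one metadata lookup per child
            entry.stat(follow_symlinks=False)
            count += 1
    return count, time.perf_counter() - start


def compare_cold_and_warm(path, runs=2):
    """Time several consecutive listings; the first run is the cold one."""
    return [time_listing(path) for _ in range(runs)]
```

Calling `compare_cold_and_warm("/mnt/alfresco/big-space")` (a hypothetical mount point) before and after a cache change gives comparable timings; the query count still has to be read from the database side.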

Permission checks definitely ARE part of the problem. I can't link to anything specific, but browsing the Alfresco forums and the net for the last few years I've learned that permissions can hurt performance.
I've read (and experienced) in several scenarios that Alfresco spaces with large numbers of children (1000+) can be painfully slow. One part you noticed yourself: it takes a while to go through 100-200k queries. But hook something up to Alfresco to watch what it's doing and you'll see that massive amounts of time go into serialization/deserialization (e.g. webscripts for Share) and also node traversal (hence the thousands of queries and averages of 400-500 qps when nobody is logged on).
So you're on the right track with your cache optimizations.
Do you have dedicated hardware for your installation? I've had big performance issues, but I moved the MySQL server to a separate box (server-grade hardware: 4 cores, 8 GB RAM, SSD for the MySQL server and SAS for the Tomcat server etc.) and I gained a lot. So, get on with begging for new hardware too :)
I think you're on the right path here.

Related

Amazon Web Services EC2 Instances Become Painfully Slow From Time to Time

I am using the free tier for now and have noticed that the servers on which I host a couple of WordPress sites periodically become painfully slow to respond when I access the sites through the browser. Accessing the admin area of the site becomes impossible. That said, I can still quickly SSH to the servers, if needed.
What can be causing such behavior? I have seen short bouts of this previously, but the current one has lasted for over two days. I have tried recreating the servers, rebooting them, etc. Needless to say, unless I am doing something wrong, it is making me question the decision to try AWS for these projects.
This is from the amazon site:
T2 instances accrue CPU Credits when they are idle, and use CPU credits when they are active. T2 instances are a good choice for workloads that don’t use the full CPU often or consistently, but occasionally need to burst
WordPress sites are CPU intensive and not suitable for this kind of environment. I have tried it before and learned quickly that a t2.micro is good for two things: light development or cached HTML content. Anything beyond that and you are asking for trouble.
You need to upgrade beyond the free tier if you want some results. Otherwise convert your site to a simple HTML website, or cache the heck out of your content.
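To see why a busy site ends up throttled, the credit mechanism quoted above can be sketched as a small hour-by-hour simulation. The default figures (6 credits earned per hour, a cap of 144, one credit being one vCPU at 100% for one minute) are what I recall for a t2.micro and should be treated as assumptions; check the current AWS documentation before relying on them.

```python
def simulate_credits(load_per_hour, earn_rate=6.0, max_balance=144.0,
                     start_balance=0.0):
    """Track a burstable instance's CPU credit balance hour by hour.

    load_per_hour: CPU utilisation wanted in each hour, as fractions 0.0-1.0.
    One credit is assumed to be one vCPU at 100% for one minute, so an hour
    at utilisation u costs u * 60 credits.
    Returns a list of (balance_after_hour, utilisation_actually_delivered).
    """
    balance = start_balance
    history = []
    for wanted in load_per_hour:
        available = balance + earn_rate      # credits usable this hour
        needed = wanted * 60.0
        spent = min(needed, available)
        delivered = spent / 60.0             # throttled once credits run out
        balance = min(available - spent, max_balance)
        history.append((round(balance, 2), round(delivered, 2)))
    return history
```

In this model a site that wants 50% CPU with 30 credits in hand gets its 50% for the first hour, 20% in the second, and is then stuck at the 10% baseline, which matches the "periodically painfully slow" symptom.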

Worker process taking high CPU%

All of my websites are hosted in IIS and configured with one application pool. This application pool contains 10 running websites.
It was working fine until today, but all of a sudden I am observing sharp ups and downs in CPU usage. I am unable to trace the problem.
Is there any way to check which website in the application pool is generating the most load?
Performance counters, task manager and native code analysis tools only tell part of the story. To gain a deeper understanding of what is happening inside your ASP.NET application you need to use WinDBG, SOS and ADPlus.
Tess Ferrandez has a great series of articles on tracking down what is to blame here:
.NET Debugging Demos Lab 4: High CPU hang
.NET Debugging Demos Lab 4: High CPU Hang - Review
This is a real world example:
High CPU in .NET app using a static Generic.Dictionary
You will probably want to separate your sites into individual application pools so you can identify and isolate the site that is causing the high CPU (but it already looks like you have a suspect so I'd isolate that one). From then you can follow Tess's advice and guidance to track down the cause.
You should also take a look at the logs to see if you're experiencing an unexpected spike or increase in traffic. Perhaps there's a badly behaved search engine indexer nailing the site. If that's the case then maybe you need to (if you haven't already done so) create a robots.txt to prevent crawlers from indexing parts of the site that don't need to be indexed. On top of that, if certain crawlers are being overly promiscuous then just ban them. Perhaps consider a sitemap for Google to tame and tune its activities.
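As a rough way to do the log check suggested above, here is a small Python sketch that counts requests per site or per user agent in W3C extended format logs. It assumes the fields you group by (for example `s-sitename` or `cs(User-Agent)`) are enabled in your IIS logging settings. The function name is my own, and request counts are only a hint about load, not a CPU measurement.

```python
from collections import Counter


def count_by_field(lines, field):
    """Count requests per value of `field` in W3C extended log lines.

    Column names are read from the '#Fields:' directive, so this only works
    for fields the server was configured to log.
    """
    columns = []
    counts = Counter()
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            columns = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or field not in columns:
            continue
        values = line.split()
        if len(values) != len(columns):
            continue  # skip malformed rows
        counts[values[columns.index(field)]] += 1
    return counts
```

Feeding it an open log file, e.g. `count_by_field(open(path), "cs(User-Agent)").most_common(10)`, shows quickly whether one crawler or one site dominates the traffic.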
If your server has reached its max capacity, you will see CPU go up and down erratically because the GC will start trying to recover resources (cache etc.), which in turn causes your sites to work even harder. It's an endless cycle.
Have you been monitoring your performance counters? Do you have any idea what normal capacity is for your site? If you cannot answer these questions, I suggest you gather some perf numbers as soon as possible.
My rule of thumb is to always measure first, then make necessary changes.
Most of the time performance bottlenecks aren't where you think they would be.
There is really no way to tell from performance counters, because the CPU counters are at the process level. Your best bet would be to do a time correlation with other events in the event log and the .NET/ASP.NET counters for garbage collection, requests etc.
If you really want to go hardcore, you could use the SysInternals toolset to take snapshots of your app pool over time and then do a post-analysis to figure out what code was executed when the spike happened. Here is a related example from Mark Russinovich's blog - http://blogs.technet.com/b/markrussinovich/archive/2008/04/07/3031251.aspx.

Ways to make ASP.NET build faster

When I build my web project it takes about 20 seconds to compile. Then when I try to browse to a web page in the project, ASP.NET does its runtime compilation (another 20 seconds). I know I can't escape these steps because that's how ASP.NET works; I just want to see if anyone has some kind of optimization to make these builds faster.
Trying to improve my Edit-Compile-Test loop
My machine details:
-Intel Core i7 processor @ 2.80GHz
-8GB of RAM
-HD @ 7200 RPM
Buy a faster machine? Sounds like a smart answer. I know that the compiler can take advantage of multi core machines. Also, during compilation there's a lot of Hard drive access, so it may make sense to get a solid state drive. Maybe not the answer you are looking for, but it's a definite solution.
The other thing you can do is configure your project to allow "Edit and Continue". This lets you change small things and continue debugging without doing a full recompile.
Here are a couple of thoughts:
Disable any "realtime" virus / malware protection, at least during this process.
Disable indexing (Windows, Google desktop, etc.) for the folders that VS uses during this process.
Disable / stop other processes that may be accessing the hard disk. The biggest issue here is latency - even if other applications are accessing / writing tiny files, it is the access time that kills speed.
As the original poster suggested, your biggest bang will come from hardware: get an SSD and a processor with at least 4 cores. If you were to buy 4 cheap 64GB SSDs and put them in RAID 0, you would be shocked at the difference and might even discover that your CPU and RAM suddenly become the bottlenecks.
Move your code onto a RAMDisk, or buy an SSD drive.
Suspend Resharper - R# helps tremendously when you're just coding but really slows down the Edit-Compile-Test loop.

rsync vs SyncML (Funambol)

I would like some idea of how rsync compares to SyncML/Funambol, especially when it comes to bandwidth, syncing over an unstable network, and multiple clients syncing to one server.
This is to sync several mobile devices with a directory structure of growing text files. (So we essentially want as much as possible on the server, inconsistent files are not really a problem, and we know where changes originate.)
So far, it seems Funambol doesn't compress, doesn't handle partial updates, and it is difficult to handle interruptions in a file-transfer.
I know rsync doesn't go through the server, but I don't quite see how that is a disadvantage.
Olav,
rsync can:
Compress the data (as you said), thus gaining better performance over the net.
Synchronize only the newest data within each file, thus, once again, saving time.
Be run by multiple users at the same time. It's a very basic backup software behavior.
And one of my favorites: work over a secure shell.
You might want to check Rsyncrypto, for compressing and encrypting at the same time.
Dotan
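As a sketch of how the rsync points above translate into an actual invocation, the following Python helper builds an rsync command line with compression, resumable partial transfers, and SSH as the transport. The function names and the example host are hypothetical; the options used (`--archive`, `--compress`, `--partial`, `--rsh`) are standard rsync options.

```python
import subprocess


def build_rsync_command(source, destination, compress=True, keep_partial=True):
    """Build an rsync command line for pushing a directory over SSH."""
    command = ["rsync", "--archive"]
    if compress:
        command.append("--compress")   # compress data in transit
    if keep_partial:
        command.append("--partial")    # keep partly transferred files for resuming
    command += ["--rsh=ssh", source, destination]
    return command


def sync(source, destination):
    """Run rsync and return its exit code (0 means success)."""
    return subprocess.run(build_rsync_command(source, destination)).returncode
```

On a flaky connection, `sync("notes/", "user@example.org:notes/")` can simply be retried in a loop until it returns 0, since only the missing parts are transferred on each attempt.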

Debugging ASP.Net shared pool techniques

I work for a hosting company, providing ASP.NET 3.5 hosting. Honestly, we usually provide quite good uptime and speed. However, we are having problems with one of our shared pools. As usual, we try to maximize the number of websites that can run in one pool.
Lately we have been suffering continuous hangs. The process doesn't crash, but starts to throw OutOfMemoryExceptions or stops processing requests. We think one of the applications is responsible (it would be great to know which one).
I have some memory dumps that I have processed with WinDbg. I've run, for example:
!dumpheap -stat
This command provides the global memory usage of objects. Nothing remarkable... I've also checked:
~*e!clrstack
I see various unmanaged threads. In the managed ones, stacks like this appear:
[HelperMethodFrame_1OBJ: 0f30e320]
System.Threading.WaitHandle.WaitMultiple(System.Threading.WaitHandle...
0f30e3ec 7928b3ff System.Threading.WaitHandle.WaitAny(System.Threading...
0f30e40c 7a55fc89 System.Net.TimerThread.ThreadProc()...
0f30e45c 792d6e46 System.Threading.ThreadHelper.ThreadStart_Context(System...
0f30e468 792f5781 System.Threading.ExecutionContext.runTryCode(System...
At least I haven't seen exceptions being thrown or anything similar (at that moment). I have also used two scripts written by Tess Ferrandez for calculating the number of sessions and their size. No promising results there either: nothing peculiar or remarkable (24000 bytes on average).
I would like to know what kind of strategies you usually use when facing this kind of problem. Have you ever used Microsoft Support?
Thanks a lot!
Very nice question. A badly behaved ASP.NET application can hang all the web apps sharing the same pool...
OK, let's see... If the problem is memory, get VMMap from Sysinternals, and also Process Explorer.
Run them both, and in Process Explorer find the PID of the pool that you wish to investigate; it's under inetinfo.exe and probably has the name aspnet_wp.exe.
Now in VMMap, start monitoring this pool using the PID, and voilà, you see the memory and the open images (aspx files), which are probably numerous and causing the problems... The files you are going to see are located in the ASP.NET Framework temporary folder, but you can match them up and see which site they come from.
If the problem is not memory, but a programmer has created bad loops or even thread sleeps, then I think Process Explorer is a way to investigate the pools and search for what's eating the power.
Additional
Maybe a pool recycle every 15 minutes can solve this issue?
More about
These videos contain a lot of information about VMMap and the memory manager:
Mysteries of Windows Memory Management, Part 1 and Part 2
There are many tools, but it sounds like your main goal is to determine what's causing the problem. This can be done very simply with a binary search.
Break the pool in half, and see which one crashes. Repeat until you have a crashed pool with only one application in it.
This is already O(log₂ n), but you can speed the process up arbitrarily by dividing into more than two sub-pools.
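The halving procedure above can be sketched in a few lines of Python. The `pool_fails` callback is hypothetical: in practice it stands for the manual step of running a pool with the given applications and watching whether it hangs. The sketch also assumes exactly one application is responsible.

```python
def find_culprit(apps, pool_fails):
    """Find the single misbehaving app by repeatedly halving the pool.

    apps: list of application names currently sharing one pool.
    pool_fails: callable taking a list of apps and returning True if a pool
                containing exactly those apps hangs (a manual observation
                in real life).
    Returns (culprit, number_of_rounds). Assumes exactly one culprit.
    """
    candidates = list(apps)
    rounds = 0
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        # If the first half hangs, the culprit is in it; otherwise it is in the rest.
        candidates = half if pool_fails(half) else candidates[len(half):]
        rounds += 1
    return candidates[0], rounds
```

With 16 applications this needs 4 rounds. Only one half has to be observed per round, because the other half is implied by the single-culprit assumption.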
