What is wrong with alfresco.cache.immutableEntityTransactionalCache?

I have this in my log:
2016-01-07 12:22:38,720 WARN [alfresco.cache.immutableEntityTransactionalCache] [http-apr-8080-exec-5] Transactional update cache 'org.alfresco.cache.immutableEntityTransactionalCache' is full (10000).
and I do not want to simply increase this parameter without knowing what is really going on, and without better insight into Alfresco cache best practices!
FYI:
The warning appears when I list the elements of the document library root folder in a site. Note that the site has ~300 docs/folders at that level, several of which are involved in current workflows, and I am retrieving all of them in a single call (client-side paging).
I am using an Alfresco CE 4.2.c instance with around 8k nodes.

I've seen this in my logs whenever you do a "big" transaction; by that I mean making a change to 100+ files in a batch.
Quoting Axel Faust:
The performance degradation is the reason that log message is a warning. When the transactional cache size is reached, the cache handling can no longer handle the transaction commit properly, and rather than put any stale / incorrect data into the shared cache, it will actually empty out the entire shared cache. The next transaction(s) will suffer bad performance due to cache misses...
Cache influence on Xmx depends on what the cache does, unfortunately. The property value cache should have little impact since it stores granular values, but the node property cache would have quite a different impact, as it stores the entire property map. I only have hard experience data from node cache changes, and for that we calculated an additional need of 3 GiB for an increase to four times the standard cache size.

It is very common to get these warnings.
I do not think it is a good idea to change the default settings; if possible, try to change your code instead.
As described in this link to the Alfresco forum by one of the Alfresco engineers, the values suggested by Alfresco are "sane": they are designed to work well in standard cases.
You can decide to change them, but be careful, because you can end up with worse performance than if you had done nothing.
I would suggest investigating why your use of this webscript is causing the cache overflow, and checking whether you can do something about it. The fact that you are retrieving 300 documents/folders at the same time is likely the cause.
The following article describes how to troubleshoot and solve issues with the cache:
Alfresco cache tuning
As described in that article, I would suggest increasing the log level for EhCache:
org.alfresco.repo.cache.EhCacheTracerJob=DEBUG
or selectively adding the name of the specific cache that you want to monitor.
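If, after investigating, you still decide to raise the limit, the transactional cache sizes in 4.2 are driven by properties. A minimal sketch for alfresco-global.properties, assuming the property naming used in the caches.properties file that ships with 4.2 (verify the exact name against the file in your version before relying on it):

# Per-transaction limit for the immutable entity cache (default 10000).
# The value below is illustrative only; remember the quote above: bigger
# caches translate directly into extra heap (Xmx) requirements.
cache.immutableEntitySharedCache.tx.maxItems=20000

After changing it, restart the repository and watch the tracer output to confirm that hit rates actually improve.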

Related

How to get rid of ConflictError on ZEO workers?

Looking at my ZEO workers, I see quite a lot of:
2013-10-18T11:59:54 INFO ZPublisher.Conflict ConflictError at
/VirtualHostBase/http/www.domain.com:80/Plone/VirtualHostRoot/:
database conflict error (oid 0x533cd5, class
persistent.mapping.PersistentMapping) (78 conflicts (0 unresolved)
since startup at Mon Oct 14 04:09:45 2013)
As they are logged as INFO, should I assume they are not harmful at all?
And I guess that if there are conflicts, it is because there are too many writes to the ZODB?
The conflicts are indeed caused by two requests trying to change a PersistentMapping at the same time; one of them is then forced to retry the commit.
Use these entries to pinpoint bottlenecks in your application; perhaps replace the specific mapping with a BTrees.OOBTree.OOBTree, which minimizes conflicts by spreading key-value pairs out over separate persistent buckets.
Without traffic data, and without knowing what that specific PersistentMapping holds or what your application does with it, it is impossible to say whether 78 conflicts in 4 days is a lot or a little, or whether it is worth your while switching to a different container.
Conflict errors are not -- in themselves -- harmful. The ZEO server will retry several times to resolve the error. But they are a sign of write contention in the database, and a lot of them indicates that you have a bottleneck in your current configuration. Your users will soon be complaining of poor performance.
You should probably begin analysis to determine if you've some add-on package that's doing excessive or very inefficient writes to the database. The worst case, for example, would be some code that's trying to write to the database on every page load like a traffic logger. The ZODB is optimized for reading, not writing, and those operations should be redesigned to put their data stores somewhere other than the ZODB.
If it's just content writes that are the problem, look to reduce catalog indexes and metadata. If at all possible, replace old Archetypes-style content with Dexterity content types. Dexterity is far more efficient in content creation.

Is Caching in C# the right approach for me?

I've tried to read up on Caching in ASP.NET and still have a few questions.
When using a Sql Cache Dependency ... I know that you can specify which tables will be monitored, but if a change happens to any one of those tables, does it reset the entire cache? I understand that I don't want to cache tables that will change frequently, but we could end up with a good handful of cached tables, and even if each table only gets a few updates a day, that could turn into 50ish resets of the cache daily (8-hour window).
I would be creating and maintaining this cache via a GAC DLL. A large number of different applications would be accessing that GAC at any one time. Does each application maintain its own copy of the cache or is it just stored in one global location (or possibly per app pool)?
Is there a physical location on the server where I can see how much space the Cache is currently consuming? This would be extremely pertinent if each application maintains its own Cache as that could end up taking large amounts of disk space.
Is there some way to physically force the cache to rebuild itself? I could see my boss assuming that the cache was at fault for a particular issue, and I'd need to be able to rule that out at the most fundamental level. No "changing a record and saying that SHOULD rebuild the cache", but rather "doing [Action X] and KNOWING that whatever was in the cache is now gone".
Thanks in advance for your answers and time.
SqlCacheDependency only monitors tables in the old-style SQL 2000 approach, which relies on triggers and polling. The SQL 2005+ method monitors changes at the row level, and uses Service Broker. At the level of the Cache object, changes will invalidate just the Cache entries associated with the given SqlCacheDependency (not the entire cache).
Each application has a separate copy of the Cache. If you have many apps sharing the same data, you might consider creating a separate "caching server," and have your apps get their data from there, using WCF -- basically add another tier to your app.
You can look at a couple of cache-related performance counters, but if your concern is disk space, then there's nothing to worry about, since the ASP.NET cache is stored entirely in RAM. In addition, if RAM gets too full, one feature of the cache is that it will let go of old/infrequently referenced objects to make room for new objects.
The easiest way to force the cache to be dropped is to simply recycle your application or AppPool (which happens once a day or so by default anyway). If you want something more targeted, you would need to write some code to forcibly remove certain items from the cache, either using Cache.Remove() or using linked dependencies.
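To make the last two points concrete, here is a minimal sketch of a cache entry tied to a SqlCacheDependency, plus a targeted eviction with Cache.Remove(). The connection string, table, and columns are made up for illustration; the SQL 2005+ notification path also requires a one-time SqlDependency.Start(connectionString) call at application startup:

using System.Data;
using System.Data.SqlClient;
using System.Web;
using System.Web.Caching;

public static class ProductCache
{
    private const string Key = "ProductList";
    private const string ConnStr = "..."; // hypothetical connection string

    public static DataTable GetProducts()
    {
        var products = HttpRuntime.Cache[Key] as DataTable;
        if (products == null)
        {
            using (var conn = new SqlConnection(ConnStr))
            using (var cmd = new SqlCommand(
                "SELECT ProductId, Name FROM dbo.Products", conn))
            {
                // The dependency must be created before the command runs.
                var dependency = new SqlCacheDependency(cmd);
                products = new DataTable();
                conn.Open();
                using (var adapter = new SqlDataAdapter(cmd))
                {
                    adapter.Fill(products);
                }
                // Only this entry is invalidated when the rows change,
                // not the entire cache.
                HttpRuntime.Cache.Insert(Key, products, dependency);
            }
        }
        return products;
    }

    // Targeted "force rebuild": drop this entry and let the next
    // request repopulate it.
    public static void Flush()
    {
        HttpRuntime.Cache.Remove(Key);
    }
}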
Off the top of my head:
1. Only that table's entries will be invalidated.
2. Each web application has its own cache.
3. The cache is stored in memory; regarding its size, see this question: How to determine total size of ASP.Net cache?
4. This may help: http://bit.ly/vsqNDl

Outputcache - how to determine optimal value for duration?

I read somewhere that for a high traffic site (I guess that is a murky term as well), 30 - 60 seconds is a good value. Obviously I could do a load test and vary the values, but I couldn't find any kind of documentation on this. Most samples have a minute, a couple of minutes. There's no recommended range. Is there something on msdn or anywhere that talks about this?
This all depends on whether or not the content changes frequently. For slowly changing or static content, a longer value works perfectly. However, you may need to shorten the value for constantly changing data, or risk serving stale output.
It all depends on how often a user requests your resource, and how big the resource is.
First, it is important to understand that when you cache something, that resource will remain the same until the cache duration runs out. A short cache duration will tax the web server more than a longer one, but it will provide more up-to-date data should the requested resource change.
Obviously you want to cache database queries as much as possible, prioritizing those that are called often. But all caching takes memory on the server, and as resources run low, cached items will be evicted. Take this into consideration when caching large things for long durations.
If you want data on how often users requests a resource you can use Google Analytics, which is extremely easy to set up.
For very exhaustive analytics you can use Piwik, although it requires a local server.
For frequently changing resources, don't cache at all, unless they are really resource-heavy and real-time freshness isn't vital.
Giving you an exact number or recommendation would be doing you a disservice; there are too many variables involved.
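As a concrete starting point for experimenting, here is what a short, tunable duration looks like as an ASP.NET MVC attribute (the controller and the LoadProduct helper are hypothetical stand-ins):

using System.Web.Mvc;

public class ProductController : Controller
{
    // The rendered output is cached for 60 seconds per distinct id, so the
    // expensive work below runs at most once a minute per product. Shorten
    // Duration (and load test again) if staleness becomes a problem.
    [OutputCache(Duration = 60, VaryByParam = "id")]
    public ActionResult Details(int id)
    {
        return View(LoadProduct(id));
    }

    // Stand-in for the real (expensive) lookup whose output is cached.
    private static object LoadProduct(int id)
    {
        return new { Id = id };
    }
}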

How to determine the size in bytes of the ASP.NET Cache?

I'm in active development of an ASP.NET web application that uses server-side caching, and I'm trying to understand how I can monitor the size of this cache during some scale testing. The cache stores XML documents of various sizes, some of which are multi-megabyte.
On the System.Web.Caching.Cache object of System.Web.Caching namespace I see various properties, including Count, which gets "the number of items stored in the cache" and EffectivePrivateBytesLimit, which gets "the number of bytes available for the cache." Nothing tells me the size in bytes of the cache.
In the Understanding Caching Technologies section of the "Caching Architecture Guide for .NET Framework Applications" guide, there is a "Managing the Cache Object" section with a table (Table 2.2: Application performance counters for monitoring a cache) listing a set of application performance counters, but I don't see any that give me the current size of the cache.
What is a good way to find the size of this cache? Do I have to set a byte limit on the cache and look at one of the turnover rates? Am I thinking about this problem in the wrong way? Is the answer to How to determine total size of ASP.Net cache really the best way to go?
I was about to give a less detailed account than the answer you refer to in your question, until I read it. I would refer you to that answer; it seems spot on to me. There is no better way than seeing the physical size on the server; anything else might not be accurate.
You might want to set up some monitoring, for which a PowerShell script might be handy to record the numbers and send them to yourself in a report. That way you could run various tests overnight, say, and summarize the results.
On a side note, those sound like very large documents to be putting in a memory cache. Have you considered a disk-based cache for the larger items, leaving memory for the smaller items it is better suited to? If your disks are reasonably fast, this should be fairly performant.
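If you control all the inserts yourself, one rough workaround is to keep a running byte count next to the cache. This is a sketch, not a true measurement: it counts the UTF-8 size of the XML text, while the actual CLR object graph will occupy more memory; all names are illustrative:

using System.Text;
using System.Threading;
using System.Web;
using System.Web.Caching;

public static class MeasuredCache
{
    private static long _approxBytes;

    // Approximate payload bytes currently cached via InsertXml.
    public static long ApproxBytes
    {
        get { return Interlocked.Read(ref _approxBytes); }
    }

    public static void InsertXml(string key, string xml)
    {
        int size = Encoding.UTF8.GetByteCount(xml);
        Interlocked.Add(ref _approxBytes, size);
        HttpRuntime.Cache.Insert(key, xml, null,
            Cache.NoAbsoluteExpiration, Cache.NoSlidingExpiration,
            CacheItemPriority.Normal,
            // Runs on eviction, replacement, or removal, so the running
            // total is decremented whenever the entry leaves the cache.
            (k, v, reason) => Interlocked.Add(ref _approxBytes, -size));
    }
}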

xml parsing / querying performance question for asp.net

I have to port a smaller windows forms application (product configurator) to an asp.net app which will be used on a large company's website, demand should be moderate because it's for a specialized product line.
I don't have access to a database and using XML is a requirement from their web developers.
There are roughly 30 different products with roughly 300 different possible configurations stored in the xml files, and linked questions / answers that lead to a product recommendation. Also some production options. The app is available in 6 languages.
How would you solve the 'data access' layer, if you can call it that? I thought of reading / deserializing the XML files into their objects, storing them in ASP.NET's cache if they are not there already, and then reading from the cache on subsequent requests. But that would mean all objects live in memory all day and night.
Is that even necessary, or smart, performance-wise? As I said before, the app is not that big and the XML files are not that large. Could I just create some Repository class that reads the XML files whenever an object is requested (e.g. 'Product Details', or 'Next question') and returns it that way, to keep memory consumption down?
The whole approach assumes you will stick to a single server. First consider whether that is appropriate: you mentioned a "large company's website", which raises a red flag for me. If you need the site to scale, you will end up having more than a single server, which rules out simply reading a local file.
If you are constrained to using local XML files, analyze which data is more appropriate to keep in cache (it does not change often, it is long-lived, the same info is requested many times). Try to keep the cached data separate from the non-cached data, which will reduce the amount of information in the more dynamic files. If you expect large amounts of information, consider splitting the files in a way appropriate to your domain.
I use the Cache whenever I can. I cache objects upon their first request. If memory is a concern, I set an expiration policy; and whether it is or not, the framework will unload cache items anyway when memory runs short.
Since the cache is per application and not per user, it makes sense to use it, especially if the relative footprint is small.
If you have to expand to multiple servers later, you can access the same file over the network or modify the DA layer to retrieve the data by other means (services, DB, etc.). The caching code will stay the same, and performance will be virtually unaffected.
If you set a dependency, the cached objects will always stay current.
I'm for it.
Using the cache and setting an appropriate expiration policy, as advised by others, is a sound approach. I'd suggest you look at using LINQ to XML as the basis for your data access code, as it is much easier to use than traditional methods of querying XML. You can find a decent introduction here.
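Pulling those suggestions together, here is a minimal sketch of such a repository: LINQ to XML for querying, the ASP.NET cache for storage, and a file-based CacheDependency so the entry is evicted automatically whenever the file changes. The file path and element name are made up for illustration:

using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.Caching;
using System.Xml.Linq;

public class ProductRepository
{
    private const string CacheKey = "Products";

    public IList<XElement> GetProducts()
    {
        var products = HttpRuntime.Cache[CacheKey] as IList<XElement>;
        if (products == null)
        {
            string path = HttpContext.Current.Server.MapPath("~/App_Data/products.xml");
            products = XDocument.Load(path).Root.Elements("product").ToList();
            // The file dependency drops the entry as soon as the XML file
            // changes, so cached objects always reflect the file on disk.
            HttpRuntime.Cache.Insert(CacheKey, products, new CacheDependency(path));
        }
        return products;
    }
}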
