I'm having problems with a cache cluster to empty all cache data stores.
This cluster has 89 cache stores and lasts more than 40 minutes to completely unload data.
I'm using this function:
public void deleteAll() {
try {
getCache().clear();
} catch (Exception e) {
log.error("Error unloading cache",getCache().getCacheName());
}
}
getCache Method retrieves a NamedCache of CacheFactory.
public NamedCache getCache() {
if (cache == null) {
cache = com.tangosol.net.CacheFactory.getCache(this.idCacheFisica);
}
return cache;
}
Has anyone found any other way to do this faster?
Thank you in advance,
It's strange it would take so long, though to be honest, it's unusual to call clear.
You could try destroying the cache with NamedCache.destroy or CacheFactory.destroy(NamedCache).
The only problem with this is that it invalidates any references that there might be to the cache, which would need to be re-obtained with another call the CacheFactory.getCache.
Often, a "bulk operation" call like clear() will get transmitted all the way back to the backing maps as a bulk operation, but will at some point (somewhere inside the server code) become a iteration-based implementation, so that each entry being removed can be evaluated against any registered rules, triggers, event listeners, etc., and so that backups can be captured exactly. In other words, to guarantee "correctness" in a distributed environment, it slows down the operation and forces it to occur as some ordered sequence of sub-operations.
Some of the reasons why this would happen:
You've configured read-through, write-through and/or write-behind behavior for the cache;
You have near caches and/or continuous queries (which imply listeners);
You have indexes (which need to be updated as the data disappears);
etc.
I will ask for some suggestions on how to speed this up.
Update (14 July 2014) - Yes, it turns out that this is a known issue (i.e. that it could be optimized), but that to maintain both "correctness" and backwards compatibility, the optimizations (which would be significant changes) have been deferred, and the clear() is still done in an iterative manner on the back end.
Related
I used AWS DynamoDBMapper Java class to build a repository class to support CRUD operations. In my unit test, I created an item, saved it to DB, loaded it and then deleted it. Then I did a query to DB with the primary key of deleted item, query returns an empty list, all looked correct. However, when I check the table on AWS console the deleted item is still there, and another client on a different session can still find this item. What did I do wrong? Is there any other configuration or set up required to ensure the "hard delete" happened as expected? My API looks like this:
public void deleteObject(Object obj) {
Object objToDelete = load(obj);
delete(obj);
}
public Object load(Object obj) {
return MAPPER.load(Object.class, obj.getId(),
ConsistentReads.CONSISTENT.config());
}
private void save(Object obj) {
MAPPER.save(obj, SaveBehavior.CLOBBER.config());
}
private void delete(Object obj) {
MAPPER.delete(obj);
}
Any help/hint/tip is munch appreciated
Dynamodb is eventually consistent by default. Creating -> Reading -> Deleting immediately would not always work.
Eventually Consistent Reads (Default) – the eventual consistency
option maximizes your read throughput. However, an eventually
consistent read might not reflect the results of a recently completed
write. Consistency across all copies of data is usually reached within
a second. Repeating a read after a short time should return the
updated data.
Strongly Consistent Reads — in addition to eventual consistency,
Amazon DynamoDB also gives you the flexibility and control to request
a strongly consistent read if your application, or an element of your
application, requires it. A strongly consistent read returns a result
that reflects all writes that received a successful response prior to
the read.
In a clustered intershop environment, we see a lot of error messages. I'm suspecting the communication between the application servers is not reliable.
Caused by: com.intershop.beehive.orm.capi.common.ORMException:
Could not UPDATE object: com.intershop.beehive.bts.internal.orderprocess.basket.BasketPO
Is there safe way to for the local application server, to load the latest instance.
BasketPO basket = null;
try{
BasketPOFactory factory = (BasketPOFactory) NamingMgr.getInstance().lookupFactory(BasketPOFactory.FACTORY_NAME);
try(ORMObjectCollection<BasketPO>baskets = factory.getObjectsBySQLWhere("uuid=?", new Object[]{basketID},CacheMode.NO_CACHING);){
if(null != baskets && !baskets.isEmpty()){
basket = baskets.stream().findFirst().get();
}
}
}
catch(Throwable t){
Logger.error(this, t.getMessage(),t);
}
Does the ORMObject#refresh method help ?
try{
if(null != basket)
basket.refresh();
}
catch(Throwable t){
Logger.error(this, t.getMessage(),t);
}
You experience that error because an optimistic lock "fails". To understand the problem better I'll try to explain how the optimistic locking works in particular in the Intershop ORM layer.
There is a column named OCA in the PO tables (OCA == optimistic control attribute?). Imagine that two servers (or two different threads/transactions) try to update the same row in a table. For performance reasons there is no DB locking involved by default (e.g. by issuing select for update). Instead the first thread/server increments the OCA by one when it updates the row successfully within its transaction.
The second thread/server knows the value of the OCA from the time that it created its own state. It then tries to update the row by issuing a similar query:
UPDATE ... OCA = OCA + 1 ... WHERE UUID = <uuid> AND OCA = <old_oca>
Since the OCA is already incremented by the first thread/server this update fails (in reality - updates 0 rows) and the exception that you posted above is thrown when the ORM layer detects that no rows were updated.
Your problem is not the inter-server communication but rather the fact that either:
multiple servers/threads try to update the same object;
there are direct updates in the database that bypass the ORM layer (less likely);
To solve this you may:
Avoid that situation altogether (highly recommended by me :-) );
Use the ISH locking framework (very cumbersome imHo);
Use pesimistic locking supported by the ISH ORM layer and Oracle (beware of potential performance issues, deadlocks, bugs);
Use Java locking - but since the servers run in different JVM-s this is rarely an option;
OFFTOPIC remarks: I'm not sure why you use getObjectsBySQLWhere when you know the primary key (uuid). As far as I remember ORMObjectCollection-s should be closed if not iterated completely.
UPDATE: If the cluster is not configured correctly and the multicasts can't be received from the nodes you won't be able to resolve the problems programatically.
The "ORMObject.refresh()" marks the cached shared state as invalid. Next access to the object reloads the state from the database. This impacts the performance and increase the database server load.
BUT:
The "refresh()" method does not reload the PO instance state if it already assigned to the current transaction.
Would be best to investigate and fix the server communication issues.
Other possibility is that it isn't a communication problem (multicast between node in the cluster i assume), but that there are simply two request trying to update the basket at the same time. Example two ajax request to update something on the basket.
I would avoid trying to "fix" the orm, it would only cause more harm than good. Rather investigate further and post back more information.
I have a replicated cluster composed by several nodes (up to 30) on which there is a single JAVA process accessing to the coherence cache and I use the map.invoke(key, agent) method for both creation and update of agents. The creation and the update are performed setting the value in the process method.
Example (agent is instance of a ConcreteEntryProcessor implementing EntryProcessor interface):
map.invoke(key, agent);
Which invoke the following code of agent object:
public Object process(Entry entry) {
if (entry.isPresent())
{
//UPDATE
... some stuff which compute the new entry value...
entry.setValue(newValue, true);
return newValue
}
else
{
//CREATION
..other stuff to determine the value...
entry.setValue(value, true);
return value;
}
}
I noticed that if the update is made by the node that created the agent I have good performances, otherwise I have a performance decrease if the update is made from a different node. It seems that there is a kind of ownership of data.
Is there a way to force the execution of an agent on the local node or change the ownership of data?
It all depends on cache configuration. If you use distributed (partitioned) cache, then indeed there is some kind of data ownership. In that case entry processor is invoked on a node that owns given key.
And according to your performance issues, I see two possibilities:
Performance of map.invoke(key, agent) decreases, but performance of EntryProcessor.process(entry) is stable. In that case your performance issues are probably caused by serialization and network traffic needed to send back the result of processing to the node that called map.invoke(key, agent). If you don't need this value on that node, then simply return null from your entry processor.
Performance of EntryProcessor.process(entry) decreases. In that case mayby your create/update logic need some data from the node that called map.invoke(key, agent). So it is again serialization/network traffic issue, but without knowing the details of your particular logic it is hard to find a solution to your issue.
We have a data driven ASP.NET website which has been written using the standard pattern for data caching (adapted here from MSDN):
public DataTable GetData()
{
string key = "DataTable";
object item = Cache[key] as DataTable;
if((item == null)
{
item = GetDataFromSQL();
Cache.Insert(key, item, null, DateTime.Now.AddSeconds(300), TimeSpan.Zero;
}
return (DataTable)item;
}
The trouble with this is that the call to GetDataFromSQL() is expensive and the use of the site is fairly high. So every five minutes, when the cache drops, the site becomes very 'sticky' while a lot of requests are waiting for the new data to be retrieved.
What we really want to happen is for the old data to remain current while new data is periodically reloaded in the background. (The fact that someone might therefore see data that is six minutes old isn't a big issue - the data isn't that time sensitive). This is something that I can write myself, but it would be useful to know if any alternative caching engines (I know names like Velocity, memcache) support this kind of scenario. Or am I missing some obvious trick with the standard ASP.NET data cache?
You should be able to use the CacheItemUpdateCallback delegate which is the 6th parameter which is the 4th overload for Insert using ASP.NET Cache:
Cache.Insert(key, value, dependancy, absoluteExpiration,
slidingExpiration, onUpdateCallback);
The following should work:
Cache.Insert(key, item, null, DateTime.Now.AddSeconds(300),
Cache.NoSlidingExpiration, itemUpdateCallback);
private void itemUpdateCallback(string key, CacheItemUpdateReason reason,
out object value, out CacheDependency dependency, out DateTime expiriation,
out TimeSpan slidingExpiration)
{
// do your SQL call here and store it in 'value'
expiriation = DateTime.Now.AddSeconds(300);
value = FunctionToGetYourData();
}
From MSDN:
When an object expires in the cache,
ASP.NET calls the
CacheItemUpdateCallback method with
the key for the cache item and the
reason you might want to update the
item. The remaining parameters of this
method are out parameters. You supply
the new cached item and optional
expiration and dependency values to
use when refreshing the cached item.
The update callback is not called if
the cached item is explicitly removed
by using a call to Remove().
If you want the cached item to be
removed from the cache, you must
return null in the expensiveObject
parameter. Otherwise, you return a
reference to the new cached data by
using the expensiveObject parameter.
If you do not specify expiration or
dependency values, the item will be
removed from the cache only when
memory is needed.
If the callback method throws an
exception, ASP.NET suppresses the
exception and removes the cached
value.
I haven't tested this so you might have to tinker with it a bit but it should give you the basic idea of what your trying to accomplish.
I can see that there's a potential solution to this using AppFabric (the cache formerly known as Velocity) in that it allows you to lock a cached item so it can be updated. While an item is locked, ordinary (non-locking) Get requests still work as normal and return the cache's current copy of the item.
Doing it this way would also allow you to separate out your GetDataFromSQL method to a different process, say a Windows Service, that runs every five minutes, which should alleviate your 'sticky' site.
Or...
Rather than just caching the data for five minutes at a time regardless, why not use a SqlCacheDependency object when you put the data into the cache, so that it'll only be refreshed when the data actually changes. That way you can cache the data for longer periods, so you get better performance, and you'll always be showing the up-to-date data.
(BTW, top tip for making your intention clearer when you're putting objects into the cache - the Cache has a NoSlidingExpiration (and a NoAbsoluteExpiration) constant available that's more readable than your Timespan.Zero)
First, put the date you actually need in a lean class (also known as POCO) instead of that DataTable hog.
Second, use cache and hash - so that when your time dependency expires you can spawn an async delegate to fetch new data but your old data is still safe in a separate hash table (not Dictionary - it's not safe for multi-reader single writer threading).
Depending on the kind of data and the time/budget to restructure SQL side you could potentially fetch only things that have LastWrite younger that your update window. you will need 2-step update (have to copy dats from the hash-kept opject into new object - stuff in hash is strictly read-only for any use or the hell will break loose).
Oh and SqlCacheDependency is notorious for being unreliable and can make your system break into mad updates.
what happens if an user trying to read HttpContext.Current.Cache[key] while the other one trying to remove object HttpContext.Current.Cache.Remove(key) at the same time?
Just think about hundreds of users reading from cache and trying to clean some cache objects at the same time. What happens and is it thread safe?
Is it possible to create database aware business objects in cache?
The built-in ASP.Net Cache object (http://msdn.microsoft.com/en-us/library/system.web.caching.cache.aspx) is thread-safe, so insert/remove actions in multi-threaded environments are inherently safe.
Your primary requirement for putting any object in cache is that is must be serializable. So yes, your db-aware business object can go in the cache.
If the code is unable to get the object, then nothing / null is returned.
Why would you bother to cache an object if you would have the chance of removing it so frequently? Its better to set an expiration time and reload the object if its no longer in the cache.
Can you explain "DB aware object"? Do you mean a sql cache dependency, or just an object that has information about a db connection?
EDIT:
Reponse to comment #3.
I think we are missing something here. Let me explain what I think you mean, and you can tell me if its right.
UserA checks for an object in cache
("resultA") and does not find it.
UserA runs a query. Results are
cached as "resultA" for 5 minutes.
UserB checks for an object in cache
("resultA") and does find it.
UserB uses the cached object "resultA"
If this is the case, then you dont need a Sql Cache dependency.
Well i have a code to populate cache:
string cacheKey = GetCacheKey(filter, sort);
if (HttpContext.Current.Cache[cacheKey] == null)
{
reader = base.ExecuteReader(SelectQuery);
HttpContext.Current.Cache[cacheKey] =
base.GetListByFilter(reader, filter, sort);
}
return HttpContext.Current.Cache[cacheKey] as List<CurrencyDepot>;
and when table updated cleanup code below executing:
private void CleanCache()
{
IDictionaryEnumerator enumerator =
HttpContext.Current.Cache.GetEnumerator();
while (enumerator.MoveNext())
{
if (enumerator.Key.ToString().Contains(_TableName))
{
try {
HttpContext.Current.Cache.Remove(enumerator.Key.ToString());
} catch (Exception) {}
}
}
}
Is this usage cause a trouble?