Deadlock: can the order in which resources are released cause a problem?

// down = acquire the resource
// up = release the resource
typedef int semaphore;
semaphore resource_1;
semaphore resource_2;
void process_A(void) {
    down(&resource_1);
    down(&resource_2);
    use_both_resources();
    up(&resource_2);
    up(&resource_1);
}
If the resources are released in the same order in which they were acquired, i.e.,
void process_A(void) {
    down(&resource_1);
    down(&resource_2);
    use_both_resources();
    up(&resource_1);
    up(&resource_2);
}
would that cause any potential problem?
Thanks for any explanation!

What matters is whether different threads take the locks in the same order.
The order of release has no effect: releasing a lock never blocks, so nothing stops the program from releasing the second lock after the first one has been released (unless you take new locks in between, but then you are back at the first case: taking the locks in a consistent order).
If you have two functions that try to take the same two locks in different orders, each could grab one lock and then wait forever for the other to release its lock. Example code:
down(first_lock)
down(second_lock)
running concurrently with
down(second_lock)
down(first_lock)
Both could take their first lock before either takes its second, and then they deadlock.
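To make the point about release order concrete, here is a minimal Python sketch (not the asker's C code). The names lock_1, lock_2, and run_both are made up for illustration. Both workers acquire the locks in the same order but release them in different orders, and the run still finishes.

```python
import threading

# Hypothetical stand-ins for resource_1 and resource_2.
lock_1 = threading.Lock()
lock_2 = threading.Lock()

def release_in_reverse_order(counter):
    lock_1.acquire()
    lock_2.acquire()
    counter[0] += 1          # "use both resources"
    lock_2.release()
    lock_1.release()

def release_in_acquisition_order(counter):
    lock_1.acquire()
    lock_2.acquire()
    counter[0] += 1          # "use both resources"
    lock_1.release()         # releasing never blocks, so this order is safe too
    lock_2.release()

def run_both(iterations=2000, timeout=10.0):
    """Run both workers concurrently; return (finished, total_uses)."""
    counter = [0]

    def loop(worker):
        for _ in range(iterations):
            worker(counter)

    threads = [
        threading.Thread(target=loop, args=(worker,), daemon=True)
        for worker in (release_in_reverse_order, release_in_acquisition_order)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout)
    finished = not any(t.is_alive() for t in threads)
    return finished, counter[0]
```

Swapping the two acquire calls in only one of the workers is what would make a hang possible; the release order is irrelevant.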

Related

I got a deadlock in a Service Fabric stateful service

GRPC END Contract=IPubSubPartitionManager Action=GetOrAddContextSelectorPartition ID=71eff709-3920-4139-a5c9-2b0bef6f5ba7 From=ipv4:10.0.0.56:35091 IsFault=True Duration(ms)=4932 Request { "contextSelector": "/debug/fabric--madari-madariservice-01-GrpcPublishSubscribeProber-8", "addIfNotExists": true, "clientCredential": { "credentialValue": "uswestcentral-prod.sdnpubsub.core.windows.net-client", "credentialRegex": { "pattern": "null string" }, "enablePropertyBasedAcls": false } } Response { "errorMsg": "Timed out waiting for Shared lock on key; id=49b61cd7-31ba-4c5d-b579-d068326e8a90#133028293161628388#urn:ContextSelectorMapping/dataStore#132077756302731635, timeout=4000ms, txn=133029757743026569, lockResourceNameHash=6572262935404555983; oldest txn with lock=133029757735370545 (mode Shared)\r\n" }
This problem mostly affects read operations under Repeatable Read isolation, where a read takes a Shared lock. To avoid a frequent type of deadlock, the user may request an Update lock rather than a Shared lock. An Update lock is an asymmetric lock that prevents the deadlock that occurs when several transactions lock the same resource for a potential update later.
Avoid TimeSpan.MaxValue for time-outs. A finite time-out is what allows a deadlock to be detected.
Don't create a transaction within another transaction's using statement, because it can cause deadlocks.
For example, suppose two transactions (T1 and T2) both attempt to read and then update K1. Because both end up holding the Shared lock, it is possible for them to deadlock. In this situation, one or both of the operations will time out.
Consider increasing the default time-out. It is 4 seconds by default, so try a different value.
Make sure the transactions are short-lived; if they last any longer than necessary, you'll be blocking other operations for longer than necessary.
For reference: Azure Service Fabric
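The role of a finite time-out can be sketched outside Service Fabric. The following is plain Python threading, not the Reliable Collections API, and every name in it is hypothetical. Two workers take two locks in opposite order; with a bounded wait, the mutual wait shows up as a time-out instead of a permanent hang.

```python
import threading

def opposite_order_with_timeouts(timeout_s=0.2):
    """Two workers lock in opposite order; return what each one observed."""
    lock_a = threading.Lock()
    lock_b = threading.Lock()
    both_hold_first = threading.Barrier(2)    # forces the deadlock-prone interleaving
    both_tried_second = threading.Barrier(2)  # keeps first locks held until both gave up
    outcomes = {}

    def worker(name, first, second):
        with first:
            both_hold_first.wait()
            # A bounded wait: an unbounded one would block here forever.
            got_second = second.acquire(timeout=timeout_s)
            outcomes[name] = "ok" if got_second else "timed out"
            both_tried_second.wait()
            if got_second:
                second.release()

    threads = [
        threading.Thread(target=worker, args=("T1", lock_a, lock_b)),
        threading.Thread(target=worker, args=("T2", lock_b, lock_a)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outcomes
```

In this sketch both workers time out because the barriers force the worst case. In a real system, whichever side times out would roll back and retry.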

Submit a task to an ExecutorService using a ScheduledExecutorService

I'm developing a JavaFX application that reads data from a serial device and shows a notification when a new device is connected to the computer.
I have a task, DeviceDetectorTask, which scans all the ports and creates an event when a new device is connected. This task must be submitted every 3 seconds.
When a device is detected, the user can press a button to read all the data contained in it. This is performed by another task, ReadDeviceTask. While ReadDeviceTask is running, scan operations should not be performed (I cannot read and scan one port at the same time). So only one of the two tasks can be running at a time.
My current solution is:
public class DeviceTaskQueue {
    private ExecutorService executorService = Executors.newSingleThreadExecutor();
    public void submit(Runnable task) {
        executorService.submit(task);
    }
}
public class ScanScheduler {
    private ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
    public void start() {
        AddScanTask task = new AddScanTask();
        executor.scheduleAtFixedRate(task, 0, 3, TimeUnit.SECONDS);
    }
}
public class AddScanTask implements Runnable {
    @Autowired
    DeviceTaskQueue deviceTaskQueue;
    @Override
    public void run() {
        deviceTaskQueue.submit(new DeviceDetectorTask());
    }
}
public class ViewController {
    @Autowired
    DeviceTaskQueue deviceTaskQueue;
    @FXML
    private void readDataFromDevice() {
        deviceTaskQueue.submit(new ReadDeviceTask());
    }
}
My question is: is it ok to add a task to the ExecutorService from the task AddScanTask which has been scheduled by the ScheduledExecutorService?
Yes, An Executor May Post Task To Another Executor
To answer the simple question in your last line:
is it ok to add a task to the ExecutorService from the task AddScanTask which has been scheduled by the ScheduledExecutorService?
Yes. Certainly you can submit a Callable/Runnable from any other code. That the submitting code happens to be running from another executor is irrelevant, as code run from an executor is still “normal” Java code, just running on a different thread.
That is the whole point of the executor, to handle the juggling of threads in a manner convenient to you the programmer. Making multi-threaded coding easier and less error-prone is why these classes were added to Java. See the extremely helpful book, Java Concurrency in Practice by Brian Goetz et al. And see other writings by Goetz.
In your case you have two executors each with their own thread, each executing a series of submitted tasks. One has tasks submitted automatically (timed) while the other has tasks submitted manually (arbitrarily). Each executes on their own thread independent of one another. With multiple cores they may execute simultaneously.
Therein lies the bigger problem: In your scenario you don't want them to be independent. You want the reading tasks to block the scanning tasks.
Bigger Problem
The problem you present is that a regularly occurring activity (scanning) must halt when an arbitrary event (reading) happens. That means the two activities must coordinate with one another. The question is how to coordinate.
Semaphores
When the arbitrary event is happening, it should raise a flag. The recurring activity, when it runs, should always check for that flag. If it is raised, wait until the flag lowers before proceeding with the scan. The ScheduledExecutorService tolerates this, because a task may run for longer than the scheduled period. If one execution of the task runs long, the next one starts late rather than concurrently, so executions never overlap. That is just the behavior you want.
Vice versa, if the recurring activity is executing, it should raise a flag. The arbitrary event’s first to-do item is to check for that flag. If it is raised, wait until it is lowered. Then proceed, first raising its own flag and then proceeding with the task at hand (reading).
Perhaps your scenario should be designed with a single flag rather than scanner and reader each having their own. I would have to think about it more and probably know more about your scenario.
The technical term for such flags is semaphore.
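As a rough illustration of the single-flag variant, here is a minimal Python sketch that assumes both the scanning and the reading code could be modified. The names port_permit, scan_ports, and read_device are hypothetical. One semaphore with a single permit guards the port, so a scan and a read can never overlap.

```python
import threading
import time

# Hypothetical single "flag": one permit means one activity on the port at a time.
port_permit = threading.Semaphore(1)
events = []

def scan_ports(scan_id):
    # Recurring activity: waits here while a read holds the permit.
    with port_permit:
        events.append(("scan start", scan_id))
        time.sleep(0.001)  # stand-in for the real scan
        events.append(("scan end", scan_id))

def read_device(read_id):
    # Arbitrary event: waits here while a scan holds the permit.
    with port_permit:
        events.append(("read start", read_id))
        time.sleep(0.001)  # stand-in for the real read
        events.append(("read end", read_id))
```

Whichever activity arrives second simply blocks until the permit is returned, which matches the "wait until the flag lowers" behavior described above.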
Unfortunately your comment says you cannot alter the scanner’s source code. So you cannot implement the semaphores and coordinate the activities. So I am stuck, cannot see a solution.
Hack
Given your frozen code, one hack solution, which I do not recommend, is that the regularly occurring activity (the scanning) not actually do the work but instead post a scanning task on another thread (another executor). That other executor would also be the same executor used to post the arbitrary activity (the reading). So there is one single queue of to-do items, a mix of scanning and reading jobs, submitted to a single-thread executor. The single-thread means they get done one at a time in sequence of their submission.
I do not like this hack because if any of the to-do items takes a long while you will begin to accumulate a backlog. That could be a mess.
By the way, no need for the DeviceTaskQueue in your example code. Just call the instance of the ExecutorService directly to submit a task. That is the job of an ExecutorService, and wrapping it adds no value that I can see.

Oracle Coherence: quickly emptying an entire cluster

I'm having trouble emptying all the cache stores of a cache cluster.
This cluster has 89 cache stores and takes more than 40 minutes to completely unload its data.
I'm using this function:
public void deleteAll() {
    try {
        getCache().clear();
    } catch (Exception e) {
        log.error("Error unloading cache",getCache().getCacheName());
    }
}
The getCache method retrieves a NamedCache from CacheFactory.
public NamedCache getCache() {
    if (cache == null) {
        cache = com.tangosol.net.CacheFactory.getCache(this.idCacheFisica);
    }
    return cache;
}
Has anyone found any other way to do this faster?
Thank you in advance,
It's strange it would take so long, though to be honest, it's unusual to call clear.
You could try destroying the cache with NamedCache.destroy or CacheFactory.destroyCache(NamedCache).
The only problem with this is that it invalidates any existing references to the cache, which would need to be re-obtained with another call to CacheFactory.getCache.
Often, a "bulk operation" call like clear() will get transmitted all the way back to the backing maps as a bulk operation, but will at some point (somewhere inside the server code) become an iteration-based implementation, so that each entry being removed can be evaluated against any registered rules, triggers, event listeners, etc., and so that backups can be captured exactly. In other words, to guarantee "correctness" in a distributed environment, it slows down the operation and forces it to occur as some ordered sequence of sub-operations.
Some of the reasons why this would happen:
You've configured read-through, write-through and/or write-behind behavior for the cache;
You have near caches and/or continuous queries (which imply listeners);
You have indexes (which need to be updated as the data disappears);
etc.
I will ask for some suggestions on how to speed this up.
Update (14 July 2014) - Yes, it turns out that this is a known issue (i.e. that it could be optimized), but that to maintain both "correctness" and backwards compatibility, the optimizations (which would be significant changes) have been deferred, and the clear() is still done in an iterative manner on the back end.

Sample Grabber Sink release() issue

I use the Sample Grabber Sink in my Media Session, with most of the code taken from the MSDN sample.
In the OnProcessSample method I memcpy the data into a media buffer, attach it to an MFSample, and store that sample in a pointer owned by the main thread. The problem is that I get either a memory leak or a crash in ntdll.dll:
ntdll.dll!@RtlpLowFragHeapFree@8() Unknown
SampleGrabberSink:
OnProcessSample(...)
{
    MFCreateMemoryBuffer(dwSampleSize,&tmpBuff);
    tmpBuff->Lock(&data,NULL,NULL);
    memcpy(data,pSampleBuffer,dwSampleSize);
    tmpBuff->Unlock();
    MFCreateSample(&tmpSample);
    tmpSample->AddBuffer(tmpBuff);
    while(!(*Free) && (*pSample)!=NULL)
    {
        Sleep(1);
    }
    (*Free)=false;
    (*pSample)=tmpSample;
    (*Free)=true;
    SafeRelease(&tmpBuff);
}
In the main thread:
ReadSample()
{
    if(pSample==NULL)
        return;
    while(!Free)
        Sleep(1);
    Free=false;
    //process sample into dx surface//
    SafeRelease(&pSample);
    Free=true;
}
//hr checks omitted//
With this code I get that ntdll.dll error after playing a few videos.
I also tried pushing samples into a queue so that OnProcessSample doesn't have to wait, but then some memory was not freed after the video ended.
(Even now it practically doesn't wait: the session rate is 1 and the main thread can read at more than 60 fps.)
EDIT: It was a thread synchronization problem. Solved by using a critical section, thanks to Roman R.
It is not easy to tell from the code snippet, but I suppose you are burning cycles on a streaming thread (the one your callback is called on) until a global/shared variable is NULL, and then you store a copy of the media sample there.
You need to look at the synchronization APIs and serialize access to the shared variables. You don't do that, and eventually you either access freed memory or break the reference count of a COM object.
You need an event that is set externally when you are ready to accept a new buffer from the callback. The callback then sees the event, enters a critical section (or a reader/writer lock), does your *pSample magic there, exits the critical section, and sets another event indicating that a buffer is available.
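The two-event handoff described above can be sketched in Python as follows. This is an illustration of the synchronization pattern only, not Media Foundation code. It assumes one callback thread and one reader thread, and all names are hypothetical.

```python
import threading

slot_lock = threading.Lock()          # plays the role of the critical section
ready_for_sample = threading.Event()  # set by the reader when the slot may be filled
sample_available = threading.Event()  # set by the callback when the slot holds a sample
slot = {"sample": None}
ready_for_sample.set()                # the slot starts out empty

def on_process_sample(sample_bytes):
    """Callback side: block (without spinning) until the reader is ready."""
    ready_for_sample.wait()
    with slot_lock:
        slot["sample"] = bytes(sample_bytes)  # private copy, like the memcpy
        ready_for_sample.clear()
    sample_available.set()

def read_sample():
    """Reader side: take the sample out of the slot and hand the slot back."""
    sample_available.wait()
    with slot_lock:
        sample = slot["sample"]
        slot["sample"] = None
        sample_available.clear()
    ready_for_sample.set()
    return sample
```

Each side touches the shared slot only while holding the lock, and each side sleeps on an event instead of polling with Sleep(1).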

Is there a way to acquire a lock in Redis? (Node.js)

My Node.js application accepts connections from the outside. Each connection handler reads a SET on Redis, possibly modifies the set, then moves on. The problem is that in the meantime another async connection can read the same SET and try to update it, or decide its next step based on what it reads.
I know that Redis does its best to be atomic, but this is not quite sufficient for my use case. Think about this: the set is read to find out whether it's FULL (there is a business rule for that). If it's FULL, then something happens. The problem is that if there is only one slot left, two nearly concurrent connections could each think it is the last one, and I get an overflow.
Is there a way to keep a connection "waiting" for the very short time the other one needs to update the set state?
I think this is a corner case, very unlikely... but you know :)
Is using another key as the "lock" an option, or does it stink?
How about using BLPOP for locking? BLPOP key 5 waits up to 5 seconds for an item at key. At the start, put one item at key (to mark the lock as free). The connection acquiring the lock removes the item from key. The next connection then can't acquire the lock because the list is empty, but BLPOP has the following nice property:
Multiple clients can block for the same key. They are put into a
queue, so the first to be served will be the one that started to wait
earlier, in a first-BLPOP first-served fashion.
When the connection that acquired the lock has finished its task, it should put the item back in the list; the next waiting connection can then acquire the lock (the item).
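The token idea can be tried out in a single process with Python's queue.Queue standing in for the Redis list. This is only an analogy: it is not a Redis client, the names are made up, and queue.Queue does not promise the first-come first-served ordering that BLPOP documents.

```python
import queue

# Hypothetical in-process stand-in for the Redis list that holds the lock item.
lock_token = queue.Queue(maxsize=1)
lock_token.put("token")  # at the start, one item marks the lock as free

def with_token_lock(critical_section, wait_seconds=5):
    """Run critical_section while holding the token; return False on time-out."""
    try:
        token = lock_token.get(timeout=wait_seconds)  # analogous to BLPOP key 5
    except queue.Empty:
        return False  # nobody returned the token in time
    try:
        critical_section()
    finally:
        lock_token.put(token)  # analogous to pushing the item back
    return True
```

One design point carries over to Redis: the holder must always return the item, even on failure, otherwise every later waiter times out.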
You may be looking for WATCH with MULTI/EXEC. Here's the pattern that both connections follow:
WATCH sentinel_key
GET value_of_interest
if (value_of_interest == FULL)
    MULTI
    SET sentinel_key foo
    EXEC
    if (EXEC returned a non-nil reply, i.e. succeeded)
        do_something();
    else
        do_nothing();
else
    UNWATCH
The way this works is that all of the commands between MULTI and EXEC are queued up but not actually executed until EXEC is called. When EXEC is called, before actually executing the queued instructions it checks to see if sentinel_key has changed at all since the WATCH was set; if it has, it returns (nil) and the queued commands are discarded. Otherwise the commands are executed atomically as a block, and it returns an array with one reply per queued command (a single OK in this case), letting you know you won the race and do_something() can be called.
It's conceptually similar to the fork()/exec() Unix system calls - the return value from fork() tells you which process you are (parent or child). In this case it tells you whether you won the race or not.
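To see why only one of two racing connections wins, here is a toy in-memory model of the WATCH/EXEC check in Python. It is not a Redis client and makes no claim about how Redis implements this. ToyStore, watch, and exec_set are hypothetical names that mimic only the "abort if the watched key changed" rule.

```python
import threading

class ToyStore:
    """In-memory stand-in that mimics WATCH/EXEC semantics (not real Redis)."""

    def __init__(self):
        self._data = {}
        self._versions = {}
        self._lock = threading.Lock()

    def watch(self, key):
        # Remember which version of the key this client has seen.
        with self._lock:
            return key, self._versions.get(key, 0)

    def exec_set(self, watched, key, value):
        """Apply SET only if the watched key is unchanged; None means aborted."""
        watched_key, seen_version = watched
        with self._lock:
            if self._versions.get(watched_key, 0) != seen_version:
                return None  # like EXEC returning (nil)
            self._data[key] = value
            self._versions[key] = self._versions.get(key, 0) + 1
            return ["OK"]    # one reply per queued command

def race_for_last_slot():
    """Two clients WATCH first, then both try to EXEC; return the winners."""
    store = ToyStore()
    both_watching = threading.Barrier(2)
    winners = []

    def client(client_id):
        watched = store.watch("sentinel_key")
        both_watching.wait()  # force both WATCHes to happen before any EXEC
        if store.exec_set(watched, "sentinel_key", client_id) is not None:
            winners.append(client_id)

    threads = [threading.Thread(target=client, args=(name,))
               for name in ("client-1", "client-2")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return winners
```

The first EXEC bumps the key's version, so the second client's check fails and it takes the do_nothing() branch.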

Resources