Hierarchical CLH lock behaviour - multiprocessor

Could anyone explain how an HCLH lock handles the new nodes that are created in the local cluster after the cluster master has merged the local queue into the global queue?

Once the local queue is merged onto the global queue, the cluster master sets the tailWhenSpliced field to true on the node that was the local tail at the moment of the splice. The next node added to the local queue checks its predecessor's tailWhenSpliced flag, sees that it is set, and thereby knows that it has become the new cluster master for the local queue it now heads. I have cut the long answer short.
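That check is usually a small spin loop on the predecessor's state. Here is a simplified sketch of the typical logic; the field and method names follow the usual textbook presentation of the HCLH lock and are assumptions, so your implementation may differ:

final class HclhSketch {
    interface QNode {
        int getClusterID();            // cluster of the thread that enqueued this node
        boolean isTailWhenSpliced();   // set by the splicing master on the old local tail
        boolean isSuccessorMustWait(); // cleared when the lock owner releases the lock
    }

    // Returns true if the predecessor granted the lock within my cluster;
    // returns false if the predecessor was the spliced-away local tail (or
    // belongs to another cluster), meaning I am the new cluster master and
    // must splice the current local queue into the global queue myself.
    static boolean waitForGrantOrClusterMaster(QNode pred, int myCluster) {
        while (true) {
            if (pred.getClusterID() == myCluster
                    && !pred.isTailWhenSpliced()
                    && !pred.isSuccessorMustWait()) {
                return true;   // normal CLH hand-off inside my cluster
            }
            if (pred.getClusterID() != myCluster || pred.isTailWhenSpliced()) {
                return false;  // I have become the new cluster master
            }
            // otherwise keep spinning on the predecessor's state
        }
    }
}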

Related

Raft.Next - persistent cluster configuration fails when running multiple processes

I'm currently investigating Raft in dotNext and would like to move from the fairly simplistic example which registers all the nodes in the cluster at startup to using an announcer to notify the leader when a new node has joined.
To my understanding this means that I should start the initial node in ColdStart mode, and subsequent nodes should then use the ClusterMemberAnnouncer to add themselves to the cluster:
services.AddTransient<ClusterMemberAnnouncer<UriEndPoint>>(serviceProvider => async (memberId, address, cancellationToken) =>
{
    // Register the node with the configuration storage
    var configurationStorage = serviceProvider.GetService<IClusterConfigurationStorage<UriEndPoint>>();
    if (configurationStorage == null)
        throw new Exception("Unable to resolve the IClusterConfigurationStorage when adding the new node member");
    await configurationStorage.AddMemberAsync(memberId, address, cancellationToken);
});
It makes sense to me that the nodes should use a shared/persisted configuration storage so that when the second node tries to start up and announce itself, it is able to see the first cold-started active node in the cluster. However, if I use the documented services.UsePersistentConfigurationStorage("configurationStorage") approach and then run the nodes in separate console windows, i.e. separate processes, the second node understandably says:
The process cannot access the file 'C:\Projects\RaftTest\configurationStorage\active.list' because it is being used by another process.
Has anyone perhaps got an example of using an announcer in Raft dotNext?
And does anyone know the best way (hopefully with an example) to use persistent cluster configuration storage so that separate processes (potentially running in different Docker containers) are able to access the active list?

What is job_flow_overrides in Airflow's EmrCreateJobFlowOperator? When should it be used?

I have seen people setting job_flow_overrides when creating an EMR cluster. I have already set up all the master and slave node settings required in the emr_default connection's extra field; do I still need it? In what cases would I use this parameter?
job_flow_overrides is used to create a new EMR cluster every time the DAG is run. The dictionary you pass contains boto3-style run_job_flow arguments that override the job flow configuration stored in the emr_default connection's extra field, so if the connection already holds the full cluster definition you only need job_flow_overrides for the values you want to change per run.
https://airflow.apache.org/docs/apache-airflow-providers-amazon/stable/operators/emr.html

How can the data of a Corda node be deleted without restarting the node?

When running Corda nodes for testing or demo purposes, I often find a need to delete all the node's data and start it again.
I know I can do this by:
Shutting down the node process
Deleting the node's persistence.mv.db file and artemis folder
Starting the node again
However, I would like to know if it is possible to delete the node's data without restarting the node, as this would be much faster.
It is not currently possible to delete the node's data without restarting the node.
If you are "resetting" the nodes for testing purposes, you should make sure that you are using the Corda testing APIs to allow your contracts and flows to be tested without actually starting a node. See the testing API docs here: https://docs.corda.net/api-testing.html.
One alternative to restarting the nodes would be to run the demo environment in a VMware Workstation VM, take a snapshot of the VM while the nodes are still "clean", run the demo, and then reload the snapshot.
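For the testing-API route mentioned above, here is a minimal sketch of a JUnit test using Corda's MockNetwork, so every test starts from a clean in-memory state instead of a real node. The cordapp package name com.example.contracts is a placeholder, and the classes shown are from the Corda 4-style net.corda.testing.node package, so adjust for your Corda version:

import net.corda.testing.node.MockNetwork;
import net.corda.testing.node.MockNetworkParameters;
import net.corda.testing.node.StartedMockNode;
import net.corda.testing.node.TestCordapp;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

import java.util.Collections;

public class FlowTests {
    private MockNetwork network;
    private StartedMockNode nodeA;

    @Before
    public void setup() {
        // A fresh in-memory network per test: nothing is written to
        // persistence.mv.db or the artemis folder, so there is nothing to reset.
        network = new MockNetwork(new MockNetworkParameters()
                .withCordappsForAllNodes(Collections.singletonList(
                        TestCordapp.findCordapp("com.example.contracts"))));
        nodeA = network.createNode();
        network.runNetwork();
    }

    @After
    public void tearDown() {
        network.stopNodes();
    }

    @Test
    public void flowRunsAgainstCleanState() {
        // start flows against nodeA here; all state is discarded after the test
    }
}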

Gridgain node discovery and gridcache

I start the GridGain node using G.start(gridConfiguration), and the node automatically joins the existing nodes. After this I start loading the GridCache (which is configured to be LOCAL).
This works fine, but is there a way to access the GridCache without calling G.start(gridConfiguration)? I would like to load the LOCAL cache first and only have the node detected by the other nodes once the cache has been loaded successfully.
You need to have GridGain started in order to use its APIs. After the grid is started, you can access it using the GridGain.grid().cache(...) method.
What you can do, for example, is use a distributed count-down latch (GridCacheCountDownLatch), which works exactly like the java.util.concurrent.CountDownLatch class. You can have the other nodes wait on the latch while your local cache is loading. Once loading is done, you call latch.countDown() and the other nodes will be able to proceed.
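A rough sketch of that pattern follows, assuming a GridGain 6.x-style API. The cache and latch names, the Spring config path, and the cache.dataStructures().countDownLatch(...) factory call are assumptions, so check the javadoc for your version:

import org.gridgain.grid.Grid;
import org.gridgain.grid.GridGain;
import org.gridgain.grid.cache.datastructures.GridCacheCountDownLatch;

public class CacheLoadingNode {
    public static void main(String[] args) throws Exception {
        // The grid has to be started before any cache (even a LOCAL one) can be used.
        Grid grid = GridGain.start("config/my-grid.xml");

        // Create or look up a latch shared by all nodes (name, count, autoDelete, create).
        // The backing cache should be replicated or partitioned, not LOCAL.
        GridCacheCountDownLatch loaded = grid.cache("replicatedCache")
            .dataStructures()
            .countDownLatch("localCacheLoaded", 1, false, true);

        // ... populate the LOCAL cache here ...

        // Signal the nodes blocked on loaded.await() that they may proceed.
        loaded.countDown();
    }
}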
More information on the count-down latch, as well as on other concurrent data structures in GridGain, can be found in the documentation.

How to lock node for deletion process

Within Alfresco, I want to delete a node, but I don't want it to be used by any other users in a cluster environment while I do so.
I know that I can use LockService to lock a node (in a cluster environment), as in the following lines:
lockService.lock(deleteNode);
nodeService.deleteNode(deleteNode);
lockService.unlock(deleteNode);
The last line may cause an exception because the node has already been deleted, and indeed it does; the exception is:
A system error happened during the operation: Node does not exist: workspace://SpacesStore/cb6473ed-1f0c-4fa3-bfdf-8f0bc86f3a12
So how can I handle concurrency in a cluster environment when deleting a node, so that two users cannot act on the same node at the same time, where one wants to update it and the other wants to delete it?
Depending on your cluster environment (e.g. the same DB server used by all Alfresco instances), transactions will most likely be enough to ensure no stale content is used:
serverA(readNode)
serverB(deleteNode)
serverA(updateNode) <--- transaction failure
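For the transaction route, here is a minimal sketch using Alfresco's RetryingTransactionHelper; the service fields are assumed to be injected (e.g. via Spring):

import org.alfresco.repo.transaction.RetryingTransactionHelper.RetryingTransactionCallback;
import org.alfresco.service.cmr.repository.NodeRef;
import org.alfresco.service.cmr.repository.NodeService;
import org.alfresco.service.transaction.TransactionService;

public class TransactionalNodeDeleter {
    private TransactionService transactionService; // injected
    private NodeService nodeService;               // injected

    public void delete(final NodeRef nodeRef) {
        RetryingTransactionCallback<Void> work = () -> {
            // Runs in its own transaction; a delete committed on another server
            // is visible here, so we simply skip the node if it is already gone.
            if (nodeService.exists(nodeRef)) {
                nodeService.deleteNode(nodeRef);
            }
            return null;
        };
        // readOnly = false, requiresNew = true
        transactionService.getRetryingTransactionHelper().doInTransaction(work, false, true);
    }
}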
The JobLockService allows more control for more complex operations, which might involve multiple dynamic nodes (or no nodes at all, e.g. sending emails or similar):
serverA(acquireLock)
serverB(acquireLock) <--- wait for the lock to be released
serverA(readNode1)
serverA(if something then updateNode2)
serverA(updateNode1)
serverA(releaseLock)
serverB(readNode2)
serverB(releaseLock)
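And here is a minimal sketch of the JobLockService route around the delete itself; the lock QName and timeout are illustrative values, and note that no unlock call on the node is needed once it has been deleted:

import org.alfresco.repo.lock.JobLockService;
import org.alfresco.service.cmr.repository.NodeRef;
import org.alfresco.service.cmr.repository.NodeService;
import org.alfresco.service.namespace.NamespaceService;
import org.alfresco.service.namespace.QName;

public class LockedNodeDeleter {
    // Only one server at a time can hold a job lock with this QName; getLock
    // throws if it is already taken (an overload with retryWait/retryCount
    // lets other servers wait and retry instead).
    private static final QName DELETE_LOCK =
            QName.createQName(NamespaceService.SYSTEM_MODEL_1_0_URI, "nodeDeletion");
    private static final long LOCK_TTL_MS = 30_000L;

    private JobLockService jobLockService; // injected
    private NodeService nodeService;       // injected

    public void deleteSafely(NodeRef nodeRef) {
        String lockToken = jobLockService.getLock(DELETE_LOCK, LOCK_TTL_MS);
        try {
            if (nodeService.exists(nodeRef)) {
                nodeService.deleteNode(nodeRef);
            }
        } finally {
            // Release the job lock; there is no lockService.unlock(node) here,
            // because the node no longer exists after deleteNode.
            jobLockService.releaseLock(lockToken, DELETE_LOCK);
        }
    }
}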

Resources