I am using the Akka framework for concurrency with Java and have the following use case.
The actors operate on a graph data structure, working on the nodes in a specific order and not proceeding to the next node until the current one has been processed.
Here is the relevant code
GraphProcessor:
if (msg instanceof processGraph) {
    // For each level in the graph, create a child actor per node;
    // nodes on the same level get their child actors created simultaneously.
    BreadthFirstIterator bfs = new BreadthFirstIterator<>(graph);
    while (bfs.hasNext()) {
        Object node = bfs.next();
        ActorRef nodeProcessor = getContext().actorOf(NodeProcessor.props());
        // send the node to the child actor for processing
        nodeProcessor.tell(node, getSelf());
    }
}
Now the question is: while the nodes of one graph are being processed, a new graph may be handed over to the GraphProcessor, which will confuse the actor's tracking of the graph's state. How do I correctly maintain this information?
I use the Gremlin API in Java.
Assume we have a traversal to persons and another traversal to locations that is quite long and dependent on the first:
GraphTraversal<?, Vertex> persons = g.V().has("prop", "value");
GraphTraversal<?, Vertex> locations = persons.out("place").has(..)..;
Now I want to link each person to the locations that correspond to that person with a direct edge, considering that some of these edges are already in place.
Which strategy would be a good way to create such links using the Gremlin API in Java?
I couldn't find an easy way to link two streams of vertices in a many-to-many relationship. But getting the set of vertex ids and creating the edges in a loop, as one usually does for one-to-many, works for me:
// assumes the usual static imports of __.* (inE, outV, addE) and P from TinkerPop
Set<Object> personVertexIds = persons.id().toSet();
personVertexIds.forEach(id -> {
    // start a fresh traversal per person and label it "p"
    GraphTraversal<Vertex, Vertex> person = g.V(id).as("p");
    GraphTraversal<?, Vertex> locations = person.out("place").has(..)..;
    // create the "link" edge only if it does not already exist
    locations.coalesce(inE("link").where(outV().where(P.eq("p"))),
                       addE("link").from("p"))
             .property("prop", value)
             .iterate(); // nothing happens until the traversal is iterated
});
I have set up a JanusGraph cluster with Cassandra + ES. The cluster has been set up to support ConfiguredGraphFactory, and I am connecting to the Gremlin cluster remotely. I have set up a client and am able to create a graph using:
client.submit(String.format("ConfiguredGraphFactory.create(\"%s\")", graphName));
However, I am not able to get the traversal source of the created graph through the Gremlin driver. Do I have to write raw Gremlin queries and traverse the graph using client.submit, or is there a way to get it through the Gremlin driver using EmptyGraph.instance()?
To get a remote traversal reference, you need to pass in a variable name that is bound to your graph traversal. This binding is usually done as part of the "globals" in your startup script when you start the remote server (the startup script is configured to run as part of gremlin-server.yaml).
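For reference, the client side of that lookup would look roughly like this in Java (just a sketch: it assumes the server binds a traversal source under the name "g" in its globals and listens on the default port 8182):
import org.apache.tinkerpop.gremlin.driver.Cluster;
import org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteConnection;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.structure.util.empty.EmptyGraph;

Cluster cluster = Cluster.build("localhost").port(8182).create();
// "g" must match a variable bound in the server's globals
GraphTraversalSource g = EmptyGraph.instance().traversal()
        .withRemote(DriverRemoteConnection.using(cluster, "g"));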
There is currently no inherent way to dynamically bind a variable to a graph or traversal reference, but I plan on fixing this at some point.
A short-term fix is to bind your graph and traversal references to a method whose result can be varied, and then create some mechanism to change the underlying variable dynamically.
To further explain a potential solution:
Update your server's startup script to bind g to something variable:
globals << [g : DynamicBindingTool.getBoundGraphTraversal()]
Create DynamicBindingTool, which has to do two things:
A. Provide a way to setBoundGraph() which may look something like:
def setBoundGraph(graphName) {
    this.boundGraph = ConfiguredGraphFactory.open(graphName);
}
B. Provide a way to getBoundGraphTraversal() which may look something like:
def getBoundGraphTraversal() {
    return this.boundGraph.traversal();
}
You can include these sorts of functions in your start-up script or perhaps even create a separate jar that you attach to your Gremlin Server.
Finally, I would like to note that the proposed example solution does not take into account a multi-node JanusGraph cluster, i.e. your notion of the currently bound graph would not be shared across the JG nodes. To make this a multi-node solution, you can update the functions to store the bound graph in an external database, or even piggyback on a JanusGraph graph itself.
For example, something like this would be a multi-node safe implementation:
def setBoundGraph(graphName) {
    def managementGraph = ConfiguredGraphFactory.open("managementGraph");
    // clear any previous binding, then record the new one
    managementGraph.traversal().V().has("boundGraph", true).drop().iterate();
    def v = managementGraph.addVertex();
    v.property("boundGraph", true);
    v.property("graph.graphname", graphName);
    managementGraph.tx().commit(); // make the change visible to other transactions/nodes
}
and:
def getBoundGraphTraversal() {
    def managementGraph = ConfiguredGraphFactory.open("managementGraph");
    // look up which graph is currently marked as bound
    def graphName = managementGraph.traversal().V().has("boundGraph", true).values("graph.graphname").next();
    return ConfiguredGraphFactory.open(graphName).traversal();
}
EDIT:
Unfortunately, the above "short-term trick" will not work, as the global bindings are evaluated once and stored in a Map for the duration of the server life cycle. Please see here for more information and updates on fixes: https://issues.apache.org/jira/browse/TINKERPOP-1839.
I am looking to get back my whole object, but limit one of its child objects.
For example, say you build a chat app the way the Firebase examples do and you have "rooms".
So you might have
rooms: {
  mainroom: {
    name: something,
    otherAttrs: mfasfd,
    messages: {
      0: {
        message: something
      },
      1: {
        message: something else
      }
    }
  }
}
I may have 300 messages in that mainroom, but I want to limit it to, say, 30. This example is basic, but in my actual application my objects are very interrelated, so I don't want to denormalize any further.
I could do a mainroom call and then another child call off of that, but I am wondering if I would get dinged twice: the initial call would load all the messages anyway, and then I would load 30 of them with the child call. I was just hoping someone would have a better recommendation.
Start by reading up about denormalization. In SQL, normalization is enforced by the table structure; in NoSQL you have to think about it yourself, and you're given enough rope to tangle yourself up and have a bad day.
So the first step is to split messages into its own path:
URL/rooms
URL/messages
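With that split, the data from the question would be laid out roughly like this (a sketch reusing the question's keys, with messages stored under the room id):
rooms: {
  mainroom: {
    name: something,
    otherAttrs: mfasfd
  }
}
messages: {
  mainroom: {
    0: { message: something },
    1: { message: something else }
  }
}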
Now you can grab your meta data and messages separately, and call limit to set the number loaded:
var fbRef = new Firebase(URL);
var roomRef = fbRef.child('rooms/'+roomId);
var chatRef = fbRef.child('messages/'+roomId).limit(30);
In case you're not convinced that these should be split up: you're going to run into this same issue when you want to create a dropdown containing a list of room names (with the current data structure you would have to load all your messages just to get the room names).
For great justice, split meta data and detailed records into their own paths. Otherwise, all your base are belong to bandwidth.
Assume that I have a workflow with three custom activities placed in a Sequence activity, and I created a Boolean variable (named "FinalResult") at the Sequence (root) level to hold the result. My intention is to assign each custom activity's result to the root-level variable ("FinalResult") from within the custom activity's Execute method, after the activity finishes.
I can achieve this by declaring an output argument on the custom activity and manually entering the variable name in the activity's properties window at design time.
But I don't want the end user to have to do this. I want the end user to just drag and drop the activities and write conditions on the "FinalResult" variable; internally I have to maintain the activity result in the "FinalResult" variable programmatically.
Finally, I want to maintain the workflow state in the "FinalResult" variable and access it anytime, anywhere in the workflow.
I tried the following, but I get the error "Property does not exist":
WorkflowDataContext dataContext = context.DataContext;
PropertyDescriptorCollection propertyDescriptorCollection = dataContext.GetProperties();
foreach (PropertyDescriptor propertyDesc in propertyDescriptorCollection)
{
    if (propertyDesc.Name == "FinalResult")
    {
        object data = propertyDesc.GetValue(dataContext); // as WorkUnitSchema;
        propertyDesc.SetValue(dataContext, "anil");
        break;
    }
}
Please let me know the possible solutions for this.
I do this all the time.
Simply implement IActivityTemplateFactory in your activity. When dragged and dropped onto the design surface, the designer will determine if your activity (or whatever is being dropped) implements this interface. If it does, it will construct an instance and call the Create method.
Within this method you can 1) instantiate your Activity and 2) configure it. Part of configuring it is binding your activity's properties to other activities' arguments and/or variables within the workflow.
There are a few ways to do this. Most simply, require that these arguments/variables have well-known names. In that case, you can simply bind to them via
return new MyActivity
{
    MyInArgument = new VisualBasicValue<object>(MyActivity.MyInArgumentDefaultName),
};
where MyActivity.MyInArgumentDefaultName is the name of the argument or variable you are binding to.
Alternatively, if that variable/argument is named by the user... you're in for a world of hurt. Essentially, you have to
1. Cast the DependencyObject target passed to the Create method to an ActivityDesigner
2. Get the ModelItem from that designer
3. Walk up the ModelItem tree until you find the argument/variable of the proper type
4. Use its name to create your VisualBasicValue
Walking up the ModelItem tree is super duper hard. It's kind of like reflecting up an object graph, but worse. You can expect, if you must do this, that you'll have to fully learn how the ModelItem works, and do lots of debugging (write everything down--hell, video it) in order to see how you must travel up the graph, what types you encounter along the way, and how to get their "names" (hint--it often isn't the Name property on the ModelItem!). I've had to develop a lot of custom code to walk the ModelItem tree looking for args/vars in order to implement a drag-drop-and-forget user experience. It's not fun, and it's not perfect. And I can't release that code, sorry.
This question is about best practices. I'm implementing a 3D interval kd-tree and, because of the recursive structure of the tree, I would be tempted to create a single class, KdTree, to represent the tree itself, the nodes, and the leaves.
However: elements are contained only in leaves, some general tree parameters (such as the maximum number of elements before splitting the space) are meant to be shared by the whole tree, and splitting planes make no sense at all for leaves.
That said: should I create three classes (KdTree, KdNode, KdLeaf), or just treat each node or leaf as a kd-tree in its own right (which, in fact, is precisely the case) and duplicate the data?
Tommaso
I would say there is no need for a Tree class. The top element is a node like all the rest.
To differentiate the leaves and the branch nodes, I'd go for
using System.Linq;

namespace KdTree
{
    // callback invoked for each leaf during enumeration
    public delegate void LeafNodeCallback(Leaf leaf);

    public abstract class Node
    {
        public abstract void EnumLeafNodes(LeafNodeCallback callback);
        public abstract int GetLeafCount();
    }

    public class Leaf : Node
    {
        // a leaf reports itself and counts as one
        public override void EnumLeafNodes(LeafNodeCallback callback) => callback(this);
        public override int GetLeafCount() => 1;
    }

    public class Branch : Node
    {
        // direct children:
        private readonly Node[] children;
        public Branch(params Node[] children) => this.children = children;

        // a branch delegates to its child nodes
        public override void EnumLeafNodes(LeafNodeCallback callback)
        {
            foreach (var child in children) child.EnumLeafNodes(callback);
        }
        public override int GetLeafCount() => children.Sum(c => c.GetLeafCount());
    }
}
Note that this is C#, but the idea carries over to any OO language: you use virtual methods to differentiate the behavior between branches and leaf nodes, and the branches delegate to their child nodes. The callback-based enumeration is a simple example of what is known as the Visitor pattern.
Create and use the classes KdNode and KdLeaf privately within the context of KdTree. This will make your life easier and hide the complexity from other parts of the program.
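A rough sketch of that idea (in Java, since the question doesn't name a language; the class and field names are only illustrative):
public class KdTree {
    // tree-wide parameters live here, not in every node
    private final int maxElementsPerLeaf;
    private Node root = new Leaf();

    public KdTree(int maxElementsPerLeaf) {
        this.maxElementsPerLeaf = maxElementsPerLeaf;
    }

    // the node types are an implementation detail, hidden from callers of KdTree
    private static abstract class Node { }

    private static final class Branch extends Node {
        int splitAxis;            // splitting planes exist only on branches
        double splitCoordinate;
        Node low, high;
    }

    private static final class Leaf extends Node {
        // elements are stored only in leaves
        final java.util.List<double[]> elements = new java.util.ArrayList<>();
    }
}
Callers only ever see KdTree; the branch/leaf distinction and the splitting-plane data stay internal.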
It seems the leaf and the tree (root) are simply nodes that sit at the beginning or the end of a "branch".
In these cases, I just call them all "nodes", and when traversing them I would refer to them as KdParentNode, KdNode, and KdChildNode. If a node does not have a parent, it's the tree (root) node, and if it does not have children, it's a leaf node.