When are Gremlin sessions better?

I understand that sessionless operations are the preferred way of using Gremlin. I'm wondering when the sessioned approach is better.
So I might be doing something like...
graph.addVertex("foo").property("name","bar")
graph.traversal().V().has("name","bar").as("f").addV("foo").property("name","baz").as("g").addE("test").from("f").to("g")
I'm doing this type of operation a lot. Often there's also a query (usually involving a coalesce) beforehand to check if a node (g in my example) exists, and create it if not.
So I'm wondering if a session might be better because I could hold a handle to the previous vertices and just attach new nodes to them without the expense of the lookup.
Feel free to tell me where I'm wrong in anything else that I'm doing; I'm just trying to make things faster.

First of all, I would avoid use of addVertex() and stick to addV() - see more details here.
As to your question, I think the only time to leverage sessions is if you have some sort of loading operation that requires explicit control over transactions and you're not using a JVM based language. Even then, I might consider other options for dealing with that and just avoid sessions completely. You end up with a less portable solution as there are a number of graph systems which don't even support them directly (e.g. Neptune).
A T.id-based lookup should be really cheap, so holding a vertex between requests in a session shouldn't vastly improve performance. Even if you keep the vertex between requests, you still need to pass it into your traversal, so you still have the lookup anyway - I'm not sure I see the difference in cost there.
// first request
v = g.addV(...).property(...).next()
// second request
g.V(v).addE(....
// third request
g.V(v).addE(....
The above should not be that much faster than:
// first request - returns id=1
g.addV(...).property(...).id().next()
// second request - where "1" is just passed in on the next request as a parameter
g.V(1).addE(....
// third request
g.V(1).addE(....
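For the "check if it exists, create it if not" part of the question, the usual sessionless answer is the fold()/coalesce()/unfold() idiom, which does the lookup and the conditional create in a single traversal. Here's a rough sketch using Gremlin's Java API (the labels, property keys and helper method names are just illustrative, echoing the question's example):
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__;
import org.apache.tinkerpop.gremlin.structure.Vertex;

public class GetOrCreate {
    // Returns the "foo" vertex whose name is 'name', creating it if it does not exist.
    // The existence check and the conditional addV happen in one sessionless traversal.
    static Vertex getOrCreateFoo(GraphTraversalSource g, String name) {
        return g.V().has("foo", "name", name)
                .fold()
                .coalesce(__.unfold(),
                          __.addV("foo").property("name", name))
                .next();
    }

    // Later requests can attach edges by id, which keeps the lookup cheap
    // without holding any vertex state in a session.
    static void connect(GraphTraversalSource g, Object fromId, Object toId) {
        g.V(fromId).addE("test").to(__.V(toId)).iterate();
    }
}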


Implementing a cold source with Mutiny

I'd like to know what a proper way is to implement my own cold source (publisher) using the Mutiny library.
Let's say there is a huge file parser that should return lines as Multi<String> items according to the Subscriber's consumption rate.
New lines should be read only after the previous ones were processed, to optimize memory usage, while buffering a couple hundred items to avoid consumer idling.
I know about the Multi.createFrom().emitter() factory method, but using it I can't see a convenient way to implement the backpressure.
Does Mutiny have an idiomatic way to create cold sources that produce the next items only when requested by the downstream, or am I in this case supposed to implement my own Publisher using the Java Reactive Streams API and then wrap it in a Multi?
You can use Multi.createFrom().generator(...).
The function is called for every request. But you can pass a "state" to remember where you are, typically an Iterator.
This is the opposite of the emitter approach (which does not check for requests but has a backpressure strategy attached to it).
If you need more fine-grained backpressure support, you would need to implement a Publisher.
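To make that concrete for the file-parser case in the question, here's a rough sketch, assuming the generator variant that takes an initial-state supplier and a (state, emitter) function; the file-reading details are made up:
import io.smallrye.mutiny.Multi;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Supplier;

public class FileLines {
    // Emits the lines of a file one by one; the generator function is only invoked
    // when the downstream requests another item, so lines are read lazily.
    static Multi<String> lines(Path file) {
        Supplier<BufferedReader> initialState = () -> open(file);
        return Multi.createFrom().generator(initialState, (reader, emitter) -> {
            try {
                String line = reader.readLine();
                if (line != null) {
                    emitter.emit(line);      // one item per downstream request
                } else {
                    reader.close();
                    emitter.complete();
                }
            } catch (IOException e) {
                emitter.fail(e);
            }
            return reader;                   // state handed to the next invocation
        });
    }

    private static BufferedReader open(Path file) {
        try {
            return Files.newBufferedReader(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
Backpressure then comes directly from the downstream's requests; a subscriber that requests a couple of hundred items at a time gives you the read-ahead behaviour described in the question.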

How do I guarantee task order processing for a queue with multiple consumers in RabbitMQ?

Say I want to start friendship between A and B.
Say I want to end friendship between A and B.
Those are two tasks I want to send to a queue having multiple consumers (workers).
I want to guarantee processing order, so how do I avoid the second task being performed before the first?
My solution: make tasks sticky (tasks about A are always sent to the same consumer).
Implementation: use RabbitMQ's exchanges and map tasks to the available consumers.
How do I map A to its consumer? I'm thinking about nginx's ip_hash. I think I need something similar.
I don't know if it is relevant but A and B are uuid.v4() UUIDs.
Can you point me to the algorithm I need to accomplish this mapping, please?
Well, there are two options:
Make one exchange/queue for all events and guarantee that they are inserted in the proper order, with a single worker consuming them. This costs more when inserting data (and doesn't give you the option of scaling out).
Prepare your app for such situations. For example, when you get a destroyFriendship message and the friendship does not exist yet, save a message to the database recording the future ending of the friendship. Then you can have multiple workers making and destroying friendships without having to care about ordering: simply do your job, make the friends, and if there is a row in the database about ending the friendship, destroy it (or simply don't create it). Of course you need to compare the creation and destruction timestamps and check that the destruction happened after the creation!
Of course you could compute some hash of A/B, but IMO that would be more costly than preparing the app. Scaling the app by adding exchanges/queues is not really good - you keep creating more and more queues and end up with too many queues/exchanges in RabbitMQ.
If you have to use the solution you specified, you can, for example, compute a CRC32 of A and B and use its value to calculate which queue the task should be sent to. But having multiple consumers can still go wrong here - what if one consumer is blocked somehow and another receives the message destroying the friendship? With this solution I'd say it's dangerous to have more than one worker per group of A/B.
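If you do go the hashing route, the mapping itself is simple: hash the UUID to a stable queue index so that everything about A always lands on the same queue and therefore the same consumer. A rough Java sketch of just that mapping (the queue count and naming are made up; the RabbitMQ wiring is left out):
import java.nio.charset.StandardCharsets;
import java.util.UUID;
import java.util.zip.CRC32;

public class TaskRouter {
    // Hypothetical: number of "sticky" queues, one per consumer.
    private static final int QUEUE_COUNT = 4;

    // Maps a user's UUID to a stable queue index so that all tasks about
    // that user always land on the same queue (and the same consumer).
    static int queueIndexFor(UUID userId) {
        CRC32 crc = new CRC32();
        crc.update(userId.toString().getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % QUEUE_COUNT);
    }

    // Routing key such as "friendship.2", to be bound to queue 2 on a direct exchange.
    static String routingKeyFor(UUID userId) {
        return "friendship." + queueIndexFor(userId);
    }

    public static void main(String[] args) {
        UUID a = UUID.randomUUID();
        System.out.println(routingKeyFor(a)); // always the same key for the same UUID
    }
}
RabbitMQ also ships a consistent-hash exchange plugin that can do this kind of partitioned routing for you, which may be worth evaluating before hand-rolling it - but the warning above still applies: within one partition you really want a single consumer to preserve ordering.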

Concurrent Read Access to Thread Object that Emulates Map

I am experiencing (very) slow page load times that increase proportionately to the number of active users on the system. I have a hunch that this is related to a custom defined thread object:
define stageStoreCache => thread {
parent map
public oncreate() => ..oncreate()
}
This stageStoreCache object simply mimics the behavior of a map whose data is available across the entire instance.
Many threads are reading it and very few threads are writing to it. Is this a poorly conceived solution to having a large map of data available across the instance? It's a fairly large map of maps that when exported to map->asstring can exceed 5MB. The objective is to prevent translating data stored as JSON in the database to Lasso types on the fly.
It seems that the large size of the stageStoreCache is not what causes problems. It seems to really be the number of concurrent users on the system.
Thanks for any insight you can offer.
You said that this holds a map of maps and is rather large. If those sub-maps are large, it is possible that the way you are accessing the data is causing the issue. Here's what I mean. If you are doing something like this:
// Potential problem as it copies the sub-map each time
stageStoreCache->find('sub-map')->find('data')
stageStoreCache->find('sub-map')->find('other')
The problem comes in that each time stageStoreCache->find('sub-map') is called it actually has to copy all the map data it finds for "sub-map" out of the thread object and into the thread requesting that data. If those sub-maps are large, this takes time. A better approach would be to do this once and stash it in a local variable:
// Better Approach
local(cache) = stageStoreCache->find('sub-map')
#cache->find('data')
#cache->find('other')
This at least only has to copy the "sub-map" over once. Another approach that might be better (only testing could tell) would be to refactor your code so that each call to stageStoreCache drills down to the data you actually want, and have just that small amount of data copied over.
// Might even be better as it just copies the values you want
stageStoreCache->drill('sub-map', 'data')
stageStoreCache->drill('sub-map', 'other')
Ultimately, I would love for Lasso to improve thread objects so that they never block for reads. (I had thought this had been submitted as a feature request, but I'm not finding it on Rhinotrac.) Until that happens, if none of my suggestions help, then you may need to investigate caching this data in something else, such as memcached.
Testing is the only way to tell for sure. But I would go a long way to avoid having a thread object that contains some 5 MB of data.
Take this snippet from the Lasso guide into consideration:
"all parameter values given to a thread object method are copied, as well as any return value of a thread object method"
http://www.lassoguide.com/language/threading.html
Meaning that one of the key features that makes Lasso 9 so fast, the extensive use of reference data, is lost.
Each time you have a call for stageStoreCache all the data it contains will first be copied into the thread that asks for it. That is an awful lot of copying.
I have found that keeping settings and site-wide data in the smallest possible chunks is convenient and fast, and also that it pays to only set them up when they are actually called for. This is unlike the old approach of a config file included on every call, setting up a bunch of variables of which the majority may never be used on that particular call. Here's a trick that I'm using instead. Consider this:
define mysetting1 => var(__mysetting1) || $__mysetting1 := 'Setting 1 value'
define mysetting2 => var(__mysetting2) || $__mysetting2 := 'Setting 2 value'
define mysetting3 => var(__mysetting3) || $__mysetting3 := 'Setting 3 value'
Put this in a file that is read at startup, either in a LassoApp that's initiated or a file in the startup folder.
These settings can then be called like this:
code blabla
mysetting2
more code blabla
mysetting1
mysetting2
The beauty is that, in this case, there is no wasted processing initiating mysetting3, since it's never called for, while mysetting2 is called for several times but is still only initiated once.
This technique can be used for simple things like the above, but also to initiate complex types or methods. Like session management, calling post or get params etc.

How are the chances of getting "read-your-writes" consistency increased in Dynamo?

In Section 5 of the Dynamo paper, there is the following content:
In particular, since each write usually follows a read operation, the
coordinator for a write is chosen to be the node that replied fastest to the
previous read operation which is stored in the context information of the
request. This optimization enables us to pick the node that has the data that
was read by the preceding read operation thereby increasing the chances of
getting "read-your-writes" consistency.
How are the chances of getting "read-your-writes" consistency increased?
"read-your-writes" means that a read following a write gets the value set by the
write. The read and the write are performed by two different clients for this
context. The reason is that the choice of the write coordinator does not impact
on the chances of getting "read-your-writes" by the same client.
But the above text is talking about a write following a read. Here is my guess.
The read coordinator will try to do syntactic reconciliation if it is possible. If syntactic reconciliation is impossible because of divergent versions, the client needs to do semantic reconciliation before doing a write. Either way, the versions on all the nodes involved in the read operation are ancestors of the reconciled version, so the following write can be sent to any of them to get applied. The earliest time for a write to be seen by a read is after the following steps are finished:
The client contacts the write coordinator.
The write coordinator generates the version clock for the new version.
The write coordinator writes the new version locally.
The shorter the time to perform the above steps, the more likely a following read sees the new version. Since it is quite likely that the node which replied fastest to the previous read can perform these steps in a shorter time, such a node is chosen as the write coordinator.
Section 2.3 talks about performing the reconciliation at read time rather than write time.
Data versioning - "One can determine whether two versions of an
object are on parallel branches or have a causal ordering, by
examine their vector clocks."
This paragraph is from section 4. [emphasis mine]
In Dynamo, when a client wishes to update an object, it must specify
which version it is updating. This is done by passing the context it
obtained from an earlier read operation, which contains the vector
clock information. Upon processing a read request, if Dynamo has
access to multiple branches that cannot be syntactically reconciled,
it will return all the objects at the leaves, with the corresponding
version information in the context. An update using this context is
considered to have reconciled the divergent versions and the
branches are collapsed into a single new version
So by performing the read first, you're effectively reconciling all divergent versions prior to writing. By writing to that same node, the version you've updated is marked with the context and vector clock of the most up to date version and all divergent branches can be collapsed. This is sent to the top N nodes (as you've stated) as fast as possible. But by removing the divergent branches - you reduce the chance that multiple values could be returned. You only need one of the N nodes read in the next read to get the reconciled write. ie - the node as part of the quorum of R reads says - "I am the reconciled version, and all others must bow to me". (and if that has already been distributed to another of the "R" nodes, then there's even greater chance of getting the reconciled version in the quorum)
But if you wrote to a different node, one that you hadn't read from, the vector clock that is being updated may not necessarily be a reconciled version of the object. Therefore, you could still have divergent branches. The following read will try to reconcile them, but it's more probable that you end up with multiple divergent versions and no reconciliation.
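For reference, the "parallel branches vs. causal ordering" check mentioned in the data-versioning quote above boils down to comparing vector clocks. A minimal, purely illustrative Java sketch (not Dynamo's actual code):
import java.util.HashMap;
import java.util.Map;

// Minimal vector clock sketch (node name -> counter), only to illustrate the
// "parallel branches vs. causal ordering" check the paper refers to.
final class VectorClock {
    final Map<String, Long> counters = new HashMap<>();

    // true if every counter in 'a' is <= the matching counter in 'b',
    // i.e. 'a' is an ancestor of (or equal to) 'b' and can be discarded.
    static boolean descends(VectorClock a, VectorClock b) {
        for (Map.Entry<String, Long> e : a.counters.entrySet()) {
            if (e.getValue() > b.counters.getOrDefault(e.getKey(), 0L)) {
                return false;
            }
        }
        return true;
    }

    // Neither descends from the other: parallel branches that need
    // semantic reconciliation by the client.
    static boolean conflict(VectorClock a, VectorClock b) {
        return !descends(a, b) && !descends(b, a);
    }
}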
If you've made it this far, I think the most interesting part is that per Section 6, client applications can dictate the values of N, R and W - ie - number of nodes that constitute the pool to draw from, and the number of nodes that must agree on a read or write for it to be successful.
Geez - my head hurts now.
I re-read the Dynamo paper. I have a new understanding of "read-your-writes" consistency. "read-your-writes" involves only one client. Imagine the following requests performed by one client on the same key:
read-1
write-1
read-2
"read-your-writes" means that read-2 sees write-1. The write coordinator has the best chance to have write-1. To ensure "read-your-writes", it is desired that the write coordinator replies fastest to read-2. It is highly possible that the node replies fastest to read-1 also reply fastest to read-2. So choose the node replies fastest to read-1 as the write coordinator.
And what is the node that replied fastest to the previous read operation? Such a node only makes sense if client-driven coordination is used. With server-side coordination, the coordinator node replies to the client and the other involved nodes reply to the coordinator node, so "replied fastest" is meaningless in that case.

Should I care about thread safety of a static int (4 bytes) variable in ASP.NET?

I have the feeling that I should not care about thread-safe reading/writing of a
public static int MyVar = 12;
in ASP.NET.
I read/write to this variable from various user threads. Let's suppose this variable will store the number of clicks on a certain button/link.
My theory is that no thread can read/write to this variable at the same time. It's just a simple variable of 4 bytes.
I do care about thread safety, but only for reference objects and List instances or other types that take more cycles to read/update.
Am I wrong in my presumption?
EDIT
I understand this depends on my scenario, but that wasn't the point of the question. The question is: is it correct that thread-safe code can be written with a static int variable without using the lock keyword?
It is my problem to write correct code. The answer seems to be: yes, if you write correct and simple code that isn't too complicated, you can create thread-safe functions without needing the lock keyword.
If one thread simply sets the value and another thread reads the value, then a lock is not necessary; the read and write are atomic. But if multiple threads might be updating it and are also reading it to do the update (e.g., increment), then you definitely do need some kind of synchronization. If only one thread is ever going to update it even for an increment, then I would argue that no synchronization is necessary.
Edit (three years later) It might also be desirable to add the volatile keyword to the declaration to ensure that reads of the value always get the latest value (assuming that matters in the application).
The concept of thread 'safety' is too vague to be meaningful unfortunately. If you're asking whether you can read and write to it from multiple threads without the program crashing during the operation, the answer is almost certainly yes. If you're also asking if the variable is guaranteed to either be the old value or the new value without ever storing any broken intermediate values, the answer for this data type is again almost certainly yes.
But if your question is "will my program work correctly if I access this from multiple threads", then the answer depends entirely on what your program is doing. For example, if you run the following pseudo code in 2 threads repeatedly in most programming languages, eventually you'll hit the assertion.
if MyVar >= 1:
    MyVar = MyVar - 1
assert MyVar >= 0
Primitives like int are thread-safe in the sense that reads/writes are atomic. But as with most any type, it's left to you to do proper checking with more complex operations. For example, if (x > 0) x--; would be problematic in a multi-threaded scenario because x might change in between the if condition check and decrement.
A simple read or write of a field of 32 bits or less is always atomic. But you should post your actual read/write code to make sure that it is thread safe.
Check out this post: http://msdn.microsoft.com/en-us/magazine/cc163929.aspx
It explains why you need to synchronize access to the integers in this scenario
Try Interlocked.Increment() or Interlocked.Add() and you'll be right. Your code complexity will be the same but you truly won't have to worry. If you're not worried about losing a few clicks in your counter, you can continue as you are.
Reading or writing integers is atomic. However, reading and then writing is not atomic. So, if you have one thread that writes and many that read, you may be able to get away without locks.
However, even though the operations are atomic, there are still potential multi-threading issues. In order for one thread to be guaranteed that another thread can see values it writes, you need a memory barrier. Otherwise, the compiler can optimize the code so that the variable stays in a register (or even optimize the operation away completely), so changes would be invisible from one thread to another.
You can establish a memory barrier explicitly (volatile or Thread.MemoryBarrier), or with the Interlocked class -- or with the lock statement (Monitor).
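Not ASP.NET, but the same distinction holds on any mainstream runtime. Here's a small Java sketch of the difference between a plain read-modify-write increment and an atomic one (AtomicInteger playing the role Interlocked.Increment plays in .NET); the names are made up:
import java.util.concurrent.atomic.AtomicInteger;

public class ClickCounter {
    // Plain field: reads and writes of an int are individually atomic,
    // but "read, add one, write back" is three steps and can lose updates.
    static int unsafeClicks = 0;

    // Atomic counter: the increment happens as a single indivisible operation.
    static final AtomicInteger clicks = new AtomicInteger();

    public static void main(String[] args) throws InterruptedException {
        Runnable clicker = () -> {
            for (int i = 0; i < 100_000; i++) {
                unsafeClicks++;              // racy: increments can be lost
                clicks.incrementAndGet();    // safe: never loses an increment
            }
        };
        Thread t1 = new Thread(clicker);
        Thread t2 = new Thread(clicker);
        t1.start(); t2.start();
        t1.join(); t2.join();

        // Typically prints fewer than 200000 for the plain field,
        // and exactly 200000 for the atomic one.
        System.out.println(unsafeClicks + " vs " + clicks.get());
    }
}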
