How to handle Transaction (Rollback and Commit) in Gremlin - AWS Neptune?
In a request, I have two pieces of data: a student and a List[Subject]. I need to:
1. Create a student vertex with properties.
2. Create an edge between the class and the student.
3. Create subject vertices in a for loop, with an edge between each subject and the student.
The documentation says: "Multiple statements separated by a semicolon (;) or a newline character (\n) are included in a single transaction."
How to handle for loop in a single transaction in GremlinPython?
Any suggestions or help?
There are three main ways to send a Gremlin query to a server:
1. As plain text. In this case you can use semicolons between queries, and they are treated as a single transaction.
2. Using a Gremlin client driver that supports sending queries as bytecode. In this case each query sent is a transaction.
3. Using Gremlin Sessions. In this case you create a session and then send one or more queries; the whole session is treated as a single transaction and is only committed when you close the session. With sessions you can have code and queries intermixed as needed.
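Here is a minimal sketch of the third approach in gremlinpython, assuming your Neptune endpoint in place of the placeholder URL; passing a session ID to the Client opens a sessioned connection:

import uuid
from gremlin_python.driver import client

# Opening the Client with a session ID gives a sessioned connection:
# everything submitted on it belongs to one transaction.
c = client.Client('wss://your-neptune-endpoint:8182/gremlin', 'g',
                  session=str(uuid.uuid4()))

# Each submit() runs inside the same open transaction; bindings keep
# the values out of the query string.
c.submit("g.addV('student').property('name', name)",
         {'name': 'name1'}).all().result()
for subject in ['math', 'physics']:
    c.submit("g.addV('subject').property('name', name)",
             {'name': subject}).all().result()

# Closing the session commits the transaction; if a query fails,
# the work is rolled back instead.
c.close()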
All of that said, you really don't need a for loop here; you can just send everything in one go. Here is a simple example:
g.addV('root').property('data',9).as('root').
addV('node').property('data',5).as('b').
addV('node').property('data',2).as('c').
addV('node').property('data',11).as('d').
addV('node').property('data',15).as('e').
addV('node').property('data',10).as('f').
addV('node').property('data',1).as('g').
addV('node').property('data',8).as('h').
addV('node').property('data',22).as('i').
addV('node').property('data',16).as('j').
addE('left').from('root').to('b').
addE('left').from('b').to('c').
addE('right').from('root').to('d').
addE('right').from('d').to('e').
addE('right').from('e').to('i').
addE('left').from('i').to('j').
addE('left').from('d').to('f').
addE('right').from('b').to('h').
addE('left').from('c').to('g')
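If you do want to keep a Python loop, you can still send everything in one go: since a bytecode traversal is submitted (and committed) as a single query, you can build the traversal up in an ordinary Python loop before iterating it. A sketch for the student/subject case in gremlinpython, with placeholder labels, property names, and class vertex id:

from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

conn = DriverRemoteConnection('wss://your-neptune-endpoint:8182/gremlin', 'g')
g = traversal().withRemote(conn)

# Start one traversal: add the student and link it to the class.
t = (g.addV('student').property('name', 'name1').as_('s')
      .V('class-1').addE('enrolled_in').from_('s'))

# Extend the same traversal in a plain Python loop; nothing is sent yet.
for subject in ['math', 'physics', 'chemistry']:
    t = t.addV('subject').property('name', subject).addE('studies').from_('s')

t.iterate()  # one submission, one transaction
conn.close()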
Related
I'm trying to execute an OLAP traversal on a query that needs to check if a vertex has a neighbour of a certain type.
I keep getting
Local traversals may not traverse past the local star-graph on GraphComputer
My query looks something like:
g.V().hasLabel('label1').
where(__.out().hasLabel('label2'))
I'm using the TraversalVertexProgram.
Needless to say, when running the same query in OLTP mode there is no problem.
Is there a way to execute such logic?
That is a limitation of the TinkerPop OLAP GraphComputer. It operates on 'star-graph' objects: a vertex and its connected edges only. It uses a message-passing engine internally, so you have to rewrite your query.
Option 1: start from label2 and go to label1. This should return the same result:
g.V().hasLabel('label2').in().hasLabel('label1')
Option 2: try to use unique edge labels and check the edge label instead:
g.V().hasLabel('label1').where(__.outE('label1_label2'))
I want to create 1000+ Edges in a single query.
Currently, I am using the AWS Neptune database and gremlin.net for creating it.
The issue I am facing is speed: it takes a huge amount of time because each query is a separate HTTP request.
So I am planning to combine all of my queries into a single string and execute it in one shot.
_g.AddE("allow").From(_g.V().HasLabel('person').Has('name', 'name1')).To(_g.V().HasLabel('phone').Where(__.Out().Has('sensor', 'nfc'))).Next();
There are chances that the "To" (target) vertex may not be available in the database, and when that is the case this query fails as well. So I had to check whether that vertex exists before executing this query, using hasNext().
As of now it's working fine, but since I am thinking of combining all 1000+ edge creations at once: is it possible to write a query that doesn't break if the "To" (target) vertex is not found?
You should look at using the Element Existence pattern for each vertex as shown in the TinkerPop Recipes.
In your example you would replace this section of your query:
_g.V().HasLabel("person").Has("name", "name1")
with something like this (I don't have a .NET environment to test the syntax):
__.V().Has("person", "name", "name1").Fold().
    Coalesce(__.Unfold(), __.AddV("person").Property("name", "name1"))
This will act as an Upsert and either return the existing vertex or add a new one with the name property. This same pattern can then be used on your To step to ensure that it exists before the edge is created as well.
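Putting the whole thing together (sketched in gremlinpython, since I can't verify the .NET form; the steps map one-to-one to Gremlin.NET's Pascal-cased versions), with the phone lookup reduced to a hypothetical 'serial' property in place of your Where(Out().Has("sensor", "nfc")) match, and g assumed to be an existing remote traversal source:

from gremlin_python.process.graph_traversal import __

# Upsert both endpoints, then add the edge, so a missing "To" vertex
# can no longer break the query. Note that a path label set with as_()
# is lost when it crosses the second fold(), which is why the "from"
# vertex is kept in an aggregate() side effect instead.
(g.V().has('person', 'name', 'name1').fold()
  .coalesce(__.unfold(),
            __.addV('person').property('name', 'name1'))
  .aggregate('from')
  .V().has('phone', 'serial', 'phone1').fold()
  .coalesce(__.unfold(),
            __.addV('phone').property('serial', 'phone1'))
  .as_('to')
  .select('from').unfold()     # recover the upserted "from" vertex
  .addE('allow').to('to')
  .iterate())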
I am trying to submit a batch like operation for creating multiple vertices and edges in the same query.
g.addV('os').property('name', 'linux').as('linux').
addV('os').property('name', 'windows').as('windows').
addV('os').property('name', 'mac').as('mac').
addE('competitor').from('linux').to('UNEXISTING OS').   // fails here
addE('competitor').from('linux').to('windows').
addE('competitor').from('windows').to('mac').
addE('competitor').from('linux').to('mac').
iterate()
The query is constructed to intentionally fail; however, all vertices before the failing line are still being created.
Is it possible to achieve a kind of transaction for the whole query, so that if one subquery fails, the ones that were previously executed are rolled back?
Thanks!
The query cannot be executed this way in the Gremlin Console using TinkerGraph:
as per the TinkerPop documentation, there is no transaction support for the built-in TinkerGraph object.
But, as cygri pointed out, AWS Neptune does support transactions (see here). They can take the form of the original query from the OP, or of multiple queries separated by a semicolon (;) or a newline character (\n):
g.addV('os').property('name', 'linux').next();
g.addV('os').property('name', 'windows').next();
g.addE('competitor').from('1101').to('1102')
You can also use Gremlin Sessions: create a sessioned connection and it will roll back the queries in case of an error.
When I try to filter the CustAccount field on CustTableListPage, it takes too long to filter. On the other fields there is no latency. I'm trying to filter on just part of the account number, like "*123".
I have done reindexing for CustTable and also updated statistics, but no appreciable difference at all.
When I add the list page's query to a view, it filters the CustAccount field normally, like the other fields.
Any suggestions?
Edit:
Our version is AX 2012 R2 CU8. It is not a user-specific problem; it occurs for every user. The interaction class has some customizations, but only for setting some buttons' enabled/disabled properties, etc. I tried to look at the query execution; what I found is not clear, something like FETCH_API_CURSOR_000000..x
Record a trace of this execution and locate the bottleneck.
Keep in mind that wildcards (such as *) have to be used with care. Using a filter string that starts with a wildcard kills performance because the SQL indexes cannot be used.
Using a wildcard at the end
Imagine that you have a dictionary and have to list all the words starting with 'Foo'. You can skip all entries before 'F', then all those before 'Fo', then all those before 'Foo', and start your result list from there.
Similarly, asking the underlying SQL engine to list all CustAccount entries starting with '123' (= filter string '123*') allows using an index on CustAccount to quickly skip to the relevant data.
Using a wildcard at the start
Imagine that you still have that dictionary and have to list all the words ending with 'ing'. You would have no choice but to go through the entire dictionary and check the ending of every word (due to the alphabetical sorting).
This explains why asking the SQL engine to list all CustAccount entries ending with '123' (= filter string '*123') means that all CustAccount values must be inspected. So the AOS loops through all the entries and uses an SQL cursor to do so; that is the FETCH_API_CURSOR statement you see at the SQL level.
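The dictionary analogy is easy to reproduce outside SQL. A toy sketch in Python (with hypothetical account numbers): on sorted data, a prefix filter can seek with a binary search the way an index does, while a suffix filter has to inspect every entry:

import bisect

accounts = sorted(['1001', '1230', '1234', '1239', '4123', '9123'])

# Prefix search ('123*'): binary-search to the start, then read until
# the prefix stops matching (an index seek plus a short range scan).
prefix = '123'
start = bisect.bisect_left(accounts, prefix)
prefix_matches = []
for acc in accounts[start:]:
    if not acc.startswith(prefix):
        break
    prefix_matches.append(acc)

# Suffix search ('*123'): every entry must be checked (a full scan,
# which is what the FETCH_API_CURSOR loop is doing).
suffix_matches = [acc for acc in accounts if acc.endswith('123')]

print(prefix_matches)   # ['1230', '1234', '1239']
print(suffix_matches)   # ['4123', '9123']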
Possible solutions
Educate your end user that using a wildcard at the beginning of a filter string will always be slow on a large table.
Step up the SQL server hardware / allocated resources (faster CPU, more RAM, faster disk, ...).
Create a full-text index on CustAccount (I'm not a fan of this one, and the performance impact should be thoroughly investigated).
I've solved the problem. The CustTableListPage query had a sort on the DirPartyTable.Name field. When I removed this sort, filtering with a wildcard worked like a charm.
My PreSQL query in the Informatica workflow has two parts:
Insert query
Refresh Stats procedure call
When no semicolon (;) is present between these two parts, the record count is given in millions. When a semicolon is present, the record count is given in thousands.
Two Questions:
Why is it not throwing an error for the absence of a semicolon?
Why the increased record count?
From the documentation:
Use a semicolon (;) to separate multiple statements. The Integration Service issues a commit after each statement.
The Designer does not validate the SQL.
It will not throw an error in PowerCenter because PowerCenter does not validate the SQL. You can check the logs to see what SQL is being passed to Teradata, and execute the same SQL outside (in any other client) to investigate the row-count difference.