I am struggling with running the pagination script on Azure Cosmos DB Graph, using Gremlin.NET:
t = g.V().hasLabel('person');[]
t.next(2)
t.next(2)
It comes from the TinkerPop recipes page, where it is described how to reuse a traversal instance to avoid loading the whole graph.
Unfortunately, in Gremlin.NET I get:
ExceptionType : GraphSyntaxException
ExceptionMessage :
Gremlin query syntax error: Unsupported groovy expression kind: t=g.V().hasLabel('person') # line 1, column 1.
This kind of construct is probably not supported by Azure Cosmos DB Graph, but I cannot find any documentation that confirms it.
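For reference, this is roughly how I am submitting the script from Gremlin.NET (the endpoint, database/collection path, and key below are placeholders):
using Gremlin.Net.Driver;

// Placeholder Cosmos DB Gremlin endpoint and credentials.
var server = new GremlinServer(
    "my-account.gremlin.cosmos.azure.com", 443, enableSsl: true,
    username: "/dbs/mydb/colls/mygraph",
    password: "<primary-key>");

using var client = new GremlinClient(server);

// Sending the two statements as a single Groovy script is what produces the
// "Unsupported groovy expression kind" error.
var page = await client.SubmitAsync<dynamic>(
    "t = g.V().hasLabel('person');[]\nt.next(2)");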
I have an Azure function that runs with a CosmosDBTrigger. It points correctly to my target database and collection. The CreateLeaseCollectionIfNotExists is set to true and the LeaseCollectionName is set to leases. When the function is started I receive this error:
Error indexing method '***'
Microsoft.Azure.WebJobs.Host.Indexers.FunctionIndexingException :
Error indexing method '***' ---> System.InvalidOperationException :
Cannot create Collection Information for *** in database *** with lease leases in database *** :
Partition key path /id is invalid for Gremlin API. The path cannot be '/id', '/label' or a nested path such as '/key/path'
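For reference, the trigger is declared roughly like this (database, collection, and connection-string setting names are placeholders):
using System.Collections.Generic;
using Microsoft.Azure.Documents;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class GraphChangeFeed
{
    [FunctionName("GraphChangeFeed")]
    public static void Run(
        [CosmosDBTrigger(
            databaseName: "MyDatabase",
            collectionName: "MyGraph",
            ConnectionStringSetting = "CosmosDBConnection",
            LeaseCollectionName = "leases",
            CreateLeaseCollectionIfNotExists = true)]
        IReadOnlyList<Document> input,
        ILogger log)
    {
        // Log how many documents arrived from the change feed.
        log.LogInformation($"Documents modified: {input.Count}");
    }
}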
It seems like Azure is creating the leases graph with '/id' as the partition key. Where did I go wrong?
The Azure Functions Cosmos DB Trigger documentation says it only works on SQL API accounts: https://learn.microsoft.com/azure/azure-functions/functions-bindings-cosmosdb-v2#supported-apis
In particular, the Trigger uses the Change Feed Processor library, which was designed to work with SQL API accounts; it requires a lease collection partitioned by /id, which is something Gremlin API accounts cannot do.
I'm trying to export an entire remote graph into JSON. When I use the following code, it results in an empty file. I am using gremlin-driver 3.3.2, as this is the same version used by the underlying graph database, AWS Neptune.
// Build a traversal source over the remote connection, then try to write the graph with the io() step.
var traversal = EmptyGraph.instance().traversal().withRemote(DriverRemoteConnection.using(getCluster()));
traversal.getGraph().io(graphson()).writeGraph("my-graph.json");
How is one supposed to populate the graph with data such that it can be exported?
I also posted this to the Gremlin Users list.
Here's some code that will do it for you with Neptune, and I would think it should work with most Gremlin Server implementations:
https://github.com/awslabs/amazon-neptune-tools/tree/master/neptune-export
The results of the export can be loaded via Neptune's bulk loader if you choose to export as CSV.
Hope that is useful. If it is more than you needed, hopefully it will at least give you some pointers that help.
With hosted graphs, including Neptune, it is not uncommon to find that they do not expose the Graph object or give access to the io() classes.
The Gremlin io() step is not supported in Neptune. Neptune's documentation discusses the other differences between the Amazon Neptune implementation of Gremlin and the TinkerPop implementation.
Taking on board the valuable feedback from Ankit and Kelvin, I concentrated on using a local Gremlin Server to handle the data wrangling.
Once I had the data in a locally running server (by generating Gremlin script from an in-memory entity model), I accessed it via a Gremlin Console and ran the following:
~/apache-tinkerpop-gremlin-console-3.3.7/bin/gremlin.sh
gremlin> :remote connect tinkerpop.server conf/remote.yaml
gremlin> :> graph.io(graphson()).writeGraph("my-graph.json")
==>null
This put the my-graph.json file in /opt/gremlin-server/ on the docker container.
I extracted it using docker cp $(docker container ls -q):/opt/gremlin-server/my-graph.json .
I can then use this data to populate a gremlin-server testcontainer for running integration tests against a graph database.
neptune-export doesn't support direct export to S3. You'll have to export to the local file system, and then separately copy the files to S3.
The Cosmos DB FAQ says, in the Cassandra API section, that Azure Cosmos DB provides automatic indexing of all attributes without any schema definition. https://learn.microsoft.com/en-us/azure/cosmos-db/faq#does-this-mean-i-dont-have-to-create-more-than-one-index-to-satisfy-the-queries-1
But when I try to add a WHERE column1 = 'x' filter to my CQL query, I get an exception from the DataStax Cassandra driver saying that data filtering is not supported. I tried to bypass the client driver by supplying ALLOW FILTERING, but this time I got an error from the Cosmos server saying the feature is not implemented.
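For illustration, the failing call looks roughly like this (shown here with the DataStax C# driver, which is an assumption; contact point, keyspace, and table/column names are placeholders):
using Cassandra;

// Placeholder Cosmos DB Cassandra API contact point and credentials.
var cluster = Cluster.Builder()
    .AddContactPoint("my-account.cassandra.cosmos.azure.com")
    .WithPort(10350)
    .WithSSL()
    .WithCredentials("my-account", "<primary-key>")
    .Build();
var session = cluster.Connect("mykeyspace");

// Filtering on a non-key, non-indexed column; this is the statement that is rejected.
var rows = session.Execute("SELECT * FROM mytable WHERE column1 = 'x'");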
So, if automatic indexing is implemented for Cosmos/Cassandra API, how can it be used?
I'm using Azure Cosmos DB .NET SDK Version 3.0 and I want to create a container programmatically, without a partition key. Is it possible? I always get an error saying: Value cannot be null.
Parameter name: partitionKey
I am using the method CosmosContainers.CreateContainerIfNotExistsAsync.
I can reproduce your issue on my side every time.
The exception is thrown by a check inside the SDK; decompiling the DLL source code shows the detailed logic.
It seems we can't get around this check, because the Cosmos DB team is planning to deprecate the ability to create non-partitioned containers, as they do not allow you to scale elastically (mentioned in my previous case: Is it still a good idea to create a Cosmos DB collection without a partition key?).
But you can still create non-partitioned containers with the DocumentDB .NET package or the REST API.
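For example, a minimal sketch with the older Microsoft.Azure.DocumentDB SDK (endpoint, key, and resource names are placeholders):
using System;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

// Placeholder endpoint, key, and resource names.
var client = new DocumentClient(
    new Uri("https://my-account.documents.azure.com:443/"), "<primary-key>");

// No partition key definition is supplied, so the collection is created non-partitioned.
await client.CreateDocumentCollectionIfNotExistsAsync(
    UriFactory.CreateDatabaseUri("mydb"),
    new DocumentCollection { Id = "mycontainer" });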
I'm working on a Cosmos DB app that stores both standard documents and graph documents. We are saving both types via the DocumentDB API, and I am able to run graph queries that return GraphSON using the DocumentClient.CreateGremlinQuery method. This GraphSON is to be read by a web app so the graph can be displayed for the user to view, and so on.
My issue is that I cannot define the version of the GraphSON format returned when using the Microsoft.Azure.Graphs method. So I looked into Gremlin.Net, which, going by the documentation, has a lot more options in this regard.
However, I am finding it difficult to connect to the Cosmos document DB using Gremlin.Net. The server variable, which you define like this:
var server = new GremlinServer("https://localhost/", 8081, enableSsl: true, username: $"/dbs/TheDatabase/colls/TheCOllection", password: "C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw==");
then results in a URI that has "/gremlin" appended, and it cannot locate the database endpoint.
Has anyone used Gremlin.Net to connect to a Cosmos database that has been set up as a document DB rather than a graph DB? The documents in it are graph/Gremlin compatible in their format, with _isEdge / label / _sink etc.
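From the Gremlin.Net documentation it looks like the GraphSON version is chosen when constructing the client; a rough sketch of what I mean (endpoint and key are placeholders):
using Gremlin.Net.Driver;
using Gremlin.Net.Structure.IO.GraphSON;

// Placeholder endpoint and key; the username is the /dbs/{database}/colls/{collection} path.
var server = new GremlinServer(
    "my-account.gremlin.cosmos.azure.com", 443, enableSsl: true,
    username: "/dbs/TheDatabase/colls/TheCollection",
    password: "<primary-key>");

// Explicitly pick the GraphSON 2 reader/writer and content type
// (the format the Cosmos DB Gremlin endpoint understands).
using var client = new GremlinClient(
    server,
    new GraphSON2Reader(),
    new GraphSON2Writer(),
    GremlinClient.GraphSON2MimeType);

var result = await client.SubmitAsync<dynamic>("g.V().limit(1)");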
Cheers,
Mark (Document db/Gremlin/graph newbie)