Query parameters in Cosmos DB Graph-API - gremlin

Are query parameters supported in the new Cosmos DB graph API?
For example in the query:
IDocumentQuery<dynamic> query = client.CreateGremlinQuery<dynamic>(graph, "g.V().has('name', 'john')");
Can I replace the hard-coded value 'john' with a query parameter as we could do in DocumentDB:
IQueryable<Book> queryable = client.CreateDocumentQuery<Book>(
    collectionSelfLink,
    new SqlQuerySpec
    {
        QueryText = "SELECT * FROM books b WHERE (b.Author.Name = @name)",
        Parameters = new SqlParameterCollection()
        {
            new SqlParameter("@name", "Herman Melville")
        }
    });
I am asking with security in mind. Or might there be other ways to defend against injections in Gremlin?

TinkerPop in general has a notion of bindings, which allow you to define your data separately from your Gremlin scripts. An example using Java code can be found here: https://github.com/tinkerpop/gremlin/wiki/Using-Gremlin-through-Java (search for "bindings").
You can also use bindings through the HTTP endpoint, for example:
curl http://localhost:8182 -d '{"gremlin": "g.V().has(key1, value1);", "bindings": {"key1": "name", "value1": "david"}}'
You need to find out whether the client you are using supports the bindings parameter, but it seems to me that what you are looking for is this TinkerPop-compatible functionality.
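As a rough illustration of why bindings help against injection, here is a minimal Python sketch that builds the same kind of request body as the curl example above. The helper name build_gremlin_request and the sample input are made up for illustration, and whether a given Cosmos DB client accepts bindings is still something you would have to verify.

```python
import json


def build_gremlin_request(script, bindings):
    """Return a JSON request body that keeps user data out of the script text."""
    # The script refers to values only by binding name; the values travel
    # in a separate "bindings" map, so they are never parsed as Gremlin.
    return json.dumps({"gremlin": script, "bindings": bindings})


# A hostile value stays plain data because it is never spliced into the script.
user_input = "john'); g.V().drop(); //"
body = build_gremlin_request(
    "g.V().has(key1, value1)",
    {"key1": "name", "value1": user_input},
)
print(body)
```

The script text is a constant here, so nothing the user types can change the traversal that the server compiles.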

How to get available Graph container list in my Cosmos Graph DB?

I am trying to get metadata for my Cosmos Graph database. There are a number of graphs in this database and I want to list their names.
With the Gremlin API, we can connect to a graph container and then submit a query, as in the code sample below. But the connection requires a {collection}, which is the graph name, so we are bound to one particular graph.
var gremlinServer = new GremlinServer(hostname, port, enableSsl: true,
    username: "/dbs/" + database + "/colls/" + collection,
    password: authKey);
using (var gremlinClient = new GremlinClient(gremlinServer, new GraphSON2Reader(), new GraphSON2Writer(), GremlinClient.GraphSON2MimeType))
{
    // Await the call so the client is not disposed before the query completes.
    await gremlinClient.SubmitAsync<dynamic>(query);
}
Is there any way to connect to the graph database only and get some metadata, such as, in my case, the list of available graphs?
It looks like the Gremlin client is implemented at the collection (i.e. graph) level, so it won't be possible to enumerate the graphs in an account or database over the Gremlin connection.
You can always use the Cosmos DB SDK to connect to the account and enumerate the databases and collections, and then use Gremlin clients to connect to each of them separately.
Install-Package Microsoft.Azure.Cosmos
using (var client = new CosmosClient(endpoint, authKey))
{
    // Enumerate every database in the account.
    var dbIterator = client.GetDatabaseQueryIterator<DatabaseProperties>();
    while (dbIterator.HasMoreResults)
    {
        foreach (var database in await dbIterator.ReadNextAsync())
        {
            // DatabaseProperties is metadata only, so get a Database handle to list its containers (graphs).
            var containerIterator = client.GetDatabase(database.Id)
                .GetContainerQueryIterator<ContainerProperties>();
            while (containerIterator.HasMoreResults)
            {
                foreach (var container in await containerIterator.ReadNextAsync())
                {
                    Console.WriteLine($"{database.Id} - {container.Id}");
                }
            }
        }
    }
}

Multiple partitions in COSMOS DB collection

1) I have a Cosmos DB collection with about 500k documents, partitioned by the property "SITEID". The request options accept only one partition key value, but I have queries that need to run for several SITEID values, e.g. SITEID IN (1,2,3,4).
For example, my stored procedure (SP) runs the following query:
SELECT * FROM c WHERE c.SITEID IN
("SiteId1","SiteId2","SiteId3","SiteId4","SiteId5")
AND c.STATUS IN ("Status1","Status2","Status3","Status4")
I am calling the above SP using the SQL API code below.
await client.ExecuteStoredProcedureAsync<string>(UriFactory.CreateStoredProcedureUri("DBName", "CollectionName", "Sample"),new RequestOptions { PartitionKey = new PartitionKey("SiteId1") })
In the code above, the PartitionKey property supports only a single value, but I need to pass several partition key values. Are there any other options for doing this?
2) The "EnableCrossPartitionQuery" property is only available in FeedOptions, not in the RequestOptions class. client.ExecuteStoredProcedureAsync accepts only a RequestOptions parameter, not FeedOptions. I need to execute a stored procedure once across all partitions. Is there any way to pass EnableCrossPartitionQuery to the ExecuteStoredProcedureAsync method?
E.g.:
client.CreateDocumentQuery<Doc>(UriFactory.CreateDocumentCollectionUri("DBName", "CollectionName"), "select * from c", new FeedOptions { EnableCrossPartitionQuery = true }).ToList()
await client.ExecuteStoredProcedureAsync<string>(UriFactory.CreateStoredProcedureUri("DBName", "CollectionName", "Sample"),new RequestOptions { PartitionKey = new PartitionKey("WGC") })
Stored procedures can only be executed against a single partition. There is nothing you can do about that.
They are not considered a query that returns a feed, but a request that could return a response of any type. That's why they don't use FeedOptions but rather RequestOptions.
You can still execute your query as a normal document query and set EnableCrossPartitionQuery to true. Cosmos should recognise the partition key values in the query and limit the requests to those partitions.
I say "should" because this answer suggests that this is the case, but there are some comments that say otherwise. I would suggest you check your metrics regarding the number of collection hits.
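If the logic really has to stay in a stored procedure, one workaround that follows from the single-partition restriction is to call it once per partition key value and merge the results on the client. This is not something the answer above proposes, only a minimal Python sketch of the idea: run_per_partition, fake_execute and fake_store are hypothetical stand-ins, and in real code the callable would wrap ExecuteStoredProcedureAsync with the matching PartitionKey.

```python
from typing import Any, Callable, Dict, Iterable, List


def run_per_partition(execute: Callable[[str], List[Any]],
                      partition_keys: Iterable[str]) -> Dict[str, List[Any]]:
    """Call `execute` once per partition key value and collect results by key."""
    results: Dict[str, List[Any]] = {}
    for key in partition_keys:
        # Each call is scoped to exactly one partition key value.
        results[key] = execute(key)
    return results


# Stand-in for the real stored procedure call (hypothetical data).
fake_store = {"SiteId1": ["doc-a"], "SiteId2": ["doc-b", "doc-c"], "SiteId3": []}


def fake_execute(site_id: str) -> List[Any]:
    return fake_store.get(site_id, [])


merged = run_per_partition(fake_execute, ["SiteId1", "SiteId2", "SiteId3"])
print(merged)
```

Keep in mind that each call is then its own transaction, so nothing is atomic across the different SITEID values.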

Use Traversal to query Azure Cosmos DB graph

I am trying to use Traversal to query an Azure Cosmos DB graph as follows
val cluster = Cluster.build(File("remote.yaml")).create()
val client = cluster.connect()
val graph = EmptyGraph.instance()
val g = graph.traversal().withRemote(DriverRemoteConnection.using(cluster))
val traversal = g.V().count()
val aliased = client.alias("g")
val result = aliased.submit(traversal)
val resultList = result.all().get()
resultList.forEach { println(it) }
The problem is that execution hangs after result.all().get() and never gets a response. I only have this problem when submitting a traversal; submitting a Gremlin query string directly works properly.
I'm in a similar boat, but according to this recent question, "Does Cosmos DB support Gremlin.net c# GLV?", traversals are not possible just yet. However, for those using (or thinking about using) Gremlin.NET to connect to Cosmos, I'll share some of what I've been able to do.
Firstly, I have no trouble connecting to Cosmos from the Gremlin console; the problem only occurs when using Gremlin.NET as follows:
var gremlinServer = new GremlinServer(hostname, port, enableSsl: true,
username: "/dbs/" + database + "/colls/" + collection,
password: authKey);
var driver = new DriverRemoteConnection(new GremlinClient(gremlinServer));
//var driver = new DriverRemoteConnection(new GremlinClient(new GremlinServer("localhost", 8182)));
var graph = new Gremlin.Net.Structure.Graph();
var g = graph.Traversal().WithRemote(driver);
g.V().Drop().Next(); // throws NullReferenceException
When using Gremlin.NET to work with a locally hosted Gremlin server (see the commented-out line), all works fine.
The only way I can work with Cosmos using Gremlin.NET is to submit queries as string literals, e.g.
var task = gremlinClient.SubmitAsync<dynamic>("g.V().Drop()");
This works, but I want to be able to use fluent traversals.
I can work with Cosmos quite easily using the Azure Graph API (DocumentClient etc.), but still only with string literals. Also, this isn't very portable, and it is apparently slower too.

How to get metrics for a request on CosmosDB graph collection?

I want to find out details about a Gremlin query, so I set the PopulateQueryMetrics property of the FeedOptions argument to true.
But the FeedResponse object I get back doesn't have the QueryMetrics property populated.
var queryString = $"g.addV('{d.type}').property('id', '{d.Id}')";
var query = client.CreateGremlinQuery<dynamic>(graphCollection, queryString,
new FeedOptions {
PopulateQueryMetrics = true
});
while (query.HasMoreResults)
{
FeedResponse<dynamic> response = await query.ExecuteNextAsync();
//response.QueryMetrics is null
}
Am I missing something?
Following your description, I created an Azure Cosmos DB account with the Gremlin (graph) API and ran into the same issue. I found the tutorial "Monitoring and debugging with metrics in Azure Cosmos DB"; its section "Debugging why queries are running slow" says the following:
In the SQL API SDKs, Azure Cosmos DB provides query execution statistics.
IDocumentQuery<dynamic> query = client.CreateDocumentQuery(
UriFactory.CreateDocumentCollectionUri(DatabaseName, CollectionName),
"SELECT * FROM c WHERE c.city = 'Seattle'",
new FeedOptions
{
PopulateQueryMetrics = true,
MaxItemCount = -1,
MaxDegreeOfParallelism = -1,
EnableCrossPartitionQuery = true
}).AsDocumentQuery();
FeedResponse<dynamic> result = await query.ExecuteNextAsync();
// Returns metrics by partition key range Id
IReadOnlyDictionary<string, QueryMetrics> metrics = result.QueryMetrics;
Then I queried my Cosmos DB Gremlin (graph) account via the SQL API as shown above and was able to retrieve the QueryMetrics.
Note: I also checked that you can specify a SQL expression like SELECT * FROM c where c.id='thomas' and c.label='person'. I do not know how to construct a SQL expression for adding a new vertex. Moreover, the CreateDocumentAsync method does not accept a FeedOptions parameter.
As far as I understand, the PopulateQueryMetrics setting may only work when using the SQL API. You could add your feedback here.

Pass list of strings as a Parameter in Parameterized Query in DocumentDB

Is there a way I can pass a list of strings in a SqlParameter? Let's say I have 10 authors and I want to find the books published by them. I know I can create 10 separate parameters (new SqlParameter), but is there a way to just pass a list and get the results?
IQueryable<Book> queryable = client.CreateDocumentQuery<Book>(collectionSelfLink,
    new SqlQuerySpec
    {
        QueryText = "SELECT * FROM books b WHERE (b.Author.Name = @name)",
        Parameters = new SqlParameterCollection()
        {
            new SqlParameter("@name", "Herman Melville")
        }
    });
I think what you are looking for is the SQL IN keyword, see this link for more information.
Usage example:
SELECT *
FROM books
WHERE books.Author.Name IN ('Helena Petrovna Blavatsky',
'Hermes Trismegistus', 'Heinrich Cornelius Agrippa')
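To keep the IN list parameterized instead of pasting literals into the query text, the placeholders can be generated from the list. The Python sketch below shows the idea; build_in_query is a made-up helper, and the dictionary it returns only mirrors the query text plus name/value pairs of a SqlQuerySpec, so treat the exact shape as an assumption.

```python
from typing import Any, Dict, List


def build_in_query(field: str, values: List[str], prefix: str = "@name") -> Dict[str, Any]:
    """Build a query spec with one named parameter per value in an IN list."""
    if not values:
        raise ValueError("IN list needs at least one value")
    # One placeholder per value: @name0, @name1, ...
    names = [f"{prefix}{i}" for i in range(len(values))]
    # `field` is spliced into the text, so it must come from trusted code,
    # never from user input; only the values are parameterized.
    query = f"SELECT * FROM books b WHERE b.{field} IN ({', '.join(names)})"
    parameters = [{"name": n, "value": v} for n, v in zip(names, values)]
    return {"query": query, "parameters": parameters}


spec = build_in_query("Author.Name", ["Herman Melville", "Hermes Trismegistus"])
print(spec["query"])
```

The same loop translates directly to C#: add one SqlParameter per author to the SqlParameterCollection and join the generated names into the IN clause.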
