How to merge lists of one document CosmosDb - azure-cosmosdb

I have a document which has 2 list attributes
{
CurrentDocument:[
{ DocName: "name1", DocType: "Identity" },
{ DocName: "name2", DocType: "Authorization" }
],
ClosedDocument:[
{ DocName: "name3", DocType: "Passport" }
]
}
I want a query that returns the DocName and DocType of every item in both lists.
I can't use JOIN because if one of the lists is empty, the query returns nothing.
Furthermore, with a JOIN I can't merge all the items into one list.
SELECT cur.DocName AS curName, clo.DocName AS cloName FROM c JOIN cur IN c.CurrentDocument JOIN clo IN c.ClosedDocument
This query is not what I'm looking for because:
if one list is empty, I lose all the data
I get n*m rows with duplicates (n: number of CurrentDocument items, m: number of ClosedDocument items)
I tried using a UNION expression, but I can't seem to make it work in a query.
Thanks in advance.

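Use a UDF for this.
Create the following UDF and register it with the id MergeLists (the id, not the JavaScript function name, is what the query refers to):
function userDefinedFunction(current, closed) {
    return current.concat(closed);
}
Use it in your query: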
SELECT udf.MergeLists(o.CurrentDocument, o.ClosedDocument) as merged FROM Orders o WHERE o.id = 'a811d13f-a308-4df1-85c1-31e566e9fc1e'
This returns a single result whose merged property is one array holding the items of both lists.
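The UDF above assumes both properties exist on the document. A minimal sketch of a more defensive body follows, assuming you also want documents where one list is missing entirely; the name mergeLists is illustrative and not part of the original answer.

```javascript
// Hypothetical null-safe variant of the UDF body (illustrative name).
function mergeLists(current, closed) {
    // Treat a missing or non-array property as an empty list,
    // so concat never runs on undefined.
    var a = Array.isArray(current) ? current : [];
    var b = Array.isArray(closed) ? closed : [];
    return a.concat(b);
}
```

Registered as a UDF, it would be called the same way as before, e.g. udf.MergeLists(c.CurrentDocument, c.ClosedDocument).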

Cosmos DB query on key-value pairs

I have a large collection of json documents whose structure is in the form:
{
"id": "00000000-0000-0000-0000-000000001122",
"typeId": 0,
"projectId": "p001",
"properties": [
{
"id": "a6fdd321-562c-4a40-97c7-4a34c097033d",
"name": "projectName",
"value": "contoso",
},
{
"id": "d3b5d3b6-66de-47b5-894b-cdecfc8afc40",
"name": "status",
"value": "open",
},
.....{etc}
]
}
There may be a lot of properties in the collection, all identified by the value of name. The fields in properties are pretty consistent: there may be some variability, but they will all have the fields that I care about. There's an Id, some labels, etc.
I want to combine these with some other data in Power BI, using the projectId, to create some very valuable reports.
I think what I want to do is 'normalize' this data into a table, like:
ProjectId | projectName | status | openDate | closeDate | manager
p001 | contoso | open | 20200101 | | me
etc
Where I'm at...
I can go:
SELECT c["value"] AS ProjectName
FROM c in t.Properties
WHERE c["name"] = "projectName"
... this will give me each projectName.
I can do that many times over to get the other values (status, openDate, manager, etc.).
If I want to combine them, I would need to join all those sub-queries on 'id'. But 'id' is not in the scope of the SELECT, so how do I get it? Doing it this way also sounds very expensive (in RUs) to execute.
I think I'm overcomplicating this, but I can't quite get my head around the Cosmos syntax.
Help??
You can achieve this with JOINs and WHERE expressions, although the schema is not ideal for querying and you should consider changing it.
SELECT
c['projectId'], --c.projectId also works; bracket syntax is needed for value, which is a reserved keyword
n['value'] AS projectName,
s['value'] AS status
FROM c
JOIN n IN c.properties
JOIN s IN c.properties
WHERE n['name'] = 'projectName' AND s['name'] = 'status'
--note: every filtered property must appear exactly once per document for this to work properly
Edit: a new query that removes the requirement that each filtered property appear exactly once.
SELECT
c['projectId'],
ARRAY(
SELECT VALUE n['value']
FROM n IN c.properties
WHERE n['name'] = 'projectName'
)[0] AS projectName,
ARRAY(
SELECT VALUE n['value']
FROM n IN c.properties
WHERE n['name'] = 'status'
)[0] AS status
FROM c
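To check what the subquery version computes for a single document, the same pivot can be sketched client-side. This is only an illustration under the assumption that documents have the shape shown in the question; toRow and wantedNames are hypothetical names, not part of the Cosmos DB API.

```javascript
// Pivot one document's name/value pairs into a flat row (illustrative helper).
function toRow(doc, wantedNames) {
    const row = { projectId: doc.projectId };
    for (const name of wantedNames) {
        // Mirrors ARRAY(...)[0]: take the first matching property,
        // or leave the column undefined when there is no match.
        const match = (doc.properties || []).find((p) => p.name === name);
        row[name] = match ? match.value : undefined;
    }
    return row;
}
```

As with the ARRAY(...)[0] form, a document that lacks one of the properties still produces a row; only that column is left empty.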

CosmosDB SQL String functions not working with a join?

I have a collection in DocumentDB with objects that look like this:
{
"id":"1de03a93-729d-43da-985a-12584079b4f8",
"Components":[
{
"Name":"MyComponentName1",
"Value": 12345
},
{
"Name":"MyComponentName2",
"Value": 34567
},
{
"Name":"MyComponentName3",
"Value": 56789
}
]
...other properties irrelevant to question...
}
When querying CosmosDB, I have the following query:
SELECT VALUE d FROM c
JOIN d IN c.Components
WHERE d.Name="MyComponentName1"
which correctly returns:
{
"Name":"MyComponentName1",
"Value":12345
}
However, when I attempt to query based on a String operator:
SELECT VALUE d FROM c
JOIN d IN c.Components
WHERE CONTAINS(d.Name,'MyComponent') --OR STARTSWITH OR ENDSWITH
I get no results.
If I take the same query as above but I add an id restriction to the where clause:
SELECT VALUE d FROM c
JOIN d IN c.Components
WHERE CONTAINS(d.Name,'MyComponent')
AND c.id = "1de03a93-729d-43da-985a-12584079b4f8"
I get back the results I expect, but obviously only for that id. I need all of the documents that match the String operator.
Is this a bug with CosmosDB, or am I doing something wrong?
Nick,
Make sure that you're following all the continuations when you execute this query. Keep in mind that a query with CONTAINS results in a full scan, so it might not finish in a single continuation. The same applies to ENDSWITH. STARTSWITH, however, should use the index, but only if the collection's indexing policy defines a range index on strings; otherwise it will still be a scan.
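Following all the continuations means reading pages until the iterator reports that none remain, rather than stopping at the first (possibly empty) page. A minimal sketch, assuming an iterator object that exposes hasMoreResults() and an async fetchNext() resolving to { resources: [...] }; that shape is modeled on the query iterator of the JavaScript SDK, and fetchAllPages is a hypothetical helper name.

```javascript
// Collects every page from a paged query iterator (illustrative helper).
// Assumption: iterator has hasMoreResults() and async fetchNext() -> { resources }.
async function fetchAllPages(iterator) {
    const results = [];
    while (iterator.hasMoreResults()) {
        const page = await iterator.fetchNext();
        // A page can be empty while more pages still remain,
        // so an empty page must not end the loop.
        results.push(...(page.resources || []));
    }
    return results;
}
```

The point of the loop condition is that an empty first page does not imply an empty result set when the query is executed as a scan.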

How to remove collection or edge document using for loop in ArangoDB?

I'm using the latest ArangoDB 3.1 on Windows 10.
I want to remove vertex documents and edge documents using a FOR loop, but I'm getting an error like "document not found (vName)".
vName contains many collection names, but I don't know how to use it in a FOR loop.
This is the AQL I am using to remove the documents from the graph:
LET op = (FOR v, e IN 1..1 ANY 'User/588751454' GRAPH 'my_graph'
COLLECT vid = v._id, eid = e._id
RETURN { vid, eid }
)
FOR doc IN op
COLLECT vName = SPLIT(doc.vid,'/')[0],
vid = SPLIT(doc.vid,'/')[1],
eName = SPLIT(doc.eid,'/')[0],
eid = SPLIT(doc.eid,'/')[1]
REMOVE { _key: vid } in vName
This is the output I'm getting from the AQL (Web UI screenshot).
vName is a variable introduced by COLLECT. It is a string with the collection name of a vertex (extracted from vid / v._id). You then try to use it in the removal operation REMOVE { ... } IN vName.
AQL does not support dynamic collection names, however; collection names must be known at query compile time:
Each REMOVE operation is restricted to a single collection, and the collection name must not be dynamic.
Source: https://docs.arangodb.com/3.2/AQL/Operations/Remove.html
So you either have to hardcode the collection into the query, e.g. REMOVE { ... } IN User, or use the special bind parameter syntax for collections, e.g. REMOVE { ... } IN @@coll with bind parameters {"@coll": "User", ...}.
This also means that REMOVE can only delete documents in a single collection.
It's possible to work around the limitation somewhat by using subqueries like this:
LET x1 = (FOR doc IN User REMOVE doc IN User)
LET x2 = (FOR doc IN relations REMOVE doc IN relations)
RETURN 1
The variables x1 and x2 are syntactically required and each receives an empty array as its subquery result. The query also requires a RETURN statement, even if we don't expect any result. As written, each subquery removes every document in its collection; add a FILTER to narrow it.
Do not attempt to remove from the same collection twice in the same query though, as it would raise an "access after data-modification" error.
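When the collection names are only known at run time, another option is to group the _id values by collection on the client and issue one REMOVE per collection through the collection bind parameter (written @@coll in the query and "@coll" in the bind variables). A minimal sketch; buildRemovals is a hypothetical helper, and it only builds the query objects, leaving execution to whichever driver you use.

```javascript
// Builds one REMOVE query per collection from a list of document _id values.
// Illustrative helper: it does not talk to the database.
function buildRemovals(ids) {
    const keysByCollection = new Map();
    for (const id of ids) {
        // An _id has the form "collection/key".
        const slash = id.indexOf("/");
        const coll = id.slice(0, slash);
        const key = id.slice(slash + 1);
        if (!keysByCollection.has(coll)) keysByCollection.set(coll, []);
        keysByCollection.get(coll).push(key);
    }
    // The collection name goes in as a bind parameter, never as a variable.
    return [...keysByCollection].map(([coll, keys]) => ({
        query: "FOR k IN @keys REMOVE k IN @@coll",
        bindVars: { "@coll": coll, keys: keys }
    }));
}
```

Each returned object targets exactly one collection, which matches the restriction quoted above.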

Join multiple children with DocumentDb

I am trying to run the following query on DocumentDb
SELECT p.id
FROM p JOIN filter IN p.Filters
WHERE filter.Id = "686e4c9c-f1ab-40ce-8472-cc5d63597263"
AND filter.Id = "caa2c2a0-cc5b-42e3-9943-dcda776bdc20"
My json is like this
{
"id": "a3dc570b-26e2-40a9-8777-683186965f78",
"Filters": [
{
"Id": "686e4c9c-f1ab-40ce-8472-cc5d63597263"
},
{
"Id": "caa2c2a0-cc5b-42e3-9943-dcda776bdc20"
}
]
}
I want to find the entities that have a child Filter with Id "686e4c9c-f1ab-40ce-8472-cc5d63597263" and a child Filter with Id "caa2c2a0-cc5b-42e3-9943-dcda776bdc20", but the query returns no results.
If I use OR instead of AND I can get results, but that's obviously not the results I want.
To query across multiple children within a single parent, you can use multiple JOINs.
Note: this can become an expensive query because it has to perform multiple cross products.
SELECT p.id
FROM p
JOIN filter1 IN p.Filters
JOIN filter2 IN p.Filters
WHERE filter1.Id = "686e4c9c-f1ab-40ce-8472-cc5d63597263"
AND filter2.Id = "caa2c2a0-cc5b-42e3-9943-dcda776bdc20"
Where the order of children is deterministic, you can avoid the cross product (JOIN) by using the child element's index in the query. This will be a faster and cheaper query:
SELECT p.id
FROM p
WHERE p.Filters[0].Id = "686e4c9c-f1ab-40ce-8472-cc5d63597263"
AND p.Filters[1].Id = "caa2c2a0-cc5b-42e3-9943-dcda776bdc20"
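The original query returns nothing because a single JOIN alias is bound to one child at a time, and one child cannot have two different Ids. The condition both rewrites express is "some child has the first Id and some (possibly different) child has the second". A minimal client-side sketch of that condition, with hasAllFilters as a hypothetical helper name:

```javascript
// Returns true when the parent has a child filter for every wanted Id.
// Illustrative helper; the document shape is taken from the question.
function hasAllFilters(parent, wantedIds) {
    const present = new Set((parent.Filters || []).map((f) => f.Id));
    // Each wanted Id may be satisfied by a different child.
    return wantedIds.every((id) => present.has(id));
}
```

Unlike the index-based query, this check does not depend on the order of the children.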

ASP.NET: Linq2SQL: selecting all names matching an id

Got 2 tables: db.Tags (ID, TagName) and db.Names (ID, Name, TagID).
I want to fetch all db.Tags rows, and all the Names matching the TagID.
So it will look like
ID - TagName - Names
1 - tag1 - name1, name2, name3
2 - tag2 - name4, name5, name6
Is this possible in one (long) LINQ query?
Or do I have to get all the tags, then for each tag get all the names, then loop over the names to put them in one long string?
Thanks in advance!
EDIT:
Okay, see my comment on the second answer (first one up); this is what I tried, but I get some compiler errors:
var tags =
from t in db.Tags
orderby t.Priority ascending
select new {
t.ID,
t.Name,
t.Priority,
Places = String.Join(", ",
(from p in db.Places
join o in db.TagToPlaces on new {
p.ID,
t.ID
}
equals new {
o.PlaceId,
o.TagId
}
select p.Name
).ToArray()
)
}
);
I think this is what you're after:
var query =
from t in db.Tags
select new
{
t.ID,
t.TagName,
Names = String.Join(", ",
(from n in db.Names
where n.TagID == t.ID
select n.Name)
.ToArray()),
};
With this I get the same sort of output that you gave in your question. I also understood that you want to output the tag id and name even when there are no associated name records - my query does that.
Now depending on if you're using EF or LINQ-to-SQL or something else you may need to add .ToArray() to the db.Tags & db.Names references to force the database query to occur.
If you have a large number of tag records you'll find you have a large number of queries going to the database. You could make this change to reduce it to only two queries:
var tags = db.Tags.ToArray();
var names = db.Names.ToArray();
var query =
from t in tags
select new
{
t.ID,
t.TagName,
Names = String.Join(", ",
(from n in names
where n.TagID == t.ID
select n.Name)
.ToArray()),
};
Now you just need to make sure that your data fits into memory - but it sounds like it should. I hope this helps.
Since the concatenation is a pain in T-SQL, I would query the 3 values "as is", and format from there:
var list = (from tag in db.Tags
join name in db.Names
on tag.ID equals name.TagID
orderby tag.ID
select new { tag.ID, tag.TagName, name.Name }).ToList();
for example, if I wanted the names by tag-id, I could do:
var namesByTag = list.ToLookup(row => row.ID, row => row.Name);
(or whatever else you choose)
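The lookup-then-format idea is independent of LINQ: group the names by tag id once, then build each output row from the lookup. A minimal sketch of that shape, written in JavaScript purely for illustration (the answers above are C#); namesByTag and the in-memory arrays are assumptions.

```javascript
// Groups names by TagID once, then formats one row per tag (illustrative helper).
function namesByTag(tags, names) {
    const lookup = new Map();
    for (const n of names) {
        if (!lookup.has(n.TagID)) lookup.set(n.TagID, []);
        lookup.get(n.TagID).push(n.Name);
    }
    return tags.map((t) => ({
        ID: t.ID,
        TagName: t.TagName,
        // Tags without names still get a row, with an empty string.
        Names: (lookup.get(t.ID) || []).join(", ")
    }));
}
```

Because the grouping happens once up front, each tag's names are found by key rather than by rescanning the whole names list.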
