I’m a total beginner in Cypher and I’m struggling to obtain the result I want.
My nodes all have a property called "level". I want to keep only the nodes at or above a certain level, but I also want to recreate the links that go missing when the other nodes are dropped.
Here is my dataset:
in CSV:
n
"{owner:Team A,name:MySubscription,level:1}"
"{name:Database,level:2}"
"{owner:Team A,name:Service A,level:3}"
"{owner:Team A,name:MyTopic,level:2}"
"{name:Service B,level:3}"
"{name:Service C,level:3}"
"{name:MySecret,level:1}"
I want to keep only the nodes that are level >= 2 but I want to recreate the links like so:
Could you help me create the query that does just this?
I'm not sure it's the best way to do it, but I did find an answer:
MATCH (a:Asset)-[rel]-(b:Asset) WHERE a.level >= 2 AND b.level >= 2
RETURN a, rel, b
UNION
MATCH (a:Asset) -[:USING]-(:Asset)-[:ATTACHED]-(b:Asset) WHERE a.level >= 2
AND b.level >= 2
CALL apoc.create.vRelationship(a,'USING',{}, b) YIELD rel
RETURN a, rel, b
UNION
MATCH (a) WHERE NOT (a)--()
RETURN a, null as rel, null as b;
I am trying to identify what records exist in table 1 that are not in table 2 (so essentially using NOT IN)
let outliers =
Table 2
| project UniqueEventGuid;
Table 1
|where UniqueEventGuid !in (outliers)
|project UniqueEventGuid
But I'm getting 0 records back even though I know there are orphans in Table 1.
Is !in not the right syntax?
Thanks in advance!
From the documentation of the !in operator:
"In tabular expressions, the first column of the result set is
selected."
In the following example I intentionally ordered the columns so that the query fails with an error due to mismatched data types.
In your case the data types might match, so the query is valid but the results are wrong.
let t1 = datatable(i:int, x:string)[1,"A", 2,"B", 3,"C" ,4,"D" ,5,"E"];
let t2 = datatable(y:string, i:int)["d",4 ,"e",5 ,"f",6 ,"g",7];
t1
| where i !in (t2)
Relop semantic error: SEM0025: One of the values provided to the
'!in' operator does not match the left side expression type 'int',
consider using explicit cast
If that is indeed the case, you can reorder the columns or project only the relevant one.
Note the use of double brackets.
let t1 = datatable(i:int, x:string)[1,"A", 2,"B", 3,"C" ,4,"D" ,5,"E"];
let t2 = datatable(y:string, i:int)["d",4 ,"e",5 ,"f",6 ,"g",7];
t1
| where i !in ((t2 | project i))
i  x
1  A
2  B
3  C
Another option is to use a leftanti join:
let t1 = datatable(i:int, x:string)[1,"A", 2,"B", 3,"C" ,4,"D" ,5,"E"];
let t2 = datatable(y:string, i:int)["d",4 ,"e",5 ,"f",6 ,"g",7];
t1
| join kind=leftanti t2 on i
i  x
2  B
3  C
1  A
I'm performing an Sqlite3 query similar to
SELECT * FROM nodes WHERE name IN ('name1', 'name2', 'name3', ...) LIMIT 1
Am I guaranteed that it will search for name1 first, name2 second, etc? Such that by limiting my output to 1 I know that I found the first hit according to my ordering of items in the IN clause?
Update: with some testing it seems to always return the first hit in the index regardless of the IN order. It's using the order of the index on name. Is there some way to enforce the search order?
The order of the returned rows is not guaranteed to match the order of the items inside the parentheses after IN.
What you can do is use ORDER BY in your statement with the use of the function INSTR():
SELECT * FROM nodes
WHERE name IN ('name1', 'name2', 'name3')
ORDER BY INSTR(',name1,name2,name3,', ',' || name || ',')
LIMIT 1
This code uses the same list from the IN clause as a string, where the items are in the same order, concatenated and separated by commas, assuming that the items do not contain commas.
This way the results are ordered by their position in the list and then LIMIT 1 will return the 1st of them which is closer to the start of the list.
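As a rough illustration of the INSTR() ordering, here is a minimal sketch using Python's built-in sqlite3 module. The table contents and the names `wanted`, `haystack` and `first_hit` are made up for the example; only the query shape comes from the answer above, and it still assumes the items contain no commas.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (name TEXT PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?)",
    [("alpha", "a"), ("beta", "b"), ("gamma", "c")],  # hypothetical rows
)

# Priority order: prefer 'gamma', fall back to 'alpha'.
wanted = ["gamma", "alpha"]
placeholders = ", ".join("?" for _ in wanted)
# Same items as one comma-delimited string: ',gamma,alpha,'
haystack = "," + ",".join(wanted) + ","

sql = (
    f"SELECT name FROM nodes WHERE name IN ({placeholders}) "
    "ORDER BY INSTR(?, ',' || name || ',') LIMIT 1"
)
# The IN parameters come first, then the string that INSTR searches.
first_hit = conn.execute(sql, wanted + [haystack]).fetchone()
print(first_hit)  # ('gamma',)
```

Building the delimited string from the same Python list keeps the IN items and the ordering string in sync, which is the main risk when the list is typed out twice by hand.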
Another way to achieve the same results is by using a CTE which returns the list along with an Id which serves as the desired ordering of the results, which will be joined to the table:
WITH list(id, item) AS (
SELECT 1, 'name1' UNION ALL
SELECT 2, 'name2' UNION ALL
SELECT 3, 'name3'
)
SELECT n.*
FROM nodes n INNER JOIN list l
ON l.item = n.name
ORDER BY l.id
LIMIT 1
Or:
WITH list(id, item) AS (
SELECT * FROM (VALUES
(1, 'name1'), (2, 'name2'), (3, 'name3')
)
)
SELECT n.*
FROM nodes n INNER JOIN list l
ON l.item = n.name
ORDER BY l.id
LIMIT 1
This way you don't have to write the list twice.
I'm finding it hard to get my head around this problem, and I couldn't find any answers to this specific problem anywhere:
Say I have a table like this, I'm just using fruit as an example:
Fruit | Date | Value
=================================
Apple | 1 | other_random_value
Apple | 2 | some_value_1
Apple | 3 | some_value_2
Pear | 1 | other_random_value
Pear | 2 | unexpected_value_1
Pear | 3 | some_value_2
Everything will be ordered by Fruit, then Date.
Basically, if the last row (for each fruit) is some_value_2, but the one preceding it is not some_value_1, I want to match just those fruits (i.e. in this case, Pear).
So I always expect some_value_2 to come after a row with a certain value for that particular fruit, and if it doesn't, I want to flag errors against those particular fruits. It would also be nice to match cases where nothing precedes some_value_2, though if this is too complicated I could match it separately and just check that some_value_2 is not the first row, which I don't imagine would be a difficult query.
EDIT: Also, being able to match any consecutive rows where the preceding value is unexpected would be nice, though I mainly care about the last 2 rows. So if being able to match all consecutive rows results in a simpler and better performing query, then I might go with that. I'm going to be doing an INSERT at the same time (into an alert table), so if I could flag it as an ERROR if it's the last two rows and a WARNING if it's not, that would be really nifty. Though I wouldn't know where to start with writing a query that does that. Also having a query that performs well is a must, as I will be using this across a large dataset.
EDIT:
This is what I used in the end. It's quite slow, but if I index Date it's not so bad:
SELECT c.Id AS CId, c.Fruit AS CFruit,
c.Date AS CDate, c.Value AS CValue,
(SELECT Id
FROM fruits
WHERE Fruit = c.Fruit
AND Date >= c.Date
AND Id > c.Id
ORDER BY Date, Id) AS NId, n.Fruit AS NFruit,
n.Date AS NDate, n.Value AS NValue
FROM fruits AS c
JOIN fruits AS n ON n.Id = NId
ORDER BY c.Date, c.Id
I might try Joachim's method again at some point, as I realised I'm getting a lot of results I don't really care much about. Or I might even try incorporating the two somehow and delegate to INFO/ERROR as appropriate...
Solved: I used the same SELECT statement that I used to get NId, and used SELECT COUNT(*) instead of SELECT Id. This told me the number of results after the current one. Then I just used a CASE operator to turn it into a boolean field called Latest :). So I effectively combined Nicolas' and Joachim's methods. Performance still seems OK, probably because SQLite caches the results.
SQLite is (as far as I know) a bit low on efficient operators for this, so this is the best I can come up with for now :)
SELECT Fruit FROM fruits
WHERE ( SELECT COUNT(*) FROM fruits f
WHERE f.fruit=fruits.fruit
AND f.date > fruits.date ) = 1
AND fruits.value <> 'some_value_1'
INTERSECT
SELECT Fruit FROM fruits
WHERE ( SELECT COUNT(*) FROM fruits f
WHERE f.fruit=fruits.fruit
AND f.date > fruits.date ) = 0
AND fruits.value = 'some_value_2'
An SQLfiddle to test with.
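For anyone who wants to try the query above locally, here is a small sketch that runs it through Python's sqlite3 module against the sample data from the question. The `Id` column and the extra single-row fruit 'Plum' are assumptions added for the example; Plum is there to show that a fruit whose only row is some_value_2 is not returned by this query, so that case would still need its own check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fruits (Id INTEGER PRIMARY KEY, Fruit TEXT, Date INTEGER, Value TEXT)"
)
conn.executemany(
    "INSERT INTO fruits (Fruit, Date, Value) VALUES (?, ?, ?)",
    [
        ("Apple", 1, "other_random_value"),
        ("Apple", 2, "some_value_1"),
        ("Apple", 3, "some_value_2"),
        ("Pear", 1, "other_random_value"),
        ("Pear", 2, "unexpected_value_1"),
        ("Pear", 3, "some_value_2"),
        ("Plum", 1, "some_value_2"),  # hypothetical: nothing precedes it
    ],
)

query = """
SELECT Fruit FROM fruits
WHERE ( SELECT COUNT(*) FROM fruits f
        WHERE f.fruit = fruits.fruit
          AND f.date > fruits.date ) = 1      -- second-to-last row
  AND fruits.value <> 'some_value_1'
INTERSECT
SELECT Fruit FROM fruits
WHERE ( SELECT COUNT(*) FROM fruits f
        WHERE f.fruit = fruits.fruit
          AND f.date > fruits.date ) = 0      -- last row
  AND fruits.value = 'some_value_2'
"""
flagged = conn.execute(query).fetchall()
print(flagged)  # [('Pear',)]
```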
I named the table fruits. This query gets you the preceding date for a 'key' (fruit + date):
select fruit, date, value currvalue,
(select max(date) precedingDate
from fruits p
where p.fruit = c.fruit
and p.date < c.date) precedingdate
from fruits c ;
From there we can get the preceding value for each key. Note that f1 is the preceding row and f2 is the current one:
select f2.fruit, f2.date, f2.value currvalue,
precedingdate, f1.value precedingvalue
from
fruits f1 join
(select fruit, date, value,
(select max(date) precedingDate
from fruits p
where p.fruit = c.fruit
and p.date < c.date) precedingdate
from fruits c) f2
on f1.fruit = f2.fruit and f1.date = precedingdate ;
For all the rows that have a previous row, you get both the current and preceding date and the current and preceding value.
Edit: we add an id, used to pick one row when several rows share the same previous date (see comment below).
I will be using intermediate views for the sake of clarity but you could write one big query.
As before, what's the previous date :
create view VFruitsWithPreviousDate
as select fruit, date, value, id,
(select max(date)
from fruits p
where p.fruit = c.fruit
and p.date < c.date) previousdate
from fruits c ;
What's the previous id :
create view VFruitsWithPreviousId
as select fruit, date, value,
(select max(id)
from fruits f
where v.fruit = f.fruit AND
v.previousdate = f.date) previousID
from VFruitsWithPreviousDate v ;
A query for all consecutive rows :
select f.*, v.value
from fruits f
join VFruitsWithPreviousId v on f.id = v.previousid ;
Here f is the preceding row and v is the current one, so you can then add the condition WHERE v.value = 'some_value_2' AND f.value != 'some_value_1'
I don't know if I'm being dumb here but I can't seem to find an efficient way to do this. I wrote a very long and inefficient query that does what I need, but what I WANT is a more efficient way.
I have 2 result sets that each display an ID (a PK which is generic/from the same source in both sets) and a FLAG (A - Approve and V - Validate).
Result Set 1
ID FLAG
1 V
2 V
3 V
4 V
5 V
6 V
Result Set 2
ID FLAG
2 A
5 A
7 A
8 A
I want to "merge" these two sets to give me this output:
ID FLAG
1 V
2 (V/A)
3 V
4 V
5 (V/A)
6 V
7 A
8 A
Neither of the 2 result sets will at any time contain all the IDs, so a simple left join from one set to the other with a case statement is not an option.
I'm currently doing a union between the two sets to get ALL the IDs. Thereafter I left join the 2 result sets to get the required '(V/A)' by use of a case statement.
There must be a more efficient way but I just can't seem to figure it out now as I'm running low on amps... I need a holiday... :-/
Thanks in advance!
Use a FULL OUTER JOIN:
SELECT ID,
CASE
WHEN t1.FLAG IS NULL THEN t2.FLAG
WHEN t2.FLAG IS NULL THEN t1.FLAG
ELSE '(' || t1.FLAG || '/' || t2.FLAG || ')'
END AS MERGED_FLAG
FROM TABLE1 t1
FULL OUTER JOIN TABLE2 t2
USING (ID)
ORDER BY ID
See this SQLFiddle.
Share and enjoy.
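If you want to check the expected output without a database that supports FULL OUTER JOIN, the sketch below emulates it with a LEFT JOIN plus a UNION ALL of the rows that exist only in the second set. This is a swapped-in technique, not the query from the answer: it runs on SQLite through Python's sqlite3 module (older SQLite builds lack FULL OUTER JOIN), and the table names TABLE1 and TABLE2 are simply reused from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TABLE1 (ID INTEGER PRIMARY KEY, FLAG TEXT);
CREATE TABLE TABLE2 (ID INTEGER PRIMARY KEY, FLAG TEXT);
INSERT INTO TABLE1 VALUES (1,'V'),(2,'V'),(3,'V'),(4,'V'),(5,'V'),(6,'V');
INSERT INTO TABLE2 VALUES (2,'A'),(5,'A'),(7,'A'),(8,'A');
""")

query = """
-- every row of TABLE1, with the TABLE2 flag appended when one exists
SELECT t1.ID AS ID,
       CASE WHEN t2.FLAG IS NULL THEN t1.FLAG
            ELSE '(' || t1.FLAG || '/' || t2.FLAG || ')'
       END AS MERGED_FLAG
FROM TABLE1 t1
LEFT JOIN TABLE2 t2 ON t2.ID = t1.ID
UNION ALL
-- rows that exist only in TABLE2
SELECT t2.ID, t2.FLAG
FROM TABLE2 t2
LEFT JOIN TABLE1 t1 ON t1.ID = t2.ID
WHERE t1.ID IS NULL
ORDER BY ID
"""
merged = conn.execute(query).fetchall()
for row in merged:
    print(row)
```

On the sample data this produces the eight rows from the question, with IDs 2 and 5 shown as (V/A).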
I think that you can use xmlagg. Here is an example:
SELECT deptno,
SUBSTR (REPLACE (REPLACE (XMLAGG (XMLELEMENT ("x", ename)
ORDER BY ename),'</x>'),'<x>','|'),2) as concated_list
FROM emp
GROUP BY deptno
ORDER BY deptno;
Bye
I'm getting a strange result from a SQLite query. Here is the query:
SELECT rule FROM rules
WHERE idRule = (SELECT idRuleForeign FROM rulesXfilter
WHERE idFilterForeign = (SELECT idFilter FROM filters
WHERE name = 'Filter1'));
Now, let's suppose that I have the following tables with a few rows in them.
filters              rules              rulesXfilter
idFilter  name       idRule  rule       idRuleForeign  idFilterForeign
1         Filter1    1       Rule1      1              1
2         Filter2    2       Rule2      2              1
                     3       Rule3      3              1
                                        2              2
What I get is {Rule1}, although I think I should get {Rule1, Rule2, Rule3}
What am I doing wrong?
Select idRuleForeign... returns multiple results, yes ({1, 2, 3}). However, you then ask for the rule where idRule = {SET}, and SQL doesn't treat that as a set comparison. A subquery compared with = is a scalar subquery, and SQLite uses only its first row, which is why you get just Rule1.
The solution is to use joins. Nested selects like that, while they work most of the time, can really slow down your query. If I got my syntax right, the following should do what you need:
SELECT r.rule FROM rules r
JOIN rulesXfilter rf ON r.idRule = rf.idRuleForeign
JOIN filters f ON f.idFilter = rf.idFilterForeign
WHERE f.name = 'Filter1'
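To see the difference on the sample data, here is a minimal sketch with Python's sqlite3 module. It runs the original scalar-subquery version, a variant that only swaps the outer = for IN (an alternative not mentioned above, included for comparison), and the JOIN version. The ORDER BY clauses are added so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE filters (idFilter INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE rules (idRule INTEGER PRIMARY KEY, rule TEXT);
CREATE TABLE rulesXfilter (idRuleForeign INTEGER, idFilterForeign INTEGER);
INSERT INTO filters VALUES (1, 'Filter1'), (2, 'Filter2');
INSERT INTO rules VALUES (1, 'Rule1'), (2, 'Rule2'), (3, 'Rule3');
INSERT INTO rulesXfilter VALUES (1, 1), (2, 1), (3, 1), (2, 2);
""")

# Original query: '=' makes the subquery scalar, so only one row is used.
scalar_rows = conn.execute("""
    SELECT rule FROM rules
    WHERE idRule = (SELECT idRuleForeign FROM rulesXfilter
                    WHERE idFilterForeign = (SELECT idFilter FROM filters
                                             WHERE name = 'Filter1'))
""").fetchall()

# Same shape, but IN compares against every row of the subquery.
in_rows = conn.execute("""
    SELECT rule FROM rules
    WHERE idRule IN (SELECT idRuleForeign FROM rulesXfilter
                     WHERE idFilterForeign = (SELECT idFilter FROM filters
                                              WHERE name = 'Filter1'))
    ORDER BY idRule
""").fetchall()

# JOIN version from the answer.
join_rows = conn.execute("""
    SELECT r.rule FROM rules r
    JOIN rulesXfilter rf ON r.idRule = rf.idRuleForeign
    JOIN filters f ON f.idFilter = rf.idFilterForeign
    WHERE f.name = 'Filter1'
    ORDER BY r.idRule
""").fetchall()

print(len(scalar_rows), in_rows, join_rows)
```

The scalar version returns a single rule, while both the IN and JOIN versions return Rule1, Rule2 and Rule3.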