I am making a graph database using Neo4j and I'm wondering what's the best way to model this case:
Person1 -> told -> quote -> to -> Person2 -> who told it to -> Person3 -> who told it to -> Person4 -> who told it to -> Person1
I've thought about making the quote an attribute of the relationship, but maybe the quote also needs to be a node. In that case the edges would be "told" and "was_told", like:
Person1 -> created > quote
Quote attributes: id, text
Person attributes: id, name
Person2 > told: {to: Person 3} > quote
Person3 > was_told: {by: Person2} > quote
or:
Person3 > told:quote > Person1
What's the best approach for modeling this?
I think you need the following model:
A fragment (talk) of a conversation (including time)
Who was the speaker of this fragment
Who was an audience of this fragment
Content (quote) of this fragment
For example, here's the code for creating the first fragment:
MERGE (P1:Person {name:'Person1'})
MERGE (P2:Person {name:'Person2'})
MERGE (Q:Quote {name:'Quote1', text:'Quote1 text'})
MERGE (P1)<-[:has_speaker]-(T1:Talk {name:'Talk1', time: 1})-[:has_audience]->(P2)
MERGE (T1)-[:talk_about]->(Q)
The query for the entire life cycle of a quote:
MATCH (Q:Quote {name:'Quote1', text:'Quote1 text'})<-[:talk_about]-(T:Talk)
WITH Q, T
MATCH (P1:Person)<-[:has_speaker]-(T)-[:has_audience]->(P2)
WITH Q, T, P1 as speaker, collect(P2.name) as audience ORDER BY T.time ASC
RETURN Q as quote,
collect( {time: T.time,
speaker: speaker.name,
audience: audience}
) as quoteTimeline
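To make the shape of the result concrete, here is a minimal in-memory sketch in Python of what the timeline query computes. It does not talk to Neo4j. The `Talk` dataclass and the `quote_timeline` helper are hypothetical names, and the second and third talks are illustrative sample data that follow the chain described in the question.

```python
from dataclasses import dataclass

@dataclass
class Talk:
    """One fragment of a conversation (hypothetical stand-in for a :Talk node)."""
    time: int
    speaker: str
    audience: list
    quote: str

def quote_timeline(talks, quote_name):
    """Return the talks about one quote, ordered by time, as a list of dicts."""
    relevant = sorted(
        (t for t in talks if t.quote == quote_name),
        key=lambda t: t.time,
    )
    return [
        {"time": t.time, "speaker": t.speaker, "audience": t.audience}
        for t in relevant
    ]

# Sample data (assumed): Person1 -> Person2 -> Person3 -> Person4.
talks = [
    Talk(time=2, speaker="Person2", audience=["Person3"], quote="Quote1"),
    Talk(time=1, speaker="Person1", audience=["Person2"], quote="Quote1"),
    Talk(time=3, speaker="Person3", audience=["Person4"], quote="Quote1"),
]
timeline = quote_timeline(talks, "Quote1")
```

Each entry of `timeline` corresponds to one map in `quoteTimeline`: a time, the speaker's name, and the list of audience names.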
I have the following vertices -
Person1 -> Device1 <- Person2
   |          ^
   v          |
 Email1 <- Person3
Now I want to write a Gremlin query (JanusGraph) that returns all persons who are connected to Person1 only through a device.
So for the graph above, the output should be [Person2].
Person3 is not in the output because Person3 is also connected to Person1 through "Email1".
g.addV('person').property('name', 'Person1').as('p1').
addV('person').property('name', 'Person2').as('p2').
addV('person').property('name', 'Person3').as('p3').
addV('device').as('d1').
addV('email').as('e1').
addE('HAS_DEVICE').from('p1').to('d1').
addE('HAS_EMAIL').from('p1').to('e1').
addE('HAS_DEVICE').from('p2').to('d1').
addE('HAS_DEVICE').from('p3').to('d1').
addE('HAS_EMAIL').from('p3').to('e1')
The following traversal returns the person vertices that are connected to "Person1" via one or more "device" vertices and not via any other type of vertex.
g.V().has('person', 'name', 'Person1').as('p1').
out().as('connector').
in().where(neq('p1')).
group().
by().
by(select('connector').label().fold()).
unfold().
where(
select(values).
unfold().dedup().fold(). // just in case the persons are connected by multiple devices
is(eq(['device']))
).
select(keys)
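The grouping logic of this traversal can be sketched outside Gremlin. The Python below is only an illustration of the idea (collect the labels of the connecting vertices per person, then keep the persons whose label set is exactly the wanted one); it is not a JanusGraph API. The `edges` list mirrors the sample graph, and `connected_only_via` is a hypothetical helper name.

```python
from collections import defaultdict

# Edges of the sample graph as (person, connector vertex, connector label).
edges = [
    ("Person1", "d1", "device"),
    ("Person1", "e1", "email"),
    ("Person2", "d1", "device"),
    ("Person3", "d1", "device"),
    ("Person3", "e1", "email"),
]

def connected_only_via(edges, start, label):
    """Persons sharing connectors with `start`, all of which carry `label`."""
    # Connectors reachable from the start person (the out() step).
    connectors = {(c, lbl) for p, c, lbl in edges if p == start}
    # For every other person on those connectors, collect the connector labels
    # (the in() step followed by group()).
    labels_by_person = defaultdict(set)
    for person, connector, lbl in edges:
        if person != start and (connector, lbl) in connectors:
            labels_by_person[person].add(lbl)
    # Keep only persons whose deduplicated label set is exactly {label}.
    return sorted(p for p, labels in labels_by_person.items() if labels == {label})

result = connected_only_via(edges, "Person1", "device")
```

Using a set per person plays the role of `dedup().fold()` in the traversal: a person connected through several devices still ends up with the single label "device".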
I'm trying to dynamically create a nested map in code, like the one below.
def people = [
[name: 'Ash', age: '21', gender: 'm'],
[name: 'Jo', age: '22', gender: 'f'],
[name: 'etc.', age: '42', gender: 'f']
]
so that I can search it like this:
person = people.findAll {item ->
item.gender == 'm' &&
item.age == '21'}
My problem is that whilst I can dynamically create one-dimensional maps in code, I don't know how to combine them dynamically into a nested map. For example, assume I have created two maps in code, name1 and name2. How do I add them to the people map so they are nested as in the example above?
def people = [:]
def name1 = [name:'ash', age:'21', gender:'m']
def name2 = [name:'Jo', age:'22', gender:'f']
I've searched / tried so many posts without success. Below is close, but does not work :(
people.put((),(name1))
people.put((),(name2))
In your example, people is a list of maps, not a nested map.
So you can simply do:
def people = []
def name1 = [name:'ash', age:'21', gender:'m']
def name2 = [name:'Jo', age:'22', gender:'f']
Then:
people += name1
people += name2
Or define it in one line:
def people = [name1, name2]
I am running the following set of commands in Pig. My data set has one row for each student in a class and each student has a number of grades. Student name is tab separated from grades for that student. The scores for each student are comma separated. I need to find the average grade for each student.
After grouping, I can successfully get the count of grades for each student, but I cannot get the average score for each student. Pig complains that it cannot open the iterator when it is averaging. I am confused, since the iterator for both aggregate functions, COUNT and AVG, is the same. I am not sure what I am missing. Any help is appreciated.
Scripts:
grunt> A = LOAD 'grades.txt' USING PigStorage('\t') AS
(f1:chararray,f2:chararray);
grunt> dump A;
(s14,59,94,81)
(s15,60,77)
(s16,77,77)
(s17,76,76)
(s18,19,61,72)
(s20,34,35)
grunt> B = foreach A generate f1 as stu, Flatten(TOKENIZE(f2)) as (grade:int);
grunt> describe B;
B: {stu: chararray,grade: int}
grunt> dump B;
(s14,59)
(s14,94)
(s14,81)
(s15,60)
(s15,77)
(s16,77)
(s16,77)
(s17,76)
(s17,76)
(s18,19)
(s18,61)
(s18,72)
(s20,34)
(s20,35)
grunt> grp = group B by stu;
grunt> cnt = foreach grp generate group, COUNT(B.grade);
grunt> dump cnt;
(s14,3)
(s15,2)
(s16,2)
(s17,2)
(s18,3)
(s20,2)
grunt> avg = foreach grp generate group, AVG(B.grade);
grunt> dump avg;
2015-03-20 21:56:30,900 ERROR org.apache.pig.tools.pigstats.PigStatsUtil:
1 map reduce job(s) failed!
2015-03-20 21:56:30,907 ERROR org.apache.pig.tools.grunt.Grunt: ERROR 1066:
Unable to open iterator for alias avg
Details at logfile: /home/training/pig/pig_1426902869706.log
grunt>
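As mentioned in the comments, a workaround was found. TOKENIZE produces chararray tokens, so the grade presumably has to be cast explicitly before AVG can process it; COUNT does not look at the values, which would explain why it succeeded.
Change
B = foreach A generate f1 as stu, Flatten(TOKENIZE(f2)) as (grade:int)
to
B = foreach A generate f1 as stu, Flatten(TOKENIZE(f2)) as grade
and then cast the grade to int in a separate step:
C = foreach B generate stu as stu, (int)grade as grade;
Then group C instead of B before applying AVG.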
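To check the Pig output, the expected averages can be computed independently. The following is a small Python sketch, not part of the Pig script: it assumes the input format described in the question (student name, a tab, then comma-separated grades) and uses the rows shown in the `dump A` output. The function name `average_grades` is hypothetical.

```python
def average_grades(lines):
    """Map each student to the mean of their comma-separated grades."""
    averages = {}
    for line in lines:
        student, grades = line.rstrip("\n").split("\t")
        # Explicit cast from text to int, the step the Pig workaround adds.
        values = [int(g) for g in grades.split(",")]
        averages[student] = sum(values) / len(values)
    return averages

# Rows taken from the `dump A` output in the question.
lines = [
    "s14\t59,94,81",
    "s15\t60,77",
    "s16\t77,77",
    "s17\t76,76",
    "s18\t19,61,72",
    "s20\t34,35",
]
averages = average_grades(lines)
```

For example, `averages["s14"]` is 78.0 and `averages["s15"]` is 68.5, which is what the Pig AVG over the cast grades should also report.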
I have a file .txt with 3 columns: ID-polygon-1, ID-polygon-2 and distance.
When I import my file into NetLogo, I obtain 3 lists [[list1][list2][list3]], which correspond to the 3 columns.
I used table:from-list list to create a table with the content of 3 lists.
I obtain {{table: [[1 1] [67 518] [815 127]]}} (The table displays the first two lines of my dataset).
For example, I would like to get the value of distance (list3) between ID-polygon-1 = 1 (list1) and ID-polygon-2 = 67 (list2), that is, 815.
How can I use table:get table key when I need 2 keys (ID-polygon-1 and ID-polygon-2)?
Thanks very much for your help.
Using table:from-list will not help you there: it expects "a list of two element lists, or pairs" where "the first element in the pair is the key and the second element is the value." That's not what you have in your original list.
Furthermore, NetLogo tables (and associative arrays in general) cannot have two keys. They are always just key-value pairs. Nothing prevents the value from being another table, however, and in your case, that is what you need: a table of tables!
There is no primitive to build that directly, however. You will need to build it yourself:
extensions [ table ]
globals [ t ]
to setup
let lists [
[ 1 1 ] ; ID-polygon-1 column
[ 67 518 ] ; ID-polygon-2 column
[ 815 127 ] ; distance column
]
set t table:make
foreach n-values length first lists [ ? ] [
let id1 item ? (item 0 lists)
let id2 item ? (item 1 lists)
let dist item ? (item 2 lists)
if not table:has-key? t id1 [
table:put t id1 table:make
]
table:put (table:get t id1) id2 dist
]
end
Here is what you get when you print the resulting table:
{{table: [[1 {{table: [[67 815] [518 127]]}}]]}}
And here is a small reporter to make it convenient to get a distance from the table:
to-report get-dist [ id1 id2 ]
report table:get (table:get t id1) id2
end
Using get-dist 1 67 will give the 815 result you were looking for.
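The same "table of tables" idea can be illustrated with nested dictionaries. This Python sketch is only meant to show the data structure and is not NetLogo code; `build_distance_table` is a hypothetical name, and `get_dist` mirrors the reporter above.

```python
def build_distance_table(id1_column, id2_column, distance_column):
    """Build {id1: {id2: distance}} from three parallel columns."""
    table = {}
    for id1, id2, dist in zip(id1_column, id2_column, distance_column):
        # Create the inner table on first use, then store the distance.
        table.setdefault(id1, {})[id2] = dist
    return table

def get_dist(table, id1, id2):
    """Look up the distance for a pair of polygon IDs."""
    return table[id1][id2]

# The three columns from the example: ID-polygon-1, ID-polygon-2, distance.
t = build_distance_table([1, 1], [67, 518], [815, 127])
```

Here `t` is `{1: {67: 815, 518: 127}}`, the same shape as the printed NetLogo table, and `get_dist(t, 1, 67)` gives 815.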
I want to return all users that I follow who are not members of any groups that I am in. If a followed user is a member of even one group that I am in, it should not be returned.
However, I am getting an error:
None.get
Neo.DatabaseError.Statement.ExecutionFailure
when I try this query:
MATCH (g1:groups)<-[:MEMBER_OF]-(u1:users{userid1:"56"})-[:FOLLOWS]->(u2:users)-[:MEMBER_OF]->(g2:groups)
WITH collect(g1.groupid) AS my_groups,u2,collect(g2.groupid) AS foll_groups
WHERE NOT any(t in foll_groups WHERE t IN extract(x IN my_groups))
RETURN u2
Here is one solution:
MATCH (g1:groups)<-[:MEMBER_OF]-(u1:users { userid1:"56" })-[:FOLLOWS]->(u2:users)-[:MEMBER_OF]->(g2:groups)
WITH u2, collect(g2) AS foll_groups, collect(g1) AS my_groups
WITH u2, reduce(dup = FALSE, g IN foll_groups | (dup OR g IN my_groups)) AS has_dup
WHERE NOT has_dup
RETURN u2;
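The filter in this query amounts to a set-disjointness test. The Python sketch below shows that logic on made-up data; it is not a Neo4j driver example. The user and group identifiers are hypothetical, and the function name `followed_outside_my_groups` is an assumption for illustration.

```python
def followed_outside_my_groups(my_groups, followed, memberships):
    """Followed users who belong to groups, none of which are mine."""
    mine = set(my_groups)
    if not mine:
        # The MATCH pattern needs at least one g1, so no groups means no rows.
        return []
    result = []
    for user in followed:
        groups = memberships.get(user, set())
        # The MATCH pattern also needs at least one g2, hence `groups and`.
        if groups and mine.isdisjoint(groups):
            result.append(user)
    return result

# Made-up sample data: u2 shares group g1 with me, u3 shares nothing,
# u4 is in no group at all.
my_groups = {"g1", "g2"}
followed = ["u2", "u3", "u4"]
memberships = {"u2": {"g1", "g9"}, "u3": {"g8"}, "u4": set()}
result = followed_outside_my_groups(my_groups, followed, memberships)
```

With this data only "u3" is returned. Note that, like the Cypher pattern, the sketch skips followed users who are not members of any group; covering them would need an optional match on the membership part.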