SPARQL count unique value combinations - count

I have been working on a SPARQL query to find unique value combinations in my graph store, but without success.
Basically, what I am trying to do is:
a b c
e f g
e r t
a b c
k l m
e f g
a b c
result:
a b c | 3
e f g | 2
e r t | 1
k l m | 1
I tried several constructions with DISTINCT, GROUP BY, and subqueries, but none of them worked.
Last attempt:
SELECT (count (*) as ?n){
SELECT DISTINCT ?value1 ?value2 ?value3 WHERE {
?instance vocab:relate ?value1 .
?instance vocab:relate ?value2 .
?instance vocab:relate ?value3 .
}
}
RDF:
<http://test.example.com/instance1>
a <http://test.example.com#Instance> ;
<http://vocab.example.com/relate>
<http://test.example.com/c> , <http://test.example.com/b> , <http://test.example.com/a> .
<http://test.example.com/instance6>
a <http://test.example.com#Instance> ;
<http://vocab.example.com/relate>
<http://test.example.com/g> , <http://test.example.com/f> , <http://test.example.com/e> .
<http://test.example.com/instance4>
a <http://test.example.com#Instance> ;
<http://vocab.example.com/relate>
<http://test.example.com/c> , <http://test.example.com/b> , <http://test.example.com/a> .
<http://test.example.com/instance2>
a <http://test.example.com#Instance> ;
<http://vocab.example.com/relate>
<http://test.example.com/g> , <http://test.example.com/f> , <http://test.example.com/e> .
<http://test.example.com/instance7>
a <http://test.example.com#Instance> ;
<http://vocab.example.com/relate>
<http://test.example.com/c> , <http://test.example.com/b> , <http://test.example.com/a> .
<http://test.example.com/instance5>
a <http://test.example.com#Instance> ;
<http://vocab.example.com/relate>
<http://test.example.com/m> , <http://test.example.com/l> , <http://test.example.com/k> .
<http://test.example.com/instance3>
a <http://test.example.com#Instance> ;
<http://vocab.example.com/relate>
<http://test.example.com/t> , <http://test.example.com/r> , <http://test.example.com/e> .

AKSW's comment is spot on: you need to add an ordering criterion to the values so that you're not counting every possible ordering of the same values. Also, remember that RDF doesn't have "duplicate" triples, so
:a :p :c, :c, :d
is the same as
:a :p :c, :d
so the appropriate comparison is < as opposed to <=, since without duplicate triples, you'd never have an = case. Also, since the values are IRIs, you need to get their string values before you can compare with <, but the str function will take care of that.
prefix v: <http://vocab.example.com/>
prefix : <http://test.example.com/>
select ?a ?b ?c (count(distinct ?i) as ?count) where {
?i v:relate ?a, ?b, ?c .
filter (str(?a) < str(?b) && str(?b) < str(?c))
}
group by ?a ?b ?c
------------------------
| a | b | c | count |
========================
| :a | :b | :c | 3 |
| :e | :f | :g | 2 |
| :e | :r | :t | 1 |
| :k | :l | :m | 1 |
------------------------
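The ordering filter above amounts to choosing one canonical representative for each unordered set of values. Here is a minimal Python sketch of the same idea, assuming the sample data has already been loaded into a plain dict from instance name to its set of related values. The dict and the function name count_combinations are illustrative only and not part of any RDF library.

```python
from collections import Counter

# Hypothetical in-memory copy of the sample data: instance -> related values.
instances = {
    "instance1": {"a", "b", "c"},
    "instance2": {"e", "f", "g"},
    "instance3": {"e", "r", "t"},
    "instance4": {"a", "b", "c"},
    "instance5": {"k", "l", "m"},
    "instance6": {"e", "f", "g"},
    "instance7": {"a", "b", "c"},
}


def count_combinations(data):
    """Count how many instances share each unordered value combination."""
    # Sorting each set yields exactly one tuple per combination, which plays
    # the same role as the str(?a) < str(?b) < str(?c) filter in the query.
    return Counter(tuple(sorted(values)) for values in data.values())


counts = count_combinations(instances)
for combo, n in sorted(counts.items()):
    print(" ".join(combo), "|", n)
```

Without the sort, each combination would show up once per ordering of its values, which is the problem the filter avoids in the SPARQL version.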

Related

Antlr4 order of token in lexer

lexer grammar
DESC: D | D E S C;
.
.
.
INCREMENTOPTION: S | H | M | D;
parser grammar:
sortExpression: integer? sortFieldList Desc = DESC?;
.
.
.
incrementOption: integer INCREMENTOPTION;
With the input 'd' I have a problem.
Whichever of DESC and INCREMENTOPTION comes first in the lexer is matched, and the other one is never matched.
What can I do?
You will have to do something like this:
sortExpression : integer? sortFieldList desc?;
incrementOption : integer incrementoption;
desc : DESC | SINGLE_D;
incrementoption : SINGLE_D | SINGLE_S_H_M;
DESC : D E S C;
SINGLE_D : D;
SINGLE_S_H_M : S | H | M;
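The reason the split is needed is the lexer's selection policy: ANTLR takes the longest match, and when two rules match the same length it picks the rule defined first. The following Python sketch imitates that policy with regular expressions. It is only a toy model, not ANTLR itself; the function first_token and the regex translations of the rules are assumptions made for illustration, and case-insensitive letter fragments are assumed.

```python
import re


def first_token(rules, text):
    """Return the name of the rule chosen for the start of `text`.

    Toy model of the lexer policy: the longest match wins, and on a tie
    the rule listed first wins.
    """
    best_name, best_len = None, 0
    for name, pattern in rules:
        m = re.match(pattern, text, re.IGNORECASE)
        # Strict '>' keeps the earlier rule when two matches have equal length.
        if m and len(m.group()) > best_len:
            best_name, best_len = name, len(m.group())
    return best_name


# Original lexer: both rules can match a lone 'd', so the first one always wins.
# (The regex lists 'desc' first only because Python alternation is not longest-match.)
original = [("DESC", r"desc|d"), ("INCREMENTOPTION", r"[shmd]")]

# Rewritten lexer: no two rules match the same text; the parser decides the meaning.
fixed = [("DESC", r"desc"), ("SINGLE_D", r"d"), ("SINGLE_S_H_M", r"[shm]")]

print(first_token(original, "d"))  # DESC, never INCREMENTOPTION
print(first_token(fixed, "d"))     # SINGLE_D
print(first_token(fixed, "desc"))  # DESC
```

With the rewritten rules the lexer emits a single unambiguous token for 'd', and the parser rules desc and incrementoption accept it wherever each is valid.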

How to duplicate input into outputs with jq?

I'm trying to adapt the following snippet:
echo '{"a":{"value":"b"}, "c":{"value":"d"}}' \
| jq -r '. as $in | keys[] | [$in[.].value | tostring + " 1"] | @tsv'
b 1
d 1
to output:
b 1
b 2
d 1
d 2
The following adaptation produces the desired output:
echo '{"a":{"value":"b"}, "c":{"value":"d"}}' |
jq -r '
def addindex(start;lessthan):
range(start;lessthan) as $i | "\(.) \($i)";
. as $in
| keys[]
| $in[.].value
| addindex(1;3)'
Note that keys emits the key names after they have been sorted, whereas keys_unsorted retains the ordering.
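For readers more comfortable outside jq, here is a rough Python equivalent of the filter above. It is only a sketch; the generator name add_index mirrors the jq definition and is otherwise an assumption. Like range(start;lessthan) in jq, Python's range excludes the upper bound, and sorted() stands in for the sorting that keys performs.

```python
import json


def add_index(value, start, less_than):
    """Yield 'value i' for each i in [start, less_than), like the jq addindex."""
    for i in range(start, less_than):
        yield f"{value} {i}"


doc = json.loads('{"a":{"value":"b"}, "c":{"value":"d"}}')

# Visit the keys in sorted order, as jq's `keys` does.
lines = [
    line
    for key in sorted(doc)
    for line in add_index(doc[key]["value"], 1, 3)
]
print("\n".join(lines))
```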

What would the conversion of this from EBNF to BNF be? Also, what is the leftmost derivation?

I need to convert this from EBNF to BNF.
<statement> ::= <ident> = <expr>
<statement> ::= IF <expr> THEN <statement> [ ELSE <statement> ] END
<statement> ::= WHILE <expr> DO <statement> END
<statement> ::= BEGIN <statement> {; <statement>} END
Also, I'm stuck on this one:
E -> E+T | E-T | T
T -> T*F | T/F | F
F -> (E) | VAR | INT
VAR -> a | b | c
INT -> 0 | 1 | 2| 3 | 4| 5 | 6 | 7 | 8 | 9
After modifying the grammar to add a ^ operator, what is the leftmost derivation that your grammar assigns to the expression a^2^b*(c+1)? You may find it convenient to sketch the parse tree for this expression first, and then figure out the leftmost derivation from that.
I added G -> F^G | G and got G 2 G b E as my answer, but I am not sure whether that is correct.

Removing Left Recursion from CFG

The following grammar has left recursion:
T -> Tx | TYx | YX | x
X -> xx
Y -> Yy | Yx | y
How do you go about removing left recursion? I read the Wikipedia explanation, but I'm fairly new to CFGs, so it did not make a lot of sense. Any help is appreciated. A plain English explanation would be even more appreciated.
In this example, you can follow Robert C. Moore's general algorithm to convert a rule with left recursion to a rule with right recursion:
A -> A a1 | A a2 | ... | b1 | b2 | ...
# converts to
A -> b1 A' | b2 A' | ...
A' -> e | a1 A' | a2 A' | ... # where e = epsilon
In our first case: A=T, a1=x, a2=Yx, b1=YX, b2=x (and similarly for Y)
T -> YXT' | xT'
T' -> e | xT' | YxT'
X -> xx
Y -> yY'
Y' -> e | yY' | xY'
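The conversion rule is mechanical enough to write down as code. The sketch below handles only immediate left recursion (a rule whose alternatives start with its own head), which is all this grammar needs. The function names and the list-of-symbols representation are assumptions chosen for illustration.

```python
def remove_immediate_left_recursion(head, alternatives):
    """Rewrite A -> A a1 | ... | b1 | ... as A -> b1 A' | ..., A' -> e | a1 A' | ...

    `alternatives` is a list of alternatives, each a list of symbols.
    An empty list stands for epsilon.
    """
    # The "a" parts: what follows the leading head in left-recursive alternatives.
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == head]
    # The "b" parts: alternatives that do not start with the head.
    others = [alt for alt in alternatives if not alt or alt[0] != head]
    if not recursive:
        return {head: alternatives}
    tail = head + "'"
    return {
        head: [alt + [tail] for alt in others],
        tail: [[]] + [alt + [tail] for alt in recursive],
    }


def show(rules):
    """Format rules in the notation used above, with 'e' for epsilon."""
    return [
        f"{head} -> " + " | ".join("".join(alt) or "e" for alt in alts)
        for head, alts in rules.items()
    ]


t_rules = remove_immediate_left_recursion(
    "T", [["T", "x"], ["T", "Y", "x"], ["Y", "X"], ["x"]]
)
y_rules = remove_immediate_left_recursion("Y", [["Y", "y"], ["Y", "x"], ["y"]])
print("\n".join(show(t_rules) + show(y_rules)))
```

Running it on the T and Y rules reproduces the converted grammar shown above; X has no left recursion and is left unchanged.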

Breadth first search, and A* search in a graph?

I understand how to use breadth-first search and A* on a tree structure, but given the following graph, how would they be implemented? In other words, how would the search traverse the graph? S is the start state.
Graph Here
It's exactly the same as doing it in a tree. You just need to somehow keep track of which nodes you've already visited so that you don't end up going in circles.
Basically, you treat a graph the same way that you'd treat a tree, except you need to keep track of nodes you've already visited. That's fine for BFS. On top of that, in the case of A*, consider what you'd do when you revisit a node but have found a cheaper route to it.
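To make the A* remark concrete, here is a small Python sketch. It rests on several assumptions that are not confirmed by the question: the edge weights are read row by row from the table given in a later answer in this thread, the numbers on that table's diagonal are taken to be heuristic estimates, and G1 and G2 are taken to be goal states. The function name a_star is likewise illustrative.

```python
import heapq

# ASSUMPTION: directed edge weights read from the table in a later answer.
graph = {
    "S": {"A": 2, "C": 3},
    "A": {"B": 1, "E": 8, "S": 2},
    "B": {"C": 1, "D": 1, "G1": 4, "S": 2},
    "C": {"B": 3, "G2": 5},
    "D": {"G1": 5, "G2": 2},
    "E": {"G1": 9, "G2": 7},
    "G1": {},
    "G2": {},
}
# ASSUMPTION: the table's diagonal values used as heuristic estimates.
h = {"S": 5, "A": 2, "B": 1, "C": 1, "D": 1, "E": 6, "G1": 0, "G2": 0}


def a_star(graph, h, start, goals):
    """Return (path, cost) to the cheapest goal, or (None, inf) if unreachable."""
    best_g = {start: 0}      # cheapest known cost to each node seen so far
    parent = {start: None}
    frontier = [(h[start], 0, start)]  # entries are (f = g + h, g, node)
    while frontier:
        _, g, node = heapq.heappop(frontier)
        if g > best_g[node]:
            continue  # stale entry: a cheaper route to this node was found later
        if node in goals:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], g
        for nxt, weight in graph[node].items():
            new_g = g + weight
            # Revisit a node only when the new route to it is strictly cheaper.
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                parent[nxt] = node
                heapq.heappush(frontier, (new_g + h[nxt], new_g, nxt))
    return None, float("inf")


path, cost = a_star(graph, h, "S", {"G1", "G2"})
print(path, cost)
```

Under these assumptions G2 is first reached through C at cost 8 and later through D at cost 6; the best_g check is what lets the cheaper route replace the earlier one.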
Paint the graph: recursively search each node and mark the nodes you have visited as dirty. Only recurse into a node that is not dirty.
If memory is not an issue, copy the graph and, instead of marking the nodes, remove them from the copy.
It's a weighted graph. Do you want to find shortest paths or just traverse it?
If you want just traversing, here it is:
1) there is only S in the queue
2) we are adding C and A in the queue, only they are reachable from S directly (with one edge)
3) D, G2 - from C
4) B, E - from A
5) G1 - from D (G2 is already in the queue)
6) there are no outgoing edges from G2
7) there are no adjacent nodes of B which aren't already in the queue
So here's the order in which nodes were added to the queue: S, C, A, D, G2, B, E, G1
I don't know how helpful you will find this, but here's a complete solution coded in the functional language J (available for free from jsoftware.com).
First, it's probably simplest to work directly from a representation of the graph you show in your picture. I represent this as a (# nodes) x (# nodes) table with a number at (i,j) for the value of the link between node-i and node-j. Also, along the diagonal I've put the number associated with each node itself.
So, I enter this as follows - don't worry too much about the unfamiliar notation, you'll soon see what the result looks like:
grph=: <;.1&>TAB,&.><;._2 ] 0 : 0
A B C D E G1 G2 S
A 2 1 8 2
B 1 1 1 4 2
C 3 1 5
D 1 5 2
E 6 9 7
G1 0
G2 0
S 2 3 5
)
So, I've assigned the variable "grph" as a 9x9 table where the first row and first column are the labels "A"-"E", "G1", "G2", and "S"; I've used tabs to delimit items so this could be cut-and-pasted to or from a spreadsheet as needed.
Now, I'll check the size of my table and display it:
$grph
9 9
grph
+---+--+--+--+--+--+---+---+--+
| | A| B| C| D| E| G1| G2| S|
+---+--+--+--+--+--+---+---+--+
| A | 2| 1| | | 8| | | 2|
+---+--+--+--+--+--+---+---+--+
| B | | 1| 1| 1| | 4 | | 2|
+---+--+--+--+--+--+---+---+--+
| C | | 3| 1| | | | 5 | |
+---+--+--+--+--+--+---+---+--+
| D | | | | 1| | 5 | 2 | |
+---+--+--+--+--+--+---+---+--+
| E | | | | | 6| 9 | 7 | |
+---+--+--+--+--+--+---+---+--+
| G1| | | | | | 0 | | |
+---+--+--+--+--+--+---+---+--+
| G2| | | | | | | 0 | |
+---+--+--+--+--+--+---+---+--+
| S | 2| | 3| | | | | 5|
+---+--+--+--+--+--+---+---+--+
It looks OK and it's easy to compare this to the picture of the graph to check it.
Now I'll drop the first row and column so we're left only with numbers (as boxed literals),
and remove any extraneous tab characters.
grn=. TAB-.~&.>}.}."1 grph
You can see I assign this result to the variable "grn".
Next, I'll replace any empty cells with "_" - which represents infinity - then convert the literals to numeric representation (re-assigning the result to the same name "grn"):
grn=. ".&>(0=#&>grn)}grn,:<'_'
Finally, I'll move the last column and row to the beginning since this is the one for "S" and it's supposed to be first. I'll also display the result to confirm that it looks correct.
]grn=. _1|."1]_1|.grn NB. "S" goes first.
5 2 _ 3 _ _ _ _
2 2 1 _ _ 8 _ _
2 _ 1 1 1 _ 4 _
_ _ 3 1 _ _ _ 5
_ _ _ _ 1 _ 5 2
_ _ _ _ _ 6 9 7
_ _ _ _ _ _ 0 _
_ _ _ _ _ _ _ 0
So, now that I have a simple 8x8 table of numbers representing the graph, it's a simple matter to traverse it.
Here's a simple J function, called "traverseGraph", to read this table, traverse the graph it represents, and return two results: the indexes (0-based origin) of the nodes visited, and the values of the points and edges in the order visited.
traverseGraph=: 3 : 0
pts=. ,_-.~,ix{y [ nxt=. ix=. ,0
while. 0~:#nxt=. ~.ix-.~;([:I._~:])&.><"1 nxt{y do.
ix=. ix,nxt [ pts=. pts,_-.~,nxt{y
end.
ix;pts
)
We start by initializing three variables: the list of indexes "ix" (to zero, since we want to begin in the zeroth row of the table), a variable "nxt" to point to the next group of nodes (initially the same as the starting node), and the list of point values "pts" (starting as the 0th row of our input table, known internally as "y", with all the infinite values removed.)
In the "while." loop, we continue as long as there's more than zero "nxt" values resulting from pulling the current rows out of the table and removing any nodes (in "ix") we've already visited. Inside the loop, we accumulate the next set of indexes onto the end of "ix" and the point values onto "pts". At the end, we return the indexes and point values as our (two-element) result.
We run it like this - it displays the result by default:
traverseGraph grn
+---------------+---------------------------------------------+
|0 1 3 2 5 7 4 6|5 2 3 2 2 1 8 3 1 5 2 1 1 1 4 6 9 7 0 1 5 2 0|
+---------------+---------------------------------------------+
So, the first box contains the indexes starting with "0" and ending with "6". The second boxed item is the vector of point values in the order we accumulated them. I don't know what you do with these, so I just show them.
We can use the indexes to display the node names like this:
0 1 3 2 5 7 4 6{(<"0'SABCDE'),'G1';'G2'
+-+-+-+-+-+--+-+--+
|S|A|C|B|E|G2|D|G1|
+-+-+-+-+-+--+-+--+
I don't know how useful you'll find this but it does outline a simple solution to your problem.
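For comparison, the same traversal can be sketched in Python as an ordinary breadth-first search with a visited set. The adjacency lists below are read off the table above (rows as sources, neighbours in the column order S, A, B, C, D, E, G1, G2, diagonal entries ignored); the dict layout and the function name bfs_order are assumptions made for this sketch.

```python
from collections import deque

# Directed neighbours taken from the table above, in column order.
graph = {
    "S": ["A", "C"],
    "A": ["S", "B", "E"],
    "B": ["S", "C", "D", "G1"],
    "C": ["B", "G2"],
    "D": ["G1", "G2"],
    "E": ["G1", "G2"],
    "G1": [],
    "G2": [],
}


def bfs_order(graph, start):
    """Return the nodes in the order breadth-first search first reaches them."""
    visited = {start}          # nodes already queued, so nothing is queued twice
    order = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in visited:
                visited.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


order = bfs_order(graph, "S")
print(" ".join(order))
```

The visited set does the job that removing the "ix" entries does in the J version, and the resulting order matches the node names displayed above.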
