How to get the URI difference between two collections in MarkLogic - XQuery

I have two collections.
I need to get the URIs that differ between the two collections, compared by file name.
Example Scenario :
Collection 1:
/data/1.xml
/data/2.xml
/data/3.xml
collection 2:
/test/1.xml
/test/2.xml
/test/3.xml
/test/4.xml
/test/5.xml
output:
/data/1.xml
/data/2.xml
/data/3.xml
/test/4.xml
/test/5.xml

Using set delta as David suggests is correct, but you will need to first generate filename keys for the URIs. Maps are very helpful for this, which make it easy to keep a filename key associated with its original URI.
First generate two maps with filename keys and URI values. Then, using set delta on the map keys, generate a sequence of the filenames that exist only in the second map. Finally, look up the URIs for those filenames in their source map:
let $x := (
"/data/1.xml",
"/data/2.xml",
"/data/3.xml")
let $y := (
"/test/1.xml",
"/test/2.xml",
"/test/3.xml",
"/test/4.xml",
"/test/5.xml")
let $map-x := map:new($x ! map:entry(tokenize(., '/')[last()], .))
let $map-y := map:new($y ! map:entry(tokenize(., '/')[last()], .))
let $keys-diff-y := map:keys($map-y)[not(. = map:keys($map-x))]
let $diff-y := map:get($map-y, $keys-diff-y)
return ($x, $diff-y)
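For readers who want to check the logic outside MarkLogic, here is a minimal sketch of the same filename-key idea in Python rather than XQuery. The function name merge_by_filename is made up for illustration; it keeps every URI from the first list and adds those from the second list whose file name is not already present.

```python
def merge_by_filename(primary, secondary):
    """Return all of `primary`, followed by the URIs in `secondary` whose
    file name (the text after the last "/") does not occur in `primary`."""
    # Build the set of file-name keys for the first collection.
    primary_names = {uri.rsplit("/", 1)[-1] for uri in primary}
    # Keep only the URIs from the second collection with an unseen key.
    extra = [uri for uri in secondary
             if uri.rsplit("/", 1)[-1] not in primary_names]
    return list(primary) + extra


x = ["/data/1.xml", "/data/2.xml", "/data/3.xml"]
y = ["/test/1.xml", "/test/2.xml", "/test/3.xml", "/test/4.xml", "/test/5.xml"]
result = merge_by_filename(x, y)
print(result)
```

With the URIs from the question, this gives the three /data/ URIs followed by /test/4.xml and /test/5.xml.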

Two alternative solutions:
First approach: put each of the items in the map using a consistent key (the substring after the last slash), and then select the first item in the map for each key:
let $x := (
"/data/1.xml",
"/data/2.xml",
"/data/3.xml")
let $y := (
"/test/1.xml",
"/test/2.xml",
"/test/3.xml",
"/test/4.xml",
"/test/5.xml")
let $intersection := map:map()
let $_ := ($x, $y) ! (
let $key := tokenize(., "/")[last()]
return
map:put($intersection, $key, (map:get($intersection, $key), .))
)
return
for $key in map:keys($intersection)
for $uri in map:get($intersection, $key)[1]
order by number(replace($uri, ".*/(\d+).xml", '$1'))
return $uri
Second approach, ensure that only the first item is set for a given key:
let $x := (
"/data/1.xml",
"/data/2.xml",
"/data/3.xml")
let $y := (
"/test/1.xml",
"/test/2.xml",
"/test/3.xml",
"/test/4.xml",
"/test/5.xml")
let $intersection := map:map()
let $_ := ($x, $y) ! (
let $key := tokenize(., "/")[last()]
return
if (fn:exists(map:get($intersection, $key))) then ()
else map:put($intersection, $key, .)
)
return
for $uri in map:get($intersection, map:keys($intersection))
order by number(replace($uri, ".*/(\d+).xml", '$1'))
return $uri
The order by is optional, but with maps you may not get a consistent ordering of the keys. Customize it for what you need (e.g. /data/ URIs first, and then /test/ URIs), or remove it if you don't care about the order of the URIs.

Set notation:
Delta: (Yields 'a')
let $c1 := ('a', 'b', 'c')
let $c2 := ('b', 'c', 'd')
return $c1[fn:not(.= $c2)]
Intersection: (Yields b,c)
let $c1 := ('a', 'b', 'c')
let $c2 := ('b', 'c', 'd')
return $c1[.= $c2]
Reverse c1 and c2 for the other two permutations.
For a good read, check out this post from Dave Cassel
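The same delta and intersection logic, as a minimal sketch in Python instead of XQuery, in case it helps to see the predicate spelled out. The variable names are only for illustration.

```python
c1 = ["a", "b", "c"]
c2 = ["b", "c", "d"]

# Delta: items of c1 that do not occur in c2.
delta = [item for item in c1 if item not in c2]

# Intersection: items of c1 that also occur in c2.
intersection = [item for item in c1 if item in c2]

# Swapping c1 and c2 gives the delta in the other direction.
reverse_delta = [item for item in c2 if item not in c1]

print(delta, intersection, reverse_delta)
```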

Related

XQuery get data based on distinct values

My xml is like this:
full xml at inputxml
<course>
<reg_num>10616</reg_num>
<subj>BIOL</subj>
<crse>361</crse>
<sect>F</sect>
<title>Genetics and MolecularBiology</title>
<units>1.0</units>
<instructor>Russell</instructor>
<days>M-W-F</days>
<time>
<start_time>11:00AM</start_time>
<end_time>11:50</end_time>
</time>
<place>
<building>PSYCH</building>
<room>105</room>
</place>
</course>
I need to take distinct values for courses and return instructors that teach those courses.
This is my current code:
let $x := doc("reed.xml")/root/course
for $y in distinct-values($x/title)
let $z := $y/instructor
return ( {data($y)} ,{data($z)})
What am I doing wrong?
In the code that you posted, the for loop is iterating over the sequence of distinct title string values.
You can't XPath into those strings.
Rather, you want to XPath into the $x courses and use $y in each iteration to select those course elements whose title equals $y, then select their instructor.
let $x := doc("reed.xml")/root/course
for $y in distinct-values($x/title)
let $z := $x[title = $y]/instructor
return (data($y), data($z))
In XQuery 3.0 and later you can do this with group by.
for $course in doc("reed.xml")/root/course
group by $title := $course/title
return ( <title>{$title}</title> ,
<instructors>{$course/instructor}</instructors> )
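If it helps to see what the grouping produces, here is a rough Python sketch of the same idea: collect the instructors under each distinct title. The sample XML is a cut-down, made-up stand-in for reed.xml (only "Russell" comes from the question), and instructors_by_title is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample data, much smaller than the real reed.xml.
SAMPLE = """<root>
  <course><title>Genetics</title><instructor>Russell</instructor></course>
  <course><title>Genetics</title><instructor>Smith</instructor></course>
  <course><title>Ecology</title><instructor>Kaplan</instructor></course>
</root>"""


def instructors_by_title(xml_text):
    """Map each distinct course title to the list of its instructors."""
    groups = defaultdict(list)
    for course in ET.fromstring(xml_text).findall("course"):
        groups[course.findtext("title")].append(course.findtext("instructor"))
    return dict(groups)


grouped = instructors_by_title(SAMPLE)
print(grouped)
```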

How to add a count number to an array of data : Doctrine

In fact, after returning a result of data from the database using Doctrine,
I'm trying to add the row count number, without calling another query request.
This is my function:
public function search(QueryBuilder $qb, string $search)
{
$qb = $qb->addSelect('COUNT(n) as count');
$search = $this->escape($search);
$qb->andWhere(
$qb->expr()->like('n.title', $qb->expr()->literal('%'.$search.'%'))
);
$qb->setMaxResults(2);
}
This is my DQL:
SELECT n, COUNT(n) as count FROM CoreBundle\Entity\News n LEFT JOIN n.category c WHERE n.title LIKE '%re%'
I need to return all my data together with a count key that holds the number of rows.
The problem is that I'm getting only the first row (id = 1), although the count number seems to be correct.
So the result should be something like this:
['count' => 2, [Newsn1, Newsn2]]
Don't tell me to use array_count because I need to get the count of rows in the database, and I have a setMaxResults function, so I will not get a real number of rows.
I don't know how your table is set up, so I can only guess. Here's my try:
For getting counts for all titles in your table:
# SQL
SELECT COUNT(id) AS count, GROUP_CONCAT(title SEPARATOR ', ') AS titles FROM newses GROUP BY title
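# DQL, assuming you are using a Repository method (GROUP_CONCAT is not a built-in DQL function, so this also assumes a custom DQL function for it is registered):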
$qb = $this->createQueryBuilder('n');
$qb
->select("COUNT(n.id) AS count, GROUP_CONCAT(n.title SEPARATOR ', ') AS titles")
->leftJoin('n.category', 'c')
->groupBy('n.title')
;
return $qb->getQuery()->getArrayResult();
For getting counts for a particular title:
# SQL
SELECT COUNT(id) AS count, GROUP_CONCAT(title SEPARATOR ', ') AS titles FROM newses WHERE title LIKE '%news%' GROUP BY title
# NewsRepository.php
public function getTitlesCount($title)
{
$qb = $this->createQueryBuilder('n');
$qb
->select("COUNT(n.id) AS count, GROUP_CONCAT(n.title SEPARATOR ', ') AS titles")
->leftJoin('n.category', 'c')
->where('n.title LIKE :title')
->setParameter('title', "%{$title}%")
->groupBy('n.title')
;
return $qb->getQuery()->getArrayResult();
}

limit the result using cts:collection-match in marklogic

I have 2 collections.
/test/1
This has 10 documents with id
test1_1
test1_2
.....
test1_10
/test/2
This has 20 documents with id as follows
test2_1
test2_2
.....
test2_20
Query:
let $result := cts:collection-match("/test/*")
let $id :=(
fn:distinct-values(
for $doc in fn:collection($result)
order by $doc//timestamp descending
return $doc//timestamp/text()
)[1 to 5]
)
return $id
I want to return the top 5 documents from each collection, in descending order of timestamp (10 in total), but the query returns only 5 documents overall.
When $result is a sequence of greater than one item, writing for $doc in fn:collection($result) aggregates all of the documents from multiple collections into a single sequence. You need to iterate over collections first, then iterate over the values in each collection, ordered and limited.
let $collections := cts:collection-match("/test/*")
let $id :=
for $collection in $collections
return
fn:distinct-values(
for $doc in fn:collection($collection)
order by $doc//timestamp descending
return $doc//timestamp/string()
)[1 to 5]
return $id
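To make the difference between "top 5 overall" and "top 5 per collection" concrete, here is a small Python sketch of the per-collection version. The collection contents are invented timestamps, and top_n_per_collection is a hypothetical helper, not a MarkLogic API.

```python
def top_n_per_collection(collections, n=5):
    """For each collection, return its n most recent distinct timestamps."""
    result = {}
    for name, timestamps in collections.items():
        # ISO-8601 strings sort chronologically, so a plain sort is enough.
        distinct_sorted = sorted(set(timestamps), reverse=True)
        result[name] = distinct_sorted[:n]
    return result


# Made-up data: 10 documents in /test/1 and 20 in /test/2.
collections = {
    "/test/1": [f"2020-01-{d:02d}T00:00:00" for d in range(1, 11)],
    "/test/2": [f"2020-02-{d:02d}T00:00:00" for d in range(1, 21)],
}
per_collection = top_n_per_collection(collections)
combined = [ts for tops in per_collection.values() for ts in tops]
print(len(combined))
```

Because the limit is applied inside the loop over collections, each collection contributes up to 5 values, so two collections yield 10 in total.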

Need to remove timezone from current-date() xquery

I am a newbie in XQuery, and one of my requirements is to generate daily reports by querying xDB.
If I hardcode the date I am able to generate the report from xDB, but if I try to read the current date from the system I don't get any values in the report.
Code Snippet:
let $day := fn:current-date()
let $hours := ( '00', '01', '02', '03', '04',
'05', '06', '07', '08', '09',
'10', '11', '12', '13', '14',
'15', '16', '17', '18', '19',
'20', '21', '22', '23' )
let $ddhh := for $i in $hours return concat($day,"T",$i)
let $title := concat("Average Time By Documents in Hour on ", $day)
In report:
("Average Time By Documents in Hour on 2015-06-08Z"..)
The difference between the hardcoded value and the result of current-date() is the timezone "Z". How do I get rid of it? I only want "2015-06-08" in the variable $day, not "2015-06-08Z".
Thanks in advance.
If you want to get rid of the timezone component of the date, you can use fn:adjust-date-to-timezone(), using an empty sequence as the second parameter:
fn:adjust-date-to-timezone($day, ())
When casting a date or dateTime value that carries a timezone to a string, the default ISO format is chosen, which includes the timezone. Format the value manually instead, using fn:format-date($date, $picture) or fn:format-dateTime($dateTime, $picture) with a picture string $picture.
format-date(current-date(), "[Y0001]-[M01]-[D01]")
You can also add the hours in the desired format:
format-dateTime(current-dateTime(), "[Y0001]-[M01]-[D01]-[H01]")
This even enables you to omit the hour sequence you pregenerated (it also works in a single line, of course; it is split up here so it can be explained more easily):
let $today := current-date() (: today :)
let $today := xs:dateTime($today) (: convert to dateTime :)
let $hour := $today + xs:dayTimeDuration("PT3H") (: add 3 hours :)
return format-dateTime($hour, "[Y0001]-[M01]-[D01]-[H01]")

Where and how to use NULLIF(X,Y) function of SQLITE?

I know that the NULLIF(X,Y) function of SQLite is equivalent to:
CASE
WHEN
X = Y
THEN
NULL
ELSE
X
END
and that the IFNULL(X,Y) function is equivalent to:
CASE
WHEN
X IS NULL
THEN
Y
ELSE
X
END
The IFNULL(X,Y) function of SQLite is used to replace NULL values of X with Y, but I can't understand what the NULLIF(X,Y) function is useful for.
Please explain with examples, so it is more useful.
The IFNULL function is used when the database contains NULL values, but you want to handle those values as something else; for example:
SELECT Name, IFNULL(Age, 'unknown') AS Age FROM People
The NULLIF function is used when the database contains special values that are not NULL, but that you want to handle as NULL.
This is useful especially for aggregate functions. For example, to get the number of employees that get bonuses, use:
SELECT COUNT(NULLIF(Bonus, 0)) FROM Employees
This is the same as:
SELECT COUNT(*) FROM Employees WHERE Bonus != 0
In practice, NULLIF is not used as often as IFNULL.
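As a quick check, here is a small sketch that runs both queries against an in-memory SQLite database through Python's sqlite3 module. The table contents are made up for the example; only the Employees and Bonus names come from the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Name TEXT, Bonus INTEGER)")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?)",
    # Made-up rows: two real bonuses, one zero, one NULL.
    [("Ann", 500), ("Bob", 0), ("Cy", None), ("Dee", 250)],
)

# COUNT(expr) skips NULLs, so turning 0 into NULL excludes those rows.
with_nullif = conn.execute(
    "SELECT COUNT(NULLIF(Bonus, 0)) FROM Employees"
).fetchone()[0]

# The equivalent formulation with a WHERE clause.
with_where = conn.execute(
    "SELECT COUNT(*) FROM Employees WHERE Bonus != 0"
).fetchone()[0]

# Other aggregates benefit too: this average ignores zero and NULL bonuses.
avg_nonzero = conn.execute(
    "SELECT AVG(NULLIF(Bonus, 0)) FROM Employees"
).fetchone()[0]

print(with_nullif, with_where, avg_nonzero)
```

Both counts come out as 2 here, because the row with a NULL bonus is excluded in either form.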
I use NULLIF() when UPDATEing or INSERTing rows containing NULLable fields.
Basically, with PHP:
$value1 = 'foo';
$value2 = '';
$sql = 'INSERT INTO table(field1, field2) '
. "VALUES(NULLIF('$value1', ''), NULLIF('$value2', ''))";
// => INSERT INTO table(field1, field2) VALUES(NULLIF('foo', ''), NULLIF('', ''))
// => INSERT INTO table(field1, field2) VALUES('foo', NULL)
It saves me doing things like :
$value1forSQL = ($value1 === '' || $value1 === NULL) ? 'NULL' : "'$value1'";
...
$sql = ...
. "VALUES($value1forSQL, ...)";
