Need to compare arrays in MarkLogic with XQuery

I need to compare arrays in MarkLogic with XQuery.
Query parameters:
{
"list": {
"bookNo": 13,
"BookArray":[20,21,22,23,24,25]
}
}
Sample Data:
{
"no":01'
"arrayList"[20,25]
}
{
"no":02'
"arrayList"[20,27]
}
{
"no":03'
"arrayList"[20,23,25]
}
Output:
"no":01
"no":03
I need to return "no" for every record whose arrayList values all appear in BookArray.

OK. You do not say whether the actual data is stored in the database or not, so I wrote an example as if it were all in memory.
I chose to keep the sample in the MarkLogic JSON representation, which has some oddities under the hood such as number-nodes and array-nodes. To keep things readable if you dig into it, I used fn:data() to make the values less verbose. Realistically, if this were an in-memory operation and I could not use JavaScript, I would have converted the JSON structures to maps.
Here is a sample to help you explore. I cleaned up the JSON to make it valid and, for my sample, wrapped the three records in a single array.
xquery version "1.0-ml";
let $param-as-json := xdmp:unquote('{
"list": {
"bookNo": 13,
"BookArray":[20,21,22,23,24,25]
}
}')
let $list-as-json := xdmp:unquote('[
{
"no":"01",
"arrayList":[20,25]
},
{
"no":"02",
"arrayList":[20,27]
},
{
"no":"03",
"arrayList":[20,23,25]
}
]')
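(: atomize the allowed values from the query parameters :)
let $my-list := fn:data($param-as-json//BookArray)
return for $item in $list-as-json/*
let $local-list := fn:data($item//arrayList)
(: keep only the values that also occur in $my-list :)
let $intersection := $local-list[. = $my-list]
(: the item matches when no value was filtered out :)
where fn:deep-equal($intersection, $local-list)
return $item/no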
Result:
01
03
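The filter above is a subset test: a record qualifies when every value in its arrayList also occurs in BookArray. For readers who want to check that logic outside MarkLogic, here is a minimal Go sketch of the same test. The record type and the allIn helper are hypothetical names made up for this illustration and are not part of any MarkLogic API.

```go
package main

import "fmt"

// record is a hypothetical in-memory stand-in for one sample document.
type record struct {
	No        string
	ArrayList []int
}

// allIn reports whether every value in items appears in allowed.
func allIn(items []int, allowed map[int]bool) bool {
	for _, v := range items {
		if !allowed[v] {
			return false
		}
	}
	return true
}

func main() {
	// Build a set from BookArray so each membership check is a map lookup.
	allowed := map[int]bool{}
	for _, v := range []int{20, 21, 22, 23, 24, 25} {
		allowed[v] = true
	}

	records := []record{
		{"01", []int{20, 25}},
		{"02", []int{20, 27}},
		{"03", []int{20, 23, 25}},
	}

	// Print the "no" of each record whose values are all allowed.
	for _, r := range records {
		if allIn(r.ArrayList, allowed) {
			fmt.Println(r.No)
		}
	}
}
```

With the sample data from the question, this prints 01 and 03, the same result as the XQuery.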


Weaviate: using near_text with the exact property doesn't return a distance of 0

Here's a minimal example:
import weaviate
CLASS = "Superhero"
PROP = "superhero_name"
client = weaviate.Client("http://localhost:8080")
class_obj = {
"class": CLASS,
"properties": [
{
"name": PROP,
"dataType": ["string"],
"moduleConfig": {
"text2vec-transformers": {
"vectorizePropertyName": False,
}
},
}
],
"moduleConfig": {
"text2vec-transformers": {
"vectorizeClassName": False
}
}
}
client.schema.delete_all()
client.schema.create_class(class_obj)
batman_id = client.data_object.create({PROP: "Batman"}, CLASS)
by_text = (
client.query.get(CLASS, [PROP])
.with_additional(["distance", "id"])
.with_near_text({"concepts": ["Batman"]})
.do()
)
print(by_text)
batman_vector = client.data_object.get(
uuid=batman_id, with_vector=True, class_name=CLASS
)["vector"]
by_vector = (
client.query.get(CLASS, [PROP])
.with_additional(["distance", "id"])
.with_near_vector({"vector": batman_vector})
.do()
)
print(by_vector)
Please note that I specified both "vectorizePropertyName": False and "vectorizeClassName": False
The code above returns:
{'data': {'Get': {'Superhero': [{'_additional': {'distance': 0.08034378, 'id': '05fbd0cb-e79c-4ff2-850d-80c861cd1509'}, 'superhero_name': 'Batman'}]}}}
{'data': {'Get': {'Superhero': [{'_additional': {'distance': 1.1920929e-07, 'id': '05fbd0cb-e79c-4ff2-850d-80c861cd1509'}, 'superhero_name': 'Batman'}]}}}
If I look up the exact vector I get 'distance': 1.1920929e-07, which I guess is actually 0 (for some floating point evil magic), as expected.
But if I use near_text to search for the exact property, I get a distance > 0.
This is leading me to believe that, when using near_text, the embedding is somehow different.
My question is:
Why does this happen?
With two corollaries:
Is 1.1920929e-07 actually 0 or do I need to read something deeper into that?
Is there a way to check the embedding created during the near_text search?
Here is some information that may help:
Is 1.1920929e-07 actually 0 or do I need to read something deeper into that?
Yes, this value 1.1920929e-07 should be interpreted as 0. I think there are some unfortunate float32/64 conversions going on that need to be rooted out.
Is there a way to check the embedding created during the near_text search?
The embeddings are either imported or generated during object creation, not at search-time. So performing multiple queries on an unchanged object will utilize the same search vector.
We are looking into both of these issues.

Getting the JSON output in an object format for a single record instead of an array using XQuery

I am getting JSON output from an API as an array when there are multiple records and as an object when there is a single record. The consumer wants one consistent format: an array, even for a single record. Is there a way to produce the JSON output as an array regardless of whether there are one or many records, using XQuery?
I have tried the below XQuery:
<Response>
<totalSize>{totalSize/number()}</totalSize>
<done>{done/text()}</done>
<nextRecordsUrl>{nextRecordsUrl/text()}</nextRecordsUrl>
{
let $input:=/root/records
for $i in $input
return
<records>
<Product1>{$i/Product_Lookup_BI__c/text()}</Product1>
<EventLastModifiedDate>{$i/LastModifiedDate/text()}</EventLastModifiedDate>
<Venue>{$i/Venue_vod__r/(fn:concat(fn:concat(fn:concat(fn:concat(fn:concat(fn:concat(BI_Country_Code__c/text(),'-'),State_Province_vod__c/text()),'-'),City_vod__c/text()),'-'),Address_vod__c/text()))}</Venue>
{
let $a:=$i/EM_Event_Team_Member_vod__r/records
for $y in $a
return
<User_records>
<AttendeeLastModifiedDate>{$y/LastModifiedDate/text()}</AttendeeLastModifiedDate>
<EmployeeName>{$y/Team_Member_vod__r/Name/text()}</EmployeeName>
<EmployeeID>{$y/Team_Member_vod__r/BIDS_ID_BI__c/text()}</EmployeeID>
</User_records>
}
</records>
}
</Response>
Actual Output from the above XQuery:
{
"Response": {
"totalSize": 1,
"done": true,
"nextRecordsUrl": "",
"records": {
"Product1": "12345",
"EventLastModifiedDate": "2021-11-10T01:30:55.000+0000",
"Venue": "UK",
"User_records": {
"AttendeeLastModifiedDate": "2021-11-08T02:55:03.000+0000",
"EmployeeName": "Ish",
"EmployeeID": "00002113152"
}
}
}
}
Expected Output:
The output should use an array format for "records" and "User_records":
{
"Response":{
"totalSize":1,
"done":true,
"nextRecordsUrl":"",
"records":[
{
"Product1":"12345",
"EventLastModifiedDate":"2021-11-10T01:30:55.000+0000",
"Venue":"UK",
"User_records":[
{
"AttendeeLastModifiedDate":"2021-11-08T02:55:03.000+0000",
"EmployeeName":"Ish",
"EmployeeID":"00002113152"
}
]
}
]
}
}
Try:
<User_records xmlns:json="http://www.json.org" json:array="true">
<AttendeeLastModifiedDate>{$y/LastModifiedDate/text()}</AttendeeLastModifiedDate>
<EmployeeName>{$y/Team_Member_vod__r/Name/text()}</EmployeeName>
<EmployeeID>{$y/Team_Member_vod__r/BIDS_ID_BI__c/text()}</EmployeeID>
</User_records>
I would do the same for <records> as well. This example works in eXist-db. The JSON namespace may be different in your environment.
Here is what I ran in eXide:
xquery version "3.1";
declare option exist:serialize "method=json indent=yes media-type=application/json";
<Response>
<totalSize>5</totalSize>
<done>yes</done>
<nextRecordsUrl>abc</nextRecordsUrl>
<User_records xmlns:json="http://www.json.org" json:array="true">
<AttendeeLastModifiedDate>123</AttendeeLastModifiedDate>
<EmployeeName>456</EmployeeName>
<EmployeeID>789</EmployeeID>
</User_records>
</Response>

Golang syntax in "if" statement with a map

I am reading a tutorial here: http://www.newthinktank.com/2015/02/go-programming-tutorial/
On the "Maps in Maps" section it has:
package main
import "fmt"
func main() {
// We can store multiple items in a map as well
superhero := map[string]map[string]string{
"Superman": map[string]string{
"realname":"Clark Kent",
"city":"Metropolis",
},
"Batman": map[string]string{
"realname":"Bruce Wayne",
"city":"Gotham City",
},
}
// We can output data where the key matches Superman
if temp, hero := superhero["Superman"]; hero {
fmt.Println(temp["realname"], temp["city"])
}
}
I don't understand the "if" statement. Can someone walk me through the syntax on this line:
if temp, hero := superhero["Superman"]; hero {
To an outsider, "if temp" seems nonsensical, since temp isn't defined anywhere. What would that even accomplish? Then hero := superhero["Superman"] looks like an assignment. But what is the semicolon doing? Why is the final hero there?
Can someone help a newbie out?
Many thanks.
A two-value assignment tests for the existence of a key:
i, ok := m["route"]
In this statement, the first value (i) is assigned the value stored
under the key "route". If that key doesn't exist, i is the value
type's zero value (0). The second value (ok) is a bool that is true if
the key exists in the map, and false if not.
This check is used when we cannot be sure what the map contains: we look up a particular key and, if it exists, assign its value to a variable. The lookup is an O(1) check.
In your example, try looking up a key that does not exist in the map:
package main
import "fmt"
func main() {
// We can store multiple items in a map as well
superhero := map[string]map[string]string{
"Superman": map[string]string{
"realname": "Clark Kent",
"city": "Metropolis",
},
"Batman": map[string]string{
"realname": "Bruce Wayne",
"city": "Gotham City",
},
}
// We can output data where the key matches Superman
if temp, hero := superhero["Superman"]; hero {
fmt.Println(temp["realname"], temp["city"])
}
// try to look up a key that does not exist
if value, ok := superhero["Hulk"]; ok {
fmt.Println(value)
} else {
fmt.Println("key not found")
}
}
Playground Example
if temp, hero := superhero["Superman"]; hero
in go is similar to writing:
temp, hero := superhero["Superman"]
if hero {
....
}
Here, if "Superman" is mapped to a value, hero will be true;
otherwise it will be false.
In Go, every map lookup can return an optional second value that reports whether the key is present.
https://play.golang.org/p/Hl7MajLJV3T
It's more conventional to use ok as the name of the boolean variable. This is equivalent to:
temp, ok := superhero["Superman"]
if ok {
fmt.Println(temp["realname"], temp["city"])
}
ok is true if the key was present in the map. The language has two forms of map access, and therefore two ways to write this statement. Personally I think the slightly more verbose form above, with one more line of code, is much clearer, but you can use either. The other form would be:
if temp, ok := superhero["Superman"]; ok {
fmt.Println(temp["realname"], temp["city"])
}
As above. For more, see Effective Go:
For obvious reasons this is called the “comma ok” idiom. In this
example, if the key is present, the value will be set appropriately and ok
will be true; if not, the value will be set to zero and ok will be
false.
The two forms for accessing maps are:
// value and ok set if key is present, else ok is false
value, ok := m[key]
// value set if key is present, else value is the zero value
value := m[key]
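The difference between the two forms matters most when the zero value is also a legitimate stored value. The following is a minimal sketch of that case; the stock map and the lookup helper are hypothetical names invented for this illustration.

```go
package main

import "fmt"

// lookup returns the stored count and whether the key was present.
func lookup(m map[string]int, key string) (int, bool) {
	v, ok := m[key]
	return v, ok
}

func main() {
	stock := map[string]int{"apples": 0, "pears": 3}

	// Single-value form: a stored zero and a missing key look identical.
	fmt.Println(stock["apples"], stock["plums"])

	// Two-value form: ok tells the two cases apart.
	if n, ok := lookup(stock, "apples"); ok {
		fmt.Println("apples present, count", n)
	}
	if _, ok := lookup(stock, "plums"); !ok {
		fmt.Println("plums missing")
	}
}
```

Variables declared in the if statement's initializer (n and ok here, temp and hero in the question) are scoped to that if statement and its else branches only.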

Using jsonPath looking for a string

I'm trying to use jsonPath and the pick function to determine if a rule needs to run or not based on the current domain. A simplified version of what I'm doing is here:
global
{
dataset shopscotchMerchants <- "https://s3.amazonaws.com/app-files/dev/merchantJson.json" cachable for 2 seconds
}
rule checkdataset is active
{
select when pageview ".*" setting ()
pre
{
merchantData = shopscotchMerchants.pick("$.merchants[?(#.merchant=='Telefora')]");
}
emit
<|
console.log(merchantData);
|>
}
The console output I expect is the Telefora object; instead I get all three objects from the JSON file.
If instead of merchant=='Telefora' I use merchantID==16 then it works great. I thought jsonPath could do matches to strings as well. Although the example above isn't searching against the merchantDomain part of the json, I'm experiencing the same problem with that.
Your problem comes from the fact that, as stated in the documentation, the string equality operators are eq, neq, and like; == is only for numbers. In your case, you want to test whether one string equals another, which is the job of the eq operator.
Simply swap == for eq in your JSONPath filter expression and you will be good to go:
global
{
dataset shopscotchMerchants <- "https://s3.amazonaws.com/app-files/dev/merchantJson.json" cachable for 2 seconds
}
rule checkdataset is active
{
select when pageview ".*" setting ()
pre
{
merchantData = shopscotchMerchants.pick("$.merchants[?(#.merchant eq 'Telefora')]"); // replace == with eq
}
emit
<|
console.log(merchantData);
|>
}
I put this to the test in my own test ruleset, the source for which is below:
ruleset a369x175 {
meta {
name "test-json-filtering"
description <<
>>
author "AKO"
logging on
}
dispatch {
domain "exampley.com"
}
global {
dataset merchant_dataset <- "https://s3.amazonaws.com/app-files/dev/merchantJson.json" cachable for 2 seconds
}
rule filter_some_delicous_json {
select when pageview "exampley.com"
pre {
merchant_data = merchant_dataset.pick("$.merchants[?(#.merchant eq 'Telefora')]");
}
{
emit <|
try { console.log(merchant_data); } catch(e) { }
|>;
}
}
}

Is there a version of the removeElement function in Go for the vector package like Java has in its Vector class?

I am porting some Java code to Google's Go language. After an amazingly smooth port, I have converted all of the code except one part, which I am stuck on. My Go code looks like this, and the section I am talking about is commented out:
func main() {
var puzzleHistory * vector.Vector;
puzzleHistory = vector.New(0);
var puzzle PegPuzzle;
puzzle.InitPegPuzzle(3,2);
puzzleHistory.Push(puzzle);
var copyPuzzle PegPuzzle;
var currentPuzzle PegPuzzle;
currentPuzzle = puzzleHistory.At(0).(PegPuzzle);
isDone := false;
for !isDone {
currentPuzzle = puzzleHistory.At(0).(PegPuzzle);
currentPuzzle.findAllValidMoves();
for i := 0; i < currentPuzzle.validMoves.Len(); i++ {
copyPuzzle.NewPegPuzzle(currentPuzzle.holes, currentPuzzle.movesAlreadyDone);
copyPuzzle.doMove(currentPuzzle.validMoves.At(i).(Move));
// There is no function in Go's Vector that will remove an element like Java's Vector
//puzzleHistory.removeElement(currentPuzzle);
copyPuzzle.findAllValidMoves();
if copyPuzzle.validMoves.Len() != 0 {
puzzleHistory.Push(copyPuzzle);
}
if copyPuzzle.isSolutionPuzzle() {
fmt.Printf("Puzzle Solved");
copyPuzzle.show();
isDone = true;
}
}
}
}
If there is no version available, which I believe there isn't ... does anyone know how I would go about implementing such a thing on my own?
How about Vector.Delete( i ) ?
Right now Go doesn't support generic equality operators. So you'll have to write something that iterates over the vector and removes the correct one.
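For readers on current Go: the container/vector package used in the question was removed before Go 1, and slices are used instead. The "iterate and remove" approach described above can be sketched as follows. This assumes Go 1.18 or later for generics; removeElement is a hypothetical helper written for this illustration, not a standard library function.

```go
package main

import "fmt"

// removeElement deletes the first element equal to target and reports
// whether anything was removed, similar in spirit to Java's
// Vector.removeElement. It reuses the backing array of s.
func removeElement[T comparable](s []T, target T) ([]T, bool) {
	for i, v := range s {
		if v == target {
			return append(s[:i], s[i+1:]...), true
		}
	}
	return s, false
}

func main() {
	history := []string{"start", "middle", "end"}

	// Remove one entry and keep the shortened slice.
	history, removed := removeElement(history, "middle")
	fmt.Println(history, removed)
}
```

The comparable constraint only admits types that support ==. A struct such as PegPuzzle that holds slice or vector fields would likely not qualify, in which case the helper would need to take an equality function as an extra parameter instead.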
