Mongo query that matches field to any element of array - r

I am trying to query a MongoDB database through R (rmongodb package). I have a simple requirement:
Return records where the field "email" matches any of the emails in the vector usr$email. I think I am close but just not able to find the right syntax to pull it through.
I saw this response to an earlier question (Mongo: If any array position matches single query) and am trying along the lines:
eids_l <- paste0("'", unique(usr$email), "'", collapse=", ")
eids_l1 <- sprintf("[ %s ]", eids_l)
q <- sprintf('{"email": {"$in": %s}}', eids_l1)
cursor <- mongo.find.all(mongo, namespace, q)
I still get an error:
Error in mongo.bson.from.JSON(arg) :
Not a valid JSON content: {"email": {"$in": [ 'xx@gmail.com',

cursor <- mongo.find.all(mongo, "namespace", query='{ "email": {
"$in": ["xx@gmail.com", "yy@gmail.com", "zz@gmail.com" ] } }')
Be careful with apostrophes (') and quotation marks ("): JSON requires double quotes around strings, so wrap the whole query in single quotes in R.
I always use the rmongodb Cheat sheet:
https://cran.r-project.org/web/packages/rmongodb/vignettes/rmongodb_cheat_sheet.pdf
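To build the same kind of query from a vector instead of typing the addresses by hand, something along these lines may work. It is a minimal base R sketch: build_in_query() is a hypothetical helper (not part of rmongodb), the addresses are made-up examples, and encodeString() is assumed to be adequate escaping for plain ASCII values.

```r
# Hypothetical helper: build a {"field": {"$in": [...]}} JSON string
# with every value wrapped in double quotes.
build_in_query <- function(field, values) {
  # encodeString() adds the double quotes and escapes embedded quotes/backslashes.
  quoted <- encodeString(unique(values), quote = '"')
  sprintf('{"%s": {"$in": [%s]}}', field, paste(quoted, collapse = ", "))
}

# Made-up addresses standing in for usr$email.
emails <- c("a@example.com", "b@example.com", "a@example.com")
q <- build_in_query("email", emails)
print(q)

# The string could then be passed on as in the call above, e.g.:
# cursor <- mongo.find.all(mongo, "namespace", query = q)
```

Values containing non-ASCII characters would probably be safer to serialise with a JSON library such as jsonlite.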

r json mongodb query $in operator syntax error due to double quotes?

I'm building a json query to pass to a mongodb database in R.
In one scenario, I have a vector of dates and I want to query the database to return all records which have a date in the relevant field that matches a date in my vector of dates.
The second scenario is the same as the first, but this time I have a vector of character strings (IDs) and need to return all the records with matching IDs.
I understood the correct way to do this in a json query is to use the $in operator, and then put my vector in an array.
However, when I pass the query to my mongodb database, the exportLogId returns NULL. I'm quite sure that the problem is something to do with how I am representing the $in operator in the final query, since I have very similarly structured queries without the $in operator and they are all working. If I look for just one of my target dates or character strings, I get the desired result.
I followed the mongodb manual here to construct my query, and the only issue I can see is that the $in operator in the output of jsonlite::toJSON() is enclosed in double quotes; whereas I think it might need to be in single quotes (or no quotes at all, but I don't know how to write the syntax for that).
I'm creating my query in two steps:
1. Create the query as a series of nested lists
2. Convert the list object to JSON with jsonlite::toJSON()
Here is my code:
# Load libraries:
library(jsonlite)
# Create list of example dates to query in mongodb format:
sampledates <- c("2022-08-11T00:00:00.000Z",
                 "2022-08-15T00:00:00.000Z",
                 "2022-08-16T00:00:00.000Z",
                 "2022-08-17T00:00:00.000Z",
                 "2022-08-19T00:00:00.000Z")
# Create query as a list object:
query_list_l <- list(filter =
  # Add where clause:
  list(where =
    # Filter results by list of sample dates:
    list(dateSampleTaken = list('$in' = sampledates),
         # Define format of column names and values:
         useDbColumns = "true",
         dontTranslateValues = "true",
         jsonReplaceUndefinedWithNull = "true"),
    # Define columns to return:
    fields = c("id",
               "updatedAt",
               "person.visualId",
               "labName",
               "sampleIdentifier",
               "dateSampleTaken",
               "sequence.hasSequence")))
# Convert list object to JSON:
query_json = jsonlite::toJSON(x = query_list_l,
                              pretty = TRUE,
                              auto_unbox = TRUE)
The JSON query now looks like this:
> query_json
{
  "filter": {
    "where": {
      "dateSampleTaken": {
        "$in": ["2022-08-11T00:00:00.000Z", "2022-08-15T00:00:00.000Z", "2022-08-16T00:00:00.000Z", "2022-08-17T00:00:00.000Z", "2022-08-19T00:00:00.000Z"]
      },
      "useDbColumns": "true",
      "dontTranslateValues": "true",
      "jsonReplaceUndefinedWithNull": "true"
    },
    "fields": ["id", "updatedAt", "person.visualId", "labName", "sampleIdentifier", "dateSampleTaken", "sequence.hasSequence"]
  }
}
As you can see, $in is now enclosed in double quotes, even though I put it in single quotes when I created the query as a list object. I have tried replacing with sprintf() but that just adds a lot of backslashes to my query. I also tried:
query_fixed <- gsub(pattern = "\\"\\$\\in\\"",
replacement = "\\'$in\\'",
x = query_json)
... but this fails with an error.
I would be very grateful to know:
1. Are the double quotes actually the syntax problem that is preventing $in from working?
2. If the double quotes are the problem, how do I replace them with single quotes without messing up the JSON format?
UPDATE:
The issue seems to occur when R is passing the query to the database, but I still can't work out exactly why.
If I try the query out in loopback explorer in the database, it works and using the export log ID produced, I can then fetch the results with httr::GET() in R. Example query results are shown below (sorry for the hashes - the main point is you can see the format of the returned values):
[1] "[{\"_id\":\"e59953b6-a106-4b69-9e25-1c54eef5264a\",\"updatedAt\":\"2022-09-12T20:08:39.554Z\",\"dateSampleTaken\":\"2022-08-16T00:00:00.000Z\",\"labName\":\"LNG_REFERENCE_DATA_CATEGORY_LAB_NAME_LAB_A\",\"sampleIdentifier\":\"LS0044-SCV2-PCR\",\"sequence\":{\"hasSequence\":false},\"person\":{\"visualId\":\"C-2022-0002\"}},{\"_id\":\"af5cd9cc-4813-4194-b60b-7d130bae47bc\",\"updatedAt\":\"2022-09-12T20:11:07.467Z\",\"dateSampleTaken\":\"2022-08-17T00:00:00.000Z\",\"labName\":\"LNG_REFERENCE_DATA_CATEGORY_LAB_NAME_LAB_A\",\"sampleIdentifier\":\"LS0061-SCV2-PCR\",\"sequence\":{\"hasSequence\":false},\"person\":{\"visualId\":\"C-2022-0003\"}},{\"_id\":\"b5930079-8d57-43a8-85c0-c95f7e0338d9\",\"updatedAt\":\"2022-09-12T20:13:54.378Z\",\"dateSampleTaken\":\"2022-08-16T00:00:00.000Z\",\"labName\":\"LNG_REFERENCE_DATA_CATEGORY_LAB_NAME_LAB_A\",\"sampleIdentifier\":\"LS0043-SCV2-PCR\",\"sequence\":{\"hasSequence\":false},\"person\":{\"visualId\":\"C-2022-0004\"}}]"
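On the quoting question: JSON object keys are strings, and JSON strings are delimited by double quotes, so "$in" in the toJSON() output is the standard form and is unlikely to be the cause on its own. For completeness, a literal replacement can be done with fixed = TRUE, which avoids the escaping trouble in the gsub() attempt. This is only a sketch on a shortened stand-in for the query (an assumption, not the full output above), and the single-quoted result is no longer valid JSON, so a strict parser may reject it.

```r
# Shortened stand-in for the toJSON() output (assumption: only the relevant fragment).
query_json <- '{"dateSampleTaken": {"$in": ["2022-08-11T00:00:00.000Z"]}}'

# fixed = TRUE matches the pattern literally, so "$" needs no escaping.
query_single <- gsub('"$in"', "'$in'", query_json, fixed = TRUE)

print(query_single)
```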

Kusto extractjson not working with email address

I am attempting to use the extractjson() function on source data whose keys include email addresses (specifically the @ symbol).
let T = datatable(MyString:string)
[
'{"user@domain.com": {"value":10}, "userdomain.com": { "value": 5}}'
];
T
| project extractjson('$.["user@domain.com"].value', MyString)
This returns null, whereas changing the JSONPath to '$.["userdomain.com"].value' returns the correct result.
I know the @ sign is used as the current node in a filter expression. Does it need to be escaped when used with KQL?
As a side note, I ran the same test using Node's 'jsonpath' package and it worked as expected.
const jp = require('jsonpath');
const data = {"user@domain.com": {"value":10}, "name2": { "value": 5}};
console.log(jp.query(data, '$["user@domain.com"].value'));
You can use the parse_json() function instead, if you don't have to use extractjson():
print MyString = '{"user@domain.com": {"value":10}, "userdomain.com": { "value": 5}}'
| project parse_json(MyString)["user@domain.com"].value
MyString_user@domain.com_value
10
From the documentation: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/extractjsonfunction

mongolite: how to perform a LIKE query?

I want to perform a partial match query on a MongoDB in R. I've tried to specify a query that matches the MongoDB query format like so:
library(mongolite)
foo <- mongo(url = "myConnectionString")
bar <- foo$find(
query = '{"_id": /idContainsThis/}',
fields = '{}'
)
But when I try this, I get the following error:
Error: Invalid JSON object: {"_id": /idContainsThis/}
I can't use this solution because if I put quotes round the term, the / is taken as a string literal, not the wildcard I need.
Does anyone know how to make this work with mongolite?
You'll have to use the $regex operator, like this:
query = '{"_id": { "$regex" : "idContainsThis", "$options" : "i" }}'
The "$options" : "i" is in case you want it to be case insensitive.
However I am not sure if this will work on an _id
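If the search term comes from a variable, the query string can be assembled in base R. The following is a rough sketch, not mongolite API: make_regex_query() is a hypothetical helper, and encodeString() is assumed to be sufficient escaping for ASCII patterns (it doubles backslashes, which JSON requires).

```r
# Hypothetical helper: build a {"field": {"$regex": ..., "$options": ...}} JSON string.
make_regex_query <- function(field, pattern, ignore_case = TRUE) {
  opts <- if (ignore_case) "i" else ""
  sprintf('{"%s": {"$regex": %s, "$options": "%s"}}',
          field,
          encodeString(pattern, quote = '"'),  # adds the double quotes, escapes \ and "
          opts)
}

q <- make_regex_query("_id", "idContainsThis")
print(q)

# With the connection object from the question this would be used as:
# bar <- foo$find(query = q)
```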

mongolite filtering with dynamic array in r shiny

I have a select input with multiple options that feeds my Mongo query.
Here is the array of elements:
c<- c("elen","shallen")
query1 <- paste0('{"client": {"$in"["',c,'"]}')
# salesinfo is the database
salesinfo$find(fields = '{"store":true,"_id":false}',query = query1)
Error: Invalid JSON object: {"client": [ elen ]}{"client": [ shallen ]}
This isn't working. Please keep in mind that the array is dynamic and its values will change.
After extensive research I found a way to solve the issue, and I hope my solution will help others in the same situation.
q1 <- paste(shQuote(c, type = "cmd"), collapse = ", ")
This step turns the array into a single string of double-quoted, comma-separated values. Then build the query:
query <- paste0('{"store":{"$in":[', q1, ']}}')
The next step is to pass it to find():
salesinfo$find(fields = '{"store":true,"_id":false}',query = query)
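To see what those steps produce, here is a small base R sketch wrapped in a function so it can be re-run whenever the selection changes. The function name in_query() is hypothetical, and the field name "client" is taken from the question (the answer above filters on "store" instead).

```r
# Hypothetical wrapper around the two steps above.
in_query <- function(field, values) {
  # shQuote(type = "cmd") wraps each value in double quotes.
  quoted <- paste(shQuote(values, type = "cmd"), collapse = ", ")
  paste0('{"', field, '": {"$in": [', quoted, ']}}')
}

# Two different selections give two different queries.
q_a <- in_query("client", c("elen", "shallen"))
q_b <- in_query("client", "elen")
print(q_a)
print(q_b)
```

In a Shiny app the vector would presumably come from the select input rather than being hard-coded.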

MongoDB query using .attrs attribute

Given I have the following json:
{
"A" : {...},
".attrs" : {"A1": "1" }
}
I'd like to query using rmongodb package in R. I'm unable to query A.attrs field values. Any suggestions?
mongo <- mongo.create()
if (mongo.is.connected(mongo)) {
  buf <- mongo.bson.buffer.create()
  mongo.bson.buffer.append(buf, "A.attrs", "1")
  query <- mongo.bson.from.buffer(buf)
  # assume "db.collection" is correct
  cursor <- mongo.find(mongo, "db.collection", query, limit=1000L)
  # Step through the matching records and display them
  while (mongo.cursor.next(cursor))
    print(mongo.cursor.value(cursor))
  mongo.cursor.destroy(cursor)
}
I understand that (.) is not valid in a field name in Mongo; however, it was generated using an XML to JSON converter.
"\uff0E" as escape character didn't help.
It's probably best to rename .attrs to a valid convention, but there are several .attrs at various nested levels in json.
The period in the key is a problem. Assuming the key is stored as-is, I guess you should construct your query like:
mongo.bson.buffer.append(buf, ".attrs.A1", "1")
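If renaming turns out to be the better route, the nesting need not be an obstacle when the converted document is available as a nested R list before it is stored. The sketch below is base R only and rests on assumptions: rename_keys() is a hypothetical helper, "attrs" is an arbitrary replacement name, and doc is a made-up document shaped like the one in the question.

```r
# Hypothetical helper: recursively rename a key at every nesting level of a list.
rename_keys <- function(x, from = ".attrs", to = "attrs") {
  if (!is.list(x)) return(x)          # leaf value: nothing to rename
  if (!is.null(names(x))) {
    names(x)[names(x) == from] <- to  # rename matching keys at this level
  }
  lapply(x, rename_keys, from = from, to = to)  # recurse into children
}

# Made-up document with .attrs at two levels.
doc <- list(
  A = list(B = 1, .attrs = list(B1 = "2")),
  .attrs = list(A1 = "1")
)
clean <- rename_keys(doc)
str(clean)
```

After such a rename, a dotted path like "attrs.A1" would no longer clash with Mongo's use of the period as a path separator.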
