I want to perform a partial match query on a MongoDB in R. I've tried to specify a query that matches the MongoDB query format like so:
library(mongolite)
foo <- mongo(url = "myConnectionString")
bar <- foo$find(
query = '{"_id": /idContainsThis/}',
fields = '{}'
)
But when I try this, I get the following error:
Error: Invalid JSON object: {"_id": /idContainsThis/}
I can't use this solution because if I put quotes round the term, the / is taken as a string literal, not the wildcard I need.
Does anyone know how to make this work with mongolite?
You'll have to use the $regex operator, like this:
query = '{"_id": { "$regex" : "idContainsThis", "$options" : "i" }}'
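The "$options" : "i" part is only needed if you want the match to be case insensitive.
However, I am not sure this will work on an _id. As far as I know, $regex only matches string values, so it should work if your _id values are stored as strings, but not if they are the default ObjectId type.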
I'm building a json query to pass to a mongodb database in R.
In one scenario, I have a vector of dates and I want to query the database to return all records which have a date in the relevant field that matches a date in my vector of dates.
The second scenario is the same as the first, but this time I have a vector of character strings (IDs) and need to return all the records with matching IDs.
I understood the correct way to do this in a json query is to use the $in operator, and then put my vector in an array.
However, when I pass the query to my mongodb database, the exportLogId returns NULL. I'm quite sure that the problem is something to do with how I am representing the $in operator in the final query, since I have very similarly structured queries without the $in operator and they are all working. If I look for just one of my target dates or character strings, I get the desired result.
I followed the mongodb manual here to construct my query, and the only issue I can see is that the $in operator in the output of jsonlite::toJSON() is enclosed in double quotes; whereas I think it might need to be in single quotes (or no quotes at all, but I don't know how to write the syntax for that).
I'm creating my query in two steps:
Create the query as a series of nested lists
Convert the list object to json with jsonlite::toJSON()
Here is my code:
# Load libraries:
library(jsonlite)
# Create list of example dates to query in mongodb format:
sampledates <- c("2022-08-11T00:00:00.000Z",
"2022-08-15T00:00:00.000Z",
"2022-08-16T00:00:00.000Z",
"2022-08-17T00:00:00.000Z",
"2022-08-19T00:00:00.000Z")
# Create query as a list object:
query_list_l <- list(filter =
# Add where clause:
list(where =
# Filter results by list of sample dates:
list(dateSampleTaken = list('$in' = sampledates),
# Define format of column names and values:
useDbColumns = "true",
dontTranslateValues = "true",
jsonReplaceUndefinedWithNull = "true"),
# Define columns to return:
fields = c("id",
"updatedAt",
"person.visualId",
"labName",
"sampleIdentifier",
"dateSampleTaken",
"sequence.hasSequence")))
# Convert list object to JSON:
query_json = jsonlite::toJSON(x = query_list_l,
pretty = TRUE,
auto_unbox = TRUE)
The JSON query now looks like this:
> query_json
{
"filter": {
"where": {
"dateSampleTaken": {
"$in": ["2022-08-11T00:00:00.000Z", "2022-08-15T00:00:00.000Z", "2022-08-16T00:00:00.000Z", "2022-08-17T00:00:00.000Z", "2022-08-19T00:00:00.000Z"]
},
"useDbColumns": "true",
"dontTranslateValues": "true",
"jsonReplaceUndefinedWithNull": "true"
},
"fields": ["id", "updatedAt", "person.visualId", "labName", "sampleIdentifier", "dateSampleTaken", "sequence.hasSequence"]
}
}
As you can see, $in is now enclosed in double quotes, even though I put it in single quotes when I created the query as a list object. I have tried replacing with sprintf() but that just adds a lot of backslashes to my query. I also tried:
query_fixed <- gsub(pattern = "\\"\\$\\in\\"",
replacement = "\\'$in\\'",
x = query_json)
... but this fails with an error.
I would be very grateful to know if:
Is the syntax problem that is preventing $in from working actually the double quotes?
If the double quotes are the problem, how do I replace them with single quotes without messing up the JSON format?
UPDATE:
The issue seems to occur when R is passing the query to the database, but I still can't work out exactly why.
If I try the query out in loopback explorer in the database, it works and using the export log ID produced, I can then fetch the results with httr::GET() in R. Example query results are shown below (sorry for the hashes - the main point is you can see the format of the returned values):
[1] "[{\"_id\":\"e59953b6-a106-4b69-9e25-1c54eef5264a\",\"updatedAt\":\"2022-09-12T20:08:39.554Z\",\"dateSampleTaken\":\"2022-08-16T00:00:00.000Z\",\"labName\":\"LNG_REFERENCE_DATA_CATEGORY_LAB_NAME_LAB_A\",\"sampleIdentifier\":\"LS0044-SCV2-PCR\",\"sequence\":{\"hasSequence\":false},\"person\":{\"visualId\":\"C-2022-0002\"}},{\"_id\":\"af5cd9cc-4813-4194-b60b-7d130bae47bc\",\"updatedAt\":\"2022-09-12T20:11:07.467Z\",\"dateSampleTaken\":\"2022-08-17T00:00:00.000Z\",\"labName\":\"LNG_REFERENCE_DATA_CATEGORY_LAB_NAME_LAB_A\",\"sampleIdentifier\":\"LS0061-SCV2-PCR\",\"sequence\":{\"hasSequence\":false},\"person\":{\"visualId\":\"C-2022-0003\"}},{\"_id\":\"b5930079-8d57-43a8-85c0-c95f7e0338d9\",\"updatedAt\":\"2022-09-12T20:13:54.378Z\",\"dateSampleTaken\":\"2022-08-16T00:00:00.000Z\",\"labName\":\"LNG_REFERENCE_DATA_CATEGORY_LAB_NAME_LAB_A\",\"sampleIdentifier\":\"LS0043-SCV2-PCR\",\"sequence\":{\"hasSequence\":false},\"person\":{\"visualId\":\"C-2022-0004\"}}]"
I have a select input with multiple options and my Mongo query
Here is the array of elements:
c<- c("elen","shallen")
query1 <- paste0('{"client": {"$in"["',c,'"]}')
#sales info is the data base
salesinfo$find(fields = '{"store":true,"_id":false}',query = query1)
Error: Invalid JSON object: {"client": [ elen ]}{"client": [ shallen ]}
This isn't working. Please help, and keep in mind that it is a dynamic array, so the values will change.
After extensive research I found a way to solve the issue, and I hope my solution will help others in the same situation.
q1=paste(shQuote(c, type="cmd"), collapse=", ")
This step turns the array into a single comma-separated string of double-quoted values, which then goes into the query:
query =paste0('{"store":{"$in":[',q1,']}}')
The next step is to pass that query to find():
salesinfo$find(fields = '{"store":true,"_id":false}',query = query)
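For anyone building the same kind of filter from Python rather than R, here is a minimal sketch of the idea behind the steps above: serialize the whole vector once and let a JSON library do the quoting, instead of pasting quotes in by hand. The `clients` list and the `build_in_query` helper are made-up names for illustration; only the standard `json` module is used.

```python
import json


def build_in_query(field, values):
    """Return a MongoDB-style {"field": {"$in": [...]}} filter as a JSON string."""
    # json.dumps quotes and escapes every element, so the values can change freely
    return json.dumps({field: {"$in": list(values)}})


clients = ["elen", "shallen"]  # hypothetical dynamic input
query = build_in_query("store", clients)
print(query)
```

Because the serializer handles the escaping, values that contain quotes or backslashes should still produce valid JSON.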
Is there any way to pass a list of search strings in the contains() method of FilterExpression in DynamoDb?
Something like below:
search_str = ['value-1', 'value-2', 'value-3']
result = kb_table.scan(
FilterExpression="contains (title, :titleVal)",
ExpressionAttributeValues={ ":titleVal": search_str }
)
For now I can only think of looping through the list and scanning the table multiple times (as in below code), but I think it will be resource heavy.
for item in search_str:
    result += kb_table.scan(
        FilterExpression="contains (title, :titleVal)",
        ExpressionAttributeValues={ ":titleVal": item }
    )
Any suggestions?
For the above scenario, CONTAINS should be used with an OR condition. When you give an array as input to CONTAINS, DynamoDB checks against a SET attribute ("SS", "NS", or "BS"). It doesn't look for a substring of the string attribute.
If the target attribute of the comparison is of type String, then the
operator checks for a substring match. If the target attribute of the
comparison is of type Binary, then the operator looks for a
subsequence of the target that matches the input. If the target
attribute of the comparison is a set ("SS", "NS", or "BS"), then the
operator evaluates to true if it finds an exact match with any member
of the set.
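Example:
from boto3.dynamodb.conditions import Attr

movies1 = "MyMovie"
movies2 = "Big New"
fe1 = Attr('title').contains(movies1)
fe2 = Attr('title').contains(movies2)
# Combine the conditions with |; Python's `or` would simply return fe1
response = table.scan(
    FilterExpression=fe1 | fe2
)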
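If you would rather keep the string form of FilterExpression used in the question, the OR chain can be generated from the list. The following is only a sketch: `build_contains_filter` is a made-up helper, and the `#a` / `:v0` placeholder names are arbitrary choices. The final scan call is left as a comment because it needs a real table.

```python
def build_contains_filter(attr, values):
    """Build an OR-chained contains() expression plus its placeholder maps."""
    # "#a" aliases the attribute name so reserved words cannot clash with it
    names = {"#a": attr}
    # one value placeholder per search string: ":v0", ":v1", ...
    placeholders = {f":v{i}": v for i, v in enumerate(values)}
    expression = " OR ".join(f"contains(#a, {key})" for key in placeholders)
    return expression, names, placeholders


search_str = ["value-1", "value-2", "value-3"]
expression, names, placeholders = build_contains_filter("title", search_str)
print(expression)
# The three parts would then go into a single scan, for example:
# kb_table.scan(FilterExpression=expression,
#               ExpressionAttributeNames=names,
#               ExpressionAttributeValues=placeholders)
```

This issues one scan instead of one per search string. An empty list yields an empty expression, so that case would need to be handled before calling scan.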
A little late, but so that people can find a solution, here is my method.
Let's assume that your DB has a property called 'EMAIL' and you want to filter your scan on EMAIL against a list of values. You can proceed as follows.
import boto3
from boto3.dynamodb.conditions import Attr  # needed so eval() can resolve Attr

list_of_elem=['mail1#mail.com','mail2#mail.com','mail3#mail.com']
# start with an empty string to build the query
stringquery=""
# loop over each element in the list
for index,value in enumerate(list_of_elem):
    # add a contains() condition for this mail value
    stringquery=stringquery+f"Attr('EMAIL').contains('{value}')"
    # unless this is the last element in the list, add the OR operator
    if index < len(list_of_elem)-1:
        stringquery=stringquery+ ' | '
dynamodb = boto3.resource('dynamodb')
tableUser = dynamodb.Table('mytable')
# use eval() to parse the query string into a filter expression
tableUser.scan(
    FilterExpression=eval(stringquery)
)
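The eval() call can be avoided, which matters if the values come from user input. A possible alternative, sketched here under assumptions, is to fold the conditions together with functools.reduce and the | operator. `FakeCondition` is a stand-in class written only so the sketch runs without boto3 or AWS access; it is not part of any library, and `any_of` is a made-up helper name.

```python
from functools import reduce
from operator import or_


def any_of(conditions):
    """Fold a non-empty list of conditions into one using the | operator."""
    return reduce(or_, conditions)


class FakeCondition:
    """Stand-in for a boto3 condition so the sketch runs without AWS access."""

    def __init__(self, text):
        self.text = text

    def __or__(self, other):
        return FakeCondition(f"({self.text} OR {other.text})")


mails = ["mail1#mail.com", "mail2#mail.com", "mail3#mail.com"]
combined = any_of([FakeCondition(f"contains(EMAIL, {m})") for m in mails])
print(combined.text)
# With boto3 the list would instead be [Attr('EMAIL').contains(m) for m in mails]
# and the result of any_of(...) would be passed as FilterExpression.
```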
I have a structure like this:
e.item.fatturato_ac_s1
e.item.fatturato_ac_s2
e.item.fatturato_ac_s3
e.item.fatturato_ac_s4
[...]
and so on...
In order to compute the property name dynamically, I wrote:
e.item.(myStr.toString()) where myStr (type string) = "fatturato_ac_s" + Index (so I can have fatturato_ac_s1, fatturato_ac_s2, ...)
I can correctly retrieve the value of e.item.(myStr.toString()) (a numeric value), but if I try to put it in a variable I get the error in the title:
myVariable = e.item.(myStr.toString())
myVariable is a Number.
I also tried:
myVariable = Number(e.item.(myStr.toString()))
but that doesn't work either, and the same happens if I try String to String.
How can I solve it?
Thank you!
This is the correct syntax:
myVariable = Number(e.item[myStr])
I now have a full path for a file as a string like:
"/db/Liebherr/Content_Repository/Techpubs/Topics/HyraulicPowerDistribution/Released/TRN_282C_HYD_MOD_1_Drive_Shaft_Rev000.xml"
However, now I need to extract only the folder path, i.e. the above string without whatever follows the last slash, like:
"/db/Liebherr/Content_Repository/Techpubs/Topics/HyraulicPowerDistribution/Released/"
But it seems that the substring() function in XQuery only has the forms substring(string,start,len) and substring(string,start). I am trying to figure out a way to locate the last occurrence of the slash, but with no luck.
Could experts help? Thanks!
Try out the tokenize() function (for splitting a string into its component parts) and then re-assembling it, using everything but the last part.
let $full-path := "/db/Liebherr/Content_Repository/Techpubs/Topics/HyraulicPowerDistribution/Released/TRN_282C_HYD_MOD_1_Drive_Shaft_Rev000.xml",
$segments := tokenize($full-path,"/")[position() ne last()]
return
concat(string-join($segments,'/'),'/')
For more details on these functions, check out their reference pages:
fn:tokenize()
fn:string-join()
fn:replace can do the job with a regular expression:
replace("/db/Liebherr/Content_Repository/Techpubs/Topics/HyraulicPowerDistribution/Released/TRN_282C_HYD_MOD_1_Drive_Shaft_Rev000.xml",
"[^/]+$",
"")
This can be done even with a single XPath 2.0 (subset of XQuery) expression:
substring($fullPath,
1,
string-length($fullPath) - string-length(tokenize($fullPath, '/')[last()])
)
where $fullPath should be substituted with the actual string, such as:
"/db/Liebherr/Content_Repository/Techpubs/Topics/HyraulicPowerDistribution/Released/TRN_282C_HYD_MOD_1_Drive_Shaft_Rev000.xml"
The following code tokenizes, removes the last token, replaces it with an empty string, and joins back.
string-join(
(
tokenize(
"/db/Liebherr/Content_Repository/Techpubs/Topics/HyraulicPowerDistribution/Released/TRN_282C_HYD_MOD_1_Drive_Shaft_Rev000.xml",
"/"
)[position() ne last()],
""
),
"/"
)
It seems to return the desired result on try.zorba-xquery.com. Does this help?
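As a cross-check of the logic outside XQuery, the same tokenize, drop-last, append-empty and join steps can be sketched in Python. `folder_of` is a made-up helper name, and this is only an illustration of the algorithm, not a replacement for the XQuery above.

```python
def folder_of(full_path):
    """Return the path up to and including the last '/'."""
    segments = full_path.split("/")
    # drop the last segment and append an empty string so the join ends with '/'
    return "/".join(segments[:-1] + [""])


full_path = ("/db/Liebherr/Content_Repository/Techpubs/Topics/"
             "HyraulicPowerDistribution/Released/"
             "TRN_282C_HYD_MOD_1_Drive_Shaft_Rev000.xml")
print(folder_of(full_path))
```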