I'm trying to query a MongoDB via the R driver rmongodb. The following query works on the cmd line (result: 204,915):
db.col1.count(
{
$or: [
{'status.time':{$gt: ISODate('2013-09-10 00:00:00')}},
{'editings.time':{$gt: ISODate('2013-09-10 00:00:00')}}
]
} );
Translating this into R, I tried:
d<-strptime('2013-09-10', format='%Y-%m-%d')
buf <- mongo.bson.buffer.create()
mongo.bson.buffer.start.array(buf, "$or")
mongo.bson.buffer.start.object(buf, 'status.time')
mongo.bson.buffer.append(buf, "$gt", d)
mongo.bson.buffer.finish.object(buf)
mongo.bson.buffer.start.object(buf, 'editings.time')
mongo.bson.buffer.append(buf, "$gt", d)
mongo.bson.buffer.finish.object(buf)
EDIT: This is what the query prints in R:
>mongo.bson.from.buffer(buf)
$or : 4
status.time : 3
$gt : 9 79497984
editings.time : 3
$gt : 9 79497984
Executing the query using...
mongo.count(mongo, db1.col1, query=mongo.bson.from.buffer(buf))
...gives me "-1". I tried several variants of the BSON, all with the same result. Using only one of the conditions (without the $or array) works, however. Does anyone see my mistake?
BTW: I'm aware of the thread rmongodb: using $or in query, however, the suggested answer to use the alternative driver RMongo does not satisfy other requirements of my code.
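Your way of creating a Mongo BSON array is wrong. Each element of the array has to be wrapped in its own object, named after its position in the array ("0", "1", ...). You are missing these parts: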
mongo.bson.buffer.start.object(buf, "0")
...
mongo.bson.buffer.finish.object(buf)
mongo.bson.buffer.start.object(buf, "1")
...
mongo.bson.buffer.finish.object(buf)
For a working example please check the latest comment on:
https://github.com/mongosoup/rmongodb/issues/17
I hope this works for now. I am working on an easier solution!
To avoid having to compose the sequence of mongo.bson.buffer-statements I wrote a package (rmongodbHelper) that will translate a JSON or a list() to a BSON object which can then be used with rmongodb.
First let's set up the environment:
library(rmongodb)
# install rmongodbHelper package from GitHub
library(devtools)
devtools::install_github("joyofdata/rmongodbHelper")
library(rmongodbHelper)
# the MongoDB instance
ns <- "dbx.collx"
M <- mongo.create()
mongo.is.connected(M)
mongo.remove(M, ns, json_to_bson("{}"))
# inserting a number of dummy objects
# JSON keys currently are expected to be wrapped in double quotes!
objs <- c(
'{"_id":"__int(0)", "dates":{}}',
'{"_id":"__int(1)", "dates":{"a":"__time(2013-01-01)", "b":"__time(2013-01-01)"}}',
'{"_id":"__int(2)", "dates":{"a":"__time(2013-01-01)", "b":"__time(2014-01-01)"}}',
'{"_id":"__int(3)", "dates":{"a":"__time(2014-01-01)", "b":"__time(2013-01-01)"}}',
'{"_id":"__int(4)", "dates":{"a":"__time(2014-01-01)", "b":"__time(2014-01-01)"}}'
)
for(obj in objs) {
mongo.insert(M, ns, json_to_bson(obj))
}
Let's see via MongoDB shell if they were successfully inserted:
> use dbx
switched to db dbx
> db.collx.find().pretty()
{ "_id" : 0, "dates" : { } }
{
"_id" : 1,
"dates" : {
"a" : ISODate("2013-01-01T00:00:00Z"),
"b" : ISODate("2013-01-01T00:00:00Z")
}
}
[...]
{
"_id" : 4,
"dates" : {
"a" : ISODate("2014-01-01T00:00:00Z"),
"b" : ISODate("2014-01-01T00:00:00Z")
}
}
Now let's search for documents with a query:
# searching for those objects
# JSON keys currently are expected to be wrapped in double quotes!
json_qry <-
'{
"$or": [
{"dates.a":{"$gt": "__time(2013-06-10)"}},
{"dates.b":{"$gt": "__time(2013-06-10)"}}
]
}'
cur <- mongo.find(M, "dbx.collx", json_to_bson(json_qry))
while(mongo.cursor.next(cur)) {
print(mongo.cursor.value(cur))
}
And this is what we get in the end:
_id : 16 2
dates : 3
a : 9 -211265536
b : 9 1259963392
_id : 16 3
dates : 3
a : 9 1259963392
b : 9 -211265536
_id : 16 4
dates : 3
a : 9 1259963392
b : 9 1259963392
Keys, including operators like $or, need to be put in double quotes.
"x":3 leads to 3 being cast as a double.
"x":"__int(3)" leads to 3 being cast as an integer.
Question
Using the mongolite package in R, how do you query a database for a given date?
Example Data
Consider a test collection with two entries
library(mongolite)
## create dummy data
df <- data.frame(id = c(1,2),
dte = as.POSIXct(c("2015-01-01","2015-01-02")))
> df
id dte
1 1 2015-01-01
2 2 2015-01-02
## insert into database
mong <- mongo(collection = "test", db = "test", url = "mongodb://localhost")
mong$insert(df)
Mongo shell query
To find the entries after a given date I would use
db.test.find({"dte" : {"$gt" : new ISODate("2015-01-01")}})
How can I reproduce this query in R using mongolite?
R attempts
So far I have tried
qry <- paste0('{"dte" : {"$gt" : new ISODate("2015-01-01")}}')
mong$find(qry)
Error: Invalid JSON object: {"dte" : {"$gt" : new ISODate("2015-01-01")}}
qry <- paste0('{"dte" : {"$gt" : "2015-01-01"}}')
mong$find(qry)
Imported 0 records. Simplifying into dataframe...
data frame with 0 columns and 0 rows
qry <- paste0('{"dte" : {"gt" : ', as.POSIXct("2015-01-01"), '}}')
mong$find(qry)
Error: Invalid JSON object: {"dte" : {"gt" : 2015-01-01}}
qry <- paste0('{"dte" : {"gt" : new ISODate("', as.POSIXct("2015-01-01"), '")}}')
mong$find(qry)
Error: Invalid JSON object: {"dte" : {"gt" : new ISODate("2015-01-01")}}
@user2754799 has the correct method, but I've made a couple of small changes so that it answers my question. If they want to edit their answer with this solution, I'll accept it.
d <- as.integer(as.POSIXct(strptime("2015-01-01","%Y-%m-%d"))) * 1000
## or more concisely
## d <- as.integer(as.POSIXct("2015-01-01")) * 1000
data <- mong$find(paste0('{"dte":{"$gt": { "$date" : { "$numberLong" : "', d, '" } } } }'))
as this question keeps showing up at the top of my google results when i forget AGAIN how to query dates in mongolite and am too lazy to go find the documentation:
the above Mongodb shell query,
db.test.find({"dte" : {"$gt" : new ISODate("2015-01-01")}})
now translates to
mong$find('{"dte":{"$gt":{"$date":"2015-01-01T00:00:00Z"}}}')
optionally, you can add millis:
mong$find('{"dte":{"$gt":{"$date":"2015-01-01T00:00:00.000Z"}}}')
if you use the wrong datetime format, you get a helpful error message pointing you to the correct format: use ISO8601 format yyyy-mm-ddThh:mm plus timezone, either "Z" or like "+0500"
of course, this is also documented in the mongolite manual
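A minimal sketch of building that query string from an R date using only base R. The helper name date_query and its parameter names are assumptions made for illustration and are not part of mongolite; the resulting string is what you would pass to mong$find().

```r
# Hypothetical helper: build a "$gt" date filter in MongoDB extended JSON.
# 'field' and 'when' are illustrative parameter names.
date_query <- function(field, when) {
  # Format the timestamp in UTC as ISO 8601 with a trailing "Z"
  iso <- format(as.POSIXct(when, tz = "UTC"), "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")
  sprintf('{"%s":{"$gt":{"$date":"%s"}}}', field, iso)
}

qry <- date_query("dte", "2015-01-01")
cat(qry, "\n")
```

Note that the sketch interprets a plain date string as midnight UTC, which may differ from as.POSIXct("2015-01-01") in your local time zone.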
try mattjmorris's answer from github
library(GetoptLong)
datemillis <- as.integer(as.POSIXct("2015-01-01")) * 1000
data <- data_collection$find(qq('{"createdAt":{"$gt": { "$date" : { "$numberLong" : "#{datemillis}" } } } }'))
reference: https://github.com/jeroenooms/mongolite/issues/5#issuecomment-160996514
Before converting your date to milliseconds by multiplying it by 1000, set options(scipen=1000). Without this workaround, some millisecond values are pasted into the query in scientific notation (for example 1.4e+12), which breaks the query for certain dates.
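As a possible alternative that avoids changing a global option, the millisecond value can be formatted explicitly so that it is never written in scientific notation. This is only a sketch: the helper name millis_string is made up for illustration, and it assumes the date should be interpreted in UTC.

```r
# Hypothetical helper: milliseconds since the epoch as a plain digit string.
millis_string <- function(when) {
  secs <- as.numeric(as.POSIXct(when, tz = "UTC"))
  # "%.0f" always prints fixed notation, regardless of options(scipen)
  sprintf("%.0f", secs * 1000)
}

d <- millis_string("2015-01-01")
qry <- paste0('{"dte":{"$gt":{"$date":{"$numberLong":"', d, '"}}}}')
cat(qry, "\n")
```

A value such as 1400000000 seconds, which paste0 would otherwise turn into 1.4e+12 after the multiplication, comes out as the digits 1400000000000.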
This code is meant to compute the total distance of some given coordinates, but I don't know why it's not working.
The error is: Error in lis[[i]] : attempt to select less than one element.
Here is the code:
distant<-function(a,b)
{
return(sqrt((a[1]-b[1])^2+(a[2]-b[2])^2))
}
totdistance<-function(lis)
{
totdis=0
for(i in 1:length(lis)-1)
{
totdis=totdis+distant(lis[[i]],lis[[i+1]])
}
totdis=totdis+distant(lis[[1]],lis[[length(lis)]])
return(totdis)
}
liss1<-list()
liss1[[1]]<-c(12,12)
liss1[[2]]<-c(18,23)
liss1[[4]]<-c(29,25)
liss1[[5]]<-c(31,52)
liss1[[3]]<-c(24,21)
liss1[[6]]<-c(36,43)
liss1[[7]]<-c(37,14)
liss1[[8]]<-c(42,8)
liss1[[9]]<-c(51,47)
liss1[[10]]<-c(62,53)
liss1[[11]]<-c(63,19)
liss1[[12]]<-c(69,39)
liss1[[13]]<-c(81,7)
liss1[[14]]<-c(82,18)
liss1[[15]]<-c(83,40)
liss1[[16]]<-c(88,30)
Output:
> totdistance(liss1)
Error in lis[[i]] : attempt to select less than one element
> distant(liss1[[2]],liss1[[3]])
[1] 6.324555
Let me reproduce your error in a simple way:
> list1 = list()
> list1[[0]] = list(a = c("a"))
Error in list1[[0]] = list(a = c("a")) :
attempt to select less than one element
So the next question is: where are you accessing index 0 of the list?
(List indexing starts at 1 in R.)
As Molx indicated previously: "The : operator is evaluated before the subtraction". This is what causes the access at index 0.
For example:
> 1:10-1
[1] 0 1 2 3 4 5 6 7 8 9
> 1:(10-1)
[1] 1 2 3 4 5 6 7 8 9
So replace the loop in your code with:
for(i in 1:(length(lis)-1))
{
totdis=totdis+distant(lis[[i]],lis[[i+1]])
}
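For reference, here is a sketch of the whole function with the precedence problem removed. It uses seq_len() instead of 1:(n-1), which is a swapped-in variation on the fix above: it also behaves sensibly for lists with fewer than two points. The unit square used to try it out is made-up test data.

```r
# Euclidean distance between two 2-d points
distant <- function(a, b) sqrt((a[1] - b[1])^2 + (a[2] - b[2])^2)

totdistance <- function(lis) {
  n <- length(lis)
  totdis <- 0
  # seq_len(k) is 1, 2, ..., k and is empty for k = 0, so the loop
  # body never sees index 0
  for (i in seq_len(max(n - 1, 0))) {
    totdis <- totdis + distant(lis[[i]], lis[[i + 1]])
  }
  # Close the tour: from the last point back to the first
  if (n > 1) totdis <- totdis + distant(lis[[1]], lis[[n]])
  totdis
}

square <- list(c(0, 0), c(1, 0), c(1, 1), c(0, 1))
print(totdistance(square))  # perimeter of the unit square: 4
```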
I am trying to write a generic upsert for a tick Database in R.
The python code would be:
collection.update({'symbol':'somesymbol', 'sha':'SoM3__W3|Re|7__Sh#'},
{'$set':{'segment':5},
'$addToSet': {'parent':parent_id}},
upsert=True)
In R I am using rmongodb and trying to build the BSON Objects
#get the query
mtch_b<-mongo.bson.buffer.create()
mongo.bson.buffer.append(mtch_b, "symbol", "somesymbol")
mongo.bson.buffer.append(mtch_b, "sha", "SoM3__W3|Re|7__Sh#")
mtch<-mongo.bson.from.buffer(mtch_b)
#set the segment
qry_b<-mongo.bson.buffer.create()
mongo.bson.buffer.start.object(qry_b, "$set")
mongo.bson.buffer.append(qry_b, "segment", 5)
mongo.bson.buffer.start.object(qry_b, "$addToSet")
mongo.bson.buffer.append(qry_b, "parent", "Initial")
mongo.bson.buffer.finish.object(qry_b) #end of $addtoSet object
mongo.bson.buffer.finish.object(qry_b) #end of $set object
qry_bsn <-mongo.bson.from.buffer(qry_b)
mongo.update(mongo, "M__test.tmp", mtch, qry_bsn, flags=mongo.update.upsert)
When I run this I get an error:
"The dollar ($) prefixed field '$addToSet' in '$addToSet' is not valid for storage."
looking at the qry_bsn:
qry_bsn
$set : 3
segment : 4
0 : 1 1.000000
1 : 1 2.000000
2 : 1 3.000000
3 : 1 4.000000
$addToSet : 3
parent : 2 Initial
When I remove the $addToSet, append and finish objects of the $addToSet object the query runs fine.
Any help on how to do this would be much appreciated.
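I can't find a reason not to use mongo.bson.from.list. It makes all the mongo.bson.buffer.* calls for you, and there is much less chance of introducing a bug while constructing the BSON. In your buffer code, $addToSet is started before the $set object is finished, so it ends up nested inside $set, which is what the error complains about.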
query <- mongo.bson.from.list(list("symbol" = "somesymbol", "sha" = "SoM3__W3|Re|7__Sh#"))
upd_obj <- mongo.bson.from.list(list('$set' = list('segment' = 1:4), '$addToSet' = list('parent' = 'PARENT_ID')))
mongo.update(mongo = mongo, ns = "M__test.tmp", criteria = query, objNew = upd_obj, flags=mongo.update.upsert)
I have some problems understanding how specific fields of a subdocument (as opposed to the entire subdocument) can be updated.
I seem to have understood how to query for certain field values in subdocuments, but I'm lost with respect to how a BSON document needs to be structured so that it only changes the fields queried.
I still feel like I'm not fully understanding how "plain MongoDB syntax" translates into R syntax and how exactly the update operators work. Any hints in that respect would be greatly appreciated.
Preliminaries
pkg <- "rmongodb"
lib <- file.path(R.home(), "library")
if (!suppressWarnings(require(pkg, lib.loc=lib, character.only=TRUE))) {
install.packages(pkg, lib=lib)
require(pkg, lib.loc=lib, character.only=TRUE)
}
db <- "__test"
ns.0 <- "user"
ns <- paste(db, ns.0, sep=".")
con <- mongo.create(db=db)
Ensuring empty DB
mongo.remove(mongo=con, ns=ns)
Inserting documents
This section simply ensures there is some example data in the DB. It's just an auxiliary part which you can skip mentally!! Continue with section "Querying" and see "Actual query" to get an idea of the document structure, which might be hard to grasp from the R code below.
BSON for document 1
blist <- NULL
buf <- mongo.bson.buffer.create()
mongo.bson.buffer.append(buf, name="host",
value="unittest.com")
mongo.bson.buffer.start.array(buf, "paths")
mongo.bson.buffer.start.object(buf, "1")
mongo.bson.buffer.append(buf, name="path",
value="home")
mongo.bson.buffer.append(buf, name="url",
value="www.unittest.com/home")
mongo.bson.buffer.start.array(buf, "queries")
mongo.bson.buffer.start.object(buf, "1")
mongo.bson.buffer.append(buf, name="query",
value="?somequery")
mongo.bson.buffer.append(buf, name="url",
value="www.unittest.com/home?somequery")
mongo.bson.buffer.finish.object(buf) # finish query:1
mongo.bson.buffer.start.object(buf, "2")
mongo.bson.buffer.append(buf, name="query",
value="?someotherquery")
mongo.bson.buffer.append(buf, name="url",
value="www.unittest.com/home?someotherquery")
mongo.bson.buffer.finish.object(buf) # finish query:2
mongo.bson.buffer.finish.object(buf) # finish queries
mongo.bson.buffer.finish.object(buf) # finish path:1
mongo.bson.buffer.start.object(buf, "2")
mongo.bson.buffer.append(buf, name="path",
value="somepage")
mongo.bson.buffer.append(buf, name="url",
value="www.unittest.com/somepage")
mongo.bson.buffer.start.array(buf, "queries")
mongo.bson.buffer.start.object(buf, "1")
mongo.bson.buffer.append(buf, name="query",
value="?somequery")
mongo.bson.buffer.append(buf, name="url",
value="www.unittest.com/somepage?somequery")
mongo.bson.buffer.finish.object(buf) # finish query:1
mongo.bson.buffer.start.object(buf, "2")
mongo.bson.buffer.append(buf, name="query",
value="?someotherquery")
mongo.bson.buffer.append(buf, name="url",
value="www.unittest.com/somepage?someotherquery")
mongo.bson.buffer.finish.object(buf) # finish query:2
mongo.bson.buffer.finish.object(buf) # finish queries
mongo.bson.buffer.finish.object(buf) # finish path:2
mongo.bson.buffer.finish.object(buf) # finish paths
mongo.bson.buffer.finish.object(buf) # finish buf
b <- mongo.bson.from.buffer(buf)
blist <- c(blist, list(b))
BSON for document 2
EDIT 2012-01-23
I removed this section to make the question a bit easier to grasp.
Actual insert
sapply(blist, function(ii) {
mongo.insert(mongo=con, ns=ns, b=ii)
})
Querying
BSON for query
buf <- mongo.bson.buffer.create()
mongo.bson.buffer.start.object(buf, "paths")
mongo.bson.buffer.start.object(buf, "$elemMatch")
mongo.bson.buffer.start.object(buf, "queries")
mongo.bson.buffer.start.object(buf, "$elemMatch")
mongo.bson.buffer.append(buf, name="query", value="?somequery")
mongo.bson.buffer.finish.object(buf)
mongo.bson.buffer.finish.object(buf)
mongo.bson.buffer.finish.object(buf)
mongo.bson.buffer.finish.object(buf)
query <- mongo.bson.from.buffer(buf)
> query
paths : 3
$elemMatch : 3
queries : 3
$elemMatch : 3
query : 2 ?somequery
Actual query
> mongo.find.one(mongo=con, ns=ns, query=query)
_id : 7 50feff31ba54a032514b6181
host : 2 unittest.com
paths : 4
1 : 3
path : 2 home
url : 2 www.unittest.com/home
queries : 4
1 : 3
query : 2 ?somequery
url : 2 www.unittest.com/home?somequery
2 : 3
query : 2 ?someotherquery
url : 2 www.unittest.com/home?someotherquery
2 : 3
path : 2 somepage
url : 2 www.unittest.com/somepage
queries : 4
1 : 3
query : 2 ?somequery
url : 2 www.unittest.com/somepage?somequery
2 : 3
query : 2 ?someotherquery
url : 2 www.unittest.com/somepage?someotherquery
Updating
BSON for update
I would like to set the value of the query field in query subdocuments. I had a look at the MongoDB Manual and tried something like this (using the $set and $ operators because there are arrays involved):
buf <- mongo.bson.buffer.create()
mongo.bson.buffer.start.object(buf, "$set")
mongo.bson.buffer.start.object(buf, "paths")
mongo.bson.buffer.start.object(buf, "$")
mongo.bson.buffer.start.object(buf, "queries")
mongo.bson.buffer.start.object(buf, "$")
mongo.bson.buffer.append(
buf,
name="name",
value="abcd"
)
mongo.bson.buffer.finish.object(buf)
mongo.bson.buffer.finish.object(buf)
mongo.bson.buffer.finish.object(buf)
mongo.bson.buffer.finish.object(buf)
mongo.bson.buffer.finish.object(buf)
bnew <- mongo.bson.from.buffer(buf)
> bnew
$set : 3
paths : 3
$ : 3
queries : 3
$ : 3
name : 2 abcd
Actual update
Apparently, this wasn't a good choice ;-)
res <- mongo.update(mongo=con, ns=ns, criteria=query,
objNew=bnew, flags=mongo.update.multi)
> res
[1] FALSE
2: http://docs.mongodb.org/manual/applications/update/#update-operators
Try this for bnew:
buf <- mongo.bson.buffer.create()
mongo.bson.buffer.start.object(buf, "$set")
mongo.bson.buffer.append(buf, "paths.0.queries.1.query", "?newquery")
mongo.bson.buffer.finish.object(buf)
bnew = mongo.bson.from.buffer(buf)
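This will replace the query field of the 2nd element of queries in the 1st element of paths. Array positions in dot notation are zero-based, hence 0 and 1.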
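Because R counts from 1 while the dot notation counts from 0, it is easy to be off by one when assembling such a path by hand. A small sketch of a helper that does the conversion follows; dot_path is a hypothetical name, not an rmongodb function, and it only builds the string you would pass as the field name to mongo.bson.buffer.append.

```r
# Hypothetical helper: build a dot-notation path from field names and
# 1-based R positions; numeric arguments are shifted to 0-based indices.
dot_path <- function(...) {
  parts <- lapply(list(...), function(p) {
    if (is.numeric(p)) as.character(as.integer(p) - 1L) else p
  })
  paste(unlist(parts), collapse = ".")
}

# 1st element of paths, 2nd element of queries
path <- dot_path("paths", 1, "queries", 2, "query")
print(path)
```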
I am very new to lua and my plan is to create a table. This table (I call it test) has 200 entries - each entry has the same subentries (In this example the subentries money and age):
This is a sort of pseudocode:
table test = {
Entry 1: money=5 age=32
Entry 2: money=-5 age=14
...
Entry 200: money=999 age=72
}
How can I write this in Lua? Is it possible? The alternative would be to write each subentry as a separate table:
table money = { }
table age = { }
But for me, this isn't a nice way, so maybe you can help me.
Edit:
This question Table inside a table is related, but I cannot write this 200x.
Try this syntax:
test = {
{ money = 5, age = 32 },
{ money = -5, age = 14 },
...
{ money = 999, age = 72 }
}
Examples of use:
-- money of the second entry:
print(test[2].money) -- prints "-5"
-- age of the last entry:
print(test[200].age) -- prints "72"
You can also turn the problem on its side and have two sequences in test, money and age, where each entry has the same index in both arrays.
test = {
money ={1000,100,0,50},
age={40,30,20,25}
}
This will have better performance since you only have the overhead of 3 tables instead of n+1 tables, where n is the number of entries.
Anyway, you have to enter your data one way or another. What you'd typically do is use some easily parsed format like CSV, XML, ... and convert that to a table. Like this:
s=[[
1000 40
100 30
0 20
50 25]]
test ={ money={},age={}}
n=1
for balance,age in s:gmatch('(%-?[%d.]+)%s+(%-?[%d.]+)') do
-- gmatch captures are strings; convert them so the table holds numbers
test.money[n],test.age[n]=tonumber(balance),tonumber(age)
n=n+1
end
You mean you do not want to write "money" and "age" 200x?
There are several solutions but you could write something like:
local test0 = {
5, 32,
-5, 14,
...
}
local test = {}
for i=1,#test0/2 do
test[i] = {money = test0[2*i-1], age = test0[2*i]}
end
Otherwise you could always use metatables and create a class that behaves exactly like you want.