I am facing a problem using a word-boundary regex with mongolite. It looks like the word boundary \b does not work there, whereas it works in normal MongoDB queries.
Here is a working example:
I create this toy collection:
db.test2.insertMany([
{ item: "journal gouttiere"},
{ item: "notebook goutte"},
{ item: "paper plouf"},
{ item: "planner gouttement"},
{ item: "postcard goutte"}
]);
With mongosh:
db.test2.aggregate(
{
$match: {
item: RegExp("\\bgoutte\\b")
}
})
returns:
[
{
"_id": {
"$oid": "63206efeb0e1e89db6ef0c20"
},
"item": "notebook goutte"
},
{
"_id": {
"$oid": "63206efeb0e1e89db6ef0c23"
},
"item": "postcard goutte"
}
]
But:
library(mongolite)
connection <- mongo(collection="test2",db="test",
url = "mongodb://localhost:27017",
verbose = T)
connection$aggregate(pipeline = '[{
"$match": {
"item":{"$regex" : "\\bgoutte\\b", "$options" : "i"}
}
}]',options = '{"allowDiskUse":true}')
returns 0 rows. Changing the pattern to plain "goutte", as follows, returns three records:
connection$aggregate(pipeline = '[{
"$match": {
"item":{"$regex" : "goutte", "$options" : "i"}
}
}]',options = '{"allowDiskUse":true}')
Imported 3 records. Simplifying into dataframe...
_id item
1 63206efeb0e1e89db6ef0c20 notebook goutte
2 63206efeb0e1e89db6ef0c22 planner gouttement
3 63206efeb0e1e89db6ef0c23 postcard goutte
It looks like the word-boundary regex does not behave the same way with mongolite. What is the proper solution?
Ottie is right (and should post an answer! I'd be fine with deleting mine then):
Backslashes have a special meaning both in R strings and in the regex. You need two additional backslashes (one per \) so that R passes \\ on to MongoDB, where \\b is the escaped form of \b; see e.g. this SO question. I just checked:
con <- mongo(
"test",
url = "mongodb+srv://readwrite:test@cluster0-84vdt.mongodb.net/test"
)
con$insert('{"item": "notebook goutte" }')
con$insert('{"item": "postcard goutte" }')
Now
con$aggregate(pipeline = '[{
"$match": {
"item":{"$regex" : "\\\\bgoutte\\\\b", "$options" : "i"}
}
}]',options = '{"allowDiskUse":true}')
yields
_id item
1 63234ac1435f9b7c2a0787c2 notebook goutte
2 63234ac5435f9b7c2a0787c5 postcard goutte
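The two layers of unescaping can be illustrated outside R. The following is a minimal Python sketch, assuming that the pipeline string is parsed with standard JSON escaping rules once R has processed its own backslashes; the variable names are made up for the illustration. Raw strings stand in for what R hands over after its own unescaping.

```python
import json
import re

# R source "\\bgoutte\\b" becomes JSON text with ONE backslash before each b.
two_in_r = r'{"$regex": "\bgoutte\b"}'

# R source "\\\\bgoutte\\\\b" becomes JSON text with TWO backslashes before each b.
four_in_r = r'{"$regex": "\\bgoutte\\b"}'

# In JSON, \b is the escape for a backspace character, not a word boundary.
pattern_two = json.loads(two_in_r)["$regex"]
# In JSON, \\ is a literal backslash, so the regex engine finally sees \b.
pattern_four = json.loads(four_in_r)["$regex"]

print(repr(pattern_two))
print(repr(pattern_four))

# Only the second pattern behaves as a word-boundary regex.
print(bool(re.search(pattern_two, "notebook goutte")))
print(bool(re.search(pattern_four, "notebook goutte")))
print(bool(re.search(pattern_four, "planner gouttement")))
```

With two backslashes in the R source, the pattern that reaches the regex engine contains backspace characters, which would explain why nothing matches.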
I am using the code below to pull data from a local JSON file.
The file is very large and is nested with objects and arrays. There are multiple objects in the .ratings[] that I would like to extract.
How can I use the pipe operator in the .ratings[] array so that I don't have to retype .ratings[] for each piece of data that I would like to pull?
jq -r '.players[] | [.firstName,.lastName,.tid,.pid,.ratings[].spd,.ratings[].jmp] | join(", ")'
You can enclose it in () to use the pipe sign:
.players[] | [.firstName, .lastName, .tid, .pid, (.ratings[] | .spd, .jmp)] | join(", ")
You didn't specify the expected output, so it is not clear if your proposed solution gives you the output you want.
Given the following input:
{
"players": [
{
"firstName": "fname1",
"lastName": "lname1",
"tid": "tid1",
"pid": "pid1",
"ratings": [
{
"spd": "spd1-1",
"jmp": "jmp1-1"
}
]
},
{
"firstName": "fname2",
"lastName": "lname2",
"tid": "tid2",
"pid": "pid2",
"ratings": [
{
"spd": "spd2-1",
"jmp": "jmp2-1"
},
{
"spd": "spd2-2",
"jmp": "jmp2-2"
}
]
},
{
"firstName": "fname3",
"lastName": "lname3",
"tid": "tid3",
"pid": "pid3",
"ratings": [
{
"spd": "spd3-1",
"jmp": "jmp3-2"
},
{
"spd": "spd3-2",
"jmp": "jmp3-2"
},
{
"spd": "spd3-3",
"jmp": "jmp3-3"
}
]
}
]
}
Your solution and the answer from 0ston0 will give you 1 line per player, but a different number of columns per line:
.players[] | [.firstName,.lastName,.tid,.pid,(.ratings[]|.spd,.jmp)] | join(", ")
generates:
fname1, lname1, tid1, pid1, spd1-1, jmp1-1
fname2, lname2, tid2, pid2, spd2-1, jmp2-1, spd2-2, jmp2-2
fname3, lname3, tid3, pid3, spd3-1, jmp3-2, spd3-2, jmp3-2, spd3-3, jmp3-3
This might or might not be what you want your result to look like.
A different solution will print one line per rating, but duplicate the players' names. Running:
.players[] | [.firstName,.lastName,.tid,.pid] + (.ratings[]|[.spd,.jmp]) | join(", ")
will result in:
fname1, lname1, tid1, pid1, spd1-1, jmp1-1
fname2, lname2, tid2, pid2, spd2-1, jmp2-1
fname2, lname2, tid2, pid2, spd2-2, jmp2-2
fname3, lname3, tid3, pid3, spd3-1, jmp3-2
fname3, lname3, tid3, pid3, spd3-2, jmp3-2
fname3, lname3, tid3, pid3, spd3-3, jmp3-3
Both solutions are valid for different use cases and depending on how you are going to subsequently process the data.
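For readers who find explicit loops easier to follow, here is a rough Python sketch of what the two filters do. It is only an illustration of the two output shapes, not a replacement for jq; the function names are made up, and the data is a trimmed copy of the second player from the sample input.

```python
import json

# Trimmed copy of the sample input above (second player only).
data = json.loads("""
{"players": [{"firstName": "fname2", "lastName": "lname2",
              "tid": "tid2", "pid": "pid2",
              "ratings": [{"spd": "spd2-1", "jmp": "jmp2-1"},
                          {"spd": "spd2-2", "jmp": "jmp2-2"}]}]}
""")

FIELDS = ("firstName", "lastName", "tid", "pid")

def row_per_player(data):
    """Like [..., (.ratings[] | .spd, .jmp)]: all ratings land on one row."""
    rows = []
    for player in data["players"]:
        row = [player[key] for key in FIELDS]
        for rating in player["ratings"]:
            row += [rating["spd"], rating["jmp"]]
        rows.append(", ".join(row))
    return rows

def row_per_rating(data):
    """Like [...] + (.ratings[] | [.spd, .jmp]): one row per rating."""
    rows = []
    for player in data["players"]:
        head = [player[key] for key in FIELDS]
        for rating in player["ratings"]:
            rows.append(", ".join(head + [rating["spd"], rating["jmp"]]))
    return rows

print(row_per_player(data))
print(row_per_rating(data))
```

The first function grows the row inside the inner loop, while the second emits a new row on every pass of the inner loop, which is why the player's fields are repeated.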
Given the following JSON, I can get the first name and the birthday:
{
"users": [
{
"first": "Stevie",
"last": "Wonder",
"birthday": "01/01/1945"
},
{
"first": "Michael",
"last": "Jackson",
"birthday": "03/23/1963"
}
]
}
So with this jq command, I can get the record:
$ cat a.json |jq '.users[] | .first + " " + .last + " " + .birthday'
"Stevie Wonder 01/01/1945"
"Michael Jackson 03/23/1963"
And I am close to the answer, since I can already match on the first name:
$ cat a.json |jq '.users[] | select(.first=="Stevie") | .birthday '
"01/01/1945"
But how do I get the record that matches both the first and the last name?
Here is an approach which starts by filtering out .users which do not meet your criteria:
.users |= map(select(
(.first == "Stevie") and (.last == "Wonder")
))
If you try it online, you will see that it reduces your data to just
{
"users": [
{
"first": "Stevie",
"last": "Wonder",
"birthday": "01/01/1945"
}
]
}
Then you can add more filters if you want particular elements (e.g. .birthday):
.users |= map(select(
(.first == "Stevie") and (.last == "Wonder")
))
| .users[].birthday
to obtain
"01/01/1945"
This may seem needlessly redundant but may be easier if you are experimenting without precise requirements.
I want to create a custom keybinding in Sublime Text 3 that doesn't run a command but inserts the assignment operator used in R to define a variable, as below.
variable <- variable_definition //for example
z1 <- seq(1,100)
In R 3.2.2 GUI mac OS X the keybinding:
"alt+-" returns " <- "
I have read the documentation for user keybindings but couldn't find something that I could use.
I have tried "print" and "echo" as below but they don't work.
[
{ "keys": ["alt+-"], "print": " <- "}
]
or
[
{ "keys": ["alt+-"], "echo": " <- "}
]
Some help would be much appreciated
In Sublime Text you run commands with arguments. If you want to insert something the command is insert and the argument is called characters. If you want to limit it to the language R you can add a context. Hence the keybinding:
[
{
"keys": ["alt+-"], "command": "insert", "args": {"characters": " <- "},
"context":
[
{ "key": "selector", "operator": "equal", "operand": "source.r" }
]
}
]
Aside: it could also be interesting for you to use snippets as keybindings.
[
{
"keys": ["alt+-"], "command": "insert_snippet", "args": {"contents": "${1:variable} <- ${0:definition}"}
}
]
Given a working makefile which crops a world map to a specific country's bounding box:
# boxing:
INDIA_crop.tif: ETOPO1_Ice_g_geotiff.tif
	gdal_translate -projwin 67.0 37.5 99.0 05.0 ETOPO1_Ice_g_geotiff.tif INDIA_crop.tif
# ulx uly lrx lry // W N E S
# unzip:
ETOPO1_Ice_g_geotiff.tif: ETOPO1.zip
	unzip ETOPO1.zip
	touch ETOPO1_Ice_g_geotiff.tif
# download:
ETOPO1.zip:
	curl -o ETOPO1.zip 'http://www.ngdc.noaa.gov/mgg/global/relief/ETOPO1/data/ice_surface/grid_registered/georeferenced_tiff/ETOPO1_Ice_g_geotiff.zip'
clean:
	rm `ls | grep -v 'zip' | grep -v 'Makefile'`
Given that I currently have to edit this makefile by hand each time in order to change:
1. the country name,
2. its North border geocoordinate,
3. its South border geocoordinate,
4. its East border geocoordinate,
5. its West border geocoordinate.
Given that I also have a dataset for all countries, such as:
data = [
{ "W":-62.70; "S":-27.55;"E": -54.31; "N":-19.35; "item":"Paraguay" },
{ "W": 50.71; "S": 24.55;"E": 51.58; "N": 26.11; "item":"Qatar" },
{ "W": 20.22; "S": 43.69;"E": 29.61; "N": 48.22; "item":"Romania" },
{ "W": 19.64; "S": 41.15;"E":-169.92; "N": 81.25; "item":"Russia" },
{ "W": 29.00; "S": -2.93;"E": 30.80; "N": -1.14; "item":"Rwanda" },
{ "W": 34.62; "S": 16.33;"E": 55.64; "N": 32.15; "item":"Saudi Arabia"}
];
How can I loop over each line of the data to set the parameters in my makefile, so that I output all the COUNTRYNAME_crop.tif files at once with the correct bounding boxes?
Assuming you're using GNU make, this seems to me like a perfect problem for autogenerated makefiles. After make reads in its makefiles it will test each one as if it were a target to see if it can be rebuilt. If so, and it is rebuilt, make will automatically re-exec itself. This is an extraordinarily powerful type of meta-programming. I would combine this with recursive variable naming.
1. Data: Let's assume your dataset is in dataset.out, such as:
[
{ "W":-62.70; "S":-27.55;"E": -54.31; "N":-19.35; "item":"Paraguay" },
{ "W": 50.71; "S": 24.55;"E": 51.58; "N": 26.11; "item":"Qatar" },
{ "W": 20.22; "S": 43.69;"E": 29.61; "N": 48.22; "item":"Romania" },
{ "W": 19.64; "S": 41.15;"E":-169.92; "N": 81.25; "item":"Russia" },
{ "W": 29.00; "S": -2.93;"E": 30.80; "N": -1.14; "item":"Rwanda" },
{ "W": 34.62; "S": 16.33;"E": 55.64; "N": 32.15; "item":"Saudi Arabia"}
];
2. Converter: Now you need to write the utility convert-to-makefile. I would write it in Perl myself but the new kids would probably choose Python. Whatever. Anyway, for each country, the output should be something like this:
COUNTRIES += <countryname>
<countryname>-NORTH := <north-coord>
<countryname>-SOUTH := <south-coord>
<countryname>-EAST := <east-coord>
<countryname>-WEST := <west-coord>
so that bounding.mk, after being generated, has one of those stanzas for each country.
3a. Makefile: Then, add this to the beginning of your makefile:
-include bounding.mk
3b. Then add this rule to the end of your makefile:
bounding.mk: dataset.out
	convert-to-makefile $< > $@
3c. Then you can write your rules like this:
all: $(COUNTRIES:%=%_crop.tif)
%_crop.tif: ETOPO1_Ice_g_geotiff.tif
	gdal_translate -projwin $($*-WEST) $($*-NORTH) $($*-EAST) $($*-SOUTH) $< $@
That should about do it!
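The converter from step 2 is left unwritten above, so here is one possible Python sketch of it. It assumes the dataset looks exactly like the sample (fields separated by semicolons, one record per pair of braces), and it replaces spaces in country names with underscores, which is my own assumption so that names like "Saudi Arabia" become usable make targets. The function names are hypothetical.

```python
#!/usr/bin/env python3
"""Hypothetical convert-to-makefile: prints make variable stanzas for dataset.out."""
import re
import sys

# One "key": value pair; the value is either a quoted string or a number.
FIELD = re.compile(r'"(\w+)"\s*:\s*("([^"]*)"|-?\d+(?:\.\d+)?)')
# Suffix used in the make variable name, and the matching key in the dataset.
KEYS = (("NORTH", "N"), ("SOUTH", "S"), ("EAST", "E"), ("WEST", "W"))

def parse_records(text):
    """Yield one dict per { ... } record found in the dataset text."""
    for body in re.findall(r"\{([^}]*)\}", text):
        record = {}
        for key, raw, quoted in FIELD.findall(body):
            # Keep numbers as text so they are written out unchanged.
            record[key] = quoted if raw.startswith('"') else raw
        yield record

def to_makefile(text):
    """Return the contents of bounding.mk for the given dataset text."""
    lines = []
    for record in parse_records(text):
        # Assumption: underscores instead of spaces, since make splits on spaces.
        name = record["item"].replace(" ", "_")
        lines.append(f"COUNTRIES += {name}")
        for suffix, key in KEYS:
            lines.append(f"{name}-{suffix} := {record[key]}")
        lines.append("")
    return "\n".join(lines)

if __name__ == "__main__" and len(sys.argv) > 1:
    # Matches the rule above: convert-to-makefile dataset.out > bounding.mk
    with open(sys.argv[1]) as handle:
        sys.stdout.write(to_makefile(handle.read()))
```

Saved as an executable named convert-to-makefile somewhere on the PATH, this reads the file given as its first argument and writes the stanzas to standard output, which is what the bounding.mk rule expects.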