I have a JSON file that has geoJSON feature collections nested inside of it.
Is it possible to read in the JSON file using jsonlite::read_json(), extract the geoJSON bits, and then convert the resulting list to an sf object? The alternative is to write the list back to JSON (text) and read the geoJSON using a package like geojsonio.
This is what my JSON code looks like:
{
"all": [
{
"type": "Feature",
"geometry": {
"type": "GeometryCollection",
"geometries": [
{
"type": "Point",
"coordinates": [
-75.155727,
39.956318
]
},{
"type": "LineString",
"coordinates": [
[
-75.15567895337301,
39.95653558798881
],[
-75.15575995337292,
39.95616931624319
]
]
},{
"type": "Point",
"coordinates": [
-75.15566,
39.956432
]
}
]
},
"properties": {
# properties
}
},{
# more features of mixed type
}
]
}
Perhaps extract the features with jqr and pass the result to sf::st_read():
x <- '{
"all": [
{
"type": "Feature",
"geometry": {
"type": "GeometryCollection",
"geometries": [
{
"type": "Point",
"coordinates": [
-75.155727,
39.956318
]
},{
"type": "LineString",
"coordinates": [
[
-75.15567895337301,
39.95653558798881
],[
-75.15575995337292,
39.95616931624319
]
]
},{
"type": "Point",
"coordinates": [
-75.15566,
39.956432
]
}
]
},
"properties": null
}
]
}'
sf::st_read(jqr::jq(x, ".all[]"))
(string edited to be valid JSON)
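For comparison, the same idea (pull the features out of the wrapper key and treat them as a FeatureCollection) can be sketched in plain Python. This is only an illustration, assuming every element under "all" is already a valid GeoJSON Feature; the function name extract_feature_collection is made up for this example.

```python
import json

def extract_feature_collection(doc, key="all"):
    """Wrap the Feature objects stored under `key` in a GeoJSON FeatureCollection."""
    # Keep only objects that declare themselves as Features (assumption of this sketch).
    features = [f for f in doc.get(key, []) if f.get("type") == "Feature"]
    return {"type": "FeatureCollection", "features": features}

raw = '''{"all": [{"type": "Feature",
  "geometry": {"type": "GeometryCollection", "geometries": [
    {"type": "Point", "coordinates": [-75.155727, 39.956318]},
    {"type": "LineString", "coordinates": [[-75.15567895337301, 39.95653558798881],
                                           [-75.15575995337292, 39.95616931624319]]},
    {"type": "Point", "coordinates": [-75.15566, 39.956432]}]},
  "properties": null}]}'''

collection = extract_feature_collection(json.loads(raw))
geojson_text = json.dumps(collection)  # GeoJSON text that a spatial reader can consume
```

The resulting text is a regular FeatureCollection, so any GeoJSON-aware reader should accept it without knowing about the original wrapper.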
I've a route which is stored as a set of points.
{
"id": "9fc9b1e9-6062-4c65-820d-992569618883",
"shape": [
16.373056,
48.208333,
16.478611,
48.141111,
17.112778,
48.144722
]
}
I want to find the nearest route to a given point. For example: give me a route that is less than 25 km from point XY.
To be able to use built-in functions for geospatial querying in Azure Cosmos DB I need to make some changes to the document structure. My first attempt was to use LineString type.
{
"id": "9fc9b1e9-6062-4c65-820d-992569618883",
"shape": {
"type": "LineString",
"coordinates": [
[
16.373056,
48.208333
],
[
16.478611,
48.141111
],
[
17.112778,
48.144722
]
]
}
}
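The restructuring from the flat array to a LineString can be scripted. A minimal Python sketch, assuming the flat array alternates longitude and latitude (as the two documents above suggest); flat_to_linestring is a hypothetical helper name.

```python
def flat_to_linestring(shape):
    """Turn [lon, lat, lon, lat, ...] into a GeoJSON LineString object."""
    if len(shape) % 2 != 0:
        raise ValueError("flat coordinate array must have an even number of values")
    # Pair consecutive values into [lon, lat] positions.
    coords = [[shape[i], shape[i + 1]] for i in range(0, len(shape), 2)]
    return {"type": "LineString", "coordinates": coords}

route = {
    "id": "9fc9b1e9-6062-4c65-820d-992569618883",
    "shape": [16.373056, 48.208333, 16.478611, 48.141111, 17.112778, 48.144722],
}

# Copy the document and replace only the "shape" field.
converted = {**route, "shape": flat_to_linestring(route["shape"])}
```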
Then I query SELECT tf.id, ST_DISTANCE(tf.shape, {type: "Point", "coordinates": [16.6475, 48.319444]}) FROM tf WHERE ST_DISTANCE(tf.shape, {type: "Point", "coordinates": [16.6475, 48.319444]}) < 25000 with the following result.
[
{
"id": "9fc9b1e9-6062-4c65-820d-992569618883",
"$1": 19683.798772898
}
]
Based on my research, it seems plausible that ST_DISTANCE found a point on the route that is less than 25 km away.
When I have a large document with many points (around 15000), the result is always []. This is a different dataset, so the numbers are different.
SELECT tf.id, ST_DISTANCE(tf.shape, {type: "Point", "coordinates": [10.09, 52.831667]}) FROM tf WHERE ST_DISTANCE(tf.shape, {type: "Point", "coordinates": [10.09, 52.831667]}) < 10000 returns [].
What I tried next was to wrap every point as its own Point object and put them all in an array.
{
"id": "265de514-8995-4976-aeca-1f5d0ab0931d",
"shape": [
{
"type": "Point",
"coordinates": [
9.38626,
51.01587
]
},
{
"type": "Point",
"coordinates": [
9.38829,
51.01533
]
},
{
"type": "Point",
"coordinates": [
9.38853,
51.01554
]
}
...another set of 15000 points
]
}
When I execute a query such as SELECT tf.id, locations.coordinates, ST_DISTANCE(locations, {type: "Point", "coordinates": [10.09, 52.831667]}) FROM tf JOIN locations IN tf.shape WHERE ST_DISTANCE(locations, {type: "Point", "coordinates": [10.09, 52.831667]}) < 10000, it returns all points on the route that are less than 10 km away.
[
{
"id": "265de514-8995-4976-aeca-1f5d0ab0931d",
"coordinates": [
9.97907,
52.77248
],
"$1": 9967.70776520528
},
{
"id": "265de514-8995-4976-aeca-1f5d0ab0931d",
"coordinates": [
9.97908,
52.77274
],
"$1": 9948.088917723748
}
...another set of points under 10 km
]
Am I using ST_DISTANCE correctly, and if so, why don't I get any results for the large LineString? Are there any service limitations? If not, what is the correct way to implement this functionality? The array of points works, but it seems clunky.
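To sanity-check results like these outside the database, the per-point approach can be reproduced client-side. The Python sketch below uses the haversine formula on a spherical Earth and only measures to the route's vertices, so its numbers can be larger than what ST_DISTANCE reports for a LineString; the radius constant and the helper names are assumptions of this example, not anything Cosmos DB prescribes.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in metres (assumption of this sketch)

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat positions."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def min_vertex_distance_m(coords, point):
    """Smallest distance from `point` to any vertex of the route (not to its segments)."""
    return min(haversine_m(lon, lat, point[0], point[1]) for lon, lat in coords)

def routes_within(routes, point, max_m):
    """Ids of routes that have at least one vertex closer than `max_m` metres."""
    return [r["id"] for r in routes
            if min_vertex_distance_m(r["shape"]["coordinates"], point) < max_m]

routes = [{
    "id": "9fc9b1e9-6062-4c65-820d-992569618883",
    "shape": {"type": "LineString",
              "coordinates": [[16.373056, 48.208333], [16.478611, 48.141111], [17.112778, 48.144722]]},
}]
nearby = routes_within(routes, [16.6475, 48.319444], 25000)
```

Running the same check against the 15000-point document would show whether any vertex really is within 10 km, which helps separate a data problem from a query or service problem.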
I have the graph model below which represents the sub-pattern I'd like to traverse or fetch. The nodes and their properties are shown below as well.
The expected response to my query would look something like this, where 's', 'c', 'aid', 'qid', 'p', 'r1', 'r2' are the nodes that make up the subpattern or subgraph:
[
{
"s": {
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4",
"label": "severity",
"type": "vertex",
"properties": {
"severity": [
{
"id": "a6a9e38f-0802-48b6-ac37-490f45e824e9",
"value": "High"
}
],
"pk": [
{
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4|pk",
"value": "pk"
}
]
}
},
"c": {
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4",
"label": "cve",
"type": "vertex",
"properties": {
"cve_id": [
{
"id": "a6a9e38f-0802-48b6-ac37-490f45e824e9",
"value": "CVE-xxxx-xxxx"
}
],
"publishedOn": [
{
"id": "fc5dde4d-c027-4c19-9b16-b3314b2b10c6",
"value": "xxx"
}
],
"pk": [
{
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4|pk",
"value": "pk"
}
]
}
},
"aid": {
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4",
"label": "aid",
"type": "vertex",
"properties": {
"aid": [
{
"id": "a6a9e38f-0802-48b6-ac37-490f45e824e9",
"value": "xxxx-xxxx"
}
],
"pk": [
{
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4|pk",
"value": "pk"
}
]
}
},
"qid": {
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4",
"label": "qid",
"type": "vertex",
"properties": {
"qid": [
{
"id": "a6a9e38f-0802-48b6-ac37-490f45e824e9",
"value": "xxxx-xxxx"
}
],
"pk": [
{
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4|pk",
"value": "pk"
}
]
}
},
"p": {
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4",
"label": "package",
"type": "vertex",
"properties": {
"name": [
{
"id": "a6a9e38f-0802-48b6-ac37-490f45e824e9",
"value": "xxxxx"
}
],
"version": [
{
"id": "fc5dde4d-c027-4c19-9b16-b3314b2b10c6",
"value": "xxx"
}
],
"pk": [
{
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4|pk",
"value": "pk"
}
]
}
},
"r1": {
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4",
"label": "release",
"type": "vertex",
"properties": {
"source": [
{
"id": "a6a9e38f-0802-48b6-ac37-490f45e824e9",
"value": "xxxx-xxxx"
}
],
"status": [
{
"id": "fc5dde4d-c027-4c19-9b16-b3314b2b10c6",
"value": "xxx"
}
],
"pk": [
{
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4|pk",
"value": "pk"
}
]
}
},
"r2": {
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4",
"label": "release",
"type": "vertex",
"properties": {
"source": [
{
"id": "a6a9e38f-0802-48b6-ac37-490f45e824e9",
"value": "xxxx-xxxx"
}
],
"status": [
{
"id": "fc5dde4d-c027-4c19-9b16-b3314b2b10c6",
"value": "xxx"
}
],
"pk": [
{
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4|pk",
"value": "pk"
}
]
}
}
},
{
....
....
},
{
....
..
}
]
My question is how do I build my traversal query to achieve this end result?
What I have so far is this, but the project() step is not working as expected
g.V().hasLabel('cve').as('c').and(
__.in('severity').as('s'),
__.out('cve_to_aid').as('aid').and(
__.out('has_qid').as('qid'),
__.in('package_to_aid').as('p'),
or(
__.in('r1_to_aid').has('status', 'Patched').as('r1'),
__.in('r2_to_aid').has('status', 'Patched').as('r2')
)
)
).project('c', 's', 'aid', 'qid', 'p', 'r1', 'r2').
by(('c').values('cve_id')).
by(('s').values('severity')).
by(('aid').values('aid')).
by(('qid').values('qid')).
by(('p').values()).
by(('r1').values()).
by(('r2').values()).
I am doing this on CosmosDB, so please only provide answers using supported steps found here: https://learn.microsoft.com/en-us/azure/cosmos-db/gremlin/support
It is possible to nest project() steps, e.g. on the TinkerGraph:
gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V(1).as('x').project('x').by(
select('x').project('id', 'label','properties').by(id).by(label).by(
project('name').by(properties())
)
)
==>[x:[id:1,label:person,properties:[name:vp[name->marko]]]]
gremlin>
but then you end up coding your entire data model into your query.
In full TinkerPop you could turn your result into a subgraph with the subgraph() step and write it to GraphSON with the io() step. In Cosmos you can add the returned vertices to a TinkerGraph instance client-side and again use the io() step to serialize the TinkerGraph to GraphSON.
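If the result is post-processed client-side anyway, the verbose vertex layout shown in the question (every property as a list of {id, value} objects) can be collapsed with a few lines. A Python sketch, assuming the results arrive as plain dictionaries in that shape; flatten_vertex is a hypothetical helper and not part of any Gremlin driver.

```python
def flatten_vertex(vertex):
    """Collapse {"prop": [{"id": ..., "value": v}]} into {"prop": v}."""
    props = {}
    for name, entries in vertex.get("properties", {}).items():
        values = [entry["value"] for entry in entries]
        # Single-valued properties become scalars; multi-valued ones stay lists.
        props[name] = values[0] if len(values) == 1 else values
    return {"id": vertex["id"], "label": vertex["label"], "properties": props}

severity_vertex = {
    "id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4",
    "label": "severity",
    "type": "vertex",
    "properties": {
        "severity": [{"id": "a6a9e38f-0802-48b6-ac37-490f45e824e9", "value": "High"}],
        "pk": [{"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4|pk", "value": "pk"}],
    },
}

flat = flatten_vertex(severity_vertex)
```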
I'm trying to deduplicate all objects inside the array results that share the same key id, and merge their path arrays.
JSON input:
[
{
"type": "apple",
"results": [
{
"id": "apple1",
"name": "appleName1",
"path": "/some/path/a"
},
{
"id": "apple1",
"name": "appleName1",
"path": "/some/path/b"
},
{
"id": "apple2",
"name": "appleName2",
"path": "/some/path/c"
}
]
},
{
"type": "orange",
"results": [
{
"id": "orange1",
"name": "orangeName1",
"path": "/some/path/a"
},
{
"id": "orange1",
"name": "orangeName1",
"path": "/some/path/b"
},
{
"id": "orange2",
"name": "orangeName2",
"path": "/some/path/c"
}
]
}
]
Expected output:
[
{
"type": "apple",
"results": [
{
"id": "apple1",
"name": "appleName1",
"path": [
"/some/path/a",
"/some/path/b"
]
},
{
"id": "apple2",
"name": "appleName2",
"path": [
"/some/path/c"
]
}
]
},
{
"type": "orange",
"results": [
{
"id": "orange1",
"name": "orangeName1",
"path": [
"/some/path/a",
"/some/path/b"
]
},
{
"id": "orange2",
"name": "orangeName2",
"path": [
"/some/path/c"
]
}
]
}
]
I've managed to get an approximate solution using:
jq '[{type: .[].type, results: .[].results | group_by(.id) | map({id: .[0].id, name: .[0].name, path: (map(.path))})}]'
But my solution produces two additional elements that aren't supposed to be there.
I know there are some similar questions already answered but I didn't manage to get them to work with this example. Any help is appreciated!
You could group_by the .id field, then for each group take the first item and replace its .path field with a map on the .path fields of all group members:
jq 'map(.results |= (group_by(.id) | map(first + {path: map(.path)})))'
[
{
"type": "apple",
"results": [
{
"id": "apple1",
"name": "appleName1",
"path": [
"/some/path/a",
"/some/path/b"
]
},
{
"id": "apple2",
"name": "appleName2",
"path": [
"/some/path/c"
]
}
]
},
{
"type": "orange",
"results": [
{
"id": "orange1",
"name": "orangeName1",
"path": [
"/some/path/a",
"/some/path/b"
]
},
{
"id": "orange2",
"name": "orangeName2",
"path": [
"/some/path/c"
]
}
]
}
]
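For readers who want to check the logic outside jq, the same group-then-merge step can be sketched in Python. This is an illustration only; merge_paths is a made-up name, and the explicit sort mirrors the fact that jq's group_by orders groups by the grouping key.

```python
from itertools import groupby
from operator import itemgetter

def merge_paths(data):
    """Per entry, collapse results sharing an id and collect their paths in a list."""
    merged = []
    for entry in data:
        # groupby only groups adjacent items, so sort by id first (stable sort).
        ordered = sorted(entry["results"], key=itemgetter("id"))
        results = []
        for _, group in groupby(ordered, key=itemgetter("id")):
            members = list(group)
            # Take the first member and replace its path with all paths of the group.
            results.append({**members[0], "path": [m["path"] for m in members]})
        merged.append({**entry, "results": results})
    return merged

sample = [{
    "type": "apple",
    "results": [
        {"id": "apple1", "name": "appleName1", "path": "/some/path/a"},
        {"id": "apple1", "name": "appleName1", "path": "/some/path/b"},
        {"id": "apple2", "name": "appleName2", "path": "/some/path/c"},
    ],
}]

deduplicated = merge_paths(sample)
```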
I want to convert a feature file to json so that I can pass it to a javascript function in an RMD file.
However, the toJSON function seems to flatten it and remove many of the fields and structures, as shown below. How can I convert it and keep it intact, as happens when I write to a file using sf::st_write?
url <- 'https://opendata.arcgis.com/api/v3/datasets/bf9d32b1aa9941af84e6c2bf0c54b1bb_0/downloads/data?format=geojson&spatialRefId=4326'
ukWardShapes <- sf::st_read(url) %>%
head(2)
# Looks OK when written out
sf::st_write(ukWardShapes, "wardShapes.geojson")
# Converting to JSON with toJSON seems to drop the other top-level fields (type, name, crs). It lists the objects
# from the features array without their type, and puts all fields from properties at the top level of each object.
json_data <- jsonlite::toJSON(ukWardShapes)
# I want to do this as I need to pass it to javascript within an RMD like this
htmltools::tags$script(paste0("var ukWardShapes = ", json_data, ";"))
# Output from st_write - with all the objects and fields listed properly
{
"type": "FeatureCollection",
"name": "wardShapes",
"crs": { "type": "name", "properties": { "name": "urn:ogc:def:crs:OGC:1.3:CRS84" } },
"features": [
{ "type": "Feature", "properties": { "OBJECTID": 1, "WD21CD": "E05000026", "WD21NM": "Abbey", "WD21NMW": " ", "BNG_E": 544433, "BNG_N": 184376, "LONG": 0.081276, "LAT": 51.53981, "SHAPE_Length": 0.071473941285613768, "SHAPE_Area": 0.00015225110241064838 }, "geometry": { "type": "MultiPolygon", "coordinates": [ [ [ [ 0.093628520000038, 51.53767283600007 ], [ 0.08163128800004, 51.539165094000055 ], [ 0.085507102000065, 51.537043160000053 ], [ 0.075954208000041, 51.533595714000057 ], [ 0.07333983500007, 51.537621201000036 ], [ 0.068771363000053, 51.536206993000064 ], [ 0.068303699000069, 51.544253423000043 ], [ 0.068361695000021, 51.544390390000046 ], [ 0.08006389600007, 51.544772356000067 ], [ 0.093628520000038, 51.53767283600007 ] ] ] ] } },
{ "type": "Feature", "properties": { "OBJECTID": 2, "WD21CD": "E05000027", "WD21NM": "Alibon", "WD21NMW": " ", "BNG_E": 549247, "BNG_N": 185196, "LONG": 0.150987, "LAT": 51.545921, "SHAPE_Length": 0.074652046036690151, "SHAPE_Area": 0.00017418950412786572 }, "geometry": { "type": "MultiPolygon", "coordinates": [ [ [ [ 0.161601914000073, 51.543327754000074 ], [ 0.147931795000034, 51.541598449000048 ], [ 0.140256898000075, 51.54111542000004 ], [ 0.13420572800004, 51.540716652000071 ], [ 0.131925236000029, 51.543763455000033 ], [ 0.14633003900002, 51.546332889000041 ], [ 0.142816723000067, 51.550973604000035 ], [ 0.156378253000071, 51.551020271000027 ], [ 0.161601914000073, 51.543327754000074 ] ] ] ] } }
]
}
# Output from toJSON, which seems to have a lot of structure removed. Note, I'm not
# concerned about it being pretty and separated into lines
[{
"OBJECTID":1, "WD21CD":"E05000026", "WD21NM":"Abbey", "WD21NMW":" ", "BNG_E":544433, "BNG_N":184376, "LONG":0.0813, "LAT":51.5398, "SHAPE_Length":0.0715, "SHAPE_Area":0.0002, "geometry":{
"type":"MultiPolygon", "coordinates":[[[[0.0936, 51.5377], [0.0816, 51.5392], [0.0855, 51.537], [0.076, 51.5336], [0.0733, 51.5376], [0.0688, 51.5362], [0.0683, 51.5443], [0.0684, 51.5444], [0.0801, 51.5448], [0.0936, 51.5377]]]]
}
}, {
"OBJECTID":2, "WD21CD":"E05000027", "WD21NM":"Alibon", "WD21NMW":" ", "BNG_E":549247, "BNG_N":185196, "LONG":0.151, "LAT":51.5459, "SHAPE_Length":0.0747, "SHAPE_Area":0.0002, "geometry":{
"type":"MultiPolygon", "coordinates":[[[[0.1616, 51.5433], [0.1479, 51.5416], [0.1403, 51.5411], [0.1342, 51.5407], [0.1319, 51.5438], [0.1463, 51.5463], [0.1428, 51.551], [0.1564, 51.551], [0.1616, 51.5433]]]]
}
}]
As per @SymbolixAU's comment above, the answer is to use
geojsonsf::sf_geojson() instead of jsonlite::toJSON(), because GeoJSON is a specific JSON structure for spatial data and needs a dedicated converter.
So my line of code should be:
json_data <- geojsonsf::sf_geojson(ukWardShapes)
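To illustrate the structural difference: the toJSON output is one flat record per row, whereas GeoJSON nests the attributes under "properties" inside Feature objects that are collected in a FeatureCollection. A minimal Python sketch of that re-wrapping, assuming each record carries its geometry under a "geometry" key as in the toJSON output above; records_to_feature_collection is a hypothetical name, and the coordinates are shortened for the example.

```python
def records_to_feature_collection(records, geometry_key="geometry"):
    """Wrap flat records (attributes plus a geometry) into a GeoJSON FeatureCollection."""
    features = []
    for rec in records:
        # Everything except the geometry becomes a feature property.
        props = {k: v for k, v in rec.items() if k != geometry_key}
        features.append({"type": "Feature", "properties": props, "geometry": rec[geometry_key]})
    return {"type": "FeatureCollection", "features": features}

flat_records = [
    {"OBJECTID": 1, "WD21NM": "Abbey",
     "geometry": {"type": "Point", "coordinates": [0.0813, 51.5398]}},
    {"OBJECTID": 2, "WD21NM": "Alibon",
     "geometry": {"type": "Point", "coordinates": [0.151, 51.5459]}},
]

feature_collection = records_to_feature_collection(flat_records)
```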
I want to work with GeoJSON data in the following format:
{ "id": 1,
"geometry":
{ "type": "Point",
"coordinates": [
-3.706,
40.3],
"properties": {"appuserid": "5b46-7d3c-48a6-9c08-cc894",
"eventtype": "location",
"devicedate": "2016-06-08T07:25:21",
"date": "2016-06-08T07:25:06.507",
"location": {
"building": "2",
"floor": "0",
"elevation": ""
}}}
The problem is that I want to apply a "WHERE" clause on "appuserid" and select the matching records for processing, but I don't know how to do it. I have already loaded the data from MongoDB into a data frame.
Right now I am trying to do it as follows:
library(sqldf)
sqldf("SELECT * FROM d WHERE d$properties$appuserid = '0000-0000-0000-0000'")
But it gives an error.
Error: Only lists of raw vectors are currently supported
The code is below:
library(mongolite);  # provides mongo() and ssl_options()
library(jsonlite);
con <- mongo(collection = "geodata", db = "MongoDb", url = "mongodb://192.168.26.18:27017", verbose = FALSE, options = ssl_options());
d <- con$find();
library(jqr)
jq(d, '.features[] | select(d$properties$appuserid == "5b46-7d3c-48a6-9c08-cc894")')
Error : Error in jq.default(d, ".features[] | select(d$properties$appuserid == \"5b46-7d3c-48a6-9c08-cc894\")") :
jq method not implemented for data.frame.
jqr is one option; it is an R client for jq (https://stedolan.github.io/jq/). It operates on JSON text rather than on a data frame:
x <- '{
"type": "FeatureCollection",
"features": [
{
"type": "Feature",
"properties": {
"population": 200
},
"geometry": {
"type": "Point",
"coordinates": [
10.724029,
59.926807
],
"properties": {
"appuserid": "5b46-7d3c-48a6-9c08-cc894"
}
}
},
{
"type": "Feature",
"properties": {
"population": 600
},
"geometry": {
"type": "Point",
"coordinates": [
10.715789,
59.904778
],
"properties": {
"appuserid": "c7e866a7-e32d-4dc2-adfd-c2ca065b25ce"
}
}
}
]
}'
library(jqr)
jq(x, '.features[] | select(.geometry.properties.appuserid == "5b46-7d3c-48a6-9c08-cc894")')
returns
{
"type": "Feature",
"properties": {
"population": 200
},
"geometry": {
"type": "Point",
"coordinates": [
10.724029,
59.926807
],
"properties": {
"appuserid": "5b46-7d3c-48a6-9c08-cc894"
}
}
}
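The select() filter amounts to keeping the features whose nested appuserid equals the wanted value. A rough Python equivalent, assuming (as in the example above) that appuserid sits under geometry.properties; select_features is a hypothetical name.

```python
import json

def select_features(collection, appuserid):
    """Return the features whose geometry.properties.appuserid equals `appuserid`."""
    selected = []
    for feature in collection.get("features", []):
        # Tolerate missing or null geometry/properties instead of raising.
        geometry = feature.get("geometry") or {}
        props = geometry.get("properties") or {}
        if props.get("appuserid") == appuserid:
            selected.append(feature)
    return selected

text = '''{"type": "FeatureCollection", "features": [
  {"type": "Feature", "properties": {"population": 200},
   "geometry": {"type": "Point", "coordinates": [10.724029, 59.926807],
                "properties": {"appuserid": "5b46-7d3c-48a6-9c08-cc894"}}},
  {"type": "Feature", "properties": {"population": 600},
   "geometry": {"type": "Point", "coordinates": [10.715789, 59.904778],
                "properties": {"appuserid": "c7e866a7-e32d-4dc2-adfd-c2ca065b25ce"}}}]}'''

matches = select_features(json.loads(text), "5b46-7d3c-48a6-9c08-cc894")
```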