Group by multiple properties in Cosmos DB - azure-cosmosdb

I have the following collection (c), and I want to group by multiple properties while collecting a specific property into a list of values.
{
"trim": "8375",
"year": "2022",
"model_name": "Tundra",
"brand": "TOYOTA"
},
{
"trim": "9854",
"year": "2022",
"model_name": "NX Hybrid",
"brand": "LEXUS"
},
{
"trim": "8361",
"year": "2022",
"model_name": "Tundra",
"brand": "TOYOTA"
},
{
"trim": "8382",
"year": "2022",
"model_name": "Tundra",
"brand": "TOYOTA"
},
{
"trim": "9854",
"year": "2022",
"model_name": "NX Hybrid",
"brand": "LEXUS"
},
{
"trim": "8386",
"year": "2022",
"model_name": "Tundra",
"brand": "TOYOTA"
},
{
"trim": "9764",
"year": "2022",
"model_name": "NX Hybrid",
"brand": "LEXUS"
},
{
"trim": "8361",
"year": "2022",
"model_name": "Tundra",
"brand": "TOYOTA"
},
{
"trim": "8261",
"year": "2022",
"model_name": "Tundra",
"brand": "TOYOTA"
},
{
"trim": "8376",
"year": "2022",
"model_name": "Tundra",
"brand": "TOYOTA"
},
{
"trim": "8361",
"year": "2022",
"model_name": "Tundra",
"brand": "TOYOTA"
}
The properties I want to group by are year, model_name and brand. I want to store the trims in a list containing unique values only.
The result should look like:
{
"trim": ["8375", "8361", "8382", "8386", "8261", "8376"],
"year": "2022",
"model_name": "Tundra",
"brand": "TOYOTA"
},
{
"trim": ["9854", "9764"],
"year": "2022",
"model_name": "NX Hybrid",
"brand": "LEXUS"
},

From the GROUP BY documentation:
When a query uses a GROUP BY clause, the SELECT clause can only contain the subset of properties and system functions included in the GROUP BY clause. One exception is aggregate functions, which can appear in the SELECT clause without being included in the GROUP BY clause.
The aggregate function documentation does not mention any aggregate function for collecting values into an array.
So your intended query cannot currently be implemented with a SQL query alone.
What you can do instead (depending on the expected data volume and acceptable RU usage):
query without grouping and apply the grouping on the client,
-or- query the groups without the array and run N separate queries to get the trim values for each group (as sketched below).
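A minimal sketch of the second option, using only the property names shown in your documents. First enumerate the groups:
SELECT c.year, c.model_name, c.brand
FROM c
GROUP BY c.year, c.model_name, c.brand
Then, once per group returned, collect its distinct trims (shown here for the TOYOTA Tundra group):
SELECT DISTINCT VALUE c.trim
FROM c
WHERE c.year = '2022' AND c.model_name = 'Tundra' AND c.brand = 'TOYOTA'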

Related

jq filter items where values in a nested array are different

Let's say I have the following JSON. I want to get all the items where the employee had different teams over the years.
[
{
"id": 122343,
"name": "Tom Muller",
"teams": [
{
"year": "2010-2011",
"team_id": 27
},
{
"year": "2011-2012",
"team_id": 27
},
{
"year": "2013-2014",
"team_id": 27
}
]
},
{
"id": 338744,
"name": "Eric Gonzales",
"teams": [
{
"year": "2010-2011",
"team_id": 12
},
{
"year": "2011-2012",
"team_id": 17
},
{
"year": "2013-2014",
"team_id": 17
}
]
}
]
I would like to query the array with jq so that the output returns:
{
"id": 338744,
"name": "Eric Gonzales",
"teams": [
{
"year": "2010-2011",
"team_id": 12
},
{
"year": "2011-2012",
"team_id": 17
},
{
"year": "2013-2014",
"team_id": 17
}
]
}
How would I write such a query? Thanks.
With unique_by you can reduce the .teams array to the entries that differ in their .team_id, and with select and length you can filter for those that have strictly more than one such entry.
jq '.[] | select(.teams | unique_by(.team_id) | length > 1)'
{
"id": 338744,
"name": "Eric Gonzales",
"teams": [
{
"year": "2010-2011",
"team_id": 12
},
{
"year": "2011-2012",
"team_id": 17
},
{
"year": "2013-2014",
"team_id": 17
}
]
}
This gives you a stream of objects (in case there's more than one result). Use map(select(…)) instead of .[] | select(…) if you want to have them as an array.
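For example, the array-returning variant is just the same filter wrapped in map, which yields a JSON array instead of a stream of objects:
jq 'map(select(.teams | unique_by(.team_id) | length > 1))'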

Cannot access child value on Newtonsoft.Json.Linq.JProperty error

I have this JSON format; I'm making an API using ASP.NET.
{
"0": {
"order_id": 11748,
"complete_date": "2021-04-19 14:48:41",
"shipping_code": "aramex.aramex",
"awbs": [
{
"aramex_id": "1314",
"order_id": "11748",
"awb_number": "46572146154",
"reference_number": "11748",
"date_added": "2021-03-04 03:46:58"
}
],
"payment": {
"method": {
"name": "الدفع عند الاستلام",
"code": "cod"
},
"invoice": [
{
"code": "sub_total",
"value": "120.8700",
"value_string": "120.8700 SAR",
"title": "الاجمالي"
},
{
"code": "shipping",
"value": "0.0000",
"value_string": "0.0000 SAR",
"title": "ارمكس"
},
{
"code": "coupon",
"value": "-13.9000",
"value_string": "-13.9000 SAR",
"title": "قسيمة التخفيض(RMP425)"
},
{
"code": "cashon_delivery_fee",
"value": "5.0000",
"value_string": "5.0000 SAR",
"title": "رسوم الدفع عند الاستلام"
},
{
"code": "tax",
"value": "18.1300",
"value_string": "18.1300 SAR",
"title": " ضريبة القيمة المضافة (15%)"
},
{
"code": "total",
"value": "130.1000",
"value_string": "130.1000 SAR",
"title": "الاجمالي النهائي"
}
]
},
"product": [
{
"id": 69,
"name": "مخلط 4 أو دو بيرفيوم للجنسين - 100 مل",
"sku": "45678643230",
"weight": "0.50000000",
"quantity": 1,
"productDiscount": "",
"images": []
}
]
}
}
How can I reach order_id? I made an object, let's say its name is obj1, iterated over it with foreach, and tried storing obj1.order_id into a variable.
It stored null in the variable. The "0" key is the numbering of the orders, which starts at 0, then 1, 2, etc.
You can deserialize that JSON to a Dictionary<string, dynamic> without creating a new class, as follows:
var values = JsonConvert.DeserializeObject<Dictionary<string, dynamic>>(json);
var orderId = values["0"]["order_id"].ToString();
This will give you 11748 as a result.
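If you need the order_id of every order rather than just key "0", a minimal sketch iterating the same dictionary (variable names are placeholders):
var values = JsonConvert.DeserializeObject<Dictionary<string, dynamic>>(json);
foreach (var entry in values)
{
    // entry.Key is "0", "1", ... and entry.Value is the corresponding order object
    var orderId = entry.Value["order_id"].ToString();
    Console.WriteLine($"Order {entry.Key}: {orderId}");
}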

Is there an R function for checking whether a specified GeoJSON object (polygon or multi-polygon) contains a specified point?

I have an array of points:
{
"Sheet1": [
{
"CoM ID": "1040614",
"Genus": "Washingtonia",
"Year Planted": "1998",
"Latitude": "-37.81387927",
"Longitude": "144.9817733"
},
{
"CoM ID": "1663526",
"Genus": "Banksia",
"Year Planted": "2017",
"Latitude": "-37.79582801",
"Longitude": "144.9160598"
},
{
"CoM ID": "1031170",
"Genus": "Melaleuca",
"Year Planted": "1997",
"Latitude": "-37.82326441",
"Longitude": "144.9305296"
}
]
}
and also an array of GeoJSON polygons in the form shown below:
{"type":"FeatureCollection","features":[
{"type":"Feature","id":"01","properties":{"name":"Alabama","density":94.65},"geometry":{"type":"Polygon","coordinates":[[[-87.359296,35.00118],[-85.606675,34.984749],[-85.431413,34.124869],[-85.184951,32.859696],[-85.069935,32.580372],[-84.960397,32.421541],[-85.004212,32.322956],[-84.889196,32.262709],[-85.058981,32.13674],[-85.053504,32.01077],[-85.141136,31.840985],[-85.042551,31.539753],[-85.113751,31.27686],[-85.004212,31.003013],[-85.497137,30.997536],[-87.600282,30.997536],[-87.633143,30.86609],[-87.408589,30.674397],[-87.446927,30.510088],[-87.37025,30.427934],[-87.518128,30.280057],[-87.655051,30.247195],[-87.90699,30.411504],[-87.934375,30.657966],[-88.011052,30.685351],[-88.10416,30.499135],[-88.137022,30.318396],[-88.394438,30.367688],[-88.471115,31.895754],[-88.241084,33.796253],[-88.098683,34.891641],[-88.202745,34.995703],[-87.359296,35.00118]]]}}
I'm trying to find, using R, the GeoJSON polygons that contain the points.
For example, how can I know whether the three points I added above are inside the polygon?
A function I found that may be helpful is point.in.polygon, but it doesn't support the GeoJSON format.
Is there an R function, or any other approach, that would solve this problem?
It would be really helpful if it returned the ID of the containing polygon.
You can use the lawn package, e.g.:
x <- '{
"Sheet1": [
{
"CoM ID": "1040614",
"Genus": "Washingtonia",
"Year Planted": "1998",
"Latitude": "-37.81387927",
"Longitude": "144.9817733"
},
{
"CoM ID": "1663526",
"Genus": "Banksia",
"Year Planted": "2017",
"Latitude": "-37.79582801",
"Longitude": "144.9160598"
},
{
"CoM ID": "1031170",
"Genus": "Melaleuca",
"Year Planted": "1997",
"Latitude": "-37.82326441",
"Longitude": "144.9305296"
}
]
}'
feature1 <- '{"type":"Feature","id":"01","properties":{"name":"Alabama","density":94.65},"geometry":{"type":"Polygon","coordinates":[[[-87.359296,35.00118],[-85.606675,34.984749],[-85.431413,34.124869],[-85.184951,32.859696],[-85.069935,32.580372],[-84.960397,32.421541],[-85.004212,32.322956],[-84.889196,32.262709],[-85.058981,32.13674],[-85.053504,32.01077],[-85.141136,31.840985],[-85.042551,31.539753],[-85.113751,31.27686],[-85.004212,31.003013],[-85.497137,30.997536],[-87.600282,30.997536],[-87.633143,30.86609],[-87.408589,30.674397],[-87.446927,30.510088],[-87.37025,30.427934],[-87.518128,30.280057],[-87.655051,30.247195],[-87.90699,30.411504],[-87.934375,30.657966],[-88.011052,30.685351],[-88.10416,30.499135],[-88.137022,30.318396],[-88.394438,30.367688],[-88.471115,31.895754],[-88.241084,33.796253],[-88.098683,34.891641],[-88.202745,34.995703],[-87.359296,35.00118]]]}}'
Do one test:
lawn_boolean_contains(as.feature(feature1), lawn_point('[144.9817733,-37.81387927]'))
#> FALSE
All at once:
apply(jsonlite::fromJSON(x)$Sheet1, 1, function(z) {
  lawn_boolean_contains(
    as.feature(feature1),
    lawn_point(sprintf("[%s,%s]", z['Longitude'], z['Latitude']))
  )
})
#> 1 2 3
#> FALSE FALSE FALSE
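As an alternative to lawn, here is a minimal sketch with the sf package; the file name "polygons.geojson" is a placeholder for wherever your FeatureCollection is stored, and the "name" property is used as the identifier (whether the GeoJSON "id" member is exposed as a column depends on the reader):
library(sf)
library(jsonlite)

# read the FeatureCollection of polygons (placeholder file name)
polys <- st_read("polygons.geojson", quiet = TRUE)

# build point geometries from the Sheet1 records (coordinates are stored as strings)
pts_df <- fromJSON(x)$Sheet1
pts_df$Longitude <- as.numeric(pts_df$Longitude)
pts_df$Latitude <- as.numeric(pts_df$Latitude)
pts <- st_as_sf(pts_df, coords = c("Longitude", "Latitude"), crs = 4326)

# for each point, the indices of the polygons that contain it
hits <- st_within(pts, polys)

# map those indices back to an identifier of the containing polygon
lapply(hits, function(i) polys$name[i])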

Gremlin filter by count

Using this query with the Cosmos DB Gremlin API:
g.V().has('person', 'name', 'John').as('his')
.out('bought').aggregate('self')
.out('made_by')
I get the following output:
[
{
"id": "100",
"label": "brand",
"type": "vertex",
"properties": {
"name": [
{
"id": "233b77e7-7007-4c08-8930-99b25b67e493",
"value": "Apple"
}
]
}
},
{
"id": "100",
"label": "brand",
"type": "vertex",
"properties": {
"name": [
{
"id": "233b77e7-7007-4c08-8930-99b25b67e493",
"value": "Apple"
}
]
}
},
{
"id": "101",
"label": "brand",
"type": "vertex",
"properties": {
"name": [
{
"id": "f3e238e2-f274-489c-a69c-f1333403ee8e",
"value": "Google"
}
]
}
}
]
Is there a way to select only the brands whose count is greater than 1 (Apple in this case)?
I think that you just need to groupCount() and then use a filter:
g.V().has('person', 'name', 'John').as('his').
out('bought').aggregate('self').
out('made_by').
groupCount().
unfold().
where(select(values).is(gt(1))).
select(keys)
You could just groupCount() and then unfold() the resulting Map so that you can filter the entries with where().

R Getting JSON data into dataframe

I have this file with JSON-formatted data and need to get it into a dataframe. Ultimately I would like to plot the geolocations onto a map, but I can't seem to get this data into a df first.
json_to_df <- function(file){
  file <- lapply(file, function(x) {
    x[sapply(x, is.null)] <- NA
    unlist(x)
  })
  df <- do.call("rbind", file)
  return(df)
}
But I get only this error:
Error in fromJSON(file) :
STRING_ELT() can only be applied to a 'character vector', not a 'list'
The file structure looks like this (this is only part of the data):
{
"results": [
{
"utc_offset": 7200000,
"venue": {
"country": "nl",
"localized_country_name": "Netherlands",
"city": "Bergen",
"address_1": "16 Notweg",
"name": "FitClub Bergen",
"lon": 4.699218,
"id": 24632049,
"lat": 52.673046,
"repinned": false
},
"headcount": 0,
"distance": 22.46796989440918,
"visibility": "public",
"waitlist_count": 0,
"created": 1467149834000,
"rating": {
"count": 0,
"average": 0
},
"maybe_rsvp_count": 0,
"description": "<p>Start your week off right with a Monday Morning Bootcamp!!! The fresh air and peaceful dunes provide the perfect setting for a total body workout. Whether you are a beginner with brand spankin' new health goals and in need of some direction, or training for a race or competition, we're the trainers for you!!! See you at 8:50 for sign-in!</p>",
"event_url": "https://www.meetup.com/FitClubBergen/events/234936736/",
"yes_rsvp_count": 3,
"duration": 3600000,
"name": "Free Bootcamp in the Bergen Dunes",
"id": "glzqvlyvnbgc",
"time": 1477292400000,
"updated": 1477297999000,
"group": {
"join_mode": "open",
"created": 1441658286000,
"name": "FitClub Bergen Free Bootcamp in the Dunes",
"group_lon": 4.710000038146973,
"id": 18908751,
"urlname": "FitClubBergen",
"group_lat": 52.66999816894531,
"who": "FitClubbers"
},
"status": "past"
},
{
"utc_offset": 7200000,
"venue": {
"country": "nl",
"localized_country_name": "Netherlands",
"city": "Bergen",
"address_1": "16 Notweg",
"name": "FitClub Bergen",
"lon": 4.699218,
"id": 24632049,
"lat": 52.673046,
"repinned": false
},
"headcount": 0,
"distance": 22.46796989440918,
"visibility": "public",
"waitlist_count": 0,
"created": 1467149834000,
"rating": {
"count": 0,
"average": 0
},
"maybe_rsvp_count": 0,
"description": "<p>Start your week off right with a Monday Morning Bootcamp!!! The fresh air and peaceful dunes provide the perfect setting for a total body workout. Whether you are a beginner with brand spankin' new health goals and in need of some direction, or training for a race or competition, we're the trainers for you!!! See you at 8:50 for sign-in!</p> <p>ALWAYS FREE</p> <p>FOR ALL LEVELS OF FITNESS</p> <p>BRING: water bottle and energy</p>",
"event_url": "https://www.meetup.com/FitClubBergen/events/234936737/",
"yes_rsvp_count": 3,
"name": "Monday Morning Bootcamp in the Bergen Dunes",
"id": "flzqvlyvnbgc",
"time": 1477292400000,
"updated": 1477303926000,
"group": {
"join_mode": "open",
"created": 1441658286000,
"name": "FitClub Bergen Free Bootcamp in the Dunes",
"group_lon": 4.710000038146973,
"id": 18908751,
"urlname": "FitClubBergen",
"group_lat": 52.66999816894531,
"who": "FitClubbers"
},
"status": "past"
},
{
"utc_offset": 7200000,
"venue": {
"country": "nl",
"localized_country_name": "Netherlands",
"city": "Amsterdam",
"phone": "020 4275777",
"address_1": "Dijksgracht 2",
"address_2": "1019 BS ",
"name": "Klimmuur Central",
"lon": 4.91284,
"id": 1143381,
"lat": 52.376626,
"repinned": false
},
"headcount": 0,
"distance": 1.0689502954483032,
"visibility": "public",
"waitlist_count": 0,
"created": 1477215767000,
"rating": {
"count": 0,
"average": 0
},
"maybe_rsvp_count": 0,
"description": "<p>Climbing Right After Work: RAW.<br/>Quiet hall, pretty much every rope available; no rope chasing necessary. And.. still some time left to do other things later that evening. Take you gear and an extra sandwich to work and join me afterwards pulling some plastic.<br/>Some notes:<br/>- This events starts #17:00. If you can't make it that early, please comment the time you can.<br/>- Please fill in your belaying skills in your profile. If you've never climbed before or don't have belaying skills: follow an introduction course a the gym first! Safety above all!</p>",
"event_url": "https://www.meetup.com/The-Amsterdam-indoor-rockclimbing/events/235054729/",
"yes_rsvp_count": 3,
"name": "Monday's RAW Climb",
"id": "235054729",
"time": 1477321200000,
"updated": 1477334279000,
"group": {
"join_mode": "approval",
"created": 1358348565000,
"name": "The Amsterdam indoor rockclimbing",
"group_lon": 4.889999866485596,
"id": 6689952,
"urlname": "The-Amsterdam-indoor-rockclimbing",
"group_lat": 52.369998931884766,
"who": "Climbers"
},
"status": "past"
},
{
"utc_offset": 7200000,
"venue": {
"country": "nl",
"localized_country_name": "Netherlands",
"city": "Amstelveen",
"address_1": "Langs de Akker 3",
"name": "Emergohal",
"lon": 4.87967,
"id": 23816542,
"lat": 52.290199,
"repinned": false
},
"rsvp_limit": 12,
"headcount": 0,
"distance": 5.541957378387451,
"visibility": "public",
"waitlist_count": 0,
"created": 1474452073000,
"fee": {
"amount": 5.5,
"accepts": "cash",
"description": "per person",
"currency": "EUR",
"label": "price",
"required": "0"
},
"rating": {
"count": 0,
"average": 0
},
"maybe_rsvp_count": 0,
"description": "<p>We will play the Whole Season indoor soccer on Mondays from 18:00 - 19:00 starting 5 September until May 2017 in the Emergohal Amstelveen.</p> <p>Preferred payment is with Paypal EUR 5.50 (in advance)<br/>If this is not possible you may pay cash but then I will ask EUR 6,-<br/>(Please have the exact cash with you)</p> <p>xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx</p> <p>A couple of Unisys (ex)colleagues and football lovers are playing every Monday in the Emergohal Amstelveen at 6PM on a reasonable good level. We are looking for a compact group of players who are willing/able to play (almost) every Monday playing 5v5 (or 6v6).<br/>We are playing with the FIFA Futsal rules in mind:<br/>http://www.fifa.com/mm/document/footballdevelopment/refereeing/51/44/50/lawsofthegamefutsal2014_15_eneu_neutral.pdf</p> <p>The Emergohal has dressing rooms and a nice bar for after the game.</p> <p>Hope to see you on Mondays</p> <p>Cheers Jeroen</p> <p>For questions you may call me on[masked], send a text message (SMS) or leave a message on this meetup group.</p>",
"event_url": "https://www.meetup.com/Futsal_Emergohal_Monday_18-00/events/234290812/",
"yes_rsvp_count": 11,
"duration": 4500000,
"name": "Futsal",
"id": "234290812",
"time": 1477323900000,
"updated": 1477330559000,
"group": {
"join_mode": "approval",
"created": 1474445066000,
"name": "Futsal_Emergohal_Monday_18.00",
"group_lon": 4.860000133514404,
"id": 20450096,
"urlname": "Futsal_Emergohal_Monday_18-00",
"group_lat": 52.31999969482422,
"who": "Players"
},
"status": "past"
}],
"meta": {
"next": "https://api.meetup.com/2/open_events?and_text=False&offset=1&city=Amsterdam&sign=True&format=json&lon=4.88999986649&limited_events=False&photo-host=public&page=20&time=-24m%2C&radius=25.0&lat=52.3699989319&status=past&desc=False",
"method": "OpenEvents",
"total_count": 643,
"link": "https://api.meetup.com/2/open_events",
"count": 20,
"description": "Searches for recent and upcoming public events hosted by Meetup groups. Its search window is the past one month through the next three months, and is subject to change. Open Events is optimized to search for current events by location, category, topic, or text, and only lists Meetups that have **3 or more RSVPs**. The number or results returned with each request is not guaranteed to be the same as the page size due to secondary filtering. If you're looking for a particular event or events within a particular group, use the standard [Events](/meetup_api/docs/2/events/) method.",
"lon": ,
"title": "Meetup Open Events v2",
"url": "",
"signed_url": "{signed_url}",
"id": "",
"updated": 1479988687055,
"lat":
}
}
So I was wondering how I could put this into a dataframe, or even a CSV, to be able to extract the geolocations later?
There is no need to write a parser yourself; there are a number of packages that can read JSON-formatted data. The one I use, and the one @hrbrmstr linked, is jsonlite. This package provides a fromJSON function which can parse JSON into a data.frame:
fromJSON('file.json', flatten = TRUE)
Note that the flatten argument here ensures the nested JSON is flattened into a nice data.frame.
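From there, extracting the geolocations for plotting is just column selection on the flattened result. A minimal sketch, assuming the full (valid) JSON is saved as "file.json"; the flattened venue columns come out as venue.lon and venue.lat:
library(jsonlite)

events <- fromJSON("file.json", flatten = TRUE)$results

# keep just what is needed for plotting the geolocations
geo <- data.frame(
  name = events$name,
  lon = events$venue.lon,
  lat = events$venue.lat
)
head(geo)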
