Nested query in Firebase Firestore - firebase

I have a Firestore database structure like this:
(Articles is a collection, everything inside is arrays)
Articles:
[
{
"id": 1,
"reports": [
{
"locations": [
{
"location": "Sydney",
"country": "Australia"
},
{
"location": "Perth",
"country": "Australia"
}
]
},
{
"locations": [
{
"location": "King County, Washington",
"country": "USA"
}
]
}
]
},
{
"id": 2,
"reports": [
{
"locations": [
{
"location": "Brisbane",
"country": "Australia"
}
]
}
]
}
]
I'd like to create a query to return all Articles that mention a specific country.
Am I better off restructuring my database?

While Firestore's array-contains queries can get you close to this, you can't use them in your current data model as you're nesting arrays.
To allow the use-case you'll at the very least need to unnest one of those arrays into a subcollection of each article, so:
Articles (collection)
$article
reports (collection)
$report (document with a locations array)
At that point you can use a collection group query to search all reports collections, and then array-contains to find the relevant report documents:
var ref = firebase.firestore().collectionGroup("reports");
ref.where("locations", "array-contains", { location: "Sydney", country: "Australia" })
Never mind the below answer, which assumed you had only one level of array...
This type of query should be possible, as far as I can see, using the array-contains operator, keeping in mind that you need to specify the entire array item.
So something like:
collectionRef.where("reports.locations", "array-contains",
{ location: "Sydney", country: "Australia" })
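If only the country matters, one possible restructuring (an assumption, not something the answer above proposes) is to store a flat, top-level array of country strings on each article, computed whenever the article is written. A plain array-contains on that field would then match articles directly. The sketch below only shows how such a field could be derived; the helper name `extract_countries` and the field name `countries` are hypothetical.

```python
def extract_countries(article):
    """Collect the distinct countries mentioned in an article's nested reports."""
    countries = {
        loc["country"]
        for report in article.get("reports", [])
        for loc in report.get("locations", [])
    }
    return sorted(countries)


article = {
    "id": 1,
    "reports": [
        {"locations": [{"location": "Sydney", "country": "Australia"},
                       {"location": "Perth", "country": "Australia"}]},
        {"locations": [{"location": "King County, Washington", "country": "USA"}]},
    ],
}

# Hypothetical denormalized field to store alongside the article document.
article["countries"] = extract_countries(article)
print(article["countries"])
```

With such a field in place, the lookup would be an array-contains filter on "countries" with the country name as the value. The field has to be recomputed whenever an article's reports change.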

Non Recursive Index CosmosDb

As a CosmosDB (SQL API) user I would like to index only the properties inside an object that are not themselves objects or arrays.
By default, the /* path in a Cosmos index policy indexes every property. Our data set is getting extremely large (and expensive), so this strategy is no longer optimal. We store our metadata at the root and our customer data wrapped inside an object property named data.
Our platform restricts queries on the data path to value-type properties. Indexing the objects and arrays nested under the data path therefore only slows down writes and costs RUs for index entries that are never used.
I have tried several iterations of index policies but cannot find one that fits. Example:
{
"partitionKey": "f402a704-19bb-4f4d-93e6-801c50280cf6",
"id": "4a7a11e5-00b5-4def-8e80-132a8c083f24",
"data": {
"country": "Belgium",
"employee": 250,
"teammates": [
{ "name": "Jake", "id": 123 ...},
{ "name": "kyle", "id": 3252352 ...}
],
"user": {
"name": "Brian",
"addresses": [{ "city": "Moscow" ...}, { "city": "Moscow" ...}]
}
}
}
In this case I want to only index the root properties as well as /data/employee and /data/country.
Policies like /data/* will not work because it would then index /data/teammates/name ... and so on.
/data/? => assumes data is a value type which it never will be so this doesn't work.
/data/ and /data/*/? and /data/*? are not accepted by cosmos as valid policies.
Additionally I can't simply exclude /data/teammates/ and /data/user/ because what is inside of data is completely dynamic so while that might cover this use case there are several 100k others that it would not.
I have tried many iterations, but none of the options work, for various reasons. Is there a way to support what I am trying to do?
This indexing policy will index the properties you are asking for.
{
"indexingMode": "consistent",
"automatic": true,
"includedPaths": [
{
"path": "/partitionKey/?"
},
{
"path": "/data/country/?"
},
{
"path": "/data/employee/?"
}
],
"excludedPaths": [
{
"path": "/*"
}
]
}
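Since the keys under data are dynamic, a policy like the one above would probably be generated rather than written by hand. The following is a minimal sketch, assuming the list of scalar paths to index is known from elsewhere (for example, from the platform's own query restrictions). The helper `build_indexing_policy` is hypothetical and only reproduces the JSON shape shown above; it does not call any Cosmos DB SDK.

```python
import json


def build_indexing_policy(included_scalar_paths):
    """Build a policy that indexes only the listed scalar paths and excludes the rest."""
    return {
        "indexingMode": "consistent",
        "automatic": True,
        # "/?" targets the scalar value stored at exactly this path.
        "includedPaths": [
            {"path": p.rstrip("/") + "/?"} for p in included_scalar_paths
        ],
        # Everything not explicitly included falls under the catch-all exclusion.
        "excludedPaths": [{"path": "/*"}],
    }


policy = build_indexing_policy(["/partitionKey", "/data/country", "/data/employee"])
print(json.dumps(policy, indent=2))
```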

How to filter data ignoring the data that does not match

I have this query:
query ListFreightDriverTrucks($state: String! $tons: Float!) {
listFreightDrivers(filter: {
state: {
contains: $state
}
}) {
items {
name
city
state
trucks (filter: {
tons: {
eq: $tons
}
}) {
items {
id
brand
model
fuelType
fuelEfficiency
utilityPercentage
tons
axes
frontPhoto
truckBox {
type
width
height
depth
}
}
}
}
}
}
The response contains the drivers that match $state, which is Jalisco:
{
"data": {
"listFreightDrivers": {
"items": [
{
"name": "Jaen Carlos",
"city": "Zapopan",
"state": "Jalisco",
"trucks": {
"items": []
}
},
{
"name": "Diey",
"city": "Zapopan",
"state": "Jalisco",
"trucks": {
"items": []
}
},
{
"name": "Roberto mendez",
"city": "Guadalajara",
"state": "Jalisco",
"trucks": {
"items": []
}
},
{
"name": "Engineering",
"city": "Zapopan",
"state": "Jalisco",
"trucks": {
"items": []
}
},
{
"name": "Roberto mendez",
"city": "Guadalajara",
"state": "Jalisco",
"trucks": {
"items": []
}
},
{
"name": "Andrés",
"city": "Zapopan",
"state": "Jalisco",
"trucks": {
"items": [
{
"id": "2b0cb78e-49c4-4229-8a71-60b350a5fc47",
"brand": "chevrolet",
"model": "xx",
"fuelType": "magna",
"fuelEfficiency": 12,
"utilityPercentage": 10,
"tons": 15,
"axes": 12,
"frontPhoto": "freight-driver/e9adf7fb-09c2-477e-9152-56fe4a71a96b/trucks/dlb0275xqna51.png",
"truckBox": {
"type": "Plataforma",
"width": 4,
"height": 4,
"depth": 4
}
}
]
}
}
]
}
}
}
If you check the response, there are some with this:
"trucks": {
"items": []
}
But I'm not interested in those because they do not match $tons; only the last one did. How can I remove them?
In case I need to write a Lambda, what would the DynamoDB queries look like?
I see this question a lot, which makes me a bit insecure, but GraphQL isn't really supposed to work that way. You are supposed to get what you ask for, not to "SQL query yourself to victory".
Anyhoot,
You could fix this in your resolvers (the req.vtl file) by filtering out every driver whose trucks.items is empty (length < 1), or by some other condition. Please see this link:
Appsync & GraphQL: how to filter a list by nested value
Be aware that this is a DynamoDB scan operation (all list operations are) which is quite slow.
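The same filter can also be applied after the fact, for example in a Lambda or in the client, by dropping every driver whose trucks.items list is empty. Here is a minimal sketch that works on the response shape shown in the question; the function name is hypothetical and the sample data is shortened.

```python
def drop_drivers_without_trucks(response):
    """Return a copy of the response that keeps only drivers with at least one truck."""
    drivers = response["data"]["listFreightDrivers"]["items"]
    kept = [d for d in drivers if d["trucks"]["items"]]
    return {"data": {"listFreightDrivers": {"items": kept}}}


response = {
    "data": {
        "listFreightDrivers": {
            "items": [
                {"name": "Diey", "state": "Jalisco", "trucks": {"items": []}},
                {"name": "Andrés", "state": "Jalisco",
                 "trucks": {"items": [{"brand": "chevrolet", "tons": 15}]}},
            ]
        }
    }
}

filtered = drop_drivers_without_trucks(response)
print([d["name"] for d in filtered["data"]["listFreightDrivers"]["items"]])
```

This does not reduce the cost of the underlying scan; it only cleans up what is returned.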
DynamoDB follows the same design philosophy: most of the time you know the unique keys you are looking for and only filter over a small number of items. Adding indexes or combining keys helps you get there.
Recommended reading if you want to update your data model:
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/best-practices.html
Maybe rethink your GraphQL design? I don't know anything about trucks but maybe
"Location has Truck has Driver" instead?
or
"Location has Driver has Truck"?
or even both! Since GraphQL gives you what you want a Driver can contain a Truck and a Truck a Driver.
type Location {
id: ID!
truck: [Truck]
driver: [Driver]
}
type Truck {
id: ID!
driver: Driver!
}
type Driver {
id: ID!
truck: Truck!
}
Amplify auto-generates queries with a depth of 2 so that your lists don't circle forever, and you can simply not ask for what you don't need. There are tons of options here.
https://docs.amplify.aws/cli/graphql-transformer/dataaccess
If you want to make it a Lambda (@function), the DynamoDB syntax is quite easy (and pretty much the same).
Either you scan the whole table https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/DynamoDB.html#scan-property
or you create an index which you query and then filter https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/DynamoDB.html#query-property
Last but not least

Mismatch of location id between Geocoding Autocomplete and Geocoding

I am using geocoding autocomplete to display matching locations after the user has typed something. Afterwards I use geocoding with the given location ID to fetch detailed information about the selected location.
It worked well until I tried to select "Russia".
Here is my first request to geocoding autocomplete via https://autocomplete.geocoder.api.here.com/6.2/suggest.json
{
"app_id": "xxx",
"app_code": "xxx",
"query": "russia",
"resultType": "areas"
}
And here is the (simplified) response:
{
"suggestions": [
{
"label": "Russia",
"language": "en",
"countryCode": "RUS",
"locationId": "NT_Ya5FK7rlnK5m6PEDf7BwfA",
"address": {
"country": "Russia"
},
"matchLevel": "country"
},
...
]
}
The second request goes to geocoding via https://geocoder.api.here.com/6.2/geocode.json with the following arguments:
{
"app_id": "xxx",
"app_code": "xxx",
"locationId": "NT_Ya5FK7rlnK5m6PEDf7BwfA",
"jsonattributes": "1",
"gen": "9",
"language": "en"
}
As you can see, the location ID is the same as in the response to the first query. I expected to get details for the country Russia, but instead I receive an empty response:
{
"response": {
"metaInfo": {
"timestamp": "2019-08-20T21:02:54.652+0000"
},
"view": []
}
}
After some troubleshooting I noticed that geocoding also works with simple free-form input. I tried this request directly on the example page: I typed "russia" into searchtext and, voila, I got a response (simplified):
{
"Response": {
"MetaInfo": {
"Timestamp": "2019-08-21T12:36:07.874+0000"
},
"View": [
{
"_type": "SearchResultsViewType",
"ViewId": 0,
"Result": [
{
...
"Location": {
"LocationId": "NT_tcqMSofTaW297lvniHjdXD",
"LocationType": "area",
"Address": {
"Label": "Россия",
"Country": "RUS",
"AdditionalData": [
{
"value": "Россия",
"key": "CountryName"
}
]
},
...
}
}
]
}
]
}
}
But wait, what? The ID from autocomplete was NT_Ya5FK7rlnK5m6PEDf7BwfA, and the one from geocoding is NT_tcqMSofTaW297lvniHjdXD.
Why do I receive the wrong location ID from geocoding autocomplete?
We just implemented the HERE API in our product and are currently testing it with real use-case input, which is how we found this bug.
Is it just this one location that has an inconsistent locationId reference, or are there more? How can we work around this error? Is it common?
Geocoder generates LocationId from a set of values, which uniquely identify the object. This set includes different types of data such as result type, base names and attribution of admin hierarchy, street name, house number, etc. From all this information Geocoder generates a hash value which is expected to be unique.
Using only base names guarantees that LocationId does not change if, e.g., additional language variants are added to a country or state name. But if the main official country or state name changes, all the areas and addresses within this country or state will get a new LocationId. So using a LocationId from the Geocoder Autocomplete API will not always work with the Geocoder API.
We will update our documentation to reflect this as the current documentation may be a bit misleading.
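Given that an autocomplete LocationId may not resolve, one workaround a caller might consider (an assumption based on the question's own experiment with searchtext, not something the answer recommends) is to fall back to a free-text lookup whenever the ID lookup returns an empty view. The sketch below uses injected callables instead of a real HERE client. The names `resolve_location`, `lookup_by_id` and `lookup_by_text` are hypothetical, and the lowercase response keys assume jsonattributes=1 on both requests.

```python
from typing import Callable


def resolve_location(
    location_id: str,
    label: str,
    lookup_by_id: Callable[[str], dict],
    lookup_by_text: Callable[[str], dict],
) -> dict:
    """Try the locationId first; fall back to a free-text search on an empty view."""
    by_id = lookup_by_id(location_id)
    if by_id.get("response", {}).get("view"):
        return by_id
    return lookup_by_text(label)


# Stand-ins for the two HTTP calls, returning the shapes seen in the question.
def fake_lookup_by_id(location_id: str) -> dict:
    return {"response": {"view": []}}


def fake_lookup_by_text(text: str) -> dict:
    location = {"locationId": "NT_tcqMSofTaW297lvniHjdXD"}
    return {"response": {"view": [{"result": [{"location": location}]}]}}


result = resolve_location(
    "NT_Ya5FK7rlnK5m6PEDf7BwfA", "Russia", fake_lookup_by_id, fake_lookup_by_text
)
print(result["response"]["view"][0]["result"][0]["location"]["locationId"])
```

A free-text fallback can return a different place than the one the user picked, so the result should be checked (for example against the country code) before it is trusted.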

Azure search service index pointing multiple document db collections

How can I load data from two separate Azure Cosmos DB collections into a single Azure Search index? I need a way to join the data from the two collections, similar to an inner join in SQL, and load the result into the Azure Search service.
I have two collections in Azure Cosmos DB.
One is for products; sample documents from it are shown below.
{
"description": null,
"links": [],
"replaces": "00000000-0000-0000-0000-000000000000",
"replacedBy": "00000000-0000-0000-0000-000000000000",
"productTypeId": "ccd0bc73-c4a1-41bf-9c96-454a5ba1d025",
"id": "a4853bf5-9c58-4fb5-a1ff-fc3ab575b4c8",
"name": "New Product",
"createDate": "2018-09-19T10:04:35.1951552Z",
"createdBy": "00000000-0000-0000-0000-000000000000",
"updateDate": "2018-10-05T13:46:24.7048358Z",
"updatedBy": "DIJdyXMudaqeAdsw1SiNyJKRIi7Ktio5#clients"
}
{
"description": null,
"links": [],
"replaces": "00000000-0000-0000-0000-000000000000",
"replacedBy": "00000000-0000-0000-0000-000000000000",
"productTypeId": "ccd0bc73-c4a1-41bf-9c96-454a5ba1d025",
"id": "b9b6c3bc-a8f8-470f-ac93-be589eb1da16",
"name": "New Product 2",
"createDate": "2018-09-19T11:02:02.6919008Z",
"createdBy": "00000000-0000-0000-0000-000000000000",
"updateDate": "2018-09-19T11:02:02.6919008Z",
"updatedBy": "00000000-0000-0000-0000-000000000000"
}
{
"description": null,
"links": [],
"replaces": "00000000-0000-0000-0000-000000000000",
"replacedBy": "00000000-0000-0000-0000-000000000000",
"productTypeId": "ccd0bc73-c4a1-41bf-9c96-454a5ba1d025",
"id": "98b3647a-3b40-4a00-bd0f-2a397bd48b68",
"name": "New Product 7",
"createDate": "2018-09-20T09:42:28.2913567Z",
"createdBy": "00000000-0000-0000-0000-000000000000",
"updateDate": "2018-09-20T09:42:28.2913567Z",
"updatedBy": "00000000-0000-0000-0000-000000000000"
}
The other collection is for ProductType; a sample document is shown below.
{
"description": null,
"links": null,
"replaces": "00000000-0000-0000-0000-000000000000",
"replacedBy": "00000000-0000-0000-0000-000000000000",
"id": "ccd0bc73-c4a1-41bf-9c96-454a5ba1d025",
"name": "ProductType1_186",
"createDate": "2018-09-18T23:54:43.9395245Z",
"createdBy": "00000000-0000-0000-0000-000000000000",
"updateDate": "2018-10-05T13:29:44.019851Z",
"updatedBy": "DIJdyXMudaqeAdsw1SiNyJKRIi7Ktio5#clients"
}
The product type id is referenced in the product collection, and that is the field which links the two collections.
I want to load the above two collections into the same Azure Search index, and I expect the index fields to be populated somewhat like below.
If you use product id as the key, you can simply point two indexers at the same index, and Azure Search will merge the documents automatically. For example, here are two indexer definitions that would merge their data into the same index:
{
"name" : "productIndexer",
"dataSourceName" : "productDataSource",
"targetIndexName" : "combinedIndex",
"schedule" : { "interval" : "PT2H" }
}
{
"name" : "sampleIndexer",
"dataSourceName" : "sampleDataSource",
"targetIndexName" : "combinedIndex",
"schedule" : { "interval" : "PT2H" }
}
Learn more about the create indexer api here
However, it appears that the two collections share the same fields. This means that the fields from the document which gets indexed last will replace the fields from the document that got indexed first. To avoid this, I would recommend replacing the fields that match the 00000000-0000-0000-0000-000000000000 pattern with null in your Cosmos DB query. For example:
SELECT p.productTypeId, (p.createdBy != "00000000-0000-0000-0000-000000000000" ? p.createdBy : null) AS createdBy FROM products p
This exact query may not work for your use case. See the query syntax reference for more information.
Please let me know if you have any questions, or something is not working as expected.
Thanks
Matt
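The merge behaviour described in this answer can be pictured with a small in-memory model: documents that share a key are merged field by field, and for fields present in both sources the source indexed last wins. This is a simplified illustration, not the Azure Search SDK. The function `merge_into_index`, the key values and the `typeDescription` field are hypothetical, and the sketch assumes both sources emit the same key value.

```python
def merge_into_index(index, documents, key_field="id"):
    """Merge documents into the index by key; later fields overwrite earlier ones."""
    for doc in documents:
        index.setdefault(doc[key_field], {}).update(doc)


index = {}

# First indexer run: product data.
merge_into_index(index, [{"id": "p1", "name": "New Product", "productTypeId": "t1"}])

# Second indexer run: another source that emits the same key and a shared "name" field.
merge_into_index(index, [{"id": "p1", "name": "ProductType1_186", "typeDescription": "demo"}])

print(index["p1"])
```

The shared `name` field ends up holding the value from the second run, which is the overwrite problem the answer warns about, while fields unique to either source survive side by side.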

WHERE clause on an array in Azure DocumentDb

In an Azure Documentdb document like this
{
"id": "WakefieldFamily",
"parents": [
{ "familyName": "Wakefield", "givenName": "Robin" },
{ "familyName": "Miller", "givenName": "Ben" }
],
"children": [
{
"familyName": "Merriam",
"givenName": "Jesse",
"gender": "female",
"grade": 1,
"pets": [
{ "givenName": "Goofy" },
{ "givenName": "Shadow" }
]
},
{
"familyName": "Miller",
"givenName": "Lisa",
"gender": "female",
"grade": 8
}
],
"address": { "state": "NY", "county": "Manhattan", "city": "NY" },
"isRegistered": false
}
How do I query to get the children whose pet's given name is "Goofy"?
It looks like the following syntax is invalid:
Select * from root r
WHERE r.children.pets.givenName="Goofy"
Instead I need to do
Select * from root r
WHERE r.children[0].pets[0].givenName="Goofy"
which is not really searching through an array.
Any suggestion on how I should handle queries like these ?
You should take advantage of DocumentDB's JOIN clause, which operates a bit differently than JOIN in an RDBMS (since DocumentDB deals with a denormalized data model of schema-free documents).
To put it simply, you can think of DocumentDB's JOIN as self-joins which can be used to form cross-products between nested JSON objects.
In the context of querying children whose pets given name is "Goofy", you can try:
SELECT
f.id AS familyName,
c AS child,
p.givenName AS petName
FROM Families f
JOIN c IN f.children
JOIN p IN c.pets
WHERE p.givenName = "Goofy"
Which returns:
[{
"familyName": "WakefieldFamily",
"child": {
"familyName": "Merriam",
"givenName": "Jesse",
"gender": "female",
"grade": 1,
"pets": [{
"givenName": "Goofy"
}, {
"givenName": "Shadow"
}]
},
"petName": "Goofy"
}]
Reference: http://azure.microsoft.com/en-us/documentation/articles/documentdb-sql-query/
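To see why the JOIN query above returns exactly one row, it can help to read each JOIN as a nested loop over an array inside the same document. The sketch below mirrors the query in plain Python; it is an illustration of the semantics only, and the function name `children_with_pet` is hypothetical.

```python
def children_with_pet(families, pet_name):
    """Emulate FROM f JOIN c IN f.children JOIN p IN c.pets WHERE p.givenName = pet_name."""
    results = []
    for f in families:                              # FROM Families f
        for c in f.get("children", []):             # JOIN c IN f.children
            for p in c.get("pets", []):             # JOIN p IN c.pets
                if p["givenName"] == pet_name:      # WHERE p.givenName = "Goofy"
                    results.append(
                        {"familyName": f["id"], "child": c, "petName": p["givenName"]}
                    )
    return results


families = [{
    "id": "WakefieldFamily",
    "children": [
        {"familyName": "Merriam", "givenName": "Jesse", "gender": "female", "grade": 1,
         "pets": [{"givenName": "Goofy"}, {"givenName": "Shadow"}]},
        {"familyName": "Miller", "givenName": "Lisa", "gender": "female", "grade": 8},
    ],
}]

rows = children_with_pet(families, "Goofy")
print(len(rows), rows[0]["child"]["givenName"])
```

A child without a pets array, like Lisa, contributes no rows because the innermost loop has nothing to iterate over.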
Edit:
You can also use the ARRAY_CONTAINS function, which looks something like this:
SELECT food.id, food.description, food.tags
FROM food
WHERE food.id = "09052" OR ARRAY_CONTAINS(food.tags, { "name": "blueberries" })
I think the ARRAY_CONTAINS function has changed since this was answered in 2014. I had to use the following for it to work.
SELECT * FROM c
WHERE ARRAY_CONTAINS(c.Samples, {"TimeBasis":"5MIN_AV", "Value":"5.105"},true)
Samples is my JSON array and it contains objects with many properties including the two above.
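The third argument in that query turns on partial matching: an array element counts as a match if it contains all the properties of the search object, even when it has additional ones. The following is a simplified one-level model of that behaviour in Python. It is not the Cosmos DB implementation, and the extra `Unit` property in the sample data is made up for illustration.

```python
def array_contains(array, value, partial=False):
    """Simplified model of ARRAY_CONTAINS(array, value, partial)."""
    if not isinstance(array, list):
        return False
    for item in array:
        if partial and isinstance(item, dict) and isinstance(value, dict):
            # Partial match: every key/value of the search object is present in the item.
            if all(k in item and item[k] == v for k, v in value.items()):
                return True
        elif item == value:
            # Full match: the element must equal the search value exactly.
            return True
    return False


samples = [{"TimeBasis": "5MIN_AV", "Value": "5.105", "Unit": "hypothetical"}]
needle = {"TimeBasis": "5MIN_AV", "Value": "5.105"}

print(array_contains(samples, needle, True))   # partial match
print(array_contains(samples, needle))         # exact match required
```

Without the flag the whole element must equal the search object, which is why the query only worked once true was passed.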
