Firebase Realtime Database query to find data

I wish to store data for some children's activities, where each activity is suited to a certain age range. Let's say activity A is good for 2 - 5 year olds and activity B is good for 0 - 1 year olds.
On the client side, there is a fixed set of choices like:
0 - 1 years,
1 - 3 years,
4 - 5 years,
6 - 13 years
Now the requirement is that activity A should come up for the selections 1 - 3 years as well as 4 - 5 years, since 2 - 5 overlaps both ranges.
What would be a good way to store the activity data and then query it efficiently?

Assuming the fixed set of choices is a permanent feature to your application, I'd have a boolean field for each match, for example, your activities would look like:
activities: {
  activityA: {
    range0to1: false,
    range2to3: true,
    range4to5: true,
    range6to13: false
  },
  activityB: {
    range0to1: true,
    range2to3: false,
    range4to5: false,
    range6to13: false
  }
}
And then when you want to query all activities which apply to, e.g., ages 2 to 3, you already have the field to query on, with nothing too complicated.
But really, for longevity, I wouldn't assume that the fixed set of choices is permanent for the lifetime of an app, in which case I'd rather have something like:
activities: {
  activityA: {
    minAge: 2,
    maxAge: 5,
  },
  activityB: {
    minAge: 0,
    maxAge: 1,
  }
}
...and then if I want to query for the fixed choice of ages between x and y, my ideal query would be for all activities whose age range overlaps x to y, which is the case when the activity starts no later than y and ends no earlier than x:
e.g. (pseudocode) where (minAge <= y and maxAge >= x)
But unfortunately, in practice, Firebase RTDB doesn't let you query by multiple fields, so if it's not too late, I'd recommend looking at Firestore, which may be better suited to your needs (personally, I'd typically recommend Firestore over RTDB for most use cases).
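If you stay on RTDB with the minAge/maxAge structure, one option is to narrow on a single field on the server (for example with orderByChild('minAge').endAt(y)) and check the other field on the client. Below is a minimal sketch of that client-side check on plain objects; the helper names overlaps and activitiesForRange are made up for illustration and are not part of any Firebase API.

```javascript
// Sample data mirroring the minAge/maxAge structure above.
const activities = {
  activityA: { minAge: 2, maxAge: 5 },
  activityB: { minAge: 0, maxAge: 1 },
};

// Two closed ranges overlap when each one starts before the other ends.
function overlaps(activity, x, y) {
  return activity.minAge <= y && activity.maxAge >= x;
}

// Hypothetical helper: keys of all activities whose range overlaps [x, y].
function activitiesForRange(all, x, y) {
  return Object.keys(all).filter((key) => overlaps(all[key], x, y));
}

console.log(activitiesForRange(activities, 4, 5)); // only activityA
console.log(activitiesForRange(activities, 0, 1)); // only activityB
```

Note that this treats both ends as inclusive, so an activity for ages 0 - 1 also matches a choice that starts at age 1.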
If you are stuck with RTDB, then another solution might be to create a lookup block at the root of your structure:
{
  activities: {
    activityA: {
      // age range of 2-5 stored however you like
    },
    activityB: {
      // age range of 0-1 stored however you like
    },
    activityC: {
      // age range of 0-3 stored however you like
    }
  },
  ageActivityLookup: {
    age0: {
      activityB: true,
      activityC: true,
    },
    age1: {
      activityB: true,
      activityC: true,
    },
    age2: {
      activityA: true,
      activityC: true,
    },
    age3: {
      activityA: true,
      activityC: true,
    },
    age4: {
      activityA: true,
    },
    age5: {
      activityA: true,
    }
  }
}
So then you can simply query ageX and get your list of activities. This will mean multiple queries if you're looking for a range of ages, and does mean having to ensure your lookup block stays in sync. This should be OK if the rest of your application data structure isn't too complex.
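To keep the lookup block in sync, it can help to derive it from the activity ranges rather than edit it by hand. The following is only a sketch on plain objects: buildAgeLookup and activitiesForAges are hypothetical helpers, and it assumes each activity stores minAge and maxAge as whole years.

```javascript
// Hypothetical helper: derives the ageActivityLookup node from minAge/maxAge.
function buildAgeLookup(activities) {
  const lookup = {};
  for (const [name, { minAge, maxAge }] of Object.entries(activities)) {
    for (let age = minAge; age <= maxAge; age++) {
      const key = "age" + age;
      lookup[key] = lookup[key] || {};
      lookup[key][name] = true;
    }
  }
  return lookup;
}

const lookup = buildAgeLookup({
  activityA: { minAge: 2, maxAge: 5 },
  activityB: { minAge: 0, maxAge: 1 },
  activityC: { minAge: 0, maxAge: 3 },
});

// Hypothetical helper: union of the per-age entries for a range of ages,
// standing in for the "multiple queries" mentioned above.
function activitiesForAges(node, from, to) {
  const found = new Set();
  for (let age = from; age <= to; age++) {
    Object.keys(node["age" + age] || {}).forEach((name) => found.add(name));
  }
  return [...found].sort();
}

console.log(activitiesForAges(lookup, 4, 5)); // only activityA
```

The resulting object could then be written to the ageActivityLookup node in one go whenever an activity's range changes.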

@hussein Taking inspiration from your idea, I simplified it a bit to fit my use case. Instead of a separate node, I added each age group classification within the activity itself, like:
baby:true
teen:true
and so on.
This saves the overhead of maintaining and updating an entire node, whose complexity would increase as the activities grow.
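As a rough sketch of this per-activity approach, the flags could be computed from an activity's age range when it is saved. The group names and bounds below are assumptions chosen to match the fixed choices in the question (only baby and teen are named in the comment above), and ageGroupFlags is a hypothetical helper.

```javascript
// Assumed group names mapped to the fixed age choices from the question.
const AGE_GROUPS = {
  baby: [0, 1],
  toddler: [1, 3],
  preschooler: [4, 5],
  schoolAge: [6, 13],
};

// Hypothetical helper: returns { groupName: true } for every group whose
// range overlaps the activity's [minAge, maxAge] range.
function ageGroupFlags(minAge, maxAge, groups = AGE_GROUPS) {
  const flags = {};
  for (const [name, [low, high]] of Object.entries(groups)) {
    if (minAge <= high && maxAge >= low) flags[name] = true;
  }
  return flags;
}

console.log(ageGroupFlags(2, 5)); // toddler and preschooler are set
```

Each flag is then a single child to query on, e.g. orderByChild('toddler').equalTo(true).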


Weaviate: using near_text with the exact property doesn't return a distance of 0

Here's a minimal example:
import weaviate

CLASS = "Superhero"
PROP = "superhero_name"

client = weaviate.Client("http://localhost:8080")
class_obj = {
    "class": CLASS,
    "properties": [
        {
            "name": PROP,
            "dataType": ["string"],
            "moduleConfig": {
                "text2vec-transformers": {
                    "vectorizePropertyName": False,
                }
            },
        }
    ],
    "moduleConfig": {
        "text2vec-transformers": {
            "vectorizeClassName": False
        }
    }
}
client.schema.delete_all()
client.schema.create_class(class_obj)
batman_id = client.data_object.create({PROP: "Batman"}, CLASS)

by_text = (
    client.query.get(CLASS, [PROP])
    .with_additional(["distance", "id"])
    .with_near_text({"concepts": ["Batman"]})
    .do()
)
print(by_text)

batman_vector = client.data_object.get(
    uuid=batman_id, with_vector=True, class_name=CLASS
)["vector"]
by_vector = (
    client.query.get(CLASS, [PROP])
    .with_additional(["distance", "id"])
    .with_near_vector({"vector": batman_vector})
    .do()
)
print(by_vector)
Please note that I specified both "vectorizePropertyName": False and "vectorizeClassName": False
The code above returns:
{'data': {'Get': {'Superhero': [{'_additional': {'distance': 0.08034378, 'id': '05fbd0cb-e79c-4ff2-850d-80c861cd1509'}, 'superhero_name': 'Batman'}]}}}
{'data': {'Get': {'Superhero': [{'_additional': {'distance': 1.1920929e-07, 'id': '05fbd0cb-e79c-4ff2-850d-80c861cd1509'}, 'superhero_name': 'Batman'}]}}}
If I look up the exact vector I get 'distance': 1.1920929e-07, which I guess is actually 0 (for some floating point evil magic), as expected.
But if I use near_text to search for the exact property, I get a distance > 0.
This is leading me to believe that, when using near_text, the embedding is somehow different.
My question is:
Why does this happen?
With two corollaries:
Is 1.1920929e-07 actually 0 or do I need to read something deeper into that?
Is there a way to check the embedding created during the near_text search?
Here is some information that may help:
Is 1.1920929e-07 actually 0 or do I need to read something deeper into that?
Yes, this value 1.1920929e-07 should be interpreted as 0. I think there are some unfortunate float32/64 conversions going on that need to be rooted out.
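For what it's worth, 1.1920929e-07 is 2^-23, the machine epsilon of a 32-bit float, which fits the rounding-error explanation. The sketch below checks this numerically; the isEffectivelyZero helper and its 1e-6 tolerance are assumptions for illustration, not anything Weaviate provides.

```javascript
// 2^-23 is the gap between 1 and the next representable 32-bit float.
const float32Epsilon = Math.pow(2, -23);

// Rounding the reported distance to float32 gives exactly that value.
const reported = Math.fround(1.1920929e-07);
console.log(reported === float32Epsilon); // true

// Hypothetical helper: treat distances at or below a small tolerance as zero.
// The default tolerance is an arbitrary choice for this sketch.
function isEffectivelyZero(distance, tolerance = 1e-6) {
  return Math.abs(distance) <= tolerance;
}

console.log(isEffectivelyZero(1.1920929e-07)); // true
console.log(isEffectivelyZero(0.08034378)); // false
```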
Is there a way to check the embedding created during the near_text search?
The embeddings are either imported or generated during object creation, not at search-time. So performing multiple queries on an unchanged object will utilize the same search vector.
We are looking into both of these issues.

Indexing data in my firebase realtime database rules based on the nested value

I have the following JSON tree from my realtime database:
{
  "old_characters" : {
    "Reptile" : {
      "kick" : 20,
      "punch" : 15
    },
    "Scorpion" : {
      "kick" : 15,
      "punch" : 10
    },
    "Sub-zero" : {
      "kick" : 30,
      "punch" : 10
    }
  },
  "new_characters" : {
    // ...etc
  }
}
Is it possible to set rules in my firebase console so that I can index my data based on the character with the highest value of kick?
The constraints are:
- character_name are dynamic.
- Key "kick" is static, but its value is dynamic.
Result should be:
Sub-zero first (kick 30)
Reptile second (kick 20)
Scorpion third (kick 15)
What you want seems to be a fairly simple Firebase query on the kick property:
var ref = firebase.database().ref('old_characters');
var query = ref.orderByChild('kick');
// once() takes the event type as its first argument
query.once('value', function(snapshot) {
  snapshot.forEach(function(characterSnapshot) {
    console.log(characterSnapshot.key);
    console.log(characterSnapshot.child('kick').val());
  });
});
You'll note that this prints the results in ascending order. You can:
either reverse the results client-side
or add an inverted property with -1 * score to each character and then order on that
To learn more about the inverting/sorting descending, have a look at some of these previous questions:
firebase -> date order reverse
Sorting in descending order in Firebase database
sorting numbers with firebase
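Both options for getting a descending order can be sketched on plain objects. This is only an illustration using the data from the question: the ascending array stands in for what snapshot.forEach would yield, and invertedKick is a hypothetical property name.

```javascript
// Data copied from the question.
const oldCharacters = {
  Reptile: { kick: 20, punch: 15 },
  Scorpion: { kick: 15, punch: 10 },
  "Sub-zero": { kick: 30, punch: 10 },
};

// Option 1: collect the children in ascending order of kick, then reverse
// the list on the client.
const ascending = Object.entries(oldCharacters)
  .map(([name, stats]) => ({ name, kick: stats.kick }))
  .sort((a, b) => a.kick - b.kick);
const descending = ascending.slice().reverse();

// Option 2: store an inverted value so that ascending order on it is
// highest-kick-first. "invertedKick" is a made-up property name.
function withInvertedKick(stats) {
  return { ...stats, invertedKick: -1 * stats.kick };
}

console.log(descending.map((c) => c.name)); // Sub-zero, Reptile, Scorpion
```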

Firebase - Querying in flattened data with Two Way Relationship

I currently have a Firebase database with the following structure
// Tracking two-way relationships between users and groups
{
  "users": {
    "mchen": {
      "name": "Mary Chen",
      "groups": {
        "alpha": true,
        "charlie": true
      }
    },
    ...
  },
  "groups": {
    "alpha": {
      "name": "Alpha Group",
      "members": {
        "mchen": true,
        "donald": true
      }
    },
    "bravo": {
      "name": "Bravo Group",
      "members": {
        "mickey": true,
        "donald": true
      }
    },
    ...
  }
}
How do I write a query to show me all the groups that a given set of users have in common? For example, show me all groups where Mickey and Donald are both registered.
I don't think that is possible with a single query, and multiple equalTo are not supported.
I would restructure your database like this:
root
L users
L groups
  L groupName
    L ...
L groupMembers
  L groupName
    L userName:true
Then query like
Query is reference.child('groupMembers').orderByChild('Donald').equalTo(true);
Then go through the results
List resultList
for all DataSnapshot's as snapshot in dataSnapshot's Children
  if snapshot.child('Mickey') is not null
    add to resultList key of snapshot
// resultList now contains all group keys/names which Donald and Mickey are both a part of.
That would be one way to solve your problem, but you would initially be downloading groups which Mickey might not be a part of. This may or may not be what you want (security etc.).
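The client-side pass over the results can be sketched as follows. It runs on a plain object shaped like the groupMembers node, with lowercase user keys as in the question's data; commonGroups is a hypothetical helper, not a Firebase call.

```javascript
// groupMembers node as proposed above, filled with the question's data.
const groupMembers = {
  alpha: { mchen: true, donald: true },
  bravo: { mickey: true, donald: true },
};

// Hypothetical helper: keys of the groups that contain every listed user.
function commonGroups(members, users) {
  return Object.keys(members).filter((group) =>
    users.every((user) => members[group][user] === true)
  );
}

console.log(commonGroups(groupMembers, ["mickey", "donald"])); // only bravo
```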
If you wanted to get only the groups which they are both a part of, without any client-side filtering, you would have to restructure your database to something like this:
root
L users ...
L groups ...
L groupPairs
  L groupName
    L DonaldMickey:true
Your query would look similar, but adding or removing a user from a group would be more elaborate: you would have to make sure that every possible pair is stored under the groupName. You could reduce the number of pairs by fixing an ordering criterion, for example Donald before Mickey because D comes before M.
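A rough sketch of generating those pair keys with a fixed alphabetical order is shown below. pairKey and pairKeys are hypothetical helpers, and the keys are plain concatenations as in the DonaldMickey example; a separator might be safer if user names can collide when joined.

```javascript
// Hypothetical helper: key for one pair, names in alphabetical order, so
// Donald and Mickey always give "DonaldMickey" and never "MickeyDonald".
function pairKey(a, b) {
  return [a, b].sort().join("");
}

// Hypothetical helper: all pair keys for a group's member list.
// A group with n members produces n * (n - 1) / 2 keys.
function pairKeys(members) {
  const sorted = [...members].sort();
  const keys = {};
  for (let i = 0; i < sorted.length; i++) {
    for (let j = i + 1; j < sorted.length; j++) {
      keys[sorted[i] + sorted[j]] = true;
    }
  }
  return keys;
}

console.log(Object.keys(pairKeys(["Mickey", "Donald", "Mary"])));
```

When a user joins or leaves a group, the pairs involving that user would have to be added or removed under the group's node.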

Multiple range keys in couchdb views

I've been searching for a solution for a few hours without success...
I just want to do this request in couchdb with a view:
select * from database where (id >= 3000000 AND id <= 3999999) AND gyro_y >= 1000
I tried this:
function(doc) {
  if (doc.id && doc.Gyro_y) {
    emit([doc.id, doc.Gyro_y], null);
  }
}
Here is my document (record in couchdb):
{
  "_id": "f97968bee9674259c75b89658b09f93c",
  "_rev": "3-4e2cce33e562ae502d6416e0796fcad1",
  "id": "30000002",
  "DateHeure": "2016-06-16T02:08:00Z",
  "Latitude": 1000,
  "Longitude": 1000,
  "Gyro_x": -242,
  "Gyro_y": 183,
  "Gyro_z": -156,
  "Accel_x": -404,
  "Accel_y": -2424,
  "Accel_z": -14588
}
I then do an HTTP request like so:
http://localhost:5984/arduino/_design/filter/_view/bygyroy?startkey=["3000000",1000]&endkey=["3999999",9999999]&include_docs=true
I get this as an answer:
{
  total_rows: 10,
  offset: 8,
  rows: [{
    id: "f97968bee9674259c75b89658b09f93c",
    key: [
      "01000002",
      183
    ],
    value: null,
    doc: {
      _id: "f97968bee9674259c75b89658b09f93c",
      _rev: "3-4e2cce33e562ae502d6416e0796fcad1",
      id: "30000002",
      DateHeure: "2016-06-16T02:08:00Z",
      Latitude: 1000,
      Longitude: 1000,
      Gyro_x: -242,
      Gyro_y: 183,
      Gyro_z: -156,
      Accel_x: -404,
      Accel_y: -2424,
      Accel_z: -14588
    }
  }]
}
So it's working for the id but it's not working for the second key gyro_y.
Thanks for your help.
When you specify arrays as your start/end keys, the results are filtered in a "cascade". In other words, it moves from left to right, and only if something was matched by the previous key, will it be matched by the next key.
In this case, you'll only find Gyro_y >= 1000 when that document also matches the first condition of 3000000 <= id <= 3999999.
Your SQL example does not translate exactly to what you are doing in CouchDB. In SQL, both conditions are evaluated independently and only the rows that satisfy both are returned. I would read up on view collation to understand these inner workings of CouchDB.
To solve your problem right now, I would simply switch the order you are emitting your keys. By putting the Gyro_y value first, you should get the results you've described.
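The "cascade" can be illustrated with a simplified comparison of array keys. This is a sketch, not CouchDB's actual collation code: compareKeys and inRange are hypothetical helpers that assume keys of equal length with matching types per position and use plain JavaScript ordering.

```javascript
// Simplified comparison of array keys: element by element, left to right.
// A later element is only consulted when all earlier ones are equal.
function compareKeys(a, b) {
  for (let i = 0; i < a.length; i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return 0;
}

// Hypothetical helper: is key between startkey and endkey (inclusive)?
function inRange(key, startkey, endkey) {
  return compareKeys(key, startkey) >= 0 && compareKeys(key, endkey) <= 0;
}

// Key order [id, Gyro_y]: the id alone decides, so Gyro_y = 183 still matches.
console.log(inRange(["30000002", 183], ["3000000", 1000], ["3999999", 9999999])); // true

// Key order [Gyro_y, id]: Gyro_y is compared first, so 183 is excluded.
console.log(inRange([183, "30000002"], [1000, "3000000"], [9999999, "3999999"])); // false
```

Under the same comparison, a key such as [2000, "5000000"] would also fall inside the second range, so the id condition may still need a check on the client.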

Why does DynamoDB simple deleteItem operation use 2 CapacityUnits?

I have a simple delete operation which goes like this:
{
  "TableName": "demo_events",
  "Key": {
    "category": {"S": "Demo"},
    "DynamoID": {"S": "164933868Slt1396454204"}
  },
  "Expected": {
    "category": {
      "Exists": true,
      "Value": {"S": "Demo"}
    }
  },
  "ReturnConsumedCapacity": "TOTAL",
  "ReturnItemCollectionMetrics": "SIZE"
}
There is only a single item in database with that ID. The response is this:
{
  ConsumedCapacity: {
    CapacityUnits: 2,
    TableName: 'demo_events'
  },
  ItemCollectionMetrics: {
    ItemCollectionKey: {
      category: { S: 'Demo' }
    },
    SizeEstimateRangeGB: [ 0, 1 ]
  }
}
Shouldn't this only consume 1 write unit?
Many thanks.
For PutItem, UpdateItem, and DeleteItem, which write only one item, DynamoDB rounds the item size up to the next 1 KB. If you have other attributes in the item in addition to the key attributes, they all together could add up to more than 1 KB.
If there is a Local Secondary Index (LSI) on the table, DeleteItem also deletes the corresponding item from the LSI, and that item's size contributes to the total write capacity units consumed. The DeleteItem response returns ItemCollectionMetrics only when an LSI is defined for the table, so based on the sample response there seems to be an LSI defined for this table.
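Both explanations can be put into a back-of-the-envelope calculation. The function below is a deliberately simplified model, not the exact billing rule: it assumes one write unit per started KB on the table and the same again for each LSI that holds a copy of the item, and estimateWriteUnits is a hypothetical name.

```javascript
// Simplified model (an assumption): one write capacity unit per started KB
// for the table, plus the same again for each affected local secondary index.
function estimateWriteUnits(itemSizeBytes, lsiCount) {
  const unitsPerWrite = Math.max(1, Math.ceil(itemSizeBytes / 1024));
  return unitsPerWrite * (1 + lsiCount);
}

console.log(estimateWriteUnits(300, 0)); // 1: small item, no LSI
console.log(estimateWriteUnits(300, 1)); // 2: small item, one LSI
console.log(estimateWriteUnits(1500, 0)); // 2: item larger than 1 KB, no LSI
```

Either a single LSI or an item larger than 1 KB would account for the 2 units reported in the question.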
regards
