Firebase Database - Add Index on "dynamic" child - firebase

I'm using Firebase Database to store the scores of a game. Everything was working fine until I decided to implement a "weekly score".
In order to be able to filter by score and then order by weekly, I'm storing the data in the following structure:
game_scores-weekly
  2018-01-29
    user_id: { score, date, bla, bla bla }
    user_id: { score, date, bla, bla bla }
    user_id: { score, date, bla, bla bla }
  2018-02-05
    user_id: { score, date, bla, bla bla }
    user_id: { score, date, bla, bla bla }
So, this works just fine, but every new week I get that annoying warning about performance issues due to not having an index on "game_scores-weekly/new_week" (indexOn "score"). Manually adding the index works... until the next week, so that's not an option.
"game_scores-weekly": {
  "2018-02-19": {
    ".indexOn": ["score", "uid"]
  },
  "2018-02-12": {
    ".indexOn": ["score", "uid"]
  }
}
Is there any way to specify a wildcard in the date, so it works for any new date? Or can I programmatically create the new index every week? Or is there some other solution I might not have thought of?
I also thought of manually creating a list of all the weeks of the year and adding the indexes in one go, but there is likely a limit to that?
Last, but not least, I'm only interested in the current week's and last week's scores. Anything older I'd like to keep as historical data, but I don't query it in the game, so I could potentially get rid of the indexes on older weeks.
Cheers!

Thanks to @Tristan for pointing me in the right direction.
I used the following code to define the index and now warning is gone:
"game_scores-weekly": {
  "$date": {
    ".indexOn": ["score", "uid"]
  }
}
It seems super obvious now, but I couldn't find anything clear in the documentation.
Note that $date could really be any name; it seems you can specify a wildcard value using any $identifier.
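For reference, a complete database.rules.json using the wildcard might look roughly like this (a sketch; any read/write security rules you already have are omitted):

```json
{
  "rules": {
    "game_scores-weekly": {
      "$date": {
        ".indexOn": ["score", "uid"]
      }
    }
  }
}
```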

Related

Finding JSONPath value by a partial key

I have the following JSON:
{
  "Dialog_1": {
    "en": {
      "label_1595938607000": "Label1",
      "newLabel": "Label2"
    }
  }
}
I want to extract "Label1" by using JSONPath. The problem is that each time I get a JSON with a different number after "label_", and I'm looking for a consistent JSONPath expression that will return the value for any key that begins with "label_" (without knowing in advance the number after the underscore).
It is not possible with JSONPath; the EL (Expression Language) does not have such a capability.
Besides, I think you need to review your design. Why does the key name change all the time? If it changes, then it is data, and data belongs in a value, not in a key name.
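Since JSONPath alone can't match on a partial key, one workaround is to do the wildcard match in ordinary code after parsing. A minimal Python sketch (the structure mirrors the JSON above):

```python
import json

# The document from the question, parsed with the standard library.
doc = json.loads("""
{
  "Dialog_1": {
    "en": {
      "label_1595938607000": "Label1",
      "newLabel": "Label2"
    }
  }
}
""")

# Collect the value of every key that begins with "label_",
# since the numeric suffix changes with every payload.
labels = [v for k, v in doc["Dialog_1"]["en"].items()
          if k.startswith("label_")]
print(labels)  # ['Label1']
```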

How to Convert Chrome Browser History Sqlite Timestamps with Osquery

As I understand it, the Chrome browser uses the WebKit time format for timestamps within the browser history database. WebKit time is expressed as microseconds since January 1, 1601 (UTC).
I've found numerous articles that seemingly have the answer to my question, but none have worked so far. The common answer is to use the formula below to convert from WebKit time to a human-readable local time:
SELECT datetime((time/1000000)-11644473600, 'unixepoch', 'localtime') AS time FROM table;
Sources:
https://linuxsleuthing.blogspot.com/2011/06/decoding-google-chrome-timestamps-in.html
What is the format of Chrome's timestamps?
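The formula can be sanity-checked in isolation with a short Python sketch (the sample value is a real 17-digit WebKit timestamp; integer division matters because 17-digit microsecond values exceed float precision):

```python
from datetime import datetime, timezone

# Seconds between the WebKit epoch (1601-01-01) and the Unix epoch (1970-01-01).
WEBKIT_EPOCH_OFFSET = 11644473600

def webkit_to_utc(webkit_us: int) -> datetime:
    # Integer division first: 17-digit microsecond values do not fit in a float.
    return datetime.fromtimestamp(webkit_us // 1_000_000 - WEBKIT_EPOCH_OFFSET,
                                  tz=timezone.utc)

print(webkit_to_utc(13231352154237916).strftime("%Y-%m-%d %H:%M:%S"))
# 2020-04-14 15:35:54
```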
I'm trying to convert the timestamps while gathering the data through Osquery, using the configuration below.
"chrome_browser_history": {
  "query": "SELECT urls.id id, urls.url url, urls.title title, urls.visit_count visit_count, urls.typed_count typed_count, urls.last_visit_time last_visit_time, urls.hidden hidden, visits.visit_time visit_time, visits.from_visit from_visit, visits.visit_duration visit_duration, visits.transition transition, visit_source.source source FROM urls JOIN visits ON urls.id = visits.url LEFT JOIN visit_source ON visits.id = visit_source.id",
  "path": "/Users/%/Library/Application Support/Google/Chrome/%/History",
  "columns": ["path", "id", "url", "title", "visit_count", "typed_count", "last_visit_time", "hidden", "visit_time", "visit_duration", "source"],
  "platform": "darwin"
}
"schedule": {
  "chrome_history": {
    "query": "select distinct url,datetime((last_visit_time/1000000)-11644473600, 'unixepoch', 'localtime') AS time from chrome_browser_history where url like '%nhl.com%';",
    "interval": 10
  }
}
The resulting events have timestamps from the year 1600:
"time":"1600-12-31 18:46:16"
If I change the config to pull the raw timestamp with no conversion, I get stamps such as the following:
"last_visit_time":"1793021894"
From what I've read about WebKit time, it is expressed in 17-digit numbers, which clearly is not what I'm seeing. So I'm not sure if this is an Osquery, Chrome, or query issue at this point. All help and insight appreciated!
Solved. The datetime conversion needs to take place within the table definition query, i.e. the query defined underneath "chrome_browser_history".
"chrome_browser_history": {
  "query": "SELECT urls.id id, urls.url url, urls.title title, urls.visit_count visit_count, urls.typed_count typed_count, datetime(urls.last_visit_time/1000000-11644473600, 'unixepoch') last_visit_time, urls.hidden hidden, visits.visit_time visit_time, visits.from_visit from_visit, visits.visit_duration visit_duration, visits.transition transition, visit_source.source source FROM urls JOIN visits ON urls.id = visits.url LEFT JOIN visit_source ON visits.id = visit_source.id",
  "path": "/Users/%/Library/Application Support/Google/Chrome/%/History",
  "columns": ["path", "id", "url", "title", "visit_count", "typed_count", "last_visit_time", "hidden", "visit_time", "visit_duration", "source"],
  "platform": "darwin"
}
"schedule": {
  "chrome_history": {
    "query": "select distinct url,last_visit_time from chrome_browser_history where url like '%nhl.com%';",
    "interval": 10
  }
}
Trying to make the conversion within the osquery scheduled query (as I was trying before) will not work, i.e.:
"schedule": {
  "chrome_history": {
    "query": "select distinct url,datetime((last_visit_time/1000000)-11644473600, 'unixepoch', 'localtime') AS time from chrome_browser_history where url like '%nhl.com%';",
    "interval": 10
  }
}
Try:
SELECT datetime(last_visit_time/1000000-11644473600, \"unixepoch\") as last_visited, url, title, visit_count FROM urls;
This is from something I wrote up a while ago - One-liner that runs osqueryi with ATC configuration to read in the chrome history file, export as json and curl the json to an API endpoint
https://gist.github.com/defensivedepth/6b79581a9739fa316b6f6d9f97baab1f
What you're working with is pretty straightforward SQLite, so I would start by debugging inside sqlite3 itself.
First, verify the data is what you expect. On my machine, I see:
$ cp Library/Application\ Support/Google/Chrome/Profile\ 1/History /tmp/
$ sqlite3 /tmp/History "select last_visit_time from urls limit 2"
13231352154237916
13231352154237916
Second, I would verify the underlying math:
sqlite> select datetime(last_visit_time/1000000-11644473600, "unixepoch") from urls limit 2;
2020-04-14 15:35:54
2020-04-14 15:35:54
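The same arithmetic can also be reproduced without a Chrome History file at all, using Python's built-in sqlite3 module against an in-memory database (a sketch for verifying the SQL expression itself):

```python
import sqlite3

# An in-memory database is enough to exercise the datetime() expression.
con = sqlite3.connect(":memory:")

webkit_ts = 13231352154237916  # sample last_visit_time value from above
row = con.execute(
    "SELECT datetime(? / 1000000 - 11644473600, 'unixepoch')",
    (webkit_ts,),
).fetchone()
print(row[0])  # 2020-04-14 15:35:54
```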
It would be easier to test your config snippet if you included it as text we can copy/paste.

JQ: Nested JSON Array transformation

Some months ago I had a little problem with a jq transformation (jq 1.5 on Windows 10). Since then, the following command has worked excellently:
"[{nid, title, nights, company: .operator.shortTitle, zone: .zones[0].title}
+ (.sails[] | { sails_nid: .nid, arrival, departure } )
+ (.sails[].cabins[] | { cabinname: .cabinType.title, cabintype: .cabinType.kindName, cabinnid: .cabinType.nid, catalogPrice, discountPrice, discountPercentage, currency } )]"
A few days ago the API started delivering "bigger" JSON files. With this jq command I get a lot of duplicates (with the attached file I get around 3146 objects; the expected count is around 250). I tried to change the jq command to avoid the duplicates, but had no luck.
The JSON file contains a variable number of sails (10 in this case), while each sail has a variable number of cabins (25 in this case). Any tips on how I can handle that? Regards, Timo
This is probably what you're looking for:
[{nid, title, nights, company: .operator.shortTitle, zone: .zones[0].title}
+ (.sails[] | ({ sails_nid: .nid, arrival, departure } +
(.cabins[] | { cabinname: .cabinType.title,
cabintype: .cabinType.kindName,
cabinnid: .cabinType.nid,
catalogPrice,
discountPrice,
discountPercentage,
currency } ))) ]
Hopefully the layout will clarify the difference with your jq filter.
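The duplicates in the original filter come from iterating .sails[] and .sails[].cabins[] independently, which yields a cross product over all sails and all cabins; the nested version iterates each sail's own cabins. A Python analogue with hypothetical miniature data (2 sails, 2 cabins each) makes the difference in counts visible:

```python
# Hypothetical miniature of the cruise JSON: 2 sails, each with 2 cabins.
doc = {
    "nid": 1,
    "sails": [
        {"nid": "s1", "cabins": [{"cabinType": {"title": "A"}},
                                 {"cabinType": {"title": "B"}}]},
        {"nid": "s2", "cabins": [{"cabinType": {"title": "C"}},
                                 {"cabinType": {"title": "D"}}]},
    ],
}

# Original filter: sails and cabins iterated independently -> cross product.
all_cabins = [cab for sail in doc["sails"] for cab in sail["cabins"]]
wrong = [(s["nid"], c["cabinType"]["title"])
         for s in doc["sails"] for c in all_cabins]

# Corrected filter: cabins iterated inside their own sail.
right = [(s["nid"], c["cabinType"]["title"])
         for s in doc["sails"] for c in s["cabins"]]

print(len(wrong), len(right))  # 8 4
```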

Firebase: Best practice to build a data structure that can filter data by multiple criteria quickly

For example, I have a list of 1,000,000 users; the data looks like this:
users: {
  $userId: {
    name: "",
    sex: "",
    age: "",
    city: "",
    maritalStatus: "",
    // can be more
  }
}
I want to filter, paginate the data for: users who are single, male, with age < 30, living in city X.
Is there a good practice to make this kind of queries less painful?
Firebase doesn't have a direct way to query on more than one child at a time.
You can structure your data to make it easier - for example
users
  $userId
    gender_age: male_27
  $userId
    gender_age: male_32
Then, to query for males between 30 and 40:
gender_age....queryStartingAtValue("male_30").endingAtValue("male_40")
That will narrow down the results; you could then filter in code for the ones you want, for example (conceptual):
if snapshot.child("maritalStatus") == "Single" and
   snapshot.child("city") == "AnyTown" then
  // add person to list for display
You could expand this out a bit to narrow the results further:
users
  $userId
    city_gender_age: anytown_male_27
city_gender_age....queryStartingAtValue("anytown_male_30").endingAtValue("anytown_male_40")
Unfortunately the pattern breaks down if the query is less specific; e.g. if we are querying for either male or female in anytown between 30 and 40, this won't work.
However, disk space is cheap, so storing 'duplicate' data in another node would resolve that:
another_node
  $user_id
    city_age: anytown_27
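The composite-key trick can be sketched in plain Python (an in-memory stand-in for the Firebase query; the user IDs are illustrative):

```python
# Hypothetical users keyed the way the answer suggests.
users = {
    "u1": {"city_gender_age": "anytown_male_27"},
    "u2": {"city_gender_age": "anytown_male_32"},
    "u3": {"city_gender_age": "anytown_male_45"},
}

# Equivalent of queryStartingAtValue/endingAtValue: a lexicographic range scan.
start, end = "anytown_male_30", "anytown_male_40"
matches = [uid for uid, u in users.items()
           if start <= u["city_gender_age"] <= end]
print(matches)  # ['u2']
```

Because the comparison is lexicographic, two-digit ages compare correctly against each other, but a value like "anytown_male_9" would sort after "anytown_male_40"; zero-padding ages to a fixed width avoids that pitfall.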

Couchbase Reduce function

I am trying to learn how to use map/reduce functions with Couchbase. Until now I have built report engines based on SQL, using WHERE clauses with multiple terms (adding and subtracting terms) and modifying the GROUP BY part.
I am trying to recreate this report engine using views.
My problem is how to create a report that lets users drill down and find more and more detail, all the way to individual IP stats.
For example: how many clicks were there today? From which traffic source? What did they see? From which country? Etc.
My basic doc for this example look like this:
"1"
{
  "date": "2014-01-13 10:00:00",
  "ip": "111.222.333.444",
  "country": "US",
  "source": "1"
}
"2"
{
  "date": "2014-01-13 10:00:00",
  "ip": "555.222.333.444",
  "country": "US",
  "source": "1"
}
"3"
{
  "date": "2014-01-13 11:00:00",
  "ip": "111.888.888.888",
  "country": "US",
  "source": "2"
}
"4"
{
  "date": "2014-01-13 11:00:00",
  "ip": "111.777.777.777",
  "country": "US",
  "source": "1"
}
So on the first screen I want to show the user how many clicks per day there are on this site, which means counting the clicks. A simple map/reduce:
Map:
function (doc, meta) {
  emit(dateToArray(doc.date), 1);
}
Reduce:
_count
With group_level=4 (group=true), this will produce the number of clicks per hour.
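How group_level interacts with the _count reduce can be illustrated with a small Python stand-in: each emitted array key is truncated to its first N components, then the values are summed (the keys below correspond to the four sample documents above):

```python
from collections import Counter

# Keys as emitted by dateToArray(doc.date) for the four sample documents,
# each paired with the emitted value 1.
emitted = [
    ((2014, 1, 13, 10, 0, 0), 1),
    ((2014, 1, 13, 10, 0, 0), 1),
    ((2014, 1, 13, 11, 0, 0), 1),
    ((2014, 1, 13, 11, 0, 0), 1),
]

# group_level=4 truncates each key to [year, month, day, hour] before _count.
per_hour = Counter()
for key, value in emitted:
    per_hour[key[:4]] += value

print(dict(per_hour))  # {(2014, 1, 13, 10): 2, (2014, 1, 13, 11): 2}
```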
Now, if I want to allow a breakdown by country, I need a dynamic parameter to change; from what I understand, that can only be the group level. So assume I have added the source to the emit, like this:
emit(dateToArray(doc.date).concat([doc.source]), 1);
Then group level 5 will allow this breakdown, and the key can be used to focus on a certain date. But what if I need to add a country breakdown? Add it to the emit again?
This seems like a mess. Also, what if I later want country stats before the source? Is there any smarter way to do this?
Second part...
What if I want the first count to come out as follows:
[2014,1,28,10] {"ip": ["555.222.333.444", "111.222.333.444"], "count": 2}
That is, I want to see all the IPs that were counted for this time slot. How should I write my reduce function?
This is my current state, which doesn't work:
function (key, values, rereduce) {
  var result = {id: 0, count: 0};
  for (var i = 0; i < values.length; i++) {
    if (rereduce) {
      result.id = result.id + values[i].ip + ',';
      result.count = result.count + values[i].count;
    } else {
      result.id = values[i].ip;
      result.count = values.length;
    }
  }
  return result;
}
I didn't get the answer format I was looking for.
I hope this isn't too messy and that you can help me with it.
Thanks!!
For the first part of your question, I think you are on the right track. That is how you break down views to enable coarse drill-down. However, it is important to remember that views are not intended to store your entire documents, nor will they necessarily give you a cleanly cut swath of data. You will probably need to do fine-filtering within the access layer of your code (using LINQ, perhaps).
For the second part of your question, a reduce is not the appropriate mechanism to accomplish this. Reduce values have a very finite (and limited) size and will crash the map/reduce engine once they get too big. I suspect you have experimented with that and discovered this for yourself.
The way you worded the question, it seems like you wish to search for all IP addresses that have been counted "X" number of times. This cannot be accomplished directly in Couchbase's map/reduce architecture; however, if you simply want the count for a given IP address, that is something the map/reduce framework has built-in (just use Date + IP as a key).