Ideally, you keep all your data until you can't. When data does need to be deleted, it doesn't necessarily have to be deleted from every stream; there may be streams whose data we want to keep. The current approach does not let the user choose which streams to purge telemetry data from, but rather purges data from all the streams.
The solution I came up with adds four new functions to the existing purge script, which now lets the user select the streams to purge data from.
Steps -
First, run the purge script:
python purge.py
This will show you three menu options; the last one is option 3 -- Purge selected streams.
Upon selecting the third option, a list of streams is displayed and the script prompts you to select the stream(s) to purge. Enter a comma-separated list of stream names. If any stream name is incorrect, you are prompted to try one more time.
Next, enter the number of days older than today to purge data, and confirm with y/n. If you enter y, data is purged from all the streams whose IDs correspond to the stream names you entered, and a list of all the streams the data was purged from is printed. If you enter n, you are taken back to the main menu.
To explain the code a little:
The first function, get_streams, fetches all the stream names and their corresponding IDs from the stream table and stores them as key-value pairs in a dictionary.
The second function, list_streams, calls get_streams to build that dictionary and the existing get_stream_tables function to get all the stream tables corresponding to each ID in the stream table. It prints a list of streams (say, socomec 0, generator 11, etc.) for the user to choose from.
The third function, stream_input, takes a comma-separated input from the user and checks whether each stream name entered actually exists. If the input is incorrect, it prompts the user to try again (one time only). If the input is correct, it takes the matching ID(s), prepends 'stream' to each, and uses a lambda function to filter the streams corresponding to those IDs into a list. It then prompts the user for the number of days and a confirmation.
The fourth function, purge_stream, is a slight modification of the original purge function. Its loop variable is the list of streams produced by the lambda function above, which ensures that data is purged only from the selected streams. A rough sketch of the four functions follows.
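Here is a minimal sketch of the four functions. It assumes a sqlite3 database with a stream table holding (name, id) columns and per-stream telemetry tables with a timestamp column; the file, table, and column names are stand-ins for illustration, not the real schema.

import sqlite3

conn = sqlite3.connect("telemetry.db")  # hypothetical database file
cur = conn.cursor()

def get_streams(cur):
    """Fetch all stream names and IDs from the stream table as {name: id}."""
    cur.execute("SELECT name, id FROM stream")
    return {name: stream_id for name, stream_id in cur.fetchall()}

def list_streams(cur):
    """Print the available streams (e.g. 'socomec 0', 'generator 11')."""
    streams = get_streams(cur)
    for name, stream_id in streams.items():
        print(f"{name} {stream_id}")
    return streams

def stream_input(streams, all_tables):
    """Read a comma-separated list of stream names, allowing one retry."""
    for attempt in range(2):  # one retry only, as in the script
        names = [n.strip() for n in input("Streams to purge: ").split(",")]
        if all(n in streams for n in names):
            # Prepend 'stream' to each ID and filter the matching tables.
            ids = ["stream" + str(streams[n]) for n in names]
            return list(filter(lambda t: any(t.startswith(i) for i in ids),
                               all_tables))
        print("Stream name(s) not found, please try again.")
    return []

def purge_stream(cur, selected_tables, days):
    """Like the original purge, but loops only over the selected tables."""
    for table in selected_tables:
        # 'timestamp' is a guessed column name for the telemetry time.
        cur.execute(f"DELETE FROM {table} WHERE timestamp < datetime('now', ?)",
                    (f"-{days} days",))
        print(f"Purged {table}")
    conn.commit()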
Related
I have a button which runs the following code when it's clicked:
let dataReference = await db.collection("dog").doc("1").get()
let HashMap = dataReference.data().Annotations
console.log(HashMap)
My Firestore database looks like this (screenshot of the dog/1 document not reproduced here).
Whenever this function runs, it returns the proper dictionary; however, the ordering of the keys seems to change randomly. My console logs from pressing the button a bunch of times (screenshot not reproduced here) show a different order each time.
Why does the ordering of the key-value pairs change, and is there a way to fix it?
The Firestore SDK does not guarantee any iteration order for document fields. What you see in the console is ordered by the console's own rendering code, typically lexically by key. If you require a stable ordering, you should sort the keys yourself before iterating.
One workaround is to store the key order you want in an array. Because arrays are ordered, the array preserves the order you desire; you then take each key from it and use it to index into the dictionary. The dictionary itself remains unordered, but you access its values in order by key.
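As a sketch of both approaches, written here with the Python admin SDK rather than the web SDK in the question (the same idea carries over); the AnnotationOrder field is hypothetical:

import firebase_admin
from firebase_admin import firestore

firebase_admin.initialize_app()  # uses GOOGLE_APPLICATION_CREDENTIALS
db = firestore.client()

snap = db.collection("dog").document("1").get()
annotations = snap.to_dict()["Annotations"]

# Approach 1: sort the keys yourself before iterating.
for key in sorted(annotations):
    print(key, annotations[key])

# Approach 2: keep the desired order in an array field and index the map
# with it. "AnnotationOrder" is a hypothetical field you would add.
for key in snap.to_dict().get("AnnotationOrder", []):
    print(key, annotations[key])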
In my app I use Firebase's childByAutoId() (swift) or .push() (web) to insert some data in the following format:
- events
- $autoId
- time:
- name:
- $autoId
- time:
- name:
Where $autoId are the randomly generated keys Firebase makes. time is the epoch time of when the data was pushed.
I want to allow users to modify each inserted entry's time. However, I want to keep the nodes under events sorted by their key and by time which Firebase naturally does when you use .push(). But if they modify the time so that it should actually be in a different order, the entries won't be sorted correctly.
Is there a way to generate an id by the modified time so that if it were inserted into events it would be in the right order? That way I could just delete the old entry and insert the new one while just duplicating the data.
Since the algorithm for Firebase's push IDs is well documented, you could easily modify the function to generate them based on a specific timestamp.
But I'd recommend instead keeping the necessary values as named properties for each child node. If you need to be able to sort by both creation and modification time, keep two separate properties. That way you won't have to depend on the behavior of the push IDs, but instead use more explicitly named properties to accomplish what you need.
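As a sketch of the named-properties approach, here with the Python admin SDK rather than the Swift/web SDKs from the question; the databaseURL and property names are placeholders:

import time
import firebase_admin
from firebase_admin import db

firebase_admin.initialize_app(
    options={"databaseURL": "https://example-db.firebaseio.com"})  # placeholder

events = db.reference("events")

# push() still generates a chronologically ordered key...
new_ref = events.push({"name": "checkup", "createdAt": int(time.time())})

# ...but when the user edits the time, update a property, not the key.
new_ref.update({"modifiedAt": int(time.time())})

# Sorting no longer depends on key order:
by_modified = events.order_by_child("modifiedAt").get()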
I'm trying to push new data into Firebase, but it keeps generating a unique ID and an extra child node for me. How can I stop it from generating the ID and the extra node, or is there another way to push data into Firebase? My dataset is given and I just need to dump it into the db. (But I'm slowly updating this given dataset.)
Firebase always generates a unique id when you use the push method. You can use it as a key or as a parent node. This post describes push in more detail:
https://firebase.googleblog.com/2015/02/the-2120-ways-to-ensure-unique_68.html
You should use set to publish new data to Firebase if push doesn't fit your use case. There is also an update method, since set will overwrite any data at the given path.
https://firebase.google.com/docs/database/web/save-data
I generally use set for new data, for changes to how my data is structured, or for appending new data at a path; update to make changes to fields of an existing dataset; and push for sets of data that I want organized in a time-ordered list (e.g. a chat message log).
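As a sketch of all three, using the Python admin SDK (the databaseURL, paths, and payloads are invented for illustration):

import firebase_admin
from firebase_admin import db

firebase_admin.initialize_app(
    options={"databaseURL": "https://example-db.firebaseio.com"})  # placeholder

users = db.reference("users")

# set: write (or overwrite) data at a known path -- no generated ID.
users.child("alice").set({"age": 30, "city": "Oslo"})

# update: change specific fields without overwriting the whole node.
users.child("alice").update({"city": "Bergen"})

# push: append under a generated, chronologically ordered unique ID.
db.reference("chat").push({"from": "alice", "text": "hi"})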
Please explain the above question with an example scenario; I am confused about which is best.
If you want to fetch a specific object from a list based on a keyword or some identifier, you have to iterate over the list, get each object, and compare its values.
In a map you can directly create key-value pairs: you pass the key and you get the value.
For example:
A User object exists that has several properties, one of them being a user code.
Now, if you have a list of User objects, you fetch each User object one by one and compare its code. But in a map you can directly store the User object with the user code as the key; pass the key and you get the desired object:
map.get("key");
But if your requirement is not key-based access, it is better to use a list, for example when you just have to display a list of items or perform sublisting.
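As a sketch of the difference, in Python (the User class, codes, and names are made up):

from dataclasses import dataclass

@dataclass
class User:
    code: str
    name: str

users_list = [User("u1", "Asha"), User("u2", "Bram")]
users_map = {u.code: u for u in users_list}

# List: iterate and compare each object's code.
found = next((u for u in users_list if u.code == "u2"), None)

# Map: pass the key, get the value -- the map.get("key") from above.
found = users_map.get("u2")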
The question is too broad, but I will try to keep it short:
When you have to get a value based on a key (and the key can be anything), go for a hashmap. Consider a telephone directory, where you go to the appropriate letter and search for a person's name to find their number.
If, instead, you have similar objects and want to store them and later retrieve them, say by index, or traverse them one by one, go for a list. So if your task is to find the employees older than 50 years, you can just return a list of the employees who qualify.
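The same two scenarios in Python (names and numbers invented):

employees = [{"name": "Ravi", "age": 54}, {"name": "Mira", "age": 41}]

# Key-based question -> map: what is this person's number?
directory = {"Ravi": "555-0101", "Mira": "555-0102"}
number = directory.get("Ravi")

# Traversal question -> list: which employees are older than 50?
older = [e for e in employees if e["age"] > 50]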
From the Transactions doc, second paragraph:
The intention here is for the client to increment the total number of chat messages sent (ignore for a moment that there are better ways of implementing this).
What are some standard "better ways" of implementing this?
Specifically, I'm looking at trying to do things like retrieve the most recent 50 records. This requires that I start from the end of the list, so I need a way to determine what the last record is.
The options as I see them:
use a transaction to update a counter each time a record is added, use the counter value with setPriority() for ordering
forEach() the parent and read all records, do my own sorting/filtering at client
write server code to analyze Firebase tables and create indexed lists like "mostRecent Messages" and "totalNumberOfMessages"
Am I missing obvious choices?
To view the last 50 records in a list, simply call "limit()" as shown:
var data = new Firebase(...); // reference to the list location (legacy web SDK)
data.limit(50).on(...);       // subscribe to the last 50 children
Firebase elements are ordered first by priority, and if priorities match (or none is set), lexicographically by name. The push() command automatically creates elements that are ordered chronologically, so if you're using push(), no additional work is needed to use limit().
To count the elements in a list, I would suggest adding a "value" callback and then iterating through the snapshot (or doing the transaction approach we mention). The note in the documentation actually refers to some upcoming features we haven't released yet which will allow you to count elements without loading them first.
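To sketch the counting approach with the Python admin SDK (the databaseURL and the messages path are placeholders; note this loads the data first, which is exactly the cost mentioned above):

import firebase_admin
from firebase_admin import db

firebase_admin.initialize_app(
    options={"databaseURL": "https://example-db.firebaseio.com"})  # placeholder

messages = db.reference("messages").get() or {}
print(len(messages))  # number of child records

# The last 50 records, mirroring data.limit(50) above:
last_50 = db.reference("messages").order_by_key().limit_to_last(50).get()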