I want to read the data that is already committed to a BigQuery table, not the rows still sitting in the streaming buffer.
How can I get only the committed data?
And how can I clear the table without touching the buffer?
For a project I need to apply some functions to data that is added to a Google Sheets or Google BigQuery table, using Pub/Sub.
I want to pass the newly added table rows to listeners that are subscribed to a Pub/Sub topic. Essentially, the table contains links to images on external websites, and I want to automatically download those images, store them in our Google Cloud Storage bucket, and add a link to the new location of each image back to the original table. This is supposed to happen immediately after the data is received.
I cannot figure out how to publish a message that contains the new data to my Pub/Sub topic once data is appended to my tables.
Does anyone know if what I am trying to achieve is even possible?
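For the publishing step itself, here is a minimal sketch of what the message payload could look like (the row fields, helper name, project, and topic below are my own assumptions, not from the thread); the encoded bytes would then be handed to the google-cloud-pubsub client's publish call:

```python
import json

def build_row_message(row: dict) -> bytes:
    """Serialize a newly appended table row into a Pub/Sub message body.

    Pub/Sub message data must be bytes; JSON keeps the row self-describing
    so subscribers can pick out the image-URL field.
    """
    return json.dumps(row).encode("utf-8")

# Example row as it might arrive from the table (field names are made up).
row = {"id": 42, "image_url": "https://example.com/cat.png"}
payload = build_row_message(row)

# With the google-cloud-pubsub library this would then be published as:
#   publisher = pubsub_v1.PublisherClient()
#   topic = publisher.topic_path("my-project", "new-rows")
#   publisher.publish(topic, payload)
print(payload.decode("utf-8"))
```

This only covers the message format; detecting the append event that should trigger the publish is the part the question is really asking about.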
I have scanned a BigQuery table from the Google DLP Console. The scan results are saved back into a BigQuery table. DLP has identified sensitive information, but the row index (location.content_locations.record_location.table_location.row_index) is shown as null. Can anyone help me understand why?
We no longer populate row_index for BigQuery, as it's not meaningful: BigQuery tables are unordered. If you want to identify the row a finding came from, I suggest using identifyingFields, which lives in BigQueryOptions, when you create your job.
https://cloud.google.com/dlp/docs/creating-job-triggers#job-identifying-fields
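As a minimal sketch of where identifyingFields sits in the inspect-job request (the project, dataset, table, and column names below are placeholders, not from the thread); the same dict structure can be passed as the storage config when creating a DLP job:

```python
# Sketch of a DLP inspect-job storage config with identifying_fields set,
# so each finding carries the values of these columns instead of a row index.
# Project, dataset, table, and column names below are placeholders.
storage_config = {
    "big_query_options": {
        "table_reference": {
            "project_id": "my-project",
            "dataset_id": "my_dataset",
            "table_id": "my_table",
        },
        # Columns whose values are echoed back with each finding, letting
        # you locate the originating row in the unordered table.
        "identifying_fields": [{"name": "customer_id"}],
    }
}

inspect_job = {"storage_config": storage_config}
print(inspect_job["storage_config"]["big_query_options"]["identifying_fields"])
```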
I need to access raw event data stored in Firebase, so I linked Firebase to BigQuery last month. BigQuery now creates daily tables containing the event data, and a month of them has accumulated. However, as the BigQuery documentation states, it is not possible to import data from before the link was created. Does anyone know how that earlier data can be exported?
The exact dataset from prior to linking cannot be exported in any way.
The only workaround is that, if you want to look up specific information, you can try using the GA4 Data API to fetch it. Again, this will not give you the entire dataset export.
I use Firebase Analytics and export the data to BigQuery. Now I'd like to filter the data and sort it by timestamp.
I wrote this query:
SELECT
  event_timestamp, event_name, event_params, user_id,
  user_pseudo_id, user_properties,
  STRUCT(device.category, device.time_zone_offset_seconds,
         device.is_limited_ad_tracking) AS device,
  platform
FROM `myTable`
ORDER BY event_timestamp;
But this results in an error: "Resources exceeded during query execution: The query could not be executed in the allotted memory. Sort operator used for ORDER BY used too much memory..." I think there is too much data to fit in BigQuery's memory.
The reason I have to sort the data is that I'd like to download it and parse it in my on-premises application in ascending order of timestamp. If I move the sorting into my application instead, it will take a lot of time.
I don't know the Google Cloud Platform features well. Is there a good way to sort huge data on GCP?
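One possible workaround (my own suggestion, not something stated in the thread) is to sort manageable slices, e.g. one day of event_timestamp at a time via a WHERE clause, so each ORDER BY stays within BigQuery's memory limits, and then stream-merge the already-sorted chunks on the client. Python's heapq.merge does this lazily, holding only one row per chunk in memory:

```python
import heapq

# Suppose each chunk is one day's rows, already sorted by event_timestamp
# (e.g. by running the ORDER BY query once per day's partition, which keeps
# each individual sort small enough for BigQuery's memory limits).
day1 = [(1000, "session_start"), (1500, "screen_view")]
day2 = [(1200, "session_start"), (1800, "purchase")]

# heapq.merge lazily merges sorted iterables, so arbitrarily many chunks
# can be combined without loading them all into memory at once.
merged = list(heapq.merge(day1, day2))
print(merged)
# -> [(1000, 'session_start'), (1200, 'session_start'),
#     (1500, 'screen_view'), (1800, 'purchase')]
```

Because the merge is streaming, the on-premises application can parse rows in timestamp order as they arrive instead of sorting the whole download first.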
I was wondering, how does Firestore handle real-time syncing of deeply nested objects? Specifically, does it only sync the diff?
For example, say my app state is just an array of 3 values and this state is synced between devices. If I then change one of the values, will the whole new array be transmitted over the network, or only the diff? What if my state is a nested object?
I'm asking because I want to sync the whole state, which is an object with multiple fields, but I don't want to transmit the whole object when I only change a single field.
Like Realtime Database, Cloud Firestore uses data synchronization to update data on any connected device. However, it's also designed to make simple, one-time fetch queries efficiently.
Queries are indexed by default: Query performance is proportional to the size of your result set, not your data set.
Cloud Firestore will send your device only the difference, not the entire document set.
Tips:
Add queries to limit the data that your listen operations return and use listeners that only download updates to data.
Place your listeners as far down the path as you can to limit the amount of data they sync. Your listeners should be close to the data you want them to get. Don't listen at the database root, as that results in downloads of your entire database.
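On the write side, one way to keep transmissions small (my own sketch, not part of the answer above) is to send only the fields that actually changed, via a partial update rather than rewriting the whole document; with the google-cloud-firestore client that would be doc_ref.update(diff) instead of doc_ref.set(state). Computing the diff is plain Python:

```python
def changed_fields(old_state: dict, new_state: dict) -> dict:
    """Return only the top-level fields whose values differ.

    Passing this dict to a partial update (e.g. Firestore's update())
    avoids retransmitting unchanged fields of a multi-field state object.
    """
    return {k: v for k, v in new_state.items() if old_state.get(k) != v}

old = {"volume": 5, "theme": "dark", "layout": "grid"}
new = {"volume": 7, "theme": "dark", "layout": "grid"}
print(changed_fields(old, new))
# -> {'volume': 7}
```

Note this compares only top-level fields; for a deeply nested state you would recurse into nested dicts, or use Firestore's dotted field paths (e.g. "settings.volume") in the update.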
Hope it helps!