Total ordering of transactions in SQLite WAL mode

I have a use-case where my application does a count (with a filter) at time T1 and transmits this information to the UI.
In parallel (T0.9, T1.1), rows are inserted/updated and fire events to the interface. From the UI's perspective, it is hard to know if the count included a given event or not.
I would like to return some integer X with the count transaction and another integer Y from the insert/update transaction so that the interface only considers events where Y > X.
Since SQLite provides snapshot isolation in WAL mode, I was thinking that there must be information in the snapshot that determines which records are visible, and that this could be leveraged for that.
I cannot use the max ROWID either, because an update might change older rows that should now be counted given the filter.
Time also seems unreliable, since a write transaction could start before a read transaction but still not be included in its snapshot.
I have no issue coding a custom plugin for SQLite if one is needed to access that data.
Any idea is appreciated!
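To make the X/Y idea concrete, here is a minimal sketch of one possible application-level scheme (a plain sequence table rather than the snapshot internals I would ideally tap into). Because WAL mode still serializes writers, a single-row sequence bumped inside every write transaction yields a total order; the count transaction reads it from its own snapshot. Table, column and filter names are illustrative.

    import sqlite3

    conn = sqlite3.connect("app.db", isolation_level=None)  # autocommit; explicit BEGIN/COMMIT below
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS txn_seq (id INTEGER PRIMARY KEY CHECK (id = 1), seq INTEGER NOT NULL)")
    conn.execute("INSERT OR IGNORE INTO txn_seq (id, seq) VALUES (1, 0)")

    def insert_event(payload):
        conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
        conn.execute("UPDATE txn_seq SET seq = seq + 1 WHERE id = 1")
        y = conn.execute("SELECT seq FROM txn_seq WHERE id = 1").fetchone()[0]
        conn.execute("INSERT INTO events (payload) VALUES (?)", (payload,))
        conn.execute("COMMIT")
        return y  # ship Y with the change event sent to the UI

    def filtered_count():
        conn.execute("BEGIN")  # snapshot starts at the first read below
        x = conn.execute("SELECT seq FROM txn_seq WHERE id = 1").fetchone()[0]
        n = conn.execute("SELECT COUNT(*) FROM events WHERE payload LIKE 'a%'").fetchone()[0]
        conn.execute("COMMIT")
        return n, x  # the UI only applies events whose Y > X

Any write transaction included in the count's snapshot has Y <= X, and any later one has Y > X, which is exactly the comparison described above.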

Related

Firestore transactions

Why does Firestore require read operations to be executed before write operations? I understand that, in general, a transaction needs to check whether some other transaction is also accessing the data it reads, but is there a more in-depth explanation?
If you need to take some action according to the value of a particular field, then you can use transactions. This means that the value of that field is read first and, right after that, updated according to your logic. This is because a transaction requires round-trip communication with the server in order to ensure that the code inside the transaction completes successfully.
If you don't care about what already exists in that particular field and you just want to increment a counter, for example, then you can simply use increment(your_number). This operation doesn't require reading the value first, so no read round trip to the Firebase servers is needed; it simply increments the value.
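A short sketch of the two cases with the Python client, google-cloud-firestore (collection, document and field names are made up):

    from google.cloud import firestore

    db = firestore.Client()

    # Blind counter bump: no read needed, so no transaction.
    db.collection("counters").document("page_views").update(
        {"count": firestore.Increment(1)}
    )

    # Read-then-write: the read must happen inside the transaction, before the write.
    @firestore.transactional
    def apply_discount(transaction, doc_ref):
        snapshot = doc_ref.get(transaction=transaction)  # read first...
        price = snapshot.get("price")
        transaction.update(doc_ref, {"price": price * 0.9})  # ...then write

    apply_discount(db.transaction(), db.collection("products").document("sku123"))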

Constraints (Meta Information) in Kusto / Azure Data Explorer

For some of our logs we have the following schema:
A "master" event table that collects events. Each event comes with a unique id (guid).
For each event we collect additional IoT data (sensor data), which also contains the guid as a link to the event table.
Now, we often see the pattern where someone starts with the IoT data and then wants to query the master event table. The join or query criterion is the guid. Now, as we have a lot of data, the unconstrained query, of course, does not return within a short time frame, if at all.
Now what our analysts do is use the time range as a factor. Typically, the sensor data refers to events that happened on the same day, or +/- a few hours, minutes or seconds (depending on the events). This query typically returns, but not always as fast as it could. Given that the guid is unique, queries that explicitly state this knowledge are typically much faster than those that don't, e.g.
Event_Table | where ... | take 1
Unfortunately, everyone needs to remember those properties of the data.
After this long intro: is there a way in Kusto to speed up those queries without explicitly writing "take 1"? As in, telling the Kusto engine that this column holds unique keys? I am not talking about enforcing that (as a DB unique key would), but just about giving Kusto hints on how to improve the query. Can this be done somehow?
It sounds like you could benefit from introducing a server-side function:
https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/schema-entities/stored-functions
Using this method, you can define a function on the server, and users will provide a parameter to the function.
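As a sketch (cluster URL, database, table and column names are placeholders), the stored function encapsulates the "take 1" knowledge once, and analysts only pass the guid:

    from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

    kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(
        "https://mycluster.westeurope.kusto.windows.net")
    client = KustoClient(kcsb)

    create_fn = """.create-or-alter function with (docstring = 'Look up an event by its unique guid')
    EventByGuid(eventGuid: string) {
        Event_Table
        | where EventId == eventGuid
        | take 1
    }"""
    client.execute_mgmt("MyDatabase", create_fn)  # control command, run once

    # Analysts then simply call the function:
    result = client.execute("MyDatabase", "EventByGuid('00000000-0000-0000-0000-000000000000')")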

Exactly-once semantics in Dataflow stateful processing

We are trying to cover the following scenario in a streaming setting:
calculate an aggregate (let’s say a count) of user events since the start of the job
The number of user events is unbounded (hence only using local state is not an option)
I'll discuss three options we are considering, where the first two are prone to data loss and the final one is unclear. We'd like to get more insight into that final one. Alternative approaches are of course welcome too.
Thanks!
Approach 1: Session windows, datastore and idempotency
Sliding windows of x seconds
Group by userid
update datastore
Update datastore would mean:
1. Start trx
2. datastore read for this user
3. Merging in new info
4. datastore write
5. End trx
The datastore entry contains an idempotency id that equals the sliding window timestamp
Problem:
Windows can fire concurrently and can hence be processed out of order, leading to data loss (confirmed by Google).
Approach 2: Session windows, datastore and state
Sliding windows of x seconds
Group by userid
update datastore
Update datastore would mean:
1. Pre-check: check if the state for this key-window is true; if so, we skip the following steps
2. Start trx
3. datastore read for this user
4. Merging in new info
5. datastore write
6. End trx
7. Store in state for this key-window that we processed it (true)
Re-execution will hence skip duplicate updates
Problem:
A failure between steps 5 and 7 will not write to the local state, causing re-execution and potentially counting elements twice.
We can circumvent this by using multiple states, but then we could still drop data.
Approach 3: Global window, timers and state
Based on the article Timely (and Stateful) Processing with Apache Beam, we would create:
A global window
Group by userid
Buffer/count all incoming events in a stateful DoFn
Flush x time after the first event.
A flush would mean the same as Approach 1
Problem:
The guarantees for exactly-once processing and state are unclear.
What would happen if an element was written in the state and a bundle would be re-executed? Is state restored to before that bundle?
Any links to documentation in this regard would be very much appreciated. E.g. how does fault-tolerance work with timers?
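For reference, this is roughly what we imagine Approach 3 looking like as a stateful DoFn with a timer in the Beam Python SDK (the flush interval and the flush body are illustrative; the real flush would do the datastore update from Approach 1):

    import apache_beam as beam
    from apache_beam.transforms.timeutil import TimeDomain
    from apache_beam.transforms.userstate import CombiningValueStateSpec, TimerSpec, on_timer
    from apache_beam.utils.timestamp import Duration, Timestamp

    class BufferAndFlushFn(beam.DoFn):
        COUNT = CombiningValueStateSpec('count', sum)     # running count per key
        FLUSH = TimerSpec('flush', TimeDomain.REAL_TIME)  # processing-time timer

        def process(self, element,
                    count=beam.DoFn.StateParam(COUNT),
                    flush=beam.DoFn.TimerParam(FLUSH)):
            _user_id, _event = element  # input must be keyed: (user_id, event)
            count.add(1)
            # Arm (or re-arm) a flush 60s from now; a real version would only
            # arm it for the first event, as described above.
            flush.set(Timestamp.now() + Duration(seconds=60))

        @on_timer(FLUSH)
        def on_flush(self, key=beam.DoFn.KeyParam,
                     count=beam.DoFn.StateParam(COUNT)):
            total = count.read()
            count.clear()
            # The transactional datastore merge would go here; for the sketch
            # we just emit the partial count downstream.
            yield key, total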
From your Approaches 1 and 2 it is unclear whether the concern is out-of-order merging or loss of data. I can think of the following.
Approach 1: Don't immediately merge the session-window aggregates, because of the out-of-order problem. Instead, store them separately and, after a sufficient amount of time, merge the intermediate results in timestamp order.
Approach 2: Move the state into the transaction. This way, any temporary failure will not let the transaction complete and merge the data. Subsequent successful processing of the session window aggregates will not result in double counting.
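As a sketch of that Approach 2 suggestion with Cloud Datastore (kind and property names are made up), the "already processed" marker lives inside the same transaction as the merge, so a retried window becomes a no-op:

    from google.cloud import datastore

    client = datastore.Client()

    def merge_window(user_id, window_id, window_count):
        with client.transaction():
            agg_key = client.key("UserAggregate", user_id)
            marker_key = client.key("ProcessedWindow", f"{user_id}:{window_id}")

            if client.get(marker_key) is not None:
                return  # this window was already merged; the retry does nothing

            agg = client.get(agg_key) or datastore.Entity(key=agg_key)
            agg["count"] = agg.get("count", 0) + window_count

            # Aggregate and marker commit (or fail) together.
            client.put_multi([agg, datastore.Entity(key=marker_key)])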

When are committed changes visible to other transactions?

A transaction in an Oracle db makes changes in the db, and the changes are committed. Is it possible that other transactions see the performed changes only after several seconds, not immediately?
Background:
We have an application that performs db changes, commits them, and THEN (immediately) it reads the changed data back from the database. However, sometimes it happens that it finds no changes. When the same read is repeated later (by executing the same select manually from SQL Developer), the changed data are returned correctly. The db is standalone, not clustered.
The application does not communicate with the database directly, that would be easy, several layers (including MQ messaging) are involved. We've already eliminated other potential causes of the behaviour (like incorrect parameters, caching etc.). Now I'd like to eliminate an unexpected behaviour of the Oracle db as the cause.
Edit:
First, I'd like to emphasize that I'm NOT asking whether uncommitted changes can be visible to other sessions.
Second, the Oracle COMMIT statement has several modifiers, like WRITE BATCH or NOWAIT. I don't know whether these modifiers have any influence on the answer to my question, but we are not using them anyway.
Assuming that your sessions are all using a read committed isolation level, changes would be visible to any query that starts after the data was committed. It was possible in early versions of RAC to have a small delay between when a change was committed on one node and when it was visible on another node but that has been eliminated for a while and you're not using RAC so that's presumably not it.
If your transactions are using the serializable isolation level and the insert happens in a different session than the select, the change would only be visible to other sessions whose transactions began after the change was committed. If sessions A & B both start serializable transactions at time 0, A inserts a row at time 1, and B queries the data at time 2, B would see the state of the data at time 0 and wouldn't see the data that was inserted at time 1 until after it committed its transaction. Note that this would only apply if the two statements are in different sessions-- session A would see the row because it was inserted in A's transaction.
Barring an isolation level issue, I would expect that the SELECT wasn't actually running after the INSERT committed.
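For illustration, a small python-oracledb sketch of the serializable scenario described above (connection details and the table are placeholders):

    import oracledb

    conn_a = oracledb.connect(user="app", password="***", dsn="dbhost/orclpdb")
    conn_b = oracledb.connect(user="app", password="***", dsn="dbhost/orclpdb")
    cur_a, cur_b = conn_a.cursor(), conn_b.cursor()

    cur_b.execute("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE")  # B's snapshot is fixed now (time 0)

    cur_a.execute("INSERT INTO t (id) VALUES (1)")                 # time 1
    conn_a.commit()

    cur_b.execute("SELECT COUNT(*) FROM t WHERE id = 1")           # time 2
    print(cur_b.fetchone()[0])  # 0: B still sees the data as of time 0

    conn_b.commit()             # end B's serializable transaction
    cur_b.execute("SELECT COUNT(*) FROM t WHERE id = 1")
    print(cur_b.fetchone()[0])  # 1: a new transaction sees A's committed row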

Efficient DynamoDB schema for time series data

We are building a conversation system that will support messages between 2 users (and eventually between 3+ users). Each conversation will have a collection of users who can participate/view the conversation as well as a collection of messages. The UI will display the most recent 10 messages in a specific conversation with the ability to "page" (progressive scrolling?) the messages to view messages further back in time.
The plan is to store conversations and the participants in MSSQL and then only store the messages (which represents the data that has the potential to grow very large) in DynamoDB. The message table would use the conversation ID as the hash key and the message CreateDate as the range key. The conversation ID could be anything at this point (integer, GUID, etc) to ensure an even message distribution across the partitions.
In order to avoid hot partitions one suggestion is to create separate tables for time series data because typically only the most recent data will be accessed. Would this lead to issues when we need to pull back previous messages for a user as they scroll/page because we have to query across multiple tables to piece together a batch of messages?
Is there a different/better approach for storing time series data that may be infrequently accessed, but available quickly?
I guess we can assume that there are many "active" conversations in parallel, right? Meaning - we're not dealing with the case where all the traffic is regarding a single conversation (or a few).
If that's the case, and you're using a random number/GUID as your HASH key, your objects will be spread evenly across the nodes, and as far as I know you shouldn't be afraid of skewness. Since CreateDate is only the RANGE key, all messages for the same conversation will be stored on the same node (based on their ConversationID), so it actually doesn't matter whether you query for the latest 5 records or the earliest 5. In both cases it's a query using the index on CreateDate.
I wouldn't break the data into multiple tables. I don't see what benefit it gives you (considering the previous section) and it will make your administrative life a nightmare (just imagine changing throughput for all tables, or backing them up, or creating a CloudFormation template to create your whole environment).
I would be concerned about the number of messages that will be returned when you pull the history. I guess you'll implement that with a query command using the ConversationID as the HASH key and ordering results by CreateDate descending. In that case, I'd return only the first page of results (I think it returns up to 1MB of data, so depending on the average message length it might or might not be enough) and fetch the next page only if the user keeps scrolling. Otherwise, you might use a lot of your throughput on really long conversations, and in any case the client doesn't want to get stuck for a long time waiting for megabytes of data to appear on screen.
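A minimal boto3 sketch of that access pattern (table and attribute names are illustrative): one query per page, newest first, resuming from LastEvaluatedKey as the user scrolls.

    import boto3
    from boto3.dynamodb.conditions import Key

    table = boto3.resource("dynamodb").Table("Messages")

    def latest_messages(conversation_id, page_size=10, start_key=None):
        kwargs = dict(
            KeyConditionExpression=Key("ConversationId").eq(conversation_id),
            ScanIndexForward=False,  # descending CreateDate, i.e. newest first
            Limit=page_size,
        )
        if start_key:
            kwargs["ExclusiveStartKey"] = start_key  # resume where the last page stopped
        resp = table.query(**kwargs)
        # Hand LastEvaluatedKey back in as start_key to fetch the next page.
        return resp["Items"], resp.get("LastEvaluatedKey")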
Hope this helps
