Target table schema change and update policy - azure-data-explorer

The documentation says:
The update policy function schema and the target table schema must match in their column names, types, and order.
What is the behavior when the target table schema needs to be updated? Will the schema change fail, or will it cause the update policy to fail?
If neither fails, will the source table data stop being processed into the target table?

If there is a mismatch between the schema of the target table and the output schema of the query in the update policy, the update policy will fail.
And if the update policy is configured with IsTransactional set to true, the entire ingestion will fail.
In managed ingestion flows (e.g. ingestion from Event Hub, or any other kind of queued ingestion), the service will automatically retry the ingestion operation in this failure scenario.
This means there's a 'grace period' during which you can perform the desired schema changes, after which these retries will succeed (assuming you've resolved all schema mismatches).
This, too, is mentioned in the documentation.
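As a hedged illustration of the schema-matching requirement (all table and function names here are hypothetical, not from the question), an update policy whose function output must match the target table column-for-column might be set up like this:

// Hypothetical names: SourceTable, TargetTable, TransformToTarget.
// The function's output columns must match TargetTable in name, type, and order.
.create-or-alter function TransformToTarget() {
    SourceTable
    | project Timestamp, DeviceId, Value
}
.alter table TargetTable policy update
@'[{"IsEnabled": true, "Source": "SourceTable", "Query": "TransformToTarget()", "IsTransactional": true}]'

With IsTransactional set to true as above, a schema mismatch fails the whole ingestion rather than silently dropping the derived rows.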

Related

Update Policy with zero retention on source table

If we follow the scenario described in the docs section titled 'Zero retention on source table', i.e. we have set a transactional update policy and are treating source tables as only temporary landing points, and thus setting softdelete to 0s:
.alter-merge table <TableName> policy retention softdelete=0s
Now, since the update policy in question is transactional in nature, let's say the update policy execution (the execution of the stored function invoked by the update policy) fails. Will there be a retry, and how long will Kusto keep retrying? Until the time it attempts the retry, where does the data reside? Because the source tables have 0 retention, I believe it won't even exist in the source table.
Yes, the retry logic is described in the "transactional policy" section of the docs. When the ingestion fails due to a transactional update policy, the Data Management cluster will simply send the ingestion command to the source table again, based on the logic described in the docs. Until the full ingestion command succeeds, the data will not be in either the source or the target tables.
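If you want to observe such failures while they are being retried, the service exposes them via a management command (a hedged pointer; the exact output columns vary by service version):

// List recent ingestion failures, including those caused by a failing
// transactional update policy.
.show ingestion failures
| where FailedOn > ago(1h)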

Does Corda support a state deletion scenario?

Does Corda support a state deletion scenario when some state is no longer needed (in both dev and prod)?
I face an exception like "class not found exception" when starting the node. It happens when I delete a state class from the project and reuse the same old persistence file.
I think this is because the state class was already inserted into VAULT_STATES and the node can't find that class when starting up.
I expected there to be some method that provides state deletion.
More info
On the dev side I deleted the persistence file and of course it works, but I'm worried about the production side.
As of Corda 3, if a node has a state stored as part of a transaction in its transaction storage or in its vault, the node needs to keep the state's class definition on its classpath permanently.
You can delete old transactions and states directly via the node's database, but only if the transactions are not required for transaction resolution. You would do this by dropping rows from the NODE_TRANSACTIONS and VAULT_STATES tables in the node's database (as well as any custom tables defined by the state's schemas if it is a QueryableState). However, if the deleted transactions are later required as part of transaction resolution, your node will throw an error.
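As a hedged sketch of that manual approach (the transaction ID is a placeholder, and the exact column names should be verified against your Corda version's schema before touching a production database):

-- H2 SQL against the node's database; '<TX_ID>' is a placeholder.
-- Delete the vault entries for the transaction's output states first,
-- then the transaction itself.
DELETE FROM VAULT_STATES WHERE TRANSACTION_ID = '<TX_ID>';
DELETE FROM NODE_TRANSACTIONS WHERE TX_ID = '<TX_ID>';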
Future versions of Corda may provide a mechanism to delete old or "non-current" states and transactions. You can find a discussion of what this process may look like here: https://groups.io/g/corda-dev/topic/20405353.
For development purposes, you can simply delete the persistence.mv.db file, which is the entire H2 database. This will reset your Corda node.
Of course, don't do that for any production use.

Determine if Cosmos DB NotFound due to missing collection vs. document

Is there a way to programmatically determine from a DocumentClientException where StatusCode == HttpStatusCode.NotFound whether it was the document, the collection, or the database that was not found?
I'm trying to figure out whether I can implement on-demand collection provisioning and only call DocumentClient.CreateDocumentCollectionIfNotExistsAsync when I need to. I'm trying to avoid calling it before making every request (presumably this adds an extra network roundtrip to every request). Likewise, I'm trying to avoid calling it on error recovery when I know it won't help.
From experimentation with the local emulator, the only field I see varying in these three cases is DocumentClientException.Error.Message, and only when the database cannot be found. I generally try to avoid exception dispatching based on human-readable messages.
Wrong database name:
StatusCode: HttpStatusCode.NotFound
Error.Message: {\"Errors\":[\"Owner resource does not exist\"]}...
Correct database name, wrong collection name:
StatusCode: HttpStatusCode.NotFound
Error.Message: {\"Errors\":[\"Resource Not Found\"]}...
Correct database name, correct collection name, incorrect document ID:
StatusCode: HttpStatusCode.NotFound
Error.Message: {\"Errors\":[\"Resource Not Found\"]}...
I'm planning to use a database with its own offer. Since collections inside a database with its own offer are cheap, I'm trying to see whether I can segregate each tenant in my multi-tenant application into its own collection. Each tenant ends up having a different indexing and default TTL policy. The set of collections is not fixed and changes dynamically during runtime as new tenants sign up. I cannot predict when I will need to add a new collection. There's no new tenant notification: I just get a request that I need to handle by creating a document in a possibly non-existent collection. There's a process to garbage collect unused collections.
I'm using the NuGet package Microsoft.Azure.DocumentDB.Core Version 1.9.1 in a .NET Core 2.1 app targeting a SQL API Cosmos DB instance.
If you look at the Message property in detail, you should see the following strings, which indicate whether the 404 Not Found response was generated for a document or a collection:
ResourceType: Document
ResourceType: Collection
It's not ideal, but you can try to regex this information out of the error message.
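A minimal sketch of that approach (hedged: the ClassifyNotFound helper is made up for illustration, and the "ResourceType: ..." marker must actually appear in the messages your SDK version produces):

using System.Net;
using System.Text.RegularExpressions;
using Microsoft.Azure.Documents; // DocumentClientException

static class NotFoundClassifier
{
    // Returns "Document", "Collection", etc., or "Unknown" if the marker is absent.
    public static string ClassifyNotFound(DocumentClientException ex)
    {
        if (ex.StatusCode != HttpStatusCode.NotFound)
            return "NotApplicable";
        var match = Regex.Match(ex.Message, @"ResourceType:\s*(\w+)");
        return match.Success ? match.Groups[1].Value : "Unknown";
    }
}

If the result is "Collection", you could call CreateDocumentCollectionIfNotExistsAsync and retry the original request; if it's "Document", provisioning won't help because the document itself is missing.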

Need to pin all flyway updates to 1 schema_version table in a specified schema

We want to be able to pin all SQL executions to a particular schema_version table in schema A. We need this so that we can run SQL as a sysdba while Flyway always references A.schema_version to validate checksums and record the results of SQL runs. We tried adding the following settings:
flyway.schemas=A
flyway.table=schema_version
However, we find that if we run info as user B, Flyway is unable to read A.schema_version. What are we missing?
Found the solution. flyway.schemas is case-sensitive, as specified in the Flyway docs. We needed to set flyway.schemas=<SCHEMA_NAME> with the schema name in caps.
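For example (a hedged sketch; this assumes the database stores the schema name in upper case, as Oracle does for unquoted identifiers):

# Schema names are matched case-sensitively; use the stored (upper-case) name.
flyway.schemas=A
flyway.table=schema_version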

When I use Fixture with SqlAlchemy in my unit tests, why am I unable to confirm changes to the database during test?

I am testing a message processor that uses SQLAlchemy (v0.7.4). In my test, I am using Fixture (v1.4) with SQLite to set up and tear down a temporary database. My fixture data includes a file table with a status field that should get updated when the processor runs.
I have confirmed that the test, the processor being tested, and the fixture are all sharing the same database session.
I query the status field on the file record before the processor is run and afterwards. The value should change (from an int representing "Processing" to "Complete"). I have added debug code within the processor to verify that the field is being updated with the correct new status value. I am also able to independently verify that the processor runs successfully by checking the contents of an output file it produces. However, when I query the status at the end of my test using my test's database session, it is always the same as the value at the beginning.
I have tried explicitly committing and flushing the session before the final status query. Nothing works. Any ideas?
The issue here was twofold: 1) my test was using a temporary SQLite database in memory, and 2) within my functional test, the processor was being spawned in a new process.
So even though I had hacked the processor class to use the same database session as the test itself, the processor was running in a separate process with its own memory space, so its connection pointed at a different in-memory database, and the updates it made were invisible to the test code trying to verify the results.
Solution: set up the temporary SQLite database in a file rather than in memory.
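A minimal sketch of the fix (these are standard SQLAlchemy engine URLs; the file path is hypothetical):

from sqlalchemy import create_engine

# An in-memory SQLite database is private to the process (and connection)
# that creates it, so a processor spawned in a child process sees a
# different, empty database:
# engine = create_engine("sqlite:///:memory:")

# A file-based SQLite database is shared across processes via the filesystem:
engine = create_engine("sqlite:////tmp/test_fixture.db")  # hypothetical path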
Additionally, I discovered that when Fixture does its teardown, it will throw an error if your fixture data isn't in the same state it was set up in (noted at the end of this section). But that was a separate issue.
