Azure Data Explorer data purge - azure-data-explorer

As per Azure Data Explorer's purge guidelines, the process should not be used for deleting massive quantities of data. What defines "massive"? Trying to delete ~70K rows results in a query timeout.


Update Policy with zero retention on source table

If we follow the scenario described in the docs titled 'Zero retention on source table', i.e. we have set a transactional update policy and treat the source tables as only temporary landing points, thus setting softdelete to 0s:
.alter-merge table <TableName> policy retention softdelete=0s
Now, since the update policy in question is transactional, let's say the update policy execution (i.e. the stored function executed by the update policy) fails: will there be a retry, and how long will Kusto keep retrying? Until it retries, where does the data reside? Because the source tables have 0 retention, I believe the data won't even exist in the source table.
Yes, the retry logic is described in the "transactional policy" section. When ingestion fails due to a transactional update policy, the Data Management cluster simply sends the ingestion command to the source table again, following the logic described in the docs. Until the full ingestion command succeeds, the data will not be in either the source or the target table.
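For illustration, here is a minimal sketch of how the two pieces above (a transactional update policy on the target table plus zero retention on the source table) might be applied with the Python azure-kusto-data client; the cluster URL, database, table, and function names are placeholders, not anything from the question.

# Sketch: apply a transactional update policy and zero source-table retention
# through the azure-kusto-data management API. All names below are placeholders.
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(
    "https://mycluster.westeurope.kusto.windows.net")
client = KustoClient(kcsb)
database = "MyDatabase"

# Transactional update policy: if TransformFunction() fails, the whole
# ingestion fails and the Data Management cluster retries it later.
update_policy_cmd = (
    '.alter table TargetTable policy update '
    '@\'[{"IsEnabled": true, "Source": "SourceTable", '
    '"Query": "TransformFunction()", "IsTransactional": true}]\''
)
client.execute_mgmt(database, update_policy_cmd)

# Zero retention on the landing (source) table, as in the docs' scenario.
client.execute_mgmt(
    database, ".alter-merge table SourceTable policy retention softdelete = 0s")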

How to incrementally refresh Azure Analysis Services with a Synapse pool as source. Does the "Default" processing option support incremental refresh?

Cube: Tabular (Azure Analysis Services)
Source DB: Azure Synapse pool
Data operation: inserts and updates are happening on the source system
Issue:
Not able to get the latest data from the Synapse pool into the AAS cube when using the "Default" processing option.
Clarification:
Which processing option (other than Full) should be used to get the latest data into the tables in the cube?
Does the "Default" processing option support incremental refresh on tables in the cube?
This question was asked/answered over on the Microsoft Q&A forums: https://learn.microsoft.com/en-us/answers/questions/251394/how-to-incrementally-refresh-azure-analysis-servic.html
To summarize:
Default processing does not refresh any data; it only processes tables that have not already been processed.
Incremental processing isn't currently a supported feature in Azure Analysis Services. Instead, use partitions to segment the data so you only need to reload the most recent changes.
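As a concrete example of the partition-based approach, here is a minimal sketch that reprocesses only the partition holding the latest data through the Azure Analysis Services asynchronous refresh REST API; the server region/name, model, table, and partition names and the bearer token are placeholders.

# Sketch: refresh only the newest partition of a table via the AAS
# asynchronous refresh REST API. Server, model, table, and partition names
# and the access token are placeholders.
import requests

REFRESH_URL = ("https://westeurope.asazure.windows.net/servers/myaasserver"
               "/models/MyModel/refreshes")

body = {
    "Type": "Full",                 # full process, but scoped to one partition
    "CommitMode": "transactional",
    "MaxParallelism": 2,
    "RetryCount": 2,
    "Objects": [
        # Only the partition that receives inserts/updates is reprocessed;
        # the older partitions stay untouched.
        {"table": "FactSales", "partition": "FactSales_Current"}
    ],
}

response = requests.post(
    REFRESH_URL,
    json=body,
    # Azure AD token for the https://*.asazure.windows.net resource
    headers={"Authorization": "Bearer <access-token>"},
)
response.raise_for_status()
print("Refresh accepted, poll:", response.headers.get("Location"))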

wso2 "complete and same" master-datasources.xml on all five WSO2 API-M components

I'm setting up WSO2 APIM HA in a distributed environment and I have some challenges using this documentation.
Documentation states: Note: When configuring clustering, ignore the WSO2_CARBON_DB data source configuration.
The question is: can I really not use the CARBON DB instead of the UM and REG databases in HA?
The documentation mentions configuring the following:
AM DB - in the Publisher, Store, and Key Manager nodes
UM DB - in the Publisher, Store, and Key Manager nodes
REG DB - in the API Publisher and Store nodes. (single tenant)
MB DB - in the Traffic Manager nodes (each TM has its own DB)
The question is: can I completely fill one master-datasources.xml file and overwrite it on all components so I don't have to edit it on each server (only editing the second TM's datasource to point to the second MB DB)?
Yes, it is fine to completely fill one master-datasources.xml file and overwrite it on all the other components (except for WSO2_MB_STORE_DB, which is the MB DB).
The MB DB (WSO2_MB_STORE_DB) has to be separate for each node, because this DB is used for traffic as well as internally by throttling policies, which generate a very high rate of DB transactions.
It will still work if you don't keep WSO2_MB_STORE_DB separate, but the large number of DB transactions can slow down your single DB. So it is highly advisable to maintain a separate DB on each node; it will also make debugging in production environments much easier.
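A minimal sketch (assuming MySQL-backed databases and hypothetical host names and paths) of how you could fill the file once and patch only the WSO2_MB_STORE_DB URL per Traffic Manager node:

# Sketch: start from one fully filled master-datasources.xml and rewrite only
# the WSO2_MB_STORE_DB JDBC URL for each Traffic Manager node. Host names,
# DB URLs, and file paths are placeholders.
import shutil
import xml.etree.ElementTree as ET

TEMPLATE = "master-datasources.xml"   # the single, fully filled template
MB_DB_URLS = {
    "tm1.example.com": "jdbc:mysql://db1.example.com:3306/mb_db_tm1",
    "tm2.example.com": "jdbc:mysql://db2.example.com:3306/mb_db_tm2",
}

for node, mb_url in MB_DB_URLS.items():
    out_file = f"master-datasources-{node}.xml"
    shutil.copyfile(TEMPLATE, out_file)

    tree = ET.parse(out_file)
    for ds in tree.iter("datasource"):
        if ds.findtext("name") == "WSO2_MB_STORE_DB":
            url_elem = ds.find("./definition/configuration/url")
            if url_elem is not None:
                url_elem.text = mb_url
    tree.write(out_file, xml_declaration=True, encoding="UTF-8")
    # Copy the per-node file to <APIM_HOME>/repository/conf/datasources/ on that node.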

cosmosdb - archive data older than n years into cold storage

I researched in several places and could not find any direction on what options there are to archive old data from Cosmos DB into cold storage. I see that for DynamoDB in AWS it is mentioned that you can move DynamoDB data into S3, but I'm not sure what the options are for Cosmos DB. I understand there is a time-to-live option where the data will be deleted after a certain date, but I am interested in archiving rather than deleting. Any direction would be greatly appreciated. Thanks
I don't think there is a single-click built-in feature in CosmosDB to achieve that.
Still, since you mentioned you'd appreciate any direction, I suggest you consider the DocumentDB Data Migration Tool.
Notes about the Data Migration Tool:
you can specify a query to extract only the cold data (for example, by a creation date stored within the documents).
it supports exporting to various targets (JSON file, blob storage, DB, another Cosmos DB collection, etc.).
it compacts the data in the process: it can merge documents into a single array document and zip it.
once you have the configuration set up, you can script it to be triggered automatically using your favorite scheduling tool.
you can easily reverse the source and target to restore the cold data to the active store (or to dev, test, backup, etc.).
To remove the exported data you could use the mentioned TTL feature, but that could cause data loss should your export step fail. I would suggest writing and executing a stored procedure to query and delete all exported documents with a single call. That SP would not execute automatically, but it could be included in the automation script and executed only if the data was exported successfully first.
See: Azure Cosmos DB server-side programming: Stored procedures, database triggers, and UDFs.
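If you would rather script the export yourself than use the Data Migration Tool, the sketch below follows the same export-first, delete-only-on-success pattern with the Python Cosmos DB and Blob Storage SDKs; the account endpoints, keys, the createdAt document field, and the partitionKey property are all assumptions for illustration.

# Sketch: archive Cosmos DB documents older than n years to blob storage,
# deleting them only after the export succeeded. Account endpoints, keys,
# the createdAt field, and the partitionKey property are assumptions.
import datetime
import json

from azure.cosmos import CosmosClient
from azure.storage.blob import BlobServiceClient

YEARS = 2
cutoff = (datetime.datetime.utcnow() - datetime.timedelta(days=365 * YEARS)).isoformat()

cosmos = CosmosClient("https://myaccount.documents.azure.com:443/", credential="<cosmos-key>")
container = cosmos.get_database_client("mydb").get_container_client("mycollection")

blob_service = BlobServiceClient.from_connection_string("<storage-connection-string>")
archive = blob_service.get_container_client("cosmos-archive")

# Extract only the cold data, e.g. by a creation date stored within the documents.
old_docs = list(container.query_items(
    query="SELECT * FROM c WHERE c.createdAt < @cutoff",
    parameters=[{"name": "@cutoff", "value": cutoff}],
    enable_cross_partition_query=True,
))

if old_docs:
    # Compact into a single array document, similar to what the migration tool can do.
    archive.upload_blob(f"archive-{cutoff[:10]}.json", json.dumps(old_docs), overwrite=True)

    # Delete only after the export succeeded, to avoid data loss.
    for doc in old_docs:
        container.delete_item(doc, partition_key=doc["partitionKey"])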
UPDATE:
These days Cosmos DB has added the Change feed, which really simplifies writing a carbon copy somewhere else.

DynamoDB limitations when deploying MoonMail

I'm trying to deploy MoonMail on AWS. However, I receive this exception from CloudFormation:
Subscriber limit exceeded: Only 10 tables can be created, updated, or deleted simultaneously
Is there another way to deploy without opening a support case and asking them to raise my limit?
This is an AWS limit for APIs: (link)
API-Specific Limits
CreateTable/UpdateTable/DeleteTable
In general, you can have up to 10 CreateTable, UpdateTable, and DeleteTable requests running simultaneously (in any combination). In other words, the total number of tables in the CREATING, UPDATING, or DELETING state cannot exceed 10.
The only exception is when you are creating a table with one or more secondary indexes. You can have up to 5 such requests running at a time; however, if the table or index specifications are complex, DynamoDB might temporarily reduce the number of concurrent requests below 5.
You could try to open a support request with AWS to raise this limit for your account, but I don't feel this is necessary. It seems that you could create the DynamoDB tables a priori, using the AWS CLI or AWS SDK, and use MoonMail with read-only access to those tables. Using the SDK (example), you could create those tables sequentially, without hitting this simultaneous-creation limit.
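For example, here is a minimal sketch of creating the tables one at a time with boto3, waiting for each to become ACTIVE so the limit is never reached; the table names and key schemas below are placeholders and should be taken from MoonMail's s-resources-cf.json.

# Sketch: create the DynamoDB tables sequentially with boto3 so that no more
# than one table is ever in the CREATING state. Table names and key schemas
# are placeholders; use the definitions from s-resources-cf.json.
import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")

TABLES = [
    {
        "TableName": "MoonMail-example-table",
        "KeySchema": [{"AttributeName": "id", "KeyType": "HASH"}],
        "AttributeDefinitions": [{"AttributeName": "id", "AttributeType": "S"}],
        "ProvisionedThroughput": {"ReadCapacityUnits": 1, "WriteCapacityUnits": 1},
    },
    # ... the remaining table definitions go here ...
]

waiter = dynamodb.get_waiter("table_exists")
for spec in TABLES:
    dynamodb.create_table(**spec)
    # Block until this table leaves the CREATING state before starting the next one.
    waiter.wait(TableName=spec["TableName"])
    print(f"Created {spec['TableName']}")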
Another option is to edit the s-resources-cf.json file to include only 10 tables and deploy. After that, add the missing tables and deploy again.
Whatever solution you apply, consider creating an issue ticket in MoonMail's repo, because as it stands now, it does not work on the first try (there are 12 tables in the resources file).
