How to ingest CSV data into kusto - azure-data-explorer

I need to ingest CSV data into Kusto using a Kusto query for geographical map visualization, but I couldn't find any query to ingest CSV data. Please help me with this.
Thank you

Try the one-click ingestion wizard or the ingest-from-storage commands (the latter requires the data to be in Azure Blob Storage or an ADLSv2 file).

You can ingest using a query command as well, as described here: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/management/data-ingestion/ingest-from-storage
You will need to place your files in Azure Blob Storage or Azure Data Lake Store Gen 2.
However, the one-click ingestion wizard is advisable, as it ingests data via the Data Management component, which offers better queuing and ingestion orchestration than ingesting directly into the database/table using the command shared above.
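For illustration, here is a minimal, untested sketch of the command-based route driven from the Python SDK (azure-kusto-data). The cluster URL, database, table name, blob URL and SAS token are placeholders, not values from the question:

```python
# Minimal sketch (untested): run the ingest-from-storage control command via the
# Python SDK (pip install azure-kusto-data). Cluster URL, database, table name,
# blob URL and SAS token below are placeholders -- replace with your own.
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

cluster = "https://mycluster.westeurope.kusto.windows.net"   # placeholder
kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(cluster)
client = KustoClient(kcsb)

ingest_cmd = """
.ingest into table MyGeoTable (
    'https://mystorage.blob.core.windows.net/mycontainer/geo-data.csv?<SAS-token>'
) with (format='csv', ignoreFirstRecord=true)
"""

# Control commands (starting with '.') go through execute_mgmt.
client.execute_mgmt("MyDatabase", ingest_cmd)
```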

Related

How to ingest data from ADLS into Azure Data Explorer by subscribing to Event Grid

I am trying to ingest data from ADLS Gen2 into Azure Data Explorer through Event Grid.
I could find a few MSFT docs explaining how to ingest blobs into ADX through Event Grid, but not ADLS.
The file path to the ADLS storage account is abfss://container@p01lakesstor.dfs.core.windows.net/UserData/Overground/UsersFolder/projectname/A/data/json/
I just would like to know how to set the prefix/suffix here to read the data from that ADLS storage account.
Would appreciate any help!
This should do the trick:
/blobServices/default/containers/mycontainer/blobs/this/is/my/path/
Replace mycontainer and /this/is/my/path/ with the relevant info.
Please be mindful of ADLS subtleties when setting up Event Grid subscriptions:
Writing ADLSv2 files
Known ADLSv2 limitations
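For illustration, a small sketch of how the abfss:// path from the question maps onto that subject filter; the container and folder names are taken from the question's path and may need adjusting to your actual layout:

```python
# Build the Event Grid "subject begins with" filter for the abfss:// path above.
# Container and folder are read off that path; adjust to your own layout.
container = "container"
folder = "UserData/Overground/UsersFolder/projectname/A/data/json/"
subject_begins_with = f"/blobServices/default/containers/{container}/blobs/{folder}"
print(subject_begins_with)
# /blobServices/default/containers/container/blobs/UserData/Overground/UsersFolder/projectname/A/data/json/
```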

How to export DynamoDB table data without the point in time recovery?

I am trying to export data from a DynamoDB table for the last 15 days, but unfortunately point-in-time recovery is not enabled, so I can't use the new DynamoDB export-to-S3 feature because it's not retroactive.
I have tried using AWS Data Pipeline to export the DynamoDB data to S3, but is that retroactive?
Assuming it is, I tried to export the data, but the pipeline fails at the TableBackupActivity with a status of CANCELLED. I didn't find anything in the log bucket or in the Data Pipeline console, only this:
@failureReason Resource not healthy: Jobflow retired
How can I tell whether this is due to the read capacity units of the DynamoDB table?
You cannot back up data to S3 natively without point-in-time recovery enabled.
Another way of doing this would be to read the complete table and save it as JSON, then, at recovery time, re-populate your disaster-recovery table from that JSON file.
Amazon has an article on populating DynamoDB from a JSON file here: https://aws.amazon.com/blogs/compute/creating-a-scalable-serverless-import-process-for-amazon-dynamodb/
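As an illustration of the scan-and-save approach, a minimal boto3 sketch; the table name and output file are placeholders, and a FilterExpression could be added to restrict the scan to the last 15 days if the items carry a timestamp attribute:

```python
# Minimal sketch (assumes AWS credentials are configured): scan the whole table
# and dump it to a local JSON file. Table name and output path are placeholders.
import json
from decimal import Decimal

import boto3


def default(obj):
    # DynamoDB numbers come back as Decimal; make them JSON-serializable.
    if isinstance(obj, Decimal):
        return float(obj)
    raise TypeError(f"Cannot serialize {type(obj)}")


table = boto3.resource("dynamodb").Table("my-table")          # placeholder name
items, kwargs = [], {}
while True:
    page = table.scan(**kwargs)
    items.extend(page["Items"])
    if "LastEvaluatedKey" not in page:
        break
    kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]    # paginate

with open("my-table-export.json", "w") as f:
    json.dump(items, f, default=default)
```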

Azure Synapse replicated to Cosmos DB?

We have an Azure data warehouse db2 (Azure Synapse) that will need to be consumed by read-only users around the world, and we would like to replicate the needed objects from the data warehouse, potentially to a Cosmos DB. Is this possible, and if so, what are the available options (transactional, merge, etc.)?
Synapse is mainly about getting your data in for analysis. I don't think it has a direct export option of the kind you have described above.
However, what you can do is use Azure Stream Analytics, and then you should be able to integrate/stream whatever you want to any destination you need, like an app or a database and so on.
More details here: https://learn.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-integrate-azure-stream-analytics
I think you can also pull the data into Power BI, and perhaps set up some kind of automatic export from there.
More details here: https://learn.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-get-started-visualize-with-power-bi
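If near-real-time streaming isn't required, a different option from the two above is a small scheduled batch copy from the Synapse SQL pool into Cosmos DB. A rough, untested sketch; the connection details, query, and database/container names are placeholders:

```python
# Rough, untested sketch of a scheduled batch copy: read rows from the Synapse
# SQL pool with pyodbc and upsert them into a Cosmos DB container. All
# connection details, the query and the names below are placeholders.
import pyodbc
from azure.cosmos import CosmosClient

sql = pyodbc.connect(
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=tcp:myworkspace.sql.azuresynapse.net,1433;"
    "Database=mydw;Uid=loader;Pwd=<password>;Encrypt=yes;"
)
cosmos = CosmosClient("https://myaccount.documents.azure.com:443/", credential="<key>")
container = cosmos.get_database_client("analytics").get_container_client("facts")

cursor = sql.cursor()
# Cast to JSON-friendly types; Cosmos DB requires a string 'id' field.
cursor.execute(
    "SELECT CAST(FactId AS VARCHAR(64)) AS id, Region, "
    "CAST(Amount AS FLOAT) AS Amount FROM dbo.FactSales"
)
columns = [c[0] for c in cursor.description]
for row in cursor:
    # Upsert makes the copy re-runnable on a schedule.
    container.upsert_item(dict(zip(columns, row)))
```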

cosmosdb - archive data older than n years into cold storage

I researched several places and could not find any direction on what options there are to archive old data from Cosmos DB into cold storage. I see that for DynamoDB in AWS it is mentioned that you can move DynamoDB data into S3, but I'm not sure what the options are for Cosmos DB. I understand there is a time-to-live (TTL) option where the data will be deleted after a certain date, but I am interested in archiving versus deleting. Any direction would be greatly appreciated. Thanks
I don't think there is a single-click built-in feature in Cosmos DB to achieve that.
Still, as you mentioned appreciating any direction, I suggest you consider the DocumentDB Data Migration Tool.
Notes about the Data Migration Tool:
- You can specify a query to extract only the cold data (for example, by a creation date stored within the documents).
- It supports exporting to various targets (JSON file, blob storage, DB, another Cosmos DB collection, etc.).
- It can compact the data in the process: it can merge documents into a single array document and zip it.
- Once you have the configuration set up, you can script it to be triggered automatically using your favorite scheduling tool.
- You can easily reverse the source and target to restore the cold data to the active store (or to dev, test, backup, etc.).
To remove the exported data you could use the mentioned TTL feature, but that could cause data loss should your export step fail. I would suggest writing and executing a stored procedure to query and delete all exported documents in a single call. That SP would not execute automatically, but could be included in the automation script and executed only if the data was exported successfully first.
See: Azure Cosmos DB server-side programming: Stored procedures, database triggers, and UDFs.
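If you prefer to script the export yourself rather than drive the Data Migration Tool, here is a rough, untested sketch of the query-archive-delete flow. It assumes the documents carry a createdAt date and that the container's partition key is /pk; adjust both to your schema:

```python
# Untested sketch: query documents older than a cutoff, archive them to Blob
# Storage as JSON, then delete them. Assumes a 'createdAt' ISO date field and a
# '/pk' partition key -- both are assumptions, adjust to your schema.
import json
from azure.cosmos import CosmosClient
from azure.storage.blob import BlobServiceClient

cosmos = CosmosClient("https://myaccount.documents.azure.com:443/", credential="<key>")
container = cosmos.get_database_client("mydb").get_container_client("docs")
blobs = BlobServiceClient.from_connection_string("<storage-connection-string>")

cutoff = "2020-01-01T00:00:00Z"  # archive everything created before this date
cold = list(container.query_items(
    query="SELECT * FROM c WHERE c.createdAt < @cutoff",
    parameters=[{"name": "@cutoff", "value": cutoff}],
    enable_cross_partition_query=True,
))

# 1) Write the archive first, 2) delete only after the upload succeeded.
blobs.get_blob_client("cold-archive", f"cold-before-{cutoff[:10]}.json").upload_blob(
    json.dumps(cold), overwrite=True
)
for doc in cold:
    container.delete_item(item=doc["id"], partition_key=doc["pk"])
```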
UPDATE:
These days Cosmos DB has added the change feed, which really simplifies writing a carbon copy somewhere else.

DocumentDB Data Migration Tool - transformDocument procedure with partitions

I am trying to convert data from SQL Server to DocumentDB. I need to create embedded arrays in the DocumentDB document.
I am using the DocumentDB Data Migration Tool, and it describes using transformDocument with a bulk-insert stored proc... unfortunately we are using partitioned collections, and they do not support bulk insert.
Am I missing something, or is this not currently supported?
The migration tool only supports sequential data import to a partitioned collection. Please follow the sample below to bulk-import data efficiently into a partitioned collection.
https://github.com/Azure/azure-documentdb-dotnet/tree/master/samples/documentdb-benchmark
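The linked sample is .NET; a rough Python analogue of the same idea (many parallel inserts against a partitioned collection, rather than a server-side bulk stored proc) might look like this untested sketch, with the endpoint, key, names and /pk partition key as placeholders:

```python
# Rough Python analogue of the linked .NET benchmark sample: issue many inserts
# in parallel against a partitioned collection. Untested sketch; endpoint, key,
# database/container names and the '/pk' partition key are placeholders.
from concurrent.futures import ThreadPoolExecutor

from azure.cosmos import CosmosClient

client = CosmosClient("https://myaccount.documents.azure.com:443/", credential="<key>")
container = client.get_database_client("mydb").get_container_client("items")

docs = [{"id": str(i), "pk": f"partition-{i % 10}", "value": i} for i in range(10_000)]

# Client-side parallelism (not a server-side bulk call) provides the throughput.
with ThreadPoolExecutor(max_workers=32) as pool:
    list(pool.map(container.upsert_item, docs))
```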
