How can I stream changes from Azure Cosmos DB (MongoDB API) and save the data to Azure Data Lake?

Problem and Research:
I am trying to get real-time data from Cosmos DB into Data Lake. From my research I understand that I have to create a Function App to monitor changes in Cosmos DB using the Change Feed, and then bind it to Event Grid so that I can store the changes in ADLS.
Blockers:
In Data Factory, Data Flow cannot connect to Cosmos DB for MongoDB.
I have to listen to all the collections, but in a Function App only one collection can be monitored at a time.
Which compute should I use to store the data?
My Understanding:
Azure Cosmos DB to ADLS pipeline
I have to create a streaming pipeline which stores all the data from the Cosmos DB MongoDB API in ADLS storage.

You can try Azure Synapse Link for Cosmos DB - https://learn.microsoft.com/en-us/azure/cosmos-db/synapse-link
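If you still want the Change Feed / Function App route, note that the Cosmos DB API for MongoDB also exposes changes through MongoDB change streams, so a small worker process can watch a collection and append changes to ADLS without going through Event Grid. Below is a minimal, hedged sketch using pymongo and azure-storage-file-datalake; the connection strings, database, collection, and file-system names are placeholders, and Cosmos DB's Mongo API only surfaces insert/update/replace events through a restricted pipeline, so you still need one watcher per collection.

```python
# Minimal sketch: watch one Cosmos DB (MongoDB API) collection via change streams
# and append each change as a JSON line to a file in ADLS Gen2.
# All connection strings and names below are placeholders (assumptions).
from bson import json_util
from pymongo import MongoClient
from azure.storage.filedatalake import DataLakeServiceClient

COSMOS_CONN = "<cosmos-mongodb-connection-string>"
ADLS_CONN = "<adls-gen2-connection-string>"

collection = MongoClient(COSMOS_CONN)["mydb"]["mycollection"]

adls = DataLakeServiceClient.from_connection_string(ADLS_CONN)
file_client = adls.get_file_system_client("raw").get_file_client("cosmos-changes/mycollection.jsonl")
file_client.create_file()  # creates (or overwrites) the target file
offset = 0

# Cosmos DB's Mongo API only supports change streams with this restricted pipeline
# (insert/update/replace events; deletes are not surfaced).
pipeline = [
    {"$match": {"operationType": {"$in": ["insert", "update", "replace"]}}},
    {"$project": {"_id": 1, "fullDocument": 1, "ns": 1, "documentKey": 1}},
]

# One watcher is needed per collection; run one of these per collection you care about.
with collection.watch(pipeline, full_document="updateLookup") as stream:
    for change in stream:
        data = (json_util.dumps(change) + "\n").encode("utf-8")
        file_client.append_data(data, offset=offset, length=len(data))
        offset += len(data)
        file_client.flush_data(offset)
```

In practice you would batch the flushes and roll files by time or size, but the shape of the solution stays the same.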

Related

How to copy data from Cosmos DB API for MongoDB to another Cosmos DB account

How to copy data, one collection, from one Cosmos DB API for MongoDB account to another Cosmos DB API for MongoDB account, in another subscription, placed in another Azure region.
Preferably do it periodically.
You can use Azure Data Factory to easily copy a collection from one Cosmos DB API for MongoDB account to another Cosmos DB API for MongoDB account, in any other subscription and in any other Azure region, simply by using the Azure portal.
You need to deploy a few required components (Linked Services, Datasets, and a Pipeline with a Copy data activity) to accomplish this task.
Use the Azure Cosmos DB (MongoDB API) Linked Service to connect Azure Data Factory to your Cosmos DB Mongo API account. Refer to Create a linked service to Azure Cosmos DB's API for MongoDB using UI for more details and deployment steps.
Note: You need to deploy two Azure Cosmos DB (MongoDB API) Linked Services, one for the source account from which you copy the collection, and another for the destination account to which the data will be copied.
Create Datasets using the Linked Services created in the step above. A dataset connects you to a collection. Again, you need to deploy two datasets, one for the source collection and another for the destination collection.
Now create a pipeline with a Copy data activity.
In the Source and Sink tabs of the Copy data activity settings, select the source dataset and the sink dataset, respectively, that you created in step 2.
Now just publish the changes and click the Debug option to run the pipeline once. The pipeline will run and the collection will be copied to the destination.
If you want to run the pipeline periodically, you can create a trigger based on an event or a schedule. Check Create a trigger that runs a pipeline on a schedule for more details.
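If you later want to kick off the same pipeline from code rather than relying only on portal triggers, the Azure SDK for Python can start and monitor a run. This is just a sketch under assumed names (subscription, resource group, factory, and pipeline are placeholders) and it assumes the Linked Services, Datasets, and pipeline described above already exist:

```python
# Sketch: start an existing ADF pipeline run from Python and wait for it to finish.
# Subscription, resource group, factory, and pipeline names are placeholders (assumptions).
import time

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "rg-data"
FACTORY_NAME = "adf-cosmos-copy"
PIPELINE_NAME = "CopyMongoCollectionPipeline"

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Start one run; the same call can be scheduled from any external job runner
# if you would rather not use an ADF trigger.
run = adf_client.pipelines.create_run(RESOURCE_GROUP, FACTORY_NAME, PIPELINE_NAME)
print("Started pipeline run:", run.run_id)

# Poll the run until it leaves the Queued/InProgress states.
while True:
    status = adf_client.pipeline_runs.get(RESOURCE_GROUP, FACTORY_NAME, run.run_id)
    if status.status not in ("Queued", "InProgress"):
        print("Pipeline finished with status:", status.status)
        break
    time.sleep(15)
```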

Can't fetch data in cosmos DB

Hi, I am simulating data to send to Azure IoT. Getting sample data works fine, but when I use Stream Analytics with Cosmos DB, nothing is fetched into the data collection. Everything is empty.
I tried changing my connection (maybe it is my company's firewall), but nothing happens either.
I would like to get data into Cosmos DB via Stream Analytics because, as a next step, I want to use Power BI.
The Cosmos DB sample works,
but nothing shows up in Cosmos DB when I use Stream Analytics.
Make sure the Cosmos DB settings for the JSON output are specified correctly, including all the mandatory fields.
After defining the input, output, and query, click Start on the job; you should then find the output in the Cosmos DB documents.
For more details, check out "Azure Stream Analytics output to Azure Cosmos DB".
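If the job is running and still nothing appears, it can help to query the output container directly instead of relying on the portal's Data Explorer. A small sketch with the azure-cosmos SDK, where the endpoint, key, database, and container names are placeholders:

```python
# Quick check that the Stream Analytics job actually wrote documents to Cosmos DB.
# The endpoint, key, database, and container names are placeholders (assumptions).
from azure.cosmos import CosmosClient

client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<primary-key>")
container = client.get_database_client("iotdb").get_container_client("telemetry")

# Read back the most recently written documents.
items = container.query_items(
    query="SELECT TOP 10 * FROM c ORDER BY c._ts DESC",
    enable_cross_partition_query=True,
)
for item in items:
    print(item)
```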
Hope this helps.

CosmosDB Error while fetching page of documents: {"code":400,"body":"Command find failed: Unknown server error occurred when processing

I'm new to Cosmos DB and have used Data Factory to import some test data from a blob into a Cosmos DB container. The Monitor screen tells me it was successful. I then went to the Azure portal, opened my container and clicked 'Documents', but this does not show me any data. I then clicked the refresh button in the sub-pane (the one in the 'Load more' section) and it gave me the error:
Error while fetching page of documents:
{"code":400,"body":"Command find failed: Unknown server error occurred when processing this request.."}
I also could not find any good tutorials online or on YouTube that show, step by step, how to import a CSV from blob storage into the Cosmos DB document store via Data Factory, so I am unable to tell whether I am doing it correctly.
I contacted Microsoft. The response was: "Azure Data Factory loads data using the SQL API SDK, and does not support Mongo yet. The data loaded using the SQL API SDK would have to be in MongoDB BSON schema. Also, the MongoDB native driver expects the data in JSON schema and fails to deserialize, triggering the 400 error."
The MongoBulkExecutor API was recommended as an alternative, but from what I can tell this really requires JSON too.
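One workaround, if the goal is simply to get a CSV from Blob Storage into a Cosmos DB API for MongoDB collection, is to convert the rows to JSON documents yourself and insert them with the Mongo driver, bypassing the connector limitation entirely. A hedged sketch using azure-storage-blob and pymongo; the connection strings, container, blob, database, and collection names are all placeholders:

```python
# Workaround sketch: read a CSV from Blob Storage, turn each row into a JSON document,
# and bulk-insert the documents into a Cosmos DB (MongoDB API) collection.
# Connection strings, container, blob, database, and collection names are placeholders.
import csv
import io

from azure.storage.blob import BlobClient
from pymongo import MongoClient

BLOB_CONN = "<blob-storage-connection-string>"
COSMOS_CONN = "<cosmos-mongodb-connection-string>"

blob = BlobClient.from_connection_string(BLOB_CONN, container_name="imports", blob_name="test-data.csv")
csv_text = blob.download_blob().readall().decode("utf-8")

# DictReader yields one dict per CSV row, i.e. one JSON-style document per row.
rows = list(csv.DictReader(io.StringIO(csv_text)))

collection = MongoClient(COSMOS_CONN)["testdb"]["testcollection"]
if rows:
    result = collection.insert_many(rows, ordered=False)
    print(f"Inserted {len(result.inserted_ids)} documents")
```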
I also could not find any good tutorials online or on YouTube that show step by step how to import a CSV from blob storage into the Cosmos DB document store via Data Factory, so I am unable to tell if I am doing it correctly.
In fact, you should check the components below when you import a CSV from blob storage into Cosmos DB.
1. You already have created the blob storage Linked Service and Dataset.
2. You already have created the Cosmos DB Linked Service and Dataset.
You can do the above steps in the portal.
3. Create a Copy activity and set the blob storage dataset as the input and the Cosmos DB dataset as the output of the activity.
4. In addition, you need to know that the Cosmos DB SQL API and the Cosmos DB Mongo API are different APIs, even though both are called Cosmos DB. Based on the supported capabilities in the document Copy data to or from Azure Cosmos DB by using Azure Data Factory, the Azure Cosmos DB connector only supports copying data from and to the Azure Cosmos DB SQL API. So please don't confuse the two.
If you do want to use the Mongo API, you can use the MongoDB connector to do the job, as mentioned in this thread: https://social.msdn.microsoft.com/Forums/security/en-US/52cddbf7-c132-490c-9088-65a38f9b7200/copy-activity-to-cosmosdb-with-mongo-api?forum=AzureDataFactory.
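If you prefer to script those components instead of clicking through the portal, the same Linked Services can be created with the azure-mgmt-datafactory package. A rough sketch under assumed names; the subscription, resource group, factory, and connection strings are placeholders:

```python
# Sketch: create the Blob Storage (source) and Cosmos DB MongoDB API (sink) Linked
# Services from code. Resource names and connection strings are placeholders (assumptions).
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureBlobStorageLinkedService,
    CosmosDbMongoDbApiLinkedService,
    LinkedServiceResource,
)

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
RG, FACTORY = "rg-data", "adf-import"

# Source: the blob storage account holding the CSV.
blob_ls = LinkedServiceResource(
    properties=AzureBlobStorageLinkedService(connection_string="<blob-connection-string>")
)
adf_client.linked_services.create_or_update(RG, FACTORY, "BlobStorageLS", blob_ls)

# Sink: a Cosmos DB account exposed through the MongoDB API (handled by the MongoDB API
# connector, not the SQL API connector mentioned above).
cosmos_ls = LinkedServiceResource(
    properties=CosmosDbMongoDbApiLinkedService(
        connection_string="<cosmos-mongodb-connection-string>",
        database="testdb",
    )
)
adf_client.linked_services.create_or_update(RG, FACTORY, "CosmosMongoLS", cosmos_ls)
```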

Using Azure Data Factory, how to extract data from arrays in documents of DocumentDB to SQL Database

I need to extract arrays from documents in a DocumentDB and copy them to a SQL Database using Azure Data Factory.
I need to implement the same functionality as jsonNodeReference and jsonPathDefinition in "Sample 2: cross apply multiple objects with the same pattern from array" of this article:
https://learn.microsoft.com/en-us/azure/data-factory/data-factory-supported-file-and-compression-formats#json-format
According to the article you mention, File and compression formats supported by Azure Data Factory, it seems that extracting data from DocumentDB to SQL Database with the Azure Data Factory Copy Activity is not supported currently; the JSON format settings only apply to file-based connectors. We could give our feedback to the Azure documentation team.
This topic applies to the following connectors: Amazon S3, Azure Blob, Azure Data Lake Store, File System, FTP, HDFS, HTTP, and SFTP.
But we could also use custom activities in an Azure Data Factory pipeline, or an Azure WebJob/Function with customized logic, to do that. A rough sketch of such custom logic is shown after the links below.
Some related documents:
How to query Azure Cosmos DB resources
How to operate Azure SQL Database
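For the custom-activity / Function route, the flattening itself is straightforward: query the documents, emit one row per array element, and insert those rows into SQL. A hedged sketch in Python using azure-cosmos and pyodbc; the account, key, database, container, the orders array property, and the target table are all assumptions standing in for your actual schema:

```python
# Custom-activity-style sketch: read documents from Cosmos DB (SQL API), flatten the
# 'orders' array into one row per element, and insert the rows into Azure SQL Database.
# The account, key, names, array property, and target table are placeholders (assumptions).
import pyodbc
from azure.cosmos import CosmosClient

cosmos = CosmosClient("https://<account>.documents.azure.com:443/", credential="<primary-key>")
container = cosmos.get_database_client("salesdb").get_container_client("customers")

sql_conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};SERVER=<server>.database.windows.net;"
    "DATABASE=<database>;UID=<user>;PWD=<password>"
)
cursor = sql_conn.cursor()

# Equivalent of jsonNodeReference "$.orders": emit one SQL row per array element.
docs = container.query_items(
    query="SELECT c.id, c.orders FROM c",
    enable_cross_partition_query=True,
)
for doc in docs:
    for order in doc.get("orders", []):
        cursor.execute(
            "INSERT INTO dbo.CustomerOrders (CustomerId, OrderId, Amount) VALUES (?, ?, ?)",
            doc["id"], order.get("orderId"), order.get("amount"),
        )
sql_conn.commit()
```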

Synchronize data in SQLite / Ionic from Neo4j

I have an Ionic application that uses SQLite to store data. The data is retrieved from an HTTP REST service, which then connects to a Neo4j database. I need to be able to sync changed data or insert any new data into my SQLite database as the data changes on the Neo4j server. What is the best way to do this? Are there any existing frameworks? I am aware of PouchDB, but that doesn't really fit with what I am doing. I can't use local storage or any other in-memory storage as there could be a lot of data.
You may find these Neo4j data integration articles helpful:
Import Data Into Neo4j
Data Migration between MySQL and Neo4j
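Neither article covers ongoing synchronization directly. A common approach, absent a change feed on the Neo4j side, is to timestamp nodes when they change and have the client (or the REST service) periodically pull anything newer than its last sync. A rough Python sketch of that pattern follows; in an Ionic app the same logic would live in the TypeScript service layer, and the Item label and lastModified property are assumptions:

```python
# Rough sketch of timestamp-based incremental sync from Neo4j into a local SQLite cache.
# The URI, credentials, the Item label, and the lastModified property are assumptions.
import sqlite3
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://<neo4j-host>:7687", auth=("neo4j", "<password>"))
db = sqlite3.connect("local_cache.db")
db.execute(
    "CREATE TABLE IF NOT EXISTS items (id TEXT PRIMARY KEY, name TEXT, last_modified INTEGER)"
)

def sync_since(last_sync_ts: int) -> int:
    """Pull nodes changed after last_sync_ts, upsert them locally, return the newest timestamp seen."""
    newest = last_sync_ts
    with driver.session() as session:
        result = session.run(
            "MATCH (i:Item) WHERE i.lastModified > $since "
            "RETURN i.id AS id, i.name AS name, i.lastModified AS ts",
            since=last_sync_ts,
        )
        for record in result:
            db.execute(
                "INSERT INTO items (id, name, last_modified) VALUES (?, ?, ?) "
                "ON CONFLICT(id) DO UPDATE SET name = excluded.name, last_modified = excluded.last_modified",
                (record["id"], record["name"], record["ts"]),
            )
            newest = max(newest, record["ts"])
    db.commit()
    return newest
```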
