Azure Data Explorer: Continuous Export with Managed Identity - azure-data-explorer

I'm trying to set up a continuous export in Azure Data Explorer by following these instructions.
I run this query:
.create-or-alter continuous-export MyExport
over (table)
to table externaltable
with
(intervalBetweenRuns=1h,
forcedLatency=10m,
sizeLimit=104857600)
<| table
and get the following error
Error: continuous export to external tables with impersonation requires setting the 'managedIdentity' property in the continuous export configuration. See https://aka.ms/continuousExportWithManagedIdentity for more information.
The instructions say to do the following:
In order to use Continuous Export with Managed Identity, please add the AutomatedFlow usage to the Managed Identity policy
But I can't figure out how I'm supposed to do that. Is AutomatedFlow a role?

You need to specify which managed identity the continuous export should use: see the managedIdentity property in the .create continuous-export properties, and this link for more information about using managed identities in automated workflows.
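For illustration only, a sketch of both pieces: "system" stands in for whichever managed identity should run the export, and the exact AllowedUsages string should be checked against the linked documentation.
// Allow the identity to be used by automated flows such as continuous export
// (shown at the cluster level; a database-level policy also exists).
.alter-merge cluster policy managed_identity ```[
  {
    "ObjectId": "system",
    "AllowedUsages": "AutomatedFlows"
  }
]```
// Reference the same identity in the continuous-export definition.
.create-or-alter continuous-export MyExport
over (table)
to table externaltable
with
(intervalBetweenRuns=1h,
 forcedLatency=10m,
 sizeLimit=104857600,
 managedIdentity="system")
<| table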

Related

Publish features to Cosmos DB using Azure Databricks Feature Store Client fails on workspace with Unity Catalog enabled

We are trying to create an online feature store using Cosmos DB, following this documentation: https://learn.microsoft.com/en-us/azure/databricks/machine-learning/feature-store/publish-features .
But I get an error when I publish the table to Cosmos DB: AnalysisException: Catalog 'cosmoscatalog' not found. The issue only happens on Unity Catalog-enabled workspaces; I can publish from a non-Unity-enabled workspace.
P.S. If I create the table using the non-Unity-enabled workspace, then the Unity-enabled workspace can update the Cosmos DB data. But the Unity-enabled workspace cannot create the Cosmos container/database using fs.publish_table.
I tried the following code:
from databricks.feature_store.online_store_spec import AzureCosmosDBSpec
from databricks.feature_store.client import FeatureStoreClient
fs = FeatureStoreClient()
account_uri = "https://online-feature-store.documents.azure.com:443/"
# Specify the online store.
online_store_spec = AzureCosmosDBSpec(
    account_uri=account_uri,
    write_secret_prefix="secret/write-cosmos",
    read_secret_prefix="secret/read-cosmos",
    database_name="online_feature_store_example",
    container_name="feature_store_online_wine_features"
)
# Push the feature table to online store.
fs.publish_table("online_feature_store_example.wine_static_features", online_store_spec, mode='merge')
The code above works on workspaces without Unity Catalog enabled. However, on a Unity Catalog-enabled workspace, it throws an error: AnalysisException: Catalog 'cosmoscatalog' not found.
You need to create the database and container in Cosmos DB with the names you are specifying in AzureCosmosDBSpec.
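A minimal sketch of that step, assuming the azure-cosmos Python SDK and key-based auth; the partition key path is an assumption and needs to match whatever the Feature Store client expects.
from azure.cosmos import CosmosClient, PartitionKey

# Placeholder credential; in practice, pull the key from a secret scope.
client = CosmosClient(
    url="https://online-feature-store.documents.azure.com:443/",
    credential="<account-key>",
)

# Names must match database_name/container_name in AzureCosmosDBSpec.
database = client.create_database_if_not_exists(id="online_feature_store_example")
database.create_container_if_not_exists(
    id="feature_store_online_wine_features",
    partition_key=PartitionKey(path="/id"),  # assumption: adjust to the key the client expects
)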

How to set up a database on startup in AWS Amplify?

I have an Amplify app and I understand how to add an API function, use a Lambda layer, etc. What I don't see is how to create the database on startup: it appears from the documentation that this is done from a CloudFormation stack, but I still can't see how to ensure that the database is set up on startup of the app (or the tables are built if not) using something like SQLAlchemy.
What's the intended flow here?
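For context, the "build the tables if not" piece mentioned above usually reduces to an idempotent create_all call; a minimal sketch assuming SQLAlchemy, with a placeholder model and connection string (the Amplify/CloudFormation provisioning side is a separate question).
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Item(Base):
    __tablename__ = "items"              # placeholder table
    id = Column(Integer, primary_key=True)
    name = Column(String(255))

# Placeholder connection string; in Amplify this would come from whatever
# resources the CloudFormation stack provisions.
engine = create_engine("postgresql://user:password@host:5432/mydb")

# create_all only creates tables that do not already exist, so it is safe
# to call on every startup/cold start.
Base.metadata.create_all(engine)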

Store and alter a single variable on Vercel serverless functions

For a client I am building a static website rendered with Next.js and deployed on Vercel. Everything on this website is static, so I don't need any database. However, this client wants to use the Instagram API to show a gallery of their photos on two of their pages. This is a custom design, so I can't use any embed code, and to the best of my knowledge I have to use the Instagram Basic Display API.
To the problem at hand: I was wondering if there is some way to store a single variable in Vercel without creating a whole database for it. I know I can use environment variables, but the problem is that the Instagram API requires the access token to be refreshed every 2 months. To renew the access token for Instagram, I was planning to write a cron job that runs about every month to update this value.
I was wondering if it is possible to somehow store this single value on the deployed site without creating a database just for it. For example, is it somehow possible to change an environment variable from within a serverless function?
Any help in the right direction is appreciated!
Thanks
Go to Vercel: Settings -> Environment Variables -> add your variable. In this variable you can store your Instagram API value, and in the code you read it with process.env.{variable}
Example:
You define the variable name as instagramAPI in your local files (next.config.js or .env.local):
module.exports = {
  env: {
    instagramAPI: 'https://instagramapiexample.com'
  },
}
You define instagramAPI (exactly the same variable name as in the code) in your Vercel settings.
In your code (local files) you read process.env.instagramAPI to get the value of the string.
Your code works as expected.
!IMPORTANT! If you have secrets or passwords in your process.env variables, never save them in next.config.js. For this purpose, save your instagramAPI in .env.local (described in point 1). More info here.
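For illustration, reading that value from a serverless API route could look like the sketch below; the route name and the request are made up, and only process.env.instagramAPI comes from the setup above.
// pages/api/instagram.js - illustrative only
export default async function handler(req, res) {
  // Resolved at build time from next.config.js / the Vercel project settings.
  const instagramAPI = process.env.instagramAPI;
  const response = await fetch(instagramAPI); // placeholder request
  const data = await response.json();
  res.status(200).json(data);
}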

GCP encryption through Beam / Dataflow APIs for BigQuery and Cloud SQL

Context: We are trying to load some CSV-format data into GCP BigQuery using GCP Dataflow (Apache Beam). As part of this, we are creating the BQ tables for the first time (for each table) through the BigQueryIO API. One of the customer requirements is that the data on GCP needs to be encrypted using customer-supplied/managed encryption keys.
Problem Statement: We are not able to find any way to specify the custom encryption keys through the APIs while creating tables. The GCP documentation details how to specify custom encryption keys through the BigQuery console, but we could not find anything about specifying them through APIs from within Dataflow code.
Code Snippet:
String tableSpec = new StringBuilder().append(PipelineConstants.PROJECT_ID).append(":")
.append(dataValue.getKey().target_dataset).append(".").append(dataValue.getKey().target_table_name)
.toString();
ValueProvider<String> valueProvider = StaticValueProvider.of("gs://bucket/folder/");
dataValue.getValue().apply(Count.globally()).apply(ParDo.of(new RowCount(dataValue.getKey())))
.apply(ParDo.of(new SourceAudit(runId)));
dataValue.getValue().apply(ParDo.of(new PreProcessing(dataValue.getKey())))
.apply(ParDo.of(new FixedToDelimited(dataValue.getKey())))
.apply(ParDo.of(new CreateTableRow(dataValue.getKey(), runId, timeStamp)))
.apply(BigQueryIO.writeTableRows().to(tableSpec)
.withSchema(CreateTableRow.getSchema(dataValue.getKey()))
.withCustomGcsTempLocation(valueProvider)
.withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
.withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND));
Query: Could anybody let us know:
Whether it is possible to provide an encryption key through the Beam API?
If it's not possible with the current version, what could be a possible workaround?
Kindly let us know if additional information is required.
Customer-supplied encryption keys are a new feature; not all libraries have been updated to support them yet.
If you know the table name in advance, you can use the UI/CLI or the API to create the table, then run your normal flow to load data into it. That might be a workaround for you.
https://cloud.google.com/bigquery/docs/customer-managed-encryption#create_table
API to create table: https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/insert
You need to set this section on the table object:
"encryptionConfiguration": {
"kmsKeyName": string
}
More details on table: https://cloud.google.com/bigquery/docs/reference/rest/v2/tables#resource
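If upgrading Beam is an option, later releases of the Java SDK added a KMS key setting directly on the BigQueryIO sink; whether withKmsKey is available depends on the Beam version in use, so treat the following as a sketch (the key resource name is a placeholder) mirroring the snippet from the question.
dataValue.getValue().apply(ParDo.of(new CreateTableRow(dataValue.getKey(), runId, timeStamp)))
        .apply(BigQueryIO.writeTableRows().to(tableSpec)
                .withSchema(CreateTableRow.getSchema(dataValue.getKey()))
                .withCustomGcsTempLocation(valueProvider)
                .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
                .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
                // Tables created by the sink use this customer-managed key;
                // placeholder KMS resource name below.
                .withKmsKey("projects/my-project/locations/us/keyRings/my-ring/cryptoKeys/my-key"));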

Firebase logging best practice

I'm looking for a way to log everything that is written to a Firebase database. For now, I'm using a few Firebase functions that simply print a diff between the old values and the new ones. However, I'm not sure the Firebase Functions logs screen is the best tool in this situation. Do you have any recommendations?
I had a similar problem, and I ended up adding database triggers that send data to another Firebase database (using the production one is not recommended), which stores the logs in the following form (a sketch of such a trigger follows the structure below):
logs: {
  myFunction: {
    '10.01.2018': {
      debug: 'Some logging here',
      error: 'Some error here'
    }
  }
}
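A hedged sketch of that kind of trigger, assuming the firebase-functions v1 Realtime Database API and firebase-admin; the app name, database URL, and paths are placeholders rather than the actual setup.
const functions = require('firebase-functions');
const admin = require('firebase-admin');

admin.initializeApp(); // default app: the database that fires the trigger

// Second app pointing at the dedicated logging database.
const loggingApp = admin.initializeApp(
  { databaseURL: 'https://my-logging-project.firebaseio.com' }, // placeholder
  'logging'
);

exports.myFunction = functions.database.ref('/items/{itemId}')
  .onWrite((change, context) => {
    // Mirrors the logs/<functionName>/<date> layout shown above.
    const dateKey = new Date().toISOString().slice(0, 10);
    return loggingApp.database()
      .ref(`logs/myFunction/${dateKey}`)
      .push({
        debug: `items/${context.params.itemId} changed`,
        before: change.before.val(),
        after: change.after.val(),
      });
  });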
I found a way.
I'm still using Firebase functions to log a diff of the previous and current values of objects written to my database.
I'm now using Stackdriver Logging from Google Cloud Platform to visualize the logs, and this tool is what I was looking for.
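If the goal is simply to get those diffs into Stackdriver (now Cloud Logging) rather than into a second database, a structured console.log from the trigger is enough; a minimal sketch, with an illustrative database path.
const functions = require('firebase-functions');

exports.logDiff = functions.database.ref('/items/{itemId}')
  .onWrite((change, context) => {
    // Structured entries show up in Cloud Logging, where they can be
    // filtered by function name or payload fields.
    console.log('diff', {
      path: `items/${context.params.itemId}`,
      before: change.before.val(),
      after: change.after.val(),
    });
    return null;
  });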
