What is the best way to integrate a Microsoft Teams operator in Apache Airflow?

Currently I am using Apache Airflow to send files through email, but I want to send the files to Microsoft Teams instead.
My Airflow version: 1.10.15
Ideally I would like something like triggering DAGs through Teams and getting a response/data back.
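There is no Teams operator bundled with Airflow 1.10.15, so one common workaround (not from the question) is to post to a Microsoft Teams incoming webhook from a PythonOperator, for example to notify a channel when a DAG finishes and include a link to the file. A minimal sketch, assuming an incoming webhook has been configured for the channel (TEAMS_WEBHOOK_URL is a placeholder):

```python
import requests
from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from airflow.utils.dates import days_ago

# Placeholder: the incoming-webhook URL configured for your Teams channel.
TEAMS_WEBHOOK_URL = "https://outlook.office.com/webhook/..."

def notify_teams(**context):
    # Teams incoming webhooks accept a simple JSON payload with a "text" field.
    payload = {"text": "DAG {} finished at {}".format(
        context["dag"].dag_id, context["ts"])}
    response = requests.post(TEAMS_WEBHOOK_URL, json=payload)
    response.raise_for_status()

with DAG("teams_notification_example",
         schedule_interval=None,
         start_date=days_ago(1)) as dag:
    PythonOperator(
        task_id="notify_teams",
        python_callable=notify_teams,
        provide_context=True,  # needed on Airflow 1.10.x to pass the context kwargs
    )
```

Incoming webhooks only carry messages/cards, not file uploads, so a link to the file (e.g. a GCS or SharePoint URL) is usually posted instead of the file itself.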

Related

How to get the state of a remote job in Livy using Java API

Is it possible to monitor the state of an already running remote job in Livy with the Java API? How can this be done?
I looked over the Livy Java API docs. A JobHandle would let me poll the state of the app. However, the only way I can see to obtain one is via LivyClient.submit, while I need a handle to a job that was already submitted outside of the Java code. I'm afraid the Java API doesn't support this and making REST calls from the Java code is the only option. If anyone has found a way to get information about batch jobs via the Livy Java API, please show the code used.
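For reference, the batch state that the Java client API doesn't expose is available over Livy's REST interface. A minimal sketch (shown in Python for brevity; the same GET call works from any Java HTTP client; LIVY_URL and BATCH_ID are placeholders):

```python
import requests

# Placeholders for illustration; point these at your Livy server and batch id.
LIVY_URL = "http://livy-server:8998"
BATCH_ID = 42

# Livy exposes the state of an already-submitted batch job over REST:
# GET /batches/{batchId}/state returns e.g. {"id": 42, "state": "running"}
resp = requests.get("{}/batches/{}/state".format(LIVY_URL, BATCH_ID))
resp.raise_for_status()
print(resp.json()["state"])
```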

Schedule Rscript in a server

Currently I am scheduling a daily .bat file from my laptop with Task Scheduler on Windows, and I would like to run it automatically on a server. The current situation is:
I read some SQLite DBs that I have stored locally.
Once I read them, I do some web scraping based on that information.
I store this new information in the DBs mentioned above.
I've read that it is possible to do this with Amazon EC2 (REF: http://www.louisaslett.com/RStudio_AMI/), but it doesn't mention anything related to a local SQLite DB.
Could you give me some recommendations about how to manage this and which tool would be the best approach (Azure, AWS, Google BigQuery)?
You could build a Cloud Run task to do this, then schedule it using Cloud Scheduler:
https://cloud.google.com/run/docs/triggering/using-scheduler
You can find samples of how to build a Cloud Run task in the docs:
https://cloud.google.com/run/docs
To write to BigQuery, you can use the BigQuery API client libraries: https://cloud.google.com/bigquery/docs/reference/libraries
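A minimal sketch of the BigQuery step with the Python client library (the table id and row fields are placeholders; assumes google-cloud-bigquery is installed and application-default credentials are available):

```python
from google.cloud import bigquery

client = bigquery.Client()

# Placeholder table in the form project.dataset.table.
table_id = "my-project.my_dataset.scraped_data"

rows = [
    {"url": "https://example.com/page", "price": 19.99},  # placeholder scraped row
]

# insert_rows_json streams the rows into the table; it returns a list of
# per-row errors, which is empty on success.
errors = client.insert_rows_json(table_id, rows)
if errors:
    raise RuntimeError("BigQuery insert failed: {}".format(errors))
```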

Using kubernetes-secrets with Google Composer

Is it possible to use kubernetes-secrets together with Google Composer in order to access secrets from Airflow workers?
We are using k8s secrets with our existing standalone k8s Airflow cluster and were hoping we can achieve the same with Google Composer.
By default, Kubernetes secrets are not exposed to the Airflow workers deployed by Cloud Composer. You can patch the deployments to add them (airflow-worker and airflow-scheduler), but there will be no guarantee that they won't be reverted if you perform an update on the environment (such as configuration update or in-place upgrade).
It's probably easiest to use an Airflow connection (which is encrypted in the metadata database using Fernet), or to launch new pods using the KubernetesPodOperator/GKEPodOperator and mount the relevant secrets into the pod at launch.
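A minimal sketch of the second option, exposing a Kubernetes secret to the launched pod as an environment variable via the KubernetesPodOperator (the secret name and key are placeholders; imports use the Airflow 1.10.x contrib paths):

```python
from airflow import DAG
from airflow.contrib.kubernetes.secret import Secret
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator
from airflow.utils.dates import days_ago

# Placeholder: expose key "password" of the Kubernetes secret "my-k8s-secret"
# inside the launched pod as the environment variable DB_PASSWORD.
secret_env = Secret(
    deploy_type="env",
    deploy_target="DB_PASSWORD",
    secret="my-k8s-secret",
    key="password",
)

with DAG("pod_with_secret_example",
         schedule_interval=None,
         start_date=days_ago(1)) as dag:
    KubernetesPodOperator(
        task_id="use_secret",
        name="use-secret",
        namespace="default",
        image="python:3.7-slim",
        cmds=["python", "-c", "import os; print('DB_PASSWORD' in os.environ)"],
        secrets=[secret_env],
    )
```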
Kubernetes secrets are available to the Airflow workers. You can contribute the components for whatever API you wish to call so that they work natively in Airflow, letting the credentials be stored as a Connection in Airflow's metadata database, which is encrypted at rest. Using an Airflow connection involves storing the secret key in GCS with an appropriate ACL and setting up Composer to secure the connection.
You can also write your own custom operator to access a secret in Kubernetes and use it. Take a look at SimpleHttpOperator; that pattern can be applied to any arbitrary secret-management scheme, and it covers scenarios that access external services which aren't explicitly supported by Airflow connections, hooks, and operators.
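A minimal sketch of that pattern with the SimpleHttpOperator, assuming an HTTP connection named my_api_conn has been created in the Airflow UI (the connection id and endpoint are placeholders):

```python
from airflow import DAG
from airflow.operators.http_operator import SimpleHttpOperator
from airflow.utils.dates import days_ago

with DAG("call_external_api_example",
         schedule_interval=None,
         start_date=days_ago(1)) as dag:
    SimpleHttpOperator(
        task_id="call_api",
        http_conn_id="my_api_conn",  # host and credentials live in the Connection,
                                     # stored encrypted in Airflow's metadata DB
        endpoint="v1/resource",      # placeholder endpoint
        method="GET",
    )
```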
I hope it helps.

Cloud Composer airflow webserver issue with KMS

I'm attempting to use the KMS library in one of my DAGs, which runs a PythonOperator, but I'm encountering an error in the Airflow webserver:
details = "Cloud Key Management Service (KMS) API has not been used in project 'TENANT_PROJECT_ID' before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/cloudkms.googleapis.com/overview?project='TENANT_PROJECT_ID' then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry."
The Airflow webserver is unable to import my specific DAG from my host project into the tenant project (which is where the webserver runs). The DAG runs with no problem because my host project is set up correctly, but not being able to monitor it in the UI is a huge drawback.
System specifications:
softwareConfig:
  imageVersion: composer-1.8.2-airflow-1.10.3
  pypiPackages:
    google-cloud-kms: ==1.2.1
  pythonVersion: '3'
It would be nice to be able to leverage both KMS and the Airflow UI; if not, I might have to add my secrets to Cloud Composer environment variables (which is not preferred).
Are there any known solutions for this?
The Airflow webserver is a managed component in Cloud Composer, so, as others have stated, it runs in a tenant project that you (as the environment owner) do not have access to. There is currently no way to access this project.
If you have a valid use case for enabling extra APIs in the tenant project, I'd recommend submitting product feedback. You can find out how to do that in the product's public documentation (including how to submit a feature request to the issue tracker).
Alternatively, if you're willing to experiment, AIP-24 is an Airflow proposal called DAG database persistence that caches serialized DAGs in the Airflow database instead of having the webserver parse/import them (which is why the webserver needs KMS in this situation). If you're using Composer 1.8.1+, you can experimentally enable the feature by setting core.store_serialized_dags=True. Note that it's not guaranteed to work for all DAGs, but it may be useful to you here.

How to allow access to a Compute Engine VM in Airflow (Google Cloud Composer)

I am trying to run a bash command of the form ssh user@host "my bash command" using the BashOperator in Airflow. This works locally because I have my public key on the target machine.
But I would like to run this command in Google Cloud Composer, which is Airflow + Google Kubernetes Engine. I understand that Airflow's core components run in 3 pods named following the pattern airflow-worker-xxxxxxxxx-yyyyy.
A naive solution was to create an SSH key for each pod and add its public key to the target machine in Compute Engine. That solution worked until today: somehow my 3 pods changed, so my SSH keys are gone. It was definitely not the best solution.
I have 2 questions:
Why has Google Cloud Composer changed my pods?
How can I resolve my issue?
Pod restarts are not specific to Composer. I would say this is more related to Kubernetes itself:
Pods aren't intended to be treated as durable entities.
So in general, pods can be restarted for different reasons, and you shouldn't rely on any changes that you make to them.
How can I resolve my issue ?
You can solve this by taking advantage of the fact that Cloud Composer creates a Cloud Storage bucket and links it to your environment. You can access the different folders of this bucket from any of your workers, so you could store your key (a single key pair is enough) in "gs://bucket-name/data", which is accessible through the mapped directory "/home/airflow/gcs/data". Docs here
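A minimal sketch of that setup with the BashOperator, assuming the private key has been uploaded to the environment bucket's data/ folder (the key file name, user, and host below are placeholders):

```python
from airflow import DAG
from airflow.operators.bash_operator import BashOperator
from airflow.utils.dates import days_ago

# The data/ folder of the Composer bucket is mounted on every worker at
# /home/airflow/gcs/data, so the key survives pod restarts.
KEY_PATH = "/home/airflow/gcs/data/composer_ssh_key"  # placeholder file name

with DAG("ssh_to_vm_example",
         schedule_interval=None,
         start_date=days_ago(1)) as dag:
    BashOperator(
        task_id="run_remote_command",
        # ssh refuses keys with permissive permissions, so copy the key off the
        # mounted volume and chmod it before using it. User/host are placeholders.
        bash_command=(
            "cp {key} /tmp/ssh_key && chmod 600 /tmp/ssh_key && "
            "ssh -i /tmp/ssh_key -o StrictHostKeyChecking=no "
            "user@10.0.0.2 'my bash command'".format(key=KEY_PATH)
        ),
    )
```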
