How to setup AWS KMS on Airflow?

How to setup AWS KMS on Airflow? - airflow

Can you please advice if Airflow supports AWS KMS server side encryption? If yes, is there any documentation on how to setup? I am using Airflow 1.9.0 version.
I tried with creating s3 connection with extra args like -
{"aws_access_key_id":"xx", "aws_secret_access_key": "xx", "sse": "aws:kms", "sse-kms-key-id": "xx"}
and used s3 hook to upload a file in the code but it is throwing this error -
an error occurred (accessdenied) when calling the createmultipartupload operation access denied
Where as s3 cp from command line did work!
aws s3 cp test.txt s3://xxx/xx/test.txt --sse aws:kms --sse-kms-key-id "xx"
upload: ./test.txt to s3://xxx/xx/test.txt
thanks in advance.

Related

AWS credentials not found for celery-k8s deployment

I'm trying to run dagster using celery-k8s and using the examples/celery-k8s as a start. upon running the pipeline from playground I get
Initialization of resources [s3, io_manager] failed.
botocore.exceptions.NoCredentialsError: Unable to locate credentials
I have configured aws credentials in env variables as mentioned in the document
deployments:
- name: "user-code-deployment-test"
image:
repository: "somasays/dagster-usercode-example"
tag: "0.5"
pullPolicy: Always
dagsterApiGrpcArgs:
- "-f"
- "/workspace/repo.py"
port: 3030
env:
AWS_ACCESS_KEY_ID: AAAAAAAAAAAAAAAAAAAAAAAAA
AWS_SECRET_ACCESS_KEY: qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
AWS_DEFAULT_REGION: eu-central-1
and I can also see these values are set in the env variables of the pod and can also access the s3 location after pip install awscli and aws s3 ls see the screenshot below the job pod however throws Unable to locate credentials
Please help

The deployment configuration applies to the user code servers. Meanwhile the celery executor runs your pipeline code in separate kubernetes jobs. To provide your secrets there, you will want to configure the env_secrets field of the celery-k8s executor in your pipeline run config.
See https://github.com/dagster-io/dagster/blob/master/python_modules/libraries/dagster-k8s/dagster_k8s/job.py#L321-L327 for details on the config.

Showing error while mounting EFS to an instance in my Elastic Beanstalk environment

Followed the following procedure for attaching the EFS file to instances created using EB:
https://aws.amazon.com/premiumsupport/knowledge-center/elastic-beanstalk-mount-efs-volumes/#:~:text=In%20an%20Elastic%20Beanstalk%20environment,scale%20up%20to%20multiple%20instances.
But the logs of Elastic Beanstalk are showing following error:
[Instance: i-06593*****] Command failed on instance. Return code: 1 Output: (TRUNCATED)...fs ... mount -t efs -o tls fs-d9****:/ /efs Failed to resolve "fs-d9****.efs.us-east-1.amazonaws.com" - check that your file system ID is correct. See https://docs.aws.amazon.com/console/efs/mount-dns-name for more detail. ERROR: Mount command failed!. command 01_mount in .ebextensions/storage-efs-mountfilesystem.config failed. For more detail, check /var/log/eb-activity.log using console or EB CLI.
Just used **** in EFS ID for security.

Based on the comments.
The solution was to create new EFS filesystem, instead of using the original one.

Airflow (LocalExecutor) - Docker :: Job is failing with Log file does not exist

Airflow version: 1.10.9
Executor : LocalExecutor
Docket Setup
when job runs sometime we are getting following error. I have searched in web, many people faced this issue in celeryExecutor but we are using LocalExecutor(Docker setup). How can I resolve this problem?
*** Log file does not exist: /home/ubuntu/airflow/airflow/logs/es_update_relevance_score/es_update_relevance_score/2020-05-14T16:26:06.062416+00:00/1.log
*** Fetching from: http://:8793/log/es_update_relevance_score/es_update_relevance_score/2020-05-14T16:26:06.062416+00:00/1.log
*** Failed to fetch log file from worker. Invalid URL 'http://:8793/log/es_update_relevance_score/es_update_relevance_score/2020-05-14T16:26:06.062416+00:00/1.log': No host supplied

Here is one approach I've seen when running the scheduler and webserver in their own containers and using LocalExecutor:
Mount a host log directory as a volume into both the scheduler and webserver containers:
volumes:
- /location/on/host/airflow/logs:/opt/airflow/logs
Make sure the user within the airflow containers (usually airflow) has permissions to read and write that directory. If the permissions are wrong you will see an error like the one in your post.
This probably won't scale beyond LocalExecutor usage though.

Task fails due to not being able to read log file

Composer is failing a task due to it not being able to read a log file, it's complaining about incorrect encoding.
Here's the log that appears in the UI:
*** Unable to read remote log from gs://bucket/logs/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log
*** 'ascii' codec can't decode byte 0xc2 in position 6986: ordinal not in range(128)
*** Log file does not exist: /home/airflow/gcs/logs/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log
*** Fetching from: http://airflow-worker-68dc66c9db-x945n:8793/log/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='airflow-worker-68dc66c9db-x945n', port=8793): Max retries exceeded with url: /log/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f1c9ff19d10>: Failed to establish a new connection: [Errno -2] Name or service not known',))
I try viewing the file in the google cloud console and it also throws an error:
Failed to load
Tracking Number: 8075820889980640204
But I am able to download the file via gsutil.
When I view the file, it seems to have text overriding other text.
I can't show the entire file but it looks like this:
--------------------------------------------------------------------------------
Starting attempt 1 of 1
--------------------------------------------------------------------------------
#-#{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
[2019-08-04 10:01:23,313] {models.py:1569} INFO - Executing <Task(BigQueryOperator): merge_campaign_exceptions> on 2019-08-03T10:00:00+00:00#-#{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
[2019-08-04 10:01:23,314] {base_task_runner.py:124} INFO - Running: ['bash', '-c', u'airflow run __campaign_exceptions_0_0_1 merge_campaign_exceptions 2019-08-03T10:00:00+00:00 --job_id 22767 --pool _bq_pool --raw -sd DAGS_FOLDER//-campaign-exceptions.py --cfg_path /tmp/tmpyBIVgT']#-#{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
[2019-08-04 10:01:24,658] {base_task_runner.py:107} INFO - Job 22767: Subtask merge_campaign_exceptions [2019-08-04 10:01:24,658] {settings.py:176} INFO - setting.configure_orm(): Using pool settings. pool_size=5, pool_recycle=1800#-#{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
Where the #-#{} pieces seems to be "on top of" the typical log.

I faced the same problem. In my case the problem was that I removed the google_gcloud_default connection that was being used to retrieve the logs.
Check the configuration and look for the connection name.
[core]
remote_log_conn_id = google_cloud_default
Then check the credentials used for that connection name has the right permissions to access the GCS bucket.

I'm having a similar problem with viewing logs in GCP Cloud Composer. It doesn't appear to be preventing the failing DAG task from running though. What it looks like is a permissions error between the GKE and Storage Bucket where the log files are kept.
You can still view the logs by going into your cluster's storage bucket in the same directory as your /dags folder where you should also see a logs/ folder.

Your helm chart should setup global env:
- name: AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT
value: "google-cloud-platform://"
Then, you should deploy a Dockerfile with root account only (not airflow account), additionaly, you set up your helm uid, gid as:
uid: 50000 #airflow user
gid: 50000 #airflow group
Then upgrade helm chart with new config

*** Unable to read remote log from gs://bucket
1)Found the solution after assigning the roles to the service account
2)The SA key(json or txt) to be added and configured to the connection in the
remote_log_conn_id = google_cloud_default
3)restart the scheduler and webserver of the airflow
4)restart the dags on the airflow
you can find the logs on the GCS bucket where its configured

Issue with connecting Golang application on Cloud Run with Firestore

I try to get all Documents from Firestore using the below function.
The credentials are stored in an encrypted file in a GCP Cloud Source repository.
I decrypted the configuration in the Cloud Build trigger and set the ENV in the Dockerfile pointing to the file. I see the content by RUN ls /app/credentials.json.
The error I get in the application log:
rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: authentication handshake failed: x509: certificate signed by unknown authority"
The credentials are stored in an encrypted file in a GCP Cloud Source repository.
I decrypted the configuration in the Cloud Build trigger and set the ENV in the Dockerfile pointing to the file. I see the content by RUN ls /app/credentials.json.
The error I get in the application log:
rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: authentication handshake failed: x509: certificate signed by unknown authority"

This error is the result of an HTTPS failure where the certificate cannot be verified. The Alpine base image is missing a package that provides root certificates. Currently the Cloud Run quickstart is missing this for at least the Go language.
Assuming this is your problem, add the following to the final stage of your Dockerfile:
RUN apk add --no-cache ca-certificates