I deployed Airflow on Kubernetes using Helm, with the stable/airflow chart.
I am also using this image:
image:
  repository: apache/airflow
  tag: latest
  ## values: Always or IfNotPresent
  pullPolicy: IfNotPresent
  pullSecret: ""
And I added these two variables to the environment variables to change the timezone:
AIRFLOW__CORE__DEFAULT_TIMEZONE: America/New_York
AIRFLOW__WEBSERVER__DEFAULT_UI_TIMEZONE: America/New_York
But when I open the Airflow UI, it still shows the UTC timezone.
However, on the Airflow configuration page I can see that Airflow did pick up the timezone for the UI:
webserver | default_ui_timezone | America/New_York | env var
TL;DR: Airflow on Kubernetes reads the timezone configuration from the environment variables but still displays the wrong timezone.
Any ideas everyone?
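For reference, a minimal sketch of how those variables could be declared in the chart's values.yaml. The airflow.config map below is an assumption about how this chart exposes AIRFLOW__* settings; the underlying mechanism (AIRFLOW__SECTION__KEY environment variables overriding airflow.cfg) is standard Airflow behaviour.
# values.yaml (sketch; assumes the chart accepts AIRFLOW__* settings under
# an `airflow.config` map and renders them as environment variables on every
# Airflow pod: webserver, scheduler, and workers)
airflow:
  config:
    AIRFLOW__CORE__DEFAULT_TIMEZONE: "America/New_York"
    AIRFLOW__WEBSERVER__DEFAULT_UI_TIMEZONE: "America/New_York"
Setting them through the chart rather than on a single component helps ensure the webserver, scheduler, and workers all see the same values.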
Airflow was working fine for several weeks, then suddenly started throwing errors a few days ago.
DAGs fail randomly with this error:
Log file does not exist: airflow_path/1.log
Fetching from: http://:8793/airflow_path/1.log
*** Failed to fetch log file from worker. The request to ':///' is missing either an 'http://
I had a similar issue. In my case the worker node (I was using the Celery Executor) was exhausted and therefore unable to execute any DAGs. Can you check the CPU and memory utilization of the worker node (or its equivalent if you are not using the Celery Executor)?
You can try increasing the CPU and memory for that node and retry.
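If the workers run on Kubernetes, this is roughly what giving them more headroom could look like in a Helm values file; the workers.resources key and the numbers are illustrative assumptions, not values taken from the question.
# values.yaml (sketch; assumes a chart that exposes resource requests/limits
# for the Celery workers under a `workers` key, adjust to your own setup)
workers:
  resources:
    requests:
      cpu: "500m"
      memory: "2Gi"
    limits:
      cpu: "1000m"
      memory: "4Gi"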
Happened to me as well using LocalExecutor and an Airflow setup on Docker Compose. Eventually, I figured that the webserver would fail to fetch old logs whenever I recreated my Docker containers. Digging deeper, I realized that the webserver was failing to fetch the logs because it didn't have access to the filesystem of the scheduler (where the logs live).
The fix was to ensure that both the scheduler and the webserver services in docker-compose.yml share a volume with the logs, i.e.:
# docker-compose.yml
version: "3.9"
services:
  scheduler:
    image: ...
    volumes:
      - airflow_logs:/airflow/logs
    ...
  webserver:
    image: ...
    volumes:
      - airflow_logs:/airflow/logs
    ...
volumes:
  airflow_logs:
I'm running Airflow in a container, as described here. It seems that the configuration file airflow.cfg on the host has no impact on Airflow. I tried the solution here, but it didn't help.
The configuration fields I changed are:
default_timezone = system #(from utc)
load_examples = False #(from True)
base_url = http://localhost:8081 #(from 8080)
default_ui_timezone = system #(from UTC)
I didn't see any impact on Airflow, even though I did docker-compose down and docker-compose up.
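For what it's worth, Airflow also reads configuration from environment variables of the form AIRFLOW__SECTION__KEY, and those take precedence over airflow.cfg inside the containers. Below is a sketch of the same four settings expressed that way in docker-compose.yml; the service name and layout are assumptions, not the exact file from the linked setup.
# docker-compose.yml (sketch; AIRFLOW__<SECTION>__<KEY> variables override
# whatever airflow.cfg the containers end up using)
services:
  webserver:
    environment:
      AIRFLOW__CORE__DEFAULT_TIMEZONE: system
      AIRFLOW__CORE__LOAD_EXAMPLES: "False"
      AIRFLOW__WEBSERVER__BASE_URL: http://localhost:8081
      AIRFLOW__WEBSERVER__DEFAULT_UI_TIMEZONE: system
If the compose setup already sets any of these variables, that alone would explain why editing the host's airflow.cfg appears to have no effect.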
I'm trying to run Dagster using celery-k8s, using examples/celery-k8s as a starting point. Upon running the pipeline from the playground I get:
Initialization of resources [s3, io_manager] failed.
botocore.exceptions.NoCredentialsError: Unable to locate credentials
I have configured AWS credentials in environment variables as mentioned in the documentation:
deployments:
  - name: "user-code-deployment-test"
    image:
      repository: "somasays/dagster-usercode-example"
      tag: "0.5"
      pullPolicy: Always
    dagsterApiGrpcArgs:
      - "-f"
      - "/workspace/repo.py"
    port: 3030
    env:
      AWS_ACCESS_KEY_ID: AAAAAAAAAAAAAAAAAAAAAAAAA
      AWS_SECRET_ACCESS_KEY: qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
      AWS_DEFAULT_REGION: eu-central-1
I can also see that these values are set in the environment variables of the pod, and I can access the S3 location after pip install awscli and aws s3 ls (see the screenshot below). The job pod, however, throws Unable to locate credentials.
Please help.
The deployment configuration applies to the user code servers. Meanwhile, the Celery executor runs your pipeline code in separate Kubernetes jobs. To provide your secrets there, you will want to configure the env_secrets field of the celery-k8s executor in your pipeline run config.
See https://github.com/dagster-io/dagster/blob/master/python_modules/libraries/dagster-k8s/dagster_k8s/job.py#L321-L327 for details on the config.
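A minimal sketch of what that could look like in the playground run config, assuming env_secrets accepts a list of Kubernetes Secret names; the secret name aws-credentials is hypothetical and would need to contain the AWS_* keys.
# run config (sketch; `aws-credentials` is a hypothetical Kubernetes Secret
# holding AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION)
execution:
  celery-k8s:
    config:
      env_secrets:
        - aws-credentials
The secrets listed there should then be exposed to the step job pods as environment variables, which is where botocore looks for credentials.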
I want to create a Mongo connection (other than default) without using the Airflow UI.
I read the following in the Airflow documentation:
Connections in Airflow pipelines can be created using environment
variables. The environment variable needs to have a prefix of
AIRFLOW_CONN_ for Airflow with the value in a URI format to use the
connection properly.
When referencing the connection in the Airflow pipeline, the conn_id
should be the name of the variable without the prefix. For example, if
the conn_id is named postgres_master the environment variable should
be named AIRFLOW_CONN_POSTGRES_MASTER (note that the environment
variable must be all uppercase).
I tried to apply this when using the Puckel docker image.
This is a Docker Compose file using that image:
version: '2.1'
services:
  postgres:
    image: postgres:9.6
    environment:
      - POSTGRES_USER=airflow
      - POSTGRES_PASSWORD=airflow
      - POSTGRES_DB=airflow
  webserver:
    image: puckel/docker-airflow:1.10.6
    restart: always
    depends_on:
      - postgres
    environment:
      - LOAD_EX=n
      - EXECUTOR=Local
      - AIRFLOW_CONN_MY_MONGO=mongodb://mongo:27017
    volumes:
      - ./src/:/usr/local/airflow/dags
      - ./requirements.txt:/requirements.txt
    ports:
      - "8080:8080"
    command: webserver
    healthcheck:
      test: ["CMD-SHELL", "[ -f /usr/local/airflow/airflow-webserver.pid ]"]
      interval: 30s
      timeout: 30s
      retries: 3
Note the line AIRFLOW_CONN_MY_MONGO=mongodb://mongo:27017 where I'm passing the environment variable as the Airflow documentation suggests.
The problem is that no my_mongo connection is created when I list the connections in the UI.
Any advice? Thanks!
The connection won't be listed in the UI when you create it with an environment variable.
Reason:
Airflow supports creating connections via environment variables for ad-hoc jobs in DAGs.
The connections shown in the UI are actually saved in the metadata DB and retrieved from it; the ones created via environment variables are not stored in the DB.
How do I test my connection?
Create a sample DAG and use your connection to run a sample job. It should work fine.
I read a Puckel issue where they mention that the connection is created but not shown in the UI. I tested it, and indeed the connection works when used in a DAG.
I'm trying to follow along with this blog about using Docker with R.
I followed the basic Docker setup steps and am able to run the hello-world image.
I'm on an old 2009 Mac and had to use Docker Toolbox.
I'm in a place with weak internet connection and am using a personal hotspot.
Each time I try to run docker run --rm -p 8787:8787 rocker/verse, I wait for a few minutes, see a downloading message, and then get the message "docker: unauthorized: authentication required."
I found this separate documentation which advised me to add a password:
docker run --rm -p 8787:8787 -e PASSWORD=blah rocker/rstudio
But I got the same result "docker: unauthorized: authentication required."
I did some Google searching and found some posts both here on SO and on GitHub, but was unable to identify what is causing this error in my specific case.
I suspect my weak internet connection might have something to do with it since I seem to be able to download for about 10 or 15 minutes before seeing this message.
Here is Docker info:
Macs-MacBook:~ macuser$ docker info
Containers: 1
Running: 0
Paused: 0
Stopped: 1
Images: 2
Server Version: 18.09.6
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
init version: fec3683
Security Options:
seccomp
Profile: default
Kernel Version: 4.14.116-boot2docker
Operating System: Boot2Docker 18.09.6 (TCL 8.2.1)
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 1.951GiB
Name: default
ID: XMCE:OBLV:CKEX:EGIB:PHQ7:MLHF:ZJSA:PGYN:OIMM:JI67:ETCI:JKBH
Docker Root Dir: /mnt/sda1/var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
provider=virtualbox
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Does anyone know where I can look next in order to be able to pull and/or run the rocker image?