REST API requests failing in Airflow ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer')) - python-requests

I'm requesting data from a REST API using the requests library. Everything works locally, but when I deploy to my GCP Composer instance I get errors.
import time

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def get_ids():
    s = requests.Session()
    retries = Retry(total=5, backoff_factor=1, status_forcelist=[429])
    s.mount('http://', HTTPAdapter(max_retries=retries))
    ids = []
    counter = 0
    url = "some_url" + "/?limit=50"
    r = s.get(url, headers=headers)  # headers is defined elsewhere in the script
    data = r.json()
    for i in data['items']:
        if 'custom' in i['user']:
            ids.append(i['user']['custom']['id'])
    while 'next' in data['_links']:
        time.sleep(0.5)
        r = s.get("some url" + data['_links']['next']['href'], headers=headers)
        data = r.json()
        counter += 1
        for i in data['items']:
            if 'custom' in i['user']:
                try:
                    ids.append(i['user']['custom']['accountId'])
                except KeyError:
                    pass
    return ids
It runs on my local machine, but this is the error I get when running in Airflow/Composer:
[2022-05-19 01:43:10,041] {subprocess.py:78} INFO - Traceback (most recent call last):
[2022-05-19 01:43:10,053] {subprocess.py:78} INFO - File "/tmp/venv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 703, in urlopen
[2022-05-19 01:43:10,066] {subprocess.py:78} INFO - httplib_response = self._make_request(
[2022-05-19 01:43:10,074] {subprocess.py:78} INFO - File "/tmp/venv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 449, in _make_request
[2022-05-19 01:43:10,082] {subprocess.py:78} INFO - six.raise_from(e, None)
[2022-05-19 01:43:10,092] {subprocess.py:78} INFO - File "<string>", line 3, in raise_from
[2022-05-19 01:43:10,111] {subprocess.py:78} INFO - File "/tmp/venv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 444, in _make_request
[2022-05-19 01:43:10,118] {subprocess.py:78} INFO - httplib_response = conn.getresponse()
[2022-05-19 01:43:10,136] {subprocess.py:78} INFO - File "/opt/python3.8/lib/python3.8/http/client.py", line 1348, in getresponse
[2022-05-19 01:43:10,146] {subprocess.py:78} INFO - response.begin()
[2022-05-19 01:43:10,160] {subprocess.py:78} INFO - File "/opt/python3.8/lib/python3.8/http/client.py", line 316, in begin
[2022-05-19 01:43:10,180] {subprocess.py:78} INFO - version, status, reason = self._read_status()
[2022-05-19 01:43:10,192] {subprocess.py:78} INFO - File "/opt/python3.8/lib/python3.8/http/client.py", line 277, in _read_status
[2022-05-19 01:43:10,214] {subprocess.py:78} INFO - line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
[2022-05-19 01:43:10,224] {subprocess.py:78} INFO - File "/opt/python3.8/lib/python3.8/socket.py", line 669, in readinto
[2022-05-19 01:43:10,243] {subprocess.py:78} INFO - return self._sock.recv_into(b)
[2022-05-19 01:43:10,252] {subprocess.py:78} INFO - File "/opt/python3.8/lib/python3.8/ssl.py", line 1241, in recv_into
[2022-05-19 01:43:10,274] {subprocess.py:78} INFO - return self.read(nbytes, buffer)
[2022-05-19 01:43:10,281] {subprocess.py:78} INFO - File "/opt/python3.8/lib/python3.8/ssl.py", line 1099, in read
[2022-05-19 01:43:10,301] {subprocess.py:78} INFO - return self._sslobj.read(len, buffer)
[2022-05-19 01:43:10,322] {subprocess.py:78} INFO - ConnectionResetError: [Errno 104] Connection reset by peer
[2022-05-19 01:43:10,330] {subprocess.py:78} INFO -
[2022-05-19 01:43:10,350] {subprocess.py:78} INFO - During handling of the above exception, another exception occurred:
[2022-05-19 01:43:10,359] {subprocess.py:78} INFO -
[2022-05-19 01:43:10,379] {subprocess.py:78} INFO - Traceback (most recent call last):
[2022-05-19 01:43:10,386] {subprocess.py:78} INFO - File "/tmp/venv/lib/python3.8/site-packages/requests/adapters.py", line 440, in send
[2022-05-19 01:43:10,406] {subprocess.py:78} INFO - resp = conn.urlopen(
[2022-05-19 01:43:10,416] {subprocess.py:78} INFO - File "/tmp/venv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 785, in urlopen
[2022-05-19 01:43:10,436] {subprocess.py:78} INFO - retries = retries.increment(
[2022-05-19 01:43:10,443] {subprocess.py:78} INFO - File "/tmp/venv/lib/python3.8/site-packages/urllib3/util/retry.py", line 550, in increment
[2022-05-19 01:43:10,463] {subprocess.py:78} INFO - raise six.reraise(type(error), error, _stacktrace)
[2022-05-19 01:43:10,469] {subprocess.py:78} INFO - File "/tmp/venv/lib/python3.8/site-packages/urllib3/packages/six.py", line 769, in reraise
[2022-05-19 01:43:10,490] {subprocess.py:78} INFO - raise value.with_traceback(tb)
[2022-05-19 01:43:10,518] {subprocess.py:78} INFO - File "/tmp/venv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 703, in urlopen
[2022-05-19 01:43:10,539] {subprocess.py:78} INFO - httplib_response = self._make_request(
[2022-05-19 01:43:10,575] {subprocess.py:78} INFO - File "/tmp/venv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 449, in _make_request
[2022-05-19 01:43:10,578] {subprocess.py:78} INFO - six.raise_from(e, None)
[2022-05-19 01:43:10,594] {subprocess.py:78} INFO - File "<string>", line 3, in raise_from
[2022-05-19 01:43:10,621] {subprocess.py:78} INFO - File "/tmp/venv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 444, in _make_request
[2022-05-19 01:43:10,639] {subprocess.py:78} INFO - httplib_response = conn.getresponse()
[2022-05-19 01:43:10,675] {subprocess.py:78} INFO - File "/opt/python3.8/lib/python3.8/http/client.py", line 1348, in getresponse
[2022-05-19 01:43:10,683] {subprocess.py:78} INFO - response.begin()
[2022-05-19 01:43:10,700] {subprocess.py:78} INFO - File "/opt/python3.8/lib/python3.8/http/client.py", line 316, in begin
[2022-05-19 01:43:10,707] {subprocess.py:78} INFO - version, status, reason = self._read_status()
[2022-05-19 01:43:10,727] {subprocess.py:78} INFO - File "/opt/python3.8/lib/python3.8/http/client.py", line 277, in _read_status
[2022-05-19 01:43:10,735] {subprocess.py:78} INFO - line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
[2022-05-19 01:43:10,755] {subprocess.py:78} INFO - File "/opt/python3.8/lib/python3.8/socket.py", line 669, in readinto
[2022-05-19 01:43:10,763] {subprocess.py:78} INFO - return self._sock.recv_into(b)
[2022-05-19 01:43:10,782] {subprocess.py:78} INFO - File "/opt/python3.8/lib/python3.8/ssl.py", line 1241, in recv_into
[2022-05-19 01:43:10,803] {subprocess.py:78} INFO - return self.read(nbytes, buffer)
[2022-05-19 01:43:10,906] {subprocess.py:78} INFO - File "/opt/python3.8/lib/python3.8/ssl.py", line 1099, in read
[2022-05-19 01:43:10,913] {subprocess.py:78} INFO - return self._sslobj.read(len, buffer)
[2022-05-19 01:43:10,927] {subprocess.py:78} INFO - urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
[2022-05-19 01:43:10,930] {subprocess.py:78} INFO -
[2022-05-19 01:43:10,932] {subprocess.py:78} INFO - During handling of the above exception, another exception occurred:
[2022-05-19 01:43:10,937] {subprocess.py:78} INFO -
[2022-05-19 01:43:10,940] {subprocess.py:78} INFO - Traceback (most recent call last):
[2022-05-19 01:43:10,943] {subprocess.py:78} INFO - File "my_script.py", line 114, in <module>
[2022-05-19 01:43:10,945] {subprocess.py:78} INFO - main(api_key, environment)
[2022-05-19 01:43:10,946] {subprocess.py:78} INFO - File "my_script.py", line 95, in main
[2022-05-19 01:43:10,950] {subprocess.py:78} INFO - ids = get_ids()
[2022-05-19 01:43:10,953] {subprocess.py:78} INFO - File "my_script.py", line 39, in get_ids_ids
[2022-05-19 01:43:10,954] {subprocess.py:78} INFO - r = s.get("some url", headers=headers)
[2022-05-19 01:43:10,956] {subprocess.py:78} INFO - File "/tmp/venv/lib/python3.8/site-packages/requests/sessions.py", line 542, in get
[2022-05-19 01:43:10,958] {subprocess.py:78} INFO - return self.request('GET', url, **kwargs)
[2022-05-19 01:43:10,964] {subprocess.py:78} INFO - File "/tmp/venv/lib/python3.8/site-packages/requests/sessions.py", line 529, in request
[2022-05-19 01:43:10,966] {subprocess.py:78} INFO - resp = self.send(prep, **send_kwargs)
[2022-05-19 01:43:10,966] {subprocess.py:78} INFO - File "/tmp/venv/lib/python3.8/site-packages/requests/sessions.py", line 645, in send
[2022-05-19 01:43:10,967] {subprocess.py:78} INFO - r = adapter.send(request, **kwargs)
[2022-05-19 01:43:10,969] {subprocess.py:78} INFO - File "/tmp/venv/lib/python3.8/site-packages/requests/adapters.py", line 501, in send
[2022-05-19 01:43:10,972] {subprocess.py:78} INFO - raise ConnectionError(err, request=request)
[2022-05-19 01:43:10,979] {subprocess.py:78} INFO - requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
[2022-05-19 01:43:12,592] {subprocess.py:82} INFO - Command exited with return code 1
[2022-05-19 01:43:13,014] {taskinstance.py:1465} ERROR - Task failed with exception
Any guidance here? I've tried increasing the sleep and increasing the backoff, but nothing has worked. Thanks.
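One thing worth noting about the snippet above: the retrying adapter is only mounted for http://, so https:// requests fall back to requests' default adapter and get no retry policy at all. Below is a minimal sketch of a session that applies the retries to both schemes and sets an explicit timeout; the extra 5xx status codes, the timeout values, and the URL/headers are placeholder assumptions, not taken from the original code:

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry connection/read failures and transient status codes; mount the same
# adapter for both http:// and https:// so TLS requests are covered too.
retries = Retry(total=5, connect=5, read=5, backoff_factor=1,
                status_forcelist=[429, 500, 502, 503, 504])
adapter = HTTPAdapter(max_retries=retries)

session = requests.Session()
session.mount("http://", adapter)
session.mount("https://", adapter)

# URL and headers are placeholders, as in the original snippet.
r = session.get("https://some_url/?limit=50", headers={}, timeout=(10, 60))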

Related

Apache Airflow on Azure sqlalchemy Postgres SSL SYSCALL error: EOF detected

We deployed Apache Airflow 2.3.3 to Azure.
Webserver - Web App
Scheduler - ACI
Celery Worker - ACI
We were seeing errors on the Celery ACI console related to Postgres and Redis connection timeouts as below
[2022-09-22 18:55:50,650: WARNING/ForkPoolWorker-15] Failed operation _store_result. Retrying 2 more times.
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1803, in _execute_context
cursor, statement, parameters, context
File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 719, in do_execute
cursor.execute(statement, parameters)
psycopg2.DatabaseError: could not receive data from server: Connection timed out
SSL SYSCALL error: Connection timed out
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.7/site-packages/celery/backends/database/__init__.py", line 47, in _inner
return fun(*args, **kwargs)
File "/home/airflow/.local/lib/python3.7/site-packages/celery/backends/database/__init__.py", line 117, in _store_result
task = list(session.query(self.task_cls).filter(self.task_cls.task_id == task_id))
File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 2887, in __iter__
return self._iter().__iter__()
File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 2897, in _iter
execution_options={"_sa_orm_load_options": self.load_options},
File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 1689, in execute
result = conn._execute_20(statement, params or {}, execution_options)
File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1614, in _execute_20
return meth(self, args_10style, kwargs_10style, execution_options)
File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 326, in _execute_on_connection
self, multiparams, params, execution_options
File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1491, in _execute_clauseelement
cache_hit=cache_hit,
File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1846, in _execute_context
e, statement, parameters, cursor, context
File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 2027, in _handle_dbapi_exception
sqlalchemy_exception, with_traceback=exc_info[2], from_=e
File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
raise exception
File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1803, in _execute_context
cursor, statement, parameters, context
File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 719, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.DatabaseError: (psycopg2.DatabaseError) could not receive data from server: Connection timed out
SSL SYSCALL error: Connection timed out
[SQL: SELECT celery_taskmeta.id AS celery_taskmeta_id, celery_taskmeta.task_id AS celery_taskmeta_task_id, celery_taskmeta.status AS celery_taskmeta_status, celery_taskmeta.result AS celery_taskmeta_result, celery_taskmeta.date_done AS celery_taskmeta_date_done, celery_taskmeta.traceback AS celery_taskmeta_traceback
FROM celery_taskmeta
WHERE celery_taskmeta.task_id = %(task_id_1)s]
[parameters: {'task_id_1': 'c5f9f53c-8afe-4d67-8d3b-d7ad84875de1'}]
(Background on this error at: https://sqlalche.me/e/14/4xp6)
[2022-09-22 18:55:50,929: INFO/ForkPoolWorker-15] [c5f9f53c-8afe-4d67-8d3b-d7ad84875de1] Executing command in Celery: ['airflow', 'tasks', 'run', 'CS_ALERTING', 'CheckRunningTasks', 'scheduled__2022-09-22T18:00:00+00:00', '--local', '--subdir', 'DAGS_FOLDER/CS_ALERTING.py']
[2022-09-22 18:55:51,241: INFO/ForkPoolWorker-15] Filling up the DagBag from /opt/airflow/platform_pam/dags/CS_ALERTING.py
[2022-09-22 18:55:53,467: INFO/ForkPoolWorker-15] Running <TaskInstance: CS_ALERTING.CheckRunningTasks scheduled__2022-09-22T18:00:00+00:00 [queued]> on host localhost
[2022-09-22 18:55:58,304: INFO/ForkPoolWorker-15] Task airflow.executors.celery_executor.execute_command[c5f9f53c-8afe-4d67-8d3b-d7ad84875de1] succeeded in 960.1964174450004s: None
[2022-09-22 19:29:25,931: WARNING/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.7/site-packages/redis/connection.py", line 706, in send_packed_command
sendall(self._sock, item)
File "/home/airflow/.local/lib/python3.7/site-packages/redis/_compat.py", line 9, in sendall
return sock.sendall(*args, **kwargs)
File "/usr/local/lib/python3.7/ssl.py", line 1034, in sendall
v = self.send(byte_view[count:])
File "/usr/local/lib/python3.7/ssl.py", line 1003, in send
return self._sslobj.write(data)
TimeoutError: [Errno 110] Connection timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.7/site-packages/celery/worker/consumer/consumer.py", line 332, in start
blueprint.start(self)
File "/home/airflow/.local/lib/python3.7/site-packages/celery/bootsteps.py", line 116, in start
step.start(parent)
File "/home/airflow/.local/lib/python3.7/site-packages/celery/worker/consumer/consumer.py", line 628, in start
c.loop(*c.loop_args())
File "/home/airflow/.local/lib/python3.7/site-packages/celery/worker/loops.py", line 97, in asynloop
next(loop)
File "/home/airflow/.local/lib/python3.7/site-packages/kombu/asynchronous/hub.py", line 301, in create_loop
poll_timeout = fire_timers(propagate=propagate) if scheduled else 1
File "/home/airflow/.local/lib/python3.7/site-packages/kombu/asynchronous/hub.py", line 143, in fire_timers
entry()
File "/home/airflow/.local/lib/python3.7/site-packages/kombu/asynchronous/timer.py", line 64, in __call__
return self.fun(*self.args, **self.kwargs)
File "/home/airflow/.local/lib/python3.7/site-packages/kombu/asynchronous/timer.py", line 126, in _reschedules
return fun(*args, **kwargs)
File "/home/airflow/.local/lib/python3.7/site-packages/kombu/transport/redis.py", line 557, in maybe_check_subclient_health
client.check_health()
File "/home/airflow/.local/lib/python3.7/site-packages/redis/client.py", line 3522, in check_health
check_health=False)
File "/home/airflow/.local/lib/python3.7/site-packages/redis/connection.py", line 726, in send_command
check_health=kwargs.get('check_health', True))
File "/home/airflow/.local/lib/python3.7/site-packages/redis/connection.py", line 718, in send_packed_command
(errno, errmsg))
redis.exceptions.ConnectionError: Error 110 while writing to socket. Connection timed out.
[2022-09-22 19:29:26,023: WARNING/MainProcess] /home/airflow/.local/lib/python3.7/site-packages/celery/worker/consumer/consumer.py:367: CPendingDeprecationWarning:
I referred to Airflow's documentation and found the setting for passing connect args to the metadata database. We are modifying Airflow's Docker image and adding a Python module, airflow.www.db_utils.db_config (installed into site-packages), which defines the dictionary:
keepalive_kwargs = {
    "keepalives": 1,
    "keepalives_idle": 30,
    "keepalives_interval": 5,
    "keepalives_count": 5,
}
Finally, we are setting:
ENV AIRFLOW__DATABASE__SQL_ALCHEMY_CONNECT_ARGS="airflow.www.db_utils.db_config.keepalive_kwargs"
Unfortunately, the error still persists. It would be great if someone could help me resolve this issue.
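For reference, a quick way to verify (inside each of the webserver, scheduler and worker containers) that the configured import path actually resolves to the dictionary above; a minimal sketch using only the standard library:

from importlib import import_module

# The dotted path must match AIRFLOW__DATABASE__SQL_ALCHEMY_CONNECT_ARGS exactly:
# module path first, attribute name last.
module = import_module("airflow.www.db_utils.db_config")
print(module.keepalive_kwargs)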

google.api_core.exceptions.NotFound bucket does not exist

When I run the data_ingestion_gcs_dag DAG in Airflow, I get an error saying it cannot find the specified bucket, even though I rechecked it and the bucket name is fine. I have configured access to the Google account with docker-compose; here is the code (I have included only the first part):
version: '3'
x-airflow-common:
  &airflow-common
  # In order to add custom dependencies or upgrade provider packages you can use your extended image.
  # Comment the image line, place your Dockerfile in the directory where you placed the docker-compose.yaml
  # and uncomment the "build" line below, Then run `docker-compose build` to build the images.
  build:
    context: .
    dockerfile: ./Dockerfile
  environment:
    &airflow-common-env
    AIRFLOW__CORE__EXECUTOR: CeleryExecutor
    AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
    AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:airflow@postgres/airflow
    AIRFLOW__CELERY__BROKER_URL: redis://:@redis:6379/0
    AIRFLOW__CORE__FERNET_KEY: ''
    AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
    AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
    AIRFLOW__API__AUTH_BACKEND: 'airflow.api.auth.backend.basic_auth'
    _PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:-}
    GOOGLE_APPLICATION_CREDENTIALS: /.google/credentials/google_credentials.json
    AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT: 'google-cloud-platform://?extra__google_cloud_platform__key_path=/.google/credentials/google_credentials.json'
    # TODO: Please change GCP_PROJECT_ID & GCP_GCS_BUCKET, as per your config
    GCP_PROJECT_ID: 'real-dtc-de'
    GCP_GCS_BUCKET: 'dtc_data_lake_real-dtc-de'
  volumes:
    - ./dags:/opt/airflow/dags
    - ./logs:/opt/airflow/logs
    - ./plugins:/opt/airflow/plugins
    - ~/.google/credentials/:/.google/credentials:ro
And here is the relevant code from the DAG:
PROJECT_ID = os.environ.get("GCP_PROJECT_ID")
BUCKET = os.environ.get("GCP_GCS_BUCKET")
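As a sanity check (a minimal sketch, not part of the DAG, assuming the google-cloud-storage client that ships with the Airflow image), the resolved bucket name and whether the mounted credentials can see it can be printed from inside the worker container:

import os

from google.cloud import storage

# Prints the bucket name the task would use and whether it is visible to the
# mounted service-account credentials.
bucket_name = os.environ.get("GCP_GCS_BUCKET")
client = storage.Client()
print(bucket_name, client.bucket(bucket_name).exists())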
Here are the logs from the DAG:
*** Reading local file: /opt/airflow/logs/data_ingestion_gcs_dag/local_to_gcs_task/2022-06-13T02:47:29.654918+00:00/1.log
[2022-06-13, 02:47:36 UTC] {taskinstance.py:1032} INFO - Dependencies all met for <TaskInstance: data_ingestion_gcs_dag.local_to_gcs_task manual__2022-06-13T02:47:29.654918+00:00 [queued]>
[2022-06-13, 02:47:36 UTC] {taskinstance.py:1032} INFO - Dependencies all met for <TaskInstance: data_ingestion_gcs_dag.local_to_gcs_task manual__2022-06-13T02:47:29.654918+00:00 [queued]>
[2022-06-13, 02:47:36 UTC] {taskinstance.py:1238} INFO -
--------------------------------------------------------------------------------
[2022-06-13, 02:47:36 UTC] {taskinstance.py:1239} INFO - Starting attempt 1 of 2
[2022-06-13, 02:47:36 UTC] {taskinstance.py:1240} INFO -
--------------------------------------------------------------------------------
[2022-06-13, 02:47:36 UTC] {taskinstance.py:1259} INFO - Executing <Task(PythonOperator): local_to_gcs_task> on 2022-06-13 02:47:29.654918+00:00
[2022-06-13, 02:47:36 UTC] {standard_task_runner.py:52} INFO - Started process 1042 to run task
[2022-06-13, 02:47:36 UTC] {standard_task_runner.py:76} INFO - Running: ['***', 'tasks', 'run', 'data_ingestion_gcs_dag', 'local_to_gcs_task', 'manual__2022-06-13T02:47:29.654918+00:00', '--job-id', '11', '--raw', '--subdir', 'DAGS_FOLDER/data_ingestion_gcs_dag.py', '--cfg-path', '/tmp/tmp11gg9aoy', '--error-file', '/tmp/tmpjbp6yrks']
[2022-06-13, 02:47:36 UTC] {standard_task_runner.py:77} INFO - Job 11: Subtask local_to_gcs_task
[2022-06-13, 02:47:36 UTC] {logging_mixin.py:109} INFO - Running <TaskInstance: data_ingestion_gcs_dag.local_to_gcs_task manual__2022-06-13T02:47:29.654918+00:00 [running]> on host aea7312db396
[2022-06-13, 02:47:36 UTC] {taskinstance.py:1426} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=***
AIRFLOW_CTX_DAG_ID=data_ingestion_gcs_dag
AIRFLOW_CTX_TASK_ID=local_to_gcs_task
AIRFLOW_CTX_EXECUTION_DATE=2022-06-13T02:47:29.654918+00:00
AIRFLOW_CTX_DAG_RUN_ID=manual__2022-06-13T02:47:29.654918+00:00
[2022-06-13, 02:47:36 UTC] {taskinstance.py:1700} ERROR - Task failed with exception
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 2594, in upload_from_file
retry=retry,
File "/home/airflow/.local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 2396, in _do_upload
retry=retry,
File "/home/airflow/.local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 1917, in _do_multipart_upload
transport, data, object_metadata, content_type, timeout=timeout
File "/home/airflow/.local/lib/python3.7/site-packages/google/resumable_media/requests/upload.py", line 154, in transmit
retriable_request, self._get_status_code, self._retry_strategy
File "/home/airflow/.local/lib/python3.7/site-packages/google/resumable_media/requests/_request_helpers.py", line 147, in wait_and_retry
response = func()
File "/home/airflow/.local/lib/python3.7/site-packages/google/resumable_media/requests/upload.py", line 149, in retriable_request
self._process_response(result)
File "/home/airflow/.local/lib/python3.7/site-packages/google/resumable_media/_upload.py", line 113, in _process_response
_helpers.require_status_code(response, (http.client.OK,), self._get_status_code)
File "/home/airflow/.local/lib/python3.7/site-packages/google/resumable_media/_helpers.py", line 104, in require_status_code
*status_codes
google.resumable_media.common.InvalidResponse: ('Request failed with status code', 404, 'Expected one of', <HTTPStatus.OK: 200>)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1329, in _run_raw_task
self._execute_task_with_callbacks(context)
File "/home/airflow/.local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1455, in _execute_task_with_callbacks
result = self._execute_task(context, self.task)
File "/home/airflow/.local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1511, in _execute_task
result = execute_callable(context=context)
File "/home/airflow/.local/lib/python3.7/site-packages/airflow/operators/python.py", line 174, in execute
return_value = self.execute_callable()
File "/home/airflow/.local/lib/python3.7/site-packages/airflow/operators/python.py", line 185, in execute_callable
return self.python_callable(*self.op_args, **self.op_kwargs)
File "/opt/airflow/dags/data_ingestion_gcs_dag.py", line 51, in upload_to_gcs
blob.upload_from_filename(local_file)
File "/home/airflow/.local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 2735, in upload_from_filename
retry=retry,
File "/home/airflow/.local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 2598, in upload_from_file
_raise_from_invalid_response(exc)
File "/home/airflow/.local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 4466, in _raise_from_invalid_response
raise exceptions.from_http_status(response.status_code, message, response=response)
google.api_core.exceptions.NotFound: 404 POST https://storage.googleapis.com/upload/storage/v1/b/dtc_data_lake_animated-surfer-338618/o?uploadType=multipart: {
"error": {
"code": 404,
"message": "The specified bucket does not exist.",
"errors": [
{
"message": "The specified bucket does not exist.",
"domain": "global",
"reason": "notFound"
}
]
}
}

MysqlOperator in airflow 2.0.1 failed with "ssl connection error"

I am new to Airflow and I am trying to test a MySQL connection using MySqlOperator in Airflow 2.0.1. However, I am getting an SSL connection error. I have tried adding extra parameters to disable SSL mode, but I still get the same error.
Here is my code (I tried to pass the SSL param = disable in the code), and it doesn't work.
from airflow import DAG
from airflow.providers.mysql.operators.mysql import MySqlOperator
from airflow.operators.python import PythonOperator
from airflow.operators.dummy_operator import DummyOperator
from airflow.utils.dates import days_ago, timedelta

default_args = {
    'owner': 'airflow',
    'depend_on_past': False,
    'start_date': days_ago(2),
    'retries': 1,
    'retry_delay': timedelta(minutes=1)
}

with DAG(
    'mysqlConnTest',
    default_args=default_args,
    schedule_interval='@once',
    catchup=False) as dag:

    start_date = DummyOperator(task_id="start_task")

    # [START howto_operator_mysql]
    select_table_mysql_task = MySqlOperator(
        task_id='select_table_mysql',
        mysql_conn_id='mysql',
        sql="SELECT * FROM country;",
        autocommit=True,
        parameters={'ssl_mode': 'DISABLED'}
    )

    start_date >> select_table_mysql_task
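As an aside, the same connection can be exercised outside the operator with the hook that MySqlOperator uses under the hood (a minimal sketch, run from a Python shell on the worker, using the same mysql_conn_id as above); this helps separate connection-level SSL problems from the operator arguments:

from airflow.providers.mysql.hooks.mysql import MySqlHook

# Opens the connection defined as "mysql" in the Airflow UI and runs a trivial query.
hook = MySqlHook(mysql_conn_id="mysql")
print(hook.get_records("SELECT 1"))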
And here is the error:
*** Reading local file: /opt/airflow/logs/mysqlHookConnTest/select_table_mysql/2021-04-14T12:46:42.221662+00:00/2.log
[2021-04-14 12:47:46,791] {taskinstance.py:851} INFO - Dependencies all met for <TaskInstance: mysqlHookConnTest.select_table_mysql 2021-04-14T12:46:42.221662+00:00 [queued]>
[2021-04-14 12:47:47,007] {taskinstance.py:851} INFO - Dependencies all met for <TaskInstance: mysqlHookConnTest.select_table_mysql 2021-04-14T12:46:42.221662+00:00 [queued]>
[2021-04-14 12:47:47,047] {taskinstance.py:1042} INFO -
--------------------------------------------------------------------------------
[2021-04-14 12:47:47,054] {taskinstance.py:1043} INFO - Starting attempt 2 of 2
[2021-04-14 12:47:47,074] {taskinstance.py:1044} INFO -
--------------------------------------------------------------------------------
[2021-04-14 12:47:47,331] {taskinstance.py:1063} INFO - Executing <Task(MySqlOperator): select_table_mysql> on 2021-04-14T12:46:42.221662+00:00
[2021-04-14 12:47:47,377] {standard_task_runner.py:52} INFO - Started process 66 to run task
[2021-04-14 12:47:47,402] {standard_task_runner.py:76} INFO - Running: ['airflow', 'tasks', 'run', 'mysqlHookConnTest', 'select_table_mysql', '2021-04-14T12:46:42.221662+00:00', '--job-id', '142', '--pool', 'default_pool', '--raw', '--subdir', 'DAGS_FOLDER/MySqlHookConnTest.py', '--cfg-path', '/tmp/tmppujnrey3', '--error-file', '/tmp/tmpjl_g_p3t']
[2021-04-14 12:47:47,413] {standard_task_runner.py:77} INFO - Job 142: Subtask select_table_mysql
[2021-04-14 12:47:47,556] {logging_mixin.py:104} INFO - Running <TaskInstance: mysqlHookConnTest.select_table_mysql 2021-04-14T12:46:42.221662+00:00 [running]> on host ea95b9685a31
[2021-04-14 12:47:47,672] {taskinstance.py:1257} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=mysqlHookConnTest
AIRFLOW_CTX_TASK_ID=select_table_mysql
AIRFLOW_CTX_EXECUTION_DATE=2021-04-14T12:46:42.221662+00:00
AIRFLOW_CTX_DAG_RUN_ID=manual__2021-04-14T12:46:42.221662+00:00
[2021-04-14 12:47:47,687] {mysql.py:72} INFO - Executing: SELECT idPais, Nombre, codigo, paisPlataforma, create_date, update_date FROM ob_cpanel.cpanel_pais;
[2021-04-14 12:47:47,710] {base.py:74} INFO - Using connection to: id: mysql. Host: sys-sql-pre-01.oneboxtickets.net, Port: 3306, Schema: , Login: lectura, Password: None, extra: None
[2021-04-14 12:47:48,134] {taskinstance.py:1455} ERROR - (2006, 'SSL connection error: error:1425F102:SSL routines:ssl_choose_client_version:unsupported protocol')
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.6/site-packages/airflow/models/taskinstance.py", line 1112, in _run_raw_task
self._prepare_and_execute_task_with_callbacks(context, task)
File "/home/airflow/.local/lib/python3.6/site-packages/airflow/models/taskinstance.py", line 1285, in _prepare_and_execute_task_with_callbacks
result = self._execute_task(context, task_copy)
File "/home/airflow/.local/lib/python3.6/site-packages/airflow/models/taskinstance.py", line 1315, in _execute_task
result = task_copy.execute(context=context)
File "/home/airflow/.local/lib/python3.6/site-packages/airflow/providers/mysql/operators/mysql.py", line 74, in execute
hook.run(self.sql, autocommit=self.autocommit, parameters=self.parameters)
File "/home/airflow/.local/lib/python3.6/site-packages/airflow/hooks/dbapi.py", line 173, in run
with closing(self.get_conn()) as conn:
File "/home/airflow/.local/lib/python3.6/site-packages/airflow/providers/mysql/hooks/mysql.py", line 144, in get_conn
return MySQLdb.connect(**conn_config)
File "/home/airflow/.local/lib/python3.6/site-packages/MySQLdb/__init__.py", line 85, in Connect
return Connection(*args, **kwargs)
File "/home/airflow/.local/lib/python3.6/site-packages/MySQLdb/connections.py", line 208, in __init__
super(Connection, self).__init__(*args, **kwargs2)
_mysql_exceptions.OperationalError: (2006, 'SSL connection error: error:1425F102:SSL routines:ssl_choose_client_version:unsupported protocol')
[2021-04-14 12:47:48,143] {taskinstance.py:1503} INFO - Marking task as FAILED. dag_id=mysqlHookConnTest, task_id=select_table_mysql, execution_date=20210414T124642, start_date=20210414T124746, end_date=20210414T124748
[2021-04-14 12:47:48,243] {local_task_job.py:146} INFO - Task exited with return code 1
We have tried removing the last two parameters from the DAG code, and instead added this JSON to the extra field of the connection in the Airflow UI:
{"ssl": false}
and the issue reappears with another, similar error:
/opt/airflow/logs/mysqlOperatorConnTest/select_table_mysql/2021-04-15T11:26:50.578333+00:00/2.log
*** Fetching from: http://airflow-worker-0.airflow-worker.airflow.svc.cluster.local:8793/log/mysqlOperatorConnTest/select_table_mysql/2021-04-15T11:26:50.578333+00:00/2.log
[2021-04-15 11:27:54,471] {taskinstance.py:851} INFO - Dependencies all met for <TaskInstance: mysqlOperatorConnTest.select_table_mysql 2021-04-15T11:26:50.578333+00:00 [queued]>
[2021-04-15 11:27:54,497] {taskinstance.py:851} INFO - Dependencies all met for <TaskInstance: mysqlOperatorConnTest.select_table_mysql 2021-04-15T11:26:50.578333+00:00 [queued]>
[2021-04-15 11:27:54,497] {taskinstance.py:1042} INFO -
--------------------------------------------------------------------------------
[2021-04-15 11:27:54,497] {taskinstance.py:1043} INFO - Starting attempt 2 of 2
[2021-04-15 11:27:54,497] {taskinstance.py:1044} INFO -
--------------------------------------------------------------------------------
[2021-04-15 11:27:54,507] {taskinstance.py:1063} INFO - Executing <Task(MySqlOperator): select_table_mysql> on 2021-04-15T11:26:50.578333+00:00
[2021-04-15 11:27:54,510] {standard_task_runner.py:52} INFO - Started process 115 to run task
[2021-04-15 11:27:54,514] {standard_task_runner.py:76} INFO - Running: ['airflow', 'tasks', 'run', 'mysqlOperatorConnTest', 'select_table_mysql', '2021-04-15T11:26:50.578333+00:00', '--job-id', '68', '--pool', 'default_pool', '--raw', '--subdir', '/opt/airflow/dags/repo/MySqlOperatorConnTest.py', '--cfg-path', '/tmp/tmpy7bv58_z', '--error-file', '/tmp/tmpaoe808of']
[2021-04-15 11:27:54,514] {standard_task_runner.py:77} INFO - Job 68: Subtask select_table_mysql
[2021-04-15 11:27:54,644] {logging_mixin.py:104} INFO - Running <TaskInstance: mysqlOperatorConnTest.select_table_mysql 2021-04-15T11:26:50.578333+00:00 [running]> on host airflow-worker-0.airflow-worker.airflow.svc.cluster.local
[2021-04-15 11:27:54,707] {logging_mixin.py:104} WARNING - /opt/python/site-packages/sqlalchemy/sql/coercions.py:518 SAWarning: Coercing Subquery object into a select() for use in IN(); please pass a select() construct explicitly
[2021-04-15 11:27:54,725] {taskinstance.py:1255} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=mysqlOperatorConnTest
AIRFLOW_CTX_TASK_ID=select_table_mysql
AIRFLOW_CTX_EXECUTION_DATE=2021-04-15T11:26:50.578333+00:00
AIRFLOW_CTX_DAG_RUN_ID=manual__2021-04-15T11:26:50.578333+00:00
[2021-04-15 11:27:54,726] {mysql.py:72} INFO - Executing: SELECT idPais, Nombre, codigo, paisPlataforma, create_date, update_date FROM ob_cpanel.cpanel_pais;
[2021-04-15 11:27:54,744] {connection.py:337} ERROR - Expecting value: line 2 column 9 (char 11)
Traceback (most recent call last):
File "/opt/python/site-packages/airflow/models/connection.py", line 335, in extra_dejson
obj = json.loads(self.extra)
File "/usr/local/lib/python3.8/json/__init__.py", line 357, in loads
return _default_decoder.decode(s)
File "/usr/local/lib/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/lib/python3.8/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 2 column 9 (char 11)
[2021-04-15 11:27:54,744] {connection.py:338} ERROR - Failed parsing the json for conn_id mysql
[2021-04-15 11:27:54,744] {base.py:65} INFO - Using connection to: id: mysql. Host: sys-sql-pre-01.oneboxtickets.net, Port: 3306, Schema: , Login: lectura, Password: XXXXXXXX, extra: None
[2021-04-15 11:27:54,745] {connection.py:337} ERROR - Expecting value: line 2 column 9 (char 11)
Traceback (most recent call last):
File "/opt/python/site-packages/airflow/models/connection.py", line 335, in extra_dejson
obj = json.loads(self.extra)
File "/usr/local/lib/python3.8/json/__init__.py", line 357, in loads
return _default_decoder.decode(s)
File "/usr/local/lib/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/lib/python3.8/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 2 column 9 (char 11)
[2021-04-15 11:27:54,745] {connection.py:338} ERROR - Failed parsing the json for conn_id mysql
[2021-04-15 11:27:54,745] {connection.py:337} ERROR - Expecting value: line 2 column 9 (char 11)
Traceback (most recent call last):
File "/opt/python/site-packages/airflow/models/connection.py", line 335, in extra_dejson
obj = json.loads(self.extra)
File "/usr/local/lib/python3.8/json/__init__.py", line 357, in loads
return _default_decoder.decode(s)
File "/usr/local/lib/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/lib/python3.8/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 2 column 9 (char 11)
[2021-04-15 11:27:54,745] {connection.py:338} ERROR - Failed parsing the json for conn_id mysql
[2021-04-15 11:27:54,746] {connection.py:337} ERROR - Expecting value: line 2 column 9 (char 11)
Traceback (most recent call last):
File "/opt/python/site-packages/airflow/models/connection.py", line 335, in extra_dejson
obj = json.loads(self.extra)
File "/usr/local/lib/python3.8/json/__init__.py", line 357, in loads
return _default_decoder.decode(s)
File "/usr/local/lib/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/lib/python3.8/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 2 column 9 (char 11)
[2021-04-15 11:27:54,746] {connection.py:338} ERROR - Failed parsing the json for conn_id mysql
[2021-04-15 11:27:54,746] {connection.py:337} ERROR - Expecting value: line 2 column 9 (char 11)
Traceback (most recent call last):
File "/opt/python/site-packages/airflow/models/connection.py", line 335, in extra_dejson
obj = json.loads(self.extra)
File "/usr/local/lib/python3.8/json/__init__.py", line 357, in loads
return _default_decoder.decode(s)
File "/usr/local/lib/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/lib/python3.8/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 2 column 9 (char 11)
[2021-04-15 11:27:54,746] {connection.py:338} ERROR - Failed parsing the json for conn_id mysql
[2021-04-15 11:27:54,746] {connection.py:337} ERROR - Expecting value: line 2 column 9 (char 11)
Traceback (most recent call last):
File "/opt/python/site-packages/airflow/models/connection.py", line 335, in extra_dejson
obj = json.loads(self.extra)
File "/usr/local/lib/python3.8/json/__init__.py", line 357, in loads
return _default_decoder.decode(s)
File "/usr/local/lib/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/lib/python3.8/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 2 column 9 (char 11)
[2021-04-15 11:27:54,747] {connection.py:338} ERROR - Failed parsing the json for conn_id mysql
[2021-04-15 11:27:54,747] {connection.py:337} ERROR - Expecting value: line 2 column 9 (char 11)
Traceback (most recent call last):
File "/opt/python/site-packages/airflow/models/connection.py", line 335, in extra_dejson
obj = json.loads(self.extra)
File "/usr/local/lib/python3.8/json/__init__.py", line 357, in loads
return _default_decoder.decode(s)
File "/usr/local/lib/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/lib/python3.8/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 2 column 9 (char 11)
[2021-04-15 11:27:54,747] {connection.py:338} ERROR - Failed parsing the json for conn_id mysql
[2021-04-15 11:27:54,747] {connection.py:337} ERROR - Expecting value: line 2 column 9 (char 11)
Traceback (most recent call last):
File "/opt/python/site-packages/airflow/models/connection.py", line 335, in extra_dejson
obj = json.loads(self.extra)
File "/usr/local/lib/python3.8/json/__init__.py", line 357, in loads
return _default_decoder.decode(s)
File "/usr/local/lib/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/lib/python3.8/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 2 column 9 (char 11)
[2021-04-15 11:27:54,747] {connection.py:338} ERROR - Failed parsing the json for conn_id mysql
[2021-04-15 11:27:54,787] {taskinstance.py:1455} ERROR - (2006, 'SSL connection error: error:1425F102:SSL routines:ssl_choose_client_version:unsupported protocol')
Traceback (most recent call last):
File "/opt/python/site-packages/airflow/models/taskinstance.py", line 1112, in _run_raw_task
self._prepare_and_execute_task_with_callbacks(context, task)
File "/opt/python/site-packages/airflow/models/taskinstance.py", line 1285, in _prepare_and_execute_task_with_callbacks
result = self._execute_task(context, task_copy)
File "/opt/python/site-packages/airflow/models/taskinstance.py", line 1315, in _execute_task
result = task_copy.execute(context=context)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/mysql/operators/mysql.py", line 74, in execute
hook.run(self.sql, autocommit=self.autocommit, parameters=self.parameters)
File "/opt/python/site-packages/airflow/hooks/dbapi.py", line 173, in run
with closing(self.get_conn()) as conn:
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/mysql/hooks/mysql.py", line 144, in get_conn
return MySQLdb.connect(**conn_config)
File "/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/__init__.py", line 85, in Connect
return Connection(*args, **kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/connections.py", line 208, in __init__
super(Connection, self).__init__(*args, **kwargs2)
_mysql_exceptions.OperationalError: (2006, 'SSL connection error: error:1425F102:SSL routines:ssl_choose_client_version:unsupported protocol')
[2021-04-15 11:27:54,788] {taskinstance.py:1496} INFO - Marking task as FAILED. dag_id=mysqlOperatorConnTest, task_id=select_table_mysql, execution_date=20210415T112650, start_date=20210415T112754, end_date=20210415T112754
[2021-04-15 11:27:54,845] {local_task_job.py:146} INFO - Task exited with return code 1
We solved this issue by downgrading the MySQL client to 5.7. Our server version was 5.6 and the previous client was 8 (from the Docker image), so we moved the client closer to the server version.
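As an aside, the repeated "Failed parsing the json for conn_id mysql" entries in the second log indicate that the connection's extra field did not contain valid JSON; a quick standard-library check of the value before saving it (a minimal sketch, nothing Airflow-specific):

import json

# The extra field must hold a single valid JSON object; an unquoted value or
# Python-style True/False raises the same kind of JSONDecodeError seen above.
extra = '{"ssl": false}'
print(json.loads(extra))  # -> {'ssl': False}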

Hourly tasks "randomly" fail

I have an Apache Airflow DAG which runs hourly but "randomly" fails, meaning it succeeds about half of the time. The DAG consists of exactly one task, which uses the BashOperator to run a single curl. The problem occurs with both scheduled and manual triggers.
Airflow Version: 1.10.13.
Executor: Celery
Env: Kubernetes with Bitnami Airflow Helm Chart (1 Airflow-Web, 1 Airflow-Scheduler, 1 Airflow-Worker, I can give more information on this if it helps)
DB: PSQL
Redis: A Redis instance is provided to the Airflow setup but no keys are present.
DAGs: DAGs are defined inside a K8s ConfigMap and are updated regularly by Airflow
What have I tried:
Dropped DB and set everything up from scratch two times
Changed params like start_date, retries, etc.
Monitored behaviour for 2 days
Code
from builtins import range
from datetime import datetime, timedelta

from airflow.models import DAG
from airflow.operators.bash_operator import BashOperator

args = {
    'owner': 'Airflow',
    'start_date': datetime(2021, 1, 8, 8, 30, 0),
    'retries': 3,
    'retry_delay': timedelta(minutes=5),
}

dag = DAG(
    dag_id='persist__events_dag',
    default_args=args,
    schedule_interval='0 * * * *',
    tags=['SparkJobs']
)

run_job = BashOperator(
    task_id='curl_to_spark_api',
    bash_command="""
    //Here comes a curl command
    """,
    dag=dag,
)

run_job
Logs
From Airflow Worker on failed runs:
[2021-01-08 10:36:13,313: INFO/MainProcess] Received task: airflow.executors.celery_executor.execute_command[c6e3fdfb-5474-47aa-b333-be8c69b23ebe]
[2021-01-08 10:36:13,314: INFO/ForkPoolWorker-15] Executing command in Celery: ['airflow', 'run', 'persist_events_dag', 'task_id', '2021-01-08T09:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/opt/bitnami/airflow/dags/external/..2021_01_08_10_33_28.059622242/persist_screenviews.py']
[2021-01-08 10:36:19,279] {__init__.py:50} INFO - Using executor CeleryExecutor
[2021-01-08 10:36:19,356] {dagbag.py:417} INFO - Filling up the DagBag from /opt/bitnami/airflow/dags/external/..2021_01_08_10_33_28.059622242/persist_screenviews.py
/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/models/dag.py:1342: PendingDeprecationWarning: The requested task could not be added to the DAG because a task with task_id create_tag_template_field_result is already in the DAG. Starting in Airflow 2.0, trying to overwrite a task will raise an exception.
category=PendingDeprecationWarning)
Traceback (most recent call last):
File "/opt/bitnami/airflow/venv/bin/airflow", line 37, in <module>
args.func(args)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/utils/cli.py", line 76, in wrapper
return f(*args, **kwargs)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/bin/cli.py", line 538, in run
dag = get_dag(args)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/bin/cli.py", line 164, in get_dag
'parse.'.format(args.dag_id))
airflow.exceptions.AirflowException: dag_id could not be found: persist_events_dag. Either the dag did not exist or it failed to parse.
[2021-01-08 10:36:20,373: ERROR/ForkPoolWorker-15] execute_command encountered a CalledProcessError
Traceback (most recent call last):
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 78, in execute_command
close_fds=True, env=env)
File "/opt/bitnami/python/lib/python3.6/subprocess.py", line 311, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['airflow', 'run', 'persist_events_dag', 'task_id', '2021-01-08T09:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/opt/bitnami/airflow/dags/external/..2021_01_08_10_33_28.059622242/persist_screenviews.py']' returned non-zero exit status 1.
[2021-01-08 10:36:20,373: ERROR/ForkPoolWorker-15] None
[2021-01-08 10:36:20,526: ERROR/ForkPoolWorker-15] Task airflow.executors.celery_executor.execute_command[c6e3fdfb-5474-47aa-b333-be8c69b23ebe] raised unexpected: AirflowException('Celery command failed',)
Traceback (most recent call last):
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 78, in execute_command
close_fds=True, env=env)
File "/opt/bitnami/python/lib/python3.6/subprocess.py", line 311, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['airflow', 'run', 'persist_events_dag', 'task_id', '2021-01-08T09:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/opt/bitnami/airflow/dags/external/..2021_01_08_10_33_28.059622242/persist_screenviews.py']' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/celery/app/trace.py", line 412, in trace_task
R = retval = fun(*args, **kwargs)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/celery/app/trace.py", line 704, in __protected_call__
return self.run(*args, **kwargs)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 83, in execute_command
raise AirflowException('Celery command failed')
airflow.exceptions.AirflowException: Celery command failed
[2021-01-08 10:36:43,230: INFO/MainProcess] Received task: airflow.executors.celery_executor.execute_command[8ba7956f-6b96-48f8-b112-3a4d7baa8bf7]
[2021-01-08 10:36:43,231: INFO/ForkPoolWorker-15] Executing command in Celery: ['airflow', 'run', 'persist_events_dag', 'task_id', '2021-01-08T10:36:40.516529+00:00', '--local', '--pool', 'default_pool', '-sd', '/opt/bitnami/airflow/dags/external/..2021_01_08_10_33_28.059622242/persist_screenviews.py']
[2021-01-08 10:36:49,157] {__init__.py:50} INFO - Using executor CeleryExecutor
[2021-01-08 10:36:49,158] {dagbag.py:417} INFO - Filling up the DagBag from /opt/bitnami/airflow/dags/external/..2021_01_08_10_33_28.059622242/persist_screenviews.py
/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/models/dag.py:1342: PendingDeprecationWarning: The requested task could not be added to the DAG because a task with task_id create_tag_template_field_result is already in the DAG. Starting in Airflow 2.0, trying to overwrite a task will raise an exception.
category=PendingDeprecationWarning)
Traceback (most recent call last):
File "/opt/bitnami/airflow/venv/bin/airflow", line 37, in <module>
args.func(args)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/utils/cli.py", line 76, in wrapper
return f(*args, **kwargs)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/bin/cli.py", line 538, in run
dag = get_dag(args)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/bin/cli.py", line 164, in get_dag
'parse.'.format(args.dag_id))
airflow.exceptions.AirflowException: dag_id could not be found: persist_events_dag. Either the dag did not exist or it failed to parse.
[2021-01-08 10:36:50,560: ERROR/ForkPoolWorker-15] execute_command encountered a CalledProcessError
Traceback (most recent call last):
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 78, in execute_command
close_fds=True, env=env)
File "/opt/bitnami/python/lib/python3.6/subprocess.py", line 311, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['airflow', 'run', 'persist_events_dag', 'task_id', '2021-01-08T10:36:40.516529+00:00', '--local', '--pool', 'default_pool', '-sd', '/opt/bitnami/airflow/dags/external/..2021_01_08_10_33_28.059622242/persist_screenviews.py']' returned non-zero exit status 1.
[2021-01-08 10:36:50,561: ERROR/ForkPoolWorker-15] None
[2021-01-08 10:36:50,649: ERROR/ForkPoolWorker-15] Task airflow.executors.celery_executor.execute_command[8ba7956f-6b96-48f8-b112-3a4d7baa8bf7] raised unexpected: AirflowException('Celery command failed',)
Traceback (most recent call last):
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 78, in execute_command
close_fds=True, env=env)
File "/opt/bitnami/python/lib/python3.6/subprocess.py", line 311, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['airflow', 'run', 'persist_events_dag', 'task_id', '2021-01-08T10:36:40.516529+00:00', '--local', '--pool', 'default_pool', '-sd', '/opt/bitnami/airflow/dags/external/..2021_01_08_10_33_28.059622242/persist_screenviews.py']' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/celery/app/trace.py", line 412, in trace_task
R = retval = fun(*args, **kwargs)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/celery/app/trace.py", line 704, in __protected_call__
return self.run(*args, **kwargs)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 83, in execute_command
raise AirflowException('Celery command failed')
airflow.exceptions.AirflowException: Celery command failed
[2021-01-08 10:37:23,414: INFO/MainProcess] Received task: airflow.executors.celery_executor.execute_command[10478bca-c21d-4cf5-b366-5ca1f66c0fe1]
[2021-01-08 10:37:23,415: INFO/ForkPoolWorker-15] Executing command in Celery: ['airflow', 'run', 'persist_events_dag', 'task_id', '2021-01-08T09:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/opt/bitnami/airflow/dags/external/..data/persist_screenviews.py']
[2021-01-08 10:37:29,457] {__init__.py:50} INFO - Using executor CeleryExecutor
[2021-01-08 10:37:29,458] {dagbag.py:417} INFO - Filling up the DagBag from /opt/bitnami/airflow/dags/external/..data/persist_screenviews.py
Running %s on host %s <TaskInstance: persist_events_dag.task_id 2021-01-08T09:00:00+00:00 [queued]> airflow-worker-0.airflow-worker-headless.default.svc.cluster.local
[2021-01-08 10:37:36,294: INFO/ForkPoolWorker-15] Task airflow.executors.celery_executor.execute_command[10478bca-c21d-4cf5-b366-5ca1f66c0fe1] succeeded in 12.8788606680464s: None
If a run succeeds, I nevertheless get the following error in the logs of the Airflow scheduler:
Process DagFileProcessor144631-Process:
Traceback (most recent call last):
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
cursor, statement, parameters, context
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
cursor.execute(statement, parameters)
psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "dag_pkey"
DETAIL: Key (dag_id)=(persist_events_dag) already exists.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/bitnami/python/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/opt/bitnami/python/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/jobs/scheduler_job.py", line 159, in _run_file_processor
pickle_dags)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/utils/db.py", line 74, in wrapper
return func(*args, **kwargs)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/jobs/scheduler_job.py", line 1609, in process_file
dag.sync_to_db()
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/utils/db.py", line 74, in wrapper
return func(*args, **kwargs)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/models/dag.py", line 1535, in sync_to_db
orm_dag.tags = self.get_dagtags(session=session)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/utils/db.py", line 70, in wrapper
return func(*args, **kwargs)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/models/dag.py", line 1574, in get_dagtags
session.commit()
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 1042, in commit
self.transaction.commit()
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 504, in commit
self._prepare_impl()
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 483, in _prepare_impl
self.session.flush()
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2536, in flush
self._flush(objects)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2678, in _flush
transaction.rollback(_capture_exception=True)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
with_traceback=exc_tb,
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2638, in _flush
flush_context.execute()
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/orm/unitofwork.py", line 422, in execute
rec.execute(self)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/orm/unitofwork.py", line 589, in execute
uow,
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/orm/persistence.py", line 245, in save_obj
insert,
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/orm/persistence.py", line 1083, in _emit_insert_statements
c = cached_connections[connection].execute(statement, multiparams)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
return meth(self, multiparams, params)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1130, in _execute_clauseelement
distilled_params,
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1317, in _execute_context
e, statement, parameters, cursor, context
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1511, in _handle_dbapi_exception
sqlalchemy_exception, with_traceback=exc_info[2], from_=e
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
cursor, statement, parameters, context
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.IntegrityError: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "dag_pkey"
DETAIL: Key (dag_id)=(persist_events_dag) already exists.
[SQL: INSERT INTO dag (dag_id, root_dag_id, is_paused, is_subdag, is_active, last_scheduler_run, last_pickled, last_expired, scheduler_lock, pickle_id, fileloc, owners, description, default_view, schedule_interval) VALUES (%(dag_id)s, %(root_dag_id)s, %(is_paused)s, %(is_subdag)s, %(is_active)s, %(last_scheduler_run)s, %(last_pickled)s, %(last_expired)s, %(scheduler_lock)s, %(pickle_id)s, %(fileloc)s, %(owners)s, %(description)s, %(default_view)s, %(schedule_interval)s)]
[parameters: {'dag_id': 'persist_events_dag', 'root_dag_id': None, 'is_paused': True, 'is_subdag': False, 'is_active': True, 'last_scheduler_run': datetime.datetime(2021, 1, 8, 10, 55, 4, 297819, tzinfo=<Timezone [UTC]>), 'last_pickled': None, 'last_expired': None, 'scheduler_lock': None, 'pickle_id': None, 'fileloc': '/opt/bitnami/airflow/dags/external/..2021_01_08_10_33_28.059622242/persist_screenviews.py', 'owners': 'Airflow', 'description': None, 'default_view': None, 'schedule_interval': '"0 * * * *"'}]
(Background on this error at: http://sqlalche.me/e/13/gkpj)
Process DagFileProcessor144689-Process:
Traceback (most recent call last):
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
cursor, statement, parameters, context
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
cursor.execute(statement, parameters)
psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "dag_pkey"
DETAIL: Key (dag_id)=(persist_events_dag) already exists.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/bitnami/python/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/opt/bitnami/python/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/jobs/scheduler_job.py", line 159, in _run_file_processor
pickle_dags)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/utils/db.py", line 74, in wrapper
return func(*args, **kwargs)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/jobs/scheduler_job.py", line 1609, in process_file
dag.sync_to_db()
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/utils/db.py", line 74, in wrapper
return func(*args, **kwargs)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/models/dag.py", line 1535, in sync_to_db
orm_dag.tags = self.get_dagtags(session=session)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/utils/db.py", line 70, in wrapper
return func(*args, **kwargs)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/models/dag.py", line 1574, in get_dagtags
session.commit()
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 1042, in commit
self.transaction.commit()
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 504, in commit
self._prepare_impl()
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 483, in _prepare_impl
self.session.flush()
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2536, in flush
self._flush(objects)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2678, in _flush
transaction.rollback(_capture_exception=True)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
with_traceback=exc_tb,
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2638, in _flush
flush_context.execute()
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/orm/unitofwork.py", line 422, in execute
rec.execute(self)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/orm/unitofwork.py", line 589, in execute
uow,
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/orm/persistence.py", line 245, in save_obj
insert,
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/orm/persistence.py", line 1083, in _emit_insert_statements
c = cached_connections[connection].execute(statement, multiparams)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
return meth(self, multiparams, params)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1130, in _execute_clauseelement
distilled_params,
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1317, in _execute_context
e, statement, parameters, cursor, context
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1511, in _handle_dbapi_exception
sqlalchemy_exception, with_traceback=exc_info[2], from_=e
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
cursor, statement, parameters, context
File "/opt/bitnami/airflow/venv/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.IntegrityError: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "dag_pkey"
DETAIL: Key (dag_id)=(persist_events_dag) already exists.
[SQL: INSERT INTO dag (dag_id, root_dag_id, is_paused, is_subdag, is_active, last_scheduler_run, last_pickled, last_expired, scheduler_lock, pickle_id, fileloc, owners, description, default_view, schedule_interval) VALUES (%(dag_id)s, %(root_dag_id)s, %(is_paused)s, %(is_subdag)s, %(is_active)s, %(last_scheduler_run)s, %(last_pickled)s, %(last_expired)s, %(scheduler_lock)s, %(pickle_id)s, %(fileloc)s, %(owners)s, %(description)s, %(default_view)s, %(schedule_interval)s)]
[parameters: {'dag_id': 'persist_events_dag', 'root_dag_id': None, 'is_paused': True, 'is_subdag': False, 'is_active': True, 'last_scheduler_run': datetime.datetime(2021, 1, 8, 10, 55, 33, 439514, tzinfo=<Timezone [UTC]>), 'last_pickled': None, 'last_expired': None, 'scheduler_lock': None, 'pickle_id': None, 'fileloc': '/opt/bitnami/airflow/dags/external/..2021_01_08_10_33_28.059622242/persist_screenviews.py', 'owners': 'Airflow', 'description': None, 'default_view': None, 'schedule_interval': '"0 * * * *"'}]
(Background on this error at: http://sqlalche.me/e/13/gkpj)
[2021-01-08 10:57:22,392] {scheduler_job.py:963} INFO - 1 tasks up for execution:
<TaskInstance: persist_events_dag.task_id 2021-01-08 09:00:00+00:00 [scheduled]>
[2021-01-08 10:57:22,402] {scheduler_job.py:997} INFO - Figuring out tasks to run in Pool(name=default_pool) with 128 open slots and 1 task instances ready to be queued
[2021-01-08 10:57:22,403] {scheduler_job.py:1025} INFO - DAG persist_events_dag has 0/16 running and queued tasks
[2021-01-08 10:57:22,407] {scheduler_job.py:1085} INFO - Setting the following tasks to queued state:
<TaskInstance: persist_events_dag.task_id 2021-01-08 09:00:00+00:00 [scheduled]>
[2021-01-08 10:57:22,420] {scheduler_job.py:1159} INFO - Setting the following 1 tasks to queued state:
<TaskInstance: persist_events_dag.task_id 2021-01-08 09:00:00+00:00 [queued]>
[2021-01-08 10:57:22,420] {scheduler_job.py:1195} INFO - Sending ('persist_events_dag', 'task_id', datetime.datetime(2021, 1, 8, 9, 0, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 1) to executor with priority 1 and queue default
[2021-01-08 10:57:22,424] {base_executor.py:58} INFO - Adding to queue: ['airflow', 'run', 'persist_events_dag', 'task_id', '2021-01-08T09:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/opt/bitnami/airflow/dags/external/persist_screenviews.py']
[2021-01-08 10:57:36,579] {scheduler_job.py:1334} INFO - Executor reports execution of persist_events_dag.task_id execution_date=2021-01-08 09:00:00+00:00 exited with status success for try_number 1
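
No accepted fix is pasted with this log, but the UniqueViolation on dag_pkey means the scheduler tried to INSERT a dag row for persist_events_dag when one already existed. One thing worth ruling out is the same dag_id being registered by more than one .py file in the dags folder. A hypothetical check could look like the sketch below; the dags folder path is taken from the fileloc in the log above and may differ in your deployment:

import os
from collections import defaultdict

from airflow.models import DagBag

# Assumed path, taken from the 'fileloc' in the log above; adjust for your deployment.
DAGS_FOLDER = '/opt/bitnami/airflow/dags'

definitions = defaultdict(set)
for root, _, files in os.walk(DAGS_FOLDER):
    for name in files:
        if not name.endswith('.py'):
            continue
        path = os.path.join(root, name)
        # Parse each file on its own so two files defining the same dag_id both show up.
        bag = DagBag(dag_folder=path, include_examples=False)
        for dag_id in bag.dag_ids:
            definitions[dag_id].add(path)

for dag_id, paths in sorted(definitions.items()):
    if len(paths) > 1:
        print('%s is defined in more than one file: %s' % (dag_id, sorted(paths)))

If every dag_id maps to a single file, the violation may instead come from parallel DagFileProcessor runs racing on sync_to_db, in which case the error tends to be transient scheduler-log noise rather than a problem in the DAG itself.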

Airflow 1.9 - SSHOperator doesn't seem to work?

I upgraded to v1.9 and I'm having a hard time getting the SSHOperator to work. It was working with v1.8.2.
Code
from airflow import DAG
from airflow.contrib.operators.ssh_operator import SSHOperator

dag = DAG('transfer_ftp_s3', default_args=default_args, schedule_interval=None)

task = SSHOperator(
    ssh_conn_id='ssh_node',
    task_id="check_ftp_for_new_files",
    command="echo 'hello world'",
    do_xcom_push=True,
    dag=dag,
)
Error
[2018-02-19 06:48:02,691] {{base_task_runner.py:98}} INFO - Subtask: Traceback (most recent call last):
[2018-02-19 06:48:02,691] {{base_task_runner.py:98}} INFO - Subtask: File "/usr/bin/airflow", line 27, in <module>
[2018-02-19 06:48:02,692] {{base_task_runner.py:98}} INFO - Subtask: args.func(args)
[2018-02-19 06:48:02,693] {{base_task_runner.py:98}} INFO - Subtask: File "/usr/lib/python2.7/site-packages/airflow/bin/cli.py", line 392, in run
[2018-02-19 06:48:02,695] {{base_task_runner.py:98}} INFO - Subtask: pool=args.pool,
[2018-02-19 06:48:02,695] {{base_task_runner.py:98}} INFO - Subtask: File "/usr/lib/python2.7/site-packages/airflow/utils/db.py", line 50, in wrapper
[2018-02-19 06:48:02,696] {{base_task_runner.py:98}} INFO - Subtask: result = func(*args, **kwargs)
[2018-02-19 06:48:02,696] {{base_task_runner.py:98}} INFO - Subtask: File "/usr/lib/python2.7/site-packages/airflow/models.py", line 1496, in _run_raw_task
[2018-02-19 06:48:02,696] {{base_task_runner.py:98}} INFO - Subtask: result = task_copy.execute(context=context)
[2018-02-19 06:48:02,697] {{base_task_runner.py:98}} INFO - Subtask: File "/usr/lib/python2.7/site-packages/airflow/contrib/operators/ssh_operator.py", line 146, in execute
[2018-02-19 06:48:02,697] {{base_task_runner.py:98}} INFO - Subtask: raise AirflowException("SSH operator error: {0}".format(str(e)))
[2018-02-19 06:48:02,698] {{base_task_runner.py:98}} INFO - Subtask: airflow.exceptions.AirflowException: SSH operator error: 'bool' object has no attribute 'lower'
As per AIRFLOW-2122, check your connection settings and make sure the values in the connection's Extra field are strings instead of booleans.
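For illustration, here is a minimal sketch of registering the ssh_node connection from code with the Extra values written as JSON strings rather than booleans. Only conn_id='ssh_node' comes from the question; the host, login, and extra keys are assumptions:

import json

from airflow import settings
from airflow.models import Connection

# Hypothetical connection values; adjust to your environment.
extra = json.dumps({
    'key_file': '/home/airflow/.ssh/id_rsa',   # assumed key path
    'no_host_key_check': 'true',               # the string 'true', not the bool true
})

conn = Connection(
    conn_id='ssh_node',
    conn_type='ssh',
    host='ftp.example.com',   # assumed host
    login='airflow',          # assumed user
    extra=extra,
)

session = settings.Session()
session.add(conn)
session.commit()

The same idea applies when editing the connection in the web UI: quote the values inside the Extra JSON, e.g. {"no_host_key_check": "true"} rather than true, so the SSH hook's .lower() call gets a string.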

Resources