SQLite Error disk I/O when running Airflow commands - sqlite

Upon running:
airflow scheduler
I get the following error:
[2022-08-10 13:26:53,501] {scheduler_job.py:708} INFO - Starting the scheduler
[2022-08-10 13:26:53,502] {scheduler_job.py:713} INFO - Processing each file at most -1 times
[2022-08-10 13:26:53,509] {executor_loader.py:105} INFO - Loaded executor: SequentialExecutor
[2022-08-10 13:26:53 -0400] [1388] [INFO] Starting gunicorn 20.1.0
[2022-08-10 13:26:53,540] {manager.py:160} INFO - Launched DagFileProcessorManager with pid: 1389
[2022-08-10 13:26:53,545] {scheduler_job.py:1233} INFO - Resetting orphaned tasks for active dag runs
.
.
.
[2022-08-10 13:26:53 -0400] [1391] [INFO] Booting worker with pid: 1391
Process DagFileProcessor10-Process:
Traceback (most recent call last):
File "/home/dromo/anaconda3/envs/airflow_env_2/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 998, in _commit_impl
self.engine.dialect.do_commit(self.connection)
File "/home/dromo/anaconda3/envs/airflow_env_2/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 672, in do_commit
dbapi_connection.commit()
sqlite3.OperationalError: disk I/O error
I get this 'disk I/O error' as well when I run airflow webserver --port 8080 command as so:
Workers: 4 sync
Host: 0.0.0.0:8080
Timeout: 120
Logfiles: - -
Access Logformat:
=================================================================
[2022-08-10 14:42:28 -0400] [2759] [INFO] Starting gunicorn 20.1.0
[2022-08-10 14:42:29 -0400] [2759] [INFO] Listening at: http://0.0.0.0:8080 (2759)
[2022-08-10 14:42:29 -0400] [2759] [INFO] Using worker: sync
.
.
.
[2022-08-10 14:42:55,149] {app.py:1455} ERROR - Exception on /static/appbuilder/datepicker/bootstrap-datepicker.css [GET]
Traceback (most recent call last):
File "/home/dromo/anaconda3/envs/airflow_env_2/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 998, in _commit_impl
self.engine.dialect.do_commit(self.connection)
File "/home/dromo/anaconda3/envs/airflow_env_2/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 672, in do_commit
dbapi_connection.commit()
sqlite3.OperationalError: disk I/O error
Any ideas as to what might be causing this and possible fixes?

It seems like airflow doesn't find the database on the disk, try to initialize it:
airflow db init

Related

Airflow task randomly exited with return code 1 [Local Executor / PythonOperator]

To give some context, I am using Airflow 2.3.0 on Kubernetes with the Local Executor (which may sound weird, but it works for us for now) with one pod for the webserver and two for the scheduler.
I have a DAG consisting of a single task (PythonOperator) that makes many API calls (200K) using requests.
Every 15 calls, the data is loaded in a DataFrame and stored on AWS S3 (using boto3) to reduce the RAM usage.
The problem is that I can't get to the end of this task because it goes into error randomly (after 1, 10 or 120 minutes).
I have made more than 50 tries, no success and the only logs on the task are:
[2022-09-01, 14:45:44 UTC] {taskinstance.py:1159} INFO - Dependencies all met for <TaskInstance: INGESTION-DAILY-dag.extract_task scheduled__2022-08-30T00:00:00+00:00 [queued]>
[2022-09-01, 14:45:44 UTC] {taskinstance.py:1159} INFO - Dependencies all met for <TaskInstance: INGESTION-DAILY-dag.extract_task scheduled__2022-08-30T00:00:00+00:00 [queued]>
[2022-09-01, 14:45:44 UTC] {taskinstance.py:1356} INFO -
--------------------------------------------------------------------------------
[2022-09-01, 14:45:44 UTC] {taskinstance.py:1357} INFO - Starting attempt 23 of 24
[2022-09-01, 14:45:44 UTC] {taskinstance.py:1358} INFO -
--------------------------------------------------------------------------------
[2022-09-01, 14:45:44 UTC] {taskinstance.py:1377} INFO - Executing <Task(_PythonDecoratedOperator): extract_task> on 2022-08-30 00:00:00+00:00
[2022-09-01, 14:45:44 UTC] {standard_task_runner.py:52} INFO - Started process 942 to run task
[2022-09-01, 14:45:44 UTC] {standard_task_runner.py:79} INFO - Running: ['airflow', 'tasks', 'run', 'INGESTION-DAILY-dag', 'extract_task', 'scheduled__2022-08-30T00:00:00+00:00', '--job-id', '4390', '--raw', '--subdir', 'DAGS_FOLDER/dags/ingestion/daily_dag/dag.py', '--cfg-path', '/tmp/tmpwxasaq93', '--error-file', '/tmp/tmpl7t_gd8e']
[2022-09-01, 14:45:44 UTC] {standard_task_runner.py:80} INFO - Job 4390: Subtask extract_task
[2022-09-01, 14:45:45 UTC] {task_command.py:369} INFO - Running <TaskInstance: INGESTION-DAILY-dag.extract_task scheduled__2022-08-30T00:00:00+00:00 [running]> on host 10.XX.XXX.XXX
[2022-09-01, 14:48:17 UTC] {local_task_job.py:156} INFO - Task exited with return code 1
[2022-09-01, 14:48:17 UTC] {taskinstance.py:1395} INFO - Marking task as UP_FOR_RETRY. dag_id=INGESTION-DAILY-dag, task_id=extract_task, execution_date=20220830T000000, start_date=20220901T144544, end_date=20220901T144817
[2022-09-01, 14:48:17 UTC] {local_task_job.py:273} INFO - 0 downstream tasks scheduled from follow-on schedule check
But when I go to the pod logs, I get the following message:
[2022-09-01 14:06:31,624] {local_executor.py:128} ERROR - Failed to execute task an integer is required (got type ChunkedEncodingError).
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/local_executor.py", line 124, in _execute_work_in_fork
args.func(args)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/cli/cli_parser.py", line 51, in command
return func(*args, **kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/cli.py", line 99, in wrapper
return f(*args, **kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/cli/commands/task_command.py", line 377, in task_run
_run_task_by_selected_method(args, dag, ti)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/cli/commands/task_command.py", line 183, in _run_task_by_selected_method
_run_task_by_local_task_job(args, ti)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/cli/commands/task_command.py", line 241, in _run_task_by_local_task_job
run_job.run()
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/base_job.py", line 244, in run
self._execute()
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/local_task_job.py", line 105, in _execute
self.task_runner.start()
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/task/task_runner/standard_task_runner.py", line 41, in start
self.process = self._start_by_fork()
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/task/task_runner/standard_task_runner.py", line 125, in _start_by_fork
os._exit(return_code)
TypeError: an integer is required (got type ChunkedEncodingError)
What I find strange is that I never had this error on other DAGs (where tasks are smaller and faster). I checked, during an attempt, CPU and RAM usages are stable and low.
I have the same error locally, I also tried to upgrade to 2.3.4 but nothing works.
Do you have any idea how to fix this?
Thanks a lot!
Nicolas
As #EDG956 said, this is not an error from Airflow but from the code.
I solved it using a context manager (which was not enough) and recreating a session:
s = requests.Session()
while True:
try:
with s.get(base_url) as r:
response = r
except requests.exceptions.ChunkedEncodingError:
s.close()
s.requests.Session()
response = s.get(base_url)

Airflow 2 Error sending Celery task: Timeout

I am in the process of migrating our Airflow environment from version 1.10.15 to 2.3.3. I have migrated 1 DAG over to the new environment and intermittently I get an email with this error: Executor reports task instance finished (failed) although the task says its queued. (Info: None) Was the task killed externally?
When looking at the logs, this is what I find in the scheduler logs:
[2022-08-09 07:00:08,621] {dag.py:2968} INFO - Setting next_dagrun for DAGRP-Get_Overrides to 2022-08-09T11:00:00+00:00, run_after=2022-08-09T16:00:00+00:00
[2022-08-09 07:00:08,652] {scheduler_job.py:353} INFO - 1 tasks up for execution:
<TaskInstance: DAGRP-Get_Overrides.Get_override scheduled__2022-08-08T16:00:00+00:00 [scheduled]>
[2022-08-09 07:00:08,652] {scheduler_job.py:418} INFO - DAG DAGRP-Get_Overrides has 0/3 running and queued tasks
[2022-08-09 07:00:08,652] {scheduler_job.py:504} INFO - Setting the following tasks to queued state:
<TaskInstance: DAGRP-Get_Overrides.Get_override scheduled__2022-08-08T16:00:00+00:00 [scheduled]>
[2022-08-09 07:00:08,654] {scheduler_job.py:546} INFO - Sending TaskInstanceKey(dag_id='DAGRP-Get_Overrides', task_id='Get_override', run_id='scheduled__2022-08-08T16:00:00+00:00', try_number=1, map_index=-1) to executor with priority 1 and queue default
[2022-08-09 07:00:08,654] {base_executor.py:91} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'DAGRP-Get_Overrides', 'Get_override', 'scheduled__2022-08-08T16:00:00+00:00', '--local', '--subdir', 'DAGS_FOLDER/da_group/get_override.py']
[2022-08-09 07:00:12,665] {timeout.py:67} ERROR - Process timed out, PID: 1
[2022-08-09 07:00:12,667] {celery_executor.py:283} INFO - [Try 1 of 3] Task Timeout Error for Task: (TaskInstanceKey(dag_id='DAGRP-Get_Overrides', task_id='Get_override', run_id='scheduled__2022-08-08T16:00:00+00:00', try_number=1, map_index=-1)).
[2022-08-09 07:00:16,701] {timeout.py:67} ERROR - Process timed out, PID: 1
[2022-08-09 07:00:16,702] {celery_executor.py:283} INFO - [Try 2 of 3] Task Timeout Error for Task: (TaskInstanceKey(dag_id='DAGRP-Get_Overrides', task_id='Get_override', run_id='scheduled__2022-08-08T16:00:00+00:00', try_number=1, map_index=-1)).
[2022-08-09 07:00:21,704] {timeout.py:67} ERROR - Process timed out, PID: 1
[2022-08-09 07:00:21,705] {celery_executor.py:283} INFO - [Try 3 of 3] Task Timeout Error for Task: (TaskInstanceKey(dag_id='DAGRP-Get_Overrides', task_id='Get_override', run_id='scheduled__2022-08-08T16:00:00+00:00', try_number=1, map_index=-1)).
[2022-08-09 07:00:26,627] {timeout.py:67} ERROR - Process timed out, PID: 1
[2022-08-09 07:00:26,627] {celery_executor.py:294} ERROR - Error sending Celery task: Timeout, PID: 1
Celery Task ID: TaskInstanceKey(dag_id='DAGRP-Get_Overrides', task_id='Get_override', run_id='scheduled__2022-08-08T16:00:00+00:00', try_number=1, map_index=-1)
Traceback (most recent call last):
File "/opt/airflow/lib/python3.8/site-packages/kombu/utils/functional.py", line 30, in __call__
return self.__value__
AttributeError: 'ChannelPromise' object has no attribute '__value__'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/airflow/lib/python3.8/site-packages/airflow/executors/celery_executor.py", line 177, in send_task_to_executor
result = task_to_run.apply_async(args=[command], queue=queue)
File "/opt/airflow/lib/python3.8/site-packages/celery/app/task.py", line 575, in apply_async
return app.send_task(
File "/opt/airflow/lib/python3.8/site-packages/celery/app/base.py", line 788, in send_task
amqp.send_task_message(P, name, message, **options)
File "/opt/airflow/lib/python3.8/site-packages/celery/app/amqp.py", line 510, in send_task_message
ret = producer.publish(
File "/opt/airflow/lib/python3.8/site-packages/kombu/messaging.py", line 177, in publish
return _publish(
File "/opt/airflow/lib/python3.8/site-packages/kombu/connection.py", line 523, in _ensured
return fun(*args, **kwargs)
File "/opt/airflow/lib/python3.8/site-packages/kombu/messaging.py", line 186, in _publish
channel = self.channel
File "/opt/airflow/lib/python3.8/site-packages/kombu/messaging.py", line 209, in _get_channel
channel = self._channel = channel()
File "/opt/airflow/lib/python3.8/site-packages/kombu/utils/functional.py", line 32, in __call__
value = self.__value__ = self.__contract__()
File "/opt/airflow/lib/python3.8/site-packages/kombu/messaging.py", line 225, in <lambda>
channel = ChannelPromise(lambda: connection.default_channel)
File "/opt/airflow/lib/python3.8/site-packages/kombu/connection.py", line 895, in default_channel
self._ensure_connection(**conn_opts)
File "/opt/airflow/lib/python3.8/site-packages/kombu/connection.py", line 433, in _ensure_connection
return retry_over_time(
File "/opt/airflow/lib/python3.8/site-packages/kombu/utils/functional.py", line 312, in retry_over_time
return fun(*args, **kwargs)
File "/opt/airflow/lib/python3.8/site-packages/kombu/connection.py", line 877, in _connection_factory
self._connection = self._establish_connection()
File "/opt/airflow/lib/python3.8/site-packages/kombu/connection.py", line 812, in _establish_connection
conn = self.transport.establish_connection()
File "/opt/airflow/lib/python3.8/site-packages/kombu/transport/pyamqp.py", line 201, in establish_connection
conn.connect()
File "/opt/airflow/lib/python3.8/site-packages/amqp/connection.py", line 323, in connect
self.transport.connect()
File "/opt/airflow/lib/python3.8/site-packages/amqp/transport.py", line 129, in connect
self._connect(self.host, self.port, self.connect_timeout)
File "/opt/airflow/lib/python3.8/site-packages/amqp/transport.py", line 184, in _connect
self.sock.connect(sa)
File "/opt/airflow/lib/python3.8/site-packages/airflow/utils/timeout.py", line 68, in handle_timeout
raise AirflowTaskTimeout(self.error_message)
airflow.exceptions.AirflowTaskTimeout: Timeout, PID: 1
[2022-08-09 07:00:26,627] {scheduler_job.py:599} INFO - Executor reports execution of DAGRP-Get_Overrides.Get_override run_id=scheduled__2022-08-08T16:00:00+00:00 exited with status failed for try_number 1
[2022-08-09 07:00:26,633] {scheduler_job.py:642} INFO - TaskInstance Finished: dag_id=DAGRP-Get_Overrides, task_id=Get_override, run_id=scheduled__2022-08-08T16:00:00+00:00, map_index=-1, run_start_date=None, run_end_date=None, run_duration=None, state=queued, executor_state=failed, try_number=1, max_tries=0, job_id=None, pool=default_pool, queue=default, priority_weight=1, operator=PythonOperator, queued_dttm=2022-08-09 11:00:08.652767+00:00, queued_by_job_id=56, pid=None
[2022-08-09 07:00:26,633] {scheduler_job.py:684} ERROR - Executor reports task instance <TaskInstance: DAGRP-Get_Overrides.Get_override scheduled__2022-08-08T16:00:00+00:00 [queued]> finished (failed) although the task says its queued. (Info: None) Was the task killed externally?
[2022-08-09 07:01:16,687] {processor.py:233} WARNING - Killing DAGFileProcessorProcess (PID=1811)
[2022-08-09 07:04:00,640] {scheduler_job.py:1233} INFO - Resetting orphaned tasks for active dag runs
I am running Airflow on 2 servers with 2 of each service (2 schedulers, 2 workers, 2 webservers). They are running in docker containers. They are configured to use celery executor and I'm using RabbitMQ version 3.10.6 (also 2 instances in docker containers behind a LB). I am using Postgres 13.7 for our database (running one instance in a docker container on the 1st server). Our environment is running on Python 3.8.12.
From my understanding, the timeout is between the scheduler and rabbitmq? From what I can tell we are hitting this timeout: AIRFLOW__CELERY__OPERATION_TIMEOUT (it's currently set to 4).
I would like to track down what is causing the issue before I just increase timeout settings. What can I do to find out what's going on? Anyone else run into this issue? Am I correct in assuming the timeout is between the scheduler and rabbitmq? Is it between the scheduler and database? Why am I seeing this with Airflow 2 when I have the same setup with Airflow 1 and it works with no problems? Any help is greatly appreciated!
Update:
I was able to reproduce the error by shutting down 1 of the rabbitmq nodes. Even though rabbitmq is behind a LB with a health probe, whenever a job was picked up by scheduler 1, it would fail with this error... But if scheduler 2 picked up the job, it would finish successfully. The odd thing is that I shut down rabbitmq 2..
So I think I've been able to solve this issue. Here is what I did:
I added a custom celery_config.py to the scheduler and worker docker containers, adding this environment variable: AIRFLOW__CELERY__CELERY_CONFIG_OPTIONS=celery_config.CELERY_CONFIG. As part of that celery config, I specified both my rabbitmq brokers under broker_url. This is the full config:
from airflow.config_templates.default_celery import DEFAULT_CELERY_CONFIG
import os
RABBITMQ_PW = os.environ["RABBITMQ_PW"]
CLUSTER_NODE = os.environ["RABBITMQ_CLUSTER_NODE"]
LOCAL_NODE = os.environ["RABBITMQ_NODE"]
CELERY_CONFIG = {
**DEFAULT_CELERY_CONFIG,
"worker_send_task_events": True,
"task_send_sent_event": True,
"result_extended": True,
"broker_url": [
f'amqp://rabbitmq:{RABBITMQ_PW}#{LOCAL_NODE}:5672',
f'amqp://rabbitmq:{RABBITMQ_PW}#{CLUSTER_NODE}:5672'
]
}
What happens now in the worker, if it looses connection to the 1st broker, it will attempt to connect to the 2nd broker.
[2022-08-11 12:00:52,876: ERROR/MainProcess] consumer: Cannot connect to amqp://rabbitmq:**#<LOCAL_NODE>:5672//: [Errno 111] Connection refused.
[2022-08-11 12:00:52,875: INFO/MainProcess] Connected to amqp://rabbitmq:**#<CLUSTER_NODE>:5672//
Also an interesting note, I still have the Airflow environment variable AIRFLOW__CELERY__BROKER_URL set to the load balancer URL. That's because Airflow 1 won't allow the worker to start without it, and 2 won't allow you to specify multiple brokers like the celery config does. So when the worker starts, it shows:
- ** ---------- .> transport: amqp://rabbitmq:**#<LOCAL_NODE>:5672//
[2022-08-26 11:37:17,952: INFO/MainProcess] Connected to amqp://rabbitmq:**#<LOCAL_NODE>:5672//
Even though I have the LB configured for the AIRFLOW__CELERY__BROKER_URL

Airflow server constantly restarting - Signal 15

I launch a airflow webserver command in my local machine to start an airflow instance on port 8081. The server starts, however the pŕompt constantly shows some warning messages, as a loop. No error message appears, but the server doesn't works. Those are the messages:
/usr/local/lib/python3.8/dist-packages/airflow/configuration.py:361 DeprecationWarning: The default_queue option in [celery] has been moved to the default_queue option in [operators] - the old setting has been used, but please update your config.
/usr/local/lib/python3.8/dist-packages/airflow/configuration.py:361 DeprecationWarning: The dag_concurrency option in [core] has been renamed to max_active_tasks_per_dag - the old setting has been used, but please update your config.
/usr/local/lib/python3.8/dist-packages/airflow/configuration.py:361 DeprecationWarning: The processor_poll_interval option in [scheduler] has been renamed to scheduler_idle_sleep_time - the old setting has been used, but please update your config.
[2022-06-13 15:11:57,355] {manager.py:779} WARNING - No user yet created, use flask fab command to do it.
[2022-06-13 15:12:01,925] {manager.py:512} WARNING - Refused to delete permission view, assoc with role exists DAG Runs.can_create User
[2022-06-13 15:12:19 +0000] [1117638] [INFO] Handling signal: ttou
[2022-06-13 15:12:19 +0000] [1120256] [INFO] Worker exiting (pid: 1120256)
[2022-06-13 15:12:19 +0000] [1117638] [WARNING] Worker with pid 1120256 was terminated due to signal 15
[2022-06-13 15:12:22 +0000] [1117638] [INFO] Handling signal: ttin
[2022-06-13 15:12:22 +0000] [1121568] [INFO] Booting worker with pid: 1121568
Do you know what can could be happening?
Thank you in advance!

How to fix the error "AirflowException("Hostname of job runner does not match")"?

I'm running airflow on my computer (Mac AirBook, 1.6 GHz Intel Core i5 and 8 GB 2133 MHz LPDDR3). A DAG with several tasks, failed with below error. Checked several articles online but with little to no help. There is nothing wrong with the task itself(double checked).
Any help is much appreciated.
[2019-08-27 13:01:55,372] {sequential_executor.py:45} INFO - Executing command: ['airflow', 'run', 'Makefile_DAG', 'normalize_companies', '2019-08-27T15:38:20.914820+00:00', '--local', '--pool', 'default_pool', '-sd', '/home/airflow/dags/makefileDAG.py']
[2019-08-27 13:01:56,937] {settings.py:213} INFO - settings.configure_orm(): Using pool settings. pool_size=5, max_overflow=10, pool_recycle=1800, pid=40647
[2019-08-27 13:01:57,285] {__init__.py:51} INFO - Using executor SequentialExecutor
[2019-08-27 13:01:59,423] {dagbag.py:90} INFO - Filling up the DagBag from /home/airflow/dags/makefileDAG.py
[2019-08-27 13:02:01,736] {cli.py:516} INFO - Running <TaskInstance: Makefile_DAG.normalize_companies 2019-08-27T15:38:20.914820+00:00 [queued]> on host ajays-macbook-air.local
Traceback (most recent call last):
File "/anaconda3/envs/airflow/bin/airflow", line 32, in <module>
args.func(args)
File "/anaconda3/envs/airflow/lib/python3.6/site-packages/airflow/utils/cli.py", line 74, in wrapper
return f(*args, **kwargs)
File "/anaconda3/envs/airflow/lib/python3.6/site-packages/airflow/bin/cli.py", line 522, in run
_run(args, dag, ti)
File "/anaconda3/envs/airflow/lib/python3.6/site-packages/airflow/bin/cli.py", line 435, in _run
run_job.run()
File "/anaconda3/envs/airflow/lib/python3.6/site-packages/airflow/jobs/base_job.py", line 213, in run
self._execute()
File "/anaconda3/envs/airflow/lib/python3.6/site-packages/airflow/jobs/local_task_job.py", line 111, in _execute
self.heartbeat()
File "/anaconda3/envs/airflow/lib/python3.6/site-packages/airflow/jobs/base_job.py", line 196, in heartbeat
self.heartbeat_callback(session=session)
File "/anaconda3/envs/airflow/lib/python3.6/site-packages/airflow/utils/db.py", line 70, in wrapper
return func(*args, **kwargs)
File "/anaconda3/envs/airflow/lib/python3.6/site-packages/airflow/jobs/local_task_job.py", line 159, in heartbeat_callback
raise AirflowException("Hostname of job runner does not match")
airflow.exceptions.AirflowException: Hostname of job runner does not match
[2019-08-27 13:05:05,904] {sequential_executor.py:52} ERROR - Failed to execute task Command '['airflow', 'run', 'Makefile_DAG', 'normalize_companies', '2019-08-27T15:38:20.914820+00:00', '--local', '--pool', 'default_pool', '-sd', '/home/airflow/dags/makefileDAG.py']' returned non-zero exit status 1..
[2019-08-27 13:05:05,905] {scheduler_job.py:1256} INFO - Executor reports execution of Makefile_DAG.normalize_companies execution_date=2019-08-27 15:38:20.914820+00:00 exited with status failed for try_number 2
Logs from the task:
[2019-08-27 13:02:13,616] {bash_operator.py:115} INFO - Running command: python /home/Makefile_Redo/normalize_companies.py
[2019-08-27 13:02:13,628] {bash_operator.py:124} INFO - Output:
[2019-08-27 13:05:02,849] {logging_mixin.py:95} INFO - [[34m2019-08-27 13:05:02,848[0m] {[34mlocal_task_job.py:[0m158} [33mWARNING[0m - [33mThe recorded hostname [1majays-macbook-air.local[0m does not match this instance's hostname [1mAJAYs-MacBook-Air.local[0m[0m
[2019-08-27 13:05:02,860] {helpers.py:319} INFO - Sending Signals.SIGTERM to GPID 40649
[2019-08-27 13:05:02,861] {taskinstance.py:897} ERROR - Received SIGTERM. Terminating subprocesses.
[2019-08-27 13:05:02,862] {bash_operator.py:142} INFO - Sending SIGTERM signal to bash process group
[2019-08-27 13:05:03,539] {taskinstance.py:1047} ERROR - Task received SIGTERM signal
Traceback (most recent call last):
File "/anaconda3/envs/airflow/lib/python3.6/site-packages/airflow/models/taskinstance.py", line 922, in _run_raw_task
result = task_copy.execute(context=context)
File "/anaconda3/envs/airflow/lib/python3.6/site-packages/airflow/operators/bash_operator.py", line 126, in execute
for line in iter(sp.stdout.readline, b''):
File "/anaconda3/envs/airflow/lib/python3.6/site-packages/airflow/models/taskinstance.py", line 899, in signal_handler
raise AirflowException("Task received SIGTERM signal")
airflow.exceptions.AirflowException: Task received SIGTERM signal
[2019-08-27 13:05:03,550] {taskinstance.py:1076} INFO - All retries failed; marking task as FAILED
A weird thing I noticed from above log is:
The recorded hostname [1majays-macbook-air.local[0m does not match this instance's hostname [1mAJAYs-MacBook-Air.local[0m[0m
How is this possible and any solution to fix this?
I had the same problem on my Mac. The solution that worked for me was updating airflow.cfg with hostname_callable = socket:gethostname. The original getfqdn returns different hostnames from time to time.

Setup of flex units on linux

I am working on a project using Flex and until now we are using Windows to run flex unit tests for the modules/artifacts that require flex environment. Because of given dependencies, it is difficult to automatize anything because I have to swithch between linux/windows when running those maven tests.
I have made an effort to try to make flex units tests run on linux, but have not succedded [yet]. Here is small part of the stack trace from maven clean test -X on a flex project.
[INFO] Flexmojos 3.8
[INFO] Apache License - Version 2.0 (NO WARRANTY) - See COPYRIGHT file
[INFO] Running tests /root/trunk/flex-project/flex-surface/flex-surface-common/flex-surface-common-flex/target/test-classes/TestRunner.swf
[DEBUG] [org.sonatype.flexmojos.test.monitor.AsVmPing] opened server socket on port 13540
[DEBUG] [org.sonatype.flexmojos.test.monitor.ResultHandler] opened server socket on port 13539
[DEBUG] [LAUNCHER] ASVmLauncher starting
[DEBUG] [LAUNCHER] exec: /usr/bin/flashplayer - /root/trunk/flex-project/flex-project-arbeidsflate/flex-surface-common/flex-surface-common-flex/target/test-classes/TestRunner.swf
[DEBUG] [LAUNCHER] Creating process
[WARNING] [LAUNCHER] Using xvfb-run to launch headless tests
[DEBUG] [LAUNCHER] Process created java.lang.UNIXProcess#1a6c088
[DEBUG] [MOJO] launcher RUNNING
[DEBUG] [MOJO] pinger STARTED
[DEBUG] [MOJO] resultHandler STARTED
[DEBUG] [LAUNCHER] Output pumpers ON
[DEBUG] [LAUNCHER] Waiting for flashplayer termination
[DEBUG] [LAUNCHER] Flashplayer closed
[DEBUG] [LAUNCHER] Unexpected return code 1
[DEBUG] [SYSERR]: mktemp: cannot create temp file /tmp/Xauthority: File exists
[DEBUG] [MOJO] launcher ERROR
[DEBUG] [MOJO] pinger STARTED
[DEBUG] [MOJO] resultHandler STARTED
[INFO] ------------------------------------------------------------------------
[INFO] Tests run: 0, Failures: 0, Errors: 0, Time Elapsed: 0 sec
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
I need help find out what is wrong. If any of you guys know any other way to run flex units test on linux, NOT THROUGH JENKINS/HUDSON, I will be extremely grateful.
First of all, just follow the linux part of the instructions on the following site:Running unit tests - FlexMojos. Download the flashplayer and untar it some appropriate place and put the absolute directory path in your PATH variable.
Download the xvfb-run script and change the following 'fi'
# If the user did not specify an X authorization file to use, set up a temporary
# directory to house one.
if [ -z "$AUTHFILE" ]; then
XVFB_RUN_TMPDIR="${TMPDIR:-/tmp}/$PROGNAME.$$"
if ! mkdir -p -m 700 "$XVFB_RUN_TMPDIR"; then
error "temporary directory $XVFB_RUN_TMPDIR already exists"
exit 4
fi
AUTHFILE=$(mktemp -p "$XVFB_RUN_TMPDIR" Xauthority)
fi
to
# If the user did not specify an X authorization file to use, set up a temporary
# directory to house one.
if [ -z "$AUTHFILE" ]; then
XVFB_RUN_TMPDIR=$(mktemp -d)
if ! mkdir -p -m 700 "$XVFB_RUN_TMPDIR"; then
error "temporary directory $XVFB_RUN_TMPDIR already exists"
exit 4
fi
AUTHFILE=$(mktemp "$XVFB_RUN_TMPDIR/Xauthority")
fi
I solved my problem, hopefully you will too.
Good luck.

Resources