Cloudify manager bootstrapping - rest service failed - cloudify

I followed the steps in http://docs.getcloudify.org/4.1.0/installation/bootstrapping/#option-2-bootstrapping-a-cloudify-manager to bootstrap the Cloudify manager using option 2, and I am getting the following error repeatedly:
Workflow failed: Task failed 'fabric_plugin.tasks.run_script' -> restservice
error: http://127.0.0.1:8100: <urlopen error [Errno 111] Connection refused>
The command is able to install and verify a lot of things like rabbitmq, postgresql etc., but it always fails at the REST service. Creating and configuring the REST service succeeds, but verification fails. It looks like the service never starts.
2017-08-22 04:23:19.700 CFY <manager> [rest_service_cyd4of.start] Task started 'fabric_plugin.tasks.run_script'
2017-08-22 04:23:20.506 LOG <manager> [rest_service_cyd4of.start] INFO: Starting Cloudify REST Service...
2017-08-22 04:23:21.011 LOG <manager> [rest_service_cyd4of.start] INFO: Verifying Rest service is running...
2017-08-22 04:23:21.403 LOG <manager> [rest_service_cyd4of.start] INFO: Verifying Rest service is working as expected...
2017-08-22 04:23:21.575 LOG <manager> [rest_service_cyd4of.start] WARNING: <urlopen error [Errno 111] Connection refused>, Retrying in 3 seconds...
2017-08-22 04:23:24.691 LOG <manager> [rest_service_cyd4of.start] WARNING: <urlopen error [Errno 111] Connection refused>, Retrying in 6 seconds...
2017-08-22 04:23:30.815 LOG <manager> [rest_service_cyd4of.start] WARNING: <urlopen error [Errno 111] Connection refused>, Retrying in 12 seconds...
[10.0.2.15] out: restservice error: http://127.0.0.1:8100: <urlopen error [Errno 111] Connection refused>
[10.0.2.15] out: Traceback (most recent call last):
[10.0.2.15] out: File "/tmp/cloudify-ctx/scripts/tmp4BXh2m-start.py-VHYZP1K3", line 71, in <module>
[10.0.2.15] out: verify_restservice(restservice_url)
[10.0.2.15] out: File "/tmp/cloudify-ctx/scripts/tmp4BXh2m-start.py-VHYZP1K3", line 34, in verify_restservice
[10.0.2.15] out: utils.verify_service_http(SERVICE_NAME, url, headers=headers)
[10.0.2.15] out: File "/tmp/cloudify-ctx/scripts/utils.py", line 1734, in verify_service_http
[10.0.2.15] out: ctx.abort_operation('{0} error: {1}: {2}'.format(service_name, url, e))
[10.0.2.15] out: File "/tmp/cloudify-ctx/cloudify.py", line 233, in abort_operation
[10.0.2.15] out: subprocess.check_call(cmd)
[10.0.2.15] out: File "/usr/lib64/python2.7/subprocess.py", line 542, in check_call
[10.0.2.15] out: raise CalledProcessError(retcode, cmd)
[10.0.2.15] out: subprocess.CalledProcessError: Command '['ctx', 'abort_operation', 'restservice error: http://127.0.0.1:8100: <urlopen error [Errno 111] Connection refused>']' returned non-zero exit status 1
[10.0.2.15] out:
Fatal error: run() received nonzero return code 1 while executing!
Requested: source /tmp/cloudify-ctx/scripts/env-tmp4BXh2m-start.py-VHYZP1K3 && /tmp/cloudify-ctx/scripts/tmp4BXh2m-start.py-VHYZP1K3
Executed: /bin/bash -l -c "cd /tmp/cloudify-ctx/work && source /tmp/cloudify-ctx/scripts/env-tmp4BXh2m-start.py-VHYZP1K3 && /tmp/cloudify-ctx/scripts/tmp4BXh2m-start.py-VHYZP1K3"
I am using CentOS 7.
Any suggestions to address or debug the issue will be appreciated.
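One quick check, independent of the bootstrap retries, is to probe the same URL from the manager VM itself. A minimal sketch (Python 2, to match the bootstrap environment shown in the traceback; the URL is the one from the error above):

import urllib2

try:
    resp = urllib2.urlopen('http://127.0.0.1:8100', timeout=5)
    print('REST service answered with HTTP %d' % resp.getcode())
except urllib2.HTTPError as e:
    # An HTTP error response still means something is listening on port 8100.
    print('REST service answered with HTTP %d' % e.code)
except Exception as e:
    # [Errno 111] Connection refused here means nothing is listening on
    # port 8100 at all, i.e. the service process never came up.
    print('REST service unreachable: %s' % e)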

Can you please try the same bootstrap option using these instructions and let me know if it works for you?

Do you have the python-virtualenv package installed? If you do, try uninstalling it.
The version of virtualenv in CentOS repositories is too old and causes problems with the REST service installation. Cloudify will install its own version of virtualenv while bootstrapping, but only if one is not already present in the system.
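A quick way to check for the distro-packaged virtualenv the answer refers to (assuming an rpm-based system such as CentOS 7; if the package is present, remove it and re-run the bootstrap):

import subprocess

try:
    out = subprocess.check_output(['rpm', '-q', 'python-virtualenv'])
    print('Distro virtualenv found, consider removing it: %s' % out.strip())
except subprocess.CalledProcessError:
    # rpm -q exits non-zero when the package is not installed.
    print('python-virtualenv is not installed from the CentOS repos')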

Related

Airflow 2 Error sending Celery task: Timeout

I am in the process of migrating our Airflow environment from version 1.10.15 to 2.3.3. I have migrated 1 DAG over to the new environment and intermittently I get an email with this error: Executor reports task instance finished (failed) although the task says its queued. (Info: None) Was the task killed externally?
When looking at the logs, this is what I find in the scheduler logs:
[2022-08-09 07:00:08,621] {dag.py:2968} INFO - Setting next_dagrun for DAGRP-Get_Overrides to 2022-08-09T11:00:00+00:00, run_after=2022-08-09T16:00:00+00:00
[2022-08-09 07:00:08,652] {scheduler_job.py:353} INFO - 1 tasks up for execution:
<TaskInstance: DAGRP-Get_Overrides.Get_override scheduled__2022-08-08T16:00:00+00:00 [scheduled]>
[2022-08-09 07:00:08,652] {scheduler_job.py:418} INFO - DAG DAGRP-Get_Overrides has 0/3 running and queued tasks
[2022-08-09 07:00:08,652] {scheduler_job.py:504} INFO - Setting the following tasks to queued state:
<TaskInstance: DAGRP-Get_Overrides.Get_override scheduled__2022-08-08T16:00:00+00:00 [scheduled]>
[2022-08-09 07:00:08,654] {scheduler_job.py:546} INFO - Sending TaskInstanceKey(dag_id='DAGRP-Get_Overrides', task_id='Get_override', run_id='scheduled__2022-08-08T16:00:00+00:00', try_number=1, map_index=-1) to executor with priority 1 and queue default
[2022-08-09 07:00:08,654] {base_executor.py:91} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'DAGRP-Get_Overrides', 'Get_override', 'scheduled__2022-08-08T16:00:00+00:00', '--local', '--subdir', 'DAGS_FOLDER/da_group/get_override.py']
[2022-08-09 07:00:12,665] {timeout.py:67} ERROR - Process timed out, PID: 1
[2022-08-09 07:00:12,667] {celery_executor.py:283} INFO - [Try 1 of 3] Task Timeout Error for Task: (TaskInstanceKey(dag_id='DAGRP-Get_Overrides', task_id='Get_override', run_id='scheduled__2022-08-08T16:00:00+00:00', try_number=1, map_index=-1)).
[2022-08-09 07:00:16,701] {timeout.py:67} ERROR - Process timed out, PID: 1
[2022-08-09 07:00:16,702] {celery_executor.py:283} INFO - [Try 2 of 3] Task Timeout Error for Task: (TaskInstanceKey(dag_id='DAGRP-Get_Overrides', task_id='Get_override', run_id='scheduled__2022-08-08T16:00:00+00:00', try_number=1, map_index=-1)).
[2022-08-09 07:00:21,704] {timeout.py:67} ERROR - Process timed out, PID: 1
[2022-08-09 07:00:21,705] {celery_executor.py:283} INFO - [Try 3 of 3] Task Timeout Error for Task: (TaskInstanceKey(dag_id='DAGRP-Get_Overrides', task_id='Get_override', run_id='scheduled__2022-08-08T16:00:00+00:00', try_number=1, map_index=-1)).
[2022-08-09 07:00:26,627] {timeout.py:67} ERROR - Process timed out, PID: 1
[2022-08-09 07:00:26,627] {celery_executor.py:294} ERROR - Error sending Celery task: Timeout, PID: 1
Celery Task ID: TaskInstanceKey(dag_id='DAGRP-Get_Overrides', task_id='Get_override', run_id='scheduled__2022-08-08T16:00:00+00:00', try_number=1, map_index=-1)
Traceback (most recent call last):
File "/opt/airflow/lib/python3.8/site-packages/kombu/utils/functional.py", line 30, in __call__
return self.__value__
AttributeError: 'ChannelPromise' object has no attribute '__value__'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/airflow/lib/python3.8/site-packages/airflow/executors/celery_executor.py", line 177, in send_task_to_executor
result = task_to_run.apply_async(args=[command], queue=queue)
File "/opt/airflow/lib/python3.8/site-packages/celery/app/task.py", line 575, in apply_async
return app.send_task(
File "/opt/airflow/lib/python3.8/site-packages/celery/app/base.py", line 788, in send_task
amqp.send_task_message(P, name, message, **options)
File "/opt/airflow/lib/python3.8/site-packages/celery/app/amqp.py", line 510, in send_task_message
ret = producer.publish(
File "/opt/airflow/lib/python3.8/site-packages/kombu/messaging.py", line 177, in publish
return _publish(
File "/opt/airflow/lib/python3.8/site-packages/kombu/connection.py", line 523, in _ensured
return fun(*args, **kwargs)
File "/opt/airflow/lib/python3.8/site-packages/kombu/messaging.py", line 186, in _publish
channel = self.channel
File "/opt/airflow/lib/python3.8/site-packages/kombu/messaging.py", line 209, in _get_channel
channel = self._channel = channel()
File "/opt/airflow/lib/python3.8/site-packages/kombu/utils/functional.py", line 32, in __call__
value = self.__value__ = self.__contract__()
File "/opt/airflow/lib/python3.8/site-packages/kombu/messaging.py", line 225, in <lambda>
channel = ChannelPromise(lambda: connection.default_channel)
File "/opt/airflow/lib/python3.8/site-packages/kombu/connection.py", line 895, in default_channel
self._ensure_connection(**conn_opts)
File "/opt/airflow/lib/python3.8/site-packages/kombu/connection.py", line 433, in _ensure_connection
return retry_over_time(
File "/opt/airflow/lib/python3.8/site-packages/kombu/utils/functional.py", line 312, in retry_over_time
return fun(*args, **kwargs)
File "/opt/airflow/lib/python3.8/site-packages/kombu/connection.py", line 877, in _connection_factory
self._connection = self._establish_connection()
File "/opt/airflow/lib/python3.8/site-packages/kombu/connection.py", line 812, in _establish_connection
conn = self.transport.establish_connection()
File "/opt/airflow/lib/python3.8/site-packages/kombu/transport/pyamqp.py", line 201, in establish_connection
conn.connect()
File "/opt/airflow/lib/python3.8/site-packages/amqp/connection.py", line 323, in connect
self.transport.connect()
File "/opt/airflow/lib/python3.8/site-packages/amqp/transport.py", line 129, in connect
self._connect(self.host, self.port, self.connect_timeout)
File "/opt/airflow/lib/python3.8/site-packages/amqp/transport.py", line 184, in _connect
self.sock.connect(sa)
File "/opt/airflow/lib/python3.8/site-packages/airflow/utils/timeout.py", line 68, in handle_timeout
raise AirflowTaskTimeout(self.error_message)
airflow.exceptions.AirflowTaskTimeout: Timeout, PID: 1
[2022-08-09 07:00:26,627] {scheduler_job.py:599} INFO - Executor reports execution of DAGRP-Get_Overrides.Get_override run_id=scheduled__2022-08-08T16:00:00+00:00 exited with status failed for try_number 1
[2022-08-09 07:00:26,633] {scheduler_job.py:642} INFO - TaskInstance Finished: dag_id=DAGRP-Get_Overrides, task_id=Get_override, run_id=scheduled__2022-08-08T16:00:00+00:00, map_index=-1, run_start_date=None, run_end_date=None, run_duration=None, state=queued, executor_state=failed, try_number=1, max_tries=0, job_id=None, pool=default_pool, queue=default, priority_weight=1, operator=PythonOperator, queued_dttm=2022-08-09 11:00:08.652767+00:00, queued_by_job_id=56, pid=None
[2022-08-09 07:00:26,633] {scheduler_job.py:684} ERROR - Executor reports task instance <TaskInstance: DAGRP-Get_Overrides.Get_override scheduled__2022-08-08T16:00:00+00:00 [queued]> finished (failed) although the task says its queued. (Info: None) Was the task killed externally?
[2022-08-09 07:01:16,687] {processor.py:233} WARNING - Killing DAGFileProcessorProcess (PID=1811)
[2022-08-09 07:04:00,640] {scheduler_job.py:1233} INFO - Resetting orphaned tasks for active dag runs
I am running Airflow on 2 servers with 2 of each service (2 schedulers, 2 workers, 2 webservers). They are running in docker containers. They are configured to use celery executor and I'm using RabbitMQ version 3.10.6 (also 2 instances in docker containers behind a LB). I am using Postgres 13.7 for our database (running one instance in a docker container on the 1st server). Our environment is running on Python 3.8.12.
From my understanding, the timeout is between the scheduler and rabbitmq? From what I can tell we are hitting this timeout: AIRFLOW__CELERY__OPERATION_TIMEOUT (it's currently set to 4).
I would like to track down what is causing the issue before I just increase timeout settings. What can I do to find out what's going on? Anyone else run into this issue? Am I correct in assuming the timeout is between the scheduler and rabbitmq? Is it between the scheduler and database? Why am I seeing this with Airflow 2 when I have the same setup with Airflow 1 and it works with no problems? Any help is greatly appreciated!
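For reference, a minimal self-contained sketch of the mechanism behind that setting, as suggested by the traceback above (this is not Airflow's actual scheduler code; the sleep merely stands in for the blocking AMQP publish):

import time
from airflow.utils.timeout import timeout
from airflow.exceptions import AirflowTaskTimeout

try:
    with timeout(seconds=4):  # AIRFLOW__CELERY__OPERATION_TIMEOUT
        time.sleep(10)        # stands in for producer.publish(...)
except AirflowTaskTimeout as e:
    # Surfaces as "Timeout, PID: ..." just like the scheduler log.
    print('Timed out: %s' % e)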
Update:
I was able to reproduce the error by shutting down 1 of the rabbitmq nodes. Even though rabbitmq is behind a LB with a health probe, whenever a job was picked up by scheduler 1, it would fail with this error... But if scheduler 2 picked up the job, it would finish successfully. The odd thing is that I shut down rabbitmq 2.
So I think I've been able to solve this issue. Here is what I did:
I added a custom celery_config.py to the scheduler and worker docker containers, adding this environment variable: AIRFLOW__CELERY__CELERY_CONFIG_OPTIONS=celery_config.CELERY_CONFIG. As part of that celery config, I specified both my rabbitmq brokers under broker_url. This is the full config:
from airflow.config_templates.default_celery import DEFAULT_CELERY_CONFIG
import os

RABBITMQ_PW = os.environ["RABBITMQ_PW"]
CLUSTER_NODE = os.environ["RABBITMQ_CLUSTER_NODE"]
LOCAL_NODE = os.environ["RABBITMQ_NODE"]

CELERY_CONFIG = {
    **DEFAULT_CELERY_CONFIG,
    "worker_send_task_events": True,
    "task_send_sent_event": True,
    "result_extended": True,
    "broker_url": [
        f'amqp://rabbitmq:{RABBITMQ_PW}@{LOCAL_NODE}:5672',
        f'amqp://rabbitmq:{RABBITMQ_PW}@{CLUSTER_NODE}:5672'
    ]
}
What happens now is that if the worker loses its connection to the 1st broker, it will attempt to connect to the 2nd broker.
[2022-08-11 12:00:52,876: ERROR/MainProcess] consumer: Cannot connect to amqp://rabbitmq:**@<LOCAL_NODE>:5672//: [Errno 111] Connection refused.
[2022-08-11 12:00:52,875: INFO/MainProcess] Connected to amqp://rabbitmq:**@<CLUSTER_NODE>:5672//
Also an interesting note: I still have the Airflow environment variable AIRFLOW__CELERY__BROKER_URL set to the load balancer URL. That's because Airflow 1 wouldn't allow the worker to start without it, and Airflow 2 won't allow you to specify multiple brokers the way the celery config does. So when the worker starts, it shows:
- ** ---------- .> transport: amqp://rabbitmq:**@<LOCAL_NODE>:5672//
[2022-08-26 11:37:17,952: INFO/MainProcess] Connected to amqp://rabbitmq:**@<LOCAL_NODE>:5672//
Even though I have the LB configured as the AIRFLOW__CELERY__BROKER_URL.
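To verify from each scheduler or worker container which brokers are actually reachable, a small standalone check with kombu (the library the Celery executor publishes through, per the traceback above) can help; the hostnames and credentials below are placeholders:

from kombu import Connection

for host in ('rabbitmq-node-1', 'rabbitmq-node-2'):  # placeholder hostnames
    url = 'amqp://rabbitmq:password@%s:5672//' % host
    try:
        with Connection(url, connect_timeout=4) as conn:
            conn.ensure_connection(max_retries=1)
            print('%s: reachable' % host)
    except Exception as exc:
        # Mirrors the "[Errno 111] Connection refused" seen in the worker log.
        print('%s: FAILED (%s)' % (host, exc))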

How to fix the error "AirflowException("Hostname of job runner does not match")"?

I'm running Airflow on my computer (MacBook Air, 1.6 GHz Intel Core i5 and 8 GB 2133 MHz LPDDR3). A DAG with several tasks failed with the error below. I checked several articles online but with little to no help. There is nothing wrong with the task itself (double-checked).
Any help is much appreciated.
[2019-08-27 13:01:55,372] {sequential_executor.py:45} INFO - Executing command: ['airflow', 'run', 'Makefile_DAG', 'normalize_companies', '2019-08-27T15:38:20.914820+00:00', '--local', '--pool', 'default_pool', '-sd', '/home/airflow/dags/makefileDAG.py']
[2019-08-27 13:01:56,937] {settings.py:213} INFO - settings.configure_orm(): Using pool settings. pool_size=5, max_overflow=10, pool_recycle=1800, pid=40647
[2019-08-27 13:01:57,285] {__init__.py:51} INFO - Using executor SequentialExecutor
[2019-08-27 13:01:59,423] {dagbag.py:90} INFO - Filling up the DagBag from /home/airflow/dags/makefileDAG.py
[2019-08-27 13:02:01,736] {cli.py:516} INFO - Running <TaskInstance: Makefile_DAG.normalize_companies 2019-08-27T15:38:20.914820+00:00 [queued]> on host ajays-macbook-air.local
Traceback (most recent call last):
File "/anaconda3/envs/airflow/bin/airflow", line 32, in <module>
args.func(args)
File "/anaconda3/envs/airflow/lib/python3.6/site-packages/airflow/utils/cli.py", line 74, in wrapper
return f(*args, **kwargs)
File "/anaconda3/envs/airflow/lib/python3.6/site-packages/airflow/bin/cli.py", line 522, in run
_run(args, dag, ti)
File "/anaconda3/envs/airflow/lib/python3.6/site-packages/airflow/bin/cli.py", line 435, in _run
run_job.run()
File "/anaconda3/envs/airflow/lib/python3.6/site-packages/airflow/jobs/base_job.py", line 213, in run
self._execute()
File "/anaconda3/envs/airflow/lib/python3.6/site-packages/airflow/jobs/local_task_job.py", line 111, in _execute
self.heartbeat()
File "/anaconda3/envs/airflow/lib/python3.6/site-packages/airflow/jobs/base_job.py", line 196, in heartbeat
self.heartbeat_callback(session=session)
File "/anaconda3/envs/airflow/lib/python3.6/site-packages/airflow/utils/db.py", line 70, in wrapper
return func(*args, **kwargs)
File "/anaconda3/envs/airflow/lib/python3.6/site-packages/airflow/jobs/local_task_job.py", line 159, in heartbeat_callback
raise AirflowException("Hostname of job runner does not match")
airflow.exceptions.AirflowException: Hostname of job runner does not match
[2019-08-27 13:05:05,904] {sequential_executor.py:52} ERROR - Failed to execute task Command '['airflow', 'run', 'Makefile_DAG', 'normalize_companies', '2019-08-27T15:38:20.914820+00:00', '--local', '--pool', 'default_pool', '-sd', '/home/airflow/dags/makefileDAG.py']' returned non-zero exit status 1..
[2019-08-27 13:05:05,905] {scheduler_job.py:1256} INFO - Executor reports execution of Makefile_DAG.normalize_companies execution_date=2019-08-27 15:38:20.914820+00:00 exited with status failed for try_number 2
Logs from the task:
[2019-08-27 13:02:13,616] {bash_operator.py:115} INFO - Running command: python /home/Makefile_Redo/normalize_companies.py
[2019-08-27 13:02:13,628] {bash_operator.py:124} INFO - Output:
[2019-08-27 13:05:02,849] {logging_mixin.py:95} INFO - [2019-08-27 13:05:02,848] {local_task_job.py:158} WARNING - The recorded hostname ajays-macbook-air.local does not match this instance's hostname AJAYs-MacBook-Air.local
[2019-08-27 13:05:02,860] {helpers.py:319} INFO - Sending Signals.SIGTERM to GPID 40649
[2019-08-27 13:05:02,861] {taskinstance.py:897} ERROR - Received SIGTERM. Terminating subprocesses.
[2019-08-27 13:05:02,862] {bash_operator.py:142} INFO - Sending SIGTERM signal to bash process group
[2019-08-27 13:05:03,539] {taskinstance.py:1047} ERROR - Task received SIGTERM signal
Traceback (most recent call last):
File "/anaconda3/envs/airflow/lib/python3.6/site-packages/airflow/models/taskinstance.py", line 922, in _run_raw_task
result = task_copy.execute(context=context)
File "/anaconda3/envs/airflow/lib/python3.6/site-packages/airflow/operators/bash_operator.py", line 126, in execute
for line in iter(sp.stdout.readline, b''):
File "/anaconda3/envs/airflow/lib/python3.6/site-packages/airflow/models/taskinstance.py", line 899, in signal_handler
raise AirflowException("Task received SIGTERM signal")
airflow.exceptions.AirflowException: Task received SIGTERM signal
[2019-08-27 13:05:03,550] {taskinstance.py:1076} INFO - All retries failed; marking task as FAILED
A weird thing I noticed in the above log is:
The recorded hostname ajays-macbook-air.local does not match this instance's hostname AJAYs-MacBook-Air.local
How is this possible, and is there any solution to fix it?
I had the same problem on my Mac. The solution that worked for me was updating airflow.cfg with hostname_callable = socket:gethostname. The original getfqdn returns different hostnames from time to time.
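The difference is easy to see directly; on a machine where DNS/mDNS answers differ in case (as in the log above), the two calls disagree:

import socket

print(socket.gethostname())  # e.g. AJAYs-MacBook-Air.local
print(socket.getfqdn())      # e.g. ajays-macbook-air.local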

Error when Communicating with the server while Cluster Setup in Cloudera

I am trying to set up Hadoop on CentOS 7 using Cloudera, but during the Cluster Setup process (single node) I am getting this error:
There was an error when communicating with the server. See the log file for more information.
I looked into the cloudera-scm-agent.log file using
sudo cat /var/log/cloudera-scm-agent/cloudera-scm-agent.log
and I see "Failed directory creation" and "connection refused" errors.
The detailed log file can be found here.
Can someone please tell me what I am doing wrong here?
Have you installed the cluster in single-user mode? If so, the system user "cloudera-scm" must have permission to read and write the service log, PID, and data directories. From your log message, all services are failing to start because of improper file system permissions.
[01/Nov/2018 04:41:11 +0000] 28095 MainThread os_ops ERROR Failed directory creation: /var/log/zookeeper/stacks: [Errno 13] Permission denied: '/var/log/zookeeper'
[01/Nov/2018 04:41:11 +0000] 28095 MainThread process ERROR Could not evaluate resource {u'path': u'/var/log/zookeeper/stacks', u'bytes_free_warning_threshhold_bytes': 0, u'group': u'cloudera-scm', u'user': u'cloudera-scm', u'mode': 493}
Traceback (most recent call last):
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.15.1-py2.7.egg/cmf/process.py", line 963, in _do_directory_resources
self.osops.mkabsdir(d["path"], user=d["user"], group=d["group"], mode=d["mode"])
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.15.1-py2.7.egg/cmf/util/os_ops.py", line 180, in mkabsdir
os.makedirs(path)
File "/usr/lib64/cmf/agent/build/env/lib64/python2.7/os.py", line 150, in makedirs
makedirs(head, mode)
File "/usr/lib64/cmf/agent/build/env/lib64/python2.7/os.py", line 157, in makedirs
mkdir(name, mode)
OSError: [Errno 13] Permission denied: '/var/log/zookeeper'
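A quick way to reproduce the failing call from the traceback, run as the cloudera-scm user (the path and mode come from the log; 493 decimal is 0o755):

import os

path = '/var/log/zookeeper/stacks'
try:
    os.makedirs(path, 0o755)  # mode 493 in the log entry == 0o755
    print('created %s' % path)
except OSError as e:
    # [Errno 13] here confirms the permission problem described above;
    # fix ownership of /var/log/zookeeper so cloudera-scm can write to it.
    print('cannot create %s: %s' % (path, e))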

AirflowException: Celery command failed - The recorded hostname does not match this instance's hostname

I'm running Airflow in a clustered environment on two AWS EC2 instances, one for the master and one for the worker. The worker node, though, periodically throws this error when running "$ airflow worker":
[2018-08-09 16:15:43,553] {jobs.py:2574} WARNING - The recorded hostname ip-1.2.3.4 does not match this instance's hostname ip-1.2.3.4.eco.tanonprod.comanyname.io
Traceback (most recent call last):
File "/usr/bin/airflow", line 27, in <module>
args.func(args)
File "/usr/local/lib/python3.6/site-packages/airflow/bin/cli.py", line 387, in run
run_job.run()
File "/usr/local/lib/python3.6/site-packages/airflow/jobs.py", line 198, in run
self._execute()
File "/usr/local/lib/python3.6/site-packages/airflow/jobs.py", line 2527, in _execute
self.heartbeat()
File "/usr/local/lib/python3.6/site-packages/airflow/jobs.py", line 182, in heartbeat
self.heartbeat_callback(session=session)
File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 50, in wrapper
result = func(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/airflow/jobs.py", line 2575, in heartbeat_callback
raise AirflowException("Hostname of job runner does not match")
airflow.exceptions.AirflowException: Hostname of job runner does not match
[2018-08-09 16:15:43,671] {celery_executor.py:54} ERROR - Command 'airflow run arl_source_emr_test_dag runEmrStep2WaiterTask 2018-08-07T00:00:00 --local -sd /var/lib/airflow/dags/arl_source_emr_test_dag.py' returned non-zero exit status 1.
[2018-08-09 16:15:43,681: ERROR/ForkPoolWorker-30] Task airflow.executors.celery_executor.execute_command[875a4da9-582e-4c10-92aa-5407f3b46d5f] raised unexpected: AirflowException('Celery command failed',)
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 52, in execute_command
subprocess.check_call(command, shell=True)
File "/usr/lib64/python3.6/subprocess.py", line 291, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'airflow run arl_source_emr_test_dag runEmrStep2WaiterTask 2018-08-07T00:00:00 --local -sd /var/lib/airflow/dags/arl_source_emr_test_dag.py' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.6/dist-packages/celery/app/trace.py", line 382, in trace_task
R = retval = fun(*args, **kwargs)
File "/usr/lib/python3.6/dist-packages/celery/app/trace.py", line 641, in __protected_call__
return self.run(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 55, in execute_command
raise AirflowException('Celery command failed')
airflow.exceptions.AirflowException: Celery command failed
When this error occurs the task is marked as failed on Airflow and thus fails my DAG when nothing actually went wrong in the task.
I'm using Redis as my queue and PostgreSQL as my meta-database. Both are external AWS services. I'm running all of this in my company environment, which is why the full name of the server is ip-1.2.3.4.eco.tanonprod.comanyname.io. It looks like it wants this full name somewhere, but I have no idea where I need to fix this value so that it gets ip-1.2.3.4.eco.tanonprod.comanyname.io instead of just ip-1.2.3.4.
The really weird thing about this issue is that it doesn't always happen. It seems to just randomly happen every once in a while when I run the DAG. It's also occurring sporadically on all of my DAGs, so it's not just one DAG. I find it strange that it's sporadic, because that means other task runs are handling the hostname check just fine.
Note: I've changed the real IP address to 1.2.3.4 for privacy reasons.
Answer:
https://github.com/apache/incubator-airflow/pull/2484
This is exactly the problem I am having and other Airflow users on AWS EC2-Instances are experiencing it as well.
The hostname is set when the task instance runs, via self.hostname = socket.getfqdn(), where socket is Python's standard library socket module.
The comparison that triggers this error is:
fqdn = socket.getfqdn()
if fqdn != ti.hostname:
    logging.warning("The recorded hostname {ti.hostname} "
                    "does not match this instance's hostname "
                    "{fqdn}".format(**locals()))
    raise AirflowException("Hostname of job runner does not match")
It seems like the hostname on the EC2 instance is changing while the worker is running. Perhaps try manually setting the hostname as described here https://forums.aws.amazon.com/thread.jspa?threadID=246906 and see if that sticks.
I had a similar problem on my Mac. Setting hostname_callable = socket:gethostname in airflow.cfg fixed it.
Personally when running on my Mac, I found that I got similar errors to this when the Mac would sleep while I was running a long job. The solution was to go into System Preferences -> Energy Saver and then check "Prevent computer from sleeping automatically when the display is off."
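If neither built-in callable gives a stable result, hostname_callable can point at your own function; a hypothetical sketch (the module name my_airflow_utils and the normalization are illustrative, referenced from airflow.cfg as hostname_callable = my_airflow_utils:stable_hostname in the 1.x colon syntax):

import socket

def stable_hostname():
    # Normalize case so names like 'AJAYs-MacBook-Air.local' and
    # 'ajays-macbook-air.local' compare equal across heartbeats.
    return socket.gethostname().lower()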

Cloudify 3.3.1 simple-manager bootstrap fails with http 504 / filename argument expected

I'm trying to bootstrap a Cloudify manager using the simple-manager-blueprint from the cloudify-manager repo, following the instructions here.
I am running the bootstrap process from Ubuntu 16, and attempting to bootstrap onto an already-existing CentOS 7 VM (KVM) hosted remotely.
The error I get during the bootstrap process is:
(cfyenv) k@ubuntu1:~/cloudify/cloudify-manager$ cfy init -r
Initialization completed successfully
(cfyenv) k@ubuntu1:~/cloudify/cloudify-manager$ cfy --version
Cloudify CLI 3.3.1
(cfyenv) k@ubuntu1:~/cloudify/cloudify-manager$ cfy bootstrap -p ./cloudify-manager-blueprints-3.3.1/simple-manager-blueprint.yaml -i ./cloudify-manager-blueprints-3.3.1/simple-manager-blueprint-inputs.yaml
executing bootstrap validation
2016-06-10 13:03:38 CFY <manager> Starting 'execute_operation' workflow execution
2016-06-10 13:03:38 CFY <manager> [rabbitmq_b88e8] Starting operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [python_runtime_89bdd] Starting operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [rest_service_61510] Starting operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [amqp_influx_2f816] Starting operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [manager_host_d688e] Starting operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [influxdb_98fd6] Starting operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [logstash_39e85] Starting operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [manager_configuration_0d9ca] Starting operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [mgmt_worker_f0d02] Starting operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [riemann_20a3e] Starting operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [java_runtime_c9a1c] Starting operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [elasticsearch_b1536] Starting operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [nginx_db289] Starting operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [webui_9c064] Starting operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [rabbitmq_b88e8] Finished operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [python_runtime_89bdd] Finished operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [manager_configuration_0d9ca] Finished operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [mgmt_worker_f0d02] Finished operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [nginx_db289] Finished operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [rest_service_61510] Finished operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [manager_host_d688e] Finished operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [riemann_20a3e] Finished operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [influxdb_98fd6] Finished operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [logstash_39e85] Finished operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [amqp_influx_2f816] Finished operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [webui_9c064] Finished operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [elasticsearch_b1536] Finished operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> [java_runtime_c9a1c] Finished operation cloudify.interfaces.validation.creation
2016-06-10 13:03:38 CFY <manager> 'execute_operation' workflow execution succeeded
bootstrap validation completed successfully
executing bootstrap
Inputs ./cloudify-manager-blueprints-3.3.1/simple-manager-blueprint-inputs.yaml
Inputs <cloudify.workflows.local._Environment object at 0x7fc76b458a10>
2016-06-10 13:03:45 CFY <manager> Starting 'install' workflow execution
2016-06-10 13:03:45 CFY <manager> [manager_host_cd1f8] Creating node
2016-06-10 13:03:45 CFY <manager> [manager_host_cd1f8] Configuring node
2016-06-10 13:03:45 CFY <manager> [manager_host_cd1f8] Starting node
2016-06-10 13:03:46 CFY <manager> [java_runtime_e2b0d] Creating node
2016-06-10 13:03:46 CFY <manager> [manager_configuration_baa5a] Creating node
2016-06-10 13:03:46 CFY <manager> [python_runtime_a24d5] Creating node
2016-06-10 13:03:46 CFY <manager> [rabbitmq_2656a] Creating node
2016-06-10 13:03:46 CFY <manager> [influxdb_720e7] Creating node
2016-06-10 13:03:46 CFY <manager> [manager_configuration_baa5a.create] Sending task 'fabric_plugin.tasks.run_script'
2016-06-10 13:03:46 CFY <manager> [python_runtime_a24d5.create] Sending task 'fabric_plugin.tasks.run_script'
2016-06-10 13:03:46 CFY <manager> [influxdb_720e7.create] Sending task 'fabric_plugin.tasks.run_script'
2016-06-10 13:03:46 CFY <manager> [rabbitmq_2656a.create] Sending task 'fabric_plugin.tasks.run_script'
2016-06-10 13:03:46 CFY <manager> [java_runtime_e2b0d.create] Sending task 'fabric_plugin.tasks.run_script'
2016-06-10 13:03:46 CFY <manager> [manager_configuration_baa5a.create] Task started 'fabric_plugin.tasks.run_script'
2016-06-10 13:03:46 LOG <manager> [manager_configuration_baa5a.create] INFO: preparing fabric environment...
2016-06-10 13:03:46 LOG <manager> [manager_configuration_baa5a.create] INFO: Fabric env: {u'always_use_pty': True, u'key_filename': u'/home/k/.ssh/id_rsa.pub', u'user': u'cloudify', u'host_string': u'10.124.129.42'}
2016-06-10 13:03:46 LOG <manager> [manager_configuration_baa5a.create] INFO: environment prepared successfully
[10.124.129.42] put: /tmp/tmppt9dtd-configure_manager.sh -> /tmp/cloudify-ctx/scripts/tmppt9dtd-configure_manager.sh-7MH6NQ63
[10.124.129.42] put: <file obj> -> /tmp/cloudify-ctx/scripts/env-tmppt9dtd-configure_manager.sh-7MH6NQ63
[10.124.129.42] run: source /tmp/cloudify-ctx/scripts/env-tmppt9dtd-configure_manager.sh-7MH6NQ63 && /tmp/cloudify-ctx/scripts/tmppt9dtd-configure_manager.sh-7MH6NQ63
[10.124.129.42] out: Traceback (most recent call last):
[10.124.129.42] out: File "/tmp/cloudify-ctx/ctx", line 130, in <module>
[10.124.129.42] out: main()
[10.124.129.42] out: File "/tmp/cloudify-ctx/ctx", line 119, in main
[10.124.129.42] out: args.timeout)
[10.124.129.42] out: File "/tmp/cloudify-ctx/ctx", line 78, in client_req
[10.124.129.42] out: response = request_method(socket_url, request, timeout)
[10.124.129.42] out: File "/tmp/cloudify-ctx/ctx", line 59, in http_client_req
[10.124.129.42] out: timeout=timeout)
[10.124.129.42] out: File "/usr/lib64/python2.7/urllib2.py", line 154, in urlopen
[10.124.129.42] out: return opener.open(url, data, timeout)
[10.124.129.42] out: File "/usr/lib64/python2.7/urllib2.py", line 437, in open
[10.124.129.42] out: response = meth(req, response)
[10.124.129.42] out: File "/usr/lib64/python2.7/urllib2.py", line 550, in http_response
[10.124.129.42] out: 'http', request, response, code, msg, hdrs)
[10.124.129.42] out: File "/usr/lib64/python2.7/urllib2.py", line 475, in error
[10.124.129.42] out: return self._call_chain(*args)
[10.124.129.42] out: File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain
[10.124.129.42] out: result = func(*args)
[10.124.129.42] out: File "/usr/lib64/python2.7/urllib2.py", line 558, in http_error_default
[10.124.129.42] out: raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
[10.124.129.42] out: urllib2.HTTPError: HTTP Error 504: Gateway Time-out
[10.124.129.42] out: /tmp/cloudify-ctx/scripts/tmppt9dtd-configure_manager.sh-7MH6NQ63: line 3: .: filename argument required
[10.124.129.42] out: .: usage: . filename [arguments]
[10.124.129.42] out:
Fatal error: run() received nonzero return code 2 while executing!
Requested: source /tmp/cloudify-ctx/scripts/env-tmppt9dtd-configure_manager.sh-7MH6NQ63 && /tmp/cloudify-ctx/scripts/tmppt9dtd-configure_manager.sh-7MH6NQ63
Executed: /bin/bash -l -c "cd /tmp/cloudify-ctx/work && source /tmp/cloudify-ctx/scripts/env-tmppt9dtd-configure_manager.sh-7MH6NQ63 && /tmp/cloudify-ctx/scripts/tmppt9dtd-configure_manager.sh-7MH6NQ63"
Aborting.
2016-06-10 13:03:47 LOG <manager> [manager_configuration_baa5a.create] ERROR: Exception raised on operation [fabric_plugin.tasks.run_script] invocation
Traceback (most recent call last):
File "/home/k/cfyenv/local/lib/python2.7/site-packages/cloudify/decorators.py", line 122, in wrapper
result = func(*args, **kwargs)
File "/home/k/cfyenv/local/lib/python2.7/site-packages/fabric_plugin/tasks.py", line 214, in run_script
remote_env_script_path, command))
File "/home/k/cfyenv/local/lib/python2.7/site-packages/fabric/network.py", line 639, in host_prompting_wrapper
return func(*args, **kwargs)
File "/home/k/cfyenv/local/lib/python2.7/site-packages/fabric/operations.py", line 1042, in run
shell_escape=shell_escape)
File "/home/k/cfyenv/local/lib/python2.7/site-packages/fabric/operations.py", line 932, in _run_command
error(message=msg, stdout=out, stderr=err)
File "/home/k/cfyenv/local/lib/python2.7/site-packages/fabric/utils.py", line 327, in error
return func(message)
File "/home/k/cfyenv/local/lib/python2.7/site-packages/fabric/utils.py", line 32, in abort
raise env.abort_exception(msg)
FabricTaskError: run() received nonzero return code 2 while executing!
Requested: source /tmp/cloudify-ctx/scripts/env-tmppt9dtd-configure_manager.sh-7MH6NQ63 && /tmp/cloudify-ctx/scripts/tmppt9dtd-configure_manager.sh-7MH6NQ63
Executed: /bin/bash -l -c "cd /tmp/cloudify-ctx/work && source /tmp/cloudify-ctx/scripts/env-tmppt9dtd-configure_manager.sh-7MH6NQ63 && /tmp/cloudify-ctx/scripts/tmppt9dtd-configure_manager.sh-7MH6NQ63"
2016-06-10 13:03:47 CFY <manager> [manager_configuration_baa5a.create] Task failed 'fabric_plugin.tasks.run_script' -> run() received nonzero return code 2 while executing!
Requested: source /tmp/cloudify-ctx/scripts/env-tmppt9dtd-configure_manager.sh-7MH6NQ63 && /tmp/cloudify-ctx/scripts/tmppt9dtd-configure_manager.sh-7MH6NQ63
Executed: /bin/bash -l -c "cd /tmp/cloudify-ctx/work && source /tmp/cloudify-ctx/scripts/env-tmppt9dtd-configure_manager.sh-7MH6NQ63 && /tmp/cloudify-ctx/scripts/tmppt9dtd-configure_manager.sh-7MH6NQ63" [attempt 1/6]
2016-06-10 13:03:47 CFY <manager> [python_runtime_a24d5.create] Task started 'fabric_plugin.tasks.run_script'
2016-06-10 13:03:47 LOG <manager> [python_runtime_a24d5.create] INFO: preparing fabric environment...
2016-06-10 13:03:47 LOG <manager> [python_runtime_a24d5.create] INFO: Fabric env: {u'always_use_pty': True, u'key_filename': u'/home/k/.ssh/id_rsa.pub', u'hide': u'running', u'user': u'cloudify', u'host_string': u'10.124.129.42'}
2016-06-10 13:03:47 LOG <manager> [python_runtime_a24d5.create] INFO: environment prepared successfully
[10.124.129.42] put: /tmp/tmpmndvAt-create.sh -> /tmp/cloudify-ctx/scripts/tmpmndvAt-create.sh-F7IX8WT9
[10.124.129.42] put: <file obj> -> /tmp/cloudify-ctx/scripts/env-tmpmndvAt-create.sh-F7IX8WT9
[10.124.129.42] run: source /tmp/cloudify-ctx/scripts/env-tmpmndvAt-create.sh-F7IX8WT9 && /tmp/cloudify-ctx/scripts/tmpmndvAt-create.sh-F7IX8WT9
[10.124.129.42] out: Traceback (most recent call last):
[10.124.129.42] out: File "/tmp/cloudify-ctx/ctx", line 130, in <module>
[10.124.129.42] out: main()
[10.124.129.42] out: File "/tmp/cloudify-ctx/ctx", line 119, in main
[10.124.129.42] out: args.timeout)
[10.124.129.42] out: File "/tmp/cloudify-ctx/ctx", line 78, in client_req
[10.124.129.42] out: response = request_method(socket_url, request, timeout)
[10.124.129.42] out: File "/tmp/cloudify-ctx/ctx", line 59, in http_client_req
[10.124.129.42] out: timeout=timeout)
[10.124.129.42] out: File "/usr/lib64/python2.7/urllib2.py", line 154, in urlopen
[10.124.129.42] out: return opener.open(url, data, timeout)
[10.124.129.42] out: File "/usr/lib64/python2.7/urllib2.py", line 437, in open
[10.124.129.42] out: response = meth(req, response)
[10.124.129.42] out: File "/usr/lib64/python2.7/urllib2.py", line 550, in http_response
[10.124.129.42] out: 'http', request, response, code, msg, hdrs)
[10.124.129.42] out: File "/usr/lib64/python2.7/urllib2.py", line 475, in error
[10.124.129.42] out: return self._call_chain(*args)
[10.124.129.42] out: File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain
[10.124.129.42] out: result = func(*args)
[10.124.129.42] out: File "/usr/lib64/python2.7/urllib2.py", line 558, in http_error_default
[10.124.129.42] out: raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
[10.124.129.42] out: urllib2.HTTPError: HTTP Error 504: Gateway Time-out
[10.124.129.42] out: /tmp/cloudify-ctx/scripts/tmpmndvAt-create.sh-F7IX8WT9: line 3: .: filename argument required
[10.124.129.42] out: .: usage: . filename [arguments]
[10.124.129.42] out:
Fatal error: run() received nonzero return code 2 while executing!
^C
(cfyenv) k@ubuntu1:~/cloudify/cloudify-manager$ ^C
As far as I can tell, the bootstrap scripts expect something on the target manager host to be listening for HTTP, but it's not there. Of course I could be way off track, as I'm new to Cloudify.
I've made only minimal changes to the blueprint inputs:
(cfyenv) k@ubuntu1:~/cloudify/cloudify-manager/cloudify-manager-blueprints-3.3.1$ cat ./simple-manager-blueprint-inputs.yaml
#############################
# Provider specific Inputs
#############################
# The public IP of the manager to which the CLI will connect.
public_ip: '<my target hosts ip>'
# The manager's private IP address. This is the address which will be used by the
# application hosts to connect to the Manager's fileserver and message broker.
private_ip: '<my target hosts ip>'
# SSH user used to connect to the manager
ssh_user: 'cloudify'
# SSH key path used to connect to the manager
ssh_key_filename: '/home/k/.ssh/id_rsa.pub'
# This is the user with which the Manager will try to connect to the application hosts.
agents_user: 'cloudify'
#resources_prefix: ''
#############################
# Security Settings
#############################
# Cloudify REST security is disabled by default. To enable security, set to true.
# Note: If security is disabled, the other security inputs are irrelevant.
#security_enabled: false
# Enabling SSL limits communication with the server to SSL only.
# NOTE: If enabled, the certificate and private key files must reside in resources/ssl.
#ssl_enabled: false
# Username and password of the Cloudify administrator.
# This user will also be included in the simple userstore repository if the
# simple userstore implementation is used.
admin_username: 'admin'
admin_password: '<my admin password>'
#insecure_endpoints_disabled: false
#############################
# Agent Packages
#############################
# The key names must be in the format: distro_release_agent (e.g. ubuntu_trusty_agent)
# as the key is what's used to name the file, which later allows our
# agent installer to identify it for your distro and release automatically.
# Note that the windows agent key name MUST be `cloudify_windows_agent`
agent_package_urls:
# ubuntu_trusty_agent: http://repository.cloudifysource.org/org/cloudify3/3.3.1/sp-RELEASE/Ubuntu-trusty-agent_3.3.1-sp-b310.tar.gz
# ubuntu_precise_agent: http://repository.cloudifysource.org/org/cloudify3/3.3.1/sp-RELEASE/Ubuntu-precise-agent_3.3.1-sp-b310.tar.gz
centos_7x_agent: http://repository.cloudifysource.org/org/cloudify3/3.3.1/sp-RELEASE/centos-Core-agent_3.3.1-sp-b310.tar.gz
# centos_6x_agent: http://repository.cloudifysource.org/org/cloudify3/3.3.1/sp-RELEASE/centos-Final-agent_3.3.1-sp-b310.tar.gz
# redhat_7x_agent: http://repository.cloudifysource.org/org/cloudify3/3.3.1/sp-RELEASE/redhat-Maipo-agent_3.3.1-sp-b310.tar.gz
# redhat_6x_agent: http://repository.cloudifysource.org/org/cloudify3/3.3.1/sp-RELEASE/redhat-Santiago-agent_3.3.1-sp-b310.tar.gz
# cloudify_windows_agent: http://repository.cloudifysource.org/org/cloudify3/3.3.1/sp-RELEASE/cloudify-windows-agent_3.3.1-sp-b310.exe
#############################
# Cloudify Modules
#############################
# Note that you can replace rpm urls with names of packages as long as they're available in your default yum repository.
# That is, as long as they provide the exact same version of that module.
rest_service_rpm_source_url: 'http://repository.cloudifysource.org/org/cloudify3/3.3.1/sp-RELEASE/cloudify-rest-service-3.3.1-sp_b310.x86_64.rpm'
management_worker_rpm_source_url: 'http://repository.cloudifysource.org/org/cloudify3/3.3.1/sp-RELEASE/cloudify-management-worker-3.3.1-sp_b310.x86_64.rpm'
amqpinflux_rpm_source_url: 'http://repository.cloudifysource.org/org/cloudify3/3.3.1/sp-RELEASE/cloudify-amqp-influx-3.3.1-sp_b310.x86_64.rpm'
cloudify_resources_url: 'https://github.com/cloudify-cosmo/cloudify-manager/archive/3.3.1.tar.gz'
webui_source_url: 'http://repository.cloudifysource.org/org/cloudify3/3.3.1/sp-RELEASE/cloudify-ui-3.3.1-sp-b310.tgz'
# This is a Cloudify specific redistribution of Grafana.
grafana_source_url: http://repository.cloudifysource.org/org/cloudify3/components/grafana-1.9.0.tgz
#############################
# External Components
#############################
# Note that you can replace rpm urls with names of packages as long as they're available in your default yum repository.
# That is, as long as they provide the exact same version of that module.
pip_source_rpm_url: http://repository.cloudifysource.org/org/cloudify3/components/python-pip-7.1.0-1.el7.noarch.rpm
java_source_url: http://repository.cloudifysource.org/org/cloudify3/components/jre1.8.0_45-1.8.0_45-fcs.x86_64.rpm
# RabbitMQ Distribution of Erlang
erlang_source_url: http://repository.cloudifysource.org/org/cloudify3/components/erlang-17.4-1.el6.x86_64.rpm
rabbitmq_source_url: http://repository.cloudifysource.org/org/cloudify3/components/rabbitmq-server-3.5.3-1.noarch.rpm
elasticsearch_source_url: http://repository.cloudifysource.org/org/cloudify3/components/elasticsearch-1.6.0.noarch.rpm
elasticsearch_curator_rpm_source_url: http://repository.cloudifysource.org/org/cloudify3/components/elasticsearch-curator-3.2.3-1.x86_64.rpm
logstash_source_url: http://repository.cloudifysource.org/org/cloudify3/components/logstash-1.5.0-1.noarch.rpm
nginx_source_url: http://repository.cloudifysource.org/org/cloudify3/components/nginx-1.8.0-1.el7.ngx.x86_64.rpm
influxdb_source_url: http://repository.cloudifysource.org/org/cloudify3/components/influxdb-0.8.8-1.x86_64.rpm
riemann_source_url: http://repository.cloudifysource.org/org/cloudify3/components/riemann-0.2.6-1.noarch.rpm
# A RabbitMQ Client for Riemann
langohr_source_url: http://repository.cloudifysource.org/org/cloudify3/components/langohr.jar
# Riemann's default daemonizer
daemonize_source_url: http://repository.cloudifysource.org/org/cloudify3/components/daemonize-1.7.3-7.el7.x86_64.rpm
nodejs_source_url: http://repository.cloudifysource.org/org/cloudify3/components/node-v0.10.35-linux-x64.tar.gz
#############################
# RabbitMQ Configuration
#############################
# Sets the username/password to use for clients such as celery
# to connect to the rabbitmq broker.
# It is recommended that you set both the username and password
# to something reasonably secure.
rabbitmq_username: 'cloudify'
rabbitmq_password: '<my rabbit password>'
# Enable SSL for RabbitMQ. If this is set to true then the public and private
# certs must be supplied (`rabbitmq_cert_private`, `rabbitmq_cert_public` inputs).
#rabbitmq_ssl_enabled: false
# The private certificate for RabbitMQ to use for SSL. This must be PEM formatted.
# It is expected to begin with a line containing 'PRIVATE KEY' in the middle.
#rabbitmq_cert_private: ''
# The public certificate for RabbitMQ to use for SSL. This does not need to be signed by any CA,
# as it will be deployed and explicitly used for all other components.
# It may be self-signed. It must be PEM formatted.
# It is expected to begin with a line of dashes with 'BEGIN CERTIFICATE' in the middle.
# If an external endpoint is used, this must be the public certificate associated with the private
# certificate that has already been configured for use by that rabbit endpoint.
#rabbitmq_cert_public: ''
# Allows to define the message-ttl for the different types of queues (in milliseconds).
# These are not used if `rabbitmq_endpoint_ip` is provided.
# https://www.rabbitmq.com/ttl.html
rabbitmq_events_queue_message_ttl: 60000
rabbitmq_logs_queue_message_ttl: 60000
rabbitmq_metrics_queue_message_ttl: 60000
# This will set the queue length limit. Note that while new messages
# will be queued in RabbitMQ, old messages will be deleted once the
# limit is reached!
# These are not used if `rabbitmq_endpoint_ip` is provided.
# Note this is NOT the message byte length!
# https://www.rabbitmq.com/maxlength.html
rabbitmq_events_queue_length_limit: 1000000
rabbitmq_logs_queue_length_limit: 1000000
rabbitmq_metrics_queue_length_limit: 1000000
# RabbitMQ File Descriptors Limit
rabbitmq_fd_limit: 102400
# You can configure an external endpoint of a RabbitMQ Cluster to use
# instead of the built in one.
# If one is provided, the built in RabbitMQ cluster will not run.
# Also note that your external cluster must be preconfigured with any
# user name/pass and SSL certs if you plan on using RabbitMQ's security
# features.
#rabbitmq_endpoint_ip: ''
#############################
# Elasticsearch Configuration
#############################
# bootstrap.mlockall is set to true by default.
# This allows to set the heapsize for your cluster.
# https://www.elastic.co/guide/en/elasticsearch/guide/current/heap-sizing.html
#elasticsearch_heap_size: 2g
# This allows to provide any JAVA_OPTS to Elasticsearch.
#elasticsearch_java_opts: ''
# The index for events will be named `logstash-YYYY.mm.dd`.
# A new index corresponding with today's date will be added each day.
# Elasticsearch Curator is used to rotate the indices on a daily basis
# via a cronjob. This allows to determine the number of days to keep.
#elasticsearch_index_rotation_interval: 7
# You can configure an external endpoint of an Elasticsearch Cluster to use
# instead of the built in one. The built in Elasticsearch cluster will not run.
# You need to provide an IP (defaults to localhost) and Port (defaults to 9200) of your Elasticsearch Cluster.
#elasticsearch_endpoint_ip: ''
#elasticsearch_endpoint_port: 9200
#############################
# InfluxDB Configuration
#############################
# You can configure an external endpoint of an InfluxDB Cluster to use
# instead of the built in one.
# If one is provided, the built in InfluxDB cluster will not run.
# Note that the port is currently not configurable and must remain 8086.
# Also note that the database username and password are hardcoded to root:root.
#influxdb_endpoint_ip: ''
#############################
# Offline Resources Upload
#############################
# You can configure a set of resources to upload at bootstrap. These resources
# will reside on the manager and enable offline deployment. `dsl_resources`
# should contain any resource needed in the parsing process (i.e. plugin.yaml files)
# and any plugin archive should be compiled using the designated wagon tool
# which can be found at: http://github.com/cloudify-cosmo/wagon.
# The path should be passed to plugin_resources. Any resource your
# blueprint might need, could be uploaded using this mechanism.
#dsl_resources:
# - {'source_path': 'http://www.getcloudify.org/spec/fabric-plugin/1.3.1/plugin.yaml', 'destination_path': '/spec/fabric-plugin/1.3.1/plugin.yaml'}
# - {'source_path': 'http://www.getcloudify.org/spec/script-plugin/1.3.1/plugin.yaml', 'destination_path': '/spec/script-plugin/1.3.1/plugin.yaml'}
# - {'source_path': 'http://www.getcloudify.org/spec/diamond-plugin/1.3.1/plugin.yaml', 'destination_path': '/spec/diamond-plugin/1.3.1/plugin.yaml'}
# - {'source_path': 'http://www.getcloudify.org/spec/aws-plugin/1.3.1/plugin.yaml', 'destination_path': '/spec/aws-plugin/1.3.1/plugin.yaml'}
# - {'source_path': 'http://www.getcloudify.org/spec/openstack-plugin/1.3.1/plugin.yaml', 'destination_path': '/spec/openstack-plugin/1.3.1/plugin.yaml'}
# - {'source_path': 'http://www.getcloudify.org/spec/tosca-vcloud-plugin/1.3.1/plugin.yaml', 'destination_path': '/spec/tosca-vcloud-plugin/1.3.1/plugin.yaml'}
# - {'source_path': 'http://www.getcloudify.org/spec/vsphere-plugin/1.3.1/plugin.yaml', 'destination_path': '/spec/vsphere-plugin/1.3.1/plugin.yaml'}
# - {'source_path': 'http://www.getcloudify.org/spec/cloudify/3.3.1/types.yaml', 'destination_path': '/spec/cloudify/3.3.1/types.yaml'}
# The plugins you would like to use in your applications should be added here.
# By default, the Diamond, Fabric and relevant IaaS plugins are provided.
# Note that you can upload plugins post-bootstrap via the `cfy plugins upload`
# command.
plugin_resources:
# - 'http://repository.cloudifysource.org/org/cloudify3/3.3.1/sp-RELEASE/cloudify_diamond_plugin-1.3.1-py27-none-linux_x86_64-redhat-Maipo.wgn'
- 'http://repository.cloudifysource.org/org/cloudify3/3.3.1/sp-RELEASE/cloudify_diamond_plugin-1.3.1-py27-none-linux_x86_64-centos-Core.wgn'
# - 'http://repository.cloudifysource.org/org/cloudify3/3.3.1/sp-RELEASE/cloudify_diamond_plugin-1.3.1-py26-none-linux_x86_64-centos-Final.wgn'
# - 'http://repository.cloudifysource.org/org/cloudify3/3.3.1/sp-RELEASE/cloudify_diamond_plugin-1.3.1-py27-none-linux_x86_64-Ubuntu-precise.wgn'
# - 'http://repository.cloudifysource.org/org/cloudify3/3.3.1/sp-RELEASE/cloudify_diamond_plugin-1.3.1-py27-none-linux_x86_64-Ubuntu-trusty.wgn'
- 'http://repository.cloudifysource.org/org/cloudify3/3.3.1/sp-RELEASE/cloudify_fabric_plugin-1.3.1-py27-none-linux_x86_64-centos-Core.wgn'
# - 'http://repository.cloudifysource.org/org/cloudify3/3.3.1/sp-RELEASE/cloudify_aws_plugin-1.3.1-py27-none-linux_x86_64-centos-Core.wgn'
- 'http://repository.cloudifysource.org/org/cloudify3/3.3.1/sp-RELEASE/cloudify_openstack_plugin-1.3.1-py27-none-linux_x86_64-centos-Core.wgn'
# - 'http://repository.cloudifysource.org/org/cloudify3/3.3.1/sp-RELEASE/cloudify_vcloud_plugin-1.3.1-py27-none-linux_x86_64-centos-Core.wgn'
# - 'http://repository.cloudifysource.org/org/cloudify3/3.3.1/sp-RELEASE/cloudify_vsphere_plugin-1.3.1-py27-none-linux_x86_64-centos-Core.wgn'
I'm kinda lost as to where to even start troubleshooting. Any assistance very gratefully received.
K.
Have you looked at the document on offline installation? It should address the scenario where you need to work behind a firewall or a proxy.
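Since the ctx script's local HTTP request is being answered with a 504 (a proxy-style error), it may be worth checking whether a proxy is configured on the target host. A quick sketch (these variable names are the conventional shell ones, not anything Cloudify-specific):

import os

# Print any proxy settings visible to processes on the manager host; a
# populated http_proxy here would explain the gateway timeout above.
for var in ('http_proxy', 'https_proxy', 'no_proxy',
            'HTTP_PROXY', 'HTTPS_PROXY', 'NO_PROXY'):
    print('%s=%s' % (var, os.environ.get(var, '')))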
