Openstack Watcher complains about endpoint not found even though it is defined in the catalog - openstack

I have the following scenario:
server: Ubuntu 20.04.3 LTS
Openstack: installed following the official guide
Watcher: 1:4.0.0-0ubuntu0.20.04.1 (installed also following the official wiki)
Everything works like a charm; however, when I run
root@controller:/etc/watcher# openstack optimize service list
Internal Server Error (HTTP 500)
root@controller:/etc/watcher#
I checked what it was about in Watcher's log:
2022-01-15 17:25:58.509 17960 INFO watcher-api [-] 10.0.0.11 "GET /v1/services HTTP/1.1" status: 500 len: 139 time: 0.0277412
2022-01-15 17:40:52.535 17960 INFO watcher-api [-] Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/eventlet/wsgi.py", line 573, in handle_one_response
result = self.application(self.environ, start_response)
File "/usr/lib/python3/dist-packages/watcher/api/app.py", line 58, in __call__
return self.v1(environ, start_response)
File "/usr/lib/python3/dist-packages/watcher/api/middleware/auth_token.py", line 61, in __call__
return super(AuthTokenMiddleware, self).__call__(env, start_response)
File "/usr/local/lib/python3.8/dist-packages/webob/dec.py", line 129, in __call__
resp = self.call_func(req, *args, **kw)
File "/usr/local/lib/python3.8/dist-packages/webob/dec.py", line 193, in call_func
return self.func(req, *args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/keystonemiddleware/auth_token/__init__.py", line 338, in __call__
response = self.process_request(req)
File "/usr/local/lib/python3.8/dist-packages/keystonemiddleware/auth_token/__init__.py", line 659, in process_request
resp = super(AuthProtocol, self).process_request(request)
File "/usr/local/lib/python3.8/dist-packages/keystonemiddleware/auth_token/__init__.py", line 409, in process_request
data, user_auth_ref = self._do_fetch_token(
File "/usr/local/lib/python3.8/dist-packages/keystonemiddleware/auth_token/__init__.py", line 445, in _do_fetch_token
data = self.fetch_token(token, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/keystonemiddleware/auth_token/__init__.py", line 752, in fetch_token
data = self._identity_server.verify_token(
File "/usr/local/lib/python3.8/dist-packages/keystonemiddleware/auth_token/_identity.py", line 157, in verify_token
auth_ref = self._request_strategy.verify_token(
File "/usr/local/lib/python3.8/dist-packages/keystonemiddleware/auth_token/_identity.py", line 108, in _request_strategy
strategy_class = self._get_strategy_class()
File "/usr/local/lib/python3.8/dist-packages/keystonemiddleware/auth_token/_identity.py", line 130, in _get_strategy_class
if self._adapter.get_endpoint(version=klass.AUTH_VERSION):
File "/usr/local/lib/python3.8/dist-packages/keystoneauth1/adapter.py", line 291, in get_endpoint
return self.session.get_endpoint(auth or self.auth, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/keystoneauth1/session.py", line 1233, in get_endpoint
return auth.get_endpoint(self, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/keystoneauth1/identity/base.py", line 375, in get_endpoint
endpoint_data = self.get_endpoint_data(
File "/usr/local/lib/python3.8/dist-packages/keystoneauth1/identity/base.py", line 275, in get_endpoint_data
endpoint_data = service_catalog.endpoint_data_for(
File "/usr/local/lib/python3.8/dist-packages/keystoneauth1/access/service_catalog.py", line 462, in endpoint_data_for
raise exceptions.EndpointNotFound(msg)
keystoneauth1.exceptions.catalog.EndpointNotFound: internal endpoint for identity service in regionOne region not found
and the request on the web server side:
==> horizon_access.log <==
127.0.0.1 - - [15/Jan/2022:17:38:29 +0300] "GET /dashboard/project/api_access/view_credentials/ HTTP/1.1" 200 1027 "http://localhost/dashboard/project/api_access/" "Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0"
10.0.0.11 - - [15/Jan/2022:17:38:30 +0300] "GET /identity/v3/auth/tokens HTTP/1.1" 200 5318 "-" "python-keystoneclient"
10.0.0.11 - - [15/Jan/2022:17:38:30 +0300] "GET /compute/v2.1/servers/detail?all_tenants=True&changes-since=2022-01-15T14%3A33%3A30.416004%2B00%3A00 HTTP/1.1" 200 433 "-" "python-novaclient"
10.0.0.11 - - [15/Jan/2022:17:40:52 +0300] "GET /identity HTTP/1.1" 300 569 "-" "openstacksdk/0.50.0 keystoneauth1/4.2.1 python-requests/2.23.0 CPython/3.8.10"
10.0.0.11 - - [15/Jan/2022:17:40:52 +0300] "POST /identity/v3/auth/tokens HTTP/1.1" 201 5316 "-" "openstacksdk/0.50.0 keystoneauth1/4.2.1 python-requests/2.23.0 CPython/3.8.10"
10.0.0.11 - - [15/Jan/2022:17:40:52 +0300] "POST /identity/v3/auth/tokens HTTP/1.1" 201 5320 "-" "watcher/unknown keystonemiddleware.auth_token/9.1.0 keystoneauth1/4.2.1 python-requests/2.23.0 CPython/3.8.10"
On the Keystone side, I ran it with some verbosity using the following command:
/usr/bin/uwsgi --procname-prefix keystone --ini /etc/keystone/keystone-uwsgi-public.ini
and got the following log:
DEBUG keystone.server.flask.request_processing.req_logging [None req-e422207d-b376-4e97-b20b-1d16144be4db None None] REQUEST_METHOD: `GET` {{(pid=20441) log_request_info /opt/stack/keystone/keystone/server/flask/request_processing/req_logging.py:27}}
DEBUG keystone.server.flask.request_processing.req_logging [None req-e422207d-b376-4e97-b20b-1d16144be4db None None] SCRIPT_NAME: `/identity` {{(pid=20441) log_request_info /opt/stack/keystone/keystone/server/flask/request_processing/req_logging.py:28}}
DEBUG keystone.server.flask.request_processing.req_logging [None req-e422207d-b376-4e97-b20b-1d16144be4db None None] PATH_INFO: `/` {{(pid=20441) log_request_info /opt/stack/keystone/keystone/server/flask/request_processing/req_logging.py:29}}
[pid: 20441|app: 0|req: 1/1] 10.0.0.11 () {58 vars in 998 bytes} [Sat Jan 15 17:44:30 2022] GET /identity => generated 268 bytes in 5 msecs (HTTP/1.1 300) 6 headers in 232 bytes (1 switches on core 0)
DEBUG keystone.server.flask.request_processing.req_logging [None req-cc547fb9-886e-4ed2-a3be-7e043004eed8 None None] REQUEST_METHOD: `POST` {{(pid=20440) log_request_info /opt/stack/keystone/keystone/server/flask/request_processing/req_logging.py:27}}
DEBUG keystone.server.flask.request_processing.req_logging [None req-cc547fb9-886e-4ed2-a3be-7e043004eed8 None None] SCRIPT_NAME: `/identity` {{(pid=20440) log_request_info /opt/stack/keystone/keystone/server/flask/request_processing/req_logging.py:28}}
DEBUG keystone.server.flask.request_processing.req_logging [None req-cc547fb9-886e-4ed2-a3be-7e043004eed8 None None] PATH_INFO: `/v3/auth/tokens` {{(pid=20440) log_request_info /opt/stack/keystone/keystone/server/flask/request_processing/req_logging.py:29}}
DEBUG oslo_db.sqlalchemy.engines [None req-cc547fb9-886e-4ed2-a3be-7e043004eed8 None None] MySQL server mode set to STRICT_TRANS_TABLES,STRICT_ALL_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,TRADITIONAL,NO_ENGINE_SUBSTITUTION {{(pid=20440) _check_effective_sql_mode /usr/local/lib/python3.8/dist-packages/oslo_db/sqlalchemy/engines.py:304}}
DEBUG passlib.handlers.bcrypt [None req-cc547fb9-886e-4ed2-a3be-7e043004eed8 None None] detected 'bcrypt' backend, version '3.2.0' {{(pid=20440) _load_backend_mixin /usr/local/lib/python3.8/dist-packages/passlib/handlers/bcrypt.py:567}}
DEBUG passlib.handlers.bcrypt [None req-cc547fb9-886e-4ed2-a3be-7e043004eed8 None None] 'bcrypt' backend lacks $2$ support, enabling workaround {{(pid=20440) _finalize_backend_mixin /usr/local/lib/python3.8/dist-packages/passlib/handlers/bcrypt.py:382}}
DEBUG keystone.auth.core [None req-cc547fb9-886e-4ed2-a3be-7e043004eed8 None None] MFA Rules not processed for user `97eec1465cdc4e41b5c0ba48a1b39cc2`. Rule list: `[]` (Enabled: `True`). {{(pid=20440) check_auth_methods_against_rules /opt/stack/keystone/keystone/auth/core.py:438}}
DEBUG keystone.common.fernet_utils [None req-cc547fb9-886e-4ed2-a3be-7e043004eed8 None None] Loaded 2 Fernet keys from /etc/keystone/fernet-keys/, but `[fernet_tokens] max_active_keys = 3`; perhaps there have not been enough key rotations to reach `max_active_keys` yet? {{(pid=20440) load_keys /opt/stack/keystone/keystone/common/fernet_utils.py:286}}
[pid: 20440|app: 0|req: 1/2] 10.0.0.11 () {62 vars in 1095 bytes} [Sat Jan 15 17:44:30 2022] POST /identity/v3/auth/tokens => generated 4862 bytes in 125 msecs (HTTP/1.1 201) 6 headers in 385 bytes (1 switches on core 0)
DEBUG keystone.server.flask.request_processing.req_logging [None req-0584fbcc-66c5-4fba-9d8a-ea8ad2d40c5d None None] REQUEST_METHOD: `GET` {{(pid=20441) log_request_info /opt/stack/keystone/keystone/server/flask/request_processing/req_logging.py:27}}
DEBUG keystone.server.flask.request_processing.req_logging [None req-0584fbcc-66c5-4fba-9d8a-ea8ad2d40c5d None None] SCRIPT_NAME: `/identity` {{(pid=20441) log_request_info /opt/stack/keystone/keystone/server/flask/request_processing/req_logging.py:28}}
DEBUG keystone.server.flask.request_processing.req_logging [None req-0584fbcc-66c5-4fba-9d8a-ea8ad2d40c5d None None] PATH_INFO: `/` {{(pid=20441) log_request_info /opt/stack/keystone/keystone/server/flask/request_processing/req_logging.py:29}}
[pid: 20441|app: 0|req: 2/3] 10.0.0.11 () {58 vars in 1033 bytes} [Sat Jan 15 17:44:30 2022] GET /identity => generated 268 bytes in 2 msecs (HTTP/1.1 300) 6 headers in 232 bytes (1 switches on core 0)
DEBUG keystone.server.flask.request_processing.req_logging [None req-f096d017-66d0-4baa-8414-2596d0869005 None None] REQUEST_METHOD: `POST` {{(pid=20440) log_request_info /opt/stack/keystone/keystone/server/flask/request_processing/req_logging.py:27}}
DEBUG keystone.server.flask.request_processing.req_logging [None req-f096d017-66d0-4baa-8414-2596d0869005 None None] SCRIPT_NAME: `/identity` {{(pid=20440) log_request_info /opt/stack/keystone/keystone/server/flask/request_processing/req_logging.py:28}}
DEBUG keystone.server.flask.request_processing.req_logging [None req-f096d017-66d0-4baa-8414-2596d0869005 None None] PATH_INFO: `/v3/auth/tokens` {{(pid=20440) log_request_info /opt/stack/keystone/keystone/server/flask/request_processing/req_logging.py:29}}
DEBUG keystone.auth.core [None req-f096d017-66d0-4baa-8414-2596d0869005 None None] MFA Rules not processed for user `c5c42a1a942e48fd9b735ea9c6a11ed0`. Rule list: `[]` (Enabled: `True`). {{(pid=20440) check_auth_methods_against_rules /opt/stack/keystone/keystone/auth/core.py:438}}
DEBUG keystone.common.fernet_utils [None req-f096d017-66d0-4baa-8414-2596d0869005 None None] Loaded 2 Fernet keys from /etc/keystone/fernet-keys/, but `[fernet_tokens] max_active_keys = 3`; perhaps there have not been enough key rotations to reach `max_active_keys` yet? {{(pid=20440) load_keys /opt/stack/keystone/keystone/common/fernet_utils.py:286}}
[pid: 20440|app: 0|req: 2/4] 10.0.0.11 () {62 vars in 1130 bytes} [Sat Jan 15 17:44:30 2022] POST /identity/v3/auth/tokens => generated 4866 bytes in 26 msecs (HTTP/1.1 201) 6 headers in 385 bytes (2 switches on core 0)
So the first thing I did was to check the catalog:
openstack catalog list
+----------+----------+----------------------------------------+
| Name     | Type     | Endpoints                              |
+----------+----------+----------------------------------------+
| keystone | identity | RegionOne                              |
|          |          |   internal: http://controller/identity |
|          |          | RegionOne                              |
|          |          |   public: http://controller/identity   |
|          |          | RegionOne                              |
|          |          |   admin: http://controller/identity    |
+----------+----------+----------------------------------------+
My question is: do I need to create another, specific internal endpoint for the identity service, and where should I declare it so that the watcher-api can find it?
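For reference, registering an internal identity endpoint is normally done with the CLI. This is a sketch, assuming the URL from the catalog output above; note also that the traceback asks for the `regionOne` region while the catalog lists `RegionOne`, and the region name a client requests must match the catalog exactly, including case:

```
# Sketch only: the URL is taken from the catalog above, and the region
# name must match what the client asks for (region names are case-sensitive).
openstack endpoint create --region RegionOne \
    identity internal http://controller/identity
```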
EDIT: Following @larsks' comment, I changed the credentials used in watcher.conf to username = admin (the admin user) and the corresponding password. `openstack optimize service list` then gave back the following:
WARNING keystonemiddleware.auth_token [-] Identity response: {"error":{"code":401,"message":"The request you have made requires authentication.","title":"Unauthorized"}}
: keystoneauth1.exceptions.http.Unauthorized: The request you have made requires authentication. (HTTP 401) (Request-ID: req-56b63a60-1ba2-4f12-93c0-e7c7d1a1769c)
2022-01-15 19:04:17.424 28742 CRITICAL keystonemiddleware.auth_token [-] Unable to validate token: Identity server rejected authorization necessary to fetch token data: keystonemiddleware.auth_token._exceptions.ServiceError: Identity server rejected authorization necessary to fetch token data
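For context, the credentials Watcher's API uses to validate incoming tokens live in the `[keystone_authtoken]` section of watcher.conf. A minimal sketch of that section, assuming the endpoint layout from the catalog above (the service-user name and password are placeholders, not values from the post):

```ini
[keystone_authtoken]
# Sketch only -- username/password below are placeholders.
www_authenticate_uri = http://controller/identity
auth_url = http://controller/identity
auth_type = password
project_domain_name = Default
user_domain_name = Default
project_name = service
username = watcher
password = WATCHER_PASS
region_name = RegionOne
```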

Related

google.api_core.exceptions.NotFound bucket does not exists

When I run the data_ingestion_gcs_dag DAG in Airflow, I get an error that it cannot find the specified bucket; however, I rechecked it and the bucket name is fine. I have set up access to the Google account with docker-compose. Here is the code (only the first part is included):
version: '3'
x-airflow-common:
&airflow-common
# In order to add custom dependencies or upgrade provider packages you can use your extended image.
# Comment the image line, place your Dockerfile in the directory where you placed the docker-compose.yaml
# and uncomment the "build" line below, then run `docker-compose build` to build the images.
build:
context: .
dockerfile: ./Dockerfile
environment:
&airflow-common-env
AIRFLOW__CORE__EXECUTOR: CeleryExecutor
AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:airflow@postgres/airflow
AIRFLOW__CELERY__BROKER_URL: redis://:@redis:6379/0
AIRFLOW__CORE__FERNET_KEY: ''
AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
AIRFLOW__API__AUTH_BACKEND: 'airflow.api.auth.backend.basic_auth'
_PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:-}
GOOGLE_APPLICATION_CREDENTIALS: /.google/credentials/google_credentials.json
AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT: 'google-cloud-platform://?extra__google_cloud_platform__key_path=/.google/credentials/google_credentials.json'
# TODO: Please change GCP_PROJECT_ID & GCP_GCS_BUCKET, as per your config
GCP_PROJECT_ID: 'real-dtc-de'
GCP_GCS_BUCKET: 'dtc_data_lake_real-dtc-de'
volumes:
- ./dags:/opt/airflow/dags
- ./logs:/opt/airflow/logs
- ./plugins:/opt/airflow/plugins
- ~/.google/credentials/:/.google/credentials:ro
And here is the relevant code from the DAG:
PROJECT_ID = os.environ.get("GCP_PROJECT_ID")
BUCKET = os.environ.get("GCP_GCS_BUCKET")
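Worth checking: the bucket in the 404 error further down (`dtc_data_lake_animated-surfer-338618`) is not the one set in docker-compose (`dtc_data_lake_real-dtc-de`), which hints that the worker may not be seeing the environment variable at all. A minimal sketch of how `os.environ.get` behaves when the variable is missing (the variable name is the one from the compose file above):

```python
import os

# If GCP_GCS_BUCKET is not exported into the worker's environment,
# os.environ.get silently returns None instead of raising an error.
os.environ.pop("GCP_GCS_BUCKET", None)
print(os.environ.get("GCP_GCS_BUCKET"))  # -> None

# A second argument supplies a fallback default instead of None.
print(os.environ.get("GCP_GCS_BUCKET", "dtc_data_lake_real-dtc-de"))
```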
Here are the logs from the DAG:
*** Reading local file: /opt/airflow/logs/data_ingestion_gcs_dag/local_to_gcs_task/2022-06-13T02:47:29.654918+00:00/1.log
[2022-06-13, 02:47:36 UTC] {taskinstance.py:1032} INFO - Dependencies all met for <TaskInstance: data_ingestion_gcs_dag.local_to_gcs_task manual__2022-06-13T02:47:29.654918+00:00 [queued]>
[2022-06-13, 02:47:36 UTC] {taskinstance.py:1032} INFO - Dependencies all met for <TaskInstance: data_ingestion_gcs_dag.local_to_gcs_task manual__2022-06-13T02:47:29.654918+00:00 [queued]>
[2022-06-13, 02:47:36 UTC] {taskinstance.py:1238} INFO -
--------------------------------------------------------------------------------
[2022-06-13, 02:47:36 UTC] {taskinstance.py:1239} INFO - Starting attempt 1 of 2
[2022-06-13, 02:47:36 UTC] {taskinstance.py:1240} INFO -
--------------------------------------------------------------------------------
[2022-06-13, 02:47:36 UTC] {taskinstance.py:1259} INFO - Executing <Task(PythonOperator): local_to_gcs_task> on 2022-06-13 02:47:29.654918+00:00
[2022-06-13, 02:47:36 UTC] {standard_task_runner.py:52} INFO - Started process 1042 to run task
[2022-06-13, 02:47:36 UTC] {standard_task_runner.py:76} INFO - Running: ['***', 'tasks', 'run', 'data_ingestion_gcs_dag', 'local_to_gcs_task', 'manual__2022-06-13T02:47:29.654918+00:00', '--job-id', '11', '--raw', '--subdir', 'DAGS_FOLDER/data_ingestion_gcs_dag.py', '--cfg-path', '/tmp/tmp11gg9aoy', '--error-file', '/tmp/tmpjbp6yrks']
[2022-06-13, 02:47:36 UTC] {standard_task_runner.py:77} INFO - Job 11: Subtask local_to_gcs_task
[2022-06-13, 02:47:36 UTC] {logging_mixin.py:109} INFO - Running <TaskInstance: data_ingestion_gcs_dag.local_to_gcs_task manual__2022-06-13T02:47:29.654918+00:00 [running]> on host aea7312db396
[2022-06-13, 02:47:36 UTC] {taskinstance.py:1426} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=***
AIRFLOW_CTX_DAG_ID=data_ingestion_gcs_dag
AIRFLOW_CTX_TASK_ID=local_to_gcs_task
AIRFLOW_CTX_EXECUTION_DATE=2022-06-13T02:47:29.654918+00:00
AIRFLOW_CTX_DAG_RUN_ID=manual__2022-06-13T02:47:29.654918+00:00
[2022-06-13, 02:47:36 UTC] {taskinstance.py:1700} ERROR - Task failed with exception
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 2594, in upload_from_file
retry=retry,
File "/home/airflow/.local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 2396, in _do_upload
retry=retry,
File "/home/airflow/.local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 1917, in _do_multipart_upload
transport, data, object_metadata, content_type, timeout=timeout
File "/home/airflow/.local/lib/python3.7/site-packages/google/resumable_media/requests/upload.py", line 154, in transmit
retriable_request, self._get_status_code, self._retry_strategy
File "/home/airflow/.local/lib/python3.7/site-packages/google/resumable_media/requests/_request_helpers.py", line 147, in wait_and_retry
response = func()
File "/home/airflow/.local/lib/python3.7/site-packages/google/resumable_media/requests/upload.py", line 149, in retriable_request
self._process_response(result)
File "/home/airflow/.local/lib/python3.7/site-packages/google/resumable_media/_upload.py", line 113, in _process_response
_helpers.require_status_code(response, (http.client.OK,), self._get_status_code)
File "/home/airflow/.local/lib/python3.7/site-packages/google/resumable_media/_helpers.py", line 104, in require_status_code
*status_codes
google.resumable_media.common.InvalidResponse: ('Request failed with status code', 404, 'Expected one of', <HTTPStatus.OK: 200>)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1329, in _run_raw_task
self._execute_task_with_callbacks(context)
File "/home/airflow/.local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1455, in _execute_task_with_callbacks
result = self._execute_task(context, self.task)
File "/home/airflow/.local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1511, in _execute_task
result = execute_callable(context=context)
File "/home/airflow/.local/lib/python3.7/site-packages/airflow/operators/python.py", line 174, in execute
return_value = self.execute_callable()
File "/home/airflow/.local/lib/python3.7/site-packages/airflow/operators/python.py", line 185, in execute_callable
return self.python_callable(*self.op_args, **self.op_kwargs)
File "/opt/airflow/dags/data_ingestion_gcs_dag.py", line 51, in upload_to_gcs
blob.upload_from_filename(local_file)
File "/home/airflow/.local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 2735, in upload_from_filename
retry=retry,
File "/home/airflow/.local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 2598, in upload_from_file
_raise_from_invalid_response(exc)
File "/home/airflow/.local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 4466, in _raise_from_invalid_response
raise exceptions.from_http_status(response.status_code, message, response=response)
google.api_core.exceptions.NotFound: 404 POST https://storage.googleapis.com/upload/storage/v1/b/dtc_data_lake_animated-surfer-338618/o?uploadType=multipart: {
"error": {
"code": 404,
"message": "The specified bucket does not exist.",
"errors": [
{
"message": "The specified bucket does not exist.",
"domain": "global",
"reason": "notFound"
}
]
}
}

CORS on flask - uwsgi - nginx stack

I am running a Flask app behind nginx/uwsgi. I am facing CORS issues when uploading files. (The upload limit in nginx is set to 30M, the same as in uwsgi, and I'm only uploading 2M of files; I have also allowed all CORS origins.) I've tried everything, but to no avail; the request succeeds when I run it directly from an interactive Python session.
I have an endpoint /result
@app.route('/result', methods=['GET', 'POST'])
@token_required
def result(user: User):
    if request.method == "GET":
        d = request.args
        # do stuff
        return jsonify({'success': False, 'msg': 'Unable to fulfill request'}), 201
    else:
        # do stuff
        return jsonify({'success': False, 'msg': 'Missing Fields'}), 201
Here are the uwsgi logs:
[pid: 8615|app: 0|req: 1/1] xxx.xx.xxx.xxx () {52 vars in 820 bytes} [Fri Apr 22 18:33:08 2022] OPTIONS /jwt => generated 0 bytes in 4 msecs (HTTP/2.0 200) 8 headers in 340 bytes (1 switches on core 0)
[pid: 8615|app: 0|req: 2/2] xxx.xx.xxx.xxx () {52 vars in 840 bytes} [Fri Apr 22 18:33:08 2022] OPTIONS /notifications => generated 0 bytes in 0 msecs (HTTP/2.0 200) 8 headers in 340 bytes (1 switches on core 0)
[pid: 8614|app: 0|req: 1/3] xxx.xx.xxx.xxx () {52 vars in 826 bytes} [Fri Apr 22 18:33:08 2022] OPTIONS /result => generated 0 bytes in 4 msecs (HTTP/2.0 200) 8 headers in 346 bytes (1 switches on core 0)
[pid: 8615|app: 0|req: 3/4] xxx.xx.xxx.xxx () {52 vars in 820 bytes} [Fri Apr 22 18:33:08 2022] OPTIONS /jwt => generated 0 bytes in 1 msecs (HTTP/2.0 200) 8 headers in 340 bytes (1 switches on core 0)
[pid: 8615|app: 0|req: 4/5] xxx.xx.xxx.xxx () {52 vars in 840 bytes} [Fri Apr 22 18:33:08 2022] OPTIONS /notifications => generated 0 bytes in 0 msecs (HTTP/2.0 200) 8 headers in 340 bytes (1 switches on core 0)
[pid: 8617|app: 0|req: 1/6] xxx.xx.xxx.xxx () {52 vars in 945 bytes} [Fri Apr 22 18:33:08 2022] GET /jwt => generated 22 bytes in 14 msecs (HTTP/2.0 201) 5 headers in 190 bytes (1 switches on core 0)
[pid: 8615|app: 0|req: 5/7] xxx.xx.xxx.xxx () {52 vars in 951 bytes} [Fri Apr 22 18:33:08 2022] GET /result => generated 274 bytes in 14 msecs (HTTP/2.0 201) 5 headers in 191 bytes (1 switches on core 0)
[pid: 8613|app: 0|req: 1/8] xxx.xx.xxx.xxx () {52 vars in 965 bytes} [Fri Apr 22 18:33:08 2022] GET /notifications => generated 19973 bytes in 25 msecs (HTTP/2.0 201) 5 headers in 193 bytes (2 switches on core 0)
[pid: 8614|app: 0|req: 2/9] xxx.xx.xxx.xxx () {52 vars in 945 bytes} [Fri Apr 22 18:33:08 2022] GET /jwt => generated 22 bytes in 7 msecs (HTTP/2.0 201) 5 headers in 190 bytes (1 switches on core 0)
[pid: 8614|app: 0|req: 3/10] xxx.xx.xxx.xxx () {52 vars in 965 bytes} [Fri Apr 22 18:33:08 2022] GET /notifications => generated 19973 bytes in 10 msecs (HTTP/2.0 201) 5 headers in 193 bytes (1 switches on core 0)
[pid: 8614|app: 0|req: 4/11] xxx.xx.xxx.xxx () {52 vars in 827 bytes} [Fri Apr 22 18:33:18 2022] OPTIONS /result => generated 0 bytes in 1 msecs (HTTP/2.0 200) 8 headers in 346 bytes (0 switches on core 0)
Chrome OPTIONS response
access-control-allow-headers: authorization
access-control-allow-methods: DELETE, GET, HEAD, OPTIONS, PATCH, POST, PUT
access-control-allow-origin: https://example.com
access-control-expose-headers: Content-Disposition
allow: OPTIONS, HEAD, POST, GET
content-length: 0
content-type: text/html; charset=utf-8
date: Fri, 22 Apr 2022 18:33:18 GMT
server: nginx/1.20.0
vary: Origin
Chrome Console error
Access to XMLHttpRequest at 'https://api.example.com/result' from origin 'https://example.com' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.
The error is quite funny, because the OPTIONS response does have the 'Access-Control-Allow-Origin' header.
When I was setting up uwsgi I had to change nginx's user group. When I checked the nginx error logs at /var/log/nginx/error.log, I noticed there was a permissions issue.
I solved it by changing the group of the relevant directories:
sudo chgrp www-data /var/lib/nginx/tmp/ /var/lib/nginx/ /var/lib/nginx/tmp/client_body/
I still can't explain why a plain Python requests query from my PC did not trigger the same issue.
PS: whenever nginx itself hits an error and returns the response to the browser, you will see the CORS error in the console logs. So always take that error with a grain of salt, especially if you have already set the CORS headers correctly on the Flask side.
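The mechanism behind that PS: when nginx generates the error response itself, the Flask app never runs, so its CORS headers are never added, and the browser reports the failure as a CORS violation. If you want nginx to attach the header even on responses it generates, the `always` parameter of `add_header` does that; a sketch, with the origin and socket path as placeholders:

```nginx
# Sketch only: "always" makes nginx add the header even on the 4xx/5xx
# responses it generates itself, which otherwise lack any CORS headers.
location / {
    add_header Access-Control-Allow-Origin "https://example.com" always;
    uwsgi_pass unix:/run/uwsgi/app.sock;  # placeholder socket path
}
```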

Apache Airflow : Dag task marked zombie, with background process running on remote server

**Apache Airflow version:** 1.10.9-composer
Kubernetes Version : Client Version: version.Info{Major:"1", Minor:"15+", GitVersion:"v1.15.12-gke.6002", GitCommit:"035184604aff4de66f7db7fddadb8e7be76b6717", GitTreeState:"clean", BuildDate:"2020-12-01T23:13:35Z", GoVersion:"go1.12.17b4", Compiler:"gc", Platform:"linux/amd64"}
Environment: Airflow, running on top of Kubernetes - Linux version 4.19.112
OS : Linux version 4.19.112+ (builder@7fc5cdead624) (Chromium OS 9.0_pre361749_p20190714-r4 clang version 9.0.0 (/var/cache/chromeos-cache/distfiles/host/egit-src/llvm-project c11de5eada2decd0a495ea02676b6f4838cd54fb) (based on LLVM 9.0.0svn)) #1 SMP Fri Sep 4 12:00:04 PDT 2020
Kernel : Linux gke-europe-west2-asset-c-default-pool-dc35e2f2-0vgz
4.19.112+ #1 SMP Fri Sep 4 12:00:04 PDT 2020 x86_64 Intel(R) Xeon(R) CPU @ 2.20GHz GenuineIntel GNU/Linux
What happened?
A running task is marked as a zombie after the execution time crossed the latest heartbeat + 5 minutes.
The task runs in the background on another application server, triggered using SSHOperator.
[2021-01-18 11:53:37,491] {taskinstance.py:888} INFO - Executing <Task(SSHOperator): load_trds_option_composite_file> on 2021-01-17T11:40:00+00:00
[2021-01-18 11:53:37,495] {base_task_runner.py:131} INFO - Running on host: airflow-worker-6f6fd78665-lm98m
[2021-01-18 11:53:37,495] {base_task_runner.py:132} INFO - Running: ['airflow', 'run', 'dsp_etrade_process_trds_option_composite_0530', 'load_trds_option_composite_file', '2021-01-17T11:40:00+00:00', '--job_id', '282759', '--pool', 'default_pool', '--raw', '-sd', 'DAGS_FOLDER/dsp_etrade_trds_option_composite_0530.py', '--cfg_path', '/tmp/tmpge4_nva0']
Task Executing time:
dag_id dsp_etrade_process_trds_option_composite_0530
duration 7270.47
start_date 2021-01-18 11:53:37,491
end_date 2021-01-18 13:54:47.799728+00:00
Scheduler Logs during that time:
[2021-01-18 13:54:54,432] {taskinstance.py:1135} ERROR - <TaskInstance: dsp_etrade_process_etrd.push_run_date 2021-01-18 13:30:00+00:00 [running]> detected as zombie
{
textPayload: "[2021-01-18 13:54:54,432] {taskinstance.py:1135} ERROR - <TaskInstance: dsp_etrade_process_etrd.push_run_date 2021-01-18 13:30:00+00:00 [running]> detected as zombie"
insertId: "1ca8zyfg3zvma66"
resource: {
type: "cloud_composer_environment"
labels: {3}
}
timestamp: "2021-01-18T13:54:54.432862699Z"
severity: "ERROR"
logName: "projects/asset-control-composer-prod/logs/airflow-scheduler"
receiveTimestamp: "2021-01-18T13:54:55.714437665Z"
}
Airflow-webserver log :
X.X.X.X - - [18/Jan/2021:13:54:39 +0000] "GET /_ah/health HTTP/1.1" 200 187 "-" "GoogleHC/1.0"
{
textPayload: "172.17.0.5 - - [18/Jan/2021:13:54:39 +0000] "GET /_ah/health HTTP/1.1" 200 187 "-" "GoogleHC/1.0"
"
insertId: "1sne0gqg43o95n3"
resource: {2}
timestamp: "2021-01-18T13:54:45.401670481Z"
logName: "projects/asset-control-composer-prod/logs/airflow-webserver"
receiveTimestamp: "2021-01-18T13:54:50.598807514Z"
}
Airflow Info logs :
2021-01-18 08:54:47.799 EST
{
textPayload: "NoneType: None
"
insertId: "1ne3hqgg47yzrpf"
resource: {2}
timestamp: "2021-01-18T13:54:47.799661030Z"
severity: "INFO"
logName: "projects/asset-control-composer-prod/logs/airflow-scheduler"
receiveTimestamp: "2021-01-18T13:54:50.914461159Z"
}
[2021-01-18 13:54:47,800] {taskinstance.py:1192} INFO - Marking task as FAILED.dag_id=dsp_etrade_process_trds_option_composite_0530, task_id=load_trds_option_composite_file, execution_date=20210117T114000, start_date=20210118T115337, end_date=20210118T135447
{
textPayload: "[2021-01-18 13:54:47,800] {taskinstance.py:1192} INFO - Marking task as FAILED.dag_id=dsp_etrade_process_trds_option_composite_0530, task_id=load_trds_option_composite_file, execution_date=20210117T114000, start_date=20210118T115337, end_date=20210118T135447"
insertId: "1ne3hqgg47yzrpg"
resource: {2}
timestamp: "2021-01-18T13:54:47.800605248Z"
severity: "INFO"
logName: "projects/asset-control-composer-prod/logs/airflow-scheduler"
receiveTimestamp: "2021-01-18T13:54:50.914461159Z"
}
Airflow Database shows the latest heartbeat as:
select state, latest_heartbeat from job where id=282759
--------------------------------------
state | latest_heartbeat
running | 2021-01-18 13:48:41.891934
Airflow Configurations:
celery
worker_concurrency=6
scheduler
scheduler_health_check_threshold=60
scheduler_zombie_task_threshold=300
max_threads=2
core
dag_concurrency=6
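For reference, the 5-minute window mentioned above comes from `scheduler_zombie_task_threshold=300`: a running task whose job heartbeat is older than that threshold is declared a zombie. A sketch of raising it in airflow.cfg for long-running SSH tasks (the 2-hour value is an illustrative assumption, not from the post):

```ini
[scheduler]
# Sketch only: allow heartbeats to go quiet for longer before a task is
# declared a zombie (seconds; 7200 is an illustrative choice, not a recommendation).
scheduler_zombie_task_threshold = 7200
```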
Kubernetes Cluster :
Worker nodes : 6
What was expected to happen?
The backend process takes around 2 hours 30 minutes to finish. During such long-running jobs, the task is detected as a zombie even though the worker node is still processing it. The state of the job is still marked as 'running', and the state of the task is not known during the run.

<< "[read] I/O error: Read timed out" immediately upon sending headers

We see time-outs during some calls to an external REST service from within a Spring Boot application. They do not seem to occur when we connect to the REST service directly. Debug logging on org.apache.http has revealed a very peculiar aspect of the failing requests: it contains an inbound log entry '<< "[read] I/O error: Read timed out"' in the middle of sending headers, in the same millisecond the first headers were sent.
How can we see an inbound 'Read timed out' a few milliseconds after sending the first headers? And why does it not immediately interrupt the request/connection with a time-out, but instead waits the full 4500ms until it times out with an exception?
Here is our production log for a failing request, redacted. Note the 4500ms delay between lines two and three. My question is about the occurrence of http-outgoing-104 << "[read] I/O error: Read timed out" at 16:55:08.258, not the first one on line 2.
16:55:12.764 Connection released: [id: 104][route: {s}-><<website-redacted>>:443][total kept alive: 0; route allocated: 0 of 2; total allocated: 0 of 20]
16:55:12.763 http-outgoing-104 << "[read] I/O error: Read timed out"
16:55:08.259 http-outgoing-104 >> "<<POST Body Redacted>>"
16:55:08.259 http-outgoing-104 >> "[\r][\n]"
16:55:08.258 http-outgoing-104: set socket timeout to 4500
16:55:08.258 Executing request POST <<Endpoint Redacted>> HTTP/1.1
16:55:08.258 Target auth state: UNCHALLENGED
16:55:08.258 Proxy auth state: UNCHALLENGED
16:55:08.258 Connection leased: [id: 104][route: {s}-><<website-redacted>>:443][total kept alive: 0; route allocated: 1 of 2; total allocated: 1 of 20]
....
16:55:08.258 http-outgoing-104 >> "POST <<Endpoint Redacted>> HTTP/1.1[\r][\n]"
16:55:08.258 http-outgoing-104 >> "Accept: text/plain, application/json, application/*+json, */*[\r][\n]"
16:55:08.258 http-outgoing-104 >> Cookie: <<Redacted>>
16:55:08.258 http-outgoing-104 >> "Content-Type: application/json[\r][\n]"
16:55:08.258 http-outgoing-104 >> "Connection: close[\r][\n]"
16:55:08.258 http-outgoing-104 >> "X-B3-SpanId: <<ID>>[\r][\n]"
16:55:08.258 http-outgoing-104 << "[read] I/O error: Read timed out"
16:55:08.258 http-outgoing-104 >> "X-Span-Name: https:<<Endpoint Redacted>>[\r][\n]"
16:55:08.258 http-outgoing-104 >> "X-B3-TraceId: <<ID>>[\r][\n]"
16:55:08.258 http-outgoing-104 >> "X-B3-ParentSpanId: <<ID>>[\r][\n]"
16:55:08.258 http-outgoing-104 >> "Content-Length: 90[\r][\n]"
16:55:08.258 http-outgoing-104 >> "User-Agent: Apache-HttpClient/4.5.3 (Java/1.8.0_172)[\r][\n]"
16:55:08.258 http-outgoing-104 >> "Cookie: <<Redacted>>"
16:55:08.258 http-outgoing-104 >> "Host: <<Host redacted>>[\r][\n]"
16:55:08.258 http-outgoing-104 >> "Accept-Encoding: gzip,deflate[\r][\n]"
16:55:08.258 http-outgoing-104 >> "X-B3-Sampled: 1[\r][\n]"
Update 1: a second occurrence:
In another request that timed out, roughly the same behavior occurs, but the timeout message is logged even before the headers are sent, followed eventually by the actual timeout. Note: this request is actually older; after it, I configured the request to include 'Connection: close' to work around a firewall dropping the connection under 'Keep-Alive'.
19:28:08.102 http-outgoing-36 << "[read] I/O error: Read timed out"
19:28:08.102 http-outgoing-36: Shutdown connection
19:28:08.102 http-outgoing-36: Close connection
19:28:03.598 http-outgoing-36 >> "Connection: Keep-Alive[\r][\n]"
19:28:03.598 http-outgoing-36 >> "Content-Type: application/json;charset=UTF-8[\r][\n]"
...
19:28:03.598 http-outgoing-36 >> "Accept-Encoding: gzip,deflate[\r][\n]"
...
19:28:03.597 http-outgoing-36 >> Cookie: ....
19:28:03.597 http-outgoing-36 >> Accept-Encoding: gzip,deflate
19:28:03.597 http-outgoing-36 >> User-Agent: Apache-HttpClient/4.5.3 (Java/1.8.0_172)
19:28:03.596 Connection leased: [id: 36][route: {s}-><< Site redacted >>:443][total kept alive: 0; route allocated: 1 of 2; total allocated: 1 of 20]
19:28:03.596 http-outgoing-36: set socket timeout to 4500
19:28:03.596 Executing request POST HTTP/1.1
19:28:03.596 Target auth state: UNCHALLENGED
19:28:03.596 http-outgoing-36 << "[read] I/O error: Read timed out"
19:28:03.594 Connection request: [route: {s}-><< Site redacted >>:443][total kept alive: 1; route allocated: 1 of 2; total allocated: 1 of 20]
19:28:03.594 Auth cache not set in the context
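As an aside, the 'Connection: close' workaround mentioned above is nothing more than a request header. This hypothetical JDK-only sketch (not the application's Spring/HttpClient code) shows what the receiving side sees on the wire when the header is set:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

public class ConnectionCloseDemo {
    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(0)) {
            // Client thread writes a hand-crafted HTTP request with Connection: close,
            // which tells the peer (and any middlebox) not to keep the socket alive.
            Thread client = new Thread(() -> {
                try (Socket s = new Socket("127.0.0.1", server.getLocalPort())) {
                    OutputStream out = s.getOutputStream();
                    out.write(("POST /api HTTP/1.1\r\n"
                             + "Host: localhost\r\n"
                             + "Connection: close\r\n"
                             + "Content-Length: 0\r\n\r\n").getBytes("US-ASCII"));
                    out.flush();
                } catch (IOException ignored) { }
            });
            client.start();
            // Server side: read the request headers and report the header we care about.
            try (Socket conn = server.accept();
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(conn.getInputStream(), "US-ASCII"))) {
                for (String line; (line = in.readLine()) != null && !line.isEmpty(); ) {
                    if (line.equalsIgnoreCase("Connection: close")) {
                        System.out.println("server saw Connection: close");
                    }
                }
            }
            client.join();
        }
    }
}
```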
Update 2: added the HttpClientBuilder configuration (imports shown for context; socketTimeout/connectTimeout are in milliseconds, 4500 in the trace above):
import org.apache.http.client.config.RequestConfig;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;
import org.springframework.http.client.HttpComponentsClientHttpRequestFactory;
import org.springframework.web.client.RestTemplate;

RequestConfig.Builder requestBuilder = RequestConfig.custom()
        .setSocketTimeout(socketTimeout)    // max inactivity between data packets
        .setConnectTimeout(connectTimeout); // max time to establish the connection
CloseableHttpClient httpClient = HttpClientBuilder.create()
        .setDefaultRequestConfig(requestBuilder.build())
        .build();
HttpComponentsClientHttpRequestFactory rf = new HttpComponentsClientHttpRequestFactory(httpClient);
return new RestTemplate(rf);
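For what it's worth, the "Read timed out" in the trace is the socket (read) timeout firing, not the connect timeout. A minimal JDK-only sketch (hypothetical, not the application code) reproduces the same condition against a local server that accepts the TCP connection but never answers:

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URL;

public class ReadTimeoutDemo {
    public static void main(String[] args) throws IOException {
        // The server socket accepts the connection in its backlog but never
        // writes a response, mimicking a peer (or firewall) that goes silent.
        try (ServerSocket server = new ServerSocket(0)) {
            URL url = new URL("http://127.0.0.1:" + server.getLocalPort() + "/");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setConnectTimeout(1000); // TCP connect succeeds quickly
            conn.setReadTimeout(500);     // stand-in for the 4500 ms socket timeout in the log
            try {
                conn.getResponseCode();   // blocks waiting for a response that never comes
                System.out.println("unexpected response");
            } catch (SocketTimeoutException e) {
                // The same condition HttpClient logs as "[read] I/O error: Read timed out"
                System.out.println("read timed out");
            }
        }
    }
}
```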

Advanced AWS CloudFormation - cfn-Init & cfn-Hup not working

I am experimenting with CloudFormation cfn-init and cfn-hup based on the template below, but the WordPress stack doesn't get created: the cfn-hup process is not started and cfn-init exits with code 1. Please see the stack template and error log details below. Can anyone help me understand what's going wrong here?
Stack-Template:
Parameters:
  DecideEnvSize:
    Type: String
    Default: LOW
    AllowedValues:
      - LOW
      - MEDIUM
      - HIGH
    Description: Select Environment Size (S,M,L)
  DatabaseName:
    Type: String
    Default: DB4wordpress
  DatabaseUser:
    Type: String
    Default: ***************
  DatabasePassword:
    Type: String
    Default: *************
    NoEcho: true
  TestString:
    Type: String
    Default: Don't eat yourself up!!!
Mappings:
  MyRegionMap:
    us-east-1:
      "AMALINUX" : "ami-c481fad3" # AMALINUX SEP 2016 - N. Virginia
    us-east-2:
      "AMALINUX" : "ami-71ca9114" # AMALINUX SEP 2016 - Ohio
  InstanceSize:
    LOW:
      "EC2" : "t2.micro"
      "DB" : "db.t2.micro"
    MEDIUM:
      "EC2" : "t2.small"
      "DB" : "db.t2.small"
    HIGH:
      "EC2" : "t2.medium"
      "DB" : "db.t2.medium"
Resources:
  DBServer:
    Type: "AWS::RDS::DBInstance"
    Properties:
      AllocatedStorage: 5
      StorageType: gp2
      DBInstanceClass: !FindInMap [InstanceSize, !Ref DecideEnvSize, DB] # Dynamic mapping + Pseudo Parameter
      DBName: !Ref DatabaseName
      Engine: MySQL
      MasterUsername: !Ref DatabaseUser
      MasterUserPassword: !Ref DatabasePassword
    DeletionPolicy: Delete
  EC2server:
    Type: "AWS::EC2::Instance"
    DependsOn: DBServer
    Properties:
      ImageId: !FindInMap [MyRegionMap, !Ref "AWS::Region", AMALINUX] # Dynamic mapping + Pseudo Parameter
      InstanceType: !FindInMap [InstanceSize, !Ref DecideEnvSize, EC2]
      KeyName: AdvancedCFN
      UserData:
        "Fn::Base64":
          !Sub |
            #!/bin/bash
            yum update -y aws-cfn-bootstrap # good practice - always do this.
            /opt/aws/bin/cfn-init -v --stack ${AWS::StackName} --resource EC2server --configsets wordpress --region ${AWS::Region}
            yum -y update
    Metadata:
      AWS::CloudFormation::Init:
        configSets:
          wordpress:
            - "configure_cfn"
            - "install_wordpress"
            - "config_wordpress"
        configure_cfn:
          files:
            /etc/cfn/cfn-hup.conf:
              content: !Sub |
                [main-just some name]
                stack=${AWS::StackId}
                region=${AWS::Region}
                verbose=true
                interval=5
              mode: "000400"
              owner: root
              group: root
            /etc/cfn/hooks.d/cfn-auto-reloader.conf:
              content: !Sub |
                [cfn-auto-reloader-hook #just a name]
                triggers=post.update
                path=Resources.EC2server.Metadata.AWS::CloudFormation::Init
                action=/opt/aws/bin/cfn-init -v --stack ${AWS::StackName} --resource EC2server --configsets wordpress --region ${AWS::Region}
              mode: "000400"
              owner: root
              group: root
            /var/www/html/index2.html:
              content: !Ref TestString
          services:
            sysvinit:
              cfn-hup:
                enabled: "true"
                ensureRunning: "true"
                files:
                  - "/etc/cfn/cfn-hup.conf"
                  - "/etc/cfn/hooks.d/cfn-auto-reloader.conf"
        install_wordpress:
          packages:
            yum:
              httpd: []
              php: []
              mysql: []
              php-mysql: []
          sources:
            /var/www/html: "http://wordpress.org/latest.tar.gz"
          services:
            sysvinit:
              httpd:
                enabled: "true"
                ensureRunning: "true"
        config_wordpress:
          commands:
            01_clone_config:
              cwd: "/var/www/html/wordpress"
              test: "test ! -e /var/www/html/wordpress/wp-config.php"
              command: "cp wp-config-sample.php wp-config.php"
            02_inject_dbhost:
              cwd: "/var/www/html/wordpress"
              command: !Sub |
                sed -i 's/localhost/${DBServer.Endpoint.Address}/g' wp-config.php
            03_inject_dbname:
              cwd: "/var/www/html/wordpress"
              command: !Sub |
                sed -i 's/database_name_here/${DatabaseName}/g' wp-config.php
            04_inject_dbuser:
              cwd: "/var/www/html/wordpress"
              command: !Sub |
                sed -i 's/username_here/${DatabaseUser}/g' wp-config.php
            05_inject_dbpassword:
              cwd: "/var/www/html/wordpress"
              command: !Sub |
                sed -i 's/password_here/${DatabasePassword}/g' wp-config.php
  S3blob:
    Type: "AWS::S3::Bucket"
Error & log details
[root@ip-172-31-25-239 ec2-user]# cd /var/log
[root@ip-172-31-25-239 log]# ls
audit btmp cfn-init-cmd.log cfn-wire.log cloud-init-output.log dmesg lastlog maillog ntpstats spooler wtmp
boot.log cfn-hup.log cfn-init.log cloud-init.log cron dracut.log mail messages secure tallylog yum.log
[root@ip-172-31-25-239 log]# cat cfn-hup.log
2017-12-30 10:48:15,923 [ERROR] Error: [main] section must contain stack option
===========================================================================================
===========================================================================================
[root@ip-172-31-25-239 log]# cat cfn-init.log
2017-12-30 10:48:15,499 [DEBUG] CloudFormation client initialized with endpoint https://cloudformation.us-east-1.amazonaws.com
2017-12-30 10:48:15,501 [DEBUG] Describing resource EC2server in stack Yetagain-init-hup-try10
2017-12-30 10:48:15,616 [INFO] -----------------------Starting build-----------------------
2017-12-30 10:48:15,616 [DEBUG] Not setting a reboot trigger as scheduling support is not available
2017-12-30 10:48:15,617 [INFO] Running configSets: wordpress
2017-12-30 10:48:15,618 [INFO] Running configSet wordpress
2017-12-30 10:48:15,619 [INFO] Running config configure_cfn
2017-12-30 10:48:15,620 [DEBUG] No packages specified
2017-12-30 10:48:15,620 [DEBUG] No groups specified
2017-12-30 10:48:15,620 [DEBUG] No users specified
2017-12-30 10:48:15,620 [DEBUG] No sources specified
2017-12-30 10:48:15,620 [DEBUG] Parent directory /etc/cfn does not exist, creating
2017-12-30 10:48:15,625 [DEBUG] Writing content to /etc/cfn/cfn-hup.conf
2017-12-30 10:48:15,625 [DEBUG] Setting mode for /etc/cfn/cfn-hup.conf to 000400
2017-12-30 10:48:15,626 [DEBUG] Setting owner 0 and group 0 for /etc/cfn/cfn-hup.conf
2017-12-30 10:48:15,626 [DEBUG] Parent directory /etc/cfn/hooks.d does not exist, creating
2017-12-30 10:48:15,626 [DEBUG] Writing content to /etc/cfn/hooks.d/cfn-auto-reloader.conf
2017-12-30 10:48:15,626 [DEBUG] Setting mode for /etc/cfn/hooks.d/cfn-auto-reloader.conf to 000400
2017-12-30 10:48:15,626 [DEBUG] Setting owner 0 and group 0 for /etc/cfn/hooks.d/cfn-auto-reloader.conf
2017-12-30 10:48:15,626 [DEBUG] Parent directory /var/www/html does not exist, creating
2017-12-30 10:48:15,627 [DEBUG] Writing content to /var/www/html/index2.html
2017-12-30 10:48:15,627 [DEBUG] No mode specified for /var/www/html/index2.html. The file will be created with the mode: 0644
2017-12-30 10:48:15,627 [DEBUG] No commands specified
2017-12-30 10:48:15,627 [DEBUG] Using service modifier: /sbin/chkconfig
2017-12-30 10:48:15,627 [DEBUG] Setting service cfn-hup to enabled
2017-12-30 10:48:15,634 [INFO] enabled service cfn-hup
2017-12-30 10:48:15,635 [DEBUG] Restarting cfn-hup due to change detected in dependency
2017-12-30 10:48:15,635 [DEBUG] Using service runner: /sbin/service
2017-12-30 10:48:15,941 [ERROR] Could not restart service cfn-hup; return code was 1
2017-12-30 10:48:15,941 [DEBUG] Service output: Stopping cfn-hup: [FAILED]
Starting cfn-hup: [FAILED]
2017-12-30 10:48:15,942 [ERROR] Error encountered during build of configure_cfn: Could not restart cfn-hup
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/cfnbootstrap/construction.py", line 542, in run_config
CloudFormationCarpenter(config, self._auth_config).build(worklog)
File "/usr/lib/python2.7/dist-packages/cfnbootstrap/construction.py", line 270, in build
CloudFormationCarpenter._serviceTools[manager]().apply(services, changes)
File "/usr/lib/python2.7/dist-packages/cfnbootstrap/service_tools.py", line 161, in apply
self._restart_service(service)
File "/usr/lib/python2.7/dist-packages/cfnbootstrap/service_tools.py", line 185, in _restart_service
raise ToolError("Could not restart %s" % service)
ToolError: Could not restart cfn-hup
2017-12-30 10:48:15,942 [ERROR] -----------------------BUILD FAILED!------------------------
2017-12-30 10:48:15,944 [ERROR] Unhandled exception during build: Could not restart cfn-hup
Traceback (most recent call last):
File "/opt/aws/bin/cfn-init", line 171, in <module>
worklog.build(metadata, configSets)
File "/usr/lib/python2.7/dist-packages/cfnbootstrap/construction.py", line 129, in build
Contractor(metadata).build(configSets, self)
File "/usr/lib/python2.7/dist-packages/cfnbootstrap/construction.py", line 530, in build
self.run_config(config, worklog)
File "/usr/lib/python2.7/dist-packages/cfnbootstrap/construction.py", line 542, in run_config
CloudFormationCarpenter(config, self._auth_config).build(worklog)
File "/usr/lib/python2.7/dist-packages/cfnbootstrap/construction.py", line 270, in build
CloudFormationCarpenter._serviceTools[manager]().apply(services, changes)
File "/usr/lib/python2.7/dist-packages/cfnbootstrap/service_tools.py", line 161, in apply
self._restart_service(service)
File "/usr/lib/python2.7/dist-packages/cfnbootstrap/service_tools.py", line 185, in _restart_service
raise ToolError("Could not restart %s" % service)
ToolError: Could not restart cfn-hup
=============================================================================================================
==============================================================================================================
[root@ip-172-31-25-239 log]# cat /etc/cfn/cfn-hup.conf
[main-just some name]
stack=arn:aws:cloudformation:us-east-1:523324464109:stack/Yetagain-init-hup-try10/908305e0-ed4d-11e7-b9f7-500c285ebefd
region=us-east-1
verbose=true
interval=5
==========================================================================================================
=========================================================================================================
[root@ip-172-31-25-239 log]# cat /etc/cfn/hooks.d/cfn-auto-reloader.conf
[cfn-auto-reloader-hook #just a name]
triggers=post.update
path=Resources.EC2server.Metadata.AWS::CloudFormation::Init
action=/opt/aws/bin/cfn-init -v --stack Yetagain-init-hup-try10 --resource EC2server --configsets wordpress --region us-east-1
Looks like the main error is in cfn-hup.log:
2017-12-30 10:48:15,923 [ERROR] Error: [main] section must contain stack option
Try changing [main-just some name] to [main] in your cfn-hup.conf; cfn-hup only reads its options from a section named exactly [main]. For reference, my /etc/cfn/cfn-hup.conf looks something like this:
[main]
stack=arn:aws:cloudformation:us-west-1:account_id:stack/mystack-dev-ecs-EC2-1VF68LZMOLAIY/cb2a6a80-554a-11e8-b318-503dcab41efa
region=us-west-1
interval=5
verbose=true
