SSL error "bad handshake" while calling Databricks job from Airflow DAG - airflow

I am running an Airflow container in which
my Airflow DAGs fail to connect to a Databricks job, with the error log below.
failed with reason: HTTPSConnectionPool(host='Mycompany-dev.cloud.databricks.com', port=443): Max retries exceeded with url: /api/2.0/jobs/runs/submit (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')],)",),))
More information:
Initially, installing Docker or Java inside the container gave the same error, which I worked around by rewriting the pip install command as below. However, I am not sure how to apply a similar fix when connecting to a server from the Airflow UI.
pip install --trusted-host pypi.org --trusted-host files.pythonhosted.org <package_name>
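
The pip flag only affects pip itself; the Databricks connection is made by the Airflow Databricks operator through Python's requests library, which does its own certificate verification. A minimal sketch of one common fix, assuming the failure is caused by a corporate proxy or private CA, and with company-root-ca.crt as a placeholder for your own CA file:

# Dockerfile additions (Debian/Ubuntu-based Airflow image assumed):
COPY company-root-ca.crt /usr/local/share/ca-certificates/company-root-ca.crt
RUN update-ca-certificates
# requests uses its bundled certifi store by default, so point it at the
# system bundle that now contains the corporate CA:
ENV REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt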

Related

Upgrading R gives error - Failed to connect to Mir: Failed to connect to server socket: No such file or directory

I'm currently using R version 3.6.1 and I'm trying to update it to R 4.2. I tried following instructions here and here. But every time I try sudo add-apt-repository 'deb https://cloud.r-project.org/bin/linux/ubuntu focal-cran40/', I'm getting this error:
Failed to connect to Mir: Failed to connect to server socket: No such file or directory
Unable to init server: Could not connect: Connection refused
Failed to connect to Mir: Failed to connect to server socket: No such file or directory
Unable to init server: Could not connect: Connection refused
How do I resolve this ?
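
Those messages usually mean something in the chain tried to talk to a display server, which is harmless noise over SSH but can mask the real failure. A sketch of a workaround, assuming add-apt-repository itself is the culprit, is to skip it and write the source entry by hand (the file name cran40.list is an arbitrary choice):

# Add the CRAN repository without add-apt-repository:
echo 'deb https://cloud.r-project.org/bin/linux/ubuntu focal-cran40/' | \
  sudo tee /etc/apt/sources.list.d/cran40.list
# (the CRAN signing key must also be installed, per the instructions you followed)
sudo apt-get update
sudo apt-get install r-base   # should now pull R 4.x from CRAN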

I can't connect watcher to openstack

I have the following scenario:
server: Ubuntu 20.04.3 LTS
openstack installed with devstack
watcher 2.2.0
All services seem to be working and I can see the Watcher dashboard on localhost.
But I think I have an auth problem between Watcher and Keystone:
Error contacting Watcher server: Unable to establish connection to http://x.x.x.x:9322/v1/services: HTTPConnectionPool(host='x.x.x.x', port=9322): Max retries exceeded with url: /v1/services (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fba33b1be50>: Failed to establish a new connection: [Errno 111] Connection refused')).
Attempt 6 of 6
I'm testing on a VM, and I put the same IP and the same password everywhere.
Where should I look first?
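
"Connection refused" on port 9322 means nothing is listening there, so the first suspect is the watcher-api service itself rather than Keystone auth. A few first checks, sketched under the assumption of a devstack install (the systemd unit names may differ on your setup):

sudo systemctl status devstack@watcher-api      # is the Watcher API service running?
ss -tlnp | grep 9322                            # is anything actually listening on 9322?
openstack endpoint list | grep -i watcher       # does Keystone have the right endpoint?
sudo journalctl -u devstack@watcher-api -n 50   # recent logs, e.g. auth or bind errors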

openstack-nova-api has conflicted with the httpd service among the port 8774

I can't use httpd and nova-api at the same time.
When the httpd service is running, nova-api is dead (or inactive).
#systemctl restart openstack-nova-api
OUTPUT:
Job for openstack-nova-api.service failed because the control process exited
with error code. See "systemctl status openstack-nova-api.service" and
"journalctl -xe" for details.
I checked the log and got the following error:
LOG:ERROR nova.wsgi [-] Could not bind to 0.0.0.0:8774: error: [Errno 98] Address already in use.
CRITICAL nova [-] Unhandled error: error: [Errno 98] Address already in use.
Then I tried to find which process was using port 8774.
#netstat -tunlp | grep 8774
OUTPUT:
tcp 0 0 0.0.0.0:8774 0.0.0.0:* LISTEN 61690/httpd
When I run #systemctl stop httpd -> #systemctl restart openstack-nova-api -> #systemctl restart httpd, I get a similar error (I used RDO to install the OpenStack Train release on CentOS 7).
They can't exist together.
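
This conflict is expected if nova-api is deployed as a WSGI application inside httpd, which is the default layout in RDO Train: httpd legitimately owns port 8774, so the standalone daemon can never bind it. A sketch of how to confirm and resolve that, assuming the RDO/packstack layout:

grep -ril nova_api /etc/httpd/conf.d/         # look for a nova WSGI vhost (e.g. 10-nova_api_wsgi.conf)
systemctl disable --now openstack-nova-api    # stop and disable the redundant standalone daemon
systemctl restart httpd                       # let httpd serve the API on 8774
curl -s http://localhost:8774/ | head         # the API should now answer via httpd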

Airflow live executor logs with DaskExecutor

I have an Airflow installation (on Kubernetes). My setup uses DaskExecutor, and I have configured remote logging to S3. However, while a task is running I cannot see its log, and I get this error instead:
*** Log file does not exist: /airflow/logs/dbt/run_dbt/2018-11-01T06:00:00+00:00/3.log
*** Fetching from: http://airflow-worker-74d75ccd98-6g9h5:8793/log/dbt/run_dbt/2018-11-01T06:00:00+00:00/3.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='airflow-worker-74d75ccd98-6g9h5', port=8793): Max retries exceeded with url: /log/dbt/run_dbt/2018-11-01T06:00:00+00:00/3.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f7d0668ae80>: Failed to establish a new connection: [Errno -2] Name or service not known',))
Once the task is done, the log is shown correctly.
I believe what Airflow is doing is:
- for finished tasks, read logs from S3
- for running tasks, connect to the executor's log server endpoint and show that
It looks like Airflow uses celery.worker_log_server_port to connect to my Dask executor to fetch logs from there.
How to configure DaskExecutor to expose log server endpoint?
My configuration:
[core]
remote_logging = True
remote_base_log_folder = s3://some-s3-path
executor = DaskExecutor
[dask]
cluster_address = 127.0.0.1:8786
[celery]
worker_log_server_port = 8793
What I verified:
- the log file exists and is being written to on the executor while the task is running
- ran netstat -tunlp in the executor container, but did not find any extra port exposed where logs could be served from
UPDATE
Have a look at the serve_logs Airflow CLI command - I believe it does exactly the same thing.
We solved the problem by simply starting a Python HTTP server on the worker.
Dockerfile:
RUN mkdir -p $AIRFLOW_HOME/serve
RUN ln -s $AIRFLOW_HOME/logs $AIRFLOW_HOME/serve/log
worker.sh (run by Docker CMD):
#!/usr/bin/env bash
# Serve $AIRFLOW_HOME/serve (which symlinks to the logs directory) on the
# port the webserver expects, then start the Dask worker as usual.
cd $AIRFLOW_HOME/serve
python3 -m http.server 8793 &
cd -
dask-worker "$@"
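
As a variant sketch, the serve_logs command mentioned above could replace the ad-hoc HTTP server, assuming your Airflow version ships it (it reads worker_log_server_port from the config):

#!/usr/bin/env bash
# worker.sh variant using Airflow's built-in log server instead of http.server.
airflow serve_logs &
dask-worker "$@"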

ICp 2.1.0.1: Installation failed with error TASK [master: Waiting for MariaDB service to start]

I am installing ICP 2.1.0.1 and received an error at the task [master: Waiting for MariaDB service to start]:
msg: The MariaDB component failed to start.
After this message, the installation completed with a failed status.
We are installing ICP with 3 masters, 3 proxies, and 2 workers. We have 1 IP for the master VIP and 1 for the proxy VIP.
I tried to install multiple times and every installation failed with the same error.
In prior cases of that error, the correct DB admin password was not used, so check the DB user and password first.
Would you also validate whether each master host can access port 3306 on the other hosts?
If you run with .. install -vv | tee -a install-log.txt, do you get additional details as well?
The error was solved by following the steps below.
Check whether kubelet is running:
Log in to your master node.
Run the following command to check kubelet status:
systemctl status kubelet
If kubelet is not running, run the following command to get the logs:
journalctl -u kubelet &> kubelet.log
We found this error in kubelet.log:
Error: failed to run Kubelet: Running with swap on is not supported, please disable swap! or set --fail-swap-on flag to false.
We found this troubleshooting guide at the first link below, and the solution in ICP issue 4651 (a sketch of the swap fix follows the links).
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_2.1.0/troubleshoot/etcd_fails.html
https://github.ibm.com/IBMPrivateCloud/roadmap/issues/4651
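
For completeness, a minimal sketch of the fix that kubelet error points at, run on each affected node (the exact /etc/fstab edit depends on how your swap entries are named):

sudo swapoff -a                              # disable swap immediately
sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab   # comment out swap mounts so the change survives reboot
sudo systemctl restart kubelet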
