Apache Atlas and Airflow Integration - airflow

I am trying to integrate an Apache Atlas instance I have running with Apache Airflow. Once I set up the connection in airflow.cfg I tried running a DAG from the Airflow scheduler. I get the following error in the log.
[2021-02-02 20:50:47,958] {connectionpool.py:752} WARNING - Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f464b856950>: Failed to establish a new connection: [Errno 111] Connection refused')': /api/atlas/v2/types/typedefs
[2021-02-02 20:50:47,960] {taskinstance.py:1150} ERROR - HTTPConnectionPool(host='localhost', port=21000): Max retries exceeded with url: /api/atlas/v2/types/typedefs (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f464b8650d0>: Failed to establish a new connection: [Errno 111] Connection refused'))
My airflow.cfg is configured as follows:
[lineage]
backend = airflow.lineage.backend.atlas.AtlasBackend
[atlas]
username = <username>
password = <password>
host = localhost
port = 21000
I have tried changing the host to http://localhost as well. I am not sure where to investigate in Atlas to identify why the connection is being refused.

Connection refused means that either no service is listening on the configured port or the hostname is wrong.
Try replacing localhost with the fully qualified domain name (FQDN).
A good way to get this right is to open the Atlas UI and copy the hostname from its URL into the config.
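To tell those two cases apart, it can help to probe the host and port from the machine where Airflow runs. Below is a minimal sketch using only the Python standard library; the helper name can_connect is hypothetical and not part of Airflow or Atlas.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the hostname and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False
```

With the values from the question, can_connect("localhost", 21000) returning False from the Airflow host would suggest that Atlas is not reachable under that name and port from there.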

I was able to solve the problem by adding the --hostname flag when starting the Docker container for Atlas. I then used the hostname I provided as the host in airflow.cfg.

Related

Can't access openstack dashboard

I installed OpenStack on Ubuntu and started it using the ./start.sh command. When I try to log in to the dashboard, I get the error below about not being able to connect to Neutron:
ConnectionFailed at /project/
Connection to neutron failed: HTTPConnectionPool(host='192.168.204.173', port=9696): Max retries exceeded with url: /networking/v2.0/extensions (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f7efde4f460>: Failed to establish a new connection: [Errno 111] Connection refused'))
How do I fix this?

I can't connect watcher to openstack

I have the following scenario:
server: Ubuntu 20.04.3 LTS
openstack installed with devstack
watcher 2.2.0
All services seem to be working and I can see the Watcher dashboard on localhost.
But I think I have an auth problem between Watcher and Keystone:
Error contacting Watcher server: Unable to establish connection
to http://x.x.x.x:9322/v1/services: HTTPConnectionPool(host='x.x.x.x', port=9322):
Max retries exceeded with url: /v1/services (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fba33b1be50>: Failed to establish a new connection: [Errno 111] Connexion refusée')).
Attempt 6 of 6
I'm testing this on a VM, and I used the same IP and the same password everywhere.
Where should I look first?

Airflow webserver not able to access remote worker logs

I have my Airflow Docker container running with Celery worker nodes, and the jobs are triggered correctly. However, the webserver UI cannot reach the worker to fetch the logs and shows:
*** Log file does not exist: /usr/local/airflow/airflow/logs/spark-submit/transform/2021-08-07T04:21:58.646836+00:00/1.log
*** Fetching from: http://localhost.localdomain:8793/log/spark-submit/transform/2021-08-07T04:21:58.646836+00:00/1.log
*** Failed to fetch log file from worker. [Errno 111] Connection refused
I'm trying to diagnose why this happens. Does it mean that the webserver is not able to resolve the worker's IP address correctly? How can I configure this worker IP mapping?
I've tried setting the hostname callable in airflow.cfg:
hostname_callable = airflow.utils.net.get_host_ip_address
but it doesn't help.
I'd appreciate any help. Thanks!
Version:
airflow==2.1.2
celery[redis]==4.4.2

Passing hostname to netty

Background: I've got two machines with identical hostnames, and I need to set up a local Spark cluster for testing. Setting up a master and a worker works fine, but running an application with the driver causes problems: netty doesn't seem to pick the correct host (regardless of what I put in there, it just picks the first host).
Identical hostname:
$ dig +short corehost
192.168.0.100
192.168.0.101
Spark config (used by master and the local worker):
export SPARK_LOCAL_DIRS=/some/dir
export SPARK_LOCAL_IP=corehost   # I tried various values, such as 192.168.0.x,
export SPARK_MASTER_IP=corehost  # for local, master and the driver
export SPARK_MASTER_PORT=7077
export SPARK_WORKER_CORES=2
export SPARK_WORKER_MEMORY=2g
export SPARK_WORKER_INSTANCES=2
export SPARK_WORKER_DIR=/some/dir
Spark starts up and I can see the worker in the web-ui.
When I run the Spark "job" below:
import org.apache.spark.{SparkConf, SparkContext}
val conf = new SparkConf().setAppName("AaA")
  // tried 192.168.0.x and localhost
  .setMaster("spark://corehost:7077")
val sc = new SparkContext(conf)
I get this exception:
15/04/02 12:34:04 INFO SparkContext: Running Spark version 1.3.0
15/04/02 12:34:04 WARN Utils: Your hostname, corehost resolves to a loopback address: 127.0.0.1; using 192.168.0.100 instead (on interface en1)
15/04/02 12:34:04 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
15/04/02 12:34:05 ERROR NettyTransport: failed to bind to corehost.home/192.168.0.101:0, shutting down Netty transport
...
Exception in thread "main" java.net.BindException: Failed to bind to: corehost.home/192.168.0.101:0: Service 'sparkDriver' failed after 16 retries!
at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)
at akka.remote.transport.netty.NettyTransport$$anonfun$listen$1.apply(NettyTransport.scala:393)
at akka.remote.transport.netty.NettyTransport$$anonfun$listen$1.apply(NettyTransport.scala:389)
at scala.util.Success$$anonfun$map$1.apply(Try.scala:206)
at scala.util.Try$.apply(Try.scala:161)
at scala.util.Success.map(Try.scala:206)
at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:235)
at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:235)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.processBatch$1(BatchingExecutor.scala:67)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:82)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:58)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
15/04/02 12:34:05 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/04/02 12:34:05 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/04/02 12:34:05 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
Process finished with exit code 1
Not sure how to proceed... it's a whole jungle of IP addresses.
Not sure if this is a netty issue either.
In my experience with the identical problem, it comes down to how things are set up locally. Try being more explicit in your Spark driver code by adding the SPARK_LOCAL_IP and the driver host IP to the config:
val conf = new SparkConf().setAppName("AaA")
.setMaster("spark://localhost:7077")
.set("spark.local.ip","192.168.1.100")
.set("spark.driver.host","192.168.1.100")
This should tell netty which of the two identical hosts to use.
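Before hard-coding an address, it may be worth listing every address the shared hostname resolves to, much like the dig output in the question. This is a small Python sketch independent of Spark; the helper name resolve_ipv4 is hypothetical, and it only reports what the local resolver returns.

```python
import socket
from typing import List


def resolve_ipv4(hostname: str) -> List[str]:
    """Return the distinct IPv4 addresses a hostname resolves to, in resolver order."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    addresses: List[str] = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]  # sockaddr is (address, port) for AF_INET
        if ip not in addresses:
            addresses.append(ip)
    return addresses
```

Calling resolve_ipv4("corehost") on the driver machine would show which candidates exist, so the one that belongs to that machine can be passed as spark.driver.host.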

What can cause a connection to a port on the same node to be refused?

Got an EndpointWriter error:
14/10/30 23:12:29 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@node001:35249] -> [akka.tcp://sparkExecutor@node001:7088]: Error [Association failed with [akka.tcp://sparkExecutor@node001:7088]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkExecutor@node001:7088]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: node001/10.69.144.56:7088
node001 and 10.69.144.56 are both the node itself. My understanding is that Akka was trying to connect to a local port but got rejected. The executor port was fixed to '7087'.
Thanks for your help!
The usual reason for connection refused is that nothing is listening on the port. If the executor is listening on port 7087 but Akka is trying to connect to port 7088, there is probably nothing listening there. Check your code or configuration to see whether 7088 was set instead of 7087.
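The point that a refused connection usually means nothing is listening can be reproduced locally. Here is a minimal Python sketch, assuming a Linux-like TCP stack; the function name refused_when_nothing_listens is made up for illustration. It opens a listener on a free port, connects once, closes the listener, and connects again.

```python
import socket


def refused_when_nothing_listens() -> bool:
    """Connect to a local port after its listener is closed; report whether it was refused."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    # While the listener is open, the TCP handshake completes.
    with socket.create_connection(("127.0.0.1", port), timeout=2):
        pass

    listener.close()  # from here on, nothing is listening on that port

    try:
        with socket.create_connection(("127.0.0.1", port), timeout=2):
            return False
    except ConnectionRefusedError:
        return True
```

The same port is reachable one moment and refused the next, with the listener being the only thing that changed, which matches the 7087 versus 7088 mismatch described above.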
