How to communicate with Kafka server running inside a docker - networking

I am using the Apache KafkaConsumer in my Scala app to talk to a Kafka server. The Kafka and ZooKeeper services are running in a Docker container on my VM (the Scala app is also running on this VM). I have set the KafkaConsumer's "bootstrap.servers" property to 127.0.0.1:9092.
The KafkaConsumer does log "Sending coordinator request for group queuemanager_testGroup to broker 127.0.0.1:9092". The problem appears to be that the Kafka client sets the coordinator from the response it receives, which contains responseBody={error_code=0,coordinator={node_id=0,host=e7059f0f6580,port=9092}}, and uses that host for future connections. It then complains that it is unable to resolve the address e7059f0f6580.
The address e7059f0f6580 is the container ID of that docker container.
I have confirmed with telnet that my VM does not resolve this as a hostname.
What setting do I need to change so that the Kafka broker in my Docker container returns localhost/127.0.0.1 as the host in its response? Or is there something else that I am missing or doing incorrectly?

Update
advertised.host.name is deprecated, and --override should be avoided.
Add or edit advertised.listeners so that it has the format
[PROTOCOL]://[EXTERNAL.HOST.NAME]:[PORT]
Also make sure that PORT is listed in the listeners property.
After investigating this problem for hours on end, I found that there is a way to set the hostname while starting up the Kafka server, as follows:
kafka-server-start.sh --override advertised.host.name=xxx (in my case: localhost)
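The rule from the update (every port in advertised.listeners must also appear in listeners) can be checked mechanically. The following is a minimal Python sketch, not part of Kafka: the function names and the sample property values are made up for illustration, and it only handles plain key=value lines.

```python
from urllib.parse import urlsplit


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def listener_ports(value):
    """Return the set of ports in a comma-separated listener list."""
    return {urlsplit(item.strip()).port for item in value.split(",") if item.strip()}


def advertised_ports_are_bound(props):
    """True if every advertised port also appears in listeners."""
    advertised = listener_ports(props.get("advertised.listeners", ""))
    bound = listener_ports(props.get("listeners", ""))
    return bool(advertised) and advertised <= bound


# Hypothetical server.properties content for a broker reached via localhost.
EXAMPLE = """
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://localhost:9092
"""

if __name__ == "__main__":
    print(advertised_ports_are_bound(parse_properties(EXAMPLE)))
```

With these sample values the advertised host is localhost, which a client on the same VM can resolve, unlike the container ID.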

Related

Unable to access Kafka Broker from separate LAN machine

EDIT: OBE - figured it out. Provided in answer for anyone else who has this issue.
I am working in an offline environment and am unable to connect to a kafka broker, on machine 1, from a separate machine, machine 2, on a LAN connection through a single switch.
Machine 1 (where Kafka and ZK are running):
server.properties
listeners=PLAINTEXT://<ethernet_IPv4_m1>:9092
advertised.listeners=PLAINTEXT://<ethernet_IPv4_m1>:9092
zookeeper.connect=localhost:2181
I am starting Kafka/ZK from the config files located in kafka_2.12-2.8.0/config and then running the appropriate .bat from kafka_2.12-2.8.0/bin/windows.
On machine 2 I am able to ping <ethernet_IPv4_m1> and get results; however, I fail to get a TCP connection if I run Test-NetConnection <ethernet_IPv4_m1> -p 9092 while Kafka is running. In Python 3.8.11, using KafkaConsumer from kafka-python, I receive the NoBrokersAvailable error when using <ethernet_IPv4_m1>:9092 as bootstrap_servers. Additionally, if I run a python:3.8.12-buster Docker container with a '/bin/bash' entrypoint and follow along with the kafka-listener walkthrough, I am unable to connect to the broker. I'm in exactly the situation of Scenario 1 in the link, but the walkthrough assumes you can connect to the broker.
I have also tried opening port 9092 in Windows Defender for inbound/outbound traffic (on both machines) and still have no luck. Neither Kafka nor networking is my strong suit, and every tutorial/answer I find refers to changing listeners and advertised.listeners in the Kafka server.properties file. I think I did this correctly, but am unsure. This is everything I have tried so far; any recommendations would be greatly appreciated. Thank you.
For M1, the private network was the active network.
Go to Control Panel -> Firewall & network protection -> Advanced settings (must be admin) and set up inbound/outbound rules for port 9092 for the active network.
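To confirm from machine 2 that the firewall rule took effect, a plain TCP connection test is enough; it checks the same thing as Test-NetConnection and does not involve Kafka at all. A minimal sketch, assuming Python is available on machine 2 (the function name can_connect is made up for this example):

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Usage (replace the placeholder with machine 1's Ethernet IPv4 address):
#   can_connect("<ethernet_IPv4_m1>", 9092)
```

If this returns False while ping succeeds, the port is most likely blocked or the broker is not listening on that interface.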

How to use Apache ActiveMQ Artemis in Kubernetes networking

I have set up a cluster within Kubernetes using JGroups, and the cluster appears to form correctly. Each node has a local IP and a public IP. When I connect to one of the nodes using the public IP all is fine, but the list of available nodes that is returned to the client (a WildFly instance) contains the local IPs of the nodes rather than their public ones. I have defined the connector with the public IP:
<connectors>
<connector name="netty-connector">tcp://{public ip}:61616</connector>
</connectors>
and then configured the broadcast as
<broadcast-groups>
<broadcast-group name="my-broadcast-group">
<broadcast-period>5000</broadcast-period>
<jgroups-file>jgroups-file_ping.xml</jgroups-file>
<jgroups-channel>activemq_broadcast_channel</jgroups-channel>
<connector-ref>netty-connector</connector-ref>
</broadcast-group>
</broadcast-groups>
and then configured the discovery as
<discovery-groups>
<discovery-group name="my-discovery-group">
<jgroups-file>jgroups-file_ping.xml</jgroups-file>
<jgroups-channel>activemq_broadcast_channel</jgroups-channel>
<refresh-timeout>10000</refresh-timeout>
</discovery-group>
</discovery-groups>
and finally the cluster as
<cluster-connections>
<cluster-connection name="my-cluster">
<connector-ref>netty-connector</connector-ref>
<retry-interval>500</retry-interval>
<use-duplicate-detection>true</use-duplicate-detection>
<message-load-balancing>STRICT</message-load-balancing>
<max-hops>1</max-hops>
<discovery-group-ref discovery-group-name="my-discovery-group"/>
</cluster-connection>
</cluster-connections>
Whenever I force a node to shut down, the client tries to reconnect but fails and reports the local IP of the node. I was under the impression that the connector defined in the broker was what gets broadcast to other members of the cluster, but it uses the local IP. Is that correct?
WildFly runs and sends and receives messages, but every few minutes I get the following log:
14:27:31,463 WARN [org.apache.activemq.artemis.service.extensions.xa.recovery] (Periodic Recovery) AMQ122015: Can not connect to XARecoveryConfig [transportConfiguration=[TransportConfiguration(name=, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?trustStorePassword=****&port=61616&sslEnabled=true&host=x-x-x-x&trustStorePath=client-ts], discoveryConfiguration=null, username=username, password=****, JNDI_NAME=java:/RemoteJmsXA] on auto-generated resource recovery: ActiveMQNotConnectedException[errorType=NOT_CONNECTED message=AMQ119007: Cannot connect to server(s). Tried with all available servers.]
at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.createSessionFactory(ServerLocatorImpl.java:797)
at org.apache.activemq.artemis.service.extensions.xa.recovery.ActiveMQXAResourceWrapper.connect(ActiveMQXAResourceWrapper.java:311)
at org.apache.activemq.artemis.service.extensions.xa.recovery.ActiveMQXAResourceWrapper.getDelegate(ActiveMQXAResourceWrapper.java:239)
at org.apache.activemq.artemis.service.extensions.xa.recovery.ActiveMQXAResourceWrapper.recover(ActiveMQXAResourceWrapper.java:69)
at org.apache.activemq.artemis.service.extensions.xa.ActiveMQXAResourceWrapperImpl.recover(ActiveMQXAResourceWrapperImpl.java:106)
at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.xaRecoveryFirstPass(XARecoveryModule.java:634)
at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.periodicWorkFirstPass(XARecoveryModule.java:226)
at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.periodicWorkFirstPass(XARecoveryModule.java:171)
at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.doWorkInternal(PeriodicRecovery.java:770)
at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.run(PeriodicRecovery.java:382)
This is the expected behavior as you are connecting through a load balancer. You can work around that by setting useTopologyForLoadBalancing=false and specifying servers explicitly in your connection URL.
When using WildFly, the connection factory or pooled connection factory must be configured with the attribute use-topology-for-load-balancing set to false. This is how to set this from the CLI (replace remote-artemis with your actual name):
/subsystem=messaging-activemq/pooled-connection-factory=remote-artemis:write-attribute(name=use-topology-for-load-balancing, value=false)
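For a client that builds its own connection URL instead of going through WildFly, "specifying servers explicitly" could look like the sketch below. It only assembles the URL string; the parenthesised server list follows the Artemis multi-server URL form as I understand it, and the host names are placeholders, so treat the exact shape as an assumption to verify against your client version.

```python
def artemis_url(servers, use_topology=False):
    """Build a multi-server URL with useTopologyForLoadBalancing set explicitly.

    servers is a list of (host, port) pairs that the client can actually reach,
    for example the public addresses of each broker.
    """
    hosts = ",".join(f"tcp://{host}:{port}" for host, port in servers)
    flag = "true" if use_topology else "false"
    return f"({hosts})?useTopologyForLoadBalancing={flag}"


if __name__ == "__main__":
    # Placeholder public host names, one per broker.
    print(artemis_url([("broker-0.example.com", 61616), ("broker-1.example.com", 61616)]))
```

With the topology disabled, the client keeps using the listed addresses rather than the local IPs announced by the cluster.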
Got it working eventually by creating a service per pod and putting the public IP in the connector definition for each node.

Corda CENM network map server starts failing to connect to the database after running for a few weeks

We operate CENM (1.2, deployed with the Helm templates on a k8s cluster) to run our own private network. After the CENM network map server has been running for a few weeks, launching new nodes starts to fail.
On further investigation, it appears that a request timeout for http://nmap:10000/network-map causes the problem.
In the nmap server's log, we found the following output when accessing the above URL with curl:
[NMServer] - Error while handling socket client message com.r3.enm.servicesapi.networkmap.handlers.LatestUnsignedNetworkParametersRetrievalMessage#760c53ea: HikariPool-1 - Connection is not available, request timed out after 30000ms.
netstat shows at least 3 established connections to the database from the container in which the network map server runs, and I can also connect to the database directly using the CLI.
So I don't think the cause is either database saturation or a network configuration problem.
Does anyone have an idea why this happens? I think a restart would probably solve the problem, but I want to know the root cause...
regards,
Please test the following options.
Since it is the HikariCP (connection pool) component that is throwing the error, it would be worth seeing whether increasing the pool size in the network map configuration helps (see below).
Corda uses Hikari Pool for creating the connection pool. To configure the connection pool any custom properties can be set in the dataSourceProperties section.
dataSourceProperties = {
dataSourceClassName = "org.postgresql.ds.PGSimpleDataSource"
...
maximumPoolSize = 10
connectionTimeout = 50000
}
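The symptom in the question (established database connections exist, yet the pool reports a timeout) is consistent with every pooled connection being checked out and none being returned, although the source does not confirm that this is the cause. The toy Python pool below is only an illustration of that mechanism; it is not HikariCP, and the class and method names are invented.

```python
import queue


class TinyPool:
    """A toy bounded pool: acquire() waits for a free connection or times out."""

    def __init__(self, size):
        self._free = queue.Queue()
        for i in range(size):
            self._free.put(f"conn-{i}")

    def acquire(self, timeout):
        try:
            return self._free.get(timeout=timeout)
        except queue.Empty:
            # Mirrors the shape of the pool's timeout message in the log above.
            raise TimeoutError(
                f"Connection is not available, request timed out after {int(timeout * 1000)}ms."
            ) from None

    def release(self, conn):
        self._free.put(conn)


if __name__ == "__main__":
    pool = TinyPool(size=3)
    held = [pool.acquire(timeout=0.1) for _ in range(3)]  # never released
    try:
        pool.acquire(timeout=0.1)
    except TimeoutError as exc:
        print(exc)
```

If connections are being held or leaked like this, a larger maximumPoolSize only delays the timeout, which is why the TRACE logging suggested below is worth enabling as well.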
Has a health check been conducted to verify that there are sufficient resources on that Postgres database, i.e. basic diagnostic checks?
Another option to get more information logged from the network map service is to run with TRACE logging also:
From https://docs.corda.net/docs/cenm/1.2/troubleshooting-common-issues.html
Enabling debug/trace logging
Each service can be configured to run with a deeper log level via command line flags passed at startup:
java -DdefaultLogLevel=TRACE -DconsoleLogLevel=TRACE -jar <enm-service-jar>.jar --config-fi

Can I create a PACT to run on a different hostname?

Can I create a PACT to run on a different hostname? I have been using pact rule and keeping the hostname as localhost. But now I'm trying to create a pact for an application that can not run on localhost.
@Rule
public PactProviderRule provider = new PactProviderRule("ServiceNowClientRestClientProvider", "localhost", 8080, this);
Is it possible to change localhost to something else? If so, are there additional configurations that I need? I've tried changing tests that work on localhost to the actual hostname that the code is using, but then they fail with various error messages: "Unresolved address", "Cannot assign requested address: bind", or "Address in use".
Ronald Holshausen responded with a good answer to my question. Full conversation is on Pact Google forum post:
The hostname is passed through to the HTTP server library to start an HTTP server to be the mock server. This server will be running on the same machine as the test (in fact will also be the same JVM process). The HTTP server library will use the hostname to resolve to an IP address, which will in turn resolve to a network interface on the machine which the port for the server will be bound to.
It is not as simple as a yes/no answer. It is possible to do (there are standalone mock servers you can run on another machine), but the PactProviderRule always starts a mock server on the same host as where the tests are running.
To achieve what you require, you would need to use one of the mock server implementations, and a new JUnit Rule would need to be implemented (preferably extended from PactProviderRule).
There are a number of standalone pact mock servers:
https://github.com/DiUS/pact-jvm/tree/master/pact-jvm-server
https://github.com/bethesque/pact-mock_service
https://github.com/pact-foundation/pact-reference/tree/master/rust/pact_mock_server_cli
The only valid values that can be used are: the hostname of the machine where the test is running, the IP address of the machine where the test is running, localhost, 127.0.0.1 or 0.0.0.0
If a standalone mock server is started on another machine (say from your example Hostname: test.service-now.com and Port: 80), then the PactProviderRule will need to know that it should not try to start a new mock server but communicate with the one it has been provided with (via the address https://test.service-now.com).
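The restriction on valid hostnames comes from how a server socket is bound: the name must resolve to an address that belongs to a local network interface. The sketch below reproduces that step outside of Pact with plain Python sockets (the helper name can_bind is made up; it is not part of any Pact library).

```python
import socket


def can_bind(hostname, port=0):
    """Try to bind a TCP server socket to hostname; return (ok, error message).

    Port 0 asks the OS for any free port, so only the address is being tested.
    """
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.bind((hostname, port))
        return True, ""
    except OSError as exc:  # includes socket.gaierror for unresolvable names
        return False, str(exc)


if __name__ == "__main__":
    for name in ("localhost", "127.0.0.1", "0.0.0.0"):
        print(name, can_bind(name))
```

Passing a hostname that resolves to another machine fails at the bind step, which matches the "Cannot assign requested address: bind" error in the question, and a name that does not resolve at all corresponds to "Unresolved address".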
You can in the ruby version using pact-provider-proxy. However, the best use case for consumer driven contracts is when you have development control over both the consumer and the provider, and this generally means that you can stand up an instance of the provider locally. If you are trying to test a public API, or an API you don't have development control over, pact may not be the best tool for you. You can read more here about what pact is not good for.

docker registry on localhost with nginx proxy_pass

I'm trying to set up a private Docker registry to upload my stuff, but I'm stuck. The docker-registry instance is running on port 5000, and I've set up nginx in front of it with a proxy_pass directive to pass requests on port 80 back to localhost:5000.
When I try to push my image I get this error:
Failed to upload metadata: Put http://localhost:5000/v1/images/long_image_id/json: dial tcp localhost:5000: connection refused
If I replace localhost with my server's IP address in the nginx configuration file, I can push all right. Why would my local docker push command complain about localhost when localhost is only used in nginx's proxy pass?
Server is on EC2 if it helps.
I'm not sure the specifics of your traffic, but I spent a lot of time using mitmproxy to inspect the dataflows for Docker. The Docker registry is actually split into two parts, the index and the registry. The client contacts the index to handle metadata, and then is forwarded on to a separate registry to get the actual binary data.
The Docker self-hosted registry comes with its own watered down index server. As a consequence, you might want to figure out what registry server is being passed back as a response header to your index requests, and whether that works with your config. You may have to set up the registry_endpoints config setting in order to get everything to play nicely together.
In order to solve this and other problems for everyone, we decided to build a hosted docker registry called Quay that supports private repositories. You can use our service to store your private images and deploy them to your hosts.
Hope this helps!
Override X-Docker-Endpoints header set by registry with:
proxy_hide_header X-Docker-Endpoints;
add_header X-Docker-Endpoints $http_host;
I think the problem you face is that the docker-registry is advertising so-called endpoints through a X-Docker-Endpoints header early during the dialog between itself and the Docker client, and that the Docker client will then use those endpoints for subsequent requests.
You have a setup where your Docker client first communicates with Nginx on the (public) port 80, then switches to the advertised endpoints, which are probably localhost:5000 (that is, your local machine).
You should see if an option exists in the Docker registry you run so that it advertises endpoints as your remote host, even if it listens on localhost:5000.
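To make the effect of the two nginx directives above concrete, here is a small Python model of what the client ends up seeing. It is only an illustration of the header rewrite, not nginx or Docker code, and the host name registry.example.com is a placeholder.

```python
def rewrite_endpoints(upstream_headers, request_host):
    """Model proxy_hide_header + add_header for X-Docker-Endpoints.

    The registry's own value is dropped and replaced with the Host that the
    client used to reach the proxy.
    """
    headers = {
        name: value
        for name, value in upstream_headers.items()
        if name.lower() != "x-docker-endpoints"
    }
    headers["X-Docker-Endpoints"] = request_host
    return headers


if __name__ == "__main__":
    from_registry = {"X-Docker-Endpoints": "localhost:5000"}
    print(rewrite_endpoints(from_registry, "registry.example.com"))
```

After the rewrite the advertised endpoint is the public host, so follow-up requests from the Docker client go back through nginx instead of to the client's own localhost:5000.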

Resources