spring-kafka can't work with a Kafka cluster

I've configured a 3-broker Kafka cluster and I'm trying to use it with spring-kafka, but when I kill one broker I'm no longer able to send messages to the topic.
Kafka version: 2.0.0
spring-kafka version: 2.0.1
kafka-topics.sh --describe --zookeeper=zoo1:2181 prints:
KAFKA_SWARM_TEST PartitionCount:1 ReplicationFactor:2 Configs:
Topic: KAFKA_SWARM_TEST Partition: 0 Leader: 2 Replicas: 1,2 Isr: 2,1
spring-kafka config:
spring.kafka.bootstrap-servers="kafka2:9094,kafka1:9093"
The leader is kafka2. When I kill kafka1, the leader is still kafka2, but spring-kafka throws:
Connection to node 1 could not be established. Broker may not be available.
Discovered group coordinator kafka1:9093
It looks like the spring-kafka client only ever connects to kafka1.
My Java code:
@GetMapping(path = "/send", produces = MediaType.APPLICATION_JSON_VALUE)
public JsonNode send() throws JsonProcessingException {
    ObjectNode put = JsonNodeFactory.instance.objectNode().put("status", "success");
    String topic = "KAFKA_SWARM_TEST";
    val msg = MessageBuilder
            .withPayload(objectMapper.writeValueAsString(put))
            .setHeader(KafkaHeaders.TOPIC, topic)
            .build();
    kafkaTemplate.send(msg);
    return put;
}

@Bean
public NewTopic topic() {
    return new NewTopic("KAFKA_SWARM_TEST", 1, (short) 2);
}

@KafkaListener(groupId = "#{T(java.util.UUID).randomUUID().toString()}", topics = "KAFKA_SWARM_TEST")
void testGetInfo(String message) throws IOException {
    log.error("getMessage: =====> " + message);
}
Kafka docker-compose config:
version: '3.7'
services:
  zoo1:
    image: wurstmeister/zookeeper
    restart: always
    ports:
      - 2181:2181
    environment:
      ZOO_MY_ID: 1
      ZOO_SERVERS: server.1=zoo1:2888:3888 server.2=zoo2:2888:3888
  zoo2:
    image: wurstmeister/zookeeper
    restart: always
    ports:
      - 2180:2181
    environment:
      ZOO_MY_ID: 2
      ZOO_SERVERS: server.1=zoo1:2888:3888 server.2=zoo2:2888:3888
  kafka1:
    image: wurstmeister/kafka
    restart: always
    ports:
      - "9093:9093"
    depends_on:
      - zoo1
      - zoo2
    privileged: true
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ADVERTISED_HOST_NAME: $KAFKA_ADVERTISED_HOST_NAME
      KAFKA_ZOOKEEPER_CONNECT: zoo1:2181,zoo2:2181
      KAFKA_LOG_DIRS: /kafka
      KAFKA_SSL_KEYSTORE_LOCATION: /kafka_broker_cert/server.keystore.jks
      KAFKA_SSL_KEYSTORE_PASSWORD: ksstone430
      KAFKA_SSL_KEY_PASSWORD: ksstone430
      KAFKA_SSL_TRUSTSTORE_LOCATION: /kafka_broker_cert/server.truststore.jks
      KAFKA_SSL_TRUSTSTORE_PASSWORD: stsstone430
      KAFKA_LISTENERS: "PLAINTEXT://:9092,SSL://:9093"
      KAFKA_ADVERTISED_LISTENERS: "PLAINTEXT://:9092,SSL://$KAFKA_ADVERTISED_HOST_NAME:9093"
      KAFKA_SSL_CLIENT_AUTH: required
      LEADER_IMBALANCE_CHECK_INTERVAL_SECONDS: 60
      KAFKA_SSL_ENDPOINT_IDENTIFICATION_ALGORITHM: "null"
    volumes:
      - ./kafka_broker_cert:/kafka_broker_cert
      - /var/run/docker.sock:/var/run/docker.sock
  kafka2:
    image: wurstmeister/kafka
    restart: always
    ports:
      - "9094:9093"
    depends_on:
      - zoo1
      - zoo2
    environment:
      KAFKA_BROKER_ID: 2
      KAFKA_ADVERTISED_HOST_NAME: $KAFKA_ADVERTISED_HOST_NAME
      KAFKA_ZOOKEEPER_CONNECT: zoo1:2181,zoo2:2181
      KAFKA_LOG_DIRS: /kafka
      KAFKA_SSL_KEYSTORE_LOCATION: /kafka_broker_cert/server.keystore.jks
      KAFKA_SSL_KEYSTORE_PASSWORD: ksstone430
      KAFKA_SSL_KEY_PASSWORD: ksstone430
      KAFKA_SSL_TRUSTSTORE_LOCATION: /kafka_broker_cert/server.truststore.jks
      KAFKA_SSL_TRUSTSTORE_PASSWORD: stsstone430
      KAFKA_LISTENERS: "PLAINTEXT://:9092,SSL://:9093"
      KAFKA_ADVERTISED_LISTENERS: "PLAINTEXT://:9092,SSL://$KAFKA_ADVERTISED_HOST_NAME:9093"
      KAFKA_SSL_CLIENT_AUTH: required
      LEADER_IMBALANCE_CHECK_INTERVAL_SECONDS: 60
      KAFKA_SSL_ENDPOINT_IDENTIFICATION_ALGORITHM: "null"
    volumes:
      - ./kafka_broker_cert:/kafka_broker_cert
      - /var/run/docker.sock:/var/run/docker.sock

Try checking whether the Kafka cluster's leader election still works when you kill one of the nodes (e.g. kafka1).
Also check whether any other configuration overrides spring.kafka.bootstrap-servers. There could be a bean that points to kafka1:9093 as the only broker.
However, even if the bootstrap-servers property points only to kafka1:9093, the client should discover the other nodes of the cluster from the initial metadata and keep working when nodes change.
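For reference, a minimal application.properties sketch (assuming Spring Boot auto-configuration, and that no custom producer/consumer factory bean overrides it) that lists both brokers, so the client can still bootstrap when either one is down:
# List every broker; the client only needs one reachable entry to bootstrap,
# after which it discovers the rest of the cluster from broker metadata.
spring.kafka.bootstrap-servers=kafka1:9093,kafka2:9094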

Related

Getting container host metrics when running Telegraf inside Docker

I've got a Docker Compose service with a bunch of containers, and I am attempting to collect both the Docker container metrics from these containers and the container host metrics from the Ubuntu server the containers run on. I'm getting the Docker container stats, but I am not getting the Ubuntu container host metrics: the stats from the non-Docker input plugins (inputs.diskio, inputs.mem, etc.) are from the telegraf container itself.
I found this thread and opened up the volumes, but still nothing: https://community.influxdata.com/t/how-can-we-collect-host-machine-metrics-while-telegraf-is-running-in-docker-container/12005
Here is my compose file:
version: "3"
services:
telegraf:
image: telegraf:1.20.3
volumes:
- ./telegraf.conf:/etc/telegraf/telegraf.conf:ro
- /var/run/docker.sock:/var/run/docker.sock:ro
- /sys:/rootfs/sys:ro
- /proc:/rootfs/proc:ro
- /etc:/rootfs/etc:ro
environment:
HOST_PROC: /rootfs/proc
HOST_SYS: /rootfs/sys
HOST_ETC: /rootfs/etc
vote:
build: ./vote
# use python rather than gunicorn for local dev
command: python app.py
depends_on:
- redis
volumes:
- ./vote:/app
ports:
- "5000:80"
networks:
- front-tier
- back-tier
result:
build: ./result
# use nodemon rather than node for local dev
command: nodemon server.js
depends_on:
- db
volumes:
- ./result:/app
ports:
- "5001:80"
- "5858:5858"
networks:
- front-tier
- back-tier
worker:
build:
context: ./worker
depends_on:
- redis
- db
networks:
- back-tier
redis:
image: redis:5.0-alpine3.10
volumes:
- "./healthchecks:/healthchecks"
healthcheck:
test: /healthchecks/redis.sh
interval: "5s"
ports: ["6379"]
networks:
- back-tier
db:
image: postgres:9.4
environment:
POSTGRES_USER: "postgres"
POSTGRES_PASSWORD: "postgres"
volumes:
- "db-data:/var/lib/postgresql/data"
- "./healthchecks:/healthchecks"
healthcheck:
test: /healthchecks/postgres.sh
interval: "5s"
networks:
- back-tier
volumes:
db-data:
networks:
front-tier:
back-tier:
Here is the agent config:
[agent]
  interval = "10s"

[[inputs.mem]]

[[inputs.disk]]
  ## Ignore mount points by filesystem type.
  ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]

[[inputs.diskio]]
[[inputs.kernel]]
[[inputs.ethtool]]

[[inputs.procstat]]
  pattern = ".*"

[[inputs.docker]]
  endpoint = "unix:///var/run/docker.sock"
  gather_services = false
  container_names = []
  source_tag = true
  container_name_include = []
  container_name_exclude = []
  timeout = "5s"
  perdevice = true
  docker_label_include = []
  docker_label_exclude = []

[[inputs.cpu]]
  percpu = true
  totalcpu = true
  collect_cpu_time = true
  report_active = true
How do I get the container host metrics?
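For what it's worth, the approach from the linked InfluxData thread (a sketch, not verified here) is to mount the host's root filesystem under a prefix and set HOST_MOUNT_PREFIX in addition to HOST_PROC/HOST_SYS/HOST_ETC, so the disk and diskio plugins read the host's mounts rather than the container's:
telegraf:
  image: telegraf:1.20.3
  volumes:
    - ./telegraf.conf:/etc/telegraf/telegraf.conf:ro
    - /var/run/docker.sock:/var/run/docker.sock:ro
    - /:/rootfs:ro # whole host filesystem, read-only
  environment:
    HOST_MOUNT_PREFIX: /rootfs # strip this prefix from reported host paths
    HOST_PROC: /rootfs/proc
    HOST_SYS: /rootfs/sys
    HOST_ETC: /rootfs/etc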

Couldn't find message bus pubsub.jetstream/v1 Dapr

I'm trying to connect Dapr to NATS with JetStream enabled, and I want to start everything with docker-compose. The NATS service starts, and when I run the NATS CLI command nats -s "nats://localhost:4222" server check jetstream, I get OK JetStream | memory=0B memory_pct=0%;75;90 storage=0B storage_pct=0%;75;90 streams=0 streams_pct=0% consumers=0 consumers_pct=0%, indicating NATS with JetStream is working.
Unfortunately, Dapr first logs a warning and then an error:
warning: error creating pub sub %!s(*string=0xc0000ca020) (pubsub.jetstream/v1): couldn't find message bus pubsub.jetstream/v1" app_id=conversation-api1 instance=50b51af8e9a8 scope=dapr.runtime type=log ver=1.3.0
error: process component conversation-pubsub error: couldn't find message bus pubsub.jetstream/v1" app_id=conversation-api1 instance=50b51af8e9a8 scope=dapr.runtime type=log ver=1.3.0
I followed the instructions on the official site.
docker-compose.yaml
version: '3.4'
services:
  conversation-api1:
    image: ${DOCKER_REGISTRY-}conversationapi1
    build:
      context: .
      dockerfile: Conversation.Api1/Dockerfile
    ports:
      - "5010:80"
  conversation-api1-dapr:
    container_name: conversation-api1-dapr
    image: "daprio/daprd:latest"
    command: [ "./daprd", "--log-level", "debug", "-app-id", "conversation-api1", "-app-port", "80", "--components-path", "/components", "-config", "/configuration/conversation-config.yaml" ]
    volumes:
      - "./dapr/components/:/components"
      - "./dapr/configuration/:/configuration"
    depends_on:
      - conversation-api1
      - redis
      - nats
    network_mode: "service:conversation-api1"
  nats:
    container_name: "Nats"
    image: nats
    command: [ "-js", "-m", "8222" ]
    ports:
      - "4222:4222"
      - "8222:8222"
      - "6222:6222"
  # OTHER SERVICES...
conversation-pubsub.yaml
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: conversation-pubsub
  namespace: default
spec:
  type: pubsub.jetstream
  version: v1
  metadata:
    - name: natsURL
      value: "nats://host.docker.internal:4222" # already tried with nats for host
    - name: name
      value: "conversation"
    - name: durableName
      value: "conversation-durable"
    - name: queueGroupName
      value: "conversation-group"
    - name: startSequence
      value: 1
    - name: startTime # in Unix format
      value: 1630349391
    - name: deliverAll
      value: false
    - name: flowControl
      value: false
conversation-config.yaml
apiVersion: dapr.io/v1alpha1
kind: Configuration
metadata:
  name: config
  namespace: default
spec:
  tracing:
    samplingRate: "1"
    zipkin:
      endpointAddress: "http://zipkin:9411/api/v2/spans"
The problem was the old Dapr version. I was using 1.3.0, and JetStream support was introduced in 1.4.0. Pulling the latest daprio/daprd image fixed my problem. There is also no need for nats://host.docker.internal:4222; nats://nats:4222 works as expected.
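For example, pinning the sidecar to a JetStream-capable release in the compose file (1.4.0 shown as the minimum; any later tag should also work):
conversation-api1-dapr:
  container_name: conversation-api1-dapr
  image: "daprio/daprd:1.4.0" # JetStream pub/sub requires Dapr 1.4.0+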

fluent-bit to Loki - "log" field is not being parsed/filtered

I have:
a simple Python app ("iss-web") writing JSON log output to stdout
the Python app ("iss-web") is within a Docker Container
the Python app ("iss-web") Container logging driver is set to "fluentd"
a separate Container running "fluent/fluent-bit:1.7" to collect the Python app JSON log output
Loki 2.2.1 deployed via a Container to receive the Python app log output from fluent-bit
Grafana connected to Loki to visualize the log data
The issue is that the "log" field is not parsed by the fluent-bit filter, so in Loki/Grafana the content of the "log" field is never broken out into "Detected fields".
"iss-web" docker-compose.yml
version: '3'
services:
  iss-web:
    build: ./iss-web
    image: iss-web
    container_name: iss-web
    env_file:
      - ./iss-web/app.env
    ports:
      - 46664:46664
    logging:
      driver: fluentd
      options:
        tag: iss.web
  redis:
    image: redis
    container_name: redis
    ports:
      - 6379:6379
    logging:
      driver: "json-file"
      options:
        max-file: ${LOG_EXPIRE}
        max-size: ${LOG_SEGMENT}
"fluent-bit" docker-compose.yml
version: '3'
services:
  fluent-bit:
    image: fluent/fluent-bit:1.7
    container_name: fluent-bit
    environment:
      - LOKI_URL=http://135.86.186.75:3100/loki/api/v1/push
    user: root
    volumes:
      - ./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf
      - ./parsers.conf:/fluent-bit/etc/parsers.conf
    ports:
      - "24224:24224"
      - "24224:24224/udp"
fluent-bit.conf
[SERVICE]
    Flush        1
    Daemon       Off
    log_level    debug
    Parsers_File /fluent-bit/etc/parsers.conf

[INPUT]
    Name   forward
    Listen 0.0.0.0
    port   24224

[FILTER]
    Name         parser
    Match        iss.web
    Key_Name     log
    Parser       docker
    Reserve_Data On
    Preserve_Key On

[OUTPUT]
    Name   loki
    Match  *
    host   135.86.186.75
    port   3100
    labels job=fluentbit

[OUTPUT]
    Name  stdout
    Match *
parsers.conf
I've tried with/without Time_Key, Time_Format, Time_Keep
[PARSER]
    Name   docker
    Format json
    #Time_Key    time
    #Time_Format %Y-%m-%dT%H:%M:%S.%L
    #Time_Keep   On
    # Command | Decoder | Field | Optional Action
    # =============|==================|=================
    #Decode_Field_As escaped_utf8 log do_next
    Decode_Field_As json log
fluent-bit log extract
[0] iss.web: [1620640820.000000000, {"log"=>"{'timestamp': '2021:05:10 11:00:20.439513', 'epoch': 1620640820.4395688, 'pid': 1, 'level': 'DEBUG', 'message': '/ping', 'data': {'message': 'PONG', 'timestamp': '1620640820.4394963', 'version': '0.1'}}", "container_id"=>"bffd720e9ac1e8c3992c1120eed37e00c536cd44ec99e9c13cf690d840363f80", "container_name"=>"/iss-web", "source"=>"stdout"}]
Grafana/Loki screenshot (not shown): I would expect "Detected fields" to contain pid=1, message=/ping, etc.
The log extract above shows the "log" value as a single-quoted Python dict repr, which is not valid JSON, so the docker parser's Decode_Field_As json cannot decode it. I needed a json.dumps in my "logger":
import json

def log(message, level="INFO", **extra):
    # get_now/get_epoch/get_pid are the app's existing helpers
    out = {"timestamp": get_now(), "epoch": get_epoch(), "pid": get_pid(), "level": level, "message": message}
    if extra:
        out |= extra  # merge any extra fields (dict union, Python 3.9+)
    print(json.dumps(out), flush=True)  # emit real JSON, not a dict repr
    return True
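With that change the line written to stdout is valid JSON with double quotes, e.g. (a hypothetical record) {"timestamp": "2021:05:10 11:00:20.439513", "pid": 1, "level": "DEBUG", "message": "/ping"}, rather than the single-quoted Python repr seen in the extract above, so the docker parser can decode the "log" field and Loki/Grafana can populate "Detected fields".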

Traefik not trusting ssl certificate

I have had success instantiating a Traefik container as well as 4 other nginx containers that serve applications, with each of my subdomains routed to its own service. The routing works, and I am using [acme] for certificate generation, but every time I go to any of my subdomains Chrome still gives me an error saying "this connection isn't trusted", and I have to hit Advanced and proceed. The individual applications load fine, but there's something wrong with the certificates.
I have tried clearing the acme.json file to no avail. I also played around with enabling OnDemand in traefik.toml, but that didn't work either.
Please help?
traefik.toml
# defaultEntryPoints must be at the top
# because it should not be in any table below
defaultEntryPoints = ["http", "https"]

# Entrypoints, http and https
[entryPoints]
  # http should be redirected to https
  [entryPoints.http]
    address = ":80"
    [entryPoints.http.redirect]
      entryPoint = "https"
  # https is the default
  [entryPoints.https]
    address = ":443"
    [entryPoints.https.tls]

# Enable ACME (Let's Encrypt): automatic SSL
[acme]
  email = "chris@myubercode.com"
  storage = "./acme.json"
  entryPoint = "https"
  OnHostRule = true
  acmeLogging = true
  caServer = "https://acme-v02.api.letsencrypt.org/directory"
  [acme.httpChallenge]
    entryPoint = "http"
  [acme.dnsChallenge]
    provider = "digitalocean"
    delayBeforeCheck = 0
  [[acme.domains]]
    main = "cswilson.site"
    sans = ["profile.cswilson.site", "ecommerce.cswilson.site", "fitness.cswilson.site", "biosite.cswilson.site"]

traefikLogsFile = "/tmp/traefik.log"
logLevel = "DEBUG"

[accessLog]
  filePath = "/tmp/access.log"

[docker]
  endpoint = "unix:///var/run/docker.sock"
  domain = "cswilson.site"
  watch = true
  exposedbydefault = false
docker-compose.yml (for the traefik container):
version: '3'
services:
  traefik:
    image: traefik
    command: --docker
    ports:
      - "80:80"
      - "443:443"
    restart: always
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock"
      - "./traefik.toml:/traefik.toml"
      - "./acme.json:/acme.json"
    networks:
      - default
And here is the docker-compose.yml for the 4 different application containers:
version: '3'
services:
  profile:
    build: .
    image: nginx
    labels:
      - "traefik.enabled=true"
      - "traefik.backend=profile"
      - "traefik.frontend.rule=Host:profile.cswilson.site"
      - "traefik.frontend.entryPoinst=http,https"
    restart: always
    networks:
      - "traefik_default"
  fitness:
    build: .
    image: nginx
    labels:
      - "traefik.enabled=true"
      - "traefik.backend=fitness"
      - "traefik.frontend.rule=Host:fitness.cswilson.site"
      - "traefik.frontend.entryPoinst=http,https"
    restart: always
    networks:
      - "traefik_default"
  ecommerce:
    build: .
    image: nginx
    labels:
      - "traefik.enabled=true"
      - "traefik.backend=ecommerce"
      - "traefik.frontend.rule=Host:ecommerce.cswilson.site"
      - "traefik.port=80"
    restart: always
    networks:
      - "traefik_default"
  biosite:
    build: .
    image: nginx
    labels:
      - "traefik.enabled=true"
      - "traefik.backend=ecommerce"
      - "traefik.frontend.rule=Host:biosite.cswilson.site"
      - "traefik.port=80"
    restart: always
    networks:
      - "traefik_default"
networks:
  traefik_default:
    external:
      name: traefik_default
I am new to Docker and just found Traefik this morning, and I don't really know if I need some sort of real certificate to put into
[[entryPoints.https.tls.certificates]]
Any help is greatly appreciated, thank you.

How to resolve service names in a Docker Swarm?

I am playing around with Docker and stuff, using this docker-compose.yml:
version: '3.4'
services:
  frontend:
    image: apmimg:latest
    networks:
      - core-infra
    ports:
      - 8080:80
    deploy:
      replicas: 2
      update_config:
        parallelism: 2
        delay: 10s
      restart_policy:
        condition: on-failure
  backend:
    image: productsapi:latest
    volumes:
      - myvol:/opt/myvol
    networks:
      - core-infra
    deploy:
      replicas: 2
      update_config:
        parallelism: 2
        delay: 10s
      restart_policy:
        condition: on-failure
networks:
  core-infra:
    driver: overlay
volumes:
  myvol:
    driver: local
When I ssh into frontend and ping backend (ping mysite_backend), it works.
But when I try to make an HTTP request from my Node.js code:
private _productUrl = "http://mysite_backend/api/products";

getProducts(): Observable<IProduct[]> {
    let url = this._productUrl;
    return this._http.get<IProduct[]>(url)
        .do(data => console.log('All: ' + JSON.stringify(data)))
        .catch(this.handleError);
}
I get a "Failed to load resource: net::ERR_NAME_NOT_RESOLVED", even in the same host.
Any ideas on what's wrong?
