Kubernetes: Using UDP broadcast to find other pods - networking

I have a clustered legacy application that I am trying to deploy on kubernetes. The nodes in the cluster find each other using UDP broadcast. I cannot change this behaviour for various reasons.
When deployed on Docker, this would be done by creating a shared network (i.e. docker network create --internal mynet, leading to a subnet e.g. 172.18.0.0/16) and connecting the containers holding the clustered nodes to that network (docker network connect mynet instance1 and docker network connect mynet instance2). Each starting instance would then periodically broadcast its IP address on this network using 172.18.255.255 until the instances had formed a cluster. Multiple such clusters could reside in the same Kubernetes namespace, so preferably I would like to create my own "private network" just for these pods to avoid port collisions.
Is there a way of creating such a network on kubernetes, or otherwise trick the application into believing it is connected to such a network (assuming the IP addresses of the other nodes are known)? The kubernetes cluster I am running on uses Calico.

Maybe you can set a label on your app pods and try a NetworkPolicy with Calico.
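As a rough sketch of that idea (the label app: legacy-cluster and the UDP port 9999 are hypothetical, not taken from the question), a NetworkPolicy that only admits traffic from pods carrying the same label could look like this:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: legacy-cluster-only
  namespace: default
spec:
  # Applies to the pods of one application cluster (hypothetical label)
  podSelector:
    matchLabels:
      app: legacy-cluster
  policyTypes:
    - Ingress
  ingress:
    # Only accept traffic from pods with the same label, e.g. on the UDP
    # port the instances use to announce themselves (hypothetical port)
    - from:
        - podSelector:
            matchLabels:
              app: legacy-cluster
      ports:
        - protocol: UDP
          port: 9999

Note that this only scopes which pods may talk to each other; it does not by itself give the pods a shared broadcast address.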

Related

Rancher: unable to connect to a Cluster IP from another project

I'm using Rancher 2.3.5 on CentOS 7.6, and in my cluster "Project Network Isolation" is enabled.
I have 2 projects:
In project 1, I have deployed an Apache container that listens on port 80 on a cluster IP.
(image: network isolation configuration)
From the second project, I am unable to connect to the project 1 cluster IP.
Does Project Network Isolation also block the traffic to the cluster IP between the two projects?
Thank you
Other answers have correctly pointed out how a ClusterIP works from the standpoint of Kubernetes alone; however, the OP specifies that Rancher is involved.
Rancher provides the concept of a "Project", which is a collection of Kubernetes namespaces. When Rancher is set to enable "Project Network Isolation", Rancher manages Kubernetes Network Policies such that namespaces in different Projects cannot exchange traffic on the cluster overlay network.
This creates the situation observed by the OP. When "Project Network Isolation" is enabled, a ClusterIP in one Project is unable to exchange traffic with a traffic source in a different Project, even though they are in the same Kubernetes Cluster.
There is a brief note about this in the Rancher documentation:
https://rancher.com/docs/rancher/v2.x/en/cluster-provisioning/rke-clusters/options/
and while that document seems to limit the scope to Pods, because Pods and ClusterIPs are allocated from the same network it applies to ClusterIPs as well.
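To make that concrete, the policies Rancher manages are ordinary Kubernetes NetworkPolicies. An allow rule of roughly the following shape (the namespace name and the projectId label value are only illustrative, not Rancher's exact generated objects) is what permits traffic inside a Project while namespaces belonging to other Projects are denied:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: project-isolation-example
  namespace: project1-namespace        # hypothetical namespace in Project 1
spec:
  podSelector: {}                      # applies to every pod in the namespace
  policyTypes:
    - Ingress
  ingress:
    # Only accept traffic from namespaces labelled as belonging to the
    # same Project (label value is illustrative)
    - from:
        - namespaceSelector:
            matchLabels:
              field.cattle.io/projectId: p-12345

Traffic arriving via the ClusterIP is still delivered to a pod in the target namespace, so a client in a namespace of another Project fails the namespaceSelector and is dropped, which is exactly the symptom described in the question.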
Kubernetes Cluster IPs are restricted to communication within the cluster. A good read on ClusterIP, NodePort, and LoadBalancer services can be found at https://www.edureka.co/community/19351/clusterip-nodeport-loadbalancer-different-from-each-other .
If your intention is to make services in two different clusters communicate, then consider the methods below:
Deploy an overlay network for your node group
Cluster peering
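For reference, the ClusterIP/NodePort distinction mentioned above is visible directly in the Service manifest. A minimal sketch (the service names, selector, and node port are made up):

apiVersion: v1
kind: Service
metadata:
  name: apache-internal
spec:
  type: ClusterIP        # default; reachable only from inside the cluster
  selector:
    app: apache
  ports:
    - port: 80
      targetPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: apache-nodeport
spec:
  type: NodePort         # additionally exposed on every node's IP
  selector:
    app: apache
  ports:
    - port: 80
      targetPort: 80
      nodePort: 30080    # must fall inside the cluster's NodePort range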

Does GKE use an overlay network?

GKE uses the kubenet network plugin for setting up container interfaces and configures routes in the VPC so that containers can reach each other on different hosts.
Wikipedia defines an overlay as a computer network that is built on top of another network.
Should GKE's network model be considered an overlay network? It is built on top of another network in the sense that it relies on the connectivity between the nodes in the cluster to function properly, but the Pod IPs are natively routable within the VPC, as the routes tell the network which node to go to in order to find a particular Pod.
Both VPC-native and non-VPC-native GKE clusters use GCP virtual networking. It is not strictly an overlay network by definition; an overlay network would be one that is isolated to just the GKE cluster.
VPC-native clusters work like this:
Each node VM is given a primary internal address and two alias IP ranges. One alias IP range is for pods and the other is for services.
The GCP subnet used by the cluster must have at least two secondary IP ranges (one for the pod alias IP range on the node VMs and the other for the services alias IP range on the node VMs).
Non-VPC-native clusters:
GCP creates custom static routes whose destinations match the pod IP space and the services IP space. The next hops of these routes are node VMs (referenced by name), so there is instance-based routing that happens as a "next step" within each VM.
I could see where some might consider this to be an overlay network. I don't believe that is the best definition, because the pod and service IPs are addressable from other VMs in the network, outside of the GKE cluster.
For a deeper dive on GCP’s network infrastructure, GCP’s network virtualization whitepaper can be found here.

Please explain Kubernetes external addresses vs internal addresses

In a VMware environment, should the external address become populated with the VM's (or host's) IP address?
I have three clusters, and have found that only those using a "cloud provider" have external addresses when I run kubectl get nodes -o wide. It is my understanding that the "cloud provider" plugin (GCP, AWS, VMware, etc.) is what assigns the public IP address to the node.
KOPS deployed to GCP = the external addresses are the real public IP addresses of the nodes.
Kubeadm deployed to VMware, using the VMware cloud provider = the external address is the same as the internal address (a private range).
Kubeadm deployed, NO cloud provider = no external IP.
I ask because I have a tool that scrapes /api/v1/nodes and then interacts with each host that it finds, using the "external ip". This only works with my first two clusters.
Should my tool, which runs on the local network of the clusters, be targeting the "internal ip" instead? In other words, is the internal IP ALWAYS the IP address of the VM or physical host (when installed on bare metal)?
Thank you
Bare metal will not have an "external-IP" for the nodes, and the "internal-ip" will be the IP address of the nodes. You are running your command from inside the same network as your local cluster, so you should be able to use this internal IP address to access the nodes as required.
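For example, on a bare-metal node the address list your tool reads from /api/v1/nodes typically contains only an InternalIP and a Hostname entry (the values below are made up), with no ExternalIP at all:

status:
  addresses:
    - type: InternalIP
      address: 10.20.30.41    # the machine's own IP (made-up value)
    - type: Hostname
      address: worker-1
    # no "type: ExternalIP" entry is present without a cloud provider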
When using Kubernetes on bare metal, the external IP and load balancer functions don't natively exist. If you want to expose an "external IP" (in quotes because in most cases it would still be a 10.X.X.X address) from your bare-metal cluster, you would need to install something like MetalLB.
https://github.com/google/metallb
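As a rough illustration of what that involves: recent MetalLB releases are configured through custom resources roughly like the following (older releases used a ConfigMap instead; the pool name and the 10.x address range here are just examples for a private network):

apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: local-pool
  namespace: metallb-system
spec:
  addresses:
    - 10.0.0.240-10.0.0.250    # example range on the local network
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: local-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - local-pool

Services of type LoadBalancer then receive an address from that pool, which is the kind of 10.X.X.X "external IP" described above.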

How do networking and load balancer work in docker swarm mode?

I am new to Docker and containers. I was going through the tutorials for Docker and came across this information.
https://docs.docker.com/get-started/part3/#docker-composeyml
    networks:
      - webnet
networks:
  webnet:
What is webnet? The document says
Instruct web’s containers to share port 80 via a load-balanced network called webnet. (Internally, the containers themselves will publish to web’s port 80 at an ephemeral port.)
So, by default, is the overlay network load balanced in a Docker cluster? What load balancing algorithm is used?
Actually, it is not clear to me why we have load balancing on the overlay network.
Not sure I can be clearer than the docs, but maybe rephrasing will help.
First, the doc you're following here uses what is called the swarm mode of docker.
What is swarm mode?
A swarm is a cluster of Docker engines, or nodes, where you deploy services. The Docker Engine CLI and API include commands to manage swarm nodes (e.g., add or remove nodes), and deploy and orchestrate services across the swarm.
From SO Documentation:
A swarm is a number of Docker Engines (or nodes) that deploy services collectively. Swarm is used to distribute processing across many physical, virtual or cloud machines.
So, with swarm mode you have a multi-host (VMs and/or physical) cluster of machines that communicate with each other through their Docker engines.
Q1. What is webnet?
webnet is the name of an overlay network that is created when your stack is launched.
Overlay networks manage communications among the Docker daemons participating in the swarm
In your cluster of machines, a virtual network is then created, where each service has an IP mapped to an internal DNS entry (the service name), allowing Docker to route incoming packets to the right container anywhere in the swarm (cluster).
Q2. So, by default, is the overlay network load balanced in a docker cluster?
Yes, if you use the overlay network, but you could also remove the service networks configuration to bypass that. Then you would have to publish the port of the service you want to expose.
Q3. What load balancing algorithm is used?
From this SO question answered by swarm master bmitch ;):
The algorithm is currently round-robin and I've seen no indication that it's pluginable yet. A higher level load balancer would allow swarm nodes to be taken down for maintenance, but any sticky sessions or other routing features will be undone by the round-robin algorithm in swarm mode.
Q4. Actually, it is not clear to me why we have load balancing on the overlay network
The purpose of docker swarm mode / services is to allow orchestration of replicated services, meaning that we can scale the containers deployed in the swarm up and down.
From the docs again:
Swarm mode has an internal DNS component that automatically assigns each service in the swarm a DNS entry. The swarm manager uses internal load balancing to distribute requests among services within the cluster based upon the DNS name of the service.
So you can deploy, say, 10 identical containers (let's say nginx serving your app's html/js) without dealing with private network DNS entries, port configuration, etc. Any incoming request will be automatically load balanced across the hosts participating in the swarm.
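As an illustration (the image, port, and replica count are arbitrary), a stack file like the following gives you exactly that: the ten replicas all join the webnet overlay network, and every request hitting port 80 on any swarm node is spread across them:

version: "3.8"
services:
  web:
    image: nginx:alpine          # stand-in for "nginx with your app's html/js"
    deploy:
      replicas: 10               # ten identical containers spread over the swarm
    ports:
      - "80:80"                  # published through the swarm routing mesh
    networks:
      - webnet
networks:
  webnet:                        # created as an overlay network when the stack is deployed

Deploy it with docker stack deploy -c docker-compose.yml mystack and curl port 80 on any node to see the round-robin balancing.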
Hope this helps!

What is overlay network and how does DNS resolution work?

I cannot connect to an external mongodb server from my docker swarm cluster.
As I understand it, this is because the cluster uses the overlay network driver. Am I right?
If not, how does the docker overlay driver work, and how can I connect to an external mongodb server from the cluster?
Q. How does the docker overlay driver work?
I would recommend this good reference for understanding docker swarm overlay networking and, more globally, Docker's architecture.
This states that:
Docker uses embedded DNS to provide service discovery for containers running on a single Docker Engine and tasks running in a Docker Swarm. Docker Engine has an internal DNS server that provides name resolution to all of the containers on the host in user-defined bridge, overlay, and MACVLAN networks.
Each Docker container ( or task in Swarm mode) has a DNS resolver that forwards DNS queries to Docker Engine, which acts as a DNS server.
So, in multi-host docker swarm mode, with this example setup:
In this example there is a service of two containers called myservice. A second service (client) exists on the same network. The client executes two curl operations for docker.com and myservice.
These are the resulting actions:
DNS queries are initiated by client for docker.com and myservice.
The container's built-in resolver intercepts the DNS queries on 127.0.0.11:53 and sends them to Docker Engine's DNS server.
myservice resolves to the Virtual IP (VIP) of that service which is internally load balanced to the individual task IP addresses. Container names resolve as well, albeit directly to their IP addresses.
docker.com does not exist as a service name in the mynet network and so the request is forwarded to the configured default DNS server.
Back to your question:
How can I connect to an external mongodb server from the cluster?
For your external mongodb (let's say you have a DNS name for it, mongodb.mydomain.com), you are in the same situation as the client in the architecture above wanting to connect to docker.com, except that you certainly don't want to expose mongodb.mydomain.com to the entire web, so you may have declared it only in your internal DNS server.
Then, how do you tell the Docker engine to use this internal DNS server to resolve mongodb.mydomain.com?
You have to indicate in your docker service definition that you want to use an internal DNS server, like so:
docker service create \
  --name myservice \
  --network my-overlay-network \
  --dns=10.0.0.2 \
  myservice:latest
The important thing here is --dns=10.0.0.2. This tells the Docker engine to use the DNS server at 10.0.0.2:53 as the default whenever it cannot resolve a name internally to a service VIP.
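If you deploy the service from a stack file instead, the same setting can be expressed as follows (assuming a Compose file version recent enough to support the dns key for swarm services):

version: "3.8"
services:
  myservice:
    image: myservice:latest
    networks:
      - my-overlay-network
    dns:
      - 10.0.0.2        # same internal DNS server as the --dns flag above
networks:
  my-overlay-network:
    driver: overlay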
Finally, when you say:
I cannot connect to an external mongodb server from my docker swarm cluster. As I understand it, this is because the cluster uses the overlay network driver. Am I right?
I would say no, as there is a built-in mechanism in the Docker engine to forward DNS names that are unknown on the overlay network to whatever DNS server you want.
Hope this helps!
