In a production env, how many flows does a K8s Pod have at any point in time - networking

I am looking for ballpark numbers in production for the following questions:
1) How many flows (5-tuple: SRC-IP, DEST-IP, SRC-PORT, DEST-PORT, Protocol) does a pod open, and how long do these flows live?
2) When one moves from VMs to containers with K8s, is it typical to convert a VM into a Pod?
Are there any studies around this area? If so, can you provide pointers?

1:
It depends on how many containers your pod runs and how many connections each of them makes; there is no universal number. You can measure it for your own workload, as in the sketch below.
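A minimal way to get a ballpark figure for your own workload is to count the connection entries visible from the pod. This is only a sketch: the namespace, pod name, and pod IP are placeholders, and it assumes the container image ships ss (first command) and that you have node access with conntrack installed (second command).

    # Count established TCP flows inside the pod
    kubectl exec -n my-namespace my-pod -- ss -tan state established | tail -n +2 | wc -l

    # Or, from the node, count conntrack entries whose source is the pod IP
    sudo conntrack -L --src 10.244.1.23 | wc -l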
2:
They are definitely different things:
A Pod is a group of containers managed by K8s; a VM is a Virtual Machine. You can't simply convert a VM into a Pod.
Usually, when you migrate from VMs to K8s, you change the architecture of your application rather than mapping each VM to a Pod one-to-one.

Related

Reduce latency between pod on OpenShift and VM on GCP

I have a configuration where I have:
Pods managed by OpenShift on GCP in a zone/region
VM on GCP in same zone/region
I need to reduce the latency between those pods and the VM on GCP as much as possible.
What are the available options for that?
My understanding is that they would need to be in the same VPC, but I don't know how to set that up.
If you can point me to reference documentation, it would help me a lot.
Thanks for your help.
You have 2 options for that:
The best one is to create a sub-project in the OpenShift project that shares the same VPC (see the sketch after these options). This way the machines are on the same network, so the latency is as low as possible. However, this leads to management constraints (firewall rules, ...). The average latency should be very low (< 1 ms).
The other option is to use a dedicated OpenShift project. This leads to higher latency because the path is longer (VPN => shared services => VPN). You also need to watch out for cross-region flows: the machines being in the same project does not guarantee that traffic stays within one region. You must therefore optimize network routing via a tag that must be present on the MySQL machine. The latency in this case would vary between 2 and 10 ms, and it can fluctuate because the flows go through VPNs.
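If you go the shared-VPC route, a minimal sketch of attaching a service project to a host project's network looks like the following; HOST_PROJECT_ID and SERVICE_PROJECT_ID are placeholders, and the exact project layout (which project owns the VPC) is an assumption on my part.

    # Enable Shared VPC on the project that owns the network
    gcloud compute shared-vpc enable HOST_PROJECT_ID

    # Attach the project that runs the VM (or the cluster) to that host project
    gcloud compute shared-vpc associated-projects add SERVICE_PROJECT_ID \
        --host-project HOST_PROJECT_ID

Once both projects share the VPC, the pods and the VM can talk over internal IPs without leaving the network.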
Setting your source and destination in the same VPC region will definitely reduce your latency. Even though latency is not only affected by distance, I have found this documentation regarding GCP Inter-Region Latency which could help you decide on your best scenario.
Now, coming to your question, I understand you have created a GCP cluster and a VM instance in the same zone/region but in different networks (VPCs)? If possible, could you please clarify your scenario a bit more?

Load balancing on same server

I have been researching Kubernetes and saw that it does load balancing on a single node. If I'm not wrong, one node means one server machine, so what good is load balancing on the same server machine, since it will use the same CPU and RAM to handle requests? At first I thought load balancing would be done across separate machines to share CPU and RAM resources. So I want to know the point of doing load balancing on the same server.
Just because you can do it on one node doesn't mean you should, especially in a production environment.
A production cluster will have at least 3 to 5 nodes.
Kubernetes will spread the replicas across the cluster nodes to balance node workload, so pods end up on different nodes.
You can also configure which nodes your pods land on:
use advanced scheduling, i.e. pod affinity and anti-affinity (see the sketch below);
you can also plug in your own scheduler that will not allow placing replica pods of the same app on the same node.
Then you define a Service to load-balance across the pods on different nodes;
kube-proxy will do the rest.
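As a rough sketch of the pod anti-affinity approach (the name, label, and image are placeholders, not taken from the original post), a Deployment can require that its replicas land on different nodes:

    # Deployment whose replicas repel each other onto different nodes
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-app
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: my-app
      template:
        metadata:
          labels:
            app: my-app
        spec:
          affinity:
            podAntiAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchLabels:
                    app: my-app
                topologyKey: kubernetes.io/hostname   # at most one replica per node
          containers:
          - name: my-app
            image: nginx
            ports:
            - containerPort: 80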
Here is a useful read:
https://itnext.io/keep-you-kubernetes-cluster-balanced-the-secret-to-high-availability-17edf60d9cb7
So you generally need to choose a level of availability you are comfortable with. For example, if you are running three nodes in three separate availability zones, you may choose to be resilient to a single node failure. Losing two nodes might bring your application down, but the odds of losing two data centres in separate availability zones are low.
The bottom line is that there is no universal approach; only you can know what works for your business and the level of risk you deem acceptable.
I guess you mean how Services do automatic load-balancing. Imagine you have a Deployment with 2 replicas on your one node and a Service. Traffic to the Pods goes through the Service, so if that were not load-balanced then everything would go to just one Pod and the other Pod would get nothing. By spreading traffic evenly you can handle more load, and you can still be confident that traffic will be served if one Pod dies (see the sketch below).
You can also load-balance traffic coming into the cluster from outside so that the entrypoint to the cluster isn't always the same node, but that is a different level of load-balancing. Even with one node you can still want load-balancing for the Services within the cluster. See "Clarify Ingress load balancer" on load-balancing of the external entrypoint.
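As a minimal sketch of that in-cluster load-balancing (the names, label, and ports are placeholders), a Service just selects the Pods and spreads traffic across all of them, whether they sit on one node or many:

    apiVersion: v1
    kind: Service
    metadata:
      name: my-app
    spec:
      selector:
        app: my-app        # matches the Pods of the 2-replica Deployment
      ports:
      - port: 80           # port the Service exposes inside the cluster
        targetPort: 80     # container port traffic is forwarded to

kube-proxy then distributes connections to the Service's ClusterIP across the matching Pod endpoints.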

gke nginx lb health checks / can't get all instances in a "healthy" state

Using nginx-ingress-controller:0.9.0, below is the permanent state of the Google Cloud load balancer:
Basically, the single healthy node is the one running the nginx-ingress-controller pods. Besides not looking good on this screen, everything works super fine. The thing is, I'm wondering why such a bad notice appears on the LB.
Here's the service/deployment used.
I'm just getting a little lost over how this works; I hope to get some experienced feedback on how to do things right (I mean, getting green lights on all nodes), or to double-check whether that's a drawback of not using the 'official' gcloud L7 load balancer.
Your Service is using the service.beta.kubernetes.io/external-traffic: OnlyLocal annotation. This configures it so that traffic arriving at the NodePort for that Service will never be forwarded to a Pod on another node. Since your Deployment only has 1 replica, the only node that will receive traffic is the one where that 1 Pod is running.
If you scale your Deployment to 2 replicas, 2 nodes will be healthy, etc.
Using that annotation is a recommended configuration, so that you are not introducing additional network hops; a sketch of such a Service follows.
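For reference, this is roughly what a Service carrying that annotation looks like; the name, selector, and ports are placeholders rather than the poster's actual manifest, and on current Kubernetes versions the equivalent setting is the spec.externalTrafficPolicy: Local field.

    apiVersion: v1
    kind: Service
    metadata:
      name: nginx-ingress-lb
      annotations:
        # keep external traffic on the node it arrived at (no extra hop)
        service.beta.kubernetes.io/external-traffic: OnlyLocal
    spec:
      type: LoadBalancer
      selector:
        app: nginx-ingress-controller
      ports:
      - port: 80
        targetPort: 80

With only one controller replica, only that node passes the LB health check; scaling the Deployment marks more nodes as healthy.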

Why do Docker overlay networks require consensus?

Just been reading up on Docker overlay networks, very cool stuff. I just can't seem to find an answer to one thing.
According to the docs:
If you install and use Docker Swarm, you get overlay networks across your manager/worker hosts automagically, and don't need to configure anything more; but...
If you simply want a (non-Swarm) overlay network across multiple hosts, you need to configure that network with an external "KV Store" (consensus server) like Consul or ZooKeeper
I'm wondering why this is. Clearly, overlay networks require consensus amongst peers, but I'm not sure why or who those "peers" even are.
And I'm just guessing that, with Swarm, there's some internal/under-the-hood consensus server running out of the box.
Swarm Mode uses Raft for its manager consensus, with a built-in KV store. Before swarm mode, overlay networking was possible with third-party KV stores. Overlay networking itself doesn't require consensus; it just relies on whatever the KV store says, regardless of the other nodes or even its own local state (I've found this out the hard way). The KV stores out there are typically set up with consensus for HA.
The KV store tracks IP allocations to containers running on each host (IPAM). This allows docker to only allocate a given address once, and to know which docker host it needs to communicate with when you connect to a container running on another host. This needs to be external from any one docker host, and preferably in an HA configuration (like swarm mode's consensus) so that it can continue to work even when some docker nodes are down.
Overlay networking between docker nodes only involves the nodes that have containers on that overlay network. So once the IP is allocated and discovered, all the communication only happens between the nodes with the relevant containers. This is easy to see with swarm mode if you create a network and then list networks on a worker, it won't be there. Once a container on that network gets scheduled, the network will appear. From docker, this reduces overhead of multi-host networking while also adding to the security of the architecture. The result looks like this graphic:
The raft consensus itself is only needed for leader election. Once a node is selected to be the leader and enough nodes remain to have consensus, only one node is writing to the KV store and maintaining the current state. Everyone else is a follower. This animation describes it better than I ever could.
Lastly, you don't need to set up an external KV store to use overlay networking outside of swarm mode services. You can implement swarm mode, configure overlay networks with the --attachable option, and run containers outside of swarm mode on that network just as you would have with an external KV store. I've used this in the past as a transition state to get containers into swarm mode, where some were running with docker-compose and others had been deployed as a swarm stack.
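A minimal sketch of that attachable-overlay approach (network and container names are placeholders):

    # On a manager node: enable swarm mode and create an attachable overlay network
    docker swarm init
    docker network create --driver overlay --attachable my-overlay

    # On any node joined to the swarm: run a plain (non-service) container on that network
    docker run -d --name web --network my-overlay nginx

Swarm's built-in Raft/KV store then handles the IP allocation and peer discovery that an external Consul or ZooKeeper would otherwise provide.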

communication between Openstack VM

How can I make two VMs communicate with each other? I have to split a task between two VMs, so I think MPI has to be used. If so, are there any useful resources I can use to get started? Any help would be appreciated.
P.S.: I have installed DevStack Juno.
Your question is not really clear.
OpenStack is just a virtualization technology. There's almost no difference between having two hardware servers and two VMs. For example, if two servers belong to the same network segment, they normally have access to each other's open ports. OpenStack works in just the same way: if you assign the same network to both VMs, this will also work (see the sketch below).
However, if you wish to run two VMs that consume from a list of tasks and process them in parallel, I would recommend reading about Enterprise Integration Patterns (e.g. here). Technically this is implemented using one or more messaging middleware servers such as ActiveMQ or ZeroMQ.
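A minimal sketch of putting two instances on the same Neutron network with the unified OpenStack CLI; the network, subnet, image, and flavor names are placeholders, and on an older DevStack Juno setup you may have to use the legacy neutron/nova clients instead.

    # Create a network and subnet, then boot both VMs onto it
    openstack network create task-net
    openstack subnet create task-subnet --network task-net --subnet-range 192.168.10.0/24
    openstack server create --image cirros --flavor m1.tiny --network task-net worker-1
    openstack server create --image cirros --flavor m1.tiny --network task-net worker-2

With both VMs on task-net (and security group rules permitting), they can reach each other's ports directly, whether you run MPI or a message broker on top.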
