kubernetes: intra-cluster isolation of applications - networking

I have been experimenting with the k8s/kops/AWS suite.
So far things have been going pretty well (except for an issue when updating the cluster via kops).
I want to be able to make use of my existing resources/cluster and deploy 2 flavors of my app (i.e. production and testing) in the same cluster.
I would like to be on the safe side and maximize as much as possible the isolation between the k8s resources of those two deployments.
Definitely, they're going in different namespaces.
From some investigation I have found out that I also need to apply NetworkPolicy to prevent inter-namespace communication; however, applying NetworkPolicy resources requires a supporting networking solution (I'm currently using kubenet, the default of kops, which doesn't support it).
What is the solution/plugin to go for?
I just want (at least for the time being) the level of isolation described above, which I assume can be achieved via NetworkPolicy even if there is a common CIDR for all pods (I mention that to emphasise that I need only the simplest possible networking solution that achieves this, nothing fancier with multiple CIDRs etc.).
Ideally I would like to be able to just use NetworkPolicy resources for some namespace-based (namespaceSelector) and pod-based (podSelector) ingress rules, and that's it(?)

On my kops clusters I use Weave networking (I also provision them with a private topology, which excludes kubenet anyway). So my first suggestion would be to go with a different networking solution, Weave and Calico being the first ones that come to mind.
Other than that, you might want to look into a service mesh solution like Istio, which can leverage NetworkPolicies as well (some Istio policy reading).
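For illustration, a minimal sketch of such a policy once a NetworkPolicy-capable plugin (Weave, Calico) is in place; the namespace name production is hypothetical:

```yaml
# Deny all ingress to pods in the "production" namespace (hypothetical name),
# allowing traffic only from pods within the same namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace-only
  namespace: production
spec:
  podSelector: {}           # applies to every pod in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}   # any pod in the same namespace; without a
                            # namespaceSelector, other namespaces are denied
```

Applying the analogous policy in the testing namespace would give you the mutual namespace isolation described in the question.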

Related

Typical resource request required for an nginx file explorer deployed on kubernetes

I have 2 NFS mounts of 100 TB each, i.e. 200 TB in total, and I have mounted both in a Kubernetes container. My file server is a typical log server that holds a mix of data types like JSON, HTML, images, logs, text files, etc. The file sizes also vary a lot. I am guessing at what the ideal resource request for this Kubernetes container should be. My assumptions:
As this is file reads, an I/O-intensive operation, CPU should be high.
Since we may have large files transferred over, memory should also be high.
Just wanted to check if my assumptions are right?
Posting this community wiki answer to set a baseline and to show one possible set of actions that should lead to a solution.
Feel free to edit and expand.
As I stated previously, this setup will depend heavily on the specific case, and giving an approximate figure could be misleading. In my opinion the best course of action would be:
Install monitoring tools
Deploy the application for testing
Simulate the load
Install monitoring tools
There are a lot of monitoring tools that can retrieve the data about the CPU and Memory usage of your Pods. You will need to choose the one that suits your workloads and infrastructure best.
Some of them are:
Prometheus.io
Elastic.co
Datadoghq.com
Deploy the application for testing
This can also be quite a wide topic, considering that the exact requirements and infrastructure are not known. One of many questions is whether the Deployment should have a steady replica count or use some kind of Horizontal Pod Autoscaling (based on CPU and/or memory). The access modes on the storage shouldn't matter, as NFS supports RWX.
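If autoscaling is chosen, a minimal CPU-based HorizontalPodAutoscaler could be sketched as below; the Deployment name file-explorer is hypothetical:

```yaml
# CPU-based autoscaling sketch; scales the (hypothetical) file-explorer
# Deployment between 2 and 10 replicas at ~70% average CPU utilization.
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: file-explorer
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: file-explorer
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
```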
The basic implementation of the Deployment that could be used can be found in the official Kubernetes documentation:
Kubernetes.io: Docs: Concepts: Workloads: Controllers: Deployment: Creating a deployment
Kubernetes.io: Docs: Concepts: Storage: Volumes: NFS
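Combining the two documents above, a rough sketch could look like this; the image tag, NFS server address and paths are hypothetical:

```yaml
# nginx Deployment mounting one of the NFS shares read-only.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: file-explorer
spec:
  replicas: 2
  selector:
    matchLabels:
      app: file-explorer
  template:
    metadata:
      labels:
        app: file-explorer
    spec:
      containers:
        - name: nginx
          image: nginx:1.21          # hypothetical tag
          ports:
            - containerPort: 80
          volumeMounts:
            - name: nfs-data
              mountPath: /usr/share/nginx/html
              readOnly: true
      volumes:
        - name: nfs-data
          nfs:
            server: nfs.example.com  # hypothetical NFS server
            path: /exports/logs      # hypothetical export path
            readOnly: true
```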
Simulate the load
The simulation part could be done either through real-life usage or by using a tool to simulate the load. In this part you would need to choose the option/tool that best suits your requirements. This part will show you the approximate resources that should be allocated to your nginx file explorer.
A side note!
In my testing I've used ab (Apache Bench) to check if the load was divided equally across X replicas.
Additional resources
I recommend checking the guide in the official Kubernetes documentation on managing resources:
Kubernetes.io: Docs: Concepts: Configuration: Manage resources containers
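As a starting point only (the point of this answer is that the real numbers should come from measurement), the requests/limits stanza under a container spec looks like this; the values are placeholders, not recommendations:

```yaml
# Placeholder values, to be replaced with figures taken from monitoring.
resources:
  requests:
    cpu: "500m"
    memory: "512Mi"
  limits:
    cpu: "1"
    memory: "1Gi"
```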
I also think that the VPA could help you in the whole process as:
Vertical Pod Autoscaler (VPA) frees the users from necessity of setting up-to-date resource limits and requests for the containers in their pods. When configured, it will set the requests automatically based on usage and thus allow proper scheduling onto nodes so that appropriate resource amount is available for each pod. It will also maintain ratios between limits and requests that were specified in initial containers configuration.
It can both down-scale pods that are over-requesting resources, and also up-scale pods that are under-requesting resources based on their usage over time.
-- Github.com: Kubernetes: Autoscaler: Vertical Pod Autoscaler
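A sketch of a VerticalPodAutoscaler object, assuming the VPA CRDs from the kubernetes/autoscaler repository are installed; the target name is hypothetical:

```yaml
# VPA watching the (hypothetical) file-explorer Deployment; in "Auto" mode
# it may evict pods to apply newly computed requests.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: file-explorer-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: file-explorer
  updatePolicy:
    updateMode: "Auto"
```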
I reckon you could also look at this answer:
Stackoverflow.com: Answers: PromQL query to find CPU and memory used for the last week

Is it possible to have isolated networks in a Kubernetes cluster?

I'm trying to set up a general architecture for a system that I'm moving to Kubernetes (self-hosted, probably on VSphere).
I'm not very well versed in networking and I have the following problem that I cannot seem to be able to conceptually solve:
I have many microservices which were split out of a monolith, but the monolith is still significant. All of it is moving to K8s. It's a clustered application and does a lot of all-to-all networking under high load, which I would like to separate from all the other services in the Kubernetes cluster.
Before moving to K8s we provided a way to specify a network device that is used only for the cluster communication, and as such could be strictly separated from other traffic, even down to using separate networking hardware for clustering.
So this is where I would request your input: is it possible to have completely separate networking for this application-level cluster inside the Kubernetes cluster? The ideal solution would allow me to continue using our existing logic, i.e. to have a separate network (and network adapter) for the chatty bits but it's not a hard requirement to keep it that way. I have looked at Calico, Flannel, and Istio, but haven't been able to come up with a sound concept.
Use k8s NetworkPolicies; by applying these policies, you can allow/deny traffic for pods based on label selectors. You can try WeaveNet or Calico, both are good and support NetworkPolicies.
It is good to have the Calico network plugin, because Flannel doesn't support network policies. You can create NetworkPolicy resources to allow/deny the traffic.
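For the chatty application-level cluster specifically, a label-based sketch (namespace and labels hypothetical) could confine ingress to the application's own pods:

```yaml
# Only pods labeled app=monolith may send traffic to other monolith pods;
# ingress from the rest of the cluster is denied.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: monolith-internal-only
  namespace: monolith          # hypothetical namespace
spec:
  podSelector:
    matchLabels:
      app: monolith
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: monolith
```

Note that this isolates traffic at the policy level; it does not give you a physically separate network or network adapter.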
On OpenShift, you can have an isolated network per project (Kubernetes namespace). See https://docs.openshift.com/container-platform/3.5/admin_guide/managing_networking.html#isolating-project-networks

Kubernetes statefulsets in a GCE multiple zone deployment

I'm working on a project to run a Kubernetes cluster on GCE. My goal is to run a cluster containing a WordPress site in multiple zones. I've been reading a lot of documentation, but I can't seem to find anything that is direct and to the point on persistent volumes and statefulsets in a multiple zone scenario. Is this not a supported configuration? I can get the cluster up and the statefulsets deployed, but I'm not getting the state replicated throughout the cluster. Any suggestions?
Thanks,
Darryl
Reading the docs, I see that the recommended configuration would be to create a MySQL cluster with replication: https://kubernetes.io/docs/tasks/run-application/run-replicated-stateful-application/. This way, you would have the data properly replicated between the instances of your cluster (if you are in a multi-zone deployment you may have to create an external endpoint).
Regarding the WordPress data, my advice would be to go for an immutable deployment: https://engineering.bitnami.com/articles/why-your-next-web-service-should-be-immutable.html . This way, if you need to add a plugin or perform upgrades, you would create a new container image and re-deploy it. Regarding the media library assets and immutability, I think the best option would be to use an external storage service like S3: https://wordpress.org/plugins/amazon-s3-and-cloudfront/
So, to answer the original question: I think that statefulset synchronization is not available in K8s (at the moment). Maybe using a volume provider that allows ReadWriteMany access mode could fit your needs (https://kubernetes.io/docs/concepts/storage/persistent-volumes/#access-modes), though I am quite unsure about the stability of it.
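A PersistentVolumeClaim sketch requesting ReadWriteMany; whether it binds depends on the underlying volume provider (e.g. an NFS-backed one), and the StorageClass name is hypothetical:

```yaml
# RWX claim that multiple pods across zones could mount simultaneously.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: wordpress-shared
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: nfs-client   # hypothetical StorageClass
  resources:
    requests:
      storage: 50Gi
```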

Kubernetes cluster best practice

I am working on a new project with Kubernetes and I need three environments: DEV, QA and PROD.
What is most recommended: creating multiple clusters, or creating one big cluster and separating the environments by namespace?
Are you just going to have a single prod cluster or multiple prod clusters? One thing to consider is that updating the cluster management software (to a new k8s release) can impact your application. If you only plan to have a single prod cluster, I'd recommend running qa and dev separately so that you can upgrade those clusters first to shake out any issues. If you are going to have multiple prod clusters, then you can upgrade them one at a time to ensure application availability and sharing the clusters between environments makes a lot more sense.
Namespaces will not give you isolation; at the moment a namespace is just a different subdomain in DNS. It's better to have a namespace per application.
I highly recommend having two clusters for prod (in case of updating k8s) and one or two for dev/qa.
Take a look at this blog post: Checklist: pros and cons of using multiple Kubernetes clusters, and how to distribute workloads between them.
I'd like to highlight some of the pros/cons:
Reasons to have multiple clusters
Separation of production/development/test: especially for testing a new version of Kubernetes, of a service mesh, of other cluster software
Compliance: according to some regulations some applications must run in separate clusters/separate VPNs
Better isolation for security
Cloud/on-prem: to split the load between on-premise services
Reasons to have a single cluster
Reduce setup, maintenance and administration overhead
Improve utilization
Cost reduction
Considering a not too expensive environment, with average maintenance, and yet still ensuring security isolation for production applications, I would recommend:
1 cluster for DEV/QA (separated by namespaces, maybe even isolated using Network Policies, e.g. with Calico; see the sketch after this list)
1 cluster for PROD
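As a sketch, the shared non-prod cluster could start with two namespaces (names hypothetical), each of which could then get a default-deny NetworkPolicy like the one sketched in the first answer above:

```yaml
# Hypothetical namespaces for the shared non-prod cluster.
apiVersion: v1
kind: Namespace
metadata:
  name: dev
---
apiVersion: v1
kind: Namespace
metadata:
  name: qa
```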
Definitely concur that you want multiple clusters:
anything critical to k8s that fails during an upgrade, or because you screwed up somewhere, will affect the whole cluster.
For example, I had an issue with DNS which wreaked havoc in my cluster; all namespaces were affected.
Upgrades are usually not a big deal but one day you might hit a roadblock; if kubelet fails for too long your pods will get killed.
So it's best to upgrade your test/dev environments and iron things out there before upgrading in prod.

How can I set up a Docker network with restricted communication?

I'm trying to create something like this:
The server containers each have port 8080 exposed, and accept requests from the client, but crucially, they are not allowed to communicate with each other.
The problem here is that the server containers are launched after the client container, so I can't pass container link flags to the client like I used to, since the containers it's supposed to link to don't exist yet.
I've been looking at the newer Docker networking stuff, but I can't use a bridge because I don't want server cross-communication to be possible. It also seems to me like one bridge per server doesn't scale well, and would be difficult to manage within the client container.
Is there some kind of switch-like docker construct that can do this?
It seems like you will need to create multiple bridge networks, one per container. To simplify that, you may want to use docker-compose to specify how the networks and containers should be provisioned, and have the docker-compose tool wire it all up correctly.
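A compose sketch of that layout (service and network names hypothetical): the client joins every per-server network, while each server sits alone on its own bridge, so the servers cannot reach each other:

```yaml
# One bridge network per server; only the client is attached to all of them.
version: "2"
services:
  client:
    image: my-client        # hypothetical image
    networks:
      - net-server1
      - net-server2
  server1:
    image: my-server        # hypothetical image
    expose:
      - "8080"
    networks:
      - net-server1
  server2:
    image: my-server
    expose:
      - "8080"
    networks:
      - net-server2
networks:
  net-server1:
    driver: bridge
  net-server2:
    driver: bridge
```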
Resources:
https://docs.docker.com/engine/userguide/networking/dockernetworks/
https://docs.docker.com/compose/
https://docs.docker.com/compose/compose-file/#version-2
One more side note: I think that exposed ports are accessible to all networks. If that's right, you may be able to set all of the server networking to none and rely on the exposed ports to reach the servers.
Hope this is relevant to your use case - I'm attempting to infer your actual application from the diagram and comments. I'd recommend you go the Service Discovery route. It may involve a little bit of simple API work over a central store (say Redis, or SkyDNS), but it would make things simple in the long run.
Kubernetes, for instance, uses SkyDNS to do so with DNS. At the end of the day, any orchestration tool of your choice would most likely do something like this out of the box: https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/dns
The idea is simple:
Use a DNS container that keeps entries of newly spawned servers
Allow the client container to query it for a list of servers, e.g. picture a DNS response with a bunch of server-<<ISO Timestamp of Server Creation>> entries
Disallow the server containers read-access to this DNS, so they cannot discover each other (how to manage this permission configuration without indirection, i.e. without proxying through an endpoint that allows writing into the DNS container but not reading, is going to be exotic)
Bonus edit: I just realised you could use a simpler Redis-like setup to do this; DNS might just be over-engineering :)
