Long story short: I'm connecting to MongoDB Atlas and trying to whitelist the smallest IP range possible.
VPC peering won't work: the Atlas cluster is hosted in AWS, and that's simply a MongoDB Atlas limitation. Also, some of our Mongo clusters are M5 or lower, and those tiers don't support VPC peering.
That said, I'm not sure what the public/external IP of my pods will be when they attempt to connect to Atlas. If narrowing the outbound IP range as much as possible isn't achievable, what other options exist?
The GKE cluster is not private, and it runs in Autopilot mode.
I found these two articles that show how to route egress traffic through a single IP that can be used for whitelisting your GKE cluster in MongoDB Atlas:
Route the GKE cluster's egress traffic via Cloud NAT
or
Route your Public GKE cluster’s egress traffic via NAT instances
Unfortunately, both of these options only work for non-Autopilot GKE clusters. For routing the cluster's egress traffic via Cloud NAT, the desired networking behavior is currently not supported by public Autopilot clusters: the cluster's IP masquerade configuration does not perform SNAT within the cluster for packets sent from Pods to the internet, and there is currently no way to configure the IP masquerade agent in an Autopilot cluster to stop masquerading the pod range for internet-bound traffic. As a result, pod egress traffic in a public Autopilot cluster uses the node's external IP.
So to move forward with Cloud NAT, it's either:
use a private GKE cluster, which can be in Autopilot mode (a sketch of that setup follows below), or
use a public GKE cluster that is not in Autopilot mode.
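If you go the private-cluster route, a minimal sketch of the Cloud NAT setup could look like the following. The VPC name (my-vpc), region, cluster name, and address name are placeholders, not values from the question; the idea is that all egress from the private Autopilot cluster leaves through the single reserved IP, which you can then whitelist in Atlas.

```
# Reserve the static external IP that will be whitelisted in MongoDB Atlas
gcloud compute addresses create nat-ip --region=us-central1

# Cloud Router + Cloud NAT in the cluster's VPC and region,
# using only the reserved IP for all subnet ranges
gcloud compute routers create nat-router \
  --network=my-vpc --region=us-central1
gcloud compute routers nats create nat-gateway \
  --router=nat-router --region=us-central1 \
  --nat-external-ip-pool=nat-ip \
  --nat-all-subnet-ip-ranges

# A private Autopilot cluster whose nodes have no external IPs,
# so Pod egress is forced through the Cloud NAT above
gcloud container clusters create-auto my-cluster \
  --region=us-central1 \
  --network=my-vpc \
  --enable-private-nodes
```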
With Autopilot, you also cannot use an inexpensive solution such as kube-ip, which lets you associate public GCP static IPs with your GKE nodes. It is a daemon that runs in your Kubernetes cluster and keeps checking, and associating when necessary, GCP IPs you reserved earlier to your GKE nodes.
What you can do with Autopilot is use the Network Endpoint Group (NEG) abstraction layer that enables container-native load balancing. It lets you configure path-based or host-based routing to your backend pods. The Ingress resource itself is free, but you will pay for GCP load balancing, which in most cases is much more expensive than reserving a few IPs with kube-ip.
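As a rough illustration of container-native load balancing (the Service name, labels, and ports below are placeholders, not taken from the question), you annotate the Service so GKE creates NEGs and then expose it through an Ingress:

```
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: my-app
  annotations:
    # Ask GKE to create Network Endpoint Groups for this Service
    cloud.google.com/neg: '{"ingress": true}'
spec:
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-ingress
spec:
  rules:
  - http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-app
            port:
              number: 80
EOF
```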
Here is more information about GCP's NEGs, plus another related Stack Exchange question.
Related
I have my web app running in a GKE cluster, and I am trying to deploy Redis and Mongo databases on Compute Engine VMs in the same GCP project.
I would like only my GKE cluster to have access to Redis and Mongo via an internal/private network, so that the DBs are shielded from the public internet. What would be the preferred solution? I read that one could use VPC peering, a Shared VPC, or deploy GKE and the DBs in the same VPC, but I am not sure what to choose or whether there is a better way. I also read that one should be aware of IP overlapping.
Any tips/help would be greatly appreciated, thanks.
You need to create a firewall rule to allow connections from GKE to your Compute Engine VMs.
Use this command to get the source IP range for your cluster:
ip_range = `gcloud container clusters describe #{cluster_name} --format="get(clusterIpv4Cidr)" --region="us-central1" --project=#{project_id}`
Then use the command below to create the firewall rule:
`gcloud compute firewall-rules create "#{cluster_name}-to-all-vms-on-network" --network=#{network} --source-ranges=#{ip_range} --allow=tcp,udp,icmp,esp,ah,sctp --project=#{project_id}`
I am assuming you are talking about self-hosting Redis and Mongo on Compute Engine VMs. You can create the DB VMs in the same VPC as the GKE cluster but without a public IP address. This ensures that the VMs are not accessible from the internet. Then create firewall rules to allow traffic from the cluster's Pod IP ranges to the DB VMs. See this answer for details on the firewall rules.
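A minimal sketch of the VM side, with placeholder names (my-vpc, my-subnet, the zone, and the db network tag are all illustrative); the firewall rule from the Pod range is then created as in the linked answer:

```
# DB VM in the cluster's VPC with no external IP, tagged so a
# firewall rule can target it by network tag
gcloud compute instances create mongo-vm \
  --zone=us-central1-a \
  --network=my-vpc --subnet=my-subnet \
  --no-address \
  --tags=db
```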
On GCP, peered VPC connections are not transitive, and Memorystore exists in its own VPC network. This means it's not possible to connect to a Redis instance from multiple VPC networks; only a single authorized network can get access.
This diagram illustrates how VPC-2 cannot connect to VPC-1's Redis instance:
[Redis]-[VPC-1]-[VPC-2]
The only proposed solution I've found so far to connect from multiple VPC networks is to host a Redis proxy (nutcracker), but this feels like a lot of work and potential maintenance in the future.
Is there a managed service offered by GCP that can do the trick?
I've recently connected a private GKE cluster to Cloud Build following this documentation, which makes use of routers and tunnels. Is it possible to use a Cloud Router and VPN tunnels to proxy the connection?
Another solution so you can manage the peered VPCs within the same project:
As you know, peered VPCs are not transitive; in this case, that means your VPC-2 does not know about the connection between VPC-1 and the Redis VPC.
You can use VPC-1 as a transit network, either by importing and exporting routes between VPC-1 and VPC-2 or, for a more managed solution, by using Cloud VPN on VPC-1. If you have multiple VPCs that need to connect to Redis, I would suggest considering Cloud VPN.
Here is an example of how this architecture could work.
From that example, treat network-b as your VPC-1, network-a as your Redis VPC, and network-c as your VPC-2.
If you only have a few VPCs that need to connect to the Redis VPC, you could also consider exporting and importing custom routes from VPC-1 to all peered VPCs that need access to Redis.
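The mechanics of that custom-route exchange look roughly like this, assuming the peerings are named vpc1-to-vpc2 and vpc2-to-vpc1 (placeholder names); whether the route toward the Redis VPC is actually carried this way depends on your topology, so treat this only as a sketch of the flags involved:

```
# On VPC-1, export its custom routes to VPC-2
gcloud compute networks peerings update vpc1-to-vpc2 \
  --network=vpc-1 \
  --export-custom-routes

# On VPC-2, import the routes advertised by VPC-1
gcloud compute networks peerings update vpc2-to-vpc1 \
  --network=vpc-2 \
  --import-custom-routes
```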
For Redis, please note that only RFC 1918 addresses are allowed to connect, so any IPs that need to reach Redis must fall within these ranges:
10.0.0.0 – 10.255.255.255 (10/8 prefix)
172.16.0.0 – 172.31.255.255 (172.16/12 prefix)
192.168.0.0 – 192.168.255.255 (192.168/16 prefix)
I have a Compute Engine instance with persistent file storage that I need outside of my GKE cluster.
I would like to open a specific TCP port on the Compute Engine instance so that only nodes within the GKE cluster can access it.
The Compute Engine instance and GKE cluster are in the same GCP project, network, and subnet.
The GKE cluster is not private and I have an ingress exposing the only service I want exposed to the internet.
I've tried creating firewall rules of three different types that do not work:
By shared service account on both Compute Engine instance and K8s nodes.
By network tags (yes, I am using the network tags exactly as specified on the VM instance page).
By IP address, where I use network tag for target and private IANA IP ranges 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16 for source.
The only thing that works is the last option but using 0.0.0.0/0 for source IP range.
I've looked at a few related questions such as:
Google App Engine communicate with Compute Engine over internal network
Can I launch Google Container Engine (GKE) in Private GCP network Subnet?
But I'm not looking to make my GKE cluster private and I have tried to create the firewall rules using network tags to no avail.
What am I missing or is this not possible?
Not sure how I missed this; I'm fairly certain I tried something similar a couple of months back but must have had something else misconfigured.
On the GKE cluster details page, there is a Pod address range. Setting the firewall rule's source range to the GKE Pod address range gave me the desired outcome.
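For reference, a sketch of what that can look like; the cluster name, region, network, port, and storage tag are placeholders, and the CIDR should be replaced with the value the first command returns:

```
# Look up the cluster's Pod address range
gcloud container clusters describe my-cluster \
  --region=us-central1 \
  --format="value(clusterIpv4Cidr)"

# Allow only that range to reach the instance on the chosen port
gcloud compute firewall-rules create allow-gke-pods-to-storage \
  --network=default \
  --source-ranges=10.4.0.0/14 \
  --target-tags=storage \
  --allow=tcp:2049
```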
What is difference between MetalLB and NodePort?
A node port is a built-in feature that allows users to access a service from the IP of any k8s node using a static port. The main drawback of using node ports is that your port must be in the range 30000-32767 and that there can, of course, be no overlapping node ports among services. Using node ports also forces you to expose your k8s nodes to users who need to access your services, which could pose security risks.
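For example, a minimal NodePort Service looks like this (the names and ports are placeholders); the service then becomes reachable at <any-node-IP>:30080:

```
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  type: NodePort
  selector:
    app: my-app
  ports:
  - port: 80          # ClusterIP port inside the cluster
    targetPort: 8080  # container port
    nodePort: 30080   # must fall in 30000-32767
EOF
```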
MetalLB is a third-party load-balancer implementation for bare-metal servers. A load balancer exposes a service on an IP external to your k8s cluster, at any port of your choosing, and routes those requests to your k8s nodes.
MetalLB can be deployed either with a simple Kubernetes manifest or with Helm.
MetalLB requires a pool of IP addresses in order to take ownership of the ingress-nginx Service. This pool can be defined in a ConfigMap named config located in the same namespace as the MetalLB controller. The pool must be dedicated to MetalLB's use; you can't reuse the Kubernetes node IPs or IPs handed out by a DHCP server.
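A minimal sketch of such a pool, assuming MetalLB runs in the metallb-system namespace and that 192.168.1.240-192.168.1.250 is an unused range on your network (this is the legacy ConfigMap format described above; recent MetalLB releases configure pools through CRDs instead):

```
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.1.240-192.168.1.250
EOF
```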
A NodePort is an open port on every node of your cluster. Kubernetes transparently routes incoming traffic on the NodePort to your service, even if your application is running on a different node.
GKE uses the kubenet network plugin for setting up container interfaces and configures routes in the VPC so that containers can reach each other on different hosts.
Wikipedia defines an overlay as a computer network that is built on top of another network.
Should GKE's network model be considered an overlay network? It is built on top of another network in the sense that it relies on the connectivity between the nodes in the cluster to function properly, but the Pod IPs are natively routable within the VPC, since the routes tell the network which node to reach to find a particular Pod.
Both VPC-native and non-VPC-native GKE clusters use GCP virtual networking. Neither is strictly an overlay network by definition; an overlay network would be one that is isolated to just the GKE cluster.
VPC-native clusters work like this:
Each node VM is given a primary internal address and two alias IP ranges. One alias IP range is for pods and the other is for services.
The GCP subnet used by the cluster must have at least two secondary IP ranges (one for the pod alias IP range on the node VMs and the other for the services alias IP range on the node VMs).
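As an illustration, a VPC-native cluster is created against those secondary ranges roughly like this (the subnet and secondary-range names are placeholders):

```
gcloud container clusters create my-cluster \
  --region=us-central1 \
  --enable-ip-alias \
  --subnetwork=my-subnet \
  --cluster-secondary-range-name=pods \
  --services-secondary-range-name=services
```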
Non-VPC-native clusters:
GCP creates custom static routes whose destinations match the Pod IP space and the Services IP space. The next hops of these routes are node VMs (by name), so there is instance-based routing that happens as a "next step" within each VM.
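You can see those routes with something like the following; GKE typically prefixes their names with gke-, but treat the filter as an assumption:

```
# Each route's destination is a node's Pod CIDR and its next hop is a node VM
gcloud compute routes list \
  --filter="name~^gke-" \
  --format="table(name,destRange,nextHopInstance)"
```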
I could see where some might consider this to be an overlay network. I don't believe that is the best characterization, because the Pod and Service IPs are addressable from other VMs in the network, outside of the GKE cluster.
For a deeper dive on GCP’s network infrastructure, GCP’s network virtualization whitepaper can be found here.