In GCP share a VPN gateway with other projects - networking

I'm starting to design the networks (VPC, subnetworks and so on) as part of moving a rather complex on-premise organizational structure to the cloud.
The chosen provider is GCP, and I have read up on it and taken the courses to become an associate engineer. However, the courses I followed don't go into the technical details of doing something like this; they just present the possible options.
My background is of a senior backend, then fullstack, developer. So I lack some of the very interesting and useful knowledge of a sysadmin unfortunately.
Our case is as follows:
On premise VMs on several racks, reachable only inside a VPN
Several projects on the GCP Cloud
Two of them need to connect to the on-premise VPN but there could be more
Some projects see each other's resources (VMs, SQL, etc.) using VPC Peering
Gradually we will abandon the on-premise setup, unless we find some legacy application that really is messed up
Now, I could just create a new VPN connection for every project from Hybrid Connectivity -> VPN, but I'd rather create a project dedicated to hosting the VPN gateway and allow other projects to use that resource.
Is this a possible configuration? Is it a valid design? As far as I have explored VPN creation, it seems that I'll have to create a VM that exposes an IP acting as the gateway. If that's the case, I was thinking of using VPC peering to allow other projects to reach the on-premise VPN. No idea if I'm talking gibberish here. I'm still waiting for some information (IKE shared key, etc.) before attempting anything, so I'm rather lost at this point.

You have to take several aspects into consideration:
Cost: if you set up a VPN in each project, and you have to double your connectivity for HA, it will be expensive. With a single gateway project, it's cheaper.
Cheaper implies a trade-off. A VPN has limited bandwidth: 3 Gbps (Cloud Interconnect is limited too, but its limit is higher and it costs more). If all your projects share the same VPN, watch out for this bottleneck.
If you want to share the VPN, at least for DEV/UAT projects, I recommend using VPC Peering: one VPN project, with the other projects peered to it. Be careful with the IP ranges you assign for peering. If you are interested, I wrote an article on this.
It's also possible to use Shared VPC, which is great! But it is compatible with fewer products (for example, the Serverless VPC connector for Cloud Functions and App Engine isn't yet compatible with Shared VPC).
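The warning about IP ranges matters because peered VPC networks cannot have overlapping subnet ranges, and the on-premise ranges reached through the VPN must not collide with them either. Below is a minimal sketch, using only Python's standard `ipaddress` module, of how an address plan could be checked before creating any peering. The project names and CIDR ranges are hypothetical and purely illustrative.

```python
import ipaddress
from itertools import combinations


def find_overlaps(ranges):
    """Return pairs of names whose CIDR ranges overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in ranges.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(networks), 2)
        if networks[a].overlaps(networks[b])
    ]


# Hypothetical address plan: names and ranges are made up for illustration.
plan = {
    "onprem": "10.0.0.0/16",
    "vpn-hub": "10.10.0.0/24",
    "project-dev": "10.20.0.0/20",
    "project-uat": "10.20.8.0/22",  # falls inside project-dev's /20
}

print(find_overlaps(plan))
```

Any pair reported here would have to be renumbered before the two networks can be peered.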

Related

Reduce latency between pod on OpenShift and VM on GCP

I have a configuration where I have:
Pods managed by OpenShift on GCP in a zone/region
VM on GCP in same zone/region
I need to reduce as much as possible the latency between those pods and the VM on GCP.
What are the available options for that?
My understanding is that they would need to be in same VPC but I don't know how to do that.
If you can point me to reference documentation, it would help me a lot.
Thanks for your help
You have 2 options for that:
The best one is to create a sub-project in the OpenShift project that shares the same VPC. This way the machines are in the same network, so the latency is as low as possible. However, this leads to management constraints (firewall rules, etc.). The average latency should be very low (< 1 ms).
Another option is to use a dedicated OpenShift project. This leads to higher latency because the path is longer (VPN => Shared Services => VPN). You also need to watch out for flows between regions: just because the machines are in the same project does not mean the flows stay out of another region. You must therefore optimize network routing through a tag that must be present on the MySQL machine. The latency in this case would vary between 2 and 10 ms. Of course, this latency can vary because the flows go through VPNs.
Placing your source and destination in the same VPC region will definitely reduce your latency. Even though latency is not only affected by distance, I have found this documentation regarding GCP Inter Region Latency, which could help you decide on your best scenario.
Now, going back to your question: I understand you have created a GCP cluster and a VM instance in the same zone/region but in different networks (VPC)? If possible, could you please clarify your scenario a bit more?
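To see which of these latency ranges a given setup actually lands in, it helps to measure from inside a pod. The following is a rough sketch, not an official tool: it times TCP handshakes with Python's standard `socket` module. The function names, the target address, and the port in the usage comment are assumptions for illustration.

```python
import socket
import statistics
import time


def tcp_connect_latency_ms(host, port, samples=10, timeout=2.0):
    """Time `samples` TCP handshakes to host:port; return each in milliseconds."""
    results = []
    for _ in range(samples):
        start = time.perf_counter()
        # create_connection returns once the TCP handshake has completed.
        with socket.create_connection((host, port), timeout=timeout):
            elapsed = time.perf_counter() - start
        results.append(elapsed * 1000.0)
    return results


def summarize(latencies_ms):
    """Return (min, median, max) of the measured latencies."""
    return min(latencies_ms), statistics.median(latencies_ms), max(latencies_ms)


# Usage (hypothetical internal IP and port of the VM):
#   lat = tcp_connect_latency_ms("10.0.0.5", 3306, samples=20)
#   print("min/median/max ms: %.2f / %.2f / %.2f" % summarize(lat))
```

A handshake takes roughly one network round trip, so the median gives a reasonable first estimate of pod-to-VM latency.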

Limits when running zerotier

We want to use ZeroTier to connect from one cloud machine to multiple remote machines. We do not want remote machines to access each other. What would be a good approach?
Use a single network and set rules based on tags to restrict access
Run multiple networks, each having cloud machine and a remote machine
Are there limits to
Number of members in a ZeroTier network
Number of ZeroTier networks a machine can connect to at a time - tun interfaces, IP conflicts or performance impact
I would use a single network and use rules to prevent peering between the machines. For instance, you could set the 192.168.141.0/25 portion of the network to prevent peering, and allow only defined network paths between hosts.
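The intended policy can be written down as a small truth function before translating it into actual network rules. The sketch below is plain Python that models the policy only. It is not ZeroTier's rule syntax, and the hub address is a made-up assumption. Only the 192.168.141.0/25 range comes from the answer above.

```python
import ipaddress

# Assumption for illustration: the cloud machine sits outside the isolated range.
HUB = ipaddress.ip_address("192.168.141.200")
# Range from the answer: members here must not talk to each other.
ISOLATED = ipaddress.ip_network("192.168.141.0/25")


def is_allowed(src, dst):
    """Allow only paths that have the hub (cloud machine) at one end."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    if src_ip in ISOLATED and dst_ip in ISOLATED:
        return False  # remote-to-remote traffic is dropped
    return HUB in (src_ip, dst_ip)


print(is_allowed("192.168.141.200", "192.168.141.10"))  # hub to remote
print(is_allowed("192.168.141.10", "192.168.141.11"))   # remote to remote
```

Writing the policy this way gives a set of cases to test the real rules against once they are deployed.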
Just a personal rant here: You don't want to do that. Really. You're going to make a headache for yourself when you have to scale horizontally (which you will if you're successful). I would STRONGLY recommend taking an mTLS approach to service authentication instead. Somewhat more work at the start, but a lot easier in the long run.

Migrate from legacy network in GCE

Long story short - I need to use networking between projects to have separate billing for them.
I'd like to reach all the VMs in different projects from a single point that I will use for provisioning systems (let's call it coordinator node).
It looks like VPC network peering is a perfect solution to this. But unfortunately one of the existing networks is "legacy". Here's what google docs state about legacy networks.
About legacy networks
Note: Legacy networks are not recommended. Many newer GCP features are not supported in legacy networks.
OK, naturally the question arises: how do you migrate out of legacy network? Documentation does not address this topic. Is it not possible?
I have a bunch of VMs, and I'd be able to shut them down one by one:
shutdown
change something
restart
Unfortunately, it does not seem possible to change the network even when the VM is down?
EDIT:
It has been suggested to recreate the VMs keeping the same disks. I would still need a way to bridge the legacy network with the new VPC network to make the migration smooth. Any thoughts on how to do that using the GCE toolset?
One possible solution - for each VM in the legacy network:
1. Get VM parameters (API get method)
2. Delete VM without deleting PD (persistent disk)
3. Create VM in the new VPC network using parameters from step 1 (and existing persistent disk)
This way stop-change-start is not so different from delete-recreate-with-changes. It's possible to write a script to fully automate this (migration of a whole network). I wouldn't be surprised if someone already did that.
UPDATE
The https://github.com/googleinterns/vm-network-migration tool automates the above process, plus it supports migration of a whole Instance Group or Load Balancer, etc. Check it out.
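For a hand-rolled script, the delete-and-recreate steps from this answer come down to transforming the instance description returned by the get call into the body of a new insert call. The sketch below makes no API calls. It only shows that transformation on a plain dict, using field names from the Compute Engine instance resource (`disks[].source`, `autoDelete`, `networkInterfaces[].network`). The helper names are hypothetical, and only a subset of fields is copied, so a real migration would need to carry over more (metadata, service accounts, and so on).

```python
def disks_to_preserve(instance):
    """Device names of disks that would be deleted together with the VM.

    Their autoDelete flag has to be turned off before the VM is deleted.
    """
    return [d["deviceName"] for d in instance.get("disks", []) if d.get("autoDelete")]


def build_recreate_body(instance, new_network, new_subnetwork):
    """Build an insert body that reuses the old disks on a new VPC network."""
    body = {
        "name": instance["name"],
        "machineType": instance["machineType"],
        # Reattach the existing persistent disks by their source URL.
        "disks": [
            {
                "source": d["source"],
                "boot": d.get("boot", False),
                "autoDelete": d.get("autoDelete", False),
            }
            for d in instance.get("disks", [])
        ],
        # Single interface pointing at the new VPC network and subnet.
        "networkInterfaces": [
            {"network": new_network, "subnetwork": new_subnetwork}
        ],
    }
    if "labels" in instance:
        body["labels"] = dict(instance["labels"])
    if "tags" in instance:
        # Copy only the tag values, not server-generated fields.
        body["tags"] = {"items": list(instance["tags"].get("items", []))}
    return body
```

The input dict is left unmodified, so it can be kept as a backup of the original configuration in case the recreate step fails.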

Confused about NFV implementation (Network Functions Virtualization)

I'm researching SDN and NFV.
The Wikipedia article on NFV says: "Network Functions Virtualization (NFV) is a network architecture concept that proposes using IT virtualization related technologies, to virtualize entire classes of network node functions into building blocks that may be connected, or chained, together to create communication services." So the first thing to consider is that it will reduce the cost of facilities.
So in a real-life implementation, for example, how can we virtualize a network node like a router?
NFV was created so that networks can be extended dynamically (virtualize the router) rather than statically (buy a new router). That is, we implement the router functions on a server or a computer instead of buying a new router and adapting it to the current network. In this case I don't see any difference, because buying a server to host a virtualized router is not cheaper than buying a new router.
Can anyone explain this to me, or am I misunderstanding the NFV concept?
Thanks.
SDN is just that, software defined networking. In a Hybrid SDN model SDN decouples the logic from the physical box, rendering the physical box a simple "forwarding" box. The logic rests with the SDN controller where developers create APIs that manage these forwarding boxes (we call them network elements now) with flow tables that get pushed to them. The benefit here is that the devices can now be configured and provisioned through this controller, as opposed to having to log into each and every box.
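As a toy illustration of that split, the sketch below models a controller that holds the logic and pushes a flow table to simple forwarding elements. It is a deliberately simplified Python model with hypothetical class names, not an implementation of any real SDN protocol or controller API.

```python
class ForwardingElement:
    """A simple box: it only looks up the flow table pushed by the controller."""

    def __init__(self, name):
        self.name = name
        self.flow_table = {}  # destination -> output port

    def forward(self, destination):
        # Unknown destinations are dropped in this toy model.
        return self.flow_table.get(destination, "drop")


class Controller:
    """Holds the logic and configures every registered element centrally."""

    def __init__(self):
        self.elements = []

    def register(self, element):
        self.elements.append(element)

    def push_flows(self, flows):
        # One call configures all boxes; nobody logs into them one by one.
        for element in self.elements:
            element.flow_table = dict(flows)


controller = Controller()
edge = ForwardingElement("edge-1")
controller.register(edge)
controller.push_flows({"10.0.1.0/24": "port-1", "10.0.2.0/24": "port-2"})
print(edge.forward("10.0.1.0/24"), edge.forward("10.9.9.0/24"))
```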
Then you have the cloud. A small office can literally get away with porting all of their apps and services into the cloud, doing away with most of their physical boxes. Of course you still need a LAN in the office and a way to get out to the Internet and eventually the cloud. You can even ask the cloud provider to provision load-balancing on specific applications, firewalls and content delivery services. So basically your office applications and most of the supporting LAN and databases can be safely ported to cloud providers.
When you said "...because buying a server to implement a virtualized router is not cheaper than buying a new router", it depends: as it's a virtualized resource, you can use this new server to run your router and another resource from your infrastructure, if the machine has more hardware capacity than you need for a single router.
In fact, you might not even need to buy a new machine. If you have your resources in a cloud like AWS (or your own private cloud), then when you need more routers you can flexibly allocate more hardware resources and spawn a new router instance (scale out). Whenever your router demand is lower than what you have allocated, you can reduce your number of routers (scale in) and stop losing money on infrastructure that you are not using at the moment.
Consider that a really high-level explanation. If you want to know the details of how a Virtual Network Function scales in and out in an NFV implementation, I recommend reading the ETSI specification about how it should work: http://www.etsi.org/standards-search#page=1&search=&title=1&etsiNumber=1&content=0&version=1&onApproval=1&published=1&historical=0&startDate=1988-01-15&endDate=2017-04-13&harmonized=0&keyword=&TB=789,,832,,831,,795,,796,,800,,798,,799,,797,,828&stdType=&frequency=&mandate=&collection=&sort=3
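The scale-out / scale-in decision described in this answer reduces to simple arithmetic. Here is a minimal sketch, assuming each virtual router instance handles a fixed amount of traffic. The function names and the capacity figures are illustrative assumptions, not taken from the ETSI specification.

```python
import math


def routers_needed(demand_mbps, capacity_per_router_mbps, minimum=1):
    """Number of virtual router instances needed to carry the demand."""
    return max(minimum, math.ceil(demand_mbps / capacity_per_router_mbps))


def scaling_action(current, demand_mbps, capacity_per_router_mbps):
    """Compare the running instance count with the target and pick an action."""
    target = routers_needed(demand_mbps, capacity_per_router_mbps)
    if target > current:
        return ("scale out", target - current)
    if target < current:
        return ("scale in", current - target)
    return ("hold", 0)


# Two instances running, demand rises to 2500 Mbps at 1000 Mbps per instance.
print(scaling_action(2, 2500, 1000))
```

With physical routers the "scale in" branch does not exist: once the box is bought, its cost stays even when demand drops.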
Let me continue with your example of the router. Traditionally, these routers are vendor specific. For example, the major sellers are companies like Cisco, Juniper, etc. They are implemented on proprietary hardware, and therefore if you want to buy a new router you can only buy it from them. Further, when they run into problems, you need a dedicated engineer to repair them. Therefore, the telecom operator has to bear high Capital Expenditure (CAPEX) and Operational Expenditure (OPEX).
With NFV, the entire router function is implemented as software and deployed on general purpose servers (GPP) or in the cloud. These GPPs are relatively cheap compared to proprietary hardware. Thanks to cloud computing, even small companies can afford servers on Amazon and Google clouds. Because the hardware is cheap and readily available, CAPEX is now lower. Further, you don't need a dedicated engineer when the hardware runs into a problem; the same engineer who handles GPP server maintenance is enough. This way OPEX is reduced.
Now imagine that, like routers, there are many networking elements in telecommunications. If every networking element requires a dedicated engineer, how much money will a telecom operator be spending? Apart from this, thanks to the software implementation, when you have much higher traffic than expected, you can just roll out a new router (software network function) on a GPP or in the cloud instead of buying a whole new router, which is very costly. As you already know, in the cloud you pay based on usage.
There are many more uses. To know more you need to read research papers.

NServiceBus messaging across private networks

I was assigned the re-architecture of a legacy (medical) product that controls several external devices. In the current architecture, we have several such stations in each customer's network, where each station processes its own data, and they all share some of that data via a central server (that talks to the DB and BLOB storage).
I'm planning the new architecture such that it will allow more scenarios, such as monitoring the stations through a web interface, and allowing data processing to be scalable by adding additional servers.
This led me to choose NServiceBus as the messaging and communication infrastructure. And I pretty much have a clear view of the new architecture.
However, another factor was recently added to the equation by my manager. He requires that the machine that communicates with the devices (hardware), will not be under the IT policies of the customer. The reason behind this, as I understand, is that we don't want the customer's IT to control OS updates, security, permissions and other settings, because we want full control over that machine in order to work properly with our hardware.
My manager thus added a requirement that this machine will be disconnected from the customer's LAN.
If I still want to deploy NServiceBus on that separated machine (because I want to pub/sub async messages to other machines - some are on the customer's LAN and some aren't), will it require some special deployment? Will it require an NServiceBus gateway?
EDIT: I removed the other (1st) question, as it wasn't relevant to the scope of StackOverflow.
Regarding question 2, yes it would require the use of a "Gateway", however the current NServiceBus Gateway implementation does not support pub/sub so you would have to look at alternatives.
