I've been working on some high-availability scenarios for some database servers and have looked at MaxScale and HAProxy. Both seem very interesting, but the problem is as follows.
When setting up MaxScale, everything went well until I discovered that there seems to be no way to create multiple clusters on the same MaxScale instance. This is a necessity given the number of database servers that will have to be controlled by one MaxScale instance.
Is there any way to implement multiple clusters when setting up MaxScale, or is this just something that isn't implemented in MaxScale?
Thank you for your help.
To use multiple clusters with one MaxScale, just define multiple servers, monitors, services and listeners. Here is an example of one cluster being used as a service:
# Backend servers for cluster 1
[server1]
type=server
address=127.0.0.1
port=3000
protocol=MariaDBBackend

[server2]
type=server
address=127.0.0.1
port=3001
protocol=MariaDBBackend

# Monitor that tracks the state of the cluster 1 backends
[Cluster-1-Monitor]
type=monitor
module=mariadbmon
servers=server1,server2
user=maxuser
passwd=maxpwd
monitor_interval=5000

# Read-write splitting service for cluster 1
[Cluster-1-Router]
type=service
router=readwritesplit
servers=server1,server2
user=maxuser
passwd=maxpwd

# Listener that exposes the service to clients on port 4006
[Cluster-1-Listener]
type=listener
service=Cluster-1-Router
protocol=MariaDBClient
port=4006
This exposes a read-write splitting service on port 4006 that balances queries across server1 and server2.
To define another cluster, just add:
- the servers that form the cluster
- a monitor for monitoring the servers
- a service that uses the servers
- a listener that connects to the service
This way you can expose multiple ports that connect to different clusters. For example, one cluster could listen on port 4006 and another on 4007. These two could then be used to connect two different applications to two different clusters.
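A sketch of that second cluster, mirroring the example above (the two extra backends on ports 3002 and 3003 are illustrative):

[server3]
type=server
address=127.0.0.1
port=3002
protocol=MariaDBBackend

[server4]
type=server
address=127.0.0.1
port=3003
protocol=MariaDBBackend

[Cluster-2-Monitor]
type=monitor
module=mariadbmon
servers=server3,server4
user=maxuser
passwd=maxpwd
monitor_interval=5000

[Cluster-2-Router]
type=service
router=readwritesplit
servers=server3,server4
user=maxuser
passwd=maxpwd

[Cluster-2-Listener]
type=listener
service=Cluster-2-Router
protocol=MariaDBClient
port=4007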
I have a system which stores a large volume of data. It runs on-premise and, after receiving data, uploads it to the cloud. For horizontal scaling I want to deploy multiple instances of the application behind a load balancer to provide HA capabilities. The systems sending data to my application use the TCP protocol, hence I need a Layer 4 load balancer. I am looking at the options available, and both HAProxy and Nginx seem to fit the need. What attributes should I evaluate to decide on the load balancer to use? Has anyone already used one of these load balancers for high-volume TCP traffic?
The problem is that I have 3 virtual machines built from the same source image in three different zones in a region. I can't put them in a MIG because each of them has to attach to a specific persistent disk, and from what I've researched, I have no control over which VM in the MIG will be attached to which persistent disk (please correct me if I'm wrong). I explored the unmanaged instance group option too, but it only has zonal scope. Is there any way to create a load balancer that works with my VMs, or do I have to build another solution (e.g. NGINX)?
You have two options.
1. Create an Unmanaged Instance Group. This allows you to set up your VM instances as you want, and since instance groups are zonal, you can create one group per zone within a region. See: Creating groups of unmanaged instances
2. Use a Network Endpoint Group. This supports specifying backends based on the Compute Engine VM instance's internal IP address. You can also specify other endpoint types, such as an Internet address. See: Network endpoint groups overview
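As a minimal sketch of option 1 (the group, zone, and instance names are illustrative), you would create one zonal unmanaged instance group per zone and add the existing VM to it:

# Create a zonal unmanaged instance group and add an existing VM to it;
# repeat for each of the three zones with the matching VM.
gcloud compute instance-groups unmanaged create ig-zone-a --zone=us-central1-a
gcloud compute instance-groups unmanaged add-instances ig-zone-a \
    --zone=us-central1-a --instances=my-vm-a

The three zonal groups can then be attached as backends of a single load balancer.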
The solution in my case was using a dedicated VM, installing nginx on it as a load balancer, and assigning a static IP to each VM in the group. I couldn't get managed or unmanaged instance groups and the managed load balancer to work.
That worked fine, but another solution I found in Qwiklabs was adding all instances to an "instance pool"; maybe I will implement that solution in the future.
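A minimal sketch of that setup, assuming the backend VMs answer HTTP on port 80 (the IPs are illustrative); this goes inside the http context of nginx.conf on the dedicated VM:

# Round-robin load balancing across the three static VM addresses
upstream backend_vms {
    server 10.128.0.2;
    server 10.128.0.3;
    server 10.128.0.4;
}

server {
    listen 80;
    location / {
        # Forward every request to one of the backends
        proxy_pass http://backend_vms;
    }
}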
I was researching Kubernetes and saw that it does load balancing on a single node. If I'm not wrong, one node means one server machine, so what good is load balancing on the same server machine? It will use the same CPU and RAM to handle the requests. I first thought load balancing would be done across separate machines, to share their CPU and RAM resources. So I want to know the point of load balancing on the same server.
Just because you can do it on one node doesn't mean you should, especially in a production environment:
- a production cluster will have at least 3 to 5 nodes
- Kubernetes spreads the replicas across the cluster nodes to balance node workload, so pods end up on different nodes
- you can also control which nodes your pods land on, using advanced scheduling with pod affinity and anti-affinity (see the sketch below)
- you can even plug in your own scheduler that refuses to place replica pods of the same app on the same node
- you then define a Service to load-balance across the pods on the different nodes, and kube-proxy does the rest
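A minimal sketch of the anti-affinity approach (all names and the image are illustrative): a Deployment whose replicas repel each other, so the scheduler places each pod on a different node.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      affinity:
        podAntiAffinity:
          # Hard rule: never schedule two my-app pods on the same node
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: my-app
              topologyKey: kubernetes.io/hostname
      containers:
        - name: my-app
          image: nginx:1.25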
here is a useful read:
https://itnext.io/keep-you-kubernetes-cluster-balanced-the-secret-to-high-availability-17edf60d9cb7
So you generally need to choose a level of availability you are comfortable with. For example, if you are running three nodes in three separate availability zones, you may choose to be resilient to a single node failure. Losing two nodes might bring your application down, but the odds of losing two data centres in separate availability zones are low.
The bottom line is that there is no universal approach; only you can know what works for your business and the level of risk you deem acceptable.
I guess you mean how Services do automatic load-balancing. Imagine you have a Deployment with 2 replicas on your one node and a Service. Traffic to the Pods goes through the Service, so if that were not load-balanced, everything would go to just one Pod and the other Pod would get nothing. By spreading the load evenly you can handle more of it, and you can still be confident that traffic will be served if one Pod dies.
You can also load-balance traffic coming into the cluster from outside, so that the entrypoint to the cluster isn't always the same node. But that is a different level of load-balancing: even with one node you may still want load-balancing for the Services within the cluster. See "Clarify Ingress load balancer" on load-balancing of the external entrypoint.
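As a minimal sketch (names and ports are illustrative), the Service that does this in-cluster load-balancing is just a selector over the Pods; kube-proxy then spreads connections across both replicas, even on a single node:

apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app          # matches the Pods of the 2-replica Deployment
  ports:
    - port: 80           # port the Service exposes inside the cluster
      targetPort: 8080   # port the Pods listen on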
I would like to synchronize only some databases on a cluster, with replicate-do-db.
If I use a Galera cluster, is all data sent over the network, or are nodes smart enough to fetch only their specific databases?
With "classic" master/slave MariaDB replication, filters are applied by the slave, causing network load for nothing if you don't replicate that database. To avoid this you have to configure a blackhole proxy to filter the binary logs (setup example), but the administration afterwards is not really easy. So it would be perfect if I could achieve the same thing with a cluster :)
binlog_* filters are applied on the sending (master) node.
replicate_* filters are applied on the receiving (slave) node.
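A minimal my.cnf sketch of the difference (db1 is an illustrative database name):

# On the master: binlog_* filters events before they are written to the
# binary log, so nothing for the other databases crosses the network.
[mysqld]
binlog_do_db = db1

# On the slave: replicate_* filters events only after they arrive,
# so the full binary log is still sent over the network.
[mysqld]
replicate_do_db = db1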
Is this filtered server part of the cluster? If so, you are destroying much of the beauty of Galera.
On the other hand, if this is a Slave hanging off one of the Galera nodes and the Slave does not participate in the "cluster", this is a reasonable architecture.
I am currently in the process of creating 3 Neo4j High Availability servers. My business logic leaves one server as a dedicated master, while the other two machines are dedicated slaves. My slaves exist in an entirely different datacenter than my master.
What is the best method to establish a link between the two applications? I've been able to establish connections using OpenVPN, but am curious whether that would be better than, say, SSH port forwarding. I'm not entirely sure how the ZooKeeper nodes need to communicate with each other. A VPN connection only creates a one-way link, where my master, for example, can open a connection to a slave, but the slave could not open one back to the master (I think?).
How should I do this? Thanks!
PS: My master is using an embedded instance of Neo4j, while the slaves are stand-alone instances (if this matters).
So your setup is not about availability, as the slaves cannot become masters anyway? Just about replication to the other datacenter?
You also need to take the Neo4j coordinator (ZooKeeper) into account, which is usually needed by all cluster participants.
My colleague suggested that you might get away with just putting the ZooKeeper (perhaps even just a single one, as you don't need master election) directly beside your master server.
Then the ability to connect into the master's VPN should be enough for the slaves to pull updates.