Datacenter DC1 and datacenter DC2 are 60 miles apart, and datacenter DC3 is 600 miles from both DC1 and DC2.
I would like to have a 3-node MariaDB Galera Cluster, one node in each data center.
Data Center : MariaDB Galera Node
DC1 : MDB01
DC2 : MDB02
DC3 : MDB03
Because of MariaDB Galera Cluster's multi-master synchronous replication, a transaction has to wait until it replicates to all three nodes. There will be latency because of node MDB03, which sits in datacenter DC3, 600 miles from the other two nodes.
Therefore I would like nodes MDB01 and MDB02 to replicate synchronously, with node MDB03 replicating asynchronously.
Is it possible to set up such a configuration in MariaDB Galera Cluster?
That configuration would lose most of the HA benefits.
Given that all 3 are fully in the cluster... Let's say the 600 miles equates to 25ms of round-trip latency. That is small enough for the typical user not to notice, and it means that you could COMMIT up to 40 transactions per second per connection. Is that not good enough?
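If you do decide to keep only MDB01 and MDB02 as Galera nodes and attach MDB03 as a conventional asynchronous replica, the rough shape is sketched below; hostnames, credentials and paths are placeholders, and note that a 2-node Galera cluster also needs a Galera Arbitrator (garbd) or similar to keep quorum when one node fails.

    # my.cnf fragment on MDB01 and MDB02 (the Galera nodes)
    [mysqld]
    wsrep_on              = ON
    wsrep_provider        = /usr/lib/galera/libgalera_smm.so
    wsrep_cluster_address = gcomm://mdb01.dc1.example,mdb02.dc2.example
    binlog_format         = ROW
    log_bin               = mariadb-bin
    log_slave_updates     = ON    # so writes arriving via Galera reach the async replica
    server_id             = 1     # unique per server

    -- on MDB03 (plain MariaDB in DC3, asynchronous replica of MDB01)
    CHANGE MASTER TO
      MASTER_HOST     = 'mdb01.dc1.example',
      MASTER_USER     = 'repl',
      MASTER_PASSWORD = '...',
      MASTER_USE_GTID = slave_pos;
    START SLAVE;

If MDB01 goes away, MDB03 has to be re-pointed at MDB02 with another CHANGE MASTER TO, which is part of the HA cost mentioned above.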
How do I set up multiple MariaDB servers in one single VM using Galera Cluster?
If configuration links are available, please share them with me.
I have searched the Galera website; it says to add nodes to the cluster, but not how to add multiple MariaDB server instances to the cluster.
Analysis of having 3 Galera nodes in a single server, under three layouts:
* All 3 in a single VM
* One in each of 3 VMs
* No VMs
Notes:
* Galera provides crash protection -- if a node goes down due to hardware failure, the other nodes continue serving the database. Not so when all of them share the same server and disk(s).
* By having multiple instances of MySQL (whether as Galera nodes or not), you can make better use of the CPUs. But since MySQL rarely needs all of the available CPU, I see no advantage in this configuration.
* Each instance uses some RAM for static things -- 3 instances lead to 3 copies of such. Other things (e.g., caches) scale with RAM size.
* There is no advantage in networking.
(There may be other reasons why there is virtually no difference between a single instance and multiple instances.)
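If you still want to try it, the usual approach is to give each instance its own data directory, client port and Galera ports. The mysqld_multi-style fragment below is only a sketch; all paths and port numbers are illustrative.

    # /etc/my.cnf -- one [mysqldN] group per instance for mysqld_multi
    [mysqld1]
    datadir                = /var/lib/mysql1
    port                   = 3306
    socket                 = /var/lib/mysql1/mysql.sock
    wsrep_on               = ON
    wsrep_provider         = /usr/lib/galera/libgalera_smm.so
    wsrep_cluster_address  = gcomm://127.0.0.1:4567,127.0.0.1:5567,127.0.0.1:6567
    wsrep_node_address     = 127.0.0.1:4567
    wsrep_provider_options = "gmcast.listen_addr=tcp://127.0.0.1:4567;ist.recv_addr=127.0.0.1:4568"

    [mysqld2]
    datadir                = /var/lib/mysql2
    port                   = 3307
    socket                 = /var/lib/mysql2/mysql.sock
    wsrep_on               = ON
    wsrep_provider         = /usr/lib/galera/libgalera_smm.so
    wsrep_cluster_address  = gcomm://127.0.0.1:4567,127.0.0.1:5567,127.0.0.1:6567
    wsrep_node_address     = 127.0.0.1:5567
    wsrep_provider_options = "gmcast.listen_addr=tcp://127.0.0.1:5567;ist.recv_addr=127.0.0.1:5568"

    # [mysqld3] follows the same pattern on ports 3308 / 6567 / 6568

The first instance is bootstrapped with an empty gcomm:// address (or --wsrep-new-cluster), and the other instances then join it.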
TL;DR - BFT cluster with 4-5 notary nodes grinds to a halt when one replica is killed.
I ran the notary demo and the Raft cluster (with 3 notary nodes) behaved as expected - when I kill the leader, there's an election and the notary cluster continues to provide a reliable service.
I expect the same thing to happen when I run a BFT cluster (with 4 notary nodes) - killing one of the replicas should not stop the cluster from providing a reliable notary service. However, here is what happens:
1) Start the BFT notary cluster
2) I can notarise 10 transactions using gradlew samples:notary-demo:notarise
3) Stop one of the replicas in the cluster
4) Try to notarise 10 transactions using gradlew samples:notary-demo:notarise
5) Wait for a few minutes; nothing happens (the transactions are not notarised)
6) All of the remaining replicas' terminals keep filling with "re-connecting to replica 1 at /127.0.0.1:11010"
Just to be on the safe side, I decided to add another notary node to the cluster. However, nothing changes - there are 5 notary nodes and killing one of them makes the cluster grind to a halt.
I looked into how BFT SMaRt works, but as far as I can tell, it should be able to tolerate any failures (including crash-stop) as long as there are enough working replicas (N >= 3f + 1).
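(With N = 4 that bound gives f = 1, and N = 5 also gives f = 1, so losing a single replica should be tolerated in both of my setups.)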
Is there something I'm missing here? Is the behaviour that I'm expecting unreasonable - BFT cluster with 4-5 notary nodes being able to tolerate 1 node dying? Or is that an issue with Corda?
It's hard to know what the issue was in this case, as there's not much information here. However, the Corda repo has updated this sample recently, so it may be worth revisiting the project to see if it works correctly now.
Here's a link to the recent 4.5 release notary demo:
https://github.com/corda/corda/tree/release/os/4.5/samples/notary-demo
I have a MariaDB Galera cluster with 2 nodes and it is up and running.
Before moving to production, I want to make sure that if a node crashes abruptly, it comes back up on its own.
I tried using systemd "restart", but after killing the mysql process the mariadb service does not come back up. Is there any tool or method I can use to automate bringing the nodes back up after crashes?
A Galera cluster needs to have quorum (3 nodes).
In order to avoid a split-brain condition, the minimum recommended number of nodes in a cluster is 3. Blocking state transfer is yet another reason to require a minimum of 3 nodes in order to enjoy service availability in case one of the members fails and needs to be restarted. While two of the members will be engaged in state transfer, the remaining member(s) will be able to keep on serving client requests.
You can read more here.
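As for automating the restart itself, one option is a systemd drop-in that forces a restart on failure. This is only a sketch and assumes the packaged unit is named mariadb.service:

    # systemctl edit mariadb   (creates /etc/systemd/system/mariadb.service.d/override.conf)
    [Service]
    Restart=on-failure
    RestartSec=5s

Keep in mind that if the whole cluster goes down at once, the most advanced node still has to be bootstrapped manually (galera_new_cluster), so an automatic restart only covers the single-node-crash case.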
Currently I have 2 data centers, and MariaDB master-master semi-sync replication will be employed to synchronize data between the 2 sites.
In order to improve local availability, we plan to deploy one more MariaDB server in each site to form master-slave replication. That is, cross-site replication is master-master, while local replication is master-slave.
I would like to know if this topology makes sense and is technically feasible.
Can MariaDB support these mixed modes of replication at the same time?
No, you can't have partial asynchronous master-slave and semi-sync replication on the same server.
I recommend moving either to Galera (3 sites are recommended to alleviate split brain, or devise an alternative resolution),
or to multi-master all-(servers)-to-all-(other-servers) replication (without log-slave-updates).
A Master can have any number of Slaves; those Slaves can be either local to the Master's datacenter, or remote. One of those "Slaves" can be another Master, thereby giving you "dual-Master".
For Dual-Master, I recommend writing to only one of them (until a failover).
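As a sketch of that layout for the original question (hostnames and credentials are placeholders, and the exact semi-sync variable names should be checked against your MariaDB version): each site's primary replicates from the other site's primary with semi-sync enabled, and each local slave replicates asynchronously from its own site's primary.

    -- on the DC2 primary, replicating from the DC1 primary (semi-sync)
    SET GLOBAL rpl_semi_sync_master_enabled = ON;   -- role when acting as master
    SET GLOBAL rpl_semi_sync_slave_enabled  = ON;   -- role when acting as slave
    CHANGE MASTER TO
      MASTER_HOST     = 'db1.dc1.example',
      MASTER_USER     = 'repl',
      MASTER_PASSWORD = '...',
      MASTER_USE_GTID = slave_pos;
    START SLAVE;

    -- on the DC1 local slave, plain asynchronous replication from the DC1 primary
    CHANGE MASTER TO
      MASTER_HOST     = 'db1.dc1.example',
      MASTER_USER     = 'repl',
      MASTER_PASSWORD = '...',
      MASTER_USE_GTID = slave_pos;
    START SLAVE;

Following the advice above, the application would write only to one primary (say, DC1) until a failover.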
These are partial HA solutions:
* Replication
* Dual-Master
* Semi-sync
* Using only 2 datacenters
Galera (and soon, Group Replication) are better than any combination of the above. But for good HA, you need 3 geographically separate datacenters (think floods, tornadoes, etc.).
I am not familiar with a restriction against async + semi-sync on the same server.
Be aware that every Slave must perform every write operation, so a Slave is not necessarily less busy than a Master. However, having more than one server for "reads" does spread out the read load.
For Galera, 3 nodes is recommended. 4 or 5 is OK; more than 5 may stress the network and the handshaking needed. Galera allows any number of Slaves hanging off each 'node'.
SETUP 1
3-node Cassandra cluster. Each node is on a different machine with 4 cores, 32 GB RAM, 800 GB SSD (disk), and 1 Gbit/s = 125 MBytes/sec network bandwidth.
2 cassandra-stress client machines with the exact same configuration as above.
Experiment 1: Ran one client on one machine, creating anywhere from 1 to 1000 threads with a consistency level of QUORUM. The max network throughput on a Cassandra node was around 8 MBytes/sec, with CPU usage of 85-90 percent on both the Cassandra node and the client.
Experiment 2: Ran two clients on two different machines, creating anywhere from 1 to 1000 threads with a consistency level of QUORUM. The max network throughput on a Cassandra node was around 12 MBytes/sec, with CPU usage of about 90 percent on the Cassandra node and on both clients.
I did not see double the throughput even though my clients were running on two different machines, but I can understand that the Cassandra node is CPU bound, and that's probably why. So that led me to Setup 2.
SETUP 2
3-node Cassandra cluster. Each node is on a different machine with 8 cores, 32 GB RAM, 800 GB SSD (disk), and 1 Gbit/s = 125 MBytes/sec network bandwidth.
2 cassandra-stress client machines with 4 cores, 32 GB RAM, 800 GB SSD (disk), and 1 Gbit/s = 125 MBytes/sec network bandwidth.
Experiment 3: Ran one client on one machine, creating anywhere from 1 to 1000 threads with a consistency level of QUORUM. The max network throughput on a Cassandra node was around 18 MBytes/sec, with CPU usage of 65-70 percent on the Cassandra node and >90% on the client node.
Experiment 4: Ran two clients on two different machines, creating anywhere from 1 to 1000 threads with a consistency level of QUORUM. The max network throughput on a Cassandra node was around 22 MBytes/sec, with CPU usage of <=75 percent on the Cassandra node and >90% on both client nodes.
So the question here is: with one client node I was able to push 18 MB/sec of network throughput, but with two client nodes running on two different machines I was only able to push a peak of 22 MB/sec. Why is this the case, even though this time the CPU usage on the Cassandra node is only around 65-70 percent on an 8-core machine?
Note: I stopped Cassandra and ran a tool called iperf3 on two different EC2 machines, and I was able to see a network bandwidth of 118 MBytes/second. I am converting everything into bytes rather than bits to avoid any confusion.
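For reference, the runs above correspond to invocations roughly like the following; the operation count, thread count and node addresses are illustrative, not the exact values used:

    # cassandra-stress write run against the 3-node cluster at CL=QUORUM
    cassandra-stress write n=1000000 cl=QUORUM -node 10.0.0.1,10.0.0.2,10.0.0.3 -rate threads=500

    # raw network check between two of the machines with iperf3
    iperf3 -s             # on one machine
    iperf3 -c 10.0.0.1    # on the other; this is where the ~118 MBytes/sec figure came from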