How to prioritize nodes in Pacemaker/Corosync? - nginx

I have followed a guide and created an Nginx HA cluster with a floating IP.
(Nginx, Corosync, and Pacemaker are being used.)
The guide I followed:
https://dzone.com/articles/how-to-configure-nginx-high-availability-cluster-u
I successfully created a 2-node cluster and both nodes are working fine.
When Node1 goes offline, Node2 is used, and vice versa.
My problem is that in my case Node1 should be the primary, meaning it should always be used whenever it is online.
To describe it better:
Node1 and Node2 are online -> Node1 is being used
Node1 goes offline -> Node2 is being used automatically
(The problem) When Node1 comes back online, Node2 is still being used
I need to manually stop Node2 if I want Node1 to be used again.
What exactly do I need to configure to make it automatically switch back to Node1 when it comes online?
Thank you in advance!

This is easily done via a simple infinity location constraint. In crmsh syntax this would appear like so:
location l_webserver_on_node1 hakase_balancing inf: node1
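To apply it you can feed that line to crm configure; if the cluster is managed with pcs instead, a rough equivalent (resource and node names taken from the question and the linked guide) would be:
# crmsh
crm configure location l_webserver_on_node1 hakase_balancing inf: node1
# pcs
pcs constraint location hakase_balancing prefers node1=INFINITY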
With that said, this does not adhere to best practices. In a well-designed HA cluster both nodes should be equal, and it should not matter where the services run.
I have seen situations where there are intermittent problems with node1. For example, say node1 seems to crash and reboot about once a day. This means that twice a day your service will encounter a brief interruption as it migrates to node2 and then back to node1 when it finishes rebooting. Ideally it should only migrate to node2 once, when node1 crashes the first time, and then stay there while you troubleshoot and repair or replace node1.
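If that stay-put behaviour is what you actually want instead of automatic fail-back, the usual knob is resource-stickiness rather than an infinity location constraint. A minimal sketch (the score of 100 is an arbitrary example):
# crmsh
crm configure rsc_defaults resource-stickiness=100
# pcs
pcs resource defaults resource-stickiness=100
With a stickiness score higher than the location preference, the resources stay on node2 after a failover; with the INFINITY constraint above they will always migrate back to node1.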

Related

MariaDB Spider with Galera Clusters failover solutions

I am having problems trying to build a database solution for an experiment that has to ensure HA and performance (sharding).
Right now I have a Spider node and two Galera clusters (3 nodes in each cluster), as shown in the figure below, and this configuration works well in general cases:
However, as far as I know, when the Spider engine performs sharding it must be given a primary IP per shard to which it distributes SQL statements, i.e. one node in each of the two Galera clusters.
So my first question here is:
Q1): When machine .12 goes down due to a failure, how can I make .13 or .14 (one of them) automatically take over from .12?
(Image: the servers that the Spider engine knows about)
Q2): Are there any open-source tools (or technologies) that can help me deal with this situation? If so, please explain how they work. (Maybe MaxScale? But I have never used it and don't know what it can do.)
Q3): The motivation for this experiment is as follows. An automated factory has many machines, and each machine generates data that must be recorded during the production process (maybe hundreds or thousands of records per second) to observe the operation of the machine and make the quality of each batch of products as good as possible.
So my question is: what do you think of this architecture (Figure 1)? Or please provide your suggestions.
You could use MaxScale in front of the Galera cluster to make the individual nodes appear like a combined cluster. This way Spider will be able to seamlessly access the shard even if one of the nodes fails. You can take a look at the MaxScale tutorial for instructions on how to configure it for a Galera cluster.
Something like this should work:
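The configuration itself is not shown here, but as a rough sketch (addresses, credentials, and section names are placeholders based on the .12/.13/.14 nodes from the question), a galeramon monitor plus a readconnroute service in /etc/maxscale.cnf could look like this:
[node12]
type=server
address=192.168.0.12
port=3306
protocol=MariaDBBackend

[node13]
type=server
address=192.168.0.13
port=3306
protocol=MariaDBBackend

[node14]
type=server
address=192.168.0.14
port=3306
protocol=MariaDBBackend

[Galera-Monitor]
type=monitor
module=galeramon
servers=node12,node13,node14
user=maxscale
password=maxscale_pw

[Galera-Service]
type=service
router=readconnroute
router_options=master
servers=node12,node13,node14
user=maxscale
password=maxscale_pw

[Galera-Listener]
type=listener
service=Galera-Service
protocol=MariaDBClient
port=3306
Spider would then point at the MaxScale listener instead of directly at .12, and galeramon picks a new "master" node if .12 goes down.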
This of course has the same limitation that a single database node has: if the MaxScale server goes down, you'll have to switch to a different MaxScale instance for that cluster. The benefit of using MaxScale is that it is, in some sense, stateless, which means it can be started and stopped almost instantly. A network load balancer (e.g. ELB) in front of it can already provide some form of protection against this problem.

ProxySQL: how to configure a failover?

How can I configure ProxySQL for failover, independent of whether a node is read-only or not? The 'weight' parameter in mysql_servers does not work properly in this case. I have some nodes (MariaDB 10.3) with master-master replication, and if node1 goes offline, node2 should handle the statements. When node1 comes back, it should be the first server again.
With my setup (all 3 servers in one hostgroup, only with different 'weight' values) the failover works, but with up to 9 seconds of delay for the first statements after failover, and when node1 comes back, node2 remains the active server.
What is the best way to configure a failover with a fixed prioritization? Should it be done in the hostgroup or in the rules? Or maybe it's not possible with ProxySQL at all?
PS: Maybe a similar question: ProxySQL active-standby setup
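For reference, the single-hostgroup weighting described above is normally entered through the ProxySQL admin interface roughly like this (hostnames and weights are placeholders):
-- ProxySQL admin interface (usually port 6032)
INSERT INTO mysql_servers (hostgroup_id, hostname, port, weight)
VALUES (0, 'node1', 3306, 10000),
       (0, 'node2', 3306, 1000),
       (0, 'node3', 3306, 100);
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;
Note that weights only bias the traffic distribution across all ONLINE servers in the hostgroup; they do not by themselves give a strict primary/standby ordering, which matches the behaviour described in the question.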

Can I use master-slave replication on a cluster?

I'm using MariaDB 5.5.60 on both Zabbix servers, which have a clustering solution between them for the zabbix-server service.
Can I use a master-slave solution on a cluster?
If I have node1 and node2, and they both have MariaDB on them,
node1 is the master and node2 is the slave.
If node1 goes down, can the slave keep the new information that is written to the database? Or do I need to do some kind of switchover to make the slave the master and vice versa?
Is there such a master-slave solution, or is there a better option?
"Master-Slave" involves continually updating the Slave from the Master. If the Master crashes, there is a small chance of something not having made it to the Slave, but otherwise, the Slave is 'always' identical to the Master.
"Failover" mostly involves redirecting traffic to the Slave and making it writable.
Then there is the hassle of setting up a new Slave to the new Master, etc.
But... you have added confusion to the question by using the word "cluster". That probably refers to another, more robust, replication technology. Does it also say "Galera" or "Group Replication" or "InnoDB Cluster"? Probably not, since you are on the rather old 5.5.
Please study what Zabbix provides (I don't know what it does) and study "Replication" in the MySQL/MariaDB documentation.
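As a rough illustration of what the Master-Slave setup plus manual failover described above looks like in SQL (host names, credentials, and binlog coordinates are placeholders):
-- On node2 (the Slave), point replication at node1 (the Master)
CHANGE MASTER TO
    MASTER_HOST='node1',
    MASTER_USER='repl',
    MASTER_PASSWORD='repl_password',
    MASTER_LOG_FILE='mysql-bin.000001',
    MASTER_LOG_POS=4;
START SLAVE;

-- Manual failover after node1 dies: promote node2 and redirect writes to it
STOP SLAVE;
SET GLOBAL read_only = OFF;
Afterwards comes the hassle mentioned above: rebuilding node1 as a Slave of the new Master.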

gke nginx lb health checks / can't get all instances in a "healthy" state

Using nginx-ingress-controller:0.9.0, below is the permanent state of the Google Cloud load balancer:
Basically, the single healthy node is the one running the nginx-ingress-controller pods. Apart from not looking good on this screen, everything works fine. The thing is, I'm wondering why such a bad status appears on the LB.
Here's the service/deployment used:
I'm just getting a little lost about how this works; I hope to get some experienced feedback on how to do things right (I mean, getting green lights on all nodes), or to double-check whether that's a drawback of not using the 'official' GCloud L7 load balancer.
Your Service is using the service.beta.kubernetes.io/external-traffic: OnlyLocal annotation. This configures it so that traffic arriving at the NodePort for that service will never go to a Pod on another node. Since your Deployment only has 1 replica, the only node that will receive traffic is the one where that 1 Pod is running.
If you scale your Deployment to 2 replicas, 2 nodes will be healthy, etc.
Using that annotation is a recommended configuration, so that you are not introducing additional network hops.
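For example, one way to get a second node reporting healthy (the deployment name below is an assumption based on the question) is simply:
# run a second replica so a second node passes the LB health check
kubectl scale deployment nginx-ingress-controller --replicas=2
# verify the replicas landed on different nodes
kubectl get pods -o wide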

Connect two apps to a MariaDB multi-master database

Suppose that we have two application servers (app1 and app2), and we set up multi-master MariaDB clustering with two nodes (node1 and node2) without any HAProxy. Can we connect app1 to node1 and app2 to node2, with both app1 and app2 writing to node1 and node2?
Does it cause any conflict?
Galera solves most of the problems that occur with Master-Master:
If one of Master-Master dies, now what? Galera recovers from any of its 3 nodes failing.
If you INSERT the same UNIQUE key value in more than one Master, M-M hangs; Galera complains to the last client to COMMIT.
If a node dies and recovers, the data is automatically repaired.
You can add a node without manually doing the dump, etc.
etc.
However, there are a few things that need to be done differently when using Galera: Tips
