Backup Galera cluster using mysqldump

I have a 3-node Galera MariaDB cluster and I want a supplementary backup taken with mysqldump, so that individual tables can be restored after user errors. Currently Node1 is used by all applications, while Node2 and Node3 are just kept in sync. I want to run mysqldump from the idle Node3. Should I avoid --flush-logs? And should I use the --master-data option?
I ran a mysqldump backup in a pre-prod cluster (same setup as production) from the idle node Node3 with these options:
mysqldump -u root -pPassword --host=localhost --all-databases --flush-logs --events --routines --single-transaction --master-data=2 --include-master-host-port
As soon as I ran mysqldump, the data in a few tables on Node3 (I checked only a few at random) was no longer in sync with the other nodes. Within a few minutes it came back in sync.
My questions are:
a) Should I avoid the --flush-logs option in my mysqldump command? Is it the cause of the current node going out of sync?
b) Should I even include the --master-data option in the mysqldump command?

Take Node3 out of the cluster.
Do whatever dump you like (mysqldump, copy disk, xtrabackup, etc.).
Put it back into the cluster; it will repair itself to get back in sync.
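One way to sketch these steps as a script. Assumptions: the node is desynced with the wsrep_desync variable instead of being shut down (a substitute technique, not necessarily what is meant above), credentials come from an option file such as ~/.my.cnf, and the helper names run and backup_node and the output path are made up for illustration. --flush-logs and --master-data are left out because they concern the binary log rather than Galera replication; add them back if you need binlog coordinates.

```shell
# Sketch only. Assumes the mysql/mysqldump clients read credentials from an
# option file (for example ~/.my.cnf). All function names are hypothetical.

# Run a command, or just print it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Desync this node, dump everything to the file given as $1, then resync.
backup_node() {
  out_file="$1"
  # wsrep_desync=ON: the node stays in the cluster and keeps receiving write
  # sets, but the rest of the cluster is not throttled if it falls behind.
  run mysql -e "SET GLOBAL wsrep_desync = ON" || return 1
  run mysqldump --all-databases --events --routines --single-transaction \
    --result-file="$out_file"
  dump_rc=$?
  # Always switch desync off again so the node catches up with the cluster.
  run mysql -e "SET GLOBAL wsrep_desync = OFF"
  return "$dump_rc"
}
```

With DRY_RUN=1 set, backup_node /backups/all.sql only prints the three commands, which is a safe way to review them before running anything on a real node.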

Related

Ubuntu (Oracle VM) - Mounted Samba shares hang indefinitely

I have a VM instance on Oracle Cloud (Ubuntu 22.04) set up with ZeroTier to act as a web server for some services that should work with my local Synology NAS.
For some of those services I also need to mount three SMB shares from my NAS with the ZeroTier tunnel, but I can't make it work.
I have used mount and mount.cifs plenty of times, with automounting too, but this time it acts very strangely:
running the mount command seems to succeed from the console, but /var/log/syslog reads
CIFS: VFS: \\XXX.XXX.XXX.XXX has not responded in 180 seconds.
Reconnecting...
if trying to access one of the shares (ls or lsof or cd or any other command), it succeeds for only one of the shares (always the same one), but only for the first time any command is given:
$ ls /temp
folder1 folder2 folder3
any following command just "hangs" as if the system is working on something, and most of the time it stays like that indefinitely:
$ ls /temp
█
Just a few times it spits out this error
lsof: WARNING: can't stat() cifs file system /temp
Output information may be incomplete.
ls 1475 ubuntu 3r DIR 0,44 0 123207681 /temp
findmnt reads:
└─/temp //XXX.XXX.XXX.XXX/Downloads cifs rw,relatime,vers=2.0,cache=strict, username=[redacted],uid=1005,noforceuid,gid=0,noforcegid,addr=XXX.XXX.XXX.XXX,file_mode=0755,dir_mode=0755,soft,nounix,serverino,mapposix,rsize=65536,wsize=65536,bsize=1048576,echo_interval=60,actimeo=1
for the remaining two "mounted" shares, none of them seems to respond to any command, not even the very first command, and they just hang like the one share that, at least, lets me browse for one time;
umount and umount -l take at least 2-3 minutes to successfully unmount the shares.
Same behavior when using smbclient and also with NFS shares from the same NAS.
What I have already tried:
update kernel and all packages;
remove, purge and reinstall cifs-utils, smbclient and so on...
tried mounting the same shares in another client / node within the ZeroTier network and it works just fine; also browsing from Windows and Android file manager apps with and without ZeroTier works flawlessly;
tried all SMB versions including SMBv3 and SMBv1 (CIFS);
tried different browsing or mounting methods / commands including mount, mount.cifs, autofs, smbclient;
tried to debug what happens behind the console, but didn't find anything that seems related to this in logs, htop or anything else. During the "hanging" sessions there is no spike in CPU, RAM or network usage on either the Oracle VM or the Synology NAS;
checked, reset and reconfigured all permissions on my NAS for shares, folders and files recursively and reconfigured users groups permissions.
What I haven't tried yet (I'll try as soon as possible):
reproduce this on another Oracle VM configured the same as the faulty one and another with a different base image (maybe Oracle Linux?);
It seems to me that the mount.cifs process doesn't really succeed in mounting the share correctly, as it doesn't show as such anywhere. It also seems to be an issue not related to folder/file permissions, but rather something related to networking?
A note on something that may or may not be related to this: ZeroTier on my Synology NAS does not seem to work with IPv4 only; it remains OFFLINE. The node goes ONLINE only when IPv6 is enabled, but I must say that this is the only node in my ZT network that shows an IPv6 address as its public IP in the ZT web GUI. The other nodes show IPv4 public addresses.
If anyone has any clue on this, I'll be happy to support and reproduce any advice. Thank you!

I'm using Tailscale, but I presume it will work the same with ZeroTier.
You need to add port 445 to /etc/iptables/rules.v4, just under the SSH rule, like below:
-A INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 445 -j ACCEPT
(the second line, for port 445, is the one to add)
Then you need to edit the interfaces in /etc/samba/smb.conf to:
interfaces = lo tailscale0 100.0.0.0/24
Obviously, my interface is tailscale0, but yours will be different; use ip link show to find it. You may also need to change the IP range to match ZeroTier's; 100.0.0.0/24 is the range from my Tailscale setup.
Then reboot!
I couldn't get it working without doing this.
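If you prefer not to edit the rules file by hand, the rule can be added with a small helper. This is only a sketch: add_smb_rule is a made-up name, and it assumes the SSH rule in your file ends in "--dport 22 -j ACCEPT" as in the lines above.

```shell
# Sketch only; add_smb_rule is a hypothetical helper name.
# Print the rules file given as $1 with a port-445 ACCEPT rule added
# directly after the SSH (port 22) ACCEPT rule. The file itself is not changed.
add_smb_rule() {
  awk '
    { print }
    /--dport 22 -j ACCEPT/ {
      print "-A INPUT -p tcp -m state --state NEW -m tcp --dport 445 -j ACCEPT"
    }
  ' "$1"
}

# Example:
#   add_smb_rule /etc/iptables/rules.v4 > /tmp/rules.v4.new
#   # review /tmp/rules.v4.new, then load it:
#   sudo iptables-restore < /tmp/rules.v4.new
```

Writing to a separate file first keeps the original rules untouched until you have checked the result.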

MariaDB Galera Cluster: issue with replication

Here is my setup:
4 VMs (running on CentOS 7)
VM1 with mariadb-client and maxscale for load balancing (I have tried haproxy; the results are the same), plus httpd and php (I am testing this with a WordPress installation)
VM2, VM3, VM4 with mariadb-server, galera, rsync
Software installation
adding repository "curl -sS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup | sudo bash" on all 4 VMs
installing MariaDB-server on VM2, VM3, VM4 (this includes galera and all the required software)
installing maxscale and MariaDB-client on VM1
Editing config files
on VM2, VM3, VM4 I have added:
https://gist.github.com/yarko686/5adb7b24784c4c3c24a526519623d930
to /etc/my.cnf.d/server.cnf
on VM1 I have added the following lines to /etc/maxscale.cnf https://gist.github.com/a67e94afaa4ecc57ccb985d897ee3e87.git
Starting the cluster
on VM2 I have executed galera_new_cluster
on VM3 and VM4 I have executed systemctl start mariadb
Checking the cluster
on VM2 I am accessing mysql using mysql -u root then executing:
show global status like 'wsrep_cluster_size';
I receive this output: https://gist.github.com/yarko686/a63c925b3275d239f38d50f0651e45ef which means that there are 3 machines in the cluster
Creating maxscale user and wordpress users
Login to MySQL CLI on VM2 using mysql -u root and executing the following commands
https://gist.github.com/yarko686/950ea62f79638a6f293c28b99dd19f7b
for the WordPress user I use the same commands, except that in this case I use wordpress_db.* instead.
The main issue.
After importing the WordPress database, it is properly created only on VM2. On VM3 and VM4 the database and tables are created, but for some reason they are empty.
If I access the wordpress database through the MySQL CLI as my wordpress user and create a new table with some data, it gets replicated. But when I add a user to my wp_users table (or add a user through wp-admin), it is not replicated: the record is created only on VM2 and not on VM3 and VM4.

Check whether the tables are InnoDB rather than MyISAM.
On my setup, when I imported old MyISAM tables, the tables would appear, but the data wouldn't replicate. I had to convert all of the tables to InnoDB.
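To check this across a whole server, something like the following could be used. It is a sketch: the function names are made up, the list of excluded system schemas is an assumption, and the generated ALTER statements should be reviewed before they are run.

```shell
# Sketch only; both function names are hypothetical.

# Print a query that lists base tables whose engine is not InnoDB,
# skipping the system schemas.
non_innodb_query() {
  cat <<'SQL'
SELECT table_schema, table_name, engine
FROM information_schema.tables
WHERE table_type = 'BASE TABLE'
  AND engine <> 'InnoDB'
  AND table_schema NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys');
SQL
}

# Read tab-separated "schema table engine" lines on stdin and print one
# ALTER TABLE ... ENGINE=InnoDB statement per table.
to_alter_statements() {
  tab=$(printf '\t')
  while IFS="$tab" read -r schema table rest; do
    if [ -n "$table" ]; then
      printf 'ALTER TABLE `%s`.`%s` ENGINE=InnoDB;\n' "$schema" "$table"
    fi
  done
}

# Example (mysql -N drops the header row, -B gives tab-separated output):
#   mysql -N -B -e "$(non_innodb_query)" | to_alter_statements > convert.sql
```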

Node stuck in LEAVING state after leaving from cluster in Riak

I have a 5-node Riak KV cluster in production. I made one node leave the cluster, but its status has been stuck at "leaving" for the last 7 days. How can I resolve this?
I tested force-remove as well as force-replace on a local cluster, using this command:
sudo riak-admin force-remove -f riak@172.xx.xx.8
and for force-replace I followed this link: https://gist.github.com/angrycub/4566736
but in this case I lose some data.
How do I fix this type of issue?

Don't use the force-remove command; use riak-admin cluster leave instead. For more details, see the answer to "Riak Force remove node from Riak KV cluster".
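For reference, a sketch of the staged leave sequence, to be run on the node that should leave. The run helper and the leave_cluster name are hypothetical; the three riak-admin cluster subcommands follow the usual stage, review, commit flow.

```shell
# Sketch only; run and leave_cluster are hypothetical helper names.

# Run a command, or just print it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Stage, review and commit a graceful leave for the local node.
leave_cluster() {
  run riak-admin cluster leave &&   # stage this node's departure
  run riak-admin cluster plan &&    # show the staged changes for review
  run riak-admin cluster commit     # apply them; data is then handed off
}
```

After the commit, riak-admin member-status and riak-admin transfers can be used to watch the node hand off its data before it exits.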

Galera first node won't start

I've been trying to set up a Galera Cluster. Since I'm new to Linux, I used the guide from mariadb (Link). I did everything as described there, but the first node just won't start when I use the command "service mysql start --wsrep-new-cluster". I always get the error:
Failed to open channel 'cluster1' at 'gcomm://10.1.0.11,10.1.0.12,10.1.0.13': -110 (Connection timed out)
My config file on all three nodes looks like this:
#mysql settings
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
query_cache_size=0
query_cache_type=0
bind-address=0.0.0.0
#galera settings
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name="cluster1"
wsrep_cluster_address="gcomm://10.1.0.11,10.1.0.12,10.1.0.13"
wsrep_sst_method=rsync

When starting the first cluster node, either remove the IP addresses from gcomm:// in the MySQL config or start the node with --wsrep_cluster_address="gcomm://"; that should do the trick.
Afterwards you can add the IP addresses back into the config, so that the current first node can rejoin the running cluster later.
I haven't looked into it deeply, but it looks like the "--wsrep-new-cluster" option is not handled correctly: the first node still looks for live nodes, so you must temporarily remove all cluster members (all IPs in the cluster_address field) on the first node.
Start all other nodes normally.
Newer OS versions use "bootstrap" instead of "--wsrep-new-cluster".
My versions: Debian 9.4.0, MariaDB 10.1.26, Galera 25.3.19-2.
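A rough sketch of that start-up order, using the service command form quoted above. The helper names (run, bootstrap_first_node, parse_cluster_size) are hypothetical, the status check assumes the mysql client can log in without a password prompt, and whether the extra option actually reaches mysqld depends on the init system in use.

```shell
# Sketch only; all function names are hypothetical.

# Run a command, or just print it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# First node: start with an empty member list so it does not wait for peers.
bootstrap_first_node() {
  run service mysql start --wsrep_cluster_address="gcomm://"
}

# Read "wsrep_cluster_size<TAB>N" lines on stdin and print N.
parse_cluster_size() {
  awk '$1 == "wsrep_cluster_size" { print $2 }'
}

# Example, on the first node:
#   bootstrap_first_node
#   mysql -N -B -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size'" | parse_cluster_size
# Then run "service mysql start" on the other nodes and check the size again.
```

Right after the bootstrap the reported size should be 1, and it should grow as the other nodes join.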

Node dev1@192.168.1.11 is not reachable

First, I followed "The Riak Fast Track" tutorial exactly to build a four-node Riak cluster.
Then I changed the IP from 127.0.0.1 to 192.168.1.11 in the dev[1-4]/etc/app.config files and reinstalled the cluster (deleted dev[1-4], fresh install).
But Riak tells me:
Node dev1@192.168.1.11 is not reachable when I issue dev2/bin/riak-admin cluster join dev1@192.168.1.11
What's wrong?

+1 to what Brian Roach said in the comment.
Make sure to update the node name and IP address in both the app.config files AND the vm.args, before you start up the node.
Make sure node dev1 is up and reachable, before issuing a cluster join command to dev2.
Meaning, make sure dev1/bin/riak ping returns a 'pong', etc.
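The two checks can be combined in a small sketch, assuming the devrel layout from the tutorial (dev1/bin, dev2/bin) and Riak's name@host form for node names. run and join_when_up are hypothetical helper names.

```shell
# Sketch only; run and join_when_up are hypothetical helper names.

# Run a command, or just print it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Join dev2 to the node named in $1 (for example dev1@192.168.1.11),
# but only if dev1 answers a ping first.
join_when_up() {
  target="$1"
  if run dev1/bin/riak ping; then
    run dev2/bin/riak-admin cluster join "$target"
  else
    echo "dev1 is not answering ping; not joining" >&2
    return 1
  fi
}
```

If the ping fails, it is worth rechecking that the node name in vm.args matches the name passed to the join command.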
