We are running a MariaDB Galera cluster across 3 datacenters and use the mariabackup tool to take backups in each one. Since the same data is replicated to all 3 datacenters, we want to run the backup script in only one DC, and if the DC that takes the backups goes down, the backups should run automatically in another DC. Any solution for this approach is much appreciated.
You need some sort of "Event & Trigger" to accomplish this.
I use Zabbix to monitor my daily mariabackup, and I have had the problem of the node being down while mariabackup was supposed to run.
I don't really care if I lose one day's backup, since I also have ZFS snapshot backups.
But if you want to, you can also set up a trigger action in Zabbix that runs the backup script on another server.
Another solution I would choose is SaltStack's 'beacon & reactor': a beacon can be created to send an event, and a reactor can be triggered to take some action. Since I run SaltStack on all my servers, this is the solution I would prefer.
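For what it's worth, the fallback logic can also be scripted without extra tooling. The sketch below is only an illustration: the DC priority list, paths and credentials are placeholders, and the reachability check is a plain ping rather than a proper Galera health check. Each DC runs the same script from cron, and a node only takes the backup if every higher-priority DC is unreachable.

#!/bin/bash
# Illustrative fallback sketch -- hostnames, paths and credentials are assumptions.
DC_PRIORITY="db1.dc1.example.com db2.dc2.example.com db3.dc3.example.com"
MY_HOST=$(hostname -f)
BACKUP_DIR=/backups/$(date +%F)

for host in $DC_PRIORITY; do
    if [ "$host" = "$MY_HOST" ]; then
        # Every higher-priority DC was unreachable (or we are first in the list):
        # take the backup here.
        mkdir -p "$BACKUP_DIR"
        mariabackup --backup --target-dir="$BACKUP_DIR" --user=backup --password=secret
        exit $?
    fi
    if ping -c 2 -W 3 "$host" >/dev/null 2>&1; then
        # A higher-priority DC is reachable; let it take the backup.
        exit 0
    fi
done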
I tend to run simulations on a cluster that produce files larger than 100 MB, and I can't keep my computer in sync with the cluster. So I considered setting up rsync between the two by following this link.
However, I believe this is just a cron job that syncs the backup server with the main server, and it doesn't work in both directions. What would be the step-by-step instructions to set up a bidirectional rsync?
Both systems run Linux.
Rsync isn't really the right tool for this job. You can sort of get it to work, using cron jobs and extremely carefully chosen parameters, but there's significant danger of data loss, especially if you want file deletion to propagate.
I'd recommend a tool like Syncthing for bidirectional sync. You want something that maintains an independent database of what's changed and what hasn't, and real-time updates are nice to have too.
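If you still want to see what those "carefully chosen parameters" look like, a rough cron-based sketch (the host name and paths here are placeholders of mine) is two one-way runs with --update and without --delete. It avoids overwriting newer files, but deletions are never propagated, which is exactly the limitation that makes a real sync tool preferable.

# Illustration only -- "cluster" and the paths are assumptions.
# --update skips files that are newer on the receiving side.
rsync -az --update /data/sim/ cluster:/data/sim/
rsync -az --update cluster:/data/sim/ /data/sim/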
I have a business need related to a MariaDB instance that should work in a master-slave configuration with failover.
Looking at the documentation, I have seen that it is possible to configure a multi-master cluster (Galera) or a simple master-slave replica.
Any suggestion to configure master-slave + failover?
Many thanks in advance
Roberto
MySQL/MariaDB master-slave replication is great for handling read-heavy workloads. It's also used as a redundancy strategy to improve database availability, and as a backup strategy (i.e. take the snapshot/backup on the slave to avoid interrupting the master). If you don't need a multi-master solution with all the headaches that brings—even with MySQL Cluster or MariaDB Galera Cluster—it's a great option.
It takes some effort to configure. There are several guides out there with conflicting information (e.g. MySQL vs. MariaDB, positional vs. GTID) and several decision points that can affect your implementation (e.g. row vs. statement binlog formats, storage engine selection), and you might have to stitch various pieces together to form your final solution. I've had good luck with MariaDB 10.1 (GTID, row binlog format) and mixed MyISAM and InnoDB storage engines. I create one slave user on the master per slave, and I don't replicate the mysql database. YMMV. This guide is a good starting place, but it doesn't really cover GTID.
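As a rough sketch of that kind of setup (hostnames, user names and passwords are placeholders, and the my.cnf settings are only summarized in comments; this is not a tested recipe):

# On the master (my.cnf: server_id=1, log_bin enabled, binlog_format=ROW).
# One replication user per slave, as mentioned above:
mysql -e "CREATE USER 'repl_slave1'@'slave1.example.com' IDENTIFIED BY 'secret';"
mysql -e "GRANT REPLICATION SLAVE ON *.* TO 'repl_slave1'@'slave1.example.com';"

# On the slave (my.cnf: a unique server_id, plus replicate_ignore_db=mysql
# to skip the mysql database):
mysql -e "CHANGE MASTER TO MASTER_HOST='master.example.com',
          MASTER_USER='repl_slave1', MASTER_PASSWORD='secret',
          MASTER_USE_GTID=slave_pos;"
mysql -e "START SLAVE;"
mysql -e "SHOW SLAVE STATUS\G"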
Failover is a whole separate ball of wax. You will need some kind of a reverse proxy (such as MaxScale or HAproxy) or floating IP address in front of your master that can adjust to master changes. (There might be a way to do this client-side, but I wouldn't recommend it.) Something has to monitor the health of the cluster, and when it comes time to promote a slave to the new master, there is a whole sequence of steps that have to be performed. MySQL provides a utility called mysqlfailover to facilitate this process, but as far as I know, it is not compatible with MariaDB. Instead, you might take a look at replication-manager, which seems to be MariaDB's Go-based answer to mysqlfailover. It appears to be a very sophisticated tool.
Master-Slave helps with failover, but does not provide it.
MariaDB Cluster (Galera) does provide failover for most cases, assuming you have 3 nodes.
After successfully installing DevStack and launching instances, once I reboot the machine I have to start all over again and I lose all the instances that were launched. I tried rejoin-stack but it did not work. How can I get the instances back after a reboot?
You might set resume_guests_state_on_host_boot = True in nova.conf. The file should be located at /etc/nova/nova.conf.
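For example (the option belongs in the [DEFAULT] section as far as I know; crudini is just one convenient way to set it and may need to be installed first):

# Sketch: set the flag in nova.conf and restart the compute service.
sudo crudini --set /etc/nova/nova.conf DEFAULT resume_guests_state_on_host_boot True
sudo service nova-compute restart   # under DevStack, restart the n-cpu screen window instead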
I've found some old discussion http://www.gossamer-threads.com/lists/openstack/dev/8772
AFAIK, at the present time OpenStack (Icehouse) is still not completely aware of the environments inside it, so it can't restore them completely after a reboot. The instances will be there (as virsh domains), but even if you start them manually or using nova flags, I'm not sure whether other facilities will handle this correctly (e.g. whether Neutron will correctly configure all L3 rules according to the DB records, etc.). Honestly, I'm pretty sure they won't...
The answer depends on what you need to achieve:
If you need a template environment (e.g. a similar set of instances and networks each time after reboot), you can just script everything. In other words, write a bash script that creates everything you need and run it each time after stack.sh (see the sketch after this list). Make sure you're starting with a clean environment, since the OpenStack DB state persists between ./unstack.sh and ./stack.sh or ./rejoin-stack.sh (you might try to just clean the DB, or delete it; stack.sh will rebuild it).
If you need a persistent environment (e.g. you don't want to lose the VMs and the whole infrastructure state after a reboot), I'm not aware of a way to do this with OpenStack. For example, the Neutron agents (which configure iptables, DHCP, etc.) do not save state and are driven by events from the Neutron service. They will not restore themselves after a reboot, so the network will be dead. I'd be very glad if someone shared a method for such a recovery.
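For the template-environment case above, such a script might look roughly like this (the image, flavor and network names are placeholders; adjust them to whatever your stack.sh run provides):

#!/bin/bash
# Rough template-environment script to run after stack.sh.
source openrc admin admin                  # load DevStack credentials

neutron net-create private-net
neutron subnet-create private-net 10.0.0.0/24 --name private-subnet
NET_ID=$(neutron net-list | awk '/ private-net /{print $2}')   # grab the net ID from the table output

for i in 1 2 3; do
    nova boot --flavor m1.tiny --image cirros-0.3.2-x86_64-uec \
         --nic net-id="$NET_ID" "vm$i"
done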
In general, I think OpenStack is not focusing on this and will not during the nearest release cycles. The common approach is to have a multi-node environment where each node is replaceable.
See http://docs.openstack.org/high-availability-guide/content/ch-intro.html for reference
DevStack is an ephemeral environment; it is not supposed to survive a reboot. This is not supported behavior.
That being said, you might find success in re-initializing the environment by running
./unstack.sh
followed by
./stack.sh
again.
Again, DevStack is an ephemeral environment. Its primary purpose is to run gate testing for OpenStack's CI infrastructure.
Or try ./rejoin-stack.sh to re-join the previous screen sessions.
Does anyone know how to run a number of processes in the background, either through a job queue or parallel processing?
I have a number of maintenance updates that take time to run and want to do this in the background.
I would recommend a Gearman server; it has proved quite stable. It's totally outside of Symfony2, and you have to have the server up and running (I don't know what your hosting options are), but it distributes jobs perfectly. In its skinniest version it just keeps all jobs in memory, but you can configure it to use an SQLite database as a backup, so if for any reason the server reboots or the Gearman daemon breaks, you can just start it again and your jobs will be preserved. I know it has been tested with very large loads (adding up to 1k jobs per second) and it stood its ground. It's probably even more stable nowadays; I'm speaking from experience from two years ago, when we offloaded some long-running tasks in a ZF application to background processing via Gearman. The model itself is simple: clients submit jobs to the Gearman job server, which hands them out to registered workers.
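As a quick illustration of that client/worker split (entirely outside Symfony; the function name and the command it runs are made up), the gearman command-line tool alone can show the pattern:

# Worker: register the function "maintenance" and run the given command for each job.
gearman -w -f maintenance -- ./run_maintenance.sh &

# Client: submit a job in the background (-b) so the caller returns immediately.
gearman -b -f maintenance "customer_id=42"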
Check out RabbitMQ. It's the most popular option according to knpbundles.com.
Take a look at http://github.com/mmoreram/rsqueue-bundle
It uses Redis as the queue core and will be maintained.
Take a look at the enqueue library. There are a lot of transports (AMQP, STOMP, Amazon SQS, Redis, Filesystem, Doctrine DBAL and more) to choose from. It is easy to use and feature rich. That would be enough for a simple job queue, though if you need something more sophisticated, look at enqueue/job-queue. It can run an exclusive job (only one job running at a given time), a job with sub-jobs, or a job with something to do after it has been done.
Of course, there is a bundle for it.
I have a job running using Hadoop 0.20 on 32 spot instances. It has been running for 9 hours with no errors. It has processed 3800 tasks during that time, but I have noticed that just two tasks appear to be stuck and have been running alone for a couple of hours (apparently responding because they don't time out). The tasks don't typically take more than 15 minutes. I don't want to lose all the work that's already been done, because it costs me a lot of money. I would really just like to kill those two tasks and have Hadoop either reassign them or just count them as failed. Until they stop, I cannot get the reduce results from the other 3798 maps!
But I can't figure out how to do that. I have considered trying to figure out which instances are running the tasks and then terminating those instances, but I don't know how to figure out which instances are the culprits, and I am afraid doing so will have unintended effects.
How do I just kill individual map tasks?
Generally, on a Hadoop cluster you can kill a particular task by issuing:
hadoop job -kill-task [attempt_id]
This will kill the given map task and re-submit it on a different node with a new ID.
To get the attempt_id, navigate in the JobTracker's web UI to the map task in question, click on it and note its ID (e.g. attempt_201210111830_0012_m_000000_0).
SSH to the master node, as mentioned by Lorand, and execute:
bin/hadoop job -list
bin/hadoop job -kill <JobID>
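Putting the two answers together (the attempt ID below is the placeholder from above; take the real ones from the job list and the JobTracker UI):

hadoop job -list                                              # find the job ID
hadoop job -kill-task attempt_201210111830_0012_m_000000_0    # kills the attempt and lets it be re-scheduled
# Alternatively, -fail-task marks the attempt as failed, which counts towards
# mapred.map.max.attempts:
hadoop job -fail-task attempt_201210111830_0012_m_000000_0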