Galera /var/lib/mysql/grastate.dat missing after upgrade - mariadb

I am upgrading my Galera MariaDB cluster from MariaDB 10.3 to MariaDB 10.6. When I try to start a new cluster with sudo /usr/bin/galera_new_cluster, I see this error:
[ERROR] WSREP: It may not be safe to bootstrap the cluster from this node. It was not the last one to leave the cluster and may not contain all the updates. To force cluster bootstrap with this node, edit the grastate.dat file manually and set safe_to_bootstrap to 1 .
I am seeing references to set the variable safe_to_bootstrap to 1 in /var/lib/mysql/grastate.dat.
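The examples I have seen show a file that looks roughly like this (the uuid value here is just a placeholder):
# GALERA saved state
version: 2.1
uuid:    <cluster-state-uuid>
seqno:   -1
safe_to_bootstrap: 1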
However, there is no grastate.dat file in that directory on any of my cluster nodes. This is one of my QA servers. Any pointers on whether I can create this file from scratch?
Thanks

Related

MariaDB Galera Cluster: issue with replication

Here is my setup:
4 VMs (running on CentOS 7)
VM1 with mariadb-client and maxscale for load balancing (I have tried haproxy, the results are the same), plus httpd and php (I am testing this with a WordPress installation)
VM2, VM3, VM4 with mariadb-server, galera, rsync
Software installation
adding the repository on all 4 VMs: curl -sS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup | sudo bash
installing MariaDB-server on VM2, VM3, VM4 (this includes galera and all the required software)
installing maxscale and MariaDB-client on VM1
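In terms of commands, that is roughly the following (package names here are assumptions based on the MariaDB repository, so double-check them):
# on VM2, VM3, VM4
sudo yum install MariaDB-server rsync
# on VM1
sudo yum install maxscale MariaDB-client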
Editing config files
on VM2, VM3, VM4 I have added:
https://gist.github.com/yarko686/5adb7b24784c4c3c24a526519623d930
to /etc/my.cnf.d/server.cnf
on VM1 I have added the following lines to /etc/maxscale.cnf https://gist.github.com/a67e94afaa4ecc57ccb985d897ee3e87.git
Starting the cluster
on VM2 I have executed galera_new_cluster
on VM3 and VM4 I have executed systemctl start mariadb
Checking the cluster
on VM2 I am accessing mysql using mysql -u root then executing:
show global status like 'wsrep_cluster_size';
I receive this output https://gist.github.com/yarko686/a63c925b3275d239f38d50f0651e45ef which means that there are 3 machines in the cluster.
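Schematically, that output looks like:
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 3     |
+--------------------+-------+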
Creating maxscale user and wordpress users
Log in to the MySQL CLI on VM2 using mysql -u root and execute the following commands:
https://gist.github.com/yarko686/950ea62f79638a6f293c28b99dd19f7b
For the WordPress user I use the same commands, except that in that case I use wordpress_db.* instead.
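Roughly, the commands for the WordPress user are of this shape (user name and password here are placeholders, not the actual values from the gist):
CREATE USER 'wordpress'@'%' IDENTIFIED BY 'some_password';
GRANT ALL PRIVILEGES ON wordpress_db.* TO 'wordpress'@'%';
FLUSH PRIVILEGES;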
The main issue.
After importing the WordPress database, it is properly populated only on VM2. On VM3 and VM4 the database and tables are created, but for some reason they are empty.
If I access the wordpress database through the MySQL CLI using my wordpress user and create a new table with some data, it gets replicated, but when I add a user to my wp_users table (or add a user through wp-admin), it is not replicated. The record gets created only on VM2 and not on VM3 and VM4.
Check whether the tables are InnoDB instead of MyISAM.
I know that on my setup, when I imported old MyISAM tables, the tables would appear but the data wouldn't replicate. I had to convert all of the tables to InnoDB.
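A quick way to check the engines and convert a table (assuming the database is named wordpress_db; wp_users is just one example table):
SELECT table_name, engine
FROM information_schema.tables
WHERE table_schema = 'wordpress_db';

ALTER TABLE wordpress_db.wp_users ENGINE=InnoDB;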

Node stuck in LEAVING state after leaving from cluster in Riak

I have a 5-node Riak KV cluster in production. I removed a node from the cluster for a particular reason, but its status has been stuck at "leaving" for the last 7 days. How do we resolve this?
I tested force-remove, as well as force-replace of a node from a cluster, locally using the command
sudo riak-admin force-remove -f riak@172.xx.xx.8
and for force-replace I followed this link: https://gist.github.com/angrycub/4566736
but in this case I lost some data.
How do I fix this type of issue?
Don't use the force-remove command; use riak-admin cluster leave instead. See this answer for more details: Riak Force remove node from Riak KV cluster.
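A minimal sketch of the staged flow, using the node name from the question:
sudo riak-admin cluster leave riak@172.xx.xx.8
sudo riak-admin cluster plan
sudo riak-admin cluster commit
# then watch progress until the node finishes leaving
sudo riak-admin member-status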

Installation of Riak under Ubuntu 14.04 LTS

I can't get Riak to work on Ubuntu 14.04 LTS using the instructions under
http://docs.basho.com/riak/latest/ops/building/installing/debian-ubuntu/.
When running riak start I get:
riak failed to start within 15 seconds,
see the output of 'riak console' for more information.
If you want to wait longer, set the environment variable
WAIT_FOR_ERLANG to the number of seconds to wait.
When running riak console afterwards:
Exec: /usr/lib/riak/erts-5.10.3/bin/erlexec -boot /usr/lib/riak/releases/2.1.3/riak -config /var/lib/riak/generated.configs/app.2016.02.28.21.43.04.config -args_file /var/lib/riak/generated.configs/vm.2016.02.28.21.43.04.args -vm_args /var/lib/riak/generated.configs/vm.2016.02.28.21.43.04.args -pa /usr/lib/riak/lib/basho-patches -- console -x
Root: /usr/lib/riak
Erlang R16B02_basho8 (erts-5.10.3) [source] [64-bit] [smp:2:2] [async-threads:64] [kernel-poll:true] [frame-pointer]
[os_mon] memory supervisor port (memsup): Erlang has closed
[os_mon] cpu supervisor port (cpu_sup): Erlang has closed
{"Kernel pid terminated",application_controller,"{application_start_failure,riak_core,{bad_return,{{riak_core_app,start,[normal,[]]},{'EXIT',{{function_clause,[{orddict,fetch,['riak#127.0.0.1',[{'riak#54.194.69.48',[{{riak_core,bucket_types},[true,false]},{{riak_core,fold_req_version},[v2,v1]},{{riak_core,net_ticktime},[true,false]},{{riak_core,resizable_ring},[true,false]},{{riak_core,security},[true,false]},{{riak_core,staged_joins},[true,false]},{{riak_core,vnode_routing},[proxy,legacy]},{{riak_pipe,trace_format},[ordsets,sets]}]}]],[{file,\"orddict.erl\"},{line,72}]},{riak_core_capability,renegotiate_capabilities,1,[{file,\"src/riak_core_capability.erl\"},{line,441}]},{riak_core_capability,handle_call,3,[{file,\"src/riak_core_capability.erl\"},{line,213}]},{gen_server,handle_msg,5,[{file,\"gen_server.erl\"},{line,585}]},{proc_lib,init_p_do_apply,3,[{file,\"proc_lib.erl\"},{line,239}]}]},{gen_server,call,[riak_core_capability,{register,{riak_core,vnode_routing},{capability,[proxy,legacy],legacy,{riak_core,legacy_vnode_routing,[{true,legacy},{false,proxy}]}}},infinity]}}}}}}"}
Any idea how to fix this? Installation has been done via apt-get. Default riak.conf. Riak version is 2.1.3.
This is a Riak error, not at all related to Ubuntu.
The error message indicates that the current name of the node does not match the name of any node in the ring file. This can happen if you start the node with a default configuration before configuring the node's name. See Note on changing the name value at http://docs.basho.com/riak/latest/ops/building/basic-cluster-setup/
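If the node should keep its existing ring, one fix (a sketch) is to set nodename in /etc/riak/riak.conf back to the name the ring was created under (the 54.194.69.48 node in the error above), then restart Riak:
# /etc/riak/riak.conf
nodename = riak@54.194.69.48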
If this is a singleton node, the simplest solution is to delete the files in /var/lib/riak/ring (make a backup first). A new ring will be created when you start the node.
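A minimal sketch of that reset:
sudo riak stop
sudo cp -a /var/lib/riak/ring /var/lib/riak/ring.bak   # backup first
sudo rm /var/lib/riak/ring/*
sudo riak start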

Galera first node won't start

I've been trying to set up a Galera cluster. Since I'm new to Linux, I used the guide from MariaDB (Link). I set everything up as described there, but the first node just won't start when I use the command "service mysql start --wsrep-new-cluster". I always get the error:
Failed to open channel 'cluster1' at 'gcomm://10.1.0.11,10.1.0.12,10.1.0.13': -110 (Connection timed out)
My config file on all three nodes looks like this:
#mysql settings
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
query_cache_size=0
query_cache_type=0
bind-address=0.0.0.0
#galera settings
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name="cluster1"
wsrep_cluster_address="gcomm://10.1.0.11,10.1.0.12,10.1.0.13"
wsrep_sst_method=rsync
Change the MySQL config (remove the IP addresses from gcomm://) before starting the 1st cluster node, or start the cluster with --wsrep_cluster_address="gcomm://"; that should do the trick.
Then you can add those IP addresses back into the config, so that the current 1st node can rejoin the running cluster.
I haven't looked into it deeply, but it looks like the option "--wsrep-new-cluster" is not handled correctly: the 1st node still looks for live nodes, so you must temporarily remove all members of the cluster from the 1st node's config (all IPs from the cluster_address field).
Start all other nodes normally.
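In practice that is roughly (a sketch, using the same service command as in the question):
# 1st node only, bootstrap with an empty cluster address
sudo service mysql start --wsrep_cluster_address="gcomm://"
# remaining nodes
sudo service mysql start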
Newer OS versions use "bootstrap" instead of "--wsrep-new-cluster".
My versions: Debian 9.4.0, MariaDB 10.1.26, Galera 25.3.19-2.

Impala 1.2.1 ERROR: Couldn't open transport for localhost:26000(connect() failed: Connection refused)

Using impala-shell, I can see the Hive metastore, use any database created by Hive, and query any table created by Hive. When I try to create a table in impala-shell or run "invalidate metadata", I get
"ERROR: Couldn't open transport for localhost:26000(connect() failed: Connection refused)"
I have the following configuration. This is a multi-node cluster built by hand, i.e. without using Cloudera Manager:
CentOS 6
CDH4.5
Impala 1.2.1
Hive MySQL Metastore
impalad is running on multiple nodes, alongside the data nodes
statestored and catalogd are running on a single node that is NOT an impalad node
In /etc/default/impala I have changed IMPALA_STATE_STORE_HOST to point to IP of the statestored machine
From /var/log/impala/catalogd.INFO, it seems port 26000 is used by the catalog service, as there is a line in this file: "--catalog_service_port=26000"
Just as /etc/default/impala has to tell impalad where the statestore is (using IMPALA_STATE_STORE_HOST), I am wondering whether for 1.2.1 (where catalogd was introduced) there has to be an additional entry for the catalogd location as well - just a guess ....
Any help is appreciated.
Thanks,
You have to start impalad with the option -catalog_service_host=fqdn_to_your_catalog_host.
Unfortunately this is not yet in the default configuration, so you have to add it yourself.
Change /etc/default/impala:
CATALOG_SERVICE_HOST=fqdn_to_your_catalog_host
and add -catalog_service_host=${CATALOG_SERVICE_HOST} to IMPALA_SERVER_ARGS (sketched below).
Restart impalad and it should work now :-)
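For reference, the edited part of /etc/default/impala would look roughly like this (the FQDN is a placeholder, and the append assumes IMPALA_SERVER_ARGS is already defined earlier in that file):
CATALOG_SERVICE_HOST=catalog-host.example.com
# append the catalogd location to the existing impalad arguments
IMPALA_SERVER_ARGS="${IMPALA_SERVER_ARGS} -catalog_service_host=${CATALOG_SERVICE_HOST}"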

Resources