Node dev1@192.168.1.11 is not reachable - riak

First, I followed "The Riak Fast Track" tutorial exactly to build a four-node Riak cluster.
Then I changed the IP from 127.0.0.1 to 192.168.1.11 in the dev[1-4]/etc/app.config files and reinstalled the cluster (deleted dev[1-4], fresh install).
But when I issue dev2/bin/riak-admin cluster join dev1@192.168.1.11, Riak tells me:
Node dev1@192.168.1.11 is not reachable
What's wrong?

+1 to what Brian Roach said in the comment.
Make sure to update the node name and IP address in both the app.config files AND the vm.args, before you start up the node.
Make sure node dev1 is up and reachable, before issuing a cluster join command to dev2.
Meaning, make sure dev1/bin/riak ping returns a 'pong', etc.
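A quick way to confirm the vm.args half of that advice is to read the -name line back and compare its host part with the IP you expect. This is only a minimal sketch: it assumes vm.args holds a line of the form "-name dev1@192.168.1.11" with comment lines starting with "#", and the helper names are made up for illustration.

```python
import re


def node_name_from_vm_args(text):
    """Return the value of the first -name line in vm.args text, or None."""
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#"):
            # Skip comment lines.
            continue
        match = re.match(r"-name\s+(\S+)", line)
        if match:
            return match.group(1)
    return None


def uses_ip(node_name, ip):
    """True if the host part of a name@host node name equals ip."""
    return node_name is not None and node_name.split("@", 1)[-1] == ip
```

Run it against the text of each dev[1-4]/etc/vm.args; a node whose name still ends in 127.0.0.1 was not updated.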

Related

Ubuntu (Oracle VM) - Mounted Samba shares hang indefinitely

I have a VM instance on Oracle Cloud (Ubuntu 22.04) set up with ZeroTier to act as a web server for some services that should work with my local Synology NAS.
For some of those services I also need to mount three SMB shares from my NAS over the ZeroTier tunnel, but I can't make it work.
I have used mount and mount.cifs plenty of times, with automounting too, but this time it acts very strangely:
running the mount command seems to succeed from the console, but /var/log/syslog reads
CIFS: VFS: \\XXX.XXX.XXX.XXX has not responded in 180 seconds.
Reconnecting...
when I try to access the shares (ls, lsof, cd or any other command), it succeeds for only one of them (always the same one), and only the first time a command is given:
$ ls /temp
folder1 folder2 folder3
any following command just "hangs" as if the system is working on something, and most of the time it stays like that indefinitely:
$ ls /temp
█
Just a few times it spits out this error
lsof: WARNING: can't stat() cifs file system /temp
Output information may be incomplete.
ls 1475 ubuntu 3r DIR 0,44 0 123207681 /temp
findmnt reads:
└─/temp //XXX.XXX.XXX.XXX/Downloads cifs rw,relatime,vers=2.0,cache=strict, username=[redacted],uid=1005,noforceuid,gid=0,noforcegid,addr=XXX.XXX.XXX.XXX,file_mode=0755,dir_mode=0755,soft,nounix,serverino,mapposix,rsize=65536,wsize=65536,bsize=1048576,echo_interval=60,actimeo=1
the remaining two "mounted" shares do not respond to any command, not even the very first one; they just hang like the one share that at least lets me browse once;
umount and umount -l take at least 2-3 minutes to successfully unmount the shares.
Same behavior when using smbclient and also with NFS shares from the same NAS.
What I have already tried:
update kernel and all packages;
remove, purge and reinstall cifs-utils, smbclient and so on...
tried mounting the same shares in another client / node within the ZeroTier network and it works just fine; also browsing from Windows and Android file manager apps with and without ZeroTier works flawlessly;
tried all SMB versions including SMBv3 and SMBv1 (CIFS);
tried different browsing or mounting methods / commands including mount, mount.cifs, autofs, smbclient;
tried to debug what happens behind the console, but didn't find anything that seems related to this in logs, htop or anywhere else. During the "hanging" sessions there is no spike in CPU, RAM or network usage on either the Oracle VM or the Synology NAS;
checked, reset and reconfigured all permissions on my NAS for shares, folders and files recursively and reconfigured users groups permissions.
What I haven't tried yet (I'll try as soon as possible):
reproduce this on another Oracle VM configured the same as the faulty one and another with a different base image (maybe Oracle Linux?);
It seems to me that the mount.cifs process doesn't really succeed in mounting the share correctly, as it doesn't show as such anywhere. It also seems to be an issue not related to folder/file permissions, but rather to networking?
A note on something that may or may not be related to this: ZeroTier on my Synology NAS does not seem to work with IPv4 only; it remains OFFLINE. The node goes ONLINE only when IPv6 is enabled, but I must say that this is the only node in my ZT network that shows an IPv6 address as its public IP in the ZT web GUI. The other nodes show IPv4 public addresses.
If anyone has any clue on this, I'll be happy to support and reproduce any advice. Thank you!
I'm using Tailscale, but I presume it will work the same.
You need to add port 445 to /etc/iptables/rules.v4, just under the SSH rule, like below:
-A INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 445 -j ACCEPT (like this)
Then you need to edit the interfaces in /etc/samba/smb.conf to:
interfaces = lo tailscale0 100.0.0.0/24
Obviously, my interface is tailscale0, but yours will be different. Use ip link show to find yours. You may also need to change the IP range to suit ZeroTier's; 100.0.0.0/24 is what I use with Tailscale.
Then reboot!
I couldn't get it working without doing this.
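To see whether port 445 is actually reachable across the tunnel after changing the firewall rules, a plain TCP connect test from the other end can help. This is a minimal sketch, not a diagnosis of the hang itself: a successful connect only shows that the port accepts connections, and the function name and default timeout are my own choices.

```python
import socket


def tcp_port_open(host, port=445, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout or routing failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Call it with the overlay-network address of the other machine, for example tcp_port_open("XXX.XXX.XXX.XXX"), once before and once after reloading the rules.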

Amazon EC2 Ubuntu 20 - DNS resolution doesn't work

I posted my solution too. I hope this saves someone else a lot of time.
I have an EC2 instance running Ubuntu 20. DNS resolution never works, or fails a lot.
My file /etc/resolv.conf has
nameserver 127.0.0.53
The file is not a symlink, and I can certainly edit it to use nameserver 8.8.8.8 ,
But the file periodically gets overwritten and the 127.0.0.53 (or something similar) is back.
I just want dns to work!
See my solution below.
Get your NIC's name from the netplan config file.
cat /etc/netplan/50-cloud-init.yaml
On my system, Amazon sets the NIC name to ens5.
As root, create a new file, /etc/netplan/99-custom-dns.yaml, with the following content.
Replace ens5 with your NIC's name.
network:
    version: 2
    ethernets:
        ens5:
            nameservers:
                addresses: [8.8.8.8]
            dhcp4-overrides:
                use-dns: false
Reboot
sudo shutdown -r now
Verify. After the reboot you can try pinging something by name
ping yahoo.com
or you can view the output of:
systemd-resolve --status
Done
Here's a link to the Amazon help doc, though it misses the nontrivial detail about your NIC's name:
https://aws.amazon.com/premiumsupport/knowledge-center/ec2-static-dns-ubuntu-debian/
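Besides pinging by name, you can check which upstream servers the resolver ended up with. This is a minimal sketch under one assumption: systemd-resolved is in use (which the 127.0.0.53 stub suggests), so the upstream servers are listed in /run/systemd/resolve/resolv.conf rather than in /etc/resolv.conf. The function name is made up for illustration.

```python
def nameservers(resolv_conf_text):
    """Return the nameserver addresses found in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(("#", ";")):
            # Comment lines start with '#' or ';'.
            continue
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers
```

For example, nameservers(open("/run/systemd/resolve/resolv.conf").read()) should include 8.8.8.8 after the reboot if the netplan override was picked up.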

Node stuck in LEAVING state after leaving from cluster in Riak

I have a 5-node Riak KV cluster in production. I removed one node from the cluster for some reason, but its status has been stuck at "leaving" for the last 7 days. How can we resolve this?
I tested force-remove as well as force-replace locally, using the command
sudo riak-admin force-remove -f riak@172.xx.xx.8
and for force-replace I followed this link: https://gist.github.com/angrycub/4566736
but in this case I lose some data.
How do I fix this type of issue?
Don't use the force-remove command; use riak-admin cluster leave instead. For more details, see this answer: Riak Force remove node from Riak KV cluster.

Galera first node won't start

I've been trying to set up a Galera Cluster. Since I'm new to Linux, I used the guide from MariaDB (Link). I did everything as described there, but the first node just won't start when I use the command "service mysql start --wsrep-new-cluster". I always get the error:
Failed to open channel 'cluster1' at 'gcomm://10.1.0.11,10.1.0.12,10.1.0.13': -110 (Connection timed out)
My config file on all three nodes looks like this:
#mysql settings
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
query_cache_size=0
query_cache_type=0
bind-address=0.0.0.0
#galera settings
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name="cluster1"
wsrep_cluster_address="gcomm://10.1.0.11,10.1.0.12,10.1.0.13"
wsrep_sst_method=rsync
Change the MySQL config (remove the IP addresses from gcomm://) when starting the 1st cluster node, or start the cluster with --wsrep_cluster_address="gcomm://"; that should do the trick.
Then you can add those IP addresses back into the config, so that the current 1st node can rejoin the running cluster later.
I haven't looked into it deeply, but it looks like the "--wsrep-new-cluster" option is not handled correctly, because the 1st node is still looking for live nodes, so you must temporarily remove all members of the cluster on the 1st node (all IPs from the cluster_address field).
Start all the other nodes normally.
Newer OS versions use "bootstrap" instead of "--wsrep-new-cluster".
My versions: Debian 9.4.0, MariaDB 10.1.26, Galera 25.3.19-2.
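The difference between the two settings can be spelled out in a few lines. This is just an illustrative sketch of the string each node ends up with; the function name and the bootstrap flag are hypothetical and not part of Galera or MariaDB.

```python
def cluster_address(nodes, bootstrap=False):
    """Build a wsrep_cluster_address value.

    With bootstrap=True the member list is left empty ("gcomm://"), so the
    node does not wait for peers; otherwise all node IPs are listed.
    """
    if bootstrap:
        return "gcomm://"
    return "gcomm://" + ",".join(nodes)
```

With the addresses from the question, cluster_address(["10.1.0.11", "10.1.0.12", "10.1.0.13"]) gives the value to put back in the config once the first node is running, and bootstrap=True gives the value to start it with.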

How does scp traffic flow between two remote hosts?

If you issue a scp command between 2 remote servers, will the traffic flow directly between the hosts, or will it flow from Remote1 => Local Machine => Remote2?
For example I issue this command on my laptop:
scp user1@remote1.com:/Files user2@remote2.com:/Files
Newer versions of scp (since 2011) have the option -3 which will route the traffic through your local machine. This is useful if the hosts are on different networks and can't see each other. Found this on SuperUser. From your linked article it seems like normally the hosts will try to connect directly to each other.
Looks like it can be done.
If your Linux/BSD/Unix system or Mac does not have the -3 option, just compile the latest version from: http://www.openssh.org/portable.html
It is as simple as:
./configure; make ; sudo make install
It will be installed in /usr/local/bin by default. I just did that on my Mac OS X Lion.
