I've tried to configure a cluster by following the GlassFish clustering tutorials (1, 2), but I'm still having trouble creating an instance in the cluster on a remote host.
The commands I entered and their output will probably explain it better:
adam@adam-desktop:~/Pulpit/glassfish-3.1.1/bin$ ./asadmin
Use "exit" to exit and "help" for online help.
asadmin> setup-ssh adam-laptop
Successfully connected to adam@adam-laptop using keyfile /home/adam/.ssh/id_rsa
SSH public key authentication is already configured for adam@adam-laptop
Command setup-ssh executed successfully.
asadmin> install-node --installdir /home/adam/Pulpit/glassfish3 adam-laptop
Created installation zip /home/adam/Pulpit/glassfish-3.1.1/bin/glassfish8196347853130742869.zip
Successfully connected to adam@adam-laptop using keyfile /home/adam/.ssh/id_rsa
Copying /home/adam/Pulpit/glassfish-3.1.1/bin/glassfish8196347853130742869.zip (82498155 bytes) to adam-laptop:/home/adam/Pulpit/glassfish3
Installing glassfish8196347853130742869.zip into adam-laptop:/home/adam/Pulpit/glassfish3
Removing adam-laptop:/home/adam/Pulpit/glassfish3/glassfish8196347853130742869.zip
Fixing file permissions of all files under adam-laptop:/home/adam/Pulpit/glassfish3/bin
Command install-node executed successfully.
asadmin> start-domain domain1
Waiting for domain1 to start ........................
Successfully started the domain : domain1
domain Location: /home/adam/Pulpit/glassfish-3.1.1/glassfish/domains/domain1
Log File: /home/adam/Pulpit/glassfish-3.1.1/glassfish/domains/domain1/logs/server.log
Admin Port: 4848
Command start-domain executed successfully.
asadmin> enable-secure-admin
Command enable-secure-admin executed successfully.
asadmin> restart-domain domain1
Successfully restarted the domain
Command restart-domain executed successfully.
asadmin> create-cluster c1
Command create-cluster executed successfully.
asadmin> create-node-ssh --nodehost adam-laptop --installdir /home/adam/Pulpit/glassfish3 adam-laptop
Command create-node-ssh executed successfully.
asadmin> create-instance --node adam-laptop --cluster c1 i1
Successfully created instance i1 in the DAS configuration, but failed to create the instance files on node adam-laptop (adam-laptop).
Command failed on node adam-laptop (adam-laptop): Could not contact the DAS running at adam-desktop:4848. This could be because a firewall is blocking the connection back to the DAS or because the DAS host is known by a different name on the instance host adam-laptop. To change the hostname that the DAS uses to identify itself please update the DAS admin HTTP listener address.
Command _create-instance-filesystem failed.
To complete this operation run the following command locally on host adam-laptop from the GlassFish install location /home/adam/Pulpit/glassfish3:
asadmin --host adam-desktop --port 4848 create-local-instance --node adam-laptop i1
asadmin>
UPDATE
Here are the hosts file contents and the ping output, to confirm that there is a connection between adam-desktop and adam-laptop:
adam@adam-desktop:~$ cat /etc/hosts
127.0.0.1 localhost
127.0.1.1 adam-desktop
192.168.1.101 adam-laptop
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
adam@adam-desktop:~$ cat /etc/hostname
adam-desktop
adam@adam-desktop:~$ ping adam-laptop
PING adam-laptop (192.168.1.101) 56(84) bytes of data.
64 bytes from adam-laptop (192.168.1.101): icmp_req=1 ttl=64 time=0.786 ms
64 bytes from adam-laptop (192.168.1.101): icmp_req=2 ttl=64 time=0.694 ms
64 bytes from adam-laptop (192.168.1.101): icmp_req=3 ttl=64 time=0.687 ms
Any help?
After the domain is started, can you reach http://localhost:4848 or http://adam-desktop:4848 in your browser?
If not: on Linux, GlassFish requires the /etc/hosts file to be set up correctly, and this is where most of my problems like this come from. Also set up the appropriate network config. On Red Hat it is /etc/sysconfig/network and on Ubuntu it is /etc/hostname.
It seems that the error was caused by an entry in the /etc/hosts file:
127.0.0.1 localhost
127.0.1.1 adam-desktop
192.168.1.101 adam-laptop
After changing it to:
127.0.0.1 localhost
127.0.0.1 adam-desktop
192.168.1.101 adam-laptop
it works. I had to make the change on both machines, that is, on adam-desktop and on adam-laptop.
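To compare the two versions of the file, it can help to look up which address each one assigns to a name. This is only a minimal sketch: the helper name `addresses_for` is hypothetical, it parses the text of a hosts file itself instead of asking the system resolver, and it makes no claim about how GlassFish picks its listener address.

```python
def addresses_for(hosts_text, name):
    """Return every address that a hosts-file text assigns to `name`."""
    matches = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        if name in names:
            matches.append(address)
    return matches


before = """\
127.0.0.1 localhost
127.0.1.1 adam-desktop
192.168.1.101 adam-laptop
"""
after = before.replace("127.0.1.1 adam-desktop", "127.0.0.1 adam-desktop")

print(addresses_for(before, "adam-desktop"))  # ['127.0.1.1']
print(addresses_for(after, "adam-desktop"))   # ['127.0.0.1']
```

Running the same lookup against the real /etc/hosts text on each machine shows whether both of them were actually changed.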
Related
I'm trying to create an MPI cluster by connecting two laptops and running MPI programs. I followed the steps described here (https://medium.com/mpi-cluster-setup/mpi-clusters-within-a-lan-77168e0191b1). I am able to ssh to the other nodes without a password. However, when I try to run mpiexec -n 2 -hosts manager,worker ./main I get the following error:
[proxy:0:1@gunavaran-HP-Pavilion-Notebook] HYDU_sock_connect (utils/sock/sock.c:113): unable to get host address for gunavaran-HP-ENVY-15-Notebook-PC
[proxy:0:1@gunavaran-HP-Pavilion-Notebook] main (pm/pmiserv/pmip.c:181): unable to connect to server gunavaran-HP-ENVY-15-Notebook-PC at port 43211 (check for firewalls!)
Host key verification failed.
This is my hosts file:
127.0.0.1 localhost
#127.0.1.1 gunavaran-HP-ENVY-15-Notebook-PC
#MPI SETUP
192.168.8.102 manager
192.168.8.108 worker
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
I changed the hostnames to manager and worker using sudo hostnamectl set-hostname. It works fine now.
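The error above names a machine that the hosts file only mentions in a commented-out line. A minimal sketch of that check follows; the helper names `names_in_hosts` and `missing_names` are hypothetical, and the sketch looks only at hosts-file text, not at DNS or any other name service.

```python
def names_in_hosts(hosts_text):
    """Collect every hostname and alias defined in a hosts-file text."""
    names = set()
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # commented-out entries do not count
        if line:
            names.update(line.split()[1:])
    return names


def missing_names(required, hosts_text):
    """Return the names from `required` that the hosts text does not define."""
    known = names_in_hosts(hosts_text)
    return [name for name in required if name not in known]


hosts_text = """\
127.0.0.1 localhost
#127.0.1.1 gunavaran-HP-ENVY-15-Notebook-PC
#MPI SETUP
192.168.8.102 manager
192.168.8.108 worker
"""

# The old machine name appears only in a commented-out line, so it is reported.
print(missing_names(
    ["manager", "worker", "gunavaran-HP-ENVY-15-Notebook-PC"], hosts_text))
```

Once the machines are renamed to manager and worker, every name the launcher can report is one the file defines.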
I am trying to set up a reverse ssh tunnel between a local machine behind a router and a machine on the Internet, so that the Internet machine can tunnel back and mount a disk on the local machine.
On the local machine, I type
/usr/bin/ssh -N -f -R *:2222:127.0.0.1:2222 root@ip_of_remote_machine
This causes the remote machine to listen on port 2222. But when I try to mount the sshfs disk on the remote machine, I get "connection refused" on the local machine. Interestingly, port 2222 doesn't show up on the local machine as being bound. However, I'm definitely talking to ssh on the local machine since it complains
debug1: channel 0: connection failed: Connection refused
I have GatewayPorts set to yes on both machines. I also have AllowTcpForwarding yes on both machines.
First, the line needs to be
/usr/bin/ssh -N -f -R *:2222:127.0.0.1:22 root@ip_of_remote_machine
where port 22 is the port of the SSH server on the local machine.
Second, since I am using sshfs, the following line needs to be in the sshd_config of the machine whose disk is being mounted:
Subsystem sftp /usr/lib64/misc/sftp-server
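The corrected forwarding specification can be sketched as a small command builder. The function name `reverse_tunnel_command` and its parameters are hypothetical; it only assembles the same `-N -f -R` invocation shown above and does not run ssh.

```python
def reverse_tunnel_command(remote, remote_port=2222, local_port=22,
                           local_host="127.0.0.1"):
    """Build the argument list for an ssh reverse tunnel.

    -N: run no remote command, -f: go to the background,
    -R: the remote side listens on remote_port and forwards each
        connection back to local_host:local_port.
    """
    forward = f"*:{remote_port}:{local_host}:{local_port}"
    return ["/usr/bin/ssh", "-N", "-f", "-R", forward, remote]


# "root@ip_of_remote_machine" is a placeholder for the real remote login.
print(" ".join(reverse_tunnel_command("root@ip_of_remote_machine")))
# /usr/bin/ssh -N -f -R *:2222:127.0.0.1:22 root@ip_of_remote_machine
```

The point of keeping `local_port` separate is that it must name a port where something is actually listening on the local machine (the SSH server on 22), which was not the case for 2222.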
I have the following setup:
Local host - my work PC
Project VM - Vagrant box with project files, running on my work PC
Remote host - remote PC, from which I need to access hosts on Project VM
Project VM setup (/etc/hosts on Local host):
192.168.100.102 host1.vm.private
192.168.100.102 sub1.host1.vm.private
192.168.100.102 sub2.host1.vm.private
The "host1" subdomains are resolved by the application router and served by nginx (config for "host1.vm.private" on Project VM):
server {
listen 80;
server_name ~^(.+\.)?host1\.vm\.private$;
...
}
I need to make "sub(1|2|N).host1.vm.private" reachable from the remote host. How can this be done?
So, I found the solution in "Trouble SSH Tunneling to remote server".
The main issue is that an invalid Host HTTP header was sent, so nginx can't resolve the virtual host.
Run on the local PC ssh -R 8888:192.168.100.102:80 <remote_pc_credentials>. Or run the "inverse" command with the ssh -L flag on the remote PC.
Add "sub1.host1.vm.private" to /etc/hosts on the remote PC: 127.0.0.1 sub1.host1.vm.private
OR
Send "Host" header with each request: curl -H "Host: sub1.host1.vm.private" "http://localhost:8888/some/path"
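The second option, sending the Host header explicitly, can also be sketched with Python's standard `http.client`. The function name `fetch_with_host` is hypothetical, and the defaults assume the tunnel from the first step is listening on localhost:8888.

```python
import http.client


def fetch_with_host(host_header, path, address="localhost", port=8888):
    """GET `path` through the tunnel while naming the virtual host explicitly."""
    connection = http.client.HTTPConnection(address, port, timeout=10)
    try:
        # Supplying our own Host header replaces the default one derived from
        # the connection address ("localhost:8888"), which would not match the
        # server_name pattern in the nginx config shown earlier.
        connection.request("GET", path, headers={"Host": host_header})
        response = connection.getresponse()
        return response.status, response.read()
    finally:
        connection.close()


# Example call (requires the tunnel to be running):
# status, body = fetch_with_host("sub1.host1.vm.private", "/some/path")
```

This is the same request as the curl command, just issued from code, so no /etc/hosts change is needed on the remote PC.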
I am trying to install a Cloudera cluster on 5 machines: 4 running Ubuntu 12.04 and 1 running Oracle Enterprise Linux 5.8.
I have run the Cloudera Manager Installer on the Oracle Enterprise Linux host, which should act as the name node (with IP address 192.168.1.185); the other 4 Ubuntu hosts should act as data nodes.
I have completed all the prerequisites and I have configured host files as:
For Ubuntu:
127.0.0.1 localhost
192.168.1.181 hduser1.example.co.in hduser1
192.168.1.182 hduser2.example.co.in hduser2
192.168.1.183 hduser3.example.co.in hduser3
192.168.1.184 hduser4.example.co.in hduser4
192.168.1.185 hduser5.example.co.in hduser5
#The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
For Oracle Enterprise Linux:
192.168.1.181 hduser1.example.co.in hduser1
192.168.1.182 hduser2.example.co.in hduser2
192.168.1.183 hduser3.example.co.in hduser3
192.168.1.184 hduser4.example.co.in hduser4
192.168.1.185 hduser5.example.co.in hduser5
127.0.0.1 hduser5.example.co.in hduser5 localhost.localdomain loca$
::1 localhost6.localdomain6 localhost6
I am not sure whether this configuration is correct, as I got the following errors related to reverse DNS:
The following failures were observed in checking hostnames. Showing first 1000 failures only...
DNS reverse lookup of IP 192.168.1.184 on host hduser1.example.co.in failed. Expected hduser4.example.co.in but got hduser4.local.
DNS reverse lookup of IP 192.168.1.182 on host hduser1.example.co.in failed. Expected hduser2.example.co.in but got hduser-desktop-3.local.
DNS reverse lookup of IP 192.168.1.183 on host hduser1.example.co.in failed. Expected hduser3.example.co.in but got hduser-desktop.local.
After a lot of research I found that the hosts file configuration is correct. The problem was due to compatibility issues between Ubuntu and Oracle Enterprise Linux. After switching all the nodes to Ubuntu, the issue was resolved.
Also I edited resolv.conf of all hosts. The configuration was as follows:
domain example.co.in
search example.co.in localdomain
nameserver 192.x.x.x
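A reverse-lookup check similar in spirit to the one in the error report can be run by hand on each host. This is a minimal sketch, not Cloudera Manager's own check: the function name `reverse_lookup_mismatches` is hypothetical, and it relies on the standard `socket.gethostbyaddr`, which uses whatever resolver order the host is configured with.

```python
import socket


def reverse_lookup_mismatches(expected, resolver=socket.gethostbyaddr):
    """Compare reverse lookups against the expected fully qualified names.

    `expected` maps IP address -> expected hostname. Returns a list of
    (ip, expected_name, actual_name) tuples for every mismatch; actual_name
    is None when the lookup fails.
    """
    mismatches = []
    for ip, name in expected.items():
        try:
            actual = resolver(ip)[0]  # (hostname, aliases, addresses)
        except OSError:
            actual = None
        if actual != name:
            mismatches.append((ip, name, actual))
    return mismatches


# Example call on one of the cluster hosts:
# reverse_lookup_mismatches({"192.168.1.184": "hduser4.example.co.in"})
```

An empty list means every address resolved back to the expected name on that host. The `resolver` parameter exists only so the function can be exercised without a network.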
I am having problems configuring and running MPI on my systems.
Here is what I tried:
1) I ran 'mpd &' on one machine and then I ran 'mpdtrace -l' on the same machine. I got this as output: "my-lappy_53430 (127.0.1.1)"
2) On another machine I ran 'mpd -h -p 53430 &' and got this error:
akshey-desktop_39993: conn error in connect_lhs: Connection timed out
akshey-desktop_39993 (connect_lhs 924): failed to connect to lhs at 10.2.28.137 52430
akshey-desktop_39993 (enter_ring 879): lhs connect failed
akshey-desktop_39993 (run 267): failed to enter ring
Can you please help with this issue? I tried to ping and ssh to the first machine (on which mpd is running) from the second machine, and both worked.
After this I executed 'mpdcheck' on the first machine and got this output:
* * * first ipaddr for this host (via my-lappy) is: 127.0.1.1
These are the contents of /etc/hosts of the first machine:
127.0.0.1 localhost
127.0.1.1 my-lappy
# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
Then I ran 'mpdcheck -l' and got this as output:
**********
Your unqualified hostname resolves to 127.0.0.1, which is
the IP address reserved for localhost. This likely means that
you have a line similar to this one in your /etc/hosts file:
127.0.0.1 $uqhn
This should perhaps be changed to the following:
127.0.0.1 localhost.localdomain localhost
**********
Even after changing the first line of /etc/hosts to "127.0.0.1 localhost.localdomain localhost" I still got the same output from 'mpdcheck -l'
Please note that I do not have access to the DNS server of the network and these machines do not have a DNS entry in the DNS server. (I think this should not be a problem because we can always use IP addresses instead of hostnames. Isn't it so?)
Two points:
You probably don't want to wire up an MPD ring by hand. Unless you are just doing some troubleshooting with a raw mpd command, you probably want to use mpdboot. Its usage is described in the User's Guide.
Since you are using MPD, you are using MPICH2 or an MPICH2 derivative. Starting with MPICH2 1.1 there is a new process manager available, called "hydra". I encourage you to update to the latest version of MPICH2 that you can and give hydra a try. It is much more robust than MPD and has many more features, including better performance.
From my personal and recent experience, I would say that
127.0.1.1 my-lappy
must be changed to your LAN address, and the name must match your hostname. You can change the hostname with hostname <new hostname> and/or permanently by editing /etc/hostname.
Then on host1 you need to start mpd --echo and note the port on which mpd will listen:
mpd_port=N
then on host2 start:
mpd --host=host1 --port=N
It's very important that the /etc/hosts files on all the machines correctly resolve the names to the IPs.
mpdtrace -l will confirm that the ring is set up correctly.
Check for a firewall on your systems that might be blocking the default ports. To test whether that is the problem, temporarily turn off the firewall (ipchains and iptables).
In addition, make sure the hostnames/IP addresses are correct and can be successfully resolved.
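Since ping and ssh working does not prove that the mpd port itself is reachable, a plain TCP connection test from the second machine can help separate a firewall problem from a name-resolution problem. This is a minimal sketch; the function name `can_connect` is hypothetical, and the host and port in the example call are placeholders.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and failed name resolution.
        return False


# Example call from the second machine (placeholders for the first
# machine's name and the port that mpd reports):
# can_connect("host1", 53430)
```

A False result while ssh to the same host works points toward a blocked port or toward mpd listening on a loopback address rather than the LAN address.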