(First of all, I want to thank Hristo Iliev, who has helped me a lot with my current MPI project.)
The problem is that MPI_Irecv sometimes hangs (the probability of a hang is close to 1/2). My program is more than 20,000 lines long, so I cannot post it here.
The code where it hangs looks like this:
MPI id:0
MPI_Ssend(id=1,tag=1);
MPI_Ssend(id=1,tag=x);
MPI_Recv(id=1,tag=x+1);
MPI id:1
MPI_Recv(id=0,tag=1);
pthread_create(fun_A());
void fun_A()
{
MPI_Recv(id=0,tag=x);
MPI_Ssend(id=0,tag=x+1);
}
To debug it, I added flags after each MPI function call. The flags consist of printf output and markers written to a file.
Several facts about my program are listed below.
1. (important) When I run my MPI program on 1 machine using 2 cores, it works. But when I run it on 2 machines (each machine using 1 core), sometimes, in MPI id:1, MPI_Ssend(id=0,tag=x+1) (and MPI_Wait()) returns but MPI id:0 hangs at MPI_Recv(id=1,tag=x+1).
2. (important) When MPI_Recv(id=1,tag=x+1) in MPI id:0 hangs, the first 2 MPI functions in MPI id:1 should have finished. But sometimes there are no flags from MPI id:1 at all, and sometimes there are flags from all 3 MPI functions of MPI id:1.
3. (important) There is no sender thread when MPI id:0 hangs at MPI_Recv(id=1,tag=x+1).
4. vfork is used in my program to fork other jobs. MPI functions are not used in these jobs. These jobs use a message queue to communicate with a thread in MPI_COMM_WORLD.
5. I enabled multiple-thread support when configuring Open MPI. MPI_Init_thread (with multiple-thread support) is used to initialize MPI, and its return value is checked.
I do not know what is going on in my program. My guesses:
There is a bug in Open MPI.
There is an error in the configuration.
There is a bug in my program. (But if there is a bug in my program, why does it work when I run MPI on 1 machine using 2 cores but fail on 2 machines each using 1 core?)
Could anyone give me any hints?
The output of ifconfig -a is below.
One node's IP is 10.1.1.112; the other's is 10.1.1.113. The ifconfig output is exactly the same on both nodes except for the IP address.
eth0 Link encap:Ethernet HWaddr 00:21:5E:2F:62:8A
inet addr:10.1.1.113 Bcast:10.1.1.255 Mask:255.255.255.0
inet6 addr: 2001:da8:203:eb1:221:5eff:fe2f:628a/64 Scope:Global
inet6 addr: fe80::221:5eff:fe2f:628a/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:3402577 errors:0 dropped:0 overruns:0 frame:0
TX packets:208064 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:291778729 (278.2 MiB) TX bytes:25343147 (24.1 MiB)
eth1 Link encap:Ethernet HWaddr 00:21:5E:2F:62:8C
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:1770 errors:0 dropped:0 overruns:0 frame:0
TX packets:1770 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:798595 (779.8 KiB) TX bytes:798595 (779.8 KiB)
I'm using Docker on Windows. When I ssh to a Docker host machine (local VM) and type ifconfig, I normally get something like this:
docker@master:~$ ifconfig
docker0 Link encap:Ethernet HWaddr 02:42:82:A3:2D:FB
inet addr:172.17.0.1 Bcast:0.0.0.0 Mask:255.255.0.0
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
eth0 Link encap:Ethernet HWaddr 08:00:27:E8:A3:F6
inet addr:10.0.2.15 Bcast:10.0.2.255 Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:fee8:a3f6/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:243 errors:0 dropped:0 overruns:0 frame:0
TX packets:235 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:39044 (38.1 KiB) TX bytes:39544 (38.6 KiB)
eth1 Link encap:Ethernet HWaddr 08:00:27:83:CF:41
inet addr:192.168.99.101 Bcast:192.168.99.255 Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:fe83:cf41/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:297 errors:0 dropped:0 overruns:0 frame:0
TX packets:227 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:29987 (29.2 KiB) TX bytes:32525 (31.7 KiB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:32 errors:0 dropped:0 overruns:0 frame:0
TX packets:32 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:2752 (2.6 KiB) TX bytes:2752 (2.6 KiB)
I know that docker0 is the bridge network created by Docker, eth1 is the interface that connects to the outside world, and lo is the loopback interface. My question is: what is eth0 used for here?
I think I got the point. When I opened the VirtualBox network settings, I found that there are 2 network cards, and the first one uses NAT with port forwarding:
Name:ssh | Host Port:53289 | Guest Port:22
So I think eth0 is used for the ssh connection from the Docker client. That's why there's no such interface in a normal Linux OS (as opposed to this Boot2Docker local VM).
Different ideas are welcome!
The output of ifconfig is:
[root@test2 ~]# ifconfig
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:3045306 errors:0 dropped:0 overruns:0 frame:0
TX packets:3045306 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:969363066 (924.4 MiB) TX bytes:969363066 (924.4 MiB)
p4p1 Link encap:Ethernet HWaddr F0:4D:A2:F7:CE:20
inet addr:192.168.250.58 Bcast:192.168.250.255 Mask:255.255.255.0
inet6 addr: fe80::f24d:a2ff:fef7:ce20/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:222163621 errors:0 dropped:0 overruns:0 frame:0
TX packets:29525032 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:67504475609 (62.8 GiB) TX bytes:13910424527 (12.9 GiB)
virbr0 Link encap:Ethernet HWaddr 52:54:00:3C:38:60
inet addr:192.168.122.1 Bcast:192.168.122.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
When I executed tcpdump, the output was:
[root@test2 ~]# tcpdump
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on virbr0, link-type EN10MB (Ethernet), capture size 65535 bytes
So it listens on the virbr0 interface by default.
I want to set p4p1 as the default interface so that I get the appropriate output when executing tcpdump.
Any solutions?
You cannot change tcpdump's default interface (unless you hack either tcpdump or libpcap's code).
You can, however, tell tcpdump to capture on a particular interface by using the -i option:
tcpdump -i p4p1
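If typing -i every time is a nuisance, one workaround is a small shell wrapper. The following is only a sketch: tcpdump_default, TCPDUMP_IFACE and TCPDUMP_BIN are names made up for this example (tcpdump itself does not read them), and p4p1 is the interface from the question. It adds -i with the preferred interface unless an interface was already given on the command line.

```shell
#!/bin/sh
# Sketch: run tcpdump on a preferred interface unless the caller passes -i.
# TCPDUMP_IFACE and TCPDUMP_BIN are hypothetical names for this example only.
TCPDUMP_IFACE="${TCPDUMP_IFACE:-p4p1}"

tcpdump_default() {
    for arg in "$@"; do
        case "$arg" in
            -i|-i?*|--interface|--interface=*)
                # The caller already chose an interface; pass everything through.
                "${TCPDUMP_BIN:-tcpdump}" "$@"
                return
                ;;
        esac
    done
    # No interface given: prepend the preferred one.
    # Limitation: combined short flags such as "-ni eth0" are not detected.
    "${TCPDUMP_BIN:-tcpdump}" -i "$TCPDUMP_IFACE" "$@"
}
```

After sourcing this, tcpdump_default -n captures on p4p1, while tcpdump_default -i virbr0 -n is passed through unchanged. tcpdump -D lists the interfaces tcpdump can capture on.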
I have a FreedomPop Ubee stick that I would like to connect to my BeagleBone Black (running Angstrom with the 3.2.0-54-generic kernel). After solving some issues with hotswapping (it's apparently not possible), I can see the interface using ifconfig. But when I try bringing it up, nothing happens:
root@beaglebone:~# ifconfig eth1 up
root@beaglebone:~# udhcpc eth1
udhcpc (v1.20.2) started
Sending discover...
Sending discover...
Sending discover...
Something else that is strange is that the interface initially has an address:
root@beaglebone:~# ifconfig eth1
eth1 Link encap:Ethernet HWaddr 00:1D:88:53:2F:52
inet addr:192.168.14.2 Bcast:192.168.14.255 Mask:255.255.255.0
inet6 addr: fe80::21d:88ff:fe53:2f52/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:22 errors:0 dropped:0 overruns:0 frame:0
TX packets:48 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2542 (2.4 KiB) TX bytes:9062 (8.8 KiB)
But a few moments (< 1 minute) later, if I run the same command, eth1 no longer has an address, broadcast address, etc.:
root@beaglebone:~# ifconfig eth1
eth1 Link encap:Ethernet HWaddr 00:1D:88:53:2F:52
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:25 errors:0 dropped:0 overruns:0 frame:0
TX packets:51 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2730 (2.6 KiB) TX bytes:9240 (9.0 KiB)
Under no circumstances (before or after the address is stripped in ifconfig) can I ping anything.
I have tried re-assigning the address, mask, etc., but nothing helps. Bringing the interface up or down does not help. I tried manually creating an interfaces file and that didn't help either.
To solve this problem, I had to:
1. Add an inet dhcp interface in /etc/network/interfaces:
iface eth1 inet dhcp
2. Add the FreedomPop stick as a nameserver in /etc/resolv.conf:
nameserver 192.168.14.1
3. Bring up the interface:
ifup eth1
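The steps above could be scripted so that they are safe to re-run. This is a rough sketch, not something tested on the BeagleBone: ensure_line is a helper invented for this example, and it only appends a line when that exact line is not already in the file.

```shell
#!/bin/sh
# Sketch: append a line to a config file only if that exact line is missing,
# so the script can be run more than once. ensure_line is a made-up helper.
ensure_line() {
    file=$1
    line=$2
    # -F: fixed string, -x: match the whole line, -q: no output
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Intended use (as root), mirroring the three steps above:
#   ensure_line /etc/network/interfaces 'iface eth1 inet dhcp'
#   ensure_line /etc/resolv.conf 'nameserver 192.168.14.1'
#   ifup eth1
```

The calls are left commented out because they modify system files; uncomment them once the paths and the nameserver address match your setup.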
I have followed this link to deploy devstack on my virtual machine. When I execute the ./stack.sh script in the VM, I get the following error after some time:
keystone endpoint-create: error: argument --service-id/--service_id: expected one argument
++ failed
++ local r=2
+++ jobs -p
++ kill
++ set +o xtrace
The script terminates without giving any information such as the host on which to access Horizon or the time elapsed while running the script. I am using NAT as my virtual machine network configuration since I am not able to connect to my network in bridge mode.
I get no response when trying to access Horizon from my web browser. When I try to execute stack.sh again (without running ./unstack.sh first), I get an error saying that stack is already running. Please note that I am behind a proxy server. This is my network configuration on the host and guest machines respectively:
Host Machine:
eth0 Link encap:Ethernet HWaddr d4:be:d9:7f:b3:6f
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:110688 errors:0 dropped:0 overruns:0 frame:0
TX packets:110688 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:6738439 (6.7 MB) TX bytes:6738439 (6.7 MB)
vmnet1 Link encap:Ethernet HWaddr 00:50:56:c0:00:01
inet addr:172.16.85.1 Bcast:172.16.85.255 Mask:255.255.255.0
inet6 addr: fe80::250:56ff:fec0:1/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:83 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
vmnet8 Link encap:Ethernet HWaddr 00:50:56:c0:00:08
inet addr:172.16.145.1 Bcast:172.16.145.255 Mask:255.255.255.0
inet6 addr: fe80::250:56ff:fec0:8/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:602 errors:0 dropped:0 overruns:0 frame:0
TX packets:82 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
wlan0 Link encap:Ethernet HWaddr 60:36:dd:3e:99:e6
inet addr:10.99.19.21 Bcast:10.99.19.255 Mask:255.255.252.0
inet6 addr: fe80::6236:ddff:fe3e:99e6/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:215802 errors:0 dropped:0 overruns:0 frame:0
TX packets:222520 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:245659430 (245.6 MB) TX bytes:30196677 (30.1 MB)
Guest Machine (Bridge):
eth0 Link encap:Ethernet HWaddr 00:0c:29:8a:c9:d4
inet addr:172.16.145.128 Bcast:172.16.145.255 Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fe8a:c9d4/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1965 errors:0 dropped:0 overruns:0 frame:0
TX packets:1508 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2229981 (2.2 MB) TX bytes:160543 (160.5 KB)
Interrupt:19 Base address:0x2024
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:485 errors:0 dropped:0 overruns:0 frame:0
TX packets:485 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:36153 (36.1 KB) TX bytes:36153 (36.1 KB)
virbr0 Link encap:Ethernet HWaddr 2e:32:9b:c3:f4:12
inet addr:192.168.122.1 Bcast:192.168.122.255 Mask:255.255.255.0
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
My localrc file is configured like this:
FLOATING_RANGE=192.168.1.224/27
FIXED_RANGE=10.11.12.0/24
FIXED_NETWORK_SIZE=256
FLAT_INTERFACE=eth0
ADMIN_PASSWORD=password
MYSQL_PASSWORD=password
RABBIT_PASSWORD=password
SERVICE_PASSWORD=password
SERVICE_TOKEN=tokentoken
Please note that I am behind a proxy server. Googling this error turned up some pages suggesting that the issue can be solved by setting the 'no_proxy' variable to the main IP address of the devstack machine.
Links to pages:
https://bugs.launchpad.net/devstack/+bug/1015705
https://answers.launchpad.net/devstack/+question/219539
I don't know where to add these settings or how to solve this whole keystone error. Any help is highly appreciated. Thanks in advance.
You can add your VM IP to the no_proxy variable. In the snippet below, 10.0.2.15 is my devstack VM IP. To export it permanently whenever you start a terminal, add the line below to the /etc/bash.bashrc file.
export no_proxy=localhost,127.0.0.1,10.0.2.15
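If no_proxy may already be set in your environment, a variant that appends instead of overwriting could look like the sketch below. add_no_proxy is a helper name made up for this example, and 10.0.2.15 is the VM IP from my setup; replace it with yours.

```shell
#!/bin/sh
# Sketch: add a host or IP to no_proxy without creating duplicate entries.
# add_no_proxy is a helper name invented for this example.
add_no_proxy() {
    case ",${no_proxy:-}," in
        *",$1,"*) ;;                                # already listed, do nothing
        *) no_proxy="${no_proxy:+$no_proxy,}$1" ;;  # append, comma-separated
    esac
    export no_proxy
}

add_no_proxy localhost
add_no_proxy 127.0.0.1
add_no_proxy 10.0.2.15   # devstack VM IP (an assumption; use your own)
```

Calling add_no_proxy twice with the same value leaves the variable unchanged, so the lines can sit in a startup file without growing the list on every login.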
How do I print which packets are dropped by the interface?
I have an interface on which RX packets are dropped, see below:
eth0 Link encap:Ethernet HWaddr DE:AD:BE:EF:42:46
inet addr:192.168.122.86 Bcast:192.168.122.255 Mask:255.255.255.0
inet6 addr: fe80::dcad:beff:feef:4246/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
**RX packets:10963521 errors:0 dropped:1006 overruns:0 frame:0**
TX packets:6221974 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:3108701252 (2.8 GiB) TX bytes:3842229777 (3.5 GiB)
Interrupt:10 Base address:0xe000
In Windows you can enable dropped packets logging:
Step 1. Change Windows Firewall configuration:
auditpol.exe /set /SubCategory:"Filtering Platform Packet Drop" /failure:enable
Step 2. Restart firewall service
net stop MPSSVC
net start MPSSVC
More info on http://technet.microsoft.com/en-us/library/cc754714(v=ws.10).aspx.
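For the Linux interface in the question, a first step could be to track the drop counter itself. The sketch below does not show which packets were dropped, only how many; rx_dropped is a helper name made up here, and it assumes the older ifconfig output format shown in the question (newer net-tools versions print the counters differently).

```shell
#!/bin/sh
# Sketch: print "<interface> <RX dropped count>" for each interface found in
# ifconfig-style text on stdin. rx_dropped is a helper name invented here.
rx_dropped() {
    awk '
        /Link encap:/ { iface = $1 }   # header line carries the interface name
        /RX packets:/ {
            # Look for the "dropped:<n>" field on the RX counters line.
            for (i = 1; i <= NF; i++) {
                if (split($i, kv, ":") == 2 && kv[1] == "dropped") {
                    print iface, kv[2]
                }
            }
        }
    '
}
```

Usage would be ifconfig eth0 | rx_dropped, run a few times to see whether the count is still growing. On Linux the same counter can also be read from /sys/class/net/eth0/statistics/rx_dropped.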