Get process occupying a port in Solaris 10 (alternative for pfiles) - unix

I am currently using pfiles to find the process occupying a given port on Solaris 10,
but it causes problems when run in parallel.
The problem is that pfiles can't be run in parallel against the same pid;
the second invocation returns with this error message:
pfiles: process is traced :
Is there any alternative to pfiles for finding the process occupying a port on Solaris?
Alternatively, any information on OS APIs for getting port/process information on Solaris would help.

A workaround would be to use some lock mechanism to avoid this.
Alternatively, you might install lsof from a freeware repository and see if it supports concurrency (I think it does).
I just tested Solaris 11 Express pfiles and it doesn't seem to exhibit this issue.
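As a rough illustration of the lock workaround mentioned above, here is a minimal Python sketch that serializes pfiles calls per pid with an advisory file lock. The helper names (pid_lock, run_pfiles) and the lock-file naming are made up for this example, and it only helps if every caller goes through the same lock; it has not been tried on Solaris 10.

```python
import contextlib
import fcntl
import os
import subprocess
import tempfile


@contextlib.contextmanager
def pid_lock(pid, lock_dir=None):
    """Hold an exclusive advisory lock for one pid (hypothetical helper)."""
    lock_dir = lock_dir or tempfile.gettempdir()
    # Hypothetical naming scheme: one lock file per inspected pid.
    path = os.path.join(lock_dir, f"pfiles-{pid}.lock")
    with open(path, "w") as handle:
        # Blocks until any other holder of this pid's lock releases it.
        fcntl.flock(handle, fcntl.LOCK_EX)
        try:
            yield path
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)


def run_pfiles(pid):
    """Run pfiles for one pid while holding that pid's lock."""
    with pid_lock(pid):
        return subprocess.run(
            ["pfiles", str(pid)], capture_output=True, text=True
        )
```

Callers inspecting different pids do not block each other, because the lock is per pid; only concurrent calls for the same pid are queued.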

Related

Problem communicating over a local area network (LAN) with ROS on WSL2

I am a developer of ROS projects. Recently I have been trying ROS (Melodic) on WSL2 (Windows Subsystem for Linux), and everything works just great. But I ran into trouble when I wanted to communicate with another PC on the same local area network (LAN). Before setting environment variables such as ROS_MASTER_URI and ROS_IP, I know that, since WSL 2 runs on Hyper-V, the IP shown inside WSL2 is not the one on the real LAN. I have to run a command like the one below so that everyone on the LAN can communicate with a specific host:PORT on WSL2.
netsh interface portproxy delete v4tov4 listenport=$port listenaddress=$addr
But here comes a new question:
The nodes which use TCPROS to communicate with each other have a random PORT every time I launch the file.
How can I handle this kind of problem?
Or is there any information on the internet that I can have a look at?
Thank you.
The root problem is described in WSL issue #4150. To quote from that thread,
WSL 2 seems to NAT it's virtual network, instead of making it bridged
to the host NIC.
Option 1 - Port forwarding script on login
Note: From #kraego's comment (and the edited question, which I'm just seeing based on the comment), this is probably not a good option for ROS, since the port numbers are randomly assigned. This makes port forwarding something that would have to be dynamically done.
There are a number of workarounds described in that issue, for which you've already figured out the first part (the port forwarding). The primary technique seems to be to create a PowerShell script to detect the IP address and create the port forwarding rules that runs upon Windows login. This particular comment near the top of the thread seems to be the canonical go-to answer, although many people have posted their tweaks or alternatives throughout the very long thread.
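The thread's scripts are PowerShell; purely to show the shape of the rules such a script creates, here is a small Python sketch that only builds the netsh argument lists and does not run them. The function name, the sample address and the port list are placeholders, and a fixed port list is exactly why this approach fits poorly with randomly assigned TCPROS ports.

```python
def portproxy_commands(wsl_ip, ports, listen_addr="0.0.0.0"):
    """Build netsh argument lists forwarding each port to the WSL2 address.

    Hypothetical helper: it returns the commands instead of executing them.
    """
    commands = []
    for port in ports:
        # Drop any stale rule first, since the WSL2 address can change.
        commands.append([
            "netsh", "interface", "portproxy", "delete", "v4tov4",
            f"listenport={port}", f"listenaddress={listen_addr}",
        ])
        # Forward the same port number to the current WSL2 address.
        commands.append([
            "netsh", "interface", "portproxy", "add", "v4tov4",
            f"listenport={port}", f"listenaddress={listen_addr}",
            f"connectport={port}", f"connectaddress={wsl_ip}",
        ])
    return commands


# Placeholder values for illustration only.
for argv in portproxy_commands("172.20.0.2", [11311]):
    print(" ".join(argv))
```

The resulting commands would have to be run from an elevated prompt on the Windows side, with the WSL2 address detected at each login.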
One downside - I believe the script that is mentioned there needs to be run at logon since the WSL subsystem seems to only want to run when a user is logged in. I've found that attempting to run a WSL service or instance through Windows OpenSSH results in that instance/service shutting down soon after the SSH session is closed, unless the user is already logged into Windows with a WSL instance opened.
Option 2 - WSL1
I would also propose that, assuming it fits your workflow and if ROS works on it (it may not, given the device access you need, but I'm not sure), you can simply use WSL1 instead of WSL2 to avoid this. You can try this out by:
Backing up your existing distro (from PowerShell or cmd, use wsl --export <DistroName> <FileName>)
Importing the backup into a new WSL1 instance with wsl --import <NewDistroName> <InstallLocation> <FileNameOfBackup> --version 1
It's possible to simply change versions in place, but I tend to like to have a backup anyway before doing it, and as long as you are backing up, you may as well leave the original in place.

How can I count ESTABLISHED connections in Go?

I'm trying to do basically this in Go:
netstat -an | grep 2375 -c
I need to count the number of connections to the Docker daemon in my regression test for a connection leak bug. However, because I run this in multiple places on different OSes (local dev box, CI, etc.), I cannot rely on the "netstat" tool, so I wonder how I can do this in a more programmatic way in Go?
I looked around the net package and could not find anything that would help. There are some libraries that basically replace netstat:
https://github.com/drael/GOnetstat
https://github.com/dominikh/netstat-nat
But they are not cross-platform (Mac and *nix). Any idea how I can achieve this?
On Linux this info is exposed in the /proc filesystem.
Use os.Getpid and query the info in /proc/<pid>/fd. Most likely a simple count is good enough here; if you need more, see the proc man page.
Cross-platform compatibility for this kind of thing means rolling your own, as the ways of identifying open fds for a process are very platform-specific. If you simply need this to compile and pass some tests on non-Linux platforms, you can use Go's per-platform build support to make this a no-op on other platforms, or implement an appropriate solution there.
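To make the /proc/<pid>/fd idea concrete, here is a minimal sketch. It is written in Python only for brevity (the same steps map onto listing the directory and reading each symlink in Go), and the function name is made up. It counts descriptors whose link target starts with "socket:", and returns None where /proc/<pid>/fd does not exist, which is one way to get the no-op behaviour on other platforms.

```python
import os


def count_socket_fds(pid=None):
    """Count open socket descriptors of a process via /proc (Linux only).

    Returns None where /proc/<pid>/fd is unavailable so callers can skip.
    """
    target_pid = pid if pid is not None else os.getpid()
    fd_dir = f"/proc/{target_pid}/fd"
    if not os.path.isdir(fd_dir):
        return None
    count = 0
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed between listing and reading
        if target.startswith("socket:"):
            count += 1
    return count
```

In a leak test you would take the count before and after the code under test and compare the two. Note that this counts all sockets of the process, not only connections to one port.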

Process stops getting network data

We have a process (written in C++/managed) that receives network data via TCP/IP.
After running the process for a while, while tracking network load, it seems that the network gets into a frozen state and the process stops getting data. Other processes on the system that use networking (same NIC) operate normally.
The process gets out of this frozen state by itself after several minutes.
Any idea what is happening?
Is there any counter I can track to see whether my process is reaching some limit?
It is going to be very difficult to answer specifically,
-- without knowing what exactly your process/application is about,
-- whether it is a network chat application, or a file server/client, or ......
-- without other details about your process: how it is implemented, what libraries it uses, if relevant to the problem.
Also, you haven't mentioned what OS and environment you are running this process under,
so there is very little anyone can do to help. It could be anything: a busy-wait loop in your code, locking problems if it's multi-threaded code, ...
Nonetheless, here are some options to check:
If it's Linux, try the commands below to debug and monitor the behaviour of the process and see what the problem could be:
top
Check top to see how much of each resource (CPU, memory) your process is using and whether its CPU usage is abnormally high.
pstack
This should show the stack frames of the process executing at the time of the problem.
netstat
Run this with the necessary options (tcp/udp) to check the state of the network sockets opened by your process.
gcore -s -c
This forces your process to dump core when the mentioned problem happens, so that you can then analyze that core file using gdb.
gdb
Then use the where command at the gdb prompt to get a full backtrace of the process (which function it was executing last, and the previous function calls).
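If running netstat by hand during the stall is awkward, the same socket state and queue information can be read on Linux from /proc/net/tcp. The sketch below is a rough Python illustration with made-up names and a made-up sample. It parses the text form of that file and reports the state plus the send/receive queue sizes for each socket. A receive queue that keeps growing during the freeze would suggest the process is not reading.

```python
# Hex state codes used in /proc/net/tcp (subset).
TCP_STATES = {
    "01": "ESTABLISHED",
    "06": "TIME_WAIT",
    "08": "CLOSE_WAIT",
    "0A": "LISTEN",
}


def parse_proc_net_tcp(text):
    """Parse /proc/net/tcp content into a list of dicts (hypothetical helper)."""
    rows = []
    for line in text.splitlines()[1:]:  # the first line is the column header
        fields = line.split()
        if len(fields) < 5:
            continue
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        tx_queue, rx_queue = (int(part, 16) for part in fields[4].split(":"))
        rows.append({
            "local_port": local_port,
            "state": TCP_STATES.get(fields[3], fields[3]),
            "tx_queue": tx_queue,
            "rx_queue": rx_queue,
        })
    return rows


# Made-up sample in the /proc/net/tcp layout, for illustration only.
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue
   0: 0100007F:0947 00000000:0000 0A 00000000:00000000
   1: 0100007F:0947 0100007F:C350 01 00000000:00000400
"""

for row in parse_proc_net_tcp(SAMPLE):
    print(row)
```

On a live system you would pass in the contents of /proc/net/tcp instead of the sample and filter by the port your process uses.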

makeCluster function in R snow hangs indefinitely

I am using the makeCluster function from the R package snow on a Linux machine to start a SOCK cluster on a remote Linux machine. Everything seems to be in place for the two machines to communicate successfully (I am able to establish ssh connections between the two). But:
makeCluster("192.168.128.24",type="SOCK")
does not return any result; it just hangs indefinitely.
What am I doing wrong?
Thanks a lot
Unfortunately, there are a lot of things that can go wrong when creating a snow (or parallel) cluster object, and the most common failure mode is to hang indefinitely. The problem is that makeSOCKcluster launches the cluster workers one by one, and each worker (if successfully started) must make a socket connection back to the master before the master proceeds to launch the next worker. If any of the workers fail to connect back to the master, makeSOCKcluster will hang without any error message. The worker may issue an error message, but by default any error message is redirected to /dev/null.
In addition to ssh problems, makeSOCKcluster could hang because:
R not installed on a worker machine
snow not installed on a worker machine
R or snow not installed in the same location as the local machine
current user doesn't exist on a worker machine
networking problem
firewall problem
and there are many more possibilities.
In other words, no one can diagnose this problem without further information, so you have to do some troubleshooting in order to get that information.
In my experience, the single most useful troubleshooting technique is manual mode which you enable by specifying manual=TRUE when creating the cluster object. It's also a good idea to set outfile="" so that error messages from the workers aren't redirected to /dev/null:
cl <- makeSOCKcluster("192.168.128.24", manual=TRUE, outfile="")
makeSOCKcluster will display an Rscript command to execute in a terminal on the specified machine, and then it will wait for you to execute that command. In other words, makeSOCKcluster will hang until you manually start the worker on host 192.168.128.24, in your case. Remember that this is a troubleshooting technique, not a solution to the problem, and the hope is to get more information about why the workers aren't starting by trying to start them manually.
Obviously, the use of manual mode bypasses any ssh issues (since you're not using ssh), so if you can create a SOCK cluster successfully in manual mode, then probably ssh is your problem. If the Rscript command isn't found, then either R isn't installed, or it's installed in a different location. But hopefully you'll get some error message that will lead you to the solution.
If makeSOCKcluster still just hangs after you've executed the specified Rscript command on the specified machine, then you probably have a networking or firewall issue.
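One quick way to narrow down a networking or firewall issue is to check, from the worker machine, whether a TCP connection to the master can be opened at all. The following is a minimal sketch in Python rather than R, with a made-up function name. The host and port are whatever appears in the Rscript command that manual mode printed, and no default port is assumed here.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Run it on the worker while makeSOCKcluster is waiting on the master. A False result points at routing or a firewall rather than at R or snow.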
For more troubleshooting advice, see my answer for making cluster in doParallel / snowfall hangs.

How to cleanup sockets after mono-process crashes?

I am creating a chat server in Mono that should be able to keep many sockets open. Before deciding on the architecture, I am doing a load test with Mono. Just as a test, I created a small mono-server and mono-client that open 100,000 sockets/connections, and it works pretty well.
I tried to hit the limit, and at some point the process crashes (of course).
But what worries me is that if I try to restart the process, it directly gives "Unhandled Exception: System.Net.Sockets.SocketException: Too many open files".
So I guess that somehow the file descriptors (sockets) are kept open even after my process ends. Even several hours later it still gives this error; the only way I can deal with it is to reboot my computer. We cannot risk running into this kind of problem in production without knowing how to handle it.
My question:
Is there anything in Mono that keeps running globally regardless of which mono application is started, a kind of service I can restart without rebooting my computer?
Or is this not a Mono problem but a Unix problem, one that we would run into even if we programmed it in Java/C++?
I checked the following, but no mono processes alive, no sockets open and no files:
localhost:~ root# ps -ax | grep mono
1536 ttys002 0:00.00 grep mono
-
localhost:~ root# lsof | grep mono
(nothing)
-
localhost:~ root# netstat -a
Active Internet connections (including servers)
(no unusual ports are open)
For development I run under OSX 10.7.5. For production we can decide which platform to use.
This sounds like you need to set (or unset) the Linger option on the socket (using Socket.SetSocketOption). Depending on the actual API you're using there might be better alternatives (TcpClient has a LingerState property for instance).
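For reference, the linger option corresponds to SO_LINGER at the BSD socket level. Here is a minimal sketch of setting and reading it, shown in Python rather than Mono and with made-up helper names. It assumes the common layout of struct linger as two C ints, and whether changing this setting actually clears the "Too many open files" error is not confirmed here.

```python
import socket
import struct

# Assumes struct linger is two C ints: l_onoff and l_linger.
LINGER_FORMAT = "ii"


def set_linger(sock, enabled, seconds):
    """Enable or disable SO_LINGER with the given timeout in seconds."""
    value = struct.pack(LINGER_FORMAT, 1 if enabled else 0, seconds)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, value)


def get_linger(sock):
    """Return (enabled, seconds) as currently set on the socket."""
    raw = sock.getsockopt(
        socket.SOL_SOCKET, socket.SO_LINGER, struct.calcsize(LINGER_FORMAT)
    )
    onoff, seconds = struct.unpack(LINGER_FORMAT, raw)
    return bool(onoff), seconds
```

With linger enabled and a timeout of 0, closing the socket discards unsent data and resets the connection instead of waiting.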
