Wget Hanging, Script Stops - http

Evening,
I am running a lot of wget commands using xargs:
cat urls.txt | xargs -n 1 -P 10 wget -q -t 2 --timeout 10 --dns-timeout 10 --connect-timeout 10 --read-timeout 20
However, once the file has been parsed, some of the wget instances 'hang.' I can still see them in system monitor, and it can take about 2 minutes for them all to complete.
Is there any way I can specify that an instance should be killed after 10 seconds? I can re-download all the URLs that failed later.
In system monitor, the wget instances are shown as sk_wait_data when they hang. xargs is there as 'do_wait,' but wget seems to be the issue, as once I kill them, my script continues.

I believe this should do it:
wget -v -t 2 --timeout 10
According to the docs:
--timeout: Set the network timeout to seconds seconds. This is equivalent to specifying
--dns-timeout, --connect-timeout, and --read-timeout, all at the same time.
Check the verbose output too and see more of what it's doing.
Also, you can try:
timeout 10 wget -v -t 2
Or you can do what timeout does internally:
( cmdpid=$BASHPID; (sleep 10; kill $cmdpid) & exec wget -v -t 2 )
(As seen in: BASH FAQ entry #68: "How do I run a command, and have it abort (timeout) after N seconds?")
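To keep track of which URLs need the later re-download mentioned in the question, the timeout wrapper can also record failures. The following is a minimal sketch, assuming GNU coreutils timeout is installed; try_with_limit and failed_urls.txt are hypothetical names chosen for illustration, not part of wget or xargs.

```shell
#!/bin/bash
# try_with_limit LIMIT FAILED_LOG ITEM CMD [ARGS...]
# Runs "CMD ARGS... ITEM" under GNU timeout. If the command fails or is
# stopped for exceeding LIMIT seconds, ITEM is appended to FAILED_LOG.
# (Hypothetical helper for illustration.)
try_with_limit() {
    local limit=$1 failed_log=$2 item=$3
    shift 3
    if ! timeout "$limit" "$@" "$item"; then
        printf '%s\n' "$item" >> "$failed_log"
    fi
}

# Possible use with the command from the question (not executed here):
#   export -f try_with_limit
#   xargs -n 1 -P 10 bash -c \
#       'try_with_limit 10 failed_urls.txt "$1" wget -q -t 2 --timeout 10' _ < urls.txt
```

By default timeout sends SIGTERM; if a process might ignore that, its -k option can follow up with SIGKILL after a further delay.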

GNU Parallel can download in parallel, and retry the process after a timeout:
cat urls.txt | parallel -j10 --timeout 10 --retries 3 wget -q -t 2
If the time it takes to fetch a URL changes (e.g. due to a faster internet connection), you can let GNU Parallel figure out the timeout:
cat urls.txt | parallel -j10 --timeout 1000% --retries 3 wget -q -t 2
This will make GNU Parallel record the median time for a successful job and set the timeout dynamically to 10 times that.
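To make the arithmetic behind a percentage timeout concrete, here is a small sketch that computes a given percentage of the median of some job durations. It only illustrates the calculation described above; dynamic_timeout is a hypothetical helper, and GNU Parallel's own implementation may compute its median differently.

```shell
#!/bin/bash
# dynamic_timeout PERCENT DURATION...
# Prints PERCENT% of the median of the given durations (in seconds).
# (Hypothetical helper for illustration.)
dynamic_timeout() {
    local percent=$1
    shift
    [ "$#" -gt 0 ] || return 1
    printf '%s\n' "$@" | sort -n | awk -v p="$percent" '
        { v[NR] = $1 }
        END {
            # Middle value for an odd count, mean of the two middle values otherwise.
            m = (NR % 2) ? v[(NR + 1) / 2] : (v[NR / 2] + v[NR / 2 + 1]) / 2
            print m * p / 100
        }'
}

# Five successful downloads took 2, 9, 3, 4 and 1 seconds; the median is 3.
dynamic_timeout 1000 2 9 3 4 1    # prints 30
```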

Related

Forks are spawned on a single core on interactive HPC node

I am trying to test a script I have developed locally on an interactive HPC node, and I keep running into this strange issue: mclapply works only on a single core. I see several R processes spawned in htop (as many as the number of cores), but they all occupy only one core.
Here is how I obtain the interactive node:
srun -n 16 -N 1 -t 5 --pty bash -il
Is there a setting I am missing? How can I make this work? What can I check?
P.S. I just tested and the other programs that rely on forking to do parallel processing (say pigz) are afflicted by the same issue as well. Those that rely on MPI and messaging work properly, it seems.
Yes, you are missing a setting. Try:
srun -N 1 -n 1 -c 16 -t 5 --pty bash -il
The problem is that -n 16 requests 16 tasks with one CPU each. The interactive bash shell is a single task, so it is bound to only one of the cores requested by srun, and every process forked from it inherits that restriction. With -n 1 -c 16 you instead request one task with 16 CPUs.
Otherwise, you can first allocate your resources using salloc and once you obtain them run your actual command. For instance:
salloc -N 1 -n 1 -c 16 -t 5
srun pigz file.ext
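As for what to check: one quick test from inside the interactive shell is to compare the number of CPUs the shell is allowed to use with the number installed on the node. This is a minimal sketch, assuming a Linux node with GNU coreutils; the variable names are illustrative.

```shell
#!/bin/bash
# CPUs this shell (and anything it forks) may run on, versus CPUs installed.
usable=$(nproc)
installed=$(nproc --all)
echo "usable by this shell: $usable of $installed installed"

# On Linux the kernel also reports the exact list of allowed cores.
if [ -r /proc/self/status ]; then
    grep Cpus_allowed_list /proc/self/status || true
fi
```

If the first number is 1 on a 16-core allocation, forked workers such as those started by mclapply or pigz will all share that one core.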

Shell script when executed appears twice in the process list

I am running the shell script below in the background with ./something.sh &:
#!/bin/bash
tail -n0 -f -F service.log | while read LOGLINE
do
    : # loop body omitted
done
When I check ps -ef | grep something, I see two processes:
20273 1 0 16:13 ? 00:00:00 /bin/bash /something.sh
20280 20273 0 16:13 ? 00:00:00 /bin/bash /something.sh
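This is because your script pipes the output of a program into a shell loop. Bash runs each part of a pipeline in its own subshell, so when you run this there will be three processes:
The something.sh that you explicitly started
The tail that your script starts
A copy of something.sh that is executing the while loop
Only the first and the third match grep something, which is why you see two entries.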
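The extra copy can be observed directly by comparing process IDs. The sketch below assumes bash 4 or later (for $BASHPID) and uses printf in place of tail so that it terminates; it also shows that feeding the loop through process substitution keeps the loop in the original shell.

```shell
#!/bin/bash
here=$BASHPID
pid_file=$(mktemp)

# Pipeline: the while loop runs in a forked copy of the shell.
printf 'x\n' | while read -r line; do echo "$BASHPID" > "$pid_file"; done
piped_pid=$(cat "$pid_file")
rm -f "$pid_file"

# Process substitution: the while loop runs in the original shell.
while read -r line; do direct_pid=$BASHPID; done < <(printf 'x\n')

echo "script shell:          $here"
echo "loop in a pipeline:    $piped_pid"
echo "loop with < <(...):    $direct_pid"
```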

SSH between N number of servers using script

I have n servers, such as c0001.test.cloud.com, c0002.test.cloud.com, c0003.test.cloud.com, and I want to ssh between them like this:
From c0001, ssh to c0002 and then exit that server.
Back on c0001, ssh to c0003 and then exit that server.
The script should run without any input at runtime, and it should work for any number of servers.
I have written this script:
str1=c0001.test.cloud.com,c0002.test.cloud.com,c0003.test.cloud.com
string="$( cut -d ',' -f 2- <<< "$str1" )"
echo "$string"
for j in $(echo $string | sed "s/,/ /g")
do
ssh appAccount@j
done
But this script does not work. I have also tried passing options such as -o StrictHostKeyChecking=no and a <<'ENDSSH' here-document, but it still does not work.
Assuming the number of commands you want to run is small, you could:
Create a script of commands that will run from c0001.test.cloud.com against each of the other servers. For example, create a file on your local machine called commands.sh with:
hosts="c0002.test.cloud.com c0003.test.cloud.com"
for host in $hosts; do
    # -n stops ssh from reading stdin, which here carries this script itself
    ssh -n -o StrictHostKeyChecking=no -q "appAccount@$host" '<command 1> && <command 2>'
done
On your local machine, ssh to c0001.test.cloud.com and execute the commands in commands.sh:
ssh -o StrictHostKeyChecking=no -q appAccount@c0001.test.cloud.com 'bash -s' < commands.sh
However, if your requirements become more complex, a more robust solution might be to use a cluster administration tool such as ClusterShell.
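If the host list has to stay in the comma-separated form used in the question, it can be split without the echo and sed step. The following is a sketch as a dry run: peer_hosts is a hypothetical helper, uptime stands in for whatever should run remotely, and BatchMode=yes assumes key-based authentication is already set up so that ssh never prompts.

```shell
#!/bin/bash
# peer_hosts LIST
# Prints every host in a comma-separated LIST except the first, one per line.
# (Hypothetical helper; -s makes cut print nothing when LIST has no comma.)
peer_hosts() {
    printf '%s\n' "$1" | cut -s -d ',' -f 2- | tr ',' '\n'
}

str1=c0001.test.cloud.com,c0002.test.cloud.com,c0003.test.cloud.com

# Dry run: print the ssh command for each peer instead of executing it.
peer_hosts "$str1" | while read -r host; do
    echo ssh -n -o BatchMode=yes "appAccount@$host" uptime
done
```

Removing the echo turns the dry run into the real loop.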

How to debug php-fpm performance?

Webpages are loading very slowly: it takes around 6 seconds for the server to even start sending the page data, which is then sent in 0.2 seconds and is generated in 0.19 seconds.
I doubt that it is caused by PHP or the browser, so the problem must be with the server, which is handled by nginx and php5-fpm.
A server admin said that the problem was indeed caused by a misconfigured fpm or nginx.
How can I debug the cause of the slowdown?
Setup: php5.3, mysql5, linux, nginx, php5-fpm
This question is probably too broad for StackOverflow, as the answer could span several pages and topics.
However, if the question were just how to debug the performance of PHP-FPM, the answer would be much easier: use strace and the script below.
#!/bin/bash
# Collect one strace output file per php-fpm process under ./trc
mkdir -p trc
rm -f trc/*.trc
# Optional extra strace arguments, passed as the first script argument
additional_strace_args="$1"
MASTER_PID=$(ps auwx | grep php-fpm | grep -v grep | grep 'master process' | cut -d ' ' -f 7)
summarise=""
# Uncomment to show a summary of call totals instead of a full trace
#summarise="-c"
# Trace the master process, following any children it forks
nohup strace -r $summarise -p "$MASTER_PID" -ff -o ./trc/master.follow.trc >"trc/master.$MASTER_PID.trc" 2>&1 &
# Attach a separate strace to every worker process
while read -r pid;
do
    if [[ $pid != "$MASTER_PID" ]]; then
        nohup strace -r $summarise -p "$pid" $additional_strace_args >"trc/$pid.trc" 2>&1 &
    fi
done < <(pgrep php-fpm)
read -p "Strace running - press [Enter] to stop"
pkill strace
That script attaches strace to all of the running php-fpm instances. Any request to the web server that reaches php-fpm will have all of its system calls logged, so you can see which ones take the most time and work out what needs optimising first.
On the other hand, if the strace output shows that PHP-FPM is processing each request quickly, you can rule it out as the cause and investigate nginx, and how nginx is talking to PHP-FPM, instead.
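Once the trace files exist, the slow spots still have to be found. With strace -r, the number at the start of each line is the time elapsed since the previous system call began, so a large value points at the call on the line before it. The sketch below ranks calls on that basis; slowest_calls is a hypothetical helper, and it assumes the plain "delta call(...) = result" line layout with no PID prefix.

```shell
#!/bin/bash
# slowest_calls N FILE...
# Prints the N largest gaps found in strace -r output, each paired with the
# call that was running during the gap (the call on the preceding line).
# (Hypothetical helper for illustration.)
slowest_calls() {
    local n=$1
    shift
    awk '
        FNR > 1 { print $1, prev }          # this delta belongs to the previous call
        { $1 = ""; prev = substr($0, 2) }   # remember the call without its own delta
    ' "$@" | sort -rn | head -n "$n"
}

# Possible use after running the tracing script (not executed here):
#   slowest_calls 20 trc/*.trc
```

Passing -T as the tracing script's first argument is another option: strace then appends the time spent inside each call to the end of its line.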
@Danack saved my life.
But I had to change the command to get the MASTER_PID:
MASTER_PID=$(ps auwx | grep php-fpm | grep -v grep | grep 'master process' | sed -e 's/ \+/ /g' | cut -d ' ' -f 2)
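The field number differs between systems because cut treats every single space as a separator, while ps pads its columns with a varying number of spaces. A sketch of an alternative that sidesteps the problem uses awk, which splits on runs of whitespace; master_pid_from_ps is a hypothetical helper, and it assumes the master shows up in ps with the title "php-fpm: master process".

```shell
#!/bin/bash
# master_pid_from_ps: reads "ps auwx" output on stdin and prints the PID
# (second column) of the php-fpm master process. The [p] keeps this awk
# process from matching its own command line in the ps output.
# (Hypothetical helper for illustration.)
master_pid_from_ps() {
    awk '/[p]hp-fpm: master process/ { print $2; exit }'
}

# Possible use in the tracing script (not executed here):
#   MASTER_PID=$(ps auwx | master_pid_from_ps)
```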

Kill and restart multiple processes that fit a certain pattern

I am trying to write a shell script that will kill all running processes that match a certain pattern, then restart them. I can display the processes with:
ps -ef|grep ws_sched_600.sh|grep -v grep|sort -k 10
Which gives a list of the relevant processes:
user 2220258 1 0 16:53:12 - 0:01 /bin/ksh /../../../../../ws_sched_600.sh EDW02_env
user 5562418 1 0 16:54:55 - 0:01 /bin/ksh /../../../../../ws_sched_600.sh EDW03_env
user 2916598 1 0 16:55:00 - 0:01 /bin/ksh /../../../../../ws_sched_600.sh EDW04_env
But I am not sure how to pass the process IDs to kill.
The sort doesn't seem necessary. You can use awk to print the second column and xargs to convert the output into command-line arguments to kill.
ps -ef | grep ws_sched_600.sh | grep -v grep | awk '{print $2}' | xargs kill
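Alternatively, you could use pkill or killall, which kill based on process name. With -f, pkill matches the pattern against the full command line instead of only the process name:
pkill -f ws_sched_600.sh
pkill ws_sched_600.sh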
If you are concerned about running your command on multiple platforms where pkill might not be available:
ps -ef | awk '/[w]s_sched_600/{cmd="kill -9 "$2;system(cmd)}'
The [w] in the pattern keeps awk from matching its own entry in the ps output.
I think this is what you are looking for:
for proc in $(ps -ef | grep ws_sched_600.sh | grep -v grep | awk '{print $2}')
do
    kill -9 "$proc"
done
edit:
Of course... use xargs, it's better.
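The answers above cover the kill, but not the restart the question also asks for. One way to approach it is to capture each process's command line before killing it and then launch that same command again. This is a rough sketch, not a tested restart procedure: list_targets and restart_all are hypothetical helpers, the script is written for bash, and it assumes the command starts in column 8 of ps -ef as in the listing from the question (a start time printed as two words would shift the columns).

```shell
#!/bin/bash
# list_targets: reads "ps -ef" output on stdin and prints "PID<TAB>COMMAND"
# for each ws_sched_600.sh process. The [w] keeps this awk process from
# matching itself when reading live ps output.
list_targets() {
    awk '/[w]s_sched_600\.sh/ {
        cmd = $8
        for (i = 9; i <= NF; i++) cmd = cmd " " $i
        print $2 "\t" cmd
    }'
}

# restart_all: kill each matching process, then start the same command again.
# Defined here but not called.
restart_all() {
    ps -ef | list_targets | while IFS=$'\t' read -r pid cmd; do
        kill "$pid"
        # $cmd is left unquoted on purpose so it splits into program and arguments.
        nohup $cmd > /dev/null 2>&1 < /dev/null &
    done
}
```

Running ps -ef | list_targets on its own first shows exactly what would be killed and relaunched.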
