Running the `mtr` network diagnostic tool in the background like a `nohup` process - networking

mtr is a great tool for debugging network packet loss. Here is a sample mtr output:
My traceroute [v0.85]
myserver.com (0.0.0.0) Thu Jan 19 04:10:04 2017
Resolver: Received error response 2. (server failure)
Keys: Help Display mode Restart statistics Order of fields quit
Packets Pings
Host Loss% Snt Last Avg Best Wrst StDev
1. 192.168.104.23 0.0% 11 0.6 0.6 0.5 0.8 0.0
2. machine1.com 0.0% 11 8.5 12.4 2.0 20.5 5.5
3. mchine2.org.com 0.0% 11 1.2 1.0 0.8 1.8 0.0
4. machine3.orgcom 0.0% 11 0.8 0.9 0.7 1.1 0.0
However, while running mtr like this on the server, you can't log off.
I need mtr to write its output to a text file and run in the background, similar to a nohup'd command.
I should also be able to look into the report as it grows, something like using tail -f on the output file.

mtr offers the -r option, which puts mtr into report mode. In this mode, mtr runs for the number of cycles specified by the -c option, then prints statistics and exits. So we can create a script to run the command and add the script to cron on whatever schedule you need. For example:
/usr/sbin/mtr -r -c 2 www.google.com >> /home/mtr.log
Cron entry, run every minute:
* * * * * sh /path/to/script
Then you can tail -f on the output file.
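The cron entry above needs a small wrapper script; here is a minimal sketch (the script path and the timestamp line are my assumptions, not part of the original answer):

```shell
# Create the wrapper script the cron entry points at (path is an assumption)
cat > /tmp/mtr-report.sh <<'EOF'
#!/bin/sh
# -r: report mode, -c 2: two cycles; prepend a timestamp so runs are separable
echo "=== $(date) ===" >> /home/mtr.log
/usr/sbin/mtr -r -c 2 www.google.com >> /home/mtr.log 2>&1
EOF
chmod +x /tmp/mtr-report.sh
```

Each cron run then appends one dated report block, so `tail -f /home/mtr.log` shows results as they arrive.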

If systemd is used, you can create a transient timer with systemd-run instead:
┌──[root@vms81.liruilongs.github.io]-[~]
└─$systemd-run --on-calendar=*:*:00 --unit mtr-print-log --slice mtr /usr/sbin/mtr -r -b 192.168.29.154
Running timer as unit mtr-print-log.timer.
Will run service as unit mtr-print-log.service.
Viewing mtr logs
┌──[root@vms81.liruilongs.github.io]-[~]
└─$journalctl -u mtr-print-log.service
-- Logs begin at Sat 2022-12-24 21:56:02 CST, end at Sat 2022-12-24 22:10:19 CST. --
Dec 24 22:07:00 vms81.liruilongs.github.io systemd[1]: Started /usr/sbin/mtr -r -b 192.168.29.154.
Dec 24 22:07:14 vms81.liruilongs.github.io mtr[15427]: Start: Sat Dec 24 22:07:00 2022
Dec 24 22:07:14 vms81.liruilongs.github.io mtr[15427]: HOST: vms81.liruilongs.github.io Loss% Snt Last Avg Best Wrst StDev
Dec 24 22:07:14 vms81.liruilongs.github.io mtr[15427]: 1.|-- gateway (192.168.26.2) 0.0% 10 0.4 0.3 0.2 0.5 0.0
Dec 24 22:07:14 vms81.liruilongs.github.io mtr[15427]: 2.|-- 192.168.29.154 0.0% 10 1.5 0.9 0.7 1.5 0.0
Dec 24 22:08:00 vms81.liruilongs.github.io systemd[1]: Started /usr/sbin/mtr -r -b 192.168.29.154.
Dec 24 22:08:14 vms81.liruilongs.github.io mtr[16400]: Start: Sat Dec 24 22:08:00 2022
Dec 24 22:08:14 vms81.liruilongs.github.io mtr[16400]: HOST: vms81.liruilongs.github.io Loss% Snt Last Avg Best Wrst StDev
Dec 24 22:08:14 vms81.liruilongs.github.io mtr[16400]: 1.|-- gateway (192.168.26.2) 0.0% 10 0.3 0.3 0.2 0.4 0.0
Dec 24 22:08:14 vms81.liruilongs.github.io mtr[16400]: 2.|-- 192.168.29.154 0.0% 10 1.0 1.0 0.7 1.4 0.0
Dec 24 22:09:00 vms81.liruilongs.github.io systemd[1]: Started /usr/sbin/mtr -r -b 192.168.29.154.
Dec 24 22:09:14 vms81.liruilongs.github.io mtr[17411]: Start: Sat Dec 24 22:09:00 2022
Dec 24 22:09:14 vms81.liruilongs.github.io mtr[17411]: HOST: vms81.liruilongs.github.io Loss% Snt Last Avg Best Wrst StDev
Dec 24 22:09:14 vms81.liruilongs.github.io mtr[17411]: 1.|-- gateway (192.168.26.2) 0.0% 10 0.3 0.3 0.3 0.5 0.0
Dec 24 22:09:14 vms81.liruilongs.github.io mtr[17411]: 2.|-- 192.168.29.154 0.0% 10 0.9 0.9 0.7 1.3 0.0
If you only want to see the output and the execution time, without the journal metadata, you can use the cat output format:
┌──[root@vms81.liruilongs.github.io]-[~]
└─$journalctl -u mtr-print-log.service -o cat | tail -n 10
Started /usr/sbin/mtr -r -b 192.168.29.154.
Start: Sat Dec 24 22:13:00 2022
HOST: vms81.liruilongs.github.io Loss% Snt Last Avg Best Wrst StDev
1.|-- gateway (192.168.26.2) 0.0% 10 0.2 0.3 0.2 0.5 0.0
2.|-- 192.168.29.154 0.0% 10 0.8 0.8 0.7 1.1 0.0
Started /usr/sbin/mtr -r -b 192.168.29.154.
Start: Sat Dec 24 22:14:00 2022
HOST: vms81.liruilongs.github.io Loss% Snt Last Avg Best Wrst StDev
1.|-- gateway (192.168.26.2) 0.0% 10 0.3 0.3 0.2 0.4 0.0
2.|-- 192.168.29.154 0.0% 10 0.9 0.8 0.7 1.0 0.0
Stopping the mtr timer and service:
┌──[root@vms81.liruilongs.github.io]-[~]
└─$systemctl stop mtr-print-log.timer
┌──[root@vms81.liruilongs.github.io]-[~]
└─$systemctl is-active mtr-print-log.service
unknown

Related

Nginx and uwsgi settings for a heavy-calculation site?

I am using nginx and uwsgi (Django) for a site on AWS Fargate.
The program does a fairly heavy calculation task, so I guess I should do some tuning for uwsgi or nginx.
I start uwsgi in the Django container with processes and threads:
uwsgi --http :8011 --processes 8 --threads 8 --module mysite.wsgi
and do nothing special for nginx:
server {
    listen 80;
    server_name mysite;
    charset utf-8;
    location / {
        proxy_pass http://127.0.0.1:8011/;
        include /etc/nginx/uwsgi_params;
    }
}
With this setting the program works, but even after the heavy task finishes, the server response is still sluggish.
I checked with the top command: even after the task finished, not much memory was free.
top - 20:15:21 up 24 min, 0 users, load average: 4.65, 4.00, 1.91
Tasks: 15 total, 1 running, 11 sleeping, 0 stopped, 3 zombie
%Cpu(s): 0.0 us, 0.3 sy, 0.0 ni, 99.5 id, 0.2 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 3703.8 total, 176.6 free, 1095.6 used, 2431.5 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 2382.4 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 20 0 5724 284 4 S 0.0 0.0 0:00.19 bash
9 root 20 0 1398196 6260 0 S 0.0 0.2 0:00.50 amazon-ssm-agen
24 root 20 0 1410428 11772 0 S 0.0 0.3 0:00.85 ssm-agent-worke
48 root 20 0 622428 37900 1168 S 0.0 1.0 0:00.38 uwsgi
49 root 20 0 47468 1396 36 S 0.0 0.0 0:00.18 uwsgi
50 root 20 0 0 0 0 Z 0.0 0.0 0:05.75 uwsgi
56 root 20 0 1755900 424896 177332 S 0.0 11.2 0:14.92 uwsgi
59 root 20 0 1754360 365476 120228 S 0.0 9.6 0:07.77 uwsgi
60 root 20 0 0 0 0 Z 0.0 0.0 0:13.31 uwsgi
68 root 20 0 622428 36788 56 S 0.0 1.0 0:00.00 uwsgi
69 root 20 0 1755600 373404 125260 S 0.0 9.8 0:09.18 uwsgi
77 root 20 0 0 0 0 Z 0.0 0.0 0:02.90 uwsgi
129 root 20 0 1327588 10376 0 S 0.0 0.3 0:10.33 ssm-session-wor
139 root 20 0 5988 2288 1764 S 0.0 0.1 0:00.64 bash
261 root 20 0 8900 3648 3136 R 0.0 0.1 0:00.00 top
I guess this means the task's memory is not correctly freed?
Where should I look, and what should I tune?
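One common direction to investigate (a sketch under assumptions, not a confirmed fix for this thread): without --master, uwsgi has no supervisor process to reap dead workers, which would match the Z (zombie) uwsgi entries in the top output above, and worker recycling returns memory from heavy requests to the OS:

```shell
# Assumed tuning values; adjust per workload.
# --master reaps finished workers (no zombies); --max-requests recycles a
# worker after N requests; --reload-on-rss recycles a worker whose resident
# memory exceeds the given number of megabytes.
uwsgi --http :8011 --master \
      --processes 8 --threads 8 \
      --max-requests 500 \
      --reload-on-rss 512 \
      --module mysite.wsgi
```

Note also that the "used" memory in top includes buff/cache, which Linux reclaims on demand; 2382.4 MiB "avail Mem" in the output suggests the box is not actually out of memory.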

Use awk to modify lines with specific keys

I have a main file that includes a series of data lines whose IDs are stored in the second column. There is another key file that contains specific IDs, and I would like to comment out (prefix with $) the records with those IDs in the main file and leave the rest. I have written the script below; it adds the comment but repeats the non-keyed items. Can you please help debug the awk command?
key_file:
10
20
30
main_file:
PSHELL 10 136514 0.7
PSHELL 15 136514 0.7
PSHELL 20 136513 2.0
PSHELL 30 13571 1.7
Current output:
PSHELL 10 136514 0.7
PSHELL 10 136514 0.7
$PSHELL 10 136514 0.7
PSHELL 15 136514 0.7
PSHELL 15 136514 0.7
PSHELL 15 136514 0.7
PSHELL 20 136513 2.0
$PSHELL 20 136513 2.0
PSHELL 20 136513 2.0
$PSHELL 30 13571 1.7
PSHELL 30 13571 1.7
PSHELL 30 13571 1.7
Desired output
$PSHELL 10 136514 0.7
PSHELL 15 136514 0.7
$PSHELL 20 136513 2.0
$PSHELL 30 13571 1.7
awk 'NR==FNR{a[$1]; next} {for (i in a) if (index($2, i)) {print "$"$0 > "out_file"} else {print $0 > "out_file"}}' key_file main_file
You may use this awk:
awk 'FNR == NR {key[$1]; next} $2 in key {$0 = "$" $0} 1' keyfile mainfile
$PSHELL 10 136514 0.7
PSHELL 15 136514 0.7
$PSHELL 20 136513 2.0
$PSHELL 30 13571 1.7
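The accepted one-liner can be verified end-to-end with the sample data from the question:

```shell
# Recreate the question's input files
printf '10\n20\n30\n' > key_file
printf 'PSHELL 10 136514 0.7\nPSHELL 15 136514 0.7\nPSHELL 20 136513 2.0\nPSHELL 30 13571 1.7\n' > main_file
# FNR==NR is true only while reading the first file: record its IDs as array keys.
# For the second file, prefix "$" when column 2 is an exact key; the bare "1"
# prints every record exactly once.
awk 'FNR == NR {key[$1]; next} $2 in key {$0 = "$" $0} 1' key_file main_file
```

Unlike the original attempt, this tests `$2 in key` (exact match, one decision per record) instead of looping over every key with index(), which printed one output line per key and also matched substrings.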

Jenkins Pipeline Failing with error - <PID> killed ./mvnw test

I have a simple pipeline which was running fine until the last time I checked. But now it's suddenly failing for any Maven stage with the following error. Only the ./mvnw clean stage works fine.
/var/lib/jenkins/workspace/ProjectID@tmp/durable-ce5247e8/script.sh: line 2: 31370 Killed ./mvnw test
I tried looking at the logs for the job as well as the logs at /var/log/jenkins, but I do not see anything detailed. I also tried running the mvnw command with -X, but this error does not seem to be coming from the mvn command itself. I could also confirm that the code compiles/builds fine on my local machine, as well as on the Jenkins server when run manually.
I am relatively new to the unix/Jenkins/pipeline environment and I am clueless as to where I should look for troubleshooting. Has anyone encountered such a situation? I'd appreciate any clue for troubleshooting the issue.
Thanks a lot.
Adding further investigation:
To get around this issue, I did the following steps:
1. Stopped the Jenkins service.
2. Restarted the EC2 instance hosting Jenkins (a couple of times).
3. Verified that the Jenkins service was stopped.
4. Ran the top command to check CPU usage and noticed a strange thing: a process with the command name debug was running, owned by the user jenkins.
5. Looked around on the internet for anything about this debug command, with no luck.
6. Killed it to see if that would help, but to my surprise it did not; the situation is now worse, and "top -U jenkins" gives the following result (and the number of processes keeps growing):
top - 15:15:09 up 1:39, 1 user, load average: 191.30, 175.24, 135.72
Tasks: 189 total, 3 running, 159 sleeping, 0 stopped, 0 zombie
Cpu(s): 82.7%us, 0.3%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 17.1%st
Mem: 4040060k total, 822672k used, 3217388k free, 42128k buffers
Swap: 4194300k total, 0k used, 4194300k free, 236476k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6344 jenkins 20 0 384m 6764 2784 S 2.3 0.2 3:27.28 debug
6592 jenkins 20 0 384m 6880 2896 S 2.3 0.2 2:22.55 debug
6770 jenkins 20 0 384m 6860 2872 S 2.3 0.2 2:00.24 debug
7040 jenkins 20 0 384m 6760 2776 S 2.3 0.2 1:31.72 debug
7115 jenkins 20 0 384m 6864 2880 S 2.3 0.2 1:26.05 debug
7254 jenkins 20 0 384m 6828 2840 S 2.3 0.2 1:16.73 debug
7375 jenkins 20 0 384m 6812 2828 S 2.3 0.2 1:08.34 debug
7464 jenkins 20 0 384m 6864 2880 S 2.3 0.2 1:04.63 debug
7600 jenkins 20 0 320m 6852 2868 S 2.3 0.2 0:57.73 debug
7668 jenkins 20 0 320m 6780 2800 S 2.3 0.2 0:54.29 debug
7797 jenkins 20 0 320m 6756 2776 S 2.3 0.2 0:48.62 debug
7798 jenkins 20 0 320m 6776 2792 S 2.3 0.2 0:48.68 debug
7872 jenkins 20 0 320m 6852 2868 S 2.3 0.2 0:45.91 debug
7929 jenkins 20 0 320m 6756 2776 S 2.3 0.2 0:43.38 debug
8005 jenkins 20 0 320m 6808 2828 S 2.3 0.2 0:40.94 debug
8012 jenkins 20 0 320m 6884 2896 S 2.3 0.2 0:40.91 debug
8073 jenkins 20 0 320m 6852 2868 S 2.3 0.2 0:38.44 debug
6271 jenkins 20 0 384m 6852 2868 S 2.1 0.2 4:28.22 debug
6278 jenkins 20 0 384m 6752 2772 S 2.1 0.2 4:28.42 debug
6434 jenkins 20 0 384m 6828 2844 S 2.1 0.2 2:57.29 debug
6544 jenkins 20 0 384m 6860 2880 S 2.1 0.2 2:37.40 debug
6692 jenkins 20 0 384m 6784 2800 S 2.1 0.2 2:10.43 debug
6745 jenkins 20 0 384m 6856 2872 S 2.1 0.2 2:00.54 debug
6887 jenkins 20 0 384m 6824 2840 S 2.1 0.2 1:44.45 debug
6909 jenkins 20 0 384m 6812 2828 S 2.1 0.2 1:44.33 debug
6973 jenkins 20 0 384m 6852 2872 S 2.1 0.2 1:37.80 debug
7253 jenkins 20 0 384m 6812 2828 S 2.1 0.2 1:16.45 debug
7321 jenkins 20 0 320m 6828 2844 S 2.1 0.2 1:12.39 debug
7396 jenkins 20 0 384m 6800 2816 S 2.1 0.2 1:08.26 debug
7451 jenkins 20 0 320m 6880 2896 S 2.1 0.2 1:04.73 debug
The above issue was happening because the CPU was being overworked by a cryptocurrency-mining attack on our server. The server had to be killed and rebuilt. The debug process shown in the logs above under the jenkins user was the mining script.
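For anyone hitting the same bare "NNNN Killed" message without a mining attack: that wording is the shell reporting SIGKILL, and the kernel OOM killer is the usual sender. It leaves evidence in the kernel ring buffer, which a quick grep surfaces:

```shell
# Look for OOM-killer activity on the build host (run as root if needed).
# A hit here means the kernel, not Maven or Jenkins, killed the process.
dmesg -T 2>/dev/null | grep -iE 'out of memory|oom-killer|killed process' \
  || echo "no OOM events found in the ring buffer"
```

If OOM kills show up, the fix is usually more memory on the agent or a smaller Maven/JVM heap (e.g. via MAVEN_OPTS).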

What is the cause of high CPU for the given result in CentOS [closed]

I am using CentOS.
When I run the command free -m, it shows me the following:
total used free shared buffers cached
Mem: 2048 373 1674 10 0 147
-/+ buffers/cache: 225 1822
Swap: 0 0 0
I have run the command top and got the result below:
top - 07:08:01 up 16:09, 3 users, load average: 0.00, 0.00, 0.00
Tasks: 39 total, 1 running, 38 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 2097152k total, 381024k used, 1716128k free, 0k buffers
Swap: 0k total, 0k used, 0k free, 150200k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 20 0 19236 1452 1212 S 0.0 0.1 0:00.02 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd/23354
3 root 20 0 0 0 0 S 0.0 0.0 0:00.00 khelper/23354
147 root 16 -4 10644 668 400 S 0.0 0.0 0:00.00 udevd
453 root 20 0 179m 1512 1056 S 0.0 0.1 0:00.27 rsyslogd
489 root 20 0 66692 1296 536 S 0.0 0.1 0:00.03 sshd
497 root 20 0 22192 972 716 S 0.0 0.0 0:00.00 xinetd
658 root 20 0 66876 1028 312 S 0.0 0.0 0:00.00 saslauthd
659 root 20 0 66876 764 48 S 0.0 0.0 0:00.00 saslauthd
731 root 20 0 114m 1260 620 S 0.0 0.1 0:00.24 crond
835 ossecm 20 0 10512 492 312 S 0.0 0.0 0:00.32 ossec-maild
839 root 20 0 13088 960 712 S 0.0 0.0 0:00.00 ossec-execd
843 ossec 20 0 12780 2380 620 S 0.0 0.1 0:10.15 ossec-analysisd
847 root 20 0 4200 444 304 S 0.0 0.0 0:00.84 ossec-logcollec
858 root 20 0 5004 1484 468 S 0.0 0.1 0:07.06 ossec-syscheckd
862 ossec 20 0 6388 624 372 S 0.0 0.0 0:00.03 ossec-monitord
870 root 20 0 92420 21m 1620 S 0.0 1.0 0:01.21 miniserv.pl
4363 root 20 0 96336 4448 3464 S 0.0 0.2 0:00.10 sshd
4365 root 20 0 105m 2024 1532 S 0.0 0.1 0:00.03 bash
4615 root 20 0 96776 4936 3460 S 0.0 0.2 0:00.61 sshd
4617 root 20 0 105m 2052 1548 S 0.0 0.1 0:00.20 bash
4674 root 20 0 96336 4452 3460 S 0.0 0.2 0:00.22 sshd
4676 root 20 0 105m 2012 1532 S 0.0 0.1 0:00.06 bash
7494 root 20 0 96336 4404 3428 S 0.0 0.2 0:00.03 sshd
7496 root 20 0 57712 2704 2028 S 0.0 0.1 0:00.01 sftp-server
7719 root 20 0 83116 2700 836 S 0.0 0.1 0:00.10 sendmail
7728 smmsp 20 0 78692 2128 636 S 0.0 0.1 0:00.00 sendmail
7742 root 20 0 402m 14m 7772 S 0.0 0.7 0:00.13 httpd
7744 asterisk 20 0 502m 22m 10m S 0.0 1.1 0:00.11 httpd
7938 root 20 0 105m 756 520 S 0.0 0.0 0:00.00 safe_asterisk
7940 asterisk 20 0 3157m 26m 8508 S 0.0 1.3 0:07.14 asterisk
8066 root 20 0 105m 1568 1304 S 0.0 0.1 0:00.01 mysqld_safe
8168 mysql 20 0 499m 21m 6472 S 0.0 1.1 0:01.44 mysqld
8607 asterisk 20 0 402m 8288 1404 S 0.0 0.4 0:00.00 httpd
8608 asterisk 20 0 402m 8288 1404 S 0.0 0.4 0:00.00 httpd
8611 asterisk 20 0 402m 8284 1400 S 0.0 0.4 0:00.00 httpd
8615 asterisk 20 0 402m 8296 1412 S 0.0 0.4 0:00.00 httpd
Even when I try disabling the services asterisk, httpd, sendmail, and mysqld, it still shows 100% CPU usage.
Does anybody know how I can check what is actually using this much CPU?
The CPU Usage in top says:
Cpu(s): 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Your CPU is 100% idle. This is the explanation:
us: user cpu time (or) % CPU time spent in user space
sy: system cpu time (or) % CPU time spent in kernel space
ni: user nice cpu time (or) % CPU time spent on low priority processes
id: idle cpu time (or) % CPU time spent idle
wa: io wait cpu time (or) % CPU time spent in wait (on disk)
hi: hardware irq (or) % CPU time spent servicing/handling hardware interrupts
si: software irq (or) % CPU time spent servicing/handling software interrupts
st: steal time (or) % CPU time in involuntary wait by a virtual cpu while the hypervisor is servicing another processor, i.e. % CPU time stolen from the virtual machine
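Given that, a quicker way than eyeballing top to find what is actually consuming CPU is to ask ps to sort processes by %CPU directly:

```shell
# Top 10 CPU consumers: -e = every process, -o = choose output columns,
# --sort=-%cpu = descending by CPU usage
ps -eo pid,user,%cpu,%mem,comm --sort=-%cpu | head -n 10
```

Note that ps reports CPU usage averaged over each process's lifetime, while top shows usage per refresh interval, so the two can disagree for bursty processes.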

Thousands of instances of index.php opening at the same time

Suddenly my hosting account has been suspended due to thousands of instances of index.php opening at the same time.
The site is built around the latest versions of WordPress and bbPress. Here's the email from the hosting company:
Action Taken: Please be aware we have suspended this account at this time in order to maintain the reliability and integrity of the server. Reason: Thousands of instances of index.php opening at the same time:
17270 myserver 15 0 268m 79m 52m R 17.5 2.0 0:00.38 /usr/bin/php /home/myserver/public_html/index.php
17287 myserver 16 0 268m 34m 8712 R 14.4 0.9 0:00.35 /usr/bin/php /home/myserver/public_html/index.php
17332 myserver 15 0 213m 26m 7680 S 12.9 0.7 0:00.17 /usr/bin/php /home/myserver/public_html/index.php
17276 myserver 16 0 283m 40m 7912 R 12.1 1.0 0:00.33 /usr/bin/php /home/myserver/public_html/index.php
17336 myserver 17 0 213m 26m 7680 S 12.1 0.7 0:00.16 /usr/bin/php /home/myserver/public_html/index.php
17341 myserver 18 0 213m 26m 7680 S 12.1 0.7 0:00.16 /usr/bin/php /home/myserver/public_html/index.php
17343 myserver 16 0 213m 26m 7680 S 12.1 0.7 0:00.16 /usr/bin/php /home/myserver/public_html/index.php
17339 myserver 17 0 213m 26m 7680 S 11.4 0.7 0:00.15 /usr/bin/php /home/myserver/public_html/index.php
17344 myserver 17 0 213m 26m 7680 S 11.4 0.7 0:00.15 /usr/bin/php /home/myserver/public_html/index.php
17347 myserver 17 0 213m 26m 7680 S 11.4 0.7 0:00.15 /usr/bin/php /home/myserver/public_html/index.php
17351 myserver 16 0 213m 26m 7680 S 11.4 0.7 0:00.15 /usr/bin/php /home/myserver/public_html/index.php
17353 myserver 17 0 213m 26m 7680 S 11.4 0.7 0:00.15 /usr/bin/php /home/myserver/public_html/index.php
17364 myserver 17 0 213m 26m 7680 S 11.4 0.7 0:00.15 /usr/bin/php /home/myserver/public_html/index.php
17368 myserver 17 0 209m 23m 7388 R 10.6 0.6 0:00.14 /usr/bin/php /home/myserver/public_html/index.php
17278 myserver 16 0 283m 40m 7896 R 9.9 1.0 0:00.28 /usr/bin/php /home/myserver/public_html/index.php
They have just emailed this too:
it is possible that your forum script is being abused if it is not secured or it has some security hole, but we can't provide more information as we do not know how it is coded.
Please check and let us know if you have any further questions.
Any ideas as to what's going on?
You may have gotten DoS'd.
Exactly what dav said, or for some reason you are getting an insane load. To prevent that from happening again, cache your WordPress site using a plugin like WP Super Cache to serve semi-static pages, and filter spam comments before the page loads, because every single page load means another execution of index.php.
The problem seems to be sites getting indexed all at once, especially by crawlers like Yandex/Baidu that fetch multiple pages simultaneously.
Every page load by a bot is another instance of index.php opening, so if you have 2000 pages on the site and they all get indexed at once, this is what you get.
You can try adding the below to your robots.txt (it might or might not work; Crawl-Delay is a nonstandard directive that not all crawlers honor):
User-agent: *
Crawl-Delay: 30
Disallow: /wp-admin/
User-agent: Yandex
Crawl-Delay: 30
User-agent: Baidu
Crawl-Delay: 30
Or just block the crawlers' IPs (100% guaranteed).
