solr / tomcat7 doesn't come back up after a crash - unix

I keep getting issues with Solr crashing on my server. It's hardly a busy site, so I'm baffled as to why it keeps doing it.
Anyway, as an interim measure, I've written a shell script that runs on a cron as root:
#!/bin/bash
declare -a arr=(tomcat7 nginx mysql)
for i in "${arr[@]}"
do
    echo "Checking $i"
    if (( $(ps -ef | grep -v grep | grep $i | wc -l) > 0 ))
    then
        echo "$i is running!!!"
    else
        echo "service $i start\n"
        service $i start
    fi
done
# re-run, but this time do a restart if it's still not going!
for i in "${arr[@]}"
do
    echo "Checking $i"
    if (( $(ps -ef | grep -v grep | grep $i | wc -l) > 0 ))
    then
        echo "$i is running!!!"
    else
        service $i restart
    fi
done
...then this cron entry (as root):
*/5 * * * * bash /root/script-checks.sh
The cron itself seems to run just fine:
Checking tomcat7
service tomcat7 start\n
Checking nginx
nginx is running!!!
Checking mysql
mysql is running!!!
Checking tomcat7
Checking nginx
nginx is running!!!
Checking mysql
mysql is running!!!
...and Tomcat's status seems OK:
root@domain:~# service tomcat7 status
● tomcat7.service - LSB: Start Tomcat.
Loaded: loaded (/etc/init.d/tomcat7)
Active: active (exited) since Mon 2016-03-21 06:33:28 GMT; 4 days ago
Process: 2695 ExecStart=/etc/init.d/tomcat7 start (code=exited, status=0/SUCCESS)
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
...yet my script can't connect to Solr:
Could not parse JSON response: malformed JSON string, neither tag, array, object, number, string or atom, at character offset 0 (before "Can't connect to loc...") at /srv/www/domain.net/www/cgi-bin/admin/WebService/Solr/Response.pm line 42. Can't connect to localhost:8080 Connection refused at /usr/share/perl5/LWP/Protocol/http.pm line 49.
If I manually run a "restart":
service tomcat7 restart
...it then starts working again. It's almost like the second part of my shell script isn't working.
Any suggestions?
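For reference, an equivalent check using pgrep instead of the ps/grep/wc chain would be something like this (untested sketch, same services as above):
#!/bin/bash
# untested variant: pgrep -f matches against the full command line,
# much like the ps -ef | grep check above
for i in tomcat7 nginx mysql; do
    if pgrep -f "$i" > /dev/null; then
        echo "$i is running"
    else
        service "$i" restart
    fi
done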
My Solr versions are as follows:
Solr Specification Version: 3.6.2.2014.10.31.18.33.47
Solr Implementation Version: 3.6.2 debian - pbuilder - 2014-10-31 18:33:47
Lucene Specification Version: 3.6.2
UPDATE: I've read that sometimes increasing maxThreads can help with crashes, so I've changed it to 10,000:
<Connector port="8443" protocol="org.apache.coyote.http11.Http11Protocol"
maxThreads="10000" SSLEnabled="true" scheme="https" secure="true"
clientAuth="false" sslProtocol="TLS" />
I guess time will tell whether this fixes the issue.

Ok, well I never got to the bottom of why it wouldn't restart... but I have worked out why it was crashing. Before, we had it on a 2048 MB RAM Linode server, but when we moved over to Apache2, I set up a 1024 MB server, intending to upgrade it to 2048 MB once we had it all working. However, we put it live and I forgot to upgrade it, so Nginx/Apache2/Tomcat/MySQL etc. were all trying to run on a pretty underpowered server.
We found that Solr was dying with an OOM (out of memory) error, which is what gave us the clue.
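For anyone in a similar spot: on Debian's tomcat7 packaging, the JVM memory settings live in /etc/default/tomcat7, so capping the heap there is one way to keep Tomcat within a small server's RAM. A sketch (the values are illustrative, not a recommendation):
# /etc/default/tomcat7 -- cap the heap so Tomcat/Solr can't outgrow a 1024 MB box
JAVA_OPTS="-Djava.awt.headless=true -Xms128m -Xmx512m"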
Hopefully this helps someone else, who may come across this.

Related

AWS Code Deploy - Script at specified location: scripts/validate_service.sh failed with exit code 1

My deployments fail on the last step, Validate Service, with this error message:
The overall deployment failed because too many individual instances failed deployment, too few healthy instances are available for deployment, or some instances in your deployment group are experiencing problems.
Events log
No lines are selected.
My validate_service.sh contains:
#!/bin/bash
# verify we can access our webpage successfully
curl -v --silent localhost:80 2>&1 | grep Welcome
Can someone advise what I should change?
The script's return value is what matters, and yours looks good to me. I just added a couple of seconds of waiting so the application can start up.
If you use bash -x together with a pipeline of commands, you should also add set -o pipefail so that the whole pipeline fails when one of its commands fails.
Check out my script:
#!/bin/bash
# give the application a few seconds to start before probing it
sleep 5
curl http://localhost:3009 | grep Welcome
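If the application can be slow to come up, a retry loop keeps the hook from failing on a race. A sketch (the port and the Welcome marker are just carried over from above):
#!/bin/bash
set -o pipefail
# probe the app several times before giving up, so a slow start doesn't fail the hook
for attempt in 1 2 3 4 5; do
    if curl --silent http://localhost:3009 | grep -q Welcome; then
        exit 0
    fi
    sleep 2
done
exit 1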

OpenStack DevStack installation error: g-api did not start on Ubuntu 18.04

I'm installing OpenStack using the All-In-One Single Machine setup and run the stack.sh script for the DevStack setup. On starting the Glance service, I get the following error on my console:
++:: curl -g -k --noproxy '*' -s -o /dev/null -w '%{http_code}' http://10.10.20.10/image
+:: [[ 503 == 503 ]]
+:: sleep 1
+functions:wait_for_service:485 rval=124
+functions:wait_for_service:490 time_stop wait_for_service
+functions-common:time_stop:2310 local name
+functions-common:time_stop:2311 local end_time
+functions-common:time_stop:2312 local elapsed_time
+functions-common:time_stop:2313 local total
+functions-common:time_stop:2314 local start_time
+functions-common:time_stop:2316 name=wait_for_service
+functions-common:time_stop:2317 start_time=1602763779096
+functions-common:time_stop:2319 [[ -z 1602763779096 ]]
++functions-common:time_stop:2322 date +%s%3N
+functions-common:time_stop:2322 end_time=1602763839214
+functions-common:time_stop:2323 elapsed_time=60118
+functions-common:time_stop:2324 total=569
+functions-common:time_stop:2326 _TIME_START[$name]=
+functions-common:time_stop:2327 _TIME_TOTAL[$name]=60687
+functions:wait_for_service:491 return 124
+lib/glance:start_glance:480 die 480 'g-api did not start'
+functions-common:die:198 local exitcode=0
+functions-common:die:199 set +o xtrace
[Call Trace]
./stack.sh:1306:start_glance
/opt/stack/devstack/lib/glance:480:die
[ERROR] /opt/stack/devstack/lib/glance:480 g-api did not start
Error on exit
World dumping... see /opt/stack/logs/worlddump-2020-10-15-121040.txt for details
neutron-dhcp-agent: no process found
neutron-l3-agent: no process found
neutron-metadata-agent: no process found
neutron-openvswitch-agent: no process found
I also tried increasing the timeout duration, but it still failed, and I also verified that devstack@g-api.service is in the active state. Can someone let me know the exact reason behind this issue and how to resolve it?
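For reference, I checked the unit state with commands along these lines:
# check the glance API unit and its recent logs
sudo systemctl status devstack@g-api.service
sudo journalctl -u devstack@g-api.service -n 50 --no-pager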
The only solution is to reinstall the entire system, including the OS.

paramiko and nohup ''

OK, so I have paramiko v2.2.1 and I am trying to log in to a machine and restart a service. Inside the service scripts, it basically starts a process via nohup. However, if I allow paramiko to disconnect as soon as it is done, the process that was started terminates with a PIPE signal when it writes to stdout.
If I start the service by ssh'ing into the box and manually starting it, there is no issue and it runs in the background fine. Also, if I add a long sleep 10 before disconnecting (close) paramiko, it also seems to work just fine.
The service is started from an init.d script via a line like this:
env LD_LIBRARY_PATH=$bin_path nohup $bin_path/ServerLoop.sh \
"$bin_path/Service service args" "$@" &
Where ServerLoop.sh simply calls the service forever in a loop like this, so it will never die:
SERVER=$1
shift
ARGS=$@
logger $ARGS
while [ 1 ]; do
    $SERVER $ARGS
    STATUS=$?
    logger "$SERVER terminated with exit code: $STATUS. Server has been restarted"
    sleep 1
done
I have noticed that when I start the service by ssh'ing into the box, I get a nohup.out file written to the root. However, when I run through paramiko, no nohup.out is written anywhere on the system... i.e. this is after I manually ssh into the box and start the service:
root@ts4700:/mnt/mc.fw/bin# find / -name "nohup*"
/usr/bin/nohup
/usr/share/man/man1/nohup.1.gz
/nohup.out
And this is after I run through paramiko:
root@ts4700:/mnt/mc.fw/bin# find / -name "nohup*"
/usr/bin/nohup
/usr/share/man/man1/nohup.1.gz
As I understand it, nohup will only redirect the output to nohup.out "if standard output is a terminal" (from the manual); otherwise it assumes the output is already going to a file and does not redirect. Hence I tried the following:
In [43]: import paramiko
In [44]: paramiko.__version__
Out[44]: '2.2.1'
In [45]: ssh = paramiko.SSHClient()
In [46]: ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
In [47]: ssh.connect(ip, username='root', password=not_for_so_sorry, look_for_keys=False, allow_agent=False)
In [48]: stdin, stdout, stderr = ssh.exec_command("tty")
In [49]: stdout.read()
Out[49]: 'not a tty\n'
So I am thinking that nohup is not redirecting to nohup.out when I run it through paramiko because tty is not returning a terminal. I don't know why adding a sleep(10) would fix this, though, as the service, if run on the command line, is quite verbose.
I have also noticed that if the service is started from a manual ssh, its tty in the ps ax output is still set to the ssh tty... however, if the process is started by paramiko, its tty in the ps ax output is set to "?". Since both processes are run through nohup, I would have expected this to be the same.
If the problem is that nohup is indeed not redirecting the output to nohup.out because of the tty, is there a way to force this to happen, or a better way to run this sort of command via paramiko?
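For example, would making the redirection explicit, so that nohup's tty check no longer matters, be the right approach? Something like this (untested; the service name and paths are illustrative):
# explicit redirections mean nohup never has to create nohup.out itself
nohup /etc/init.d/myservice start > /tmp/myservice.out 2>&1 < /dev/null &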
Thanks all, any help with this would be great :)

Run multiple instances of RStudio in a web browser

I have RStudio Server installed on a remote AWS server (Ubuntu) and want to run several projects at the same time (one of which takes a long time to finish). On Windows there is a simple GUI solution like 'Open Project in New Window'. Is there something similar for RStudio Server?
It's a simple question, but I failed to find a solution, except for this related question for Macs, which offers:
Run multiple RStudio sessions using projects
but how?
While running batch scripts is certainly a good option, it's not the only solution. Sometimes you may still want interactive use in different sessions rather than having to do everything as batch scripts.
Nothing stops you from running multiple instances of RStudio Server on your Ubuntu server on different ports (I find this particularly easy to do by launching RStudio through docker, as outlined here). Because an instance will keep running even when you close the browser window, you can easily launch several instances and switch between them. You'll just have to log in again when you switch.
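For example, with the rocker/rstudio image, two independent instances on different host ports might look like this (a sketch; the image, names, ports and passwords are just examples):
# two isolated RStudio Server instances, reachable on host ports 8787 and 8788
docker run -d -p 8787:8787 -e PASSWORD=secret1 --name rstudio-a rocker/rstudio
docker run -d -p 8788:8787 -e PASSWORD=secret2 --name rstudio-b rocker/rstudio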
Unfortunately, RStudio Server still prevents you from having multiple instances open in the browser at the same time (see the help forum). This is not a big issue, as you just have to log in again, but you can also work around it by using different browsers.
EDIT: Multiple instances are fine as long as they are not in the same browser, for the same browser user, AND on the same IP address; e.g., a session on 127.0.0.1 and another on 0.0.0.0 would be fine. More importantly, the instances keep running even if they are not 'open', so this really isn't a problem. The only thing to note is that you have to log back in to access an instance.
As for projects, you'll see you can switch between projects using the 'Projects' button at the top right, but while this preserves your other sessions, I do not think it actually supports simultaneous code execution. You need multiple instances of the R environment running to actually do that.
UPDATE 2020: Okay, it's now 2020 and there are lots of ways to do this.
For running scripts or functions in a new R environment, check out:
the callr package
The RStudio jobs panel
Run new R sessions or scripts from one or more terminal sessions in the RStudio terminal panel
Log out and log in to the RStudio Server as a different user (this requires multiple users to be set up in the container; obviously not a good workflow for a single user, but it's worth noting that many different users can access the same RStudio Server instance without a problem).
Of course, spinning up multiple docker sessions on different ports is still a good option as well. Note that many of the ways listed above still do not allow you to restart the main R session, which prevents you from reloading installed packages, switching between projects, etc., which is clearly not ideal. I think it would be fantastic if switching between projects in an RStudio (server) session allowed jobs in the previously active project to keep running in the background, but I have no idea whether that's in the cards for the open source version.
Often you don't need several instances of RStudio; in that case, just save your code in a .R file and launch it from the Ubuntu command prompt (maybe using screen):
Rscript script.R
That will launch a separate R session which will do the work without freezing your Rstudio. You can pass arguments too, for example
# script.R -
args <- commandArgs(trailingOnly = TRUE)
if (length(args) == 0) {
    start = '2015-08-01'
} else {
    start = args[1]
}
console -
Rscript script.R 2015-11-01
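To keep the script running after you log out (without screen), you can also detach it with nohup, e.g.:
# run the script detached so it survives the SSH session ending
nohup Rscript script.R 2015-11-01 > script.log 2>&1 &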
I think you need RStudio Server Pro to be able to log in with multiple users/sessions.
You can see the comparison table below for reference.
https://www.rstudio.com/products/rstudio-server-pro/
Installing another instance of rstudio server is less than ideal.
Linux server admins, fear not. You just need root access or a kind admin.
Create a group to use: groupadd Rwarrior
Create an additional user with the same home directory as your primary RStudio login:
useradd -d /home/user1 user2
Add primary and new user into Rwarrior group:
gpasswd -a user2 Rwarrior
gpasswd -a user1 Rwarrior
Take care of the permissions for your primary home directory:
cd /home
chown -R user1:Rwarrior /home/user1
chmod -R 770 /home/user1
chmod g+s /home/user1
Set password for the new user:
passwd user2
Open a new browser window in incognito/private browsing mode and login to Rstudio with the new user you created. Enjoy.
I run multiple RStudio servers by isolating them in Singularity instances. Download the Singularity image with the command singularity pull shub://nickjer/singularity-rstudio
I use two scripts:
run-rserver.sh:
Find a free port
#!/bin/env bash
set -ue
thisdir="$(dirname "${BASH_SOURCE[0]}")"
# Return 0 if the port $1 is free, else return 1
is_port_free(){
port="$1"
set +e
netstat -an |
grep --color=none "^tcp.*LISTEN\s*$" | \
awk '{gsub("^.*:","",$4);print $4}' | \
grep -q "^$port\$"
r="$?"
set -e
if [ "$r" = 0 ]; then return 1; else return 0; fi
}
# Find a free port
find_free_port(){
local lower_port="$1"
local upper_port="$2"
for ((port=lower_port; port <= upper_port; port++)); do
if is_port_free "$port"; then r=free; else r=used; fi
if [ "$r" = "used" -a "$port" = "$upper_port" ]; then
echo "Ports $lower_port to $upper_port are all in use" >&2
exit 1
fi
if [ "$r" = "free" ]; then break; fi
done
echo $port
}
port=$(find_free_port 8080 8200)
echo "Access RStudio Server on http://localhost:$port" >&2
"$thisdir/cexec" \
rserver \
--www-address 127.0.0.1 \
--www-port $port
cexec:
Create a dedicated config directory for each instance
Create a dedicated temporary directory for each instance
Use the singularity instance mechanism to prevent forked R sessions from being adopted by PID 1 and staying around after rserver has shut down. Instead, they become children of the Singularity instance and are killed when that shuts down.
Map the current directory to the directory /data inside the container and set that as the home folder (this step might not be necessary if you don't care about reproducible paths on every machine)
#!/usr/bin/env bash
# Execute a command in the container
set -ue
if [ "${1-}" = "--help" ]; then
cat <<'EOF'
Usage: cexec command [args...]
Execute `command` in the container. This script starts the Singularity
container and executes the given command therein. The project root is mapped
to the folder `/data` inside the container. Moreover, a temporary directory
is provided at `/tmp` that is removed after the end of the script.
EOF
exit 0
fi
thisdir="$(dirname "${BASH_SOURCE[0]}")"
container="rserver_200403.sif"
# Create a temporary directory
tmpdir="$(mktemp -d -t cexec-XXXXXXXX)"
# We delete this directory afterwards, so it's important that $tmpdir
# really has the path to an empty, temporary dir, and nothing else!
# (for example empty string or home dir)
if [[ ! "$tmpdir" || ! -d "$tmpdir" ]]; then
echo "Error: Could not create temp dir $tmpdir"
exit 1
fi
# check if temp dir is empty (this might be superfluous, see
# https://codereview.stackexchange.com/questions/238439)
tmpcontent="$(ls -A "$tmpdir")"
if [ ! -z "$tmpcontent" ]; then
echo "Error: Temp dir '$tmpdir' is not empty"
exit 1
fi
# Start Singularity instance
instancename="$(basename "$tmpdir")"
# Maybe also superfluous (like above)
rundir="$(readlink -f "$thisdir/.run/$instancename")"
if [ -e "$rundir" ]; then
echo "Error: Runtime directory '$rundir' exists already!" >&2
exit 1
fi
mkdir -p "$rundir"
singularity instance start \
--contain \
-W "$tmpdir" \
-H "$thisdir:/data" \
-B "$rundir:/data/.rstudio" \
-B "$thisdir/.rstudio/monitored/user-settings:/data/.rstudio/monitored/user-settings" \
"$container" \
"$instancename"
# Delete the temporary directory after the end of the script
trap "singularity instance stop '$instancename'; rm -rf '$tmpdir'; rm -rf '$rundir'" EXIT
singularity exec \
--pwd "/data" \
"instance://$instancename" \
"$#"

Baffling Debian installation script: service starts but then quits mysteriously - Why?

I've been struggling with this one for a while and I'm thoroughly baffled.
I have this postinst Debian script that is supposed to start a service once installation (of the service executable) is complete. As best I can tell, the service does start successfully, but it then immediately and mysteriously quits. Restarting the service from the command line works fine once Synaptic concludes.
I tried writing a dummy package to verify this. The dummy package installs /etc/init/service-dummy.conf and a symbolic link to that file, named /etc/init.d/service-dummy (just like the original service). The contents of service-dummy.conf are the same as service.conf. The dummy starts the service...and then the service keeps on running. So I can't even reproduce my problem!
The postinst script does this:
#!/bin/sh
set -e
case "$1" in
configure)
# (instructions that configure, make, and install the freshly installed source code)
ldconfig
echo "Install concluded"
if [ -e "/etc/init/service-dummy.conf" ]; then
echo "Starting service-dummy root service" | tee service.log
service service-dummy restart | tee --append service.log
else
echo "service-dummy.conf not installed"
fi
echo "Postinst complete"
;;
*)
echo "postinst called with unknown argument '$1'" >&2
;;
esac
# exit 1 to ensure installer stalls
exit 1
Synaptic displays the log:
...
Starting service-dummy root service
stop: Unknown instance:
service-dummy start/running, process 9207
Postinst complete
dpkg: error processing service-dummy (--configure):
subprocess installed post-installation script returned error exit status 1
...
It's as if upstart needed to be refreshed?
I tried more things, and then I did get it to work, sort of: I start the service, then abort the script with an exit 1; when the script runs a second time (postinst with the same parameters, which is how I detect the second run), I start the service again, and this time it sticks.
A key clue is in the log:
Postinst complete (aborting script)
dpkg: error processing service-dummy (--configure):
subprocess installed post-installation script returned error exit status 1
Processing triggers for libc-bin ...
ldconfig deferred processing now taking place
Errors were encountered while processing:
service-dummy
E: Sub-process /usr/bin/dpkg returned an error code (1)
A package failed to install. Trying to recover:
Setting up service-dummy ...
service-dummy postinst configure
Starting service-dummy a second time
stop: Unknown instance:
service-dummy start/running, process 4034
Postinst complete (aborting script recovery attempt)
So I guess my question now becomes:
How do I force ldconfig to not defer its processing?
Found the right clue here: http://lists.debian.org/debian-glibc/2008/07/msg00169.html
Turns out apt-get temporarily prevents use of ldconfig by replacing it with something else. The solution to my problem is simply to call ldconfig.real instead of ldconfig in the script.
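In the postinst, that works out to something along these lines (a sketch; the fallback covers the case where no diversion is active):
# dpkg/apt diverts ldconfig during installation, so call the real binary if present
if [ -x /sbin/ldconfig.real ]; then
    /sbin/ldconfig.real
else
    ldconfig
fi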
