I am facing a strange issue where nginx stops at around 12 am (between 12 am and 1 am) and 12 pm (between 12 pm and 1 pm) UTC every day. I monitor the website's up/down status with a website monitoring tool. Whenever the website goes down, an entry like this appears in the nginx error log at the same time:
2018/07/02 12:16:54 [notice] 2288#2288: signal process started
When I check the nginx status with "sudo systemctl status nginx", it says:
Active: inactive (dead) since Mon 2018-07-02 12:16:54 UTC; 19min ago
So I have to restart nginx to get the website working, and the same thing happens again the next day.
I figured out the issue myself. I checked the syslog and found that certbot was trying to renew the certificate at those times. Fixing certbot fixed the issue.
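A minimal sketch of the syslog check described above. It assumes the traditional syslog line format ("Jul  2 12:16:54 host program[pid]: message"); the function name certbot_lines_at is hypothetical, and the log path in the usage comment is the Ubuntu default rather than something stated in the post.

```shell
#!/usr/bin/env bash
# Print syslog lines that mention certbot during a given hour of the day.
# Usage: certbot_lines_at /var/log/syslog 12
# Assumes timestamps of the form "Mon DD HH:MM:SS" at the start of each line.
certbot_lines_at() {
    local logfile=$1 hour=$2
    # First keep lines mentioning certbot, then keep those whose timestamp
    # falls within the requested hour.
    grep -i 'certbot' "$logfile" |
        grep -E "^[A-Z][a-z]{2} +[0-9]+ ${hour}:[0-9]{2}:[0-9]{2} "
}
```

If the matching lines line up with the timestamp in the nginx error log, the renewal job is a likely suspect.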
I have a VPS with a domain pointing to it.
I run a LAMP stack for my main website, which uses the WordPress CMS.
In addition, I run Odoo as my back end, with Python and PostgreSQL, on a sub-domain.
Everything was working fine until I installed Certbot (Let's Encrypt) to obtain an SSL certificate by following the tutorials listed below.
For my WordPress site I first installed this plugin:
WP Encryption – One Click single / wildcard Free SSL certificate & force HTTPS
This got me into a redirect loop because it forced HTTPS; I will explain that later on.
When the plugin didn't work, I looked for another way to cover the whole VPS and found these tutorials:
How To Secure Apache with Let's Encrypt on Ubuntu 16.04
How To Secure Apache with Let's Encrypt on Ubuntu 18.04
After completing the second tutorial (for Ubuntu 18.04), I noticed that all traffic to my domain was being sent to HTTPS and was stuck in the same loop mentioned above:
"ERR_TOO_MANY_REDIRECTS which means Site redirected too many times"
and I couldn't access the WordPress front end on the domain.
Then, when I applied
"Step 3 — Allowing HTTPS Through the Firewall"
my internet connection got interrupted, and when I got back to the SSH session I found myself locked out of the server with no way to get back in.
And when I tried to use the sub-domain that hosts Odoo, I got the same error:
"ERR_TOO_MANY_REDIRECTS which means Site redirected too many times"
At this point I was hopeless and didn't know what to do.
I contacted my VPS provider and told him exactly what had happened. He somehow managed to get me back into the server through a URL to a web terminal, although I still couldn't access the server with SSH clients like PuTTY. The first thing I noticed after logging in through that URL was that he had rebooted the VPS; I will get to this in a second.
The first thing I did was remove the WordPress plugin "WP Encryption" and update the WordPress site URL in the wp_options table of the MySQL database, because the plugin had changed it from http to https. Changing it back solved ERR_TOO_MANY_REDIRECTS for my WordPress website.
The second thing I did was disable the ufw firewall that I had enabled in Step 3 of the tutorial above.
I instantly got my SSH connection to the server back in PuTTY, but then I noticed that the postgres service was inactive; it had gone down with the reboot of the VPS. I tried to start the service, but it didn't start and gave me this error:
Failed to start postgresql.service: Unit postgresql.service is masked.
I searched for a solution and found these commands to unmask the service:
sudo systemctl unmask postgresql
sudo systemctl enable postgresql
sudo systemctl restart postgresql
After that the service started, and everything seems OK when I run the status command:
service postgresql status
The response is:
● postgresql.service - LSB: PostgreSQL RDBMS server
Loaded: loaded (/etc/init.d/postgresql; generated)
Active: active (exited) since Thu 2020-03-26 05:54:09 UTC; 2h 22min ago
Docs: man:systemd-sysv-generator(8)
Tasks: 0 (limit: 2286)
Memory: 0B
CGroup: /system.slice/postgresql.service
But when I try to connect to postgres on the default port from Odoo, it says:
could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"
After many searches I found that the postgres main cluster is also inactive. I tried to start it with this command:
pg_ctlcluster 11 main start
but I get this error:
Job for postgresql@11-main.service failed because the service did not take the steps required by its unit configuration. See "systemctl status postgresql@11-main.service" and "journalctl -xe" for details.
When I run the command as requested:
systemctl status postgresql@11-main.service
I get this:
● postgresql@11-main.service - PostgreSQL Cluster 11-main
Loaded: loaded (/lib/systemd/system/postgresql@.service; disabled; vendor preset: enabled)
Active: failed (Result: protocol) since Thu 2020-03-26 15:22:15 UTC; 14s ago
Process: 18930 ExecStart=/usr/bin/pg_ctlcluster --skip-systemctl-redirect 11-main start (code=exited, status=1/FAILURE)
along with
systemd[1]: Starting PostgreSQL Cluster 11-main...
postgresql@11-main[18930]: Error: Could not find pg_ctl executable for version 11
systemd[1]: postgresql@11-main.service: Can't open PID file /run/postgresql/11-main.pid (yet?) after start: No such file or
systemd[1]: postgresql@11-main.service: Failed with result 'protocol'.
systemd[1]: Failed to start PostgreSQL Cluster 11-main.
I guessed that Let's Encrypt had added SSL configuration to pg_hba.conf and postgresql.conf, as it did with Apache, so I found those files, commented out the "ssl on" lines, and restarted the postgres service along with the main cluster. Nothing changed; I still get the same error:
Error: Could not find pg_ctl executable for version 11
I know I shouldn't run pg_ctl directly on Ubuntu/Debian; I must use pg_ctlcluster instead, which is installed by postgresql-common. I read the man page. But when I run "sudo pg_ctlcluster 11 main reload", I always get the error above saying that it could not find the pg_ctl executable.
I have searched a lot for this problem but nothing worked. How can I solve the missing pg_ctl executable for version 11?
PS:
I am using Ubuntu 19.10 (GNU/Linux 5.3.0-24-generic x86_64)
Odoo 11 with postgres 11 as the database; Odoo can't connect to postgres, as I mentioned before.
Edit:
Unfortunately I can't restore the server from backup to fix the postgres packages, because my last server backup is from 19/3 and today is 26/3, and I have important data from that period.
Update 27/3/2020 4:06 AM
I compared my last server backup with the production server and found that a lot of postgres files are missing, for example under /usr/lib/postgresql/11/ and /etc/postgresql/11/. I think postgres somehow got damaged and lost some files during the reboot of the server. However, I found the database data files in /var/lib/postgresql/11/. Can I read them on my backup server? I will try and let you know.
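The "Could not find pg_ctl executable for version 11" error fits with binaries missing under /usr/lib/postgresql/11/. A small sketch for checking this, assuming the Debian/Ubuntu packaging layout where version-specific binaries live in /usr/lib/postgresql/VERSION/bin; the function name has_pg_ctl is hypothetical.

```shell
#!/usr/bin/env bash
# Report whether the pg_ctl binary for a PostgreSQL major version is present.
# Usage: has_pg_ctl 11
# The optional second argument overrides the root directory
# (default: /usr/lib/postgresql, the Debian/Ubuntu location).
has_pg_ctl() {
    local version=$1 root=${2:-/usr/lib/postgresql}
    local pg_ctl="$root/$version/bin/pg_ctl"
    if [ -x "$pg_ctl" ]; then
        echo "found: $pg_ctl"
    else
        echo "missing: $pg_ctl"
        return 1
    fi
}
```

A "missing" result suggests the postgresql-11 package files are gone and that reinstalling the package is needed, rather than changing any configuration.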
So, finally, after hours of digging:
All the PostgreSQL files were damaged or missing, and I lost hope of repairing them. I don't know what caused it, but it is related to the accidental reboot of the server.
I managed to find the main cluster data directory, which holds the important database data of the production server, in this path:
/var/lib/postgresql/11/
I backed it up by zipping the whole folder with this command, and kept the archive outside /var/lib/postgresql/ (in /backups), because that directory gets deleted below:
zip -r main.zip main/
Then I did a full purge and reinstall of postgres using these commands from here:
apt-get --purge remove postgresql\*
to remove everything PostgreSQL from your system. Just purging the postgres package isn't enough since it's just an empty meta-package.
Once all PostgreSQL packages have been removed, run:
rm -r /etc/postgresql/
rm -r /etc/postgresql-common/
rm -r /var/lib/postgresql/
userdel -r postgres
groupdel postgres
Then I installed postgres with this command to match Odoo 11:
sudo apt-get install postgresql libpq-dev -y
Then I created the Odoo PostgreSQL user:
sudo su - postgres -c "createuser -s odoo" 2> /dev/null || true
Now everything is okay and Odoo should work fine, but you still don't have any database.
To bring back the cluster folder we backed up earlier, we need to move the zip file to the directory we took it from, which is
/var/lib/postgresql/11/
But before that, stop the postgres service
sudo systemctl stop postgresql
and make sure it has stopped
sudo systemctl status postgresql
After that, rename the main cluster that postgres currently uses; it is empty and we are replacing it with our backed-up cluster
mv /var/lib/postgresql/11/main /var/lib/postgresql/11/main_old
Then move the zip file from where you backed it up to the postgres cluster folder with this command
mv /backups/main.zip /var/lib/postgresql/11/
Unzip the folder into the same path with this command
unzip /var/lib/postgresql/11/main.zip -d /var/lib/postgresql/11/
After unzipping, give ownership of the folder to the postgres user and group
chown -R postgres:postgres /var/lib/postgresql/11/main
Then you are good to go. Start the postgres service
sudo systemctl start postgresql
sudo systemctl status postgresql
and make sure you also start the main cluster
pg_ctlcluster 11 main start
If you stopped Odoo, make sure to start it as well
service odoo-server start
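Before starting the service in the steps above, it can help to confirm that the restored folder really is a PostgreSQL 11 data directory. A minimal sketch, relying on the PG_VERSION file that PostgreSQL keeps at the top of every data directory; the function name check_data_dir is hypothetical.

```shell
#!/usr/bin/env bash
# Check that a directory looks like a PostgreSQL data directory of the
# expected major version by reading its PG_VERSION file.
# Usage: check_data_dir /var/lib/postgresql/11/main 11
check_data_dir() {
    local dir=$1 expected=$2 actual
    if [ ! -f "$dir/PG_VERSION" ]; then
        echo "no PG_VERSION in $dir"
        return 1
    fi
    # PG_VERSION holds just the major version followed by a newline.
    actual=$(tr -d '[:space:]' < "$dir/PG_VERSION")
    if [ "$actual" = "$expected" ]; then
        echo "ok: version $actual"
    else
        echo "version mismatch: found $actual, expected $expected"
        return 1
    fi
}
```

A mismatch would mean the restored cluster was created by a different major version than the one just installed, in which case the server will refuse to start on it.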
PS: I solved ERR_TOO_MANY_REDIRECTS for the Odoo sub-domain by commenting out the SSL configuration in my odoo.config Apache2 virtual host, which Let's Encrypt had modified earlier, and everything went back to how I left it before installing Let's Encrypt.
I guess I will leave it here and won't use SSL in production again until I figure out how to set it up on a test server. Thanks for your time; I hope my question and answer help someone in the future.
Try adding 'pg_path' to your Odoo configuration file.
Like: pg_path = /path/to/postgresql/binaries
Generally '/usr/lib/postgresql/11/bin' is the binary directory.
I've been trying to start my Shiny server with no success. I followed the instructions on the RStudio site, but when I check my server status, this is what I get:
$ sudo systemctl status shiny-server
● shiny-server.service - ShinyServer
Loaded: loaded (/etc/systemd/system/shiny-server.service; enabled; vendor preset: disabled)
Active: deactivating (stop-post) since Mon 2018-04-30 21:16:03 -03; 2s ago
Process: 17672 ExecStart=/usr/bin/env bash -c exec /opt/shiny-server/bin/shiny-server >> /var/log/shiny-server.log 2>&1 (code=exited, status=0/SUCCESS)
Main PID: 17672 (code=exited, status=0/SUCCESS); : 17682 (sleep)
CGroup: /system.slice/shiny-server.service
└─control
└─17682 sleep 5
Apr 30 21:16:02 shiny.estatistica.ufrn.br systemd[1]: Started ShinyServer.
Apr 30 21:16:02 shiny.estatistica.ufrn.br systemd[1]: Starting ShinyServer...
But shiny.estatistica.ufrn.br is not my website! My website is shiny.estatistica.ccet.ufrn.br/ (there is a ccet in there). Notice that Apache is alive and running when ccet is added to the url.
So, what can I do to start my Shiny server? I think it has something to do with the URL without ccet, but I couldn't figure out how to fix it.
Shiny Server has nothing to do with Apache. The name shiny.estatistica.ufrn.br in the last two lines looks like your server's (machine) name, and the missing ccet has nothing to do with Shiny Server.
To have http://shiny.estatistica.ccet.ufrn.br/ point to your Shiny server, ask your local IT/network department for help, as it requires HTTP requests for that name under the ufrn.br domain to be forwarded to your Shiny server.
I would just erase the disk and start over; otherwise you wind up spending far too much time "fixing" things in ways that aren't how Shiny wants to work.
I recently installed Ghost 1.8.4 and Nginx on my AWS ec2 Ubuntu 16.04 server. When I loaded my blog site, it correctly took me to the Ghost home page, from where I logged into Ghost admin. On the admin screen, there was a message to update.
I ran ghost update in PuTTY.
The update appeared to be successful, but when I returned to my blog site, I received the following error:
502 Bad Gateway
nginx/1.10.3 (Ubuntu)
Does anyone know a probable cause of this error and how to resolve it?
I checked some posts, which suggested I should have turned Ghost off before the update. If this is true, is my ghost installation now corrupted?
I went to my ghost directory in /var/www/ghost and tried to run:
sudo service ghost start
but it returned:
Failed to start ghost.service: Unit ghost.service not found
and trying to stop, returns Unit ghost.service not loaded. Am I running the command from the correct location?
I've experienced 502 issues with Ghost behind nginx several times over a few years of running it. I'm not sure if the cause of mine today is the same as yours, but what I observed was that after a restart, the port Ghost was running on no longer matched the port in the proxy_pass of its nginx config.
I followed these directions from https://web.archive.org/web/20200807095031/https://www.danwalker.com/running-ghost-on-a-5-digital-ocean-vps/ which resolved it for me:
See which port ghost is running on:
sudo netstat -plotn
Check that it matches the proxy_pass in the nginx config file in /etc/nginx/sites-enabled.
In my case the port in the nginx config had incremented to 2369 while the actual node process was running on 2368. Changing the proxy_pass port back to 2368 in my ghost blog's nginx config file resolved the issue for me.
I ran into the same problem after upgrading ghost.
Make sure the port number configured in your ghost's config file and the proxy_pass in your ghost site's nginx configuration files match.
Check that the port number in /var/www/ghost/config.production.json matches the proxy_pass port in the nginx config files:
/var/www/ghost/system/files/<yourDomainName>.<extension>.conf
/var/www/ghost/system/files/<yourDomainName>.<extension>-ssl.conf
In my case I had to change 2368 to 2369 in the nginx config files to fix the issue.
Make sure you restart your ghost and nginx after you make the changes.
# restart your ghost site
cd /var/www/ghost/
ghost restart
# restart nginx
sudo systemctl restart nginx
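The comparison described above can be sketched as a small script. This is an illustration under assumptions: the function names are hypothetical, the first "port" key in config.production.json is taken to be Ghost's server port, and only the first proxy_pass directive in the nginx file is inspected.

```shell
#!/usr/bin/env bash
# Print the first "port" value found in a Ghost config JSON file
# (assumed to be the server port).
ghost_port() {
    grep -oE '"port"[[:space:]]*:[[:space:]]*[0-9]+' "$1" | head -n 1 | grep -oE '[0-9]+$'
}

# Print the port of the first proxy_pass directive in an nginx config file,
# e.g. 2368 for "proxy_pass http://127.0.0.1:2368;".
nginx_proxy_port() {
    grep -E '^[[:space:]]*proxy_pass[[:space:]]' "$1" | head -n 1 |
        grep -oE ':[0-9]+' | head -n 1 | tr -d ':'
}

# Compare the two ports and report the result.
# Usage: compare_ports GHOST_CONFIG_JSON NGINX_CONF
compare_ports() {
    local g n
    g=$(ghost_port "$1" || true)
    n=$(nginx_proxy_port "$2" || true)
    if [ -z "$g" ] || [ -z "$n" ]; then
        echo "could not read a port from one of the files"
        return 1
    elif [ "$g" = "$n" ]; then
        echo "match: $g"
    else
        echo "mismatch: ghost=$g nginx=$n"
        return 1
    fi
}
```

A mismatch points at the situation both answers describe: nginx is proxying to a port that Ghost is not listening on, which produces the 502.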
Hope this helps someone.
Apparently when I posted this issue it was due to a bug in the Ghost CLI that the ghost team were in the process of fixing.
They provided me with these instructions to run on my server:
systemctl stop ghost_www-blogwebsite-com
ghost update --force
The resulting output:
stopping Ghost [skipped]
Removing old Ghost versions [skipped]
This fixed the problem and updated to the correct version.
I have Gitlab 8.6 running on an Ubuntu 14.04 server that seems to have gotten messed up. I consistently get a 502 error when accessing the site. The server likely has not been restarted since installing Gitlab initially, and a power outage caused the server to reboot. Now, I cannot start/restart Gitlab due to what appears to be port conflicts.
I installed Gitlab via source, I don't have any custom port configurations, and am using NGINX. nginx -t shows that the configuration appears to be correct syntax-wise.
When I run netstat -tupln, I see that Unicorn and a Gitlab instance are already running on :8080 and :80 respectively at boot. I suspect that a second instance of Gitlab was installed, runs at boot, and causes port conflicts for the proper instance when I try to start it via service gitlab restart. I'm not even sure that's possible, but I can't figure out where to go from here. Every time I run sudo gitlab-ctl reconfigure or service gitlab start, it fails and unicorn.stderror.log shows bind errors on port :8080. I tried moving the Unicorn service to :8081 as well, but I still get the port binding error.
Does anyone know how I can detect if there are multiple Gitlab instances running, and maybe if there is a way to remove a duplicated one if it's possible? Thank you!
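One way to narrow this down is to list which process owns each conflicting port. The sketch below filters netstat -tupln output for TCP listeners on a given port; the function name listeners_on_port is hypothetical, and it prints only the first word of the "PID/Program name" column.

```shell
#!/usr/bin/env bash
# Read `netstat -tupln` output on stdin and print the PID/program of every
# TCP socket in LISTEN state on the given port.
# Usage: sudo netstat -tupln | listeners_on_port 8080
listeners_on_port() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $6 == "LISTEN" {
            # Local address looks like 0.0.0.0:8080 or :::80;
            # the port is whatever follows the last colon.
            n = split($4, parts, ":")
            if (parts[n] == port) print $7
        }'
}
```

The PID can then be inspected (for example with ps -fp PID) to see which installation the listening process was started from.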
EDIT: Here is what is in the /etc/gitlab/gitlab.rb file. Everything else is commented out.
## Url on which GitLab will be reachable
external_url 'http://my-gitlab-instance.domain.com'
EDIT 2: My /home/git/gitlab/ directory is mapped to https://gitlab.com/gitlab-org/gitlab-ce.git, and is on the 8-7-stable branch. gitlab-shell and gitlab-workhorse are on the correct versions according to https://gitlab.com/gitlab-org/gitlab-ce/blob/master/doc/update/8.6-to-8.7.md
EDIT 3: I have gotten to a point where the Gitlab seems to self-check okay by removing the gitlab-ce package (https://gitlab.com/gitlab-org/omnibus-gitlab/issues/135), but the server returns a 404. NGINX, Unicorn, Sidekiq, and gitlab-workhorse all say that they're running. I see that unicorn.rb is listening on :8080, and nginx is listening on 0.0.0.0:80 and :::80. I guess now I'm troubleshooting this 404 and hopefully I will be back to my install-from-source.
What I have found is that there were 2 issues causing the errors I had.
First, I removed a "gitlab-ce" package that was installed, following the instructions here: https://gitlab.com/gitlab-org/omnibus-gitlab/issues/135. For some reason, when I restart the machine I now have to restart these services, in this order, for Gitlab to run properly: redis-server, gitlab, nginx. However, Gitlab does start responding properly after that.
Second, the 404 error was due to a different server that was also listening on that IP address, causing a conflict.
I will likely move to using the omnibus package on a fresh, new server going forward, but at least the immediate issues appear resolved. Thanks for your help, SLY!
I'm using nginx 1.8.1 on CentOS 7 (64-bit). I installed WordPress there and it works great, but sometimes, when I install a plugin or do some other activity on the site, the MySQL service dies and the site shows "Error establishing a database connection", because the service died and did not start again.
The site has been on this server for about 3 days, and the issue started a day ago, when there were more plugins or more activity on the site; I'm not sure what is causing it. The first several times this happened, I connected to the server with the PuTTY client and ran reboot, which solved the issue. A couple of hours ago the issue popped up again while I was adding simple products in WooCommerce, so I logged in with PuTTY and this time ran systemctl stop mariadb and then systemctl start mariadb. I thought the MySQL server just dies randomly, but 15 minutes later it happened again. This time I didn't stop MariaDB first; I just ran systemctl start mariadb, and it didn't work. It says:
[root@lrweb ~]# systemctl start mariadb
Job for mariadb.service failed. See 'systemctl status mariadb.service' and 'journalctl -xn' for details.
So the next thing I did was check the status with systemctl status mariadb, and it shows:
[root@lrweb ~]# systemctl status mariadb
mariadb.service - MariaDB database server
Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled)
Active: failed (Result: exit-code) since Tue 2016-02-09 03:48:11 EST; 45s ago
Process: 24708 ExecStartPost=/usr/libexec/mariadb-wait-ready $MAINPID (code=exited, status=1/FAILURE)
Process: 24707 ExecStart=/usr/bin/mysqld_safe --basedir=/usr (code=exited, status=0/SUCCESS)
Process: 24680 ExecStartPre=/usr/libexec/mariadb-prepare-db-dir %n (code=exited, status=0/SUCCESS)
Main PID: 24707 (code=exited, status=0/SUCCESS)
So I decided to stop MariaDB and start it again like I did before, but when I run the stop command it outputs nothing:
[root@lrweb ~]# systemctl stop mariadb
[root@lrweb ~]#
And when I try to start the service again, it does not start and gives me the same message as above.
I'm sure I can resolve this by rebooting the server, which would start all services automatically.
I'm using DigitalOcean, so I went to check the resource usage there. My server is a medium-sized one and I assumed usage couldn't be that high, but apparently I was wrong, because the resource graphs show that everything was at its maximum.
I'm not sure what is causing the problem, how to fix it, or what I'm doing wrong.
Thanks for your time and help.
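The post does not show what actually stopped MariaDB. One possibility on a server whose resources are maxed out, offered here only as an assumption and not something the post confirms, is that the kernel's out-of-memory killer ended the mysqld process. A minimal sketch for checking the system log for that; the function name oom_kills is hypothetical, and /var/log/messages in the usage comment is the usual CentOS 7 location for kernel messages.

```shell
#!/usr/bin/env bash
# Print log lines where the kernel OOM killer selected or killed a process
# with the given name.
# Usage: oom_kills /var/log/messages mysqld
# Matches kernel messages such as
#   "Out of memory: Kill process 1234 (mysqld) ..." and
#   "Killed process 1234 (mysqld) ..."
oom_kills() {
    local logfile=$1 name=$2
    grep -E "(Out of memory: Kill(ed)? process|Killed process) [0-9]+ \(${name}\)" "$logfile"
}
```

If nothing matches, the cause lies elsewhere, and the MariaDB log and journalctl output mentioned in the error message are the next places to look.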