I am trying to deploy an application with k3s Kubernetes. Currently I have two master nodes behind a load balancer, and I have some issues connecting worker nodes to them. All nodes and the load balancer run in separate VMs.
The load balancer is an nginx server with the following configuration:
load_module /usr/lib/nginx/modules/ngx_stream_module.so;
events {}
stream {
    upstream k3s_servers {
        server {master_node1_ip}:6443;
        server {master_node2_ip}:6443;
    }
    server {
        listen 6443;
        proxy_pass k3s_servers;
    }
}
The master nodes connect through the load balancer, and that seems to work as expected:
ubuntu@ip-172-31-20-78:/$ sudo k3s kubectl get nodes
NAME               STATUS   ROLES                  AGE   VERSION
ip-172-31-33-183   Ready    control-plane,master   81m   v1.20.2+k3s1
ip-172-31-20-78    Ready    control-plane,master   81m   v1.20.2+k3s1
However, the worker nodes yield an error about the SSL certificate:
sudo systemctl status k3s-agent
● k3s-agent.service - Lightweight Kubernetes
Loaded: loaded (/etc/systemd/system/k3s-agent.service; enabled; vendor preset: enabled)
Active: active (running) since Sun 2021-01-24 15:54:10 UTC; 19min ago
Docs: https://k3s.io
Process: 3065 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
Process: 3066 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
Main PID: 3067 (k3s-agent)
Tasks: 6
Memory: 167.3M
CGroup: /system.slice/k3s-agent.service
└─3067 /usr/local/bin/k3s agent
Jan 24 16:12:23 ip-172-31-27-179 k3s[3311]: time="2021-01-24T16:34:02.483557102Z" level=info msg="Running load balancer 127.0.0.1:39357 -> [104.248.34.
Jan 24 16:12:23 ip-172-31-27-179 k3s[3067]: time="2021-01-24T16:12:23.313819380Z" level=error msg="failed to get CA certs: Get \"https://127.0.0.1:339
level=error msg="failed to get CA certs: Get "https://127.0.0.1:39357/cacerts": EOF"
If I change K3S_URL in /etc/systemd/system/k3s-agent.service.env to use http, I get an error saying that only https is accepted.
Using the IP address instead of the hostname in k3s-agent.service.env works for me. It is a workaround rather than a real solution.
/etc/systemd/system/k3s-agent.service.env
K3S_TOKEN='<token>'
K3S_URL='https://192.168.xxx.xxx:6443'
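Before restarting the agent, the K3S_URL value can be sanity-checked offline. The sketch below is a hypothetical helper (`check_k3s_url` is not part of k3s) and assumes only what this thread reports: the agent accepts https only, the servers listen on 6443, and an IP address worked where a hostname did not.

```python
import ipaddress
from urllib.parse import urlsplit


def check_k3s_url(url: str) -> list:
    """Return a list of warnings for a K3S_URL value (empty list = looks fine).

    Hypothetical helper for illustration; it is not part of k3s.
    """
    warnings = []
    parts = urlsplit(url)
    # The agent in this thread rejected anything that was not https.
    if parts.scheme != "https":
        warnings.append("scheme must be https")
    host = parts.hostname
    if host is None:
        warnings.append("no host found; expected https://<host>:6443")
        return warnings
    try:
        ipaddress.ip_address(host)
    except ValueError:
        # A name is not wrong in itself, but it failed for the answerer here.
        warnings.append(f"host {host!r} is a name, not an IP; check how the agent resolves it")
    if parts.port is None:
        warnings.append("no port given; the servers above listen on 6443")
    return warnings


if __name__ == "__main__":
    for value in ("https://192.168.1.10:6443", "http://192.168.1.10:6443"):
        print(value, "->", check_k3s_url(value) or "ok")
```

The addresses in the demo are placeholders. An empty list only means the value is well-formed; it says nothing about whether the load balancer actually forwards the connection.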
Assuming you are Database Root
Checking if SELinux is enabled...Its not (good)!
Reading /etc/asterisk/asterisk.conf...Done
Checking if Asterisk is running and we can talk to it as the 'asterisk' user...Error!
Error communicating with Asterisk. Ensure that Asterisk is properly installed and running as the asterisk user
Asterisk appears to be running as asterisk
Try starting Asterisk with the './start_asterisk start' command in this directory
I tried ./start_asterisk start and ./install -n.
Please help: what is the problem? This is the third day I have been trying to solve it.
● asterisk.service - Asterisk PBX
Loaded: loaded (/lib/systemd/system/asterisk.service; enabled; vendor preset: enabled)
Active: failed (Result: core-dump) since Sat 2020-07-25 01:12:16 UTC; 32min ago
Docs: man:asterisk(8)
Process: 84496 ExecStart=/usr/sbin/asterisk -g -f -p -U asterisk (code=dumped, signal=SEGV)
Main PID: 84496 (code=dumped, signal=SEGV)
Jul 25 01:12:16 webserver systemd[1]: asterisk.service: Scheduled restart job, restart counter is at 91.
Jul 25 01:12:16 webserver systemd[1]: Stopped Asterisk PBX.
Jul 25 01:12:16 webserver systemd[1]: asterisk.service: Start request repeated too quickly.
Jul 25 01:12:16 webserver systemd[1]: asterisk.service: Failed with result 'core-dump'.
Jul 25 01:12:16 webserver systemd[1]: Failed to start Asterisk PBX.
If SELinux is enabled, that can be the issue here.
To check what is really going on, try starting Asterisk manually and look at the verbose log:
asterisk -vvvgc
I have set up an HA cluster for nginx, so that when nginx or a node fails, it fails over to the second node.
pcs status
Cluster name: push_noti_cluster
Stack: corosync
Current DC: push2 (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Tue Jul 31 11:29:16 2018
Last change: Tue Jul 31 09:20:05 2018 by root via cibadmin on push1
2 nodes configured
3 resources configured
Online: [ push1 push2 ]
Full list of resources:
 virtual_ip (ocf::heartbeat:IPaddr2): Started push1
 Clone Set: Nginx-clone [Nginx]
     Started: [ push1 push2 ]
Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
You have new mail in /var/spool/mail/root
[root@server1 ~]#
Failover works fine when we stop the cluster service with pcs cluster stop on either node, or when we reboot a server.
What we want to achieve is a resource failover when nginx on host node01 stops running: both resources (virtual_ip/webserver) should then fail over to the second host, node02.
Is it possible to do a service-level failover? That is, when one of the resources fails on one node (node01), all the configured resources (here virtual_ip/webserver) should fail over to the other node (node02).
From what you write, nothing in your configuration requires the "active" node to be a node where nginx (or whatever service you need) is actually running.
Compare your configuration with the examples on this page:
https://wiki.clusterlabs.org/wiki/Example_configurations#Failover_IP_.2B_One_service
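One common way to express "the IP must live where nginx is running" is a colocation constraint plus an ordering constraint. The sketch below only builds the pcs command lines and runs nothing. It is not a verified fix for this cluster: it assumes the pcs 0.9-style `pcs constraint colocation add ... with ...` and `pcs constraint order ... then ...` syntax, the resource names from the pcs status output in the question, and that the Nginx resource has a monitor operation so Pacemaker notices when nginx stops. The function name `failover_constraints` is made up for illustration.

```python
def failover_constraints(ip_resource: str, service_resource: str) -> list:
    """Build pcs commands that tie a floating IP to a running service.

    Returns argv lists only; nothing is executed here.
    """
    return [
        # Colocation: the IP may only run on a node where the service is active.
        ["pcs", "constraint", "colocation", "add",
         ip_resource, "with", service_resource, "INFINITY"],
        # Ordering: start the service before bringing up the IP.
        ["pcs", "constraint", "order", service_resource, "then", ip_resource],
    ]


if __name__ == "__main__":
    # Resource names taken from the pcs status output in the question.
    for argv in failover_constraints("virtual_ip", "Nginx-clone"):
        print(" ".join(argv))
```

Printing the commands first makes it easy to review them against the ClusterLabs examples before running anything as root on a cluster node.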
I've been trying to start my Shiny Server with no success. I followed the instructions on the RStudio site, but when I check the server status, this is what I get:
$ sudo systemctl status shiny-server
● shiny-server.service - ShinyServer
Loaded: loaded (/etc/systemd/system/shiny-server.service; enabled; vendor preset: disabled)
Active: deactivating (stop-post) since Mon 2018-04-30 21:16:03 -03; 2s ago
Process: 17672 ExecStart=/usr/bin/env bash -c exec /opt/shiny-server/bin/shiny-server >> /var/log/shiny-server.log 2>&1 (code=exited, status=0/SUCCESS)
Main PID: 17672 (code=exited, status=0/SUCCESS); : 17682 (sleep)
CGroup: /system.slice/shiny-server.service
└─control
└─17682 sleep 5
Apr 30 21:16:02 shiny.estatistica.ufrn.br systemd[1]: Started ShinyServer.
Apr 30 21:16:02 shiny.estatistica.ufrn.br systemd[1]: Starting ShinyServer...
But shiny.estatistica.ufrn.br is not my website! My website is shiny.estatistica.ccet.ufrn.br/ (there is a ccet in there). Notice that Apache is alive and running when ccet is added to the url.
So, what can I do to start my Shiny Server? I think it has something to do with the URL without ccet, but I couldn't figure out how to fix it.
Shiny Server has nothing to do with Apache. The name shiny.estatistica.ufrn.br in the last two lines looks like your server's (machine) name and the missing ccet has nothing to do with Shiny Server.
To have http://shiny.estatistica.ccet.ufrn.br/ point to your Shiny Server, ask your local IT/network department for help: it requires HTTP requests for that sub-level of the ufrn.br domain to be forwarded to your Shiny Server.
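To see that the name in those log lines comes from the machine itself rather than from any web server configuration, you can print the names the host reports for itself. This is a minimal sketch using only the Python standard library; `machine_names` is a made-up helper name, and the assumption that the log prefix matches the system hostname is the one made in the answer above.

```python
import socket


def machine_names() -> dict:
    """Return the names this machine reports for itself (no web server involved)."""
    return {
        "hostname": socket.gethostname(),  # typically the name shown in log lines
        "fqdn": socket.getfqdn(),          # fully qualified name, if resolvable
    }


if __name__ == "__main__":
    for key, value in machine_names().items():
        print(f"{key}: {value}")
```

If the hostname printed here matches the name in the systemctl output, the missing ccet is a property of the machine's own name and unrelated to how visitors reach the site.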
I would just erase the disk and start over; otherwise you wind up spending far too much time "fixing" things in ways that aren't how Shiny wants to work.
I'm fairly new to setting up a production machine, and I don't understand why I'm not seeing the default nginx index page on my EC2 machine. Nginx is installed and running on the server, but when I try to access it, the browser keeps loading and only ever shows a blank page. I'm trying to access it through the public IP (35.160.22.104) and through the public DNS name (ec2-35-160-22-104.us-west-2.compute.amazonaws.com). Both behave the same. What am I doing wrong?
UPDATE:
I realized that when I restarted the nginx service, it didn't show the "ok" message. So I took a look at error.log:
[ 2016-12-12 17:16:11.2439 709/7f3eebc93780 age/Cor/CoreMain.cpp:967 ]: Passenger core shutdown finished
2016/12/12 17:16:12 [info] 782#782: Using 32768KiB of shared memory for push module in /etc/nginx/nginx.conf:71
[ 2016-12-12 17:16:12.2742 791/7fb0c37a0780 age/Wat/WatchdogMain.cpp:1291 ]: Starting Passenger watchdog...
[ 2016-12-12 17:16:12.2820 794/7fe4d238b780 age/Cor/CoreMain.cpp:982 ]: Starting Passenger core...
[ 2016-12-12 17:16:12.2820 794/7fe4d238b780 age/Cor/CoreMain.cpp:235 ]: Passenger core running in multi-application mode.
[ 2016-12-12 17:16:12.2832 794/7fe4d238b780 age/Cor/CoreMain.cpp:732 ]: Passenger core online, PID 794
[ 2016-12-12 17:16:12.2911 799/7f06bb50a780 age/Ust/UstRouterMain.cpp:529 ]: Starting Passenger UstRouter...
[ 2016-12-12 17:16:12.2916 799/7f06bb50a780 age/Ust/UstRouterMain.cpp:342 ]: Passenger UstRouter online, PID 799
Anyway, it doesn't look like an error, just a usual log.
UPDATE 2:
Nginx is running:
root 810 1 0 17:16 ? 00:00:00 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
www-data 815 810 0 17:16 ? 00:00:00 nginx: worker process
ubuntu 853 32300 0 17:44 pts/0 00:00:00 grep --color=auto nginx
And when I run curl localhost, it returns the HTML as expected!
UPDATE3:
When I run systemctl status nginx, I get the following error:
Dec 12 17:54:48 ip-172-31-40-156 systemd[1]: nginx.service: Failed to read PID from file /run/nginx.pid: Invalid argument
Trying to figure out what it is
UPDATE4:
Ran the command nmap 35.160.22.104 -Pn PORT STATE SERVICE 22/tcp and got the output:
Starting Nmap 7.01 ( https://nmap.org ) at 2016-12-12 18:05 UTC
Failed to resolve "PORT".
Failed to resolve "STATE".
Failed to resolve "SERVICE".
Unable to split netmask from target expression: "22/tcp"
Nmap scan report for ec2-35-160-22-104.us-west-2.compute.amazonaws.com (35.160.22.104)
Host is up (0.0015s latency).
Not shown: 999 filtered ports
PORT STATE SERVICE
22/tcp open ssh
UPDATE5:
Output for netstat -tuanp | grep 80:
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN -
tcp6 0 0 :::80 :::* LISTEN -
Your EC2 instance has a security group associated with it.
In the AWS console, go to EC2 -> Instances, click on your instance, and find Security Group in the 'Description' tab at the bottom. Click on its name and you will be redirected to EC2 -> Network and Security. Click on 'Edit inbound rules' and add a rule:
Type: HTTP
Save, and that should be it.
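To confirm from outside the instance whether port 80 is reachable before and after changing the rule, a small TCP probe is enough. The sketch below uses only the Python standard library; the function name `probe` and its result labels are made up for illustration, and treating a timeout as "filtered" is an assumption that matches how dropped traffic usually looks from outside.

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP port as 'open', 'closed', 'filtered', or 'unreachable'.

    'filtered' means no answer within the timeout, which is what a
    security group that drops the traffic typically looks like from outside.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "closed"       # host answered, but nothing is listening
    except socket.timeout:
        return "filtered"     # packets silently dropped somewhere on the path
    except OSError:
        return "unreachable"  # e.g. no route to host or name resolution failure
```

Run it from a machine outside AWS, for example `probe("35.160.22.104", 80)`. With curl localhost working on the instance and nmap showing only port 22 open, a "filtered" result for port 80 is consistent with a missing inbound rule.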
I'm using nginx 1.8.1 on CentOS 7 64-bit. I have installed WordPress there and it works great, but sometimes, when I install a plugin or do some other activity on the site, the MySQL service dies and the site shows "Error establishing a database connection", because the service died and did not start again.
The site has been on this server for about 3 days. The issue started a day ago, when there were more plugins on the site or more activity; I'm not sure what is causing it. The first several times it happened, I connected to the server with PuTTY and ran reboot, which solved it. A couple of hours ago the issue popped up again at random while I was adding simple products in WooCommerce. This time I logged in with PuTTY and ran systemctl stop mariadb and then systemctl start mariadb. I thought the MySQL server just dies randomly, but 15 minutes later it happened again. This time I did not stop MariaDB first; I only ran systemctl start mariadb, and it did not work. This is what it says:
[root@lrweb ~]# systemctl start mariadb
Job for mariadb.service failed. See 'systemctl status mariadb.service' and 'journalctl -xn' for details.
The next thing I did was check the status with systemctl status mariadb, and it shows:
[root@lrweb ~]# systemctl status mariadb
mariadb.service - MariaDB database server
Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled)
Active: failed (Result: exit-code) since Tue 2016-02-09 03:48:11 EST; 45s ago
Process: 24708 ExecStartPost=/usr/libexec/mariadb-wait-ready $MAINPID (code=exited, status=1/FAILURE)
Process: 24707 ExecStart=/usr/bin/mysqld_safe --basedir=/usr (code=exited, status=0/SUCCESS)
Process: 24680 ExecStartPre=/usr/libexec/mariadb-prepare-db-dir %n (code=exited, status=0/SUCCESS)
Main PID: 24707 (code=exited, status=0/SUCCESS)
So I decided to stop MariaDB and start it again as I did before, but when I run the stop command it outputs nothing:
[root@lrweb ~]# systemctl stop mariadb
[root@lrweb ~]#
But when I try to start the service again, it does not start and gives me the same message as above.
I'm sure I can resolve this by rebooting the server, which would start all services automatically.
I'm using DigitalOcean, so I went to check the resource graphs there. The server is a medium-sized one, and I did not expect usage to be that high, but apparently I was wrong: the graphs show that all resources were at their maximum.
I'm not sure what is causing the problem, how to fix it, or what I am doing wrong.
Thanks for your time and help.