Kubernetes TCP Health Check

I am building a .NET Core console microservice, and our architect has suggested using a TCP health check instead of an HTTP health check. To implement the TCP health check, I added the configuration below to the deploymentConfig section of the OCP file. The Jenkins build was successful and the deployment config rolled out successfully.
Query:
How can I ensure that the probes are working properly? Is there a way to verify that the readiness and liveness probes are run at the expected interval with the TCP health check?
Is there any syntax by which I can explicitly check the container health status using the TCP health check?
readinessProbe:
  tcpSocket:
    port: 8080
  initialDelaySeconds: "${{READINESS_DELAY}}"
  periodSeconds: "${{READINESS_TIMEOUT}}"
  timeoutSeconds: "${{READINESS_TIMEOUT}}"
livenessProbe:
  tcpSocket:
    port: 8080
  initialDelaySeconds: "${{LIVENESS_DELAY}}"
  periodSeconds: "${{LIVENESS_TIMEOUT}}"
  timeoutSeconds: "${{LIVENESS_TIMEOUT}}"

The output of each probe is handled by the kubelet on the node where the pod runs.
By default in Kubernetes you can check probes by describing the pod. For a pod where everything works correctly you will not find any probe information there; the events show only problems such as Unhealthy or Killing.
To check whether a container has a livenessProbe or readinessProbe configured, describe the pod and look for Containers.<containerName>.Liveness and Containers.<containerName>.Readiness.
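For example, a hedged set of commands to read the configured probes and the current container status (the pod name mypod and a single container are placeholders, not names from your deployment):
$ kubectl describe pod mypod | grep -E 'Liveness|Readiness'
$ kubectl get pod mypod -o jsonpath='{.spec.containers[0].livenessProbe}{"\n"}'
$ kubectl get pod mypod -o jsonpath='{.status.containerStatuses[0].ready}{"\n"}'
The last command prints true only while the readiness probe is passing, so it is a quick way to check the container health status explicitly.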
The example below is based on the docs, but with an additional change that guarantees the liveness probe will fail.
Added:
livenessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy
Output:
$ kubectl describe pod goproxy-fail
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 3m38s default-scheduler Successfully assigned default/goproxy-fail to kubeadm-16
Normal Pulled 26s (x4 over 3m37s) kubelet, kubeadm-16 Container image "k8s.gcr.io/goproxy:0.1" already present on machine
Normal Created 26s (x4 over 3m37s) kubelet, kubeadm-16 Created container goproxy
Normal Killing 26s (x3 over 2m26s) kubelet, kubeadm-16 Container goproxy failed liveness probe, will be restarted
Normal Started 25s (x4 over 3m36s) kubelet, kubeadm-16 Started container goproxy
Warning Unhealthy 6s (x10 over 3m6s) kubelet, kubeadm-16 Liveness probe failed: OCI runtime exec failed: exec failed: container_linux.go:346: starting container process caused "exec: \"cat\": executable file not found in $PATH": unknown
This means that over 3m6s there were 10 checks, all of which failed, so the status is Unhealthy.
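To exercise the same kind of TCP check that the kubelet performs against the container port, a hedged one-off test from inside the cluster also works (the pod name mypod, its IP and the busybox image are placeholders/assumptions here):
$ kubectl get pod mypod -o wide                # note the pod IP
$ kubectl run tcp-check --rm -it --image=busybox --restart=Never -- nc -zv <pod-ip> 8080
nc -z only verifies that the port accepts a TCP connection, which is exactly what a tcpSocket probe does.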
Another default option is to use events: $ kubectl get events. The output will be similar, but it gathers all events from the cluster; you can narrow it down by namespace and other selectors.
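As a hedged sketch, events can also be filtered down to a single pod with a field selector (pod name and namespace below are placeholders):
$ kubectl get events -n default --field-selector involvedObject.name=goproxy-fail
$ kubectl get events -n default --field-selector involvedObject.name=goproxy-fail,reason=Unhealthy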
How to check successful probes
The output of successful probes is not recorded anywhere with the default settings. You will need to change the kubelet logging verbosity (--v) to at least debug level (4).
To do it, you have to:
SSH to the master node.
Edit the file /var/lib/kubelet/kubeadm-flags.env (on Ubuntu you need sudo rights to do it, e.g. $ sudo su).
The default content looks like KUBELET_KUBEADM_ARGS="--cgroup-driver=cgroupfs --network-plugin=cni --pod-infra-container-image=k8s.gcr.io/pause:3.1 --resolv-conf=/run/systemd/resolve/resolv.conf"; you have to add --v=4 at the end.
Depending on your needs you can choose a higher log level; more information can be found here. The desired value in kubeadm-flags.env would look like this:
KUBELET_KUBEADM_ARGS="--cgroup-driver=cgroupfs --network-plugin=cni --pod-infra-container-image=k8s.gcr.io/pause:3.1 --resolv-conf=/run/systemd/resolve/resolv.conf --v=4"
After that, restart the kubelet to apply the new logging level: sudo systemctl restart kubelet.
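A quick, hedged way to confirm the flag was picked up after the restart:
$ pgrep -a kubelet          # the printed command line should now contain --v=4
$ systemctl status kubelet --no-pager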
The next step is to check the kubelet logs using journalctl.
$ journalctl -u kubelet — you can also grep it, as you will get a huge amount of logs with --v set to 4. I created another pod with a similar config but with pod name tetest and container name goproxy, which is easier to find.
$ journalctl -u kubelet | grep tetest
...
Dec 31 10:29:46 kubeadm-16 kubelet[17767]: I1231 10:29:46.303112 17767 prober.go:129] Readiness probe for "tetest_default(a518f558-9b08-4ce8-86a2-81875f205826):goproxy" succeeded
Dec 31 10:29:55 kubeadm-16 kubelet[17767]: I1231 10:29:55.289330 17767 prober.go:129] Liveness probe for "tetest_default(a518f558-9b08-4ce8-86a2-81875f205826):goproxy" succeeded
Dec 31 10:29:56 kubeadm-16 kubelet[17767]: I1231 10:29:56.303326 17767 prober.go:129] Readiness probe for "tetest_default(a518f558-9b08-4ce8-86a2-81875f205826):goproxy" succeeded
Dec 31 10:30:06 kubeadm-16 kubelet[17767]: I1231 10:30:06.302931 17767 prober.go:129] Readiness probe for "tetest_default(a518f558-9b08-4ce8-86a2-81875f205826):goproxy" succeeded
Dec 31 10:30:15 kubeadm-16 kubelet[17767]: I1231 10:30:15.289462 17767 prober.go:129] Liveness probe for "tetest_default(a518f558-9b08-4ce8-86a2-81875f205826):goproxy" succeeded
Dec 31 10:30:16 kubeadm-16 kubelet[17767]: I1231 10:30:16.303267 17767 prober.go:129] Readiness probe for "tetest_default(a518f558-9b08-4ce8-86a2-81875f205826):goproxy" succeeded
Dec 31 10:30:26 kubeadm-16 kubelet[17767]: I1231 10:30:26.303248 17767 prober.go:129] Readiness probe for "tetest_default(a518f558-9b08-4ce8-86a2-81875f205826):goproxy" succeeded
Dec 31 10:30:35 kubeadm-16 kubelet[17767]: I1231 10:30:35.289164 17767 prober.go:129] Liveness probe for "tetest_default(a518f558-9b08-4ce8-86a2-81875f205826):goproxy" succeeded
Dec 31 10:30:36 kubeadm-16 kubelet[17767]: I1231 10:30:36.303071 17767 prober.go:129] Readiness probe for "tetest_default(a518f558-9b08-4ce8-86a2-81875f205826):goproxy" succeeded
Dec 31 10:30:46 kubeadm-16 kubelet[17767]: I1231 10:30:46.303751 17767 prober.go:129] Readiness probe for "tetest_default(a518f558-9b08-4ce8-86a2-81875f205826):goproxy" succeeded
Dec 31 10:30:49 kubeadm-16 kubelet[17767]: I1231 10:30:49.237565 17767 kubelet.go:1965] SyncLoop (SYNC): 1 pods; tetest_default(a518f558-9b08-4ce8-86a2-81875f205826)
...
Tested on Kubernetes 1.16.3, OS Ubuntu 18.04.
Hope this helps.

Related

OpenVPN server not starting after generating config file for client

I am new to OpenVPN and to set up a server I followed this guide. I am using Ubuntu 18.04.4 LTS.
Here is what I get when starting OpenVPN:
omayr@x556uqk:~$ sudo systemctl start openvpn@server
Job for openvpn@server.service failed because the control process exited with error code.
See "systemctl status openvpn@server.service" and "journalctl -xe" for details.
Status:
omayr@x556uqk:~$ sudo systemctl status openvpn@server
● openvpn@server.service - OpenVPN connection to server
Loaded: loaded (/etc/systemd/system/openvpn@.service; indirect; vendor preset
Active: activating (auto-restart) (Result: exit-code) since Sat 2020-02-22 23
Docs: man:openvpn(8)
https://community.openvpn.net/openvpn/wiki/Openvpn24ManPage
https://community.openvpn.net/openvpn/wiki/HOWTO
Process: 3740 ExecStart=/usr/sbin/openvpn --daemon ovpn-server --status /run/o
Main PID: 3740 (code=exited, status=1/FAILURE)
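As the error output itself suggests, the actual failure reason should be in the journal. A hedged way to pull just the OpenVPN messages, and to test-run the config in the foreground to see the parse error directly (the config path assumes the default openvpn@server layout, i.e. /etc/openvpn/server.conf):
$ sudo journalctl -u openvpn@server --no-pager -n 50
$ sudo openvpn --config /etc/openvpn/server.conf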

mount encrypted loopback in suse/systemd: unknown filesystem type 'crypt'

This might be specific to openSUSE / systemd.
I'm having trouble mounting an encrypted loopback file using the procedure described on the SDB:Encrypted filesystems knowledge base. I get this behaviour:
[mjl@tesla:~]
[11:12] $ sudo systemctl start /home/mjl/key
Job for home-mjl-key.mount failed. See "systemctl status home-mjl-key.mount" and "journalctl -xe" for details.
[mjl@tesla:~]
[11:12] 1 $ sudo systemctl status home-mjl-key.mount
● home-mjl-key.mount - /home/mjl/key
Loaded: loaded (/etc/fstab; bad; vendor preset: disabled)
Active: failed (Result: exit-code) since Sun 2018-03-11 11:12:41 AEDT; 3s ago
Where: /home/mjl/key
What: /home/mjl/.tomb
Docs: man:fstab(5)
man:systemd-fstab-generator(8)
Process: 12949 ExecMount=/usr/bin/mount /home/mjl/.tomb /home/mjl/key -t crypt -o loop,user,acl,user_xattr (code=exited, status=32)
Mar 11 11:12:41 tesla systemd[1]: Mounting /home/mjl/key...
Mar 11 11:12:41 tesla mount[12949]: mount: unknown filesystem type 'crypt'
Mar 11 11:12:41 tesla systemd[1]: home-mjl-key.mount: Mount process exited, code=exited status=32
Mar 11 11:12:41 tesla systemd[1]: Failed to mount /home/mjl/key.
Mar 11 11:12:41 tesla systemd[1]: home-mjl-key.mount: Unit entered failed state.
[mjl@tesla:~]
[11:12] 3 $
The /home/mjl/.tomb loopback file was created using YaST Partitioner; I specified that I did not want it mounted at system start time, but that users should be allowed to mount it.
So it created the file, added an entry to /etc/crypttab, and also added this entry to /etc/fstab:
[mjl@tesla:~]
[11:12] 3 $ tail -n1 /etc/fstab
/home/mjl/.tomb /home/mjl/key crypt loop,user,noauto,acl,user_xattr,nofail 0 0
[mjl@tesla:~]
[11:15]$
There is the 'crypt' filesystem type.
My question is: how should I be mounting this as a user? Is systemd failing because of the filesystem type, or because I haven't told it the encryption key?
I've also tried mounting directly:
[mjl@tesla:~]
[11:16]$ sudo mount /home/mjl/key
mount: unknown filesystem type 'crypt'
[mjl@tesla:~]
The same error. So I guess I'm not mounting it correctly. Do I need to do something with cryptsetup?
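For reference, a manual open-then-mount of an encrypted loopback container would look roughly like this, assuming the container is LUKS-formatted (the mapper name tomb below is a placeholder, not something taken from the question):
$ sudo cryptsetup open /home/mjl/.tomb tomb      # prompts for the passphrase
$ sudo mount /dev/mapper/tomb /home/mjl/key
# and later:
$ sudo umount /home/mjl/key
$ sudo cryptsetup close tomb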

Docker daemon can't initialize network controller

I'm having trouble starting my Docker daemon. I've installed Docker, but when I try to run # systemctl start docker.service it throws an error.
$ systemctl status docker.service gives me this:
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Wed 2016-09-21 14:38:24 CEST; 6s ago
Docs: https://docs.docker.com
Process: 5592 ExecStart=/usr/bin/dockerd -H fd:// (code=exited, status=1/FAILURE)
Main PID: 5592 (code=exited, status=1/FAILURE)
Sep 21 14:38:24 tp-x230 dockerd[5592]: time="2016-09-21T14:38:24.271068176+02:00" level=warning msg="devmapper: Base device already exists and has filesystem xfs on it. User specified filesystem will be ignored."
Sep 21 14:38:24 tp-x230 dockerd[5592]: time="2016-09-21T14:38:24.327814644+02:00" level=info msg="[graphdriver] using prior storage driver \"devicemapper\""
Sep 21 14:38:24 tp-x230 dockerd[5592]: time="2016-09-21T14:38:24.329895994+02:00" level=info msg="Graph migration to content-addressability took 0.00 seconds"
Sep 21 14:38:24 tp-x230 dockerd[5592]: time="2016-09-21T14:38:24.330707721+02:00" level=info msg="Loading containers: start."
Sep 21 14:38:24 tp-x230 dockerd[5592]: time="2016-09-21T14:38:24.335610867+02:00" level=info msg="Firewalld running: false"
Sep 21 14:38:24 tp-x230 dockerd[5592]: time="2016-09-21T14:38:24.461243263+02:00" level=fatal msg="Error starting daemon: Error initializing network controller: Error creating default \"bridge\" network: failed to parse pool request for ad
Sep 21 14:38:24 tp-x230 systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Sep 21 14:38:24 tp-x230 systemd[1]: Failed to start Docker Application Container Engine.
Sep 21 14:38:24 tp-x230 systemd[1]: docker.service: Unit entered failed state.
Sep 21 14:38:24 tp-x230 systemd[1]: docker.service: Failed with result 'exit-code'.
with the relevant line being:
Sep 21 14:38:24 tp-x230 dockerd[5592]: time="2016-09-21T14:38:24.461243263+02:00" level=fatal msg="Error starting daemon: Error initializing network controller: Error creating default \"bridge\" network: failed to parse pool request for ad
Your error text was cut, so I cannot check whether it is exactly the same error, but I got this one:
Error starting daemon: Error initializing network controller: Error creating default "bridge" network: failed to parse pool request for address space "LocalDefault" pool "" subpool "": could not find an available predefined network
This was related to the machine having several network cards (it can also happen on machines with a VPN; you can temporarily stop the VPN, start Docker and restart the VPN, OR apply the workaround below).
For me, the solution was to start Docker manually like this:
/usr/bin/docker daemon --debug --bip=192.168.y.x/24
where 192.168.y.x is the MAIN machine IP and /24 its netmask. Docker will use this network range for building the bridge and the firewall rules. The --debug flag is not really needed, but it might help if something else fails.
After starting it once, you can kill Docker and start it as usual. AFAIK, Docker has cached the configuration for that --bip and should now work without it. Of course, if you clean the Docker cache, you may need to do this again.
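Alternatively, as a hedged sketch, the same bridge address can be made persistent through the daemon configuration file instead of relying on that cached value (192.168.y.x/24 is the same placeholder as above and must be replaced with a real subnet):
$ sudo tee /etc/docker/daemon.json <<'EOF'
{
  "bip": "192.168.y.x/24"
}
EOF
$ sudo systemctl restart docker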
This worked for me, from https://github.com/microsoft/WSL/issues/6655
sudo update-alternatives --set iptables /usr/sbin/iptables-legacy
sudo update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
sudo dockerd &
Apparently deleting
/var/lib/docker/network/files/local-kv.db
could work in some cases.
Source: https://github.com/moby/moby/issues/18113#issuecomment-161058473
Another explanation could be problem caused by using VPN: https://github.com/moby/moby/issues/31546#issuecomment-284196714
Turns out I needed to enable IP forwarding for the network that Docker was trying to use, namely the bridge network.
This was done by creating the file /etc/systemd/network/bridge.network with the content
[Network]
IPForward=kernel
and then restarting the systemd-networkd daemon with # systemctl restart systemd-networkd.service. After this, # systemctl start docker.service worked fine.
P.S. After restarting the network daemon, I was disconnected from my network (as one might expect) and had to manually connect. Might be worth considering if you've got something important going on.
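To double-check whether forwarding is actually enabled before and after a change like this, a hedged pair of commands:
$ sysctl net.ipv4.ip_forward        # 1 means IP forwarding is on
$ networkctl status                 # shows which links systemd-networkd manages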

Authentication failed using instance:connect in Apache Karaf 3.0.2

Can someone please help me understand why this call is no longer working in Apache Karaf 3.0.2? I verified that it was working in version 3.0.1. All instances are up and running, but I am unable to connect to one of my instances directly from the command line.
su - karaf -c " client -h localhost -a 8101 -u karaf -r 50 -d 2 \" instance:connect -u karaf -p karaf test1 \\\" feature:repo-list \\\" \" "
Logging in as karaf
455 [sshd-SshClient[bea319b]-nio2-thread-1] WARN org.apache.sshd.client.keyverifier.AcceptAllServerKeyVerifier - Server at [localhost/127.0.0.1:8101, DSA, b6:f6:d6:3f:8b:2f:ad:a4:0f:3f:3d:c3:7b:96:fd:ae] presented unverified {} key: {}
Connecting to host localhost on port 8103
Connecting to unknown server. Automatically adding to known hosts.
Storing the server key in known_hosts.
Error executing command: Authentication failed
The call is part of an automated process and I cannot connect to a specific instance directly. Is there any specific configuration required that was not necessary in 3.0.1?
UPDATE #1:
I have added the verbose option... Does it give you any hints about what to do?
client -v -h localhost -a 8101 -u karaf -r 50 -d 2 " instance:connect -u karaf test1 \" feature:repo-list \" "
39 [main] INFO org.apache.sshd.common.util.SecurityUtils - BouncyCastle not registered, using the default JCE provider
Logging in as karaf
367 [sshd-SshClient[bea319b]-nio2-thread-1] INFO org.apache.sshd.client.session.ClientSessionImpl - Client session created
380 [main] INFO org.apache.sshd.client.session.ClientSessionImpl - Start flagging packets as pending until key exchange is done
383 [sshd-SshClient[bea319b]-nio2-thread-1] INFO org.apache.sshd.client.session.ClientSessionImpl - Server version string: SSH-2.0-SSHD-CORE-0.12.0
384 [sshd-SshClient[bea319b]-nio2-thread-1] INFO org.apache.sshd.client.session.ClientSessionImpl - Kex: server->client [aes128-ctr, hmac-sha1, none] {} {}
384 [sshd-SshClient[bea319b]-nio2-thread-1] INFO org.apache.sshd.client.session.ClientSessionImpl - Kex: client->server [aes128-ctr, hmac-sha1, none] {} {}
444 [sshd-SshClient[bea319b]-nio2-thread-1] WARN org.apache.sshd.client.keyverifier.AcceptAllServerKeyVerifier - Server at [localhost/127.0.0.1:8101, DSA, 22:8b:f8:9d:bc:c6:40:d8:fe:52:aa:90:c0:f2:70:ec] presented unverified {} key: {}
457 [sshd-SshClient[bea319b]-nio2-thread-1] INFO org.apache.sshd.client.session.ClientSessionImpl - Dequeing pending packets
524 [sshd-SshClient[bea319b]-nio2-thread-1] INFO org.apache.sshd.client.session.ClientUserAuthServiceNew - Received SSH_MSG_USERAUTH_FAILURE
568 [sshd-SshClient[bea319b]-nio2-thread-2] INFO org.apache.sshd.client.session.ClientUserAuthServiceNew - Received SSH_MSG_USERAUTH_SUCCESS
Connecting to host localhost on port 8102
Error executing command: Authentication failed
UPDATE #2:
I switched the logger to DEBUG and I found this exception:
2015-01-15 11:28:48,920 | DEBUG | 5]-nio2-thread-1 | ClientSessionImpl | 28 - org.apache.sshd.core - 0.12.0 | Received SSH_MSG_SERVICE_ACCEPT
2015-01-15 11:28:48,920 | INFO | 5]-nio2-thread-1 | ClientUserAuthServiceNew | 28 - org.apache.sshd.core - 0.12.0 | Received SSH_MSG_USERAUTH_FAILURE
2015-01-15 11:28:48,920 | DEBUG | 5]-nio2-thread-1 | ClientUserAuthServiceNew | 28 - org.apache.sshd.core - 0.12.0 | Authentications that can continue: keyboard-interactive, password, publickey
2015-01-15 11:28:48,922 | DEBUG | 5]-nio2-thread-1 | Nio2Session | 28 - org.apache.sshd.core - 0.12.0 | Caught exception, now calling handler
2015-01-15 11:28:48,922 | WARN | 5]-nio2-thread-1 | ClientSessionImpl | 28 - org.apache.sshd.core - 0.12.0 | Exception caught
java.lang.IllegalStateException: No SSH_AUTH_SOCK environment variable set
at org.apache.karaf.shell.ssh.KarafAgentFactory.createClient(KarafAgentFactory.java:71)
at org.apache.sshd.client.auth.UserAuthPublicKey.init(UserAuthPublicKey.java:78)
at org.apache.sshd.client.session.ClientUserAuthServiceNew.tryNext(ClientUserAuthServiceNew.java:212)
at org.apache.sshd.client.session.ClientUserAuthServiceNew.processUserAuth(ClientUserAuthServiceNew.java:178)
at org.apache.sshd.client.session.ClientUserAuthServiceNew.process(ClientUserAuthServiceNew.java:131)
at org.apache.sshd.client.session.ClientUserAuthService.process(ClientUserAuthService.java:80)
at org.apache.sshd.common.session.AbstractSession.doHandleMessage(AbstractSession.java:399)
at org.apache.sshd.common.session.AbstractSession.handleMessage(AbstractSession.java:295)
at org.apache.sshd.client.session.ClientSessionImpl.handleMessage(ClientSessionImpl.java:256)
at org.apache.sshd.common.session.AbstractSession.decode(AbstractSession.java:731)
at org.apache.sshd.common.session.AbstractSession.messageReceived(AbstractSession.java:277)
at org.apache.sshd.common.AbstractSessionIoHandler.messageReceived(AbstractSessionIoHandler.java:54)
at org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:187)
at org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:173)
at org.apache.sshd.common.io.nio2.Nio2CompletionHandler$1.run(Nio2CompletionHandler.java:32)
at java.security.AccessController.doPrivileged(Native Method)[:1.7.0_65]
at org.apache.sshd.common.io.nio2.Nio2CompletionHandler.completed(Nio2CompletionHandler.java:30)[28:org.apache.sshd.core:0.12.0]
at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)[:1.7.0_65]
at sun.nio.ch.Invoker.invokeDirect(Invoker.java:157)[:1.7.0_65]
at sun.nio.ch.UnixAsynchronousSocketChannelImpl.implRead(UnixAsynchronousSocketChannelImpl.java:553)[:1.7.0_65]
at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:275)[:1.7.0_65]
at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:296)[:1.7.0_65]
at java.nio.channels.AsynchronousSocketChannel.read(AsynchronousSocketChannel.java:407)[:1.7.0_65]
at org.apache.sshd.common.io.nio2.Nio2Session.startReading(Nio2Session.java:173)[28:org.apache.sshd.core:0.12.0]
at org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:189)
at org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:173)
at org.apache.sshd.common.io.nio2.Nio2CompletionHandler$1.run(Nio2CompletionHandler.java:32)
at java.security.AccessController.doPrivileged(Native Method)[:1.7.0_65]
at org.apache.sshd.common.io.nio2.Nio2CompletionHandler.completed(Nio2CompletionHandler.java:30)[28:org.apache.sshd.core:0.12.0]
at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)[:1.7.0_65]
at sun.nio.ch.Invoker.invokeDirect(Invoker.java:157)[:1.7.0_65]
at sun.nio.ch.UnixAsynchronousSocketChannelImpl.implRead(UnixAsynchronousSocketChannelImpl.java:553)[:1.7.0_65]
at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:275)[:1.7.0_65]
at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:296)[:1.7.0_65]
at java.nio.channels.AsynchronousSocketChannel.read(AsynchronousSocketChannel.java:407)[:1.7.0_65]
at org.apache.sshd.common.io.nio2.Nio2Session.startReading(Nio2Session.java:173)[28:org.apache.sshd.core:0.12.0]
at org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:189)
at org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:173)
at org.apache.sshd.common.io.nio2.Nio2CompletionHandler$1.run(Nio2CompletionHandler.java:32)
at java.security.AccessController.doPrivileged(Native Method)[:1.7.0_65]
at org.apache.sshd.common.io.nio2.Nio2CompletionHandler.completed(Nio2CompletionHandler.java:30)[28:org.apache.sshd.core:0.12.0]
at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)[:1.7.0_65]
at sun.nio.ch.Invoker.invokeDirect(Invoker.java:157)[:1.7.0_65]
at sun.nio.ch.UnixAsynchronousSocketChannelImpl.implRead(UnixAsynchronousSocketChannelImpl.java:553)[:1.7.0_65]
at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:275)[:1.7.0_65]
at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:296)[:1.7.0_65]
at java.nio.channels.AsynchronousSocketChannel.read(AsynchronousSocketChannel.java:407)[:1.7.0_65]
at org.apache.sshd.common.io.nio2.Nio2Session.startReading(Nio2Session.java:173)[28:org.apache.sshd.core:0.12.0]
at org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:189)
at org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:173)
at org.apache.sshd.common.io.nio2.Nio2CompletionHandler$1.run(Nio2CompletionHandler.java:32)
at java.security.AccessController.doPrivileged(Native Method)[:1.7.0_65]
at org.apache.sshd.common.io.nio2.Nio2CompletionHandler.completed(Nio2CompletionHandler.java:30)[28:org.apache.sshd.core:0.12.0]
at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)[:1.7.0_65]
at sun.nio.ch.Invoker.invokeDirect(Invoker.java:157)[:1.7.0_65]
at sun.nio.ch.UnixAsynchronousSocketChannelImpl.implRead(UnixAsynchronousSocketChannelImpl.java:553)[:1.7.0_65]
at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:275)[:1.7.0_65]
at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:296)[:1.7.0_65]
at java.nio.channels.AsynchronousSocketChannel.read(AsynchronousSocketChannel.java:407)[:1.7.0_65]
at org.apache.sshd.common.io.nio2.Nio2Session.startReading(Nio2Session.java:173)[28:org.apache.sshd.core:0.12.0]
at org.apache.sshd.common.io.nio2.Nio2Connector$1.onCompleted(Nio2Connector.java:53)[28:org.apache.sshd.core:0.12.0]
at org.apache.sshd.common.io.nio2.Nio2Connector$1.onCompleted(Nio2Connector.java:46)[28:org.apache.sshd.core:0.12.0]
at org.apache.sshd.common.io.nio2.Nio2CompletionHandler$1.run(Nio2CompletionHandler.java:32)
at java.security.AccessController.doPrivileged(Native Method)[:1.7.0_65]
at org.apache.sshd.common.io.nio2.Nio2CompletionHandler.completed(Nio2CompletionHandler.java:30)[28:org.apache.sshd.core:0.12.0]
at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)[:1.7.0_65]
at sun.nio.ch.Invoker$2.run(Invoker.java:218)[:1.7.0_65]
at sun.nio.ch.AsynchronousChannelGroupImpl$1.run(AsynchronousChannelGroupImpl.java:112)[:1.7.0_65]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)[:1.7.0_65]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)[:1.7.0_65]
at java.lang.Thread.run(Thread.java:745)[:1.7.0_65]

Maria DB not starting. Job for mariadb.service failed. See 'systemctl status mariadb.service' and 'journalctl -xn' for details

I am trying to install MariaDB and am getting the following issue.
[root@localhost ~]# service mysqld start
Redirecting to /bin/systemctl start mysqld.service
Job for mariadb.service failed. See 'systemctl status mariadb.service' and 'journalctl -xn' for details.
I tried 'systemctl status mariadb.service' and 'journalctl -xn'; the details follow.
[root@localhost ~]# systemctl status mariadb.service
mariadb.service - MariaDB database server
Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled)
Active: failed (Result: exit-code) since Sun 2014-09-21 17:19:44 IST; 23s ago
Process: 2712 ExecStartPost=/usr/libexec/mariadb-wait-ready $MAINPID (code=exited, status=1/FAILURE)
Process: 2711 ExecStart=/usr/bin/mysqld_safe --basedir=/usr (code=exited, status=0/SUCCESS)
Process: 2683 ExecStartPre=/usr/libexec/mariadb-prepare-db-dir %n (code=exited, status=0/SUCCESS)
Main PID: 2711 (code=exited, status=0/SUCCESS)
Sep 21 17:19:42 localhost.localdomain mysqld_safe[2711]: 140921 17:19:42 mysqld_safe Logging to '/var/lib/mysql/localhost.localdomain.err'.
Sep 21 17:19:42 localhost.localdomain mysqld_safe[2711]: 140921 17:19:42 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
Sep 21 17:19:43 localhost.localdomain mysqld_safe[2711]: 140921 17:19:43 mysqld_safe mysqld from pid file /var/lib/mysql/localhost.localdoma...d ended
Sep 21 17:19:44 localhost.localdomain systemd[1]: mariadb.service: control process exited, code=exited status=1
Sep 21 17:19:44 localhost.localdomain systemd[1]: Failed to start MariaDB database server.
Sep 21 17:19:44 localhost.localdomain systemd[1]: Unit mariadb.service entered failed state.
[root@localhost ~]# journalctl -xn
-- Logs begin at Sun 2014-09-21 02:33:29 IST, end at Sun 2014-09-21 17:20:11 IST. --
Sep 21 17:16:26 localhost.localdomain systemd[1]: Started dnf makecache.
-- Subject: Unit dnf-makecache.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit dnf-makecache.service has finished starting up.
--
-- The start-up result is done.
Sep 21 17:18:11 localhost.localdomain NetworkManager[683]: <warn> nl_recvmsgs() error: (-33) Dump inconsistency detected, interrupted
Sep 21 17:19:42 localhost.localdomain systemd[1]: Starting MariaDB database server...
-- Subject: Unit mariadb.service has begun with start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit mariadb.service has begun starting up.
Sep 21 17:19:42 localhost.localdomain mysqld_safe[2711]: 140921 17:19:42 mysqld_safe Logging to '/var/lib/mysql/localhost.localdomain.err'.
Sep 21 17:19:42 localhost.localdomain mysqld_safe[2711]: 140921 17:19:42 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
Sep 21 17:19:43 localhost.localdomain mysqld_safe[2711]: 140921 17:19:43 mysqld_safe mysqld from pid file /var/lib/mysql/localhost.localdomain.pid end
Sep 21 17:19:44 localhost.localdomain systemd[1]: mariadb.service: control process exited, code=exited status=1
Sep 21 17:19:44 localhost.localdomain systemd[1]: Failed to start MariaDB database server.
-- Subject: Unit mariadb.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit mariadb.service has failed.
--
-- The result is failed.
Sep 21 17:19:44 localhost.localdomain systemd[1]: Unit mariadb.service entered failed state.
Sep 21 17:20:11 localhost.localdomain NetworkManager[683]: <warn> nl_recvmsgs() error: (-33) Dump inconsistency detected, interrupted
Can anyone please help?
I have tried uninstalling and reinstalling many times but received the same error.
Thanks in advance.
Most of the time, if the system journal (journalctl) doesn't show what the problem was, the MariaDB error log (located in /var/lib/mysql/localhost.localdomain.err) does. Looking into that file, you will usually see what the problem is.
Most commonly, errors that do not disappear after a reinstallation mean that your data directory (by default /var/lib/mysql/) is corrupted and the database needs to be re-initialized with mysql_install_db. To make sure you do a clean initialization, remove all files located in the data directory and then run sudo mysql_install_db --user=mysql.
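As a hedged sketch of that clean re-initialization (the .bak name is just a suggestion; keep the old data until you are sure you no longer need it):
$ sudo systemctl stop mariadb
$ sudo mv /var/lib/mysql /var/lib/mysql.bak       # keep a backup of the old data directory
$ sudo mysql_install_db --user=mysql
$ sudo systemctl start mariadb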
I solved it as follows. After installing, run:
> mysql_install_db --user=mysql --basedir=/usr --datadir=/var/lib/
Then:
> mysql_secure_installation
And then:
systemctl start mariadb
With this, I was able to resolve it.
A quick update for anyone coming here through a Web search.
I had a "failed to start" message following a Debian 9 -> Debian 10 (Buster) in-place server upgrade, and after a bit of digging I found that the following line in /etc/mysql/my.cnf needed updating:
From:
[mysqld]
: (other stuff)
:
innodb_large_prefix
To:
[mysqld]
: (other stuff)
:
innodb_large_prefix = "ON"
The clue was the following lines in /var/log/mysql/error.log
2020-06-06 16:41:24 0 [ERROR] /usr/sbin/mysqld: option '--innodb-large-prefix' requires an argument
2020-06-06 16:41:24 0 [ERROR] Parsing options for plugin 'InnoDB' failed.
Not sure about your case, but you can check whether the mariadb/mysql client was deleted accidentally.
In my case I had deleted the mariadb client repo from a shared file, so I reinstalled the client with:
sudo apt-get install libmariadb-dev
Note: before installing the client, for a Rails app you can change the mysql version in the Gemfile and try bundle install mysql2. If it is a mariadb/mysql2 client issue, the install will throw an error telling you to install the mariadb or mysql2 client, e.g.:
sudo apt-get install libmariadb-dev
OR
sudo apt-get install libmysqlclient-dev
Please refer to the other answers in case it is not a mariadb client issue 😊 😜
What solved it for me was checking whether an old mysqld process was still holding the port:
netstat -tulpn | grep LISTEN
Look for the mysqld entry:
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN 32029/mysqld
Kill that process (in my case its PID is 32029):
kill 32029
Then start the service again with systemctl start mariadb.service
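On newer systems where netstat is not available, a hedged equivalent check is:
$ ss -tlnp | grep 3306
$ systemctl status mariadb --no-pager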
