Calico prints "Hit error connecting to datastore: connection refused" - networking

I created a cluster on an Ubuntu server using this command:
> kubeadm init --cri-socket /var/run/dockershim.sock --control-plane-endpoint servername.local --apiserver-cert-extra-sans servername.local
I added Calico like this:
> curl https://docs.projectcalico.org/manifests/calico.yaml -o calico.yaml
> kubectl apply -f calico.yaml
The Calico pod prints errors:
> kubectl --namespace kube-system logs calico-node-2cg7x
2021-01-05 16:34:46.846 [INFO][8] startup/startup.go 379: Early log level set to info
2021-01-05 16:34:46.846 [INFO][8] startup/startup.go 395: Using NODENAME environment for node name
2021-01-05 16:34:46.846 [INFO][8] startup/startup.go 407: Determined node name: servername
2021-01-05 16:34:46.847 [INFO][8] startup/startup.go 439: Checking datastore connection
2021-01-05 16:34:46.853 [INFO][8] startup/startup.go 454: Hit error connecting to datastore - retry error=Get "https://10.96.0.1:443/api/v1/nodes/foo": dial tcp 10.96.0.1:443: connect: connection refused
2021-01-05 16:34:47.859 [INFO][8] startup/startup.go 454: Hit error connecting to datastore - retry error=Get "https://10.96.0.1:443/api/v1/nodes/foo": dial tcp 10.96.0.1:443: connect: connection refused
2021-01-05 16:34:48.866 [INFO][8] startup/startup.go 454: Hit error connecting to datastore - retry error=Get "https://10.96.0.1:443/api/v1/nodes/foo": dial tcp 10.96.0.1:443: connect: connection refused
2021-01-05 16:34:49.872 [INFO][8] startup/startup.go 454: Hit error connecting to datastore - retry error=Get "https://10.96.0.1:443/api/v1/nodes/foo": dial tcp 10.96.0.1:443: connect: connection refused
2021-01-05 16:34:50.878 [INFO][8] startup/startup.go 454: Hit error connecting to datastore - retry error=Get "https://10.96.0.1:443/api/v1/nodes/foo": dial tcp 10.96.0.1:443: connect: connection refused
2021-01-05 16:34:51.884 [INFO][8] startup/startup.go 454: Hit error connecting to datastore - retry error=Get "https://10.96.0.1:443/api/v1/nodes/foo": dial tcp 10.96.0.1:443: connect: connection refused
2021-01-05 16:34:52.890 [INFO][8] startup/startup.go 454: Hit error connecting to datastore - retry error=Get "https://10.96.0.1:443/api/v1/nodes/foo": dial tcp 10.96.0.1:443: connect: connection refused
2021-01-05 16:34:53.896 [INFO][8] startup/startup.go 454: Hit error connecting to datastore - retry error=Get "https://10.96.0.1:443/api/v1/nodes/foo": dial tcp 10.96.0.1:443: connect: connection refused
I don't know what 10.96.0.1 is. It doesn't have any ports open:
> ping 10.96.0.1 -c 1
PING 10.96.0.1 (10.96.0.1) 56(84) bytes of data.
64 bytes from 10.96.0.1: icmp_seq=1 ttl=248 time=5.62 ms
--- 10.96.0.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 5.621/5.621/5.621/0.000 ms
> nmap 10.96.0.1
Starting Nmap 7.60 ( https://nmap.org ) at 2021-01-05 17:37 CET
Nmap scan report for 10.96.0.1
Host is up (0.018s latency).
All 1000 scanned ports on 10.96.0.1 are closed
Nmap done: 1 IP address (1 host up) scanned in 1.62 seconds
The pod actually has IP 192.168.1.19.
What am I doing wrong?

I had the same issue. In my case, adding the --apiserver-advertise-address=<server-address> parameter to kubeadm init was the solution.

The cause is that an iptables rule created by Kubernetes rejects the connection, as shown below:
Chain KUBE-SERVICES (2 references)
pkts bytes target prot opt in out source destination
1773 106380 REJECT tcp -- * * 0.0.0.0/0 10.96.0.1 /* default/kubernetes:https has no endpoints */ tcp dpt:443 reject-with icmp-port-unreachable
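The comment in that rule names the Service whose traffic is being rejected: default/kubernetes:https, which is the API server's own ClusterIP. With kubeadm's default service subnet of 10.96.0.0/12, that is the first address of the range, which is where 10.96.0.1 comes from. Below is a minimal Python sketch that pulls the service name out of such a line. It assumes the iptables column layout shown above, and the helper names are illustrative, not part of any Kubernetes tooling.

```python
import ipaddress
import re

# Matches REJECT rules carrying the "<namespace>/<service>:<port-name> has no endpoints"
# comment, in the column layout of the listing above (an assumption).
NO_ENDPOINTS = re.compile(
    r"REJECT\s+\S+.*?\s(?P<dest>\d+\.\d+\.\d+\.\d+)\s+"
    r"/\* (?P<service>\S+) has no endpoints \*/.*?dpt:(?P<port>\d+)"
)


def parse_no_endpoints(line):
    """Return (service, cluster_ip, port) for a 'has no endpoints' rule, else None."""
    m = NO_ENDPOINTS.search(line)
    if m is None:
        return None
    return m.group("service"), m.group("dest"), int(m.group("port"))


def first_service_ip(service_cidr="10.96.0.0/12"):
    """First address after the network address; the default CIDR is kubeadm's."""
    return str(ipaddress.ip_network(service_cidr)[1])


rule = (
    "1773 106380 REJECT tcp -- * * 0.0.0.0/0 10.96.0.1 "
    "/* default/kubernetes:https has no endpoints */ "
    "tcp dpt:443 reject-with icmp-port-unreachable"
)
print(parse_no_endpoints(rule))
print(first_service_ip())
```

If the parsed ClusterIP equals the first service address, the Service without endpoints is the API server itself, which fits the advertise-address fix mentioned above.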

Related

calabash tests: Failing to establish TCP connection and port forwarding

Some of the calabash tests take much longer than expected because of the connection attempts shown below. The test continues after some time, but sometimes it fails because the connection is never established. I am running on a Mac M1, and the following messages appear on the console.
For example, when running the calabash test for #push_13, the step " And I establish friendship between $new_users['birthday_user'] and $new_users['push_receiver'] " internally calls the Aruba gem and makes a REST API call. Within that step, the TCP connection is attempted repeatedly. The full steps and logs are below:
And I login with $new_users['push_receiver'] user into Armstrong app # step_definitions/armstrong_steps.rb:51
"Calling step in background: 'Then I am on NavigationBar'"
"Calling step in background: 'Then I am on MoreMenuScreen'"
"Calling step in background: 'Then I am on SettingsScreen'"
"Calling step in background: 'Then I am on PushSettingScreen'"
And I check push subscription for "push_receiver" user # step_definitions/push_notifications_steps.rb:34
"Calling: /network/addressbook/users/3313/contact_requests/to_user/3314?rid=internal"
"Calling: /network/addressbook/users/3314/contact_requests/received?rid=internal"
"Calling: /network/addressbook/users/3314/contact_requests/from_user/3313/confirmed?rid=internal"
"Calling: /network/addressbook/users/3313/contacts?rid=internal&ids_only=true&ignore_limit=true"
And I establish friendship between $new_users['birthday_user'] and $new_users['push_receiver'] # aruba-3.17.6/features/step_definitions/contacts_steps.rb:156
No matching processes belonging to you were found
No matching processes belonging to you were found
"Performing: xingboxctl port-forward default-mobile-xingboxes rabbitmq 5672"
Configuring port forwarding from 5672 locally to rabbitmq-0 on port 5672
W, [2022-07-14T12:51:49.772488 #30238] WARN -- #<Bunny::Session:0x1b490 guest@127.0.0.1:5672, vhost=/, addresses=[127.0.0.1:5672]>: Could not establish TCP connection to 127.0.0.1:5672: Connection refused - connect(2) for 127.0.0.1:5672
"Got an exception: Could not establish TCP connection to any of the configured hosts. Will retry in 10 seconds"
Forwarding from 127.0.0.1:5672 -> 5672
Forwarding from [::1]:5672 -> 5672
"Performing: xingboxctl port-forward default-mobile-xingboxes rabbitmq 5672"
Configuring port forwarding from 5672 locally to rabbitmq-0 on port 5672
W, [2022-07-14T12:52:02.904590 #30238] WARN -- #<Bunny::Session:0x1b558 guest@127.0.0.1:5672, vhost=/, addresses=[127.0.0.1:5672]>: Could not establish TCP connection to 127.0.0.1:5672: Connection refused - connect(2) for 127.0.0.1:5672
"Got an exception: Could not establish TCP connection to any of the configured hosts. Will retry in 10 seconds"
Forwarding from 127.0.0.1:5672 -> 5672
Forwarding from [::1]:5672 -> 5672
"Performing: xingboxctl port-forward default-mobile-xingboxes rabbitmq 5672"
Configuring port forwarding from 5672 locally to rabbitmq-0 on port 5672
W, [2022-07-14T12:52:16.025106 #30238] WARN -- #<Bunny::Session:0x1b620 guest@127.0.0.1:5672, vhost=/, addresses=[127.0.0.1:5672]>: Could not establish TCP connection to 127.0.0.1:5672: Connection refused - connect(2) for 127.0.0.1:5672
"Got an exception: Could not establish TCP connection to any of the configured hosts. Will retry in 10 seconds"
Forwarding from 127.0.0.1:5672 -> 5672
Forwarding from [::1]:5672 -> 5672
"Performing: xingboxctl port-forward default-mobile-xingboxes rabbitmq 5672"
Configuring port forwarding from 5672 locally to rabbitmq-0 on port 5672
W, [2022-07-14T12:52:29.145361 #30238] WARN -- #<Bunny::Session:0x1b6e8 guest@127.0.0.1:5672, vhost=/, addresses=[127.0.0.1:5672]>: Could not establish TCP connection to 127.0.0.1:5672: Connection refused - connect(2) for 127.0.0.1:5672
"Got an exception: Could not establish TCP connection to any of the configured hosts. Will retry in 10 seconds"
Forwarding from 127.0.0.1:5672 -> 5672
Forwarding from [::1]:5672 -> 5672
"Performing: xingboxctl port-forward default-mobile-xingboxes rabbitmq 5672"
Configuring port forwarding from 5672 locally to rabbitmq-0 on port 5672
Forwarding from 127.0.0.1:5672 -> 5672
Forwarding from [::1]:5672 -> 5672
Handling connection for 5672

Unable to reach Google Compute over port 9000

I have a Google Compute Engine instance running CentOS 7, and I wrote a quick test to try to communicate with it over port 9000 (from my home PC), but I'm unexpectedly getting network errors.
This happens both with my test script (which attempts to send a payload) and even with plink.exe (which I'm just using to check the port availability).
>plink.exe -v -raw -P 9000 <external_IP>
Connecting to <external_IP> port 9000
Failed to connect to <external_IP>: Network error: Connection refused
Network error: Connection refused
FATAL ERROR: Network error: Connection refused
I've added my external IP to Google's firewall (https://console.cloud.google.com/networking/firewalls) and set it to allow ingress traffic over port 9000 (it's the lowest priority, at 1000).
I also updated firewalld in CentOS to allow TCP traffic over the port:
Redirecting to /bin/systemctl start firewalld.service
[foo@bar ~]$ sudo firewall-cmd --zone=public --add-port=9000/tcp --permanent
success
[foo@bar ~]$ sudo firewall-cmd --reload
success
I've confirmed my listener is running on port 9000
[foo@bar ~]$ netstat -npae | grep 9000
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 127.0.0.1:9000 0.0.0.0:* LISTEN 1000 18381 1201/python3
By default, CentOS 7 doesn't use iptables (just to be sure, I confirmed it wasn't running)
Am I missing something?
NOTE: Actual external IP replaced with <external_IP> placeholder
Update:
If I nmap my listener on port 9000 from the CentOS 7 compute instance over a local IP, like 127.0.0.1, I get some results. Interestingly, if I make the same nmap call against the server's external IP, nada. So this has to be a firewall, right?
external call
[foo@bar~]$ nmap <external_IP> -Pn
Starting Nmap 6.40 ( http://nmap.org ) at 2020-05-25 00:33 UTC
Nmap scan report for <external_IP>.bc.googleusercontent.com (<external_IP>)
Host is up (0.00043s latency).
Not shown: 998 filtered ports
PORT STATE SERVICE
22/tcp open ssh
3389/tcp closed ms-wbt-server
Nmap done: 1 IP address (1 host up) scanned in 4.87 seconds
Internal Call
[foo@bar~]$ nmap 127.0.0.1 -Pn
Starting Nmap 6.40 ( http://nmap.org ) at 2020-05-25 04:36 UTC
Nmap scan report for localhost (127.0.0.1)
Host is up (0.010s latency).
Not shown: 997 closed ports
PORT STATE SERVICE
22/tcp open ssh
25/tcp open smtp
9000/tcp open cslistener
Nmap done: 1 IP address (1 host up) scanned in 0.10 seconds
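In this case the software running on the backend VM must listen on any IP (0.0.0.0 or ::). Yours is listening on "127.0.0.1:9000" and it should be "0.0.0.0:9000". A socket bound to 127.0.0.1 only accepts connections arriving on the loopback interface, which is why nmap sees the port as open from the instance itself but not from outside.
The way to fix that is to change the service config to listen on 0.0.0.0 instead of 127.0.0.1.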
Cheers.
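Since the listener in the question is a python3 process, the difference between the two bind addresses can be sketched in Python. This is only an illustration under assumptions: the function name is hypothetical, and port 0 is used so the OS picks a free port instead of the question's 9000.

```python
import socket


def make_listener(host, port=0):
    """Return a TCP socket listening on (host, port); port 0 lets the OS choose."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen()
    return srv


# Bound to loopback only: netstat would show 127.0.0.1:<port>, and only
# clients on the same machine can connect.
local_only = make_listener("127.0.0.1")
print("loopback listener:", local_only.getsockname())
local_only.close()

# Bound to all IPv4 interfaces: netstat would show 0.0.0.0:<port>, and the port
# is reachable on every address the machine has (firewalls permitting).
all_ifaces = make_listener("0.0.0.0")
print("all-interfaces listener:", all_ifaces.getsockname())
all_ifaces.close()
```

In a real service the bind address usually comes from its configuration, so the change is typically a config setting and not a code change.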

Connection refused communicating between pods

Our team has recently started to use GKE, but has encountered an intermittent problem on some of our pods that serve HTTP on port 8080. Other pods in the cluster, even on the same node, get a "connection refused" response when trying to connect to such a pod using its cluster IP:
$ kubectl run -i --tty busybox --image=busybox --restart=Never -- sh
/ # ping 10.28.2.141
PING 10.28.2.141 (10.28.2.141): 56 data bytes
64 bytes from 10.28.2.141: seq=0 ttl=62 time=2.212 ms
64 bytes from 10.28.2.141: seq=1 ttl=62 time=1.993 ms
64 bytes from 10.28.2.141: seq=2 ttl=62 time=4.662 ms
^C
--- 10.28.2.141 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 1.993/2.955/4.662 ms
/ # wget http://10.28.2.141:8080/health-check
Connecting to 10.28.2.141:8080 (10.28.2.141:8080)
wget: can't connect to remote host (10.28.2.141): Connection refused
However, the service is indeed running and listening on that port: if I exec onto the pod and run the same command, it works happily.
For other almost identical pods, this connectivity works correctly, but intermittently some fraction (maybe 10-20%) of pods end up in this state.
There are no errors in the pod logs.
This is a freshly provisioned GKE cluster on version 1.11.6-gke.3 with two nodes, no network policies, and Istio is not installed.
Any ideas on what the problem might be, or how to diagnose further? Happy to add any other information if it would be useful.

Cannot start jupyter notebook remotely on HPC using ssh

I logged in to a HPC using:
ssh -p 2222 user@hpc.edu
and then started Jupyter notebook using:
jupyter notebook --no-browser --port=9999
I got a url:
http://localhost:9999/?token=0518475c55eaafb82abce7d2d5344b48174012
Then I tried to access the Jupyter notebook remotely using my computer:
ssh -p 2222 user@hpc.edu -L 9999:localhost:9999 -N
The connection is refused after taking a long time:
channel 2: open failed: connect failed: Connection refused
I remember earlier being able to access the notebook without putting -p 2222 anywhere in the ssh command. But now I need it to ssh in remotely. Is there any other change to the command needed to access the Jupyter notebook remotely?
EDIT:
I added -v -v to the command that I executed on my computer. Here is what it says:
password:
debug2: input_userauth_info_req
debug2: input_userauth_info_req: num_prompts 0
debug1: Authentication succeeded (keyboard-interactive).
Authenticated to bridges.psc.edu ([128.182.108.57]:2222).
debug1: Local connections to LOCALHOST:9999 forwarded to remote address localhost:9999
debug1: Local forwarding listening on ::1 port 9999.
debug2: fd 4 setting O_NONBLOCK
debug1: channel 0: new [port listener]
debug1: Local forwarding listening on 127.0.0.1 port 9999.
debug2: fd 5 setting O_NONBLOCK
debug1: channel 1: new [port listener]
debug2: fd 3 setting TCP_NODELAY
debug1: Requesting no-more-sessions@openssh.com
debug1: Entering interactive session.
debug1: pledge: network
debug1: client_input_global_request: rtype keepalive@openssh.com want_reply 1
debug1: Connection to port 9999 forwarding to localhost port 9999 requested.
debug2: fd 6 setting TCP_NODELAY
debug2: fd 6 setting O_NONBLOCK
debug1: channel 2: new [direct-tcpip]
channel 2: open failed: connect failed: Connection refused
debug2: channel 2: zombie
debug2: channel 2: garbage collecting
debug1: channel 2: free: direct-tcpip: listening port 9999 for localhost port 9999, connect from 127.0.0.1 port 54542 to 127.0.0.1 port 9999, nchannels 3
debug1: Connection to port 9999 forwarding to localhost port 9999 requested.
debug2: fd 6 setting TCP_NODELAY
debug2: fd 6 setting O_NONBLOCK
debug1: channel 2: new [direct-tcpip]
channel 2: open failed: connect failed: Connection refused
I had tried to follow this:
http://ipyrad.readthedocs.io/HPC_Tunnel.html
This one works for me. First, start Jupyter from your server using:
jupyter notebook --no-browser --port=7002
Then, from your local machine, you can tunnel to Jupyter using the following command:
ssh -N -f -L localhost:7001:localhost:7002 user@hpc.edu
Now you can access Jupyter from your local machine by browsing to localhost:7001.
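The -L argument reads as local_host:local_port:remote_host:remote_port, where the remote side is resolved on the SSH server. The small Python sketch below assembles the same command from its parts, using only standard OpenSSH flags. The helper name is hypothetical, and carrying over -p 2222 from the question is an assumption about that particular cluster.

```python
import shlex


def tunnel_command(user_host, local_port, remote_port, ssh_port=22, background=True):
    """Build an ssh argv that forwards localhost:local_port to localhost:remote_port
    as seen from the SSH server (hypothetical helper, standard OpenSSH flags only)."""
    argv = ["ssh", "-N"]                 # -N: no remote command, forwarding only
    if background:
        argv.append("-f")                # -f: go to background after authentication
    argv += ["-L", f"localhost:{local_port}:localhost:{remote_port}"]
    if ssh_port != 22:
        argv += ["-p", str(ssh_port)]    # non-default SSH port, as in the question
    argv.append(user_host)
    return argv


print(shlex.join(tunnel_command("user@hpc.edu", 7001, 7002, ssh_port=2222)))
```

The remote port has to match the --port value Jupyter was started with. A "connect failed: Connection refused" on the channel means nothing was listening on that port on the machine the SSH session landed on.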

I can connect to my DigitalOcean droplet via SSH from my home network but not from my work network

Scenario...
WiFi Network home = Can connect with my Digital Ocean servers fine via SSH;
WiFi Network work = Can't connect with my Digital Ocean servers via SSH;
WiFi Network work SSH debug:
OpenSSH_6.6.1, OpenSSL 1.0.1f 6 Jan 2014
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 19: Applying options for *
debug2: ssh_connect: needpriv 0
debug1: Connecting to xxx.xxx.xxx.xxx [xxx.xxx.xxx.xxx] port 22.
debug1: connect to address xxx.xxx.xxx.xxx port 22: Connection timed out
ssh: connect to host xxx.xxx.xxx.xxx port 22: Connection timed out
Anyone?
Try checking the port with nc:
nc -zvw4 your_host 22
If the port is not open, port 22 is probably blocked on your work network; you can ask your network administrator about it.
Alternatively, on your server, redirect port 443 to 22 via iptables so that you can connect with ssh -p 443 from the restricted network, for example:
iptables -t nat -I PREROUTING -i eth0 -p tcp -m tcp --dport 443 -j REDIRECT --to-ports 22
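Where nc is not available, the same check can be sketched in Python. It also separates "connection refused" (the host answered, but nothing is listening) from "connection timed out" (packets are dropped on the way, typically by a firewall), which is the symptom in the debug output above. The function name and the returned labels are illustrative choices.

```python
import socket


def probe(host, port, timeout=4.0):
    """Attempt one TCP connection and classify the outcome, roughly like nc -zvw4."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No reply within the timeout: packets are probably being dropped.
        return "timeout"
    except ConnectionRefusedError:
        # The host replied with a reset: reachable, but no listener on that port.
        return "refused"
    except OSError as exc:
        # Anything else, e.g. DNS failure or no route to host.
        return f"error: {exc}"


if __name__ == "__main__":
    # your_host is a placeholder, as in the nc command above.
    for port in (22, 443):
        print(port, probe("your_host", port))
```

A timeout on 22 together with "open" on 443 would suggest the work network filters outbound SSH, which is the case the iptables redirect is meant to work around.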
