Our team recently started using GKE and has run into an intermittent problem with some of our pods that serve HTTP on port 8080. Other pods in the cluster, even on the same node, get a "connection refused" response when they try to connect to an affected pod using its cluster IP:
$ kubectl run -i --tty busybox --image=busybox --restart=Never -- sh
/ # ping 10.28.2.141
PING 10.28.2.141 (10.28.2.141): 56 data bytes
64 bytes from 10.28.2.141: seq=0 ttl=62 time=2.212 ms
64 bytes from 10.28.2.141: seq=1 ttl=62 time=1.993 ms
64 bytes from 10.28.2.141: seq=2 ttl=62 time=4.662 ms
^C
--- 10.28.2.141 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 1.993/2.955/4.662 ms
/ # wget http://10.28.2.141:8080/health-check
Connecting to 10.28.2.141:8080 (10.28.2.141:8080)
wget: can't connect to remote host (10.28.2.141): Connection refused
However, the service is indeed running and listening on that port: if I exec into the pod and run the same command, it works fine.
Other, almost identical pods are reachable as expected, but some fraction of pods (maybe 10-20%) intermittently end up in this state.
There are no errors in the pod logs.
This is a freshly provisioned GKE cluster on version 1.11.6-gke.3 with two nodes, no network policies, and Istio is not installed.
Any ideas on what the problem might be, or how to diagnose further? Happy to add any other information if it would be useful.
I have a Google Compute Engine instance running CentOS 7, and I wrote a quick test to try to communicate with it over port 9000 (from my home PC), but I'm unexpectedly getting network errors.
This happens both with my test script (which attempts to send a payload) and even with plink.exe (which I'm just using to check the port availability).
>plink.exe -v -raw -P 9000 <external_IP>
Connecting to <external_IP> port 9000
Failed to connect to <external_IP>: Network error: Connection refused
Network error: Connection refused
FATAL ERROR: Network error: Connection refused
I've added my external IP to Google's firewall (https://console.cloud.google.com/networking/firewalls) and set it to allow ingress traffic on port 9000 (the rule's priority is 1000).
I also updated firewalld in CentOS to allow TCP traffic over the port:
Redirecting to /bin/systemctl start firewalld.service
[foo@bar ~]$ sudo firewall-cmd --zone=public --add-port=9000/tcp --permanent
success
[foo@bar ~]$ sudo firewall-cmd --reload
success
I've confirmed my listener is running on port 9000:
[foo@bar ~]$ netstat -npae | grep 9000
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 127.0.0.1:9000 0.0.0.0:* LISTEN 1000 18381 1201/python3
By default, CentOS 7 doesn't use iptables (just to be sure, I confirmed it wasn't running)
Am I missing something?
NOTE: Actual external IP replaced with <external_IP> placeholder
Update:
If I nmap my listener on port 9000 from the CentOS 7 compute instance itself using a local IP such as 127.0.0.1, I get results. Interestingly, if I make the same nmap call against the server's external IP: nada. So this has to be a firewall, right?
External call
[foo@bar~]$ nmap <external_IP> -Pn
Starting Nmap 6.40 ( http://nmap.org ) at 2020-05-25 00:33 UTC
Nmap scan report for <external_IP>.bc.googleusercontent.com (<external_IP>)
Host is up (0.00043s latency).
Not shown: 998 filtered ports
PORT STATE SERVICE
22/tcp open ssh
3389/tcp closed ms-wbt-server
Nmap done: 1 IP address (1 host up) scanned in 4.87 seconds
Internal call
[foo@bar~]$ nmap 127.0.0.1 -Pn
Starting Nmap 6.40 ( http://nmap.org ) at 2020-05-25 04:36 UTC
Nmap scan report for localhost (127.0.0.1)
Host is up (0.010s latency).
Not shown: 997 closed ports
PORT STATE SERVICE
22/tcp open ssh
25/tcp open smtp
9000/tcp open cslistener
Nmap done: 1 IP address (1 host up) scanned in 0.10 seconds
In this case, the software running on the backend VM must listen on all addresses (0.0.0.0 or ::). Yours is listening on "127.0.0.1:9000", and it should be "0.0.0.0:9000".
The fix is to change the service configuration so that it listens on 0.0.0.0 instead of 127.0.0.1.
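Since the netstat output shows a python3 listener, here is a minimal sketch of what the two bind addresses mean at the socket level. The helper name open_listener is hypothetical and not taken from the asker's code, and port 0 is used only so the OS picks a free port for the demonstration.

```python
import socket


def open_listener(host, port=0):
    """Return a TCP socket listening on (host, port); port 0 lets the OS pick."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen()
    return srv


# Loopback only: connections from the machine itself succeed, while
# connections arriving on any other address find nothing listening.
local_only = open_listener("127.0.0.1")

# All IPv4 interfaces: reachable through any address the machine owns,
# subject to whatever firewalls sit in between.
all_interfaces = open_listener("0.0.0.0")

print(local_only.getsockname()[0])      # 127.0.0.1
print(all_interfaces.getsockname()[0])  # 0.0.0.0

local_only.close()
all_interfaces.close()
```

In a real service the change is usually a single host argument or configuration value; where exactly it lives depends on the framework in use.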
Cheers.
I ran az login from WSL on my Windows machine, and it gave this error:
Please ensure you have network connection. Error detail: HTTPSConnectionPool(host='login.microsoftonline.com', port=443): Max retries exceeded with url: /common/oauth2/token (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f401d135630>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))
I suspect this is a DNS issue.
So I checked /etc/resolv.conf of WSL:
# This file was automatically generated by WSL. To stop automatic generation of this file, remove this line.
nameserver 192.168.1.1
nameserver fcc0:0:0:ffff::1
nameserver fcc0:0:0:ffff::2
192.168.1.1 is the default gateway.
These are the results of some commands I tried:
$ ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=128 time=0.351 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=128 time=0.888 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=128 time=0.883 ms
$ dig 192.168.1.1
; <<>> DiG 9.11.3-1ubuntu1.3-Ubuntu <<>> 192.168.1.1
;; global options: +cmd
;; connection timed out; no servers could be reached
$ nslookup 192.168.1.1
;; connection timed out; no servers could be reached
These commands also fail in a way that points to the same issue:
ping google.com
dig google.com
All of these commands (or their Windows equivalents) work correctly from the Windows command prompt.
I found a workaround here:
https://askubuntu.com/questions/91543/apt-get-update-fails-to-fetch-files-temporary-failure-resolving-error
It says that I should add the following nameserver 8.8.8.8 line to /etc/resolv.conf. If I do it like this, it works:
nameserver 192.168.1.1
nameserver 8.8.8.8
nameserver fcc0:0:0:ffff::1
nameserver fcc0:0:0:ffff::2
After this, ping google.com and dig google.com work fine, but I can see that the nameserver used to resolve is 8.8.8.8.
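One possible reading of that observation, assuming the resolver behaves as documented in resolv.conf(5): nameservers are queried in the order listed, so when 192.168.1.1 never answers, the lookup falls through to the next entry, 8.8.8.8. The sketch below only extracts that order from the file contents. parse_nameservers is a hypothetical helper written for illustration, not part of WSL or any DNS tool.

```python
def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses from resolv.conf text, in file order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        # Blank lines and lines starting with '#' or ';' carry no settings.
        if not stripped or stripped[0] in "#;":
            continue
        fields = stripped.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers


RESOLV_CONF = """\
# This file was automatically generated by WSL.
nameserver 192.168.1.1
nameserver 8.8.8.8
nameserver fcc0:0:0:ffff::1
nameserver fcc0:0:0:ffff::2
"""

print(parse_nameservers(RESOLV_CONF))
# ['192.168.1.1', '8.8.8.8', 'fcc0:0:0:ffff::1', 'fcc0:0:0:ffff::2']
```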
If I connect to VPN, it adds our own nameservers to /etc/resolv.conf and after that, there is no problem in resolving the URLs. Once the VPN is disconnected, the issue arises again.
Note:
There were no issues like this before.
Yesterday we replaced our router in order to use a new ISP's connection, and the issue started after that.
Other machines in the same network don't have this issue.
Why does this occur, and how can I properly fix it in WSL?
Why can only one machine on our network ping the default gateway but not dig against it?
Update:
I can see that there are two entries that are marked as default in routing table:
$ ip route show table all | grep default
none default via 192.168.0.1 dev wifi0 proto unspec metric 0
none default via 192.168.1.1 dev eth6 proto unspec metric 0
Already tried:
Connect the BBB with USB to iMac
Share internet with the board from System Preferences->Sharing
ssh to the board and then try udhcpc -i usb0
This is what it says:
udhcpc (v1.20.2) started
It gets stuck and I get an error: Write failed: Broken pipe
ssh exits
Any clues?
After some trial and error, here's what worked for me:
1. Watch this video: http://www.youtube.com/watch?v=Cf9hnscbSK8
2. If your BBB was shipped after November 2013, use screen /dev/tty.usb* 115200 instead of screen /dev/tty.usb*B 115200. In practice, go to the /dev directory, check which tty.usbXXX device is available for your BBB, and screen that one. In my case it was tty.usb131, for example.
3. Continue with the steps just as in the video until opkg update, which is the step that needs Internet access.
And that's all there is to it.
Your SSH session is getting stuck because you're connected to usb0 and the udhcpc command changed the IP address for it! At this point there's nothing listening on the other end of your ssh session, so your local computer's ssh client eventually fails with the broken pipe error and exits.
An obvious workaround is to connect via tty.usbserial instead of ssh to the IP address. You'd think the usb port's assigned IP shouldn't be changing though. Read on to understand what's happening.
Most people using a BBB for the first time attach it directly to their Internet-connected computer using the supplied USB cable. That's exactly what the BBB's designers intended, and they've done a fantastic job with the BBB's startup web page.
How that host computer shares its connection differs, though, depending on whether it runs Windows, OS X or Linux, and the procedure varies with the version of the OS you're running.
Derek Molloy (Exploring BeagleBone) and Jason Kridner (YouTube OS X BeagleBone video) provide some fairly detailed instructions for using host-based Internet sharing with your BBB. The Linux and Windows instructions are still good, but the OS X info needs updating for Yosemite: Apple switched its NAT and firewall software from ipfw and natd to pf. If you try running udhcpc like Jason did in his video, it doesn't work the same way it did for him.
So back to your BBB SSH problem with OS X Yosemite. Here's how to see what's going on: Connect to the BBB using a serial/FTDI cable, then check the ip config of usb0 for the beaglebone.
beaglebone:~# ifconfig -a usb0
usb0 Link encap:Ethernet HWaddr 0e:be:ff:00:ff:00 inet addr:192.168.7.2
Bcast:192.168.7.3 Mask:255.255.255.252
confirm you can ping the host that's sharing its Internet connection
beaglebone:~# ping 192.168.7.1
PING 192.168.7.1 (192.168.7.1) 56(84) bytes of data.
64 bytes from 192.168.7.1: icmp_req=1 ttl=64 time=0.681 ms
64 bytes from 192.168.7.1: icmp_req=2 ttl=64 time=0.533 ms
^C
try reaching an Internet IP (google dns)
beaglebone:~# ping 8.8.8.8
connect: Network is unreachable
check routes and confirm there's no default route out, which is why the ping above failed (a USB connected BBB has a 192.168.7.0/30 network setup by default, so it can only reach 192.168.7.0, .1, .2 and .3 addresses).
beaglebone:~# netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
192.168.7.0 0.0.0.0 255.255.255.252 U 0 0 0 usb0
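The /30 arithmetic above can be checked with Python's standard ipaddress module. This is only an illustrative sketch; directly_reachable is a hypothetical helper name, and "reachable" here means nothing more than "inside the directly connected network, so no gateway is needed".

```python
import ipaddress


def directly_reachable(address, network):
    """True if address lies inside network, i.e. no gateway is needed."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(network)


usb_net = ipaddress.ip_network("192.168.7.0/30")

# A /30 holds exactly four addresses: network, two hosts, broadcast.
print([str(addr) for addr in usb_net])
# ['192.168.7.0', '192.168.7.1', '192.168.7.2', '192.168.7.3']

print(directly_reachable("192.168.7.1", "192.168.7.0/30"))  # True
print(directly_reachable("8.8.8.8", "192.168.7.0/30"))      # False
```

With no default route, anything for which this check is False has nowhere to go, which matches the "Network is unreachable" error.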
so if you run udhcpc it will add the missing route for you (you could also just add the route directly, but you need to set up DNS as well, and with OS X Internet Sharing it won't work without also changing the BBB's IP address; see the links at the end of this post)
beaglebone:~# udhcpc -i usb0
udhcpc (v1.20.2) started
Sending discover...
Sending discover...
and here is where udhcpc changes the IP instead of just re-using 192.168.7.2. The new IP is compatible with the IP range used by OS X Internet Sharing, so that may be why the DHCP server is returning it.
Sending select for 192.168.2.34...
Lease of 192.168.2.34 obtained, lease time 85536
udhcpc then throws an error because there's no default route to delete
/etc/udhcpc/default.script: Resetting default routes
SIOCDELRT: No such process
udhcpc then adds the default route - note carefully it's an OS X Internet Sharing 192.168.2 address, not the original 192.168.7.
/etc/udhcpc/default.script: Adding DNS 192.168.2.1
everything worked, so you can see the new route and successfully ping an external IP now
beaglebone:~# netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
0.0.0.0 192.168.2.1 0.0.0.0 UG 0 0 0 usb0
192.168.2.0 0.0.0.0 255.255.255.0 U 0 0 0 usb0
beaglebone:~# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_req=1 ttl=53 time=4.08 ms
64 bytes from 8.8.8.8: icmp_req=2 ttl=53 time=3.59 ms
^C
There are a couple of blog posts that show how to set this up permanently:
Sharing OS X Internet Connection over USB to BeagleBone Black
and
Changing usb0 IP address on the BeagleBone Black
I'd like to learn and play with TCP/IP libraries for Python, Java or C++, but I only have one computer. Is it possible to "fake" remote computers to emulate remote hosts, behind NAT and everything?
The simplest way is to run both the server and client on the same computer and use the "loopback" IP address: 127.0.0.1 which always connects to the local host. I've done this many times during testing. For example, run a local webserver on port NNN and then in the browser enter http://127.0.0.1:NNN/ In fact, 127.X.Y.Z should always talk to the local machine.
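As a minimal sketch of that idea in Python, the following runs a tiny echo server and a client in the same process, both on 127.0.0.1. The function names are made up for this example, and port 0 simply asks the OS for any free port.

```python
import socket
import threading


def serve_once(srv):
    """Accept one connection and echo back the first chunk received."""
    conn, _addr = srv.accept()
    with conn:
        conn.sendall(conn.recv(1024))


def loopback_round_trip(message):
    """Send message to an echo server on 127.0.0.1 and return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        srv.listen(1)
        port = srv.getsockname()[1]

        # The server side runs in a background thread of the same process.
        worker = threading.Thread(target=serve_once, args=(srv,))
        worker.start()

        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(message)
            reply = client.recv(1024)
        worker.join()
    return reply


print(loopback_round_trip(b"hello"))  # b'hello'
```

The same client code would work unchanged against a remote machine by swapping the address, which is what makes loopback a convenient starting point.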
If you are using linux, you can configure dummy interfaces, then bind your client / server to different dummy interfaces.
[mpenning@Bucksnort ~]$ sudo modprobe dummy
[mpenning@Bucksnort ~]$ sudo ip addr add 192.168.12.12/24 dev dummy0
[mpenning@Bucksnort ~]$ ip addr show dummy0
6: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN
link/ether b6:6c:65:01:fc:ff brd ff:ff:ff:ff:ff:ff
inet 192.168.12.12/24 scope global dummy0
[mpenning@Bucksnort ~]$ ping 192.168.12.12
PING 192.168.12.12 (192.168.12.12) 56(84) bytes of data.
64 bytes from 192.168.12.12: icmp_seq=1 ttl=64 time=0.085 ms
^C
--- 192.168.12.12 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.085/0.085/0.085/0.000 ms
[mpenning@Bucksnort ~]$ sudo modprobe dummy -o dummy1
[mpenning@Bucksnort ~]$ sudo rmmod dummy
[mpenning@Bucksnort ~]$ ip addr show dummy0
Device "dummy0" does not exist.
[mpenning@Bucksnort ~]$
You should be able to run ipchains on these interfaces just like any other.
You can start out with talking between programs on your own computer.
You can use virtual machine software such as VirtualBox, VMWare, VirtualPC, etc to create what is essentially a second machine within yours and talk to that (though the network topology may be very slightly unusual - something more to learn about)
If you want to talk to something remote, you can rent a small cloud server running linux or windows from the likes of Amazon for pennies an hour and install whatever you want on it.
Use VirtualBox to install a second OS on your system. For any networking application, this is the best option: you don't have to work on two different machines, and it's easy to see what's happening at both ends.
Run the server so that it listens on your network adapter, or on localhost. Then issue requests to that same IP and port. Logically, it will all take place within the network driver(s), but it will behave the same way as if that IP address belonged to another machine (barring firewall configurations, etc.).
I don't know whether there is a way to ping a target outside my LAN when the only way out is a Squid proxy that accepts only HTTP requests. I read somewhere that one way to deal with this is to use an HTTP tunnel, so that the proxy still sees the traffic as HTTP requests. Can I use this to ping, say, www.google.com, which otherwise gives the following error because the firewall is rejecting the request:
$ ping www.google.com
ping: unknown host www.google.com
If so, how is it done?
I have installed httptunnel. Any help with how to use it would be much appreciated.
No. Ping and traceroute use lower-layer network protocols (ICMP, which sits at the network layer, and/or UDP, a layer 4 protocol) and will not work over an HTTP (layer 7) tunnel. In any case, even if you could convince the HTTP proxy to open a raw TCP session for you (which is how some HTTP tunneling works), you would not receive the packets needed to confirm that the host is reachable (the ICMP echo reply in the case of ping, or the time-to-live-exceeded ICMP packets in the case of traceroute).
To test for connectivity in this situation, I think the best you can do is an HTTP "ping". (Try to establish an HTTP connection with the remote host and see if it works.) For example, you could do something like:
$ http_proxy=http://webproxy.example.com:3128 \
> curl -I http://google.com/ > /dev/null 2>&1 \
> && echo success || echo failure
Assuming you have curl installed, this would print "success" if google.com is reachable through your proxy and "failure" if not.
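If curl is not available, roughly the same check can be sketched with Python's standard library. This is an illustration under assumptions rather than a drop-in tool: http_ping is a made-up name, and the proxy URL in the usage note is the same placeholder as in the curl command.

```python
import urllib.error
import urllib.request


def http_ping(url, proxy=None, timeout=5.0):
    """Return True if a HEAD request to url gets any HTTP response."""
    # An empty mapping disables proxies entirely, including any that
    # would otherwise be picked up from environment variables.
    proxies = {"http": proxy, "https": proxy} if proxy else {}
    opener = urllib.request.build_opener(urllib.request.ProxyHandler(proxies))
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # Something answered with an HTTP error status. Like the curl
        # command above, this counts as a response; note that behind a
        # proxy the error may come from the proxy itself.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout and similar.
        return False
```

A call such as http_ping("http://google.com/", proxy="http://webproxy.example.com:3128") would then play the role of the curl one-liner, returning True or False instead of printing "success" or "failure".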
It's not exactly what you were looking for, but if you can access an external SSH server, you can run ping through that. Note that the results will reflect the ping time from the SSH server to the target, not from your own machine:
$ ssh username@server 'ping -c 1 google.com'
PING google.com (72.14.204.147) 56(84) bytes of data.
64 bytes from iad04s01-in-f147.1e100.net (72.14.204.147): icmp_seq=1 ttl=57 time=2.64 ms
--- google.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 2.640/2.640/2.640/0.000 ms