Envoy's Logical DNS connection management - tcp

In its documentation for Logical DNS service discovery ( https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/service_discovery#logical-dns ), Envoy says:
"only uses the first IP address returned when a new connection needs to
be initiated"
How does Envoy decide when a new upstream connection needs to be initiated?
It also says:
"Connections are never drained"
What happens to old connections if an upstream host becomes unreachable? Do health-checks apply to all the upstream hosts that currently have established connections or are they only monitoring the host with the current "first IP address"? If the latter, am I right to assume that Envoy will only remove the failed upstream connection (and consequently stop trying to send traffic to those hosts) once it tries to write to it and the peer ACK times out? If so, is it possible to configure the timeout duration?

After looking into the code and doing some tests, this is what I've seen:
How does envoy decide when a new upstream connection needs to be
initiated?
For connection establishment, in the case of the TCP proxy (the filter I was using), there is a 1:1 mapping between downstream and upstream connections, therefore a new upstream connection is established when a new downstream connection is established.
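This is not Envoy's code, but a minimal Python sketch of the same 1:1 idea, with a hypothetical upstream address: each accepted downstream connection triggers exactly one new upstream connection, and bytes are relayed in both directions.

import asyncio

UPSTREAM = ("upstream.example.com", 9000)  # hypothetical upstream address

async def handle(down_reader, down_writer):
    # 1:1 mapping: each new downstream connection opens a new upstream connection.
    up_reader, up_writer = await asyncio.open_connection(*UPSTREAM)

    async def pump(reader, writer):
        try:
            while data := await reader.read(65536):
                writer.write(data)
                await writer.drain()
        finally:
            writer.close()

    # Relay bytes in both directions until either side closes.
    await asyncio.gather(pump(down_reader, up_writer), pump(up_reader, down_writer))

async def main():
    server = await asyncio.start_server(handle, "0.0.0.0", 10000)
    async with server:
        await server.serve_forever()

asyncio.run(main())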
What happens to old connections if an upstream host becomes
unreachable?
It depends on whether the peer terminated the connection explicitly (a TCP FIN or RST packet was received) or simply became unreachable. In the first case, the upstream connection is destroyed (along with the downstream connection); in the second, nothing happens until the TCP connection times out (I believe due to TCP_USER_TIMEOUT or the tcp_retries2 retry limit; it was taking more than 15 minutes on my local machine).
Do health-checks apply to all the upstream hosts that currently have
established connections or are they only monitoring the host with the
current "first IP address"?
They only apply to the current "first IP address".
If the latter, am I right to assume that Envoy will only remove the
failed upstream connection (and consequently stop trying to send
traffic to those hosts) once it tries to write to it and the peer ACK
times out?
Yes, although typically the downstream client's timeouts will kick in first and destroy the connection.
If so, is it possible to configure the timeout duration?
I couldn't find an option to set the socket's TCP_USER_TIMEOUT in Envoy. Changing the OS-level tcp_retries2 might help but, according to the documentation, the total time is also influenced by the smoothed round-trip time of the TCP connection, so changing tcp_retries2 cannot define an absolute timeout value.
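For reference, this is how TCP_USER_TIMEOUT can be set directly on a Linux socket, outside of Envoy (a sketch; the 30-second value and the target address are illustrative):

import socket

# Python 3.6+ exposes TCP_USER_TIMEOUT on Linux; fall back to its numeric value (18) otherwise.
TCP_USER_TIMEOUT = getattr(socket, "TCP_USER_TIMEOUT", 18)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Abort the connection if transmitted data stays unacknowledged for 30 seconds.
sock.setsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT, 30_000)  # milliseconds
sock.connect(("example.com", 80))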

Related

TCP Connection Lifetime

I know that TCP is connection-oriented. But if I set up a forwarding server (a syslog server, for example) which forwards logs over TCP, is the connection always on, or is it established each time logs are forwarded to the server?
It depends on the server configuration.
If you are working on Linux, you can use the command
cat /proc/sys/net/ipv4/tcp_keepalive_time
to check your current keepalive value in seconds.
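The idle time is only one of three system-wide knobs; a short Python sketch that reads all three from their standard procfs paths on Linux:

from pathlib import Path

# System-wide TCP keepalive defaults; per-socket options can override them.
base = Path("/proc/sys/net/ipv4")
for name in ("tcp_keepalive_time", "tcp_keepalive_intvl", "tcp_keepalive_probes"):
    print(name, "=", (base / name).read_text().strip())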

What is the typical usage of TCP keepalive?

Consider a scenario where there is one server and multiple clients, and each client creates TCP connections to interact with the server. There are three usages of TCP keepalive:
Server-side keepalive: The server sends TCP keepalive to make sure that the client is alive. If the client is dead, the server closes the TCP connection to the client.
Client-side keepalive: The client sends TCP keepalive to prevent the server from closing the TCP connection to the client.
Both-side keepalive: Both server and clients send TCP keepalive, as described above.
Which of the above usages of TCP keepalive are typical?
Actually, both server and client peers may use TCP keepalive. It is useful to ensure that the operating system will eventually release any resource associated with dead connections. Note that if a connection between two hosts gets lost because of some issue with a router between them, then both hosts have to independently detect that the connection is dead and clean up for themselves.
Now, each host maintains a timer on each connection indicating when it last received a packet associated with that connection. A host will send a keepalive packet when that timer goes over a certain threshold, which is defined locally (that is, hosts do not exchange information about their own keepalive configuration). So the host with the lower keepalive time will take the initiative of sending a keepalive packet to the other host. If the packet indeed goes through, the other host (the one with the higher keepalive time) will respond to that packet and reset its own timer; therefore, the host with the higher keepalive time will never need to send a keepalive packet itself, unless the connection has indeed been lost.
Arguably, it could be said that servers are generally more aggressive about keepalive than client machines (that is, they will more often be configured with a lower keepalive time), because hanging connections often have undesirable effects on server software (for example, the software may accept only a limited number of concurrent connections, or the server may fork a new process instance associated with each connection).
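To make this concrete, here is a sketch of enabling keepalive on a single socket, with the Linux per-socket options overriding the system defaults (the values and target address are illustrative):

import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
# Linux-specific per-socket overrides of the keepalive timers:
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)   # idle seconds before the first probe
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10)  # seconds between unanswered probes
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)     # unanswered probes before the connection is dropped
sock.connect(("example.com", 80))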
Server-side keepalive: The server sends TCP keepalive to make sure that the client is alive. If the client is dead, the server closes the TCP connection to the client.
If the client is dead, the server gets a 'connection reset' error, after which it should close the connection.
Client-side keepalive: The client sends TCP keepalive to prevent the server from closing the TCP connection to the client.
No. The client sends keepalive so that, if the server is dead, the client will get a 'connection reset' error, after which it should close the connection.
Both-side keepalive
Both sides are capable of getting a 'connection reset' due to keepalive failure, as above.
Which of the above usages is typical?
Any of them, or none. If a peer is sending regularly, it doesn't really need keepalive as well. Keepalive is therefore often of more use to a server than to a client.

HTTP proxy SSL tunneling relay details

I am trying to wrap my head around the SSL tunneling process performed by an HTTP proxy after receiving the CONNECT method from a client.
Things I can't seem to find or understand in docs, blogs, and RFCs:
1) When setting up the tunnel, are the client-proxy and proxy-destination connections two separate connections or one and the same? E.g., is there one TCP handshake between client and proxy and another between proxy and destination?
2) When starting the SSL handshake, which node (IP address/hostname) is targeted by the client: the proxy or the destination host? Since SSL requires a point-to-point connection for the authentication to work, my feeling is that it should be the destination host. But then again, that wouldn't make sense, since the destination host isn't (directly) accessible from the client's perspective (hence the proxy).
When setting up the tunnel, are the client-proxy and proxy-destination connections two separate connections or one and the same? E.g., is there one TCP handshake between client and proxy and another between proxy and destination?
Since the client makes its TCP connection to the proxy, there is no other way than for the proxy to make a second TCP connection to the server. There is no way to change an existing TCP connection to be connected to a different IP:port.
When starting the SSL handshake, which node (IP address/hostname) is targeted by the client: the proxy or the destination host?
The SSL handshake is done with the destination host, not the proxy.
Since SSL requires a point-to-point connection for the authentication to work
It doesn't need a point-to-point connection. It just needs all data to be exchanged unmodified between client and server, which is the case when the proxy simply forwards the data.
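A minimal Python sketch of both steps from the client's side (the proxy address is hypothetical; error handling and full header parsing are omitted):

import socket
import ssl

PROXY = ("proxy.example.com", 3128)  # hypothetical proxy address
TARGET_HOST, TARGET_PORT = "example.com", 443

# First TCP connection: client -> proxy.
raw = socket.create_connection(PROXY)
raw.sendall(
    f"CONNECT {TARGET_HOST}:{TARGET_PORT} HTTP/1.1\r\n"
    f"Host: {TARGET_HOST}:{TARGET_PORT}\r\n\r\n".encode()
)
# A "200 Connection established" reply means the proxy has opened its own,
# second TCP connection to the target and will now relay bytes blindly.
reply = raw.recv(4096)
assert b" 200 " in reply.split(b"\r\n", 1)[0], reply

# The TLS handshake runs end-to-end with the destination host, not the proxy.
ctx = ssl.create_default_context()
tls = ctx.wrap_socket(raw, server_hostname=TARGET_HOST)
tls.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")
print(tls.recv(200))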

Will keep-alive be useful with load balancers and firewalls?

I have a client and a server component. The server may be installed behind a firewall or load balancer. Many sites/forums suggest using the TCP keep-alive feature to avoid connection termination due to inactivity.
The question is whether the keep-alive messages from the client will actually reach the server.
I tried to simulate the deployment using the tcptrace utility and found that the keep-alive messages do not reach the server, yet the client was still getting ACKs for its keep-alive probes.
I am not sure whether all load balancers/firewalls work in the same manner.
Is keep-alive a good option to avoid connection termination due to inactivity on a socket when a firewall or load balancer is in the path?
The answer is, of course: "it depends".
Many firewalls and load balancers maintain separate frontend and backend TCP connections, e.g.:
client <-- TCP --> firewall/balancer <-- TCP --> server
For situations like this, using TCP keepalive will not work as you'd expect. Why not? TCP keepalive works for that TCP session only, and the keepalive probe packets are more like "administrative overhead" packets than data-bearing packets. This means that a) using TCP keepalive on the client end only keeps the TCP connection to the firewall/balancer alive, and b) the firewall/balancer does not "forward" those keepalive probe packets across to the backend connection.
So is using TCP keepalive useful? Yes. There are other types of proxies which work at lower layers in the OSI stack, and which do forward those packets; using TCP keepalive is good for keeping your idle connection alive through those types of network intermediaries.
If your client/server application uses a long-lived, possibly idle TCP connection through firewalls/balancers, the best way to ensure that that connection is not torn down (sometimes politely, e.g. with a RST packet sent by the firewall/balancer, sometimes silently) is to use a "ping" or "heartbeat" message at the application layer. (Think of this as an "application keepalive".) This is just some kind of message that is sent e.g. from the client to the server. A simple and effective technique is to have the client periodically send some bytes to the server, which the server echoes back to the client. The client knows which bytes it sent, and when it receives those same bytes back from the server, it knows that everything in the network path is still working as expected.
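A sketch of that echo-based heartbeat, run on the client side of an already-connected socket; it assumes the application protocol reserves these bytes for echoing, and the interval is an arbitrary choice:

import os
import socket
import time

def heartbeat(sock: socket.socket, interval: float = 30.0) -> None:
    # Periodically send random bytes and expect the server to echo them back.
    while True:
        time.sleep(interval)
        probe = os.urandom(8)
        sock.sendall(probe)
        echoed = b""
        while len(echoed) < len(probe):
            chunk = sock.recv(len(probe) - len(echoed))
            if not chunk:
                raise ConnectionError("server closed the connection")
            echoed += chunk
        if echoed != probe:
            raise ConnectionError("heartbeat echo mismatch; path is broken")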
Hope this helps!

HTTP 1.1 connection and client port

Does the client's remote port change during an HTTP 1.1 connection exchange?
I am trying to figure out if I can programmatically and uniquely identify a connection on the server using the request's remote port and remote IP address.
This is not so much an HTTP question as a TCP one. And no, the port doesn't change: the ephemeral port stays the same for the duration of the connection.
However, as soon as a new connection is made, the client can (and most probably will) use a different port. This totally depends on the implementation of the client OS and the Network Address Translation of intermediary routers.
Anyway, it is not something you can depend on to build something like a session, because the next request from the same client may very well arrive from a different port (let alone that HTTP does not have to run on top of TCP).
Just use a session-ID which you store in a cookie.
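For completeness, a minimal sketch of what the server actually sees per connection; the (IP, port) pair is unique only for the lifetime of that one connection:

import socket

srv = socket.create_server(("0.0.0.0", 8080))  # Python 3.8+
conn, (ip, port) = srv.accept()
# (ip, port) identifies this TCP connection while it lasts; the same client's
# next connection will usually arrive from a different ephemeral port.
print(f"connection from {ip}:{port}")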
