How should I health-check an event-driven (asynchronous) service?

Suppose I have a service which, rather than listening for HTTP requests or gRPC calls, only consumes messages from a broker (Kafka, RabbitMQ, Google Pub/Sub, what have you). How should I go about health-checking the service (e.g. k8s liveness and readiness probes)?
Should the service also listen on HTTP solely for the purpose of health checking, or is there some other technique that can be used?

Having the service listen on HTTP solely to expose a liveness/readiness check isn't really a problem, and it also opens up the potential to expose diagnostic and control endpoints later. (For services that pull their input from a message broker, readiness isn't necessarily something a container scheduler like k8s needs to act on, since there is no inbound traffic to gate.)
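As a rough illustration (in Go, with the port, path, and consumeLoop all placeholders), the health listener can live alongside the consumer loop:

```go
// Minimal sketch: a broker consumer that serves an HTTP liveness endpoint
// on a side port so Kubernetes can probe it.
package main

import (
	"log"
	"net/http"
	"sync/atomic"
)

// healthy is flipped by the consumer loop and read by the probe handler.
var healthy atomic.Bool

// consumeLoop stands in for your Kafka/RabbitMQ/PubSub consumer. A real
// implementation would set healthy to false when it stops making progress.
func consumeLoop() {
	select {} // block forever in this sketch
}

func main() {
	healthy.Store(true)

	// Liveness endpoint for the kubelet; nothing else is served over HTTP.
	http.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		if healthy.Load() {
			w.WriteHeader(http.StatusOK)
			return
		}
		w.WriteHeader(http.StatusServiceUnavailable)
	})
	go func() {
		log.Fatal(http.ListenAndServe(":8080", nil))
	}()

	consumeLoop()
}
```

The pod's livenessProbe (and readinessProbe, if you want one) would then simply be an httpGet against /healthz on port 8080.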

Kubernetes supports three different types of probes (see also the Kubernetes docs):
- Running a command
- Making an HTTP request
- Checking whether a TCP socket can be opened
So in your case you can run a command that exits with a non-zero status when your service is unhealthy.
Also be aware that liveness probes may be dangerous to use.
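To make the command-based approach concrete, here's a rough sketch (in Go; the file path and 60-second staleness threshold are assumptions) where the consumer touches a heartbeat file on every poll cycle and a tiny probe binary, wired up as the exec liveness command, fails once the heartbeat goes stale:

```go
// Sketch of an exec-style liveness check: the consumer loop touches a
// heartbeat file after each successful poll, and a tiny probe binary
// (run by the exec livenessProbe) fails when the file goes stale.
package main

import (
	"fmt"
	"os"
	"time"
)

const heartbeatFile = "/tmp/heartbeat" // hypothetical path, shared between loop and probe
const staleAfter = 60 * time.Second    // assumed threshold; tune to your poll interval

// touchHeartbeat would be called from the consumer's poll loop after each cycle.
func touchHeartbeat() error {
	return os.WriteFile(heartbeatFile, []byte(time.Now().Format(time.RFC3339)), 0o644)
}

// main is the probe binary: it exits non-zero when the heartbeat is stale,
// which makes the exec liveness probe fail.
func main() {
	info, err := os.Stat(heartbeatFile)
	if err != nil || time.Since(info.ModTime()) > staleAfter {
		fmt.Fprintln(os.Stderr, "heartbeat missing or stale")
		os.Exit(1)
	}
}
```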

Related

Filter out certain grpc exceptions via Envoy?

I'd like to create a reverse proxy to expose several gRPC backend services on one host. But I'd also like to whitelist certain gRPC status exception categories and drop all others. I think I've read somewhere that gRPC exceptions go into HTTP/2 trailers, so that might be an option.
I'm trying to find info on the gRPC wire protocol for passing exceptions, but can't find anything amid all the info on protobuf itself.
Any hints/links?

Can we scale gRPC with multiple worker processes on the same server?

Our application is expected to receive thousands of requests every second, and we are considering gRPC since one of our main services is in a different language.
My queries are:
- Can we use something like supervisor to spawn multiple workers (one gRPC server per service) listening on the same port, or is gRPC limited to one server per server/port?
- How would I go about performance testing to determine the maximum requests per gRPC server?
Thanks in advance
While you can certainly use supervisord to spawn multiple gRPC server processes, port sharing would be a problem. However, this is a POSIX limitation, not a gRPC limitation: by default, multiple processes cannot listen on the same port. (To be clear, multiple processes can bind to the same port with SO_REUSEPORT, but that would not give you the behavior you presumably want.)
So you have two options for getting traffic routed to the proper service on a single port. The first option is to run all of the gRPC services in the same process, attached to the same server.
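A minimal sketch of that first option in Go (the commented-out Register calls stand in for whatever your generated protobuf packages provide; they are not real imports):

```go
package main

import (
	"log"
	"net"

	"google.golang.org/grpc"
	// pbA "example.com/gen/servicea" // hypothetical generated packages
	// pbB "example.com/gen/serviceb"
)

func main() {
	lis, err := net.Listen("tcp", ":50051")
	if err != nil {
		log.Fatalf("listen: %v", err)
	}

	// One process, one port, one gRPC server...
	s := grpc.NewServer()

	// ...with every service registered on it. These Register calls come from
	// your generated protobuf code and are placeholders here.
	// pbA.RegisterServiceAServer(s, &serviceAImpl{})
	// pbB.RegisterServiceBServer(s, &serviceBImpl{})

	if err := s.Serve(lis); err != nil {
		log.Fatalf("serve: %v", err)
	}
}
```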
If having only a single server process won't work for you, then you'll have to start looking at load balancing. You'd front all of your services with any HTTP/2-capable load balancer (e.g. Envoy, Nginx) and have it listen on your single desired port and route to each gRPC server process as appropriate.
This is a very broad question. The answer is "the way you'd benchmark any non-gRPC server." This site is a great resource for some principles behind benchmarking.

How to log pod's outgoing HTTP requests in kubernetes?

I have a kubernetes cluster with running pods. In order to monitor and troubleshoot the infrastructure, I want to implement a centralized logging solution so all incoming and outgoing HTTP requests will be logged within one place.
For the incoming requests this is not a problem at all: I can use the nginx log from the ingress controller and present it.
I also understand that I can log outgoing requests inside the application running in a pod, but the problem is that applications from outside developers are also used, and they may not contain any logging implementation.
As for the outgoing requests, there is no solution provided by default, if I understand correctly. I have explored k8s logging and k8s audit, but they do not provide such a feature.
Probably I need some network sniffer, but that is quite a low-level solution for this kind of problem, as I see it. So, the question is: is there any out-of-the-box implementation for such a demand?
Thanks!
Take a look at a service mesh solution like Istio or Linkerd, as well as tracing solutions like Jaeger or Zipkin. With these you can get full observability of how information flows in, out of, and through your kube cluster.

Fixing kubernetes service redeploy errors with keep-alive enabled

We have a kubernetes service running on three machines. Clients both inside and outside of our cluster talk to this service over HTTP with the keep-alive option enabled. During a deploy of the service, the exiting pods have a readiness check that starts to fail when shutdown begins, and they are removed from the service endpoints list appropriately; however, they still receive traffic, and some requests fail because the container exits abruptly. We believe this is because keep-alive allows the client to re-use connections that were established while the host was Ready. Is there a series of steps one should follow to make sure we don't run into these issues? We'd like to allow keep-alive connections if at all possible.
The issue happens if the proxying/load balancing happens at layer 4 instead of layer 7. For internal services (Kubernetes services of type ClusterIP), since kube-proxy does layer 4 proxying, clients will keep the connection open even after the pod isn't ready to serve anymore. Similarly, for services of type LoadBalancer, if the backend protocol is set to TCP (which is the default with AWS ELB), the same issue happens. Please see this issue for more details.
The solution to this problem as of now is:
- If you are using a cloud LoadBalancer, set the backend protocol to HTTP. For example, you can add the service.beta.kubernetes.io/aws-load-balancer-backend-protocol annotation to the Kubernetes service and set it to HTTP, so that the ELB uses HTTP proxying instead of TCP.
- Use a layer 7 proxy/ingress controller within the cluster to route the traffic instead of sending it via kube-proxy.
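Separately from the routing-level fixes above, a common server-side mitigation, sketched here for a Go HTTP service (this is an addition, not part of the original answer), is to stop honouring keep-alive and drain in-flight requests when the pod receives SIGTERM, so reused connections are closed cleanly before the container exits:

```go
package main

import (
	"context"
	"log"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	srv := &http.Server{Addr: ":8080", Handler: http.DefaultServeMux}

	go func() {
		if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
			log.Fatalf("serve: %v", err)
		}
	}()

	// Wait for the SIGTERM Kubernetes sends when the pod is being replaced.
	stop := make(chan os.Signal, 1)
	signal.Notify(stop, syscall.SIGTERM, os.Interrupt)
	<-stop

	// Ask clients to stop reusing existing keep-alive connections
	// ("Connection: close" on subsequent responses), then drain gracefully.
	srv.SetKeepAlivesEnabled(false)
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		log.Printf("shutdown: %v", err)
	}
}
```

Pairing this with a short preStop delay gives the endpoint removal a moment to propagate before shutdown actually begins.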
We're running into the same issue, so just wondering if you figured out a way around it. According to this link it should be possible by having a load balancer in front of the service which makes direct requests to the pods and handles keep-alive connections on its own.
We will continue to investigate this issue and see if we can find a way of doing zero downtime deployments with keep-alive connections.

gRPC client reconnect inside Kubernetes

If we define our microservices inside Kubernetes pods, do we need to implement gRPC client reconnection logic for when the service pod restarts?
When the pod restarts the host name does not change, but we cannot guarantee the IP address remains the same. So will the gRPC client still be able to detect the new server and reconnect to it?
When the TCP connection is disconnected (because the old pod stopped) gRPC's channel will attempt to reconnect with exponential backoff. Each reconnect attempt implies resolving the DNS address, although it may not detect the new address immediately because of the TTL (time-to-live) of the old DNS entry. Also, I believe some implementations resolve the address when a failure is detected instead of before an attempt.
This process happens naturally without your application doing anything, although it may experience RPC failures until the connection is re-established. Enabling "wait for ready" on an RPC would reduce the chances the RPC fails during this period of transition, although such an RPC generally implies you don't care about response latency.
If the DNS address is not (eventually) re-resolved, then that would be a bug and you should file an issue.
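For illustration, "wait for ready" is a per-call option in gRPC-Go (the target name and the commented-out Greeter client below are placeholders):

```go
// Sketch of a gRPC-Go client using "wait for ready" so RPCs queue while the
// channel reconnects to a restarted pod instead of failing immediately.
package main

import (
	"context"
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	// pb "example.com/gen/greeter" // hypothetical generated client package
)

func main() {
	// Dial through the DNS resolver; the channel re-resolves and reconnects
	// with exponential backoff when the old pod's address disappears.
	conn, err := grpc.Dial(
		"dns:///my-service.default.svc.cluster.local:50051", // hypothetical service name
		grpc.WithTransportCredentials(insecure.NewCredentials()),
	)
	if err != nil {
		log.Fatalf("dial: %v", err)
	}
	defer conn.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// grpc.WaitForReady(true) makes the call wait (up to ctx's deadline) for the
	// channel to become ready again instead of failing fast during the restart.
	// client := pb.NewGreeterClient(conn)
	// _, err = client.SayHello(ctx, &pb.HelloRequest{}, grpc.WaitForReady(true))
	_ = ctx
}
```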
You need client-side load balancing, as described here. You can watch the endpoints of a service with the Kubernetes API. I have created a package for this in Go; it is on GitHub. Sorry, I haven't written documentation yet. The basic concept is: get the service endpoints at startup, then watch the service endpoints for changes.
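A rough sketch of that concept with client-go (the namespace and service name are placeholders; the package mentioned above has its own API):

```go
package main

import (
	"context"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	ctx := context.Background()
	ns, svc := "default", "my-grpc-service" // placeholders

	// Initial snapshot of the service's ready pod addresses.
	eps, err := cs.CoreV1().Endpoints(ns).Get(ctx, svc, metav1.GetOptions{})
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("initial endpoints: %+v", eps.Subsets)

	// Watch for changes so the client can open/close gRPC connections
	// as pods come and go.
	w, err := cs.CoreV1().Endpoints(ns).Watch(ctx, metav1.ListOptions{
		FieldSelector: "metadata.name=" + svc,
	})
	if err != nil {
		log.Fatal(err)
	}
	for ev := range w.ResultChan() {
		log.Printf("endpoints %s; rebalance client connections here", ev.Type)
	}
}
```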
