About service discovery and proxies (SOA)

I am reading several articles about microservices.
In the server-side service discovery part, I am drawn to the Kubernetes and Marathon style of running a proxy on each host/Docker container that functions as a server-side discovery router.
By doing this, we can move all of the code coupled to service registration/discovery and circuit breaking into that proxy.
By configuring the routing of each host/Docker container, the proxy can be transparent to the service and the network, and we can implement a gossip strategy so the proxies keep their knowledge of the network in sync. It seems like an excellent solution.
Can anyone explain the drawbacks of this approach and recommend an open-source solution that implements it?
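To make the idea concrete, here is a minimal sketch of such a host-local proxy (the service names, addresses, and port are hypothetical; GET-only, no error handling). A service always calls localhost, and the proxy resolves which host actually runs the target from a registry table that, in the real design, the gossip protocol would keep in sync:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

public class HostLocalDiscoveryProxy {
    // Stand-in for gossip-synced state: service name -> live instances.
    // In the real design this table would be updated by the gossip
    // protocol or by watching a registry; it is hard-coded for the sketch.
    static final Map<String, List<String>> registry = Map.of(
            "orders", List.of("10.0.0.5:8080", "10.0.0.6:8080"),
            "users",  List.of("10.0.0.7:8080"));

    public static void main(String[] args) throws Exception {
        HttpServer proxy = HttpServer.create(new InetSocketAddress(5000), 0);
        proxy.createContext("/", exchange -> {
            // Convention for the sketch: /<service>/<rest-of-path>
            String[] parts = exchange.getRequestURI().getPath().split("/", 3);
            List<String> instances = registry.get(parts[1]);
            // Pick an instance; a real proxy would use load/health data here,
            // and this is also where circuit breaking would live.
            String target = instances.get(
                    ThreadLocalRandom.current().nextInt(instances.size()));
            String rest = parts.length > 2 ? parts[2] : "";
            HttpURLConnection upstream = (HttpURLConnection)
                    new URL("http://" + target + "/" + rest).openConnection();
            byte[] body;
            try (InputStream in = upstream.getInputStream()) {
                body = in.readAllBytes();
            }
            exchange.sendResponseHeaders(upstream.getResponseCode(), body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        proxy.start(); // services on this host call http://localhost:5000/orders/...
    }
}
```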

Related

Direct pod networking with Kubernetes

I am trying to move a currently Docker-based app to Kubernetes.
My app inspects network traffic that passes through it; because of that it needs an accessible external IP, and it needs to accept traffic on all ports, not just a few specific ones.
Right now, I am using Docker with a macvlan network driver to attach containers to multiple interfaces and let them inspect traffic that way.
After some research, I've found that the only way to access pods in Kubernetes is through Services, but Services only expose specific ports, because they are mostly intended for "server"-type applications rather than the "forwarder"/"sniffer" type I am looking for.
Is Kubernetes a good fit for this type of application? Does it offer tools to cope with this problem?
Whether it is a good fit is more a matter of opinion. Pods in Kubernetes have their own pod CIDR that is not exposed to the outside world, and a sniffer doesn't quite fit either a Service or a Job definition, which are the typical workloads in Kubernetes.
Having said that, it can be done if you use a custom CNI plugin that supports macvlan.
You can also use something like Multus, which supports the macvlan plugin.

Is it a good practice to have embedded Jetty and a gRPC server running in the same JVM?

Our organization is looking into implementing new internal APIs using gRPC.
Currently, we have a microservice that serves internal/external requests using embedded Jetty. We want internal communication between services to happen over gRPC.
So we'd have two servers running in the same JVM: Jetty and gRPC. Is that good practice? Are there any red flags with that approach?
To save costs, we do not want to split the microservice in two; we should be able to run the app on the same number of VMs.
There's nothing inherently special or wrong about having Jetty and gRPC in the same JVM. The main point of potential trouble is just that you will have two ports exposed instead of one; that might matter for service discovery or firewalls.
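For illustration, here is a minimal sketch of what that looks like, assuming Jetty 9/10 (javax.servlet) and grpc-java with the grpc-services artifact on the classpath; the ports and the commented-out service implementation are placeholders:

```java
import io.grpc.protobuf.services.ProtoReflectionService;
import org.eclipse.jetty.server.Request;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.handler.AbstractHandler;

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;

public class DualProtocolServer {
    public static void main(String[] args) throws Exception {
        // Embedded Jetty on 8080 for HTTP traffic.
        Server jetty = new Server(8080);
        jetty.setHandler(new AbstractHandler() {
            @Override
            public void handle(String target, Request baseRequest,
                               HttpServletRequest request,
                               HttpServletResponse response) throws IOException {
                response.setContentType("text/plain");
                response.getWriter().println("hello from jetty");
                baseRequest.setHandled(true);
            }
        });
        jetty.start();

        // gRPC on 9090 for internal service-to-service calls.
        io.grpc.Server grpc = io.grpc.ServerBuilder.forPort(9090)
                // .addService(new MyInternalServiceImpl()) // generated stub impl
                .addService(ProtoReflectionService.newInstance())
                .build()
                .start();

        // Both servers share the JVM (heap, GC, thread pools) but listen on
        // separate ports; those two ports are the service-discovery and
        // firewall consideration mentioned above.
        grpc.awaitTermination();
    }
}
```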

How to log a pod's outgoing HTTP requests in Kubernetes?

I have a Kubernetes cluster with running pods. In order to monitor and troubleshoot the infrastructure, I want to implement a centralized logging solution so that all incoming and outgoing HTTP requests are logged in one place.
For incoming requests this is not a problem at all: I can use the nginx logs from the ingress controller.
I also understand that I can log outgoing requests inside the applications I run in pods, but the problem is that we also use applications from outside developers, and those may not implement such logging.
As for outgoing requests, if I understand correctly there is no solution provided by default. I have explored Kubernetes logging and auditing, but they do not provide this feature.
I could probably use a network sniffer, but that seems like quite a low-level solution for this kind of problem. So the question is: is there any out-of-the-box implementation for this need?
Thanks!
Take a look at a service-mesh solution like Istio or Linkerd, as well as tracing solutions like Jaeger or Zipkin. With these you can get full observability into how information flows into, out of, and through your Kubernetes cluster.

API gateway vs. reverse proxy

A microservice architecture is often deployed alongside a reverse proxy (such as nginx or Apache httpd), and the API gateway pattern is used to implement cross-cutting concerns. Sometimes the reverse proxy does the work of the API gateway.
It would be good to see the clear differences between these two approaches.
It looks like the distinctive benefit of an API gateway is invoking multiple microservices and aggregating the results (see the sketch after the list below). All other responsibilities of an API gateway can be implemented using a reverse proxy, such as:
Authentication (can be done using nginx Lua scripts);
Transport security (a reverse proxy task in itself);
Load balancing;
...
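To make the aggregation point concrete, here is a minimal sketch (Java 11+ HttpClient; the service URLs and response shape are hypothetical) of a gateway handler fanning out to two internal services in parallel and merging the results:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.CompletableFuture;

public class AggregatingEndpoint {
    static final HttpClient client = HttpClient.newHttpClient();

    public static void main(String[] args) {
        // Fan out to two (hypothetical) internal services concurrently.
        CompletableFuture<String> user   = fetch("http://user-service/users/42");
        CompletableFuture<String> orders = fetch("http://order-service/orders?user=42");

        // One client round-trip, one merged payload: the part that a plain
        // reverse proxy, which only forwards single requests, does not do.
        String merged = "{\"user\":" + user.join()
                + ",\"orders\":" + orders.join() + "}";
        System.out.println(merged);
    }

    static CompletableFuture<String> fetch(String url) {
        return client.sendAsync(
                        HttpRequest.newBuilder(URI.create(url)).build(),
                        HttpResponse.BodyHandlers.ofString())
                .thenApply(HttpResponse::body);
    }
}
```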
So, based on this, there are several questions:
Does it make sense to use an API gateway and a reverse proxy simultaneously (for example: request -> API gateway -> reverse proxy (nginx) -> concrete microservice)? In what cases?
What other capabilities can be implemented with an API gateway but not with a reverse proxy, and vice versa?
It is easier to think about them if you realize they aren't mutually exclusive. Think of an API gateway as a specific type of reverse proxy implementation.
As to your questions, it is not uncommon to see both used in conjunction, where the API gateway is treated as an application tier that sits behind a reverse proxy for load balancing and health checking. An example would be a WAF-sandwich architecture, in which your Web Application Firewall/API gateway is sandwiched between reverse-proxy tiers: one for the WAF itself and another for the individual microservices it talks to.
Regarding the differences, they are very similar. It's just nomenclature. As you take a basic reverse proxy setup and start bolting on more pieces like authentication, rate limiting, dynamic config updates, and service discovery, people are more likely to call that an API gateway.
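As an illustration of that "bolting on" point, here is a minimal sketch (the header name and limit are hypothetical; a plain JDK HTTP server stands in for the proxy) of how a pass-through handler starts to look like a gateway once an API-key check and a crude rate limit are added:

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class GatewayMiddleware {
    // Crude fixed-window rate limit: requests per client per window.
    static final int LIMIT_PER_WINDOW = 100;
    static final Map<String, AtomicInteger> counters = new ConcurrentHashMap<>();

    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/", exchange -> {
            // "Bolted-on" authentication: require a (hypothetical) API-key header.
            String key = exchange.getRequestHeaders().getFirst("X-Api-Key");
            if (key == null) {
                deny(exchange, 401);
                return;
            }
            // "Bolted-on" rate limiting, counted per API key.
            int seen = counters.computeIfAbsent(key, k -> new AtomicInteger())
                               .incrementAndGet();
            if (seen > LIMIT_PER_WINDOW) {
                deny(exchange, 429);
                return;
            }
            // A real gateway would now proxy to the upstream service; we
            // answer directly to keep the sketch self-contained.
            byte[] body = "forwarded upstream".getBytes();
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        // A real implementation would also reset `counters` at each window
        // boundary, e.g. from a scheduled task; omitted here.
    }

    static void deny(HttpExchange exchange, int status) throws java.io.IOException {
        exchange.sendResponseHeaders(status, -1); // -1 = no response body
        exchange.close();
    }
}
```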
I believe an API gateway is a reverse proxy that can be configured dynamically via an API (and potentially via a UI), while a traditional reverse proxy (like Nginx, HAProxy, or Apache) is configured via a config file and has to be restarted when the configuration changes. Thus, an API gateway should be used when routing rules or other configuration change often. To your questions:
It makes sense as long as every component in this sequence serves its purpose.
The differences are not in the feature list but in the way configuration changes are applied.
Additionally, an API gateway is often provided as SaaS, like Apigee or Tyk, for example.
Also, here's my tutorial on how to create a simple API gateway with Node.js: https://memz.co/api-gateway-microservices-docker-node-js/
Hope it helps.
An API gateway acts as a reverse proxy to accept all application programming interface (API) calls, invoke the various services required to fulfill them, and return the appropriate result.
An API gateway has a more robust set of features, especially around security and monitoring, than an API proxy. I would say the API gateway pattern, also called Backend for Frontend (BFF), is widely used in microservices development.
On the other hand, an API proxy is basically a lightweight API gateway. It includes some basic security and monitoring capabilities. So, if you already have an API and your needs are simple, an API proxy will work fine.
API gateways usually operate as an L7 construct.
API gateways provide additional functionality compared to a plain reverse proxy. If you consider some of the portals out there, they can provide:
full API lifecycle management, including documentation
a portal which can be used as the source of truth for various client applications, and where you can provide client governance, rate limiting, etc.
routing to different versions of the API, including canary/beta versions
detecting usage patterns, registering apps, retrieving client credentials, etc.
However, with the advent of service meshes like Istio and Consul, a lot of the functionality of API gateways will be subsumed by meshes.
From HTTP: The Definitive Guide:
Strictly speaking, proxies connect two or more applications that speak the same protocol, while gateways hook up two or more parties that speak different protocols. A gateway acts as a "protocol converter," allowing a client to complete a transaction with a server, even when the client and server speak different protocols.
In practice, the difference between proxies and gateways is blurry. Because browsers and servers implement different versions of HTTP, proxies often do some amount of protocol conversion. And commercial proxy servers implement gateway functionality to support SSL security protocols, SOCKS firewalls, FTP access, and web-based applications.
Reverse proxies such as Nginx and Apache do not deal with observability, authentication, authorization, service orchestration, etc.; they only do load balancing and forward traffic to upstreams.
An API gateway sits closer to the user's business scenario and helps users solve the security and observability issues of their APIs and microservices.
This different positioning leads to different technical emphases for reverse proxies and API gateways. API gateways such as Apache APISIX have nearly 100 plugins and support multiple programming languages for plugin development.
If you already have a good API gateway, there is no need to use a reverse proxy.
Regarding Andrey Chausenko's answer that
I believe an API gateway is a reverse proxy that can be configured dynamically via an API (and potentially via a UI), while a traditional reverse proxy (like Nginx, HAProxy, or Apache) is configured via a config file and has to be restarted when the configuration changes.
I think this is no longer true nowadays, as modern reverse proxies like Envoy can be dynamically configured by a control plane via xDS.

Doing research on SOA LoadBalancing NameServers

When looking at SOA (Service Oriented Architectures), the main problems seem to be distributing a service's load across multiple hosts and then managing those hosts (taking hosts offline or adding new hosts).
When one service talks to another, it should not need to know any host information (at the application level). Rather, the SOA environment should be able to route a service request to a specific host based on the host's current load characteristics (so it must know all hosts a service is running on and their relative load).
Are there any existing open protocols for services to report their existence and load to an SOA environment?
SOA is a set of high-level software architecture guidelines. It is not a technical standard or recommendation and it has nothing to do with technical implementation details, like load balancing.
Load balancing is based on addressing, which is dependent on the service access technology.
Systems built in "SOA-way" may be using different service access technology, like SOAP (over HTTP, JMS, etc.), REST, asynchronous XML messages over JMS, etc.
With SOAP, the service consumer may look up a UDDI registry to locate the service provider. Some of the latest UDDI registry software provides simple (e.g. round-robin) load balancing.
Another SOAP idea is using WS-Addressing, but it is not really meant for load balancing.
I think currently the best place for load-balancing is the underlying network transport layer. With HTTP transport you can choose hardware or software (e.g. Apache HTTPD modules) load balancers that can adapt the distribution based on response times and time-outs. With JMS transport, the most popular JMS servers provide some form of load balancing. Other protocols - like CORBA or Rendezvous - usually require a custom solution.
You can also utilize ESB software, e.g. Oracle Service Bus or TIBCO AMX Service Bus. With an ESB you can easily create a load-balancing proxy for your service instances. The proxy may be enhanced with some logic, such as looking up a database table for guidance.
As you can see, there is no one-size-fits-all solution for service load balancing. The optimal solution will be based on the actual implementation architecture and vendors' recommendations.
After reading a lot, the concept I was actually looking for was the Enterprise Service Bus (ESB).
Though it does not define an explicit protocol, it describes an architectural style that allows the problems I stated above to be solved.
