Let's say I have a gRPC Java service that is hosting a Server.
So when a client wants to call this service, they use:
ManagedChannel channel = ManagedChannelBuilder
.forAddress(grpcHost, grpcPort)
.usePlaintext(true)
.build();
No problem.
Now what if I want to call my service from the same JVM? Is this even possible? Or perhaps this is completely invalid?
You are free to use plain ManagedChannelBuilder to connect to the same JVM. It will work just like normal.
If you want to optimize this case because it happens frequently, and the client and server are in the same ClassLoader (so it wouldn't work between Servlets, for example), you can use the in-process transport. The in-process transport has relatively low overhead and can even avoid serializing/deserializing the Protobufs.
When we create a ManagedChannelBuilder and use it to make a grpc-java service call, how many clients can we serve with it? Isn't the channel shut down after each individual service call?
Say I have a REST interface which accepts REST calls from a browser
and from within these REST Service methods, I am making grpc client calls to an independent grpc server. Also I can expect client connections in the range of [4000-5000] concurrently.
How well can I make use of this ManagedChannelBuilder? Do I need just one, or do I need to pool multiple channel builders?
Generally, I'd suggest using a single ManagedChannel per endpoint when your code can be easily structured to share it. ManagedChannel multiplexes RPCs and is thread-safe, so it can handle multiple RPCs concurrently.
In rarer cases of very high throughput, it may make sense to use more than one ManagedChannel. Eventually ManagedChannel (or maybe Channel) should have support for doing this natively.
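A minimal sketch of the "one shared channel per endpoint" idea, using only the JDK so it runs without gRPC on the classpath. `ChannelRegistry` and `FakeChannel` are hypothetical names for illustration, not part of grpc-java; in real code the factory would presumably build a `ManagedChannel` with `ManagedChannelBuilder.forAddress(host, port)`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical helper: hands out one shared channel object per "host:port".
class ChannelRegistry<C> {
    private final Map<String, C> channels = new ConcurrentHashMap<>();
    private final Function<String, C> factory;

    ChannelRegistry(Function<String, C> factory) {
        this.factory = factory;
    }

    // Creates the channel on first use only; later callers share the same instance.
    C channelFor(String host, int port) {
        return channels.computeIfAbsent(host + ":" + port, factory);
    }

    // Hypothetical stand-in for ManagedChannel; it only counts how often it is created.
    static final class FakeChannel {
        static final AtomicInteger CREATED = new AtomicInteger();
        final String target;

        FakeChannel(String target) {
            this.target = target;
            CREATED.incrementAndGet();
        }
    }

    public static void main(String[] args) {
        ChannelRegistry<FakeChannel> registry = new ChannelRegistry<>(FakeChannel::new);
        FakeChannel first = registry.channelFor("localhost", 50051);
        FakeChannel second = registry.channelFor("localhost", 50051);
        System.out.println("same instance: " + (first == second));
    }
}
```

Every REST handler thread would ask the registry for the channel instead of building its own, so thousands of concurrent requests end up multiplexed over the same channel object.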
There is a Windows service that I need to communicate with (in a duplex way) from ASP.NET. Is it safe to turn the Windows service into a WCF service and organize two-way communication?
I'm concerned about a scenario where the service is trying to communicate but the ASP.NET process is being reloaded and the message gets lost. Though it's unlikely during development, I guess it's quite likely in production with many clients.
I'm leaning towards a solution that involves some kind of persistence:
Both the Windows service and ASP.NET write data to SQL Server and get notified via SqlDependency
They exchange messages via RabbitMq
Here's a couple of ideas regarding the general case where two independent systems (processes, servers, etc.) need to communicate reliably:
Transaction model, where the transmitting party initiates communication and waits for acknowledgment from the recipient before marking the message as delivered. In case of transmission failure/timeout, it's the sender's responsibility to persist the message and retry later. For instance, Webhook architectures rely on this model.
Publish/Subscribe model, used by a lot of distributed systems, where both parties rely on a third-party message broker (message queue/service bus mechanism) such as RabbitMQ. In this architecture, the sender is only responsible for making sure that the message has been successfully queued. The responsibility for making sure that the message is delivered to the recipient lies with the message broker. In this case, you need to make sure that your message broker satisfies your reliability needs, for example: Is it in-memory only? Or does it also persist to disk, so that it can recover not just from a process recycle but also from a power/system recycle?
And like you said, you can build your own messaging infrastructure too: sender writes to a local or cloud database or a cloud queue/service bus, and the receiver polls and consumes the messages.
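The transaction model above (send, wait for an acknowledgment, keep the message and retry on failure) can be sketched roughly as follows. `ReliableSender` is a hypothetical name, the `Predicate` stands in for whatever transport is actually used, and the in-memory deque stands in for real persistence (a database table or a file), so this only illustrates the control flow. It is not thread-safe.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;

// Hypothetical sender: a message counts as delivered only after the recipient acknowledges it.
class ReliableSender {
    // Returns true when the recipient acknowledged the message.
    private final Predicate<String> transport;
    // Stand-in for durable storage; a real system would persist these entries.
    private final Deque<String> pending = new ArrayDeque<>();

    ReliableSender(Predicate<String> transport) {
        this.transport = transport;
    }

    // Tries to deliver once; on failure the message is kept for a later retry.
    boolean send(String message) {
        if (tryDeliver(message)) {
            return true;
        }
        pending.addLast(message);
        return false;
    }

    // Retries each pending message once, preserving order; returns how many got through.
    int retryPending() {
        int delivered = 0;
        int attempts = pending.size();
        for (int i = 0; i < attempts; i++) {
            String message = pending.pollFirst();
            if (tryDeliver(message)) {
                delivered++;
            } else {
                pending.addLast(message);
            }
        }
        return delivered;
    }

    int pendingCount() {
        return pending.size();
    }

    private boolean tryDeliver(String message) {
        try {
            return transport.test(message);
        } catch (RuntimeException e) {
            return false; // a transport failure is treated like a missing acknowledgment
        }
    }

    public static void main(String[] args) {
        boolean[] recipientUp = {false};
        ReliableSender sender = new ReliableSender(message -> recipientUp[0]);
        System.out.println("delivered now: " + sender.send("job finished"));
        recipientUp[0] = true; // e.g. the ASP.NET process has finished reloading
        System.out.println("delivered on retry: " + sender.retryPending());
    }
}
```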
So, a few guidelines:
If you ever need to scale out (have multiple servers) and they need to somehow collaborate on these messages, then make your initial investment in a database or cloud-queue solution (such as Azure SQL or Azure Queues).
Otherwise, if your services only need to communicate within one server, then you can use a database approach or use a queue service that satisfies your persistence/reliability requirements. RabbitMQ seems like a robust solution for this scenario.
I have an EJB3 bean that needs to GET or POST from multiple HTTP servers. I have read the documentation on writing JCA adapters, as well as the documentation on Apache HTTPComponents, specifically the managed connections, managed connection factories, etc. that HTTPClient offers.
I note that the documentation for BasicHttpClientConnectionManager says to use it, rather than PoolingHttpClientConnectionManager, "Inside of an EJB Container." It is unclear whether "Inside of an EJB Container" refers to user code in an EJB to be run in a container, or to the container's own implementation code (e.g. something you might put in a JCA adapter).
I'm still a little unclear, architecturally, how to handle the task to take maximum advantage of the services offered by the container. Thus far, my choices seem to be:
From within the EJB, create a new BasicHttpClientConnectionManager, and then create a client, like so:
BasicHttpClientConnectionManager cxMgr = new BasicHttpClientConnectionManager();
CloseableHttpClient client = HttpClients.custom().setConnectionManager(cxMgr).build();
This would, I believe, result in no connection pooling, but rather EJB instance pooling, which probably wouldn't be all that performant since the EJB container has no way of knowing which bean instance is holding an active connection to which remote HTTP server.
Write a (fairly minimal) JCA adapter that wraps the PoolingHttpClientConnectionManager, and then, within the EJB, grab that adapter with a @Resource annotation and use it to build the HTTPClient.
Write a JCA adapter that manages a pool of HTTPClients and hands those out when needed.
I am unclear on which approach I should take, or on whether or not I'm ignoring some sort of HTTP Connection management service that's already built into the container (in this case, TomEE plus). How should I do this?
It's not that simple: none of those options match. The particular answer depends on whether remote side of your HTTP connection is a single resource or not.
If it's a single logical resource, then JCA matches best. Still, you should strictly avoid any references to its HTTP nature in your adapter: design a proper logical interface for your remote system and then implement a resource adapter on top of that interface. Your clients request the data they need and then receive it not as HTML strings, but as ready-to-use Java beans. You will be able to use many of the advantages provided by the EJB container without violating its requirements.
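To make the "logical interface" idea concrete, here is a rough sketch with the JCA plumbing left out. All names (`InventoryConnection`, `StockLevel`, the `/stock/` path and the `quantity=` response format) are made up for illustration, and the `Function` stands in for the actual HTTP call, so the point is only that callers never see HTTP or raw strings.

```java
import java.util.function.Function;

class InventoryAdapterSketch {
    // The ready-to-use bean that clients receive instead of a raw response body.
    static final class StockLevel {
        private final String sku;
        private final int quantity;

        StockLevel(String sku, int quantity) {
            this.sku = sku;
            this.quantity = quantity;
        }

        String getSku() { return sku; }
        int getQuantity() { return quantity; }
    }

    // Logical interface of the remote system: no HTTP types leak through it.
    interface InventoryConnection {
        StockLevel stockLevel(String sku);
    }

    // The implementation hides transport and parsing; rawFetch stands in for an HTTP GET.
    static final class HttpInventoryConnection implements InventoryConnection {
        private final Function<String, String> rawFetch;

        HttpInventoryConnection(Function<String, String> rawFetch) {
            this.rawFetch = rawFetch;
        }

        @Override
        public StockLevel stockLevel(String sku) {
            String body = rawFetch.apply("/stock/" + sku); // assumed body format: "quantity=7"
            int quantity = Integer.parseInt(body.substring(body.indexOf('=') + 1).trim());
            return new StockLevel(sku, quantity);
        }
    }

    public static void main(String[] args) {
        InventoryConnection connection = new HttpInventoryConnection(path -> "quantity=7");
        System.out.println(connection.stockLevel("A-1").getQuantity());
    }
}
```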
If you need an entry point to access an unspecified number of external services (search indexing of HTML pages might be this kind of job), then things get tricky. JCA is not intended for that kind of thing, but it definitely suits it quite well: I maintain the TCP SocketConnector project, which does almost the same, and everything works like a charm. EJB won't fit here: stateless bean pooling is hard to control, singletons will require you to write a non-blocking pooling implementation yourself, and stateful beans are, well, kind of single-threaded.
There's also an extra option: while JCA will be easier to integrate to your existing Java EE environment, it might be hard to implement a resource adapter from scratch if you're doing it for the first time. An external HTTP querying tool which is connected with a JMS queue might come in handy here. JMS will make it easier to balance workload, but it removes any interactivity and may slow the whole chain down.
P.S. For the third option of your original question: there's no need to implement pooling in JCA, as the container already maintains a connection pool for all resource adapter instances by default. Just keep a single HTTP client connection per JCA ManagedConnection and everything will work out of the box.
A little background.
Very big monolithic Django application. All components use the same database. We need to separate services so we can independently upgrade some parts of the system without affecting the rest.
We use RabbitMQ as a broker to Celery.
Right now we have two options:
HTTP Services using a REST interface.
JSONRPC over AMQP to an event loop service
My team is leaning towards HTTP because that's what they are familiar with but I think the advantages of using RPC over AMQP far outweigh it.
AMQP provides us with the capabilities to easily add in load balancing, and high availability, with guaranteed message deliveries.
Whereas with HTTP we have to create client HTTP wrappers to work with the REST interfaces, we have to put in a load balancer and set up that infrastructure in order to have HA etc.
With AMQP I can just spawn another instance of the service, it will connect to the same queue as the other instances and bam, HA and load balancing.
Am I missing something with my thoughts on AMQP?
First of all:
REST and RPC are architecture patterns, AMQP is a wire-level protocol and HTTP is an application protocol; both protocols run on top of TCP/IP.
AMQP is a specialized protocol whereas HTTP is a general-purpose one, so HTTP has damn high overhead compared to AMQP.
AMQP is asynchronous by nature, whereas HTTP is synchronous by nature.
Both REST and RPC use data serialization; the format is up to you and depends on your infrastructure. If you are using Python everywhere, I think you can use Python's native serialization, pickle, which should be faster than JSON or other formats (only unpickle data from services you trust, since unpickling can execute arbitrary code).
both HTTP+REST and AMQP+RPC can run in heterogeneous and/or distributed environment
So if you are choosing between HTTP+REST and AMQP+RPC, the answer really depends on infrastructure complexity and resource usage. Without any specific requirements both solutions will work fine, but I would rather build some abstraction to be able to switch between them transparently.
You said that your team is familiar with HTTP but not with AMQP. If development time is an important factor, you have your answer.
If you want to build an HA infrastructure with minimal complexity, I guess AMQP is what you want.
I have experience with both of them, and the advantages of RESTful services are:
they map well onto a web interface
people are familiar with them
easy to debug (because HTTP is general-purpose)
easy to provide an API to third-party services.
Advantages of AMQP-based solution:
damn fast
flexible
cost-effective (in terms of resource usage)
Note that you can provide a RESTful API to third-party services on top of your AMQP-based API, since REST is not a protocol but rather a paradigm, but you should think about this when building your AMQP RPC API. I have done it this way to provide an API to external third-party services and to provide API access to those parts of the infrastructure that run on an old codebase or where it is not possible to add AMQP support.
If I am right, your question is about how to better organize communication between different parts of your software, not how to provide an API to end users.
If you have a high-load project, RabbitMQ is a damn good piece of software, and you can easily add any number of workers running on different machines. It also has mirroring and clustering out of the box. And one more thing: RabbitMQ is built on top of Erlang OTP, which is a highly reliable, stable platform ... (bla-bla-bla); it is good not only for marketing but for engineers too. I had an issue with RabbitMQ only once, when nginx logs took all the disk space on the same partition where RabbitMQ ran.
UPD (May 2018):
Saurabh Bhoomkar posted a link to the MQ vs. HTTP article written by Arnold Shoon on June 7th, 2012, here's a copy of it:
I was going through my old files and came across my notes on MQ and thought I’d share some reasons to use MQ vs. HTTP:
If your consumer processes at a fixed rate (i.e. can’t handle floods to the HTTP server [bursts]) then using MQ provides the flexibility for the service to buffer the other requests vs. bogging it down.
Time independent processing and messaging exchange patterns — if the thread is performing a fire-and-forget, then MQ is better suited for that pattern vs. HTTP.
Long-lived processes are better suited for MQ as you can send a request and have a separate thread listening for responses (note WS-Addressing allows HTTP to process in this manner but requires both endpoints to support that capability).
Loose coupling where one process can continue to do work even if the other process is not available vs. HTTP having to retry.
Request prioritization where more important messages can jump to the front of the queue.
XA transactions – MQ is fully XA compliant – HTTP is not.
Fault tolerance – MQ messages survive server or network failures – HTTP does not.
MQ provides for ‘assured’ delivery of messages once and only once, http does not.
MQ provides the ability to do message segmentation and message grouping for large messages – HTTP does not have that ability as it treats each transaction separately.
MQ provides a pub/sub interface where-as HTTP is point-to-point.
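As an illustration of the request prioritization point in the list above, here is a small in-process sketch. It uses the JDK's `PriorityBlockingQueue` as a stand-in for a broker queue; `PriorityInbox` and its methods are hypothetical names, and a real broker would expose this through its own API rather than this class.

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-process inbox: higher priority first, FIFO within the same priority.
class PriorityInbox {
    private static final class Message {
        final int priority;
        final long seq;
        final String body;

        Message(int priority, long seq, String body) {
            this.priority = priority;
            this.seq = seq;
            this.body = body;
        }
    }

    // The sequence number breaks ties so equal-priority messages keep arrival order.
    private final AtomicLong nextSeq = new AtomicLong();
    private final PriorityBlockingQueue<Message> queue = new PriorityBlockingQueue<>(
            11,
            Comparator.<Message>comparingInt(m -> m.priority).reversed()
                    .thenComparingLong(m -> m.seq));

    void publish(int priority, String body) {
        queue.add(new Message(priority, nextSeq.getAndIncrement(), body));
    }

    // Returns the next message body, or null when the inbox is empty.
    String next() {
        Message m = queue.poll();
        return m == null ? null : m.body;
    }

    public static void main(String[] args) {
        PriorityInbox inbox = new PriorityInbox();
        inbox.publish(1, "nightly report");
        inbox.publish(9, "payment failed");
        System.out.println(inbox.next()); // the higher-priority message comes out first
    }
}
```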
UPD (Dec 2018):
As noted by @Kevin in the comments below, it's questionable whether RabbitMQ scales better than RESTful services. My original answer was based on simply adding more workers, which is just one part of scaling. As long as the capacity of a single AMQP broker is not exceeded, that holds; beyond that, it requires more advanced techniques like Highly Available (Mirrored) Queues, which means both HTTP and AMQP-based services have some non-trivial complexity to scale at the infrastructure level.
After careful thought, I also removed the claim that maintaining an AMQP broker (RabbitMQ) is simpler than maintaining any HTTP server. The original answer was written in June 2013 and a lot has changed since then, but the main change is that I have gained more insight into both approaches, so the best I can say now is that "your mileage may vary".
Also note that comparing HTTP and AMQP is apples to oranges to some extent, so please do not interpret this answer as the ultimate guidance to base your decision on; rather, take it as one source or as a reference for further research into which solution matches your particular case.
The irony of the solution OP had to accept is, AMQP or other MQ solutions are often used to insulate callers from the inherent unreliability of HTTP-only services -- to provide some level of timeout & retry logic and message persistence so the caller doesn't have to implement its own HTTP insulation code. A very thin HTTP gateway or adapter layer over a reliable AMQP core, with option to go straight to AMQP using a more reliable client protocol like JSONRPC would often be the best solution for this scenario.
Your thoughts on AMQP are spot on!
Furthermore, since you are transitioning from a monolithic to a more distributed architecture, then adopting AMQP for communication between the services is more ideal for your use case. Here is why…
Communication via a REST interface and by extension HTTP is synchronous in nature — this synchronous nature of HTTP makes it a not-so-great option as the pattern of communication in a distributed architecture like the one you talk about. Why?
Imagine you have two services in your Django application, service A and service B, that communicate via REST API calls. These API calls usually play out this way: service A makes an HTTP request to service B, waits idly for the response, and only proceeds to the next task after getting a response from service B. In essence, service A is blocked until it receives a response from service B.
This is problematic because one of the goals with microservices is to build small autonomous services that would always be available even if one or more services are down– No single point of failure. The fact that service A connects directly to service B and in fact, waits for some response, introduces a level of coupling that detracts from the intended autonomy of each service.
AMQP, on the other hand, is asynchronous in nature; this makes it great for use in your scenario and others like it.
If you go down the AMQP route, instead of service A making requests to service B directly, you can introduce an AMQP based MQ between these two services. Service A will add requests to the Message Queue. Service B then picks up the request and processes it at its own pace.
This approach decouples the two services and, by extension, makes them autonomous. This is true because:
If service B fails unexpectedly, service A will keep accepting requests and adding them to the queue as though nothing happened. The requests would always be in the queue for service B to process them when it’s back online.
If service A experiences a spike in traffic, service B won’t even notice because it only picks up requests from the Message Queue at its own pace.
This approach also has the added benefit of being easy to scale— you can add more queues or create copies of service B to process more requests.
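The decoupling and "add more copies of service B" points can be sketched in-process with plain JDK classes. Here a `LinkedBlockingQueue` stands in for the AMQP queue and threads stand in for service instances; `CompetingConsumers` and the `__stop__` sentinel are illustrative assumptions, not anything RabbitMQ or Celery provides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingQueue;

class CompetingConsumers {
    private static final String STOP = "__stop__"; // hypothetical sentinel that ends a worker

    // Runs `workers` copies of "service B" against one shared queue and returns what they handled.
    static List<String> process(List<String> requests, int workers) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        ConcurrentLinkedQueue<String> handled = new ConcurrentLinkedQueue<>();

        // "Service A": enqueue the requests and move on without waiting for B.
        queue.addAll(requests);

        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            String name = "B-" + i;
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        String request = queue.take(); // each worker pulls at its own pace
                        if (STOP.equals(request)) {
                            break;
                        }
                        handled.add(name + " handled " + request);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();
            threads.add(worker);
        }

        // One sentinel per worker, queued behind all real requests.
        for (int i = 0; i < workers; i++) {
            queue.add(STOP);
        }
        for (Thread worker : threads) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return new ArrayList<>(handled);
    }

    public static void main(String[] args) {
        List<String> results = process(Arrays.asList("r1", "r2", "r3", "r4", "r5"), 3);
        results.forEach(System.out::println);
    }
}
```

Which worker handles which request varies from run to run; what the sketch shows is that each request is handled exactly once and that adding workers requires no change on the producer side.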
Lastly, service A does not have to wait for a response from service B, and end users don’t have to wait long either; this leads to improved performance and, by extension, a better user experience.
Just in case you are considering moving from HTTP to AMQP in your distributed architecture and you are not sure how to go about it, you can check out this 7-part beginner guide on message queues and microservices. It shows you how to use a message queue in a distributed architecture by walking you through a demo project.
I am interested in the Pub/Sub paradigm in order to provide a notification system (i.e. like Facebook's), especially in a web application which has publishers (in several web applications on the same IIS web server) and one or more subscribers, in charge of displaying the notifications on the web for the front-end user.
I found Redis, which seems to be a great server that provides interesting features: caching (like Memcached), Pub/Sub, and queues.
Unfortunately, I didn't find any examples in a web context (ASP.NET, with Ajax/jQuery), except WebSockets and NodeJS but I don't want to use those ones (too early). I guess I need a process (subscriber) which receives messages from the publishers but I don't see how to do that in a web application (pub/sub works fine with unit tests).
EDIT : we currently use .NET (ASP.NET Forms) and try out ServiceStack.Redis library (http://www.servicestack.net/)
Actually, Redis Pub/Sub handles this scenario quite well: since Redis is an async non-blocking server, it can hold many connections cheaply and it scales well.
Salvatore (aka Mr Redis :) describes the O(1) time complexity of Publish and Subscribe operations:
You can consider the work of subscribing/unsubscribing as a constant time operation, O(1) for both subscribing and unsubscribing (actually PSUBSCRIBE does more work than this if you are subscribed already to many patterns with the same client).
...
About memory, it is similar or smaller than the one used by a key, so you should not have problems to subscribe to millions of channels even in a small server.
So Redis is more than capable and designed for this scenario. The problem, as Tom pointed out, is that in order to maintain a persistent connection, users will need long-running connections (aka http-push / long-poll), and each active user will take its own thread. Holding a thread isn't great for scalability, and technologically you would be better off using a non-blocking HTTP server like Manos de Mono or node.js, which are both async and non-blocking and can handle this scenario. Note: WebSockets are more efficient than plain HTTP for real-time notifications, so ideally you would use them if the user's browser supports them and fall back to regular HTTP if it doesn't (or fall back to Flash for WebSockets on the client).
So it's not Redis or its Pub/Sub that doesn't scale here; the limit is the number of concurrent connections that a threaded HTTP server like IIS or Apache can hold. With that said, you can still support a fair number of concurrent users with IIS (this post suggests 3000), and since IIS is the bottleneck and not Redis, you can easily add an extra IIS server into the mix and distribute the load.
For this application, I would strongly suggest using SignalR, which is a .Net framework that enables real-time push to connected clients.
Redis publish/subscribe is not designed for this scenario - it requires a persistent connection to redis, which you have if you are writing a worker process but not when you are working with stateless web requests.
A publish/subscribe system that works for end users over http takes a little more work, but not too much - the simplest approach is to use a sorted set for each channel and record the time a user last got notifications. You could also do it with a list recording subscribers for each channel and write to the inbox list of each of those users whenever a notification is added.
With either of those methods a user can retrieve their new notifications very quickly. It will be a form of polling rather than true push notifications, but you aren't really going to get away from that due to the nature of http.
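The sorted-set approach can be sketched without Redis to show the polling logic. Below, a `ConcurrentSkipListMap` keyed by timestamp stands in for a Redis sorted set scored by time (roughly what ZADD and ZRANGEBYSCORE with an exclusive minimum would do); `NotificationFeed` and its methods are hypothetical names. One simplification to be aware of: two messages with the same timestamp on one channel would overwrite each other here, which a real sorted set would not do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical polling feed: one time-ordered set per channel, one "last seen" marker per user.
class NotificationFeed {
    // channel -> (timestamp -> message), kept sorted by timestamp
    private final Map<String, ConcurrentSkipListMap<Long, String>> channels = new ConcurrentHashMap<>();
    // "user|channel" -> timestamp of the newest notification that user has already received
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    void publish(String channel, long timestamp, String message) {
        channels.computeIfAbsent(channel, c -> new ConcurrentSkipListMap<>()).put(timestamp, message);
    }

    // Returns notifications newer than the user's marker and advances the marker.
    List<String> poll(String user, String channel) {
        List<String> fresh = new ArrayList<>();
        ConcurrentSkipListMap<Long, String> feed = channels.get(channel);
        if (feed == null) {
            return fresh;
        }
        String key = user + "|" + channel;
        long since = lastSeen.getOrDefault(key, Long.MIN_VALUE);
        long newest = since;
        // tailMap(since, false) yields only entries strictly newer than the marker.
        for (Map.Entry<Long, String> entry : feed.tailMap(since, false).entrySet()) {
            fresh.add(entry.getValue());
            newest = entry.getKey();
        }
        if (!fresh.isEmpty()) {
            lastSeen.put(key, newest);
        }
        return fresh;
    }

    public static void main(String[] args) {
        NotificationFeed feed = new NotificationFeed();
        feed.publish("news", 1L, "first");
        feed.publish("news", 2L, "second");
        System.out.println(feed.poll("alice", "news")); // both notifications
        System.out.println(feed.poll("alice", "news")); // nothing new
    }
}
```

Each Ajax poll from the browser would map to one `poll` call, so no connection or thread has to be held open between requests.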
Technically you could use redis pub/sub with long-running http connections, but if every user needs their own thread with active redis and http connections, scalability won't be very good.