What's the difference between an RPC System, like Twitter's Finagle, and an Enterprise Service Bus, like Mule? What kind of problems are each of them good at solving?
I will try to answer this as a soft explanation, rather than a technical breakdown of features:
One may say that Finagle is a asynchronous messaging library that allows services to connect to one another freely (not tightly tied to architectural system integration standards) while supporting multiple protocols.
From the Finagle website:
Finagle is a network stack for the JVM that you can use to build asynchronous Remote Procedure Call (RPC) clients and servers in Java, Scala, or any JVM-hosted language. Finagle provides a rich set of protocol-independent tools.
An enterprise service bus (ESB), on the other hand, is a asynchronous messaging architecture which typically adheres to industry standards and protocols. An ESB promotes a system where message flow is controlled and routed between systems, and where servers can register their service and clients can register what messages they are interested in. The services offered by servers can be registered and versioned.
You will typically find Finagle being used somewhere between a website and backend services. But, you will typically find an ESB inside a large corporate, where it's responsible to integrating systems like finance, support, sales, etc.
Both solutions offer asynchronous messaging and buffering to various extends, but are not designed to solve the same problem. For ESB, you would probably think 'strict, enterprise', but for Finagle you would probably think 'flexible, web'.
Hope this helps
Update:
Not quite related, but if you are exploring this space, I would look at Kafka these days.
RPC and ESB are two architectural patterns. While RPC is usually a request-reply and synchronous in nature, an ESB works on the concept of messaging (simplified explanation) and of asynchronous in nature. ESB is the foundation for any SOA implementation. ESB enables loose coupling thus promoting true agility. A simplified example from implementation perspective is as follows:
A web service is a typical RPC. The consumer is tightly bound to the producer and any change in the contract on the producer side, will require changes on the consumer side.
In ESB, the service consumer doesn't invoke the service producer directly. It just puts the message in the bus and based on the rules (mediator), appropriate service producer will handle it. If the service consumer and service producer talk in different formats, ESB provides the facility to do the transformation (like formatting the zipcode as xxxxx-xxxx, splitting the name into first name and last name, etc.).
This is just simplified explanation. For more information, please check the following links:
Why do developers need an Enterprise Service Bus?
Enterprise Service Bus
Both solve completely different problems:
An ESB is an intermediation middleware that provides message transformation and routing, protocol adaptation and other value-add operations (like orchestration, guaranteed delivery, idempotent filtering...). It sits in-between your service consumers and providers and transparently (ie without any change in consumer or provider) provides its different features.
An RPC system provides client and server technologies for performing RPC operations.
Related
Is it advicible to build the MSA based services on Oracle SOA or any other ESB suite for that matter? Is there any advantage or disadvantage?
If I am using Java, Spring and JPA over a message queue - say - RabbitMQ, I can achieve it in a more controlled environment with less recurring expenses. Of course will end up mixing tools like Drools or JBPM or similar to achieve things that may be OOTB (Out of the box) in the SOA Or ESB Suite. But scaling a specific service without paying licence fee for an additional environment should certainly be a good catch right?
Microservices architecture pattern applies to development of backend systems/services, whereas ESB (e.g. Oracle SOA Suite) is intended as an intermediary layer between consumers and backend services. Backend services contain rich application logic, whereas ESB services provide only intermediary functions like routing, transformation, orchestration etc.
ESB is not intended for rich application logic, though it's possible to do that.
Using ESB (e.g. Oracle SOA Suite) to host microservices is achievable, but you will get a big overhead comparing to limited functions and scalability. If you are looking for centralized API management (tracing, security etc.), you can put an API gateway into your architecture instead of full scale ESB.
I study SOA and webservices for a science paper. My state of knowledge is, that every SOA architcture needs a service broker.
Webservices are concrete implementations of a SOA, so do they have a service broker after? For example I create a webservice in asp.net which returns "hallo world" By Creating it, do I create a service broker too?
Don't let fool you by answers which are copy paste from Wikipedia :-)
Webservices are concrete implementations of a SOA
This assumption/statement is wrong. At least there is no direct relationship between SOA and webservices. SOA is an architectural paradigm where a webservice is a concrete technology (stack) based on WSDL and its result, the SOAP-protocol. Nothing more. Webservices may help to establish loosely coupled service landscape, which the SOA paradigm expects. But you could also build up a SOA landscape with other technology stacks (self-written hacks, RMI, even based on REST for instance).
Repository
The thing is: When you start building up your SOA-landscape, you (or others) will code services (i.e. webservices) where your service will have a technical contract (WSDL, WADL, ..) as a base for the implementation. Your clients will ask for it and you want it to store somewhere. This somewhere is usually a service repository. You could develop your own one, use the UDDI-standard or just buy one of products by the big vendors (IBM, TIBCO, Oracle etc).
Broker
A message broker within the SOA context is some piece of software, which supports the decoupling of the connected partner systems. Commonly it's called ESB (enterprise service bus). Also one of the goals of the SOA paradigm is, that the services can be used by anyone (reusability). Therefore you don't want to connect your services by P2P-connections (aka spaghetti architecture) - just imagine that one of the service participants changes it's hardware/IP: this would be a nightmare for all the connected partner systems. That's why the ESB was invented which acts between the service consumer and the service provider.
Typically, these ESB-products support a lot of technologies or -stacks/APIs like HTTP, JMS, REST etc.
Source: I work with a self-claimed SOA landscape and thousands of different (web-)services for a big company for a long time now.
A Web service is a set of related application functions that can be programmatically invoked over the Internet. Businesses can dynamically mix and match Web services to perform complex transactions with minimal programming. Web services allow buyers and sellers all over the world to discover each other, connect dynamically, and execute transactions in real time with minimal human interaction.
Web services are self-contained, self-describing modular applications that can be published, located, and invoked across the Web.
A network component in a Web Services architecture can play one or
more fundamental roles: service provider, service broker, and service
client.
Service brokers register and categorize published services and provide search services. For example, UDDI acts as a service broker for WSDL-described Web services.
A little background.
Very big monolithic Django application. All components use the same database. We need to separate services so we can independently upgrade some parts of the system without affecting the rest.
We use RabbitMQ as a broker to Celery.
Right now we have two options:
HTTP Services using a REST interface.
JSONRPC over AMQP to a event loop service
My team is leaning towards HTTP because that's what they are familiar with but I think the advantages of using RPC over AMQP far outweigh it.
AMQP provides us with the capabilities to easily add in load balancing, and high availability, with guaranteed message deliveries.
Whereas with HTTP we have to create client HTTP wrappers to work with the REST interfaces, we have to put in a load balancer and set up that infrastructure in order to have HA etc.
With AMQP I can just spawn another instance of the service, it will connect to the same queue as the other instances and bam, HA and load balancing.
Am I missing something with my thoughts on AMQP?
At first,
REST, RPC - architecture patterns, AMQP - wire-level and HTTP - application protocol which run on top of TCP/IP
AMQP is a specific protocol when HTTP - general-purpose protocol, thus, HTTP has damn high overhead comparing to AMQP
AMQP nature is asynchronous where HTTP nature is synchronous
both REST and RPC use data serialization, which format is up to you and it depends of infrastructure. If you are using python everywhere I think you can use python native serialization - pickle which should be faster than JSON or any other formats.
both HTTP+REST and AMQP+RPC can run in heterogeneous and/or distributed environment
So if you are choosing what to use: HTTP+REST or AMQP+RPC, the answer is really subject of infrastructure complexity and resource usage. Without any specific requirements both solution will work fine, but i would rather make some abstraction to be able switch between them transparently.
You told that your team familiar with HTTP but not with AMQP. If development time is an important time you got an answer.
If you want to build HA infrastructure with minimal complexity I guess AMQP protocol is what you want.
I had an experience with both of them and advantages of RESTful services are:
they well-mapped on web interface
people are familiar with them
easy to debug (due to general purpose of HTTP)
easy provide API to third-party services.
Advantages of AMQP-based solution:
damn fast
flexible
cost-effective (in resources usage meaning)
Note, that you can provide RESTful API to third-party services on top of your AMQP-based API while REST is not a protocol but rather paradigm, but you should think about it building your AQMP RPC api. I have done it in this way to provide API to external third-party services and provide access to API on those part of infrastructure which run on old codebase or where it is not possible to add AMQP support.
If I am right your question is about how to better organize communication between different parts of your software, not how to provide an API to end-users.
If you have a high-load project RabbitMQ is damn good piece of software and you can easily add any number of workers which run on different machines. Also it has mirroring and clustering out of the box. And one more thing, RabbitMQ is build on top of Erlang OTP, which is high-reliable,stable platform ... (bla-bla-bla), it is good not only for marketing but for engineers too. I had an issue with RabbitMQ only once when nginx logs took all disc space on the same partition where RabbitMQ run.
UPD (May 2018):
Saurabh Bhoomkar posted a link to the MQ vs. HTTP article written by Arnold Shoon on June 7th, 2012, here's a copy of it:
I was going through my old files and came across my notes on MQ and thought I’d share some reasons to use MQ vs. HTTP:
If your consumer processes at a fixed rate (i.e. can’t handle floods to the HTTP server [bursts]) then using MQ provides the flexibility for the service to buffer the other requests vs. bogging it down.
Time independent processing and messaging exchange patterns — if the thread is performing a fire-and-forget, then MQ is better suited for that pattern vs. HTTP.
Long-lived processes are better suited for MQ as you can send a request and have a seperate thread listening for responses (note WS-Addressing allows HTTP to process in this manner but requires both endpoints to support that capability).
Loose coupling where one process can continue to do work even if the other process is not available vs. HTTP having to retry.
Request prioritization where more important messages can jump to the front of the queue.
XA transactions – MQ is fully XA compliant – HTTP is not.
Fault tolerance – MQ messages survive server or network failures – HTTP does not.
MQ provides for ‘assured’ delivery of messages once and only once, http does not.
MQ provides the ability to do message segmentation and message grouping for large messages – HTTP does not have that ability as it treats each transaction seperately.
MQ provides a pub/sub interface where-as HTTP is point-to-point.
UPD (Dec 2018):
As noticed by #Kevin in comments below, it's questionable that RabbitMQ scales better then RESTful servies. My original answer was based on simply adding more workers, which is just a part of scaling and as long as single AMQP broker capacity not exceeded, it is true, though after that it requires more advanced techniques like Highly Available (Mirrored) Queues which makes both HTTP and AMQP-based services have some non-trivial complexity to scale at infrastructure level.
After careful thinking I also removed that maintaining AMQP broker (RabbitMQ) is simpler than any HTTP server: original answer was written in Jun 2013 and a lot of changed since that time, but the main change was that I get more insight in both of approaches, so the best I can say now that "your mileage may vary".
Also note, that comparing both HTTP and AMQP is apple to oranges to some extent, so please, do not interpret this answer as the ultimate guidance to base your decision on but rather take it as one of sources or as a reference for your further researches to find out what exact solution will match your particular case.
The irony of the solution OP had to accept is, AMQP or other MQ solutions are often used to insulate callers from the inherent unreliability of HTTP-only services -- to provide some level of timeout & retry logic and message persistence so the caller doesn't have to implement its own HTTP insulation code. A very thin HTTP gateway or adapter layer over a reliable AMQP core, with option to go straight to AMQP using a more reliable client protocol like JSONRPC would often be the best solution for this scenario.
Your thoughts on AMQP are spot on!
Furthermore, since you are transitioning from a monolithic to a more distributed architecture, then adopting AMQP for communication between the services is more ideal for your use case. Here is why…
Communication via a REST interface and by extension HTTP is synchronous in nature — this synchronous nature of HTTP makes it a not-so-great option as the pattern of communication in a distributed architecture like the one you talk about. Why?
Imagine you have two services, service A and service B in that your Django application that communicate via REST API calls. This API calls usually play out this way: service A makes an http request to service B, waits idly for the response, and only proceeds to the next task after getting a response from service B. In essence, service A is blocked until it receives a response from service B.
This is problematic because one of the goals with microservices is to build small autonomous services that would always be available even if one or more services are down– No single point of failure. The fact that service A connects directly to service B and in fact, waits for some response, introduces a level of coupling that detracts from the intended autonomy of each service.
AMQP on the other hand is asynchronous in nature — this asynchronous nature of AMQP makes it great for use in your scenario and other like it.
If you go down the AMQP route, instead of service A making requests to service B directly, you can introduce an AMQP based MQ between these two services. Service A will add requests to the Message Queue. Service B then picks up the request and processes it at its own pace.
This approach decouples the two services and, by extension, makes them autonomous. This is true because:
If service B fails unexpectedly, service A will keep accepting requests and adding them to the queue as though nothing happened. The requests would always be in the queue for service B to process them when it’s back online.
If service A experiences a spike in traffic, service B won’t even notice because it only picks up requests from the Message Queues at its own pace
This approach also has the added benefit of being easy to scale— you can add more queues or create copies of service B to process more requests.
Lastly, service A does not have to wait for a response from service B, the end users don’t also have to wait for long— this leads to improved performance and, by extension, a better user experience.
Just in case you are considering moving from HTTP to AMQP in your distributed architecture and you are just not sure how to go about it, you can checkout this 7 parts beginner guide on message queues and microservices. It shows you how to use a message queue in a distributed architecture by walking you through a demo project.
Hi went through Enterprise Integration Patterns by Gregor Hohpe and Bobby Woolf.
http://www.eaipatterns.com/toc.html
I also went through Camel and Mule's compliance with these integration patterns -
http://www.mulesoft.org/documentation/display/current/Understanding+Enterprise+Integration+Patterns+Using+Mule
http://camel.apache.org/enterprise-integration-patterns.html
I see that both Mule and Camel allow applications to be deployed and accessed via webservices like SOAP or REST, SOAP being more RPC style. They allow massive integration support using opensource utilities like CXF and Jersey. In fact Mule also supports RMI endpoints - which will give remote method invokation capability as well which is a well-accepted form of Integration.
I understands ESBs are built around a Message Bus with additional support for other protocols however ESBs only comply to EIP and EIP is not just ESBs.
Question is why SOAP/REST or their transport protocol not considered as "Integration Styles" and which is Enterprise Integration so "Message Oriented"?
I am a novice compared to the great minds which designed these patterns but trying to understand the lopsided message-y nature of Integration patterns. I admit it isn't quite the QnA format of stack overflow but will request Mods to keep it alive for a while so that people can share their opinions.
As for SOAP, it would put it under the Integration Style "Remote Procedure Invocation", since it's pretty much what SOAP implements in reality (I won't consider the SOAP over JMS hybrids here with a potential to mix RPC with Messaging..).
REST is, interface wise, very different from SOAP in that it's resource driven instead of service driven. I would non the less group it under the "RPC" style since it's just another format of syncrhonous RPC calls.
I would, however, not put too much effort in theory of what EIP integration style a specific message pattern implements.
Look at a specific scenario at hands instead and use the EIP to model your specific integration.
I've seen integrations of file transfers that in realtiy implemented RPC patterns or SOAP services that in reality implemented messaging (although I don't really recommend do this).
A concrete example: consider the usage of a dedicated file upload service, which happends to be built using SOAP technolgy, which uploads a CSV file to a file area on a server, from where it's picked up by some other system. I would call this file based integration on a high level.
Another example is that Messaging systems sometimes are implemented using a shared database. Still the integration style using them is messaging, not "Shared Database".
Think about how your integration should work on a high level, then apply the various protocols to do the grunt work.
Is an Enterprise Service Bus (a tool that acts as a mediator, a message broker, a service enabler, schema transformation enhancer, transparent location provider, service aggregator, load balancer, monitor, and all that stuff) responsible to orchestrate services?
What about putting an automated business business process with more than thousand steps and dozens of service invocations inside your enterprise service bus?
Would you do it, or would you use a specialist in orchestration such as a BPEL engine?
Please gimme you opinion.
Yes and no. There's a thin, and sometimes indistinguishable line between orchestration and aggregation/service augmentation.
In general, if you've got any long-running or complex business process (process being the key word, although I'm going to avoid defining it) - that's best suited to BPEL.
Simple tasks, such as aggregating the results of three service calls, could and often should be done in an ESB layer.
It's not worth losing too much sleep over, though
Disclaimer: I am an IBM ESB consultant, although I'm not writing this in an official capacity.
No, an ESB's responsibility is not the orchestration of services (per se).
The ESB provides a layer of abstraction at the "software infrastructure level".
This means that an ESB is a "single logical abstract port of call for connectivity" with any service that is published on the bus.
The ESB being abstract, means that consumers of services on the bus, don't "need to know" deployment details of the service, and it is possible to expose "internally facing services" with a single document model. The ESB provides low level services (such as protocol translation and message transformation), so that internally services can communicate in a simplified fashion.
This implies some orchestration: The ESB provides orchestration of the afore mentioned low level services (e.g. when service X is called via IIOP, translate this to SOAP with Attachments. Then transform the request from whatever serialized data to an XML payload).
The orchestration you would typically avoid in an ESB is: In order to process this (insurance) sale, we first need to validate the information provided by the buyer, then we need to underwrite the risk of insuring, and finally calculate the premium that needs to be paid for the insurance, after which we need to… etc.
The steps described above are clearly a business process (which could even be interrupted… e.g. if automatic underwriting is not possible, then a human underwriter needs to further assess the risk).
Business Services (e.g. Validation, Underwriting, Premium Calculation) that make up a Business Process (e.g. Insurance Sale), which is what is typically referred to as Orchestration, is best suited to happen in a Business Process Engine and defined using a formalized Business Process Modeling Language (such as BPEL).
Also making a guess about the many steps in your process: In the above example, Validation is a (course grained) service. The validation rules themselves are internal to that service. For complex business rules (i.e. not business process), the use of a Business Rules Engine may be required.
My short quick answer is NO, that not its responsability.
I would rather let that to the BPEL or a BPM suite.
Mhh I don't know what else to add :) ... Good luck?
Now my own vision.
Regarding all the work an ESB has to do, putting service orchestration inside the main infrastructure element of your SOA is not a good idea.
Aggregate, ok! But keeping your communication channel busy with business logic will, for sure, cause a terrible impact in the ability to delivery other features.
After all, most ESBs such as as BEA Aqualogic Service have a limited support for orchestration including lack of stateful capabilities, and activities like wait (a timer) or pick (wait for some input to move on the process), split/join capabilities (already added on ALSB 3.0), and so on.
No way. Just use tools like a BPEL engine or a tool like Weblogic Integration.
Thanks.
Whenever you have two or more services that interact use service orchestrator, i.e. for composition and process control services. If you have esb expose this composition service on esb. Now if you have to compose new service that includes this composition service use orchestrator and again expose on esb.
Use esb as service delivery mechanism and web service broker and proxy. In composing a service orchestrator will use esb to reach interacting services. If these interacting services use incompatible xml schemas esb can transform/map them to common schema in runtime and route service requests based on the content, e.g. namespace.
Yes orchestration is a responsibility, in most cases, of the ESB. Or, alternatively, if you draw a line between ESB infra and orchestration infra, then you are doing so on a physical level for performance reasons, not for logical attribution of responsibility.
You have 2 choices - when, for example, an HR system receives a new employee - where do you place the business logic that says "the compliance department will need to approve and check first, and then if that's ok, the HR department will need to finalise the hire, then the accounting department will need a new entry, and then the payroll system will need updating, and if that fails, then we'll need to send an email to HR"? If all business processes are considered 'owned' by the initiating dept/application, then the overall system that is the enterprise becomes complex, with disparate orchestration systems.
The second choice is centralise the orchestration, essentially making it a logical partner of the messaging platform. If you choose to see these as separate artifacts, that is up to you, but it is equally valid to described both as ESB.
An Enterprise Service Bus should never be responsible for orchestrating services.
Orchestration implies a minimum of "smarts", specifically the ability to compensate for failed transactions. Service bus tools will often say they offer "try-catch" or something like that but the ability to run scoped componsation is the mark of a proper orchestration tool. Additionally the ability to wait, know its own state, or keep things in suspense is another indicator that you're dealing with an orchestrator and not a bus.
Speaking to 1000+ steps plus dozens of services, consider the if-then's in the process. If all the if-then statements in your 1000 steps speak only to routing with no change to the payloads then you're still in "routing" and therefore still in ESB. But if there's even one nested if-then and I start to look for different tools. Aside, if-thens that look like routing can very quickly impact business logic. Once business logic starts showing up then a better language such as BPEL or BPMN is better.
The example of an orchestra conductor is often given to describe how orchestration works, a central individual directing the musicians according to a score. Often what's left off is the idea that the conductor is not only directing, but listening as well, and if something goes wrong can compensate in a reliable, repeatable way.
For instance imagine our first conductor goes to bring in the tuba player but said tuba player has decided to go do something else. A simple pinball-style "orchestrator" will bring in the tuba section, knowing full well it isn't there, and then wait for the audience to complain later. A really savvy conductor would see the tuba gone, and immediately bring up the deeper baritone horns to compensate.