Is it an anti-pattern if multiple microservices read from and write to the same DynamoDB table?
Please note that:
We have a fixed schema which will not change any time soon.
All reads/writes will go through REST services via API Gateway + Lambda
Microservices are deployed on Kubernetes or Lambda
When writing microservices, the usual advice is not to share databases. That guideline exists because it allows easy scaling and gives each service full say over its own set of data. It also lets each service decide how its data is stored and change that at will. This gives services flexibility.
Even if your schema is not changing, separate databases mean you can be sure that one service being throttled will not impact the others.
If all your CRUD calls are channeled through a REST service, you do have a service layer in front of the database, and yes, you can do that under microservices guidelines.
I still would not call the approach an anti-pattern. It's just that collective experience says it's better not to have everything talk to one database, for some of the reasons I mentioned above.
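For illustration, a minimal sketch of such a data-access layer in front of the shared table could look like the following (the table, key, and attribute names are invented for the example, and it assumes the AWS SDK for Java v2); each microservice would call this service over REST instead of touching the table directly:

    import java.util.Map;
    import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
    import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
    import software.amazon.awssdk.services.dynamodb.model.GetItemRequest;
    import software.amazon.awssdk.services.dynamodb.model.PutItemRequest;

    // Hypothetical single owner of the shared table, exposed via API Gateway + Lambda or another REST front end.
    public class CustomerStore {
        private static final String TABLE = "SharedCustomerTable"; // invented name
        private final DynamoDbClient dynamo = DynamoDbClient.create();

        public Map<String, AttributeValue> get(String customerId) {
            GetItemRequest request = GetItemRequest.builder()
                    .tableName(TABLE)
                    .key(Map.of("pk", AttributeValue.builder().s(customerId).build()))
                    .build();
            return dynamo.getItem(request).item();
        }

        public void put(String customerId, String payload) {
            PutItemRequest request = PutItemRequest.builder()
                    .tableName(TABLE)
                    .item(Map.of(
                            "pk", AttributeValue.builder().s(customerId).build(),
                            "payload", AttributeValue.builder().s(payload).build()))
                    .build();
            dynamo.putItem(request);
        }
    }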
I have a different opinion here, and I would call this approach an anti-pattern. A few of the key principles being violated:
Each microservice should have its own bounded context; if all of them share the same data set, their business boundaries are blurred.
The database is a single point of failure: if it goes down, all your services go down.
If you scale a single service by spawning multiple instances, it will impact database performance and eventually affect the performance of the other microservices.
Solution, IMO:
First, analyze whether you really have a case for a microservices architecture; if your data is very tightly coupled and interdependent, then you don't need this architecture.
Explore the CQRS pattern; you might want separate databases for reads and writes and synchronize them via events.
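As a rough, in-memory sketch of that idea (the names are illustrative, and a real setup would use separate databases and a message broker rather than an in-process listener):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.function.Consumer;

    // Write side: accepts commands, persists to the write store, and publishes an event.
    class CustomerWriteService {
        private final List<Consumer<String[]>> listeners = new ArrayList<>();

        void onCustomerCreated(Consumer<String[]> listener) { listeners.add(listener); }

        void createCustomer(String id, String name) {
            // ... insert into the write-optimized store here ...
            String[] event = {id, name};
            listeners.forEach(l -> l.accept(event)); // stand-in for publishing to a broker
        }
    }

    // Read side: keeps its own read-optimized copy, updated only from events.
    class CustomerReadService {
        private final Map<String, String> namesById = new ConcurrentHashMap<>();

        void apply(String[] customerCreated) { namesById.put(customerCreated[0], customerCreated[1]); }

        String getName(String id) { return namesById.get(id); }
    }

    class CqrsDemo {
        public static void main(String[] args) {
            CustomerWriteService writes = new CustomerWriteService();
            CustomerReadService reads = new CustomerReadService();
            writes.onCustomerCreated(reads::apply);
            writes.createCustomer("42", "Ada");
            System.out.println(reads.getName("42")); // queries are served from the read model
        }
    }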
Hope this helps!!
Related
We are designing a new system, which will be based on microservices.
I would like to ask whether it is considered an anti-pattern to produce to and consume from the same queue.
The service should be a REST-based microservice.
The backend microservice should handle large-scale operations on IoT devices, which may not always be connected to the internet, so the API must be asynchronous and provide robust features such as a configurable number of retries before final failure, etc.
The API should immediately return a UUID with a 201 response from the POST request and expose a 'Get Current Status' endpoint (pending, active, done, etc.) for the operation on the IoT device, accepting the UUID in the GET request.
We've thought about two solutions (described at a high level) for managing this task:
Implement both an API GW microservice and a logic-handler microservice.
The API GW will expose the REST API methods and will publish messages via RabbitMQ that will be consumed by multiple instances of the logic handler microservice.
The main drawback is that we will need to manage multiple deployments and keep the API exposed by service A consistent with the consumer in service B.
Implement one microservice that exposes the relevant APIs and, in order to handle the scale and the asynchronous operations, publishes each request to its own queue over RabbitMQ, to be consumed by the same microservice in a background worker.
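Sketched at a high level, option 2 could look something like this with the RabbitMQ Java client (the queue name, message format, and status handling are placeholders, not the real design):

    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.Connection;
    import com.rabbitmq.client.ConnectionFactory;
    import com.rabbitmq.client.DeliverCallback;
    import java.nio.charset.StandardCharsets;
    import java.util.UUID;

    public class OperationService {
        private static final String QUEUE = "iot-operations"; // invented queue name
        private final Channel channel;

        public OperationService(Channel channel) throws Exception {
            this.channel = channel;
            channel.queueDeclare(QUEUE, true, false, false, null);
        }

        // Called from the REST handler: enqueue the work and return an id right away (201).
        public String submit(String payload) throws Exception {
            String id = UUID.randomUUID().toString();
            channel.basicPublish("", QUEUE, null,
                    (id + ":" + payload).getBytes(StandardCharsets.UTF_8));
            return id;
        }

        // Started once at boot: the same service consumes its own queue in the background.
        public void startWorker() throws Exception {
            DeliverCallback handler = (tag, delivery) -> {
                String message = new String(delivery.getBody(), StandardCharsets.UTF_8);
                // ... talk to the IoT device, update the status store, retry on failure ...
                channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
            };
            channel.basicConsume(QUEUE, false, handler, tag -> { });
        }

        public static void main(String[] args) throws Exception {
            ConnectionFactory factory = new ConnectionFactory();
            factory.setHost("localhost");
            Connection connection = factory.newConnection();
            OperationService service = new OperationService(connection.createChannel());
            service.startWorker();
        }
    }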
The second solution looks easier to develop, maintain, and update because all the logic, including the REST API handling, lives in the same microservice.
To some members of the team this solution looks a bit dirty, and we can't decide whether it is an anti-pattern to consume messages from your own queue.
Any advice will be helpful.
I would like to ask whether it is considered an anti-pattern to produce to and consume from the same queue.
That definitely sounds suspicious, though I don't have a lot of experience with queues. But you also talk about microservices and producing to and consuming from those; that sounds fine, as there's no reason why a microservice (and by extension, its API) can't do both. But then I'm a bit confused, because reading the rest of the question I don't really see how that issue is a factor.
Having some kind of separation between the API and the microservice sounds wise, because you can change the microservice's implementation without affecting callers, assuming it's a non-breaking change. It means you have the ability to solve API/consumer problems and backend problems separately.
Version control is just a part of good engineering practice, and probably not an ideal reason to bend your architecture. You should be able to run more than one version in parallel: a lot of API providers operate an N+2 model, where they support the current version plus the last two (major) releases. That way you give your consumers a reasonable runway for upgrading.
As long as you keep the two concerns modular, so that you'd be able to separate them when it makes sense, it doesn't matter.
I'd think that in the longer term you'd probably want to treat them as two separate services, as they'd probably have different update cycles (e.g. the gateway part may need things like auth, maybe an additional gRPC API, etc.), different security requirements (one is accessible from the outside while the other consumes internal resources), different scalability concerns (you'd probably need more resources for the processing), and so on.
I have been building .NET Web APIs for years... normally I have one API with 10 or so different controllers that handle everything from signing users up to business logic, payments, etc. They all talk to class libraries that talk to the database and such. Nothing fancy, but it has been effective.
Fast forward to today... I am building a version 2 for an app that gets a good amount of traffic. I know my app is gonna get hit hard so I am looking for something with a foundation of efficiency and scale.
This has led me to embrace the coolness of Service Fabric and ASP.NET Core Web APIs. I have been reading lots of tutorials, articles, and SO questions, and from what I understand, the beauty of Service Fabric is that it will spawn up multiple nodes in a single VM when things get busy.
So, if I maintain my normal pattern and make a single Web API with 10+ controllers, can Service Fabric do what it needs to do? Or am I supposed to create multiple little APIs that are more focused so that Service Fabric can add/remove them as things get busy?
That sounds like the right thing to do, and I have set up my code to do just that by putting my Models and Data classes in their own class libraries so they can be reused by the different APIs, but I just wanted to double-check before I do something potentially stupid.
If I split up, say each controller into its own Service Fabric service, will the Azure server be more efficient and scale better?
Nodes
In Service Fabric clusters (on Azure / standalone) a Node equals a VM. If you increase the number of machines, more Nodes appear in the cluster. (This is not the case for your local dev cluster.) Scaling in Azure clusters is simple: just change the VMSS instance count.
Only if you configure stateless services with an instance count of -1 will Service Fabric spawn new instances of them. This is triggered by the addition of nodes, not by load itself.
You can configure autoscaling for VM scale sets (VMSSes).
Web API
Service Fabric just tries to balance the load of all running SF Services across the available resources. That could be one instance of one service type on every node, or multiple instances of many types. So one service can just use all the resources of the node it's running on, like with IIS. (This is why Container support is coming by the way.)
Web API design isn't directly influenced by Service Fabric. The same rules apply as when running on IIS or elsewhere. It's really your choice.
Microservices
Your normal pattern will work. But making smaller services from it could help reduce the impact of changes. (At the cost of increased complexity.) Consider creating services that offer common functionality following the Microservices paradigm.
With microservices, your code changes are scoped to smaller modules, less testing is needed, and performance degrades less during updates. This way, in theory, you can release new features in less time.
It depends.
If you have a natural division in your controllers regarding the resources they use, then you may get some benefit from splitting your services along that division line. Say service A uses lots of CPU and service B mostly makes HTTP calls; giving SF the ability to place the CPU-heavy load on its own may mean fewer affected HTTP calls.
You can optimize how SF distributes load by reporting load from inside your app, but do so in the simplest way possible and don't add numerous dimensions; maybe one per service at most.
If all your controllers use roughly the same types of resources in roughly the same way, then there's no real benefit to splitting them into separate services, just complications in code management, deployments, and potentially inter-service communication.
I am trying to understand the definitions in this document.
http://www.opengroup.org/soa/source-book/ontologyv2/service.htm
Their definitions of service, service interface and service contract are either unclear or seem different from what I normally encounter.
Service:
“A service is a logical representation of a repeatable activity that has a specified outcome. It is self-contained and is a ‘black box’ to its consumers.”
Let's say I have a WCF project and it has two operations:
StoreFront
+GetPrice
+AddToCart
The definition says "a repeatable activity". So is the service StoreFront? Or do I have two services (GetPrice and AddToCart)?
Service Contract:
Has an "effect" class. Is the effect "return price" and "added to cart"?
From the same article:
“A capability offered by one entity or entities to others using well-defined ‘terms and conditions’ and interfaces.” (Source: OMG SoaML Specification - my italics)
This is, in my opinion, a preferable definition to the one talking about "repeatable activities".
The key word in the definition is capability. Capability refers to Business Capability which is a carry-over from the BPM industry, but in an SOA context refers to a business domain with distinct boundaries.
So from this definition we can surmise that services should be exposed or should operate within a business capability/process boundary. This leads us towards the idea (from the principles or tenets of SOA) that services should be autonomous within well-defined boundaries.
In your example, you are asking
So is the service StoreFront? Or do I have two services (GetPrice and AddToCart)?
The answer to that as always is "it depends". However, generally Pricing (GetPrice) would belong to a different business capability to Ordering (AddToCart). Additionally, the operations differ in some other important ways:
GetPrice is a read operation, while AddToCart is a write operation.
GetPrice is a synchronous operation, while AddToCart could very well be asynchronous
So from these we should probably assume that they are two different services from a business perspective.
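To make that concrete, a minimal sketch of the split (the interface and method names are purely illustrative) might be:

    import java.math.BigDecimal;

    // Pricing capability: read-only, synchronous.
    interface PricingService {
        BigDecimal getPrice(String productId);
    }

    // Ordering capability: a write, which could well be handled asynchronously.
    interface OrderingService {
        void addToCart(String cartId, String productId, int quantity);
    }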
This assumption has some radical repercussions. If they are two services, then according to SOA they should be autonomous. That means we should be looking to minimize coupling between the services in every possible way, so that as much as possible they can be planned, developed, tested, built, deployed, hosted, supported, and managed as separate concerns.
Another repercussion is that when you physically separate services to this extent, how can you show this stuff together to your users? They may be different capabilities but they still need to work together on the screen.
Additionally, from a back end perspective Ordering needs to know about Pricing data, otherwise how can order fulfillment happen? If you've separated the database into two, how can the Checkout service know how much stuff costs, what discounts to apply, etc?
I have posted about this stuff before, so please feel free to have a read. I would recommend reading the excellent article on Microservices by Lewis and Fowler also.
Our client follows SOA principles and has designed web services that are very fine-grained, like createCustomer, deleteCustomer, etc.
I am not sure if fine-grained services are desirable, as they create transaction-related issues. For example, suppose a business requirement is that every Customer must have an Address when it is created. In this case, the presentation component will invoke createCustomer first and then createAddress. The services internally use plain JDBC to update the respective tables in the database. Since each service is invoked by an external component, there is no way to fulfil the transactional requirement here, i.e. if createAddress fails, the createCustomer operation must be rolled back.
I guess one approach to deal with this is to either design coarse-grained services (that create a Customer and the associated Address in one single JDBC transaction), or perhaps simply create a reversing service (deleteCustomer) that undoes the action of createCustomer.
Any suggestions? Thanks.
The short answer: services should be designed for the convenience of the service client. If the client is told "call this, then don't forget to call that", you're making their lives too difficult. There should be a coarse-grained service.
A long answer: Can a Customer reasonably be entered with no Address? So we call
createCustomer( stuff but no address)
and the result is a valid (if maybe not ideal) state for a customer. Later we call
changeCustomerAddress ( customerId, Address)
and now the persisted customer is more useful.
In this scenario the API is just fine. The key point is that the system's integrity does not depend upon the client code "remembering" to do something, in this case to add the address. However, more likely we don't want a customer in the system without an address in which case I see it as the service's responsibility to ensure that this happens, and to give the caller the fewest possibilities of getting it wrong.
I would see a coarse-grained createCompleteCustomer() method as by far the best way to go - this allows the service provider to solve the problem once rather than requiring every client programmer to implement the logic.
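A minimal sketch of what that coarse-grained operation might look like with plain JDBC (the table and column names are invented for the example):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import javax.sql.DataSource;

    class CustomerService {
        private final DataSource dataSource;

        CustomerService(DataSource dataSource) { this.dataSource = dataSource; }

        // One service call, one transaction: either both rows are written or neither is.
        void createCompleteCustomer(String customerId, String name, String street) throws SQLException {
            try (Connection con = dataSource.getConnection()) {
                con.setAutoCommit(false);
                try (PreparedStatement insCustomer =
                             con.prepareStatement("INSERT INTO customer (id, name) VALUES (?, ?)");
                     PreparedStatement insAddress =
                             con.prepareStatement("INSERT INTO address (customer_id, street) VALUES (?, ?)")) {
                    insCustomer.setString(1, customerId);
                    insCustomer.setString(2, name);
                    insCustomer.executeUpdate();

                    insAddress.setString(1, customerId);
                    insAddress.setString(2, street);
                    insAddress.executeUpdate();

                    con.commit();
                } catch (SQLException e) {
                    con.rollback(); // the address failed, so the customer insert is undone too
                    throw e;
                }
            }
        }
    }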
Alternatives:
a) There are Web Services specs for Atomic Transactions, and major vendors do support these specs. In principle you could actually implement this using fine-grained methods and true transactions. In practice, I think you enter a world of complexity when you go down this route.
b) A stateful interface (work, work, commit), as mentioned by @mtreit. Generally speaking, statefulness either adds complexity or obstructs scalability. Where does the service hold the intermediate state? If in memory, then we require affinity to a particular service instance and hence introduce scaling and reliability problems. If in some state or work-in-progress database, then we have significant additional implementation complexity.
OK, let's start:
Our client follows SOA principles and has designed web services that are very fine-grained, like createCustomer, deleteCustomer, etc.
No, the client has failed to follow SOA principles and has put up what most people do: a morass of badly defined interfaces. Following SOA principles, the client would have gone for a coarser interface (such as, for example, the OData mechanism to update data) or followed the advice of any book on multi-tiered architecture written in the last 25 years. SOA is just another word for what was invented with CORBA, and all the mistakes SOA dudes make today were basically well-known design stupidities 10 years ago with CORBA. Not that any of the people doing SOA today have ever heard of CORBA.
I am not sure if fine-grained services are desirable, as they create transaction-related issues.
Only for users and platforms not supporting web services. Seriously. Naturally you get transactional issues if you ignore transactional issues in your programming. The trick here is that the people further up the food chain did not; it's just that your client decided to ignore common knowledge (again, see my first remark on CORBA).
The people designing web services were well aware of transactional issues, which is why the web service specifications (WS-*) actually contain mechanisms for handling transactional integrity by moving commit operations up to the client calling the web service. The particular spec you and your client should read is WS-AtomicTransaction.
If you use current technology to expose your web service (e.g. WCF on the MS platform; similar technologies exist in the Java world), then you can expose transaction-flow information to the client and let the client handle transaction demarcation. This has its own share of problems - like clients keeping transactions open maliciously - but it is still pretty much the only way to handle transactions that are defined in the client.
Since you give no platform and just mention Java, I am pointing you to an MS example of how that can look:
http://msdn.microsoft.com/en-us/library/ms752261.aspx
Web services, in general, are a lot more powerful and a lot better thought out than most people doing SOA realize. Most of the problems they see were solved a long time ago. But then, SOA is just a buzzword for multi-tiered architecture, and most people who think it is the greatest thing since sliced bread just don't know what was around 10 years ago.
As your customer, I would be a lot more careful about the performance side. Fine-grained, non-semantic web services like the ones he defines are a performance hog for anything beyond casual use, because the number of times you cross the network to read or update tiny pieces of data means network latency kills you. Creating an order for, say, 10 goods can easily take 30-40 network calls in this scenario, which may well take a lot of time. SOA has preached from the beginning (if you ignore the ramblings of those who don't know history) NOT to use fine-grained calls but to go for a coarse-grained exchange of documents and/or a semantic approach, much like the OData system.
If transactionality is required, a coarser-grained single operation that can implement transaction-semantics on the server is definitely going to be much simpler to implement.
That said, certainly it is possible to construct some scheme where the target of the operations is not committed until all of the necessary fine-grained operations have succeeded. For instance, have a Commit operation that checks some flag associated with the object on the server; the flag is not set until all of the necessary steps in the transaction have completed, and Commit fails if the flag is not set.
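A minimal in-memory sketch of that staged-commit idea (the types, names, and the draft map are invented for the illustration): fine-grained operations build up a draft, and nothing is accepted until Commit verifies that every required step has completed.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    class StagedCustomerService {
        static class Draft {
            String name;
            String address;
            boolean complete() { return name != null && address != null; }
        }

        private final Map<String, Draft> drafts = new ConcurrentHashMap<>();

        public void createCustomer(String id, String name) {
            drafts.computeIfAbsent(id, k -> new Draft()).name = name;
        }

        public void createAddress(String id, String address) {
            drafts.computeIfAbsent(id, k -> new Draft()).address = address;
        }

        // Commit fails unless all required fine-grained steps have run.
        public void commit(String id) {
            Draft draft = drafts.get(id);
            if (draft == null || !draft.complete()) {
                throw new IllegalStateException("customer " + id + " is not complete");
            }
            // ... persist the completed draft to the real store here ...
            drafts.remove(id);
        }
    }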
Of course, if having light-weight, fine grained operations is an important design requirement, perhaps the need to have transactionality should be re-thought.
Maybe I just expected "three-tier architecture" to deliver a little more than just a clean separation of responsibilities in the source code (see here)...
My expectations of such a beast that can safely call itself a "three-tier architecture" are a lot higher... so here they are:
If you were to build something like a "three-tier architecture" system, but this time with these additional requirements and constraints:
Up and running at all times from a user's point of view
Except when the UI gets replaced
When other parts of the system are down, the UI has to handle that
Never get into an undefined state or one from which the system cannot recover automatically
The system has to be "pausable"
The middle-tier has to contain all the business logic
Obviously using an underlying Database, itself in the data-tier (if you like)
The business logic can use a big array of core services (these sit in the data tier, not directly accessible by the UI, only through the business-logic-tier facade)
Can be unavailable at times
Can be available as many parallel running, identical processes
The UIs may not contain any state other than the session (in the case of web UIs) and possibly transient view backing models
Presentation-tier, logic-tier and data/core-services-tier have to be scalable independently
The only thing you can take for granted is the network
Note: The mentioned "core services" are heavy-weight components that access various external systems within the enterprise. An example would be the connection to an Active Directory or to a "stock market ticker"...
1. How would you do it?
If you don't have an answer right now, maybe read on and let me know what you think about this:
Synchronous calls considered harmful. They tie your system together in a bad way (think "weakest link"): a thread is blocked while waiting for a timeout, and that is not easy to recover from.
Use asynchronous messaging for all inter-process communication (between all tiers). This allows you to suspend the system any time you like, and when part of the system is down, no timeouts occur.
Have a central routing component that all requests get routed through and with which core services can register themselves.
Add a heartbeat component that can, for example, inform the UI that a component is not currently available (see the sketch after this list).
State is a necessary evil: allow no state other than in the business-logic tier. This way the beast becomes manageable. While the core services might well need to access data themselves, all that data should be fed in by the calling middle tier. This way the core services can be implemented in a fire-and-forget fashion.
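As a rough illustration of the heartbeat idea (the class name, timeout handling, and the way heartbeats arrive are all made up for the sketch), each component could periodically report itself to a monitor, and the monitor treats a component as unavailable if no heartbeat has arrived within a timeout:

    import java.time.Duration;
    import java.time.Instant;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Minimal heartbeat monitor sketch: components call beat() periodically
    // (e.g. via an asynchronous message); the UI asks isAvailable() before showing features.
    class HeartbeatMonitor {
        private final Map<String, Instant> lastSeen = new ConcurrentHashMap<>();
        private final Duration timeout;

        HeartbeatMonitor(Duration timeout) {
            this.timeout = timeout;
        }

        // Called whenever a heartbeat message from a component arrives.
        void beat(String componentId) {
            lastSeen.put(componentId, Instant.now());
        }

        // A component is considered available only if it has reported recently.
        boolean isAvailable(String componentId) {
            Instant seen = lastSeen.get(componentId);
            return seen != null && Duration.between(seen, Instant.now()).compareTo(timeout) < 0;
        }
    }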
2. What do you think about this "solution"?
I think that, in the real world, high-availability systems are implemented using fail-over: for example, it isn't that the UI can continue to work without the business layer, instead it's that if the business layer becomes unavailable then the UI fails over to using a backup instance of the business layer.
Apart from that, they might operate using store-and-forward: e.g. a mail system might store a piece of mail, and retransmit it periodically, if it can't deliver it immediately.
Yep, it's the way most large websites do it. Look at NoSQL databases, Google's Bigtable architecture, etc.
1. This is the general approach I'd take.
I'd use a mixture of memcached, a NoSQL cloud (CouchDB or MongoDB) and enterprise-grade RDBMS systems (core data storage) for the data layer. I'd then write the service layer on top of the data layer. NoSQL database APIs are massively parallel (look at CouchDB with its nginx service-layer parallelizer). I'd then provide old-school "each request is a web page" generating web servers and also direct access to the service layer for new-style AJAX applications; both of these would depend on the service layer.
P.S. The RDBMS is an important component here; it holds the authoritative copy of all the data in the memcached/NoSQL cloud. I would use an enterprise-grade RDBMS to do data-centre-to-data-centre replication. I don't know how the big boys do their cloud-based site replication; it would scare me if they did data-cloud-to-data-cloud replication :P
Some points:
You do not need a heartbeat; with NoSQL the approach taken is that if content becomes unavailable, you regenerate it onto another server using the authoritative copy of the data.
The burden of stateless web design is carried by the NoSQL and memcached layer, which is infinitely scalable. So you do not need to worry about this; just have a good network infrastructure.
In terms of sync, when you are talking to the RDBMS you can expect acceptable synchronous response times. Treat your cloud as an asynchronous resource; you will get help from the APIs that interface with your cloud, so you don't even have to think about this.
The advice I can give about networking and redundancy is this: do not go for fancy Ethernet bonding, as it's not worth it -- things always go wrong. Just set up redundant switches and Ethernet cards, and have multiple routes to all your machines. You can use OpenBSD and CARP for your routers, as they work great. Routers are your worst point of failure -- OpenBSD solves this problem.
2. You've described the general components of a Web 2.0 farm, so no comment :D