Is there a tangible benefit to using wrapper requests over plain messages in grpc service calls? - grpc

Let's say we have a message containing the ID of some record in the database:
message Record {
  uint64 id = 1;
}
We also have an rpc call that returns all of the rows from table DATA that said record is mentioned in.
rpc GetDataForRecord(Record) returns (Data) {}
If we, for example, wrap Record in
message RqData {
  Record id = 1;
}
then once we need to only return, for example, "active" data, we won't need to make
GetActiveDataForRecord
instead we could extend RqData as:
message RqData {
  Record id = 1;
  bool use_active = 2;
}
and use
rpc GetDataForRecord(RqData) returns (Data) {}
and clients that know of this new functionality will be able to call it, while older clients will keep using it as before, passing only the Record part within the Rq wrapper without saying whether they want active data or not.
Here's the question: is there really a reason to use this kind of wrapping of everything into a separate request, or am I overthinking things and just passing plain structures will do?
I am kinda trying to think about the future, but not sure if I am not overcomplicating things.

In general, making a method-specific request and response is a Good Thing™ and is encouraged. For a Foo method you'd have FooRequest and FooResponse. Having specialized messages for the method allows you to add new "arguments," as you mentioned.
But for some cases it turns out fine to break the pattern and avoid the wrapping; it's a judgement call. Although you're asking from a different perspective, you may be interested in this answer about related methods.
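To make the compatibility argument concrete, here is a minimal JavaScript sketch that models the request as a plain object rather than a generated protobuf class. The handler name getDataForRecord, the camel-cased useActive field, and the in-memory ROWS table are assumptions for illustration only. The point is that a field an older client never sets falls back to its default, so one method can serve both old and new callers.

```javascript
// Hypothetical in-memory stand-in for table DATA.
const ROWS = [
  { recordId: 1, value: "a", active: true },
  { recordId: 1, value: "b", active: false },
  { recordId: 2, value: "c", active: true },
];

// Models rpc GetDataForRecord(RqData). A missing useActive is treated like
// the proto3 default for bool (false), which is what an older client sends.
function getDataForRecord(request) {
  const recordId = request.id.id;
  const useActive = request.useActive === true;
  return ROWS.filter(
    (row) => row.recordId === recordId && (!useActive || row.active)
  );
}

// Older client: only knows about the wrapped Record.
const oldClientResult = getDataForRecord({ id: { id: 1 } });

// Newer client: opts in to the new field.
const newClientResult = getDataForRecord({ id: { id: 1 }, useActive: true });

console.log(oldClientResult.length, newClientResult.length);
```

Had the method taken a bare Record, the same change would have meant either adding a field to Record itself (which leaks into every other use of Record) or adding a second method.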

Related

Returning multiple items in gRPC: repeated List or stream single objects?

gRPC newbie. I have a simple api:
Customer getCustomer(int id)
List<Customer> getCustomers()
So my proto looks like this:
message ListCustomersResponse {
  repeated Customer customer = 1;
}
rpc ListCustomers (google.protobuf.Empty) returns (ListCustomersResponse);
rpc GetCustomer (GetCustomerRequest) returns (Customer);
I was trying to follow Google's lead on the style. Originally I had returns (stream Customer) for GetCustomers, but Google seems to favor the ListxxxResponse style. When I generate the code, it ends up being:
public void getCustomers(com.google.protobuf.Empty request,
StreamObserver<ListCustomersResponse> responseObserver) {
vs:
public void getCustomers(com.google.protobuf.Empty request,
StreamObserver<Customer> responseObserver) {
Am I missing something? Why would I want to go through the hassle of creating a ListCustomersResponse when I can just do stream Customer and get the streaming functionality?
The ListCustomersResponse is just streaming the whole list at once vs streaming each customer. Google's preference seems to be to return the ListCustomersResponse style all of the time.
When is it appropriate to use the ListxxxResponse vs the stream response?
This question is hard to answer without knowing what reference you're using. It's possible there's a miscommunication, or that the reference is simply wrong.
If you're looking at the gRPC Basics tutorial though, then I might have an inkling as to what caused a miscommunication. If that's indeed your reference, then it does not recommend returning repeated fields for streamed responses; your intuition is correct: you would just want to stream the singular Customer.
Here is what it says (screenshot intentional):
You might be reading rpc ListFeatures(Rectangle) as meaning an endpoint that returns a list [noun] of features. If so, that's a miscommunication. The guide actually means an endpoint to list [verb] features. It would have been less confusing if they just wrote rpc GetFeatures(Rectangle).
So, your proto should look more like this,
rpc GetCustomers (google.protobuf.Empty) returns (stream Customer);
rpc GetCustomer (GetCustomerRequest) returns (Customer);
generating exactly what you suspected made more sense.
Update
Ah I see, so you're looking at this example in googleapis:
// Lists shelves. The order is unspecified but deterministic. Newly created
// shelves will not necessarily be added to the end of this list.
rpc ListShelves(ListShelvesRequest) returns (ListShelvesResponse) {
  option (google.api.http) = {
    get: "/v1/shelves"
  };
}
...
// Response message for LibraryService.ListShelves.
message ListShelvesResponse {
  // The list of shelves.
  repeated Shelf shelves = 1;
  // A token to retrieve next page of results.
  // Pass this value in the
  // [ListShelvesRequest.page_token][google.example.library.v1.ListShelvesRequest.page_token]
  // field in the subsequent call to `ListShelves` method to retrieve the next
  // page of results.
  string next_page_token = 2;
}
Yeah, I think you've probably figured the same by now, but here they have chosen to use a simple RPC, as opposed to a server-side streaming RPC (see here). I emphasize this because I think the important choice is not the stylistic difference between repeated versus stream, but rather the difference between a simple request-response API versus a more complex and less-ubiquitous streaming API.
In the googleapis example above, they're defining an API that returns a fixed and static number of items per page, e.g. 10 or 50. It would simply be overcomplicated to use streaming for this, when pagination is already so well-understood and prevalent in software architecture and REST APIs. I think that is what they should have said, rather than "a small number." So the complexity of streaming (and the learning cost to you and future maintainers) has to be justified, that's all. Suppose you're actually fetching thousands of (x, y, z) items for a point cloud, or you're creating a live-updating bid-ask visualizer for some cryptocurrency.
Then you'd start asking yourself, "Is a simple request-response API my best option here?" So it just tends to be that, the larger the number of items needing to be returned, the more streaming APIs start to make sense. And that can be for conceptual reasons, e.g. the items are a live-updating stream in time like the above crypto example, or architectural, e.g. it would be more efficient to start displaying results in the UI as partial data streams back. I think the "small number" thing you read was an oversimplification.
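The two shapes can be compared side by side in a small JavaScript sketch. This is not gRPC code: listCustomers, streamCustomers, and the CUSTOMERS array are hypothetical stand-ins, with a generator playing the role of a server-streaming response. It is only meant to show what the caller has to do in each style.

```javascript
// Hypothetical data source standing in for the customer table.
const CUSTOMERS = Array.from({ length: 25 }, (_, i) => ({ id: i + 1 }));

// Simple request-response style: one page per call plus a token for the next.
function listCustomers({ pageSize = 10, pageToken = "" } = {}) {
  const start = pageToken === "" ? 0 : Number(pageToken);
  const end = Math.min(start + pageSize, CUSTOMERS.length);
  return {
    customers: CUSTOMERS.slice(start, end),
    // An empty token signals that there are no more pages.
    nextPageToken: end < CUSTOMERS.length ? String(end) : "",
  };
}

// Streaming style: items are handed over one at a time.
function* streamCustomers() {
  for (const customer of CUSTOMERS) {
    yield customer;
  }
}

// Paged consumption: the caller loops until the token comes back empty.
const paged = [];
let token = "";
do {
  const page = listCustomers({ pageSize: 10, pageToken: token });
  paged.push(...page.customers);
  token = page.nextPageToken;
} while (token !== "");

// Streamed consumption: the caller can act on each item as it arrives.
const streamed = [];
for (const customer of streamCustomers()) {
  streamed.push(customer);
}
```

Both loops end up with the same 25 items. The difference is that the paged version is a series of ordinary calls that any client can make, while the streamed version keeps one call open for as long as items keep coming.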

Chaining Handlers with MediatR

We are using MediatR to implement a "Pipeline" for our dotnet core WebAPI backend, trying to follow the CQRS principle.
I can't decide if I should try to implement an IPipelineBehavior chain, or if it is better to construct a new Request and call MediatR.Send from within my Handler method (for the request).
The scenario is essentially this:
User requests an action to be executed, i.e. Delete something
We have to check if that something is being used by someone else
We have to mark that something as deleted in the database
We have to actually delete the files from the file system.
Option 1 is what we have now: A DeleteRequest which is handled by one class, wherein the Handler checks if it is being used, marks it as deleted, and then sends a new TaskStartRequest with the parameters to Delete.
Option 2 is what I'm considering: A DeleteRequest which implements the marker interfaces IRequireCheck, IStartTask, with a pipeline which runs:
IPipelineBehavior<IRequireCheck> first to check if the something is being used,
IPipelineBehavior<DeleteRequest> to mark the something as deleted in database and
IPipelineBehavior<IStartTask> to start the Task.
I haven't fully figured out what Option 2 would look like, but this is the general idea.
I guess I'm mainly wondering if it is code smell to call MediatR.Send(TRequest2) within a Handler for a TRequest1.
If those are the options you're set on going with, I say Option 2. Sending requests from inside existing MediatR handlers can be seen as a code smell. You're hiding side effects and breaking the Single Responsibility Principle. You're also coupling your requests together, and you should try to avoid situations where you can't send one type of request before another.
However, I think there might be an alternative. If a delete request can't happen without the validation and marking beforehand, you may be able to leverage a preprocessor (example here) for your TaskStartRequest. That way you can have a single request that does everything you need. This even mirrors your pipeline example by simply leveraging the existing MediatR patterns.
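The pipeline idea from Option 2 can be sketched independently of MediatR. The JavaScript below is not the MediatR API; buildPipeline, the behavior functions, and the in-memory sets are all hypothetical names chosen for this example. It only shows the general shape: each step receives the request and a next() callback, and can either continue or short-circuit.

```javascript
// Builds a send() function that runs behaviors in order around a final handler.
function buildPipeline(behaviors, handler) {
  return (request) =>
    behaviors.reduceRight(
      (next, behavior) => () => behavior(request, next),
      () => handler(request)
    )();
}

// Hypothetical state for the example.
const inUse = new Set(["report.pdf"]);
const markedDeleted = new Set();
const startedTasks = [];

// Step 1: short-circuit when the item is still in use.
const requireCheck = (request, next) =>
  inUse.has(request.name) ? { ok: false, reason: "in use" } : next();

// Step 2: mark the item as deleted, then continue.
const markDeleted = (request, next) => {
  markedDeleted.add(request.name);
  return next();
};

// Final handler: start the task that removes the files.
const startTask = (request) => {
  startedTasks.push(request.name);
  return { ok: true };
};

const send = buildPipeline([requireCheck, markDeleted], startTask);
const blocked = send({ name: "report.pdf" }); // stopped by the check
const deleted = send({ name: "draft.txt" }); // runs all three steps
```

The caller sends a single request and never needs to know that a second request type exists, which is the coupling the answer above warns about.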
Is there any need to break the tasks into multiple Handlers? Maybe I am missing the point of MediatR. Wouldn't this suffice?
public async Task<Result<IFailure,ISuccess>> Handle(DeleteRequest request)
{
    var thing = await this.repo.GetById(request.Id);
    if (thing.IsBeingUsed())
    {
        return Failure.BeingUsed();
    }
    var deleted = await this.repo.Delete(request.Id);
    return deleted ? new Success(request.Id) : Failure.DbError();
}

Where should I put a logic for querying extra data in CQRS command flow

I'm trying to implement simple DDD/CQRS architecture without event-sourcing for now.
Currently I need to write some code for adding a notification to a document entity (document can have multiple notifications).
I've already created a command NotificationAddCommand, ICommandService and IRepository.
Before inserting new notification through IRepository I have to query current user_id from db using NotificationAddCommand.User_name property.
I'm not sure how to do it right, because I can
Use IQuery from read-flow.
Pass user_name to domain entity and resolve user_id in the repository.
Code:
public class DocumentsCommandService : ICommandService<NotificationAddCommand>
{
    private readonly IRepository<Notification, long> _notificationsRepository;

    public DocumentsCommandService(
        IRepository<Notification, long> notifsRepo)
    {
        _notificationsRepository = notifsRepo;
    }

    public void Handle(NotificationAddCommand command)
    {
        // command.user_id = Resolve(command.user_name) ??
        // command.source_secret_id = Resolve(command.source_id, command.source_type) ??
        foreach (var receiverId in command.Receivers)
        {
            var notificationEntity = _notificationsRepository.Get(0);
            notificationEntity.TargetId = receiverId;
            notificationEntity.Body = command.Text;
            _notificationsRepository.Add(notificationEntity);
        }
    }
}
What if I need more complex logic before inserting? Is it ok to use IQuery or should I create additional services?
The idea of reusing your IQuery somewhat defeats the purpose of CQRS in the sense that your read-side is supposed to be optimized for pulling data for display/query purposes - meaning that it can be denormalized, distributed etc. in any way you deem necessary without being restricted by - or having implications for - the command side (a key example being that it might not be immediately consistent, while your command side obviously needs to be for integrity/validity purposes).
With that in mind, you should look to implement a contract for your write side that will resolve the necessary information for you. Driving from the consumer, that might look like this:
public DocumentsCommandService(IRepository<Notification, long> notifsRepo,
                               IUserIdResolver userIdResolver)

public interface IUserIdResolver
{
    string ByName(string username);
}
With IUserIdResolver implemented as appropriate.
Of course, if both this and the query-side use the same low-level data access implementation (e.g. an immediately-consistent repository) that's fine - what's important is that your architecture is such that if you need to swap out where your read side gets its data for the purposes of, e.g. facilitating a slow offline process, your read and write sides are sufficiently separated that you can swap out where you're reading from without having to untangle reads from the writes.
Ultimately the most important thing is to know why you are making the architectural decisions you're making in your scenario - then you will find it much easier to make these sorts of decisions one way or another.
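The resolver contract described above can be sketched in a few lines of JavaScript. Everything here is an assumption made for illustration: makeUserIdResolver, makeNotificationAddHandler, the Map of users, and the array standing in for the repository. The point is only that the command handler depends on a small write-side contract rather than on the read-side IQuery.

```javascript
// Write-side contract: resolves a user id from a user name.
// Backed here by a hypothetical in-memory map instead of a database.
function makeUserIdResolver(usersByName) {
  return {
    byName(userName) {
      if (!usersByName.has(userName)) {
        throw new Error(`Unknown user: ${userName}`);
      }
      return usersByName.get(userName);
    },
  };
}

// Command handler that depends on the resolver contract, not on the read side.
function makeNotificationAddHandler(notificationsRepo, userIdResolver) {
  return function handle(command) {
    const userId = userIdResolver.byName(command.userName);
    for (const receiverId of command.receivers) {
      notificationsRepo.push({
        userId,
        targetId: receiverId,
        body: command.text,
      });
    }
  };
}

const repo = []; // stand-in for the notification repository
const handle = makeNotificationAddHandler(
  repo,
  makeUserIdResolver(new Map([["alice", 42]]))
);
handle({ userName: "alice", receivers: [7, 8], text: "Please review" });
```

Swapping the Map-backed resolver for one that hits the write database changes nothing in the handler, and the read side can be moved or denormalized without touching either.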
In a project I'm working on I have similar issues. I see 3 options to solve this problem:
1) What I did is make a UserCommandRepository that has a query option. Then you would inject that repository into your service.
Since the few queries I needed were so simplistic (just returning single values), it seemed like a fine tradeoff in my case.
2) Another way of handling it is by forcing the user to just raise a command with the user_id. Then you can let him do the querying.
3) A third option is to ask yourself why you need a user_id. If it's to make some relations when querying the data, you could also have this handled when querying the data (or when propagating your writeDB to your readDB).

Purely functional feedback suppression?

I have a problem that I can solve reasonably easily with classic imperative programming using state: I'm writing a co-browsing app that shares URLs between several nodes. The program has a module for communication that I call link and one for browser handling that I call browser. Now when a URL arrives in link, I use the browser module to tell the actual web browser to start loading the URL.
The actual browser will trigger the navigation detection that the incoming URL has started to load, and hence will immediately be presented as a candidate for sending to the other side. That must be avoided, since it would create an infinite loop of link-following to the same URL, along the line of the following (very conceptualized) pseudo-code (it's Javascript, but please consider that a somewhat irrelevant implementation detail):
actualWebBrowser.urlListen.gotURL(function(url) {
    // Browser delivered an URL
    browser.process(url);
});
link.receivedAnURL(function(url) {
    actualWebBrowser.loadURL(url); // will eventually trigger above listener
});
What I did first was to store every incoming URL in browser and simply eat the URL immediately when it arrives, then remove it from a 'received' list in browser, along the lines of this:
browser.recents = {} // <--- mutable state
browser.recentsExpiry = 40000;
browser.doSend = function(url) {
    var now = (new Date).getTime();
    link.sendURL(url); // <-- URL goes out on the network
    // Side-effect, mutating module state, clumsy clean up mechanism :(
    browser.recents[url] = now;
    setTimeout(function() { delete browser.recents[url] }, browser.recentsExpiry);
    return true;
}
browser.process = function(url) {
    if(/* sanity checks on `url`*/) {
        var now = (new Date).getTime();
        var duplicate = browser.recents[url]; // time of the last send, if any
        if(! duplicate) return browser.doSend(url);
        if((now - duplicate) > browser.recentsExpiry) {
            return browser.doSend(url);
        }
        return false;
    }
}
It works but I'm a bit disappointed by my solution because of my habitual use of mutable state in browser. Is there a "Better Way (tm)" using immutable data structures/functional programming or the like for a situation like this?
A more functional approach to handling long-lived state is to use it as a parameter to a recursive function, and have one execution of the function responsible for handling a single "action" of some kind, then calling itself again with the new state.
F#'s MailboxProcessor is one example of this kind of approach. However it does depend on having the processing happen on an independent thread which isn't the same as the event-driven style of your code.
As you identify, the setTimeout in your code complicates the state management. One way you could simplify this out is to instead have browser.process filter out any timed-out URLs before it does anything else. That would also eliminate the need for the extra timeout check on the specific URL it is processing.
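A minimal sketch of that idea, assuming the state is passed in and returned rather than stored on the module: processUrl is a hypothetical pure function (the name and signature are not from the question) that first filters out timed-out entries and then decides whether the URL should be sent. The current time is a parameter as well, so there is no setTimeout and no hidden clock.

```javascript
// Pure function: given the previous state and an incoming URL, returns the
// next state and whether the URL should be sent. The input map is not mutated.
function processUrl(recents, url, now, expiryMs) {
  // Drop timed-out entries first, so no separate timer is needed.
  const live = new Map(
    [...recents].filter(([, sentAt]) => now - sentAt <= expiryMs)
  );
  if (live.has(url)) {
    return { recents: live, send: false };
  }
  return { recents: new Map(live).set(url, now), send: true };
}

const EXPIRY_MS = 40000;
const s0 = new Map();

// First sighting: should be sent.
const first = processUrl(s0, "http://example.com/", 0, EXPIRY_MS);
// Echo one second later: suppressed.
const echo = processUrl(first.recents, "http://example.com/", 1000, EXPIRY_MS);
// Same URL after the expiry window: sent again.
const later = processUrl(echo.recents, "http://example.com/", 50000, EXPIRY_MS);
```

The event handler still has to hold on to the latest returned map somewhere, so one mutable reference remains, but it is a single assignment at the edge of the program instead of mutation spread across doSend, process, and a timer callback.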
Even if you can't eliminate mutable state from your code entirely, you should think carefully about the scope and lifetime of that state.
For example, might you want multiple independent browsers? If so, you should think about how the recents set can be encapsulated to belong to a single browser, so that you don't get collisions. Even if you don't need multiple ones for your actual application, this might help testability.
There are various ways you might keep the state private to a specific browser, depending in part on what features the language has available. For example in a language with objects a natural way would be to make it a private member of a browser object.
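In JavaScript, one way to get that kind of privacy without classes is a closure. The sketch below assumes a hypothetical makeBrowser factory that takes the send function and expiry as parameters; neither the name nor the signature comes from the question. Each call creates its own recents map that nothing outside the returned object can reach.

```javascript
// Factory: each browser gets its own private `recents` map via closure scope.
function makeBrowser(sendUrl, expiryMs) {
  const recents = new Map(); // not reachable from outside the factory

  return {
    process(url, now) {
      const sentAt = recents.get(url);
      if (sentAt !== undefined && now - sentAt <= expiryMs) {
        return false; // recently handled, suppress the echo
      }
      recents.set(url, now);
      sendUrl(url);
      return true;
    },
  };
}

const sentA = [];
const sentB = [];
const browserA = makeBrowser((url) => sentA.push(url), 40000);
const browserB = makeBrowser((url) => sentB.push(url), 40000);

browserA.process("http://example.com/", 0);
browserA.process("http://example.com/", 1000); // suppressed by A's own state
browserB.process("http://example.com/", 1000); // B has independent state
```

Passing sendUrl and the clock value in as arguments is also what makes this easy to test: no network and no real timers are involved.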

Update document in Meteor mini-mongo without updating server collections

In Meteor, I've got a collection that the client subscribes to. In some cases, instead of publishing the documents that exist in the collection on the server, I want to send down some bogus data. That's fine using the this.added function in the publish.
My problem is that I want to treat the bogus doc as if it were a real document; specifically, this gets troublesome when I want to update it. For the real docs I run a RealDocs.update, but when doing that on the bogus doc it fails, since there is no representation of it on the server (and I'd like to keep it that way).
A collection API that allowed me to pass something like local = true would be fantastic, but I have no idea how difficult that would be to implement and I'm not too fond of modifying the core code.
Right now I'm stuck with creating a BogusDocs = new Meteor.Collection(null), but that makes populating the collection more difficult, since I have to either hard-code fixtures in the client code or use a method to get the data from the server, and I have to make sure I call BogusDocs.update instead of RealDocs.update as soon as I'm dealing with bogus data.
Maybe I could actually insert the data on the server and make sure it's removed later, but the data really has nothing to do with the server side collection so I'd rather avoid that.
Any thoughts on how to approach this problem?
After some further investigation (the evented mind site) it turns out that one can modify the local collection without making calls to the server. This is done by running the same methods as you usually would, but on MyCollection._collection instead of on MyCollection itself. MyCollection.update() would thus become MyCollection._collection.update(). So, using a simple wrapper, one can pass in the usual arguments to an update call to update the collection as usual (which will try to call the server, which in turn will trigger your allow/deny rules), or add 'local' as the last argument to only perform the update in the client collection. Something like this should do it.
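DocsUpdateWrapper = function() {
    // `arguments` is array-like, not an Array, so convert it before slicing
    var args = Array.prototype.slice.call(arguments);
    var lastIndex = args.length - 1;
    if (args[lastIndex] === 'local') {
        // Client-only update: never reaches the server or the allow/deny rules
        Docs._collection.update.apply(Docs._collection, args.slice(0, lastIndex));
    } else {
        Docs.update.apply(Docs, args);
    }
}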
(This could of course be extended to a DocsWrapper that allows for insertions and removals too.) (I haven't tried this function yet, but it should serve well as an example.)
The biggest benefit of this is imo that we can use the exact same calls to retrieve documents from the local collection, regardless of if they are local or living on the server too. By adding a simple boolean to the doc we can keep track of which documents are only local and which are not (An improved DocsWrapper could check for that bool so we could even omit passing the 'local' argument.) so we know how to update them.
There are some people working on local storage in the browser
https://github.com/awwx/meteor-browser-store
You might be able to adapt some of their ideas to provide "fake" documents.
I would use the transform feature on the collection to make an object that knows what to do with itself (on the client). Give it the correct update method (real/bogus), then call .update on it rather than a general one.
You can put the code from this.added into the transform process.
You can also set up a local minimongo collection. Insert on callback
#FoundAgents = new Meteor.Collection(null, Agent.transformData )
FoundAgents.remove({})
Meteor.call 'Get_agentsCloseToOffer', me, ping, (err, data) ->
  if err
    console.log JSON.stringify err,null,2
  else
    _.each data, (item) ->
      FoundAgents.insert item
Maybe this is interesting for you as well: I created two examples with native Meteor Local Collections at meteorpad. The first pad shows an example with a plain reactive recordset: Sample_Publish_to_Local-Collection. The second uses the collection .observe method to listen to data: Collection.observe().
