Protobuf 3 breaks contract additivity - gRPC

I'm using Protobuf 3 along with gRPC in a distributed environment ("microservices").
Because Protobuf 3 has no support for not-set/missing values, I ran into the following issue with contract additivity.
Imagine I have Service A and a couple of consumer services, B and C, owned by Team B and Team C.
If I add a field, say a boolean, to Service A's contract, it will initially hold the default value, which gets written, say, to the database as-is.
Then Team B updates their service to talk using the updated contract and passes 'true' as the field value.
Then Team C, still on the old contract, calls the same service - and the value gets reset to false. But Team C didn't mean that; moreover, they weren't aware of the field at all.
Thus Service A cannot extend its contract at all, because consumers that haven't been updated yet, for whatever reason, can corrupt data, and Service A can do nothing about it.
In Thrift such things are handled with a single check (.isSet()).
There are dirty workarounds, like wrapping primitives into objects, but that forces library-implementation-specific checks-by-reference (at least in Java), which seems more like a poor hack than a robust solution. And eventually I'd have to wrap everything in wrappers, which, as you can imagine, is not a great solution either.
What best practices do you use to manage such situations with Protobuf 3 in 2017? How do you manage/coordinate contract updates between teams/services? Thanks
Note: this question is not exactly about how to detect not-set/missing values, but rather about how to live with that and follow the Protobuf 3 philosophy.

I think the problem here is that trying to check for field presence this way is not really an idiomatic use of protocol buffers (not even in proto2). It sounds like you are trying to evolve your schema by adding new fields but not reading those new fields unless you're sure they came from an updated client. The idiomatic way is to do this instead: just make sure the defaults for the new fields are reasonable and maintain compatible behavior if they're not explicitly set. Then don't try to check for presence--just read the fields and older clients will get good default behavior.
To give you an example, let's say you're adding a new feature that can be enabled or disabled. The right way to do this would be to add a bool field to your request message called enable_new_feature. Since older clients don't know about this field, it will default to false in their requests, so they get the old behavior they're expecting. Adding a disable_new_feature field instead would probably be the wrong way to do it, because then you would indeed break older clients by enabling something they didn't want.
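As a minimal sketch of the Java side, assuming a hypothetical message generated by protoc from "message FeatureRequest { bool enable_new_feature = 1; }":

// A request built by an old client that knows nothing about the new field:
FeatureRequest fromOldClient = FeatureRequest.newBuilder().build();

// Reading the field simply yields the proto3 default (false), which by
// design preserves the old behavior; no presence check is needed:
boolean enabled = fromOldClient.getEnableNewFeature();  // false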

Using oneof looks like a better/cleaner alternative to wrappers. See this answer to a similar question: https://stackoverflow.com/a/40552570/618259
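For reference, a rough sketch of what oneof-based presence detection looks like in generated Java code, assuming a hypothetical message like "message UpdateRequest { oneof flag { bool value = 1; } }" (applyFlag is a made-up handler):

// protoc generates a *Case enum for each oneof, including a NOT_SET constant:
switch (request.getFlagCase()) {
  case VALUE:
    applyFlag(request.getValue());  // the client explicitly set the field
    break;
  case FLAG_NOT_SET:
    break;  // old client: leave the stored value untouched
}

An unaware old client produces FLAG_NOT_SET, so Service A can distinguish "explicitly false" from "never set" without wrapper objects.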

Related

Elixir - Changing default behavior of actors

I am currently looking into the behavioral reflection possibilities of Elixir, a language I'm pretty new to. Specifically, I am searching for a way to change the default behavior of actors within Elixir, or to install hooks into this behavior. As a classic example, let's say I want to log messages when they are sent from one actor, arrive in the other actor, and are actually processed. Could I change/override/hook into the default behavior of these hypothetical 'send', 'arrive' and 'process' methods in order to add this functionality as 'default behavior'? How would this be done?
The logging example above is just an example. I am looking to use the approach behind this example in a broader context (e.g. changing the default mailbox behavior).

How does one expose constants in a Java Google App Engine Endpoints API?

Simple question -- how do you expose constants in a Java Google App Engine Endpoints API?
e.g.
public static final int CODE_FOO = 3845;
I'd like the client of the Endpoints to be able to match on CODE_FOO rather than on 3845. I'll end up doing enum wrappers (which is probably better anyway), but I'm just starting to be curious whether this is even doable. Thanks
Note that this isn't a full answer, but here is a workaround: in Android Studio, create a very lightweight "common" Java project and put anything you want to keep in sync there, such as constants, as well as common types that you want exposed (e.g. an enum representing all possible return/error codes, etc.).
This way you should get pretty decent compile-time safety and keep these guys in sync.
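For illustration, the shared module can be as small as a single final class (all names here are made up):

// Lives in the shared "common" project:
public final class ApiCodes {
    public static final int CODE_FOO = 3845;

    private ApiCodes() {}  // constants only, never instantiated
}

Both the Endpoints backend and the client then depend on that project, so client code can match on ApiCodes.CODE_FOO instead of a bare 3845.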
Please feel free to comment if anyone has better suggestions.
This is unfortunately a Law of Information (ahem). If you have a message protocol you defined, both sides of the interaction need to be aware of the messages that could be passed. There's no other way for the client to be aware of what it needs to respond to. Ajax libraries hard-code the number "200" to be able to detect a successful request, as one example.
Yes, just use a switch statement on strings inside your client code. Or you could use a dictionary of strings pointing to functions and simply call the function after looking it up with the string you got.
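A small Java sketch of the dictionary approach (the handler methods and code names are hypothetical):

import java.util.Map;

// Dispatch table mapping code names to handlers:
Map<String, Runnable> handlers = Map.of(
    "CODE_FOO", () -> handleFoo(),
    "CODE_BAR", () -> handleBar());

// De-reference the dictionary with the string you received and run it:
handlers.getOrDefault(received, () -> handleUnknown()).run();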

When would you use 'Real' translation messages in Symfony2?

The Symfony documentation says:
Using Real or Keyword Messages
This example illustrates the two different philosophies when creating messages to be translated:
$translated = $translator->trans('Symfony2 is great');
$translated = $translator->trans('symfony2.great');
< snip >
The choice of which method to use is entirely up to you, but the "keyword" format is often recommended.
http://symfony.com/doc/current/book/translation.html
So when would you use 'Real' messages?
You really have to decide for yourself. It's partly a matter of taste and partly a matter of your translation workflow.
Real messages are good when you don't want the overhead of maintaining an additional translation file for the origin language. Furthermore, if you forget to translate some of the messages, you'd still see a valid message in the origin language. It's also somewhat easier to translate from an original message than from a keyword.
Keywords are better when messages are changing often, especially with long texts. You abstract away the purpose of a message from the actual text.
EDIT: there's one more scenario where you could argue that real messages are better than keys - when your website supports only one language, but with multiple variations, like en_GB and en_US. Most of the messages will be the same; only a few will vary. So most of the messages could be left as they are, and only the ones that actually differ between GB and US put into translation files. That would require much less work than an approach using keys (assuming, of course, that your messages don't change very often).
One use case for the real format I could come up with is when messages are created by users via the UI; it would be silly to force them to come up with keywords for each phrase they want to translate.
I haven't had such a need yet, so I always use the keyword format.
For the most part I agree with @Jakub Zalas' answer; however, the last line is a bit off.
Keywords are better whenever messages may change - not just when they change often. This is outlined in the docs themselves:
The second method is handy because the message key won't need to be changed in every translation file if you decide that the message should actually read "Symfony2 is really great" in the default locale.
If the message changes and you used the message itself as the key, you have to change every piece of code using that message to reflect the change. More places to change means more potential bugs. Message keys give us leverage against that.
Real messages offer little benefit. IMO you can use them if you are sure your application will always be mono-language and you want to save a few minutes of development time.
Keyword translations have the advantage that, if you have to translate your website, you'll see immediately when a translation is missing.
To facilitate translations, I personally use JMSTranslationBundle.

node_load or direct query?

What rule of thumb do you use when deciding between node_load() and a direct db_query()?
In a situation I'm looking at right now, I need to get some node data and resolve data on two nodereference fields. So that would be 3 calls to node_load(). At some point here, would it be more efficient to construct the query with joins directly?
This is for use in a self contained module that won't be distributed or used anywhere else, so I don't believe I need to worry about subverting node modification hooks (or do I?).
Edit:
Thinking about my question more: node_load() is only really applicable when you have one node to grab (and then maybe drilling down further into nodereferences, like in my example). But as soon as you need to return more than one node based on some criteria, you're pretty much forced to use db_query, right? Does Drupal have any abstracted API for writing queries like this?
Not a full answer (not sure myself), just some hints.
node_load() uses a static cache (in Drupal 7, you can even use the entity_cache module to make it a permanent cache). If the nodes you are loading are used a second time on the same page, that call will be free.
Querying CCK tables is tricky. The schema structure can change completely depending on configuration, for example whether a field uses single or multiple values.
The reasoning behind using API methods for DB calls rather than direct DB calls is to provide a DB abstraction layer, so that your app could move between supported database engines, and so that it gracefully handles any schema changes (however unlikely) that core or a module may make to the tables in question. It's also likely easier, as @Berdir says, for CCK fields and node reference fields, but that depends on whether you are more confident with the Drupal API and PHP or with MySQL... The payoff of doing it the Drupal way is increased future productivity and a better understanding of the codebase and what is possible :)
Oh, and my rule of thumb is: do it the Drupal way if at all possible ("possible" being variable depending on your app's time/cost/performance/whatever requirements).

REST - Modify Part of Resource - PUT or POST

I'm seeing a good bit of hand-waving on the subject of how to update only part of a resource (e.g. a status indicator) using REST.
The options seem to be:
Complain that HTTP doesn't have a PATCH or MODIFY command. However, the accepted answer on "HTTP MODIFY verb for REST?" does a good job of showing why that's not as good an idea as it might seem.
Use POST with parameters and identify a method (e.g. a parameter named "action"). Some suggestions are to specify an X-HTTP-Method-Override header with a self-defined method name. That seems to lead to the ugliness of switching within the implementation based on what you're trying to do, and to be open to the criticism of not being a particularly RESTful way to use POST. In fact, taking this approach starts to feel like an RPC-type interface.
Use PUT to over-write a sub-resource of the resource which represents the specific attribute(s) to update. In fact, this is effectively an over-write of the sub-resource, which seems in line with the spirit of PUT.
At this point, I see #3 as the most reasonable option.
Is this a best practice or an anti-pattern? Are there other options?
There are two ways to view a status update.
Update to a thing. That's a PUT. Option 3
Adding an additional log entry to the history of the thing. The last item in this sequence of log entries is the current status. That's a POST. Option 2.
If you're a data warehousing or functional programming type, you tend to be mistrustful of status changes, and like to POST a new piece of historical fact to a static, immutable thing. This does require distinguishing the thing from the history of the thing; leading to two tables.
Otherwise, you don't mind an "update" to alter the status of a thing, and you're happy with a PUT. This does not distinguish between the thing and its history, and keeps everything in one table.
Personally, I'm finding that I'm less and less trusting of mutable objects and PUTs (except for "error correction"). (And even then, I think the old thing can be left in place and the new thing added with a reference to the previous version of itself.)
If there's a status change, I think there should be a status log or history and there should be a POST to add a new entry to that history. There may be some optimization to reflect the "current" status in the object to which this applies, but that's just behind-the-scenes optimization.
Option 3 (PUT to some separated sub-resource) is your best bet right now, and it wouldn't necessarily be "wrong" to just use POST on the main resource itself - although you could disagree with that depending on how pedantic you want to be about it.
Stick with 3 and use more granular sub-resources, and if you really do have a need for PATCH-like behavior - use POST. Personally, I will still use this approach even if PATCH does actually end up as a viable option.
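To make option 3 concrete, here is a minimal Java 11+ sketch; the URL shape and payload are illustrative, not from the question:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class PutStatus {
    public static void main(String[] args) throws Exception {
        // Overwrite just the status sub-resource of order 42:
        HttpRequest put = HttpRequest.newBuilder()
            .uri(URI.create("https://api.example.com/orders/42/status"))
            .header("Content-Type", "application/json")
            .PUT(HttpRequest.BodyPublishers.ofString("{\"value\": \"shipped\"}"))
            .build();
        HttpResponse<String> resp = HttpClient.newHttpClient()
            .send(put, HttpResponse.BodyHandlers.ofString());
        System.out.println(resp.statusCode());
    }
}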
HTTP does have a PATCH method. It was defined in Section 19.6.1.1 of RFC 2068, and the updated specification, which grew out of draft-dusseault-http-patch, has since been published as RFC 5789.
It's OK to POST, emulating PATCH where it's not available.
Before explaining this, it's probably worth mentioning that there's nothing wrong with using POST to do general updates (see here). In particular:
POST only becomes an issue when it is used in a situation for which some other method is ideally suited: e.g., retrieval of information that should be a representation of some resource (GET), complete replacement of a representation (PUT).
Really we should be using PATCH to make small updates to complex resources, but it isn't as widely available as we'd like. We can emulate PATCH by using an additional attribute as part of a POST.
Our service needs to be open to third-party products such as SAP, Flex, Silverlight, Excel etc. That means that we have to use the lowest common denominator technology - for a while we weren't able to use PUT because only GET and POST were supported across all the client technologies.
The approach that I've gone with is to include "_method=patch" as part of a POST request. The benefits are:
(a) It's easy to deal with on the server side - we're basically pretending that PATCH is available.
(b) It indicates to third parties that we are not violating REST but working around a limitation of the browser. It's also consistent with how PUT was handled a few years back by the Rails community, so it should be comprehensible to many.
(c) It's easy to replace when PATCH becomes more widely available.
(d) It's a pragmatic response to an awkward problem.
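As a sketch of what such a request looks like from a Java 11+ client (the endpoint and payload are hypothetical):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class PatchViaPost {
    public static void main(String[] args) throws Exception {
        // Tunnel PATCH semantics through POST; the server inspects _method:
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("https://api.example.com/orders/42?_method=patch"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString("{\"status\": \"shipped\"}"))
            .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
    }
}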
PATCH is fine when you have a proper patch or diff format to send; until then it's not very useful at all.
As for your solution 2 with a custom method, be it in the request or in the headers, no no no no and no, it's awful :)
The only valid ways are to PUT the whole resource with the sub-data modified, to POST to that resource, or to PUT to a sub-resource.
It all depends on the granularity of your resources and the intended consequences on caching.
A bit late with an answer, but I would consider using JSON Patch for scenarios like this.
At its core, it requires two copies of the resource (the original and the modified) and performs a diff on them. The outcome of the diff is an array of patch operations describing the difference.
An example of this:
[
{ "op": "replace", "path": "/baz", "value": "boo" },
{ "op": "add", "path": "/hello", "value": ["world"] },
{ "op": "remove", "path": "/foo" }
]
There are many client libraries that can do the heavy lifting of generating and applying these patch documents.
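Where the server does accept PATCH directly, a patch document like the one above is sent with the application/json-patch+json media type from RFC 6902. A minimal Java 11+ sketch (the endpoint is hypothetical; java.net.http permits the PATCH method name directly):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SendJsonPatch {
    public static void main(String[] args) throws Exception {
        String patchJson = "[{\"op\": \"replace\", \"path\": \"/baz\", \"value\": \"boo\"}]";
        HttpRequest patch = HttpRequest.newBuilder()
            .uri(URI.create("https://api.example.com/docs/1"))
            .header("Content-Type", "application/json-patch+json")
            .method("PATCH", HttpRequest.BodyPublishers.ofString(patchJson))
            .build();
        HttpResponse<String> resp = HttpClient.newHttpClient()
            .send(patch, HttpResponse.BodyHandlers.ofString());
        System.out.println(resp.statusCode());
    }
}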
