gRPC: Random CANCELLED exception on RPC calls

I'm occasionally getting cancellation errors when calling gRPC methods.
Here's my client-side code (Using grpc-java 1.22.0 library):
public class MyClient {
    private static final Logger logger = LoggerFactory.getLogger(MyClient.class);
    private ManagedChannel channel;
    private FooGrpc.FooStub fooStub;
    private final StreamObserver<Empty> responseObserver = new StreamObserver<>() {
        @Override
        public void onNext(Empty value) {
        }
        @Override
        public void onError(Throwable t) {
            logger.error("Error: ", t);
        }
        @Override
        public void onCompleted() {
        }
    };
    public MyClient() {
        this.channel = NettyChannelBuilder
            .forAddress(host, port)
            .sslContext(GrpcSslContexts.forClient().trustManager(certStream).build())
            .build();
        var pool = Executors.newCachedThreadPool(
            new ThreadFactoryBuilder().setNameFormat("foo-pool-%d").build());
        this.fooStub = FooGrpc.newStub(channel)
            .withExecutor(pool);
    }
    public void callFoo() {
        fooStub.withDeadlineAfter(500L, TimeUnit.MILLISECONDS)
            .myMethod(whatever, responseObserver);
    }
}
When I invoke the callFoo() method, it usually works: the client sends a message and the server receives it without a problem.
But occasionally this call gives me an error:
io.grpc.StatusRuntimeException: CANCELLED: io.grpc.Context was cancelled without error
at io.grpc.Status.asRuntimeException(Status.java:533) ~[handler-0.0.1-SNAPSHOT.jar:?]
at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:442) [handler-0.0.1-SNAPSHOT.jar:?]
at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39) [handler-0.0.1-SNAPSHOT.jar:?]
at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23) [handler-0.0.1-SNAPSHOT.jar:?]
at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40) [handler-0.0.1-SNAPSHOT.jar:?]
at io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:700) [handler-0.0.1-SNAPSHOT.jar:?]
at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39) [handler-0.0.1-SNAPSHOT.jar:?]
at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23) [handler-0.0.1-SNAPSHOT.jar:?]
at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40) [handler-0.0.1-SNAPSHOT.jar:?]
at io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:399) [handler-0.0.1-SNAPSHOT.jar:?]
at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:507) [handler-0.0.1-SNAPSHOT.jar:?]
at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:66) [handler-0.0.1-SNAPSHOT.jar:?]
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:627) [handler-0.0.1-SNAPSHOT.jar:?]
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$700(ClientCallImpl.java:515) [handler-0.0.1-SNAPSHOT.jar:?]
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:686) [handler-0.0.1-SNAPSHOT.jar:?]
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:675) [handler-0.0.1-SNAPSHOT.jar:?]
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) [handler-0.0.1-SNAPSHOT.jar:?]
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) [handler-0.0.1-SNAPSHOT.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:834) [?:?]
The weird thing is that even though the call fails on the client side, the server usually does receive the request. Sometimes, though, the server misses it.
It is not even a DEADLINE_EXCEEDED exception; it just throws CANCELLED: io.grpc.Context was cancelled without error. No other description is provided, so I cannot figure out why this is happening.
To summarize:
The gRPC call from the client randomly gives a CANCELLED error.
When the error happens, the server sometimes gets the call and sometimes does not.

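grpc-java supports automatic deadline and cancellation propagation. When an inbound RPC causes outbound RPCs, those outbound RPCs inherit the inbound RPC's deadline. Also, if the inbound RPC is cancelled, the outbound RPCs are cancelled too. The message "io.grpc.Context was cancelled without error" suggests that this is what happens here: callFoo() is probably invoked while handling an inbound RPC, and the outbound call is cancelled once that inbound RPC finishes.
This is implemented via io.grpc.Context. If you make an outbound RPC that you want to outlive the inbound RPC, use Context.fork():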
public void myRpcMethod(Request req, StreamObserver<Response> observer) {
    // ctx has all the values of the current context, but
    // won't be cancelled
    Context ctx = Context.current().fork();
    // Set ctx as the current context within the Runnable
    ctx.run(() -> {
        // Can start asynchronous work here that will not
        // be cancelled when myRpcMethod returns
    });
    observer.onNext(generateReply());
    observer.onCompleted();
}

Related

gRPC onComplete for bidistream

All the gRPC bidi-streaming examples I have seen follow the same pattern: when the (inbound) requestObserver receives onCompleted, it invokes the onCompleted method of the (outbound) responseObserver. However, this is not done for onError.
I am wondering what happens if I don't invoke responseObserver.onCompleted(): does it lead to a memory leak? And why don't we do it for onError?
public StreamObserver<Point> recordRoute(final StreamObserver<RouteSummary> responseObserver) {
    return new StreamObserver<Point>() {
        @Override
        public void onNext(Point point) {
            // does something here
        }
        @Override
        public void onError(Throwable t) {
            logger.log(Level.WARNING, "recordRoute cancelled");
        }
        @Override
        public void onCompleted() {
            responseObserver.onCompleted();
        }
    };
}
Wondering what happens if I don't invoke responseObserver.onCompleted(): does it lead to a memory leak?
An RPC is not done until the response stream is also completed, so yes, there will be a resource leak if you never call responseObserver.onCompleted(). In this particular example the response stream happens to be terminated as soon as the request stream completes, but there could be cases where the response stream is completed only after more processing is done or more data is sent on it.
Why don't we do it for onError?
onError() on the request observer signals a terminating error on the stream, which means the call is already terminated. Calling onError() on the response stream is therefore not needed and most probably won't do anything.

Does the Flink sink only support blocking I/O (BIO)?

The invoke method of a sink seems to offer no way to do async I/O, e.g. by returning a Future.
For example, the Redis connector uses the Jedis library to execute Redis commands synchronously:
https://github.com/apache/bahir-flink/blob/master/flink-connector-redis/src/main/java/org/apache/flink/streaming/connectors/redis/RedisSink.java
So it blocks the Flink task thread while waiting for the network response from the Redis server for every command? Can other operators run in the same thread as the sink? If so, would it block them too?
I know Flink has an async I/O API, but it does not seem to be meant for sink implementations:
https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/asyncio.html
As @Dexter mentioned, you can use RichAsyncFunction. Here is some sample code (it may need further changes to make it work):
AsyncDataStream.orderedWait(ds, new RichAsyncFunction<Tuple2<String, MyEvent>, String>() {
    private transient RedisClient client;
    private transient RedisAsyncCommands<String, String> commands;
    private transient ExecutorService executor;
    @Override
    public void open(Configuration parameters) throws Exception {
        super.open(parameters);
        client = RedisClient.create("redis://localhost");
        commands = client.connect().async();
        executor = Executors.newFixedThreadPool(10);
    }
    @Override
    public void close() throws Exception {
        // Shut down the connection and thread pool.
        client.shutdown();
        executor.shutdown();
        super.close();
    }
    @Override
    public void asyncInvoke(Tuple2<String, MyEvent> input, final AsyncCollector<String> collector) throws Exception {
        // e.g. get something from Redis asynchronously
        final RedisFuture<String> future = commands.get("key");
        future.thenAccept(new Consumer<String>() {
            @Override
            public void accept(String value) {
                // Use the value handed to the callback; calling future.get() here
                // is redundant and throws checked exceptions accept() cannot declare.
                collector.collect(Collections.singletonList(value));
            }
        });
    }
}, 1000, TimeUnit.MILLISECONDS);
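The callback shape used in asyncInvoke can be tried without Flink or Redis. The following is only a minimal sketch using the JDK's CompletableFuture: lookup(), asyncInvoke() and the Consumer used as a collector are hypothetical stand-ins, not Flink or Lettuce APIs. It shows that the calling thread only registers a callback, and the result is handed to the collector later from a pool thread.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class AsyncCallbackDemo {
    /** Hypothetical stand-in for an asynchronous client call such as commands.get(key). */
    static CompletableFuture<String> lookup(String key, Executor executor) {
        return CompletableFuture.supplyAsync(() -> "value-of-" + key, executor);
    }

    /** Registers a callback and returns immediately, without waiting for the result. */
    static void asyncInvoke(String key, Executor executor, Consumer<List<String>> collector) {
        lookup(key, executor)
            .thenAccept(value -> collector.accept(Collections.singletonList(value)));
    }

    /** Runs one lookup and returns what the collector received. */
    public static List<String> demo() {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            CompletableFuture<List<String>> collected = new CompletableFuture<>();
            asyncInvoke("k", executor, collected::complete);
            // Blocking here is only for the demo, so the collected result can be returned.
            return collected.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [value-of-k]
    }
}
```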

Spring Kafka: KafkaListener consumer.seek issue

We are using a Spring @KafkaListener that acknowledges each record after it has been written to the DB. If writing to the DB fails, we don't acknowledge the record, so that the offset is not committed for the consumer. This works fine. Now we want to get the failed messages in the next poll so we can retry them. We added an error handler to our listener, implemented as a ConsumerAwareListenerErrorHandler, and tried to call consumer.seek() with the failed message's offset. We expected the next poll to return the failed messages, but this is not happening: the next poll fetches only new messages. A code snippet is given below.
@Service
public class KafkaConsumer {
    @KafkaListener(topics = ("${kafka.input.stream.topic}"), containerFactory = "kafkaManualAckListenerContainerFactory", errorHandler = "listen3ErrorHandler")
    public void onMessage(ConsumerRecord<Integer, String> record,
            Acknowledgment acknowledgment) throws Exception {
        try {
            msg = JaxbUtil.convertJsonStringToMsg(record.value());
            onHandList = DCMUtil.convertMsgToOnHandDTO(msg);
            TeradataDAO.updateData(onHandList);
            acknowledgment.acknowledge();
            recordSuccess = true;
            LOGGER.info("Message Saved in Teradata DB");
        } catch (Exception e) {
            LOGGER.error("Error Processing On Hand Data ", e);
            recordSuccess = false;
        }
    }
    @Bean
    public ConsumerAwareListenerErrorHandler listen3ErrorHandler() throws InterruptedException {
        return (message, exception, consumer) -> {
            this.listen3Exception = exception;
            MessageHeaders headers = message.getHeaders();
            consumer.seek(new org.apache.kafka.common.TopicPartition(
                    headers.get(KafkaHeaders.RECEIVED_TOPIC, String.class),
                    headers.get(KafkaHeaders.RECEIVED_PARTITION_ID, Integer.class)),
                    headers.get(KafkaHeaders.OFFSET, Long.class));
            return null;
        };
    }
}
Container Class
@Bean
public Map<Object,Object> consumerConfigs() {
    Map<Object,Object> props = new HashMap<Object,Object>();
    // The bootstrap server address must be a String literal.
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,
            "localhost:9092");
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
            StringDeserializer.class);
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
            StringDeserializer.class);
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "example-1");
    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
    return props;
}
@SuppressWarnings({ "rawtypes", "unchecked" })
@Bean
public ConsumerFactory consumerFactory() {
    return new DefaultKafkaConsumerFactory(consumerConfigs());
}
@SuppressWarnings("unchecked")
@Bean
KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<Integer, String>>
        kafkaManualAckListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<Integer, String> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    factory.getContainerProperties().setAckMode(AckMode.MANUAL);
    return factory;
}
It's supposed to work like this:
The error handler needs to throw an exception if you want to discard additional records from the previous poll.
Since you are "handling" the error, the container knows nothing and will continue to call the listener with the remaining records from the poll.
That said, I see that the container is also ignoring an exception thrown by the error handler (it will discard the remaining records only if the error handler throws an Error, not an Exception). I will open an issue for this.
Another workaround would be to add the Consumer to the listener method signature and do the seek there (and throw an exception). If there is no error handler, the rest of the batch is discarded.
Correction
If the container has no ErrorHandler, any Throwable thrown by a ListenerErrorHandler will cause the remaining records to be discarded.
Please try using SeekToCurrentErrorHandler. The doc says "This allows implementations to seek all unprocessed topic/partitions so the current record (and the others remaining) will be retrieved by the next poll. The SeekToCurrentErrorHandler does exactly this.
The container will commit any pending offset commits before calling the error handler."
https://docs.spring.io/autorepo/docs/spring-kafka-dist/2.1.0.BUILD-SNAPSHOT/reference/htmlsingle/#_seek_to_current_container_error_handlers
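Why the rest of the batch has to be discarded after a seek can be seen in a toy model. The sketch below does not use the Kafka or Spring Kafka APIs: ToyConsumer, run() and the batch size of 3 are made up for illustration, and only mimic poll() and seek() on a single partition. A record fails once, the code seeks back to its offset, and then either drops the remaining records of the current batch or keeps processing them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SeekRetryDemo {
    /** Toy stand-in for a consumer: a log of records plus a read position. Not the Kafka API. */
    static class ToyConsumer {
        private final List<String> log;
        private int position = 0;

        ToyConsumer(List<String> log) { this.log = log; }

        /** Returns up to max offsets starting at the current position and advances past them. */
        List<Integer> poll(int max) {
            List<Integer> offsets = new ArrayList<>();
            while (position < log.size() && offsets.size() < max) {
                offsets.add(position++);
            }
            return offsets;
        }

        /** Moves the read position so the next poll starts at the given offset. */
        void seek(int offset) { position = offset; }

        String valueAt(int offset) { return log.get(offset); }
    }

    /**
     * Processes the log in batches of 3. The record equal to failOnce fails the first
     * time it is seen, and the consumer seeks back to it. Returns the values in the
     * order in which they were successfully processed.
     */
    public static List<String> run(List<String> log, String failOnce, boolean discardRestOfBatch) {
        ToyConsumer consumer = new ToyConsumer(log);
        List<String> processed = new ArrayList<>();
        boolean alreadyFailed = false;
        List<Integer> batch;
        while (!(batch = consumer.poll(3)).isEmpty()) {
            for (int offset : batch) {
                String value = consumer.valueAt(offset);
                if (value.equals(failOnce) && !alreadyFailed) {
                    alreadyFailed = true;
                    consumer.seek(offset);
                    if (discardRestOfBatch) {
                        break;    // drop the remaining records of this poll
                    }
                    continue;     // keep going with the records already fetched
                }
                processed.add(value);
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList("a", "b", "c", "d");
        System.out.println(run(log, "b", true));  // prints [a, b, c, d]
        System.out.println(run(log, "b", false)); // prints [a, c, b, c, d]
    }
}
```

In this model, seeking without discarding the batch processes "c" before the retried "b" and then processes "c" a second time, which is the kind of reordering and duplication that discarding the remaining records avoids.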

Flink Retrofit not Serializable Exception

I have a Flink Job reading events from a Kafka queue then calling another service if certain conditions are met.
I wanted to use Retrofit2 to call the REST endpoint of that service, but I get a "not serializable" exception. I have several FlatMaps connected in series, and the service call happens in the last FlatMap. The exception I get:
Exception in thread "main"
org.apache.flink.api.common.InvalidProgramException: The
implementation of the RichFlatMapFunction is not serializable. The
object probably contains or references non serializable fields.
...
Caused by: java.io.NotSerializableException: retrofit2.Retrofit$1
...
The way I am initializing retrofit:
RetrofitClient.getClient(BASE_URL).create(NotificationService.class);
And the NotificationService interface
public interface NotificationService {
    @PUT("/test")
    Call<String> putNotification(@Body Notification notification);
}
The RetrofitClient class
public class RetrofitClient {
    private static Retrofit retrofit = null;
    public static Retrofit getClient(String baseUrl) {
        if (retrofit == null) {
            retrofit = new Retrofit.Builder().baseUrl(baseUrl).addConverterFactory(GsonConverterFactory.create())
                .build();
        }
        return retrofit;
    }
}
Please post your Notification class code for more details, but it looks like this answer helps:
java.io.NotSerializableException with "$1" after class
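The "$1" suffix in retrofit2.Retrofit$1 denotes an anonymous inner class, so the non-serializable object is most likely something Retrofit created internally and the function holds in a field. A common way around this kind of error, sketched below under that assumption, is to mark the field transient and create the client only after the function has been deserialized; in a RichFlatMapFunction that would be done in open(Configuration). The sketch uses plain JDK serialization only, and Client, EagerFunction and LazyFunction are hypothetical stand-ins, not Flink or Retrofit classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientFieldDemo {
    /** Hypothetical stand-in for a client that is not Serializable. */
    static class Client {
        String send(String body) { return "sent:" + body; }
    }

    /** Keeps the client in an ordinary field, so serializing the function fails. */
    static class EagerFunction implements Serializable {
        private final Client client = new Client();
        String apply(String body) { return client.send(body); }
    }

    /** Keeps the client in a transient field and builds it in open(), after deserialization. */
    static class LazyFunction implements Serializable {
        private transient Client client;
        void open() { client = new Client(); }
        String apply(String body) { return client.send(body); }
    }

    /** Serializes an object to bytes and reads it back. */
    static Object roundTrip(Object o) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    /** Returns true if the eager variant survives serialization. */
    public static boolean eagerSerializes() {
        try {
            roundTrip(new EagerFunction());
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Ships the lazy variant through serialization, then opens and uses the copy. */
    public static String lazyRoundTrip(String body) {
        try {
            LazyFunction copy = (LazyFunction) roundTrip(new LazyFunction());
            copy.open();
            return copy.apply(body);
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(eagerSerializes());      // prints false
        System.out.println(lazyRoundTrip("hello")); // prints sent:hello
    }
}
```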

Undertow: use Hystrix Observable in HTTP handler

I managed to set up a Hystrix command to be called from an Undertow HTTP handler:
public void handleRequest(HttpServerExchange exchange) throws Exception {
    if (exchange.isInIoThread()) {
        exchange.dispatch(this);
        return;
    }
    RpcClient rpcClient = new RpcClient(/* ... */);
    try {
        byte[] response = new RpcCommand(rpcClient).execute();
        // send the response
    } catch (Exception e) {
        // send an error
    }
}
This works nicely. But now I would like to use the observable feature of Hystrix, calling observe() instead of execute(), to make the code non-blocking.
public void handleRequest(HttpServerExchange exchange) throws Exception {
    RpcClient rpcClient = new RpcClient(/* ... */);
    new RpcCommand(rpcClient).observe().subscribe(new Observer<byte[]>() {
        @Override
        public void onCompleted() {
        }
        @Override
        public void onError(Throwable throwable) {
            exchange.setStatusCode(StatusCodes.INTERNAL_SERVER_ERROR);
            exchange.endExchange();
        }
        @Override
        public void onNext(byte[] body) {
            exchange.getResponseHeaders().add(Headers.CONTENT_TYPE, "text/plain");
            exchange.getResponseSender().send(ByteBuffer.wrap(body));
        }
    });
}
As expected (from reading the docs), the handler returns immediately and, as a consequence, the exchange is ended. When the onNext callback is executed, it fails with an exception:
Caused by: java.lang.IllegalStateException: UT000127: Response has already been sent
at io.undertow.io.AsyncSenderImpl.send(AsyncSenderImpl.java:122)
at io.undertow.io.AsyncSenderImpl.send(AsyncSenderImpl.java:272)
at com.xxx.poc.undertow.DiyServerBootstrap$1$1.onNext(DiyServerBootstrap.java:141)
at com.xxx.poc.undertow.DiyServerBootstrap$1$1.onNext(DiyServerBootstrap.java:115)
at rx.internal.util.ObserverSubscriber.onNext(ObserverSubscriber.java:34)
Is there a way to tell Undertow that the handler is doing IO asynchronously? I expect to use a lot of non-blocking code to access database and other services.
Thanks in advance!
You should dispatch() a Runnable to have the exchange not end when the handleRequest method returns. Since the creation of the client and subscription are pretty simple tasks, you can do it on the same thread with SameThreadExecutor.INSTANCE like this:
public void handleRequest(HttpServerExchange exchange) throws Exception {
    exchange.dispatch(SameThreadExecutor.INSTANCE, () -> {
        RpcClient rpcClient = new RpcClient(/* ... */);
        new RpcCommand(rpcClient).observe().subscribe(new Observer<byte[]>() {
            //...
        });
    });
}
(If you do not pass an executor to dispatch(), it will dispatch it to the XNIO worker thread pool. If you wish to do the client creation and subscription on your own executor, then you should pass that instead.)

Resources