SocketTimeoutException when calling load for DynamoDBMapper - amazon-dynamodb

I am getting sometimes this error when calling load for DynamoDBMapper:
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:735)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1593)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1498)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:82)
at com.amazonaws.internal.InstanceMetadataServiceResourceFetcher.getToken(InstanceMetadataServiceResourceFetcher.java:91)
at com.amazonaws.internal.InstanceMetadataServiceResourceFetcher.readResource(InstanceMetadataServiceResourceFetcher.java:69)
at com.amazonaws.internal.EC2ResourceFetcher.readResource(EC2ResourceFetcher.java:66)
at com.amazonaws.auth.InstanceMetadataServiceCredentialsFetcher.getCredentialsEndpoint(InstanceMetadataServiceCredentialsFetcher.java:58)
at com.amazonaws.auth.InstanceMetadataServiceCredentialsFetcher.getCredentialsResponse(InstanceMetadataServiceCredentialsFetcher.java:46)
at com.amazonaws.auth.BaseCredentialsFetcher.fetchCredentials(BaseCredentialsFetcher.java:112)
at com.amazonaws.auth.BaseCredentialsFetcher.getCredentials(BaseCredentialsFetcher.java:68)
at com.amazonaws.auth.InstanceProfileCredentialsProvider.getCredentials(InstanceProfileCredentialsProvider.java:166)
at com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper.getCredentials(EC2ContainerCredentialsProviderWrapper.java:75)
at com.amazonaws.auth.AWSCredentialsProviderChain.getCredentials(AWSCredentialsProviderChain.java:117)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.getCredentialsFromContext(AmazonHttpClient.java:1251)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.runBeforeRequestHandlers(AmazonHttpClient.java:827)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:777)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:764)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:738)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:698)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:680)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:544)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:524)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.doInvoke(AmazonDynamoDBClient.java:5110)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.invoke(AmazonDynamoDBClient.java:5077)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.executeGetItem(AmazonDynamoDBClient.java:2197)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.getItem(AmazonDynamoDBClient.java:2163)
at com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBMapper.load(DynamoDBMapper.java:431)
at com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBMapper.load(DynamoDBMapper.java:448)
at com.amazonaws.services.dynamodbv2.datamodeling.AbstractDynamoDBMapper.load(AbstractDynamoDBMapper.java:80)
I have 2 timeouts to PUT /latest/api/token, then I get a response. I am not sure what is wrong exactly or why do I have this behavior sometimes, but this leads to latency in my application.
Do I need to modify something in the settings? Is it related to DynamoMapper? Should I use low level Dynamo API?

These issues can occur when:
You call a remote API that takes too long to respond or that is unreachable.
Your API call doesn't get a response within the socket timeout.
Your API call doesn't get a response within the timeout period of your Lambda function.
If you make an API call using an AWS SDK and the call fails, the SDK automatically retries the call https://aws.amazon.com/premiumsupport/knowledge-center/lambda-function-retry-timeout-sdk/. How long and how many times the SDK retries is determined by settings that vary among each SDK. Here are the default values of these settings:
see the SDK client configuration documentation: https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/ClientConfiguration.html

Related

Will Spring KafkaContainerStoppingErrorHandler commits offset for batch listener

I am working on Spring Kafka implementation and my use case is consume messages from Kafka topic as batch (using batch listener). when I consumer the list of messages, will iterate and call the REST endpoint for message enrichment. In case REST API fails for any runtime exception, I have implemented retry logic using spring retry. I want to stop the container, after the number of retries fails. So planning to use KafkaContainerStoppingErrorHandler to achieve this. Does the KafkaContainerStoppingErrorHandler commits the previous success messages - say if we receive 10 messages, and for message 1,2,3,4, enrichment call is success and for message 5 enrichment API call fails. so when we restart the container, will I get all 10 again or will I receive messages 5- 10?
or is there a way we can achieve above use case? I looked into all types of error handles of Spring kafka and need input on how to achieve above requirement.
You will get them all again.
You can use the DefaultErrorHandler (with a custom recoverer) and throw a BatchListenerFailedException to indicate which record in the batch failed.
The error handler will commit the offsets up to that record and call the recoverer with the failed record; in your custom recoverer you can stop the container (use the same logic as the container stopping error handler).
In versions before 2.8, this same functionality is provided by the RecoveringBatchErrorHandler.

Still receiving Timeout connects using JS DAX client, classified as an error but no impact

We use the JS amaxon-dax-client (1.2.3) and we receive a bunch of
ERROR Failed to pull from <cluster>.cache.amazonaws.com (-.-.-.-): TimeoutError: Connection timeout after 10000ms
at SocketTubePool.alloc (/opt/nodejs/node_modules/amazon-dax-client/src/Tube.js:227:64)
at /opt/nodejs/node_modules/amazon-dax-client/generated-src/Operations.js:215:30
My understanding is this is a warning, not an error? We don't see any adverse effects (at least not noticeable) and the functions can access DAX just fine, but it pollutes our logs.
If the DAX endpoint isn't properly configured, the DB operations fail which isn't what we're seeing here.

How to set maxWebsocketFrameSize

getting error:
2020-01-20 21:15:29,599 WARN [io.net.cha.DefaultChannelPipeline] (vert.x-eventloop-thread-0) An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.: io.netty.handler.codec.http.websocketx.CorruptedWebSocketFrameException: Max frame length of 65536 has been exceeded.
at io.netty.handler.codec.http.websocketx.WebSocket08FrameDecoder.protocolViolation(WebSocket08FrameDecoder.java:426)
at io.netty.handler.codec.http.websocketx.WebSocket08FrameDecoder.decode(WebSocket08FrameDecoder.java:286)
Very hard to tell "what" is implementing the websocket server and how to modify the configuration.
Vert.x core has a HttpServerOptions with maxWebsocketFrameSize that "might" be the correct thing to increase, or even the maxWebsocketMessageSize that when increased, might increase the frameSize?
Cannot find any way to clear up this exception when client sends a big message to the server over the websocket channel. (it is text message)

python grpc: setting timeout per grpc call

Is there a way to specify timeout per grpc call with python.
I am experiencing more than 1 minute delay in receiving response.
I want the api to return some error in case it is taking longer that specified time. I am using blocking grpc call.
You can look up the information you want at gRPC Python's API reference. Setting timeout should be as simple as:
channel = grpc.insecure_channel(...)
stub = ...(channel)
stub.AnRPC(request, timeout=5) # 5 seconds timeout

Hitting 100 active connections limit in test env with only two users

I have a single web client and a few Lambda functions which use the Admin SDK. I've noticed recently that I've bumped into the 100 simultaneous connection limit but I really shouldn't be anywhere near that limit. Also it would appear that the connections established by my Lamba functions are not dropping off even after the function has completed.
Any idea on:
how I can prevent this run-up on connections from happening?
how I can release connections established by past Lambda scripts?
how can I monitor which processes/threads/stacks are holding connections?
Note: this is a testing environment I'm working out of so I'd prefer to keep this in the free tier and my requirements should definitely not be running into the 100 active limit. I am on a paid plan in prod.
I attempt to avoid calling initializeApp more than once by using the following connection code. In the example I'm talking about I only have a single database as a backend and so the default "name" of DEFAULT is used each time.
const runningApps = new Set(firebase.apps.map(i => i.name));
this.app = runningApps.has(name)
? firebase.app()
: firebase.initializeApp({
credential: firebase.credential.cert(serviceAccount),
databaseURL: config.databaseUrl
});
I'm now trying to explicitly close connections with goOffline but that leads to another issue where on the second connection -- aka, where the DEFAULT application is already setup and it just reuses the connection already established I get the following logging:
# Generated as result of `goOnline`
Connecting to Firebase: [https://xyz.firebaseio.com]
appears to be already connected
# Listening on ".info/connected" comes back as true, resulting in:
AbstractedAdmin: connected to [DEFAULT]
# but then I get this error
NotAllowed: You must first connect before using the database() API at Object._getFirebaseType
The fact that you have unexpected incoming connections to the database, makes it seem like the stale instances keep an open connection.
Best I can think off is to call goOffline() in your function before it completes to explicitly disconnect. That would probably also mean you have to call goOnline at the start of the function, since it might be running on an instance that previously went offline. Both goOnline and goOffline are synchronous calls afaik, but there's definitely going to be some time between going online and the data becoming available in your app.
If Lambda has a way for you to detect life-cycle events of its instances, that would be the preferred place to call goOffline and goOnline.
admin.initializeApp should only get called once in your script/node app.
The Firebase SDK's talks HTTP2 to the Firebase cloud system, so I'm not sure why you would encounter max connection issues as unique sockets are not stood up per call.
One thing to look out for is that calls to 3rd part API's (such as sendgrid) are not supported on the free tier.

Resources