Trouble connecting to gRPC server on AWS Fargate

I have a Python gRPC server running on AWS Fargate (configured very similarly to this AWS guide), and another AWS Fargate task (call it the "client") that attempts to connect to my gRPC server (also using Python gRPC). However, the client is unable to call my server and fails with the following error:
<_InactiveRpcError of RPC that terminated with:
    status = StatusCode.UNAVAILABLE
    details = "failed to connect to all addresses"
    debug_error_string = "{"created":"@1619057124.216955000","description":"Failed to pick subchannel",
    "file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":5397,
    "referenced_errors":[{"created":"@1619057124.216950000","description":"failed to connect to all addresses",
    "file":"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc",
    "file_line":398,"grpc_status":14}]}"
>
Based on my reading online, it seems like there are myriad situations in which this error is thrown, and I'm having trouble figuring out which one pertains to my case. Here is some additional information:
- When running the client and server locally, I can connect successfully by having the client connect to localhost:[PORT].
- I have configured an application load balancer target group, following the guide from AWS, that makes health check requests to the / route of my gRPC server using the gRPC protocol and expects gRPC response code 12 (UNIMPLEMENTED). These health checks come back as expected, which I believe implies the load balancer can communicate with the server (although I could be misunderstanding).
- I configured a service discovery system (following this guide) that should allow me to reach my gRPC server within my VPC via the name service-name.dev.co.local. I can confirm that the corresponding DNS record exists in Route 53, and when I SSH into my VPC, I am able to ping service-name.dev.co.local successfully.
Anyone have any ideas? Would appreciate any and all advice, and I'm happy to answer any further questions.
Thank you for your help!

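On your gRPC server, bind to 0.0.0.0:[port] instead of localhost:[port], and expose that port over TCP on your container. A server bound to localhost only accepts connections that originate inside the same container, which is why the local test works but the Fargate client cannot connect.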
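A minimal sketch of that advice in Python, assuming the grpcio package is installed. The port 50051, the worker count, and the helper names listen_address and build_server are illustrative and do not come from the question; servicer registration is left as a comment because it depends on your generated code.

```python
from concurrent import futures


def listen_address(port: int, all_interfaces: bool = True) -> str:
    """Return the address string to pass to add_insecure_port().

    0.0.0.0 accepts connections on every network interface of the container;
    127.0.0.1 accepts only connections that originate inside the container.
    """
    host = "0.0.0.0" if all_interfaces else "127.0.0.1"
    return f"{host}:{port}"


def build_server(port: int = 50051):
    """Create a gRPC server bound to all interfaces (hypothetical helper)."""
    # Imported here so listen_address() stays usable without grpcio installed.
    import grpc

    server = grpc.server(futures.ThreadPoolExecutor(max_workers=4))
    # Register your generated servicer here, using the
    # add_<Service>Servicer_to_server function from your *_pb2_grpc module.
    bound_port = server.add_insecure_port(listen_address(port))
    return server, bound_port
```

After build_server() returns, calling server.start() and server.wait_for_termination() would run the server. The same port also has to appear in the container's port mappings with the tcp protocol, and the server's security group has to allow inbound traffic on it from the client task.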

Related

Remote Server access with tunneling

I want to integrate a service into my website, but the service provider requires that all data transfer go through a tunnel. Could you describe the detailed process for connecting to the remote server and sending requests to it? I have all the credentials: the remote server IP, the ISAKMP key, and so on.
I tried configuring strongSwan on my VPS, but I was not able to complete the process because of some errors.

AMQP handshake timeout error while deploying AWS Corda Enterprise Template

I am deploying the AWS Corda Enterprise template. The Quick Start deployed the stack as defined in the CloudFormation template. I can see two AWS instances up and running as Corda nodes in a hot-cold setup behind a load balancer.
However, the log for the Corda node contains the following error related to AMQP communication:
[ERROR] 2018-10-18T05:47:55,743Z [Thread-3 (ActiveMQ-scheduled-threads)] core.server.lambda$channelActive$0 - AMQ224088: Timeout (10 seconds) while handshaking has occurred. {}
What could be the reason for this error? It keeps recurring at a regular interval, so it looks like a connectivity issue to me.
Note: The load balancer shows the status of these AWS Corda instances as healthy (In Service), so I believe the Corda node has booted up successfully.
The ERROR message isn't necessarily tied to AMQP. Perhaps you were confused by the "AMQ" in the error ID (AMQ224088)?
In any event, this error indicates that something on the network is connecting to the ActiveMQ Artemis broker, but it's not completing any protocol handshake. This is commonly seen with, for example, load balancers that do a health check by creating a socket connection without sending any real data just to see if the port is open on the target machine.
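To make that pattern concrete, here is a small sketch of a port-open check of the kind the answer describes: it opens a TCP connection and closes it without sending a single byte. The function name tcp_port_check and the default timeout are assumptions for illustration; this is not how any particular load balancer is implemented.

```python
import socket


def tcp_port_check(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    The socket is closed immediately and no application data is sent,
    so the peer sees a connection that never starts a protocol handshake.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        return False
```

From the broker's point of view, a connection like this never begins a protocol handshake, which matches the situation the answer describes.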

C# gRPC client - name resolution failure

A client is running our C# gRPC client on a corporate network, behind an HTTP proxy. The http_proxy environment variable is configured, but he nevertheless sees the error message "Name resolution failure" when attempting to connect to the server on the internet.
DNS resolution from the same machine works fine using nslookup.
Any ideas what I can do to investigate this problem?
You can use the following three lines at application startup to configure the detailed logging that @JanTattermusch suggested:
Environment.SetEnvironmentVariable("GRPC_TRACE", "api");
Environment.SetEnvironmentVariable("GRPC_VERBOSITY", "debug");
Grpc.Core.GrpcEnvironment.SetLogger(new Grpc.Core.Logging.ConsoleLogger());
To connect a C# gRPC client on a corporate network behind an HTTP proxy, adding this line to the client's main method worked:
Environment.SetEnvironmentVariable("NO_PROXY", "127.0.0.1");

gRPC client reconnect inside Kubernetes

If we run our microservice inside Kubernetes pods, do we need to implement gRPC client reconnection ourselves for the case where the service pod restarts?
When the pod restarts, the host name does not change, but we cannot guarantee that the IP address remains the same. Will the gRPC client still be able to detect the new server and reconnect to it?
When the TCP connection is disconnected (because the old pod stopped) gRPC's channel will attempt to reconnect with exponential backoff. Each reconnect attempt implies resolving the DNS address, although it may not detect the new address immediately because of the TTL (time-to-live) of the old DNS entry. Also, I believe some implementations resolve the address when a failure is detected instead of before an attempt.
This process happens naturally without your application doing anything, although it may experience RPC failures until the connection is re-established. Enabling "wait for ready" on an RPC would reduce the chances the RPC fails during this period of transition, although such an RPC generally implies you don't care about response latency.
If the DNS address is not (eventually) re-resolved, then that would be a bug and you should file an issue.
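The reconnect schedule can be pictured with a short sketch. The defaults below (1 s initial delay, 1.6 multiplier, 120 s cap, 20% jitter) are the values I recall from gRPC's connection backoff document and should be treated as assumptions; the function backoff_delays is a hypothetical illustration, not gRPC's implementation.

```python
import random
from typing import Iterator, Optional


def backoff_delays(
    initial: float = 1.0,
    multiplier: float = 1.6,
    max_delay: float = 120.0,
    jitter: float = 0.2,
    rng: Optional[random.Random] = None,
) -> Iterator[float]:
    """Yield successive reconnect delays in seconds.

    The base delay grows geometrically until it reaches max_delay.
    When rng is given, each yielded delay is randomized by +/- jitter
    so that many clients do not retry in lockstep.
    """
    base = initial
    while True:
        if rng is None:
            yield base
        else:
            yield base * (1.0 + rng.uniform(-jitter, jitter))
        base = min(base * multiplier, max_delay)
```

RPCs issued while the channel is waiting out one of these delays fail fast by default. In Python gRPC, passing wait_for_ready=True on a call (ideally together with a timeout) should queue the RPC until the channel is connected instead.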
You need client-side load balancing. You can watch the endpoints of a service with the Kubernetes API. I have created a package for the Go programming language and published it on GitHub; sorry, I have not written documentation for it yet. The basic concept is to get the service endpoints at startup and then watch them for changes.
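The "get endpoints, then watch for changes" idea can be sketched without any Kubernetes client library. Note the swap: the answer's package uses the Kubernetes API, while this sketch polls DNS instead, which only sees individual pod IPs when the Service is headless. All names here (resolve_endpoints, diff_endpoints, poll_once) are hypothetical.

```python
import socket
from typing import Callable, Set, Tuple

Endpoints = Set[str]


def resolve_endpoints(hostname: str, port: int) -> Endpoints:
    """Resolve hostname to the set of IP addresses currently behind it."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def diff_endpoints(old: Endpoints, new: Endpoints) -> Tuple[Endpoints, Endpoints]:
    """Return (added, removed) between two endpoint snapshots."""
    return new - old, old - new


def poll_once(
    current: Endpoints,
    hostname: str,
    port: int,
    resolver: Callable[[str, int], Endpoints] = resolve_endpoints,
) -> Tuple[Endpoints, Endpoints, Endpoints]:
    """Take a fresh snapshot and report what changed since `current`."""
    snapshot = resolver(hostname, port)
    added, removed = diff_endpoints(current, snapshot)
    return snapshot, added, removed
```

A caller would run poll_once on a timer, open channels for the added addresses, and close channels for the removed ones. The resolver parameter exists so that a Kubernetes API lookup could be substituted for the DNS query.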

SocketException: No connection could be made because the target machine actively refused it XX.XXX.XX.XXX:443

I have two servers hosting two identical WCF services, and one client application server. From my local machine, I can connect to the endpoints and send requests to both services using a test WCF client app (.NET Web Service Studio). But when I connect from the client application server using the same test app, I can reach only one of the WCF service servers; connecting to the other one fails with this error:
System.Net.WebException: There was an error downloading 'https://XXX/XXX?wsdl'. ---> System.Net.WebException: Unable to connect to the remote server ---> System.Net.Sockets.SocketException: No connection could be made because the target machine actively refused it XX.XXX.XX.XXX:443
I ran netstat -an | find "443" in a command prompt on the client server and on my local machine to find the difference, and here is what I got:
1. On my local machine:
2. On the client app server:
What I have already tried on the client application server:
- turned off the firewall;
- stopped the Windows Firewall service;
- uninstalled the McAfee VirusScan Enterprise application.
(I first tried to change the "prevent mass mailing worms from sending mail" setting, but McAfee was in a foreign language that I don't understand, so I just uninstalled it.)
After running netstat -aon | findstr "443" on the client application server, I got this result:
But I still get the error.
Does anybody know how to solve this issue?
Could the problem be on the WCF service server side?
The solution was predictably simple: a firewall was blocking the port.
It is important to note that the issue was caused by the firewall on the WCF service server side, not on the client application server that makes the request to that service.
I asked the technical support team for that server, and they changed the firewall rules.
After that, the error disappeared.
I faced the same issue and tried different ways to fix it, but nothing worked. Later I found the cause: the application I was trying to run uses HTTPS, and no HTTPS binding had been created in IIS. I added an HTTPS binding to the website and it worked.
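Both answers come down to whether anything accepts TCP connections on port 443 of the target. A quick check from the calling machine can narrow this down; the sketch below is in Python rather than .NET, and the function name classify_connect and its return labels are assumptions made for illustration.

```python
import socket


def classify_connect(host: str, port: int, timeout: float = 3.0) -> str:
    """Try a TCP connection and report how it ended.

    "open"    - something accepted the connection
    "refused" - the connection was rejected: usually nothing is listening
                on that port (e.g. a missing binding) or a firewall
                actively rejects it
    "timeout" - no answer at all: often a firewall silently dropping packets
    "error"   - any other failure, such as a DNS resolution error
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError:
        return "error"
```

Running the same check from a machine that can reach the service and from one that cannot shows whether the difference lies in the network path or on the target server itself.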
