In SignalR, the implementation of ITransportHeartbeat.GetConnections() should get me a list of connections being tracked. In one of my cases, I was always getting a web socket connection which didn't exist anymore.
Here is how I retrieve the default ITransportHeartbeat instance:
SignalRAutofac.Initialize();
IDependencyResolver resolver = GlobalHost.DependencyResolver;
ITransportHeartbeat heartbeat = resolver.Resolve<ITransportHeartbeat>();
I am assuming SignalR is sending heartbeat to the tracked connection in a specific interval. Is this the case? If yes, I didn't see that dead connection being dropped for 5 minutes from the list. Is this usual? is that dead connection supposed to live throughout the lifetime of the AppDomain?
Check the IsAlive property on the retrieved connections.
Related
I read the gRPC Core concepts, architecture and lifecycle, but it doesn't go into the depth I like to see. There is the RPC call, gRPC channel, gRPC connection (not described in the article) and HTTP/2 connection (not described in the article).
I'm interested in knowing how these come together. For example, what happens to the channel when a RPC throws an exception? What happens to the gRPC connection when the channel is closed? When is the channel closed? When is the gRPC connection closed? Heart beats? What if the deadline is exceeded?
Can anyone answer these questions, or point me to resources that can?
The connection is not a gRPC concept. It is not part of the normal API and is an implementation detail. This should be seen as fairly normal, like HTTP libraries providing details about HTTP exchanges but not exposing connections.
It is best to view RPCs and connections as two mostly-separate systems.
The only real guarantee is that "connections are managed by channels," for varying definitions of "managed." You must shut down channels when no longer used if you want connections and other resources to be freed. Other details are either an implementation detail or an advanced API detail.
There is no "gRPC connection." A "gRPC connection" would just be a standard "HTTP/2 connection." Except that is even an implementation detail of the transport in many gRPC implementations. That allows having alternative "connection" types like "inprocess" or QUIC (via Cronet, where there is not a classic "connection" at all).
It is the channel's job to hold all the connections and reconnect as necessary. It delegates part of that responsibility to load balancers and the load balancing APIs do have a concept of connections (subchannels). By not exposing connections to the application, load balancers have a lot of freedom to operate.
I'll note that gRPC C-core based implementations share connections across channels.
What happens to the channel when a RPC throws an exception?
The channel and connection is not impacted by a failed RPC. Note that connection-level failures typically cause RPCs to fail. But things like retries could allow the RPC to be re-sent on a new connection.
What happens to the gRPC connection when the channel is closed?
The connections are closed, eventually. Channel shutdown isn't instantaneous because existing RPCs can continue, and connection shutdown isn't instantaneous as well. But once all RPCs complete the connections are closed. Although C-core won't shut down a connection until no channels are using it.
When is the channel closed?
Only when the user closes it.
When is the gRPC connection closed?
Lots of times. The client may close it when no longer needed. For example, let's say the server IP address changes and the client need to connect to 1.1.1.2 instead of 1.1.1.1. A new connection will be created and new RPCs will go to the new IP address. The client may also close connections it thinks are dead (e.g., via keepalive timeouts).
Servers have a lot of say of when to close connections. They may close them simply because they are old, or because they have been idle, or because the server is overloaded. But those are simply use-cases; the server can shut down a connection at-will.
What if the deadline is exceeded?
Deadline only applies to RPCs and doesn't impact the channel or a connection.
I was actually waiting for Eric to answer this as he is the expert in this!
I also have been playing with gRPC for a while now, I would like to add few things here for beginners. Anyone more experienced, please feel free to edit!
Channel is an abstraction over a long-lived connection! The client application will create a channel on start up. The channel can be reused/shared among multiple threads. It is thred safe. One channel is enough (for most of the use cases) for multiple threads and multiplexing concurrent requests. It is channel's responsibility to close / reconnect / keep the connection alive etc. We as the users do not have to worry about this in general. The client application can close the channel anytime it wants. Channel creation seems to be an expensive process. So we would not open/close for every RPC.
When you use gRPC loadbalancer/nameresolver for a domain name and the nameresolver resolves the domain with multiple ip addresses, a channel creates multiple subchannels where each subchannel is an abstraction over a connection to 1 server. So a channel can also represent multiple connections!!
Adding some points to note from Eric's comment.
adding the default load balancer still only creates (approximately)
one connection if the name resolver returns multiple addresses, as the
default is pick_first. But if you change the load balancer to
round_robin or virtually any other policy, then yes, there will be
multiple connections in a channel. Even if a name resolver returns one
address, the load balancer is free to create multiple connections
(e.g., for higher throughput), but that's not common today
An underlying connection can be closed any time for any reason. For ex: remote server is shutting down gracefully for a scheduled maintenance or a connection is idle for longer duration. In that case, the server could send GOAWAY signal to the client and client might disconnect and reconnect to some other server. or Server might crash due to OOM error. In this case channel will detect connection failure and will retry for new connection for some other server etc.
A channel can keep sending PING frame to the server to keep the connection alive. These are all configurable via channel builder.
With these information above, if we look at your questions,
what happens to the channel when a RPC throws an exception?
Nothing happens to the channel. The unhandled exception on the server might the fail the RPC on the client side. But channel is still usable for any RPC calls.
What happens to the gRPC connection when the channel is closed?
Channel is an abstraction over the connection. So it will be closed. (again there is no gRPC connection as such as Eric had mentioned. It would be a HTTP2 connection)
When is the channel closed?
Any time you want. But normally when the application shuts down.
When is the gRPC connection closed?
It is not our problem. Channel takes care of this.
Heart beats?
Channel sends PING frames periodicaly to keep the connection alive.
What if the deadline is exceeded?
It is something like timeout on the client side. When the deadline exceeds, the client might cancel the request. Once again nothing happens to the channel. (But it might trigger exception on the server side which I had noticed few times. (Received DATA frame for an unknown stream. https://github.com/grpc/grpc-java/issues/3548). It seems to have been fixed now).
If "OnConnected" is raised when the 1st time we connect to our website, when will "OnReconnected" happen?
1) Suppose someone is connected to the network and suddenly the network isn't available and soon it recovers, so OnReconnected happens?
2) Any other special that will make OnReconnted happen?
Thanks!
The Signalr documentation on Understanding and Handling Connection Lifetime Events in SignalR should have all the information you need.
Generally speaking OnReconnected will fire any time the SignalR client automatically reconnects to the SignalR server after it has lost its connection for any reason. These reasons can include network issues, the server restarting, etc...
The SignalR client will stop attempting to automatically reconnect to the server if it is unable to successfully do so within the DisconnectTimeout. If this happens, and you want to reestablish a connection, you will be required to manually restart the client by calling start() after the client becomes disconnected. If you manually restart the client by calling start(), OnConnected will be called instead of OnReconnected and the client will receive a new connection id.
I have an WebAPI service using ODP.NET to make connections to several oracle databases. Normally the web service would be hit several times a second and will never have long periods on inactivity. In our test site however, we did not use it for 2-3 days. This morning, we hit the service and got "connection request timeout" exceptions from ODP.NET, suggesting that the connection pool was out of available connections. We are closing the connections after use. The service was working fine before the period, but today the very first query got the timeout exception. Our app pool in IIS is configured to never reset.
My question then is, what can cause the connection pool to fill with bad connections after a period of inactivity, where these connections are not cleaned up in the usual 3 minute cycle? It only happened to 2 out of the 3 of our databases, and Validate Connection=true is set for all of them.
EDIT
So after talking to the DBA, there is some different between a connection/session being killed manually or by timeout and the database server severing the TCP connections. In this case, the TCP connection was severed as part of a regular backup (why is not important for this). I guess this happens when the whole database server goes offline at once. The basis of the question still applies I think though: why is ODP.NET unable to cleanup severed connections overtime? There is a performance counter that refers to "Stasis" connections, could those connections be stuck in that state? I would think that it should be able to see that a connection is no longer active (Validate Connection=True), kill it and not return it to the pool.
Granted, this problem can be solved by just resetting the app pool everything the database goes down. I would still like to configure ODP.NET connection pooling to be more fault tolerant.
I have run into this same issue, and the only solution I have found is to use the Connection Lifetime connection string parameter in conjunction with Validate Connection.
In my particular case, the connection timeout was set at the server and the connections in the pool would timeout, but not be sniped out of the pool, resulting in errors.
Setting both the Connection Lifetime and the Validate Connection parameters has resolved the issue.
Make sure the Connection Lifetime value that you choose is less than the server connection inactivity timeout.
The recommended solution is to use ODP.NET Fast Connection Failover (FCF). FCF will automatically remove invalid connections from the pool such that you don't need to use Validate Connection, Connection Lifetime, nor clear the pool.
To use FCF, set "HA events=true", use connection pooling, and have your DBA set up Fast Application Notification (FAN) on the server side. FAN is what alerts the ODP.NET pool when a DB service or node goes down or rebooted. Upon receiving the message, ODP.NET knows which connections to remove from the pool and removes them, leaving all other valid connections untouched.
Something else is going on here. Min Pool Size and some of the other settings help when the connection is severed from things like DBA configured idle timeouts and firewall tcp idle timeouts, 'connection request timeout' occurs when created a new connection.
This could be simple network problem. There could be something interfering with dns resolution of the servers. Another case is not having fully qualified entries in tnsnames. I've been bit by the latter a couple of times.
The other issue is the one you've already recognized - full pool.
Double check that you don't have a connection leak somewhere. A missing .Close is one thing but if you're not using a 'using' statement, a try/finally is required as an unhandled exception could be thrown prior to the .Close.
I would use perfmon to monitor some of the connection statistics to start - NumberOfPooledConnections, NumberOfActiveConnections, etc:
I'm confused for how to deal with lots of connections in netty (3.6.2.FINAL) and keep-alive=true.
For work on a netty client as a server side connector, making http calls to another service, it wants to always keep the connection open for performance (keep-alive=true).
The issue: there is a hard limit for number of open channels, after which the client will hang when attempting to open a channel. Why no exception just hangs? Is this a setting in terms of channel timeout?
I can't seem to understand Netty in terms of overall managing of connections within worker threads:
With a blocking write/read client ChannelHandler (http request/response), how do you detect that the connection pool is empty?
The handler can receive ChannelEvent(s) but nothing about the overall count available in the connection pool (its very non-deterministic anyway). And if the channel is not open, does it make sense for the handler to initiate opening a new channel given its running in a worker thread?
But if the connection pool is exhausted, how do you go and cleanup some idle connections (within the handler)?
I had to completely rip apart my handler to get the client blocking call to work without hanging. The issue was mostly resolved by not holding onto local channel ref within the handler.
Now we just pass a ConnectionInterface#openConnection() [returns a new ChannelFuture] into the shared custom ChannelHandler#call( ConnectionInterface connectionInterface, HttpRequest request ).
Better to open-channel within the handler call method, and to pass that channel along with checks on its state before channel.write(x), if !channel.isWritable() then recycle the channel (from a new client connection eg. ConnectionInterface#openConnection()) and retry the write. There isn't even a need to close the channel (it gets handled in the pool).
Just ran it with 500 threads / 5000 requests and it succeeds fine.
1) Book I’m reading argues that connections shouldn’t be opened between client requests, since they are a finite resource.
I realize that max pool size can quickly be reached and thus any further attempts to open a connection will be queued until connection becomes available and for that reason it would be imperative that we release connection as soon as possible.
But assuming all request will open connection to the same DB, then I’m not sure how having a connection open between two client requests would be any less efficient than having each request first acquiring a connection from connection pool and later returning that object to connection pool?
2) Book also recommends that when database code is encapsulated in a dedicated data access class, then method M opening a database connection should also close that connection.
a) I assume one reason why M should also close it, is because if method M opening the connection doesn’t also close it, but instead this connection object is used inside several methods, then it’s more likely that a programmer will forget to close it.
b) Are there any other reasons why a method opening the connection should also close it?
thanx
EDIT:
If during the processing of a web request you don’t close the connection, then same connection can’t be used “directly” by the next request, but instead it first needs to be returned to connection pool, and only then can it be reused? If that is the case, I can see how we don’t gain anything by leaving the connection open during the requests?!
E.g. Transaction 1 reads a row from a table, then the user is thinking long, before modifying the data. During that time, transaction B reads and then updates the same row: transaction A now has stale data! Now if the user finally modifies the data and tx A commits it, the modifications made by tx B may get lost entirely: this is called lost update.
If my above assumption is correct,then how can user U that initiated transaction 1 ( thus established a database connection 1 ) during a first request, get a reference to same database connection 1 ( and thus a "reference" to transaction 1 ) during the second request(aka postback)? Namely, wasn't connection object returned to the connection pool when server finished processing user U's first request?
Yes, you never know how long the connection will be open, as the request is initiated by the user... also, what happens if the request gets lost (user closes browser), too easy to have connections open infinitely... hard to have a cleanup process if you do do that.
HTH.
The points in 2) can be solved by wrapping the opening and closing of the connection in a smart manager object. Methods using the database would call this manager to get a connection and to give it back. The manager would count how many methods are using the connection, how long ago it was, etc. and make/close connnections accordingly.
Yeah, with ASP.NET and SQL connections, there are very few scenarios where it makes sense to not use the connection pool. One of the most common scenarios where the connection pooling causes issues is when you change contexts (from an data access/authorization perspective). Almost think of the connection pool as a connection load balancer for you that is going to be more efficient than anything you're going to code up until you learn a lot and then write a lot of code.
A couple links on the topic that will explain it much better than I could:
first link
second link