ODP.NET Connection Pooling Issues - Fault Tolerance After Database Goes Down

I have a WebAPI service using ODP.NET to make connections to several Oracle databases. Normally the web service is hit several times a second and never has long periods of inactivity. In our test site, however, we did not use it for 2-3 days. This morning we hit the service and got "connection request timeout" exceptions from ODP.NET, suggesting that the connection pool was out of available connections. We are closing the connections after use. The service was working fine before the idle period, but today the very first query got the timeout exception. Our app pool in IIS is configured to never reset.
My question then is, what can cause the connection pool to fill with bad connections after a period of inactivity, where these connections are not cleaned up in the usual 3 minute cycle? It only happened to 2 out of the 3 of our databases, and Validate Connection=true is set for all of them.
EDIT
So after talking to the DBA, there is a difference between a connection/session being killed manually or by timeout and the database server severing the TCP connections. In this case, the TCP connections were severed as part of a regular backup (why is not important here). I guess this happens when the whole database server goes offline at once. The basis of the question still applies, I think: why is ODP.NET unable to clean up severed connections over time? There is a performance counter that refers to "Stasis" connections; could those connections be stuck in that state? I would think that it should be able to see that a connection is no longer active (Validate Connection=True), kill it and not return it to the pool.
Granted, this problem can be solved by just resetting the app pool every time the database goes down. I would still like to configure ODP.NET connection pooling to be more fault tolerant.

I have run into this same issue, and the only solution I have found is to use the Connection Lifetime connection string parameter in conjunction with Validate Connection.
In my particular case, the connection timeout was set at the server, and the connections in the pool would time out but not be removed from the pool, resulting in errors.
Setting both the Connection Lifetime and the Validate Connection parameters has resolved the issue.
Make sure the Connection Lifetime value that you choose is less than the server connection inactivity timeout.
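As a rough sketch of that combination, the connection string might look something like the one below. The data source name, user, password placeholder and the 120-second value are all made up for illustration; Connection Lifetime is given in seconds, so pick a value below whatever idle timeout your server enforces.

```
Data Source=MyOracleDb;User Id=app_user;Password=<password>;Pooling=true;Validate Connection=true;Connection Lifetime=120
```

Keep in mind that Validate Connection adds a round trip to the database each time a connection is taken from the pool.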

The recommended solution is to use ODP.NET Fast Connection Failover (FCF). FCF will automatically remove invalid connections from the pool such that you don't need to use Validate Connection, Connection Lifetime, nor clear the pool.
To use FCF, set "HA Events=true", use connection pooling, and have your DBA set up Fast Application Notification (FAN) on the server side. FAN is what alerts the ODP.NET pool when a DB service or node goes down or is rebooted. Upon receiving the message, ODP.NET knows which connections to remove from the pool and removes them, leaving all other valid connections untouched.
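On the client side this comes down to a connection string along these lines. The data source, user and password are placeholders, and it only has any effect if FAN has actually been configured on the server:

```
Data Source=MyOracleDb;User Id=app_user;Password=<password>;Pooling=true;HA Events=true
```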

Something else is going on here. Min Pool Size and some of the other settings help when connections are severed by things like DBA-configured idle timeouts and firewall TCP idle timeouts, but 'connection request timeout' occurs when creating a new connection.
This could be a simple network problem. There could be something interfering with DNS resolution of the servers. Another case is not having fully qualified entries in tnsnames. I've been bitten by the latter a couple of times.
The other issue is the one you've already recognized - full pool.
Double check that you don't have a connection leak somewhere. A missing .Close is one thing but if you're not using a 'using' statement, a try/finally is required as an unhandled exception could be thrown prior to the .Close.
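A minimal sketch of the leak-proof pattern, written against the generic System.Data interfaces so it does not depend on a particular ODP.NET package. The Db class, QueryScalar and the factory delegate are hypothetical names for illustration; in practice the factory would be something like () => new OracleConnection(connectionString).

```csharp
using System;
using System.Data;

static class Db
{
    // factory is a hypothetical delegate that returns a new, unopened connection.
    public static object QueryScalar(Func<IDbConnection> factory, string sql)
    {
        // 'using' compiles down to try/finally, so Dispose() runs even if
        // Open() or ExecuteScalar() throws.
        using (IDbConnection conn = factory())
        {
            conn.Open();
            using (IDbCommand cmd = conn.CreateCommand())
            {
                cmd.CommandText = sql;
                return cmd.ExecuteScalar();
            }
        } // conn.Dispose() closes the connection; with pooling on, that returns it to the pool
    }
}
```

If you can't use 'using', the equivalent is to open the connection inside a try block and call Close or Dispose in the finally block.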
I would use perfmon to monitor some of the connection statistics to start: NumberOfPooledConnections, NumberOfActiveConnections, etc.

Related

Can a gRPC client connect timeout be set independent of reconnect backoff settings?

We'd like to configure our gRPC client to reconnect very quickly after a connection is lost. (I believe the default behavior is to attempt to reconnect after 20 seconds, backing off to 120 seconds between attempts.) After a review of available settings, we tried setting grpc.initial_reconnect_backoff_ms and grpc.min_reconnect_backoff_ms to 200. While that results in quick reconnects when a connection is lost, we sometimes see calls (from tests) fail with GRPC::Internal: 13:Completed without a response. Looking at logging from a tcp reverse proxy sitting between client and server, I see a connection lasting for just over 200ms, then a second connection lasting for longer. So it looks like the reconnect times are effectively serving as timeouts on connection attempts.
Is it possible to configure a gRPC client so that it will begin attempting a reconnect very quickly after a connection is lost, but allow creation of that connection to take longer than the reconnect time?
If it matters, this is a Ruby client.

The initial backoff is supposed to be 1 second.
You're experiencing a bug where the minimum connection timeout acts as both the timeout and the backoff (so the 1s initial backoff is ignored). So both your initial problem and the failed workaround are caused by the same bug.
(The bug was noticed a month ago, but an issue wasn't filed due to a mixup with a second bug. Your question here let me notice the missing issue.)

ITransportHeartbeat.GetConnections() gets me disconnected connections

In SignalR, the implementation of ITransportHeartbeat.GetConnections() should get me a list of connections being tracked. In one of my cases, I was always getting a web socket connection which didn't exist anymore.
Here is how I retrieve the default ITransportHeartbeat instance:
SignalRAutofac.Initialize();
IDependencyResolver resolver = GlobalHost.DependencyResolver;
ITransportHeartbeat heartbeat = resolver.Resolve<ITransportHeartbeat>();
I am assuming SignalR sends a heartbeat to each tracked connection at a specific interval. Is this the case? If yes, I didn't see that dead connection being dropped from the list for 5 minutes. Is this usual? Is that dead connection supposed to live throughout the lifetime of the AppDomain?

Check the IsAlive property on the retrieved connections.

Is there some way to guarantee a unique (not pooled) database connection per user session in ASP.NET?

Due to locking concerns, we need to guarantee that every ASP.NET session gets its own connection in the connection pool. One connection per session, not shared by any other session.
Any ideas?
There is no RIGHT answer, we are just out of ideas and need some suggestions.
Thanks in advance.

If this is SQL Server, add Pooling='false' to the database connection string and then connections will not be pooled.
In general, connection pooling is good for your application. Open, do something, then close and dispose of the connection so it goes back to the pool. I've had instances where the connections were not being disposed of properly, which caused issues.
Make sure all connections are in a using block (which will call Dispose automatically) or, if they are in a try/catch, make sure Dispose is called on the connection in the finally block.
Other than that, if pooling is not desired, alter the connection string not to use the pool.

TcpListener stops accepting or accepts broken connections

We currently experience a problem with a self-written server application running on Windows (occurs on different versions). The server listens at a TCP port, accepts connections, exchanges some data and then closes the connections again. There are about 100 clients that connect from time to time.
Sometimes the server stops to work: Log files show that connections are still accepted, but that at the first read attempt a socket error (10054 - Connection reset by peer) occurs. I don't think it is a client issue because it suddenly stops working for all clients.
Now we found out, that the same problem occurs with our old server software, that is even written in another programming language. So it doesn't seem to be an error in our program - I think it has to be some kind of OS / firewall issue? Of course, firewalls have been deactivated, which didn't solve the issue yet.
Any ideas where to look into? Wireshark logs will follow soon..
Excerpt from the log (Timestamp, Thread Id, message)
11:37:56.137 T#3960 Connection from 10.21.13.3
11:37:56.138 T#3960 Client Exception: Socket Error # 10054
Connection reset by peer.
11:37:56.138 T#3960 ClientDisconnected
11:38:00.294 T#4144 Connection from 10.21.13.3
You can see that the exception occurs almost at the same time as the connection is accepted, in this case the client reconnects after a few seconds.

A "stateful" firewall or NAT keeps track of connections, and ought to send RSTs for connections it doesn't know about. If the firewall loses track of connections for some reason, then you'll probably see random connections being reset.
Our router at work does this — it forgets about connections when the PPP connection dies, which is remarkably unhelpful when it rains and the DSL restart takes a bit too long. However, instead of resetting connections, it just drops packets (even more unhelpful!).

Sounds like a firewall or routing issue - maybe stale connections get disconnected after a timeout period. Are you using a ping/keepalive inside your protocol?
Otherwise you could use Wireshark to see what is going on.

First, thanks for many hints - I'm afraid the problem was a completely different one which you couldn't possibly solve by reading my question.
The server application uses log4net, configured with a log file and ImmediateFlush = true. If every log statement is written directly to the file and multiple socket connections occur, this slows down the whole application.
The server needed about a minute to really accept the connection. This was far more than the timeout on the client side. So the log only showed "accepted" followed by "disconnected" - even the log was delayed!
Sorry for the inconvenience...

Have you tried changing the backlog and then seeing how much time passes or how many clients are served before this problem occurs?

You don't say what Windows versions you're using for the server, but you should be aware that the Windows TCP/IP stack behaves differently in server and client OSes. There are limits on how many simultaneous incoming connections a client OS will allow, and they are significantly lower than you might expect.

What do the logs look like from the client side?
The error states that the client is dropping the connection. If you see the same error on the client side, then it is a firewall or proxy that is dropping the connection (both sides seeing the opposite side drop the connection is indicative of a proxy/firewall).
If the error is not present on the client side, then I would say that your client side is where you will see the actual error.

Why would it be a bad idea to have database connection open between client requests?

1) Book I’m reading argues that connections shouldn’t be opened between client requests, since they are a finite resource.
I realize that max pool size can quickly be reached and thus any further attempts to open a connection will be queued until connection becomes available and for that reason it would be imperative that we release connection as soon as possible.
But assuming all requests will open connections to the same DB, I’m not sure how keeping a connection open between two client requests would be any less efficient than having each request first acquire a connection from the connection pool and later return it to the pool?
2) Book also recommends that when database code is encapsulated in a dedicated data access class, then method M opening a database connection should also close that connection.
a) I assume one reason why M should also close it, is because if method M opening the connection doesn’t also close it, but instead this connection object is used inside several methods, then it’s more likely that a programmer will forget to close it.
b) Are there any other reasons why a method opening the connection should also close it?
thanx
EDIT:
If during the processing of a web request you don’t close the connection, then same connection can’t be used “directly” by the next request, but instead it first needs to be returned to connection pool, and only then can it be reused? If that is the case, I can see how we don’t gain anything by leaving the connection open during the requests?!
E.g. Transaction A reads a row from a table, then the user is thinking long, before modifying the data. During that time, transaction B reads and then updates the same row: transaction A now has stale data! Now if the user finally modifies the data and tx A commits it, the modifications made by tx B may get lost entirely: this is called lost update.
If my above assumption is correct, then how can user U, who initiated transaction A (and thus established database connection 1) during a first request, get a reference to the same database connection 1 (and thus a "reference" to transaction A) during the second request (aka postback)? Namely, wasn't the connection object returned to the connection pool when the server finished processing user U's first request?

Yes, you never know how long the connection will be open, as the request is initiated by the user... also, what happens if the request gets lost (user closes browser), too easy to have connections open infinitely... hard to have a cleanup process if you do do that.
HTH.

The points in 2) can be solved by wrapping the opening and closing of the connection in a smart manager object. Methods using the database would call this manager to get a connection and to give it back. The manager would count how many methods are using the connection, how long ago it was last used, etc. and make/close connections accordingly.

Yeah, with ASP.NET and SQL connections, there are very few scenarios where it makes sense to not use the connection pool. One of the most common scenarios where connection pooling causes issues is when you change contexts (from a data access/authorization perspective). Think of the connection pool almost as a connection load balancer that is going to be more efficient than anything you're going to code up until you learn a lot and then write a lot of code.
A couple links on the topic that will explain it much better than I could:
first link
second link
