C3P0 connection pool gives connection timeout error with this configuration - spring-mvc

I am using the Resin server + Spring framework and c3p0 connection pooling. I have configured the connection pool with the following properties file, but roughly every 24 hours my website starts throwing connection timeout errors and I have to restart my Resin server to bring the site back up. Please tell me what's wrong in the following configuration file and what I'm missing here.
jdbc.driverClassName=com.mysql.jdbc.Driver
jdbc.databaseURL=jdbc:mysql://localhost/my_database1_url
jdbc.StockDatabaseURL=jdbc:mysql://localhost/my_database2_url
jdbc.username=my_username
jdbc.password=my_password
jdbc.acquireIncrement=10
jdbc.minPoolSize=20
jdbc.maxPoolSize=30
jdbc.maxStockPoolSize=30
jdbc.maxStatements=100
jdbc.numOfHelperThreads=6
jdbc.testConnectionOnCheckout=true
jdbc.testConnectionOnCheckin=true
jdbc.idleConnectionTestPeriod=30
jdbc.prefferedTestQuery=select curdate();
jdbc.maxIdleTime=7200
jdbc.maxIdleTimeExcessConnections=5

So, a bunch of things.
c3p0 has built-in facilities for observing and debugging Connection leaks: the configuration parameters unreturnedConnectionTimeout and debugUnreturnedConnectionStackTraces. Set unreturnedConnectionTimeout to a period of time after which c3p0 should presume a Connection has leaked, and so close it. Set debugUnreturnedConnectionStackTraces to ask c3p0 to log the stack trace that checked out the Connection that did not get checked in properly. See Configuring to Debug and Workaround Broken Client Applications.
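For example, in c3p0.properties form (the 300-second timeout below is only illustrative; choose a value comfortably longer than your longest legitimate Connection checkout):
c3p0.unreturnedConnectionTimeout=300
c3p0.debugUnreturnedConnectionStackTraces=true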
You are configuring c3p0 in a nonstandard way. That might be fine, or not, but you want to verify that the config that you intend to set is the config c3p0 gets. c3p0 DataSources dump their config at INFO on pool initialization. Please consider checking that to be sure you are getting the config you intend. Alternatively, you can check your DataSource's runtime config via JMX.
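A minimal sketch of that JMX check, assuming the pool runs in the same JVM as this code (c3p0 registers its PooledDataSource MBeans under the com.mchange.v2.c3p0 domain; the class name below is just for illustration):

import java.lang.management.ManagementFactory;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class C3P0ConfigDump {

    // prints every attribute of every c3p0 MBean registered in this JVM
    public static void dump() throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        for (ObjectName name : server.queryNames(new ObjectName("com.mchange.v2.c3p0:*"), null)) {
            System.out.println(name);
            for (MBeanAttributeInfo attr : server.getMBeanInfo(name).getAttributes()) {
                try {
                    System.out.println("  " + attr.getName() + " = " + server.getAttribute(name, attr.getName()));
                } catch (Exception e) {
                    // some attributes may not be readable in every pool state; skip them
                }
            }
        }
    }
}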
Besides the nonstandard means of configuration, several of your configuration properties seem amiss. prefferedTestQuery should be preferredTestQuery. numOfHelperThreads should be numHelperThreads.
The following are not c3p0 configuration names at all. Perhaps you are internally mapping them to c3p0 configuration, but you'd want to verify this. Here are the not-c3p0-property-names:
jdbc.driverClassName=com.mysql.jdbc.Driver
jdbc.databaseURL=jdbc:mysql://localhost/my_database1_url
jdbc.StockDatabaseURL=jdbc:mysql://localhost/my_database2_url
jdbc.username=my_username
jdbc.maxStockPoolSize=30
In a standard c3p0.properties form, what you probably mean is
c3p0.driverClass=com.mysql.jdbc.Driver
c3p0.jdbcURL=jdbc:mysql://localhost/my_database1_url
# no equivalent -- jdbc.StockDatabaseURL=jdbc:mysql://localhost/my_database2_url
c3p0.user=my_username
# no equivalent -- jdbc.maxStockPoolSize=30
Please see Configuration Properties. Again, c3p0 knows nothing about jdbc.-prefixed properties, but perhaps something in your own libraries or middleware picks those up.
Note: I love to see @NiSay's way of checking for Connection leaks, because I love to see people using the more advanced c3p0 API. It will work, as long as you don't hot-update your DataSource's config. But you don't need to go to that much trouble, and there's no guarantee this approach will continue to work in future versions: c3p0 makes no promises about ConnectionCustomizer lifecycles, and ConnectionCustomizers are intended to be stateless. It is easier and safer to use c3p0's built-in leak check facility, described in the first point above.

As there could be a possibility of connection leaks in the program (the probable cause of the connection timeouts), you need to follow the steps below to identify them.
Make an entry in your c3p0.properties file:
c3p0.connectionCustomizerClassName = some.package.ConnectionLeakDetector
Create a class named 'ConnectionLeakDetector' and place it in the appropriate package. Below is the content of the class:
import java.sql.Connection;
import java.util.concurrent.atomic.AtomicInteger;

public class ConnectionLeakDetector implements com.mchange.v2.c3p0.ConnectionCustomizer {

    static AtomicInteger connectionCount = new AtomicInteger(0);

    @Override
    public void onAcquire(Connection c, String parentDataSourceIdentityToken)
            throws Exception {
    }

    @Override
    public void onDestroy(Connection c, String parentDataSourceIdentityToken)
            throws Exception {
    }

    @Override
    public void onCheckOut(Connection c, String parentDataSourceIdentityToken)
            throws Exception {
        // a Connection was handed out to the application: one more outstanding
        System.out.println("Connections acquired: " + connectionCount.incrementAndGet());
    }

    @Override
    public void onCheckIn(Connection c, String parentDataSourceIdentityToken)
            throws Exception {
        // the Connection came back to the pool: one less outstanding
        System.out.println("Connections released: " + connectionCount.decrementAndGet());
    }
}
The onCheckOut method increments the count when a connection is acquired, whereas onCheckIn decrements it when the connection is released, so at any point the count is the number of connections currently checked out.
Execute some scenarios and observe the statistics on your console. If the count stays above 0 after a scenario completes, that scenario has a connection leak. Fix those and you will observe the difference.
As a side note, you can increase jdbc.maxPoolSize as a temporary measure until you deploy the fix.

Related

How to get visibility into completion queue on C++ gRPC server

Note: Help with the immediate problem would be great, but mostly I'm looking for advice on troubleshooting gRPC timing issues in general (this isn't my first such issue).
I am adding a new server streaming service to a C++ module which has an existing server streaming service, and the two appear to be conflicting. Specifically, the completion queue Next() call on the server is crashing intermittently after the C# client calls Cancel() on the cancellation token for one of the services. This doesn't happen if I run each service independently.
On the client, I get this at the response stream MoveNext() call:
System.InvalidOperationException
HResult=0x80131509
Message=Shutdown has already been called
Source=Grpc.Core
StackTrace:
at Grpc.Core.Internal.CompletionQueueSafeHandle.BeginOp()
at Grpc.Core.Internal.CallSafeHandle.StartReceiveMessage(IReceivedMessageCallback callback)
at Grpc.Core.Internal.AsyncCallBase`2.ReadMessageInternalAsync()
at Grpc.Core.Internal.ClientResponseStream`2.<MoveNext>d__5.MoveNext()
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
at MyModule.Connection.<DoSubscriptionReceives>d__7.MoveNext() in C:\snip\Connection.cs:line 67
On the server, I get this at the completion queue next() call:
Exception thrown: read access violation.
core_cq_tag->**** was 0xDDDDDDDD.
The stack trace:
MyModule.exe!grpc_impl::CompletionQueue::AsyncNextInternal(void * * tag, bool * ok, gpr_timespec deadline) Line 59 C++
> MyModule.exe!grpc_impl::CompletionQueue::Next(void * * tag, bool * ok) Line 176 C++
...snip...
It appears something is being added to the queue after shutdown. The difficulty is I have little visibility into what is being added into the queue and in what order.
I'm trying to write a server-side interceptor to log all requests & responses, but there seems to be no documentation. So far, poking through the API hasn't gotten me very far. Is there any documentation available on wiring up an interceptor in C++? Or, are there other approaches for troubleshooting timing conflicts between services?
Windows 11, Grpc.Core 1.27
What I've tried:
I first played with the GRPC_TRACE & GRPC_VERBOSITY environment variables. I was able to get some unhelpful output from the client, but nothing from the server. Of course, there's been lots of debugging, stripping the client & server down to barebones, disabling keep alives, ensuring we aren't using deadlines, having the services share a cancellation token, etc.
Update: I have found that the crash only happens when the client is run from an NUnit test. In that environment, the completion queue is getting more hits on Next(), but I'm still trying to figure out where they are coming from.
Is 1.27 the version you are using? That seems pretty old; there might have been fixes since then.
For using the C++ server interception API, I think you would find this very useful - https://github.com/grpc/grpc/blob/0f2a0f5fc9b9e9b9c98d227d16575d106f1e8d43/test/cpp/end2end/server_interceptors_end2end_test.cc#L48
One suggestion I have is to run the code under the sanitizers (https://github.com/google/sanitizers) to make sure there isn't a heap-use-after-free type bug.
I would also check for API misuse issues. (If you had posted the code, I could have taken a look to see if anything seems weird.)

Having trouble connecting to iSeries from .NET Core

This is a follow-up from the following question: Having trouble connecting to iSeries from .NET Core
The initial problem was resolved by setting a port number. I'm now running into the problem of the connection seemingly opening but hanging on the actual .Open() step, i.e., never continuing on to the next line of code. For reference, here's my code block:
public static DB2Connection GetDatabaseConnection(string connectionString)
{
    DB2Connection DB2Connection = new DB2Connection(connectionString);
    DB2Connection.SystemNaming = true;
    try
    {
        DB2Connection.Open();
        return DB2Connection;
    }
    catch (Exception ex)
    {
        throw ex;
    }
}
And my connection string is in this format: Server=###.###.###.###:#####;Database=DATABASE;UID=USER;PWD=PASSWORD;LibraryList=LIBRARY,LIST
Looking at the logs in i Navigator, I see that there is a job named Qzhqssrv when the connection is opened, with the user Quser, status Running, and type Prestart batch - Server. Looking into the log for that entry, I see: Job #####/QUSER/QZHQSSRV started on DATE at TIME in subsystem QUSRWRK in QSYS. Job entered system on DATE at TIME. However, it doesn't seem to continue beyond that.
Looking at the logs for a similar operation, when I'm connecting via Access Client Solutions, I get considerably more information and more steps in the logs. This leads me to believe that the system is waiting for me to send further information, however, my application is still stuck on .Open() - so perhaps there is something else I was supposed to send as part of the .Open() instruction. If so, I'm not sure what it would be.
Any insights would be greatly appreciated. Thanks!
Just to close this topic out: the problem was indeed the lack of a license. Connecting on port 446 was the correct approach, and once we got a license we were able to get the connection working. Thanks @nfgl!

JMS - Cannot retrieve message from queue. Happens intermittently

We have a Java class that listens to a database (Oracle) queue table and processes the records placed in that queue. It worked normally in the UAT and development environments. Since deployment to production, there are times when it cannot read a record from the queue: when a record is inserted, it does not detect it and the record remains in the queue. This seldom happens, but it happens. To give a statistic, out of 30 records queued in a day, about 8 don't make it. We would need to restart the whole app for it to be able to read the records.
Here is a code snippet of my class:
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;

public class SomeListener implements MessageListener {

    public void onMessage(Message msg) {
        InputStream input = null;
        try {
            TextMessage txtMsg = (TextMessage) msg;
            String text = txtMsg.getText();
            input = new ByteArrayInputStream(text.getBytes());
        } catch (Exception e1) {
            // logger is declared elsewhere in the class
            logger.error("Parsing from the queue.... failed", e1);
            e1.printStackTrace();
        }
        // process text message
    }
}
The weird thing is we can't find any traces of exceptions in the logs.
Can anyone help? By the way, we set the receiveTimeout to 10 seconds.
We would need to restart the whole app for it to be able to read the records.
The most common reason for this is the listener thread is "stuck" in user code (//process text message). You can take a thread dump with jstack or jvisualvm or similar to see what the thread is doing.
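If attaching jstack to the production JVM is awkward, you can capture the same information programmatically; a minimal sketch using the standard java.lang.management API (the class and method names here are just for illustration):

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;

public class ThreadDumper {

    // prints the name, state and full stack trace of every live thread
    public static void dumpAllThreads() {
        for (ThreadInfo info : ManagementFactory.getThreadMXBean().dumpAllThreads(true, true)) {
            System.out.println("\"" + info.getThreadName() + "\" state=" + info.getThreadState());
            for (StackTraceElement frame : info.getStackTrace()) {
                System.out.println("    at " + frame);
            }
            System.out.println();
        }
    }
}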
Another possibility (with low volume apps like this) is the network (most likely a router someplace in the network) silently closes an idle socket because it has not been used for some time. If the container (actually the broker's JMS client library) doesn't know the socket is dead, it will never receive any more messages.
The solution to the first is to fix the code; the solution to the second is to enable some kind of heartbeat or keepalives on the connection so that the network/router does not close the socket when it has no "real" traffic on it.
You would need to consult your broker's documentation about configuring heartbeats/keepalives.

ActiveMQ Override scheduled message

I am trying to implement a delayed queue with overriding of messages using ActiveMQ.
Each message is scheduled to be delivered with a delay of x (say 60 seconds).
In between, if the same message is received again, it should override the previous message.
So even if I receive 10 messages within x seconds, only one message should be processed.
Is there a clean way to accomplish this?
The question has two parts that need to be addressed separately:
Can a message be delayed in ActiveMQ?
Yes - see Delay and Schedule Message Delivery. You need to set <broker ... schedulerSupport="true"> in your ActiveMQ config, as well as setting the AMQ_SCHEDULED_DELAY property of the JMS message to say how long you want the message to be delayed (60000 ms for the 60 seconds in your case).
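A minimal producer-side sketch (the broker URL and queue name are placeholders; ScheduledMessage.AMQ_SCHEDULED_DELAY is the ActiveMQ property key for the delay in milliseconds):

import javax.jms.Connection;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;
import org.apache.activemq.ActiveMQConnectionFactory;
import org.apache.activemq.ScheduledMessage;

public class DelayedSender {

    public static void main(String[] args) throws Exception {
        ActiveMQConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
        Connection connection = factory.createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageProducer producer = session.createProducer(session.createQueue("delayed.queue"));

        TextMessage message = session.createTextMessage("payload");
        // deliver 60 seconds from now; requires schedulerSupport="true" on the broker
        message.setLongProperty(ScheduledMessage.AMQ_SCHEDULED_DELAY, 60000L);
        producer.send(message);

        connection.close();
    }
}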
Is there any way to prevent the same message being consumed more than once?
Yes, but that's an application concern rather than an ActiveMQ one. It's often referred to as de-duplication or idempotent consumption. The simplest way, if you only have one consumer, is to keep track of messages received in a map and check that map whenever you receive a message; if the message has been seen before, discard it (a sketch follows below).
For more complex use cases where you have multiple consumers on different machines, or you want that state to survive application restarts, you will need to keep a table of messages seen in a database and query it each time.
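A minimal single-consumer sketch of the in-memory approach (the "dedup-key" message property is hypothetical; use whatever business key identifies "the same message" in your system):

import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import javax.jms.Message;
import javax.jms.MessageListener;

public class DeduplicatingListener implements MessageListener {

    // keys already processed; synchronized in case the container delivers from several threads
    private final Set<String> seen = Collections.synchronizedSet(new HashSet<String>());

    public void onMessage(Message message) {
        try {
            // "dedup-key" is a hypothetical property carrying the business key of the message
            String key = message.getStringProperty("dedup-key");
            if (!seen.add(key)) {
                return; // this logical message was already handled, discard the duplicate
            }
            // ... process the message ...
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Note that the set grows without bound in this sketch; in practice you would evict old keys or move the state to a database as described above.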
Please vote this answer up if it helps, as it encourages people to help you out.
Also, according to the following method from the ActiveMQ BrokerService class, you should configure persistence in order to be able to use the scheduler functionality:
public boolean isSchedulerSupport() {
    return this.schedulerSupport && (isPersistent() || jobSchedulerStore != null);
}
You can configure the ActiveMQ broker to enable schedulerSupport with the following entry in your activemq.xml file, located in the conf directory of your ActiveMQ home directory:
<broker xmlns="http://activemq.apache.org/schema/core" brokerName="localhost" dataDirectory="${activemq.data}" schedulerSupport="true">
You can also override the BrokerService in your Spring configuration:
@Configuration
@EnableJms
public class JMSConfiguration {

    @Bean
    public BrokerService brokerService() throws Exception {
        BrokerService brokerService = new BrokerService();
        brokerService.setSchedulerSupport(true);
        return brokerService;
    }
}

Application_Start timeout?

I have one piece of code that gets run on Application_Start for seeding demo data into my database, but I'm getting an exception saying:
The ObjectContext instance has been disposed and can no longer be used for operations that require a connection
while trying to enumerate one of my entities: DB.ENTITY.SELECT(x => x.Id == value);
I've checked my code and I'm not disposing my context before my operation. Below is an outline of my current implementation:
protected void Application_Start()
{
    SeedDemoData();
}

public static void SeedDemoData()
{
    using(var context = new DBContext())
    {
        // my code is run here.
    }
}
So I was wondering if Application_Start is timing out and forcing my db context to close its connection before it completes.
Note: I know the code works because I'm using it in a different place; it is unit tested and works there without any issues.
Any ideas of what could be the issue here? or what I'm missing?
After a few hours investigating the issue, I found that it is caused by the data context having pending changes on a different thread. Our database upgrade/migration code runs on a thread parallel to our Application_Start method, so the entity I'm trying to enumerate is being altered at the same time. Even though the two threads use different data contexts, EF notices that something is wrong while accessing the entity and returns a misleading error message saying that the data context is disposed, while the actual problem is that the entity state is modified but not saved.
The actual solution for my issue was to move all the seed data functions into the database upgrade/migration scripts, so that the entities are only modified in one place at a time.
