Background
We have a number of web applications on different web servers that connect to a single database server. Over the past couple of months, we have noticed that every once in a while our web servers are unable to connect to the database server.
Our Environment
We have a couple of different web environments, some running ColdFusion and others running .NET. The .NET apps are both Web Forms and MVC, and span multiple versions from 2.0 to 4.5. Both the ColdFusion and .NET web servers are Windows-based machines. Both web environments are clustered; some of the machines are physical while others are virtual.
Our database server is SQL Server 2008 R2. It houses multiple databases. Each application connects with its own database user, which has access only to that application's database.
Other Facts
When we notice issues, they occur in short bursts that last anywhere from a couple seconds to a couple minutes.
When we notice issues, the burst contains errors from multiple different applications, not just one app at a time.
When we notice issues, the burst contains errors from applications from different web environments. (This makes us think we can rule out that the apps themselves are the issue)
The bursts of connection issues happen at various times throughout the day and night. They are not always during times of high usage.
We have monitored things like the number of user connections, memory, IO, and CPU usage, and we have not seen spikes or anything else that might point to a problem.
We have installed Wireshark on the web and DB servers in hopes of catching the problem, without any success.
Questions
Does anyone have suggestions on where I should look next?
Are there properties of the database that could cause this?
Is there any way to "monitor" the connection between the database and web server in a better manner?
Is there anything that can be done on the app side to better understand what is happening?
Errors Caught by Apps
.NET errors
A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server)
Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.)
Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.
ColdFusion errors
Error Executing Database Query. The TCP/IP connection to the host has failed. java.net.ConnectException: Connection timed out: connect The error occurred on line 38.
Error Executing Database Query. Connection reset by peer: socket write error The error occurred on line 91.
Error Executing Database Query. Timed out trying to establish connection The error occurred on line 38.
In CF, I once had an issue like the one you are seeing. I had CF on one server and SQL Server 2008 R2 on another, and I would see CF errors like the ones you posted. To help trace it to a network error, I wrote something like this:
1) Created a down.bat containing:
tracert serverip
2) Put a <cftry>/<cfcatch> around the query, so that when the query generated the error I would execute:
<cftry>
<!--- the failing cfquery goes here --->
<cfcatch>
<!--- run the traceroute batch file and capture its output in "log" --->
<cfexecute name="C:\path\to\down.bat" variable="log" timeout="60" />
<cfmail to="ME" from="Server" subject="SQL DOWN">
Server Debugging Info:
------------------------------------------------------------
#now()#
#cfcatch.Detail#
#cfcatch.Message#
#log#
</cfmail>
</cfcatch>
</cftry>
This helped me fix my situation, which ended up being a hardware problem at the datacenter.
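For the "monitor the connection" part of the question, the same idea can run continuously instead of only on failure. The following is a minimal sketch, not a tested monitoring tool: `probe` and `run` are hypothetical names, port 1433 is only the SQL Server default and may not match your instance, and the log file name is made up. It times DNS resolution and the TCP connect separately, so a burst shows up as either a name-resolution failure or a connect failure.

```python
import socket
import time
from datetime import datetime, timezone


def probe(host, port=1433, timeout=5.0):
    """Try DNS resolution and then a TCP connect; report how long each step took."""
    result = {
        "time": datetime.now(timezone.utc).isoformat(),
        "host": host,
        "port": port,
        "ip": None,
        "dns_ms": None,
        "connect_ms": None,
        "error": None,
    }
    try:
        start = time.monotonic()
        # Resolve the name first so a DNS failure is distinguishable from a TCP failure.
        info = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
        result["ip"] = info[0][4][0]
        result["dns_ms"] = round((time.monotonic() - start) * 1000, 1)

        start = time.monotonic()
        # Connect to the resolved address; the socket is closed right after the handshake.
        with socket.create_connection((result["ip"], port), timeout=timeout):
            pass
        result["connect_ms"] = round((time.monotonic() - start) * 1000, 1)
    except OSError as exc:
        # socket.gaierror and socket.timeout are both subclasses of OSError.
        result["error"] = f"{type(exc).__name__}: {exc}"
    return result


def run(host, port=1433, interval=10.0, logfile="db_probe.log"):
    """Probe forever, appending one line per attempt to a log file (hypothetical name)."""
    while True:
        with open(logfile, "a") as fh:
            fh.write(repr(probe(host, port)) + "\n")
        time.sleep(interval)
```

Running `run("your-db-host")` from each web server at the same time would let you line up the timestamps of failures across machines. It only checks that the port accepts a TCP connection; it does not log in to SQL Server.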
Related
My ASP.NET application uses ODBC (64-bit) to connect to a DB2 database (9.5.3). I am using the 64-bit IBM DB2 Client 10.5 on Windows Server 2008 R2.
Connection pool is turned on.
It works fine (immediate connectivity) most of the time, but occasionally it takes far too long to establish a connection: 10 to 11 minutes. No error is reported. There is no issue with the database, as it can be accessed from other servers at the same time.
All database requests issued during this time via the DB2 client are kept on hold, and once connectivity is established they are all executed immediately.
While the issue was going on, I tried to connect to the database from the Windows server command prompt, and that also waited for around 10 minutes. No error was reported.
The network team says they don't see network traffic from the Windows server to the database server while the issue is going on, and no errors on the network side. This suggests the DB2 client is not making a connection request.
What could be causing the delay in connectivity? This is not a consistent issue. It resolves automatically after around 10 minutes. Is there any issue with the DB2 driver? Was some resource withheld during that time? I am closing all connections properly.
I have an ASP.NET application that connects to a SQL Server backend on another server. At random points throughout the day I will get a message when logging into the application that states:
A network-related or instance-specific error occurred while
establishing a connection to SQL Server. The server was not found or
was not accessible. Verify that the instance name is correct and that
SQL Server is configured to allow remote connections. (provider: Named
Pipes Provider, error: 40 - Could not open a connection to SQL Server)
This may happen 2-3 times per day and will last about 30 minutes or so. Then everything will run fine for a few more hours until it happens again. The crazy thing about it is that I cannot even ping the server using its hostname while this error is occurring. I can ping it by the IP address, though. Once the web application can connect again, I can ping the server again by its hostname.
Even stranger is that I have another web application that is running on the same webserver and uses the same SQL server database as a backend that continues to function while the other web application is generating the error.
Each application is running in a different application pool on IIS 6 (recycling the affected app pool does not appear to resolve the problem). Both applications run on .NET 4 and the web server is Windows Server 2003. The database is running on SQL Server 2008.
Any ideas what my issue could be?
Looks like a DNS issue, and it probably has nothing to do with your code. Can you use the IP address in your connection string instead?
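To confirm the DNS theory before changing the connection string, you could record what the hostname resolves to while the error is happening. This is a small sketch under assumptions: `check_name` is a hypothetical helper, and you would pass in your own server name and the IP address you already know works.

```python
import socket


def check_name(hostname, expected_ip):
    """Classify whether hostname currently resolves to the address we expect."""
    try:
        # gethostbyname_ex returns (canonical_name, aliases, ipv4_addresses).
        _, _, addresses = socket.gethostbyname_ex(hostname)
    except OSError as exc:
        # Resolution itself failed, which matches "cannot ping by hostname".
        return f"dns_failure: {exc}"
    if expected_ip not in addresses:
        # The name resolved, but to something other than the known-good address.
        return f"dns_mismatch: got {sorted(addresses)}"
    return "ok"
```

If this reports `dns_failure` or `dns_mismatch` only during the outages, that would point at name resolution rather than at SQL Server or the application.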
I get the following error whilst trying to connect from an ASP.NET web application from a particular server to an instance of SQL Server 2005 on a different server.
An error has occurred while
establishing a connection to the
server. When connecting to SQL Server
2005, this failure may be caused by
the fact that under the default
settings SQL Server does not allow
remote connections. (provider: SQL
Network Interfaces, error: 26 - Error
Locating Server/Instance Specified)
This article lists 5 steps:
http://blogs.msdn.com/b/sql_protocols/archive/2007/05/13/sql-network-interfaces-error-26-error-locating-server-instance-specified.aspx
We have eliminated each of these (there is no firewall).
When we run the application on a different web server (in the same network), it can connect to the database. Similarly, when we run the database on a different server, the application can see it. It appears to be a problem between these two servers.
The problem occurred at random. It was working one minute, then not, and hasn't been working since. Nothing was installed or changed (as far as I can work out) on the server.
Any ideas?
Thanks
Have you tried connecting using an IP address rather than a qualified server name in the connection string? Sometimes the remote server name cannot be resolved but the IP address can be reached, depending on the domain/network DNS setup.
I developed a web site using Sybase PowerBuilder V12.0 Classic, and the output is deployed and converted to ASP.NET (ASPX) web pages.
The database connection is configured and working properly: the data is displayed in the web application using DataWindow objects. It is a direct connection to the Sybase database server, set up with the dsedit tool, and no ODBC is used.
I edited the Sybase database configuration settings related to remote servers and connections to be higher than the default value of 25, although the real number of users will not exceed that at the same time.
Recently I received an error message when some users connected to the web site, after they made valid logins.
The error message is:
Maximum number of connections already opened
ct_connect(): user api layer: external
error: The maximum number of
connections have already been opened.
I am confused about the cause of that error, as I think I have made all the configuration changes needed, and I checked every option and setting related to the number of connections in the Sybase database server, the application deployment settings in Sybase PowerBuilder V12.0 Classic, and the IIS settings.
I use Windows Server 2003 and the IIS version is 6.0 in the Web Server.
I appreciate any suggestion or hint to solve that problem and Thanks in Advance :)
The error message says it all really. There are too many concurrent connections to your database.
Perhaps your application doesn't close all of them. If you do not close database connections explicitly, they can remain open for some time.
Now, I never developed for Sybase, but that is what is usually the case with MSSQL server when this error occurs.
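The general pattern of always releasing a connection can be sketched as follows. This is only an illustration of the idea: it uses Python's built-in `sqlite3` as a stand-in because it needs no server, it is neither Sybase nor SQL Server, and `fetch_one` is a hypothetical helper. In .NET the equivalent idea is wrapping the connection in a `using` block.

```python
import sqlite3
from contextlib import closing


def fetch_one(db_path, sql, params=()):
    """Run one query and release the connection before returning."""
    # closing() guarantees conn.close() runs even if execute() raises.
    with closing(sqlite3.connect(db_path)) as conn:
        with closing(conn.cursor()) as cur:
            cur.execute(sql, params)
            return cur.fetchone()
```

The point of the design is that the close happens on the error path as well, which is usually where connections leak.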
I have this error, which happens many times on my site:
"A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: TCP Provider, error: 0 - An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full.)"
My site files are on one server and the database is on another server, but both are on the same network.
Does anyone know what may cause this problem and how to solve it?
Notes:
- When I restart the website server, the problem goes away and everything works fine.
- I am sure my SQL Server allows remote connections from this website server.
It's a pretty generic error when the application can't get to the database. Perhaps you are leaving connections open after using them? There could be a limit on the number of concurrent connections per user or per server.
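One way to check the "connections left open" theory is to count TCP connections by state on the web server. The sketch below is an assumption-laden helper, not a definitive tool: `count_states` is a hypothetical name, it expects the column layout that Windows `netstat -an` prints (protocol, local address, foreign address, state), and 1433 in the usage note is only the SQL Server default port.

```python
from collections import Counter


def count_states(netstat_text, remote_port=None):
    """Count TCP connection states in text captured from `netstat -an`."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and UDP rows (which have no state column).
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        foreign, state = fields[2], fields[3]
        if remote_port is not None:
            # The port is whatever follows the last colon of the foreign address.
            if foreign.rsplit(":", 1)[-1] != str(remote_port):
                continue
        counts[state] += 1
    return counts
```

You would save the output of `netstat -an` to a file every minute or so and call `count_states(text, remote_port=1433)` on each capture. A count of ESTABLISHED or TIME_WAIT entries that keeps growing until the error appears would be consistent with connections not being released, though it does not prove it.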
It's a networking error of some kind, or you're specifying the instance name incorrectly. I've recently seen both cases. In one case, I had moved a connection string from one system, to another system that had a different instance name installed. In another case, I had a server running in a Virtual Machine - about ten minutes after the machine was resumed from suspension, the machine would lose its DHCP lease (and therefore its connectivity).
It could be just about anything, and you'll have to go find out, perhaps by using a network monitor program like Microsoft Network Monitor 3.2.
I found a good article about this error; I think it shows the real reason:
Understanding the error “An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full.”