NHibernate Memory Leak - asp.net

My company has an ASP.NET application that runs out of memory and throws OutOfMemoryExceptions after only a couple of days of activity by our customers. I am able to reproduce the error in our testing environment, and I created a hang dump using adplus. Looking at the largest/most numerous objects on the heap, I noticed that we have over 500,000 NHibernate.SqlCommand.Parameter objects. This cannot be correct! We have 33 session factories instantiated in total, one per client database. The version of NHibernate we are using is 2.1.0.4000.
We have disabled the second-level cache, the query plan cache, and the query cache, and we still see 500,000 NHibernate.SqlCommand.Parameter objects in the memory dump.
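For reference, disabling the second-level cache and the query cache in NHibernate is normally done with configuration properties along these lines (standard NHibernate property names; this is a sketch rather than our exact configuration file):
<property name="cache.use_second_level_cache">false</property>
<property name="cache.use_query_cache">false</property>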
Has anybody seen this behavior?

We have a similar problem with our application (NHibernate 2.1.2.4000, ODP.net 2.111.7.0 on Windows 7). When we insert data into the database, we end up with a huge memory and handle leak:
for (int i = 1; i < 10000; i++)
{
    using (var session = _sessionFactory.OpenSession())
    {
        var tx = session.BeginTransaction();
        // insert a few rows into one table
        tx.Commit();
    }
}
The only fix for the problem is to set Enlist=false in the connection string or use the OracleClientDriver instead of the OracleDataClientDriver. This problem did not happen in NHibernate 1.2. There was an even worse connection leak when we tried this with TransactionScope.
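For anyone hitting the same leak, the workaround amounts to adding Enlist=false to the ODP.NET connection string, roughly like this (data source, user and password are placeholders):
Data Source=ORCL;User Id=app_user;Password=secret;Enlist=false;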

Related

EntityException: The underlying provider failed on Open. Can one server closing a db connection, make another server fail on opening?

I am experiencing database connection errors with an ASP.NET application written in VB, running on three IIS servers. The underlying database is MS Access, which sits on a shared network device. The application uses Entity Framework with a code-first implementation and the JetEntityFrameworkProvider.
The application runs stably, but approximately 1 out of 1000 attempts to open the database connection fails with one of the following two errors:
06:33:50 DbContext "Failed to open connection at 2/12/2020 6:33:50 AM +00:00 with error:
Cannot open database ''. It may not be a database that your application recognizes, or the file may be corrupt.
Or
14:04:39 DbContext "Failed to open connection at 2/13/2020 2:04:39 PM +00:00 with error:
Could not use ''; file already in use.
One second later, after refreshing (F5), the error is gone and everything works again.
Details about the environment and the code used:
Connection String
<add name="DbContext" connectionString="Provider=Microsoft.Jet.OLEDB.4.0;Data Source=x:\thedatabase.mdb;Jet OLEDB:Database Password=xx;OLE DB Services=-4;" providerName="JetEntityFrameworkProvider" />
DbContext management
The application uses a public property to access the DbContext. The DbContext is kept in the HttpContext.Current.Items collection for the lifetime of the request and is disposed at its end.
Public Shared ReadOnly Property Instance() As DbContext
    Get
        SyncLock obj
            If Not HttpContext.Current.Items.Contains("DbContext") Then
                HttpContext.Current.Items.Item("DbContext") = New DbContext()
            End If
            Return HttpContext.Current.Items.Item("DbContext")
        End SyncLock
    End Get
End Property
The BasePage initializes and disposes the DbContext.
Protected Overrides Sub OnInit(e As EventArgs)
    MyBase.OnInit(e)
    DbContext = Data.DbContext.Instance
    ...
End Sub

Protected Overrides Sub OnUnload(e As EventArgs)
    MyBase.OnUnload(e)
    If DbContext IsNot Nothing Then DbContext.Dispose()
End Sub
What I have tried
Many of the questions on SO which address the above error messages deal with not being able to establish a connection to the database at all. That's different in this case: the connection works 99.99% of the time.
Besides that, I have checked:
Permissions: full access is granted on the share where the .mdb (database) and .ldb (locking file) reside.
Network connection: there are no connection issues to the shared device; it’s a Gigabit LAN connection
Maximum number of 255 concurrent connections is not reached
Maximum size of database not exceeded (db has only 5 MB)
Changed the compile option from “Any CPU” to “x86” as suggested in this MS Dev-Net post
Quote: I was getting the same "Cannot open database ''" error, but completely randomly (it seemed). The MDB file was less than 1Mb, so no issue with a 2Gb limit as mentioned a lot with this error.
It worked 100% on 32 bit versions of windows, but I discovered that the issues were on 64 bit installations.
The app was being compiled as "Any CPU".
I changed the compile option from "Any CPU" to "x86" and the problem has disappeared.
Nothing helped so far.
To gather more information, I attached an NLog logger to the DbContext which writes all database actions and queries to a log file.
Shared Log As Logger = LogManager.GetLogger("DbContext")
Me.Database.Log = Sub(s) Log.Debug(s)
Investigating the logs, I figured out that when one of the above errors occurred on one server, another of the servers (3 in total) had closed the db connection at exactly the same time.
Here are two examples which correspond to the above errors:
06:33:50 DbContext "Closed connection at 2/12/2020 6:33:50 AM +00:00
14:04:39 DbContext "Closed connection at 2/13/2020 2:04:39 PM +00:00
Assumption
When all connections of a DbContext have been closed, the corresponding record is removed from the .ldb lock file. When a connection to the db is opened, a record is added to the lock file. When these two events occur at exactly the same time, from two different servers, there is a write conflict on the .ldb lock file, which results in one of the errors above.
Question
Can anyone confirm or prove this wrong? Has anyone experienced this behaviour? Maybe I am missing something else. I’d appreciate your input and experience on this.
If my assumption is true, a solution could be to use a helper class for db access which catches and handles this error, waits for a short time, and tries again.
But this feels kind of wrong. So I am also open to suggestions for a “proper” solution.
EDIT: The "proper" solution would be using a DBMS Server (as stated in the comments below). I'm aware of this. For now, I have to deal with this design mistake without being responsible for it. Also, I can't change it in the short run.
I write this as an answer because of space, but it is not really an answer.
It's for sure an OleDb provider issue.
I think it is a sharing issue.
You could try a few things:
Use a newer OLE DB provider instead of Microsoft.Jet.OLEDB.4.0 (if you have tried 64-bit you may already have tried another provider, because Jet.OLEDB.4.0 is 32-bit only).
Implement a retry mechanism on the new DbContext()
Reading your tests, this is probably not your case. I think that Dispose does not always work properly on Jet.OLEDB.4.0 connections. I noticed it in tests and solved it by using a different testing engine. Before giving up I used this piece of code:
GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced, true);
GC.WaitForPendingFinalizers();
GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced, true);
As you can tell from this code, these were just attempts; the final solution was changing the testing engine.
If your app is not too busy, you could try to lock the db using a different mechanism (for example a lock file). This is not really different from retrying new DbContext().
In the late '90s I remember an issue related to a disk-sharing OS (I was using Novell NetWare). I have no real experience using .mdb files on a network share; you could try moving the .mdb to a folder shared by Windows.
I actually use Access databases only for tests. If you really need a single-file database you could try other solutions: SQLite (you need a library, also written by me, to apply code first: https://www.nuget.org/packages/System.Data.SQLite.EF6.Migrations/) or SQL Server CE.
Use a DBMS server. This is for sure the best solution. As the writer of JetEntityFrameworkProvider, I think single-file databases are great for single-user apps (for these I suggest SQLite), for tests (for tests I think JetEntityFrameworkProvider is great), for transferring data, or for read-only applications. In other cases use a DBMS server. As you know, with EF you can switch from JetEntityFrameworkProvider to SQL Server or MySQL without effort.
You went wrong at the design stage: the MS Access database engine is unfit for ASP.NET sites, and this is explicitly stated in multiple places, e.g. on the official download page under Details.
The Access Database Engine 2016 Redistributable is not intended .... To be used by ... a program called from server-side web application such as ASP.NET
If you really have to work with an Access database, you can run a helper class that retries in case of common errors. But I don't recommend it.
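A minimal sketch of what such a retry helper could look like (the class name, back-off values, and the messages treated as transient are illustrative assumptions, not part of the original application):
using System;
using System.Threading;

public static class JetRetry
{
    // Runs the given action, retrying a few times when Jet reports one of the
    // intermittent "file already in use" / "cannot open database" errors.
    public static T Execute<T>(Func<T> action, int maxAttempts = 3)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (Exception ex) when (attempt < maxAttempts && IsTransient(ex))
            {
                // Brief back-off; the conflicting close on the other server is over quickly.
                Thread.Sleep(100 * attempt);
            }
        }
    }

    private static bool IsTransient(Exception ex)
    {
        // EF usually wraps the provider error, so inspect inner exceptions too.
        for (var e = ex; e != null; e = e.InnerException)
        {
            if (e.Message.Contains("file already in use") ||
                e.Message.Contains("It may not be a database that your application recognizes"))
                return true;
        }
        return false;
    }
}
Wrapping the New DbContext() call (or the individual queries) in such a helper would hide the rare conflict at the cost of a short delay.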
The proper solution here is using a different RDBMS which exhibits stateless behavior. I recommend SQL Server Express; it has limitations, but if you exceed those you will be far beyond what Access supports anyway, and it won't cause errors like this.

Slow AppFabric, high CPU and Memory Usage

I implemented AppFabric 1.1 in my ASP.NET web application. I am using the read-through approach because I just need to read images from my SQL database and store them in the cache, so I can retrieve that data as quickly as possible.
Checking the shell, I can see that my application reads successfully from the cache and writes to the cache when it is empty. However, AppFabric is not as fast as I expected: the version without AppFabric is faster than the one with it. In addition, when I use AppFabric I see high CPU and memory usage.
What are the potential reasons for that? What do you suggest?
I'd appreciate your ideas.
Without more details it's hard to tell for sure, but I can try to help from my experience with AppFabric. Are we talking about high memory usage on the AppFabric server or on the client machine (not sure if you are using a web app or something else)?
AppFabric will be slower than in-proc memory; also, AppFabric should not be on the same server as your application.
How are you creating the AppFabric DataCacheFactory? Are you creating it for every request? That is bad, as it is expensive, so it should be a static/singleton. I do something like:
public class AppFabricDistributedCacheManagerFactory {
    private static DataCacheFactory _dataCacheFactory;

    public void Initialize()
    {
        if (_dataCacheFactory == null)
        {
            _dataCacheFactory = new DataCacheFactory();
        }
    }
    ......
Do you have local cache enabled in AppFabric? For images it seems appropriate.
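For reference, local cache is a client-side setting supplied when the DataCacheFactory is built; a minimal sketch (the object count and timeout values are arbitrary examples, not recommendations):
var config = new DataCacheFactoryConfiguration();
// Keep up to 10,000 deserialized objects on the client, expiring after 5 minutes.
config.LocalCacheProperties = new DataCacheLocalCacheProperties(
    10000,
    TimeSpan.FromMinutes(5),
    DataCacheLocalCacheInvalidationPolicy.TimeoutBased);
_dataCacheFactory = new DataCacheFactory(config);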
Make sure your provider is not throwing exceptions and is only calling AppFabric when it really should. Put Fiddler on your dev box and watch the requests. Watch for:
The first call to AF: are you using regions? Make sure you create them.
If you are creating regions, do you make sure they exist before you save? Just in case, take a look at this code; before I did this I had a few issues:
public void SaveToProvider(string key, TimeSpan duration, string regionName, object toSave)
{
    try
    {
        Cache.Put(key, toSave, duration, regionName);
    }
    catch (DataCacheException cacheError)
    {
        // Look at the ErrorCode property to see if the Region is missing
        if (cacheError.ErrorCode == DataCacheErrorCode.RegionDoesNotExist)
        {
            // Create the Region and retry the Put call
            Cache.CreateRegion(regionName);
            Cache.Put(key, toSave, duration, regionName);
        }
    }
}
Watch the requests when you request an item that is not in the cache: you should see a call to AF, then the image being loaded, then another call to AF to save it.
Watch the requests when you know the item is already loaded: if you are using local cache you should see no AF requests, and one if you are not.

SQL error timeout with transaction

I have an ASP.NET application importing data from a CSV file and storing it in a (SQL Server) database table. Basically, the import process consists of:
Importing the raw CSV data into a corresponding SQL table (with the same columns)
"Merging" the data into the DB with some SQL statements (INSERTs and UPDATEs)
The whole import procedure is wrapped with a transaction.
using (SqlConnection c = new SqlConnection(cSqlHelper.GetConnectionString()))
{
    c.Open();
    SqlTransaction trans = c.BeginTransaction();
    SqlCommand cmd = new SqlCommand("DELETE FROM T_TempCsvImport", c, trans);
    cmd.ExecuteNonQuery();
    // Other import SQL ...
    trans.Commit();
}
Trying this import procedure from a virtual machine (everything is local), I got an error
[SqlException (0x80131904): Timeout. The timeout period elapsed prior to completion of the operation or the server is not responding.
Trying the same without the transaction works fine.
Some things I tried:
Executing the same queries from SQL Server Management Studio: all of them run quite fast (~500 ms)
Executing from my development machine: works fine
Increasing the Command Timeout: I get the error anyhow. I also tried setting CommandTimeout to 0 (infinite), and the procedure seems to run "forever" (I get a server timeout, which I set to 10 minutes); see the sketch below for where this is set
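For reference, the timeout was changed on the command object itself, roughly like this (a sketch, not the exact import code):
SqlCommand cmd = new SqlCommand("DELETE FROM T_TempCsvImport", c, trans);
cmd.CommandTimeout = 0; // 0 means wait indefinitely instead of the default 30 seconds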
So, the final question is: why is the SQL transaction creating such problems? And why does it work without the transaction?
After several tests I did, I found out that the problem is ...Not Enough Memory!
What I found out is that my situation is exactly the same as this answer:
Help troubleshooting SqlException: Timeout expired on connection, in a non-load situation
I have both IIS and SQL Server on my local machine, with the test running on a virtual machine. The virtual machine was using 2 GB of RAM, that is, 50% of the total RAM of my PC. Reducing the RAM available to the virtual machine to 512 MB fixed the problem.
Furthermore, I noticed that using a transaction or not has exactly the same result when the system is working, so my first guess was wrong as well.

HttpRuntime.Cache corrupting data under high load on Windows Azure

I have an application which caches data from the database in the HttpRuntime.Cache. When I load test this application with 1000 users per second some values in the cache become corrupted.
For example, I have a page which simply queries the database for its content. It first checks the cache and, if the data is available, gets it from there.
DataSet ds;
var cachedData = HttpRuntime.Cache["homepage"];
if (cachedData == null) {
    ds = getDataSet("SQL query...");
    addToCache("homepage", ds);
}
else {
    ds = (DataSet)cachedData;
}
This works fine up to about 100 users per second, but when I stress test with up to 1000 users, some of the fields in the cached tables return DBNull.Value. After the test, when I check what's in the cache, I can see the fields are now DBNull.Value.
I have enabled logging and have checked that the DataSet is only added to the cache once but somehow it's getting corrupted during stress testing.
Has anyone seen this before or have some pointers on what's going wrong? It's being hosted under Windows Azure with dedicated worker roles for caching.
The problem turned out to be that I was updating a column in a table after returning it from the cache, which corrupted the data in the cache, but only intermittently, so it was very hard to detect. Now, after getting the DataSet from the cache, I take a copy of it using DataSet.Copy and work with that instead of the original.
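In other words, the read path now looks roughly like this (a sketch of the fix, not the exact production code):
DataSet ds;
var cachedData = HttpRuntime.Cache["homepage"];
if (cachedData == null) {
    ds = getDataSet("SQL query...");
    addToCache("homepage", ds);
}
else {
    // Deep-copy so later edits cannot mutate the instance shared via the cache.
    ds = ((DataSet)cachedData).Copy();
}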

AS400 Data Connection in ASP.NET

I have an application that resides in a business-to-business network and communicates with our AS400 in our internal network environment. The firewall has been configured to allow the data request through to our AS400, but we are seeing a huge lag in connection speed and response time. For example, what takes less than half a second in our local development environments takes upwards of 120 seconds in our B2B environment.
This is the function we use to get our data. We are using the Enterprise Library application blocks, so the ASI object is the Database...
/// <summary>
/// Generic function to retrieve data table from AS400
/// </summary>
/// <param name="sql">SQL String</param>
/// <returns></returns>
private DataTable GetASIDataTable(string sql)
{
    DataTable tbl = null;
    HttpContext.Current.Trace.Warn("GetASIDataTable(" + sql + ") BEGIN");
    using (var cmd = ASI.GetSqlStringCommand(sql))
    {
        using (var ds = ASI.ExecuteDataSet(cmd))
        {
            if (ds.Tables.Count > 0) tbl = ds.Tables[0];
        }
    }
    HttpContext.Current.Trace.Warn("GetASIDataTable() END");
    return tbl;
}
I am trying to brainstorm some ideas to consider as to why this is occurring.
I have never used ASP.NET or the AS400 in anger, but I have seen this kind of behaviour before, and it usually indicates some kind of network problem, typically a reverse DNS lookup that is timing out.
Assuming you have ping enabled through your firewall, check that you can ping in both directions.
Also run traceroute from each machine to try and diagnose where a delay might be.
Hope that helps.
Sorry, I can't tell you exactly what is going on, but I have a couple of comments...
First, I would output the SQL and see if it has a lot of joins and/or is hitting a table (file) with a large number of records. If you really want to dig in, fire up your profiler of choice (I use ANTS Profiler) and try to find a profiler for the 400; see what the server resources are, as well as the actual query after it goes through the ODBC driver.
I have worked with ASP.NET and the AS400 a few times, and the way I have been most successful is actually using SQL Server with a linked server to the AS400. I created a view to make it simpler to work with, hiding the oddities of AS400 naming. It worked well in my scenario because the application needed to pull information from SQL Server anyway.
I thought I would mention it in case it helps... best of luck
Check the size of your iSeries system as well. Depending on the size of the query, and whether the system is undersized for the applications running on it, this may take time. It shouldn't be ruled out as a possibility; I have seen similar behavior in the past. But of course a network issue is more likely.
The other idea, if you can't solve the speed issue or it is a sizing problem, is to store the data in MS SQL Server and then write the records from SQL Server to the iSeries from there.
