This is an ASP.NET 4.0 application that uses an Oracle cluster through ODP.NET and the Distributed Transaction Coordinator (MSDTC). Inside a System.Transactions.TransactionScope transaction, it saves data to two different databases (db1 and db2) and commits the transaction only if both save operations succeed.
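For reference, the save pattern looks roughly like this; a minimal sketch assuming ODP.NET's Oracle.DataAccess.Client provider, where the connection strings, table names and values are placeholders rather than the real code:

```csharp
using System.Transactions;
using Oracle.DataAccess.Client;

public void SaveToBothDatabases(string db1ConnStr, string db2ConnStr)
{
    using (var scope = new TransactionScope())
    {
        using (var con1 = new OracleConnection(db1ConnStr))
        using (var cmd1 = new OracleCommand("insert into orders (id) values (:id)", con1))
        {
            con1.Open();                     // enlists in the ambient transaction
            cmd1.Parameters.Add("id", 1);
            cmd1.ExecuteNonQuery();          // save to db1
        }

        using (var con2 = new OracleConnection(db2ConnStr))
        using (var cmd2 = new OracleCommand("insert into order_details (id) values (:id)", con2))
        {
            con2.Open();                     // second connection promotes the transaction to MSDTC
            cmd2.Parameters.Add("id", 1);
            cmd2.ExecuteNonQuery();          // save to db2
        }

        scope.Complete();                    // commit only when both saves succeeded
    }                                        // disposing without Complete() rolls both back
}
```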
It had been working fine until Monday. Now, intermittently, data is deleted from one database (db1, whose save operation call succeeds) as soon as the ASP.NET request completes, while the other database (db2) still has the data associated with that request. Sometimes data is saved in both databases, sometimes in only one (db2). No exceptions are logged.
The only change on the server is the installation of ODP.NET 11.2.3.
Any idea what could be the reason?
We were having a problem with DTC as well with 11.2.3. Not the same issue you are having, but a huge problem nonetheless. We were getting "ORA-24776: cannot start a new transaction" Oracle errors under heavy load. No errors were logged in the database, and process tracing proved unhelpful. There was no code change other than the database upgrade from 10g to 11g. The 11.2.3 ODP drivers we were using worked perfectly against Oracle 10g; all the errors started when we upgraded the database to 11g. I searched and debugged for a month. The only thing I found that stopped it was updating to 11.2.4, which was released January 14. There are only xcopy deploys for this release, and strangely there are no release notes for this particular update. We contacted Oracle to find out what they fixed in this release, but they have failed to respond so far.
I know this is an older posting, but I hope this helps!
We are using a Self-Hosted Integration Runtime for Azure Data Factory.
On that machine an Exasol ODBC driver, version 6, was installed. We wanted to upgrade the driver, so we deleted the old one and installed the new version 7 driver.
The weird thing is that in the Exasol logs we can now see that Data Factory sometimes connects via driver version 7 and sometimes via driver version 6.
As an experiment, I deleted the Exasol ODBC driver from the machine completely. After that, Data Factory was still able to connect to Exasol using the driver I had just deleted.
It looks like the drivers' DLLs are cached somewhere. What could it be?
Update 1
I captured the following actions in Process Monitor when Data Factory connected to Exasol with the version 6 ODBC driver:
Where might these C:\Config.Msi\3739be5*.rbfASolution-6.1\ODBC\ DLLs come from? There is no C:\Config.Msi\ directory on the machine.
Update 2
I noticed that when I test the connection via the Microsoft Integration Runtime Configuration Manager on the machine or via a Data Factory Linked Service, the connection is always made with the version 7 ODBC driver.
But when I test the connection via a Data Factory Dataset, in some cases the connection is made with the version 6 ODBC driver.
You could check the registry, but clean it at your own risk (a sketch of where ODBC driver registrations live follows the steps below). An alternative might be the Sysinternals tools, Process Monitor or Process Explorer, which might help you get to the bottom of this. Install them on the SHIR VM if you are allowed to. Process Explorer in particular is a bit like SQL Profiler (if you've ever used that), so it will be able to tell you which registry keys external processes are using. It will give you a lot of information, so you will have to make judicious use of timestamps and filtering. The proposed steps:
Start a trace using Process Monitor
Start a pipeline using the Exasol driver
Wait until it completes (or at least until you know it has started)
Stop the Process Monitor trace
Spend time going through the millions of records it has captured, trying to filter down or search for your process
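If you do want to peek at the registry side, this is roughly where ODBC driver registrations live. A read-only sketch, assuming a 64-bit machine (the 32-bit hive is noted in the comment); nothing here is Exasol- or SHIR-specific:

```csharp
using System;
using Microsoft.Win32;

class OdbcDriverDump
{
    static void Main()
    {
        // 64-bit drivers; 32-bit ones live under SOFTWARE\WOW6432Node\ODBC\ODBCINST.INI
        using (var odbcInst = Registry.LocalMachine.OpenSubKey(@"SOFTWARE\ODBC\ODBCINST.INI"))
        using (var drivers = odbcInst?.OpenSubKey("ODBC Drivers"))
        {
            if (drivers == null) return;

            foreach (var driverName in drivers.GetValueNames())
            {
                using (var driverKey = odbcInst.OpenSubKey(driverName))
                {
                    // The "Driver" value is the full path of the DLL that the driver name resolves to
                    Console.WriteLine($"{driverName} -> {driverKey?.GetValue("Driver")}");
                }
            }
        }
    }
}
```

Comparing the DLL paths listed here against what Process Monitor shows being loaded should tell you whether a stale registration is still pointing at the old version 6 DLLs.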
An alternative would be to build a clean SHIR and install only the new driver, then swap it in for the old one. You may have to get the new SHIR added to the firewall if that is an issue for you.
Honestly, I would pursue both of these approaches in parallel for a production problem. Procmon / Process Explorer can be quite labour- and time-intensive but should help you get to the bottom of the issue. Building a clean SHIR is probably the safer option in the long term, but it requires new infrastructure.
It may sound silly, but rebooting the server where the SHIR is running solved the problem.
We noticed that this server had been running for more than 30 days and decided to reboot it. Maybe restarting the Integration Runtime service itself would also have helped, but we didn't try that.
Thanks to everyone for your help.
In a hybrid ASP.NET web application (framework 4.5.1) using LINQ to SQL (not Entity Framework), I'm getting the exception
"An unhandled exception of type 'System.StackOverflowException' occurred in System.Data.Linq.dll"
on any call to DataContext.SubmitChanges().
Every call to SubmitChanges() causes the error; it does not matter which specific entity is being altered. The error is thrown immediately (unlike most StackOverflowExceptions, which normally take a few seconds to occur while the errant code overflows the stack).
The ASP.NET web application is running on my local host in IIS Express using Visual Studio 2013. The database is SQL Server 2005.
My question is: how does one debug a StackOverflowException in this environment? Right now the above error message is all I get.
The Event Viewer notes that the browser crashed (it happens in both IE 11 and Chrome) but says nothing about the LINQ to SQL exception.
The SQL Server process monitor does not register any database call.
I have a log hooked up to my DataContext but it records nothing.
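For what it's worth, the logging is wired up roughly like this (MyDataContext, the entity and the log path are illustrative; DataContext.Log itself is the standard LINQ to SQL hook):

```csharp
using System.IO;
using System.Linq;

using (var db = new MyDataContext(connectionString))
using (var logWriter = new StreamWriter(@"C:\temp\linq-to-sql.log", append: true))
{
    db.Log = logWriter;              // LINQ to SQL writes each generated SQL command here before executing it

    var customer = db.Customers.Single(c => c.Id == 42);
    customer.Name = "Updated name";
    db.SubmitChanges();              // this call overflows the stack before anything reaches the log
}
```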
It appears the stack overflow is happening inside System.Data.dll before any database call occurs and before anything can be logged.
This suddenly started happening several hours ago, after a Windows update ran and the machine rebooted. That might be a coincidence.
Something else extremely odd: we have four developers in our shop, all using Visual Studio 2013. Two of us suddenly began having this problem, and two of us never had it. We're all running identical code and hitting the same database. The two of us having the problem rebooted, and the problem disappeared on one machine, but is still occurring on my machine.
In addition to rebooting, I've deleted the project from my machine and pulled it down from source control so that it is identical to what my three co-workers have, deleted all temporary internet files on my machine, and deleted all of my AppData\Local\temp files for my login.
Is there any way to debug this issue?
Clip of the call stack when the exception occurs (the calls to VisitExpression etc. repeat many dozens of times until it ends).
The unsatisfying "answer" in this case was to delete the *.dbml file and re-create it. That fixed the stack overflow error.
My comment in reply to @GertArnold above was not accurate. Only one DataContext was throwing the stack overflow exception. It was doing it for every entity in that DataContext, but other DataContexts in the application were working properly.
This particular *.dbml file had been growing over the years to a gargantuan size. While re-creating it I was careful to add only database objects that are actually referenced, which resulted in a much smaller *.dbml file; that alone might have fixed the problem.
Thanks a lot Tom for the info!
Just in case other people hit the same problem, here is some extra info from my case. I got a very similar issue after my PC received a batch of Windows updates yesterday, including Windows 10 OS updates, VS2013/VS2015 updates, etc. I primarily use VS2013. Some differences from Tom's case are:
it only pops up when updating one particular entity; other entities in the same DataContext work fine
it only affects my ASP.NET Web API project; console applications are fine, even though all the app projects reference the same unit-of-work data layer project (where the dbml file sits)
replacing the dbml file didn't work for me; I finally solved it by opening the solution in VS2015, debugging, closing VS2015, and then opening the solution in VS2013, after which the problem just disappeared
A couple of reports have stopped working, and I'm getting the error "An existing connection was forcibly closed by the remote host." When I try to look at the reports they take forever to run, and in the event log there were a couple of timeout errors... so I'm guessing I'm getting that error due to a timeout.
Now the problem is figuring out why the reports are running so slowly. I have already changed the proc to prevent parameter sniffing... but basically:
Run the proc through SSMS: 1:42
Run the report through Report Server: 6:45
Run report through ASP.NET ReportViewer control: 13:00 minutes
So the real mystery here for me is why it's twice as slow through the ReportViewer control as through SSRS itself. (I can deal with the report being slower than the proc later...)
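In case it's relevant, this is roughly how the report is invoked from the ReportViewer control; the server URL, report path, parameter and the RunReport/ReportViewer1 names are placeholders:

```csharp
using System;
using Microsoft.Reporting.WebForms;

protected void RunReport()
{
    // ReportViewer1 is the ReportViewer control on the page
    ReportViewer1.ProcessingMode = ProcessingMode.Remote;
    ReportViewer1.ServerReport.ReportServerUrl = new Uri("http://reportserver/ReportServer");
    ReportViewer1.ServerReport.ReportPath = "/Sales/BigReport";
    ReportViewer1.ServerReport.Timeout = 20 * 60 * 1000;    // request timeout to the report server, in milliseconds

    ReportViewer1.ServerReport.SetParameters(new[]
    {
        new ReportParameter("StartDate", "2014-01-01")
    });

    ReportViewer1.ServerReport.Refresh();
}
```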
EDIT:
I ran some profiling as suggested in the comments. The stored procedure runs at normal speed (55 seconds) when called from the report itself. So the problem is either the SSRS server, the ReportViewer control, both... or the network between the ReportViewer and the SSRS server.
Also, if I run the report from my desktop PC (over VPN) in Visual Studio, it works just fine.
Also, there are some other, shorter reports that are running just fine. I'm wondering if that's just because they pull so much less data, though.
One last thing I've noticed is that the query seems to run multiple times when the report goes through the ReportViewer control.
EDIT Again:
I looked at the ExecutionLog tables, and the time is definitely going to rendering the report. The time for getting the data is fairly consistent. Also, rendering takes 6 minutes longer on the production server than on the test server (even when running the query against the production database), so it's definitely something with Reporting Services.
It's a shot in the dark, but whenever we run into this problem, one of the first things we do is drop the statistics on the tables involved and let the DBMS recreate them as needed. We have noticed that WITH RECOMPILE does not seem to help either.
I have several legacy applications originally built on ASP.NET 2.0, IIS 6 and UpdatePanels. They were working fine on that old server; response time was never more than 4 seconds.
I moved them to a new Windows 2008 server with IIS 7.5, and performance is much slower, at 20 seconds per async request/response.
The code has not changed.
The database has not changed.
The app pool is running in classic mode.
The database responds immediately once it receives the query (but again, it takes 20 seconds for the database to receive the query).
I have installed the latest AJAXControlToolkit for ASP.NET 4.5.
I did some analysis and found that it is the request itself that is taking so long, but I don't know why. I have tried switching to integrated mode, but that had no positive effect.
Any ideas on what I can do?
Thanks, Justin.
Seems like a network issue to me.
Database has not changed.
The database has not changed, but you have moved the code to a new server. Are you accessing the DB through a local IP? If not, you need to do that.
Verify the connection speed by pinging the old DB server from your new server and see if it takes a long time to get a reply.
The database responds immediately once it receives the query (again it takes 20 seconds for the database to receive the query)
If the database is quick to respond and it was working fine earlier, there is no reason to suspect a problem with the database itself. Something is fishy in the data transmission over the network.
Just in case it is not a network problem:
1) Clean and rebuild the solution and upload it again. Deleting all the "bin" folders in every project and rebuilding them worked for me; that was the cause of my problem.
2) Make sure you are using the latest builds of the DB connectors, in case you are using MySQL or Oracle.
3) You may want to use Fiddler to trace your HTTP requests and see exactly where the problem is.
I have been given an ad hoc reporting tool by another individual who has successfully deployed it in the field. He uses WebLogic servers and an Oracle database.
I tried to deploy the same application in my local environment (WAS 7 and Oracle). The first report runs flawlessly. However, when I run the second (or third or fourth) report, I get a very strange error: the second report is appended to the first report.
There is nothing in the code to account for this. The problem can be temporarily worked around by stopping and starting the servers every time a report is run (obviously not a real solution). I think this has something to do with data sources and cached information. I then took a step back and tried deploying it to a Tomcat server. There it works perfectly, just as it does in the field. So my question is: are there any known issues between WAS 7 and Oracle 11g that could be causing this kind of problem? Any information would be very helpful.
Please ask about any specifics you may want to know and I will do my best to provide that information.
Thank you for your time.
EDIT: For anyone else looking into this, the problem was due to an incompatibility between the proprietary Oracle calls and WebSphere. Once the application was edited to use only standard JDBC calls, everything works perfectly. Thanks.
This ended up being an incompatibility between the proprietary Oracle calls and WebSphere. It was fixed by changing all of the proprietary calls to standard JDBC calls.