A couple of us are working on (different) Workflows in a Logic App. After a couple of hours of development work we start getting the error "GetCallFailed. Failed fetching operations". It occurs when attempting to add a step in the Designer, which can no longer populate the Built-in operations, so we are completely blocked. We also find that it is no longer possible to manually run a Workflow, which is probably related. Previous experience shows that waiting, perhaps overnight, appears to resolve the problem, at least for an hour or two. Research has suggested Storage or Network problems, but it's hard to see how problems there could account for our case.
I update a live ASP.NET application frequently. I have a load-balanced setup, so I update each server while no one is on it.
However, there are still problems from time to time, commonly with people loading a page on the old version and then submitting it on the new version and then the viewstate cannot be decoded.
That's the type of generic problem I'm looking for a list of.
I am looking for a complete list of generic problems that can occur after an update, so I can become aware of when and where my update will cause problems for people using the system at the same time.
Of course problems can occur if there are errors etc. in the update or the code, but I'm obviously not talking about that.
All answers point to "no!". However, some people say having debugging enabled is convenient when errors occur on the server. I am not sure what they mean by this. Do people actually debug live server code? I honestly didn't even know you could. With the website I work on, we use ELMAH for error reporting. When a server error occurs, we are emailed a complete stack trace. After acquiring a rough idea of where and how the error occurred, I will open the local solution containing all the code that's currently deployed to the production environment and debug locally. I never actually debug the code on the server itself, so I am not sure what people mean by that.
I ask this because I just found out today, while consolidating web.config XML, that debug=true is set in the staging and production environments' web.config files. It must have been this way for a few years now, and I am wondering what benefits we will experience by turning it off. Could anything depend on debugging being turned on, and break if it is shut off after being enabled for over two years since the beginning of the project?
It should be fine to turn it off, and you should get a slight performance boost. It sounds like you are doing the right thing using ELMAH. I cannot think of a good reason why you would want to have it turned ON in production... hope that helps.
The "advantage" that people are talking about is that when an error occurs on the site, the default ASP.NET error page will show you the actual line of code that failed. If you have debug=false, then you will not see any of that information. I think most who would recommend something like this either do not know about logging frameworks like ELMAH (and hence cannot easily find the cause of errors on the site without it), or they left it on the production machine at the beginning of the project while they were installing/testing the site, and then forgot to change it later.
However, with a proper logging framework in place, you can still get good error information behind the scenes without presenting it to your end-users in that way. In fact, you don't want to show that kind of information to end-users because a) they won't know what it means, and b) it could be a possible security issue if sensitive aspects of your code are shown (info that might help somebody find vulnerabilities).
Everything in the System.Diagnostics library depends on this flag. If you do not use any of these functions, you will probably not see any direct effect; to see the other effects and messages that come from the debug functions, you need to monitor at least the Windows log file.
Functions like Debug.Assert and Debug.Fail remain active unless you turn the debug flag off. They affect performance, and may create small issues that you never see unless you check the Windows system log file.
In our libraries, which are full of asserts, the debug flag is critical.
Also, with the debug flag on, the compiler probably does not make optimizations, which also affects performance.
Either an advantage or disadvantage depending on how you look at it is that Webresource.axd type files are not cached when debug="true". You've got the advantage of having the latest files every time and a disadvantage of having the latest files every time.
This is often true of other third-party compression/combining modules as well: because it is easier to debug non-minified JavaScript, they usually only begin to function properly once debug is disabled.
I have very little PeopleSoft experience but have been put in a position to support an install. This question may straddle Server Fault, but it is certainly developer oriented.
On a daily basis, we have a PeopleSoft "developer" who writes scripts to fix records/journal entries/approval status etc. To me this screams "bad install" and botched customizations. Is this normal? Is it best practice to have an employee having to write scripts daily just to keep things running?
Note: there is no fraud happening here, he has the full approval of the accounting department when doing this.
It is unlikely that it is the installation. Likely causes:
Bad customization
Missing patches
Bugs in the delivered code
If you only have one admin, though, and you have only one developer, I would be shocked to hear that there is much in the way of custom code.
Back to the question: It is not normal to need to do SQL updates regularly to fix data. Yes, it happens, but not too often. It is also possible that the end users could fix it from the application, but do not for some reason.
Ad-hoc SQL updates can be dangerous and the SQL may change on every request. It is difficult to fully test ad-hoc scripts due to the turnaround they typically require.
I assume these "fixes" are in fact making changes not implemented by the system.
It would be more sensible to either:
Build a custom page to "fix" the entries (or less sensible: modify the delivered pages).
Build and thoroughly test a parameter-driven App Engine to perform the most commonly made changes. It could potentially be run as part of the batch stream.
Watch out on your next upgrade: application tables have had a lot of changes in recent releases.
I've had sporadic performance problems with my website for a while now. 90% of the time the site is very fast. But occasionally it is just really, really slow. I mean like 5-10 seconds load time kind of slow. I thought I had narrowed it down to the server I was on, so I migrated everything to a new dedicated server from a completely different web hosting company. But the problems continue.
I guess what I'm looking for is a good tool that'll help me track down the problem, because it's clearly not the hardware. I'd like to be able to log certain events in my ASP.NET code and have that same logger also track server performance/resources at the time. If I can then look back at the logs then I can see what exactly my website was doing at the time of extreme slowness.
Is there a .NET logging system that'll allow me to make calls into it with code while simultaneously tracking performance? What would you recommend?
Every intermittent performance problem I have ever had turned out to be caused by something in the database.
You need to check out my blog post Unexplained-SQL-Server-Timeouts-and-Intermittent-Blocking. No, it's not caused by a heavy INSERT or UPDATE process like you would expect.
I would run a database trace for 1/2 a day. Yes, the trace has to be done on production because the problem doesn't usually happen in a low use environment.
Your trace log rows will have a "Duration" column showing how long an event took. You are looking at the long running ones, and the ones before them that might be holding up the long running ones. Once you find the pattern you need to figure out how things are working.
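The pattern hunt described above can be sketched in a few lines once the trace is exported. This is only an illustration: the `TraceRow` fields, the millisecond units, and the `find_suspects` helper are assumptions made for the example, not the format of any real trace tool. It lists each long-running event together with the earlier events that were still running when it started.

```python
from dataclasses import dataclass


@dataclass
class TraceRow:
    start_ms: int      # assumed: when the event started, ms since the trace began
    duration_ms: int   # assumed: the trace's "Duration" column, converted to ms
    text: str          # the statement or event text


def find_suspects(rows, slow_ms=1000):
    """For each slow event, list the earlier events still running when it started."""
    ordered = sorted(rows, key=lambda r: r.start_ms)
    report = []
    for slow in ordered:
        if slow.duration_ms < slow_ms:
            continue
        # An event overlaps if it started no later and had not finished yet.
        overlapping = [
            r for r in ordered
            if r is not slow
            and r.start_ms <= slow.start_ms
            and r.start_ms + r.duration_ms > slow.start_ms
        ]
        report.append((slow, overlapping))
    return report


if __name__ == "__main__":
    rows = [
        TraceRow(0, 5000, "UPDATE big"),
        TraceRow(100, 50, "SELECT a"),
        TraceRow(1000, 3500, "SELECT b"),
    ]
    for slow, earlier in find_suspects(rows):
        print(slow.text, "<-", [r.text for r in earlier])
```

Overlap in time does not prove blocking; it only narrows down which statements are worth examining together.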
IIS 7.0 has built-in ETW tracing capability. ETW is the fastest logging mechanism with the least overhead, and it is built into the kernel. With respect to IIS, it can log every call. The best part of ETW is that you can include everything in the system and get a holistic picture of the application and the server. For example, you can include registry, file system, and context-switching events, and get call stacks along with durations.
Here is a basic overview of ETW and one specific to IIS; I also have a few posts on ETW.
I would start by monitoring the ASP.NET-related performance counters. You could even add your own counters to your application, if you wanted. Also, compare the number of w3wp.exe processes running at the time of the slowdown with the number running normally, and look at their memory usage. It sounds to me like a memory leak that eventually results in termination of the worker process, which of course fixes the problem temporarily.
You don't provide specifics of what your application is doing in terms of the resources (database, networking, files) that it is using. In addition to the steps from the other posters, I would take a look at anything that is happening "out-of-process", such as:
Database connections
Files opened
Network shares accessed
...basically anything that is not happening in the ASP.NET process.
I would start off with the following list of items:
Turn on ASP.Net Health Monitoring to start getting some metrics & numbers.
Check the memory utilization on the server. Does recycling IIS periodically remove this issue (memory leak??).
ELMAH is a good tool to start looking at the exceptions. Also, go through the logs your application might be generating.
Then, I would look for anti-virus software running at a particular time or some long running processes which might be slowing down the machine etc., a database backup schedule...
HTH.
Of course ultimately I just want to solve the intermittent slowness issues (and I'm not yet sure if I have). But in my initial question I was asking for a rather specific logger.
I never did find an answer for that so I wrote my own stopwatch threshold logging. It's not quite as detailed as my initial idea but it has the benefit of being very easy to apply globally to a web application.
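For readers wondering what "stopwatch threshold logging" could look like, here is a minimal sketch of the idea, not the code I actually wrote. It is in Python rather than .NET, and the `stopwatch` helper, its parameters, and the logger name are all made up for illustration: time a block of work and write a log entry only when it runs longer than a threshold.

```python
import logging
import time
from contextlib import contextmanager

log = logging.getLogger("slow_requests")  # hypothetical logger name


@contextmanager
def stopwatch(label, threshold_s=2.0, clock=time.perf_counter, report=log.warning):
    """Time the enclosed block; report it only if it took at least threshold_s."""
    start = clock()
    try:
        yield
    finally:
        elapsed = clock() - start
        if elapsed >= threshold_s:
            report(f"{label} took {elapsed:.3f}s (threshold {threshold_s:.3f}s)")


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    with stopwatch("GET /reports", threshold_s=0.05):
        time.sleep(0.1)  # stands in for a slow request
```

Because fast requests produce no output, a wrapper like this can go around every request without flooding the log; the `clock` and `report` parameters exist only so the behavior can be tested without real delays.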
In my experience, performance-related issues are almost always I/O related and rarely CPU related.
To get a gauge on where things are at without writing instrumentation code or installing software, use Performance Monitor in Windows to see where the time is being spent.
Another quick way to get a sense of where problems might be is to run a small load test locally on your machine while a code profiler (like the one built into VS) is attached to the process to tell you where all the time is going. I usually find a few "quick wins" with that approach.
For the last few months we've had a weird problem with our website. Once in a while various queries to the database, using ADO.NET DataSets, will throw an error... the most common of which is "Failed to enable constraints. One or more rows contain values violating non-null, unique, or foreign-key constraints."
The data is actually valid though, as without changing anything the error will be intermittent. Further, the "fix" for it is to recycle the app pool on both web servers... so the problem can't be bad data being returned. Once this is done it can run fine for weeks at a time, or break 3 times in one day. There's no consistency to it...
It also seems like newer means of data access, such as Linq 2 SQL, work just fine... though it's hard to tell since the site is using both at the moment. (Working on getting everything over to L2S, but don't have a lot of time to rewrite old components unfortunately...)
So has anyone had anything like this before? Is it something with the load balancing? Maybe something wrong with the servers? (I've forced all connections to each server in turn and experienced the error on both of them.) Could it be something wrong with running in a VM?
Err... ok, so the overall question is: What's causing this and how do I fix it?
Oh, and the website is in .NET 3.5...
Based off of what you've said, I would guess that this is related to the load experienced on the servers at the time of the error.
If you can, set up a staging environment that is load balanced like your production servers are. Then start load testing the app.
Also, make sure you have all the latest service packs / updates applied on your production servers. MS has a tendency to not tell us everything they are fixing. Finally, look on MS Connect to see if a hotfix corrects the problem you are talking about.
UPDATE:
Load testing can be as simple or complicated as you can afford. What it should do is run through a sequence of pages that perform standard operations on your site in a repeatable way. You usually want to simulate "think" times between each page load / operation that are in line with expected user behavior.
Then, you execute the test as a certain number of simultaneous users. While the test is executing, you need to record any errors and the servers' performance counters to get an idea of how the app really performs.
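As a rough sketch of that shape (a repeatable page sequence, think times, a fixed number of simultaneous users, and recorded errors), something like the following would do for a first pass. It is an illustration only: `run_load_test` and its parameters are hypothetical names, `visit` is whatever function actually requests a page in your setup, and server performance counters still have to be captured separately on the servers.

```python
import random
import threading
import time


def run_load_test(visit, pages, users=5, think_range=(0.5, 2.0), sleep=time.sleep):
    """Run `users` simulated users concurrently; return (timings, errors)."""
    timings, errors = [], []
    lock = threading.Lock()

    def user_session():
        for page in pages:
            start = time.perf_counter()
            try:
                visit(page)
            except Exception as exc:
                # Record the failure and keep going, as a real user might.
                with lock:
                    errors.append((page, repr(exc)))
            else:
                with lock:
                    timings.append((page, time.perf_counter() - start))
            # Simulated "think" time before the next page.
            sleep(random.uniform(*think_range))

    threads = [threading.Thread(target=user_session) for _ in range(users)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return timings, errors


if __name__ == "__main__":
    def fake_visit(page):  # stand-in for a real HTTP request
        if page == "/broken":
            raise RuntimeError("simulated failure")

    timings, errors = run_load_test(
        fake_visit, ["/", "/search", "/broken"], users=3, think_range=(0.0, 0.1)
    )
    print(len(timings), "ok;", len(errors), "errors")
```

Dedicated load testing tools do far more than this (ramp-up, reporting, distributed agents), but even a script of this size is enough to check whether an intermittent error shows up under a handful of concurrent users.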
Some links to load testing tools are here. Another list is here.
As a side note, I've seen apps start exhibiting strange behavior under a load of only 5 simultaneous users. It really depends on how the site is built.