Memory leak while sending response from Rebus handler

I saw some very strange behavior in my Rebus handler, which is self-hosted in an exe. Right after sending a response using the bus.Send method, the memory consumed by the process increases. I looked at the object graph using a memory profiler and found that Rebus is holding the response message in serialized form somewhere.
The object graph showed the following hierarchy up to the root:
System.Message --> CachedBodyMessage --> stream
Please give me some pointers if anybody is aware of this issue.

I understand that a memory leak is a grave concern, but I believe it is unlikely that Rebus contains a memory leak.
This belief is rooted in the fact that I have been running Windows Service-hosted Rebus endpoints in production for 1.5 years now, and several of them (e.g. the timeout managers) have sometimes been running for several months without being restarted.
I'd like to be absolutely bulletproof sure though, so I'm willing to investigate the issue you're reporting.
You mention "CachedBodyMessage" - judging by the names of fields inside System.Messaging.Message, it sounds like it's something within MSMQ. To try to reproduce your issue, I coded the following test:
[Test, Ignore("Only works in RELEASE mode because otherwise object references are held on to for the duration of the method")]
public void DoesNotLeakMessages()
{
    // arrange
    const string inputQueueName = "test.leak.input";
    var queue = new MsmqMessageQueue(inputQueueName);
    disposables.Add(queue);

    var body = Encoding.UTF8.GetBytes(new string('*', 32768));
    var message = new TransportMessageToSend
    {
        Headers = new Dictionary<string, object> { { Headers.MessageId, "msg-1" } },
        Body = body
    };

    var weakMessageRef = new WeakReference(message);
    var weakBodyRef = new WeakReference(body);

    // act
    queue.Send(inputQueueName, message, new NoTransaction());
    message = null;
    body = null;

    GC.Collect();
    GC.WaitForPendingFinalizers();

    // assert
    Assert.That(weakMessageRef.IsAlive, Is.False, "Expected the message to have been collected");
    Assert.That(weakBodyRef.IsAlive, Is.False, "Expected the body bytes to have been collected");
}
which verifies that the sent transport message is collected as it should be (it will only do this in RELEASE mode though, because of the way DEBUG mode holds on to object references within scope).
I'll try and run the TimePrinter sample now and leave it running for a while to see if I can reproduce the issue. If you stumble upon more information about e.g. exactly which objects are leaking, it would be very helpful.
Thanks again for taking the time to report your worries to me :)
Followup:
I've modified the TimePrinter sample so that it sends 50 msg/s and includes a 64 KB random string payload with each message, and I've tracked the memory usage for almost four hours now. So far, it does not look like memory is being leaked.
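Roughly, the modified send loop is something like this (a sketch only; TimeMessage here is a stand-in for the sample's actual message type, and the Rebus/bus configuration is omitted):

var payload = new string('*', 64 * 1024);   // 64 KB payload stand-in
var timer = new System.Timers.Timer(1000);  // fire once per second
timer.Elapsed += delegate
{
    for (var i = 0; i < 50; i++)            // 50 msg/s
    {
        bus.Send(new TimeMessage { Time = DateTime.Now, Payload = payload });
    }
};
timer.Start();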
I'll leave it running the rest of the day, just to be sure.
Maybe you can tell me some more about why you suspected there was a memory leak in the first place?
Update:
As you can see from the memory trace, it has now been running for 7 hours, and thus more than 1,200,000 messages containing more than 70 GB of data have been sent and consumed by the same process. If cached message bodies were leaking, I am pretty sure that we would have been able to see something rising on the graph.

Related

Google reCAPTCHA response success: false, no error codes

UPDATE: Google has recently updated their error message with an additional error code possibility: "timeout-or-duplicate".
This new error code seems to cover 99% of our previously mentioned mysterious cases.
We are still left wondering why we get that many validation requests that are either timeouts or duplicates. Determining this with certainty is likely to be impossible, but now I am just hoping that someone else has experienced something like it.
Disclaimer: I cross posted this to Google Groups, so apologies for spamming the ether for the ones of you who frequent both sites.
I am currently working on a page, as part of an ASP.NET MVC application, with a form that uses reCAPTCHA validation. The page currently has many daily users.
In my server-side validation** of the reCAPTCHA response, I have for a while now been seeing cases where the reCAPTCHA response has its success property set to false, but with an accompanying empty error code array.
Most of the requests pass validation, but some keep exhibiting this pattern.
So after doing some research online, I explored the two possible scenarios I could think of:
1. The validation has timed out and is no longer valid.
2. The user has already been validated using the response value, so they are rejected the second time.
After collecting data for a while, I have found that all cases of "Success: false, error codes: []" have either had the validation be rather old (ranging from 5 minutes to 10 days(!)), or it has been a case of a re-used response value, or sometimes a combination of the two.
Even after implementing client side prevention of double-clicking my submit-form button, a lot of double submits still seem to get through to the server side Google reCAPTCHA validation logic.
My data tells me that 1.6% (28) of all requests (1760) have failed with at least one of the above scenarios being true ("timeout" or "double submission").
Meanwhile, not a single request of the 1760 has failed where the error code array was not empty.
I just have a hard time imagining a practical use case where a ChallengeTimeStamp gets issued, and then after 10 days validation is attempted, server side.
My question is:
What could be the reason for a non-negligible percentage of all Google reCAPTCHA server side validation attempts to be either very old or a case of double submission?
**By "server side validation" I mean logic that looks like this:
public bool IsVerifiedUser(string captchaResponse, string endUserIp)
{
    string apiUrl = ConfigurationManager.AppSettings["Google_Captcha_API"];
    string secret = ConfigurationManager.AppSettings["Google_Captcha_SecretKey"];

    using (var client = new HttpClient())
    {
        var parameters = new Dictionary<string, string>
        {
            { "secret", secret },
            { "response", captchaResponse },
            { "remoteip", endUserIp },
        };

        var content = new FormUrlEncodedContent(parameters);
        var response = client.PostAsync(apiUrl, content).Result;
        var responseContent = response.Content.ReadAsStringAsync().Result;

        GoogleCaptchaResponse googleCaptchaResponse = JsonConvert.DeserializeObject<GoogleCaptchaResponse>(responseContent);

        if (googleCaptchaResponse.Success)
        {
            _dal.LogGoogleRecaptchaResponse(endUserIp, captchaResponse);
            return true;
        }
        else
        {
            // Actual code omitted
            // Try to determine the cause of failure
            // Look at googleCaptchaResponse.ErrorCodes array (this has been empty in all of the 28 cases of "success: false")
            // Measure time between googleCaptchaResponse.ChallengeTimeStamp (which is UTC) and DateTime.UtcNow
            // Check the reCAPTCHA response against a local database of previously used reCAPTCHA responses to detect cases of double submission
            return false;
        }
    }
}
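For reference, GoogleCaptchaResponse is just a small DTO for the siteverify JSON response; a minimal sketch (assuming the standard response fields) might look like this:

public class GoogleCaptchaResponse
{
    [JsonProperty("success")]
    public bool Success { get; set; }

    [JsonProperty("challenge_ts")]
    public DateTime ChallengeTimeStamp { get; set; }   // UTC timestamp of when the challenge was loaded

    [JsonProperty("hostname")]
    public string Hostname { get; set; }

    [JsonProperty("error-codes")]
    public string[] ErrorCodes { get; set; }           // empty in all 28 of the mysterious "success: false" cases
}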
Thank you in advance to anyone who has a clue and can perhaps shed some light on the subject.
You will get the timeout-or-duplicate error if your captcha is validated twice.
Save the logs in a file in append mode and check whether you are validating a captcha twice.
Here is an example:
$verifyResponse = file_get_contents('https://www.google.com/recaptcha/api/siteverify?secret='.$secret.'&response='.$_POST['g-recaptcha-response']); // $secret is your reCAPTCHA secret key
// Append each verification response to the log file, so duplicate validations show up
file_put_contents("logfile", $verifyResponse . PHP_EOL, FILE_APPEND);
Now read the content of the log file created above and check whether any captcha is being verified twice.
This is an interesting question, but it's going to be impossible to answer with any sort of certainty. I can give an educated guess about what's occurring.
As far as the old submissions go, that could simply be users leaving the page open in the browser and coming back later to finally submit. You can handle this scenario in a few different ways:
1. Set a meta refresh for the page, such that it will update itself after a defined period of time, and hopefully either get a new reCAPTCHA validation code or at least prompt the user to verify the CAPTCHA again. However, this is less than ideal, as it increases requests to your server and will wipe out any work the user has done on the form. It's also very brute-force: it will simply refresh after a certain amount of time, regardless of whether the user is currently actively using the page or not.
2. Use a JavaScript timer to notify the user about the page timing out and then refresh. This is like #1, but with much more finesse. You can pop a warning dialog telling the user that they've left the page sitting too long and it will soon need to be refreshed, giving them time to finish up if they're actively using it. You can also check for user activity via events like onmousemove. If the user's not moving the mouse, it's very likely they aren't on the page.
3. Handle it server-side, by catching this scenario. I actually prefer this method the most, as it's the most fluid and honestly the easiest to achieve. When you get back success: false with no error codes, simply send the user back to the page, as if they had made a validation error in the form. Provide a message telling them that their CAPTCHA validation expired and they need to verify again. Then, all they have to do is verify and resubmit (a rough sketch follows below).
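A rough sketch of that server-side handling in an ASP.NET MVC action (the ContactFormModel type, the _captchaService field and the action names are hypothetical; IsVerifiedUser is the helper from the question):

[HttpPost]
public ActionResult Submit(ContactFormModel model)
{
    var captchaResponse = Request.Form["g-recaptcha-response"];

    if (!_captchaService.IsVerifiedUser(captchaResponse, Request.UserHostAddress))
    {
        // Treat "success: false" with an empty error-codes array like any other validation error:
        // send the user back to the form and ask them to complete the CAPTCHA again.
        ModelState.AddModelError(string.Empty, "Your CAPTCHA verification expired or was already used. Please verify again and resubmit.");
        return View(model);
    }

    // ... normal form processing ...
    return RedirectToAction("ThankYou");
}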
The double-submit issue is a perennial one that plagues all web developers. User behavior studies have shown that the vast majority occur because users have been trained to double-click icons, and as a result, think they need to double-click submit buttons as well. Some of it is impatience if something doesn't happen immediately on click. Regardless, the best thing you can do is implement JavaScript that disables the button on click, preventing a second click.

Understanding the JIT; slow website

First off, this question has been covered a few times (I've done my research), and, for example, on the right side of the SO webpage is a list of related items... I have been through them all (or as many as I could find).
When I publish my pre-compiled .NET web application, it is very slow to load the first time.
I've read up on this; it's the JIT, which I understand (sort of).
The problem is that after the home page loads (taking up to 20 seconds), many other pages then load very fast.
However, it would appear that the only reason they load quickly is that their resources have already been loaded (or that they share the same compiled DLLs). Some pages still take a long time.
This suggests that the JIT needs to compile different pages separately. If so, then using a contact form as an example (where the Thank You page needs to be compiled by the JIT and is slow the first time), the user may hit the send button multiple times whilst waiting for the page to be shown.
After I load all the pages that use different models or different shared HTML content, the site loads quickly as expected. I assume this is a common problem?
Please note, I'm using .NET 4.0, but there is no database, XML files, etc. The only IO is when an email fails to send and the error is written to a log.
So, assuming my understanding is correct, what is the approach for not having to manually go through the website and load every page?
If the above is a little too broad, then can this be resolved in the settings/configuration in Visual Studio (2012) or the web.config file (excluding adding compilation debug=false)?
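One common workaround is to warm the site up after each deploy, so the JIT/compilation cost is paid before real users arrive. A rough sketch of such a warm-up tool (the URL list is hypothetical and would need to cover every distinct page/route):

using System;
using System.Net;
using System.Threading.Tasks;

class SiteWarmup
{
    static void Main()
    {
        // Hypothetical list of pages - include one URL per distinct page/route
        var urls = new[]
        {
            "http://www.example.com/",
            "http://www.example.com/contact",
            "http://www.example.com/contact/thank-you"
        };

        Parallel.ForEach(urls, url =>
        {
            using (var client = new WebClient())
            {
                try
                {
                    // Requesting the page forces it (and its views) to be compiled/JITted
                    client.DownloadString(url);
                }
                catch (WebException ex)
                {
                    Console.WriteLine("Warm-up request to {0} failed: {1}", url, ex.Message);
                }
            }
        });
    }
}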
In this case, there were two problems.
1. As per rene's comments, review http://msdn.microsoft.com/en-us/library/ms972959.aspx. The helpful part was adding the following code to the Global.asax file:
// Uses StringBuilder (System.Text) and EventLog (System.Diagnostics)
const string sourceName = ".NET Runtime";
const string serverName = ".";
const string logName = "Application";
const string uriFormat = "\r\n\r\nURI: {0}\r\n\r\n";
const string exceptionFormat = "{0}: \"{1}\"\r\n{2}\r\n\r\n";

void Application_Error(Object sender, EventArgs ea)
{
    StringBuilder message = new StringBuilder();

    if (Request != null)
    {
        message.AppendFormat(uriFormat, Request.Path);
    }

    if (Server != null)
    {
        // Walk the exception chain and record each exception's type, message and stack trace
        Exception e;
        for (e = Server.GetLastError(); e != null; e = e.InnerException)
        {
            message.AppendFormat(exceptionFormat, e.GetType().Name, e.Message, e.StackTrace);
        }
    }

    // Create the event source on first use, then write the error to the Application event log
    if (!EventLog.SourceExists(sourceName))
    {
        EventLog.CreateEventSource(sourceName, logName);
    }

    EventLog log = new EventLog(logName, serverName, sourceName);
    log.WriteEntry(message.ToString(), EventLogEntryType.Error);

    //Server.ClearError(); // uncomment this to cancel the error
}
2. The server was maxing out while sending the email! My code was fine, but Task Manager showed memory hitting 100%...
The solution was to monitor the errors surfaced by point 1 and fix them, and then to find out why the server was being throttled when sending an email.

Optimizing a set of 20 webrequests with threads

This is for ASP.NET. I want to improve the time it takes to run my function; today it takes around 20-30 seconds, usually closer to 30 than 20. That is with one thread making 20 webrequests.
I'm thinking of using threads to do all 20 webrequests, in order either to find the result quickly or to just go through all the data (i.e. make all 20 requests without finding anything).
Here's how it works:
1. I use Html Agility Pack to fetch HTML documents.
2. Then I parse them for information.
3. Lastly I add that information to a dictionary, OR I move on to the next webrequest, until I have made 20 requests.
I make at most 20 webrequests, at minimum 1. I have set the function to end when the info I'm searching for is found. Sometimes the info isn't there, hence the 20 webrequests (it goes through all the data).
Every webrequest adds between 5 and 20 entries to the dictionary. This is then compared with the information I sent to it; if it's in the list I get the key back, otherwise it returns 201. If found, it gets added to the database.
QUESTIONS
A: If I want to do this with threads, how many should I create? 20, one for each request, and let them all loose to do the job? Or should I create, say, 4 of them making at most 5 requests each?
B: What if two threads finish at the same time and both want to add info to the dictionary: can that lock the whole site (I'm using ASP.NET), or will it simply add one result from thread A and then one result from thread B? I already have a check that verifies whether the key exists before adding it.
C: What would be the fastest way to do this?
This is my code, showing the loop in which the (up to) 20 requests are made:
public void FetchAndParseAllPages()
{
    int _maxSearchDepth = 200;
    int _searchIncrement = 10;
    PageFetcher fetcher = new PageFetcher();

    for (int i = 0; i < _maxSearchDepth; i += _searchIncrement)
    {
        string keywordNsearch = _keyword + i;
        ParseHtmldocuments(fetcher.GetWebpage(keywordNsearch));

        if (GetPostion() != 201)
        {
            // ADD DATA TO DATABASE
            InsertRankingData(DocParser.GetSearchResults(), _theSearchedKeyword);
            return;
        }
    }
}
By default, .NET allows only 2 requests open at the same time to a given host. If you want more than that, you need to configure it in web.config. Look here: http://msdn.microsoft.com/en-us/library/aa480507.aspx
You can use the Parallel.For method, which is very straightforward and handles the "how many threads" question for you. Of course, you can tweak how many threads (or rather tasks) you want with ParallelOptions. Look here: http://msdn.microsoft.com/en-us/library/dd781401.aspx
For a thread-safe dictionary you can use ConcurrentDictionary. Look here: http://msdn.microsoft.com/en-us/library/dd287191.aspx
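Putting those three pieces together, a rough sketch might look like this (PageFetcher and GetWebpage are the asker's own classes and are assumed here to be safe to use from multiple threads; ParseResults is a hypothetical helper that returns the parsed entries):

// Allow more than the default 2 concurrent connections per host
ServicePointManager.DefaultConnectionLimit = 20;

var results = new ConcurrentDictionary<string, int>();              // thread-safe, no explicit locking needed
var options = new ParallelOptions { MaxDegreeOfParallelism = 4 };   // e.g. 4 workers sharing the 20 requests

Parallel.For(0, 20, options, (i, loopState) =>
{
    var fetcher = new PageFetcher();                    // one fetcher per iteration to avoid shared state
    var doc = fetcher.GetWebpage(_keyword + (i * 10));  // same paging scheme as the loop above
    foreach (var entry in ParseResults(doc))            // hypothetical helper returning KeyValuePair<string, int> items
    {
        results.TryAdd(entry.Key, entry.Value);         // a duplicate key added by another thread is simply ignored
    }

    if (results.ContainsKey(_theSearchedKeyword))
    {
        loopState.Stop();                               // found it - stop scheduling the remaining requests
    }
});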

Flex slow first Http request

When I use loader.load(request) for the first time, my Flex application freezes for 10 seconds before posting the data (I can see the web server result in real time).
However, if I redo a similar POST with other data but the same request.url, it's instantaneous.
// Multi form encoded data
variables = new URLVariables();
variables.user = "aaa";
variables.boardjpg = new URLFileVariable(data.boardBytes, "foo.jpg");
request = new URLRequestBuilder(variables).build();
request.url = "http://localhost:8000/upload/";
loader.load(request);
How can I see what is taking so long?
Thanks !
OK, this is an old question; anyway, I found it while searching for other things, so I'm quickly adding this.
Neither URLFileVariable nor URLRequestBuilder is a core AS3 class, so I guess you're using some custom library to build your request. I don't know which library you use, but it seems its purpose is to serialize some binary data to build a POST. Serialization usually takes some time the first time it runs (lookup initialization and the like) and goes faster afterwards; a well-known example is Remoting in its different flavours.

URLLoader fails randomly without throwing an error or dispatching any events

In Adobe AIR 1.5, I'm using URLLoader to upload a video in 1 MB chunks. It uploads 1 MB, waits for the Event.COMPLETE event, and then uploads the next chunk. The server-side code knows how to construct the video from these chunks.
Usually, it works fine. However, sometimes it just stops without throwing any errors or dispatching any events. This is an example of what is shown in a log that I create:
Uploading chunk of size: 1000000
HTTP_RESPONSE_STATUS dispatched: 200
HTTP_STATUS dispatched: 200
Completed chunk 1 of 108
Uploading chunk of size: 1000000
HTTP_RESPONSE_STATUS ...
etc...
Most of the time, it completes all of the chunks fine. However, sometimes, it just fails in the middle:
Completed chunk 2 of 108
Uploading chunk of size: 1000000
... and nothing else, and no network activity.
Through debugging, I can tell that it does successfully call urlLoader.load(). When it fails, it just seems to stall: it calls load(), then the UIComponent's callLaterDispatcher(), and then nothing.
Does anyone have any idea why this could be happening? I'm setting up my URLLoader like this:
urlLoader.dataFormat = URLLoaderDataFormat.BINARY;
urlLoader.addEventListener(Event.COMPLETE, chunkComplete);
urlLoader.addEventListener(IOErrorEvent.IO_ERROR, ioErrorHandler);
urlLoader.addEventListener(SecurityErrorEvent.SECURITY_ERROR, securityErrorHandler);
urlLoader.addEventListener(HTTPStatusEvent.HTTP_RESPONSE_STATUS, responseStatusHandler);
urlLoader.addEventListener(HTTPStatusEvent.HTTP_STATUS, statusHandler);
urlLoader.addEventListener(ProgressEvent.PROGRESS, progressHandler);
And I'm re-using it for each chunk. No events get called when it doesn't succeed, and urlLoader.load() doesn't throw any exceptions. When it succeeds, HTTP_RESPONSE_STATUS, HTTP_STATUS, and PROGRESS events are dispatched.
Thanks!
Edit: One thing that might be helpful: we have the same upload functionality implemented in .NET. In .NET, the request.GetResponse() method sometimes throws an exception, complaining that the connection was closed unexpectedly. We catch the exception if this happens and try that chunk again until it succeeds. I'm looking to implement something similar here, but there are no exceptions being thrown or error events being dispatched.
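For comparison, the .NET-side retry described above looks roughly like this (a sketch only; the method and parameter names here are made up):

private bool TryUploadChunk(byte[] chunk, string url, int maxAttempts)
{
    for (var attempt = 1; attempt <= maxAttempts; attempt++)
    {
        try
        {
            var request = (HttpWebRequest)WebRequest.Create(url);
            request.Method = "POST";

            using (var requestStream = request.GetRequestStream())
            {
                requestStream.Write(chunk, 0, chunk.Length);
            }

            using (var response = (HttpWebResponse)request.GetResponse())
            {
                return true;   // the server accepted this chunk
            }
        }
        catch (WebException)
        {
            // e.g. "the connection was closed unexpectedly" - fall through and retry the same chunk
        }
    }

    return false;              // give up after maxAttempts
}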
A more detailed example of the AS3 side is below. The URLLoader is set up as described above. The readAgain variable just makes it skip reading a new set of bytes from the file stream (i.e. it tries to send the old chunk again)... however, it never catches any exceptions, because none are ever thrown.
private function uploadSegment():void
{
    // ... prepare byte array, set up url ...

    // Create a URL request
    var urlRequest:URLRequest = new URLRequest();
    urlRequest.url = _url + "?" + paramStr;
    urlRequest.method = URLRequestMethod.POST;
    urlRequest.data = byteArray;
    urlRequest.useCache = false;
    urlRequest.requestHeaders.push(new URLRequestHeader('Cache-Control', 'no-cache'));

    try
    {
        urlLoader.load(urlRequest);
    }
    catch (e:Error)
    {
        Logger.error("Failed to upload chunk. Caught exception. Trying again.");
        readAgain = true;
        uploadSegment();
        return;
    }

    readAgain = false;
}
Have you tried subscribing to Event.OPEN to see if the connection is opening correctly? If you're doing this per chunk, perhaps that event (or the lack thereof) would help?
[Edit]
Can you also try setting useCache to false on your URLRequest?
[Edit]
I assume your urlLoader is globally referenced... If not, while you're waiting for async behavior, something evil like GC might hurt you... But, skipping that: if you check bytesTotal while you're waiting for something to happen, does it always return zero?
[More]
Also, check the URL in the cases where NOTHING happens, because I've found some mention online that if the server is unreachable, no events are fired at all (though there is some argument around that)...
I encountered a similar problem in Flex, only with Safari.
The URLLoader sometimes returned nothing, not even the OPEN event.
I made sure that this wasn't a cache problem.
After lots of trial and error, the only remedy I found was to use the https protocol in the URL. I am not sure what this does to Safari, but now the problem is gone.
