System.IO.Directory::GetFiles() Polling from AX 2009, Only Seeing New Files Every 10s - axapta

I wrote code in AX 2009 to poll a directory on a network drive, every 1 second, waiting for a response file from another system. I noticed that using a file explorer window, I could see the file appear, yet my code was not seeing and processing the file for several seconds - up to 9 seconds (and 9 polls) after the file appeared!
The AX code calls System.IO.Directory::GetFiles() using ClrInterop:
interopPerm = new InteropPermission(InteropKind::ClrInterop);
interopPerm.assert();
files = System.IO.Directory::GetFiles(#POLLDIR,'*.csv');
// etc...
CodeAccessPermission::revertAssert();
After much experimentation, it emerges that the first time in my program's lifetime, that I call ::GetFiles(), it starts a notional "ticking clock" with a period of 10 seconds. Only calls every 10 seconds find any new files that may have appeared, though they do still report files that were found on an earlier 10s "tick" since the first call to ::GetFiles().
If the file is not there when I start the program, then all the other calls to ::GetFiles(), 1 second after the first call, 2 seconds after, and so on up to 9 seconds after, simply do not see the file, even though it may have been sitting there since 0.5s after the first call!
Then, reliably, and repeatably, the call 10s after the first call, will find the file. Then no calls from 11s to 19s will see any new file that might have appeared, yet the call 20s after the first call, will reliably see any new files. And so on, every 10 seconds.
Further investigation revealed that if the polled directory is on the AX AOS machine, this does not happen, and the file is found immediately, as one would expect, on the call after the file appears in the directory.
But this figure of 10s is reliable and repeatable, no matter what network drive I poll, no matter what server it's on.
Our network certainly doesn't have 10s of latency to see files; as I said, a file explorer window on the polled directory sees the file immediately.
What is going on?

Sounds like your issue is due to SMB caching - from this technet page:
Name, type, and ID
Directory Cache [DWORD] DirectoryCacheLifetime
Registry key the cache setting is controlled by
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Lanmanworkstation\Parameters
This is a cache of recent directory enumerations performed by the
client. Subsequent enumeration requests made by client applications as
well as metadata queries for files in the directory can be satisfied
from the cache. The client also uses the directory cache to determine
the presence or absence of a file in the directory and uses that
information to prevent clients from repeatedly attempting to open
files which are known not to exist on the server. This cache is likely
to affect distributed applications running on multiple computers
accessing a set of files on a server – where the applications use an
out of band mechanism to signal each other about
modification/addition/deletion of files on the server.
In short try to set the registry key
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Lanmanworkstation\Parameters\DirectoryCacheLifetime
to 0

Thanks to @Jan B. Kjeldsen, I have been able to solve my problem using FileSystemWatcher. Here is my implementation in X++:
class SelTestThreadDirPolling
{
}
public server static Container SetStaticFileWatcher(str _dirPath,str _filenamePattern,int _timeoutMs)
{
InteropPermission interopPerm;
System.IO.FileSystemWatcher fw;
System.IO.WatcherChangeTypes watcherChangeType;
System.IO.WaitForChangedResult res;
Container cont;
str fileName;
str oldFileName;
str changeType;
;
interopPerm = new InteropPermission(InteropKind::ClrInterop);
interopPerm.assert();
fw = new System.IO.FileSystemWatcher();
fw.set_Path(_dirPath);
fw.set_IncludeSubdirectories(false);
fw.set_Filter(_filenamePattern);
watcherChangeType = ClrInterop::parseClrEnum('System.IO.WatcherChangeTypes', 'Created');
res = fw.WaitForChanged(watcherChangeType,_timeoutMs);
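if (res.get_TimedOut())
{
// Release the watcher and the permission assertion before returning early
fw.Dispose();
CodeAccessPermission::revertAssert();
return conNull();
}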
fileName = res.get_Name();
//ChangeTypeName can be: Created, Deleted, Renamed and Changed
changeType = System.Enum::GetName(watcherChangeType.GetType(), res.get_ChangeType());
fw.Dispose();
CodeAccessPermission::revertAssert();
if (changeType == 'Renamed') oldFileName = res.get_OldName();
cont += fileName;
cont += changeType;
cont += oldFileName;
return cont;
}
void waitFileSystemWatcher(str _dirPath,str _filenamePattern,int _timeoutMs)
{
container cResult;
str filename,changeType,oldFilename;
;
cResult=SelTestThreadDirPolling::SetStaticFileWatcher(_dirPath,_filenamePattern,_timeoutMs);
if (cResult)
{
[filename,changeType,oldFilename]=cResult;
info(strfmt("filename=%1, changeType=%2, oldFilename=%3",filename,changeType,oldFilename));
}
else
{
info("TIMED OUT");
}
}
void run()
{;
this.waitFileSystemWatcher(@'\\myserver\mydir','filepattern*.csv',10000);
}
I should acknowledge the following for forming the basis of my X++ implementation:
https://blogs.msdn.microsoft.com/floditt/2008/09/01/how-to-implement-filesystemwatcher-with-x/

I would guess DAXaholic's answer is correct, but you could try other solutions like EnumerateFiles.
In your case I would wait for the files rather than poll for them.
Using FileSystemWatcher there will be a minimal delay from file creation till your process wakes up. It is more tricky to use, but avoiding polling is a good thing. I have never used it over a network.

Related

Understanding the JIT; slow website

First off, this question has been covered a few times (I've done my research), and, for example, on the right side of the SO webpage is a list of related items... I have been through them all (or as many as I could find).
When I publish my pre-compiled .NET web application, it is very slow to load the first time.
I've read up on this, it's the JIT which I understand (sort of).
The problem is, after the home page loads (up to 20 seconds), many other pages load very fast.
However, it would appear that the only reason they load is because the resources have been loaded (or that they share the same compiled dlls). However, some pages still take a long time.
This indicates that maybe the JIT needs to compile different pages in different ways? If so, and using a contact form as an example (where the Thank You page needs to be compiled by the JIT and first time is slow), the user may hit the send button multiple times whilst waiting for the page to be shown.
After I load all these pages which use different models or different shared HTML content, the site loads quickly as expected. I assume this issue is a common problem?
Please note, I'm using .NET 4.0 but, there is no database, XML files etc. The only IO is if an email doesn't send and it writes the error to a log.
So, assuming my understanding is correct, what is the approach to not have to manually go through the website and load every page?
If the above is a little too broad, then can this be resolved in the settings/configuration in Visual Studio (2012) or the web.config file (excluding adding compilation debug=false)?
In this case, there were two problems:
1. As per rene's comments, review http://msdn.microsoft.com/en-us/library/ms972959.aspx. The helpful part was to add the following code to the global.asax file:
const string sourceName = ".NET Runtime";
const string serverName = ".";
const string logName = "Application";
const string uriFormat = "\r\n\r\nURI: {0}\r\n\r\n";
const string exceptionFormat = "{0}: \"{1}\"\r\n{2}\r\n\r\n";
void Application_Error(Object sender, EventArgs ea) {
StringBuilder message = new StringBuilder();
if (Request != null) {
message.AppendFormat(uriFormat, Request.Path);
}
if (Server != null) {
Exception e;
for (e = Server.GetLastError(); e != null; e = e.InnerException) {
message.AppendFormat(exceptionFormat,
e.GetType().Name,
e.Message,
e.StackTrace);
}
}
if (!EventLog.SourceExists(sourceName)) {
EventLog.CreateEventSource(sourceName, logName);
}
EventLog Log = new EventLog(logName, serverName, sourceName);
Log.WriteEntry(message.ToString(), EventLogEntryType.Error);
//Server.ClearError(); // uncomment this to cancel the error
}
2. The server was maxing out while sending the email! My code was fine, but Task Manager showed it hitting 100% memory.
The solution was to monitor the errors logged by the code in point 1 and fix them, then find out why the server was being throttled when sending an email.

Memory leak while sending response from rebus handler

I am seeing very strange behavior in my Rebus handler, which is self-hosted in an exe. Right after a response is sent with the bus.send method, the memory consumed by the process goes up. I looked at the object graph with a memory profiler and found that Rebus is holding the response message in serialized form somewhere.
The object graph showed the following path to the root:
System.Message --> CachedBodyMessage --> stream
Any pointers would be appreciated if anybody is familiar with this.
I understand that a memory leak is a grave concern, but my belief is that it is unlikely that Rebus should contain a memory leak.
This belief is rooted in the fact that I have been running Windows Service-hosted Rebus endpoints in production for 1.5 years now, and several of them (e.g. the timeout managers) have sometimes been running for several months without being restarted.
I'd like to be absolutely bulletproof sure though, so I'm willing to investigate the issue you're reporting.
You're mentioning "CachedBodyMessage" - judging by the names of fields inside System.Messaging.Message, it sounds like it's something within MSMQ. To try to reproduce your issue, I coded the following test:
[Test, Ignore("Only works in RELEASE mode because otherwise object references are held on to for the duration of the method")]
public void DoesNotLeakMessages()
{
// arrange
const string inputQueueName = "test.leak.input";
var queue = new MsmqMessageQueue(inputQueueName);
disposables.Add(queue);
var body = Encoding.UTF8.GetBytes(new string('*', 32768));
var message = new TransportMessageToSend
{
Headers = new Dictionary<string, object> { { Headers.MessageId, "msg-1" } },
Body = body
};
var weakMessageRef = new WeakReference(message);
var weakBodyRef = new WeakReference(body);
// act
queue.Send(inputQueueName, message, new NoTransaction());
message = null;
body = null;
GC.Collect();
GC.WaitForPendingFinalizers();
// assert
Assert.That(weakMessageRef.IsAlive, Is.False, "Expected the message to have been collected");
Assert.That(weakBodyRef.IsAlive, Is.False, "Expected the body bytes to have been collected");
}
which verifies that the sent transport message is collected as it should (will only do this in RELEASE mode though, because of the way DEBUG mode holds on to object references within scope)
I'll try and run the TimePrinter sample now and leave it running for a while to see if I can reproduce the issue. If you stumble upon more information about e.g. exactly which objects are leaking, it would be very helpful.
Thanks again for taking the time to report your worries to me :)
Followup:
I've modified the TimePrinter sample so that it sends 50 msg/s and includes a 64 KB random string payload with each message, and I've tracked the memory usage for almost four hours now. As you can see, it does not look like memory is being leaked.
I'll leave it running the rest of the day, just to be sure.
Maybe you can tell me some more about why you suspected there was a memory leak in the first place?
Update:
As you can see from the trace, it has now been running for 7 hours, and more than 1,200,000 messages containing more than 70 GB of data have been sent and consumed by the same process. If cached message bodies were leaking, I am pretty sure that we would have been able to see something rising on the graph.

Optimizing a set of 20 web requests with threads

This is for ASP.NET. I want to reduce the time it takes to run my function. Today it takes around 20-30 seconds, usually closer to 30, running on one thread that makes 20 web requests.
I'm thinking of using threads to make the 20 web requests, so that I either find the result quickly or get through all the data faster (i.e. all 20 requests complete without finding anything).
Here's how it works.
1. I'm using Html Agility Pack to fetch HTML documents.
2. Then I parse them for information.
3. Lastly, I add that information to a dictionary, or I move on to the next web request, until 20 requests have been made.
I make at most 20 webRequests, at minimum 1. I have set the function to end when the info I'm searching for is found. Sometimes the info isn't there hence the 20 webrequests(it goes through all the data).
Every webrequest adds between 5-20 entries to the dictionary. This is then compared with the information I sent to it, if it's in the list I get the Key back, otherwise it returns 201. If found it gets added to the database.
QUESTIONS
A: If I want to do this with threads, how many should I create? 20, one for each request, and let them all loose to do the job? Or should I create something like 4 of them, each making at most 5 requests?
B: What if two threads finish at the same time and both want to add info to the dictionary? Can it lock up the whole site (I'm using ASP.NET), or will it add the result from thread A and then the result from thread B? I already have a check today that tests whether the key exists before adding it.
C: What would be the fastest way to do this?
This is my code; the loop shows that up to 20 requests are being made:
public void FetchAndParseAllPages()
{
int _maxSearchDepth = 200;
int _searchIncrement = 10;
PageFetcher fetcher = new PageFetcher();
for (int i = 0; i < _maxSearchDepth; i += _searchIncrement)
{
string keywordNsearch = _keyword + i;
ParseHtmldocuments(fetcher.GetWebpage(keywordNsearch));
if (GetPostion() != 201)
{ //ADD DATA TO DATABASE
InsertRankingData(DocParser.GetSearchResults(), _theSearchedKeyword);
return;
}
}
}
By default, .NET allows only 2 concurrent connections to the same host. If you want more than that, you need to configure it in web.config. Look here: http://msdn.microsoft.com/en-us/library/aa480507.aspx
You can use the Parallel.For method, which is very straightforward and decides how many threads to use for you. Of course, you can tweak it with ParallelOptions to set how many threads (or tasks) you want. Look here: http://msdn.microsoft.com/en-us/library/dd781401.aspx
For a thread-safe dictionary you can use ConcurrentDictionary. Look here: http://msdn.microsoft.com/en-us/library/dd287191.aspx
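The same idea can be sketched in Python, purely as an illustration of the pattern and not as a translation of the .NET APIs above. The sketch assumes a hypothetical `fetch_and_parse` function standing in for the question's fetch-and-parse step, and it sidesteps the shared-dictionary problem by letting worker threads return their entries so that only the calling thread merges them.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

NOT_FOUND = 201  # the question's sentinel for "not in the first 200 results"


def fetch_and_parse(page_index):
    """Hypothetical stand-in for fetching and parsing one result page.

    Returns a dict mapping rank -> entry for that page (10 entries per page).
    """
    start = page_index * 10
    return {start + i + 1: "site-%d" % (start + i + 1) for i in range(10)}


def find_position(target, pages=20, max_workers=4, fetch=fetch_and_parse):
    """Fetch up to `pages` pages concurrently and return the rank of `target`."""
    results = {}  # only touched by the calling thread, so no lock is needed
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(fetch, i) for i in range(pages)]
        for future in as_completed(futures):
            entries = future.result()
            results.update(entries)
            for rank, entry in entries.items():
                if entry == target:
                    # Stop early: requests that have not started yet are cancelled.
                    for pending in futures:
                        pending.cancel()
                    return rank
    return NOT_FOUND
```

A small pool (4 workers here, an arbitrary choice) is usually enough, because a per-host connection limit caps the useful parallelism anyway.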

Qt or PyQt - check when a file is used by another process. Wait until the copy finishes

Good morning,
What is the best strategy to check when a big file or a big directory has finished copying?
I want to wait until a file has been fully copied. Is there a code example in q
I'm working on Mac OS X.
thanks
Update
I use QFileSystemWatcher. The problem is that I receive file or directory change notifications while the copy is still in progress. So when the user copies a big folder (with many files inside), the operating system's copy process starts and takes 5 minutes, but in the meantime my application receives file-changed notifications. This is a problem because when I receive a change notification my application starts doing some operations on those files, but the copy is still in progress!
There is only one reliable way to do this: Change the copy process to write to temporary files and then rename them after the copy is finished.
That way, you can ignore new files that end with .tmp, and because rename is an atomic operation, a file only appears under its final name once it is complete.
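A minimal sketch of the write-then-rename approach, assuming you control the copying side. The function names are illustrative, and the temporary file is deliberately placed in the destination directory so the rename stays on one filesystem, which is what makes it atomic on POSIX systems.

```python
import os
import shutil


def publish_copy(source_path, dest_path):
    """Copy source_path to dest_path so watchers never see a partial file."""
    tmp_path = dest_path + ".tmp"      # same directory as the final file
    shutil.copyfile(source_path, tmp_path)
    os.replace(tmp_path, dest_path)    # rename only after the copy is complete
    return dest_path


def is_ready(file_name):
    """Watcher-side filter: skip files that are still being written."""
    return not file_name.endswith(".tmp")
```

On the watching side, a change notification for a name that fails `is_ready` can simply be ignored.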
If you can't change the copy process, all you can do is add a timer to wait for, say, half an hour to make sure the copy is really finished.
A more fine-grained (and riskier) approach is to add a loop that checks the file size and stops when the size has not changed for a certain time, but that is also hard to get right.
Worse, this doesn't prevent you from reading partial files (when the copy process was terminated in the middle).
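For completeness, the size-polling loop could look roughly like the sketch below. The function name and default values are assumptions chosen for illustration, and the caveats above still apply: a stalled or aborted copy looks exactly like a finished one.

```python
import os
import time


def wait_until_stable(path, quiet_seconds=5.0, poll_interval=1.0, timeout=1800.0):
    """Return True once the size of `path` has not changed for quiet_seconds.

    Returns False if `timeout` seconds pass first.
    """
    deadline = time.monotonic() + timeout
    last_size = -1
    stable_since = time.monotonic()
    while time.monotonic() < deadline:
        try:
            size = os.path.getsize(path)
        except OSError:
            size = -1  # file does not exist yet, or has vanished
        now = time.monotonic()
        if size != last_size:
            # Size changed: restart the quiet period.
            last_size = size
            stable_since = now
        elif size >= 0 and now - stable_since >= quiet_seconds:
            return True
        time.sleep(poll_interval)
    return False
```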
I think that the QFileSystemWatcher is the right start for you to get to the point of monitoring for changes, but as you have found, these changes are ANY changes. From this point, I think it should be easy enough for you to just check the modification time of the file.
Here is a simple example of a Watcher class that will let you specify a file to monitor and see if it has been modified after a given time. It can run a callback or emit a signal that anyone can watch:
import os.path
import time
from PyQt4 import QtCore

class Watcher(QtCore.QObject):
    fileNotModified = QtCore.pyqtSignal(str)
    MOD_TIME_DIFF = 5  # seconds

    def __init__(self, aFile, callback=None, checkEvery=5):
        super(Watcher, self).__init__()
        self.file = aFile
        self.callback = callback
        self._timer = QtCore.QTimer(self)
        self._timer.setInterval(checkEvery*1000)
        self._timer.timeout.connect(self._checkFile)

    def _checkFile(self):
        # Seconds elapsed since the file was last modified
        diff = time.time() - os.path.getmtime(self.file)
        if diff > self.MOD_TIME_DIFF:
            self._timer.stop()
            self.fileNotModified.emit(self.file)
            if self.callback:
                self.callback()

    def start(self):
        self._timer.start()

    def stop(self):
        self._timer.stop()
An example of using it:
def callbackNotify():
    print("Callback!")

def signalNotify(f):
    print("Signal: %s has not been modified recently!" % f)

# You could directly give it a callback
watcher = Watcher("/path/to/file.file", callback=callbackNotify)
# Or could use a signal
watcher.fileNotModified.connect(signalNotify)
# tell the watcher timer to start checking
watcher.start()
## after the file hasn't been modified in 5 seconds ##
# Signal: /path/to/file.file has not been modified recently!
# Callback!
Try using QtConcurrent framework.
In particular, check out QFuture and QFutureWatcher. You can execute asynchronous copy operations inside a QFuture object and monitor its progress through signals and slots with a watcher.
bool copyFunction() {
    // copy operations here, return true on success
}
MyHandlerClass myObject;
QFutureWatcher<bool> watcher;
connect(&watcher, SIGNAL(finished()), &myObject, SLOT(handleFinished()));
QFuture<bool> future = QtConcurrent::run(copyFunction);
// Attach the future to the watcher, otherwise finished() is never emitted
watcher.setFuture(future);
Since you have no control on the external application, my suggestion is that you lock the files while you work on them. In this way other programs will not be able to access them while locked.
Alternatively, if you have access to the other program's source, you should implement some form of inter process communication,via sockets, messages or whatever method you prefer.

Please suggest a way to store a temp file in Windows Azure

I have a simple feature in an ASP.NET MVC3 application hosted on Azure.
1st step: the user uploads a picture
2nd step: the user crops the uploaded picture
3rd step: the system saves the cropped picture and deletes the temp file, which is the original uploaded picture
Here is the problem I am facing now: where should I store the temp file?
I tried somewhere on the Windows file system and in LocalResources. The problem is that these resources are per instance, so there is no guarantee that the instance that shows the picture for cropping is the same instance that saved the temp file.
Do you have any ideas on this temp file issue?
Normally the file exists only for a short while before it is deleted.
The temp file needs to be instance independent.
Ideally the file would have an expiry setting (for example, 1 hour) so that it deletes itself in case the code crashes somewhere.
OK. So what you're after is basically something that is shared storage but expires. Amazon have just announced a rather nice setting called object expiration (https://forums.aws.amazon.com/ann.jspa?annID=1303). There is nothing like this for Windows Azure storage yet, unfortunately, but that doesn't mean we can't come up with some other approach, perhaps even a better (more cost-effective) one.
You say that it needs to be instance independent, which means using a local temp drive is out of the picture. As others have said, my initial leaning would be towards Blob storage, but you will have cleanup effort there. If you are working with large images (>1MB) or low throughput (<100rps) then I think Blob storage is the only option. If you are working with smaller images AND high throughput then the transaction costs for Blob storage will start to really add up (I have a white paper coming out soon which shows some modelling of this, but some quick thoughts are below).
For a scenario with small images and high throughput, a better option might be to use the Windows Azure Cache as your temporary storage area. At first glance it will be eye-wateringly expensive on a per-GB basis (110GB/month for Cache, 12c/GB for Storage). But with storage your transactions are paid for, whereas with Cache they are 'free'. (Quotas are here: http://msdn.microsoft.com/en-us/library/hh697522.aspx#C_BKMK_FAQ8) This can really add up; e.g. using 100kb temp files held for 20 minutes with a system throughput of 1500rps, using Cache is about $1000 per month vs $15000 per month for storage transactions.
The Azure Cache approach is well worth considering, but to be sure it is the 'best' approach I'd really want to know:
Size of images
Throughput per hour
A bit more detail on the actual client interaction with the server during the crop process. Is it an interactive process where the user pulls the image into their browser and crops visually? Or is it just a simple crop?
Here is what I see as a possible approach:
1. The user uploads the picture.
2. Your code saves it to a blob and uses some data backend to track the relation between the user session and the uploaded image (marking it as a temp image).
3. Display the image in the cropping user interface.
4. When the user is done cropping on the client:
4.1. retrieve the original from the blob
4.2. crop it according to the data sent from the user
4.3. delete the original from the blob and the record in the data backend used in step 2
4.4. save the final image to another blob (final blob).
And have one background process checking for "expired" temp images in the data backend (used in step 2) to delete the images and the records in the data backend.
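The background cleanup can be sketched independently of any storage SDK. In the sketch below, `TempImageIndex` is a hypothetical in-memory stand-in for the data backend from step 2, and `delete_blob` is whatever callable actually removes a blob from storage; neither name comes from an Azure API.

```python
import time


class TempImageIndex:
    """In-memory stand-in for the data backend that tracks temp blobs."""

    def __init__(self, ttl_seconds=3600):
        self.ttl_seconds = ttl_seconds
        self._records = {}  # blob name -> upload timestamp (seconds)

    def register(self, blob_name, now=None):
        self._records[blob_name] = time.time() if now is None else now

    def remove(self, blob_name):
        self._records.pop(blob_name, None)

    def expired(self, now=None):
        now = time.time() if now is None else now
        return [name for name, created in self._records.items()
                if now - created >= self.ttl_seconds]


def sweep(index, delete_blob, now=None):
    """Delete every expired temp blob and its record; return the names removed."""
    removed = []
    for name in index.expired(now):
        delete_blob(name)   # hypothetical callable that deletes from storage
        index.remove(name)
        removed.append(name)
    return removed
```

The background process would then call `sweep` every N seconds; the one-hour TTL mirrors the expiry the question asks for.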
Please note that even in a WebRole you still have the RoleEntryPoint descendant, and you can still override the Run method. By implementing an infinite loop in Run() (that method must never exit!), you can check every N seconds whether there is anything to delete (the interval depends on your Thread.Sleep() call in Run()).
You can use Azure blob storage. Have a look at this tutorial.
The following sample should help:
https://code.msdn.microsoft.com/How-to-store-temp-files-in-d33bbb10
You have two ways to handle temp files in Azure:
1. You can use the Path.GetTempPath() and Path.GetTempFileName() functions to get a temp file name.
2. You can use Azure blob storage to simulate it.
private long TotalLimitSizeOfTempFiles = 100 * 1024 * 1024;
private async Task SaveTempFile(string fileName, long contentLength, Stream inputStream)
{
    try
    {
        // First, make sure the container exists; create it if it does not.
        await container.CreateIfNotExistsAsync();
        // Get a reference to the blob for this file name.
        CloudBlockBlob tempFileBlob = container.GetBlockBlobReference(fileName);
        // If a blob with this name already exists, delete the old one.
        await tempFileBlob.DeleteIfExistsAsync();
        // If the new file would push total usage over the limit, delete the oldest blobs first.
        await CleanStorageIfReachLimit(contentLength);
        // Finally, upload the new file.
        await tempFileBlob.UploadFromStreamAsync(inputStream);
    }
    catch (Exception ex)
    {
        if (ex.InnerException != null)
        {
            throw ex.InnerException;
        }
        // Rethrow with "throw;" so the original stack trace is preserved.
        throw;
    }
}
//check the count of blob if over limit or not, if yes, clear them.
private async Task CleanStorageIfReachLimit(long newFileLength)
{
List<CloudBlob> blobs = container.ListBlobs()
.OfType<CloudBlob>()
.OrderBy(m => m.Properties.LastModified)
.ToList();
//get total size of all blobs.
long totalSize = blobs.Sum(m => m.Properties.Length);
//calculate out the real limit size of before upload
long realLimetSize = TotalLimitSizeOfTempFiles - newFileLength;
//delete all,when the free size is enough, break this loop,and stop delete blob anymore
foreach (CloudBlob item in blobs)
{
if (totalSize <= realLimetSize)
{
break;
}
await item.DeleteIfExistsAsync();
totalSize -= item.Properties.Length;
}
}

Resources