File Access Strategy in a Multi-Threaded Environment (Web App) - asp.net

I have a file which is an XML representation of some data that is taken from a Web service and cached locally within a Web Application. The idea is that this data is very static, but just might change. So I have set it up to cache to a file, and attached a monitor to it to check whether it has been deleted. Once deleted, the file will be refreshed from its source and rebuilt.
I am now running into problems though, because in a multi-threaded environment it falls over: one thread tries to access the data while another is still reading or writing the file.
This is confusing me, because I added an object to lock against, and this is always locked during read/write. It was my understanding that attempted access from other threads would be told to "wait" until the lock was released?
Just to let you know, I am real new to multi-threaded development, so I am totally willing to accept this is a screw up on my part :)
Am I missing something?
What is the best file access strategy in a multi-threaded environment?
Edit
Sorry - I should have said this is using ASP.NET 2.0 :)

Here is the code that I use to make sure a file is not locked by another process. It's not 100% foolproof, but it gets the job done most of the time:
/// <summary>
/// Blocks until the file is no longer locked, or gives up after 10 tries.
/// </summary>
/// <param name="fullPath">Full path of the file to wait for.</param>
/// <returns>True if an exclusive lock was obtained; false if every try failed.</returns>
bool WaitForFile(string fullPath)
{
    int numTries = 0;
    while (true)
    {
        ++numTries;
        try
        {
            // Attempt to open the file exclusively.
            using (FileStream fs = new FileStream(fullPath,
                FileMode.Open, FileAccess.ReadWrite,
                FileShare.None, 100))
            {
                fs.ReadByte();
                // If we got this far the file is ready
                break;
            }
        }
        catch (Exception ex)
        {
            Log.LogWarning(
                "WaitForFile {0} failed to get an exclusive lock: {1}",
                fullPath, ex.ToString());
            if (numTries >= 10)
            {
                Log.LogWarning(
                    "WaitForFile {0} giving up after 10 tries",
                    fullPath);
                return false;
            }
            // Wait for the lock to be released
            System.Threading.Thread.Sleep(500);
        }
    }
    Log.LogTrace("WaitForFile {0} returning true after {1} tries",
        fullPath, numTries);
    return true;
}
Obviously you can tweak the timeouts and retries to suit your application. I use this to process huge FTP files that take a while to be written.

If you're locking on an object stored in a static field, the lock should work for all threads in the same Application Domain, but perhaps you could post a code sample so we can have a look at the offending lines.
That said, one thought would be to check whether IIS is configured to run in Web Garden mode (i.e. more than one process executing your application), which would break your locking logic because a lock only applies within a single process. While you could fix such a situation with a mutex, it would be easier to reconfigure your application to execute in a single process. You'd be wise to measure performance before and after changing the web garden settings, as they can affect it.
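For the web garden case, a named mutex could look roughly like the sketch below. The class name, method name, and mutex name are made up for illustration; the only requirement is that every worker process uses the same mutex name. Treat it as a starting point under those assumptions rather than a drop-in fix.

```csharp
using System;
using System.IO;
using System.Threading;

public static class CrossProcessFileGuard
{
    // Hypothetical name: all worker processes must agree on it.
    private const string MutexName = "MyApp_CacheFile";

    // Reads the file while holding a system-wide named mutex, so that
    // code in other processes using the same mutex name has to wait.
    public static string ReadAllTextLocked(string path)
    {
        using (Mutex mutex = new Mutex(false, MutexName))
        {
            bool owned = false;
            try
            {
                try
                {
                    owned = mutex.WaitOne();
                }
                catch (AbandonedMutexException)
                {
                    // A previous owner exited without releasing the mutex;
                    // ownership has still been transferred to this thread.
                    owned = true;
                }
                return File.ReadAllText(path);
            }
            finally
            {
                if (owned)
                {
                    mutex.ReleaseMutex();
                }
            }
        }
    }
}
```

The write path would need to take the same mutex, otherwise readers are only protected from each other.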

You could maybe create the file with a temporary name ("data.xml_TMP"), and when it's ready change the name to what it is supposed to be. That way, no other process will be accessing it before it is ready.
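A minimal sketch of that idea, assuming the "_TMP" suffix from the suggestion above and a hypothetical helper class called AtomicFileWriter:

```csharp
using System.IO;

public static class AtomicFileWriter
{
    // Writes the new contents to a temporary file first, then swaps it
    // into place, so readers never see a half-written file.
    public static void Write(string path, string contents)
    {
        string tempPath = path + "_TMP";
        File.WriteAllText(tempPath, contents);

        if (File.Exists(path))
        {
            // Replace the existing file with the finished temp file
            // (no backup copy is kept).
            File.Replace(tempPath, path, null);
        }
        else
        {
            // First write: nothing to replace, so just rename.
            // Note: another writer could create the file between the
            // Exists check and this call, in which case Move throws.
            File.Move(tempPath, path);
        }
    }
}
```

If several threads can write at once they would still need a lock around the call, since they would all share the same temporary file name.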

OK, I have been working on this and ended up creating a stress-test module to basically hammer the crap out of my code from several threads (See Related Question).
It was much easier from this point on to find holes in my code. It turns out that my code wasn't actually far off, but there was a certain logic path it could enter which caused read/write operations to stack up, meaning that if they didn't get cleared in time, it would go boom!
Once I took that out, ran my stress test again, all worked fine!
So, I didn't really do anything special in my file access code, just ensured I used lock statements where appropriate (i.e. when reading or writing).
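For readers who want to see the shape of this, here is a rough sketch of what "lock when reading or writing" can look like. It is not the code from the question; the class and member names are invented, and it assumes a single worker process (one AppDomain), since a lock statement does not span processes.

```csharp
using System.IO;

public static class CachedXmlFile
{
    // One shared lock object for every thread in the AppDomain.
    private static readonly object FileLock = new object();

    // Returns the cached XML, or null if the cache file does not exist.
    public static string Read(string path)
    {
        lock (FileLock)
        {
            return File.Exists(path) ? File.ReadAllText(path) : null;
        }
    }

    // Rewrites the cache file; readers wait until the write has finished.
    public static void Write(string path, string xml)
    {
        lock (FileLock)
        {
            File.WriteAllText(path, xml);
        }
    }
}
```

The important part is that every code path touching the file goes through the same lock object; a single unguarded read is enough to bring the original failure back.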

How about using AutoResetEvent to communicate between threads? I created a console app which creates a roughly 8 GB file in the createFile method and then copies that file in the Main method:
static AutoResetEvent waitHandle = new AutoResetEvent(false);
static string filePath = @"C:\Temp\test.txt";
static string fileCopyPath = @"C:\Temp\test-copy.txt";
static void Main(string[] args)
{
    Console.WriteLine("in main method");
    Console.WriteLine();
    Thread thread = new Thread(createFile);
    thread.Start();
    Console.WriteLine("waiting for file to be processed ");
    Console.WriteLine();
    // Block until createFile signals that the file is complete
    waitHandle.WaitOne();
    Console.WriteLine();
    File.Copy(filePath, fileCopyPath);
    Console.WriteLine("file copied ");
}
static void createFile()
{
    FileStream fs = File.Create(filePath);
    Console.WriteLine("start processing a file " + DateTime.Now);
    Console.WriteLine();
    // Disposing the StreamWriter also closes the underlying FileStream
    using (StreamWriter sw = new StreamWriter(fs))
    {
        for (long i = 0; i < 300000000; i++)
        {
            sw.WriteLine("The value of i is " + i);
        }
    }
    Console.WriteLine("file processed " + DateTime.Now);
    Console.WriteLine();
    // Signal the waiting thread that the file is ready
    waitHandle.Set();
}

Related

Can InputStreamResource be used instead of temporary file for async file upload to sftp server

I have a Spring Integration flow which uploads files to an SFTP server asynchronously; the files to be uploaded come from an HTTP endpoint. Initially I faced the same problem as discussed here, and I'm glad it got solved.
In the same SO thread I found this comment.
In enterprise environments, you often have files of sizes you cannot afford to buffer into memory like that. Sadly enough, InputStreamResource won't work either. Your best bet, as far as I could tell so far, is to copy contents to an own temp file (e.g. File#createTempFile) which you can clean up at the end of the processing thread.
Currently I'm wrapping the file's InputStream in an InputStreamResource to get rid of the problem, and it's working flawlessly. Why does the commenter say InputStreamResource won't work either? AFAIK an InputStream never stores data in memory.
Does the InputStreamResource's InputStream get closed automatically after the file upload?
When we say large file, how large are we talking about? Currently, in my case, files of 2-5 MB are uploaded to SFTP.
Do I really need to change my file upload mechanism to something like storing files in a temp folder?
Code Sample:
@PostMapping("/upload")
public void sampleEndpoint(@NotEmpty @RequestParam MultipartFile file)
        throws IOException {
    Resource resource = new InputStreamResource(file.getInputStream());
    sftpFileService.upload(resource);
}
SftpFileService Async upload method:
@Async
public void upload(Resource resource) {
    try {
        messagingGateway.upload(resource);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
2-5 MB is probably not a size to worry about. The problem could appear when files are 1-2 GB in size, although you may still run out of memory when several concurrent uploads hit your service.
The InputStreamResource is just a decorator around an InputStream with Resource API for access to the underlying delegating stream. It is not clear how it can work in async environment since MultipartFile is deleted in the end of HTTP upload request.
Plus you don't show any code to understand the situation better...

ASP.NET WEB API Temp File operation Speed Up

I'm hosting a WEB API App on Azure and I'm noticing some latency on a file operation.
What I'm doing is:
Post File
Write file on disk
Execute operations on file
The problem is that it takes A LOT of time to simply "take" this file, this is my code:
[HttpPost]
[Route("api/Test/Debug")]
public IHttpActionResult Debug()
{
    var sw = Stopwatch.StartNew();
    var httpRequest = HttpContext.Current.Request;
    // Accessing Files forces the request body to be read and parsed
    var postedFile = httpRequest.Files[0];
    return Ok(sw.ElapsedMilliseconds);
}
As you can see it's really simple, but it can take as long as 3 seconds just to get the "postedFile" element.
Is there any way to optimize this? Any other way to increase performance? Is this file already stored in some temp dir, so that I can access it without having to write it to disk and then delete it?
thanks in advance

Using ffmpeg in asp.net

I needed an audio conversion library. After pulling my hair out, I have accepted that there is no suitable audio library out there: every library I tried has one problem or another.
The only option left is ffmpeg, which is the best, but unfortunately you cannot use it in asp.net (not directly, I mean). Will every user on the website who converts a file launch an exe? I think I will hit the server's memory limit soon.
Bottom Line: I will try using ffmpeg.exe and see how many users it can support simultaneously.
I went to the ffmpeg website and in the Windows download section I found 3 different versions: static, shared and dev.
Does anyone know which would be best for use in asp.net? Everything packed in one exe (static), or separate DLLs with a small exe (shared)?
PS: if anyone knows of a good library out there, it would be great if you could share it.
Static builds provide one self-contained .exe file for each program (ffmpeg, ffprobe, ffplay).
Shared builds provide each library as a separate .dll file (avcodec, avdevice, avfilter, etc.), and .exe files that depend on those libraries for each program
Dev packages provide the headers and .lib/.dll.a files required to use the .dll files in other programs.
FFmpeg is the best library out there from what I have used, but I wouldn't recommend trying to call it directly from asp.net.
What I have done is accept the upload, store it on the server (or S3 in my case), and then have a worker role (if using something like Azure) running a process that continuously monitors for new files to convert.
If you need a near-realtime solution, you could update flags in your database and have an AJAX solution poll the database to keep providing progress updates, then show a link to download once the conversion is complete.
Personally my approach would be
Azure Web Roles
Azure Worker Role
ServiceBus
The WorkerRole starts up and is monitoring the ServiceBus Queue for messages.
The ASP.NET site uploads and stores the file in S3 or Azure
The ASP.NET site then records information in your DB if needed and sends a message to the ServiceBus queue.
The WorkerRole picks this up and converts.
AJAX will be needed on the ASP.NET site if you want a realtime monitoring solution. Otherwise you could send an email when complete if needed.
Using a queuing process also helps you with load as when you are under heavy load people just wait a little longer and it doesn't grind everything to a halt. Also you can scale out your worker roles as needed to balance loads, should it ever become too much for one server.
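To make the queue-and-worker idea concrete without any Azure dependencies, here is a small in-process sketch that uses BlockingCollection as a stand-in for the ServiceBus queue. The ConversionQueue class and its members are invented for illustration; in the setup described above the queue and the worker would live in separate roles, and the convert callback would be where ffmpeg gets launched.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public class ConversionQueue
{
    // Stand-in for the ServiceBus queue: holds paths of uploaded files.
    private readonly BlockingCollection<string> _jobs =
        new BlockingCollection<string>();

    // Called by the web site after it has stored an upload.
    public void Enqueue(string filePath)
    {
        _jobs.Add(filePath);
    }

    // Signals that no more jobs will arrive, letting the worker finish.
    public void Complete()
    {
        _jobs.CompleteAdding();
    }

    // Starts one worker that converts queued files one at a time, so a
    // burst of uploads waits in line instead of starting many processes.
    public Task StartWorker(Action<string> convert)
    {
        return Task.Factory.StartNew(() =>
        {
            foreach (string filePath in _jobs.GetConsumingEnumerable())
            {
                convert(filePath);
            }
        }, TaskCreationOptions.LongRunning);
    }
}
```

Starting more than one worker against the same queue is the in-process equivalent of scaling out worker roles.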
Here is how I run FFmpeg from C# (you will need to change the parameters for your requirements):
string ffmpegArgs = string.Format("-i {0} -s 640x360 {1}", input.Path, "C:\\FilePath\\file.mp4");
RunProcess(ffmpegArgs);
private string RunProcess(string parameters)
{
    // Create the process info: no shell, no window, both streams redirected
    ProcessStartInfo oInfo = new ProcessStartInfo(this._ffExe, parameters);
    oInfo.UseShellExecute = false;
    oInfo.CreateNoWindow = true;
    oInfo.RedirectStandardOutput = true;
    oInfo.RedirectStandardError = true;
    // Collects everything the process writes to stdout and stderr
    StringBuilder output = new StringBuilder();
    DataReceivedEventHandler collect = delegate(object sender, DataReceivedEventArgs e)
    {
        // The two streams are delivered on different threads, so guard the buffer
        if (e.Data != null) { lock (output) { output.AppendLine(e.Data); } }
    };
    try
    {
        // Run the process and wait for it to finish
        using (Process proc = new Process())
        {
            proc.StartInfo = oInfo;
            proc.OutputDataReceived += collect;
            proc.ErrorDataReceived += collect;
            proc.Start();
            proc.BeginOutputReadLine();
            proc.BeginErrorReadLine();
            proc.WaitForExit();
        }
    }
    catch (Exception)
    {
        // Capture Error
    }
    return output.ToString();
}

Webmatrix.Data.Database Connection String Cleared After Form Submit

I'm developing an ASP.NET (Razor v2) Web Site, and using the WebMatrix.Data library to connect to a remote DB. I have the Database wrapped in a singleton, because it seemed like a better idea than constantly opening and closing DB connections, implemented like so:
public class DB
{
private static DB sInstance = null;
private Database mDatabase = null;
public static DB Instance
{
get
{
if (sInstance == null)
{
sInstance = new DB();
}
return sInstance;
}
}
private DB()
{
mDatabase = Database.Open("<Connection String name from web.config>");
return;
}
<Query Functions Go Here>
}
("Database" here refers to the WebMatrix.Data.Database class)
The first time I load my page with the form on it and submit, a watch of mDatabase's Database.Connection property shows the following: (Sorry, not enough rep to post images yet.)
http://i.stack.imgur.com/jJ1RK.png
The form submits, the page reloads, the submitted data shows up, everything is a-ok. Then I enter new data and submit the form again, and here's the watch:
http://i.stack.imgur.com/Zorv0.png
The Connection has been closed and its Connection String blanked, despite not calling Database.Close() anywhere in my code. I have absolutely no idea what is causing this, has anyone seen it before?
I'm currently working around the problem by calling Database.Open() before and Database.Close() immediately after every query, which seems inefficient.
The Web Pages framework will ensure that connections opened via the Database helper class are closed and disposed when the current page has finished executing. This is by design. It is also why you rarely see connections explicitly closed in any Web Pages tutorial where the Database helper is used.
It is very rarely a good idea to have permanently opened connections in ASP.NET applications. It can cause memory leaks. When Close is called, the connection is not actually terminated by default. It is returned to a pool of connections that are kept alive by ADO.NET connection pooling. That way, the effort required to instantiate new connections is minimised but managed properly. So all you need to do is call Database.Open in each page. It's the recommended approach.

NHibernate thread safety with session

I've been using NHibernate for a while now and have found that if I request two pages simultaneously (or as close to it as I can), it will occasionally error. So I assumed that it was because my Session management was not thread safe.
I thought it was my class so I tried to use a different method from this blog post http://pwigle.wordpress.com/2008/11/21/nhibernate-session-handling-in-aspnet-the-easy-way/ however I still get the same issues. The actual error I am getting is:
Server Error in '/AvvioCMS' Application.
failed to lazily initialize a collection, no session or session was closed
Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.
Exception Details: NHibernate.LazyInitializationException: failed to lazily initialize a collection, no session or session was closed
Either that or no datareader is open, but this is the main culprit.
I've placed my session management class below, can anyone spot why I may be having these issues?
public interface IUnitOfWorkDataStore
{
object this[string key] { get; set; }
}
public class NHibernateSession
{
public static Configuration Init(IUnitOfWorkDataStore storage, Assembly[] assemblies)
{
if (storage == null)
throw new Exception("storage mechanism was null but must be provided");
Configuration cfg = ConfigureNHibernate(string.Empty);
foreach (Assembly assembly in assemblies)
{
cfg.AddMappingsFromAssembly(assembly);
}
SessionFactory = cfg.BuildSessionFactory();
ContextDataStore = storage;
return cfg;
}
public static ISessionFactory SessionFactory { get; set; }
public static ISession StoredSession
{
get
{
return (ISession)ContextDataStore[NHibernateSession.CDS_NHibernateSession];
}
set
{
ContextDataStore[NHibernateSession.CDS_NHibernateSession] = value;
}
}
public const string CDS_NHibernateSession = "NHibernateSession";
public const string CDS_IDbConnection = "IDbConnection";
public static IUnitOfWorkDataStore ContextDataStore { get; set; }
private static object locker = new object();
public static ISession Current
{
get
{
ISession session = StoredSession;
if (session == null)
{
lock (locker)
{
if (DBConnection != null)
session = SessionFactory.OpenSession(DBConnection);
else
session = SessionFactory.OpenSession();
StoredSession = session;
}
}
return session;
}
set
{
StoredSession = value;
}
}
public static IDbConnection DBConnection
{
get
{
return (IDbConnection)ContextDataStore[NHibernateSession.CDS_IDbConnection];
}
set
{
ContextDataStore[NHibernateSession.CDS_IDbConnection] = value;
}
}
}
And the actual store I am using is this:
public class HttpContextDataStore : IUnitOfWorkDataStore
{
public object this[string key]
{
get { return HttpContext.Current.Items[key]; }
set { HttpContext.Current.Items[key] = value; }
}
}
I initialize the SessionFactory on Application_Start up with:
NHibernateSession.Init(new HttpContextDataStore(), new Assembly[] {
typeof(MappedClass).Assembly});
Update
Thanks for your advice. I have tried a few different things to try and simplify the code but I am still running into the same issues and I may have an idea why.
I create the session per request as and when it is needed, but in my global.asax I am disposing of the session on Application_EndRequest. However, I'm finding that Application_EndRequest is being fired more than once while I am debugging, at the end of loading a page. I thought the event was only supposed to fire once, at the very end of the request. If it isn't, and some other items are trying to use the Session afterwards (which is what the error is complaining about), that could be my problem: the Session is still thread safe, it is just being disposed of too early.
Anyone got any ideas? I did a google and saw that the VS development server does cause issues like that but I am running it through IIS.
While I haven't seen your entire codebase or the problem you're trying to solve, a rethinking of how you are using NHibernate might be in order. From the documentation:
You should observe the following practices when creating NHibernate Sessions:
Never create more than one concurrent ISession or ITransaction instance per database connection.
Be extremely careful when creating more than one ISession per database per transaction. The ISession itself keeps track of updates made to loaded objects, so a different ISession might see stale data.
The ISession is not threadsafe! Never access the same ISession in two concurrent threads. An ISession is usually only a single unit-of-work!
That last bit is the most relevant (and important in the case of a multithreaded environment) to what I'm saying. An ISession should be used once for a small atomic operation and then disposed. Also from the documentation:
An ISessionFactory is an expensive-to-create, threadsafe object intended to be shared by all application threads. An ISession is an inexpensive, non-threadsafe object that should be used once, for a single business process, and then discarded.
Combining those two ideas, instead of storing the ISession itself, store the session factory since that is the "big" object. You can then employ something like SessionManager.GetSession() as a wrapper to retrieve the factory from the session store and instantiate a session and use it for one operation.
The problem is also less obvious in the context of an ASP.NET application. You're statically scoping the ISession object which means it's shared across the AppDomain. If two different Page requests are created within that AppDomain's lifetime and are executed simultaneously, you now have two Pages (different threads) touching the same ISession which is not safe.
Basically, instead of trying to keep a session around for as long as possible, try to get rid of them as soon as possible and see if you have better results.
EDIT:
Ok, I can see where you're trying to go with this. It sounds like you're trying to implement the Open Session In View pattern, and there are a couple of different routes you can take on that:
If adding another framework is not an issue, look into something like Spring.NET. It's modular so you don't have to use the whole thing, you could just use the NHibernate helper module. It supports the open session in view pattern. Documentation here (heading 21.2.10. "Web Session Management").
If you'd rather roll your own, check out this codeproject posting by Bill McCafferty: "NHibernate Best Practices". Towards the end he describes implementing the pattern through a custom IHttpModule. I've also seen posts around the Internet for implementing the pattern without an IHttpModule, but that might be what you've been trying.
My usual pattern (and maybe you've already skipped ahead here) is use a framework first. It removes lots of headaches. If it's too slow or doesn't fit my needs then I try to tweak the configuration or customize it. Only after that do I try to roll my own, but YMMV. :)
I can't be certain (as I'm a Java Hibernate guy) in NHibernate but in hibernate Session objects are not thread safe by design. You should open and close a session and never allow it out of the scope of the current thread.
I'm sure that patterns such as 'Open session view' have been implemented in .Net somewhere.
The other interesting issue is when you put a hibernate entity in the (HTTP) session. The problem here is that the hibernate session it is attached to will be closed (or should be) when the request finishes. You have to reattach the entity to the new hibernate session if you wish to navigate any non-loaded associations. This in itself causes a new issue if two requests try to do this at the same time, as something will blow up if you try to attach an entity to two sessions.
Hope this helps.
Gareth
The problem ended up being that my library for inversion of control was not managing the objects being created in the HTTP context correctly, so I was getting references to objects that should not have been available to that context. This was using Ninject 1.0; once I updated to Ninject 2.0 (beta) the problem was resolved.
