Writing uploaded files to disk - nginx

Look at this page of the web.py cookbook:
http://webpy.org/cookbook/storeupload/
Pay attention to how it writes the file to disk.
The current situation is:
I launched a server in VirtualBox with 256 MB of memory and 512 MB of swap.
When I upload a file larger than 200 MB, I get an error ("the page is temporarily unavailable").
I think the Python file-write function reads the whole file into memory, and then it crashes due to the limited memory.
Am I right?
If so, is there any solution?
Thank you for your time.

Try not to read the whole file into memory: create a loop and transfer the file in 1024-byte chunks.

I take it you have set up nginx correctly, especially the client_max_body_size directive.
I think you're right: your problem is linked to bad memory usage, and it probably comes from the read() method.
Used without a size argument, it reads and returns the entire contents of the file. Since the file is almost as large as the machine's memory, the program runs out of it and crashes.
What you should do is investigate better ways to copy a file in Python.
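For instance, a chunked copy in Python might look like the following sketch (the path names are placeholders; the standard library's shutil.copyfileobj implements essentially the same loop):

```python
def copy_chunked(src_path, dst_path, chunk_size=64 * 1024):
    """Copy a file without ever holding more than chunk_size bytes in memory."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(chunk_size)  # read() with a size stays bounded
            if not block:                 # empty bytes object => end of file
                break
            dst.write(block)

# Equivalent using the standard library:
# import shutil
# with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
#     shutil.copyfileobj(src, dst, length=64 * 1024)
```

Memory use stays flat at one chunk regardless of the file size, which is exactly what a 256 MB VM needs.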


Downloading data directly to volatile memory

When you download a file from the internet, whether through an FTP request, a peer-to-peer connection, etc., you are always prompted with a window asking where to store the file on your HDD or SSD (or maybe you have a little NAS enclosure in your house). Either way, the information is being stored on a physical drive and is not considered volatile: it is stored digitally or magnetically and remains available to you even after the system is restarted.
Is it possible for software to be programmed to download and store information directly to a designated location in RAM without it ever touching a form of non-volatile memory?
If this is not possible can you please elaborate on why?
Otherwise if this is possible, if you could give me examples of software that implement this, or perhaps a scenario where this would be the only resolution to generate a desired outcome?
Thank you for the help. I feel this must be possible; however, I can't think of any time I've encountered it, and Google doesn't seem to understand what I'm asking.
Edit: This is being asked from the perspective of a novice programmer, someone who is looking into creating something like this. I seem to have over-inflated my own question. I suppose what I mean to ask is as follows:
How is software such as RAMDisk programmed, how exactly does it work, and are heavily abstracted languages such as C# and Java incapable of implementing such a feature?
This is actually not very hard to do if I understand your request correctly. What you're looking for is tmpfs[1].
Carve out a tmpfs partition (if /tmp isn't tmpfs for you by default) and mount it at a location, say something like /volatile.
Then you can simply configure your browser or whatever application to download all files to that folder/directory henceforth. Since tmpfs is essentially RAM mounted as a folder, it is reset after a reboot.
Edit: The OP asks how tmpfs and related RAM-based file systems are implemented. This is usually operating-system specific, but the general idea remains the same: the driver responsible for the RAM file system mmap()s the required amount of memory and then exposes that memory in a way that the file-system APIs typical of your operating system (for example, POSIX-y operations on Linux/Solaris/BSD) can access it.
Here's a paper describing the implementation of tmpfs on Solaris[2].
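As a toy illustration of that idea (not a real file-system driver), Python exposes anonymous mmap() regions, which are RAM-backed memory with a file-like interface and no on-disk file behind them:

```python
import mmap

# Length -1 file descriptor: an anonymous mapping, backed by RAM/swap only
region = mmap.mmap(-1, 4096)
region.write(b"volatile data")   # file-like write into plain memory
region.seek(0)
data = region.read(13)
region.close()                   # the bytes vanish when the mapping goes away
print(data)
```

A RAM-disk driver does the same thing one layer lower, exposing such a region through the kernel's file-system interface instead of to a single process.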
Further note: if, however, you're trying to simply download something, use it, and delete it without ever hitting disk, in a way that's entirely internal to your application, then you can simply allocate memory dynamically based on the size of whatever you're downloading, write the bytes into the allocated memory, and free() it once you're done.
This answer assumes you're on a Linux-y operating system. There are likely similar solutions for other operating systems.
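A sketch of that last, application-internal approach in Python, with the network read simulated by a list of chunks (in real code they would come from a socket or HTTP response):

```python
import io

def download_to_ram(chunks):
    """Accumulate incoming chunks in a heap-allocated buffer; disk is never touched."""
    buf = io.BytesIO()
    for block in chunks:   # e.g. chunks read off a socket in real code
        buf.write(block)
    buf.seek(0)
    return buf             # use it, then drop the reference; the memory is freed

# Simulated "download" arriving in three pieces
payload = download_to_ram([b"never ", b"hits ", b"disk"])
print(payload.read())  # b'never hits disk'
```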
References:
[1] https://en.wikipedia.org/wiki/Tmpfs
[2] http://www.solarisinternals.com/si/reading/tmpfs.pdf

MPI one-sided file I/O

I have some questions about performing file I/O using MPI.
A set of files are distributed across different processes.
I want the processes to read the files in the other processes.
For example, in one-sided communication, each process sets up a window visible to the other processes. I need exactly the same functionality: create 'windows' for all files and share them, so that any process can read any file from any offset.
Is it possible in MPI? I have read a lot of MPI documentation, but couldn't find this exact capability.
The simple answer is that you can't do that automatically with MPI.
You can convince yourself of this by observing that MPI_File_open() is a collective call taking an intra-communicator as its first argument and returning a file handle to the opened file as its last argument. All processes in this communicator open the file, and therefore all processes must see the file. So unless a process sees a file, it cannot get an MPI_File handle to access it.
Now, that doesn't mean there's no solution. A possibility could be to do by hand exactly what you described, namely:
Each MPI process individually opens the file it sees and is responsible for; then
Each of these processes reads its local file into a buffer;
These individual buffers are all exposed, using either one global MPI_Win memory window or several individual ones, ready for one-sided read accesses; and finally
All read accesses to any data previously stored in these individual local files are now done through MPI_Get() calls using the memory window(s).
The true limitation of this approach is that it requires fully reading all of the individual files, so you need sufficient memory per node to store each of them. I'm well aware this is a very big caveat that could make the solution completely impractical; however, if the memory is sufficient, it is an easy approach.
Another, even simpler solution would be to store the files on a shared file system, or to have them all copied onto every local file system. I imagine this isn't an option, since otherwise the question wouldn't have been asked...
Finally, as a last resort, a possibility I see would be to dedicate one MPI process (or one OpenMP thread of an MPI process) per node to serve the files. This process would just act as a "file server", answering "read" requests coming from the other MPI processes and serving them by reading the requested data from the file and sending it back via MPI. It's a bit lengthy to write, but it should work.
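The file-server pattern can be sketched as below. This is deliberately not MPI: a plain Python thread and queues stand in for the serving rank and its send/receive calls, and all names are made up; a real implementation would use MPI_Recv/MPI_Send (or an MPI_Win) in place of the queues.

```python
import queue
import threading

def file_server(requests, responses, path):
    """Serve (offset, size) read requests for one local file until told to stop."""
    with open(path, "rb") as f:
        while True:
            req = requests.get()         # stands in for MPI_Recv
            if req is None:              # shutdown sentinel
                break
            offset, size = req
            f.seek(offset)
            responses.put(f.read(size))  # stands in for MPI_Send

def remote_read(requests, responses, offset, size):
    """Client side: ask the serving thread/process for a slice of its file."""
    requests.put((offset, size))
    return responses.get()
```

The server never holds more than one request's worth of data in memory, which is what makes this viable when the buffer-everything approach above is not.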

0-copy inter-process communication on Unix without using the filesystem

If I have to move a moderate amount of memory between two processes, I can do the following:
create a file for writing
ftruncate to desired size
mmap and unlink it
use as desired
When another process requires that data, it:
connects to the first process through a unix socket
the first process sends the fd of the file through a unix socket message
mmap the fd
use as desired
This allows us to move memory between processes without any copy - but the file created must be on a memory-mounted filesystem, otherwise we might get a disk hit, which would degrade performance. Is there a way to do something like that without using a filesystem? A malloc-like function that returned a fd along with a pointer would do it.
[Edit] Having a file descriptor provides also a reference count mechanism that is maintained by the kernel.
Is there anything wrong with System V or POSIX shared memory (which are somewhat different, but end up with the same result)? With any such system, you have to worry about coordination between the processes as they access the memory, but that is true with memory-mapped files too.
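With POSIX shared memory, the handoff looks roughly like this sketch, which uses Python's multiprocessing.shared_memory wrapper; here both ends run in one process for brevity, whereas a real consumer would be a separate process that receives the segment name over a socket:

```python
from multiprocessing import shared_memory

# Producer: create a named segment and fill it; no copy through the file system
shm = shared_memory.SharedMemory(create=True, size=1024)
shm.buf[:5] = b"hello"

# Consumer (normally another process): attach by name and read the data in place
peer = shared_memory.SharedMemory(name=shm.name)
data = bytes(peer.buf[:5])

peer.close()
shm.close()
shm.unlink()   # the kernel reference-counts the segment, like the fd trick
```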

Intercept outputs from a Program in Windows 7

I have an executable program which writes data to the hard disk, e.g. to C:\documents.
I need some means to intercept the data in Windows 7 before it gets to the hard drive. I will then encrypt the data and send it on to the hard disk. Unfortunately, the .exe file does not support the redirection operator (>) at the command prompt. Do you know how I can achieve such a thing in any programming language (C, C++, Java, PHP)?
The encryption can only be done before the plain data is sent to the disk, not after.
Any ideas most welcome. Thanks
This is virtually impossible in general. Many programs write to disk using memory-mapped files, where a memory range is mapped to (part of) a file. In such a scheme, writes to the file can't be distinguished from writes to memory: a statement like p[OFFSET_OF_FIELD_X] = 17; is logically a write to the file. Furthermore, the OS keeps track of the synchronization of memory and disk; not all logical writes to memory are directly translated into physical writes to disk. From time to time, at the whim of the OS, dirty memory pages are copied back to disk.
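That behaviour is easy to reproduce. In this Python sketch (file name and offset are arbitrary), the byte reaches the file through a plain memory store, with no explicit write call anywhere for an interceptor to hook:

```python
import mmap

def store_via_mapping(path, offset, value):
    """Set one byte of a file through a memory mapping, not through write()."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as m:
            m[offset] = value   # an ordinary memory store...
            m.flush()           # ...which the OS persists to the file
```

After store_via_mapping(path, 3, 17), the fourth byte of the file is 17, yet the program never issued a WriteFile/write() for it.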
Even in the simpler case of CreateFile/WriteFile, there's little room to intercept the data on the fly. The closest you could get is to use Microsoft Detours. I know of at least one snake-oil encryption program (WxVault, crapware shipped on Dells) that does that. It repeatedly crashed my application in the field, which is why my program unpatches any attempt to intercept data on the fly. So not even such hacks are robust against programs that dislike interference.

Sending a binary stream to the clients browser

I was wondering if there is any difference in performance (or any other important factor) between a file sent to the browser from our server by this method:
For i = 1 To fileSize \ chunk
    If Not Response.IsClientConnected Then Exit For
    Response.BinaryWrite stream.Read(chunk)
    Response.Flush
Next
' Integer division drops the remainder, so send any final partial chunk
If fileSize Mod chunk > 0 Then Response.BinaryWrite stream.Read(fileSize Mod chunk)
VS
the plain old file-access method that IIS comes with.
We are working on a file manager handler for security reasons and would like to know what is the performance hit.
Unless you are dealing with a fairly large file, there shouldn't be a noticeable difference. Since you are creating the chunks manually and then flushing the buffer, you are going to have more packet traffic to the client (the last packet of each flush will be only partially full). However, as I said, this probably won't be noticeable unless you have a large file, and even then it's not likely to show up.
Both methods need to push binary data to the browser.
would like to know what is the performance hit.
Like always in such cases: measure. Try to optimize settings on IIS and measure again until you get the most optimal solution.
In my experience, chunking stuff down using script is significantly slower than IIS's highly optimised static-file handling. How much slower depends on too many factors to list, but I've seen it up to 10 times slower when other poor choices have been made (like using 4096 bytes as the buffer size).
Some things may have improved in IIS 7, but if you can serve the file as static content, I would definitely go with that.
