UNIX buffered vs unbuffered I/O - unix

What is the difference between Unbuffered I/O and standard I/O? I know that using read(), write(), close() are unbuffered IO. Printf and gets are buffered IO. I also know that it is better to use buffered IO for big transactions. I just dont know the reason why. And what does the term "buffered" mean in this context?

Unbuffered I/O simply means that don't use any buffer while reading or writing.Generally when we use System calls like read() and write() they read and write char by char and can cause huge performance degradation . So for huge date generally high level reads/writes or simply buffered I/O are preferred .Buffered simply means that we are not dealing with single char but a block of chars, that is why sometimes it also known as block I/O.Generally in Unix when we use high level read/write functions they fetch/store the data of a given block size and place them in buffer cache and from this buffer cache these I/O functions get the desired amount of data.

Related

Concurrent write with OCaml Async

I'm reading data from the network and I'd like to write it to a file whenever I get them. The writes are concurrent and non sequential (think P2P file sharing). In C, I would get a file descriptor to the file(for the duration of the program) then use lseek, followed by write and eventually close the fd. These operations could be protected by a mutex in a multithreaded setting (especially, lseek and write should be atomic).
I don't really see how to get that behavior in Async. My initial idea is to have something like this.
let write fd s pos =
let posl = Int64.of_int pos in
Async_unix.Unix_syscalls.lseek fd ~mode:`Set posl
>>| fun _ ->
let wr = Writer.create t.fd in
let len = String.length s in
Writer.write wr s ~pos:0 ~len
Then, the writes are scheduled asynchronously when data is received.
My solution isn't correct. For one thing, this write task need to be atomic but it is not the case, since two lseek can be executed before the first Writer.write. Even if I can schedule the write sequentially it wouldn't help since Writer.write doesn't return a Deferred.t. Any idea?
BTW, this is a follow-up to a previous answered question.
The basic approach would be to have a queue of workers, where each worker performs an atomic seek/write1 operation. The invariant is that only one worker at a time is running. A more complicated strategy would employ a priority queue, where writes are ordered by some criterium that maximizes the throughput, e.g., writes to subsequent positions. You may also implement a sophisticated buffering strategy if you observe lots of small writes, then a good idea would be to coalesce them into larger chunks.
But let's start with a simple non-prioritized queue, implemented via Async.Pipe.t. For the positional write, we can't use the Writer interface, as it is designed for buffered sequential writes. So, we will use the Unix.lseek from Async_unix.Std and Bigstring.really_writefunction. The really_write is a regular non-asynchronous function, so we need to lift it into the Async interface using theFd.syscall_in_thread` function, e.g.,
let really_pwrite fd offset bytes =
Unix.lseek fd offset ~mode:`Set >>= fun (_ : int64) ->
Fd.syscall_in_thread fd (fun desc ->
Bigstring.really_write desc bytes)
Note: this function will write as many bytes as system decides, but no more than the length of bytes. So you might be interested in implementing a really_pwrite function that will write all bytes.
The overall scheme would include one master thread, that will own a file descriptor and accept write requests from multiple clients via an Async.Pipe. Suppose that each write request is a message of the follwing type:
type chunk = {
offset : int;
bytes : Bigstring.t;
}
Then your master thread will look like this:
let process_requests fd =
Async.Pipe.iter ~f:(fun {offset; bytes} ->
really_pwrite fd offset bytes)
Where the really_pwrite is a function that really writes all the bytes and handles all the errors. You may also use Async.Pipe.iter' function and presort and coalesce the writes before actually executing the pwrite syscall.
One more optimization note. Allocating a bigstring is a rather expensive operation, so you may consider to pre allocate one big bigstring and serve small chunks from it. This will create a limited resource, so your clients will wait until other clients will finish their writes and release their chunks. As a result, you will have a throttled system with a limited memory footprint.
1)Ideally we should use pwrite though Janestreet provides only pwrite_assume_fd_is_nonblocking function, that doesn't release OCaml runtime when a call to the system pwrite is done, and will actually block the whole system. So we need to use a combination of a seek and write. The latter will release the OCaml runtime so that the rest of the program can continue. (Also, given their definition of nonblocking fd, this function doesn't really make much sense, as only sockets and FIFO are considered non-blocking, and as far as I know, they do not support the seek operation. I will file an issue on their bug tracker.

printf messages show up after about 10 of them are sent [duplicate]

Why does printf not flush after the call unless a newline is in the format string? Is this POSIX behavior? How might I have printf immediately flush every time?
The stdout stream is line buffered by default, so will only display what's in the buffer after it reaches a newline (or when it's told to). You have a few options to print immediately:
Print to stderrinstead using fprintf (stderr is unbuffered by default):
fprintf(stderr, "I will be printed immediately");
Flush stdout whenever you need it to using fflush:
printf("Buffered, will be flushed");
fflush(stdout); // Will now print everything in the stdout buffer
Disable buffering on stdout by using setbuf:
setbuf(stdout, NULL);
Or use the more flexible setvbuf:
setvbuf(stdout, NULL, _IONBF, 0);
No, it's not POSIX behaviour, it's ISO behaviour (well, it is POSIX behaviour but only insofar as they conform to ISO).
Standard output is line buffered if it can be detected to refer to an interactive device, otherwise it's fully buffered. So there are situations where printf won't flush, even if it gets a newline to send out, such as:
myprog >myfile.txt
This makes sense for efficiency since, if you're interacting with a user, they probably want to see every line. If you're sending the output to a file, it's most likely that there's not a user at the other end (though not impossible, they could be tailing the file). Now you could argue that the user wants to see every character but there are two problems with that.
The first is that it's not very efficient. The second is that the original ANSI C mandate was to primarily codify existing behaviour, rather than invent new behaviour, and those design decisions were made long before ANSI started the process. Even ISO nowadays treads very carefully when changing existing rules in the standards.
As to how to deal with that, if you fflush (stdout) after every output call that you want to see immediately, that will solve the problem.
Alternatively, you can use setvbuf before operating on stdout, to set it to unbuffered and you won't have to worry about adding all those fflush lines to your code:
setvbuf (stdout, NULL, _IONBF, BUFSIZ);
Just keep in mind that may affect performance quite a bit if you are sending the output to a file. Also keep in mind that support for this is implementation-defined, not guaranteed by the standard.
ISO C99 section 7.19.3/3 is the relevant bit:
When a stream is unbuffered, characters are intended to appear from the source or at the destination as soon as possible. Otherwise characters may be accumulated and transmitted to or from the host environment as a block.
When a stream is fully buffered, characters are intended to be transmitted to or from the host environment as a block when a buffer is filled.
When a stream is line buffered, characters are intended to be transmitted to or from the host environment as a block when a new-line character is encountered.
Furthermore, characters are intended to be transmitted as a block to the host environment when a buffer is filled, when input is requested on an unbuffered stream, or when input is requested on a line buffered stream that requires the transmission of characters from the host environment.
Support for these characteristics is implementation-defined, and may be affected via the setbuf and setvbuf functions.
It's probably like that because of efficiency and because if you have multiple programs writing to a single TTY, this way you don't get characters on a line interlaced. So if program A and B are outputting, you'll usually get:
program A output
program B output
program B output
program A output
program B output
This stinks, but it's better than
proprogrgraam m AB ououtputputt
prproogrgram amB A ououtputtput
program B output
Note that it isn't even guaranteed to flush on a newline, so you should flush explicitly if flushing matters to you.
To immediately flush call fflush(stdout) or fflush(NULL) (NULL means flush everything).
stdout is buffered, so will only output after a newline is printed.
To get immediate output, either:
Print to stderr.
Make stdout unbuffered.
Note: Microsoft runtime libraries do not support line buffering, so printf("will print immediately to terminal"):
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setvbuf
by default, stdout is line buffered, stderr is none buffered and file is completely buffered.
You can fprintf to stderr, which is unbuffered, instead. Or you can flush stdout when you want to. Or you can set stdout to unbuffered.
Use setbuf(stdout, NULL); to disable buffering.
There are generally 2 levels of buffering-
1. Kernel buffer Cache (makes read/write faster)
2. Buffering in I/O library (reduces no. of system calls)
Let's take example of fprintf and write().
When you call fprintf(), it doesn't wirte directly to the file. It first goes to stdio buffer in the program's memory. From there it is written to the kernel buffer cache by using write system call. So one way to skip I/O buffer is directly using write(). Other ways are by using setbuff(stream,NULL). This sets the buffering mode to no buffering and data is directly written to kernel buffer.
To forcefully make the data to be shifted to kernel buffer, we can use "\n", which in case of default buffering mode of 'line buffering', will flush I/O buffer.
Or we can use fflush(FILE *stream).
Now we are in kernel buffer. Kernel(/OS) wants to minimise disk access time and hence it reads/writes only blocks of disk. So when a read() is issued, which is a system call and can be invoked directly or through fscanf(), kernel reads the disk block from disk and stores it in a buffer. After that data is copied from here to user space.
Similarly that fprintf() data recieved from I/O buffer is written to the disk by the kernel. This makes read() write() faster.
Now to force the kernel to initiate a write(), after which data transfer is controlled by hardware controllers, there are also some ways. We can use O_SYNC or similar flags during write calls. Or we could use other functions like fsync(),fdatasync(),sync() to make the kernel initiate writes as soon as data is available in the kernel buffer.

I don't understand what exactly does the function bytesToWrite() Qt

I searched for bytesToWrite in doc and that what I found "For buffered devices, this function returns the number of bytes waiting to be written. For devices with no buffer, this function returns 0."
First what does mean buffered devices. And can anyone please explain to me what exactly this function does and where or how can I use it.
Many IO devices are buffered, which means that data isn't sent straight away, but it is accumulated to be sent in bulk when there is a sufficient amount.
This is done essentially to have better performance, as sending data normally has some fixed overhead (at the very least the syscall overhead), which is well amortized when sending data in bulk, but would have to be paid for each write if no buffering would be used.
(notice that here we are only talking about QIODevice buffers, normally there are also all kinds of kernel-mode buffers and buffers internal to hardware devices themselves)
bytesToWrite tells you how much stuff is in the QIODevice write buffer, i.e. how many bytes you wrote that are waiting to be actually written (as in, given to the OS to write).
I never actually had to use that member, but I suppose it could be useful e.g. to in a producer-consumer scenario (=if the write buffer is lower than something, then you have to actually calculate the next chunk of data to send), to manually handle buffering in some places or even just for debugging/logging purposes.
it's actually very usefull when you're using an asynchronous API.
you can for example, use it inside a bytesWritten() slot to tell wether the buffer is empty and the data has been fully written or not.

May MPI_SEND use my input data array as Buffer?

As we know, there is a thing called an MPI send buffer used during a send action.
And for following code:
MPI_Isend(data, ..., req);
...
MPI_Wait(req, &status)
Is it safe to use data between MPI_Isend and MPI_Wait ?
That means, will MPI_Isend use data as the internal send buffer?
And more, if I don't use data anymore, could I indicate MPI to use data as the send buffer rather than waste time to copy data?
BTW, I've heard of MPI_Bsend, but I don't think it could save memory and time in this case.
MPI provides two kinds of operations: blocking and non-blocking. The difference between the two is when it is safe to reuse the data buffer passed to the MPI function.
When a blocking call like MPI_Send returns, the buffer is no longer needed by the MPI library and can be safely reused. On the other hand, non-blocking calls only initiate the corresponding operation and let it continue asynchronously. Only after a successful call to a routine like MPI_Wait or after a positive test result from MPI_Test one can safely reuse the buffer.
As for how the library utilises the user buffer, that is very implementation-specific. Shorter messages are usually copied to internal (for the MPI library) buffers for performance reasons. Longer messages are usually directly read from the user buffer and sent to the network, therefore the buffer will be in use by MPI until the whole message has been sent.
It is absolutely not save to use data between MPI_Isend and MPI_Wait.
Between MPI_Isend and MPI_Wait you actually don't know when data can be reused. Only after MPI_Wait you can be sure that data is sent and you can reuse it.
If you don't use data anymore you should call MPI_Wait at the end of your program.

Handling messages over TCP

I'm trying to send and receive messages over TCP using a size of each message appended before the it starts.
Say, First three bytes will be the length and later will the message:
As a small example:
005Hello003Hey002Hi
I'll be using this method to do large messages, but because the buffer size will be a constant integer say, 200 Bytes. So, there is a chance that a complete message may not be received e.g. instead of 005Hello I get 005He nor a complete length may be received e.g. I get 2 bytes of length in message.
So, to get over this problem, I'll need to wait for next message and append it to the incomplete message etc.
My question is: Am I the only one having these difficulties to appending messages to each other, appending lengths etc.. to make them complete Or is this really usually how we need to handle the individual messages on TCP? Or, if there is a better way?
What you're seeing is 100% normal TCP behavior. It is completely expected that you'll loop receiving bytes until you get a "message" (whatever that means in your context). It's part of the work of going from a low-level TCP byte stream to a higher-level concept like "message".
And "usr" is right above. There are higher level abstractions that you may have available. If they're appropriate, use them to avoid reinventing the wheel.
So, there is a chance that a complete message may not be received e.g.
instead of 005Hello I get 005He nor a complete length may be received
e.g. I get 2 bytes of length in message.
Yes. TCP gives you at least one byte per read, that's all.
Or is this really usually how we need to handle the individual messages on TCP? Or, if there is a better way?
Try using higher-level primitives. For example, BinaryReader allows you to read exactly N bytes (it will internally loop). StreamReader lets you forget this peculiarity of TCP as well.
Even better is using even more higher-level abstractions such as HTTP (request/response pattern - very common), protobuf as a serialization format or web services which automate pretty much all transport layer concerns.
Don't do TCP if you can avoid it.
So, to get over this problem, I'll need to wait for next message and append it to the incomplete message etc.
Yep, this is how things are done at the socket level code. For each socket you would like to allocate a buffer of at least the same size as kernel socket receive buffer, so that you can read the entire kernel buffer in one read/recv/resvmsg call. Reading from the socket in a loop may starve other sockets in your application (this is why they changed epoll to be level-triggered by default, because the default edge-triggered forced application writers to read in a loop).
The first incomplete message is always kept in the beginning of the buffer, reading the socket continues at the next free byte in the buffer, so that it automatically appends to the incomplete message.
Once reading is done, normally a higher level callback is called with the pointers to all read data in the buffer. That callback should consume all complete messages in the buffer and return how many bytes it has consumed (may be 0 if there is only an incomplete message). The buffer management code should memmove the remaining unconsumed bytes (if any) to the beginning of the buffer. Alternatively, a ring-buffer can be used to avoid moving those unconsumed bytes, but in this case the higher level code should be able to cope with ring-buffer iterators, which it may be not ready to. Hence keeping the buffer linear may be the most convenient option.

Resources