printf messages show up after about 10 of them are sent [duplicate] - serial-port

Why does printf not flush after the call unless a newline is in the format string? Is this POSIX behavior? How might I have printf immediately flush every time?

The stdout stream is line buffered by default, so will only display what's in the buffer after it reaches a newline (or when it's told to). You have a few options to print immediately:
Print to stderr instead using fprintf (stderr is unbuffered by default):
fprintf(stderr, "I will be printed immediately");
Flush stdout whenever you need it to using fflush:
printf("Buffered, will be flushed");
fflush(stdout); // Will now print everything in the stdout buffer
Disable buffering on stdout by using setbuf:
setbuf(stdout, NULL);
Or use the more flexible setvbuf:
setvbuf(stdout, NULL, _IONBF, 0);

No, it's not POSIX behaviour, it's ISO behaviour (well, it is POSIX behaviour but only insofar as they conform to ISO).
Standard output is line buffered if it can be detected to refer to an interactive device, otherwise it's fully buffered. So there are situations where printf won't flush, even if it gets a newline to send out, such as:
myprog >myfile.txt
This makes sense for efficiency since, if you're interacting with a user, they probably want to see every line. If you're sending the output to a file, it's most likely that there's not a user at the other end (though not impossible, they could be tailing the file). Now you could argue that the user wants to see every character but there are two problems with that.
The first is that it's not very efficient. The second is that the original ANSI C mandate was to primarily codify existing behaviour, rather than invent new behaviour, and those design decisions were made long before ANSI started the process. Even ISO nowadays treads very carefully when changing existing rules in the standards.
As to how to deal with that, if you fflush (stdout) after every output call that you want to see immediately, that will solve the problem.
Alternatively, you can use setvbuf before operating on stdout, to set it to unbuffered and you won't have to worry about adding all those fflush lines to your code:
setvbuf (stdout, NULL, _IONBF, BUFSIZ);
Just keep in mind that may affect performance quite a bit if you are sending the output to a file. Also keep in mind that support for this is implementation-defined, not guaranteed by the standard.
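If fully unbuffered output is too costly, line buffering is a possible middle ground. The following is a minimal sketch, not part of the original answer: use_line_buffering is a hypothetical helper name, and as the standard says, whether the request is honoured is implementation-defined.

```c
#include <stdio.h>

/* Hypothetical helper: ask for line buffering on a stream, so that output is
 * handed to the host environment each time a newline is written.
 * It must be called before any other operation on the stream.
 * Returns 0 on success, nonzero if the request could not be honoured. */
int use_line_buffering(FILE *stream)
{
    /* A NULL buffer lets the library allocate one of the requested size. */
    return setvbuf(stream, NULL, _IOLBF, BUFSIZ);
}
```

Calling use_line_buffering(stdout) at the very start of main() would be the typical use, for example when the output is redirected to a file that someone is tailing.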
ISO C99 section 7.19.3/3 is the relevant bit:
When a stream is unbuffered, characters are intended to appear from the source or at the destination as soon as possible. Otherwise characters may be accumulated and transmitted to or from the host environment as a block.
When a stream is fully buffered, characters are intended to be transmitted to or from the host environment as a block when a buffer is filled.
When a stream is line buffered, characters are intended to be transmitted to or from the host environment as a block when a new-line character is encountered.
Furthermore, characters are intended to be transmitted as a block to the host environment when a buffer is filled, when input is requested on an unbuffered stream, or when input is requested on a line buffered stream that requires the transmission of characters from the host environment.
Support for these characteristics is implementation-defined, and may be affected via the setbuf and setvbuf functions.

It's probably like that because of efficiency and because if you have multiple programs writing to a single TTY, this way you don't get characters on a line interlaced. So if program A and B are outputting, you'll usually get:
program A output
program B output
program B output
program A output
program B output
This stinks, but it's better than
proprogrgraam m AB ououtputputt
prproogrgram amB A ououtputtput
program B output
Note that it isn't even guaranteed to flush on a newline, so you should flush explicitly if flushing matters to you.

To immediately flush call fflush(stdout) or fflush(NULL) (NULL means flush everything).

stdout is buffered, so will only output after a newline is printed.
To get immediate output, either:
Print to stderr.
Make stdout unbuffered.

Note: Microsoft runtime libraries do not support line buffering, so printf("will print immediately to terminal"):
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setvbuf

By default, stdout is line buffered, stderr is unbuffered, and files are fully buffered.

You can fprintf to stderr, which is unbuffered, instead. Or you can flush stdout when you want to. Or you can set stdout to unbuffered.

Use setbuf(stdout, NULL); to disable buffering.

There are generally 2 levels of buffering-
1. Kernel buffer Cache (makes read/write faster)
2. Buffering in I/O library (reduces no. of system calls)
Let's take example of fprintf and write().
When you call fprintf(), it doesn't write directly to the file. The data first goes to the stdio buffer in the program's memory. From there it is written to the kernel buffer cache by the write system call. So one way to skip the I/O buffer is to call write() directly. Another way is setbuf(stream, NULL), which sets the buffering mode to no buffering, so data is written directly to the kernel buffer.
To force the data into the kernel buffer, we can print "\n", which flushes the I/O buffer when the stream is in line-buffered mode.
Or we can use fflush(FILE *stream).
Now we are in kernel buffer. Kernel(/OS) wants to minimise disk access time and hence it reads/writes only blocks of disk. So when a read() is issued, which is a system call and can be invoked directly or through fscanf(), kernel reads the disk block from disk and stores it in a buffer. After that data is copied from here to user space.
Similarly, the fprintf() data received from the I/O buffer is written to the disk by the kernel. This makes read() and write() faster.
There are also ways to force the kernel to start writing to the disk, after which the data transfer is controlled by hardware controllers. We can open the file with O_SYNC or a similar flag, so that each write() waits for the data to reach the device. Or we can call functions like fsync(), fdatasync(), or sync() to make the kernel write out data that is sitting in the kernel buffer.
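The two levels described above can be crossed explicitly, one call per level. This is a minimal sketch for a POSIX system. flush_to_disk is a hypothetical helper name, not a library function.

```c
#include <stdio.h>
#include <unistd.h>

/* Push everything written to fp so far through both buffering levels.
 * Returns 0 on success, -1 on failure (errno is set by the failing call). */
int flush_to_disk(FILE *fp)
{
    if (fflush(fp) != 0)            /* stdio buffer -> kernel buffer cache */
        return -1;
    if (fsync(fileno(fp)) != 0)     /* kernel buffer cache -> storage device */
        return -1;
    return 0;
}
```

fflush() alone is enough when another process only needs to see the data. The fsync() step matters when the data must survive a crash or power loss.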

Related

I don't understand what exactly does the function bytesToWrite() Qt

I searched for bytesToWrite in the docs and this is what I found: "For buffered devices, this function returns the number of bytes waiting to be written. For devices with no buffer, this function returns 0."
First, what does "buffered device" mean? And can anyone please explain what exactly this function does and where or how I can use it?
Many IO devices are buffered, which means that data isn't sent straight away, but it is accumulated to be sent in bulk when there is a sufficient amount.
This is done essentially to have better performance, as sending data normally has some fixed overhead (at the very least the syscall overhead), which is well amortized when sending data in bulk, but would have to be paid for each write if no buffering would be used.
(notice that here we are only talking about QIODevice buffers, normally there are also all kinds of kernel-mode buffers and buffers internal to hardware devices themselves)
bytesToWrite tells you how much stuff is in the QIODevice write buffer, i.e. how many bytes you wrote that are waiting to be actually written (as in, given to the OS to write).
I never actually had to use that member, but I suppose it could be useful e.g. in a producer-consumer scenario (if the write buffer falls below some threshold, you calculate the next chunk of data to send), to manually handle buffering in some places, or even just for debugging/logging purposes.
It's actually very useful when you're using an asynchronous API.
You can, for example, use it inside a bytesWritten() slot to tell whether the buffer is empty and the data has been fully written.

Handling messages over TCP

I'm trying to send and receive messages over TCP, with the size of each message prepended to it.
Say the first three bytes are the length and the message follows:
As a small example:
005Hello003Hey002Hi
I'll be using this method to do large messages, but because the buffer size will be a constant integer say, 200 Bytes. So, there is a chance that a complete message may not be received e.g. instead of 005Hello I get 005He nor a complete length may be received e.g. I get 2 bytes of length in message.
So, to get over this problem, I'll need to wait for next message and append it to the incomplete message etc.
My question is: Am I the only one having these difficulties to appending messages to each other, appending lengths etc.. to make them complete Or is this really usually how we need to handle the individual messages on TCP? Or, if there is a better way?
What you're seeing is 100% normal TCP behavior. It is completely expected that you'll loop receiving bytes until you get a "message" (whatever that means in your context). It's part of the work of going from a low-level TCP byte stream to a higher-level concept like "message".
And "usr" is right above. There are higher level abstractions that you may have available. If they're appropriate, use them to avoid reinventing the wheel.
So, there is a chance that a complete message may not be received e.g. instead of 005Hello I get 005He nor a complete length may be received e.g. I get 2 bytes of length in message.
Yes. TCP gives you at least one byte per read, that's all.
Or is this really usually how we need to handle the individual messages on TCP? Or, if there is a better way?
Try using higher-level primitives. For example, BinaryReader allows you to read exactly N bytes (it will internally loop). StreamReader lets you forget this peculiarity of TCP as well.
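For readers working directly with file descriptors, the internal loop that such "read exactly N bytes" helpers perform looks roughly like this. It is a minimal POSIX sketch. read_full is a hypothetical name, and the same loop works on a socket, a pipe, or a file.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Keep calling read() until len bytes have arrived.
 * Returns the number of bytes read, which is less than len only if the
 * peer closed the connection first, or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t len)
{
    size_t got = 0;

    while (got < len) {
        ssize_t n = read(fd, (char *)buf + got, len - got);
        if (n == 0)                 /* end of stream */
            break;
        if (n < 0) {
            if (errno == EINTR)     /* interrupted by a signal: retry */
                continue;
            return -1;
        }
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

With the framing from the question, a receiver would call read_full(fd, header, 3), parse the length, and then call read_full again for that many payload bytes.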
Even better is using even more higher-level abstractions such as HTTP (request/response pattern - very common), protobuf as a serialization format or web services which automate pretty much all transport layer concerns.
Don't do TCP if you can avoid it.
So, to get over this problem, I'll need to wait for next message and append it to the incomplete message etc.
Yep, this is how things are done at the socket level code. For each socket you would like to allocate a buffer of at least the same size as kernel socket receive buffer, so that you can read the entire kernel buffer in one read/recv/recvmsg call. Reading from the socket in a loop may starve other sockets in your application (this is why they changed epoll to be level-triggered by default, because the default edge-triggered forced application writers to read in a loop).
The first incomplete message is always kept in the beginning of the buffer, reading the socket continues at the next free byte in the buffer, so that it automatically appends to the incomplete message.
Once reading is done, normally a higher level callback is called with the pointers to all read data in the buffer. That callback should consume all complete messages in the buffer and return how many bytes it has consumed (may be 0 if there is only an incomplete message). The buffer management code should memmove the remaining unconsumed bytes (if any) to the beginning of the buffer. Alternatively, a ring-buffer can be used to avoid moving those unconsumed bytes, but in this case the higher level code should be able to cope with ring-buffer iterators, which it may be not ready to. Hence keeping the buffer linear may be the most convenient option.
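The linear-buffer scheme described above can be sketched as follows, assuming the question's framing of three ASCII digits followed by the payload. All names here (consume_messages, feed, tally_message) are hypothetical, and the bytes passed to feed stand in for whatever a recv call returned.

```c
#include <stddef.h>
#include <string.h>

#define LEN_DIGITS 3    /* assumed framing: 3 ASCII digits, then the payload */

/* Callback invoked once per complete message. */
typedef void (*message_fn)(const char *msg, size_t len, void *ctx);

/* Example callback: counts messages and total payload bytes. */
struct tally { size_t messages; size_t bytes; };

void tally_message(const char *msg, size_t len, void *ctx)
{
    struct tally *t = ctx;
    (void)msg;
    t->messages++;
    t->bytes += len;
}

/* Deliver every complete message in buf[0..used) and return the number of
 * bytes consumed (0 if only an incomplete message is present).
 * Returns (size_t)-1 if a length prefix contains a non-digit. */
size_t consume_messages(const char *buf, size_t used, message_fn on_msg, void *ctx)
{
    size_t pos = 0;

    while (used - pos >= LEN_DIGITS) {
        size_t len = 0;
        for (int i = 0; i < LEN_DIGITS; i++) {
            char c = buf[pos + i];
            if (c < '0' || c > '9')
                return (size_t)-1;
            len = len * 10 + (size_t)(c - '0');
        }
        if (used - pos - LEN_DIGITS < len)
            break;                          /* payload not complete yet */
        on_msg(buf + pos + LEN_DIGITS, len, ctx);
        pos += LEN_DIGITS + len;
    }
    return pos;
}

/* Append newly received bytes, deliver complete messages, then move any
 * incomplete tail to the front of the buffer.
 * Returns the new fill level, or (size_t)-1 on overflow or a bad prefix. */
size_t feed(char *buf, size_t cap, size_t used,
            const char *data, size_t n, message_fn on_msg, void *ctx)
{
    size_t done;

    if (n > cap - used)
        return (size_t)-1;
    memcpy(buf + used, data, n);
    used += n;
    done = consume_messages(buf, used, on_msg, ctx);
    if (done == (size_t)-1)
        return (size_t)-1;
    memmove(buf, buf + done, used - done);
    return used - done;
}
```

In a real receiver the memcpy step would be replaced by reading straight into buf + used, as the answer suggests. It is kept separate here only so the sketch can be exercised without a socket.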

UNIX buffered vs unbuffered I/O

What is the difference between Unbuffered I/O and standard I/O? I know that using read(), write(), close() are unbuffered IO. Printf and gets are buffered IO. I also know that it is better to use buffered IO for big transactions. I just dont know the reason why. And what does the term "buffered" mean in this context?
Unbuffered I/O simply means that no user-space buffer is used while reading or writing. Each read() or write() is a system call, so a program that reads or writes one character at a time pays that overhead for every character, which can cause a huge performance degradation. For large amounts of data, the high-level (buffered) I/O functions are therefore generally preferred. Buffered simply means that the library deals with a block of characters at a time rather than a single character, which is why it is sometimes also known as block I/O. Generally in Unix, the high-level read/write functions fetch or store data a block at a time, keep it in a buffer, and serve the requested amount of data from that buffer.
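To make the cost difference concrete, here is a minimal sketch of what a buffered writer does internally. struct outbuf and its functions are hypothetical stand-ins for the stdio machinery, with a counter added so the number of write() calls can be observed.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical miniature output buffer. */
struct outbuf {
    int fd;                     /* destination file descriptor */
    size_t used;                /* bytes currently held in data[] */
    unsigned long syscalls;     /* number of write() calls issued so far */
    char data[4096];
};

/* Hand the accumulated bytes to the kernel.
 * Returns 0 on success, -1 on error. */
int outbuf_flush(struct outbuf *b)
{
    size_t off = 0;

    while (off < b->used) {
        ssize_t n = write(b->fd, b->data + off, b->used - off);
        b->syscalls++;
        if (n < 0)
            return -1;
        off += (size_t)n;       /* a short write just means another pass */
    }
    b->used = 0;
    return 0;
}

/* Append one byte, calling write() only when the buffer is full.
 * Returns 0 on success, -1 on error. */
int outbuf_putc(struct outbuf *b, char c)
{
    if (b->used == sizeof b->data && outbuf_flush(b) != 0)
        return -1;
    b->data[b->used++] = c;
    return 0;
}
```

Writing 10,000 bytes one at a time through outbuf_putc and then flushing issues three write() calls when none of them is short, whereas calling write() once per byte would issue 10,000.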

Techniques for infinitely long pipes

There are two really simple ways to let one program send a stream of data to another:
Unix pipe, or TCP socket, or something like that. This requires constant attention by the consumer program, or the producer program will block. Even after increasing the buffers from their typically tiny defaults, it's still a huge problem.
Plain files - producer program appends with O_APPEND, consumer just reads whatever new data became available at its convenience. This doesn't require any synchronization (as long as diskspace is available), but Unix files only support truncating at the end, not at beginning, so it will fill up disk until both programs quit.
Is there a simple way to have it both ways, with data stored on disk until it gets read, and then freed? Obviously programs could communicate via database server or something like that, and not have this problem, but I'm looking for something that integrates well with normal Unix piping.
A relatively simple hand-rolled solution.
You could have the producer create files and keep writing until it gets to a certain size/number of record, whatever suits your application. The producer then closes the file and starts a new one with an agreed naming algorithm.
The consumer reads new records from a file then when it gets to the agreed maximum size closes and unlinks it and then opens the next one.
If your data can be split into blocks or transactions of some sort, you can use the file method for this with a serial number. The data producer would store the first megabyte of data in outfile.1, the next in outfile.2 etc. The consumer can read the files in order and delete them when read. Thus you get something like your second method, with cleanup along the way.
You should probably wrap all this in a library, so that from the applications point of view this is a pipe of some sort.
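A library along these lines might start from two helpers like the ones below. This is only a sketch: the "base.N" naming follows the outfile.1, outfile.2 example, while publishing through a temporary name and rename() is an added assumption, so that the consumer never sees a half-written chunk. The names publish_chunk and consume_chunk are hypothetical.

```c
#include <stdio.h>

/* Build the agreed chunk name, e.g. "outfile.1". */
static void chunk_name(char *out, size_t cap, const char *base, unsigned long seq)
{
    snprintf(out, cap, "%s.%lu", base, seq);
}

/* Producer side: write one complete chunk, then make it visible.
 * Returns 0 on success, -1 on failure. */
int publish_chunk(const char *base, unsigned long seq, const void *data, size_t len)
{
    char final[512], tmp[520];
    FILE *fp;
    int ok;

    chunk_name(final, sizeof final, base, seq);
    snprintf(tmp, sizeof tmp, "%s.tmp", final);
    fp = fopen(tmp, "wb");
    if (fp == NULL)
        return -1;
    ok = fwrite(data, 1, len, fp) == len;
    if (fclose(fp) != 0)
        ok = 0;
    if (!ok || rename(tmp, final) != 0) {
        remove(tmp);
        return -1;
    }
    return 0;
}

/* Consumer side: read chunk number seq into buf and delete the file.
 * cap must be at least the agreed maximum chunk size.
 * Returns the number of bytes read, or -1 if the chunk is not there yet. */
long consume_chunk(const char *base, unsigned long seq, void *buf, size_t cap)
{
    char name[512];
    FILE *fp;
    size_t n;

    chunk_name(name, sizeof name, base, seq);
    fp = fopen(name, "rb");
    if (fp == NULL)
        return -1;
    n = fread(buf, 1, cap, fp);
    fclose(fp);
    remove(name);               /* free the disk space once the data is read */
    return (long)n;
}
```

The consumer would poll consume_chunk with an increasing sequence number and sleep or wait for a notification whenever it returns -1.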
You should read some documentation on socat. You can use it to bridge the gap between tcp sockets, fifo files, pipes, stdio and others.
If you're feeling lazy, there's some nice examples of useful commands.
I'm not aware of anything, but it shouldn't be too hard to write a small utility that takes a directory as an argument (or uses $TMPDIR); and, uses select/poll to multiplex between reading from stdin, paging to a series of temporary files, and writing to stdout.

Piping as interprocess communication

I am interested in writing separate program modules that run as independent threads that I could hook together with pipes. The motivation would be that I could write and test each module completely independently, perhaps even write them in different languages, or run the different modules on different machines. There are a wide variety of possibilities here. I have used piping for a while, but I am unfamiliar with the nuances of its behaviour.
It seems like the receiving end will block waiting for input, which I would expect, but will the sending end block sometimes waiting for someone to read from the stream?
If I write an eof to the stream can I keep continue writing to that stream until I close it?
Are there differences in the behaviour named and unnamed pipes?
Does it matter which end of the pipe I open first with named pipes?
Is the behaviour of pipes consistent between different Linux systems?
Does the behaviour of the pipes depend on the shell I'm using or the way I've configured it?
Are there any other questions I should be asking or issues I should be aware of if I want to use pipes in this way?
Wow, that's a lot of questions. Let's see if I can cover everything...
It seems like the receiving end will block waiting for input, which I would expect
You expect correctly: an actual 'read' call will block until something is there. However, I believe there are some C functions that will allow you to 'peek' at what (and how much) is waiting in the pipe. Unfortunately, I don't remember if this blocks as well.
will the sending end block sometimes waiting for someone to read from the stream
No, sending should never block. Think of the ramifications if this were a pipe across the network to another computer. Would you want to wait (through possibly high latency) for the other computer to respond that it received it? Now this is a different case if the reader handle of the destination has been closed. In this case, you should have some error checking to handle that.
If I write an eof to the stream can I keep continue writing to that stream until I close it
I would think this depends on what language you're using and its implementation of pipes. In C, I'd say no. In a linux shell, I'd say yes. Someone else with more experience would have to answer that.
Are there differences in the behaviour named and unnamed pipes?
As far as I know, yes. However, I don't have much experience with named vs unnamed. I believe the difference is:
Single direction vs Bidirectional communication
Reading AND writing to the "in" and "out" streams of a thread
Does it matter which end of the pipe I open first with named pipes?
Generally no, but you could run into problems on initialization trying to create and link the threads with each other. You'd need to have one main thread that creates all the sub-threads and syncs their respective pipes with each other.
Is the behaviour of pipes consistent between different linux systems?
Again, this depends on what language, but generally yes. Ever heard of POSIX? That's the standard (at least for linux, Windows does its own thing).
Does the behaviour of the pipes depend on the shell I'm using or the way I've configured it?
This is getting into a little more of a gray area. The answer should be no since the shell should essentially be making system calls. However, everything up until that point is up for grabs.
Are there any other questions I should be asking
The questions you've asked show that you have a decent understanding of the system. Keep researching and focus on what level you're going to be working on (shell, C, so on). You'll learn a lot more by just trying it though.
This is all based on a UNIX-like system; I'm not familiar with the specific behavior of recent versions of Windows.
It seems like the receiving end will block waiting for input, which I would expect, but will the sending end block sometimes waiting for someone to read from the stream?
Yes, although on a modern machine it may not happen often. The pipe has an intermediate buffer that can potentially fill up. If it does, the write side of the pipe will indeed block. But if you think about it, there aren't a lot of files that are big enough to risk this.
If I write an eof to the stream can I keep continue writing to that stream until I close it?
Um, you mean like a CTRL-D, 0x04? Sure, as long as the stream is set up that way. Viz.
506 # cat | od -c
abc
^D
efg
0000000 a b c \n 004 \n e f g \n
0000012
Are there differences in the behaviour named and unnamed pipes?
Yes, but they're subtle and implementation dependent. The biggest one is that you can write to a named pipe before the other end is running; with unnamed pipes, the file descriptors get shared during the fork/exec process, so there's no way to access the transient buffer without the processes being up.
Does it matter which end of the pipe I open first with named pipes?
Nope.
Is the behaviour of pipes consistent between different linux systems?
Within reason, yes. Buffer sizes etc may vary.
Does the behaviour of the pipes depend on the shell I'm using or the way I've configured it?
No. When you create a pipe, under the covers what happens is your parent process (the shell) creates a pipe which has a pair of file descriptors, then does a fork exec like this pseudocode:
Parent:
    create pipe, returning two file descriptors, call them fd[0] and fd[1]
    fork write-side process
    fork read-side process
Write-side:
    close fd[0]
    connect fd[1] to stdout
    exec writer program
Read-side:
    close fd[1]
    connect fd[0] to stdin
    exec reader program
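The pseudocode above translates to C roughly as follows. This is a sketch of the general technique rather than any particular shell's implementation. run_pipeline is a hypothetical name and error handling is kept to a minimum.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the equivalent of "writer | reader". Each argument is a
 * NULL-terminated argv array. Returns the reader's exit status,
 * or -1 if the pipeline could not be set up. */
int run_pipeline(char *const writer[], char *const reader[])
{
    int fd[2];
    int status = 0;
    pid_t w, r;

    if (pipe(fd) != 0)
        return -1;

    w = fork();
    if (w < 0)
        return -1;
    if (w == 0) {                       /* write-side child */
        close(fd[0]);
        dup2(fd[1], STDOUT_FILENO);     /* connect fd[1] to stdout */
        close(fd[1]);
        execvp(writer[0], writer);
        _exit(127);                     /* reached only if exec failed */
    }

    r = fork();
    if (r == 0) {                       /* read-side child */
        close(fd[1]);
        dup2(fd[0], STDIN_FILENO);      /* connect fd[0] to stdin */
        close(fd[0]);
        execvp(reader[0], reader);
        _exit(127);
    }

    /* The parent must close both ends; otherwise the reader never sees EOF. */
    close(fd[0]);
    close(fd[1]);
    waitpid(w, NULL, 0);
    if (r < 0 || waitpid(r, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing {"echo", "hello", NULL} and {"grep", "-q", "hello", NULL} behaves like echo hello | grep -q hello and returns grep's exit status.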
Are there any other questions I should be asking or issues I should be aware of if I want to use pipes in this way?
Is everything you want to do really going to lay out in a line like this? If not, you might want to think about a more general architecture. But the insight that having lots of separate processes interacting through the "narrow" interface of a pipe is desirable is a good one.
[Updated: I had the file descriptor indices reversed at first. They're correct now, see man 2 pipe.]
As Dashogun and Charlie Martin noted, this is a big question. Some parts of their answers are inaccurate, so I'm going to answer too.
I am interested in writing separate program modules that run as independent threads that I could hook together with pipes.
Be wary of trying to use pipes as a communication mechanism between threads of a single process. Because you would have both read and write ends of the pipe open in a single process, you would never get the EOF (zero bytes) indication.
If you were really referring to processes, then this is the basis of the classic Unix approach to building tools. Many of the standard Unix programs are filters that read from standard input, transform it somehow, and write the result to standard output. For example, tr, sort, grep, and cat are all filters, to name but a few. This is an excellent paradigm to follow when the data you are manipulating permits it. Not all data manipulations are conducive to this approach, but there are many that are.
The motivation would be that I could write and test each module completely independently, perhaps even write them in different languages, or run the different modules on different machines.
Good points. Be aware that there isn't really a pipe mechanism between machines, though you can get close to it with programs such as rsh or (better) ssh. However, internally, such programs may read local data from pipes and send that data to remote machines, but they communicate between machines over sockets, not using pipes.
There are a wide variety of possibilities here. I have used piping for a while, but I am unfamiliar with the nuances of its behaviour.
OK; asking questions is one (good) way to learn. Experimenting is another, of course.
It seems like the receiving end will block waiting for input, which I would expect, but will the sending end block sometimes waiting for someone to read from the stream?
Yes. There is a limit to the size of a pipe buffer. Classically, this was quite small - 4096 or 5120 were common values. You may find that modern Linux uses a larger value. You can use fpathconf() and _PC_PIPE_BUF to find out the size of a pipe buffer. POSIX only requires the buffer to be 512 (that is, _POSIX_PIPE_BUF is 512).
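The query mentioned above is a one-liner. In this minimal sketch, pipe_buf_limit is a hypothetical wrapper name. Note that POSIX defines PIPE_BUF as the largest write that is guaranteed to be atomic, so the value reported here can be smaller than the total capacity a pipe is able to hold.

```c
#include <unistd.h>

/* Return the PIPE_BUF value that applies to the pipe or FIFO behind fd,
 * or -1 if the query fails or the limit is indeterminate. */
long pipe_buf_limit(int fd)
{
    return fpathconf(fd, _PC_PIPE_BUF);
}
```

Either end of a pipe created with pipe() can be passed as fd.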
If I write an eof to the stream can I keep continue writing to that stream until I close it?
Technically, there is no way to write EOF to a stream; you close the pipe descriptor to indicate EOF. If you are thinking of control-D or control-Z as an EOF character, then those are just regular characters as far as pipes are concerned - they only have an effect like EOF when typed at a terminal that is running in canonical mode (cooked, or normal).
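A small experiment makes the point. The sketch below pushes a payload through a pipe, closes the write end, and counts what the read end receives before read() returns 0. drain_until_eof is a hypothetical name, and the payload is assumed to be small enough to fit in the pipe without blocking.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns the number of bytes read before end-of-file, or -1 on error. */
long drain_until_eof(const char *payload, size_t len)
{
    int fd[2];
    char buf[256];
    long total = 0;
    ssize_t n;

    if (pipe(fd) != 0)
        return -1;
    if (write(fd[1], payload, len) != (ssize_t)len) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    close(fd[1]);       /* closing the last write end is what signals EOF */

    while ((n = read(fd[0], buf, sizeof buf)) > 0)
        total += n;     /* a 0x04 byte in the data is counted like any other */
    close(fd[0]);
    return n < 0 ? -1 : total;
}
```

Calling drain_until_eof("abc\004efg", 7) returns 7: the control-D byte travels through the pipe as ordinary data, and the reader only sees EOF once the descriptor is closed.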
Are there differences in the behaviour named and unnamed pipes?
Yes, and no. The biggest differences are that unnamed pipes must be set up by one process and can only be used by that process and children who share that process as a common ancestor. By contrast, named pipes can be used by previously unassociated processes. The next big difference is a consequence of the first; with an unnamed pipe, you get back two file descriptors from a single function (system) call to pipe(), but you open a FIFO or named pipe using the regular open() function. (Someone must create a FIFO with the mkfifo() call before you can open it; unnamed pipes do not need any such prior setup.) However, once you have a file descriptor open, there is precious little difference between a named pipe and an unnamed pipe.
Does it matter which end of the pipe I open first with named pipes?
No. The first process to open the FIFO will (normally) block until there's a process with the other end open. If you open it for reading and writing (unconventional but possible) then you won't be blocked; if you use the O_NONBLOCK flag, you won't be blocked.
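As an illustration of the O_NONBLOCK case, here is a minimal sketch for the reading side. open_fifo_reader is a hypothetical name and the 0600 permissions are an arbitrary choice.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create the FIFO at path if it does not exist yet, then open it for
 * reading without waiting for a writer to show up.
 * Returns a file descriptor, or -1 on error. */
int open_fifo_reader(const char *path)
{
    if (mkfifo(path, 0600) != 0 && errno != EEXIST)
        return -1;
    return open(path, O_RDONLY | O_NONBLOCK);
}
```

While no writer has the FIFO open, read() on the returned descriptor reports end-of-file (returns 0) instead of blocking. The write side is less forgiving: opening a FIFO with O_WRONLY | O_NONBLOCK fails with ENXIO when there is no reader.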
Is the behaviour of pipes consistent between different Linux systems?
Yes. I've not heard of or experienced any problems with pipes on any of the systems where I've used them.
Does the behaviour of the pipes depend on the shell I'm using or the way I've configured it?
No: pipes and FIFOs are independent of the shell you use.
Are there any other questions I should be asking or issues I should be aware of if I want to use pipes in this way?
Just remember that you must close the reading end of a pipe in the process that will be writing, and the writing end of the pipe in the process that will be reading. If you want bidirectional communication over pipes, use two separate pipes. If you create complicated plumbing arrangements, beware of deadlock - it is possible. A linear pipeline does not deadlock, however (though if the first process never closes its output, the downstream processes may wait indefinitely).
I observed both above and in comments to other answers that pipe buffers are classically limited to quite small sizes. @Charlie Martin counter-commented that some versions of Unix have dynamic pipe buffers and these can be quite large.
I'm not sure which ones he has in mind. I used the test program that follows on Solaris, AIX, HP-UX, MacOS X, Linux and Cygwin / Windows XP (results below):
#include <unistd.h>
#include <signal.h>
#include <stdio.h>
#include <fcntl.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
static const char *arg0;
/* Report a failed system call together with errno, then exit. */
static void err_syserr(const char *str)
{
    int errnum = errno;
    fprintf(stderr, "%s: %s - (%d) %s\n", arg0, str, errnum, strerror(errnum));
    exit(1);
}
int main(int argc, char **argv)
{
    int pd[2];
    pid_t kid;
    size_t i = 0;
    char buffer[2] = "a";
    int flags;
    (void)argc;
    arg0 = argv[0];
    if (pipe(pd) != 0)
        err_syserr("pipe() failed");
    if ((kid = fork()) < 0)
        err_syserr("fork() failed");
    else if (kid == 0)
    {
        /* Child: keep the read end open but never read; wait for a signal. */
        close(pd[1]);
        pause();
        _exit(0);
    }
    /* Parent: fill the pipe without blocking. */
    close(pd[0]);
    /* F_GETFL returns the flags as its result; F_SETFL takes them by value. */
    if ((flags = fcntl(pd[1], F_GETFL)) == -1)
        err_syserr("fcntl(F_GETFL) failed");
    flags |= O_NONBLOCK;
    if (fcntl(pd[1], F_SETFL, flags) == -1)
        err_syserr("fcntl(F_SETFL) failed");
    /* Write one byte at a time until the pipe refuses more. */
    while (write(pd[1], buffer, sizeof(buffer)-1) == sizeof(buffer)-1)
    {
        putchar('.');
        if (++i % 50 == 0)
            printf("%u\n", (unsigned)i);
    }
    if (i % 50 != 0)
        printf("%u\n", (unsigned)i);
    kill(kid, SIGINT);
    return 0;
}
I'd be curious to get extra results from other platforms. Here are the sizes I found. All the results are larger than I expected, I must confess, but Charlie and I may be debating the meaning of 'quite large' when it comes to buffer sizes.
8196 - HP-UX 11.23 for IA-64 (fcntl(F_SETFL) failed)
16384 - Solaris 10
16384 - MacOS X 10.5 (O_NONBLOCK did not work, though fcntl(F_SETFL) did not fail)
32768 - AIX 5.3
65536 - Cygwin / Windows XP (O_NONBLOCK did not work, though fcntl(F_SETFL) did not fail)
65536 - SuSE Linux 10 (and CentOS) (fcntl(F_SETFL) failed)
One point that is clear from these tests is that O_NONBLOCK works with pipes on some platforms and not on others.
The program creates a pipe, and forks. The child closes the write end of the pipe, and then goes to sleep until it gets a signal - that's what pause() does. The parent then closes the read end of the pipe, and sets the flags on the write descriptor so that it won't block on an attempt to write on a full pipe. It then loops, writing one character at a time, and printing a dot for each character written, and a count and newline every 50 characters. When it detects a write problem (buffer full, since the child is not reading a thing), it stops the loop, writes the final count, and kills the child.
