Am I misunderstanding how often IORING POLL is used? - io-uring

I am watching a video that says that after a million operations it's faster to use io_uring polling. I haven't used io_uring much, so I may be remembering things wrong, but does io_uring polling mean setting up the ring with the IORING_SETUP_SQPOLL flag and calling io_uring_enter? I understand traffic sometimes drops, but wouldn't the code spend most of its time using atomics to check whether there are more items in the completion queue? Wouldn't enter/poll only be needed in the rare second when traffic drops, or is it called constantly? I've never had to do that many operations, so I don't know.
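To make the pattern I'm asking about concrete, here is a minimal sketch (assuming liburing is available; none of this is from the video): peek at the completion queue from userspace first, which is just atomics/barriers on the shared ring memory, and only fall back to a blocking wait, which may call io_uring_enter, when the queue looks empty.

    #include <liburing.h>
    #include <stdio.h>

    /* Minimal sketch, assuming liburing. Drain one completion by first peeking
     * at the CQ ring (a userspace read of shared memory, no syscall) and only
     * falling back to a blocking wait, which may call io_uring_enter, when
     * nothing is there. IORING_SETUP_SQPOLL would additionally have a kernel
     * thread poll the submission queue on the submit side. */
    static int drain_one(struct io_uring *ring)
    {
        struct io_uring_cqe *cqe;

        /* Fast path: no syscall, just checking the ring memory. */
        if (io_uring_peek_cqe(ring, &cqe) == 0) {
            io_uring_cqe_seen(ring, cqe);
            return 0;
        }

        /* Slow path: queue looked empty, block until a completion arrives. */
        if (io_uring_wait_cqe(ring, &cqe) == 0) {
            io_uring_cqe_seen(ring, cqe);
            return 0;
        }
        return -1;
    }

    int main(void)
    {
        struct io_uring ring;

        if (io_uring_queue_init(8, &ring, 0) < 0)  /* 0 flags here; SQPOLL needs extra setup */
            return 1;

        struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
        io_uring_prep_nop(sqe);   /* dummy request just so something completes */
        io_uring_submit(&ring);

        drain_one(&ring);
        io_uring_queue_exit(&ring);
        return 0;
    }

Is the fast path above what the video means most of the time, with enter/poll only hit on the slow path?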

Related

Why are the time-out values so small?

I'm not sure whether this is purely a Stack Overflow question; it is about general design practice. Since I cannot think of a more relevant Stack Exchange site, I'm posting it here.
A common design practice for converting an async call into a sync one is to use a time-out and wait for the results. While this may not be good practice from the point of view of responsiveness, it definitely makes the implementation easier.
I have seen many such implementations and often noticed that the developers tend to use a very small time-out value. I can understand that people may have had a responsive system in mind when they did this. But many of the applications I have seen are data-critical ones where the loss of data is very bad. So it is always better to wait longer and try to get as much data as possible, instead of timing out early and giving an error message to the user. The situations where the server fails to provide data, or the client cannot reach the server, are rare; in those situations I would expect a large time-out for such waits. After all, these time-outs don't mean that the wait will always last for the full time-out value; the time-out is only an upper limit. So I have always argued for higher values here. But I see low values used in more and more places, and now I'm getting confused about whether there is something in this practice that I don't understand.
So my question is: are there any arguments, other than the need for responsiveness, for using a very small time-out for waiting?
As always, the right decision depends on the real-life data.
The timeout should be proportional to the time it usually takes to complete an operation successfully.
Sending a UDP message, for example, could take between 1 and 50 milliseconds, so a timeout of 100 milliseconds is more than reasonable; however, copying a file over the wire could take minutes or more, so a 100 millisecond timeout is laughable.
There are pros and cons to both short and long timeouts so it's a tradeoff. Longer timeouts use more resources (tasks, threads, memory, etc.) for the same amount of work while short timeouts, as you mentioned, may result in loss of data.
In conclusion, you need to set a configurable timeout that sounds reasonable, then figure out whether you are timing out too many operations in production, or the other way around, and calibrate accordingly.
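As a minimal sketch of what "configurable" can look like in practice (the REQUEST_TIMEOUT_MS variable and the already-connected socket fd are my own illustrative assumptions, not anything from the question):

    #include <poll.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Sketch: wait for a reply on an already-connected socket, with the
     * upper-limit timeout taken from configuration rather than hard-coded.
     * Returns 0 on data ready, -1 on timeout or error. */
    int wait_for_reply(int fd)
    {
        const char *cfg = getenv("REQUEST_TIMEOUT_MS");  /* hypothetical config knob */
        int timeout_ms = cfg ? atoi(cfg) : 5000;         /* default upper limit: 5 s */

        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int rc = poll(&pfd, 1, timeout_ms);  /* returns as soon as data arrives */

        if (rc == 0) {
            fprintf(stderr, "timed out after %d ms\n", timeout_ms);
            return -1;  /* log/count this so the timeout can be recalibrated later */
        }
        return rc < 0 ? -1 : 0;
    }

Note that the wait returns immediately when the reply arrives early, so a generous upper limit only costs anything in the failure case, which is exactly the trade-off being argued about here.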

Optimization of parallel programming

I want to use MPI to make my program parallel, and I want to send something to other computers. I want to know which is better: sending one huge buffer a single time, or sending several smaller messages at different times during the execution instead of all at once?
It's almost always going to be faster to send the one big message than the smaller ones. Each time you do a Send/Receive pair, the two processes have to go through the entire process of sending a message to each other, including at least 6 roundtrip messages. If you are just sending one larger message, there is a minimum of 2 roundtrip messages. Each of those messages can be very expensive (compared to doing things locally, like packing all of your data into one buffer).
I'd encourage you to try it out both ways though to be sure that this applies to your application. It could be different if you're doing something unexpected.
Depending on your problem, sending all the data at once may be more efficient, because otherwise the nodes have to be synced every time, and that may cause a delay.
I always try to send as much data as I can in a single MPI call. In my experience, sending many small bits of data greatly increases the overhead and network traffic, and I have even run into problems where I overwhelmed the computers' ability to keep up with the number of requests, because I was sending a large member of a complicated class, one integer at a time, to many workers. Therefore, when possible, send the entire data at once, unless you have some reason to believe it is too large.
Further, I strive to use 100% of all the CPUs my program claims. When you are working on shared resources, if you claim a CPU, you need to actually use it. Otherwise, someone else who wants to use that core, or node, is blocked out while your program sits and does nothing. For example, on a Cray which I have used, even if you call for only two 'cores', the manager will reserve a full bank of 24 cores, essentially wasting 22. Or, perhaps one worker has nothing to do, while another chugs away -- again, wasting time. Hopefully, there is a way to balance the load, so to speak, to avoid unintentional waste of resources.
Back to the topic at hand. Demonstrate the timing and efficiency of vector sending to yourself -- write a program which breaks up the vector into varying sizes of packets and does the sends/receives. Test it with varying numbers of workers, and on several different configurations of computers, if you can. Before writing production code, do a proof of concept and whatever optimization you can. Test and time it!
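A minimal sketch of that kind of experiment (my own illustration, with made-up N and CHUNK values to vary; the measured numbers on your machines are the real answer):

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Sketch: time sending the same N doubles from rank 0 to rank 1, once as a
     * single big message and once in chunks of CHUNK elements.
     * Run with at least two ranks, e.g. "mpirun -np 2 ./a.out". */
    #define N     1000000
    #define CHUNK 1000

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double *buf = malloc(N * sizeof(double));
        for (int i = 0; i < N; i++) buf[i] = (double)i;

        /* One big send/receive. */
        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        if (rank == 0)
            MPI_Send(buf, N, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
        else if (rank == 1)
            MPI_Recv(buf, N, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        double t_big = MPI_Wtime() - t0;

        /* Many small sends/receives of the same data. */
        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        if (rank == 0)
            for (int i = 0; i < N; i += CHUNK)
                MPI_Send(buf + i, CHUNK, MPI_DOUBLE, 1, 1, MPI_COMM_WORLD);
        else if (rank == 1)
            for (int i = 0; i < N; i += CHUNK)
                MPI_Recv(buf + i, CHUNK, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        double t_small = MPI_Wtime() - t0;

        if (rank == 1)
            printf("one send of %d: %f s, %d sends of %d: %f s\n",
                   N, t_big, N / CHUNK, CHUNK, t_small);

        free(buf);
        MPI_Finalize();
        return 0;
    }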

How to write integration tests for systems that interact asynchronously

Assume that I have a function called PlaceOrder which, when called, inserts the order details into a local DB and puts a message (the order details) onto a TIBCO EMS queue.
Once the message is received, a TIBCO BW process will then invoke some other system (say ExternalSystem) to pass on the order details.
Now, the way I wrote my integration tests is:
Call PlaceOrder
Sleep, then check that the details exist in the local DB
Sleep, then check that the details exist in ExternalSystem.
Is the above approach correct? The above test gives me confidence that the end-to-end integration is working, but is there a better way to test this scenario?
The problem you describe is quite common, and your approach is a very typical solution.
The problem with this solution is that if the delay is too short, your tests may sometimes pass and sometimes fail; but if the delay is very long, you're just wasting time waiting, and with many tests that can add up to a lot of delay. Unless you can get some signal telling you the order has arrived in the database, though, you just have to wait.
You can reduce the delay by doing lots of checks at short intervals. If your order is still not there when an overall timeout expires, you fail the test.
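A sketch of that sampling loop (check_order_in_db is a hypothetical stand-in for whatever query your test actually makes; the names and numbers are mine, not from the question):

    #include <stdbool.h>
    #include <time.h>

    /* Hypothetical predicate: returns true once the order shows up in the DB.
     * The real test supplies this. */
    bool check_order_in_db(const char *order_id);

    /* Sample every poll_ms milliseconds until the predicate holds or the
     * overall timeout expires; the test fails only if this returns false. */
    bool wait_until_present(const char *order_id, int timeout_ms, int poll_ms)
    {
        struct timespec interval = { poll_ms / 1000, (poll_ms % 1000) * 1000000L };
        int waited = 0;

        while (waited <= timeout_ms) {
            if (check_order_in_db(order_id))
                return true;            /* found it: stop waiting immediately */
            nanosleep(&interval, NULL);
            waited += poll_ms;
        }
        return false;                   /* overall timeout: fail the test */
    }

This way the timeout can be generous (say, tens of seconds) without slowing down the common case, because the loop exits as soon as the order appears.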
In "Growing Object-Oriented Software, Guided by Tests"*, there is a chapter on this very subject, so you might want to get a copy if you will be doing a lot of this sort of testing.
"There are two ways a test can observe the system: by sampling its observable
state or by listening for events that it sends out. Of these, sampling is
often the only option because many systems don’t send any monitoring
events. It’s quite common for a test to include both techniques to interact
with different “ends” of its system"
(*) http://my.safaribooksonline.com/book/software-engineering-and-development/software-testing/9780321574442

strategy for hit detection over a net connection, like Quake or other FPS games

I'm learning about the various networking technologies, specifically the protocols UDP and TCP.
I've read numerous times that games like Quake use UDP because, "it doesn't matter if you miss a position update packet for a missile or the like, because the next packet will put the missile where it needs to be."
This thought process is all well and good during the flight path of an object, but it's not good for when the missile reaches its target. If one computer receives the message that the missile reached its intended target, but that packet got dropped on a different computer, that would cause some trouble.
Clearly that type of thing doesn't really happen in games like Quake, so what strategy are they using to make sure that everyone is in sync with instantaneous type events, such as a collision?
You've identified two distinct kinds of information:
updates that can be safely missed, because the information they carry will be provided in the next update;
updates that can't be missed, because the information they carry is not part of the next regular update.
You're right - and what the games typically do is separate out those two kinds of messages within their protocol, and require acknowledgements and retransmissions for the second type, but not for the first. (If the underlying transport is UDP, then these acknowledgements/retransmissions need to be provided at a higher layer.)
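A rough sketch of what that separation can look like on the wire (all names here are illustrative inventions, not from any real engine):

    #include <stdint.h>

    /* Sketch of a per-packet header that lets one UDP socket carry both kinds
     * of traffic. Field names are made up for illustration. */
    enum msg_class {
        MSG_UNRELIABLE = 0,   /* e.g. position updates: never acked, never resent */
        MSG_RELIABLE   = 1    /* e.g. "missile hit": must be acked by the peer   */
    };

    struct msg_header {
        uint16_t seq;         /* sequence number, so the peer can ack/deduplicate */
        uint16_t ack;         /* highest reliable seq this sender has seen from peer */
        uint8_t  msg_class;   /* MSG_UNRELIABLE or MSG_RELIABLE */
        uint8_t  payload_len;
    };

    /* Sender side, informally:
     *   - keep reliable messages in a resend queue keyed by seq;
     *   - resend anything not acked within roughly one RTT;
     *   - send unreliable messages once and forget them.
     * Receiver side: apply unreliable updates as they arrive (newest wins),
     * and echo back acks for reliable seqs it has processed. */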
When you say that "clearly doesn't happen", you clearly haven't played games on a lossy connection. A popular trick amongst the console crowd is to put a switch on the receive line of your ethernet connection so you can make your console temporarily stop receiving packets, so everybody is nice and still for you to shoot them all.
The reason that could happen is that the console that did the shooting decides whether it was a hit or not, and relays that information to the opponent. That ensures out-of-sync or laggy hit data can be decided deterministically. Even if the remote end didn't think the shot was a hit, it should be close enough that it doesn't seem horribly bad. It works in a reasonable manner, except for what I've mentioned above. Of course, if you assume your players are not cheating, this approach works quite reasonably.
I'm no expert, but there seem to be two approaches you can take: let the client decide if it's a hit or not (which allows for cheating), or let the server decide.
With the former, if you shoot a bullet, and it looks like a hit, it will count as a hit. There may be a bit of a delay before everyone else receives this data though (i.e., you may hit someone, but they'll still be able to play for half a second, and then drop dead).
With the latter, as long as the server receives the information that you shot a bullet, it can use whatever positions it currently has to determine whether there was a hit or not, then send that data back to you. This means neither you nor the victim will know whether you hit until that data is sent back.
I guess to "smooth" it out you let the client decide for itself, and then if the server chimes in and says "no, that didn't happen" it corrects. Which I suppose could mean players popping back to life, but I reckon it would make more sense just to set their life to 0 and wait until you get a definitive answer, so you don't have weird graphical things going on.
As for ensuring the server/client has received the event... I guess there are two more approaches. Either get the server/client to respond "Yeah, I received the event" or forget about events altogether and just think about everything in terms of state. There is no "hit" event, there's just HP before and after. Sooner or later, it'll receive the most up-to-date state.
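The state-centric version of that might look like periodically broadcasting a snapshot such as the one below (a sketch with made-up names; a lost snapshot doesn't matter because the next one carries the up-to-date values anyway):

    #include <stdint.h>

    /* Sketch: instead of a "player X was hit" event, the server periodically
     * broadcasts the latest authoritative state. A "hit" is just hp going down
     * between two snapshots. Names and sizes are illustrative. */
    struct player_state {
        uint32_t player_id;
        float    x, y, z;     /* current position */
        int16_t  hp;          /* current health */
    };

    struct world_snapshot {
        uint32_t tick;        /* clients keep only the newest tick they've seen */
        uint8_t  player_count;
        struct player_state players[16];
    };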

Why are Asynchronous processes not called Synchronous?

So I'm a little confused by this terminology.
Everyone refers to "asynchronous" computing as running different processes on separate threads, which gives the illusion that these processes are running at the same time.
This is not the definition of the word asynchronous.
a⋅syn⋅chro⋅nous
–adjective
1. not occurring at the same time.
2. (of a computer or other electrical machine) having each operation started only after the preceding operation is completed.
What am I not understanding here?
It means that the two threads are not running in sync, that is, they are not both running on the same timeline.
I think it's a case of computer scientists being too clever about their use of words.
Synchronisation, in this context, would suggest that both threads start and end at the same time. Asynchrony, in this sense, means both threads are free to start, execute and end as they require.
The word "synchronous" implies that a function call will be synchronized with some other event.
Asynchronous implies that no such synchronization occurs.
It seems like the definition that you have there should really be the definition for "concurrent," or something. That definition looks wrong.
PS:
Here is the wiktionary definition:
asynchronous
Not synchronous; occurring at different times.
(computing, of a request or a message) allowing the client to continue during processing.
Which just so happens to be the exact opposite of what you posted.
I believe that the term was first used for synchronous vs. asynchronous communication. There, synchronous means that the two communicating parties have a common clock signal that they run by, so they run in parallel. Asynchronous communication instead has a ready signal, so one party asks for data and gets a signal back when it's available.
The term was then adapted to processes, but as there are obvious differences, some aspects of the term work differently. For a single-threaded process, the natural way to request that something be done is to make a synchronous call that transfers control to the subprocess; control is returned when it's done, and the process continues.
An asynchronous call works just like asynchronous communication in the sense that you send a request for something to be done, and the process doing it returns a signal when it's done. The difference in the usage of the terms is that for processes it's the asynchronous case in which things run in parallel, while for communication it's the synchronous case that runs in parallel.
So "computer or electrical machine" is really too wide a scope for a correct definition of the term, as it's used in slightly different ways for different techniques.
I would guess it's because they are not synchronized ;)
In other words... if one process gets stopped, killed, or is waiting for something, the other will carry on
I think there's a slant that is slightly different to most of the answers here.
Asynchronous means "not happening at the same time".
In the specific case of threading:
Synchronous means "execute this code now".
Asynchronous means "enqueue this work on a different thread that will be executed at some indeterminate time in the future".
This usually allows you to "do two things at once" because of reasons like:
one thread is just waiting (e.g. for data to arrive on a serial port) so is asleep
You have multiple processors, so the two threads can run concurrently.
However, even with 128 processor cores, the case is the same: the work will be executed "at some time in the future" (if perhaps the very near future) rather than "now".
Your second definition is more helpful here:
2. [...] having each operation started only after the preceding operation is completed.
When you make an asynchronous call, that call might not be completed before the next operation is started. When the call is synchronous, it will be.
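A small illustration of that difference, using a plain function call for the synchronous case and a POSIX thread for the asynchronous one (a sketch; the work itself is just a stand-in):

    #include <pthread.h>
    #include <stdio.h>

    static void *do_work(void *arg)
    {
        printf("work done\n");
        return NULL;
    }

    int main(void)
    {
        /* Synchronous: the next line does not run until do_work has returned. */
        do_work(NULL);
        printf("after sync call\n");

        /* Asynchronous: do_work is queued on another thread; this line may run
         * before, during, or after it. */
        pthread_t t;
        pthread_create(&t, NULL, do_work, NULL);
        printf("after async call (work may not have happened yet)\n");

        pthread_join(t, NULL);  /* wait for the async work before exiting */
        return 0;
    }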
It really means that an asynchronous event is happening independently of other events whereas a synchronous event would be happening dependent of other events.
It's like: Flammable, Inflammable (which mean the same thing).
Seriously -- it's just one of those quirks of the English language. It doesn't really make sense. You can try to explain it, but it would be just as easy to justify the reverse meanings.
Many of the answers here are not correct. IN-dependently has a prefix that says NOT dependently, just like A-synchronous, but the meanings of dependent and synchronous are not the same! :D
So three dependent persons would wait for an order, because they are dependent on the order; but they wait, so they are not synchronous.
In English, and in any other language with the common roots a, syn and chrono (Italian: asincrono; Spanish: asincrónico; French: asynchrone; Greek: a = not, syn = together, chronos = time), it means exactly the opposite.
The terminology is UTTERLY counter-intuitive. Async functions ARE synchronous, they happen at the same time, and that's their power. They DO NOT wait, they DO NOT depend, they DO NOT hold the user waiting, but all those NOTs refer to anything but synchronicity :)
The only answer that is possibly right is the CLOCK one, although it is still confusing. My personal interpretation is this story:
"A professor has an office, and he makes SYNCHRONOUS CALLS for students to come. He says out loud in the main university hall: 'Hey guys who wants to talk to me should come at 10 in the morning tomorrow.', or simply puts a sign saying the same stuff.
RESULT: at 10 in the morning you see a long queue. People had the same time so they came in in the same moment and they got "piled up in the process".
So the professor thinks it would be nice for students not to waste time in the queue (and do synchronous operations, that is, do parallel stuff in their lives at the same time, and that's where the confusion comes).
He decides students can substitute him in making ASYNCHRONOUS CALLS, that is, every time a student ends talking with him, the students may, e.g., call another student saying the professor is free to talk, in a room where students may do whatever they like in the meantime. So every student does not have a single SYNCHRONOUS CALL (10 in the morning, the same time for all) but they have 10, 10.10, 10.18, 10.27.. etc. according to the needed time for each discussion in the professor office."
Is that the meaning of having the same clock, @Guffa?
