Asynchronous address resolution in winsock?

Looking into asynchronous address resolution in Winsock, it seems that the only two options are either to use the blocking gethostbyname on a separate thread, or to use WSAAsyncGetHostByName. The latter is, for some reason, designed to work with window messages rather than with overlapped operations and completion ports/routines.
Is there any version of gethostbyname that works asynchronously with overlapped operations, in a similar manner to the rest of the Winsock API?

Unfortunately there isn't at present, although GetAddrInfoEx() has placeholders for all the right things for async operation via all of the 'usual' routes (including IOCP), so I expect there will be one eventually... For now, though, the docs say that all of these must be set to NULL and are marked as 'reserved'. :(
I'm just about to write one (have been for a while)... It's unfortunate that WSAAsyncGetHostByName doesn't even allow concurrent name resolution, which makes it pretty useless as a base for what I want; then again, since it doesn't handle IPv6 either, it's pretty useless to me anyway. I expect I'll start from scratch, possibly using something like this (beerware) as a base.
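In the meantime, the practical workaround is the first option from the question: push the blocking call onto a worker thread. A minimal sketch, assuming WSAStartup has already been called and ws2_32 is linked, and using the more modern getaddrinfo (ResolveAsync and the out-parameter shape are just illustrative choices):

#include <winsock2.h>
#include <ws2tcpip.h>
#include <future>
#include <memory>
#include <string>

// Runs the blocking getaddrinfo() on a worker thread; the caller gets a
// std::future holding the gai error code (0 on success) and can carry on
// until it actually needs the result.
std::future<int> ResolveAsync(std::string host, std::string service,
                              std::shared_ptr<addrinfo*> out)
{
    return std::async(std::launch::async, [host, service, out] {
        addrinfo hints{};
        hints.ai_family = AF_UNSPEC;       // accept both IPv4 and IPv6
        hints.ai_socktype = SOCK_STREAM;
        // Blocks on this worker thread; on success the caller must
        // eventually call freeaddrinfo(*out).
        return getaddrinfo(host.c_str(), service.c_str(), &hints, out.get());
    });
}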

Sorry, there is no overlapped version of gethostbyname().

Client callbacks with protobuf-net.Grpc

I'm currently working on replacing an old WCF client/server pairing with gRPC, and decided to use protobuf-net.Grpc as we've used protobuf-net extensively elsewhere in our codebase. I'm running into a bit of trouble with one particular portion, however.
Part of the original service is a Subscribe method which uses IClientCallback to effectively send an event to the client. Looking at regular gRPC, it seems like this would be possible (though a bit hacky) using a server-streaming method and storing the IServerStreamWriter object on the server, writing to it whenever we wanted to "fire an event".
For the life of me, however, I can't quite figure out how to do something similar in protobuf-net.Grpc with the IAsyncEnumerable return type. The closest I can figure is using Task.Wait in a loop and updating some shared collection when I want to "fire" the event, which the loop would then check for and yield return. This doesn't seem like it would scale well, however, and there isn't really a good way to definitively unsubscribe when a client is no longer listening for events.
Is there some other/better way to do this?
Use a Channel<T>, which can be adapted via AsAsyncEnumerable(); it then essentially acts as a queue on the producer side and a sequence on the consumer side.

Why does zumero_sync need to be called multiple times?

According to the documentation for zumero_sync:
If a large amount of information needs to be pulled from the server,
this function may need to be called more than once.
In my Android app that uses Zumero that's no problem; I just keep calling zumero_sync until the return value doesn't start with "0;".
However, now I'm trying to write an admin script that also syncs with my server dbfiles. I'd like to use the sqlite3 shell, and have the script pass the SQL to execute via command line arguments. I need to call zumero_sync in a loop (which SQLite doesn't support) to make sure the db is fully synced. If I had to, I could invoke sqlite3 in a loop (reading its output, looking for "0;"), or even write a C++ app to call the SQLite/Zumero functions natively. But it certainly would be easier if a single zumero_sync was enough.
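For reference, the native C++ option is only a short loop over the SQLite C API. A sketch, with the actual arguments to zumero_sync() left for you to supply in the SQL string you would otherwise type into the shell:

#include <sqlite3.h>
#include <string>

// Repeatedly executes a "SELECT zumero_sync(...);" statement until the
// result no longer starts with "0;" (the keep-calling marker described
// above).  Returns the final result string, or "" on a SQL error.
std::string sync_until_complete(sqlite3* db, const std::string& sync_sql)
{
    for (;;) {
        sqlite3_stmt* stmt = nullptr;
        if (sqlite3_prepare_v2(db, sync_sql.c_str(), -1, &stmt, nullptr) != SQLITE_OK)
            return "";
        std::string result;
        if (sqlite3_step(stmt) == SQLITE_ROW)
            if (const unsigned char* text = sqlite3_column_text(stmt, 0))
                result = reinterpret_cast<const char*>(text);
        sqlite3_finalize(stmt);
        if (result.compare(0, 2, "0;") != 0)
            return result;    // complete (or an error): stop looping
    }
}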
I guess my real question is: could zumero_sync be changed so it completes the sync before returning? If there are cases where the existing behavior is more useful, maybe there could be a parameter for specifying which mode to use?
I see two basic questions here:
(1) Why does zumero_sync() work the way it does?
(2) Can it work differently?
I'll answer (2) first, since it's easier: Yes, it could work differently. Rather, we could (and probably will, soon, now that you've brought it up) implement an additional function, named something like zumero_sync_complete(), which performs [the guts of] zumero_sync() in a loop and returns after the sync is complete.
We didn't implement zumero_sync_complete() because it doesn't add much value. It's a simple loop, so you can darn well write it yourself. :-)
Er, except in scripting environments which don't support loops. Like the sqlite3 shell.
Answer to (1):
The Zumero sync protocol is designed to give the server the flexibility to return partial results if it wants to do so. And for the sake of reducing load on the server (and increasing its scalability) it often does want to do exactly that.
Given that, one reason to expose this to the client is to increase the client's flexibility as well. As long as we're making multiple round trips, we might as well give the client an opportunity to do something (like, maybe, update a progress bar) in between them.
Another thing a client might want to do in between loop iterations is handle an error.
Or, in the case of a multithreaded client, it might want to deal with changes that happened on the client while the sync is going on.
Which raises the question of how locking should be managed: do we hold the SQLite write lock during the entire loop, or only when absolutely necessary?
Bottom line: A robust app would probably want to implement the loop itself so that it can make its own decisions and retain full control over things.
But, as you observe, the sqlite3 shell doesn't have loops. And it's not an app. And it doesn't have threads. Or progress bars. So it's a use case where a simpler-and-less-powerful form of zumero_sync() would make sense.

Which POSIX system calls may block a process?

Is there a reference document somewhere that describes the blocking behaviour of POSIX system calls?
So far my heuristic has been to flag as potentially blocking any function that may fail with EINTR. Is that a necessary and sufficient condition?
Frankly, to the best of my knowledge, just look at the man pages and use common sense. Even if a call doesn't specify EINTR, think about what it does. I may be wrong, but it seems to me that every call in the POSIX library that can block does so by default, and will only execute in nonblocking mode if you ask it to (via various flags); I haven't seen a single call that is nonblocking by default (at least, not one that does anything that could block).
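To illustrate the "ask for it via flags" part: read() on a descriptor blocks by default and only becomes nonblocking once you set O_NONBLOCK, e.g. with fcntl(). A minimal sketch using nothing beyond standard POSIX:

#include <fcntl.h>
#include <unistd.h>
#include <errno.h>

// After this call, a read() with no data available fails immediately
// with errno == EAGAIN (or EWOULDBLOCK) instead of putting the process
// to sleep.
int make_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);    // fetch the current status flags
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}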

Rewriting network packets on the fly using libnetfilter_queue

I am attempting to write a userspace application that can hook into an OS's network stack, sniff packets flying past, and edit ones that it's interested in.
After much Googling, it appears to me that the simplest (yet reasonably robust) method of doing so (on any platform) is Linux's libnetfilter_queue project. However, I'm having trouble finding any reasonable documentation for the project outside of the limited official documentation. Its main features (as stated by the first link) are:
receiving queued packets from the kernel nfnetlink_queue subsystem
issuing verdicts and/or reinjecting altered packets to the kernel nfnetlink_queue subsystem
Emphasis is my own. How exactly am I meant to go about this? I've tried modifying the sample code provided, but perhaps I am misunderstanding something. The code is operating in NFQNL_COPY_PACKET mode, so I am receiving the whole packet -- but my modifications to it seem to be restricted to my own application -- as one would expect, given the "copy" semantics.
My feeling is that I am meant to make use of NF_QUEUE somehow, but I haven't quite grokked it. Any pointers?
(If there is a simpler mechanism for doing this, which is also cross-platform, I'd love to hear about it!)
I can't believe I missed this previously. As reluctant as I am to post questions on SO, I thought I would never work this one out myself. :)
I didn't look at the function prototype properly. It turns out in the "verdict" function (outlined below),
int nfq_set_verdict(struct nfq_q_handle *qh,
                    u_int32_t id,
                    u_int32_t verdict,
                    u_int32_t data_len,
                    const unsigned char *buf)
The last two parameters are for the data to be returned to the network stack. Obvious in hindsight, but I missed it completely as the print_pkt function doesn't take the packet data as a parameter, but extracts it from the struct nfq_data.
The key is to NF_ACCEPT the packet and pass the suitably modified packet back to the kernel.
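Put together, the queue callback ends up looking roughly like this (a sketch only: real code must recompute the IP/TCP checksums after editing, and older library versions declare the payload pointer as char ** rather than unsigned char **):

#include <libnetfilter_queue/libnetfilter_queue.h>
#include <linux/netfilter.h>    // NF_ACCEPT
#include <arpa/inet.h>          // ntohl
#include <sys/types.h>

static int callback(struct nfq_q_handle *qh, struct nfgenmsg *nfmsg,
                    struct nfq_data *nfa, void *data)
{
    // The packet id must be echoed back with the verdict.
    struct nfqnl_msg_packet_hdr *ph = nfq_get_msg_packet_hdr(nfa);
    u_int32_t id = ph ? ntohl(ph->packet_id) : 0;

    unsigned char *payload = nullptr;
    int len = nfq_get_payload(nfa, &payload);
    if (len >= 0) {
        // ... edit 'payload' in place here, then fix up checksums ...
        // Passing the buffer and its length along with the verdict is
        // what re-injects the altered packet into the stack.
        return nfq_set_verdict(qh, id, NF_ACCEPT, len, payload);
    }
    return nfq_set_verdict(qh, id, NF_ACCEPT, 0, nullptr);
}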
Just a wild guess from digging around the source code: try explicitly adding the mangled payload using nfnl_addattr_l(…, NFQA_PAYLOAD, …)?

API design: is "fault tolerance" a good thing?

I've consolidated many of the useful answers and come up with my own answer below.
For example, I am writing an API Foo which needs explicit initialization and termination. (This should be language agnostic, but I'm using C++ here.)
class Foo
{
public:
    static void InitLibrary(int someMagicInputRequiredAtRuntime);
    static void TermLibrary(int someOtherInput);
};
Our library doesn't care about multi-threading, reentrancy or whatnot. Let's suppose our Init function should only be called once; calling it again with any other input would wreak havoc.
What's the best way to communicate this to my caller? I can think of two ways:
Inside InitLibrary, I assert on some static variable, which will blame my caller for init'ing twice.
Inside InitLibrary, I check some static variable and silently abort if my lib has already been initialized.
Method #1 is obviously explicit, while method #2 makes things more user friendly. I am thinking that method #2 probably has the disadvantage that my caller wouldn't be aware of the fact that InitLibrary shouldn't be called twice.
What would be the pros/cons of each approach? Is there a cleverer way to sidestep all this?
Edit
I know that the example here is very contrived. As #daemon pointed out, I should initialize myself and not bother the caller. Practically, however, there are places where I need more information to properly initialize myself (note the use of my variable name someMagicInputRequiredAtRuntime). This is not restricted to initialization/termination; the dilemma exists anywhere I must choose between being quote-unquote "fault tolerant" and failing loudly.
I would definitely go for approach 1, along with an easy-to-understand exception and good documentation that explains why this fails. This will force the caller to be aware that this can happen, and the calling class can easily wrap the call in a try-catch statement if needed.
Failing silently, on the other hand, will lead your users to believe that the second call was successful (no error message, no exception) and thus they will expect that the new values are set. So when they try to do something else with Foo, they don't get the expected results. And it's darn near impossible to figure out why if they don't have access to your source code.
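A bare-bones sketch of that approach, reusing the names from the question (std::logic_error is just one reasonable choice of exception type):

#include <stdexcept>

class Foo
{
public:
    static void InitLibrary(int someMagicInputRequiredAtRuntime)
    {
        if (initialized)
            throw std::logic_error("Foo::InitLibrary called twice");
        // ... real initialization using someMagicInputRequiredAtRuntime ...
        initialized = true;
    }

    static void TermLibrary(int someOtherInput)
    {
        if (!initialized)
            throw std::logic_error("Foo::TermLibrary before InitLibrary");
        // ... real teardown ...
        initialized = false;
    }

private:
    static inline bool initialized = false;   // C++17 inline static member
};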
Serenity Prayer (modified for interfaces)
SA, grant me the assertions
to accept the things devs cannot change
the code to except the things they can,
and the conditionals to detect the difference
If the fault is in the environment, then you should try and make your code deal with it. If it is something that the developer can prevent by fixing their code, it should generate an exception.
A good approach would be to have a factory that creates an initialized library object (this would require you to wrap your library in a class). Multiple create-calls to the factory would create different objects. This way, the initialize-method would not be part of the public interface of the library, and the factory would manage initialization.
If there can be only one instance of the library active, make the factory check for existing instances. This would effectively make your library-object a singleton.
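Roughly like this (illustrative names; the private constructor is what keeps initialization out of the public interface, and a singleton variant would simply have Create() hand back the same instance every time):

#include <memory>

class FooLib
{
public:
    // The factory is the only way to obtain an instance, so every FooLib
    // a caller can hold has been initialized exactly once.
    static std::unique_ptr<FooLib> Create(int someMagicInputRequiredAtRuntime)
    {
        // new rather than make_unique because the constructor is private.
        return std::unique_ptr<FooLib>(new FooLib(someMagicInputRequiredAtRuntime));
    }

    ~FooLib() { /* teardown happens here, not in a separate Term call */ }

private:
    explicit FooLib(int someMagicInputRequiredAtRuntime)
    {
        // initialization happens here
    }
};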
I would suggest that you throw an exception if your routine cannot achieve the expected post-condition. If someone calls your init routine twice, and the system state after the second call would be the same as if it had been called just once, then it is probably not necessary to throw an exception. If the system state after the second call would not match the caller's expectation, then an exception should be thrown.
In general, I think it's more helpful to think in terms of state than in terms of action. To use an analogy, an attempt to open as "write new" a file that is already open should either fail or result in a close-erase-reopen. It should not simply perform a no-op, since the program will be expecting to be writing into an empty file whose creation time matches the current time. On the other hand, trying to close a file that's already closed should generally not be considered an error, because the desire is that the file be closed.
BTW, it's often helpful to have available a "Try" version of a method that might throw an exception. It would be nice, for example, to have a Control.TryBeginInvoke available for things like update routines (if a thread-safe control property changes, the property handler would like the control to be updated if it still exists, but won't really mind if the control gets disposed; it's a little irksome not being able to avoid a first-chance exception if a control gets closed when its property is being updated).
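In the C++ terms of this question, the "Try" flavour is just a non-throwing sibling that reports by return value, for callers who don't consider a repeat call exceptional (a sketch, pairing it with the throwing version):

#include <stdexcept>

class Foo
{
public:
    static void InitLibrary(int magic)
    {
        if (initialized)
            throw std::logic_error("already initialized");
        initialized = true;   // real initialization elided
    }

    // Non-throwing sibling: returns false instead of throwing, for
    // callers who merely want "ensure it's initialized" semantics.
    static bool TryInitLibrary(int magic)
    {
        if (initialized)
            return false;
        InitLibrary(magic);
        return true;
    }

private:
    static inline bool initialized = false;
};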
Have a private static counter variable in your class. If it is 0, do the logic in Init and increment the counter; if it is more than 0, simply increment the counter. In Term, do the opposite: decrement, and when the counter reaches 0, do the teardown logic.
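That is, make Init/Term reference counted, along these lines (a sketch; a plain int is fine given the question rules out multi-threading):

class Foo
{
public:
    static void InitLibrary(int someMagicInputRequiredAtRuntime)
    {
        if (refCount++ == 0) {
            // first Init: perform the real initialization
        }
    }

    static void TermLibrary(int someOtherInput)
    {
        if (--refCount == 0) {
            // last Term: perform the real teardown
        }
    }

private:
    static inline int refCount = 0;   // Init/Term pairs now nest harmlessly
};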
Another way is to use a Singleton pattern; here is a sample in C++.
I guess one way to resolve this dilemma is to satisfy both camps. Ruby has the -w warning switch, it is customary for gcc users to pass -Wall or even -Weffc++, and Perl has taint mode. By default, these "just work", but the more careful programmer can turn on these stricter settings themselves.
One example against the "always complain the slightest error" approach is HTML. Imagine how frustrated the world would be if all browsers would bark at any CSS hacks (such as drawing elements at negative coordinates).
After considering many excellent answers, I've come to this conclusion for myself: when someone sits down, my API should ideally "just work". Of course, for anyone to be involved in any domain, he needs to work one or two levels of abstraction lower than the problem he is trying to solve, which means my user must learn about my internals sooner or later. If he uses my API for long enough, he will begin to stretch its limits, and too much effort spent to "hide" or "encapsulate" the inner workings will only become a nuisance.
I guess fault tolerance is a good thing most of the time; it's just difficult to get right when the API user is stretching corner cases. I'd say the best of both worlds is to provide some kind of "strict mode", so that when things don't "just work", the user can more easily dissect the problem.
Of course, doing this is a lot of extra work, so I may be just talking ideals here. Practically it all comes down to the specific case and the programmer's decision.
If your language doesn't allow this error to surface statically, chances are good the error will surface only at runtime. Depending on the use of your library, this means the error won't surface until much later in development -- possibly only once shipped (again, it depends on a lot).
If there's no danger in silently eating an error (which isn't a real error anyway, since you catch it before anything dangerous happens), then I'd say you should silently eat it. This makes it more user friendly.
If, however, someMagicInputRequiredAtRuntime varies from call to call, I'd raise the error whenever possible, or presumably the library will not function as expected ("I init'ed the lib with value 42, but it's behaving as if I'd initted it with 11!?").
If this library is a static class (a library type with no state), why not put the call to Init in the type initializer? If it is an instantiable type, then put the call in the constructor, or in the factory method that handles instantiation.
Don't allow public access to the Init function at all.
I think your interface is a bit too technical. No programmer wants to learn what concepts you used while designing the API. Programmers want solutions for their actual problems and don't want to learn how to use an API. Nobody wants to init your API; that is something the API should handle in the background as far as possible. Find a good abstraction that shields the developer from as much low-level technical stuff as possible. That implies that the API should be fault tolerant.
