I'm currently working on replacing an old WCF client/server pairing with gRPC, and decided to use protobuf-net.Grpc as we've used protobuf-net extensively elsewhere in our codebase. I'm running into a bit of trouble with one particular portion, however.
Part of the original service is a Subscribe method which uses IClientCallback to effectively send an event to the client. Looking at regular gRPC, it seems like this would be possible (though a bit hacky) using a server streaming method and storing the IServerStreamWriter object on the server, writing to it whenever we wanted to "fire an event".
For the life of me, however, I can't quite figure out how to do something similar in protobuf-net.Grpc with the IAsyncEnumerable return type. The closest I can figure is using Task.Wait in a loop and updating some shared collection when I want to "fire" the event, which the loop would then check for and yield return. This doesn't seem like it would scale well, however, and there isn't really a great way to definitively unsubscribe when a client is no longer listening for events.
Is there some other/better way to do this?
One option here is Channel<T>, which can be exposed via AsAsyncEnumerable() - the channel then essentially acts as a queue at the producer side, and a sequence at the consumer.
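A minimal sketch of that pattern, assuming a System.Threading.Channels-backed subscriber registry (EventMessage, EventService, and Publish are hypothetical names, and this uses ChannelReader<T>.ReadAllAsync() to obtain the IAsyncEnumerable):

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Channels;

public class EventMessage
{
    public string Text { get; set; }
}

public class EventService
{
    private readonly ConcurrentDictionary<Guid, Channel<EventMessage>> _subscribers =
        new ConcurrentDictionary<Guid, Channel<EventMessage>>();

    // The server-streaming RPC: each subscriber gets its own channel, which
    // the gRPC layer consumes as an IAsyncEnumerable.
    public async IAsyncEnumerable<EventMessage> Subscribe(
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        var id = Guid.NewGuid();
        var channel = Channel.CreateUnbounded<EventMessage>();
        _subscribers[id] = channel;
        try
        {
            // ReadAllAsync completes when the channel is completed; the token
            // fires when the client disconnects.
            await foreach (var msg in channel.Reader.ReadAllAsync(cancellationToken))
            {
                yield return msg;
            }
        }
        finally
        {
            _subscribers.TryRemove(id, out _);
        }
    }

    // "Fire an event" by writing to every subscriber's channel.
    public void Publish(EventMessage msg)
    {
        foreach (var channel in _subscribers.Values)
        {
            channel.Writer.TryWrite(msg);
        }
    }
}

Because the client's disconnect cancels the token, the finally block removes the subscriber - which addresses the unsubscribe concern from the question.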
redux-saga describes itself as "a library that aims to make application side effects (i.e. asynchronous things like data fetching and impure things like accessing the browser cache) easier to manage."
The select effect is just used to get a slice of the current Store's state. It doesn't produce any side effect at all (no I/O operation, no mutation, etc.); it's a purely functional operation. Why was a purely functional operation designed to be an effect?
Because, overall, none of your saga code is supposed to be interacting with the store directly. Whatever your saga needs to do, whether it be making an AJAX call, dispatching an action, or anything else, gets done by yielding an effect description and asking the middleware to do that work for you. Your saga doesn't have access to the store directly to call dispatch(), so there's no reason for it to have access to getState() directly either.
There is an existing third-party REST API available which accepts one set of input and returns the corresponding output. (Think of it as Bing's geocoding service, which accepts an address and returns location details.)
My need is to call this API multiple times (say 500-1000) for a single ASP.NET request, and each call may take close to 500ms to return.
I could think of three approaches for how to do this. I need your input on which could be the best possible approach, keeping speed as the criterion.
1. Using HTTP requests in a for loop
Write a simple for loop and, for each input, call the REST API and add the output to the result. This would by far be the slowest, but there is no overhead of threads or context switching.
2. Using async and await
Use async and await mechanisms to call the REST API. It could be efficient, as the thread continues to do other activities while waiting for the REST call to return. The problem I am facing is that, per recommendations, I should be using await all the way up to the topmost caller, which is not possible in my case. Not following it may lead to deadlocks in ASP.NET, as mentioned here: http://msdn.microsoft.com/en-us/magazine/jj991977.aspx
3. Using Task Parallel Library
Use a Parallel.ForEach and the synchronous API to invoke the server in parallel, and use a ConcurrentDictionary to hold the result. But this may result in thread overhead.
Also, let me know if there is any other better way to handle things. I understand people might suggest tracking performance for each approach, but I would like to understand how people have solved this problem before.
The best solution is to use async and await, but in that case you will have to take it async all the way up the call stack to the controller action.
The for loop keeps it all sequential and synchronous, so it would definitely be the slowest solution. Parallel will block multiple threads per request, which will negatively impact your scalability.
Since the operation is I/O-based (calling a REST API), async is the most natural fit and should provide the best overall system performance of these options.
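As an illustrative sketch only (the controller, request shape, and endpoint URL are hypothetical, assuming ASP.NET Web API), "async all the way up" might look like this:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;
using System.Web.Http;

public class GeocodeController : ApiController
{
    private static readonly HttpClient Client = new HttpClient();

    // The action itself is async, so nothing blocks on the way up the stack.
    public async Task<IEnumerable<string>> Post(IEnumerable<string> addresses)
    {
        // Start all the REST calls, then await them together.
        var tasks = addresses.Select(a =>
            Client.GetStringAsync("https://api.example.com/geocode?q=" + Uri.EscapeDataString(a)));
        return await Task.WhenAll(tasks);
    }
}

Note that this fires every request at once; see the throttling suggestion in the next answer.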
First, I think it's worth considering some issues that you didn't mention in your question:
500-1000 API calls sounds like quite a lot. Isn't there a way to avoid that? Doesn't the API have some kind of bulk query functionality? Or can't you download their database and query it locally? (The more open organizations like Wikimedia or Stack Exchange often support this, the more closed ones like Microsoft or Google usually don't.)
If those options are not available, then at least consider some kind of caching, if that makes sense for you.
The number of concurrent requests allowed to the same server in ASP.NET is only 10 by default. If you want to make more concurrent requests, you will need to set ServicePointManager.DefaultConnectionLimit.
Making this many requests could be considered abuse by the service provider and could lead to blocking of your IP. Make sure the provider is okay with this kind of usage.
Now, to your actual question: I think that the best option is to use async-await, even if you can't use it all the way. You can avoid deadlocks either by using ConfigureAwait(false) at every await (which is the correct solution) or by using something like Task.Run(() => /* your async code here */).Wait() to escape the ASP.NET context (which is the simple solution).
Using something like Parallel.ForEach() is not great, because it unnecessarily wastes ThreadPool threads.
If you go with async, you should probably also consider throttling. A simple way to achieve that is by using SemaphoreSlim.
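For example, here is a hedged sketch combining the three suggestions above (the limit of 20 and the class name are arbitrary illustrations, not recommendations):

using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledClient
{
    private static readonly HttpClient Client = new HttpClient();
    private static readonly SemaphoreSlim Throttle = new SemaphoreSlim(20); // at most 20 calls in flight

    public static async Task<string[]> GetAllAsync(IEnumerable<string> urls)
    {
        // Normally set once at application startup; shown here for completeness.
        ServicePointManager.DefaultConnectionLimit = 20;

        var tasks = urls.Select(async url =>
        {
            await Throttle.WaitAsync().ConfigureAwait(false);
            try
            {
                return await Client.GetStringAsync(url).ConfigureAwait(false);
            }
            finally
            {
                Throttle.Release();
            }
        });
        return await Task.WhenAll(tasks).ConfigureAwait(false);
    }
}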
So I just started playing with Meteor and am trying to get my head around the security model. It seems there are two ways to modify data.
The Meteor.call way, which seems pretty standard - pretty much just a call to the server with its own set of business rules implemented.
Then there is the Collection.allow method, which seems quite different from anything I've done before. It seems that if you put a Collection.allow in place, you're saying that the client can make any write operation to that collection as long as it can get past the validations in its allow function.
That makes me feel uneasy, because it feels like a lot of freedom, and my allow function would need to be pretty long to make sure it's locked down securely enough.
For instance, MongoDB has no schema, so you'd basically have to have a rule that defines which fields would be accepted and the format those fields must be in.
Wouldn't you also have to put in the business logic for every type of update that might be made to your system?
So say I had a SoccerTeam collection. There may be several situations in which I need to make a change: adding or removing a player, changing the team name, the team status changing, etc.
It seems to me that you'd have to put everything into this one massive function. It just sounds like a radical idea, but it seems Meteor.call methods would just be a lot simpler.
Am I thinking about this in the wrong manner (or for the wrong use case)? Does anyone have an example of how to structure an allow or deny function, with a list of what I may need to check in my allow function to make my collection secure?
You are following the same line of reasoning I used in deciding how to handle data mutations when building Edthena. Out of the box, meteor provides you with the tools to make a simple tradeoff:
Do I trust the client and get a more responsive UI (latency compensation)? Or do I require strict control over data validation, but force the client to wait for an update?
I went with the latter, and exclusively used method calls for a few reasons:
I sleep better at night knowing there exists exactly one way to update each of my collections.
I found that some of my updates required side effects that only made sense to execute on the server (e.g. making denormalized updates to other collections).
At present, there isn't a clear benefit to latency compensation for our app. We found the delay for most writes was inconsequential to the user experience.
allow and deny rules are weak tools. They are essentially only good for validating ownership and other simple checks.
At the time when we first released to production (August 2013) this seemed like a radical conclusion. The meteor docs, the API, and the demos highlight the use of client-side writes, so I wasn't entirely sure I had made the right decision. A couple of months later I had my first opportunity to sit down with several of the meteor core devs - this is a summary of their reaction to my design choices:
This seems like a rational approach. Latency compensation is really useful in some contexts like mobile apps, and games, but may not be worth it for all web apps. It also makes for cool demos.
So there you have it. As of this writing, my advice for production apps would be to use client-side updates where you really need the speed, but you shouldn't feel like you are doing something wrong by making heavy use of methods.
As for the future, I'd imagine that post-1.0 we'll start to see things like built-in schema enforcement on both the client and server which will go a long way towards resolving my concerns. I see Collection2 as a significant first step in that direction, but I haven't tried it yet in any meaningful way.
stubs
A logical follow-up question is "Why not use stubs?". I spent some time investigating this but reached the conclusion that method stubbing wasn't useful to our project for the following reasons:
I like to keep my server code on the server. Stubbing requires that I either ship all of my model code to the client or selectively repeat parts of it again. In a large app, I don't see that as practical.
I found the overhead required to separate out what may or may not run on the client to be a maintenance challenge.
In order for the stub to do anything other than reject a database mutation, you'd need to have an allow rule in place - otherwise you'd end up with a lot of UI flicker (the client allows the write but the server immediately invalidates it). But having an allow rule defeats the whole point, because a user could still write to the db from the console.
The usual allow methods I have are these:
MyCollection.allow({
  insert: function () { return false; },
  update: function () { return false; },
  remove: function () { return false; }
});
And then I have methods which take care of all insertions. These methods perform the type checks and permission assessment. I have found that to be a much more maintainable approach: completely decoupling the data layer from the code which runs on the client.
For instance, MongoDB has no schema, so you'd basically have to have a rule that defines which fields would be accepted and the format those fields must be in.
Take a look at Collection2. They support schema checking at run-time before inserting documents into the Collection.
According to the documentation for zumero_sync:
If a large amount of information needs to be pulled from the server, this function may need to be called more than once.
In my Android app that uses Zumero that's no problem; I just keep calling zumero_sync until the return value doesn't start with "0;".
However, now I'm trying to write an admin script that also syncs with my server dbfiles. I'd like to use the sqlite3 shell, and have the script pass the SQL to execute via command line arguments. I need to call zumero_sync in a loop (which SQLite doesn't support) to make sure the db is fully synced. If I had to, I could invoke sqlite3 in a loop (reading its output, looking for "0;"), or even write a C++ app to call the SQLite/Zumero functions natively. But it certainly would be easier if a single zumero_sync was enough.
I guess my real question is: could zumero_sync be changed so it completes the sync before returning? If there are cases where the existing behavior is more useful, maybe there could be a parameter for specifying which mode to use?
I see two basic questions here:
(1) Why does zumero_sync() work the way it does?
(2) Can it work differently?
I'll answer (2) first, since it's easier: Yes, it could work differently. Rather, we could (and probably will soon, now that you brought this up) implement an additional function, named something like zumero_sync_complete(), which performs [the guts of] zumero_sync() in a loop and returns after the sync is complete.
We didn't implement zumero_sync_complete() because it doesn't add much value. It's a simple loop, so you can darn well write it yourself. :-)
Er, except in scripting environments which don't support loops. Like the sqlite3 shell.
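Purely as an illustration of that outer loop, an admin script could drive the sqlite3 shell from a small program and repeat the questioner's "0;" check from the outside. The database path is a placeholder and the zumero_sync arguments are deliberately left elided; this is not an official Zumero sample:

using System;
using System.Diagnostics;

class SyncUntilDone
{
    // Runs the sqlite3 shell once with the given SQL and returns its stdout.
    static string RunSqlite(string dbPath, string sql)
    {
        var psi = new ProcessStartInfo("sqlite3", $"\"{dbPath}\" \"{sql}\"")
        {
            RedirectStandardOutput = true,
            UseShellExecute = false
        };
        using (var proc = Process.Start(psi))
        {
            string output = proc.StandardOutput.ReadToEnd().Trim();
            proc.WaitForExit();
            return output;
        }
    }

    static void Main()
    {
        string result;
        do
        {
            // zumero_sync's arguments are elided here; fill them in per the Zumero docs.
            result = RunSqlite("local.db", "SELECT zumero_sync(/* ... */);");
        } while (result.StartsWith("0;")); // mirror the Android approach: keep calling while the result starts with "0;"
    }
}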
Answer to (1):
The Zumero sync protocol is designed to give the server the flexibility to return partial results if it wants to do so. And for the sake of reducing load on the server (and increasing its scalability) it often does want to do exactly that.
Given that, one reason to expose this to the client is to increase the client's flexibility as well. As long as we're making multiple roundtrips, we might as well give the client an opportunity to do something (like, maybe, update a progress bar) in between them.
Another thing a client might want to do in between loop iterations is handle an error.
Or, in the case of a multithreaded client, it might want to deal with changes that happened on the client while the sync is going on.
Which raises the question of how locking should be managed. Do we hold the SQLite write lock during the entire loop, or only when absolutely necessary?
Bottom line: A robust app would probably want to implement the loop itself so that it can make its own decisions and retain full control over things.
But, as you observe, the sqlite3 shell doesn't have loops. And it's not an app. And it doesn't have threads. Or progress bars. So it's a use case where a simpler-and-less-powerful form of zumero_sync() would make sense.
Looking into asynchronous address resolution in winsock, it seems that the only two options are either to use the blocking gethostbyname on a separate thread, or to use WSAAsyncGetHostByName. The latter is designed, for some reason, to work with window messages instead of overlapped operations and completion ports/routines.
Is there any version of gethostbyname that works asynchronously with overlapped operations, in a similar manner to the rest of the winsock API?
Unfortunately there isn't at present, although GetAddrInfoEx() has placeholders for all the right things for async operation via all of the 'usual' routes (including IOCP) so I expect there will be eventually... Unfortunately, at this time, the docs say that all of these must be set to NULL and are marked as 'reserved'. :(
I'm just about to write one (have been for a while)... It's unfortunate that WSAAsyncGetHostByName doesn't even allow concurrent name resolution, so it's pretty useless as a base for what I want; but, then again, since it doesn't handle IPv6 that also makes it pretty useless to me. I expect I'll start from scratch; possibly using something like this (beerware) as a base.
Sorry, there is no overlapped version of gethostbyname().