What is the minimum access requirement of Rebus?

Because of security concerns, we created our MSMQ queues manually instead of relying on Rebus's automatic queue creation. So what are the minimum queue access rights that Rebus needs on those queues?
Is full control really necessary?
Thanks
Yin

When Rebus starts up, it will automatically create the necessary input queue and error queue (if it's local) if they're not there.
If they're already there, it will check whether the queues are transactional because it cannot live up to its message delivery guarantees if they're non-transactional.
When you create the queues manually, only the transactionality check will be performed - other than that, the only operations performed are send/receive operations.
I'm not an expert on MSMQ access rights, so all I can say is that if a) the transactionality check can be performed, and b) appropriate send/receive access rights are in place (depending on the specific endpoints/queues, obviously), then everything should work OK.

Related

Rebus pub-sub system. Process each message by one and only one subscriber

With rebus, I'd like to be able to publish from one application, and subscribe from multiple applications, where each message only gets processed by one subscriber in a round robin, as described here:
https://www.rabbitmq.com/tutorials/tutorial-two-dotnet.html
Is this possible with rebus?
Yes, but the words "publish" and "subscribe" are confusing here: this is not "Publish/Subscribe" as described in the literature, because pub/sub pretty much means that you do not care how many listeners there are.
What you want is to send messages and have them distributed among multiple consumers.
With Rebus, that is definitely possible. However, the way you do it may depend slightly on which transport you are using (*).
Most transports have pretty good support for the competing consumers pattern, where you simply start multiple processes, probably running on multiple machines, that consume messages from the same queue.
This way, each message gets processed by exactly one consumer, and each consumer will receive messages at a rate that suits that particular consumer.
(*) MSMQ is not good at distributing load when multiple processes are taking messages off of the same queue, especially not when the processes are running on other machines than where the queue is.
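To make the competing consumers pattern concrete, here is a minimal in-process sketch in Python. It is not Rebus code: the in-memory `queue.Queue` merely stands in for a shared input queue on a real transport, and the function name `run_competing_consumers` is made up for this illustration.

```python
import queue
import threading


def run_competing_consumers(messages, consumer_count=3):
    """Send each message once; whichever consumer is free picks it up."""
    work = queue.Queue()              # stand-in for the shared input queue
    handled = []                      # (consumer_id, message) pairs
    handled_lock = threading.Lock()

    def consume(consumer_id):
        while True:
            message = work.get()
            if message is None:       # sentinel: no more work for this consumer
                break
            with handled_lock:
                handled.append((consumer_id, message))

    consumers = [
        threading.Thread(target=consume, args=(i,)) for i in range(consumer_count)
    ]
    for consumer in consumers:
        consumer.start()
    for message in messages:
        work.put(message)
    for _ in consumers:
        work.put(None)                # one sentinel per consumer
    for consumer in consumers:
        consumer.join()
    return handled
```

Every message ends up in `handled` once, but which consumer took it depends on timing, which is the point: consumers pull work at their own pace instead of the sender choosing a recipient.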

Rebus FileSystemMessageQueue

I discovered that Rebus contains FileSystemMessageQueue. It seems too good to be true, so I wanted to ask a few questions about it :)
1) Is it thread-safe/process-safe?
2) Is it transactional?
3) Why does it use JSON as the serialization format (doesn't that add limitations on POCOs compared to a binary serializer)?
4) Could it work separately, without the bus (just as a separate DLL, not a service)?
5) For a small number of messages, could it replace MSMQ? I mean, how does it compare to MSMQ for local (non-networked), not resource-intensive messaging? Would it be as good as MSMQ?
Thanks in advance
The FileSystemMessageQueue started out as a fun experiment because I wanted to use Dropbox as a transport - which actually seems to work, but I have not tested it in any way, apart from making the transport pass Rebus' usual transport contract tests and showing it off at a couple of user group meetings and such :)
Therefore: Please understand that you'll be the one testing the transport, and if you do use it you'll almost immediately be the one in the world with the most experience in using it :)
</disclaimer>
1) The transport keeps track of which message files are currently being handled to ensure that the same file is not being received twice, so you can safely have multiple threads receiving messages in the same endpoint.
You cannot do competing consumers though, because there's currently no locking that can span multiple processes (could probably be done though, by using the OS to lock the files and keep the file handle for the time it takes to handle the message).
2) No. It satisfies the same at-least-once delivery guarantee as all the other transports in Rebus, but it is not transactional and it is not capable of committing its work atomically.
I've made the transport postpone the actual writing of the outgoing messages to the point after you've done your own work in your message handler, so messages won't be visible to recipients too soon - but in theory you could run into a situation where a bunch of outgoing messages were sent, and then the deletion of the received message file fails, which will result in receiving the same message again - that's why it's called "at least once" ;)
3) It uses JSON because that's an easy way to write an object to a file (even though the actual message body is serialized and encoded using the configured serializer).
4) ??? I don't understand your question :)
5) Yes and no - I guess that it would be just as good as MSMQ if we speak about local and not resource intensive messages.
I haven't performed any load tests, but I'm guessing it will be much slower than MSMQ regarding message volume. I do think that it is capable of transferring messages that are much much bigger than MSMQ though, because MSMQ still has (to my knowledge) a hard upper cap of 4 MB per message.
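The ordering described in 2) (run the handler, then write outgoing messages, then delete the received file) can be sketched roughly as follows. This is a hypothetical Python illustration, not the actual FileSystemMessageQueue code: the `FileQueue` class, the `.msg` naming scheme and the `handle_next` function are all assumptions made for the example.

```python
import os
import uuid


class FileQueue:
    """Hypothetical one-file-per-message queue (not Rebus's implementation)."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def send(self, body):
        # Write under a temporary name, then rename, so that receivers
        # never pick up a half-written file.
        final_name = os.path.join(self.directory, uuid.uuid4().hex + ".msg")
        temp_name = final_name + ".tmp"
        with open(temp_name, "w", encoding="utf-8") as f:
            f.write(body)
        os.replace(temp_name, final_name)

    def pending(self):
        return sorted(n for n in os.listdir(self.directory) if n.endswith(".msg"))


def handle_next(inbox, outbox, handler, delete=os.remove):
    """Handle one message; return False if the inbox was empty."""
    names = inbox.pending()
    if not names:
        return False
    path = os.path.join(inbox.directory, names[0])
    with open(path, encoding="utf-8") as f:
        body = f.read()
    outgoing = handler(body)      # the handler only collects outgoing bodies
    for out_body in outgoing:     # step 1: outgoing messages become visible
        outbox.send(out_body)
    delete(path)                  # step 2: if this fails, the message comes back
    return True
```

If `delete` raises after step 1, the received file is still in the inbox, so the next call handles the same message again and sends its outgoing messages a second time. That duplicate is exactly what "at least once" allows, so handlers should tolerate it.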

How do I create a memory bound message queue in Erlang?

I want the speed of asynchronous messages but still have some flow control. How can I accomplish this in Erlang?
There is no per-process memory limit right now. It has been discussed on the mailing list, so you can look at those threads.
On the upside, when you use an OTP behaviour such as gen_server, you have a lot of freedom in retrieving messages from the process queue and measuring the length of the queue.
gen_server2, used in RabbitMQ, optimizes this by moving messages into an internal data structure.
With that in place, you can discard any new incoming message when the internal queue is too long.
You can do it silently or notify the sender that the message was rejected.
All of that is at a very low level.
RabbitMQ will provide this functionality on AMQP level.
A common and quite good way of enforcing flow control is to turn well-selected messages into calls, which limits each client to one outstanding request against the server at a time, effectively providing force feedback in an extremely simple way. The trick is of course to pick which communications use synchronous calls :-)
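The "discard when the internal queue is too long" idea from the first answer can be sketched in a language-neutral way. The example below is in Python rather than Erlang and is only an illustration: `BoundedMailbox`, `offer` and `take` are made-up names, and the class is meant to be used from a single thread, in the same way that an Erlang process owns its own state.

```python
from collections import deque


class BoundedMailbox:
    """Internal message buffer that rejects new messages once it is full."""

    def __init__(self, max_length):
        self.max_length = max_length
        self._messages = deque()

    def offer(self, message):
        """Return True if accepted, False if rejected because the buffer is full."""
        if len(self._messages) >= self.max_length:
            return False          # the caller may drop silently or notify the sender
        self._messages.append(message)
        return True

    def take(self):
        """Return the oldest buffered message, or None if the buffer is empty."""
        return self._messages.popleft() if self._messages else None
```

Returning a boolean from `offer` leaves the policy to the caller: ignoring it gives the silent drop, and replying to the sender on `False` gives the explicit rejection.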

How to best implement a blocking/waiting actor?

I'm fairly new to Akka and writing concurrent applications and I'm wondering what's a good way to implement an actor that would wait for a redis list and once an item becomes available it will process it, or send it to a different actor to process?
Would using the blocking function BRPOPLPUSH be better, or would a scheduler that will ask the actor to poll redis every second be a better way?
Also, on a normal system, how many of these actors can I spawn concurrently without consuming all the resources the system has to offer? How does one decide how many of each actor type an actor system should be able to handle on the system it's running on?
As a rule of thumb you should never block inside receive. Each actor should rely only on CPU and never wait, sleep or block on I/O. When these conditions are met you can create even millions of actors working concurrently. Each actor is supposed to have a memory footprint of 600-650 bytes (see: Concurrency, Scalability & Fault-tolerance 2.0 with Akka Actors & STM).
Back to your main question. Unfortunately there is no official Redis client "compatible" with Akka philosophy, that is, completely asynchronous. What you need is a client that instead of blocking will return you a Future object of some sort and allow you to register callback when results are available. There are such clients e.g. for Perl and node.js.
However, I found fyrie-redis, an independent project which you might find useful. If you are bound to a synchronous client, the best you can do is either:
poll Redis periodically without blocking and inform some actor by sending it a message with the Redis reply, or
block inside an actor and understand the consequences
See also
Redis client library recommendations for use from Scala
BRPOPLPUSH will block for a long time (up to the timeout you specify), so I would favour a Scheduler instead, which still blocks, but for a shorter amount of time every second or so.
Whichever way you go, because you are blocking, you should read this section of the Akka docs which describes methods for working with blocking libraries.
Do you have control over the code that is inserting the item into redis? If so, you could get that code to send your akka code a message (maybe over ActiveMQ using the akka camel support) to notify it when the item has been inserted into redis. This would be a more event-driven way of working and would spare you from having to poll, or block for very long periods of time.
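The "block, but only briefly, and somewhere other than the message handler" approach from the answers above can be sketched as follows. This is Python rather than Akka/Scala and uses no Redis client at all: `make_fake_list_pop` is an in-memory stand-in for a blocking pop with a timeout, and `start_poller` is a hypothetical helper that plays the role of a dedicated thread forwarding items to a worker's inbox.

```python
import queue
import threading


def make_fake_list_pop(items):
    """In-memory stand-in for a blocking list pop with a timeout."""
    source = queue.Queue()
    for item in items:
        source.put(item)

    def pop(timeout_seconds):
        try:
            return source.get(timeout=timeout_seconds)
        except queue.Empty:
            return None               # timed out, nothing available

    return pop


def start_poller(pop_with_timeout, inbox, stop_event, timeout_seconds=0.05):
    """Run a dedicated thread that pops items and forwards them to `inbox`."""

    def loop():
        while not stop_event.is_set():
            item = pop_with_timeout(timeout_seconds)   # blocks only briefly
            if item is not None:
                inbox.put(item)       # "send a message" to the worker

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread
```

Because each pop blocks for a short, bounded time, the poller notices `stop_event` quickly, and the worker reading from `inbox` never blocks on the external store itself.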

When should a queue be used?

Suppose we were to implement a network application, such as a chat with a central server and several clients: we assume that all communication must go through the central server, then it should pick up messages from some clients and forward them to target clients, and so on.
Regardless of the technology used (sockets, web services, etc..), it is possible to think that there are some producer threads (that generate messages) and some consumer threads (that read messages).
For example, you could use a single queue for incoming and outgoing messages, but using a single queue, you couldn't receive and send messages simultaneously, because only one thread at a time can access the queue.
Perhaps it would be more appropriate to use two queues: for example, this article explains a way in which you can manage a double queue so that producers and consumers can work almost simultaneously. This scenario may be fine if there is only one producer and one consumer, but if there are many clients:
How can the central server receive data simultaneously from multiple input streams?
How can the central server send data simultaneously to multiple output streams?
To resolve this problem, my idea is to use a double queue for each client: on the central server, each client connection may be associated with two queues, one for incoming messages from that client and one for outgoing messages addressed to that client. In this way the central server may send and receive data simultaneously on almost all the connections with the clients...
There are probably other ways to manage the queues ... What are the parameters that determine how many queues are needed and how to organize them? Are there cases that do not need any queue?
To me, this idea of using a queue per client or multiple queues per client seems to miss the point. First of all, it is absolutely possible to build a queue which can be accessed simultaneously by 2 threads (one can be enqueueing an item while a different one is dequeueing another item). If you want to know how, post a specific question about that.
Second, even if we assume that only 1 thread at a time can access a single queue, and even if we assume that the server will be receiving or sending data to/from all the clients simultaneously, it still doesn't follow that you need a different queue for each client. To avoid limiting system performance, you just need to allow enough concurrency to utilize all the server's CPUs. Even with a single, system-wide queue, if dequeueing/enqueueing messages is fast enough compared to the other work the server is doing, it might not be a bottleneck. (And with an efficient implementation, simply inserting an item or removing an item from a queue should be very fast. It's a very simple operation.) For that message queue to become the bottleneck limiting performance, either you would need a LOT of CPUs, or everything else the server was doing would have to be very fast. In that case, you could work out some scheme with 2 or 4 system-wide queues, to allow 2x or 4x more concurrency.
The whole idea of using work queues in a multi-threaded system is that they 1) allow multiple consumers to all grab work from a single location, so producers can "dump" whatever work they need done at that single location without worrying about which consumer will do it, and 2) function as a load-balancing mechanism for the consumers. (Additionally, a work queue can act as a "buffer" if producers temporarily generate work too fast for the consumers.) If you have a dedicated pair of producer-consumer threads for each client, it calls into question why you need to use queues at all. Why not just do a synchronous "pass off" from dedicated producer to corresponding dedicated consumer? Or, why not use a single thread per client which acts as both producer and consumer? Using queues in the way which you are proposing doesn't seem to really gain anything.
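As an illustration of the first point above (one thread enqueueing while another dequeues), here is a minimal sketch of a two-lock linked queue in Python. The class name `TwoLockQueue` is made up for this example, and the design (a dummy head node with separate head and tail locks) is one well-known way to do this, not necessarily what the answerer had in mind. The sketch relies on reference assignment being atomic, which holds in CPython.

```python
import threading


class _Node:
    __slots__ = ("value", "next")

    def __init__(self, value=None):
        self.value = value
        self.next = None


class TwoLockQueue:
    """FIFO queue where enqueue and dequeue take different locks."""

    def __init__(self):
        dummy = _Node()
        self._head = dummy            # dummy node; the first real item is head.next
        self._tail = dummy
        self._head_lock = threading.Lock()
        self._tail_lock = threading.Lock()

    def enqueue(self, value):
        node = _Node(value)
        with self._tail_lock:         # only producers contend on this lock
            self._tail.next = node
            self._tail = node

    def dequeue(self):
        """Return the oldest value, or None if the queue is empty."""
        with self._head_lock:         # only consumers contend on this lock
            first = self._head.next
            if first is None:
                return None
            self._head = first        # `first` becomes the new dummy node
            value = first.value
            first.value = None        # drop the reference held by the dummy
            return value
```

Since `None` signals "empty" here, the sketch cannot carry `None` as a payload. A producer and a consumer never wait on each other's lock, which is the property the answer refers to.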
