I would like to ask about the Rebus Timeout Manager. I know there is an internal timeout manager and an external timeout manager, and I have been using the internal timeout manager for quite some time, sharing one timeout database (SQL Server) across all my endpoints.
I would like to know if this is correct.
Secondly I would like to know if I can also use one external Timeout Manager for all my endpoints.
My question comes from the fact that the information contained in the Timeouts table (id, due_time, headers, body) has no connection to the endpoint that sent a message to the timeout manager.
I just would like to get assurance.
Regards
You can definitely use the internal timeout manager like you're currently doing.
The SQL Server-based timeout storage is safe to use concurrently from multiple instances, as it uses some finely trimmed lock hints when reading due messages, thus preventing issues that could otherwise arise from concurrent access.
But it's also a valid (and often very sensible) approach to create a dedicated timeout manager and then configure all other Rebus instances to use that.
And you are absolutely right that the sender of the timeout is irrelevant. The recipient is determined when sending the timeout, so that
await bus.DeferLocal(TimeSpan.FromMinutes(2), "HELLO FROM THE PAST 🙂");
will send the string to the bus' own input queue, and
await bus.Defer(TimeSpan.FromMinutes(2), "HELLO FROM THE PAST 🙂");
will send the string to the queue mapped as the owner of string:
.Routing(r => r.TypeBased().Map<string>("string-owner"))
In both cases, the message will actually be sent to the timeout manager, which will read the rbs2-deferred-until and rbs2-defer-recipient headers and keep the message until it is due.
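For completeness, the two setups could be configured along these lines (a sketch only; the exact extension methods depend on your Rebus version and installed packages, and the queue names and connection string are placeholders):

using Rebus.Activation;
using Rebus.Config;

var connectionString = "server=.; database=RebusTimeouts; trusted_connection=true";

// Endpoint A: internal timeout manager, storing timeouts in the shared
// SQL Server database (requires the Rebus SQL Server persistence package)
var activatorA = new BuiltinHandlerActivator();
Configure.With(activatorA)
    .Transport(t => t.UseMsmq("endpoint-a"))
    .Timeouts(t => t.StoreInSqlServer(connectionString, "Timeouts"))
    .Start();

// Endpoint B: no local timeout storage; all deferred messages are sent
// to a dedicated timeout manager endpoint listening on "timeout-manager"
var activatorB = new BuiltinHandlerActivator();
Configure.With(activatorB)
    .Transport(t => t.UseMsmq("endpoint-b"))
    .Timeouts(t => t.UseExternalTimeoutManager("timeout-manager"))
    .Start();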
I have yet to understand the behavior of a web server thread when I make an async call to, say, a database, and immediately return a response (say, 200 OK) to the client without even waiting for the async call to complete. First of all, is this a good approach? What happens to the thread that made the async call if it is reused to serve another request and the previous async call then returns on that particular thread? Or does the web server hold the thread until the async call it made returns? In that case many hanging threads would be left open while the web server stays available to take more requests. I am looking for an answer.
It depends on the way your HTTP server works. But you should be very cautious.
Let's say you have a main event loop taking care of incoming HTTP connections, and worker threads which manage the HTTP communications.
A worker thread should be considered ready to accept a new HTTP request only when it is completely ready for that.
In terms of pure HTTP, the most important thing is to avoid sending a response before having received the whole query. It seems simple, and it's usually the case. But if the query has a body, which may be a chunked body, it could take time to receive the whole message.
You should never send a response before that, unless it's something like a 400 Bad Request response followed by a real TCP/IP connection close. If you fail to do so and you have a message-length parsing issue, the fact that you sent a response before the end of the query may lead to security problems. It could be used to exploit differences in message parsing between your server and any other HTTP agent in front of it (SSL terminator, reverse proxy, etc.) in some sort of HTTP smuggling attack. For that agent, the fact that you sent a response means you had the whole message, so it can send the next message, which you will in fact mistake for just another part of the body.
Now if you have the whole message, you can decide to send an early response and detach an asynchronous task to really perform the work. But this means:
you have to assume that no more output should be generated: you will not try to send any output to the request issuer, and you should consider the communication closed;
the worker thread should not receive new requests to manage, and this is the hard part. If this thread is marked as available for a new request, it may also be killed by the thread manager (in Nginx or Apache you have request counters associated with workers, and they are killed after reaching a limit so that fresh ones can be created); it may also receive a graceful reload command (usually a kill), etc.
So you start to enter a zone where you need to know the internals of the HTTP server, which may or may not be managed by you, and where changes may appear sooner or later. And you start to do very strange things, which usually leads to strange issues that are hard to reproduce.
Usually the best way to handle asynchronous tasks, while still being able to understand what happens, is to use a messaging system. Put a list of tasks in a queue, and have a parallel asynchronous worker process consume these tasks. Track the status of these tasks if you need it.
The same may apply on the client side: after receiving a very fast HTTP answer, it may need to perform some AJAX polling for the task status, and you will then maybe only have to check the status of the task in the queue to send a response.
You will get more control on the whole thing.
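To make it concrete, here is a minimal in-process sketch of that pattern in C# (all names are made up; a real system would use a durable message broker rather than an in-memory queue):

using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical job record, tracked while a background worker processes it
class Job
{
    public Guid Id { get; } = Guid.NewGuid();
    public string Status { get; set; } = "queued";
}

class JobQueue
{
    readonly BlockingCollection<Job> _pending = new BlockingCollection<Job>();
    readonly ConcurrentDictionary<Guid, Job> _all = new ConcurrentDictionary<Guid, Job>();

    // Called from the HTTP request handler: enqueue and return immediately,
    // e.g. with "202 Accepted" plus the job id for later status polling
    public Guid Enqueue()
    {
        var job = new Job();
        _all[job.Id] = job;
        _pending.Add(job);
        return job.Id;
    }

    // Called from the status-polling endpoint
    public string GetStatus(Guid id)
    {
        Job job;
        return _all.TryGetValue(id, out job) ? job.Status : "unknown";
    }

    // Runs in a dedicated worker thread/process, independent of the HTTP workers
    public void RunWorker(CancellationToken ct)
    {
        foreach (var job in _pending.GetConsumingEnumerable(ct))
        {
            job.Status = "running";
            Thread.Sleep(1000); // stand-in for the actual heavy work
            job.Status = "done";
        }
    }
}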
Personally, I really dislike having detached threads, coming from strange code, performing heavy tasks without any way of outputting a status or reporting errors, and maybe preventing the application from stopping nicely (still waiting for strange threads to join) without resorting to a killall.
It depends on whether this asynchronous operation does something the client should be notified about.
If you return 200 OK (i.e. successfully completed) and the asynchronous operation later fails, the client will not know about the error.
You do of course have options, like sending some kind of push notification over a websocket, or having the client send another request which returns the actual result, and things like that. So it basically depends on your needs...
I'm currently investigating Rebus, but being unable to find good documentation, this process is proving difficult. I am hoping someone can help me understand this exciting product.
I have read that during message processing, if something goes wrong, the message will be returned to the queue.
Is the message returned to the front of the queue or placed at the end? If placed at the front, this will be a problem because the queue in essence becomes blocked by a message that may not be processable, at least until it times out or the retry count is exceeded.
Does Rebus have support for an out-of-the-box separate Retry queue?
Can I specify the interval between retries?
Can I specify an exponential backoff interval for retries as in Apache ActiveMQ?
Thanks
1) The queue transaction is rolled back, effectively moving the message back in front - therefore, it will be immediately retried.
After 5 failed attempts (at least that is the default), Rebus will move the message to the error queue. The default retry mechanism is intentionally very swift - this way, the input queue will never be clogged by poisonous messages.
If you need more sophisticated retries, I suggest you take a look at bus.Defer - it can defer delivery of a message to the future. It requires that you have a timeout manager(*) running, though.
2) I guess that's what I call "error queue", except there's no retry :)
I did create a solution some time ago, though, where I coded a simple endpoint that would periodically empty the error queue and move all the messages back into the original source queue, as a form of crude, automatic second-level retry mechanism.
3) No. NServiceBus has the concept of second-level retries, but this is something that I've never really needed (enough) with Rebus, so you're on your own here - it should be fairly easy to do some intelligent bus.Defer that can then be adapted to each kind of error you're expecting (see the sketch after the footnote below).
4) See (3)
I hope that clarifies a bit :)
(*) The timeout manager can be a separate endpoint whose only job in life is to receive a message, hold on to it for a while (i.e. save it to a database), and then return it to the sender when the time has elapsed. The timeout manager can also be hosted in-process, though, by using the .Timeouts(t => t.???) configuration spell.
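To flesh out (3) a bit, a hand-rolled second-level retry with exponential backoff could look roughly like this (a sketch assuming a recent Rebus version; the message type, exception type, and header name are all made up for the example):

using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Rebus.Bus;
using Rebus.Handlers;
using Rebus.Pipeline;

public class ImportData { }                                 // hypothetical message type
public class TemporaryDatabaseException : Exception { }     // hypothetical expected error

public class ImportDataHandler : IHandleMessages<ImportData>
{
    const string RetryCountHeader = "x-my-retry-count";     // custom header of my own choosing
    readonly IBus _bus;

    public ImportDataHandler(IBus bus) { _bus = bus; }

    public async Task Handle(ImportData message)
    {
        try
        {
            // ... the actual work goes here ...
        }
        catch (TemporaryDatabaseException)
        {
            var headers = MessageContext.Current.Headers;
            var attempt = headers.ContainsKey(RetryCountHeader)
                ? int.Parse(headers[RetryCountHeader])
                : 0;

            if (attempt >= 5) throw;                        // give up: Rebus moves it to the error queue

            var delay = TimeSpan.FromSeconds(Math.Pow(2, attempt + 1)); // 2, 4, 8, 16, 32 s backoff

            await _bus.Defer(delay, message, new Dictionary<string, string>
            {
                { RetryCountHeader, (attempt + 1).ToString() }
            });
        }
    }
}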
Here is my scenario. BizTalk needs to transfer a file from a shared/central document library. First BizTalk receives an incoming message with a reference/path to this document in the library. Then it simply needs to read it out from this library and send it (potentially through different adapters). This is, in essence, a scenario not so remote from the Claim Check EAI pattern.
Some ways to implement a claim check have been documented, notably BizTalk ESB Toolkit Claim Check, and BizTalk 2009: Dealing with Extremely Large Messages, Part I & Part II. These implementations do, however, assume that the send pipeline can immediately read the stream that has been "checked in."
That is not my case: the document will take some time before it is available in the shared library, and I cannot delay the initial received message. That leaves me with 2 options: either introduce some delay via an orchestration or ensure the send port will later on retry if the document is not there yet.
(A delay can only be introduced via an orchestration; there are no time-based subscriptions in BizTalk. Right?)
Since this is a message-only flow, I figured I could skip the orchestration. I have seen ways to implement "Custom Retry Logic in Message Only Solution Using Pipeline", but what I need is not only a way to control the retry behavior (as performed by the adapter) but also a way to enforce it right from within the pipeline…
Every attempt I made so far just ended up with a suspended message that won't be automatically retried, even though the send adapter had retries configured… If this is indeed possible, then where and how should I do it?
Oh right… and there is queuing… but unfortunately neither on premises nor in the cloud ;)
OK I may be pushing the limits… but just out of curiosity…
Many thanks for your help and suggestions!
I'm puzzled as to how this could be done without an Orch. The only way I can think of would be along the lines of:
The receive port for the initial messages just 'eats' the messages,
e.g. subscribing these messages to a dummy Send port with the Null Adapter,
ignoring them totally.
You monitor the Shared document library with a receive port, looking for any ? any new? document there.
Any located documents are subscribed by a send port and sent downstream.
An orchestration-based approach would be along the lines of:
Orch is triggered by the receipt of the initial notification of an 'upcoming' new file in the library. If your initial notification is request-response (e.g. an exposed web service), you can immediately and synchronously issue the response.
Another receive port is used to monitor availability and retrieve the file from the shared library, correlating to the original notification message (e.g. by filename, or another key).
A mechanism to handle the retry if the document isn't available, and potentially an eventual timeout, e.g. if the document never makes it to the shared library.
And on success, a send port to then send the document downstream
Placing the delay shape in the Orch will offer more scalability than e.g. using Thread.Sleep() or similar in custom adapter or pipeline code, since BTS just calculates and stamps the 'awaken' timestamp on the SQL record and can then dehydrate the orch, freeing up the thread.
The 'is the file there yet?' check can be done with a retry loop, delaying after each failed check, plus a parallel branch carrying a timeout, e.g. after an hour or so.
The polling interval can be controlled in the receive location, so I do not understand what you mean by there being no time-based subscriptions in BizTalk. You also have a schedule window.
One way to introduce delay is to send that initial message to an internal web service, which will simply post the message back to BizTalk after a specified time interval.
There are also loopback adapters, which simply post the message back into the MessageBox. This can be amended to add a delay.
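As a sketch, such a delaying web service could be as simple as this outside BizTalk (the receive-location URL is a placeholder; the real implementation would be whatever web stack you host internally):

using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class DelayRelay
{
    static readonly HttpClient Client = new HttpClient();

    // Receive a message, hold it for the configured interval, then POST it
    // back to a BizTalk HTTP receive location so normal subscriptions fire
    public static async Task RelayAsync(string messageBody, TimeSpan delay)
    {
        await Task.Delay(delay); // hold the message for the specified interval

        var content = new StringContent(messageBody, Encoding.UTF8, "application/xml");
        await Client.PostAsync("http://biztalk-host/receive/claimcheck", content);
    }
}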
I am using SignalR in an application I'm writing, and storing all the user connections in a concurrent dictionary:
ConcurrentDictionary<string, User> _users = new ConcurrentDictionary<string, User>();
e.g.
https://github.com/SignalR/SignalR/blob/master/samples/SignalR.Hosting.AspNet.Samples/Hubs/ShapeShare/ShapeShare.cs
I have implemented the IDisconnect interface on my Hub, and I'm removing users from the dictionary when they disconnect.
I am wondering how reliable the Disconnect method really is.
Does it capture all the different ways that a user could disconnect?
I don't want the dictionary to grow and grow indefinitely.
I was thinking of maybe having a timer periodically traverse the dictionary and remove users who haven't had any recent activity.
Is this necessary? Can I rely on the Disconnect method?
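For reference, here is roughly what my setup looks like (a simplified sketch; the User type and Join method are stand-ins for what I actually store):

using System.Collections.Concurrent;
using System.Threading.Tasks;
using SignalR.Hubs;

public class User                           // hypothetical per-user state
{
    public string Name { get; set; }
}

public class UsersHub : Hub, IDisconnect
{
    static readonly ConcurrentDictionary<string, User> _users =
        new ConcurrentDictionary<string, User>();

    public void Join(string name)
    {
        // track the user under the current connection id
        _users[Context.ConnectionId] = new User { Name = name };
    }

    // called by SignalR when it decides the client has gone away
    public Task Disconnect()
    {
        User removed;
        _users.TryRemove(Context.ConnectionId, out removed);

        var done = new TaskCompletionSource<object>();
        done.SetResult(null);               // completed task (Task.FromResult on .NET 4.5+)
        return done.Task;
    }
}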
Check out https://github.com/SignalR/SignalR/wiki/Configuring-SignalR; there are settings for:
DisconnectTimeout
KeepAlive
& Heartbeat interval
These could all be applied to help in maintaining your dictionary.
In my experience a graceful disconnect works perfectly in SignalR (there are still problems with Windows apps); if it disconnects ungracefully, the connection will time out within a few minutes, and the Disconnect method will fire and remove it from your dictionary, like Drew said.
You could create a method that sends a message to all clients, logs the returned connection IDs, and then removes any entries that are old, but in practice the Disconnect method does work itself out. I would only implement the heartbeat interval if you really need to keep a very close eye on the connections.
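If you do add the periodic sweep mentioned in the question, a minimal sketch could look like this (it assumes your User type records a LastSeen timestamp that you update on every hub call; the intervals are arbitrary and should be tuned to your DisconnectTimeout):

using System;
using System.Collections.Concurrent;
using System.Threading;

class ConnectionSweeper
{
    // assumes: class User { public DateTime LastSeen; ... } (hypothetical)
    readonly ConcurrentDictionary<string, User> _users;
    readonly Timer _timer;

    public ConnectionSweeper(ConcurrentDictionary<string, User> users)
    {
        _users = users;
        // sweep every 5 minutes
        _timer = new Timer(Sweep, null, TimeSpan.FromMinutes(5), TimeSpan.FromMinutes(5));
    }

    void Sweep(object state)
    {
        var cutoff = DateTime.UtcNow.AddMinutes(-10);   // staleness threshold

        foreach (var pair in _users)
        {
            // remove users with no activity since the cutoff
            if (pair.Value.LastSeen < cutoff)
            {
                User removed;
                _users.TryRemove(pair.Key, out removed);
            }
        }
    }
}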
If it doesn't fire it's a bug and you should report an issue to the SignalR project on GitHub. Here's a list of open issues with Disconnects at this time.
Be aware that different transports have different disconnect detection logic, so, depending on which transport the user is using, you will see different patterns of when the Disconnect fires. But it SHOULD fire eventually for all transports.
I am using OpenPop.NET in my application. The application downloads mails from a POP3 account, saves all the attachments (CSV files), and processes them. This processing takes a lot of time. I am getting this exception, which I am not able to figure out:
Exception message: OpenPop.Pop3.Exceptions.PopServerException: The stream used to retrieve responses from was closed
at OpenPop.Pop3.Pop3Client.IsOkResponse(String response)
at OpenPop.Pop3.Pop3Client.SendCommand(String command)
at OpenPop.Pop3.Pop3Client.DeleteMessage(Int32 messageNumber)
At the end of processing the CSVs, the mails are deleted from the POP3 account. I believe this is where the exception is happening.
You really have two issues here.
One is that you are doing a lot of processing while connected to the POP3 server. When you are idle for too long, the server will simply disconnect you to save resources.
What you should do is fetch one email, process its attachments, and then reconnect to fetch the next. You could also fetch all the attachments first and then process them offline.
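In code, the second approach could look roughly like this (a sketch using OpenPop's Pop3Client; host, port, and credentials are placeholders):

using System.Collections.Generic;
using System.IO;
using OpenPop.Pop3;

class MailFetcher
{
    // Download every attachment while connected, delete the mails,
    // then disconnect; the slow CSV processing happens offline afterwards
    public static List<string> DownloadAttachments()
    {
        var savedFiles = new List<string>();
        Directory.CreateDirectory("attachments");

        using (var client = new Pop3Client())
        {
            client.Connect("pop.example.com", 995, true);   // placeholder host, SSL
            client.Authenticate("username", "password");    // placeholder credentials

            int count = client.GetMessageCount();
            for (int i = 1; i <= count; i++)                // POP3 messages are 1-based
            {
                var message = client.GetMessage(i);

                foreach (var attachment in message.FindAllAttachments())
                {
                    var path = Path.Combine("attachments", attachment.FileName);
                    File.WriteAllBytes(path, attachment.Body);
                    savedFiles.Add(path);
                }

                client.DeleteMessage(i);                    // applied when the session ends
            }
        }   // session ends here: deletions are committed, connection closed

        return savedFiles;
    }
}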
Second, I guess you are connecting to a Gmail account. Gmail has some weird characteristics, which this thread tries to pin down. One of them is that when you have fetched an email, it will not be available in the next POP3 session with the server. You can connect using a special username, where you prepend recent: to your normal username. This will show you the emails received in the last 30 days, even if they have already been shown to an earlier POP3 session.
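With OpenPop, that login would look like this (the account name is a placeholder):

// Gmail "recent mode": prefix the username with recent: to see the last
// 30 days of mail even if it was already fetched by another POP3 session
client.Authenticate("recent:yourname@gmail.com", "password");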
Hope it helps.
It sounds like something is trying to read a stream that has already been closed. Are you handling the streams at all, or is this done completely internally by the API? If you are handling them, there is a chance you are closing the streams yourself (this often happens when someone uses a StreamReader; most people don't realize that closing the StreamReader also closes the underlying stream).