I have an architecture in which there are separate services which run independently. My one of the service is continuously fetching frames from camera and sending them to another service which performs some processing on the frame (like face detection, face recognition etc) and sends back the results. Services can be run on different machines.
Please suggest any good library or something which is fast at transferring frames between services. I have already a few options in my mind like Kafka and ZeroMq but I am also confused between them, which to choose.
Any good pipeline design is also welcome. Thanks
After experimenting with the Kafka pipeline in a very similar scenario my suggestion is that Kafka is not a suitable choice for it. Since the frequency and size of messages are the concern here while dealing with the live image streams.
The choices that can be tested are:
Zero MQ
Rabbit MQ
Zero MQ will highly likely perform well in the given scenario.
Related
I want to fetch the data of a stock. Since the data changes very fast, is there any way to pull the data like 50-100 times a second from trading websites?
And can we implement that using a raspberry Pi 4 8gig model.
RasPi4 should be more than adequate for this task. Both the ethernet and WiFi hardware is capable of connections at these speeds. (Unless you’re running a bunch of other stuff on it.) Consider where your bottlenecks may be, likely ISP or other network traffic). Consider avoiding WiFi in favor of cat5e or cat6. Consider hanging this device off your router (edge) to keep lan traffic lower and consider QOS settings if you think this traffic may compete with other lan traffic.
This appears to be a general question with no specific platform in mind. For stocks, there are lots of platforms to choose from.
APIs for trading platforms often include a method to open a stream. Instead of a full TCP conversation for each price check, a stream tells the server to just keep on sending data. There are timeout mechanisms of course, but it is good to close that stream gracefully (It’s polite since you’re consuming server resources at a different scale. I’ve seen some financial APIs monitor and throttle stream subscribers who leave sessions open.).
For some APIs/languages you can find solid classes already built on GitHub. Although, if simply pulling and reading a stream then the platform API doc code snippets should be enough to get you going.
Be sure to find out what other overhead may be implicated. For example, if an account or API key is needed to open a stream then either a session must be opened first or the creds must be passed with the stream being opened. The API docs will say. If you’re new to this sort of thing, just be a detective and try to infer what is needed. API docs usually try to be precise and technically correct with the absolute minimum word count.
Simply checking the steam should be easy. Depending on how that steam can be handled by your code/script, it may be harder to perform logic on the stream while it is being updated. That’s usually a thread issue or a variable scope issue depending on the script/code. For what you’re doing I would consider Python or PowerShell depending on your skill-set and other design parameters.
I am comparatively new to ZeroMQ and would like some suggestions regarding its it's internal architecture.
I am planning to use ZeroMQ as a messaging framework for my work. The basic idea what I want to achieve is to be able to dynamically scale the infrastructure based on the load and computational capacity required to achieve a particular workflow deadlines.
So,if if there is a necessity to add more nodes, then the application spawns new nodes and the messaging framework should be able to incorporate the changes as well. I should also be able to be point where the additional computations should occur or how the framework dynamically adds the new nodes (if any). The event on a particular node decides subsequent actions to be performed on other nodes. Here is my scenario or my stack that I am thinking off, but wanted to know if it makes sense:
User applications
ZeroMQ messaging
Squid-Content based routing
Overlay
Physical Substrate
I am bit skeptical about the above stack as I believe ZeroMQ helps one to achieve most of the functionality and therby thereby making it simpler.
Few points about my stack:
Physical substrate are the total number of nodes that are available for the computations or as data sources.
Overlay is a logical network that is built dynamically upon the physical network based upon the closest nodes available for a particular workflow. i.e. if two nodes exchange data frequently, then those two nodes are placed logically close to one another. Is a separate overlay like CHORD etc required when we use ZeroMQ?
Squid is basically used for content based routing. Is Squid required when we use ZeroMQ?
ZeroMQ messaging is for the communication between different nodes for an application.
Basically, what I wanted to know is whether above stack can be made simpler given that ZeroMQ has richer functionalities. If so, can someone point or share the thoughts. I am however going through the documentations of ZeroMQ, I am finding it a bit difficult to understand the intrinsic design of ZeroMQ. Please help.
Thanks
There's so much specific to your use-case here that it's almost impossible to give any definite answers. ZeroMQ is not a direct replacement for the concepts you've built into your architecture, however it may meet the goals you're trying to meet depending on how you're using them.
My suggestion would be to put your current architecture aside and start trying to build up a new one with ZMQ as its core, and see where you run into limitations that are solved by the other parts of your stack.
As for the "intrinsic design" of ZMQ, here's the basics that you need to understand as a starting point:
A ZMQ socket handles connection details for you, including managing network hiccups - but this has limits that you'll need to know
There are different kinds of ZMQ sockets, and they have opinions about how you use them. Some of them communicate asynchronously, some of them are strictly synchronous, some are one way, some are bi-directional.
If a connection between two sockets is severed (e.g. one node goes down, there is a network failure - something more than a momentary hiccup), it's your job to recognize that and re-establish that connection
There is no built in brokering or topology, you have to design and build that all yourself.
... ultimately, ZMQ provides a toolset for you to build a messaging framework, it does not provide a fully realized messaging framework out of the box. So, yes, it has the power to replace some of the other tools you're currently using, but you'll have to build it.
Part of this question is I'm not even sure what exactly I'll need to ask, so I'll start with the situation and work out from there.
One project I'm working on involves use of COMET via the aspComet library. The use case of the program is somewhat of a collaborative slideshow. One person runs the bulk of it, with one or more participants able to perform certain actions. Low latency between when an action is performed on screen
Previously, it was just running on one server. Now, we're wanting to scale it out a bit, more for reliability than performance reasons. So, we have some boxes out in Rackspace's cloud and all that fun stuff.
I knew from the start of this that I was going to need to make some changes to the way the COMET stuff works since different people in the same "show" might be on different servers, and I have no way of knowing what "show" they belong to until after they have already arrived on the site.
I initially tackled this using the WCF Mesh provider, which wasn't well documented to start with, now I'm running into issues with dispatching messages to it sometimes get lost, or delayed (I'm not 100% sure what is going on there), but it screws up the long poll for COMET and breaks things in rather strange ways (Clicking a button may trigger an event, or it'll hang for 10 seconds {long poll duration} and not actually do anything).
More research leads me to believe one of the .Net service bus providers may do what I need. However, I can't find examples that would cover what I need:
No single point of failure (outside of a database)
No hardcoding of peers.
Near-realtime (no polling, event based would be best)
My ideal solution would involve that when a server comes up, it lets other servers know of its existence (Even if it's just a row in a table somewhere), and they can start sending broadcast messages among each other, with each server being both a publisher and subscriber. This is what I somewhat had in the WCF Mesh provider, but I'm not overly confident in that code.
Can anyone point me in the right direction with this? Even proper terms to look for in the docs for service bus providers would be good at this point. Or are service buses not what I want? At this point I would settle for setting up a Jabber server on each web server and use that, if it could fit within my constraints.
I can't speak a ton to NServiceBus, but I expect the answers will be similar.
Single point of failure: MSMQ can use multicasting, which means each endpoint will broadcast it's existence and no DB table is needed. RabbitMQ uses this Exchange-to-Queue binding process which means as long as the Rabbit instance or cluster is up then messages still exist. RabbitMQ can be clustered, MSMQ cannot be. *Note: You might have issues with multicasting with Rackspace, no idea how they work. If so, you'll have to fall back on the runtime services for MSMQ (not RabbitMQ), that would create a single point of failure because everyone has a single point to coordinate control messages through.
Hard coding of peers: discussed above a bit; MSMQ's multicast handles it. Rabbit it can also be done, just bind queues to an exchange you want to listen to. MassTransit takes care of this for you.
Near-realtime: These both use messaging which is near real time. There's no polling in your message consumer code.
I think a service bus seems like a reasonable solution for what you're trying. Some more details would likely be needed, but the general messaging approach is correct. There are other more light weight messaging libraries if you decided you just want something on top of RabbitMQ and configure Rabbit to handle most of the stuff.
To get started with MassTransit, we have documentation up: http://readthedocs.org/projects/masstransit/ and mailing list http://groups.google.com/group/masstransit-discuss. Join the mailing list if you have future questions and someone will try and help you out.
Is there an alternative to the "ordered delivery" on a send port in BizTalk? The sequence of the message is very important to me, so I created an orchestration that suspends the message when it is not in sequence, and resumes it when it is in sequence. I use a long running orchestration and direct port binding.
Now some messages are processed faster in the send pipeline, so it happens that sometimes the messages aren't in sequence (I use file adapter...).
Now when I check the "ordered delivery" the messages are in sequence no matter what, but the performance is really really bad (messages get bulked up in the send ports), so I need to find an alternative for the ordered delivery in the send port.
Any suggestions?
thx
Now ordered delivery does obviously add a lot of overhead with the FIFO pattern. Take a look at this article and look a the FIFO article in the first issue. Also take a look at BizTalk performance in general to help speed up some of the other areas on your solution. Now I've seen a few people try their own custom solution to ordering via .net and SQL and performance wasn't that much better because the ordering pattern takes time to process. ALso take a look at these resources around performance in general:
Considersations when planning a perf test -
http://msdn2.microsoft.com/en-us/library/aa972201.aspx
BizTalk 2006 adapter performance numbers -
http://msdn2.microsoft.com/en-us/library/aa972200.aspx
If your transport in or out is SOAP, read this scalability study -
http://msdn2.microsoft.com/en-us/library/aa972198.aspx
Good proof points for BizTalk performance with relation to infrastructure
setup - http://msdn2.microsoft.com/en-us/library/ms864801.aspx
Do you have multiple locations the data is being sent to? So, it has to be in sequence but could be partioned out. If so you can use correlation and ordered delivery and have several pipes that deliver and speed the process up.
Does anyone have experiance in a lot of these?
I'm not so intrested in the pdf creation part of LCDS.
Just for flex messaging which would give me the best performance? As far as I know LCDS and WebOrb both do real time streaming is that correct?
Basically the question is which gives quickest response and which will allow for most client connected to a single servlet container.
Thanks
Edit 1
This may be clearer what I want. I'm looking to server at least 5000 clients with sub second response times with push messages, I'm trying to figure out which is the most scalable option, I've been quoted several million push messages a day. Obviously we can throw more servers at the problem I'm not convinced thats the most maintainable option.
Its not media streaming I'm looking for, but more event updates. It must work without sticky sessions.
LiveCycleDS & WebOrb are the only ones providing messaging using sockets through RTMP protocol. Note that in this case the clients are not connected to a servlet container, but to a dedicated server included in the product distribution (bypassing the servlet mechanism).
There are more messaging servers on the market, Lightstreamer is one of them. Or Flash media server.
There are many more things to be taken into consideration when choosing a solution however (price, integration with various architectures (like DMZ) and frameworks, paid support, documentation, your relation with the sales representative etc).