I am currently developing a massively multi-threaded application that relies heavily on gRPC (only one service).
As I am using a single Channel object shared between threads, the number of stubs/clients I should use is not clear to me.
How many stubs should I instantiate in this case (1 or n)?
Thanks for your help
It doesn't really matter. Channels are the expensive object, whereas stubs/clients are lighter weight. Each stub/client is an allocation, but otherwise has little overhead.
In Java, you are free to share stubs as they are thread-safe.
We're looking at splitting up our architecture (and adding new components) using a Service Oriented Architecture (SOA). There will be a number of external APIs used by third parties, which we will expose through a REST HTTP interface. However, I was wondering what would be best to use internally, as all components are within our control and will be on the same network, though potentially built on different technologies (mainly .NET and Ruby on Rails).
Would there be big performance/functionality gains in using a messaging system (Redis, RabbitMQ, EMS, or other notable options I've not heard of) instead of HTTP (REST, SOAP, etc.)?
I've struggled to find good information on this topic and (as you can probably tell) I'm fairly new to this area, so any advice or good resources would be appreciated!
Thanks
Messaging tends to give you a more loosely coupled architecture. It can potentially be more robust as well, since individual components can fail without killing the entire infrastructure.
The downside is complexity, the paradigm shift to an asynchronous model, and possibly performance (especially if you're persisting messages everywhere).
You also need to ensure that your messaging system is particularly robust. A single aspect of your logic can go down and restart without affecting everything, but if you lose your core message base, then ALL of your logic is down waiting for the messaging to be back up.
Fortunately, a message bus can run for a long time without humans fiddling with it, and human intervention is the largest source of errors and instability in any system.
In addition to what @Will Hartung mentioned, I would also say that it depends on what you are going to do with your system. If you have mostly client-server type applications, where you have few servers/services and they tend to be completely independent, then it will probably be easier to implement service contracts via REST over HTTP.
If, on the other hand, your entire system is doing bi-directional communication, or if there are many inter-process calls (and particularly if every participant in the system is going to be both a client and a server at some point), then messaging is your best bet. Of the messaging options, I find that AMQP/RabbitMQ is the most feature-rich and easy to use of all of these. It offers you a true asynchronous platform to code against.
The key benefit of using messaging is that you can have queues for each type of message, so as your system expands and changes, the queues/messages can stay the same while the service that handles them changes underneath. It promotes separation of layers.
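To make the point about queues outliving the services behind them concrete, here is a minimal in-process sketch. It uses Python's standard `queue` module as a stand-in for a real broker such as RabbitMQ, and the names `QUEUES`, `publish`, `consume_one`, and the two handlers are hypothetical, chosen only for illustration.

```python
import queue

# One queue per message type; in a real system these would live in a broker.
QUEUES = {"order.created": queue.Queue(), "email.send": queue.Queue()}

def publish(message_type, payload):
    """Producers only know the message type, not who handles it."""
    QUEUES[message_type].put(payload)

def consume_one(message_type, handler):
    """Take one message off the queue and pass it to whichever handler is current."""
    payload = QUEUES[message_type].get()
    result = handler(payload)
    QUEUES[message_type].task_done()
    return result

def handle_order_v1(order):
    return f"v1 processed order {order['id']}"

def handle_order_v2(order):
    # A replacement service: same queue, same message shape, new logic.
    return f"v2 processed order {order['id']} for {order['customer']}"

publish("order.created", {"id": 1, "customer": "alice"})
publish("order.created", {"id": 2, "customer": "bob"})

first = consume_one("order.created", handle_order_v1)
second = consume_one("order.created", handle_order_v2)
print(first)
print(second)
```

The producer code never changes when `handle_order_v1` is swapped for `handle_order_v2`; only the consumer side does.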
Finally, and this is a huge thing in my opinion, the proper use of messaging promotes small, independent pieces of code. These are both more testable and more maintainable, and in general it simplifies your enterprise architecture. If you attempt to handle too many services from HTTP endpoints, you will eventually (over the course of a year or two) end up with either (1) way too many endpoints to keep track of or (2) an unmaintainable mess of spaghetti code.
My company started out with using a message-based framework, and it has worked very well for us. The RabbitMQ server has easily been the most reliable component. Feel free to ask if you have any more questions about messaging or SOA.
While reading the O'Reilly book Java Servlet Programming, I came across a statement that I couldn't understand. The text is as follows:
Servlets may also be allowed to persist between requests as object
instances, taking up far less memory than full-fledged processes.
How can I tell whether a servlet takes up far less memory than a full-fledged process?
It's hard to tell what this fragment is about without more context, but I guess it is a comparison between servlets and CGI. Basically, in a single JVM/servlet container you can deploy several singleton servlets. This means one servlet instance (occupying very little memory) can handle an unlimited number of requests (hardware limitations aside).
With CGI you had to create a separate process per request, which could cause more latency and the high memory usage mentioned.
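The "one long-lived object, many requests" model can be sketched outside of Java as well. The following is only an analogy in Python, not servlet code: `HelloHandler` is a hypothetical class playing the role of a servlet, and a thread pool plays the role of the container's request threads.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class HelloHandler:
    """Plays the role of a servlet: created once, reused for every request."""
    instances_created = 0

    def __init__(self):
        HelloHandler.instances_created += 1
        self._lock = threading.Lock()
        self.requests_served = 0

    def handle(self, name):
        # Shared state must be protected, since many threads call this method.
        with self._lock:
            self.requests_served += 1
        return f"Hello, {name}"

handler = HelloHandler()

# 100 "requests" are served by 4 threads, all using the same handler object.
with ThreadPoolExecutor(max_workers=4) as pool:
    responses = list(pool.map(handler.handle, [f"client-{i}" for i in range(100)]))

print(HelloHandler.instances_created, handler.requests_served)
```

Under the CGI model, each of those 100 requests would instead have started its own process, each with its own copy of the interpreter and handler code.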
I was wondering whether OCaml would do well, in terms of both performance and ease of implementation, when dealing with typical client/server interactions over TCP in a multi-threaded environment. I mean something really typical, like having a thread per client that receives data, applies changes to the game state, and sends them back to the clients.
I ask because I need to write a server for a game. I have always done these things in C, but now that I know OCaml I am curious whether it would be a good fit, or whether I'd just find myself trying to solve a typical problem in a language that isn't well suited to it.
Performance: probably not. OCaml's threads do not provide parallel execution; they are only a way to structure your program. The OCaml runtime itself is not thread-safe, so the only code that could possibly execute in parallel with an OCaml thread is interfaced C code (without callbacks to OCaml!).
Implementation-wise, there is a mutex on the run-time, which is released when calling blocking C primitives, and could also be released when calling C functions that do significant work.
Ease of implementation: it wouldn't be world-changing. You would have the comfort of OCaml and a pthread-like library on the side. If you are looking for new things to discover while leveraging what you have learnt of OCaml, I recommend JoCaml. It goes in and out of sync with OCaml, but there was a (re-)re-implementation quite recently, and even when it is slightly out of sync, it is a lot of fun and gives a completely new perspective on concurrent programs.
JoCaml is implemented on top of OCaml. What with the run-time not being concurrent and all, I am almost sure it uses separate processes and message-passing. But for the application that you mentioned it should do fine.
OCaml is quite suitable for writing network servers, although as Pascal observes, there are limitations on threading.
Fortunately, however, threading isn't the only way to organize such a program. The Lwt library (for Light Weight Threads) provides an abstraction of asynchronous I/O that is quite easy to use (particularly when combined with a bit of syntax support). Everything actually runs in one thread, but it's all driven by an asynchronous I/O loop (built on the Unix select call), and the programming style lets you write code that looks like direct code (avoiding much of the normal code overhead of doing asynchronous I/O in many other languages). For example:
lwt my_message = read_message socket in
let response = compute_response my_message in
send_response socket response
Both the read and the write happen back in the main event loop, but you avoid the normal "read, calling this function when you're done" manual overhead.
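The same direct-style flow exists in other languages that have an event loop. As an analogy only (this is Python's `asyncio`, not Lwt), the three steps above might look like the sketch below. The helpers `read_message`, `compute_response`, and `send_response` are hypothetical and simply mirror the names in the snippet, and a `socketpair` stands in for a real client connection.

```python
import asyncio
import socket

async def read_message(reader):
    # Suspends this coroutine until a full line arrives; the loop runs other work meanwhile.
    line = await reader.readline()
    return line.decode().strip()

def compute_response(message):
    return message.upper()

async def send_response(writer, response):
    writer.write((response + "\n").encode())
    await writer.drain()

async def handle(reader, writer):
    # Reads like sequential code, but each await hands control back to the event loop.
    my_message = await read_message(reader)
    response = compute_response(my_message)
    await send_response(writer, response)

async def main():
    server_sock, client_sock = socket.socketpair()
    server_reader, server_writer = await asyncio.open_connection(sock=server_sock)
    client_reader, client_writer = await asyncio.open_connection(sock=client_sock)

    client_writer.write(b"hello\n")
    await client_writer.drain()
    await handle(server_reader, server_writer)
    reply = (await client_reader.readline()).decode().strip()

    for writer in (client_writer, server_writer):
        writer.close()
        await writer.wait_closed()
    return reply

reply = asyncio.run(main())
print(reply)
```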
I'm so sorry this question has been sitting here for eight years with what I consider to be several quite bad answers because they all ignore the elephant in the room.
You say "really typical like having a thread per client" but having an OS thread per client is an extremely bad design. Threads are heavyweight, taking a long time to create and destroy and consuming ~1MB just for the thread stack. If you have one thread per connection then 1,000 simultaneous client connections (which is entirely feasible) will burn 1GB of RAM just for their stacks, and the performance of your program (in any language) will be crippled by the amount of context switching required to get any work done. You don't want to use that design in any language, including both C and OCaml. Note that this problem is especially bad in tracing garbage-collected languages, because the GC also traverses all of those thread stacks in order to collate global roots before every GC cycle. I am the first to admit that this anti-pattern is ubiquitous in the real world, but please don't copy it! I have seen "low latency" servers in the finance industry written in C++ using one thread per connection, and they suffered latency stalls of up to six seconds just from the (Windows) OS servicing those threads.
See: http://people.eecs.berkeley.edu/~sangjin/2012/12/21/epoll-vs-kqueue.html
Let's consider an efficient design instead, like an epoll or kqueue interface to the OS kernel giving the server's code information about incoming and outgoing data buffers. Single threaded servers can attain excellent performance with this design. However, a typical server has serialization work to do per client and some core work that is often performed in serial across all client connections. Therefore, serialization and deserialization can be parallelized but the core server operation cannot. In this context, OCaml is great for everything except the serialization layer because it has poor support for parallelism.
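For readers who have not seen this design, here is a minimal single-threaded sketch in Python. It uses the standard `selectors` module, whose `DefaultSelector` picks the most efficient mechanism available on the platform (epoll on Linux, kqueue on BSD/macOS). The function names `make_listener` and `run_once` are hypothetical, and upper-casing the input stands in for the server's core work.

```python
import selectors
import socket

def make_listener(sel):
    """Create a non-blocking listening socket on an ephemeral localhost port."""
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ, data="listener")
    return listener

def run_once(sel, timeout=1.0):
    """Handle whatever sockets the kernel reports as ready, then return."""
    for key, _events in sel.select(timeout):
        sock = key.fileobj
        if key.data == "listener":
            conn, _addr = sock.accept()
            conn.setblocking(False)
            sel.register(conn, selectors.EVENT_READ, data="client")
        else:
            data = sock.recv(4096)
            if data:
                # Fine for tiny replies; a real server would buffer writes
                # and wait for EVENT_WRITE instead of calling sendall.
                sock.sendall(data.upper())
            else:
                sel.unregister(sock)
                sock.close()

sel = selectors.DefaultSelector()
listener = make_listener(sel)

# Drive the loop by hand with one client so the example terminates.
client = socket.create_connection(listener.getsockname())
client.sendall(b"ping")
run_once(sel)   # accepts the connection
run_once(sel)   # reads "ping" and writes the reply
client.settimeout(2.0)
reply = client.recv(4096)
print(reply)

client.close()
run_once(sel)   # sees EOF and drops the client
sel.unregister(listener)
listener.close()
sel.close()
```

One thread and one selector serve every connection, so adding clients costs a registered socket each rather than a thread stack each.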
I have personally implemented many servers for various industries with hugely varying performance requirements. In my experience, OCaml is one of the best tools for this because it offers excellent libraries (easy to use and reliable) and excellent serial performance. The only issue I have is around parallelizing the serialization layer but, in practice, I have found that OCaml runs circles around alternatives like Java and .NET even though they can parallelize this. I found typical latencies were ~100 µs for .NET and ~10 µs for OCaml.
See also: http://prl.ccs.neu.edu/blog/2016/05/24/measuring-gc-latencies-in-haskell-ocaml-racket/
OCaml will work great for networking applications as long as you can live with a relatively small number of threads active at one time—say no more than 100. You could consider MLdonkey as an example, although in the client space, not in the server space.
Haskell would be a better choice if you want to use many preemptive threads. GHC can support huge numbers of threads and they run in parallel on multicore systems. OCaml prefers cooperative multithreading and multiple processes.
I have been researching asynchronous messaging, and I like the way it elegantly deals with some problems within certain domains and how it makes domain concepts more explicit. But is it a viable pattern for general domain-driven development (at least in the service/application/controller layer), or is the design overhead such that it should be restricted to SOA-based scenarios, like remote services and distributed processing?
Great question :). The main problem with asynchronous messaging is that when folks use procedural or object-oriented languages, working in an asynchronous or event-based manner is often tricky, complex, and hard for the programmer to read and understand. Business logic is often much simpler if it's built in a synchronous manner: invoking methods and getting results immediately, etc. :)
My rule of thumb is generally to try to use simpler synchronous programming models at the micro level for business logic, then use asynchrony and SEDA (staged event-driven architecture) at the macro level.
For example submitting a purchase order might just write a message to a message queue; but the processing of the purchase order might require 10 different steps all being asynchronous and parallel in a high performance distributed system with many concurrent processes & threads processing individual steps in parallel. So the macro level wiring is based on a SEDA kind of approach - but at the micro level the code for the individual 10 steps could be written mostly in a synchronous programming style.
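A small sketch of that split, assuming an in-process setup with Python's standard `queue` and `threading` modules in place of real middleware. The stage functions `validate` and `price`, the `stage` runner, and the order fields are all hypothetical; only two of the "10 steps" are shown.

```python
import queue
import threading

STOP = object()  # sentinel that tells a stage to shut down

def stage(inbox, outbox, step):
    """Run one stage: pull an item, apply a plain synchronous function, pass it on."""
    while True:
        item = inbox.get()
        if item is STOP:
            outbox.put(STOP)
            break
        outbox.put(step(item))

# Micro level: ordinary synchronous functions.
def validate(order):
    return {**order, "valid": order["quantity"] > 0}

def price(order):
    return {**order, "total": order["quantity"] * order["unit_price"]}

# Macro level: stages wired together by queues, each on its own thread.
orders, validated, priced = queue.Queue(), queue.Queue(), queue.Queue()
workers = [
    threading.Thread(target=stage, args=(orders, validated, validate)),
    threading.Thread(target=stage, args=(validated, priced, price)),
]
for worker in workers:
    worker.start()

# Submitting a purchase order is just a write to the first queue.
orders.put({"id": 1, "quantity": 3, "unit_price": 5})
orders.put({"id": 2, "quantity": 1, "unit_price": 20})
orders.put(STOP)

results = []
while (item := priced.get()) is not STOP:
    results.append(item)
for worker in workers:
    worker.join()
print(results)
```

The business logic in `validate` and `price` knows nothing about queues or threads, which is what keeps it easy to read and test.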
Like so many architecture and design questions, the answer is "it depends".
In my experience, the strength of asynchronous messaging has been in the loose coupling it brings to a design. The coupling can be in:
Time - Requests can be handled asynchronously, helping overall scalability.
Space - As you point out, allowing for distributed processing in a more robust way than many synchronous designs.
Technology - Messages and queues are one way to bridge technology differences.
Remember that messages and queues are an abstraction that can have a variety of implementations. You don't necessarily need to use a JMS-compliant, transactional, high-performance messaging framework. Implemented correctly, a table in a relational database can act as a queue with the rows as messages. I've seen both approaches used to great effect.
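As a rough sketch of the table-as-queue idea, here is a version using Python's built-in `sqlite3` module with an in-memory database. The table name `messages`, its columns, and the `enqueue`/`dequeue` helpers are assumptions made for the example, and it handles a single consumer only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " body TEXT NOT NULL,"
    " status TEXT NOT NULL DEFAULT 'pending')"
)

def enqueue(body):
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO messages (body) VALUES (?)", (body,))

def dequeue():
    """Claim the oldest pending row, or return None if the queue is empty."""
    with conn:
        row = conn.execute(
            "SELECT id, body FROM messages WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute("UPDATE messages SET status = 'done' WHERE id = ?", (row[0],))
        return row[1]

enqueue("first")
enqueue("second")
received = [dequeue(), dequeue(), dequeue()]
print(received)
```

With several consumers against a shared database server, the claim step would also need some form of row locking so that two workers cannot take the same message.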
I agree with @BradS too, BTW.
BTW here's a way of hiding the middleware from your business logic while still getting the benefits of loose coupling & SEDA - while being able to easily switch between a variety of different middleware technology - from in memory SEDA to JMS to AMQP to JavaSpaces to database, files or FTP etc