Connecting the HTTP request/response model with an asynchronous queue

What's a good way to connect the synchronous http request/response model with an asynchronous queue based model?
When the user's HTTP request comes in, it generates a work request that goes onto a queue (beanstalkd in this case). One of the workers picks up the request, does the work, and prepares a response.
The queue model is not request/response - there are only requests, not responses. So the question is, how best do we get the response back into the world of HTTP and back to the user?
Ideas:
Beanstalkd supports lightweight topics or queues (they call them tubes). We could create a tube for each request, have the worker put a message on that tube, and have the HTTP process sit and wait on the tube for the response. I don't particularly like this one, since it leaves Apache processes sitting around taking up memory.
Have the HTTP client poll for the response. The user's initial HTTP request kicks off the job on the queue and returns immediately. The client (the user's browser) polls periodically for a response. On the backend, the worker puts its response into memcached, and we connect nginx to memcached so the polling is lightweight.
Use Comet. Similar to the second option, but with fancier HTTP communication to avoid polling.
I'm leaning towards 2 since it's easy and well known (I haven't used Comet yet). I'm guessing there's probably also a much better, obvious model I haven't thought of. What do you think?

Here's how to implement request/response efficiently on JMS, which might be helpful (though it's Java/JMS centric). The general idea is to create a temporary queue per client/thread, then use correlation IDs to match requests to replies.
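A rough sketch of that pattern against the plain JMS API (the request queue name, timeout, and connection setup are assumptions):

```java
// Requestor side: create a temporary reply queue and tag the request
// with a correlation ID so the reply can be matched back to it.
import javax.jms.*;

public class JmsRequestor {
    public String request(Connection connection, String payload) throws JMSException {
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Queue requestQueue = session.createQueue("work.requests");   // assumed queue name
        TemporaryQueue replyQueue = session.createTemporaryQueue();  // lives as long as the connection

        String correlationId = java.util.UUID.randomUUID().toString();
        TextMessage msg = session.createTextMessage(payload);
        msg.setJMSReplyTo(replyQueue);
        msg.setJMSCorrelationID(correlationId);
        session.createProducer(requestQueue).send(msg);

        // Only receive the reply that carries our correlation ID.
        MessageConsumer consumer = session.createConsumer(
                replyQueue, "JMSCorrelationID = '" + correlationId + "'");
        TextMessage reply = (TextMessage) consumer.receive(30_000); // wait up to 30 s
        return reply == null ? null : reply.getText();
    }
}
```

The worker simply reads JMSReplyTo and JMSCorrelationID off the incoming message and sends its reply to that destination with the same correlation ID.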

Polling is the simple solution; comet is the more efficient solution. You've got it nailed :)
I personally love comet (although I'm biased, since I helped write WebSync), it nicely lets your clients subscribe to a channel and get the message when your server process is ready. Works like a champ.

I'm looking to implement a beanstalkd and memcached system to run a number of processes following a request - in this case, looking up information when a user logs in (the number of messages a user has waiting, for example). The info is stored in memcached and then read back on the next page load.
Without knowing a little more about what tasks you are doing, though, it's not so easy to say what needs to be done, or how. Option #2 is, however, the simplest, and that may be all you need - depending on what you are pushing back into the workers.
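For what it's worth, a minimal sketch of the memcached half of option 2 in Java (using the spymemcached client; the key scheme and expiry are assumptions, and in practice the polling read could be served by nginx talking to memcached directly):

```java
// Worker side: after pulling a job off beanstalkd and doing the work,
// drop the result into memcached under a key derived from the job ID.
import net.spy.memcached.MemcachedClient;
import java.net.InetSocketAddress;

public class ResultStore {
    private final MemcachedClient cache;

    public ResultStore() throws java.io.IOException {
        cache = new MemcachedClient(new InetSocketAddress("localhost", 11211));
    }

    // Called by the worker once the job is done.
    public void publish(String jobId, String resultJson) {
        cache.set("result:" + jobId, 3600, resultJson); // expire after an hour
    }

    // Called by the polling endpoint; null means "not ready yet, keep polling".
    public Object poll(String jobId) {
        return cache.get("result:" + jobId);
    }
}
```

Because the result lives under a predictable key, the browser's polling request can be answered straight from memcached without touching the application servers.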

How does non-blocking IO actually work from the client's perspective?

So I came across the idea of blocking and non-blocking I/O. But what I understood from the concept and some of the sample implementations is that we implement code on the server side to achieve this non-blocking behaviour.
But now my question is: if (for example, Postman sending an HTTP request to the server) the request has to wait for the server to respond, then what's the point of non-blocking I/O? (Please correct me if I am wrong.) Or is the whole concept just about increasing the throughput of the server, rather than actual async behaviour with respect to the client?
For example, in one of my projects I created a POST request that creates a request in the system for processing and returns a transaction ID; using this transaction ID, I can query the server to learn the outcome.
I may sound too naive, but the concept has confused me a lot. I do not understand it clearly. Please help.
Thanks
the request has to wait for the server to respond, then what's the point of non-blocking I/O?
There's some confusion here. Waiting for a response and (non-)blocking I/O are very loosely related. You always have to wait for the response; that's why you made the request to begin with. But the question is: how?
Non-blocking HTTP: "Dear server, here's my request, please process it and send me a response, I'm going to do something else in the meantime, like calculating n-th digit of Pi (I'm a weirdo)".
Blocking HTTP: "Dear server, here's my request, please process it and send me a response, I'm going to patiently wait for it doing nothing".
Or the whole concept is just for the increase of throughput of the server instead of actual async nature w.r.t. to client.
The whole concept is about being able to do other things while waiting for I/O, and doing so while minimizing the use of threads, which don't scale well.
Asynchronous systems, i.e. systems without the "I'm going to wait idly" part, tend to perform better at the cost of complexity.
Side note: non-blocking I/O can be used both on the server side and the client side. For example, almost all JS engines in browsers are built on top of some asynchronous engine. JS is often single-threaded, meaning non-blocking I/O is necessary to achieve any concurrency.
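To make the blocking vs. non-blocking conversation above concrete, a minimal sketch using the JDK 11 HttpClient (the URL and the stand-in "other work" are placeholders):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.CompletableFuture;

public class BlockingVsNonBlocking {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create("https://example.com/")).build();

        // Blocking: the calling thread sits idle until the response arrives.
        HttpResponse<String> blocking = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("blocking got " + blocking.statusCode());

        // Non-blocking: the call returns immediately; the callback fires
        // whenever the response eventually shows up.
        CompletableFuture<HttpResponse<String>> future =
                client.sendAsync(request, HttpResponse.BodyHandlers.ofString());
        future.thenAccept(r -> System.out.println("async got " + r.statusCode()));

        computeNthDigitOfPi();  // "do something else in the meantime"
        future.join();          // don't exit before the async response lands
    }

    private static void computeNthDigitOfPi() { /* stand-in for other useful work */ }
}
```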
But what I understood from the concept and some of the sample implementations is that we implement code on the server side to achieve this nature of the code.
You implement code wherever you are doing the non-blocking I/O. What a server does has no bearing on whether a client uses blocking or non-blocking I/O, and what a client does has no bearing on whether a server uses blocking or non-blocking I/O.
if (for example postman sending HTTP request to the server) the request has to wait for the server to respond, then what's the point of non-blocking I/O?
So that you're not wasting resources.
Let's consider first a simple console application that hits the web and then does something with the results. In this case there's very little to gain with non-blocking I/O as the application is just going to be sitting around waiting for something to do anyway.
Now let's consider a simple console application that hits 50 different web resources and collates the responses. Now non-blocking I/O is more useful: with blocking I/O it would have to either fetch one resource after the other, or spin up 50 threads. With non-blocking I/O, one thread or a small number of threads is all that is needed to hit all 50 resources and respond promptly to each response as it returns (there's a sketch of this fan-out at the end of this answer).
Now let's consider a GUI version of this application that wants to remain responsive to user input, while also running on low-power low-memory devices in which blocked threads are all the more expensive. The advantages of the above are increased.
Finally, consider a web application that is doing I/O both with the client and also as a client to a database, file system and maybe other web applications. It may have multiple requests at the same time, and blocking on either the I/O it does with the client or any of the I/O it does with db, file or other applications would cost a thread, which would put a scalability limit on how many requests it can handle simultaneously. Not blocking on I/O allows threads to be used for other requests while the I/O is pending.
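Here is the 50-resource fan-out mentioned above, sketched with the JDK 11 HttpClient (the URLs are placeholders); all requests are in flight at once on a handful of threads:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class FanOut {
    public static void main(String[] args) {
        HttpClient client = HttpClient.newHttpClient();

        // Fire all 50 requests without waiting for any of them.
        List<CompletableFuture<String>> pending = IntStream.rangeClosed(1, 50)
                .mapToObj(i -> HttpRequest.newBuilder(
                        URI.create("https://example.com/resource/" + i)).build())
                .map(req -> client.sendAsync(req, HttpResponse.BodyHandlers.ofString())
                        .thenApply(HttpResponse::body))
                .collect(Collectors.toList());

        // Collate: block only once, when everything has arrived.
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        List<String> bodies = pending.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
        System.out.println("collected " + bodies.size() + " responses");
    }
}
```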

Using the asynchronous API to create nodes in ZooKeeper

While reading about ZooKeeper, I found that the accepted answer to this question says that concurrent writes are not allowed:
Explaining Apache ZooKeeper
Now my question is: since ZooKeeper has linear writes, that doesn't stop me from using the asynchronous API to create nodes and receive the response in a callback, does it? Internally it may not allow concurrent writes - or am I missing something?
Even though ZooKeeper operates as an ensemble, writes are always served through the leader. Therefore, the leader is able to queue write requests and complete them sequentially.
Using the asynchronous API will not do any harm to the above-mentioned approach. Even though the write requests are asynchronous (from the client side), the leader will always make sure that they are served sequentially. Once an asynchronous write request is served, the client is notified through the callback. It's as simple as that. Remember, the requests are asynchronous as viewed by the client, but from the leader's point of view they are served sequentially.
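For illustration, a minimal sketch using the ZooKeeper Java client's asynchronous create (the connection string, session timeout, and znode path are placeholders):

```java
import org.apache.zookeeper.AsyncCallback;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class AsyncCreateExample {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 15000, event -> {});

        // The call returns immediately; the callback fires once the leader
        // has committed the write (writes are still applied sequentially).
        zk.create("/request-", "payload".getBytes(),
                ZooDefs.Ids.OPEN_ACL_UNSAFE,
                CreateMode.PERSISTENT_SEQUENTIAL,
                (AsyncCallback.StringCallback) (rc, path, ctx, name) -> {
                    if (rc == KeeperException.Code.OK.intValue()) {
                        System.out.println("created " + name);
                    } else {
                        System.out.println("create failed: " + KeeperException.Code.get(rc));
                    }
                },
                null); // optional context object

        Thread.sleep(2000); // crude wait so the demo doesn't exit before the callback
        zk.close();
    }
}
```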

Implementing a background process responding to the client in an Atmosphere + Netty/Jetty application

We have a requirement to support 10k+ users, where every user initiates a request and waits for a response from the server (the response can take as long as 20-30 seconds to arrive). It is only one request from the client, and after lengthy processing by the server, a response will be transmitted and then the connection will be closed.
In the background, the server will do some DB searches and wait for other background processes to notify it of completion before responding to the client.
After doing some research, I figured out we will need to use something like the Atmosphere framework to support WebSockets/SSE events/long polling, along with an asynchronous server like Netty (=> Nettosphere) or Jetty.
As for my experience - mostly the Java EE world and the Tomcat server.
My questions are:
What will be easier to implement given my experience and our requirements: Atmosphere + Netty or Atmosphere + Jetty? Which one scales better, has an easier learning curve, and makes it easier to bring in other Java technologies?
How do you implement, in Atmosphere, a response that is sent only to the originating client and not broadcast to the rest of the clients? (All the examples I found broadcast.)
How can I implement our response in Netty (or Jetty) when using the Atmosphere framework? I.e., the client sends a request; after it is received on the server, some background processes are run, and when they finish I need to locate the connection and transmit the response. Is that achievable?
Some thoughts:
At 10k+ users, with 20-30 second response latency, you likely hit file descriptor limits if using just 1 network interface. Consider a solution that uses multiple network interfaces.
Your description of your request/response can be handled entirely with standard Servlet 3.0, standard HTTP/1.1, async request handling, and large timeouts (see the sketch after these notes).
If your clients are web browsers, and you don't start sending a response from the server until the 20-30 second window, you might hit browser idle timeouts.
Atmosphere and CometD do the same things, supporting long-duration connections, with connection-technique fallbacks and with logical channel APIs.
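To make the Servlet 3.0 point above concrete, here is a minimal sketch of async request handling (the servlet path, timeout, and the way the background work is kicked off are assumptions):

```java
import javax.servlet.AsyncContext;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.util.concurrent.CompletableFuture;

@WebServlet(urlPatterns = "/work", asyncSupported = true)
public class LongJobServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) {
        // Detach the request from the container thread; it is NOT blocked
        // for the 20-30 seconds the job takes.
        AsyncContext ctx = req.startAsync();
        ctx.setTimeout(60_000); // comfortably larger than the expected 20-30 s

        CompletableFuture
                .supplyAsync(LongJobServlet::doExpensiveWork) // DB search, waiting on other processes, etc.
                .thenAccept(result -> {
                    try {
                        ctx.getResponse().getWriter().write(result);
                    } catch (IOException ignored) {
                    }
                    ctx.complete(); // response goes out to the originating client only
                });
    }

    private static String doExpensiveWork() {
        return "{\"status\":\"done\"}"; // placeholder for the real result
    }
}
```

Because the AsyncContext is tied to the original request, the response naturally reaches only the client that initiated it; no broadcast is involved.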
I believe the Akka framework will handle this sort of need. I am looking at using it to handle scaling issues, possibly with RabbitMQ to help offload work to other servers that may be added later to scale as needed.

Constantly retrieving information from the stream

How do Facebook, Google Plus, and other information websites constantly retrieve information from the stream?
I suppose there is some asynchronous retrieval, but how does it get updates constantly? Is it like an infinite loop?
Which technology is used?
There are a few different approaches to displaying updates in near-real time on the web. Here are some of the most common ones:
Short polling
The simplest approach to the problem is to continuously poll the server on a short interval (hence the name). This means that every few seconds, client-side code sends an asynchronous request to the server and displays the result. The downside to this approach is that if updates happen less frequently than the server is queried, the client is doing a lot of work for little payoff. There may also be a slight delay between when the event happens on the server and when the client receives it, based on the polling frequency.
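As a rough sketch of the client side (in Java, using the JDK 11 HttpClient; the URL and the 5-second interval are placeholders), this is all short polling amounts to:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ShortPollClient {
    public static void main(String[] args) {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create("https://example.com/updates")).build();
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

        // Ask the server every 5 seconds whether anything changed,
        // regardless of whether there is actually anything new.
        scheduler.scheduleAtFixedRate(() ->
                client.sendAsync(request, HttpResponse.BodyHandlers.ofString())
                        .thenAccept(r -> System.out.println("poll result: " + r.body())),
                0, 5, TimeUnit.SECONDS);
    }
}
```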
Long polling
The next evolutionary step from short polling is what's known as long polling, where the client-side JavaScript fires off an asynchronous request to the server as soon as the page loads. The server only responds to the request when an update is made, and once the response reaches the client, another request is fired off immediately. The key part of this approach is that the asynchronous request can wait for the server for a long time.
Long polling saves bandwidth and computation time, since the response is only handled when the server has something that changed. It does require more complex server-side logic, but it does allow for near-instant updates on the client side.
This question has a decent sample: How do I implement basic "Long Polling"?
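For the client side of that loop, a minimal sketch with the same JDK HttpClient (the endpoint and timeout are placeholders); the server is assumed to hold each request open until it has something to say:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class LongPollClient {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest poll = HttpRequest.newBuilder(URI.create("https://example.com/updates"))
                .timeout(Duration.ofSeconds(90)) // must outlast how long the server holds the request
                .build();

        while (true) {
            // The server deliberately doesn't answer until there's an update,
            // so this call may sit "pending" for a long time.
            HttpResponse<String> response = client.send(poll, HttpResponse.BodyHandlers.ofString());
            System.out.println("update: " + response.body());
            // As soon as one response arrives, loop around and re-open the request.
        }
    }
}
```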
WebSockets
WebSockets are a relatively new technology, and allow for two-way communication in a way that's similar to standard network sockets. The server or client can send messages across the socket that trigger events on the other side of the connection. As nice as this is, browser support isn't widespread enough yet to make it a dependable solution.
For the current WebSocket specification, take a look at RFC 6455.
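As a small illustration of the two-way model, a sketch using the JDK 11 WebSocket client (the URL is a placeholder):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.WebSocket;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.CountDownLatch;

public class WebSocketExample {
    public static void main(String[] args) throws Exception {
        CountDownLatch done = new CountDownLatch(1);

        WebSocket ws = HttpClient.newHttpClient().newWebSocketBuilder()
                .buildAsync(URI.create("wss://echo.example.com/socket"), new WebSocket.Listener() {
                    @Override
                    public CompletionStage<?> onText(WebSocket webSocket, CharSequence data, boolean last) {
                        // The server can push this at any time; no polling involved.
                        System.out.println("server says: " + data);
                        done.countDown();
                        webSocket.request(1); // ask for the next message
                        return null;
                    }
                })
                .join();

        ws.sendText("hello from the client", true); // messages flow both ways on one connection
        done.await();
        ws.sendClose(WebSocket.NORMAL_CLOSURE, "bye").join();
    }
}
```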

Why can HTTP handle only one pending request per socket?

Being curious, I wonder why HTTP, by design, can only handle one pending request per socket.
I understand that this limitation is because there is no 'Id' to associate a request to its response, so the only way to match a response with its request is to send the response on the same socket that sent the request. There would be no way to match a response to its request if there was more than one pending request on the socket because we may not receive the responses in the same order requests were sent.
If the protocol had been designed to have a matching 'Id' for requests and responses, there could be multiple pending requests on a single socket. This could greatly reduce the number of sockets used by internet browsers and applications using web services.
Was HTTP designed like this for simplicity even if it's less efficient or am I missing something and this is the best approach?
Thanks.
Not true. Read about HTTP/1.1 pipelining. Apache implements it and Firefox implements it, although Firefox disables it by default.
To turn it on in Firefox use about:config and write 'pipelining' in the filter.
see: http://www.mozilla.org/projects/netlib/http/pipelining-faq.html
It's basically for simplicity; various proposals have been made over the years that multiplex on the same connection (e.g. SPDY) but none have taken off yet.
One problem with sending multiple requests on a single socket is that it would cause inefficient queuing.
For instance, let's say you are in a store and there are 2 cashiers and 10 people waiting to be checked out. The ideal way to run the line is to have a single queue of 10 people, with the next person in line going to a cashier when one becomes available. But if you sent all the requests at once, you would probably send 5 people to cashier A and 5 to cashier B. And what if you sent the 5 people with the largest shopping carts to the same cashier? That's bad queuing, and that's what could happen if you queued a bunch of requests on a single socket.
NOTE: I'm not saying that you couldn't use queuing well, but it keeps it simple to do it right if there is no queuing on a single socket.
There are a few considerations I would review.
The first is related to the nature of TCP itself. TCP suffers from the 'head-of-line blocking' issue, where there can only be a single outstanding (unacknowledged) request in flight at the connection/TCP level. Given traditional latencies this can be a problem for load-time user experience compared to the parallel-connection scheme browsers employ today. The higher the latency of the link, the larger the impact of this fundamental limitation.
There is also a concurrency issue, in that sometimes you really want to load multiple resources incrementally / in parallel. Back in the day, one of the greatest features Mozilla had over Mosaic was that it would load images and objects incrementally, so you could begin to see what was going on and use a resource without having to wait for it to load. With fewer connections there is a risk that, for example, loading a large image on a page before a style sheet could be catastrophic from an experience point of view. Expecting some kind of mitigating intelligence or explicit configuration to optimally order requests may not be a realistic or ideal solution.
There are proposals such as HTTP over SCTP that will more or less totally correct the issue you raise at the transport level.
Also, realize that HTTP doesn't necessarily mandate a Content-Length header to serve data. Even if each HTTP response were ID'd, how would you manage streaming binary content with no content length (HTTP/1.0 style)? Or what if the client sent the Connection: close header to have the connection closed because the length isn't known?
To manage this you would have to use HTTP chunking (already present) in a multiplexed fashion (I don't think anyone implements this) and add some non-trivial work to many programs.
