how to send response directly from worker to client - nginx

When Nginx is used as a reverse proxy, the client connects to Nginx, and Nginx load-balances or otherwise forwards the request to a backend worker via CGI or a similar mechanism. What is it called, and how is it implemented, when the worker responds directly to the client, bypassing Nginx?
My question comes from two places: a) erlangonxen uses Nginx and a "spawner" app to launch a huge volume of instant-on workers, but the response still passes through the spawner (an expensive step); b) I recently skimmed an article that described this solution, but I can no longer find it.

I believe you have your jargon mixed up, so I'm going to ignore the proxy bit and assume this is about CGI. In that case you should be looking at FastCGI solutions. Nginx has FastCGI support built in.
This spawner, as you call it, is meant to provide concurrency, so that multiple CGI requests can be handled in parallel without having to spawn an interpreter for each request. Instead, the workers are spawned once and ideally live forever.
If selecting an available worker really is a performance bottleneck, then the implementation of this FastCGI daemon is severely lacking and you should look for a better solution. Worker selection should take a small fraction of the time of the worker's job.
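As a rough illustration of the long-lived worker idea described above, here is a minimal Python sketch. It is not how any particular FastCGI daemon is implemented: threads stand in for worker processes, the names `WorkerPool` and `worker_loop` are hypothetical, and uppercasing the payload stands in for real request handling. The point is that each worker starts once and that "selection" is simply whichever idle worker takes the next item off a shared queue.

```python
import queue
import threading


def worker_loop(worker_id, requests, started):
    """Long-lived worker: start once, then serve many requests."""
    started.append(worker_id)  # stands in for one-time startup work (hypothetical)
    while True:
        job = requests.get()
        if job is None:  # sentinel: shut this worker down
            break
        payload, reply = job
        # Placeholder for real request handling.
        reply.put((worker_id, payload.upper()))


class WorkerPool:
    """Fixed set of workers fed from one shared queue."""

    def __init__(self, size=4):
        self.requests = queue.Queue()
        self.started = []
        self.threads = [
            threading.Thread(
                target=worker_loop,
                args=(i, self.requests, self.started),
                daemon=True,
            )
            for i in range(size)
        ]
        for thread in self.threads:
            thread.start()

    def handle(self, payload):
        """Hand one request to whichever worker is free and wait for its reply."""
        reply = queue.Queue(maxsize=1)
        self.requests.put((payload, reply))
        return reply.get()

    def close(self):
        """Send one shutdown sentinel per worker and wait for them to exit."""
        for _ in self.threads:
            self.requests.put(None)
        for thread in self.threads:
            thread.join()
```

With a pool of three workers, ten calls to `handle` are served without any worker being started a second time, which is the property that separates this model from plain CGI.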

I'm not sure it's a jargon thing. The good news (for me anyway) is that I had read the articles and seen the diagrams; I just could not remember where. So, reverse proxy notwithstanding, I was looking for "direct server return" (DSR) and the spawner from the erlangonxen project.
I'm not certain whether or not these two technologies are going to work together. DSR seems to have fallen out of favor and I'll probably not use it at all, although in the given architecture it would seem to make sense to try: a) it limits the total number of trips and sockets; b) it allows some functions, like gzip, to be distributed nicely.
Anyway, "found it".

Related

HTTP response times GUI

I'm looking for an application available on CentOS that lets me periodically check connectivity response times between that server and a specific port of a remote server (in this case, one that serves a SOAP API).
Preferably something that lets me send periodic API calls, or, if that's not possible, just telnets to that remote port, but shows the results in a graph.
Does anyone know of an application that does this, without my having to write a script that logs results to a file, which is harder to read over time?
After digging and testing a bit more, I ended up using netdata:
https://www.netdata.cloud/
Awesome tool, extremely simple to use and install.

Web server tolerance to high client poll rate: Cowboy vs. Yaws web servers

I have been building a real-time notification system. It’s part of a web application, but events have to be seen as soon as they occur. Long polling was not an option because it would be expensive for the web server to hold on to connections when no events are available, so I had to go for short-lived polls.
Each client hits the web server every 2 seconds or so (a fairly high rate). When events are available, they are sent as JSON to the JavaScript client. This requires a server set-up that can handle a high number of short-lived connections. I have implemented one such system using the Yaws web server. However, because Yaws starts quite a number of other services, it feels heavy, and connections begin to get either refused or aborted when they go beyond 30,000 (maybe because I am running some ETS tables in the same Erlang VM as Yaws; separating these may require rpc:call/4, which I fear will increase latency). I know that there are operating-system-specific tweaks to do, and those have been done.
This would not be a problem if it were easy to cluster several Yaws instances. In Yaws, I am using a few appmods, and I am doing things RESTfully. I was thinking that the Cowboy web server might improve things here. I have not used Cowboy before, but I have used Misultin. Cowboy is a full-fledged OTP application, it seems easy to cluster, and, being lightweight, it may increase the number of clients the overall system can handle. Storage is on Mnesia, which I can distribute easily by adding more nodes (maybe by replication), so that there is a Cowboy instance in front of every Mnesia instance.
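To make the short-poll pattern concrete, here is a minimal sketch in Python (not Erlang, and not the poster's actual code). It assumes each client remembers the id of the last event it saw and sends it as a cursor on every poll. The class name `EventBuffer`, the `since` parameter and the JSON shape are all hypothetical.

```python
import json
import threading
from collections import deque


class EventBuffer:
    """Bounded in-memory event log that polling clients read by cursor."""

    def __init__(self, capacity=1000):
        # Oldest events fall off the left once capacity is reached.
        self._events = deque(maxlen=capacity)
        self._next_id = 1
        self._lock = threading.Lock()

    def publish(self, payload):
        """Append an event and return its id."""
        with self._lock:
            event_id = self._next_id
            self._next_id += 1
            self._events.append((event_id, payload))
            return event_id

    def poll(self, since=0):
        """Return a JSON body with every buffered event newer than `since`."""
        with self._lock:
            fresh = [{"id": i, "data": p} for i, p in self._events if i > since]
        # An empty poll hands back the same cursor, so the client can retry later.
        cursor = fresh[-1]["id"] if fresh else since
        return json.dumps({"cursor": cursor, "events": fresh})
```

A poll that finds nothing new returns immediately with an empty list, so the connection can be closed straight away, which is the short-lived behaviour described above.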
My questions are:
Is my speculation correct, that if I switched from Yaws to Cowboy, I might increase the performance significantly?
Yaws has a clean API via Appmods and the #arg{} record. Does Cowboy have an equivalent of these two things (illustrate please)?
Can Cowboy handle file uploads? If so, which server (Yaws or Cowboy), in your opinion would be better to use in the case of frequent file uploads? Illustrate how file uploads are done with Cowboy.
It is possible to run several Yaws instances on the same machine. Do you think that creating many Yaws instances per server (physical box) and having the client-load distributed across these would help? What do I need to know about doing this?
When I set the yaws.conf parameter max_connections = nolimit, how would I specify the same in Cowboy?
Now, I followed the interview with the Cowboy author, in which he discusses why Cowboy is more lightweight than Yaws. He says:
The biggest difference is the use of binaries instead of lists. The generic acceptor pool is another. I could list a lot of other small differences but I figure these aren’t the most interesting.
That is, because Cowboy uses the listener-pool library Ranch, and binaries rather than lists, it somehow ends up able to handle more connections.
Another quote from the same interview:
Since we use one process per connection instead of two, and we use binaries instead of lists, we end up using a lot less memory than other projects without user intervention. Cowboy is also lazy, it doesn’t do anything unless required. So we don’t have much in memory until the user starts calling functions.
I wonder how Yaws handles this case. My problem domain needs lightweight HTTP handling, and it is true that Yaws consumes more memory than, say, Mochiweb, Misultin or Cowboy. My greatest concern is that Yaws has the best and cleanest API: it gives us the #arg{} record containing everything we need, so that we can extract the fields ourselves, whereas the others expose numerous functions for extracting each piece. The same goes for documentation: the Yaws docs are pretty good and straightforward. Perhaps I need to look at more Cowboy code for things like file uploading and simple GET and POST request handling.
Otherwise, the questions I asked earlier remain pressing concerns. Yaws is pretty good, but it seems to be overkill for this fast, lightweight, short-lived, high-rate poll situation. What do you think?
Your 30000 refusal limit sounds an awful lot like a 32k limit somewhere. Either the default process count, which is 32k, or some system limit on file descriptors and so on. You should not rule out the possibility that the limitation is on the kernel side of things. I've seen systems come to their limits quite easily due to kernel configurations which can be really hard to handle.
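One quick way to check the file-descriptor side of this is to ask the operating system what a process is allowed to open. The sketch below uses Python's standard `resource` module (Unix only) and reports the limits for the Python process itself. The Erlang VM would have to be checked from the environment it is actually started in, for example with `ulimit -n` in the same shell. The 65536 threshold is only an illustrative assumption. On the Erlang side, `erlang:system_info(process_limit)` reports the process ceiling, which the `+P` flag to `erl` raises.

```python
import resource


def describe_fd_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def fmt(limit):
    """Render a limit value, mapping the 'no limit' marker to a readable word."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


if __name__ == "__main__":
    soft, hard = describe_fd_limits()
    print(f"open files: soft={fmt(soft)} hard={fmt(hard)}")
    # Every TCP connection consumes one descriptor, so a low soft limit
    # caps the number of simultaneous clients. 65536 is an arbitrary example.
    if soft != resource.RLIM_INFINITY and soft < 65536:
        print("soft limit is below 65536 and may cap concurrent connections")
```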

CGI vs. Long-Running Server

Can someone please explain this excerpt from golang's documentation on CGI:
"Note that using CGI means starting a new process to handle each request, which is typically less efficient than using a long-running server. This package is intended primarily for compatibility with existing systems."
I use CGI to make database puts and gets.
Is this inefficient? Should I be using a 'long-running server'?
If so what does that mean, and how do I implement it?
... http://golang.org/pkg/net/http/cgi/
Yes, it is inefficient. The cost of starting a whole new process is generally much more than just connecting through to an already-existing process, or doing something on a thread within the current process.
In terms of whether it's necessary, that depends. If you're creating a search engine to rival Google, I would suggest CGI is not the way to go.
If it's a personal website accessed once an hour, I think you can probably get away with it.
In terms of a long running server, you can generally write something like a plug-in for a web server which is running all the time and the web server just passes off requests to it when needed (and possibly multiple threads of "it").
That way, it's ready all the time and you don't have to wait while the web server starts another process to handle the request.
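To show what "long-running" means in practice, here is a minimal sketch of a server process that stays up and keeps its state between requests. It is written in Python purely for illustration. In Go, the standard `net/http` package with `http.ListenAndServe` plays the same role. The in-memory `STORE` dictionary stands in for the database the question mentions, and the URL layout (`/key`) is an assumption.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Created once at startup and reused by every request. A CGI program would
# have to rebuild this (or reopen its database connection) on each request.
STORE = {}
STORE_LOCK = threading.Lock()


class Handler(BaseHTTPRequestHandler):
    def _reply(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        key = self.path.lstrip("/")
        with STORE_LOCK:
            found = key in STORE
            value = STORE.get(key)
        if found:
            self._reply(200, {"key": key, "value": value})
        else:
            self._reply(404, {"error": "not found"})

    def do_PUT(self):
        key = self.path.lstrip("/")
        length = int(self.headers.get("Content-Length", 0))
        value = self.rfile.read(length).decode("utf-8")
        with STORE_LOCK:
            STORE[key] = value
        self._reply(200, {"key": key, "stored": True})

    def log_message(self, fmt, *args):
        pass  # keep the example quiet; remove to see per-request logs


def start_server(port=0):
    """Start serving on a background thread; port 0 lets the OS pick a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `start_server(8080)` and keeping the main thread alive gives a process that answers every request without being restarted. A value written with PUT is still there for a later GET because the process never exited in between.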
In fact, Apache itself does CGI via a module (like a plug-in) which integrates itself into Apache at runtime - the actual calling of external processes is handled from that module. The source code, if you think it will help, can be found in mod_cgi.c if you do a web search for it.
Another example is mod_perl, which is a Perl interpreter module.
One option to look into is FastCGI, where a long-running server program handles many requests instead of restarting for each one. FastCGI used to have a disadvantage with languages such as C, C++ and FPC: since they are not garbage collected, a small memory leak in one FastCGI program could bring the server down after millions of hits to the website. Regular old CGI acted as its own garbage collector: the program started for each page request and exited afterwards, so everything was cleaned up each time. In Go, garbage collection makes this kind of leak far less of a concern, but FastCGI could still have hidden gotchas, such as a leak in the garbage collector itself (unlikely, but gotchas like this might pop up) or heap fragmentation over time.
Generally, FastCGI and "long running" are premature optimization. I've seen people with 5 visitors a day to their personal home page yelling "hey, maybe I should use FastCGI" when in fact they would need 5 million visitors a day. They like to be hip and cool, so they start thinking about FastCGI before their site is even known by 3 people.
You need to ask yourself: does the server you are using have a lot of traffic? By a lot of traffic I don't mean 100 visitors a day; even 1000 unique visitors a day is not a lot.
It is unclear whether you want to write Go CGI programs for an Apache server, or Python ones for a Go server, or whether you are writing a Go server that has CGI capability for Python and Perl. Clarify what you are actually doing.
As for rivaling Google as a search engine, which someone mentioned in another answer: if you look at the history of Google, they actually coded their programs in C/C++ via some CGI system rather than using PHP, Perl, or other hip and cool stuff that the kids use. Look up the BackRub project and its template system from eons ago. It was called Ctemplate (compiled C programs that called upon HTML templates).
https://www.google.com/search?safe=off&q=google+backrub+template+ctemplate
Maybe Google figured out something like FastCGI before FastCGI existed, or they had their own proprietary solution similar to it. I don't know, since I didn't work at Google, but since they used C/C++ programs to power Google in the old days (and probably still do today for some things), they must have been using some CGI technology, even if it was modified for speed.

How to test the performance of an http JSON server?

How do I test the performance of an HTTP server that serves and accepts only JSON requests (POST and GET)? I'm new to web testing, so tell me if I'm going about it the wrong way.
I want to test whether:
the server is capable of handling hundreds of simultaneous connections.
the server is capable of serving thousands of requests per second.
the server does not crash or get stuck when the number of requests exceeds its capabilities, and continues to run normally when the number of requests drops back below average.
One way is to write some logic that repeats certain actions per run, and run multiple instances of it.
PS: Ideally, the tool/method should support compression like gzip as an option.
You can try JMeter and its HTTPSampler.
About gzip: I've never used it in JMeter, but it seems it can be done:
How to get JMeter to request gzipped content?
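Whatever tool is used, requesting gzipped content comes down to sending an `Accept-Encoding: gzip` header and decompressing the body when the server answers with `Content-Encoding: gzip`. Here is a minimal Python sketch of that exchange for a JSON endpoint. The helper names are hypothetical, and this shows only the HTTP mechanics, not how JMeter does it.

```python
import gzip
import json
import urllib.request


def build_request(url):
    """Build a GET request that tells the server gzip responses are acceptable."""
    return urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})


def decode_json(body, content_encoding):
    """Decompress the body if the server gzipped it, then parse it as JSON."""
    if (content_encoding or "").lower() == "gzip":
        body = gzip.decompress(body)
    return json.loads(body)


def get_json(url, timeout=5.0):
    """Fetch a JSON document, accepting either a gzipped or a plain response."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return decode_json(resp.read(), resp.headers.get("Content-Encoding"))
```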
Apache Bench (ab) is a command line tool that's great for these kinds of things. http://en.wikipedia.org/wiki/ApacheBench
ab -n 100 -c 10 http://www.yahoo.com/
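For a sense of what the `ab` invocation above measures (`-n` total requests, `-c` concurrent clients), here is a rough Python equivalent using only the standard library. It is a sketch for small experiments, not a replacement for `ab` or JMeter. The function names and the shape of the report are assumptions, and a thread pool in Python will not generate the load a dedicated tool can.

```python
import http.client
import time
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch(url, timeout=5.0):
    """Issue one GET and return (status, seconds); status is None on failure."""
    started = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
            status = resp.status
    except urllib.error.HTTPError as exc:
        status = exc.code  # the server answered, but with an error status
    except (OSError, http.client.HTTPException):
        status = None  # refused, timed out, or the response was cut short
    return status, time.perf_counter() - started


def run_load(url, total=100, concurrency=10):
    """Send `total` requests with at most `concurrency` in flight at once."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(fetch, [url] * total))
    elapsed = time.perf_counter() - started
    latencies = sorted(seconds for _, seconds in results)
    ok = sum(1 for status, _ in results if status == 200)
    return {
        "requests": total,
        "ok": ok,
        "failed": total - ok,
        "requests_per_second": total / elapsed,
        "median_seconds": latencies[len(latencies) // 2],
        "max_seconds": latencies[-1],
    }
```

Calling `run_load(url, total=100, concurrency=10)` mirrors the command above. Raising `concurrency` until `failed` becomes non-zero is one crude way to probe the overload behaviour the question asks about.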
If you are new to web testing then there are a lot of factors that you need to take into account. At the most basic level you want to do the things you have outlined.
Beyond this, you need to think about how poorly behaved clients might impact your service, e.g. by keeping connections alive or sending malformed requests. These may translate into exceptions on the server, which might in turn have additional impact (due to logging or slower execution). This means that you have to think of ways to break the service, and monitor events that have an impact at higher scales.
Microsoft has a fairly good introduction to performance testing for web applications.

Why use Mongrel2?

I'm confused about what purpose Mongrel2 serves that nginx doesn't already.
(Yes, I've read the manual, but I must be too much of a noob to understand how it's fundamentally different from nginx.)
My current web application stack is:
- nginx: webserver
- Lua: programming language
- FastCGI + LuaJIT: to connect nginx to Lua
- Postgres: database
If you could only name one thing, it would be that Mongrel2 is built around ZeroMQ, which means that scaling your web server has never been easier.
When a request comes in, Mongrel2 receives it (nothing unusual here; it's the same for Nginx and any other httpd). Next, Mongrel2 distributes the task of compiling a response to n ZeroMQ-enabled backends, waits for them to do the work, receives the results, compiles the response and sends it off to the client.
The magic is that n can be any number, that each of the n backends can be written in any language supported by ZeroMQ (20 or so), and that everything goes across the network, so each backend can be a dedicated box, possibly in another datacenter.
In other words: with Nginx and all the rest you have to handle scalability in your logic tier. Mongrel2 lets you start (from a request/response cycle point of view) right where the request hits your infrastructure, at the httpd, rather than letting complexity penetrate down to your logic tier, which blows complexity up by at least an order of magnitude, in my opinion.
You should look at the strengths of each and decide to use either or both, depending on your use cases.
While it seems on the surface that nginx does everything that Mongrel2 provides, you'll find there are major differences in focus between the two.
Nginx shines as a front-end web server that can proxy requests to your backend web servers or app servers and also serve static content.
Mongrel2 is a slight change in the stack. As mentioned, its power comes from its use of ZeroMQ as the transport layer between it and the backend app servers. It can serve dynamic request URLs (app requests) and direct the compute portion of the task out to different backends using ZeroMQ.
Mongrel2 allows you to serve not just HTTP, WebSockets and so on, but other protocols too (if you're so inclined), all from the same server. The user would never know that portions of the app are being served from different backends.
If the functional requirements of your web app keep changing, or you want to add things like streaming or the ability to code in different languages in the back end, then I would definitely look at Mongrel2. Or even use a hybrid, where nginx/haproxy/varnish handles static files and caching, and everything else is directed to Mongrel2.

Resources