Why use gunicorn with a reverse-proxy? - nginx

From Gunicorn's documentation:
Deploying Gunicorn
We strongly recommend to use Gunicorn behind a proxy server.
Nginx Configuration
Although there are many HTTP proxies available, we strongly advise that
you use Nginx. If you choose another proxy server you need to make sure
that it buffers slow clients when you use default Gunicorn workers.
Without this buffering Gunicorn will be easily susceptible to
denial-of-service attacks. You can use slowloris to check if your proxy
is behaving properly.
Why is it strongly recommended to use a proxy server, and how would the buffering prevent denial-of-service (DoS) attacks?

According to the Nginx documentation, a reverse proxy can be used to provide load balancing, provide web acceleration through caching or compressing inbound and outbound data, and provide an extra layer of security by intercepting requests headed for back-end servers.
Gunicorn is designed to be an application server that sits behind a reverse proxy server that handles load balancing, caching, and preventing direct access to internal resources.
If Gunicorn's synchronous workers are exposed directly to the internet, an attacker can mount a DoS attack by opening connections that trickle data to the server very slowly, as Slowloris does. Each such connection ties up a worker until it completes or times out.

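The reason is that many clients are slow and need time to send requests and consume server responses, while Gunicorn is designed to serve fast clients. A buffering proxy absorbs the slow transfer, so a worker is busy only while it generates the response. The documentation for Unicorn, a similar web server for Ruby, explains this situation.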
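To make the worker-starvation argument concrete, here is a rough simulation, not a measurement of Gunicorn itself. It assumes a fixed pool of synchronous workers, all requests arriving at once, and a proxy that fully absorbs the client transfer time. The function name, the workload numbers, and the timing model are all made up for illustration.

```python
import heapq


def time_to_drain(requests, workers, buffered):
    """Return when the last request finishes, with all requests arriving at t=0.

    requests: list of (transfer_seconds, app_seconds) pairs, in arrival order.
    workers:  number of synchronous workers, each handling one request at a time.
    buffered: True if a proxy absorbs the slow client transfer, so a worker is
              held only for app_seconds; False if the worker also waits on the
              client for transfer_seconds.
    """
    if workers < 1:
        raise ValueError("need at least one worker")
    free_at = [0.0] * workers  # min-heap: time at which each worker is next free
    finish = 0.0
    for transfer_seconds, app_seconds in requests:
        hold = app_seconds if buffered else transfer_seconds + app_seconds
        start = heapq.heappop(free_at)  # earliest available worker
        end = start + hold
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish


# Hypothetical workload: 4 trickling connections followed by 100 ordinary ones.
slow_clients = [(300.0, 0.01)] * 4
normal_clients = [(0.05, 0.01)] * 100
workload = slow_clients + normal_clients

direct = time_to_drain(workload, workers=4, buffered=False)
proxied = time_to_drain(workload, workers=4, buffered=True)
print(f"direct: {direct:.2f}s, behind a buffering proxy: {proxied:.2f}s")
```

In this toy model, four trickling connections are enough to occupy all four workers for the whole transfer time, so the ordinary clients wait behind them. With buffering, the workers only ever see requests that are ready to be processed.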

Related

Docker, nginx and several sites on one server

I have a server with nginx and one working app. I want to add several apps to this server. I would like to understand a few things.
What is the difference between load balancer and reverse proxy?
In which situations should I use the first, and in which situations should I use the second?
What should I use if my sites are static, and what if not static?
And additionally it would be a big plus to hear about containers in the context of several sites for nginx
Differences between load balancer and reverse proxy
A reverse proxy accepts a request from a client, forwards it to a server that can fulfill it, and returns the server’s response to the client.
A load balancer distributes incoming client requests among a group of servers, in each case returning the response from the selected server to the appropriate client.
Taken from nginx docs
TL;DR:
Reverse proxying is about routing requests to the correct server using the domain name.
Load balancing is about distributing load across multiple instances.
What should I use if my sites are static, and what if not static?
You can combine an HTTP reverse proxy and load balancing with both static and non-static web apps, so it depends on your setup.
And additionally it would be a big plus to hear about containers in the context of several sites for nginx
I recommend one nginx container per app / site + a dynamic reverse proxy, traefik in particular (http://traefik.io)
You need a reverse proxy to route the incoming traffic to the proper application taking into account the content of the original request (and rules that you may define).
Once the target application is determined, you need to load balance across its instances in order to distribute the work among them.
Both tasks can be done by classic software like nginx, Apache, or HAProxy, or by tools designed for the microservices world, like fabio, traefik and others.
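The two steps described above (route by domain name, then spread requests across instances) can be sketched in a few lines. This is only an illustration of the idea, not how nginx or traefik implement it. The class name, hostnames, and backend addresses are hypothetical, and plain round-robin is assumed as the balancing policy.

```python
import itertools


class ReverseProxy:
    """Toy model: pick a backend pool by Host header, then round-robin in it."""

    def __init__(self, pools):
        # pools maps a hostname to the list of backends serving that site.
        self._cycles = {}
        for host, backends in pools.items():
            if not backends:
                raise ValueError(f"no backends configured for {host}")
            self._cycles[host.lower()] = itertools.cycle(backends)

    def pick_backend(self, host):
        """Return the backend that should receive the next request for host."""
        cycle = self._cycles.get(host.lower())  # reverse-proxy step: routing
        if cycle is None:
            raise LookupError(f"no site configured for {host}")
        return next(cycle)  # load-balancing step: rotate through instances


# Hypothetical setup: a static site with one instance, an app with two.
proxy = ReverseProxy({
    "blog.example.com": ["10.0.0.10:8080"],
    "shop.example.com": ["10.0.0.21:8000", "10.0.0.22:8000"],
})
for _ in range(3):
    print(proxy.pick_backend("shop.example.com"))
```

A site with a single backend still goes through the routing step, which is why a reverse proxy is useful even when there is nothing to balance.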

Load balancing go servers in Beanstalk

I'm trying to load balance Go servers in AWS Beanstalk that use gRPC/Protobuf for data serialization. Beanstalk offers nginx as a reverse proxy for client-server communication, and it uses the HTTP/1.1 protocol. This results in bogus messages being exchanged between the proxy and the server, and client messages never seem to reach the server as intended. Any clean ideas would help here.
Nginx doesn't support HTTP/2 to the backend yet. Some of us are working on a fix for this, but it will take another quarter before we can upstream it. You can either wait for that or put Envoy (https://github.com/lyft/envoy) in front, which supports gRPC and HTTP/2 natively. Hope this helps.

What is "Reverse Proxy" and "Load Balancing" in Nginx / Web server terms?

These are two phrases I hear about very often, mainly associated with Nginx. Can someone give me a layman's definition?
Definitions are often difficult to understand. I guess you just need some explanation for their use case.
A short explanation is: load balancing is one of the functions of a reverse proxy, and a reverse proxy is one kind of software that can do load balancing.
And a long explanation is given below.
For example, suppose a service of your company has customers in the UK and Germany. Because the policy is different for these two countries, your company has two web servers, uk.myservice.com for the UK and de.myservice.com for Germany, each with different business logic. In addition, your company wants a single unified endpoint, myservice.com, for the service. In this case, you need to set up a reverse proxy as the unified endpoint. The proxy accepts requests for myservice.com and rewrites their URLs so that requests from the UK (determined by source IP) go to uk.myservice.com and requests from Germany go to de.myservice.com. A client in the UK never knows that the response was actually generated by uk.myservice.com.
In this case, the request traffic to the service is balanced between the servers at uk.myservice.com and de.myservice.com only as a side effect. So we normally don't say it is being used as a load balancer; we just call it a reverse proxy.
But let's say your company uses the same policy for all countries and has two servers, a.myservice.com and b.myservice.com, only because the workload is too heavy for one machine. In this case, we normally call the reverse proxy a load balancer to emphasize why it is being used.
Here is the basic definition:
A reverse proxy is a proxy host that receives requests from clients and sends each one to one of the servers behind it. Nginx and Apache httpd are commonly used as reverse proxies. A reverse proxy sits in the administrative network of the web server that serves the request.
This is in contrast with a (forward) Proxy, which sits in front of a client, and sends requests on behalf of a client to a web server. As an example, your corporate network address translator is a forward proxy. These are in the administrative network of the client from where the request originates.
Load balancing is a function performed by reverse proxies. The client requests are received by a load balancer, and the load balancer tries to send that request to one of the nodes (hosts) in the server pool, in an attempt to balance the load across various nodes.
I see both of them as functions of an HTTP/web server.
A load balancer's job is to distribute the workload among server nodes in a way that makes the best use of them.
A reverse proxy is an interface to the external world that forwards requests to a server node (even when there is only a single node).
Its other use cases include caching of static content, compression, etc.

HAProxy vs. Nginx

I was looking at using HAProxy and Nginx for load balancing, and I had some questions:
Should I use JUST HAProxy over Nginx for the proxy server?
Is there any reason to have HAProxy and Nginx installed on the same proxy server?
HAProxy is a load balancer; it doesn't know how to serve files or dynamic content. Nginx is a web server capable of many interesting things. If you only need load balancing and high availability in front of some third web server, then HAProxy is enough. If you need to serve some static content or apply some routing logic to requests before terminating them on a third server, then you may need nginx.
The reason you can see haproxy+nginx on the same host is that it allows you to bring down single nginx instances while haproxy continues to serve requests from other hosts. Imagine having a RR DNS using A records:
myapp.com IN A 1.1.1.1
myapp.com IN A 1.1.1.2
Where 1.1.1.1 and 1.1.1.2 are two hosts with haproxy+nginx configured to load balance between them. Now for some reason your 1.1.1.1's nginx goes down. The browsers that come to 1.1.1.1 are still being served by haproxy on it which in turn gets data from 1.1.1.2's nginx.
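The failover behaviour described above can be modelled roughly as follows. This is a sketch under assumptions: the function name is made up, the health table stands in for HAProxy's health checks, and "try the local nginx first, then the peers" is an assumed preference order rather than something the answer specifies.

```python
def serving_nginx(entry_host, nginx_up):
    """Return the host whose nginx answers a request that DNS sent to entry_host.

    entry_host: address the browser got from round-robin DNS; HAProxy on this
                host accepts the connection.
    nginx_up:   dict mapping each host address to whether its nginx is healthy.
    """
    # Assumed policy: prefer the local nginx, then fall back to the other hosts.
    candidates = [entry_host] + [h for h in nginx_up if h != entry_host]
    for host in candidates:
        if nginx_up.get(host, False):
            return host
    raise RuntimeError("no nginx instance is up")


# nginx on 1.1.1.1 is down, but HAProxy on 1.1.1.1 is still accepting traffic.
health = {"1.1.1.1": False, "1.1.1.2": True}
for entry in ("1.1.1.1", "1.1.1.2"):
    print(entry, "->", serving_nginx(entry, health))
```

Note that this only covers an nginx failure. If HAProxy itself or the whole host goes down, clients that resolved to that address are affected until they retry the other A record.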
HAProxy is definitely the better, more fully featured load balancer compared to the free nginx (not NGINX Plus, though one could argue that as well).
One thing that HAProxy sadly still can't do is generic UDP connections. So we used HAProxy and nginx on our logging lbs. But HAProxy released support for syslog/udp in 2.3 so we are about to change that. :)
We use HAProxy together with nginx. There are a number of reasons.
Nginx can do everything (more or less), but you don't want your load balancer serving web pages. One error in the config (which might have nothing to do with load balancing) and your entire setup comes to a screeching halt. Imagine that you have a Node.js app, a .NET Core app, static files served by Nginx, and a PHP app. You make one mistake and all four apps come to a standstill. You have lost your redundancy too, even if you have multiple instances of each app.
Even if you say that Nginx will only do the load balancing, Nginx's HTTP proxy module can't send the PROXY protocol header to upstream servers (only its stream module can), which is problematic if you forward to other servers that are also not serving the pages.
In addition there is something to be said for doing one thing and doing it well. Nginx is the master toolbox today. It does almost everything. Your load balancer is supposed to be the most stable part of your setup. Wouldn't you prefer to use something that was built just for load balancing?
If you use Varnish, HAProxy works well with it.
If you want an added level of balance, you can also use DNS as a load balancer with multiple HAProxy instances. DNS is not meant for this per se, but you will always have some weak link. Your load balancer can crash too, even if it's managed by your cloud provider. Most web browsers today will try other servers if there is more than one in your DNS entry, so it acts like a load balancer. Your DNS should be very reliable, thus increasing your uptime.
We use 2 haproxy instances with 2 varnish instances with two dns entries.

Running nginx in front of Unicorn or Gunicorn under an Elastic Load Balancer

I have a very simple question. Nginx does reverse proxy buffering for HTTP servers like Gunicorn and Unicorn. However, if I have an Elastic Load Balancer (ELB, offered by Amazon Web Services), is there any point in running nginx in front of my app server?
Request----> ELB -------> NGINX-------> UNICORN/GUNICORN HTTP SERVER
In a word: yes. Amazon's ELB service is wonderful, but it is solely a load balancer. Running nginx on your own server gives you a locus of control and a place to do rewrites, redirects, compression, header munging, caching, and more. Furthermore it allows you to serve static files in the fastest possible way, rather than using a slot on your more heavyweight appserver.

Resources