How do I deploy Apache Airflow via uWSGI and nginx?

I'm trying to deploy Airflow in a production environment on a server running nginx and uWSGI.
I've searched the web and found instructions on installing Airflow behind a reverse proxy, but those instructions only have nginx config examples. However, due to permissions, I can't change the nginx.conf itself and have to solve it via uWSGI.
My folder structure is:
project_folder
|_airflow
  |_airflow.cfg
  |_webserver_config.py
  |_wsgi.py
|_env
|_start
|_stop
|_uwsgi.ini
My /path/to/myproject/uwsgi.ini file is configured as follows:
[uwsgi]
master = True
http-socket = 127.0.0.1:9999
virtualenv = /path/to/myproject/env/
daemonize = /path/to/myproject/uwsgi.log
pidfile = /path/to/myproject/tmp/myapp.pid
workers = 2
threads = 2
# adjust the following to point to your project
wsgi-file = /path/to/myproject/airflow/wsgi.py
touch-reload = /path/to/myproject/airflow/wsgi.py
and currently the /path/to/myproject/airflow/wsgi.py looks as follows:
def application(env, start_response):
    start_response('200 OK', [('Content-Type','text/html')])
    return [b'Hello World!']
I'm assuming I have to somehow call the Airflow Flask app from the wsgi.py file (perhaps also adjusting some reverse-proxy fix settings, since I'm behind SSL), but I'm stuck; what do I have to configure?
Will this procedure then be identical for the workers and scheduler?
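For what it's worth, one direction to explore (a sketch, not a verified configuration): the Airflow webserver is itself a Flask application, and in Airflow 2.x its factory lives in airflow.www.app, so wsgi.py could look roughly like the following. The module path and factory name vary between Airflow releases, so check them against the version you are running.
# wsgi.py -- minimal sketch, assuming Airflow 2.x exposes the webserver's
# Flask app via airflow.www.app.cached_app(); verify against your version.
from airflow.www.app import cached_app

# Build the same Flask app that `airflow webserver` would serve.
application = cached_app()

# Behind nginx with SSL termination, either set
# AIRFLOW__WEBSERVER__ENABLE_PROXY_FIX=True, or wrap the app manually:
# from werkzeug.middleware.proxy_fix import ProxyFix
# application = ProxyFix(application, x_for=1, x_proto=1, x_host=1)
uWSGI could then keep pointing wsgi-file at this module (or use module = wsgi with callable = application). The scheduler and Celery workers are not WSGI applications, so they would not run through uWSGI at all; they keep running as separate processes (airflow scheduler, airflow celery worker) under whatever process manager you prefer.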

Related

Pytorch model prediction in production with uwsgi

I have a problem deploying a PyTorch model in production. For a demonstration, I built a simple model and a Flask app. I put everything in a Docker container (pytorch+flask+uwsgi) plus another container for nginx. Everything is running well, my app is rendered and I can navigate inside. However, when I navigate to the URL that launches a prediction of the model, the server hangs and does not seem to compute anything.
The uWSGI is run like this:
/opt/conda/bin/uwsgi --ini /usr/src/web/uwsgi.ini
with uwsgi.ini
[uwsgi]
#application's base folder
chdir = /usr/src/web/
#python module to import
wsgi-file = /usr/src/web/wsgi.py
callable = app
#socket file's location
socket = /usr/src/web/uwsgi.sock
#permissions for the socket file
chmod-socket = 666
# Port to expose
http = :5000
# Cleanup the socket when process stops
vacuum = true
#Log directory
logto = /usr/src/web/app.log
# minimum number of workers to keep at all times
cheaper = 2
processes = 16
As said, the server hangs and eventually I get a timeout. What is strange is that when I run the Flask application directly (also in the container) with
python /usr/src/web/manage.py runserver --host 0.0.0.0
I get my prediction in no time.
I think this is related to
https://discuss.pytorch.org/t/basic-operations-do-not-work-in-1-1-0-with-uwsgi-flask/50257
Maybe try as mentioned there:
import flask

app = flask.Flask(__name__)
segmentator = None

@app.before_first_request
def load_segmentator():
    global segmentator
    segmentator = Segmentator()
where Segmentator is a class wrapping PyTorch's nn.Module, which loads its weights in __init__.
FYI, this solution worked for me with one app but not the other.
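If deferring the model load like this is not enough, another workaround often suggested for PyTorch-under-uWSGI hangs (offered here as an assumption, not something verified against this exact setup) is to stop uWSGI from loading the application in the master process before forking:
[uwsgi]
# load the application in each worker after fork, so PyTorch/OpenMP state
# is not copied from the master process
lazy-apps = true
# let the embedded Python interpreter start its own threads
enable-threads = true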

Airflow - Failed to fetch log file from worker. 404 Client Error: NOT FOUND for url

I am running Airflow v1.9 with the Celery Executor. I have 5 Airflow workers running on 5 different machines, and the Airflow scheduler also runs on one of these machines. I have copied the same airflow.cfg file across all 5 machines.
I have daily workflows set up in different queues like DEV, QA, etc. (each worker runs with an individual queue name), and these are running fine.
While scheduling a DAG on one of the workers (no other DAG had been set up for this worker/machine previously), I am seeing the following error in the first task, and as a result the downstream tasks are failing:
*** Log file isn't local.
*** Fetching here: http://<worker hostname>:8793/log/PDI_Incr_20190407_v2/checkBCWatermarkDt/2019-04-07T17:00:00/1.log
*** Failed to fetch log file from worker. 404 Client Error: NOT FOUND for url: http://<worker hostname>:8793/log/PDI_Incr_20190407_v2/checkBCWatermarkDt/2019-04-07T17:00:00/1.log
I have configured MySQL for storing the DAG metadata. When I checked task_instance table, I see proper hostnames are populated against the task.
I also checked the log location and found that the log is getting created.
airflow.cfg snippet:
base_log_folder = /var/log/airflow
base_url = http://<webserver ip>:8082
worker_log_server_port = 8793
api_client = airflow.api.client.local_client
endpoint_url = http://localhost:8080
What am I missing here? What configurations do I need to check additionally for resolving this issue?
Looks like the worker's hostname is not being correctly resolved.
Add a file hostname_resolver.py:
import os
import socket
import requests

def resolve():
    """
    Resolves Airflow external hostname for accessing logs on a worker
    """
    if 'AWS_REGION' in os.environ:
        # Return EC2 instance hostname:
        return requests.get(
            'http://169.254.169.254/latest/meta-data/local-ipv4').text
    # Use DNS request for finding out what's our external IP:
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.connect(('1.1.1.1', 53))
    external_ip = s.getsockname()[0]
    s.close()
    return external_ip
And export: AIRFLOW__CORE__HOSTNAME_CALLABLE=airflow.hostname_resolver:resolve
The webserver on the master needs to reach the worker to fetch the log and display it on the front-end page, and to do that it has to resolve the worker's hostname. If that hostname cannot be resolved, add a hostname-to-IP mapping to /etc/hosts on the master.
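For example (the hostname and address below are placeholders, not values from the question):
# /etc/hosts on the machine running the webserver
10.0.0.21    airflow-worker-1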
If this happens as part of a Docker Compose Airflow setup, the hostname resolution needs to be passed to the container hosting the webserver, e.g. through extra_hosts:
# docker-compose.yml
version: "3.9"
services:
  webserver:
    extra_hosts:
      - "worker_hostname_0:192.168.xxx.yyy"
      - "worker_hostname_1:192.168.xxx.zzz"
    ...
  ...

uWSGI and Flask Server Sent Events

I want to run a Flask application on my Raspberry Pi 3. I already developed the Flask app and it works fine, but this is on Flask's development server.
I want to use a production server, so I'm using nginx as the web server and uWSGI as the application server on the Pi. Now, the Flask app uses server-sent events (SSE) to get live data from the server. When I run the app using uWSGI, it stalls. I believe it's because I'm using SSE, since I had a similar problem on the Flask development server, but there all I did was enable threading and the problem was solved. Enabling threading on uWSGI (when running the uWSGI script) doesn't solve the issue, though. HELP!
This is my uWSGI .ini file.
[uwsgi]
base = /home/pi/heap
app = app
module = %(app)
home = %(base)/venv
pythonpath = %(base)
socket = /home/pi/heap/%n.sock
chmod-socket = 666
callable = app
Thank you!
Try running it over an HTTP port instead of a socket, with processes and threads defined:
[uwsgi]
base = project_path
chdir = project_path
module = your_module_name
callable = your_app_name
enable-threads = true
master = true
processes = 5
threads = 2
http = :5000
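For context, a server-sent-events endpoint in Flask is just a long-lived streaming response, roughly like the sketch below (the /stream route and its dummy payload are made up for illustration); with enable-threads and several threads per worker, one such open connection no longer blocks every other request handled by that worker.
import time
from flask import Flask, Response

app = Flask(__name__)

def event_stream():
    # Emit one SSE-formatted message per second (dummy payload).
    while True:
        yield "data: ping\n\n"
        time.sleep(1)

@app.route('/stream')
def stream():
    # text/event-stream keeps the connection open for server-sent events.
    return Response(event_stream(), mimetype='text/event-stream')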

How to deploy a python WSGI Flask Application using nginx without manually running uwsgi?

So, I am right now at this point: the webpage can be accessed without any errors and without using any specific port, e.g. www.my-example.com.
But this works only when I run the command "uwsgi --socket 0.0.0.0:4567 --protocol=http -w wsgi" on my server.
How do I automate this app deployment through nginx?
You can use something like Supervisor to automatically start uWSGI, restart it if it fails, and log stderr/stdout:
[program:app]
# emulates a virtualenv
directory = /srv/app/
environment = PATH="/srv/app/virtualenv/bin"
command = /srv/app/virtualenv/bin/uwsgi --ini /srv/app/config/uwsgi.ini
autostart = true
autorestart = true
user = app-user
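Once that program block is in Supervisor's configuration (for example under /etc/supervisor/conf.d/, the exact path depending on your distribution), the standard supervisorctl commands pick it up and start it:
supervisorctl reread
supervisorctl update
supervisorctl status app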

Configuring plone.recipe.varnish in Plone 4

I am using plone.recipe.varnish 1.2.2 in my Plone application.
Below is a section of my buildout:
parts =
    ...
    instance
    paster
    varnish-build
    varnish
    plonesite
    ...
[varnish-build]
recipe = zc.recipe.cmmi
url = http://downloads.sourceforge.net/project/varnish/varnish/2.1.3/varnish-2.1.3.tar.gz
[varnish]
recipe = plone.recipe.varnish
daemon = ${buildout:parts-directory}/varnish-build/sbin/varnishd
bind = 127.0.0.1:8000
backends = 127.0.0.1:9000
cache-size = 1G
I cannot conclusively determine if it works. My Plone application serves on port 9000, so I want to test whether Varnish really works by going to http://localhost:8000, but I get nothing. The browser says "Firefox can't establish a connection to the server at 127.0.0.1:8000."
Am I doing this wrong? I have followed the instructions provided here but have made no headway.
How does one really configure plone.recipe.varnish in Plone, and how do you actually test that it works in local development machine?
The recipe does not start your varnish server. It only configures it for you.
Use something like supervisord to manage the process, or start it by hand with bin/varnish.
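A minimal supervisord program section for the buildout-generated script might look like the following (the buildout path is a placeholder; adjust it to your deployment):
[program:varnish]
command = /path/to/buildout/bin/varnish
autostart = true
autorestart = true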
