Pytorch model prediction in production with uwsgi - nginx

I have a problem deploying a PyTorch model in production. For a demonstration, I built a simple model and a Flask app. I put everything in a Docker container (pytorch+flask+uwsgi) plus another container for nginx. Everything runs well: my app is rendered and I can navigate inside it. However, when I navigate to the URL that launches a prediction of the model, the server hangs and does not seem to compute anything.
The uWSGI is run like this:
/opt/conda/bin/uwsgi --ini /usr/src/web/uwsgi.ini
with uwsgi.ini
[uwsgi]
#application's base folder
chdir = /usr/src/web/
#python module to import
wsgi-file = /usr/src/web/wsgi.py
callable = app
#socket file's location
socket = /usr/src/web/uwsgi.sock
#permissions for the socket file
chmod-socket = 666
# Port to expose
http = :5000
# Cleanup the socket when process stops
vacuum = true
#Log directory
logto = /usr/src/web/app.log
# minimum number of workers to keep at all times
cheaper = 2
processes = 16
As said, the server hangs and I eventually get a timeout. What is strange is that when I run the Flask application directly (also in the container) with
python /usr/src/web/manage.py runserver --host 0.0.0.0
I get my prediction in no time

I think this is related to
https://discuss.pytorch.org/t/basic-operations-do-not-work-in-1-1-0-with-uwsgi-flask/50257
Maybe try as mentioned there:
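import flask
app = flask.Flask(__name__)
segmentator = None
# Runs once, before the first request handled by this process
# (note: this hook was removed in Flask 2.3)
@app.before_first_request
def load_segmentator():
    global segmentator
    segmentator = Segmentator()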
where Segmentator is a class derived from PyTorch's nn.Module that loads its weights in __init__.
FYI this solution worked for me with one app but not the other
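The idea in the snippet above is to build the model only once a worker is actually serving requests, rather than at import time. The same idea can be written without a Flask hook. The following is a minimal sketch, not the linked thread's code: build_model is a hypothetical stand-in for Segmentator(), and it assumes (the question does not confirm this) that the hang comes from the model being created before uWSGI forks its workers.

```python
import threading

_model = None
_model_lock = threading.Lock()


def build_model():
    """Hypothetical stand-in for Segmentator(); replace with the real constructor."""
    return object()


def get_model(factory=build_model):
    """Create the model on first use in this process, then reuse it."""
    global _model
    if _model is None:
        # The lock keeps two threads in the same worker from building it twice.
        with _model_lock:
            if _model is None:
                _model = factory()
    return _model
```

A view function would then call get_model() instead of reading a module-level global, so each worker process builds its own copy after it has been forked.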

Related

How do I deploy Apache-Airflow via uWSGI and nginx?

I'm trying to deploy airflow in a production environment on a server running nginx and uWSGI.
I've searched the web and found instructions on installing airflow behind a reverse proxy, but those instructions only have nginx config examples. However, due to permissions, I can't change the nginx.conf itself and have to solve it via uWSGI.
My folder structure is:
project_folder
|_airflow
   |_airflow.cfg
   |_webserver_config.py
   |_wsgi.py
|_env
|_start
|_stop
|_uwsgi.ini
My path/to/myproject/uwsgi.ini file is configured as follows:
[uwsgi]
master = True
http-socket = 127.0.0.1:9999
virtualenv = /path/to/myproject/env/
daemonize = /path/to/myproject/uwsgi.log
pidfile = /path/to/myproject/tmp/myapp.pid
workers = 2
threads = 2
# adjust the following to point to your project
wsgi-file = /path/to/myproject/airflow/wsgi.py
touch-reload = /path/to/myproject/airflow/wsgi.py
and currently the /path/to/myproject/airflow/wsgi.py looks as follows:
def application(env, start_response):
    start_response('200 OK', [('Content-Type','text/html')])
    return [b'Hello World!']
I'm assuming I have to somehow call the airflow flask app from the wsgi.py file (perhaps by also changing some reverse proxy fix configs, since I'm behind SSL), but I'm stuck; what do I have to configure?
Will this procedure then be identical for the workers and scheduler?
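For the "behind SSL" part of the question above, here is a minimal sketch of a WSGI wrapper that restores the original URL scheme when SSL is terminated in front of uWSGI. It assumes the proxy sets an X-Forwarded-Proto header, which the question does not confirm. ForwardedProtoMiddleware and hello_app are illustrative names. How to obtain Airflow's own WSGI application depends on the Airflow version and is not shown here.

```python
class ForwardedProtoMiddleware:
    """Wrap a WSGI app and trust the scheme reported by the reverse proxy."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # Only meaningful when every request really comes through a trusted proxy.
        proto = environ.get("HTTP_X_FORWARDED_PROTO")
        if proto in ("http", "https"):
            environ["wsgi.url_scheme"] = proto
        return self.app(environ, start_response)


def hello_app(environ, start_response):
    """Stand-in app that reports the scheme it sees."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [environ["wsgi.url_scheme"].encode()]


# uWSGI looks for a callable named "application" in the wsgi-file by default.
application = ForwardedProtoMiddleware(hello_app)
```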

airflow webserver suddenly stopped after long time of no issues, "No response from gunicorn"

Have had an airflow webserver -D daemon process (v1.10.7) running on a machine (CentOS 7) for a long time. Suddenly saw that the webserver could no longer be accessed, and checking airflow-webserver.log showed...
[airflow@airflowetl airflow]$ cat airflow-webserver.log
2020-10-23 00:57:15,648 ERROR - No response from gunicorn master within 120 seconds
2020-10-23 00:57:15,649 ERROR - Shutting down webserver
(nothing of note in airflow-webserver.err)
[airflow@airflowetl airflow]$ cat airflow-webserver.err
/home/airflow/.local/lib/python3.6/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: <http://initd.org/psycopg/docs/install.html#binary-install-from-pypi>.
""")
The airflow.cfg values for the webserver section look like...
[webserver]
# The base url of your website as airflow cannot guess what domain or
# cname you are using. This is used in automated emails that
# airflow sends to point links to the right web server
#base_url = http://localhost:8080
base_url = http://airflowetl.co.local:8080
# The ip specified when starting the web server
web_server_host = 0.0.0.0
# The port on which to run the web server
web_server_port = 8080
# Paths to the SSL certificate and key for the web server. When both are
# provided SSL will be enabled. This does not change the web server port.
web_server_ssl_cert =
web_server_ssl_key =
# Number of seconds the webserver waits before killing gunicorn master that doesn't respond
web_server_master_timeout = 120
# Number of seconds the gunicorn webserver waits before timing out on a worker
#web_server_worker_timeout = 120
web_server_worker_timeout = 300
# Number of workers to refresh at a time. When set to 0, worker refresh is
# disabled. When nonzero, airflow periodically refreshes webserver workers by
# bringing up new ones and killing old ones.
worker_refresh_batch_size = 1
# Number of seconds to wait before refreshing a batch of workers.
worker_refresh_interval = 30
# Secret key used to run your flask app
secret_key = my_key
# Number of workers to run the Gunicorn web server
workers = 4
# The worker class gunicorn should use. Choices include
# sync (default), eventlet, gevent
worker_class = sync
Ultimately, I just restarted the process as a daemon again (airflow webserver -D; should I have deleted the old airflow-webserver.log and .err files first?), but I am not sure what would make this happen, since it had run without problems for months before this.
Could anyone with more experience explain what could have happened after all this time and how I could prevent it in the future? Are there any issues with running DAGs, or anything else I should check for, that this temporary unexpected shutdown of the webserver may have caused?
I am experiencing the same issue, and it only started (very infrequently) when I changed the following two config parameters in the webserver section.
worker_refresh_interval = 120
workers = 2
However, my parameters are also set quite differently from yours, so I will share them here.
rbac = True
web_server_host = 0.0.0.0
web_server_port = 8080
web_server_master_timeout = 600
web_server_worker_timeout = 600
default_ui_timezone = Europe/Amsterdam
reload_on_plugin_change = True
Comparing the two: your values for the two parameters I changed are the defaults (the same as mine before I changed them), so the cause seems to be a combination of several parameters.

Uvicorn not processing some requests randomly

We are running a FastAPI + Uvicorn web application using gunicorn as a process manager and Nginx as the reverse proxy server. The application runs in async mode for most of the I/O operations (DB calls, REST APIs). The whole setup is running inside a Docker container on Ubuntu 16.04.
The setup works most of the time, but sometimes it does not process a request at all and the request times out at the Nginx end. We also tried taking Nginx out of the setup and observed that a few requests get processed only after a really long time (around 15 minutes). This is very random but usually happens 2-3 times an hour.
Below is the gunicorn config that we are using –
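import os
host = os.getenv("HOST", "0.0.0.0")
port = os.getenv("PORT", "80")
# Gunicorn config variables
# (web_concurrency is defined elsewhere in the original config; it is not shown in the question)
workers = web_concurrency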
bind = f"{host}:{port}"
keepalive = 2
timeout = 60
graceful_timeout = 30
threads = 2
worker_tmp_dir = "/dev/shm"
# Logging mechanism
capture_output = True
loglevel = os.getenv("LOG_LEVEL", "debug")
And gunicorn is invoked with command exec gunicorn -k uvicorn.workers.UvicornWorker -c "$GUNICORN_CONF" "$APP_MODULE"
We have tried several config changes like –
Changing the number of workers, worker timeout
Changing the process manager from gunicorn to supervisord
Offloading the CPU intensive task to Celery instead of threading
Binding uvicorn app to unix socket instead of proxy server
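One thing worth ruling out in a setup like the one described above is a blocking call running directly inside an async handler, since that stalls every other request on the same event loop. The question does not confirm this is the cause. The sketch below only illustrates the general technique of pushing blocking work onto a thread pool with the standard library; blocking_work and handler are hypothetical names, and time.sleep stands in for any blocking call.

```python
import asyncio
import time


def blocking_work(seconds):
    """Stand-in for a blocking call (sync DB driver, CPU-heavy step, ...)."""
    time.sleep(seconds)
    return seconds


async def handler(seconds):
    """Run the blocking call in the default thread pool, keeping the loop free."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_work, seconds)


async def main():
    start = time.perf_counter()
    # Three "requests" handled concurrently on one event loop.
    results = await asyncio.gather(handler(0.2), handler(0.2), handler(0.2))
    return results, time.perf_counter() - start


results, elapsed = asyncio.run(main())
```

If blocking_work were called directly inside handler instead, the three calls would run one after another and the loop could not accept new work in the meantime.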

uWSGI and Flask Server Sent Events

I want to run a Flask application on my Raspberry Pi 3. I already developed the Flask app and it works fine, but this is on Flask's development server.
I want to use a production server, so I'm using nginx as the web server and uWSGI as the application server on the Pi. The Flask app uses server-sent events (SSE) to get live data from the server. When I run the app using uWSGI, it stalls. I believe it's because I'm using SSE: I had a similar problem on the Flask development server, and enabling threading solved it there. Enabling threading on uWSGI (when running the uWSGI script) doesn't solve the issue though. HELP!
This is my uWSGI .ini file.
[uwsgi]
base = /home/pi/heap
app = app
module = %(app)
home = %(base)/venv
pythonpath = %(base)
socket = /home/pi/heap/%n.sock
chmod-socket = 666
callable = app
Thank you!
Try running it on an HTTP port instead of a Unix socket, with the number of processes and threads set explicitly.
[uwsgi]
base = project_path
chdir = project_path
module = your_module_name
callable = your_app_name
enable-threads = true
master = true
processes = 5
threads = 2
http = :5000
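For context on why processes and threads matter here: an SSE response stays open for as long as the client listens, so each connected client occupies one worker or thread for that whole time. Below is a minimal sketch of an SSE generator using only the standard library. format_sse and event_stream are illustrative names, not taken from the question's app.

```python
import time


def format_sse(data, event=None):
    """Format one server-sent event; a blank line terminates the message."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # Multi-line payloads are sent as several "data:" lines.
    for part in str(data).splitlines() or [""]:
        lines.append(f"data: {part}")
    return "\n".join(lines) + "\n\n"


def event_stream(readings, delay=0.0):
    """Yield one SSE message per reading, pausing between messages."""
    for reading in readings:
        yield format_sse(reading, event="reading")
        time.sleep(delay)
```

In Flask, a generator like this would typically be returned as Response(event_stream(...), mimetype="text/event-stream").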

How to deploy a python WSGI Flask Application using nginx without manually running uwsgi?

So, I am right now at this point. The webpage can be accessed without any errors and without using any specific port. Example: www.my-example.com.
But, this works only when I run the command "uwsgi --socket 0.0.0.0:4567 --protocol=http -w wsgi" in my server.
How can I automate this app deployment through nginx?
You can use something like Supervisor to automatically start uWSGI, restart it if it fails, and log stderr/stdout:
[program:app]
# emulates a virtualenv
directory = /srv/app/
environment = PATH="/srv/app/virtualenv/bin"
command = /srv/app/virtualenv/bin/uwsgi --ini /srv/app/config/uwsgi.ini
autostart = true
autorestart = true
user = app-user

Resources