I just installed dokku for the first time and I'm struggling with an apparently very simple problem... I made a sample Python app that just logs an environment variable:
import os
import time

API_TOKEN = os.getenv('API_TOKEN')

while True:
    print(f'API_TOKEN is {API_TOKEN}')
    time.sleep(1)
With a Procfile as this:
worker: python temp.py
The deploy looks normal and successful; however, if I try to look at the logs, dokku says
App <app name> has not been deployed.
Am I missing something very trivial?
Thanks in advance!
By default dokku only scales up the web process if it is present. Any workers or other processes are scaled to 0, which is otherwise reported as "App <app name> has not been deployed".
To deploy your app you need to log onto the box and scale up the worker by running:
dokku ps:scale <app name> worker=1
Change 1 to a larger number if you want more workers.
If you often deploy the app to different dokku instances and have to search for this solution over and over again, you can instead create a file in the root of your app called DOKKU_SCALE. In it you can set the default scale of all the processes, like so:
worker=1
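If the app also has a web process, the same file can list every process type from the Procfile; the numbers below are just an illustrative guess at a typical setup:
web=1
worker=2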
That reminds me, I need to go do that now. It is driving me nuts.
I have a static pipeline with the following architecture:
main.py
setup.py
requirements.txt
module1/
    __init__.py
    functions.py
module2/
    __init__.py
    functions.py
dist/
    setup_tarball
The setup.py and requirements.txt specify the non-native PyPI packages and local functions which would be used by the Dataflow worker node. The Dataflow options are written as follows:
import apache_beam as beam
from apache_beam.io import ReadFromText, WriteToText
from apache_beam.options.pipeline_options import PipelineOptions
from module2.functions import function_to_use
dataflow_options = ['--extra_package=./dist/setup_tarball', '--temp_location=<gcs_temp_location>', '--runner=DataflowRunner', '--region=us-central1', '--requirements_file=./requirements.txt']
So then the pipeline will run something like this:
options = PipelineOptions(dataflow_options)
p = beam.Pipeline(options=options)
transform = (p | ReadFromText(gcs_url) | beam.Map(function_to_use) | WriteToText(gcs_output_url))
p.run()
Running this locally takes Dataflow around 6 minutes to complete, where most of the time goes to worker startup. I tried getting this code automated with Composer and re-arranged the architecture as follows: my main (DAG) function in the dags folder, the modules in plugins, and setup_tarball and requirements.txt in the data folder... So the only parameters that really changed are:
'--extra_package=/home/airflow/gcs/data/setup_tarball'
'--requirements_file=/home/airflow/gcs/data/requirements.txt'
When I try running this modified code in Composer, it works... but it takes much, much longer. Once the worker starts up, it takes anywhere from 20-30 minutes before actually running the pipeline (which itself only takes a few seconds). This is much longer than triggering Dataflow from my local code, which was taking only 6 minutes to complete. I realize this question is very general, but since the code works, I don't think it's related to the Airflow task itself. Where would be a reasonable place to start looking to troubleshoot this problem? At the Airflow level, what can be modified? How does Composer (Airflow) interact with Dataflow, and what could potentially cause this bottleneck?
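For context, here is a minimal sketch of how such a pipeline might be wrapped as an Airflow task in Composer; the DAG id, bucket URLs, and run_pipeline helper are hypothetical placeholders, and the package paths are the ones from the question:

import datetime

import apache_beam as beam
from airflow import DAG
from airflow.operators.python import PythonOperator  # airflow.operators.python_operator on Airflow 1.10.x
from apache_beam.io import ReadFromText, WriteToText
from apache_beam.options.pipeline_options import PipelineOptions
from module2.functions import function_to_use  # lives under the Composer plugins folder

def run_pipeline():
    # Same pipeline as in the snippet above, pointed at the Composer paths.
    dataflow_options = [
        '--extra_package=/home/airflow/gcs/data/setup_tarball',
        '--temp_location=<gcs_temp_location>',
        '--runner=DataflowRunner',
        '--region=us-central1',
        '--requirements_file=/home/airflow/gcs/data/requirements.txt',
    ]
    gcs_url = 'gs://<input_bucket>/input.txt'        # placeholder input
    gcs_output_url = 'gs://<output_bucket>/output'   # placeholder output
    p = beam.Pipeline(options=PipelineOptions(dataflow_options))
    (p | ReadFromText(gcs_url) | beam.Map(function_to_use) | WriteToText(gcs_output_url))
    p.run()

with DAG(
    dag_id='dataflow_pipeline',                # hypothetical DAG id
    start_date=datetime.datetime(2021, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    PythonOperator(
        task_id='run_dataflow_pipeline',
        python_callable=run_pipeline,
    )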
It turns out that the problem was associated with Composer itself. The fix was to increase the capacity of Composer, i.e., increase vCPUs. Not sure why this would be the case, so if anyone has an idea of the reason behind this issue, your input would be much appreciated!
We're running an Airflow cluster using the puckel/airflow docker image with docker-compose. Airflow's scheduler container outputs its logs to /usr/local/airflow/logs/scheduler.
The problem is that the log files are not rotated and disk usage increases until the disk gets full. A DAG for cleaning up the log directory is available, but the DAG runs on the worker node, and the log directory on the scheduler container is not cleaned up.
I'm looking for a way to output the scheduler log to stdout or to an S3/GCS bucket, but I have been unable to find one. Is there any way to output the scheduler log to stdout or an S3/GCS bucket?
Finally I managed to output the scheduler's logs to stdout.
Here you can find how to use a custom logger in Airflow. The default logging config is available on GitHub.
What you have to do is:
(1) Create a custom logging config at ${AIRFLOW_HOME}/config/log_config.py.
# Setting processor (scheduler, etc.) logs output to stdout
# Referring https://www.astronomer.io/guides/logging
# This file is created following https://airflow.apache.org/docs/apache-airflow/2.0.0/logging-monitoring/logging-tasks.html#advanced-configuration
from copy import deepcopy
from airflow.config_templates.airflow_local_settings import DEFAULT_LOGGING_CONFIG
import sys

LOGGING_CONFIG = deepcopy(DEFAULT_LOGGING_CONFIG)
LOGGING_CONFIG["handlers"]["processor"] = {
    "class": "logging.StreamHandler",
    "formatter": "airflow",
    "stream": sys.stdout,
}
(2) Set logging_config_class property to config.log_config.LOGGING_CONFIG in airflow.cfg
logging_config_class = config.log_config.LOGGING_CONFIG
(3) [Optional] Add $AIRFLOW_HOME to the PYTHONPATH environment variable.
export PYTHONPATH="${PYTHONPATH}:${AIRFLOW_HOME}"
Actually, you can set the path of logging_config_class to anything as long as Python is able to load the package.
Setting handler.processor to airflow.utils.log.logging_mixin.RedirectStdHandler didn't work for me. It used too much memory.
remote_logging=True in airflow.cfg is the key.
Please check the thread here for detailed steps.
You can extend the image with the following or do so in airflow.cfg
ENV AIRFLOW__LOGGING__REMOTE_LOGGING=True
ENV AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID=gcp_conn_id
ENV AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER=gs://bucket_name/AIRFLOW_LOGS
The gcp_conn_id should have the correct permissions to create/delete objects in GCS.
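If you would rather set this in airflow.cfg than via environment variables, the same settings go under the [logging] section on Airflow 2.x (on 1.10.x the keys live under [core]); the connection id and bucket are placeholders as above:

[logging]
remote_logging = True
remote_log_conn_id = gcp_conn_id
remote_base_log_folder = gs://bucket_name/AIRFLOW_LOGS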
This question already has answers here:
Where should I place the secret key in Flask?
I'm preparing to deploy a small Flask app that I've developed for internal use. I have an old laptop with Ubuntu Server 16.04, uWSGI and Nginx which I'll use for deployment.
OPTION 1
My current app setup has an instance/config.py file that I've kept out of version control. This file contains the following:
SECRET_KEY = ...
SQLALCHEMY_DATABASE_URI = ...
# Google 'client_id' and 'client_secret' for social authentication functionality.
The instance/config.py file is loaded as follows in app/__init__.py:
def create_app(config_name):
    app = Flask(__name__, instance_relative_config=True)
    app.config.from_object(app_config[config_name])
    app.config.from_pyfile('config.py')
Is it safe to keep this same setup for production and thus have the instance/config.py file on the production server?
OPTION 2
Alternatively, should I be using environment variables? If this were the case, should I do something like so in wsgi.py:
os.environ['FLASK_CONFIG'] = 'production'
os.environ['SECRET_KEY'] = ...
os.environ['SQLALCHEMY_DATABASE_URI'] = ...
and then have the following in app/__init__.py:
def create_app(config_name):
    if os.getenv('FLASK_CONFIG') == 'production':
        app = Flask(__name__)
        app.config.update(
            SECRET_KEY=os.getenv('SECRET_KEY'),
            SQLALCHEMY_DATABASE_URI=os.getenv('SQLALCHEMY_DATABASE_URI')
        )
    else:
        app = Flask(__name__, instance_relative_config=True)
        app.config.from_object(app_config[config_name])
        app.config.from_pyfile('config.py')
To answer the question, yes it's safe as long as your server is secure. Hopefully access is only allowed using a private key. If you're using a password to log in, then that may be a problem.
It's a good idea to keep the actual file used to load configuration out of version control. I actually made a mistake with one of my servers where I did put config.py in version control, and now I have to be careful each time I pull not to overwrite the file.
One thing that you could do is to have a config file for each environment, say prod.py and dev.py, that are both checked in. Then create a pointer.py that is not checked into version control.
prod.py
SECRET_KEY = ...
SQLALCHEMY_DATABASE_URI = ...
...
pointer.py
from prod import SECRET_KEY, SQLALCHEMY_DATABASE_URI, ...
server.py
app.config.from_pyfile('pointer.py')
In dev, simply change the import statement to point to dev.py. You could also do from prod import *, but that isn't very good practice.
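For example, a development copy of the (not checked in) pointer.py would just be:
from dev import SECRET_KEY, SQLALCHEMY_DATABASE_URI, ...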
I'm using Meteor's built-in hosting for staging, with Codeship handling the continuous deployment. All tests and notifications succeed as expected in Codeship, but nothing is getting deployed.
My script:
expect -c "set timeout 60; spawn meteor deploy staging.myapp.com; expect “Email:” { send $METEOR_DEPLOY_EMAIL\r; expect eof } expect "Password:" { send $METEOR_DEPLOY_PASSWORD\r; expect eof }"
When that script runs during the build process I see the following:
spawn meteor deploy staging.myapp.com
=> Running Meteor from a checkout -- overrides project version (0.8.1)
To instantly deploy your app on a free testing server, just enter your
email address!
ail:
The ail: isn't a typo...that's what Codeship displays. It appears it eventually times out and moves on, though no errors are shown.
First time setting up a CI server (and using Expect), so thanks in advance for the help!
Figured it out...had two syntax issues:
(1) Left/right double quotation marks snuck in there (instead of standard quotation marks)
(2) Missing semicolon
So, for anyone looking for a script to deploy to *.meteor.com using Codeship, here is the working script:
expect -c "set timeout 60; spawn meteor deploy example.com; expect "Email:" { send $METEOR_DEPLOY_EMAIL\r; expect eof }; expect "Password:" { send $METEOR_DEPLOY_PASSWORD\r; expect eof }"
I am trying to create an OpenShift application using the --from-code option to grab the application code from GitHub. I've created two different OpenShift QuickStarts -- with one, the --from-code option works, and with the other, it doesn't work.
So clearly I'm doing something wrong in the QuickStart that isn't working. But I can't see what I'm doing wrong. I either get a 504 error or a generic "an error occurred" message, neither of which tells me what the problem is, and there doesn't seem to be a verbose flag to get more details on the error.
Tests-Mac:~ testuser$ rhc app create sonr diy-0.1 http://cartreflect-claytondev.rhcloud.com/reflect?github=smarterclayton/openshift-redis-cart --from-code https://github.com/citrusbyte/SONR.git
The cartridge 'http://cartreflect-claytondev.rhcloud.com/reflect?github=smarterclayton/openshift-redis-cart' will be downloaded and installed
Application Options
-------------------
Domain: schof
Cartridges: diy-0.1, http://cartreflect-claytondev.rhcloud.com/reflect?github=smarterclayton/openshift-redis-cart
Source Code: https://github.com/citrusbyte/SONR.git
Gear Size: default
Scaling: no
Creating application 'sonr' ... Server returned an unexpected error code: 504
Tests-Mac:~ testuser$ rhc app create sonr diy-0.1 http://cartreflect-claytondev.rhcloud.com/reflect?github=smarterclayton/openshift-redis-cart --from-code https://github.com/citrusbyte/SONR.git
The cartridge 'http://cartreflect-claytondev.rhcloud.com/reflect?github=smarterclayton/openshift-redis-cart' will be downloaded and installed
Application Options
-------------------
Domain: schof
Cartridges: diy-0.1, http://cartreflect-claytondev.rhcloud.com/reflect?github=smarterclayton/openshift-redis-cart
Source Code: https://github.com/citrusbyte/SONR.git
Gear Size: default
Scaling: no
Creating application 'sonr' ...
An error occurred while communicating with the server. This problem may only be temporary. Check that you have correctly specified your
OpenShift server 'https://openshift.redhat.com/broker/rest/domain/schof/applications'.
Tests-Mac:~ testuser$
That's creating an application with --from-code using this repo: https://github.com/citrusbyte/SONR . If I use this repo it works flawlessly: https://github.com/citrusbyte/openshift-sinatra-redis
The code itself seems to be good, as I can create an empty new application, merge the SONR code in, and it works flawlessly.
What am I doing wrong?
UPDATE: I've worked around this issue by creating the app in two stages instead of doing it in one stage:
rhc app create APPNAME diy-0.1 http://cartreflect-claytondev.rhcloud.com/reflect?github=smarterclayton/openshift-redis-cart
cd APPNAME
git remote add github -f https://github.com/citrusbyte/SONR.git
git merge github/master -s recursive -X theirs
git push origin master
I'd still love to know why doing it in one step was failing, though.
#developercorey had the right idea.
I tried with a ridiculous timeout of 99999, and then got a different timeout error that I don't think I can change:
$ rhc app create APPNAME diy-0.1 http://cartreflect-claytondev.rhcloud.com/reflect?github=smarterclayton/openshift-redis-cart --from-code https://github.com/citrusbyte/SONR.git --timeout 99999
...
Creating application 'APPNAME' ...
The initial build for the application failed: Shell command '/sbin/runuser -s /bin/sh 5328a9385973ca70150002af -c "exec /usr/bin/runcon 'unconfined_u:system_r:openshift_t:s0:c5,c974' /bin/sh -c \"gear postreceive --init >> /tmp/initial-build.log 2>&1\""' exceeded timeout of 229
The fix I mentioned in my earlier update is working perfectly, and that's what I recommend anyone with a similar problem try -- I'm creating the app empty, without the --from-code option, and then merging in the code I want to use as a separate step:
rhc app create APPNAME diy-0.1 http://cartreflect-claytondev.rhcloud.com/reflect?github=smarterclayton/openshift-redis-cart
cd APPNAME
git remote add github -f https://github.com/citrusbyte/SONR.git
git merge github/master -s recursive -X theirs
git push origin master
It could be that the application takes too long to clone/set up, and the creation is timing out. Something you can try is to create the application without --from-code, then clone it locally and merge in your code from GitHub, then do a git push. This operation has a much longer timeout period, and will also let you see what errors, if any, you get, since the application won't disappear if it doesn't succeed, unlike with an app create.