airflow trigger_dag command throwing error

I am executing the airflow trigger_dag cng-hello_world command on the Airflow server and it fails with the error below. Please suggest a fix.
I followed this link: http://michal.karzynski.pl/blog/2017/03/19/developing-workflows-with-apache-airflow/
The same DAG executes successfully when triggered via the Airflow UI.
[2019-02-06 11:57:41,755] {settings.py:174} INFO - setting.configure_orm(): Using pool settings. pool_size=5, pool_recycle=2000
[2019-02-06 11:57:43,326] {plugins_manager.py:97} ERROR - invalid syntax (airflow_api.py, line 7)
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/airflow/plugins_manager.py", line 86, in <module>
m = imp.load_source(namespace, filepath)
File "/home/ec2-user/airflow/plugins/airflow_api.py", line 7
<!DOCTYPE html>
^
SyntaxError: invalid syntax
[2019-02-06 11:57:43,326] {plugins_manager.py:98} ERROR - Failed to import plugin /home/ec2-user/airflow/plugins/airflow_api.py
[2019-02-06 11:57:43,326] {plugins_manager.py:97} ERROR - invalid syntax (__init__.py, line 7)
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/airflow/plugins_manager.py", line 86, in <module>
m = imp.load_source(namespace, filepath)
File "/home/ec2-user/airflow/plugins/__init__.py", line 7
<!DOCTYPE html>
^
SyntaxError: invalid syntax
[2019-02-06 11:57:43,327] {plugins_manager.py:98} ERROR - Failed to import plugin /home/ec2-user/airflow/plugins/__init__.py
[2019-02-06 11:57:47,236] {__init__.py:51} INFO - Using executor CeleryExecutor
[2019-02-06 11:57:48,420] {models.py:258} INFO - Filling up the DagBag from /home/ec2-user/airflow/dags
[2019-02-06 11:57:48,783] {cli.py:237} INFO - Created <DagRun cng-hello_world @ 2019-02-06 11:57:48+00:00: manual__2019-02-06T11:57:48+00:00, externally triggered: True>

Related

Read the Docs with nbsphinx

I created my own docs for Read the Docs. See my repository.
Some of my docs files are Jupyter notebooks, so I used nbsphinx for them.
On my computer I installed all the dependencies and it works great when I run make html.
However, Read the Docs throws this error:
Running Sphinx v1.8.5
loading translations [en]... done
Traceback (most recent call last):
File "/home/docs/checkouts/readthedocs.org/user_builds/complex-valued-neural-networks/envs/latest/lib/python3.7/site-packages/sphinx/registry.py", line 472, in load_extension
mod = __import__(extname, None, None, ['setup'])
ModuleNotFoundError: No module named 'nbsphinx'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/docs/checkouts/readthedocs.org/user_builds/complex-valued-neural-networks/envs/latest/lib/python3.7/site-packages/sphinx/cmd/build.py", line 303, in build_main
args.tags, args.verbosity, args.jobs, args.keep_going)
File "/home/docs/checkouts/readthedocs.org/user_builds/complex-valued-neural-networks/envs/latest/lib/python3.7/site-packages/sphinx/application.py", line 228, in __init__
self.setup_extension(extension)
File "/home/docs/checkouts/readthedocs.org/user_builds/complex-valued-neural-networks/envs/latest/lib/python3.7/site-packages/sphinx/application.py", line 449, in setup_extension
self.registry.load_extension(self, extname)
File "/home/docs/checkouts/readthedocs.org/user_builds/complex-valued-neural-networks/envs/latest/lib/python3.7/site-packages/sphinx/registry.py", line 475, in load_extension
raise ExtensionError(__('Could not import extension %s') % extname, err)
sphinx.errors.ExtensionError: Could not import extension nbsphinx (exception: No module named 'nbsphinx')
Extension error:
Could not import extension nbsphinx (exception: No module named 'nbsphinx')
Following this tutorial, I created two yml files and the error changed to:
Error
Problem in your project's configuration. Invalid "conda.environment": environment not found
Solved it!
I followed this tutorial and added the following to readthedocs.yml:
python:
  version: 3
  install:
    - requirements: docs/requirements.txt
  system_packages: true
And then in docs/requirements.txt:
ipykernel
nbsphinx
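For completeness, nbsphinx also has to be enabled as a Sphinx extension in docs/conf.py; a minimal sketch of that standard setting (your conf.py may list other extensions as well):

extensions = [
    'nbsphinx',
]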
If you have questions, you can always check the repository where I did it.

Airflow branch errors with TypeError: 'NoneType' object is not iterable

I get the error below when trying to invoke a branching operation:
[2020-01-05 19:11:34,888] {skipmixin.py:78} INFO - Following branch None
[2020-01-05 19:11:34,897] {taskinstance.py:1047} ERROR - 'NoneType' object is not iterable
Traceback (most recent call last):
File "/root/anaconda3/envs/py3/lib/python3.6/site-packages/airflow/models/taskinstance.py", line 922, in _run_raw_task
result = task_copy.execute(context=context)
File "/root/anaconda3/envs/py3/lib/python3.6/site-packages/airflow/operators/python_operator.py", line 142, in execute
self.skip_all_except(context['ti'], branch)
File "/root/anaconda3/envs/py3/lib/python3.6/site-packages/airflow/models/skipmixin.py", line 92, in skip_all_except
for b in branch_task_ids:
TypeError: 'NoneType' object is not iterable
[2020-01-05 19:11:34,900] {taskinstance.py:1076} INFO - All retries failed; marking task as FAILED
[2020-01-05 19:11:35,315] {logging_mixin.py:95} INFO - [2020-01-05 19:11:35,312] {local_task_job.py:172} WARNING - State of this instance has been externally set to failed. Taking the poison pill.
[2020-01-05 19:11:35,321] {helpers.py:319} INFO - Sending Signals.SIGTERM to GPID 25398
[2020-01-05 19:11:35,321] {taskinstance.py:897} ERROR - Received SIGTERM. Terminating subprocesses.
It works as expected for the CONVERT_PDF_TO_JPG_TASK.
Found the problem... it was a silly mistake: the PRE_PROCESS_JPG_TASK was created as a BranchPythonOperator instead of a regular PythonOperator, so Airflow was expecting a branch task id as the return value of the function.
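For illustration, a minimal sketch of the difference, with hypothetical task ids and callables and the Airflow 1.10-era import path from the traceback above (assumes a dag object is already defined):

from airflow.operators.python_operator import BranchPythonOperator, PythonOperator

def pre_process_jpg():
    pass  # ordinary work step: no branching decision, so a plain PythonOperator fits

def choose_branch():
    # a BranchPythonOperator callable must return an existing task_id
    # (or a list of task_ids); returning None causes the
    # "'NoneType' object is not iterable" error above
    return 'convert_pdf_to_jpg_task'

pre_process_jpg_task = PythonOperator(
    task_id='pre_process_jpg_task', python_callable=pre_process_jpg, dag=dag)
branch_task = BranchPythonOperator(
    task_id='branch_task', python_callable=choose_branch, dag=dag)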
(Side note, a suggestion for the Airflow DAG UI team: love the UI, but it would be great if different operators were represented in different colors. Thanks!)

Apache Airflow initdb command fails due to syntax error

I have created a virtualenv for Python 3 using:
virtualenv -p $(which python3) ENV
Then activated it:
source /Users/myusername/ENV/bin/activate
Installed apache-airflow:
pip install apache-airflow
Then which airflow yields /Users/myusername/ENV/bin/airflow.
But when I run:
airflow initdb
I get the error below:
{db.py:350} INFO - Creating tables
INFO [alembic.runtime.migration] Context impl SQLiteImpl.
INFO [alembic.runtime.migration] Will assume non-transactional DDL.
WARNI [airflow.utils.log.logging_mixin.LoggingMixin] cryptography not found - values will not be stored encrypted.
ERROR [airflow.models.DagBag] Failed to import: /Library/Python/2.7/site-packages/airflow/example_dags/example_http_operator.py
Traceback (most recent call last):
File "/Library/Python/2.7/site-packages/airflow/models/__init__.py", line 413, in process_file
m = imp.load_source(mod_name, filepath)
File "/Library/Python/2.7/site-packages/airflow/example_dags/example_http_operator.py", line 27, in <module>
from airflow.operators.http_operator import SimpleHttpOperator
File "/Library/Python/2.7/site-packages/airflow/operators/http_operator.py", line 21, in <module>
from airflow.hooks.http_hook import HttpHook
File "/Library/Python/2.7/site-packages/airflow/hooks/http_hook.py", line 23, in <module>
import tenacity
File "/Library/Python/2.7/site-packages/tenacity/__init__.py", line 375, in <module>
from tenacity.tornadoweb import TornadoRetrying
File "/Library/Python/2.7/site-packages/tenacity/tornadoweb.py", line 24, in <module>
from tornado import gen
File "/Library/Python/2.7/site-packages/tornado-6.0.3-py2.7-macosx-10.14-intel.egg/tornado/gen.py", line 126
def _value_from_stopiteration(e: Union[StopIteration, "Return"]) -> Any:
^
SyntaxError: invalid syntax
Done.
(ENV) ---------------------------------------------------------
It seems the example scripts are being loaded by Python 2.7, which can't parse the Python 3 type-annotation syntax.
Does the apache-airflow package need a fix in the next release, or can I do something on my side to fix this?
I tried fixing it by using Python 2.7 instead of Python 3, i.e. installing Airflow on the default Python 2.7 that ships with macOS, but that throws other errors, e.g. the "six" package is not compatible.
You need to turn off loading of the example DAGs in the config file to solve this problem.
Still, it seems odd that Airflow runs under Python 2.7 when you say it was installed into a Python 3 virtualenv; note that the traceback paths point to /Library/Python/2.7/site-packages, i.e. the system installation rather than the virtualenv.
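For reference, the relevant setting lives in airflow.cfg (under $AIRFLOW_HOME) in the [core] section; after changing it, re-run airflow initdb:

[core]
load_examples = False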

How to install dask on Google Composer

I tried to install dask on Google Composer (Airflow). I used the PyPI section of the GCP UI to add dask and the required packages below (not sure whether all the Google ones are required; I couldn't find a requirements.txt):
dask
toolz
partd
cloudpickle
google-cloud
google-cloud-storage
google-auth
google-auth-oauthlib
decorator
When I run my DAG, which calls dd.read_csv("a gcp bucket"), the Airflow log shows the error below:
[2018-10-24 22:25:12,729] {base_task_runner.py:98} INFO - Subtask: File "/usr/local/lib/python2.7/site-packages/dask/bytes/core.py", line 350, in get_fs_token_paths
[2018-10-24 22:25:12,733] {base_task_runner.py:98} INFO - Subtask: fs, fs_token = get_fs(protocol, options)
[2018-10-24 22:25:12,735] {base_task_runner.py:98} INFO - Subtask: File "/usr/local/lib/python2.7/site-packages/dask/bytes/core.py", line 473, in get_fs
[2018-10-24 22:25:12,740] {base_task_runner.py:98} INFO - Subtask: "Need to install `gcsfs` library for Google Cloud Storage support\n"
[2018-10-24 22:25:12,741] {base_task_runner.py:98} INFO - Subtask: File "/usr/local/lib/python2.7/site-packages/dask/utils.py", line 94, in import_required
[2018-10-24 22:25:12,748] {base_task_runner.py:98} INFO - Subtask: raise RuntimeError(error_msg)
[2018-10-24 22:25:12,751] {base_task_runner.py:98} INFO - Subtask: RuntimeError: Need to install `gcsfs` library for Google Cloud Storage support
[2018-10-24 22:25:12,756] {base_task_runner.py:98} INFO - Subtask: conda install gcsfs -c conda-forge
[2018-10-24 22:25:12,758] {base_task_runner.py:98} INFO - Subtask: or
[2018-10-24 22:25:12,762] {base_task_runner.py:98} INFO - Subtask: pip install gcsfs
So I tried to install gcsfs via PyPI, but got the Airflow error below:
{
insertId: "17ks763f726w1i"
logName: "projects/xxxxxxxxx/logs/airflow-worker"
receiveTimestamp: "2018-10-25T15:42:24.935880717Z"
resource: {…}
severity: "ERROR"
textPayload: "Traceback (most recent call last):
File "/usr/local/bin/gcsfuse", line 7, in <module>
from gcsfs.cli.gcsfuse import main
File "/usr/local/lib/python2.7/site-
packages/gcsfs/cli/gcsfuse.py", line 3, in <module>
fuse import FUSE
ImportError: No module named fuse
"
timestamp: "2018-10-25T15:41:53Z"
}
It seems I'm trapped in a loop of required packages! Am I missing anything here? Any thoughts?
You don't need to add google-cloud-storage to your PyPI packages; it's already installed. I ran a DAG (image-version: composer-1.3.0-airflow-1.10.0) that logs the version of the pre-installed package, and it is 1.13.0. To replicate your case, I also added the following to my DAG:
import logging
import dask.dataframe as dd

def read_csv_dask():
    # read a CSV straight from a GCS bucket (path is illustrative)
    df = dd.read_csv('gs://gcs_path/data.csv')
    logging.info("csv from gs://gcs_path/ read alright")
Before anything, I added via the UI the following dependencies:
dask==0.20.0
toolz==0.9.0
partd==0.3.9
cloudpickle==0.6.1
The corresponding task failed with the same message as yours ("Need to install gcsfs library for Google Cloud Storage support"), at which point I returned to the UI and attempted to add gcsfs==0.1.2. This never succeeded; however, I did not get the error you did. Instead it repeatedly failed with "Composer Backend timed out".
At this point, you could consider the following alternatives:
1) Install gcsfs with pip in a BashOperator (a sketch follows after option 2's snippet below). This is not optimal, as you will be installing gcsfs every time the DAG runs.
2) Use another library. What are you doing with this csv? If you upload it to the gs://composer_gcs_bucket/data/ directory (check here), you can then read it using e.g. the standard csv library, like so:
import csv

def read_csv():
    # on Composer, the bucket's data/ folder is mounted at /home/airflow/gcs/data/
    f = open('/home/airflow/gcs/data/data.csv', 'rU')
    reader = csv.reader(f)
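For option 1, a minimal sketch of the BashOperator approach (task_id and DAG wiring are illustrative; assumes a dag object is defined):

from airflow.operators.bash_operator import BashOperator

# installs gcsfs on the worker; note this re-runs on every DAG run,
# which is why option 1 is not optimal
install_gcsfs = BashOperator(
    task_id='install_gcsfs',
    bash_command='pip install --user gcsfs',
    dag=dag)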

Robot Framework / Sikuli hello_world demo script is failing

I have installed Robot Framework 2.8.7 on a Solaris server and added the Sikuli library to it. When I tried to run the demo script "Hello world", I got the following error:
bash-3.2# pybot /robot/robotframework-SikuliLibrary-master/demo/hello_world/testsuite_sikuli_demo.txt
[ WARN ] Test get_keyword_names failed! Connecting remote server at http://127.0.0.1:42821/ failed: <Fault 0: 'Failed to invoke method get_keyword_names in class org.robotframework.remoteserver.servlet.ServerMethods: java.lang.RuntimeException'>
[ ERROR ] Error in file '/robot/robotframework-SikuliLibrary-master/demo/hello_world/testsuite_sikuli_demo.txt': Initializing test library 'SikuliLibrary' with no arguments failed: Failed to get_keyword_names!
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/SikuliLibrary/sikuli.py", line 41, in __init__
self.remote = self._connect_remote_library()
File "/usr/lib/python2.7/site-packages/SikuliLibrary/sikuli.py", line 138, in _connect_remote_library
self._test_get_keyword_names(remote)
File "/usr/lib/python2.7/site-packages/SikuliLibrary/sikuli.py", line 155, in _test_get_keyword_names
raise RuntimeError('Failed to get_keyword_names!')
I have done the same setup on a Windows machine and it works fine. The Python version used on Solaris is 2.6. Can you let me know how to resolve this?
Thanks
