How to install and run apache-airflow on Windows without problems

I tried to install Apache Airflow 2.2.4 on Windows 10. When I finish and run airflow, here are the errors it gives me:
Traceback (most recent call last):
File "/home/david/.local/bin/airflow", line 5, in <module>
from airflow.__main__ import main
File "/home/david/.local/lib/python3.6/site-packages/airflow/__init__.py", line 34, in <module>
from airflow import settings
File "/home/david/.local/lib/python3.6/site-packages/airflow/settings.py", line 35, in <module>
from airflow.configuration import AIRFLOW_HOME, WEBSERVER_CONFIG, conf # NOQA F401
File "/home/david/.local/lib/python3.6/site-packages/airflow/configuration.py", line 1127, in <module>
conf = initialize_config()
File "/home/david/.local/lib/python3.6/site-packages/airflow/configuration.py", line 890, in initialize_config
shutil.copy(_default_config_file_path('default_webserver_config.py'), WEBSERVER_CONFIG)
File "/usr/lib/python3.6/shutil.py", line 245, in copy
copyfile(src, dst, follow_symlinks=follow_symlinks)
File "/usr/lib/python3.6/shutil.py", line 121, in copyfile
with open(dst, 'wb') as fdst:
PermissionError: [Errno 13] Permission denied: '/webserver_config.py'

The following steps resolved a similar issue for me, though I am not sure exactly which one fixed it:
1) Make sure your WSL version is 2. (Restart your PC if you change the WSL version.)
2) Enable Windows Subsystem for Linux and Virtual Machine Platform. (Restart your PC.)
After this, I followed this tutorial:
https://towardsdatascience.com/run-apache-airflow-on-windows-10-without-docker-3c5754bb98b4
If you follow it, you will be installing Apache Airflow 2.2.4 rather than 1.10.12, so instead of "airflow initdb" use the "airflow db init" command.
Also, after running "airflow db init" and before starting the webserver, create a user (optional, but I suggest running this command):
airflow users create --username admin --password admin --firstname <firstname> --lastname <lastname> --role Admin --email abc@gmail.com
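For reference, here is a minimal sketch of the whole sequence under WSL (the constraints URL follows Airflow's documented pattern; the Python 3.8 constraints file and the AIRFLOW_HOME path are assumptions about your setup):
export AIRFLOW_HOME=~/airflow
# Install with Airflow's version-pinned constraints to avoid dependency conflicts:
pip install "apache-airflow==2.2.4" --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.2.4/constraints-3.8.txt"
# Initialize the metadata database (replaces the old "airflow initdb"):
airflow db init
# Create the admin user, then start the services:
airflow users create --username admin --password admin --firstname <firstname> --lastname <lastname> --role Admin --email abc@gmail.com
airflow webserver --port 8080 &
airflow scheduler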

Related

Airflow can't execute SparkSubmit

I am trying the SparkSubmitOperator in Airflow.
My job runs from a jar file, with its config in config.properties loaded via typesafe.ConfigFactory.
My error is:
airflow.exceptions.AirflowException: Cannot execute: /home/hungnd/spark-2.4.3-bin-hadoop2.7/bin/spark-submit
--master yarn
--conf spark.executor.extraClassPath=file:///home/hungnd/airflow/dags/project/spark-example/config.properties
--files /home/hungnd/airflow/dags/project/spark-example/config.properties
--driver-class-path file:///home/hungnd/airflow/dags/project/spark-example/config.properties
--jars file:///home/hungnd/airflow/dags/project/spark-example/target/libs/*
--name arrow-spark
--class vn.vccorp.adtech.analytic.Main
--queue root.default
--deploy-mode client
/home/hungnd/airflow/dags/project/spark-example/target/spark-example-1.0-SNAPSHOT.jar
But when I copy that command to the Ubuntu server and run it manually, it runs successfully.
Please help me! Thanks.
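Since the same command succeeds when run manually, one thing worth checking (a diagnostic sketch; the "airflow" account name is an assumption about your setup) is whether the user that runs the Airflow scheduler can actually see and execute the spark-submit binary:
ls -l /home/hungnd/spark-2.4.3-bin-hadoop2.7/bin/spark-submit
# Try the binary as the account that owns the scheduler process (assumed to be "airflow"):
sudo -u airflow /home/hungnd/spark-2.4.3-bin-hadoop2.7/bin/spark-submit --version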

crontab error "ImportError: No module named requests" - Python runs OK in terminal, fails in crontab

I am having difficulty running a Python script that imports the "requests" module from crontab. This was fine a few days ago; then I had to change my authentication for Google (to send emails), and "requests" stopped working in crontab. The Python script runs fine in a terminal but will not execute in crontab. "requests" is available, and when I type "pip3 show requests" the following is displayed (note I replaced my username with "user"):
$pip3 show requests
Name: requests
Version: 2.27.1
Summary: Python HTTP for Humans.
Home-page: https://requests.readthedocs.io
Author: Kenneth Reitz
Author-email: me@kennethreitz.org
License: Apache 2.0
Location: /home/user/.local/lib/python3.6/site-packages
Requires: certifi, idna, urllib3, charset-normalizer
A simplified version of the Python file I would like to execute in crontab is:
#!/usr/bin/env python...
# -*- coding: utf-8 -*-
import requests
print ('End of code')
The file test_request.py executes fine in a terminal.
I created a bash script called test_request.sh based on directions from this stack overflow page:
"ImportError: No module named requests" in crontab Python script
That bash script is this:
#!/bin/bash
echo test_request.sh called: `date`
HOME=/home/user/
PYTHONPATH=/home/user/.local/lib/python3.6/site-packages
cd /home/user/Documents/bjg_code/
python ./test_request.py 2>&1 1>/dev/null
When I try to run the bash script in a terminal or in crontab I receive this error:
$bash test_request.sh
test_request.sh called: Sat Jun 11 14:18:46 EDT 2022
Traceback (most recent call last):
File "./test_request.py", line 4, in <module>
import requests
ImportError: No module named requests
Any advice would be welcomed and appreciated.
Thank you in advance.
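For what it's worth, "ImportError: No module named requests" (rather than Python 3's ModuleNotFoundError) suggests that the bare "python" in the script resolves to Python 2 under cron, while requests is installed under Python 3.6. A sketch of the script under that assumption:
#!/bin/bash
echo test_request.sh called: `date`
HOME=/home/user/
# Export PYTHONPATH so the child Python process actually inherits it:
export PYTHONPATH=/home/user/.local/lib/python3.6/site-packages
cd /home/user/Documents/bjg_code/
# Call python3 explicitly; a bare "python" may resolve to Python 2:
python3 ./test_request.py 2>&1 1>/dev/null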

jupyter notebook cannot start

I usually run the command 'jupyter notebook' to start Jupyter Notebook.
Traceback (most recent call last):
File "/home/jake/venv/bin/jupyter-notebook", line 8, in <module>
sys.exit(main())
File "/home/jake/venv/lib/python3.7/site-packages/jupyter_core/application.py", line 268, in launch_instance
return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
File "/home/jake/venv/lib/python3.7/site-packages/traitlets/config/application.py", line 663, in launch_instance
app.initialize(argv)
File "", line 2, in initialize
File "/home/jake/venv/lib/python3.7/site-packages/traitlets/config/application.py", line 87, in catch_config_error
return method(app, *args, **kwargs)
File "/home/jake/venv/lib/python3.7/site-packages/notebook/notebookapp.py", line 1720, in initialize
self.init_webapp()
File "/home/jake/venv/lib/python3.7/site-packages/notebook/notebookapp.py", line 1482, in init_webapp
self.http_server.listen(port, self.ip)
File "/home/jake/venv/lib/python3.7/site-packages/tornado/tcpserver.py", line 151, in listen
sockets = bind_sockets(port, address=address)
File "/home/jake/venv/lib/python3.7/site-packages/tornado/netutil.py", line 174, in bind_sockets
sock.bind(sockaddr)
OSError: [Errno 99] Cannot assign requested address
but this time the message above showed up.
Go to /etc/hosts and check that localhost has the IP 127.0.0.1.
How do you get to the hosts file? If you are using Linux, open up a terminal and type
cd /etc/
Then type
cat hosts
This will display the contents of hosts. You will see localhost there. Change its value to 127.0.0.1 if it isn't already (you will need sudo and an editor to change the file). And that should get your notebook running.
If you find that localhost is already 127.0.0.1, then try the command in your terminal:
jupyter notebook --ip=0.0.0.0 --port=8080
to run the jupyter notebook.
The second one is an immediate fix, but every time you want to start jupyter notebook you will have to provide those two arguments. The first one, on the other hand, is a permanent fix (recommended), and the next time you just have to type "jupyter notebook".
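If you end up preferring the second option, a sketch of making it stick (assuming a default Jupyter installation) is to bake the two arguments into Jupyter's config file:
jupyter notebook --generate-config
# Then, in ~/.jupyter/jupyter_notebook_config.py, set:
#   c.NotebookApp.ip = '0.0.0.0'
#   c.NotebookApp.port = 8080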

DAG does not execute but it is in list_dags

I am running Airflow in a container inside CodeBuild. Currently it executes everything, but the step that triggers the DAG fails.
- sudo sh scripts/setup.sh
- pipenv --three install
- airflow initdb
- airflow scheduler > ~/scheduler.log 2>&1 &
- airflow list_dags -sd $(pwd)/dags
- airflow trigger_dag -sd $(pwd)/dags Pampa
And when I use list_dags it shows
-------------------------------------------------------------------
DAGS
-------------------------------------------------------------------
Pampa
But it does not execute the DAG.
airflow trigger_dag -sd $(pwd)/dags Pampa
[2018-07-05 20:04:36,495] {__init__.py:45} INFO - Using executor SequentialExecutor
[2018-07-05 20:04:36,556] {models.py:189} INFO -
Filling up the DagBag from /codebuild/output/src188373663/dags
Traceback (most recent call last):
File "/usr/local/bin/airflow", line 27, in <module>
args.func(args)
File "/usr/local/lib/python3.6/site-packages/airflow/bin/cli.py", line 199, in trigger_dag
execution_date=args.exec_date)
File "/usr/local/lib/python3.6/site-packages/airflow/api/client/local_client.py", line 27, in trigger_dag
execution_date=execution_date)
File "/usr/local/lib/python3.6/site-packages/airflow/api/common/experimental/trigger_dag.py", line 27, in trigger_dag
raise AirflowException("Dag id {} not found".format(dag_id))
airflow.exceptions.AirflowException: Dag id Pampa not found
The error was in $AIRFLOW_HOME: it was set to the folder above the one that actually holds the DAGs:
$AIRFLOW_HOME /home/ubuntu
But the DAGs were in /home/ubuntu/airflow/dags/, so Airflow could not find them; it could only find them when I specified -sd with the subfolder path, as in the list_dags call.
I had to change the way $AIRFLOW_HOME is set.
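In other words, a sketch of the fix (the paths come from the answer above; the export line is an assumption about how the environment is configured):
export AIRFLOW_HOME=/home/ubuntu/airflow
airflow list_dags    # should now list Pampa without needing -sd
airflow trigger_dag Pampa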

Cloudfoundry grizzly keystone with devstack Install: configparser error in keystone.conf

I'm trying to install OpenStack Grizzly on a fresh Ubuntu 12.04 server.
The script runs fine until it reaches this point:
screen -S stack -p key -X stuff 'cd /opt/stack/keystone && /opt/stack/keystone/bin/keystone-all --config-file /etc/keystone/keystone.conf --log-config /etc/keystone/logging.conf -d --debug || touch "/opt/stack/status/stack/key.failure"'
2013-07-16 17:33:03 + echo 'Waiting for keystone to start...'
2013-07-16 17:33:03 Waiting for keystone to start...
2013-07-16 17:33:03 + timeout 60 sh -c 'while ! http_proxy= curl -s
http://192.168.20.69:5000/v2.0/ >/dev/null; do sleep 1; done'
2013-07-16 17:34:03 + die 311 'keystone did not start'
2013-07-16 17:34:03 + local exitcode=0
2013-07-16 17:34:03 + set +o xtrace
2013-07-16 17:34:03 [ERROR] ./stack.sh:311 keystone did not start
The log file:
File "/opt/stack/keystone/bin/keystone-all", line 112, in <module>
options = deploy.appconfig('config:%s' % paste_config)
File "/usr/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 261, in appconfig
global_conf=global_conf)
File "/usr/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 296, in loadcontext
global_conf=global_conf)
File "/usr/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 320, in _loadconfig
return loader.get_context(object_type, name, global_conf)
File "/usr/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 413, in get_context
defaults = self.parser.defaults()
File "/usr/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 68, in defaults
defaults[key] = self.get('DEFAULT', key) or val
File "/usr/lib/python2.7/ConfigParser.py", line 623, in get
return self._interpolate(section, option, value, d)
File "/usr/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 75, in _interpolate
self, section, option, rawval, vars)
File "/usr/lib/python2.7/ConfigParser.py", line 669, in _interpolate
option, section, rawval, e.args[0])
ConfigParser.InterpolationMissingOptionError: Error in file /etc/keystone/keystone.conf:
Bad value substitution:
section: [DEFAULT]
option : admin_endpoint
key : admin_port
rawval : http://192.168.20.69:%(admin_port)s/
The parsing code:
https://github.com/openstack/keystone/blob/master/keystone/common/config.py
The ConfigParser.InterpolationMissingOptionError:
"Exception raised when an option referenced from a value does not exist. Subclass of InterpolationError."
I actually don't understand which referenced option does not exist.
Thank you in advance for your help.
Damien
I had the same problem when I ran stack.sh. The localrc file at the time of running stack.sh was:
disable_service n-net
enable_service q-svc
enable_service q-agt
enable_service q-dhcp
enable_service q-l3
enable_service q-meta
enable_service neutron
# enable_service q-lbaas
disable_service mysql
enable_service postgresql
# enable_service swift
# SWIFT_HASH=devstack
#
LOGFILE=$DEST/logs/stack.log
SCREEN_LOGDIR=$DEST/logs/screens
#
SERVICE_TOKEN=devstack
SCHEDULER=nova.scheduler.chance.ChanceScheduler
# Repositories
GLANCE_BRANCH=stable/grizzly
HORIZON_BRANCH=stable/grizzly
KEYSTONE_BRANCH=stable/grizzly
NOVA_BRANCH=stable/grizzly
NEUTRON_BRANCH=stable/grizzly
CINDER_BRANCH=stable/grizzly
SWIFT_BRANCH=stable/grizzly
PBR_BRANCH=master
REQUIREMENTS_BRANCH=stable/grizzly
CEILOMETER_BRANCH=stable/grizzly
...
However, after I removed the repository definitions and let the defaults in stackrc take over, i.e. all branches pointed to 'master', the problem went away.
Further, the contents of the /opt/stack/keystone/bin/keystone-all script differ between the stable/grizzly and master branches. The one in the 'master' branch seems to work now with neutron enabled.
This error occurs because you ran "stack.sh" as root, or because you forgot to chmod your config in /etc/keystone/keystone.conf:
chmod 777 /etc/keystone/keystone.conf
Run unstack.sh and then re-run stack.sh.
Or just simply run
visudo
and add stack as a user who can do the same as root with no password required:
stack ALL=(ALL) NOPASSWD: ALL
su stack
cp -r /root/devstack /home/stack/
cd /home/stack/devstack/
./stack.sh
Clean everything first if necessary.
Looks like a bug that has been filed for keystone https://bugs.launchpad.net/keystone/+bug/1201861 and it is still open.
Modify devstack/lib/keystone as follows:
iniset $KEYSTONE_CONF DEFAULT public_endpoint "$KEYSTONE_SERVICE_PROTOCOL://$KEYSTONE_SERVICE_HOST:5000/"
iniset $KEYSTONE_CONF DEFAULT admin_endpoint "$KEYSTONE_SERVICE_PROTOCOL://$KEYSTONE_SERVICE_HOST:35357/"
I just ran into this myself. The problem is that DevStack builds a Keystone configuration file at /etc/keystone/keystone.conf in which the option "admin_port" is used before it has been set. And you can't just edit keystone.conf and re-run stack.sh, because your edited version will be overwritten. I'm still chasing down the code that borks the configuration file....
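A related workaround in the same spirit as the iniset fix above (a sketch; 5000 and 35357 are the standard Keystone public/admin ports, not values taken from this page) is to have devstack/lib/keystone define the options the interpolation expects before the endpoint lines run:
# Define the ports that %(public_port)s / %(admin_port)s interpolation expects:
iniset $KEYSTONE_CONF DEFAULT public_port 5000
iniset $KEYSTONE_CONF DEFAULT admin_port 35357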
