I'm running Airflow v1.9.0. I am trying to get some form of authentication working, but have so far failed to get either GitHub auth or password auth working. The password auth feels like it should be pretty straightforward, and I'm hoping someone can point me in the right direction. My airflow.cfg has the following:
[webserver]
authenticate = True
auth_backend = airflow.contrib.auth.backends.password_auth
Following the instructions here https://airflow.incubator.apache.org/security.html#password, I've logged into my Airflow web server and run the following interactive Python to try to create a user, which gives me an error:
airflow@airflow-web-66fbccc84c-vmqbp:~$ python3
Python 3.6.4 (default, Feb 15 2018, 13:07:07)
[GCC 4.9.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import airflow
>>> from airflow import models, settings
>>> from airflow.contrib.auth.backends.password_auth import PasswordUser
>>> user = PasswordUser(models.User())
>>> user.username = 'admin'
>>> user.password = 'airflowWTF'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/ext/hybrid.py", line 873, in __set__
    raise AttributeError("can't set attribute")
AttributeError: can't set attribute
Going through the web UI to create a user, I just get an exception. Here is the end of the exception: https://www.dropbox.com/s/7cxwi6hdde61wnb/Screenshot%202018-02-21%2013.52.16.png?dl=0
Any tips appreciated.
Thanks!
You need to use a version of SQLAlchemy below 1.2.0 ('sqlalchemy>=1.1.15, <1.2.0') or set "_password" directly.
Changing the version of SQLAlchemy is the better option.
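For example, a quick sketch of that route (assuming pip manages the environment; the username and password below are placeholders): pin SQLAlchemy first, then the snippet from the docs works unchanged.
pip install 'sqlalchemy>=1.1.15,<1.2.0'
import airflow
from airflow import models, settings
from airflow.contrib.auth.backends.password_auth import PasswordUser

user = PasswordUser(models.User())
user.username = 'admin'            # placeholder username
user.password = 'some_password'    # plain assignment works once SQLAlchemy is < 1.2.0
session = settings.Session()
session.add(user)
session.commit()
session.close()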
As mentioned here, setting _set_password directly worked for me when I had the same error:
import airflow
from airflow import models, settings
from airflow.contrib.auth.backends.password_auth import PasswordUser
user = PasswordUser(models.User())
user.username = 'new_user_name'
user.email = 'new_user_email@example.com'
user._set_password = 'set_the_password'.encode('utf8')
session = settings.Session()
session.add(user)
session.commit()
session.close()
I'm trying to implement a custom XCom backend.
These are the steps I took:
Created an "include" directory at the main Airflow dir (AIRFLOW_HOME).
Created this "custom_xcom_backend.py" file inside it:
from typing import Any
from airflow.models.xcom import BaseXCom
import pandas as pd


class CustomXComBackend(BaseXCom):
    @staticmethod
    def serialize_value(value: Any):
        if isinstance(value, pd.DataFrame):
            value = value.to_json(orient='records')
        return BaseXCom.serialize_value(value)

    @staticmethod
    def deserialize_value(result) -> Any:
        result = BaseXCom.deserialize_value(result)
        result = pd.read_json(result)
        return result
Set this in the config file:
xcom_backend = include.custom_xcom_backend.CustomXComBackend
When I restarted the webserver I got:
airflow.exceptions.AirflowConfigException: The object could not be loaded. Please check "xcom_backend" key in "core" section. Current value: "include.cust...
My guess is that it's not recognizing the "include" folder.
But how can I fix it?
*Note: there is no Docker involved. Airflow is installed directly on an Ubuntu machine.
Thanks!
So I solved it:
Put custom_xcom_backend.py into the plugins directory.
Set in the config file:
xcom_backend = custom_xcom_backend.CustomXComBackend
Restart all Airflow-related services.
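For reference, a minimal sketch of the matching airflow.cfg entry; the key belongs in the [core] section, as the error message above also indicates:
[core]
xcom_backend = custom_xcom_backend.CustomXComBackend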
*Note: Do not store DataFrames that way (bad practice).
Sources I used:
https://www.youtube.com/watch?v=iI0ymwOij88
I am quite new to Airflow and have started practicing with it, but I am currently stuck with a broken DAG that complains that the 'airflow.hooks.dbapi' module is not found.
Below is the code snippet that I am trying to run.
from airflow.models import DAG
from airflow.providers.sqlite.operators.sqlite import SqliteOperator
from datetime import datetime

default_args = {
    'start_date': datetime(2020, 1, 1)
}

with DAG('user_processing', schedule_interval='@daily',
         default_args=default_args,
         catchup=False) as dag:

    creating_table = SqliteOperator(
        task_id='creating_table',
        sqlite_conn_id='db_sqlite',
        sql='''
            CREATE TABLE users (
                firstname TEXT NOT NULL,
                lastname TEXT NOT NULL,
                country TEXT NOT NULL,
                username TEXT NOT NULL,
                password TEXT NOT NULL,
                email TEXT NOT NULL PRIMARY KEY
            );
        '''
    )
I get the following error from the Airflow UI:
Broken DAG: [/home/airflow/airflow/dags/user_processing.py] Traceback (most recent call last):
  File "/home/airflow/sandbox/lib/python3.8/site-packages/airflow/providers/sqlite/operators/sqlite.py", line 21, in <module>
    from airflow.providers.sqlite.hooks.sqlite import SqliteHook
  File "/home/airflow/sandbox/lib/python3.8/site-packages/airflow/providers/sqlite/hooks/sqlite.py", line 21, in <module>
    from airflow.hooks.dbapi import DbApiHook
ModuleNotFoundError: No module named 'airflow.hooks.dbapi'
So I tried modifying the import statements as below, but still no luck.
from airflow.models import DAG
from airflow.providers.sqlite.hooks.sqlite import SqliteHook
from airflow.providers.sqlite.operators.sqlite import SqliteOperator
Any ideas on how to resolve the issue? I am using Airflow version 2.0.0b3 and Python 3.8.5.
I did try to look here and here, but not much luck.
You are running an old beta version of Airflow where not everything works properly. You should not use a beta version in production.
Use the latest stable Airflow version, 2.1.0.
Specifically, the error that you are experiencing is resolved in that version.
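For example, a sketch of that upgrade using the official constraints file (this assumes Python 3.8; adjust the constraints URL to your Python version):
pip install "apache-airflow==2.1.0" --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.1.0/constraints-3.8.txt"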
I had the same problem. Try installing Airflow as follows (without the 'b3' beta suffix):
pip install apache-airflow==2.0.0 --constraint https://gist.githubusercontent.com/marclamberti/742efaef5b2d94f44666b0aec020be7c/raw/21c88601337250b6fd93f1adceb55282fb07b7ed/constraint.txt
We're upgrading to Airflow 2, so I've changed the hooks import from:
from airflow.hooks.base_hook import BaseHook
to
from airflow.hooks.base import BaseHook
and now I'm getting this error:
{plugins_manager.py:225} ERROR - No module named 'airflow.hooks.base'
Here are the docs for this change, but I don't see any other required changes to get airflow.hooks.base to work: https://github.com/apache/airflow/blob/a17db7883044889b2b2001cefc41a8960359a23f/UPDATING.md#changes-to-import-paths
Make sure you are actually running Airflow 2.0.
You can check which version you are running with the version command:
airflow version
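If the same plugin code has to survive both sides of the upgrade for a while, a small compatibility shim is one option (my own sketch, not something the upgrade guide mandates):
try:
    # Airflow 2.x import path
    from airflow.hooks.base import BaseHook
except ImportError:
    # fallback for Airflow 1.10.x
    from airflow.hooks.base_hook import BaseHook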
I've got a product installed in one Plone Site; this product changes the visibility of a field of the Event content type.
It uses IBrowserLayerAwareExtender to restrict the change to only the Plone Site where the product is installed.
This works on the development server, where buildout is run with the develop.cfg option, but in production the layer is not respected and all other Plone Sites get this change.
Here is the code:
schemaextender.py:
from zope.interface import implements
from zope.component import adapts
from archetypes.schemaextender.interfaces import ISchemaModifier, IBrowserLayerAwareExtender
from Products.ATContentTypes.interface import IATEvent
from bsw.monastic.interfaces import IBswMonasticLayer  # assumed location of this package's browser layer interface
from bsw.monastic import _  # assumed location of this package's message factory


class EventModifier(object):
    """
    Hides some fields that are not needed for this project.
    """
    implements(ISchemaModifier, IBrowserLayerAwareExtender)
    adapts(IATEvent)
    layer = IBswMonasticLayer

    def __init__(self, context):
        self.context = context

    # noinspection PyMethodMayBeStatic
    def fiddle(self, schema):
        """
        :param schema:
        :return:
        """
        schema['attendees'].widget.visible = {'edit': 'invisible', 'view': 'invisible'}
        schema['location'].widget.label = _(u'Adresse')
        return schema
configure.zcml:
<adapter for="Products.ATContentTypes.interface.IATEvent"
provides="archetypes.schemaextender.interfaces.ISchemaModifier"
factory=".schemaextender.EventModifier"
name="bsw.monastic.schemaextender.EventModifier"/>
Is it a bug or am I missing something?
IMHO it's weird that it works on your dev machine and not in production. I assume the browser layer is indeed there on the production site.
You can check this by running the following code in a debug session on the server:
>>> from zope.component.hooks import setSite
>>> plone = app.path.to.plone.site
>>> setSite(plone) # Setup component registry
>>> from plone.browserlayer.utils import registered_layers
>>> registered_layers()
[...] # Bunch of layer interfaces active on the Plone site.
I assume it's there, so you should remove it.
If so, remove it using unregister_layer from plone.browserlayer.utils:
>>> from plone.browserlayer.utils import unregister_layer
>>> unregister_layer(layername)
>>> import transaction
>>> transaction.commit()
We are using Apache Airflow 1.9.0. I have written a Snowflake hook plugin and placed the hook in the $AIRFLOW_HOME/plugins directory.
$AIRFLOW_HOME
+--plugins
   +--snowflake_hook2.py
snowflake_hook2.py
# This is the base class for a plugin
from airflow.plugins_manager import AirflowPlugin

# This is necessary to expose the plugin in the Web interface
from flask import Blueprint
from flask_admin import BaseView, expose
from flask_admin.base import MenuLink

# This is the base hook for connecting to a database
from airflow.hooks.dbapi_hook import DbApiHook

# This is the snowflake provided Connector
import snowflake.connector

# This is the default python logging package
import logging


class SnowflakeHook2(DbApiHook):
    """
    Airflow Hook to communicate with Snowflake
    This is implemented as a Plugin
    """
    def __init__(self, connname_in='snowflake_default', db_in='default', wh_in='default', schema_in='default'):
        logging.info('# Connecting to {0}'.format(connname_in))
        self.conn_name_attr = 'snowflake_conn_id'
        self.connname = connname_in
        self.superconn = super().get_connection(self.connname)  # gets the values from Airflow

        {SNIP - Connection stuff that works}
        self.cur = self.conn.cursor()

    def query(self, q, params=None):
        """From jmoney's db_wrapper allows return of a full list of rows(tuples)"""
        if params is None:  # no params, so no insertion
            self.cur.execute(q)
        else:  # make the parameter substitution
            self.cur.execute(q, params)
        self.results = self.cur.fetchall()
        self.rowcount = self.cur.rowcount
        self.columnnames = [colspec[0] for colspec in self.cur.description]
        return self.results

    {SNIP - Other class functions}


class SnowflakePluginClass(AirflowPlugin):
    name = "SnowflakePluginModule"
    hooks = [SnowflakeHook2]
    operators = []
So I went ahead and put some print statements in Airflow's plugin_manager to try to get a better handle on what is happening. After restarting the webserver and running airflow list_dags, these lines were showing the "new module name" (and no errors):
SnowflakePluginModule [<class '__home__ubuntu__airflow__plugins_snowflake_hook2.SnowflakeHook2'>]
hook_module - airflow.hooks.snowflakepluginmodule
INTEGRATING airflow.hooks.snowflakepluginmodule
snowflakepluginmodule <module 'airflow.hooks.snowflakepluginmodule'>
As this is consistent with what the documentation says, I should be fine using this in my DAG:
from airflow import DAG
from airflow.hooks.snowflakepluginmodule import SnowflakeHook2
from airflow.operators.python_operator import PythonOperator
But the web UI throws this error:
Broken DAG: [/home/ubuntu/airflow/dags/test_sf2.py] No module named 'airflow.hooks.snowflakepluginmodule'
So the question is: what am I doing wrong? Or have I uncovered a bug?
You need to import as below:
from airflow import DAG
from airflow.hooks import SnowflakeHook2
from airflow.operators.python_operator import PythonOperator
OR
from airflow import DAG
from airflow.hooks.SnowflakePluginModule import SnowflakeHook2
from airflow.operators.python_operator import PythonOperator
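For completeness, a rough sketch of calling the hook from a PythonOperator task, reusing the OP's query() method and connname_in parameter (the DAG and task names here are made up, and the hook import should be whichever of the two paths above works for your setup):
from datetime import datetime

from airflow import DAG
from airflow.hooks import SnowflakeHook2  # or the plugin-module path, as above
from airflow.operators.python_operator import PythonOperator


def run_query():
    # open the hook against the Airflow connection and return the result rows
    hook = SnowflakeHook2(connname_in='snowflake_default')
    return hook.query('SELECT CURRENT_TIMESTAMP()')


dag = DAG('snowflake_hook_test', start_date=datetime(2018, 1, 1), schedule_interval=None)

run_query_task = PythonOperator(task_id='run_query', python_callable=run_query, dag=dag)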
I don't think that Airflow automatically goes through the folders in your plugins directory and runs everything underneath it. The way that I've set it up successfully is to have an __init__.py under the plugins directory which contains each plugin class. Have a look at the Astronomer plugins on GitHub; they provide some really good examples of how to set up your plugins.
In particular have a look at how they've set up the mysql plugin
https://github.com/airflow-plugins/mysql_plugin
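As a rough illustration of that layout (the names and paths here are mine, modeled loosely on the mysql_plugin repo, not the OP's exact files):
# plugins/snowflake_plugin/__init__.py
from airflow.plugins_manager import AirflowPlugin

# the hook module would live at plugins/snowflake_plugin/hooks/snowflake_hook2.py
from snowflake_plugin.hooks.snowflake_hook2 import SnowflakeHook2


class SnowflakePlugin(AirflowPlugin):
    name = "snowflake_plugin"
    hooks = [SnowflakeHook2]
    operators = []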
Also, someone has incorporated a Snowflake hook in one of the later versions of Airflow, which you might want to leverage:
https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/hooks/snowflake_hook.py